As I mentioned earlier, ACL elements are the first step in building access controls. The second step is the access control rules, where you combine elements to allow or deny certain actions. You've already seen some http_access rules in the preceding examples. Squid has a number of other access control lists:
This is your most important access list. It determines which client HTTP requests are allowed, and which are denied. If you get the http_access configuration wrong, your Squid cache may be vulnerable to attacks and abuse from people who shouldn't have access to it.
The http_reply_access list is similar to http_access. The difference is that the former list is checked when Squid receives a reply from an origin server or upstream proxy. Most access controls are based on aspects of the client's request, in which case the http_access list is sufficient. However, some people prefer also to allow or deny requests based on the reply content type. Because Squid doesn't know the content type value until it receives the server's reply, this additional access list is necessary. See Section 6.3.9 for more information.
If your Squid cache is configured to serve ICP replies (see Section 10.6), you should use the icp_access list. In most cases, you'll want to allow ICP requests only from your neighbor caches.
You can use the no_cache access list to tell Squid it must never store certain responses (on disk or in memory). This list is typically used in conjunction with dst, dstdomain, and url_regex ACLs.
The "no" in no_cache causes some confusion because of double negatives. A request that is denied by the no_cache list isn't cached. In other words no_cache deny ... is the way to make something uncachable. See Section 6.3.10 for an example.
The miss_access list is primarily useful for a Squid cache with sibling neighbors. It determines how Squid handles requests that are cache misses. This feature is necessary for Squid to enforce sibling relationships with its neighbors. See Section 6.3.7 for an example.
This access list determines which requests are sent to one of the redirector processes (see Chapter 11). By default, all requests go through a redirector if you are using one. You can use the redirector_access list to prevent certain requests from being rewritten. This is particularly useful because a redirector receives less information about a particular request than does the access control system.
The ident_lookup_access list is similar to redirector_access. It enables you to make "lazy" ident lookups for certain requests. Squid doesn't issue ident queries by default. It does so only for requests that are allowed by the ident_lookup_access rules (or by an ident ACL).
This access list affects how a Squid cache with neighbors forwards cache misses. Usually Squid tries to forward cache misses to a parent cache, and/or Squid uses ICP to locate cached responses in neighbors. However, when a request matches an always_direct rule, Squid forwards the request directly to the origin server.
With this list, matching an allow rule causes Squid to forward the request directly. See Section 10.4.4 for more information and an example.
Not surprisingly, never_direct is the opposite of always_direct. Cache miss requests that match this list must be sent to a neighbor cache. This is particularly useful for proxies behind firewalls.
With this list, matching an allow rule causes Squid to forward the request to a neighbor. See Section 10.4.3 for more information and an example.
This access list applies to queries sent to Squid's SNMP port. The ACLs that you can use with this list are snmp_community and src. You can also use srcdomain, srcdom_regex, and src_as if you really want to. See Section 14.3 for an example.
This access list affects the way that Squid handles certain POST requests. Some older user-agents are known to send an extra CRLF (carriage return and linefeed) at the end of the request body. That is, the message body is two bytes longer than indicated by the Content-Length header. Even worse, some older HTTP servers actually rely on this incorrect behavior. When a request matches this access list, Squid emulates the buggy client and sends the extra CRLF characters.
Squid has a number of additional configuration directives that use ACL elements. Some of these used to be global settings that were modified to use ACLs to provide more flexibility.
This access list controls the HTTP requests and ICP/HTCP queries that are sent to a neighbor cache. See Section 10.4.1 for more information and examples.
This access list restricts the maximum acceptable size of an HTTP reply body. See Appendix A for more information.
This access rule list controls whether or not the delay pools are applied to the (cache miss) response for this request. See Appendix C.
This access list binds server-side TCP connections to specific local IP addresses. See Appendix A.
This access list can set different TOS/Diffserv values in TCP connections to origin servers and neighbors. See Appendix A.
With this directive, you can configure Squid to remove certain HTTP headers from the requests that it forwards. For example, you might want to automatically filter out Cookie headers in requests sent to certain origin servers, such as doubleclick.net. See Appendix A.
This directive allows you to replace, rather than just remove, the contents of HTTP headers. For example, you can set the User-Agent header to a bogus value to keep certain origin servers happy while still protecting your privacy. See Appendix A.
The syntax for an access control rule is as follows:
access_list allow|deny [!]ACLname ...
http_access allow MyClients http_access deny !Safe_Ports http_access allow GameSites AfterHours
When reading the configuration file, Squid makes only one pass through the access control lines. Thus, you must define the ACL elements (with an acl line) before referencing them in an access list. Furthermore, the order of the access list rules is very important. Incoming requests are checked in the same order that you write them. Placing the most common ACLs early in the list may reduce Squid's CPU usage.
Recall that Squid uses OR logic when searching ACL elements. Any single value in an acl can cause a match.
It's the opposite for access rules, however. For http_access and the other rule sets, Squid uses AND logic. Consider this generic example:
access_list allow ACL1 ACL2 ACL3
For this rule to be a match, the request must match each of ACL1, ACL2, and ACL3. If any of those ACLs don't match the request, Squid stops searching this rule and proceeds to the next. Within a single rule, you can optimize rule searching by putting least-likely-to-match ACLs first. Consider this simple example:
acl A method http acl B port 8080 http_access deny A B
This http_access rule is somewhat inefficient because the A ACL is more likely to be matched than B. It is better to reverse the order so that, in most cases, Squid only makes one ACL check, instead of two:
http_access deny B A
One mistake people commonly make is to write a rule that can never be true. For example:
acl A src 184.108.40.206 acl B src 220.127.116.11 http_access allow A B
This rule is never going to be true because a source IP address can't be equal to both 18.104.22.168 and 22.214.171.124 at the same time. Most likely, someone who writes a rule like that really means this:
acl A src 126.96.36.199 188.8.131.52 http_access allow A
As with the algorithm for matching the values of an ACL, when Squid finds a matching rule in an access list, the search terminates. If none of the access rules result in a match, the default action is the opposite of the last rule in the list. For example, consider this simple access configuration:
acl Bob ident bob http_access allow Bob
Now if the user Mary makes a request, she is denied. The last (and only) rule in the list is an allow rule, and it doesn't match the username Mary. Thus, the default action is the opposite of allow, so the request is denied. Similarly, if the last entry is a deny rule, the default action is to allow the request. It is good practice always to end your access lists with explicit rules that either allow or deny all requests. To be perfectly clear, the previous example should be written this way:
acl All src 0/0 acl Bob ident bob http_access allow Bob http_access deny All
The src 0/0 ACL is an easy way to match each and every type of request.
Squid's access control syntax is very powerful. In most cases, you can probably think of two or more ways to accomplish the same thing. In general, you should put the more specific and restrictive access controls first. For example, rather than:
acl All src 0/0 acl Net1 src 184.108.40.206/24 acl Net2 src 220.127.116.11/24 acl Net3 src 18.104.22.168/24 acl Net4 src 22.214.171.124/24 acl WorkingHours time 08:00-17:00 http_access allow Net1 WorkingHours http_access allow Net2 WorkingHours http_access allow Net3 WorkingHours http_access allow Net4 http_access deny All
you might find it easier to maintain and understand the access control configuration if you write it like this:
http_access allow Net4 http_access deny !WorkingHours http_access allow Net1 http_access allow Net2 http_access allow Net3 http_access deny All
Whenever you have a rule with two or more ACL elements, it's always a good idea to follow it up with an opposite, more general rule. For example, the default Squid configuration denies cache manager requests that don't come from the localhost IP address. You might be tempted to write it like this:
acl CacheManager proto cache_object acl Localhost src 127.0.0.1 http_access deny CacheManager !Localhost
However, the problem here is that you haven't yet allowed the cache manager requests that do come from localhost. Subsequent rules may cause the request to be denied anyway. These rules have this undesirable behavior:
acl CacheManager proto cache_object acl Localhost src 127.0.0.1 acl MyNet 10.0.0.0/24 acl All src 0/0 http_access deny CacheManager !Localhost http_access allow MyNet http_access deny All
Since a request from localhost doesn't match MyNet, it gets denied. A better way to write the rules is like this:
http_access allow CacheManager localhost http_access deny CacheManager http_access allow MyNet http_access deny All
Some ACLs can't be checked in one pass because the necessary information is unavailable. The ident, dst, srcdomain, and proxy_auth types fall into this category. When Squid encounters an ACL that can't be checked, it postpones the decision and issues a query for the necessary information (IP address, domain name, username, etc.). When the information is available, Squid checks the rules all over again, starting at the beginning of the list. It doesn't continue where the previous check left off. If possible, you may want to move these likely-to-be-delayed ACLs near the top of your rules to avoid unnecessary, repeated checks.
Because these delays are costly (in terms of time), Squid caches the information whenever possible. Ident lookups occur for each connection, rather than each request. This means that persistent HTTP connections can really benefit you in situations where you use ident queries. Hostnames and IP addresses are cached as specified by the DNS replies, unless you're using the older external dnsserver processes. Proxy Authentication information is cached as I described previously in Section 126.96.36.199.
Internally, Squid considers some access rule checks fast, and others slow. The difference is whether or not Squid postpones its decision to wait for additional information. In other words, a slow check may be deferred while Squid asks for additional data, such as:
A reverse DNS lookup: the hostname for a client's IP address
An RFC 1413 ident query: the username associated with a client's TCP connection
An authenticator: validating the user's credentials
A forward DNS lookup: the origin server's IP address
An external, user-defined ACL
Some access rules use fast checks out of necessity. For example, the icp_access rule is a fast check. It must be fast, to serve ICP queries quickly. Furthermore, certain ACL types, such as proxy_auth, are meaningless for ICP queries. The following access rules are fast checks:
The following ACL types may require information from external sources (DNS, authenticators, etc.) and are thus incompatible with fast access rules:
srcdomain, dstdomain, srcdom_regex, dstdom_regex
This means, for example, that you can't reliably use an ident ACL in a header_access rule.