15.6 Gotchas

Using Squid as a surrogate may improve your origin server's security and performance. However, there are some potentially negative side effects as well. Here are a few things to keep in mind.

15.6.1 Logging

When using a surrogate, the origin server's access log contains only the cache misses from Squid. Furthermore, those log-file entries have Squid's IP address, rather than the client's. In other words, Squid's access.log is where all the good information is now stored.

Recall that, by default, Squid doesn't use the common log-file format. You should use the emulate_httpd_log directive to make Squid's access.log look just like Apache's default log-file format.
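
As a minimal sketch, the directive takes a simple on/off value in squid.conf. Newer Squid releases may select the log format through the access_log directive instead, so check the documentation for the version you run:

```
# Write access.log in the common (Apache-style) log-file format
emulate_httpd_log on
```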

15.6.2 Ignoring Reloads

The Reload button found on most browsers generates HTTP requests with the Cache-Control: no-cache directive set. While this is usually desirable for client-side caching proxies, it may ruin the performance of a surrogate, especially if the backend server is heavily loaded. A reload request forces Squid to purge the currently cached response while it retrieves the new response from the origin server. If those origin server responses arrive slowly, Squid consumes a larger-than-normal number of file descriptors and network resources.

To help in this situation, you may want to use one of the refresh_pattern options. When the ignore-reload option is set, Squid pretends that the request doesn't contain the no-cache directive. The ignore-reload option is generally safe for surrogates, although it does, technically, violate the HTTP protocol.

To make Squid ignore reloads for all requests, use a line like this in squid.conf:

refresh_pattern . 0 20% 4320 ignore-reload

For a somewhat safer alternative, you can use the reload-into-ims option. It causes Squid to validate its cached response when the request contains no-cache. Note, however, that this works only for responses that have cache validators (such as Last-Modified timestamps).
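
A sketch of the corresponding squid.conf line, reusing the catchall pattern and timing values from the ignore-reload example above (the values are illustrative, not a recommendation):

```
# Turn client no-cache requests into validation (If-Modified-Since) requests
refresh_pattern . 0 20% 4320 reload-into-ims
```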

15.6.3 Uncachable Content

As a surrogate, Squid obeys the standard HTTP headers for caching responses from your backend server. This means, for example, that certain dynamic responses might not be cached. You might want to use the refresh_pattern directive to force caching of these objects. For example:

refresh_pattern \.dhtml$ 60 80% 180

This trick only works for certain types of responses, namely, those without a Last-Modified or Expires header. By default, Squid doesn't cache such responses. However, using a nonzero minimum time in a refresh_pattern rule instructs Squid to cache the response, and serve it as a cache hit for that amount of time anyway. See Section 7.7 for the details.

If your backend server generates other types of uncachable responses, you may not be able to trick Squid into storing them.
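
Keep in mind that Squid checks refresh_pattern rules in the order they appear and uses the first one whose regular expression matches the URI. A sketch of how the .dhtml rule might sit next to a catchall rule, with the same illustrative values used earlier in this section:

```
# Specific rule first: cache .dhtml responses that lack Last-Modified
# and Expires, and treat them as fresh for at least 60 minutes
refresh_pattern \.dhtml$ 60 80% 180

# Catchall rule last; a minimum of 0 does not force caching
refresh_pattern . 0 20% 4320
```

If the catchall rule came first, it would match every URI and the .dhtml rule would never be consulted.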

15.6.4 Errors

With Squid as a surrogate in front of your origin server, you should be aware that visitors to your site may see an error message from Squid, rather than the origin server itself. In other words, your use of Squid may be "exposed" through certain error messages. For example, Squid returns its own error message when it fails to parse the client's HTTP request, which could happen if the request is incomplete or is malformed in some way. Squid also returns an error message if it can't connect to the backend server for some reason.

If your site is consistent and functioning properly, you probably don't need to worry about Squid's error messages. Nonetheless, you may want to take a close look at the access.log from time to time and see what sort of errors, if any, your users might be seeing.

15.6.5 Purging Objects

You may find the PURGE method particularly useful when operating a surrogate. Because you have a good understanding of the content being served, you are more likely to know when a cached object must be purged. The technique for purging an object is the same one I described previously. See Section 7.6 for a refresher.
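
As a brief reminder, and only as a sketch, Squid doesn't honor PURGE requests unless squid.conf defines an ACL for the method and an http_access rule allows it. The ACL name PurgeMethod is made up for this example, and the rules assume that the localhost ACL from the default squid.conf is defined:

```
# Match requests that use the PURGE method
acl PurgeMethod method PURGE

# Allow purging only from the Squid host itself; deny everyone else
http_access allow PurgeMethod localhost
http_access deny PurgeMethod
```

With rules like these placed ahead of any other http_access rules that would match first, a command such as squidclient -m PURGE http://www.example.com/stale.html, run on the Squid host, asks Squid to remove that object. The URI is a placeholder.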

15.6.6 Neighbors

Although I don't recommend it, you can configure Squid as a surrogate and as part of a mesh or hierarchy. If you choose to take on such an arrangement, note that, by default, Squid forwards cache misses to parents (rather than the backend server). Assuming that isn't what you really want, be sure to use the cache_peer_access directives so that requests for your backend server don't go to your neighbors instead.
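
A sketch of such a configuration follows. The hostnames and the ACL name OurSite are hypothetical, and the example assumes that www.example.com is the hostname that appears in the URIs of requests for your backend server:

```
# Hypothetical parent cache: HTTP port 3128, ICP port 3130
cache_peer parent.example.net parent 3128 3130

# Requests for our own site
acl OurSite dstdomain www.example.com

# Never send requests for our own site to the parent
cache_peer_access parent.example.net deny OurSite
```

Requests that don't match OurSite remain eligible for the parent, while cache misses for your own site are left to go to the backend server.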