A redirector receives data from Squid on stdin one line at a time. Each line contains the following four tokens separated by whitespace:
Request-URI
Client IP address and fully qualified domain name
User's name, via either RFC 1413 ident or proxy authentication
HTTP request method
For example:
http://www.example.com/page1.html 192.168.2.3/user.host.name jabroni GET
The Request-URI is taken from the client's request, including query terms, if any. Fragment identifier components (e.g., the # character and subsequent text) are removed, however.
The second token contains the client IP address and, optionally, its fully qualified domain name (FQDN). The FQDN is set only if you enable the log_fqdn directive or use a srcdomain ACL element. Even then, the FQDN may be unknown because the client's network administrators didn't properly set up the reverse pointer zones in their DNS. If Squid doesn't know the client's FQDN, it places a hyphen (-) in the field. For example:
http://www.example.com/page1.html 192.168.2.3/- jabroni GET
The client ident field is set if Squid knows the name of the user behind the request. This happens if you use proxy authentication, ident ACL elements, or enable ident_lookup_access. Remember, however, that the ident_lookup_access directive doesn't cause Squid to delay request processing. In other words, if you enable that directive but don't use the access controls, Squid may not yet know the username when it writes to the redirector process. If Squid doesn't know the username, it places a hyphen (-) in the field. For example:
http://www.example.com/page1.html 192.168.2.3/- - GET
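The four-token format described above can be parsed with a few lines of code. The following is a minimal sketch in Python, not part of Squid itself; the names RedirectorRequest and parse_request_line are hypothetical, and the sketch assumes the Request-URI contains no whitespace.

```python
from typing import NamedTuple, Optional


class RedirectorRequest(NamedTuple):
    """One line of redirector input, split into its fields (hypothetical type)."""
    uri: str
    client_ip: str
    client_fqdn: Optional[str]  # None when Squid sent a hyphen
    ident: Optional[str]        # None when Squid sent a hyphen
    method: str


def parse_request_line(line: str) -> RedirectorRequest:
    """Parse 'URI ip/fqdn ident method'. Assumes no whitespace inside the URI."""
    uri, client, ident, method = line.split()
    # The second token is "ip/fqdn"; the FQDN part is "-" when unknown.
    ip, _, fqdn = client.partition("/")
    return RedirectorRequest(
        uri=uri,
        client_ip=ip,
        client_fqdn=None if fqdn in ("", "-") else fqdn,
        ident=None if ident == "-" else ident,
        method=method,
    )
```

Mapping the hyphen placeholders to None keeps later code from mistaking "-" for a real hostname or username.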
Squid reads back one token from the redirector process: a URI. If Squid reads a blank line, the original URI remains unchanged.
A redirector program should never exit until end-of-file occurs on stdin. If the process does exit prematurely, Squid writes a warning to cache.log:
WARNING: redirector #2 (FD 18) exited
If 50% of the redirector processes exit prematurely, Squid aborts with a fatal error message.
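The read-and-reply cycle, and the rule about running until end-of-file, might look like the following sketch. The function name serve and the rewrite callback are hypothetical. The sketch answers every input line with exactly one output line, a blank one when the URI should stay unchanged, and flushes after each reply on the assumption that Squid is waiting on the other end of a pipe.

```python
from typing import Callable, Optional, TextIO


def serve(inp: TextIO, out: TextIO,
          rewrite: Callable[[str], Optional[str]]) -> None:
    """Answer each request line until end-of-file on the input stream.

    `rewrite` (a hypothetical callback) receives the Request-URI and returns
    a new URI, or None to leave the request unchanged.
    """
    for line in inp:              # the loop ends only at end-of-file
        tokens = line.split()
        new_uri = rewrite(tokens[0]) if tokens else None
        # A blank line tells Squid to keep the original URI.
        out.write((new_uri or "") + "\n")
        out.flush()               # avoid leaving the reply in a buffer
```

A real redirector would call something like serve(sys.stdin, sys.stdout, my_rewrite) as its last statement, so that the process returns only after Squid closes the pipe.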
If the Request-URI contains whitespace, and the uri_whitespace directive is set to allow, any whitespace in the URI is passed to the redirector. A redirector with a simple parser may become confused in this case. You have two options for handling whitespace in URIs when using a redirector.
One option is to set the uri_whitespace directive to anything except allow. The default setting, strip, is probably a good choice in most situations because Squid simply removes the whitespace from the URI when it parses the HTTP request. See Appendix A for information on the other values for this directive.
If that isn't an option, you need to make sure the redirector's parser is smart enough to detect the extra tokens. For example, if it finds more than four tokens in the line received from Squid, it can assume that the last three are the IP address, ident, and request method. Everything before the third-to-last token comprises the Request-URI.
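One way to implement the parsing rule just described is to split the line from the right, so that the last three tokens are peeled off and whatever remains is the Request-URI. This is a sketch only; split_request_line is a hypothetical name.

```python
from typing import Tuple


def split_request_line(line: str) -> Tuple[str, str, str, str]:
    """Return (uri, client, ident, method), tolerating whitespace in the URI."""
    # rsplit with maxsplit=3 splits at most three times, starting from the
    # right, so any whitespace inside the URI is left untouched.
    parts = line.strip().rsplit(None, 3)
    if len(parts) != 4:
        raise ValueError("expected at least four tokens: %r" % line)
    uri, client, ident, method = parts
    return uri, client, ident, method
```

For a line such as "http://www.example.com/my page.html 192.168.2.3/- - GET", the first element returned is the full URI, including the embedded space.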
When a redirector changes the client's URI, the client normally doesn't know that Squid decided to fetch a different resource. This is, in all likelihood, a gross violation of the HTTP RFC. If you want to be nicer, and remain compliant, there is a little trick that makes Squid return an HTTP redirect message. Simply have the redirector insert 301:, 302:, 303:, or 307: before the new URI.
For example, if a redirector writes this line on its stdout:
301:http://www.example.com/page2.html
Squid sends a response like this back to the client:
HTTP/1.0 301 Moved Permanently
Server: squid/2.5.STABLE4
Date: Mon, 29 Sep 2003 04:06:23 GMT
Content-Length: 0
Location: http://www.example.com/page2.html
X-Cache: MISS from zoidberg
Proxy-Connection: close
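A redirector can build the status-prefixed reply with a small helper. The sketch below is illustrative; redirect_reply and REDIRECT_CODES are hypothetical names, and the set of accepted codes is simply the four prefixes listed earlier.

```python
from typing import Optional

# The status prefixes named in the text above.
REDIRECT_CODES = frozenset({301, 302, 303, 307})


def redirect_reply(new_uri: str, status: Optional[int] = None) -> str:
    """Build the line a redirector writes to stdout.

    With no status, Squid silently fetches new_uri. With a status, the reply
    is prefixed with 'NNN:' so that Squid returns an HTTP redirect instead.
    """
    if status is None:
        return new_uri
    if status not in REDIRECT_CODES:
        raise ValueError("unsupported redirect status: %d" % status)
    return "%d:%s" % (status, new_uri)
```

Calling redirect_reply("http://www.example.com/page2.html", 301) produces the kind of reply line that leads to a 301 response like the one shown above.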