Occasionally you may encounter a certain request or origin server that seems not to work with Squid. You can use the following technique to determine if the problem lies with Squid, the client, or the origin server. The trick is to capture the HTTP request, then replay it in different ways until you identify the problem.
Capturing the HTTP request means getting more than just the URL. You also need the request method, HTTP version number, and all of the request headers. One way to capture the request is by enabling full debugging in Squid for a short time. On the Squid box, type:
% squid -kdebug
Then, go to the web browser and issue the request. Squid should receive the request almost immediately. After a few seconds, go back to the Squid box and issue the same command:
% squid -kdebug
Now your cache.log file should contain the client's request. If your Squid is busy, the cache.log will contain a lot of requests, so you'll have to search for it. It looks something like this:
2003/09/29 10:37:40| parseHttpRequest: Method is 'GET' 2003/09/29 10:37:40| parseHttpRequest: URI is 'http://squidbook.org/' 2003/09/29 10:37:40| parseHttpRequest: Client HTTP version 1.1. 2003/09/29 10:37:40| parseHttpRequest: req_hdr = { User-Agent: Mozilla/5.0 (compatible; Konqueror/3) Pragma: no-cache Cache-control: no-cache Accept: text/*, image/jpeg, image/png, image/*, */* Accept-Encoding: x-gzip, gzip, identity Accept-Charset: iso-8859-1, utf-8;q=0.5, *;q=0.5 Accept-Language: en Host: squidbook.org
Note that Squid prints the components of the first line separately. You'll have to manually reassemble them like this:
GET http://squidbook.org/ HTTP/1.1
Another way to capture the full request is with a utility such as netcat or socket (http://www.jnickelsen.de/socket/). Start the socket program listening on some port, then configure the browser to use that port as the proxy address. When you make the request again, socket prints the HTTP request:
% socket -s 8080 GET http://squidbook.org/ HTTP/1.1 User-Agent: Mozilla/5.0 (compatible; Konqueror/3) Pragma: no-cache Cache-control: no-cache Accept: text/*, image/jpeg, image/png, image/*, */* Accept-Encoding: x-gzip, gzip, identity Accept-Charset: iso-8859-1, utf-8;q=0.5, *;q=0.5 Accept-Language: en Host: squidbook.org
Finally, you can also use a network packet capture utility, such as tcpdump or ethereal. After capturing a few packets with tcpdump, you can then use tcpshow to view them:
# tcpdump -w tcpdump.log -c 10 -s 1500 port 80 # tcpshow -noHostNames -noPortNames < tcpdump.log | less ... Packet 4 TIME: 08:39:29.593051 (0.000627) LINK: 00:90:27:16:AA:75 -> 00:00:24:C0:0D:25 type=IP IP: 10.0.0.21 -> 206.168.0.6 hlen=20 TOS=00 dgramlen=304 id=4B29 MF/DF=0/1 frag=0 TTL=64 proto=TCP cksum=15DC TCP: port 2074 -> 80 seq=0481728885 ack=4107144217 hlen=32 (data=252) UAPRSF=011000 wnd=57920 cksum=EB38 urg=0 DATA: GET / HTTP/1.0. Host: www.ircache.net. Accept: text/html, text/plain, application/pdf, application/ postscript, text/sgml, */*;q=0.01. Accept-Encoding: gzip, compress. Accept-Language: en. Negotiate: trans. User-Agent: Lynx/2.8.1rel.2 libwww-FM/2.14. .
Note that tcpshow prints a period where the data contains a newline character.
Once you've captured a request, save it to a file. Then you can replay it through Squid with netcat or socket:
% socket squidhost 3128 < request | less
If the response looks normal, the problem might be with the user-agent. Otherwise, you can change various things to isolate the problem. For example, if you see some funny-looking HTTP headers, delete them from the request and try it again. You may also find it useful to try the request directly with the origin server, instead of going through Squid. To do that, remove the http://host.name/ from the request and send it to the origin server:
% cat request GET / HTTP/1.1 User-Agent: Mozilla/5.0 (compatible; Konqueror/3) Pragma: no-cache Cache-control: no-cache Accept: text/*, image/jpeg, image/png, image/*, */* Accept-Encoding: x-gzip, gzip, identity Accept-Charset: iso-8859-1, utf-8;q=0.5, *;q=0.5 Accept-Language: en Host: squidbook.org % socket squidbook.org 80 < request | less
When working with HTTP in this manner, you might find it useful to refer to RFC 2616 and O'Reilly's HTTP: The Definitive Guide.