Apache provides several tools for managing your logs. Other Apache-specific third-party tools are available and are mentioned here. Because Apache can log requests in the Common Log Format, most generic log processing tools can be used with Apache as well.
Earlier in the hour, you learned how to use the HostNameLookups directive to enable or disable hostname resolution at the time the request is made. If HostNameLookups is set to off (the default), the log file will contain only IP addresses. Later, you can use the command-line logresolve utility on Unix or logresolve.exe on Windows to process the log file and convert the IP addresses to hostnames.
logresolve reads log entries from standard input and writes the resolved entries to standard output. To read from one file and write to another, use redirection, which works on both Unix and Windows:
logresolve < access.log > resolved.log
Log-resolving tools are efficient for two reasons: they can cache results, so each address needs to be looked up only once, and they run after the fact, so the lookups add no delay when serving requests to clients.
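The caching idea can be illustrated with a short Python sketch. It is not how logresolve itself is implemented; the function names reverse_lookup and resolve_lines are hypothetical, and the sketch assumes that each log entry begins with the client IP address followed by a space, as in the Common Log Format.

```python
import socket
from functools import lru_cache


def reverse_lookup(ip):
    """Return the hostname for ip, or ip itself if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # No reverse DNS entry (or no resolver available): keep the address.
        return ip


def resolve_lines(lines, lookup=reverse_lookup):
    """Yield log lines with the leading IP address replaced by its hostname."""
    cached = lru_cache(maxsize=None)(lookup)  # each address is resolved once
    for line in lines:
        ip, sep, rest = line.partition(" ")
        yield cached(ip) + sep + rest
```

To mimic the redirection shown above, a script could pass sys.stdin to resolve_lines and write the results with sys.stdout.writelines. The lookup parameter exists only so that a different resolver can be substituted, for example when testing without network access.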
Fastresolve is an alternative, freely available log-resolving utility that can be found at http://www.pix.net/staff/djm/sw/fastresolve/.
On Web sites with high traffic, log files can grow quickly. You need a mechanism to rotate logs periodically, archiving and compressing older logs at well-defined intervals.
Log files cannot be removed directly while Apache is running because the server is writing directly to them. The solution is to use an intermediate program to log the requests. The program will, in turn, take care of rotating the logs.
Apache provides the rotatelogs program on Unix and rotatelogs.exe on Windows for this purpose. It accepts three arguments: a filename, a rotation interval in seconds, and an optional offset in minutes against UTC (Coordinated Universal Time). For example, the directive
TransferLog "|bin/rotatelogs /var/logs/apachelog 86400"
starts a new log file in the /var/logs directory every day, leaving the previous day's file available for archiving. (The final argument, 86400, is the number of seconds in one day.)
If the path to the program includes spaces, you might need to escape them by prefixing each one with a \ (backslash), for example, My\ Documents. This is especially common on the Windows platform.
If the filename includes %-prefixed options, it is treated as a format string for the strftime function, which converts the % options to time values. The manual page for rotatelogs contains a complete listing of options, but here's an example:
TransferLog "|bin/rotatelogs /var/logs/apachelog%m_%d_%y 86400"
This directive adds the current month, day, and year to the log filename.
If the filename does not include any %-formatted options, the time at which the log file was started, in seconds, is appended to the filename instead.
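To preview the name that a given template would produce, you can expand it yourself. The following Python sketch relies on the fact that Python's strftime accepts the same common % options as the C function; the helper name rotated_name and the sample date are chosen only for illustration.

```python
from datetime import datetime


def rotated_name(template, when):
    """Expand strftime-style % options in a log filename template."""
    return when.strftime(template)


# Month, day, and two-digit year for 9 March 2024.
print(rotated_name("/var/logs/apachelog%m_%d_%y", datetime(2024, 3, 9)))
# prints /var/logs/apachelog03_09_24
```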
cronolog and httplog are additional log-rotating programs. httplog also supports compressing log files. You can find them at http://www.cronolog.org/ and http://nutbar.chemlab.org/downloads/.
When you have a cluster of Web servers serving similar content, perhaps behind a load balancer, you often need to merge the logs from all the servers into a single log stream before passing it to analysis tools.
Similarly, if a single Apache server instance handles several virtual hosts, it is sometimes useful to split a single log file into separate files, one per virtual host.
Logtools is a collection of log-manipulation tools that can be found at http://www.coker.com.au/logtools/.
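Merging logs from several servers amounts to interleaving their entries by timestamp. The following Python sketch shows one way to do that. It is not taken from Logtools; it assumes that every entry carries a Common Log Format timestamp in square brackets and that each input log is already in time order. The names entry_time and merge_logs are hypothetical.

```python
import heapq
import re
from datetime import datetime

# Matches the bracketed timestamp of a Common Log Format entry,
# for example [10/Oct/2000:13:55:36 -0700].
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def entry_time(line):
    """Parse the timestamp of one Common Log Format line."""
    stamp = TIMESTAMP.search(line).group(1)
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(*logs):
    """Merge already time-ordered logs into a single time-ordered stream."""
    # heapq.merge reads the inputs lazily, so large files need not fit in memory.
    return heapq.merge(*logs, key=entry_time)
```

Each argument to merge_logs can be an open file object, because iterating over a file yields its lines one at a time. If an input is not sorted by time, the output will not be either.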
Apache includes the split-logfile Perl script for splitting logs. You can find it in the support subdirectory of the Apache distribution.
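The idea behind splitting can be sketched in a few lines of Python. This is an illustration rather than a port of the bundled Perl script. It assumes that each entry begins with the virtual host name followed by a space, as it would if the LogFormat directive started with %v; the function name split_by_vhost and the output naming scheme are hypothetical.

```python
import os


def split_by_vhost(lines, out_dir):
    """Append each entry, minus its leading host field, to <out_dir>/<vhost>.log."""
    handles = {}
    try:
        for line in lines:
            vhost, _, entry = line.partition(" ")
            # Drop any directory components so files stay inside out_dir.
            vhost = os.path.basename(vhost.lower())
            if not vhost:
                continue  # skip lines without a usable host field
            if vhost not in handles:
                handles[vhost] = open(os.path.join(out_dir, vhost + ".log"), "a")
            handles[vhost].write(entry)
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)
```

The function returns the sorted list of virtual host names it encountered, which is convenient for passing each resulting file to an analysis tool.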
After you collect the logs, you can analyze them and gain information about traffic and visitor behavior.
Many commercial and free applications exist for log analysis and reporting. Two of the most popular open source applications are Webalizer (http://www.mrunix.net/webalizer/) and awstats (http://awstats.sourceforge.net).
Wusage is a nice, inexpensive commercial alternative and can be found at http://www.boutell.com/wusage/.
If you run Apache on a Unix system, you can use the tail command-line utility to monitor, in real time, entries as they are written to your access and error logs. The syntax is
tail -f logname
where logname is the path to the Apache log file. It will print onscreen the last few lines of the log file and will continue to print entries as they are added to the file.
You can find additional programs that enable you to quickly identify problems by scanning your error log files for specific errors, malformed requests, and so on, and reporting on them:
Logscan can be found at http://www.garandnet.net/security.php.
ScanErrLog can be found at http://www.librelogiciel.com/software/.
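If you only need a quick summary, a few lines of Python can do a rough version of the same job. The sketch below is not based on either tool. It assumes the error log layout of a bracketed date followed by a bracketed severity level, and the two message fragments it searches for are merely examples; the name summarize_errors is hypothetical.

```python
import re
from collections import Counter

# Assumes entries that start with [date] [level], for example
# [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] message
# An optional "module:" prefix before the level is also accepted.
LEVEL = re.compile(r"^\[[^\]]+\] \[(?:\w+:)?(\w+)\]")


def summarize_errors(lines, patterns=("File does not exist", "Invalid URI")):
    """Count entries per severity level and per message fragment of interest."""
    levels, hits = Counter(), Counter()
    for line in lines:
        match = LEVEL.match(line)
        if match:
            levels[match.group(1)] += 1
        for pattern in patterns:
            if pattern in line:
                hits[pattern] += 1
    return levels, hits
```

Calling summarize_errors on an open error log returns two counters: one keyed by severity level and one keyed by the fragments you asked for. Adjust the patterns argument to match the problems you care about.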