You want to change how Apache logs requests. For example, you want a database of URLs and access counts, or per-user logs.
Install a handler with PerlLogHandler:
    PerlModule Apache::MyLogger
    PerlLogHandler Apache::MyLogger
Within the handler, methods on the request object obtain information about the completed request. In the following code, $r is the request object and $c is the connection object obtained from $r->connection:
    $r->the_request                GET /roast/chickens.html HTTP/1.1
    $r->uri                        /roast/chickens.html
    $r->header_in("User-Agent")    Mozilla-XXX
    $r->header_in("Referer")       http://gargle.com/?search=h0t%20chix0rz
    $r->bytes_sent                 1648
    $c->get_remote_host            208.201.239.56
    $r->status_line                200 OK
    $r->server_hostname            www.myserver.com
Apache calls logging handlers after sending the response to the client. You have full access to the request and response parameters, such as client IP address, headers, status, and even content. Access this information through method calls on the request object.
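As a rough illustration of such a handler, here is a minimal sketch of what Apache::MyLogger might look like. It uses only the request methods listed above. The log path, the tab-separated layout, and the format_entry helper are assumptions made for this example, and OK is defined locally so the file compiles without mod_perl installed (under mod_perl 1 you would normally import it from Apache::Constants).

```perl
package Apache::MyLogger;
use strict;
use warnings;
use Fcntl qw(:flock);

# Under mod_perl 1 this would normally be: use Apache::Constants qw(OK);
# OK is defined locally (an assumption) so the sketch compiles without Apache.
use constant OK => 0;

# Hypothetical log location; adjust for your server.
our $LOGFILE = "/tmp/mylogger.log";

# Build one tab-separated log entry from the request object.
sub format_entry {
    my $r = shift;
    return join("\t", $r->uri, $r->status_line, $r->bytes_sent) . "\n";
}

# Entry point that Apache calls during the logging phase.
sub handler {
    my $r = shift;
    if (open(my $fh, ">>", $LOGFILE)) {
        flock($fh, LOCK_EX);          # serialize writers across server processes
        print $fh format_entry($r);
        close($fh);                   # closing the handle releases the lock
    }
    return OK;
}

1;
```

The sketch writes values as they come from the request object; real code should escape them first so that stray whitespace or quotes cannot break the file's layout.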
You'll probably want to escape values before writing them to a text file because spaces, newlines, and quotes could spoil the formatting of the files. Two useful functions are:
    # return string with newlines and double quotes escaped
    sub escape {
        my $s = shift;
        $s =~ s/([\n\"])/sprintf("%%%02x", ord($1))/ge;
        return $s;
    }

    # return string with newlines, spaces, and double quotes escaped
    sub escape_plus {
        my $s = shift;
        $s =~ s/([\n \"])/sprintf("%%%02x", ord($1))/ge;
        return $s;
    }
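To see what these functions produce, here is a small standalone sketch. The two functions are repeated in compact form so the script runs on its own; the sample URI and user-agent strings, and the layout of the output line, are made up for illustration.

```perl
use strict;
use warnings;

# Same behavior as the two functions above, repeated for self-containment.
sub escape {
    my $s = shift;
    $s =~ s/([\n"])/sprintf("%%%02x", ord($1))/ge;
    return $s;
}

sub escape_plus {
    my $s = shift;
    $s =~ s/([\n "])/sprintf("%%%02x", ord($1))/ge;
    return $s;
}

# Hypothetical sample values containing a space, quotes, and a newline.
my $uri   = "/roast/hot chickens.html";
my $agent = qq{Mozilla "XXX"\n};

# Unquoted field: escape spaces too. Quoted field: spaces can stay.
my $line = sprintf qq{%s "%s"}, escape_plus($uri), escape($agent);
print "$line\n";
# prints: /roast/hot%20chickens.html "Mozilla %22XXX%22%0a"
```

Note that neither function escapes the percent sign itself, so a value that already contains something like %22 cannot be told apart from an escaped quote when the log is read back. If you need to decode entries reliably, consider adding % to the character class.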
Two prebuilt logging modules on CPAN are Apache::Traffic and Apache::DBILogger. Apache::Traffic lets you assign owner strings (either usernames, UIDs, or arbitrary strings) to your web server's directories in httpd.conf. Apache::Traffic builds a DBM database as Apache serves files from these directories. For each owner, the database records the number of hits their directories received each day and the total number of bytes transferred by those hits.
Apache::DBILogger is a more general interface, logging each hit as a new entry in a table. The table has columns for data such as which virtual host delivered the data, the client's IP address, the user agent (browser), the date, the number of bytes transferred, and so on. Using this table and suitable indexes and queries, you can answer almost any question about traffic on your web site.
Because the logging handler runs before Apache has closed the connection to the client, don't use this phase if you have a slow logging operation. Instead, install the handler with PerlCleanupHandler so that it runs after the connection is closed.
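Assuming the same Apache::MyLogger module as in the Solution, the change might be as small as swapping the directive in httpd.conf:

```apache
PerlModule Apache::MyLogger
PerlCleanupHandler Apache::MyLogger
```

The handler code itself should not need to change, since both phases pass it the request object.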
See also: Writing Apache Modules with Perl and C; Chapter 16 of mod_perl Developer's Cookbook; documentation for the Apache::Traffic and Apache::DBILogger CPAN modules; the Apache.pm manpage