Standard Apache Access Logging

Using Apache's basic logging features, you can keep track of who visits your Web sites by logging accesses to the servers hosting them. You can log every aspect of the requests and responses, including the IP address of the client, user, and resource accessed. You need to take three steps to create a request log:

  1. Define what you want to log: your log format.

  2. Define where you want to log it: your log files, a database, an external program.

  3. Define whether or not to log: conditional logging rules.

Deciding What to Log

You can log nearly every aspect of a request. You define how your log entries look by creating a log format: a string that contains text mixed with log formatting directives. A log formatting directive starts with a % followed by a directive name or identifier, usually a letter indicating the piece of information to be logged. When Apache logs a request, it scans the string and substitutes the value for each directive. For example, if the log format is This is the client address %a, the log entry reads This is the client address followed by the IP address of the client making the request. That is, the logging directive %a is replaced by that address. Table 17.1 lists the available formatting directives.
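The substitution step can be pictured with a short Python sketch. It only illustrates the idea and is not Apache's implementation: the function name render_log_format is made up, only single-letter directives are handled, and the address 192.0.2.10 is a placeholder from the range reserved for documentation.

```python
from typing import Dict


def render_log_format(fmt: str, values: Dict[str, str]) -> str:
    """Replace each single-letter % directive in fmt with its value.

    Directives with no known value are rendered as "-".
    """
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            # The character after % names the piece of information to log.
            out.append(values.get(fmt[i + 1], "-"))
            i += 2
        else:
            # Plain text is copied to the entry unchanged.
            out.append(ch)
            i += 1
    return "".join(out)


# Placeholder client address, assumed for illustration only.
entry = render_log_format("This is the client address %a", {"a": "192.0.2.10"})
print(entry)
```

The surrounding text is copied through untouched. Only the directive itself is swapped for a value.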

Table 17.1. Log Formatting Directives

Formatting Options


Data from the Client

%a

Remote IP address, from the client.

%h

Hostname or IP address of the client making the request. Whether the hostname is logged depends on two factors: the IP address of the client must resolve to a hostname through a reverse DNS lookup, and Apache must be configured to do that lookup using the HostNameLookups directive, explained later in this hour. If these conditions are not met, the IP address of the client is logged instead.

%l

Remote user, obtained via the identd protocol. This option is not very useful because the protocol is not supported on the majority of client machines, and the results can't be trusted anyway because the client provides them.

%u

Remote user from the HTTP basic authentication protocol.

Data from the Server

%A

Local IP address, from the server.

%D

Time it took to serve the request, in microseconds.

%{env_variable}e

Value for an environment variable named env_variable.

%{time_format}t

Current time. If {time_format} is present, it is interpreted as an argument to the Unix strftime function. See the strftime manual page for details.

%T

Time it took to serve the request, in seconds.

%v

Canonical name of the server that answered the request.

%V

Server name according to the UseCanonicalName directive.

%X

Status of the connection in the server. A value of X means the connection was aborted before the server could send the data. A + means the connection will be kept alive for further requests from the same client. A - means the connection will be closed.

Data from the Request

%{cookie_name}C

Value for a cookie named cookie_name.

%H

Request protocol, such as HTTP/1.0 or HTTP/1.1.

%m

Request method, such as GET, POST, PUT, and so on.

%{header_name}i

Value for a header named header_name in the request from the client. This information can be useful, for example, to log the names and versions of your visitors' browsers.

%r

Text of the original HTTP request.

%q

Query parameters, if any, prefixed by a ?.

%U

Requested URL, without query parameters.


%u

Username for the HTTP authentication (basic or digest).

Data from the Response

%b, %B

Size, in bytes, of the body of the response sent back to the client (excluding headers). The only difference between the options is that if no data was sent, %b logs a - and %B logs 0.

%f

Path of the file served, if any.

%t

Time when the request was served.

%{header_name}o

Value for a header named header_name in the response to the client.

%>s

Final status code. Apache can process the same request several times (internal redirects). This is the status code of the final response.

The Common Log Format (CLF) is a standard log format. Most Web sites can log requests using this format, and the format is understood by many log processing and reporting tools. Its format is the following:

"%h %l %u %t \"%r\" %>s %b"

That is, it includes the hostname or IP address of the client, remote user via identd, remote user via HTTP authentication, time when the request was served, text of the request, status code, and size in bytes of the content served.


The Common Log Format is documented as part of the original W3C server.

The following is a sample CLF entry, shown here without the leading client address:

- - [21/Sep/2001:11:27:56 -0800] "GET / HTTP/1.1" 200 1456
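Because CLF fields always appear in the same order, an entry can be split back into its parts with a regular expression. The following Python sketch is one possible way to do so, not a reference parser: the names CLF_PATTERN and parse_clf are made up, the client address 192.0.2.10 is a placeholder added for illustration, and request lines that contain escaped double quotes are not handled.

```python
import re
from typing import Dict, Optional

# One named group per CLF field: %h %l %u %t "%r" %>s %b
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line: str) -> Optional[Dict[str, str]]:
    """Return the CLF fields of line as a dict, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


# The address at the start is an assumed placeholder.
sample = '192.0.2.10 - - [21/Sep/2001:11:27:56 -0800] "GET / HTTP/1.1" 200 1456'
fields = parse_clf(sample)
print(fields)
```

The size group accepts either digits or a -, which matches what %b logs when no data is sent.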

You are now ready to learn how to define log formats using the LogFormat directive. This directive takes two arguments: The first argument is a logging string, and the second is a nickname that will be associated with that logging string.

For example, the following directive from the default Apache configuration file defines the Common Log Format and assigns it the nickname common:

LogFormat "%h %l %u %t \"%r\" %>s %b" common

You can also use the LogFormat directive with only one argument, either a log format string or a nickname. This will have the effect of setting the default value for the logging format used by the TransferLog directive, explained in "Logging Accesses to Files" later in this hour.

The HostNameLookups Directive

When a client makes a request, Apache knows only the IP address of the client. Apache must perform what is called a reverse DNS lookup to find out the hostname associated with the IP address. This operation can be time-consuming and can introduce a noticeable lag in the request processing. The HostNameLookups directive allows you to control whether to perform the reverse DNS lookup.

HostNameLookups can take one of the following arguments: on, off, or double. The default is off. The double argument means that Apache will find the hostname from the IP address and then try to find the IP address from that hostname, checking that the two agree. This process is necessary if you are really concerned with security. If you are using hostnames as part of your Allow and Deny rules, a double DNS lookup is performed regardless of the HostNameLookups setting.

If HostNameLookups is enabled (on or double), Apache will log the hostname, and the result will also be passed to CGI scripts via the environment variable REMOTE_HOST. The lookups put extra load on your server, which you should weigh when deciding whether to turn HostNameLookups on. If you keep HostNameLookups off, which is recommended for medium-to-high traffic sites, Apache will log only the IP address. There are plenty of tools to resolve the IP addresses in the logs later. Refer to the "Managing Apache Logs" section later in this hour.
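The idea behind a double lookup can be sketched in Python for use when post-processing logs. This is an illustration under stated assumptions rather than a description of Apache's internals: the function name double_lookup is made up, and the two resolver parameters exist only so that the logic can be exercised without a network.

```python
import socket


def double_lookup(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Return the hostname for ip only if that hostname maps back to ip.

    If either lookup fails, or the forward lookup does not include ip,
    the IP address itself is returned.
    """
    try:
        hostname = reverse(ip)[0]          # reverse DNS: IP -> hostname
        addresses = forward(hostname)[2]   # forward DNS: hostname -> IP list
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return ip
    return hostname if ip in addresses else ip
```

Calling double_lookup with a single address argument uses the system resolver, which can be slow. That delay is the reason the lookups are usually deferred to log processing.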

The IdentityCheck Directive

At the beginning of the hour, we explained how to log the remote username via the identd protocol using the %l log formatting directive. The IdentityCheck directive takes a value of on or off to enable or disable checking for that value and making it available for inclusion in the logs. Because the information is not reliable and takes a long time to check, it is switched off by default and should probably never be enabled. We mentioned %l only because it is part of the Common Log Format.

Environment Variables

The CustomLog directive accepts an environment variable as a third argument. If the environment variable is present, the entry will be logged; otherwise, it will not. If the environment variable is negated by prefixing an ! to it, the entry will be logged if the variable is not present.

The following example shows how to avoid logging images in GIF and JPEG format in your logs:

SetEnvIf Request_URI "(\.gif|\.jpg)$" image
CustomLog logs/access_log common env=!image
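The decision these two directives express can be mirrored in a few lines of Python, for example to apply the same filter to existing logs. The sketch below is a model of the rule only: the names IMAGE_PATTERN and should_log are made up, and a set stands in for the per-request environment.

```python
import re

# Same expression as in the SetEnvIf example above.
IMAGE_PATTERN = re.compile(r"(\.gif|\.jpg)$")


def should_log(request_uri: str) -> bool:
    """Model env=!image: log the request only if "image" was not set."""
    env = set()
    if IMAGE_PATTERN.search(request_uri):
        env.add("image")          # what SetEnvIf does on a match
    return "image" not in env     # what env=!image tests


print(should_log("/index.html"), should_log("/logo.gif"))
```
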

Status Code

You can specify whether to log specific elements in a log entry. At the beginning of the hour, you learned that log directives start with a %, followed by a directive identifier. In between, you can insert a list of status codes, separated by commas. If the request status is one of the listed codes, the parameter will be logged; otherwise, a - will be logged.

For example, the directive identifier %400,501{User-agent}i logs the browser name and version for malformed requests (status code 400) and requests with methods not implemented (status code 501). This information can be useful for tracking which clients are causing problems.

You can precede the status code list with an ! to log the parameter only when the request status is not one of the listed codes, as in %!400,501{User-agent}i.
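The status code condition, including its negated form, comes down to a membership test. The following Python sketch models that rule. The function name conditional_value and its parameters are made up for illustration, and the browser string is a placeholder.

```python
from typing import Iterable


def conditional_value(status: int, codes: Iterable[int], value: str,
                      negate: bool = False) -> str:
    """Return value if the status condition holds, otherwise "-".

    With negate=False the value is logged when status is in codes.
    With negate=True (the ! prefix) it is logged when status is not in codes.
    """
    listed = status in set(codes)
    return value if listed != negate else "-"


agent = "ExampleBrowser/1.0"  # placeholder User-agent value
print(conditional_value(400, [400, 501], agent))               # listed code
print(conditional_value(200, [400, 501], agent))               # not listed
print(conditional_value(200, [400, 501], agent, negate=True))  # ! prefix
```
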


Logging Accesses to Files

Logging to files is the default way of logging requests in Apache. You can define the name of the file using the TransferLog and CustomLog directives.

The TransferLog directive takes a file argument and uses the latest log format defined by a LogFormat directive with a single argument (the nickname or the format string). If no log format is present, it defaults to the Common Log Format.

The following example shows how to use the LogFormat and TransferLog directives to define a log format that is based on the CLF but that also includes the browser name:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{User-agent}i\""
TransferLog logs/access_log

The CustomLog directive enables you to specify the logging format explicitly. It takes at least two arguments: a logging format and a destination file. The logging format can be specified as a nickname or as a logging string directly.

For example, the directives

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{User-agent}i\"" myformat
CustomLog logs/access_log myformat

and

CustomLog logs/access_log "%h %l %u %t \"%r\" %>s %b \"%{User-agent}i\""

are equivalent.

The CustomLog directive can take an optional environment variable as a third argument, as explained in the "Environment Variables" section earlier in the hour.

Logging Accesses to a Program

Both TransferLog and CustomLog directives can accept a program, prefixed by a pipe sign |, as an argument. Apache will write the log entries to the standard input of the program. The program will, in turn, process them by logging the entries to a database, transmitting them to another system, and so on.

If the program dies for some reason, the server makes sure that it is restarted. If the server stops, the program is stopped as well.
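A logging program of this kind only needs to read entries from its standard input. The Python sketch below shows one possible shape, assuming CLF entries in which the last two fields are the status code and the response size. The names select_errors and main and the destination path are hypothetical, and main is defined but not called here.

```python
import sys
from typing import Iterable, Iterator


def select_errors(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the entries whose status code is in the 5xx range."""
    for line in lines:
        # Split off the last two fields: status code and response size.
        fields = line.rstrip("\n").rsplit(" ", 2)
        if len(fields) == 3 and fields[1].startswith("5"):
            yield line


def main() -> None:
    # Hypothetical destination file; adjust to your own layout.
    with open("/var/log/apache_5xx.log", "a") as sink:
        for entry in select_errors(sys.stdin):
            sink.write(entry)
            sink.flush()  # keep the file current while the server runs
```

A script built this way would end with a call to main(), and its path, prefixed by the pipe sign, would be given as the destination argument of CustomLog or TransferLog.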

The rotatelogs utility, bundled with Apache and explained later in this hour, is an example of a logging program.

As a general rule, unless you have a specific requirement for using a particular program, it is easier and more reliable to log to a file on disk and do the processing, merging, analysis of logs, and so on, at a later time, possibly on a different machine.


Make sure that the program you use for logging requests is secure because it runs as the user Apache was started with. On Unix, this usually means root because the external program will be started before the server changes its user ID to the value of the User directive, typically nobody.
