Setting Up the Apache Web Server


You probably already know how it feels to use the Web, but you may not know how to set up a Web server so that you, too, can provide information to the world through Web pages. To become an information provider on the Web, you have to run a Web server on your Red Hat Linux PC on the Internet. You also have to prepare the Web pages for your website-a task that may be more demanding than the Web server setup.

Web servers provide information using HTTP. Web servers are also known as HTTP daemons (because continuously running server processes are called daemons in UNIX) or HTTPD for short. The Web server program is usually named httpd.

Among the available Web servers, the Apache Web server is the most popular. The Apache Web server started out as an improved version of the NCSA HTTPD server but soon grew into a separate development effort. Like NCSA HTTPD, the Apache server is developed and maintained by a team of collaborators. Apache is freely available over the Internet.

The following sections describe the installation and configuration of the Apache Web server.

Learning More about the Apache Web Server

The Apache Web server has too many options and configuration directives to describe in detail in this book. Whole books are devoted to configuring the Apache Web server. You should consult one of these books for more information:

  • Mohammed J. Kabir, Apache Server 2 Bible, John Wiley & Sons, 2002.

  • Mohammed J. Kabir, Apache Server Administrator's Handbook, Hungry Minds, Inc., 1999.

  • Ken A. L. Coar, Apache Server for Dummies, IDG Books Worldwide, 1998.

You can also find late-breaking news and detailed information about the latest version of Apache HTTPD from Apache's website at http://httpd.apache.org/docs-2.0/. In particular, you can browse a complete list of Apache directives at http://httpd.apache.org/docs-2.0/mod/directives.html.

Installing the Apache Web Server

Installing Red Hat Linux from this book's companion CD-ROMs gives you the option to install the Apache Web server. As described in Chapter 2, simply select the Web Server package group when you are prompted for the components to install. This package group includes the Apache Web server. The Web server program is called httpd, so the Apache Web server package is called httpd.

Perform these steps to verify that the Apache Web server software is installed on your system:

  1. Use the rpm -q command to check whether or not the Apache package is installed:

    rpm -q httpd
    httpd-2.0.40-16

    If the output shows an httpd package name, you have installed the Apache software.

  2. Type the following command to check whether or not the httpd process is running (httpd is the name of the Apache Web server program):

    ps ax | grep httpd

    If the Apache Web server is running, the output should show a number of httpd processes. It is common to run several Web server processes-one parent and several child processes-so that several HTTP requests can be handled efficiently by assigning each request to an httpd process. If there is no httpd process, log in as root and start the httpd service with the following command:

    service httpd start

  3. Use the telnet program on your Linux system, and use the HTTP HEAD command to query the Web server, as follows:

    telnet localhost 80
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    HEAD / HTTP/1.0  (press Enter twice)
    
    HTTP/1.1 403 Forbidden
    Date: Sat, 15 Feb 2003 22:12:28 GMT
    Server: Apache/2.0.40 (Red Hat Linux)
    Accept-Ranges: bytes
    Content-Length: 2898
    Connection: close
    Content-Type: text/html; charset=ISO-8859-1
    
    Connection closed by foreign host.

    If you get a response such as that in the preceding code, your system already has the Apache Web server installed and set up correctly. All you have to do is understand the configuration so that you can place the HTML documents in the proper directory.

Use a Web browser to load the home page from your system. For instance, if your system's IP address is 192.168.0.100, use the URL http://192.168.0.100/ or try http://localhost and see what happens. You should see a Web page with the title 'Test Page for the Apache Web Server on Red Hat Linux.'
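The telnet check in step 3 can also be scripted. The following is a minimal sketch, assuming a POSIX shell. The function name http_status is hypothetical, and the sample input is taken from the response headers shown above. It prints the numeric status code from the first line of an HTTP response read on standard input.

```shell
# http_status: print the status code from the first line of an HTTP
# response read on standard input (hypothetical helper, not part of Apache).
http_status() {
    # The status line looks like "HTTP/1.1 403 Forbidden"; field 2 is the code.
    # tr removes the carriage return that ends each header line on the wire.
    head -n 1 | tr -d '\r' | awk '{ print $2 }'
}

# Demonstration with two of the response headers shown in step 3.
printf 'HTTP/1.1 403 Forbidden\r\nServer: Apache/2.0.40 (Red Hat Linux)\r\n\r\n' | http_status
```

Any status code in the output shows that a Web server answered on the port. The output of a real HEAD request could be piped into the same function.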

Configuring the Apache Web Server

Red Hat Linux configures the Apache Web server software to use these files and directories:

  • The Web server program-httpd-is installed in the /usr/sbin directory.

  • The Apache Web server configuration file-httpd.conf-is located in the /etc/httpd/conf directory. The configuration file is a text file with directives that specify various aspects of the Web server (a later section describes the Apache directives).

  • The Apache Web server treats files with a .conf extension in the /etc/httpd/conf.d directory as configuration files for Apache modules such as mod_perl, mod_python, mod_ssl, and so on. For example, the /etc/httpd/conf.d directory contains the configuration information that SSL needs.

  • The Apache Web server is set up to serve the HTML documents from the /var/www/html directory. Therefore, you should place your Web pages in this directory.

  • If you have any Common Gateway Interface (CGI) programs-programs the Web server can invoke to access other files and databases-you should place these in the /var/www/cgi-bin/ directory.

  • The /var/log/httpd directory is meant for Web server log files (access logs and error logs).

  • The /etc/init.d/httpd script starts the httpd process as your Red Hat Linux system boots. You can type the command /etc/init.d/httpd start to run the Web server (another way is to type service httpd start).

    Insider Insight 

    If you want the Apache Web server to start automatically when you boot the system, log in as root and type the following command to enable the server:

    chkconfig --level 35 httpd on

    To restart Apache httpd after making changes to configuration files, type:

    service httpd restart
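A typing mistake in a configuration file can keep the server from starting again, so it may help to check the syntax before restarting. The following sketch is not part of the Red Hat setup. The function name and the HTTPD_BIN and RESTART_CMD variables are hypothetical. It relies on the httpd -t option, which parses the configuration files and reports syntax errors without starting the server.

```shell
# check_and_restart: restart Apache only if the configuration parses cleanly.
# HTTPD_BIN and RESTART_CMD are illustrative overrides; the defaults match
# the path and command used in this chapter.
check_and_restart() {
    httpd_bin=${HTTPD_BIN:-/usr/sbin/httpd}
    restart_cmd=${RESTART_CMD:-service httpd restart}

    # httpd -t runs a syntax test and exits with a non-zero status on error.
    if "$httpd_bin" -t; then
        # Left unquoted on purpose so that the command and its arguments split.
        $restart_cmd
    else
        echo "Configuration test failed; server not restarted." >&2
        return 1
    fi
}
```

After editing httpd.conf or a file in /etc/httpd/conf.d, log in as root and call check_and_restart.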

Apache Configuration Directives

The Apache httpd server's operation is controlled by the directives stored in the httpd.conf file located in the /etc/httpd/conf directory as well as separate .conf files located in the /etc/httpd/conf.d directory. The directives in these configuration files specify general attributes of the server, such as the server's name, the port number, and the directory in which the server's files are located. The configuration directives also specify information about the server resources-the documents and other information the Web server provides to users-and access control directives that control access to the entire Web server as well as to specific directories.

The next few sections show you the key information about the Apache httpd configuration directives. Typically, you do not have to change much in the configuration files to use the Apache Web server, except for setting the ServerName in the httpd.conf file. However, it is useful to know the format of the configuration files and the meaning of the various keywords used in them.

As you study the /etc/httpd/conf/httpd.conf file, keep these syntax rules in mind:

  • The configuration file is a text file that you can edit with your favorite text editor and view with the more command.

  • All comment lines begin with a #.

  • Each line can have only one directive.

  • Extra spaces and blank lines are ignored.

  • All entries, except pathnames and URLs, are case insensitive.
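Because comment lines and blank lines are ignored, it can be easier to study a configuration file with them filtered out. The following is a minimal sketch, assuming a POSIX shell with grep. The function name active_directives and the sample file contents are illustrative.

```shell
# active_directives: print only the lines of a configuration file that
# Apache acts on, skipping comment lines and blank lines.
active_directives() {
    # -E selects extended regular expressions; -v inverts the match.
    # The pattern matches lines that are empty, contain only white space,
    # or have # as their first non-blank character.
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Demonstration on a small sample file (the contents are illustrative).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# This is a comment line.

ServerName  www.myhost.com
    # An indented comment.
serveradmin webmaster@myhost.com
EOF
active_directives "$sample"
rm -f "$sample"
```

The two directive lines are printed and the rest is dropped. The lowercase serveradmin is accepted because directive names are case insensitive. On a real system, try active_directives /etc/httpd/conf/httpd.conf | more.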

The following sections show the Apache directives grouped into three separate categories: general HTTPD directives, resource configuration directives, and access-control directives.

General HTTPD Directives

Some interesting items from the httpd.conf file are:

  • ServerName specifies the host name of your website (of the form www.your.domain). The name should be a registered domain name other users can locate through their name servers. Here is an example:

    ServerName  www.myhost.com

  • ServerAdmin is the email address that the Web server provides to clients in case any errors occur. The default value for ServerAdmin is root@localhost. You should set this to a valid email address that anyone on the Internet can use to report errors your website contains.

Many more directives control the way that the Apache Web server works. The following list summarizes some of the directives you can use in the httpd.conf file. You can leave most of these directives in their default settings, but it's important to know about them if you are maintaining a Web server.

  • ServerType type-Specifies how Linux executes the HTTP server. The type can be xinetd (to run the server through the xinetd daemon) or standalone (to run the server as a standalone process). You should run the server as a standalone server for better performance. In addition, the latest Apache documentation states that the xinetd mode is no longer recommended and does not work properly.

  • Port num-Specifies that the HTTP daemon should listen to port num (a number between 0 and 65,535) for requests from clients. The default port for HTTPD is 80. You should leave the port number at its default value; clients will assume that the HTTP port is 80. If your server does not use port 80, the URL for your server must specify the port number.

  • User name [ #id]-Specifies the user name (or ID) the HTTP daemon uses when it is running in standalone mode. You can leave this directive at the default setting (apache). If you specify a user ID, use a hash (#) prefix for the numeric ID.

  • Group name [ #id]-Specifies the group name (or ID) of the HTTP daemon when the server is running in standalone mode. The default group name is apache.

  • ServerRoot pathname-Specifies the directory where the Web server is located. By default, the configuration and log files are expected to reside in subdirectories of this directory. In Red Hat Linux, ServerRoot is set to /etc/httpd.

  • ServerName www.company.com-Sets the server's hostname to www.company.com instead of to its real host name. You cannot simply invent a name; the name must be a valid name from the Domain Name System (DNS) for your system.

  • StartServers num-Sets the number of child processes that start as soon as the Apache Web server runs. The default value is 8.

  • MaxSpareServers num-Sets the desired maximum number of idle child-server processes (a child process is considered idle if it is not handling an HTTP request). The default value is 20.

  • MinSpareServers num-Sets the desired minimum number of idle child server processes (a child process is considered idle if it is not handling an HTTP request). A new spare process is created every second if the number falls below this threshold. The default value is 5.

  • Timeout numsec-Sets the number of seconds that the server waits for a client to send a query after the client establishes connection. The default Timeout is 300 seconds (five minutes).

  • ErrorLog filename-Sets the file where httpd logs the errors it encounters. If the filename does not begin with a slash (/), the name is taken to be relative to ServerRoot. The default ErrorLog is /var/log/httpd/error_log. Typical error-log entries include events such as server restarts and any warning messages, such as the following:

    [Sat Feb 15 17:12:09 2003] [notice] Apache/2.0.40 (Red Hat Linux) configured -- resuming normal operations
    [Sat Feb 15 18:38:51 2003] [error] [client 127.0.0.1] File does not exist: /var/www/html/sample.html

  • TransferLog filename-Sets the file where httpd records all client accesses (including failed accesses). The default TransferLog is /var/log/httpd/access_log. The following example shows how a typical access is recorded in the TransferLog file:

    127.0.0.1 - - [15/Feb/2003:18:41:36 -0500] "GET / HTTP/1.1" 403 2898 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030115"
    127.0.0.1 - - [15/Feb/2003:18:41:36 -0500] "GET /icons/powered_by.gif HTTP/1.1" 304 0 "http://localhost/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030115"

    The first entry is for the text of the file; the second entry is for embedded images.

  • LogFormat formatstring formatname-Specifies the format of log-file entries for the TransferLog. This format is also used by the CustomLog directive to produce logs in a specific format.

  • CustomLog filename formatname-Sets the name of the custom log file where httpd records all client accesses (including failed accesses) in a format specified by formatname (which you define using a LogFormat directive).

  • PidFile filename-Sets the file where httpd stores its process ID. The default PidFile is /var/run/httpd.pid. You can use this information to kill or restart the HTTP daemon. The following example shows how to restart httpd:

    kill -HUP `cat /var/run/httpd.pid`

  • MaxClients num-Sets the limit on the number of clients that can simultaneously connect to the server. The default value is 150. The value of MaxClients cannot be more than 256.

  • LoadModule module modules/modfile.so-Loads a module that was built as a Dynamic Shared Object (DSO). You have to specify the module name and the module's object file. Because the order in which modules are loaded is important, you should leave these directives as they appear in the default configuration file. Note that the mod_ssl module provides support for encryption using the Secure Sockets Layer (SSL) protocol.
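In the access-log entries shown for TransferLog, the status code is the first word after the quoted request. The following awk sketch counts entries per status code. The function name status_counts is hypothetical, and the sample input is the pair of entries shown earlier. It assumes that every entry contains a quoted request followed by the status code.

```shell
# status_counts: count access-log entries per HTTP status code.
status_counts() {
    # Split each line on double quotes: field 3 is the text right after
    # the request, for example " 403 2898 ". Its first word is the status.
    awk -F'"' '{ split($3, part, " "); count[part[1]]++ }
               END { for (code in count) print code, count[code] }' "$1" | sort
}

# Demonstration with the two sample entries from the TransferLog example.
log=$(mktemp)
cat > "$log" <<'EOF'
127.0.0.1 - - [15/Feb/2003:18:41:36 -0500] "GET / HTTP/1.1" 403 2898 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030115"
127.0.0.1 - - [15/Feb/2003:18:41:36 -0500] "GET /icons/powered_by.gif HTTP/1.1" 304 0 "http://localhost/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030115"
EOF
status_counts "$log"
rm -f "$log"
```

For the two sample entries, the output is one line for 304 and one line for 403, each with a count of 1. On a running server, the same function could be pointed at /var/log/httpd/access_log.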

Resource Configuration Directives

The resource configuration directives specify the location of the Web pages, as well as how to specify the data types of various files. To get started, you can leave the directives at their default settings. These are some of the resource configuration directives for the Apache Web server:

  • DocumentRoot pathname-Specifies the directory where the HTTP server finds the Web pages. In Red Hat Linux, the default DocumentRoot is /var/www/html. If you place your HTML documents in another directory, set DocumentRoot to that directory.

  • UserDir dirname-Specifies the directory below a user's home directory where the HTTP server looks for the Web pages when a user name appears in the URL (in a URL such as http://www.psn.net/~naba/, for example, which includes a user name with a tilde prefix). The default UserDir is public_html, which means that a user's Web pages are in the public_html subdirectory of that user's home directory. If you do not want to allow users to have Web pages, specify disabled as the directory name in the UserDir directive.

  • DirectoryIndex filename1 filename2 ...-Indicates the default file or files to be returned by the server when the client does not specify a document. The default DirectoryIndex is index.html. If httpd does not find this file, it returns an index (basically, a nice-looking listing of the files) of that directory.

  • AccessFileName filename-Specifies the name of the file that may appear in each directory that contains documents and that indicates who has permission to access the contents of that directory. The default AccessFileName is .htaccess. The syntax of this file is the same as that of Apache access-control directives, which the next section discusses.

  • AddType type/subtype extension-Associates a file extension with a MIME (Multipurpose Internet Mail Extensions) data type of the form type/subtype, such as text/plain or image/gif. Thus, to have the server treat files with the .lst extension as plaintext files, specify the following:

    AddType text/plain .lst

    The default MIME types and extensions are listed in the /etc/mime.types file.

  • AddEncoding type extension-Associates an encoding type with a file extension. To have the server mark files ending with .gz or .tgz as encoded with the x-gzip encoding method (the standard name for the GZIP encoding), specify the following:

    AddEncoding x-gzip gz tgz

  • DefaultType type/subtype-Specifies the MIME type that the server should use if it cannot determine the type from the file extension. If you do not specify DefaultType, httpd assumes the MIME type to be text/html. In the default httpd.conf file that you get from the companion CD-ROMs, DefaultType is specified as text/plain.

  • Redirect requested-file actual-URL-Specifies that any requests for requested-file are to be redirected to actual-URL.

  • Alias requested-dir actual-dir-Specifies that the server use actual-dir to locate files in the requested-dir directory (in other words, requested-dir is an alias for actual-dir). To have requests for the /icons directory go to /var/www/icons, specify the following:

    Alias /icons/ /var/www/icons/

  • ScriptAlias requested-dir actual-dir-Specifies the real name of the directory where scripts for the Common Gateway Interface (CGI) are located. For example, the following directive specifies the location of the /cgi-bin/ directory:

    ScriptAlias /cgi-bin/ /var/www/cgi-bin/

    This directive means that when a Web browser requests a script, such as /cgi-bin/test-cgi, the HTTP server runs the script /var/www/cgi-bin/test-cgi.

  • FancyIndexing on [off]-Enables or disables the display of fancy directory listings, with icons and file sizes.

  • DefaultIcon iconfile-Specifies the location of the default icon that the server should use for files that have no icon information. By default, DefaultIcon is /icons/unknown.gif.

  • ReadmeName filename-Specifies the name of a README file whose contents are added to the end of an automatically generated directory listing. The default ReadmeName is README.

  • HeaderName filename-Specifies the name of a header file whose contents are prepended to an automatically generated directory listing. The default HeaderName is HEADER.

  • AddDescription 'file description' filename-Specifies that the file description string be displayed next to the specified filename in the directory listing. You can use a wildcard, such as *.html, as the filename. For example, the following directive describes files ending with .tgz as GZIP compressed tar archives:

    AddDescription "GZIP compressed tar archive" .tgz

  • AddIcon iconfile extension1 extension2 ...-Associates an icon with one or more file extensions. The following directive associates the icon file /icons/text.gif with the file extension .txt:

    AddIcon /icons/text.gif .txt

  • AddIconByType iconfile MIME-types-Associates an icon with a group of file types specified as a wildcard form of MIME types (such as text/* or image/*). To associate an icon file of /icons/text.gif with all text types, specify the following:

    AddIconByType (TXT,/icons/text.gif) text/*

    This directive also tells the server to use TXT in place of the icon for clients that cannot accept images. (Browsers tell the server what types of data they can accept.)

  • AddIconByEncoding iconfile encoding1 encoding2 ...-Specifies an icon to be displayed for one or more encoding types (such as x-compress or x-gzip).

  • IndexIgnore filename1 filename2 ...-Instructs the server to ignore the specified filenames (they typically contain wildcards) when preparing a directory listing. To leave out README, HEADER, and all files with names that begin with a period (.) or end with a tilde (~) or a hash mark (#), specify the following:

    IndexIgnore .??* *~ *# HEADER* README* RCS CVS *,v *,t

  • IndexOptions option1 option2 ...-Indicates the options you want in the directory listing prepared by the server. Options can include one or more of the following:

    • FancyIndexing turns on the fancy directory listing that includes filenames and icons representing the files' types, sizes, and last-modified dates.

    • IconHeight=N specifies that icons are N pixels tall.

    • IconWidth=N specifies that icons are N pixels wide.

    • NameWidth=N makes the filename column N characters wide.

    • IconsAreLinks makes the icons act like links.

    • ScanHTMLTitles shows a description of HTML files.

    • SuppressHTMLPreamble does not add a standard HTML preamble to the header file (specified by the HeaderName directive).

    • SuppressLastModified stops the displaying of the last date of modification.

    • SuppressSize stops the displaying of the file size.

    • SuppressDescription stops the displaying of any file description.

    • SuppressColumnSorting stops the column headings from being links that enable sorting the columns.

  • ErrorDocument errortype filename-Specifies a file the server should send when an error of a specific type occurs. You can also provide a text message for an error. Here are some examples:

    ErrorDocument 403 "Sorry, you cannot access this directory"
    ErrorDocument 404 /cgi-bin/bad_link.pl
    ErrorDocument 401 /new_subscriber.html

    If you do not have this directive, the server sends a built-in error message. The errortype can be one of the following HTTP/1.1 error conditions (see RFC 2616 at http://www.ietf.org/rfc/rfc2616.txt or http://www.cis.ohio-state.edu/htbin/rfc/rfc2616.html for more information):

    • 400-Bad Request

    • 401-Unauthorized

    • 402-Payment Required

    • 403-Forbidden

    • 404-Not Found

    • 405-Method Not Allowed

    • 406-Not Acceptable

    • 407-Proxy Authentication Required

    • 408-Request Timeout

    • 409-Conflict

    • 410-Gone

    • 411-Length Required

    • 412-Precondition Failed

    • 413-Request Entity Too Large

    • 414-Request-URI Too Long

    • 415-Unsupported Media Type

    • 416-Requested Range Not Satisfiable

    • 417-Expectation Failed

    • 500-Internal Server Error

    • 501-Not Implemented

    • 502-Bad Gateway

    • 503-Service Unavailable

    • 504-Gateway Timeout

    • 505-HTTP Version Not Supported

  • TypesConfig filename-Specifies the file that contains the mapping of file extensions to MIME data types. (MIME stands for Multipurpose Internet Mail Extensions, which defines a way to package attachments in a single message file.) The server reports these MIME types to clients. If you do not specify a TypesConfig directive, httpd assumes that the TypesConfig file is /etc/mime.types. The following are a few selected lines from the default /etc/mime.types file:

    application/msword              doc
    application/pdf                 pdf
    application/postscript          ai eps ps
    application/x-tcl               tcl
    audio/mpeg                      mpga mp2 mp3
    audio/x-pn-realaudio            ram rm
    audio/x-wav                     wav
    image/gif                       gif
    image/jpeg                      jpeg jpg jpe
    image/png                       png
    text/html                       html htm
    text/plain                      asc txt
    video/mpeg                      mpeg mpg mpe

    Each line shows the MIME type (such as text/html), followed by the file extensions for that type (html or htm).
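To see which MIME type the server would report for a given extension, you can search a file that uses this layout. The following is a small sketch, assuming awk is available. The function name mime_type_for is hypothetical, and the sample lines are copied from the excerpt above.

```shell
# mime_type_for: look up the MIME type for a file extension in a file that
# uses the mime.types layout (the type first, then its extensions).
# The second argument is optional and defaults to /etc/mime.types.
mime_type_for() {
    ext=$1
    types_file=${2:-/etc/mime.types}
    awk -v ext="$ext" '
        /^[ \t]*#/ { next }                     # skip comment lines
        { for (i = 2; i <= NF; i++)             # fields 2..NF are extensions
              if ($i == ext) { print $1; exit } }
    ' "$types_file"
}

# Demonstration with three lines from the excerpt above.
types=$(mktemp)
cat > "$types" <<'EOF'
image/jpeg                      jpeg jpg jpe
text/html                       html htm
text/plain                      asc txt
EOF
mime_type_for htm "$types"
mime_type_for jpg "$types"
rm -f "$types"
```

The two calls print text/html and image/jpeg. An extension that is not listed produces no output, in which case the server falls back on DefaultType.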

Access-Control Directives

Access-control directives enable you to control who can access different directories in the system. These are the global access-configuration directives. In each directory containing documents served by the Apache Web server, you can have another access-configuration file with the name specified by the AccessFileName directive. (That per-directory access-configuration file is named .htaccess by default.)

Stripped of most of the comment lines, the access-control directive has this format:

# First, we configure the "default" to be a
# very restrictive set of permissions.
<Directory />
    Options None
    AllowOverride None
</Directory>

# The following directory name should
# match DocumentRoot in httpd.conf
<Directory /var/www/html>
    Options Indexes Includes FollowSymLinks
    AllowOverride None
    order allow,deny
    allow from all
</Directory>

# The directory name should match the
# location of the cgi-bin directory
<Directory "/var/www/cgi-bin">
    AllowOverride None
    Options ExecCGI
    Order allow,deny
    Allow from all 
</Directory>

Access-control directives use a different syntax from the other Apache directives. The syntax is like that of HTML. Various access-control directives are enclosed within pairs of tags, such as <Directory> ... </Directory>.

The following list describes some of the access-control directives. In particular, notice the AuthUserFile directive; you can have password-based access control for specific directories.

  • Options opt1 opt2 ...-Specifies the access-control options for the directory section in which this directive appears. The options can be one or more of the following:

    • None disables all access-control features.

    • All turns on all features for the directory.

    • FollowSymLinks enables the server to follow symbolic links.

    • SymLinksIfOwnerMatch follows a symbolic link only if the same user owns both the link and the file or directory it points to.

    • ExecCGI enables execution of CGI scripts in the directory.

    • Includes enables server-side include files in this directory (the term server-side include refers to directives, placed in an HTML file, that the Web server processes before returning the results to the Web browser).

    • Indexes enables clients to request indexes (directory listings) for the directory.

    • IncludesNOEXEC disables the #exec command in server-side includes.

  • AllowOverride directive1 directive2 ...-Specifies which access-control directives can be overridden on a per-directory basis. The directive list can contain one or more of the following:

    • None stops any directive from being overridden.

    • All enables overriding of any directive on a per-directory basis.

    • Options enables the use of the Options directive in the directory-level file.

    • FileInfo enables the use of directives controlling document type, such as AddType and AddEncoding.

    • AuthConfig enables the use of authorization directives, such as AuthName, AuthType, AuthUserFile, and AuthGroupFile.

    • Limit enables the use of Limit directives (allow, deny, and order) in a directory's access-configuration file.

  • AuthName name-Specifies the authorization name for a directory.

  • AuthType type-Specifies the type of authorization to be used. The only supported authorization type is Basic.

  • AuthUserFile filename-Specifies the file in which usernames and passwords are stored for authorization. For example, the following directive sets the authorization file to /etc/httpd/conf/passwd:

    AuthUserFile /etc/httpd/conf/passwd

    You have to create the authorization file with the /usr/bin/htpasswd support program. To create the authorization file and add the password for a user named jdoe, specify the following:

    /usr/bin/htpasswd -c /etc/httpd/conf/passwd jdoe
    New password: (type the password)
    Re-type new password: (type the same password again)
    Adding password for user jdoe

  • AuthGroupFile filename-Specifies the file to consult for a list of user groups for authentication.

  • order ord-Specifies the order in which two other directives-allow and deny-are evaluated. The order is one of the following:

    • deny,allow causes the Web server to evaluate the deny directive before allow.

    • allow,deny causes the Web server to evaluate the allow directive before deny.

    • mutual-failure enables only hosts that appear in the allow list and do not appear in the deny list.

  • deny from host1 host2 ...-Specifies the hosts denied access.

  • allow from host1 host2 ...-Specifies the hosts allowed access. To enable all hosts in a specific domain to access the Web documents in a directory, specify the following:

    order deny,allow
    allow from .nws.noaa.gov
  • require entity en1 en2 ...-This directive specifies which users can access a directory. entity is one of the following:

    • user enables only a list of named users.

    • group enables only a list of named groups.

    • valid-user enables all users listed in the AuthUserFile access to the directory (provided that they enter the correct password).
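The authorization directives are typically combined in a per-directory access file. The following sketch writes such a file. The function name write_htaccess, the scratch directory, and the realm name "Members Only" are hypothetical. The password file path is the one used in the AuthUserFile example. For the server to honor the file, the enclosing <Directory> section would need AllowOverride AuthConfig instead of the AllowOverride None shown earlier.

```shell
# write_htaccess: create a .htaccess file in the given directory that
# requires a user name and password from the given password file.
write_htaccess() {
    dir=$1
    passwd_file=$2
    cat > "$dir/.htaccess" <<EOF
AuthType Basic
AuthName "Members Only"
AuthUserFile $passwd_file
require valid-user
EOF
}

# Demonstration in a scratch directory instead of a directory under
# /var/www/html, so that nothing on a live server is changed.
demo_dir=$(mktemp -d)
write_htaccess "$demo_dir" /etc/httpd/conf/passwd
cat "$demo_dir/.htaccess"
rm -rf "$demo_dir"
```

With require valid-user, any user listed in the password file gets access after entering the correct password. To limit access to specific users, the last line could instead read require user jdoe.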

Supporting Virtual Hosts with the Apache HTTP Server

A useful feature of the Apache HTTP server is its ability to handle virtual Web servers. This ability enables a single server to respond to many different IP addresses and to serve Web pages from different directories, depending on the IP address. That means that you can set up a single Web server to respond to both www.big.org and www.tiny.com and serve a unique home page for each hostname. A server with this capability is known as a multihomed Web server, a virtual Web server, or a server with virtual host support.

As you might guess, ISPs use virtual host capability to offer virtual websites to their customers. You must meet the following requirements to support virtual hosts:

  • The Web server must be able to respond to multiple IP addresses (each with a unique domain name) and must enable you to specify document directories, log files, and other configuration items for each IP address.

  • The Linux system must associate multiple IP addresses with a single physical network interface. Red Hat Linux enables you to associate multiple IP addresses with a single physical interface.

  • Each domain name associated with the IP address must be a unique registered domain name with proper DNS entries.

For the latest information on how to set up virtual hosts in an Apache HTTP server, consult the following URL:

http://httpd.apache.org/docs-2.0/vhosts/index.html

The Apache HTTP server can respond to different host names with different home pages. You have two options when supporting virtual hosts:

  • Run multiple copies of the httpd program, one for each IP address-In this case, you create a separate copy of the httpd.conf configuration file for each host and use the BindAddress directive to make the server respond to a specific IP address.

  • Run a single copy of the httpd program with a single httpd.conf file-In the configuration file, set BindAddress to * (so that the server responds to any IP address), and use the VirtualHost directive to configure the server for each virtual host.

You should run multiple HTTP daemons only if you do not expect heavy traffic on your system; the system may not be able to respond well because of the overhead associated with running multiple daemons. However, you may need multiple HTTP daemons if each virtual host has a unique configuration need for the following directives:

  • ServerType, which specifies whether the server runs as a standalone process or through xinetd.

  • User and Group, which are the user and group IDs for the HTTP daemon.

  • ServerRoot, which is the root directory of the server.

  • TypesConfig, which is the MIME type configuration file.

For a site with heavy traffic, you should configure the Web server so that a single HTTP daemon can serve multiple virtual hosts. Of course, this recommendation implies that there is only one configuration file. In that configuration file, use the VirtualHost directive to configure each virtual host.

Most ISPs use the VirtualHost capability of Apache HTTP server to provide virtual websites to their customers. Unless you pay for a dedicated Web host, you typically get a virtual site where you have your own domain name, but share the server and the actual host with many other customers.

The syntax of the VirtualHost directive is as follows:

<VirtualHost hostaddr> 
    ... directives that apply to this host
    ... 
</VirtualHost> 

With this syntax, you use <VirtualHost> and </VirtualHost> to enclose a group of directives that will apply only to the particular virtual host identified by the hostaddr parameter. The hostaddr can be an IP address, or the fully qualified domain name of the virtual host.

You can place almost any Apache directives within the <VirtualHost> block. At a minimum, Webmasters include the following directives in the <VirtualHost> block:

  • DocumentRoot, which specifies where this virtual host's documents reside

  • ServerName, which identifies the server to the outside world (this should be a registered domain name that DNS can resolve)

  • ServerAdmin, which is the email address of this virtual host's Webmaster

  • Redirect, which specifies any URLs to be redirected to other URLs

  • ErrorLog, which specifies the file where errors related to this virtual host are to be logged

  • CustomLog, which specifies the file where accesses to this virtual host are logged

When the server receives a request for a document in a particular virtual host's DocumentRoot directory, it uses the configuration parameters within that server's <VirtualHost> block to handle that request.
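As a rough sketch of the single-daemon arrangement, the following httpd.conf fragment defines two IP-based virtual hosts in one configuration file. The IP addresses, host names, and directory paths are placeholders invented for illustration, and the fragment assumes that a log format named common is already defined in httpd.conf by a LogFormat directive:

```apache
# Illustrative only: the addresses, names, and paths below are hypothetical.
# A single httpd process reads this file and serves both hosts.

<VirtualHost 192.0.2.10>
    ServerName    www.example.com
    ServerAdmin   webmaster@example.com
    DocumentRoot  /var/www/example-com/html
    # Relative log paths are resolved against ServerRoot
    ErrorLog      logs/example-com-error_log
    CustomLog     logs/example-com-access_log common
</VirtualHost>

<VirtualHost 192.0.2.11>
    ServerName    www.example.org
    ServerAdmin   webmaster@example.org
    DocumentRoot  /var/www/example-org/html
    ErrorLog      logs/example-org-error_log
    CustomLog     logs/example-org-access_log common
</VirtualHost>
```

For a setup like this to work, each address must be assigned to a network interface on the host. Running httpd -t after editing the file checks the configuration syntax without starting the server.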

Here is a typical example of a <VirtualHost> directive that sets up the virtual host www.lnbsoft.com:

<VirtualHost www.lnbsoft.com>
    DocumentRoot  /home/naba/httpd/htdocs
    ServerName    www.lnbsoft.com
    ServerAdmin   webmaster@lnbsoft.com
    ScriptAlias   /cgi-bin/  /home/naba/httpd/cgi-bin/
    ErrorLog      /home/naba/httpd/logs/error_log
    CustomLog     /home/naba/httpd/logs/access_log common
</VirtualHost>

Here, the name common in the CustomLog directive refers to the name of a format defined earlier in the httpd.conf file by the LogFormat directive, as follows:

LogFormat "%h %l %u %t \"%r\" %>s %b" common

This format string for the log produces lines in the log file that look like this:

dial236.dc.psn.net - - [29/Oct/2002:18:09:00 -0500] "GET / HTTP/1.0" 200 1243

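The format string contains tokens that start with a percent sign (%). Most consist of the percent sign followed by a single letter; %>s is the %s token with a modifier that logs the status of the final request when the original request is internally redirected. The meaning of these tokens is shown in Table 14-1.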

Table 14-1: LogFormat Tokens

Token   Meaning
-----   -------
%b      The number of bytes sent to the client, excluding header information
%h      The hostname of the client machine
%l      The identity of the user, if available
%r      The HTTP request from the client (for example, GET / HTTP/1.0)
%s      The server response code from the Web server
%t      The current local date and time
%u      The user name the user supplies (only when access-control rules require user name/password authentication)
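
The same tokens can be assembled into other formats. As a sketch, the fragment below defines the widely used combined format, which extends common with the Referer and User-Agent request headers, and then applies it to a log file. The %{Header}i token (the value of the named request header) is not listed in Table 14-1, and the log file path is an assumed example:

```apache
# common, plus the Referer and User-Agent request headers
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# Hypothetical log location (a relative path is resolved against ServerRoot)
CustomLog logs/access_log combined
```

A CustomLog directive placed inside a <VirtualHost> block can name either format, so different virtual hosts can log different levels of detail.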

Configuring Apache for Server-Side Includes (SSI)

Server-side include (SSI) refers to a feature of the Apache Web server whereby it can include a file or the value of an environment variable in an HTML document. The feature is like the include files in many programming languages such as C and C++. Just as a preprocessor processes the include files in a programming language, the Web server reads the HTML file and parses the server-side includes before returning the document to the Web browser.

Server-side includes provide a convenient way to insert the date, the size of a file, or the contents of any file into an HTML document. The SSI directives look like special comments in the HTML file. For example, you can show the size of a graphics file by placing the following SSI directive in the HTML file:

File size = <!--#fsize file="nbphoto.gif"-->

The Web server replaces the SSI directive (everything to the right of the equal sign) with the size of the file nbphoto.gif.

Similarly, to display today's date, you can use the following SSI directive:

Today is <!--#echo var="DATE_LOCAL" --> 

To enable SSI on the Apache Web server, place the following directive in the /etc/httpd/conf/httpd.conf file:

Options +Includes

Apache directives can apply to specific directories. Therefore, it's best if you place this directive in the block of directives that apply to the directory where you want to allow SSI.
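
A minimal sketch of that arrangement follows. The directory path is a hypothetical example, and the AddType and AddOutputFilter lines reflect a common Apache 2.0 setup in which only files ending in .shtml are parsed for SSI directives; treat the file extension as an assumption rather than a requirement:

```apache
# Hypothetical directory; substitute the directory that holds your SSI pages.
<Directory "/var/www/html/ssi">
    # Allow server-side includes in this directory only
    Options +Includes
</Directory>

# Serve .shtml files as HTML and pass them through the SSI (INCLUDES) filter
AddType text/html .shtml
AddOutputFilter INCLUDES .shtml
```

Limiting SSI to one directory and one file extension keeps the server from parsing every HTML file it serves, which would add overhead to requests that contain no SSI directives.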

Supporting CGI Programs in Apache

Sometimes an HTML document's content may not be known in advance. For example, if a website provides a search capability, the result of a search depends on which keywords the user enters in the search form. To handle these needs, the Web server relies on external programs called gateways.

A gateway program accepts the user input and responds with the requested data formatted as an HTML document. Often, the gateway program acts as a bridge between the Web server and some other repository of information such as a database.

Gateway programs have to interact with the Web server. To allow anyone to write a gateway program, the method of interaction between the Web server and the gateway program had to be specified completely. Common Gateway Interface (CGI) is the standard method used by gateway programs to exchange information with the Web server. The Apache Web server supports CGI programs.

The URL specifying a CGI program looks like any other URL, but the Apache Web server can examine the directory name and d