Early Web servers were designed to handle the contents of a single site. The standard way of hosting several Web sites on the same machine was to install and configure a separate Web server instance for each one. As the Internet grew, so did the need for hosting multiple Web sites, and a more efficient solution was developed: virtual hosting. Virtual hosting allows a single instance of Apache to serve different Web sites, identified by their domain names. IP-based virtual hosting means that each of the domains is assigned a different IP address; name-based virtual hosting means that several domains share a single IP address. As is explained later in the hour, name-based virtual hosting requires HTTP/1.1 support.
Web clients use the Domain Name System (DNS) to translate hostnames into IP addresses, and vice versa. Several mappings are possible:
One to one: Each hostname is assigned a single, unique IP address. This is the foundation for IP-based virtual hosting.
One to many: A single hostname is assigned to several IP addresses. This is useful for having several Apache instances serving the same Web site. If each of the servers is installed on a different machine, it is possible to balance the Web traffic among them, improving scalability.
Many to one: The same IP address is assigned to several hostnames. The client specifies the Web site it is accessing by using the Host: header in the request. This is the foundation for name-based virtual hosting.
When a one-to-many mapping is in place, a DNS server usually can be configured to respond with a different IP address for each DNS query, which helps to distribute the load. This is known as round robin DNS.
The simplest virtual host configuration is one in which each host is assigned a unique IP address. Apache maps the requests arriving on each IP address to a separate content tree, defined in its own VirtualHost container, as shown in the following snippet:
Listen 192.168.128.10:80
Listen 192.168.129.10:80

<VirtualHost 192.168.128.10:80>
    DocumentRoot /usr/local/www-docs/host1
</VirtualHost>

<VirtualHost 192.168.129.10:80>
    DocumentRoot /usr/local/www-docs/host2
</VirtualHost>
If a DocumentRoot is not specified for a given virtual host, the global setting, specified outside any <VirtualHost> section, will be used. In the previous example, each virtual host has its own DocumentRoot. When a request arrives, Apache will use the destination IP address to direct the request to the appropriate host. For example, if a request comes for IP 192.168.128.10, Apache will return the documents from /usr/local/www-docs/host1.
If the host operating system cannot resolve an IP address used as the VirtualHost container's name, and there's no ServerName directive, Apache will complain at server startup time that it can't map the IP addresses to hostnames. This complaint is not a fatal error. Apache will still run, but the error indicates that there might be some work to be done with the DNS configuration so that Web browsers can find your server.
A fully qualified domain name (FQDN) can be used instead of an IP address as the VirtualHost container name and the Listen directive binding (if the domain name resolves in DNS to an IP address configured on the machine and Apache can bind to it).
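As a minimal sketch of the FQDN form, the first container from the previous snippet could be written with a name instead of an address. This assumes that host1.example.com resolves in DNS to 192.168.128.10 and that the address is configured on the machine. The ServerName line is an addition for illustration; with it in place, Apache does not need to map the address back to a hostname at startup.

```apache
# Assumption: host1.example.com resolves in DNS to 192.168.128.10,
# an address configured on this machine.
Listen 192.168.128.10:80

<VirtualHost host1.example.com:80>
    # An explicit ServerName avoids the startup complaint described above.
    ServerName host1.example.com
    DocumentRoot /usr/local/www-docs/host1
</VirtualHost>
```

Keep in mind that a hostname in the container name makes server startup depend on a working DNS lookup, which is why the Apache documentation generally favors IP addresses here.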
As a way to mitigate the consumption of IP addresses for virtual hosts, the HTTP/1.1 protocol version introduced the Host: header, which enables a browser to specify the exact host for which the request is intended. This allows several hostnames to share a single IP address. Most browsers nowadays provide HTTP/1.1 support.
Although Host: usage was standardized in the HTTP/1.1 specification, some older HTTP/1.0 browsers also provided support for this header.
A typical set of request headers from Microsoft Internet Explorer is shown in Listing 22.1. If the URL were entered with a port number, it would be part of the Host header contents as well.
GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Host: host1.example.com
Connection: Keep-Alive
Apache uses the Host: header for configurations in which a single IP address is shared by multiple hostnames (the many-to-one scenario outlined earlier this hour); hence the term name-based virtual hosts.
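The NameVirtualHost directive enables you to specify IP address and port combinations on which the server will receive requests for name-based virtual hosts. This is a required directive for name-based virtual hosts in Apache 2.0 and 2.2; from Apache 2.4 onward it is deprecated and has no effect, because any address and port combination that appears in more than one VirtualHost container is treated as name-based automatically. Listing 22.2 has Apache dispatch all connections to 192.168.128.10 based on the Host header contents.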
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80

<VirtualHost 192.168.128.10>
    ServerName host1.example.com
    DocumentRoot /usr/local/www-docs/host1
</VirtualHost>

<VirtualHost 192.168.128.10>
    ServerName host2.example.com
    DocumentRoot /usr/local/www-docs/host2
</VirtualHost>
For every hostname that resolves in DNS to 192.168.128.10, Apache can support another name-based virtual host. If a request arrives on that IP address for a hostname that is not included in the configuration file, say host3.example.com, Apache will simply associate the request with the first container in the configuration file; in this case, host1.example.com. The same behavior applies to requests that are not accompanied by a Host header: whichever container is first in the configuration file is the one that gets the request.
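Because the first container receives every unmatched request, one common arrangement is to put a deliberate catch-all container first, so that unknown hostnames do not silently land on a real site. The following is only a sketch: the name default.example.com and the directory /usr/local/www-docs/default are hypothetical placeholders, not part of the listings in this hour.

```apache
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80

# Listed first, so it receives requests with a missing or unknown Host header.
# The name and directory here are hypothetical placeholders.
<VirtualHost 192.168.128.10>
    ServerName default.example.com
    DocumentRoot /usr/local/www-docs/default
</VirtualHost>

<VirtualHost 192.168.128.10>
    ServerName host1.example.com
    DocumentRoot /usr/local/www-docs/host1
</VirtualHost>

<VirtualHost 192.168.128.10>
    ServerName host2.example.com
    DocumentRoot /usr/local/www-docs/host2
</VirtualHost>
```

With this ordering, a request for host3.example.com would be served from the placeholder directory rather than from host1's content.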
An end user from the example.com domain might have his machine set up with example.com as his default domain. In that case, he might direct his browser to http://host1/ instead of the fully qualified http://host1.example.com/. The Host header would simply have host1 in it instead of host1.example.com. To make sure that the correct virtual host container gets the request, you can use the ServerAlias directive as shown in Listing 22.3.
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80

<VirtualHost 192.168.128.10>
    ServerName host1.example.com
    ServerAlias host1
    DocumentRoot /usr/local/www-docs/host1
</VirtualHost>

<VirtualHost 192.168.128.10>
    ServerName host2.example.com
    ServerAlias host2
    DocumentRoot /usr/local/www-docs/host2
</VirtualHost>
In fact, you can give ServerAlias a space-separated list of other names that might show up in the Host header so that you don't need a separate VirtualHost container with a bunch of common directives just to handle all the name variants.
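A container that accepts several name variants might look like the following sketch, which extends the first container from Listing 22.3. The extra name www.host1.example.com is a hypothetical example, and the last entry uses the wildcard form that ServerAlias accepts.

```apache
<VirtualHost 192.168.128.10>
    ServerName host1.example.com
    # Several alternative names on one line, separated by spaces.
    # www.host1.example.com is a hypothetical extra name;
    # *.host1.example.com is a wildcard pattern.
    ServerAlias host1 www.host1.example.com *.host1.example.com
    DocumentRoot /usr/local/www-docs/host1
</VirtualHost>
```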
HTTP/1.1 requires the use of the Host header. If the protocol version is identified as 1.1 in the HTTP request line (that is, GET / HTTP/1.1), the request must be accompanied by a Host header. In the early days of name-based virtual hosts, Host headers were considered a tradeoff: fewer IP resources were required, but legacy browsers that did not send Host headers were still in use and, therefore, could not access all of the server's virtual hosts. Today, that is not a consideration; there is no statistically significant number of such legacy browsers in use.
The only remaining reason to choose IP-based over name-based virtual hosts is when some virtual hosts must use SSL. You can learn more about SSL and this limitation in Hour 23, "Setting Up a Secure Web Server."
In the previous listings, the DocumentRoots follow a simple pattern:
DocumentRoot /usr/local/www-docs/hostname
where hostname is the hostname portion of the fully qualified domain name used in the virtual host's ServerName. For just a few virtual hosts, this configuration is fine. But what if there are dozens, hundreds, or even thousands of these virtual hosts? The configuration file can become difficult to maintain. Apache provides a good solution for cookie-cutter virtual hosts with mod_vhost_alias. You can configure Apache to map the virtual host requests to separate content trees with pattern-matching rules in the VirtualDocumentRoot directive. This functionality is especially useful for ISPs that want to provide a virtual host for each one of their users. The following example provides a simple mass virtual host configuration:
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80
VirtualDocumentRoot /usr/local/www-docs/%1
The %1 token used in this example's VirtualDocumentRoot directive will be replaced by the first portion of the FQDN. mod_vhost_alias directives have a language for mapping FQDN components to filesystem locations. Even individual characters within the FQDN can be accessed.
If we eliminated all the VirtualHost containers and simplified our configuration to the one shown here, the server would serve requests for any subdirectories created in the /usr/local/www-docs directory. If the hostname portion of the FQDN is matched as a subdirectory, that's where Apache will look for content when it translates the request to a filesystem location.
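To give a feel for the mapping language mentioned above, the following sketch combines a whole-component token with a single-character token. The directory layout is an assumption chosen for illustration; the UseCanonicalName Off line follows the examples in the mod_vhost_alias documentation, so that the name is taken from the Host header of the request.

```apache
# Take the server name from the Host header of each request.
UseCanonicalName Off

# For a request to host1.example.com:
#   %0    the whole name               -> host1.example.com
#   %1    the first part               -> host1
#   %1.1  the first character of %1    -> h
# so the directive below maps to /usr/local/www-docs/h/host1
VirtualDocumentRoot /usr/local/www-docs/%1.1/%1
```

Spreading hosts across single-letter directories like this is one way to keep any one directory from growing too large when there are thousands of virtual hosts.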
Note that although virtual hosts normally inherit directives from the main server context, some of them, such as Alias directives, do not get propagated. For instance, the virtual hosts will not inherit this filesystem mapping:
Alias /icons /usr/local/apache2/icons
The FollowSymLinks flag for the Options directive is also disabled in this context. However, a variant of the ScriptAlias directive is supported.
The VirtualScriptAlias directive shown in the following snippet treats any resource requested under /cgi-bin as a CGI script:
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80
VirtualDocumentRoot /usr/local/vhosts/%1/docs
VirtualScriptAlias /usr/local/vhosts/%1/cgi-bin
Note that cgi-bin is a special token for that directive. Calling the directory just cgi won't work; it must be cgi-bin.
For IP-based virtual hosting needs, there are variants of these directives: VirtualDocumentRootIP and VirtualScriptAliasIP. However, because the primary motivation for IP-based virtual hosts is SSL, and there's no pattern-matched path support for SSL resources such as certificates and keys, the uses are fairly limited.
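For completeness, the IP-based variants might be used as in the following sketch, where each token stands for one part of the dotted IP address on which the request arrived. The /usr/local/vhosts layout is carried over from the earlier snippet as an assumption.

```apache
# Interpolation uses the IP address the request arrived on,
# not the Host header.
# A request arriving on 192.168.128.10 maps to
#   /usr/local/vhosts/192/168/128/10/docs
VirtualDocumentRootIP /usr/local/vhosts/%1/%2/%3/%4/docs
VirtualScriptAliasIP  /usr/local/vhosts/%1/%2/%3/%4/cgi-bin
```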