9.5 Planning for Disasters

It's a fact of life on a network that things go wrong. Hardware fails, software has bugs, and people occasionally make mistakes. Sometimes this results in minor inconveniences, like having a few users lose connections. Sometimes the results are catastrophic and involve the loss of important data and valuable jobs.

Because the Domain Name System relies so heavily on the network, it is vulnerable to network outages. Thankfully, the design of DNS takes into account the imperfection of networks: it allows for multiple, redundant name servers, retransmission of queries, retrying zone transfers, and so on.

DNS doesn't protect itself from every conceivable calamity, though. It doesn't, or can't, guard against certain types of network failures, some of them quite common. But with a small investment of time and money, you can minimize the threat of these problems.

9.5.1 Outages

Power outages, for example, are relatively common in many parts of the world. In some parts of the U.S., thunderstorms or tornadoes may cause a site to lose power or have only intermittent power for an extended period. Elsewhere, typhoons, volcanoes, or construction work may interrupt electrical service. And those of you in California never know when you might lose power in a rolling blackout caused by a lack of electrical capacity.

If all your hosts are down, of course, you don't need name service. Quite often, however, sites have problems when power is restored. Following our recommendations, they run their name servers on file servers and big, multiuser machines. And when the power comes up, those machines are naturally the last to boot, because all those disks need to be checked and fixed first! This means that all the on-site hosts that are quick to boot do so without the benefit of name service.

This can cause all sorts of wonderful problems, depending on what services your hosts access when they boot. For example, your PCs may mount your servers' drives (via net use) when they boot. If they do, they almost certainly specify the servers' domain names or NetBIOS names.

Using hostnames in commands is admirable because it allows administrators to change the servers' IP addresses without changing all the startup files on-site. However, if name service isn't available when your PCs boot, the net use command will fail, which may cause subsequent commands to fail, too. This will certainly not help your users' productivity.
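To make this failure mode concrete, here is a minimal Python sketch of a startup check that retries a name lookup a few times instead of giving up on the first failure. It is an illustration only, not something the recommendations below depend on. The function name wait_for_name, the retry count, and the delay are all hypothetical choices, and the resolve and sleep parameters exist only so the logic can be exercised without a live name server.

```python
import socket
import time


def wait_for_name(hostname, attempts=5, delay=2.0,
                  resolve=socket.gethostbyname, sleep=time.sleep):
    """Return the address of hostname, or None if every attempt fails.

    `resolve` and `sleep` are injectable so the retry logic can be
    tested without a network; by default the system resolver is used.
    """
    for attempt in range(attempts):
        try:
            return resolve(hostname)
        except OSError:
            # socket.gaierror is a subclass of OSError, so this also
            # covers "name service not reachable yet" at boot time.
            if attempt < attempts - 1:
                sleep(delay)
    return None
```

A startup script could call wait_for_name with a server's domain name and run net use only once an address comes back. If the function returns None, name service is still unavailable and anything that depends on the name would fail.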

9.5.2 Recommendations

Our recommendation is to add the names and IP addresses of critical hosts to your PCs' HOSTS files. Any host whose name is referenced during the boot process should appear in this file. You can keep the copies synchronized by copying the file from share to share. On Windows Server 2003, the default location for the file is %SystemRoot%\System32\Drivers\Etc, usually C:\Windows\System32\Drivers\Etc.

The format of the file is just like the format of the Unix /etc/hosts file: each line consists of an IP address (in dotted-octet notation), which starts in the first column, followed by whitespace and the canonical name of the host. Optionally, one or more aliases may follow the canonical name. For example:

    <IP address>   wormhole.movie.edu     wormhole
    <IP address>   terminator.movie.edu   terminator

Now, if a PC needs to look up wormhole or wormhole.movie.edu when it boots, it will be able to resolve the name.
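The file format described above is simple enough to read with a few lines of code. The following Python sketch parses HOSTS-format text into a name-to-address table. The addresses in the sample are placeholders from the RFC 5737 documentation range, not the hosts' real addresses. The parse_hosts name and the choice to keep the first address seen for a name are assumptions of this sketch.

```python
def parse_hosts(text):
    """Map every canonical name and alias in HOSTS-format text to its address."""
    table = {}
    for line in text.splitlines():
        # Drop anything after '#', which starts a comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank, comment-only, or malformed line
        address, names = fields[0], fields[1:]
        for name in names:
            # Domain names are case-insensitive. Keeping the first
            # address seen for a name is an assumption of this sketch.
            table.setdefault(name.lower(), address)
    return table


# Placeholder addresses (RFC 5737 documentation range), not real ones.
sample = """\
# critical hosts referenced at boot time
192.0.2.1   wormhole.movie.edu     wormhole
192.0.2.3   terminator.movie.edu   terminator
"""

hosts = parse_hosts(sample)
print(hosts["wormhole"])               # 192.0.2.1
print(hosts["terminator.movie.edu"])   # 192.0.2.3
```

Both the canonical name and the alias map to the same address, which is why a lookup of either wormhole or wormhole.movie.edu succeeds.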

However, using HOSTS files poses some danger: unless you take care to keep the files up to date, their information may become stale. And since the Windows Server 2003 resolver uses HOSTS before querying a name server, a stale entry can cause resolution failures that are hard to diagnose.
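One way to catch stale entries is to compare the HOSTS data against what the name servers say from time to time. The Python sketch below shows the comparison only. The find_stale function and its lookup parameter are hypothetical: lookup stands for any function that returns the addresses a name server gives for a name and raises OSError when there is no answer. The lookup must bypass the local HOSTS file, for example by running on a host without these entries. Otherwise, the resolver would simply return the HOSTS data being checked. The addresses are placeholders from the RFC 5737 documentation range.

```python
def find_stale(hosts_entries, lookup):
    """Return {name: (hosts_address, dns_addresses)} for entries that
    disagree with the name server's view.

    hosts_entries maps names to the address listed in the HOSTS file.
    lookup(name) returns a list of addresses or raises OSError.
    """
    stale = {}
    for name, hosts_address in hosts_entries.items():
        try:
            dns_addresses = set(lookup(name))
        except OSError:
            continue  # no answer from name service; nothing to compare
        if hosts_address not in dns_addresses:
            stale[name] = (hosts_address, sorted(dns_addresses))
    return stale


# A stand-in for real name service, used only for illustration.
dns_view = {
    "wormhole.movie.edu": ["192.0.2.1"],
    "terminator.movie.edu": ["192.0.2.33"],  # renumbered since HOSTS was written
}


def fake_lookup(name):
    if name not in dns_view:
        raise OSError("no such name")
    return dns_view[name]


entries = {
    "wormhole.movie.edu": "192.0.2.1",
    "terminator.movie.edu": "192.0.2.3",
}
print(find_stale(entries, fake_lookup))
# {'terminator.movie.edu': ('192.0.2.3', ['192.0.2.33'])}
```

Entries for which the lookup fails are skipped rather than reported, since a failed lookup says nothing about whether the HOSTS entry is right.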

The best solution to this problem is to run a name server on a host with uninterruptible power. If you rarely experience extended power loss, battery backup might be enough. If your outages are longer and name service is critical to you, you should consider an uninterruptible power system (UPS) with a generator of some kind.

If you can't afford luxuries like these, you might just try to track down the fastest-booting host around and run a name server on it. Hosts with small filesystems should boot quickly since they don't have many disks to check.

Once you've located the right host, you'll need to make sure the host's IP address appears in the resolver configurations of all of your hosts that need full-time name service. You'll probably want to list the backed-up host last since, during normal operation, hosts should use the name server closest to them. Then, after a power failure, your critical applications will still have name service, albeit at a small sacrifice in performance.
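The ordering rule in the last paragraph can be sketched in a few lines of Python. The function name resolver_order and the addresses (placeholders from the RFC 5737 documentation range) are hypothetical. The sketch only computes the order in which name server addresses would be listed and does not change any resolver configuration.

```python
def resolver_order(nearby_servers, backed_up_server):
    """Return the name server list with the power-protected server last."""
    ordered = []
    for address in nearby_servers:
        # Skip duplicates, and hold the backed-up server for the end.
        if address != backed_up_server and address not in ordered:
            ordered.append(address)
    ordered.append(backed_up_server)
    return ordered


print(resolver_order(["192.0.2.10", "192.0.2.53", "192.0.2.11"], "192.0.2.53"))
# ['192.0.2.10', '192.0.2.11', '192.0.2.53']
```

With this order, a host queries its closest name servers first during normal operation and falls back to the backed-up server only when the others don't respond.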