While the importance of reliable system logs can't be overestimated, logs only tell part of the story of what is happening on your network. When something out of the ordinary happens, the event is duly logged to the appropriate file, where it waits for a human to notice and take the appropriate action. But logs are valuable only if someone actually reads them. When log files add to the deluge of information that most network administrators already wade through each day, many log files may go unread for days or weeks. This situation is made worse when log files are clogged with irrelevant information. For example, a cry for help from an overburdened mail server can easily be lost if it is surrounded by innocuous entries about failed spam attempts. All too often, logs are used as a resource to figure out "what happened" when systems fail, rather than as a guide to what is happening now.
Another important aspect of log entries is that they only provide a "spot check" of your system at a particular moment. Without a history of what normal performance looks like, it can be difficult to tell the difference between ordinary network traffic, a DoS attack, and a visitation from Slashdot readers. While you can easily build a report on how many times the /var partition filled up, how can you tell what usage looks like over time? Is the mail spool clogged due to one inconsiderate user, or is it part of an attack by an adversary? Or is it simply a general trend that is the result of trying to serve too many users on too small a disk?
This chapter describes a number of methods for tracking the availability of services and resources over time. Rather than having to watch system logs manually, it is usually far better to have the systems notify you when there is a problem, and only when there is a problem. There are also a number of suggestions about how to recognize trends in your network traffic by monitoring flows and plotting the results on a graph. Sure, you may know what your average outbound Internet traffic looks like, but how much of that traffic is made up of HTTP versus SMTP? You may know roughly how much is being used by each server on your network, but what if you want to break the traffic down by protocol? The hacks in this chapter will show you how.