The principles of network management resemble the principles of network communications; both use a layered approach. The layered approach of the FCAPS model for network management is like the Open System Interconnection (OSI) model used for internetworking, as shown in Figure 11-1.
The correlation between the FCAPS layers and the layers of the OSI model is not direct.
FCAPS is an acronym for the network management model, or framework, and is made up of five layers, as follows:
(F)ault management: Network faults and problems are found and fixed.
(C)onfiguration management: The network is monitored and controlled, often from a central point, such as a network operations center, or NOC. Configuration management includes keeping track of hardware and software on the network and any modifications to this hardware and software.
(A)ccounting management: Network resources are distributed and departments are charged for their end users' network use, such as long-distance or bandwidth usage per user.
(P)erformance management: Network congestion and bottlenecks are minimized.
(S)ecurity management: Only people who need access to specific network resources are allowed to see and use these resources. Security management applies equally to both outside intruders and internal users; not all network hackers are from outside an organization.
Each of these layers is discussed in more detail in the following sections.
Whereas the OSI model works in a service-based mode, meaning that each layer provides services to the layer above and depends on the layer below, the FCAPS model works in a more isolated fashion; each layer can operate independently of the other layers.
Fault management detects, logs, and notifies network managers of any network issues. Where possible, fault management can also respond to network issues automatically, for example by rerouting traffic around the fault, much like detouring traffic around an accident on the highway, as illustrated in Figure 11-2.
If a network has redundancy (a backup path) built into its topology, fault management can be configured to respond automatically. The fault itself is not corrected automatically; rather, network connectivity is recovered automatically. Would you rather troubleshoot a network problem while your users are up on a backup path or while your phone is ringing off the hook? With fault management, your network can automatically detour the network traffic to the good path.
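The automatic detour described above can be sketched in a few lines. This is only an illustration of the idea, not a real routing protocol; the path names and the `select_path` helper are hypothetical.

```python
from typing import Optional

def select_path(paths: list[str], failed: set[str]) -> Optional[str]:
    """Return the first path, in preference order, that has not failed."""
    for path in paths:
        if path not in failed:
            return path
    return None  # no redundancy left: connectivity is lost

# Hypothetical topology: a primary link with one backup path.
paths = ["primary", "backup"]
print(select_path(paths, failed=set()))        # prints "primary"
print(select_path(paths, failed={"primary"}))  # prints "backup"
```

Note that the failed primary link still has to be repaired; only the recovery of connectivity is automatic.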
Most network management systems poll the managed devices for error conditions, such as failed links or network congestion, and present this information to the network manager in a manner that is usable, such as an alarm at a network management console or the automatic sending of an e-mail or text page to the network manager, as illustrated in Figure 11-3.
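A polling pass of this kind might look like the following sketch. It assumes the status values have already been collected from the devices (in practice, typically with a protocol such as SNMP); the `DeviceStatus` record, the device names, and the 90 percent congestion threshold are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class DeviceStatus:
    name: str
    link_up: bool
    utilization_pct: float

def find_alarms(statuses, congestion_threshold_pct=90.0):
    """Turn one round of polled status data into human-readable alarms."""
    alarms = []
    for s in statuses:
        if not s.link_up:
            alarms.append(f"{s.name}: link down")
        elif s.utilization_pct >= congestion_threshold_pct:
            alarms.append(f"{s.name}: congestion ({s.utilization_pct:.0f}% utilized)")
    return alarms

# Hypothetical results of one polling cycle.
polled = [
    DeviceStatus("router-a", True, 35.0),
    DeviceStatus("switch-b", False, 0.0),
    DeviceStatus("router-c", True, 95.0),
]
for alarm in find_alarms(polled):
    print(alarm)  # in a real system: raise a console alarm, send e-mail, or page
```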
Fault management takes care of events and traps as they occur within your network. A trap is an event that occurs when certain triggers happen. This is similar to your being notified of an overdue bill payment because the billing system "trapped" the event that the payment was not received in time; the trigger is the billing system recognizing the overdue payment, and the trap is the automatic notice sent to you in the mail. Network management traps work in the same fashion. When an event happens within the network that you have set a trap for, such as a failed link, an event notification is sent to the destination you have defined: a network management station, e-mail, or even your pager.
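The relationship between a trap, its trigger, and its destinations can be sketched as a small registry. This is a simplified model, not the SNMP trap mechanism itself; the `TrapRegistry` class, the event names, and the destinations are hypothetical.

```python
from collections import defaultdict

class TrapRegistry:
    """Maps event types to the destinations that should be notified."""

    def __init__(self):
        self._destinations = defaultdict(list)

    def set_trap(self, event_type, destination):
        """Ask to be notified at `destination` whenever `event_type` occurs."""
        self._destinations[event_type].append(destination)

    def handle_event(self, event_type, detail):
        """Return one (destination, message) notification per registered trap."""
        return [
            (dest, f"{event_type}: {detail}")
            for dest in self._destinations.get(event_type, [])
        ]

registry = TrapRegistry()
registry.set_trap("link_down", "nms-console")
registry.set_trap("link_down", "pager")

# A failed link triggers both traps; an event with no trap set is ignored.
print(registry.handle_event("link_down", "router-a to router-b"))
print(registry.handle_event("fan_failure", "switch-c"))  # prints []
```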
Figure 11-4 illustrates the basic process of fault management.
When a network event occurs, an alarm is sounded. When the network manager (you) detects the alarm, you begin to identify what the problem is in the network. After you've identified the problem, such as a failed device or link, you begin to solve the problem; this is called troubleshooting. You continue to troubleshoot the problem until you have found a resolution that works and fixes the problem. After you have applied this fix, you log the initial fault and what you did to correct it, so that if it happens again you don't have to repeat your efforts. In other words, if that little red light lights up again, you'll know what to do because you did it before.
The purpose of configuration management is to track network and system configuration information, covering both hardware and software, so that the effect of each component on network operation can be monitored and managed. Changes, additions, and deletions from the network must be coordinated with the network manager or network management personnel, often in a network operations center (NOC).
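The value of the final logging step can be shown with a short sketch: a fault seen before is resolved from the log, and troubleshooting runs only for new faults. The `FaultLog` class and `handle_alarm` function are hypothetical names used for illustration.

```python
class FaultLog:
    """Remembers which fix resolved each fault."""

    def __init__(self):
        self._resolutions = {}

    def known_fix(self, fault):
        return self._resolutions.get(fault)

    def record(self, fault, fix):
        self._resolutions[fault] = fix

def handle_alarm(fault, log, troubleshoot):
    """Reuse a logged fix if one exists; otherwise troubleshoot and log the result."""
    fix = log.known_fix(fault)
    if fix is None:
        fix = troubleshoot(fault)   # the slow, manual part
        log.record(fault, fix)      # log it so the effort is not repeated
    return fix

log = FaultLog()
first = handle_alarm("link down on port 3", log, lambda fault: "replace cable")
# The second occurrence is answered from the log; troubleshooting is skipped.
second = handle_alarm("link down on port 3", log, lambda fault: "never called")
print(first, "|", second)  # prints "replace cable | replace cable"
```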
Before any change is made to the network, it is good practice to have all parties involved in the change discuss what will change, how it will be changed, who will make the change, when the change will occur (often during off-hours when network users are minimally impacted), and, most important, what to do if the change doesn't work, as illustrated in Figure 11-5.
Generally it is not a good idea to effect many changes at one time, because that can be a recipe for disaster. If you need to make several changes to your network, it is best, if possible, to make one change at a time to ensure the network remains up and stable. If you make several changes simultaneously and something goes wrong, you might not know what caused the problem, making fault management your new goal.
Configuration management comprises a number of elements:
Inventory hardware: An inventory of active and spare network hardware.
Inventory software: An inventory of software in use and its associated license keys.
Configuration information: A baseline of hardware firmware updates and software patches that have been applied within your network, and the function of each update and patch. The baseline is often used in the installation of new devices as a template or standard.
Change control: A process whereby network hardware and software changes are managed in a controlled environment with back-out procedures in place, in case an update does not take or goes bad, and the network is down as a result. You can think of change control as the Reset button on your network.
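The back-out idea behind change control can be sketched as follows: a change is applied to a copy of the configuration, verified, and discarded if verification fails. This is a minimal illustration under assumed names (`apply_change`, the `running` configuration dictionary, and its keys are all hypothetical), not a description of any particular change-management tool.

```python
import copy

def apply_change(config, change, verify):
    """Apply `change` to a copy of `config`; back out if `verify` fails.

    Returns a (resulting_config, applied) tuple.
    """
    candidate = copy.deepcopy(config)  # work on a copy so the running config is untouched
    change(candidate)
    if verify(candidate):
        return candidate, True
    return config, False               # back-out: keep the known-good configuration

running = {"firmware": "1.0", "vlan_count": 4}

def upgrade(cfg):
    cfg["firmware"] = "2.0"

new_config, applied = apply_change(running, upgrade, lambda cfg: cfg["firmware"] == "2.0")
print(applied, new_config["firmware"])  # prints "True 2.0"
```

Applying one change per call also matches the earlier advice: if verification fails, you know exactly which change caused the problem.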
Accounting management is intended to measure network utilization so that individual users or groups on a network can be regulated, preventing one person or group from using all the network bandwidth and keeping others from using the network to its full capacity. Accounting management also provides the network manager a means to bill network usage back to customers or internal departments, as illustrated in Figure 11-6.
Accounting management provides a mechanism for the Information Technology (IT) department to bill network usage back to internal departmental users so that no one department gets "stuck" with the bill.
Accounting management and performance management share some characteristics. These are the functions they have in common:
Monitoring and measuring of network bandwidth utilization.
Analysis of usage patterns and the trend of those usage patterns. Is usage decreasing, increasing, or holding steady?
Ongoing measurement of network bandwidth. This measurement can result in bandwidth utilization and billing information, helping you ensure there is enough network bandwidth for all your users.
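Billing network usage back to departments can be as simple as splitting a total cost in proportion to measured usage. The sketch below assumes that billing model; the `chargeback` function, the department names, and the figures are hypothetical.

```python
def chargeback(usage_gb, total_cost):
    """Split `total_cost` across departments in proportion to their measured usage."""
    total_usage = sum(usage_gb.values())
    if total_usage == 0:
        return {dept: 0.0 for dept in usage_gb}
    return {
        dept: round(total_cost * gb / total_usage, 2)
        for dept, gb in usage_gb.items()
    }

# Hypothetical monthly usage in gigabytes and a 1,000-unit network bill.
bill = chargeback({"sales": 300, "engineering": 700}, total_cost=1000)
print(bill)  # prints {'sales': 300.0, 'engineering': 700.0}
```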
Similar to accounting management, performance management is intended to measure various aspects of network performance. Performance management makes these measurements available so that the network can be maintained at an acceptable level, neither over- nor underutilized, as illustrated in Figure 11-7.
Performance management provides you the tools and methods to collect and analyze network statistics, enabling you to "paint a picture" of your network and how it behaves. Performance management also provides you reporting mechanisms so that network performance can be measured against service level agreements (SLAs) that you might have contracted with a service provider.
An overutilized network can result in contention for network bandwidth, which can be identified by users complaining of a slow network. An underutilized network can result in your paying for network bandwidth you are not using (and might never use).
Figure 11-8 illustrates the performance management process.
First you must gather the interesting performance data. "Interesting" does not mean that the data makes for lively reading, but that it pertains to the network segment you are measuring. After you've gathered this data, you must analyze it and determine the baselines. The average network usage might be a more useful baseline for you than the peak usage data, for example, because the average utilization helps you determine whether your usage is going up or down. After you've established your baseline (in this case, average utilization), you need to establish the performance thresholds, the points at which you consider the network to be over- and underutilized. What you use for a baseline depends on your situation and what information you are looking for, such as average utilization, minimum/maximum utilization, peak utilization hours, and so on.
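The baseline-and-threshold step can be sketched as follows, using average utilization as the baseline. The sample values and the 20 and 80 percent thresholds are illustrative assumptions; appropriate thresholds depend on your network.

```python
from statistics import mean

def classify_utilization(samples_pct, low_pct, high_pct):
    """Compute the average-utilization baseline and compare it with the thresholds."""
    baseline = mean(samples_pct)
    if baseline > high_pct:
        state = "overutilized"
    elif baseline < low_pct:
        state = "underutilized"
    else:
        state = "acceptable"
    return baseline, state

# Hypothetical utilization samples (percent) gathered from one network segment.
samples = [40, 50, 60]
print(classify_utilization(samples, low_pct=20, high_pct=80))  # prints (50, 'acceptable')
```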
Beyond reactive processes, performance management enables you to monitor and manage your network proactively through network simulation or trend analysis. The data collected can be used to create reports justifying network upgrades or to support projects that have been started. Often the technical details are important for political and financial purposes, not just for your own group.
Performance management baseline and trend analyses examine the following network characteristics:
Network-capacity planning: The total amount of network bandwidth.
Availability: The total amount of time your network is up and available to its users.
Response time: The total amount of time it takes for a transaction to complete (for example, a frame being sent from an end user to its destination).
Throughput: The average network bandwidth your network is capable of sustaining. If you have a 100-Mbps Fast Ethernet local-area network (LAN), but your users can use only about 50 kbps of it, there is likely a throughput issue.
Utilization: The average amount of bandwidth and time your network is being used by the network end users.
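Two of these characteristics reduce to simple ratios, sketched below. The function names and sample figures are hypothetical, and the formulas are the common percentage definitions rather than anything prescribed by FCAPS.

```python
def availability_pct(uptime_hours, total_hours):
    """Share of the measurement period during which the network was up."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return 100.0 * uptime_hours / total_hours

def utilization_pct(average_used_mbps, capacity_mbps):
    """Average bandwidth in use as a share of the available capacity."""
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    return 100.0 * average_used_mbps / capacity_mbps

# Hypothetical month: one hour of downtime in 720 hours, 50 Mbps used of 100 Mbps.
print(round(availability_pct(719, 720), 2))  # prints 99.86
print(utilization_pct(50, 100))              # prints 50.0
```

Figures like these are what you would compare against the commitments in a service level agreement.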
Security management controls access to network resources in accordance with your organization's security guidelines.
Most network management systems address security regarding network hardware; for example, someone logging in to a router or switch.
Security management systems perform the following functions:
The identification of sensitive network resources
The establishment of maps between sensitive network resources and user sets, mapping out which users can access which resources
The monitoring of sensitive network access points and the logging of inappropriate or failed access to these resources
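The three functions above can be sketched together: a map from sensitive resources to the users allowed to reach them, plus a log of denied attempts. The `check_access` function, the resource names, and the user names are hypothetical.

```python
def check_access(user, resource, access_map, audit_log):
    """Allow access only if the map lists `user` for `resource`; log denials."""
    allowed = user in access_map.get(resource, set())
    if not allowed:
        audit_log.append((user, resource, "denied"))
    return allowed

# Hypothetical map of sensitive resources to the user sets allowed to reach them.
access_map = {
    "core-router": {"alice"},
    "payroll-server": {"alice", "bob"},
}
audit_log = []

print(check_access("alice", "core-router", access_map, audit_log))  # prints True
print(check_access("bob", "core-router", access_map, audit_log))    # prints False
print(audit_log)  # prints [('bob', 'core-router', 'denied')]
```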
When applied, a good network security management system adds several safeguards to prevent unauthorized network access; however, the only safe computer is a standalone computer (one that is not connected to any network). If we all used standalone computers, it would certainly make doing business in today's world challenging, but would be a boon for carrier pigeon breeders. Because carrier pigeons are not always our best choices for a network transport, we accept certain risks when deploying a network, and security management mitigates these risks.
Different aspects of security management in an Internet Protocol (IP) network are combined with the implementation of the AAA model. AAA is the acronym for authentication, authorization, and accounting. AAA is a system in IP networks that controls what resources users have access to and tracks user activity over a network.
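The three parts of AAA can be illustrated with a toy model. This is a sketch only: real deployments typically rely on a dedicated AAA server, and passwords are stored here in plain text purely for brevity. The `SimpleAAA` class and all user and resource names are hypothetical.

```python
import hmac

class SimpleAAA:
    """Toy model of authentication, authorization, and accounting."""

    def __init__(self, credentials, permissions):
        self.credentials = credentials  # user -> password (plain text only for brevity)
        self.permissions = permissions  # user -> set of resources the user may access
        self.accounting_log = []        # (user, resource, outcome) records

    def authenticate(self, user, password):
        """Authentication: is the user who they claim to be?"""
        expected = self.credentials.get(user)
        return expected is not None and hmac.compare_digest(
            expected.encode(), password.encode()
        )

    def authorize(self, user, resource):
        """Authorization: may this user access this resource?"""
        return resource in self.permissions.get(user, set())

    def request(self, user, password, resource):
        """Run all three steps; accounting records every attempt."""
        if not self.authenticate(user, password):
            outcome = "authentication failed"
        elif not self.authorize(user, resource):
            outcome = "not authorized"
        else:
            outcome = "granted"
        self.accounting_log.append((user, resource, outcome))
        return outcome == "granted"

aaa = SimpleAAA({"alice": "s3cret"}, {"alice": {"core-router"}})
print(aaa.request("alice", "s3cret", "core-router"))  # prints True
print(aaa.request("alice", "wrong", "core-router"))   # prints False
print(aaa.request("alice", "s3cret", "hr-database"))  # prints False
```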
Security management is not just about prevention, but also about detection. Security management includes alerting the network manager when an unauthorized user tries to gain access to network resources, as illustrated in Figure 11-9.
In Figure 11-9, an alarm at the NOC is alerting the network manager that someone is attempting to gain access to network resources, such as a router, a switch, a server, or even a user's workstation. As a network manager, you don't care what this person is going after or why, only that it's happening and you have security management policies already in place that address what to do in this case.
The components of security management are as follows:
Policy: The organization has a security policy on user access to certain network resources. The policy spells out who can access what and what happens when a security compromise occurs.
Authority: An individual is identified who has the authority to grant access to sensitive network resources so that users cannot provide themselves access to certain information.
Access level: Sensitivity level of information is identified as well as user access to these levels. Information can be categorized as confidential, secret, or top secret.
Exceptions: Any exceptions to the security policy or access level must be documented to prevent accidental compromises.
Logging: All activities are logged, whether users logging in to their own machines or someone attempting to log in to a network switch.
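The access-level component can be sketched as an ordered list of sensitivity levels. The rule that a user may read information at or below his or her own level is an assumption made for this illustration; your security policy defines the actual rule. The `can_read` function is a hypothetical name.

```python
# Sensitivity levels from least to most sensitive, as categorized above.
LEVELS = ["confidential", "secret", "top secret"]

def can_read(user_level, info_level):
    """Assumed rule: a user may read information at or below the user's own level."""
    return LEVELS.index(user_level) >= LEVELS.index(info_level)

print(can_read("secret", "confidential"))  # prints True
print(can_read("secret", "top secret"))    # prints False
```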