As defined previously, the focus of accounting is to track the usage of network resources and traffic characteristics. The following sections identify various accounting scenarios:
Network monitoring
User monitoring and profiling
Application monitoring and profiling
Capacity planning
Traffic profiling and engineering
Peering and transit agreements
Billing
Security analysis
This is certainly not an exhaustive list of the different accounting scenarios and categories. Nevertheless, it covers the needs of the majority of enterprise and service provider customers. Each section describes the problem space, examples of specific results, and some implementation examples.
Let's start by discussing some generic examples that are at the edge between accounting and performance monitoring. The fuzzy area of "network monitoring" fits in here. The term "network monitoring" is widely interpreted: one person might relate it to device utilization only, and someone else might think of end-to-end monitoring. In fact, network monitoring is a vague expression that includes multiple functions. Network monitoring applications enable a system administrator to monitor a network for the purposes of security, billing, and analysis (both live and offline). We propose to use the term "network monitoring" for any application that does not fit into the other categories.
Table 1-2 illustrates device utilization. Assume that we have a network with three service classes deployed. Class 0 delivers real-time traffic, such as voice over IP, and class 1 carries business-critical traffic, such as e-mail and financial transactions. Class 2 covers everything else; this is the "best-effort" traffic class. Table 1-2 illustrates the total amount of traffic collected per class, including the number of packets and number of bytes. This report provides relevant information to a network planner. The technology applied in this example is an SNMP data collection of the CISCO-CLASS-BASED-QOS-MIB (see Chapter 4, "SNMP and MIBs"), which describes all the CoS counters.
Time (Hour) | Class 0 Packets | Class 0 Bytes | Class 1 Packets | Class 1 Bytes | Class 2 Packets | Class 2 Bytes |
---|---|---|---|---|---|---|
0 | 38 | 2735 | 1300 | 59800 | 3 | 1002 |
1 | 55 | 3676 | 400 | 44700 | 61 | 9791 |
2 | 41 | 36661 | 400 | 16800 | 4 | 240 |
3 | 13 | 1660 | 200 | 8400 | 4 | 424 |
4 | 16 | 14456 | 400 | 44700 | 4 | 420 |
5 | 19 | 2721 | 400 | 44400 | 1 | 48 |
6 | 21 | 24725 | 600 | 35600 | 516 | 20648 |
7 | 19 | 3064 | 700 | 412200 | 15 | 677 |
8 | 5 | 925 | 1200 | 176000 | 1 | 48 |
9 | 4 | 457 | 1300 | 104100 | 1242 | 1489205 |
10 | 5 | 3004 | 1900 | 1091900 | 1 | 48 |
11 | 4 | 451 | 400 | 39800 | 545 | 22641 |
12 | 4 | 456 | 800 | 54200 | 1017 | 1089699 |
13 | 5 | 510 | 500 | 41600 | 36 | 3240 |
14 | 4 | 455 | 400 | 99300 | 15 | 3287 |
15 | 5 | 511 | 800 | 36800 | 685 | 27578 |
16 | 4 | 454 | 100 | 4000 | 3 | 144 |
17 | 4 | 457 | 500 | 309500 | 2 | 322 |
18 | 4 | 455 | 400 | 34100 | 4 | 192 |
19 | 5 | 3095 | 1300 | 104100 | 4 | 424 |
20 | 4 | 398 | 100 | 15200 | 4 | 424 |
21 | 5 | 1126 | 800 | 54200 | 12 | 936 |
22 | 7 | 782 | 1300 | 104100 | 4 | 835 |
23 | 9 | 7701 | 600 | 35600 | 1 | 235 |
Another scenario of network monitoring is the use of accounting usage resource records for performance monitoring. The accounting collection process at the device level gathers usage records of network resources. These records consist of information such as interface utilization, traffic details per application and user (for example, percentage of web traffic), real-time traffic, and network management traffic. They may include details such as the originator and recipient of a communication. Granularity differs according to the requirements. A service provider might collect individual user details for premium customers, whereas an enterprise might be interested in only a summary per department. This section's focus is on usage resource records, not on overall device details, such as CPU utilization and available memory.
A network monitoring solution can provide the following details for performance monitoring:
Device performance monitoring:
- Interface and subinterface utilization
- Per class of service utilization
- Traffic per application
Network performance monitoring:
- Communication patterns in the network
- Path utilization between devices in the network
Service performance monitoring:
- Traffic per server
- Traffic per service
- Traffic per application
Applied technologies for performance monitoring include SNMP MIBs, RMON, Cisco IP SLA, and Cisco NetFlow services.
The trend of running mission-critical applications on the network is evident. Voice over IP (VoIP), virtual private networking (VPN), and videoconferencing are increasingly being run over the network. At the same time, people use (abuse?) the network to download movies, listen to music online, perform excessive surfing, and so on.
Accounting information collected per user can be used to
Monitor and profile users.
Track network usage per user.
Document usage trends by user, group, and department.
Identify opportunities to sell additional value-added services to targeted customers.
Build a traffic matrix per subdivision, group, or even user. A traffic matrix illustrates the patterns between the origin and destination of traffic in the network.
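As a minimal sketch of the traffic matrix idea, the following Python fragment sums byte counts per origin and destination pair. The record layout and the group names are hypothetical; real accounting records (for example, exported flow records) carry more fields and would first need to be mapped to groups or departments.

```python
from collections import defaultdict

# Hypothetical accounting records: (origin group, destination group, bytes).
flows = [
    ("engineering", "datacenter", 1200),
    ("sales", "datacenter", 300),
    ("engineering", "internet", 500),
    ("engineering", "datacenter", 800),
]

def build_traffic_matrix(records):
    """Sum the bytes observed for each (origin, destination) pair."""
    matrix = defaultdict(lambda: defaultdict(int))
    for origin, destination, byte_count in records:
        matrix[origin][destination] += byte_count
    # Convert to plain dictionaries so the result is easy to print or store.
    return {origin: dict(row) for origin, row in matrix.items()}

matrix = build_traffic_matrix(flows)
for origin, row in matrix.items():
    print(origin, row)
```

The same structure works at any granularity: replacing the group names with user identifiers or subdivisions yields a matrix per user or per subdivision.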
Accounting records can help answer the following questions:
Which applications generate the most traffic of which type?
Which users use these applications?
What percentage of traffic do they represent?
How many active users are on the network at any given time?
How long do users stay on the network?
Where do they come from?
Where do they go?
Do the users accept the policies on network usage?
When will upgrades affect the fewest users?
There are also legal requirements related to monitoring users and collecting accounting records. For example, you could draw conclusions about an individual's performance on the job. In some countries, it is illegal to collect specific performance data about employees. One solution could be to collect no details about individuals. Although this is ideal from a legal perspective, it becomes a nightmare during a security attack. Consider a scenario in which a PC of an individual user has been infected by a virus and starts attacking the network, and the user is unaware of this. It would be impossible to identify this PC without collecting accounting records per user, so you need to collect this level of detail. The same applies to the victims of the attack: They will certainly complain about the bad network service, but the operator cannot help them without useful data sets. From a data analysis perspective, we need to store performance baseline information and apply statistical operations such as "deviation from normal" to spot abnormalities.
A compromise could be to gather all details initially and separate the storage mechanisms afterwards. You could keep all details for security analysis for a day (minimum) or a week (maximum) and aggregate the records at the department level for performance or billing purposes. This approach should be okay from a legal perspective if you ensure that there is no public access to the security collection.
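The compromise described above can be sketched as follows. This is only an illustration under stated assumptions: the record fields (time, user, department, bytes) and the one-week retention default are hypothetical, and a real deployment would apply the retention period required by local law.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def split_records(records, now, retention=timedelta(days=7)):
    """Return (recent per-user records, byte totals per department).

    Per-user details are kept only while they are younger than `retention`;
    the department totals carry no user identity.
    """
    detailed = [r for r in records if now - r["time"] <= retention]
    per_department = defaultdict(int)
    for r in records:
        per_department[r["department"]] += r["bytes"]
    return detailed, dict(per_department)

# Hypothetical sample data.
now = datetime(2024, 1, 10)
records = [
    {"time": datetime(2024, 1, 9), "user": "alice", "department": "sales", "bytes": 100},
    {"time": datetime(2024, 1, 1), "user": "bob", "department": "sales", "bytes": 50},
    {"time": datetime(2024, 1, 9, 12), "user": "carol", "department": "engineering", "bytes": 70},
]
security_store, department_totals = split_records(records, now)
```

Only `security_store` contains user-level details, so access restrictions can be applied to that collection alone.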
Note
Check your country's legal requirements before applying per-user accounting techniques.
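The "deviation from normal" operation mentioned earlier for spotting abnormalities can be sketched with a simple standard-deviation test. This is one possible approach, not a prescribed method: the function name, the sample values, and the threshold of three standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def is_abnormal(baseline, observed, threshold=3.0):
    """Return True if `observed` lies more than `threshold` standard
    deviations away from the mean of the baseline samples."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Hypothetical baseline: bytes per interval sent by one host.
baseline = [100, 110, 90, 105, 95]
print(is_abnormal(baseline, 104))
print(is_abnormal(baseline, 500))
```

A host that suddenly sends far more traffic than its baseline, such as the infected PC in the scenario above, would be flagged by a check of this kind.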
Applied technologies for user monitoring and profiling include RMON; Authentication, Authorization, and Accounting (AAA); and Cisco NetFlow services.
With the increase in emerging technologies such as VoIP/IP telephony, video, data warehousing, sales force automation, customer relationship management, call centers, procurement, and human resources management, network management systems are required that allow you to identify traffic per application. Several years ago, this was a relatively easy task, because there were several different transmission protocols: TCP for UNIX communication, IPX for Novell file server sharing, SNA for mainframe sessions, and so on. The consolidation toward IP eliminated several of these protocols but introduced a new challenge for the network operator: how to distinguish between various applications if they all use IP. Collecting different interface counters was no longer good enough. From a monitoring point of view, the situation has become even more difficult. These days most server applications have a Web graphical user interface (GUI), and most traffic on the network is based on HTTP. In this case, traffic classification for deploying different service classes requires deep packet inspection, which some accounting techniques offer. Because of these changes, we need a new methodology to collect application-specific details, and accounting is the chosen technology. An example is Cisco Network-Based Application Recognition (NBAR), which is described in Chapter 10, "NBAR."
The collected accounting information can help you do the following:
Monitor and profile applications:
- In the entire network
- Over specific expensive links
Monitor application usage per group or individual user
Deploy QoS and assign applications to different classes of service
Assemble a traffic matrix based on application usage
A collection of application-specific details is also very useful for network baselining. Running an audit for the first time sometimes leads to surprises, because more applications are active on the network than the administrator expected. Application monitoring is also a prerequisite for QoS deployment in the network. To classify applications in different classes, their specific requirements should be studied in advance, as well as the communication patterns and a traffic matrix per application. Real-time applications such as voice and video require tight SLA parameters, whereas e-mail and backup traffic would accept best-effort support without a serious impact.
The next question to address is how to identify a specific application on the network.
In most environments, applications fall into the following distinct categories:
Applications that can be identified by TCP or UDP port number. These are either "well-known" (0 through 1023) or registered port numbers (1024 through 49151). They are assigned by the Internet Assigned Numbers Authority (IANA).
Applications that use dynamic and/or private application port numbers (49152 through 65535), which are negotiated before connection establishment and sometimes are changed dynamically during the session.
Applications that are identified via the type of service (ToS) byte. Examples such as voice and videoconferencing (IPVC) can be identified via the ToS value.
Subport classification of the following:
- HTTP: URLs, MIME (Multipurpose Internet Mail Extension) types or hostnames
- Citrix applications: traffic based on published application name
Classification based on the combination of packet inspection and multiple application-specific attributes. RTP Payload Classification is based on this algorithm, in which the packet is classified as RTP based on multiple attributes in the RTP header.
In some of these cases, deeper packet inspection is needed. This can be performed by Cisco NBAR, for example.
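The first two categories can be illustrated with a short Python sketch that maps a TCP or UDP port number to its IANA range and, where possible, to an application name. The lookup table is a tiny hand-picked sample and the function names are hypothetical; ports outside the table, or applications in the dynamic range, return no name, which is exactly where deeper inspection becomes necessary.

```python
# Small sample of IANA-assigned ports; a real table would be far larger.
KNOWN_PORTS = {25: "smtp", 80: "http", 161: "snmp", 443: "https"}

def port_range(port):
    """Return the IANA range that a TCP or UDP port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("port must be between 0 and 65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"

def classify(port):
    """Return (range, application name or None) for a port number."""
    return port_range(port), KNOWN_PORTS.get(port)

print(classify(80))
print(classify(50000))
```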
Figure 1-7 displays traffic details per application, aggregated over time.
An alternative report would identify the various protocols on the network—for example, IPv4 traffic compared to IPv6 traffic or TCP versus UDP traffic. Figure 1-8 shows a protocol distribution.
Cisco IT performed a network audit to track the applications on the Cisco internal network, and it provided some interesting results. The following applications and protocols account for about 80 percent of the total traffic that traverses the WAN:
HTTP
IP telephony
IP video
Server and PC backups
Video on demand (VoD)
Multicast
SNMP
Antivirus updates
Peer-to-peer traffic
Techniques to obtain the classification per application are RMON2, Cisco NetFlow, and Cisco NBAR. All three classify the observed traffic per application type. Chapter 5, "RMON," explains RMON; Chapter 7, "NetFlow," provides NetFlow details; and Chapter 10 covers NBAR.
A more advanced report could combine application-specific details and CoS information. A network planner can use such a report to isolate problems in a QoS-enabled environment (such as to detect when a certain class is almost fully utilized but the bandwidth cannot be increased). In this case, one or multiple applications could be moved to another class. For example, e-mail traffic could be reclassified from class 1 to class 2.
The next report, shown in Table 1-3, is based on Table 1-2, but it extends the level of detail by including some application-specific parts. For class 0, we are interested in the percentage of VoIP and non-VoIP traffic; in class 1 we distinguish between e-mail and SAP traffic (assuming that only these two applications get assigned to class 1). For the best-effort traffic in class 2 we distinguish between web traffic (HTTP), peer-to-peer traffic, and the rest. This report cannot be compiled by retrieving SNMP data from the CISCO-CLASS-BASED-QOS-MIB, because it collects only counters per traffic class, not counters per application within a class. Hence, we leverage either NetFlow or RMON (Remote Monitoring MIB Extensions for Differentiated Services, RFC 3287) to gather the extra level of per-application details.
Time (Hour) | Class 0 Packets | Class 0 Bytes | Class 0 Voice (Bytes) | Class 0 Other (Bytes) | Class 1 Packets | Class 1 Bytes | Class 1 E-mail (Bytes) | Class 1 SAP (Bytes) | Class 2 Packets | Class 2 Bytes | Class 2 HTTP (Bytes) | Class 2 Peer-to-Peer (Bytes) | Class 2 Other (Bytes) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 38 | 2735 | 264 | 2471 | 1300 | 59800 | 38870 | 20930 | 13 | 1002 | 752 | 100 | 150 |
1 | 55 | 3676 | 128 | 3548 | 400 | 44700 | 29055 | 15645 | 61 | 9791 | 8812 | 979 | 0 |
2 | 41 | 56661 | 780 | 55881 | 400 | 16800 | 10920 | 5880 | 4 | 240 | 216 | 24 | 0 |
3 | 13 | 1660 | 328 | 1332 | 200 | 8400 | 5460 | 2940 | 4 | 424 | 382 | 42 | 0 |
4 | 16 | 14456 | 128 | 14328 | 400 | 44700 | 29055 | 15645 | 4 | 420 | 378 | 42 | 0 |
5 | 19 | 2721 | 1164 | 1557 | 400 | 44400 | 28860 | 15540 | 10 | 480 | 48 | 48 | 384 |
6 | 21 | 24725 | 9856 | 14869 | 600 | 35600 | 23140 | 12460 | 516 | 20648 | 18583 | 2065 | 0 |
7 | 19 | 3064 | 2048 | 1016 | 700 | 412200 | 267930 | 144270 | 15 | 677 | 609 | 68 | 0 |
8 | 5 | 925 | 512 | 413 | 1200 | 176000 | 114400 | 61600 | 12 | 960 | 48 | 96 | 816 |
9 | 4 | 457 | 256 | 201 | 1300 | 104100 | 67665 | 36435 | 1242 | 1489205 | 1340285 | 148921 | 0 |
10 | 5 | 3004 | 1684 | 1320 | 1900 | 1091900 | 709735 | 382165 | 3 | 256 | 230 | 26 | 0 |
11 | 4 | 451 | 96 | 355 | 400 | 39800 | 25870 | 13930 | 545 | 22641 | 20377 | 2264 | 0 |
12 | 4 | 456 | 64 | 392 | 800 | 54200 | 35230 | 18970 | 1017 | 1089699 | 980729 | 108970 | 0 |
13 | 5 | 510 | 128 | 382 | 500 | 41600 | 27040 | 14560 | 36 | 3240 | 2916 | 324 | 0 |
14 | 4 | 455 | 416 | 39 | 400 | 99300 | 64545 | 34755 | 15 | 3287 | 2958 | 329 | 0 |
15 | 5 | 511 | 496 | 15 | 800 | 36800 | 23920 | 12880 | 685 | 27578 | 24820 | 2758 | 0 |
16 | 4 | 454 | 128 | 326 | 100 | 4000 | 2600 | 1400 | 3 | 144 | 130 | 14 | 0 |
17 | 4 | 457 | 256 | 201 | 500 | 309500 | 201175 | 108325 | 2 | 322 | 290 | 32 | 0 |
18 | 4 | 455 | 196 | 259 | 400 | 34100 | 22165 | 11935 | 4 | 192 | 173 | 19 | 0 |
19 | 5 | 3095 | 2048 | 1047 | 1300 | 104100 | 67665 | 36435 | 4 | 424 | 382 | 42 | 0 |
20 | 4 | 398 | 286 | 112 | 100 | 15200 | 9880 | 5320 | 4 | 424 | 382 | 42 | 0 |
21 | 5 | 1126 | 956 | 170 | 800 | 54200 | 35230 | 18970 | 12 | 936 | 842 | 94 | 0 |
22 | 7 | 782 | 612 | 170 | 1300 | 104100 | 67665 | 36435 | 4 | 835 | 752 | 84 | 0 |
23 | 9 | 7701 | 2096 | 5605 | 600 | 35600 | 23140 | 12460 | 2 | 235 | 212 | 24 | 0 |
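To turn the byte counters of such a report into the percentages a network planner is interested in, a small calculation is enough. The following Python sketch uses the class 0 values for hour 0 (264 bytes of voice and 2471 bytes of other traffic); the function name is hypothetical.

```python
def application_shares(bytes_per_application):
    """Return each application's share of the class total, in percent."""
    total = sum(bytes_per_application.values())
    if total == 0:
        return {name: 0.0 for name in bytes_per_application}
    return {
        name: count * 100 / total
        for name, count in bytes_per_application.items()
    }

# Class 0, hour 0: voice and non-voice bytes.
shares = application_shares({"voice": 264, "other": 2471})
for name, percent in shares.items():
    print(f"{name}: {percent:.1f}%")
```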
Best practice suggests monitoring the network before implementing new applications. Taking a proactive approach means that you analyze the network in advance to identify how it deals with new applications and whether it can handle the additional traffic appropriately. A good example is the IP telephony (IPT) deployment. You can run jitter probe operations with Cisco IP SLA, identify where the network needs modifications or upgrades, and start the IPT deployment after all tests indicate that the network is running well. After the deployment, accounting records deliver ongoing details about the newly deployed service. These can be used for general monitoring of the service as well as troubleshooting and SLA examination.
Internet traffic increases on a daily basis. Different studies produce different estimates of how long it takes traffic to double, but they all lead to the same prediction: today's network designs will not be able to carry the traffic five years from now. Broadband adoption is one major driver, as well as the Internet's almost ubiquitous availability. Recently, the Cisco internal IT department concluded that bandwidth consumption is doubling every 18 months.
This requires foresight and accurate planning of the network and future extensions. Enterprises and service providers should carefully plan how to extend the network in an economical way.
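To get a feeling for what a doubling period of 18 months means over a planning horizon of five years, the growth factor can be computed directly. This is a simple sketch that assumes steady exponential growth, which real traffic does not necessarily follow; the function name is hypothetical.

```python
def growth_factor(months, doubling_period_months=18):
    """Traffic multiplier after `months`, assuming steady exponential growth."""
    return 2 ** (months / doubling_period_months)

# Five years (60 months) at a doubling period of 18 months.
factor = growth_factor(60)
print(f"Traffic after five years: about {factor:.1f} times today's volume")
```

Under this assumption, traffic grows roughly tenfold in five years, which illustrates why a design sized for today's load cannot simply be kept unchanged.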
A service provider might consider the following:
Which point of presence (PoP) generates the most revenue?
Which access points are not profitable and should be consolidated?
Should there be spare capacity for premium users?
In which segment is the traffic decreasing? Did we lose customers to the competition? What might be the reason?
An enterprise IT department might consider the following:
Which departments are growing the fastest? Which links will require an upgrade soon?
For which department is network connectivity business-critical and therefore should have a high-availability design?
These questions cannot be answered without an accurate traffic analysis; it requires a network baseline and continuously collected trend reports. Service providers and professional IT departments should go one step further and offer service monitoring to their customers. This approach can identify potential bottlenecks in advance. It also lets the provider proactively notify customers and offer more bandwidth, different QoS, higher availability, and so on.
Capacity planning can be considered from the link point of view or from the network-wide point of view. Each view requires a completely different set of collection parameters and mechanisms.
For link capacity planning, the interface counters stored in the MIB are polled via SNMP, and the link utilization can be deduced. The following simple rule of thumb is sometimes applied to capacity planning: if the average link utilization during business hours is above 50 percent, it is time to upgrade the link. The link utilization is calculated with the MIB variables from the interfaces group MIB (RFC 2863).
Apply the following equation to calculate utilization:
input utilization = [(Δ(ifInOctets)) * 8 * 100] / [(number of seconds in Δ) * ifSpeed]
output utilization = [(Δ(ifOutOctets)) * 8 * 100] / [(number of seconds in Δ) * ifSpeed]
Note
On Cisco routers, the ifSpeed value is set by the bandwidth interface command. This bandwidth is a user-configurable value that can be set to any value for routing protocol metric purposes. You should set the bandwidth correctly and check the content of the BW (bandwidth) value in Kbps with the show interface command before doing any interface utilization calculations.
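The utilization calculation and the 50 percent rule of thumb can be sketched in Python as follows. The function names are hypothetical, the counter values are assumed to come from two SNMP polls of ifInOctets or ifOutOctets, and the modulo operation handles at most one wrap of a 32-bit counter between polls; that is an assumption that holds only if the polling interval is short enough for the link speed.

```python
def utilization_percent(octets_start, octets_end, seconds, if_speed_bps,
                        counter_bits=32):
    """Link utilization in percent between two polls of an octet counter."""
    if seconds <= 0 or if_speed_bps <= 0:
        raise ValueError("seconds and if_speed_bps must be positive")
    # Modulo arithmetic yields the correct delta across a single counter wrap.
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 * 100 / (seconds * if_speed_bps)

def needs_upgrade(samples_percent, threshold=50.0):
    """Apply the rule of thumb: average utilization above the threshold."""
    return sum(samples_percent) / len(samples_percent) > threshold

# 187,500,000 octets in 300 seconds on a 10-Mbps link.
print(utilization_percent(0, 187_500_000, 300, 10_000_000))
```

The `if_speed_bps` argument corresponds to ifSpeed in bits per second, so a wrongly configured bandwidth value on the interface distorts the result directly.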
Some alarms, such as a trap or a syslog message, may be sent to the fault management application to detect a threshold violation. When we use accounting information for fault management, we enter the world of performance management, whose applications are described later in this chapter.
Link capacity planning might be enough in most cases when a network administrator knows about a bottleneck in the network. After a link's bandwidth is upgraded, the network administrator should identify the next bottleneck; this is a continuous process. In addition, most networks are designed with economical justifications, which means that very little overprovisioning is done. The term "network overprovisioning" describes an abundance of bandwidth in the network, so that under normal circumstances, performance limitations are not caused by a lack of link capacity. Put another way, the only restriction that one application sees when communicating with another application is in the network's inherent physical limitations. In contrast, the term "network over-subscription" describes a network design with more traffic than bandwidth. This means that, even under normal circumstances, not enough bandwidth is provided for all users to use the network to perform their tasks at the same time, using their maximum allocated bandwidth. The network over-subscription concept is obviously a more cost-effective approach than network overprovisioning, because it assumes that not all users will use their fully dedicated bandwidth at the same time. However, capacity planning is more complex in this case. The computation of what constitutes adequate provisioning, without gross overprovisioning, depends on accurate core capacity planning along with realistic assumptions about what group of users will use what applications and services at key time periods.
Another approach is to do networkwide capacity planning by collecting the "core traffic matrix." The core traffic matrix is a table that provides the traffic volumes between the origin and destination in a network. To collect this for all the network's entry points, we need usage information (in number of bytes and/or number of packets per unit of time) per exit point in the core network. Figure 1-9