Implementation Considerations

The next sections provide an overview of IP SLA features related to various Cisco IOS versions and also address the impact of implementing IP SLA operations. The performance aspect is discussed, as well as measurement accuracy and security considerations.

Supported Devices and IOS Versions

As illustrated in this chapter, IP SLA offers a variety of features that are related to the different operations, as well as several additional functions, such as Multiple Operation Scheduling and VRF support. Some of these features, such as MPLS, ATM, and Frame Relay, have platform dependencies, and others have Cisco IOS version dependencies. Listing all the different features along with the IOS releases and platforms would take up a lot of space, would offer limited value, and would risk listing obsolete information. For the latest IP SLA information, go to http://www.cisco.com/go/ipsla. Specifically, the Cisco Feature Navigator tool lists IP SLA features for each platform and IOS version. Table 11-4 offers an overview of the IP SLA operations and IOS versions.

Table 11-4. IP SLA Operations and IOS Versions
Feature/Release	11.2	12.0(3)T	12.0(5)T 12.0(8)S	12.1(1)T 12.2	12.2(2)T	12.2(11)T (Engine II)	12.3(4)T	12.3(11)T	12.4 IP Base Feature Set^[*]	12.4 Advanced Feature Set
Responder
ICMP Echo
ICMP Path Echo
UDP Echo
TCP Connect
UDP Jitter
DHCP
DNS
HTTP
DLSw+
UDP Jitter (with one-way delay)
FTP Get
MPLS/VPN-Aware
Frame Relay
ICMP Path Jitter
ATM
VoIP UDP Jitter (with MOS and ICPIF score)
VoIP Call Setup (Post-Dial Delay) Monitoring
VoIP Gatekeeper Registration Delay Monitoring Operation

^[*] Starting with IOS Release 12.4, there is a split of IP SLA functions in the different feature sets. Starting with 12.4(2)T, the IOS IP Base image has only the Responder and ICMP Echo operation. An Advanced IOS feature set is required for the sender capabilities. In IOS 12.4(7) and 12.2(25)SEE, the base package contains only the Responder and no sender functionality. The sender function requires any of the Advanced IOS feature sets. This model will be used in 12.5 and subsequent releases.

Performance Impact

As with any other device instrumentation feature implemented at the network element, IP SLA's performance impact is an important aspect. Routers and switches were not primarily designed for network monitoring tasks; hence, the impact of network management features needs to be analyzed. IP SLA stores each operation's results locally in a hierarchical structure, so the processing time increases with the number of configured operations. The IP SLA scheduling function can help reduce the CPU impact, because a large number of operations starting at the same time can lead to CPU utilization spikes. For absolute figures, you need to distinguish between IP SLA Engine I and Engine II implementation, because there is a significant difference from a performance perspective. Engine II is implemented at IOS 12.2(11)T and later; all previous versions have implemented Engine I. The following new features were added in Engine II:

20 to 50 percent reduction in memory consumption per operation.
Backward compatibility. Any operation supported by Engine I can be achieved by Engine II.

Table 11-5 illustrates the CPU impact. Table 11-6 shows the memory consumption on a Cisco 7200VXR/NPE-225.

Table 11-5. IP SLA CPU Impact (Engines I and II)
Number of Operations per Second	Number of Operations per Minute	Engine I IOS 12.2(8)T5 500 Active Jitter Probes	Engine II IOS 12.3(3) 2000 Active Jitter Probes
4	240	1 percent	4 percent
20	1200	1 percent	3 percent
40	2400	15 percent	7 percent
60	3600	35 percent	11 percent

Table 11-6. IP SLA Memory Consumption (Engines I and II)
Operation	Engine I IOS 12.2(8)T5	Engine II IOS 12.3(3)
UDP Jitter	< 24 KB	< 12 KB
UDP Echo	< 19 KB	< 3.5 KB
ICMP Echo	< 17 KB	< 3.2 KB
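
The per-operation figures in Table 11-6 can be turned into a rough memory budget. The following Python sketch multiplies the table's upper bounds by the number of configured operations. It assumes that memory grows linearly with the operation count, which the table does not state, and the function name is purely illustrative.

```python
# Upper bounds per operation in KB, copied from Table 11-6.
MEMORY_KB = {
    "Engine I": {"UDP Jitter": 24, "UDP Echo": 19, "ICMP Echo": 17},
    "Engine II": {"UDP Jitter": 12, "UDP Echo": 3.5, "ICMP Echo": 3.2},
}


def memory_upper_bound_kb(engine, operation, count):
    """Return an upper bound in KB for `count` operations of one type.

    Assumption: memory consumption scales linearly with the number of
    operations. Treat the result as a planning estimate only.
    """
    return MEMORY_KB[engine][operation] * count


if __name__ == "__main__":
    for engine in MEMORY_KB:
        kb = memory_upper_bound_kb(engine, "UDP Jitter", 500)
        print(f"{engine}: 500 UDP Jitter operations need less than {kb} KB")
```

Under this assumption, 500 UDP Jitter operations stay below 12,000 KB on Engine I and below 6000 KB on Engine II.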

Figure 11-9 illustrates the relationship between CPU load (in %) and the total number of jitter operations per second. The three lines represent 500, 1000, and 2000 active operations. The test was performed on a Cisco 7200VXR/NPE-225 with IOS 12.2(8)T5. In the test, the additional UDP Jitter operations are activated sequentially, and each operation sends ten 64-byte packets, each with 20-ms spacing.

Figure 11-9. IP SLA Responder Communication
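
To put the test parameters into perspective, the following Python sketch estimates the probe traffic that the source generates in one direction. It assumes that the 64 bytes describe the size counted per packet (the text does not say whether headers are included) and it ignores the replies sent by the Responder. The function names are illustrative.

```python
def jitter_probe_load(ops_per_second, packets_per_op=10, packet_bytes=64):
    """Return (packets per second, bits per second) sent by the source.

    Defaults mirror the test described above: ten 64-byte packets per
    UDP Jitter operation. Header overhead is an open assumption.
    """
    pps = ops_per_second * packets_per_op
    bps = pps * packet_bytes * 8
    return pps, bps


def operation_duration_ms(packets_per_op=10, spacing_ms=20):
    """Time between the first and the last packet of one operation."""
    return (packets_per_op - 1) * spacing_ms


if __name__ == "__main__":
    for rate in (4, 20, 40, 60):
        pps, bps = jitter_probe_load(rate)
        print(f"{rate} operations/s: {pps} packets/s, {bps} bit/s")
    print(f"One operation spans {operation_duration_ms()} ms")
```

Even at 60 operations per second the generated load is modest in bandwidth terms, which suggests that the CPU cost comes from handling the probes rather than from the traffic volume.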

Accuracy

Because IP SLA is used for SLA validation, accuracy is a significant attribute: inaccurate results can have an immediate monetary effect. Most SLAs have a penalty or remedy defined, so it is important for both the service provider and the customer to have accurate measurement functions in place. IP SLA is an embedded feature in the network element, so it "competes" with other processes for system resources, such as CPU, memory, queuing, packet forwarding, ACLs, and more.

A significant factor for IP SLA accuracy is the device CPU utilization; the higher it is, the lower the accuracy becomes, if no IP SLA Responder is used. Using the IP SLA Responder has a major impact as well. Tests revealed interesting results when comparing an ICMP Echo operation (without the IP SLA Responder) and a UDP Echo Probe (with the IP SLA Responder) under two different conditions. In the first scenario, the target device's CPU utilization was low; in the second scenario, CPU utilization was 90 percent on average. In the test setup, two routers were connected back-to-back, and the destination router received traffic from other interfaces, so the link between source and destination device remained unchanged for the two test conditions. The results were as follows:

ICMP Echo operation
- RTT with unloaded target device was 15 ms.
- RTT with target device CPU utilization of 90 percent was 59 ms.
Note that the receiver spends excessive CPU time on processing the ICMP Echo Request and generating the ICMP Echo Reply. Because the ICMP Echo operation does not include time stamps at the Responder, the target device's processing time cannot be subtracted from the total response time.
UDP Echo operation
- RTT with unloaded receiver was 15 ms.
- RTT with target device CPU utilization of 90 percent was 15.3 ms.

Because the IP SLA Responder applies time stamps, the processing delay is subtracted in the calculation of the results.
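
The subtraction can be sketched in a few lines of Python. The four time stamp names (t1 to t4) and the sample values are assumptions made for illustration; they are not taken from the tests described above.

```python
def corrected_rtt_ms(t1, t2, t3, t4):
    """Round-trip time with the target's processing time removed.

    t1: request sent by the source        (source clock)
    t2: request received by the target    (target clock)
    t3: reply sent by the target          (target clock)
    t4: reply received by the source      (source clock)

    Each difference uses a single clock, so the two clocks do not need
    to be synchronized for this calculation.
    """
    raw_rtt = t4 - t1          # what a probe without Responder time stamps sees
    processing = t3 - t2       # time spent inside the target device
    return raw_rtt - processing


if __name__ == "__main__":
    # Invented values: the target clock runs far ahead of the source
    # clock and the target needs 44 ms to process the probe.
    print(corrected_rtt_ms(t1=100.0, t2=5000.0, t3=5044.0, t4=160.0))
```

With these sample values, the raw round-trip time is 60 ms, of which 44 ms is processing delay, so the corrected result is 16 ms.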

In a separate setup, the IP SLA Responder was loaded with a CPU-intensive interrupt load, such as packet forwarding on centralized platforms. In this case, IP SLA time-stamping routines compete with the forwarded traffic that is processed at the interrupt level; this has a negative effect on the accuracy. The RTT with an unloaded target device was 100 ms, and the RTT with 90 percent CPU utilization at the target device, loaded by forwarded traffic, was 110 ms.

Tip

For RTT accuracy, always use the UDP Echo or UDP Jitter operation in conjunction with the IP SLA Responder. In this case, processing time on the target router is subtracted, and the results are more accurate, regardless of the sender and receiver CPU utilization.

IP SLA results may be inaccurate with an IP SLA Responder on a router loaded with heavy forwarding traffic, because interrupt level code (such as interface traffic) always gets precedence over normal code (in this case, IP SLA). The IP SLA results have good accuracy if the router's forwarding CPU load is below 30 percent. If the router's forwarding CPU load is above 30 percent, the proposed solution is to use a dedicated, nonforwarding router (also called a shadow router) for the source and/or target device.

The process load has a negligible effect on UDP operations with the IP SLA Responder as long as the CPU load is below 60 percent. The results become less accurate when the target device CPU load reaches or exceeds 60 percent utilization due to extensive forwarding traffic. If the router process load exceeds 60 percent, the proposed solution is to use a shadow router.
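
The two thresholds can be combined into a simple decision rule. The following Python sketch is one possible reading: it treats a load that exactly reaches a threshold as already too high, which is a conservative choice and not something the text prescribes. The function and constant names are illustrative.

```python
FORWARDING_LOAD_LIMIT_PCT = 30  # interrupt-level (forwarding) CPU load
PROCESS_LOAD_LIMIT_PCT = 60     # process-level CPU load


def needs_shadow_router(forwarding_load_pct, process_load_pct):
    """Return True if a dedicated shadow router is advisable.

    Assumption: reaching a limit counts the same as exceeding it.
    """
    return (forwarding_load_pct >= FORWARDING_LOAD_LIMIT_PCT
            or process_load_pct >= PROCESS_LOAD_LIMIT_PCT)


if __name__ == "__main__":
    for forwarding, process in [(10, 20), (35, 20), (10, 60)]:
        verdict = needs_shadow_router(forwarding, process)
        print(f"forwarding {forwarding}%, process {process}%: "
              f"shadow router advisable = {verdict}")
```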

On distributed platforms, such as the Cisco 7500, 10000, and 12000, another detail needs to be considered. These platforms have various components such as the route processor (RP) and line cards (LC) where each component has its own clock. IP SLA applies transmit (sending) time stamps at the RP, and the receiving time stamps are applied by the LC. This could create inaccurate results on platforms if the system clocks on the RP and LC are not synchronized.

IP SLA granularity was in the range of milliseconds until accuracy enhancements added the capability to increase the granularity to the microsecond level. In addition, you can specify the packet priority of an IP SLA operation, set the NTP clock synchronization offset tolerance, and monitor the NTP clock synchronization status of an active IP SLA operation. These features were introduced in IOS 12.3(14)T and are currently supported by UDP jitter operations only; codec types are not supported.

To obtain more details about the accuracy of IP SLA operations and to provide guidance to operators about which operations are best suited for certain requirements, Cisco funded a research project (IP SLA URP).

This project assessed the level of accuracy feasible with active end-to-end measurements of traffic. Details included traffic QoS, delay, delay variation, and packet loss. Because active probing does not assess the quality of all the customer's traffic, the overall traffic quality was assessed with a certain confidence, close to 100 percent. The challenge was to find a suitable trade-off between packet probing overhead, the desired confidence level, and the results' accuracy.

The outcome of the study was encouraging. It proved that accurate statistical assessment of traffic is possible with active probing if test traffic does not receive special treatment and if enough probe packets are used per measurement interval. The confidence level is the most relevant factor. For example, starting from a confidence of 99 percent, you can approximate that the increase in needed probe packets is roughly 80 percent for each additional 9 of confidence (99.9 percent, 99.99 percent, and so on).
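
The rule of thumb can be expressed as a small calculation. The following Python sketch reads it as one step per additional nine of confidence (99, 99.9, 99.99 percent), which is an interpretation of the sentence above. The baseline of 1000 probe packets is an invented figure, not a result of the study.

```python
def probes_needed(base_probes, additional_nines, growth=0.80):
    """Approximate probe packets per measurement interval.

    base_probes:      packets assumed sufficient at 99 percent confidence
                      (hypothetical input; the study's figure is not given)
    additional_nines: 1 for 99.9 percent, 2 for 99.99 percent, and so on
    growth:           roughly 80 percent more packets per additional nine
    """
    return round(base_probes * (1 + growth) ** additional_nines)


if __name__ == "__main__":
    for nines in range(4):
        print(f"{nines} additional nine(s): {probes_needed(1000, nines)} packets")
```

Because the increase compounds, each further nine of confidence becomes more expensive in absolute terms, which is the trade-off between probing overhead and confidence level mentioned earlier.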

Security Considerations

Two main aspects are relevant to IP SLA security considerations:

Exposure or disclosure of measured parameters and measurement components
Attacks against the Responder

The exposure of measured parameters and measurement components means that an operator should ensure that the measurement design and metrics are not published. Observation of IP SLA data might provide an attacker with information about the paths in the network and communication endpoints. If a hacker knows where measurement is applied in the network and what metrics are measured, he or she can launch an attack against the network elements. Alternatively, users could plan to forge traffic with specific patterns or burst characteristics to fake the SLA measurements, which can be considered fraud.

A completely different attack scenario would be sending measurement requests to a Responder with a victim's spoofed source address, with the goal that the Responder then "attacks" the victim. The difference compared to the scenarios just mentioned is that this is an attack against a third party, not against the Responder.

Dealing with IP SLA Responder attacks is easier than keeping the infrastructure secret. The IP SLA Responder must keep a port open for control messages (UDP port 1967); therefore, it can be detected with a port scanner tool. However, the router can identify the port scanning and generate an RTR responder: bad format message when debug ip sla error is enabled.

Unauthorized communication with the Responder can be addressed by using the MD5 option, in which the communication between the IP SLA source and the Responder is authenticated. This means that requests are accepted only from authenticated senders and replayed messages are discarded by the Responder. The MD5 algorithm is defined in RFC 1321. It produces a 128-bit message digest ("fingerprint") as a digital signature. Note that MD5 only authenticates the sender; it does not encrypt the traffic! Encrypting management traffic between the NMS server and the network element is considered a best practice, and not specific to IP SLA, so it is not covered in detail here.
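
The following Python sketch illustrates the two properties just mentioned: the digest is 128 bits long, and a keyed digest lets the receiver reject messages from a sender that does not know the shared key. It uses the standard hashlib and hmac modules. The message, the keys, and the keyed construction (HMAC) are assumptions chosen for illustration; this is not the IP SLA control protocol format.

```python
import hashlib
import hmac


def fingerprint(message: bytes) -> bytes:
    """Plain MD5 digest of a message (RFC 1321)."""
    return hashlib.md5(message).digest()


def sign(key: bytes, message: bytes) -> bytes:
    """Keyed MD5 digest. HMAC is used here only as an example scheme."""
    return hmac.new(key, message, hashlib.md5).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Accept the message only if the tag matches the shared key."""
    return hmac.compare_digest(sign(key, message), tag)


if __name__ == "__main__":
    request = b"hypothetical control message"
    print(len(fingerprint(request)) * 8, "bits")
    tag = sign(b"shared-secret", request)
    print("correct key accepted:", verify(b"shared-secret", request, tag))
    print("wrong key accepted:  ", verify(b"guessed-secret", request, tag))
```

Note that the message itself is still sent in the clear; only its origin is verified.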

On top of MD5 authentication and packet encryption, you can configure a generic access control list (ACL) to restrict access to the IP SLA source and destination device and only allow the NMS system to configure and retrieve data. Applying an ACL to the Responder's UDP port (1967) can restrict the access to certain IP addresses. This is also part of general security design and is not addressed here.

IP SLA Deployment

When deploying IP SLA, it is suggested that you follow these five steps:

Step 1.	Identify SLA metrics (as described in the section "Measured Metrics: What to Measure").
Step 2.	Select the appropriate IP SLA types (see the section "Operation Types" for details).
Step 3.	Define the operational parameters (see the section "Operations Parameters" for details).
Step 4.	Select where to place the IP SLA source and destination in the network (see the section "IP SLA Architecture and Best Practices" for details).
Step 5.	Choose an application to configure operations and retrieve and represent the IP SLA results (see the section "NMS Applications").

The following sections describe a proposed architecture for placing IP SLA source and destination devices, setting up operations, composing SLAs, and selecting NMS applications.

IP SLA Architecture and Best Practices

When choosing the best places for the IP SLA operations, you can choose between a full-mesh and partial-mesh design. Full mesh means that you set up an operation between all possible end nodes. This is the most accurate approach and is representative from an end-user perspective. Unfortunately, it does not scale well in large networks, because the number of operations is proportional to the square of the number of nodes. For example, a very small network with five nodes requires ten operations. For 100 nodes, 4950 operations are required. For 5000 nodes, 12,497,500 operations are needed to perform a full measurement.
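
The figures above follow from counting one operation per pair of nodes, that is, n(n - 1)/2 operations for n nodes. A minimal Python sketch, with an illustrative function name, reproduces them:

```python
def full_mesh_operations(nodes):
    """Number of operations for a full mesh, one per pair of nodes."""
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    for n in (5, 100, 5000):
        print(f"{n} nodes: {full_mesh_operations(n)} operations")
```

If each direction of a pair is measured by its own operation, the count doubles to n(n - 1).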

The alternative to a full mesh is a partial mesh. First you identify the critical network paths and apply full-mesh measurement on only those links. Examples could be the connections from branch offices to headquarters or the core devices in your MPLS network. This approach dramatically reduces the number of operations and still provides accurate performance monitoring results. Figures 11-10 and 11-11 present two design scenarios. The first one is a partial mesh with hierarchical polling, and the second one shows central polling of the core links. Note that these approaches can be combined.

Figure 11-10. Hierarchical IP SLA Design

Figure 11-11. Central IP SLA Design

For the hierarchical setup shown in Figure 11-10, full-mesh measurement is applied in the core, and each PoP or branch access area applies the local measurement for the branch office (in the case of an Enterprise network) or the customer site (in the case of an ISP). The SLA metrics for this partial-mesh design are composed, which means that the overall metrics, such as delay or packet loss, are compiled as the sum of multiple submetrics. The total delay between site 1 and site 2 is the sum of the delay values between the source and the destination: t_delay1 + t_delay2 + t_delay3. This approach is very flexible, because the creation of new SLAs and the addition of new sites can be easily integrated into the existing calculation. However, the results are less accurate, because each measurement carries its own error tolerance, which is typically ±1 ms per measurement. Although the composite approach works well for metrics such as delay, packet loss, and packet sequencing, it does not apply to jitter measurements. There are multiple reasons for this; for example, jitter has positive and negative values, which cannot simply be summed. In the case of composed metrics, jitter is the exception where full measurement is required.
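
The composition of delay values, together with the accumulating error tolerance, can be sketched as follows. The segment delays are invented sample values, the function name is illustrative, and the error is computed as a simple worst case in which the tolerances of all segments add up.

```python
def composite_delay_ms(segment_delays_ms, tolerance_ms=1.0):
    """Return (total delay, worst-case error) for a composed path.

    segment_delays_ms: measured delays of the individual segments,
                       for example t_delay1, t_delay2, t_delay3
    tolerance_ms:      error tolerance of a single measurement
    """
    total = sum(segment_delays_ms)
    worst_case_error = tolerance_ms * len(segment_delays_ms)
    return total, worst_case_error


if __name__ == "__main__":
    # Hypothetical values: access, core, access.
    total, error = composite_delay_ms([4.0, 22.0, 6.0])
    print(f"site 1 to site 2: {total} ms, plus or minus {error} ms")
```

For three segments, the composed delay of 32 ms carries a worst-case tolerance of 3 ms, compared with 1 ms for a single end-to-end measurement.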

Note

Composite metrics cannot be applied to jitter measurements!

The magnified view of PoP1 in Figure 11-10 illustrates the use of a shadow router, which is a dedicated router for network management purposes.

In this case, the shadow router is used for IP SLA monitoring, and all operations and the IP SLA Responder are defined on this device. Using a shadow router offers a number of benefits, such as increased accuracy, because no forwarding traffic competes with the monitoring tasks at the device. In addition, you can run the latest Cisco IOS versions on the shadow router and update them whenever Cisco offers new functions in IOS. This is not a recommended approach for the core network nodes, where stability is more important than new features. Shadow routers can be connected via different technologies, such as MPLS VRF, tunnels, and 802.1q. All core operations in Figure 11-10 are defined between the shadow routers, which monitor the locally connected sites.

The alternative to a hierarchical design with composite metrics is the central IP SLA design, in which a central dedicated router applies the monitoring to the key sites. Figure 11-11 shows a potential setup, in which two shadow routers at the central Network Operations Center (NOC) monitor the network.

Shadow router S1 monitors all core components, and shadow router S2 monitors connectivity to the edge sites. Even though one shadow router could probably handle all operations, this is not a recommended design, because it introduces a single point of failure. If S1 or S2 fails, at least some IP SLA data is gathered to monitor the network's key components. You should consider two additional points. First, the proposed design monitors only the connectivity between the NOC and the core and remote sites. There might be network outages that are not identified, and the collected IP SLA data may not represent customer performance data. In addition, attention should be given to the scheduling of the operations. Although the IP SLA Multiple Operation Scheduling feature can help distribute the operations equally on one device, only the operator or the NMS application can ensure equal scheduling across the two shadow routers.
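
An NMS application could coordinate the start times across both shadow routers along the following lines. This Python sketch keeps each router's own operations, interleaves the routers, and spreads all start offsets evenly over a scheduling window. The router names, operation numbers, and the function itself are hypothetical; it does not configure any device.

```python
from itertools import zip_longest


def stagger(operations_by_router, window_seconds):
    """Return a list of (router, operation, start offset in seconds).

    Operations are interleaved across routers so that consecutive slots
    belong to different routers where possible, and the slots are spaced
    evenly over the window.
    """
    per_router = [[(router, op) for op in ops]
                  for router, ops in operations_by_router.items()]
    ordered = [item
               for column in zip_longest(*per_router)
               for item in column
               if item is not None]
    if not ordered:
        return []
    step = window_seconds / len(ordered)
    return [(router, op, index * step)
            for index, (router, op) in enumerate(ordered)]


if __name__ == "__main__":
    plan = stagger({"S1": [1, 2, 3], "S2": [10, 11]}, window_seconds=60)
    for router, op, offset in plan:
        print(f"{router}: operation {op} starts at +{offset} s")
```

The resulting offsets would still have to be applied to each operation's schedule by the operator or the NMS application.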

An important design aspect is having a single time in the network (as opposed to each network element having its own time). This is useful for accounting and performance management, and it applies to fault and security management as well. Although only the one-way IP SLA measurements mandate using NTP, current best practices suggest implementing the Network Time Protocol (NTP) consistently across the network.

NTP is defined in RFC 1305. It is used to synchronize system clocks on network devices with a reference time source. NTP uses the concept of a stratum, which describes the "logical distance" between a client and the master clock, where one stratum is added for each NTP component between the client and the time source. A stratum 1 time server typically has a radio or atomic clock directly attached, a stratum 2 time server receives its time via NTP from a stratum 1 time server, and so on. UTC is the reference time zone for NTP, and each device translates UTC into its local time zone. NTP has a resolution of less than a nanosecond and offers fault-tolerant features. For example, when a network connection is temporarily unavailable, it uses previous measurement polls to estimate the current time and offset.

Although stratum 0 NTP sources are very expensive, stratum 1 devices such as GPS appliances are less expensive. The cheapest approach is to synchronize with public NTP sources from the Internet. Which NTP source you select mainly depends on your accuracy requirements. Another source of clock synchronization is the Global Positioning System (GPS), which is more accurate. However, GPS is more expensive than NTP.
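
The sub-nanosecond resolution and the stratum concept mentioned above can be illustrated with a short Python sketch. It relies on the 64-bit NTP time stamp format, which uses 32 bits for the seconds and 32 bits for the fraction of a second; the function names are illustrative.

```python
FRACTION_BITS = 32  # fractional part of the 64-bit NTP time stamp


def ntp_resolution_seconds():
    """Smallest time step that the time stamp format can represent."""
    return 1 / 2 ** FRACTION_BITS


def client_stratum(server_stratum):
    """A client is one stratum further from the reference clock."""
    return server_stratum + 1


if __name__ == "__main__":
    picoseconds = ntp_resolution_seconds() * 1e12
    print(f"time stamp resolution: about {picoseconds:.0f} ps")
    print("client of a stratum 1 server runs at stratum", client_stratum(1))
```

This is the resolution of the time stamp format only. The accuracy actually achieved between two devices depends on the network path and the quality of the time source.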

NMS Applications

IP SLA gathers performance data and thus offers valuable input for performance management applications. A number of network management applications from partners such as Agilent, CA/Concord, Crannog, HP, IBM/Micromuse, Infovista, MRTG, and others support Cisco IP SLA. The CiscoWorks Internet Performance Monitor (IPM), Cisco IP Solution Center (ISC), and Cisco InfoCenter (CIC) also leverage IP SLA. Table 11-7 summarizes the various types of IP SLA operations and assigns them to key applications. It also describes what each operation measures and for what purpose the operation can be used.

Table 11-7. IP SLA Operations Summary
IP SLA Operation	What It Measures	Key Applications	Comment
ICMP Echo	Measures round-trip delay for the full path	Troubleshooting, connectivity measurement	—
ICMP Path Echo	Measures round-trip delay and hop-by-hop round-trip delay	Connectivity measurement, identify bottlenecks in the path	—
ICMP Path Jitter	Measures round-trip delay, hop-by-hop jitter, and packet loss	Troubleshooting, hop-by-hop analysis	—
UDP Echo	Measures round-trip delay of UDP traffic	Accurate response-time measurement for UDP traffic	—
UDP Jitter	Measures round-trip delay, one-way delay, one-way jitter, one-way packet loss, and connectivity	VoIP and data network performance	One-way delay requires time synchronization between source and target routers
VoIP UDP Jitter (with MOS and ICPIF score)	Measures round-trip delay, one-way delay, one-way jitter, and one-way packet loss for VoIP traffic. Calculates MOS and ICPIF voice quality scores. Codec simulation: G.711 u-law, G.711 a-law, and G.729A.	VoIP performance monitoring	One-way delay requires time synchronization between source and target routers
TCP Connect	Measures the time taken to connect to a target device with TCP	Server and application performance	—
FTP	Measures round-trip time to transfer a file	FTP server performance and troubleshooting. Monitors path quality to and from the FTP server to the measurement node.	—
Dynamic Host Configuration Protocol (DHCP)	Measures round-trip time to get an IP address from a DHCP server	DHCP server performance and troubleshooting	—
Domain Name System (DNS)	Measures DNS lookup time	DNS server performance and troubleshooting	—
HTTP	Measures round-trip time to retrieve a web page	Web server performance and troubleshooting, path quality monitoring	—
Data Link Switching Plus (DLSw+)	Measures peer tunnel response time	Response time between DLSw+ peers	—
MPLS VPN	Adds VRF measurement to other IP SLA operations: ICMP Echo, ICMP Path Echo, UDP Echo, UDP Jitter	MPLS performance monitoring and troubleshooting	—
Frame Relay	Measures circuit availability, round-trip delay, and frame delivery ratio	WAN service level agreement performance	This operation does not have SNMP support
ATM	Measures physical links or circuits of ATM connections	WAN service level agreement performance	This operation does not have SNMP support
VoIP Call Setup (Post-Dial Delay) Monitoring	Measures the response time for setting up a synthetic VoIP call	VoIP performance monitoring	Requires the VoIP test-call application on the source router
VoIP Gatekeeper Registration Delay Monitoring Operation	Measures the amount of time taken to register a gateway to a gatekeeper	VoIP performance monitoring	—