QoS Operational Model

Thus far, the discussion has laid out the fundamentals of QoS and the types of roles it can play, and has defined some general terms such as CoS, ToS, and DSCP. This section explores the steps a packet takes as it traverses a Catalyst 6500 switch from the ingress port to the egress port.

The Catalyst QoS operational model consists of five steps:

  1. Classification

  2. Input scheduling

  3. Marking and policing

  4. Marking

  5. Output scheduling

Classification

Classification is the first step. The Catalyst switch needs to distinguish one incoming frame from another so that it can forward each frame through the switch appropriately.

Figure 8-1 depicts the path a frame takes before it reaches the switching engine (PFC/PFC2) for further instructions. By default, a Catalyst 6500 switch port is untrusted, which means that any frame received on the port has its CoS value reset to 0. The default CoS value of 0 can be changed, and incoming frames on that untrusted port then inherit the newly configured CoS setting. Alternatively, the port can be configured as trusted, in which case the incoming frame's CoS value is maintained. An ingress port can be configured with the following options (a short conceptual sketch follows the list):

  • Untrusted: The incoming frame loses its CoS value and inherits the default or configured value on the ingress port.

  • Trust-cos: The incoming frame maintains its CoS value.

  • Trust-dscp: The incoming packet maintains its DSCP value.

  • Trust-ipprec: The incoming packet maintains its IP precedence value.
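The trust decision can be modeled with a short conceptual sketch. This is illustrative only (the TrustState and classify_cos names are not switch software); it simply shows the untrusted case resetting the frame's CoS to the port default and the trust-cos case preserving it. The Layer 3 trust options follow the same idea using the packet's DSCP or IP precedence value instead.

from enum import Enum

class TrustState(Enum):
    UNTRUSTED = "untrusted"     # CoS is rewritten to the port default
    TRUST_COS = "trust-cos"     # CoS is preserved

def classify_cos(frame_cos: int, trust: TrustState, port_default_cos: int = 0) -> int:
    """Conceptual model of ingress CoS handling on a Catalyst 6500 port."""
    if trust is TrustState.TRUST_COS:
        return frame_cos            # trusted port: incoming CoS value is maintained
    return port_default_cos         # untrusted port: CoS reset to the default (0 unless reconfigured)

# A frame tagged with CoS 5 keeps its value only if the port is trusted.
print(classify_cos(5, TrustState.UNTRUSTED))   # 0
print(classify_cos(5, TrustState.TRUST_COS))   # 5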

Figure 8-1. Classification of Incoming Frame

graphics/08fig01.gif


A Layer 3 switching engine is required to configure the trust-dscp and trust-ipprec options. A port's trust state is configured with the set port qos command:






Switch1 (enable) set port qos 1/2 trust trust-cos


This command configures port 1/2 as trusted; in this case, the command was issued on a Gigabit Ethernet port. Any incoming frame with a CoS value set is now forwarded without change. It is worth noting an important caveat regarding 10/100 cards (for example, the WS-X6248-xx or WS-X6348-xx) and classification: the 10/100 cards do not support any trust-type configuration. In Example 8-2, for instance, port 10/3 is configured as a CoS-trusted port; however, the switch generates a syslog message stating that the trust-cos feature is not supported and that Receive thresholds are enabled. Even though the trust type is not supported on a 10/100 card, the command still needs to be entered to enable Receive thresholds. The "Input Scheduling" section of this chapter discusses Receive thresholds.

Example 8-2. Configuring a 10/100 Port as a CoS Trusted Port

Switch1 (enable) set port qos 10/3 trust trust-cos

Trust type trust-cos not supported on this port.

Receive thresholds are enabled on port 10/3.

Port 10/3 qos set to untrusted.


Because a 10/100 card does not support a trust type, incoming frames with CoS values set have those values reset to 0. A workaround using an access list, outlined in the following steps, allows incoming frames to retain their CoS values:

Step 1. Enable the 10/100 card for trust-cos, as follows:






Switch1 (enable) set port qos 10/3 trust trust-cos

Trust type trust-cos not supported on this port.

Receive thresholds are enabled on port 10/3.

Port 10/3 qos set to untrusted.


Step 2. Create an access list:






Switch1 (enable) set qos acl ip list1 trust-cos any


Step 3. Commit the changes to nonvolatile random-access memory (NVRAM):






Switch1 (enable) commit qos acl list1


Step 4. Map the access list to the port:






Switch1 (enable) set qos acl map list1 10/3

ACL list1 is successfully mapped to port 10/3.

The old ACL mapping is replaced by the new one.


Example 8-3 shows an excerpt from the show port qos command. Of interest in the output is that the access list, list1, is applied to port 10/3 and for IP traffic only.

Example 8-3. QoS Parameters for a Single Port

Switch1 (enable) show port qos 10/3

Config:

Port  ACL name                         Type

----- -------------------------------- ----

10/3  list1                            IP


A QoS access list can be applied either to a specific port (port-based) or to an entire VLAN (vlan-based). The access list list1 was applied port-based, which means it does not affect other hosts on the same VLAN. By default, Catalyst switch ports are port-based; if needed, the set port qos command can be used to change the QoS access list configuration to vlan-based:






Switch1 (enable) set port qos 10/3 vlan-based


Input Scheduling

Input scheduling is the next step in handling the frame after it arrives at the ingress port, assuming the port has been configured for trust-cos (refer to Figure 8-1). Input scheduling assigns incoming frames to queues. If trust-cos is not configured, incoming frames bypass the Receive threshold (also known as the drop threshold) queue and are forwarded directly to the switching engine. Each queue has its own drop threshold levels, which means that frames are dropped after a threshold value is exceeded.

The number of queues and their associated drop threshold values depend on the hardware used. Example 8-4 shows the capabilities of port 10/3 on the WS-X6248-xx module. Note the QoS scheduling field shaded in the example, which reads rx-(1q4t),tx-(2q2t). Input scheduling deals with rx-(1q4t); the tx-(2q2t) side is discussed later in the chapter. The 1q4t designation means 1 queue with 4 drop thresholds. Newer line cards have 1p1q4t, which translates to 1 priority queue and 1 standard queue with 4 drop thresholds. Each threshold is set to drop incoming packets based on their CoS setting and the amount of buffer used.

Example 8-4. Type of QoS Scheduling

Switch1 (enable) show port capabilities 10/3

Model                    WS-X6248-RJ-45

Port                     3/1

Type                     10/100BaseTX

Speed                    auto,10,100

Duplex                   half,full

Trunk encap type         802.1Q,ISL

Trunk mode               on,off,desirable,auto,nonegotiate

Channel                  yes

Broadcast suppression    percentage(0-100)

Flow control             receive-(off,on),send-(off)

Security                 yes

Dot1x                    yes

Membership               static,dynamic

Fast start               yes

QOS scheduling           rx-(1q4t),tx-(2q2t)

CoS rewrite              yes

ToS rewrite              DSCP

UDLD                     yes

Inline power             no

AuxiliaryVlan            1..1000,1025..4094,untagged,dot1p,none

SPAN                     source,destination

COPS port group          3/1-48

Link debounce timer      yes

Dot1q-all-tagged         yes


Because 1q4t has only 1 queue, all incoming frames are placed in this single queue. If the queue starts to become congested, however, frames are dropped based on their CoS values. The following list shows the default mapping for each CoS value:

  • CoS 0 and 1 are mapped to threshold 1 (set at 50 percent)

  • CoS 2 and 3 are mapped to threshold 2 (set at 60 percent)

  • CoS 4 and 5 are mapped to threshold 3 (set at 80 percent)

  • CoS 6 and 7 are mapped to threshold 4 (set at 100 percent)

Any incoming packet with a CoS setting of 0 or 1, which maps to threshold 1, is dropped if the port buffer is 50 percent or more full. The show qos info command in Example 8-5 shows the default mapping between CoS values and drop threshold levels on a Catalyst switch; a short sketch of the drop decision follows the example.

Example 8-5. Default Parameters for 1q4t rx

Switch1 (enable) show qos info config 1q4t rx

QoS setting in NVRAM for 1q4t receive:

QoS is enabled

Queue and Threshold Mapping for 1q4t (rx):

Queue Threshold CoS

----- --------- ---------------

1     1         0 1

1     2         2 3

1     3         4 5

1     4         6 7

Rx drop thresholds:

Queue #  Thresholds - percentage

-------  -------------------------------------

1        50% 60% 80% 100%

Rx WRED thresholds:

WRED feature is not supported for this port type.

Rx queue size ratio:

Rx queue size-ratio feature is not supported for this port type.
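The drop decision itself can be sketched as follows, using the default CoS-to-threshold mapping shown in Example 8-5. The table and function names are illustrative, not switch code; they assume the decision depends only on the frame's CoS value and how full the receive buffer is.

# Default 1q4t receive behavior: CoS value -> buffer fill percentage at which the frame is dropped.
RX_DROP_THRESHOLD = {0: 50, 1: 50, 2: 60, 3: 60, 4: 80, 5: 80, 6: 100, 7: 100}

def rx_frame_dropped(cos: int, buffer_fill_percent: float) -> bool:
    """Drop the frame when the receive buffer has reached the threshold mapped to its CoS."""
    return buffer_fill_percent >= RX_DROP_THRESHOLD[cos]

print(rx_frame_dropped(cos=1, buffer_fill_percent=55))   # True: CoS 1 drops once the buffer hits 50 percent
print(rx_frame_dropped(cos=6, buffer_fill_percent=55))   # False: CoS 6 survives until 100 percent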


The 1p1q4t structure has an extra queue, called the strict priority queue, which is associated with a CoS value of 5. The strict priority queue, queue 2, takes precedence over the standard queue, queue 1; traffic in the strict priority queue is always serviced first. Typically, critical user traffic is marked with CoS 5 at Layer 2 and the equivalent DSCP value of 40 at Layer 3. User traffic is not marked with higher CoS values such as 6 or 7 because those values are generally associated with control traffic. The following bullets outline the two queues and their associated drop threshold levels:

  • CoS 0 and 1 are mapped to threshold 1/standard queue (set at 50 percent)

  • CoS 2 and 3 are mapped to threshold 2/standard queue (set at 60 percent)

  • CoS 4 is mapped to threshold 3/standard queue (set at 80 percent)

  • CoS 5 is mapped to priority queue (set at 100 percent)

  • CoS 6 and 7 are mapped to threshold 4/standard queue (set at 100 percent)

Both the queue and threshold settings can be changed, if necessary. For example, the following set qos map command maps CoS 4 to drop threshold level 2:






Switch1 (enable) set qos map 1p1q4t rx 1 2 cos 4

QoS rx priority queue and threshold mapped to cos successfully.


The switch, however, does not allow the priority queue to be associated with any threshold other than its own. The following configuration attempts to map CoS 6 to threshold 4 on the priority queue (queue 2); it would have succeeded if threshold 1 had been specified:






Switch1 (enable) set qos map 1p1q4t rx 2 4 cos 6

Incompatible queue/threshold number with port-type specified.


The following changes the drop threshold for threshold 1 from 50 percent to 60 percent:






Switch1 (enable) set qos drop-threshold 1q4t rx queue 1 60 60 80 100

Receive drop thresholds for queue 1 set at 60% 60% 80% 100%


Marking and Policing

The next step after input scheduling is for the frame to be forwarded to the switching engine (PFC/PFC2). The switching engine will mark every frame with an internal DSCP value.

This marking helps the switching engine service the frame appropriately. The internal DSCP value is not arbitrary; it is derived from one of the following sources: the DSCP or IP precedence value of the packet at Layer 3, the CoS value of the frame at Layer 2, or a user-defined access list.

Example 8-6 shows the mapping between CoS and DSCP values. This default mapping is determined by the architectural design of the Catalyst 6500 switch. A conceptual sketch of how the internal DSCP is derived follows the example.

Example 8-6. CoS to DSCP Map

Switch1 (enable) show qos map runtime cos-dscp-map

CoS - DSCP map:

CoS   DSCP

---   ----

  0   0

  1   8

  2   16

  3   24

  4   32

  5   40

  6   48

  7   56
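As a conceptual sketch, the internal DSCP derivation under the different trust states might look like the following. The function name and the untrusted-port behavior shown here are illustrative assumptions; the CoS and IP precedence multiplications simply restate the default maps (the value times 8, as in Example 8-6).

from typing import Optional

def internal_dscp(trust: str, cos: int = 0, dscp: int = 0, ipprec: int = 0,
                  acl_dscp: Optional[int] = None, port_default_cos: int = 0) -> int:
    """Conceptual derivation of the internal DSCP used by the switching engine."""
    if acl_dscp is not None:
        return acl_dscp               # a user-defined ACL can assign the DSCP directly
    if trust == "trust-dscp":
        return dscp                   # the packet's Layer 3 DSCP is carried through
    if trust == "trust-ipprec":
        return ipprec * 8             # default IP precedence-to-DSCP map
    if trust == "trust-cos":
        return cos * 8                # default CoS-to-DSCP map (Example 8-6)
    return port_default_cos * 8       # untrusted: the port's default CoS drives the internal DSCP

print(internal_dscp("trust-cos", cos=5))    # 40
print(internal_dscp("untrusted", cos=5))    # 0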


Figure 8-2 shows the flow of traffic through the switching engine. It might be a bit confusing to see classification on the switching engine along with marking and policing, but classification sometimes has to be done on the switching engine (for example, the list1 access list shown earlier in the chapter helps the port distinguish incoming traffic).

Figure 8-2. Traffic Flow Through the Switching Engine

graphics/08fig02.gif


After the traffic has been marked, the switching engine checks whether policing is configured for the traffic and whether the traffic is within the bandwidth guidelines. The motivation behind policing is to curb bandwidth use: the policing mechanism places a ceiling on the amount of bandwidth utilized. Traffic that exceeds the bandwidth policy is either dropped or has its priority marked down.

A token bucket conceptual model is used to demonstrate the policing behavior (see Figure 8-3). The objective is to ensure the bucket does not overflow. There are three elements of interest:

  • Incoming rate: The rate at which the user is currently sending packets.

  • Bucket size: The bucket size equates to the allowed burst.

  • Output rate: The allowed bandwidth given to the user. The rate interval is 0.25 milliseconds on a Catalyst 6500 switch.

Figure 8-3. Token Bucket Model

graphics/08fig03.gif


Initially, the bucket is empty because there is no traffic flow. If the incoming rate is below the configured rate-limiting parameter, the leak rate keeps up with the incoming packet rate and the bucket does not fill. If the incoming rate is higher than the allowed leak rate, an overflow occurs, and at that point policing kicks in.
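A minimal sketch of this token-bucket behavior follows. The class name and parameters (rate in kbps, burst in kilobits, a 0.25-ms check interval) are illustrative assumptions used to show the mechanics, not the hardware implementation.

class TokenBucketPolicer:
    """Conceptual token bucket: traffic that would overflow the bucket is policed."""

    def __init__(self, rate_kbps: float, burst_kbits: float, interval_ms: float = 0.25):
        self.leak_per_interval = rate_kbps * interval_ms / 1000.0   # kilobits drained each interval
        self.capacity = burst_kbits                                 # bucket size equals the allowed burst
        self.level = 0.0                                            # bucket starts empty (no traffic yet)

    def offer(self, kbits: float) -> bool:
        """Return True if the traffic conforms, False if it overflows (drop or mark down)."""
        self.level = max(0.0, self.level - self.leak_per_interval)  # drain at the allowed output rate
        if self.level + kbits > self.capacity:
            return False             # overflow: policing kicks in
        self.level += kbits
        return True

policer = TokenBucketPolicer(rate_kbps=64, burst_kbits=128)
print(policer.offer(100))   # True: within the allowed burst
print(policer.offer(100))   # False: the bucket overflows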

Catalyst switches define two types of policing: microflow and aggregate. A microflow policer looks at each individual flow, where the individual flows are defined by their Layer 3 and Layer 4 properties; the Catalyst 6500 can support up to 63 microflow policing definitions. Aggregate policing, on the other hand, looks at many individual flows at a time and supports up to 1023 policing configurations.
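One rough way to picture the difference, reusing the TokenBucketPolicer class from the previous sketch (the flow key and class below are assumptions for illustration): a microflow policer keeps a separate bucket per Layer 3/Layer 4 flow, whereas an aggregate policer shares a single bucket across every flow the ACL matches.

class MicroflowPolicer:
    """One conceptual token bucket per individual flow, keyed by Layer 3/Layer 4 properties."""

    def __init__(self, rate_kbps: float, burst_kbits: float):
        self.rate, self.burst = rate_kbps, burst_kbits
        self.buckets = {}   # (src, dst, sport, dport) -> TokenBucketPolicer

    def offer(self, src, dst, sport, dport, kbits):
        key = (src, dst, sport, dport)
        bucket = self.buckets.setdefault(key, TokenBucketPolicer(self.rate, self.burst))
        return bucket.offer(kbits)

# An aggregate policer, by contrast, would be a single shared TokenBucketPolicer
# applied to all traffic matched by the mapped access list.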

The following steps detail, in brief, how to configure and apply a microflow policer for traffic that has a DSCP value of 40:

Step 1. Define the policing parameters:






Switch1 (enable) set qos policer microflow police1 rate 64 burst 128 drop

QoS policer for microflow police1 created successfully.


Step 2. Assign an access list with its associated policing parameter:






Switch1 (enable) set qos acl ip list1 dscp 40 microflow police1 ip any 10.1.1.1


Step 3. Apply the access list to the NVRAM:






Switch1 (enable) commit qos acl list1


Step 4. Map the access list to the specific port:






Switch1 (enable) set qos acl map list1 10/3


All IP traffic from port 10/3 to 10.1.1.1 is now policed at 64 kbps with bursts up to 128 kbps. The following aggregate configuration, on the other hand, applies policing to all IP traffic:

Step 1. Define an aggregate policer. The rate is configured as 2000, but because rate and burst are set in increments of 32, the switch adjusts the rate to 1984, the closest increment of 32 to the user-defined value of 2000:






Switch1 (enable) set qos policer aggregate police2 rate 2000 burst 4000 drop

QoS policer for aggregate police2 created successfully.


Step 2. Link the aggregate policy to an access list:






Switch1 (enable) set qos acl ip list1 dscp 40 aggregate police2 ip any any


Step 3. Apply the access list to the NVRAM:






Switch1 (enable) commit qos acl list1


Step 4. Map the access list to the specific port:






Switch1 (enable) set qos acl map list1 10/3


An aggregate policer has now been defined that limits all IP traffic from port 10/3 to no more than 2 Mbps of throughput with a 4-Mbps burst.

Marking

The switching engine forwards the traffic to the egress port, where the internal DSCP value assigned at the switching engine is marked back to the corresponding CoS or DSCP/IP precedence value. For example, if incoming frames arrived at the ingress port with a CoS value of 5, the switching engine assigned them an internal DSCP value of 40 while switching them; at the egress port, the DSCP value of 40 is mapped back to the original CoS value of 5. The same principle applies to incoming packets that have IP precedence or DSCP values set. Figure 8-4 shows the flow of traffic coming from the switching engine, PFC, or MSFC; a short sketch of the rewrite follows the figure.

Figure 8-4. Traffic Forwarded to Egress Port

graphics/08fig04.gif
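A minimal sketch of the egress rewrite, assuming the default maps (the inverse of the CoS-to-DSCP map in Example 8-6); the function name is illustrative.

def egress_cos(internal_dscp: int) -> int:
    """Map the internal DSCP back to a CoS value at the egress port (default DSCP-to-CoS map)."""
    return internal_dscp // 8

print(egress_cos(40))   # 5: a frame that arrived with CoS 5 leaves with CoS 5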


Output Scheduling

After marking, the traffic is forwarded to the appropriate transmit queue on the egress port (see Figure 8-5). The number of queues and drop thresholds varies depending on the hardware module used. The purpose of the queues on the egress port is to service higher-priority traffic first. It is equally important that, during congestion, steps be taken to minimize the dropping of critical traffic; this is done via the congestion avoidance mechanisms implemented on Cisco switches.

Figure 8-5. Traffic Handling on the Egress Port

graphics/08fig05.gif


Older line cards such as the WS-X6248-xx have two transmit queues with two drop thresholds (2q2t). In 2q2t, queue 1, which carries low-priority traffic, is allocated 80 percent of the total transmit queue size; queue 2 is allocated the remaining 20 percent for high-priority traffic.

As noted in the output (see Example 8-7), CoS values 4-7 are sent to queue 2. The drop threshold is set at 80 percent for CoS 0 and 1 on queue 1, and CoS 4 and 5 on queue 2. The drop threshold is at 100 percent for CoS values 2 and 3 on queue 1 and 6 and 7 on queue 2.

Example 8-7. QoS Information for 2q2t tx

Switch1 (enable) show qos  information config 2q2t tx

QoS setting in NVRAM for 2q2t transmit:

QoS is enabled

Queue and Threshold Mapping for 2q2t (tx):

Queue Threshold CoS

----- --------- ---------------

1     1         0 1

1     2         2 3

2     1         4 5

2     2         6 7

Tx drop thresholds:

Queue #  Thresholds - percentage

-------  -------------------------------------

1        80% 100%

2        80% 100%

Tx WRED thresholds:

WRED feature is not supported for this port type.

Tx queue size ratio:

Queue #  Sizes - percentage

-------  -------------------------------------

1        80%

2        20%

WRR Configuration of ports with 2q2t:

Queue #  Ratios

-------  -------------------------------------

1        5

2        255


Example 8-7 shows the default values for 2q2t; these values can be changed, if necessary, using the set qos map command. For example, the following associates frames with a CoS value of 2 with drop threshold 1 rather than the default of 2:






set qos map 2q2t tx 1 1 cos 2


Congestion avoidance helps keep the two queues in Example 8-7 from filling up. If no congestion avoidance mechanism is used, any traffic arriving at a full queue is dropped; this is known as tail drop. Congestion avoidance mechanisms such as Random Early Detection (RED) and Weighted Random Early Detection (WRED) minimize the risk of the queues filling up.

Example 8-7 shows that the WRED feature is not supported on this port, which means it can only do tail drop. RED and WRED accomplish two things:

  • Proactive queue management

  • Queue size control (minimizing queuing delays)

RED drops packets at random regardless of traffic priority. WRED also drops packets at random, but it favors high-priority packets over low-priority packets. This randomness prevents global synchronization, in which TCP conversations all throttle back at the same time. Having hardware that supports WRED is a big plus.
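The WRED idea can be sketched as follows. The minimum/maximum fill thresholds and the CoS split used here are illustrative assumptions, not the switch defaults; the point is only that higher-priority traffic gets higher thresholds and therefore drops later and less often.

import random

def wred_drop(cos: int, queue_fill_percent: float) -> bool:
    """Simplified WRED decision for a single transmit queue."""
    min_th, max_th = (40, 70) if cos <= 3 else (70, 100)   # higher CoS -> higher thresholds
    if queue_fill_percent < min_th:
        return False                                       # below the minimum: never drop
    if queue_fill_percent >= max_th:
        return True                                        # above the maximum: behaves like tail drop
    drop_probability = (queue_fill_percent - min_th) / (max_th - min_th)
    return random.random() < drop_probability              # random early drop in between

# At 60 percent fill, low-priority frames face random drops while high-priority frames do not.
print(wred_drop(cos=1, queue_fill_percent=60), wred_drop(cos=6, queue_fill_percent=60))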

The newer cards have 1 priority queue, 2 standard queues, and 2 thresholds (1p2q2t). For 1p2q2t, the standard/low priority queue is 70 percent of the total transmit queue size; the standard/high priority queue and the strict priority queue each get 15 percent. Also note in Example 8-8 that queue 3 is the strict priority queue, which is dedicated to CoS 5 traffic.

Example 8-8. QoS Information for 1p2q2t tx

Switch1 (enable) show qos information config 1p2q2t tx

QoS setting in NVRAM for 1p2q2t transmit:

QoS is enabled

Queue and Threshold Mapping for 1p2q2t (tx):

Queue Threshold CoS

----- --------- ---------------

1     1         0 1

1     2         2 3

2     1         4 6

2     2         7

3     -         5

Tx drop thresholds:

Tx drop-thresholds feature is not supported for this port type.

Tx WRED thresholds:

Queue #  Thresholds - percentage

-------  ------------------------------------------

1        40%:70% 70%:100%

2        40%:70% 70%:100%

Tx queue size ratio:

Queue #  Sizes - percentage

-------  -------------------------------------

1        70%

2        15%

3        15%

WRR Configuration of ports with 1p2q2t:

Queue #  Ratios

-------  -------------------------------------

1        5

2        255


After the traffic is in the appropriate queues, Weighted Round Robin (WRR) is used to service the standard/high and standard/low queues; the strict priority queue is always serviced first, before the other two. The default behavior is to service the standard/high queue 70 percent of the time and the standard/low queue the remaining 30 percent.
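As a rough sketch of that behavior (strict priority first, then weighted servicing of the two standard queues), with illustrative queue names, weights taken from the 70/30 default just described, and weighting done by packet count for simplicity:

from collections import deque

def service_egress(priority_q: deque, high_q: deque, low_q: deque,
                   high_weight: int = 70, low_weight: int = 30):
    """Conceptual egress scheduler: drain the strict priority queue, then WRR the standard queues."""
    while priority_q:                       # the strict priority queue is always serviced first
        yield priority_q.popleft()
    while high_q or low_q:
        for _ in range(high_weight):        # service the standard/high queue per its weight
            if not high_q:
                break
            yield high_q.popleft()
        for _ in range(low_weight):         # then the standard/low queue per its weight
            if not low_q:
                break
            yield low_q.popleft()

order = list(service_egress(deque(["voice"]), deque(["hi1", "hi2"]), deque(["lo1"])))
print(order)   # ['voice', 'hi1', 'hi2', 'lo1']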