MPLS Encapsulation Across Ethernet Links

One of the issues surrounding the use of MPLS encapsulation is the deployment of Ethernet links (Ethernet, Fast Ethernet, or Gigabit Ethernet) within the topology using any supported Ethernet encapsulation: Ethernet II, 802.3 (with or without an 802.2 header), or SNAP. Each media type has a maximum frame size of 1518 octets (not including preamble or Start Frame Delimiter [SFD]) with a payload size ranging from 46 octets to 1500 octets (1492 in the case of SNAP encapsulation).

Previous chapters illustrate that using MPLS within the network causes a packet to grow in size as labels are added to the label stack. Each label header entry is 4 octets in length. This means that if a packet with a 1500-octet payload is received and a label header is pushed onto the stack, the frame must be forwarded with a 1504-octet payload. Because the various Ethernet media types restrict the maximum frame size, this causes a problem: the MTU on these links is smaller than the resulting packet size.
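The size arithmetic can be sketched in a few lines of Python. This is only an illustration of the calculation, not part of any router implementation; the function names and the 1500-octet default link MTU are assumptions chosen for the example.

```python
LABEL_HEADER_OCTETS = 4  # each MPLS label stack entry is 4 octets


def labeled_payload_size(ip_packet_octets: int, label_count: int) -> int:
    """Return the Ethernet payload size once label_count labels are pushed."""
    if ip_packet_octets < 0 or label_count < 0:
        raise ValueError("sizes and label counts must be non-negative")
    return ip_packet_octets + LABEL_HEADER_OCTETS * label_count


def fits_link(ip_packet_octets: int, label_count: int, link_mtu: int = 1500) -> bool:
    """True if the labeled packet fits within the link MTU (hypothetical helper)."""
    return labeled_payload_size(ip_packet_octets, label_count) <= link_mtu


print(labeled_payload_size(1500, 1))  # 1504
print(fits_link(1500, 1))             # False
print(fits_link(1496, 1))             # True
```

A 1500-octet packet with a single label no longer fits a standard Ethernet payload, whereas a 1496-octet packet does.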


The Gigabit Ethernet standard currently limits the frame size to 1518 octets, although some vendors now support jumbo frames, where the data field can extend to 4470 or 9000 octets. Extending the length of the Ethernet frame data field makes it impossible to determine uniquely whether a packet of a particular size is 802.3 or Ethernet II encapsulated. This is because the type/length field is interpreted as a length if its value is 1500 or less (therefore, the frame is an 802.3, 802.3 plus 802.2, or SNAP frame) and as a type if its value is 1536 or greater (therefore, the encapsulation is Ethernet II). The consequence is that large frames work well across Ethernet-type media with Ethernet II encapsulation, but not with any other encapsulation. Further studies are ongoing to determine how to handle encapsulations other than Ethernet II across Gigabit Ethernet, although no firm conclusions are available at the time of writing this book.
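The ambiguity can be shown with a small Python sketch of how a receiver interprets the 16-bit type/length field. The function name is hypothetical, and the boundaries used are the IEEE 802.3 ones: values of 1500 (0x05DC) or less are lengths, and values of 1536 (0x0600) or more are types.

```python
MAX_LENGTH_VALUE = 1500  # 0x05DC: largest value read as a length
MIN_TYPE_VALUE = 1536    # 0x0600: smallest value read as an EtherType


def classify_type_length(value: int) -> str:
    """Classify a type/length field value as 'length', 'type', or 'undefined'."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("the type/length field is 16 bits wide")
    if value <= MAX_LENGTH_VALUE:
        return "length"      # 802.3, 802.3 plus 802.2, or SNAP frame
    if value >= MIN_TYPE_VALUE:
        return "type"        # Ethernet II frame
    return "undefined"       # values between the two boundaries


print(classify_type_length(1500))    # length
print(classify_type_length(0x8847))  # type (MPLS unicast EtherType)
print(classify_type_length(9000))    # type
```

The last call shows the problem with jumbo frames: an 802.3 frame that tried to announce a 9000-octet data field would be read as an Ethernet II frame with a type value of 9000.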

IP MTU Path Discovery

Most IP hosts today support the use of the Path MTU discovery mechanism, as documented in RFC 1191, "Path MTU Discovery." The mechanism described in the RFC allows an IP host to discover dynamically the maximum allowable MTU size along the path from source to destination.

The basic idea behind Path MTU Discovery is that a source host initially assumes that the Path MTU of a particular connection is the MTU of its first hop, and sends all datagrams on that path with the DF (do not fragment) bit set. No datagram is sent that is bigger than the MTU of the first hop. Hosts that do not use these procedures should not send datagrams larger than 576 octets.


In the case of Transmission Control Protocol (TCP), when a session is established with a remote device, the Maximum Segment Size option is negotiated. The two devices involved in the session establishment exchange their TCP Maximum Segment Size (MSS) values, which normally are determined as the local MTU value minus 40 octets (IP header and TCP header). The smaller of the two MSS values is used for the Path MTU during the discovery process.
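As a rough illustration of the MSS calculation, the following Python sketch derives each side's MSS from its local MTU and picks the smaller value. The function names are hypothetical, and the sketch assumes IP and TCP headers of 20 octets each, with no options.

```python
IP_HEADER_OCTETS = 20   # IPv4 header without options (assumption)
TCP_HEADER_OCTETS = 20  # TCP header without options (assumption)


def advertised_mss(local_mtu: int) -> int:
    """MSS a host announces: local MTU minus 40 octets of headers."""
    return local_mtu - IP_HEADER_OCTETS - TCP_HEADER_OCTETS


def session_mss(mtu_a: int, mtu_b: int) -> int:
    """The smaller of the two announced MSS values is used for the session."""
    return min(advertised_mss(mtu_a), advertised_mss(mtu_b))


print(advertised_mss(1500))     # 1460
print(session_mss(1500, 4470))  # 1460
```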

When a router receives a packet that is larger than the MTU of the outgoing interface toward the destination contained in the incoming packet, and the DF bit is set on the packet, it sends an ICMP destination unreachable message with a code of 4 (fragmentation needed and DF set) back to the source of the packet. The Path MTU discovery process relies on the receipt of these messages to determine the maximum packet size that can be sent across the path to a particular destination. Figure 5-2 illustrates this process.

Figure 5-2. Path MTU Discovery Mechanism


When you deploy these procedures, packets can be sent successfully across an MPLS backbone without fragmentation. However, each LSR can fragment labeled or non-labeled packets if they are larger than the outgoing MTU, as long as the DF bit is not set. If the DF bit is set, the LSR conforms to the Path MTU discovery mechanism by sending an ICMP destination unreachable message with the "fragmentation needed and DF set" code.
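The per-hop decision and the discovery loop can be sketched in Python as follows. This is a simplified model, not router code: the names are hypothetical, each hop MTU is taken to be the size available to the IP packet on that hop, and each router is assumed to report its next-hop MTU in the ICMP message, as RFC 1191 specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Verdict:
    action: str                         # "forward", "fragment", or "icmp_frag_needed"
    next_hop_mtu: Optional[int] = None  # reported only with the ICMP message


def hop_decision(packet_octets: int, df_set: bool, outgoing_mtu: int) -> Verdict:
    """Decide what a router (or LSR) does with a packet on one hop."""
    if packet_octets <= outgoing_mtu:
        return Verdict("forward")
    if df_set:
        # ICMP destination unreachable, code 4 (fragmentation needed and DF set)
        return Verdict("icmp_frag_needed", outgoing_mtu)
    return Verdict("fragment")


def discover_path_mtu(first_hop_mtu: int, hop_mtus: List[int]) -> int:
    """Lower the estimate each time a hop answers with an ICMP code 4 message."""
    path_mtu = first_hop_mtu
    while True:
        for mtu in hop_mtus:
            verdict = hop_decision(path_mtu, True, mtu)
            if verdict.action == "icmp_frag_needed":
                path_mtu = verdict.next_hop_mtu  # retry with the smaller size
                break
        else:
            return path_mtu  # every hop forwarded the packet


print(discover_path_mtu(1500, [1500, 1496, 1500]))  # 1496
```

The loop terminates because the estimate strictly decreases on every ICMP message. If a firewall drops those messages, the source never learns the smaller value, which is the failure mode the discovery mechanism is exposed to.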

All this would be fine if everyone conformed to the previously described mechanisms. However, the reality is that some hosts do not use Path MTU discovery and still send datagrams that are larger than 576 octets. Furthermore, some firewalls drop ICMP unreachable messages, which effectively breaks the MTU discovery mechanism. Because of these issues, a further mechanism that allows frames with a payload greater than 1500 octets is needed within an MPLS environment to ensure that packets can be sent successfully across the network.


Some of the issues with MTU discovery are discussed in further detail in draft-ietf-tcpimpl-pmtud. You can find this draft at the IETF web site.

Cisco Systems introduced a workaround for these issues that allows an Ethernet port on a router to support MPLS packets that have a payload larger than 1500 octets. This is achieved by increasing the MTU of the Ethernet port to 1526 octets, which is the standard maximum Ethernet frame size of 1518 octets plus 8 octets for two levels of MPLS labels. This number of labels is adequate at this time and supports the introduction of MPLS and MPLS-enabled VPNs, but it does not support an arbitrary label stack depth. Further study is ongoing, and the label stack depth may be increased in the future to allow the introduction of services that require a label stack depth greater than two. This workaround is relevant to packets that are received with their DF (do not fragment) bit set. In draft-ietf-mpls-label-encaps, this increase in payload size is known as the "True Maximum Frame Payload Size."
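The relationship between the port frame size and the label stack depth is simple arithmetic, sketched here in Python. The function names are hypothetical and the sketch only restates the figures given in the text (1518 octets plus 4 octets per label).

```python
STANDARD_MAX_FRAME_OCTETS = 1518  # maximum Ethernet frame size, without preamble or SFD
LABEL_HEADER_OCTETS = 4           # each MPLS label stack entry


def max_frame_octets(label_depth: int) -> int:
    """Frame size a port must accept to carry a full payload plus label_depth labels."""
    if label_depth < 0:
        raise ValueError("label depth must be non-negative")
    return STANDARD_MAX_FRAME_OCTETS + LABEL_HEADER_OCTETS * label_depth


def supported_label_depth(port_max_frame_octets: int) -> int:
    """Number of labels that fit on top of a full-size standard frame."""
    extra = port_max_frame_octets - STANDARD_MAX_FRAME_OCTETS
    return max(extra, 0) // LABEL_HEADER_OCTETS


print(max_frame_octets(2))          # 1526
print(supported_label_depth(1526))  # 2
print(supported_label_depth(1518))  # 0
```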

For packets that do not have their DF bit set, the previously mentioned draft specifies that every LSR should support a configuration parameter known as the "Maximum Initially Labeled IP Datagram Size." (See section 3.2 of draft-ietf-mpls-label-encaps.) This parameter is used on the ingress to the MPLS domain so that the packet can be fragmented at the edge of the network if it is larger than the configured maximum labeled MTU size. This means that the MTU size needs to be established for all backbone links so that this value can be decided. The advantage of this is that the packet is fragmented prior to entry into the MPLS domain and does not require further fragmentation within the MPLS backbone.

In the Cisco MPLS implementation, this parameter is configured using the tag-switching mtu command on the output interface. This command defaults to the interface MTU size. If packets arrive that are too big, as specified by the tag-switching mtu command, to be sent without fragmentation somewhere within the MPLS network, and they do not have the DF bit set, then they are fragmented prior to transmission out of the outbound interface. The advantage of this is that fragmentation need not occur within the MPLS domain and is restricted to the edge of the network.
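To make the effect of edge fragmentation concrete, the following Python sketch computes the sizes of the IP fragments produced for a datagram that does not have its DF bit set. It is a simplified model under stated assumptions, not a description of the Cisco implementation: the function name is hypothetical, the configured labeled MTU is assumed to include the label headers, and the IP header is assumed to be 20 octets with no options.

```python
from typing import List

IP_HEADER_OCTETS = 20    # IPv4 header without options (assumption)
LABEL_HEADER_OCTETS = 4  # each MPLS label stack entry


def edge_fragment_sizes(datagram_octets: int, labeled_mtu: int, label_count: int) -> List[int]:
    """Return the size of each IP fragment (header plus data) sent by the edge LSR."""
    if datagram_octets < IP_HEADER_OCTETS:
        raise ValueError("datagram is smaller than an IP header")
    # Largest IP datagram that still fits once the labels are pushed.
    max_ip_octets = labeled_mtu - LABEL_HEADER_OCTETS * label_count
    if datagram_octets <= max_ip_octets:
        return [datagram_octets]  # no fragmentation needed
    # Every fragment except the last carries a multiple of 8 data octets.
    max_data = (max_ip_octets - IP_HEADER_OCTETS) // 8 * 8
    if max_data <= 0:
        raise ValueError("labeled MTU is too small to carry any data")
    remaining = datagram_octets - IP_HEADER_OCTETS
    sizes = []
    while remaining > 0:
        chunk = min(max_data, remaining)
        sizes.append(IP_HEADER_OCTETS + chunk)
        remaining -= chunk
    return sizes


print(edge_fragment_sizes(1500, 1500, 2))  # [1492, 28]
print(edge_fragment_sizes(1492, 1500, 2))  # [1492]
```

In the first call, the largest fragment is 1492 octets, so it occupies exactly 1500 octets once two labels are pushed and needs no further fragmentation inside the backbone.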


The tag-switching mtu command is also required in conjunction with the increase of the maximum Ethernet MTU size ("True Maximum Frame Payload Size"). If you do not set this command, any packet whose payload, after MPLS labels are pushed onto the stack, is larger than the default maximum payload size for the outgoing interface (1500 octets in the case of Ethernet II, for example) is dropped, and an ICMP message is sent back to the source. This occurs even though the interface can support the larger frame size. For this reason, set this command on all Ethernet interfaces that will be configured to carry MPLS-encapsulated packets.


Ethernet interfaces are not the only interfaces where the MTU is smaller than the resultant frame size after MPLS labels have been added. This means that the tag-switching mtu command is not restricted to Ethernet interfaces only and should be configured for any interface where the maximum MTU configured for the interface is likely to be exceeded.

Ethernet Switches and MPLS MTU

As discussed in the previous section, the size of an IP packet increases by 4 octets for each MPLS label pushed onto the packet. Whichever MPLS facility is used (base MPLS, VPN, or Traffic Engineering), an MPLS packet can exceed the maximum Ethernet frame size of 1518 octets. The previous section showed that this problem has been partly resolved by changes to the LSR that make it capable of transmitting a frame that is larger than 1518 octets.

This workaround is fine if the LSRs are connected through back-to-back Ethernet cabling. However, if you use a Layer 2 switch to provide the Ethernet segment, then this device also must be capable of forwarding frames that are larger than 1518 octets. In most, but not all, cases the switch cannot do this, so it drops the frame and reports a giant.


Some Cisco Layer 2 switches support giant frames by default and some do not. If they do not, several workarounds exist to enable the switches to pass the frames. You can obtain these workarounds from the Cisco Systems, Inc. Technical Assistance Center (TAC) on request.
