BGP Confederations Deployment

BGP confederations are another mechanism for controlling the explosion of iBGP meshing. They can be used instead of or in combination with route reflectors. The basic functionality of BGP confederations is to split up the autonomous system into smaller, more manageable, autonomous systems, which are represented as one single autonomous system to BGP peers external to the confederation.

Note

For an in-depth discussion of BGP confederations, refer to the Cisco Press title Internet Routing Architectures, Second Edition, by Bassam Halabi, or RFC 1965, "Autonomous System Confederations for BGP."

By creating smaller autonomous system domains, or sub-ASs, it is possible to restrict the number of iBGP sessions that are required: a full mesh is required only within each sub-AS. A further advantage to this method, which may be useful in very large-scale topologies, is that separate IGPs can be deployed across the service provider backbone, which helps with the scaling of the IGP. This feature is not possible when route reflectors are used, unless the BGP next-hop addresses are leaked across sub-AS boundaries, because you cannot reset the BGP next-hop on a route reflector. (However, you can do this on a confederation boundary router.)

Note

Although confederations reduce the size of the iBGP mesh, they make it harder to partition the routing information if a separate IGP is run in each sub-AS. This is because packets must be forwarded across the confederation boundary, which in an MPLS/VPN environment means that each confederation boundary router must have access to all VPN routes (unless complex partitioning of the routes between multiple edge routers is deployed).

To help explain this complex subject, we will consider the topology shown in Figure 12-13. This topology depicts a service provider, Confed.Com, that has three regional POPs, located in San Jose, Paris, and London, connected in a full mesh. All relevant IP address assignments are shown in Table 12-7.

Figure 12-13. Confed.Com Network Topology Example


Table 12-7. IP Address Assignments for Confed.Com Backbone
POP	Site	Subnet
San Jose	San Jose (Loopback0)	194.17.1.1/32
	San Francisco (Loopback0)	194.17.1.2/32
	Santa Clara (Loopback0)	194.17.1.3/32
London	Reading (Loopback0)	197.58.27.3/32
	Heathrow (Loopback0)	197.58.27.2/32
	London (Loopback0)	197.58.27.1/32
Paris	Paris (Loopback0)	195.12.14.1/32
	Chartres (Loopback0)	195.12.14.3/32
	Lyon (Loopback0)	195.12.14.2/32

Figure 12-13 shows that MPLS and MP-iBGP have been deployed within each regional POP and that a full mesh of eBGP is used to connect each regional POP. Each POP is a separate sub-AS, so our requirement of an iBGP full mesh among all BGP-speaking routers is relaxed. The exchange of routes and labels between the sub-ASs will differ, depending on which deployment option is taken.

Note

Although a full mesh of eBGP is used to connect each regional POP, it should be noted that within a confederation environment, the eBGP session between sub-ASs differs from normal eBGP. The attributes of any routes advertised across the session are not changed, including the BGP next-hop of the route. In addition, normal iBGP rules apply within each sub-AS.

When confederations are used, we have a couple of choices on how to design and deploy the IGP. Which choice is taken affects the use of the MPLS/VPN architecture and how it functions in this type of environment. Example 12-9 provides configuration for PE-routers San Francisco, San Jose, London, and Reading, which will be used in the examples that follow. For the sake of simplicity, the Paris POP will not be considered within the examples.

Example 12-9 BGP Confederation: Example Configuration


hostname Reading
!
ip vrf EuroBank
 rd 1:27
 route-target export 100:27
 route-target import 100:27
!
interface loopback0
 ip address 197.58.27.3 255.255.255.255
!
router bgp 65001
 no bgp default ipv4-unicast
 bgp confederation identifier 100
 bgp confederation peers 65002
 neighbor 197.58.27.1 remote-as 65001
 neighbor 197.58.27.1 update-source Loopback0
 neighbor 197.58.27.1 activate
 !
 address-family ipv4 vrf EuroBank
 redistribute connected
 no auto-summary
 no synchronization
 exit-address-family
 !
 address-family vpnv4
 neighbor 197.58.27.1 activate
 neighbor 197.58.27.1 send-community extended
 exit-address-family





hostname London
!
ip vrf EuroBank
 rd 1:27
 route-target export 100:27
 route-target import 100:27
!
interface loopback0
 ip address 197.58.27.1 255.255.255.255
!
router bgp 65001
 no synchronization
 no bgp default ipv4-unicast
 bgp confederation identifier 100
 bgp confederation peers 65002
 neighbor 197.58.27.3 remote-as 65001
 neighbor 197.58.27.3 update-source Loopback0
 neighbor 197.58.27.3 activate
 neighbor 10.1.1.14 remote-as 65002
 neighbor 10.1.1.14 activate
 !
 address-family ipv4 vrf EuroBank
 no auto-summary
 no synchronization
 exit-address-family
 !
 address-family vpnv4
 neighbor 197.58.27.3 activate
 neighbor 197.58.27.3 send-community extended
 neighbor 10.1.1.14 activate
 neighbor 10.1.1.14 send-community extended
 exit-address-family





hostname San Jose
!
ip vrf EuroBank
 rd 1:27
 route-target export 100:27
 route-target import 100:27
!
interface loopback0
 ip address 194.17.1.1 255.255.255.255
!
router bgp 65002
 no synchronization
 no bgp default ipv4-unicast
 bgp confederation identifier 100
 bgp confederation peers 65001
 neighbor 194.17.1.2 remote-as 65002
 neighbor 194.17.1.2 update-source Loopback0
 neighbor 194.17.1.2 activate
 neighbor 10.1.1.13 remote-as 65001
 neighbor 10.1.1.13 activate
 !
 address-family ipv4 vrf EuroBank
 no auto-summary
 no synchronization
 exit-address-family
 !
 address-family vpnv4
 neighbor 194.17.1.2 activate
 neighbor 194.17.1.2 send-community extended
 neighbor 10.1.1.13 activate
 neighbor 10.1.1.13 send-community extended
 exit-address-family



hostname San Francisco
!
ip vrf EuroBank
 rd 1:27
 route-target export 100:27
 route-target import 100:27
!
interface loopback0
 ip address 194.17.1.2 255.255.255.255
!
router bgp 65002
 no bgp default ipv4-unicast
 bgp confederation identifier 100
 redistribute connected
 neighbor 194.17.1.1 remote-as 65002
 neighbor 194.17.1.1 update-source Loopback0
 neighbor 194.17.1.1 activate
 !
 address-family ipv4 vrf EuroBank
 redistribute connected
 no auto-summary
 no synchronization
 exit-address-family
 !
 address-family vpnv4
 neighbor 194.17.1.1 activate
 neighbor 194.17.1.1 send-community extended
 exit-address-family

BGP Confederations: Single IGP Environment

When a single IGP process is run across the whole BGP confederation, normal iBGP rules apply, so no BGP attributes are changed, including the next-hop for each route that is exchanged across sub-AS boundaries. We have already discussed how an MPLS LSR assigns a label for each internal route that it learns through its IGP. This does not change within a BGP confederation environment, so a label should exist for every BGP next-hop, as assigned by each PE-router within the backbone. Packets destined for any customer VPN route can therefore be label-switched across the backbone.

An example of this type of connectivity can be seen in Figure 12-14. This example also shows the advertisement of a VPN route and the relevant label distribution to obtain connectivity.

Figure 12-14 shows that each sub-AS runs the same IGP process as all other sub-ASs. eBGP is used between sub-ASs, but normal iBGP rules apply across the sub-AS boundaries. This means that the next-hop of any route is not changed and must be reachable by each sub-AS. In the case of MPLS, a label must exist for the BGP next-hop so that packets can be label-switched to the egress LSR for the external destination.

In our example, the Confed.Com San Francisco PE-router receives an update for 195.12.2.0/24 from the EuroBank VPN customer. This update is populated into the EuroBank VRF and is advertised using MP-iBGP to the Confed.Com San Jose PE-router with a next-hop of 194.17.1.2/32 and a VRF label of 11. This route is then advertised across the confederation sub-AS boundary to the Confed.Com London PE-router, with the next-hop and VRF label unchanged. The London router then advertises the route to the Reading PE-router, which installs it into the EuroBank VRF.

Figure 12-14. BGP Confederations: Single IGP Environment


When a packet is sent from one EuroBank site to the other, a label exists for the BGP next-hop of the route (194.17.1.2/32, in our example), so the ingress PE-router pushes the VRF label onto the packet, followed by the label for the BGP next-hop, and the packet is label-switched across the Confed.Com backbone to the San Francisco PE-router. The packet arrives at the San Francisco router with a one-level label stack (the top label will have been popped at the Santa Clara P-router); the remaining label has a value of 11, as originally assigned by the San Francisco router.
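
As a rough way to check the behavior described above, the following show commands could be run on the Reading PE-router. This is only a sketch: the Reading# prompt is illustrative, the prefix and next-hop are taken from the example in the text, and older IOS releases use the tag-switching keyword in place of mpls. No output is shown because it depends on the software release and on the labels actually allocated.

```
! VPN route as learned through MP-BGP; the next-hop should still be
! 194.17.1.2 and the VPN label should be the one set by San Francisco
Reading# show ip bgp vpnv4 vrf EuroBank 195.12.2.0
!
! Label stack that CEF will impose for traffic toward this prefix
Reading# show ip cef vrf EuroBank 195.12.2.0
!
! Label bound to the BGP next-hop, learned through the common IGP
Reading# show mpls forwarding-table 194.17.1.2 32
```

If the last command shows no outgoing label for the next-hop, traffic toward the VPN route cannot be label-switched to the egress PE-router.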

BGP Confederations: Multiple IGP Environment

When BGP confederations are deployed and each sub-AS uses its own IGP process, the next-hop for all BGP routes is still unchanged across sub-AS boundaries. This means that the BGP next-hop addresses must be reachable from within each sub-AS, or connectivity will be broken. You might think that by redistributing the BGP next-hop addresses between sub-AS IGP processes, connectivity could be restored. This is certainly the case in a non-MPLS environment. However, in the case of MPLS/VPN, we need to consider an example to understand whether this redistribution will allow connectivity between sub-ASs. Figure 12-15 shows a sample topology with a different IGP process in each sub-AS.

Figure 12-15. BGP Confederations: Multiple IGP Environment


In Figure 12-15, we can see that the San Jose PE-router learns routes from other PE-routers within its own sub-AS, AS65002, through the use of MP-iBGP. These routes are advertised across the sub-AS boundary MP-eBGP session to the London PE-router with the next-hop and label information unchanged.

The first thing to notice is that the London PE-router is incapable of advertising these routes to other PE-routers because the next-hop for the routes is inaccessible: It belongs to the IGP running within the AS65002 sub-AS. This is because of the requirement within BGP that the next-hop of the route be accessible. To rectify this situation, we could try to redistribute the BGP next-hop addresses for each route across the sub-AS boundaries, or we could configure static routes on the London PE-router. This would allow us to label-switch packets to the egress LSR.

Note

A careful reader might notice that BGP was not mentioned as an option for the distribution of next-hop addresses between sub-AS boundaries. This is because labels are not assigned to BGP routes. Therefore, if this protocol were used to advertise the next-hop addresses from AS65002 to AS65001, label switching would not work, because no label would be assigned to the next-hop addresses of VPN routes.

The problem with this approach is that multiple static host routes would be required, or redistribution between IGP processes would need to be configured. It is arguable that the reason to deploy confederations is to help scale the IGP and to hide instability in one POP from other POP sites. If this is the case, then redistribution is not a desirable function. On the other hand, the initial scope of confederations was not to hide the IGP information, but rather to reduce the number of iBGP sessions. If this is the requirement, then redistribution may be an option.
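
A minimal sketch of the static-route variant on the London PE-router might look like the following. The next-hop 10.1.1.14 is the San Jose address already used for the confederation eBGP session in Example 12-9; the use of OSPF process 1 as the IGP of sub-AS 65001 is purely an assumption for illustration, because the text does not name the IGP. The sketch also assumes that label distribution runs on the London-to-San Jose link so that a label is bound to the host route.

```
! On London: host route to the San Francisco loopback (BGP next-hop)
ip route 194.17.1.2 255.255.255.255 10.1.1.14
!
! Hypothetical IGP of sub-AS 65001: make the host route known to Reading
router ospf 1
 redistribute static subnets
```

One such host route is needed for every PE-router loopback in the remote sub-AS, which illustrates why this approach scales poorly.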

The next option that we have is to set the next-hop of all the VPN routes to the advertising sub-AS router. This would cause all routes to be advertised with a BGP next-hop that pointed to the San Jose router. Figure 12-16 illustrates this.

Figure 12-16. Resetting of Next-hop at Subconfederation Boundary


At first glance, this appears to solve the problem. However, if we consider how the Reading router will forward traffic destined for the EuroBank VPN via the San Francisco PE-router, we can see that a problem exists with this mechanism. When the Reading router received a packet, it would impose a two-level label stack. The first label would be the VPN label (label 11, in our case), and the second label would be for the egress PE-router. The egress PE-router is actually the BGP next-hop of the route, so the Reading router would apply a label that corresponded to the San Jose router.

The problem with this approach is that packets would be label-switched to the San Jose router, but that router would not be capable of forwarding the packets because it would not understand the second level label (the VPN label). Therefore, a mechanism is required to allow label exchange to occur between sub-AS boundaries, but without the requirement of injecting IGP information between the sub-ASs. This is achieved by allocating a new stack of labels at the sub-AS boundary when next-hop-self is configured. The act of resetting the next-hop causes the PE-router to assign a new label to represent the route and advertise it outside the region across the MP-eBGP connection between confederation peers. An illustration of this technique can be seen in Figure 12-17.
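
As a sketch of how this might be configured on the San Jose PE-router, the next-hop can be reset on the VPNv4 session toward the London confederation peer. The neighbor address 10.1.1.13 is the one used in Example 12-9; everything else in that example is assumed to stay unchanged.

```
router bgp 65002
 address-family vpnv4
  ! Advertise VPN-IPv4 routes to London with San Jose as the BGP next-hop,
  ! which causes San Jose to allocate a new label for each route
  neighbor 10.1.1.13 next-hop-self
 exit-address-family
```

With this in place, London only needs to reach San Jose itself rather than every PE-router loopback in sub-AS 65002.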

Note

The label can also be reset by the receiving sub-AS PE-router through use of the next-hop-self command. This has the advantage of not having to keep a /32 route for the confederation peer within the receiving sub-AS.
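
A sketch of this receiving-side variant, assuming the London and Reading addressing from Example 12-9, would reset the next-hop on London's VPNv4 session toward its own iBGP peer instead:

```
router bgp 65001
 address-family vpnv4
  ! Advertise VPN-IPv4 routes to Reading with London as the BGP next-hop,
  ! so Reading needs a label only for London's loopback
  neighbor 197.58.27.3 next-hop-self
 exit-address-family
```

In a larger sub-AS, the same statement would be repeated for each iBGP neighbor of the boundary router.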

In Figure 12-17, the San Jose PE-router has assigned a label of 12 to a VPN-IPv4 update that it has sent to the London PE-router. The San Jose router is capable of mapping this label to a two-level label stack to reach the San Francisco PE-router, where the VPN-IPv4 route originated. In this type of topology, if the next-hop of a VPN route is changed, one level of label is used across the sub-AS boundary; this label represents the VPN route as seen by the boundary router. The IGP label that got the packet to the boundary router will have been removed.

On the return path, the boundary router replaces this label with a two-level label stack that consists of the original VPN label, as assigned by the originating PE-router, and an IGP label to carry the packet to the originating PE-router. The subsequent forwarding sequence for the example shown in Figure 12-17 can be seen in Figure 12-18.

Figure 12-17. Label Exchange Using MP-eBGP and next-hop-self


Figure 12-18. Label Exchange Using MP-eBGP: Forwarding Example


Note

Only one label is used between the sub-AS peers shown in Figure 12-18 because the IGP label associated with the BGP next-hop address has a value of 3 (implicit-null) and therefore has been popped by the upstream hop, per the penultimate hop popping rules discussed in Chapter 2.