A DoS attack prevents a legitimate user from accessing a resource, such as a web server. It is achieved through overuse of a shared resource, such as the following:
Overloading a line in a network
Exceeding the packet-forwarding capacity of a router
Exceeding the packet-processing capacity of a server
In general, DoS attacks are directed toward exhausting one of the following shared resources:
Bandwidth (for example, on a line)
CPU (on a network element)
Memory (on a network element)
Therefore, a DoS-resistant network must be designed such that none of the shared resources can be overloaded by a malicious user or a group of malicious users. In this section, we first discuss general DoS considerations and then various options for designing a DoS-resistant network. The main consideration here is how a dedicated PE for VPN traffic helps increase DoS resistance.
When large groups of legitimate users all decide to access a certain resource at the same time, and this resource has not been designed to handle that load, the result looks like a DoS attack. The correct term for this behavior, however, is a flash crowd. Technically, the effect on the resource is the same as that of a DoS attack. Flash crowds were observed, for example, on September 11, 2001, when many people tried to access news servers after the collapse of the World Trade Center in New York.
There is no 100 percent DoS resistance. Every solution has a maximum capacity. A network withstands a DoS attack if the attacker has less attack capacity than the network offers. A future DoS attack, however, might exceed the network capacity.
Most design principles for DoS resistance apply to any form of network and are not MPLS specific. These are in short:
Correct device positioning: Each device in a network must be able to process the maximum inbound load. On the edge of the network, routers must also be able to handle required features at full line speed, for example, ACLs, uRPF, NetFlow, and CAR.
Correct bandwidth planning: The lines in the network must be able to handle bursts of traffic. This can be solved in two ways: overprovisioning, where lines are designed such that they cannot be overloaded even under worst-case circumstances; and Quality of Service (QoS), where the bandwidth of a line might not be sufficient, but traffic is divided into classes and the important class is always serviced.
Service overprovisioning: Web services must be designed to worst-case assumptions. The content architecture with web caches, load balancers, and so on provides this overprovisioning.
Anti-DoS solutions: Every network should have an anti-DoS solution that allows detecting, analyzing, and mitigating DoS attacks. The Cisco Guard, for example, is a mitigation device. Operational procedures to handle DoS attacks are required as well.
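The QoS option under bandwidth planning can be illustrated with a minimal Python sketch of strict-priority scheduling. The class names, the per-interval line capacity, and the function `service_interval` are assumptions made for illustration; real routers use more elaborate schedulers than this.

```python
def service_interval(offered, capacity, priority_order):
    """Serve traffic classes in strict priority order for one interval.

    offered: dict mapping class name -> packets offered in this interval
    capacity: packets the line can transmit in this interval
    priority_order: class names, most important first
    Returns a dict mapping class name -> packets actually sent.
    """
    sent = {}
    remaining = capacity
    for cls in priority_order:
        # The current class takes what it needs from whatever is left.
        sent[cls] = min(offered.get(cls, 0), remaining)
        remaining -= sent[cls]
    return sent


# Hypothetical numbers: the line carries 1000 packets per interval and the
# best-effort class is flooded.
result = service_interval(
    offered={"important": 300, "best_effort": 5000},
    capacity=1000,
    priority_order=["important", "best_effort"],
)
print(result)
```

In this sketch the line is heavily oversubscribed, yet the important class is fully serviced and only best-effort traffic is dropped.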
For MPLS networks, all these points also apply. There are, however, some MPLS-specific considerations: the PE routers require special attention when designing a DoS-resistant MPLS core.
Figure 4-16 describes the potential problem: A PE router typically holds connections to several VPNs. If a DoS attack is directed at one of those VPNs, and if it has a higher packet-per-second (pps) rate than the PE router can handle, other connected VPNs might also suffer performance degradation up to complete loss of connectivity.
DoS attacks are usually perceived as coming from the Internet; however, they can also come from connected VPNs. The difference is that the bandwidth from the Internet is usually much higher than from a VPN site and that a VPN site can be shut down if it attacks the core. Consider also worms that might originate from a VPN and have a similar effect to a DoS attack.
Most current PE routers in the industry use shared resources for different VRFs, which means that this problem affects most MPLS VPN networks. Many marketing slides suggest that a given router is not susceptible to this problem. The reader is encouraged to verify such claims in a lab setting. The Cisco CRS-1 offers a new operating system (IOS-XR) that does provide true memory and CPU protection for different VRFs. On such a router, only the VPN under direct attack would be affected, but not other VPNs on the same PE.
Because the vast majority of routers deployed today as PE routers do not offer any memory and CPU protection, the problem of the shared PE can only be solved by using a different design. Since DoS attacks from the Internet are likely to have a greater impact than attacks from a connected VPN, we propose to keep those two security levels strictly separate on the network. Figure 4-17 shows the basic idea: we split PE routers into PE routers that only hold VPN connections and PE routers that hold Internet connections. This way, if an attack is coming from the Internet, the VPN PE will not be affected.
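The difference between a shared PE and split PEs can be made concrete with a small Python model. It is only a sketch: the forwarding capacity, the traffic rates, and the assumption that an overloaded router drops packets indiscriminately across all flows are hypothetical and chosen for illustration.

```python
def delivered_fraction(offered_pps, capacity_pps):
    """Fraction of each flow's packets forwarded by a router.

    Assumes (for illustration only) that an overloaded router drops packets
    indiscriminately, so every flow loses the same share.
    """
    total = sum(offered_pps.values())
    if total <= capacity_pps:
        return 1.0
    return capacity_pps / total


PE_CAPACITY = 1_000_000  # hypothetical forwarding capacity in pps

# Shared PE: VPN traffic and an Internet-borne attack hit the same router.
shared = delivered_fraction(
    {"vpn_a": 50_000, "vpn_b": 50_000, "internet_attack": 3_900_000},
    PE_CAPACITY,
)

# Split PEs: the attack only reaches the Internet PE.
vpn_pe = delivered_fraction({"vpn_a": 50_000, "vpn_b": 50_000}, PE_CAPACITY)
internet_pe = delivered_fraction({"internet_attack": 3_900_000}, PE_CAPACITY)

print(shared, vpn_pe, internet_pe)
```

Under these assumptions the shared PE forwards only a quarter of the VPN packets during the attack, whereas the dedicated VPN PE forwards all of them and only the Internet PE is degraded.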
There may be customers who only have a VPN connection, which will be connected to the VPN PE; a customer who only has an Internet service will be connected to the Internet PE. Figure 4-17 shows a customer who requires both a VPN and an Internet connection. This setup maintains the separation between the CE routers and separates routing internally between VPN and Internet traffic.
In such a setup, the Internet connection (upper part of the figure) can be treated as any other Internet service, for example with a leased line to an ISP. A firewall would typically be deployed, plus the other usual edge security features such as intrusion detection. If an attack hits this customer, in the worst case only the Internet part will become unavailable: the VPN part will remain unaffected. This offers very strong DoS resistance. The fact that the Internet part is still exposed is no different from other network technologies. Customers with leased lines to ISPs may also be affected by DoS attacks, so this is not an MPLS-specific issue.
Routing must be set up on the MPLS core such that traffic from the Internet never passes through the VPN PE. In fact, to prevent a direct attack, the VPN PE must not be accessible from the Internet at all! (Reachability from VPNs should also be limited to a minimum, probably only routing.) If the VPN PEs are not hidden by architecture, infrastructure ACLs are a must to prevent packets from reaching VPN PEs.
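The logic of an infrastructure ACL can be sketched in a few lines of Python. The address plan below uses documentation prefixes and is purely hypothetical, as is the function name `permit_from_internet`; a real deployment would express the same rule in the router's ACL syntax on every Internet-facing interface.

```python
import ipaddress

# Hypothetical address plan: the block from which VPN PE addresses are taken.
VPN_PE_INFRASTRUCTURE = [ipaddress.ip_network("192.0.2.0/24")]


def permit_from_internet(dst):
    """Return False if a packet arriving from the Internet targets a VPN PE."""
    addr = ipaddress.ip_address(dst)
    return not any(addr in net for net in VPN_PE_INFRASTRUCTURE)


print(permit_from_internet("192.0.2.10"))    # destination is a VPN PE address
print(permit_from_internet("198.51.100.7"))  # destination is elsewhere
```

The first packet is denied and the second is permitted, so traffic from the Internet can transit the core but cannot be addressed to the VPN PEs themselves.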
The idea behind this separation is similar to the zone segmentation on firewalls: the assumption is that within one security zone attacks are less frequent and that the more secure zone needs protection from the less secure zone. As mentioned above, though, DoS attacks may equally come from VPN sites, so this assumption is not very strict.
If redundancy is required, this setup would have to be duplicated according to normal network design rules. Redundancy is not shown in the examples but can be provided with all options.
Although full separation of Internet and VPN on both the service provider and customer side provides optimal protection against DoS attacks, this is an expensive solution: both the service provider and the customer need to provision two routers, and two access lines are required. There are ways to reduce the costs for such a separation, but the resistance to DoS attacks also goes down.
Referring to Figure 4-17, essentially any duplicated component of this setup can be reduced to a single component to bring down the overall cost of the solution: the PEs, the line between PEs and CEs, and the CEs. Note that all of those options provide the same functionality for the customers; however, the DoS resistance is different for each option.
The first alternative is described in Figure 4-18: the service provider may terminate both services on a single PE, whereas the enterprise separates the two services.
Since a potential DoS attack from the Internet would go over the shared PE, the VPNs connected to this PE might be affected as well. (Note that this might also be a different VPN customer.) Theoretically, this offers higher DoS resistance for the customer's VPN part because the customer's own VPN network is separated. In this case, a DoS attack against the Internet part of the customer would only affect the VPN part if it is strong enough to affect the PE router. Every other part of the setup is still separated.
The service provider will be able to offer this service at lower cost because only a single PE is required. On the customer side, there is still significant cost: the two access lines and the two CE routers.
The customer cannot distinguish by technical means whether the two access lines are terminated on a single PE or two PEs.
Since one of the highest cost factors is the access line, customers have looked for solutions that allow providing both services, Internet and VPN, over a single access line.
There are various ways to share an access line. The common criteria for all of those are:
The access line technology must allow for separation between the two services; this could be done by using two Frame Relay circuits, VLANs on an Ethernet connection, or any tunneling technology such as GRE, IPsec, or L2TP. Frame Relay encapsulation on top of Packet over SONET (POS) lines is a popular solution.
The routers on both sides must be able to configure subinterfaces to terminate those circuits.
The routers on both sides must be able to keep traffic separate. This could be done using Layer 2 forwarding (for example, Frame Relay switching).
In the first solution for shared access lines, shown in Figure 4-19, the access line is segmented with Frame Relay logical links. Packets of the two services, Internet and VPN, travel in different Frame Relay circuits.
On the PE side, the two circuits are terminated in the respective VRFs, providing full separation. On the CE side, the VPN circuit is terminated as a normal interface and thus is subject to normal routing. The routing table on this CE, therefore, holds only VPN routes. The Internet circuit is switched with Frame Relay switching, on Layer 2, without touching the routing table. It is terminated on Layer 3 on a second CE router, which acts as the Internet CE. From there, the traffic enters the Internet edge zone with firewall and other security services.
The separation between Internet and VRF is equally good as in the previous examples: the two traffic streams are perfectly separated and cannot mix anywhere. But the lower cost of this model comes at a price. Now all three elements (PE, CE, and access line) are shared, and the VPN part potentially might be affected by a DoS attack on the Internet part.
However, this risk is probably manageable: the overload on the line can be prevented by configuring correct Frame Relay parameters. The bandwidth of the Internet circuit must be configured such that there is always sufficient bandwidth on the VPN circuit. This also helps in securing the CE against overload: both CEs have a maximum forwarding capacity. The circuit bandwidths must be configured such that even if both are fully loaded from the PE side with minimum-size packets, both CEs can still fulfill their tasks. This should be tried in a lab environment.
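The dimensioning rule above is a simple piece of arithmetic, sketched here in Python. The 64-byte minimum packet size, the circuit bandwidths, and the CE forwarding capacities are assumptions for illustration, and the calculation ignores Layer 2 framing overhead; it does not replace the lab test.

```python
def worst_case_pps(bandwidth_bps, min_packet_bytes=64):
    """Packets per second when a circuit is full of minimum-size packets."""
    return bandwidth_bps / (min_packet_bytes * 8)


def ce_survives(circuit_bandwidths_bps, ce_capacity_pps, min_packet_bytes=64):
    """True if the CE can forward all circuits that traverse it at worst case."""
    load = sum(worst_case_pps(b, min_packet_bytes) for b in circuit_bandwidths_bps)
    return load <= ce_capacity_pps


# Hypothetical circuits: 1.5 Mbps for the VPN, 0.5 Mbps for the Internet.
circuits = [1_500_000, 500_000]

print(sum(worst_case_pps(b) for b in circuits))   # worst-case load in pps
print(ce_survives(circuits, ce_capacity_pps=10_000))
print(ce_survives(circuits, ce_capacity_pps=3_000))
```

With these numbers the two circuits together can deliver about 3900 pps of minimum-size packets, so a CE that forwards 10,000 pps is safe while one that forwards only 3000 pps is not. For each CE, only the circuits that actually pass through it need to be included.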
Thus if the bandwidths are configured correctly, the only bottleneck in this scenario is on the PE router. If the bottleneck is just the access circuit, packets will be dropped on that circuit, but other services will not be affected. This is the desired state. The only problem might be if the attack is so strong that it exceeds the forwarding capacity of the PE, in which case the VPN might also be affected by the DoS attack.
Instead of doing Frame Relay switching on the CE, an alternative would be to provision VRFs on the CE. This concept is illustrated in Figure 4-20. The line is segmented as before in some form (in the example, Frame Relay), but now the Internet circuit is terminated in a separate VRF on the CE. From there, another connection leads to the Internet CE, or alternatively, directly to the firewall because the Internet CE is not really required in this setup.
The concept of using VRFs on a router without any MPLS-based forwarding is called VRF Lite (or Multi-VRF Support). The VRF is exactly the same as a VRF on a PE router, except that here no MPLS switching is done. An interface can only be in the global table or in a single VRF, and the VRFs are strictly separated from each other and from the global table, just as on a PE. With these properties, VRF Lite offers a concept of virtualized routers.
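The following Python sketch models the properties just listed: every interface belongs to exactly one table, and a lookup only ever consults the table of the ingress interface. The class `Router`, the interface names, and the prefixes are hypothetical; the model illustrates the idea of virtualized routers and is not a description of how any router implements VRFs.

```python
import ipaddress


class Router:
    def __init__(self):
        self.interface_table = {}      # interface name -> table name
        self.tables = {"global": []}   # table name -> [(network, egress)]

    def add_vrf(self, name):
        self.tables[name] = []

    def assign(self, interface, table="global"):
        # A second assignment moves the interface; it is never in two tables.
        if table not in self.tables:
            raise KeyError(f"unknown table {table!r}")
        self.interface_table[interface] = table

    def add_route(self, table, prefix, egress):
        self.tables[table].append((ipaddress.ip_network(prefix), egress))

    def forward(self, ingress, dst):
        """Longest-prefix match in the ingress interface's table only."""
        table = self.interface_table[ingress]
        addr = ipaddress.ip_address(dst)
        matches = [(net, out) for net, out in self.tables[table] if addr in net]
        if not matches:
            return None  # no route in this table: the packet is dropped
        return max(matches, key=lambda m: m[0].prefixlen)[1]


ce = Router()
ce.add_vrf("internet")
ce.assign("fr_vpn")                       # VPN circuit, global table
ce.assign("lan")                          # customer VPN LAN, global table
ce.assign("fr_internet", "internet")      # Internet circuit, Internet VRF
ce.assign("to_firewall", "internet")
ce.add_route("global", "10.0.0.0/8", "lan")
ce.add_route("internet", "0.0.0.0/0", "to_firewall")

print(ce.forward("fr_vpn", "10.1.1.1"))
print(ce.forward("fr_internet", "10.1.1.1"))
```

A packet for a VPN address that arrives on the Internet circuit is looked up in the Internet VRF and leaves toward the firewall; it cannot reach the VPN LAN, because that route exists only in the global table.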
In Figure 4-20, the Internet VRF and the global table are strictly separated. Packets from one side can never get to the other side, except if this is explicitly configured. The separation this solution offers is therefore very good, exactly as in the other examples. As in the previous example, this solution is more exposed to a DoS attack because several components are shared; but here as well correct design can overcome this problem: if the Frame Relay links toward the CE are dimensioned such that the CE will always be able to handle both ingress lines full with minimum packet size, both the access line and the CE cannot be affected by a DoS attack. Correct design is therefore of paramount importance.
An alternative way to separate the two contexts on the CE has been proposed in the past: by configuring policy-based routing (PBR), where the packets entering on the Internet context would be policy routed. This means that all packets coming in on the Internet link would be forwarded to the defined egress interface without a forwarding table lookup. This also provides the required separation, in principle. However, PBR has an important property: if the egress interface is down, PBR is essentially disabled, and packets are again forwarded according to the forwarding table. This, of course, would break the separation.
Never use PBR as a security tool (for example, to separate different security levels), because under some circumstances PBR falls back to the forwarding table. Although in later IOS releases there is a switch to disable this behavior, separation through VRFs is always more strict.
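The fallback behavior of PBR, and the reason a VRF does not share it, can be contrasted in a deliberately simplified Python model. The function names, interface names, and the single-route VRF are hypothetical and only capture the decision logic described above.

```python
def pbr_forward(policy_egress, interface_up, forwarding_table_egress):
    """Policy routing applies only while the configured egress interface is up.

    If that interface is down, the packet is forwarded according to the
    normal forwarding table instead, which is the fallback described above.
    """
    if interface_up.get(policy_egress, False):
        return policy_egress
    return forwarding_table_egress


def vrf_forward(vrf_egress, interface_up):
    """A VRF with a single route: if its egress is down, the packet is dropped."""
    if interface_up.get(vrf_egress, False):
        return vrf_egress
    return None


up = {"to_firewall": True, "lan": True}
print(pbr_forward("to_firewall", up, "lan"))  # policy route is used

up["to_firewall"] = False                     # the egress interface fails
print(pbr_forward("to_firewall", up, "lan"))  # falls back to the forwarding table
print(vrf_forward("to_firewall", up))         # VRF: dropped, not leaked
```

Once the egress interface fails, the policy-routed Internet packet is handed to the forwarding table and can leave toward the VPN LAN, whereas the VRF model simply drops it.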
Among the various options to separate routing on the CE, the VRF Lite mechanism is the best one: both PBR and Frame Relay switching produce a higher CPU load, whereas VRF Lite is a CEF-based feature without performance impact. The separation on VRF Lite is the same as in MPLS: very strict.
The best way to separate routing contexts on any router (specifically CEs) is using the VRF Lite concept.
Figure 4-21 gives an example of how a complete VPN can be implemented on an MPLS VPN core, applying the principles just discussed.
The principal VPN topology for Internet access is hub-and-spoke: all spoke sites send their Internet traffic through the hub site. VPN internal traffic can go directly between the spokes. This may seem like a restriction, but most VPN customers prefer this anyway so that all Internet traffic of the VPN goes through a central point of security control (firewalls, and so on). The hub site connects to the Internet service of the MPLS provider.
In this configuration, packets from the Internet can only be sent to the Internet part of the VPN customer: the VPN part is unreachable from the Internet (which must be enforced by the core design). This way an attacker from the Internet can only attack the Internet PE, CE, and the Internet servers on the VPN, not any of the VPN parts. Therefore, in the worst case, the Internet connection of this VPN might suffer degradation or loss of service, but the VPN part will be unaffected.
There are many ways to combine these principles: there can be two redundant hub sites laid out in the same way; there could be Internet connectivity to selected spoke sites; there could be Internet provided to all VPN sites. The consequences for each scenario have been described previously.
All the discussed options provide perfect separation (apart from the PBR solution). The discussion here relates exclusively to DoS resistance.
The remaining key question for an MPLS core design is how much difference it makes to VPN availability whether the core uses shared versus separate PEs for Internet and VPN. This question cannot be answered generally. Both solutions have their applicability and justification.
Because a shared-PE scenario is cheaper to provide, there is probably a large number of customers who would prefer such a service, simply because they do not feel the need for very high DoS resistance. This is a perfectly valid argument.
A VPN customer who has high exposure on the Internet, such as an online book shop or a bookmaker, might take a different view, however. Customers like these are under the permanent threat of DoS attacks, and if they have mission-critical applications on their VPN, they might not want to endanger the VPN with attacks from the Internet.
From the service provider perspective, a third point of view might be taken: if the security operation of the core network is very fast and efficient and if anti-DoS measures are deployed on the core, the service provider might have an extremely short response time to a DoS attack. In this case, one option is to deploy the cheaper variant with shared PEs and monitor the network very closely: if a DoS attack is noticed, the service provider can then divert the attack to a packet scrubber like the Cisco Guard, thereby "cleaning" the traffic and thus mitigating the DoS attack. The question now is, "Can a DoS attack be mitigated fast enough to not break the SLA with the customer?" If this is the case, the cheaper variant can be deployed even for customers with high security requirements.
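Whether mitigation is "fast enough" can be estimated with a rough calculation, sketched here in Python. It assumes that the SLA is expressed as a monthly availability percentage and that the VPN is unavailable for the whole response time of each attack; both assumptions, and all numbers, are hypothetical and would have to be replaced by the terms of the actual contract.

```python
def downtime_budget_minutes(availability, days=30):
    """Minutes of outage per period that an availability SLA tolerates."""
    return (1 - availability) * days * 24 * 60


def sla_holds(availability, attacks, minutes_per_attack, days=30):
    """True if the total mitigation time stays within the downtime budget."""
    return attacks * minutes_per_attack <= downtime_budget_minutes(availability, days)


# Hypothetical SLA of 99.9 percent availability over a 30-day month.
print(downtime_budget_minutes(0.999))
print(sla_holds(0.999, attacks=3, minutes_per_attack=10))
print(sla_holds(0.999, attacks=5, minutes_per_attack=10))
```

Under these assumptions the budget is roughly 43 minutes per month, so three attacks mitigated within 10 minutes each stay inside the SLA, while five do not.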
Finally, high availability requires a number of different technologies in parallel: Quality of Service, fast rerouting and convergence, device and line redundancy, management, and finally security. Therefore, a complete architecture must take many more considerations into account to achieve a highly available solution.
Following these recommendations, a standard RFC 2547 network can be secured and VPNs can be made highly available, even with Internet provided on the same core. The next challenge for the service provider is how to interconnect securely to another service provider network, using one of the Inter-AS methods described in RFC 2547bis.
All the previously discussed problems with DoS attacks potentially affecting a PE router have a common root cause: Most PE platforms in the industry have operating systems with shared memory and CPU. Therefore, there is no 100 percent resource separation between the VPNs as far as DoS resistance is concerned: A single VPN can potentially consume too much shared memory or CPU, which potentially affects other VPNs on that PE. This can be partially secured, as explained in Chapter 5, "Security Recommendations," but the root problem persists.
There is no danger of intrusions: The VPNs remain separate. The risk is exclusively related to availability of resources; for example, if a PE is under a DoS attack, other VPNs on that same platform might also lose their VPN connection.
The Cisco CRS-1 is a notable exception: The architecture of the new operating system IOS-XR allows full memory and CPU separation between virtual routers. Therefore, even if a process in a given VPN is not fully secured, the operating system limits the central resources that this process can consume. This means that there is no additional risk of DoS for a given VPN on a Cisco CRS-1, and it is not necessary to split the PEs.
The resource separation is a feature of the new operating system IOS-XR, and will therefore also be available on other platforms that might run IOS-XR in the future.