Chapter 1. No Straight Thing

Out of the crooked timber of humanity, no straight thing can ever be made.

?Immanuel Kant

In late 1996 there were approximately 14,000,000 computers connected to the Internet. Nearly all of them relied on the Transmission Control Protocol (TCP), one of the fundamental rule sets underlying communication between computers, and the one used for most common services on the Internet. And although it was known to have security weaknesses, the protocol had been doing its work quietly for nearly two decades without a major attack against it.

But on September 1 of that year, the online magazine Phrack published the source code for a network attack tool that exploited the trusting way the protocol handled connection requests (see the sidebar A Fractured Dialogue). Suddenly, the majority of those 14,000,000 computers were now vulnerable to being taken offline?in some cases, crashed?at the whim of any malcontent capable of compiling the attack program.

A Fractured Dialogue

What happens when you call someone on the phone and they hang up before you do?and you decide not to hang up yourself? Until a few years ago (in the U.S., at least), it was possible to tie up the other person's telephone line for a long time this way.

Today we might call this trick a denial of service attack. It's an example of what can happen when one party to a conversation decides not to play by the rules. In the network world, a set of such rules is called a protocol. And the network attack known as a TCP SYN flood is an example of what can happen when an attacker controlling one side of a computer dialogue deliberately violates the protocol.

The Transmission Control Protocol (TCP) is used many billions of times a day on the Internet. When email is exchanged, for example, or when someone visits a web site, the dialogue between the sending and receiving computers is conducted according to these rules. Suppose that computer A wants to initiate a connection with computer B. Computer A offers to "synchronize" with computer B by sending a set of ones and zeros that fit a special pattern. One feature of this pattern is that a particular bit (the SYN flag) is set. Computer B agrees to the exchange by replying in an equally specific bit pattern, setting both the SYN flag and the ACK ("acknowledge") flag. When computer A confirms the connection by replying with its own ACK, the TCP session is open, and the email or other information begins to flow. (Figure 1-1 shows this exchange.)

As early as the mid-1980s, researchers realized that if the initiating computer never completed the connection by sending that final acknowledgment, the second computer would be in a situation similar to that of the hapless telephone user whose caller never hung up. To be sure, in each case the computer programs implementing the dialogue can break the connection after a suitable period of time, freeing up the telephone line or network connection. But suppose that an attacker writes software capable of sending dozens or hundreds of false connections requests per second. Wouldn't the receiving computer be overwhelmed, keeping track of all those half-open connections? That turns out to be the foundation for a TCP SYN flood attack; and in 1996, it was deadly.[1]

[1] The version of the attack code that posed the biggest problem had an additional "feature": it produced "SYN packets" that included false sender addresses, making it much harder for the receiving computers to deal with the attack without shutting out legitimate connection requests.

It was a time when new vulnerabilities were being disclosed daily, and the article at first went unnoticed by most security professionals. It was, however, read carefully in some quarters. Within days, an ISP in New York City named Panix was repeatedly savaged using the technique. Day after day, bombarded by tens of thousands of false connection requests?known as a SYN flood , after the protocol element that was misapplied?Panix was helpless to service its paying customers. The security community took notice and began to mobilize; but before experts could come up with an effective defense, the attacks spread. The Internet Chess Club was clobbered several times in September. Scattered attacks troubled several more sites, mostly media outlets, in October. In November, on election night, the New York Times web site was disabled, and the resulting publicity opened the floodgates. By the time an effective defense had been devised and widely deployed some weeks later, hundreds of sites around the world had been victimized. Tens of thousands more were affected, as experts and laypersons alike struggled to cope with the practical impact of this first widespread denial of service attack.

Figure 1-1. How a normal TCP network session works

Why are we telling this story? True, the attack makes for interesting reading. And true, the attackers deserve our contempt. But there are, sadly, many other Internet attacks that we might describe. Why this one?

It's partly that both of us were heavily involved in the worldwide response of the security community to the vulnerability and resulting attacks. Mark worked at Sun Microsystems then and was integrally involved in Sun's technical response to correcting their networking software. Ken worked the problem from the incident response side?he was chairman of FIRST (the international Forum of Incident Response and Security Teams) at the time.

More importantly, the TCP SYN flood attack exemplifies a multitude of different secure coding issues that we are deeply concerned about. As we wind our way through this book, we'll use this particular example to illustrate many points, ranging from the technical to the psychological and procedural.

We'll return to this story, too, when we discuss design flaws, errors in implementation, and various issues of secure program operation, because it so clearly represents failure during every phase of the software development lifecycle. If the architecture of the TCP protocol had been more defensively oriented in the first place,[2] the attack would never have been possible. If the request-processing code in the affected operating systems had been designed and implemented with multiple layers of defense, the attacks wouldn't have brought down the whole house of cards. If the software had been tested and deployed properly, someone would have noticed the problem before it affected thousands of Internet sites and cost millions of dollars in lost time, data, and opportunity. This "lifecycle" way of looking at security is one we'll come back to again and again throughout this book.

[2] We are not suggesting that the original architects of the protocol erred. Their architecture was so successful that the Internet it made possible outgrew the security provisions of that architecture.

We'll mention several other attacks in this book, too. But our focus won't be on the details of the attacks or the attackers. We are concerned mainly with why these attacks have been successful. We'll explore the vulnerabilities of the code and the reasons for those vulnerabilities. Then we'll turn the tables and make our best case for how to build secure applications from the inside out. We'll ask how we can do better at all stages.

More simply, we'll try to understand why good people write bad code. Smart, motivated people have been turning out new versions of the same vulnerabilities for decades. Can "no straight thing" ever be made?