1.2 On the Internet and Internets

A word on "the Internet," and on "internets" in general, is in order. In print, the difference between the two seems slight: one is always capitalized, one isn't. The distinction between their meanings, however, is significant. The Internet, with a capital "I," refers to the network that began its life as the ARPAnet and continues today as, roughly, the confederation of all TCP/IP networks directly or indirectly connected to commercial U.S. backbones. Seen up close, it's actually quite a few different networks?commercial TCP/IP backbones, corporate and U.S. government TCP/IP networks, and TCP/IP networks in other countries?interconnected by high-speed digital circuits.

A lowercase internet, on the other hand, is simply any network made up of multiple smaller networks using the same internetworking protocols. An internet (little "i") isn't necessarily connected to the Internet (big "I"), nor does it necessarily use TCP/IP as its internetworking protocol. There are isolated corporate internets, for example.

An intranet is really just a TCP/IP-based "little i" internet, used to emphasize the use of technologies developed and introduced on the Internet on a company's internal corporate network. An "extranet," on the other hand, is a TCP/IP-based internet that connects partner companies, or a company to its distributors, suppliers, and customers.

1.2.1 The History of the Domain Name System

Through the 1970s, the ARPAnet was a small, friendly community of a few hundred hosts. A single file, HOSTS.TXT , contained a name-to-address mapping for every host connected to the ARPAnet. The familiar Unix host table, /etc/hosts, was compiled from HOSTS.TXT (mostly by deleting fields Unix didn't use).

HOSTS.TXT was maintained by SRI's Network Information Center (dubbed "the NIC") and distributed from a single host, SRI-NIC.^[1] ARPAnet administrators typically emailed their changes to the NIC and periodically ftped to SRI-NIC and grabbed the current HOSTS.TXT file. Their changes were compiled into a new HOSTS.TXT file once or twice a week. As the ARPAnet grew, however, this scheme became unworkable. The size of HOSTS.TXT grew in proportion to the growth in the number of ARPAnet hosts. Moreover, the traffic generated by the update process increased even faster: every additional host meant not only another line in HOSTS.TXT, but potentially another host updating from SRI-NIC.

^[1] SRI is the former Stanford Research Institute in Menlo Park, California. SRI conducts research into many different areas, including computer networking.

When the ARPAnet moved to TCP/IP, the population of the network exploded. Now there was a host of problems with HOSTS.TXT (no pun intended):

Traffic and load: The toll on SRI-NIC, in terms of the network traffic and processor load involved in distributing the file, was becoming unbearable.
Name collisions: No two hosts in HOSTS.TXT could have the same name. However, while the NIC could assign addresses in a way that guaranteed uniqueness, it had no authority over hostnames. There was nothing to prevent someone from adding a host with a conflicting name and breaking the whole scheme. Adding a host with the same name as a major mail hub, for example, could disrupt mail service to much of the ARPAnet.
Consistency: Maintaining consistency of the file across an expanding network became harder and harder. By the time a new HOSTS.TXT file could reach the farthest shores of the enlarged ARPAnet, a host across the network may have changed addresses or a new host may have sprung up.

The essential problem was that the HOSTS.TXT mechanism didn't scale well. Ironically, the success of the ARPAnet as an experiment led to the failure and obsolescence of HOSTS.TXT.

The ARPAnet's governing bodies chartered an investigation into a successor for HOSTS.TXT. Their goal was to create a system that solved the problems inherent in a unified host table system. The new system should allow local administration of data, yet make that data globally available. The decentralization of administration would eliminate the single-host bottleneck and relieve the traffic problem. And local management would make the task of keeping data up-to-date much easier. The new system should use a hierarchical namespace to name hosts. This would ensure the uniqueness of names.

Paul Mockapetris, then of USC's Information Sciences Institute, was responsible for designing the architecture of the new system. In 1984, he released RFCs^[2] 882 and 883, which describe the Domain Name System. These RFCs were superseded by RFCs 1034 and 1035, the current specifications of the Domain Name System. RFCs 1034 and 1035 have since been augmented by many other RFCs, which describe potential DNS security problems, implementation problems, administrative gotchas, mechanisms for dynamically updating name servers and for securing zone data, and more.

^[2] RFCs are Request for Comments documents, part of the relatively informal procedure for introducing new technology on the Internet. RFCs are usually freely distributed and contain fairly technical descriptions of the technology, often intended for implementers.