TCP/IP and the Internet


TCP/IP has become the protocol of choice on the Internet, the “network of networks” that evolved from ARPAnet, a packet-switching network that grew out of research the U.S. government’s Advanced Research Projects Agency (ARPA) initiated in the 1960s. ARPA later acquired a Defense prefix and became DARPA. Under the auspices of DARPA, the TCP/IP protocols emerged as a popular collection of protocols for internetworking, a term used to describe communication among networks.

TCP/IP has flourished for several reasons. A significant reason is that the protocol is open, which means the technical descriptions of the protocol appear in public documents, so anyone can implement TCP/IP on specific hardware and software.

Another, more important, reason for TCP/IP’s success is the availability of sample implementations. Rather than existing only as a description of network architecture and protocols on paper, each component of the TCP/IP protocol suite began life as a specification with a sample implementation.

Taking Stock of RFCs

The details of each TCP/IP protocol (including TCP and IP, as well as specific service protocols such as SMTP and FTP) are described in documents known as Requests for Comments (RFCs). These documents are freely distributed on the Internet. You can get RFCs from http://www.cis.ohio-state.edu/hypertext/information/rfc.html (click the Index link for a complete index of the RFCs, or search by keyword). Another good URL for RFCs is http://www.faqs.org/rfcs/.

In fact, the notation used to name Internet resources in a uniform manner is itself documented in an RFC. The notation, known as the Uniform Resource Locator (URL), is described in RFC 1738, “Uniform Resource Locators (URL),” written by, among others, T. Berners-Lee, the originator of the World Wide Web (WWW).

You can think of RFCs as the working papers of the Internet research-and-development community. All Internet standards are published as RFCs. However, many RFCs do not specify any standards; they are informational documents only.

The following are some RFCs you may find interesting:

  • RFC 768, “User Datagram Protocol (UDP)”

  • RFC 791, “Internet Protocol (IP)”

  • RFC 792, “Internet Control Message Protocol (ICMP)”

  • RFC 793, “Transmission Control Protocol (TCP)”

  • RFC 854, “TELNET Protocol Specification”

  • RFC 950, “Internet Standard Subnetting Procedure”

  • RFC 959, “File Transfer Protocol (FTP)”

  • RFC 1034, “Domain Names: Concepts and Facilities”

  • RFC 1058, “Routing Information Protocol (RIP)”

  • RFC 1112, “Host Extensions for IP Multicasting”

  • RFC 1155, “Structure and Identification of Management Information for TCP/IP-based Internets”

  • RFC 1157, “Simple Network Management Protocol (SNMP)”

  • RFC 1519, “Classless Inter-Domain Routing (CIDR) Assignment and Aggregation Strategy”

  • RFC 1661, “The Point-to-Point Protocol (PPP)”

  • RFC 1738, “Uniform Resource Locators (URL)”

  • RFC 1796, “Not All RFCs Are Standards”

  • RFC 1855, “Netiquette Guidelines”

  • RFC 1886, “DNS Extensions to Support IP Version 6”

  • RFC 1918, “Address Allocation for Private Internets”

  • RFC 1939, “Post Office Protocol, Version 3 (POP3)”

  • RFC 1945, “HyperText Transfer Protocol—HTTP/1.0”

  • RFC 2026, “The Internet Standards Process—Revision 3”

  • RFC 2028, “The Organizations Involved in the IETF Standards Process”

  • RFC 2045 through 2049, “Multipurpose Internet Mail Extensions (MIME)” (Parts One through Five)

  • RFC 2060, “Internet Message Access Protocol—Version 4rev1 (IMAP4)”

  • RFC 2131, “Dynamic Host Configuration Protocol (DHCP)”

  • RFC 2146, “U.S. Government Internet Domain Names”

  • RFC 2151, “A Primer on Internet and TCP/IP Tools and Utilities”

  • RFC 2305, “A Simple Mode of Facsimile Using Internet Mail”

  • RFC 2328, “Open Shortest Path First Routing (OSPF) Version 2”

  • RFC 2368, “The mailto URL scheme”

  • RFC 2373, “IP Version 6 Addressing Architecture”

  • RFC 2396, “Uniform Resource Identifiers (URI): Generic Syntax”

  • RFC 2460, “Internet Protocol, Version 6 (IPv6) Specification”

  • RFC 2535, “Domain Name System Security Extensions”

  • RFC 2616, “HyperText Transfer Protocol—HTTP/1.1”

  • RFC 2660, “The Secure HyperText Transfer Protocol”

  • RFC 2821, “Simple Mail Transfer Protocol (SMTP)”

  • RFC 2822, “Internet Message Format”

  • RFC 2853, “Generic Security Service API Version 2: Java Bindings”

  • RFC 2854, “The ‘text/html’ Media Type”

  • RFC 2865, “Remote Authentication Dial In User Service (RADIUS)”

  • RFC 2870, “Root Name Server Operational Requirements”

  • RFC 2871, “A Framework for Telephony Routing over IP”

  • RFC 2900, “Internet Official Protocol Standards”

  • RFC 2910, “Internet Printing Protocol/1.1: Encoding and Transport”

  • RFC 2911, “Internet Printing Protocol/1.1: Model and Semantics”

  • RFC 3013, “Recommended Internet Service Provider Security Services and Procedures”

  • RFC 3022, “Traditional IP Network Address Translator (Traditional NAT)”

  • RFC 3076, “Canonical XML Version 1.0”

  • RFC 3130, “Notes from the State-Of-The-Technology: DNSSEC”

  • RFC 3174, “US Secure Hash Algorithm 1 (SHA1)”

  • RFC 3196, “Internet Printing Protocol/1.1: Implementer’s Guide”

  • RFC 3275, “(Extensible Markup Language) XML-Signature Syntax and Processing”

  • RFC 3330, “Special-Use IPv4 Addresses”

  • RFC 3344, “IP Mobility Support for IPv4”

    Insider Insight 

    The RFCs continue to evolve as new technology and techniques emerge. If you work in networking, you should keep an eye on the RFCs to monitor emerging networking protocols. You can check up on the RFCs at http://www.faqs.org/rfcs/.

Understanding IP Addresses

When you have many computers on a network, you need a way to identify each one uniquely. In TCP/IP networking, the address of a computer is known as the IP address. Because TCP/IP deals with internetworking, the address is based on the concepts of a network address and a host address. You might think of the idea of a network address and a host address as having to provide two addresses to identify a computer uniquely:

  • Network address—Indicates the network on which the computer is located

  • Host address—Indicates a specific computer on that network

Class A addresses support 126 networks, each with up to about 16 million hosts. Although the class A network address has 7 bits, two values (0 and 127) have special meaning; therefore, only 1 through 126 are available as class A network addresses. Altogether, there can be approximately 2 billion class A hosts.

Class B addresses are for networks with up to 65,534 hosts. There can be at most 16,384 class B networks. All class B networks, taken together, can have approximately 1 billion hosts.

Class C addresses are meant for small organizations. Each class C address allows up to 254 hosts, and there can be approximately 2 million class C networks. Therefore, there can be at most approximately 500 million class C hosts. If you are in a small company, you probably have a class C address. Nowadays, it is customary to aggregate multiple class C addresses into a single block and use them for efficient routing.

Altogether, class A, B, and C networks can support at most approximately 3.7 billion hosts (about 2.1 billion, 1.07 billion, and 0.53 billion, respectively).
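The host counts quoted above can be checked with a few lines of arithmetic. The following Python sketch assumes the usual convention that the all-zeros and all-ones host values are reserved in every network; the names CLASSES, hosts_per_network, and total_hosts are made up for this illustration. The exact counts come out slightly higher than the rounded figures in the text.

```python
# Classful address-space arithmetic, based on the figures quoted in the text.
# Each entry maps a class to (number of usable networks, number of host bits).
CLASSES = {
    "A": (126, 24),       # 7-bit network field, minus the reserved 0 and 127
    "B": (2 ** 14, 16),   # 14-bit network field: 16,384 networks
    "C": (2 ** 21, 8),    # 21-bit network field: 2,097,152 networks
}


def hosts_per_network(host_bits):
    """Usable hosts per network; all-zeros and all-ones are reserved."""
    return 2 ** host_bits - 2


def total_hosts(cls):
    """Maximum number of hosts across every network of the given class."""
    networks, host_bits = CLASSES[cls]
    return networks * hosts_per_network(host_bits)


if __name__ == "__main__":
    for cls, (networks, host_bits) in CLASSES.items():
        print(cls, networks, hosts_per_network(host_bits), total_hosts(cls))
    print("all classes:", sum(total_hosts(cls) for cls in CLASSES))
```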

You can tell the class of an IP address by the first number in the dotted-decimal notation, as follows:

  • Class A addresses: 1.xxx.xxx.xxx through 126.xxx.xxx.xxx

  • Class B addresses: 128.xxx.xxx.xxx through 191.xxx.xxx.xxx

  • Class C addresses: 192.xxx.xxx.xxx through 223.xxx.xxx.xxx
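The first-number rule above translates directly into code. This is a minimal Python sketch; the function name address_class is hypothetical, and it returns None for any first number outside the three ranges listed (such as 0, 127, or 224 and above).

```python
def address_class(ip):
    """Return 'A', 'B', or 'C' based on the first number of a dotted-decimal address."""
    first = int(ip.split(".")[0])
    if 1 <= first <= 126:
        return "A"
    if 128 <= first <= 191:
        return "B"
    if 192 <= first <= 223:
        return "C"
    return None  # outside the three ranges listed in the text


if __name__ == "__main__":
    for ip in ("10.1.2.3", "128.18.4.5", "206.197.168.200", "127.0.0.1"):
        print(ip, address_class(ip))
```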

Even within the five address classes, the following IP addresses have special meaning:

  • An address with all zeros in its network portion indicates the local network—the network where the data packet with this IP address originated. Thus, the address 0.0.0.200 means host number 200 on this class C network.

  • The class A address 127.xxx.xxx.xxx is used for loopback—communications within the same host. Conventionally, 127.0.0.1 is used as the loopback address. Processes that need to communicate through TCP with other processes on the same host use the loopback address to avoid having to send packets out on the network.

  • Turning on all the bits in any part of the address indicates a broadcast message. The address 128.18.255.255, for example, means all hosts on the class B network 128.18. The address 255.255.255.255 is known as a limited broadcast; all workstations on the current network segment will receive the packet.
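Python's standard ipaddress module knows about some of these special addresses, which makes it a convenient way to experiment with them. The short sketch below only checks the loopback and broadcast examples from the list above; writing the class B network as 128.18.0.0/16 is an assumption about how that network would be expressed in prefix notation.

```python
import ipaddress

# The conventional loopback address.
loopback = ipaddress.ip_address("127.0.0.1")
print(loopback.is_loopback)

# The class B network 128.18, written with a 16-bit network prefix.
class_b = ipaddress.ip_network("128.18.0.0/16")

# Turning on all host bits gives the broadcast address for that network.
print(class_b.broadcast_address)
```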

Getting IP Addresses for Your Network

If you are setting up an independent network of your own that will be connected to the Internet, you need unique IP addresses for your network. You would typically get a range of IP addresses for your network from the ISP who connects your network to the Internet. You can get the domain name from one of the Internet domain name registration services. For example, for the .com domain, you can obtain domain names from VeriSign located on the Web at http://www.networksolutions.com/. To learn more about domain name and IP address services, point your Web browser to the InterNIC website at http://www.internic.net/.

ISPs typically get their IP address allocation in large blocks from regional Internet registries such as ARIN (American Registry for Internet Numbers, http://www.arin.net/) in the United States, RIPE (Réseaux IP Européens, http://www.ripe.net/) in Europe, and APNIC (Asia Pacific Network Information Centre, http://www.apnic.net/) for the Asia-Pacific region. For more information about IP address allocation services, visit the Internet Assigned Numbers Authority (IANA) website at http://www.iana.org/ipaddress/ip-addresses.htm.

Figuring Out Network Masks

The network mask is an IP address that has 1s in the bits that correspond to the network address, and 0s in all other bit positions. The class of your network address determines the network mask.

If you have a class C address, for example, the network mask is 255.255.255.0. Similarly, class B networks have a network mask of 255.255.0.0, and class A networks have 255.0.0.0 as the network mask. Of course, you do not have to use the historical class A, B, or C network masks. Nowadays, you can use any other network mask that’s appropriate for your network address.
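Because a network mask is just a run of 1 bits followed by 0 bits, you can build one from the number of network bits. The Python sketch below illustrates this; the function name mask_from_prefix is hypothetical and not part of any standard tool.

```python
def mask_from_prefix(network_bits):
    """Build a dotted-decimal network mask with the given number of leading 1 bits."""
    value = (0xFFFFFFFF << (32 - network_bits)) & 0xFFFFFFFF
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    # 8, 16, and 24 network bits give the historical class A, B, and C masks.
    for bits in (8, 16, 24):
        print(bits, mask_from_prefix(bits))
```

Any other value from 0 through 32 works the same way, which is what makes masks that do not follow the class boundaries possible.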

Extracting Network Addresses

The network address is the bitwise AND of the network mask with any IP address in your network. If the IP address of a system on your network is 206.197.168.200, and the network mask is 255.255.255.0, the network address is 206.197.168.0. The network address is written with zero bits in the part of the address that’s supposed to be for the host address.
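The bitwise AND is easy to try for yourself. In this Python sketch, the helper names to_int, to_dotted, and network_address are invented for the example; the addresses are the ones used in the paragraph above.

```python
def to_int(dotted):
    """Convert a dotted-decimal IPv4 address to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value):
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def network_address(ip, mask):
    """Network address = IP address AND network mask, bit by bit."""
    return to_dotted(to_int(ip) & to_int(mask))


if __name__ == "__main__":
    print(network_address("206.197.168.200", "255.255.255.0"))
```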

Using Subnets

If your site has a class B address, you get one network number, and that network can have up to 65,534 hosts. Even if you work for a megacorporation that has thousands of hosts, you may want to divide your network into smaller subnetworks (or subnets). If your organization has offices in several locations, for example, you may want each office to be on a separate network. You can do this by taking some bits from the host-address portion of the IP address and assigning those bits to the network address. This procedure is known as defining a subnet mask.

Caution 

Do not confuse an IP subnet, which is a logical division of a network, with Ethernet segments, which refer to physical divisions of an Ethernet network.

Essentially, when you define a subnet mask, you add more bits to the default network mask for that address class. If you have a class B network, for example, the default network mask would be 255.255.0.0. Then, if you decide to divide your network into 128 subnetworks, you would designate 7 bits from the host address space as the subnet address. That leaves 9 bits for the host address, so each subnet has 512 addresses, of which 510 can be assigned to hosts. Thus, the subnet mask becomes 255.255.254.0.
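You can reproduce this subnetting example with Python's standard ipaddress module. The sketch below assumes 128.18.0.0 as the class B network (the network number used in the broadcast example earlier) and borrows 7 bits from the host portion.

```python
import ipaddress

# An example class B network with the default 16-bit network mask.
class_b = ipaddress.ip_network("128.18.0.0/16")

# Borrow 7 bits from the host portion: the 16-bit prefix becomes 23 bits.
subnets = list(class_b.subnets(prefixlen_diff=7))
first = subnets[0]

print(len(subnets))              # how many subnets were created
print(first.netmask)             # the subnet mask in dotted-decimal form
print(first.num_addresses - 2)   # hosts per subnet (all-zeros and all-ones excluded)
```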

Using Supernets or CIDR

There are so few class A and B network addresses that they are becoming scarce. Class C addresses are more plentiful, but the proliferation of class C addresses has introduced a problem of its own. Each class C address needs an entry in the network routing tables, the tables that contain information about how to locate any network on the Internet. Too many class C addresses means too many entries in the routing tables, which causes router performance to deteriorate. One way to get around this problem is to ignore the predefined address classes and let the network address be any number of bits. All you need is a network mask that identifies which part of the 32-bit IP address is the network address. Based on this idea, Classless Inter-Domain Routing (CIDR), documented in RFC 1519, was developed to enable routing of contiguous blocks of class C addresses with a single entry in the routing table. CIDR is used in the Internet as the primary mechanism to improve scalability of the Internet routing system.
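To see how aggregation shrinks a routing table, consider four contiguous class C networks. The Python sketch below uses the standard ipaddress module to collapse them into one block; the four network numbers are picked only for illustration and do not describe any real allocation.

```python
import ipaddress

# Four contiguous class C networks, each of which would need its own route.
blocks = [
    ipaddress.ip_network(f"206.197.{third}.0/24")
    for third in (168, 169, 170, 171)
]

# collapse_addresses merges adjacent networks into the fewest possible blocks.
aggregated = list(ipaddress.collapse_addresses(blocks))

for network in aggregated:
    print(network, network.netmask)
```

The four entries collapse into a single block with a 22-bit network address, so a router needs only one routing-table entry to reach all of them.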

Learning about IPv6

When the 4-byte IP address was created, the number of addresses seemed to be adequate. By now, however, class A and B addresses are running out, and class C addresses are being depleted at a fast rate. The Internet Engineering Task Force (IETF) recognized the potential for running out of IP addresses in 1991, and work began then on the next-generation IP addressing scheme, named IPng, which will eventually replace the old 4-byte addressing scheme (called IPv4, for IP Version 4).

Routing TCP/IP Packets

Routing refers to the task of forwarding information from one network to another. Consider the two networks 206.197.168.0 and 164.109.10.0. You need a routing device to send packets from one of these networks to the other.

Because a routing device facilitates data exchange between two networks, it has two physical network connections, one on each network. Each network interface has its own IP address, and the routing device essentially passes packets back and forth between the two network interfaces. Figure 6-4 illustrates how a routing device has a physical presence in two networks and how each network interface has its own IP address.

Figure 6-4: A Routing Device Allows Packet Exchange between Two Networks.

The generic term “routing device” can refer to a general-purpose computer with two network interfaces or a dedicated device designed specifically for routing. Such dedicated routing devices are known as routers.

Insider Insight 

The generic term “gateway” also refers to any routing device regardless of whether the device is another PC or a router. For good performance (a high packet-transfer rate), you want a dedicated router, whose sole purpose is to route packets of data in a network.

Later, when you learn how to set up a TCP/IP network in Linux, you’ll have to specify the IP address of your network’s gateway. If your Linux system gets its IP address from a DHCP (Dynamic Host Configuration Protocol) server, then that DHCP server can also provide the gateway address.

A single routing device, of course, does not connect all the networks in the world; packets get around in the Internet from one gateway to another. Any network connected to another network has a designated gateway. You can even have specific gateways for specific networks. As you’ll learn, a routing table keeps track of the gateway associated with an external network and the type of physical interface (such as Ethernet or Point-to-Point Protocol over serial lines) for that network. A default gateway gets packets that are addressed to any unknown network.
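The idea of a routing table with a default gateway can be sketched in a few lines of Python. Everything in this example is hypothetical: the gateway addresses, the interface name eth0, and the function name lookup_route are invented, and a real system keeps this table in the kernel rather than in a list. The sketch picks the most specific matching network and falls back to the default entry when nothing else matches.

```python
import ipaddress

# Each route: (destination network, gateway or None if directly connected, interface).
# All gateway addresses and interface names here are made up for illustration.
ROUTES = [
    (ipaddress.ip_network("206.197.168.0/24"), None, "eth0"),
    (ipaddress.ip_network("164.109.10.0/24"), "206.197.168.1", "eth0"),
    (ipaddress.ip_network("0.0.0.0/0"), "206.197.168.254", "eth0"),  # default gateway
]


def lookup_route(destination):
    """Return the matching route with the longest network prefix."""
    address = ipaddress.ip_address(destination)
    matches = [route for route in ROUTES if address in route[0]]
    # The default route (0.0.0.0/0) matches everything, so matches is never empty.
    return max(matches, key=lambda route: route[0].prefixlen)


if __name__ == "__main__":
    for destination in ("206.197.168.200", "164.109.10.5", "192.0.2.7"):
        network, gateway, interface = lookup_route(destination)
        print(destination, "via", gateway or "direct", "on", interface)
```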

Within a single network, you don’t need a router as long as you do not use a subnet mask to break the single IP network into several subnets. If you do create subnets, however, you have to set up routers to send packets from one subnet to another.

Understanding the Domain Name System (DNS)

You can access any host computer in a TCP/IP network with an IP address. Remembering the IP addresses of even a few hosts of interest, however, is tedious. This fact was recognized from the beginning of TCP/IP, and the association between a hostname and IP address was created. The concept is similar to that of a phone book, in which you can look up a telephone number by searching for a person’s name.

In the early days of the Internet, the association between names and IP addresses was maintained in a text file named HOSTS.TXT at the Network Information Center (NIC), which was located in the Stanford Research Institute (SRI). This file contained the names and corresponding IP addresses of networks, hosts, and routers on the Internet. All hosts on the Internet used to transfer that file by FTP. (Can you imagine all hosts getting a file from a single source in today’s Internet?) As the number of Internet hosts increased, the single file idea became unmanageable. The hosts file was becoming difficult to maintain, and it was hard for all the hosts to update their hosts file in a timely manner. To alleviate the problem, RFC 881 introduced the concept of and plans for domain names in November 1983. Eventually, in 1987 this led to the Domain Name System (DNS) as we know it today (documented in RFCs 1032, 1033, 1034, and 1035).

DNS provides a hierarchical naming system much like your postal address, which you can read as “your name” at “your street address” in “your city” in “your state” in “your country.” If I know your full postal address, I can locate you by starting with your city in your country. Then, I’d locate the street address to find your home, ring the doorbell, and ask for you by name.

The convention for the email address of a user on a system is to append an at sign (@) to the user name (the name under which the user logs in) and then append the system’s fully qualified domain name. Thus, refer to the user named webmaster at the host gao.gov as webmaster@GAO.GOV (unlike hostnames, user names are case sensitive).

TCP/IP network applications resolve a hostname to an IP address by consulting a name server, which is another host that’s accessible from your network. If you decide to use the Domain Name System (DNS) on your network, you have to set up a name server in your network or indicate a name server (by an IP address).

Cross Ref 

Later sections of this chapter discuss the configuration files /etc/host.conf and /etc/resolv.conf, through which you specify how hostnames are converted to IP addresses. In particular, you specify the IP addresses of a name server in the /etc/resolv.conf file.

If you do not use DNS, you can still map hostnames to IP addresses through a text file named /etc/hosts. The entries in a typical /etc/hosts file might look like the following example:

# Lines like these are comments
# You must have the localhost line in /etc/hosts file
127.0.0.1       localhost.localdomain localhost
192.168.0.100   lnbp933  lnbp933.local.net
192.168.0.60    lnbp600
192.168.0.200   lnbp200  lnbp200.local.net
192.168.0.40    lnbp400  lnbp400.local.net
192.168.0.25    mac      lnbmac  lnbmac.local.net

As the example shows, the file lists a hostname for each IP address. The IP address and hostnames are different for your system, of course.
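To make the file format concrete, here is a minimal Python sketch that turns hosts-style text into a name-to-address table. It is only an illustration of the format, not how the system resolver is implemented; the function name parse_hosts is hypothetical, and the sample text reuses a few lines from the example above.

```python
HOSTS_TEXT = """\
# Lines like these are comments
127.0.0.1       localhost.localdomain localhost
192.168.0.100   lnbp933  lnbp933.local.net
192.168.0.25    mac      lnbmac  lnbmac.local.net
"""


def parse_hosts(text):
    """Map every hostname in hosts-style text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()         # first field is the IP address
        for name in names:
            # Hostnames are not case sensitive; the first entry for a name wins.
            table.setdefault(name.lower(), address)
    return table


if __name__ == "__main__":
    table = parse_hosts(HOSTS_TEXT)
    print(table["localhost"])
    print(table["lnbmac.local.net"])
```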

Insider Insight 

One problem with relying on the /etc/hosts file for name lookup is that you have to replicate this file on each system on your network. This procedure can become a nuisance even in a network that has only five or six systems.