19.1 The socket Module

The socket module supplies a factory function, also named socket, that you call to generate a socket object s. You perform network operations by calling methods on s. In a client program, you connect to a server by calling s.connect. In a server program, you wait for clients to connect by calling s.bind and s.listen. When a client requests a connection, you accept the request by calling s.accept, which returns another socket object s1 connected to the client. Once you have a connected socket object, you transmit data by calling its method send, and receive data by calling its method recv.

Python supports both current Internet Protocol (IP) standards. IPv4 is more widespread, while IPv6 is newer. In IPv4, a network address is a pair (host,port), where host is a Domain Name System (DNS) hostname such as 'www.python.org' or a dotted-quad IP address string such as '194.109.137.226'. port is an integer indicating a socket's port number. In IPv6, a network address is a tuple (host, port, flowinfo, scopeid). Since IPv6 infrastructure is not yet widely deployed, I do not cover IPv6 further in this book. When host is a DNS hostname, Python implicitly looks up the name, using your platform's DNS infrastructure, and uses the dotted-quad IP address corresponding to that name.

Module socket supplies an exception class error. Functions and methods of the module raise error instances to diagnose socket-specific errors. Module socket also supplies many functions. Several of these functions translate data, such as integers, between your host's native format and network standard format. The higher-level protocol that your program and its counterpart are using on a socket determines what kind of conversions you must perform.
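As a small illustration of the error class, the sketch below catches socket.error around a call to inet_aton (covered in the next section). The helper name accepted_by_inet_aton is hypothetical, chosen only for this example. Note that inet_aton may accept some abbreviated forms as well as full dotted-quad strings, so the helper reports only whether the call succeeded.

```python
import socket

def accepted_by_inet_aton(text):
    """Return True if socket.inet_aton can convert text, False otherwise.

    Hypothetical helper for illustration: it shows only how a
    socket-specific failure surfaces as a socket.error exception.
    """
    try:
        socket.inet_aton(text)
    except socket.error:
        # The module diagnosed a socket-specific error.
        return False
    return True

good = accepted_by_inet_aton('127.0.0.1')
bad = accepted_by_inet_aton('no.such.address!')
```

The same try/except socket.error pattern applies to methods of socket objects, such as connect or send.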

19.1.1 socket Functions

The most frequently used functions of module socket are as follows.

getfqdn

getfqdn(host='')

Returns the fully qualified domain name string for the given host. When host is '', returns the fully qualified domain name string for the local host.

gethostbyaddr

gethostbyaddr(ipaddr)

Returns a tuple with three items (hostname, alias_list, ipaddr_list). hostname is a string, the primary name of the host whose IP dotted-quad address you pass as string ipaddr. alias_list is a list of 0 or more alias names for the host. ipaddr_list is a list of one or more dotted-quad addresses for the host.

gethostbyname_ex

gethostbyname_ex(hostname)

Returns the same results as gethostbyaddr, but takes as an argument a hostname string that can be either an IP dotted-quad address or a DNS name.

htonl

htonl(i32)

Converts the 32-bit integer i32 from this host's format into network format.

htons

htons(i16)

Converts the 16-bit integer i16 from this host's format into network format.

inet_aton

inet_aton(ipaddr_string)

Converts IP dotted-quad address string ipaddr_string to 32-bit network packed format and returns a string of 4 bytes.

inet_ntoa

inet_ntoa(packed_string)

Converts the 4-byte network packed format string packed_string and returns an IP dotted-quad address string.

ntohl

ntohl(i32)

Converts the 32-bit integer i32 from network format into this host's format, and returns a normal native integer.

ntohs

ntohs(i16)

Converts the 16-bit integer i16 from network format into this host's format, and returns a normal native integer.

socket

socket(family,type)

Creates and returns a socket object with the given family and type. family is usually the constant attribute AF_INET of module socket, indicating you want a normal, Internet (i.e., TCP/IP) kind of socket. Depending on your platform, family may also be another constant attribute of module socket. For example, AF_UNIX, on Unix-like platforms only, indicates that you want a Unix-kind socket. This book does not cover sockets that are not of the Internet kind, since it focuses on cross-platform Python. type is one of a few constant attributes of module socket; generally, type is SOCK_STREAM to create a TCP (connection-based) socket, or SOCK_DGRAM to create a UDP (datagram-based) socket.
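The conversion functions in this section come in inverse pairs, which a short round-trip sketch can make concrete. The port number 8881 and the address 127.0.0.1 are arbitrary sample values. The b'...' literal assumes Python 2.6 or later; in older releases, write an ordinary string literal instead.

```python
import socket

# 16-bit and 32-bit integers: host format -> network format -> host format.
port = 8881
port_on_wire = socket.htons(port)
port_back = socket.ntohs(port_on_wire)

value = 1
value_on_wire = socket.htonl(value)
value_back = socket.ntohl(value_on_wire)

# Dotted-quad string -> 4-byte packed format -> dotted-quad string.
packed = socket.inet_aton('127.0.0.1')
unpacked = socket.inet_ntoa(packed)
```

On a big-endian host, htons and htonl return their argument unchanged; on a little-endian host, they swap the bytes. The round trip gives back the original value in either case, so the sketch makes no assumption about the host's byte order.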

19.1.2 The socket Class

A socket object s supplies many methods. The most frequently used ones are as follows.

accept

s.accept(  )

Accepts a connection request and returns a pair (s1,(ipaddr,port)), where s1 is a new connected socket and ipaddr and port are the IP address and port number of the counterpart. s must be of type SOCK_STREAM, and you must have previously called s.bind and s.listen. If no client is trying to connect, accept blocks until some client tries to connect.

bind

s.bind((host,port))

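Binds socket s to the local address (host,port), so that s accepts connections (or receives datagrams) arriving for host on port number port. host can be the empty string '' to bind to all of the local machine's network interfaces, which lets s accept connections from any host. It's an error to call s.bind twice on any given socket object s.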

close

s.close(  )

Closes the socket, terminating any listening or connection on it. It's an error to call any other method on s after s.close.

connect

s.connect((host,port))

Connects socket s to the server on the given host and port. Blocks until the server accepts or rejects the connection attempt.

getpeername

s.getpeername(  )

Returns a pair (ipaddr,port), giving the IP address and port number of the counterpart. s must be connected, either because you called s.connect or because s was generated by another socket's accept method.

listen

s.listen(maxpending)

Listens for connection attempts to the socket, allowing up to maxpending queued attempts at any time. maxpending must be greater than 0 and less than or equal to a system-dependent value, which on all contemporary systems is at least 5.

makefile

s.makefile(mode='r')

Creates and returns a file object f, as covered in Chapter 10, that reads from and/or writes to the socket. You can close f and s independently; Python closes the underlying socket only when both f and s are closed.

recv

s.recv(bufsize)

Receives up to bufsize bytes from the socket and returns a string with the data received. Returns an empty string when the socket is disconnected. If there is currently no data, blocks until the socket is disconnected or some data arrives.

recvfrom

s.recvfrom(bufsize)

Receives up to bufsize bytes from the socket and returns a tuple (data,(ipaddr,port)), where data is a string with the data received, and ipaddr and port are the IP address and port number of the sender. Useful with datagram-oriented sockets, which can receive data from different senders. If there is currently no data in the socket, blocks until some data arrives.

send

s.send(string)

Sends the bytes of string on the socket. Returns the number n of bytes sent. n may be lower than len(string); your program must check, and resend the unsent substring string[n:] if non-empty. If there is no space in the socket's buffer, blocks until some space appears.

sendall

s.sendall(string)

Sends the bytes of string on the socket, blocking until all the bytes are sent.

sendto

s.sendto(string,(host,port))

Sends the bytes of string on the socket to the destination host and port, and returns the number n of bytes sent. Useful with datagram-oriented sockets, which can send data to various destinations. s should not be connected (i.e., you should not have previously called method s.connect), since the call itself specifies the destination. n may be lower than len(string); your program must check, and resend the unsent substring string[n:] if non-empty.
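The resend loop that send requires can be sketched as follows. The helper name send_fully is hypothetical; in practice sendall does the same job. To stay self-contained, the sketch connects two sockets within a single process over the loopback address 127.0.0.1, binds to port 0 so that the system picks a free port, and reads that port back with getsockname, a socket method not covered in this section. The b'...' literals assume Python 2.6 or later.

```python
import socket

def send_fully(sock, data):
    """Send all of data on sock, resending the unsent tail as needed.

    Hypothetical helper mirroring what s.sendall does.
    """
    total = 0
    while total < len(data):
        n = sock.send(data[total:])   # n may be less than what remains
        total += n
    return total

# A listening socket on a system-chosen port of the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
port = listener.getsockname()[1]

# The connection attempt is queued by the system, so connect can
# complete before accept is called, even within a single thread.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', port))
server_side, peer = listener.accept()

message = b'hello, socket'
sent = send_fully(client, message)

# recv may also return fewer bytes than requested, so loop on it too.
received = b''
while len(received) < len(message):
    chunk = server_side.recv(8192)
    if not chunk:
        break                         # the peer disconnected early
    received += chunk

peer_name = server_side.getpeername()
for s in (client, server_side, listener):
    s.close()
```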

19.1.3 Echo Server and Client Using TCP Sockets

Example 19-1 shows a TCP server that listens for connections on port 8881. When connected, the server loops, echoing all data back to the client, and goes back to accept another connection when the client is finished. To terminate the server, hit the interrupt key with the focus on the server's terminal window (console). The interrupt key combination, depending on your platform and settings, may be Ctrl-Break (typical on Windows) or Ctrl-C.

Example 19-1. TCP echo server
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('', 8881))
sock.listen(5)

# loop waiting for connections 
# terminate with Ctrl-Break on Win32, Ctrl-C on Unix
try:
    while True:
        newSocket, address = sock.accept(  )
        print "Connected from", address
        while True:
            receivedData = newSocket.recv(8192)
            if not receivedData: break
            newSocket.sendall(receivedData)
        newSocket.close(  )
        print "Disconnected from", address
finally:
    sock.close(  )

The argument passed to the newSocket.recv call, here 8192, is the maximum number of bytes to receive at a time. Receiving up to a few thousand bytes at a time is a good compromise between performance and memory consumption, and it's usual to specify a power of 2 (e.g., 8192==2**13) since memory allocation tends to round up to such powers anyway. It's important to close sock (to ensure we free its well-known port number 8881 as soon as possible), so we use a try/finally statement to ensure sock.close is called. Closing newSocket, which is system-allocated on any suitable free port, is not of the same importance; therefore we do not use a try/finally for it, although it would be fine to do so.
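As a sketch of the try/finally variant mentioned above, the per-connection loop can move into a function that always closes the connected socket. The names echo_until_closed and serve_one are hypothetical. So that the sketch runs unattended, it serves a single connection from a background thread, binds to a system-chosen port on 127.0.0.1 rather than to port 8881, and reads the port back with getsockname (not covered in this chapter). The b'...' literals assume Python 2.6 or later.

```python
import socket
import threading

def echo_until_closed(conn):
    """Echo data back on conn until the peer disconnects; always close conn."""
    try:
        while True:
            data = conn.recv(8192)
            if not data:
                break
            conn.sendall(data)
    finally:
        conn.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))       # port 0: the system picks a free port
listener.listen(5)
port = listener.getsockname()[1]

def serve_one():
    # Accept and serve exactly one client, then return.
    conn, address = listener.accept()
    echo_until_closed(conn)

worker = threading.Thread(target=serve_one)
worker.start()

message = b'ping'
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', port))
client.sendall(message)

# Collect the echo, allowing for recv returning it in several pieces.
reply = b''
while len(reply) < len(message):
    chunk = client.recv(8192)
    if not chunk:
        break
    reply += chunk

client.close()                        # lets the server's recv return ''
worker.join()
listener.close()
```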

Example 19-2 shows a simple TCP client that connects to port 8881 on the local host, sends lines of data, and prints what it receives back from the server.

Example 19-2. TCP echo client
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('localhost', 8881))
print "Connected to server"
data = """A few lines of data
to test the operation
of both server and client."""
for line in data.splitlines(  ):
    sock.sendall(line)
    print "Sent:", line
    response = sock.recv(8192)
    print "Received:", response
sock.close(  )

Run the server of Example 19-1 in a terminal window, and try a few runs of Example 19-2 while the server is running.

19.1.4 Echo Server and Client Using UDP Sockets

Example 19-3 and Example 19-4 implement an echo server and client with UDP (i.e., using datagram rather than stream sockets).

Example 19-3. UDP echo server
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('', 8881))

# loop waiting for datagrams
# (terminate with Ctrl-Break on Win32, Ctrl-C on Unix)
try:
    while True:
        data, address = sock.recvfrom(8192)
        print "Datagram from", address
        sock.sendto(data, address)
finally:
    sock.close(  )

Example 19-4. UDP echo client
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
data = """A few lines of data
to test the operation
of both server and client."""
for line in data.splitlines(  ):
    sock.sendto(line, ('localhost', 8881))
    print "Sent:", line
    response = sock.recv(8192)
    print "Received:", response
sock.close(  )

Run the server of Example 19-3 in a terminal window, and try a few runs of Example 19-4 while the server is running. Example 19-3 and Example 19-4, as well as Example 19-1 and Example 19-2, can run independently at the same time. There is neither interference nor interaction, even though all are using port number 8881 on the local host, because TCP and UDP ports are separate. Note that if you run Example 19-4 when the server of Example 19-3 is not running, you don't receive an error message: the client of Example 19-4 hangs forever, waiting for a response that will never arrive. Datagrams are not as robust and reliable as connections.
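One way to keep a datagram client from hanging forever is to give its socket a timeout. The sketch below assumes Python 2.3 or later, where socket objects supply a settimeout method and module socket supplies the exception class timeout. It runs both ends in one process on a system-chosen port of 127.0.0.1 (read back with getsockname, which this chapter does not cover). It first performs one echo round trip, then sends a datagram that the server side deliberately never answers. The b'...' literals assume Python 2.6 or later.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(('127.0.0.1', 0))         # port 0: the system picks a free port
server.settimeout(5.0)                # safety net so the sketch cannot hang
server_address = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(0.5)                # give up waiting after half a second

# Round trip 1: the server echoes, so the client gets its datagram back.
client.sendto(b'first', server_address)
data, client_address = server.recvfrom(8192)
server.sendto(data, client_address)
echoed = client.recv(8192)

# Round trip 2: nobody answers, so recv raises socket.timeout
# instead of blocking forever.
client.sendto(b'second', server_address)
try:
    client.recv(8192)
    timed_out = False
except socket.timeout:
    timed_out = True

client.close()
server.close()
```

A timeout does not make datagrams reliable: the client still has to decide whether to resend, and a late reply can arrive after the client has given up.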

19.1.5 The timeoutsocket Module

Standard sockets, as supplied by module socket, have no concept of timing out. By default, each socket operation blocks until it either succeeds or fails. There are advanced ways to ask for non-blocking sockets and to ensure that you perform socket operations only when they can't block (relying on module select, covered later in this chapter). However, explicitly arranging for such behavior, particularly in a cross-platform way, can be complicated and difficult.

It's generally simpler to deal with socket objects enriched by a timeout concept. Each operation on such an object fails, with an exception indicating a timeout condition, if the operation still has neither succeeded nor failed after a timeout period has elapsed. Such objects are internally implemented by using non-blocking sockets and selects, but your program is shielded from the complexities and deals only with objects that present a simple and intuitive interface.

In Python 2.3, sockets with timeout behavior will be part of the standard Python library. However, you can use such objects with earlier releases of Python by downloading Timothy O'Malley's timeoutsocket module from http://www.timo-tasi.org/python/timeoutsocket.py. Copy the file to your library directory (e.g., C:\Python22\Lib\). Then, have your program execute a statement:

import timeoutsocket

before the program imports socket or any other module using sockets, such as urllib and others covered in Chapter 18. Afterwards, any creation of a connection-oriented (TCP) socket creates instead an instance t of class timeoutsocket.TimeoutSocket. In addition to socket methods, t supplies two additional methods.

get_timeout

t.get_timeout(  )

Returns the timeout value of t, in seconds.

set_timeout

t.set_timeout(s)

Sets the timeout value of t to s seconds. s is a float or None.

The default timeout value of each new instance t of TimeoutSocket is None, meaning that there is no timeout: t behaves like an ordinary socket instance. To change this, module timeoutsocket supplies two functions.

getDefaultSocketTimeout

getDefaultSocketTimeout(  )

Returns the default timeout value, in seconds, used for newly created instances of class TimeoutSocket. Initially returns None.

setDefaultSocketTimeout

setDefaultSocketTimeout(s)

Sets the default timeout value, used for newly created instances of class TimeoutSocket, to s seconds. s is a float or None.

Socket methods that may block and wait forever when you call them on normal sockets, such as connect, accept, recv, and send, may time out when you call them on an instance t of TimeoutSocket with a timeout value s that is not None. If s seconds elapse after the call, and the wait is still going on, then t stops waiting and raises timeoutsocket.Timeout.
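For comparison, here is a minimal sketch of the timeout support built into the standard socket module from Python 2.3 onward. It does not use module timeoutsocket: the names settimeout, gettimeout, setdefaulttimeout, getdefaulttimeout, and the exception class socket.timeout belong to the standard library and play roles similar to set_timeout, get_timeout, setDefaultSocketTimeout, getDefaultSocketTimeout, and timeoutsocket.Timeout. The listening socket uses a system-chosen port on 127.0.0.1, and no client ever connects, so accept must time out.

```python
import socket

previous_default = socket.getdefaulttimeout()   # normally None

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))       # port 0: the system picks a free port
listener.listen(1)

initial_timeout = listener.gettimeout()         # inherits the default
listener.settimeout(0.2)                        # seconds, a float or None

try:
    listener.accept()                 # no client connects, so this waits
    accept_timed_out = False
except socket.timeout:
    accept_timed_out = True           # raised once 0.2 seconds have elapsed
listener.close()

# The module-level default applies to sockets created afterwards.
socket.setdefaulttimeout(1.5)
fresh = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fresh_timeout = fresh.gettimeout()
fresh.close()

# Restore the previous default so that other code is unaffected.
socket.setdefaulttimeout(previous_default)
```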


