18.3 The HTTP and FTP Protocols

Modules urllib and urllib2 are usually the handiest way to access servers over the http, https, and ftp protocols. The Python standard library also supplies modules specific to each of these protocols, for cases in which you need finer control.

18.3.1 The httplib Module

Module httplib supplies a class HTTPConnection to connect to an HTTP server.

HTTPConnection

class HTTPConnection(host,port=80)

Returns an instance h of class HTTPConnection, ready for connection (but not yet connected) to the given host and port.

Instance h supplies several methods, of which the most frequently used are the following.

close

h.close(  )

Closes the connection to the HTTP server.

getresponse

h.getresponse(  )

Returns an instance r of class HTTPResponse, which represents the response received from the HTTP server. Call after method request has returned. Instance r supplies the following attributes and methods:

r.getheader( name,default=None)

Returns the contents of header name, or default if no such header exists.

r.msg

An instance of class Message of module mimetools, covered in Chapter 21. r.msg holds the response's headers; to get the body, call r.read.

r.read( )

Returns a string that is the body of the server's response.

r.reason

The reason phrase, a string that the server returned along with the status code to describe the outcome in human-readable form. If the request was successful, r.reason could, for example, be 'OK'.

r.status

An integer, the status code that the server returned. If the request was successful, r.status should be between 200 and 299 according to the HTTP standards. Values between 400 and 599 are typical error codes, again according to HTTP standards. For example, 404 is the error code that a server sends when the page you request cannot be found.

r.version

10 if the server supports only HTTP 1.0, 11 if the server supports HTTP 1.1.
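As a minimal sketch of how these attributes and methods fit together, the following helper sends a GET with method request (covered next) and collects the parts of the response that programs most often need. The names get_page and fetch are hypothetical, not part of httplib, and the fallback import is an assumption that lets the same code run where the module is named http.client.

```python
try:
    import httplib                    # module name used in this section
except ImportError:
    import http.client as httplib     # assumption: equivalent module under its newer name


def get_page(h, selector):
    """Send a GET for selector on connection h.

    Returns a tuple (status, reason, content_type, body).
    get_page is a hypothetical helper, not part of httplib.
    """
    h.request('GET', selector)
    r = h.getresponse()
    try:
        # getheader returns the default ('') if the header is absent
        return r.status, r.reason, r.getheader('Content-Type', ''), r.read()
    finally:
        h.close()


def fetch(host, selector, port=80):
    """Open a fresh connection to host:port and fetch selector from it."""
    return get_page(httplib.HTTPConnection(host, port), selector)
```

A caller would typically check that the returned status is between 200 and 299 before using the body.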

request

h.request(command,URL,data=None,headers={})

Sends a request to the HTTP server. command is an HTTP command string, such as 'GET' or 'POST'. URL is an HTTP selector (i.e., a URL string without the scheme and location components: just the path component, possibly followed by query and/or fragment components). data, if not None, is a string sent as the body of the request, normally meaningful only for such commands as 'POST' and 'PUT'. request computes and sends the Content-Length header to describe the length of data. To send other headers, pass them as part of dictionary argument headers, with the header name as the key and the header contents as the corresponding value.
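The data and headers arguments are easiest to understand in a POST. The sketch below assumes the server expects a URL-encoded form; post_form is a hypothetical name, and the header values shown are illustrative choices rather than anything httplib requires.

```python
try:
    from urllib import urlencode          # location in the urllib module
except ImportError:
    from urllib.parse import urlencode    # assumption: newer location of the same function


def post_form(h, selector, fields):
    """POST dictionary fields as a URL-encoded form on connection h.

    Returns a tuple (status, body). post_form is a hypothetical helper.
    """
    data = urlencode(fields)
    # request computes Content-Length from data; other headers are ours to supply
    headers = {'Content-Type': 'application/x-www-form-urlencoded',
               'Accept': 'text/plain'}
    h.request('POST', selector, data, headers)
    r = h.getresponse()
    return r.status, r.read()
```

Here h is an HTTPConnection instance on which no request is currently pending.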

Module httplib also supplies class HTTPSConnection, used in exactly the same way as class HTTPConnection but supporting connections that use protocol https rather than protocol http.
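Because the two classes are used identically, a program can choose between them by examining a URL's scheme. The following sketch also shows how the selector that request expects relates to a full URL. connection_and_selector is a hypothetical helper; it assumes a urlsplit whose result offers the scheme, hostname, port, path, and query attributes, and it deliberately drops any fragment component.

```python
try:
    import httplib
    from urlparse import urlsplit
except ImportError:
    # assumption: same functionality under its newer module names
    import http.client as httplib
    from urllib.parse import urlsplit


def connection_and_selector(url):
    """Return (connection, selector) for an http or https URL.

    The connection is created but not yet connected.
    """
    parts = urlsplit(url)
    if parts.scheme == 'https':
        cls = httplib.HTTPSConnection
    elif parts.scheme == 'http':
        cls = httplib.HTTPConnection
    else:
        raise ValueError('unsupported scheme: %r' % parts.scheme)
    # the selector is the path, plus the query if there is one
    selector = parts.path or '/'
    if parts.query:
        selector += '?' + parts.query
    # a port of None lets the class use its protocol's default port
    return cls(parts.hostname, parts.port), selector
```

The pair that the function returns can be used directly: call request on the connection, passing the selector as the URL argument.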

18.3.2 The ftplib Module

The ftplib module supplies a class FTP to connect to an FTP server.

FTP

class FTP([host[,user,passwd='']])

Returns an instance f of class FTP. When host is given, implicitly calls f.connect(host). When user (and optionally passwd) is also given, implicitly calls f.login(user,passwd) afterward.

Instance f supplies many methods, of which the most frequently used are the following.

connect

f.connect(host,port=21)

Connects to an FTP server on the given host and port. Call once per instance f, as f's first method call. Don't call if host was given on creation.

cwd

f.cwd(pathname)

Sets the current directory on the FTP server to pathname.

delete

f.delete(filename)

Tells the FTP server to delete a file, and returns a string, the server's response.

login

f.login(user='anonymous',passwd='')

Logs in to the FTP server. When user is 'anonymous' and passwd is '', login determines the real user and host and sends user@host as the password, as normal anonymous FTP conventions require. Call once per instance of f, as the first method call on f after connecting.

mkd

f.mkd(pathname)

Makes a new directory, named pathname, on the FTP server.

pwd

f.pwd(  )

Returns the current directory on the FTP server.

quit

f.quit(  )

Closes the connection to the FTP server. Call as the last method call on f.
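The calling order that connect, login, and quit require can be captured in one small function. This is only a sketch: run_session and its ftp_class parameter are hypothetical (the parameter exists so that a stand-in class can be substituted for testing), and a robust program might need to handle a failure of quit itself.

```python
import ftplib


def run_session(host, action, user='anonymous', passwd='', ftp_class=ftplib.FTP):
    """Connect to host, log in, call action(f), and always quit afterward.

    Returns whatever action returns. run_session is a hypothetical helper.
    """
    f = ftp_class()          # no host given, so not yet connected
    f.connect(host)          # first method call on f
    try:
        f.login(user, passwd)
        return action(f)
    finally:
        f.quit()             # last method call on f
```

For example, run_session(host, lambda f: f.pwd()) returns the server's initial current directory.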

rename

f.rename(oldname,newname)

Tells the FTP server to rename a file from oldname to newname.

retrbinary

f.retrbinary(command,callback,blocksize=8192,rest=None)

Retrieves data in binary mode. command is a string with an appropriate FTP command, typically 'RETR filename'. callback is a callable that retrbinary calls for each block of data returned, passing the block of data, a string, as the only argument. blocksize is the maximum size of each block of data. When rest is not None, it's the offset in bytes from the start of the file at which you want to start the retrieval, if the FTP server supports the 'REST' command. When rest is not None and the FTP server does not support the 'REST' command, retrbinary raises an exception.
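A common choice for callback is the write method of a local file open in binary mode. The sketch below, in which download and its resume parameter are hypothetical names, also shows one way to use rest: when a partial local copy exists, it asks the server to start at the offset where that copy ends. It assumes the partial copy is a correct prefix of the remote file.

```python
import os


def download(f, remote_name, local_path, resume=False):
    """Retrieve remote_name through FTP instance f into local_path.

    Returns the final size in bytes of the local file.
    download is a hypothetical helper, not part of ftplib.
    """
    rest = None
    mode = 'wb'
    if resume and os.path.exists(local_path):
        # continue after the bytes already on disk (server must support REST)
        rest = os.path.getsize(local_path)
        mode = 'ab'
    out = open(local_path, mode)
    try:
        # retrbinary calls out.write once per block of data received
        f.retrbinary('RETR ' + remote_name, out.write, 8192, rest)
    finally:
        out.close()
    return os.path.getsize(local_path)
```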

retrlines

f.retrlines(command,callback=None)

Retrieves data in text mode. command is a string with an appropriate FTP command, typically 'RETR filename' or 'LIST'. callback is a callable that retrlines calls for each line of text returned, passing the line of text, a string, as the only argument (without the end-of-line marker). When callback is None, retrlines writes the lines of text to sys.stdout.
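To process the lines in your program rather than print them, pass as callback the append method of a list. A minimal sketch follows; the name fetch_lines is hypothetical.

```python
def fetch_lines(f, command='LIST'):
    """Return as a list of strings the lines that FTP instance f gets for command."""
    lines = []
    # retrlines calls lines.append once per line, without the end-of-line marker
    f.retrlines(command, lines.append)
    return lines
```

For example, fetch_lines(f, 'RETR README') returns the lines of the server's README file.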

rmd

f.rmd(pathname)

Removes directory pathname on the FTP server.

sendcmd

f.sendcmd(command)

Sends string command as a command to the server and returns the server's response string. Suitable only for commands that don't open data connections.

set_pasv

f.set_pasv(pasv)

Sets passive mode on if pasv is true, off if false. Passive mode defaults to on.

size

f.size(filename)

Returns the size in bytes of the named file on the FTP server, or None if unable to determine the file's size.

storbinary

f.storbinary(command,file,blocksize=8192)

Stores data in binary mode. command is a string with an appropriate FTP command, typically 'STOR filename'. file is a file open in binary mode, which storbinary reads, repeatedly calling file.read(blocksize), to obtain the data to transfer to the FTP server.
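As a sketch of typical use, the following function uploads a local file in binary mode. The name upload is hypothetical, as is its default of storing the file on the server under the base name of the local file.

```python
import os


def upload(f, local_path, remote_name=None):
    """Store the file at local_path on the server through FTP instance f.

    Returns the server's response string. upload is a hypothetical helper.
    """
    if remote_name is None:
        remote_name = os.path.basename(local_path)
    src = open(local_path, 'rb')     # binary mode, as storbinary requires
    try:
        return f.storbinary('STOR ' + remote_name, src)
    finally:
        src.close()
```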

storlines

f.storlines(command,file)

Stores data in text mode. command is a string with an appropriate FTP command, typically 'STOR filename'. file is a file open in text mode, which storlines reads, repeatedly calling file.readline( ), to obtain the data to transfer to the FTP server.

Here is a typical, simple example of ftplib use in an interactive interpreter session:

>>> import ftplib
>>> f = ftplib.FTP('ftp.python.org')
>>> f.login(  )
'230 Anonymous access granted, restrictions apply.'
>>> f.retrlines('LIST')
drwxrwxr-x   4 webmaster webmaster      512 Oct 12  2001 pub
'226 Transfer complete.'
>>> f.cwd('pub')
'250 CWD command successful.'
>>> f.retrlines('LIST')
drwxrwsr-x   2 barry    webmaster      512 Oct 12  2001 jython
lrwx------   1 root     ftp            25 Aug  3  2001 python -> www.python.org/ftp/python
drwxrwxr-x  43 webmaster webmaster     2560 Sep  3 17:22 www.python.org
'226 Transfer complete.'
>>> f.cwd('python')
'250 CWD command successful.'
>>> f.retrlines('LIST')
drwxrwxr-x   2 webmaster webmaster      512 Aug 23  2001 2.0
  [ many result lines snipped ]
drwxrwxr-x   2 webmaster webmaster      512 Aug  2  2001 wpy
'226 Transfer complete.'
>>> f.retrlines('RETR README')
Python Distribution
===================

Most subdirectories have a README or INDEX files explaining the
contents.
  [ many result lines snipped ]
gzipped version of this file, and 'get misc.tar.gz' will fetch a
gzipped tar archive of the misc subdir.
'226 Transfer complete.'

In this case, the following far simpler code is equivalent:

import urllib
print urllib.urlopen('ftp://ftp.python.org/pub/python/README').read(  )

However, ftplib affords much more detailed control of FTP operations than urllib does. Use ftplib when your program needs, for example, to upload, rename, or delete files on the server, or to restart a retrieval at a given offset.


