Modules urllib and
urllib2 are most often the handiest ways to access
servers for http, https,
and ftp protocols. The Python standard library
also supplies specific modules to use for these data access
protocols.
18.3.1 The httplib Module
Module
httplib supplies a class
HTTPConnection to connect to an HTTP server.
class HTTPConnection(host,port=80)
Returns an instance h of class
HTTPConnection, ready for connection (but not yet
connected) to the given host and
port.
Instance h supplies several methods, of
which the most frequently used are the following.
h.close( )
Closes the connection to the HTTP server.
h.getresponse( )
Returns an instance r of class
HTTPResponse, which represents the response
received from the HTTP server. Call after method
request has returned. Instance
r supplies the following attributes and
methods:
- r.getheader(name,default=None)
  Returns the contents of header name, or default if no such
  header exists.
- r.msg
  An instance of class Message of module mimetools, covered in
  Chapter 21. You can use r.msg to access the response's headers
  and body.
- r.read( )
  Returns a string that is the body of the server's response.
- r.reason
  The string that the server gave as the reason for errors or
  anomalies. If the request was successful, r.reason could, for
  example, be 'OK'.
- r.status
  An integer, the status code that the server returned. If the
  request was successful, r.status should be between 200 and 299
  according to the HTTP standards. Values between 400 and 599 are
  typical error codes, again according to HTTP standards. For
  example, 404 is the error code that a server sends when the page
  you request cannot be found.
- r.version
  10 if the server supports only HTTP 1.0, 11 if the server
  supports HTTP 1.1.
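As an illustration of the attributes just listed, here is a minimal sketch that interprets the values of r.status and r.version using the ranges given in the text. The names describe_status and describe_version are hypothetical helpers introduced for this example; they are not part of httplib.

```python
def describe_status(status):
    """Classify an HTTP status code using the ranges described in the text."""
    if 200 <= status <= 299:
        return 'success'
    if 400 <= status <= 599:
        return 'error'
    # Anything else, such as informational or redirection codes.
    return 'other'


def describe_version(version):
    """Map the value of r.version (10 or 11) to a readable protocol name."""
    names = {10: 'HTTP/1.0', 11: 'HTTP/1.1'}
    return names.get(version, 'unknown')
```

You would typically call these as describe_status(r.status) and describe_version(r.version) once h.getresponse( ) has returned r.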
h.request(command,URL,data=None,headers={})
Sends
a request to the HTTP server. command is
an HTTP command string, such as 'GET' or
'POST'. URL is an HTTP
selector (i.e., a URL string without the scheme and location
components; that is, just the path component, possibly followed by query
and/or fragment components). data, if not
None, is a string sent as the body of the request,
normally meaningful only for such commands as
'POST' and 'PUT'.
request computes and sends the Content-Length
header to describe the length of data. To
send other headers, pass them as part of dictionary argument
headers, with the header name as the key
and the header contents as the corresponding value.
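Since request expects a selector rather than a complete URL, it can help to split a URL first. The following is a minimal sketch, assuming you start from a full URL string; split_for_request is a hypothetical helper name, and the fallback import is an assumption that lets the sketch also run on later Python versions, where module urlparse became urllib.parse.

```python
try:
    from urlparse import urlsplit        # Python 2, as used in this chapter
except ImportError:
    from urllib.parse import urlsplit    # assumption: later Python versions


def split_for_request(url):
    """Return (host, selector) for a full URL.

    The selector is the path plus any query string. This sketch drops
    the fragment, which is normally not sent to the server.
    """
    parts = urlsplit(url)                # (scheme, netloc, path, query, fragment)
    selector = parts[2] or '/'           # an empty path means the root
    if parts[3]:
        selector += '?' + parts[3]
    return parts[1], selector
```

The host part is what you would pass to HTTPConnection, and the selector is what you would pass as the URL argument of h.request, optionally with a headers dictionary such as {'Accept': 'text/html'}.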
Module httplib also supplies class
HTTPSConnection, used in exactly the same way as
class HTTPConnection but supporting connections
that use protocol https rather than protocol
http.
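The methods described in this section combine into a short request and response cycle. The following is only a sketch under stated assumptions: fetch is a hypothetical helper name, the connection_class parameter exists purely so that the class can be swapped (for example, in tests), and the fallback import assumes later Python versions, where httplib became http.client.

```python
try:
    import httplib                       # Python 2, as used in this chapter
except ImportError:
    import http.client as httplib        # assumption: later Python versions


def fetch(host, selector, secure=False, connection_class=None):
    """Send a GET request and return (status, reason, body)."""
    if connection_class is None:
        if secure:
            connection_class = httplib.HTTPSConnection
        else:
            connection_class = httplib.HTTPConnection
    h = connection_class(host)           # not yet connected
    try:
        h.request('GET', selector)       # connects and sends the request
        r = h.getresponse()              # call only after request returns
        return r.status, r.reason, r.read()
    finally:
        h.close()                        # always release the connection
```

A call such as fetch('www.example.com', '/') needs network access; www.example.com is a placeholder host, not one taken from the text.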
18.3.2 The ftplib Module
The
ftplib module supplies a class
FTP to connect to an FTP server.
class FTP([host[,user,passwd='']])
Returns an instance f of class
FTP. When host is
given, implicitly calls
f.connect(host).
When user (and optionally
passwd) is also given, implicitly calls
f.login(user,passwd)
afterward.
Instance f supplies many methods, of which
the most frequently used are the following.
f.connect(host,port=21)
Connects to an FTP server on the given
host and port.
Call once per instance f, as
f's first method call.
Don't call if host was
given on creation.
f.cwd(pathname)
Sets the current directory on the FTP server to
pathname.
f.delete(filename)
Tells the FTP server to delete file filename, and returns a string, the
server's response.
f.login(user='anonymous',passwd='')
Logs in to the FTP server. When user is
'anonymous' and passwd
is '', login determines the
real user and host and sends
user@host
as the password, as normal anonymous FTP conventions require. Call
once per instance of f, as the first
method call on f after connecting.
f.mkd(pathname)
Makes a new directory, named pathname, on
the FTP server.
f.pwd( )
Returns the current directory on the FTP server.
f.quit( )
Closes the connection to the FTP server. Call as the last method
call on f.
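The calling order described so far (connect first, then login, and quit last) can be sketched as a small wrapper. This is an illustration rather than a prescribed pattern: run_session and its action parameter are hypothetical names, and ftp_class exists only so that a different class can be substituted for ftplib.FTP.

```python
import ftplib


def run_session(host, action, user='anonymous', passwd='',
                ftp_class=ftplib.FTP):
    """Connect, log in, run action(f), and always quit afterward."""
    f = ftp_class()                  # no host given, so not yet connected
    f.connect(host)                  # first method call on f
    f.login(user, passwd)            # first method call after connecting
    try:
        return action(f)             # e.g., a function that calls f.pwd()
    finally:
        f.quit()                     # last method call on f
```

For example, run_session(host, lambda f: f.pwd()) would return the server's current directory for whatever host you supply.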
f.rename(oldname,newname)
Tells the FTP server to rename a file from
oldname to
newname.
f.retrbinary(command,callback,blocksize=8192,rest=None)
Retrieves data in binary mode. command is
a string with an appropriate FTP command, typically
'RETR filename'.
callback is a callable that
retrbinary calls for each block of data returned,
passing the block of data, a string, as the only argument.
blocksize is the maximum size of each
block of data. When rest is not
None, it's the offset in bytes
from the start of the file at which you want to start the retrieval,
if the FTP server supports the 'REST' command.
When rest is not None
and the FTP server does not support the 'REST'
command, retrbinary raises an exception.
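To make the roles of callback and rest concrete, here is a minimal sketch of a download helper that can resume a partial file. The name download and its resume flag are hypothetical; f is assumed to be an FTP instance that is already connected and logged in.

```python
import os


def download(f, filename, localpath, resume=False):
    """Retrieve filename in binary mode into localpath; return bytes on disk."""
    offset = None
    mode = 'wb'
    if resume and os.path.exists(localpath):
        # Ask the server to skip what is already on disk. An empty file
        # gives offset None, which means a normal transfer from the start.
        offset = os.path.getsize(localpath) or None
        mode = 'ab'
    out = open(localpath, mode)
    try:
        # out.write is the callback: it receives each block of data.
        f.retrbinary('RETR ' + filename, out.write, 8192, offset)
    finally:
        out.close()
    return os.path.getsize(localpath)
```

Because retrbinary raises an exception when the server does not support 'REST', a caller that passes resume=True should be prepared to catch it and retry without resuming.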
f.retrlines(command,callback=None)
Retrieves data in text mode.
command is a string with an appropriate
FTP command, typically 'RETR filename' or
'LIST'. callback is a
callable that retrlines calls for each line of
text returned, passing the line of text, a string, as the only
argument (without the end-of-line marker). When
callback is None,
retrlines writes the lines of text to
sys.stdout.
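When you want the lines in your program rather than on sys.stdout, you can pass a callback that collects them. The following sketch uses the append method of a list as the callback; fetch_lines is a hypothetical helper name, and f is assumed to be a connected, logged-in FTP instance.

```python
def fetch_lines(f, command='LIST'):
    """Run a text-mode retrieval and return the lines as a list of strings."""
    lines = []
    # retrlines calls lines.append once per line, without the
    # end-of-line marker.
    f.retrlines(command, lines.append)
    return lines
```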
f.rmd(pathname)
Removes directory pathname on the FTP
server.
f.sendcmd(command)
Sends string command as a command to the
server and returns the server's response string.
Suitable only for commands that don't open data
connections.
f.set_pasv(pasv)
Sets passive mode on if pasv is true, off
if false. Passive mode defaults to on.
f.size(filename)
Returns the size in bytes of file filename on the FTP server, or
None if unable to determine the
file's size.
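Because the result may be None, code that relies on the remote size needs a fallback. The following is a minimal sketch of one possible use, deciding whether a local copy looks out of date; needs_download is a hypothetical helper name, and comparing sizes is only a rough heuristic chosen for this example, not a method the text prescribes.

```python
import os


def needs_download(f, filename, localpath):
    """Return True if the local copy is missing or differs in size."""
    remote_size = f.size(filename)
    if remote_size is None:
        # Size unknown: assume a download is needed.
        return True
    if not os.path.exists(localpath):
        return True
    return os.path.getsize(localpath) != remote_size
```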
f.storbinary(command,file,blocksize=8192)
Stores data in binary mode. command is a
string with an appropriate FTP command, typically
'STOR filename'.
file is a file open in binary mode, which
storbinary reads, repeatedly calling
file.read(blocksize),
to obtain the data to transfer to the FTP server.
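A short upload helper shows how the file argument is used. This is a sketch under stated assumptions: upload is a hypothetical name, f is a connected, logged-in FTP instance, and the helper simply passes back whatever storbinary returns.

```python
def upload(f, localpath, remotename, blocksize=8192):
    """Send the local file at localpath to the server as remotename."""
    src = open(localpath, 'rb')      # binary mode, as storbinary expects
    try:
        # storbinary reads src in chunks of at most blocksize bytes.
        return f.storbinary('STOR ' + remotename, src, blocksize)
    finally:
        src.close()
```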
f.storlines(command,file)
Stores data in text mode. command is a
string with an appropriate FTP command, typically
'STOR filename'.
file is a file open in text mode, which
storlines reads, repeatedly calling
file.readline( ), to
obtain the data to transfer to the FTP server.
Here is a typical, simple example of ftplib use in
an interactive interpreter session:
>>> import ftplib
>>> f = ftplib.FTP('ftp.python.org')
>>> f.login( )
'230 Anonymous access granted, restrictions apply.'
>>> f.retrlines('LIST')
drwxrwxr-x 4 webmaster webmaster 512 Oct 12 2001 pub
'226 Transfer complete.'
>>> f.cwd('pub')
'250 CWD command successful.'
>>> f.retrlines('LIST')
drwxrwsr-x 2 barry webmaster 512 Oct 12 2001 jython
lrwx------ 1 root ftp 25 Aug 3 2001 python -> www.python.org/ftp/python
drwxrwxr-x 43 webmaster webmaster 2560 Sep 3 17:22 www.python.org
'226 Transfer complete.'
>>> f.cwd('python')
'250 CWD command successful.'
>>> f.retrlines('LIST')
drwxrwxr-x 2 webmaster webmaster 512 Aug 23 2001 2.0
[ many result lines snipped ]
drwxrwxr-x 2 webmaster webmaster 512 Aug 2 2001 wpy
'226 Transfer complete.'
>>> f.retrlines('RETR README')
Python Distribution
===================
Most subdirectories have a README or INDEX files explaining the
contents.
[ many result lines snipped ]
gzipped version of this file, and 'get misc.tar.gz' will fetch a
gzipped tar archive of the misc subdir.
'226 Transfer complete.'
In this case, the following far simpler code is equivalent:
import urllib
print urllib.urlopen('ftp://ftp.python.org/pub/python/README').read( )
However, ftplib affords much more detailed control
of FTP operations than urllib does: for example, it lets
you list, rename, and delete files on the server. When your
program needs such control, ftplib is the module to use.
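As one illustration of an operation that needs ftplib rather than urllib, here is a minimal sketch that moves a file into a folder on the server. The name move_into_folder is hypothetical, f is assumed to be a connected, logged-in FTP instance with suitable permissions, and treating a permanent error from mkd as "the folder already exists" is an assumption made for brevity.

```python
import ftplib


def move_into_folder(f, filename, folder):
    """Rename filename to folder/filename, creating folder if possible."""
    try:
        f.mkd(folder)
    except ftplib.error_perm:
        # Assumption: the server refused because the folder already exists.
        pass
    newname = folder + '/' + filename
    f.rename(filename, newname)
    return newname
```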