HTTP is a stateless protocol, meaning that it retains no session state between transactions. Cookies, specified separately from HTTP 1.1 itself (originally in RFC 2109, currently in RFC 6265), let web clients and servers cooperate to build a stateful session from a sequence of HTTP transactions.
Each time a server sends a response to a client's request, the server may initiate or continue a session by sending one or more Set-Cookie headers, whose contents are small data items called cookies. When a client sends another request to the server, the client may continue a session by sending Cookie headers with cookies previously received from that server or other servers in the same domain. Each cookie is a pair of strings, the name and value of the cookie, plus optional attributes. Attribute max-age is the maximum number of seconds the cookie should be kept. The client should discard saved cookies after their maximum age. If max-age is missing, then the client should discard the cookie when the user's interactive session ends.
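The exchange just described can be sketched in a few lines. This is a minimal illustration, not a full server: the cookie name session and its value are hypothetical, and the import falls back from http.cookies (the Python 3 name of the module) to Cookie (the Python 2 name used in this chapter).

```python
try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

# Server side: build a cookie named "session" (hypothetical name and value)
# that the client should keep for at most one hour.
outgoing = SimpleCookie()
outgoing["session"] = "abc123"
outgoing["session"]["max-age"] = 3600
set_cookie_header = outgoing.output()
print(set_cookie_header)

# Client side: a later request returns only the name=value pair in a
# Cookie header; the attributes are not sent back to the server.
cookie_header = "session=abc123"
incoming = SimpleCookie(cookie_header)
print(incoming["session"].value)
```

Omitting the max-age assignment would produce a cookie that the client discards when the user's interactive session ends.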
Cookies have no intrinsic privacy or authentication. Unless the connection itself is encrypted, cookies travel in the clear on the Internet and are therefore vulnerable to sniffing. A malicious client might return cookies different from the cookies it previously received. To use cookies for authentication or identification, or to hold sensitive information, the server must encrypt and encode cookies sent to clients, and decode, decrypt, and verify cookies received back from clients.
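The verification step can be sketched with the standard hmac module. This example covers only tamper detection, not encryption: the value remains readable by anyone who sees the cookie. The names SECRET_KEY, sign, and verify are hypothetical, as is the choice of '|' as a separator.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load it from
# protected configuration rather than hard-coding it.
SECRET_KEY = b"replace-with-a-long-random-secret"

def _tag(value):
    """Return the hex HMAC-SHA256 tag for a text value."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def sign(value):
    """Return value with its tag appended after a '|' separator."""
    return value + "|" + _tag(value)

def verify(signed):
    """Return the original value if the tag matches, else None."""
    value, sep, tag = signed.rpartition("|")
    if not sep:
        return None                      # no tag present at all
    expected = _tag(value)
    # compare_digest avoids leaking information through timing differences
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None

token = sign("user42")
print(verify(token))                     # the untampered value round-trips
```

A server would store the signed string as the cookie's value and call verify on whatever the client sends back, treating None as "no valid cookie".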
Encryption, encoding, decoding, decryption, and verification may all be slow when applied to large amounts of data. Decryption and verification require the server to keep some amount of server-side state. Sending substantial amounts of data back and forth on the network is also slow. The server should therefore persist most state data locally, in files or databases. In most cases, a server should use cookies only as small, encrypted, verifiable keys confirming the identity of a user or session, using DBM files or a relational database (covered in Chapter 11) for session state. The cookie specifications require clients to accept only about 4 KB per cookie (name, value, and attributes together), so larger cookies may be dropped, but I suggest you normally use substantially smaller cookies.
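A rough sketch of this division of labor follows: the cookie carries only a short random key, while the state itself stays on the server. The dictionary SESSIONS is a stand-in for a DBM file or database table, and the names new_session, lookup, and sid are hypothetical. The key is not signed here, so this sketch relies on its randomness alone.

```python
import binascii
import os

try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

SESSIONS = {}   # stand-in for a DBM file or a database table

def new_session(state):
    """Store state server-side; return (session id, Set-Cookie header)."""
    sid = binascii.hexlify(os.urandom(16)).decode("ascii")
    SESSIONS[sid] = state
    c = SimpleCookie()
    c["sid"] = sid
    return sid, c.output()

def lookup(http_cookie):
    """Return the stored state for the request's cookie string, or None."""
    c = SimpleCookie(http_cookie)
    m = c.get("sid")
    if m is None:
        return None
    return SESSIONS.get(m.value)

sid, header = new_session({"user": "ann"})
print(header)
```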
The Cookie module supplies several classes, mostly for backward compatibility. CGI scripts normally use the following classes from module Cookie.
Morsel
A script does not directly instantiate class Morsel. However, instances of cookie classes hold instances of Morsel. An instance m of class Morsel represents a single cookie element: a key string, a value string, and optional attributes. m is a mapping. The only valid keys in m are cookie attribute names: 'comment', 'domain', 'expires', 'max-age', 'path', 'secure', and 'version'. Keys into m are case-insensitive. Values in m are strings, each holding the value of the corresponding cookie attribute.
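The mapping behavior of a Morsel can be explored as follows. This is a small sketch with hypothetical cookie and attribute values. Note that it sets an attribute with a mixed-case key but reads it back with the lowercase name, since in the versions I am familiar with the lowercasing is applied when a key is set.

```python
try:
    from http.cookies import SimpleCookie, CookieError   # Python 3
except ImportError:
    from Cookie import SimpleCookie, CookieError         # Python 2

c = SimpleCookie()
c["color"] = "blue"
m = c["color"]              # the Morsel instance held by the cookie

m["path"] = "/shop"
m["Max-Age"] = 600          # stored under the lowercase key 'max-age'
path_value = m["path"]
max_age = m["max-age"]

# Any key that is not a cookie attribute name is rejected.
try:
    m["flavor"] = "mint"
    rejected = False
except CookieError:
    rejected = True

print(path_value, max_age, rejected)
```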
SimpleCookie
class SimpleCookie(input=None)
A SimpleCookie instance c is a mapping. c's keys are strings. c's values are Morsel instances that wrap strings. c[k]=v implicitly expands to:
c[k]=Morsel( ); c[k].set(k,str(v),str(v))
If input is not None, instantiating c implicitly calls c.load(input).
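Both behaviors can be seen in a short sketch. The cookie names and values are hypothetical.

```python
try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

# Assigning a non-string value stores its str() form in the Morsel.
c = SimpleCookie()
c["visits"] = 3
visits_value = c["visits"].value

# Passing a string to the constructor is equivalent to calling load on it.
d = SimpleCookie("theme=dark; lang=en")
names = sorted(d.keys())

print(repr(visits_value), names)
```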
SmartCookie
class SmartCookie(input=None)
A SmartCookie instance c is a mapping. c's keys are strings. c's values are Morsel instances that wrap arbitrary values serialized with pickle. c[k]=v has the semantics:
c[k]=Morsel( ); c[k].set(k,str(v),pickle.dumps(v))
Module pickle was covered in Chapter 11. Since you have little control over what code executes during implicit deserialization via pickle.loads, class SmartCookie offers correspondingly little security. Unless your script is exposed only on a trusted intranet, avoid SmartCookie; use SimpleCookie instead. You can use any cryptographic approach to build, and take apart again, the strings wrapped by Morsel instance values in SimpleCookie instances. Modules covered in Chapter 21 make it easy to encode arbitrary byte strings as text strings, quite apart from any cryptographic measures.
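One way to carry a structured value in a SimpleCookie without pickle is sketched below. It swaps in JSON for serialization and base64 for the text encoding; neither choice comes from the text above, and the helper names encode_value and decode_value are hypothetical. The result is merely encoded, not protected, so a real script would still sign or encrypt it.

```python
import base64
import json

try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

def encode_value(obj):
    """Serialize a JSON-compatible object to URL-safe base64 text."""
    raw = json.dumps(obj, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_value(text):
    """Reverse encode_value; unlike pickle.loads, this runs no arbitrary code."""
    raw = base64.urlsafe_b64decode(text.encode("ascii"))
    return json.loads(raw.decode("utf-8"))

c = SimpleCookie()
c["prefs"] = encode_value({"theme": "dark", "page_size": 20})
restored = decode_value(c["prefs"].value)
print(restored)
```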
SmartCookie is more convenient than SimpleCookie plus cryptography, encoding, and decoding. Convenience and security are often in conflict. The choice is yours. Do not labor under the misapprehension that your system is secure because "after all, nobody knows what I'm doing": security through obscurity isn't. Good cryptography is a necessary (but not sufficient) condition for strong security.
An instance c of SimpleCookie or SmartCookie supplies the following methods.
js_output
c.js_output(attrs=None)
Returns a string s, a JavaScript snippet that sets document.cookie to the cookies held in c. You can embed s in an HTML response to simulate cookies without sending an HTTP Set-Cookie header if the client browser supports JavaScript. If attrs is not None, s's JavaScript sets cookie attributes whose names are in attrs.
load
c.load(data)
When data is a string, load parses it and adds to c each parsed cookie. When data is a mapping, load adds to c a new Morsel instance for each item in data. Normally, data is string os.environ.get('HTTP_COOKIE',''), to recover the cookies the client sent.
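The two forms of load can be sketched as follows. The dictionary environ is a hypothetical stand-in for os.environ, so that the example runs outside a web server.

```python
try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

# Stand-in for os.environ in a CGI script (hypothetical cookie values).
environ = {"HTTP_COOKIE": "lastvisit=1041379200.0; theme=dark"}

c = SimpleCookie()
c.load(environ.get("HTTP_COOKIE", ""))   # string: parsed into cookies
c.load({"lang": "en"})                   # mapping: one Morsel per item

for name in sorted(c.keys()):
    print(name, c[name].value)
```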
output
c.output(attrs=None,header='Set-Cookie:',sep='\n')
Returns a string s formatted as HTTP headers. You can print c.output( ) among your response's HTTP headers to send to the client the cookies held in c. Each header's name is string header, and headers are separated by string sep. If attrs is not None, s's headers contain only cookie attributes whose names are in attrs.
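The effect of each argument can be sketched like this. The sketch passes sep explicitly, since the default separator differs between Python versions, and the header name X-Debug-Cookie: is purely hypothetical.

```python
try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

c = SimpleCookie()
c["a"] = "1"
c["a"]["path"] = "/"
c["a"]["max-age"] = 60
c["b"] = "2"

# One header per cookie, joined by sep.
default_headers = c.output(sep="\n")

# Only the attributes named in attrs are emitted.
only_path = c.output(attrs=["path"], sep="\n")

# header replaces the text at the start of each line.
custom = c.output(header="X-Debug-Cookie:", sep="\n")

print(default_headers)
print(only_path)
print(custom)
```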
An instance m of class Morsel supplies three read-write attributes:
coded_value
The cookie's value, encoded as a string; m's output methods use m.coded_value
key
The cookie's name
value
The cookie's value, an arbitrary Python object
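The difference between the value and its encoded form shows up when the value contains a character, such as a space, that cannot appear bare in a header. A minimal sketch, with a hypothetical cookie name:

```python
try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

c = SimpleCookie()
c["greeting"] = "hello world"
m = c["greeting"]

print(m.key)            # the cookie's name
print(m.value)          # the value as the script supplied it
print(m.coded_value)    # the quoted form that the output methods emit
```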
Instance m also supplies the following methods.
js_output
m.js_output(attrs=None)
Returns a string s, a JavaScript snippet that sets document.cookie to the cookie held in m. See also the js_output method of cookie instances.
output
m.output(attrs=None,header='Set-Cookie:')
Returns a string s formatted as an HTTP header that sets the cookie held in m. See also the output method of cookie instances.
OutputString
m.OutputString(attrs=['path','comment','domain','max-age', 'secure','version','expires'])
Returns a string s that represents the cookie held in m, without decorations. attrs can be any container suitable as the right-hand operand of in, such as a list or a dictionary.
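The relationship between OutputString and output can be sketched as follows, using a hypothetical cookie named id:

```python
try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

c = SimpleCookie()
c["id"] = "42"
c["id"]["path"] = "/app"
m = c["id"]

bare = m.OutputString()                     # name, value, and attributes
with_header = m.output()                    # same text after the header name
name_value_only = m.OutputString(attrs=[])  # empty container: no attributes

print(bare)
print(with_header)
print(name_value_only)
```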
set
m.set(key,value,coded_value)
Sets m's attributes. key and coded_value must be strings.
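For illustration only, since scripts normally let a cookie instance create its Morsel objects, here is a sketch that calls set directly. The cookie name token and its values are hypothetical. As far as I know, set also refuses a key that collides with a cookie attribute name.

```python
try:
    from http.cookies import Morsel, CookieError   # Python 3
except ImportError:
    from Cookie import Morsel, CookieError         # Python 2

m = Morsel()
# The caller supplies both the plain value and its already-encoded form.
m.set("token", "a b", '"a b"')
rendered = m.OutputString()

# A cookie may not be named after an attribute such as 'path'.
try:
    Morsel().set("path", "x", "x")
    reserved_rejected = False
except CookieError:
    reserved_rejected = True

print(rendered, reserved_rejected)
```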
Module Cookie supports cookie handling in both client-side and server-side scripts. Typical usage is server-side, often in a CGI script. The following example shows a simple CGI script using cookies:
import Cookie, time, os, sys, traceback
sys.stderr = sys.stdout
try:
    # first, the script emits HTTP headers
    c = Cookie.SimpleCookie( )
    c["lastvisit"]=str(time.time( ))
    print c.output( )
    print "Content-Type: text/html"
    print
    # then, the script emits the response's body
    print "<html><head><title>Hello, visitor!</title></head><body>"
    # for the rest of the response, the script gets and decodes the cookie
    c = Cookie.SimpleCookie(os.environ.get("HTTP_COOKIE"))
    when = c.get("lastvisit")
    if when is None:
        print "<p>Welcome to this site on your first visit!</p>"
        print "<p>Please click the 'Refresh' button to proceed</p>"
    else:
        try: lastvisit = float(when.value)
        except:
            print "<p>Sorry, cannot decode cookie (%s)</p>"%when.value
            print "<br><pre>"
            traceback.print_exc( )
            print "</pre>"
        else:
            formwhen = time.asctime(time.localtime(lastvisit))
            print "<p>Welcome back to this site!</p>"
            print "<p>You last visited on %s</p>"%formwhen
    print "</body></html>"
except:
    # on any unexpected error, show the traceback in the browser
    print "Content-Type: text/html"
    print
    print "<br><pre>"
    traceback.print_exc( )
    print "</pre>"
Each time a client visits the script, the script sets a cookie encoding the current time. On successive visits, if the client browser supports cookies, the script greets the visitor appropriately. Module time is covered in Chapter 12. Note that this example uses no cryptography or server-side persistence of state, since session state is small and not confidential.
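The same logic can also be sketched as a plain function that takes the incoming cookie string and the current time as arguments, which makes it easy to exercise without a web server. This is a rearrangement for illustration, not the script above: the function name handle_visit is hypothetical, and the import falls back from http.cookies (Python 3) to Cookie (Python 2).

```python
import time

try:
    from http.cookies import SimpleCookie   # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie         # Python 2 name, as in this chapter

def handle_visit(http_cookie, now):
    """Return (Set-Cookie header, greeting) for one request."""
    # The outgoing cookie always records the time of this visit.
    outgoing = SimpleCookie()
    outgoing["lastvisit"] = str(now)
    header = outgoing.output()

    # The incoming cookie, if any, records the previous visit.
    incoming = SimpleCookie(http_cookie)
    when = incoming.get("lastvisit")
    if when is None:
        return header, "Welcome to this site on your first visit!"
    try:
        lastvisit = float(when.value)
    except ValueError:
        return header, "Sorry, cannot decode cookie"
    formwhen = time.asctime(time.localtime(lastvisit))
    return header, "Welcome back! You last visited on %s" % formwhen

first_header, first_message = handle_visit("", 1041379200.0)
print(first_header)
print(first_message)
print(handle_visit("lastvisit=1041379200.0", 1041379300.0)[1])
```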