9.2 The string Module

The string module supplies functions that duplicate the methods of string objects covered in the previous section. Each function takes the string object as its first argument; for example, string.upper(s) is equivalent to s.upper(). Module string also has several useful string-valued attributes:

ascii_letters

The string ascii_lowercase+ascii_uppercase

ascii_lowercase

The string 'abcdefghijklmnopqrstuvwxyz'

ascii_uppercase

The string 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

digits

The string '0123456789'

hexdigits

The string '0123456789abcdefABCDEF'

letters

The string lowercase+uppercase

lowercase

A string containing all characters that are deemed lowercase letters: at least 'abcdefghijklmnopqrstuvwxyz', but more letters (e.g., accented ones) may be present, depending on the active locale

octdigits

The string '01234567'

punctuation

The string '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~' (i.e., all ASCII characters that are deemed punctuation characters in the "C" locale; does not depend on what locale is active)

printable

The string of those characters that are deemed printable (i.e., digits, letters, punctuation, and whitespace)

uppercase

A string containing all characters that are deemed uppercase letters: at least 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', but more letters (e.g., accented ones) may be present, depending on the active locale

whitespace

A string containing all characters that are deemed whitespace: at least space, tab, linefeed, and carriage return, but more characters (e.g., control characters) may be present, depending on the active locale

You should not rebind these attributes, since other parts of the Python library may rely on them and the effects of rebinding them would be undefined.
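A common use of these attributes is membership testing. The following is a minimal sketch, assuming you want an ASCII-only check that ignores the active locale; the helper name is_identifier_like is hypothetical and is not part of module string.

```python
import string

# Hypothetical helper, not part of the string module.
def is_identifier_like(text):
    """Return True if text looks like a simple ASCII identifier."""
    # Reject the empty string and anything that starts with a digit.
    if not text or text[0] in string.digits:
        return False
    # Only locale-independent attributes are used here.
    allowed = string.ascii_letters + string.digits + '_'
    for ch in text:
        if ch not in allowed:
            return False
    return True
```

Because the check relies on ascii_letters and digits rather than letters, its result should not change when the locale does.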

9.2.1 Locale Sensitivity

The locale module is covered in Chapter 10. Locale setting affects some attributes of module string (letters, lowercase, uppercase, whitespace). Through these attributes, locale setting also affects functions of module string and methods of plain-string objects that deal with classification of characters as letters, and conversion between upper- and lowercase, such as capitalize, isalnum, and isalpha. The corresponding methods of Unicode strings are not affected by locale setting.

9.2.2 The maketrans Function

The method translate of plain strings, covered earlier in this chapter, takes as its first argument a plain string of length 256 that it uses as a translation table. The easiest way to build translation tables is to use the maketrans function supplied by module string.

maketrans

maketrans(from,onto)

Returns a translation table: a plain string of length 256 in which the character at index i is the translation of the character whose code is i. from and onto must be plain strings, with len(from) equal to len(onto). Each character in string from is mapped to the character at the corresponding position in string onto. Each character not listed in from is mapped to itself. To get an identity table that maps every character to itself, call maketrans('','').

With the translate string method, you can delete characters as well as translate them. When you use translate just to delete characters, the first argument you pass to translate should be the identity table. Here's an example of using the maketrans function and the string method translate to delete vowels:

import string
identity = string.maketrans('','')
print 'some string'.translate(identity,'aeiou')    # prints: sm strng

Here are examples of turning all other vowels into a's and also deleting s's:

intoas = string.maketrans('eiou','aaaa')
print 'some string'.translate(intoas)              # prints: sama strang
print 'some string'.translate(intoas,'s')          # prints: ama trang
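To make the mechanics concrete, here is a pure-Python sketch of what a translation table holds and how translation with deletion proceeds. It is only an illustration under stated assumptions, not the library's implementation: the names make_table and translate_sketch are hypothetical, and the sketch assumes every character involved has a code below 256.

```python
# Hypothetical stand-ins for string.maketrans and the translate method.
def make_table(frm, onto):
    """Build a 256-character table mapping each char in frm to onto."""
    if len(frm) != len(onto):
        raise ValueError('frm and onto must have the same length')
    # Start from the identity table: index i holds the character with code i.
    table = [chr(i) for i in range(256)]
    for src, dst in zip(frm, onto):
        table[ord(src)] = dst
    return ''.join(table)

def translate_sketch(text, table, deletechars=''):
    """Drop every char found in deletechars, then map the rest via table."""
    kept = [table[ord(ch)] for ch in text if ch not in deletechars]
    return ''.join(kept)

identity = make_table('', '')
intoas = make_table('eiou', 'aaaa')
no_vowels = translate_sketch('some string', identity, 'aeiou')   # 'sm strng'
all_as = translate_sketch('some string', intoas)                 # 'sama strang'
as_no_s = translate_sketch('some string', intoas, 's')           # 'ama trang'
```

Note that deletion is applied before the table lookup, so a deleted character is dropped regardless of what the table would map it to.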

