A.5 Functional Programming

This section largely recapitulates briefer descriptions elsewhere in this appendix, but a common unfamiliarity with functional programming merits a longer discussion. Additional material on functional programming in Python (mostly of a somewhat exotic nature) can be found in articles at:

<http://gnosis.cx/publish/programming/charming_python_13.html>
<http://gnosis.cx/publish/programming/charming_python_16.html>
<http://gnosis.cx/publish/programming/charming_python_19.html>

It is hard to find any consensus about exactly what functional programming is, among either its proponents or detractors. It is not entirely clear to what extent FP is a feature of languages, and to what extent it is a feature of programming styles. Since this is a book about Python, we can leave aside discussions of predominantly functional languages like Lisp, Scheme, Haskell, ML, OCaml, Clean, Mercury, Erlang, and so on; we can focus on what makes a Python program more or less functional.

Programs that lean towards functional programming, within Python's multiple paradigms, tend to have many of the following features:

Functions are treated as first-class objects that are passed as arguments to other functions and methods, and returned as values from same.
Solutions are expressed more in terms of what is to be computed than in terms of how the computation is performed.
Side effects, especially rebinding names repeatedly, are minimized. Functions are referentially transparent (see Glossary).
Expressions are emphasized over statements; in particular, expressions often describe how a result collection is related to a prior collection, most especially list objects.
The following Python constructs are used prevalently: the built-in functions map(), filter(), reduce(), apply(), zip(), and enumerate(); extended call syntax; the lambda operator; list comprehensions; and switches expressed as Boolean operators.
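As a minimal sketch of the last item in the list, a switch expressed with Boolean operators might look like the following. The function name sign_label() is hypothetical, and the code is written to run under Python 3 as well as Python 2. The idiom relies on every result value being true; a false result (0, an empty string, None) would fall through to the next branch.

```python
def sign_label(x):
    # Each "branch" is (condition and result); the first true one wins.
    # Caveat: this only works because every result string is non-empty.
    return (x < 0 and 'negative') or (x == 0 and 'zero') or 'positive'

# The function is passed to map() as a first-class object
labels = list(map(sign_label, [-2, 0, 7]))
```

Here labels holds one label per input number, computed without any if statement.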

Many experienced Python programmers consider FP constructs to be as much of a wart as a feature. The main drawback of a functional programming style (in Python, or elsewhere) is that it is easy to write unmaintainable or obfuscated programming code using it. Too many map(), reduce(), and filter() functions nested inside each other lose all the self-evidence of Python's simple statement and indentation style. Adding unnamed lambda functions into the mix makes matters that much worse. The discussion in Chapter 1 of higher-order functions gives some examples.

A.5.1 Emphasizing Expressions Using `lambda`

The lambda operator is used to construct an "anonymous" function. In contrast to the more common def declaration, a function created with lambda can only contain a single expression as a result, not a sequence of statements, nested blocks, and so on. There are inelegant ways to emulate statements within a lambda, but generally you should think of lambda as a less-powerful cousin of def declarations.

Not all Python programmers are happy with the lambda operator. There is certainly a benefit in readability to giving a function a descriptive name. For example, the second style below is clearly more readable than the first:

>>> from math import sqrt
>>> print map(lambda (a,b): sqrt(a**2+b**2),((3,4),(7,11),(35,8)))
[5.0, 13.038404810405298, 35.902646142032481]
>>> sides = ((3,4),(7,11),(35,8))
>>> def hypotenuse(ab):
...     a,b = ab[:]
...     return sqrt(a**2+b**2)
...
>>> print map(hypotenuse, sides)
[5.0, 13.038404810405298, 35.902646142032481]

By declaring a named function hypotenuse(), the intention of the calculation becomes much more clear. Once in a while, though, a function used in map() or in a callback (e.g., in Tkinter, xml.sax, or mx.TextTools) really is such a one-shot thing that a name only adds noise.
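The session above uses Python 2 syntax. As a hedged sketch for readers on Python 3, where tuple parameters such as lambda (a,b): are no longer accepted and map() returns an iterator rather than a list, the same comparison could be written roughly as follows. The names anon and named are introduced here only for illustration.

```python
from math import sqrt

sides = ((3, 4), (7, 11), (35, 8))

# Anonymous version: index into the pair, since tuple parameters are Python 2 only
anon = list(map(lambda ab: sqrt(ab[0]**2 + ab[1]**2), sides))

def hypotenuse(ab):
    # Named version: unpack the pair inside the body
    a, b = ab
    return sqrt(a**2 + b**2)

named = list(map(hypotenuse, sides))
```

Both lists contain the same three values; only the readability of the call differs.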

However, you may notice in this book that I fairly commonly use the lambda operator to define a name. For example, you might see something like:

>>> hypotenuse = lambda (a,b): sqrt(a**2+b**2)

This usage is mostly for documentation. A side matter is that a few characters are saved in assigning an anonymous function to a name, versus a def binding. But concision is not particularly important. This function definition form documents explicitly that I do not expect any side effects (like changes to globals and data structures) within the hypotenuse() function. While the def form is also side-effect free, that fact is not advertised; you have to look through the (brief) code to establish it. Strictly speaking, there are ways (like calling setattr()) to introduce side effects within a lambda, but as a convention, I avoid doing so, as should you.

Moreover, a second documentary goal is served by a lambda assignment like the one above. Whenever this form occurs, it is possible to literally substitute the right-hand expression anywhere the left-hand name occurs (you usually need to add extra surrounding parentheses, however). By using this form, I am emphasizing that the name is simply a shorthand for the defined expression. For example:

>>> hypotenuse = lambda a,b: sqrt(a**2+b**2)
>>> (lambda a,b: sqrt(a**2+b**2))(3,4), hypotenuse(3,4)
(5.0, 5.0)

Bindings with def, in general, lack substitutability.

A.5.2 Special List Functions

Python has two built-in functions that are strictly operations on sequences, but that are frequently useful in conjunction with the "function-plus-list" built-in functions.

zip(seq1 [,seq2 [,...]])

The zip() function, in Python 2.0+, combines multiple sequences into one sequence of tuples. The name comes from the image of a zipper's interlocking teeth.

The function zip() is almost the same as map(None,...), but zip() truncates when it reaches the end of the shortest sequence. For example:

>>> map(None, (1,2,3,4), [5,5,5])
[(1, 5), (2, 5), (3, 5), (4, None)]
>>> zip((1,2,3,4), [5,5,5])
[(1, 5), (2, 5), (3, 5)]
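The session above shows Python 2 behavior. As a hedged aside for readers on Python 3, where map() no longer accepts None as its function and zip() returns an iterator, the padding behavior is available from itertools.zip_longest(). A sketch, with the names padded and truncated chosen only for illustration:

```python
from itertools import zip_longest

# Pads the shorter sequence with None, like Python 2's map(None, ...)
padded = list(zip_longest((1, 2, 3, 4), [5, 5, 5]))

# Truncates at the end of the shortest sequence, as zip() always has
truncated = list(zip((1, 2, 3, 4), [5, 5, 5]))
```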

Especially in combination with apply(), extended call syntax, or simply tuple unpacking, zip() is useful for operating over multiple related sequences at once; for example:

>>> lefts, tops = (3, 7, 35), (4, 11, 8)
>>> map(hypotenuse, zip(lefts, tops))
[5.0, 13.038404810405298, 35.902646142032481]

A little quirk of zip() is that it is almost its own inverse. A little use of extended call syntax is needed for inversion, though. The expression zip(*zip(*seq)) is idempotent (as an exercise, play with variations). Consider:

>>> sides = [(3, 4), (7, 11), (35, 8)]
>>> zip(*zip(*sides))
[(3, 4), (7, 11), (35, 8)]
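One variation worth trying is a single application of zip(*...), which transposes rows into columns. The sketch below is written so that it also runs under Python 3, where zip() returns an iterator and list() is needed to see the result; the name round_trip is illustrative only.

```python
sides = [(3, 4), (7, 11), (35, 8)]

# One application of zip(*...) turns a list of pairs into two columns
lefts, tops = zip(*sides)

# A second application transposes back to the original pairs
round_trip = list(zip(*zip(*sides)))
```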

enumerate(collection)

Python 2.3 adds the enumerate() built-in function for working with a sequence and its index positions at the same time. Basically, enumerate(seq) is equivalent to zip(range(len(seq)),seq), but enumerate() is a lazy iterator that need not construct the entire list to loop over. A typical usage is:

>>> items = ['a','b']
>>> i = 0       # old-style explicit increment
>>> for thing in items:
...     print 'index',i,'contains',thing
...     i += 1
...
index 0 contains a
index 1 contains b
>>> for i,thing in enumerate(items):
...     print 'index',i,'contains',thing
...
index 0 contains a
index 1 contains b
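The equivalence described above can be checked directly. This is a small sketch that runs under Python 3 as well (list() forces the iterators); the names pairs_zip, pairs_enum, and lazy_pairs are assumptions made for the example.

```python
items = ['a', 'b']

# Index/value pairs built from an explicit range
pairs_zip = list(zip(range(len(items)), items))

# enumerate() yields the same pairs one at a time
pairs_enum = list(enumerate(items))

# It also accepts iterables that have no len(), such as a generator
lazy_pairs = list(enumerate(c.upper() for c in 'ab'))
```

The last line illustrates why laziness matters: the zip(range(len(seq)), seq) spelling needs len(), which a generator cannot supply.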

A.5.3 List-Application Functions as Flow Control

I believe that text processing is one of the areas of Python programming where judicious use of functional programming techniques can greatly aid both clarity and conciseness. A strength of FP style (specifically the Python built-in functions map(), filter(), and reduce()) is that they are not merely about functions, but also about sequences. In text processing contexts, most loops are ways of iterating over chunks of text, frequently over lines. When you wish to do something to a sequence of similar items, FP style allows the code to focus on the action (and its object) instead of on side issues of loop constructs and transient variables.

In part, a map(), filter(), or reduce() call is a kind of flow control. Just as a for loop is an instruction to perform an action a number of times, so are these list-application functions. For example:

for x in range(100):
    sys.stdout.write(str(x))

and:

filter(sys.stdout.write, map(str, range(100)))

are just two different ways of calling the str() function 100 times (and the sys.stdout.write() method with each result). The two differences are that the FP style does not bother rebinding a name for each iteration, and that each call to a list-application function returns a value: a list for map() and filter(), potentially any sort of value for reduce(). Functions/methods like sys.stdout.write that are called wholly for their side effects almost always return None; by using filter() rather than map() around these, you avoid constructing a throwaway list, or rather you construct just an empty list.
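To illustrate all three list-application functions acting together as flow control on text, here is a minimal sketch that counts the words on the non-blank lines of some sample input. The sample lines and variable names are invented for the example. It imports reduce() from functools so that it also runs under Python 3, where map() and filter() are lazy and do nothing until their results are consumed.

```python
from functools import reduce   # a built-in under Python 2
from operator import add

lines = ['alpha beta\n', '\n', 'gamma\n', 'delta epsilon zeta\n']

# filter() drops blank lines, map() counts the words on each line,
# and reduce() folds the counts into a single total
non_blank = filter(str.strip, lines)
word_counts = map(lambda line: len(line.split()), non_blank)
total_words = reduce(add, word_counts, 0)
```

No loop variable is ever rebound here; each step describes what is computed from the previous sequence.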

A.5.4 Extended Call Syntax and `apply()`

To call a function in a dynamic way, it is sometimes useful to build collections of arguments in data structures prior to the call. Unpacking a sequence containing several positional arguments is awkward, and unpacking a dictionary of keyword arguments simply cannot be done with the Python 1.5.2 standard call syntax. For example, consider the salutation() function:

>>> def salutation(title,first,last,use_title=1,prefix='Dear'):
...     print prefix,
...     if use_title: print title,
...     print '%s %s,' % (first, last)
...
>>> salutation('Dr.','David','Mertz',prefix='To:')
To: Dr. David Mertz,

Suppose you read names and prefix strings from a text file or database and wish to call salutation() with arguments determined at runtime. You might use:

>>> rec = get_next_db_record()
>>> opts = calculate_options(rec)
>>> salutation(rec[0], rec[1], rec[2],
...            use_title=opts.get('use_title',1),
...            prefix=opts.get('prefix','Dear'))

This call can be performed more concisely as:

>>> salutation(*rec, **opts)

Or as:

>>> apply(salutation, rec, opts)

The calls func(*args,**keywds) and apply(func,args,keywds) are equivalent. The argument args must be a sequence supplying the positional arguments of func; parameters that have default values may be left out. The (optional) argument keywds is a dictionary whose keys, if any, name keyword arguments of func; where it omits a keyword, the default applies.
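The equivalence can be sketched in a form that also runs under Python 3, which no longer has an apply() built-in. In the sketch below, apply_() is a hypothetical stand-in written for this example, and salutation() is modified to return its greeting rather than print it so that the two results can be compared.

```python
def salutation(title, first, last, use_title=1, prefix='Dear'):
    # Returns the greeting instead of printing it, so results can be compared
    parts = [prefix]
    if use_title:
        parts.append(title)
    parts.append('%s %s,' % (first, last))
    return ' '.join(parts)

def apply_(func, args=(), keywds=None):
    # Hypothetical stand-in for the apply() built-in
    return func(*args, **(keywds or {}))

rec = ('Dr.', 'David', 'Mertz')
opts = {'prefix': 'To:'}

direct = salutation(*rec, **opts)
via_apply = apply_(salutation, rec, opts)
```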

In most cases, the extended call syntax is more readable, since the call closely resembles the declaration syntax of generic positional and keyword arguments. But in a few cases, particularly in higher-order functions, the older apply() built-in function is still useful. For example, suppose that you have an application that will either perform an action immediately or defer it for later, depending on some condition. You might program this application as:

defer_list = []
if some_runtime_condition():
    doIt = apply
else:
    doIt = lambda *x: defer_list.append(x)
#...do stuff like read records and options...
doIt(operation, args, keywds)
#...do more stuff...
#...carry out deferred actions...
map(lambda (f,args,kw): f(*args,**kw), defer_list)

Since apply() is itself a first-class function rather than a syntactic form, you can pass it around or, as in the example, bind it to a name.
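A self-contained variant of the deferral pattern is sketched below so that it can be run as written, including under Python 3. Everything named here (apply_(), make_doer(), operation(), and log) is hypothetical and stands in for the application-specific pieces elided in the listing above.

```python
def apply_(func, args=(), keywds=None):
    # Hypothetical stand-in for the apply() built-in
    return func(*args, **(keywds or {}))

def make_doer(immediate, defer_list):
    # Choose between "do it now" and "remember it for later"
    if immediate:
        return apply_
    return lambda *x: defer_list.append(x)

log = []
def operation(text, sep=' '):
    log.append(sep.join(['ran', text]))

defer_list = []
do_it = make_doer(False, defer_list)
do_it(operation, ('first',), {'sep': '-'})
do_it(operation, ('second',), {})
ran_before = list(log)      # snapshot: nothing has run yet

# Carry out the deferred actions in the order they were queued
for f, args, kw in defer_list:
    f(*args, **kw)
```

Passing True to make_doer() would return apply_ itself, so each do_it() call would run operation() immediately and defer_list would stay empty.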