5.1 Classic Classes and Instances

A classic class is a Python object with several characteristics:

  • You can call a class object as if it were a function. The call creates another object, known as an instance of the class, that knows what class it belongs to.

  • A class has arbitrarily named attributes that you can bind and reference.

  • The values of class attributes can be data objects or function objects.

  • Class attributes bound to functions are known as methods of the class.

  • A method can have a special Python-defined name with two leading and two trailing underscores. Python invokes such special methods, if they are present, when various kinds of operations take place on class instances.

  • A class can inherit from other classes, meaning it can delegate to other class objects the lookup of attributes that are not found in the class itself.

An instance of a class is a Python object with arbitrarily named attributes that you can bind and reference. An instance object implicitly delegates to its class the lookup of attributes not found in the instance itself. The class, in turn, may delegate the lookup to the classes from which it inherits, if any.

In Python, classes are objects (values), and are handled like other objects. Thus, you can pass a class as an argument in a call to a function. Similarly, a function can return a class as the result of a call. A class, just like any other object, can be bound to a variable (local or global), an item in a container, or an attribute of an object. Classes can also be keys into a dictionary. The fact that classes are objects in Python is often expressed by saying that classes are first-class objects.
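As a minimal sketch of what "first-class" means in practice, the following passes a class as an argument, returns a class from a function, and uses classes as container items and dictionary keys. The names Circle, Square, make, and pick are hypothetical, chosen only for illustration. The sketch avoids the print statement so that it should run unchanged on Python 2 and Python 3:

```python
class Circle:
    sides = 0

class Square:
    sides = 4

def make(cls):
    # A class received as an argument is called like any other callable.
    return cls()

def pick(square):
    # A function can return a class object as its result.
    if square:
        return Square
    return Circle

shapes = [Circle, Square]                      # classes as items in a container
labels = {Circle: "round", Square: "boxy"}     # classes as dictionary keys

instance = make(pick(True))
label = labels[instance.__class__]
```

Here instance is a Square, so label is bound to the string "boxy".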

5.1.1 The class Statement

The class statement is the most common way to create a class object. class is a single-clause compound statement with the following syntax:

class classname[(base-classes)]: 
    statement(s)

classname is an identifier. It is a variable that gets bound (or rebound) to the class object after the class statement finishes executing.

base-classes is an optional comma-delimited series of expressions whose values must be class objects. These classes are known by different names in different languages; you can think of them as the base classes, superclasses, or parents of the class being created. The class being created is said to inherit from, derive from, extend, or subclass its base classes, depending on what language you are familiar with. This class is also known as a direct subclass or descendant of its base classes.

The subclass relationship between classes is transitive. If C1 subclasses C2, and C2 subclasses C3, then C1 subclasses C3. Built-in function issubclass(C1, C2) accepts two arguments that are class objects: it returns True if C1 subclasses C2, otherwise it returns False. Any class is considered a subclass of itself; therefore issubclass(C, C) returns True for any class C. The way in which the base classes of a class affect the functionality of the class is covered later in this chapter.
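The transitivity and reflexivity of the subclass relationship can be checked directly. In this small sketch the class Unrelated is a hypothetical name added for contrast:

```python
class C3: pass
class C2(C3): pass
class C1(C2): pass
class Unrelated: pass

direct = issubclass(C1, C2)          # C1 names C2 as a base class
transitive = issubclass(C1, C3)      # C1 reaches C3 through C2
reflexive = issubclass(C1, C1)       # any class subclasses itself
reverse = issubclass(C3, C1)         # the relationship is not symmetric
unrelated = issubclass(C1, Unrelated)
```

The first three results are true and the last two are false.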

The syntax of the class statement has a small, tricky difference from that of the def statement covered in Chapter 4. In a def statement, parentheses are mandatory between the function's name and the colon. To define a function without formal parameters, use a statement such as:

def name():
    statement(s)

In a class statement, the parentheses are mandatory if the class has one or more base classes, but they are forbidden if the class has no base classes. Thus, to define a class without base classes, use a statement such as:

class name: 
    statement(s)

The non-empty sequence of statements that follows the class statement is known as the class body. A class body executes immediately, as part of the class statement's execution. Until the body finishes executing, the new class object does not yet exist and the classname identifier is not yet bound (or rebound). Section 5.4 later in this chapter provides more details about what happens when a class statement executes.
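The timing described above can be observed with a sketch such as the following, which records events in a list. The names events, Early, and Late are hypothetical. Referring to the class's own name from inside its body fails, because that name is bound only after the body finishes:

```python
events = []

try:
    class Early:
        events.append("Early body running")
        events.append(Early)         # the name Early is not bound yet
except NameError:
    events.append("NameError")

class Late:
    events.append("Late body running")

events.append("Late is bound")
```

The list ends up recording that each body ran as part of its class statement, and that the attempt to use Early inside its own body raised NameError.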

Finally, note that the class statement does not create any instances of a class, but rather defines the set of attributes that are shared by all instances when they are created.

5.1.2 The Class Body

The body of a class is where you normally specify the attributes of the class; these attributes can be data objects or function objects.

5.1.2.1 Attributes of class objects

You typically specify an attribute of a class object by binding a value to an identifier within the class body. For example:

class C1:
    x = 23
print C1.x                               # prints: 23

Class object C1 now has an attribute named x, bound to the value 23, and C1.x refers to that attribute.

You can also bind or unbind class attributes outside the class body. For example:

class C2: pass
C2.x = 23
print C2.x                               # prints: 23

However, your program is more readable if you bind, and thus create, class attributes with statements inside the class body. Any class attributes are implicitly shared by all instances of the class when those instances are created, as we'll discuss shortly.

The class statement implicitly defines some class attributes. Attribute __name__ is the classname identifier string used in the class statement. Attribute __bases__ is the tuple of class objects given as the base classes in the class statement (or the empty tuple, if no base classes are given). For example, using the class C1 we just created:

print C1.__name__, C1.__bases__          # prints: C1 ()

A class also has an attribute __dict__, which is the dictionary object that the class uses to hold all of its other attributes. For any class object C, any object x, and any identifier S (except __name__, __bases__, and __dict__), C.S=x is equivalent to C.__dict__['S']=x. For example, again referring to the class C1 we just created:

C1.y = 45
C1.__dict__['z'] = 67
print C1.x, C1.y, C1.z                   # prints: 23 45 67

There is no difference between class attributes created in the class body, outside of the body by assigning an attribute, or outside of the body by explicitly binding an entry in C.__dict__.

In statements that are directly in a class's body, references to attributes of the class must use a simple name, not a fully qualified name. For example:

class C3:
    x = 23
    y = x + 22                         # must use just x, not C3.x

However, in statements that are in methods defined in a class body, references to attributes of the class must use a fully qualified name, not a simple name. For example:

class C4:
    x = 23
    def amethod(self):
        print C4.x                     # must use C4.x, not just x

Note that attribute references (i.e., an expression like C.S) have richer semantics than attribute binding. These references are covered in detail later in this chapter.
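The two naming rules just described (simple names directly in the class body, qualified names inside methods) can be checked with a small sketch. The class Config and its method names are hypothetical. The via_self method also shows that a method can reach a class attribute through self, thanks to the delegation from instance to class described earlier:

```python
class Config:
    base = 23
    derived = base + 22          # directly in the class body: simple name

    def qualified(self):
        return Config.base       # in a method: qualify with the class...

    def via_self(self):
        return self.base         # ...or go through the instance

    def unqualified(self):
        return base              # a bare name is only a local or global name

try:
    Config().unqualified()
    outcome = "no error"
except NameError:
    outcome = "NameError"
```

Since no local or global variable named base exists here, the unqualified reference raises NameError, while the other two methods return 23.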

5.1.2.2 Function definitions in a class body

Most class bodies include def statements, as functions (called methods in this context) are important attributes for class objects. A def statement in a class body obeys the rules presented in Section 4.10. In addition, a method defined in a class body always has a mandatory first parameter, conventionally named self, that refers to the instance on which you call the method. The self parameter plays a special role in method calls, as covered later in this chapter.

Here's an example of a class that includes a method definition:

class C5:
    def hello(self):
        print "Hello"

A class can define a variety of special methods (methods with names that have two leading and two trailing underscores) relating to specific operations. We'll discuss special methods in great detail later in this chapter.

5.1.2.3 Class-private variables

When a statement in a class body (or in a method in the body) uses an identifier starting with two underscores (but not ending with underscores), such as __ident, the Python compiler implicitly changes the identifier into _classname__ident, where classname is the name of the class. This lets a class use private names for attributes, methods, global variables, and other purposes, without the risk of accidentally duplicating names used elsewhere.

By convention, all identifiers starting with a single underscore are also intended as private to the scope that binds them, whether that scope is or isn't a class. The Python compiler does not enforce privacy conventions, however: it's up to Python programmers to respect them.
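The renaming performed by the compiler is visible in the dictionaries of the class and of its instances. A minimal sketch, using a hypothetical Account class:

```python
class Account:
    __fee = 2                           # compiled as _Account__fee

    def __init__(self, balance):
        self.__balance = balance        # compiled as _Account__balance

    def net(self):
        # Inside the class, both names are mangled consistently.
        return self.__balance - Account.__fee

acct = Account(100)
class_has_mangled = '_Account__fee' in Account.__dict__
instance_has_mangled = '_Account__balance' in acct.__dict__
plain_name_visible = hasattr(acct, '__balance')
```

The mangled names remain fully accessible from outside (for example, as acct._Account__balance), which illustrates that the mechanism avoids accidental clashes rather than enforcing privacy.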

5.1.2.4 Class documentation strings

If the first statement in the class body is a string literal, the compiler binds that string as the documentation string attribute for the class. This attribute is named __doc__ and is known as the docstring of the class. See Section 4.10.3 for more information on docstrings.

5.1.3 Instances

When you want to create an instance of a class, call the class object as if it were a function. Each call returns a new instance object of that class:

anInstance = C5()

You can call built-in function isinstance(I,C) with a class object as argument C. In this case, isinstance returns True if object I is an instance of class C or any subclass of C. Otherwise, isinstance returns False.
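A short sketch of isinstance with a subclass, using the hypothetical classes Animal, Dog, and Rock:

```python
class Animal: pass
class Dog(Animal): pass
class Rock: pass

rex = Dog()
is_dog = isinstance(rex, Dog)              # instance of the class itself
is_animal = isinstance(rex, Animal)        # also an instance of any base class
is_rock = isinstance(rex, Rock)            # unrelated class
base_is_dog = isinstance(Animal(), Dog)    # a base instance is not a Dog
```

Only the first two checks are true.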

5.1.3.1 __init__

When a class has or inherits a method named __init__, calling the class object implicitly executes __init__ on the new instance to perform any instance-specific initialization that is needed. Arguments passed in the call must correspond to the formal parameters of __init__. For example, consider the following class:

class C6:
    def __init__(self, n):
        self.x = n

Here's how to create an instance of the C6 class:

anotherInstance = C6(42)

As shown in the C6 class, the __init__ method typically contains statements that bind instance attributes. An __init__ method must either not return a value or return the value None; any other return value raises a TypeError exception.

The main purpose of __init__ is to bind, and thus create, the attributes of a newly created instance. You may also bind or unbind instance attributes outside __init__, as you'll see shortly. However, your code will be more readable if you initially bind all attributes of a class instance with statements in the __init__ method.

When __init__ is absent, you must call the class without arguments, and the newly generated instance has no instance-specific attributes. See Section 5.3 later in this chapter for more details about __init__.
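The rules about __init__ given in this section can be exercised with a sketch like the following. The classes Good, Bad, and Plain and the helper raises_type_error are hypothetical names. The exact exception message differs between Python versions, so the sketch checks only the exception type:

```python
class Good:
    def __init__(self, n):
        self.x = n               # falls off the end, so it returns None

class Bad:
    def __init__(self):
        return 42                # any return value other than None is an error

class Plain:
    pass                         # no __init__ written for this class

def raises_type_error(callable_, *args):
    # Report whether calling callable_(*args) raises TypeError.
    try:
        callable_(*args)
    except TypeError:
        return True
    return False

good_x = Good(7).x
bad_fails = raises_type_error(Bad)           # __init__ returned 42
missing_arg = raises_type_error(Good)        # n was not supplied
extra_arg = raises_type_error(Plain, 1)      # no __init__ to receive the 1
plain_attrs = Plain().__dict__
```

All three problem calls raise TypeError, and an instance of Plain starts out with an empty __dict__.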

5.1.3.2 Attributes of instance objects

Once you have created an instance, you can access its attributes (data and methods) using the dot (.) operator. For example:

anInstance.hello()                         # prints: Hello
print anotherInstance.x                    # prints: 42

Attribute references such as these have fairly rich semantics in Python and are covered in detail later in this section.

You can give an instance object an arbitrary attribute by binding a value to an attribute reference. For example:

class C7: pass
z = C7()
z.x = 23
print z.x                                   # prints: 23

Instance object z now has an attribute named x, bound to the value 23, and z.x refers to that attribute. Note that the __setattr__ special method, if present, intercepts every attempt to bind an attribute. __setattr__ is covered in Section 5.3 later in this chapter.

Creating an instance implicitly defines two instance attributes. For any instance z, z.__class__ is the class object to which z belongs, and z.__dict__ is the dictionary that z uses to hold all of its other attributes. For example, for the instance z we just created:

print z.__class__.__name__, z.__dict__     # prints: C7 {'x': 23}

You may rebind (but not unbind) either or both of these attributes, but this is rarely necessary.

For any instance object z, any object x, and any identifier S (except __class__ and __dict__), z.S=x is equivalent to z.__dict__['S']=x (unless a __setattr__ special method intercepts the binding attempt). For example, again referring to the instance z we just created:

z.y = 45
z.__dict__['z'] = 67
print z.x, z.y, z.z                         # prints: 23 45 67

There is no difference between instance attributes created in __init__, by assigning to attributes, or by explicitly binding an entry in z.__dict__.
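The equivalence between attribute binding and entries in the instance dictionary can be verified with a sketch along these lines, where Record is a hypothetical class that defines no __setattr__:

```python
class Record:
    pass

r = Record()
r.x = 23                      # ordinary attribute binding
r.__dict__['y'] = 45          # explicit entry in the instance dictionary

snapshot = dict(r.__dict__)   # copy of the dictionary at this point
y_via_attribute = r.y         # the explicit entry is a normal attribute

del r.x                       # unbinding removes the dictionary entry
x_still_there = 'x' in r.__dict__
```

Both bindings end up as items of r.__dict__, and unbinding r.x removes the corresponding item.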

5.1.3.3 The factory-function idiom

It is common to want to create instances of different classes depending upon some condition or to want to avoid creating a new instance if an existing one is available for reuse. You might consider implementing these needs by having __init__ return a particular object, but that isn't possible because Python raises an exception when __init__ returns any value other than None. The best way to implement flexible object creation is by using an ordinary function, rather than by calling the class object directly. A function used in this role is known as a factory function.

Calling a factory function is a more flexible solution, as such a function may return an existing reusable instance or create a new instance by calling whatever class is appropriate. Say you have two almost-interchangeable classes (SpecialCase and NormalCase) and you want to flexibly generate either one of them, depending on an argument. The following appropriateCase factory function allows you to do just that (the role of the self parameters is covered in Section 5.1.5 later in this chapter):

class SpecialCase:
    def amethod(self): print "special"
class NormalCase:
    def amethod(self): print "normal"
def appropriateCase(isnormal=1):
    if isnormal: return NormalCase()
    else: return SpecialCase()
aninstance = appropriateCase(isnormal=0)
aninstance.amethod()                          # prints "special", as desired

5.1.4 Attribute Reference Basics

An attribute reference is an expression of the form x.name, where x is any expression and name is an identifier called the attribute name. Many kinds of Python objects have attributes, but an attribute reference has special rich semantics when x refers to a class or instance. Remember that methods are attributes too, so everything I say about attributes in general also applies to attributes that are callable (i.e., methods).

Say that x is an instance of class C, which inherits from base class B. Both classes and the instance have several attributes (data and methods) as follows:

class B:
    a = 23
    b = 45
    def f(self): print "method f in class B"
    def g(self): print "method g in class B"
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print "method g in class C"
    def h(self): print "method h in class C"
x = C()
x.d = 77
x.e = 88

Some attribute names are special. For example, C.__name__ is the string 'C', the class name. C.__bases__ is the tuple (B,), the tuple of C's base classes. x.__class__ is the class C, the class to which x belongs. When you refer to an attribute with one of these special names, the attribute reference looks directly into a special dedicated slot in the class or instance object and fetches the value it finds there. Thus, you can never unbind these attributes. Rebinding them is allowed, so you can change the name or base classes of a class or the class of an instance on the fly, but this is an advanced technique and rarely necessary.

Both class C and instance x each have one other special attribute, a dictionary named __dict__. All other attributes of a class or instance, except for the few special ones, are held as items in the __dict__ attribute of the class or instance.

Apart from special names, when you use the syntax x.name to refer to an attribute of instance x, the lookup proceeds in two steps:

  1. When 'name' is a key in x.__dict__, x.name fetches and returns the value at x.__dict__['name']

  2. Otherwise, x.name delegates the lookup to x's class (i.e., it works just the same as x.__class__.name)

Similarly, lookup for an attribute reference C.name on a class object C also proceeds in two steps:

  1. When 'name' is a key in C.__dict__, C.name fetches and returns the value at C.__dict__['name']

  2. Otherwise, C.name delegates the lookup to C's base classes, meaning it loops on C.__bases__ and tries the name lookup on each

When these two lookup procedures do not find an attribute, Python raises an AttributeError exception. However, if x's class defines or inherits special method __getattr__, Python calls x.__getattr__('name') rather than raising the exception.
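The fallback role of __getattr__ can be seen in a sketch such as this one, where Defaulting is a hypothetical class. The method is consulted only for names that the normal lookup fails to find. The guard for names starting with two underscores is a precaution added here so that operations that look for special methods keep their default behavior:

```python
class Defaulting:
    present = "found in the class"

    def __getattr__(self, name):
        # Called only after the normal lookup has failed.
        if name.startswith("__"):
            raise AttributeError(name)   # leave special names alone
        return "default for " + name

d = Defaulting()
d.own = "found in the instance"

from_instance = d.own          # step 1 succeeds; __getattr__ is not called
from_class = d.present         # step 2 succeeds; __getattr__ is not called
from_fallback = d.missing      # both steps fail; __getattr__ supplies a value
```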

Consider the following attribute references:

print x.e, x.d, x.c, x.b, x.a                 # prints: 88 77 89 67 23

x.e and x.d succeed in step 1 of the first lookup process, since 'e' and 'd' are both keys in x.__dict__. Therefore, the lookups go no further, but rather return 88 and 77. The other three references must proceed to step 2 of the first process and look in x.__class__ (i.e., C). x.c and x.b succeed in step 1 of the second lookup process, since 'c' and 'b' are both keys in C.__dict__. Therefore, the lookups go no further, but rather return 89 and 67. x.a gets all the way to step 2 of the second process, looking in C.__bases__[0] (i.e., B). 'a' is a key in B.__dict__, therefore x.a finally succeeds and returns 23.
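The two lookup procedures can be imitated in ordinary Python code. The following is only an illustrative sketch of the steps described above, not what the interpreter actually runs: class_lookup and instance_lookup are hypothetical helper names, and the sketch ignores special names, __getattr__, and the wrapping of functions into methods. The classes repeat the data attributes of B and C from this section:

```python
class B:
    a = 23
    b = 45

class C(B):
    b = 67
    c = 89
    d = 123

x = C()
x.d = 77
x.e = 88

def class_lookup(cls, name):
    # Step 1: look in the class's own dictionary.
    if name in cls.__dict__:
        return cls.__dict__[name]
    # Step 2: try each base class in order, searching it fully before the next.
    for base in cls.__bases__:
        try:
            return class_lookup(base, name)
        except AttributeError:
            pass
    raise AttributeError(name)

def instance_lookup(obj, name):
    # Step 1: look in the instance's own dictionary.
    if name in obj.__dict__:
        return obj.__dict__[name]
    # Step 2: delegate to the instance's class.
    return class_lookup(obj.__class__, name)

found = [instance_lookup(x, n) for n in ("e", "d", "c", "b", "a")]
```

The list found holds the same five values, in the same order, as the attribute references x.e, x.d, x.c, x.b, and x.a.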

Note that the attribute lookup steps happen only when you refer to an attribute, not when you bind an attribute. When you bind or unbind an attribute whose name is not special, only the __dict__ entry for the attribute is affected. In other words, in the case of attribute binding, there is no lookup procedure involved.
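One practical consequence is that binding an attribute on an instance never changes the class, even when the class has an attribute of the same name. A minimal sketch, with a hypothetical class Shared:

```python
class Shared:
    limit = 10

first = Shared()
second = Shared()

first.limit = 99              # creates an entry in first.__dict__ only
shadowed = first.limit        # the instance entry now hides the class one
class_value = Shared.limit    # the class attribute is untouched
sibling_value = second.limit  # other instances still see the class attribute

del first.limit               # removes the entry from first.__dict__ only
restored = first.limit        # lookup delegates to the class again
```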

5.1.5 Bound and Unbound Methods

Step 1 of the class attribute reference lookup process described in the previous section actually performs an additional task when the value found is a function. In this case, the attribute reference does not return the function object directly, but rather wraps the function into an unbound method object or a bound method object. The key difference between unbound and bound methods is that an unbound method is not associated with a particular instance, while a bound method is.

In the code in the previous section, attributes f, g, and h are functions; therefore an attribute reference to any one of them returns a method object wrapping the respective function. Consider the following:

print x.h, x.g, x.f, C.h, C.g, C.f

This statement outputs three bound methods, represented as strings like:

<bound method C.h of <__main__.C instance at 0x8156d5c>>

and then three unbound ones, represented as strings like:

<unbound method C.h>

We get bound methods when the attribute reference is on instance x, and unbound methods when the attribute reference is on class C.

Because a bound method is already associated with a specific instance, you call the method as follows:

x.h()                        # prints: method h in class C

The key thing to notice here is that you don't pass the method's first argument, self, by the usual argument-passing syntax. Rather, a bound method of instance x implicitly binds the self parameter to object x. Thus, the body of the method can access the instance's attributes as attributes of self, even though we don't pass an explicit argument to the method.

An unbound method, however, is not associated with a specific instance, so you must specify an appropriate instance as the first argument when you invoke an unbound method. For example:

C.h(x)                     # prints: method h in class C

You call unbound methods far less frequently than you call bound methods. The main use for unbound methods is for accessing overridden methods, as discussed in Section 5.1.6 later in this chapter.
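The two calling conventions give the same result, as the following sketch suggests. Greeter is a hypothetical class whose method returns a string rather than printing it. The calls work the same way on Python 3, although there an attribute reference on the class yields the plain function rather than an unbound method object:

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, punctuation):
        return "Hello, " + self.name + punctuation

g = Greeter("Ada")

via_instance = g.greet("!")            # bound: self is supplied implicitly
via_class = Greeter.greet(g, "!")      # via the class: self is passed explicitly

saved = g.greet                        # a bound method remembers its instance
later = saved("?")
```

Both via_instance and via_class are "Hello, Ada!", and the saved bound method can be called later without mentioning g again.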

5.1.5.1 Unbound method details

As we've just discussed, when an attribute reference on a class refers to a function, a reference to that attribute returns an unbound method that wraps the function. An unbound method has three attributes in addition to those of the function object it wraps: im_class is the class object supplying the method, im_func is the wrapped function, and im_self is always None. These attributes are all read-only, meaning that trying to rebind or unbind any of them raises an exception.

You can call an unbound method just as you would call its im_func function, but the first argument in any call must be an instance of im_class or a descendant. In other words, a call to an unbound method must have at least one argument, which corresponds to the first formal parameter (conventionally named self).

5.1.5.2 Bound method details

As covered earlier in Section 5.1.4, an attribute reference on an instance x, such as x.f, delegates the lookup to x's class when 'f' is not a key in x.__dict__. In this case, when the lookup finds a function object, the attribute reference operation creates and returns a bound method that wraps the function. Note that when the attribute reference finds a function object in x.__dict__ or any other kind of callable object by whatever route, the attribute reference operation does not create a bound method. The bound method is created only when a function object is found as an attribute in the instance's class.
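The difference between a function found in the class and a function found in the instance's own dictionary shows up in how each is called. In this sketch, Widget and standalone are hypothetical names:

```python
class Widget:
    def describe(self):
        return "method from the class; self was supplied"

def standalone():
    return "plain function; no self was supplied"

w = Widget()
from_class = w.describe()      # found in the class: wrapped as a bound method

w.describe = standalone        # now 'describe' is a key in w.__dict__
from_instance = w.describe()   # found in the instance: called with no arguments
```

The second call succeeds even though standalone takes no parameters, because no bound method is created and nothing is inserted as a first argument.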

A bound method is similar to an unbound method, in that it has three read-only attributes in addition to those of the function object it wraps. As with an unbound method, im_class is the class object supplying the method, and im_func is the wrapped function. However, in a bound method object, attribute im_self refers to x, the instance from which the method was obtained.

A bound method is used like its im_func function, but calls to a bound method do not explicitly supply an argument corresponding to the first formal parameter (conventionally named self). When you call a bound method, the bound method passes im_self as the first argument to im_func, before other arguments (if any) are passed at the point of call.

Let's follow the conceptual steps in a typical method call with the normal syntax x.name(arg). x is an instance object, name is an identifier naming one of x's methods (a function-valued attribute of x's class), and arg is any expression. Python checks if 'name' is a key in x.__dict__, but it isn't. So Python finds name in x.__class__ (possibly, by inheritance, in one of its __bases__). Python notices that the value is a function object, and that the lookup is being done on instance x. Therefore, Python creates a bound method object whose im_self attribute refers to x. Then, Python calls the bound method object with arg as the only actual argument. The bound method inserts im_self (i.e., x) as the first actual argument and arg becomes the second one. The overall effect is just like calling:

x.__class__.__dict__['name'](x, arg)
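The equivalence can be checked for the simple case in which the method is defined directly in the instance's class (no inheritance involved). Scaler is a hypothetical class used only for this sketch:

```python
class Scaler:
    def __init__(self, factor):
        self.factor = factor

    def scale(self, value):
        return self.factor * value

s = Scaler(3)

normal = s.scale(14)                                  # usual method call
spelled_out = s.__class__.__dict__['scale'](s, 14)    # same call, step by step
```

Both expressions evaluate to 42.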

When a bound method's function body executes, it has no special namespace relationship to either its self object or any class. Variables referenced are local or global, just as for any other function, as covered in Section 4.10.6. Variables do not implicitly indicate attributes in self, nor do they indicate attributes in any class object. When the method needs to refer to, bind, or unbind an attribute of its self object, it does so by standard attribute-reference syntax (e.g., self.name). The lack of implicit scoping may take some getting used to (since Python differs in this respect from many other object-oriented languages), but it results in clarity, simplicity, and the removal of potential ambiguities.

Bound method objects are first-class objects, and you can use them wherever you can use a callable object. Since a bound method holds references to the function it wraps and to the self object on which it executes, it's a powerful and flexible alternative to a closure (covered in Section 4.10.6.2). An instance object with special method __call__ (covered in Section 5.3 later in this chapter) offers another viable alternative. Each of these constructs lets you bundle some behavior (code) and some state (data) into a single callable object. Closures are simplest, but limited in their applicability. Here's the closure from Chapter 4:

def make_adder_as_closure(augend):
    def add(addend, _augend=augend): return addend+_augend
    return add

Bound methods and callable instances are richer and more flexible. Here's how to implement the same functionality with a bound method:

def make_adder_as_bound_method(augend):
    class Adder:
        def __init__(self, augend): self.augend = augend
        def add(self, addend): return addend+self.augend
    return Adder(augend).add

Here's how to implement it with a callable instance (an instance with __call__):

def make_adder_as_callable_instance(augend):
    class Adder:
        def __init__(self, augend): self.augend = augend
        def __call__(self, addend): return addend+self.augend
    return Adder(augend)

From the viewpoint of the code that calls the functions, all of these functions are interchangeable, since all return callable objects that are polymorphic (i.e., usable in the same ways). In terms of implementation, the closure is simplest; the bound method and callable instance use more flexible and powerful mechanisms, but there is really no need for that extra power in this case.

5.1.6 Inheritance

When you use an attribute reference C.name on a class object C, and 'name' is not a key in C.__dict__, the lookup implicitly proceeds on each class object that is in C.__bases__, in order. C's base classes may in turn have their own base classes. In this case, the lookup recursively proceeds up the inheritance tree, stopping when 'name' is found. The search is depth-first, meaning that it examines the ancestors of each base class of C before considering the next base class of C. Consider the following example:

class Base1:
    def amethod(self): print "Base1"
class Base2(Base1): pass
class Base3:
    def amethod(self): print "Base3"
class Derived(Base2, Base3): pass
aninstance = Derived()
aninstance.amethod()                      # prints: Base1

In this case, the lookup for amethod starts in Derived. When it isn't found there, lookup proceeds to Base2. Since the attribute isn't found in Base2, lookup then proceeds to Base2's ancestor, Base1, where the attribute is found. Therefore, the lookup stops at this point and never considers Base3, where it would also find an attribute with the same name.

5.1.6.1 Overriding attributes

As we've just seen, the search for an attribute proceeds up the inheritance tree and stops as soon as the attribute is found. Descendant classes are examined before their ancestors, meaning that when a subclass defines an attribute with the same name as one in a superclass, the search finds the definition when it looks at the subclass and stops there. This is known as the subclass overriding the definition in the superclass. Consider the following:

class B:
    a = 23
    b = 45
    def f(self): print "method f in class B"
    def g(self): print "method g in class B"
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print "method g in class C"
    def h(self): print "method h in class C"

In this code, class C overrides attributes b and g of its superclass B.

5.1.6.2 Delegating to superclass methods

When a subclass C overrides a method f of its superclass B, the body of C.f often wants to delegate some part of its operation to the superclass's implementation of the method. This can be done using an unbound method, as follows:

class Base:
    def greet(self, name): print "Welcome", name
class Sub(Base):
    def greet(self, name):
        print "Well Met and",
        Base.greet(self, name)
x = Sub()
x.greet('Alex')                          # prints: Well Met and Welcome Alex

The delegation to the superclass, in the body of Sub.greet, uses an unbound method obtained by attribute reference Base.greet on the superclass, and therefore passes all arguments normally, including self. Delegating to a superclass implementation is the main use of unbound methods.

One very common use of such delegation occurs with special method __init__. When an instance is created in Python, the __init__ methods of base classes are not automatically invoked, as they are in some other object-oriented languages. Thus, it is up to a subclass to perform the proper initialization by using delegation if necessary. For example:

class Base:
    def __init__(self):
        self.anattribute = 23
class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.anotherattribute = 45

If the __init__ method of class Derived didn't explicitly call that of class Base, instances of Derived would miss that portion of their initialization, and thus such instances would lack attribute anattribute.
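The effect of omitting the delegation can be seen by comparing two subclasses. In this sketch, Base and Derived repeat the classes just shown, while Forgetful is a hypothetical subclass added to illustrate the missing call:

```python
class Base:
    def __init__(self):
        self.anattribute = 23

class Derived(Base):
    def __init__(self):
        Base.__init__(self)            # explicit delegation to the base class
        self.anotherattribute = 45

class Forgetful(Base):
    def __init__(self):
        self.anotherattribute = 45     # Base.__init__ is never called

complete = Derived()
incomplete = Forgetful()

complete_has_both = (hasattr(complete, 'anattribute') and
                     hasattr(complete, 'anotherattribute'))
incomplete_lacks_base = not hasattr(incomplete, 'anattribute')
```

Instances of Forgetful have anotherattribute but lack anattribute, since nothing ever bound it.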

5.1.6.3 "Deleting" class attributes

Inheritance and overriding provide a simple and effective way to add or modify class attributes (methods) non-invasively (i.e., without modifying the class in which the attributes are defined), by adding or overriding the attributes in subclasses. However, inheritance does not directly support similar ways to delete (hide) base classes' attributes non-invasively. If the subclass simply fails to define (override) an attribute, Python finds the base class's definition. If you need to perform such deletion, possibilities include:

  • Overriding the method and raising an exception in the method's body

  • Eschewing inheritance, holding the attributes elsewhere than in the subclass's __dict__, and defining __getattr__ for selective delegation

  • Using the new-style object model and overriding __getattribute__ to similar effect

The last two techniques here are demonstrated in "__getattribute__" later in this chapter.
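The first technique in the list (overriding the method and raising an exception) might look like the following sketch. Stack and AppendOnlyStack are hypothetical classes, and the choice of AttributeError as the exception to raise is an assumption made here so that callers see the same kind of error as for a genuinely missing attribute:

```python
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

class AppendOnlyStack(Stack):
    def pop(self):
        # "Delete" the inherited method by overriding it with one that fails.
        raise AttributeError("pop is not supported by AppendOnlyStack")

s = AppendOnlyStack()
s.push(1)                      # inherited behavior still works

try:
    s.pop()
    pop_outcome = "popped"
except AttributeError:
    pop_outcome = "refused"
```

Note that the attribute still exists, so a check such as hasattr(s, 'pop') remains true; only the call fails.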


