A module is a Python object with arbitrarily named attributes that you can bind and reference. The Python code for a module named aname normally resides in a file named aname.py, as covered in Section 7.2 later in this chapter.
In Python, modules are objects (values) and are handled like other objects. Thus, you can pass a module as an argument in a call to a function. Similarly, a function can return a module as the result of a call. A module, just like any other object, can be bound to a variable, an item in a container, or an attribute of an object. For example, the sys.modules dictionary, covered later in this chapter, holds module objects as its values.
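As a minimal sketch of the point above, the following snippet treats the standard library module math as an ordinary value. The helper names module_name and pick_module, and the registry dictionary, are hypothetical and exist only for illustration.

```python
import math
import sys


def module_name(module):
    """Accept a module object as an ordinary argument."""
    return module.__name__


def pick_module():
    """Return a module object as the result of a call."""
    return math


# A module can be an item in a container, like any other object.
registry = {"numeric": math}
chosen = pick_module()

print(module_name(registry["numeric"]))
print(chosen is math)
print(sys.modules["math"] is math)
```

The last line relies on the fact that sys.modules maps the name of each loaded module to the module object itself.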
You can use any Python source file as a module by executing an import statement in some other code. import has the following syntax:
import modname [as varname][,...]
The import keyword is followed by one or more module specifiers, separated by commas. In the simplest and most common case, modname is an identifier, the name of a variable that Python binds to the module object when the import statement finishes. In this case, Python looks for the module of the same name to satisfy the import request. For example:
import MyModule
looks for the module named MyModule and binds the variable named MyModule in the current scope to the module object. modname can also be a sequence of identifiers separated by dots (.) that names a module in a package, as covered later in this chapter.
When as varname is part of an import statement, Python binds the variable named varname to the module object, but the module name that Python looks for is modname. For example:
import MyModule as Alias
looks for the module named MyModule and binds the variable named Alias in the current scope to the module object. varname is always a simple identifier.
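Since MyModule is only a placeholder name, a runnable sketch of the as clause can use the standard library module math instead. The variable names m, root, and alias_name are illustrative assumptions.

```python
import math as m  # look for the module named math, bind the variable m to it

root = m.sqrt(16.0)       # attributes are reached through the alias
alias_name = m.__name__   # the module object still knows its real name

print(root)
print(alias_name)
```

The alias changes only the variable that gets bound in the importing scope; the module that Python looks for is still math.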
The body of a module is the sequence of statements in the module's source file. There is no special syntax required to indicate that a source file is a module; any valid source file can be used as a module. A module's body executes immediately the first time the module is imported in a given run of a program. During execution of the body, the module object already exists and an entry in sys.modules is already bound to the module object.
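The behavior described above can be observed with a throwaway module. This is a sketch under stated assumptions: it targets Python 3 (importlib.invalidate_caches does not exist in older releases), and demo_body_mod is a hypothetical module written to a temporary directory purely for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Write a hypothetical module whose body records two facts when it runs.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "demo_body_mod.py"), "w") as f:
    f.write(
        "import sys\n"
        "token = object()  # a fresh object each time the body executes\n"
        "seen_in_sys_modules = sys.modules.get(__name__) is not None\n"
    )

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
try:
    import demo_body_mod          # first import: the body executes
    first_token = demo_body_mod.token
    import demo_body_mod          # second import: the body does not run again
    second_token = demo_body_mod.token
finally:
    sys.path.remove(tmpdir)

print(first_token is second_token)
print(demo_body_mod.seen_in_sys_modules)
```

If the body had executed a second time, token would have been rebound to a new object and the identity check would fail.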
An import statement gives you access to a namespace, the module object, that contains all the attributes of the module. To access an attribute in this namespace, use the name of the module object as a prefix:
import MyModule
a = MyModule.f()
or:
import MyModule as Alias
a = Alias.f()
Most attributes of a module object are bound by statements in the module body. When a statement in the body binds a variable (a global variable), what gets bound is an attribute of the module object. The normal purpose of a module body is exactly that of creating the module's attributes: def statements create and bind functions, class statements create and bind classes, assignment statements bind attributes of any type.
You can also bind and unbind module attributes outside the body (i.e., in other modules), generally using attribute reference syntax M.name (where M is any expression whose value is the module, and identifier name is the attribute name). For clarity, however, it's usually best to bind module attributes in the module body.
The import statement implicitly defines some module attributes as soon as it creates the module object, before the module's body executes. The __dict__ attribute is the dictionary object that the module uses as the namespace for its attributes. Unlike all other attributes of the module, __dict__ is not available to code in the module as a global variable. All other attributes in the module are entries in the module's __dict__, and they are available to code in the module as global variables. Attribute __name__ is the module's name, and attribute __file__ is the filename from which the module was loaded, if any.
For any module object M, any object x, and any identifier string S (except __dict__), binding M.S=x is equivalent to binding M.__dict__['S']=x. An attribute reference such as M.S is also substantially equivalent to M.__dict__['S']. The only difference is that when 'S' is not a key in M.__dict__, accessing M.__dict__['S'] directly raises KeyError, while accessing M.S raises AttributeError instead. Module attributes are also available to all code in the module's body as global variables. In other words, within the module body, S used as a global variable is equivalent to M.S (i.e., M.__dict__['S']) for both binding and reference.
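The equivalence between attribute syntax and the module's dictionary can be checked on an empty module object. In this sketch, M is a hypothetical module created with types.ModuleType rather than loaded from a file, and the attribute names S, T, and missing are placeholders.

```python
import types

M = types.ModuleType("M")  # a hypothetical, empty module object

# Binding through attribute syntax shows up in the dictionary...
M.S = 42
same_object = M.__dict__["S"] is M.S

# ...and binding through the dictionary shows up as an attribute.
M.__dict__["T"] = "bound through the dictionary"
t_value = M.T

# The two access paths differ only in the exception raised for a missing name.
attribute_error = False
key_error = False
try:
    M.missing
except AttributeError:
    attribute_error = True
try:
    M.__dict__["missing"]
except KeyError:
    key_error = True

# Unbinding an attribute removes the dictionary entry as well.
del M.S
s_removed = "S" not in M.__dict__
```

Code outside a module can bind and unbind that module's attributes in exactly this way, although doing so in the module body is usually clearer.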
Python offers several built-in objects (covered in Chapter 8). All built-in objects are attributes of a preloaded module named __builtin__. When Python loads a module, the module automatically gets an extra attribute named __builtins__, which refers either to module __builtin__ or to __builtin__'s dictionary. Python may choose either, so don't rely on __builtins__. If you need to access module __builtin__ directly, use an import __builtin__ statement. Note the difference between the name of the attribute and the name of the module: the former has an extra s. When a global variable is not found in the current module, Python looks for the identifier in the current module's __builtins__ before raising NameError.
The lookup is the only mechanism that Python uses to let your code implicitly access built-ins. The built-ins' names are not reserved, nor are they hardwired in Python itself. Since the access mechanism is simple and documented, your own code can use the mechanism directly (in moderation, or your program's clarity and simplicity will suffer). Thus, you can add your own built-ins or substitute your functions for the normal built-in ones. You can restrict an untrusted module by controlling what built-ins the untrusted module sees (as covered in Chapter 13). The following example shows how you can wrap a built-in function with your own function (__import__ and reload are both covered later in this chapter):
# reload takes a module object; let's make it accept a string as well
import __builtin__
_reload = __builtin__.reload    # save the original built-in
def reload(mod_or_name):
    if isinstance(mod_or_name, str):    # if argument is a string
        mod_or_name = __import__(mod_or_name)    # get the module instead
    return _reload(mod_or_name)    # invoke the real built-in
__builtin__.reload = reload    # override built-in with wrapper
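The lookup order itself (globals first, then built-ins, then NameError) can be observed without modifying the real built-ins. The sketch below is an assumption-laden illustration rather than part of the original example: it uses the Python 3 spelling of the module, builtins, where the text above uses the Python 2 name __builtin__, and it runs small code strings in a hypothetical namespace dictionary that stands in for a module's globals.

```python
import builtins  # Python 3 spelling; the Python 2 module is named __builtin__

# A stand-in for a module's global namespace, with its built-ins supplied.
namespace = {"__builtins__": builtins}

# abs is not a global here, so the lookup falls through to the built-ins.
exec("found_builtin = abs(-3)", namespace)

# Once a global named abs exists, it is found first and hides the built-in.
namespace["abs"] = lambda value: "module-level abs"
exec("found_global = abs(-3)", namespace)

# A name found in neither place raises NameError.
name_error = False
try:
    exec("undefined_name_for_demo", namespace)
except NameError:
    name_error = True

print(namespace["found_builtin"])
print(namespace["found_global"])
```

Because the built-in names are not reserved, binding a global with the same name hides the built-in for that namespace only, and the built-in itself is left untouched.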
If the first statement in the module body is a string literal, the compiler binds that string as the module's documentation string attribute, named __doc__. Documentation strings are also called docstrings. See Section 4.10.3 for more information on docstrings.
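A small sketch can show the compiler binding a docstring. To stay self-contained, it compiles a source string and executes it in the namespace of a hypothetical module object named docdemo instead of importing a file. It assumes Python is not running with the -OO option, which strips docstrings.

```python
import types

# Source text for a hypothetical module whose first statement is a string literal.
source = '"""Utilities for the docstring demo."""\nanswer = 42\n'

docdemo = types.ModuleType("docdemo")
code = compile(source, "docdemo.py", "exec")  # compile as a module body
exec(code, docdemo.__dict__)                  # run the body in the module's namespace

print(docdemo.__doc__)
print(docdemo.answer)
```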
No variable of a module is really private. However, by convention, starting an identifier with a single underscore (_), such as _secret, indicates that the identifier is meant to be private. In other words, the leading underscore communicates to client-code programmers that they should not access the identifier directly.
Development environments and other tools rely on the leading-underscore naming convention to discern which attributes of a module are public (i.e., part of the module's interface) and which ones are private (i.e., to be used only within the module). It is good programming practice to distinguish between private and public attributes by starting the private ones with _, for clarity and to get maximum benefit from tools.
It is particularly important to respect the convention when you write client code that uses modules written by others. In other words, avoid using any attributes in such modules whose names start with _. Future releases of the modules will no doubt maintain their public interface, but are quite likely to change private implementation details.
Python's from statement lets you import specific attributes from a module into the current namespace. from has two syntax variants:
from modname import attrname [as varname][,...]
from modname import *
A from statement specifies a module name, followed by one or more attribute specifiers separated by commas. In the simplest and most common case, attrname is an identifier that names a variable that Python binds to the attribute of the same name in the module named modname. For example:
from MyModule import f
modname can also be a sequence of identifiers separated by dots (.) that names a module within a package, as covered later in this chapter.
When as varname is part of a from statement, Python binds the variable named varname to the attribute, but the module attribute from which the variable gets its value is attrname. For example:
from MyModule import f as foo
attrname and varname are always simple identifiers.
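As a runnable counterpart to the MyModule placeholders, the following sketch imports two attributes from the standard library module math, renaming one of them. The names root and area are illustrative assumptions.

```python
from math import sqrt as root, pi  # two attribute specifiers, one with as

# root and pi are now variables in the importing namespace,
# so no module prefix is needed (or available, since math itself is not bound).
area = pi * root(4.0) ** 2

print(root.__name__)
print(area)
```

The function object keeps its original name, sqrt, since the as clause affects only the variable bound in the importing namespace.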
Code that is directly inside a module body (not in the body of a function or class) may use an asterisk (*) in a from statement:
from MyModule import *
The * requests that all attributes of module modname be bound as global variables in the importing module. When the module has an attribute named __all__, the attribute's value is the list of the attributes that are bound by this type of from statement. Otherwise, this type of from statement binds all attributes of modname except those beginning with underscores. Since from M import * may bind an arbitrary set of global variables, it can have unforeseen and undesired side effects, such as hiding built-ins and rebinding variables you still need. Thus, you should use the * form of from very sparingly and only from modules that are explicitly documented as supporting such usage.
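The two rules for the * form can be compared side by side. This sketch assumes Python 3 and writes two hypothetical modules, star_with_all and star_without_all, to a temporary directory. The helper star_import is also hypothetical: it runs the from statement in a fresh namespace dictionary so that the set of names it binds can be inspected.

```python
import importlib
import os
import sys
import tempfile

tmpdir = tempfile.mkdtemp()


def write_module(name, source):
    """Write a throwaway module file into the temporary directory."""
    with open(os.path.join(tmpdir, name + ".py"), "w") as f:
        f.write(source)


body = "public_a = 1\npublic_b = 2\n_private = 3\n"
write_module("star_with_all", "__all__ = ['public_a']\n" + body)
write_module("star_without_all", body)


def star_import(modname):
    """Return the sorted names that 'from modname import *' binds."""
    namespace = {}
    exec("from %s import *" % modname, namespace)
    namespace.pop("__builtins__", None)  # added by exec, not by the import
    return sorted(namespace)


sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
try:
    with_all = star_import("star_with_all")
    without_all = star_import("star_without_all")
finally:
    sys.path.remove(tmpdir)

print(with_all)
print(without_all)
```

With __all__ present, only the listed name is bound. Without it, every attribute is bound except the one whose name begins with an underscore.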
In general, the import statement is a better choice than the from statement. I suggest you think of the from statement, and particularly from M import *, as conveniences meant only for occasional use in interactive Python sessions. If you always access module M with the statement import M, and always access M's attributes with explicit syntax M.A, your code will be slightly less concise, but far clearer and more readable. from is a good idea only for modules whose documentation explicitly specifies from support (such as module Tkinter, covered in Chapter 16). Another good use of from is to import specific modules from a package, as we'll discuss in Section 7.3 later in this chapter.