14.7 Running Other Programs

The os module offers several ways for your program to run other programs. The simplest way to run another program is through function os.system, although this offers no way to control the external program. The os module also provides a number of functions whose names start with exec. These functions offer fine-grained control. A program run by one of the exec functions, however, replaces the current program (i.e., the Python interpreter) in the same process. In practice, therefore, you use the exec functions mostly on platforms that let a process duplicate itself by fork (i.e., Unix-like platforms). Finally, os functions whose names start with spawn and popen offer intermediate simplicity and power: they are cross-platform and not quite as simple as system, but simple and usable enough for most purposes.

The exec and spawn functions run a specified executable file given the executable file's path, arguments to pass to it, and optionally an environment mapping. The system and popen functions execute a command, a string passed to a new instance of the platform's default shell (typically /bin/sh on Unix, command.com or cmd.exe on Windows). A command is a more general concept than an executable file, as it can include shell functionality (pipes, redirection, built-in shell commands) using the normal shell syntax specific to the current platform.

execl, execle, execlp, execv, execve, execvp, execvpe

execl(path,*args)
execle(path,*args)
execlp(path,*args)
execv(path,args)
execve(path,args,env)
execvp(path,args)
execvpe(path,args,env)

These functions run the executable file (program) indicated by string path, replacing the current program (i.e., the Python interpreter) in the current process. The distinctions encoded in the function names (after the prefix exec) control three aspects of how the new program is found and run:

Does path have to be a complete path to the program's executable file, or can the function also accept just a name as the path argument and search for the executable in several directories, like operating system shells do? execlp, execvp, and execvpe can accept a path argument that is just a filename rather than a complete path. In this case, the functions search for an executable file of that name along the directories listed in os.environ['PATH']. The other functions require path to be a complete path to the executable file for the new program.
Are arguments for the new program accepted as a single sequence argument args to the function or as separate arguments to the function? Functions whose names start with execv take a single argument args that is the sequence of the arguments to use for the new program. Functions whose names start with execl take the new program's arguments as separate arguments (execle, in particular, uses its last argument as the environment for the new program).
Is the new program's environment accepted as an explicit mapping argument env to the function, or is os.environ implicitly used? execle, execve, and execvpe take an argument env that is a mapping to be used as the new program's environment (keys and values must be strings), while the other functions use os.environ for this purpose.

Each exec function uses the first item in args as the name under which the new program is told it's running (for example, argv[0] in a C program's main); only args[1:] are passed as arguments proper to the new program.

popen

popen(cmd,mode='r',bufsize=-1)

Runs the string command cmd in a new process P, and returns a file-like object f that wraps a pipe to P's standard input or from P's standard output (depending on mode). mode and bufsize have the same meaning as for Python's built-in open function, covered in Chapter 10. When mode is 'r' (or 'rb', for binary-mode reading), f is read-only and wraps P's standard output. When mode is 'w' (or 'wb', for binary-mode writing), f is write-only and wraps P's standard input.

The key difference of f with respect to other file objects is the behavior of method f.close. f.close waits for P to terminate, and returns None, as close methods of file-like objects normally do, when P's termination is successful. However, if the operating system associates an integer error code with P's termination indicating that P's termination was unsuccessful, f.close also returns c. Not all operating systems support this mechanism: on some platforms, f.close therefore always returns None. On Unix-like platforms, if P terminates with the system call exit(n) (e.g., if P is a Python program and terminates by calling sys.exit(n)), f.close receives from the operating system, and returns to f.close's caller, the code 256*n.

popen2, popen3, popen4

popen2(cmd,mode='t',bufsize=-1)
popen3(cmd,mode='t',bufsize=-1)
popen4(cmd,mode='t',bufsize=-1)

Each of these functions runs the string command cmd in a new process P, and returns a tuple of file-like objects that wrap pipes to P's standard input and from P's standard output and standard error. mode must be 't' to get file-like objects in text mode, or 'b' to get them in binary mode. On Windows, bufsize must be -1. On Unix, bufsize has the same meaning as for Python's built-in open function, covered in Chapter 10.

popen2 returns a pair (fi,fo), where fi wraps P's standard input (so the calling process can write to fi) and fo wraps P's standard output (so the calling process can read from fo). popen3 returns a tuple with three items (fi,fo,fe), where fe wraps P's standard error (so the calling process can read from fe). popen4 returns a pair (fi,foe), where foe wraps both P's standard output and error (so the calling process can read from foe). While popen3 is in a sense the most general of the three functions, it can be difficult to coordinate your reading from fo and fe. popen2 is simpler to use than popen3 when it's okay for cmd's standard error to go to the same destination as your own process's standard error, and popen4 is simpler when it's okay for cmd's standard error and output to be mixed with each other.

File objects fi, fo, fe, and foe are all normal ones, without the special semantics of the close method as covered for function popen. In other words, there is no way in which the caller of popen2, popen3, or popen4 can learn about P's termination code.

Depending on the buffering strategy of command cmd (which is normally out of your control, unless you're the author of cmd), there may be nothing to read on files fo, fe, and/or foe until your process has closed file fi. Therefore, the normal pattern of usage is something like:

import os
def pipethrough(cmd, list_of_lines):
    fi, fo = os.popen2(cmd, 't')
    fi.writelines(list_of_lines)
    fi.close(  )
    result_lines = fo.readlines(  )
    fo.close(  )
    return result_lines

Functions in the popen group are generally not suitable for driving another process interactively (i.e., writing something, then reading cmd's response to that, then writing something else, and so on). The first time your program tries to read the response, if cmd is following a typical buffering strategy, everything blocks. In other words, your process is waiting for cmd's output but cmd has already placed its pending output in a memory buffer, which your process can't get at, and is now waiting for more input. This is a typical case of deadlock.

If you have some control over cmd, you can try to work around this issue by ensuring that cmd runs without buffering. For example, if cmd.py is a Python program, you can run cmd without buffering as follows:

C:/> python -u cmd.py

Other possible approaches include module telnetlib, covered in Chapter 18, if your platform supports telnet; and third-party, Unix-like-only extensions such as expectpy.sf.net and packages such as pexpect.sf.net. There is no general solution applicable to all platforms and all cmds of interest.

spawnv, spawnve

spawnv(mode,path,args)
spawnve(mode,path,args,env)

These functions run the program indicated by path in a new process P, with the arguments passed as sequence args. spawnve uses mapping env as P's environment (both keys and values must be strings), while spawnv uses os.environ for this purpose. On Unix-like platforms only, there are other variations of os.spawn, corresponding to variations of os.exec, but spawnv and spawnve are the only two that exist on Windows.

mode must be one of two attributes supplied by the os module: os.P_WAIT indicates that the calling process waits until the new process terminates, while os.P_NOWAIT indicates that the calling process continues executing simultaneously with the new process. When mode is os.P_WAIT, the function returns the termination code c of P: 0 indicates successful termination, c less than 0 indicates P was killed by a signal, and c greater than 0 indicates normal but unsuccessful termination. When mode is os.P_NOWAIT, the function returns P's process ID (on Windows, P's process handle). There is no cross-platform way to use P's ID or handle; platform-specific ways (not covered further in this book) include function os.waitpid on Unix-like platforms and the win32all extensions (starship.python.net/crew/mhammond) on Windows.

For example, your interactive program can give the user a chance to edit a text file that your program is about to read and use. You must have previously determined the full path to the user's favorite text editor, such as c:\\windows\\notepad.exe on Windows or /bin/vim on a Unix-like platform. Say that this path string is bound to variable editor, and the path of the text file you want to let the user edit is bound to textfile:

import os
os.spawnv(os.P_WAIT, editor, [textfile])

When os.spawnv returns, the user has closed the editor (whether or not he has made any changes to the file), and your program can continue by reading and using the file as needed.

system

system(cmd)

Runs the string command cmd in a new process, and returns 0 if the new process terminates successfully (or if Python is unable to ascertain the success status of the new process's termination, as happens on Windows 95 and 98). If the new process terminates unsuccessfully (and Python is able to ascertain this unsuccessful termination), system returns an integer error code not equal to 0.