Friday, 12 December 2014

Process, Subprocess and Python

A process is an instance of a computer program being executed.
When you run a program, an instance of it is created. The instance contains the program code and system state such as the pointer to the next instruction to be executed, memory, etc. When you run the same program more than once concurrently, you are actually creating multiple instances of it.
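
As a rough illustration of "multiple instances of the same program", the sketch below starts the same one-line Python program twice using the subprocess module (covered later in this post) and compares the process IDs the two instances report. The one-liner and the variable names are made up for this example.

```python
import subprocess
import sys

# The same tiny program, launched twice: each run prints its own process ID.
cmd = [sys.executable, "-c", "import os; print(os.getpid())"]

first = subprocess.Popen(cmd, stdout=subprocess.PIPE)
second = subprocess.Popen(cmd, stdout=subprocess.PIPE)

# communicate() waits for the child and returns (stdout, stderr).
pid_a = int(first.communicate()[0].decode().strip())
pid_b = int(second.communicate()[0].decode().strip())

print(pid_a, pid_b)  # two different process IDs, one per instance
```

Both instances run the same code, but the operating system tracks them as two separate processes.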

A subprocess is a set of activities in a logical sequence that serve a clear purpose. It is a process in its own right, whose functionality is part of a larger process. In programming terms, a new process that a program creates is called a subprocess.

Straight from the Python docs:

The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.

As mentioned earlier, a subprocess carries its own system state, so note in the above definition that every subprocess has its own input/output/error pipes.

The subprocess module spawns new processes, but aside from stdin/stdout and whatever other APIs the other program may implement, you have no means to communicate with them. Its main purpose is to launch processes that are completely separate from your own program.
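
To make the pipes and the return code concrete, here is a minimal sketch. The child program is a small made-up Python snippet that reads from its stdin, writes to its stdout and stderr, and exits with code 3; the parent talks to it only through those pipes.

```python
import subprocess
import sys

# Hypothetical child program: upper-cases its input, warns on stderr, exits with 3.
child_code = (
    "import sys\n"
    "data = sys.stdin.read()\n"
    "sys.stdout.write(data.upper())\n"
    "sys.stderr.write('warning from child')\n"
    "sys.exit(3)\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,   # parent writes to the child's stdin
    stdout=subprocess.PIPE,  # parent reads the child's stdout
    stderr=subprocess.PIPE,  # parent reads the child's stderr
)

# Send input, close stdin, read both output pipes and wait for the child to exit.
out, err = proc.communicate(b"hello")

print(out)              # b'HELLO'
print(err)              # b'warning from child'
print(proc.returncode)  # 3
```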

In Python, the subprocess module lets you run programs other than your own. For example, suppose you wrote a Python installer for a package you made, and the package has certain dependencies. You can either tell the user to install them manually, or install them yourself by creating a subprocess like:

import subprocess
subprocess.call(["sudo","apt-get","install","<package-name>"]) 

Note that the argument can be a single string or a sequence, but how a single string is interpreted is platform dependent. So I think it is better to pass the arguments as a sequence.
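
A small sketch of why the sequence form is convenient: each element reaches the program as exactly one argument, so an argument containing a space needs no quoting. The child one-liner here is only for illustration.

```python
import subprocess
import sys

# The child prints its first command-line argument.
# "hello world" is one list element, so it arrives as one argument.
cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", "hello world"]

received = subprocess.check_output(cmd).strip()
print(received)  # b'hello world'
```
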
call(), check_call() and check_output() are all convenience functions built on the Popen interface. call() returns the exit code of the program, and you need to check it yourself to see whether there was an error. check_call(), in contrast, raises an exception (CalledProcessError) if the exit code is non-zero. check_output() also raises an exception on a non-zero exit code and additionally returns the program's output as a byte string.
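
The difference between the three is easiest to see side by side. This is a minimal sketch using two made-up child commands, one that succeeds and one that exits with code 2.

```python
import subprocess
import sys

ok = [sys.executable, "-c", "print('hi')"]
fail = [sys.executable, "-c", "import sys; sys.exit(2)"]

# call() just hands back the exit code; checking it is up to you.
rc = subprocess.call(fail)

# check_call() raises CalledProcessError on a non-zero exit code.
try:
    subprocess.check_call(fail)
    raised = False
except subprocess.CalledProcessError as exc:
    raised = True
    failed_code = exc.returncode

# check_output() returns the child's stdout as bytes (and raises on failure too).
output = subprocess.check_output(ok)

print(rc, raised, failed_code, output.strip())
```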

Now, when should you use these convenience functions and when the basic Popen? To summarize: the convenience functions are wrappers around the Popen interface that, by default, wait for the subprocess to finish, i.e. they block further execution of the main process until the subprocess has either finished or timed out. Use Popen directly when you want the main process to keep running while the subprocess works, or when you need to interact with its pipes. You can look at the implementation of subprocess.py here.
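
A minimal sketch of the non-blocking behaviour of Popen, assuming a made-up child that simply waits until its stdin is closed: Popen returns right away, poll() reports that the child is still running, and the parent decides when to wait for it.

```python
import subprocess
import sys

# The child blocks until its stdin is closed, so it is still alive after Popen returns.
proc = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdin.read()"],
    stdin=subprocess.PIPE,
)

# Popen did not block: poll() returns None while the child is still running.
still_running = proc.poll() is None

# ... the main process is free to do other work here ...

proc.stdin.close()                 # let the child finish
exit_code = proc.wait(timeout=30)  # now block until it has exited

print(still_running, exit_code)
```

With call() in place of Popen, the main process would have been stuck at the first line until the child exited.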
