February 28, 2012

Importing modules in Python

I was factoring some Python code out into modules, and it piqued my curiosity about how modules work, and in particular how importing them works. More specifically, how are circular dependencies between modules handled? For example, if mod_a imports mod_b and mod_b in turn imports mod_a, what happens?

So, I decided to dig deeper. Here are two simple modules:

mod_a.py:
import mod_b

var_a = 10
print("var_a seen from mod_a is: %d" % (var_a))
print("var_b seen from mod_a is: %d" % (mod_b.var_b))

mod_b.py:
import mod_a

var_b = 100
print("var_a seen from mod_b is: %d" % (mod_a.var_a))
print("var_b seen from mod_b is: %d" % (var_b))

What happens when you run mod_a.py? It fails with an AttributeError at the line that prints mod_b.var_b, even though mod_b is imported at the top of the file.

Here is why it fails. When mod_a is run, the first thing it does is import mod_b. When Python imports a module, it first checks whether that module has already been loaded (the cache is the dictionary sys.modules). If not, Python creates a module object, registers it in the cache, and then evaluates the module's code line by line. Note that the registration happens before the evaluation, so a module that is still being evaluated already counts as loaded. If the module itself imports other modules, the process becomes recursive: at each import, evaluation of the current module pauses and control transfers to the module being imported.
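To see the "already evaluated" check in action, here is a small sketch. It assumes Python 3, and the module name demo_once is made up purely for the illustration. The cache Python consults is the dictionary sys.modules.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Create a throwaway module in a temporary directory.
# The name "demo_once" is hypothetical, chosen only for this sketch.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_once.py"), "w") as f:
    f.write('print("evaluating demo_once")\n')

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()  # make sure the freshly written file is noticed

# Import the module twice and capture whatever its body prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    first = importlib.import_module("demo_once")
    second = importlib.import_module("demo_once")

evaluations = buffer.getvalue().count("evaluating demo_once")
print(evaluations)                        # 1: the body ran only once
print(first is second)                    # True: same cached module object
print(sys.modules["demo_once"] is first)  # True: the cache holds that object
```

The second import never touches the file again; it just hands back the object stored in the cache.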

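In the above example, when mod_a starts executing, it encounters the import of mod_b, so evaluation transfers to mod_b. However, mod_b in turn imports mod_a. Because mod_a.py was started as a script, it is registered under the name __main__ rather than mod_a, so Python does not find mod_a among the loaded modules and starts evaluating mod_a.py a second time, this time as the module mod_a. In this second evaluation, the import of mod_b does not start over: mod_b is already registered as loaded, even though its evaluation is not finished, so the name mod_b is simply bound to that half-initialized module (otherwise the two imports would recurse forever!).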
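The key detail here is that a module counts as loaded while its body is still running. A rough way to check this, assuming Python 3 and a made-up module name demo_partial, is to let a module inspect its own entry in sys.modules halfway through its evaluation:

```python
import importlib
import os
import sys
import tempfile

# The module name "demo_partial" is hypothetical, used only for this sketch.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_partial.py"), "w") as f:
    f.write(
        "import sys\n"
        "# Is this module already in the cache while its body is running?\n"
        "registered_early = __name__ in sys.modules\n"
        "early = 1\n"
        "# Names assigned so far are visible on the cached module object,\n"
        "# names assigned further down are not there yet.\n"
        "seen_early = hasattr(sys.modules[__name__], 'early')\n"
        "seen_late = hasattr(sys.modules[__name__], 'late')\n"
        "late = 2\n"
    )

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
demo_partial = importlib.import_module("demo_partial")

print(demo_partial.registered_early)  # True
print(demo_partial.seen_early)        # True
print(demo_partial.seen_late)         # False
```

Anything that imports such a module in the middle of its evaluation gets exactly this half-filled object, which is the situation mod_a finds itself in with mod_b.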

During this evaluation of mod_a, var_a is initialized and printed. When it gets to printing the value of var_b, which is assigned in mod_b, it fails with an AttributeError complaining that mod_b has no attribute named var_b. See why? The evaluation of mod_b was paused at its import of mod_a and never reached the assignment of var_b.
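If you would rather reproduce the failure in one go, the following sketch writes the two modules into a temporary directory and runs mod_a.py in a child interpreter. It assumes Python 3.7 or later, so print is written as a function call; the temporary directory and the result variable are my own scaffolding, not part of the modules.

```python
import os
import subprocess
import sys
import tempfile

MOD_A = (
    "import mod_b\n"
    "\n"
    "var_a = 10\n"
    'print("var_a seen from mod_a is: %d" % (var_a))\n'
    'print("var_b seen from mod_a is: %d" % (mod_b.var_b))\n'
)
MOD_B = (
    "import mod_a\n"
    "\n"
    "var_b = 100\n"
    'print("var_a seen from mod_b is: %d" % (mod_a.var_a))\n'
    'print("var_b seen from mod_b is: %d" % (var_b))\n'
)

# Write both files side by side so that each can import the other.
tmp_dir = tempfile.mkdtemp()
for name, source in (("mod_a.py", MOD_A), ("mod_b.py", MOD_B)):
    with open(os.path.join(tmp_dir, name), "w") as f:
        f.write(source)

# Run mod_a.py as a script, capturing its output and its traceback.
result = subprocess.run(
    [sys.executable, os.path.join(tmp_dir, "mod_a.py")],
    capture_output=True,
    text=True,
)

print(result.returncode != 0)             # True: the script crashed
print(result.stdout, end="")              # var_a seen from mod_a is: 10
print("AttributeError" in result.stderr)  # True
```

Only one line makes it to standard output before the traceback: the print of var_a from the evaluation of mod_a that mod_b triggered.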

Now try moving the assignment var_b = 100 in mod_b.py to before the import mod_a statement. You will see the output from mod_a, followed by the output from mod_b, finally followed by the output from mod_a again. The execution flow is as follows: running mod_a.py transfers to the evaluation of mod_b, which transfers to the evaluation of mod_a as an imported module. When these evaluations complete, control flows back to the original invocation of mod_a.py, which runs the rest of the file and produces the second round of mod_a output.
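Here is the reordered version as a runnable sketch, under the same assumptions as before (Python 3.7 or later, print as a function, files written to a temporary directory of my own choosing). The only change to the modules is that var_b is assigned before mod_b imports mod_a.

```python
import os
import subprocess
import sys
import tempfile

MOD_A = (
    "import mod_b\n"
    "\n"
    "var_a = 10\n"
    'print("var_a seen from mod_a is: %d" % (var_a))\n'
    'print("var_b seen from mod_a is: %d" % (mod_b.var_b))\n'
)
# var_b is now assigned BEFORE the circular import.
MOD_B = (
    "var_b = 100\n"
    "\n"
    "import mod_a\n"
    "\n"
    'print("var_a seen from mod_b is: %d" % (mod_a.var_a))\n'
    'print("var_b seen from mod_b is: %d" % (var_b))\n'
)

tmp_dir = tempfile.mkdtemp()
for name, source in (("mod_a.py", MOD_A), ("mod_b.py", MOD_B)):
    with open(os.path.join(tmp_dir, name), "w") as f:
        f.write(source)

result = subprocess.run(
    [sys.executable, os.path.join(tmp_dir, "mod_a.py")],
    capture_output=True,
    text=True,
)

print(result.returncode)  # 0
print(result.stdout, end="")
# var_a seen from mod_a is: 10
# var_b seen from mod_a is: 100
# var_a seen from mod_b is: 10
# var_b seen from mod_b is: 100
# var_a seen from mod_a is: 10
# var_b seen from mod_a is: 100
```

The first pair of lines comes from mod_a evaluated as an imported module, the second pair from mod_b, and the last pair from the original run of mod_a.py as a script.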

Now I know exactly what goes on under the hood when I import modules in Python.