Importing Functions From DevTesting Jupyter Notebooks

One of the ways I use Jupyter notebooks is as sketchbooks in which some code cells are used to develop useful functions and others are used as “in-passing” develop’n’test cells that include code fragments on the way to becoming useful as part of a larger function.

Once a function has been developed, it can be a pain getting it into a form where I can use it in other notebooks. One way is to copy the function code into a separate Python file that can be imported into another notebook, but if the function code needs updating, this means changing it in both the Python file and the documenting notebook, which can lead to the two versions of the function drifting apart.

Recipes such as Importing Jupyter Notebooks as Modules provide a means for importing the contents of a notebook as a module, but they do so by executing all code cells.

So how can we get round this, loading – and executing – just the “exportable” cells, such as the ones containing “finished” functions, and ignoring the cruft?

I was thinking it might be handy to define some code cell metadata (‘exportable’:boolean, perhaps), that I could set on a code cell to say whether that cell was exportable as a notebook-module function or just littering a notebook as a bit of development testing.

The notebook-as-module recipe would then test to see whether a notebook cell was not just a code cell, but an exportable code cell, before running it. The metadata could also hook into a custom template that could export the notebook as python with the code cells set to exportable:False commented out.
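As a rough sketch of what that metadata check might look like: the loader below walks the cells of a notebook (represented here as the plain dictionary you get from reading a version 4 .ipynb JSON file) and executes only the code cells flagged as exportable. The 'exportable' key is just my hypothetical flag, not part of the notebook format, and load_exportable is an illustrative name rather than anything the import recipe provides; it also makes no attempt to handle IPython magics.

```python
import types


def load_exportable(notebook, module_name):
    """Build a module from the code cells flagged as exportable.

    `notebook` is a dict in the shape of a version 4 .ipynb file.
    The 'exportable' metadata key is a hypothetical convention.
    """
    module = types.ModuleType(module_name)
    for cell in notebook.get("cells", []):
        # Skip markdown and raw cells.
        if cell.get("cell_type") != "code":
            continue
        # Skip code cells that have not been explicitly flagged.
        if not cell.get("metadata", {}).get("exportable", False):
            continue
        source = cell.get("source", "")
        # Cell source may be stored as a list of lines or a single string.
        if isinstance(source, list):
            source = "".join(source)
        exec(source, module.__dict__)
    return module


# A tiny in-memory notebook: one exportable function, one bit of cruft.
demo_notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"exportable": True},
         "source": ["def double(x):\n", "    return 2 * x\n"]},
        {"cell_type": "code", "metadata": {},
         "source": ["scratch = double(21)\n"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["Some notes"]},
    ]
}

mod = load_exportable(demo_notebook, "demo_notebook")
```

Here mod ends up with the double function but not the scratch variable, because the second cell was never run.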

But this is overly complicated and hides the difference between exportable and extraneous code cells in the metadata field. As Johannes Feist pointed out to me in the Jupyter Google group, we can actually use a feature of the import recipe machinery to mask out the content of certain code cells. He suggested:

what I have been doing for this case is the “standard” python approach, i.e., simply guard the part that shouldn’t run upon import with if __name__=='__main__': statements. When you execute a notebook interactively, __name__ is defined as '__main__', so the code will run, but when you import it with the hooks you mention, __name__ is set to the module name, and the code behind the if doesn’t run.

Johannes also comments that “Of course it makes the notebook look a bit more ugly, but it works well, allows to develop modules as notebooks with included tests, and has the advantage of being immediately visible/obvious (as opposed to metadata).”
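To see the guard at work without needing an actual notebook or import hook, here is a small sketch that executes the same bit of cell source twice: once in a namespace named '__main__' (standing in for running the cell interactively) and once in a namespace with a module name (standing in for an import). The cell contents, the sector_delta function and the 'my_notebook' module name are all made up for illustration.

```python
import types

# Hypothetical cell source: a "finished" function plus guarded dev/test cruft.
CELL_SOURCE = '''
def sector_delta(a, b):
    """Return the difference between two sector times."""
    return a - b

guard_ran = False
if __name__ == '__main__':
    # Development cruft: runs interactively, skipped on import.
    guard_ran = True
'''


def run_as(name):
    """Execute the cell source in a fresh module namespace called `name`."""
    module = types.ModuleType(name)
    exec(CELL_SOURCE, module.__dict__)
    return module


interactive = run_as('__main__')   # mimics running the cell in the notebook
imported = run_as('my_notebook')   # mimics an import hook loading the notebook
```

In the first case guard_ran is set to True; in the second it stays False, but sector_delta is still defined and available on the module.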

In my own workflow, I often make use of the ability to display as code cell output whatever value is returned from the last item in a code cell. Guarding code with the if statement prevents the output of the last code item in the guarded block from being displayed. However, passing a variable to the display() function as the last line of the guarded block displays the output as before.
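The reason is that the notebook only echoes a bare expression when it is the last top-level statement in the cell, and an expression tucked inside an if block doesn't count. A minimal sketch of the workaround, using a made-up lap_delta function as the "finished" code; the fallback to print is only there so the snippet still runs outside IPython:

```python
try:
    from IPython.display import display
except ImportError:
    # Outside IPython/Jupyter, fall back to plain printing.
    display = print


def lap_delta(a, b):
    """Pairwise differences between two lists of lap times."""
    return [round(x - y, 3) for x, y in zip(a, b)]


if __name__ == '__main__':
    deltas = lap_delta([3, 5], [1, 2])
    deltas           # a bare expression inside the guard is not echoed
    display(deltas)  # an explicit display() call renders the output
```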


So now I have a handy workflow for writing sketch notebooks containing useful functions + cruft, from which I can load just the useful functions into another notebook. Thanks, Johannes :-)

PS see also this hint from Doug Blank about building up a class across several notebook code cells:

Cell 1:

class MyClass():
    def method1(self):
        print("method1")

Cell 2:

class MyClass(MyClass):
    def method2(self):
        print("method2")

Cell 3:

instance = MyClass()
instance.method1()
instance.method2()

(WordPress really struggles with: a) markdown; b) code; c) markdown and code.)

See also: https://github.com/ipython/ipynb

PPS this also looks related but I haven’t tried it yet: https://github.com/deathbeds/importnb

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...