Fragment: From Jupyter Notebooks to Online Interactive Textbooks

One of the things on my to-do list is to try out a workflow for publishing something like a web-book, along the lines of Gitbook, from Jupyter notebook source documents.

Ish via Simon Willison (again?!), I note a recipe used to publish the Computational and Inferential Thinking: The Foundations of Data Science online textbook for the Berkeley course Data 8: The Foundations of Data Science. This uses Jupyter notebooks for the chapter source documents, and Jekyll to build the book from markdown files generated from the notebooks; a Python script / command line utility generates the markdown files from the original Jupyter notebooks: generate_textbook.py [data-8/textbook repo]. A switch allows notebooks to be executed, or not, before the markdown is generated. Another script (generate_summary_from_folders.py) can be used to create a summary.md page, which scaffolds the notebook/chapter order, if one is not provided. Image paths required by Jekyll are set automatically. Another nice feature is that the generated book can also link to Binder-run versions of the source notebooks.
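To get a feel for what the notebook-to-markdown step involves, here's a minimal sketch using only the Python standard library. To be clear, this is not the generate_textbook.py script, and nothing in it is copied from that repo: the function names (notebook_to_markdown, convert_file, summary_from_chapters) are made up for illustration, the summary layout assumes a Gitbook-style bulleted list of links, and nothing here executes the notebook or handles cell outputs and images. A real workflow would more likely lean on something like nbconvert.

```python
import json
from pathlib import Path

# Build the fence marker programmatically so this listing can itself be fenced.
FENCE = "`" * 3


def notebook_to_markdown(nb):
    """Flatten a parsed nbformat 4 notebook (a dict) into one markdown string.

    Markdown cells pass through unchanged; code cells become fenced blocks.
    Cell outputs are ignored in this sketch.
    """
    # Use the notebook's recorded language (if any) to label code fences.
    lang = nb.get("metadata", {}).get("language_info", {}).get("name", "")
    chunks = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows cell source to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if not source.strip():
            continue  # skip empty cells
        if cell.get("cell_type") == "markdown":
            chunks.append(source)
        elif cell.get("cell_type") == "code":
            chunks.append(FENCE + lang + "\n" + source + "\n" + FENCE)
    return "\n\n".join(chunks) + "\n"


def convert_file(src, dst):
    """Read an .ipynb file from src and write its markdown rendering to dst."""
    nb = json.loads(Path(src).read_text(encoding="utf-8"))
    Path(dst).write_text(notebook_to_markdown(nb), encoding="utf-8")


def summary_from_chapters(chapters):
    """Build a summary page from (title, path) pairs, in reading order.

    The bulleted-links layout is an assumption modelled on Gitbook summaries.
    """
    lines = ["# Summary", ""]
    for title, path in chapters:
        lines.append(f"* [{title}]({path})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A tiny in-memory notebook standing in for a real chapter file.
    demo = {
        "metadata": {"language_info": {"name": "python"}},
        "cells": [
            {"cell_type": "markdown", "source": ["# Chapter 1\n", "Some text."]},
            {"cell_type": "code", "source": "print(1 + 1)", "outputs": []},
        ],
    }
    print(notebook_to_markdown(demo))
    print(summary_from_chapters([("Chapter 1", "chapters/01.md")]))
```

Keeping the conversion as a pure function over the parsed JSON, with file handling split out, makes it easy to bolt an "execute first" switch on in front of it later.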

From the textbook repo, it looks as if the recipe could quite easily be separated from the actual book. It’d be nice to see this in the form of a minimal repo that can be downloaded and used to bootstrap a new book published using the same workflow. And indeed, a template repo from which you can bootstrap your own textbook is available here: choldgraf/textbooks-with-jupyter. There’s also a handy deployment guide.

Another possible approach would be to use a Gitbook workflow. It looks like there was an early attempt at this here: Jupyter WebBook [repo].

One of the problems with separating the static HTML site from the runnable notebooks is that the user is forced to move from one environment (the static HTML site) to another (the BinderHub environment) in order to run any interactive elements. A preferable approach is to run interactives within the HTML textbook environment; one way of doing this is to use something like ThebeLab, which has been demonstrated in early form (see also Juniper, which also lets users “edit and execute code snippets in the browser using Jupyter kernels”, and the voila “interactive renderer for Jupyter notebooks”). Whilst this is great for one-off pages, ThebeLab is not built into a simple publishing workflow.
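The hop from static page to Binder is, mechanically, just a link. Here's a rough sketch of how such a launch link might be put together, assuming the public mybinder.org URL pattern for GitHub repos (/v2/gh/<org>/<repo>/<ref>?filepath=<path>). The function name, the example org/repo and the default branch are placeholders of mine, not anything taken from the Data 8 build scripts.

```python
from urllib.parse import quote


def binder_url(org, repo, notebook_path, ref="master", hub="https://mybinder.org"):
    """Return a Binder launch link for one notebook in a GitHub repo.

    org, repo     -- GitHub organisation/user and repository name
    notebook_path -- path to the notebook, relative to the repo root
    ref           -- branch, tag or commit (default here is an assumption)
    hub           -- base URL of the BinderHub deployment
    """
    # Slashes inside a ref (e.g. "feature/x") must be escaped so they are
    # not read as extra path segments; slashes in the file path are left alone.
    safe_ref = quote(ref, safe="")
    safe_path = quote(notebook_path)
    return f"{hub}/v2/gh/{org}/{repo}/{safe_ref}?filepath={safe_path}"


if __name__ == "__main__":
    # Hypothetical repo and notebook, purely for illustration.
    print(binder_url("example-org", "example-book", "notebooks/intro.ipynb"))
```

A book build script could call something like this once per chapter and drop the result into a "launch" button on the generated page, which is exactly the change of environment that the in-page approaches try to avoid.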

However, it looks like the new-to-me nbinteract [repo] package does provide a simpler way of publishing interactive webpages from Jupyter notebooks, with HTML pages hosted wherever and interactives powered from a Binder backend. nbinteract can generate multiple HTML pages in one go from notebooks contained in a single directory, although at the moment it doesn’t look as if it can produce a structured output such as a Gitbook website. (By the by, a simple guide to publishing nbinteract powered books on GitHub Pages is also provided.)

One of the nice features of nbinteract is that if you have interactive widgets across multiple HTML documents, once a BinderHub instance has started it can provide backend services to other pages viewed in the same session, so you don’t have to repeatedly wait for further instances to launch.

A Master’s thesis about nbinteract is available here: nbinteract: Generate Interactive Web Pages From Jupyter Notebooks, Samuel Lau.

The nbinteract docs site is built using Gitbook, so there may be a workflow / recipe for creating nbinteract powered Gitbooks available somewhere? Indeed, the online interactive textbook for another Berkeley course, DS100 – Principles and Techniques of Data Science [repo] – uses both Gitbook and nbinteract, so the setup guide may be worth a look…

Author: Tony Hirst
