Opening Up Access to Jupyter Notebooks: Serverless Computational Environments Using JupyterLite

A couple of weeks ago, I started playing with JupyterLite, which removes the need for an external Jupyter server and lets you use JupyterLab or RetroLab (the new name for the JupyterLab classic styled notebook UI) purely in the browser, using a reasonably complete Python kernel that also runs in the browser.

Yesterday, I had a go at porting over some notebooks we’ve used for several years for some optional activities in a first-year undergraduate equivalent course. You can try them out here:

You can also try out an online HTML textbook version that does require an external server, in the demo case, launched on demand using MyBinder, from here:

The notebooks were originally included in the course as a low-risk proof of concept of how we might make use of notebooks in the course. Although Python is the language taught elsewhere in the module, engagement with it is through the IDLE environment, with no dependencies other than the base Python install: expecting students to install a Jupyter server, howsoever, was a no-no. The optional and only limited use of notebooks meant we could also prove a hosted notebook solution using JupyterHub and Kubernetes in a very light touch way: authentication via a Moodle VLE LTI link gave users preauthenticated access to a JupyterHub server from where students could run the provided notebooks. The environment was not persistent though: if students wanted to save their notebooks to work on them at a future time, they had to export the notebooks and then re-upload them in their next session. We were essentially running just a temporary notebook server. The notebooks were also designed to take this into account, i.e. the activities were intended to be relatively standalone and self-contained, and completable in a short study session.

To the extent that the rest of the university paid no attention, it would be wrong to class this as innovation. On the one hand, the approach we used was taken off-the-shelf (Zero to JupyterHub with Kubernetes), although some contributed docs did result (using the JupyterHub LTI authenticator with Moodle). The deployment was achieved very much as a side project, using personal contacts and an opportunity to deploy it outside of formal project processes and procedures and before anyone realised what had actually just happened. (I suspect they still don’t.) On the other, whilst it worked and did the job required of it, it influenced nothing internally and had zero internal impact other than meeting the needs of several thousand students. And whilst it set a precedent, it wasn’t really one we ever managed to build directly from or invoke to smooth some later campaign.

As well as providing the hosted solution, we also made the environment available via MyBinder, freeloading on that service to provide students with an environment that they could access outside of university systems. This is important because it meant, and means, that access remains available to students at the end of the course. Unlike traditional print based models of distance education, where students get a physical copy of course materials they can keep forever, the online first approach that dominates now means that students lose access to the online materials after some cut-off point. So much for being able to dig that old box out of the loft containing your lecture notes and university textbooks. Such is the life of millennials, I guess: a rented, physical artefactless culture. Very much the new Dark Age, as artist James Bridle has suggested elsewhere.

But is there a better way? Until now, running a Jupyter notebook has required access to a Jupyter server. At this point, it’s worth clarifying a key point. Jupyter is not notebooks. At its core, Jupyter is a set of protocols that provide access to arbitrary computational environments, that can run arbitrary code in those environments, and that can return the outputs of that code execution under a REPL (read-eval-print loop) model. The notebooks (or JupyterLab) are just a UI layer. (Actually, they’re a bit more interesting than that, as are ipywidgets, but that’s maybe something for another post.)
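To make the "protocol, not notebook" point a bit more concrete, here is a minimal sketch of the request/reply idea. It is not the real Jupyter wire protocol or a real kernel: ToyKernel and make_request are hypothetical names invented for this illustration, and the message dictionaries only borrow the general shape (a header with a msg_type, a parent_header, and a content payload) and the message type names execute_request, execute_result, error and execute_reply from the Jupyter messaging specification.

```python
import ast
import uuid


def make_request(code):
    """Build a simplified execute_request-style message (hypothetical helper)."""
    return {
        "header": {"msg_id": uuid.uuid4().hex, "msg_type": "execute_request"},
        "parent_header": {},
        "content": {"code": code},
    }


class ToyKernel:
    """A toy stand-in for a computational environment; NOT a real Jupyter kernel."""

    def __init__(self):
        self.namespace = {}       # state persists between requests, as in a REPL
        self.execution_count = 0

    def _message(self, msg_type, request, content):
        # Each reply points back at the request that caused it.
        return {
            "header": {"msg_id": uuid.uuid4().hex, "msg_type": msg_type},
            "parent_header": request["header"],
            "content": content,
        }

    def handle(self, request):
        """Run the code in the request and return a list of reply messages."""
        self.execution_count += 1
        replies = []
        try:
            tree = ast.parse(request["content"]["code"])
            # If the final statement is an expression, evaluate it separately
            # so that its value can be reported back (the "print" in REPL).
            last_expr = None
            if tree.body and isinstance(tree.body[-1], ast.Expr):
                last_expr = ast.Expression(tree.body.pop().value)
            exec(compile(tree, "<cell>", "exec"), self.namespace)
            if last_expr is not None:
                value = eval(compile(last_expr, "<cell>", "eval"), self.namespace)
                if value is not None:
                    replies.append(self._message("execute_result", request, {
                        "execution_count": self.execution_count,
                        "data": {"text/plain": repr(value)},
                    }))
            status = "ok"
        except Exception as err:
            replies.append(self._message("error", request, {
                "ename": type(err).__name__,
                "evalue": str(err),
            }))
            status = "error"
        replies.append(self._message("execute_reply", request, {
            "status": status,
            "execution_count": self.execution_count,
        }))
        return replies
```

Any front end that can build the request and render the replies could sit on top of something like this, which is the sense in which the notebook is "just a UI layer": the client never needs to know where, or in what language, the code actually ran.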

So, the server and the computational environment. The Jupyter server is the thing that provides access to the computational environment. And the computational environment has typically needed to run “somewhere else”, as far as the notebook or JupyterLab UI is concerned. This could be on a remote hosted server somewhere in the cloud or provided by your institution, or in the form of an environment that exists and runs from your own desktop or laptop computer.

What JupyterLite neatly does is bring all these components into the browser. No longer does the notebook client, the user interface, need to connect to a computational environment running elsewhere, outside the browser. Now, everything can run inside the browser. (There is one niggle to this: you need to use a webserver to initially deliver everything into the browser, but that can be any old webserver that might also be serving any other old website.)
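As a rough illustration of the "any old webserver" point, the sketch below serves a directory of static files using nothing but the Python standard library. The function name start_static_server is made up for this example, and the directory you pass in is assumed to already contain a built static site (for instance, the output of a JupyterLite build); nothing here is specific to JupyterLite itself.

```python
import functools
import http.server
import threading


def start_static_server(directory, port=0):
    """Serve the files in `directory` over HTTP on localhost.

    Hypothetical helper for illustration. With port=0 the operating system
    picks a free port; read it back from server.server_address[1].
    """
    # SimpleHTTPRequestHandler just maps URL paths onto files in a directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Run the server in a background thread so the caller is not blocked.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling start_static_server with the path to your site directory returns a running server; call its shutdown() and server_close() methods when you are done. The server only ever hands over static files. It runs none of the user's code, which is why any host that can publish ordinary web pages should be able to do the same job.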

Now, I see this as A Good Thing, particularly in open online education where you want learners to be able to do computational stuff or benefit from interactions or activities that require some sort of computational effort on the back end, such as some intelligent tutoring thing that responds to what you’ve just done. But a blocker in open ed has always been: how is that compute provided?

Typically, you either need the learner to install and run something (and this is something that does not scale well in terms of the amount of support you have to provide, because some people will need (a lot of!) support), or you need to host or otherwise provide access to the computational environment. And resource it. And probably also support user authentication. And hence also user registration. And then either keep it running, or prevent folk from accessing it from some unspecified date in the future.

What this also means is that whilst you might reasonably expect folk who want to do computing for computing’s sake to take enough of an interest, and be motivated enough, to install a computing environment of their own to work in, for folk who want to use computing to get stuff done, or who just want to work through some materials without already having some skills in installing and running software (on their own computer), it all becomes a bit too much, a bit too involved, and let’s just not bother. (A similar argument holds when hosting software: consider the skills required to deploy and manage end-user facing software on a small network (think of the have-a-go teacher who looks after the primary school computer network), or the more considerable skills (and resource) required to deploy environments in a large university with lots of formalised IT projects and processes. When “just do it” becomes a project with planning and meetings and all manner of institutional crap, it can quickly become “just what is the point in even trying to do any of this?!”)

Which is where running things in the browser makes it easier. No install required. Just publish files via your webserver in the same way you would publish any other web pages. And once the user has opened their page in the browser, that’s you done with it. They can run the stuff offline, on their own computer, on a train, in the garden, in their car whilst waiting for a boat. And they won’t have had to install anything. And nor will you.

Try it here:

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
