When I first came across JupyterLite nine months or so ago (jupyterlite — “serverless” Jupyter In the Browser Using Pyodide and WASM), one of my first thoughts was whether I could use it as the programming environment for an open online course / OER that makes use of Jupyter notebooks.
Working with novices, at scale, at a distance, online, and ideally without support, raises various challenges. Trying to create materials that can run anywhere – via an open notebook server (eg Binderhub), via a local install, or even via JupyterLite – raises other issues: ideally, you want exactly the same notebook to work in exactly the same way wherever it’s being run. Other issues come from learners losing their work, working from different machines and browsers at different times, working offline (no network access), etc.
At the time, there were several blockers for me when it came to adopting JupyterLite as just another environment, blockers that are still present today. So as JupyterLite is brought to wider attention via a recent post on the official Jupyter blog – Jupyter Everywhere – and associated social media sharing, this is just a note-to-self as to why I still haven’t got round to updating the OpenLearn Learn to Code for Data Analysis course to use JupyterLite.
Note that this isn’t intended as a criticism of the JupyterLite devs or dev process. There may well be solutions or workarounds that I haven’t come across. I’m just making observations as an everyman who thinks “ooh, I could use it for this” and then realises I can’t, quite, which means I can’t, at all (blocker, innit!;-). I should also state that my default use case is an extreme one: large populations of naive learners & novice programmers working largely unsupported, online and offline, potentially across several different BYOD or public access machines that may be unpatched & years old, on courses that are expected to remain largely unmaintained for several years following publication.
The mechanics of JupyterLite are beyond me – not just the JupyterLab-ness but also the WASM / pyodide / pyolite-ness. And then there’s things like browser local storage, local forage(?) and potential links to local file system via a browser file system API. So some of the following may be hard, some may be impossible (at the moment, or dependent on upstream things…).
To set the scene, when you open the JupyterLab or RetroLab homepage of a JupyterLite site, you see a list of files and notebooks that are part of the JupyterLite distribution. You can use UI controls to upload additional files and see them in the file listing. Notebooks and files can be opened by clicking on them in the normal way. If you edit a file, the changes are saved to browser storage. (I’m not sure if there’s an “official” way to reset a notebook back to the original version as represented by the version served as part of the original distribution, rather than the edited version in browser storage? That probably should be my first “is this a blocker?”)
So what are my (other) blockers, presented here as questions just in case they already have solutions (please feel free to post answers via the comments…):
- how do I reset a modified notebook saved in browser storage to the original version served as part of the jupyterlite distribution [A: deleting a file in the JupyterLab file browser deletes it from browser storage; if the file was part of the original distribution, the file remains in the file browser and is reset to the originally distributed version];
- how do I add additional Python packages to a JupyterLite distribution (ideally, I’d just specify a `requirements.txt` file); [A: install the files into the environment that is used to generate the release; example]
- how do I open and read a file programmatically (eg how do I open a data file, or connect to a sqlite database file)? There is an unofficial solution in a discussion thread, but this seems brittle to me and on occasion appears to break. It would be useful if there were an official, min. viable function that also forms part of the release test suite. I wrapped the unofficial solution in a simple utils package but if it is subject to breaks, then it’s not generally usable in published teaching materials unless the jupyterlite version can be guaranteed to be one in which the tricks work; [A: this looks like it will be sorted as far as file read/writes go via `jupyterlite/pull/655`]
- how do I write a file that then appears in the file view (eg saving a data file, or writing to a browser storage persisted sqlite database file; or reading an ipynb file from a remote URL, saving it as a file, then generating a URL that will open that notebook from local storage in eg RetroLite via a `path=` URL parameter); [A: this looks like it will be sorted as far as file read/writes go via `jupyterlite/pull/655`]
- how do I retrieve data from a remote URL in a platform independent way (there are tricks / pyolite functions for reading files from URLs but these require pyodide or js package calls; ideally, I’d just use `requests` and it would figure out how to handle the transport; in the short term, see eg Making the Python requests module work in Pyodide / `bartbroere/pyodide-requests`);
- how do I avoid async `await` requirements on function calls (some pyolite function calls that can be used to mock non-WASM executed Python functions are asynchronous and require an `await` prefix; this makes it tricky to write code that runs anywhere; is there a way to mask the `await` requirement and wrap asynchronous calls in a non-`async` function?) [tracking?: `pyodide/pyodide/issues/1503`] There is also a related issue around things like `time.sleep()` [tracked here: `pyodide/pyodide/issues/2354`]
- how do I synch with my desktop filesystem (eg synch browser storage and local storage, or run jupyterlite against the desktop filesystem rather than browser storage; at the moment, this requires file upload / download; presumably I can access the browser storage db from my desktop commandline?); [A: this is supported by `jupyterlab-contrib/jupyterlab-filesystem-access`]
- how do I synch with remote synching drives (eg Dropbox, OneDrive, GoogleDrive etc.); [tracking: `jupyterlite/jupyterlite/issues/315`]
- how do I download a file programmatically (eg by creating a blob that can be downloaded from an auto-clicked link);
- how do I open a remote notebook, e.g. in RetroLite (for example: https://jupyter.org/try-jupyter/retro/notebooks/?path=https://raw.githubusercontent.com/jupyterlite/jupyterlite/main/examples/python.ipynb, which does not currently work);
- how do I install Python packages programmatically in a cross-platform way (currently, packages can be installed via notebooks using `micropip`; it would be more convenient to mask this via some %pip magic; see related issue).
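As a rough illustration of the "retrieve data from a remote URL" question, here is a minimal sketch of the sort of wrapper I have in mind. The function name `read_text` is made up for this post, and the sketch assumes that the Pyodide version in use provides `pyodide.http.open_url` and reports `sys.platform` as `"emscripten"`; neither is guaranteed for every JupyterLite release.

```python
import sys
from urllib.request import urlopen


def read_text(url, encoding="utf-8"):
    """Return the text found at `url`, in Pyodide or in ordinary CPython.

    `read_text` is a hypothetical helper, not part of any library.
    """
    if sys.platform == "emscripten":
        # Assumption: we are running under Pyodide and this helper exists
        # in the installed version; it returns a StringIO object.
        from pyodide.http import open_url

        return open_url(url).read()
    # Ordinary CPython: the standard library can make the request itself.
    with urlopen(url) as response:
        return response.read().decode(encoding)
```

A notebook could then call `read_text(url)` in the same way everywhere, for example passing `io.StringIO(read_text(url))` to `pandas.read_csv`. In the browser, the request is still subject to the usual cross-origin (CORS) restrictions, so a URL that works from a local install may still fail in JupyterLite.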
To create platform agnostic notebooks, it might be that notebooks need to have a guarded cell that makes decisions about what package or workaround to load if the wasm platform is detected (eg via `import platform as p; p.platform()` etc.; test the platform and import packages as required, either via an `if` or via a `try`).
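A guarded cell along those lines might look something like the following sketch. The helper name `running_in_wasm` is invented for illustration, and the two checks (`sys.platform == "emscripten"` and `"pyodide" in sys.modules`) are assumptions about how the Pyodide-based kernel presents itself rather than anything JupyterLite promises.

```python
import sys


def running_in_wasm(platform_name=None, loaded_modules=None):
    """Guess whether the code is running under Pyodide / WASM.

    The arguments exist only so the check can be exercised outside a browser;
    by default the live interpreter state is inspected.
    """
    if platform_name is None:
        platform_name = sys.platform
    if loaded_modules is None:
        loaded_modules = sys.modules
    return platform_name == "emscripten" or "pyodide" in loaded_modules


IN_WASM = running_in_wasm()

if IN_WASM:
    # Place browser-only imports and workarounds here.
    print("WASM platform detected: loading browser workarounds")
else:
    # Place the ordinary imports here.
    print("Conventional Python environment detected")
```

The `try` flavour of the same idea is simply to attempt `import pyodide` and fall back in the `except ImportError` branch; which reads better probably depends on how many workarounds the cell has to juggle.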
In passing, there are also various other things that would open up new opportunities; perhaps greatest amongst these are support for single executable cells, and for running code via pyolite kernels in Jupyter Book using thebe (tracking issue). But I also wonder: would it be possible to use pyolite to run as a part of a kernel gateway (eg Building a JSON API Using Jupyter Notebooks in Under 5 Minutes) to support serverless functions?
For installing new packages, you can use
Well, this example doesn’t work, as it depends on the `cryptography` package which is not available as a ‘wheel’. But for your example it might work.
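For what it's worth, a notebook can at least fail gracefully when a package cannot be installed this way. The sketch below is only a guess at a reasonable pattern: `ensure_package` is a hypothetical helper, it assumes `micropip.install` is awaitable (as it is in the Pyodide releases I know of), and it assumes the import name and the package name are the same, which is not always true.

```python
import asyncio
import importlib.util
import sys


async def ensure_package(name):
    """Return True if `name` can be imported, trying micropip under Pyodide.

    Hypothetical helper: outside WASM it never installs anything, leaving
    that to pip or to whatever built the environment.
    """
    if importlib.util.find_spec(name) is not None:
        return True  # already installed
    if sys.platform != "emscripten":
        return False  # not in the browser: do not attempt an install
    import micropip  # only available under Pyodide

    try:
        await micropip.install(name)
    except Exception as err:  # eg no suitable wheel could be found
        print(f"Could not install {name}: {err}")
        return False
    return True
```

In a notebook cell this would be called as `await ensure_package("some_package")`, which works in both pyolite and a recent IPython kernel because both allow top-level `await`; from a plain script it would need `asyncio.run(ensure_package("some_package"))` instead.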