Electron Powered Desktop Apps That Bundle Python — Datasette and JupyterLab

You wait for what feels like years, then a couple of in-the-wild releases appear that put the same pattern into practice within a few weeks of each other!

A couple of years ago, whilst pondering ways of bundling Jupyter Book content in a desktop electron app to try to get round the need for a separate webserver to serve the content, I observed:

Looking around, a lot of electron apps seem to require the additional installation of Python environments on host that are then called from the electron app. Finding a robust recipe for bundling Python environments within electron apps would be really useful I think?

Fragment – Jupyter Book Electron App, ouseful.info, May, 2019

And suddenly, in a couple of the projects I track, this now seems to be A Thing.

For example, datasette-app (Mac only at the moment?) bundles a Python environment and a datasette server in an electron app to give you a handy desktop application for playing with datasette.

I need to do a proper review of this app…

Simon Willison’s extensive developer notes, tracked in Github Issues and PRs, tell the tale. For example:

And from a repo that appeared to have gone stale, jupyterlab-app gets an initial release as an electron app bundling an Anaconda environment with some handy packages pre-installed (announcement; cross-platform (Mac, Linux, Windows)).

Naively double-clicking the downloaded JupyterLab-app installer to open it raises a not very helpful dialog:

Signing issue with JupyterLab app on Mac

To make progress, you need to list the app in a Mac Finder window, right-click it and Open, at which point you get a dialog that is a little more helpful:

The download for the JupyterLab-app (Mac) installer was about 300MB, which expands to an installed size of at least double that:

The installation (which requires an admin user password?) takes some time, so I’m wondering if a load of other things get downloaded and installed as part of the installation process….

Hmmm… on a first run, the app opens up some notebooks that I think I may have been running in another JupyterLab session from a local server – has it actually picked up that JupyterLab workspace context? The announcement post said it shipped with its own conda environment? So where has it picked up a directory path from?

Hmmm… it seems to have found my other kernels too… But I don’t see a new one for the environment, and kernel, it ships with?

Opening the Python 3 (ipykernel) kernel appears to give me a kernel that shipped with the application:

# Check which site-packages directories this kernel is using
import site
site.getsitepackages()

I wonder if, where there is a name clash with a pre-existing external kernel, the JupyterLab app uses the local kernel it ships with, and otherwise it uses the other kernels it can find?

Hmmm… seems like trying to run the other Python kernels gets stuck trying to connect and then freezes the app. But I can connect to the R kernel, which looks like it’s the “external” kernel based on where it thinks the packages are installed (retrieved via the R .libPaths() lookup):

Something else that might be handy would be the ability to connect to a remote server and launch a remote kernel, or launch and connect to a MyBinder kernel…

I also note that if I open several notebooks in the app, then launch a new JupyterLab session in my browser, the notebooks open in the app appear as open notebooks in the browser JupyterLab UI: so workspaces are shared? That really, really sucks. Sucks double: the workspaces mess each other up, the same notebook ends up open in two environments (inviting write conflicts), and those environments are also using different kernels for the same notebook. Wrong. Wrong. Wrong!;-)

But a handy start… and possibly a useful way of shipping simple environments to students, at least, once the signing issues are fixed. (For a related discussion on signing Mac apps, see @simonw’s TIL tale, “Signing and notarizing an Electron app for distribution using GitHub Actions” and this related JupyterLab-app issue.)

I also wonder: as the electron app bundles conda, could it also ship postgres as a callable database inside the app? Postgres is available via conda, after all..

Hmm… thinks… did I also see somewhere discussion about a possible JupyterLite app that would use WASM kernels rather than e.g. bundling conda or accessing local Jupyter servers? And I wonder… how much of the JupyterLab-app repo could be lifted and used to directly wrap RetroLab? Or maybe the JupyterLab app could have a button to switch to RetroLab view, and vice versa?

Hmm… I wonder… the jupyter-server/jupyter_releaser package looks to be providing support for automating builds and release generation for templated Jupyter projects… How tightly are apps bound into electron packages? How easy would it be to have a releaser that wraps an app in an electron package with a specified conda distribution (an “electron-py-app proxy” into which you could drop your Jupyter framework app)?

PS whilst tinkering with the fragment describing a Jupyter Book Electron App, I wrapped a Jupyter book in an electron app to remove the need for a web server to serve the book. That post also briefly explored the possibility of providing live code execution via a thebe-connected local server, as well as mooting the possibility of executing the code via pyodide. I wonder if the JupyterLab-app and/or the datasette-app have their electron+py components cleanly separated from the app code components. How easy would it be to take one or other of those components to create an electron app bundled with a py/Jupyter server that could support live code execution from the Jupyter Book, also inside the electron app?

Getting nbev3devsim (jp_proxy_widget wrapped Javascript App) Running In JupyterLab With JupyterLab-Sidecar

One of the blockers I’ve had to date running the nbev3devsim simulator in JupyterLab was the requirement to have just a single instance of the widget running in a notebook. This is an artefact (I think) of the jp_proxy_widget having a hardwired HTML element reference that needs to be unique.

The JupyterLab sidecar widget offered promise in the form of mirroring a cell output in a separate JupyterLab panel, the only problem being this then created a second widget instance (once in the original cell output, once in a mirroring panel).

But it seems that a (not so?) recent update to the sidecar extension allows you to display a rendered object not in a JupyterLab cell output, but instead in a separate, detached panel:

nbev3devsim Simulator running in a JupyterLab sidecar widget

This compares to the current classic notebook view where I style the css to dodge the notebook to the left hand side of the page and then pop the simulator into a floating resizable widget:

For a review of nbev3devsim and the materials associated with its use, see this overview.

The recipe for getting the widget displayed in the JupyterLab sidecar panel is as follows:

# Install the required packages (uncomment on first run):
#%pip install git+https://github.com/innovationOUtside/nbev3devsim.git
#%pip install sidecar
#%pip install jp_proxy_widget
# Make sure the proxy widget labextension is installed:
# !jupyter labextension install jp_proxy_widget
# Reload the JupyterLab page after installing...

from nbev3devsim import ev3devsim_nb as eds
from sidecar import Sidecar

# Create the (single) simulator widget instance
roboSim = eds.Ev3DevWidget()
roboSim.set_element("response", '')

# Display the widget in a detached sidecar panel
# rather than in the cell output
with Sidecar(title='RobotLab', anchor='split-right'):
    display(roboSim)

This seems to work fine on the jp_proxy_widget Binder server into which the sidecar package is installed (and the environment reloaded), but there’s a jp_proxy_widget error (even with a simple test widget) when installing the jp_proxy_widget package into the Binderised sidecar repo, so there’s presumably some package version conflict somewhere? It works fine if you make sure the proxy widget extension is explicitly installed.

In the jp_proxy_widget Binder environment, the plumbing also seems to work: loading in the magic (`%load_ext nbev3devsim`) and then running programs via the block magic (`%%sim_magic` or `%%sim_magic -s roboSim`) seems to work: programs run in the simulator okay. Line magics to control the simulator setup also seem to work okay.

There may be a few things, particularly to do with accessibility, that work (ish?!) in the classic notebook UI that break in the JupyterLab UI. A move to the JupyterLab UI also means that icons for launching server proxy apps from the launcher will be required.

But it’s starting to look like the robot simulator is viable in JupyterLab. I just need to find a way to get the notebook customisation extensions such as nb_extension_empinken and nb_extension_tagstyler ported over.

Simple Markdown Table Output From pandas DataFrames

In passing, I note that we can easily generate simple markdown style table output from a pandas data frame using the pandas .to_markdown() dataframe method:

Here’s the associated code fragment:

data = """colA, colB, colC
this, that, 1
or, another, 2"""

import pandas as pd
from io import StringIO

# skipinitialspace strips the stray space after each comma
df = pd.read_csv(StringIO(data), skipinitialspace=True)

# .to_markdown() requires the tabulate package to be installed
md_table = df.to_markdown(index=False)

print(md_table)

I also note that Stack Overflow recently (ish!) added support for markdown tables (announcement) using Github-flavoured markdown syntax.

Simple Javascript Multiple Choice And Literal Answer Quizzes in Jupyter Notebooks and Jupyter Book

In Helping Students Make Sense of Code Execution and Their Own Broken Code I described a handful of interactive tools to help students have a conversation with a code fragment about what it was doing. In this post, I’ll consider another form of interactivity that we can bring to bear in live notebooks as well as static Jupyter Book outputs: simple multiple choice and literal answer quizzes (which I like) and flashcards (which I’ve never really got on with).

Both examples are created by John Shea / @jmshea, whose Intro to Data Science for Engineers Jupyter Book demonstrates some interesting practice.

Keen observers of this blog will note I don’t tend to link to demos of OU materials (only my own that I have drafted in public). That’s because they’re generally authed and only available to paying students. Unlike OU print materials of yore, which could be found in many College libraries, purchased as standalone study packs, or bought second hand from University Book Search. Some materials are on Open Learn, and I keep meaning to give some of them “a treatment” to show other, to my mind more engaging, ways in which I think we could present them… When the next strike comes around, maybe…

The jmshea/jupyterquiz package provides support for a range of simple quiz questions that can be used for untracked formative assessment.

Try a Jupyter Book rendering here and a previewed Jupyter notebook rendering here.

The first question type currently supported is a multiple choice question where a single correct answer is expected (the "multiple_choice" question type).

jupyter-quiz – multiple choice

Hints can also be provided in the event of an incorrect answer being provided:

jupyter-quiz – multiple choice with hint on incorrect answer

The second, related "many_choice" question type requires the user to select M correct answers from N choices.

The third quiz type, "numeric", allows a user to check a literal numeric answer, with a hint provided if an incorrect answer is given:

jupyter-quiz – literal answer test with hint on incorrect answer

It strikes me that it should be trivial to add an exact string match test and, if Javascript packages are available, simple fuzzy string match tests etc.
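As a rough sketch of the kind of check a fuzzy string match question type might perform (the function name and threshold here are my own invention, not part of jupyterquiz), Python’s stdlib difflib gets you most of the way:

```python
from difflib import SequenceMatcher

def fuzzy_match(answer, expected, threshold=0.8):
    """Return True if answer is 'close enough' to expected.

    Case- and whitespace-insensitive; threshold is a similarity
    ratio between 0 and 1.
    """
    a = " ".join(str(answer).lower().split())
    b = " ".join(str(expected).lower().split())
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(fuzzy_match("Markdwon", "Markdown"))     # typo still matches: True
print(fuzzy_match("wiki markup", "Markdown"))  # False
```

An exact string match test is then just the degenerate threshold=1.0 case.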

The questions and answers can be pulled in from a JSON file (hosted locally or retrieved from a URL), or from a Python dictionary.
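Since the question banks are plain JSON, a loader is only a few lines of stdlib code; here’s a hedged sketch (the file name and the sample question’s answer schema are illustrative, not canonical jupyterquiz usage):

```python
import json
from pathlib import Path
from urllib.request import urlopen

def load_questions(source):
    """Load a list of question dicts from a local path or a URL."""
    if str(source).startswith(("http://", "https://")):
        with urlopen(source) as resp:
            return json.load(resp)
    return json.loads(Path(source).read_text())

# Write a minimal one-question file so the sketch is self-contained
sample = [{"question": "What is 1 + 1?", "type": "numeric",
           "answers": [{"value": 2, "correct": True, "feedback": "Correct."}]}]
Path("questions.json").write_text(json.dumps(sample))

questions = load_questions("questions.json")
print(questions[0]["question"])
# In a live notebook, the list can then be handed to jupyterquiz's
# display_quiz() function to render the quiz.
```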

Here’s an example of a "multiple_choice" question type:

{
        "question": "Which of these are used to create formatted text in Jupyter notebooks?",
        "type": "multiple_choice",
        "answers": [
            {
                "answer": "Wiki markup",
                "correct": false,
                "feedback": "False."
            },
            {
                "answer": "SVG",
                "correct": false,
                "feedback": "False."
            },
            {
                "answer": "Markdown",
                "correct": true,
                "feedback": "Correct."
            },
            {
                "answer": "Rich Text",
                "correct": false,
                "feedback": "False."
            }
        ]
    },

And here’s a question that uses code fragments as the multiple choice answers:

{
        "question": "The variable mylist is a Python list. Choose which code snippet will append the item 3 to mylist.",
        "type": "multiple_choice",
        "answers": [
            {
                "code": "mylist+=3",
                "correct": false
            },
            {
                "code": "mylist+=[3]",
                "correct": true
            },
            {
                "code": "mylist+={3}",
                "correct": false
            }
        ]
    },

The second package, jmshea/jupytercards, supports the embedding of interactive flash cards in Jupyter notebooks and Jupyter Book.

Clicking on the flashcard turns it to show the other side:

You can also transition from one flashcard to the next:

The flashcards can be loaded from a local or remotely hosted JSON text file listing each of the flashcard texts as a simple front/back dictionary:

[
    {
        "front": "outcome (of a random experiment)",
        "back": "An outcome of a random experiment is a result of the experiment that cannot be further decomposed."
    },
    {
        "front": "sample space",
        "back": "The sample space of a random experiment is the set of all possible outcomes."
    },
    {
        "front": "event class",
        "back": "For a sample space $S$ and a probability measure $P$, the event class, denoted by $\\mathcal{F}$, is a collection of all subsets of $S$ to which we will assign probability (i.e., for which $P$ will be defined). The sets in $\\mathcal{F}$ are called events."
    }
]

I’m not sure if you can control the flash card color, or the font style, color and size?

What I quite like about these activities is that they slot in neatly to generative workflows: the questions are easily maintained via a source text file, or a hidden cell where the JSON data is loaded into a Python dict (I suppose it could even be pulled in from notebook cell metadata), and can then be used in a live (trusted) notebook, a fully rendered notebook (i.e. one rendered by nbviewer, not the Github notebook previewer), or rendered into a Jupyter Book HTML format.

Note to self: add these examples to my Open Jupyter Authoring and Learning Environment (OpenJALE) online HTML book.

Helping Students Make Sense of Code Execution and Their Own Broken Code

The programming-related courses I work on are probably best described as introductory programming courses. Students are taught using an approach based on “a line of code at a time” within a Jupyter notebook environment, which provides a REPL execution model. Students are encouraged to write a line of code in a cell, run it, and then inspect state changes arising from the code execution as displayed in the code cell output. Markdown cells before and after the code cell are used to explain the motivation for the next bit of code, or to prompt students to predict what they think it will do. Markdown cells following a code cell can be used to review or explain what just happened, or to prompt students to reflect on what they think happened.

In passing, I note that there are other models for providing text+code style annotations. For example the pycco-docs/pycco package will render side-by-side comments and code:

pycco side by side literate documentation, generated from comment py code

The view is generated from Python files containing inline comments and docstrings:

Source py file from which pycco side-by-side literate code view is generated

Something I haven’t yet tried is a workflow that renders the side-by-side view from a Python file generated from a Jupyter notebook using the jupytext file converter. (I’m not sure if jupytext can generate Python files using the comment markup conventions that pycco expects?)

For simple code blocks, tools such as nbtutor provide a simple code stepper and tracer that can be used to explore the behaviour of a few lines of code.

I use nbtutor in some first year undergraduate notebooks and it’s okay-ish (unfortunately it can break in combination with some other widgets running in the same notebook).

Another approach I am keen to explore in terms of helping students help themselves when it comes to understanding code they have written is the automated generation of simple flowchart visualisations from code fragments (see for example Helping Learners Look at Their Code).

Poking around looking for various Python packages that can help animate or visualise common algorithms (Bjarten/alvito is one; anyone got suggestions for others?) I came across a couple of other code stepping tools produced by Alex Hall / @alexmojaki.

The first one is alexmojaki/birdseye which can provide a step trace for code executed in a magicked notebook cell block:

Code stepper using birdseye

You can also separately step though nested loops:

Step through nested loops using birdseye

Another tool, alexmojaki/snoop, will give a linear trace from an executed code cell:

Linear code trace using snoop

Alex also has a handy package for helping identify out of date Python packages, based on the latest version available on PyPi: alexmojaki/outdated.

When it comes to Python errors, for years we have used the Jupyter skip-traceback extension to minimise the traceback message displayed when an error is raised in a Jupyter notebook. However, there are various packages out there that attempt to provide more helpful error messages, such as SylvainDe/DidYouMean-Python (which is currently broken on install – I think the package needs its internal paths fettling a bit!) and friendly-traceback. The latter package tidies up the display of error messages:

Simplified error message using friendly-traceback

Note that the pink gutter to indicate failed cell execution comes from the innovationOUtside/nb_cell_execution_status extension.

You can then explore in more detail what the issue is and in some cases, how you might be able to fix it:

friendly-traceback error detail

You can also start to tunnel down for more detail about the error:

friendly-traceback messages and explanations

This extension looks like it could be really handy in an introductory, first year undergraduate intro to programming module, but the aesthetic may be a bit simplistic for higher level courses.

From the repo, friendly-traceback/friendly-traceback, it looks like it shouldn’t be too hard to create your own messages.

friendly-traceback feedback generation

This does make me wonder whether a language pack approach might be useful? That would allow for internationalisation but could also be used to easily support the maintenance of custom message packs for particular teaching and learning use cases?

With a couple of new modules presenting for the first time this year, I would argue we missed an opportunity to explore some of these ideas where we can start to use the technology as an illustrator of what’s going on with code we give to students, and more importantly that students might write for themselves.

There are several reasons why I think this probably hasn’t happened:

  • no time to explore this sort of thing (with two years+ to produce a course, you might want to debate that…);
  • no capacity in a module team to explore and test new approaches (I’d argue that’s our job as much as producing teaching material if the org is a beacon of best practice in the development and delivery of interactive online distance education materials);
  • no capacity in support units to either research their effectiveness or explore such approaches and make recommendations into module teams about how they might be adopted and used, along with an examples gallery and sample worked examples based on current draft materials (I honestly wonder about where all the value add we used to get from support units years ago has gone and why folk don’t think we are the worse for not having units that explore emerging tech for teaching and learning. Folk too busy doing crapalytics and bollockschain, I guess);
  • and I guess: “what value does it add anyway?” (which is to say: “why should we explore new ways of teaching and learning?”) and “you’re just chasing ooh, shiny” (which really doesn’t fit with 2+ year production cycles and material updates every five years, where locking into an emerging technology is high risk because, rather than regularly updating around it, you are stuck with it for potentially up to a decade (2 years production, five years primary course life, 3 years course life extension)).

Bored, bored, bored, bored, bored…

Preparing Jupyter Notebooks for Release to Students

Over the years, I’ve sketched various tools to support the release of notebooks to students, but as I’m not the person who prepares and distributes the releases, they never get used (“Tony hacking crap again” etc.;-).

Anyway, on the basis that the tools aren’t completely crap, and may be of use to others, perhaps even folk working on other modules internally that make use of notebooks and are using them for the first time this presentation, I’ll post a quick summary of some of them here. (And if they are broken, a little use and testing by not-me could well provide the bug reports and motivation I need to fix them to a level of slightly less possible brokenness.)

The package that bundles the tools can be found here: innovationOUtside/nb_workflow_tools.

First up, tm351nbtest is a tool that helps check whether the notebooks run correctly in the latest environment.

The notebooks we save to the private module team repo all have their cells run, in part so that we can review what the expected outputs are. (When checking in notebooks, the tm351nbrun --file-processor runWithErrors . command can be used to ensure all notebooks in the specified path have their cells run.) The nbval package is a handy package that runs the notebooks in the current environment and checks that the contents of each new output cell match those of the previous, saved output cell. (I keep thinking that jupyter-cache might also be handy here?) Cells that are known to generate an error can be ignored by tagging them with the raises-exception tag, and cells whose output you want to ignore can be tagged with the nbval-ignore-output tag. Running the tool generates a report identifying each notebook and each cell where the outputs don’t match.
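Because .ipynb files are just JSON, tags like raises-exception can also be added in bulk with a few lines of stdlib code. A sketch, assuming the cells to tag are flagged with a marker comment (the marker convention is mine, not part of nb_workflow_tools or nbval):

```python
def tag_marked_cells(nb_json, marker, tag):
    """Add `tag` to any code cell whose source contains `marker`."""
    for cell in nb_json["cells"]:
        if cell["cell_type"] == "code" and marker in "".join(cell["source"]):
            tags = cell["metadata"].setdefault("tags", [])
            if tag not in tags:
                tags.append(tag)
    return nb_json

# A minimal stand-in for json.load(open("notebook.ipynb"))
nb = {"cells": [
    {"cell_type": "code", "metadata": {},
     "source": ["# EXPECT-ERROR\n", "1 / 0\n"], "outputs": []},
    {"cell_type": "code", "metadata": {},
     "source": ["print('ok')\n"], "outputs": []},
]}
tag_marked_cells(nb, "# EXPECT-ERROR", "raises-exception")
print(nb["cells"][0]["metadata"]["tags"])  # ['raises-exception']
```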

The next tool, nb_collapse_activities, checks that our activity blocks all have their answers precollapsed. Activities are tagged and coloured using the innovationOUtside/nb_extension_empinken extension; activities with answers use the classic notebook collapsible headings extension to collapse the cells beneath an activity answer heading block (all cells are collapsed up to the next cell with a header at the same level or higher than the collapsed answer cell header). The nb_collapse_activities utility tries to identify answer heading cells and, whenever it finds one, adds heading_collapsed: true metadata.
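The core of that utility can be sketched with the same JSON-walking approach; here I assume an answer heading is any markdown heading ending with the word “answer” (that heuristic is my guess at the rule — the real tool’s matching may well differ):

```python
def collapse_answers(nb_json):
    """Set heading_collapsed metadata on answer-heading markdown cells."""
    for cell in nb_json["cells"]:
        if cell["cell_type"] != "markdown":
            continue
        first_line = ("".join(cell["source"]).splitlines() or [""])[0]
        if first_line.startswith("#") and first_line.strip().lower().endswith("answer"):
            cell["metadata"]["heading_collapsed"] = True
    return nb_json

nb = {"cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["### Our answer\n"]},
    {"cell_type": "markdown", "metadata": {}, "source": ["### Another activity\n"]},
]}
collapse_answers(nb)
print(nb["cells"][0]["metadata"])  # {'heading_collapsed': True}
```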

The third tool also processes the notebooks for release: tm351nbrun --file-processor clearOutput clears the outputs of every code cell and essentially resets each notebook to an unrun state.
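What a clearOutput-style processor does can again be sketched in a few lines (nbconvert’s ClearOutputPreprocessor is the off-the-shelf equivalent):

```python
def clear_outputs(nb_json):
    """Reset every code cell in a notebook dict to an unrun state."""
    for cell in nb_json["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb_json

nb = {"cells": [{"cell_type": "code", "metadata": {}, "execution_count": 3,
                 "source": ["1 + 1"],
                 "outputs": [{"output_type": "execute_result"}]}]}
clear_outputs(nb)
print(nb["cells"][0]["outputs"], nb["cells"][0]["execution_count"])  # [] None
```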

A fourth tool, nbzip, can be used to zip required notebook folders for release to students.

A sort of release process could then work something like this. In the environment you want to test in:

# Install package
pip3 install --upgrade git+https://github.com/innovationOUtside/nb_workflow_tools

# When checking in notebooks, ensure cells are run
# Ensure that all cells are run even in presence of errors
tm351nbrun --file-processor runWithErrors .

# Test notebooks
tm351nbtest .

# Quality reports
## Whatever...

# Clear outputs
tm351nbrun --file-processor clearOutput .

# Collapse activity answers
nb_collapse_activities .

# Spell check
## However... Or run earlier before output cells cleared

# Zip files
# Whichever...

In passing, the nb_workflow_tools package also includes some other utilities not directly relevant to release, but occasionally handy during production: nb_merge to merge two or more notebooks, and nb_split to split a notebook into two or more notebooks.
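The heart of an nb_merge-style merge is little more than concatenating the cells lists; a sketch (the real tool will also need to reconcile notebook-level metadata, which this version just takes from the first notebook):

```python
def merge_notebooks(*nbs):
    """Concatenate notebooks, keeping the first one's top-level metadata."""
    merged = dict(nbs[0])
    merged["cells"] = [cell for nb in nbs for cell in nb["cells"]]
    return merged

nb1 = {"nbformat": 4, "nbformat_minor": 5, "metadata": {},
       "cells": [{"cell_type": "markdown", "metadata": {},
                  "source": ["# Part 1"]}]}
nb2 = {"nbformat": 4, "nbformat_minor": 5, "metadata": {},
       "cells": [{"cell_type": "markdown", "metadata": {},
                  "source": ["# Part 2"]}]}
merged = merge_notebooks(nb1, nb2)
print(len(merged["cells"]))  # 2
```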

I’ve also been exploring various approaches to spell-checking notebooks. These are currently being collected in innovationOUtside/nb_spellchecker and the various issues attached to that repo. When I have something reliable, I’ll add it to innovationOUtside/nb_workflow_tools. Another set of quality tools I had been working on but halted due to a universal “why would we want to know anything comparative about the contents of our notebooks” can be found in innovationOUtside/nb_quality_profile. At some point I’ll revisit this and then try to bundle them up into a simple CLI tool I can also add to nb_workflow_tools.

In passing, and for completeness, I’ve also started sketching some innovationOUtside/ou-jupyter-book-tools. The idea of these is that they can provide an intermediate publishing step, where necessary, that maps from cell tags, for example, to complementary Jupyter Book / MyST markdown / reStructuredText.

Fragment: Loading Data into pandas DataFrames in JupyterLite

Just a quick note to self about a couple of tricks for loading data files into pandas in JupyterLite. At the moment, a simple pandas.read_csv() is unlikely to work, but a couple of workarounds have been posted over the last couple of months, so I’ve wrapped them into a simple package until such a time as everything works “properly” – innovationOUtside/ouseful_jupyterlite_utils.

Install the package in JupyterLite as:

import micropip

package_url = "https://raw.githubusercontent.com/innovationOUtside/ouseful_jupyterlite_utils/main/ouseful_jupyterlite_utils-0.0.1-py3-none-any.whl"

await micropip.install(package_url)

And then load data into pandas as per:

from ouseful_jupyterlite_utils import pandas_utils as pdu

# Load CSV from URL
# Via @jtpio
URL = "https://support.staffbase.com/hc/en-us/article_attachments/360009197031/username.csv"
df = await pdu.read_csv_url(URL, "\t")

# Load CSV from local browser storage
# Via @bollwyvl
df = await pdu.read_csv_local("iris.csv", "\t")
df

I’ll add more workarounds to the package as I find them (working with SQLite files is next on my to do list…) and then remove items as they (hopefully) become natively supported.

My Personal Blockers on Getting Started With JupyterLab

Although at times the content of this blog may come across as somewhat technical, as anyone who has looked at any of my code would tell you, I am not a developer (actually, you could interpret that phrase in a lot of ways!). This post represents a stream of consciousness about some of the stumbling blocks that I perceive as preventing me from getting started building my own extensions for JupyterLab.

See also this related JupyterLab issue: Getting Started Docs for Non-Developers.

The code I write is generally written to get things done, not to form part of some production application. It is a means to an end. It’s poorly structured, and eclectically commented. There’s no linting. My repo commits are random collections of files with often vacuous commit messages. You would not want me committing code to your code base.

I typically categorise my code outputs into various classes:

  • code fragments, which are simple tricks or hacks for performing a particular effect, often something I’ve picked up from somewhere else. One fragment I need to record in a post somewhere is how to densify points along a geojson linestring, a trick I picked up here; a recent fragment of my own shows how we might be able to style a JSON fragment that identifies the location of a typo in a text string: there may be better ways, “approved ways”, of doing this, but I didn’t find one when I looked, so I made a really simple thing to try to do it myself;
  • code sketches, which often take the form of notebooks that describe a mini-project, one way of doing things. The notebooks in my Conversations With Data: Unistats repo are sketches as much as anything; these are often free-form, as I explore a particular topic;
  • code recipes start off as sketches, but then I try to tease out and explain some sort of task or process and perhaps tidy up the code a bit; producing a recipe involves a bit of iteration, trying to identify each step and the reason for it, and ensure that everything is complete: things like Visualising WRC Rally Stages With rayshader and R and Visualising WRC Rally Timing and Results Data are packed full of recipes;
  • code hacks are the closest I get to production code, not in the sense of being properly linted, commented and tested, but in the sense of something I install and use. My notebook extensions are all hacks.

You’ll note that I don’t write tests: I write a line of code at a time, and look at its output; if it doesn’t look right, or it breaks, I try to fix it. I rerun all the cells in a notebook with a fresh kernel a lot to check that things keep working; if they’re broken, I check to see why the broken cell has broken, then read back up to check each previous step is doing what I wanted and has fed forward what I intended it to feed forward.

When I first started using Jupyter notebooks, the classic notebook, they were still called IPython notebooks. (We started developing a course using notebooks in 2014 that went live, after a 6 month delay, in 2016B (which is to say, February 2016).) To try to make the experience a bit more like the VLE, I hacked together an extension to augment the notebook with coloured cells to represent activities, and to allow tutors to highlight cells they had commented on in assessment feedback to students. That extension continues (though updated) as nb_extension_empinken.

Since then we have added various other extensions, such as a riff on empinken that styles cells based on bootstrap-like tags (nb_extension_tagstyler), the ipython_magic_sqlalchemy_schemadisplay extension to display (sort of!) database schemas for a connected database, or an extension I liked but no-one else did that pops out cells into a floating widget so you can easily refer back to them (nb_cell_dialog).

Over the years, the repos have added clutter, bits of automation, more elaborate approaches to packaging, but in the beginning they were very simple: essentially just a README file, a setup.py file, a directory containing a simple __init__.py file, and a static directory containing the actual extension code in the form of an index.js file. The packaging structure was cribbed directly from other extensions (typically, the simplest one I could find – a minimum viable extension, in other words), the setup for the extension was cribbed directly from other (minimally viable) extensions, and much of the code was cribbed from other extensions. The empinken extension is essentially a toolbar button that added metadata to a cell, and a routine that iterated over each notebook cell, checked the metadata and updated the CSS. There were other extensions that in whole or in part demonstrated how to do each of those tasks, which I then pinched and reassembled to my own purposes.

The code was limited to py packaging (cribbed) and some simple js (largely cribbed).

The development environment was a text editor.

The testing was to install the package, refresh the notebook page and (with the help of browser developer tools) see where it was breaking until it didn’t.

The on-ramp was achievable.

So now we come to JupyterLab, which appears to me as a hostile architecture.

I’ll try to pick out what I mean by that. Please note that the following is not intended as a personal attack on the JupyterLab teams or the docs; it’s a parody as much as anything.

From the docs, getting started means installing a development environment, and I’m already lost. Whenever I try to use node it seems to download the internet, and the instructions typically say “build the package” without saying what words of incantation I need to type into the command line to actually build the package. (Why should I just know how to do that?)

The next step is to install a cookiecutter. This doesn’t necessarily help, because I have no idea what all the files are for, whether they are necessary, or what changes can be made to each one to perform a particular task. I’d rather be introduced to a minimally viable set of files one at a time, with an explanation of what each one does, and containing nothing uncommented that is not essential. (Some “useful but optional” fragments might also be handy so I could uncomment them and try them out to see what they do, but not too many.)

When it comes to trying out some example code, I need to learn a new language, .ts (which is to say, TypeScript). I have no idea what TypeScript is or how to run it.

I also need to import a load of things from @ things, whatever they are. If I’m trying to figure out how to do a thing by cribbing code from seeing what files are loaded to support a working extension in my browser (which I suspect is way harder to do with JupyterLab than it was with the classic notebook), I’m not sure there’s an obvious way to reverse-engineer the TypeScript code from the JavaScript in the browser that does something like what I want to do.

I’m not totally sure what “locate the extension” means. Is that something I have to do, or is it something the code example is doing? (I am getting less than rational at this point, because I already know that I am at the point of blindly clicking run at things I don’t understand.)

Before we can try out the extension, it needs building. With classic notebook extensions I could simply install a package, but now I need to build one:

This is a step back to the old, pre-REPL days, because there is a level of indirection here: I don’t get to try the code I’ve written, I have to convert it to something else first. When it doesn’t work, where did I go wrong?

  • with the logic?
  • with the JavaScript?
  • with the TypeScript?
  • with the build process?
  • can the TypeScript be “right” and the JavaScript “wrong”? I have no idea…

I think things have improved with JupyterLab now that installing an extension doesn’t require a lengthy delay as the JupyterLab environment rebuilds itself (which was a blocker in itself in earlier days).

Okay, so skimming the docs doesn’t give me a sense that I’d be able to do anything other than follow the steps, click the buttons and create the example extension.

How about checking a repo to see if I can make sense of a pre-existing extension that does something close to what I want?

The classic notebook collapsible headings extension allows you to click on a heading and collapse all the cells beneath it to the next heading of the same or higher level. It works by setting a piece of metadata on the cell containing the heading you want to collapse. A community contributed extension does the same thing, but uses a different tag, "heading_collapsed": "true" rather than "heading_collapsed": true (related issue). Either that or the JupyterLab extension is broken for some other reason.
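A toy sketch (hypothetical cell objects and a hypothetical consumer function, not either extension’s actual code) shows why a string-valued tag breaks a strict boolean metadata check:

```javascript
// Two cells as the two extensions might tag them: the classic extension
// writes a boolean, the ported extension reportedly writes a string.
const classicCell = { metadata: { heading_collapsed: true } };
const portedCell = { metadata: { heading_collapsed: "true" } };

// A consumer doing a strict boolean test on the metadata value:
function isCollapsed(cell) {
  return cell.metadata.heading_collapsed === true;
}

console.log(isCollapsed(classicCell)); // true
console.log(isCollapsed(portedCell)); // false: the string "true" !== true
```

A looser truthiness check (if (cell.metadata.heading_collapsed) …) would accept both values, which is one way such an incompatibility gets papered over.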

Here’s what the repo looks like (again, I’m not intending to mock or attack the repo creator, this is just a typical, parodied, example):

Based on nothing at all except my own personal prejudices, I reckon every file halves the number of folk who think they can make sense of what’s going on… (prove me wrong ;-)

Here’s the Jupyter classic notebook extension repo:

So what do I see as the “hostile architecture” elements?

  • there is no immediately obvious, direct way to write some JavaScript, install it into JupyterLab and check that the extension works;
  • in many example repos, a lot of the files relate to packaging the project; for a novice, it’s not clear which files are the extension code, which are packaging scaffolding, and whether all the project files are necessary;
  • using TypeScript introduces a level of indirection: the user is now developing for JupyterLab, not for the end-user environment whose source they can view in the browser. (I think this is something I hadn’t articulated to myself before: with classic notebook extensions, you hack the final code; with JupyterLab, you write code in application land, and magic voodoo converts it to things that run in the browser.)
  • in developing for JupyterLab, you need to know what bit of JupyterLab to hook into. There are a lot of hooks, and it’s not clear how a (non-developer) novice can find the ones they need, let alone how to use them.

And finally:

  • there isn’t a “constructive” tutorial that builds up a minimally viable extension from a blank sheet, one explained step at a time.

As I recall from years and years ago, if you ever see or hear a developer say “just”, or if you can add a silent “just” to a statement (“(just) build the node thing”), you know the explanation is far from complete and is not followable.

Faced with these challenges of having to step up and go and do a developer course, learn about project tools, pick up TypeScript, and try to familiarise myself with a complex application framework, I would probably opt instead to learn how to develop a VS Code extension, on the grounds that the application is more general, runs by default as a desktop application rather than a browser-accessed service, has increasingly rich support for Jupyter notebooks, and has a wide range of other extensions to use and crib from, all easily discovered from the VS Code extensions marketplace.

PS I think followable is a missing term in the reproducibility lexicon, in both a weak sense and a strong sense. In the weak sense: if you follow the instructions, does it work? In the strong sense: does the reader come away feeling that they could create their own extension?

PPS In passing, I note this from my Twitter timeline yesterday…

Fragment: A Couple of Unofficial and Unofficial Unofficial Jupyter Extensions for VS Code and the Future of Rich Visual Editing of Interactive Generative Texts

To the extent that the Jupyter Extension for VS Code represents the “official” VS Code extension, if not an official Jupyter extension for VS Code (which does not exist, at least not in the jupyter project namespace on Github, nor, I suspect, on the basis of core Jupyter team contributions), it does a pretty good job, and gets more fully featured with every release.

But to a certain extent, it’s still lagging behind what you can do in even the classic notebook UI. For example, a feature I increasingly make use of is the ability to edit cell tags. These can be used by extensions to modify the presentation of cells in the user interface (I have extensions for that in the classic notebook, but not in JupyterLab/RetroLab) or in downstream rendered materials such as Jupyter Book outputs. Where Jupyter Book doesn’t directly exploit cell tags, we can hack string’n’glue tools to transform tagged cells to markup that Jupyter Book / Sphinx can make use of…
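A string’n’glue transformation of that kind can be sketched as follows. This is a hypothetical example, not an actual tool: the function name, tag and directive are all illustrative; the MyST-style fenced directive is the kind of markup Jupyter Book / Sphinx can render.

```javascript
// Toy sketch: walk a notebook's JSON and wrap the source of any markdown
// cell carrying a given tag in a MyST-style directive block.
// (Names are illustrative; this is not an actual tool.)
function wrapTaggedCells(nb, tag, directive) {
  const fence = "`".repeat(3); // avoids a literal triple backtick in this listing
  for (const cell of nb.cells) {
    const tags = (cell.metadata && cell.metadata.tags) || [];
    if (cell.cell_type === "markdown" && tags.includes(tag)) {
      cell.source = fence + "{" + directive + "}\n" + cell.source + "\n" + fence;
    }
  }
  return nb;
}

const nb = {
  cells: [
    { cell_type: "markdown", metadata: { tags: ["activity"] }, source: "Do the thing." },
    { cell_type: "markdown", metadata: {}, source: "Plain text." },
  ],
};
wrapTaggedCells(nb, "activity", "note");
// nb.cells[0].source is now wrapped in a {note} directive; nb.cells[1] is untouched.
```

A real version would read and write the .ipynb file (it is just JSON) as a pre-processing step before handing the notebook to Jupyter Book.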

Anyway, whilst cell tag editing is not supported in the official (from the VS Code side), unofficial (from the Jupyter side) Jupyter extension, it is supported by an unofficial (from the VS Code side), unofficial (from the Jupyter side) Jupyter Powertools VS Code extension for the VS Code Insiders build of VS Code.

From the blurb, this includes:

  • Shrinking traceback reports to allow you to see just the error (cf. the “Skip traceback” classic notebook extension);
  • Cell magics (& code completion for cell magics);
  • Generate reveal.js slideshows and preview them inside the VS Code environment;
  • Edit cell metadata (tags, slide metadata);
  • Automatic syntax highlighting of cell magics (e.g. using %%html will provide syntax highlighting and language features for HTML in the cell)
  • Toggle cell from Markdown to Code with toolbar icon (and vice versa)

In passing, I also note a couple of WYSIWYG markdown editor extensions (ryanmcalister/unotes and zaaack/vscode-markdown-editor for example), so it’s not hard to imagine having a VS Code Jupyter environment with WYSIWYG markdown editing in the markdown cells although I’m not sure how easy that is to achieve in practice.

PS In passing, I note that there is currently a flurry of interest on the official Jupyter discourse site – Inline variable insertion in markdown – in getting inline code into markdown cells in Jupyter notebooks, cf. the inline `r CODE` fragments that are natively available in Rmd, for example. There are certain practical issues associated with this (how are the markdown cells with code references updated, for example?) and there are already hacky workarounds (e.g. my own trick for Displaying Jupyter Notebook Code Cell Content As Formatted Markdown/HTML Code Cell Output using a Python f-string magic) for cases where you aren’t necessarily interested in reactive updates, such as when generating static Jupyter Book outputs. I keep wondering if things like the reactive kernel davidbrochart/akernel might also provide another solution?

PPS on my long list, one thing I’m hoping to see at some point is an executable document editor running purely in the browser on top of something like a JupyterLite kernel. There are a couple of different directions I think this could come from: one would be getting something like the Curvenote editor or a Stencila editor working in the browser on top of a JupyterLite WASM powered backend, and the other would be some plugins to make a Jupyter Book editable, cf. a wiki, with edits saved to browser local storage. This would tie you to a particular browser on a particular machine unless browser synching also syncs local browser storage, but for folk who work on a single machine, that would be an acceptable constraint.

PPPS See also The World Moves On – Jupyter Classic Notebook No Longer the MyBinder Default UI.

The World Moves On – Jupyter Classic Notebook No Longer the MyBinder Default UI

See the official announcement here.

I’ve been tinkering with web apps and using third party APIs for long enough now to have a sense for the lifecycle of various projects.

In the early days, things are innocent and open, and traction is often generated because it’s easy for folk to try things out: the frameworks are simpler and easier to use than previous ones, so folks use them; the APIs don’t require API keys, because no-one’s abusing them; the features are limited, because the service or app is still young, and whilst the docs may or may not be comprehensive, they’re still small enough to find your way around; and the relative simplicity of the codebase (because it’s still small) means it’s not too hard to poke around to figure out how to do things.

Then the project gets popular, and bigger, and more complex, and it’s harder to play with. It becomes more “enterprisey”, even if it’s still an open source project: the development environment starts to become ever more complex, the code becomes more compartmentalised and elaborate, and unless you’re working with it regularly, it can be hard to get a sensible overview of it.

And the examples often get more elaborate. The sort of examples where you already need to have a good mental model of the whole code framework to make any sense of the examples.

And as things move on, they become more exclusive. I have limited developer skills and limited time. My personal approach works for identifying early stage projects or apps that might have potential (at least as I see it): a low barrier to entry for folk who want to get stuff done with the application or package, and perhaps customise it.

When spotting new tech, new code packages, new ideas, I see if I can get something simple, but novel, working, or something that perhaps solves one of the dozens of things on my “it would be handy if I had a thing to do X” list, within half an hour (the half an hour is elastic, up to one or two hours!): a half-hour hack. And from a standing start.

And if I can’t, then I figure it’s probably too hard for large numbers of other people to get started with too; which will limit its adoption.

And so it is with the Jupyter UIs. The classic notebook was relatively straightforward to build simple extensions for, but JupyterLab continues to be beyond me. Classic notebook is minimally maintained, but the core Jupyter developer effort in UI terms is based on JupyterLab and UIs based around that framework (such as RetroLab). There are other UIs too that better suit my needs:

  • VS Code is increasingly powerful as a notebook editing and development environment (see for example the recent addition of rich visual differencing of different versions of the same notebook); VS Code can also support the creation of generated materials without the need for a code executing Jupyter back end. See for example VS Code as an Integrated, Extensible Authoring Environment for Rich Media Asset Creation.
  • for authoring simple notebooks and interactive texts, the visual editor in RStudio, which looks like it will soon be split out into a simpler, standalone editor (the Quarto editor), looks promising;
  • the Curvenote editor is one to watch, although I don’t really have a sense yet as to whether it will gain traction as a self-hosted or locally deployable UI, irrespective of the viability of the hosted offering… Stencila is still also a thing, but it just never quite seems to be able to break through, and may now just be too complex to generate any momentum amongst have-a-go early adopters looking for a better way.

But the need to find an environment that works for me and that can be widely shared with others is starting to become a blocker (see OpenJALE for my current environment). It’s pressingly timely, I think, for a simple, ideally extensible, editing environment for line-at-a-time coding that can be used to write linear instructional materials, interactive texts and generative materials (documents that embed source code to create assets such as charts, tables and interactive widgets; see for example Subject Matter Authoring Using Jupyter Notebooks).

The reason I need to start looking for a new approach is that it’s starting to look like the classic Jupyter notebook is coming to the end of general availability. The latest sign of this is the announcement that MyBinder launched environments will now, by default, be launched into the JupyterLab UI rather than the classic notebook UI.

For anyone who has repos with environments defined to use the classic notebook, and notebook extensions that don’t work in JupyterLab, tweaks will need to be made to MyBinder launch buttons and URLs to ensure that the URL path is set to /tree (in a launch URL: ?urlpath=/tree/).

Increasingly, to use the classic notebook, you’ll need to know where to find it. Because by default the assumption will be that you want to enter a complex developer environment, not a simple notebook authoring UI.

PS The Jupyter project is a fantastic initiative for making access to compute and arbitrary code execution possible. I think the classic notebook UI had a large part to play in providing an on-ramp to getting started with code for a lot of people: a clean, simple design, with a minimum of clutter, and relatively easily extended and customised with simple HTML, JavaScript and CSS. It has also provided a way for disciplines to bring computational approaches, particularly line-at-a-time approaches, to a wider audience through narrated and contextualised code scripts that perform particular tasks within the context of a human readable document. But the JupyterLab UI is not that. It’s a piece of hostile architecture to pretty much everyone who isn’t a developer. You may be able to get to a notebook within the JupyterLab UI. But you will have already put the fear into folk that it will be too complicated for them to understand. Making JupyterLab the default UI is off-putting to anyone who opens up the UI for the first time after hearing that Jupyter notebooks provide an “easy way” to get started with computing, because it looks just like any other terrifyingly complex IDE. Not a simple file listing and a simple word processor app that can execute code and display the results.