Educational Content Creation in Jupyter Notebooks — Creating the Tools of Production As You Go

For the last few weeks (with another 2-3 weeks still to go, at the current rate of progress), I’ve been updating some introductory course materials for a module due to present in 20J (which is to say, October, 2020).

Long-time readers will be familiar with the RobotLab application we’ve been using in various versions of the module for the last 20 years, and with my on and off attempts to find possible alternatives (for example, Replacing RobotLab…?).

The alternative I opted for is a new browser-based simulator built on ev3devsim. Whilst my tinkering with that, in the form of nbev3devsim, is related to this post, I’ll reserve discussion of it for another post…

So what is this post about?

To bury the lede further, the approach I’ve taken to updating the course materials has been to take the original activity materials, in their raw OU-XML form, convert them to markdown (using the tools I’ve also used for republishing OpenLearn content as editable markdown / text documents) and then rewrite them using the new simulator rather than the old RobotLab application. All this whilst I’m updating and building out the replacement simulator (which in part means that the materials drafted early in the process are now outdated as the simulator has been developed; but more of that in another post…).

Along the way, I’ve been trying to explore all manner of things, including building tools to support the production of media assets used in the course.

For example, the simulator uses a set of predefined backgrounds as the basis of various activities, as per the original simulator. The original backgrounds are not available in the right format / at the right resolution, so I needed to create them in some way. Rather than use a drawing package, and a sequence of hard to remember and hard to replicate mouse and menu actions, I scripted the creation of the diagrams.

This should make maintenance easier, and also provides a set of recipes I can build on, image objects I can process, and so on. (You can see the background generator recipes here.)
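
By way of illustration only (this is not one of the actual background generator recipes; the function names, canvas size and track shape are all made up for the example), a scripted background could be as simple as a few lines of standard-library Python that write a line-following track out as an SVG file:

```python
from pathlib import Path


def make_track_background(width=800, height=400, line_width=20):
    """Return SVG markup for a white canvas with a black oval track.

    All dimensions are illustrative assumptions, not the simulator's real ones.
    """
    cx, cy = width / 2, height / 2
    rx, ry = width * 0.4, height * 0.35
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}" viewBox="0 0 {width} {height}">\n'
        f'  <rect width="100%" height="100%" fill="white"/>\n'
        f'  <ellipse cx="{cx}" cy="{cy}" rx="{rx}" ry="{ry}" '
        f'fill="none" stroke="black" stroke-width="{line_width}"/>\n'
        f'</svg>\n'
    )


def save_background(path, **kwargs):
    """Write the generated SVG to disk and return the path written."""
    path = Path(path)
    path.write_text(make_track_background(**kwargs), encoding="utf-8")
    return path
```

Because the track is defined by a handful of parameters, changing the resolution or the line width means changing an argument and rerunning the script, rather than redrawing anything by hand.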

The original materials also included a range of flowcharts. The image quality of some of them was a bit ropey, so I started looking for alternatives.

I started off using mermaid.js. I was hoping to use a simple magic that would let me put the chart description into a magicked code cell and then render the result, but on a quick first attempt I couldn’t get that to work (managing js dependencies and scope is something I can’t get my head round). So instead, at the moment, the mermaid-created flow charts I’m using are created on the fly from a call to a mermaid online API.

Using a live, online image generator is not ideal in presentation. For example, a student may be working on the materials whilst offline. It is fine, though, for creating static assets in production and then saving those for embedding in the materials released to students.
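
A minimal sketch of that “render once in production, then save” pattern might look like the following. It assumes a mermaid.ink-style service that accepts the base64-encoded diagram description as the last part of the URL; the post does not say which online API is actually used, so the base URL, the encoding and the function names here are assumptions.

```python
import base64
import urllib.request
from pathlib import Path

# Assumption: a mermaid.ink-style rendering service that takes the
# base64-encoded diagram description as the final path segment.
RENDER_BASE_URL = "https://mermaid.ink/img/"


def mermaid_image_url(description, base_url=RENDER_BASE_URL):
    """Build the image URL for a mermaid diagram description."""
    encoded = base64.urlsafe_b64encode(description.encode("utf-8")).decode("ascii")
    return base_url + encoded


def save_mermaid_image(description, path, base_url=RENDER_BASE_URL):
    """Fetch the rendered image once (needs a network connection) and save it."""
    with urllib.request.urlopen(mermaid_image_url(description, base_url)) as response:
        Path(path).write_bytes(response.read())
    return Path(path)


flowchart = "graph TD; A[Start] --> B{On line}; B -->|yes| C[Go forward]; B -->|no| D[Turn];"
print(mermaid_image_url(flowchart))
```

Calling `save_mermaid_image(flowchart, "flowchart.jpg")` at production time would leave a static file that can be embedded in the released notebooks, so students never need the online service.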

One other thing to note about the flow chart is the provision of the long description text, provided as an aid to visually impaired students using screen readers. I’ve been pondering image descriptions for a long time, and there are a few things I am exploring, and want to explore further, as I’m updating the TM129 materials.

The first question is whether we need long description text at all, or whether the description should be inlined in the main text. When a diagram or chart is used in a text, there are at least two ways of seeing / reading it: first, as a sequence of marks on a page: “there is a box here with this label, connected by an arrow to a box to the right of it with that label”. And so on. In a chart, such as a scatterplot, something like “a scatterplot with x-axis labelled this and ranging from this to that, a y-axis labelled whatever ranging wherever, a series of blue points densely arranged in the area (coords)” etc etc.

I’ve done crude sketches previously of how we might start to render Grammar of Graphics (ggplot) described graphics and matplotlib chart objects as text (eg First Thoughts on Automatically Generating Accessible Text Descriptions of ggplot Charts in R), but I’ve not found anyone else internally keen to play with that idea (at least, not with me, or to my knowing), so I keep putting off doing more on that. But I do still think it could be a useful thing to do more of.
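
To make the “marks on a page” style of description concrete, here is a small sketch that generates that sort of literal text. It works from plain lists of data rather than from a ggplot or matplotlib chart object, and the function name and wording are purely illustrative.

```python
def describe_scatterplot(xs, ys, x_label, y_label, colour="blue"):
    """Return a literal, "marks on the page" description of a scatterplot."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    return (
        f"A scatterplot with {len(xs)} {colour} points. "
        f"The x-axis is labelled '{x_label}' and ranges from {min(xs)} to {max(xs)}. "
        f"The y-axis is labelled '{y_label}' and ranges from {min(ys)} to {max(ys)}."
    )


print(describe_scatterplot([1, 2, 3, 4], [10, 12, 9, 15], "time (s)", "distance (cm)"))
```

A version that introspected a real chart object would pull the labels and ranges from the object itself, so the description could never drift out of step with the image.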

Another approach might be to generate text via a parser over the diagram’s definition in code (I’ve never really played with parsers; lark and plyplus could provide a start). Or, if the grammar is simple enough, provide students with a description early on of how to “read” the description language and then provide the “generator text” as the description text. (Even simple regexes might help, eg mapping -> to “right arrow” or “leads to” etc.) The visual diagram is often a function of the generator text and a layout algorithm (or, following UKgov public service announcements in abusing “+”, diagram = generator_text + layout) so as long as the layout algorithm isn’t deriving and adding additional content, but is simply re-presenting the description as provided, the generator text is the most concise long description.
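
The “simple regexes” idea can be sketched in a few lines. The token-to-phrase mapping below is a hypothetical one chosen for the example, not a fixed vocabulary:

```python
import re

# Hypothetical mapping from diagram-language tokens to spoken phrases.
# The two-way arrow is handled first so its "->" tail is not matched separately.
ARROW_PHRASES = [
    (r"\s*<->\s*", " is linked in both directions to "),
    (r"\s*-{1,2}>\s*", " leads to "),
]


def read_aloud(generator_text):
    """Replace arrow tokens in the generator text with readable phrases."""
    for pattern, phrase in ARROW_PHRASES:
        generator_text = re.sub(pattern, phrase, generator_text)
    return generator_text


print(read_aloud("start -> read sensor -> decide"))
```

This only re-presents what is already in the generator text, which is the point: nothing is added that the layout algorithm would not also have been given.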

The second way of looking at / seeing / reading a chart is to try to interpret the marks made in ink in some way. This sort of description is usually provided in the main text, as a way of helping students learn to read the diagram / chart, and what areas of it to focus on. Note that the “meaning” of a chart is subject to stance and rhetoric. On a line chart, we might have a literal description of the ink, say “line up and to the right”; we might then “read” that as “increasing”, and we might then interpret that “increasing” as evidence of some effect, as a persuasive rhetorical argument in favour of or against something, and so on. Again, that sort of interpretation is the one we’d offer all students equally.

But when it comes to just the “ink on paper” bit, how should we best provide an accessible equivalent to the visual representation? Just as sighted students in their mind’s eye presumably don’t read lines between boxes as “box connected by a line to box” (or do they?), I wonder whether our long description should be read by visually impaired students through their screen reader as “box connected by a line to box”. Why, in the description provided to visually impaired students, do we map a thing such as a -> b, represented visually, onto a text description of that visual representation? Does it help? The visual representation itself is a re-presentation of a relationship in a graphical way that tries to communicate that relationship to the reader. The visual is the communicative medium. So why use text to describe a visual intermediary representation in a long description? Would another intermediary representation be more useful? I guess I’m saying: why describe a visually rendered matplotlib object to a visually impaired student in a visual way if we want to communicate the idea of what the matplotlib object represents? Why not describe the chart object, which defines whatever is being re-presented in a visual way, in other terms? (I guess one reason we want to describe the visual representation to visually impaired students is so that when they hear sighted people talking in visual terms, they know what they’re talking about…)


So, back to creating tools of production. The mermaid.js route as it currently stands is not ideal; and the flow charts it generates are perhaps “non-standard” in their symbol selection and layout. (Note that this is something we could perhaps address by forking and fixing the mermaid.js library so that it does render things as we’d like to see them…)

Another possible flowcharter library I came across was flowchart.js. I did manage to wrap this in a jp_proxy_widget as flowchart_js_jp_proxy_widget to provide a means of rendering flowcharts from a simple description within a notebook.

You can find it here: innovationOUtside/flowchart_js_jp_proxy_widget

I also created a simple magic associated with it…

(Note that the jp_proxy_widget route to this magic is perhaps not the best way of doing things, but I’ve been exploring how to use jp_proxy_widget more generally, and this fitted with that; as a generic recipe, it could be handy. What would be useful is a recipe that does not involve jp_proxy_widget; nb-flowchartjs doesn’t seem to work atm, but could provide a clue as to how to do that…)

The hour or two spent putting that together means I now have a reproducible way of scripting the production of simple flowchart diagrams using flowchart.js. The next step is to try to figure out how to parse the flowchart.js diagram descriptions, and for simple ones at least, have a stab at generating a textualised version of them. (Although as mentioned above, is the diagram description text its own best description?)
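
As a first stab at what such a parser could look like, the sketch below handles only a simple subset of the flowchart.js description language: node definitions of the form `name=>type: label` and connection lines built from `->`, with optional `(yes)` / `(no)` branches. Anything beyond that (links, styling, layout-only hints) is ignored or rejected, and the function names and output wording are assumptions made for the example.

```python
import re

NODE_RE = re.compile(r"^(\w+)=>(\w+)(?::\s*(.*))?$")  # e.g. cond=>condition: Line detected?
STEP_RE = re.compile(r"^(\w+)(?:\(([^)]*)\))?$")      # e.g. cond(yes) or op


def parse_flowchart(description):
    """Split a simple flowchart.js-style description into nodes and edges."""
    nodes, edges = {}, []
    for line in description.splitlines():
        line = line.strip()
        if not line:
            continue
        node = NODE_RE.match(line)
        if node:
            name, kind, label = node.groups()
            nodes[name] = (kind, label or name)
        elif "->" in line:
            steps = [STEP_RE.match(step.strip()) for step in line.split("->")]
            if not all(steps):
                raise ValueError(f"Cannot parse connection: {line!r}")
            for src, dst in zip(steps, steps[1:]):
                # Keep only yes/no branch labels; drop layout hints such as "right".
                options = [o.strip() for o in (src.group(2) or "").split(",")]
                branch = next((o for o in options if o in ("yes", "no")), None)
                edges.append((src.group(1), branch, dst.group(1)))
    return nodes, edges


def textualise(description):
    """Turn the parsed flowchart into one plain-text sentence per symbol and edge."""
    nodes, edges = parse_flowchart(description)

    def label(name):
        return nodes.get(name, ("unknown", name))[1]

    lines = [f"{kind.capitalize()} symbol: '{text}'." for kind, text in nodes.values()]
    for src, branch, dst in edges:
        if branch:
            lines.append(f"From '{label(src)}', if {branch}, go to '{label(dst)}'.")
        else:
            lines.append(f"From '{label(src)}', go to '{label(dst)}'.")
    return "\n".join(lines)


example = """
st=>start: Start
op=>operation: Drive forward
cond=>condition: Line detected?
e=>end: Stop

st->op->cond
cond(yes)->e
cond(no)->op
"""
print(textualise(example))
```

Whether the generated sentences are actually any easier to follow than the original eight-line description is exactly the open question raised above.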

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
