Republishing OpenLearn Materials In Markdown – Next Steps Taken…

Following on from yesterday’s post, I made a little more progress today trying to sort out a workflow.

First up, I had a look at my binder-base-boxes to see if I could automate building them with repo2docker. It seems I can: there is an example build at binder-examples/continuous-build, referenced from the repo2docker docs page Using repo2docker as part of your Continuous Integration.

I needed to make a slight tweak to the CircleCI config to allow pushing containers built in repo branches to DockerHub, but it was easy enough to spot where (removing the lines that limited builds to run only in master). There’s also a slight complication in that the GitHub org name I chose for the repo has a - in it, and that symbol is disallowed in DockerHub org names; so rather than just lazily reuse the GitHub org name when pushing the image, I had to set another org name (without the -) as an env var in my CircleCI project settings that the script could pick up (support for this is built in to the script). I also added a tweak to the container naming to use the branch name as the container image tag. There’s an example box here: binder-base-boxes:chemistry, though I haven’t tried to use it as part of a CircleCI build yet… (I guess I need to check it includes the packages CircleCI requires…) The associated DockerHub repo is here.
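As a rough illustration of the naming logic described above (not the actual CircleCI script, which I haven’t reproduced here), a minimal Python sketch might look like the following. The `DOCKER_ORG` variable name and the `docker_image_ref` function are hypothetical; the real script may well use different names.

```python
import os
import re


def docker_image_ref(github_org, repo, branch, env=None):
    """Build an `org/repo:tag` image reference for pushing to DockerHub."""
    env = os.environ if env is None else env
    # DOCKER_ORG is a hypothetical variable name: an override for when the
    # GitHub org name cannot be reused as-is on DockerHub.
    org = env.get("DOCKER_ORG") or github_org
    # Docker tags may contain letters, digits, "_", "." and "-" (max 128
    # chars) and may not start with "." or "-"; replace anything else,
    # e.g. the "/" in a branch called feature/x.
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", branch).lstrip(".-")[:128] or "latest"
    return f"{org.lower()}/{repo.lower()}:{tag}"
```

So, assuming a GitHub org called `my-org` and an override of `myorg`, the chemistry branch of binder-base-boxes would be pushed as `myorg/binder-base-boxes:chemistry`.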

So that’s one dangling jigsaw piece…

I also created a template repo for publishing GitHub Pages sites using nbsphinx under CircleCI. This should have all you need to get going: dump a load of .md files into a repo and have them automatically published under CircleCI to GitHub Pages. (Actually, I probably need to add a few docs to the README…) There’s an example repo here: markdown version of OpenLearn course: The molecular world, and the site is here: The molecular world – OpenLearn Reimagined.

Next on the to do list:

  • automatically generate a simple index.rst file;
  • sort out image dereferencing for nested directories (path to a common image dir);
  • put together a reusable script or CLI tool that, given an OpenLearn course URL, can download the OU-XML source of the OpenLearn module and generate a set of markdown documents from it, with dereferenced image links.
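The first two items on that list could be sketched along the following lines. This is only a guess at one way of doing it, not something that is in the template repo yet: the function names are made up, and it assumes that every .md file under the docs directory should appear in a single toctree and that images live in one shared directory.

```python
import os
from pathlib import Path


def build_index_rst(docs_dir, title="Contents", maxdepth=2):
    """Return index.rst text with a toctree listing every .md file under docs_dir."""
    docs_dir = Path(docs_dir)
    # Sorted so that numbered filenames (01_intro.md, 02_...) keep their order;
    # Sphinx toctree entries are document names without the file suffix.
    entries = sorted(
        p.relative_to(docs_dir).with_suffix("").as_posix()
        for p in docs_dir.rglob("*.md")
    )
    lines = [title, "=" * len(title), "", ".. toctree::", f"   :maxdepth: {maxdepth}", ""]
    lines += [f"   {entry}" for entry in entries]
    return "\n".join(lines) + "\n"


def relative_image_link(md_file, image_file):
    """Return the path to image_file as it should appear in a link inside md_file."""
    rel = os.path.relpath(image_file, start=Path(md_file).parent)
    # Use forward slashes so the link also works if generated on Windows.
    return Path(rel).as_posix()
```

For a document at `docs/session1/part1.md` and an image at `docs/images/fig1.png`, `relative_image_link` gives `../images/fig1.png`, which is the sort of path a nested markdown file would need in order to reach a common image directory.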

What this would then do is make it easy for anyone to convert an OpenLearn course that has a source OU-XML document into an equivalent markdown source site that can be automatically republished as an HTML site, and that they can edit directly in the markdown source on GitHub.

The other major workflow issue I need to sort out is how best to manage the “Binder” environments required to execute documents via Jupytext as part of the nbsphinx publishing step. The chemistry base box takes quite a long time to build, for example, so if it’s used to build pages as part of an nbsphinx workflow it would be good to be able to pull a cached build in CircleCI (I really need to get my head round CircleCI caching) or use a prebuilt Docker image.
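One way of deciding when a cached or prebuilt environment can be reused is to key it on a hash of the files that define the environment, so that editing a markdown page doesn’t trigger a slow rebuild but changing a requirement does. The sketch below is an assumption about how that might work rather than anything I have running: the `environment_key` function is hypothetical, and the file list is just some of the configuration files repo2docker looks for.

```python
import hashlib
from pathlib import Path

# Some of the files repo2docker reads to define an environment; adjust to suit.
ENV_FILES = ("environment.yml", "requirements.txt", "apt.txt", "postBuild", "Dockerfile")


def environment_key(repo_dir, names=ENV_FILES):
    """Return a short hash that changes only when an environment file changes."""
    digest = hashlib.sha256()
    for name in names:
        path = Path(repo_dir) / name
        if path.is_file():
            # Include the filename so moving content between files changes the key.
            digest.update(name.encode("utf-8") + b"\0")
            digest.update(path.read_bytes() + b"\0")
    return digest.hexdigest()[:12]
```

The resulting key could then be used as a cache key or as an image tag: if an image with that tag already exists it can be pulled, otherwise it needs building and pushing.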

There’s also some thinking to be done about the difference between a publishing step in which a notebook is executed and generates, e.g., some HTML/JS that can be embedded and work standalone as an interactive on GitHub Pages, and interactive widgets that need a Jupyter server on the back end to work. I’ve already spotted at least one opportunity for recasting an ipywidgets decorated function that generates views over different 3D molecules as a simple “pure” JS display that works without the need for the py function on the back end. Related to this, I need to explore ThebeLab and nbinteract support in nbsphinx. If anyone has demos, please share… :-)

Author: Tony Hirst
