Tagged: ipynb

Making Music and Embedding Sounds in Jupyter Notebooks

It’s looking as if the new level 1 courses won’t be making use of Jupyter notebooks (unless I can find a way of sneaking them in via the single unit I’ve put together!;-) but I still think they’re worth spending time exploring for course material production as well as presentation.

So to this end, as I read through the materials being drafted by others for the course, I’ll be looking for opportunities to do the quickest of quick demos, flagging things that might be worth exploring more in future.

So here’s a quick example. One of the nice design features of TM112, the second of the two new first level courses, is that it incorporates some mini-project activities for students to work on across the course. One of the project themes relates to music, so I wondered what doing something musical in a Jupyter notebook might look like.

The first thing I tried was taking the outlines of one of the activities – generating an audio file using python and MIDI – to see how the embedding might work in a notebook context, without the faff of having to generate an audio file from python and then find a means of playing it:
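By way of a sketch of the sort of thing I mean – and assuming the music21 package here, which may well not be the library the activity actually uses – a simple phrase can be built up in code and rendered as playable MIDI directly in the cell output:

#A minimal sketch, assuming the music21 package (pip install music21)
from music21 import stream, note

s = stream.Stream()
for pitch in ['C4', 'E4', 'G4', 'C5']:
    s.append(note.Note(pitch, quarterLength=0.5))

#In a notebook, this should embed a playable rendering of the phrase
#in the cell output (or hand off to an external player, depending
#on the environment)
s.show('midi')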


Yep – that seems to work… Poking around music related libraries, it seems we can also generate musical notation…
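For example, sticking with the music21 assumption (and with an external score renderer such as MuseScore or Lilypond configured for it), the same phrase can be typeset:

#Render the phrase above as an engraved score
#(requires an external renderer such as MuseScore or Lilypond)
s.show()

#Or, as a quick plain-text fallback:
s.show('text')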


In fact, we can also generate musical notation from a MIDI file too…
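Again a hedged sketch, with music21 and a placeholder filename:

from music21 import converter

#Parse a MIDI file (placeholder filename) and render it as notation
score = converter.parse('example.mid')
score.show()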


(I assume the mappings are correct…)

So there may be opportunities there for creating simple audio files, along with the corresponding score, within the notebooks. Then any changes required to the audio file, as well as the score, can be effected in tandem.

I also had a quick go at generating audio files “from scratch” and then embedding the playable audio file:
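The basic pattern is just a few lines – this sketch uses numpy to synthesise a sine tone and the standard IPython.display.Audio API to embed it:

import numpy as np
from IPython.display import Audio

sr = 44100                             #sample rate, in Hz
t = np.linspace(0, 2, 2 * sr)          #two seconds’ worth of sample times
waveform = np.sin(2 * np.pi * 440 * t) #a 440 Hz sine tone

#Embed a playable audio widget in the cell output
Audio(waveform, rate=sr)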



That seems to work too…

We can also plot the waveform:
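Continuing the sketch above, a few lines of matplotlib are all it takes:

import matplotlib.pyplot as plt

#Plot the first few hundred samples of the generated tone
plt.plot(t[:500], waveform[:500])
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.show()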


This might be handy for a physics or electronics course?

As well as providing an environment for creating “media-ful” teaching resources, the code could also provide the basis of interactive student explorations. I don’t have a demo of any widget-powered examples to hand in a musical context (maybe later!), but for now, if you do want to play with the notebooks that generated the above, you can do so on mybinder – http://mybinder.org/repo/psychemedia/ou-tm11n – in the midiMusic.ipynb and Audio.ipynb notebooks. The original notebooks are here: https://github.com/psychemedia/OU-TM11N

Accessible Jupyter Notebooks?

Pondering the extent to which Jupyter notebooks provide an accessible UI, I had a naive play with the Mac VoiceOver app over some Jupyter notebooks the other day: markdown cells were easy enough to convert to speech, but the code cells and their outputs are nested block elements which seemed to take a bit more navigation (I think I really need to learn how to use VoiceOver properly for a proper test!). Suffice to say, until I learn how to use screen-reader software properly, I can’t really tell how accessible the notebooks actually are…

A quick search around for accessibility related extensions turned up the jupyter-a11y: reader extension [code], which looks like it could be a handy crib. This extension will speak aloud the contents of a code cell or markdown cell, as well as announcing navigational features such as whether you are in the cell at the top or the bottom of the page. I’m not sure it speaks aloud the output of a code cell though? But the code looks simple enough, so this might be worth a play with…

On the topic of reading aloud code cell outputs, I also started wondering whether it would be possible to generate “accessible” alt or longdesc text for matplotlib generated charts and add it to the image element inserted into the code cell output. This text could also be used to feed the jupyter-a11y narrator. (See also First Thoughts on Automatically Generating Accessible Text Descriptions of ggplot Charts in R for some quick examples of generating textual descriptions from matplotlib charts.)
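As a crude proof of concept of the sort of thing I have in mind – this isn’t any actual library’s API, just metadata matplotlib already exposes – a rough description could be scraped from the chart object itself:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [1, 4, 9, 16])
ax.set_title('Example chart')
ax.set_xlabel('x')
ax.set_ylabel('y')

#Naively assemble an alt-text style description from the axes metadata
desc = 'Line chart titled "{0}" plotting {1} against {2}.'.format(
    ax.get_title(), ax.get_ylabel(), ax.get_xlabel())
print(desc)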

Another way of complementing the jupyter-a11y reader extension might be to use the python pindent [code] tool to annotate the contents of code cells with accessibility-supporting comments (such as comments that identify the end of if/else blocks and function definitions); a sketch of the sort of annotation pindent produces is shown below. Another advantage of having a pindent extension annotate the content of notebook python code cells is that it might help improve the readability of code for novices. So for example, we could have a notebook toolbar button that toggles pindent annotations on a selected code cell.
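For example, running pindent.py (it lives in the CPython Tools/scripts directory) with its -c (“complete”) switch:

#python pindent.py -c myscript.py

turns something like:

def classify(x):
    if x > 0:
        return 'positive'
    else:
        return 'non-positive'

into:

def classify(x):
    if x > 0:
        return 'positive'
    else:
        return 'non-positive'
    # end if
# end def classify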

For code read aloud by the reader extension, I wonder if it would be worth running the content of any (python) code cells through pindent first?

PS FWIW, here’s a related issue on Github.

PPS another tool that helps make python code a bit more accessible, in an active sense, in a Jupyter notebook is this pop-up variable inspector widget.

Steps Towards Some Docker IPython Magic – Draft Magic to Call a Contentmine Container from a Jupyter Notebook Container

I haven’t written any magics for IPython before (and it probably shows!) but I started sketching out some magic for the Contentmine command-line container I described in Using Docker as a Personal Productivity Tool – Running Command Line Apps Bundled in Docker Containers.

What I’d like to explore is a more general way of calling command line functions accessed from arbitrary containers via a piece of generic magic, but I need to learn a few things along the way – such as handling arguments, for a start!

The current approach provides crude magic for calling the contentmine functions included in a public contentmine container from a Jupyter notebook that is itself running inside a container. The command-line contentmine container is started from within the notebook container and uses --volumes-from the notebook container to pass files between the two. The path to the directory mounted from the notebook is identified by a bit of jiggery pokery, as is the method for spotting which container the notebook is actually running in (I’m all ears if you know of a better way of doing either of these things?:-)
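For what it’s worth, one bit of jiggery pokery I know of for the second problem – and there may well be better ways – relies on the fact that by default Docker sets a container’s hostname to its own short container ID:

import socket

#By default, Docker sets the hostname inside a container to the short ID
#of that container, so this is one crude way for code running inside a
#container to identify which container it is in
container_id = socket.gethostname()
print(container_id)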

The magic has the form:

%getpapers /notebooks rhinocerous

to run the getpapers query (with fixed switch settings for now) on the search term rhinocerous; files are shared back from the contentmine container into the /notebooks folder of the Jupyter container.

Other functions include:

%norma /notebooks rhinocerous
%cmine /notebooks rhinocerous

These functions are applied to files in the same folder as was created by the search term (rhinocerous).
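For reference, the general shape of the magic is something like the following simplified sketch – the real version shares files between containers using --volumes-from, and the getpapers switches shown here are illustrative rather than the fixed settings actually baked in:

from IPython.core.magic import register_line_magic
import subprocess

@register_line_magic
def getpapers(line):
    '''Crude line magic of the form: %getpapers MOUNTDIR QUERY'''
    mountdir, query = line.split()
    #Call the contentmine container, sharing files via a mounted folder
    subprocess.call(['docker', 'run', '--rm',
                     '-v', '{0}:/contentmine'.format(mountdir),
                     'psychemedia/contentmine',
                     'getpapers', '-q', query,
                     '-o', '/contentmine/{0}'.format(query)])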

The magic needs updating so that it will also work in a Jupyter notebook that is not running within a container – this should simply be a case of switching in a different directory path. The magics also need tweaking so that we can pass parameters in. I’m not sure whether more flexibility should also be allowed in specifying the path (we need to make sure that the paths for the mounted directories are the correct ones!)

What I’d like to work towards is some sort of line magic along the lines of:

%docker psychemedia/contentmine -mountdir /CALLING_CONTAINER_PATH -v ${MOUNTDIR}:/PATH COMMAND -ARGS etc

or cell magic:

%%docker psychemedia/contentmine -mountdir /CALLING_CONTAINER_PATH -v ${MOUNTDIR}:/PATH

Note that these go against the docker command line syntax – should they be closer to it?

The code, and a walked-through demo, are included in the notebook available via this gist, which should also be embedded below.

Calling an OData Service From Python – UK Parliament Members Data Platform

Whilst having a quick play producing Slack bots and slash commands around the UK Parliament APIs, I noticed (again) that the Members data platform has an OData endpoint.

OData is a data protocol for querying online data services via HTTP requests, although it never really seems to have caught the popular imagination – possibly because Microsoft thought it up, possibly because it seems really fiddly to use…

I had a quick look around for a Python client/handler for it, and the closest I came was the pyslet package. I’ve posted a notebook showing my investigations to date here: Handling the UK Parliament Members Data Platform OData Feed, but it all seems really clunky and I’m not sure I’ve got it right! (There doesn’t seem to be a lot of tutorial support out there, either?)

Here’s an example of the sort of mess I got myself in:
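For reference, the basic shape of the pyslet client calls is something like the following sketch (the endpoint and the feed and property names below are placeholders, not the actual Parliament ones):

from pyslet.odata2.client import Client

#Hypothetical endpoint and feed name, for illustration only
c = Client('http://example.com/ODataService.svc/')
with c.feeds['Members'].OpenCollection() as members:
    #Python 2 style iteration, as per the pyslet docs
    for member in members.itervalues():
        print(member['Name'].value)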


To make the Parliament OData service more useful needs a higher level Python wrapper, I think – one that abstracts a bit further and provides some function calls that make it a tad easier (and more natural) to get at the data. Or maybe I need to step back, have a read of the OData docs, properly get my head around the pyslet OData calls, and try again!

ResBaz Cloud – Containerised Research Apps as a Service

Just over three years or so ago, the OU’s KMi started experimenting with a service to support researchers that made RStudio – and a linked MySQL database – available as an online service (Open Research Data Processes: KMi Crunch – Hosted RStudio Analytics Environment).

I’m not sure if they’ve also started exploring the provision of other browser accessed applications – Jupyter notebooks, for example – but developing online personal application delivery models is something I’ve felt the OU should be exploring for a long time – for undergraduate and postgraduate teaching, as well as research.

I don’t know whether KMi have been looking at delivering apps via self-service launching of dockerised/containerised applications, or whether there are any HE or Research Council infrastructure projects looking at supporting this sort of thing, but it seems that other enlightened agencies are… For example, a few weeks ago I came across a tweet from ex-JISC disrupter Dave Flanders mentioning the Australian ResBaz cloud service:


The service is free to the Australian academic research community (I’m grateful to the team for providing me with reviewer access:-): early stage researchers can request access (or configure access?) to a named research cluster, and then deploy containers to it:


The containerised applications on offer are initially configured by the ResBaz team – I don’t think there’s a way of pointing to your own Dockerfile/setupconfig/image on Dockerhub – but this means there is an element of support that will help you get set up with an application that you know will run!


The containers you create persist – you can turn them off and on again, as well as deleting them and creating new ones – which means you can save project and data files within the container. There’s also an option to export the container, which supports portability, I guess.

The platform itself is reminiscent of a minimal take on something like wakari.io, which provides access to a hosted version of IPython notebooks within a workbench environment. To my mind, KMi Crunch has more of a workbench feel to it, because it provides application integration (RStudio + MySQL), albeit baked in. At the moment, ResBaz doesn’t seem to offer that. (However, another service that I’ll be blogging about in a day or two, binder, does provide support for 1-click creation of linked containers, although again, the configuration options are limited.) I think binder builds on elements of tmpnb.org, which itself demonstrates support for a full blown Jupyter install capable of running several kernels – which may be something for the ResBaz folk to think about (for example, offering at least an R kernel within the notebooks, and maybe Python 3 as well as Python 2.7?)


One of the great things about the ResBaz set-up seems to be its support for training events. From my own personal experience, it’s really handy to be able to point workshop participants to online, browser reachable versions of the applications covered in the workshop you’re running.

For OU teaching, I think we really should be looking seriously at using software packages that can be accessed via a browser and run either as a local virtualised service or as a remotely hosted service, to mitigate software install issues/hassles. For OU postgrad research students, I think that running applications via containers has a lot to recommend it. And for academic researchers, including the growing number of digital humanities researchers, I think that the range of benefits associated with being able to run research software using what is essentially a software-application-as-a-service model is increasing.

But then, what do I know? I just watched a bunch of folk wasting much of the day trying to work out how to support a raft of remote, informal learners installing some third party s/w onto all manner of personally managed, weird and wonderful Windows machines. (The ones on company machines tend not to have the privileges they need to install the software, so we just forget about them. The ones on notebooks wondering why their machines start to fall over when they have to run more than a browser, or the ones who have tablets that can’t install anything other than custom built applications, are also discounted… If the OU is set on becoming a global, online provider, someone needs to start thinking about – and doing something about – this…)

See also: Seven Ways of Running IPython Notebooks

Authoring Multiple Docs from a Single IPython Notebook

It’s my not-OU day today, and whilst I should really be sacrificing it to work on some content for a FutureLearn course, I thought instead I’d tinker with a workflow tool related to the production process we’re using.

The course will be presented as a set of HTML docs on FutureLearn, supported by a set of IPython notebooks that learners will download and execute themselves.

The handover resources will be something like:

– a set of IPython notebooks;
– a Word document for each week containing the content to appear online (this document will be used as the basis for multiple pages on the course website; the content is entered into the FutureLearn system by someone else as markdown, though I’m not sure what flavour?);
– for each video asset, a Word document containing the script;
– ?separate image files (the images will also be in the Word doc).

Separate webpages provide teaching that leads into a linked-to IPython notebook. (Learners will be running IPython via Anaconda on their own desktops – which means tablet/netbook users won’t be able to do the interactive activities as currently delivered; we looked at using Wakari, but didn’t go with it; offering our own hosted solution or tmpnb server was considered out of scope.)

The way I have authored my week is to create a single IPython notebook that proceeds in a linear fashion, with “FutureLearn webpage” content authored using markdown, as well as incorporating executed code cells, followed by “IPython notebook” activity content relating to the previous “webpage”. The “IPython notebook” sections are preceded by a markdown cell containing a START NOTEBOOK statement, and closed with a markdown cell containing an END NOTEBOOK statement.

I then run a simple script that:

  • generates one IPython notebook per “IPython notebook” section;
  • creates a monolithic notebook containing all of, but only, the “FutureLearn webpage” content;
  • generates a markdown version of that monolithic notebook;
  • uses pandoc to convert the monolithic markdown doc to a Microsoft Word/docx file.


Note that it would be easy enough to render each “FutureLearn webpage” doc as markdown directly from the original notebook source, into its own file that could presumably be added directly to FutureLearn, but that was seen as being overly complex compared to the original “copy rendered markdown from notebook into Word and then somehow generate markdown to put into FutureLearn editor” route.

import io, sys
import IPython.nbformat as nb
import IPython.nbformat.v4.nbbase as nb4

#Are we in a notebook segment? If so, innb holds the notebook being built
innb = None

#Quick and dirty count of notebooks
count = 1

#The monolithic notebook is the content excluding the separate notebook content
monolith = nb4.new_notebook()

#Load the original doc in (the filename is illustrative)
mynb = nb.read('master.ipynb', as_version=4)

#For each cell in the original doc:
for i in mynb['cells']:
    if (i['cell_type']=='markdown'):
        #See if we can spot a standalone notebook delimiter
        if ('START NOTEBOOK' in i['source']):
            #At the start of a block, create a new notebook
            innb = nb4.new_notebook()
        elif ('END NOTEBOOK' in i['source']):
            #At the end of the block, save the cells to a new standalone notebook file
            nb.write(innb, 'notebook{0}.ipynb'.format(count))
            count += 1
            innb = None
        elif (innb is not None):
            innb['cells'].append(nb4.new_markdown_cell(i['source']))
        else:
            monolith['cells'].append(nb4.new_markdown_cell(i['source']))
    elif (i['cell_type']=='code'):
        #For the code cells, preserve any output text
        cell = nb4.new_code_cell(i['source'], outputs=i['outputs'])
        #Route the code cell as required...
        if (innb is not None):
            innb['cells'].append(cell)
        else:
            monolith['cells'].append(cell)

#Save the monolithic notebook
nb.write(monolith, 'monolith.ipynb')

#Convert it to markdown
!ipython nbconvert --to markdown monolith.ipynb

##On a Mac, I got pandoc via:
#brew install pandoc

#Generate a Microsoft .docx file from the markdown
!pandoc -o monolith.docx -f markdown -t docx monolith.md

What this means is that I can author a multiple chapter, multiple notebook minicourse within a single IPython notebook, then segment it into a set of standalone files across a variety of document types.

Of course, what I really should have been doing was working on the course material… but then again, it was supposed to be my not-OU day today…;-)

PS The actual workflow, of course, turned out to be more traditional. Content for the FutureLearn website was copied from the notebooks into a Word document, edited there, and then somehow converted to markdown for entry into FutureLearn. (I haven’t seen what the FutureLearn content entry forms look like – anyone got a user guide or screenshots they could share?) Which caused all sorts of fun with the tables and code styling…

IPython Markdown Opportunities in IPython Notebooks and RStudio

One of the reasons I started working on the Wrangling F1 Data With R book was to see what the Rmd (RMarkdown) workflow was like. Rmd allows you to combine markdown and R code in the same document, as well as executing the code blocks and then displaying the results of that code execution inline in the output document.


As well as rendering to HTML, we can generate markdown (md is actually produced as the interim step to HTML creation), PDF output documents, etc etc.

One thing I’d love to be able to do in the RStudio/RMarkdown environment is include – and execute – Python code. Does a web search to see what Python support there is in R… Ah, it seems it does it already… (how did I miss that?!)
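That is, knitr has a python chunk engine, so an Rmd document can include chunks along the lines of:

```{python}
x = [1, 2, 3]
print(sum(x))
```

with the printed output inlined into the rendered document.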


ADDED: Unfortunately, it seems as if Python state is not persisted between separate python chunks – instead, each chunk is run as a one-off inline python command. However, it seems as if there could be a way round this, which is to use a persistent IPython session; the knitron package looks like just the thing for supporting that.

So that means in RStudio, I could use knitr and Rmd to write a version of Wrangling F1 Data With RPython

Of course, it would be nicer if I could write such a book in an everyday python environment – such as in an IPython notebook – that could also execute R code (just to be fair;-)

I know that we can already use cell magic to run R in an IPython notebook:
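That’s the cell magic provided by the rpy2 package: load the extension in one cell…

%load_ext rpy2.ipython

…then prefix a cell with %%R to have its contents executed as R:

%%R
x <- c(1, 2, 3)
summary(x)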


…so that’s that part of the equation.

And the notebooks do already allow us to mix markdown cells and code blocks/output. The default notebook presentation style is to show the code cells with the numbered In []: and Out []: block numbering, but it presumably only takes a small style extension or customisation to suppress that? And another small extension to add the ability to hide a code cell and just display the output?

So what is it that (to my mind at least) makes RStudio a nicer writing environment? One reason is the ability to write the Rmarkdown simply as Rmarkdown in a simple text editor environment. Another is the ability to inline R code and display its output in-place.

Taking that second point first – the ability to do better inlining in IPython notebooks – it looks like this is just what the python-markdown extension does:
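As I understand it, the extension lets you reference Python expressions in a markdown cell using a {{…}} templating syntax, with the current value substituted in when the cell is rendered. So a markdown cell containing something like:

The fastest lap time was {{laptimes.min()}} seconds.

should render with the evaluated value of laptimes.min() inlined into the text (assuming a laptimes object defined in a previously executed code cell; the variable name is just for illustration).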


But how about the ability to write some sort of pythonMarkdown and then open in a notebook? Something like ipymd, perhaps…?


What this seems to do is allow you to open an IPython-markdown document as an IPython notebook (in other words, it replaces the ipynb JSON document with an ipymd markdown document…). To support the document creation aspects better, we just need an exporter that removes the code block numbering and trivially allows code cells to be marked as hidden.
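If I read the ipymd README right, switching it on is just a matter of pointing the notebook server at a different contents manager:

#In jupyter_notebook_config.py, per the ipymd README:
c.NotebookApp.contents_manager_class = 'ipymd.IPymdContentsManager'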

Now I wonder… what would it take to be able to open an Rmd document as an IPython notebook? Presumably just the ability to detect the code language, and then import the necessary magics to handle its execution? It’d be nice if it could cope with inline code, e.g. using the python-markdown magic too?

Exciting times could be ahead:-)