Find the process using the port:
lsof -i tcp:$PORT
Kill the process:
kill -9 $PROCESS_ID
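The same two steps can be scripted. What follows is only a minimal sketch, assuming `lsof` is on the path; the function names are hypothetical, and it tries a polite SIGTERM by default rather than going straight to `kill -9`.

```python
import os
import signal
import subprocess


def parse_pids(lsof_terse_output):
    """Parse the output of `lsof -t` (one process ID per line) into sorted, unique ints."""
    return sorted({int(tok) for tok in lsof_terse_output.split() if tok.isdigit()})


def pids_on_port(port):
    """Return the IDs of processes with a TCP socket on `port`, using `lsof -t -i tcp:PORT`."""
    result = subprocess.run(
        ["lsof", "-t", "-i", f"tcp:{port}"],
        capture_output=True,
        text=True,
    )
    return parse_pids(result.stdout)


def stop_port(port, force=False):
    """Signal every process found on `port`; force=True is the equivalent of kill -9."""
    sig = signal.SIGKILL if force else signal.SIGTERM
    for pid in pids_on_port(port):
        os.kill(pid, sig)
```

Calling `stop_port(8888)` asks whatever is using port 8888 to shut down; `stop_port(8888, force=True)` mirrors the `kill -9` above.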
Whilst listening in, via Skype, on the School meeting yesterday, I treated it as radio and also started tinkering with an XSLT converter for transforming OU-XML to something I can get into a Jupyter notebook form. (If anyone can point me to official OU XSLT transformers for OU-XML, that’d be really useful…)
I’m 15 years out of XSLT, so I’ve started with an easy converter into HTML that is most of the way there now for common OU-XML elements, as well as one that will convert into a markdown format supported by Jupytext, which would allow me to go OU-XML-md-ipynb. I also wonder if an OU-XML-ipynb (JSON) route might be a useful exercise.
But then I started wondering… would it make more sense to try to get it into Pandoc? Pandoc recently announced Jupyter notebook/ipynb support as a native converter, so what are the routes in and out of pandoc?
Poking around, it seems that pandoc represents things internally using its own AST (abstract syntax tree). Pandoc filters let you write your own output filters for converting documents represented using the AST in whatever format you want. There are a couple of Python packages that support writing pandoc output filters: pandocfilters, which includes examples, and panflute (docs), which has a separate examples repo; there’s also this handy overview of Technical Writing with Pandoc and Panflute.
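To give a feel for what a filter does, here is a minimal sketch that uses only the Python standard library rather than pandocfilters or panflute. It assumes the filter receives the document as pandoc's JSON AST on stdin and writes the modified AST to stdout; the `walk`, `shout` and `run_filter` names are made up for illustration.

```python
import json
import sys


def walk(node, action):
    """Recursively apply `action` to every element (a dict with a "t" key) in the AST."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            node = action(node)
        return {key: walk(value, action) for key, value in node.items()}
    return node


def shout(element):
    """Example action: upper-case the text carried by Str elements, leave the rest alone."""
    if element.get("t") == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return element


def run_filter(source=sys.stdin, sink=sys.stdout):
    """Read a JSON AST, transform its blocks, and write the result back out."""
    doc = json.load(source)
    doc["blocks"] = walk(doc["blocks"], shout)
    json.dump(doc, sink)
```

Saved as an executable script that ends with a call to `run_filter()`, something like this could be passed to pandoc via its `--filter` option. The packages mentioned above wrap up the same walk-and-replace pattern with much friendlier helpers.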
So that’s the first question: can I write a filter to generate a valid OU-XML document? OU-XML probably has some structural elements that are not matched by pandoc AST elements, but can these be encoded somehow as extensions to the AST, or represented as text elements in documents produced by pandoc that could be post-processed into OU-XML elements?
Going the other way, it seems that pandoc can ingest a JSON format that serialises the Pandoc AST structure, so if I can convert OU-XML into that then it would make life a lot easier for generating a wide range of output document formats from OU-XML.
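As a rough illustration of that route, the sketch below maps a tiny XML fragment onto pandoc's JSON AST. The element names (`Session`, `Title`, `Paragraph`) are only an assumption about what OU-XML looks like, the mapping is deliberately naive, and the `pandoc-api-version` value is a guess that would need to match the installed pandoc (the output of `pandoc -t json` on any document shows the value it expects).

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical OU-XML-like fragment; the element names are illustrative assumptions.
SAMPLE = (
    "<Session><Title>Getting started</Title>"
    "<Paragraph>Hello pandoc world</Paragraph></Session>"
)


def inlines(text):
    """Turn plain text into pandoc Str elements separated by Space elements."""
    out = []
    for word in text.split():
        if out:
            out.append({"t": "Space"})
        out.append({"t": "Str", "c": word})
    return out


def to_pandoc_ast(xml_text, api_version=(1, 17, 5, 4)):
    """Map Title to a level 1 Header and Paragraph to a Para; ignore everything else."""
    root = ET.fromstring(xml_text)
    blocks = []
    for child in root:
        text = child.text or ""
        if child.tag == "Title":
            # Header content: [level, [identifier, classes, key-value pairs], inlines]
            blocks.append({"t": "Header", "c": [1, ["", [], []], inlines(text)]})
        elif child.tag == "Paragraph":
            blocks.append({"t": "Para", "c": inlines(text)})
    # api_version is an assumption; it must be compatible with the pandoc in use
    return {"pandoc-api-version": list(api_version), "meta": {}, "blocks": blocks}


doc = to_pandoc_ast(SAMPLE)
print(json.dumps(doc))
```

Piping the printed JSON into `pandoc -f json -t markdown` (or any other output format) would then let pandoc do the rest of the work.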
So here’s the quandary… do I spend the rest of the morning finishing off my hack XSLT converter, or do I switch track and try to go down the pandoc route? Hmmm… maybe I should finish what I started: it’ll give me a bit more XSLT practice and should result in enough of an approximation of OU-XML content in notebooks that we can start to see whether that sort of conversion even makes sense.
By default, MyBinder looks to repositories on GitHub for its builds, but it can also build from GitHub gists,
GitLab.com repositories, and, well, any git repository with a networked endpoint, it seems:
What prompted me to this was looking for a way to launch a MyBinder container from Bitbucket. (For the archaeologists, there are various issues and PRs, such as here and here, as well as this recent forum post, How to use bitbucket repositories on mybinder.org, that trace some of the history…)
So what’s the trick?
For now, you need to get hold of the URL to a particular Bitbucket repo commit. For example, to try running this repo you need to go to the Commits page and grab the URL for the most recent master commit (or whichever one you want), which will contain the commit hash:
For example, something like
https://bitbucket.org/ueacomputervision/image-labelling-tool/commits/f3ddb33e4839f8a0fe73c168993b405adc13daf0 gives the commit hash f3ddb33e4839f8a0fe73c168993b405adc13daf0.
For the repo base URL
https://bitbucket.org/ueacomputervision/image-labelling-tool, the MyBinder launch link then takes on the form:
which is to say:
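As a rough sketch of how such a link can be put together: as far as I understand it, BinderHub's generic git provider expects the URL-escaped repository URL followed by the commit hash under a `/v2/git/` path. The helper names below are hypothetical, and the `https://mybinder.org/v2/git/` prefix is my assumption about the launch URL pattern.

```python
from urllib.parse import quote

# Commit URL copied from the Bitbucket Commits page
COMMIT_URL = (
    "https://bitbucket.org/ueacomputervision/image-labelling-tool"
    "/commits/f3ddb33e4839f8a0fe73c168993b405adc13daf0"
)


def split_commit_url(commit_url):
    """Split a Bitbucket commit URL into (repo base URL, commit hash)."""
    repo_url, commit_hash = commit_url.rsplit("/commits/", 1)
    return repo_url, commit_hash


def mybinder_git_url(repo_url, commit_hash):
    """Build a launch link for an arbitrary git repo (assumed /v2/git/ URL pattern)."""
    # safe="" makes sure the slashes and colon in the repo URL are escaped too
    return "https://mybinder.org/v2/git/{}/{}".format(quote(repo_url, safe=""), commit_hash)


repo_url, commit_hash = split_commit_url(COMMIT_URL)
launch_url = mybinder_git_url(repo_url, commit_hash)
print(launch_url)
```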
But it does look like things may get easier in the near future…
Long time readers should be more than well aware by now of MyBinder, the Jupyter project service that will build a Docker image from the contents of a git repository and then launch a container based on that image so you can work with a live, running, albeit temporary, instance of it.
But that’s not all it can do…
Via Chris Holdgraf on the Jupyter discourse community site (Tip: embed custom github content in a Binder link with nbgitpuller), comes a magical trick whereby you can launch a MyBinder instance built from one repository and populate it with files from another.
Why’s this useful? Well, if you’ve had a play with your own repos using MyBinder, you’ll know that each time you make a change to a repository, MyBinder will want to rebuild the Docker image next time you try to launch the repo there.
So if your repo defines a complex build that takes some time to install all of its dependencies, you have to wait for that build even if all you did was correct a typo in the markdown of a notebook file.
So here’s the trick…
nbgitpuller is a Jupyter server extension that supports the “one-way synchronization of a remote git repository to a local git repository”.
There are other approaches to git syncing too. See the next edition of Tracking Jupyter to find out what they are…
Originally developed as a tool to help distribute notebooks to students, it can be called via a Jupyter server URL. For example, if you have
nbgitpuller installed in a local Jupyter server running on the default port 8888, the following URL will pull data from the specified repo into the base directory the notebook server points to using a URL of the form:
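A minimal sketch of building that sort of link, assuming the `git-pull` endpoint takes `repo` and `branch` query parameters; the repository URL here is a made-up placeholder and the function name is hypothetical.

```python
from urllib.parse import urlencode


def nbgitpuller_url(server, repo, branch="master"):
    """Build a git-pull link for a Jupyter server that has nbgitpuller enabled."""
    query = urlencode({"repo": repo, "branch": branch})
    return "{}/git-pull?{}".format(server.rstrip("/"), query)


# Placeholder repo URL, for illustration only
pull_url = nbgitpuller_url(
    "http://localhost:8888",
    "https://github.com/example-user/example-notebooks",
)
print(pull_url)
```

Opening the resulting URL in a browser while the server is running should trigger the pull and then redirect to the notebook file listing.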
One of the neat things about Binderhub / MyBinder is that you can pass a
git-pull? argument through as part of a MyBinder launch URL, so if the repo you want to build from installs and enables
nbgitpuller, you can then pull notebooks into the launched container from a second,
For example, yesterday I came across the Python
show_ast package, and incorporated IPython magic, that will render the abstract syntax tree of a Python command:
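For a quick look at the underlying structure without installing anything, the standard library `ast` module can parse and dump a tree as text. This is not `show_ast` itself (which renders a diagram), just a sketch of the data it visualises.

```python
import ast

# Parse a single assignment statement into an abstract syntax tree
tree = ast.parse("a = b + 1")

# Text dump of the whole tree
print(ast.dump(tree))

# The node types encountered when walking the tree
node_types = [type(node).__name__ for node in ast.walk(tree)]
print(node_types)
```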
Such a thing may be useful in an introductory programming course (TBH, I’m never really sure what people try to teach in introductory programming courses, what the useful mental models are, how best to help folk learn them, and how to figure out how to teach them…).
As with most Python based repos, particularly ones that contain Jupyter notebooks (that is,
.ipynb files [….thinks… ooh… I has a plan to try s/thing else too….]) I generally try to “run” them via MyBinder. In this case, the repo didn’t work because there is a dependency on the Linux
graphviz apt package and the Python
At this point, I’d generally fork the repo, create a
binderise branch containing the dependencies, then try that out on MyBinder, sometimes adding an issue and/or making a pull request to the original repository suggesting they Binderise it…
nbgitpuller provides a different opportunity. Suppose I create a base container that contains the Graphviz Linux application and the
graphviz Python package. Something like this: ouseful-testing/binder-graphviz.
Then I can create a MyBinder session from that repo and pull in the
show_ast package from its repo and run the notebook directly:
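A sketch of how such a combined link might be constructed: the `git-pull` request is passed to MyBinder as the `urlpath` argument, so it ends up URL-escaped twice. The `master` branch of the environment repo and the location of the `show_ast` repo are my assumptions, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def binder_pull_url(env_repo, content_repo, env_ref="master"):
    """Launch the GitHub repo `env_repo` on MyBinder, then nbgitpull `content_repo` into it."""
    # Inner request, handled by nbgitpuller inside the launched container
    inner = "git-pull?" + urlencode({"repo": content_repo})
    # Outer request, handled by MyBinder; urlencode escapes the inner request again
    outer = urlencode({"urlpath": inner})
    return "https://mybinder.org/v2/gh/{}/{}?{}".format(env_repo, env_ref, outer)


launch_url = binder_pull_url(
    "ouseful-testing/binder-graphviz",
    "https://github.com/hchasestevens/show_ast",  # assumed repo location
)
print(launch_url)
```

The double escaping is the fiddly bit, and presumably one reason why a link generator is handy.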
Fortuitously, things work seamlessly in this case because the example notebook lives in a directory where we can
import show_ast without the need to install it (otherwise we’d have needed to run
pip install . at the top level of the repo). In general, where notebooks are kept in a
docs directory, for example, the path to import the package would break. (Hmmm… I need to think about protocols for handling that… It’s better practice to put the notebooks somewhere separate, but that means we need to install the package or change the import path to it, which is one more step for folk to stumble over…)
Thinking about my old show’n’tell repo, the branches of which scruffily define various Binder environments suited to particular topic areas (environments for working on chemistry notebooks, for example, or astronomy notebooks, or classical language or music notebooks) and also contain demo notebooks, I could instead just define a set of base Binder environment containers, slow to build but built infrequently, and then lighter weight notebook repos containing just demo notebooks for a particular topic area. These could then be quickly and easily updated, and run on MyBinder having been
nbgitpulled by a base container, without having to rebuild the base container each time I update a notebook in a notebook repo.
A couple of other things to note here. First,
nbgitpuller has its own helper for creating
nbgitpuller URLs, the nbgitpuller link generator:
It’s not hard to imagine a similar UI, or another tab to that UI, that can build a MyBinder link from a “standard” base container selected from a dropdown menu (or an optional link to a git repo) and then a provided git repo link for the target content repo.
Second, this has got me thinking about how we (don’t) handle notebook distribution very well in the OU.
For our TM351 internal course, we control the student’s computing environment via the VM we provide them with, so we could install
nbgitpuller in it, but the notebooks are stored in a private Github repo and we don’t want to give students any keys to it at all. (For some reason, I seem to be the only person who doesn’t have a problem with the notebooks being in a public repo!;-)
For our public notebook utilising courses on FutureLearn or OpenLearn, the notebooks are in a public repo, but we don’t have control of the learners’ computing environments, (which is to say, we can’t preinstall
nbgitpuller and can’t guarantee that learners will have permissions or network access to install it themselves).
It’s almost as if various pieces keep appearing, but the jigsaw never quite seems to fit together…
I finally got round to finding, and fiddling with, an Apache Guacamole container that I could actually make sense of and it seems to work, with audio, when connecting to my demo RobotLab/Wine RDP desktop.
The container I tried is based on the Github repo
The container is started with:
mkdir guac_config
docker run -p 8080:8080 -v "$(pwd)/guac_config":/config oznu/guacamole
Login with user name and password
I then launched a RobotLab container that is running an RDP server:
docker run --name tm129 --hostname tm129demo --shm-size 1g -p 3391:3389 -d ousefulcoursecontainers/tm129rdp
Inside Guacamole, we need to create a new connection profile. From the admin drop down menu, select
Settings, click on the
Connections tab and create a
Give the connection a name and specify the protocol as RDP:
The connection settings require the IP address and port number that the connection is to be made on. The port mapping was specified when we started the RobotLab container (
3391) but what’s the network address? If we try to claim “localhost” in the Guacamole container, that refers to the container’s localhost, not localhost on the host. On a Mac, we can pick up the host IP address from the Network panel in the System Preferences:
Enter the appropriate connection parameters and save them:
From the admin menu, select Home. From the home screen you should be able to select the new connection…
When the connection is opened, I was presented with a warning dialogue:
OK cleared it okay…
Then I could enter the RobotLab RDP connection details (username and password are both
and I was in to the desktop.
The application files can be found within the
File System in the
As mentioned previously, the base container needs some fettling… When you first run the RobotLab or Neural applications, Wine wants to do some updates (which requires a network connection). It would help if I could figure out how to create users in the base image, rather than have user creation occur as part of the entrypoint, perhaps by following the recipe here.
Although it’s a little bit ropey, the Guacamole desktop does play out audio.
RobotLab has three instructions for playing audio:
send commands play an audio file, and this works, sort of (the spoken words played using the
send command are, erm, very robotic!). The
tone command doesn’t work, but I’ve seen in docs that this was an outstanding issue for some versions of Windows, so maybe it doesn’t work properly under Wine anyway…
Finally, I note that if you leave the remote desktop running, a screensaver kicks in…
Although the audio support isn’t brilliant (maybe there are “settings” in the container config that can improve it?) the support is more or less good enough, as is, for audio feedback / beeps etc. And just about good enough for the RobotLab activities.
What this means is that now I do have a route for running RobotLab, via a browser, with sort of desktop support.
One other thing to note relates to the network addressing. If I start the Guacamole and RobotLab containers together via a
docker-compose.yml file, I’m guessing I should be able to define a Docker Compose network to connect them and use that as the network address/alias name in the Guacamole connection setting?
But I’m creatively drained atm and can’t face trying to get anything else working today…
PS another remote desktop protocol, SPICE, which I found via mentions in OpenStack docs…: [t]he SPICE project aims to provide a complete open source solution for remote access to virtual machines in a seamless way so you can play videos, record audio, share usb devices and share folders without complications [docs and a simple looking howto]. Not sure how deprecated / live this is?
A skim back through this blog will turn up several posts over the years on the topic of “writing diagrams”, using text based scripts along with diagram generating applications to create diagrams from textual descriptions.
There are several reasons I think such things useful, particularly in an online, distance education context in an institution with a factory production model:
Sometimes, though, it can be handy to be able to actually draw a diagram by actually drawing it, rather than generating it from a textual source.
When you create a new diagram, you are prompted for a save location:
Local (device) storage is the default, but it looks like you can also link Google Drive or OneDrive online storage, though I haven’t tried this (yet!):
If you have a previously saved diagram, you can select it from a file browser. If you opt to create a new diagram, you can create a blank diagram with a default set of drawing tools, a particular diagram type or a diagram imported from a template URL:
If you go for the template URL option, you are prompted for the URL (I don’t know if there’s a catalogue / awesome list of template URLs anywhere?):
If you select one of the canned diagram type options, you are provided with a preview of the sorts of diagram you can create within that view:
If you click to select one of the example diagrams then click Create, the diagram editor opens with the example diagram and a set of custom diagram element options in the scratchpad sidebar:
If you don’t select a preview diagram, or you select the Blank diagram, you just get a default tool set.
Usefully for the course I’m looking at, one of the scratchpad collections provides diagram components that can be used to draw Crow’s Foot entity relation diagrams, as described here: Entity Relationship Diagrams with draw.io (see also: Entity Relationship Diagram (ERD)).
Clicking on an item in the toolbar previews the component and adds it to the canvas; you can also click and drag items from the sidebar and then drop them on the canvas.
(Partial) ERD diagrams can also be generated by importing database table definitions using the SQL import plugin.
One of the nice features of draw.io is that you can also generate certain diagrams from data files. In the Arrange menu, the Insert option provides several options for importing different sorts of data or textual elements from which diagrams can be automatically generated.
Plugins extend the range of import options, as for example in the case of the
sql-plugin. (The SQL plugin seems to add tables based on
CREATE TABLE elements in the SQL; whilst it correctly identifies and highlights primary keys, it doesn’t identify relationships between them, so you have to add the crow’s foot lines yourself…)
See the full list of official plugins.
Data can be imported from CSV files, as described here: Automatically create draw.io diagrams from CSV files. Not all columns need to be displayed; some columns may even be used to store metadata or styling information using reserved columns (image, fill and stroke). The first column in each row represents a node and may be styled according to details given in the styling columns.
Other columns contain values that can be included in the node or that specify which other nodes that node is connected to. Rules are used to define the styling and labelling of each edge, as well as identifying columns used to identify edge connections between nodes.
Not all columns need to be referenced / used in the diagram that is generated.
I haven’t fully explored all the possible CSV import settings yet; I’m also thinking it’d be nice if there were some Python tooling to help simplify the creation of the CSV import definition file.
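As a first stab at what such tooling might look like, here is a minimal sketch that assembles the text to paste into the CSV import dialogue. It assumes the import format uses `#`-prefixed configuration lines (`label`, `style`, `connect`, `ignore`) ahead of the CSV data, as in the draw.io examples; the `drawio_csv` helper and the sample rows are invented for illustration.

```python
import csv
import io
import json


def drawio_csv(rows, label, connect, style=None, ignore=()):
    """Return CSV import text: '#' configuration lines followed by the data rows."""
    lines = ["# label: {}".format(label)]
    if style:
        lines.append("# style: {}".format(style))
    # The connect rule is a JSON object naming the columns that define an edge
    lines.append("# connect: {}".format(json.dumps(connect)))
    if ignore:
        lines.append("# ignore: {}".format(",".join(ignore)))
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return "\n".join(lines) + "\n" + buf.getvalue()


# Invented sample data: each row is a node; `parent` names the node it hangs off
rows = [
    {"name": "Module", "parent": "", "fill": "#dae8fc"},
    {"name": "Block 1", "parent": "Module", "fill": "#d5e8d4"},
]
text = drawio_csv(
    rows,
    label="%name%",
    connect={"from": "parent", "to": "name", "invert": True},
    style="rounded=1;fillColor=%fill%;",
    ignore=["fill"],
)
print(text)
```

The `%name%` and `%fill%` placeholders refer to column names, so a column such as `fill` can carry styling information without being displayed as node metadata.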
(By the by, there is also a handy online CSV viewer webform available.)
As well as CSV import, UML diagrams can be generated using PlantUML, a tool for creating a wide variety of diagram types from imported UML and other diagram specifications: Use PlantUML in draw.io. (That said, when I tried with the online draw.io editor, the PlantUML import didn’t work. It looks like it uses Graphviz underneath, so it may be something to do with that? I need to try on a local install really, or ideally in a container with JupyterLab using the
Taken together, I wonder if these importers could be used with other Python tools for generating diagrams from code? e.g. could something like this approach to electrical circuit diagram generation with
lcapy be used to generate diagrams that draw.io can render??
Another handy looking tool comes in the form of drawio-batch, a “command line converter for draw.io diagrams” based on
puppeteer (“a Node library which provides a high-level API to control Chrome”, operating it by default in headless mode) that wraps the online draw.io conversion code into an offline tool. (I’ve not had a chance to try this yet; from the tests, it looks like you call it with a draw.io XML diagram file and an output file and it gives you an output diagram back in a format corresponding to the filetype you specified by the output file suffix (
svg in the tests)?)
Of the plugins,
replay looks interesting: it lets you render an animated version of a diagram, for example as you build up a complex flow diagram a piece at a time. There is also an
anim plugin for what looks like creating more general animations.
All in all, it looks to be really handy, and something I could ship in our VM. The
jupyterlab-drawio extension shows it works in JupyterLab, and I think it should also work with
By the by, the Google Drive / OneDrive integration was interesting (if it works; I haven’t had a chance to try it yet)… In particular, it makes me wonder: could the code that did that be reused to provide a similar storage workflow in JupyterHub?
Whilst my virtualisation ramblings may seem to be taking a scattergun approach, I’m actually trying to explore the space in a way that generalises meaningfully in the context of open and distance education.
The motivating ideas essentially boil down to these two questions / constraints:
I’m also motivated by “open” on the one hand (can we share the means of production, as well as the result?) and factory working on the other: will the approach used to deliver one application scale to other applications in different subject areas, or the same application, over time, as it goes through various versions?
My main focus has been on environments for running our TM351 applications (Jupyter notebooks, various databases, OpenRefine), keeping legacy applications running (RobotLab, Genie, Daisyworld), and exploring other virtualised desktops (eg for the VREP simulator), but there is also quite a lot of discussion internally around using virtualised environments to support our cybersecurity courses.
I suspect this is both a mature and an evolving space:
Recently, I also came across Labtainers, a set of virtual machines produced by the US Naval Postgraduate School’s Center for Cybersecurity and Cyber Operations billed as “fully packaged Linux-based computer science lab exercises with an initial emphasis on cybersecurity. Labtainers include more than 40 cyber lab exercises and tools to build your own.”
Individual activities are packaged in individual Docker containers, and a complete distribution is available bundled into a VirtualBox virtual machine (there’s also a Labtainer design guide). There’s also a paper here: Individualizing Cybersecurity Lab Exercises with Labtainers, Michael F. Thompson & Cynthia E. Irvine, IEEE Security & Privacy, Vol 16(2), March/April 2018, pp. 91-95, DOI: 10.1109/MSP.2018.1870862.
I actually spotted Labtainers from a demo by Olivier Berger / @olberger that was in part demonstrating a noVNC bridge container he’s been working on. I first posted about an X11 / XPRA bridge container I’d come across here; that post describes the
JAremko/docker-x11-bridge container which I can run to provide a noVNC desktop through my browser; we can then run separate application containers and mount the bridge container as a device, exposing the container application on the noVNC desktop. Olivier’s patched noVNC desktop container (fcwu/docker-ubuntu-vnc-desktop) offers access to “an Ubuntu LXDE and LXQT desktop environment” so that it can be used in a similar way.
You can see it in action with the labtainers here:
A supporting blog post can be found here: Labtainers in a Web desktop through noVNC X11 proxy, full docker containers; there’s also an associated repo.
From the looks of it, Olivier has been on a similar journey to myself. Another post, this time from last year, describes a Demo of displaying labtainers labs in a Web browser through Guacamole (repo). Guacamole is an Apache project that provides a browser based remote desktop that can act as a noVNC or RDP client (I think…?!).
(For all their attempts to appeal to a wider audience, I think Docker keep missing a trick by not putting the Kitematic crew back together…)