Thinking Around the Edges – Lego EV3 Robots Running Remote Jupyter Kernels

I’ve been pondering the use of Lego EV3 robots on the OU TXR120 residential school again, and after a couple of chats with OU colleagues Jane Bromley (who just won a sci-fi film pitch competition run by New Scientist) and Jon Rosewell (who leads on our level 1 robotics offerings), here’s where I’m at on “visioning” it…?!;-)

The setting for the residential school is 8 groups of 3 students, each group working with their own Lego EV3 robot to complete a set of challenges. Although we were hoping to have laptops from which the robots could be programmed wirelessly, it seems as if we might end up with wifi-less desktop machines with big screens, which would mean tethering the robots to the desktop machines to programme them (though I could be wrong, and we may be able to work around that anyway: a quick trip to Maplin to buy some wifi dongles, for example, or some cables we can plug into a wifi routing switch…). Having laptops would have been a boon for making the room a bit more flexible to work with (though many students have their own laptops…;-), and having a big screen that laptops could be mirrored against would possibly improve the student experience.

The Lego bricks come with their own firmware, but as described in Pondering New Ways of Programming Lego EV3 Mindstorms Bricks we can also put Linux on them, and make use of a Python wrapper library to programme the bricks using Python.
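For example, here’s the sort of thing the Python bindings let you write (a minimal sketch only; the module and class names assume the python-ev3dev library is installed on the brick):

# Minimal sketch using the python-ev3dev bindings (runs on the brick itself)
import ev3dev.ev3 as ev3

m = ev3.LargeMotor('outA')               # large motor plugged into output port A
m.run_timed(time_sp=1000, speed_sp=500)  # run for one second at speed setpoint 500

ev3.Sound.speak('Hello from ev3dev')     # speak through the brick's speaker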

In what follows, I’ll refer to the EV3 brick as the client and a laptop or desktop computer that connects to it as the host.

One of the easiest ways to access the Python environment on the brick is to connect the brick to the network and then ssh in to it – for example (using the correct IP address for the brick):

ssh root@192.168.1.106

Each time you ssh in, you have to provide a password (r00tme is the default on my brick), but passwordless ssh can be set up, which makes things quicker (no password to type in each time).

To set up passwordless ssh onto the brick, you need to check that an .ssh folder is available in the home directory (~) on the brick to hold the ssh key. To do this, ssh in to the EV3, cd ~ to make sure you’re in your home directory, then ls -a to list the directory contents. If there is no .ssh directory, create one (this post suggests install -d -m 700 ~/.ssh).

If you don’t have an ssh key set up on the host machine from which you want passwordless ssh access to the brick, follow the above link to find out how to create one. Once you do have ssh keys set up on the host, run something along the lines of the following, using the IP address of your connected EV3, to copy the public key across to the EV3:

cat ~/.ssh/id_rsa.pub | ssh root@192.168.1.106 'cat > .ssh/authorized_keys'

You will need to provide the password to execute this command (r00tme for the version of ev3dev I’m running).

You should now be able to ssh in without a password: ssh root@192.168.1.106

Rather than prompting for the password, ssh will use the key provided to log you in.

In my previous post, I described how it was possible to run an old IPython notebook server on the brick that can expose notebooks over the network, although it was a little slow. It’s also possible to ssh in to the brick and run an IPython terminal on the brick:

ssh root@192.168.1.106
ipython

A third way I’d have expected to work is to access a remote IPython kernel on the brick from an IPython console on my laptop: ssh into the EV3 brick, launch an IPython kernel, and pick up the location of the connection file. For example, if the command:

ssh root@192.168.1.106
ipython kernel

to run an IPython kernel on the EV3 responds with something like:

To connect another client to this kernel, use:
    --existing kernel-1129.json

Back on the laptop, I’d expect to be able to run:

scp root@192.168.1.106:/root/.ipython/profile_default/security/kernel-1129.json ./

to copy the connection file from the brick over to my host machine (which works), and then use it as an existing connection for a Jupyter console on the laptop host (with server standing in for the brick’s ssh address):

jupyter console --existing ./kernel-1129.json --ssh server

But for some reason that doesn’t work: CRITICAL | Could not find existing kernel connection file ./kernel-1129.json

Whatever…

So far, so much noise – the important thing to take away from the above is to get the passwordless ssh set up, because it makes the following possible…

Recall the basic scenario – we want to run code on the brick from a computer. The bricks will run an IPython notebook server, but it’s slow. Or we can ssh in to the brick, start up an IPython process, and run things from an IPython command line on the brick.

But what if we could run a Jupyter server on the host, making notebooks available via a browser, but use a remote kernel running on the brick?

The remote_ikernel package makes setting up remote kernels that can be accessed from the Jupyter server relatively straightforward: install the package on your host machine (e.g. my laptop) and then run the remote_ikernel command with the required settings:

pip3 install remote_ikernel
remote_ikernel manage --add --kernel_cmd="ipython kernel -f {connection_file}" --name="Ev3dev" --interface=ssh --host=root@192.168.1.106

This creates a kernel.json file and places it in the correct location. (On my Mac, this was in a subdirectory of /Users/MYUSER/Library/Jupyter/kernels/.)

The kernel.json file created for me (/Users/ajh59/Library/Jupyter/kernels/rik_ssh_root_192_168_1_106_ev3dev/kernel.json) was as follows:

{
  "argv": [
    "/usr/local/opt/python3/bin/python3.5",
    "-m",
    "remote_ikernel",
    "--interface",
    "ssh",
    "--host",
    "root@192.168.1.106",
    "--kernel_cmd",
    "ipython kernel -f {host_connection_file}",
    "{connection_file}"
  ],
  "display_name": "SSH root@192.168.1.106 Ev3dev",
  "remote_ikernel_argv": [
    "/usr/local/bin/remote_ikernel",
    "manage",
    "--add",
    "--kernel_cmd=ipython kernel -f {connection_file}",
    "--name=Ev3dev",
    "--interface=ssh",
    "--host=root@192.168.1.106"
  ]
}

Launching the notebook server using jupyter notebook in the normal way should result in the server picking up the remote kernel from the new kernel file.

To list the kernels available, I launched a Jupyter notebook with the normal (local) Python kernel and ran:

from jupyter_client.kernelspec import KernelSpecManager
list(KernelSpecManager().find_kernel_specs().keys())

My newly created one was there, albeit horribly named (the name is also the name of the directory the kernel.json file was created in): rik_ssh_root_192_168_1_106_ev3dev.

So here’s where we’re at now:

  • a single desktop or laptop computer with passwordless ssh access to a Lego EV3;
  • the desktop or laptop computer runs a Jupyter server;
  • Jupyter notebooks are available that will start up and run code on a remote IPython kernel on the brick (a quick sanity check of this is sketched below).
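One quick way to check that code really is executing on the brick rather than on the host is to run something like the following in a notebook cell that uses the new remote kernel (the exact values returned will depend on how your brick is set up):

# Run in a notebook cell using the remote Ev3dev kernel - this executes on the brick
import platform, socket
print(platform.machine(), socket.gethostname())
# On an EV3 running ev3dev, expect an ARM machine type and the brick's hostname,
# e.g. something like: armv5tejl ev3dev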

[Screenshots: the Jupyter notebook home page, and a new (untitled) notebook running against the remote kernel]

Where do we need to be? The most immediate need is to have something that works for the residential school. This means we need 8 desktop computers and 8 EV3s.

One thing we’d need to do is find a way of marrying computers and EV3s. Specifically, we need to have known IP addresses associated with specific bricks (can we assign a known IP based on the MAC address of the wifi dongle attached to each brick?) and we need to make sure that passwordless ssh works between the computers and the EV3s. The easiest way of doing the latter would be to use the same ssh keypair on every machine, but this would mean student group A might be able to mistakenly connect to student group B’s brick.

One way round it might be to set up a JupyterHub server that is responsible for managing connections. JupyterHub is a multi-user hub that will set up a Jupyter server for each logged-in user. If we attach the JupyterHub to the same wifi network as the EV3 bricks, with 8 users, one per student group, each user/student group could then presumably be assigned to one of the bricks. Students would log in to the JupyterHub, and their Jupyter server would be associated with the remote_ikernel kernel.json file for one of the bricks. Anyone who can connect to the local wifi network could then log in to the JupyterHub as one of the group users, launch a notebook, and run code on one of the bricks. This means that students could write notebooks from their own laptops, connected to the network. Wifi-less desktop machines provided for the residential school could presumably be added into the network via a local ethernet cable?
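Registering the per-group kernels could be scripted. Here’s a rough sketch (the group names and brick IP addresses are made up for illustration, and I’m assuming this would run on the hub machine) that just wraps the remote_ikernel manage command used above:

# Rough sketch: register one remote EV3 kernel per student group on the hub machine.
# The group/IP mapping below is purely illustrative.
import subprocess

bricks = {
    'group1': '192.168.1.101',
    'group2': '192.168.1.102',
    # ...one entry per student group/brick
}

for group, ip in bricks.items():
    subprocess.check_call([
        'remote_ikernel', 'manage', '--add',
        '--kernel_cmd=ipython kernel -f {connection_file}',
        '--name=EV3_{}'.format(group),
        '--interface=ssh',
        '--host=root@{}'.format(ip),
    ])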

So we’d have something like this:

[Diagram: proposed JupyterHub / EV3 network architecture for the residential school]

Does that make sense?

Another Route to Jupyter Notebooks – Azure Machine Learning

In much the same way that the IBM Data Scientist Workbench seeks to provide some level of integration between analysis tools such as Jupyter notebooks and data access and storage, Azure Machine Learning Studio also provides a suite of tools for accessing and working with data in one location. Microsoft’s offering is new to me, but it crossed my radar with the announcement that they have added native R kernel support, as well as Python 2 and 3, to their Jupyter notebooks: Jupyter Notebooks with R in Azure ML Studio.

Guest workspaces are available for free (I’m not sure if this is once only, or whether you can keep going back?) but there is also a free workspace if you have a (free) Microsoft account.

[Screenshot: Microsoft Azure Machine Learning Studio]

Once inside, you are provided with a range of tools – the one I’m interested in to begin with is the notebook (although the pipework/dataflow experiments environment also looks interesting):

[Screenshot: the notebooks area in Azure Machine Learning Studio]

Select a kernel:

[Screenshot: selecting a kernel for a new notebook]

Give your new notebook a name, and it will launch into a new browser tab:

[Screenshot: the new notebook open in a browser tab]

You can also arrange notebooks within separate project folders. For example, create a project:

[Screenshot: creating a new project in Azure Machine Learning Studio]

and then add notebooks to it:

[Screenshot: adding notebooks to a project]

When creating a new notebook, you may have noted an option to View More in Gallery. The gallery includes examples of a range of project components, including example notebooks:

[Screenshot: the Cortana Intelligence Gallery, filtered to show notebooks]

Thinking about things like the MyBinder app, which lets you launch a notebook in a container from a Github repository, it would be nice to see additional buttons being made available to let folk run notebooks in Azure Machine Learning, or the Data Scientist Workbench.

It’s also worth noting how user tools – such as notebooks – seem to be provided for free, with a limited amount of compute and storage resource behind them, as a way of recruiting users onto platforms where they might then start to pay for more compute power.

From a course delivery perspective, I’m often unclear as to whether we can tell students to sign up for such services as part of a course or whether that breaks the service terms?  (Some providers, such as Wakari, make it clear that “[f]or classes, projects, and long-term use, we strongly encourage a paid plan or Wakari Enterprise. Special academic pricing is available.”) It seems unfair that we should require students to sign up for accounts on a “free” service in their own name as part of our offering for a couple of reasons at least: first, we have no control over what happens on the service; second, it seems that it’s a commercial transaction that should be formalised in some way, even if only to agree that we can (will?) send our students to that particular service exclusively. Another possibility is that we say students should make their own service available, whether by installing software themselves or finding an online provider for themselves.

On the other hand, trying to get online services provided at course scale in a timely fashion within an HEI seems to be all but impossible, at least until such a time as the indie edtech providers such as Reclaim Hosting start to move even more into end-user app provision either at the individual level, or affordable class level (with an SLA)…

See also: Seven Ways of Running IPython / Jupyter Notebooks.

BlockPy – Introductory Python Programming Blockly Environment

Whilst looking around to see what sorts of graphical editors there are out there for teaching introductory Python programming, I ran a search for blockly python. If you haven’t come across Blockly before, it’s a library for building browser-based graphical programming interfaces, based on interlocking blocks, with a Scratch-style aesthetic: blockly.

I already knew that Blockly could be customised to generate Python code but the BlockPy environment from Virginia Tech’s Software Innovations Lab is even richer:

[Screenshot: the BlockPy environment]

For a start, the environment is set up for working with small data sets, and can display small tabular datasets as well as plot them. (You may remember we also used data to motivate programming for the FutureLearn Learn To Code (a line at a time) course.) The language is a subset of Python 2.7 (the environment uses the Skulpt client side Python interpreter; I’m not sure if the turtle demo works!).

The environment also supports blocks-to-code as well as code-to-blocks translation, so you can paste a chunk of code into the text view and then display the blocks equivalent. (I think this is done by parsing the Python into an AST and then using that as the bridge to the blocks view?)
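To illustrate the general idea of that bridge (this is just a sketch using Python’s standard ast module, not BlockPy’s actual mechanism):

# Parse a snippet of Python into an abstract syntax tree - the sort of structure
# a blocks view could be generated from (illustrative only, not BlockPy's code)
import ast

code = "total = 0\nfor x in [1, 2, 3]:\n    total = total + x"
tree = ast.parse(code)
print(ast.dump(tree))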

Alternatively, if you’re happier with the blocks, you can write a programme graphically and then grab the code version. Or you can flip between the two…

[Screenshot: flipping between the BlockPy blocks and code views]

As well as the blocks/code view, there is a pseudo-code view that maps the code into more explanatory language. This feature is under active development, I think…

To aid debugging – and learning – the environment allows you to step through the code a line at a time, previewing the current state in the panels on the right hand side.

[Screenshot: stepping through code a line at a time in BlockPy]

If you get an error, an error prompt appears. This seems to be quite friendly in some cases, though I suspect not every error or warning is trapped for (I need to explore this a bit more; I can’t help thinking that an “expert” view showing the actual error message might also be useful if the environment is being used as a stepping stone to text-based Python programming).

[Screenshot: a BlockPy error prompt]

The code is available on Github, and I made a start on putting it into a Docker container until my build broke (Kitematic on my machine doesn’t seem to like Java at the moment – a known issue – and Java seems to be required as part of the build process)…

The environment also has a server-side component, and the Virginia Tech deployment sits behind a login-if-you-want-to screen. I didn’t see any benefit from logging in, though I was hoping to be able to name and save my own programmes. (I wonder if it’s also possible to serialise and encode a programme into a URL so it can be shared?)

You can also embed the environment – prepopulated with code, if required, though I’m not sure how to do that? – inline in a web page, so we could embed it in course materials, for example. Being able to hook this into an auto-marking tool could also be interesting…

All in all, a really nice environment, and one that I think we could explore for OUr own introductory computing courses.

I also started wondering about how BlockPy might be able to work with a Jupyter server/IPython kernel, or be morphed into an IPyWidget…

In the first case, BlockPy could be used to fire up an IPython process via a Jupyter server, and handle code execution and parsing (for AST-block conversion?) that way, rather than using the in-browser Skulpt Python library. Having a BlockPy front end to complement Jupyter notebooks could be quite interesting, I think?
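For what it’s worth, the plumbing for that sort of front end already exists in the jupyter_client package. Here’s a minimal sketch of handing a chunk of (generated) code to an IPython kernel and reading the output back – this is purely my own illustration, nothing to do with BlockPy’s code:

# Minimal sketch: send code to an IPython kernel via jupyter_client and read the output
from jupyter_client.manager import start_new_kernel

# Start a (local) IPython kernel and get a client connected to it
km, kc = start_new_kernel(kernel_name='python3')

# Code that a blocks-style front end might have generated
kc.execute('total = sum([1, 2, 3])\nprint(total)')

# Read the output back off the kernel's IOPub channel
while True:
    msg = kc.get_iopub_msg(timeout=10)
    if msg['msg_type'] == 'stream':
        print(msg['content']['text'], end='')
    elif msg['msg_type'] == 'status' and msg['content']['execution_state'] == 'idle':
        break

kc.stop_channels()
km.shutdown_kernel()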

On the widget front, I can imagine running BlockPy within a Jupyter notebook, using it to generate code that could be exported into a code cell, for example, though I’m not really clear what benefit this would provide?

So – anyone know if there is any work anywhere looking at taking the BlockPy front-end and making it a standalone Jupyter client?! :-)

The Rise of Transparent Data Journalism – The BuzzFeed Tennis Match Fixing Data Analysis Notebook

The news today was led in part by a story broken by the BBC and BuzzFeed News – The Tennis Racket – about match fixing in Grand Slam tennis tournaments. (The BBC contribution seems to have been done under the ever-listenable File on Four: Tennis: Game, Set and Fix?)

One interesting feature of this story was that “BuzzFeed News began its investigation after devising an algorithm to analyse gambling on professional tennis matches over the past seven years”, backing up evidence from leaked documents with “an original analysis of the betting activity on 26,000 matches”. (See also: How BuzzFeed News Used Betting Data To Investigate Match-Fixing In Tennis, and an open access academic paper that inspired it: Rodenberg, R. & Feustel, E.D. (2014), Forensic Sports Analytics: Detecting and Predicting Match-Fixing in Tennis, The Journal of Prediction Markets, 8(1).)

Feature-detecting algorithms such as this (where the feature is an unusual betting pattern) are likely to play an increasing role in the discovery of stories from data, step 2 in the model described in this recent Tow Center for Digital Journalism Guide to Automated Journalism:

[Figure: from the Tow Center Guide to Automated Journalism]

See also: OUseful.info: Notes on Robot Churnalism, Part I – Robot Writers
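Just to give a flavour of what such a feature detector might look like, here’s a toy sketch of my own, with made-up numbers – it has nothing to do with BuzzFeed News’ actual methodology:

# Toy illustration only - not the BuzzFeed News methodology.
# Suppose we have opening and closing odds for one player in each match.
import pandas as pd

matches = pd.DataFrame({
    'match_id': [1, 2, 3, 4],
    'opening_odds': [1.50, 2.20, 1.80, 3.00],
    'closing_odds': [1.45, 4.10, 1.75, 1.40],
})

# Flag matches where the implied win probability swings by a suspiciously large amount
matches['implied_open'] = 1 / matches['opening_odds']
matches['implied_close'] = 1 / matches['closing_odds']
matches['swing'] = (matches['implied_close'] - matches['implied_open']).abs()

suspicious = matches[matches['swing'] > 0.10]
print(suspicious)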

Another interesting aspect of the story behind the story was the way in which BuzzFeed News opened up the analysis they had applied to the data. You can find it described on Github – Methodology and Code: Detecting Match-Fixing Patterns In Tennis – along with the data and a Jupyter notebook that includes the code used to perform the analysis: Data and Analysis: Detecting Match-Fixing Patterns In Tennis.

[Screenshot: the BuzzFeed News tennis betting analysis notebook on Github]

You can even run the notebook to replicate the analysis yourself, either by downloading it and running it using your own Jupyter notebook server, or by using the online mybinder service: run the tennis analysis yourself on mybinder.org.

(I’m not sure if the BuzzFeed or BBC folk tried to do any deeper analysis, for example poking into point summary data as captured by the Tennis Match Charting Project? See also this Tennis Visuals project that makes use of the MCP data. Tennis betting data is also collected here: tennis-data.co.uk. If you’re into the idea of analysing tennis stats, this book is one way in: Analyzing Wimbledon: The Power Of Statistics.)

So what are these notebooks anyway? They’re magic, that’s what!:-)

The Jupyter project is an evolution of an earlier IPython (interactive Python) project that included a browser-based, notebook-style interface allowing users to write and execute code, and see the results of executing that code, a line at a time, all in the context of a “narrative” text document. The Jupyter project funding proposal describes it thus:

[T]he core problem we are trying to solve is the collaborative creation of reproducible computational narratives that can be used across a wide range of audiences and contexts.

[C]omputation in science is ultimately in service of a result that needs to be woven into the bigger narrative of the questions under study: that result will be part of a paper, will support or contest a theory, will advance our understanding of a domain. And those insights are communicated in papers, books and lectures: narratives of various formats.

The problem the Jupyter project tackles is precisely this intersection: creating tools to support in the best possible ways the computational workflow of scientific inquiry, and providing the environment to create the proper narrative around that central act of computation. We refer to this as Literate Computing, in contrast to Knuth’s concept of Literate Programming, where the emphasis is on narrating algorithms and programs. In a Literate Computing environment, the author weaves human language with live code and the results of the code, and it is the combination of all that produces a computational narrative.

At the heart of the entire Jupyter architecture lies the idea of interactive computing: humans executing small pieces of code in various programming languages, and immediately seeing the results of their computation. Interactive computing is central to data science because scientific problems benefit from an exploratory process where the results of each computation inform the next step and guide the formation of insights about the problem at hand. In this Interactive Computing focus area, we will create new tools and abstractions that improve the reproducibility of interactive computations and widen their usage in different contexts and audiences.

The Jupyter notebooks include two types of interactive cell: editable text cells, into which you can write simple markdown and HTML text that will be rendered as text; and code cells, into which you can write executable code. Once executed, the results of that execution are displayed as cell output. Note that the output from a cell may be text, a data table, a chart, or even an interactive map.

One of the nice things about the Jupyter notebook project is that the executable cells are connected via the Jupyter server to a programming kernel that executes the code. An increasing number of kernels are supported (e.g. for R, Javascript and Java as well as Python) so once you hook in to the Jupyter ecosystem you can use the same interface for a wide variety of computing tasks.

There are multiple ways of running Jupyter notebooks, including the mybinder approach described above – I describe several of them in the post Seven Ways of Running IPython Notebooks.

As well as having an important role to play in reproducible data journalism and reproducible (scientific) research, notebooks are also a powerful, and expressive, medium for teaching and learning. For example, we’re just about to start using Jupyter notebooks, delivered via a virtual machine, for the new OU course Data management and analysis.

We also used them in the FutureLearn course Learn to Code for Data Analysis, showing how code could be used, a line at a time, to analyse a variety of open data sets from sources such as the World Bank Indicators database and the UN Comtrade (import/export) database.

PS for sports data fans, here’s a list of data sources I started to compile a year or so ago: Sports Data and R – Scope for a Thematic (Rather than Task) View? (Living Post).