OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for the ‘OU2.0’ Category

Tracking Changes in IPython Notebooks?

Managing suggested changes to the same set of docs, along with comments and observations, from multiple respondents is one of the challenges any organisation whose business is largely concerned with the production of documents has to face.

Passing shared/social living documents by reference rather than by value, so that folk don’t have to share multiple physical copies of the same document, each annotated separately, is one way. Tools such as track changes in word processor docs, wiki page histories, or git diffs are another.

All documents have an underlying representation – web pages have HTML, Word documents have whatever XML horrors lie under the hood, IPython notebooks have JSON.

Change tracking solutions like git show differences to the raw representation, as in this example of a couple of changes made to a (raw) IPython notebook:

Track changes in github

Notebooks can also be saved in a non-executable form that includes previously generated cell outputs as HTML, but again a git view of the differences would reveal changes at the HTML code level, rather than the rendered HTML level. (Tracked changes include ‘useful’ ones, such as changes to cell contents, as well as (at a WYSIWYG level at least) irrelevant ‘administrative’ value changes, such as changes to hash values recorded in the notebook source JSON.)
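One pragmatic way of suppressing the ‘administrative’ noise is to strip generated outputs and prompt numbers from the notebook JSON before committing it, so the git diff only shows substantive cell-content changes. A minimal sketch – the exact JSON layout varies between notebook format versions, so the field names here are assumptions to check against your own notebooks:

```python
import json

def strip_outputs(nb):
    """Remove outputs and execution counts from a notebook dict so a
    git diff of the JSON shows only substantive cell-content changes."""
    # v3 notebooks nest cells inside a "worksheets" list; later formats
    # keep a top-level "cells" list, so handle both cases.
    for ws in nb.get("worksheets", [nb]):
        for cell in ws.get("cells", []):
            if cell.get("cell_type") == "code":
                cell["outputs"] = []          # drop generated outputs
                cell["prompt_number"] = None  # drop the execution count
    return nb

# e.g. with open("notebook.ipynb") as f: nb = strip_outputs(json.load(f))
```

Run as a pre-commit step, something like this would leave the committed copy free of the hash/output churn, at the cost of losing the rendered outputs from the repository copy.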

Tracking changes in a WYSIWYG display shows the changes at the rendered, WYSIWYG level, as for example this demo of a track changes CKEditor plugin demonstrates [docs]:

lite - ck editor track changes

However, the change management features are typically implemented by adding additional metadata/markup to the underlying representation:

lite changes src

For the course we’re working on at the moment, we’re making significant use of IPython notebooks, and need to manage comments/suggested changes from multiple reviewers over the same set of notebooks.

So I was wondering – what would it take to have an nbviewer style view in something like github that could render WYSIWYG track changes style views over just the cell contents and cell outputs of a modified notebook?
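By way of a crude illustration of the idea, Python’s standard library difflib can already render a side-by-side, WYSIWYG-ish diff of two versions of a cell’s source as an HTML table – which says nothing about wiring it into nbviewer/github, of course, but shows the rendered-diff principle:

```python
import difflib

old = ["Notebooks can be saved in executable form.\n"]
new = ["Notebooks can also be saved in non-executable form.\n"]

# make_table() returns an HTML <table> with insertions/deletions marked up,
# which a notebook-aware previewer could render per cell
table = difflib.HtmlDiff().make_table(old, new,
                                      fromdesc="original", todesc="revised")
```

Running that over the source of each changed cell, rather than over the raw notebook JSON, would get closer to a track-changes-at-the-cell-level view.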

This SO thread maybe touches on related issues: Using IPython notebooks under version control.

A similar principle would work for HTML too, of course. Hmm, thinks… are there any git previewers for HTML that log edits/diffs at the HTML level but then render those diffs at the WYSIWYG level in a traditional track changes style view?

Hmm… I wonder if a plugin for Atom.io might do this? (Anyone know if atom.io can also run as a service? Eg could I put it onto a VM and then access it through localhost:ATOMIOPORT?)

PS also on the change management thing in IPython Notebooks, and again something that might make sense in a git context, is the management of ‘undo’ features in a cell.

IPython notebooks have a powerful cell-by-cell undo feature that works at least during a current session (if you shut down a notebook and then restart it, I assume the cell history is lost?). [Anyone know a good link describing/summarising the history/undo features of IPython Notebooks?]

I’m keen for students to take ownership of notebooks and try things out within them, but I’m also mindful that sometimes they may make repeated changes to a cell, lose the undo history for whatever reason, and then want to reset the cell to the “original” contents, for some definition of “original” (such as the version that was issued to the learner by the instructor, or the version the learner viewed at their first use of the notebook.)

A clunky solution is for students to duplicate each notebook before they start to work on it so that they have an original copy. I just want an option to reveal a “reset” button by each cell and then be able to reset it. Or perhaps, in line with the other cell operations, reset a specific highlighted cell, reset all cells, or reset all cells above or below a selected cell.
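For what it’s worth, the clunky duplicate-first approach can at least be automated outside the notebook. A sketch – the file naming convention and the assumption of a top-level "cells" list (rather than the older worksheet nesting) are my own:

```python
import json
import os
import shutil

def checkpoint(path):
    """Keep a pristine copy of a notebook the first time we see it."""
    orig = path.replace(".ipynb", ".orig.ipynb")
    if not os.path.exists(orig):  # never overwrite the pristine copy
        shutil.copyfile(path, orig)
    return orig

def reset_cell(path, cell_index):
    """Restore one cell's source from the pristine copy (assumes a
    top-level "cells" list and unchanged cell ordering)."""
    orig = checkpoint(path)
    with open(path) as f:
        nb = json.load(f)
    with open(orig) as f:
        nb_orig = json.load(f)
    nb["cells"][cell_index]["source"] = nb_orig["cells"][cell_index]["source"]
    with open(path, "w") as f:
        json.dump(nb, f)
```

A per-cell “reset” button in the notebook toolbar could then be little more than a wrapper around something like reset_cell().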

Written by Tony Hirst

June 5, 2014 at 9:16 am

Posted in OU2.0, Thinkses


Click-Scheduled Forum Posts

Course writing always seems to take me forever for a wide variety of reasons.

The first is the learning. You’re presumably familiar with the old saw about the teacher being one page ahead..? That’s the teacher as expert learner modeling the learning process, the teacher as “teaching on” something they’ve just learned. The teacher-as-learner experiencing just where the stumbling blocks are, or noticing afresh the really big idea… That’s me, learning on the one hand, and on the other trying to justify with footnotes and references the things I’ve learned by experience and that just feel right!

The second thing is the process of course production, the tooling used to support it, and the things we could try out. I’ve spent a lot of time thinking about virtual machines and containers recently, because I think they could be good for the OU and good for the School of Data. It doesn’t surprise me that Coursera courses are using virtual machines, and that Futurelearn isn’t. I think there’s interesting stuff – and not a few business opportunities – in looking at ways of supporting the organisational and end-user creation of VMs, particularly in education where you’re regularly presented with having to find tech solutions – and support – for cohorts of anything between 15 and 1,500 (or with the MOOCs, 15,000). I’ve also spent a lot of time pondering IPython Notebooks – and need to spend more time doing so: literate programming, conversations with data, end user application development, task based computing, and a comparison with the attractiveness of spreadsheets are all in the mix.

The third thing is time spent keeping a learning diary of what’s been going on in the course production process. I haven’t done this this time and I regret it (“no time to blog” because of “deadlines”; the course goes to students in 15J, October 2015 (sic, i.e. next year), so the pressure is on to make the deadline (I won’t) for a full first draft handover by tomorrow). So f*****g that for a game of soldiers, I am taking an hour out and writing up a thought…

In particular, this one…

I’m struggling (again) with ways of trying to encourage sharing and discussion amongst the students. A default way of doing this is to have a call out (a “call to action”) from online teaching materials into a forum or forum thread. You know the sort of thing: read this, play with that, share your findings in the forum. Only hopefully a bit more engaging than that.

The problem is, if you are going to link out to a specific thread from course materials, you need to seed the forums. Which means if you have a lot of callouts, the forums can start to get cluttered with stub posts, overloading a nascent forum with content that is irrelevant at that point in time.

doodling threads

One way around this is to schedule posts to appear in the forums around the time you expect students to be reaching them. This can make hard linking difficult, unless you can publish a post, grab the link, unpublish and schedule it, and then hope that when it does get re-released the link is the same. (If the URL is minted against a post ID, this should work?) A downside of this approach is that if a student clicks on a forum call out link before the post has been republished according to its scheduled date, the link will be broken.

Reflecting on the way wikis work, where you can create a link to a wiki page that doesn’t exist yet, and that page is created when it’s clicked for the first time, I started to wonder about a similar mechanism for links to forum based social activities. So, for example, I create a forum post with a scheduled date that publishes the post on a particular date if it hasn’t already been published, and check the clickpub box. I’m presented with a URL for the post that is guaranteed to be the URL it’s given when the post does get published.

In my course materials, I paste the link.

If no-one ever clicks the link to that forum post in the course materials, the post is published in the forum on the scheduled date. The post should contain a description of the activity and a reference back to the activity in the course materials, as well as act as a stub for a discussion around the activity or sharing of social objects associated with the activity. In this mode, the post acts as call-to-action from the forum to the course materials, supporting the pacing of the course.

Some students, however, like to get ahead. So if they click on the link before the scheduled date they need to see the post somehow.

The first way to achieve this is to use the link in the course materials a bit like a new wiki page link: if a student clicks on a link to a post before the post is scheduled to be published, the click sends a hurry-up, clickpub message that fires the publication of the post. This actually signals two things: one to the course team that someone is that far ahead in working through the course materials, the other to the rest of the cohort that somebody is that far ahead in working through the materials.
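The clickpub logic sketched above amounts to a small state machine; something like the following, with entirely hypothetical field names:

```python
def post_visible(post, now, clicked=False):
    """A post becomes (permanently) visible at its scheduled time, or
    earlier if a student follows the course-material link -- the
    'hurry-up' clickpub message."""
    if post["published"]:
        return True
    if clicked or now >= post["scheduled"]:
        # The URL, minted from the post id at creation time, is the
        # same whichever route triggers publication.
        post["published"] = True
    return post["published"]
```

The key property is that publication is a one-way transition, so the link pasted into the course materials can never dangle once the post has been surfaced by either route.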

(Note that we need to defend against link checkers (human or machine) that might be operating in the VLE accidentally triggering a clickpub event!)

Problems may arise in the case of the student who tries to do the whole 30 week course in the first 10 days after it is opened up. (Unless such students are anti-social and don’t post to such forums, in part because they know it’s unlikely that anyone else will be as far on and keen to discuss the topic. That said, even posting with no hope of reply is often beneficial in the way it forces a little bit of reflective thinking at least.)

To try to mitigate against early publication of a post, we could try a more refined strategy in which a social activity thread is only viewable to students who click on the link to it before the scheduled release date, but is then released openly to the forum at its scheduled time.

We could balance this further with the proviso that if more than x% of the cohort have accessed the thread, its scheduled release date is brought forward to that time. In this way, we can start to use the social activity posts as one way of trying to keep the cohort together, for example in cases where the majority of the cohort is working through the course faster than expected.
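The x% proviso is simple enough to state in code terms too (the threshold value and field names are arbitrary placeholders):

```python
def maybe_bring_forward(post, accesses, cohort_size, now, threshold=0.1):
    """If enough of the cohort has already reached a clickpub thread,
    bring its scheduled release forward to the current time."""
    if not post["published"] and accesses / float(cohort_size) >= threshold:
        post["scheduled"] = min(post["scheduled"], now)
    return post
```

In practice the access count would presumably be logged by the VLE, with this check run on each click.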

[Unfiled patents: 3,487; 3,488; 3,489 …;-) – Unless of course these sorts of mechanism already exist? If so, please let me know where via the comments below:-)]

Written by Tony Hirst

June 3, 2014 at 9:57 am

Posted in OU2.0


Losing Experimental Edtech Value from IPython Notebooks Because of New Security Policies?

Just as VLEs locked down what those who wanted to try stuff out could do with educational websites, usually on the grounds of “security”, so a chunk of lightweight functionality with possible educational value that I was about to start exploring inside IPython notebooks has been locked out by the new IPython notebook security policy:

Affected use cases
Some use cases that work in IPython 1.0 will become less convenient in 2.0 as a result of the security changes. We do our best to minimize these annoyance, but security is always at odds with convenience.

Javascript and CSS in Markdown cells
While never officially supported, it had become common practice to put hidden Javascript or CSS styling in Markdown cells, so that they would not be visible on the page. Since Markdown cells are now sanitized (by Google Caja), all Javascript (including click event handlers, etc.) and CSS will be stripped.

Here’s what I’ve been exploring – using a simple button:

ipynb button

to reveal an answer:

ipynb button reveal

It’s a 101 interaction style in “e-learning” (do we still call it that?!) and one that I was hoping to explore more given the interactive richness of the IPython notebook environment.

Here’s how I implemented it – a tiny bit of Javascript hidden in one of the markdown cells:

<script type="text/javascript">
   function showHide(id) {
       var e = document.getElementById(id);
       if (e.style.display == 'block')
          e.style.display = 'none';
       else
          e.style.display = 'block';
   }
</script>

and then a quick call from a button onclick event handler to reveal the answer block:

<input type="button" value="Answer" onclick="showHide('ans2')">

<div id="ans2" style="display:none">I can see several ways of generating common identifiers:

<ul><li>using the **gss** code from the area data, I could generate identifiers of the form `http://statistics.data.gov.uk/id/statistical-geography/GSS`</li>
<li>from the housing start data, I could split the *Reference Area* on space characters and then extract the GSS code from the first item in the split list</li>
<li>The *districtname* in the area data looks like it may have "issues" with spacing in area names. If we remove spaces and turn everything to lower case in the area data *districtname* and the *Reference Area* in the housing data, we *may* be able to create matching keys. But it could be a risky strategy...</li></ul></div>

This won’t work anymore – and I don’t have the time to learn whether custom CSS can do this, and if so, how.
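One possible workaround – and this is an untested assumption on my part – is that the sanitisation applies to markdown cells, so generating the same markup as the *output of a code cell* (e.g. passed to IPython.display.HTML) might still get through. A sketch of the markup builder, kept free of any IPython dependency; the showHide() helper is the one from the markdown-cell version, assumed to be reachable from the output:

```python
def reveal_html(answer_html, div_id="ans1", label="Answer"):
    # Build the button plus hidden answer <div>; in a notebook you would
    # pass the returned string to IPython.display.HTML() from a code cell.
    return (
        '<input type="button" value="%s" onclick="showHide(\'%s\')">'
        '<div id="%s" style="display:none">%s</div>'
        % (label, div_id, div_id, answer_html)
    )
```

Even if that route survives the sanitiser, it moves the reveal interaction out of the markdown authoring flow and into code cells, which is not quite what I was after.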

I don’t really want to have to go back to the approach I tried before I demoed the button triggered reveal example to myself…

ipynb another interaction

That is, putting answers into a python library and then using code to pull the text answer in…

ipynb color styling

Note also the use of colour in the cells – this is something else I wanted to try to explore, the use of styling to prompt interactions; in the case of IPython notebooks, I quite like the idea of students taking ownership of the notebooks and adding content to it, whether by adding commentary text to cells we have written in, adding their own comment cells (perhaps using a different style – so a different cell type?), amending code stubs we have written, adding in their own code, perhaps as code complements to comment prompts we have provided, etc etc.

ipynb starting to think about different interactions...

The quick hack, try and see option that immediately came to mind to support these sorts of interaction seems to have been locked out (or maybe not – rather than spending half an hour on a quick hack I’ll have to spend half an hour reading docs…). This is exactly the sort of thing that cuts down on our ability to mix ideas and solutions picked up from wherever, and just try them out quickly; and whilst I can see the rationale, it’s just another of those things to add to the when the web was more open pile. (I was going to spend half an hour blogging a post to let other members of the course team I’m on know how to add revealed answers to their notebooks, but as I’ve just spent 18 hours trying to build a VM box that supports python3 and the latest IPython notebook, I’m a bit fed up at the thought of having to stick with the earlier version py’n’notebook VM I built because it’s easier for us to experiment with…)

I have to admit that some of the new notebook features look like they could be interesting from a teaching point of view in certain subject areas – the ability to publish interactive widgets where the controls talk to parameters accessed via the notebook code cells, but that wasn’t on my to do list for the next week…

What I was planning to do was explore what we’d need to do to get elements of the notebook behaving like elements in OU course materials, under the assumption that our online materials have designs that go hand in hand with good pedagogy. (This is a post in part about OU stuff, so necessarily it contains the p-word.)

ou teaching styling

Something else on the to do list was to explore how to tweak the branding of the notebook, for example to add in an OU logo or (for my other day per week), a School of Data logo. (I need to check the code openness status of IPython notebooks… How bad form would it be to remove the IPy logo for example? And where should a corporate logo go? In the toolbar, or at the top of the content part of the notebook? If you just contribute content, I guess the latter; if you add notebook functionality, maybe the topbar is okay?)

There are a few examples of styling notebooks out there, but I wonder – will those recipes still work?

Ho hum – this post probably comes across as negative about IPython notebooks, but it shouldn’t, because they’re a wonderful environment (for example, Doodling With IPython Notebooks for Education and Time to Drop Calculators in Favour of Notebook Programming?). I’m just a bit fed up that after a couple of days’ graft I don’t get to have half an hour’s fun messing around with look and feel. Instead, I need to hit the docs to find out what’s possible and what isn’t, because the notebooks are no longer the open environment they once were… Bah..:-(

Written by Tony Hirst

April 11, 2014 at 6:10 pm

Posted in Open Education, OU2.0, Tinkering


Visualising Pandas DataFrames With IPythonBlocks – Proof of Concept

A few weeks ago I came across IPythonBlocks, a Python library developed to support the teaching of Python programming. The library provides an HTML grid that can be manipulated using simple programming constructs, presenting the outcome of the operations in a visually meaningful way.

As part of a new third level OU course we’re putting together on databases and data wrangling, I’ve been getting to grips with the python pandas library. This library provides a dataframe based framework for data analysis and data-styled programming that bears a significant resemblance to R’s notion of dataframes and vectorised computing. pandas also provides a range of dataframe based operations that resemble SQL style operations – joining tables, for example, and performing grouping style summary operations.

One of the things we’re quite keen to do as a course team is identify visually appealing ways of illustrating a variety of data manipulating operations; so I wondered whether we might be able to use ipythonblocks as a basis for visualising – and debugging – pandas dataframe operations.

I’ve posted a demo IPython notebook here: ipythonblocks/pandas proof of concept [nbviewer preview]. In it, I’ve started to sketch out some simple functions for visualising pandas dataframes using ipythonblocks blocks.

For example, the following minimal function finds the size and shape of a pandas dataframe and uses it to configure a simple block:

from ipythonblocks import BlockGrid

def pBlockGrid(df):
    # DataFrame.shape is (rows, columns); BlockGrid takes (width, height)
    y, x = df.shape
    return BlockGrid(x, y)

We can also colour individual blocks – the following example uses colour to reveal the different datatypes of columns within a dataframe:

ipythonblocks pandas type colour
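The dtype-to-colour mapping itself can be sketched without ipythonblocks getting in the way at all; given something like dict(df.dtypes.astype(str)) this returns per-column RGB tuples that could then be painted onto a BlockGrid row (the palette choices are arbitrary):

```python
def dtype_colours(dtypes, palette=None):
    """Map column dtype names to RGB tuples, one entry per column."""
    palette = palette or {
        "int64": (200, 50, 50),    # reds for integer columns
        "float64": (50, 50, 200),  # blues for float columns
        "object": (50, 200, 50),   # greens for string/object columns
    }
    default = (128, 128, 128)      # grey for anything unrecognised
    return {col: palette.get(dt, default) for col, dt in dtypes.items()}
```

Keeping the colour logic separate from the grid-drawing code also makes it easier to swap in the other colorView schemes mooted below.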

A more elaborate function attempts to visualise the outcome of merging two data frames:

ipythonblocks pandas demo

The green colour identifies key columns, the red and blue cells data elements from the left and right joined dataframes respectively, and the black cells NA/NaN cells.
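The column-level part of that colouring scheme reduces to a simple origin lookup (the cell-level NA/NaN blackening would additionally need the merged values); a sketch with hard-coded colours:

```python
GREEN, RED, BLUE, GREY = (0, 180, 0), (200, 0, 0), (0, 0, 200), (128, 128, 128)

def merge_column_colours(merged_cols, left_cols, right_cols, keys):
    """Colour each column of a merged frame by its origin: key columns
    green, left-only columns red, right-only columns blue."""
    colours = []
    for col in merged_cols:
        if col in keys:
            colours.append(GREEN)
        elif col in left_cols:
            colours.append(RED)
        elif col in right_cols:
            colours.append(BLUE)
        else:
            colours.append(GREY)  # e.g. suffixed duplicate column names
    return colours
```

Each column's colour would then be applied down the corresponding BlockGrid column, with NA cells overpainted black as a second pass.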

One thing I started wondering about that, I have to admit, quite excited me (?!;-) was whether it would be possible to extend the pandas dataframe itself with methods for producing ipythonblocks visual representations of the state of a dataframe, or the effect of dataframe based operations such as .concat() and .merge() on source dataframes.

If you have any comments on this approach, suggestions for additional or alternative ways of visualising dataframe transformations, or thoughts about how to extend pandas dataframes with ipythonblocks style visualisations of those datastructures and/or the operations that can be applied to them, please let me know via the comments:-)

PS some thoughts on a possible pandas interface:

  • DataFrame().blocks() to show the blocks
  • .concat(blocks=True) and .merge(blocks=True) to return (df, blocks)
  • DataFrame().blocks(blockProperties={}) and e.g. .merge(blocks=True, blockProperties={})
  • blockProperties: showNA=True|False; color_base=(), color_NA=(), color_left=(), color_right=(); color_gradient=[] (e.g. for a .concat() over many dataframes); colorView=structure|datatypes|missing (the colorView reveals the datatypes of the columns, the structural origins of cells returned from a .merge() or .concat(), or a view of missing data (NA/NaN etc. revealed over a base colour)); colorTypes={} (to set the colours for different datatypes)

Written by Tony Hirst

March 26, 2014 at 11:37 pm

So Is This Guerrilla Research?

A couple of days ago I delivered a workshop with Martin Weller on the topic of “Guerrilla Research”.


The session was run under the #elesig banner, and was the result of an invitation to work through the germ of an idea that was a blog post Martin had published in October 2013, The Art Of Guerrilla Research.

In that post, Martin had posted a short list of what he saw as “guerrilla research” characteristics:

  1. It can be done by one or two researchers and does not require a team
  2. It relies on existing open data, information and tools
  3. It is fairly quick to realise
  4. It is often disseminated via blogs and social media

Looking at these principles now, as in, right now, as I type (I don’t know what I’m going to write…), I don’t necessarily see any of these as defining, at least, not without clarification. Let’s reflect, and see how my fingers transcribe my inner voice…

In the first case, a source crowd or network may play a role in the activity, so maybe it’s the initiation of the activity that only requires one or two people?

Open data, information and tools help, but I’d gear this more towards pre-existing data, information and tools, rather than necessarily open ones: if you work inside an organisation, you may be able to appropriate resources that are not open or available outside the organisation, and that may even have limited access within it; you may even have to “steal” access to them. Open resources do mean that other people can engage in the same activity using the same resources, though, which provides transparency and reproducibility; open resources also make inside/outside activities possible.

The activity may be quick to realise, sort of: I can quickly set a scraper going to collect data about X, and the analysis of the data may be quick to realise; but I may need the scraper to run for days, or weeks, or months. More qualifying, I think, is that the activity only requires a relatively small number of relatively quick bursts of activity.

Online means of dissemination are natural, because they’re “free”, immediate, and have potentially wide reach; but I think an email to someone who can, or a letter to the local press, or an activity that is its own publication, such as a submission to a consultation in which the responses are all published, could count too.

Maybe I should have looked at those principles a little more closely before the workshop…;-) And maybe I should have made reference to them in my presentation. Martin did, in his.

PS WordPress just “related” this back to me, from June, 2009: Guerrilla Education: Teaching and Learning at the Speed of News

Written by Tony Hirst

March 21, 2014 at 8:44 am

Posted in OU2.0, Thinkses


Oppia – A Learning Journey Platform From Google…

I couldn’t get to sleep last night mulling over thoughts that had surfaced after posting Time to Drop Calculators in Favour of Notebook Programming?. This sort of thing: what goes on when you get someone to add three and four?

Part of the problem is associated with converting the written problem into mathematical notation:

3 + 4

For more complex problems it may require invoking some equations, or mathematical tricks or operations (chain rule, dot product, and so on).

3 + 4

Cast the problem into numbers then try to solve it:

3 + 4 =

That equals gets me doing some mental arithmetic. In a calculator, there’s a sequence of button presses, then the equals gives the answer.

In a notebook, I type:

3 + 4

that is, I write the program in mathematicalese, hit the right sort of return, and get the answer:


The mechanics of finding the right hand side by executing the operations on the left hand side are handled for me.

Try this on WolframAlpha: integral of x squared times x minus three from -3 to 4

Or don’t.. do it by hand if you prefer.

I may be able to figure out the maths bit – figure out how to cast my problem into a mathematical statement – but not necessarily have the skill to solve the problem. I can get the method marks but not do the calculation and get the result. I can write the program. But running the program, dividing 3847835 by 343, calculating the square root of 26,863 using log tables or whatever means, that’s the blocker – that could put me off trying to make use of maths, could put me off learning how to cast a problem into a mathematical form, if all that means is that I can do no more than look at the form as if it were a picture, poem, or any other piece of useless abstraction.
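To make the point concrete with the WolframAlpha example above: casting the problem gives the integral of x²(x − 3) from −3 to 4, and the mechanical evaluation – antidifferentiate, substitute the limits – can be handed straight off to code:

```python
from fractions import Fraction

def F(x):
    # antiderivative of x**2 * (x - 3) = x**3 - 3*x**2
    x = Fraction(x)
    return x**4 / 4 - x**3

# fundamental theorem of calculus: integral = F(upper) - F(lower)
answer = F(4) - F(-3)   # Fraction(-189, 4), i.e. -47.25
```

The creative step – writing down F, or even just the integrand – is the bit the learner supplies; the arithmetic is the bit the machine does.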

So why don’t we help people see that casting the problem into the mathematical form is the creative bit, the bit the machines can’t do? Because the machines can do the mechanical bit:

wolfram alpha

Maybe this is the approach that the folk over at Computer Based Math are thinking about (h/t Simon Knight/@sjgknight for the link), or maybe it isn’t… But I get the feeling I need to look over what they’re up to… I also note Conrad Wolfram is behind it; we kept crossing paths, a few years ago…I was taken by his passion, and ideas, about how we should be helping folk see that maths can be useful, and how you can use it, but there was always the commercial blocker, the need for Mathematica licenses, the TM; as in Computer-Based Math™.

Then tonight, another example of interactivity, wired in to a new “learning journey” platform that again @sjgknight informs me is released out of Google 20% time…: Oppia (Oppia: a tool for interactive learning).

Here’s an example….

oppia project euler 1

The radio button choice determines where we go next on the learning journey:

oppia euler2

Nice – interactive coding environment… 3 + 4 …

What happens if I make a mistake?

oppia euler 3

History of what I did wrong, inline, which is richer than a normal notebook style, where my repeated attempts would overwrite previous ones…

Depending how many common incorrect or systematic errors we can identify, we may be able to add richer diagnostic next step pathways…

..but then, eventually, success:

oppia euler 4

The platform is designed as a social one where users can create their own learning journeys and collaborate on their development with others. Licensing is mandated as “CC-BY-SA 4.0 with a waiver of the attribution requirement”. The code for the platform is also open.

The learning journey model is richer and potentially far more complex in graph structure terms than I remember the attempts developed for the [redacted] SocialLearn platform, but the vision appears similar. SocialLearn was also more heavily geared to narrative textual elements in the exposition; by contrast, the current editing tools in Oppia make you feel as if using too much text is not a Good Thing.

So – how are these put together? The blurb suggests it should be easy, but Google folk are clever folk (and I’m not sure how successful they’ve been getting their previous geek style learning platform attempts into education)… here’s an example learning journey – it’s a state machine:

Example learning design in oppia

Each block can be edited:

oppia state editor

When creating new blocks, the first thing you need is some content:

oppia - content - interaction

Then some interaction. For the interactions, a range of input types you might expect:

oppia interaction inputs

and some you might not. For example, these are the interactive/executable coding style blocks you can use:

oppia progr languages dialogue

There’s also a map input, though I’m not sure what dialogic information you can get from it when you use it?

After the interaction definition, you can define a set of rules that determine where the next step takes you, depending on the input received.

oppia state rules

The rule definitions allow you to trap on the answer provided by the interaction dialogue, optionally provide some feedback, and then identify the next step.

oppia - rules

The rule branches are determined by the interaction type. For radio buttons, rules are triggered on the selected answer. For text inputs, simple string distance measures:

oppia text rules

For numeric inputs, various bounds:

oppia numeric input

For the map, what looks like a point within a particular distance of a target point?

oppia map rule
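As a toy model – nothing to do with Oppia’s actual rule engine, and the ‘kind’ values are my own invention – the rule types above (exact choice, string distance, numeric bounds) might be captured as:

```python
import difflib

def rule_matches(answer, rule):
    """Toy matcher for the rule types described above."""
    kind = rule["kind"]
    if kind == "equals":              # radio button / exact answer
        return answer == rule["value"]
    if kind == "within":              # numeric input between bounds
        lo, hi = rule["bounds"]
        return lo <= answer <= hi
    if kind == "fuzzy":               # simple string-distance trigger
        ratio = difflib.SequenceMatcher(
            None, str(answer).lower(), rule["value"].lower()).ratio()
        return ratio >= rule.get("cutoff", 0.8)
    return False
```

A state's rule list would then be scanned in order, with the first matching rule supplying the feedback and the next state – which is where the diagnostic-pathway design effort really lives.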

The choice of programming languages currently available in the code interaction type kinda sets the tone about who might play with this…but then, maybe as I suggested to colleague Ray Corrigan yesterday, “I don’t think we’ve started to consider consequences of 2nd half of the chessboard programming languages in/for edu yet?”

All in all – I don’t know… getting the design right is what’ll make for a successful learning journey, and that’s not something you can necessarily do quickly. The interface is, as with many Google interfaces, not the friendliest I’ve seen (function over form, but bad form can sometimes get in the way of the function…).

I was interested to see they’re calling the learning journeys explorations. The Digital Worlds course I ran some time ago used that word too, in the guise of Topic Explorations, but they were a little more open ended, and used a question mechanic to guide reading within a set of suggested resources.

Anyway, one to watch, perhaps, erm, maybe… No badges as yet, but that would be candy to offer at the end of a course, as well as a way of splashing your own brand via the badges. But before that, a state machine design mountain to climb…

Written by Tony Hirst

February 27, 2014 at 9:14 pm

Posted in OU2.0

Cursory Thoughts on Virtual Machines in Distance Education Courses

One of the advantages of having a relatively long lived blog is that it gives me the ability to look back at the things that were exciting to me several years ago. For example, it was five years ago more or less to the day when I first saw video ads on the underground; and seven and a half years ago since I remarked on the possible relevance of virtual machines (VMs) to OU teaching: Personal Computing Guidance for Distance Education Students. (At the time, I was more excited by portable applications that could be run from USB sticks, the motivating idea being that OU students might want to access course software or applications from arbitrary machines that they didn’t necessarily have enough permissions on to be able to download and install required applications.)

Since then, a couple of OU courses have dabbled with virtual machines – the Linux course that’s now part of the course TM129 – Technologies in Practice makes use of a Linux virtual machine running in VirtualBox, and the digital forensics postgrad course (M812) makes use of a couple of VMs – a Windows box that needs analysing, and a Linux VM that contains the analysis tools.

We’re also looking at using a virtual machine for a new level three/third year equivalent course due out in October 2015 (sic…) on data stuff (short title!;-). I haven’t really been paying as much attention as I probably should have to VMs, but a little bit of playing at the end of last week and over the weekend made me realise the error of my ways…

So what are virtual machines (VMs)? You’re possibly familiar with the phrase “(computer) operating system”, and almost certainly will have heard of Windows and OS X. These are the bits of computer software that provide a desktop on top of your computer hardware, and run the services that allow your applications to talk to the hardware and out into the wider world. Virtual machines are boxes that allow you to run another operating system, as well as applications on top of it, on your own desktop. So a Windows machine can run a box that contains a fully working Linux computer; or, if you’re like me and use a Mac, you’ll have a virtual machine that runs a copy of Windows so you can run Internet Explorer on it in order to access the OU’s expense claims system!

Now when it comes to shipping course software, we’re often faced with the problem of getting software to work on whatever operating system our students are using. In a traditional university, with computer labs, the computers in the public areas will all contain the same software, installed from a common source. (OU IT are trying to enforce a similar policy on staff machines at the moment. Referred to in reverential terms as “desktop optimisation”, the idea is that machines will only run the software that IT says can run on them. Which would rule out the possibility of me running pretty much any of the applications I use on a day to day basis. Although I think Macs are outside the optimisation fold for the moment…?)

Ideally, then, we might want students to all run the same operating system, so that we can test software on that system and write one set of instructions for how to use it. But students bring their own devices. And when it comes to the software tools we’d like computing students, for example, to use, there can be all sorts of problems getting them to install properly.

So another option is to provide students with a machine that we control, that doesn’t upset their own settings, and that won’t kill their computer if something goes horribly wrong. (We can’t, for example, require students to run the OU’s optimised desktop, not least because we’d have to pay license fees for the use of Windows, but also because students would rightly get upset if we prevented them from downloading and installing Angry Birds on their own computer!) Virtual machines provide a way of doing this.

As a case in point, the new data course will probably make use of IPython Notebook, among other things. IPython Notebook is a browser-accessed application that allows you to develop and execute Python program code via an interactive, browser-based user interface, which I find quite attractive from a pedagogical point of view. (This post may get read by OU folk in an OU teaching context, so I am obliged to use the p-word.)
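To give a flavour of what working in a notebook feels like, here’s a made-up fragment (not from the course materials): you type code into a cell, run it, and the result of the last expression appears inline beneath the cell, ready for you to tweak and re-run.

```python
# A made-up example of the kind of cell a notebook might contain.
# Count word frequencies in a sample sentence, then peek at the result.
counts = {}
for word in "the quick brown fox jumps over the lazy dog".split():
    counts[word] = counts.get(word, 0) + 1

# In a notebook, the value of the last expression in a cell is
# displayed automatically beneath it - no print statement needed.
sorted(counts.items(), key=lambda kv: -kv[1])[:3]
```

That edit-run-inspect loop, with the output interleaved with the code, is a large part of the pedagogical appeal.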

Installing the Python libraries the course will draw on, as well as a variety of databases (PostgreSQL and MongoDB are the ones we’re thinking of using…) could be a major headache for our students, particularly if they aren’t well versed in sysadmin and library installation. But if we define a virtual machine that has the required libraries and applications preinstalled and preconfigured, we can literally contain the grief – if students run an application such as VirtualBox (which they would have to install themselves), we can provide a preconfigured machine (known as a guest) that they can run within their own desktop (part of the host machine), that will make available services that they can access via their normal desktop browser.
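By way of illustration, the sort of provisioning script that might build such a guest could look something like the following sketch – the package and library names here are my assumptions for an Ubuntu-flavoured box, not the actual course build:

```shell
#!/usr/bin/env bash
# Illustrative provisioning sketch for an Ubuntu-based guest VM.
# Package and library names are assumptions, not the course's actual build.
set -e

# System packages: Python tooling plus the candidate databases
apt-get update
apt-get install -y python-pip ipython-notebook postgresql mongodb

# Python libraries the course might draw on, plus database drivers
pip install pandas pymongo psycopg2
```

Run once when the guest is first built, a script like this gives every student an identical, preconfigured environment regardless of what’s on their host machine.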

So for example, we can build a virtual machine that contains IPython and all the required libraries, that can be defined to automatically run IPython Notebook when it boots, and that can make that notebook available via the host browser. And more than that, we can also configure the Notebook server running on the local guest VM so that it saves notebook files to a directory that is shared between the guest and the host. If a student then switches off, or even deletes, the guest machine, they don’t lose their work…
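A tool like Vagrant can script exactly this sort of VM definition. A minimal sketch might look something like the following – the box name, port number and folder paths are my illustrative assumptions, not the actual course configuration:

```ruby
# Illustrative Vagrantfile sketch - box name, port and folder paths are
# assumptions, not the actual course configuration.
Vagrant.configure("2") do |config|
  config.vm.box = "precise64"

  # Forward the notebook server's port so the host browser can reach it
  # at http://localhost:8888
  config.vm.network "forwarded_port", guest: 8888, host: 8888

  # Share a notebooks/ directory between host and guest, so work survives
  # the guest being switched off or deleted
  config.vm.synced_folder "notebooks/", "/home/vagrant/notebooks",
    create: true

  # Start IPython Notebook in the background, listening on all interfaces
  # and saving into the shared directory
  config.vm.provision "shell", inline: <<-SCRIPT
    nohup ipython notebook --no-browser --ip=0.0.0.0 \
      --notebook-dir=/home/vagrant/notebooks &
  SCRIPT
end
```

The synced folder is the key piece for the “don’t lose your work” property: the notebook files physically live on the host, with the guest just mounting them.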

VMs have been used elsewhere for course delivery too, so we may also be able to learn more about the practicalities of VMs in a course context from those cases. For example, Running a next-gen sequence analysis course using Amazon Web Services describes how virtual machines running on Amazon cloud services (rather than in boxes running within a VirtualBox container on the user’s desktop) were used for a data analysis course that made use of very large datasets. (This demonstrates another benefit of virtualisation: we can configure a VM so that it can be run in containerised form on a student’s own computer, or run on a machine hosted on the net somewhere, and then accessed from the student’s own machine.)

Something I found really exciting was the set of VMs defined by @DataMinerUk and @twtrdaithi for use in data journalism applications – Infinite Interns, a range of virtual machines defined using Vagrant (which is super fun to play with!:-) that contain a range of tools useful for data projects.

I also wonder about the extent to which the various MOOCs have made use of VMs… And whether there is an argument to be had in favour of “course boxes” in general…?

PS for a hint at something of what’s possible in using a VM to support a course, imagine Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More as course notes, The official online compendium for Mining the Social Web, 2nd Edition (O’Reilly, 2013) as the way in to your run-at-home computer lab, and Mining-the-Social-Web-2nd-Edition – issues on github as instructor/lab technician support. ’nuff said. The things we’re gonna be prepared to pay for have the potential to change…

Written by Tony Hirst

December 2, 2013 at 3:18 pm

Posted in OU2.0


