Some Random Upstart Debugging Notes…

…just so I don’t lose them…

dmesg spews out messages the kernel has been issuing…

/var/log/upstart/SERVICE.log has log messages from trying to start a service SERVICE.

/etc/init.d should contain a generic-looking file named SERVICE; the actual config file, containing the command used to start the service, lives in /etc/init as SERVICE.conf.
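For the record, a minimal /etc/init/SERVICE.conf sketch might look something like this (the service name and command are made up for illustration):

```
# /etc/init/myservice.conf – hypothetical upstart job definition
description "My example service"

# start once the filesystem and network are up; stop on shutdown/reboot
start on (filesystem and net-device-up IFACE!=lo)
stop on runlevel [016]

# restart the process if it dies
respawn

# the actual command that runs the service
exec /usr/local/bin/myservice --config /etc/myservice/config.yml
```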

To generate the files that will have a go at auto-running the service, run the command update-rc.d SERVICE defaults.

Start a service with service SERVICE start, stop it with service SERVICE stop, and restart (stop if started, then start) with service SERVICE restart. Find out what’s going on with it using service SERVICE status.


Are Robots Threatening Jobs or Are We Taking Them Ourselves Through Self-Service Automation?

Via several tweets today, a story in the Guardian declaring “Robots threaten 15m UK jobs, says Bank of England’s chief economist”:

The Bank of England has warned that up to 15m jobs in Britain are at risk of being lost to an age of robots where increasingly sophisticated machines do work that was previously the preserve of humans.

The original source appears to be a speech (“Labour’s Share”) given by Andrew G Haldane, Chief Economist of the Bank of England, to the Trades Union Congress, London, 12 November 2015. It has bits and pieces in common with recent reports such as this one on The Future of Employment: how susceptible are jobs to computerisation?, this one asking Are Robots Taking Our Jobs, or Making Them?, this on The new hire: How a new generation of robots is transforming manufacturing, this collection of soundbites collected by Pew, or this report from a robotics advocacy group on the Positive Impact of Industrial Robots on Employment. (Lots of consultancies and industry lobby groups seem to have been on the robot report bandwagon lately…) There’s also been a report from Bank of America/Merrill Lynch on Creative Disruption that seems to have generated some buzz lately, and which also picks up on several trends in robotics.

But I wonder – is it robots replacing jobs by automating them out of existence, or robots replacing jobs by transferring work from the provider of a service or good directly on to the consumer, turning customers into unpaid employees? That is, what proportion of these “robots” are actually self-service technologies (SSTs)? So for example, have you ever:

  • used a self-service checkout in a supermarket rather than waiting in line for a cashier to scan your basketload of goods, let alone bought a bag of crisps or bottle of water from a (self-service) vending machine?
  • used a self-service banking express machine or kiosk to pay in a cheque, let alone used an ATM to take cash out?
  • used a self-service library kiosk to scan out a library book?
  • used a self-service check-in kiosk or self-service luggage drop off in an airport?
  • used a self-service ticket machine to buy a train ticket?
  • collected goods from a (self-service) Amazon locker?
  • commented in a “social learning” course to support a fellow learner?
  • etc etc

Who’s taken on the work there? If you scan it yourself, you’re an unpaid employee…

Launch Docker Container Compositions via Tutum and – But What About Container Stashing?

Via a couple of tweets, it seems that 1-click launching of runnable docker container compositions to the cloud is almost possible with Tutum – deploy to Tutum button [h/t @borja_burgos] – with collections of ready-to-go compositions (or in Tutum parlance, stacks) available on [h/t @tutumcloud].

The deploy to Tutum button is very much like the binder setup, with URLs taking the form:

The service will look in the repository – such as a github repository – for tutum.yml, docker-compose.yml and fig.yml files (in that order) and pre-configure a Tutum stack dialogue with the information described in the file.
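By way of illustration, a minimal docker-compose.yml of the sort the stack dialogue could be pre-configured from might look something like this (service names and the database image choice are illustrative; tutum/hello-world was one of Tutum’s own demo images):

```yaml
# hypothetical two-service composition for the deploy button to pick up
web:
  image: tutum/hello-world
  ports:
    - "80:80"
  links:
    - db
db:
  image: postgres
```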

The stack can then be deployed to one or more already running nodes.

The site hosts a range of pre-defined configuration files that can be used with the deploy button, so in certain respects it acts in much the same way as the Panamax directory (Panamax marketplace?).

One of the other things I learned about Tutum is that they have a container defined that can cope with load balancing: if you launch multiple container instances of the same docker image, you can load balance across them (tutum: load balancing a web service). At least one of the configurations on (Load balancing a Web Service) seems to show how to script this.
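As a sketch of how I understand the scripted version might look – the tutum/haproxy image is the one named in the docs, but the exact stackfile keys (target_num_containers in particular) are my assumptions – something like:

```yaml
# hypothetical load-balanced Tutum stackfile
web:
  image: tutum/hello-world
  target_num_containers: 2   # run two instances of the web container
lb:
  image: tutum/haproxy       # Tutum's load balancing container
  links:
    - web                    # balance across the linked web containers
  ports:
    - "80:80"
```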

One of the downsides of the load balancing, and indeed of the deploy to Tutum recipe generally, is that there doesn’t seem to be a way to ensure that server nodes on which to run the containers are available: presumably, you have to start these yourself?

What would be nice would be the ability to also specify an autoscaling rule that could be used to fire up at least one node on which to run a deployed stack. Autoscaling rules would also let you power up/power down server nodes depending on load, which could presumably keep the cost of running servers down to the minimum needed to service whatever load is actually being experienced. (I’m thinking of occasional, relatively low usage models, which are perhaps slightly different from a normal web scaling model – for example, the ability to fire up a configuration of several instances of OpenRefine for a workshop, and have autoscaling deploy additional containers (and, if required, additional server nodes) depending on how many people turn up to the workshop or want to participate.)

There seems to be a discussion thread about autoscaling on the Tutum site, but I’m not sure there is actually a corresponding service offering? (Via @tutumcloud, there is a webhook triggered version: Tutum triggers.)

One final thing that niggles at me particularly in respect of personal application hosting is the ability to “stash” a copy of a container somewhere so that it can be reused later, rather than destroying containers after each use. ( appears to have this sorted…) A major reason for doing this would be to preserve user files. I guess one way round it is to use a linked data container and then keep the server node containing that linked data container alive, in between rounds of destroying and starting up new application containers (application containers that link to the data container to store user files). The downside of this is that you need to keep a server alive – and keep paying for it.

What would be really handy would be the ability to “stash” a container in some cheap storage somewhere, and then retrieve that container each time someone wanted to run their application (this could be a linked data container, or it could be the application container itself, with files preserved locally inside it?) (Related: some of my earlier notes on how to share docker data containers.)
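As a sketch of the linked data container pattern (container names, image choices and paths are all made up for illustration), something like the following lets application containers come and go while the files persist, and gives a crude way of “stashing” the data as a tarball that could be parked in cheap storage:

```shell
# create a named data-only container exposing a volume
docker create -v /notebooks --name tm351data busybox

# run a throwaway application container that keeps user files in the volume
docker run --rm --volumes-from tm351data -p 8888:8888 ipython/notebook

# "stash": archive the volume contents to the host for cheap offline storage
docker run --rm --volumes-from tm351data -v "$PWD":/backup busybox \
  tar cvf /backup/tm351data.tar /notebooks

# later, restore the stash into a fresh data container
docker create -v /notebooks --name tm351data2 busybox
docker run --rm --volumes-from tm351data2 -v "$PWD":/backup busybox \
  tar xvf /backup/tm351data.tar -C /
```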

I’m not sure whether there are any string’n’glue models that might support this? (If you know of any, particularly if they work in a Tutum context, please let me know via the comments…)

A Peek Inside the TM351 VM

So this is how I currently think of the TM351 VM:


What would be nice would be a drag’n’drop tool to let me draw pictures like that which would then generate the build scripts… (a docker compose script, or set of puppet scripts, for the architectural bits on the left, and a Vagrantfile to set up the port forwarding, for example).

For docker, I wouldn’t have thought that would be too hard – a docker compose file could describe most of that picture, right? Not sure how fiddly it would be for a more traditional VM, though, depending on how it was put together?
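As a rough sketch of what that compose file might look like – the image names and port mappings are illustrative guesses on my part, not the actual TM351 build:

```yaml
# hypothetical compose description of the TM351 services
notebook:
  image: ipython/notebook
  ports:
    - "8888:8888"   # forwarded so the browser UI is reachable from the host
  links:
    - mongodb
    - postgres
openrefine:
  image: psychemedia/tm351-openrefine   # hypothetical image name
  ports:
    - "3333:3333"
mongodb:
  image: mongo
postgres:
  image: postgres
```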

Personal Application Hosting, Dreams of a Docker AppStore, and an Incoming Sandstorm?

After a fun chat with Jim Groom this morning – even after all these years, we’ve still never met in person – I thought I should get round to finishing off this post, modified slightly in light of today’s chat…

A couple of months ago, I signed up for some online webhosting from Reclaim Hosting (which I can heartily recommend:-), in part because I wanted to spend a bit of time hacking some #opendata related WordPress plugins (first attempt described here); and my hosting on WordPress-dot-com doesn’t allow much in the way of tech customisation…

Reclaim offers web hosting, which is to say: a place to park several domains of my own, host blogs of my own customisation, manage email associated with my domains, handle analytics and logging, and publish a variety of other web style applications of my own choosing.

The web applications on offer are 1-click installable (ish – there may be various custom settings associated with any particular application) using cPanel and installatron.


This is great for web hosting BUT the applications on offer are, in the main, applications associated with web stuff, as compared to applications associated with scientific, engineering, or digital humanities coursework, for example (“scholarly apps”, perhaps?!). So for example, for OUr Data Management and Analysis (TM351) course, students will be running Jupyter notebooks, OpenRefine, MongoDB and PostgreSQL (I had hoped early on that RStudio might make it in there too, but that was over-ambitious!;-) It’s not surprising that some of these apps also appear on the ResBaz Cloud.

Jupyter, OpenRefine and RStudio share the common feature of presenting graphical user interfaces via a browser. MongoDB and PostgreSQL, on the other hand, along with services like the Apache Tika Document Text Extraction Service, provide “headless” services via an http port. Which is to say – they work over the web, and, if appropriate, they can be accessed via a browser.

So here’s what I want, what I think I really, really want: an online application hosting provider. Reclaim lets me do the web social and web publishing stuff, but at the moment I can’t 1-click install my web-runnable course software there. Nor can I easily share my own “scholarly app” creations: for example, I could pay $9 a month to host shiny apps I’ve built in RStudio on, but if I just wanted to share a little something with my friends that I’d built on a course for a day or two, that would probably be overkill compared to hosting it briefly on my own site. If I’d built a cool Jupyter notebook and wanted to let you have a play with it, I could share the notebook file with you and you could then download it and run it on your own notebook server, assuming you know how to do that, but it might also be nice if you could 1-click launch an interactive version of it on my site. (Actually, there is a string’n’glue solution to this: I could pop the notebook onto github and then run it via binder.)

So, looking around for bits of stick’n’string’n’glue that could perhaps be glued together to let me do this, what I’d quite like is to have my own online, course-app running StrinGLE (remember StringLE…? A learning environment, made from string’n’glue, where you could actually do stuff as well as be given stuff to read…).

On the one hand, the social webhosting side, I’d have my webhosting cPanel; on the other, to meet my course related scientific computing needs, I’d have something like Kitematic:


Note there may be some overlap in the applications. What I’m more thinking about are use cases where the applications operate on a different timescale. The web hosting apps I start once and they run forever: I want my blog to be there all the time, and I want my email to work all the time. The personal apps are more like applications that only run when I’m using them – RStudio, or a Jupyter notebook. That is, I start them when I want to use them, then shut them down when I’m done, ideally persisting any files I’ve been working on somewhere until the next time I want to use the application. Containers are ideal for this because you can start them when you need them, then throw them away when your study session is done.

So that’s one take – a Kitematic complement to cPanel that lets me fire up applications for short term use, whilst persisting my files in a storage area part of my online hosting, which is perhaps even synched to something like Dropbox.

Here’s another take – imagine what this might mean…:


In this case, imagine that the binder button takes an image on a dockerhub and launches it via my web host. So the binder button takes me to a central clearing house where I have an authenticated account that I’ve configured with details of my web host. Clicking the binder button says to the binder server: “authenticated person X wants to run a container based on image Y”, and the binder server looks up my host details, and fires up a container there, with a URL as a subdomain of my domain.

I could imagine something like Tutum – recently acquired by docker – being able to support something like this: from Tutum, I can currently start up servers (droplets) on a third party host (I use Digital Ocean for this), and then deploy containers from dockerhub on those servers. At the moment it takes a few clicks in Tutum to set up the various settings and start the servers, but it could perhaps all be streamlined into a few setup screens for the first time I launch a container application, with the parameters saved to a config file that could be used by default on future starts of the same application? So a tutum button, rather than a binder button, on dockerhub perhaps?

As to security, I think that running arbitrary containers fills IT folk with security dread, so it may make more sense to only support containers based on images held in a trusted repository, such as an institutional repository. This does put something of an application gatekeeper role back on the institution, but the institution could be a trusted commercial or community partner. (I wonder: is there support for trusted/signed docker images?)

As to how achievable this is – I wish I had time to explore and play with the Tutum API a little! In the meantime, Jim mentioned the rather intriguing sounding Sandstorm…


What this seems to be is an app store where the apps are Linux virtual machines, packaged using vagrant…: Sandstorm application packaging.

From a quick peek, it seems that a Sandstorm application is a Linux image built up from a Sandstorm base image and a set of user defined shell scripts. (UPDATE: for a description of how the approach differs from docker, see Why doesn’t Sandstorm just run Docker apps?) Rather than running a single application within a single container, and then linking containers to make application compositions, it looks as if Sandstorm containers may run several applications that talk to each other within the container? State can also be persisted, so whilst application running containers are destroyed if you close a browser session running against the container, the state is recoverable if you launch another container from the same image. Which means that the Sandstorm folk have got the user-authentication thing sussed? (Sandstorm know I’m me. When I fire up a Jupyter container, they can link it to my stash of notebook files.) Hmm…


My TM351 VM build files are based on puppet – with a few shell scripts – orchestrated by vagrant. I wonder how hard it would be to create a version of the TM351 virtual machine that could be deployed via Sandstorm? Hmm…

PS Hmm.. it seems that a “Deploy to Tutum” button already exists (h/t @borja_burgos), though I’ve not had time to look at this properly yet… Exciting:-)

PPS and via @tutumcloud, – a bit like Panamax compositions, deployable via Tutum… Thinks: so I should be able to do a stack for TM351 linked containers…:-)

Running Executable Jupyter/IPython Notebooks Directly from Github With Binder

It’s taken me way too long to get round to posting this, but it’s a compelling idea that I think more notice should be taken of… binder ([code]).

The idea is quite simple – specify a public github project (username/repo) that contains one or more Jupyter (IPython) notebooks, hit “go”, and the service will automatically create a docker container image that includes a Jupyter notebook server and a copy of the files contained in the repository.


(Note that you can specify any public Github repository – it doesn’t have to be one you have control over at all.)

Once the container image is created, visiting the launch link will fire up a new container based on that image and display a Jupyter notebook interface at the redirected-to URL. Any Jupyter notebooks contained within the original repository can then be opened, edited and executed as an active notebook document.

What this means is we could pop a set of course related notebooks into a repository, and share a link to it. Whenever the link is visited, a container is fired up from the image and the user is redirected to that container. If I go to the URL again, another container is fired up. Within the container, a Jupyter notebook server is running, which means you can access the notebooks that were hosted in the Github repo as interactive, “live” (that is, executable) notebooks.

Alternatively, a user could clone the original repository, and then create a container image based on their copy of the repository, and then launch live notebooks from their own repository.

I’m still trying to find out what’s exactly going on under the covers of the binder service. In particular, a couple of questions came immediately to mind:

  • how long do containers persist? For example, at the moment we’re running a FutureLearn course (Learn to Code for Data Analysis) that makes use of IPython/Jupyter notebooks, but it requires learners to install Anaconda (which has caused a few issues). The course lasts 4 weeks, with learners studying a couple of hours a day maybe two days a week. Presumably, the binder containers are destroyed as a matter of course according to some schedule or rule – but what rule? I guess learners could always save and download their notebooks to the desktop and then upload them to a running server, but it would be more convenient if they could bookmark their container and return to it over the life of the course? (So for example, if FutureLearn was operating a binder service, joining the course could provide authenticated access to a container at for the duration of the course, and maybe a week or two after? Following ResBaz Cloud – Containerised Research Apps as a Service, it might also allow for a user to export a copy of their container?)
  • how does the system scale? The FutureLearn course has several thousand students registered to it. To use the binder approach towards providing any student who wants one with a web-accessible, containerised version of the notebook application so they don’t have to install one of their own, how easily would it scale? eg how easy is it to give a credit card to some back-end hosting company, get some keys, plug them in as binder settings and just expect it to work? (You can probably guess at my level of devops/sysadmin ability/knowledge!;-)

Along with those immediate questions, a handful of more roadmap style questions also came to mind:

  • how easy would it be to set up the Jupyter notebook system to use an alternative kernel? e.g. to support a Ruby or R course? (I notice that offers a variety of kernels, for example?)
  • how easy would it be to provide alternative services to the binder model? eg something like RStudio, for example, or OpenRefine? I notice that the binder repository initialisation allows you to declare the presence of a custom Dockerfile within the repo that can be used to fire up the container – so maybe binder is not so far off a general purpose docker-container-from-online-Dockerfile launcher? Which could be really handy?
  • does binder make use of Docker Compose to tie multiple applications together, as for example in the way it allows you to link in a Postgres server? How extensible is this? Could linkages of a similar form to arbitrary applications be configured via a custom Dockerfile?
  • is closer integration with github on the way? For example, if a user logged in to binder with github credentials, could files then be saved or synched back from the notebook to that user’s corresponding repository?
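On the custom Dockerfile point, I’m imagining something along these lines – note that the base image name and user account are assumptions on my part about how binder builds repo images:

```dockerfile
# hypothetical Dockerfile dropped into a repo for binder to build from
FROM andrewosh/binder-base

# extra system dependencies the notebooks need
USER root
RUN apt-get update && apt-get install -y libpq-dev

# extra Python packages, installed as the notebook user
USER main
RUN pip install psycopg2 folium
```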

Whatever – will be interesting to see what other universities may do with this, if anything…

See also Seven Ways of Running IPython Notebooks and ResBaz Cloud – Containerised Research Apps as a Service.

PS I just noticed an interesting looking post from @KinLane on API business models: I Have A Bunch Of API Resources, Now I Need A Plan, Or Potentially Several Plans. This has got me wondering: what sort of business plan might support a “Studyapp” – applications on demand, as a service – form of hosting?

Several FutureLearn courses, for all their web first rhetoric, require learners to install software onto their own computers. (From what I can tell, FutureLearn aren’t interested in helping “partners” do anything that takes eyeballs away from their own site.) So I don’t understand why they seem reluctant to explore ways of using tech to provide interactive experiences within the FutureLearn context, like using embedded IPython notebooks, for example. (Trying to innovate around workflow is also a joke.) And IMVHO, the lack of innovation foresight within the OU itself (FutureLearn’s parent…) seems just as bad at the moment… As I’ve commented elsewhere, “[m]y attitude is that folk will increasingly have access to the web, but not necessarily access to a computer onto which they can install software applications. … IMHO, we are now in a position where we can offer students access to “computer lab” machines, variously flavoured, that can run either on a student’s own machine (if it can cope with it) or remotely (and then either on OU mediated services or via a commercial third party on which students independently run the software). But the lack of imagination and support for trying to innovate in our production processes and delivery models means it might make more sense to look to working with third parties to try to find ways of (self-)supporting our students.”. (See also: What Happens When “Computers” Are Replaced by Tablets and Phones?) But I’m not sure anyone else agrees… (So maybe I’m just wrong!;-)

That said, it’s got me properly wondering – what would it take for me to set up a service that provides access to MOOC or university course software, as a service, at least, for uncustomised, open source software, accessible via a browser? And would anybody pay to cover the server costs? How about if web hosting and a domain was bundled up with it, that could also be used to store copies of the software based activities once the course had finished? A “personal, persistent, customised, computer lab machine”, essentially?

Possibly related to this thought, Jim Groom’s reflections on The Indie EdTech Movement, although I’m thinking more of educators doing the institution stuff for themselves as a way of helping the students-do-it-for-themselves. (Which in turn reminds me of this hack around the idea of THEY STOLE OUR REVOLUTION LEARNING ENVIRONMENT. NOW WE’RE STEALING IT BACK !)