First Attempt at Running the TM351 VM as an AMI on Amazon Web Services

One of the things that’s been on my to do list for ages is trying to get a version of the TM351 virtual machine (VM) up and running on Amazon Web Services (AWS) as an Amazon Machine Instance (AMI). This would allow students who are having trouble running the VM on their own computer to access the services running in the cloud.

(Obviously, it would be preferable if we could offer such a service via OU operated servers, but I can’t do politics well enough, and don’t have the mentality to attend enough of the necessary say-the-same-thing-again-again meetings, to make that sort of thing happen.)

So… a first attempt is up on the eu-west-1 region in all its insecure glory: TM351 AMI v1. The security model is by obscurity as much as anything – there’s no model for setting separate passwords for separate students, for example, or checking back agains an OU auth layer. And I suspect everything runs as root…

(One of the things we have noticed in (brief) testing is that the Getting Started instructions don’t work inside the OU, at least if you try to limit access to your (supposed) IP address. Reminds of when we gave up trying to build the OU VM from machines on the OU network because solving proxy and blocked port issues was an irrelevant problem to have to worry about when working from the outside…)

Open Refine doesn’t seem to want to run with the other services in the free tier micro (1GB) machine instance, but at 2GB everything seems okay. (I don’t know if possible race conditions in starting services means that Open Refine could start and then block the Jupyter service’s request for resource.  I need to do an Apollo 13 style startup sequence exploration to see if all services can run in 1GB, I guess!) One thing I’ve added to the to do list is to split things out so into separate AMIs that will work on the 1GB free tier machines. I also want to check that I can provision the AMI from Vagrant, so students could then launch a local VM or an Amazon Instance that way, just by changing the vagrant provider. (Shared folders/volumes might get a bit messed up in that case, though?)

If services can run one at a time in the 1GB machines, it’d be nice to provide a simple dashboard to start and stop the services to make that easier to manage. Something that looks a bit like this, for example, exposed via an authenticated web page:

This needn’t be too complex – I had in mind a simple Python web app that could run under nginx (which currently provides a simple authentication layer for Open Refine to sit behind) and then just runs simple systemctl start, stop and restart commands on the appropriate service.

#fragment...
import os
os.system('systemctl restart jupyter.service')

I’m not sure how the status should be updated (based on whether a service is running or not) or what heartbeat it should update to. There may be better ways, of course, in which case please let me know via the comments:-)

I did have a quick look round for examples, but the dashboards/monitoring tools that do exist, such as pydash, are far more elaborate than what I had in mind. (If you know of a simple example to do the above, or can knock one up for me, please let me know via the comments. And the simpler the better ;-)

If we are to start exploring the use of browser accessed applications running inside user-managed VMs, this sort of simple application could be really handy… (Another approach would be to use a VM running docker, and then have a container manager running, such as portainer.)

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

%d bloggers like this: