Fragment – TM351 Services Architected for Online Access

When we put together the original  TM351 VM, we wanted a single, self-contained installable environment capable of running all the services required to complete the practical activities defined for the course. We also had a vision that the services should be capable of being accessed remotely.

With a bit of luck, we’ll have access to an OU OpenStack environment any time soon that will let us start experimenting with a remote / online VM provision, at least for a controlled number of students. But if we knew that a particular cohort of students were only ever going to access the services remotely, would a VM be the best solution?

For example, the services we run are:

  • Jupyter notebooks
  • OpenRefine
  • PostgreSQL
  • MongoDB

Jupyter notebooks could be served via a single Jupyter Hub instance, albeit with persistence enable on individual accounts so students could save their own notebooks.

Access to PostgreSQL could be provided via a single Postgres DB with students logging in under their own accounts and accessing their own schema.

Similarly – presumably? – for MongoDB (individual user accounts accessing individual databases). We might need to think about something different for the sharded Mongo activity, such as a containerised solution (which could also provide an opportunity to bring the network partitioning activity I started to sketch out way back when).

OpenRefine would require some sort of machinery to fire up an OpenRefine container on demand, perhaps with a linked persistent data volume. It would be nice if we could use Binderhub for that, or perhaps DIT4C style infrastructure…

Author: Tony Hirst

I'm a lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...