Pondering What “Digital First” and “University of the Cloud” Mean…

In a briefing to OU staff from senior management earlier this week, VC Peter Horrocks channelled KPMG consultants with talk of the OU becoming “digital first”, and reimagining itself as a “University of the Cloud”, updating the original idea of it being a “University of the Air” [Open University jobs at risk in £100m ‘root and branch’ overhaul].

I have no clear idea what “digital” means, nor what it would mean to be a “university of the cloud” (if things are cloudy, does that mean we can’t do blue skies thinking; or that there is a silver lining somewhere?!;-). But here are a few things I’d like to explore, based on trends that I think have been emerging over the last few years (and which I can date from historical OUseful.info blog posts, both here and in the original ouseful archive, which dates back to 2005…)

From Applications to Apps

In recent years, we’ve seen a move away from installed software applications that are self-contained and run, offline, on a host computer, and towards installed apps that often integrate tightly with online services. Apps may run, in part, offline, but they prefer it when there’s a network connection. Online apps exist solely elsewhere and are accessed via a browser.

Where the code lives, and where data files are stored, has implications for the user. If you’re using an online app, you need a reliable network connection. If all you use are online apps, a tablet or a Chromebook is fine. This in turn impacts on providers of services that make use of software (such as the OU, for example). I’ve been wittering on for years that if all a student has is a Chromebook, then we’re excluding them if our courses require them to have a computer onto which a “traditional” software application can be installed. This tends to fall on deaf ears – two new level 1 courses, both currently in production, that don’t launch until later this year and next year, and that would typically be expected to have a life of several years, both make use of desktop software installs. I suspect this is not university of the cloud style thinking.

So Browser First…

The view I have had for several years is that all software services we expect students to be able to access should be accessed via a browser. This frees us up to deliver applications onto the desktop that expose themselves to students via the browser, or deliver the services from a remote online host. This could be an OU delivered service (for example, via the OpenSTEM Lab), a third party delivered service (such as Azure Notebooks), or a service managed on a remote host by the student themselves (for example, the TM351 Amazon/AWS AMI we are testing at the moment).

For TM351, we took an early decision to use just such browser accessed tools for the course, in particular Jupyter notebooks and OpenRefine (along with some other “headless” database services). For convenience, these were packaged inside a single virtual machine that could be installed on a “traditional” computer (Windows or Mac). Running the virtual machine exposed the services via the browser. A shared directory meant student files were kept on the host computer but could be accessed by the services running inside the VM. Although we did not explicitly provide support for students who only had access to a tablet or Chromebook that could not run the VM, a proof of concept solution using linked Docker containers that could be run on a cloud host was available as an emergency fallback.

In updating the TM351 VM for the next presentation of the course, we are also exploring making the VM available at least as an Amazon Machine Image (AMI), which would allow a student to run the VM, at their own cost, on a remote Amazon Web Services (AWS) server and access the course software via their browser.

The applications that live inside the TM351 virtual machine have also been broken out into separate Docker containers. These can be combined and launched in a multitude of ways. For example, OpenRefine running on its own, Jupyter notebooks running on their own, Jupyter notebooks + PostgreSQL running in a linked fashion, Jupyter notebooks + MongoDB running in a linked fashion. The use of Docker means that the services can also be run locally on an offline student computer running Docker, or they can be run on a remote server (that is, in the cloud) and accessed via a browser. This approach means we can continue to provide software to students that runs on their own computer, but we can also provide exactly that same software from a remote host that lives “in the cloud”.
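By way of illustration, a linked Jupyter + PostgreSQL pairing of the sort described above might be wired together with a Docker Compose script along the following lines (the image names, paths and credentials here are illustrative, not the actual TM351 ones):

```yaml
# docker-compose.yml — illustrative sketch only, not the real TM351 config.
# Launch the linked pair with: docker-compose up
version: "3"
services:
  notebook:
    image: jupyter/base-notebook        # a public Jupyter notebook image
    ports:
      - "8888:8888"                     # notebooks served to the browser
    volumes:
      - ./notebooks:/home/jovyan/work   # student files stay on the host
    depends_on:
      - postgres                        # start the database first
  postgres:
    image: postgres:9.5
    environment:
      POSTGRES_PASSWORD: tm351          # illustrative credentials only
```

The same script runs unchanged on a student’s own machine or on a remote cloud host, which is rather the point.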

A couple of other computing courses also make use of VMs, but from what I can tell there is little interest in pushing our thinking about how virtualised computing can be used to support either computing courses or other courses with computing needs. The part of the OU that provides “digital” support to the OU has little, if any, experience in providing cloud services, and to date there seems to have been little, if any, capacity for exploring this area. Digital. Cloud. Hmmm…

The use of containerised services can also extend outside the computing curriculum to other subject areas. I’ve tried to float the idea of a “digital applications shelf” in the Library that publishes standalone, virtualised (containerised) services, along with scripts for combining them (example, or as in the case of linked TM351 applications) but never really got very far with it. It only takes a little bit of imagination to see how this might work (a Dockerhub image shelf, a repository of Docker compose scripts that can wire containers created from images pulled off the shelf together), but that imagination, again, seems to be lacking. (I could be spectacularly and completely wrong, of course!;-)

What I don’t think we should be doing is making remote desktops available to students that run installed software applications (I don’t think we should give them access to a remote Windows desktop, for example, running the Windows installed software we might traditionally have developed). We should be using software that runs as a service and is delivered directly through the browser. Service-based, personal computing in the cloud.

As well as providing software that students can run themselves, there’s also the question of students being provided directly with OU hosted (or at least, OU badged) online applications. We started to have some early discussions internally about a “computational wing” for the OpenSTEM Lab, but that appears to have stalled. My personal roadmap for that would be to start off by making use of a couple of open source programming environments that can be accessed via a browser and that can be run at scale (Jupyter notebooks, RStudio Server, Shiny Server). This would give us operational experience in installing and managing this sort of service – or we could pay someone else to do it… Between them, these three environments support a wide variety of computational activities. Shiny Server and Jupyter Dashboards also provide a means for rapidly developing and publishing small interactive applications. Shiny Server has already been used in at least one course to deliver some simple interactive applications created by a self-admitted not-very-technical academic.

Exploring these technologies can also support self-service research computing, although I get the feeling that non-teaching-related research may be losing support… (That said, I don’t see many/any folk researching “cloud/digital infrastructure and workflows”, or emerging trends in personal computing tech and end user application wrangling, for teaching purposes or otherwise…)

Digital Production Methods

I know I’m biased, but I think the OU is way behind the curve in document creation and production methods. Over the last two or three years, reproducible research methods have spawned a range of innovations in supporting the creation and publication of interactive and “generated” content (examples). This speeds up production and cuts down on maintenance. Documents carry the “source code” for the creation of the media assets contained within them, and produce assets that reflect the state of other parts of the document around them. This avoids problems of drift between things like code fragments and the outputs they produce, as well as syntax errors introduced as part of the editing process.

The depiction of media assets as computational objects also means they can be restyled by applying a different stylistic theme to the asset without changing the actual content (example1, example2).

The ability to generate both static and interactive outputs from the same media asset object is also very powerful. For example, the mpld3 python package can take a matplotlib chart object that would naturally be rendered using an image format and generate an interactive HTML chart from it – no extra work required.

Helper libraries (with customisable templates) also mean that generating complex, templated interactive code can be quite straightforward. I may not know how to write the code to publish an interactive Google map, but I don’t have to when there are myriad packages out there that will create the code for me, and put a marker on the map in the appropriate place if I give them a location.
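To show the idea rather than any particular package’s API, here’s a toy, standard-library-only sketch of what such a helper does under the hood – take a location, fill in a template, hand back an embeddable map page (real packages such as folium, which generates Leaflet maps, do exactly this with much richer templates):

```python
# Toy sketch (not a real library): generate a self-contained Leaflet map
# page from just a latitude/longitude pair, the way helper packages do.
from string import Template

PAGE = Template("""<!DOCTYPE html>
<html><head>
<link rel="stylesheet" href="https://unpkg.com/leaflet/dist/leaflet.css"/>
<script src="https://unpkg.com/leaflet/dist/leaflet.js"></script>
</head><body><div id="map" style="height:400px"></div>
<script>
var map = L.map('map').setView([$lat, $lon], 13);
L.tileLayer('https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png').addTo(map);
L.marker([$lat, $lon]).addTo(map);  // marker at the supplied location
</script></body></html>""")

def map_page(lat, lon):
    """Return an HTML page showing a map with a marker at (lat, lon)."""
    return PAGE.substitute(lat=lat, lon=lon)

html = map_page(52.0406, -0.7594)  # Milton Keynes, home of the OU
```

The author of a document need only supply the location; the template machinery writes the interactive code.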

Publishing workflows, such as the ones based around the R knitr package or the Jupyter nbconvert tool, also mean that source documents represented using a simple text format (markdown) can be rendered in a variety of styles and document formats (HTML pages, PDF, .docx, HTML slideshows).
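For example, a minimal knitr/RMarkdown source looks something like this (the chart is regenerated from the embedded code chunk every time the document is rendered):

````markdown
---
title: "A minimal RMarkdown source"
output: html_document
---

The chart below is *generated* at render time from the code chunk,
so it can never drift out of sync with the code that produces it.

```{r stopping-distances}
plot(cars$speed, cars$dist)
```
````

Rendering the same source with `rmarkdown::render("doc.Rmd", output_format = "pdf_document")` produces a PDF instead; nbconvert plays the equivalent role for Jupyter notebooks (e.g. `jupyter nbconvert --to slides notebook.ipynb`).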

At one point I started to explore Jupyter based workflows, but FutureLearn head-in-the-sand thinking + OU production process fascism put a rapid halt to that. Can it really be nearly 3 years since I first used knitr to publish my Wrangling F1 Data With R book to Leanpub?! (That was originally motivated purely as a way of exploring how to go from RMarkdown to a print publication in an automated way.)

I’m not sure at all what the new OU OpenCreate tool looks like, or supports, or how the workflow flows (I asked for a beta invite… still no reply…) but I wonder if any of the team even looked at things like the knitr workflow and if so, what they thought of it and how the OpenCreate workflow compares? And to what extent asset creation and whole-content maintenance plays an integrated part in it? I also wonder how “digital” it is…

Related, in sense of “digital first” – via Cameron Neylon: As a researcher…I’m a bit bloody fed up with Data Management.

PS The cuts that aren’t cuts because some of the money will be spent elsewhere may also mean I might need a new job soon. FWIW, I tend to work from home, swear a lot on Twitter about whatever I happen to be thinking about at the time, and am not a team player. I am convinced my imposter syndrome is actually the Dunning-Kruger effect in play. Skillset: quirky.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
