Customisation vs. Personalisation in Course Offerings

According to the Cambridge English Dictionary, customisation and personalisation are defined as follows:

  • customize (verb [T]; UK usually customise): to make or change something according to the buyer’s or user’s needs
  • personalize (verb [T]; UK usually personalise): to make something suitable for the needs of a particular person. If you personalize an object, you change it or add to it so that it is obvious that it belongs to or comes from you.

In this post, I’m going to take a more extreme position, contrasting them as:

  • customisation: the changes a vendor or service provider makes;
  • personalisation: the changes a user makes.

Note that a user may play a role in customisation. For example, when buying a car, or computer, a buyer might customise it during purchase using a configurator that lets them select various options: the customisation is done by the vendor, albeit under the control of the buyer. They may then personalise it when they receive it by putting stickers all over it.

One of the things I’ve been pondering in the context of how we deliver software to students is the extent to which we offer them customised and personalisable environments.

In the second half of the post Fragment — Some Rambling Thoughts on Computing Environments in Education I decompose computing environments into three components (PLC): a physical component (servers); a logical component (computing environment: operating system, installed packages, etc); and a cultural component (personal preference text editors, workflows, etc.).

When we provide students with a virtual machine, we provide them with a customised environment at the course (module) level. Each student gets the same logical virtual machine.

The behaviour of the machine in a logical sense will be the same for each student. But students have different computers, with different resource profiles (different processor speeds, or memory, for example).

So their experience of running the logical machine on their personal computer will be a personalised one.

As personalisation (under my sense of the term) it is outside our control.

If we offer students access to the logical machine running on our servers, we customise the physical layer in terms of compute resource, but students will still have a personalised experience based on the speed and latency of their network connection.

At this point, I suggest that we can control what students receive in terms of the logical component (customisation), and we can suggest minimum resource requirements to try to ensure a minimum acceptable experience, if not parity of experience, in terms of the physical component.

But there is then a tension about the extent to which we tell students how they can personalise their physical component. If we are shipping a VM, should we tell students with powerful computers how to increase the memory size, or number of cores, used by the virtual machine? The change would be a personalisation implemented at the logical layer (changing default settings) that exploits personalisation at the lower physical layer. Or would that be unfair to students with low specced machines, who cannot make the same changes at the logical layer that other students can?

If it takes the student with the lowest specced machine an hour to run a particularly expensive computation, should every student have to take an hour? Or should we tell students who are in a position to run the computation on their higher specced machine how to change the logical layer to let them run the activity in 5 minutes?

At the cultural layer, I would contend that we should be encouraging students to explore personalisation.

If we are running a course that involves an element of programming using a particular set of programming libraries, we can provide students with a logical environment containing all the required libraries. But should we also control the programming editor environment, particularly if the student is a seasoned developer, perhaps in another language, with a pre-existing workflow and a highly tuned, personalised editing environment?

In our TM351 Data Management and Analysis course, we deliver material to students using a virtual machine and Jupyter notebooks. To complete the course assessment, we require students to develop some code and present a data investigation in a notebook. For seasoned developers, the notebook environment is not necessarily the best environment in which to develop code, so when one student who lived in a Microsoft VS Code editor at work wanted to develop code in that personalised environment using our customised logical environment, that seemed fine to me.

Reflecting on this, it seems to me that at the cultural level we can make recommendations about what tools to use and manage our delivery of the experience in terms of: this is how to do this academic thing we are teaching in this cultural environment (a particular editor, for example) but if you want to personalise the cultural environment, that is fine (and perhaps more than that: it is right and proper…).

To riff on TM351 again, the Jupyter notebook environment we provide is customised (preconfigured) at the logical level with preinstalled extensions and customised at the cultural layer with certain of the extensions pre-enabled (a spell checker is enabled, for example, and a WYSIWYG markdown editor). But students are also free to personalise the notebook environment at the cultural level by enabling their own selection of preinstalled notebook extensions. They can also go further, and personalise the logical component by installing additional extensions that can facilitate personalisation at the cultural level, with the caveat that we only guarantee that things work using the logical component we provided students with.

PS I originally started pondering customisation vs personalisation as a rant against “personalisation” in education. I’d argue that it is actually “customisation” and an example of the institution imposing different customised offerings at the individual student level.

Fragment — Some Rambling Thoughts on Computing Environments in Education

One of the challenges that faces the distance educator, indeed, any educator, in delivering computing related activities is how to provide students with an environment in which they can complete practical computing related teaching and learning activities.

Simply getting the student to a place where they can work on, and run, the code you want them to is far from trivial.

In a recent post on Creating gentle introductions to coding for journalists… (which for history of ideas folk, and my own narrative development timeline, appeared sometime after most of this post was drafted, but contextualises it nicely), journalism educator Andy (@digidickinson) Dickinson describes how in teaching MA students a little bit of Python he wanted to:

– Avoid where possible, the debates – Should journalists learn to code? Anyone?
– Avoid where possible too much jargon – Is this actually coding or programming or just html
– Avoid the issue of installing development environments – “We’ll do an easy intro but, first lets install R/python/homebrew/jupyter/anaconda…etc.etc.”
– Not put people off – fingers crossed

The post describes how he tried to show a natural equivalence between, and progression from, Excel formulas to Python code (see my post from yesterday on Diagrams as Graphs, and an Aside on Reading Equations which was in part inspired by that sentiment).

But that’s not what I want to draw on here.

What I do want to draw on is this:

The equation `Tech + Journalists=` is one you don’t need any coding experience to solve. The answer is stress.

Experience has taught me that as soon as you add tech to the mix, you can guarantee that one person will have a screen that looks different or an app that doesn’t work. Things get more complicated when you want people to play and experiment beyond the classroom. Apps that don’t install; or draconian security permissions are only the start. Some of this stuff is quite hardcore for a user who’s never used notepad before let alone fired up the command prompt. All of this can be the hurdle that most people fall at. It can sap your motivation.

Andy declares a preference for Anaconda, but I think that is… I prefer alternatives. Like Docker. This is my latest attempt at explaining why: This is What I Keep Trying to Say….

Docker is also like a friendly way in to the idea of infinite interns.

I first came across this idea — of infinite interns — from @datamineruk (aka Nicola Hughes), developed, I think, in association with Daithí Ó Crualaoich (@twtrdaithi, and by the looks of his Twitter stream, fellow Malta fan:-)

As an idea, I can’t think of anything that has had a deeper or more profound effect on my thinking as regards virtual computing than infinite interns.

Here’s how the concept was originally described, in a blog post that I think is now only viewable via the Internet Archive Wayback Machine — DataMinerUK: What I Do And How:

I specialise in the backend of data journalism: investigations. I work to be the primary source of a story, having found it in data. As such my skills lean less towards design and JavaScript and more towards scraping, databases and statistics.

I work in a virtual world. Literally. The only software I have installed on my machine are VirtualBox and Vagrant. I create a virtual machine inside my machine. I have blueprints for many virtual machines. Each machine has a different function i.e. a different piece of software installed. So to perform a function such as fetching the data or cleaning it or analysing it, I have a brand new environment which can be recreated on any computer.

I call these environments “Infinite Interns”. In order to help journalists see the possibilities of what I do, I tell them to think about what they could accomplish if they had an infinite amount of interns. Because that’s what code is. Here are a couple of slides about my Infinite Interns system:

And here are the slides, used without permission…

Let’s go back to Andy…

There are always going to be snags and, by the time we get to importing libs like pandas [a Python package for working with tabular data], things are going to get complicated – it’s unavoidable. But if the students come away knowing that code isn’t tricky at least in principle, that at a low level the basic structures and ideas are pretty simple and there’s plenty of support out there. Well, that’ll be a win. Fingers crossed.

What you really need is an infinite intern

Which is to say, what you really need is an easy way to tell students how to set up their computing environment.

Which is to say, you really need an easy way for students to tell their computers what sort of environment they’d like to work in.

Want a minimal Jupyter notebook?

docker run --rm -p 8877:8888 -e JUPYTER_TOKEN=letmein jupyter/minimal-notebook

then visit http://localhost:8877 and log in with the token letmein.

Need a scipy stack in there? Use a different intern…

docker run --rm -p 8877:8888 -e JUPYTER_TOKEN=letmein jupyter/scipy-notebook

And so on…
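The pattern behind those two commands is simple enough to capture in a few lines of Python. What follows is a minimal sketch rather than a recommended tool: the helper name intern_command is my own invention, and it only assembles the same docker run arguments used above without actually running Docker.

```python
import shlex


def intern_command(image, host_port=8877, token="letmein"):
    """Build the docker run command for a notebook "intern".

    Hypothetical helper: it mirrors the commands shown above and
    does not execute anything itself.
    """
    return [
        "docker", "run", "--rm",
        "-p", f"{host_port}:8888",       # map a host port onto the notebook port
        "-e", f"JUPYTER_TOKEN={token}",  # token used to log in to the server
        image,
    ]


if __name__ == "__main__":
    # Swapping the image name is all it takes to ask for a different intern.
    for image in ("jupyter/minimal-notebook", "jupyter/scipy-notebook"):
        print(shlex.join(intern_command(image)))
```

Passing the returned list to something like subprocess.run would launch the container, assuming Docker is installed and running on the machine.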

And if you can’t install Docker on your machine, you can still run (notebook running) containers in the cloud: for example, Running a Minimal OU Customised Personal Jupyter Notebook Server on Digital Ocean.

There’s also tooling to build containers from build specs in GitHub repos, such as repo2docker. This tool can automatically add in a notebook server for you. That same application is used to build containers that run on the cloud from a GitHub repo, at a single click: MyBinder (docs).

What this shows, though, is that installing software actually masks a series of issues.

If a student, or a data journalist, is on a low spec computer, or a computer that doesn’t let you install desktop software applications, or a computer that has a different operating system than the one required by the application you want to run, what are you to do?

What is the problem we are actually trying to solve?

I see the computing environment as made up of three components (PLC):

  • a physical component;
  • a logical component;
  • a cultural component.

The Physical Component

The physical component, (physical environment, or physical layer) corresponds to the physical (hardware) resource(s) required to run an activity. This might be a student’s own computer or it might be a remote server. It might include the requirement for a network connection with minimum bandwidth or latency properties. The physical resource maps onto the “compute, storage and network” requirements that must be satisfied in order to complete any given activity.

In some respects, we might be able to abstract completely away from the physical. If I am happy running a “disposable” application where I don’t need to save any files for use later, I can fire up a server, run some code, kill the server.

But if I want to save the files for use an arbitrary amount of time later, I need some persistent physical storage somewhere where I can put those files, and from where I can retrieve them when I need them. Persistence of files is one of the big issues we face when trying to think of how best to support our distance education students. Storage can be problematic.

How individuals connect to resources is another issue. This is the network component. If a student has a low powered computer (poor compute resource) we may need to offer them access to a more powerful remote service. But that requires a network connection. Depending on where files are stored, there are two network considerations we need to make: how does a student access files to edit them, and how do files get to compute so they can be processed.

The Logical Component

The logical component (logical layer; logical environment) might also be referred to as the computational environment. This includes operating system dependencies (for example, the requirement for a particular operating system), application or operating system dependencies (for example, we might require a particular application such as Scratch to be available, or a particular operating system package dependency that is required by a programming language package), programming language dependencies (for example, in a Python environment we might require a particular version of pandas to be installed, or a particular version of Java).
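One way to make the logical component concrete is to write the dependencies down and check them. The sketch below is illustrative only: the function names are hypothetical, the version comparison is deliberately crude (it looks only at the leading digits of each dotted part), and the package versions in the example are made up.

```python
import itertools
from importlib import metadata


def parse_version(text):
    """Turn '1.5.3' into (1, 5, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(name):
    """Version of an installed Python package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(required, lookup=installed_version):
    """Map each required package to (meets_minimum, version_found)."""
    report = {}
    for name, minimum in required.items():
        found = lookup(name)
        ok = found is not None and parse_version(found) >= parse_version(minimum)
        report[name] = (ok, found)
    return report


if __name__ == "__main__":
    # Hypothetical minimum versions for a course environment.
    print(check_environment({"pandas": "1.0", "scipy": "1.2"}))
```

The lookup argument is there so the check can be pointed at something other than the current Python environment, for example a list of packages reported by a container image.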

The Cultural Component

The cultural component (cultural layer; cultural environment) incorporates elements of the user environment and workflow. At one extreme, the adoption of a particular programming editor is an example of a cultural component (the choice of editor may actually be irrelevant as far as the teaching goes, except insofar as a student needs access to a code editor, not any particular code editor). The workflow element is more complex, covering workflows in both abstract terms (eg using a test driven approach, or using a code differencing and checkin management process) as well as practical terms (for example, using git and GitHub, or a particular testing framework).

For example, you could imagine a software design project activity in a computing course that instructs students to use a test driven approach and code versioning, but does not specify the test framework, version control environment, or even programming language / computational environment.

This cultural element is one that we often ignore when it comes to HE, expecting students to just “pick up” tools and workflows, and one whose deficit makes graduates less than useful when it comes to actually doing some work when they do graduate. It’s also one that is hard to change in an organisation, and one that is hard to change at a personal level.

If you’ve tried getting a new technology into a course created by a course team, and / or into your organisation, you’ll know that one of the biggest blockers is the current culture. Adopting a new technology is really hard because if it really is new, it will lead to, may even require, new workflows — new cultures — for many, indeed any, of the benefits to reveal themselves.

Platform Independent Software Distribution – Physical Layer Agnosticism

Reflecting on the various ways in which we provide computing environments for distance education students on computing courses, one of my motivations is to package computational support for our educational materials in a way that is agnostic to the physical component. Ideally, we should be able to define a single logical environment that can be used across a wide range of physical environments.

Virtualisation has a role to play here: if we package software in virtualised environments, we have great flexibility when it comes to where the virtual machine physically runs. It could be on the student’s own computer, it could be on an OU server, it could be on a “bring your own server” basis.

Virtualisation essentially allows us to abstract away from much of the physical layer considerations because we can always look to provide alternative physical environments on which to run the same logical environment.

However, in looking for alternatives, we need to be mindful that the (compute, storage, network) triple provides a set of multi-objective constraints that need to be satisfied, and that satisfying them may require certain trade-offs between them.

This is particularly true when we think of extrema, such as large data files (large amount of storage and/or large amount of bandwidth/network connectivity) and/or processes that require large amounts of computation (these may be associated with large amounts of data, or they may not; an example of the latter might be running a set of complex equations over multiple iterations, for example).
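To make the (compute, storage, network) constraints a little more tangible, here is a minimal sketch of how an activity’s requirements might be checked against a set of candidate physical environments. Everything in it is an assumption for illustration: the class and function names are hypothetical, and the figures are invented rather than taken from any real course.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cores: int             # compute
    memory_gb: float       # compute
    storage_gb: float      # storage
    bandwidth_mbps: float  # network


FIELDS = ("cores", "memory_gb", "storage_gb", "bandwidth_mbps")


def shortfalls(required, available):
    """Names of the resources where 'available' falls short of 'required'."""
    return [f for f in FIELDS if getattr(available, f) < getattr(required, f)]


def candidates(required, environments):
    """Names of the physical environments that satisfy every constraint."""
    return [name for name, res in environments.items()
            if not shortfalls(required, res)]


if __name__ == "__main__":
    # Invented figures: an expensive activity and two places it might run.
    activity = Resources(cores=4, memory_gb=8, storage_gb=20, bandwidth_mbps=2)
    environments = {
        "student laptop": Resources(2, 4, 50, 5),
        "remote server": Resources(16, 64, 500, 5),
    }
    print(shortfalls(activity, environments["student laptop"]))
    print(candidates(activity, environments))
```

A simple pass/fail check like this ignores the trade-offs between the three elements, which is where the real difficulty lies, but it does show why the same logical environment may need more than one physical home.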

My preference is also that we should be distributing software environments and services that also allow students to explore, and even bring to bear, their own cultural components (for example, their favourite editor). I’ll have more to say about that in a future post…

Related: Fragment – Programming Privilege. See also This is What I Keep Trying to Say… where a few very short lines of configuration code let me combine / assemble pre-existing packages in new and powerful ways, without really having to understand anything about how the pieces themselves actually work.

This is What I Keep Trying to Say…

Small pieces loosely joined…

Last week, I learned that students on a level 3 course were being asked to install Docker so that they could run a particular application (Genie, a climate simulation tool) distributed via a Docker container image.

RESULT :-)

Two things follow from this:

  1. with Docker installed, giving students access to additional software applications also packaged as Docker containers becomes trivial: just tell them to run a different Docker container;
  2. we can start to join small pieces together in more integrated environments.

Here’s an example of joining pieces together:

There are a couple of tricks involved here.

Trick the first is to use the Genie container image distributed to students as the first part of a Docker multistage build. The useful part of the container distributed to students essentially boils down to three parts:

  1. a built application in a specified directory;
  2. a node.js run time to run the application;
  3. a start command to start the application server.

In a multistage build, I can:

  • pull in the original application container;
  • reset the base layer of the container to a base layer from which *I* want to build (for example, a branded notebook server image);
  • copy over the application files from the original application server container into my container;
  • install a node.js runtime required to run the copied application;
  • install jupyter-server-proxy (formerly nbserverproxy);
  • create a simple server proxy config file to run the application.

Here’s what the Dockerfile for such a multistage build looks like:

#Demo - multistage build

#From a genie application container
#copy the application into an OU customised notebook container
#and use jupyter-server-proxy to run the genie application

#The Genie image name is supplied at build time:
# docker build --build-arg GENIE_APP=<genie image name> .
ARG GENIE_APP
FROM $GENIE_APP
# This loads in the original Genie application image
# that can run on its own to serve the Genie app

#But we can also copy the application from that image
# into another container...

FROM ousefulcoursecontainers/oubrandednotebook
#Alternatively, use a notebook container seeded with notebooks

#Install node
USER root
RUN apt-get update \
 && apt-get install -y curl \
 && curl -sL https://deb.nodesource.com/setup_8.x | bash - \
 && apt-get install -y nodejs

USER $NB_USER

WORKDIR $HOME

#Grab the application files from the originally distributed container
COPY --from=0 /home/genie/node_app ./genie/

#Server proxify the application
RUN pip install --no-cache jupyter-server-proxy
RUN mkdir -p $HOME/.jupyter/
COPY jupyter_notebook_config.py $HOME/.jupyter/

Trick the second is in using jupyter-server-proxy (e.g. OpenRefine Running in MyBinder, Several Ways…). This allows you to add a start command to the Jupyter notebook New menu and launch a URL proxied application from it.


For completeness, the proxy server config is quite straightforward:

# Traitlet configuration file for jupyter-notebook.
#jupyter_notebook_config.py

c.ServerProxy.servers = {
  'Genie': {
    'command': ['node', 'genie/genie_app.js'],
    'port': 3000,
    'timeout': 120,
    'launcher_entry': {
      'title': 'Genie'
    },
  },
}
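If several applications end up being proxied from the same notebook server, the entries all take the same shape. As a rough sketch (the proxy_entry helper is hypothetical, not part of jupyter-server-proxy; it just produces dictionaries with the same keys as the config above), the repetition could be factored out like this:

```python
def proxy_entry(title, command, port, timeout=120):
    """Build one server entry in the shape used by c.ServerProxy.servers.

    Hypothetical convenience function: it only returns a plain dict with
    the same keys as the Genie entry above.
    """
    return {
        "command": list(command),            # how to start the application
        "port": port,                        # port the application listens on
        "timeout": timeout,                  # seconds to wait for it to start
        "launcher_entry": {"title": title},  # label shown in the New menu
    }


# In jupyter_notebook_config.py this dict would be assigned to
# c.ServerProxy.servers; here it is just built and printed.
servers = {
    "Genie": proxy_entry("Genie", ["node", "genie/genie_app.js"], 3000),
}

if __name__ == "__main__":
    print(servers)
```

The object c is only available inside the Jupyter config file itself, which is why the sketch stops short of assigning to it.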

Rather than just ship the application container, we can ship the application container in a more general “student workbench” context such as the above. Rather than tell the students to run the original application container, we can get them to launch a more general course environment. This is no harder to do — the blockers and hard work required to install in the Docker environment to run the original application container have already been negotiated. The playing field is now wide open to getting arbitrary applications onto student desktops once Docker is installed.

In the above example, I took the liberty of reworking one of the optional course activities as a Jupyter notebook. The original Word file was a simple derivation of the logistic equation (I seem to have oopsed the filename… doh!), but it wasn’t hard to make a simple interactive around that:

If students have the means to access the interactive environment to hand, we might as well use it if it helps support their learning, right?

Poking around the student forums (keeping an eye out for emerging support issues), I noticed one student referring to an issue with another piece of course software. That particular application was a Java application, and required students to install Java on their computer to run the application.

Hmm… so… students have Docker, we can run Java in a Docker container, so why should students have to clutter their computer with a Java install? (Note that the release of the docker application has actually appeared for the first time late in the course presentation, so it wasn’t available at the start of the course. I’m not criticising any of the module production team here, just pointing out a little of what’s possible to try to smooth things for students in the next presentation of the course.)

Is there a workaround?

One of the other small pieces I’ve been exploring is how to expose desktops to students. As posted previously, we can do this via a browser using XPRA or we can use RDP.

So suppose we also get students to download and install the cross-platform Microsoft RDP client.

I can download the Java application files from the VLE, and build my own containerised runner for it using a simple Dockerfile like this:

#Grab an XRDP base container
FROM danielguerra/ubuntu-xrdp

#Install Java runtime
RUN apt-get update && apt-get install -y default-jre && apt-get clean

#Make a directory for the app and copy the application files over
RUN mkdir -p /S397/daisyworld
COPY daisy_1/ /S397/daisyworld/

#Optionally create a directory that we can mount onto from the desktop
#so we can share files in from the desktop if we want to.
RUN mkdir -p /S397/share

In case you’re wondering, when folk say: “everyone should learn to code”, I’d say being able to come up with something like that Dockerfile counts as being able to code.

We can now build and run that container, push it to Dockerhub, and again let students run it with a single docker command (possibly hidden in a shortcut, or maybe launched more simply via Kitematic or docker-compose).

docker run --rm -d --name daisyworld --hostname OU-S397 --shm-size 1g -p 3399:3389 --volume ~/S397/daisyshare:/S397/share $DAISYWORLDCONTAINER

I can now create a remote desktop connection onto that container:

login with a default username, and launch the application via the remote desktop:

With a bit of fettling, I wonder if I could customise the desktop a little and perhaps autolaunch the application? Or even, rather than expose the whole desktop, autorun the application and automatically run it full window? (I think way back when I explored a small amount of Linux desktop customisation in a VM here?)

Yes, this is at the overhead of running Java in a container, but it also means we don’t require students to install Java and the application itself.

Next on my to do list is a simple notebook container that bundles XPRA so that we can run desktops over http via jupyter-server-proxy. (CoCalc can do this already… Does anyone have a working Jupyter demo that implements something similar?) With that in place, we could ship a single container that would allow students to run notebooks, the Genie web UI application, and the DaisyWorld Java application via a browser viewable desktop from a single container and via a single UI.

That’s the sort of thing I keep trying to talk about…

That’s why we should be doing this…

PS for Docker on the student desktop, students could equally be accessing the browser based services from docker containers running in the cloud, either on institutionally hosted servers or self-service servers. Running your own docker container instances in the cloud is not difficult: Running a Minimal OU Customised Personal Jupyter Notebook Server on Digital Ocean.

Diagrams as Graphs, and an Aside on Reading Equations

I woke up this morning thinking about the mechanics (?!) of creating trolley diagrams and the like using TikZ / LaTeX:

These diagrams can be made up from single blocks, or multiple blocks…

In the experiments where I tried creating these documents, the LaTeX is a bit fiddly and still too hard to “just write”.

I quickly discounted trying to come up with a graphical editor to assemble blocks and generate the LaTeX (GUI / canvas coding is still one of the things I don’t really know how to do at all) and then pondered an object model. Perhaps writing something like:

wall.add('trolley', by=['spring','damper']).add('right_arrow', orient='LR')

that would build up a list of assets (wall, spring, damper, trolley) to include in the diagram, construct a TikZ script based on the assets and how they join together (the hard part…), then maybe generate a _repr_html_ output to render the diagram in a notebook.

But I think wiring the blocks together sensibly using an arbitrary number of connections could be challenging.

This all got me thinking about the grammar of the diagram, in part in the sense of Leland Wilkinson’s The Grammar of Graphics, but also in the sense of grammars that can be used to write electrical circuit diagrams and then reason about them (that is, generate mathematical systems that can be calculated as well as graphical ones that can be displayed). Things like the lcapy package for creating a Circuit, for example.

In turn, this got me thinking that the trolley diagrams are graphs. For example, probably abusing Graphviz dot notation, we could write something like:

T = trolley[label=m]
W = wall[vertical]
S = spring[horizontal]
D = damper[horizontal]
A = arrow[right] #Or should that be: F = force[right]
W - S - T
W - D - T
T - A

In terms of layout, there’d still be the issue of locating the registration points of the connectors, but assembling the equations should be straightforward. For example, cribbing some notes on the Free vibration of a damped, single degree of freedom, linear spring mass system from an Introduction to Dynamics and Vibrations course from the School of Engineering at Brown University:

we see how the mathematical model can be built up from the graphical model: each connection, each edge in the diagram graph, represents a separate component in the mathematical model.
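As a quick sketch of that idea, the diagram can be held as a list of edges and the equation assembled by collecting one term per edge. The data structures and the equation_for function are my own hypothetical choices; the LaTeX terms are the spring and damper terms of this spring-damper model, with signs following the form of the equation as I write it in this post.

```python
# Edges of the trolley diagram: (from_node, component, to_node)
EDGES = [
    ("wall", "spring", "trolley"),
    ("wall", "damper", "trolley"),
]

# Term contributed by each kind of component, written as LaTeX
TERMS = {
    "spring": r"k(s-L_0)",
    "damper": r"\lambda\frac{ds}{dt}",
}


def equation_for(node, edges, terms, rhs="ma"):
    """Sum the terms of every component attached to 'node'."""
    attached = [terms[component]
                for start, component, end in edges
                if node in (start, end)]
    return "+".join(attached) + " = " + rhs


if __name__ == "__main__":
    print(equation_for("trolley", EDGES, TERMS))
```

Adding a second spring to the diagram would just mean adding another edge, and the extra term would appear in the equation automatically.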

This is good for teaching and learning too, because learning to read (and write) the diagram also helps us learn to read (and write) the mathematical model.

I think learners often underappreciate that in many cases what look like complex mathematical models are actually constructed from smaller parts (grouped terms and +/- signs are often a giveaway that the equation comprises multiple components and that you can often read the equation as a statement of what physical components each corresponds to). If you can see an equation as an assembly of component parts, it often makes it easier to read.

For example:

k(s-L_0)+\lambda\frac{ds}{dt} = ma

may look scary but it follows from the diagram:

spring + damper = force

in which we model the spring as:

spring \sim k(s-L_0)

which is to say, a constant (k) times the amount the spring has stretched ((s-L_0)), which is to say the difference (-) between the distance between the wall and the trolley (s) and the original length of the spring (L_0).

And the damper as:

damper \sim \lambda\frac{ds}{dt}

You can also read into the component parts of that too, of course: \frac{ds}{dt} presumably reads as something like the rate at which the distance between the wall and the trolley changes, or the speed with which the trolley moves towards and away from the wall. And \lambda (the Greek letter lambda) is another constant, a material property of the damper.
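Reading the equation as an assembly of parts also makes it easy to put numbers in. Here is a minimal sketch that evaluates each part separately and then combines them; the function names and all the values are made up for illustration, and the signs simply follow the equation as written above.

```python
def spring_term(k, s, L0):
    """A constant (k) times the amount the spring has stretched (s - L0)."""
    return k * (s - L0)


def damper_term(lam, ds_dt):
    """A constant (lambda) times the rate at which the distance changes."""
    return lam * ds_dt


def acceleration(m, k, s, L0, lam, ds_dt):
    """Solve spring + damper = m * a for a."""
    return (spring_term(k, s, L0) + damper_term(lam, ds_dt)) / m


if __name__ == "__main__":
    # Invented values, chosen only to keep the arithmetic simple.
    print(spring_term(k=100, s=0.75, L0=0.5))    # 100 * 0.25
    print(damper_term(lam=4, ds_dt=0.5))         # 4 * 0.5
    print(acceleration(m=2, k=100, s=0.75, L0=0.5, lam=4, ds_dt=0.5))
```

Each function corresponds to one component in the diagram, so the code reads the same way the equation does.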

People know this stuff…  They get the idea of constants, but they don’t realise it. Rubber balls are squeezy in the way that bricks aren’t. Squeeziness is a constant, and the ball and the brick have different values, peculiar to them, that are both associated with that same notion, that same constant, same property, of squeeziness. Similarly, folk know speed is something to do with the relationship between distance and time, and may even know that speed is distance over time, but they don’t know how to see it and read it (don’t know how to spot it or recognise it) as such when presented with an equation…

We don’t teach people how to read equations…

…nor do we teach them how to read diagrams and charts…

[Scurries away quickly in case I read the spring-damper equation incorrectly…!;-)]

Anyway, like I said, I need to think about representing stuff as graphs a bit more… A lot more…

PS I hadn’t appreciated before now that WordPress lets you write simple LaTeX, albeit at the non-accessible expense of generating a PDF of the resulting expression… h/t @econproph for pointing that out…

Warning — May Contain Traces of AI

A recent flurry of announcements by Google demonstrates how TensorFlow co-processors and statistical models, rather than rule based ones, may soon be coming to a device near you.

Getting on for three years ago, Google announced they had developed a Tensor Processing Unit (TPU), a co-processor designed to speed up the training of deep-learning models. A year later (so two years ago) they announced cloud availability of TPUs, along with an “in-depth look at Google’s first Tensor Processing Unit (TPU)”.

You can check out TPU availability on Google Cloud services here. Since September 2018 (?), access to limited free TPU support has been available via Google Colab. A minimal ‘get started’ notebook can be found here.

The next step, one of a flurry of announcements at the TensorFlow Developer Summit, 2019 (review), is to provide TPUs you can run at home: Coral Edge TPU Devices. These come in a couple of flavours:

  • Coral Dev Board, a wireless development board with 1GB of RAM and 8GB of Flash memory, micro-SD slot, gigabit Ethernet, audio jack and HDMI connectors, dual microphone, and onboard CPU, GPU and “ML [machine learning] accelerator” Google Edge TPU coprocessor (seems like a Raspberry Pi on steroids?);
  • USB accelerator, a TPU on a stick that you can plug into your Linux laptop or Raspberry Pi to give it a bit of extra oomph…

Some other things to watch out for…

Pete Warden reports how TensorFlow models may soon be coming to a micro-controller near you: Launching TensorFlow Lite for Microcontrollers (repo); ported versions for several microcontrollers are already available. It seems he gave a demo of a microcontroller responding to a particular voice activation command:

So why is this useful? First, this is running entirely locally on the embedded chip, with no need to have any internet connectivity, so it’s good to have as part of a voice interface system. The model itself takes up less than 20KB of Flash storage space, the footprint of the TensorFlow Lite code is only another 25KB of Flash, and it only needs 30KB of RAM to operate.

Lest you think this is just in the realm of demoware, Google are also releasing an all-neural on-device speech recognizer:

… a model trained using RNN [recurrent neural network] transducer (RNN-T) technology that is compact enough to reside on a phone. This means no more network latency or spottiness — the new recognizer is always available, even when you are offline. The model works at the character level, so that as you speak, it outputs words character-by-character, just as if someone was typing out what you say in real-time, and exactly as you’d expect from a keyboard dictation system.

Just reflect on that naming for a moment: Recurrent Neural Network Transducer. I normally think of transducers as physical sensors (eg things that continuously convert sound, or light, or pressure, or temperature to an electrical signal). Here, we have the notion of a software transducer that turns a signal into a set of meaningful symbols in a real-time conversion stream:

RNN-Ts are a form of sequence-to-sequence models that do not employ attention mechanisms. Unlike most sequence-to-sequence models, which typically need to process the entire input sequence (the waveform in our case) to produce an output (the sentence), the RNN-T continuously processes input samples and streams output symbols, a property that is welcome for speech dictation. In our implementation, the output symbols are the characters of the alphabet. The RNN-T recognizer outputs characters one-by-one, as you speak, with white spaces where appropriate. It does this with a feedback loop that feeds symbols predicted by the model back into it to predict the next symbols…
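To get a feel for the shape of that loop, here is a toy sketch in Python. To be clear about the assumptions: there is no neural network here, and the names stream_symbols, toy_predict and BLANK are all made up for illustration. The only thing it tries to show is the streaming pattern described in the quote: input arrives a piece at a time, symbols are emitted as soon as they are predicted, and the last emitted symbol is fed back in to help predict the next one.

```python
from typing import Callable, Iterable, Iterator, Optional

BLANK = "_"  # hypothetical marker meaning "nothing to output for this input frame"


def stream_symbols(
    frames: Iterable[str],
    predict: Callable[[str, Optional[str]], Optional[str]],
) -> Iterator[str]:
    """Consume input frames one at a time, yielding symbols as soon as they are predicted."""
    previous: Optional[str] = None  # last emitted symbol, fed back into the predictor
    for frame in frames:
        symbol = predict(frame, previous)
        if symbol is not None:
            previous = symbol
            yield symbol


def toy_predict(frame: str, previous: Optional[str]) -> Optional[str]:
    """Toy stand-in for a trained model.

    Skips blank frames and frames that repeat the last emitted symbol, which
    means this toy cannot produce doubled letters; a real model has no such limit.
    """
    if frame == BLANK or frame == previous:
        return None
    return frame
```

Calling "".join(stream_symbols("cc_aa_tt", toy_predict)) gives "cat", and because stream_symbols is a generator, each character is available as soon as the frame that triggers it has been read, rather than only once the whole input has been seen.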

We can haz all ur devices R listen 4 uz…

By the by, I note that the TensorFlow Hub (about) provides a range of (partial) models to build from / retrain in your own solution. Amazon SageMaker also offers pretrained ML models in their AWS SageMaker Marketplace. At the moment, I don’t think any of these come with health warnings along the lines of may contain bias or bias inside… Which they should…

However, tools for helping probe the various levels of feature detection embedded within a network are starting to appear. For example, Google announced a technique they’re calling activation atlases:

Activation atlases provide a new way to peer into convolutional vision networks, giving a global, hierarchical, and human-interpretable overview of concepts within the hidden layers of a network. We think of activation atlases as revealing a machine-learned alphabet for images — an array of simple, atomic concepts that are combined and recombined to form much more complex visual ideas.

An example is given of activation atlases for a convolutional image classification network:

In general, classification networks are shown an image and then asked to give that image a label from one of 1,000 predetermined classes — such as “carbonara“, “snorkel” or “frying pan“. … One neuron at one layer might respond positively to a dog’s ear, another at an earlier layer might respond to a high-contrast vertical line.

An activation atlas is built by collecting the internal activations from each of these layers of our neural network from one million images. These activations, represented by a complex set of high-dimensional vectors, is projected into useful 2D layouts …

[A]ll the activations are too many to consume at a glance [so] we draw a grid over the 2D layout we created. For each cell in our grid, we average all the activations that lie within the boundaries of that cell, and use feature visualization to create an iconic representation.
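The grid averaging step in that last quote is easy enough to sketch. The following is my own minimal Python reading of it, not the authors' code: average_by_cell is a made-up name, the layout coordinates are assumed to have already been scaled into the range 0 to 1, and the projection into 2D and the feature visualisation step are both left out.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def average_by_cell(
    layout: Sequence[Point],
    activations: Sequence[Sequence[float]],
    grid_size: int,
) -> Dict[Cell, List[float]]:
    """Average the activation vectors whose 2D positions fall in the same grid cell.

    Assumes every layout coordinate lies in the range [0, 1].
    """
    buckets: Dict[Cell, List[Sequence[float]]] = defaultdict(list)
    for (x, y), vector in zip(layout, activations):
        # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
        col = min(int(x * grid_size), grid_size - 1)
        row = min(int(y * grid_size), grid_size - 1)
        buckets[(row, col)].append(vector)

    averaged: Dict[Cell, List[float]] = {}
    for cell, vectors in buckets.items():
        count = len(vectors)
        # zip(*vectors) groups the values component by component.
        averaged[cell] = [sum(component) / count for component in zip(*vectors)]
    return averaged
```

With a 2 by 2 grid, points at (0.1, 0.1) and (0.2, 0.3) share cell (0, 0), so their activation vectors are averaged together, while a point at (0.9, 0.9) ends up on its own in cell (1, 1). Cells that contain no points simply do not appear in the result.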

In certain respects, this reminds me a little bit of Andy Wuensche’s basins of attraction in discrete dynamical networks from way back when…

In that case, the idea was to try to represent how all possible states of a network were connected, to see where any given initial state might lead a network to, and then find a way to meaningfully visualise that. In this case, it seems that the idea is to try to identify what features any given node might be sensitive to (i.e. plot all the grandmother cells (lite background)).
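For anyone who hasn't met the basins of attraction idea, here is a small Python sketch of what I mean by working out where any given initial state might lead. It is a brute force illustration under my own assumptions, not Wuensche's algorithm: the three-node update rule in toy_step is invented, and the function names are mine.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, List, Tuple

State = Tuple[int, ...]


def find_attractor(state: State, step: Callable[[State], State]) -> FrozenSet[State]:
    """Follow a trajectory until it revisits a state; return the cycle it has fallen into."""
    seen: List[State] = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    # Everything from the first visit to the repeated state onwards is the cycle.
    return frozenset(seen[seen.index(state):])


def basins(n_nodes: int, step: Callable[[State], State]) -> Dict[FrozenSet[State], List[State]]:
    """Group every possible network state by the attractor it eventually reaches."""
    result: Dict[FrozenSet[State], List[State]] = {}
    for state in product((0, 1), repeat=n_nodes):
        result.setdefault(find_attractor(state, step), []).append(state)
    return result


def toy_step(state: State) -> State:
    """Invented synchronous update rule: a' = b AND c, b' = a OR c, c' = b."""
    a, b, c = state
    return (b & c, a | c, b)
```

Running basins(3, toy_step) splits the eight possible states into three basins: (0, 0, 0) sits on its own as a fixed point, three states drain into the fixed point (1, 1, 1), and the remaining four end up in a two-state cycle between (0, 0, 1) and (0, 1, 0). Enumerating every state like this only works for tiny networks, which is part of why visualising the structure of larger ones is interesting.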

Running a Minimal OU Customised Personal Jupyter Notebook Server on Digital Ocean

In words and pictures, how to create a simple throwaway server, on a cheap, commercial web host, that automatically runs a personal Jupyter notebook server, with a light covering of OU branding, via a prebuilt Docker container image… (More significant customisations are, of course, possible…)

Step 1 (you only need to do this once)

Get a Digital Ocean account.

A downside of this is that you probably have to give your credit card details to get an account. The upside is that this link should get you $100 free credit that you won’t ever get round to spending anyway.

At this point, I realise many people won’t get past this step… If you’re in HE, your institution should really be able to give you access to online servers that can spin up as required. If they don’t, hassle the Library, because they should be providing you access to this sort of environment if your department can’t or won’t.

[I speak in terms of ideals, of course, which is why I am using a personal Digital Ocean account…]

There are alternative routes, exemplified in posts throughout this blog, but naysayers will just pick grief with those too… At the end of the day, things like Digital Ocean offer commodity compute, and folk in computing education at least should be aware of them, what sorts of thing they offer, and how to work with them in general…

Step 2

Select your own Docker server environment type. Digital Ocean calls servers droplets and the Docker server can be found in the marketplace:

Pedants I work with would probably claim that’s four steps and that it’s already too hard.

Step 3

Select the server size. Some applications require grunt, but for now let’s be a cheapskate with something that approximates an OU min spec machine. (This sort of discipline is also good for academics with fast/whizzy machines so they can see how the rest of the world has to get by…)

Step 4

Select where in the world you want your server to start up.

Step 5

We’re going to get our server to automatically download and run a very lightly customised Jupyter notebook server:

Copy and paste both the following lines into the user data area.

The first one tells the start-up-er-er what sort of script it is. If you have trouble remembering the order of symbols at the start, it reads hash (#) bang (!). The /bin/bash says it’s a bash script.

#!/bin/bash
docker run -p 80:8888 -e JUPYTER_TOKEN=MyP45SwerD ousefulcoursecontainers/oubrandednotebook

The second line is a Docker run command. What it does is download a minimally branded, basic Jupyter notebook server Docker image (ousefulcoursecontainers/oubrandednotebook) and launch a container from it, exposing it on the default HTTP port, 80.

If you have Docker installed on your own computer, you should be able to run something like: docker run -p HOSTPORT:8888 -e JUPYTER_TOKEN=MyP45SwerD ousefulcoursecontainers/oubrandednotebook, where HOSTPORT is the port number on the host at which you want to visit the notebook server in your browser, at http://localhost:HOSTPORT or http://127.0.0.1:HOSTPORT.

As part of the command, we specify a plain-text, password-like token that adds a small level of security. In this case, I’m setting the token to MyP45SwerD.
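A hard-coded token that appears in a blog post is not much of a secret, so if you are doing this for real it probably makes sense to generate your own. Here is a minimal Python sketch of one way to do that, using the standard library secrets module; make_token and user_data_script are just illustrative helper names of my own, and the script they produce is the same two-line recipe as above with the token swapped in.

```python
import secrets
import shlex


def make_token(n_bytes: int = 16) -> str:
    """Return a random, URL-safe token to use as the JUPYTER_TOKEN value."""
    return secrets.token_urlsafe(n_bytes)


def user_data_script(
    token: str,
    image: str = "ousefulcoursecontainers/oubrandednotebook",
) -> str:
    """Build the two-line start-up script, quoting each argument safely for the shell."""
    run = ["docker", "run", "-p", "80:8888", "-e", f"JUPYTER_TOKEN={token}", image]
    return "#!/bin/bash\n" + shlex.join(run) + "\n"
```

Something like print(user_data_script(make_token())) then gives you text you can paste into the user data area. Keep a note of the token, because you will need it to log in.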

Step 6

Optionally set a server name and then create your server:

Step 7

Wait for the server to boot…

When the server is launched, you should see a public IP address for it that you can copy if you hover your mouse cursor over it…

It’s probably not quite ready to use yet, though… It will need a minute or two to download the Docker container we told it to start up with…

Step 8

In your browser, paste in the server IP address. If you don’t get a response, it’s still sorting its bits out. Refresh the browser page every so often if the page load appears to stop.
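If you would rather not sit there hitting refresh, the waiting can be scripted. This is a rough Python sketch using only the standard library; the function names, the number of attempts and the delay are all arbitrary choices of mine rather than anything Digital Ocean or Jupyter prescribes.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if anything answers an HTTP request at the URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, albeit with an error status, so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, and so on: not ready yet.
        return False


def wait_until_ready(
    url: str,
    probe: Callable[[str], bool] = is_up,
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe the URL repeatedly; return True as soon as it responds, False if we give up."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

You would call it with your own droplet address, for example wait_until_ready("http://YOUR_DROPLET_IP/"), where YOUR_DROPLET_IP stands in for the public IP address you copied in the previous step. The probe and sleep arguments are there so the loop can be tried out without a real server.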

After a minute or two, you should see your notebook login page:

Log in to the server using the password you set in the user data area (for example, MyP45SwerD). You can either use the token to set a browser cookie on the notebook to let you in, or use it to set a new password on the notebook.

Step 9

Play with your notebook…

Step 10

When you’re done playing, you can delete / kill the server (droplet) so you don’t keep paying for it (it’s metered by the hour or part thereof).

You’ll lose everything inside the server, so the confirmation prompt is in your own best interest…


And that’s it…

If we want to create custom notebook environments, seeded with notebooks and with a more complex environment installed (specific Python packages, for example, or other servers we want running), we can create another container on top of our minimal container and launch that. (You can see the repo that adds the basic OU branding to the official Jupyter base notebook image here.)

We can also version containers (for example, for specific modules, module presentations, tutorials, etc).

And as I demonstrated in the previous blogpost, we can also use a similar technique to provide a view of a desktop application via a browser or remote desktop connection.

Running Microsoft VS Code Remotely – In a Browser Using XPRA and Via a Remote Desktop Application Using RDP

If you’re a new student with a bright new Chromebook or other netbook style computer, what do you do other than panic, or cry, when you’re expected to download and install a desktop application — even a cross-platform one — when you start your course?

A month or so ago I posted some notes on Viewing Dockerised Desktops via an X11 Bridge, novnc and RDP, Sort of… where I found a recipe for accessing a graphical application running in one container via a browser, using a “bridge” application running in another container.

(The bridge container connected to the application container via an X11 connection and then exposed the resulting graphical desktop over http, so it could be viewed in a browser.)

As a building block, this is useful in a couple of ways:

  • the application can be remotely hosted and used to expose a graphical application via a browser (so no installation required);
  • the same recipe can be used to set up a “hub” on a local machine to let you run arbitrary graphical applications on your own computer and access them via a browser.

It can be a bit fiddly though and requires plumbing the application container to the bridge, although Docker Compose can handle that for you.

But it’s still one more thing to go wrong.

And it doesn’t help the student with a Chromebook and no local installation route (let’s set aside the ability to run containers on a Chromebook – the same is probably not true of other netbooks, plus it requires a good spec Chromebook).

One of the problems I’ve found with the browser based approach is that you don’t necessarily get sound… So for our student hoping to rely on accessing an application via a remote server, that could be an issue too…

So here are a couple more recipes for creating simple containerised applications that can be used to provide access to remote desktops, whether they’re remote over the internet, or running “remotely” in a container on your own machine.

To demonstrate, I’ll use the Microsoft VSCode application. This Electron app will run cross platform, but that doesn’t directly help if you need to access it remotely…

Method the First: An XPRA Container

Via lanrat/docker-xpra-html5, a base container that includes an XPRA server.

Via the CoCalc Docker container recipe, I cribbed this Dockerfile installation command for VSCode that needs to be added in to the Dockerfile:

# From cocalc-docker
# Microsoft's VS Code: register Microsoft's signing key and apt repository, then install the code package
RUN apt-get update && apt-get install -y curl && apt-get clean \
  && curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg \
  && install -o root -g root -m 644 microsoft.gpg /etc/apt/trusted.gpg.d/ \
  && sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/vscode stable main" > /etc/apt/sources.list.d/vscode.list' \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y apt-transport-https \
  && apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y code

We can use a simple Dockerfile (bootstrapped with FROM lanrat/docker-xpra-html5) and then add in the VSCode layer, or we can add it to a cloned copy of the original repository Dockerfile and build our own container from scratch.

I also made a couple of other tweaks to the Dockerfile so that it could use my own command file. But first, we need to create another file (vscode.sh) for our own start command that will launch VSCode, rather than the infinityTerminal, on start:

#! /usr/bin/env bash

#Start the VSCode app
code

#Keep the container running...
tail -f /dev/null

Then copy the file into the container, and set the permissions, via these Dockerfile instructions:

ADD vscode.sh /usr/local/bin/vscode
RUN chmod +x /usr/local/bin/vscode

I also made a change to the CMD, replacing --start-child=infinityTerm with the --start-child=vscode value and using the new CMD line in my Dockerfile:

CMD xpra start --bind-tcp=0.0.0.0:10000 --html=on --start-child=vscode --exit-with-children --daemon=no --xvfb="/usr/bin/Xvfb +extension Composite -screen 0 1920x1080x24+32 -nolisten tcp -noreset" --pulseaudio=no --notifications=no --bell=no

If you’re creating your own Dockerfile bootstrapped from the original Docker image, rather than built from the raw Dockerfile, you’ll need to add the revised CMD command in too. (The CMD statement also looks to have a wide range of other settings that could be useful and that may provide a better experience.)

Building a local copy of the container with docker build -t psychemedia/xpra-vscode . (yes, the . is part of that: it tells the builder to look in the local (.) directory for the Dockerfile) and running it with docker run --rm -d -p 8822:10000 psychemedia/xpra-vscode, the fan on my laptop goes in to overdrive but I can now work in VSCode via my browser:

So that seems to work…

(I also pushed the image to Dockerhub, so you should be able to run the docker run command directly yourself… Alternatively, you can run the command in a new Digital Ocean Docker droplet (see steps 1-5 of the recipe here).)

Method the Second: An RDP Container

In this second example, let’s add the VSCode editor to an image published at danielguerra69/ubuntu-xrdp. Bootstrapping from that container, all we need to do is pull in a VSCode layer:

FROM danielguerra/ubuntu-xrdp

RUN apt-get update && apt-get install -y curl && apt-get clean

RUN curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg \
  && install -o root -g root -m 644 microsoft.gpg /etc/apt/trusted.gpg.d/ \
  && sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/vscode stable main" > /etc/apt/sources.list.d/vscode.list' \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y apt-transport-https \
  && apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y code

Building this with docker build -t psychemedia/rdp-vscode . and then running it with docker run --rm -d --name uxrdp --hostname terminalserver --shm-size 1g -p 3399:3389 psychemedia/rdp-vscode, I can now connect to it on localhost:3399 using a Microsoft Remote Desktop app (available for Windows, Android, iOS and macOS), log in with the default credentials (user: ubuntu, password: ubuntu) and access it via a remote desktop window.

(One issue is that resizing the RDP app window doesn’t resize the desktop to fit; there’s perhaps a setting I’m missing somewhere that might help with that…?)

(As before, I also pushed the image to Dockerhub, so you should be able to run the docker run command directly yourself…)

Looking at the Dockerfile, it also looks as if something is exposed on port 9001. If we add -p 9001:9001 (remember, the pattern is -p HOST_PORT:PORT_INSIDE_CONTAINER) into the docker run command, and then go to the mapped port, we see a supervisor control panel:

[Image: supervisor status control panel]

Something like that could be a handy thing to add in to the TM351 VM… Here’s the corresponding supervisor.d config file… (Looks like we can use it as a basis for further monitoring, eg of memory usage. See here.)
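Since the -p HOST_PORT:PORT_INSIDE_CONTAINER pattern is easy to get the wrong way round, here is a small Python helper that assembles the docker run command from a mapping of host ports to container ports. It is only a sketch: docker_run_args is a made-up name, and it covers just the --rm, -d and -p flags used in this post.

```python
import shlex
from typing import Dict, List


def docker_run_args(image: str, ports: Dict[int, int], detach: bool = True) -> List[str]:
    """Assemble a docker run argument list.

    The ports mapping reads HOST_PORT -> PORT_INSIDE_CONTAINER, matching the
    order of the two numbers in docker's -p flag.
    """
    args = ["docker", "run", "--rm"]
    if detach:
        args.append("-d")
    for host_port, container_port in ports.items():
        args += ["-p", f"{host_port}:{container_port}"]
    args.append(image)
    return args


# RDP on host port 3399, supervisor control panel on host port 9001.
command = docker_run_args("psychemedia/rdp-vscode", {3399: 3389, 9001: 9001})
print(shlex.join(command))
```

The list form can be handed straight to subprocess.run if you want to launch the container from Python, while shlex.join gives you a string to paste into a terminal.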

I can also confirm that audio (as well as video) works…

Sort of…

Launching a 2GB Digital Ocean Docker droplet with the following user data / startup command:

When I connected to the public IP address for the droplet on port 3389, the RDP connection worked fine, but when I tried to play a YouTube video in Firefox, the buffering garbled the audio really badly. (Maybe a larger droplet instance would have been better…)

Summary

These two simple recipes provide a way of sharing simple GUI apps either via a browser (without audio) or via RDP, with sound available if required. Microsoft provide cross-platform clients for Windows, Mac, Android and iOS.

If the Android client in the Google Play store works for Chromebooks, then a student with a Chromebook should be able to fully access a remotely hosted application over RDP, including support for audio and video.

There is a huge gulf, of course, between being able to get things working, sort of, and getting them working at scale in an educational environment.

The experience may be a bit ropey at times, but if we have one or two students with real issues getting software working, I don’t see why we can’t try to pull together a homebrew solution to provide them with a remotely hosted software environment, albeit perhaps a temporary one.

And each time we try, we’ll maybe figure out a way to smooth it out and improve the experience a bit more.

Or we could teach folk how to launch their own remote server instances, which could be kept running with some sort of file persistence over the life of a course, whether by leaving the server switched on or mounting user files to a storage volume somewhere. (We’d probably have to improve the security layer a bit, at least by showing students how to create their own user credentials. But that’s a problem we’d only have to figure out a recipe for once.)
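As a rough idea of what the file persistence part might look like for the notebook container, here is a Python sketch that builds a docker run command with a directory on the host mounted into the container using docker's -v flag. Several things here are assumptions on my part: persistent_notebook_args is an invented name, /home/jovyan/work is the working directory convention used by the official Jupyter images (I am assuming the OU branded image keeps it), and the host directory would be wherever your storage volume happens to be mounted.

```python
from pathlib import Path
from typing import List


def persistent_notebook_args(
    host_dir: str,
    image: str = "ousefulcoursecontainers/oubrandednotebook",
    host_port: int = 80,
    container_dir: str = "/home/jovyan/work",  # assumed Jupyter image convention
) -> List[str]:
    """Build a docker run command that keeps notebook files in a host directory."""
    # Docker bind mounts need an absolute path on the host side.
    host_path = Path(host_dir).expanduser().resolve()
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:8888",
        "-v", f"{host_path}:{container_dir}",
        image,
    ]
```

Files saved under the mounted directory then live on the host, or on the attached storage volume, rather than inside the container, so they should survive the container being deleted and recreated.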

What keeps winding me up is how folk are so resistant to using computers to do computery things…

PS here’s something else I learned today, ish via @mickaelistria on the Twitterz. The original tweet pointed to a repo that provides access to an Eclipse IDE via a browser: https://github.com/ws-skeleton/eclipse-broadway/ From which I then learned that the GDK Broadway backend provides support for displaying GTK+ applications in a web browser, using HTML5 and web sockets. So that’s another route?

Also via Twitter, @olberger pointed me to https://janitor.technology/ https://github.com/JanitorTechnology/janitor, which looks like it provides an online development environment built around https://github.com/theia-ide/theia .

I’m thoroughly out of my depth with this development environment stuff…

All I’m trying to do is identify the small pieces that can be used to make applications available to students.

For two reasons…

Firstly, so I can experiment with the pieces and get a feel for how they work / what their limitations are / what sorts of richer environments might come to be built on their foundations.

Secondly, so when folk pitch whizzy start-up solutions to us at “only” $X per student seat, I have some mental filters in place so that I can filter out the shiny glare and get a feeling for what’s actually being offered, and maybe what technology it’s built on (or at least, seed me questions so I can try to figure out what it’s built on). And from that, what the limitations and benefits might be. We generally don’t get to hear, make time for, or understand detailed technical sales pitches, but at the end of the day, the tech is a large part of what you’re buying, wrapped up in user experience chrome.

PS here’s an online service that looks like it offers a ‘native’ web hosted version of VS Code: StackBlitz; and here’s another, coder.com, with a repo: codercom/code-server.

PPS By the by, VS Code looks like it’s built using this browser based text editor from Microsoft: Monaco.