Generative Assessment Creation

It’s coming round to that time of year where we have to create the assessment material for courses with an October start date. In many cases, we reuse question forms from previous presentations but change the specific details. If a question is suitably defined, then large parts of this process could be automated.

In the OU, automated question / answer option randomisation is used to provide iCMAs (interactive computer marked assessments) via the student VLE using OpenMark. As well as purely text based questions, questions can include tables or images as part of the question.

One way of supporting such question types is to manually create a set of answer options, perhaps with linked media assets, and then allow randomisation of them.

Another way is to define the question in a generative way so that the correct and incorrect answers are automatically generated.(This seems to be one of those use cases for why ‘everyone should learn to code’;-)

Pinching screenshots from an (old?) OpenMark tutorial, we can see how a dynamically generated question might be defined. For example, create a set of variables:

and then generate a templated question, and student feedback generator, around them:

Packages also exist for creating generative questions/answers more generally. For example, the R exams package allows you to define question/answer templates in Rmd and then generate questions and solutions in a variety of output document formats.


You can also write templates that include the creation of graphical assets such as charts:

 

Via my feeds over the weekend, I noticed that this package now also supports the creation of more general diagrams created from a TikZ diagram template. For example, logic diagrams:

Or automata diagrams:

(You can see more exam templates here: www.r-exams.org/templates.)

As I’m still on a “we can do everything in Jupyter” kick, one of the things I’ve explored is various IPython/notebook magics that support diagram creation. At the moment, these are just generic magics that allow you to write TikZ diagrams, for example, that make use of various TikZ packages:

One the to do list is to create some example magics that template different question types.

I’m not sure if OpenCreate is following a similar model? (I seem to have lost access permissions again…)

FWIW, I’ve also started looking at my show’n’tell notebooks again, trying to get them working in Azure notebooks. (OU staff should be able to log in to noteooks.azure.com using OUCU@open.ac.uk credentials.) For the moment, I’m depositing them at https://notebooks.azure.com/OUsefulInfo/libraries/gettingstarted, although some tidying may happen at some point. There are also several more basic demo notebooks I need to put together (e.g. on creating charts and using interactive widgets, digital humanities demos, R demos and (if they work!) polyglot R and python notebook demos, etc.). To use the notebooks interactively, log in and clone the library into your own user space.

Seeding Shared Folders With Files Distributed via a VM

For the first few presentations of our Data Management and Analysis course, the course VM has been distributed to students via a USB mailing. This year, I’m trying to move to a model whereby the primary distribution is via a download from VagrantCloud (students manage the VM using Vagrant), though we’re also hoping to be able to offer access to an OU OpenStack hosted VM to any student’s who really need it.

For students on Microsoft Windows computers, an installer installs Virtualbox and vagrant from installers distributed via the USB memory stick. This in part derives from the policy of fixing versions of as much as we can so that it can be tested in advance. The installer also creates a working directory for the course that will be shared by the VM, and copies required files, again from the memory stick, into the shared folder. On Macs and Linux, students have to do this setup themselves.

One of the things I have consciouslystarted trying to do is move the responsibility for satisficing of some of the installation requirements into the Vagrantfile. (I’m also starting to think they should be pushed even deeper into the VM itself.)

For example, as some of the VM services expect particular directories to exist in the shared directory, we have a couple of defensive measures in place:

  • the Vagrantfile creates any required, yet missing, subdirectories in the shared directory;
            #Make sure that any required directories are created
            config.vm.provision :shell, :inline => <<-SH
                mkdir -p /vagrant/notebooks
                mkdir -p /vagrant/openrefine_projects
                mkdir -p /vagrant/logs
                mkdir -p /vagrant/data
                mkdir -p /vagrant/utilities
                mkdir -p /vagrant/backups
                mkdir -p /vagrant/backups/postgres-backup/
                mkdir -p /vagrant/backups/mongo-backup/	
            SH
    

  • start up scripts for services that require particular directories check they exist before they are started and create them if they are missing. For example, in the service file, go defensive with something like ExecStartPre=mkdir -p /vagrant/notebooks.

The teaching material associated with the (contents of) the VM is distributed using a set of notebooks downloaded from the VLE. Part of the reason for this is that it delays the point at which the course notebooks must be frozen: the USB is mastered late July/early August for a mailing in September and course start in October.

As well as the course notebooks are a couple of informal installation test notebooks. This can be frozen along with the VM and distributed inside it, but the question then arises. So this year I am trying out a simple pattern that bakes test files into the VM and then uses the Vagranfile to copy the files into the shared directory on its first run with a particular shared folder:

config.vm.provision :shell, :inline => <<-SH
    if [ ! -f /vagrant/.firstrun_nbcopy.done ]; then
        # Trust notebooks in immediate child directories of notebook directory
        files=(`find /opt/notebooks/* -maxdepth 2 -name "*.ipynb"`)
        if [ ${#files[@]} -gt 0 ]; then
            jupyter trust /opt/notebooks/*.ipynb;
            jupyter trust /opt/notebooks/*/*.ipynb;
        fi
        #Copy notebooks into shared directory
        cp -r /opt/notebooks/. /vagrant/notebooks
        touch /vagrant/.firstrun_nbcopy.done
    fi
   SH

This pattern allows files shipped inside the VM to be copied into the shared folder once it is mounted into the VM from host. The files will then persist inside the shared directory, along with a hidden flag file to say the files have been copied. I’m not sure about the benefits of auto-running something inside the VM to manage this copying? Or whether to check that a more recent copy of the files to be copied doesn’t already exist in the shared folder before copying on the first run in the folder?

[Not] Running Jupyter Notebook Containers Via Jupyterhub Container Under Kubernetes Under Docker on My Desktop

…(sic).

One of the things I’ve been exploring lately (actually, that my colleague Rod Norfor in IT has been exploring lately) has been a way of offering a self-service, disposable Jupyter notebook service, pre-packaged with customised, pre-prepared teaching notebooks for students to work through. The hope is that this will go live to some TM112 students later this month.

The solution we’ve come up with is running Jupyterhub under Kubernetes on Microsoft Azure Cloud. The Jupyterhub server serves a prebuilt, public Docker container pulled from Docker Hub. (As a back-up, the same notebooks can be run via Microsoft Azure Notebooks cloned from a library on Microsoft Azure Notebooks or from the Github repo used to build the notebook container served by Jupyterhub.) On the to do list is to explore whether we should use a private registry or pull from an OU hosted registry, either private or public.

One thing we have noticed is that tagging the source notebook container image as latest breaks things with Kubernetes if we try to serve an updated image. The solution is to tag each updated version of the container differently (?and restart the Jupyterhub server?). The image is rebuilt using the same tag as a Github release tag, and the build triggered from a Github release. The Github release is handled manually.

As things currently stand, authentication is required to get into the OU VLE in the normal way, and then a link to the Jupterhub server containing a secret is used to take students to the Jupyterhub server. From there, users self-start the Jupyter notebook container.

In the default setup, dummy auth is in place, letting users through an explicit check with any user/password combination:

 

Thanks to Rod, the Azure route appears to be working so far, but  I thought I’d also try to set up a local development environment on my desktop.

The Docker Community Edition Edge release](https://www.docker.com/kubernetes) allows you to run Kubernetes locally on your desktop under Docker. Having got Kubernetes installed and up running straightforwardly enough, it was time to give Jupyterhub a go in its default state. The following recipes is via a get started crib from David Currie and the Zero to JupyterHub with Kubernetes docs:

## Install helm
## - download and install appropriate distribution (I grabbed the Mac distribution from [kubernetes/helm](https://github.com/kubernetes/helm), unzipped it, and put the app in my `Applications` folder, which is in my path)

##Check we can see it
helm

## Initialise helm
helm init

## Create a working directory, just in case!
mkdir code/dockerk8sjupyterhub
cd code/dockerk8sjupyterhub

## Set up things to work with my local docker powered kubernetes service
kubectl config use-context docker-for-desktop
kubectl get nodes

##Tiller is already installed - I'm not sure if this is necessary?
helm init --service-account tiller

## Security voodoo
kubectl --namespace=kube-system patch deployment tiller-deploy --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'

## Create a config file
touch config.yaml
## Create a secret hash value
openssl rand -hex 32
## And add it to the YAML file:
nano config.yaml
|------
proxy:
  secretToken: "MY_TOKEN"
|------

## Add helm chart
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
helm install jupyterhub/jupyterhub --version=v0.6 --name=thhelm1juphub --namespace=ouseful -f config.yaml
##locally:
##helm install path/to/my/files
##helm install my_chart.tgz

## When it's running, find the IP address
kubectl --namespace=ouseful get svc proxy-public
|------
NAME           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
proxy-public   10.104.156.186   localhost     80:32292/TCP,443:32125/TCP   1h
|------
## In this case, we can find Jupyterhub at: 127.0.0.1:80

##Bring everything down with:
helm delete ouseful --purge
kubectl delete namespace ouseful

One of the things I’d like to do is explore things like customise the notebook container start page:

…and the holding page as the server starts up:

and the error pages (at the moment, this is as far as I can get:-(

(I seem to remember there were issues setting up the helm chart to work with Azure the first time, so I’m hoping there’s a similar simple fix to get my local environment working. If you can point me to possible solutions via the comments, that would be appreciated:-)

I can also easily raise a 404 error page:

It looks as if the template pages are in the jupyterhub repo here: share/jupyterhub/templates so while things are still broken, now might be a good time for me to try some simple (branded) customisations of them.

At the moment, there are two things I need to clarify for myself (help via the comments appreciated!):

  • where is the path to the Jupyterhub container specified? Ideally, I want to build a local image containing customised start/error pages.
  •  where is the path to the Jupyter notebook image served by Jupyterhub specified so that I can serve my own test containers?

Clarification here: only one container is required – the one that will be served – but :

The Docker image you are using must have the jupyterhub package installed in order to work. Moreover, the version of jupyterhub must match the version installed by the helm chart that you’re using. For example, v0.5 of the helm chart usesjupyterhub==0.8.

To use a new image, tweak the config.yaml file, for example:

singleuser:
  image:
    name: jupyter/scipy-notebook
    tag: c7fb6660d096

DIT4C Inspired RobotLab Container

A couple of days ago I posted a quick demo of how to run an old OU Windows app under wine in a Docker container that exposed the app running on a graphical desktop via a browser.

One of the the problems with that demo is that it doesn’t provide a way of uploading or downloading files, so unless the user runs the container with a volume mounted against something that does support file transfer, users couldn’t persist files outside the container.

I don’t know whether the audio works either (though I didn’t try…) but as that recipe was based on a container I’d used to run the Audacity audio editor previously, I’m guessing it should…?

Anyway… regarding the file transfer issue, I recalled putting together a container last year for my PhD student who was looking for a portable way of demoing a desktop Ruby app. Digging that out, it was easily repurposed to run RobotLab. The base container is the DIT4C X11 container. The DIT4C project lost its funding last year, I think, but the repos and Docker images still exist and they provide a really great set of examples of prebuilt containerised apps. Exactly the sort of apps I’d want on my HE Library Digital App shelf. A bit dated now, perhaps, but as I when I get a chance I’ll start trying to refresh them.

The base Dockerfile is relatively straightforward:

#Base container
FROM dit4c/dit4c-container-x11:debian

#Required in order to add backports repo
RUN apt-get update && apt-get install -y software-properties-common 

# Install wine
RUN dpkg --add-architecture i386
RUN echo "deb http://httpredir.debian.org/debian jessie-backports main" | sudo tee /etc/apt/sources.list.d/docker.list
RUN apt-get update && apt-get install -y -t jessie-backports wine

# /var contains HTML site for container homepage
COPY var /var

# RobotLab windows files
COPY Apps/ /opt/Apps

# profile.d environment vars
COPY etc /etc

# Desktop shortcuts and application icons
COPY usr /usr
RUN chmod +x /usr/share/applications/robotlab.desktop
RUN chmod +x /usr/share/applications/neural.desktop
RUN chmod +x /usr/share/applications/remote.desktop

#Add items to top toolrail on desktop
RUN LNUM=$(sed -n '/launcher_item_app/=' /etc/tint2/panel.tint2rc | head -1) && \
  sed -i "${LNUM}ilauncher_item_app = /usr/share/applications/robotlab.desktop" /etc/tint2/panel.tint2rc && \
sed -i "${LNUM}ilauncher_item_app = /usr/share/applications/neural.desktop" /etc/tint2/panel.tint2rc && \
sed -i "${LNUM}ilauncher_item_app = /usr/share/applications/remote.desktop" /etc/tint2/panel.tint2rc

The local build / run / push to dockerhub process is trivial:

cp Dockerfile_dit4c_x11 Dockerfile
docker build -t psychemedia/robtolab2 .
docker run -d -p 8082:8080 psychemedia/robtolab2
docker push psychemedia/robtolab2

As to what it looks like, here’s the home page:

Files can be uploaded to / downloaded from the container using the File Management link:

The Linux Desktop Session link takes you to a Linux desktop. Several tools are available in the toolbar at the top of the desktop, including a terminal and the RobotLab application.

Clicking on the RobotLab icon runs it under wine – on first run we seem to get an update message. Maybe I need to run wine as part of the start-up procedure to handle this as part of the build process?

As well as the RobotLab application, we can run other tools in the RobotLab suite, such as the Neural neural network package:

Note that I haven’t tested the student activities yet – this is still early days in just trying to work out what the tech requirements are and how the workflow / student user experience might play out…

Repo is here (it contains redundant RobotLab stuff such as drivers for the original Lego Mindstorms infra-red towers, but I haven’t yet worked out what can be dropped and what RobotLab requires…): ouseful-course-containers/ou-tm129-robotlab.

A container is also available on Dockerhub as ousefulcoursecontainers/ou-tm129-robotlab:dit4c but I haven’t had chance to check it yet… (it’s still in the build queue…).

The container can be imported into a desktop Docker environment using Kitematic:

robotlab_kitematic0

Once added, the container can also be run using Kitematic, with the user clicking on a link to take then directly to an appropriate location in their web browser:

robotlab_kitematic

The desktop / Kitematic route also enables file sharing with a local directory mounted into the container (though by the looks of it, I may need to tidy that up a little?).

The pieces are slowly starting to come together…?

Scripted Forms Magic

One of the things I think we could be looking at more in open ed in general, and the OU in particular, are authoring environments that allow educators to create their own interactives.

A significant blocker is the skills required to create and deploy such things:

  • activity design in general;
  • front end / UI coding (HTML / js)
  • back end coding (application logic, js, or py, for example)
  • runtime management (js can run in a browser, R or py likely to require a backend kernel to execute the code).

One of the things that appeals to me about the Jupyter and RStudio environments / ecosystems is the availability of packages that support end-user development by wrapping high level UI components as templated widgets that can be automatically generated from code. (See for example my own attempts at creating an HTML hexjson widget for rendering a hex map given an R dataframe.)

One of the things I started to come round to when creating various binderhub demos to show off how Jupyter notebooks can be used to generate and embed rich media assets for different topic areas (see slides 25-33 of the presentation from Dev.ac.uk embedded below, for example) was that a good collection of IPython magics could provide the basis for end-user developed online materials featuring embedded interactives.

For example, the py / folium magic I started working on makes it relatively straightforward to create an embedded map within a Jupyter notebook, that can then be saved and rendered as a standalone HTML page featuring an embedded interactive map.

It struck me that the magic should also work in the context of Scriptedforms, eg as per Creating Simple Interactive Forms Using Python + Markdown Using ScriptedForms + Jupyter.

And it does…

The markdown (text) file, scriptedforms_magic.md, on the left can be fired up as the HTML + interactive form on the right from a single command line command:

scriptedforms scriptedforms_magic.md

You can also do “live development” by editing and saving the markdown and then reloading the browser page.

By developing a sensible set of IPython magics, authors with some limited technical skills should be able to write enough “code”, as per the “code” in the markdown document above, to:

  • identify form elements;
  • bind those form elements as parameters in an IPython magic call.

After all, we keep saying everyone should be able to code… and that’s a level of coding where you can be immediately productive…

Everybody Should Code? Well, Educators Working in Online Learning in a “Digital First” Organisation Probably Should?

Pondering Sheila’s thoughts on Why I Don’t Code?, and mulling over some materials we’re prepping as revisions to the database course…

I see the update as an opportunity to try to develop the ed tech a bit, make it a bit more interactive, make it a bit “responsive” to students so they can try stuff out and get meaningful responses back that helps them check what they’ve been doing has worked. (I’ve called such approaches “self-marking” before, eg in the context of robotics, where it’s obvious if your robot has met the behavioural goal specified in some activity (eg go forwards and backwards the same distance five times, or traverse a square).)

Some the best activities in this regard, to my mind at least, can be defined as challenges, with a “low floor, high ceiling” as I always remember Mike Reddy saying in the context of educational robotics outreach. Eg how perfect a square, or how quickly can your robot traverse it. And can it reverse it?

Most kids/students can get the robot to do something, but how well can they get it to do something in particular, and how efficiently? There are other metrics too – how elegant does the code look, or how spaghetti like?!

So prepping materials, I keep getting sidetracked by things we can do to try to support the teaching. One bit I’m doing is getting students to link database tables together, so it seems sensible to try to find, or come up with, a tool that lets students make changes to the database and then look at the result in some meaningful way to check their actions have led to the desired outcome.

There are bits and bobs of things I can reuse, but they require:

1) gluing stuff together to make the student use / workflow appropriate. That takes a bit of code (eg to make some IPython magic that provides a oneliner capable of graphically depicting some updated configuration or other );
2) tweaking things that don’t work quite how want them, either in terms of presentation / styling, or functionality – which means being able to read other people’s code and change the bits we want to change;
3) keeping track of work – and thinking – in progress, logging issues, sketching out possible fixes, trying to explore the design space, and use space, including looking for alternative – possibly better – ways of achieving the same thing; which means trying to get other people’s code to work, which in turn means coming up with and trying out small examples, often based on reading documentation, looking for examples on Stack Overflow, or picking through tests to find out how to call what function with what sorts of parameter values.

Code stuff; very practical programmingy code-y stuff. That’s partly how I see code being used – on a daily basis – by educators working in a institution like the OU that claims to pride itself on digital innovation: educators coming up with small bits of code to implement micro digital innovations that directly support/benefit the teaching/learning materials and enhance them, technologically… Code as part of everyday practice for online digital educators. (Remember: “everyone should code”…)

It’s also stuff that tests my own knowledge of how things are supposed to work in principle – as well as practice. This plays out through trying to make sure that the tech remains relevant to the educational goal, and ideally helps reinforce it, as well as providing a means for helping the students act even more effectively as independent learners working at a distance…

And to my mind, it’s also part of our remit, if we’re supposed to be trying to make the most use of the whole digital thing… And if we can’t do this in a computing department, then where can – or should – we do it?

PS I should probably have said something about this being relevant to, or an example of, technology enhanced learning, but I think that phrase has been lost to folk who want to collect every bit of data they can from students? Ostensibly, this is for “learning analytics”, which is to say looking at ways of “optimising” students, rather than optimising our learning materials (by doing the basic stuff like looking at web stats as I’ve said repeatedly). I suspect there’s also an element of plan B – if we have a shed load of data, we can sell it to some one as part of every other fire sale.

Potential Issues With Institutionally Mediated Reproducible Research Environments

One of the advantages, for me, of the Jupyter Binderhub enviornment is that it provides with a large amount of freedom to create my own computational environment in the context of a potentially managed institutional service.

At the moment, I’m lobbying for an OU hosted version of Binderhub, probably hosted via Azure Kubernetes, for internal use in the first instance. (It would be nice if we could also be part of an open and federated MyBinder provisioning service, but I’m not in control of any budgets.) But in the meantime, I’m using the open MyBinder service (and very appreciative of it, too).

To test the binder builds locally, I use repo2docker, which is also used as part of the Binderhub build process.

What this all means is that I should be able to write – and test – notebooks locally, and know that I’ll be able to run them “institutionally” (eg on Binderhub).

However, one thing I noticed today was that notebooks in a binder container that was running okay, and that still builds and runs okay locally, have broken when run through Binderhub.

I think the error is a permissions error in creating temporary directories or writing temporary image files in either the xelatex commandline command used to generate a PDF from the LaTeX script, or the ImageMagick convert command used produce an image from the PDF which are both used as part of some IPython magic that renders LaTeX tikz diagram generating scripts. It certainly affects a couple of my magics. (It might be an issue with the way the magics are defined too. But whatever the case, it works for me locally but not “institutionally”.)

Broken notebook: https://mybinder.org/v2/gh/psychemedia/showntell/maths?filepath=Mechanics.ipynb
magic code: https://github.com/psychemedia/showntell/tree/maths/magics/tikz_magic
Error is something to do with the ImageMagick convert command not converting the .pdf to an image. At least one of issues seems to be that ghostscript is lost somewhere?

So here’s the issue. Whilst the notebooks were running fine in a container generated from an image that was itself created presumably before a Binderhub update, rebuilding the image (potentially without making any changes to the source Github repository) can lead to notebooks that were running fine to break.

Which is to say, there may be a dependency in the way a repository defines an environment on some of the packages installed by the repo2docker build process. (I don’t know if we can fully isolate out these dependencies by using a Dockerfile to define the environment rather than apt.txt and requirements.txt?)

This raises a couple of questions for me about dependencies:

  • what sort of dependency issues might there be in components or settings introduced by the jupyter2repo process, and how might we mitigate against these?
  • are there other aspects of the Binderhub process that can produce breaking changes that impact on notebooks running in a repository that specifies a computational environment run via Binderhub?

Institutionally, it also means that environments run via an institutionally supported Binderhub environment could break downstream environments (that is, ones run via Binderhub) through updates to the Binderhub environment.

This is a really good time for this to happen to me, I think, because it gives me more things to think about when considering the case for providing a Binderhub service institutionally.

On the other hand, it means I can’t update any of the other repos that use the tikz or asymptote magic until I find the fix because otherwise they will break too…

Should users of the institutional service, for example, be invited to define test areas in their Binder repositories (for example, using nbval) that the institution can use as test cases when making updates to the institutional service? If errors are detected through the running of these tests by the institutional service provider against their users’ tests, then the institutional service provider could explore whether the issue can be addressed by their update strategy, or alert the Binderhub user there may be breaking changes and how to explore what they are or mitigate against them. (That is, perhaps it falls to the institutional provider to centrally explore the likely common repercussions of a particular update and identify fixes to address them?)

For example, there might be dependencies on particular package version numbers. In this case, the user might then either want to update their own code, or add in a build requirement that regresses the package to the desired version. (Institutional providers might have something to say about that if the upgrade was for valid security reasons, though running things in isolation in containers should reduce that risk?) Lists of affected packages could also be circulated to other users using the same packages, along with mitigation strategies for coping with updates to the institutionally provided service.

There are also updating issues associated with a workflow strategy I am exploring around Binderhub which relates to using “base containers” to seed Binderhub builds (Note On My Emerging Workflow for Working With Binderhub). For example, if a build uses a “latest” tagged base image, any updates to that base image may break things built on top of it. In this case, mitigating against update risk to the base container is achieved by building from a specifically tagged version of the container. However, if an update to the Binderhub environment can break notebooks running on top of a particularly labelled base container, the fix for the notebooks may reside in making a fix to the environment in the base container (for example, which specifically acts to enforce a package version). This suggests that the base container might need doubly tagging – one tag paying heed to the downstream end users (“buildForExptXYZ”) – and the other that captures the upstream Binderhub environment (“BinderhubBuildABC”).

I’m also wondering know about where responsibility arises for maintaining the integrity of the user computing environment (that is, the local computational environment within which code in notebooks should continue to operate once the user has defined their environment). Which is to say, if there are changes to the wider environment that somehow break that local user environment, who should help fix it? If the changes are likely to impact widely, it makes sense to try to fix it once and then share the change, rather than expecting every user suffering from the break to have to find the fix independently?

Also, I’m wondering about classes of error that might arise. For example, ones that can be fixed purely by changing the environmental definition (baking package versions into config files, for example, which is probably best practice anyway) and ones that require changes to code in notebooks?

PS Hmm.. noting… are whitelists and blacklists also specifiable in Binderhub config? eg https://github.com/jupyterhub/mybinder.org-deploy/pull/239/files

binderhub:
  extraConfig:
    bans:
        c.GitHubRepoProvider.banned_specs = [
          '^GITHUBUSER/REPO.*'
        ]