Potential Issues With Institutionally Mediated Reproducible Research Environments

One of the advantages, for me, of the Jupyter Binderhub environment is that it provides me with a large amount of freedom to create my own computational environment in the context of a potentially managed institutional service.

At the moment, I’m lobbying for an OU-hosted version of Binderhub, probably hosted via Azure Kubernetes, for internal use in the first instance. (It would be nice if we could also be part of an open and federated MyBinder provisioning service, but I’m not in control of any budgets.) But in the meantime, I’m using the open MyBinder service (and very appreciative of it, too).

To test the binder builds locally, I use repo2docker, which is also used as part of the Binderhub build process.
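For example, a local test build looks something like this (a sketch: the exact flags may vary across repo2docker versions):

    # Install repo2docker, then build and launch a Binder-style container
    # from a local checkout of the repository; repo2docker builds the image
    # and starts a Jupyter server inside it
    pip install jupyter-repo2docker
    jupyter-repo2docker .

    # Or build directly from the Github repository and branch
    jupyter-repo2docker --ref maths https://github.com/psychemedia/showntell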

What this all means is that I should be able to write – and test – notebooks locally, and know that I’ll be able to run them “institutionally” (eg on Binderhub).

However, one thing I noticed today was that notebooks that had been running fine in a Binder container, and that still build and run fine locally, have broken when rebuilt and run through Binderhub.

I think the error is a permissions error in creating temporary directories or writing temporary image files, arising either in the xelatex command-line command used to generate a PDF from the LaTeX script, or in the ImageMagick convert command used to produce an image from that PDF. Both commands are used as part of some IPython magic that renders LaTeX tikz diagram-generating scripts, and it certainly affects a couple of my magics. (It might be an issue with the way the magics are defined, too. But whatever the case, it works for me locally but not “institutionally”.)
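For reference, the magic essentially shells out to do something like the following (a minimal Python sketch of the pipeline, not the actual magic code; paths and options are illustrative):

    import os
    import subprocess
    import tempfile

    def tikz_to_image(tex_script, outname="diagram.png"):
        # Write the tikz/LaTeX script to a temporary working directory;
        # creating and writing into this directory is one place where a
        # permissions error could bite
        with tempfile.TemporaryDirectory() as tmpdir:
            texfile = os.path.join(tmpdir, "diagram.tex")
            with open(texfile, "w") as f:
                f.write(tex_script)
            # Generate a PDF from the LaTeX script...
            subprocess.run(["xelatex", "-output-directory", tmpdir, texfile],
                           check=True)
            # ...then use ImageMagick convert (which calls on ghostscript as
            # its PDF delegate) to render the PDF as an image file
            pdffile = os.path.join(tmpdir, "diagram.pdf")
            subprocess.run(["convert", "-density", "300", pdffile, outname],
                           check=True)
        return outname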

  • Broken notebook: https://mybinder.org/v2/gh/psychemedia/showntell/maths?filepath=Mechanics.ipynb
  • Magic code: https://github.com/psychemedia/showntell/tree/maths/magics/tikz_magic

The error is something to do with the ImageMagick convert command not converting the .pdf to an image. At least one of the issues seems to be that ghostscript is lost somewhere?
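If you want to poke at this yourself, some diagnostics worth running inside the container might be (these are guesses at where to look, not a confirmed fix):

    # Is ghostscript installed and on the path at all?
    which gs

    # Does ImageMagick know about its PDF delegate (which relies on ghostscript)?
    convert -list delegate | grep -i pdf

    # Does the ImageMagick security policy block reading or writing PDFs?
    convert -list policy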

So here’s the issue. Whilst the notebooks were running fine in a container generated from an image that was itself presumably created before a Binderhub update, rebuilding the image (potentially without making any changes to the source Github repository) can cause notebooks that were running fine to break.

Which is to say, the environment a repository defines may have a dependency on some of the packages installed by the repo2docker build process itself. (I don’t know if we can fully isolate these dependencies by using a Dockerfile to define the environment, rather than apt.txt and requirements.txt? A sketch of that approach follows.)
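For example, a minimal Dockerfile along these lines might pin things more explicitly (a sketch only: the base image tag is a placeholder, the package list is illustrative, and Binderhub places its own requirements on users and permissions in user-supplied Dockerfiles):

    # Build from a *specifically tagged* base image, not :latest
    # (the tag here is a placeholder – pin a real one)
    FROM jupyter/base-notebook:some-pinned-tag

    USER root
    # Explicitly install the system packages the notebooks depend on, rather
    # than relying on whatever the repo2docker build process provides
    RUN apt-get update && \
        apt-get install -y --no-install-recommends \
          texlive-xetex imagemagick ghostscript && \
        rm -rf /var/lib/apt/lists/*
    USER $NB_USER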

This raises a couple of questions for me about dependencies:

  • what sort of dependency issues might there be in components or settings introduced by the repo2docker build process, and how might we mitigate against these?
  • are there other aspects of the Binderhub process that can produce breaking changes, impacting notebooks in a repository that specifies a computational environment run via Binderhub?

Institutionally, it also means that updates to an institutionally supported Binderhub environment could break the downstream environments (that is, the user-defined environments run via that Binderhub) that depend on it.

This is a really good time for this to happen to me, I think, because it gives me more things to think about when considering the case for providing a Binderhub service institutionally.

On the other hand, it means I can’t update any of the other repos that use the tikz or asymptote magic until I find the fix, because otherwise they will break too…

Should users of the institutional service, for example, be invited to define test areas in their Binder repositories (for example, using nbval; see the sketch below) that the institution can use as test cases when making updates to the institutional service? If errors are detected when the institutional service provider runs these tests, the provider could explore whether the issue can be addressed by their update strategy, or alert the Binderhub user that there may be breaking changes and how to explore what they are or mitigate against them. (That is, perhaps it falls to the institutional provider to centrally explore the likely common repercussions of a particular update and identify fixes to address them?)
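nbval runs as a pytest plugin, so an institutional test harness might amount to little more than something like the following, run inside a container built from the user’s repository (a sketch; the notebook name is from my repo):

    pip install pytest nbval

    # Strict mode: re-execute the notebook and compare the new outputs
    # against the outputs saved in the .ipynb file
    pytest --nbval Mechanics.ipynb

    # Lax mode: just check that the notebook cells execute without error
    pytest --nbval-lax .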

For example, there might be dependencies on particular package version numbers. In this case, the user might then either want to update their own code, or add in a build requirement that pins the package back to the desired version. (Institutional providers might have something to say about that if the upgrade was for valid security reasons, though running things in isolation in containers should reduce that risk?) Lists of affected packages could also be circulated to other users of the same packages, along with mitigation strategies for coping with updates to the institutionally provided service.
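Pinning exact versions in the Binder config files is a one-line change per package; for example, in requirements.txt (the package names and version numbers here are purely illustrative):

    # Pin the Python package versions the notebooks were tested against
    numpy==1.14.5
    ipython==6.4.0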

There are also updating issues associated with a workflow strategy I am exploring around Binderhub, which relates to using “base containers” to seed Binderhub builds (Note On My Emerging Workflow for Working With Binderhub). For example, if a build uses a “latest” tagged base image, any updates to that base image may break things built on top of it. In this case, the update risk to the base container is mitigated by building from a specifically tagged version of the container. However, if an update to the Binderhub environment can break notebooks running on top of a particular tagged base container, the fix for the notebooks may reside in making a fix to the environment in the base container (for example, one that specifically acts to enforce a package version). This suggests that the base container might need tagging twice: one tag paying heed to the downstream end users (“buildForExptXYZ”), and the other capturing the upstream Binderhub environment (“BinderhubBuildABC”).
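In Docker terms, that double tagging is cheap, because tags are just labels pointing at the same image (the tag names here are the illustrative ones from above, and the image name is hypothetical):

    # Tag the same image once for the downstream users of the experiment...
    docker tag example/basecontainer:latest example/basecontainer:buildForExptXYZ
    # ...and once to record the Binderhub environment it was verified against
    docker tag example/basecontainer:latest example/basecontainer:BinderhubBuildABC

    docker push example/basecontainer:buildForExptXYZ
    docker push example/basecontainer:BinderhubBuildABC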

I’m also wondering now about where responsibility lies for maintaining the integrity of the user computing environment (that is, the local computational environment within which code in notebooks should continue to operate once the user has defined their environment). Which is to say, if there are changes to the wider environment that somehow break that local user environment, who should help fix it? If the changes are likely to have a wide impact, it makes sense to try to fix the problem once and then share the change, rather than expecting every user suffering from the break to find the fix independently.

Also, I’m wondering about the classes of error that might arise: for example, errors that can be fixed purely by changing the environment definition (baking package versions into config files, for example, which is probably best practice anyway), versus errors that require changes to code in the notebooks themselves?

PS Hmm… noting… are whitelists and blacklists also specifiable in Binderhub config? eg https://github.com/jupyterhub/mybinder.org-deploy/pull/239/files

binderhub:
  extraConfig:
    bans: |
      c.GitHubRepoProvider.banned_specs = [
        '^GITHUBUSER/REPO.*'
      ]

Author: Tony Hirst
