Running R Projects in MyBinder – Dockerfile Creation With Holepunch

For those who don’t know it, MyBinder is a reproducible research automation tool that takes the contents of a GitHub repository, builds a Docker container based on requirements files found inside the repo, and then presents the user with a temporary, running container serving a Jupyter notebook, JupyterLab or RStudio environment. All at the click of a button.

Although the primary, default UI is the original Jupyter notebook interface, it is also possible to open a MyBinder environment into JupyterLab or, if the R packaging is installed, RStudio.
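The interface is selected through the launch URL. As a minimal sketch, assuming the standard `mybinder.org/v2/gh/...` URL pattern and its `urlpath` query parameter, a launch link can be assembled like this. The function name `binder_github_url` and the repository coordinates are hypothetical, used only for illustration, and the `HEAD` default for the git ref is an assumption you may want to replace with a branch, tag or commit.

```r
# Build a MyBinder launch URL for a GitHub repository.
# `urlpath` selects the interface: NULL leaves the default notebook UI,
# "lab" opens JupyterLab, "rstudio" opens RStudio (if the image provides it).
binder_github_url <- function(user, repo, ref = "HEAD", urlpath = NULL) {
  url <- sprintf("https://mybinder.org/v2/gh/%s/%s/%s", user, repo, ref)
  if (!is.null(urlpath)) {
    url <- paste0(url, "?urlpath=", urlpath)
  }
  url
}

# Hypothetical repository coordinates, for illustration only.
binder_github_url("some-user", "some-r-repo", urlpath = "rstudio")
```

Dropping the `urlpath` argument gives the plain launch link, which opens the default notebook interface.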

For example, using the demo repository, which contains a simple base R environment with RStudio installed, we can use MyBinder to launch RStudio running over the contents of that repository:

When we launch the binderised repo, we get RStudio in the browser:

Part of the Binder magic is to install a set of required packages into the container, along with “content” documents (Jupyter notebooks, for example, or Rmd files), based on requirements identified in the repo. The build process is managed using a tool called repo2docker, and the way requirements / config files need to be defined is described in the repo2docker documentation.

To make building requirements files easier for R projects, the rather wonderful holepunch package will automatically parse the contents of an R project looking for package dependencies, and will then create a DESCRIPTION metadata file itemising the R package dependencies it finds. (holepunch can also be used to create install.R files.) Alongside it, a Dockerfile is created that references the DESCRIPTION file and allows BinderHub to build the container based on the project’s requirements.
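As a rough sketch of that workflow, the holepunch calls can be collected into a small helper. The wrapper name `prepare_binder_files` and its arguments are hypothetical; the holepunch functions it calls are the ones listed in the package README, and I am assuming the package has been installed from GitHub with `remotes::install_github("karthik/holepunch")`. Wrapping the calls in a function means nothing is written to the project until you call it.

```r
# Assumes holepunch is installed, e.g.:
#   install.packages("remotes")
#   remotes::install_github("karthik/holepunch")

# Hypothetical helper: run it from the root of the R project you want to binderise.
prepare_binder_files <- function(package, description, maintainer) {
  # Scan the project for package dependencies and record them in DESCRIPTION.
  holepunch::write_compendium_description(
    package = package,
    description = description
  )
  # Write a Dockerfile that installs the dependencies listed in DESCRIPTION.
  holepunch::write_dockerfile(maintainer = maintainer)
  # Generate a "launch binder" badge for the README.
  holepunch::generate_badge()
  invisible(NULL)
}
```

A call such as `prepare_binder_files("My compendium", "What it contains", "your_name")` would then be followed by committing the generated files and pushing them to GitHub. holepunch also provides `build_binder()` to kick off the MyBinder build ahead of the first visitor, and `write_install()` with `write_runtime()` for the install.R route.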

For an example of how holepunch can be used in support of academic publishing, see this repo — rgayler/scorecal_CSCC_2019 — which contains the source documents for a recent presentation by Ross Gayler to the Credit Scoring & Credit Control XVI Conference. This repo contains the Rmd document required to generate the presentation PDF (via knitr) and Binder build files created by holepunch.

Clicking the repo’s MyBinder button takes you, after a moment or two, to a running instance of RStudio, within which you can open and edit the presentation .Rmd file and knit it to produce a presentation PDF.

In this particular case, the repository is also associated with a Zenodo DOI.

As well as launching Binderised repositories from the GitHub (or other repository) URL, MyBinder can also launch a container from a Zenodo DOI reference.
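A DOI-based launch link follows a similar pattern to the repository one. This is a minimal sketch, assuming the `mybinder.org/v2/zenodo/<DOI>/` URL form; the function name `binder_zenodo_url` is hypothetical and the DOI shown is a made-up placeholder, not the DOI of any repository mentioned here.

```r
# Build a MyBinder launch URL from a Zenodo DOI.
# `urlpath` works as for repository launches, e.g. "rstudio" or "lab".
binder_zenodo_url <- function(doi, urlpath = NULL) {
  url <- sprintf("https://mybinder.org/v2/zenodo/%s/", doi)
  if (!is.null(urlpath)) {
    url <- paste0(url, "?urlpath=", urlpath)
  }
  url
}

# Placeholder DOI, for illustration only.
binder_zenodo_url("10.5281/zenodo.0000000", urlpath = "rstudio")
```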

The screenshot actually uses the incorrect DOI…


Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

3 thoughts on “Running R Projects in MyBinder – Dockerfile Creation With Holepunch”

  1. Thanks for using my repo as an example. It’s a bit of a surprise to be reading a blog post and then find a reference to your own stuff. FWIW the one thing that doesn’t work out of the box, is that the presentation used a non-standard font and holepunch/repo2docker/MyBinder doesn’t automatically pick that up. So the dockerised presentation uses a default font which doesn’t quite fit on the page for some slides. When I work out how to fix that I’ll post instructions.

    1. The joys of an open web – hopefully you don’t mind?! Re: the fonts — that’s interesting… it’s only once you actually start trying to make things work you realise how many hidden dependencies there are…

      1. hopefully you don’t mind?
        Not at all, and if I did I would be being inconsistent, because I made it CC-BY.

        Re: the fonts
        I suspect I have to manually edit the dockerfile to install the fonts in the docker image.

        you realise how many hidden dependencies there are
        Yes. I am still grappling with the alternative approaches. I think using docker images is probably the best medium term approach on the grounds that the user only has to install docker and I expect docker to be a viable platform at least to 10-15 years from now. I am going to assume conservatively that (or at least their free cloud instances) are only going to be available short term. Over the long term I’m not convinced that any technical solution will be reliable. I doubt that docker would be readily available in 50 years. For longer periods I think plain text is the only viable mechanism. Of course, there’s probably little need for computational reproducibility over long times.
