Book Extras – Data Files, Code Files and a Dockerised Application

Idling through the LeanPub documentation last night, I noticed that it supports selling digital extras, such as bundled code files or data files. Alongside the base book sold at one price, extras can be bundled into packages with the book and sold at another (higher) price. As with book sales, two price points are supported: a minimum price and a recommended price.

It was easy enough to create a bundle of sample code and data files to support the Wrangling F1 Data With R book and add them as an extras package bundled with the book for an extra dollar or so.

This approach makes it slightly easier to distribute file bundles to support a book, but it still requires a reader to do some work in configuring their own development environment.

To make this easier still, I also started exploring ways of packaging and distributing a preconfigured virtual machine containing the tools – as well as the code and data examples – needed to try out the data wrangling approaches described in the book. (I’m starting to see more and more technical books supported by virtual machines, and can imagine this approach becoming a standard way of augmenting practical texts.) In particular, I needed a virtual machine that could run RStudio and that came preloaded with libraries for working with SQLite data files and generating ggplot2 charts.

The route I opted for was to try out a dockerised application. The rocker/hadleyverse Docker image bundles a wide variety of useful R packages into a container along with RStudio and a base R installation. Building on top of this image, I created a small Dockerfile that additionally loads in the example code and data files from the book extras package, giving the psychemedia/wranglingf1data image.

# Wrangling F1 Data Dockerfile
#
# https://github.com/psychemedia/wranglingf1data-docker
#

# Pull RStudio base image with handy packages...
FROM rocker/hadleyverse

# Create a directory to create a new project from
RUN mkdir -p /home/rstudio/wranglingf1data && \
    chmod a+rw /home/rstudio/wranglingf1data

# Populate the project-directory-to-be with ergast API supporting code and data
COPY ergastR-core.R /home/rstudio/wranglingf1data/ergastR-core.R
COPY ergastdb13.sqlite /home/rstudio/wranglingf1data/ergastdb13.sqlite

# Populate the project-directory-to-be with an additional data source
COPY scraperwiki.sqlite /home/rstudio/wranglingf1data/scraperwiki.sqlite

Building an image from this Dockerfile and running it (for example, using boot2docker) gives a containerised application that is preconfigured to support the book and available via a web browser. Instructions for downloading and running the container can be found in the psychemedia/wranglingf1data-docker repository.
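As a rough sketch of what building and running the container might look like from a checkout of the repository: the script below assumes Docker is installed, reuses the psychemedia/wranglingf1data name as the build tag, and assumes RStudio Server listens on port 8787, the usual default for the rocker images. The run helper and the DRY_RUN switch are hypothetical conveniences for illustration, not part of the repository; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: build the image from the Dockerfile in the current directory,
# then start a container with RStudio exposed to the host.

IMAGE="psychemedia/wranglingf1data"  # tag reused from the post
PORT=8787                            # assumed RStudio Server port (rocker default)

# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

start_container() {
  # Build an image from the Dockerfile in the current directory
  run docker build -t "$IMAGE" .
  # Run it in the background, mapping the RStudio port to the host
  run docker run -d -p "${PORT}:${PORT}" "$IMAGE"
  echo "RStudio should then be at http://<docker-host-ip>:${PORT}"
}

start_container
```

Setting DRY_RUN=0 executes the commands rather than just printing them. With boot2docker, running boot2docker ip reports the address to use in place of the placeholder. Where this sketch and the repository's own instructions differ, the repository's instructions are the ones to follow.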

I also added instructions for using the Dockerised application to the book extras package as part of its README file.