Connecting RStudio and MySQL Docker Containers – an example using the ergast db

building on Dockerising Open Data Databases – First Fumblings and my Book Extras – Data Files, Code Files and a Dockerised Application, I just figured out how to get the ergast db into a MySQL docker container and then query it from RStudio:

  • Download and unzip the f1db.sql.gz file to f1db.sql
  • install these docker-mysql-scripts
  • run boot2docker
  • from the boot2docker shell, start up a MySQL server (ergastdb) with password f1: dmysql-server ergastdb f1 By default, this exposes port 3306
  • create an new empty database (f1db): dmysql-create-database ergastdb f1db
  • add the ergast data to it: dmysql-import-database ergastdb /path/to/ergastdb/f1db.sql --database f1db
  • fire up a copy of RStudio, in this case using my psychemedia/wranglingf1data container, linking it to the MySQL database which has the alias db: docker run --name f1djd -p 8788:8787 --link ergastdb:db -d psychemedia/wranglingf1data
  • run boot2docker ip to find where RStudio is running (IPADDRESS) and in your browser go to: http://IPADDRESS:8788, logging in with username rstudio and password rstudio
  • in RStudio, import the RMySQL library: library(RMySQL)
  • in RStudio, connect to the database: con=dbConnect(MySQL(),user='root',password='f1',host='db',port=3306,dbname='f1db')
  • in RStudio, run a test query: dbQuery(con,'SHOW TABLES');

rstudio-mysq;

I guess what I need to do now is pull the various bits into another script to make it a one-liner, perhaps with a few switches? For example, to create the database if it doesn’t exist, to download the ergast database file automatically, to populate the database for the first time, or update it with a more recent copy of the database, to fire up both containers and make sure they are appropriately linked etc. This would dramatically simplify things for use in the context of the Wrangling F1 Data With R book, for example. (If you beat me to it, please post details in the comments below.)

PS Hmm…. seems I get a UTF-8 encoding issue:

RStudio-encoding

Not sure if this is with the database, or the RMySQL connector? Anyone got any ideas of a fix?

Ah ha – sort of via SO:

Running dbGetQuery(con,'SET NAMES utf8;') before querying seems to do the trick…