OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Generating d3js Motion Charts from rCharts

Remember Gapminder, the animated motion chart popularised by Hans Rosling in his TED Talks and Joy of Stats TV programme? Well, it’s back on TV this week in Don’t Panic – The Truth About Population, a compelling piece of OU/BBC co-produced stats theatre featuring Hans Rosling, and a Pepper’s Ghost illusion brought into the digital age courtesy of the Musion projection system.

Whilst considering what materials we could use to support the programme, we started looking for ways to make use of the Gapminder visualisation tool that makes several appearances in the show. Unfortunately, neither Gapminder (requires Java?) nor the Google motion chart equivalent of it (requires Flash?) appears to work with a certain popular brand of tablet that is widely used as a second screen device…

Looking around the web, I noticed that Mike Bostock had produced a version of the motion chart using d3.js: The Wealth & Health of Nations. Hmmm…

Playing with that rendering on a tablet, I had a few problems when trying to highlight individual countries – the interaction interfered with an invisible date slider control – but a quick shout out to my OU colleague Pete Mitton resulted in a tweaked version of the UI with the date control moved to the side. I also added a tweak to allow specified countries to be highlighted. You can find an example here (source).

Looking at how the data was pulled into the chart, it seemed to expect quite a convoluted form of JSON. After I had banged my head against a wall for a bit, a question on Stack Overflow about how to wrangle the data from something that looked like this:

Country Region  Year    V1  V2
AAAA    XXXX    2001    12  13
BBBB    YYYY    2001    14  15
AAAA    XXXX    2002    36  56
AAAA    XXXX    1999    45  67

to something that looked like this:

[
  {"Country": "AAAA",
   "Region":"XXXX",
    "V1": [ [1999,45], [2001,12] , [2002,36] ],
    "V2":[ [1999,67], [2001,13] , [2002,56] ]
  },
  {"Country": "BBBB",
   "Region":"YYYY",
   "V1":[ [2001,14] ],
   "V2":[ [2001,15] ]
  }
]

resulted in a handy function from Ramnath Vaidyanathan that fitted the bill.
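
For comparison, here is a minimal base R sketch of the same reshaping, using split() and lapply() rather than Ramnath's function (his version appears in the code further down). The toy data frame `df` and the helper name `to_nested` are made up for illustration, and this is just one way of getting to the nested structure:

```r
# Toy data in the flat layout shown above
df <- data.frame(
  Country = c("AAAA", "BBBB", "AAAA", "AAAA"),
  Region  = c("XXXX", "YYYY", "XXXX", "XXXX"),
  Year    = c(2001, 2001, 2002, 1999),
  V1      = c(12, 14, 36, 45),
  V2      = c(13, 15, 56, 67),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one list per country, with each measure
# stored as year-ordered c(year, value) pairs
to_nested <- function(df, key = "Country", group = "Region", time = "Year",
                      measures = c("V1", "V2")) {
  pieces <- split(df, df[[key]])
  nested <- lapply(pieces, function(d) {
    d <- d[order(d[[time]]), ]            # sort each country's rows by year
    out <- list(d[[key]][1], d[[group]][1])
    names(out) <- c(key, group)
    for (m in measures) {
      out[[m]] <- Map(function(yr, val) c(yr, val), d[[time]], d[[m]])
    }
    out
  })
  unname(nested)                          # drop names so it serialises as a JSON array
}

nested <- to_nested(df)
str(nested[[1]])
```

Passing `nested` to a JSON serialiser should then give something shaped like the target output above.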

One of the reasons that I wanted to use R for the data transformation step, rather than something like Python, was that I was keen to try to get a version of the motion charts working with the rCharts library. Such is the way of the world, Ramnath is the maintainer of rCharts, and with his encouragement I had a go at getting the motion chart to work with that library, heavily cribbing from @timelyportfolio’s rCharts Extra – d3 Horizon Conversion tutorial on getting things to work with rCharts along the way.

For what it’s worth, my version of the code is posted here: rCharts_motionchart.

I put together a couple of demos that seem to work, including the one shown below that pulls data from the World Bank indicators API and then chucks it onto a motion chart…

UPDATE: I’ve made things a bit easier compared to the original recipe included in this post… we can now generate a fertility/GDP/population motion chart for a range of specified countries using data pulled directly from the World Bank development indicators API with just the following two lines of R code:

test.data=getdata.WDI(c('GB','US','ES','BD'))
rChartMotionChart(test.data,'Country','Year','Fertility','GDP','Region','Population')

It’s not so hard to extend the code to pull in other datasets, either…

Anyway, here’s the rest of the original post… Remember, it’s easier now;-) [Code: https://github.com/psychemedia/rCharts_motionchart See example/demo1.R]

To start with, here are a couple of helper functions:

require('WDI')

#A handy helper function for getting country data - this doesn't appear in the WDI package?
#---- https://code.google.com/p/google-motion-charts-with-r/source/browse/trunk/demo/WorldBank.R?r=286
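getWorldBankCountries <- function(){
  require(RJSONIO)
  wbCountries <- fromJSON("http://api.worldbank.org/country?per_page=300&format=json")
  wbCountries <- data.frame(t(sapply(wbCountries[[2]], unlist)))
  #Go via as.character so that factor columns convert to their values rather than their level codes
  wbCountries$longitude <- as.numeric(as.character(wbCountries$longitude))
  wbCountries$latitude <- as.numeric(as.character(wbCountries$latitude))
  #Strip the "(all income levels)" suffix; works whether the column is a factor or a character vector
  wbCountries$region.value <- gsub(" *\\(all income levels\\)", "", as.character(wbCountries$region.value))
  return(wbCountries)
}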


#----http://stackoverflow.com/a/19729235/454773
pluck_ = function (element){
  function(x) x[[element]]
}

#' Zip two (or more) vectors into a list of element-wise lists
zip_ <- function(..., names = FALSE){
  x = list(...)
  y = lapply(seq_along(x[[1]]), function(i) lapply(x, pluck_(i)))
  if (names) names(y) = seq_along(y)
  return(y)
}

#' Sort a list of vectors/lists based on the element at a given position
sort_ <- function(v, i = 1){
  v[sort(sapply(v, '[[', i), index.return = TRUE)$ix]
}

library(plyr)
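
To get a feel for what the zip and sort helpers are doing, here is a rough base R equivalent on a toy pair of vectors: Map() pairs up corresponding elements and order() sorts the pairs on their first element. The vectors `years` and `values` are invented for the example.

```r
years  <- c(2001, 2002, 1999)
values <- c(12, 36, 45)

# Pair up corresponding elements: list(list(2001, 12), list(2002, 36), list(1999, 45))
pairs <- unname(Map(function(y, v) list(y, v), years, values))

# Sort the pairs on their first element (the year)
first <- sapply(pairs, function(p) p[[1]])
pairs_sorted <- pairs[order(first)]

str(pairs_sorted)
```

Sorting matters here because each country's series needs to be in time order before it goes into the JSON.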

This next bit still needs some refactoring, and a bit of work to get it into a general form:

#I chose to have a go at putting all the motion chart parameters into a list

params=list(
  start=1950,
  end=2010,
  x='Fertility',
  y='GDP',
  radius='Population',
  color='Region',
  key='Country',
  yscale='log',
  xscale='linear',
  rmin=0,
  xmin=0
)

##This bit needs refactoring - grab some data; the year range is pulled from the motion chart config;
##It would probably make sense to pull countries and indicators etc into the params list too?
##That way, we can start to make this block a more general function?

tmp=getWorldBankCountries()[,c('iso2Code','region.value')]
names(tmp)=c('iso2Code','Region')

data <- WDI(indicator=c('SP.DYN.TFRT.IN','SP.POP.TOTL','NY.GDP.PCAP.CD'),start = params$start, end = params$end,country=c("BD",'GB'))
names(data)=c('iso2Code','Country','Year','Fertility','Population','GDP')

data=merge(data,tmp,by='iso2Code')

#Another bit of Ramnath's magic - http://stackoverflow.com/a/19729235/454773
dat2 <- dlply(data, .(Country, Region), function(d){
  list(
    Country = d$Country[1],
    Region = d$Region[1],
    Fertility = sort_(zip_(d$Year, d$Fertility)),
    GDP = sort_(zip_(d$Year, d$GDP)),
    Population=sort_(zip_(d$Year, d$Population))
  )
})

#cat(rjson::toJSON(setNames(dat2, NULL)))
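
The column renaming above relies on the columns coming back in a particular order. A slightly more defensive sketch is to rename by name using a lookup vector. The toy data frame below only imitates the sort of column names I'd expect back from WDI (that layout is an assumption here, not something checked against the package), the values are made up, and `raw` and `lookup` are hypothetical names:

```r
# Toy stand-in for the WDI() result; column names are assumed and values are made up
raw <- data.frame(
  iso2c = c("BD", "GB"),
  country = c("Bangladesh", "United Kingdom"),
  year = c(2000, 2000),
  SP.POP.TOTL = c(100, 50),
  SP.DYN.TFRT.IN = c(3, 2),
  stringsAsFactors = FALSE
)

# Map from the incoming column names to the names used by the chart
lookup <- c(iso2c = "iso2Code", country = "Country", year = "Year",
            SP.DYN.TFRT.IN = "Fertility", SP.POP.TOTL = "Population",
            NY.GDP.PCAP.CD = "GDP")

# Rename only the columns that are present, whatever order they arrive in
hits <- names(raw) %in% names(lookup)
names(raw)[hits] <- unname(lookup[names(raw)[hits]])

names(raw)
```

Any column that isn't in the lookup is left alone, so an unexpected extra column wouldn't shift the other names along.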

To minimise the amount of motion chart configuration, can we start to set limits based on the data values?

#This really needs refactoring/simplifying/tidying/generalising
#I'm not sure how good the range finding heuristics I'm using are, either?!
#Note: this reads from the global data frame `data`; na.rm=TRUE stops missing values turning the limits into NA
paramsTidy=function(params){
  if (!('ymin' %in% names(params))) params$ymin= signif(min(0.9*data[[params$y]], na.rm=TRUE),3)
  if (!('ymax' %in% names(params))) params$ymax= signif(max(1.1*data[[params$y]], na.rm=TRUE),3)
  if (!('xmin' %in% names(params))) params$xmin= signif(min(0.9*data[[params$x]], na.rm=TRUE),3)
  if (!('xmax' %in% names(params))) params$xmax= signif(max(1.1*data[[params$x]], na.rm=TRUE),3)
  if (!('rmin' %in% names(params))) params$rmin= signif(min(0.9*data[[params$radius]], na.rm=TRUE),3)
  if (!('rmax' %in% names(params))) params$rmax= signif(max(1.1*data[[params$radius]], na.rm=TRUE),3)
  params
}

params=paramsTidy(params)
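
One way to loosen the dependence on a global data frame would be to work out the padded range from whatever vector is passed in. The following is a sketch only, with made-up names (`padded_range`, `toy`, `cfg`) and made-up values, using the same 0.9/1.1 padding and ignoring missing values:

```r
# Hypothetical helper: pad the observed range of a numeric vector by 10% at each end
padded_range <- function(x, digits = 3) {
  c(min = signif(0.9 * min(x, na.rm = TRUE), digits),
    max = signif(1.1 * max(x, na.rm = TRUE), digits))
}

# Made-up values, including a missing one
toy <- data.frame(GDP = c(250, NA, 41000), Fertility = c(6.9, 1.8, 2.1))

# A cut-down config list; xmin has been set by hand and should be left alone
cfg <- list(x = "Fertility", y = "GDP", xmin = 0)

# Only fill in the limits that have not already been set
for (axis in c("x", "y")) {
  rng <- padded_range(toy[[cfg[[axis]]]])
  for (end in c("min", "max")) {
    slot <- paste0(axis, end)
    if (is.null(cfg[[slot]])) cfg[[slot]] <- rng[[end]]
  }
}

str(cfg)
```

Note that the 0.9/1.1 padding only widens the range for positive values, which is fine for these indicators but would need rethinking for anything that can go negative.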

This is the function that generates the rChart:

require(rCharts)

#We can probably tidy the way that the parameters are mapped...
#I wasn't sure whether to try to maintain the separation between params and rChart$params?
rChart.generator=function(params, h=400,w=800){
  rChart <- rCharts$new()
  rChart$setLib('../motionchart')
  rChart$setTemplate(script = "../motionchart/layouts/motionchart_Demo.html")

  rChart$set(
    countryHighlights='',
    yearMin=params$start,
    yearMax=params$end,

    x=params$x,
    y=params$y,
    radius=params$radius,
    color=params$color,
    key=params$key,

    ymin=params$ymin,
    ymax=params$ymax,
    xmin=params$xmin,
    xmax=params$xmax,
    rmin=params$rmin,
    rmax=params$rmax,

    xlabel=params$x,
    ylabel=params$y,

    yscale=params$yscale,
    xscale=params$xscale,

    width=w,
    height=h
  )

  rChart$set( data= rjson::toJSON(setNames(dat2, NULL)) )

  rChart
}

rChart.generator(params,w=1000,h=600)

Aside from tidying – and documenting/commenting – the code, the next thing on my to-do list is to see whether I can bundle this up in a Shiny app. I made a start sketching a possible UI, but I’ve run out of time to do much more for a day or two… (I was also thinking of country checkboxes, either for pulling in just the selected countries’ data or for highlighting those countries.)

items=c("Fertility","GDP","Population")
names(items)=items

shinyUI(pageWithSidebar(
  headerPanel("Motion Chart demo"),
  
  sidebarPanel(
    selectInput(inputId = 'x',
                label = "X",
                choices = items,
                selected = 'Fertility'),
    selectInput(inputId = 'y',
                label = "Y",
                choices = items,
                selected = 'GDP'),
    selectInput(inputId = 'r',
                label = "Radius",
                choices = items,
                selected = 'Population')
  ),
  mainPanel(
    #The next line throws an error (a library is expected? But I don't want to use one?)
    showOutput("motionChart",'')
  )
))

As ever, we’ve quite possibly run out of time on getting much up on the OpenLearn website by Thursday to support the programme as it airs, which is partly why I’m putting this code out now… If you manage to do anything with it that would allow folk to dynamically explore a range of development indicators over the next day or two (especially GDP, fertility, mortality, average income, income distributions (this would require different visualisations?)), we may be able to give it a plug from OpenLearn, and maybe via any tweetalong campaign that’s running as the programme airs…

If you do come up with anything, please let me know via the comments, or twitter (@psychemedia)…

Written by Tony Hirst

November 4, 2013 at 6:34 pm

Posted in OBU, Rstats

3 Responses

  1. You also may be interested in the animint package: an interactive animation can be defined as a list of ggplots with showSelected and clickSelects aesthetics.

    http://sugiyama-www.cs.titech.ac.jp/~toby/animint/index.html

    I used it to create this GapMinder-like plot:

    http://sugiyama-www.cs.titech.ac.jp/~toby/animint/WorldBank/viz.html

    Toby Dylan Hocking

    November 7, 2013 at 5:22 am

    • @toby That’s nice; I really like the linked charts, and the use of the text label to identify a line chart highlighted element in the scatterplot….:-)

      Tony Hirst

      November 7, 2013 at 11:46 am

  2. […] using rCharts published by Tony Hirst at github: https://github.com/psychemedia (He also wrote a tutorial on his blog about creating a motion chart using […]

