OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education


Creating Simple Interactive Visualisations in R-Studio: Subsetting Data

Watching a fascinating Google Tech Talk by Hadley Wickham on The Future of Interactive Graphics in R – A Joint Visualization and UseR Meetup, I was reminded of the manipulate command provided in RStudio. It lets you create slider and dropdown widgets that in turn let you interact dynamically with R-based visualisations, for example by setting data ranges or subsetting data.

Here are a couple of quick examples, one using the native plot command, the other using ggplot. In each case, I’m generating an interactive visualisation that displays two user-selected data series from a larger data set as a line chart.

[Image: manipulate UI builder in RStudio]

[Data file used in this example]

Here’s a crude first attempt using plot:

hun_2011comprehensiveLapTimes <- read.csv("~/code/f1/generatedFiles/hun_2011comprehensiveLapTimes.csv")
View(hun_2011comprehensiveLapTimes)

library("manipulate")
# Short alias for the lap time data frame loaded above
h <- hun_2011comprehensiveLapTimes

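manipulate(
  # First series: plot() opens the chart and draws the line for car cn1
  plot(lapTime~lap, data=subset(h, car==cn1), type='l', col=car) +
  # Second series: lines() overlays the line for car cn2 on the same axes
  lines(lapTime~lap, data=subset(h, car==cn2), col=car),
  # One slider per car number
  cn1=slider(1,25), cn2=slider(1,25)
)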

This has the form manipulate(command1+command2, uiVar=slider(min,max)): two R commands plot the two separate lines, each of them filtered on a value set by the corresponding slider variable.

Note that we plot the first line using plot, and the second line using lines.
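One wrinkle with the plot/lines pairing is that the axis limits are fixed by the first series, so the second line can fall partly outside the chart. Here's a minimal sketch of one way round that, wrapping the two commands in a helper function and handing that to manipulate. The helper name plot_two_cars, the toy laps data frame (made-up numbers, not the Hungary data) and the RStudio check via the RSTUDIO environment variable are all my own assumptions for illustration.

```r
# Toy lap time data (invented values, for illustration only)
laps <- data.frame(
  car = rep(1:3, each = 4),
  lap = rep(1:4, times = 3),
  lapTime = c(95.2, 94.8, 94.5, 94.9,
              96.1, 95.7, 95.9, 95.4,
              97.3, 96.8, 97.0, 96.6)
)

# Hypothetical helper: draw two cars' lap times on shared axes
plot_two_cars <- function(d, cn1, cn2) {
  first <- subset(d, car == cn1)
  second <- subset(d, car == cn2)
  # Compute a y range that covers both series before plotting
  y_range <- range(c(first$lapTime, second$lapTime))
  plot(lapTime ~ lap, data = first, type = "l", col = "red", ylim = y_range)
  lines(lapTime ~ lap, data = second, col = "blue")
  invisible(y_range)
}

# manipulate only works inside the RStudio IDE, so guard the interactive part
# (assumption: RStudio sets the RSTUDIO environment variable to "1")
if (interactive() && Sys.getenv("RSTUDIO") == "1" &&
    requireNamespace("manipulate", quietly = TRUE)) {
  manipulate::manipulate(
    plot_two_cars(laps, cn1, cn2),
    cn1 = manipulate::slider(1, 3),
    cn2 = manipulate::slider(1, 3, initial = 2)
  )
}
```

Because the function body holds both drawing commands, there is no need to join them with +; manipulate just re-runs the single function call.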

The second approach uses ggplot within the manipulate context:

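library("ggplot2")

manipulate(
  # Keep only the rows for the two selected cars
  ggplot(subset(h, car==Car_1 | car==Car_2)) +
    # group=car draws one line per car
    geom_line(aes(y=lapTime, x=lap, group=car, col=car)) +
    # Label the colour legend with just the two selected car numbers
    scale_colour_gradient(breaks=c(Car_1,Car_2), labels=c(Car_1,Car_2)),
  Car_1=slider(1,25), Car_2=slider(1,25)
)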

In this case, rather than explicitly adding additional line layers, we use the group setting to force the display of lines by group value. The initial ggplot command sets the context, and filters the complete set of timing data down to the timing data associated with at most two cars.
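As a rough sketch of the filtering step on its own, the selection can also be written with %in%, and the car number can be turned into a factor so that ggplot gives each car a distinct colour rather than a point on a gradient. The helper name two_cars and the toy laps data (made-up numbers, not the Hungary data) are assumptions of mine, and the factor-based colouring is an alternative to the gradient scale used above, not a replacement for it.

```r
# Toy lap time data (invented values, for illustration only)
laps <- data.frame(
  car = rep(1:3, each = 4),
  lap = rep(1:4, times = 3),
  lapTime = c(95.2, 94.8, 94.5, 94.9,
              96.1, 95.7, 95.9, 95.4,
              97.3, 96.8, 97.0, 96.6)
)

# Hypothetical helper: keep only the rows for the (at most two) chosen cars
two_cars <- function(d, a, b) {
  subset(d, car %in% c(a, b))
}

nrow(two_cars(laps, 1, 3))  # two different cars selected
nrow(two_cars(laps, 2, 2))  # the same car selected twice

# The plotting part is only attempted if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(two_cars(laps, 1, 3)) +
    # factor(car) makes the colour scale discrete: one colour per car
    geom_line(aes(x = lap, y = lapTime, group = car, colour = factor(car))) +
    labs(colour = "car")
  # print(p) would draw the chart
}
```

Selecting the same car on both sliders leaves a single series, which is why the filter yields data for at most two cars.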

We can add a title to the plot using:

manipulate(
  ggplot(subset(h, car==Car_1 | car==Car_2)) +
    geom_line(aes(y=lapTime, x=lap, group=car, col=car)) +
    scale_colour_gradient(breaks=c(Car_1,Car_2), labels=c(Car_1,Car_2)) +
    # ggtitle() replaces opts(title=...), which current ggplot2 releases no longer provide
    ggtitle(paste("F1 2011 Hungary: Laptimes for car", Car_1, 'and car', Car_2)),
  Car_1=slider(1,25), Car_2=slider(1,25)
)

My reading of the manipulate function is that if you make a change to one of the interactive components, the variable values are captured and then passed to the R command sequence, which then executes as normal. (I may be wrong in this assumption of course!) Which is to say: if you write a series of chained R commands, and can abstract out one or more variable values to the start of the sequence, then you can create corresponding interactive UI controls to set those variable values by placing the command series within the manipulate() context.
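To make that reading concrete, here's a toy emulation of the idea: store an expression unevaluated, then re-evaluate it each time with a fresh set of variable values. This is only a sketch of the mental model, not how manipulate is actually implemented; the rerun helper and the toy laps data (made-up numbers) are hypothetical names and values of my own.

```r
# Toy lap time data (invented values, for illustration only)
laps <- data.frame(
  car = rep(1:3, each = 4),
  lap = rep(1:4, times = 3),
  lapTime = c(95.2, 94.8, 94.5, 94.9,
              96.1, 95.7, 95.9, 95.4,
              97.3, 96.8, 97.0, 96.6)
)

# Hypothetical stand-in for a widget change: bind the current control
# values as variables, then re-evaluate the stored expression
rerun <- function(expr, values) {
  caller <- parent.frame()
  eval(expr, values, caller)
}

# The command sequence, with the car number abstracted out as the variable cn
expr <- quote(mean(subset(laps, car == cn)$lapTime))

rerun(expr, list(cn = 1))  # mean lap time for car 1
rerun(expr, list(cn = 2))  # same expression, new value for cn

# Inside RStudio, a dropdown (picker) could supply cn instead
# (assumption: RStudio sets the RSTUDIO environment variable to "1")
if (interactive() && Sys.getenv("RSTUDIO") == "1" &&
    requireNamespace("manipulate", quietly = TRUE)) {
  manipulate::manipulate(
    plot(lapTime ~ lap, data = subset(laps, car == cn), type = "l"),
    cn = manipulate::picker(1, 2, 3)
  )
}
```

The point of the sketch is simply that the expression itself never changes; only the values bound to its free variables do.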

Written by Tony Hirst

August 5, 2011 at 1:05 pm

Posted in Anything you want
