OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for April 2011

Thoughts on a Couple of Possible Lap Charting Apps

with 3 comments

I seem to have posted a lot of F1 related items recently (there seem to have been a lot of Bank Holidays and weekends lately – and F1 diversions feed into those; normal service will be resumed shortly….) and here’s another one, in part inspired by Joe Saward’s post on Lap Charts (and as discussed in the most recent Sidepodcast Aside With Joe), but also harking back to something I thought about at the BTCC/Brands Hatch race last year, and in mind because I hope to get to Thruxton tomorrow…

The problem? Capturing lap chart information like this:

Joe Saward, lap chart, Shanghai
Image used, without permission, from: Joe Saward, A great race in Shanghai

I’ve been experimenting with various ways of displaying this data, such as “augmenting” traditional lap charts with additional colour and size dimensions:

(In the above case, node size is proportional to time to car in front (or denotes a pit stop); colour is related to time to car behind (red is hot – car behind is close), or choice of tyres in a pit stop. Laps count across the screen, colours are ascending race position. A bright red dot with a large dot above it shows two cars close together. A trace for Car 18 on the grid (Webber) is shown throughout the race.)

Now I know there is probably a way of grabbing this data, for F1 at least, from something like the BBC live timing feed, or from live timing feeds for other races from TSL/Timing Solutions Limited or MST Systems, but I’m thinking more generally for cases where live timing isn’t available…

So here are a couple of possible ideas for apps to support the collection of lap chart data: a tablet (e.g. iPad) app, and a mobile (e.g. iPhone or Android phone) app.

First up, the tablet app:

The idea here is that you can click on the car number as the car goes past and build up a live view of the lap chart. The last car clicked is highlighted and can be annotated. It may also be worth having a setting so that after a car has been selected it is greyed out for 5 seconds less than the fastest expected lap time. (Except maybe for pit option, where possibly capture car going into and out of the pit?)
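To make the “greyed out” idea a bit more concrete, here’s a minimal sketch of the tap-handling logic, assuming taps arrive as (car number, time in seconds) and that we know a fastest expected lap time in advance. The class and field names are made up for illustration; a real app would hang this off its button handlers.

```python
from dataclasses import dataclass, field


@dataclass
class LapChartRecorder:
    """Record the order in which cars pass the observer, with a per-car lockout."""

    fastest_lap_s: float  # fastest expected lap time in seconds (assumed known)
    margin_s: float = 5.0  # re-enable a car this long before it could reappear
    last_seen: dict = field(default_factory=dict)  # car number -> time of last tap
    laps: dict = field(default_factory=dict)  # car number -> laps completed
    events: list = field(default_factory=list)  # (time, car, lap) in tap order

    def is_locked(self, car: int, now: float) -> bool:
        """True while the car's button should be greyed out."""
        if car not in self.last_seen:
            return False
        return now - self.last_seen[car] < self.fastest_lap_s - self.margin_s

    def tap(self, car: int, now: float) -> bool:
        """Register a tap; return False (and ignore it) if the car is locked."""
        if self.is_locked(car, now):
            return False
        self.last_seen[car] = now
        self.laps[car] = self.laps.get(car, 0) + 1
        self.events.append((now, car, self.laps[car]))
        return True


recorder = LapChartRecorder(fastest_lap_s=95.0)
recorder.tap(1, now=0.0)   # car 1 crosses the line
recorder.tap(1, now=3.0)   # accidental double tap: ignored
recorder.tap(2, now=4.2)   # car 2 follows
recorder.tap(1, now=96.5)  # car 1 completes another lap
```

The lockout doubles as protection against double taps. Pit stops would need a separate path, since a pitting car may legitimately be logged twice in quick succession (in and out).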

Here’s a sketch for a mobile app:

As before, you click on the car number in the left hand column as the car goes past. To simplify matters, the car numbers in the left hand column are in an ordered list (by track position? Initial state is Grid position). After a few seconds, the car clicked on disappears from the top of the list and is added to the bottom of the list, the idea being that the top of the list shows cars you expect to come past next. As with the tablet app, the last car clicked is highlighted and can be annotated using tags from the right hand column.
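The rolling list is simple enough to sketch. This is only an illustration of the list handling (the function name and car numbers are invented, and the “after a few seconds” delay is left out):

```python
from collections import deque


def tap_car(expected: deque, car: int) -> None:
    """Move the tapped car from wherever it is in the list to the bottom."""
    expected.remove(car)  # raises ValueError if the car is not in the list
    expected.append(car)


# Initial state is the grid order (car numbers here are made up).
expected = deque([1, 3, 2, 4, 5])

tap_car(expected, 1)  # leader goes past
tap_car(expected, 2)  # car 2 has got ahead of car 3
tap_car(expected, 3)
```

After those three taps the top of the list is car 4, which is the next car we expect to see. Because a tapped car is removed from wherever it sits, overtakes reorder the list automatically for the following lap.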

On a final note, if the positions are being added in real time, the app can also collect rough timing information. That means we can then also start to produce crude gap charts that show the time/distance between cars. Something like this, maybe?

In this case, the chart shows gaps between cars, per lap (increasing lap number up the screen, car positions left to right). Gaps indicate a pitted or lapped car. From the lap chart data, and crude timing, we could automatically generate this sort of view.
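As a rough sketch of how the crude timing could be turned into gaps, assume each tap has been logged as a (time in seconds, car, lap) tuple. The function name and the sample times are invented for illustration:

```python
from collections import defaultdict


def gaps_by_lap(events):
    """events: iterable of (time_s, car, lap) tuples, logged as each car passes.

    Returns {lap: [(car, gap_to_car_ahead_s), ...]} in the order cars were seen.
    The first car seen on each lap gets a gap of 0.0.
    """
    per_lap = defaultdict(list)
    for time_s, car, lap in sorted(events):
        per_lap[lap].append((time_s, car))

    result = {}
    for lap, passes in per_lap.items():
        row = []
        previous_time = None
        for time_s, car in passes:
            gap = 0.0 if previous_time is None else round(time_s - previous_time, 1)
            row.append((car, gap))
            previous_time = time_s
        result[lap] = row
    return result


events = [(0.0, 1, 1), (1.5, 2, 1), (4.0, 3, 1),
          (95.0, 1, 2), (97.5, 2, 2), (98.0, 3, 2)]
chart = gaps_by_lap(events)
```

Hand-tapped times will only be good to a few tenths at best, and this grouping uses each car’s own lap count, so lapped cars would need handling separately.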

PS I probably won’t get round to making either of these apps, (at least, not in the immediate future…) but if anyone would like to take them on, I’d be happy to test them and chip in ideas:-)

Written by Tony Hirst

April 30, 2011 at 11:09 am

Visualising Sports Championship Data Using Treemaps – F1 Driver & Team Standings


I *love* treemaps. If you’re not familiar with them, they provide a very powerful way of visualising categorically organised hierarchical data that bottoms out with a quantitative, numerical dimension in a single view.

For example, consider the total population of students on the degrees offered across UK HE by HESA subject code. As well as the subject level, we might also categorise the data according to the number of students in each year of study (first year, second year, third year).

If we were to tabulate this data, we might have columns: institution, HESA subject code, no. of first year students, no. of second year students, no. of third year students. We could also restructure the table so that the data was presented in the form: institution, HESA subject code, year of study, number of students. And then we could visualise it in a treemap… (which I may do one day… but not now; if you beat me to it, please post a link in the comments;-)
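The restructuring from the first layout to the second is a “wide to long” reshape. A minimal sketch, with invented column names and made-up student numbers:

```python
def wide_to_long(rows):
    """Turn one row per (institution, subject) into one row per year of study."""
    year_columns = ["first_year", "second_year", "third_year"]
    long_rows = []
    for row in rows:
        for year_number, column in enumerate(year_columns, start=1):
            long_rows.append({
                "institution": row["institution"],
                "subject_code": row["subject_code"],
                "year_of_study": year_number,
                "students": row[column],
            })
    return long_rows


# Made-up numbers, for illustration only.
wide = [
    {"institution": "Univ A", "subject_code": "X1",
     "first_year": 120, "second_year": 110, "third_year": 95},
]
long_rows = wide_to_long(wide)
```

The long form is the one a treemap wants: every column except the last is a level in the hierarchy, and the last is the number that sets the box size.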

Instead, what I will show is how to visualise data from a sports championship, in particular the start of the Formula One 2011 season. This championship has the same entrants in each race, each a member of one of a fixed number of teams. Points are awarded for each race (that is, each round of the championship) and totalled across rounds to give the current standing. As well as the driver championship (based on points won by individual drivers) there is the team championship (where the points contribution from drivers within a team is totalled).

Here’s what the results from the third round (China) look like:

Driver              Team                  Points
Lewis Hamilton      McLaren-Mercedes      25
Sebastian Vettel    RBR-Renault           18
Mark Webber         RBR-Renault           15
Jenson Button       McLaren-Mercedes      12
Nico Rosberg        Mercedes              10
Felipe Massa        Ferrari               8
Fernando Alonso     Ferrari               6
Michael Schumacher  Mercedes              4
Vitaly Petrov       Renault               2
Kamui Kobayashi     Sauber-Ferrari        1
Paul di Resta       Force India-Mercedes  0
Nick Heidfeld       Renault               0
Rubens Barrichello  Williams-Cosworth     0
Sebastien Buemi     STR-Ferrari           0
Adrian Sutil        Force India-Mercedes  0
Heikki Kovalainen   Lotus-Renault         0
Sergio Perez        Sauber-Ferrari        0
Pastor Maldonado    Williams-Cosworth     0
Jarno Trulli        Lotus-Renault         0
Jerome d’Ambrosio   Virgin-Cosworth       0
Timo Glock          Virgin-Cosworth       0
Vitantonio Liuzzi   HRT-Cosworth          0
Narain Karthikeyan  HRT-Cosworth          0
Jaime Alguersuari   STR-Ferrari           0

F1 2011 Results – China, © 2011 Formula One World Championship Ltd

We can represent data from across all the races using a table of the form:

Driver              Team               Points  Race
Lewis Hamilton      McLaren-Mercedes   25      China
Sebastian Vettel    RBR-Renault        18      China
Felipe Massa        Ferrari            10      Malaysia
Fernando Alonso     Ferrari            8       Malaysia
Kamui Kobayashi     Sauber-Ferrari     6       Malaysia
Michael Schumacher  Mercedes           0       Australia
Pastor Maldonado    Williams-Cosworth  0       Australia
Narain Karthikeyan  HRT-Cosworth       0       Australia
Vitantonio Liuzzi   HRT-Cosworth       0       Australia

Sample of F1 2011 Results, © 2011 Formula One World Championship Ltd

I’ve put a copy of the data to date at Many Eyes, IBM’s online interactive data visualisation site: F1 2011 Championship Points

Here’s what it looks like when we view it in a treemap visualisation:

The size of the boxes is proportional to the (summed) values within the hierarchical categories. In the above case, the large blocks are the total points awarded to each driver across teams and races. (The team field might be useful if a driver were to change team during the season.)

I’m not certain, but I think the Many Eyes treemap algorithm populates the map using a sorted list of summed numerical values taken through the hierarchical path from left to right, top to bottom. Which means top left is the category with the largest summed points. If this is the case, in the above example we can directly see that Webber is in fourth place overall in the championship. We can also look within each blocked area for more detail: for example, we can see Hamilton didn’t score as many points in Malaysia as he did in the other two races.

One of the nice features about the Many Eyes treemap is that it allows you to reorder the levels of the hierarchy that is being displayed. So for example, with a simple reordering of the labels we can get a view over the team championship too:
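The reordering trick is easy to mimic in code, which may help show what the treemap is doing. This is my own sketch of the general idea (summing a value down a hierarchy whose level order we can choose), not a description of how Many Eyes works internally; the function names are invented and the rows are a few taken from the China results table above.

```python
def nested_totals(rows, levels, value="points"):
    """Sum `value` down a hierarchy given by `levels` (a list of column names).

    Returns nested dicts; each leaf is the summed value for that path.
    """
    tree = {}
    for row in rows:
        node = tree
        for level in levels[:-1]:
            node = node.setdefault(row[level], {})
        leaf = row[levels[-1]]
        node[leaf] = node.get(leaf, 0) + row[value]
    return tree


def total(node):
    """Total of everything under a node: this is what sets the box size."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


rows = [
    {"driver": "Lewis Hamilton", "team": "McLaren-Mercedes", "race": "China", "points": 25},
    {"driver": "Jenson Button", "team": "McLaren-Mercedes", "race": "China", "points": 12},
    {"driver": "Sebastian Vettel", "team": "RBR-Renault", "race": "China", "points": 18},
    {"driver": "Mark Webber", "team": "RBR-Renault", "race": "China", "points": 15},
]

by_driver = nested_totals(rows, ["driver", "team", "race"])
by_team = nested_totals(rows, ["team", "driver", "race"])
```

Same rows, two different level orders: `by_driver` gives the driver standings view and `by_team` gives the team view.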

The Many Eyes treemap can be embedded in a web page (it’s a Java applet), although I’m not sure what, if any, licensing restrictions apply (I do know that the Guardian datastore blog embeds Many Eyes widgets on that site, though). Other treemap widgets are available (for example, Protovis and JIT both offer javascript enabled treemap displays).

What might be interesting would be to feed Protovis or the JIT with data dynamically from a Google Spreadsheet, for example, so that a single page could be used to display the treemap with the data being maintained in a spreadsheet.

Hmm, I wonder – does Google spreadsheets have a treemap gadget? Ooh – it does: treemap-gviz. It looks as if a bit of wrangling may be required around the data, but if the display works out then just popping the points data into a Google spreadsheet and creating the gadget should give an embeddable treemap display with no code required:-) (It will probably be necessary to format the data hierarchy by hand, though, requiring differently laid out data tables to act as source for individual and team based reports.)
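I haven’t checked exactly what the treemap-gviz gadget wants as input, so treat the layout below as an assumption: a common arrangement for treemap widgets is one row per node, giving the node’s id, its parent’s id and a size. Here’s a sketch of the wrangling for the team view, using the China and Malaysia points for three drivers from the tables above (the function name and root label are invented):

```python
def to_parent_child_rows(rows, root="F1 2011"):
    """Flatten (team, driver, points) rows into (id, parent, value) rows."""
    totals = {}  # (team, driver) -> summed points
    for row in rows:
        key = (row["team"], row["driver"])
        totals[key] = totals.get(key, 0) + row["points"]

    out = [(root, None, 0)]  # the root has no parent
    for team in sorted({team for team, _ in totals}):
        out.append((team, root, 0))  # sizes are carried by the leaves
    for (team, driver), points in sorted(totals.items()):
        out.append((driver, team, points))
    return out


rows = [
    {"driver": "Felipe Massa", "team": "Ferrari", "race": "China", "points": 8},
    {"driver": "Felipe Massa", "team": "Ferrari", "race": "Malaysia", "points": 10},
    {"driver": "Fernando Alonso", "team": "Ferrari", "race": "China", "points": 6},
    {"driver": "Fernando Alonso", "team": "Ferrari", "race": "Malaysia", "points": 8},
    {"driver": "Kamui Kobayashi", "team": "Sauber-Ferrari", "race": "China", "points": 1},
    {"driver": "Kamui Kobayashi", "team": "Sauber-Ferrari", "race": "Malaysia", "points": 6},
]
tree_rows = to_parent_child_rows(rows)
```

A per-race breakdown would need one more level, with leaf ids made unique (something like “Felipe Massa / China”), since ids in this sort of layout generally can’t repeat.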

So – how long before we see some “live” treemap displays for F1 results on the F1 blogs then? Or championship tables from other sports? Or is the treemap too confusing as a display for the uninitiated? (I personally don’t think so.. but then, I love macroscopic views over datasets:-)

PS see also More Olympics Medal Table Visualisations which includes a demonstration of a treemap visualisation over Olympic medal standings.

Written by Tony Hirst

April 28, 2011 at 11:38 am

Posted in Data, Visualisation


Getting Access to University Course Code Data (or not… (yet…))

with 8 comments

A couple of weeks or so ago, having picked up the TSO OpenUp competition prize for suggesting that it would be a Good Thing for UCAS/university course code data to be made available, I had a meeting with the TSO folk to chat over “what next?” The meeting was an upbeat one with a plan to get started as soon as possible with a scrape of the UCAS website… so what’s happened since…?

First up – a reading of the UCAS website Terms and Conditions suggests that scraping is a no-no…

6. Intellectual property rights
e. Copying, distributing or any use of the material contained on the website for any commercial purpose is prohibited.
f. You may not create a database by systematically downloading substantial parts of the website

(In the finest traditions of the web, you aren’t allowed to deep link into the site without permission either: 6.c Links to the website are not permitted, other than links to the homepage for your personal use, except with our prior written permission. Links to the website from within a frameset definition are not permitted except with our prior written permission.)

So, err, I guess my link to the terms and conditions breaks those terms and conditions? Oops…;-) Should I be sending them something like this do you think?

Dear enquiries@ucas.ac.uk,
As per your terms and conditions, (paragraph 6 c) please may I publish a link to your terms and conditions web page [ http://www.ucas.com/terms_and_conditions ] in a blog post I am writing that, in part, refers to your terms and conditions?
Luv'n'hugs,
tony

As a fallback, I put a couple of trial balloon FOI requests in to a couple of universities asking for the course names and UCAS course codes for courses offered in 2010/11, along with the search keywords associated with each course (doh! I did it again, deep linking into the UCAS site…)

PS Please may I also link to the page describing course search keywords [ http://www.ucas.com/he_staff/courses/coursesearchkeywords ] ?

The first request went to the University of Southampton, in part because I knew that they already publish chunks of the data (as data) as part of their #opensoton Open Data initiative. (This probably means I was abusing the FOI system, but a point maybe needed to be made…?!;-) The second request was put in to the University of Bristol.

The requests were of the form:

I would be grateful if you could send me in spreadsheet, machine readable electronic form or plain text a copy of the course codes, course titles and search keywords for each course as submitted to UCAS for the 2010-2011 (October 2010) student entry.

If possible, would you also provide HESA subject category codes associated with each course.

So how did I get on?

Bristol’s response was as follows:

On discussion with our Admissions and Student Information teams, it appears that the University does not actually hold this data – it is held on a UCAS database. UCAS are not currently subject to the Freedom of Information Act (they will be in due course) but it may be worth talking to them directly to see if they are willing to assist.

And Southampton’s FOI response?

Course codes and titles may be found here: http://www.soton.ac.uk/corporateservices/foi/request-66210-6124d691.pdf Keywords were not held by the University – you should inquire with UCAS (http://www.ucas.com). HESA subject category codes may be found here: http://www.hesa.ac.uk/index.php/content/view/1806/296/

So what did I learn?

  1. I don’t seem to have made it clear enough to Southampton that I wanted the 2-tuple (course code, HESA code) for each course. So how should I have asked for that data? (The response pointed me to the list of all HESA codes. What I wanted was, for each course code, the course code/HESA code pair.)
  2. Generalising from an example of one;-), there seems to be a disconnect between FOI and open data branches of organisations. In my ideal world, the FOI person (an advocate for the person making the request) would also be on good terms with the Open Data team in the organisation, if not a data wrangler themselves. For data requests, the FOI person would make sure the data is released as open data as part of the process of fulfilling the request and then refer the person making the request to the open data site (see also: Open Data Processes – Taps, Query Paths/Audit Trails and Round Tripping). Southampton have part of this process already – the course data is in a PDF on their site and I was referred to it. (Note that the PDF is not just any PDF – have a look at it! – rather than the spreadsheet, machine readable electronic form or plain text I requested, even though @cgutteridge had posted a link to the SPARQL opendata query for the course code/UCAS code information I’d requested as a reply to my FOI request on the WhatDoTheyKnow site.)
  3. Universities don’t necessarily have any record of the search keywords they associate with the courses they post on UCAS. The UCAS website suggests that (doh!) “[r]ecent analysis of unique IP address use of the UCAS Course Search indicates that the subject search is by far the most popular of the 3 search options currently available”, such that “[w]hen an applicant uses our Course Search facility to search for available courses, they can choose a keyword by which to search, known as the ‘subject search’.” Which is to say, universities have no local record of the terms they use to describe courses that are the primary way of discovering their courses on UCAS? Blimey… (I wonder how much universities spend on Google AdWords for advertising particular courses on their own course prospectus websites and how they go about selecting those terms?)
  4. Asking for a machine readable “data as data” response has no teeth at the current time. I don’t know if the Protection of Freedoms bill clause that “extends Freedom of Information rights by requiring datasets to be available in a re-usable format” will change this? It seems like it might?

    Where—
    (a) an applicant makes a request for information to a public authority in respect of information that is, or forms part of, a dataset held by the public authority, and
    (b) on making the request for information, the applicant expresses a preference for communication by means of the provision to the applicant of a copy of the information in electronic form, the public authority must, so far as reasonably practicable, provide the information to the applicant in an electronic form which is capable of re-use.

  5. So what next? UCAS is a charity that appears to be operated by, for, and on behalf of UK Higher Education (e.g. UCAS Directors’ Report and Accounts 2009). Whilst not FOIable yet, it looked set to become FOIable from October 2011 (Ministry of Justice: Greater transparency in Freedom of Information), though I haven’t been able to find the SI and commencement date that enact this…? IF it does become FOIable, we may be able to get the data out that way (although memories of the battle between open data advocates and the Ordnance Survey come to mind…) Hopefully, though, we’ll be able to get the data open by more amicable means before then…:-)

    PS a couple of other things that I’ve been dipping into relating to this project. Firstly, the UCAS Business Plan 2009-2012 (doh!):

    PPS Please may I also link to your Corporate Business Plan 2009-2012 [ http://www.ucas.com/documents/corporate/corpbusplan09-12.pdf ]

    Secondly, the Cabinet Office’s “Better Choices: Better Deals” strategy document [PDF], which as well as its “MyData” right to personal data initiative, also encourages business to put their information (and data…) to work. Whether or not you agree that more information may help to make for better choices from potential students, or that comparison sites have a role to play in this, the UK government appears to believe it and looks set to support the development of businesses operating in this area. For example:

    Effective consumer choices are also important in the public sector – such as decisions about what and where to study.
    However, unlike in private markets, public services are generally:
    ● Free at the point of delivery, so prices do not give us clues about quality or popularity.
    ● Not motivated by profits, so there is little incentive to highlight differences and encourage switching.
    ● Supplied under a universal service obligation, such that they serve a particularly broad range of users, from the very informed to the highly vulnerable.
    In the same way that comparison and feedback sites have developed for private markets, some choice-tools have already emerged for public services. For example, parents and prospective students can use league tables to compare school and university performance, while patients can access websites comparing waiting times for treatments across different healthcare providers, and feedback from fellow consumers about the performance of a local GP practice. Their role is likely to become more important in future as public service markets are opened up and there is scope for further choice-tools to be developed [Better Choices: Better Deals, p. 32]

    If you’re looking to put a bid or business plan together based on using public data as a basis for comparison services, the Better Choices document has more than a few quotable sections;-)

    [Related: Course Detective metasearch/custom search across UK University prospectus websites]

Written by Tony Hirst

April 26, 2011 at 12:58 pm

Posted in Data, Stirring, Thinkses


A First Attempt at Looking at F1 Timing Data in Google Motion Charts (aka “Gapminder”)


Having managed to get F1 timing data through my cobbled together F1 timing data Scraperwiki, it becomes much easier to try out different visualisation approaches that can be used to review the stories that sometimes get hidden in the heat of the race (that data journalism trick of using visualisation as an analytic tool for story discovery, for example).

Whilst I was on holiday, reading a chapter in Beautiful Visualization on Gapminder/Trendalyzer/Google Motion Charts (it seems the animations may be effective when narrated, as when Hans Rosling performs with them, but for the uninitiated, they can simply be confusing…), it struck me that I should be able to view some of the timing data in the motion chart…

So here’s a first attempt (going against the previously identified “works best with narration” bit of best practice;-) – F1 timing data (China 2011) in Google Motion Charts, the video:


Visualising the China 2011 F1 Grand Prix in Google Motion Charts

If you want to play with the chart itself, you can find it here: F1 timing data (China 2011) Google Motion Chart.

The (useful) dimensions are:

  • lap – the lap number;
  • pos – the car/racing number of each driver;
  • trackPos – the position in the race (the racing position);
  • currTrackPos – the position on the track (so if a lapped car is between the leader and second place car, their respective currTrackPos values are 1, 2, 3);
  • pitHistory – the number of pit stops to date

The timeToLead, timeToFront and timeToBack measures give the time (in seconds) between each car and the leader, the time to the car in the racing position ahead, and the time to the car in racing position behind (these last two datasets are incomplete at the moment… I still need to calculate the missing datapoints…). The elapsedTime is the elapsed racetime for each car at the end of each measured lap.
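For what it’s worth, here’s a sketch of how the three gap measures could be calculated for the cars at the end of a given lap, assuming we have each car’s racing position and elapsed time. The function name and the sample timings are invented; only the field names come from the list above.

```python
def add_gaps(lap_rows):
    """Add timeToLead, timeToFront and timeToBack to the rows for one lap.

    Each row needs 'trackPos' (racing position) and 'elapsedTime' (seconds).
    Cars with nobody ahead or behind get None for the corresponding gap.
    """
    ordered = sorted(lap_rows, key=lambda row: row["trackPos"])
    leader_time = ordered[0]["elapsedTime"]
    for i, row in enumerate(ordered):
        row["timeToLead"] = round(row["elapsedTime"] - leader_time, 3)
        row["timeToFront"] = (
            round(row["elapsedTime"] - ordered[i - 1]["elapsedTime"], 3)
            if i > 0 else None
        )
        row["timeToBack"] = (
            round(ordered[i + 1]["elapsedTime"] - row["elapsedTime"], 3)
            if i + 1 < len(ordered) else None
        )
    return ordered


# Invented timings for three cars at the end of the same lap.
lap_10 = [
    {"car": 3, "trackPos": 2, "elapsedTime": 1002.5},
    {"car": 1, "trackPos": 1, "elapsedTime": 1000.0},
    {"car": 4, "trackPos": 3, "elapsedTime": 1003.25},
]
with_gaps = add_gaps(lap_10)
```

This compares elapsed times at the end of the same lap number, so for a lapped car the “gap” includes the time the leader took to get a lap ahead.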

The time starts at 1900 because of a quirk in Google Motion Charts – they only work properly for times measured in years, months and days (or years and quarters) for 1900 onwards. (You can use years less than 1900 but at 1899 bad things might happen!) This means that I can’t simply use the elapsed time as the timebase. So until such a time as the chart supports date:time or :time as well as date: stamps, my fix is simply to use an integer timecount (the elapsed time in seconds) + 1900.
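In code, the fix is a one-liner. A tiny sketch (the function name is mine):

```python
def motion_chart_time(elapsed_seconds: float) -> int:
    """Map elapsed race time to a pseudo-year the motion chart will accept."""
    return 1900 + int(round(elapsed_seconds))


start = motion_chart_time(0)
later = motion_chart_time(1839.021)
```

So a car that has been racing for 1839.021 seconds shows up in “year” 3739, and reading the slider back is just a matter of subtracting 1900.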

Written by Tony Hirst

April 26, 2011 at 7:49 am

BBC Click Radio – Openness Special on “Privacy”: Jeff Jarvis vs. Andrew Keen

with one comment

This week saw the latest episode in the OU/BBC World Service Click (radio) co-produced season on openness, with a focus this week on privacy… You can hear an extended version of the discussion between entrepreneurial journalism and openness advocate, Jeff Jarvis, and professional contrarian, Andrew Keen: Privacy in a connected world

Unfortunately, the episode aired just too early to pick up on this week’s “Who needs privacy?!” news, and in particular the new iPhone’s “secret” location logging behaviour: iPhone keeps record of everywhere you go; (find out how to see where your iPhone thinks you’ve been here: Got an iPhone or 3G iPad? Apple is recording your moves); but the discussion is a great one, so I encourage you to listen to it…(I’ll be asking questions later!;-)

The programme also saw the launch of its new hashtag: #bbcClickRadio

Whilst the Click Twitter audience is still dwarfed by the Digital Planet Listeners’ Facebook group, I’m keen to see if we can try to grow it… one way might be to show who’s recently been tweeting about the programme, and encourage people to start following each other and chatting about the issues raised in the programme a little bit more – something Gareth Mitchell (@garethm) can now pick up on at least on the first airing, as Click now goes out live…. So to that end, I’m going to try to work up a special version of my Twitter friendviz application that shows connections between folk who’ve recently tweeted a particular term, and in this case, the #bbcClickRadio hashtag. To see the map, visit http://bit.ly/bbcclickradiocommunity.

As a tease, here’s a rather more polished version of a map I grabbed recently…

Snapshot of #bbcClickRadioCommunity - http://bit.ly/bbcclickradiocommunity

(Unfortunately, the live one is unlikely to ever look like this!)

PS I wonder if the investigation into the iPhone tracking was inspired by the recent story about German politician Malte Spitz who managed to obtain a copy of the data his phone provider had stored about his location… Zeit Online: Tell-all telephone (If you want to play with the data, it’s available from there…)

Written by Tony Hirst

April 21, 2011 at 4:53 pm

BBC Click Radio – SXSW Interview With Andrew Keen


Tomorrow (today??? Err, Tuesday…) sees (hears?! Err… airs) the next in the OU/BBC Click radio (ex-Digital Planet) co-produced season on “openness”.

Click (radio) (err, as was Digital Planet) now airs live and direct, comin’ atcha on Tuesdays at, err, it’s not easy to find out from the programme page, is it??? Err, 19.32 (UK time???) on Tuesday (the science slot on World Service). (See the full upcoming schedule.)

Anyway, given all that confusion, why not take a break, sit back, and have a listen to this exclusive interview between Click’s Gareth Mitchell and Andrew Keen on “The Squeezed Midlist“.

And for more exclusive and extended interviews, check out the Information and Communication Technologies area on OpenLearn…

PS and for listeners of the BBC World Service Radio programme formerly known as Digital Planet (now Click) who are on Twitter, the hashtag is now #bbcClickRadio

Written by Tony Hirst

April 19, 2011 at 8:34 am

Googling the Future – from the Present and the Past

with 8 comments

An XKCD cartoon today described Googling the future using search terms such as “in <year>” and “by <year>”:

So I tried it:

Hmm – results from the future?

So I had a play in Google News… could this be a good way of searching forecasts?

By searching the past, we can search for old forecasts of the future…

I leave it as an exercise for the reader to search results from 2006, 2001, and 1991 for the 5, 10 and 20 years forecasts respectively for this year… let me know in the comments if anything interesting turns up;-)

See also: Google Impact…? The “Google Suggest” Factor

Written by Tony Hirst

April 18, 2011 at 9:33 am

Posted in Anything you want, Infoskills, Search


Visualising China 2011 F1 – Timing Charts

with 2 comments

Just a quick post (that I could actually have published 20 mins or so ago), showing a couple of graphics generated from my scrape of the 2011 China Formula One Grand Prix timing data (via FIA press releases).

First up, the race to the podium:

China F1 2011 - the race to the podium
Data © 2011 Formula One World Championship Ltd, 6 Princes Gate, London, SW7 1QJ, England

The full lap chart, with pit stops:

China F1 2011 pit stops
Data © 2011 Formula One World Championship Ltd, 6 Princes Gate, London, SW7 1QJ, England

Both the above graphics were generated using data scraped from press releases published on the FIA media centre website. You can find the data in the GDF format I used to generate the images using Gephi here (howto).

PS @bencc has also been on the case, visualising telemetry data from Vodafone McLaren Mercedes. For example, Hamilton’s tour and Button’s tour.

PPS which reminds me – here’s an example of how to use Gephi to visualise telemetry data captured from the McLaren website: Visualising Vodafone Mclaren F1 Telemetry Data in Gephi

Written by Tony Hirst

April 17, 2011 at 10:24 am

Posted in Tinkering, Visualisation


Visualising F1 Timing Sheet Data

with one comment

Putting together a couple of tricks from recent posts (Visualising Vodafone Mclaren F1 Telemetry Data in Gephi and PDF Data Liberation: Formula One Press Release Timing Sheets), I thought I’d have a little play with the timing sheet data in Gephi…

The representations I have used to date are graph based, with each node corresponding to a particular lap performance by a particular driver, and edges connecting consecutive laps.

**If you want to play along, you’ll need to download Gephi and this data file: F1 timing, Malaysia 2011 (NB it’s not thoroughly checked… glitches may have got through in the scraping process:-( )**

The nodes carry the following data, as specified using the GDF format:

  • name VARCHAR: the ID of each node, given as driverNumber_lapNumber (e.g. 12_43)
  • label VARCHAR: the name of the driver (e.g. S. VETTEL)
  • driverID INT: the driver number (e.g. 7)
  • driverNum VARCHAR: an ID for the driver of the lap (e.g. driver_12)
  • team VARCHAR: the team name (e.g. Vodafone McLaren Mercedes)
  • lap INT: the lap number (e.g. 41)
  • pos INT: the position at the end of the lap (e.g. 5)
  • pitHistory INT: the number of pitstops to date (e.g. 2)
  • pitStopThisLap DOUBLE: the duration of any pitstop this lap, else 0 (e.g. 12.321)
  • laptime DOUBLE: the laptime, in seconds (e.g. 72.125)
  • lapdelta DOUBLE: the difference between the current laptime and the previous laptime (e.g. 1.327)
  • elapsedTime DOUBLE: the summed laptime to date (e.g. 1839.021)
  • elapsedTimeHun DOUBLE: the elapsed time divided by a hundred (e.g. )
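As a rough sketch of how rows like these might be written out as a GDF file: the `nodedef>` and `edgedef>` header lines are the GDF convention, but the function name, the cut-down set of columns and the sample lap times are mine, made up for illustration.

```python
def to_gdf(laps):
    """Build GDF text from per-lap rows, one node per (driver, lap)."""
    lines = ["nodedef>name VARCHAR,label VARCHAR,driverID INT,lap INT,pos INT,laptime DOUBLE"]
    for row in laps:
        name = f"{row['driverID']}_{row['lap']}"
        lines.append(
            f"{name},{row['label']},{row['driverID']},{row['lap']},{row['pos']},{row['laptime']}"
        )

    # Edges join each driver's consecutive laps.
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR")
    seen = {(row["driverID"], row["lap"]) for row in laps}
    for driver, lap in sorted(seen):
        if (driver, lap + 1) in seen:
            lines.append(f"{driver}_{lap},{driver}_{lap + 1}")
    return "\n".join(lines)


# Invented lap times, for illustration only.
laps = [
    {"driverID": 1, "label": "S. VETTEL", "lap": 1, "pos": 1, "laptime": 105.2},
    {"driverID": 1, "label": "S. VETTEL", "lap": 2, "pos": 1, "laptime": 103.9},
    {"driverID": 3, "label": "L. HAMILTON", "lap": 1, "pos": 2, "laptime": 106.0},
]
gdf_text = to_gdf(laps)
```

Values are written unquoted here, which is fine for these labels but would break on a label containing a comma.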

Using the geolayout with an equirectangular (presumably this means Cartesian?) layout, we can generate a range of charts simply by selecting suitable co-ordinate dimensions. For example, if we select the laptime as the y (“latitude”) co-ordinate and x (“longitude”) as the lap, filtering out the nodes with a null laptime value, we can generate a graph of the form:

We can then tweak this a little – e.g. colour the nodes by driver (using a Partition based colouring), and edges according to node, resize the nodes to show the number of pit stops to date, and then filter to compare just a couple of drivers:

This sort of lap time comparison is all very well, but it doesn’t necessarily tell us relative track positions. If we size the nodes non-linearly according to position, with a larger size for the “smaller” numerical position (so first is less than second, and hence first is sized larger than second), we can see whether the relative positions change (in this case, they don’t…)

Another sort of chart we might generate will be familiar to many race fans, with a tweak – simply plot position against lap, colour according to driver, and then size the nodes according to lap time:

Again, filtering is trivial:

If we plot the elapsed time against lap, we get a view of separations (deltas between cars are available in the media centre reports, but I haven’t used this data yet…):

In this example, lap number increases up the graph and elapsed time increases left to right. Nodes are coloured by driver, and sized according to position. If a driver has a higher lap count and lower total elapsed time than a driver on the previous lap, then they’ve lapped that car… Within a lap, we also see the separation of the various cars. (This difference should be the same as the deltas that are available via FIA press releases.)

If we zoom into a lap, we can better see the separation between cars. (Using the data I have, I’m hoping I haven’t introduced any systematic errors arising from essentially dead reckoning the deltas between cars…)

Also note that where lines between two laps cross, we have a change of position between laps.
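The “crossing lines” observation can also be checked directly from the data rather than by eye. A minimal sketch, assuming we have each car’s racing position at the end of two consecutive laps (the function name and the positions are invented):

```python
def position_changes(prev_lap, this_lap):
    """Each argument maps car -> racing position at the end of a lap.

    Returns the pairs of cars (a, b) whose relative order flipped.
    """
    changes = []
    cars = sorted(set(prev_lap) & set(this_lap))
    for i, a in enumerate(cars):
        for b in cars[i + 1:]:
            before = prev_lap[a] < prev_lap[b]
            after = this_lap[a] < this_lap[b]
            if before != after:
                changes.append((a, b))
    return changes


lap_20 = {1: 1, 2: 2, 3: 3}
lap_21 = {1: 1, 3: 2, 2: 3}  # car 3 has passed car 2
swaps = position_changes(lap_20, lap_21)
```

Each flipped pair corresponds to one crossing on the chart, though it can’t tell an on-track pass from a place gained while the other car was in the pits.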

[ADDED] Here’s another view, plotting elapsed time against itself to see where folk are on the track-as-laptime:

Okay, that’s enough from me for now.. Here’s something far more beautiful from @bencc/Ben Charlton that was built on top of the McLaren data…

First up, a 3D rendering of the lap data:

And then a rather nice lap-by-lap visualisation:

So come on F1 teams – give us some higher resolution data to play with and let’s see what we can really do… ;-)

PS I see that Joe Saward is a keen user of Lap charts…. That reminds me of an idea for an app I meant to do for race days that makes grabbing position data as cars complete a lap as simple as clicking…;-) Hmmm….

PPS for another take of visualising the timing data/timing stats, see Keith Collantine/F1Fanatic’s Malaysia summary post.

Written by Tony Hirst

April 16, 2011 at 7:30 pm

Visualising Vodafone Mclaren F1 Telemetry Data in Gephi

with 5 comments

Last year, I popped up an occasional series of posts visualising captures of the telemetry data that was being streamed by the Vodafone McLaren F1 team (F1 Data Junkie).

I’m not sure what I’m going to do with the data this year, but being a lazy sort, it struck me that I should be able to visualise the data using Gephi (using in particular the geo layout that lets you specify which node attributes should be used as x and y co-ordinates when placing the nodes).

Taking a race worth of data, and visualising each node as follows (size as throttle value, colour as brake) we get something like this:

(Note that the resolution of the data is 1Hz, which explains the gaps…)

It’s possible to filter the data to show only a lap’s worth:

We could also filter out the data to only show points where the throttle value is above a certain value, or the lateral acceleration (“G-force”) and so on… or a combination of things (points where throttle and brake are applied, for example). I’ll maybe post examples of these using data from this year’s races…. err..?;-)
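Outside Gephi, the same sort of filtering is only a few lines. In this sketch the samples and the field names are invented (they are not the names used in the McLaren feed), and `keep` is just a hypothetical helper:

```python
# Invented 1 Hz samples; field names are assumptions, not the McLaren feed's.
samples = [
    {"t": 0, "throttle": 100, "brake": 0, "lat_g": 0.1},
    {"t": 1, "throttle": 40, "brake": 1, "lat_g": 1.8},
    {"t": 2, "throttle": 0, "brake": 1, "lat_g": 2.4},
    {"t": 3, "throttle": 85, "brake": 0, "lat_g": 3.1},
]


def keep(samples, *conditions):
    """Return the samples for which every condition is true."""
    return [s for s in samples if all(cond(s) for cond in conditions)]


# Points where the throttle is above a threshold.
hard_on_throttle = keep(samples, lambda s: s["throttle"] > 80)

# Points where throttle and brake are applied together.
overlap = keep(samples, lambda s: s["throttle"] > 0, lambda s: s["brake"] > 0)
```

Passing several conditions gives the “combination of things” case, since a sample is kept only if all of them hold.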

For now though, here’s a little video tour of Gephi in action on the data:

What I’d like to be able to do is animate this so I could look at each lap in turn (or maybe even animate an onion skin of the “current” point and a couple of previous ones) but that’s a bit beyond me… (for now….?!;-) If you know how, maybe we should talk?!:-)

[Thanks to McLaren F1 for streaming this data. Data was captured from the McLaren F1 website in 2010. I believe the speed, throttle and brake data were sponsored by Vodafone.]

PS If McLaren would like to give me some slightly higher resolution data, maybe from an old car on a test circuit, I’ll see what I can do with it… Similarly, any other motor racing teams in any other formula who have data they’d like to share, I’m happy to have a play… I’m hoping to go to a few of the BTCC races this year, so I’d particularly like to hear from anyone from any of those teams, or teams in the supporting races:-) If a Ginetta Junior team is up for it, we might even be able to get an education/outreach thing going into school maths, science, design and engineering clubs…;-)

Written by Tony Hirst

April 14, 2011 at 12:41 pm
