Posts Tagged ‘F1’
I often get quizzical looks when I drop F1 related visualisations into random presentations (“Tony slacking around again”), whereas if I said “Raspberry Pi” then it would somehow be rather more legitimate… However, one of the ways I see it is that I’m trying to engage in an informal way with a large audience in a target demographic, a significant proportion of which are prequalified as ‘interested in STEM’. I’m also trying to engage, albeit slackly, in some sort of weak knowledge transfer (hey, motor racing folk: you increasingly haz data, and maybe there are ways of visualising it to try and gain value from it that you haven’t really thought about yet…)
In case you didn’t already know, motorsport is worth shedloads* to quite a lot* of UK companies in both domestic and export sales and employs probably more than seven* people. (*Official trade association stats.)
Anyway, what prompted this post? This did:
learndirect as sponsor of the Marussia F1 Team?!
I have to admit, for some reason I associate learndirect with DirectGov, the government one-stop shop (will gov.uk be rebranded as DirectGov when it comes out of beta, I wonder? Or will DirectGov go the way of the open2.net and be quietly run down and then out?!)… but the truth of the matter is that learndirect is a private equity operated outfit, “the UK’s leading online learning provider”, apparently, “[acquired in] October 2011 [by LDC] … in a transaction valued in the order of £40 million.” LDC Portfolio: learndirect.
Ah, here’s where my memory tricked me (like it does with supermarket and bank “promises”…): “LDC bought learndirect by acquiring its parent Ufi Limited from the Ufi Charitable Trust (UCT). UCT, a registered charity, was set up in 1998 to use new technology to transform the delivery of learning and skills.” Ufi, of course, was the University for Industry, an ill-fated government venture that I seem to remember the OU partnered to a certain extent…
So why would LDC be splashing the learndirect brand all over the MarussiaF1 racing car? Aside, that is, from the fact that learndirect owners LDC also have a stake in the Marussia F1 team (one aim of which is to “meet our latent sponsorship potential”, which presumably means getting sponsorship mileage for other LDC companies?), as well as having at least one person on both the learndirect and Marussia Virgin(?, or should that be F1?) Racing boards…
And there was me thinking there were absolutely no opportunities for wrangling F1 freebies, seeing as I am stuck in the education sector… Hmmm… time to dig out some of my old science, technology, engineering and maths outreach pitches, maybe…?! (If anyone at the Marussia F1 Racing team fancies chatting about exploring the use of data visualisation either for outreach, or maybe in research, please feel free to get in touch…:-) The (nearby, Milton Keynes based) OU also has various lab facilities and experience in instrumentation (including space flown instruments – so good on the heat, mass, volume and vibration front, I’m guessing…?), materials and CFD (though I suspect too much CFD may be something of a sore point!?), and I’ll happily put you in touch with folk who can tell you more if you’re interested…;-) There’s also some experience in Twitter audience interest profiling, heh heh;-)
PS MarussiaF1 also happen to have appointed a female test driver, Maria de Villota, which may or may not also be a good thing as far as WISE-like initiatives go (I know the drivers aren’t engineers, but it’s an aspiration-related funnel thing; see also James Allen on Why aren’t there more women engineers in F1, where he writes: “F1 in Schools has a very high ratio of female competitors, around 35%, and all-girl teams are quite common. And yet when they get to around 15 years of age, the numbers fall away and few girls pursue engineering degrees.”)
PPS During National Motorsport Week last year, I won a trip round the Marussia(-Virgin, as it then was) F1 factory in Dinnington, near Sheffield (it’s since moved to Banbury; the factory, that is, not Dinnington…;-). Here’s the obligatory blog post: Marussia Virgin Racing F1 Factory Visit. Btw, National Motorsport Week runs again this year too: National Motorsport Week 2012.
PPPS this reminds me of a noticing by @barnstormed (?) a couple of weeks ago that the OU had an ad on ?rotating digital hoardings during Six Nations rugby? (Confirmed by @stuartbrown: “the ou was advertising on boards during scot vs england in the 6 nations rugby”. Photo of that anyone?) Anyone got other examples of education related orgs sponsoring sports to a significant extent?
Yesterday, I had the good fortune to visit the F1 Marussia Virgin Racing factory at Dinnington, near Sheffield, as a result of “winning” a lucky dip competition run via GoMotorSport (part of a series of National Motorsport Week promotions being run by the F1 teams based in the UK).
[Thanks to @markhendy for the pic...]
Thanks to Finance Director Mark Hendy and engineer Shakey for the insight into the team’s operations:-)
Over the next few days and weeks, I’ll try to pick up on a few of the things I learned from the tour on the F1DataJunkie blog, tying them in to the corresponding technical regulations and other bits and pieces, but for now, here are some of the noticings I came away with…
- the engines aren’t that big, weighing 90kg or so and looking smaller than the engine in my own car…
- wheels are slotted onto the axles using a 3 pin mount on the front and a six(?) pin mount on the rear. (The engines are held on using a 6(?) point fixing.)
- the drivers aren’t that heavy either, weight wise (not that we met either of the drivers: neither Timo Glock nor Jerome D’Ambrosio is a frequent visitor to the Dinnington factory, where the team’s cars are prepared before, and overhauled after, each race…): 70 kg or so. With cars prepared to meet racing weight regulations to a tolerance of 0.5kg or so, a large mixed grill and a couple of pints can make a big difference… (Hmm, I guess it would be easy enough to calculate the “big dinner weight effect” penalty on laptime?!)
I’m not sure if this was a “right-handed vs left-handed spanner” remark, but a comment was also made that the adhesive sponsor sticker can have a noticeable effect on the car’s aerodynamics as the corners become unstuck and start to flap. (Which made me wonder, if that is the case, is the shape of stickers taken into account? Is a leading edge on a label with a point/right angled corner rather than a smooth curve likely to come unstuck more easily, for example?!) Cars also need repainting every few races (stripping back to the carbon, and repainting afresh) because of pitting and chipping and other minor damage that can affect smooth airflow.
- side impact tubes are an integral part of the safety related design of the car:
- to track the usage of tyres during a race weekend, an FIA official scans a barcode on each tyre as it is used on the car:
The data junkie in me in part wonders whether this data could be made available in a timely fashion via the Pirelli website (or a Pirelli gadget on each team’s website) – or would that be giving away too much race intelligence to the other teams? That way, we could get an insight into the tyre usage over the course of the weekend…
- IT plays an increasingly important part in the pit garage setup; local area networks (cabled and wifi?) are set up by each team for the weekend, with the data engineers sitting behind the screen and viewing area in the garage (rather than having a fixed set up in one of the 5(?) trucks that attend each race).
- the cars are rigged up with 60 or so sensors; there is only redundancy on throttle and clutch sensors. Data analysis is in part provided through engineers provided by parts suppliers: McLaren Electronics, who supply the car’s ECU (and telemetry box(?)), provide a dedicated person(?) to support the team. Data analysis is, in part, carried out using the Atlas (9?) (Advanced Telemetry Linked Acquisition System) from McLaren Electronic Systems. Data collected during a stint is transmitted under encryption back to the pits, as well as being logged on the car itself. A full data dump is available to the team and the FIA scrutineers via an umbilical/wired connection when the car is pitted.
UST Global, one of the team’s partners, also provide 3(?) data analysts to support the team during a race (presumably using UST Global’s “Race Management System”?).
- for design and testing, weekly reporting is required that conforms to a trade-off between the number of hours per week that each team can spend on wind tunnel testing (60 hours per week) and CFD (“can’t find downforce”;-) simulation (40 teraflops per week). My first impression there was that efficient code could effectively mean more simulation testing?! (CFD via CSC? CSC expands relationship with Marussia Virgin Racing, doubling computing power for the team’s 2011 formula 1 season, or are things set to change with the replacement of Nick Wirth by Pat Symonds…?)
- the resource restriction agreement also limits the number of people who can work on the chassis. For a race weekend, teams are limited to 50 (47?) people. We were given a quick run down of at least (8?) engineer roles assigned to each car, but I forget them…
So – that’s a quick summary of some of the things I can remember off the top of my head…
…but here are a couple of other things to note that may be of interest…
Marussia Virgin are making the most of their Virgin partnership over the Silverstone race weekend with a camping party/Virgin Experience at Stowe School (Silverstone Weekend) and a hook-up with Joe Saward’s “An Audience With Joe”… (If you don’t listen to @sidepodcast’s An Aside With Joe podcast series, you should…;-)
The team has also got an education thing going with race ticket sweeteners for folk signing up to the course: Motorsport Management Online Course.
I can’t help thinking there may be a market for a “hardcore fans” course on F1 that could run over a race season and run as an informal, open online course… I still don’t really know how a car works, for example ;-)
Anyway – that’s by the by: thanks again to GoMotorsport and the Marussia Virgin Racing team (esp. Mark Hendy and Shakey) for a great day out :-)
PS I think the @marussiavirgin team are trying to build up their social media presence too… to see who they’re listening to, here’s how their friends connect:
I *love* treemaps. If you’re not familiar with them, they provide a very powerful way of visualising categorically organised hierarchical data that bottoms out with a quantitative, numerical dimension in a single view.
For example, consider the total population of students on the degrees offered across UK HE by HESA subject code. As well as the subject level, we might also categorise the data according to the number of students in each year of study (first year, second year, third year).
If we were to tabulate this data, we might have columns: institution, HESA subject code, no. of first year students, no. of second year students, no. of third year students. We could also restructure the table so that the data was presented in the form: institution, HESA subject code, year of study, number of students. And then we could visualise it in a treemap… (which I may do one day… but not now; if you beat me to it, please post a link in the comments;-)
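By way of a rough sketch of that restructuring step (the institutions, subject codes and student numbers below are all made up for illustration, and the column names are my own assumption rather than anything HESA publishes), turning the "one column per year" table into the "one row per year" form might look something like this:

```python
# Hypothetical wide-format rows: one row per (institution, subject),
# with one column per year of study. All values are invented.
wide_rows = [
    {"institution": "Uni A", "subject": "F3", "year1": 120, "year2": 100, "year3": 90},
    {"institution": "Uni B", "subject": "G1", "year1": 80, "year2": 75, "year3": 70},
]

YEAR_COLUMNS = ["year1", "year2", "year3"]


def wide_to_long(rows, year_columns=YEAR_COLUMNS):
    """Turn one row per subject into one row per (subject, year of study)."""
    long_rows = []
    for row in rows:
        for col in year_columns:
            long_rows.append({
                "institution": row["institution"],
                "subject": row["subject"],
                "year_of_study": col,
                "students": row[col],
            })
    return long_rows


long_rows = wide_to_long(wide_rows)
for r in long_rows:
    print(r["institution"], r["subject"], r["year_of_study"], r["students"])
```

The long form is the one a treemap tool tends to want: every column except the last is a level in the hierarchy, and the last column is the number that sets the size of the box.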
Instead, what I will show is how to visualise data from a sports championship, in particular the start of the Formula One 2011 season. This championship has the same entrants in each race, each a member of one of a fixed number of teams. Points are awarded for each race (that is, each round of the championship) and totalled across rounds to give the current standing. As well as the driver championship (based on points won by individual drivers) there is the team championship (where the points contribution from drivers within a team is totalled).
Here’s what the results from the third round (China) look like:
|Paul di Resta||Force India-Mercedes||0|
|Adrian Sutil||Force India-Mercedes||0|
F1 2011 Results – China, © 2011 Formula One World Championship Ltd
We can represent data from across all the races using a table of the form:
Sample of F1 2011 Results, © 2011 Formula One World Championship Ltd
Here’s what it looks like when we view it in a treemap visualisation:
The size of the boxes is proportional to the (summed) values within the hierarchical categories. In the above case, the large blocks are the total points awarded to each driver across teams and races. (The team field might be useful if a driver were to change team during the season.)
I’m not certain, but I think the Many Eyes treemap algorithm populates the map using a sorted list of summed numerical values taken through the hierarchical path from left to right, top to bottom. Which means top left is the category with the largest summed points. If this is the case, in the above example we can directly see that Webber is in fourth place overall in the championship. We can also look within each blocked area for more detail: for example, we can see Hamilton didn’t score as many points in Malaysia as he did in the other two races.
One of the nice features about the Many Eyes treemap is that it allows you to reorder the levels of the hierarchy that is being displayed. So for example, with a simple reordering of the labels we can get a view over the team championship too:
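To make the "summed values within the hierarchical categories" idea a little more concrete, here is a minimal sketch of the sums that sit behind the boxes. It is not how Many Eyes does it internally (I don't know that); the driver and team names and the points are placeholders, not real results, and `treemap_sizes` is just a name I've made up:

```python
from collections import defaultdict

# Placeholder rows in the (driver, team, race, points) layout described above.
rows = [
    {"driver": "Driver A", "team": "Team X", "race": "Race 1", "points": 25},
    {"driver": "Driver B", "team": "Team X", "race": "Race 1", "points": 10},
    {"driver": "Driver C", "team": "Team Y", "race": "Race 1", "points": 18},
    {"driver": "Driver A", "team": "Team X", "race": "Race 2", "points": 18},
    {"driver": "Driver B", "team": "Team X", "race": "Race 2", "points": 15},
    {"driver": "Driver C", "team": "Team Y", "race": "Race 2", "points": 25},
]


def treemap_sizes(rows, levels, value="points"):
    """Sum `value` for every prefix of the hierarchy path given by `levels`.

    The total for a one-element path is the size of a top-level box, the
    total for a two-element path is the size of a box nested inside it,
    and so on down the hierarchy.
    """
    totals = defaultdict(int)
    for row in rows:
        path = tuple(row[level] for level in levels)
        for depth in range(1, len(path) + 1):
            totals[path[:depth]] += row[value]
    return dict(totals)


# Reordering the levels is all it takes to switch between the two views.
driver_view = treemap_sizes(rows, ["driver", "team", "race"])
team_view = treemap_sizes(rows, ["team", "driver", "race"])

print(driver_view[("Driver A",)])  # top-level box for a driver
print(team_view[("Team X",)])      # top-level box for a team
```

The same rows feed both views; only the order of the `levels` list changes, which is essentially what dragging the labels around in the treemap is doing.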
What might be interesting would be to feed Protovis or the JIT with data dynamically from a Google Spreadsheet, for example, so that a single page could be used to display the treemap with the data being maintained in a spreadsheet.
Hmm, I wonder – does Google spreadsheets have a treemap gadget? Ooh – it does: treemap-gviz. It looks as if a bit of wrangling may be required around the data, but if the display works out then just popping the points data into a Google spreadsheet and creating the gadget should give an embeddable treemap display with no code required:-) (It will probably be necessary to format the data hierarchy by hand, though, requiring differently laid out data tables to act as source for individual and team based reports.)
So – how long before we see some “live” treemap displays for F1 results on the F1 blogs then? Or championship tables from other sports? Or is the treemap too confusing as a display for the uninitiated? (I personally don’t think so.. but then, I love macroscopic views over datasets:-)
PS see also More Olympics Medal Table Visualisations which includes a demonstration of a treemap visualisation over Olympic medal standings.
Just a quick post (that I could actually have published 20 mins or so ago), showing a couple of graphics generated from my scrape of the 2011 China Formula One Grand Prix timing data (via FIA press releases).
First up, the race to the podium:
The full lap chart, with pit stops:
Both the above graphics were generated using data scraped from press releases published on the FIA media centre website. You can find the data in the GDF format I used to generate the images using Gephi here (howto).
PPS which reminds me – here’s an example of how to use Gephi to visualise telemetry data captured from the McLaren website: Visualising Vodafone Mclaren F1 Telemetry Data in Gephi
Putting together a couple of tricks from recent posts (Visualising Vodafone Mclaren F1 Telemetry Data in Gephi and PDF Data Liberation: Formula One Press Release Timing Sheets), I thought I’d have a little play with the timing sheet data in Gephi…
The representations I have used to date are graph based, with each node corresponding to a particular lap performance by a particular driver, and edges connecting consecutive laps.
The nodes carry the following data, as specified using the GDF format:
- name VARCHAR: the ID of each node, given as driverNumber_lapNumber (e.g. 12_43)
- label VARCHAR: the name of the driver (e.g. S. VETTEL)
- driverID INT: the driver number (e.g. 7)
- driverNum VARCHAR: an ID for the driver of the lap (e.g. driver_12)
- team VARCHAR: the team name (e.g. Vodafone McLaren Mercedes)
- lap INT: the lap number (e.g. 41)
- pos INT: the position at the end of the lap (e.g. 5)
- pitHistory INT: the number of pitstops to date (e.g. 2)
- pitStopThisLap DOUBLE: the duration of any pitstop this lap, else 0 (e.g. 12.321)
- laptime DOUBLE: the laptime, in seconds (e.g. 72.125)
- lapdelta DOUBLE: the difference between the current laptime and the previous laptime (e.g. 1.327)
- elapsedTime DOUBLE: the summed laptime to date (e.g. 1839.021)
- elapsedTimeHun DOUBLE: the elapsed time divided by a hundred (e.g. )
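As a rough sketch of how a file like that might be put together (this is not the code I actually used, it only covers a subset of the attributes listed above, and the driver label and lap times are invented), a GDF file is just a `nodedef>` header line followed by one line per node, then an `edgedef>` header followed by one line per edge:

```python
def build_gdf(laps):
    """Build GDF text from lap times.

    laps: dict mapping (driver_number, driver_label) to a list of lap times
    in seconds, in lap order. Assumes labels contain no commas, since the
    values are written out unquoted.
    """
    node_lines = ["nodedef>name VARCHAR,label VARCHAR,driverID INT,lap INT,"
                  "laptime DOUBLE,lapdelta DOUBLE,elapsedTime DOUBLE"]
    edge_lines = ["edgedef>node1 VARCHAR,node2 VARCHAR"]
    for (number, label), laptimes in laps.items():
        elapsed = 0.0
        previous = None
        for lap, laptime in enumerate(laptimes, start=1):
            elapsed += laptime
            # No previous lap to compare with on lap 1, so use a delta of 0.
            delta = 0.0 if previous is None else laptime - previous
            name = f"{number}_{lap}"  # driverNumber_lapNumber, e.g. 12_43
            node_lines.append(
                f"{name},{label},{number},{lap},"
                f"{laptime:.3f},{delta:.3f},{elapsed:.3f}"
            )
            if lap > 1:
                # Edge from the previous lap's node to this one.
                edge_lines.append(f"{number}_{lap - 1},{name}")
            previous = laptime
    return "\n".join(node_lines + edge_lines) + "\n"


# Invented lap times for a made-up driver, just to show the output shape.
sample_gdf = build_gdf({(7, "DRIVER A"): [100.0, 98.5, 99.25]})
print(sample_gdf)
```

Saving the returned text to a file with a .gdf suffix should give something Gephi can open, with the numeric columns then available as layout dimensions.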
Using the geolayout with an equirectangular (presumably this means Cartesian?) layout, we can generate a range of charts simply by selecting suitable co-ordinate dimensions. For example, if we select the laptime as the y (“latitude”) co-ordinate and x (“longitude”) as the lap, filtering out the nodes with a null laptime value, we can generate a graph of the form:
We can then tweak this a little – e.g. colour the nodes by driver (using a Partition based colouring), and edges according to node, resize the nodes to show the number of pit stops to date, and then filter to compare just a couple of drivers:
This sort of lap time comparison is all very well, but it doesn’t necessarily tell us relative track positions. If we size the nodes non-linearly according to position, with a larger size for the “smaller” numerical position (so first is less than second, and hence first is sized larger than second), we can see whether the relative positions change (in this case, they don’t…)
Again, filtering is trivial:
If we plot the elapsed time against lap, we get a view of separations (deltas between cars are available in the media centre reports, but I haven’t used this data yet…):
In this example, lap count increases up the graph, elapsed time increases left to right. Nodes are coloured by driver, and sized according to position. If a driver has a higher lap count and lower total elapsed time than a driver on the previous lap, then they have lapped that car… Within a lap, we also see the separation of the various cars. (This difference should be the same as the deltas that are available via FIA press releases.)
If we zoom into a lap, we can better see the separation between cars. (Using the data I have, I’m hoping I haven’t introduced any systematic errors arising from essentially dead reckoning the deltas between cars…)
Also note that where lines between two laps cross, we have a change of position between laps.
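The two things being read off the chart here (crossing lines meaning a change of position, and a lower elapsed time on a higher lap count meaning a car has been lapped) can also be pulled straight out of the elapsed time data. A minimal sketch, using made-up elapsed times and function names of my own invention:

```python
def positions_by_lap(elapsed):
    """elapsed: dict driver -> list of cumulative times at the end of each lap.

    Returns a list with one entry per lap: the drivers who completed that
    lap, ordered by elapsed time (so index 0 is the car in front).
    """
    n_laps = max(len(times) for times in elapsed.values())
    order = []
    for lap in range(n_laps):
        finished = [(times[lap], driver)
                    for driver, times in elapsed.items() if len(times) > lap]
        order.append([driver for _, driver in sorted(finished)])
    return order


def position_changes(order):
    """Return (lap_number, driver, old_pos, new_pos) for each change of place."""
    changes = []
    for lap in range(1, len(order)):
        prev, curr = order[lap - 1], order[lap]
        for new_pos, driver in enumerate(curr, start=1):
            if driver in prev:
                old_pos = prev.index(driver) + 1
                if old_pos != new_pos:
                    changes.append((lap + 1, driver, old_pos, new_pos))
    return changes


def lapped_pairs(elapsed):
    """Return (lap_number, leader, backmarker) for the first lap on which the
    leader ends lap n before the backmarker has ended lap n-1."""
    lapped = []
    for leader, lead_times in elapsed.items():
        for other, other_times in elapsed.items():
            if leader == other:
                continue
            last = min(len(lead_times), len(other_times) + 1)
            for i in range(1, last):
                if lead_times[i] < other_times[i - 1]:
                    lapped.append((i + 1, leader, other))
                    break
    return lapped


# Invented elapsed times (seconds) for three cars over three laps.
elapsed = {
    "A": [100.0, 200.0, 300.0],
    "B": [101.0, 199.0, 301.0],
    "C": [150.0, 305.0, 460.0],
}
order = positions_by_lap(elapsed)
print(position_changes(order))
print(lapped_pairs(elapsed))
```

Note that this only compares times at the end of each lap, so any to-ing and fro-ing within a lap is invisible, just as it is on the chart.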
[ADDED] Here’s another view, plotting elapsed time against itself to see where folk are on the track-as-laptime:
Okay, that’s enough from me for now.. Here’s something far more beautiful from @bencc/Ben Charlton that was built on top of the McLaren data…
First up, a 3D rendering of the lap data:
And then a rather nice lap-by-lap visualisation:
So come on F1 teams – give us some higher resolution data to play with and let’s see what we can really do… ;-)
PS I see that Joe Saward is a keen user of Lap charts…. That reminds me of an idea for an app I meant to do for race days that makes grabbing position data as cars complete a lap as simple as clicking…;-) Hmmm….
PPS for another take on visualising the timing data/timing stats, see Keith Collantine/F1Fanatic’s Malaysia summary post.
If you want F1 summary timing data from practice sessions, qualifying and the race itself, you might imagine that the FIA Media Centre is the place to go:
Some of the documents provide all the results on a single page in a relatively straightforward fashion:
Others are split into tables over multiple pages:
Following the race, the official classification was available as a scrapable PDF in preliminary form, but the final result – with handwritten signature – looked to be a PDF of a photocopy, and as such defies scraping without an OCR pass first… which I didn’t try…
I did consider setting up separate scrapers for each timing document, and saving the data into a corresponding Scraperwiki database, but a quick look at the license conditions made me a little wary…
No part of these results/data may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording, broadcasting or otherwise without prior permission of the copyright holder except for reproduction in local/national/international daily press and regular printed publications on sale to the public within 90 days of the event to which the results/data relate and provided that the copyright symbol appears together with the address shown below …
Instead, I took the scrapers just so far such that I (that is, me ;-) could see how I would be able to get hold of the data without too much additional effort, but I didn’t complete the job… there’s partly an ulterior motive for this too… if anyone really wants the data, then you’ll probably have to do a bit of delving into the mechanics of Scraperwiki;-)
(The other reason for my not spending more time on this at the moment is that I was looking for a couple of simple exercises to get started with grabbing data from PDFs, and the FIA docs seemed quite an easy way in… Writing the scrapers is also a bit like doing Sudoku, or Killer, which is one of my weekend pastimes…;-)
The scraper I set up is here: F1 Timing Scraperwiki
To use the scrapers, you need to open up the Scraperwiki editor, and do a little bit of configuration:
(Note that the press releases may disappear a few days after the race – I’m not sure how persistent the URLs are?)
When you’ve configured the scraper, run it…
The results of the scrape should now be displayed…
Scraperwiki does allow scraped data to be deposited into a database, and then accessed via an API, or other scrapers, or uploaded to Google Spreadsheets. However, my code stops at the point of getting the data into a Python list. (If you want a copy of the code, I posted it as a gist: F1 timings – press release scraper; you can also access it via Scraperwiki, of course).
Note that so far I’ve only tried the docs from a single race, so the scrapers may break on the releases published for future (or previous) races… Such is life when working with scrapers… I’ll try to work on robustness as the races go by. (I also need to work on the session/qualifying times and race analysis scrapers… they currently report unstructured data and also display an occasional glitch that I need to handle via a post-scrape cleanser.)
If you want to use the scraper code as a starting point for building a data grabber that publishes the timing information as data somewhere, that’s what it’s there for (please let me know in the comments;-)
PS by the by, Mercedes GP publish an XML file of the latest F1 Championship Standings. They also appear to be publishing racetrack information in XML form using URLs of the form http://assets.mercedes-gp.com/—9—swf/assets/xml/race_23_en.xml. Presumably the next race will be 24?
If you know of any other “data” sources or machine readable, structured/semantic data relating to F1, please let me know via a comment below:-)
Why oh why doesn’t F1 get into the spirit of releasing live timing data in API form during the race?
Here’s something I’d like to build, based on track position graphics:
The ability to play along as a pit lane strategist looking for opportunities about when to pit….
For example, I’d select my driver, then using a model of how long it takes to pit, how far behind the traffic is, and how the time difference maps onto distance round the track, we could pop up a graphic showing the window the pitting car would look to return in to…
Post hoc timing data is available, I suppose, so I guess I could model what this might look like anyway…?
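As a first stab at what the model behind such a graphic might look like, here's a minimal sketch. Everything in it is an assumption on my part: the pit loss is treated as a single number of seconds lost relative to staying out, nobody else pits or changes pace, and time gaps are mapped onto the track simply as a fraction of a representative lap time. The names and numbers are invented.

```python
def rejoin_window(gaps_behind, pit_loss, lap_time):
    """Work out where a pitting car would rejoin the traffic.

    gaps_behind: dict driver -> seconds behind the pitting car before the stop.
    pit_loss: seconds lost by pitting compared with staying out (assumed input).
    lap_time: representative lap time in seconds, used to express the time
        lost as a fraction of a lap round the track.
    """
    # Cars closer than the pit loss will be in front after the stop.
    ahead = {d: g for d, g in gaps_behind.items() if g < pit_loss}
    behind = {d: g for d, g in gaps_behind.items() if g >= pit_loss}
    car_ahead = max(ahead, key=ahead.get) if ahead else None
    car_behind = min(behind, key=behind.get) if behind else None
    return {
        "car_ahead": car_ahead,
        "gap_to_car_ahead": pit_loss - ahead[car_ahead] if car_ahead else None,
        "car_behind": car_behind,
        "gap_to_car_behind": behind[car_behind] - pit_loss if car_behind else None,
        "lap_fraction_lost": pit_loss / lap_time,
    }


# Invented gaps (seconds) to the cars following the driver we've selected.
window = rejoin_window({"B": 5.0, "C": 18.0, "D": 27.5}, pit_loss=22.0, lap_time=88.0)
print(window)
```

The two gaps are the "window" the car drops into; a strategist would presumably be looking for laps where both are comfortably large, so the car rejoins in clear air.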
The count down is on to my first post unpicking some of the telemetry data grabbed from the Mclaren F1 site during the Bahrain Grand Prix, and then maybe this weekend’s race, but first, here’s another tease…
One of the problems I’ve found from a data-based (groan…) storytelling perspective is relating what the data’s telling us to what we know the car is doing from where it is on the track. As I/we refine our data analysis skills we’ll be able just to look at the data and work out what the likely features of the track are at the point the data was collected; but as novice data engineers, we need all the cribs we can get. Which is why I had a little play with my Processing code and built an interactive data explorer that looks something like this:
The idea is that I can easily select a data trace, or a location on the track, and get a snapshot of the data collected at that point in the context of the other data points. That is, this data navigator allows me to expose the data collected in a single sample, in the context of the position of the car on the track, and given the state of the other data values at the same point in time, as well as immediately before and immediately after.
I’ll post a version of this data explorer somewhere when I post the first data analysis post proper, but for now, you’ll just have to make do with the video…;-)
PS As to where the data came from, that story is described here: F1 Data Junkie – Looking at What’s There
It’s F1 race weekend again, so I’m back pondering what to do next on my F1 Race Day Strategist spreadsheets. Coming across an article (BBC F1’s fuel-adjusted Monaco GP grid), I guess one thing I could do is look to try and model the fuel adjusted grid for each race. That post also identifies the speed penalty per kg (“each kilo of fuel slows it down by about 0.025 seconds”) so I need to factor that in too, somehow, into a laptime predictor spreadsheet, maybe?
Note that I didn’t really see many patterns in lap time changes when I tried to plot them previously (A Few More Tweaks to the Pit Stop Strategist Spreadsheet) so maybe the time gained by losing weight is offset by decreasing tyre performance?
One thing the spreadsheet has (badly) assumed to date was a fuel density of 1 kg/l. Checking the F1 2009 technical specification, the actual density can range between 0.72 and 0.775 kg/l (regulation 19.3), so relating fuel timings (l/s), lap distances/fuel efficiencies (km/l), and car starting weight (kg) means that the density measures need taking into account.
Unfortunately, I factored density into some of the formulae but not others, so the spreadsheets could take some picking apart trying to take density into account to keep the different calculations consistent. Hmm, maybe I should start a new spreadsheet from scratch to work out fuel adjusted grid positions, and then use the basic elements from that spreadsheet as the base elements for the other spreadsheets?
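If I were starting again from scratch, the base elements might look something like the following sketch. The 0.025 s per kg figure and the density range are the ones quoted above; reading the time penalty as "per lap", and the function names, are my assumptions, and the example fuel load and lap time are invented.

```python
SECONDS_PER_KG = 0.025                  # quoted figure; taken here to be per lap
DENSITY_MIN, DENSITY_MAX = 0.72, 0.775  # kg/l, range quoted from the 2009 regs


def fuel_mass_kg(litres, density):
    """Convert a fuel volume to a mass, keeping the litres/kg distinction explicit."""
    if not DENSITY_MIN <= density <= DENSITY_MAX:
        raise ValueError("density outside the permitted 0.72-0.775 kg/l range")
    return litres * density


def fuel_time_penalty(fuel_kg):
    """Lap time cost (seconds) of carrying fuel_kg of fuel."""
    return fuel_kg * SECONDS_PER_KG


def fuel_adjusted_time(lap_time, fuel_kg):
    """Lap time with the estimated cost of the fuel load taken back out."""
    return lap_time - fuel_time_penalty(fuel_kg)


# Invented example: 100 litres of fuel at a mid-range density.
mass = fuel_mass_kg(100, 0.75)
print(mass)                            # kg, rather than the 100 kg a 1 kg/l assumption gives
print(fuel_time_penalty(mass))         # seconds per lap
print(fuel_adjusted_time(75.0, 40.0))  # a 75 s lap set with 40 kg on board
```

Keeping every conversion between litres and kilograms inside one function is the point: the density then only needs getting right in one place.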
Something else that I need to start considering, particularly given that there won’t be any race day refuelling next year, is tyre performance (note to self: track temperature is important here). A quick scout around didn’t turn up any useful charts (I was using words like “model”, “tyre”, “performance”, “timing” and “envelope”) but what I think I want is a simple, first approximation model of tyres that models time “penalties” and “bonuses” about an arbitrary point, over number of laps, and as a function of track temperature.
For the spreadsheet, I’m thinking something like an “attack decay” or attack-decay-sustain-release (ADSR) envelope (something I came across originally in the context of sound synthesis many years ago…)
On the x-axis, I’m guessing I want laps, on the y-axis, a modifier to lap time (in seconds) relative to some nominal ideal lap time. The model should describe the number of laps it takes for the tyres to come on (a decreasing modifier to the point at which the tyres are working optimally), followed by an increasing penalty modifier as they go off.
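Here's a minimal sketch of the sort of envelope I have in mind: a piecewise linear "come on, stay on, go off" shape. The parameter names and values are all invented, the shape is a guess rather than anything based on real tyre data, and track temperature is left out completely for now.

```python
def tyre_modifier(age, warmup_laps, optimal_laps, warmup_penalty, wear_per_lap):
    """Lap time modifier in seconds (positive means slower) for a set of tyres.

    age: laps already completed on this set (0 for the first lap on them).
    warmup_laps: laps taken for the tyres to come on.
    optimal_laps: laps for which the tyres then work at their best.
    warmup_penalty: time lost on the very first lap, in seconds.
    wear_per_lap: extra time lost per lap once the tyres start to go off.
    """
    if age < warmup_laps:
        # "Attack" phase: the penalty decreases linearly to zero.
        return warmup_penalty * (1 - age / warmup_laps)
    if age < warmup_laps + optimal_laps:
        # "Sustain" phase: tyres working optimally, no modifier.
        return 0.0
    # "Release" phase: an increasing penalty as the tyres go off.
    return wear_per_lap * (age - warmup_laps - optimal_laps + 1)


# Invented parameters: 2 laps to come on, 5 good laps, then 0.1 s/lap drop-off.
for age in range(10):
    print(age, round(tyre_modifier(age, 2, 5, 1.0, 0.1), 3))
```

In spreadsheet terms this is just a lookup column of modifiers, one per lap of the stint, added to the nominal ideal lap time.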
Ho hum, quali over, so I’ve run out of time to actually do anything about any of this now… maybe tomorrow…?
For the first Formula One Grand Prix of the year, I put together a spreadsheet that would let you play the role of a pit stop fuel strategist (F1 Pit Stop Strategist – Fuel Stop Spreadsheet).
I missed the last couple of races, but I did get to see today’s, so whilst I was watching I also made a few tweaks to the spreadsheet.
First thing was to tweak the first pit stop estimator, by adding an offset that factors in a 3 lap fuel penalty to account for the procession lap, the formation lap, and some slack!
Secondly, I added a new sheet that allows you to play along with the race so that you can try to work out when all the other cars are likely to be pitting throughout the race.
This is very much the first pass of this spreadsheet – I’m not sure how the BBC calculate or guess at the amount of fuel added on the few occasions they do pop up an info bar, although they do show quite a few of the pit stop timings. So over the next few races (or maybe by watching replays – and with knowledge of when all the stops were actually taken) I’ll try to work on the formula that takes the pit stop time – or an estimate of how long the fuel hose was attached – and calculates the fuel loaded (and hence number of extra laps that car can complete).
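For what it's worth, the sort of formula I'm groping towards is sketched below. The flow rate, density and fuel use per lap in the example are placeholder guesses rather than measured values, and the function names are made up; the 3 lap offset is the one described above.

```python
def fuel_added_kg(hose_seconds, flow_rate_l_per_s, density_kg_per_l):
    """Estimate the fuel loaded from how long the hose was attached."""
    return hose_seconds * flow_rate_l_per_s * density_kg_per_l


def extra_laps(fuel_kg, kg_per_lap):
    """Whole number of laps a given fuel load is good for."""
    return int(fuel_kg // kg_per_lap)


def first_stop_lap(start_fuel_kg, kg_per_lap, offset_laps=3):
    """Estimate the first stop, less an offset for the laps run before the
    start plus some slack."""
    return extra_laps(start_fuel_kg, kg_per_lap) - offset_laps


# Placeholder inputs: an 8 s hose time, an assumed 12 l/s rig flow rate,
# an assumed density of 0.75 kg/l and an assumed 2.5 kg of fuel per lap.
added = fuel_added_kg(8.0, 12.0, 0.75)
print(added, extra_laps(added, 2.5))
print(first_stop_lap(50.0, 2.5))
```

Checking estimates like these against when the cars actually came in for their next stop would be the obvious way of tuning the guessed parameters.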
The other thing I added to the strategist spreadsheet was a display of the best sector times from each driver in Q3, charted relative to the best sector times of a nominated driver:
(Obviously, a similar chart could also be used to display the best sector times for each driver during the race.)
You can find the race day strategist spreadsheet here: Race Day Strategist Spreadsheet.
As far as post-race stats go, I was intrigued as to whether lap times show any benefit to decreasing car weight as fuel is used up each lap – so here are the time differences between consecutive laps for Button:
(For the pit stops, I limited the time to 3s.)
I’m not sure whether an improvement in lap time should be shown above the line, or below the line?