Sketching – WRC Stage Progress Chart

Picking up on some sketches I started doing around WRC (World Rally Championship) results data last year, and after rewriting pretty much everything following an update to WRC’s website that means results data can now be accessed via JSON data feeds, here are some notes on a new chart type, which I’m calling a Stage Progress Chart for now.

In contrast to one of my favourite chart forms, the macroscope, which is intended to visualise the totality of a dataset in a single graphic, this one is developed as a deltascope, where the intention is to visualise a large number of differences in same chart.

In particular, this chart is intended to show:

  • the *time difference* between a given car and a set of other cars at the *start* of a stage;
  • the *time difference* between that same given car and the same set of other cars at the *end* of a stage;
  • any *change in overall ranking* between the start and end of the stage;
  • the *time gained* or the *time lost* by the given car relative to each of the other cars over the stage.

Take 1

The data is presented on the WRC website as a series of data tables. For example, for SS4 of WRC Rally Sweden 2018 we get the following table:

The table on the left shows the stage result, the table on the right the overall position at the end of the stage. DIFF PREV is the time difference to the car ahead, and DIFF 1st the difference to the first ranked driver in the table, howsoever ranked: the rank for the left hand table is the stage position, the rank on the right hand table is overall rally position at the end of the stage.

One thing the table does not show is the start order.

The results table for SS4 is shown below. SS4 is the stage that forms the focus for this walkthrough.

Data is obtained by scraping the WRC results JSON data feeds and popping it into a SQLite3 database. A query returns the data for a particular stage:

We can pivot the data to get the total accumulated times (in ms) at the end of each stage for the current (SS4) and previous (SS3) stage for each driver onto the same row:

Obtaining the times for a target driver allows us to rebase the times relative to that driver as a Python dict:

Rebasing is simply a matter of relativising times with respect to the target driver times:

The rebased accumulated rally times for each stage are relative to the accumulated rally time for the target driver at the end of the corresponding stage. The rebased delta gives the time that the target driver either made up, or lost against, each other car on the stage. The deltaxx value is “a thing” used to support the chart plotting. It’s part of the art… ;-)

UPDATE: it strikes me that thing() is actually returning the value closest to zero if the signs are the same:

if samesign(overall,delta): return min([overall,delta], key=abs)
return delta

So, that’s data wrangly bits… here’s what we can produce from it:

The chart is rebased according to a particular driver, in this case Ott Tänak. It’s overloaded and you need to learn to read it. It’ll be interesting to see how natural it comes to read with some practise – if it doesn’t, then it’s not that useful; if it does, it may be handy for power users.

First up, what do you see?

The bars are ordered and have positive (left) and negative (right) values, with names down the side. The names are driver labels, in this case for WRC RC1 competitors. The ordering is the in overall rally class rank order at the end of the stage identified in the title (SS4). So at the end of stage SS4, NEU was in the lead of the rally class, with MIK in second.

The chart times are rebased relative to TÄN – at the moment this is not explicitly identified, though it can be read from the chart – the bars for TÄN are all set to 0.

Rebasing means that the times are normalised (rebased) relative to a particular driver (the “target driver”).

The times that are used for the rebasing are:

  • the difference between the accumulated rally stage time of the target driver and each other driver at the end of the *previous* stage;
  • the difference between the accumulated rally stage time of the target driver and each other driver at the end of the specified stage (in this case, SS4).

From these differences, calculated relative to the target driver, we can calculate the time gained (or lost) on the stage by the target driver relative to each of the other drivers.

The total length of each continued coloured bar indicates the time delta (that is, the time gained/lost by the target driver relative to each other car) on the stage between the target driver and each other car. Red means time was lost, green means time was gained.

So TÄN lost about 15s (red) to NEU on the stage, and gained about 50s (green) on AL. He also lost about 3s (pink) to MEE and about 15s to BRE (pink plus red).

The furthest extent of the light grey bar, or the solid (red / green) bar shows the overall difference to that driver from the target at the end of the stage. So NEU is 17s or so ahead and EVA is about 80s behind. Of the 80s or so EVA is behind, TÄN made about 50s of those (green) on the current stage. Of the 17s or so TÄN is behind NEU, NEU made up about 15s on the current stage, and went into the stage ahead (grey to the right) of TÄN by about 2s.

If a pastel (pink or light green) colour totally fills a bar to left or right, that is a “false time”. It doesn’t indicate the overall time at the end of the stage (the grey bar on the other side does that), bit it does allow you to read off how much time was gained / lost.

The dashed bar indicates situations where the driver started the stage relative to the target driver. So SUN started the stage about 15s behind and made up about 14s (the pink bar to the right). The grey bar to the left shows SUN finished the stage about 1s behind TÄN. PAD also started about 15s behind and made up about 15s, leaving just a fraction of a second behind (below) TÄN at the end of the stage. If you look very closely, you might just see a tiny sliver of grey to the left for PAD.

Where coloured bars straddle the zero line, that shows the the target driver has gone from being ahead (or behind) a driver at the start of the stage to being behind (ahead) of them at the end of the stage. So MIK, LAP, OST, BRE and LAT all started the stage behind TÄN but finished overall ahead at the end. (If you look at the WRC overall result table on the right for SS3 (the previous stage) you’ll see TÄN was second overall. At the end of SS4, he was in seventh, as the stage progress chart shows.)

Here’s the chart for the same stage rebased relative to PAD.

Here we see that PAD make up time on TÄN and LAT, remaining a couple of seconds behind LAT (the light green that fills the bar to the left shows this is a “false” time) and not quite getting ahead of TÄN overall (TÄN is above PAD). SUN held stay relative to PAD, but PAD took good chunks of time away from MEE all the way down (green). PAD also lost (red) a second or two to each of NEU, LAP and OST, and lost most to MIK.

Take 2

The interpretation of the pink /light green bars is a bit confusing, so let’s simplify a little to show times that should be added using the solid red/green colour, and the pastel colours purely as false times.

We can also simplify the chart symbols further by removing the dashed line and just using a solid line for bars.

Now the solid colour clearly shows the times should be added together to give the overall delta on the stage between the rebase target car and the other cars.

One thing missing from the chart is the start order, which could be added as part of the y-axis label. The driver the chart is rebased relative to should also be identified clearly, even if just in the chart title.

As to how to understanding how the chart works, one way is to think of it as using a warped hydraulic model.

For example:

  • imagine that a grey bar to the right shows how far behind a particular car the target driver is at the start of the stage; being ahead, the bar is above that of the target car:
    • the car extends its lead over the target driver during the stage, so the bar plunger is pulled to the right and it fills with red (time lost) fluid at the base. How much red fluid is the amount of additional time the target driver lost on that stage to the target bar.
    • the car reduces its overall lead over the target driver during the stage, which is to say the target driver gains time. The plunger is pushed to the left and fills with notional light green time, to the left, representing the time gained by the target driver; the actual time the target driver is behind the car is still shown by the solid grey bar to the right; the white gap at the end of the bar (similar in size to the notional light green extruded to the left) shows how much of the lead the car had at the start of the stage is has been lost during the stage. Reading off the width of the white area is har to do, which is why we map it to the notional light green time to the left;
    • the car loses all its lead and falls behind the target car, overall, by the end of the stage. In this case, the grey plunger to the right is pushed all the way down, past zero, which fills the whole bar with solid green; the bar is further pushed to the left by the amount of time the car is now behind the target car, overall, at the end of the stage. The total width of the solid green bar is the total amount of time the target car gained on the stage. The vertical positioning of the bar also falls below that of the target car to show it is now behind. That the bar is filled green to the right also indicates that the car was ahead of the target car at the end of the previous stage;
  • imagine that a grey bar to the left shows how far ahead of a particular car the target driver is at the start of the stage; behind behind, the bar is below that of the target car:
    • the car loses further ground on the stage, so the bar is extended to the left and fills with green “time gained by target car” fluid at the base;
    • the car makes some time back on the target car on the stage, but not enough to get ahead overall; the bar is pushed to the right, leaving a white space to the left indicative of the time the target car lost. The amount by which the grey still extends to the left is the time the target driver is still ahead of the car at the end of the stage. A notional pink time to the right (the same width as the white space created on the left) is shown to the right;
    • the car makes all the time it was behind at the start of the stage back, and then some; the bar moves above the target car on the vertical dimension. The plunger is pushed so far from he left it passes zero and fills the bar with solid red (time lost by target car). The amount the bar extends to the right is the time the car is ahead of the target car, overall, at the end of the stage. The total width of the completely solid red bar is the total time the target driver lost relative to that car on the stage. The fact that the bar is above the target driver and has a solid red bar that extends on the left as well as the right further shows that the car was behind the target at the start of the stage.

I’m still feeling my way with this one… as mentioned at the start, it’ll be interesting to see how natural it feels – or not – after some time spent trying to use it.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...