OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Visualising MPs’ Expenses Using Scatter Plots, Charts and Maps

A couple of days ago, the Guardian’s @datastore announced that a spreadsheet of UK MPs’ (Members of Parliament) expenses had been posted to the Guardian OpenPlatform datastore on Google Spreadsheets.

Just because, I thought it would be nice to visualise the spreadsheet using some Many Eyes Wikified charts, so I had a look at the data, and sighed a little: many of the spreadsheet cells contained a pound sign, and Many Eyes doesn’t like those – it just wants numbers… So I went in to Yahoo Pipes to create a pipe to tidy up the CSV output of the spreadsheet so I could pipe it into Many Eyes Wikified… and drew a blank: I couldn’t get the pipe to work (no CSV – just HTML; it turns out I was using the wrong URL pattern from the spreadsheet – doh!). So I exported the CSV, regexped it in a text editor, and uploaded it to create a new spreadsheet. (Which reminds me: note to self – create a tidy-upper pipe fed from the datastore and refactor the wikified data page to feed from the pipe…)
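For anyone who would rather do the tidy-up in code than in a text editor, here is a minimal Python sketch of the same idea. It is not the regular expression actually used for the post. The column names and values are invented for illustration, and it assumes amounts look like £1,234 or £56.70.

```python
import csv
import io
import re

# Matches a cell that holds nothing but a currency amount, e.g. "£1,234" or "£12.50".
AMOUNT = re.compile(r"^\s*£?\s*(-?\d[\d,]*(?:\.\d+)?)\s*$")

def clean_cell(cell):
    """Strip the pound sign and thousands separators from amount-like cells."""
    match = AMOUNT.match(cell)
    if match is None:
        return cell  # names, parties and other text cells are left untouched
    return match.group(1).replace(",", "")

def clean_csv(text):
    """Return the CSV text with every amount cell reduced to a bare number."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([clean_cell(cell) for cell in row])
    return out.getvalue()

# Invented sample rows, not taken from the real spreadsheet.
sample = 'MP,Total travel\n"Smith, Jane","£1,234"\n"Jones, Bob",£56.70\n'
print(clean_csv(sample))
```

Using the csv module rather than a blanket search and replace means commas inside quoted names survive, while commas inside amounts are dropped.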

[Many Eyes Wikified is no longer available as a service - to replicate the following visualisations, you need to upload the data to Many Eyes (the non-wikified version...). I think this is the spreadsheet I was pulling in to the Wikified service...]

So anyway, here are some interactive ways of visualising MPs’ expenses data using Many Eyes Wikified.

Firstly, a bar chart – select which expenses category you’d like to chart and then view the ranked distribution by sorting by values. If you mouse over any of the bars, you’ll see which MP made that claim:

Second up, a block histogram view. This chart is good for looking at the natural distribution of different claim categories. The search box makes it easy to search for your MP by name:

Again, mousing over any of the blocks identifies the name of the MP making that claim.

Thirdly, a scatter plot. This display lets you compare an MP’s claims across two categories, and potentially bring in a third category using the dot size:

As with the other visualisations, mouse over any point to see which MP it belongs to.

By the by, along the way I did a couple of other Yahoo pipes – one to extract expenses by MP name (which simply pulls in CSV from the spreadsheet, then filters on an MP’s surname), the other to extract MPs’ expenses by postcode. The latter pipe actually embeds the former, and works by looking up the name of the MP by postcode, using the TheyWorkForYou API; this name is then passed in to an embedded ‘expenses by name’ pipe.
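The way the postcode pipe wraps the name pipe can be sketched in a few lines of Python. This is only an illustration of the composition, not the pipes themselves: the row layout is invented, and `lookup_mp` is a hypothetical stand-in for the TheyWorkForYou postcode lookup, stubbed here with a dictionary so the example runs offline.

```python
def expenses_by_name(rows, surname):
    """Keep only the rows whose MP name contains the given surname."""
    wanted = surname.strip().lower()
    return [row for row in rows if wanted in row["MP"].lower()]

def expenses_by_postcode(rows, postcode, lookup_mp):
    """Look up the MP for a postcode, then reuse the by-name filter."""
    name = lookup_mp(postcode)        # hypothetical lookup: postcode -> MP name
    surname = name.split()[-1]        # assumes the surname is the last word
    return expenses_by_name(rows, surname)

# Invented rows and an invented postcode, for illustration only.
rows = [
    {"MP": "Jane Smith", "Total": "1234"},
    {"MP": "Bob Jones", "Total": "56.70"},
]
stub_lookup = {"AB1 2CD": "Bob Jones"}.__getitem__

print(expenses_by_postcode(rows, "AB1 2CD", stub_lookup))
```

Passing the lookup in as a function keeps the filter testable without a network call; a real version would swap the stub for a call to the API.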

Anyway, back to the viz biz: Charles Arthur generously picked up on my tweets announcing the visualisations with a blog post on the Guardian data blog (Visualising MP expenses) in a post that included the tease:

But what we need now is a dataset which shows constituency distances from Westminster, so that we can compare that against travel. And perhaps someone else can work out the travelling MPs’ carbon footprints based on whether they went by air or rail or car

No fair… Okay – so where to get the location data for each MP? Well, the TheyWorkForYou API came to my rescue again. One call (to getConstituencies) pulled out details of each constituency, which included the lat/long geo-coordinates of the ‘centre’ of each constituency (along with the co-ordinates of the bounding box round each constituency… maybe I’ll use those another time ;-) A second call (to getMPs) pulled out all the MPs, and their constituencies. Loading both sets of data into different sheets on Dabble DB meant I could then link them together by constituency name (for more on linking data in Dabble DB, see Mash/Combining Data from Three Separate Sources Using Dabble DB and Using Dabble DB in an Online Mashup Context).
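Linking the two sheets by constituency name is essentially a lookup join. Here is a rough Python sketch of that step, standing in for what Dabble DB did. The field names (`name`, `constituency`, `lat`, `lon`) and the sample records are assumptions made up for the example, not the actual fields returned by the API.

```python
def link_by_constituency(mps, constituencies):
    """Attach each constituency's centre point to the MP who represents it."""
    # Index the constituency records by name for quick lookup.
    centres = {c["name"]: (c["lat"], c["lon"]) for c in constituencies}
    linked = []
    for mp in mps:
        centre = centres.get(mp["constituency"])
        if centre is None:
            continue  # names that do not match exactly drop out of the view
        linked.append({**mp, "lat": centre[0], "lon": centre[1]})
    return linked

# Invented records, for illustration only.
mps = [
    {"name": "Jane Smith", "constituency": "Exampleton"},
    {"name": "Bob Jones", "constituency": "Sampleford North"},
]
constituencies = [{"name": "Exampleton", "lat": 52.0, "lon": -1.0}]

print(link_by_constituency(mps, constituencies))
```

Note that the join is only as good as the name matching: any difference in spelling or punctuation between the two sources silently loses a row, so it is worth counting the rows before and after.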

Adding the MP data into Dabble DB after a further bit of cleaning – removing things like Mr, Miss, and Sir from the firstnames etc – and linking by MP name meant that I could now generate a single data view that exposed MPs by name, constituency, and expense claims, along with the geolocation of the midpoint of their constituency.
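The title-stripping part of that cleaning step might look something like this in Python. It is a sketch only: the post mentions Mr, Miss and Sir, and the other titles in the pattern are my guess at what the "etc" might cover.

```python
import re

# One or more leading titles, each optionally followed by a full stop.
# Mr, Miss and Sir come from the post; Mrs, Ms and Dr are assumed extras.
TITLES = re.compile(r"^(?:(?:Mr|Mrs|Ms|Miss|Dr|Sir)\.?\s+)+", re.IGNORECASE)

def strip_title(name):
    """Remove leading honorifics so names from different sources can match."""
    return TITLES.sub("", name.strip())

# Invented names, for illustration only.
for raw in ["Mr Bob Jones", "Miss Jane Smith", "Sir Alan Example", "Mrinal Sample"]:
    print(raw, "->", strip_title(raw))
```

Requiring whitespace after the title means a first name that merely starts with the same letters (like the invented "Mrinal" above) is left alone.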

After grabbing the CSV feed out of this Dabble DB view into a pipe, and tidying up the data a little once again (eg removing commas in the formatted numbers), it was an easy matter to pull the JSON output from the pipe into a map, and plot different coloured markers depending on what ‘band’ the MPs’ total expenses fell into. Here’s a snapshot of that first map:
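The banding logic is simple enough to sketch. The real map code ran in the browser against the pipe's JSON output; the Python below just illustrates the idea, and the band thresholds, colours and field names are all hypothetical rather than the values used on the map.

```python
import json

# Hypothetical bands: (upper limit in pounds, marker colour).
BANDS = [(10000, "green"), (20000, "yellow"), (30000, "orange")]
TOP_COLOUR = "red"  # anything at or above the last limit

def parse_amount(text):
    """Turn a formatted number such as '12,345' into a float."""
    return float(text.replace(",", ""))

def marker_colour(total):
    """Pick the colour of the first band whose upper limit exceeds the total."""
    for upper, colour in BANDS:
        if total < upper:
            return colour
    return TOP_COLOUR

def to_markers(rows):
    """Build one marker record per MP, ready to serialise as JSON."""
    markers = []
    for row in rows:
        total = parse_amount(row["total"])
        markers.append({
            "name": row["name"],
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
            "total": total,
            "colour": marker_colour(total),
        })
    return markers

# Invented rows, for illustration only.
sample_rows = [
    {"name": "Jane Smith", "lat": "52.0", "lon": "-1.0", "total": "12,345"},
    {"name": "Bob Jones", "lat": "53.5", "lon": "-2.2", "total": "41,000"},
]
print(json.dumps(to_markers(sample_rows), indent=2))
```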

All well and good – what’s nice about this view is that it’s quite easy to see which MPs appear to be claiming disproportionately more than other MPs with constituencies in a similar area. (There may be good reason for this, like, err… whatever. This tool is just a starting point for sensemaking round the data, right?!;-). If you click on one of the markers you can pop up a little info window, too (rather sparse in this first demo):

In that first map, the only expenses data I was exposing, and mapping, was the total travel expenses claimed. So over a coffee this afternoon, I created a richer view, and tweaked the map code to let me inspect a couple of other data sets. You can find the map here: MPs’ travel expenses map.

So for example, we can look at mileage claims:

Or the total expenses claimed for living away from the primary home:

One thing these quick-to-put-together maps show is how powerful map based displays can be for getting a feel for local differences, where such differences exist. (There may well be a good reason for any difference, of course, including errors in the data set being used…)

It’s also interesting to use the map based displays in conjunction with other chart based visualisations, such as the MPs’ expenses visualisations on Many Eyes Wikified, to explore the data in a far more natural way than trying to make sense of a spreadsheet containing the MPs’ expenses data.

Enjoy :-)

PS the code is all as is; if it’s broken and the visualisations are consequently wrong/misleading, then I apologise in advance… ;-)

PPS See also: My Guardian OpenPlatform API’n’Data Hacks’n’Mashups Roundup, which describes 6 different recipes for playing with Guardian openplatform resources. And if you’re into F1, see Visualising Lap Time Data – Australian Grand Prix, 2009 ! ;-)

PPPS see also MPs’ Expenses by Constituency, Sort Of…, where I plot a couple of really colourful proportional symbol maps based on total travel expenses…

Written by Tony Hirst

April 2, 2009 at 11:44 pm

30 Responses


  1. This is fantastic!

    MK

    April 3, 2009 at 1:08 pm

  2. [...] has published data on each MPs’ claims. Now, it’s been combined with data from They Work For You to create a map showing MPs’ expenses claims, revealing interesting anomalies. An investigation into MPs’ [...]

    Expenses Mashup

    April 3, 2009 at 2:59 pm

  3. Very nice! Are you charging them yet?

    JTownend

    April 3, 2009 at 5:21 pm

  4. [...] describes his work here which he developed after he discovered that the expenses data was being released via Data Store. [...]

  5. “Are you charging them yet”
    If I was, what would be the going rate, what would I be charging for, and what would likely conditions arising (on both sides) be???? ;-)

    Tony Hirst

    April 3, 2009 at 6:34 pm

  6. [...] MPs’ travel expenses data was transformed into an easy-to-read Google Map by Tony Hirst, a lecturer at the Open University and Guardian reader. He picked up the figures from the [...]

  7. [...] Visualising MPs’ Expenses Using Scatter Plots, Charts and Maps via Nick Booth (tags: mp politics finance government uk visualization) [...]

  8. I was half-way doing something similar, but you’ve beaten me to it.

    One trick I was going to use, which might be helpful: the TheyWorkForYou API call to ‘getConstituencies’ also lets you ask for all constituencies within X kilometres of a given long/lat. Even better, it returns the distance from that long/lat point in the results.

    So you can ask for all constituencies within 999km of 51.5000 / -0.1246… ie the House of Commons chamber, more or less… and bingo. The distance in km, to 12 significant figures!, for every (mainland) constituency in the country. ;)

    Simon Dickson

    April 6, 2009 at 3:16 pm

  9. @simon It would be interesting to just pop that distance vs expenses data onto a scatter plot (and do the same for eg mileage and/or rail data)?

    In terms of helping see why the expenses are so, though, the raw distance measure doesn’t take into account that two people the same distance away from Westminster may actually have very different public transport options, though…

    …but it would maybe help normalise the data in terms of eg. somewhere up north being a similar distance away from London as somewhere south west… and it may also identify different travel costs if eg similar distances but from different parts of the country have different expenses associated with them?

    I guess the map way of displaying this would be to plot concentric circles over the map, centred on Westminster?

    Tony Hirst

    April 6, 2009 at 5:45 pm

  10. [...] describes his work here which he developed after he discovered that the expenses data was being released via Data Store. [...]

  11. [...] describes his work here which he developed after he discovered that the expenses data was being released via Data Store. [...]

  12. I think that’s exactly right, Tony. A simple distance-to-cost calculation won’t answer anything directly – but it would ask a few interesting questions anyway.

    Simon Dickson

    April 7, 2009 at 9:16 am

  13. Awe inspiring. Really, I am inspired. How can we make this easier for anyone else to do?

    Paul Bradshaw

    April 14, 2009 at 8:42 am

  14. [...] 0 Comments A few weeks ago, I posted several maps visualising MPs’ expenses (Visualising MPs’ Expenses Using Scatter Plots, Charts and Maps). A couple of days later, I created another map that I didn’t post at the time, partly [...]

  15. It would be nice to relate expenses claimed to the usefulness/effectiveness of the MP! For example, expenses against speeches made, bills proposed or votes cast. This way the taxpayer can judge value for money spent. Not sure how to get hold of the data to do this.

    Lesley

    May 8, 2009 at 12:29 pm

  16. [...] spreadsheet format it suddenly becomes meaningful and very shareable.) People then started finding interesting trends with very little effort. And then we got a very public flame [...]

  17. [...] map in the Guardian has been produced with the help of Tony Hirst of the Open University, who tells us more in the OUseful.Info [...]

    Wordblog

    May 14, 2009 at 9:34 am

  18. [...] – Tony Hirst has used the Guardian’s and other data to map MPs’ expense claims across the country – neatly showing the power of open datasets and the internet. Possibly related posts: [...]

  19. [...] blog ‘Ouseful’ has some very interesing visual representations of moral vacuum at Westminster’s [...]

  20. [...] you’re reading this then I reckon that’s a safe bet, then have a look at Mark Reckons, Tony Hirst, Matt Riggott and the developers at Shoothill, to pick four, have been [...]

  21. [...] Arthur over at the Guardian has a closer look.  Tony Hirst’s blog gives an excellent account on the technical efforts needed to do this. Looking at what he has done [...]

  22. [...] Arthur over at the Guardian has a closer look.  Tony Hirst’s blog gives an excellent account on the technical efforts needed to do this. Looking at what he has done [...]

  23. [...] has already produced some great work from what I once described as the “Technician” variant of distributed [...]

  24. [...] Hirst has published several maps and charts of the data allowing you to look for patterns or specific information, such as this interactive bar chart and [...]

  25. Hi,

    Good work, we have done some work on the MPs’ Expenses data and are looking to aggregate it with other data sets.

    I thought you might be interested in OpenPSI ( http://www.openpsi.org ), a collaboration between the University of Southampton and the UK government, led by the National Archive, to trial a new form of community provisioned information service.

    We are exposing UK government data using the Semantic Web standard RDF. We have a SPARQL endpoint, so data mashups can be created by issuing query requests.

    As well as providing some data sources, we are trying to spark interaction between government information providers, academic researchers and information intermediaries, specifically to bridge the gap between those researchers who may not have all the technical skills or data knowledge to answer important research questions.

    John.

    John Darlington

    September 7, 2009 at 2:51 pm

  26. [...] Visualising MPs’ Expenses Using Scatter Plots, Charts and Maps, reported here; [...]

  27. [...] from across the UK found a number of unusual claims. Others took the information about expenses and visualized it in interesting ways, ways that allowed citizens to better understand how their money was being [...]

  28. [...] broke the story about the MPs expenses claims and developers, such as Tony Hirst, subsequently provided analyses of the data once the data had been [...]

  29. [...] Analyse des dépenses des députés au Parlement britannique : la première carte montre que certains élus ont des dépenses liées à leurs déplacements largement supérieures à celles de leurs collègues demeurant dans des circonscriptions à des distances similaires de Londres. [...]


Comments are closed.
