OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Using Many Eyes Wikified to Visualise Guardian Data Store Data on Google Docs

Last week, I posted a quick demo of how to visualise data stored in a Google spreadsheet in Many Eyes Wikified (HEFCE Grant Funding, in Pictures).

The data I used was the latest batch of HEFCE teaching funding data, but Joss soon tweeted to say he’d got Research funding data up on Google spreadsheets, and could I do something with that? You can see the results here: Visualising UK HEI Research Funding data on many Eyes Wikified (Joss has also had a go: RAE: UK research funding results visualised).

Anyway, today the Guardian announced a new content API (more on that later – authorised developer keys are still like gold dust), as well as the Guardian data store (strapline: “facts you can use”) and the associated Data Store Blog.

Interestingly, the data is being stored on Google docs, in part because Google spreadsheets offer an API and a wide variety of export formats.

As regular OUseful.info readers will know, one of the export formats from Google spreadsheets is CSV – Comma Separated Values data – which just so happens to be liked by services such as Dabble DB and Many Eyes. I’ll try to come up with a demo of how to mash-up several different data sets in Dabble DB over the next few days, but as I’ve a spare half-hour now, I thought I’d post a quick demo of how to visualise some of the Guardian data store spreadsheet data in Many Eyes Wikified.

So to start, let’s look at the RAE2008 results data – University research department rankings (you can find the actual data here: http://spreadsheets.google.com/pub?key=phNtm3LmDZEM-RqeOVUPDJQ).

If you speak URL, you’ll know that you can get the CSV version of the data by adding &output=csv to the URL, like this: http://spreadsheets.google.com/pub?key=phNtm3LmDZEM-RqeOVUPDJQ&output=csv
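For anyone who would rather build the URL in code than by hand, here is a minimal Python sketch of the same step. The helper name `with_query_args` is my own invention for illustration, not part of any Google API; all it does is add arguments to the query string of the published spreadsheet URL given above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Published spreadsheet URL quoted in the post.
PUB_URL = "http://spreadsheets.google.com/pub?key=phNtm3LmDZEM-RqeOVUPDJQ"


def with_query_args(url, **extra):
    """Return `url` with extra query-string arguments added (or replaced).

    Hypothetical helper for illustration; it only manipulates the URL text.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update(extra)
    # safe=":" keeps cell ranges such as A2:H2365 readable in the URL.
    return urlunsplit(parts._replace(query=urlencode(query, safe=":")))


csv_url = with_query_args(PUB_URL, output="csv")
print(csv_url)
```

The same helper can take further arguments later on, which saves worrying about whether the next one needs a `?` or a `&` in front of it.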

Inspection of the CSV output suggests there’s some crap at the top we don’t want – i.e. not actual column headings – as well as at the end of the file:

(Note this “crap” is actually important metadata – it describes the data and its provenance – but it’s not the actual data we want to visualise).

Grabbing the actual data, without the metadata, can be achieved by requesting a particular range of cells using the &range= URL argument. Inspection of the table suggests that meaningful data can be found in the columnar range of A to H; guesswork and a bit of binary search identifies the actual range of cell data as A2:H2365 – so we can export JUST the data, as CSV, using the URL http://spreadsheets.google.com/pub?key=phNtm3LmDZEM-RqeOVUPDJQ&output=csv&range=A2:H2365.
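If you want to check what that range actually returns before handing it to a visualisation tool, something like the following Python sketch would do. The function names (`range_csv_url`, `parse_csv`, `fetch_rows`) are hypothetical, and the two-line sample is made-up placeholder text rather than real RAE data or the spreadsheet's real column headings. `fetch_rows` is defined but not called here, since a URL of this vintage may no longer resolve.

```python
import csv
import io
from urllib.request import urlopen

# Published spreadsheet URL and cell range quoted in the post.
BASE = "http://spreadsheets.google.com/pub?key=phNtm3LmDZEM-RqeOVUPDJQ"
CELL_RANGE = "A2:H2365"


def range_csv_url(base, cell_range):
    """Build the CSV export URL for one block of cells (illustrative helper)."""
    return f"{base}&output=csv&range={cell_range}"


def parse_csv(text):
    """Parse CSV text into a list of rows, each row a list of strings."""
    return list(csv.reader(io.StringIO(text, newline="")))


def fetch_rows(url):
    """Download a CSV URL and parse it (assumes UTF-8; needs network access)."""
    with urlopen(url) as response:
        return parse_csv(response.read().decode("utf-8"))


url = range_csv_url(BASE, CELL_RANGE)

# Placeholder CSV, NOT real spreadsheet content: shows that quoted commas
# inside a cell survive parsing as a single field.
sample = 'Institution,Unit\r\n"Example, University of",Physics\r\n'
rows = parse_csv(sample)
print(url)
print(rows)
```

Parsing with the `csv` module rather than splitting on commas matters for data like this, because institution names can themselves contain commas.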

If you create a new page on Many Eyes Wikified, this data can be imported into a wiki page there as follows:

We can now use this data page as the basis of a set of Many Eyes visualisations. Noting that the “relative URL address” of the data page is ousefulTestboard/GuardianUKRAERankings2008 (the full URL of the wikified data page is http://manyeyes.alphaworks.ibm.com/wikified/ousefulTestboard/GuardianUKRAERankings2008), create a new page and put a visualisation placeholder or two in it:

Saving that page – and clicking through on the visualisation placeholder links – means you can now create your visualisation (Many Eyes seems to try to guess what visualisation you want if you use an appropriate visualisation name?):

Select the settings you want for your visualisation, and hit save:

A visualisation page will be created automatically, and a smaller, embedded version of the visualisation will appear in the wiki page:

If you visit the visualisation page – for example, this Treemap visualisation – you should find it is fully interactive, which means you can explore the data for yourself, as I’ll show in a later post…

See more examples here: RAE 2008 Treemap; RAE 2008 Bubble Diagram; RAE 2008 bar chart.

Written by Tony Hirst

March 10, 2009 at 5:56 pm

Posted in Data, Visualisation


6 Responses


  1. Hi there, hoping to play with all of this really soon, but in the meantime just wanted to thank you for all the really useful posts about Many Eyes, Dabble, etc. Really great to see how easy it is to do these days, although the screenshots really do paint a thousand words…

    Cheers,
    Graham

    Graham

    March 11, 2009 at 9:15 am

  2. [...] (Almost) code-free guide to visualising Guardian Data Store information using Many Eyes [...]

  3. [...] Published March 11, 2009 CandS_HowTo , Data , Tinkering , Visualisation Tags: CandS_HowTo In Using Many Eyes Wikified to Visualise Guardian Data Store Data on Google Docs I showed how to pull data from Google spreadsheets (uploaded there by the Guardian as part of their [...]

  4. [...] Using Many Eyes Wikified to Visualise Guardian Data Store Data on Google Docs: using a Guardian data in a particular Google docs spreadsheet, generate a URL that emits CSV formatted data from a specified region of cells in the spreadsheet and consume the live CSV data feed in Many Eyes wikified to visualise the data. Nuggets: how to generate the URL for the CSV output from a Google spreadsheet over a range of specified cells, and then consume that data in Many Eyes Wikified. Use the Many Eyes wikified data as the basis for several Many Eyes wikified visualisations. [...]

  5. [...] we can get just the banks and assets as a CSV file by adding &output=csv&range=B2:C51 (via OUseful.Info). import urllib2, csv url = [...]

  6. [...] I also managed something similar in wikified with automatic updates of share prices. Tony’s excellent guidance explains how to export as csv etc I don’t know if it will be the same for regular Many Eyes, [...]

