This is a stub post as much as anything to help me keep tabs on a technique I’ve not had any time to properly play with, let alone document: consuming Linked Data in a Google spreadsheet.
First up – why might this be useful? In short, I think that many people who might want to make use of data.gov.uk data are probably comfortable with spreadsheets but not with code, so giving them access to SPARQL and RDF is not necessarily useful.
So whilst the recipe shown here is a hacky one, at least it opens up the playground a little to explore what the issues might be – and what things might be facilitated by – providing fluid, live data routes from RDF triple stores into spreadsheets.
So here’s what I’m aiming for – some data from the education datastore in a spreadsheet:
And how did it get there? Via a CSV import using the =importData formula.
And where did the CSV come from? The sparqlproxy webservice:
As well as giving CSV output, the serivce can also gnerate HTML and a variety of JSON formats, including the MIT Simile and Googl Viz API formats (which means it’s easy to just plug other data into a wide variety of visualisation formats.
To get the data into a Google spreadsheet, simply copy the CSV URI into an =importData(“CSV_URI_HERE”) formula in a spreadsheet cell.
The sparqlproxy service can also pull in and transform queries that have been posted on the web:
So for example, in the above case the query at http://data-gov.tw.rpi.edu/sparql/stat_ten_triples_default.sparql looks like:
What this means is that someone else can write complex queries and mortals can then access the data and display it however they want. (What I’d really like to see is a social site that supports the sharing of endpoint/query pairs for particular queries (I could probably even hack something to do this using delicious?) ;-)
Once the data is in the spreadsheet, it can be played with in the normal way of course. So for example, I can query the spreadsheet using my old prototype Guardian datastore explorer:
In the above example, the route is then:
1) a sparql query onto http://services.data.gov.uk/education/sparql is run through
2) the http://data-gov.tw.rpi.edu/ws/sparqlproxy.php sparqlproxy service to produce a CSV output that is
3) pulled into a Google spreadsheet using an =importData() formula, and in turn
4) queried using my Google Datastore explorer using the Google visualisation API query language and then
5) rendered in a Google Visualisation table widget.