Mulling Over =datagovukLookup() in Google Spreadsheets

I’ve been fighting with Linked Data again… Today, I thought I’d have a quick look at how easy it would be to write a couple of Google Apps Script functions (e.g. as used in Creating a Winter Olympics 2010 Medal Map In Google Spreadsheets) to do something like the =googlelookup formula in Google Spreadsheets (as described in =GoogleLookup: Creating a Google Fact Engine Directory, for example).

What that function does is let you run a query on a particular property of a given search term, =googlelookup(“Brazil”, “capital”), for example.

So what I was thinking was defining a Google Apps Script function along the lines of:
=datagovuk_education_schoolLookup(id,arg)
which would take a school ID (such as http://education.data.gov.uk/id/school/135503) and then return some property of it, such as school name, (i.e. establishmentName, schoolCapacity and so on).

The function would then construct an appropriate SPARQL query, run it using the Google Apps Script UrlFetchApp() function, parse out the result, and return it to the spreadsheet cell that called the formula.

Unfortunately, it seems as if UrlFetchApp() doesn’t like the lonnnnnngggg SPARQL query URIs I was wrting to call the SPARQLProxy service I usually turn to (not least because it usually makes a pretty good attempt at giving a CSV output if you ask it to!) – requests fail with the long query but work okay if I use a minified URI for the same query.

I think that UrlFetchApp() works with POST, but I haven’t tried that yet (as with many Google services, their docs’n’tutorial materials assume you work for Google and can just make sense of technical documentation; I prefer cut’n’pasting Hello World examples and then slowly complicating them;-)

And so, because there’s nothing obvious for me to cut and paste, I’m going to call it a day and go and make some pancakes…:-)

PS as to why look at writing spreadsheet formulae to mask Linked Data datastore calls? Because most people who work with data work with it in spreadsheets. Approximately none are RDF speaking, SPARQL querying A-list geek developers. In fact, using spreadsheet formulas is quite possibly pushing it a bit far in terms of complexity because the approach I’m taking will need lots of documentation (“what was the thing for school name, again?”)

PPS I’m still finding it hard to find example SPARQL queries too… I also note that no Linked Data folk appear to have picked up on Brian Kelly’s challenge? Brian, do I win if I can work out a solution using stuff from the Guardian Datastore and Google Spreadsheets, or are you only accepting Proper Linked Data solutions?!;-)

7 comments

  1. Wilbert

    The following SPARQL query worked for me:

    PREFIX dbpedia:
    PREFIX dbpedia-owl-ed:
    PREFIX dbpedia-owl:
    SELECT ?name ?capacity WHERE { ?subject dbpedia-owl-ed:city dbpedia:Sheffield .
    ?subject rdf:type dbpedia-owl:School .
    ?subject rdfs:label ?name .
    OPTIONAL { ?subject dbpedia-owl:students ?capacity }}

    Executed on http://dbpedia.org/sparql

    The data may not be entirely complete (don’t think dbpedia has all the capacity data for example), but it’s good enough. If you look at multiple examples of schools (e.g. http://dbpedia.org/class/yago/SchoolsInSheffield), then you can tweak the query for better results fairly easily.

    The W3C SPARQL spec has remarkably clear and stealable examples: http://www.w3.org/TR/rdf-sparql-query/

  2. Pingback: Using Data From Linked Data Datastores the Easy Way… « OUseful.Info, the blog…
  3. Pingback: Response To My Linked Data Challenge « UK Web Focus
  4. Pingback: Grabbing “Facts” from the Guardian Datastore with a Google Spreadsheets Formula « OUseful.Info, the blog…
  5. Pingback: Open data equals open scrutiny but doesn’t always equal all of the answer JISC CETIS MASHe
  6. Pingback: Grabbing “Facts” from the Guardian Datastore with a Google Spreadsheets Formula | OUseful.Info, the blog...