OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Last Week’s Football Reports from the Guardian Content Store API (with a little dash of SPARQL)

A big :-) from me today – at last I think I’ve started to get my head round this mashup malarkey properly… forget the re-presentation stuff, the real power comes from using one information source to enrich another… but as map demos are the sine qua non of mashup demos, I’ll show you what I mean with a map demo…

So to start, here’s a simple query on the Guardian content store API for football match reports:

http://api.guardianapis.com/content/search?
filter=/football&filter=/global/matchreports&after=20090314&api_key=MYSECRETACTIVATEDKEY

It’s easy enough to construct the query URI using a relative date in the Yahoo pipe, so the query will always return the most recent match reports (in this case, match reports since “last Saturday”):
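Outside Pipes, the same relative-date trick can be sketched in a few lines of Python. This is only an illustration: the endpoint and parameter names are copied from the query above, while the function names and the choice to step back a full week when today is itself a Saturday are assumptions made here.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Endpoint as quoted in the post; the API may have changed since.
API_ENDPOINT = "http://api.guardianapis.com/content/search"


def last_saturday(today: date) -> date:
    """Return the most recent Saturday strictly before `today`."""
    # date.weekday() counts Monday as 0, so Saturday is 5.
    days_back = (today.weekday() - 5) % 7
    # Assumption: on a Saturday, "last Saturday" means a week ago.
    return today - timedelta(days=days_back or 7)


def match_report_query(today: date, api_key: str) -> str:
    """Build the match report query URI for reports since last Saturday."""
    params = [
        ("filter", "/football"),
        ("filter", "/global/matchreports"),
        ("after", last_saturday(today).strftime("%Y%m%d")),
        ("api_key", api_key),
    ]
    # safe="/" keeps the slashes in the filter values readable.
    return API_ENDPOINT + "?" + urlencode(params, safe="/")


if __name__ == "__main__":
    print(match_report_query(date.today(), "MYSECRETACTIVATEDKEY"))
```

Called with 18 March 2009 as the date, this produces the `after=20090314` value used in the query above.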

It’s easy enough to use these results to generate an RSS feed of the most recent match reports:

Pulling the images in as Media RSS (eg media:group) elements means that things like the Google Ajax slide show control and the Pipes previewer can automatically generate a slideshow for you…
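For anyone wanting to do this outside Pipes, here is a rough sketch of an RSS item carrying its images in a `media:group`. The shape of the `report` dictionary (`headline`, `url`, `summary`, `images`) is made up for the example and is not the Guardian API's response format; only the Media RSS namespace URI is the real one.

```python
import xml.etree.ElementTree as ET

# Media RSS namespace; registering it makes the output use the "media:" prefix.
MEDIA_NS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MEDIA_NS)


def report_to_item(report: dict) -> ET.Element:
    """Turn one (hypothetically shaped) match report into an RSS <item>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = report["headline"]
    ET.SubElement(item, "link").text = report["url"]
    ET.SubElement(item, "description").text = report["summary"]
    if report.get("images"):
        # One media:group per item, one media:content per image.
        group = ET.SubElement(item, f"{{{MEDIA_NS}}}group")
        for image_url in report["images"]:
            ET.SubElement(
                group, f"{{{MEDIA_NS}}}content", url=image_url, medium="image"
            )
    return item


if __name__ == "__main__":
    sample = {
        "headline": "Example match report",
        "url": "http://example.com/report",
        "summary": "Placeholder summary text.",
        "images": ["http://example.com/a.jpg", "http://example.com/b.jpg"],
    }
    print(ET.tostring(report_to_item(sample), encoding="unicode"))
```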

You can also get the straight feed of course:

A little bit of tinkering with the creation of the description element means we can bring the original byline and match score in to the description too:

Inspecting the API query results by eye, you might notice that a lot of the bylines have the form “John Doe at the Oojamaflip Stadium”:

Hmmm…

It’s easy enough to exploit this structural pattern to grab the stadium name using a regular expression or two:
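As a rough idea of what such a regular expression might look like (this is not the expression used in the pipe, and the pattern and function name are assumptions for illustration), the "reporter at the stadium" convention can be picked apart like this:

```python
import re
from typing import Optional

# Assumed byline shape: "<reporter> at [the] <stadium>".
BYLINE_PATTERN = re.compile(
    r"^(?P<reporter>.+?)\s+at\s+(?:the\s+)?(?P<stadium>.+?)\s*$"
)


def stadium_from_byline(byline: str) -> Optional[str]:
    """Return the stadium named in a byline, or None if the pattern is absent."""
    match = BYLINE_PATTERN.match(byline)
    return match.group("stadium") if match else None


if __name__ == "__main__":
    print(stadium_from_byline("John Doe at the Oojamaflip Stadium"))
```

Bylines that do not follow the convention simply come back as `None`, so those reports can be passed through without a location.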

I then did a little experiment running the names of the stadia, and the names of the stadia plus “football ground, UK”, through the Yahoo Location Extractor block to try to plot the stories on map locations corresponding to the football ground locations, but the results weren’t that good…

…so I tweeted:

And got a couple of responses…

The XQuery/DBpedia with SPARQL – Stadium locations link looked pretty interesting, so I tweaked the example query on that page to return a list of English football stadia and their locations:

PREFIX p: <http://dbpedia.org/property/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE
{?ground skos:subject <http://dbpedia.org/resource/Category:Football_venues_in_England>.
?ground geo:long ?long.
?ground geo:lat ?lat.
?ground rdfs:label ?groundname.
FILTER (lang(?groundname) ='en').
}

and created a pipe to call DBpedia with that query (dbpedia example – English football stadium location lookup pipe):
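For reference, calling a SPARQL endpoint amounts to an HTTP GET with the query passed as a URL parameter. The sketch below only builds the request URL. It assumes the public DBpedia endpoint at `http://dbpedia.org/sparql` and its `format` parameter for choosing the result type; the unused `p:` prefix is dropped, and `skos:subject` is kept as in the 2009 query even though current DBpedia data may need a different predicate.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public DBpedia SPARQL service.
SPARQL_ENDPOINT = "http://dbpedia.org/sparql"

# The stadium query from the post, minus the unused "p:" prefix.
STADIA_QUERY = """\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?ground skos:subject <http://dbpedia.org/resource/Category:Football_venues_in_England> .
  ?ground geo:long ?long .
  ?ground geo:lat ?lat .
  ?ground rdfs:label ?groundname .
  FILTER (lang(?groundname) = 'en')
}
"""


def dbpedia_request_url(
    query: str, result_format: str = "application/sparql-results+json"
) -> str:
    """Return a GET URL that asks the endpoint to run `query`."""
    # urlencode percent-encodes the whole query so it survives as one parameter.
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": result_format})


if __name__ == "__main__":
    print(dbpedia_request_url(STADIA_QUERY))
```

Fetching that URL (for example with `urllib.request.urlopen`) should return the result rows, assuming the endpoint is up and still accepts the query as written.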

Because I don’t know how to write SPARQL, I wasn’t sure how to tweak the query to return just the record for a given stadium name (feel free to comment telling me how ;-) – so I used a pipe filter block to filter the results instead. (This combination of search and filter can be a very powerful one when you don’t know how to phrase a particular query, or when a query language doesn’t support a search limit you want…)
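The search-then-filter pattern is simple enough to sketch: fetch the whole list, then keep only the rows whose name contains the term you are after. The row layout and the sample rows below are invented for illustration (with placeholder coordinates) and are not real DBpedia output.

```python
from typing import Dict, List


def filter_by_name(rows: List[Dict[str, str]], wanted: str) -> List[Dict[str, str]]:
    """Keep rows whose 'groundname' contains `wanted`, ignoring case."""
    needle = wanted.casefold()
    return [row for row in rows if needle in row["groundname"].casefold()]


if __name__ == "__main__":
    # Invented rows with placeholder coordinates, standing in for query results.
    rows = [
        {"groundname": "Oojamaflip Stadium", "lat": "0.0", "long": "0.0"},
        {"groundname": "Thingummy Park", "lat": "1.0", "long": "1.0"},
    ]
    print(filter_by_name(rows, "oojamaflip"))
```

The cost is that the full result set is fetched every time, which is fine for a list of a few hundred stadia but would not scale to a large query.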

It was now a simple matter to add this pipe in to geocode the locations of the appropriate stadium for each match report:
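In code terms, this step is a lookup join: each report's stadium name is used as a key into the stadium locations, and reports without a match are passed through unlocated. A minimal sketch, with a made-up report structure and a made-up lookup table keyed by lower-cased stadium name:

```python
from typing import Dict, List, Tuple


def geocode_reports(
    reports: List[dict], locations: Dict[str, Tuple[float, float]]
) -> List[dict]:
    """Copy each report, adding 'lat' and 'long' when its stadium is known."""
    geocoded = []
    for report in reports:
        annotated = dict(report)  # leave the caller's dictionaries untouched
        stadium = report.get("stadium") or ""
        location = locations.get(stadium.casefold())
        if location is not None:
            annotated["lat"], annotated["long"] = location
        geocoded.append(annotated)
    return geocoded


if __name__ == "__main__":
    # Placeholder coordinates, not a real stadium location.
    locations = {"oojamaflip stadium": (1.5, -2.5)}
    reports = [
        {"title": "Report A", "stadium": "Oojamaflip Stadium"},
        {"title": "Report B", "stadium": None},
    ]
    for row in geocode_reports(reports, locations):
        print(row)
```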

So let’s recap – we call the Guardian content API for match reports since “last Saturday” and construct a nice RSS feed from it, with description text that includes the byline and match score, as well as the match report. Then we pull out the name of the stadium each match was played at (relying on the convention, which seems to hold much of the time, that the byline records the stadium) and pass it through another pipe that asks DBpedia for a list of English football stadium locations, and then filters that list down to the one we want.

Tweak the location data to a form Yahoo pipes likes (which means it will create a nice geoRSS or KML feed for us) and what do we get? Map based match reports:
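Pipes generates the KML itself, but the output is simple enough to sketch by hand: one Placemark per located report. The field names on the report dictionaries are assumptions for the example. One detail worth knowing is that KML writes coordinates as longitude first, then latitude.

```python
import xml.etree.ElementTree as ET
from typing import List

# KML 2.2 namespace, registered as the default so tags print without a prefix.
KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)


def reports_to_kml(reports: List[dict]) -> str:
    """Serialise located reports (hypothetical dict shape) as a KML document."""
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for report in reports:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = report["title"]
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = report["description"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML order is longitude,latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{report['long']},{report['lat']}"
        )
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Placeholder report with placeholder coordinates.
    print(reports_to_kml([
        {"title": "Report A", "description": "Match report text", "lat": 1.5, "long": -2.5}
    ]))
```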

As I’ve shown in this blog many times before, it’s easy enough to grab a KML feed from the More options pipe output and view the results elsewhere:

(Click on a marker on the Google map and it will pop up the match report.)

So what do we learn from this? Hmmm – that I need to learn to speak SPARQL, maybe?!

PS @kitwallace has come up trumps with a tweak to the SPARQL query that will do the query by stadium name in one:
FILTER (lang(?groundname) = 'en' && regex(?groundname, 'Old Trafford')). Ta, muchly :-)
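If the stadium name is going to be dropped into that filter automatically, it needs escaping first, both as a regular expression and as a SPARQL string literal. The helper below is a sketch under those assumptions. The function names are invented here, and the escaping covers only the common cases (regex metacharacters, backslashes and single quotes).

```python
import re

# Characters with special meaning in a regular expression.
_REGEX_META = re.compile(r"([\\.\[\]{}()*+?^$|])")


def sparql_regex_literal(text: str) -> str:
    """Quote `text` so a SPARQL regex() matches it literally."""
    pattern = _REGEX_META.sub(r"\\\1", text)  # escape regex metacharacters
    # Then escape for the single-quoted SPARQL string literal.
    escaped = pattern.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def stadium_query(name: str) -> str:
    """Build the stadium lookup query restricted to one ground name."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT * WHERE {\n"
        "  ?ground skos:subject "
        "<http://dbpedia.org/resource/Category:Football_venues_in_England> .\n"
        "  ?ground geo:long ?long .\n"
        "  ?ground geo:lat ?lat .\n"
        "  ?ground rdfs:label ?groundname .\n"
        "  FILTER (lang(?groundname) = 'en' && "
        f"regex(?groundname, {sparql_regex_literal(name)}))\n"
        "}\n"
    )


if __name__ == "__main__":
    print(stadium_query("Old Trafford"))
```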

Written by Tony Hirst

March 18, 2009 at 9:53 am

Posted in CandS_HowTo, Pipework, Tinkering


8 Responses


  1. Very nicely done! Regarding your tweet about some canned SPARQLs and tags from Guardian API, maybe using the set of classes in DBpedia would provide some useful filtering to get canned SPARQLs started. DBpedia’s RDF server (Virtuoso) also has a special ‘bif:contains’ predicate that works similarly to the regex filter:

    ?article bif:contains "your string"

    So if you have a tag, say, “football” maybe start with something like this:

    PREFIX dbpprop:
    PREFIX dbpont:

    select distinct ?name

    where {
    ?res a dbpont:Person ;
    dbpprop:name ?name ;
    dbpprop:abstract ?abs .
    ?abs bif:contains "football" .

    FILTER (lang(?abs) = 'en')
    }

    LIMIT 100

    ?res a dbpont:Person says to look for people — there are lots of other classes in DBpedia that you could use to filter, then
    ?res bif:contains “your tag” digs through, similar to the regex.

    Then, for different searches, you might just be adding in a new predicate instead of dbpprop:name, as in what you have above.

    I’ve found, though, that I need to remember that it’s still based on a wiki — what people have entered for the same property can differ wildly (e.g., sometimes the object is a literal, and sometimes is a link to another wikipedia page).

    Hope that helps!
    Patrick

    Patrick Murray-John

    March 18, 2009 at 2:30 pm

  2. [...] Search « Last Week’s Football Reports from the Guardian Content Store API (with a little dash of S… [...]

  3. [...] The Guardian opened up its API a little while ago (was it last week, I wonder?) and now things that use the API are starting to appear. Here, at any rate, is a particularly good description of how a map of football news is put together.  [...]

  4. [...] Last Week’s Football Reports from the Guardian Content Store API (with a little dash of SPARQL): use a Yahoo pipe to pull football match reports “since last Saturday” from the content API, extract the name of the stadium each match was played in, and use it as the search term in a SPARQL query over a DBpedia page listing the locations of each English football stadium. Retrieve the lat/lon geo-co-ordinates for each stadium from the SPARQL query, and associate them with the match reports. Plot the resulting feed of geo-annotated match reports on a Google map. Nuggets: running a SPARQL query over DBpedia from a Yahoo pipe. [...]

  5. Arghh – deleted this trackback by mistake:

    “I’m a Hatters fan, not a Gooner. Do I really need to know where The Emirates is? No. I want to know where my County match report is. And it ain’t going to be on a Guardian API…”[http://outwithabang.rickwaghorn.co.uk/?p=269 ]

    Tony Hirst

    March 26, 2009 at 6:01 pm

  6. Well, these are interesting thoughts. I think they are true. However, everything is
    relative and ambiguous to my mind.

    AlexSorent

    April 8, 2009 at 1:33 pm

  7. [...] So for example, in a recent workshop I demonstrated the Last Week’s Football Reports from the Guardian Content Store API (with a little dash of SPARQL). [...]

  8. [...] done the odd demo of how to use SPARQL in a Yahoo Pipe before (Last Week’s Football Reports from the Guardian Content Store API (with a little dash of SPARQL), which is not about the football, right?) but a tweet last week tipped me off to a potentially more [...]

