Hackable SPARQL Queries: Parameter Spotting Tutorial

Whenever I come across a new website or search tool, one of the first things I do is have a look at the URIs of resource pages and search results to see: a) whether I can make sense of them (that is, are they in any sense human readable), and b) whether they are “hackable”, to the extent that I can change certain parts of the URI in a particular way and have a pretty good idea what the resulting page will look like.

If the URI is hackable, then it often means that it can be parameterised, in the sense that I can construct valid URIs from some sort of template within which part of the URI path, or one of the URI arguments, is replaced using a variable that can be assigned a particular value as required.

So for example, a search for the term ouseful in Google delivers a results page with a URI that looks like:
http://www.google.com/search?client=safari&rls=en&q=ouseful&ie=UTF-8&oe=UTF-8

Comparing the search term that I entered (ouseful) with the URI, it’s easy to see how the search term is used in order to create the results page URI:
http://www.google.com/search?client=safari&rls=en&q=SEARCH_TERM_HERE&ie=UTF-8&oe=UTF-8
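A minimal Python sketch of the same template idea, assuming the other arguments (client, rls, ie, oe) can simply be carried over unchanged from the URI seen in the browser. The function name google_search_url is made up for illustration.

```python
from urllib.parse import urlencode

def google_search_url(search_term):
    """Fill the q= slot of the observed results page URI."""
    # Arguments copied from the observed URI; only q varies.
    args = [
        ("client", "safari"),
        ("rls", "en"),
        ("q", search_term),
        ("ie", "UTF-8"),
        ("oe", "UTF-8"),
    ]
    # urlencode escapes any characters that are not safe in a query string
    return "http://www.google.com/search?" + urlencode(args)

print(google_search_url("ouseful"))
```

Letting urlencode build the argument string means a search term containing spaces or ampersands still produces a valid URI.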

This technique applies equally to looking at SPARQL search queries, so here’s a worked through example that makes use of a query on the Talis n2 blog (I tend to use SparqlProxy for running SPARQL queries):
#List the uri, latitude and longitude for road traffic monitoring points on the M5
PREFIX road: <http://transport.data.gov.uk/0/ontology/roads#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX geo: <http://geo.data.gov.uk/0/ontology/geo#>
PREFIX wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?point ?lat ?long WHERE {
?x a road:Road.
?x road:number "M5"^^xsd:NCName.
?x geo:point ?point.
?point wgs84:lat ?lat.
?point wgs84:long ?long.
}

Looking carefully at the descriptive comment:

#List the uri, latitude and longitude for road traffic monitoring points on the M5

and the query:

...
?x road:number "M5"^^xsd:NCName.
...

we see how it is possible to parameterise the query such that we can replace the “M5” string with a variable and use it to pass in the details of (presumably) any UK road number.
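Outside of Pipes, the same substitution can be sketched in a few lines of Python. This is only an illustration: the ROADNUMBER placeholder, the function name and the letters-and-digits check are my assumptions, not part of the original query.

```python
import re

# The query from above, with the road number swapped for a placeholder
QUERY_TEMPLATE = """\
PREFIX road: <http://transport.data.gov.uk/0/ontology/roads#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX geo: <http://geo.data.gov.uk/0/ontology/geo#>
PREFIX wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?point ?lat ?long WHERE {
?x a road:Road.
?x road:number "ROADNUMBER"^^xsd:NCName.
?x geo:point ?point.
?point wgs84:lat ?lat.
?point wgs84:long ?long.
}"""

def road_points_query(roadnum):
    """Return the monitoring points query for the given road number."""
    # Assumption: road numbers are plain letters and digits (M5, A1, ...).
    # Rejecting anything else stops a stray quote from breaking the query.
    if not re.fullmatch(r"[A-Za-z0-9]+", roadnum):
        raise ValueError("unexpected road number: %r" % roadnum)
    return QUERY_TEMPLATE.replace("ROADNUMBER", roadnum)

print(road_points_query("A1"))
```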

In Yahoo Pipes, here’s what the parameterisation looks like – we construct the query string and pass in a value for the desired road number from a user text input (split the query string after ?x road:number "):

The rest of the pipe is built around the SPARYQL pattern that I have described before (e.g. Getting Started with data.gov.uk, Triplr SPARYQL and Yahoo Pipes):

By renaming the latitude and longitude value elements as y:location.lat and y:location.lon, the pipe infrastructure can do the rest itself and provide us with a map based preview of the pipe output, as well as a KML output that can be viewed in Google Maps (simply paste the KML URI into the Google Maps search box and use it as the search term) or Google Earth, for example:

Inspection of the pipe’s KML output URL:
http://pipes.yahoo.com/pipes/pipe.run?
_id=78f6547cc12ac3ebcb84144ec3e37205
&_render=kml&roadnum=M5

shows that it is also hackable. Can you see how to change it so that it will return the traffic monitoring points on the A1, bearing in mind it currently refers to the M5?
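If you want to check your answer, here is one way the swap might be scripted in Python rather than done by hand. The helper name with_argument is hypothetical, and the sketch only rewrites the URL; it does not fetch anything.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# The pipe's KML output URL, joined back into a single string
KML_URL = ("http://pipes.yahoo.com/pipes/pipe.run?"
           "_id=78f6547cc12ac3ebcb84144ec3e37205"
           "&_render=kml&roadnum=M5")

def with_argument(url, name, value):
    """Return url with the query argument `name` set to `value`."""
    parts = urlsplit(url)
    # Keep every argument as it was, except the one being replaced
    args = [(k, value if k == name else v) for k, v in parse_qsl(parts.query)]
    return urlunsplit(parts._replace(query=urlencode(args)))

print(with_argument(KML_URL, "roadnum", "A1"))
```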

So there we have it – given an example SPARQL query for road traffic monitoring locations on the M5, we can parameterise the query by observation and construct a pipe that gives a map based preview, as well as a KML version of the output, all in less time than it takes to document how it was done… :-)

Here’s another example. This time the original query comes from @tommyh (geeky related stuff here;-); the query pulls a list of motorway service station locations from dbpedia:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX yago-class: <http://dbpedia.org/class/yago/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?services ?label ?road ?lat ?long
WHERE {
?services dbpprop:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:infobox_motorway_services> .
?services rdfs:label ?label
OPTIONAL {
?services dbpprop:road ?road .
?services dbpprop:lat ?lat .
?services dbpprop:long ?long .
} .
FILTER (isIRI(?road)) .
}
ORDER BY ASC(?label)

The results look like:

So how can we tweak the original query to search for motorway services on the M1? By inspection of the query, we see the search is looking for services on any ?road (and more than that, on any isIRI(?road), whatever that means?!;-) Looking at the results, we see that the roads are identified in the form:
<http://dbpedia.org/resource/M40_motorway>

So we can tweak the query with an additional condition that requires a particular road. For example:

WHERE {
?services dbpprop:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:infobox_motorway_services> .
?services rdfs:label ?label .
?services dbpprop:road <http://dbpedia.org/resource/M1_motorway>
OPTIONAL {
?services dbpprop:lat ?lat .
?services dbpprop:long ?long .
}
}

(I think we can drop the original FILTER too?)

To parameterise this query, we just need to feed in the desired road number here:

<http://dbpedia.org/resource/ROADNUMBER_motorway>
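As a rough Python sketch, filling that slot might look like the following. The function name and the check that the input looks like a plain motorway number are my assumptions; the IRI pattern itself is just the one observed in the results.

```python
import re

def motorway_iri(roadnum):
    """Build a dbpedia motorway IRI such as <.../M1_motorway>."""
    # Assumption: only plain motorway numbers (M1, M40, ...) follow this
    # pattern, so anything else is rejected rather than guessed at.
    if not re.fullmatch(r"M[0-9]+", roadnum):
        raise ValueError("not a motorway number: %r" % roadnum)
    return "<http://dbpedia.org/resource/%s_motorway>" % roadnum

# The triple pattern that pins the query to one road
print("?services dbpprop:road " + motorway_iri("M1"))
```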

Alternatively, we can hack in a regular expression to filter the results by road number – e.g. using the M1 again:

WHERE {
?services dbpprop:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:infobox_motorway_services> .
?services rdfs:label ?label .
?services dbpprop:road ?road
OPTIONAL {
?services dbpprop:lat ?lat .
?services dbpprop:long ?long .
} .
FILTER (isIRI(?road) && regex(?road,"M1_")) .
}

This time, the parameterisation would occur here:
FILTER (isIRI(?road) && regex(?road,"ROADNUMBER_"))

Note that if we just did the regular expression on “M1” rather than “M1_” we’d get back results for the M11 etc as well…
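The difference is easy to try out locally. Here is a small Python check, using re.search as a stand-in for SPARQL's regex() (both match anywhere in the string); the three road IRIs are sample values following the pattern seen in the results.

```python
import re

# Sample road IRIs in the form seen in the query results
roads = [
    "http://dbpedia.org/resource/M1_motorway",
    "http://dbpedia.org/resource/M11_motorway",
    "http://dbpedia.org/resource/M40_motorway",
]

def matching(pattern):
    """Return the last path segment of each road IRI the pattern matches."""
    return [r.rsplit("/", 1)[1] for r in roads if re.search(pattern, r)]

print(matching("M1"))   # ['M1_motorway', 'M11_motorway']
print(matching("M1_"))  # ['M1_motorway']
```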

In the spirit of exploration, let’s see if we can guess at/pattern match towards a little bit more. (Note guessing may or may not work – but if it doesn’t, you won’t break anything!)

The line:
?services rdfs:label ?label
would seem to suggest that human readable labels corresponding to URI identifiers may be recorded using the rdfs:label relation. So let’s see:

Create a ?roadname variable in the query and see if ?road rdfs:label ?roadname manages to pull out a useful label:
SELECT ?services ?label ?roadname ?road ?lat ?long
WHERE {
?services dbpprop:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:infobox_motorway_services> .
?services rdfs:label ?label .
?services dbpprop:road ?road .
?road rdfs:label ?roadname
OPTIONAL

Ooh… that seems to work (in this case, at least… maybe it’s a dbpedia convention, maybe it’s a general convention, who knows?!:-)

But it’s a little messy, with different language variants also listed. However, another trick in my toolbox is memory. I remember seeing a filter option in a query once before:
&& lang(?someLabel)='en'

Let’s try it – change the filter terms to:
FILTER (isIRI(?road) && regex(?road,"M1_") && lang(?roadname)='en') .
and see what happens:

So now I have a query that I can use to find motorway service station locations on a particular UK motorway, and get the name of the motorway back as part of the results. And all with only a modicum of knowledge/understanding of SPARQL… Instead, I relied on pattern matching, a memory of a fragment of a previous query and a bit of trial and error…
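For reference, here is a Python sketch that pieces the fragments above into one parameterised query and wraps it in a request URL. Several things here are assumptions on my part: that the OPTIONAL block is the same as in the earlier versions, that the original ORDER BY is kept, the ROADNUMBER placeholder, and the use of the public dbpedia endpoint at http://dbpedia.org/sparql rather than SparqlProxy. The sketch only builds the URL; it does not send it.

```python
from urllib.parse import urlencode

# Assembled from the fragments above; ROADNUMBER is the slot to fill.
# Some stricter endpoints may want regex(str(?road), ...) instead.
FINAL_QUERY = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpprop: <http://dbpedia.org/property/>
SELECT ?services ?label ?roadname ?road ?lat ?long
WHERE {
?services dbpprop:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:infobox_motorway_services> .
?services rdfs:label ?label .
?services dbpprop:road ?road .
?road rdfs:label ?roadname
OPTIONAL {
?services dbpprop:lat ?lat .
?services dbpprop:long ?long .
} .
FILTER (isIRI(?road) && regex(?road,"ROADNUMBER_") && lang(?roadname)='en') .
}
ORDER BY ASC(?label)"""

def services_query(roadnum):
    """Return the services query for one motorway, e.g. 'M1'."""
    return FINAL_QUERY.replace("ROADNUMBER", roadnum)

def request_url(query, endpoint="http://dbpedia.org/sparql"):
    """Wrap a query in a GET request URL for a SPARQL endpoint."""
    # "query" is the argument name defined by the SPARQL protocol
    return endpoint + "?" + urlencode({"query": query})

url = request_url(services_query("M1"))
print(url[:60] + "...")
```

The resulting URL is itself hackable in the sense described at the top of the post: the road number is visible inside the encoded query argument.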

PS If you want to try out hacking around with a few other SPARQL queries, I’ve started collecting some likely candidates: Bookmarking and Sharing Open Data Queries

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
