First Signs (For Me) of Linked Data Being Properly Linked…?!

As anyone who’s followed this blog for some time will know, my relationship with Linked Data has been an off and on again one over the years. At the current time, it’s largely off – all my OpenRefine installs seem to have given up the ghost as far as reconciliation and linking services go, and I have no idea where the problem lies (whether with the plugins, the installs, with Java, with the endpoints, with the reconciliations or linkages I’m trying to establish).

My dabblings with pulling data in from Wikipedia/DBpedia to Gephi (eg as described in Visualising Related Entries in Wikipedia Using Gephi and the various associated follow-on posts) continue to be hit and miss due to the vagaries of DBpedia and the huge gaps in infobox structured data across Wikipedia itself.

With OpenRefine not doing its thing for me, I haven’t been able to use it as the glue to bind together queries made across different Linked Data services, albeit in piecemeal fashion. And from the occasional sideline view I have of the Linked Data world, I haven’t seen any obvious way of actually linking data sets other than by pulling identifiers from one service into a new OpenRefine column (or wherever), then using those identifiers to pull data from another endpoint into another column, and so on…
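In code terms, that column-by-column approach is a join on a shared identifier. Here's a minimal sketch, assuming each service's response has already been flattened into rows. The rows, the example.org URIs and the link_by_identifier helper are all made up for illustration:

```python
# Hypothetical rows standing in for what two separate services might return.
# The URIs and labels are placeholders, not real data.
bathing_waters = [
    {"name": "Beach One", "district": "http://example.org/id/district-a"},
    {"name": "Beach Two", "district": "http://example.org/id/district-b"},
    {"name": "Beach Three", "district": "http://example.org/id/district-a"},
]
districts_in_region = [
    {"district": "http://example.org/id/district-a", "districtname": "District A"},
]


def link_by_identifier(left_rows, right_rows, key):
    """Inner join of two lists of dicts on a shared identifier column."""
    # Index the right-hand rows by identifier so each lookup is a dict access.
    lookup = {row[key]: row for row in right_rows}
    # Keep only left-hand rows whose identifier also appears on the right.
    return [{**row, **lookup[row[key]]} for row in left_rows if row[key] in lookup]


linked = link_by_identifier(bathing_waters, districts_in_region, "district")
for row in linked:
    print(row["name"], "-", row["districtname"])
```

Each extra service means another fetch and another join like this one, which is the piecemeal part.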

So all is generally not well.

However, a recent post by the Ordnance Survey’s John Goodwin (aka @gothwin) caught my eye the other day: Federating SPARQL Queries Across Government Linked Data. It seems that federated queries can now be made across several endpoints.

John gives an example using data from the Ordnance Survey SPARQL endpoint and an endpoint published by the Environment Agency:

“The Environment Agency has published a number of its open data offerings as linked data … A relatively straight forward SPARQL query will get you a list of bathing waters, their name and the district they are in.

“[S]uppose we just want a list of bathing water areas in South East England – how would we do that? This is where SPARQL federation comes in. The information about which European Regions districts are in is held in the Ordnance Survey linked data. If you hop over to the Ordnance Survey SPARQL endpoint explorer you can run [a] query to find all districts in South East England along with their names …

“Using the SERVICE keyword we can bring these two queries together to find all bathing waters in South East England, and the districts they are in:”

And here’s the query John shows; the SERVICE clause hands the district patterns over to the Ordnance Survey SPARQL endpoint:

SELECT ?x ?name ?districtname WHERE {
  ?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
  ?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
  ?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .
  SERVICE <http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql> {
    ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within> <http://data.ordnancesurvey.co.uk/id/7000000000041421> .
    ?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtname .
  }
} ORDER BY ?districtname
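A query like this doesn't have to be pasted into an endpoint's web explorer. Here's a minimal sketch of sending one over the standard SPARQL protocol using only the Python standard library. The endpoint URL is a placeholder (example.org) and the function names are my own; I'm also assuming the endpoint can return results in the SPARQL JSON format:

```python
import json
import urllib.parse
import urllib.request


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(payload):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return the flattened rows (needs network access)."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=60) as resp:
        return rows_from_results(json.load(resp))


# Placeholder endpoint and a trivial query, just to show the request being built.
request = build_request(
    "http://example.org/sparql", "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
)
print(request.full_url)
```

The client sends a single request. Any SERVICE clause in the query is evaluated by the endpoint that receives it, so the federation happens on the server side rather than in a script like this.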

In a follow-on post, John goes even further “by linking up data from Ordnance Survey, the Office of National Statistics, the Department of Communities and Local Government and Hampshire County Council”.

So that’s four endpoints – the original one against which the query is first fired, and three others…

SELECT ?districtname ?imdrank ?changeorder ?opdate ?councilwebsite ?siteaddress WHERE {
  ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within> <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
  ?district a <http://data.ordnancesurvey.co.uk/ontology/admingeo/District> .
  ?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtname .
  SERVICE <http://opendatacommunities.org/sparql> {
    ?s <http://purl.org/linked-data/sdmx/2009/dimension#refArea> ?district .
    ?s <http://opendatacommunities.org/def/IMD#IMD-rank> ?imdrank . 
    ?authority <http://opendatacommunities.org/def/local-government/governs> ?district .
    ?authority <http://xmlns.com/foaf/0.1/page> ?councilwebsite .
  }
  ?district <http://www.w3.org/2002/07/owl#sameAs> ?onsdist .
  SERVICE <http://statistics.data.gov.uk/sparql> {
    ?onsdist <http://statistics.data.gov.uk/def/boundary-change/originatingChangeOrder> ?changeorder .
    ?onsdist <http://statistics.data.gov.uk/def/boundary-change/operativedate> ?opdate .
  }
  SERVICE <http://linkeddata.hants.gov.uk/sparql> {
    ?landsupsite <http://data.ordnancesurvey.co.uk/ontology/admingeo/district> ?district .
    ?landsupsite a <http://linkeddata.hants.gov.uk/def/land-supply/LandSupplySite> .
    ?landsupsite <http://www.ordnancesurvey.co.uk/ontology/BuildingsAndPlaces/v1.1/BuildingsAndPlaces.owl#hasAddress> ?siteaddress .
  }
}

Now we’re getting somewhere….

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
