A v. quick post this one, because I have other stuff that really needs to be done, but it’s something I want to record as another couple of observations around the practical difficulties of engaging with Linked Data…
Firstly, identifiers for things most of us would probably call councils. The Guardian Datablog has just published data/details of the local council cuts. The associated Datastore Spreadsheet has a column containing council identifiers, as well as the council names:
Adding formal identifiers such as these is something I keep hassling Simon Rogers and the @datastore team about, so it’s great to see the use of a presumably standardised identifier there:-) Only – I can’t see how to join it up to any of the other myriad identifiers that seem to exist for council areas?
So for example, looking up Trafford on the National Statistics Linked Data endpoint identifies it as local-authority-district/00BU and Local education authority 358 – but I can’t find R342 anywhere? Nor does R342 appear as an identifier on the OpenlyLocal page for Trafford Council, which is another default place I go to when looking for bridging/linking information (but then, maybe a local authority is not a council?)
(A use case for the data might be taking the codes and using them to colour areas on an Ordnance Survey OpenSpace map (ans. 1.17)… This requires a bridge into the namespaces the OS mapping tools recognise.)
I can google “Trafford R342” and find a couple of other references to this association, but I can’t find a way of linking to entities I know about in the Linked Data world?
But then, maybe the R*** areas don’t match any of the administrative areas that are recorded in any of the other data sources I found…?
So I have an identifier, but I don’t know what it actually refers to/links to, and I don’t know how to make use of it?
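To make the missing bridge concrete, here’s a minimal sketch of the sort of crosswalk lookup I’d want to be able to run. The scheme labels (`guardian_code`, `ons_lad`, `lea`) and the `translate` helper are names I’ve made up for illustration, and the single row simply assumes that R342, 00BU and 358 all refer to the same Trafford, which is exactly the thing I can’t currently confirm:

```python
# A hand-built crosswalk between identifier schemes.
# ASSUMPTION: the scheme labels are hypothetical, and the row asserts that
# R342, 00BU and 358 all denote the same area, which is unverified.
CROSSWALK = [
    {"name": "Trafford", "guardian_code": "R342", "ons_lad": "00BU", "lea": "358"},
]


def translate(code, from_scheme, to_scheme, rows=CROSSWALK):
    """Return the to_scheme identifier for the row whose from_scheme value
    equals code, or None if the crosswalk holds no such bridge."""
    for row in rows:
        if row.get(from_scheme) == code:
            return row.get(to_scheme)
    return None


if __name__ == "__main__":
    # Bridge from the spreadsheet code to the local-authority-district code.
    print(translate("R342", "guardian_code", "ons_lad"))
    # A code with no row in the crosswalk has nowhere to go.
    print(translate("R999", "guardian_code", "ons_lad"))
```

The point being that somebody has to build and publish that table; without it, the identifier column is a dead end.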
And then there’s a second, related problem – a mismatch between the popular understanding of a term/concept and its formal use in a defined ontology, which can cause all sorts of problems when naively trying to make use of formally defined data…
Take, for example, the case of counties. Following a brief Twitter exchange this morning with the ever helpful @gothwin, it turns out that if you live in somewhere like Southampton (or another unitary authority or metropolitan district), you don’t live in a county… (for example – compare the Ordnance Survey pages for postcode areas SO16 4GU and EX1 1HD). The notion of counties is apparently just a folk convention now, although the Association of British Counties is trying to “promote awareness of the continuing importance of the 86 historic (or traditional) Counties of Great Britain… contend[ing] that Britain needs a fixed popular geography, one divorced from the ever changing names and areas of local government but, instead, one rooted in history, public understanding and commonly held notions of cultural identity.” Which is why they “seek to fully re-establish the use of the Counties as the standard popular geographical reference frame of Britain and to further encourage their use as a basis for social, sporting and cultural activities”. (@gothwin did hint that OS might be “look[ing] at publishing a ‘people’s geography’ with traditional counties”.)
As it is, for a naive developer (or random tinkerer, such as myself) struggling to get to grips with the mechanics of Linked Data, it seems that to make any use at all of government Linked Data, you also need a pretty good grasp of the data models before you randomly try hacking together queries or linking stuff together, as the nightmare exposure I had to COINS Linked Data suggests… ;-)
In other words, there are at least two major barriers to entry to using government Linked Data: on the one hand, there’s getting comfortable enough with things like SPARQL to be able to navigate Linked Data datasets and put together sensible queries (the technical problem); on the other hand, there’s understanding the data model and the things it models well enough to articulate even natural language questions that might be asked of a dataset (a domain expertise problem). (And as we try to link across datasets, the domain expertise problem just compounds?) Then all that remains is mapping the natural language query onto the formal query, given the definitions of ontologies being used…
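As a rough illustration of that last mapping step, here’s a sketch that turns the natural language question “what is the thing with code 00BU called?” into a SPARQL query string. The `label_for_code_query` helper is my own invention, and the choice of `skos:notation` as the property that carries the code is purely an assumption about how the data might be modelled; guess the wrong property and you get a perfectly well-formed query that returns nothing, which is rather the point:

```python
# Standard vocabulary namespaces used in the generated query.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def label_for_code_query(code, code_predicate="skos:notation"):
    """Build a SPARQL SELECT asking for anything carrying `code`, plus its label.

    ASSUMPTION: code_predicate names the property that holds the code in the
    target dataset; skos:notation is only a guess at the data model.
    """
    # Only allow plain alphanumeric codes so nothing can break out of the
    # quoted literal in the query.
    if not code.isalnum():
        raise ValueError("code must be alphanumeric")
    prefix_lines = "\n".join(
        f"PREFIX {prefix}: <{uri}>" for prefix, uri in PREFIXES.items()
    )
    return f"""{prefix_lines}
SELECT ?thing ?label WHERE {{
  ?thing {code_predicate} "{code}" ;
         rdfs:label ?label .
}}"""


if __name__ == "__main__":
    print(label_for_code_query("00BU"))
```

Writing the string is the easy bit (the technical problem); knowing which property and which class of thing to ask about is the domain expertise problem.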
(I know, I know – it’s always rash to query data you don’t understand… but I think a point I’m trying to make is that getting your head round Linked Data is made doubly difficult when things don’t work not because of the way you’ve written the query, but because you don’t understand the way the data has been modeled… (which ends up meaning it is a problem with the way you wrote the query, just not the way you thought…!))