A few days ago, I had reason to start pondering URI schemes for open data released by educational institutions. The OU, like a couple of other HEIs, is looking at structuring – and opening up – various sorts of data, and there are also mutterings around what a data.ac.uk styled site might have to offer.
Being a lazy sort, it seems to me that in figuring out how we might collate data from across the ac.uk environment, we could look to the gov.uk environment. So, for example, data.gov.uk acts as a central index over data from both central and local government. Each has its own concerns, and bodies of the same type are likely to share some common features: all local councils will have some of the same sort of data to share, government departments might share some requirements for consistent, centralised reporting (such as website costs and usage) as well as their own peculiar data releases, and so on. In the ac.uk context, we have the HEIs (and FE colleges) in one set, and research councils and other project related funding bodies in another.
If we look to local council data, we can also spot intermediate layers appearing that apply a canonical structure to a range of variously published data from the local councils. For example, Openly Local is making a play to act as the canonical source for a whole range of local council data across all the UK’s councils; the Police API “allows you to retrieve information about neighbourhood areas in all 43 English & Welsh police forces”; RateMyPlace is a “one stop shop for information on Food Safety Inspections in Staffordshire”, aggregating information from several councils and representing it via a single API; and so on. (For an example of how different councils can publish ostensibly the same data in a wide variety of formats, see Library Location Data on data.gov.uk.)
Looking at the list of local councils with open data sites as collected on the OpenlyLocal open data scoreboard (and as extracted from the OpenlyLocal API via a Yahoo Pipe), are any conventions emerging in the location of local council open data homepages?
– http://www.aberdeencity.gov.uk/open_data/open_data_home.asp (Aberdeen City Council)
– http://www.bournemouth.gov.uk/Data/ (Bournemouth Borough Council)
– http://www.bristol.gov.uk/opendata (Bristol City Council)
– http://www.darlington.gov.uk/Generic/Info/opendata.htm (Darlington Borough Council)
– http://www.eaststaffsbc.gov.uk/opendata/Pages/default.aspx (East Staffordshire Borough Council)
– http://eastsussex.gov.uk/about/standards/opendata.htm (East Sussex County Council)
– http://www.eden.gov.uk/about-this-site/open-data/ (Eden District Council)
– http://data.london.gov.uk/ (Greater London Authority)
– http://picandmix.org.uk/ (Kent County Council)
– http://www2.lichfielddc.gov.uk/data/ (Lichfield District Council)
– http://data.lincoln.gov.uk/ (Lincoln City Council)
– http://www.brent.gov.uk/xml (London Borough of Brent)
– http://www.hillingdon.gov.uk/data (London Borough of Hillingdon)
– http://www.sutton.gov.uk/index.aspx?articleid=10077 (London Borough of Sutton)
– http://www.rbwm.gov.uk/web/transparency.htm (Royal Borough of Windsor and Maidenhead)
– http://www.salford.gov.uk/opendata.htm (Salford City Council)
– http://www.stratford.gov.uk/opendata (Stratford-on-Avon)
– http://www.sunderland.gov.uk/localpublicdata (Sunderland City Council)
– http://www.trafford.gov.uk/opendata/ (Trafford Council)
– http://opendata.walsall.org.uk/ (Walsall Metropolitan Borough Council)
– http://opendata.warwickshire.gov.uk/ (Warwickshire County Council)
– http://www.westberks.gov.uk/index.aspx?articleid=20365 (West Berkshire Council)
With only a small number of councils fully engaged with open data so far, no dominant top level naming scheme has yet appeared, although there are a couple of early runners:
- /opendata [3] (e.g. http://www.stratford.gov.uk/opendata)
- /data [2] (e.g. http://www.hillingdon.gov.uk/data)
- data. [2] (e.g. http://data.london.gov.uk/)
- opendata. [2] (e.g. http://opendata.walsall.org.uk/)
As yet, there is no agreement around the following naming approaches, each of which appears only once:
- /opendata/Pages [1] (e.g. http://www.eaststaffsbc.gov.uk/opendata/Pages/default.aspx)
- /Data [1] (e.g. http://www.bournemouth.gov.uk/Data/)
- /localpublicdata [1] (e.g. http://www.sunderland.gov.uk/localpublicdata)
- /xml [1] (e.g. http://www.brent.gov.uk/xml)
- /open_data [1] (e.g. http://www.aberdeencity.gov.uk/open_data/)
Several other councils appear, at the moment, to be offering a specific page to handle open data issues (e.g. http://www.salford.gov.uk/opendata.htm or http://www.westberks.gov.uk/index.aspx?articleid=20365), or even separate domains for their data site (e.g. http://picandmix.org.uk/).
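A tally like the one above can be reproduced with a few lines of Python. This is only a rough sketch: the `convention` function is a hypothetical heuristic of my own (use the first host label if it contains "data", otherwise the first path segment), and the URL list is a subset of the homepages listed earlier. It ignores deeper pages such as /about/standards/opendata.htm, which would simply be reported by their first path segment.

```python
from collections import Counter
from urllib.parse import urlparse

# A subset of the council open data homepages listed in the post.
URLS = [
    "http://www.bristol.gov.uk/opendata",
    "http://www.stratford.gov.uk/opendata",
    "http://www.trafford.gov.uk/opendata/",
    "http://www.hillingdon.gov.uk/data",
    "http://www2.lichfielddc.gov.uk/data/",
    "http://data.london.gov.uk/",
    "http://data.lincoln.gov.uk/",
    "http://opendata.walsall.org.uk/",
    "http://opendata.warwickshire.gov.uk/",
    "http://www.bournemouth.gov.uk/Data/",
    "http://www.brent.gov.uk/xml",
]


def convention(url):
    """Return a label for the naming convention a URL appears to use.

    Hypothetical heuristic: a first host label containing "data"
    (e.g. "data." or "opendata.") counts as a subdomain convention;
    otherwise the first path segment (case preserved) is used.
    """
    parts = urlparse(url)
    host_label = parts.hostname.split(".")[0]
    if "data" in host_label:
        return host_label + "."
    segments = [s for s in parts.path.split("/") if s]
    return "/" + segments[0] if segments else "/"


counts = Counter(convention(u) for u in URLS)

# Print the conventions, most common first.
for name, n in counts.most_common():
    print(f"{name} [{n}]")
```

Path segments are compared case-sensitively, so /Data and /data are counted as different conventions, as they are in the lists above.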
Does any of this matter? At the top level, I’m not sure it does, except in setting expectations and providing a sound footing for a scalable URI scheme. The Cabinet Office Guidance on designing URI sets, which outlines many considerations that need to be taken into account when defining URI schemes, particularly for use as identifiers in RDF-inspired Linked Data, suggests that domains should “[e]xpect to be maintained in perpetuity” and that “the choice of domain should provide the confidence to the consumer, …, the domain itself … convey[ing] an assurance of quality and longevity.”
I suspect that, pragmatically, the majority of data released in the short term will be published as Excel spreadsheets or informally formatted CSV/TSV data, with some sites publishing raw XML. (As Library Location Data on data.gov.uk describes, even when councils ostensibly release the same sort of data, there is no guarantee that they will do it in similar ways: of the 5 councils publishing the locations of local libraries, 5 different data formats were used…) It is unlikely that councils will be early adopters of Linked Data across the board. (If they were, it might be seen as excluding users in the short term, because while many people are familiar with working with spreadsheets (a widely adopted “end user” technology for people who work with data in their day job), familiar routes in to and out of Linked Data stores are not there yet…) That said, if local councils do end up wanting to publish data with well formed URIs into the Linked Data space, it would be handy if their current URI scheme was designed with that in mind, and in such a way that the minting of future Linked Data URIs isn’t likely to conflict or clash.
I think that sites designed from the ground up with linked data practice in mind should integrate the HTML as just one more part of the URI scheme, as the BBC do for some parts of their site.
However, this requires skill and planning of a degree that it is not reasonable to expect every council to be able to hire in. They need to be able to hire middle of the price scale staff and get good value for the public money spent.
In cases where the HTML and data sites diverge in both content and management, I think it makes sense to keep the URI schemes separate. The same applies when retrofitting linked data to highly established sites.
I like the convention data.somename.gov.uk as it gives a very distinct clue for non data-minded people that they are not in Kansas any more. Only a very small portion of the population will use direct access to open government data, but most will benefit from the improvements it brings in transparency, inter-office communication and mashups provided by the hackorati.
There is already a proposal in place that “open data” be a recognised “service” (LGSL, the Local Gov Service List), which means it has a “PID” (identifying number). I understand that number is 1465.
Spotted this earlier this week:
http://www.communities.gov.uk/localdirectgov/localconnects/1687776/
So, armed with that number and the number of the council (let’s say it is 123), there is a magic lookup service coming out of local.direct.gov.uk, driven by data that councils themselves must submit to the esd-toolkit. Matching the two should lead to the page in question on that council’s website.
But that URL is so secret I cannot even find it myself again now. Ah-ha! Here’s the skinny:
http://www.communities.gov.uk/localdirectgov/aboutus/