I spent much of yesterday trying to come up with some sort of storyline for a presentations I’m giving in at a Manchester Digital Open Data: Concept & Practice event tomorrow evening on Open Civic Data, quite thankful that I’d bought myself a bit of slack with the original title: “Open Data Surprise”…
Anywhere, a draft of the slides as here (Manchester opendata presentation):
though as ever, they don’t necessarily give the full picture without me talking over them…
The gist of the presentation is basically as follows: there is an increasing number of local council websites out there making data available. By data, I mean stuff that developers, or tinkerers such as myself, can wrangle with, or council officers actually use as part of their job. The data that’s provided can be thought of along a spectrum ranging from fixed archival data (i.e. reports, things that aren’t going to change) to timely data, such as dates of forthcoming council elections, or details of current or planned roadworks. Somewhere in-between is data that changes on a regular cycle, such as the list of councillors, for example. The most timely of all data is live data, such as bus locations, or their estimated arrival time at a particular bus stop.
Lots of councils are starting to offer “data” via maps. What they are actually doing is providing information in a natural way, which is not a Bad Thing, although it’s not a Data Thing – if I wanted to create my own map view of the data, it’s generally not easy for me to do so. Providing data files that contain latitude and longitude is one way that councils can make the data available to users, but there is then a barrier to entry in terms of who can realistically make use of that data. Publishing geo-content in the KML format is one way we can can improve this, because tools such as Google Earth provide a ready way of rendering KML feeds that is accessible to many more users.
As a pragmatist, I believe that most people who use data do so in the context of a spreadsheet. This suggests that we need to make data available in that format, notwithstanding the problems that might arise from the difficulties of keeping an audit trail of the origins of that data in that format once files become merged. As a realist, I appreciate that most people don’t know that it’s possible to visualise their data, or what sorts of insight might be afforded by visualising the data. Nor do they know how it is becoming increasingly easy to create visualisations on top of data presented in an appropriate way. Just as using KML formats allows almost anyone to crate their own google Map by pasting a KML file URL into a Google maps search box, so the use of simple data formats such as CSV allow users to pull data into visualisation environments such as IBM’s Many Eyes. (For more on this line of thinking, see Programming, Not Coding: Infoskills for Journalists (and Librarians..?!;-)). As a data junkie, I think that the data should be “linkable” across data sets, and also queryable. As a contrarian, I think that Linked Data is maybe not the way forward at this time, at least in terms of consumer end-user evangelism/advocacy… (see also: So What Is It About Linked Data that Makes it Linked Data™?
The data that is starting to be published by many councils typically corresponds to the various functions of the council – finance, education, transport, planning, cultural and leisure services, for example. Through publishing this data in an open way, third party vertical providers can both aggregate data as information across councils, as well as adding contextual value. Some councils are entering into partnership with other councils to develop vertical services to which the council can publish it’s data, before pulling it back into the council’s own website via a data feed. And as to whose data it is anyway, it might be ours, but it’s also theirs: data as the business of government. Which makes me think: the most effective council data stores will be the ones that are used by councils are data consumers in their own right, rather than just as data publishers*.
(* There is a corollary here with open educational resources, I think? Institutions that make effective use of OERs are institutions who use at least their own OERs, as well as publishing them…?)
Recent communications from Downing Street suggest the new coalition government is serious in its aim to open up public data (though as @lesteph points out, this move towards radical transparency is not without its own attendant risks), so data releases are going to happen. The question is, are we going to help that data flow so that it can get to where it needs to go?