First Sightings of the Data Strategy Board

Via a BIS press release earlier this week – Better access to public sector information moves a step closer – it seems that the Data Strategy Board is on its way, along with a Public Data Group and an Open Data User Group (these are separate from the yet to be constituted Open Standards Board (if you’re quick, the deadline for membership of the board is tomorrow: Open Standards Board – Volunteer Members and Board Advisers, – Ref:1238758) and its feeder Open Data Standards, and Open Technical Standards panels).

So what does the press release promise?

A new independently chaired Data Strategy Board (DSB) will advise Ministers on what data should be released [will this draw on data requests made to data.gov.uk, I wonder? – TH] and has the potential to unlock growth opportunities for businesses across the UK. At least one in three members of the DSB will be from outside government, including representatives of data re-users.

The DSB will work with the Public Data Group (PDG) – which consists of Trading Funds the Met Office, Ordnance Survey, Land Registry and Companies House – to provide a more consistent approach to improving access to public sector information. These organisations have already made some data available, which has provided opportunities for developers and entrepreneurs to create imaginative ways to develop or start up their own businesses based on high quality data.

Looking at the Terms of reference for the Data Strategy Board & the Public Data Group, we can broadly see how they’re organised:

Three departmental agendas then…?! A good sign, or, erm..?! (I haven’t read the Terms of reference properly yet – that’s maybe for another post…)

How these fit in with the Public Sector Transparency Board and the Local Public Data Panel, I’m not quite sure, though it might be quite interesting to try and map out the strong and weak ties between them once their memberships are announced? It’d also be interesting to know whether there’d be any mechanism for linking in with open data standards recommendations and development (via the Standards Hub process) to ensure that as and when data gets released, there is at least an eye towards releasing it in a usable form!

The Government is making £7m available from April 2013 for the DSB to purchase additional data for free release from the Trading Funds and potentially other public sector organisations, funded by efficiency savings. An Open Data User Group, which will be made up of representatives from the Open Data community, will be directly involved in decisions on the release of Open Data, advising the DSB on what data to purchase from the Trading Funds and other public organisations and release free of charge.

So the PDG is a pseudo-cartel of sort-of government data providers (the Trading Funds) who are being given £7 million or so to open up data that the public purse (I think?) paid them to collect. The cash is there to offset the charges they would otherwise have made selling the data. (Erm… so, in order for those agencies to give their data away for free, we have to pay them to do it? Right… got it…) Presumably, the DSB members won’t be on the ODUG who will be advising the DSB on what data to purchase from the Trading Funds and other public organisations and release free of charge (my emphasis). Note the explicit recognition here that free actually costs. In this case, public bodies are having data that central gov paid them to collect bought off them by central gov, so that central gov (or the bodies themselves) can then release it “for free”? Good. That’s clear then…

Francis Maude also clarifies this point: “The new structure for Open Data will ensure a more inclusive discussion, including private sector data users, on future data releases, how they should be paid for and which should be available free of charge.”

In addition: The DSB will provide evidence on how data from the Trading Funds – including what is released free of charge – will generate economic growth and social benefit. It will act as an intelligent customer advising Government on commissioning and purchasing key data and services from the PDG, and ensuring the best deal for the taxpayer. So maybe this means the Public Sector Transparency Board will now focus more on “public good” and “transparency” arguments, leaving the DSB to demonstrate the financial returns of open data?

The Open Data User Group (ODUG) [will] support the work of the new Data Strategy Board (DSB). [The position of Chair of the group is currently being advertised, if you fancy it…: Chair of Open Data User Group, – Ref:1240914 -TH]. The ODUG will advise the DSB on public sector data that should be prioritised for release as open data, to the benefit of the UK.

As part of the process, an open suggestion site has been set up using the Delib Dialogue app to ask “the community” How should the Open Data User Group engage with users and re-users of Open Data?: [i]n advance of appointing a Chair and Members of the group, the Cabinet Office wants to bring together suggestions for how the ODUG should go about this engagement with wider users and re-users. We are looking for ideas about things like how the ODUG should gather evidence for the release of open data, how it should develop its advice to the DSB, how it should run its meetings and how it should keep the wider community up to date on developments (as well as other ideas you have).

A Twitter account has also been pre-emptively set up to manage some of the social media engagement activities of the group: @oduguk

The account currently has just over a couple of hundred followers, so I grabbed the list of all the folk they follow, then graphed folk followed by 30 or more current followers of @oduguk.

Here’s the graph, laid out in Gephi using a force directed layout, with nodes coloured according to modularity group and sized by eigenvector centrality:

Here’s the same graph with nodes sized by betweenness centrality:
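For what it’s worth, the filtering step described above (keeping only the accounts followed by 30 or more of @oduguk’s current followers) can be sketched in a few lines of Python. This is a minimal sketch rather than the script I actually used: the function names, the `friends_by_follower` dictionary (each follower’s screen name mapped to the list of accounts they follow) and the choice of follower-to-account edges are all assumptions made for illustration, and fetching the lists from Twitter is left out entirely.

```python
from collections import Counter


def commonly_followed(friends_by_follower, threshold=30):
    """Return the accounts followed by at least `threshold` followers.

    `friends_by_follower` is a hypothetical dict mapping each follower's
    screen name to the list of accounts that follower follows.
    """
    counts = Counter()
    for friends in friends_by_follower.values():
        # Use a set so each follower counts at most once per account.
        counts.update(set(friends))
    return {account for account, n in counts.items() if n >= threshold}


def filtered_edges(friends_by_follower, keep):
    """Yield (follower, account) pairs for accounts in the `keep` set."""
    for follower, friends in friends_by_follower.items():
        for friend in set(friends):
            if friend in keep:
                yield (follower, friend)
```

The resulting edge list could then be written out as a two column CSV file and loaded into Gephi for the layout, modularity and centrality steps.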

By the by, responses to the Data Policy for a Public Data Corporation consultation have also been published, along with the Government response, which I haven’t had a chance to read yet… If I get a chance, I’ll try to post some thoughts/observations on that alongside a commentary on the terms of reference doc linked to above somewhere…

Government Communications – Department Press Releases and Autodiscoverable Syndication Feeds

A flurry of articles earlier this week (mine will be along shortly) about the Data Strategy Board all broadly rehashed the original press release from BIS. Via the Cabinet Office Transparency minisite, I found a link to the press release via the COI News Distribution Service…

…whereupon I noticed that the COI – Central Office of Information – is to close at the end of this month (31 March 2012), taking with it the News Distribution Service for Government and the Public Sector (soon to be ex- of http://nds.coi.gov.uk/).

In its place is the following advice: “For government press releases please follow this link to find the department that you require http://www.direct.gov.uk/en/Dl1/Directories/A-ZOfCentralGovernment/index.htm” This leads to a set of alphabetised pages with links to the various government departments… i.e. it points to a starting point for likely fruitless browsing and searching if you’re after aggregated press releases from gov departments.

(I’m not sure where News Sauce: UK Government Edition gets its data from, but if it’s by scrapes of departmental press releases rather than just scraping and syndicating the old COI content, then it’s probably the site I’ll be using to keep tabs on government press releases.)

FWIW, centralisation and aggregation are not the same in terms of architectures of control. Aggregation (then filter on the way out, if needs be) can be a really really useful way of keeping tabs on otherwise distributed systems… I had a quick look to see whether anyone was scraping and aggregating UKGov departmental press releases on Scraperwiki, but only came up with @pezholio’s LGA Press Releases scraper…

An easier way would be to hook up my feed reader to an OPML bundle that collected together RSS/Atom feeds of news releases from the various government websites. I’m not sure if such a bundle is available anywhere (if you know of one, please add a link in the comments below), but if gov departments: 1) publish RSS/Atom feeds containing their press releases; 2) make these feeds autodiscoverable via their homepages; and 3) ensure that said feeds are reliably identifiable as press release/media release feeds, it wouldn’t be too hard to build a simple OPML feed generator.
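By way of illustration, here’s a minimal sketch of what such an OPML generator might look like in Python, using only the standard library. The `build_opml` function name and the example feed URL are made up for the purposes of the sketch (I don’t have a real list of departmental press release feeds to hand), and it assumes the feed names and URLs have already been collected somehow.

```python
import xml.etree.ElementTree as ET


def build_opml(feeds, title="UK government press release feeds"):
    """Build an OPML document from (name, feed_url) pairs.

    `feeds` is an iterable of hypothetical (name, url) tuples.
    """
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, feed_url in feeds:
        # One <outline> element per feed; xmlUrl carries the feed address.
        ET.SubElement(
            body,
            "outline",
            {"type": "rss", "text": name, "title": name, "xmlUrl": feed_url},
        )
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    # The URL below is a placeholder, not a real departmental feed.
    print(build_opml([("Example Department", "https://example.org/press.rss")]))
```

Saving the output as a `.opml` file should give something that most feed readers can import as a bundle.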

So for example, trawling through old posts, I note that the post 404 “Page Not Found” Error pages and Autodiscoverable Feeds for UK Government Departments used a Yahoo Pipes pipe to try to automatically audit feed autodiscovery on UK gov departmental homepages, though it may well have rotted by now. If I was to fix it, I’d probably reimplement it in Scraperwiki, as I did with my UK HEI feed autodiscovery thang (UK university autodiscoverable RSS Feeds (Scraperwiki scraper), and Scraperwiki View; about: Autodiscoverable Feeds and UK HEIs (Again…)). If you beat me to that, please post a link to your scraper below;-)
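If feed autodiscovery is new to you, the idea is that a homepage advertises its feeds via `<link rel="alternate">` elements in the page head, with a `type` of `application/rss+xml` or `application/atom+xml`. A rough sketch of the audit step in Python (not the original Yahoo Pipes or Scraperwiki code, and with the `FeedLinkParser` and `discover_feeds` names invented for the example) might look something like the following. It works on an HTML string, so fetching each departmental homepage is left as an exercise.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect autodiscoverable feed links from <link> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []  # list of (title, absolute feed URL) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {k: (v or "") for k, v in attrs}
        rels = a.get("rel", "").lower().split()
        is_feed = a.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and a.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append((a.get("title", ""), urljoin(self.base_url, a["href"])))


def discover_feeds(html, base_url):
    """Return (title, url) pairs for feeds advertised in `html`."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

The link title is kept alongside the URL because, in the absence of any better convention, it’s probably the only hint as to whether a given feed is a press release feed or something else.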

I have to admit I haven’t checked the state of feed autodiscovery on UK gov, local gov, or university websites recently. Sigh… another thing to add to the list of ‘maybe useful’ diversions…;-)

See also: Public Data Principles: RSS Autodiscovery on Government Department Websites?

PS This tool may or may not be handy if feed autodiscovery is new to you? Feed Autodiscovery in Javascript

PPS hmm, from Tracking Down Local Government Consultation Web Pages, I recall there is a list of LGD service ID codes that provides identifiers for local government services, which can be used to tag webpages/URLs on local government sites. Are there service identifiers for central government communication services (eg provision of press releases?) that could be used to find central gov department press releases (or local gov press releases for that matter?) Of course, if departments all had autodiscoverable press release feeds on their homepages, it’d be a more weblike way;-)

The Other Sort of Higher Education Standards…

…that is, not standards in the sense of university league tables, but standards in the sense of British Standards, open standards, and interoperability standards…

Standards help make things work together. Whether they’re mandated from the top down, or grow up from grass roots conventions, norms, and widely adopted usage examples, standards provide a means by which independent parties can design their own systems in the reasonable expectation that they will work with products or applications developed by others. In some cases, the standard may also demonstrate best practice, or provide an efficient way of tackling some problem (that is, it may help you be lazy…). Widely adopted standards are often supported by tooling that makes it easier to adopt or deploy a particular standard, such as code libraries that can import and export particular file types, and so on.

In the data world, publishing data in a standard way supports the creation of aggregation and analysis tools that can be used to analyse data published according to the standard from multiple separate sources. Sometimes, third parties step in to normalise data that is published severally by different parties, even if it purports to be the same thing. For example, Chris Taggart’s OpenlyLocal aggregates local council data published in a wide variety of formats by UK local councils, normalises it and makes it available through a single API, around which third parties can build services that will then apply to *all* UK councils whose data is aggregated on OpenlyLocal.

Many public bodies have a duty to make data submissions to both local and central government as part of their formal reporting requirements. For local government, this burden is captured by the DCLG single data list; in HE, a recent review for the HESA project on Redesigning the higher education data and information landscape included a review of reporting requirements on HEIs – survey available via that project’s reports archive (see also getTheData: Data Burden on UK Higher Education). The review itself is currently ongoing and invites representation from “Data gatherers (professional statutory and regulatory bodies), [d]ata providers (HE providers and their representative organisations), [and d]ata users (this includes students and potential students as well as all of the above bodies)”.

The HE Information Landscape project has identified the following initial set of principles, which were considered by the Steering Group on 17 February 2011:

  • Pursue the aim of ‘collect once use many times’ wherever practicable, working to align different bodies to avoid duplication of effort.
  • Enable the collection of essential, comprehensive, consistent and timely information that meets the requirements of the new regulatory framework in England and is, so far as possible, fit for purpose across the UK HE landscape.
  • Achieve efficiency and data sharing through the development and adoption of information and technical standards.
  • Ensure any data can be trusted and attributed to its provenance by defining robust quality assurance processes.
  • Foster open access to information wherever possible while respecting IPR, DPA and other regulatory, statutory and information management requirements for the processing of data.
  • Seek to manage demands for increased data collection and mitigate these through well-targeted use of technology, looking where possible to streamline data collection.

So how does this sit in the wider context of public sector ICT standards in general? Although the UK Gov Open Standards Board has not yet been constituted (applications for volunteer positions are open till March 22nd: Open Standards Board – Volunteer Members and Board Advisers, – Ref:1238758 ), the (Shadow) Open Data Standards Panel has met three times (disclaimer: I’m on it). The panel’s current mode of operation is to run a series of challenges via the Standards Hub around a set of topic areas. The intention behind the challenges is to seek feedback and recommendations on current standards or conventions used within the area, identifying candidates that can be passed to the Board as potential recommended open standards, or identifying the need for open standards development on a particular topic. At the current time, two challenges have been issued by the panel: Standards for Open Public Services, and one on Managing and Using Terms and Codes. (I had hoped that participants in the recent CETIS workshop on vocabularies might put in a response to that challenge;-) Challenges from other panels are also available.

What I’m wondering is whether anyone in the edu community thinks a challenge in the education area would be a Good Thing (i.e. are there any standardisation efforts that would benefit the sector, and would anyone be minded to submit a response to a challenge;-), and if so, what form might it take? I’m also wondering about the extent to which the Open Data Standards Panel might be a stakeholder in the HESA Information Landscape project (and maybe even vice versa: i.e. to what extent might HESA be a stakeholder in a suitably framed Standards Hub challenge? Of course, it’s probably completely impolitic publicly blogging this, and I may not make it through to full Panel membership when the (Shadow) moniker is dropped, but hey, that’s the price of transparency and openness for you!)

In recent times we have seen the evolution of XCRI, the course data exchange format (XCRI historical timeline). Over the next year or two there’s likely to be all sorts of work around the Key Information Sets (KIS Data Standards), as well as wider work on course data (eg the University of Lincoln’s JISC funded ON Course open course data project (disclaimer: I’m doing a day’s consultancy for them next week)), so there is standards work to be done. The HESA Information Landscape review seems pretty far reaching, so there are presumably sector specific, as well as generic, standards issues to be raised there. If university open data/open linked data initiatives take off, then conventions, norms and maybe even standards will emerge from bottom up activity mediated there. Recent changes to research council funding suggest that universities will increasingly need to manage their facilities in a semi-commercial way, maybe requiring standard ways of defining facilities; estates management is another area where universities have major needs, some that are emphasised in the academic domain (timetabling rooms, for example), others that are more generic (such as energy monitoring). Choosing basic list definition formats (as in the Managing and Using Terms and Codes (meta)-challenge) is a generic requirement, as are matters such as vocabulary definitions surrounding ethnicity, gender and so on (I guess the sector uses things like HESA standard ethnicity codes?), which are matters that will be covered by generic challenges but could also be referenced from an education focussed challenge. And so on…

So the question before the house is: would an education sector challenge be of interest, and if so, what sort of thing should it cover? Or are there generic challenges that might apply across government/public bodies, but that would also get a buy in from, and offer benefit to, the educational sector?

And finally, something I haven’t managed to get my head fully round yet is how the Open Data Standards Panel might relate to things like the Information Standards Board (ISB) for Education, Skills and Children’s Services and the NHS Information Standards Board for Health and Social Care (see also: Connecting for Health/NHS Data Standards and the NHS Information Governance Toolkit – is there a mapping from the governance toolkit onto related datasets?) on the one hand, and the Transparency Board on the other? For example, in the latter case, the Transparency Board minutes for the 15th Nov 2011 meeting described a “[discussion with] Guy Goodwin, Director of Population, Health and Regional Analysis, Office for National Statistics, and Jason Bradbury, Deputy Director, National Statistician’s Office, UK Statistics Authority provided background to the current work of Official Statistics” as follows:

  • It would be helpful to consider the ‘privacy and jigsaw effect’ point in greater detail at a later Board meeting to ensure that appropriate anonymisation techniques were understood and used, and that data was presented in a way that was still useful.
  • Recognition of the challenge in moving up accessibility scale due to IT platform barriers; and recognition of the skills gaps in creating open data which exists across government as a whole.
  • The GSS community was the model for skills in extracting the utility of data across government and the public sector, and as such should be the lead for the promotion of these skills in a roll out to the wider public sector.
  • A desire for GSS to help by establishing standard 4* definitions for GSS standard data categories, e.g. geographies.

  • The need for producers around government to move from creating “table builder capability” (inflexible and often not open format) to consistent URLs that can be linked to over time (enabling higher levels of accessibility and re-use). GG supported this view and noted the GSS was heading in that direction.
  • ONS had published Consumer Prices and Retail Prices microdata for the first time in response to an FOI request, made by an individual; with transparency in mind made this available via its website for all to re-use. The information was published to a good level of detail and would now be published routinely.
    Action: Office for National Statistics to forward a detailed case study of this Consumer Prices and Retail Prices microdata publication to the Transparency Board

So what I’m wondering here are a couple of things. Firstly, by posting open minutes in a syndication supporting way, it’s easy enough for different panels to share what they’re doing with other panels in a weak tie sort of way. Secondly, the Transparency minute above suggests that there is some hope for the idea that FOI requests can lead to the everyday, matter-of-course release of data as open public data (see also things like data taps in this context). Which is where formal and informal standards may have a role to play, in encouraging best practice models for the release of data, either as conventions, or as recognised open standards.

PS in passing, a bit of a rant: Can someone let me have copies of the #opendata usage example casestudies mentioned in Francis Maude’s “Open Data Innovation Community” speech that aren’t open on http://communities.maven-cast.com/pg/groups/3731? Thanks…;-)

See also: TT381 Presentation – Open Data and Open Standards. There’s also a consultation open at the moment on open standards that seeks feedback on the extent to which folk like or don’t like the government’s take on the defining criteria for open standards before it formally adopts them (i.e. consultation in pre-emptive mitigation of PR flak;-): UKGov Open Standards Consultation.

On the to read list: Demos: The Data Dividend, Francis Maude’s speech to the World Bank on data’s role in transparency, Tim Davies on NT Open Data Days: Exploring data flow in a VCO, Francis Irving and Rufus Pollock on From CMS to DMS: C is for Content, D is for Data.