Archive for the ‘opengov’ Category
One of the many things I’d like to spend my time doing is tinkering with data journalism doodles relating to local news stories. For example, via our local hyperlocal blog, I saw this post announced today: Isle of Wight has highest percentage of secondary school absentee rates in country. The post included a link to a Department for Education page (Pupil absence in schools in England, including pupil characteristics) containing links to the statistical release and the associated data sets:
Here’s what we get in the zipped datafile:
The school level dataset had the following column headings:
Year, country_code, country, GOR, GOR_code, LA, new_LA_code, LA_Name, URN, Estab, LAEstab, School_name, School_type, Academy_Flag, Academy_open_date, enrol_sum, SessionsPossible_sum, OverallAbsence_sum, AuthorisedAbsence_sum, UnauthorisedAbsence_sum, overall_absence_percent, auth_absence_percent, unauth_absence_percent, PA_15_sum, possible_sessions_pa_15_sum, overall_abs_pa_15_sum, auth_abs_pa_15_sum, unauth_abs_pa_15_sum, overall_absence_percent_PA_15, auth_absence_percent_PA_15, unauth_absence_percent_PA_15, sess_auth_illness, sess_auth_appointments, sess_auth_religious, sess_auth_study, sess_auth_traveller, sess_auth_holiday, sess_auth_ext_holiday, sess_auth_excluded, sess_auth_other, sess_auth_totalreasons, sess_auth_unclass, sess_unauth_holiday, sess_unauth_late, sess_unauth_other, sess_unauth_noyet, sess_unauth_totalreasons, sess_unauth_unclass, sess_overall_totalreasons
We can guess at what some of these refer to, but what, for example, do the “PA 15” columns refer to? In this case, what we really should do is look up the actual definitions, which are described in the metadata description document; a document that just happens to be a Microsoft Word 2007 formatted document…
…a document that doesn’t play nicely either with the copy of Word I have on my Mac:
…or the converter that the Google docs uploader uses:
In cases such as this, particularly where there are mathematical equations that often have very specific layout requirements, it can be “safer” to use a document format such as PDF that more reliably captures the appearance of the original page. (If we were really keen on reproducibility, we might also suggest that the equations were made available in an executable form, such as programme code or even as a spreadsheet (I’m not sure “Microsoft equations” are executable?).)
I gave myself a couple of hours to have a quick look through some of the data, but as it is I’ve spent an hour or so looking for ways of reading the metadata description document along with writing up my frustration around not being able to do so… Which is time spent not making sense of the data, or, indeed, its metadata…
PS in passing, I note the publication of the parliamentary Public Accounts Committee 37th report, Whole of Government Accounts 2010-11 again picks up on the way in which government data releases often fall short in terms of their usability (for example, this week MPs call for greater use of Whole of Government Accounts; see also last August Government must do better on transparency, say MPs).
PPS Here’s the solution I used in the end – Skydrive, Microsoft’s online storage/doc viewing play:
As it turns out, the equations could easily have been written using simple text strings…
PPPS as to the “15” columns, the metadata file describes them along the following lines:
PA_15_sum Number of enrolments classed as persistent absentees (threshold of 15 per cent)
possible_session_pa_15_sum Sessions possible for persistent absentees (threshold of 15 per cent)
Which means what exactly?!
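For what it's worth, here's one possible reading, sketched in Python. It assumes (and this is only my guess at the definition, not something the metadata document confirms) that an enrolment counts as a "persistent absentee" if it missed 15 per cent or more of its possible sessions, and that the pa_15 columns are sums over just those enrolments. The pupil-level rows are made up; the published data is only at school level.

```python
# Hypothetical pupil-level records: (sessions_possible, sessions_missed).
# These rows are invented for illustration only.
enrolments = [
    (300, 12),
    (300, 48),
    (280, 60),
    (310, 20),
]

THRESHOLD = 0.15  # assumed meaning of "threshold of 15 per cent"


def persistent_absentee_summary(rows, threshold=THRESHOLD):
    """Aggregate only the enrolments whose absence rate meets the threshold."""
    pa_rows = [
        (possible, missed)
        for possible, missed in rows
        if possible > 0 and missed / possible >= threshold
    ]
    return {
        # Number of enrolments classed as persistent absentees
        "PA_15_sum": len(pa_rows),
        # Sessions possible for those enrolments only
        "possible_sessions_pa_15_sum": sum(p for p, _ in pa_rows),
        # Sessions missed by those enrolments only
        "overall_abs_pa_15_sum": sum(m for _, m in pa_rows),
    }


summary = persistent_absentee_summary(enrolments)
print(summary)
```

If that reading is right, the pa_15 columns describe a subset of each school's enrolments rather than the school as a whole.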
Some time ago, in the post Using Aggregated Local Council Spending Data for Reverse Spending (Payments to) Lookups, I described a way of looking at local council spending data based on how much different councils spent with each other.
This technique generalises within and across sectors, so for example we could look at how hospitals spend money with each other, or how police authorities spend money with each other. In this way, we can get a picture of how public bodies buy – and sell – services off each other. The mappings don’t have to relate to spend, either – we could equally well use this sort of model to see how hospitals transfer patients to one another, or how mental health or social care services offer out-of-area cover to each other, or how councils and housing trusts manage transfers between each other.
The insight that lets us produce this sort of view is that we have entities of a particular sort (hospitals, for example, or local councils), entering into transactions with other entities of the same sort. If these sorts of entity all operate under the same transparency rules, a requirement to publish outgoing (spend) transactions, for example, then we can recreate incoming (receipt) transactions from each entity of the same sort. For example, if local councils are required to publish details of spend over £x, then we can also learn how much councils received from other local councils by means of transactions over £x.
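As a minimal sketch of that reverse lookup, assume each published spend record boils down to a (payer, payee, amount) triple. The council names and amounts below are made up, and real spending files would need a fair bit of name reconciliation before they looked this tidy.

```python
from collections import defaultdict

# Made-up spend records: (payer, payee, amount in pounds).
spend = [
    ("Council A", "Council B", 12000.0),
    ("Council A", "Council C", 800.0),
    ("Council C", "Council B", 5000.0),
    ("Council B", "Council A", 2500.0),
]


def receipts_from_spend(records):
    """Invert published spend into receipts: payee -> payer -> total received."""
    receipts = defaultdict(lambda: defaultdict(float))
    for payer, payee, amount in records:
        receipts[payee][payer] += amount
    return receipts


receipts = receipts_from_spend(spend)
council_b_income = dict(receipts["Council B"])
print(council_b_income)
```

Nothing in Council B's own publications is needed here: its receipts are recovered entirely from what the other councils declared as spend.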
As the UK Government at least seems hell bent on getting markets established in the delivery of public services, markets that can include private companies, then we are faced with a possible asymmetry in transparency information.
The public should be able to hold local councils to account about the services they provide. To do this, people need information about what decisions local councils are taking, and how local councils are spending public money.
And from the NHS:
As part of the government’s commitment to greater transparency, there is a requirement to publish online each NHS organisation’s expenditure over £25,000. In accordance with the requirement NHS Direct publish this on the basis of payments made in each calendar month.
For example, if hospital A buys significant services off hospital B, and must report that spend under transparency legislation, we can build up a picture not only relating to A’s spend, but also B’s sale of services, because A’s data relating to spend with B is openly available; which means B’s receipts from A are also available. (In this example, if items can be itemised as less than £25k per item, then this form of reporting under transparency guidelines is not required.)
If hospital A now buys services off company C, then we can look up spend from hospital A to get a picture of how much public money is flowing out to the private sector and into company C. That is, we can get an idea of company C’s receipts from openly published hospital spending data. (Of course, games could be played with itemisation – 10 treatments at £3k a treatment would result in a ‘must declare’ spend of £30k on the course of treatment, but an undeclarable £3k per treatment if billing is organised that way.)
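The itemisation game is easy to show with a toy calculation. This sketch assumes the rule is simply "publish any single item over £25,000", as in the NHS quote above; the billing figures are the ones from the example.

```python
THRESHOLD = 25_000  # pounds; the publication threshold quoted above


def declarable(items, threshold=THRESHOLD):
    """Return the items that would have to be published (strictly over threshold)."""
    return [amount for amount in items if amount > threshold]


bundled = [30_000]        # one invoice for the whole course of treatment
itemised = [3_000] * 10   # the same course billed per treatment

print(declarable(bundled))   # the single invoice must be declared
print(declarable(itemised))  # none of the per-treatment invoices need be
print(sum(bundled) == sum(itemised))  # same money either way
```

The total spend is identical in both cases; only the billing structure decides whether any of it shows up in the published data.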
But what if company C buys services off hospital B (maybe even subcontracting services it was contracted to deliver by hospital A)? If the spend data of company C is not subject to transparency requirements, and the receipt data from the hospital is not publicly available, we lose sight of how money is being spent within and across the public service.
Whilst private companies may balk at being required to publish details of their own spending data, we might still be able to recreate a picture of their spend with public services by requiring public bodies to also publish receipts data, along with the current requirement to publish spend data?
Following the official opening of the Open Data Institute (ODI) last week, a flurry of data related announcements this week:
- A big one for stats fans with the release of 2011 Census data by the ONS: 2011 Census, Key Statistics for Local Authorities in England and Wales. A few charts appear to have made it into the mix (along with the data to generate them), which I guess sets the baseline for whoever lands the currently advertised Head of Rich Content at the ONS job…
The data files associated with press releases are published as Excel spreadsheets. I guess this reflects, in part, the need to come up with a container that can cope with all the metadata. It’s a bit of a pain, though. One thing I keep meaning to explore further is ways of bundling data in R packages, along with scripts for analysing and visualising the data so bundled (eg US Census Spatial and Demographic Data in R: The UScensus2000 Suite of Packages or US consumer expenditure survey (ce) in R). I probably should also look again at Google’s Dataset Publication Language (DSPL) as well as other packaging formats. I need to check out the latest major release from the W3C Provenance Working Group too…
- Over at BIS, £8 million of investment in open public data is announced, the major chunk of which goes to the Data Strategy Board (#datastrategy) Breakthrough Fund to help public bodies get over short term technical barriers to releasing open public data. I keep wittering on about mapping out data flows that already exist and then finding ways to tap into them directly, so won’t repeat that here;-) A smaller pot, administered by the ODI, will be available to SMEs via the Open Data Immersion Programme. Also announced, the Ordnance Survey will be widening the availability of its range of mapping data.
- Not sure if I missed this when it was presumably announced? The Data Strategy Board’s chair Stephan Shakespeare (CEO of YouGov Plc) is leading an independent review of public sector information (here are the (draft) terms of reference). I’m not sure how this review fits into the tangle of reporting lines associated with the Data Strategy Board and the Public Data Group (the latter seems to have been very quiet?). I also wonder where the ODI fits into that whole structure?
- The funding around public open data coincided with a written Ministerial statement from the Cabinet Office that provided an Update on Departmental Open Data Commitments and adherence to Public Data Principles (original link on a gov.uk domain, h/t @owenboswarva). The update is spectacularly lacking in linking to any of the raw data that is summarised in the actual statement, so so much for any actual transparency there… The same minister, Francis Maude, has also been fulfilling his social media obligations with a piece in the Huffington Post on A Practical Vision for Open Government. (In other news, at the micro/pragmatic level of open public data, I’m still finding that week on week releases of NHS sitrep data show minor differences in formatting and occasional errors…)
Things have been moving on the Communications Data front too. Communications Data got a look in as part of the 2011/2012 Security and Intelligence Committee Annual Report with a review of what’s currently possible and “why change may be necessary”. Apparently:
118. The changes in the telecommunications industry, and the methods being used by people to communicate, have resulted in the erosion of the ability of the police and Agencies to access the information they require to conduct their investigations. Historically, prior to the introduction of mobile telephones, the police and Agencies could access (via CSPs, when appropriately authorised) the communications data they required, which was carried exclusively across the fixed-line telephone network. With the move to mobile and now internet-based telephony, this access has declined: the Home Office has estimated that, at present, the police and Agencies can access only 75% of the communications data that they would wish, and it is predicted that this will significantly decline over the next few years if no action is taken. Clearly, this is of concern to the police and intelligence and security Agencies as it could significantly impact their ability to investigate the most serious of criminal offences.
N. The transition to internet-based communication, and the emergence of social networking and instant messaging, have transformed the way people communicate. The current legislative framework – which already allows the police and intelligence and security Agencies to access this material under tightly defined circumstances – does not cover these new forms of communication. [original emphasis]
Elsewhere in Parliament, the Joint Select Committee Report on the Draft Communications Data Bill was published and took a critical tone (Home Secretary should not be given carte blanche to order retention of any type of data under draft communications data bill, says joint committee. “There needs to be some substantial re-writing of the Bill before it is brought before Parliament” adds Lord Blencathra, Chair of the Joint Committee.) Friend and colleague Ray Corrigan links to some of the press reviews of the report here: Joint Committee declare CDB unworkable.
In other news, Prime Minister David Cameron’s announcement of DNA tests to revolutionise fight against cancer and help 100,000 patients was reported via a technology angle – Everybody’s DNA could be on genetic map in ‘very near future’ [Daily Telegraph] – as well as by means of more reactionary headlines: Plans for NHS database of patients’ DNA angers privacy campaigners [Guardian], Privacy fears over DNA database for up to 100,000 patients [Daily Telegraph].
If DNA is your thing, don’t forget that the Home Office already operates a National DNA Database for law enforcement purposes.
And if national databases are your thing, there’s always the National Pupil Database which was in the news recently with the launch of a consultation on proposed amendments to individual pupil information prescribed persons regulations which seeks to “maximise the value of this rich dataset” by widening access to this data. (Again, Ray provides some context and commentary: Mr Gove touting access to National Pupil Database.)
PS A late inclusion: DECC announcement around smart meter rollout with some potential links to #midata strategy (eg “suppliers will not be able to use energy consumption data for marketing purposes unless they have explicit consent”). A whole raft of consultations were held around smart metering and Government responses are also published today, including Government Response on Data Access and Privacy Framework, the Smart Metering Privacy Impact Assessment and a report on public attitudes research around smart metering. I also spotted an earlier consultation that had passed me by around the Data and Communications Company (DCC) License Conditions; here’s the response, which opens with: “The communications and data transfer and management required to support smart metering is to be organised by a new central communications body – the Data and Communications Company (“the DCC”). The DCC will be a new licensed entity regulated by the Gas and Electricity Markets Authority (otherwise referred to as “the Authority”, or “Ofgem”). A single organisation will be granted a licence under each of the Electricity and Gas Acts (there will be two licences in a single document, referred to as the “DCC Licence”) to provide these services within the domestic sector throughout Great Britain”. Another one to put on the reading pile…
Putting a big brother watch hat on, the notion of “meter surveillance” brings to mind a BBC article about an upcoming (will hopefully thence be persistently available on iPlayer?) radio programme on “Electric Network Frequency (ENF) analysis”, The hum that helps to fight crime. According to Wikipedia, ENF is a forensic science technique for validating audio recordings by comparing frequency changes in background mains hum in the recording with long-term high-precision historical records of mains frequency changes from a database. In turn, this reminds me of appliance signature detection (identifying what appliance is switched on or off from its electrical load curve signature), for example Leveraging smart meter data to recognize home appliances. In the context of audio surveillance, how about supplementing surveillance video cameras with microphones? Public Buses Across Country [US] Quietly Adding Microphones to Record Passenger Conversations.
transparency boards will be established in each of the key delivery departments (health, education, justice, work and pensions, transport).
I’ve just done a quick trawl and found:
- Health and Social Care Transparency Panel
- DfT Transport Sector Transparency Board
- DWP Welfare Sector Transparency Board
- Crime and Justice Transparency Board Minutes on the data.gov.uk site (via Chris Hanretty)
but not corresponding boards for DfE (Education) or MoJ (Justice)? If you know where to find any more info about these boards (or links to sources explaining why they don’t exist) please let me know via the comments…
It does, however, look as if there may be a Research Sector Transparency Board on the way…(?)
There’s also a smattering of other transparency boards/panels:
- Local Public Data Panel (via DCLG Public Data and Transparency Board)
- Public Sector Transparency Board (which is essentially the Cabinet Office’s Transparency Board)
- Data Strategy Board (which is down the /transparency path on the BIS website) and the Cabinet Office sponsored(?) Open Data User Group
(Again, please let me know via the comments if I’m missing any…)
All departments are also required to publish open data strategies – you can find links to them here: Cabinet Office list of Departmental Open Data Strategies.
I do wonder what all this alleged transparency means or makes possible though…?
It feels like there are just too many opendata reports being published at the moment to know which ones to read? They do potentially provide lots of possible content for structured reading exercises in an (open) data course though….?
Here’s a list of some of the reports I’ve noticed recently, and that I haven’t really had time to read and digest properly:-(
- Open Data White Paper: Unleashing the Potential (Cabinet Office, June 2012)
- Implementing transparency (National Audit Office (NAO), April 2012)
- Report on Using Open Data: policy modeling, citizen empowerment, data journalism (W3C, June 2012)
- The Data Dividend (Demos, March 2012)
- The Big Data Opportunity: Making government faster, smarter and more personal (Policy Exchange/lobbyists, July 2012)
- Open data and charities: a state of the art review (Nominet Trust, July 2012)
- Open data dialogue final report (RCUK, June 2012)
- Open Data in Cultural Heritage Institutions (EPSI Platform, May 2012)
- Open Aid Data (EPSI Platform, May 2012)
Whilst not specifically about open data, these are also related to the whole data and openness thang:
- Defining and defending consumer interests in the digital age (Ctrl-Shift/Consumer Focus, June 2012)
- #Intelligence (Demos, May 2012)
- Data Jujitsu: The art of turning data into product (O’Reilly, July 2012)
- Science as an open enterprise (Royal Society, June 2012)
UK Gov Departments also published their open data strategies – they’re linked to from here: UK Gov Departmental Open Data Strategies.
PS I’m not sure if an English translation of this report (in Dutch) on Internal Business Models for Open Government Data is available anywhere?
In FOI Signals on Useful Open Data?, I pondered whether we could make use of information about FOI to help identify what sorts of data folk might actually be interested in by virtue of making Freedom of Information (FOI) requests for that data.
I couldn’t help but start to try working various elements of that idea through, so here’s a simple baby step to begin with – a scraper on Scraperwiki (Scraperwiki scraper: WhatDoTheyKnow requests) that searches for FOI requests made through WhatDoTheyKnow that got one or more Excel/xls spreadsheets back as an attachment.
Clicking through on an FOI request link takes you to the response that contains the data file, which can be downloaded directly or previewed on Zoho:
It strikes me that if I crawled the response pages, I could build my own index of data files, catalogued according to FOI request titles, in effect generating a “fake” data.gov.uk or data.ac.uk opendata catalogue as powered by FOI requests…? (What would be really handy in the local council requests would be if the responses were tagged with the appropriate LGSL code or IPSV terms (indexing on the way out) as a form of useful public metadata that can help put the FOI released data to work…?)
Insofar as the requests may or may not be useful as signaling particular topic areas as good candidates as “standard” open data releases, I still need to do some text analysis on the request titles. In the meantime, you can enter a keyword/key phrase in the Request text box in order to filter the table results to only show requests whose title contains the keyword/phrase. (The Council drop down list allows you to filter the table so that it only shows requests for a particular university/council.)
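The two filters described above (spreadsheet attachments, then keyword and public body) can be sketched as follows. The record layout, field names and example rows are all hypothetical, made up for illustration rather than taken from the scraper's actual table or from any WhatDoTheyKnow API.

```python
# Hypothetical scraped records; field names are assumptions for this sketch.
foi_requests = [
    {"title": "Library opening hours data", "body": "Anytown Council",
     "attachments": ["hours.xls"]},
    {"title": "Parking fines 2011", "body": "Anytown Council",
     "attachments": ["reply.pdf"]},
    {"title": "Student library usage", "body": "Example University",
     "attachments": ["usage.XLSX", "covering_letter.pdf"]},
]

SPREADSHEET_SUFFIXES = (".xls", ".xlsx")


def has_spreadsheet(request):
    """True if at least one attachment looks like an Excel file."""
    return any(name.lower().endswith(SPREADSHEET_SUFFIXES)
               for name in request["attachments"])


def filter_requests(rows, keyword=None, body=None):
    """Keep spreadsheet-bearing requests, optionally narrowed by title keyword and body."""
    kept = [r for r in rows if has_spreadsheet(r)]
    if keyword:
        kept = [r for r in kept if keyword.lower() in r["title"].lower()]
    if body:
        kept = [r for r in kept if r["body"] == body]
    return kept


library_titles = [r["title"] for r in filter_requests(foi_requests, keyword="library")]
print(library_titles)
```

Matching on file extension is crude (a PDF can contain a data table, and an .xls can contain a letter), but it's a cheap first signal.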
PS via a post on HelpMeInvestigate, I came across this list of FOI responses to requests made to the NHS Prescription Pricing Division. From a quick skim, some of the responses have “data” file attachments, though in the form of PDFs rather than spreadsheets/CSV. However, it would be possible to scrape the pages to at least identify ones that do have attachments (which is a clue they may contain data sets?)
So now I’m wondering – what other bodies produce full lists of FOI requests they have received, along with the responses to them?
PPS See also this gov.uk search query on FOI Release publications.
(Punchy title, eh?!) If you’re a researcher interested in local government initiatives or service provision across the UK on a particular theme, such as air quality, or you’re looking to start pulling together an aggregator of local council consultation exercises, where would you start?
Really – where would you start? (Please post a comment saying how you’d make a start on this before reading the rest of this post… then we can compare notes;-)
My first thought would be to use a web search engine and search for the topic term using a site:gov.uk search limit, maybe along with intitle:council, or at least council. This would generate a list of pages on (hopefully) local gov websites relating to the topic or service I was interested in. That approach is a bit hit or miss though, so next up I’d probably go to DirectGov, or the new gov.uk site, to see if they had a single page on the corresponding resource area that linked to appropriate pages on the various local council websites. (The gov.uk site takes a different approach to the old DirectGov site, I think, trying to find a single page for a particular council given your location rather than providing a link for each council to a corresponding service page?) If I was still stuck, OpenlyLocal, the site set up several years ago by Chris Taggart/@countculture to provide a single point of reference for looking up common administrivia details relating to local councils, would be the next thing that came to mind. For a data related query, I would probably have a trawl around data.gov.uk, the centralised (but far from complete) UK index of open public datasets.
How much more convenient it would be if there was a “vertical” search or resource site relating to just the topic or service you were interested in, that aggregated relevant content from across the UK’s local council websites in a single place.
(Erm… or maybe it wouldn’t?!)
Anyway, here are a few notes for how we might go about constructing just such a thing out of two key ingredients. The first ingredient is the rather wonderful Local directgov services list:
This dataset is held on the Local Directgov platform which provides the deep links into Local council websites for a number of services in Directgov. The Local Authority Service details holds the local council URLS for over 240 services where the customer can directly transfer to the appropriate service page on any council in England.
The date on the dataset post is 16/09/2011, although I’m not sure if the data file itself is more current (which is one of the issues with data.gov.uk, you could argue…). Presumably, gov.uk runs off a current version of the index? (Share…. ;-) Each item in the local directgov services list carries with it a service identifier code that describes the local government service or provision associated with the corresponding web page. That is, each URL has associated with it a piece of metadata identifying a service or provision type.
Which leads to the second ingredient: the esd standards Local Government Service List. This list maps service codes onto a short key phrase description of the corresponding service. So for example, Council – consultation and community engagement has service identifier 366, and Pollution control – air quality is 413. (See the standards page for the actual code/vocabulary list in a variety of formats…)
As a starter for ten, I’ve pulled the Directgov local gov URL listing and local gov service list into scraperwiki (Local Gov Web Pages). Using the corresponding scraper API, we can easily run a query looking up service codes relating to pollution, for example:
select * from `serviceDesc` where ToName like '%pollution%'
From this, we can pick up what service code we need to use to look up pages related to that service (413 in the case of air pollution):
select * from `localgovpages` where LGSL=413
We can also get a link to an HTML table (or JSON representation, etc) of the data via a hackable URI:
(Hackable in the sense we can easily change the service code to generate the table for the service with that code.)
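To try the two-step lookup away from scraperwiki, here's a rough local equivalent using Python's built-in sqlite3 module. The table names, ToName and LGSL come from the queries above; the other column names (an LGSL column on serviceDesc, plus council and url on localgovpages) and all the example rows are assumptions on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schemas: only ToName and LGSL are known from the queries above.
conn.execute("CREATE TABLE serviceDesc (LGSL INTEGER, ToName TEXT)")
conn.execute("CREATE TABLE localgovpages (LGSL INTEGER, council TEXT, url TEXT)")

conn.executemany("INSERT INTO serviceDesc VALUES (?, ?)", [
    (366, "Council – consultation and community engagement"),
    (413, "Pollution control – air quality"),
])
# Made-up council pages on a placeholder domain.
conn.executemany("INSERT INTO localgovpages VALUES (?, ?, ?)", [
    (413, "Anytown", "http://anytown.example.org/air-quality"),
    (413, "Othertown", "http://othertown.example.org/pollution/air"),
    (366, "Anytown", "http://anytown.example.org/consultations"),
])

# Step 1: find the service code(s) whose description mentions the topic.
codes = conn.execute(
    "SELECT LGSL, ToName FROM serviceDesc WHERE ToName LIKE ?",
    ("%pollution%",),
).fetchall()

# Step 2: look up every council page tagged with the first matching code.
pages = conn.execute(
    "SELECT council, url FROM localgovpages WHERE LGSL = ? ORDER BY council",
    (codes[0][0],),
).fetchall()

print(codes)
print(pages)
```

SQLite's LIKE is case-insensitive for ASCII letters by default, which is why the lowercase search term still matches "Pollution".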
So that’s the starter for 10. The next step that comes to my mind is to generate a dynamic Google custom search engine configuration file that defines a search engine that will search over just those URLs (or maybe those URLs plus the pages they link to). This would then provide the ability to generate custom search engines on the fly that searched over particular service pages from across localgov in a single, dynamically generated vertical.
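As a sketch of what generating that configuration might look like: as far as I understand it, Google custom search engines can be fed an annotations XML file in which each Annotation element names a URL pattern and carries a Label tying it to a particular engine. The label value below is a placeholder, the URLs are made up, and the exact format should be checked against Google's documentation before relying on it.

```python
import xml.etree.ElementTree as ET


def cse_annotations(urls, label):
    """Build an annotations XML string listing each URL under one CSE label."""
    root = ET.Element("Annotations")
    for url in urls:
        # Drop the scheme so the pattern reads host/path.
        pattern = url.split("://", 1)[-1]
        annotation = ET.SubElement(root, "Annotation", about=pattern)
        ET.SubElement(annotation, "Label", name=label)
    return ET.tostring(root, encoding="unicode")


# Made-up service page URLs, e.g. the results of an LGSL=413 lookup.
air_quality_pages = [
    "http://anytown.example.org/air-quality",
    "http://othertown.example.org/pollution/air",
]

# "_cse_placeholder" is a hypothetical label, not a real engine identifier.
xml_text = cse_annotations(air_quality_pages, "_cse_placeholder")
print(xml_text)
```

Because the URL list is just the output of a service code query, swapping the code swaps the vertical.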
A second thought is to grab those pages, index them myself, crawl them/scrape them to find the pages they link to, and index those pages also (using something like tf-idf within each local council site to identify and remove common template elements from the index). (Hmmm… that could be an interesting complement to scraperwiki… SolrWiki, a site for compiling lists of links, indexing them, crawling them to depth N, and then configuring search ranking algorithms over the top of them… Hmmm… It’s a slightly different approach to generating custom search engines as a subset of a monolithic index, which is how the Google CSE and (previously) the Yahoo BOSS engines worked… Not scalable, of course, but probably okay for small index engines and low thousands of search engines?)
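On the template-stripping idea, here's a deliberately crude sketch. Rather than full tf-idf, it uses just the document-frequency half: any line of text that turns up on most pages of a single council site is treated as template furniture and dropped before indexing. The page contents and the 0.8 cutoff are made up for illustration.

```python
from collections import Counter


def strip_template_lines(pages, max_share=0.8):
    """pages: dict of url -> list of text lines, all from ONE council site.

    Drops lines that appear on more than max_share of the site's pages.
    """
    doc_freq = Counter()
    for lines in pages.values():
        doc_freq.update(set(lines))  # count each line once per page
    cutoff = max_share * len(pages)
    return {
        url: [line for line in lines if doc_freq[line] <= cutoff]
        for url, lines in pages.items()
    }


# Made-up pages from one hypothetical council site.
site_pages = {
    "/air-quality": ["Home | Contact us", "Air quality monitoring results"],
    "/consultations": ["Home | Contact us", "Current consultations"],
    "/bins": ["Home | Contact us", "Bin collection days"],
}

cleaned = strip_template_lines(site_pages)
print(cleaned)
```

A proper tf-idf weighting would downweight common terms rather than delete whole lines, but the per-site document frequency is the bit doing the template detection.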
Via a BIS press release earlier this week – Better access to public sector information moves a step closer – it seems that the Data Strategy Board is on its way, along with a Public Data Group and an Open Data User Group (these are separate from the yet to be constituted Open Standards Board (if you’re quick, the deadline for membership of the board is tomorrow: Open Standards Board – Volunteer Members and Board Advisers, – Ref:1238758) and its feeder Open Data Standards, and Open Technical Standards panels).
So what does the press release promise?
A new independently chaired Data Strategy Board (DSB) will advise Ministers on what data should be released [will this draw on data requests made to data.gov.uk, I wonder? - TH] and has the potential to unlock growth opportunities for businesses across the UK. At least one in three members of the DSB will be from outside government, including representatives of data re-users.
The DSB will work with the Public Data Group (PDG) – which consists of Trading Funds the Met Office, Ordnance Survey, Land Registry and Companies House – to provide a more consistent approach to improving access to public sector information. These organisations have already made some data available, which has provided opportunities for developers and entrepreneurs to create imaginative ways to develop or start up their own businesses based on high quality data.
Looking at the Terms of reference for the Data Strategy Board & the Public Data Group, we can broadly see how they’re organised:
Three departmental agendas then…?! A good sign, or, erm..?! (I haven’t read the Terms of reference properly yet – that’s maybe for another post…)
How these fit in with the Public Sector Transparency Board and the Local Public Data Panel, I’m not quite sure, though it might be quite interesting to try and map out the strong and weak ties between them once their memberships are announced? It’d also be interesting to know whether there’d be any mechanism for linking in with open data standards recommendations and development (via the Standards Hub process) to ensure that as and when data gets released, there is at least an eye towards releasing it in a usable form!
The Government is making £7m available from April 2013 for the DSB to purchase additional data for free release from the Trading Funds and potentially other public sector organisations, funded by efficiency savings. An Open Data User Group, which will be made up of representatives from the Open Data community, will be directly involved in decisions on the release of Open Data, advising the DSB on what data to purchase from the Trading Funds and other public organisations and release free of charge.
So the DSB is a pseudo-cartel of sort-of government data providers (the Trading Funds) who are being given £7 million or so to open up data that the public purse (I think?) paid them to collect. The cash is there to offset the charges they would otherwise have made selling the data. (Erm… so, in order for those agencies to give their data away for free, we have to pay them to do it? Right… got it…) Presumably, the DSB members won’t be on the ODUG, who will be advising the DSB on what data to purchase from the Trading Funds and other public organisations and release free of charge (my emphasis). Note the explicit recognition here that free actually costs. In this case, public bodies are having data central gov paid them to collect bought off them by central gov so (central gov, or the bodies themselves) can then release it “for free”? Good. That’s clear then…
Francis Maude also clarifies this point: “The new structure for Open Data will ensure a more inclusive discussion, including private sector data users, on future data releases, how they should be paid for and which should be available free of charge.”
In addition: The DSB will provide evidence on how data from the Trading Funds – including what is released free of charge – will generate economic growth and social benefit. It will act as an intelligent customer advising Government on commissioning and purchasing key data and services from the PDG, and ensuring the best deal for the taxpayer. So maybe this means the Public Sector Transparency Board will now focus more on “public good” and “transparency” arguments, leaving the DSB to demonstrate the financial returns of open data?
The Open Data User Group (ODUG) [will] support the work of the new Data Strategy Board (DSB). [The position of Chair of the group is currently being advertised, if you fancy it...: Chair of Open Data User Group, - Ref:1240914 -TH]. The ODUG will advise the DSB on public sector data that should be prioritised for release as open data, to the benefit of the UK.
As part of the process, an open suggestion site has been set up using the Delib Dialogue app to ask “the community” How should the Open Data User Group engage with users and re-users of Open Data?: [i]n advance of appointing a Chair and Members of the group, the Cabinet Office wants to bring together suggestions for how the ODUG should go about this engagement with wider users and re-users. We are looking for ideas about things like how the ODUG should gather evidence for the release of open data, how it should develop it’s advice to the DSB, how it should run its meetings and how it should keep the wider community up to date on developments (as well as other ideas you have).
A Twitter account has also been pre-emptively set up to manage some of the social media engagement activities of the group: @oduguk
The account currently has just over a couple of hundred followers, so I grabbed the list of all the folk they follow, then graphed folk followed by 30 or more current followers of @oduguk.
Here’s the graph, laid out in Gephi using a force directed layout, with nodes coloured according to modularity group and sized by eigenvector centrality:
Here’s the same graph with nodes sized by betweenness centrality:
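For what it's worth, the filtering step behind these graphs (keep only the folk followed by 30 or more of the account's followers) can be sketched in a few lines of Python. This is a minimal sketch rather than the code I actually ran: the `commonly_followed` function, the `friends_of` mapping and the toy account names are all hypothetical, and it assumes the friends lists have already been grabbed from Twitter by some other means. The resulting edge list could then be exported to Gephi for layout and the centrality/modularity statistics.

```python
from collections import Counter


def commonly_followed(friends_of, min_followers=30):
    """Find accounts followed by at least `min_followers` of the followers.

    friends_of: dict mapping each follower's name to the collection of
    accounts that follower follows (assumed to be fetched already).
    Returns (kept, edges): the set of commonly followed accounts, and a
    sorted list of (follower, friend) pairs pointing at those accounts.
    """
    counts = Counter()
    for friends in friends_of.values():
        # set() guards against a friend being listed twice for one follower
        counts.update(set(friends))

    kept = {account for account, n in counts.items() if n >= min_followers}

    edges = sorted(
        (follower, friend)
        for follower, friends in friends_of.items()
        for friend in set(friends)
        if friend in kept
    )
    return kept, edges


# Toy data with made-up account names and a threshold of 2 rather than 30
toy = {
    "follower_1": {"account_a", "account_b"},
    "follower_2": {"account_a"},
    "follower_3": {"account_a", "account_b", "account_c"},
}
kept, edges = commonly_followed(toy, min_followers=2)
```

With the toy data, `account_c` is dropped because only one follower follows it. Whether edges should run from followers to the accounts they follow, or only between the commonly followed accounts themselves, is a design choice; the sketch keeps the former.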
By the by, responses to the Data Policy for a Public Data Corporation consultation have also been published, along with the Government response, which I haven’t had a chance to read yet… If I get a chance, I’ll try to post some thoughts/observations on that alongside a commentary on the terms of reference doc linked to above somewhere…
A flurry of articles earlier this week (mine will be along shortly) about the Data Strategy Board all broadly rehashed the original press release from BIS. Via the Cabinet Office Transparency minisite, I found a link to the press release via the COI News Distribution Service…
…whereupon I noticed that the COI – Central Office of Information – is to close at the end of this month (31 March 2012), taking with it the News Distribution Service for Government and the Public Sector (soon to be ex- of http://nds.coi.gov.uk/).
In its place is the following advice: “For government press releases please follow this link to find the department that you require http://www.direct.gov.uk/en/Dl1/Directories/A-ZOfCentralGovernment/index.htm” This leads to a set of alphabetised pages with links to the various government departments… i.e. it points to a starting point for likely fruitless browsing and searching if you’re after aggregated press releases from gov departments.
(I’m not sure where News Sauce: UK Government Edition gets its data from, but if it’s by scrapes of departmental press releases rather than just scraping and syndicating the old COI content, then it’s probably the site I’ll be using to keep tabs on government press releases.)
FWIW, centralisation and aggregation are not the same in terms of architectures of control. Aggregation (then filter on the way out, if needs be) can be a really really useful way of keeping tabs on otherwise distributed systems… I had a quick look to see whether anyone was scraping and aggregating UKGov departmental press releases on Scraperwiki, but only came up with @pezholio’s LGA Press Releases scraper…
An easier way would be to hook up my feed reader to an OPML bundle that collected together RSS/Atom feeds of news releases from the various government websites. I’m not sure if such a bundle is available anywhere (if you know of one, please add a link in the comments below), but if: 1) gov departments do publish RSS/Atom feeds containing their press releases; 2) they make these feeds autodiscoverable via their homepages; and 3) they ensure that said feeds are reliably identifiable as press release/media release feeds, it wouldn’t be too hard to build a simple OPML feed generator.
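By way of illustration, here's roughly what the OPML generating part might look like in Python, using only the standard library. It's a minimal sketch that assumes the list of feeds has already been collected somehow; the `build_opml` function and the example.org URLs are placeholders of mine, not real departmental feeds.

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Build an OPML document bundling a set of feeds.

    feeds: iterable of (name, feed_url, site_url) tuples.
    Returns the OPML document as a string.
    """
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, feed_url, site_url in feeds:
        # One <outline> per feed; xmlUrl is the attribute feed readers look for
        ET.SubElement(
            body,
            "outline",
            {"type": "rss", "text": name, "xmlUrl": feed_url, "htmlUrl": site_url},
        )
    return ET.tostring(opml, encoding="unicode")


# Placeholder feeds, purely for illustration
example_feeds = [
    ("Department A press releases", "https://example.org/dept-a/press.rss", "https://example.org/dept-a"),
    ("Department B press releases", "https://example.org/dept-b/press.atom", "https://example.org/dept-b"),
]
opml_text = build_opml("UK gov press releases (example)", example_feeds)
```

The string in `opml_text` could be saved to a file and imported into any feed reader that accepts OPML subscription lists.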
So for example, trawling through old posts, I note that the post 404 “Page Not Found” Error pages and Autodiscoverable Feeds for UK Government Departments used a Yahoo Pipes pipe to try to automatically audit feed autodiscovery on UK gov departmental homepages, though it may well have rotted by now. If I was to fix it, I’d probably reimplement it in Scraperwiki, as I did with my UK HEI feed autodiscovery thang (UK university autodiscoverable RSS Feeds (Scraperwiki scraper), and Scraperwiki View; about: Autodiscoverable Feeds and UK HEIs (Again…)). If you beat me to that, please post a link to your scraper below;-)
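As a rough idea of what the autodiscovery check involves (and not the original pipe or the Scraperwiki scraper), here's a minimal Python sketch that looks for `<link rel="alternate">` elements with an RSS or Atom type in a page's HTML. The `FeedLinkParser` class and `discover_feeds` function are names I've made up for this sketch, and it assumes the homepage HTML has already been fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect autodiscoverable feed links from an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. a bare attribute), so normalise
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        is_feed = a.get("type", "").strip().lower() in FEED_TYPES
        if "alternate" in rels and is_feed and a.get("href"):
            # Resolve relative hrefs against the page URL
            self.feeds.append((a.get("title", ""), urljoin(self.base_url, a["href"])))


def discover_feeds(html, base_url):
    """Return a list of (title, feed_url) pairs advertised by the page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


# A made-up page for illustration
page = (
    '<html><head>'
    '<link rel="alternate" type="application/rss+xml" '
    'title="Press releases" href="/news/feed.rss">'
    '<link rel="stylesheet" href="/style.css">'
    '</head><body></body></html>'
)
found = discover_feeds(page, "https://example.org/")
```

An audit would run this over each departmental homepage and flag any that come back with an empty list. The feed titles give at least a hint as to whether a feed is a press release feed.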
I have to admit I haven’t checked the state of feed autodiscovery on UK gov, local gov, or university websites recently. Sigh… another thing to add to the list of ‘maybe useful’ diversions…;-)
PPS hmm, from Tracking Down Local Government Consultation Web Pages, I recall there is a list of LGD service ID codes, identifiers for local government services that can be used to tag webpages/URLs on local government sites. Are there service identifiers for central government communication services (eg provision of press releases) that could be used to find central gov department press releases (or local gov press releases for that matter)? Of course, if departments all had autodiscoverable press release feeds on their homepages, it’d be a more weblike way;-)