So What, #midata? And #yourData, #ourData…

The Twittertubes were all abuzz yesterday with news about the UKGov’s announcement on #midata (even though the press release everyone was referring to came out earlier?). It’s still not clear to me what announcement was actually made yesterday, or where? [Ah… seems the actual statement relates to the Government’s response to the midata consultation, along with an impact assessment.] I also struggled to find any write-ups of the hacks’n’ideas produced at the ODI’s (@ukodi) #midata hackathon over the weekend?

(For a round-up from over the summer of reports on personal data, see Personal Data Exploitation – Recent Reports, which also quotes a sceptical view about public uptake from a Government commissioned report.)

The personal data that UKGov is encouraging companies to make available in the first instance is credit card/banking transaction data, phone billing data, and energy usage data. The first two sectors typically offer itemised breakdowns anyway (maybe the #midata initiative will “request” that the information is made available in a machine readable form if it isn’t already published as such?), while the energy usage data presumably requires a smart meter (which many folk who are interested will have acquired, and hacked, years ago already?!). So what’s new? Does this add to our right to data, eg as supported by subject access requests under the Data Protection Act? What I’m sceptical about is the extent to which this initiative is just a roundabout way of allowing companies to share data amongst themselves (eg Data Bartering Is Everywhere), with checkbox customer permission, of course… (Market context: Computing.co.uk – How Tesco and co are testing the limits of customer data exploitation.)

By the by, on the topic of sharing individual level data, it seems that the Department for Education are currently consulting around the wider release of pupil data – Consultation on proposed amendments to individual pupil information prescribed persons regulations: “A consultation on proposals to amend regulations to enable the Department for Education to share extracts of data held in the National Pupil Database for a wider range of purposes than currently possible. The aim is to maximise the value of this rich dataset.”

“The National Pupil Database is a longitudinal database, which holds information on children in schools in England. The majority of datasets go back 10 years, with the earliest data going back to 1996. There are a range of data sources in the National Pupil Database providing information about children’s education at different stages (pre-school, primary, secondary and further education).
“It includes detailed information about pupils, their test and exam results, prior attainment and progression at different key stages for all state schools in England. Attainment data is also held for pupils and students in non-maintained special schools, sixth form and Further Education (FE) colleges and (where available) independent schools. The National Pupil Database includes information about the characteristics of pupils in the state sector and non-maintained special schools such as gender, ethnicity, first language, eligibility for free school meals, information about special educational needs (SEN), as well as detailed information about pupil absence and exclusions.
“The data held in the National Pupil Database is collected from a range of sources including schools, local authorities and awarding organisations. This data is processed by the Department’s Data and Statistics Division and matched and stored in the National Pupil Database. The Department makes it clear to children and their parents what information is held about pupils and how it is processed, through a statement on its website. Schools also inform parents and pupils of how the data is used through privacy notices.”

There’s a lot to be said for opening up this data to researchers, but I’m sure the privacy wonks will also have plenty of points to make… For example, from Privacy International, UK School Census proposals – How you can help. (Related – just in to my mailbox, EU report on “The right to be forgotten”. And more: ICO code on anonymisation, managing privacy risks and maintaining transparency.)

Sort of related, it’s maybe also worth remembering that the Department of Health, via the NHS, is also widening collection of, and opening up researcher access to, anonymised cradle-to-grave health records via Clinical Practice Research Datalink. (Launch press release; context: eg NHS patient records to revolutionise medical research in Britain.)

Taking these together, along with the idea that media channels deliver audiences to advertisers, I wonder: what is being transacted (collected, bought, staged, and sold) when government releases life event related “transactional” datasets (school records, health records) to researchers? How do the costs and benefits flow (eg in terms of improving the lot of the citizen, playing fair with taxation, etc…)?

PS I haven’t been keeping up with Linked Data in Gov initiatives lately, so this (via @ldodds, I think?) looks like it might be a handy round-up: UKGovLD (UK Government Linked Data Working Group) – opening the doors event.

PPS Via @mhawksey, something that should be read alongside the #midata announcement – Tesco vacancy – Product Manager, ‘My Data’ (commentary): “The successful candidate will define the strategy to develop and support the deployment of Group-wide capability to deliver market-leading products and games which give our Clubcard customers simple, useful, fun access to their own data to help them plan and achieve their goals.”

Key responsibilities:

– You will build and develop the personalised access to customer’s data capability plan
– Accountable for working with functional and country stakeholders across the business to develop a strategy for personalised access to customer’s data and prioritising which products, tools and capabilities to build
– Work with Tesco IT and dunnhumby and other functional stakeholders to deliver these new capabilities to plan and to budget
– Manage the delivery of Clubcard Play (games) to engage customers and create new media opportunities for brands and marketing opportunities for Tesco
– Represent the functional teams and their interests to ensure there is a constant delivery of customer and business benefits from the personalised access to customer’s data workstream
– Manage a team of managers (who work with functional stakeholders and IT) to define and deliver new products, tools and capabilities
– Work with key functional stakeholders such as marketing to manage the organisation change and impact that the personalised access to customer’s data workstream will have
– Work with Corporate and Legal Affairs to manage any legal obligations around giving customers digital access to their own data
– Drive learning through rapid testing and piloting and be involved in running trials in market where needed
– Drive requirements back into the Data and Personalisation Engine streams within the Programme
– Manage the reporting and tracking of benefits to ensure that we are measuring the impact of our activities
– Contribute as part of the Personalisation customer data leadership team
– Look to the medium-term future and think about potential innovations in the area of personalised access to customer’s data to bring into the overall programme roadmap
– Stay close to the customer through market scanning, networking and by building relationships with key internal and external thought leaders

If you spot any ads from other companies that look as if they are #midata related, please post a link to them, the job title and if possible a clip/quick summary, in the comments;-)

PPPS On my “possibly related?” to read list: Network Accountability for the Domestic Intelligence Apparatus. From the abstract, “The network is anchored by ‘fusion centers,’ novel sites of intergovernmental collaboration that generate and share intelligence and information. Several fusion centers have generated controversy for engaging in extraordinary measures that place citizens on watch lists, invade citizens’ privacy, and chill free expression. … A new concept of accountability – network accountability – is needed to address the shortcomings of fusion centers. Network accountability has technical, legal, and institutional dimensions. Technical standards can render data exchange between agencies in the network better subject to review. Legal redress mechanisms can speed the correction of inaccurate or inappropriate information.” With public datasets, we can of course create our own “fusion centres”.

PPPPS …and on the “to play with” list: analyze the consumer expenditure survey (ce) with r (“the consumer expenditure survey (ce) is the primo data source to understand how americans spend money. participating households keep a running diary about every little purchase over the year. those diaries are then summed up into precise expenditure categories.”). And the data is available :-)

PPPPPS December 2012: FTC to Study Data Broker Industry’s Collection and Use of Consumer Data: “The Federal Trade Commission issued orders requiring nine data brokerage companies to provide the agency with information about how they collect and use data about consumers. The agency will use the information to study privacy practices in the data broker industry.

“Data brokers are companies that collect personal information about consumers from a variety of public and non-public sources and resell the information to other companies. In many ways, these data flows benefit consumers and the economy; for example, having this information about consumers enables companies to prevent fraud. Data brokers also provide data to enable their customers to better market their products and services.”

Could be interesting… It also links to a March 2012 report on Protecting Consumer Privacy in an Era of Rapid Change: Recommendations for Businesses and Policymakers.

“For Data Protection Purposes”, Can You Give Me Some Personal Data…?

Ring ring… “Hi, it’s the AA; we’re doing a survey about your recent call out; for data protection purposes, may I have your address please?” Er, no… “Oh, well, I’m sorry, I can’t continue this call without confirming who you are…”

[I should really have asked what was meant by “data protection purposes”…;-)]

I think I’m going to start collecting stories like this: unsolicited calls from companies I’m a customer of, who call me on my phone (using a number they have on their records for my account) and then try to get me to provide them with additional personal information so that they can check I’m me… (but how do I know they’re them…?)

Ring, ring: “Hi, I’m an evil phisher who got your number from a phone book [phone books, remember them? Directories that came unsolicited through your letterbox, publishing the name, address and telephone number of most people in your area, in public. **Privacy breach alert**, panic, don’t panic ;-)], pretending to be from a large company who you’re likely to be a customer of based on the demographics of your postcode area. Could you confirm your first name please… and can you confirm that your address (the one I found in the phone book…) is blah, blah, blah. And your date of birth…?”

Ring, ring: “Hi, I’m an evil burglar checking out properties that I might come and visit, but I’m pretending to offer cheap house insurance. Do your windows have XYZ locks? And do you have a burglar alarm? Is the property ever vacant for more than three or four hours at a time? Thank you…”

There has to be a better way… some form of reciprocated two-step verification, maybe, whereby both the caller and the callee can confirm each other’s identity?
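By way of a thought experiment, here is a minimal sketch of what such a reciprocated check might look like. It is not anything the AA (or anyone else) actually does. It assumes that the company and the customer agreed a shared secret when the account was set up, and that each side has some device or app that can do the sums. Each party issues a random challenge, and the other proves knowledge of the secret by returning a short code derived from it, so the secret itself is never spoken down the line. The function names (respond, verify, mutual_check) are made up for illustration.

```python
import hashlib
import hmac
import secrets


def respond(shared_secret: bytes, role: str, challenge: bytes) -> str:
    """Answer a challenge with a short code keyed on the shared secret.

    The role label ("caller" or "callee") is mixed in so that one party's
    answer cannot simply be echoed back as the other party's answer.
    """
    message = role.encode("utf-8") + b":" + challenge
    digest = hmac.new(shared_secret, message, hashlib.sha256).hexdigest()
    return digest[:8]  # short enough to read out over the phone


def verify(shared_secret: bytes, role: str, challenge: bytes, answer: str) -> bool:
    """Check an answer against the code we would have computed ourselves."""
    expected = respond(shared_secret, role, challenge)
    return hmac.compare_digest(expected, answer)


def mutual_check(caller_secret: bytes, callee_secret: bytes) -> bool:
    """Simulate both halves of the exchange; True only if both sides pass."""
    # Step 1: the callee (me) challenges the caller (the company).
    callee_challenge = secrets.token_bytes(16)
    caller_answer = respond(caller_secret, "caller", callee_challenge)
    if not verify(callee_secret, "caller", callee_challenge, caller_answer):
        return False  # the caller does not know the secret: hang up

    # Step 2: the caller challenges the callee in the same way.
    caller_challenge = secrets.token_bytes(16)
    callee_answer = respond(callee_secret, "callee", caller_challenge)
    return verify(caller_secret, "callee", caller_challenge, callee_answer)


if __name__ == "__main__":
    print(mutual_check(b"agreed at signup", b"agreed at signup"))
    print(mutual_check(b"phisher guessing", b"agreed at signup"))
```

The point of the sketch is that neither side has to hand over an address or date of birth to be believed: a phisher who only has the phone book cannot produce the right code, and the customer never gives the caller anything reusable.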