The Loss of Obscurity – A Round-Up of Recent Reports Relating to Privacy and Personal Consumer Data

A jumbled collection of recent clips and snippets that feel to me as if they’re pieces of the same jigsaw…

  • An article in The Atlantic on Obscurity: A Better Way to Think About Your Data Than ‘Privacy’:

    …”privacy” is an over-extended concept. It grabs our attention easily, but is hard to pin down. Sometimes, people talk about privacy when they are worried about confidentiality. Other times they evoke privacy to discuss issues associated with corporate access to personal information. Fortunately, obscurity has a narrower purview.

    Obscurity is the idea that when information is hard to obtain or understand, it is, to some degree, safe. Safety, here, doesn’t mean inaccessible. Competent and determined data hunters armed with the right tools can always find a way to get it. Less committed folks, however, experience great effort as a deterrent.

    This can be a useful distinction to make, I think, when considering the uses to which “personal data” is, or can be, put. Obscure things are hard to find. Just because a dataset is “anonymised” doesn’t mean that a determined data hunter (DDH) won’t be able to deanonymise elements of it.
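
To make the DDH point concrete, here's a toy sketch (all data invented) of the classic linkage attack: an "anonymised" dataset with names removed can still be re-identified by joining on quasi-identifiers it shares with some public dataset.

```python
# Toy illustration (invented data): re-identifying an "anonymised" dataset by
# joining it to a public register on shared quasi-identifiers.

# "Anonymised" release: names removed, but quasi-identifiers retained.
health_records = [
    {"postcode": "PO30 1AA", "dob": "1974-03-12", "sex": "F", "condition": "asthma"},
    {"postcode": "PO31 7XX", "dob": "1982-11-02", "sex": "M", "condition": "diabetes"},
]

# Public register (e.g. an electoral roll) carrying the same quasi-identifiers.
public_register = [
    {"name": "A. Smith", "postcode": "PO30 1AA", "dob": "1974-03-12", "sex": "F"},
    {"name": "B. Jones", "postcode": "PO31 7XX", "dob": "1982-11-02", "sex": "M"},
]

def reidentify(anonymised, register):
    """Link records on (postcode, dob, sex) and recover names."""
    index = {(p["postcode"], p["dob"], p["sex"]): p["name"] for p in register}
    return [
        {"name": index.get((r["postcode"], r["dob"], r["sex"]), "unknown"), **r}
        for r in anonymised
    ]

for match in reidentify(health_records, public_register):
    print(match["name"], "->", match["condition"])
```

The effort involved is trivial once the two datasets are side by side – which is exactly why obscurity (making the second dataset hard to find or hard to join) does real work here.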

    Related to obscurity is obfuscation – coding things in such a way that you accept the information contained in the dataset is open, but you do your damnedest to deliberately make it difficult for people to extract certain meaningful elements from it. (For example, How can I obfuscate JavaScript?) Looking at the way many open public datasets are published, you might think an obfuscation step had been built into the publication process;-)
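
One benign version of a deliberate obfuscation step is generalisation: the data stays open, but values are coarsened before publication so individual-level detail is harder to extract. A minimal sketch (field names invented):

```python
# Minimal sketch (invented field names) of coarsening quasi-identifiers before
# publication: the dataset remains open, but fine-grained detail is removed.

def coarsen(record):
    """Generalise quasi-identifiers: full postcode -> outward code,
    full date of birth -> year of birth; leave the payload intact."""
    return {
        "postcode": record["postcode"].split()[0],  # "PO30 1AA" -> "PO30"
        "birth_year": record["dob"][:4],            # "1974-03-12" -> "1974"
        "condition": record["condition"],
    }

published = [coarsen(r) for r in [
    {"postcode": "PO30 1AA", "dob": "1974-03-12", "condition": "asthma"},
]]
print(published[0])
```

Note how this interacts with the linkage attack above: coarser quasi-identifiers match many more people, so a join against a public register stops yielding unique hits.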

    For a linked take in defense of privacy (from which we can maybe identify useful attributes associated with the notion of privacy), see Paul Bernal’s Privacy is not the enemy – rebooted…

  • Overt camera surveillance (cameras in carparks, shops and town centres, for example, or ANPR (Automated Number Plate Recognition) cameras in petrol station forecourts and again, in car parks) is presumably deployed to dissuade people from performing particular acts by making it known to them that if they engage in those acts they will be held accountable for them. If we pick this apart a little, CCTV surveillance can operate in two modes: 1) identifying particular actions and then (maybe) taking steps to prevent their furtherance; 2) identifying people captured in the video. Whilst the aim of (2) may be to identify people involved in (1), (2) may also be used to identify and track people in general, irrespective of the actions they are performing. A currently open Home Office Surveillance camera code of practice consultation gives some background to what is deemed to be acceptable use of, and controls on, the use of overt camera surveillance, although it does not seem to explore any possible “evil consequences” of such technology. I’m not sure whether it covers the use of drone-based surveillance either?!

    A wider review of surveillance systems can be found in an EU Seventh Framework Programme report – IRISS (Increasing Resilience in Surveillance Societies) Deliverable D1.1: Surveillance, fighting crime and violence.

  • Another key ingredient in the management of privacy and obscurity is the notion of identity and identities. UKGov has been considering “identity” in two different ways recently:
    • The BIS Foresight project on Future Identities/The Future of Identity reviews different notions of identity (where identity is “the sum of those characteristics which determine who a person is”) and the different identities we may express:

      This Foresight Report provides an evidence base for decision makers in government to understand better how people’s identities in the UK might change over the next 10 years. The impact of new technologies and increasing uptake of online social media, the effects of globalisation, environmental disruption, the consequences of the economic downturn, and growing social and cultural diversity are all important drivers of change for identities. However, there is a gap in understanding how identities might change in the future, and how policy makers might respond to such change.

    • When working with services online, we’re all familiar with the notion of having different login identities with different services. When working with government services, there may be a requirement to ensure that a given user login identity actually relates to a particular person. The DWP Identity Assurance Scheme seems to be working with commercial providers (Post Office, Cassidian, Digidentity, Experian, Ingeus, Mydex, Verizon, PayPal) to establish an “identity registration service [that] will enable benefit claimants to choose who will validate their identity by automatically checking their authenticity with the provider before processing online benefit claims”. Whatever that is supposed to mean. Does it mean that when I create a DWP login I can use my PayPal credentials to prove to DWP who I am? Or does it mean I’ll be able to log in to DWP services using my PayPal credentials? I couldn’t find anything about this in a quick skim of the DWP Digital Strategy. Are there any good references out there? UPDATE – ah, this ComputerWeekly report suggests the identity providers will do verification and manage logins – I’m not sure whether those logins will be unique to accessing DWP/gov.uk services, though, or whether they would also access e.g. my PayPal account.

      See also the Open Identity Exchange, a scheme for building trusted relationships between online identity providers on a global scale…
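
As I read it, the basic identity-assurance pattern is: the provider verifies who you are and hands the relying service a signed assertion, so the service checks the provider's signature rather than handling your credentials itself. A toy sketch of that pattern (all names invented; a real scheme would use PKI rather than a shared secret):

```python
import hmac, hashlib, json

# Toy sketch (invented names) of an identity-assurance flow: an identity
# provider signs a claim that a user has been verified; the relying service
# trusts the claim only if the provider's signature checks out.

SHARED_KEY = b"provider-and-service-shared-secret"  # stand-in for real PKI

def issue_assertion(user_id, provider):
    """Identity provider: sign a claim that this user has been verified."""
    claim = json.dumps({"user": user_id, "verified_by": provider}, sort_keys=True)
    sig = hmac.new(SHARED_KEY, claim.encode(), hashlib.sha256).hexdigest()
    return {"claim": claim, "sig": sig}

def accept_assertion(assertion):
    """Relying service: accept the claim only if the signature is valid."""
    expected = hmac.new(SHARED_KEY, assertion["claim"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, assertion["sig"])

token = issue_assertion("jane.doe", "ExampleIdP")
print(accept_assertion(token))  # True
```

The privacy question from above carries over: the relying service never sees your password, but the identity provider now knows which services you log in to.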

  • A recent report from the Administrative Data Taskforce – Improving Access for Research and Policy – provides a series of recommendations for establishing a research network for analysing and linking administrative datasets. Among other things, the report suggests the following model for “de-identifying” linked datasets:

    [Figure: ADT – de-identified record linkage model]

    Here’s a sample of some of the other sorts of things the ADT recommended:

    • R1.1 The ADRCs will be responsible for commissioning and undertaking linkage of data from different government departments and making the linked data available for analysis, thereby creating new resources for a growing research agenda. Analyses of within sector data (e.g. linking medical records between primary and secondary care) and linking of data between departments for operational purposes may continue to be conducted by the relevant government departments and agencies.
    • R1.3 Personal identifiers (names, addresses, precise date of birth, national insurance numbers, etc.) attached to administrative data records will not be available to, or held in, the ADRCs; hence, both ADRC staff and researchers accessing data through ADRCs will not have sight of such personal identifying information. Linkage will be achieved through the use of third parties who have the expertise to provide secure data linkage services for matching personal records from existing data systems.
    • R1.6 Access to data held in the ADRCs by accredited researchers will be possible using three approaches. For all of these, no individual-level records will be released from the ADRCs. First, researchers can visit the ADRC secure data access facility, where their analyses of the relevant data sub-set will be overseen by the ADRC support team. Second, researchers can submit statistical syntax to the ADRC support team who will run the analysis on the dataset on behalf of the researcher (results would be thoroughly checked before return). Third, remote secure data access facilities may be established which allow virtual access to datasets held in the ADRCs. With the latter approach, no data would be transferred to these remote safe settings, which would use state-of-the-art technologies and apply rigorous international standards, equivalent to those used in the ADRCs themselves, to provide a secure environment for researchers to undertake their analyses.
    • R1.11 … However, the Taskforce recognises that there could well be potential benefits that derive from private sector data and related research interests. The Governing Board will, at an early stage, investigate guidelines for access and linkage by private sector interests, …
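
The R1.3 model, as I understand it, amounts to a trusted third party replacing personal identifiers with keyed pseudonyms, so two departments' records can be linked without the ADRC ever seeing names or NI numbers. A sketch under those assumptions (details invented):

```python
import hmac, hashlib

# Sketch (invented details) of trusted-third-party record linkage: personal
# identifiers are replaced with keyed pseudonyms, so the research centre can
# link records without ever seeing the identifiers themselves.

LINKAGE_KEY = b"held-only-by-the-trusted-third-party"

def pseudonymise(ni_number):
    """Keyed hash: the same person always maps to the same token, but the
    token reveals nothing without the linkage key."""
    return hmac.new(LINKAGE_KEY, ni_number.encode(), hashlib.sha256).hexdigest()[:12]

# Two departmental extracts, each pseudonymised by the third party.
dwp = [{"pid": pseudonymise("QQ123456A"), "benefit": "JSA"}]
hmrc = [{"pid": pseudonymise("QQ123456A"), "tax_band": "basic"}]

# The ADRC receives only the pseudonymised records and links on the token.
linked = [{**d, **h} for d in dwp for h in hmrc if d["pid"] == h["pid"]]
print(linked)
```

Note the keyed hash matters: a plain unkeyed hash of an NI number could be reversed by hashing every possible NI number, which is exactly the kind of determined-data-hunter attack the obscurity framing warns about.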
  • I haven’t had a chance to read this yet, but the World Economic Forum (WEF) have just published a report on Rethinking Personal Data.

    In the UK, the #midata route to encouraging folk to hand over access to the personal transaction data companies hold about them to other data processing and aggregation services continues apace, with a set of clauses added to the Enterprise & Regulatory Reform Bill – Midata.

    In the US, the related notion of Smart Disclosure is being pursued – “an innovative new tool designed to help consumers make better informed decisions and benefit from new products and services powered by data. It refers to expanding access to data in machine-readable formats so that innovators can create interactive services and tools that allow consumers to make important choices in sectors such as health care, education, finance, energy, transportation, and telecommunications.” Because of course “Giving consumers access to their own data—with comprehensive privacy and security safeguards—can empower consumers to make better choices.” Which is to say – if you give access to your data to a third party, they can use that, in combination with other data, to recommend services to you.
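
To make the "recommend services to you" point concrete, here's a toy example (all figures and tariff names invented) of what a Smart Disclosure-style tool might do with machine-readable personal transaction data: total up your usage and suggest the cheapest matching tariff.

```python
import csv, io

# Toy illustration (invented figures): a third-party tool consumes your
# machine-readable transaction data and recommends the cheapest tariff
# for your actual usage pattern.

# Machine-readable personal data, e.g. exported energy transactions.
transactions_csv = """month,kwh,cost_gbp
2013-01,350,52.50
2013-02,300,45.00
2013-03,280,42.00
"""

# Hypothetical competing tariffs: (price per kWh in GBP, monthly standing charge).
tariffs = {"TariffA": (0.15, 0.0), "TariffB": (0.12, 5.0)}

rows = list(csv.DictReader(io.StringIO(transactions_csv)))
total_kwh = sum(float(r["kwh"]) for r in rows)
months = len(rows)

def cost_on(rate, standing):
    """Cost of the observed usage under a given tariff."""
    return total_kwh * rate + standing * months

best = min(tariffs, key=lambda t: cost_on(*tariffs[t]))
print("Cheapest tariff for your usage:", best)
```

Which is also where the jigsaw pieces meet: the same access that powers the recommendation hands the third party a detailed picture of your behaviour.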

So – that’s a quick round-up of recent reports I’m aware of. Have I missed any?

See also:
Whither Transparency? This Week in Open Data
OpenData Reports Round Up (Links…)
So What, #midata? And #yourData, #ourData…
