Tagged: midata

When Your Identity Can Be Used Against You…

A couple of news stories came to my attention today, one via a news stand, one via a BBC news report that – it turns out – was recapitulating an earlier story.

Both of them demonstrate how you, the user of the online service, are only a “valued customer” insofar as you help generate revenue.

In the first case, it seems that Amazon – doyens of good employment and tax practice, ahem – are going to start suing folk who publish fake reviews on the site.


Ah, bless… Amazon fighting on behalf of the consumer…

A more timely report – including a copy of the complaint – is posted on Geekwire.

Apparently, “[d]efendants are misleading Amazon’s customers and tarnishing Amazon’s brand for their own profit and the profit of a handful of dishonest sellers and manufacturers. Amazon is bringing this action to protect its customers from this misconduct, by stopping defendants and uprooting the ecosystem in which they participate.”

It seems that each reviewer of a product has agreed to and is bound by the Conditions of Use of the Amazon site, so I guess that before posting a review, we should all be reading the Ts & Cs… Of course, Amazon ensures that folk using any of its websites have read – and understood – all the terms and conditions of the relevant site. Ahem, again… Cough, splutter, choke….

In case you’re interested, here are the US Terms and Conditions, under acceptance of which “you agree that the Federal Arbitration Act, applicable federal law, and the laws of the state of Washington” apply, presumably because even though Amazon is a Delaware corporation (not that Delaware not being the most transparent of jurisdictions is likely to have anything to do with that?!), its principal place of business is in Seattle, Washington. Sort of. In the UK, for example, which is to say the EU, it’s based in Luxembourg, presumably to help its tax position…? It must be nice being a big enough company to choose what jurisdiction to put what part of your business in, so you can, erm, maximise the benefits…

Although the current case is playing out against Amazon.com users, in case you’re interested, here are the UK Ts & Cs. Read and digest… Remember: you almost undoubtedly signed up to them… Ahem…

(Another by the by on partially related matters – it seems like if you work for Facebook in the UK, business has been good and you can expect a pretty good bonus, but if you’re a member of HM’s Inland Revenue, there’s not been that much business as far as tax is concerned (Facebook paid £4,327 corporation tax despite £35m staff bonuses). In passing, I wonder if Facebook pay the cleaning and ancillary staff that service Facebook’s UK premises a living wage, or whether we should be sleeping happy that soon enough these employees won’t be receiving as much “UK taxpayer’s money” in the form of (soon to be cut) working tax credits, that form of corporate welfare payment used to support companies at state expense, in lieu of them paying a reasonable wage (even leaving tax affairs aside)…) I’m not sure who’s taking the p**s more – international corporations or UK Gov…?

Here’s the root of the second story that caught my eye, in which it seems that successful online gamblers are deemed persona non grata if they’re no longer “revenue positive”, or whatever the phrase is.


So bear in mind that when companies collect your personal data, the benefit that they ultimately want to derive is to the company, not to you, the individual user. Sucker…

“Student for Life” – A Lifelong Learning Relationship With Your University… Or Linked In?

I’ve posted several times over the years wondering why universities don’t try to reimagine themselves and see undergrad degrees as the starting point for a lifelong relationship, as an educational service provider, with their first-degree alumni.

A paragraph in a recent Educause Review article, Data, Technology, and the Great Unbundling of Higher Education (via @Downes [commentary]), caught my eye this morning:

In an era of unbundling, when colleges and universities need to move from selling degrees to selling EaaS subscriptions, the winners will be those that can turn their students into “students for life” — providing the right educational programs and experiences at the right time.

On a quick read, there’s a lot in the article I don’t like, even though it sounds eminently reasonable, positing a competency based “full-stack model to higher education” in which providers will (1) develop and deliver specific high-quality educational experiences that produce graduates with capabilities that specific employers desperately want; (2) work with students to solve financing problems; and (3) connect students with employers during and following the educational experience and make sure students get a job.

The disruption that HE faces, then, is not one about course delivery, but rather one about life-as-career management?

What if … the software that will disrupt higher education isn’t courseware at all? What if the software is, instead, an online marketplace? Uber (market cap $40 billion) owns no vehicles. Airbnb (market cap $10 billion) owns no hotel rooms. What they do have are marketplaces with consumer-friendly interfaces. By positioning their interfaces between millions of consumers and sophisticated supply systems, Uber and Airbnb have significantly changed consumer behavior and disrupted these supply systems.

Is there a similar marketplace in the higher education arena? There is, and it has 40 million college students and recent graduates on its platform. It is called LinkedIn.

Competency marketplaces will profile the competencies (or capabilities) of students and job seekers, allow them to identify the requirements of employers, evaluate the gap, and follow the educational path that gets them to their destination quickly and cost-effectively. Although this may sound like science fiction, the gap between the demands of labor markets and the outputs of our educational system is both a complex sociopolitical challenge and a data problem that software, like LinkedIn, is in the process of solving. …

(I’m not sure if I don’t like the article because I disagree with it, or because it imagines a future that is one that I’d rather not see play out: the idea that learners don’t develop a longstanding relationship with a particular university, and consequently miss out on the social and cultural memories and relationships that develop therein, but instead take occasional offerings from a wide variety of providers and have their long term relationship with someone like LinkedIn, feels to me like something will be lost. Martin Weller captures some of it, I think, in his reflection yesterday on Product and process in higher ed in terms of how the “who knows?!” answer to the “what job are you going to do with that?” question about a particular degree becomes a nonsense answer, because the point of the degree has become just that: getting a particular job. Rather than taking a degree to widen your options, the degree becomes one of narrowing them down?! Maybe the first degree should be about setting yourself up to become a specialist over the course of occasional and extended education over a lifetime? UPDATE: related, this quote from an article on the “death of Twitter”: When a technology is used to shrink people’s possibilities, more than to expand them, it cannot create value for them. And so people will simply tune it out, ignore it, walk away from it if they can. In the sense that universities are a technology… hmmm…)

Furthermore, I get twitchy about this being another example of a situation where it’s tradable personal data that’s the valuable bargaining chip:

To avoid marginalization, colleges and universities need to insist that individuals own their competencies. Ensuring that ownership lies with the individual could make the competency profile portable and could facilitate movement across marketplaces, as well as to higher education institutions.

(As for how competencies are recognised, and fraud avoided in terms of folk claiming a competency that hasn’t been formally qualified, I’ve pondered this before, eg in the context of Time to build trust with an open achievements API?, or this idea for a Qualification Verification Service. It seems to me that universities don’t see it as their business proving that folk have the qualifications or certificates they’ve been awarded – which presumably means that if it does become yet another part of the EaaS marketplace, it’ll be purely corporate commercial interests that manage it.)

We’ll see….

A Nudge Here, A Nudge There, But With Meaning..

A handful of posts caught my attention yesterday around the whole data thang…

First up, a quote on the New Aesthetic blog: “the state-of-the-art method for shaping ideas is not to coerce overtly but to seduce covertly, from a foundation of knowledge”, referencing an article on Medium: Is the Internet good or bad? Yes. The quote includes mention of an Adweek article (this one? Marketers Should Take Note of When Women Feel Least Attractive; see also a response and the original press release) that “noted that women feel less attractive on Mondays, and that this might be the best time to advertise make-up to them.”

I took this as a cautionary tale about the way in which “big data” (qua theoryless statistical models based on the uncontrolled, if large, samples that make up “found” datasets, to pick up on a phrase used by Tim Harford in Big data: are we making a big mistake? [h/t @schmerg et al]) can be used to malevolent effect. (Thanks to @devonwalshe for highlighting that it’s not the data we should blame (“the data itself has no agency, so a little pointless to blame … Just sensitive to tech fear. Shifts blame from people to things.”) but the motivations and actions of the people who make use of the data.)

Which is to say – there’s ethics involved. As an extreme example, consider the possible “weaponisation” of data, for example in the context of PSYOP – “psychological operations” (are they still called that?). As the New Aesthetic quote, and the full Medium article itself, explain, the way in which data models allow messages to be shaped, targeted and tailored provides companies and politicians with a form of soft power that encourages us “to click, willingly, on a choice that has been engineered for us”. (This unpicks further – not only are we modelled so that the prompts are issued to us at an opportune time, but the choices we are provided with may also have been identified algorithmically.)

So that’s one thing…

Around about the same time, I also spotted a news announcement that Dunnhumby – an early bellwether of how to make the most of #midata consumer data – has bought “advertising technology” firm Sociomantic (press release): “dunnhumby will combine its extensive insights on the shopping preferences of 400 million consumers with Sociomantic’s intelligent digital-advertising technology and real-time data from more than 700 million online consumers to dramatically improve how advertising is planned, personalized and evaluated. For the first time, marketing content can be dynamically created specifically for an individual in real-time based on their interests and shopping preferences, and delivered across online media and mobile devices.” Good, oh…

A post on the Dunnhumby blog (It’s Time to Revolutionise Digital Advertising) provides further insight about what we might expect next:

We have decided to buy the company because the combination of Sociomantic’s technological capability and dunnhumby’s insight from 430m shoppers worldwide will create a new opportunity to make the online experience a lot better, because for the first time we will be able to make online content personalised for people, based on what they actually like, want and need. It is what we have been doing with loyalty programs and personalised offers for years – done with scale and speed in the digital world.

So what will we actually do to make that online experience better for customers? First, because we know our customers, what they see will be relevant and based on who they are, what they are interested in and what they shop for. It’s the same insight that powers Clubcard vouchers in the UK which are tailored to what customers shop for both online and in-store. Second, because we understand what customers actually buy online or in-store, we can tell advertisers how advertising needs to change and how they can help customers with information they value. Of course there is a clear benefit to advertisers, because they can spend their budgets only where they are talking to the right audience in the right way with the right content at the right time, measuring what works, what doesn’t and taking out a lot of guesswork. The real benefit though must be to customers whose online experience will get richer, simpler and more enjoyable. The free internet content we enjoy today is paid for by advertising, we just want to make it advertising offers and content you will enjoy too.

This needs probing further – are Dunnhumby proposing merging data about actual shopping habits in physical and online store with user cookies so that ads can be served based on actual consumption? (See for example Centralising User Tracking on the Web. How far has this got, I wonder? Seems like it may be here on mobile devices? Google’s New ‘Advertising ID’ Is Now Live And Tracking Android Phones — This Is What It Looks Like. Here’s the Android developer docs on Advertising ID. See also GigaOm on As advertisers phase out cookies, what’s the alternative?, eg in context of “known identifiers” (like email addresses and usernames) and “stable identifiers” (persistent device or browser level identifiers).)

That’s the second thing…

For some reason, it’s all starting to make me think of supersaturated solutions

PS FWIW, the OU/BBC co-produced Bang Goes the Theory (BBC1) had a “Big Data” episode recently – depending on when you read this, you may still be able to watch it here: Bang Goes the Theory – Series 8 – Episode 3: Big Data

More Digital Traces…

Via @wilm, I notice that it’s time again for someone (this time at the Wall Street Journal) to have written about the scariness that is your Google personal web history (the sort of thing you probably have to opt out of if you sign up for a new Google account, if other recent opt-in-by-default settings are anything to go by…)

It may not sound like much, but if you do have a Google account, and your web history collection is not disabled, you may find your emotional response to seeing months or years of your web/search history archived in one place surprising… Your Google web history.

Not mentioned in the WSJ article were some of the games that the Chrome browser gets up to. @tim_hunt tipped me off to a nice (if technically detailed, in places) review by Ilya Grigorik of some of the design features of the Chrome browser, and some of the tools built in to it: High Performance Networking in Chrome. I’ve got various pre-fetching tools switched off in my version of Chrome (tools that allow Chrome to pre-emptively look up web addresses and even download pages pre-emptively*) so those tools didn’t work for me… but it was interesting to look at chrome://predictors/ and see which keystrokes I type are good predictors of web pages I visit…

chrome predictors
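As a rough illustration of the idea behind that page (a toy sketch only, not Chrome's actual implementation; the class, method names and example URLs are all made up), a predictor can simply count how often the text typed so far ended in a visit to the URL it suggested:

```python
from collections import defaultdict


class PrefixPredictor:
    """Toy typed-text -> URL predictor (hypothetical, not Chrome's code)."""

    def __init__(self):
        # (typed_text, suggested_url) -> [hits, misses]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, typed_text, suggested_url, visited_url):
        """Log whether the URL suggested for this text was the one visited."""
        entry = self.counts[(typed_text, suggested_url)]
        if suggested_url == visited_url:
            entry[0] += 1  # suggestion was right
        else:
            entry[1] += 1  # user went somewhere else

    def confidence(self, typed_text, url):
        """Fraction of times this text led to this URL (0.0 if never seen)."""
        hits, misses = self.counts.get((typed_text, url), (0, 0))
        total = hits + misses
        return hits / total if total else 0.0


predictor = PrefixPredictor()
predictor.record("bb", "bbc.co.uk", "bbc.co.uk")
predictor.record("bb", "bbc.co.uk", "bbc.co.uk")
predictor.record("bb", "bbc.co.uk", "bbpress.org")
print(round(predictor.confidence("bb", "bbc.co.uk"), 2))  # 0.67
```

A browser could then decide to pre-resolve or pre-fetch only when the confidence for the current keystrokes is high enough, which is also why the table is such a revealing record of browsing habits.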

* By the by, I started to wonder whether webstats get messed up to any significant effect by Chrome pre-emptively prefetching pages that folk never actually look at…?

In further relation to the tracking of traffic we generate from our browsing habits, as we access more and more web/internet services through satellite TV boxes, smart TVs, and catchup TV boxes such as Roku or NowTV, have you ever wondered about how that activity is tracked? LG Smart TVs logging USB filenames and viewing info to LG servers describes not only how LG TVs appear to log the things you do view, but also the personal media you might view, and in principle can phone that information home (because the home for your data is a database run by whatever service you happen to be using – your data is midata is their data).

there is an option in the system settings called “Collection of watching info:” which is set ON by default. This setting requires the user to scroll down to see it and, unlike most other settings, contains no “balloon help” to describe what it does.

At this point, I decided to do some traffic analysis to see what was being sent. It turns out that viewing information appears to be being sent regardless of whether this option is set to On or Off.

you can clearly see that a unique device ID is transmitted, along with the Channel name … and a unique device ID.

This information appears to be sent back unencrypted and in the clear to LG every time you change channel, even if you have gone to the trouble of changing the setting above to switch collection of viewing information off.

It was at this point, I made an even more disturbing find within the packet data dumps. I noticed filenames were being posted to LG’s servers and that these filenames were ones stored on my external USB hard drive.

Hmmm… maybe it’s time I switched out my BT homehub for a proper hardware firewalled router with a good set of logging tools…?

PS FWIW, I can’t really get my head round how evil on the one hand, or damp squib on the other, the whole midata thing is turning out to be in the short term, and what sorts of involvement – and data – the partners have with the project. I did notice that a midata innovation lab report has just become available, though to you and me it’ll cost 1500 squidlly diddlies so I haven’t read it: The midata Innovation Opportunity. Note to self: has anyone got any good stories to say about TSB supporting innovation in micro-businesses…?

PPS And finally, something else from the Ilya Grigorik article:

The HTTP Archive project tracks how the web is built, and it can help us answer this question. Instead of crawling the web for the content, it periodically crawls the most popular sites to record and aggregate analytics on the number of used resources, content types, headers, and other metadata for each individual destination. The stats, as of January 2013, may surprise you. An average page, amongst the top 300,000 destinations on the web is:

– 1280 KB in size
– composed of 88 resources
– connects to 15+ distinct hosts

Let that sink in. Over 1 MB in size on average, composed of 88 resources such as images, JavaScript, and CSS, and delivered from 15 different own and third-party hosts. Further, each of these numbers has been steadily increasing over the past few years, and there are no signs of stopping. We are increasingly building larger and more ambitious web applications.

Is it any wonder that pages take so long to load on a mobile phone off the 3G network, and that you can soon eat up your monthly bandwidth allowance?
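To put rough numbers on that, here is a back-of-envelope sketch. The 1280 KB figure is the one quoted above; the 500 MB monthly allowance and the 1 Mbit/s effective 3G throughput are my own assumptions, picked purely for illustration:

```python
AVG_PAGE_KB = 1280        # average page size quoted above (January 2013)
ALLOWANCE_MB = 500        # ASSUMED monthly mobile data allowance
THROUGHPUT_MBIT_S = 1.0   # ASSUMED effective 3G download speed


def pages_per_allowance(allowance_mb=ALLOWANCE_MB, page_kb=AVG_PAGE_KB):
    """How many average pages fit into the allowance (1 MB = 1024 KB)."""
    return (allowance_mb * 1024) // page_kb


def transfer_seconds(page_kb=AVG_PAGE_KB, mbit_s=THROUGHPUT_MBIT_S):
    """Raw transfer time for one page, ignoring latency and connection setup."""
    bits = page_kb * 1024 * 8
    return bits / (mbit_s * 1_000_000)


print(pages_per_allowance())          # 400
print(round(transfer_seconds(), 1))   # 10.5
```

So under those assumptions, about 400 average page views would use up the whole month's allowance, and each page needs around ten seconds of pure transfer time before counting the round trips to 15 or so separate hosts.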

Tin Foil Hats or Baseball Caps? Why Your Face is a Cookie and Your Data is midata

Over the weekend, chatting with friends, I heard myself going off on what I imagine sounded like a paranoid fantasy fuelled privacy rant. But it stems from my own confusion about what it means for so much data to be out there about us, and whether the paranoid fantasy bit actually relates to:

– the extent to which folk would want to collect and process that data, and use it “against” me, as an individual;
– the extent to which data from disparate sources can be reconciled;
– the idea that all manner and variety of data about me is being collected anyway;
– the fact that all manner and variety of data about me could in principle be being collected.

So here are some more bits and pieces…

We all know that Tesco pioneered the use of loyalty cards for personalised customer marketing and store optimisation (eg The Tesco Data Business (Notes on “Scoring Points”)) and maybe that they track you round a store (or do they track your face?!), and now it seems that as well as supplementing their petrol stations with ANPR (Automatic Number Plate Recognition) systems (I assume their garages are equipped with them? Some of their car parks are…) they’ll be using face scanning Amscreen Point of Sale advertising screens to profile folk based on gender and age. (It’s possibly just easier to recognise someone by their face or phone and then lookup their gender and age; and economic circumstances; and etc etc?!)

Adrian Short has some further comments here… When does face scanning tip over into the full-time surveillance society?

Face recognition as commodity
See the ad? Face recognition as commodity service?

I don’t really know how concerning this is – folk I meet regularly recognise me, so what does it matter if machines universally and ubiquitously recognise me? Should I be concerned that my face is essentially a third party cookie, at least for unique ID purposes, that can be identified by anyone whose servers hook into a particular video or image feed?

And presumably things like my payment cards, and car number plate, and postcode, and etc etc can effectively be treated as third party cookies too in a similar respect of unique or group identification? (What should we call such things? I, me, my cookies…? icookies?! Or to tie into the notion of #midata, micookies?)

And should I be fearful that such companies buy and sell data about me via ad exchanges and cookie matching services?
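For what it’s worth, the mechanics of matching are not complicated. The sketch below is purely illustrative (the IDs, profile fields and the match table are all invented, and no real ad exchange API is being described): two parties each know the same person under a different identifier, and a table pairing those identifiers is all it takes to merge what they know.

```python
# Hypothetical data: each party knows the same person under a different ID.
retailer_profiles = {"loyalty-123": {"buys": ["nappies", "wine"]}}
adnetwork_profiles = {"cookie-abc": {"sites": ["news", "cars"]}}

# Hypothetical match table of the kind a cookie matching service might hold.
# A face, number plate or payment card could play the same role as either ID.
match_table = {"cookie-abc": "loyalty-123"}


def merge_profiles(matches, ad_profiles, shop_profiles):
    """Join two profile stores wherever the match table pairs their IDs."""
    merged = {}
    for cookie_id, loyalty_id in matches.items():
        if cookie_id in ad_profiles and loyalty_id in shop_profiles:
            merged[cookie_id] = {
                **ad_profiles[cookie_id],
                **shop_profiles[loyalty_id],
            }
    return merged


combined = merge_profiles(match_table, adnetwork_profiles, retailer_profiles)
print(combined)
```

Once any stable identifier appears in two datasets, it can act as the join key, which is the sense in which a face or a number plate behaves like a cookie.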

Surely companies using #midata can help me make better decisions, nudging me into taking courses of action that are good for me?

Food hygiene rating

So should we care? Should we care what data’s out there in the wild about me? Should we care that a shedload of #midata may actually be publicly available data, not least through cookie tracking, and micookie traces?

Should we care that services like Wonga.com may be making use of that data to make decisions about me, as described in Leaky data: How Wonga makes lending decisions (read it, it’s an interesting read…)?

And should we care that the decisions made on the basis of such publicly available but who knows what data are probably so algorithmically complex that there is no transparency or rationale in how or why such decisions are actually made the way they are? (See for example Transparent Predictions, Tal Zarsky, University of Illinois Law Review, Vol. 2013, No. 4, 2013.)

Not paranoid, just confused, and not really able to think any of this through…

PS an example of where Facebook’s at wrt automated face recognition around the end of 2013: DeepFace: Closing the Gap to Human-Level Performance in Face Verification

The Loss of Obscurity – A Round-Up of Recent Reports Relating to Privacy and Personal Consumer Data

A jumbled collection of recent clips and snippets, that feel to me as if they’re pieces of the same jigsaw…

  • An article in The Atlantic on Obscurity: A Better Way to Think About Your Data Than ‘Privacy’:

    …”privacy” is an over-extended concept. It grabs our attention easily, but is hard to pin down. Sometimes, people talk about privacy when they are worried about confidentiality. Other times they evoke privacy to discuss issues associated with corporate access to personal information. Fortunately, obscurity has a narrower purview.

    Obscurity is the idea that when information is hard to obtain or understand, it is, to some degree, safe. Safety, here, doesn’t mean inaccessible. Competent and determined data hunters armed with the right tools can always find a way to get it. Less committed folks, however, experience great effort as a deterrent.

    This can be a useful distinction to make, I think, when considering the uses to which “personal data” is, or can be, put. Obscure things are hard to find. Just because a dataset is “anonymised” doesn’t mean that a determined data hunter (DDH) won’t be able to deanonymise elements of it.

    Related to obscurity is obfuscation – coding things in such a way that you accept the information contained in the dataset is open, but you do your damnedest to deliberately make it difficult for people to extract certain meaningful elements from it. (For example, How can I obfuscate JavaScript?.) Looking at the way many open public datasets are published, you might think an obfuscation step had been built in to the publication process;-)

    For a linked take in defense of privacy (from which we can maybe identify useful attributes associated with the notion of privacy), see Privacy is not the enemy – rebooted… Paul Bernal.

  • Overt camera surveillance (cameras in carparks, shops and town centres, for example, or ANPR (Automatic Number Plate Recognition) cameras in petrol station forecourts and again, in car parks) is presumably deployed to dissuade people from performing particular acts by making it known to them that if they engage in those acts they will be held accountable for them. If we pick this apart a little, CCTV surveillance can operate in two modes: 1) identifying particular actions and then (maybe) taking steps to prevent their furtherance; 2) identifying people captured in the video. Whilst the aim of (2) may be to identify people involved in (1), (2) may also be used to identify and track people in general, irrespective of the actions they are performing. A currently open Home Office Surveillance camera code of practice consultation gives some background to what is deemed to be acceptable use of, and controls on, the use of overt camera surveillance, although it does not seem to explore any possible “evil consequences” of such technology. I’m not sure whether it covers the use of drone-based surveillance either?!

    A wider review of surveillance systems can be found in an EU Seventh Framework Programme report – IRISS (Increasing Resilience in Surveillance Societies) Deliverable D1.1: Surveillance, fighting crime and violence.

  • Another key ingredient in the management of privacy and obscurity is the notion of identity and identities. UKGov has been considering “identity” in two different ways recently:
    • The BIS Foresight project on Future Identities/The Future of Identity reviews different notions of identity (where identity is “the sum of those characteristics which determine who a person is”) and the different identities we may express:

      This Foresight Report provides an evidence base for decision makers in government to understand better how people’s identities in the UK might change over the next 10 years. The impact of new technologies and increasing uptake of online social media, the effects of globalisation, environmental disruption, the consequences of the economic downturn, and growing social and cultural diversity are all important drivers of change for identities. However, there is a gap in understanding how identities might change in the future, and how policy makers might respond to such change.

    • When working with services online, we’re all familiar with the notion of having different login identities with different services. When working with government services, there may be a requirement to ensure that a given user login identity actually relates to a particular person. The DWP Identity Assurance Scheme seems to be working with commercial providers (Post Office, Cassidian, Digidentity, Experian, Ingeus, Mydex, Verizon, PayPal) to establish an “identity registration service [that] will enable benefit claimants to choose who will validate their identity by automatically checking their authenticity with the provider before processing online benefit claims”. Whatever that is supposed to mean. Does it mean when I create a DWP login I can use my PayPal credentials to prove to DWP who I am? Or does it mean I’ll be able to log in to DWP services using my PayPal credentials? I couldn’t find anything related in a quick skim of the DWP Digital Strategy on this? Are there any good references out there? (UPDATE – ah, this ComputerWeekly report suggests the identity providers will do verification and manage logins – not sure if those logins will be unique to accessing DWP/gov.uk services, though, or whether they would also access eg my PayPal account?)

      See also the Open Identity Exchange, a scheme for building trusted relationships between online identity providers on a global scale…

  • A recent report from the Administrative Data Taskforce – Improving Access for Research and Policy – provides a series of recommendations for establishing a research network for analysing and linking administrative datasets. Among other things, the report suggests the following model for “de-identifying” linked datasets:

    ADT - de-identified record linkage

    Here’s a sample of some of the other sorts of things the ADT recommended:

    • R1.1 The ADRCs will be responsible for commissioning and undertaking linkage of data from different government departments and making the linked data available for analysis, thereby creating new resources for a growing research agenda. Analyses of within sector data (e.g. linking medical records between primary and secondary care) and linking of data between departments for operational purposes may continue to be conducted by the relevant government departments and agencies.
    • R1.3 Personal identifiers (names, addresses, precise date of birth, national insurance numbers, etc.) attached to administrative data records will not be available to, or held in, the ADRCs; hence, both ADRC staff and researchers accessing data through ADRCs will not have sight of such personal identifying information. Linkage will be achieved through the use of third parties who have the expertise to provide secure data linkage services for matching personal records from existing data systems.
    • R1.6 Access to data held in the ADRCs by accredited researchers will be possible using three approaches. For all of these, no individual-level records will be released from the ADRCs. First, researchers can visit the ADRC secure data access facility, where their analyses of the relevant data sub-set will be overseen by the ADRC support team. Second, researchers can submit statistical syntax to the ADRC support team who will run the analysis on the dataset on behalf of the researcher (results would be thoroughly checked before return). Third, remote secure data access facilities may be established which allow virtual access to datasets held in the ADRCs. With the latter approach, no data would be transferred to these remote safe settings, which would use state-of-the-art technologies and apply rigorous international standards, equivalent to those used in the ADRCs themselves, to provide a secure environment for researchers to undertake their analyses.
    • R1.11 … However, the Taskforce recognises that there could well be potential benefits that derive from private sector data and related research interests. The Governing Board will, at an early stage, investigate guidelines for access and linkage by private sector interests, …
  • I haven’t had a chance to read this yet, but the World Economic Forum (WEF) have just published a report on Rethinking Personal Data.

    In the UK, the #midata route to encouraging folk to hand over access to their personal transaction data associated with a company to other data processing and aggregation services continues apace with a set of clauses added to the Enterprise & Regulatory Reform Bill – Midata.

    In the US, the related notion of Smart Disclosure is being pursued – “an innovative new tool designed to help consumers make better informed decisions and benefit from new products and services powered by data. It refers to expanding access to data in machine-readable formats so that innovators can create interactive services and tools that allow consumers to make important choices in sectors such as health care, education, finance, energy, transportation, and telecommunications.” Because of course “Giving consumers access to their own data – with comprehensive privacy and security safeguards – can empower consumers to make better choices.” Which is to say – if you give access to your data to a third party, they can use that, in combination with other data, to recommend services to you.
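One way to picture the de-identified linkage described in the ADT recommendations above (R1.3 in particular) is a trusted third party swapping personal identifiers for a keyed hash before anything reaches a research centre. The sketch below is only an illustration under that assumption: it is not the ADT's specified method, and the key, field names and records are all invented.

```python
import hashlib
import hmac

# Hypothetical secret, held only by the trusted third party doing the linkage.
LINKAGE_KEY = b"held-only-by-the-trusted-third-party"

IDENTIFIER_FIELDS = ("name", "nino", "dob")  # assumed identifying fields


def pseudonym(nino, dob, key=LINKAGE_KEY):
    """Keyed hash of normalised identifiers; meaningless without the key."""
    msg = f"{nino.strip().upper()}|{dob}".encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def de_identify(records, key=LINKAGE_KEY):
    """Strip personal identifiers and attach a link_id in their place."""
    out = []
    for rec in records:
        payload = {k: v for k, v in rec.items() if k not in IDENTIFIER_FIELDS}
        payload["link_id"] = pseudonym(rec["nino"], rec["dob"], key)
        out.append(payload)
    return out


def link(left, right):
    """Join two de-identified datasets on link_id (what a researcher sees)."""
    index = {r["link_id"]: r for r in right}
    return [{**l, **index[l["link_id"]]} for l in left if l["link_id"] in index]


tax = [{"name": "A. Person", "nino": "QQ123456C", "dob": "1970-01-01",
        "income_band": "B"}]
health = [{"name": "A Person", "nino": "qq123456c ", "dob": "1970-01-01",
           "gp_visits": 3}]

linked = link(de_identify(tax), de_identify(health))
print(linked)
```

The linked record carries the income band and GP visit count but no name, national insurance number or date of birth. Whether that counts as obscure enough is another matter: a determined data hunter may still be able to re-identify someone from the remaining fields.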

So – that’s a quick round-up of recent reports I’m aware of. Have I missed any?

See also:
Whither Transparency? This Week in Open Data
OpenData Reports Round Up (Links…)
So What, #midata? And #yourData, #ourData…

#Midata Is Intended to Benefit Whom, Exactly?

A CTRL-Shift blog post entitled MIDATA Legislation Begins mentions, but doesn’t link to, “an amendment to the Enterprise and Regulator Reform Bill in the House of Lords”, presumably referring to paragraphs 58C*, 58D* and 58E* proposed by Viscount Younger of Leckie in the Seventh Marshalled List of Amendments:


Insert the following new Clause—

“Supply of customer data

(1) The Secretary of State may by regulations require a regulated person to provide customer data—

(a) to a customer, at the customer’s request;

(b) to a person who is authorised by a customer to receive the data, at the customer’s request or, if the regulations so provide, at the authorised person’s request.

(2) “Regulated person” means—

(a) a person who, in the course of a business, supplies gas or electricity to any premises;

(b) a person who, in the course of a business, provides a mobile phone service;

(c) a person who, in the course of a business, provides financial services consisting of the provision of current account or credit card facilities;

(d) any other person who, in the course of a business, supplies or provides goods or services of a description specified in the regulations.

(3) “Customer data” means information which—

(a) is held in electronic form by or on behalf of the regulated person, and

(b) relates to transactions between the regulated person and the customer.

(4) Regulations under subsection (1) may make provision as to the form in which customer data is to be provided and when it is to be provided (and any such provision may differ depending on the form in which a request for the data is made).

(5) Regulations under subsection (1)—

(a) may authorise the making of charges by a regulated person for complying with requests for customer data, and

(b) if they do so, must provide that the amount of any such charge—

(i) is to be determined by the regulated person, but

(ii) may not exceed the cost to that person of complying with the request.

(6) Regulations under subsection (1)(b) may provide that the requirement applies only if the authorised person satisfies any conditions specified in the regulations.

(7) In deciding whether to specify a description of goods or services for the purposes of subsection (2)(d), the Secretary of State must (among other things) have regard to the following—

(a) the typical duration of the period during which transactions between suppliers or providers of the goods or services and their customers take place;

(b) the typical volume and frequency of the transactions;

(c) the typical significance for customers of the costs incurred by them through the transactions;

(d) the effect that specifying the goods or services might have on the ability of customers to make an informed choice about which supplier or provider of the goods or services, or which particular goods or services, to use;

(e) the effect that specifying the goods or services might have on competition between suppliers or providers of the goods or services.

(8) The power to make regulations under this section may be exercised—

(a) so as to make provision generally, only in relation to particular descriptions of regulated persons, customers or customer data or only in relation to England, Wales, Scotland or Northern Ireland;

(b) so as to make different provision for different descriptions of regulated persons, customers or customer data;

(c) so as to make different provision in relation to England, Wales, Scotland and Northern Ireland;

(d) so as to provide for exceptions or exemptions from any requirement imposed by the regulations, including doing so by reference to the costs to the regulated person of complying with the requirement (whether generally or in particular cases).

(9) For the purposes of this section, a person (“C”) is a customer of another person (“R”) if—

(a) C has at any time, including a time before the commencement of this section, purchased (whether for the use of C or another person) goods or services supplied or provided by R or received such goods or services free of charge, and

(b) the purchase or receipt occurred—

(i) otherwise than in the course of a business, or

(ii) in the course of a business of a description specified in the regulations.

(10) In this section, “mobile phone service” means an electronic communications service which is provided wholly or mainly so as to be available to members of the public for the purpose of communicating with others, or accessing data, by mobile phone.”


Insert the following new Clause—

“Supply of customer data: enforcement

(1) Regulations may make provision for the enforcement of regulations under section (Supply of customer data) (“customer data regulations”) by the Information Commissioner or any other person specified in the regulations (and, in this section, “enforcer” means a person on whom functions of enforcement are conferred by the regulations).

(2) The provision that may be made under subsection (1) includes provision—

(a) for applications for orders requiring compliance with the customer data regulations to be made by an enforcer to a court or tribunal;

(b) for notices requiring compliance with the customer data regulations to be issued by an enforcer and for the enforcement of such notices (including provision for their enforcement as if they were orders of a court or tribunal).

(3) The provision that may be made under subsection (1) also includes provision—

(a) as to the powers of an enforcer for the purposes of investigating whether there has been, or is likely to be, a breach of the customer data regulations or of orders or notices of a kind mentioned in subsection (2)(a) or (b) (which may include powers to require the provision of information and powers of entry, search, inspection and seizure);

(b) for the enforcement of requirements imposed by an enforcer in the exercise of such powers (which may include provision comparable to any provision that is, or could be, included in the regulations for the purposes of enforcing the customer data regulations).

(4) Regulations under subsection (1) may—

(a) require an enforcer (if not the Information Commissioner) to inform the Information Commissioner if the enforcer intends to exercise functions under the regulations in a particular case;

(b) provide for functions under the regulations to be exercisable by more than one enforcer (whether concurrently or jointly);

(c) where such functions are exercisable concurrently by more than one enforcer—

(i) designate one of the enforcers as the lead enforcer;

(ii) require the other enforcers to consult the lead enforcer before exercising the functions in a particular case;

(iii) authorise the lead enforcer to give directions as to which of the enforcers is to exercise the functions in a particular case.

(5) Regulations may make provision for applications for orders requiring compliance with the customer data regulations to be made to a court or tribunal by a customer who has made a request under those regulations or in respect of whom such a request has been made.

(6) Subsection (8)(a) to (c) of section (Supply of customer data) applies for the purposes of this section as it applies for the purposes of that section.

(7) The Secretary of State may make payments out of money provided by Parliament to an enforcer.

(8) In this section, “customer” and “regulated person” have the same meaning as in section (Supply of customer data).”


Insert the following new Clause—

“Supply of customer data: supplemental

(1) The power to make regulations under section (Supply of customer data) or (Supply of customer data: enforcement) includes—

(a) power to make incidental, supplementary, consequential, transitional or saving provision;

(b) power to provide for a person to exercise a discretion in a matter.

(2) Regulations under either of those sections must be made by statutory instrument.

(3) A statutory instrument containing regulations which consist of or include provision made by virtue of section (Supply of customer data)(2)(d) may not be made unless a draft of the instrument has been laid before, and approved by a resolution of, each House of Parliament.

(4) A statutory instrument containing any other regulations under section (Supply of customer data) or section (Supply of customer data: enforcement) is subject to annulment in pursuance of a resolution of either House of Parliament.”

Note that 58C/1/b states that data could be released “to a person who is authorised by a customer to receive the data, at the customer’s request or, if the regulations so provide, at the authorised person’s request.” So if I say to my electricity company that they can share the data with you (“a person who is authorised by a customer to receive the data”), the company can share the data with you if I ask them to or if you ask them. Which is presumably a bit like how direct debits work (I sign something and give it to you and you then go to my bank and request access to my bank account). So the proposed legislation seems to allow for (or at least, not exclude?) the creation of data aggregators who might start to aggregate data from a variety of “regulated persons” at my authorisation.
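To check my reading of 58C(1), here's a minimal sketch of the release rule as I understand it. Everything in it is an assumption on my part: the class, the method names and the `regulations_allow_authorised_requests` switch (my stand-in for “if the regulations so provide”) are purely illustrative, and the actual regulations could add conditions that this ignores.

```python
from dataclasses import dataclass, field


@dataclass
class RegulatedPerson:
    """Toy model of a supplier covered by the proposed clause (hypothetical)."""
    name: str
    # Stand-in for "if the regulations so provide" in 58C(1)(b); an assumption.
    regulations_allow_authorised_requests: bool = False
    # (customer, authorised person) pairs the supplier has on record.
    authorisations: set = field(default_factory=set)

    def authorise(self, customer: str, authorised_person: str) -> None:
        """Record that the customer has authorised someone to receive their data."""
        self.authorisations.add((customer, authorised_person))

    def may_release(self, customer: str, recipient: str, requested_by: str) -> bool:
        """Could customer data go to `recipient`, given who made the request?"""
        # 58C(1)(a): to the customer, at the customer's request.
        if recipient == customer:
            return requested_by == customer
        # 58C(1)(b): the recipient must have been authorised by the customer...
        if (customer, recipient) not in self.authorisations:
            return False
        # ...and the request comes from the customer,
        if requested_by == customer:
            return True
        # ...or from the authorised person, but only if the regulations allow it.
        return requested_by == recipient and self.regulations_allow_authorised_requests


supplier = RegulatedPerson("ElectricCo")
supplier.authorise("me", "you")
print(supplier.may_release("me", "you", requested_by="me"))   # I ask on your behalf
print(supplier.may_release("me", "you", requested_by="you"))  # you ask directly
```

The second call only succeeds once the switch is flipped, which is the direct-debit-like case: the authorised person goes to the supplier themselves.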

Note that I assume other regulations, such as the Data Protection Act, preclude those data aggregators from acting as data brokers, “companies that collect personal information about consumers from a variety of public and non-public sources and resell the information to other companies” (FTC [the US Federal Trade Commission] to Study Data Broker Industry’s Collection and Use of Consumer Data).

It’s also worth mentioning that the amendment doesn’t itself seem to enact any midata legislation: “The Secretary of State may by regulations require…”, which is presumably setting up the opportunity for the Secretary of State to bring it about through a Statutory Instrument or similar?

(In passing, the tabled amendments to the Bill also include changes to the proposed amendments to the Copyright, Designs and Patents Act 1988 (part 6 of the Bill, relating to licensing of orphan works, collective licensing, duration of copyright et al.) as well as the creation of a Director General of Intellectual Property Rights (28C).)

The day before, CTRL-Shift had also published a post on Building Relationships for a New Data Age:

The challenge (and opportunity) is to start building an information sharing relationship with customers where both sides use data sharing to save time, cut costs and be more efficient – and to add new value.

In a world that’s rapidly going digital, an information sharing relationship makes it normal for individuals to provide the organisations they deal with new, additional and updated data, and for organisations to also routinely provide customers with additional data or data-based services. Information sharing relationships and services are becoming a key influence on which organisations customers choose to do business with, and how valuable this business becomes.

The question is, how do we get from A to B? From today’s ‘one way’ norm where organisations collect data about customers and send messages to them, to a more equal and valuable information sharing partnership? There are three key pillars to an information sharing relationship:

– establish a trustworthy ‘default setting’ for the use of personal data
– give users/customers control
– earn VPI (volunteered personal information) via new information services.

Volunteered personal information, a phrase straight out of the Facebook playbook…

The post then discusses the importance of getting default settings right, in part to avoid a public backlash and a “loss of trust” when folk realise the terms and conditions allow the companies involved to do whatever it is they say the company can, before describing how companies can Earn VPI via information services:

Getting default settings right and giving users control only create the context needed for a healthy information sharing relationship. They don’t actually get the information flowing. To do that, organisations need to:

– elicit valuable additional information from customers
– release and provide customers with additional information and/or information based services that help them make better decisions and make it easier for them to get stuff done and achieve their goals – i.e. services that add new value.

In theory, eliciting VPI and offering added value information services are two separate things. But in reality they are likely to advance hand in hand: with individuals offering additional information (in an environment they can trust because of default settings and user control) as a way to get additional value from information-driven services.

Hmmm… elicit valuable additional information from customers; and then release and provide customers with … services that add new value (I can play the selective cut and paste game too…;-) #midata is presumably being sold to consumers on the basis of the latter, particularly those services that “help them make better decisions and make it easier for them to get stuff done and achieve their goals”.

And then we read:

In theory, eliciting VPI and offering added value information services are two separate things. But in reality they are likely to advance hand in hand: with individuals offering additional information (in an environment they can trust because of default settings and user control) as a way to get additional value from information-driven services.

In theory, eliciting VPI and offering added value information services are two separate things. In the land where the flowers grow and the flopsy bunnies frolic, blissfully unaware that they are what Farmer McGregor actually sells to the butcher, presumably at a greater price than he can sell the lettuces the flopsy bunnies eat to the local greengrocer. Or something like that.

But in reality sound the drums of doom…in reality they are likely to advance hand in hand. Erm…of course… No-one wants shed loads of transactional data for personal use…with individuals offering additional information as a way to get additional value from information-driven services.

Yep… #midata is a way of getting you to give shed loads of low quality transactional data to third parties (who may or may not aggregate it with other data you grant them access to) and then give them a shed load more data before it actually becomes useful. Because that’s how data works…but it’s not how the dream is sold…

Hmmm… I wonder, does the draft legislation say anything about the extent to which an authorised person is allowed to aggregate and mine data from regulated person(s) that relates to different customers, either of the same or of different regulated persons? Because therein lies another of those “in reality” sources of potential value add…though we really should also try to imagine what those sources might be. (Is receiving targeted ads “value add” for me over random junk mail?)

On the other side of the fence, sort of, we see a Private Member’s Bill (Ten Minute Rule Bill?) from John Denham, Labour MP for Southampton, Itchen (not, apparently, the constituency in which the University of Southampton resides…) on Supermarket price transparency which seeks to require supermarkets “to release pricing data product by product and store by store [update: Supermarket Pricing Information Bill 2012-13]. This price information would not only enable the comparison of basic product prices, but also enable consumers to understand the differences in pricing between stores within the same retail chain, or variations in pricing of goods in different areas and regions.” In addition, it is claimed that the Private Member’s Bill “would also enable efficient scrutiny of special offers, multi-buys, ‘bogofs’ and other price promotions that have been the subject of recent criticism and regulatory action.”
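As a rough idea of the sort of thing product-by-product, store-by-store pricing data would make possible, here's a minimal sketch that finds how much the price of the same product varies between stores in the same chain. The chains, stores, products and prices are all made up for illustration, and the record layout is my own assumption; the Bill doesn't (as far as I can tell) specify a data format.

```python
from collections import defaultdict

# Entirely hypothetical records: (chain, store, product, price in pence).
prices = [
    ("ChainA", "Store 1", "milk 2l", 129),
    ("ChainA", "Store 2", "milk 2l", 139),
    ("ChainB", "Store 9", "milk 2l", 125),
    ("ChainA", "Store 1", "bread", 100),
    ("ChainA", "Store 2", "bread", 100),
]


def within_chain_spread(records):
    """Return max minus min price for each (chain, product) across that chain's stores."""
    by_key = defaultdict(list)
    for chain, _store, product, price in records:
        by_key[(chain, product)].append(price)
    return {key: max(vals) - min(vals) for key, vals in by_key.items()}


spread = within_chain_spread(prices)
for (chain, product), pence in sorted(spread.items()):
    print(f"{chain} {product}: {pence}p difference between stores")
```

Swap the grouping key to `product` alone and you get the cross-chain comparison instead.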

PS See also So What, #midata? And #yourData, #ourData…