Tin Foil Hats or Baseball Caps? Why Your Face is a Cookie and Your Data is midata

Over the weekend, chatting with friends, I heard myself going off on what I imagine sounded like a paranoid-fantasy-fuelled privacy rant. But it stems from my own confusion about what it means for so much data to be out there about us, and whether the paranoid fantasy bit actually relates to:

– the extent to which folk would want to collect and process that data, and use it “against” me, as an individual;
– the extent to which data from disparate sources can be reconciled;
– the idea that all manner and variety of data about me is being collected anyway;
– the fact that all manner and variety of data about me could in principle be being collected.

So here are some more bits and pieces…

We all know that Tesco pioneered the use of loyalty cards for personalised customer marketing and store optimisation (eg The Tesco Data Business (Notes on “Scoring Points”)) and maybe that they track you round a store (or do they track your face?!). Now it seems that, as well as supplementing their petrol stations with ANPR (Automatic Number Plate Recognition) systems (I assume their garages are equipped with them? Some of their car parks are…), they’ll be using face scanning Amscreen Point of Sale advertising screens to profile folk based on gender and age. (It’s possibly just easier to recognise someone by their face or phone and then look up their gender and age; and economic circumstances; and etc etc?!)

Adrian Short has some further comments here… When does face scanning tip over into the full-time surveillance society?

See the ad? Face recognition as a commodity service?

I don’t really know how concerning this is – folk I meet regularly recognise me, so what does it matter if machines universally and ubiquitously recognise me? Should I be concerned that my face is essentially a third party cookie, at least for unique ID purposes, that can be identified by anyone whose servers hook into a particular video or image feed?

And presumably things like my payment cards, and car number plate, and postcode, and etc etc can effectively be treated as third party cookies too in a similar respect of unique or group identification? (What should we call such things? I, me, my cookies…? icookies?! Or to tie into the notion of #midata, micookies?)

And should I be fearful that such companies buy and sell data about me via ad exchanges and cookie matching services?
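To make the "micookie" idea a bit more concrete, here is a minimal sketch of how two organisations might reconcile records using a physical identifier as the shared key. Everything in it is an assumption for illustration: the record layouts, the field names, the `normalise` and `match_on_key` helpers and all the values are invented, and real cookie matching services will work rather differently.

```python
# Hypothetical records held by two unrelated organisations.
# All field names and values are invented for illustration only.
supermarket_visits = [
    {"plate": "AB12 CDE", "store": "Store A", "spend": 42.10},
    {"plate": "XY34 ZZZ", "store": "Store B", "spend": 7.50},
]
car_park_log = [
    {"plate": "ab12cde", "postcode": "MK7", "arrived": "08:55"},
    {"plate": "LM56 NOP", "postcode": "PO30", "arrived": "09:10"},
]


def normalise(plate):
    """Reduce a number plate to a canonical key: upper case, no spaces."""
    return plate.replace(" ", "").upper()


def match_on_key(left, right, key):
    """Join two lists of records on a shared, normalised identifier."""
    # Index the right-hand records by their normalised key.
    index = {normalise(r[key]): r for r in right}
    merged = []
    for record in left:
        other = index.get(normalise(record[key]))
        if other is not None:
            # Combine both organisations' fields into a single profile.
            combined = {**other, **record}
            combined[key] = normalise(record[key])
            merged.append(combined)
    return merged


profiles = match_on_key(supermarket_visits, car_park_log, "plate")
print(profiles)
```

The point of the sketch is only that neither organisation needs to know my name: the number plate (or face, or payment card) does the job of the cookie, and the messy bit in practice is the normalisation step.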

Surely companies using #midata can help me make better decisions, nudging me into taking courses of action that are good for me?

Food hygiene rating

So should we care? Should we care what data’s out there in the wild about me? Should we care that a shedload of #midata may actually be publicly available data, not least through cookie tracking, and micookie traces?

Should we care that services like Wonga.com may be making use of that data to make decisions about me, as described in Leaky data: How Wonga makes lending decisions (read it, it’s an interesting read…)?

And should we care that the decisions made on the basis of such publicly available but who knows what data are probably so algorithmically complex that there is no transparency or rationale in how or why such decisions are actually made the way they are? (See for example Transparent Predictions, Tal Zarsky, University of Illinois Law Review, Vol. 2013, No. 4, 2013.)

Not paranoid, just confused, and not really able to think any of this through…

PS an example of where Facebook’s at wrt automated face recognition around the end of 2013: DeepFace: Closing the Gap to Human-Level Performance in Face Verification

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

7 thoughts on “Tin Foil Hats or Baseball Caps? Why Your Face is a Cookie and Your Data is midata”

  1. I think the crux is control: all this midata gathering is OK, as long as there is a balance of power, which means a way for an individual to control what they relinquish in exchange for greater convenience and/or a better deal. I don’t have a clubcard, which means I won’t get freebies, and that’s how I like it.

    What is not OK are situations where the individual has no control; face recognition, ANPR and other micookies as well as Wonga scoring are all instances of that. I can’t really choose not to have a face, drive a car without license plates or, if I’m sufficiently desperate, not get a pay day loan.

    1. I’m with you Wilbert on the Control.

      A decade+ back, when thinking about the things that were kept/known about you then (minuscule by today’s standards of course), it felt right that access to data about me should be notified to me when it was accessed – so I know who, or more likely what, was wanting to find out which piece of information about me.

      My imagined ideal was that I would then have control over that request, deciding what info went to who/what and at what time.

      Clearly an information-firewall-type set up with rules – eg My current bank can access all financial records of mine at any time, but I’d like to see a summary – would be needed, or you’d end up being constantly distracted by dealing with the requests.

      Having the ability to choose who could see what would even encourage the inquirer to specify Why they wanted to find out x about me, as it would increase the likelihood of me agreeing.

      This is of course, my ideal world and near impossible to retro-fit into Britain. Perhaps Iceland?


      1. Wilbert/Simon

        I think there are a couple of issues in there that intrigue me:

        1) the notion that I don’t have control when someone uses a physical attribute as a key to identify me or associate me with a particular data record; (hence the notion that a physical property of me or a thing closely identified as me acts as a third party cookie for ID purposes; and the consequent notion of cookie matching in the sense that different orgs can match data about me based on some physical attribute of me)

        2) that some decisions may get made on the basis – or requirement – of public disclosures; eg I can’t get a loan or a job because I don’t have an active Facebook account. This has other correlates of course in a far more general sense – digital first as excluding folk who aren’t online, for example.

  2. I think being confused is a good place to be about this. I sit on the other side of the debate most of the time, and am too readily forced into a knee-jerk “you’re all tinfoil hatters” when I see privacy overclaims. The reality for most of us is that the “unified view of the customer” is a mythical construct; the tools are clumsy and the data dirty and confused. We’re able to add a little incremental value to our campaigns by judicious application of statistical methods, but the reality is that we’re still mostly flying blind.

    That said, the issue is real; and the actors include not only “bad government” and “soulless corporations”, but also cybercriminals. That alone makes it a debate worth having…

    BTW — this site drops a few cookies itself, among them Quantcast (who’ll happily sell us lookalike audiences based on aggregating these data.)

  3. I have recently been delighted to find there is a move towards algorithmic transparency. This applies to algo-based decision making and could be Tesco profiling you or the government making decisions about who to give benefits or health treatments to. You ought to be able to check the algorithms to ensure they aren’t buggy…

    Of course, even if the algorithms are ok, you can only hope they are working on correct data about you…

  4. @matt I agree about the practical problems in matching/reconciling data, hence trying to bring ‘in principle’ into the argument, as opposed to ‘in actual practice’. Even there, there may be some things that are to all intents and purposes impossible even if possible in principle (NP-complete style problems in large spaces, for example).

    @laura re: algorithmic transparency; agreed, but I can give you a neural net and the weight values for a net with categorical value inputs as well as reals, and I’d defy you to give me a logical/rational explanation, in human understandable argumentation rules, of how it generates the output.
