Confused About the Consequences

In the previous couple of posts, I’ve rambled about web apps that will find a book from its cover and a song just by playing it and your online contacts across a myriad of services from your username on a single service.

But today I saw something that brought home to me the consequences of aggregating millions of tiny individual actions, in this case photo uploads to the flickr social photo site.

From my reading of the post, the purple overlays in the images above – not the blue bounding boxes – are generated automatically by clustering geotagged and placename-tagged images and extrapolating a well-contoured shape around them.

That is, from the photos tagged “London” [that is, photos that are tagged with London in Yahoo’s WOE service], the algorithm creates the purple “London city” overlay in the above diagram.
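
To make the "shape around a cluster" idea concrete, here's a minimal sketch. It is not Flickr's method: I've swapped in a plain convex hull (Andrew's monotone chain) as the simplest possible outline, and I treat longitude/latitude as flat x/y coordinates, which is only tolerable over a small area. The function names and the sample points are made up for illustration.

```python
from collections import defaultdict


def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise (or straight) turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def outlines_by_place(photos):
    """Group (place, lon, lat) records by place and outline each group."""
    groups = defaultdict(list)
    for place, lon, lat in photos:
        groups[place].append((lon, lat))
    return {place: convex_hull(pts) for place, pts in groups.items()}


# Made-up sample records, not real Flickr data.
photos = [
    ("London", -0.13, 51.51), ("London", -0.08, 51.50),
    ("London", -0.10, 51.53), ("London", -0.11, 51.49),
    ("London", -0.10, 51.51),  # interior point: contributes nothing to the outline
]
outline = outlines_by_place(photos)["London"]
```

Only the photos on the edge of the cluster end up defining the shape; a convex hull can't follow concave coastlines or borders, so a real implementation would need something more refined.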

For each and every photo upload, there is maybe a tiny personal consequence. For millions of photo uploads, there are consequences like this… (From millions of personal votes cast, there’s the possible consequence of change…) [Update: apparently, flickr received its 3 billionth upload at the start of November…]

And it struck me that even the relatively unsophisticated form of signals intelligence that is traffic analysis was capable of changing the face of war. So what are the consequences of traffic analysis at this scale?

What are the possible consequences? What are we walking into?

(Of course, following a brief moment of “I want to stop contributing to this; I’m gonna kill my computer and go and grow onions somewhere”, I then started wondering: “hmm, maybe if we also mined the info about what camera took each photo, and looked up the price of that camera, we might be able to generate socio-economic overlays over different neighbourhoods, and then… arrghh… stop, no, evil, evil…” ;-)

So to add to the mix, here are a couple more things that the web made easy this week. Firstly, the Google Visualisation API was extended so that it could consume data in a simple format from your own data sources. That is, if you allow your own database to output data in a simple tabular structure, the Google Visualisation API makes it trivial to generate charts and graphs from that data. Secondly, Google added RSS feed support to their Google Alerts service. This makes it easy to subscribe to an RSS feed that will alert you to new results on Google for a particular search. What really surprised me was how, after setting up a couple of alerts, they appeared in my Google Reader account without me doing anything (or maybe that should be – without me changing something to say “no”?).
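
As a rough illustration of what "a simple tabular structure" might look like, here's a minimal sketch that turns database-style rows into a columns-plus-rows JSON document. The `cols`/`rows` layout follows the DataTable shape as I understand it from the Visualisation API docs; the full data source protocol (request parameters, response wrapper and so on) isn't shown, so check the docs before relying on it. The helper name `to_datatable` and the numbers are made up.

```python
import json


def to_datatable(columns, rows):
    """Build a cols/rows structure from (id, label, type) columns and row tuples."""
    return {
        "cols": [
            {"id": cid, "label": label, "type": ctype}
            for cid, label, ctype in columns
        ],
        # Each row is a list of cells; each cell carries its value under "v".
        "rows": [{"c": [{"v": value} for value in row]} for row in rows],
    }


# Made-up sample data standing in for the result of a database query.
columns = [("month", "Month", "string"), ("uploads", "Uploads", "number")]
rows = [("Sep", 120), ("Oct", 150), ("Nov", 90)]

table = to_datatable(columns, rows)
payload = json.dumps({"table": table})
```

The point is how little there is to it: once a database can emit something like `payload`, a charting component can pick it up without knowing anything else about where the data came from.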

Small components is one thing.

Small components loosely coupled is another – and one where many of us see value.

Small components automatically wired together is yet another thing – and one that is increasingly going to happen. A consequence I hadn’t anticipated of setting up a Google RSS alert was that the feed appeared automatically in my feed reader.

Yesterday, an unanticipated consequence of me adding my blog URL to my Google Profile page was that several other URLs I control were automatically suggested to me as things I might want to add to my profile.

Whenever I go into Facebook, the platform suggests a list of people I might know and might want to “friend”.

Now this recommendation may be because we share a large number of friends, or it might be that I’ve appeared in the same photograph as some of these people… How would Facebook know? Maybe Microsoft, their search provider, told them: Why “People” Tags? describes how the beta version of Microsoft Live Photo gallery automatically identifies faces in photos and then prompts you to tag them with people’s names… Google already does this, of course, in Picasa, with its “name tags”.

And finally… a chance clickthru from someone on the Copac developments blog, which lists this blog in its blogroll, alerted me through my blog stats to this post on Spooky Personalisation (should we be afraid?), which discusses the extent to which “adaptive personalisation” may appear “spooky” to the user.

(A serendipitous link discovery for me? Surely… Spooky? Maybe!;-)

And that maybe is going to be an ever more apparent unanticipated consequence of the way in which it’s getting so much easier to glue apps together? Spookiness…

PS see also Does Google Know Too Much? (h/t Ray@B2FXXX)

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

4 thoughts on “Confused About the Consequences”

  1. Interesting, and giving form to the otherwise paranoid suspicions. But Tony, do you think people had similar concerns when hypertext became a reality? The idea that any document (and so anyone) could hyperlink to your document (and so you) without you being able to control the association. I realise that the scale you are describing is far greater, but it seems to me that the concern is similar. We don’t even give hypertext a second thought these days… do we?

  2. I think the search tools are now so powerful that things are becoming accretable and findable, and the reasoning tools that work over the various graphs we inhabit are becoming increasingly powerful too.

    Hypertext – and the original creation of the graph – was one thing…. This is another…

    (By graph, I mean all sorts of graphs – people, urls, photos as nodes, links, relationships, friendships etc etc as edges.)

    I’ve been an advocate of goog’n’the web for ages. And having experienced several “*I* didn’t do that” moments over the last week, I’ve started trying to think through some of the consequences, and the ‘next step’ steps that we can start to make – the only problem being that each time I think of an “it would be creepy if these two apps shared/aggregated data”, a Google query turns up someone who’s just blogged an implementation!

    The posts aren’t intended to be paranoid ravings – they’re just observations along the lines of “did you know we can now do this?”;-)

    But I’ve just started to think about what’s possible now and what the consequences are, and that’s an appropriate thing to do, I think? I could set up a blog with “does this worry you?” search queries etc etc, but I can *already* see that that’s a bit like someone setting up a white hat hacker blog, publishing hacks, techniques and exploits that can also be used by the black hats… So I’m not going to, because I think it would be irresponsible. (Or maybe there’s nothing wrong with posting a recipe for a google query that takes 10s to write and gives me a set of images of 16 year old females in your local area?)

    One claim that I and others have often made about the web is that the bad stuff about us will get lost in the noise. But if the Goog accretes lots of stuff about us, I can dump that into a custom search engine, and then write increasingly powerful queries over it.

    Out of interest, have you played with the Yahoo query language yet? I can see that developing in all sorts of ways – and while I don’t feel paranoid about it, I think as Yahoo expand the interface I could write some queries that would freak a lot of people out…

  3. I don’t think this invalidates the point, but my reading of the article on the Flickr Developer Blog suggests that the way the outlines are formed is by taking a lat and lon, and using WOE to assign that to a ‘place name’ of some kind. This is slightly different to your description of tagging somewhere with ‘London’. It also seems to make the outline a bit self-fulfilling – given enough pictures, the outline will match what the creators of the WOE database defined as the outline. But of course, the point is that there are enough pictures in Flickr to generate a reasonable outline – so this is a numbers game, which I think is your point?

    Another thing to remember is that the way that they have generated the outlines suggests that the key point is how many photos are close to (but inside) the boundary of the place (as defined by WOE). Further down the post it suggests that you can ‘improve’ the outline of your neighbourhood by walking round the perimeter and taking geo-tagged photos. This suggests that the number of photos actually required is smaller than you might initially think – as long as they are taken in the right places.

  4. “I don’t think this invalidates the point, but my reading of the article on the Flickr Developer Blog suggests that the way the outlines are formed is by taking a lat and lon, and using WOE to assign that to a ‘place name’ of some kind.”

    Yes – you’re right, the map is drawn based on WOE – Where on Earth – info. I maybe hallucinated or misremembered the bit where I thought tags were used as a possible data source for the WOE service?

    Will clarify the post now – thanks for pointing it out:-)
