Ad-Tech – A Great Way in To OSINT

Open Source Intelligence – OSINT – is intelligence that can be collected from public sources. That is to say, OSINT is the sort of intelligence that you should be able to collect using a browser and a public or academic library that also provides access to public subscription content. (For an intro to OSINT, see for example Sailing the Sea of OSINT in the Information Age; for example context, Threat Intelligence: Collecting, Analysing, Evaluating). OSINT can be used as much by corporates as by the security services. It’s also up for grabs by journalists, civil society activists and stalkers…

Looking at the syllabus for an OSINT beginners’ course, such as IMSL’s Basic Open Source (OSINT) Research & Analysis Tradecraft, turns up the sorts of thing you might also expect to see as part of one of Phil Bradley or Karen Blakeman’s ILI search workshops:

  • Appreciation of the OS environment
    • Opportunities, Challenges and Threats
  • Legal and Ethical Guidance
  • Search Tradecraft
    • Optimising Search
    • Advanced Search Techniques
  • Profile Management and Risk Reduction
    • Technical Anonymity/Low Attribution
    • Security Tradecraft
  • Social Media exploitation
    • Orientation around the most commonly used platforms Twitter, Facebook, LinkedIn etc.
    • Identifying influence
    • Event monitoring
    • Situational Awareness
    • Emerging social media platforms
  • Source Evaluation
    • Verifying User Generated Content on Social Media

And as security consultant Bruce Schneier beautifully observed in 2014, “[s]urveillance is the business model of the Internet”.

What may be surprising, or what may help explain in part their dominance, is that a large part of the surveillance capability the webcos have developed is something they’re happy to share with the rest of us. Things like social media exploitation, for example, allow you to easily identify social relationships, and pick up personal information along the way (“Happy Birthday, sis..”). You can also identify whereabouts (“Photo of me by the Eiffel Tower earlier today”), captioned or not – Facebook and Google will both happily tag your photos for you to make them, and the information, or intelligence, they contain more discoverable.

Part of the reason that the web companies have managed to grow so large is that they operate very successful two-sided markets. As the FT Lexicon defines it, these are markets that provide “a meeting place for two sets of agents who interact through an intermediary or platform”. In the case of the webcos, the two sets of agents are the “social users”, who gain social benefit from interacting with each other through the platform, and the advertisers, who pay the platform to advertise to the social users (Some Notes on Churnalism and a Question About Two Sided Markets).

A naive sort of social media intelligence would focus, I think, on what can be learned simply through the publicly available activity on the social user side of the platform, albeit activity that may be enriched through automatic tagging by the platform itself.

But there is the other side of the platform to consider too. And the tools on that side of the platform, the tools developed for the business users, are out and out designed to provide the business users – the advertisers – with intelligence about the social users.

Which is all to say that if surveillance is your thing, then ADINT – Adtech Intelligence – could be a good OSINT way in, as a recent paper from the Paul G. Allen School of Computer Science & Engineering, University of Washington describes: ADINT: Using Targeted Advertising for Personal Surveillance (read the full paper; Wired also picked up the story: It Takes Just $1,000 to Track Someone’s Location With Mobile Ads). Here’s the paper abstract:

Targeted advertising is at the heart of the largest technology companies today, and is becoming increasingly precise. Simultaneously, users generate more and more personal data that is shared with advertisers as more and more of daily life becomes intertwined with networked technology. There are many studies about how users are tracked and what kinds of data are gathered. The sheer scale and precision of individual data that is collected can be concerning. However, in the broader public debate about these practices this concern is often tempered by the understanding that all this potentially sensitive data is only accessed by large corporations; these corporations are profit-motivated and could be held to account for misusing the personal data they have collected. In this work we examine the capability of a different actor — an individual with a modest budget — to access the data collected by the advertising ecosystem. Specifically, we find that an individual can use the targeted advertising system to conduct physical and digital surveillance on targets that use smartphone apps with ads.

The attack is predicated in part on knowing the Mobile Advertising ID (MAID) of a user you want to track, and several strategies are described for obtaining that.

I haven’t looked at adservers for a long time (or Google Analytics for that matter), so I thought I’d have a quick look at what the UIs support. So for example, Google AdWords seems to offer quite a simple range of tools, that presumably let me target based on various things, like demographics:

or location:

or time:

It also looks like I can target ads based on apps a user uses:

or websites they visit:

though it’s not clear to me if I need to be the owner of those apps or webpages?

If I know someone’s email address, it also looks like I can use that to vector an ad towards them? Which means Google cookies presumably associate with an email address?

This email vectoring is actually part of Google’s “Customer Match” offering, which “lets you show ads to your customers based on data about those customers that you share with Google”.
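As far as I understand it, match lists of this sort are typically shared as hashed identifiers rather than raw email addresses: each address is normalised (trimmed, lowercased) and then SHA-256 hashed before upload. Here is a minimal sketch of that preparation step using only the Python standard library. The function name and the sample addresses are illustrative assumptions of mine, not anything taken from Google’s own tooling.

```python
import hashlib

def normalise_and_hash(email: str) -> str:
    """Trim whitespace, lowercase, then SHA-256 hash an email address."""
    cleaned = email.strip().lower()
    return hashlib.sha256(cleaned.encode("utf-8")).hexdigest()

# Hypothetical customer list (made-up addresses, for illustration only)
customers = ["  Alice@Example.com ", "bob@example.org"]
hashed_list = [normalise_and_hash(e) for e in customers]
```

The same address always produces the same hash, so a platform holding its own users’ email addresses can hash those in the same way and look for matches.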

So how about Facebook? As you might expect, there’s a range of audience targeting categories that draw heavily on the information users provide to the system:

(You’ve probably heard the slogan “if you aren’t paying for the product, you are the product” and thought nothing of it. Are you starting to feel bought and sold, yet?)

Remember that fit of anger, or joy, when you changed your relationship, maybe also flagging a life event (= valuable to advertisers)?

Or maybe when you bought that thing (is there a Facebook Pay app yet, to make this easier for Facebook to track?):

And of course, there’s location:

If you fancy exploring some more, the ADINT paper has a handy table summarising what’s offered by various other adtech providers:

On the other hand, if you want to buy readymade audiences from a data aggregator, try the Oracle Data Marketplace. It looks as if they’ll happily resell you audiences derived from Experian data, for example:

So I’m wondering, what other sorts of intelligence operation could be mounted against a targeted individual using adtech more generally? And what sorts of target identification can be achieved through a creative application of adtech, and maybe some simple phishing to entice a particular user onto a web page you control, which you can then use to grab some preliminary tracking information about them?

Presumably, once you can get your hooks into a user, maybe by enticing them to a web page that you have set up to show your ad so that the adserver can spear the user, you can also use ad retargeting or remarketing (that follows users around the web, in the sense of continuing to show them ads from a particular campaign) to keep a tail on them?

[This post was inspired by an item on Mike Caulfield’s must read Traces weekly email newsletter. Subscribe to his blog – Hapgood – for a regular dose of digital infoskills updating. You might also enjoy his online book Web Literacy for Student Fact-Checkers.]

Community Detection? (And Is Your Phone a Cookie?)

A few months ago, I noticed that the Google geolocation service would return a lat/long location marker when provided with the MAC address of a wifi router (Using Google to Look Up Where You Live via the Physical Location of Your Wifi Router [code]) and in various other posts I’ve commented on how communities of bluetooth users can track each other’s devices (eg Participatory Surveillance – Who’s Been Tracking You Today?).
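The general shape of such a lookup can be sketched against the Google Geolocation API as it is currently documented, which may well differ from the route used in the earlier post. The sketch below only builds the request body; the helper name and the MAC addresses are made up for illustration, and a real call would need your own API key. I believe the documentation asks for at least two access points per request.

```python
import json

# Endpoint of the Google Geolocation API as I understand it; a real request
# is an HTTP POST with your own API key appended as ?key=...
GEOLOCATE_URL = "https://www.googleapis.com/geolocation/v1/geolocate"

def build_geolocation_payload(access_points):
    """Build the JSON body for a wifi-based location lookup.

    access_points: iterable of (mac_address, signal_strength_dbm) tuples.
    """
    return {
        "considerIp": False,  # ask for a fix based on the wifi data only
        "wifiAccessPoints": [
            {"macAddress": mac, "signalStrength": rssi}
            for mac, rssi in access_points
        ],
    }

# Made-up router MAC addresses, purely for illustration
payload = build_geolocation_payload([
    ("00:25:9c:cf:1c:ac", -43),
    ("00:25:9c:cf:1c:ad", -55),
])
body = json.dumps(payload)
```

POSTing `body` to the endpoint should, on success, return a small JSON document containing a latitude/longitude pair and an accuracy radius.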

Which got me wondering… are there any apps out there that let me detect the MAC address of Bluetooth devices in my vicinity, and is there anyone aggregating the data, perhaps as a quid pro quo for making such an app available?

Seems like the answer is yes, and yes…

For example, John Abraham’s Bluetooth 4.0 Scanner [Android] app will let you [scan] for Bluetooth devices… The information is recorded includes: device name, location, RSSI signal strength, MAC address, MAC address vendor lookup.

In a spirit of sharing, the Bluetooth 4.0 Scanner app “supports the earthping.com project – crowdsourced Bluetooth database. Users are also reporting usage to find their lost Bluetooth devices”.

So when you run the app to check the presence of Bluetooth devices in your own vicinity, you also gift the location of those devices – along with their MAC addresses – to a global database – earthping. Good stuff…not.
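The “MAC address vendor lookup” works because the first three octets of a factory-assigned MAC address identify the manufacturer (the OUI). A rough sketch of the idea follows; the lookup table is a made-up stand-in for the real IEEE registry, and the function names are mine rather than anything from the app.

```python
def normalise_mac(mac: str) -> str:
    """Return a MAC address as 12 uppercase hex digits, no separators."""
    digits = mac.replace(":", "").replace("-", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits

def oui(mac: str) -> str:
    """First three octets: the manufacturer prefix (OUI)."""
    return normalise_mac(mac)[:6]

def is_locally_administered(mac: str) -> bool:
    """True if the 'locally administered' bit of the first octet is set,
    which is typical of randomised addresses rather than factory ones."""
    return bool(int(normalise_mac(mac)[:2], 16) & 0x02)

# Hypothetical lookup table; a real one would be built from the IEEE registry
VENDORS = {"001122": "Example Devices Ltd (made-up entry)"}

def vendor(mac: str) -> str:
    """Look up the vendor for a MAC address, or 'unknown'."""
    return VENDORS.get(oui(mac), "unknown")
```

So `vendor("00:11:22:aa:bb:cc")` returns the made-up entry, while an address with the locally administered bit set tells you nothing reliable about the manufacturer.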

We’re all familiar (at least in the UK) with surveillance cameras everywhere, and as object recognition and reconciliation tools improve it seems as if tracking targets across multiple camera views will become a thing, as demonstrated by the FX Pal Dynamic Object Tracking System (DOTS) for “office surveillance”.

It’s also increasingly the case that street furniture is appearing that captures the address of our electronic devices as we pass them. For example, in New York, Link NYC “is a first-of-its-kind communications network that will replace over 7,500 pay phones across the five boroughs with new structures called Links. Each Link will provide superfast, free public Wi-Fi, phone calls, device charging and a tablet for Internet browsing, access to city services, maps and directions”. The points will also allow passers-by to ‘view public service announcements and more relevant advertising on two 55” HD displays’ – which is to say, they track everything that passes, try to profile anyone who goes online via the service, and then deliver targeted advertising to exactly the sort of people passing each Link.

LinkNYC is completely free because it’s funded through advertising. Its groundbreaking digital OOH advertising network not only provides brands with a rich, context-aware platform to reach New Yorkers and visitors, but will generate more than a half billion dollars in revenue for New York City.

[Update: 11/16 – it seems that offering pavement wifi hubs had consequences: “It took less than a year for New Yorkers to lose sidewalk internet privileges. … Soon came the reports of people gathered for hours around these digital campfires, streaming music or watching movies and porn. …LinkNYC disabled web browsing …” Public In/Formation]

So I wondered just what sorts of digital info we leak as we walk down the street. Via Tracking people via WiFi (even when not connected), I learn that devices operate in one of two modes: a listening beacon mode, where they essentially listen for access points, but at high battery cost; or a lower energy ping mode, where they announce themselves (along with their MAC address) to anyone who’s listening.

If you want to track passers-by, many of whom will be pinging their credentials to anyone who’s listening, you can set up things like wifi routers in monitor mode to listen out for – and log – such pings. Edward Keeble describes how to do it in the post Passive WiFi Tracking.
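The “pings” are 802.11 probe request frames, and the announcing device’s MAC address sits at a fixed offset in the frame header. As a hedged sketch (this is not Edward Keeble’s code), here is how you might pull the sender address out of a raw captured frame. It assumes the bytes start at the 802.11 MAC header, with any radiotap or other capture header already stripped; the function name and sample frame are my own.

```python
import struct

def probe_request_source(frame: bytes):
    """Return the sender MAC of a probe request frame, or None otherwise.

    Assumes `frame` starts at the 802.11 MAC header (no radiotap header).
    """
    if len(frame) < 24:  # a management frame header is 24 bytes long
        return None
    # Frame control and duration are little-endian 16-bit fields
    frame_control, _duration = struct.unpack_from("<HH", frame, 0)
    version = frame_control & 0x3
    frame_type = (frame_control >> 2) & 0x3
    subtype = (frame_control >> 4) & 0xF
    # Management frames are type 0; probe requests are subtype 4
    if version != 0 or frame_type != 0 or subtype != 4:
        return None
    source = frame[10:16]  # address 2, the transmitting device
    return ":".join(f"{b:02x}" for b in source)

# A hand-built example frame header with a made-up sender address
example = (
    bytes([0x40, 0x00, 0x00, 0x00])   # frame control + duration
    + b"\xff" * 6                     # destination: broadcast
    + bytes.fromhex("021122334455")   # source: the phone
    + b"\xff" * 6                     # BSSID: wildcard
    + b"\x00\x00"                     # sequence control
)
```

Logging the returned address against a timestamp each time a frame arrives is, in essence, all that a passive tracker needs to do.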

If you’d rather not hack together such a device yourself, you can always buy something off the shelf to log the MAC addresses of passers-by, such as Libelium’s Meshlium Scanner [datasheet – PDF]. So for example:

  • Meshlium Scanner AP – It allows to detect (sic) Smartphones (iPhone, Android) and in general any device which works with WiFi or Bluetooth interfaces. This model can receive and store data from Waspmotes with GPRS, 3G or WiFi, sending via HTTP protocol. The collected data can be send (sic) to the Internet by using the Ethernet.
  • Meshlium Scanner 3G/GPRS-AP – It allows to detect (sic) Smartphones (iPhone, Android) and in general any device which works with WiFi or Bluetooth interfaces. This model can receive and store data from Waspmotes with GPRS, 3G or WiFi, sending via HTTP protocol. The collected data can be send (sic) to the Internet by using the Ethernet, and 3G/GPRS connectivity
  • Meshlium Scanner XBee/LoRa -AP – It allows to detect (sic) Smartphones (iPhone, Android) and in general any device which works with WiFi or Bluetooth interfaces. It can also capture the sensor data which comes from the Wireless Sensor Network (WSN) made with Waspmote sensor devices. The collected data can be send (sic) to the Internet by using the Ethernet and WiFi connectivity.

So have any councils started installing that sort of device I wonder? And if so, on what grounds?

On the ad-tracking/marketing front, I’m also wondering whether there are extensions to cookie matching services that can match MAC addresses to cookies?

PS you know that unique tat you’ve got?! FBI Develops tattoo tracking technology!

PPS capturing data from wifi and bluetooth devices is easy enough, but how about listening out for mobile phones as phones? Seems that’s possible too, though perhaps not off-the-shelf for your everyday consumer…? What you need, apparently, is an IMSI catcher such as the Harris Corp Stingray. Examples of use here and here.

See also: Tin Foil Hats or Baseball Caps? Why Your Face is a Cookie and Your Data is midata and We Are Watching, and You Will be Counted.

PS Interesting piece from the Bristol Cable Oct 2016: Revealed: Bristol’s police and mass mobile phone surveillance. Picked up by the Guardian: Controversial snooping technology ‘used by at least seven police forces’.

Participatory Surveillance – Who’s Been Tracking You Today?

With the internet of things still trying to find its way, I wonder why more folk aren’t talking about participatory surveillance?

For years, websites have been gifting information to third parties that you have visited them (Personal Declarations on Your Behalf – Why Visiting One Website Might Tell Another You Were There), but as more people are instrumenting themselves, the opportunities for mesh network based surveillance are ever more apparent.

Take something like thetrackr, for example. The device itself is a small bluetooth powered device the size of a coin that you attach to your key fob or keep in your wallet:

The TrackR is a Bluetooth device that connects to an app running on your phone. The phone app can monitor the distance between the phone and device by analyzing the power level of the received signal. This link can be used to ring the TrackR device or have the TrackR device ring the phone.
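TrackR don’t say how they turn received signal power into a distance. A common textbook approach is the log-distance path loss model, sketched below. The calibration values (the expected signal strength at one metre and the path loss exponent) are assumptions for illustration, and real-world estimates from this sort of calculation are notoriously noisy.

```python
def estimate_distance_m(rssi_dbm: float,
                        rssi_at_1m_dbm: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Estimate distance in metres from a received signal strength reading.

    Log-distance path loss model: d = 10 ** ((P_1m - RSSI) / (10 * n)).
    Both default parameters are assumed calibration values, not TrackR's.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))
```

With these assumed defaults, a reading equal to the one-metre calibration value gives an estimate of one metre, and each further 20 dB drop in signal multiplies the estimated distance by ten.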

The other essential part is an app you run permanently on your phone that listens out for the trackr device. Not just yours, but anyone’s. And when it detects one it posts its location to a central server:

[thetrackr] Crowd GPS is an alternative to traditional GPS and revolutionizes the possibilities of what can be tracked. Unlike traditional GPS, Crowd GPS uses the power of the existing cell phones all around us to help locate lost items. The technology works by having the TrackR device broadcast a unique ID over Bluetooth Low Energy when lost. Other users’ phones can detect this wireless signal in the background (without the user being aware). When the signal is detected, the phone records the current GPS location, sends a message to the TrackR server, and the TrackR server will then update the item’s last known location in its database. It’s a way that TrackR is enabling you to automatically keep track of the location of all your items effortlessly.
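The server side of that description boils down to a “last known location” table keyed by device ID and updated by whichever phone happens to hear the device. A toy sketch of the idea follows; the class and field names are my own invention, and nothing here reflects how TrackR actually implement it.

```python
from dataclasses import dataclass

@dataclass
class Sighting:
    device_id: str    # the unique ID the tag broadcasts
    lat: float
    lon: float
    timestamp: float  # seconds since the epoch, as reported by the phone

class LastSeenRegistry:
    """Keeps only the most recent sighting of each device."""

    def __init__(self):
        self._last_seen = {}

    def report(self, sighting: Sighting) -> None:
        """Record a sighting unless we already hold a more recent one."""
        current = self._last_seen.get(sighting.device_id)
        if current is None or sighting.timestamp >= current.timestamp:
            self._last_seen[sighting.device_id] = sighting

    def locate(self, device_id: str):
        """Return the latest known sighting of a device, or None."""
        return self._last_seen.get(device_id)
```

Note that the phone doing the reporting never needs to know whose device it heard: it just forwards an ID, a position and a time.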

And if you don’t trust the trackr folk, other alternatives are available. Such as tile:

The Tile app allows you to anonymously enlist the help of our entire community in your search. It works both ways — if you’re running the app in the background and come within range of someone’s lost item, we’ll let the owner know where it is.

This sort of participatory surveillance can be used to track stolen items too, such as cars. The TRACKER mesh network (which I’ve posted about before: Geographical Rights Management, Mesh based Surveillance, Trickle-Down and Over-Reach) uses tracking devices and receivers fitted to vehicles to locate other similarly fitted vehicles as they pass by them:

TRACKER Locate or TRACKER Plant fitted vehicles listen out for the reply codes being sent out by stolen SVR fitted vehicles. When the TRACKER Locate or TRACKER Plant unit passes a stolen vehicle, it picks up its reply code and sends the position to the TRACKER Control Room.

That’s not the only way fitted vehicles can be used to track each other. A more general way is to fit your car with a dashboard camera, then use ANPR (automatic number plate recognition) to identify and track other vehicles on the road. And yes, there is an app for logging anti-social or dangerous driving acts the camera sees, as described in a recent IEEE Spectrum article on The AI dashcam app that wants to rate every driver in the world. It’s called the Nexar app, and as their website proudly describes:

Nexar enables you to use your mobile telephone to record the actions of other drivers, including the license plates, types and models of the cars being recorded, as well as signs and other surrounding road objects. When you open our App and begin driving, video footage will be recorded. …

If you experience a notable traffic incident recorded through your use of the App (such as someone cutting you off or causing an accident), you can alert Nexar that we should review the video capturing the event. We may also utilize auto-detection, including through the use of “machine vision” and “sensor fusion” to identify traffic law violations (such as a car in the middle of an intersection despite a red stop light). Such auto-detected events will appear in your history. Finally, time-lapse images will automatically be uploaded.

Upon learning of a traffic incident (from you directly or through auto-detection of events), we will analyze the video to identify any well-established traffic law violations, such as vehicle accidents. Our analysis will also take into account road conditions, topography and other local factors. If such a violation occurred, it will be used to assign a rating to the license plate number of the responsible driver. You and others using our App who have subsequent contact with that vehicle will be alerted of the rating (but not the nature of the underlying incidents that contributed to the other driver’s rating).

And of course, this is a social thing we can all participate in:

Nexar connects you to a network of dashcams, through which you will start getting real-time warnings to dangers on the road

It’s not creepy though, because they don’t try to relate number plates to actual people:

Please note that although Nexar will receive, through video from App users, license plate numbers of the observed vehicles, we will not know the recorded drivers’ names or attempt to link license plate numbers to individuals by accessing state motor vehicle records or other means. Nor will we utilize facial recognition software or other technology to identify drivers whose conduct has been recorded.

So that’s all right then…

But be warned:

Auto-detection also includes monitoring of your own driving behavior.

so you’ll be holding yourself to account too…

Folk used to be able to go to large public places and spaces to be anonymous. Now it seems that the more populated the place, the more likely you are to be located, timestamped and identified.

We Are Watching, and You Will be Counted

Two or three weeks ago, whilst in Cardiff, I noticed one of these things for the first time:

[Photo: roadside cycle counter display, Cardiff]

It counts the number of cyclists who pass by it and is a great example of the sort of thing that could perhaps be added to a “data walk”, along with several other examples of data-revealing street furniture as described by Leigh Dodds in Data and information in the city.

It looks like it could be made by a company called Falco – this Falco Cycle Counter CB650 (“[a]lready installed for Cardiff County Council as well as in Copenhagen and Nijmegen”)? (Falco also make another, cheaper one, the CB400.)

From the blurb:

The purpose of the Falco Cycle Counter is to show the number of cyclists on a bicycle path. It shows the number of cyclists per day and year. At the top of the Counter there is a clock indicating time and date. On the reverse it is possible to show city map or other information, alternatively for a two-way bicycle path it is possible to have display on both side of the unit. Already installed for Cardiff County Council as well as in Copenhagen and Nijmegen, three very strong cycling areas, the cycle counter is already proving to be an effective tool in managing cycle traffic.

As with many of these sorts of exhibit, it can phone home:

When configured as a Cycle Counter, the GTC can provide a number of functions depending on the configuration of the Counter. It is equipped with a modem for a SIM card use which provides a platform for mobile data to be exported to a central data collection system.

This makes possible a range of “on-… services”, for example: [g]enerates individual ‘buy-in’ from local people via a website and web feed plus optional Twitter RSS enabling them to follow the progress of their own counter personally.

I was reminded of this appliance (and should really have blogged it sooner) by a post today from Pete Warden – Semantic Sensors – in which he remarked on spotting an article about “people counters” in San Francisco that count passing foot traffic.

In that case, the counters seem to be provided by a company called Springboard who offer a range of counting services using a camera based counting system: a small counting device … mounted on either a building or lighting/CCTV column, a virtual zone is defined and pedestrians and cars who travel through the zone are recorded.

Visitor numbers are recorded using the very latest counting software based on “target specific tracking”. Data is audited each day by Springboard and uploaded daily to an internet server where it is permanently stored.

Target specific tracking software monitors flows by employing a wide range of characteristics to determine a target to identify and track.

Here’s an example of how it works:

As Pete Warden remarked, [t]raditionally we’ve always thought about cameras as devices to capture pictures for humans to watch. People counters only use images as an intermediate stage in their data pipeline, their real output is just the coordinates of nearby pedestrians.

He goes on:

Right now this is a very niche application, because the systems cost $2,100 each. What happens when something similar costs $2, or even 20 cents? And how about combining that price point with rapidly-improving computer vision, allowing far more information to be derived from images?

Those trends are why I think we’re going to see a lot of “Semantic Sensors” emerging. These will be tiny, cheap, all-in-one modules that capture raw noisy data from the real world, have built-in AI for analysis, and only output a few high-level signals.

For all of these applications, the images involved are just an implementation detail, they can be immediately discarded. From a systems view, they’re just black boxes that output data about the local environment.

Using cameras to count footfall appears to be nothing new – for example, the Leeds Data Mill openly publish Leeds City Centre footfall data collected by the council from “8 cameras located at various locations around the city centre [which monitor] numbers of people walking past. These cameras calculate numbers on an hourly basis”. I’ve also briefly mentioned several examples regarding the deployment of related technologies before, for example The Curse of Our Time – Tracking, Tracking Everywhere.

From my own local experience, it seems cameras are also being used (apparently) to gather evidence about possible “bad behaviour” by motorists. Out walking the dog recently, I noticed a camera I hadn’t spotted before:

[Photo: the camera]

It’s situated at the start of a footpath, hedged on both sides, that runs along the road, although the mounting suggests that it doesn’t have a field of view down the path. Asking in the local shop, it seems as if the camera was mounted to investigate complaints of traffic accelerating off the mini-roundabout and cutting up pedestrians about to use the zebra crossing:

[Photo: the camera’s location]

(I haven’t found any public consultations about mounting this camera, and should really ask a question, or even make an FOI request, to clarify by what process the decision was made to install this camera, when it was installed, for how long, for what purpose, and whether it could be used for other ancillary purposes.)

On a slightly different note, I also note from earlier this year that Amazon acquired internet of things platform operator 2lemetry, “an IoT version of Enterprise Application Integration (EAI) middleware solutions, providing device connectivity at scale, cross-communication, data brokering and storage”, apparently. In part, this made me think of an enterprise version of Pachube, as was (now Xively?).

So is Amazon going to pitch against Google (in the form of Nest), or maybe Apple, perhaps building “home things” services around a home hub server? After all, they already have a listening, connected voice service for your home in the form of the voice controlled Amazon Echo (a bit like a standalone Siri, Cortana or Google Now). (Note to self: check out the Amazon Alexa Voice Services Developer Kit some time…)

As Pete Warden concluded, “it seems obvious to me that machine vision is becoming a commodity”. What might we expect as and when listening and voice services also become a commodity?

Related: a recent article from the Guardian posing the question What happens when you ask to see CCTV footage?, as is your right by making a subject access request under the Data Protection Act, picks up on a recently posted paper by OU academic Keith Spiller: Experiences of accessing CCTV data: the urban topologies of subject access requests.

Geographical Rights Management, Mesh based Surveillance, Trickle-Down and Over-Reach

Every so often there’s a flurry of hype around the “internet of things”, but in many respects it’s already here – and has been for several decades. I remember as a kid being intrigued by some technical documents describing some telemetry system or other that remote water treatment plants used to transmit status information back to base. And I vaguely remember from a Maplin magazine around the time an article or two about what equipment you needed to listen in on, and decode, the radio chatter of all manner of telemetry systems.

Perhaps the difference now is a matter of scale – it’s easier to connect to the network, comms are bidirectional (you can receive as well as transmit information), and with code you can effect change on receipt of a message. The tight linkage between software and hardware – bits controlling atoms – also means that we can start to treat more and more things as “plant” whose behaviour we can remotely monitor, and govern.

A good example of how physical, consumer devices can already be controlled – or at least, disabled – by a remote operator is described in a New York Times article that crossed my wires last week, Miss a Payment? Good Luck Moving That Car, which describes how “many subprime borrowers [… in the US] must have their car outfitted with a so-called starter interrupt device, which allows lenders to remotely disable the ignition. Using the GPS technology on the devices, the lenders can also track the cars’ location and movements.” As the loan payment due date looms, it seems that some devices also emit helpful beeps to remind you…. And if your car loan agreement stipulates you’ll only drive within a particular area, I imagine that you could find it’s been geofenced. (A geofence is a geographical boundary line that can be used to detect whether a GPS tracked device has passed into, or exited from, a particular region. When used to disable a device that leaves – or enters – a particular area, as for example drones flying into downtown Washington, we might consider it a form of “location based management” (or “geographical rights management (GRM)”?!) that can disable activity in a particular location where someone who claims to control use of that device in that space actually exerts their control. (Think: DRM for location…))
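At its simplest, a geofence check is just a distance test against a centre point, plus a comparison with the previous state to spot a crossing. The sketch below uses a circular fence and the haversine great-circle formula; the function names and the coordinates in the usage note are illustrative assumptions of mine, and commercial products presumably support more elaborate fence shapes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def inside_geofence(lat, lon, centre_lat, centre_lon, radius_km):
    """True if the point lies within a circular fence around the centre."""
    return haversine_km(lat, lon, centre_lat, centre_lon) <= radius_km

def crossing(was_inside: bool, is_inside: bool):
    """Classify a state change as 'enter', 'exit' or None (no change)."""
    if was_inside and not is_inside:
        return "exit"
    if not was_inside and is_inside:
        return "enter"
    return None
```

For example, with a 10 km fence centred on Cardiff (roughly 51.48, -3.18), a GPS fix in central London (roughly 51.51, -0.13) is well outside, and a device whose previous fix was inside the fence would register an "exit", which is the point at which a lender’s system could raise an alert or send a disable command.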

One of the major providers of “starter interrupt devices” is a company called PassTime (product list). Their products include:

  • PassTime Plus, the core of their “automated collection technology”.
  • Trax: “PassTime TRAX is the entry level GPS tracking product”. Includes: Pin point GPS location service, Up to Six (6) simultaneous Geo-fences.
  • PassTime GPS: “provides asset protection at an economical price while utilizing the same hardware and software platform of PassTime’s Elite Pro line of products. GPS tracking and remote vehicle disable features offer customers tools for a swift recovery if needed.” Includes: Pin point GPS location service, Remote vehicle disable option, Tow-Detect Notification, Device Tamper Notification, Up to Six (6) simultaneous Geo-fences, 24-Hour Tracking, Automatic Location Heartbeat
  • Elite-Pro: “the ultimate combination of GPS functionality and Automated Collection Technology”. Includes the PassTime GPS features but also mentions “Wireless Command Delivery”.

PassTime seem to like the idea of geofences so much they have patents in related technologies: PassTime Awarded Patent for Geo-Fence and Tamper Notice (US Patent: 8018329). You can find other related patents by looking up other patents held by the inventors (for example…).

You’ll be glad to know that PassTime have UK partners… in the form of The Car Finance Company, who are apparently “the world’s largest user and first company in the UK to start fitting Payment Reminder Technology to your new car”. Largest user?! That’s according to a recent [March 12, 2015] press release announcing an extension to their agreement that “will bring 70,000 payment assurance and telematics devices to the United Kingdom”.

Here’s how The Car Finance Company spin it: The Passtime system helps remind you when your repayments are due so you can ensure you stay on track with your loan and help repair and rebuild your credit. The device is only there to help you keep your repayments up to date, it doesn’t affect your car nor does it monitor the way you drive. From the recent press release, “PassTime has been supplying Payment Assurance and GPS devices to The Car Finance Company since 2009” (my emphasis). I’m not sure if that means the PassTime GPS (with the starter interrupt) or the Trax device? If I was a journalist, rather than a blogger, I’d probably phone them to try to clarify that…

In passing, whilst searching for providers of automotive GPS trackers in the UK (and there are lots of them – search on something like GPS fleet management, for example…) I came across this rather intrusive piece of technology, The TRACKER Mesh Network, which “uses vehicles fitted with TRACKER Locate and TRACKER Plant to pick up reply codes from stolen vehicles with an activated TRACKER unit making them even easier to locate and recover”. Which is to say, this company has an ad hoc, mobile, distributed network of sensors spread across the UK road network that listen out for each other and opportunistically track each other. It’s all good, though:

“The TRACKER Mesh Network will enable the police to extend the network of ‘eyes and ears’ to identify and locate stolen vehicles more effectively using advanced technology and allow us to stay one step ahead of criminals who are becoming more and more adept at stealing cars. This is a real opportunity for the motoring public to help us clamp down on car thieves and raises public confidence in our ability to recover their possessions and bring the offenders to justice.”

(By the by, previous notes on ANPR – Automatic Number Plate Recognition. Also, note the EU eCall accident alerting system that automatically calls for help if you have a car accident [about, UK DfT eCall cost/benefit analysis].)

This conflation of commercial and police surveillance is… to be expected. But the data’s being collected, and it won’t go away. The Snowden revelations revealed the scope of security service data collection activities, and chunks of that data won’t be going away either. The scale of the data collection is such that it’s highly unlikely that we’re all being actively tracked or that this data will ever meaningfully contribute to the detection of conspiracies, but it can and will be used post hoc to create paranoid data driven fantasies about who could have met whom, when, discussed what, and so on.

I guess where we can practically start to get concerned is in considering the ‘trickle down’ way in which access to this data will increasingly be opened up, and/or sold, to increasing numbers of agencies and organisations, both public and private. As Ed Snowden apparently commented in a session at SXSW (Snowden at SXSW: Be very concerned about the trickle down of NSA surveillance to local police), “[t]hey’ve got everything. The question becomes, Now they’re empowered. They can leak [this stuff]. It does happen at the local level. These capabilities are created. High tech. Super secret. But they inevitably bleed over to law enforcement. When they’re brand new they’re only used in the extremes. But as that transition happens, more and more people get access, they use it in newer and more and more expansive and more abusive ways.”

(Trickle down – or over-reach – applies to legislation too. For example, from a story widely reported in April, 2008: Half of councils use anti-terror laws to spy on ‘bin crimes’, although the legality of such practices was challenged: Councils warned over unlawful spying using anti-terror legislation and guidance brought in in November 2012 that required local authorities to obtain judicial approval prior to using covert techniques. (I realise I’m in danger here of conflating things not specifically related to over-reach on laws “intended” to be limited to anti-terrorism associated activities (whatever they are) with over-reach…) Other reviews: Lords Constitution Committee – Second Report – Surveillance: Citizens and the State (Jan 2009), Big Brother Watch on How RIPA has been used by local authorities and public bodies and Cataloguing the ways in which local authorities have abused their covert surveillance powers. I’m guessing a good all round starting point would be the reports of the Independent Reviewer of Terrorism Legislation.)

When it comes to processing large amounts of data, finding meaningful, rather than spurious, connections between things can be hard… (Correlation is not causation, right?, as Spurious Correlations wittily points out…;-)

What is more manageable is dumping people onto lists and counting things… Or querying specifics. A major problem with the extended and extensive data collection activities going on at the moment is that access to the data to allow particular queries to be made will be extended. The problem is not that all your data is being collected now, the issue is that post hoc searches over it could be made by increasing numbers of people in the future. Like bad tempered council officers having a bad day, or loan company algorithms with dodgy parameters.

PS Schneier on connecting the dots.. Why Mass Surveillance Can’t, Won’t, And Never Has Stopped A Terrorist.

PPS Here’s another example of a vehicle taking control of communications: Car calls 911 after alleged hit-and-run, driver arrested.

Participatory Surveillance

This is an evocative phrase, I think – “participatory surveillance” – though the definition of it is lacking from the source in which I came across it (Online Social Networking as Participatory Surveillance, Anders Albrechtslund, First Monday, Volume 13, Number 3 – 3 March 2008).

A more recent and perhaps related article – Cohen, Julie E., The Surveillance-Innovation Complex: The Irony of the Participatory Turn (June 19, 2014). In Darin Barney, Gabriella Coleman, Christine Ross, Jonathan Sterne & Tamar Tembeck, eds., The Participatory Condition (University of Minnesota Press, 2015, Forthcoming) – notes how “[c]ontemporary networked surveillance practices implicate multiple forms of participation, many of which are highly organized and strategic”, and include the “crowd-sourcing of commercial surveillance”. It’s a paper I need to read and digest properly…

One example from the last week or two of a technology that supports participatory surveillance comes from Buzzfeed’s misleading story relating how Hundreds Of Devices [Are] Hidden Inside New York City Phone Booths that “can push you ads – and help track your every move”; (the story resulted in the beacons being removed). My understanding of beacons is that they are a Bluetooth push technology that emit a unique location code, or a marketing message, within a limited range. A listening device can detect the beacon message and do something with it. The user thus needs to participate in any surveillance activity that makes use of the beacon by listening out for a beacon, capturing any message it hears, and then doing something with that message (such as phoning home with the beacon message).
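To make that listen, capture, phone-home flow a bit more concrete, here’s a minimal sketch in Python of what a listening app might do with a beacon message. It assumes the widely documented iBeacon style advertisement layout (Apple’s company ID, a 16 byte UUID, then “major” and “minor” numbers and a calibrated signal power); Gimbal’s own beacons may well use a different format. The parse_ibeacon and build_report names, and the device ID, are made up for illustration, and nothing is actually sent anywhere.

```python
import struct
import uuid

# Assumed iBeacon-style layout of the manufacturer specific data (25 bytes):
#   2 bytes company ID (little endian, 0x004C = Apple)
#   1 byte type (0x02), 1 byte length of what follows (0x15 = 21)
#   16 bytes proximity UUID, 2 bytes major, 2 bytes minor (big endian)
#   1 byte signed calibrated TX power (dBm at 1 metre)
APPLE_COMPANY_ID = 0x004C
IBEACON_TYPE = 0x02
IBEACON_LENGTH = 0x15


def parse_ibeacon(manufacturer_data):
    """Return the beacon's fields as a dict, or None if the payload doesn't match."""
    if len(manufacturer_data) != 25:
        return None
    company_id, beacon_type, length = struct.unpack_from("<HBB", manufacturer_data, 0)
    if (company_id, beacon_type, length) != (APPLE_COMPANY_ID, IBEACON_TYPE, IBEACON_LENGTH):
        return None
    major, minor, tx_power = struct.unpack_from(">HHb", manufacturer_data, 20)
    return {
        "uuid": str(uuid.UUID(bytes=manufacturer_data[4:20])),
        "major": major,
        "minor": minor,
        "tx_power": tx_power,
    }


def build_report(beacon, device_id, seen_at):
    """Hypothetical 'phone home' message: this is where a place gets tied to a device."""
    return {
        "device_id": device_id,  # supplied by the listening app, not by the beacon
        "seen_at": seen_at,
        "beacon": (beacon["uuid"], beacon["major"], beacon["minor"]),
    }


if __name__ == "__main__":
    # A made-up advertisement payload, built by hand for the demo
    payload = (
        bytes.fromhex("4c000215")
        + uuid.UUID("12345678-1234-5678-1234-567812345678").bytes
        + struct.pack(">HHb", 1, 2, -59)
    )
    beacon = parse_ibeacon(payload)
    print(build_report(beacon, "my-phone", 1400000000.0))
```

The thing to notice is that, on this model, the beacon only ever broadcasts; it’s the report assembled on the listening device that links a location code to a particular device.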

The technology described in the Buzzfeed story is developed by Gimbal, who offer an API, so it should be possible to get a feel from that what is actually possible. From a quick skim of the documentation, I don’t get the impression that the beacon device itself listens out for and tracks/logs devices that come into range of it? (See also Postscapes – Bluetooth Beacon Handbook.)

Of course, participating in beacon mediated transactions could be done unwittingly or surreptitiously. Again, my understanding is that Android devices require you to install an app and grant permissions to it that let it listen out for, and act on, beacon messages, whereas iOS devices have iBeacon listening built in to the iOS Location Services*, and you then grant apps permission to use messages that have been detected? This suggests that Apple can hear any beacon you pass within range of?

* Apparently, [i]f [Apple] Location Services is on, your device will periodically send the geo-tagged locations of nearby Wi-Fi hotspots and cell towers in an anonymous and encrypted form to Apple to augment Apple’s crowd-sourced database of Wi-Fi hotspot and cell tower locations. In addition, if you’re traveling (for example, in a car) and Location Services is on, a GPS-enabled iOS device will also periodically send GPS locations and travel speed information in an anonymous and encrypted form to Apple to be used for building up Apple’s crowd-sourced road traffic database. The crowd-sourced location data gathered by Apple doesn’t personally identify you. Apple don’t pay you for that information of course, though they might argue you get a return in kind in the form of better location awareness for your device.

There is also the possibility with any of those apps that you install one for a specific purpose, grant it permissions to use beacons, then the company that developed it gets taken over by someone you wouldn’t consciously give the same privileges to… (Whenever you hear about Facebook or Google or Experian or whoever buying a company, it’s always worth considering what data, and what granted permissions, they have just bought ownership of…)

See also: “participatory sensing” – Four Billion Little Brothers? Privacy, mobile phones, and ubiquitous data collection, Katie Shilton, University of California, Los Angeles, ACM Queue, 7(7), August 2009 – which “tries to avoid surveillance or coercive sensing by emphasizing individuals’ participation in the sensing process”.

More Digital Traces…

Via @wilm, I notice that it’s time again for someone (this time at the Wall Street Journal) to have written about the scariness that is your Google personal web history (the sort of thing you probably have to opt out of if you sign up for a new Google account, if other recent opt-in by defaults are anything to go by…)

It may not sound like much, but if you do have a Google account, and your web history collection is not disabled, you may find your emotional response to seeing months or years of your web/search history archived in one place surprising… Your Google web history.

Not mentioned in the WSJ article were some of the games that the Chrome browser gets up to. @tim_hunt tipped me off to a nice (if technically detailed, in places) review by Ilya Grigorik of some of the design features of the Chrome browser, and some of the tools built in to it: High Performance Networking in Chrome. I’ve got various pre-fetching tools switched off in my version of Chrome (tools that allow Chrome to pre-emptively look up web addresses and even download pages pre-emptively*) so those tools didn’t work for me… but looking at chrome://predictors/ was interesting to see what keystrokes I type are good predictors of web pages I visit…

[Image: chrome predictors]

* By the by, I started to wonder whether webstats get messed up to any significant effect by Chrome pre-emptively prefetching pages that folk never actually look at…?
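As a way of thinking about what a table like chrome://predictors/ might be recording, here’s a toy model in Python. It assumes that each row pairs some typed text with a candidate URL and keeps a count of hits (I went on to visit that URL) and misses (I went somewhere else), with the confidence being hits over total tries. That’s my guess at the general idea rather than a description of how Chrome actually does it, and the PrefixPredictor class and its method names are invented for the sketch.

```python
from collections import defaultdict


class PrefixPredictor:
    """Toy model: how well does some typed text predict the page that gets visited?"""

    def __init__(self):
        # Keyed by (typed_text, candidate_url)
        self._hits = defaultdict(int)
        self._misses = defaultdict(int)

    def record(self, typed_text, candidate_url, visited_url):
        """Log one navigation: was the candidate the page that was actually visited?"""
        key = (typed_text, candidate_url)
        if candidate_url == visited_url:
            self._hits[key] += 1
        else:
            self._misses[key] += 1

    def confidence(self, typed_text, candidate_url):
        """Fraction of the time the typed text led to the candidate URL (0.0 if never seen)."""
        key = (typed_text, candidate_url)
        total = self._hits[key] + self._misses[key]
        return self._hits[key] / total if total else 0.0


if __name__ == "__main__":
    predictor = PrefixPredictor()
    # Made-up browsing history, purely for the demo
    for visited in ["bbc.co.uk", "bbc.co.uk", "bbc.co.uk", "example.org"]:
        predictor.record("bb", "bbc.co.uk", visited)
    print(predictor.confidence("bb", "bbc.co.uk"))
```

A browser with prefetching switched on could then pre-emptively fetch a page once the confidence for what has been typed so far gets above some threshold, which is also why the table says quite a lot about your habits.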

In further relation to the tracking of traffic we generate from our browsing habits, as we access more and more web/internet services through satellite TV boxes, smart TVs, and catchup TV boxes such as Roku or NowTV, have you ever wondered about how that activity is tracked? LG Smart TVs logging USB filenames and viewing info to LG servers describes not only how LG TVs appear to log the things you do view, but also the personal media you might view, and in principle can phone that information home (because the home for your data is a database run by whatever service you happen to be using – your data is midata is their data).

there is an option in the system settings called “Collection of watching info:” which is set ON by default. This setting requires the user to scroll down to see it and, unlike most other settings, contains no “balloon help” to describe what it does.

At this point, I decided to do some traffic analysis to see what was being sent. It turns out that viewing information appears to be being sent regardless of whether this option is set to On or Off.

you can clearly see that a unique device ID is transmitted, along with the Channel name … and a unique device ID.

This information appears to be sent back unencrypted and in the clear to LG every time you change channel, even if you have gone to the trouble of changing the setting above to switch collection of viewing information off.

It was at this point, I made an even more disturbing find within the packet data dumps. I noticed filenames were being posted to LG’s servers and that these filenames were ones stored on my external USB hard drive.

Hmmm… maybe it’s time I switched out my BT homehub for a proper hardware firewalled router with a good set of logging tools…?

PS FWIW, I can’t really get my head round how evil on the one hand, or damp squib on the other, the whole midata thing is turning out to be in the short term, and what sorts of involvement – and data – the partners have with the project. I did notice that a midata innovation lab report has just become available, though to you and me it’ll cost 1500 squidlly diddlies so I haven’t read it: The midata Innovation Opportunity. Note to self: has anyone got any good stories to say about TSB supporting innovation in micro-businesses…?

PPS And finally, something else from the Ilya Grigorik article:

The HTTP Archive project tracks how the web is built, and it can help us answer this question. Instead of crawling the web for the content, it periodically crawls the most popular sites to record and aggregate analytics on the number of used resources, content types, headers, and other metadata for each individual destination. The stats, as of January 2013, may surprise you. An average page, amongst the top 300,000 destinations on the web is:

– 1280 KB in size
– composed of 88 resources
– connects to 15+ distinct hosts

Let that sink in. Over 1 MB in size on average, composed of 88 resources such as images, JavaScript, and CSS, and delivered from 15 different own and third-party hosts. Further, each of these numbers has been steadily increasing over the past few years, and there are no signs of stopping. We are increasingly building larger and more ambitious web applications.

Is it any wonder that pages take so long to load on a mobile phone off the 3G network, and that you can soon eat up your monthly bandwidth allowance!
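To put some rough numbers on that, here’s a back of the envelope calculation in Python using the 1280 KB average page size quoted above. The 1 Mbit/s effective 3G download rate and the 500 MB monthly allowance are my own assumptions, picked purely for illustration, and the sum ignores latency, caching and compression, so treat it as a sketch rather than a measurement.

```python
AVERAGE_PAGE_KB = 1280  # from the HTTP Archive figures quoted above (January 2013)

# Assumptions, not measurements:
ASSUMED_3G_MBIT_PER_S = 1.0   # effective download rate
ASSUMED_ALLOWANCE_MB = 500    # monthly data allowance


def transfer_seconds(page_kb, throughput_mbit_per_s):
    """Time to pull down page_kb kilobytes at a steady rate, ignoring latency."""
    bits = page_kb * 1024 * 8
    return bits / (throughput_mbit_per_s * 1_000_000)


def pages_per_allowance(page_kb, allowance_mb):
    """How many whole average-sized pages fit in the allowance, with no caching."""
    return (allowance_mb * 1024) // page_kb


if __name__ == "__main__":
    seconds = transfer_seconds(AVERAGE_PAGE_KB, ASSUMED_3G_MBIT_PER_S)
    pages = pages_per_allowance(AVERAGE_PAGE_KB, ASSUMED_ALLOWANCE_MB)
    print(f"About {seconds:.1f} seconds per average page")
    print(f"About {pages} average pages per month")
```

On those assumptions, that’s a little over ten seconds of raw transfer time per average page, and around 400 average pages before the allowance runs out; with 88 separate resources spread over 15 or more hosts, the round trips will only add to the wait.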