Open Source Intelligence – OSINT – is intelligence that can be collected from public sources. That is to say, OSINT is the sort of intelligence that you should be able to collect using a browser and a public or academic library that also provides access to public subscription content. (For an intro to OSINT, see for example Sailing the Sea of OSINT in the Information Age; for some example context, see Threat Intelligence: Collecting, Analysing, Evaluating.) OSINT can be used as much by corporates as by the security services. It’s also up for grabs by journalists, civil society activists and stalkers…
Looking at the syllabus for an OSINT beginners’ course, such as IMSL’s Basic Open Source (OSINT) Research & Analysis Tradecraft, turns up the sorts of thing you might also expect to see as part of one of Phil Bradley or Karen Blakeman’s ILI search workshops:
- Appreciation of the OS environment
- Opportunities, Challenges and Threats
- Legal and Ethical Guidance
- Search Tradecraft
- Optimising Search
- Advanced Search Techniques
- Profile Management and Risk Reduction
- Technical Anonymity/Low Attribution
- Security Tradecraft
- Social Media exploitation
- Orientation around the most commonly used platforms Twitter, Facebook, LinkedIn etc.
- Identifying influence
- Event monitoring
- Situational Awareness
- Emerging social media platforms
- Source Evaluation
- Verifying User Generated Content on Social Media
And as security consultant Bruce Schneier beautifully observed in 2014, “[s]urveillance is the business model of the Internet”.
What may be surprising, or what may help explain in part their dominance, is that a large part of the surveillance capability the web cos have developed is something they’re happy to share with the rest of us. Things like social media exploitation, for example, allow you to easily identify social relationships, and pick up personal information along the way (“Happy Birthday, sis..”). You can also identify whereabouts (“Photo of me by the Eiffel Tower earlier today”), captioned or not – Facebook and Google will both happily tag your photos for you to make them, and the information, or intelligence, they contain, more discoverable.
Part of the reason that the web companies have managed to grow so large is that they operate very successful two-sided markets. As the FT Lexicon defines it, these are markets that provide “a meeting place for two sets of agents who interact through an intermediary or platform”. In the case of the web cos, the two sets of agents are the “social users”, who gain social benefit from interacting with each other through the platform, and the advertisers, who pay the platform to advertise to the social users (Some Notes on Churnalism and a Question About Two Sided Markets).
A naive sort of social media intelligence would focus, I think, on what can be learned simply through the publicly available activity on the social user side of the platform, albeit activity that may be enriched through automatic tagging by the platform itself.
But there is the other side of the platform to consider too. And the tools on that side of the platform, the tools developed for the business users, are out and out designed to provide the business users – the advertisers – with intelligence about the social users.
Which is all to say that if surveillance is your thing, then ADINT – Adtech Intelligence – could be a good OSINT way in, as a recent paper from the Paul G. Allen School of Computer Science & Engineering, University of Washington describes: ADINT: Using Targeted Advertising for Personal Surveillance (read the full paper; Wired also picked up the story: It Takes Just $1,000 to Track Someone’s Location With Mobile Ads). Here’s the paper abstract:
Targeted advertising is at the heart of the largest technology companies today, and is becoming increasingly precise. Simultaneously, users generate more and more personal data that is shared with advertisers as more and more of daily life becomes intertwined with networked technology. There are many studies about how users are tracked and what kinds of data are gathered. The sheer scale and precision of individual data that is collected can be concerning. However, in the broader public debate about these practices this concern is often tempered by the understanding that all this potentially sensitive data is only accessed by large corporations; these corporations are profit-motivated and could be held to account for misusing the personal data they have collected. In this work we examine the capability of a different actor — an individual with a modest budget — to access the data collected by the advertising ecosystem. Specifically, we find that an individual can use the targeted advertising system to conduct physical and digital surveillance on targets that use smartphone apps with ads.
The attack is predicated in part on knowing the Mobile Advertising ID (MAID) of a user you want to track, and several strategies are described for obtaining that.
I haven’t looked at adservers for a long time (or Google Analytics for that matter), so I thought I’d have a quick look at what the UIs support. So for example, Google AdWords seems to offer quite a simple range of tools that presumably let me target based on various things, like demographics:

or location:

or time:

It also looks like I can target ads based on apps a user uses:

or websites they visit:

though it’s not clear to me if I need to be the owner of those apps or webpages?
If I know someone’s email address, it also looks like I can use that to vector an ad towards them? Which means Google cookies presumably associate with an email address?

This email vectoring is actually part of Google’s “Customer Match” offering, which “lets you show ads to your customers based on data about those customers that you share with Google”.
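As far as I understand Google’s documentation, Customer Match lists are matched on normalised, SHA-256 hashed email addresses rather than on the raw addresses themselves. The following is only a minimal sketch of that idea, not Google’s actual implementation: the function names and the toy “platform side” set of hashes are hypothetical, made up for illustration, and the normalisation shown (trim whitespace, lowercase) is an assumption about the basic steps involved.

```python
import hashlib


def normalise_email(email: str) -> str:
    """Trim surrounding whitespace and lowercase the address (assumed normalisation)."""
    return email.strip().lower()


def hash_email(email: str) -> str:
    """Return the SHA-256 hex digest of the normalised address."""
    return hashlib.sha256(normalise_email(email).encode("utf-8")).hexdigest()


def match_audience(advertiser_emails, platform_hashes):
    """Hypothetical matching step: keep the hashes that appear on both sides."""
    uploaded = {hash_email(e) for e in advertiser_emails}
    return uploaded & set(platform_hashes)


if __name__ == "__main__":
    # Toy stand-in for the hashed identifiers a platform might already hold.
    platform_side = {hash_email("alice@example.com"), hash_email("bob@example.com")}
    # The advertiser's list, with messy formatting for one address.
    advertiser_list = ["  Alice@Example.com ", "carol@example.com"]
    print(len(match_audience(advertiser_list, platform_side)), "address(es) matched")
```

The point of the sketch is that the match only needs both sides to hold the same email address in some form: if the platform has an address on file for a logged-in user, an uploaded list can be joined to that account without the raw address being exchanged.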
So how about Facebook? As you might expect, there’s a range of audience targeting categories that draw heavily on the information users provide to the system:

(You’ve probably heard the slogan “if you aren’t paying for the product, you are the product” and thought nothing of it. Are you starting to feel bought and sold, yet?)
Remember that fit of anger, or joy, when you changed your relationship, maybe also flagging a life event (= valuable to advertisers)?

Or maybe when you bought that thing (is there a Facebook Pay app yet, to make this easier for Facebook to track?):

And of course, there’s location:

If you fancy exploring some more, the ADINT paper has a handy table summarising what’s offered by various other adtech providers:

On the other hand, if you want to buy readymade audiences from a data aggregator, try the Oracle Data Marketplace. It looks as if they’ll happily resell you audiences derived from Experian data, for example:

So I’m wondering, what other sorts of intelligence operation could be mounted against a targeted individual using adtech more generally? And what sorts of target identification can be achieved through a creative application of adtech, and maybe some simple phishing to entice a particular user onto a web page you control, which you can then use to grab some preliminary tracking information about them?
Presumably, once you can get your hooks into a user, maybe by enticing them to a web page that you have set up to show your ad so that the adserver can spear the user, you can also use ad retargeting or remarketing (that follows users around the web, in the sense of continuing to show them ads from a particular campaign) to keep a tail on them?
[This post was inspired by an item in Mike Caulfield’s must-read Traces weekly email newsletter. Subscribe to his blog – Hapgood – for a regular dose of digital infoskills updating. You might also enjoy his online book Web Literacy for Student Fact-Checkers.]