Personal Declarations on Your Behalf – Why Visiting One Website Might Tell Another You Were There
A couple of weeks ago, I was chatting to someone about their use of Google Analytics on OU-related websites. I asked whether or not privacy concerns had been taken into account, and got a response along the lines of “we don’t know who the data relates to, so it’s not really a problem”. Which is not quite what I was asking…
Whenever you visit a website that uses AdSense or Google Analytics, Google knows that a particular person – as identified by a cookie that Google stores in your browser – has visited that website. If you’re logged in to a Google account (in another tab or window in the same browser, for example; or from a previous visit to Gmail or Google Docs), Google also potentially knows that it is INSERT YOUR NAME HERE who is represented by that cookie, and hence, Google knows which of the many, many Google Analytics and AdSense serving websites you have visited. (This may also be why Google just acquired the Labpixies widget platform – it wants to extend its reach…)
With the release of Facebook’s “Like” button, which has started appearing on many websites, Facebook is now developing the capacity to track which of its users are visiting “Likeable” websites. If you’ve logged in to Facebook, Facebook will have placed a cookie in your browser to identify you. Whenever you visit a website that has installed any Facebook for Websites utilities (see also Facebook widgets), the Facebook code inserted into the page tells Facebook that you have visited that site.
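To make the mechanism concrete, here is a minimal Python sketch of what a third party could piece together. It is a toy model, not Facebook's or Google's actual code: the Tracker class, its method names and the example URLs are all made up for illustration. The one assumption it leans on is the behaviour described above, namely that each page embedding the widget triggers a request to the third party that carries the third party's own cookie along with the address of the page being viewed.

```python
import uuid
from collections import defaultdict


class Tracker:
    """Toy model of a third party whose widget is embedded on many sites."""

    def __init__(self):
        self.visits = defaultdict(list)  # cookie id -> pages seen with that cookie
        self.accounts = {}               # cookie id -> account name, once known

    def handle_widget_request(self, cookie_id, page_url):
        """Record that the browser holding cookie_id loaded page_url.

        Returns the cookie id the browser should keep (a new one on first sight).
        """
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex  # first sighting: hand out a fresh id
        self.visits[cookie_id].append(page_url)
        return cookie_id

    def login(self, cookie_id, account_name):
        """Logging in ties a previously anonymous cookie to a named account."""
        self.accounts[cookie_id] = account_name

    def history_for(self, account_name):
        """Every page seen with any cookie linked to this account."""
        return [
            page
            for cid, name in self.accounts.items()
            if name == account_name
            for page in self.visits[cid]
        ]


# Two unrelated (hypothetical) sites embed the same widget...
tracker = Tracker()
cookie = tracker.handle_widget_request(None, "http://news.example/story")
cookie = tracker.handle_widget_request(cookie, "http://university.example/course")

# ...and a later login attaches a name to the whole browsing trail.
tracker.login(cookie, "YOUR NAME HERE")
print(tracker.history_for("YOUR NAME HERE"))
```

Note that neither of the two example sites ever learns the visitor's name in this model; only the party that owns the cookie can join the dots.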
This information may then be passed back to the website you have visited as domain analytics: anonymised, aggregated reports that reveal demographic data about the makeup of those site visitors that Facebook knows about. (See Facebook Overhauls Page and App Insights, Adds Domain Analytics Features and an API for a good overview of insight and analytics services provided by Facebook that cover apps and pages, as well as domains.)
YouTube also offers a similar service to video publishers, in the form of reports about the demographics of people who have viewed a particular video, presumably based on samples of viewers who have entered this personal information as part of their YouTube profile.
Note that by claiming your website on Facebook, you can also get hold of reports relating to activity around your website on Facebook itself.
So what information can we learn from this?
– demographics/personal profile data may be passed on (in aggregate form) as part of analytics reports;
– user tracking across multiple websites is being achieved by the big web companies, but in an indirect way. If the OU includes Google Analytics or a Facebook Like button on an OU page, Google or Facebook respectively knows that you have visited that page. The OU doesn’t necessarily know, but the third party whose tools are annotating the OU’s pages does know.
(Your ISP will know too, of course, as does your browser; so if you were paranoid, you might think that the browser supplied on your iPhone or Android phone is phoning home information about the websites you have visited. But that would just be paranoid, right…?! After all, EULA details probably rule that sort of thing out (anyone checked?), unless they rule it in… in an opt-in-able way, of course. So for example, if you install the Google toolbar, you can let Google maintain a history of all the websites you visit, as it (opt-in-ally) does with all the websites you click through to from a Google search results page ;-) )
There’s a great discussion about whether Facebook needed to implement its widgets in an intrusive, always user tracking way, in this episode of the Technometria podcast – What’s Facebook Thinking? (the long and the short of it: it didn’t need to…).
As to why Facebook and Google want this demographic and interest/attention data? They make megabucks from selling targeted advertising…
And as to what we can do about it? I’d love to see a map of t’internet that shows the proportion of websites that Google and Facebook can track my behaviour on because those site owners have invited them in…
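A very rough first step towards that sort of map might look something like the Python sketch below. Treat it as an illustration under several assumptions: the list of hosts is a tiny hand-picked sample rather than anything complete, the example pages are invented, and the function names are my own. It also only spots scripts, iframes and images that are referenced directly by a src attribute, so tracking code that gets injected by other scripts at run time would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Illustrative only: a hand-picked, far from complete list of third-party hosts.
TRACKER_HOSTS = ("google-analytics.com", "facebook.com", "facebook.net")


class EmbedFinder(HTMLParser):
    """Collects the hosts of scripts, iframes and images embedded in a page."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "iframe", "img"):
            src = dict(attrs).get("src")
            host = urlparse(src).hostname if src else None
            if host:  # relative URLs have no host and stay on the same site
                self.hosts.add(host)


def trackers_on(html):
    """Return the listed tracker hosts that this page pulls content from."""
    finder = EmbedFinder()
    finder.feed(html)
    return {
        tracker
        for tracker in TRACKER_HOSTS
        for host in finder.hosts
        if host == tracker or host.endswith("." + tracker)
    }


def share_trackable(pages):
    """pages maps a site name to its HTML; returns tracker -> fraction of sites."""
    counts = {tracker: 0 for tracker in TRACKER_HOSTS}
    for html in pages.values():
        for tracker in trackers_on(html):
            counts[tracker] += 1
    total = len(pages)
    return {t: (counts[t] / total if total else 0.0) for t in TRACKER_HOSTS}


# Four made-up pages standing in for "the websites I visit".
pages = {
    "a.example": '<script src="http://www.google-analytics.com/ga.js"></script>',
    "b.example": '<iframe src="http://www.facebook.com/plugins/like.php"></iframe>'
                 '<script src="http://www.google-analytics.com/ga.js"></script>',
    "c.example": '<script src="/js/local.js"></script>',
    "d.example": '<img src="http://d.example/logo.png">',
}
print(share_trackable(pages))
```

Feeding in the front pages of sites from your own browser history, rather than four invented ones, would give a personal version of the same figures.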
PS h/t to @stuartbrown for pointing out the work that Mathieu d’Aquin from the OU’s KMi is doing on tracking the amount of personal data we reveal to websites.
PPS as a user, taking defensive measures using things like browser privacy settings may also not be very much help. See for example Why private browsing isn’t…
PPPPS Here’s another way of leaking personal data: if you click on a link on a page, the site you are visiting is notified about the URL of the referring page (so if you click a link on the OUseful.info blog the site you click through to knows you came from this blog). If personal information – such as your name – is encoded into the URL of a page you click from (for example, because you are clicking through an ad on your profile page whose URL includes your name), this information may be passed to the clicked thru to site. Ref: WSJ: Facebook, MySpace Confront Privacy Loophole and Ars Technica Report: Facebook caught sharing secret data with advertisers. See also: On the Leakage of Personally Identifiable Information Via Online Social Networks [pdf]
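As a small illustration of the referrer leak, the Python sketch below pulls apart the sort of Referer header value a clicked-through-to site might receive. The profile URL, its parameters and the function name are all invented for the example; real sites structure their URLs in their own ways.

```python
from urllib.parse import urlparse, parse_qs


def what_the_referrer_reveals(referer):
    """Break a Referer header value into the pieces a receiving site could log."""
    parts = urlparse(referer)
    return {
        "site": parts.hostname,
        "path_segments": [seg for seg in parts.path.split("/") if seg],
        "query": parse_qs(parts.query),
    }


# Hypothetical profile page URL with a user name baked into the path.
referer = "http://socialnetwork.example/profile/jane.doe?ref=ad&id=12345"
leak = what_the_referrer_reveals(referer)
print(leak["site"])           # which site the click came from
print(leak["path_segments"])  # includes the profile name
print(leak["query"])          # includes any ids carried in the query string
```

Nothing clever is needed on the receiving side: the name and the id arrive as part of an ordinary request, and simply logging the header is enough to capture them.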
P-whatever-S If you want to opt out of being tracked by Google Analytics, here’s a RWW report of an “official” tool for doing just that. (This can all get a bit meta- though… e.g. the best way of finding out which public content a website doesn’t want Google to index is just to look at the robots.txt file for the website ;-) I also wonder: should privacy policies for sites that include things like Google Analytics also link to tools that would allow a visitor to the site to opt out of being part of that tracking?
[UPDATE: here’s another example – NHS.uk allowing Google, Facebook, and others to track you]