Patently Imagined Futures, (or, what’s Facebook been getting up to recently?)

One of the blogs on my “must read” list is Bill Slawski’s SEO by the Sea, which regularly comments on a wide variety of search related patents, both recent and in the past, obtained by Google and what they might mean…

The US patent system is completely dysfunctional, of course, acting as a way of preventing innovative competition in a way that I think probably wasn’t intended by its framers, but it does provide an insight into some of the crazy bar talk ideas that Silicon Valley types thought they might just go and try out on millions of people, or perhaps already are trying out.

As an example, here are a couple of Facebook patents that recently crossed my radar.

First up, USPTO 20150124107 – ASSOCIATING CAMERAS WITH USERS AND OBJECTS IN A SOCIAL NETWORKING SYSTEM:

Images uploaded by users of a social networking system are analyzed to determine signatures of cameras used to capture the images. A camera signature comprises features extracted from images that characterize the camera used for capturing the image, for example, faulty pixel positions in the camera and metadata available in files storing the images. Associations between users and cameras are inferred based on actions relating users with the cameras, for example, users uploading images, users being tagged in images captured with a camera, and the like. Associations between users of the social networking system related via cameras are inferred. These associations are used beneficially for the social networking system, for example, for recommending potential connections to a user, recommending events and groups to users, identifying multiple user accounts created by the same user, detecting fraudulent accounts, and determining affinity between users.

Which is to say: traces of the flaws in a particular camera that are passed through to each photograph are unique enough to uniquely identify that camera. (I note that academic research picked up on by Bruce Schneier demonstrated this getting on for a decade ago: Digital Cameras Have Unique Fingerprints.) So when a photo is uploaded to Facebook, Facebook can associate it with a particular camera. And by association with who’s uploading the photos, a particular camera, as identified by the camera signature baked into a photograph, can be associated with a particular person. Another form of participatory surveillance, methinks.
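As a toy illustration of the idea (a minimal sketch, not the method in the patent; published camera fingerprinting work tends to rely on subtler sensor noise patterns), suppose a "signature" is just the set of pixel positions that hold the same value in every image from a batch. The tiny 3x3 "images" and the function names below are invented for the example.

```python
def stuck_pixels(images):
    """Positions whose value is identical in every image of the batch:
    a crude stand-in for the 'faulty pixel positions' in the abstract.
    Assumes at least two images, all of the same size."""
    first = images[0]
    positions = set()
    for r, row in enumerate(first):
        for c, value in enumerate(row):
            if all(img[r][c] == value for img in images[1:]):
                positions.add((r, c))
    return positions


def jaccard(a, b):
    """Overlap between two signatures: 1.0 means identical, 0.0 disjoint."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented data: camera A has a pixel stuck at 255 in position (0, 1),
# camera B has a dead pixel (always 0) in position (2, 2).
camera_a = [
    [[10, 255, 30], [40, 50, 60], [70, 80, 90]],
    [[12, 255, 33], [41, 52, 63], [75, 81, 95]],
]
unknown_uploads = [
    [[5, 255, 7], [8, 9, 10], [11, 12, 13]],
    [[6, 255, 8], [9, 10, 11], [12, 13, 14]],
]
camera_b = [
    [[10, 20, 30], [40, 50, 60], [70, 80, 0]],
    [[11, 21, 31], [41, 51, 61], [71, 81, 0]],
]

sig_a = stuck_pixels(camera_a)
sig_unknown = stuck_pixels(unknown_uploads)
sig_b = stuck_pixels(camera_b)

print("unknown vs camera A:", jaccard(sig_unknown, sig_a))
print("unknown vs camera B:", jaccard(sig_unknown, sig_b))
```

Matching the unknown uploads to camera A is then what lets an uploader's account be tied to whoever else has uploaded photos from that camera.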

Note that this is different to the various camera settings that get baked into photograph metadata (you know, that “administrative” data stuff that folk would have you believe doesn’t really reveal anything about the content of a communication…). I’m not sure to what extent that data helps narrow down the identity of a particular camera, particularly when associated with other bits of info in a data mosaic, but it doesn’t take that many bits of data to uniquely identify a device. Like your web-browser’s settings, for example, that are revealed to webservers of sites you visit through browser metadata, and uniquely identify your browser. (See eg this paper from the EFF – How Unique Is Your Web Browser? [PDF] – and the associated test site: test your browser’s uniqueness.) And if your camera’s also a phone, there’ll be a wealth of other bits of metadata that let you associate camera with phone, and so on.
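To put a rough number on "not that many bits": singling out one device among N needs about log2(N) bits, and an observed attribute value shared by a fraction p of devices contributes about -log2(p) bits. The sketch below just does that arithmetic. The attribute frequencies are made-up illustrative values, not figures from the EFF paper, and adding the bits together assumes the attributes are independent, which real browser settings are not.

```python
import math


def bits_needed(population):
    """Bits required to single out one item among `population` items."""
    return math.log2(population)


def surprisal_bits(fraction_sharing):
    """Identifying information carried by an attribute value that is
    shared by the given fraction of devices."""
    return -math.log2(fraction_sharing)


# Hypothetical frequencies, for illustration only.
attributes = {
    "user agent string": 1 / 1500,
    "screen size and colour depth": 1 / 30,
    "time zone": 1 / 20,
    "installed fonts": 1 / 5000,
}

total_bits = sum(surprisal_bits(p) for p in attributes.values())

print("bits to identify one device in 8 billion: %.1f" % bits_needed(8e9))
print("bits from the four attributes combined: %.1f" % total_bits)
```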

Facebook’s face recognition algorithms can also work out who’s in an image, so more relationships and associations there. If kids aren’t being taught about graph theory in school from a very young age, they should be… (So for example, here’s a nice story about what you can do with edges: SELECTION AND RANKING OF COMMENTS FOR PRESENTATION TO SOCIAL NETWORKING SYSTEM USERS. Here’s a completely impenetrable one: SYSTEMS, METHODS, AND APPARATUSES FOR IMPLEMENTING AN INTERFACE TO VIEW AND EXPLORE SOCIALLY RELEVANT CONCEPTS OF AN ENTITY GRAPH.)

Here’s another one – hinting at Facebook’s role as a future publisher:

SOCIAL NETWORKING SYSTEM DATA EXCHANGE

An online publisher provides content items such as advertisements to users. To enable publishers to provide content items to users who meet targeting criteria of the content items, an exchange server aggregates data about the users. The exchange server receives user data from two or more sources, including a social networking system and one or more other service providers. To protect the user’s privacy, the social networking system and the service providers may provide the user data to the exchange server without identifying the user. The exchange server tracks each unique user of the social networking system and the service providers using a common identifier, enabling the exchange server to aggregate the users’ data. The exchange server then applies the aggregated user data to select content items for the users, either directly or via a publisher.

I don’t really see what’s clever about this – using an ad serving engine to serve content – even though Business Insider try to talk it up (Facebook just filed a fascinating patent that could seriously hurt Google’s ad revenue). I pondered something related to this way back when, but never really followed it through: Contextual Content Server, Courtesy of Google? (2008), Contextual Content Delivery on Higher Ed Websites Using Ad Servers (2010), or Using AdServers Across Networked Organisations (2014). Note also this remark on the University of Bedfordshire using Google Banner Ads as On-Campus Signage (2011).

(By the by, I also note that Google has a complementary service where it makes content recommendations relating to content on your own site via AdSense widgets: Google Matched Content.)

PS not totally unrelated, perhaps, a recent essay by Bruce Schneier on the need to regulate the emerging automatic face recognition industry: Automatic Face Recognition and Surveillance.

Even Though RSS Never Went Away, Could It Be Coming Back as a Facebook Sinker?

Long time readers will know I was – am – a huge fan of RSS and Atom, simple feed based protocols for syndicating content and attachment links, even going so far as to write a manifesto of a sort at one point (We Ignore RSS at OUr Peril).

This blog, and the earlier archived version of it, are full of reports and recipes around various RSS experiments and doodles, although in more recent years I haven’t really been using RSS as a creative medium that much, if at all.

But today I noticed this on the official Facebook developer blog: Publishing Instant Articles Directly From Your Content Management System [Instant Article docs]. Or more specifically, this:

When publishers get started with Instant Articles, they provide an RSS feed of their articles to Facebook, a format that most Content Management Systems already support. Once this RSS feed is set up, Instant Articles automatically loads new stories as soon as they are published to the publisher’s website and apps. Updates and corrections are also automatically captured via the RSS feed so that breaking news remains up to date.

So… Facebook will use RSS to synch content into Facebook from publishers’ CMS’.
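As a rough sketch of what consuming such a feed involves (the feed content and URLs are invented, and this says nothing about how Facebook actually ingests feeds), Python's standard library can pull the items out of an RSS 2.0 document and use each item's guid to spot stories not seen on a previous poll:

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for a publisher's CMS output.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Publisher</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>First story</title>
      <link>http://example.com/first-story</link>
      <guid>http://example.com/first-story</guid>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/second-story</link>
      <guid>http://example.com/second-story</guid>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return a list of (guid, title, link) tuples from an RSS 2.0 string."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("guid"), item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


def new_items(feed_xml, seen_guids):
    """Items whose guid was not seen on a previous poll of the feed."""
    return [it for it in parse_items(feed_xml) if it[0] not in seen_guids]


# Pretend the first story was picked up on an earlier poll.
seen = {"http://example.com/first-story"}
for guid, title, link in new_items(SAMPLE_FEED, seen):
    print(title, link)
```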

Depending on the agreement Facebook has with the publishers, it may require that those feeds are private, rather than public, feeds that sink the content directly into Facebook.

But I wonder, will it also start sinking content from other independent publishers into the Facebook platform via those open feeds, providing even less reason for Facebook users to go elsewhere as it drops bits of content from the open web into closed, personal Facebook News Feeds? Hmmm…

There seems to be another sort of a grab for attention going on too:

Each Instant Article is associated with the URL where the web version is hosted on the publisher’s website. This means that Instant Articles are open and compatible with all of the ways that people share links around the web today:

  • When a friend or page you follow shares a link in your News Feed, we check to see if there is an Instant Article associated with that URL. If so, you will see it as an Instant Article. If not, it will open on the web browser.
  • When you share an Instant Article on Facebook or using email, SMS, or Twitter, you are sharing the link to the publisher website so anyone can open the article no matter what platform they use.

Associating each Instant Article with a URL makes it easy for publishers to adopt Instant Articles without changing their publishing workflows and means that people can read and share articles without thinking about the platform or technology behind the scenes.

Something like this maybe?

fbInstantRSS

Which is to say, this?

fbInstantRSS2

Or maybe not. Maybe there is some enlightened self interest in this, and perhaps Facebook will see a reason to start letting its content out via open syndication formats, like RSS.

Or maybe RSS will end up sinking the Facebook platform, by allowing Facebook users to go off the platform but still accept content from it?

Whatever the case, as Facebook becomes a set of social platform companies rather than a single platform company, I wonder: will it have an open standard, feed based syndication bus to help content flow within and around those companies? Even if that content is locked inside the confines of a Facebook-parent-company-as-web attention wall?

PS So the ‘related content’ feature on my WordPress blog associates this post with an earlier one: Is Facebook Stifling the Free Flow of Information?, which it seems was lamenting an earlier decision by Facebook to disable the import of content into Facebook using RSS…?! What goes around, comes around, it seems?!

Dangers of a Walled Garden…

Reading a recent Economist article (The value of friendship) about the announcement last week that Facebook is to float as a public company, and being amazed as ever about how these valuations, err, work, I recalled a couple of observations from a @currybet post about the Guardian Facebook app (“The Guardian’s Facebook app” – Martin Belam at news:rewired). The first related to using Facebook apps to (only partially successfully) capture attention of folk on Facebook and get them to refocus it on the Guardian website:

We knew that 77% of visits to the Guardian from facebook.com only lasted for one page. A good hypothesis for this was that leaving the confines of Facebook to visit another site was an interruption to a Facebook session, rather than a decision to go off and browse another site. We began to wonder what it would be like if you could visit the Guardian whilst still within Facebook, signed in, chatting and sharing with your friends. Within that environment could we show users a selection of other content that would appeal to them, and tempt them to stay with our content a little bit longer, even if they weren’t on our domain.

The second thing that came to mind related to the economic/business models around the Facebook app itself:

The Guardian Facebook app is a canvas app. That means the bulk of the page is served by us within an iFrame on the Facebook domain. All the revenue from advertising served in that area of the page is ours, and for launch we engaged a sponsor to take the full inventory across the app. Facebook earn the revenue from advertising placed around the edges of the page.

I’m not sure if Facebook runs CPM (cost per thousand) display based ads, where advertisers pay per impression, or follows the Google AdWords model, where advertisers pay per click (PPC), but it got me wondering… A large number of folk on Facebook (and Twitter) share links to third party websites external to Facebook. As Martin Belam points out, the user return rate back to Facebook for folk visiting third party sites from Facebook seems very high – folk seem to follow a link from Facebook, consume that item, return to Facebook. Facebook makes an increasing chunk of its revenue from ads it sells on Facebook.com (though with the amount of furniture and Facebook open graph code it’s getting folk to include on their own websites, it presumably wouldn’t be so hard for them to roll out their own ad network to place ads on third party sites?) so keeping eyeballs on Facebook is presumably in their commercial interest.

In Twitter land, where the VC folk are presumably starting to wonder when the money tap will start to flow, I notice “sponsored tweets” are starting to appear in search results:

ANother twitter search irrelevance

Relevance still appears to be quite low, possibly because they haven’t yet got enough ads to cover a wide range of keywords or prompts:

Dodgy twitter promoted tweet

(Personally, if the relevance score was low, I wouldn’t place the ad, or I’d serve an ad tuned to the user, rather than the content, per se…)

Again, with Twitter, a lot of sharing results in users being taken to external sites, from which they quickly return to the Twitter context. Keeping folk in the Twitter context for images and videos through pop-up viewers or embedded content in the client is also a strategy pursued in many Twitter clients.

So here’s the thought, though it’s probably a commercially suicidal one: at the moment, Facebook and Twitter and Google+ all automatically “linkify” URLs (though Google+ also takes the strategy of previewing the first few lines of a single linked to page within a Google+ post). That is, given a URL in a post, they turn it into a link. But what if they turned that linkifier off for a domain, unless a fee was paid to turn it back on? Or what if the linkifier was turned off if the number of clickthrus on links to a particular domain, or page within a domain, exceeded a particular threshold, and could only be turned on again at a metered, CPM rate? (Memories here of different models for getting folk to pay for bandwidth, because what we have here is access to bandwidth out of the immediate Facebook, Twitter or Google+ context).
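A minimal sketch of the first variant of that thought experiment, assuming a hypothetical `unpaid_domains` set: URLs are turned into links as usual, unless their domain is on the list, in which case they are left as plain text. It ignores niceties such as trailing punctuation and escaping the rest of the post, and is not how any of these services actually linkify.

```python
import re
from html import escape
from urllib.parse import urlparse

# Deliberately simple URL matcher: http(s) followed by non-space characters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def linkify(text, unpaid_domains=frozenset()):
    """Turn URLs in `text` into HTML links, except for domains in the
    (hypothetical) unpaid list, which stay as unclickable text."""
    def replace(match):
        url = match.group(0)
        domain = urlparse(url).netloc.lower()
        if domain in unpaid_domains:
            return escape(url)
        safe = escape(url, quote=True)
        return '<a href="{0}">{0}</a>'.format(safe)

    return URL_PATTERN.sub(replace, text)


post = "Worth a read: http://example.com/article and http://example.org/other"
print(linkify(post))
print(linkify(post, unpaid_domains={"example.com"}))
```

The metered variant would swap the fixed set for a per-domain clickthru counter checked against a threshold.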

As a revenue model, the losses associated with irritating users would probably outweigh any revenue benefits, but as a thought experiment, it maybe suggests that we need to start paying more attention to how these large attention-consuming services are increasingly trying to cocoon us in their context (anyone remember AOL, or to a lesser extent Yahoo, or Microsoft?), rather than playing nicely with the rest of the web.

PS Hmmm…”app”. One default interpretation of this is “app on phone”, but “Facebook app” means an app that runs on the Facebook platform… So for any given app, that it is an “app” implies that that particular variant means “software application that runs on a proprietary platform”, which might actually be a combination of hardware and software platforms (e.g. Facebook API and Android phone)???

Jumbled Thoughts About Facebook’s Open Graph and User-Network Relays/Syndication

In Is Facebook Stifling the Free Flow of Information? I noted how Facebook no longer allows you to use an RSS feed to automatically syndicate content via your Facebook Notes page, instead recommending that you post the content directly into Facebook, or specifically post an update that links to your content.

There are workarounds, of course. Here’s one I’ve just tried – If this, then that (IFTTT):

iftt - RSS2Facebook

In a license controlled piece (more about that in another post… -ed.) regarding “Frictionless sharing” – exploring the changes to Facebook, Martin Belam hints that the Facebook “Open Graph” API supports actions that allow website publishers to add an action to their pages that will automatically post an update to a logged in Facebook user’s stream announcing that they have visited that page. (I’m trying to find a simple explanation of this, with code snippets, but can’t seem to track one down. If you know of one, please let me know… The closest I can find is a walkthrough about getting started with the Facebook Open Graph API. See also non-technical reviews such as PCWorld’s Facebook’s Frictionless Sharing: A Privacy Guide.)

This brought to mind a couple of things:

1) the notion of webhooks; it seems to me that the user’s Facebook identity essentially provides a webhook/callback URL that allows the publisher of a Facebook app/owner of a web page that embeds a Facebook app to use page events to automatically trigger Facebook actions on that user’s Facebook account.

2) We get a new model of syndication, whereby readers of a page actually announce the fact that they have visited a page, and with it syndicate a link to that page. At least, until the (Facebook) algorithm kicks in that determines which of a particular Facebook user’s friends see which of their updates…

PS watching the Facebook Open Graph tutorial video, I wondered whether anyone in the HE sector has looked at defining “Open Graph” elements for use in an educational context, and maybe built proof of concept apps that build up personal timelines based on course/VLE related actions (“completed this exercise”, “found this resource useful”, etc)?

Or maybe someone involved with OERs that lets folk share information about OER sites/resources they’ve viewed, used, downloaded etc?

I’m not suggesting it’s a good (or bad) idea, just wondering…

Social Interest Positioning – Visualising Facebook Friends’ Likes With Data Grabbed Using Google Refine

What do my Facebook friends have in common in terms of the things they have Liked, or in terms of their music or movie preferences? (And does this say anything about me?!) Here’s a recipe for visualising that data…

After discovering via Martin Hawksey that the recent (December, 2011) 2.5 release of Google Refine allows you to import JSON and XML feeds to bootstrap a new project, I wondered whether it would be able to pull in data from the Facebook API if I was logged in to Facebook (Google Refine does run in the browser after all…)

Looking through the Facebook API documentation whilst logged in to Facebook, it’s easy enough to find exemplar links to things like your friends list (https://graph.facebook.com/me/friends?access_token=A_LONG_JUMBLE_OF_LETTERS) or the list of likes someone has made (https://graph.facebook.com/me/likes?access_token=A_LONG_JUMBLE_OF_LETTERS); replacing me with the Facebook ID of one of your friends should pull down a list of their friends, or likes, etc.

(Note that validity of the access token is time limited, so you can’t grab a copy of the access token and hope to use the same one day after day.)

Grabbing the link to your friends on Facebook is simply a case of opening a new project, choosing to get the data from a Web Address, and then pasting in the friends list URL:

Google Refine - import Facebook friends list

Click on next, and Google Refine will download the data, which you can then parse as a JSON file, and from which you can identify individual record types:

Google Refine - import Facebook friends

If you click the highlighted selection, you should see the data that will be used to create your project:

Google Refine - click to view the data

You can now click on Create Project to start working on the data – the first thing I do is tidy up the column names:

Google Refine - rename columns

We can now work some magic – such as pulling in the Likes our friends have made. To do this, we need to create the URL for each friend’s Likes using their Facebook ID, and then pull the data down. We can use Google Refine to harvest this data for us by creating a new column containing the data pulled in from a URL built around the value of each cell in another column:

Google Refine - new column from URL

The Likes URL has the form https://graph.facebook.com/me/likes?access_token=A_LONG_JUMBLE_OF_LETTERS which we’ll tinker with as follows:

Google Refine - crafting URLs for new column creation

The throttle control tells Refine how often to make each call. I set this to 500ms (that is, half a second), so it takes a few minutes to pull in my couple of hundred or so friends (I don’t use Facebook a lot;-). I’m not sure what limit the Facebook API is happy with (if you hit it too fast (i.e. set the throttle time too low), you may find the Facebook API stops returning data to you for a cooling down period…)?
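For anyone who would rather do this step outside Refine, here is a minimal Python sketch of the same pattern: build one Likes URL per friend by swapping their ID in for `me`, and pause between calls. The friend IDs are invented, the token is the placeholder used above, and no request is actually made; the URL form is simply the one quoted in this post.

```python
import time

GRAPH_BASE = "https://graph.facebook.com"


def likes_url(facebook_id, access_token):
    """Build the Likes URL for one friend by using their ID in place of 'me'."""
    return "{}/{}/likes?access_token={}".format(GRAPH_BASE, facebook_id, access_token)


def throttled(items, delay_seconds=0.5):
    """Yield items one at a time, pausing between them, in the spirit of
    Refine's throttle setting."""
    for index, item in enumerate(items):
        if index:
            time.sleep(delay_seconds)
        yield item


# Hypothetical friend IDs and a placeholder token.
friend_ids = ["1001", "1002", "1003"]
token = "A_LONG_JUMBLE_OF_LETTERS"

# A short delay keeps the demo quick; 0.5 would match the 500ms used above.
urls = [likes_url(fid, token) for fid in throttled(friend_ids, delay_seconds=0.01)]
for url in urls:
    print(url)
```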

Having imported the data, you should find a new column:

Google Refine - new data imported

At this point, it is possible to generate a new column from each of the records/Likes in the imported data… in theory (or maybe not..). I found this caused Refine to hang though, so instead I exported the data using the default Templating… export format, which produces some sort of JSON output…

I then used this Python script to generate a two column data file where each row contained a (new) unique identifier for each friend and the name of one of their likes:

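import csv
import json

# The file produced by Refine's Templating... export
fn = 'my-fb-friends-likes.txt'

with open(fn, 'r') as f:
    data = json.load(f)

with open('fbliketest.csv', 'w', newline='') as out:
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    # Number the friends 1, 2, 3... to give each a new identifier
    for friend_id, d in enumerate(data['rows'], start=1):
        # 'interests' is the column name containing the Likes data
        interests = json.loads(d['interests'])
        for i in interests['data']:
            print(friend_id, i['name'], i['category'])
            # Drop non-ASCII characters from the name, as the original script did
            name = i['name'].encode('ascii', 'ignore').decode('ascii')
            writer.writerow([str(friend_id), name])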

[I think this R script, in answer to a related @mhawksey Stack Overflow question, also does the trick: R: Building a list from matching values in a data.frame]

I could then import this data into Gephi and use it to generate a network diagram of what they commonly liked:

Sketching common likes amongst my facebook friends

Rather than returning Likes, I could equally have pulled back lists of the movies, music or books they like, their own friends lists (permissions settings allowing), etc etc, and then generated friends’ interest maps on that basis.
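The two column file is in effect an edge list for a friend-to-Like network, so a quick sanity check on what Gephi should show is to count how many friends share each Like. A minimal sketch, with invented rows standing in for the CSV contents:

```python
from collections import defaultdict

# Rows in the same shape as the two column CSV: (friend identifier, Like name).
# The values are invented for illustration.
rows = [
    ("1", "Gephi"),
    ("1", "BBC Radio 4"),
    ("2", "Gephi"),
    ("3", "Gephi"),
    ("3", "BBC Radio 4"),
    ("3", "The Open University"),
]


def friends_by_like(rows):
    """Map each Like to the set of friends who have made it."""
    liked_by = defaultdict(set)
    for friend_id, like in rows:
        liked_by[like].add(friend_id)
    return liked_by


def common_likes(rows, min_friends=2):
    """Likes shared by at least `min_friends` friends, most popular first."""
    counts = [
        (like, len(friends))
        for like, friends in friends_by_like(rows).items()
        if len(friends) >= min_friends
    ]
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


print(common_likes(rows))
```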

[See also: Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part I and how to visualise Google+ networks]

PS dropping out of Google Refine and into a Python script is a bit clunky, I have to admit. What would be nice would be to be able to do something like a “create new rows with new column from column” pattern that would let you set up an iterator through the contents of each of the cells in the column you want to generate the new column from, and for each pass of the iterator: 1) duplicate the original data row to create a new row; 2) add a new column; 3) populate the cell with the contents of the current iteration state. Or something like that…
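Outside Refine, that pattern might be sketched in Python along the following lines. The function and column names are hypothetical, and the example rows are invented; it follows the three steps listed above.

```python
import json


def explode(rows, source_column, new_column, extract):
    """For each row, emit one copy per value that `extract` pulls out of
    the source column, with that value stored in a new column."""
    for row in rows:
        for value in extract(row[source_column]):
            new_row = dict(row)          # 1) duplicate the original data row
            new_row[new_column] = value  # 2) and 3) add and populate the new column
            yield new_row


def like_names(cell):
    """Pull the Like names out of a JSON cell shaped like the Likes data."""
    return [like["name"] for like in json.loads(cell)["data"]]


# Invented rows, with the Likes JSON held as a string in one column.
rows = [
    {"id": "1", "interests": json.dumps({"data": [{"name": "Gephi"}, {"name": "R"}]})},
    {"id": "2", "interests": json.dumps({"data": [{"name": "Gephi"}]})},
]

exploded = list(explode(rows, "interests", "like", like_names))
for row in exploded:
    print(row["id"], row["like"])
```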

PPS Related to the PS request, there is a sort of related feature in the 2.5 release of Google Refine that lets you merge data from across rows with a common key into a newly shaped data set: Key/value Columnize. Seeing this, it got me wondering what a fusion of Google Refine and RStudio might be like (or even just R support within Google Refine?)

PPPS this could be interesting – looks like you can test to see if a friendship exists given two Facebook user IDs.

PPPPS This paper in PNAS – Private traits and attributes are predictable from digital records of human behavior – by Kosinski et al. suggests it’s possible to profile people based on their Likes. It would be interesting to compare how robust that profiling is, compared to profiles based on the common Likes of a person’s followers, or the common likes of folk in the same Facebook groups as an individual?

Is Facebook Stifling the Free Flow of Information?

Struggling to get to sleep last night, I caught this whilst listening to episode 124 of This Week in Google from a few weeks ago (45 mins or so in to the original; I’ve excerpted the relevant bit below):

The first thing that grabbed my attention was that Importing a blog or RSS feed to your personal Facebook account is no longer available. Facebook’s recommendation is to “Use Facebook Notes to customize your blog posts in a rich format that’s compatible for readers on Facebook, [or] [l]ink directly to your blog posts from your status”.

Pretty much the only interaction I have with Facebook is (or rather, was) to automatically syndicate my OUseful.info blog posts via an RSS feed through my Facebook Notes application. This didn’t generate many views, clickthrus or trackbacks, but it did generate some, and now, it seems, I’m no longer posting blog post links to my Facebook friends. So much for frictionless sharing, huh? I’ve been frictionless sharing content *I* wanted to share through Facebook in a frictionless way for years, and now it seems I don’t. And more than that, I can’t, easily (at least, not in the same way).

Long time readers will know I’ve been a fan of RSS for years (hands up who remembers the We Ignore RSS at OUr Peril rant?!;-) for a few very simple reasons: firstly, it generally works; secondly, it’s widely adopted; thirdly, it’s a type of wiring that no-one really controls, except through various standardisation processes. So it’s pernicious moves like this one from Facebook that make me think that Facebook may have made a strategic error here, because it represents a separating of the ways from those of us who were happy to use Facebook as a terminal in our personal publishing networks via things like RSS but aren’t willing to spend time “doing Facebook”.

Although I’m a fan of RSS/Atom feeds, I fully appreciate that the orange radar signal icon is meaningless to most people, and that most people don’t know what to do with it. But I also know that folk are happily subscribing to all sorts of feed based streams in a painless way via services like Facebook and Twitter. Indeed, the TWIG piece above raised the issue of dropped support for RSS imports in the context of a new Facebook button for websites that allows folk visiting the site to one-click subscribe to that site’s Facebook page from the website (err, I think?!).

So what I’m pondering is this: why doesn’t Facebook set itself up as an RSS reader, offering a Feedburner like service to feed publishers and making it one click easy for folk to subscribe to those feed proxies in the Facebook context? Which is to say: I’d be reluctant to post a “Subscribe to my Facebook page” button on the blog (mainly because I don’t post any content to Facebook), but I might be willing to put a ‘subscribe to this site in Facebook’ button on the site? (So how might that work? First, I guess I’d have to set up a page for this site in Facebook; then I’d feed it from this site’s feed; then I’d put the ‘subscribe to this site on Facebook’ link on this site. At which point, of course, I’d have lost control of the terminal subscription point for the feed to Facebook, at least for those subscribers. This differs slightly from my current setup where the WordPress feed goes through Feedburner, then gets published via a URL I control. So the subscription point is under my control and I can control the wiring upstream of that.) Of course, Facebook may offer this route already, and I’m just not aware of it (not least because I don’t tend to keep up with Facebook’s machinations much at all…)

For a related take on other freedom eroding steps currently being taken by consumer tech companies towards their users, see Dave Winer’s The Un-Internet.

Facebook App Permissions Request – What Does This Mean?

I rarely link social apps to other social apps, but sometimes I click through the first few stages of the linking process to see what happens. Here’s an example I just tried using Klout, which wants me to link in to my account on Facebook. The screenshot is taken from Facebook… but what does it mean?

So what does this mean...?

Does that horizontal arrow aligned with the first element mean permission is only being requested for my personal information? Or is that thin vertical line an “AND” that says permission is being requested to access my personal information AND post to my wall AND etc etc…

I have no idea….?

Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part V

A comment from one of the Gephi developers to Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part IV, in which I described how to use the Modularity statistic to partition a network in terms of several different similar subnetwork groupings, suggested that a far better way of visualising the groups was to use the Partition parameter… and how right they were…

Running the Modularity statistic over my Facebook network, as captured using Netvizz, and then refreshing the view in the Partition panel allows us to colour the network using different partitions – such as the Modularity classes that the Modularity statistic generates and assigns nodes to:

Partition functions in gephi

Here’s what happens when we apply the colouring:

Partition colouring by modularity class

Selecting the Group view collects all the nodes in a partition together as a group:

Partition groups in gephi

These grouped nodes can be individually ungrouped by right-clicking on a group node and ungrouping it, or they can be expanded which maintains the group identity whilst still letting us look at the local structure:

Group node management in gephi

Here’s what the expanded view of one of the classes looks like, with text labels turned on:

Expanded group node in gephi

We see that the members of the group are visible, allowing us to explore the make-up of the subnetwork. As you might expect, we can then colour or resize nodes within the expanded group in the normal way:

Node resizing within an expanded group in gephi

To create a workspace containing just the members of a particular partition, ungroup all the nodes via the Partition module and filter on the required partition using a Modularity Class filter:

Create a workspace with just members of a given partition in gephi

The Partition module is incredibly powerful, as you can hopefully see; but it isn’t limited to dealing with just partitions created using Gephi statistics – it can also deal with partitions defined over the graph as loaded into Gephi (see the GUESS format for more details on how to structure the input file).

So for example, the most recent version of Netvizz will return additional data alongside just the identities of your friends, such as their gender (if revealed to you by their profile privacy settings) and the number of their wall posts. Loading this richer network specification into Gephi, and refreshing the Partition module settings reveals the following:

Gephi partition over a preloaded partition

Which in turn means we can colour the graph as follows:

Gephi - partition colouring based on pre-specified partitions

The wall count parameter is made available through the Ranking panel:

User specified Ranking parameters in Gephi

So as we can see, if you have partition data available for network members, Gephi can provide a great way of visualising it :-)

Getting Started With Gephi Network Visualisation App – My Facebook Network, Part III: Ego Filters and Simple Network Stats

In a couple of previous posts on exploring my Facebook network with Gephi, I’ve shown how to visualise the network, and how to start constructing various filtered views over it (Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part I and Getting Started With Gephi Network Visualisation App – My Facebook Network, Part II: Basic Filters). In this post, I’ll explore a new feature, ego filters, as well as looking at some simple social network analysis tools that can help us better understand the structure of a social network.

To start with, I’m going to load my Facebook network data (grabbed via the Netvizz app, as before) into Gephi as an undirected graph. As mentioned above, the ego network filter is a new addition to Gephi, which will show that part of a graph that is connected to a particular person. So for example, I can apply the ego filter (from the Topology folder in the list of filters) to “George Siemens” to see which of my Facebook friends George knows.

Gephi - ego filter - my Facebook friends who are friends with George Siemens

If I save this as a workspace, I can then tunnel into it a little more, for example by applying a new ego filter to the subgraph of my friends who George Siemens knows. In this case, let’s add Grainne to the mix – and see which of my friends know both George Siemens and Grainne:

Ego filter applied within an ego filtered workspace

Note that I could have achieved a similar effect with the full graph by using the intersection filter (as introduced in the previous post in this series):

Seeing my Facebook connections that two of my Facebook friends know
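As a rough sketch of why the two routes give a similar result, here is a toy version in plain Python. The names and friendships are made up for illustration, and the helper functions are my own stand-ins rather than anything Gephi exposes; the sketch also assumes the second person appears in the first person's ego network.

```python
# Hypothetical friendship data: each person maps to the set of people they know.
graph = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "erin"},
    "dave": {"alice"},
    "erin": {"carol"},
}

def ego_nodes(graph, ego):
    """The ego plus everyone directly connected to them (a depth 1 ego network)."""
    return {ego} | graph[ego]

def subgraph(graph, keep):
    """Restrict the graph to the nodes in `keep`, dropping edges that leave it."""
    return {node: graph[node] & keep for node in keep}

# Route 1: ego filter on alice, then a second ego filter on carol inside that result.
nested = ego_nodes(subgraph(graph, ego_nodes(graph, "alice")), "carol")

# Route 2: intersect the two ego networks taken over the full graph.
intersection = ego_nodes(graph, "alice") & ego_nodes(graph, "carol")

# People other than the two egos who know both of them.
mutual = intersection - {"alice", "carol"}
```

In this toy network both routes pick out the same set of people, and the only friend the two egos share is bob.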

The depth of the ego filter also allows you to see which of my friends the named individual knows either directly, or through one of my other friends. Using an ego filtered network to depth two (friend of a friend) around George Siemens, I can run some network statistics over just that group of people. So for example, if I run the Degree statistics over the network, and then set the node size according to node degree within that network, this is what I get:

Running stats on the network

(I also turned node labels on and set their size proportional to node size.)
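For anyone who wants to see what the depth setting and the Degree statistic amount to, here is a minimal sketch in plain Python. It assumes the ego filter can be modelled as a breadth-first search that stops after a given number of hops, and that degree is counted only over edges inside the filtered network; the graph and function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected network, stored as adjacency sets.
graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}

def ego_nodes(graph, ego, depth=1):
    """Nodes within `depth` hops of `ego`, found by breadth-first search."""
    hops = {ego: 0}
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # do not expand beyond the requested depth
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return set(hops)

def degrees_within(graph, keep):
    """Degree of each kept node, counting only edges to other kept nodes."""
    return {node: len(graph[node] & keep) for node in keep}

depth_one = ego_nodes(graph, "a", depth=1)   # direct friends only
depth_two = ego_nodes(graph, "a", depth=2)   # friends, and friends of friends
degree = degrees_within(graph, depth_two)
```

Note that node "d" has two connections in the full graph but a degree of one inside the depth two ego network, because its other neighbour falls outside the filter.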

Running Network Diameter stats generates the following sorts of report:

Gephi Network diameter stats

That is:

– betweenness centrality;
– closeness centrality;
– eccentricity.

These all sound pretty technical, so what do they refer to?

Betweenness centrality is a measure based on the number of shortest paths between any two nodes that pass through a particular node. Nodes around the edge of the network would typically have a low betweenness centrality. A high betweenness centrality might suggest that the individual is connecting various different parts of the network together.
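To make that a bit more concrete, here is a small sketch of one common way of computing betweenness on an undirected, unweighted graph (Brandes' algorithm). I am not claiming this is exactly what Gephi does internally, and the scores here are unnormalised; the example graph is made up, with two tight triangles joined through nodes "c" and "d".

```python
from collections import deque

def betweenness(graph):
    """Unnormalised betweenness centrality for an undirected, unweighted graph."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from the source, counting shortest paths.
        dist = {source: 0}
        paths = {node: 0 for node in graph}   # number of shortest paths from source
        paths[source] = 1
        preds = {node: [] for node in graph}  # predecessors on shortest paths
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Work back from the furthest nodes, sharing out credit along the paths.
        credit = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                credit[v] += paths[v] / paths[w] * (1 + credit[w])
            if w != source:
                score[w] += credit[w]
    # Each pair of nodes was counted once in each direction, so halve the totals.
    return {node: total / 2 for node, total in score.items()}

# Hypothetical network: triangle a-b-c and triangle d-e-f, joined by the edge c-d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

scores = betweenness(graph)
```

The two bridging nodes, "c" and "d", pick up all of the betweenness, because every shortest path from one triangle to the other has to pass through them; the nodes around the edge score zero.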

Closeness centrality is a measure that indicates how close a node is to all the other nodes in a network, whether or not the node lies on a shortest path between other nodes. As reported here, it is the average distance from a node to the other nodes, so a high closeness centrality means that there is a large average distance to other nodes in the network, and a small closeness centrality means there is a short average distance to all other nodes in the network. Geddit? (I think sometimes the reciprocal of this measure is given as closeness centrality :-)

The eccentricity measure captures the distance between a node and the node that is furthest from it; so a high eccentricity means that the furthest away node in the network is a long way away, and a low eccentricity means that the furthest away node is actually quite close.
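Here is a minimal sketch of those last two measures in plain Python, taking closeness centrality in the average distance sense used above (not its reciprocal) and assuming a connected, unweighted network. The graph is invented for illustration: a hub "h" with three leaves, plus a short chain hanging off it.

```python
from collections import deque

def distances_from(graph, source):
    """Shortest path length (in hops) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def average_distance(graph, node):
    """Closeness in the 'average distance to everyone else' sense."""
    dist = distances_from(graph, node)
    others = [d for other, d in dist.items() if other != node]
    return sum(others) / len(others)

def eccentricity(graph, node):
    """Distance from `node` to the node furthest away from it."""
    return max(distances_from(graph, node).values())

# Hypothetical network: hub "h" with leaves p, q, r, and a chain h - x - y.
graph = {
    "h": {"p", "q", "r", "x"},
    "p": {"h"},
    "q": {"h"},
    "r": {"h"},
    "x": {"h", "y"},
    "y": {"x"},
}

ecc_p, ecc_y = eccentricity(graph, "p"), eccentricity(graph, "y")
avg_p, avg_y = average_distance(graph, "p"), average_distance(graph, "y")
```

In this toy network "p" and "y" are both three hops from their furthest node, so they have the same eccentricity, but "p" sits right next to the hub and so has a smaller average distance to everyone else than "y" does.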

So let’s have a look at the structure of my Facebook network, as filtered according to George’s ego filter, depth 2:

Plotting size proportional to betweenness centrality, we see Martin Weller, Grainne and Stephen Downes are influential in keeping different parts of my network connected:

Betweenness centrality

As far as outliers go, we can look at the closeness centrality and eccentricity (to protect the innocent, I shall suppress the names!)

Eccentricity (size) and closeness centrality (colour) in gephi

Here, the colour field defines the closeness centrality and the size of the node the eccentricity. It’s quite easy to identify the people in this network who are not well connected and who are unlikely to be able to reach each other easily through those of my friends they know.

From nodes with similar sizes and different colours, we also see how it’s quite possible for two nodes to have a similar eccentricity (similar distances to the furthest away nodes) and very different closeness centrality (that is, the node may have a small or large average distance to every other node in the graph). For example, if a node is connected to a very well connected node, that connection will lower its closeness centrality.

So for example, if we take the above network and look at the ego network within it based around the very well connected Martin Weller, what do we see?

Further filter

The colder, blue shaded circles (high closeness centrality) have disappeared. Being a Martin Weller friend (in my Facebook network at least) has the effect of lowering your closeness centrality, i.e. bringing you closer to all the other people in the network.

Okay, that’s definitely more than enough for now. Why not have a play looking at your Facebook network, and seeing if you can identify who the best connected folk are?

PS when plotting charts, I think Gephi uses data from the last statistics run it did, even if that was in another workspace, so it’s always worth running the statistics over the current graph if you intend to chart something based on those stats…

Why I Joined the Facebook Privacy Changes Backlash…

Whenever Facebook rolls out a major change, there’s a backlash… Here’s why I posted recently about how to opt out of Facebook’s new services…

Firstly, I’m quite happy to admit that it might be that you will benefit from opting in to the Facebook personalisation and behavioural targeting services. If you take the line that better targeted ads are content, and behavioural advertising is one way to achieve that, all well and good. Just bear in mind that your purchasing decisions will be even more directly influenced ;-)

What does concern me is that part of the attraction of Facebook for many people is its privacy controls. But when they’re too confusing to understand, and potentially misleading, it’s a Bad Thing… (I suppose you could argue that Facebook is innovating in terms of privacy, openness, and data sharing on behalf of its users, but is that a Good Thing?)

If folk think they have set their privacy settings one way, but they operate in another through the myriad interactions of the different settings, users may find that the images and updates they think they are posting into a closed garden are in fact being made public in other ways, whether by the actions of their friends, applications they have installed, pages they have connected to, or websites they visit.

The Facebook privacy settings also seem to me to suggest various asymmetries. For example, if I think I am only sharing videos with friends, but those friends can also share that content on because of the way I have set (or not changed the default on) another setting, I may be publishing content in a way that was not intended. It seems to me that Facebook is set up to devolve trust to the edge of my network – I publish to the edge of my network, for example, but the people or pages on the edge of my network can then push the content out further.

So for example, in the case of connecting to pages, Facebook says: “Keep in mind that Facebook Pages you connect to are public. You can control which friends are able to see connections listed on your profile, but you may still show up on Pages you’re connected to. If you don’t want to show up on those Pages, simply disconnect from them by clicking the “Unlike” link in the bottom left column of the Page.”

The privacy settings around how friends can share on content I have shared with them are also confusing – do their privacy settings override mine on content I have published to them?

I’m starting to think (and maybe I’m wrong on this) that the best way of thinking about Facebook is to assume that everything you publish to your Facebook network can be republished by the members of your network under the terms of their privacy conditions. So if I publish a photo that you can see, then I have to assume that you can also publish it under your privacy settings. And so on…

This contrasts with a view of each object having a privacy setting, and that by publishing an object, the publisher controls that setting. So for example, I could publish an object and say it could only be seen by friends of mine, and that setting would stick with the object. If you tried to republish it, it could only be republished to your friends who are also my friends. My privacy settings would set the scope, or maximum reach, of your republication of it.
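The difference between the two ways of thinking can be sketched with a couple of sets. This is purely a toy model of the two views described above, with made-up names and function names of my own; it is not how Facebook's settings are actually implemented.

```python
# Hypothetical friend lists for the original publisher ("me") and a republisher ("you").
my_friends = {"ann", "bob", "cat", "you"}
your_friends = {"bob", "cat", "dan", "eve", "me"}

def reach_under_republisher_settings(republisher_audience):
    """View 1: a republished item goes to whoever the republisher's settings allow."""
    return set(republisher_audience)

def reach_under_object_scope(owner_scope, republisher_audience):
    """View 2: the owner's setting sticks to the object and caps any republication."""
    return owner_scope & republisher_audience

# I publish a photo to my friends only; you then republish it to your friends.
view_one = reach_under_republisher_settings(your_friends)
view_two = reach_under_object_scope(my_friends, your_friends)

# People who see the photo under view 1 but were never in the scope I chose.
leaked = view_one - my_friends - {"me"}
```

Under the first view the photo reaches dan and eve, who I never chose to share with; under the second it only reaches bob and cat, the friends we have in common.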

Regular readers will know I’ve started looking at ways of visualising Facebook networks using Gephi. What I’m starting to think is that Facebook should offer a visualisation of the furthest reach of a person’s data, videos, images, updates, etc, given their current privacy settings (or preview changes to that reach if they want to test out new privacy settings).

PS re the visualisation thing – something like this, generated from your current settings, would do the job nicely:

Facebook privacy defaults

More at The Evolution of Privacy on Facebook, including a view of just how open things are now…