Getting Started With Gephi Network Visualisation App – My Facebook Network, Part III: Ego Filters and Simple Network Stats
In a couple of previous posts on exploring my Facebook network with Gephi, I’ve shown how to visualise the network, and how to start constructing various filtered views over it (Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part I and Getting Started With Gephi Network Visualisation App – My Facebook Network, Part II: Basic Filters). In this post, I’ll explore a new feature, ego filters, as well as looking at some simple social network analysis tools that can help us better understand the structure of a social network.
To start with, I’m going to load my Facebook network data (grabbed via the Netvizz app, as before) into Gephi as an undirected graph. As mentioned above, the ego network filter is a new addition to Gephi, which will show that part of a graph that is connected to a particular person. So for example, I can apply the ego filter (from the Topology folder in the list of filters) to “George Siemens” to see which of my Facebook friends George knows.
If I save this as a workspace, I can then tunnel into it a little more, for example by applying a new ego filter to the subgraph of my friends who George Siemens knows. In this case, let’s add Grainne to the mix and see which of my friends know both George Siemens and Grainne:
Note that I could have achieved a similar effect with the full graph by using the intersection filter (as introduced in the previous post in this series):
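If it helps to see the idea outside of Gephi, here is a minimal Python sketch of a depth-limited ego filter and of intersecting two ego networks. It is not Gephi's implementation: the `ego_network` function, the choice to keep the ego node itself in the result, and the toy friendship graph are all assumptions made up for illustration.

```python
from collections import deque

def ego_network(graph, ego, depth=1):
    """Return the set of nodes within `depth` hops of `ego` (ego included)."""
    hops = {ego: 0}
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # do not expand beyond the requested depth
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return set(hops)

# Hypothetical undirected friendship graph, stored as adjacency sets.
graph = {
    "George": {"Ann", "Bob", "Cat"},
    "Grainne": {"Bob", "Cat", "Dan"},
    "Ann": {"George"},
    "Bob": {"George", "Grainne"},
    "Cat": {"George", "Grainne", "Dan"},
    "Dan": {"Grainne", "Cat"},
}

george = ego_network(graph, "George")
grainne = ego_network(graph, "Grainne")
both = george & grainne  # people who appear in both ego networks
print(sorted(both))  # ['Bob', 'Cat']
```

Raising `depth` to 2 pulls in friends of friends as well, which in this toy graph is everybody.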
The depth of the ego filter also allows you to see which of my friends the named individual knows either directly, or through one of my other friends. Using an ego filtered network to depth two (friend of a friend) around George Siemens, I can run some network statistics over just that group of people. So for example, if I run the Degree statistics over the network, and then set the node size according to node degree within that network, this is what I get:
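The point about degree being measured within the filtered network can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names `ego_nodes` and `degrees_within` and the toy graph are hypothetical, and Gephi's own Degree statistic is not being reproduced here.

```python
from collections import deque

def ego_nodes(graph, ego, depth):
    """Nodes within `depth` hops of `ego`, found by breadth-first search."""
    hops = {ego: 0}
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return set(hops)

def degrees_within(graph, nodes):
    """Degree of each node, counting only edges with both ends in `nodes`."""
    return {n: len(graph[n] & nodes) for n in nodes}

# Hypothetical undirected graph as adjacency sets.
graph = {
    "George": {"Ann", "Bob"},
    "Ann": {"George", "Cat"},
    "Bob": {"George", "Cat"},
    "Cat": {"Ann", "Bob", "Eve"},
    "Eve": {"Cat", "Fay"},
    "Fay": {"Eve"},
}

nodes = ego_nodes(graph, "George", depth=2)
filtered_degree = degrees_within(graph, nodes)
# Cat has three friends overall, but only two inside the depth 2 filter.
print(filtered_degree["Cat"], len(graph["Cat"]))  # 2 3
```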
(I also turned node labels on and set their size proportional to node size.)
Running Network Diameter stats generates the following sorts of report:
– betweenness centrality;
– closeness centrality;
– eccentricity.
These all sound pretty technical, so what do they refer to?
Betweenness centrality is a measure based on the number of shortest paths between any two nodes that pass through a particular node. Nodes around the edge of the network would typically have a low betweenness centrality. A high betweenness centrality might suggest that the individual is connecting various different parts of the network together.
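To make the shortest-path counting concrete, here is a small Python sketch that computes unnormalised betweenness for an undirected, unweighted graph using Brandes' algorithm. I am not claiming this is what Gephi runs internally; the function name, the toy graph and the decision not to normalise are my own choices for the example.

```python
from collections import deque

def betweenness(graph):
    """Unnormalised betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from source: count shortest paths (sigma)
        # and remember each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing out path credit.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every pair was counted once from each end, so halve the totals.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical graph: two triangles that only meet at "Martin".
graph = {
    "A": {"B", "Martin"},
    "B": {"A", "Martin"},
    "Martin": {"A", "B", "C", "D"},
    "C": {"Martin", "D"},
    "D": {"Martin", "C"},
}

result = betweenness(graph)
print(result["Martin"], result["A"])  # 4.0 0.0
```

In this toy graph every shortest path between the two triangles runs through Martin (four pairs in all), while the nodes on the edge score zero.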
Closeness centrality is a measure that indicates how close a node is to all the other nodes in a network, whether or not the node lies on a shortest path between other nodes. As reported here, the value is based on the average distance from the node to every other node, so a high closeness centrality means that there is a large average distance to other nodes in the network, and a small closeness centrality means there is a short average distance to all other nodes in the network. Geddit? (I think sometimes the reciprocal of this measure is given as closeness centrality, in which case a higher value means a closer node:-)
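Here is a short Python sketch of both conventions, the average distance form described above and its reciprocal. The function names and the five-node path graph are hypothetical, and the sketch assumes a connected, unweighted graph.

```python
from collections import deque

def distances_from(graph, source):
    """Shortest-path hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def average_distance(graph, node):
    """Mean distance from `node` to all other nodes (smaller means closer)."""
    dist = distances_from(graph, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)

def reciprocal_closeness(graph, node):
    """The alternative convention: larger means closer."""
    return 1 / average_distance(graph, node)

# Hypothetical path graph: A - B - C - D - E.
graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

print(average_distance(graph, "C"))  # 1.5
print(average_distance(graph, "A"))  # 2.5
```

The middle node C gets the smaller average distance, so under the reciprocal convention it would get the larger score.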
The eccentricity measure captures the distance between a node and the node that is furthest from it; so a high eccentricity means that the furthest away node in the network is a long way away, and a low eccentricity means that the furthest away node is actually quite close.
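As a quick illustration, eccentricity can be sketched in Python as the largest hop count found by a breadth-first search from the node. The `eccentricity` function and the path graph are made up for the example, and the sketch assumes a connected, unweighted graph.

```python
from collections import deque

def eccentricity(graph, source):
    """Distance from `source` to the node that is furthest from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return max(dist.values())

# Hypothetical path graph: A - B - C - D - E.
graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

print(eccentricity(graph, "C"))  # 2
print(eccentricity(graph, "A"))  # 4
```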
So let’s have a look at the structure of my Facebook network, as filtered according to George’s ego filter, depth 2:
Plotting size proportional to betweenness centrality, we see Martin Weller, Grainne and Stephen Downes are influential in keeping different parts of my network connected:
As far as outliers go, we can look at the closeness centrality and eccentricity (to protect the innocent, I shall suppress the names!):
Here, the colour field defines the closeness centrality and the size of the node the eccentricity. It’s quite easy to identify the people in this network who are not well connected and who are unlikely to be able to reach each other easily through those of my friends they know.
From nodes with similar sizes and different colours, we also see how it’s quite possible for two nodes to have a similar eccentricity (similar distances to the furthest away nodes) and very different closeness centrality (that is, the node may have a small or large average distance to every other node in the graph). For example, if a node is connected to a very well connected node, that connection will lower its closeness centrality.
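A toy example makes the contrast easier to see. In the Python sketch below (the graph, node names and helper functions are all invented for illustration), two nodes have the same eccentricity, but the one attached directly to a hub has a noticeably smaller average distance to everyone else.

```python
from collections import deque

def distances_from(graph, source):
    """Shortest-path hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def eccentricity(graph, node):
    return max(distances_from(graph, node).values())

def average_distance(graph, node):
    dist = distances_from(graph, node)
    return sum(dist.values()) / (len(dist) - 1)  # the node itself adds 0

# Hypothetical graph: a hub with four leaves, one extra friend,
# and a two-step chain hanging off the hub.
edges = [
    ("hub", "leaf1"), ("hub", "leaf2"), ("hub", "leaf3"), ("hub", "leaf4"),
    ("hub", "friend_of_hub"), ("hub", "middle"), ("middle", "end_of_chain"),
]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

for name in ("friend_of_hub", "end_of_chain"):
    print(name, eccentricity(graph, name), round(average_distance(graph, name), 2))
```

Both nodes are three hops from their furthest node, but the hub's friend averages 2.0 hops to everyone else, against roughly 2.57 for the end of the chain.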
So for example, if we take the above network and look at the ego network within it based around the very well connected Martin Weller, what do we see?
The colder, blue shaded circles (high closeness centrality) have disappeared. Being a Martin Weller friend (in my Facebook network at least) has the effect of lowering your closeness centrality, i.e. bringing you closer to all the other people in the network.
Okay, that’s definitely more than enough for now. Why not have a play looking at your Facebook network, and seeing if you can identify who the best connected folk are?
PS when plotting charts, I think Gephi uses data from the last statistics run it did, even if that was in another workspace, so it’s always worth running the statistics over the current graph if you intend to chart something based on those stats…