A couple of days ago, I grabbed the Twitter friends lists of all my Twitter friends (that is, lists of all the people that the people I follow on Twitter follow…) and plotted the connections between them filtered through the people I follow (Small World? A Snapshot of How My Twitter “Friends” Follow Each Other…). That is, for all of the people I follow on Twitter, I plotted the extent to which they follow each other… got that?
Running the resulting network through Gephi’s modularity statistic (some sort of clustering algorithm; I really need to find out which), several distinct clusters of people turned up: OU folk, data journalism folk, ed techies, JISC/Museums/library folk, and open gov data folk.
(Gephi allows you to export the graph file for the current project, including any annotations (such as modularity class) added by running Gephi’s statistics. Extracting the list of nodes (i.e. Twitter users) and filtering them by modularity class means we can create separate lists of individuals based on the cluster they appear in, which in turn means we could generate a Twitter list from those individuals.)
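As a rough sketch of that extraction step, assuming the node table has been saved out as CSV: the snippet below groups user names by modularity class. The column names (`Id`, `Label`, `modularity_class`) and the sample rows are illustrative assumptions rather than anything taken from my actual export, so check the header row of your own file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for a node table exported as CSV.
# The column names below are assumptions; adjust them to match your export.
SAMPLE_NODES_CSV = """Id,Label,modularity_class
1,alice,0
2,bob,0
3,carol,1
4,dave,2
5,erin,1
"""


def users_by_cluster(csv_file, label_col="Label", class_col="modularity_class"):
    """Group node labels (Twitter screen names) by modularity class."""
    clusters = defaultdict(list)
    for row in csv.DictReader(csv_file):
        clusters[row[class_col]].append(row[label_col])
    return dict(clusters)


clusters = users_by_cluster(io.StringIO(SAMPLE_NODES_CSV))
for cluster_id, names in sorted(clusters.items()):
    # Each of these name lists could seed a separate Twitter list.
    print(cluster_id, names)
```

With a real export, you would pass an opened file handle instead of the `io.StringIO` wrapper.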
From my “curated” list of Twitter friends, we can identify a set of “OU twitterers” through a cluster analysis of the mass action of their own friending behaviour, and I could use this to automatically generate a Twitter list of (potential) OU Twitterers that other people can follow.
Here’s the total set of my friends, coloured by modularity class and sized by in-degree (that is, the number of my friends who follow that person).
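For anyone unsure what in-degree means here, a minimal sketch: given a list of (follower, followed) pairs restricted to the people I follow, the in-degree of a person is just the number of pairs in which they appear as the one being followed. The edge list below is made up for illustration.

```python
from collections import Counter

# Hypothetical edges as (follower, followed) pairs, restricted to people I follow.
edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
    ("dave", "bob"),
    ("dave", "alice"),
]


def in_degree(edges):
    """Count, for each user, how many users in the network follow them."""
    return Counter(followed for _follower, followed in edges)


degrees = in_degree(edges)
for user, count in degrees.most_common():
    print(user, count)
```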
If we filter on modularity class, we can just look at the folk in what I have labelled “OU Twitterers”. There are one or two folk in there who don’t quite fit this label (e.g. University of Leicester folk, and a handful of otherwise “disconnected” folk…), but it’s not bad.
Note that if I grab the complete friends and followers lists of these individuals, and look for users who are commonly followed, who also tend to follow back, and who don’t have huge numbers of followers (i.e. they aren’t celebrities who automatically follow back…), I may discover other OU Twitterers that I don’t follow…
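That discovery heuristic might be sketched along the following lines. This is only one possible reading of it: the function name, the thresholds (`min_followed_by`, `min_follow_back`, `max_followers`) and the sample data are all hypothetical, and the friends and followers sets are assumed to have been fetched already by some other means.

```python
from collections import defaultdict


def suggest_candidates(friends_of, followers_of, follower_counts, already_following,
                       min_followed_by=2, min_follow_back=0.5, max_followers=5000):
    """Suggest users outside the cluster who look like they might belong to it.

    friends_of[m]      -- set of users that cluster member m follows
    followers_of[m]    -- set of users who follow cluster member m
    follower_counts[u] -- total follower count for user u
    All thresholds are illustrative guesses, not tuned values.
    """
    excluded = set(friends_of) | set(already_following)

    # For each outsider, record which cluster members follow them.
    followed_by = defaultdict(list)
    for member, friends in friends_of.items():
        for friend in friends:
            if friend not in excluded:
                followed_by[friend].append(member)

    candidates = []
    for user, members in followed_by.items():
        if len(members) < min_followed_by:
            continue  # not commonly followed by the cluster
        if follower_counts.get(user, 0) > max_followers:
            continue  # probably a celebrity or a bulk follow-backer
        # Share of the members following this user whom the user follows back.
        back = sum(1 for m in members if user in followers_of.get(m, set()))
        if back / len(members) >= min_follow_back:
            candidates.append(user)
    return sorted(candidates)


# Hypothetical data for a three-person cluster.
friends_of = {
    "ann": {"bea", "zed", "celeb"},
    "bea": {"ann", "zed", "celeb"},
    "cal": {"zed", "lurker"},
}
followers_of = {
    "ann": {"bea", "zed"},
    "bea": {"ann", "zed", "celeb"},
    "cal": {"celeb"},
}
follower_counts = {"zed": 300, "celeb": 2000000, "lurker": 50}

suggestions = suggest_candidates(friends_of, followers_of, follower_counts,
                                 already_following={"ann", "bea", "cal"})
print(suggestions)
```

In this toy data, "zed" is followed by all three members and follows two of them back, "celeb" is ruled out by follower count, and "lurker" is followed by only one member.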
If we run the modularity stat over this group of people, the “OU Twitterers” (most easily done by generating a new workspace from the filtered group), we see three more partitions fall out. Broadly, this first one corresponds to OU Library folk (ish…):
Twitterers from my faculty (several of whom rarely, if ever, tweet):
And the rest (the vast majority, in fact):
(Note that a couple of folk are completely disconnected, and have nothing to do with the OU…)
Running the modularity statistic over this larger group turns up nothing of interest.
So… so what? So this. Firstly, I can mine the friends lists of the friends of arbitrary people on Twitter and pull out clusters that may tell me something about the interests of those people. (For example, we might grab their Twitter biography statements and run them through a word cloud as a first approximation; or grab their recent tweets and do some text mining on them to see if there is any common interest. Hashtag analysis might also be revealing…) Secondly, we could use the members of a cluster as a first approximation to a list of connected members of a community interested in a particular topic area; for these community members we could then pull down lists of all their friends and followers and see whether we can grow the list by adding other individuals that many of them are connected to.
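As a crude stand-in for the word cloud idea, here is a sketch that counts the most frequent terms across the biography statements of a cluster's members. The bios, the stopword list and the minimum word length are all invented for illustration; a real run would need the bios fetched from Twitter first and a proper stopword list.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; use a fuller one in practice).
STOPWORDS = {"a", "an", "and", "at", "the", "of", "in", "for", "to", "my", "i", "on", "with"}


def top_terms(bios, n=5):
    """Return the n most frequent terms across a list of bio strings."""
    counts = Counter()
    for bio in bios:
        # Keep a leading # or @ so hashtags and mentions survive as terms.
        words = re.findall(r"[#@]?\w+", bio.lower())
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(n)


# Hypothetical bios for the members of one cluster.
bios = [
    "Lecturer at the Open University, interested in open data",
    "Open University librarian. Data and coffee.",
    "Cyclist, #rstats fan, data geek",
]

for term, count in top_terms(bios, n=3):
    print(term, count)
```

The same function could be pointed at recent tweets rather than bios, or restricted to terms starting with `#` for a quick hashtag count.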
PS after tweeting the original post, a couple of people asked if I could grab the data from their friends lists. For example, @neilkod’s turned up clusters relating to “Utah tweeps, my cycling ones, and of course data/#rstats.” So the approach appears to work in general…:-)