Another conference, another hashtag, another opportunity to explore a short-lived emergent network… My default app for this is the (stalled, under construction) OUseful hashtag community viewer, which, as well as listing my friends and followers who are using the hashtag, also shows the folk I don’t necessarily share friend/follower relations with, along with a “report” about my reach into the community (and its reach into my Twitter traffic streams):
(Click through and hack the URL to see a report personalised around your account…)
As Martin W expressed an interest in doing some sort of “research” around the currently running OU (Online) Conference, I thought I’d have a little explore yesterday to see whether I could pull a script or two together to look at the network connections across the OUConf hashtag community.
Using the Tweepy Python wrapper for the Twitter API, authenticated to a Twitter account that I think I got whitelisted a year or so ago (so effectively no limit on the number of API calls per hour), I wrote a script that runs over each of the users identified in a hashtaggers list. The list itself was gleaned via an old hashtaggers Yahoo pipe (?!;-) I had lying around somewhere… As soon as I get started with a Twapperkeeper API key, I’ll use that as my source, although I guess I could use other Martin’s nifty Google spreadsheet twitter archiver in the meantime ;-) And what did that script do? For each hashtagger, it:
– pulled the list of the hashtaggers’ followers (as Twitter numeric IDs)
– pulled the list of the hashtaggers’ friends (as Twitter numeric IDs)
I then created three output file variants for each of the following sorts of network connection between individuals:
– hashtagger -> friend, so we can look at people friended by the hashtaggers;
– follower -> hashtagger, so we can look at people following the hashtaggers;
The three variants were:
– all connections;
– inner connections (that is, only show the connection if the friend/follower is also a hashtagger); [so, err, maybe the inner friends and inner followers are the same…? I can’t think straight!]
– outer connections (that is, only show the connection if the friend/follower is not a hashtagger).
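As a rough sketch of how the three variants might be generated (this is not the original script; the `split_variants` helper and the numeric IDs are made up for illustration), each edge list can be split on whether the friend/follower end of the edge is itself a hashtagger:

```python
def split_variants(edges, hashtaggers, other_end):
    """Split directed edges into the all / inner / outer variants.

    other_end is the tuple position (0 or 1) of the friend or follower,
    i.e. the end of the edge that may or may not be a hashtagger.
    """
    edges = list(edges)
    inner = [e for e in edges if e[other_end] in hashtaggers]
    outer = [e for e in edges if e[other_end] not in hashtaggers]
    return {"all": edges, "inner": inner, "outer": outer}


# Made-up numeric IDs: 1 and 2 are hashtaggers, 50 and 99 are not.
friends = {1: [2, 99], 2: [1]}      # hashtagger -> IDs they follow
followers = {1: [2, 50], 2: [1]}    # hashtagger -> IDs following them
hashtaggers = set(friends) | set(followers)

# hashtagger -> friend, and follower -> hashtagger
friend_edges = [(h, f) for h, ids in friends.items() for f in ids]
follower_edges = [(f, h) for h, ids in followers.items() for f in ids]

friend_variants = split_variants(friend_edges, hashtaggers, other_end=1)
follower_variants = split_variants(follower_edges, hashtaggers, other_end=0)
```

In this toy data the inner friends edges and the inner followers edges are the same set, which is what you would expect whenever the friends and followers lists are complete for every hashtagger: an edge between two hashtaggers shows up once in one person's friends list and once in the other's followers list.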
I also created a file containing hashtaggers’ IDs and screennames by making API requests for the user details of each of the hashtaggers.
The snapshot I got of the hashtag community was grabbed yesterday afternoon and is only a partial one (i.e. it only contains about 50-60 of the hashtaggers, compared to the 100 or so we’re currently at…)
So here are a couple of views, using Gephi. Firstly, the internal friends connections:
The size of the nodes relates to in-degree, which means that a large node corresponds to an individual who has been friended by lots of the other hashtaggers.
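Gephi works this out for you, but in-degree is easy enough to compute by hand from an edge list. A minimal sketch, assuming edges are (source, target) pairs; the function names and the example IDs are mine, purely for illustration:

```python
from collections import Counter


def in_degree(edges):
    """Count how many edges point at each node."""
    return Counter(target for _source, target in edges)


def nodes_above(edges, threshold):
    """Nodes whose in-degree is strictly greater than threshold."""
    return {node for node, d in in_degree(edges).items() if d > threshold}


# Made-up edges: node 3 is friended by 1, 2 and 4.
edges = [(1, 3), (2, 3), (4, 3), (1, 2), (3, 2)]
degrees = in_degree(edges)
popular = nodes_above(edges, 2)
```

The same threshold idea is what a degree-range filter does: keep only the nodes that enough other people point at.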
Looking at the betweenness centrality metric (a measure of the extent to which an individual is on the shortest path between two other individuals in the network), we can start to explore the structure of the community a little more.
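For the curious, here is a small sketch of betweenness centrality on a directed, unweighted edge list, using Brandes' algorithm. It assumes no duplicate edges and returns raw, unnormalised scores, so the numbers may be scaled differently from whatever your graph tool reports; the function name and the example edges are illustrative only:

```python
from collections import deque


def betweenness(edges):
    """Unnormalised betweenness for a directed, unweighted graph."""
    adj = {}
    for s, t in edges:
        adj.setdefault(s, []).append(t)
        adj.setdefault(t, [])
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search, counting shortest paths from source.
        dist = {source: 0}
        paths = dict.fromkeys(adj, 0)
        paths[source] = 1
        preds = {n: [] for n in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating each node's
        # share of the shortest paths that pass through it.
        dependency = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return score


# Two equally short routes from 1 to 3: via 2 and via 4.
scores = betweenness([(1, 2), (2, 3), (1, 4), (4, 3)])
```

Here nodes 2 and 4 each sit on one of the two shortest paths from 1 to 3, so they split the credit between them, while the endpoints score nothing.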
Here’s a look over the whole set of friends of people in the hashtag network (that is, the people who the hashtaggers may be influenced by):
By plotting in-degree as the node size, we can see who is most friended by members of the hashtag network, including other members of the hashtag community. If we filter the view to show only nodes with an in-degree greater than 10, we see who is respected by members of the hashtag community (by virtue of being friended by them):
If we look at the “friends outer” network, we get a view over just the people outside the hashtag community who are followed by people inside it:
As far as reach goes, that is, the number of people who may be seeing hashtag traffic via people they follow, we need to look at the followers of the hashtaggers. Once again, if we look at the “outer” graph, we see the people who are seeing hashtag tweets from their friends, but who aren’t using the hashtag:
We can also trivially see which members of the hashtag community have the largest number of followers:
If we implement an ego filter with depth 1, we can then look to see which followers are connected to a particular individual. (By changing the filter settings, e.g. going from one person (mweller, say) to another (gconole, say), we can see the similarities and differences between their followers.)
Okay, that’s enough for now… no real “research”, but a few really quick examples of how you can use Gephi to start to explore the structure of a hashtag network, and the size of the community around it.
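Outside Gephi, the same sort of comparison can be sketched in a few lines. This assumes follower -> hashtagger edges and treats a depth 1 ego filter as "the ego, its direct neighbours, and any edges among them"; the helper names and follower IDs are made up:

```python
def ego_network(edges, ego):
    """Edges among the ego and the nodes directly connected to it."""
    neighbours = {ego}
    for s, t in edges:
        if s == ego:
            neighbours.add(t)
        elif t == ego:
            neighbours.add(s)
    return [(s, t) for s, t in edges if s in neighbours and t in neighbours]


def compare_followers(edges, a, b):
    """Compare the follower sets of a and b (edges are follower -> followed)."""
    followers_a = {s for s, t in edges if t == a}
    followers_b = {s for s, t in edges if t == b}
    return {
        "shared": followers_a & followers_b,
        "only_a": followers_a - followers_b,
        "only_b": followers_b - followers_a,
    }


# Made-up follower IDs 10, 11 and 12.
edges = [(10, "mweller"), (11, "mweller"), (11, "gconole"),
         (12, "gconole"), (10, 11)]
ego = ego_network(edges, "mweller")
overlap = compare_followers(edges, "mweller", "gconole")
```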
Issues: the friends/followers lists are numeric IDs. Calling the API once per ID to get the screen names would be expensive, but there are a couple of possible heuristics. For example, you can pull back tweets from the 100 most recent friends or followers of an individual (including screen names and IDs) with a single API call, so we can use that to grab names for identifiers across the network. If we pick a hashtagger with the largest spanning coverage over the network, we could pull this information for all their friends/followers, 100 at a time.
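The "100 at a time" part is just batching. A minimal sketch, where `lookup_batch` is a hypothetical stand-in for whatever API call hands back user details for up to 100 IDs in one go (here it is faked so the example runs without touching the network):

```python
def chunked(ids, size=100):
    """Yield successive lists of at most size IDs."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def resolve_names(ids, lookup_batch, size=100):
    """Build an ID -> screen name map, one lookup call per batch."""
    names = {}
    for batch in chunked(ids, size):
        names.update(lookup_batch(batch))
    return names


# Fake lookup: records the batch sizes it sees and invents names.
calls = []


def fake_lookup(batch):
    calls.append(len(batch))
    return {user_id: "user%d" % user_id for user_id in batch}


names = resolve_names(range(250), fake_lookup)
```

With 250 IDs that comes to three calls rather than 250, which is the whole point of the heuristic.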
Alternatively, we could use something like Gephi to report on the most connected individuals whose names we don’t know, and use that to prioritise which user details we get first to further annotate the visualised graph.
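That prioritisation could also be done in the script itself. A rough sketch, assuming an edge list of numeric IDs and a dictionary of the names already known; the function name, IDs and names are all illustrative:

```python
from collections import Counter


def prioritise_unknown(edges, known_names, top_n=100):
    """Most connected nodes (by total degree) that have no name yet."""
    degree = Counter()
    for s, t in edges:
        degree[s] += 1
        degree[t] += 1
    unknown = [(node, d) for node, d in degree.most_common()
               if node not in known_names]
    return unknown[:top_n]


# Made-up data: we already know who 1 and 2 are.
edges = [(1, 9), (2, 9), (3, 9), (1, 8), (2, 8), (1, 2)]
known = {1: "alice", 2: "bob"}
to_fetch = prioritise_unknown(edges, known, top_n=2)
```

The IDs at the top of `to_fetch` are the ones whose names would annotate the most edges in the graph, so they are the ones worth spending API calls on first.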