Friends of the Community: Who’s Effectively Following a Hashtag

Picking up on @briankelly’s Thoughts on ILI 2010, where he reports on a few gross level stats about #ili2010 hashtag activity grabbed from Summarizr, here are a few things I observed from looking at some of the hashtag community network stats…

To start with, I looked at the “inner hashtag community” where I grab the list of hashtaggers and their friends who have also used the hashtag and make links between them to give this sort of graph, as used before in many posts:

ILI2010 hashtaggers

(Directed graph from person to friend (i.e. to person they follow); node size proportional to in-degree, heat to out-degree.)
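As a rough sketch of the construction just described (the usernames and the `friends` lookup are invented for illustration, and this is not necessarily how the graph above was actually built): take the set of hashtaggers, keep only those friend links that point at other hashtaggers, and count in- and out-degree from the resulting edges.

```python
# Hypothetical data: who used the hashtag, and who each of them follows.
hashtaggers = {"alice", "bob", "carol", "dave"}
friends = {
    "alice": {"bob", "carol", "zed"},  # zed never used the hashtag
    "bob": {"alice"},
    "carol": {"alice", "bob", "yan"},  # neither did yan
    "dave": {"alice"},
}

# Directed edges run from a person to each friend (someone they follow),
# keeping only friends who are also in the hashtag community.
edges = sorted(
    (person, friend)
    for person in hashtaggers
    for friend in friends.get(person, set()) & hashtaggers
)

# In-degree: how many hashtaggers follow you (node size in the graph).
# Out-degree: how many hashtaggers you follow (node heat in the graph).
in_degree = {name: 0 for name in hashtaggers}
out_degree = {name: 0 for name in hashtaggers}
for person, friend in edges:
    out_degree[person] += 1
    in_degree[friend] += 1
```

Links to zed and yan are dropped because neither used the hashtag, which is what keeps this the "inner" community.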

After generating a few network statistics using Gephi, and exporting the data from the Gephi Data Table view, I uploaded the statistics data to IBM’s ManyEyes site here. This allows us to view the distribution of the hashtaggers based on various statistical and network measures using a range of other visualisation techniques, such as histograms and scatterplots (view interactive histogram chart for ILI2010 hashtaggers, interactive scatterplot).

So for example, here’s the distribution of hashtaggers by total number of followers (that is, including followers outside the hashtag community) as a histogram:

ILI2010 hashtaggers - total numbers of followers

If we look at the betweenness measure, which was calculated over the friends connections between the hashtaggers, we can see who’s best suited to getting a message broadcast across the community through direct and friend-of-a-friend links:

ILI2010 hashtaggers - inner friends betweenness
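Gephi did the betweenness calculation for the chart above. For anyone who wants to see roughly what goes into such a number, here is a minimal pure-Python sketch of the standard Brandes approach for an unweighted directed graph. The `follows` graph is a toy example, and the scores are unnormalised, so they won't match Gephi's output if its normalisation option is switched on.

```python
from collections import deque


def betweenness(graph):
    """Unnormalised betweenness centrality for an unweighted directed graph.

    graph maps every node to the set of nodes it links to; each node must
    appear as a key, even if it links to nobody.
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {v: [] for v in graph}
        paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Toy graph: a follows b, and b follows c and d.
follows = {"a": {"b"}, "b": {"c", "d"}, "c": set(), "d": set()}
scores = betweenness(follows)
```

In the toy graph only b sits on anyone else's shortest path (a to c, and a to d), so b is the only node with a non-zero score.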

If we look at the in-degree (the number of people in the hashtag community who have friended, i.e. are following, an individual) divided by the total number of friends of that individual, we can identify people who are being followed by more people in the community than they have as friends:

ILI2010 hashtaggers - in-degree divided by total friends
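The ratio in that chart is simple enough to sketch. The figures below are invented, and the zero-friends guard is my own assumption about how to handle accounts that follow nobody:

```python
# Hypothetical figures: in-degree within the hashtag graph, and each
# person's total friend count (everyone they follow on Twitter).
in_degree = {"alice": 30, "bob": 12, "carol": 45}
total_friends = {"alice": 20, "bob": 400, "carol": 0}


def followed_vs_friends(name):
    """In-degree divided by total friends; None if they follow nobody."""
    friends_count = total_friends[name]
    if friends_count == 0:
        return None  # avoid dividing by zero
    return in_degree[name] / friends_count


ratios = {name: followed_vs_friends(name) for name in in_degree}

# A ratio above 1 means more hashtaggers follow this person than the
# person follows in total.
more_followed_than_following = [
    name for name, ratio in ratios.items() if ratio is not None and ratio > 1
]
```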

If we look at the out-degree (the number of hashtaggers a user follows) divided by that user’s total number of friends, we can see the extent to which a person’s twitter feed is dominated by updates from folk who have used the ILI2010 hashtag:

ILI2010 hashtaggers - extent to which stream is dominated by hashtaggers

In the above case, we see one person who appears to only follow members of the ILI2010 hashtag community. (I’m guessing that if folk come to twitter through a conference, this might be a signature of that?) Before you get too excited though, a little more digging suggests that that person only follows 1 person;-)

The interactive scatterplot allows us to view 3 dimensions of data – in the following case, I’m looking for well connected (good betweenness centrality), well respected (high in-degree) folk in the hashtag community who also have a large reach in terms of their total number of followers:

ILI2010 hashtaggers - scatterplot

In terms of audience development, we can also create a network based on the complete follower lists of the ILI2010 hashtaggers. Creating such a graph generates a network with 71,627 nodes, of which 236 were hashtaggers – meaning that in principle 71,391 people outside the hashtag community might have seen an ILI2010 hashtagged tweet…
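The node count is just the size of the union of the hashtaggers and all their followers, and the outside audience is whatever is left once the hashtaggers are taken out. A toy sketch, with made-up follower lists:

```python
# Hypothetical follower lists, keyed by hashtagger. Numeric strings stand
# in for the IDs of followers who never used the hashtag.
followers = {
    "alice": {"bob", "101", "102"},
    "bob": {"alice", "102", "103"},
    "carol": {"alice", "104"},
}
hashtaggers = set(followers)

# Every node in the follower graph: hashtaggers plus anyone following one.
nodes = hashtaggers.union(*followers.values())

# People who never used the hashtag but follow someone who did.
potential_audience = nodes - hashtaggers

# The same subtraction using the node counts quoted in the post.
outside_ili2010 = 71627 - 236
```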

Using a directed graph from hashtaggers to their followers, if we filter the graph to only show individuals with an in-degree of at least 60, say, we can see those people who are following at least 60 people who have used the hashtag:

ILI2010 hashtagger followers

In the way I have constructed this graph, the nodes showing Twitter usernames are in the hashtag community, the numerical IDs are individuals who didn’t use the ILI2010 hashtag but who do follow at least 60 people who did, and therefore presumably saw quite a lot of tweets about the event.
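The filtering was done in Gephi, but the same idea can be sketched in a few lines. The follower lists and the threshold of 2 are toy values, and `friends_of_community` is just a name I've made up for the helper:

```python
from collections import Counter

# Hypothetical follower lists, keyed by hashtagger.
followers = {
    "alice": {"bob", "101", "102"},
    "bob": {"alice", "102", "103"},
    "carol": {"alice", "102", "104"},
}
hashtaggers = set(followers)

# Edges run hashtagger -> follower, so a follower's in-degree is the
# number of hashtaggers they follow.
in_degree = Counter(
    follower for follower_set in followers.values() for follower in follower_set
)


def friends_of_community(threshold):
    """Non-hashtaggers who follow at least `threshold` hashtaggers."""
    return sorted(
        name
        for name, count in in_degree.items()
        if count >= threshold and name not in hashtaggers
    )


heavy_followers = friends_of_community(2)
```

Here alice also clears the threshold, but she is left out because she used the hashtag herself.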

Looking up the twitter IDs of the “friends of the hashtag community”, we see the following people did not use the hashtag over the sample period, but do follow lots of people who did: @ijclark, @aekins, @metalibrarian, @schammond, @Jo_Bo_Anderson, @research_inform, @tomroper, @facetpublishing, @DavidGurteen

Of course, to know the extent to which hashtagger activity dominates the twitterstream of this “friends of the hashtag community”, we’d need to normalise this against their total number of friends; because, for example, if I follow 20k people, of which 60 were hashtaggers, I’d probably miss most of the hashtagged tweets; whereas, if I follow 100 people, of which 60 are hashtaggers, the density of tweets received from hashtaggers could be expected to be quite high.
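Working through those two cases: the sketch below uses the share of friends who are hashtaggers as a stand-in for the share of the stream, which assumes everyone tweets at about the same rate (almost certainly not true, but good enough for a first look). The function name is hypothetical.

```python
def hashtagger_share(hashtagger_friends, total_friends):
    """Fraction of the accounts someone follows that used the hashtag."""
    if total_friends <= 0:
        raise ValueError("total_friends must be positive")
    return hashtagger_friends / total_friends


# Following 20k people, 60 of them hashtaggers: 0.3% of friends.
heavy_follower = hashtagger_share(60, 20_000)

# Following 100 people, 60 of them hashtaggers: 60% of friends.
light_follower = hashtagger_share(60, 100)
```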

Okay – enough for now… although if you can think of anything else that might be interesting to know about the wider community around the hashtaggers, please post it in a comment below:-)

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

2 thoughts on “Friends of the Community: Who’s Effectively Following a Hashtag”

  1. Another brilliant post. Many thanks for the perspective on these analytics. It will probably take a while for the depth of significance about what can be learned to sink in.

    I did wonder – and apologies if you have covered this before – but can you use this type of approach to understand more about the way a community grows around a particular interest? I guess I am almost thinking somewhat astronomically here (which is a stretch since I am not an astronomer) but if you consider the first and last graphs as akin to an astronomical map, would you be able to identify the nature of the node (person) by the representation in the map, and would it tell you about their stage of maturity relevant to this particular community?

    Not sure if that even makes sense but several of your comments stimulated that thought.

    1. @allyn one of the things on my to do list is looking at the growth of a network over time. I do think there probably are network metrics that might partially signal different stages of maturity of a community, e.g. in the sense of gross level statistics that describe the connectedness and size of the network (I’ve posted some early thoughts on this elsewhere).

      I am, of course, completely making all this stuff up, and learning in public as I go along, so as and when I get my head round how standard, research-quoted network statistics are used, I’ll post it – and try to interpret it in terms of what it might practically mean;-)
