Personal Twitter Networks in Hashtag Communities

Another conference I’m not at, this time ALT-C, so time for another blatant attempt to raise my profile at the event even if I’m not there with another Twitter related hack…;-) This time, a little tool to help you explore the extent of your Twitter network within a community of people using a particular hashtag.

Here’s a tease of the sort of report it gives:

My place in a Twitter hashtag community: http://ouseful.open.ac.uk/twitterMyhashtagNet.php?q=psychemedia&h=altc2009

  • Some numbers (I’ll let you know what in a minute…)
  • A list of people in the hashtag network who are followed by a particular individual (their “friends”).
  • A list of people in the hashtag network who follow a particular individual, but are not followed (friended) back (their “serfs”).
  • A list of people in the hashtag network who are followed (friended) by a particular individual, but do not follow them (their “slebs”).
  • A list of people in the hashtag network who neither follow nor are followed (friended) by a particular individual (“the void”).

Before I go on, I should probably also define what I mean by a hashtag community, not least because there are some, err, pragmatic constraints on defining this;-)

For the purposes of this post, a hashtag community is a collection of people who have used a particular hashtag more than a certain minimum specified number of times in a set of Twitter posts that use the hashtag. In my default ad hoc set up, I tend to look for people who have used the hashtag more than 3 times in the most recent 500 or so tweets. For the proof of concept demo, I also limit the size of the hashtag network to 100, otherwise the pipework that underpins it starts to fall over…

UPDATE:
Here’s a bit more explanation about why the app doesn’t always show people in the community you ‘know’ to be there…

You may notice that not everyone you know to have used the hashtag appears in the friends and followers lists. This is because the size of the hashtag community is limited in three ways:

  • hashtag use sample size: for this proof of concept, the hashtag community analysis is based on a Twitter search that grabs the 500 most recent uses of the declared hashtag. If this were a production tool, it would pull the complete archive of hashtag use from one of the Twitter archiving services. If you want that feature, build it yourself…;-)
  • minimum number of tweets: an optional parameter in the URI identifies the minimum number of hashtagged tweets that a user must have sent in the sample to be considered a member of the community. Setting this number high lets you see just the heaviest hashtaggers in the community, or filter out people who maybe just use the hashtag once in a retweet. (I think there’s a bug in the code – if you set this mintweets parameter to 2, the user must have hashtagged at least 3 times, i.e. one more; set it to 10 and the threshold is 11.)
  • Max community size: an ‘issue’ in the Twitter search API means I need to call the Twitter API once for every person in the community. This overhead can break the pipework, so the community size can be limited arbitrarily.
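To make those three limits a bit more concrete, here’s a rough sketch of the filtering step. It is not the code behind the app: the function name hashtagCommunity and the sample data are made up for illustration, and it assumes the search sample has already been boiled down to a list of who sent each hashtagged tweet. It also uses “at least mintweets” as the test, which is what the parameter is supposed to mean, rather than the “one more” behaviour noted above.

```php
<?php
// Sketch only: $tweeters stands in for the search sample, one entry per
// hashtagged tweet, holding the screen name of whoever sent it.
function hashtagCommunity(array $tweeters, int $minTweets, int $maxUsers): array
{
    // Count how many tweets in the sample each person sent
    $counts = array_count_values($tweeters);

    // Keep only people with at least $minTweets hashtagged tweets
    $members = array_filter($counts, function ($n) use ($minTweets) {
        return $n >= $minTweets;
    });

    // Heaviest hashtaggers first, then cap the community size
    arsort($members);
    return array_slice($members, 0, $maxUsers, true);
}

// Made-up sample of six hashtagged tweets
$sample = ['alice', 'bob', 'alice', 'carol', 'alice', 'bob'];
print_r(hashtagCommunity($sample, 2, 100));
```

With the made-up sample, carol drops out because she only used the tag once; dropping maxusers to 1 would leave just the heaviest hashtagger.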

The inspiration for the report is a typical ego thing – to what extent is my personal Twitter network dominated by the membership of a particular hashtag community. (Note I’ve explored related ideas in a variety of other ad hoc ways: Who’s Tweeting Our Hashtag?, Where Next With The Hashtagging Twitterers List?, Preliminary Thoughts on Visualising the OpenEd09 Twitter Network, A Quick Peek at the IWMW2009 Twitter Network, More Thinkses Around Twitter Hashtag Networks: #JISCRI and Handling Yahoo Pipes Serialised PHP Output).

Anyway, in the current example, the numbers I’ve started to look at are defined as follows. All numbers are either integers, or real numbers in the range 0..1.

So what do the numbers mean?

  • Number of hashtaggers: the number of people in the hashtag network, Ngalaxy;
  • Hashtaggers as followers (‘hashtag followers’): the number of people in the hashtag community who are following the named individual, Gfollowers
  • Hashtaggers as friends (‘hashtag friends’): the number of people in the hashtag community that the named individual has friended, Gfriends
  • Hashtagger followers not friended (‘serfs’): the number of people in the hashtag community that follow the named individual but that are not followed back (i.e. who are not friends of the named individual), Gserfs
  • Hashtagger friends not following (‘slebs’): the number of people in the hashtag community that are followed by the named individual (i.e. friends) but that do not follow them back (i.e. who are not also followers of the named individual), Gslebs
  • Hashtaggers not friends or followers (‘the hashtag void’): the number of people in the hashtag community who neither follow, nor are friended by, the named individual, Gvoid
  • Reach into hashtag community: the proportion of the hashtag community that follow the named individual; a measure of the extent to which an individual can reach the hashtag community without actually using the hashtag; Greach=Gfollowers/Ngalaxy.
  • Reception of hashtag community: the proportion of the hashtag community that are followed by (i.e. are friends of) the named individual; a measure of the extent to which an individual sees messages from the hashtag community without directly tracking the hashtag; Greception=Gfriends/Ngalaxy
  • Hashtag void (normalised): the size of the void normalised relative to the size of the hashtag community; the proportion of the hashtag community that are unlikely to be directly encountered outside of the hashtag community; Normvoid=Gvoid/Ngalaxy
  • Total personal followers: the total number of followers of the named individual, Nfollowers
  • Total personal friends: the total number of friends of the named individual, Nfriends
  • Hashtag community dominance of personal reach: the extent to which the hashtag community dominates the set of people who follow the named individual, Domreach=Gfollowers/Nfollowers. If all the named individual’s followers are in the hashtag community, Domreach=1. If none of them are, Domreach=0.
  • Hashtag community dominance of personal reception: the extent to which the set of the named individual’s friends is dominated by members of the hashtag community, Domreception=Gfriends/Nfriends. If all the named individual’s friends are in the hashtag community, Domreception=1. If none of them are, Domreception=0.
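Since all of these numbers fall out of a handful of set operations, here’s a minimal sketch of how they might be calculated. It isn’t the code behind the report: the function name hashtagNetworkStats and the IDs are invented for the example, and it assumes the named individual is not themselves included in the community list (otherwise they’d land in their own void).

```php
<?php
// Sketch only. $community: IDs of the hashtaggers; $followers / $friends:
// IDs of everyone who follows / is followed by the named individual.
function hashtagNetworkStats(array $community, array $followers, array $friends): array
{
    $gFollowers = array_intersect($community, $followers); // hashtaggers who follow the individual
    $gFriends   = array_intersect($community, $friends);   // hashtaggers the individual follows
    $serfs = array_diff($gFollowers, $gFriends);            // follow, but not followed back
    $slebs = array_diff($gFriends, $gFollowers);            // followed, but don't follow back
    $void  = array_diff($community, $gFollowers, $gFriends); // no connection either way

    // Ratios are defined as 0 when the denominator is empty (my assumption)
    $ratio = function ($a, $b) { return $b > 0 ? $a / $b : 0.0; };
    $n = count($community);

    return [
        'Ngalaxy'      => $n,
        'Gfollowers'   => count($gFollowers),
        'Gfriends'     => count($gFriends),
        'Gserfs'       => count($serfs),
        'Gslebs'       => count($slebs),
        'Gvoid'        => count($void),
        'Greach'       => $ratio(count($gFollowers), $n),
        'Greception'   => $ratio(count($gFriends), $n),
        'Normvoid'     => $ratio(count($void), $n),
        'Nfollowers'   => count($followers),
        'Nfriends'     => count($friends),
        'Domreach'     => $ratio(count($gFollowers), count($followers)),
        'Domreception' => $ratio(count($gFriends), count($friends)),
    ];
}

// Made-up IDs: five hashtaggers, four followers, three friends
print_r(hashtagNetworkStats([1, 2, 3, 4, 5], [1, 2, 9, 10], [2, 3, 11]));
```

In the made-up example, hashtaggers 1 and 2 are followers and 2 and 3 are friends, so there is one serf (1), one sleb (3) and a void of two (4 and 5).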

If you want to try the tool out, the interface is provided by the URI:
http://ouseful.open.ac.uk/twitterMyhashtagNet.php?q=ostephens&h=altc2009&mintweets=2&maxusers=99
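If you’d rather build that URI in code than by hand, something like the following should do it. The parameter meanings in the comments are my reading of the examples in this post (q is the person, h is the hashtag), so treat them as assumptions rather than documentation.

```php
<?php
// Sketch only: assembles the report URI from its query parameters
$base = 'http://ouseful.open.ac.uk/twitterMyhashtagNet.php';
$params = [
    'q'         => 'ostephens', // screen name of the individual to report on
    'h'         => 'altc2009',  // the hashtag, without the leading #
    'mintweets' => 2,           // minimum number of hashtagged tweets (optional)
    'maxusers'  => 99,          // cap on the size of the hashtag community
];

// Pass the separator explicitly so a local PHP setting can't change it
$uri = $base . '?' . http_build_query($params, '', '&');
echo $uri, "\n";
```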

I have no idea whether any of these measures are used in more formal analyses (I’ve yet to start my formal reading of the proper social network analysis stuff…) but it’s a way in for me to start thinking about which measures might be in some sense meaningful and both easy to explain and calculate;-)

Handling Yahoo Pipes Serialised PHP Output

One of the output formats supported by Yahoo Pipes is a PHP style array. In this post, which describes a way of seeing how well connected a particular Twitter user is to other Twitterers who have recently used a particular hashtag, I’ll show you how it can be used.

The following snippet (cribbed from Coding Forums) shows how to handle this array:

//Declare the required pipe, specifying the php output
$req = "http://pipes.yahoo.com/ouseful/hashtagtwitterers?_render=php&min=3&n=100&q=%23jiscri";

// Make the request
$phpserialized = file_get_contents($req);

// Parse the serialized response
$phparray = unserialize($phpserialized);

//Here's the raw contents of the array
print_r($phparray);

//Here's how to parse it
foreach ($phparray['value']['items'] as $key => $val) {
	//One <div> per item: the linked title, then the description
	printf("<div><p><a href=\"%s\">%s</a></p><p>%s</p></div>\n", $val['link'], $val['title'], $val['description']);
}

The pipe used in the above snippet (http://pipes.yahoo.com/ouseful/hashtagtwitterers) displays a list of people who have recently used a particular hashtag on Twitter a minimum specified number of times.

It’s easy enough to parse out the Twitter ID of each individual, and then for a particular named individual see which of those hashtagging Twitterers they are either following, or are following them. (Why’s this interesting? Well, for any given hashtag community, it can show you how well connected you are with that community).

So let’s see how to do it. First, parse out the Twitter ID:

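//Start with an empty list so later loops still work if the pipe returns no items
$idList = array();
foreach ($phparray['value']['items'] as $key => $val) {
	//The pattern expects titles of the form "@screenname ..." and keeps just the screenname
	$id=preg_replace("/@([^\s]*)\s.*/", "$1", $val['title']);
	$idList[] = $id;
}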

We have the Twitter screennames, but now we want the actual Twitter user IDs. There are several PHP libraries for accessing the Twitter API. The following relies on an old, rejigged version of the library available from http://github.com/jdp/twitterlibphp/tree/master/twitter.lib.php (the code may need tweaking to work with the current version…), and is really kludged together… (Note to self – tidy this up one day!)

The algorithm is basically as follows, and generates a GraphViz .dot file that will plot the connections a particular user has with the members of a particular hashtagging community:

  • get the list of hashtagger Twitter usernames (as above);
  • for each username, call the Twitter API to get the corresponding Twitter ID, and print out a label that maps each ID to a username;
  • for the user we want to investigate, pull down the list of people who follow them from the Twitter API; for each follower, if the follower is in the hashtaggers set, print out that relationship;
  • for the user we want to investigate, pull down the list of people who they follow (i.e. their ‘friends’) from the Twitter API; for each friend, if the friend is in the hashtaggers set, print out that relationship.

$Twitter = new Twitter($myTwitterID, $myTwitterPwd);

//Get the Twitter ID for each user identified by the hashtagger pipe
foreach ($idList as $user) {
	$user_det=$Twitter->showUser($user, 'xml');
 	$p = xml_parser_create();
	xml_parse_into_struct($p,$user_det,$results,$index);
	xml_parser_free($p);
	$id=$results[$index['ID'][0]]['value'];
	$userID[$user]=$id;
	//print out labels in the Graphviz .dot format
	echo $id."[label=\"".$user."\"];\n";
}

//$userfocus is the Twitter screenname of the person we want to examine
$currUser=$userID[$userfocus];
 
//So who in the hashtagger list is following them?
$follower_det=$Twitter->getFollowers($userfocus, 'xml');
$p = xml_parser_create();
xml_parse_into_struct($p,$follower_det,$results,$index);
xml_parser_free($p);
foreach ($index['ID'] as $item){
	$follower=$results[$item]['value'];
	//print out edges in the Graphviz .dot format
	if (in_array($follower,$userID)) echo $follower."->".$currUser.";\n";
}

//And who in the hashtagger list are they following?
$friends_det=$Twitter->getFriends($userfocus, 'xml');
$p = xml_parser_create();
xml_parse_into_struct($p,$friends_det,$results,$index);
xml_parser_free($p);
foreach ($index['ID'] as $item){
	$followed=$results[$item]['value'];
	//print out edges in the Graphviz .dot format
	if (in_array($followed,$userID)) echo $currUser."->".$followed.";\n";
}

For completeness, here are the Twitter object methods and their associated Twitter API calls that were used in the above code:

function showUser($id,$format){
	$api_call=sprintf("http://twitter.com/users/show/%s.%s",$id,$format);
  	return $this->APICall($api_call, false);
}

function getFollowers($id,$format){
  	$api_call=sprintf("http://twitter.com/followers/ids/%s.%s",$id,$format);
 	return $this->APICall($api_call, false);
}
  
function getFriends($id,$format){
  	$api_call=sprintf("http://twitter.com/friends/ids/%s.%s",$id,$format);
 	return $this->APICall($api_call, false);
}

Running the code uses N+2 Twitter API calls, where N is the number of different users identified by the hashtagger pipe.

The output of the script is almost a minimal Graphviz .dot file. All that’s missing is the wrapper, e.g. something like: digraph twitterNet { … }. Here’s what a valid file looks like:

(The labels can appear either before or after the edges – it makes no difference as far as GraphViz is concerned.)
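As an illustration of the wrapper, here’s a small sketch that takes some labels and edges and emits a complete digraph. The toDot function is just something knocked up for this example, and the IDs and screennames are invented rather than taken from a real run of the script above.

```php
<?php
// Sketch only: wrap label and edge statements in a digraph block.
// $labels maps a Twitter ID to a screenname; each edge is [from ID, to ID].
function toDot(array $labels, array $edges, string $name = 'twitterNet'): string
{
    $lines = ["digraph $name {"];
    foreach ($labels as $id => $screenName) {
        $lines[] = sprintf('  %s [label="%s"];', $id, $screenName);
    }
    foreach ($edges as [$from, $to]) {
        $lines[] = sprintf('  %s -> %s;', $from, $to);
    }
    $lines[] = '}';
    return implode("\n", $lines) . "\n";
}

// Made-up IDs and screennames
echo toDot(
    [101 => 'userA', 102 => 'userB', 103 => 'userC'],
    [[102, 101], [101, 102], [101, 103]]
);
```

Here userA and userB follow each other, and userA also follows userC, who doesn’t follow back.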

Plotting the graph will show you who the individual of interest is connected to, and how, in the particular hashtag community.

So for example, in the recent #ukoer community, here’s how LornaMCampbell is connected. First a ‘circular’ view:

[Image: ukoerInternalNetLMC2]

The arrow direction goes FROM one person TO a person they are following. In the circular diagram, it can be quite hard to see whether a connection is reciprocated or one way.

The Graphviz network diagram uses a separate edge for each connection and makes it easier to spot reciprocated links:

[Image: ukoerInternalNetLMC]

So, there we have it. Another way of looking at Twitter hashtag networks to go along with Preliminary Thoughts on Visualising the OpenEd09 Twitter Network, A Quick Peek at the IWMW2009 Twitter Network and More Thinkses Around Twitter Hashtag Networks: #JISCRI

More Thinkses Around Twitter Hashtag Networks: #JISCRI

A brief next step on from Preliminary Thoughts on Visualising the OpenEd09 Twitter Network and A Quick Peek at the IWMW2009 Twitter Network with a couple of graphs that look at the hashtag network around the JISCRI event that’s going on this week.

The sample was taken from a search of recent #jiscri hashtagged tweets captured last night using the Hashtag Twitterers pipe.

The first chart was to look at people who the hashtag twitterers were following in large numbers who weren’t using the hashtag (I think…my experimental protocol was a bit ropey last night… oops).

The graphs were plotted using Graphviz – firstly a radial plot:

[Image: jiscrinetExtGurus]

And then a circular one:

[Image: jiscrinetExtGurus2]

The circular one is quite fun, I think? :-) At a glance, it shows who the “external gurus” are, as well as the differences in their influence.

The second thing I looked at was the network graph of the JISCRI hashtaggers, showing who friended whom:

[Image: jiscriTwitterNet]

Here’s the circular view:

[Image: jiscriTwitterNetCircular]

For a large event, I think this sort of graph could be quite fun to generate at both the start of the event and at the end of the event, to show how connections can be formed during an event.

For conferences that publish lists of attendees, popping up a poster of the delegates’ twitter network might provide an interesting discussion thing for people to chat around.

PS See also Meet @HelloApp, Making Conferences More Fun.

Where Next With The Hashtagging Twitterers List?

This post is a holding position, so it’s probably gonna be even more cryptic than usual…

In Who’s Tweeting Our Hashtag?, I described a recipe for generating a list of people who had been tweeting, twittering or whatever, using a particular hashtag.

So what’s next on my to do list with this info?

Well, first of all I thought it’d be interesting to try to plot a graph of connections between the followers of everyone on the list, to see how large the hashtag audience might be.

Using a list of about 60 or so twitterers, captured yesterday, I called the Twitter API http://twitter.com/followers/ids/USERNAME.xml function for each one to pull down an XML list of all of their followers by ID number, and topped it up with the user info (http://twitter.com/users/show/USERNAME.xml) for each person on the original list; this info meant I could in turn spot the ID for each of the hashtagging twitterers amongst the followers lists.

It’s easy enough to transform these lists into the dot format that can be plotted by GraphViz, but the 10,000 or so edges generated from the followers lists were too much for my version of GraphViz to cope with.

So instead, I thought I’d just try to plot a subgraph, such as the graph of people who were following a minimum specified number of people in the original hashtag twittering list. So for example, the graph of people who were following at least five of the people who’d used the particular hashtag.

I hacked a piece of code to do this, but it’s far from ideal and I’m not totally convinced it works properly… Ideally what I want is a simple (efficient) utility that will accept a .dot file and prune it, removing nodes of less than a specified degree. (If you know of such a tool, please post a link to it in the comments:-)
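For what it’s worth, here’s a rough sketch of the sort of pruning I have in mind. It is not the code I actually hacked together: parseDotEdges and pruneByDegree are hypothetical helpers, the parser only copes with simple “a->b;” edge statements, degree is counted as in-links plus out-links, and the pruning is a single pass (nodes that fall below the threshold as a result of the prune are not removed in turn).

```php
<?php
// Sketch only: pull [from, to] pairs out of simple "a->b;" edge statements
function parseDotEdges(string $dot): array
{
    preg_match_all('/(\w+)\s*->\s*(\w+)/', $dot, $matches, PREG_SET_ORDER);
    return array_map(function ($m) { return [$m[1], $m[2]]; }, $matches);
}

// Keep only edges whose two end nodes both have degree >= $minDegree,
// where degree is measured on the original, unpruned graph
function pruneByDegree(array $edges, int $minDegree): array
{
    $degree = [];
    foreach ($edges as [$from, $to]) {
        $degree[$from] = ($degree[$from] ?? 0) + 1;
        $degree[$to]   = ($degree[$to] ?? 0) + 1;
    }
    $kept = array_filter($edges, function ($edge) use ($degree, $minDegree) {
        return $degree[$edge[0]] >= $minDegree && $degree[$edge[1]] >= $minDegree;
    });
    return array_values($kept);
}

// Made-up graph: d hangs off c by a single edge, so it gets pruned
$edges = parseDotEdges("a->b;\na->c;\nb->c;\nc->d;\n");
foreach (pruneByDegree($edges, 2) as [$from, $to]) {
    echo $from, "->", $to, ";\n";
}
```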

Here’s the first graph I managed to plot:

If my code is working, an edge points to a person if that person is following at least, err, lots of the other people [that is: lots of other people who used the hashtag]. So under the assumption that the code is working, this graph shows one person at the centre of the graph who is following lots of people who have tweeted the hashtag. Any guesses who that person might be? People who have edges directed towards them in this sort of plot are people who are heavily following the people using a particular hashtag. If you’re a conference organiser, I’m guessing that you’d probably want to appear in this sort of graph?

(If the code isn’t working, I’m not sure what the hell it is doing, or what the graph shows?!;-)

One other thing I thought I’d look at was the people who are following lots of people on the hashtagging list who haven’t themselves used the hashtag. These are the people to whom the event is being heavily amplified.

So for example, here we have a chart that is constructed as follows. The hashtag twitterers list is constructed from a sample of the most recent 500 opened09 hashtagged tweets around about the time stamp of this post and contains people who are in that list at least 3 times.

The edges on the chart are directed towards people who are not on the hashtag list but who are following more than 13 of the people who are on the list.
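The selection rule for that chart can be sketched as follows. Again, this is an illustration rather than the code I used: amplifiedTo is a made-up function name, the IDs are invented, and it assumes the followers lists have already been pulled down into an array keyed by hashtagger ID.

```php
<?php
// Sketch only. $followersOf maps each hashtagger's ID to the IDs of their
// followers. Returns the non-hashtaggers who follow MORE THAN $threshold
// of the hashtaggers, along with how many they follow.
function amplifiedTo(array $followersOf, int $threshold): array
{
    $hashtaggers = array_keys($followersOf);

    // For each follower, count how many hashtaggers they follow
    $counts = [];
    foreach ($followersOf as $followers) {
        foreach (array_unique($followers) as $follower) {
            $counts[$follower] = ($counts[$follower] ?? 0) + 1;
        }
    }

    // Keep heavy followers who are not themselves on the hashtag list
    $result = [];
    foreach ($counts as $follower => $n) {
        if ($n > $threshold && !in_array($follower, $hashtaggers, true)) {
            $result[$follower] = $n;
        }
    }
    return $result;
}

// Made-up data: hashtaggers 1, 2 and 3 and their followers
$followersOf = [1 => [2, 10, 11], 2 => [1, 10, 11], 3 => [10, 12, 1]];
print_r(amplifiedTo($followersOf, 1));
```

In the made-up data, user 1 follows two of the hashtaggers but is dropped for being on the hashtag list, leaving users 10 and 11 as the people the event is being amplified to. For the chart described above the threshold would be 13.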

Hmmmm… anyway, that’s more than enough confusion for now… I’m going to try not to tinker with this any more for a bit, because a holiday beckons and this could turn into a mindf**k project… However, when I do return to it, I think I’m going to have a go at attacking it with a graph/network toolkit, such as NetworkX, and see if I can do a proper bit of network analysis on the resulting graphs.