Journalist Filters on Twitter – The Reuters View

It seems that Reuters has a new product out – Reuters Social Pulse. As well as highlighting “the stories being talked about by the newsmakers we follow”, there is an area highlighting “the Reuters & Klout 50 where we rank America’s most social CEOs.” Of note here is that this list is ordered by Klout score. Reuters don’t own Klout (yet?!), do they?!

The offering also includes a view of the world through the tweets of Reuters’ own staff. Apparently, “Reuters has over 3,000 journalists around the world, many of whom are doing amazing work on Twitter. That is too many to keep up with on a Twitter list, so we created a directory [Reuters Twitter Directory] that shows you our best tweeters by topic. It lets you find our reporters, bloggers and editors by category and location so you can drill down to business journalists in India, if you so choose, or tech writers in the UK.”

If you view the source of the Reuters Twitter directory page, you can find a JavaScript object that lists all(?) the folk in the Reuters Twitter directory and the tags they are associated with… Hmm, I thought… Hmmm…

If we grab that object, and pop it into Python, it’s easy enough to create a bipartite network that links journalists to the categories they are associated with:

import json
import networkx as nx
#http://mlg.ucd.ie/files/summer/tutorial.pdf
from networkx.algorithms import bipartite

g = nx.Graph()

#need to bring in reutersJournalistList (the JSON string grabbed from the page source)
records=json.loads(reutersJournalistList)

#I had some 'issues' with the parsing for some reason? Required this hack in the end...
journalists=set()
for record in records:
	for x in record:
		if x=='users':
			u=record[x][0]['twitter_screen_name']
			print('user:',u)
			#Keep track of which nodes are journalists (rather than tags)
			journalists.add(u)
			g.add_node(u)
			for topic in record[x][0]['topics']:
				print('- topic:',topic)
				#Add edges from journalist name to each tag they are associated with
				g.add_edge(u,topic)
#print(bipartite.is_bipartite(g))
#print(bipartite.sets(g, top_nodes=journalists))

#Save a graph file we can visualise in Gephi corresponding to bipartite graph
nx.write_graphml(g, "usertags.graphml")

#We can find the sets of names/tags associated with the disjoint sets in the graph
#Naming the journalist side explicitly means this also works if the graph is not connected
users,tags=bipartite.sets(g, top_nodes=journalists)

#Collapse the bipartite graph to a graph of journalists connected via a common tag
ugraph= bipartite.projected_graph(g, users)
nx.write_graphml(ugraph, "users.graphml")

#Collapse the bipartite graph to a set of tags connected via a common journalist
tgraph= bipartite.projected_graph(g, tags)
nx.write_graphml(tgraph, "tags.graphml")

#Dump a list of the journalists Twitter IDs
with open("users.txt","w") as f:
	for uo in users: f.write(uo+'\n')

Having generated graph files, we can then look to see how the tags cluster as a result of how they were applied to journalists associated with several tags:

Alternatively, we can look to see which journalists are connected by virtue of being associated with similar tags (hmm, I wonder if edge weight carries information about how many tags each connected pair may be associated through? [UPDATE: there is a projection that will calculate this – bipartite.projection.weighted_projected_graph]). In this case, I size the nodes by betweenness centrality to try to highlight journalists that bridge topic areas:
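As a rough sketch of the weighted projection mentioned in that update, here is how the edge weights come out on a toy graph. The screen names and tags are made up for illustration (they are not taken from the Reuters directory), and computing betweenness centrality in networkx is just one alternative to letting a visualisation tool work it out:

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical toy data: journalist screen names mapped to tags
# (illustrative only, not real directory entries)
toy = {
    "journo_a": ["tech", "media"],
    "journo_b": ["tech", "media", "india"],
    "journo_c": ["india"],
}

# Build the bipartite journalist/tag graph
g = nx.Graph()
for name, topics in toy.items():
    for topic in topics:
        g.add_edge(name, topic)

# Project onto the journalists; each edge weight counts the tags a pair shares
wgraph = bipartite.weighted_projected_graph(g, list(toy))

for u, v, data in wgraph.edges(data=True):
    print(u, v, data["weight"])

# Betweenness centrality of each journalist in the projected graph
bc = nx.betweenness_centrality(wgraph)
print(bc)
```

In this toy case the two journalists who share both "tech" and "media" end up joined by an edge of weight 2, and the journalist who sits between the other two has the highest betweenness.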

Association through shared tags (as applied by Reuters) is one thing, but there is also structure arising from friendship networks… So to what extent do the journalists in the Reuters Twitter directory follow each other (again, sizing nodes by betweenness centrality)?

Finally, here’s a quick look at folk followed by 15 or more of the folk in the Reuters Twitter journalists list: this is the common source area on Twitter for the journalists on the list. This time, I size nodes by eigenvector centrality.
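The filtering step can be sketched along the following lines, assuming the friends lists have already been grabbed into a dict somehow. The `friends` data, the account names and the `followed_by_at_least` helper are all hypothetical, and the threshold is set to 2 rather than 15 so the toy example has something to show:

```python
import networkx as nx

# Hypothetical friends lists: journalist -> accounts they follow (made-up names)
friends = {
    "journo_a": ["source_1", "source_2"],
    "journo_b": ["source_1", "source_2"],
    "journo_c": ["source_1", "source_3"],
}

def followed_by_at_least(friends, threshold):
    """Return the follow graph restricted to accounts followed by at least
    `threshold` of the listed journalists, plus the journalists following them."""
    g = nx.DiGraph()
    for journalist, followed in friends.items():
        for account in followed:
            g.add_edge(journalist, account)
    # In-degree = number of listed journalists who follow the account
    keep = {n for n, d in g.in_degree() if d >= threshold}
    # Also keep each journalist who follows at least one retained account
    keep |= {j for j, followed in friends.items() if any(a in keep for a in followed)}
    return g.subgraph(keep).copy()

common = followed_by_at_least(friends, threshold=2)
print(sorted(common.nodes()))

# The result could then be saved for viewing in Gephi
nx.write_graphml(common, "commonsources.graphml")
```

Node sizing by centrality is left to the visualisation tool here.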

So why bother with this? Because journalists provide a filter onto the way the world is reported to us through the media, and as a result the perspective we have of the world as portrayed through the media. If we see journalists as providing independent fairwitness services, then having some sort of idea about the extent to which they are sourcing their information severally, or from a common pool, can be handy. In the above diagram, for example, I try to highlight common sources (folk followed by at least 15 of the journalists on the Twitter list). But I could equally have got a feeling for the range of sources by producing a much larger and sparser graph, such as all the folk followed by journalists on the list, or folk followed by only 1 person on the list (40,000 people or so in all – see below), or by 2 to 5 people on the list…
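To get a feel for how the sources spread across those ranges before drawing anything, a quick count might look something like this. Again, the `friends` dict is hypothetical toy data, and the "6-14" band is just my assumption for filling the gap between the ranges mentioned above:

```python
from collections import Counter

# Hypothetical friends lists: journalist -> accounts they follow (made-up names)
friends = {
    "journo_a": ["source_1", "source_2"],
    "journo_b": ["source_1", "source_2"],
    "journo_c": ["source_1", "source_3"],
}

# Number of listed journalists following each account
follower_counts = Counter(
    account for followed in friends.values() for account in set(followed)
)

def bucket(n):
    """Assign a follower count to a band (band edges are illustrative)."""
    if n == 1:
        return "1"
    if n <= 5:
        return "2-5"
    if n < 15:
        return "6-14"
    return "15+"

# How many accounts fall into each band
spread = Counter(bucket(n) for n in follower_counts.values())
for band in ["1", "2-5", "6-14", "15+"]:
    print(band, spread[band])
```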

Friends lists are one sort of filter every Twitter user has onto the content being shared on Twitter, and something that’s easy to map. There are other views of course – the list of people mentioning a user is readily available to every Twitter user, and it’s easy enough to set up views around particular hashtags or search terms. Grabbing the journalists associated with one or more particular tags, and then mapping their friends (or, indeed, followers) is also possible, as is grabbing the follower lists for one or more journalists and then looking to see who the friends of the followers are, thus positioning the journalist in the social media environment as perceived by their followers.

I’m not sure what value Reuters sees in the stream of tweets from the folk on its Twitter journalists lists, or the Twitter networks they have built up, but we can at least try to map out the friend lenses. And via the bipartite user/tag graph, it also becomes trivial for us to find journalists with interests in Facebook and advertising, for example…
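By way of a minimal sketch, finding the journalists attached to every one of a set of tags is just an intersection of neighbour sets in the bipartite graph. The toy data, the tag spellings and the `journalists_with_all` helper are assumptions for illustration; the real tag labels are whatever appears in the directory:

```python
import networkx as nx

# Hypothetical toy data: journalist screen names mapped to tags
toy = {
    "journo_a": ["Facebook", "Advertising"],
    "journo_b": ["Facebook"],
    "journo_c": ["Advertising", "Media"],
}

# Build the bipartite journalist/tag graph
g = nx.Graph()
for name, topics in toy.items():
    for topic in topics:
        g.add_edge(name, topic)

def journalists_with_all(g, tags):
    """Return the nodes adjacent to every tag in `tags`.
    Raises KeyError if a tag is not in the graph."""
    neighbour_sets = [set(g[tag]) for tag in tags]
    return set.intersection(*neighbour_sets)

both = journalists_with_all(g, ["Facebook", "Advertising"])
print(both)
```

The same helper should work on a graph loaded back in with `nx.read_graphml("usertags.graphml")`, assuming that file was saved as above.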

PS for associated techniques related to the emergent social positioning of hashtags and shared links on Twitter, see Socially Positioning #Sherlock and Dr John Watson’s Blog… and Social Media Interest Maps of Newsnight and BBCQT Twitterers. For a view over @skynews Twitter friends, and how they connect, see Visualising How @skynews’ Twitter Friends Connect.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...