In a couple of recent posts, I’ve shown how it’s possible to extract and visualise the internal link structure (the “autopingback graph”) of a WordPress blog. It’s also easy enough to extract linkage information from a WordPress export file that shows who’s been linking to which posts…
Building on the code I posted in The Structure of OUseful.Info, we can extract the external links data as follows (the else part):
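if commentInfo['type']=='pingback' and commentInfo['url'].find('http://blog.ouseful.info')!=-1:
    # Internal pingback: edge from the linking post's slug to the linked post's slug
    cID=commentInfo['url'].rstrip('/')
    cID=cID.rpartition('/')
    rID=post["link"].rstrip('/')
    rID=rID.rpartition('/')
    f.write('"'+cID[2]+'"->"'+rID[2]+'"\n')
    f2.write('"'+cID[2]+'","'+rID[2]+'"\n')
    #post['comments'].append(comments)
else:
    # External link: keep only the linking domain.
    # Split on '://' rather than lstrip('http://'): lstrip removes a set of
    # characters, not a prefix, so it would also eat leading letters of some domains.
    xID=commentInfo['url'].split('://',1)[-1]
    xID=xID.partition('/')
    rID=post["link"].rstrip('/')
    rID=rID.rpartition('/')
    f3.write('"'+xID[0]+'","'+rID[2]+'"\n')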
To simplify matters, I only record the domain that generated the incoming link, rather than the URL of the particular blog post making the link. Edges go from the linking domain to individual blog posts on OUseful.Info.
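As a rough, standalone sketch of the same idea (not the code used for the post), the domain-to-post edge can also be built with the standard library's URL parser. The function name `external_edge` and the example URLs are made up for illustration, and the sketch assumes Python 3:

```python
from urllib.parse import urlparse

def external_edge(comment_url, post_link):
    """Return (linking_domain, post_slug) for one incoming link.

    Hypothetical helper: the name and signature are assumptions,
    not part of the original script.
    """
    # netloc is the host part of the URL, e.g. 'example.com'
    domain = urlparse(comment_url).netloc
    # The post slug is the last path segment of the post's permalink
    slug = post_link.rstrip('/').rpartition('/')[2]
    return domain, slug

edge = external_edge('http://example.com/2010/05/some-post/',
                     'http://blog.ouseful.info/2010/05/01/a-post-slug/')
print('"%s","%s"' % edge)
```

Using a URL parser sidesteps prefix-stripping problems and also copes with `https://` links, at the cost of returning an empty domain for URLs that lack a scheme.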
We can then visualise the graph in Gephi. Here, for example, I have sized the nodes according to in-degree, that is, in proportion to the number of times posts on OUseful.Info have been linked to from external sites:
Alternatively, we can size the nodes according to out-degree to see which domains link most frequently to OUseful.Info posts:
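For a quick check outside Gephi, the same two measures can be approximated by counting rows in the domain-to-post edge list. This is only a sketch: the sample rows are invented, the function name `degree_counts` is an assumption, and it counts every row, so repeated links between the same domain and post count more than once, which may differ from how Gephi treats duplicate edges.

```python
import csv
from collections import Counter
from io import StringIO

# Invented sample in the "domain","post" format of the external links file
sample = ('"example.com","post-a"\n'
          '"example.org","post-a"\n'
          '"example.com","post-b"\n')

def degree_counts(rows):
    """Count links received per post and links made per domain."""
    in_degree = Counter()   # post slug -> number of incoming links
    out_degree = Counter()  # domain -> number of outgoing links
    for domain, post in rows:
        out_degree[domain] += 1
        in_degree[post] += 1
    return in_degree, out_degree

in_degree, out_degree = degree_counts(csv.reader(StringIO(sample)))
print(in_degree.most_common())   # most linked-to posts first
print(out_degree.most_common())  # most prolific linking domains first
```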
If we apply an ego filter to a particular domain, we can see which posts it has linked to:
We can then increase the depth of the filter to see which other domains have linked to the posts that a particular domain has linked to:
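The effect of the depth setting can be illustrated with a small breadth-first search over the edge list. This is a sketch of the general idea rather than Gephi's implementation: it assumes edges are followed in either direction, and the function name `ego_network` and the sample edges are invented.

```python
from collections import defaultdict, deque

def ego_network(edges, ego, depth=1):
    """Return the set of nodes within `depth` hops of `ego`,
    ignoring edge direction (an assumption of this sketch)."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen = {ego}
    queue = deque([(ego, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen

# Invented domain -> post edges
edges = [('example.com', 'post-a'), ('example.org', 'post-a'),
         ('example.com', 'post-b'), ('example.net', 'post-c')]

print(sorted(ego_network(edges, 'example.com', depth=1)))
print(sorted(ego_network(edges, 'example.com', depth=2)))
```

At depth 1 the result holds the chosen domain and the posts it links to; at depth 2 it also picks up the other domains that link to those same posts.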
When I first started using Gephi, I don’t think I saw it as a tool for exploring the information environment around a blog, but I do now:-)
PS Hmmm…. I wonder if I should have a go at writing a Gephi plugin to consume WordPress export files and visualise the pingback structures contained within them…?
Sure, a WordPress plugin could be amazing!
I’ll try it on gephi.org ;)
Just been having a look at what’s involved in writing Gephi plugins. Looks like a step too far for me at the moment, given the overhead of just setting up the dev environment… Ho hum…