Years ago, I used the JavaScript InfoVis Toolkit to put together a handful of data visualisations around the idea of the “social life of a URL” by looking up bookmarked URLs on delicious and then seeing who had bookmarked them and using what tags (delicious URL History – Hyperbolic Tree Visualisation, More Hyperbolic Tree Visualisations – delicious URL History: Users by Tag). Whilst playing with some Twitter hashtag network visualisations today, I wondered whether I could do something similar based around delicious bookmark tags, so here’s a first pass attempt…
As a matter of course, delicious publishes RSS and JSON feeds from tag pages, optionally containing up to 100 bookmarked entries. Each item in the response is a bookmarked URL, along with details of the person who saved that bookmark and the tags they used.
That is, for a particular tag on delicious we can trivially get hold of the 100 most recent bookmarks saved with that tag and data on:
– who bookmarked it;
– what tags they used.
Here’s a little script in Python to grab the user and tag data for each lak11 bookmark and generate a Gephi gdf file to represent the bipartite graph that associates users with the tags they have used:
import simplejson
import urllib  # urllib.urlopen is Python 2 only

def getDeliciousTagURL(tag, typ='json', num=100):
    # need to add a pager to get data when more than 1 page
    return "http://feeds.delicious.com/v2/"+typ+"/tag/"+tag+"?count="+str(num)

def getDeliciousTaggedURLDetailsFull(tag):
    durl = getDeliciousTagURL(tag)
    data = simplejson.load(urllib.urlopen(durl))
    userTags = {}  # user -> list of distinct tags that user has applied
    uniqTags = []  # every distinct tag seen in the feed
    for i in data:
        url = i['u']    # bookmarked URL (not used in the graph)
        user = i['a']
        tags = i['t']
        title = i['d']  # bookmark title (not used in the graph)
        if user not in userTags:
            userTags[user] = []
        for t in tags:
            if t not in uniqTags:
                uniqTags.append(t)
            if t not in userTags[user]:
                userTags[user].append(t)

    # open for writing; the default mode is read-only
    f = open('bookmarks-delicious_'+tag+'.gdf', 'w')
    # node list: one node per user and one per tag
    f.write('nodedef> name VARCHAR,label VARCHAR, type VARCHAR\n')
    for user in userTags:
        f.write(user+','+user+',user\n')
    for t in uniqTags:
        f.write(t+','+t+',tag\n')
    # edge list: one edge for each (user, tag) pair
    f.write('edgedef> user VARCHAR,tag VARCHAR\n')
    for user in userTags:
        for t in userTags[user]:
            f.write(user+','+t+'\n')
    f.close()

tag = 'lak11'
getDeliciousTaggedURLDetailsFull(tag)
[Note to self: this script needs updating to grab additional results pages?]
Here’s an example of the output, in this case using the tag for Jim Groom’s Digital Storytelling course: ds106. The nodes are coloured according to whether they represent a user or a tag, and sized according to degree, and the layout is based on a force atlas layout with a few tweaks to allow us to see labels clearly.

Note that the actual URLs that are bookmarked are not represented in any way in this visualisation. The network shows the connections between users and the tags they have used, irrespective of whether the tags were applied to the same or different URLs. Even if two users share common tags, they may not share any common bookmarks…
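To check whether users who share tags also share bookmarks, the URLs would need to be kept alongside the users. Here's a minimal sketch of how that might look, assuming feed items with the same 'u' (URL), 'a' (user) and 't' (tags) keys used in the script above. The helper name shared_bookmarks and the sample items are made up for illustration:

```python
from itertools import combinations

def shared_bookmarks(items):
    """Map each pair of users to the set of URLs that both have bookmarked."""
    urls_by_user = {}
    for item in items:
        urls_by_user.setdefault(item['a'], set()).add(item['u'])
    shared = {}
    for a, b in combinations(sorted(urls_by_user), 2):
        common = urls_by_user[a] & urls_by_user[b]
        if common:
            shared[(a, b)] = common
    return shared

# Hypothetical feed items, not real delicious data
sample_items = [
    {'u': 'http://example.com/1', 'a': 'alice', 't': ['lak11', 'analytics']},
    {'u': 'http://example.com/2', 'a': 'bob', 't': ['lak11', 'analytics']},
    {'u': 'http://example.com/1', 'a': 'carol', 't': ['lak11']},
]
pairs = shared_bookmarks(sample_items)
print(pairs)
```

In this made-up sample, alice and bob use identical tags but have no bookmark in common, whereas alice and carol share a URL while only overlapping on one tag.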
Here’s another example, this time using the lak11 tag:

Looking at these networks, a couple of things struck me:
– the commonly used tags might express a category or conceptual tag that describes the original tag used to source the data;
– folk sharing similar tags may share similar interests.
Here’s a view over part of the LAK11 network with the LAK11 tag filtered out, and the Gephi ego filter applied with depth two to a particular user, in this case delicious user rosemary20:

The filtered view shows us:
– the tags a particular user (in this case, rosemary20) has used;
– the people who have used the same tags as rosemary20; note that this does not necessarily mean that they bookmarked any of the same URLs, nor that they are using the tags to mean the same thing*…
(* delicious does allow users to provide a description of a tag, though I’m not sure if this information is generally available via a public API?)
By sizing the nodes according to degree in this subnetwork, we can readily identify the tags commonly used alongside the tag used to source the data, and also the users who have used the largest number of identical tags.
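The same depth-two view can be approximated outside Gephi. The sketch below is my own rough equivalent, not Gephi's implementation: it assumes a userTags-style dictionary (user to list of tags) like the one built in the script above, and the function name ego_subnetwork_degrees, the exclude parameter and the sample data are all hypothetical:

```python
def ego_subnetwork_degrees(user_tags, ego, exclude=()):
    """Degrees of the tags and users within two hops of `ego` in the user-tag graph."""
    # Depth one: the tags the ego user has applied (minus any filtered-out tags)
    ego_tags = set(user_tags.get(ego, [])) - set(exclude)
    tag_degree = dict((t, 1) for t in ego_tags)  # each ego tag is linked to the ego
    user_degree = {}
    # Depth two: other users who have applied at least one of those tags
    for user, tags in user_tags.items():
        if user == ego:
            continue
        common = ego_tags & set(tags)
        if common:
            user_degree[user] = len(common)
            for t in common:
                tag_degree[t] += 1
    return tag_degree, user_degree

# Hypothetical data, not taken from the real lak11 feed
sample = {
    'alice': ['lak11', 'analytics', 'learning'],
    'bob': ['lak11', 'analytics', 'learning'],
    'carol': ['lak11', 'analytics'],
    'dave': ['lak11', 'moodle'],
}
tag_degree, user_degree = ego_subnetwork_degrees(sample, 'alice', exclude=['lak11'])
print(tag_degree, user_degree)
```

Sorting user_degree by value gives the users who share the most tags with the chosen user; sorting tag_degree gives the tags most widely used within that neighbourhood.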
PS it struck me that a single page web app should be quite easy to put together to achieve something similar to the above visualisations. The JSON feed from delicious is easy enough to pull in to any page, and the Protovis code library has a force directed layout package that works on a simple graph representation not totally dissimilar to the Gephi/GDF format.
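As a starting point, here's a rough sketch of turning the user/tag data into a node list plus index-based link list. The field names (nodeName, group, source, target, value) follow my recollection of the data used in Protovis's force layout examples and should be checked against the Protovis documentation; the function name to_node_link and the sample data are hypothetical:

```python
import json

def to_node_link(user_tags):
    """Convert a user -> tags dictionary into a nodes/links structure."""
    nodes = []
    index = {}  # (kind, name) -> position in the nodes list

    def node_id(name, kind):
        key = (kind, name)
        if key not in index:
            index[key] = len(nodes)
            nodes.append({'nodeName': name, 'group': kind})
        return index[key]

    links = []
    for user in sorted(user_tags):
        for tag in user_tags[user]:
            # links refer to nodes by their position in the nodes list
            links.append({'source': node_id(user, 'user'),
                          'target': node_id(tag, 'tag'),
                          'value': 1})
    return {'nodes': nodes, 'links': links}

# Hypothetical data for illustration
graph = to_node_link({'alice': ['lak11', 'analytics'], 'bob': ['lak11']})
print(json.dumps(graph, indent=2))
```

Keying the index on (kind, name) keeps a user and a tag that happen to share a name as separate nodes.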
If I get an hour I’ll try to have a play to put a demo together. If you beat me to it, please post a link to your demo (or even fully blown app!) in the comments:-)