Additional Thoughts on Generating a Persistent Context from an Event Tag

In Deriving a Persistent EdTech Context from the ALTC2010 Twitter Backchannel (aka ‘From community folksonomy to epistemology in a few clicks: Possibly the most useful post (ever?)’, via @georgeroberts;-), I showed how we could mine the tweets surrounding an archived hashtag in order to generate a topic-based context that would persist long after the event itself had gone.

So what else might we do? Here are a couple of quick thoughts…

Firstly, some folk were tweeting links using the hashtag, so we can scrape these from the Twapperkeeper archive and maybe use them to feed a facet of the search engine (e.g. relating to sites/links tweeted during the event). In this case, that part of the search engine would correspond to a fragmentary memory of links deemed important at the time of the original event.

Here are a couple of fragments that could form the basis of the generating script. Firstly, a link stripper to extract links from a tweet. Something like this should work:

print(re.findall(r'(?:http://|www\.)[^"\s]+', string))
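The pattern above only looks for http:// and www. links, and it will happily swallow a full stop or comma that follows a link at the end of a sentence. A slightly more forgiving sketch might look like the following; the function name extract_links and the choice of trailing characters to trim are my own assumptions rather than anything from the original script:

```python
import re

# Match http://, https:// or bare www. links, stopping at whitespace or a double quote
LINK_PATTERN = re.compile(r'(?:https?://|www\.)[^"\s]+')

def extract_links(text):
    """Return the links found in a tweet, minus any trailing punctuation."""
    links = []
    for match in LINK_PATTERN.findall(text):
        # Trailing punctuation usually belongs to the sentence, not the URL
        links.append(match.rstrip('.,;:!?)\'"'))
    return links

print(extract_links('Slides at http://example.com/altc2010, notes on www.example.org.'))
```

Trimming is a heuristic: a URL that genuinely ends in a bracket or full stop would be clipped, which seems a fair trade for tweets.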

Secondly, we need to post full links rather than shortened links to the search engine. I noticed @AJCann was using a URL in his twitter profile, and also a lot of tweeted links are shortened using; so for those at least we can expand the links via the API:

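import json, re, urllib.request

urls = []
# API call URL goes here; one call can take up to 15 &shorturl=URL pairs
urls.append('')
for url in urls:
  print('url: ' + url)
  # Fetch the response and parse the JSON it returns
  r = json.load(urllib.request.urlopen(url))
  for j in r['data']['expand']:
    print('long ' + j['long_url'])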
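Since a single call can carry up to 15 short URLs, a long list of tweeted links needs chopping into batches first. Here is a minimal sketch of that step, assuming the parameter really is called shorturl as in the fragment above; batch_query_strings is a hypothetical helper name, and the example.com links are placeholders:

```python
from urllib.parse import urlencode

def batch_query_strings(short_urls, batch_size=15):
    """Yield one query string per API call, each with at most batch_size short URLs."""
    for start in range(0, len(short_urls), batch_size):
        batch = short_urls[start:start + batch_size]
        # A list of pairs lets the same parameter name repeat in the query string
        yield urlencode([('shorturl', u) for u in batch])

# Placeholder links for illustration only
queries = list(batch_query_strings(['http://example.com/%d' % i for i in range(32)]))
print(len(queries))
```

Each query string would then be appended to whatever base URL the expansion API uses.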

(Anyone know of a service that can expand links from the most popular shortening services via a single API, rather than, say, having to call each shortened URL to see where it actually points to?)
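Failing that, the fallback of calling each shortened URL can at least be made less painful by resolving each distinct link only once. A rough sketch: follow_redirects and expand_links are hypothetical names, and the resolver is passed in as a function so the caching logic can be tried out without touching the network.

```python
import urllib.request

def follow_redirects(url, timeout=10):
    """Open the URL, let urllib follow any redirects, and return where we ended up."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()

def expand_links(urls, resolve=follow_redirects):
    """Map each distinct URL to its expanded form, resolving each one only once."""
    expanded = {}
    for url in urls:
        if url in expanded:
            continue  # already resolved, no need to call out again
        try:
            expanded[url] = resolve(url)
        except (OSError, ValueError):
            # Dead or malformed links: keep the original rather than lose it
            expanded[url] = url
    return expanded
```

Calling expand_links(list_of_links) uses the network resolver by default; passing resolve=some_function swaps in a different lookup, such as a call to a shortener's own API.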

In passing, we can probably look to other services, such as delicious, to see who has been bookmarking URLs with the particular tag, and maybe even use these links in a custom search engine (though that may go against delicious’ terms and conditions). Similarly for Slideshare.

Just considering delicious again, we could also look to that service to see who bookmarked the ALTC2010 homepage – they may be folk we want to add into our context – but I don’t think delicious profiles include a personal homepage URL? What we can do, though, is look to see what tags folk were using to tag the ALTC2010 homepage (and maybe other links tweeted during the conference) to identify folksonomic keywords to associate with the context. We can pull the tag data for a link in from the delicious API, I think? Here’s an example from years ago (More Hyperbolic Tree Visualisations – delicious URL History: Users by Tag) about the sort of thing we can extract – tags and users around a URL as bookmarked on delicious:

Tags used to describe the altc2010 homepage on delicious

The outer leaves show the users who used the particular tag:

Altc2010 homepage tags by user on delicious

By the by, we can also look at users who bookmarked a link, and the tags they used, via another script (delicious URL History – Hyperbolic Tree Visualisation):

ALTC2010 homepage on delicious - tags by user
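Both trees are really just two indexes over the same bookmark records. Here is a minimal sketch, assuming the bookmarks for a URL have already been fetched into (user, tags) pairs; the sample records and the function name index_bookmarks are made up for illustration and are not real delicious data:

```python
from collections import defaultdict

def index_bookmarks(bookmarks):
    """Build tag -> users and user -> tags indexes from (user, tags) records."""
    users_by_tag = defaultdict(set)
    tags_by_user = defaultdict(set)
    for user, tags in bookmarks:
        for tag in tags:
            tag = tag.lower()  # treat ALTC2010 and altc2010 as the same tag
            users_by_tag[tag].add(user)
            tags_by_user[user].add(tag)
    return users_by_tag, tags_by_user

# Hypothetical records for illustration only
sample = [('user_a', ['altc2010', 'conference']), ('user_b', ['ALTC2010', 'edtech'])]
users_by_tag, tags_by_user = index_bookmarks(sample)
print(sorted(users_by_tag['altc2010']))
```

Tags used by several different people are probably the better candidates for folksonomic keywords, so sorting users_by_tag by the size of each set gives a first cut.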

Going back to the hashtaggers’ Twitter IDs, we can use the Google social graph API to find “aliases” of twitterati on other services (see for example Time to Get Scared, People? to see what I can find out from just the twitter name “ajcann”). One useful feature of this API is the ability to discover the URLs of several blogs etc that may be associated with an individual, rather than just the single link we get from a person’s Twitter profile.

Finally, George Roberts pointed out that there were “splitters” from the main hashtag who were using alternative forms. If those tags were being archived too, it would be easy enough to merge the archives client side before creating the context. In order to discover those tags, it might be possible to use the delicious tags as a crib, and do a date-limited search on the tags on Twapperkeeper just to see if any likely alternative tags turned up. To check that the guessed-at tags were relevant, we might do a quick social analysis of the friends/followers of folk using those tags to see if they have a significant overlap with folk using the “authorised” hashtag; if they do, we might reasonably assume the tag being used is an alternative one.
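Both steps are easy enough to sketch. The following assumes each archived tweet is a dict with an 'id' key, and uses Jaccard similarity as one possible measure of follower overlap; the function names, the record format and the choice of measure are all my own assumptions, and what counts as a “significant” overlap is left open:

```python
def merge_archives(*archives):
    """Merge tweet archives, keeping one copy of each tweet id, ordered by id."""
    merged = {}
    for archive in archives:
        for tweet in archive:
            # The first copy seen wins; later duplicates are ignored
            merged.setdefault(tweet['id'], tweet)
    return [merged[tweet_id] for tweet_id in sorted(merged)]

def audience_overlap(followers_a, followers_b):
    """Jaccard similarity of two follower sets: 0 means disjoint, 1 means identical."""
    a, b = set(followers_a), set(followers_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical data for illustration only
main = [{'id': 2, 'text': 'x'}, {'id': 1, 'text': 'y'}]
alternative = [{'id': 2, 'text': 'x'}, {'id': 3, 'text': 'z'}]
print([t['id'] for t in merge_archives(main, alternative)])
print(audience_overlap({'a', 'b', 'c'}, {'b', 'c', 'd'}))
```

A tweet that carried both hashtags turns up in both archives, which is why the merge is keyed on the tweet id.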

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
