OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Additional Thoughts on Generating a Persistent Context from an Event Tag

In Deriving a Persistent EdTech Context from the ALTC2010 Twitter Backchannel (aka ‘From community folksonomy to epistemology in a few clicks: Possibly the most useful post (ever?)’, via @georgeroberts;-), I showed how we could mine the tweets around an archived hashtag in order to generate a topic-based context that would persist long after the event itself was over.

So what else might we do? Here are a couple of quick thoughts…

Firstly, some folk were tweeting links using the hashtag, so we can scrape these from the Twapperkeeper archive and maybe use them to feed a facet of the search engine (e.g. one relating to sites/links tweeted during the event). That part of the search engine would then correspond to a fragmentary memory of the links deemed important at the time of the original event.

Here are a couple of fragments that could form the basis of the generating script. First, a link stripper to extract links from a tweet. Something like this should work:

import re

string="@sdsd http://sds.sd/sd?&dsd http://sds.sd/?r+dsd"
print re.findall(r'(?:http://|www\.)[^"\s]+', string)
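To turn that fragment into something that could feed the search engine, we might run it over a whole archive’s worth of tweet texts and collect the unique links. A minimal sketch, written for modern Python (the archive list is made-up data standing in for rows of a Twapperkeeper export):

```python
import re

# Same pattern as the link stripper above: http:// links or bare www. links.
LINK_RE = re.compile(r'(?:http://|www\.)[^"\s]+')

def links_from_archive(tweet_texts):
    """Collect the unique links mentioned across an archive of tweet texts,
    preserving the order in which they first appeared."""
    seen = []
    for text in tweet_texts:
        for link in LINK_RE.findall(text):
            if link not in seen:
                seen.append(link)
    return seen

# A couple of invented tweets standing in for archive rows:
archive = [
    '@sdsd interesting http://sds.sd/sd?&dsd #altc2010',
    'RT @other: http://sds.sd/sd?&dsd again, plus www.example.org/page #altc2010',
]
print(links_from_archive(archive))
# → ['http://sds.sd/sd?&dsd', 'www.example.org/page']
```

Deduplicating as we go means a link retweeted many times only feeds the search engine once.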

Secondly, we need to post full links rather than shortened links to the search engine. I noticed @AJCann was using a bit.ly URL in his twitter profile, and also a lot of tweeted links are shortened using bit.ly; so for those at least we can expand the links via the bit.ly API:

import simplejson,urllib

bu='psychemedia'
bkey=''

urls=[]
urls.append('http://bit.ly/AJCann')

#the bit.ly API accepts up to 15 shortUrl parameters per call;
#for simplicity we make one call per URL here
for i in urls:
  url='http://api.bit.ly/v3/expand?shortUrl='+urllib.quote(i)+'&login='+bu+'&apiKey='+bkey+'&format=json'
  print 'url: '+url
  r=simplejson.load(urllib.urlopen(url))
  for j in r['data']['expand']:
    if 'long_url' in j:
      print 'long '+j['long_url']
    else:
      print 'could not expand '+j.get('short_url','')

(Anyone know of a service that can expand links from the most popular shortening services via a single API, rather than, say, having to call each shortened URL to see where it actually points to?)
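Failing such a service, the brute-force fallback is to dereference each link ourselves and see where the redirects end up. A rough sketch in modern Python (the SHORTENERS set is illustrative rather than exhaustive, and resolve() makes a live network call):

```python
from urllib.parse import urlparse
from urllib.request import urlopen

# Hostnames of a few popular shorteners; extend as needed.
SHORTENERS = {'bit.ly', 'tinyurl.com', 'is.gd', 'ow.ly', 'j.mp'}

def is_shortened(url):
    """Cheap offline check: does the link come from a known shortener?"""
    return urlparse(url).hostname in SHORTENERS

def resolve(url):
    """Follow redirects to the final landing page (makes a network call)."""
    with urlopen(url) as response:
        return response.geturl()
```

Checking the hostname first avoids hammering every tweeted link with a request when most of them will not be shortened at all.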

In passing, we can probably look to other services, such as delicious, to see who has been bookmarking URLs with the particular tag, and maybe even use those links in a custom search engine (though that may go against Delicious’ terms and conditions). Similarly for Slideshare.
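By way of illustration, delicious exposed public JSON feeds keyed by tag. A sketch of how we might pull bookmarks for the event tag – note that the feed URL and the 'a'/'u'/'t' field names are from my recollection of the v2 feed format, so treat them as assumptions, and the sample payload is invented:

```python
import json
from urllib.parse import quote

def tag_feed_url(tag):
    """Build the (historical) delicious v2 JSON feed URL for a tag."""
    return 'http://feeds.delicious.com/v2/json/tag/' + quote(tag)

def bookmarks_from_feed(feed_json):
    """Pull (user, url, tags) triples out of a tag feed payload, where each
    item uses 'a' for author, 'u' for url and 't' for the tag list."""
    return [(item['a'], item['u'], item['t']) for item in json.loads(feed_json)]

# Invented sample payload in the assumed feed format:
sample = ('[{"u": "http://www.alt.ac.uk/altc2010/", "a": "someuser",'
          ' "t": ["altc2010", "conference"]}]')
print(tag_feed_url('altc2010'))
print(bookmarks_from_feed(sample))
```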

Just considering delicious again, we could also look to that service to see who bookmarked the ALTC2010 homepage – they may be folk we want to add into our context – but I don’t think delicious profiles include a personal homepage URL. What we can do, though, is look at the tags folk used when bookmarking the ALTC2010 homepage (and maybe other links tweeted during the conference) to identify folksonomic keywords to associate with the context. We can pull the tag data for a link in from the delicious API; here’s an example from years ago (More Hyperbolic Tree Visualisations – delicious URL History: Users by Tag) of the sort of thing we can extract – tags and users around a URL as bookmarked on delicious:

Tags used to describe the altc2010 homepage on delicious

The outer leaves show the users who used the particular tag:

Altc2010 homepage tags by user on delicious

By the by, we can also look at users who bookmarked a link, and the tags they used, via another script (delicious URL History – Hyperbolic Tree Visualisation):

ALTC2010 homepage on delicious - tags by user
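For the tag data itself, delicious also had a urlinfo feed keyed on the MD5 hash of the bookmarked URL. Something along these lines might pull the top tags for the ALTC2010 homepage – again, the endpoint and the top_tags field are recalled rather than checked, and the sample payload is made up:

```python
import hashlib
import json

def urlinfo_feed(url):
    """delicious keyed its urlinfo feed on the MD5 hash of the bookmarked URL."""
    h = hashlib.md5(url.encode('utf-8')).hexdigest()
    return 'http://feeds.delicious.com/v2/json/urlinfo/data?hash=' + h

def top_tags(urlinfo_json):
    """Extract the tag -> bookmark-count mapping from a urlinfo payload."""
    records = json.loads(urlinfo_json)
    return records[0].get('top_tags', {}) if records else {}

# Invented sample payload in the assumed urlinfo format:
sample = ('[{"url": "http://www.alt.ac.uk/altc2010/",'
          ' "top_tags": {"altc2010": 30, "conference": 12}}]')
print(top_tags(sample))
# → {'altc2010': 30, 'conference': 12}
```

Those tag counts are exactly the folksonomic keywords we might weight and fold into the persistent context.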

Going back to the hashtaggers’ Twitter IDs, we can use the Google social graph API to find “aliases” of twitterati on other services (see Time to Get Scared, People? for an example of what I can find out from just the twitter name “ajcann”). One useful feature of this API is the ability to discover the URLs of several blogs etc. that may be associated with an individual, rather than just the single link we get from a person’s Twitter profile.
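A sketch of what such a lookup might look like – the response shape, with its nodes/claimed_nodes keys, is from my memory of the Social Graph API docs, and the sample response is invented:

```python
import json
from urllib.parse import urlencode

def lookup_url(twitter_name):
    """Build a Google Social Graph API lookup for a Twitter account,
    asking for 'follow me elsewhere' (fme) edges."""
    q = 'http://twitter.com/' + twitter_name
    return ('http://socialgraph.apis.google.com/lookup?'
            + urlencode({'q': q, 'fme': '1', 'edo': '1'}))

def aliases(response_json):
    """Pull out the URLs the API claims belong to the same person."""
    data = json.loads(response_json)
    found = []
    for node in data.get('nodes', {}).values():
        found.extend(node.get('claimed_nodes', []))
    return found

# Invented sample response in the assumed format:
sample = json.dumps({'nodes': {'http://twitter.com/ajcann': {
    'claimed_nodes': ['http://scienceoftheinvisible.blogspot.com/']}}})
print(aliases(sample))
# → ['http://scienceoftheinvisible.blogspot.com/']
```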

Finally, George Roberts pointed out that there were “splitters” from the main hashtag who were using alternative forms. If those tags were being archived too, it would be easy enough to merge the two archives client side before creating the context. In order to discover those tags, it might be possible to use the delicious tags as a crib, and do a date-limited search on the tags on Twapperkeeper just to see if any likely alternative tags turned up. To check that the guessed-at tags were relevant, we might do a quick social analysis of the friends/followers of folk using those tags to see if they have a significant overlap with folk using the “authorised” hashtag; if they do, we might reasonably assume the tag is being used as an alternative one.
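That overlap check could be as simple as a Jaccard score over the two follower sets. A toy sketch with made-up follower IDs (the 0.2 threshold is an arbitrary placeholder, not a calibrated value):

```python
def follower_overlap(followers_a, followers_b):
    """Jaccard similarity between two sets of follower IDs: the size of the
    intersection divided by the size of the union."""
    a, b = set(followers_a), set(followers_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def looks_like_alternative_tag(main_tag_followers, candidate_tag_followers,
                               threshold=0.2):
    """Treat a candidate tag as an alternative form of the main hashtag when
    the communities using the two tags overlap by more than the threshold."""
    return follower_overlap(main_tag_followers, candidate_tag_followers) > threshold

# Made-up follower IDs for the two tag communities:
main = {'alice', 'bob', 'carol', 'dave'}
candidate = {'bob', 'carol', 'dave', 'erin'}
print(follower_overlap(main, candidate))
# → 0.6
```

With a 0.6 overlap, the candidate tag would comfortably clear the threshold and be treated as an alternative form of the main hashtag.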

Written by Tony Hirst

September 9, 2010 at 8:12 am

Posted in Thinkses


4 Responses


  1. Brizzly.com displays the full url when you read tweets through it, might be useful in the absence of anything better?

    Niamh

    September 9, 2010 at 8:28 am

  2. [...] be useful, one day… BlogAboutSearch « Discovering Context: Event Focusing Additional Thoughts on Generating a Persistent Context from an Event Tag [...]

  3. This just keeps looking better and better. Thanks Tony!

    John Rigdon

    September 9, 2010 at 10:14 pm

  4. [...] secondary products? Deriving a Persistent EdTech Context from the ALTC2010 Twitter Backchannel and Additional Thoughts on Generating a Persistent Context from an Event Tag – tag network mapping, custom search engines, blogrolls, twitter lists etc etc; mining the [...]


