OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Posts Tagged ‘altc2010’

Additional Thoughts on Generating a Persistent Context from an Event Tag

with 3 comments

In Deriving a Persistent EdTech Context from the ALTC2010 Twitter Backchannel (aka ‘From community folksonomy to epistemology in a few clicks: Possibly the most useful post (ever?)’, via @georgeroberts;-), I showed how we could mine the tweets surrounding an archived hashtag in order to generate a topic-based context that would persist long after the event had gone.

So what else might we do? Here are a couple of quick thoughts…

Firstly, some folk were tweeting links using the hashtag, so we can scrape these from the twapperkeeper archive and maybe use them to feed a facet of the search engine (e.g. relating to sites/links tweeted during the event). In this case, that part of the search engine would correspond to a fragmentary memory of links deemed important at the time of the original event.

Here are a couple of fragments that could form the basis of the generating script. Firstly, a link stripper to extract links from a tweet. Something like this should work:

import re
string="@sdsd http://sds.sd/sd?&dsd http://sds.sd/?r+dsd"
# A link runs from http(s):// or www. up to the next quote or whitespace
print(re.findall(r'(?:https?://|www\.)[^"\s]+', string))
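To turn the link stripper into something that could feed a "links tweeted during the event" facet, we might run it over every tweet in the archive and tally the hosts being linked to. What follows is only a minimal sketch: it is written for Python 3, the `links_by_host` name and the sample tweets are made up for illustration, and it assumes the archive has already been reduced to a plain list of tweet texts.

```python
import re
from collections import Counter
from urllib.parse import urlparse

LINK_RE = re.compile(r'(?:https?://|www\.)[^"\s]+')

def links_by_host(tweets):
    """Count how often each host is linked to across a list of tweet texts."""
    hosts = Counter()
    for text in tweets:
        for link in LINK_RE.findall(text):
            # urlparse needs a scheme to find the host, so add one to bare www. links
            if link.startswith('www.'):
                link = 'http://' + link
            hosts[urlparse(link).netloc.lower()] += 1
    return hosts

# Hypothetical tweets, standing in for rows from the archive
sample = [
    "Slides from my session http://bit.ly/abc123 #altc2010",
    "Worth a read: http://example.org/post?id=1 and www.example.org/about #altc2010",
    "No links here #altc2010",
]
print(links_by_host(sample).most_common())
```

Note that shortened links all pile up under the shortener's host, which is one reason for expanding them before doing anything else.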

Secondly, we need to post full links rather than shortened links to the search engine. I noticed @AJCann was using a bit.ly URL in his twitter profile, and also a lot of tweeted links are shortened using bit.ly; so for those at least we can expand the links via the bit.ly API:

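# Python 2: urllib.quote and urllib.urlopen live elsewhere in Python 3
import simplejson, urllib

bu = 'psychemedia'  # bit.ly login
bkey = ''           # bit.ly API key

urls = []
urls.append('http://bit.ly/AJCann')

# A single bit.ly API call can take up to 15 &shortUrl=URL pairs;
# for simplicity we make one call per short link here
for i in urls:
  url = 'http://api.bit.ly/v3/expand?shortUrl=' + urllib.quote(i) + '&login=' + bu + '&apiKey=' + bkey + '&format=json'
  print('url: ' + url)
  r = simplejson.load(urllib.urlopen(url))
  for j in r['data']['expand']:
    print('long ' + j['long_url'])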
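Since the comment in the script says a single call can carry up to 15 short links, it may be worth batching them rather than making one call per link. Here is a rough sketch of just the URL-building part, written for Python 3. The endpoint and parameter names are copied from the script; the `expand_request_urls` helper, the `KEY` placeholder and the demo links are my own inventions, and nothing here actually calls the API.

```python
from urllib.parse import urlencode

BITLY_EXPAND = 'http://api.bit.ly/v3/expand'  # endpoint as used in the script
MAX_PER_CALL = 15  # limit taken from the comment in the script

def expand_request_urls(short_urls, login, api_key):
    """Return one expand-API URL per batch of up to MAX_PER_CALL short links."""
    requests = []
    for start in range(0, len(short_urls), MAX_PER_CALL):
        batch = short_urls[start:start + MAX_PER_CALL]
        # Repeat the shortUrl parameter once per link in the batch
        params = [('shortUrl', u) for u in batch]
        params += [('login', login), ('apiKey', api_key), ('format', 'json')]
        requests.append(BITLY_EXPAND + '?' + urlencode(params))
    return requests

# Twenty made-up short links, so we expect two batches (15 + 5)
demo = ['http://bit.ly/link%d' % i for i in range(20)]
calls = expand_request_urls(demo, 'psychemedia', 'KEY')
print(len(calls))
```

Each URL in `calls` could then be fetched and parsed in the same way as the single-link version.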

(Anyone know of a service that can expand links from the most popular shortening services via a single API, rather than, say, having to call each shortened URL to see where it actually points to?)

In passing, we can probably look to other services, such as delicious, to see who has been bookmarking URLs with the particular tag, and maybe even use these links in a custom search engine (though that may go against delicious’ terms and conditions). The same goes for Slideshare.

Just considering delicious again, we could also look to that service to see who bookmarked the ALTC2010 homepage: they may be folk we want to add into our context, though I don’t think delicious profiles include a personal homepage URL. What we can do, though, is look at the tags folk used for the ALTC2010 homepage (and maybe for other links tweeted during the conference) to identify folksonomic keywords to associate with the context. We can pull the tag data for a link from the delicious API, I think. Here’s an example from years ago (More Hyperbolic Tree Visualisations – delicious URL History: Users by Tag) of the sort of thing we can extract, namely the tags and users around a URL as bookmarked on delicious:

Tags used to describe the altc2010 homepage on delicious

The outer leaves show the users who used the particular tag:

Altc2010 homepage tags by user on delicious

By the by, we can also look at users who bookmarked a link, and the tags they used, via another script (delicious URL History – Hyperbolic Tree Visualisation):

ALTC2010 homepage on delicious - tags by user

Going back to the hashtaggers’ Twitter IDs, we can use the Google social graph API to find “aliases” of twitterati on other services (see for example Time to Get Scared, People? to see what I can find out from just the twitter name “ajcann”). One useful feature of this API is the ability to discover the URLs of several blogs etc that may be associated with an individual, rather than just the single link we get from a person’s Twitter profile.

Finally, George Roberts pointed out that there were “splitters” from the main hashtag who were using alternative forms. If those tags were being archived too, it would be easy enough to merge the two archives client side before creating the context. In order to discover those tags, it might be possible to use the delicious tags as a crib, and do a data limited search on the tags on Twapperkeeper just to see if any likely alternative tags turned up. To check that the guessed-at tags were relevant, we might do a quick social analysis of the friends/followers of folk using those tags to see if they have a significant overlap with folk using the “authorised” hashtag; if they do, we might reasonably assume the tag being used is an alternative one.
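As a rough illustration of the merge and the overlap check, here is a Python 3 sketch. Everything in it is assumed rather than taken from Twapperkeeper: the tweet records are plain dicts with made-up `id`, `user` and `text` fields, `#altc10` is a hypothetical splitter tag, and the overlap measure is the simplest one I could think of (the fraction of one set that also appears in the other). The same function could be fed sets of friends or followers rather than sets of hashtag users.

```python
def merge_archives(*archives):
    """Merge lists of tweet dicts, keeping one copy of each tweet id."""
    merged = {}
    for archive in archives:
        for tweet in archive:
            # The first copy seen wins; later duplicates are ignored
            merged.setdefault(tweet['id'], tweet)
    return sorted(merged.values(), key=lambda t: t['id'])

def overlap(candidate, reference):
    """Fraction of the candidate set that also appears in the reference set."""
    candidate, reference = set(candidate), set(reference)
    if not candidate:
        return 0.0
    return len(candidate & reference) / len(candidate)

# Made-up archives for the "authorised" tag and a guessed-at alternative
main = [{'id': 1, 'user': 'ann', 'text': '#altc2010 keynote'},
        {'id': 2, 'user': 'bob', 'text': '#altc2010 #altc10 demo'}]
alt = [{'id': 2, 'user': 'bob', 'text': '#altc2010 #altc10 demo'},
       {'id': 3, 'user': 'cat', 'text': '#altc10 lunch'}]

merged = merge_archives(main, alt)
score = overlap({t['user'] for t in alt}, {t['user'] for t in main})
print(len(merged), score)
```

What counts as a "significant" overlap is a judgement call; a threshold would need to be picked by eyeballing a few known cases.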

Written by Tony Hirst

September 9, 2010 at 8:12 am

Posted in Thinkses

Deriving a Persistent EdTech Context from the ALTC2010 Twitter Backchannel

with 14 comments

So you’ve been to an event where everyone was tweeting, and now what? That stuff’s all in the past, right? Wrong…

Earlier today, I published a short post describing how it was possible to do all sorts of wonderful things around a twitter hashtag community (well I think they’re wonderful – or some of them, at least…). In this post, I’ll give a couple of illustrations using the #altc2010 hashtag from this year’s ALTC conference.

First up, what does the inner structure of the hashtag community look like? That is, of the Twitter folk using the twitter hashtag (in fact, folk who’ve used the hashtag more than three times over the last couple of days), who follows whom? In the following graph, nodes are individual twitterers, edges go from a person to a person they follow, node size and label size are proportional to the number of hashtaggers following the named person (that is, the in degree of the node) and colour is proportional to the number of hashtaggers an individual is following (out degree; red is “hot”/high).

ALTC-2010 hashtag community
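For what it's worth, the numbers behind the node sizes and colours are easy enough to compute before the graph ever gets to Gephi. The following Python 3 sketch assumes we already have a count of hashtagged tweets per user and a list of (follower, followed) pairs; the function name and the sample data are hypothetical.

```python
from collections import Counter

def community_degrees(tag_counts, follows, min_uses=4):
    """Return (in_degree, out_degree) over users with at least min_uses tagged tweets."""
    # "More than three times" means four or more uses of the hashtag
    members = {u for u, n in tag_counts.items() if n >= min_uses}
    in_deg = Counter({u: 0 for u in members})
    out_deg = Counter({u: 0 for u in members})
    for follower, followed in follows:
        # Keep only edges where both ends are inside the community
        if follower in members and followed in members:
            out_deg[follower] += 1
            in_deg[followed] += 1
    return in_deg, out_deg

# Made-up data: dan only used the tag once, so he drops out
tag_counts = {'ann': 10, 'bob': 5, 'cat': 4, 'dan': 1}
follows = [('ann', 'bob'), ('cat', 'bob'), ('bob', 'ann'),
           ('dan', 'ann'), ('ann', 'dan')]
in_deg, out_deg = community_degrees(tag_counts, follows)
print(dict(in_deg), dict(out_deg))
```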

This graph was produced using Gephi, which can also run stats over the graph. So for example, if we size the nodes according to betweenness, we can see which twitterers in the community are likely to be most effective at getting a message out across that community.

ALTC2010 betweenness centrality

Note that the ALT user is far and away the node with the highest betweenness score – the sizes of the other nodes are amplified just so we can see them…
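If you want a feel for what betweenness is measuring without firing up Gephi, here is a small pure Python 3 sketch of Brandes' algorithm for a directed, unweighted graph: for every node it adds up the share of shortest paths between other pairs of nodes that pass through it. The scores are unnormalised, so they may not match Gephi's figures exactly, and the code assumes each edge appears only once in the list.

```python
from collections import deque

def betweenness(nodes, edges):
    """Unnormalised betweenness centrality for a directed, unweighted graph."""
    succ = {n: [] for n in nodes}
    for a, b in edges:
        succ[a].append(b)
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma)
        order, preds = [], {n: [] for n in nodes}
        sigma = {n: 0 for n in nodes}
        dist = {n: -1 for n in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in succ[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Made-up chain: ann follows bob, bob follows cat, so bob sits in the middle
nodes = ['ann', 'bob', 'cat']
edges = [('ann', 'bob'), ('bob', 'cat')]
scores = betweenness(nodes, edges)
print(scores)
```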

If we grab the total number of followers and friends of each user (that is, including folk who have not used the hashtag and are not part of the hashtag community) and use that to set the size (number of followers) and colour (number of friends) of each user, we can see which twitterers are most likely to amplify the event outside of the community.

ALTC2010 total friends/followers

Okay, so what else can we do?

One thing is to create a twitter list containing the folk who’ve been using the ALTC2010 hashtag; you can find it here: ALTC2010 List

ALTC2010 hashtaggers list

We can also feed the address of this list into a Yahoo pipe (described here) that will search through recent tweets visible through the list for hashtags. In this way we can use the folk who were twittering around ALTC2010 to act as an early warning beacon for other hashtags or hashtagged events in the educational technology area.

ALTC2010 hashtag community - what else is hot?
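The pipe itself isn't reproduced here, but the gist of the hashtag spotting step can be sketched in a few lines of Python 3. This is not the pipe's actual logic, just my guess at an equivalent: it assumes the list's recent tweets are available as plain text strings, and the function name and sample tweets are invented.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r'#(\w+)')

def trending_tags(tweets, ignore=('altc2010',)):
    """Count hashtags across tweet texts, skipping the tags we already know."""
    skip = {t.lower() for t in ignore}
    counts = Counter()
    for text in tweets:
        # Count each tag at most once per tweet, case-insensitively
        for tag in {t.lower() for t in HASHTAG_RE.findall(text)}:
            if tag not in skip:
                counts[tag] += 1
    return counts

# Made-up tweets from members of the list
sample = [
    "Heading home from #altc2010, next stop #ukoer",
    "Anyone going to #ukoer next week?",
    "Great session #ALTC2010 #ukoer #UKOER",
    "Notes on #mobilelearning",
]
tags = trending_tags(sample)
print(tags.most_common(2))
```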

Something else we can do via the twitter list is grab everyone’s personal homepage URL, as declared on their twitter profile, and use these URLs to seed an ALTC2010 custom search engine.

ALTC2010 hashtaggers search engine

That is, a search engine over a good proportion of the personal pages of HE related UK educational technologists, as they declared themselves over Twitter circa September 2010.
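By way of illustration, here is roughly how the profile URLs might be tidied up before seeding the search engine. It is a Python 3 sketch under a couple of assumptions: that the profile URLs have already been pulled from the list members into a plain list (with `None` or blanks for folk who declare no homepage), and that the search engine wants site patterns of the form `host/path/*`. The `cse_patterns` name is made up, and the pattern style should be checked against the custom search engine documentation.

```python
from urllib.parse import urlparse

def cse_patterns(homepages):
    """Turn profile homepage URLs into sorted, de-duplicated 'host/path/*' patterns."""
    patterns = set()
    for url in homepages:
        url = (url or '').strip()
        if not url:
            continue  # plenty of profiles declare no homepage
        if '://' not in url:
            url = 'http://' + url  # urlparse needs a scheme to spot the host
        parts = urlparse(url)
        host = parts.netloc.lower()
        if host:
            patterns.add(host + parts.path.rstrip('/') + '/*')
    return sorted(patterns)

# Made-up profile URLs, including a duplicate, a missing one and a blank
profiles = ['http://example.org', 'http://Example.org/', None,
            'example.org/~ann/', '  ']
patterns = cse_patterns(profiles)
print(patterns)
```

Homepages that are shortened links would need expanding first, and a URL that points at a single file rather than a site or folder would want handling separately.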

[UPDATE: and here's an example of why the community defined custom search engine might be interesting... via @eingang: Ouch! David White And The Dragon Slaying]

So, there we have it. The scripts are in place, so generating the screenshots, and writing this post, took waaaaaaaaaay longer than mining the twapperkeeper archive, setting up the lists, generating the graph files (though I still had to load them into gephi, lay them out and render them “by hand” i.e. by clicking a couple of buttons…) and seeding the custom search engine (which also had to be initially set up by hand).

But why bother? Well, my developing idea is that we can mine events to define (automatically) a context around a particular subject area or domain (for example, a set of people interested in, or expert in, the area), and then draw on this context for search and discovery at a later date (e.g. through monitoring their twitter feeds via an auto-generated list to see what they – as a group of independent individuals – are talking about severally together, or by searching over just their personal webpages).

PS odds on some f*****r has patented this approach; if they have, this was all my own work, and it was bleedin’ obvious, so s***w you, m**********r… sue me.

PPS idly mulling over what else I could do with the custom search engine, I seem to remember that it’s possible to tweak the ranking factors of results returned from particular sites in the CSE definition file… which means we could take things like the number of twitter followers, or the betweenness centrality of everyone within the hashtag community, and use this as a ranking factor? That is, we might use the twitter “reputation” of an individual, either in general terms (overall number of followers, say), or within a community (e.g. betweenness centrality) to boost or reduce the ranking of results returned from their pages within the custom search engine. And if anyone else out there thinks they have a patent on that idea, they can f**k right off too, cos I haven’t got the idea from you, either…
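To make that a bit more concrete, here is one way the "reputation" numbers might be squashed into a boost/reduce weight per site, sketched in Python 3. It is all assumption: the -1 to 1 range is a guess at the sort of weight a CSE definition file might accept, the log scale is just there to stop one hugely followed account swamping everyone else, and the `ranking_weights` name and the sample figures are made up.

```python
import math

def ranking_weights(reputation, low=-1.0, high=1.0):
    """Map raw reputation scores onto [low, high] using a log scale."""
    if not reputation:
        return {}
    logs = {k: math.log1p(max(v, 0)) for k, v in reputation.items()}
    lo, hi = min(logs.values()), max(logs.values())
    if hi == lo:
        # Everyone has the same reputation, so give everyone the midpoint
        return {k: (low + high) / 2 for k in logs}
    return {k: low + (high - low) * (v - lo) / (hi - lo) for k, v in logs.items()}

# Made-up follower counts keyed by site pattern
followers = {'example.org/ann/*': 12000,
             'example.org/bob/*': 350,
             'example.org/cat/*': 0}
weights = ranking_weights(followers)
print(weights)
```

The same function would work with betweenness scores in place of follower counts.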

PPPS for a few immediate thoughts about where next with all of this, see Additional Thoughts on Generating a Persistent Context from an Event Tag

Written by Tony Hirst

September 8, 2010 at 8:19 pm

Posted in Tinkering, Visualisation
