Forever, it seems, we have had a problem of “information overload”. Way back when, in their dusty cells, scholarly monks would spend their days writing digests of books that had gone before, because there were so many books (?!) that very few people would be able to read them all, and know what insights they contained. And then, it seems, Shirky came along, amplified by Jarvis, Weinberger and other net culture thinkers and commentators, popularising the notion of “filter failure”.
We’ve also been hearing a lot lately about event amplification…
Almost three years ago now (three years, do you remember what the web was like, and what hadn’t been invented yet, three years ago?!), I presented what felt like an old idea to me, even at the time, at ILI2007 on the topic of “Search Hubs and Custom Search”. The idea was that there are lots of places where context already exists that can be used to mine links that might serve as custom search engines; that there were contexts that by their very nature brought together people and content relating to a particular topic, or domain. At that time, I demoed a Google Custom Search engine that searched over third party content linked to from an OU OpenLearn course, as well as something I’d been dabbling with for over a year even at that time: searchfeedr, a search engine that searched over domains listed in the links of an RSS feed pulled in from wherever (I think it still works? http://searchfeedr.com/), and which itself had its origins in a hack I’d called deliSearch, that would search over sites tagged in a particular way on delicious.
Yesterday, whilst reading a post on GigaOm (The Web of Intent is Coming (Sooner Than You Think)), yet another post on how “[t]here’s an emerging opportunity for content publishers (and the publishing technologies they rely upon) to dramatically improve how they filter the stream for the consumers they serve”, it struck me that I’ve never really moved on from thinking about where we might find “discovered search engines” (similar to the sense of “found objects”; I always did like Duchamp…;-).
So for example, over the weekend, I made some glue, pulling together a few scripts I had around what I’ve been calling “hashtag communities”, those groups of people who use a particular hashtag on Twitter, often around an event, and putting together a few new scripts (never more than a few lines of code each).
The scripts were, variously:
– a script for grabbing a hashtag archive via the Twapperkeeper API, and pulling out all the people who had sent more than a certain number of tweets using the particular hashtag;
– a script for taking a list of Twitter user IDs, and grabbing the lists of their friends and followers from the Twitter API;
– a script for identifying the friends and followers of an individual who had used a particular hashtag, for use in the creation of hashtag community graphs, showing the links between folk using a particular hashtag;
– a script for creating a Twitter list containing the folk who had used a particular hashtag from a list of Twitter user names (e.g. as grabbed from Twapperkeeper);
– a script for grabbing the details of members on a Twitter list, and grabbing the number of their friends and followers to add further depth to a hashtag community graph;
– a variant of the Twitter list member detail grabbing script, that pulled out the homepage URLs used on Twitter profiles and generated a Google custom search definition file (so you can easily search over the websites of folk listed in a Twitter list).
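By way of illustration, the last of those scripts might look something like the following sketch. The profile data is assumed to have already been pulled from the Twitter API (for example via a list members lookup), and the field names and search engine label are illustrative rather than definitive:

```python
# Hypothetical sketch: turn the homepage URLs from a set of Twitter
# profiles into a Google Custom Search Engine annotations file.
# Field names follow the classic Twitter user object ("url").
from xml.sax.saxutils import escape

def cse_annotations(profiles, label="_cse_hashtagcommunity"):
    """Build a CSE Annotations XML document from profile homepage URLs."""
    lines = ["<Annotations>"]
    for p in profiles:
        url = (p.get("url") or "").strip()
        if not url:
            continue  # plenty of profiles have no homepage set
        # Drop the scheme and search over the whole site
        site = url.split("://", 1)[-1].rstrip("/") + "/*"
        lines.append(
            '  <Annotation about="%s"><Label name="%s"/></Annotation>'
            % (escape(site, {'"': "&quot;"}), label)
        )
    lines.append("</Annotations>")
    return "\n".join(lines)

profiles = [
    {"screen_name": "example1", "url": "http://blog.example.com/"},
    {"screen_name": "example2", "url": None},
]
print(cse_annotations(profiles))
```

The resulting XML can then be uploaded as the annotations file for a custom search engine definition.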
So now, given a Twitter hashtag, assuming it’s been archived on Twapperkeeper, I can easily generate network graphs showing the interconnections of folk on Twitter using a particular hashtag, create Twitter lists based on hashtag users, and create a custom search engine around the folk who used a particular hashtag.
And what I haven’t done, but could quite easily do, are things like:
– generate a custom search engine around the links tweeted in the context of a hashtag (cf. DeliSearch, where searches were created around links tagged in a particular way in delicious);
– monitor the Twitter list of hashtag users and pull out hashtags they use in the future.
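The first of those is little more than a link-scraping pass over the archive; a minimal sketch, assuming the tweet texts have already been grabbed from Twapperkeeper, that reduces the tweeted links to a deduplicated set of sites that could seed a custom search engine definition:

```python
# Hypothetical sketch of the "links tweeted around a hashtag" idea:
# extract the URLs from a list of tweet texts and return the distinct
# hostnames they point at.
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def tweeted_sites(tweets):
    """Return the distinct hostnames linked to in a list of tweet texts."""
    sites = set()
    for text in tweets:
        for url in URL_RE.findall(text):
            host = urlparse(url).netloc
            if host:
                sites.add(host)
    return sorted(sites)

tweets = [
    "Great talk! slides at http://www.slideshare.net/example #altc2010",
    "Liveblog: http://blog.example.org/altc #altc2010",
]
print(tweeted_sites(tweets))
```

(One wrinkle in practice: tweeted links are often shortened, so you’d want to resolve the redirects before harvesting the hostnames.)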
If we look at how a particular tag is used more widely, e.g. on Slideshare, or delicious, maybe as part of an event amplification strategy, then it’s easy enough to see how we might start to concentrate all these resources: the new Lanyrd service looks like it might start to do something of this, and to a certain extent, Cloudworks also does so. It’s easy enough to hack something similar together too – simply generate a set of URLs around a tag for the RSS feeds from services like delicious, Slideshare, flickr, and so on, generate an OPML file, and for half a dozen lines of code or so you’ve built yourself a source file for something like Netvibes that can provide you with a readymade event monitoring dashboard.
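The half dozen lines of code might look something like the following sketch; the feed URL patterns are assumptions about how those services exposed tag-based RSS feeds at the time, so treat them as illustrative:

```python
# Hypothetical sketch: given an event tag, generate RSS feed URLs for a
# few tag-supporting services and wrap them in an OPML file that a
# dashboard service like Netvibes could import.
from xml.sax.saxutils import quoteattr

# Illustrative feed URL patterns, one per service
FEED_PATTERNS = {
    "delicious": "http://feeds.delicious.com/v2/rss/tag/%s",
    "slideshare": "http://www.slideshare.net/rss/tag/%s",
    "flickr": "http://api.flickr.com/services/feeds/photos_public.gne?tags=%s",
}

def event_opml(tag):
    """Build an OPML subscription list for an event tag."""
    outlines = "\n".join(
        '    <outline text=%s type="rss" xmlUrl=%s />'
        % (quoteattr("%s: %s" % (service, tag)), quoteattr(pattern % tag))
        for service, pattern in sorted(FEED_PATTERNS.items())
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<opml version="1.0">\n'
        "  <head><title>%s event feeds</title></head>\n"
        "  <body>\n%s\n  </body>\n</opml>" % (tag, outlines)
    )

print(event_opml("altc2010"))
```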
So the dashboard, then, might be seen as some sort of concentrator of activity around the event. But whereas the dashboard may provide a snapshot of an event, and may be useful in an archival context, the tools I’m interested in are ones where we can mine events to provide a context that can be used in the future: so for example, using this week’s ALTC2010 as an example, I can trivially generate a search engine around UK educational technologists using the recipe described above (if they participate in the use of the appropriate hashtag on Twitter), trivially create a list of ed tech Twitterers that I can monitor for future related events (by extracting hashtags currently in use by that community), and so on.
Whether this is another form of amplification, I’m not sure? I see it more as a way of using an event to focus on, or define, a context that may continue to be useful even when the original event has long been forgotten…
[PS I was going to pepper this post with links, but it’ll take too long to add them all; if you pick out likely phrases and search for them on the OUseful search engine, they should (?) turn up related blog posts.]
PPS for a “worked example” of some of the above, see Deriving a Persistent EdTech Context from the ALTC2010 Twitter Backchannel