OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for September 2011

A Couple More Webrhythm Identifying Tools

with one comment

I love the idea of trendspotting using tools that expose something of the rhythm of life on the web, so was intrigued to see another app in the area of Twitter trends via an email from @R3beccaF: timeu.se

timeu.se - web trends

In contrast to the 1-day, 7-day, 28-day history view offered by services like Trendistic (or in the web search area, Google Trends or Google Insights for Search, for example), timeu.se offers views binned by hour of day or day of week, and allows you to display these as line charts or as a heatmap:

timeu.se
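As a rough illustration of that sort of hour-of-day / day-of-week binning, here’s a minimal Python sketch. It assumes you already have a list of timestamps for tweets matching a term; the sample data and function names are made up for illustration, and it isn’t a description of how timeu.se itself works.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical sample: timestamps of tweets that mention some search term.
timestamps = [
    "2011-09-26T08:15:00",
    "2011-09-26T08:45:00",
    "2011-09-27T13:05:00",
    "2011-09-30T08:30:00",
]

def bin_by_hour(stamps):
    """Count items per hour of day (0-23), ignoring the date."""
    return Counter(datetime.fromisoformat(s).hour for s in stamps)

def bin_by_weekday(stamps):
    """Count items per day of week, ignoring the time."""
    return Counter(DAYS[datetime.fromisoformat(s).weekday()] for s in stamps)

def heatmap_cells(stamps):
    """Count items per (weekday, hour) cell: the raw material for a heatmap."""
    cells = Counter()
    for s in stamps:
        when = datetime.fromisoformat(s)
        cells[(DAYS[when.weekday()], when.hour)] += 1
    return cells

print(bin_by_hour(timestamps))
print(bin_by_weekday(timestamps))
print(heatmap_cells(timestamps))
```

The line chart views correspond to the first two counters, and the heatmap to the third.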

There is also a scatterplot display that allows you to compare two terms, but I’m not totally sure what the axes represent?

Another related tool I don’t think I’ve blogged about before is Tweetolife, which again logs activity around a particular term on Twitter by time of day:

Tweetolife - webrhythms

Interestingly, it also attempts to expose gender differences between male and female Twitter users:

tweetolife

PS loosely related, I notice that Google is about to start offering realtime Google Analytics

PPS something I haven’t really explored, but that possibly complements trend analysis, is sentiment analysis. Martin has started tinkering around this already: Using the Viralheat Sentiment API and a Google Spreadsheet of conference tweets to find out how that keynote went down

Written by Tony Hirst

September 30, 2011 at 9:53 am

A Quick Intro to Google Custom Search Engine Definition Files

In Search Engine Powered Courses…, I took an initial, baby step to demonstrate one way in which a promoted link might be used within a course specific custom search engine. In the next post in this series, I will describe how to influence the positioning of results within a Google custom search engine by boosting their ranking, as well as how results may be ‘faceted’ into different results sets through the use of labels.

In this post, I thought it would be worth taking a step back and reviewing the three configuration files we have access to when defining a Google custom search engine: the configuration file, the promotions file, and the annotations file. If you create a minimal Google custom search engine using the CSE management tools, and then go to the Advanced page, you will see options that allow you to upload the configuration and annotations files. The promotions file can be imported via the Promotions page.

So what does each of these files do?

  • The configuration file defines the top level configuration of the search engine. The easiest way of obtaining a template for a CSE is to create a minimal search engine using the CSE management tools, and then export the configuration file from the Advanced page. The configuration file defines, among other things: whether the search engine will search over the whole web, prioritising (or ‘BOOSTing’) sites and pages indexed explicitly by the CSE, or whether it will just return results from the explicitly indexed pages (a FILTER style search engine); a definition of the labels, or facets, that allow different search refinements to be applied as different search strategy contexts within the CSE; some styling information; and information relating to Subscribed Links (more of them in another post, if they’re still supported by then…).
  • The promotions file allows you to define promoted links within a CSE; in Search Engine Powered Courses…, I give an example of how these might be used in a course search engine.
  • The annotations file identifies the sites and pages that are specific members of the CSE index, as well as how they should be handled (eg the extent to which they should be positively or negatively boosted in the search engine results listing, whether they should appear in the top few results, and what labels or facets should apply to them).

It’s also possible to customise the styling/presentation of the search engine, but that’s a shiny, shiny feature, so probably not something I’ll be looking at…

PS I just noticed you can now manage Google Analytics settings for custom search engines (which allows you to log search queries) from within the CSE control panel… I’m still not sure how easy it is to track which results get clicked through, though?

Written by Tony Hirst

September 28, 2011 at 10:33 am

Posted in Search

Rediscovering Playlists…

Yesterday, I had a quick peek at the beta version of SocialLearn (currently open to OU staff, at least…). A key feature of the site is “learning paths”, ordered sets of annotated resources with associated progress status indicators:

I haven’t had a proper play with the site yet, so I will hold off a review just for now, but my first glimpse reaction to that feature was: “isn’t that what H2O Playlists did?”

(I thought I must have posted some sort of review of H2O Playlists, “shared list[s] of readings and other content about a topic of intellectual interest”, but it seems I only made passing mention of them. However, I do remember creating several H2O playlists as a way of curating links associated with several presentations I gave when I was trying to advocate the use of social bookmarking in education. For a review of H2O Playlists posted elsewhere around the same time, see More on H20 Playlist as a Social Bookmarking Tool for Business.)

H2O Playlists

Hmmm.. it seems I misremembered: you couldn’t check off progress on the playlist items, though you could save items off one list onto your own playlist, and you could also discover “related lists” that shared some of the same items.

Also yesterday, I came across the BBC Food Recipe Binder site:

BBC Food Recipe Binder

This site lets you save, and annotate, recipes on the BBC Food site to a personal “Recipe Binder” page from an on-page call to action button.

BBC Food - Recipe Binder

(Okay, so the Recipe Binder isn’t a playlist, but it is an example of embedded bookmarking/personal curation of web resources…)

And then, today, I fire up my feeds to see all sorts of chatter about the delicious website redesign, a key feature of which appears to be… stacks (aka playlists):

The mechanics for putting together the playlists still seem a bit clunky (do I really need to add three links to create a new playlist?) but I guess it’s still early days… Anyway, here’s my first playlist stack: Crafty Stats…

Suddenly, it seems like 2005 again…

Written by Tony Hirst

September 27, 2011 at 9:29 am

Feed Autodiscovery in Javascript

For what it’s worth, I’ve posted a demo showing a couple of feed autodiscovery/autodetection tricks that let you autodiscover feeds in remote pages via a couple of online services: the Google feed api, and YQL (Feed Autodiscovery With YQL).

Try it out: Feed autodiscovery in Javascript (code)

Single page web app: feed autodetection

I’ve also added in a routine that uses the Google feed api to look up historical entries on an RSS feed. As soon as Google is alerted to a feed (presumably by anyone or any means), it starts cacheing entries. The historical entries API lets you grab up to 250 of the most recent entries from a feed, irrespective of how many items the feed itself currently contains…
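For reference, the autodiscovery convention itself is just a matter of looking for link elements with rel="alternate" and a feed MIME type in a page’s HTML. Here’s a minimal Python sketch of that check using only the standard library; it works on an HTML string you have already fetched, doesn’t touch the Google feed api or YQL, and the class and function names are my own.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

class FeedLinkParser(HTMLParser):
    """Collect the URLs of autodiscoverable feeds declared in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))

def discover_feeds(html, base_url):
    """Return a list of feed URLs found in the given HTML."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds

# Hypothetical page fragment for illustration.
sample = """<html><head>
<link rel="alternate" type="application/rss+xml" title="Posts" href="/feed/">
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/atom+xml" href="http://example.org/atom.xml" />
</head><body></body></html>"""

print(discover_feeds(sample, "http://example.org/blog/"))
```

Doing this against remote pages from browser Javascript runs into cross-domain restrictions, which is why going via an online service is handy there.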

Why it matters: Public Data Principles: RSS Autodiscovery on Government Department Websites?, Autodiscoverable Feeds and UK HEIs (Again…)

PS Just by the by, I added a Scraperwiki view to my UK HEI autodiscovered feeds Scraperwiki. I added a little bit of logic to try to pull out feeds on a thematic basis too…

UK HE autodiscoverable feeds

On the to do list is to create some OPML output views so you can easily subscribe to, or display, batches of the feeds in one go.

I guess I should also add a table to the scraper to start logging the number of feeds that are autodiscoverably out there over time?

Written by Tony Hirst

September 22, 2011 at 12:22 pm

Twitter and the Telly…and the Sale of Twapperkeeper

with 4 comments

Two and a half years ago or so, I cobbled together a Yahoo Pipe that created a caption/subtitle feed file from a hashtag based search on Twitter. The idea was that if folk were using a tag to tweet around a video event, such as a conference video feed or TV broadcast, and a recording of the same video feed was then uploaded to Youtube, you could watch the original “live” Twitter stream as captions on the original recording (Wikipedia: Twitter subtitling). Since then, I’ve pushed the thinking along a bit and Martin Hawksey has pushed the code along, demonstrating among other things how to create twitter based iPlayer captions (Martin: maybe we need to tell ‘the BBC’s “King of subtitles”, @amcp [h/t @nevali]?! ;-) and “as-if live” tweet capture to fold back twitter updates from a viewing of the replay into the original social stream.
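To give a flavour of the tweets-as-captions idea, here’s a minimal Python sketch that turns timestamped tweets into SubRip (SRT) style caption entries, timed relative to the start of the recording. Note the assumptions: SRT is just a convenient, widely used caption format for illustration rather than necessarily what the original pipe produced, and the sample tweets, function names and five second display time are all made up.

```python
from datetime import datetime, timedelta

def srt_time(delta):
    """Format a timedelta as an SRT timecode, HH:MM:SS,mmm."""
    total_ms = delta // timedelta(milliseconds=1)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"

def tweets_to_srt(tweets, start, display_seconds=5):
    """Build SRT text from (iso_timestamp, text) pairs.

    Tweets sent before the recording started are skipped.
    """
    timed = sorted((datetime.fromisoformat(ts), text) for ts, text in tweets)
    entries = []
    for when, text in timed:
        offset = when - start
        if offset < timedelta(0):
            continue
        end = offset + timedelta(seconds=display_seconds)
        index = len(entries) + 1
        entries.append(f"{index}\n{srt_time(offset)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(entries)

# Hypothetical hashtagged tweets and broadcast start time.
tweets = [
    ("2011-09-19T20:01:05", "@b: Good point about archives #event"),
    ("2011-09-19T20:00:10", "@a: And we're off! #event"),
    ("2011-09-19T19:59:00", "@c: Waiting for the stream to start #event"),
]
start = datetime(2011, 9, 19, 20, 0, 0)

print(tweets_to_srt(tweets, start))
```

If the tweets were captured during a replay rather than the live event, the same offset calculation would apply, just measured from the time the viewer pressed play.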

Now it seems as if there’s a new app on the block, that whilst not directed directly at the TV replay market, certainly lists it as a use case: Rewinder [via Follow tweets about recorded TV shows as if you were watching live, with Rewinder, h/t Laura James]. The idea appears to be that you can (I wonder if this service is built on top of the (metered) Datasift?): watch your timeline or a hashtag as it happened in realtime!

When Martin and I were bouncing round ideas for Twitter subtitles, we were both using Twapperkeeper as the backend archiver, but following Twitter’s API licensing changes earlier this year, Martin has been taking the Google hacker approach and looking at all sorts of ways of using Google spreadsheets for archiving tweets. As well as running a hosted service, Twapperkeeper’s developer was (I believe) also funded by JISC to develop an open source version, reflecting JISC’s continuing interest in exploiting – and archiving – social media backchannels and community support, particularly around events and (increasingly) JISC projects. Andy Powell’s Summarizr is one service that demonstrates what we can start to glean from tag archives. (ThinkUp App is another open source app that allows you to archive your social media activity.)

Although I only spotted it today, it appears that Hootsuite acquired Twapperkeeper a couple of days ago (congrats @jobrieniii). Although JISC managed to help get the open source YourTwapperKeeper code released, I wonder whether plans are in place surrounding any continued access to data in the Twapperkeeper archive following the sale, especially given the legal wranglings between Twitter and Twapperkeeper regarding API access to Twapperkeeper data earlier this year? (For the record, I think it’s great that Twapperkeeper got acquired, and wonder whether this might lead JISC to consider whether or not there may be opportunities in occasionally acting as an angel or venture funder? I don’t think this is entirely without precedent: wasn’t 3i originally UK government backed?)

Quite by chance, another TV’n'Twitter post passed through my feeds today: Trendrr launches social media tools for TV networks. I guess the operators of social media dashboards that are based around Twitter are all getting twitchy (hmmm… Twitter, tweet, twitch… So what’s a twitch then? Reminds me of the tweverything days, when we twere all going to become tweachers…), and so I guess we may see a flurry of sector specific dashboards appearing (I wonder if @briankelly is currently seeking out VC funding for an education industry social media monitoring spinoff, notwithstanding my recent mutterings about What’s the Point of Social Media Metrics??!;-)

Just in passing, if you’re interested in the sorts of Twitter based analysis that are possible, see: Engaging News Hungry Audiences Tweet by Tweet: An audience analysis of prominent mainstream media news accounts on Twitter, or this tutorial – Twitter Research Methods – from the folk at Mapping Online Publics (who also put together this First (Twitter) Map of Australia).

And finally, as much a note to self as anything else, here’s an interesting social network analysis walk through based on a Twitter data set: Combing Through the Infovis Twitter Network Hairball. I’ve just started playing with the dexy.it automated documentation framework, and this recipe looks like something it might be quite fun to try to automate…

PS related – see the comments on Joss Winn’s response to this post: Universities as Venture Capitalists. And quite by chance, via Paul Stainthorp, I notice a Research And Enterprise [Office] Merger at Lincoln. Which makes me wonder again about whether it’s Time for TechCrunch, Academic?, and whether there are any other UK HEIs out there with blogging enterprise and innovation offices? Hmm… If I wanted to find out the bit of the university that tried to make money from spinouts, where would I look in the Linking You model (URI conventions for HE)?

Written by Tony Hirst

September 19, 2011 at 5:50 pm

Posted in Anything you want

Search Engine Powered Courses…

with 2 comments

How can we use customised search engines to support uncourses, or the course models used to support MOOC style offerings?

To set the scene, here’s what Stephen Downes wrote recently on the topic of How to participate in a MOOC:

You will notice quickly that there is far too much information being posted in the course for any one person to consume. We tried to start slowly with just a few resources, but it quickly turns into a deluge.

You will be provided with summaries and links to dozens, maybe hundreds, maybe even thousands of web posts, articles from journals and magazines, videos and lectures, audio recordings, live online sessions, discussion groups, and more. Very quickly, you may feel overwhelmed.

Don’t let it intimidate you. Think of it as being like a grocery store or marketplace. Nobody is expected to sample and try everything. Rather, the purpose is to provide a wide selection to allow you to pick and choose what’s of interest to you.

This is an important part of the connectivist model being used in this course. The idea is that there is no one central curriculum that every person follows. The learning takes place through the interaction with resources and course participants, not through memorizing content. By selecting your own materials, you create your own unique perspective on the subject matter.

It is the interaction between these unique perspectives that makes a connectivist course interesting. Each person brings something new to the conversation. So you learn by interacting rather than by merely consuming.

When I put together the OU course T151, the original vision revolved around a couple of principles:

1) the course would be built in part around materials produced in public as part of the Digital Worlds uncourse;

2) each week’s offering would follow a similar model: one or two topic explorations, plus an activity and forum discussion time.

In addition, the topic explorations would have a standard format: scene setting, and maybe a teaser question with answer reveal or call to action in the forums; a set of topic exploration questions to frame the topic exploration; a set of resources related to the topic at hand, organised by type (academic readings (via a libezproxy link for subscription content so no downstream logins are required to access the content), Digital Worlds resources, weblinks (industry or well informed blogs, news sites etc), audio and video resources); and a reflective essay by the instructor exploring some of the themes raised in the questions and referring to some of the resources. The aim of the reflective essay was to model the sort of exploration or investigation the student might engage in.

(I’d probably just have a mixed bag of resources listed now, along with a faceting option to focus in on readings, videos, etc.)

The idea behind designing the course in this way was that it would be componentised as much as possible, to allow flexibility in swapping resources or even topics in and out, as well as (though we never managed this), allowing the freedom to study the topics in an arbitrary order. Note: I realised today that to make the materials more easily maintainable, a set of ‘Recent links’ might be identified that weren’t referred to in the ‘My Reflections’ response. That is, they could be completely free standing, and would have no side effects if replaced.

As far as the provision of linked resources went, the original model was that the links should be fed into the course materials from an instructor maintained bookmark collection (for an early take on this, see Managing Bookmarks, with a proof of concept demo at CourseLinks Demo). Hmmm, everything except the dynamic link injection appears to have rotted :-(

The design of the questions/resources page was intended to have the scoping questions at the top of the page, and then the suggested resources presented in a style reminiscent of a search engine results listing, the idea being that we would present the students with too many resources for them to comfortably read in the allocated time, so that they would have to explore the resources from their own perspective (eg given their current level of understanding/knowledge, their personal interests, and so on). In one of my more radical moments, I suggested that the resources would actually be pulled in from a curated/custom search engine ‘live’, according to search terms specially selected around the current topic and framing questions, but I was overruled on that. However, the course does have a Google custom search engine associated with it which searches over materials that are linked to from the course.

So that’s the context…

Where I’m at now is pondering how we can use an enhanced custom search engine as a delivery platform for a resource based uncourse. So here’s my first thought: using a Google Custom Search Engine populated with curated resources in a particular area, can we use Google CSE Promotions to help scaffold a topic exploration?

Here’s my first promotions file:

<Promotions>
   <Promotion id="t151_1a" 
        queries="topic 1a, Topic 1A, topic exploration 1a, topic exploration 1A, topic 1A, what is a game, game definition" 
        title="T151 Topic Exploration 1A - So what is a game?" 
        url="http://digitalworlds.wordpress.com/2008/03/05/so-what-is-a-game/"
        description="The aim of this topic is to think about what makes a game a game. Spend a minute or two to come up with your own definition. If you're stuck, read through the Digital Worlds post 'So what is a game?'"
        image_url="http://kmi.open.ac.uk/images/ou-logo.gif" />
</Promotions>

It’s running on the Digital Worlds Search Engine, so if you want to try it out, try entering the search phrase what is a game or game definition.

T151 CSE promotion - game definition

(This example suggests to me that it would also make sense to use result boosting to boost the key readings/suggested resources I proposed in the topic materials so that they appear nearer the top of the results. That’ll be the focus of a future post ;-)

The promotion displays at the top of the results listing if the specified queries match the search terms the user enters. My initial feeling is that to bootstrap the process, we need to handle:

- queries that allow a user to call on a starting point for a topic exploration by specifically identifying that topic;
- “naive queries”: one reason for using the resource-search model is to try to help students develop effective information skills relating to search. Promotions (and result boosting) allow us to pick up on anticipated naive queries (or popular queries identified from search logs), and suggest a starting point for a sensible way in to the topic. Alternatively, they could be used to offer suggestions for improved or refined searches, or search strategy hints. (I’m reminded of Dave Pattern’s work with guided searches/keyword refinements in the University of Huddersfield Library catalogue in this context).

Here’s another example using the same promotion, but on a different search term:

T151 CSE - topic 1a

Of course, we could also start to turn the search engine into something like an adventure game engine. So for example, if we type: start or about, we might get something like:

T151 CSE - start

(The link I associated with start should really point to the course introduction page in the VLE…)

We can also use the search context to provide pastoral or study skills support:

T151 CSE - pastoral

These sort of promotions/enhancements might be produced centrally and rolled out across course search engines, leaving the course and discipline related customisations to the course team and associated subject librarians.
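If promotions were being produced centrally, you’d probably want to generate the promotions file from a simple table of topics rather than hand editing XML. Here’s a minimal Python sketch of that, using only the element and attribute names that appear in my promotions file above; the second row, with its example.org URL, is a made up study skills enhancement, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def build_promotions(rows):
    """Serialise a list of promotion dicts as a <Promotions> XML string."""
    root = ET.Element("Promotions")
    for row in rows:
        attributes = {
            "id": row["id"],
            "queries": ", ".join(row["queries"]),
            "title": row["title"],
            "url": row["url"],
            "description": row["description"],
        }
        # image_url is treated as optional in this sketch.
        if row.get("image_url"):
            attributes["image_url"] = row["image_url"]
        ET.SubElement(root, "Promotion", attributes)
    return ET.tostring(root, encoding="unicode")

topics = [
    {
        "id": "t151_1a",
        "queries": ["topic 1a", "topic exploration 1a",
                    "what is a game", "game definition"],
        "title": "T151 Topic Exploration 1A - So what is a game?",
        "url": "http://digitalworlds.wordpress.com/2008/03/05/so-what-is-a-game/",
        "description": "The aim of this topic is to think about what makes a game a game.",
        "image_url": "http://kmi.open.ac.uk/images/ou-logo.gif",
    },
    {
        # Hypothetical generic enhancement with a placeholder URL.
        "id": "study_skills_notes",
        "queries": ["note taking", "taking notes"],
        "title": "Study skills: taking notes",
        "url": "http://example.org/study-skills/notes",
        "description": "Hints on taking useful notes as you explore a topic.",
    },
]

print(build_promotions(topics))
```

The generic rows could then be maintained in one place and merged with each course team’s own rows before the file is imported via the Promotions page.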

Just a final note: ignoring resource limitations on Google CSEs for a moment, we might imagine the following scenarios for their roll out:

1) course wide: bespoke CSEs are commissioned for each course, although they may be supplemented by generic enhancements (eg relating to study skills);

2) qualification based: the CSE is defined at the qualification level, and students call on particular course enhancements by prefacing the search with the course code; it might be that students also see a personalised view of the qualification CSE that is tuned to their current year of study.

3) university wide: the CSE is defined at the university level, and students call on particular course or qualification level enhancements by prefacing the search with the course or qualification code.
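As a thought experiment for scenarios 2 and 3, here’s a minimal Python sketch of how a front end might split a course code prefix off a query before deciding which enhancements to apply. Everything here is hypothetical: the function name, the set of known codes, and the idea that the split happens in our own code rather than inside the CSE itself.

```python
def split_course_prefix(query, known_codes):
    """Split 'T151 game definition' into ('T151', 'game definition').

    Returns (None, query) when the query does not start with a known code,
    or when there is nothing left to search for after the code.
    """
    cleaned = query.strip()
    head, _, rest = cleaned.partition(" ")
    rest = rest.strip()
    if head.upper() in known_codes and rest:
        return head.upper(), rest
    return None, cleaned

# Hypothetical set of course codes the search engine knows about.
codes = {"T151"}

print(split_course_prefix("t151 game definition", codes))
print(split_course_prefix("game definition", codes))
```

A qualification level engine could use the returned code to pick the matching set of promotions, falling back to the generic ones when no code is given.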

Written by Tony Hirst

September 15, 2011 at 2:03 pm

From “Special Result” to “Promotion” in Google CSEs

In passing, I noticed I had a broken link to a Google CSE documentation page:

http://code.google.com/apis/customsearch/docs/special_results.html

Searching a little, I found the page had moved to

http://code.google.com/apis/customsearch/docs/promotions.html

A cached version of the originally linked page is still available, so I did a side-by-side comparison:

From 'special result' to 'promotion'

Hmmm…

Written by Tony Hirst

September 15, 2011 at 11:25 am

Posted in Search

Course Librarians and Search Assist…

For all their success in attracting universities to adopt Google Apps (Tradition meets technology: top universities using Apps for Education), it’s not obvious to me how – or even if – Google is actually doing much around search signal detection and innovation in an educational context?

I’ve floated this a couple of times before (eg Could Librarians Be Influential Friends? And Who Owns Your Search Persona? and Integrating Course Related Search and Bookmarking?), but with yet another announcement from Google about how they’re incorporating social signals into search rankings (Hide sites from anywhere in the world: “We’ve … started incorporating data about sites people have blocked into our general search ranking algorithms to help users find more high quality sites.”), I’m going to raise it again…

To what extent are course and subject librarians setting up course/subject personas that engage in recommending and sharing high quality links in an appropriate social context, and encouraging students to follow those accounts in order to benefit from personalisation of search results based on social signals?

Furthermore, to what extent might the development of search personas represent the creation of a “scholarly agent” that can be used to offer “search assist” to followers of that agent/persona?

I don’t find it that hard to imagine myself taking a course, following the course recommender on a social network (an account that might send out course related reminders as well as relevant links), with an icon depicting my university and the associated course, that on occasion appeared to “recommend” links to me when I was searching for topics relating to my course. (In the normal scheme of things, it wouldn’t actively be recommending links to me, of course. For that, I’d need to subscribe to something like Subscribed Links, as mentioned in Integrating Course Related Search and Bookmarking?.)

Written by Tony Hirst

September 14, 2011 at 9:12 pm

Posted in OU2.0, Search, SEO

What’s the Point of Social Media Metrics?

with 8 comments

In a week when Twitter finally announces it will be releasing an analytics package (GigaOm: Twitter offers analytics to try and prove its value) that will allow website owners to track activity around links that are shared to their site on Twitter (or more specifically, via t.co links?), I notice that @briankelly is banging on again about social media metrics (Bath is the University of the Year! But What if Online Metrics Were Included?).

To try and clarify some of my own thinking on this, here are a handful of things I think media metrics (online and offline) are typically used for:

  • selling ads: publishers deliver advertisers audiences with a particular demographic. Publishers create publications that pull together audiences in particular demographic groups so that they can sell access to those groups to advertisers. So things like ABC figures… (although maybe they are falling out of favour? UBM’s ABC exit shows how publishers are moving from measuring users to building relationships /via, err, @paulbradshaw, I think…);
  • measuring returns on investment: if you put a call to action out to an audience (for example, by advertising something, or putting it into a catalogue that you expect people to buy from), it helps if you know whether that call to action got a response, or generated some sort of return on the cost of making the call to action. Google reinvented everything when they worked out how to find a way of pricing ads and charging for them when someone actually clicked through… (and Google analytics then helps folk track whether these click-thrus actually result in things like online sales).
  • ranking: if you’re measuring things like ‘reputation’ (whatever that is…?) it makes sense to ask why. One reason might be to use it as a signal to help organise the ranking or ordering of search results in a large set of results.
  • recommendation: looking for clusters in data so that when someone picks one item in a cluster, you can recommend the rest;
  • discovering new segments: another application of clustering, trying to find new audience groupings as part of a product development exercise, perhaps?

My own interest in things like hashtag community graphs has more to do with finding collections and understanding the structure and makeup of a system, but for no other reason than collection building and personal curiosity about how it may be structured…;-)

So – what else have I missed?

PS this is interesting, and bits are maybe related – Battelle on The Future of Twitter Ads. In particular, his thoughts on how Twitter might execute ad targeting:

- Interest targeting. Twitter will expose a dashboard that allows advertisers to target users based on a set of interests. … There are plenty of clear signals: What a user posts, of course. But also what he or she retweets, replies to, clicks on in someone else’s tweet, or who they follow (and who that followed person follows, and, and….).
- Geotargeting.
- Audience targeting. I’d expect that at some point, Twitter will expose various audience “buckets” to the marketer for targeting based on unique signals that Twitter alone has views into. These might include “active retweeters,” “influencers,” or “tastemakers”
- Demographic targeting. This one I’m less certain of…
- Device/location targeting.

The question is, when it comes to social media metrics in Higher Education, what are we trying to do with them?

Related: Forget the online traffic tricks and start measuring value [h/t @paulbradshaw]

Written by Tony Hirst

September 14, 2011 at 11:25 am

Data Journalists Engaging in Co-Innovation…

with 3 comments

You may or may not have noticed that the Boundary Commission released their take on proposed parliamentary constituency boundaries today.

They could have released the data – as data – in the form of shape files that can be rendered at the click of a button in things like Google Maps… but they didn’t… [The one thing the Boundary Commission quango forgot to produce: a map] (There are issues with publishing the actual shapefiles, of course. For one thing, the boundaries may yet change – and if the original shapefiles are left hanging around, people may start to draw on these now incorrect sources of data once the boundaries are fixed. But that’s a minor issue…)

Instead, you have to download a series of hefty PDFs, one per region, to get a flavour of the boundary changes. Drawing a direct comparison with the current boundaries is not possible.

The make-up of the actual constituencies appears to be based on their member wards, data which is provided in a series of spreadsheets, one per region, each containing several sheets describing the ward makeup of each new constituency for the counties in the corresponding region.

It didn’t take long for the data junkies to get on the case though. From my perspective, the first map I saw was on the Guardian Datastore, reusing work by University of Sheffield academic Alasdair Rae, apparently created using Google Fusion Tables (though I haven’t seen a recipe published anywhere? Or a link to the KML file that I saw Guardian Datablog editor Simon Rogers/@smfrogers tweet about?)

[I knew I should have grabbed a screen shot of the original map...:-(]

It appears that Conrad Quilty-Harper (@coneee) over at the Telegraph then got on the case, and came up with a comparative map drawing on Rae’s work as published on the Datablog, showing the current boundaries compared to the proposed changes, and which ties the maps together so the zoom level and focus are matched across the maps (MPs’ constituencies: boundary changes mapped):

Telegraph side by side map comparison

Interestingly, I was alerted to this map by Simon tweeting that he liked the Telegraph map so much, they’d reused the idea (and maybe even the code?) on the Guardian site. Here’s a snapshot of the conversation between these two data journalists over the course of the day (reverse chronological order):

Datajournalists in co-operative bootstrapping mode

Here’s the handshake…

Collaborative co-evolution

I absolutely love this… and what’s more, it happened over the course of four or five hours, with a couple of technology/knowledge transfers along the way, as well as evolution in the way both news agencies communicated the information compared to the way the Boundary Commission released it. (If I was evil, I’d try to FOI the Boundary Commission to see how much time, effort and expense went into their communication effort around the proposed changes, and would then try to guesstimate how much the Guardian and Telegraph teams put into it as a comparison…)

At the time of writing (15.30), the BBC have no data driven take on this story…

And out of interest, I also wondered whether Sheffield U had a take…

Sheffield U media site

Maybe not…

PS By the by, the DataDrivenJournalism.net website relaunched today. I’m honoured to be on the editorial board, along with @paulbradshaw @nicolaskb @mirkolorenz @smfrogers and @stiles, and looking forward to seeing how we can start to drive interest, engagement and skills development in, as well as analysis and (re)use of, and commentary on, public open data through the data journalism route…

PPS if you’re into data journalism, you may also be interested in GetTheData.org, a question and answer site in the model of Stack Overflow, with an emphasis on Q&A around how to find, access, and make use of open and public datasets.

Written by Tony Hirst

September 13, 2011 at 2:46 pm
