OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for June 2009

Hyperlocal Twitter Trends

with 6 comments

Just by the by, I idly tweeted last week along the lines of “does anyone know of a twitter trends service that identifies trending topics within a particular region or locale?”.

I didn’t receive any links to such a service at the time (and didn’t build the service myself…) but it strikes me that this could be a really useful hyperlocal news service?

An alternative might be to find ‘trending locations’ or ‘trending places’ rather than trending hashtags or topics, so if there is a sudden flurry of tweets in a particular area, it could get flagged (does Twitter do this anyway with its trending topics?).

The locus of the trending regions need not be limited to ‘a circle within fifty miles of a point’ either; they could (with lots of computing power;-) be points along a line, such as a road, for example.
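By way of a very rough sketch of the "points along a line" idea: if a road is approximated by a handful of sampled (lat, lon) points, geotagged tweets can be kept or dropped according to how far they are from the nearest sample. Everything here is made up for illustration (the route, the tweets, the `latlon` field name and the 5 km threshold), and no real Twitter API is involved.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def near_route(point, route, max_km):
    """True if the point is within max_km of any sampled point on the route."""
    return any(haversine_km(point, p) <= max_km for p in route)

def tweets_near_route(tweets, route, max_km):
    """Keep only the tweets whose (hypothetical) 'latlon' field is near the route."""
    return [t for t in tweets if near_route(t["latlon"], route, max_km)]

# Made-up route samples and tweets, purely for illustration
route = [(52.04, -0.76), (52.00, -0.70), (51.95, -0.65)]
tweets = [
    {"text": "traffic jam again", "latlon": (52.05, -0.75)},
    {"text": "nice day up north", "latlon": (53.48, -2.24)},
]
nearby = tweets_near_route(tweets, route, max_km=5)
```

Sampling the line as points is the lazy option; a proper point-to-segment distance would avoid gaps between samples, at the cost of rather more geometry.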

As an asymmetric follower type (I have many more followers than people I follow on Twitter), it might also be handy to be able to see trending topics across the people who follow me, just in case the sampled population that I do follow aren’t a fair sample of the people who follow me…

Just a thought… now back to the jet lag :-(

[See also: Mapping Realtime Events on Twitter and this Simple Embeddable Twitter Map Mashup]

UPDATE: with the release of a Twitter geo api, it didn’t take long: Trendsmap

Written by Tony Hirst

June 30, 2009 at 12:52 pm

Posted in Thinkses


Deep Link into BBC iPlayer Content

with 3 comments

One of the really handy things about Youtube is the ability to share bookmarks that “deep link” to a particular point within a video (e.g. here’s Google having a dig at Microsoft; the URL? http://www.youtube.com/watch?v=S5aJAaGZIvk#t=29m10s, which should start the video playing 29 minutes 10 seconds in. That is, just add something like #t=29m10s to the end of the Youtube video page URL to start the video playing that far in).

A similar service is offered on podcast material published through the wonderful IT Conversations, that lets you deep link in to a particular part of an audio file, which is great for sharing audio quotes and, err, messing around with: IT Conversations samples trigger pad;-)

Anyway, anyway, yesterday I saw this:

which means you can now deep link in to iPlayer content :-)

Deep link into iPlayer content

As with the Youtube deep linking, if you know the URL pattern, you can create your own deep links on the fly (just add ?t=21m45s, for example, on to the end of the URL to start the programme playing 21 minutes 45 seconds in).
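If you wanted to generate these links programmatically, a minimal sketch might look like the following. The function names are my own invention, and the code assumes the page URL has no query string or fragment already on it; the only time format handled is the XmYs pattern seen in the examples here.

```python
import re

def to_timecode(seconds):
    """Format a whole number of seconds as a timecode such as '29m10s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs}s"

def parse_timecode(timecode):
    """Turn a timecode such as '21m45s' back into a number of seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", timecode)
    if not match or not any(match.groups()):
        raise ValueError(f"not a timecode: {timecode!r}")
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2) or 0)
    return minutes * 60 + seconds

def youtube_deep_link(page_url, seconds):
    """Youtube pattern from the post: append '#t=<timecode>' to the page URL."""
    return f"{page_url}#t={to_timecode(seconds)}"

def iplayer_deep_link(page_url, seconds):
    """iPlayer pattern from the post: append '?t=<timecode>' to the page URL."""
    return f"{page_url}?t={to_timecode(seconds)}"

youtube_link = youtube_deep_link("http://www.youtube.com/watch?v=S5aJAaGZIvk", 29 * 60 + 10)
iplayer_link = iplayer_deep_link("http://bbc.co.uk/i/l9n18/", 13 * 60 + 55)
```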

Something else I thought was interesting – the shared link is actually a BBC short link. So for an example, this is the sort of link you are given to share:

http://bbc.co.uk/i/l9n18/?t=13m55s

which then resolves to something like this:

http://www.bbc.co.uk/iplayer/episode/b00l9n18/Psychoville_Episode_1/?t=13m55s

I’ve raised the issue before now (in conversation with HEI internet services people, rather than through blog posts, I think?) about whether HEIs should run their own short code services (maybe as a Library service), but it’s always been shot down as being an extra hassle that we don’t need to worry about. (I always saw it as an opportunity for providing a couple of value add services: 1) providing a persistent web identifier that could act like a DOI; 2) providing a level of indirection (as in the case of a DOI) that might help as part of an archiving or “archival redirection” project – e.g. in the case of content moving and URIs changing (because they do change).)
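The "level of indirection" point can be sketched in a few lines: the short code stays fixed, and only the thing it points at changes when content moves. This is an in-memory toy, not a description of how the BBC (or any DOI resolver) actually works; the class name, the codes and the example.ac.uk URLs are all hypothetical.

```python
class ShortLinkResolver:
    """Toy resolver: a stable short code maps to whatever the current URL is."""

    def __init__(self):
        self._targets = {}

    def register(self, code, url):
        """Mint a new short code; codes are never reused for something else."""
        if code in self._targets:
            raise ValueError(f"code already in use: {code}")
        self._targets[code] = url

    def move(self, code, new_url):
        """Content has moved: update the target, keep the public short code."""
        if code not in self._targets:
            raise KeyError(code)
        self._targets[code] = new_url

    def resolve(self, code):
        """Return the current full URL for a short code."""
        return self._targets[code]

resolver = ShortLinkResolver()
resolver.register("abc12", "http://www.example.ac.uk/old-site/report.pdf")
before = resolver.resolve("abc12")

# The document is archived somewhere else; links already shared still work
resolver.move("abc12", "http://archive.example.ac.uk/2009/report.pdf")
after = resolver.resolve("abc12")
```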

Anyway – it seems as if the BBC think running their own short URI service is a good idea. It’d also be useful to know if the short URI will permanently map to the same full URI, or whether it will support a more arbitrary form of resolution, e.g. maybe hooking in to services like URIPlay?

PS sort of, but not really, related, see also: Open University Podcasts on Your TV – Boxee App

PPS note the deep link time code doesn’t work with radio content in iPlayer console.

PPPS for a hacky mashup way of making use of timecodes, see Searching the backchannel with Twitter subtitles

Written by Tony Hirst

June 19, 2009 at 9:50 am

Posted in BBC


Open University Podcasts on Your TV – Boxee App

with 6 comments

Over the weekend, a submission went in from The Open University (in particular, from Liam GreenHughes (dev) and some of the OU Comms team Dave Winter in Online Services (design)), to the Boxee application competition (UK’s Open University on boxee).

For those of you who haven’t come across Boxee, it’s an easy to use video on demand aggregator that turns your computer into a video appliance and lets you watch video content from a wide range of providers (including BBC iPlayer) on your TV. Liam’s been evangelising it for some time, as well as exploring how to get OU Podcasts into it via RSS’n'OPML feeds (An OU Podcast RSS feed for Boxee).

(For those of you who prefer to just stick with the Beeb, then the BBC iPlayer big screen version provides an interface optimised for use on your telly.)

As well as channeling online video services, and allowing users to wire in their own video and audio content via a feed, Boxee also provides a plugin architecture for adding additional services to your Boxee setup. The recent Boxee competition promoted this facility by encouraging developers to create new applications for it.

So what does the OU Podcasts Boxee app offer over and above a simple subscription to an OU podcasts feed?

A pleasing, branded experience, that’s what.

So for example, on installing the OU podcasts app (available from the Boxee App Box), an icon for it is added to your Internet Services applications.

Launching the application takes you to an OU podcasts browser that is organised along similar lines to the OU’s Youtube presence, that is, in terms of OU Learn, OU Research and OU Life content. The Featured content area also provides a mechanism for pushing editorially selected content to higher prominence. (Should this be the left-most, default option, I wonder, rather than the OU Learn channel?)

In the Research area, a single level of navigation exists, listing the various episodes available:

OU Boxee app

The more comprehensive Learn area organises content into topic-based themes/episode collections (listed in the right hand panel) with the episodes associated with a particular selected theme or collection displayed in the left hand panel. Selecting an episode in the left hand panel then reveals its description in the right hand panel (as in the screenshot above).

So for example, when we go to the OU Learn area, the Arts and Humanities episodes are listed in the left hand area (by default), and available collections in the right.

We can scroll down the collections and select one, Engineering for example:

Episodes in this collection are listed in the left hand panel, and further subcollections in the right hand panel (it all seems a little confusing to describe, but it actually seems to work okay… maybe?!;-)

Highlighting an actual episode then displays a description of it.

Selecting a program to play pops up a confirmation “play this” overlay, along with a link to further information for the episode:

Both audio and video content can be channeled to the service – selecting a video programme provides a full screen view of the episode, whilst audio is played within a player.

The “Read More” option provides a description of the episode, as well as social rating and recommendation options:

Finally, a search tool allows for content to be discovered using user selected search terms.

If you search with an OU course code, and there is video on the OU podcasts site from the course, the search may turn that course related video up…

This wouldn’t be an OUseful post if I didn’t add my own 2p’s worth, of course, so what else would I have liked to have seen in this app? One thing that comes to mind is a seven day catch-up of OU co-pro content that has been broadcast on the BBC (or more generally, the ability to watch all OU co-pro content that is currently available on the BBC iPlayer). I developed a proof-of-concept demonstrator of how such a service might work on the web, or for the iPhone/iPod Touch (iPhone 7 Day OU Programme CatchUp, via BBC iPlayer), so under the assumption that the Boxee API can provide the hooks you need to be able to play iPlayer content, I’d guess adding this sort of functionality shouldn’t take Liam much more than half-an-hour?!;-)

I also wonder if the application can be used to preserve local state in the form of personalisation information? For example, could a user create their own saved searches – and by default their own topic themed channels? Items in such a feed could also be nominally tagged with that search term back on a central server if, for example, a user watched an episode that had been retrieved using a particular search term all the way through?

To vote for the OU Boxee app, please go to: vote for your favorite apps, RSVP for the boxee event in SF.

PS the OU Podcasts app is not the only education related submission to the competition. There’s also OpenCourseWare on boxee, which provides a single point of entry to several video collections from some of the major US OCW projects.

PPS it also turns out that KMi have a developer who’s currently working on a range of mobile apps for the iPhone/iPod Touch, Android phones and so on. If any OU readers have ideas for compelling OU related mobile apps, you just may get lucky in getting it built, so post the idea as a comment to this post, or contact, err, erm, @stuartbrown, maybe?

PPPS Now I’m not sure how much time was spent on the app, but as the competition was only launched on May 5th, with a closing date of June 14th, it can’t have been that long, putting things like even the JISC Rapid Innovation (JISCRI) process to shame…?!;-)

Written by Tony Hirst

June 18, 2009 at 11:49 am

Posted in OBU, Open Content, Open Education, OU2.0


Pandering to the News Cycle, or Enriching It? (aka a roundabout palaver way of embedding OU podcasts in a WordPress blog)

with 6 comments

Stephen Downes picked up on a recent post of mine (Guerrilla Education: Teaching and Learning at the Speed of News [OLDaily]) with the response:

“[S]hould we as academics be engaging with the news cycle in order to deliver informal, opportunistic ‘teaching’ at the point of need?” My answer: no. Not when ‘need’ is defined as ‘powerful’ or ‘influential’. Because then it’s not teaching, it’s just lobbying, or worse, pandering.

Okay – so here’s a slightly more worked out example of one of the approaches I have in mind. In the original post, I mentioned “[a] sleeper podcast from John Naughton [that] picked up significant amounts of traffic … from the 40th anniversary of the internet.”

Here’s what John wrote (The Internet at 40)

From ‘Hot News’ on the Apple site this morning:
The Internet turns 40, June 9, 2009
You’re so used to paying bills, getting your news and weather, and doing more and more of your purchasing online, you probably think the Internet has been around forever. But it hasn’t. As you’ll learn from this program on Open University, the Internet turns 40 this year. How did it get started? Where is it taking us next? Find out by listening to these Internet pioneers on iTunes U…
It seems that the recording of my interview is #4 in the top 100 downloads

(I would embed the podcast here – John links to the version of it on the OU podcast site – but the site doesn’t currently support embed codes. As this is a hosted WordPress blog, even if it supported custom OU flashplayer embed codes, that wouldn’t be much good either: WordPress is quite restricted in the embed codes it supports [that is: WordPress blogs hosted on WordPress.com are limited in what you can embed - self-hosted WordPress installations can be configured to let you embed what you like...]. (In a meeting last week, my question as to whether we should offer Youtube embed codes (which are accepted in WordPress) as well as OU podcast player codes was not met with much support…) Which means if an OU player embed code had been available, I couldn’t have *easily* used it anyway… (The workaround would be to grab the OU embed code into Vodpod, which is accepted by WordPress…. which gives me an idea – I couldn’t get Vodpod to work with the OU podcast site, but it does work with the nascent UK HE Steeple Podcast Portal:-))


So what I am suggesting, in part, is not necessarily that we pander to the news cycle (what would that mean, anyway, pander to it?), but that we do make content available that allows news readers to learn more about a topic.

[Hmmm... it seems like this video has disappeared from the Steeple site... ho hum, must be a Steeple thing... will try to see if i can get Vodpod to embed directly from OU podcasts site if i get a chance, assuming the KMi folks don't block it, of course....]

Another example might come from the rise in interest in news media making raw data available. Surely there is an opportunity here for educational institutions to provide educational material that explains how news readers can engage with this data (and conversely, how educators might make use of such data)? (This is actually something I’ve been thinking about quite a lot lately…)

Argghhhh – time to go: day 2 of the Isle of Wight Festival beckons… I would have written more but got distracted by the embed sidetrack… ;-)

Written by Tony Hirst

June 13, 2009 at 11:49 am

Posted in Anything you want

Time for a New Interaction Metaphor? Click and Wire

with 3 comments

I’m guessing that most computer users are now reasonably familiar with simple mouse control actions (click to select, double click to open, click and drag to resize, drag and drop to move) if not some of the more esoteric ones (right click to open up a context sensitive menu, drag and drop a document icon onto an application icon to open it, or a link from a browser (page or address bar) onto a browser application icon or another browser window to open it elsewhere; you can of course (?!) also drag links onto the favourites toolbar to favourite/bookmark them).

So I’m starting to wonder whether the time is right yet to start talking about another operation: click and wire (or maybe that should be drag and wire?) to describe a subscription action.

There are two steps that I think are required to help people understand this metaphor. The first is to provide a visual cue showing how things can be clicked and wired together: Yahoo Pipes offers a great example of this idea:

The second thing we need to demonstrate is how website content can be subscribed to in many browsers using a drag and (invisibly) wire approach. To all intents and purposes, this feels like drag and drop. But if you drag the right sort of thing, like a link to an RSS feed, then you are actually clicking and dragging something that is publishing content rather than representing a fixed document.

And if you drop the feed link you’re dragging on to the right sort of thing, like a browser’s feed sidebar, then the drop isn’t the ‘open this application with this document’ action that you get from dropping a document icon onto an application icon, it’s actually a ‘subscribe’ action (that is, you have wired the original publishing source/link to something that can display the most recently published items from that source).

Now I can already hear one or two people who read this blog saying “yeah, right, but no-one knows how to spot a feed link, or knows that feedreading sidebars exist”, which may or may not be true. But the RSS feed icon is now pervasive; all we’re missing are the obvious drop targets; and the notion of “click and wire” (where “wire” is to “drop” as “subscribe” is to “open”) or “drag and subscribe”, or some combination thereof…

Written by Tony Hirst

June 12, 2009 at 1:59 pm

Posted in Thinkses

Guardian DataStore Visualisation Competition

leave a comment »

A post over on the Guardian DataStore site last week announced a competition based around visualising data from the Guardian DataStore: Build stuff with our data and win a Flip Mino HD camcorder.

There are two categories for submissions:

1) The best user experience for understanding meaning in data, and
2) The best tool for web developers to build other things with data

I’ve posted quite a few recipes so far that describe different ways of engaging with the data, many of which you can find from posts categorised here with visualisation, so it’d be great to see people trying to run with them.

Many of the recipes I’ve come up with start by getting data out of the spreadsheets as CSV, so that it can then be passed to other services, such as Many Eyes Wikified or Yahoo Pipes; or JSON, so that it can be pulled in to a web page. (My very work in progress Guardian Datastore explorer can be used to generate URIs that run queries on Google spreadsheets, which may be of some use if you want to get started on running SQL like queries on spreadsheet data.)
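As a flavour of what "getting the data out as CSV" then lets you do, here is a minimal sketch that filters and sums rows in the spirit of a SQL-like query. The CSV text is made up for illustration (it is not DataStore data), and the column names and the `rows_where` helper are hypothetical.

```python
import csv
import io

# Made-up CSV of the sort you might export from a spreadsheet
SAMPLE_CSV = """region,year,value
North,2008,12
South,2008,30
North,2009,15
"""

def rows_where(csv_text, column, wanted):
    """Return the rows (as dicts) whose given column equals the wanted value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[column] == wanted]

# Roughly: SELECT SUM(value) WHERE region = 'North'
north_rows = rows_where(SAMPLE_CSV, "region", "North")
north_total = sum(int(row["value"]) for row in north_rows)
```

Note that the csv module hands every cell back as a string, hence the explicit `int(...)` before summing.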

As far as I know, no one has explored using tools like the JavaScript InfoVis Toolkit yet, which could be interesting from the point of view of nice UIs, as well as the developer perspective (e.g. a nice set of hooks in to the DataStore from visualisation toolkits or code frameworks). And then of course there are plenty of other more traditional chart toolkits out there…

If you’re after something a little more exotic, how about something like Thematic Mapping, HeatMap API or CloudMade on the geoviz front, though the problem of geocoding DataStore data would have to be solved first (Yahoo Placemaker might be handy there?); the Timetric or the MIT Simile Timeplot or Timeline tools for displaying information against a time axis (none of which have, to date, and as far as I know, been combined with Fourier analysis tools to help identify periodicities in the charted data); or how about finding a use for TimeMap, which combines MIT Simile Timeline widgets with Google maps..?
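On the Fourier analysis point, the basic idea is simple enough to sketch with nothing but the standard library: compute a discrete Fourier transform of an evenly sampled series and report the period of the strongest component. This is a naive O(n²) transform on made-up data with a built-in seven sample cycle; the `dominant_period` name is hypothetical, and real data would want detrending, windowing and a proper FFT library.

```python
import cmath
import math

def dominant_period(values):
    """Period (in samples) of the strongest non-constant frequency component."""
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]  # remove the constant (zero frequency) part
    best_k, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        # k-th DFT coefficient, computed directly from the definition
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centred))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    if best_k is None:
        raise ValueError("series is constant, so there is no periodicity to find")
    return n / best_k

# Made-up 'daily' series with a weekly cycle, four weeks long
series = [10 + 3 * math.sin(2 * math.pi * day / 7) for day in range(28)]
period = dominant_period(series)
```

The resolution is limited by the length of the series: only periods of the form n/k can be reported, so a cycle that doesn't divide neatly into the sample length gets smeared across neighbouring components.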

For truly open ended visualisations, using something like Processing may be the way to go: there’s already a Processing wrapper for the OpenPlatform API, but I’m not sure if anyone has provided an easy way (as yet) to pull DataStore content into it. Integration with Processing.js, a Javascript implementation of Processing that makes things like Obsessing possible, is also something that could open up a lot of opportunities for making use of the data?

On the other hand, if it’s analysis you’re after, it might be interesting to see what could be done if the DataStore spreadsheets could be integrated with various stats analysis packages (is there a variant of R as a set of Javascript libraries, I wonder?!)

PS Just for the record, I’m not eligible to enter the competition just at the moment…

Written by Tony Hirst

June 12, 2009 at 12:34 pm

Posted in Uncategorized

An Essential Part of My Workflow

with 2 comments

A couple of days ago, one of those reminders about how reliant we are on various pieces of technology was forced upon me: Jing died on me….

For those of you who don’t know it, Jing is a screencapture/screencasting tool that is integrated with flickr (free version, for screenshots) and Youtube (pro version, for screencasts). It’s produced by Techsmith, who also publish the more comprehensive SnagIt and Camtasia tools, so the technical underpinnings of the app are excellent.

Anyway, I’ve been using the free version of Jing for what seems like forever, using it to grab screenshots at will and send them direct to flickr, then typically pasting the embed code that is magically popped into my clipboard directly into my WordPress editor. But I’ve decided that I really need to do more screencasts, and whilst Jing automates video uploads to screencast.com, I really wanted the ability to post screencasts direct to Youtube. So on Sunday I upgraded, and after a couple of battles getting the upgrade to take, uploaded a couple of test screencasts to Youtube, easy as anything.

And then, on Tuesday, late on Tuesday, at a time when Tuesday had become Wednesday and I really wanted to call an end to the day, save for finishing off a post with a couple of screenshots, Jing died. Every time I restarted it, it claimed I was no longer a Pro user, and died.

So I reinstalled, and tried again. Same thing. Reboot my Mac, and try again. Still no joy. Create a new, free account, and whenever I started Jing, it crashed.

Superstition kicked in and I blamed the upgrade, trying (maybe successfully, maybe not) to send a help request to Techsmith. (Finding the help was a nightmare, I think I had to create a new account on a help system somewhere along the way, and on posting a help email, I couldn’t tell whether it had been submitted or not.) The typical online help rigmarole, essentially. Even if you don’t start off angry, you’re likely to end up furious. (Plus I was really flagging by now and maybe not thinking as clearly as I might!)

A search on Twitter turned up a @techsmith account, and the contact details of someone at Camtasia, who I emailed. But it was past day’s end, even in the US, so I went looking for an alternative. (I could of course have just used the Mac screengrab tools to do what I needed, and then uploaded the images to flickr using flock, but I was looking to punish Camtasia by finding an alternative to Jing that worked just as well!)

In the end, I settled on Skitch, and it sort of worked okay, but it was nothing like as painless as Jing. For every screenshot I took, I just wanted Jing back…

…anyway, I picked up a friendly email from Techsmith yesterday saying there had been problems, and a tweeted prompt from Techsmith last night asking if Jing was now working for me again (it was/is). The problem, it seems, was at the Techsmith end, an issue that caused Jing on Mac Tiger to crash (I’m intrigued as to how a problem on the webservice end can kill an app running on the desktop? This is a harbinger of things to come more generally with web apps, maybe?)

So what do I take from this experience? Firstly, Jing is part of what I do, and it does just what it needs to for me. Secondly, without twitter I’d have had a really crap customer experience trying to understand what was going on (had something gone bad with my Pro upgrade? Was it a Jing problem or my problem? and so on..).

As it’s turned out, rather than writing a ranty post saying I’ve now changed my screencapture tool because of blah, blah, blah, if anyone asks what tool I use for screencaptures, I’d still say Jing. And from the ease of use in uploading screencaptured videos to Youtube, I’d also recommend the upgrade to Jing Pro if quick’n'easy raw screencasts are your thing.

Written by Tony Hirst

June 12, 2009 at 8:55 am

Guerrilla Education: Teaching and Learning at the Speed of News

with 2 comments

Wikinomics author Don Tapscott has been at it again, (giving @liamgh yet another Mexican Wave opportunity), complementing a recent essay in which he argues “that the universities are entering a period of crisis” with a linkbait post asking Will universities stay relevant?

[UPDATE: this post is way too much of a ramble; the point is in the last para, republished here because I know you're just skimming this post and will probably miss it: How about engaging in a bit of guerrilla teaching and looking for opportunities to help people understand something better, or learn how to do something they are currently struggling with. If we help people learn at the point of need, maybe they’ll be inspired enough to engage in more formal learning opportunities? And even if they don’t, maybe we’ll have helped make the world a slightly better informed place? ]

I posted a comment there – for what it’s worth – and by linking back to the post from here as well, I’ll maybe raise my profile on that thread via a trackback (and perhaps even get a tiny bit of traffic from that site flowing this way too;-)

Shameless traffic mongering? You got it!

Anyway, anyway, as I’m here, here’s a quick thought about guerrilla education, and engaging with the news cycle.

Somewhen over the last couple of weeks, I stopped in my tracks whilst reading the opening section of The culture of copying on the BBC News dot life blog:

Oh no: another boring report about piracy by a strange body with an obscure title.

That was my first reaction on getting hold of Copycats? Digital Consumers in the Online Age – a report for the Strategic Advisory Board on Intellectual Property.

But when I read on, the report was full of fascinating insights into the way that we’ve all begun to think about the rights and wrongs of online piracy – or rather, “unauthorised downloading”, which is how this report for the government carefully describes it.

The authors, from University College London, point to evidence that what they amusingly call the “UK’s unauthorised downloading community” now stands at nearly seven million people, and they question the assumption that these are just teenagers and students – it seems older people are downloading too.

What shocked me? Well, here’s a report, maybe interesting, maybe not, in an area that borders on things I’m interested in, that maybe has something to contribute to a course I have a loose affiliation with (Beyond Google: working with information online), written by academics (but does that matter, except maybe I can trust it without further verification…?!) and just released (i.e. the data shouldn’t be more than a couple of years old!;-)

Now the situation as it currently stands is that the media find these reports and then report on them, interpreting the report contents for a larger audience as they do so. Sometimes they even go to academics for further comment (for example, we had a request round for comment on a Sunday Times piece that was being put together yesterday (I wasn’t in a position to field it at the time) and my colleague Ray Corrigan provided a comment for a piece in today’s Technology Guardian (online at Sweden’s Pirate party sails to success in European elections)). A sleeper podcast from John Naughton has also picked up significant amounts of traffic lately from the 40th anniversary of the internet (The Internet at 40).

I don’t think I’m giving anything away by saying that there’s a group in the OU who are currently looking at the way we engage with ‘the wider web’ through the current hotch-potch of web properties such as open2.net, OpenLearn and Platform. For those of you who aren’t familiar with these sites:

- open2.net is the site that is used to support OU BBC programmes (the Reith Lectures are a current major feature);
- OpenLearn is the home of the OU’s open educational content initiative; and
- Platform is the OU’s social soft edges, a community site (open to all, not just OU students, current and past).

(There’s also the OU Research website, which has still (IMHO) yet to find its feet. And the Faculty websites – the Science Faculty website is probably the most engaging at the moment. There are departmental websites too (e.g. my department’s website: Communication and Systems, which is in yet another holding pattern as we wait for yet another relaunch!;-); and there’s the OU Podcast site too, of course (I won’t mention the various Youtube channel pages, iTunesU, Steeple, etc etc;-).)

Of all those sites, news related items feature on a surprising number of them, and yet the disconnect between our formal teaching, and exploiting news related, ad hoc teaching opportunities is significant (although it has to be said that the folks working on Platform do seem to have been keeping an eye on the sorts of thing that are likely to pull in traffic at any given time:-)

Anyway, one of the things I’ve been mulling over for some time has been the extent to which journalists, academics and students are all engaged in trying to make sense of the world. Timescale is one of the differences, I think? Another is that academics tend to strive for a model of how things work in general (e.g. how populations behave in general), whereas journalists often seem to take a generalised issue and humanise it by illuminating a general case with a particular case (the story of a particular person with a particular condition or in a particular exemplar situation, for example).

So here’s my starter for 10 (which is a shame, because this post is already long enough…): should we as academics be engaging with the news cycle in order to deliver informal, opportunistic “teaching” at the point of need (i.e. at those points where people might be confused about a topic, realising they don’t understand it as well as they might, or where they may be minded to learn more)?

One of the now well worn ways of thinking about this (in the OU and BBC at least) that comes in and out of fashion (current status: IN) is the idea of a learning journey, that takes a generally interested viewer down a path of discovery from an informal encounter with a topic, through some “further information” about the topic, and possibly a free open education course, until they eventually sign up for a formally delivered course.

Traditionally, the OU has had several starting points for learning journeys: BBC driven traffic to open2.net, local recruitment onto Openings courses from various regional Widening Participation initiatives, as well as planned (or opportunistic) “marketing” initiatives such as the Outsmart the Recession site. More recently, Platform looks (to me at least) like it’s also trying to deliver contextual/content led marketing in a community based environment.

(Just by the by, I really think we should be running our own ad-platform across OU sites that serves up personalised course ads and related educational content, cf. Arise Ye Databases of Intention.)

But where else might we provide an entry point to a learning journey? How about out there? How about keeping tabs on what’s going on in the wider world, and us engaging with it, creating opportunities (e.g. by commenting on third party posts) for people to follow a path back to OU sites; and failing that, maybe they’ll learn something useful from us anyway (we are paid for using public funds, after all). How about engaging in a bit of guerrilla teaching and looking for opportunities to help people understand something better, or learn how to do something they are currently struggling with? If we help people learn at the point of need, maybe they’ll be inspired enough to engage in more formal learning opportunities? And even if they don’t, maybe we’ll have helped make the world a slightly better informed place?

Or not…

PS if there’s an argument in there somewhere that I’m fumbling towards, it maybe contains some or all of the following pieces:
- good advertising (relevant, timely, appropriate) is content;
- people need help to understand the news (readers as well as journalists);
- educational material is content;
- educational material sampled from a course may act as a tease, advert or lead-in for that course;
- if someone learns something from content that’s a Good Thing;
- people don’t just learn when they’re studying a formal course;
- there’s news everyday (i.e. lots of opportunities to wrap new content with other content);
- news is often syndicated;
- news can provide context for learning;
- any given learning topic may provide a context for republishing news stories;
- etc…

Enough…

Written by Tony Hirst

June 11, 2009 at 9:34 pm

Posted in Thinkses

Initial Thoughts on “Mashup Patterns”

with 2 comments

Several weeks ago now, I noticed a sideways mention to a new book on Mashup Patterns by Michael Ogrinz on some blog or other, and through the magic of Twitter managed to get a hold of a review copy, with no obligation to review it etc etc (so count that as some sort of disclaimer and s**w you, Mike Arrington, s**w you;-)

Mashup patterns are actually something I’ve been thinking about for quite some time now, most notably when it comes round to preparing (or not) for a mashup workshop I’m supposed to be running. (So for example, in my diary at the moment are a workshop at IWMW with Mike Ellis and a session at Mash Oop North.)

So what are “mashup patterns”? Following Ogrinz’s lead, I’m not really into the idea of coming up with an academically robust definition of what mashup patterns are, or aren’t. Where Ogrinz talks of “reusable solutions”, I am equally likely to talk about recipes (or instances of a particular recipe type). This reflects, in part, the way I cook! So for example, of the dozens of pasta sauces I have, on occasion, tried to make, they’re all pretty much one of two types: the red one (tomato based) or the white one (white sauce based).

Ogrinz identifies various categories of pattern, and organises the book accordingly:
- harvest patterns (“Mine one or more resources for unique data”) identify how to get data out of the silos and into the mashup space in an appropriate format (for example data scraping from Wikipedia);
- enhance patterns (“extend the capabilities of existing resources”) show how to improve what you’ve already got (for example, feed annotation streams and progressive enhancement);
- assemble patterns (“remix existing data and interfaces to serve new purposes”), such as content aggregation and filtering;
- manage patterns (“leverage the investment in existing assets more effectively”) such as widget enablers and dashboards (like the OUseful dashboard);
- testing patterns (“verify the performance and reliability of applications”), such as load testing and regression testing; and
- anti-patterns: that is, how not to do it…

Each of the patterns is described (in brief) on the Mashup Patterns website.

Each pattern is then given a memorable name, tagged with the “core activities” associated with the pattern and described in terms of a problem statement and an appropriate solution (the solution describes the pattern, the problem statement the context within which the pattern might be applied), along with a diagrammatic representation of the pattern. One or more examples of how (at a high level) the pattern might be applied in a real world context (or contrived real world context!) are then described. Value is also added to each pattern in the form of ‘links’ to other related patterns, and a fragility rating that describes how brittle the pattern is likely to be.

Whilst I’m sympathetic to Ogrinz’s decision to avoid writing code or UML diagrams in favour of specifying mashup patterns in more generic terms (he uses an attractive abstracted visual language to describe the elements involved in each pattern), I do often prefer to illustrate the mashups I produce as working examples. So as regular readers of OUseful.info will know, Yahoo Pipes are one of my preferred ways of doing this…

As well as the idea of mashup patterns, I’m also very partial to the idea of lenses, ways of thinking or asking questions about a particular problem from a particular perspective. The Art of Game Design: A Book of Lenses provides a fine example of using lenses to think about the design of computer games (in fact, many of the lenses are appropriate for thinking about many areas of design). If you’ve heard of de Bono’s Thinking Hats, lenses provide a similar device for taking “different perspectives” on a particular issue, or asking particular sorts of questions about it.

So to finish off, I wonder: if I was to start a “practical mashups” uncourse blog, would anyone: a) follow it; b) be prepared to buy Mashup Patterns (and maybe also The Art of Game Design: A Book of Lenses) as uncourse ‘set books’?

Written by Tony Hirst

June 11, 2009 at 11:38 am

Posted in Thinkses

The Guardian OpenPlatform DataStore – Just a Toy, or a Trusted Resource?

with 8 comments

When the Guardian launched their OpenPlatform DataStore, a collection of public data, curated by Guardian folk, hosted on Google Spreadsheets, it raised the question as to whether this initiative would influence the attitude of the Office of National Statistics, and in particular the way they publish their results (e.g. Guardian Data Store: threat to ONS or its saviour?).

In the three sexy skills of data geeks, Michael Driscoll reinterprets Google’s Chief Economist’s prediction that “the sexy job in the next ten years will be statisticians… The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it” with his belief that “the folks to whom Hal Varian i.e. [Google's Chief Economist] is referring are not statisticians in the narrow sense, but rather people who possess skills in three key, yet independent areas: statistics [studying], data munging [suffering] and data visualization [storytelling]”

I’ve already suggested that what I’d quite like to see is plug’n’play public data that’s easy for people to play with in a variety of ways, and publishing it via Google Spreadsheets certainly lowers quite a few barriers to entry from a technical perspective which can make life easier for statisticians and the visualisers, and reduce the need for the data mungers, the poor folks who go through “the painful process of cleaning, parsing, and proofing one’s data before it’s suitable for analysis. Real world data is messy” as well as providing access to data where it is difficult to access: “related to munging but certainly far less painful is the ability to retrieve, slice, and dice well-structured data from persistent data stores”.

But if you don’t take care of the data you’re publishing, then even though there are friendly APIs to the data, it doesn’t necessarily follow that the data will be useful.

As Steph Gray says in Cui bono? The problem with opening up data:

Here’s my thought: open data needs a new breed of data gardeners – not necessarily civil servants, but people who know data, what it means and how to use it, and have a role like the editors of Wikipedia or the mods of a busy forum in keeping it clean and useful for the rest of us. … Support them with some data groundsmen with heavy-lifting tools and technical skills to organise, format, publish and protect large datasets.

So with all that in mind, is the Guardian DataStore adding value to the data in the data store in an accessibility sense by reducing the need for data mungers to have to process the data so that it can be used in a plug’n’play way by the statisticians and the data visualisers, whether they’re professionals, amateurs or good old Jo Public?

As a way in to this question, let’s look at the various HE datasets. The Guardian has published several of these:

- Get the full university tables – as a spreadsheet
- University research department rankings
- Drop out rates for every university

Before we look at the data, though, let’s look at the URIs to see if the architecture of the site makes it easy to discover potentially related datasets. (Finding data is another of the skills related to the black arts of the data mungers, I think?!;-)

The URI for the metapage that hosts a link to the RAE/research data blog post is:
http://www.guardian.co.uk/news/datablog+education/research,
and the one that links to the teaching related posts is:
http://www.guardian.co.uk/news/datablog+education/higher-education.
Going back up the common path to http://www.guardian.co.uk/news/datablog+education/ we get…. a 404 :-(

Hmmm… So how come the datablog+education page doesn’t link down to the HE collection pages, as well as the schools data blog pages (e.g. these are both valid:
- http://www.guardian.co.uk/news/datablog+education/school-tables and
- http://www.guardian.co.uk/news/datablog+education/primary-school-league-tables
and might naturally be expected to be linked to from:
http://www.guardian.co.uk/news/datablog+education/).
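The “going back up the common path” step can be done mechanically, which is presumably what any crawler or curious data munger would try. Here’s a minimal sketch, assuming all the URIs share a scheme and host; the function name `common_parent` is my own, and the code only works out where the parent page ought to be, it doesn’t fetch anything:

```python
from posixpath import commonpath
from urllib.parse import urlsplit, urlunsplit


def common_parent(urls):
    """Return the deepest 'directory' URL shared by all the given URLs.

    Assumes every URL uses the same scheme and host.
    """
    parts = [urlsplit(u) for u in urls]
    first = parts[0]
    # Compare whole path segments rather than individual characters.
    shared = commonpath([p.path for p in parts])
    return urlunsplit((first.scheme, first.netloc, shared.rstrip("/") + "/", "", ""))


collection_pages = [
    "http://www.guardian.co.uk/news/datablog+education/research",
    "http://www.guardian.co.uk/news/datablog+education/higher-education",
    "http://www.guardian.co.uk/news/datablog+education/school-tables",
    "http://www.guardian.co.uk/news/datablog+education/primary-school-league-tables",
]

print(common_parent(collection_pages))
```

All four collection pages resolve to the same parent, which is exactly the page you’d hope would act as an index for them.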

Looking back to the HE teaching related datasets, we see they are both listed on the http://www.guardian.co.uk/news/datablog+education/higher-education page. So might we then expect them to be ‘compatible’ datasets in some sense?

That is, do the HE data sets share common values, for instance in the way the HEIs are named?

If we generate a couple of queries on to the university satisfaction tables and the dropout tables (maybe trying to look for correlations between drop out rate and student satisfaction) by pulling the results from different queries on those tables in to a data grid within a Google spreadsheet (cf. the approach taken in Using Google Spreadsheets and Viz API Queries to Roll Your Own Data Rich Version of Google Squared on Steroids (Almost…)), what do we get?

Here’s a search for “Leeds”, for example:

One table contains items:

- Leeds Trinity & All Saints
- Leeds
- Leeds Met

and the other contains:

- Leeds College of Music
- The University of Leeds
- Leeds Metropolitan University
- Leeds Trinity and All Saints

So already, even with quite a young datastore, we have an issue with data quality. In Data Driven: Profiting from Your Most Important Business Asset, Thomas Redman identifies “seven common data quality issues”, which include the related problems of too much data (i.e. multiple copies of the same data in different places – that is, redundancy), data inconsistency across sources (not a problem the datastore suffers from – yet?) and poor data definition (p. 41; preview available on Google Books?).

This latter issue, poor data definition, is evident in the naming of the HEIs above: I can’t simply import the overall tables and dropout tables into DabbleDB and let it magically create a combined table based on common (i.e. canonical) HEI names (using the approach described in Mash/Combining Data from Three Separate Sources Using Dabble DB, for example) because the HEIs don’t have common names.
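To make the problem concrete, here’s a rough sketch of what a naive join on name would find for the “Leeds” results listed above. The variable names `table_a` and `table_b` are just labels for the two result sets, and the `normalise` function is my own crude guess at the sort of tidying a data munger might try first, not anything DabbleDB or the datastore provides:

```python
table_a = ["Leeds Trinity & All Saints", "Leeds", "Leeds Met"]
table_b = [
    "Leeds College of Music",
    "The University of Leeds",
    "Leeds Metropolitan University",
    "Leeds Trinity and All Saints",
]


def exact_matches(a, b):
    """Names that appear, character for character, in both tables."""
    return set(a) & set(b)


def normalise(name):
    """Crude tidy-up: lower-case, spell out '&', collapse whitespace."""
    return " ".join(name.lower().replace("&", " and ").split())


def normalised_matches(a, b):
    """Names that match once both sides have had the crude tidy-up."""
    return {normalise(n) for n in a} & {normalise(n) for n in b}


print(exact_matches(table_a, table_b))       # nothing joins on the raw names
print(normalised_matches(table_a, table_b))  # tidying rescues only one institution
```

Simple string tidying recovers the “&” versus “and” case, but pairing “Leeds” with “The University of Leeds”, or “Leeds Met” with “Leeds Metropolitan University”, needs someone who knows they are the same place.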

So what does Redman have to say about this (p. 55)?

- Find and fix errors
- Prevent them at their source [in this case, the error is inconsistency and could have been prevented by using a common HEI naming scheme, OR providing another unique identifier that could act as a key across multiple data tables; but name is easier – because name is what people are likely to search by…].

(See also Redman’s “Hierarchy of Data and Information Needs” (p. 58), which identifies the need for consistency across sources.)

Note that we have a problem though – the datastore curators can’t change the names in the current spreadsheets, because people may already be using them and keying on the current name format. We shouldn’t create another spreadsheet containing the same data because that causes duplication/redundancy? So what would be the best approach? Answers on the back of a postcard to, err, the Guardian datastore, I guess?!;-)
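One postcard-sized possibility, and it is only a sketch, would be a separate lookup table that maps each name variant already in use on to a shared identifier, leaving the existing spreadsheets untouched. The identifiers below (“HEI-001” and so on) are invented purely for illustration, and the pairings of name variants are my own assumption about which names refer to the same institution:

```python
# Hypothetical lookup table: the "HEI-00x" identifiers are made up, and
# the groupings are an assumption a human curator would need to confirm.
ALIASES = {
    "Leeds": "HEI-001",
    "The University of Leeds": "HEI-001",
    "Leeds Met": "HEI-002",
    "Leeds Metropolitan University": "HEI-002",
    "Leeds Trinity & All Saints": "HEI-003",
    "Leeds Trinity and All Saints": "HEI-003",
    "Leeds College of Music": "HEI-004",
}


def join_on_key(left_names, right_names, aliases):
    """Pair up names from two tables that resolve to the same identifier."""
    right_by_key = {aliases[n]: n for n in right_names if n in aliases}
    pairs = []
    for name in left_names:
        key = aliases.get(name)
        if key in right_by_key:
            pairs.append((key, name, right_by_key[key]))
    return sorted(pairs)


table_a = ["Leeds Trinity & All Saints", "Leeds", "Leeds Met"]
table_b = [
    "Leeds College of Music",
    "The University of Leeds",
    "Leeds Metropolitan University",
    "Leeds Trinity and All Saints",
]

for key, left, right in join_on_key(table_a, table_b, ALIASES):
    print(key, "|", left, "|", right)
```

People can keep searching and keying on whichever name they already use, while anyone who wants to combine tables joins on the identifier instead. Whether a lookup sheet like this counts as yet more redundancy is, I guess, still up for debate.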

So is it the Guardian’s job to be curating this data, or tending it as one of Steph’s data gardeners/groundsmen might? If they want it to be a serious resource, then I would say so. But if it’s just a toy? Well, who cares…?

PS Just in passing, what other value might the DataStore add to spreadsheets to make them more amenable to “mashups”? For data like the university data, providing geo-data might be interesting (even at the crude level of just providing a single geographical co-ordinate for the central location of the institution). If I could easily get geo-data for the HEIs, and combine it with the satisfaction tables or dropout rates, it would be trivial to generate map based views of the data.
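As a rough illustration of how little glue would be needed, here’s a sketch that combines a table of rates with a table of coordinates and emits GeoJSON points that a mapping tool could plot. Everything in the sample data is a placeholder: the institution names, the rates and the coordinates are invented, and the `dropout_rate` property name is my own choice:

```python
import json


def to_geojson(rates, locations):
    """Combine {name: rate} with {name: (lat, lon)} into a GeoJSON FeatureCollection."""
    features = []
    for name, rate in sorted(rates.items()):
        if name not in locations:
            continue  # no geo-data for this institution, so it can't be mapped
        lat, lon = locations[name]
        features.append({
            "type": "Feature",
            # GeoJSON puts longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name, "dropout_rate": rate},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder data only: not real institutions, rates or locations.
sample_rates = {"Example University A": 10.0, "Example University B": 20.0}
sample_locations = {"Example University A": (50.0, -1.0)}

print(json.dumps(to_geojson(sample_rates, sample_locations), indent=2))
```

Note that the join here is on institution name, so it only works if the geo-data uses the same names as the datastore tables, which brings us straight back to the naming problem above.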

PPS one other gripe I have with the Guardian datablog, where many of the datastore data sets are announced, is that the links are misleading:

Now call me naive, but I’d expect those DATA links to point to spreadsheets, as indeed the first two do, but the third points to another blog post and so I’ve lost trust in being able to use those DATA links (e.g. in a screenscraper) as a direct pointer to a spreadsheet.
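In practice that means a screenscraper has to check each DATA link before trusting it. A minimal sketch of such a check follows; the list of hosts is my assumption about where the spreadsheets live, the example URLs are made up, and a host match is only a hint that the link is a spreadsheet rather than proof:

```python
from urllib.parse import urlsplit

# Assumption: datastore spreadsheets are served from one of these hosts.
SPREADSHEET_HOSTS = {"spreadsheets.google.com", "docs.google.com"}


def looks_like_spreadsheet(url):
    """Return True if the link's host is one we expect to serve spreadsheets."""
    host = urlsplit(url).hostname or ""
    return host in SPREADSHEET_HOSTS


# Hypothetical links of the two kinds found behind the DATA labels.
data_links = [
    "http://spreadsheets.google.com/ccc?key=EXAMPLE_KEY",
    "http://www.guardian.co.uk/news/datablog/example-post",
]

for link in data_links:
    print(link, looks_like_spreadsheet(link))
```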

Written by Tony Hirst

June 8, 2009 at 9:35 am
