Asset Stripping OpenLearn – Images

A long time ago, I tinkered with various ways of disaggregating OpenLearn course units into various components – images, audio files, videos, etc. (OpenLearn_XML Asset stripper (long since rotted)). Over the last few weeks, I’ve returned to the idea, using Scraperwiki to trash through the OpenLearn XML (and RSS) in order to build collections out of various different parts of the OpenLearn materials. So for example, a searchable OpenLearn meta-glossary, that generates one big glossary out of all the separate glossary entries in different OpenLearn units, and an OpenLearn learning outcomes explorer, that allows you to search through learning outcomes as described in different OpenLearn courses.

I’ve also been pulling out figure captions and descriptions, so last night I added a view that allows you to preview images used across OpenLearn: OpenLearn image viewer.

There’s a bit of niggle in using the viewer at the moment (as Jenny Gray puts it, “it’ll be a session cookie called MoodleSession in the openlearn.open.ac.uk domain (if you can grab it?)”) which, if you don’t have a current OpenLearn session cookie, requires you to click on one of the broken images in the righthand-most column and then go back to the gallery viewer (at which point, the images should load okay…unless you have some cookie blocking or anti-tracking features in place, which may well break things further:-( )

(If anyone can demonstrate a workaround for me for how to set the cookie before displaying the images, that’s be appreciated…)

To limit the viewed images, you can filter results according to terms appearing in the captions or descriptions or by course unit number.

One thing to note is that although the OpenLearn units are CC licensed, some of the images used in the units (particularly third party images) may not be so liberally licensed. At the moment, there is a disconnect in the OU XML between images and any additional rights information (typically a set of unstructured acknowledgements at the end of the unit XML), which makes a fully automated “open images from OpenLearn” gallery/previewer tricky to knock together. (When I get a chance, I’ll put together a few thoughts about what would be required to support such a service. It probably won’t be much, just an appropriate metadata filed or two…)

PS here’s an example of why the ‘need a cookie to get the image’ thing is really rather crap… I embedded an image from OpenLearn, via a link/url, in a post (making sure to link back to the original page). Good for me – I get a relevant image, I don’t have to upload it anywhere – good for OpenLearn, they get a link back, good for OpenLearn, they get a loggable server hit when anyone views the image (although bad for them in that it’s their server and bandwidth that has to deliver the image).

However, as it stands, it’s bad for OpenLearn because all the users see is a broken link, rather than the image, unless you have a current OpenLearn cookie session already set. The fix for me is more work: download the image, upload it to my own server, and then embed my copy of the image. OpenLearn no longer gets any of the “paradata” surrounding the views on that image, and indeed may never even know that I’m reusing it…

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

3 thoughts on “Asset Stripping OpenLearn – Images”

  1. “At the moment, there is a disconnect in the OU XML between images and any additional rights information” Hm yes its true they are in the acknowledgements section, but there will be an tag inside the tag for each asset with extra copyright restrictions that you should be able to scrape more easily.

    I’ve been testing the cookie problem with moodle 2, which we plan to switch to in the summer. I believe the problem has gone away. OpenLearn does want people to embed our assets like this, and we can take the bandwidth! Note to self: get Tony to check image embed when we have a test server.

Comments are closed.

%d bloggers like this: