First Dabblings With @daveyp’s MOSAIC Library Competition Data API

A couple of days ago, Dave Pattern published a simple API to the JISC MOSAIC project HE library loans data that has been opened up as part of a developer competition running over the summer (Simple API for JISC MOSAIC Project Developer Competition data [competition announcement]).

The API offers a variety of simple queries on the records data (we like simple queries:-) that allow the user to retrieve records according to ISBN of the book that was borrowed, books that were taken out by people on a particuler course, (using a UCAS course code identifier) or a combination of the two.

The results file from a search is returned as an XML file containing zero or more results records; each record has the form:

<useRecord row="19">
 <from>
  <institution>University of Huddersfield</institution>
  <academicYear>2008</academicYear>
 </from>
 <resource>
  <media>book</media>
  <globalID type="ISBN">1856752321</globalID>
  <author>Mojay, Gabriel.</author>
  <title>Aromatherapy for healing the spirit : a guide to restoring emotional and mental balance through essential oils /</title>
  <localID>582083</localID>
  <catalogueURL>http://library.hud.ac.uk/catlink/bib/582083</catalogueURL
  <publisher>Gaia</publisher><published>2005</published>
 </resource>
 <context>
  <courseCode type="ucas">B390</courseCode>
  <courseName>FdSc Holistic Therapies</courseName>
  <progression>UG1</progression>
 </context>
</useRecord>

[How to style code fragments in a hosted Wordpres.com blog, via WordPress Support: “Code” shortcode]

(For a more complte description of the record format, see Mosaic data collection – A guide [PDF])

As a warm up exercise to familiarise myself with the data, I did a proof of concept hack (in part inspired by the Library Traveller script) that would simply linkify course codes appearing on a UCAS courses results page such that each course code would link to a results page listing the books that had been borrowed by students associated with courses with a similar course code in previous years.

Looking at the results page, w can see the course code appears next to tha name of each course:

A simple bookmarklet can b used to linkify the qualification code so that it points to the results of a query on the MOSAIC data with the appropriate course code:

javascript:(function(){
 var regu=new RegExp("\([A-Z0-9]{4}\)");
 var s=document.getElementsByTagName('span');
 for (i=0;i<s.length;i++){
  if (s&#91;i&#93;.getAttribute('class')=='bodyTextSmallGrey')
   if (regu.test(s&#91;i&#93;.innerHTML)){
    var id=regu.exec(s&#91;i&#93;.innerHTML);
    s&#91;i&#93;.innerHTML=s&#91;i&#93;.innerHTML.replace( regu,
     "<a href=\"http://library.hud.ac.uk/mosaic/api.pl?ucas="
     +id&#91;0&#93;+"\">"+id[0]+"</a>");}}})()

(It’s trivial to convert this to to Greasemonkey script: simple add the required Greasemonkey header and save the file with an appropriate filename – e.g. ucasCodeLinkify.user.js)

Clicking on the bookmarklet linkifies the qualification code to point to a search on http://library.hud.ac.uk/mosaic/api.pl?ucas= wigth the appropriate code.

To make the results a little friendlier, I created a simple Yahoo pipe that generates an RSS feed containing a list of books (along with their book covers) that had been borrowed more than a specified minimum number of times by people associated with that course code in previous years.

To start with, create the URI with a particular UCAS qualification code to call the web service and pull in the XML results feed:

Map elements onto legitimate RSS feed elements:

and strip out duplicate books. Note that the pipe also counts how many duplicate (?!) items there are:

Now filter the items based on the duplication (replication?) count – we want to only see books that have been borrowed at least a minimum number of times (this rules out ‘occasional’ loans on unrelated courses by single individuals – we only want the popular or heavily borrowed books associated with a particular UCAS qualification code in the first instance):

Finally, create a link to each book on Google books, and grab the book cover. Note that I assume the global identifier is an ISBN10… (if it’s an ISBN13, see Looking Up Alternative Copies of a Book on Amazon, via ThingISBN for a defensive measure using the LibraryThing Check ISBN API. That post also points towards a way by which we might find other courses that are associated with different editions of a particular book… ;-)


http://pipes.yahoo.com/pipes/pipe.info?_id=HqJfbDR93hGE5DAewmH_9A

You can find the pipe here: MOSAIC data: books borrowed on particular courses.

If we now change the linkifying URL to the RSS output of the pipe, we can provide a link for each course on the UCAS course search results page that points to an RSS feed of reading material presumably popularly associated with the course in previous years (note, however, that note all codes have books associated with them).

To do this, simply change the following URI stub in the bookmarklet:
http://library.hud.ac.uk/mosaic/api.pl?ucas=
to
http://pipes.yahoo.com/pipes/pipe.run?_id=HqJfbDR93hGE5DAewmH_9A&_render=rss&min=3&cc=

The “popular related borrowings” reading list generated now allows a potential student to explore some of the books associated with a particular course at decision time:-)

One possible follow on step would be to look up other courses related to each original course by virtue of several people having borrowed the same book (or other editions of it) on other courses. Can you see how you might achieve that, or at least what sorts of steps you need to implement?

PS If anyone would like to work this recipe up properly as (part of) a competition entry, feel free, though I’d appreciate a cut of any prize money if you win;-)

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

3 thoughts on “First Dabblings With @daveyp’s MOSAIC Library Competition Data API”

  1. That is so cool! :-)

    Fingers crossed, we’re hoping that there’ll be data from a few more libraries available (both as separate downloads and as part of the API) by mid August :-)

Comments are closed.

%d bloggers like this: