My (Rather Scruffy) MOSAIC Library Data Competition Entry

Several weeks ago, I put in an entry to the JISC MOSAIC Library Data Competition based on a couple of earlier hacks I’d put together that were built around @daveyp’s MOSAIC API wrapper for the competition dataset. Here’s the howto…

First thing to note is that the application built on stuff I’d posted about previously:

– an elaboration of UCAS course choice pages with a link that identified books related to a particular course based on course code: First Dabblings With @daveyp’s MOSAIC Library Competition Data API;
– a bookmarklet that would “look up to see whether there are any courses associated with a particular book (or its other ISBN variants) if its ISBN appears in the URI”: People Who Referred To This Book Were Taking This Course.

So what was the submission? A page, keyed in a RESTful way by a UCAS course code, that:

  • takes a course code,
  • looks for the set of books that have been borrowed more than a certain number of times by students associated with that course code (“find popular books associated with this course”),
  • grabs a review of each book from the Amazon mobile site,
  • annotates each book with a list of courses whose students have borrowed the book (“find other courses whose students have used this book”).

This page can be called using the following URI pattern:
http://ouseful.open.ac.uk/mosaic.php?cc=COURSECODE
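As a minimal sketch of how a client might build that URI: only the base URL and the cc parameter come from the pattern above; the function name and the example course code are my own, purely for illustration.

```python
from urllib.parse import urlencode

# Base URL taken from the URI pattern above.
BASE_URL = "http://ouseful.open.ac.uk/mosaic.php"


def course_page_url(course_code: str) -> str:
    """Return the books-on-a-course page URI for a UCAS course code."""
    # urlencode takes care of escaping any awkward characters in the code.
    return f"{BASE_URL}?{urlencode({'cc': course_code})}"


# "G400" is just an illustrative course code.
print(course_page_url("G400"))
```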

(Note that the service can be quite slow at times – the pipe is doing a fair bit of work and I’m not sure how quick the MOSAIC API is either. But that’s not the point, right? The point is identifying the logical glue required to join up the MOSAIC service API with a range of other index keys and web services, so that a production service could then potentially be implemented against a proven (-ish!;-) and demonstrably working (sometimes!) functional spec.)

The pipework itself can be found here: MOSAIC Data: Books on a course, with related courses

The first part takes a UCAS course code as key/index value, and queries the MOSAIC API; only books that have been taken out more than a specified minimum number of times in the context of a particular course are passed through:
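Outside of Yahoo Pipes, that filtering step might look something like the sketch below. The record layout (a list of dicts with "isbn" and "loans" keys) is an assumption made for the example, not the actual shape of the MOSAIC API response, and the ISBNs are dummy values.

```python
def popular_books(books, min_loans):
    """Keep only books borrowed more than min_loans times for the course."""
    return [book for book in books if book["loans"] > min_loans]


# Hypothetical loan counts for a single course code.
sample = [
    {"isbn": "0000000001", "loans": 12},
    {"isbn": "0000000002", "loans": 2},
    {"isbn": "0000000003", "loans": 5},
]

for book in popular_books(sample, min_loans=4):
    print(book["isbn"], book["loans"])
```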

A link to the book cover is then included in the description item of each book, and a call made to find the courses related to that book. A crude bit of screenscraping on the Amazon mobile page for each book brings in a book review. (I had originally intended to pull in more reviews from the Amazon API, but over the summer Amazon introduced a key based handshake to access the API – so s***w ’em.)

The output of the pipe is pulled into this page as a JSON feed, and the data from it is used to populate the page.

The first column in the table simply displays the title of each book associated with the specified course code.

The book cover is pulled in from… Worldcat, I think…, keyed by ISBN10 (maybe there are licensing issues involved..?) The preview link pops up a preview of the book in a shadowbox, if available, or failing that a link to the generic book info page on Google Books. (Close the shadowbox using the X on the bottom right of the Shadowbox view.)

It should be possible to handle the previewer using code that is displayed directly within the shadowbox, but I ran out of time trying to get it to work, so resorted to using a helper page instead that could be embedded via an iframe in the shadowbox (the helper might be quite useful in its own right as a piece of rapid prototyping componentware? http://ouseful.open.ac.uk/gbookIframeEmbed.php?isbn=159059858X). The lightbox code is from Shadowbox.js.

The Amazon info (the ‘first’ listed Amazon review for the book) is pulled in from the Amazon mobile site via a Yahoo pipe. (The Amazon URIs I use look something like http://www.amazon.co.uk/gp/aw/d.html/?a=ISBN10&er=1) Since Amazon started requiring API requests to be signed, it’s made quick hacks difficult; the next quickest thing is to scrape the mobile site, which is what I’ve done here. A regular expression in the page rewrites the Amazon mobile URIs to the normal web URIs.
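Here’s a rough sketch of the sort of rewrite that regular expression performs. The mobile URI pattern is the one quoted above; the target form (http://www.amazon.co.uk/dp/ISBN10) is my assumption about what a “normal” web URI looks like, and the function name is made up for the example.

```python
import re

# Matches the mobile URI form quoted above, capturing the ISBN10 and
# swallowing any trailing query arguments such as &er=1 (or &amp;er=1).
MOBILE_URI = re.compile(
    r"http://www\.amazon\.co\.uk/gp/aw/d\.html/\?a=(\w{10})(?:&(?:amp;)?\w+=\w*)*"
)


def to_web_uri(text):
    """Rewrite Amazon mobile URIs found in text to an assumed /dp/ form."""
    return MOBILE_URI.sub(r"http://www.amazon.co.uk/dp/\1", text)


print(to_web_uri("http://www.amazon.co.uk/gp/aw/d.html/?a=159059858X&er=1"))
```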

The course info column shows info for courses that are also associated with the book; because not enough people in HE appear to care about URIs and “pivot data”, we often can’t just take a course code and create a URI that links to a corresponding course description. In the short term, I pivot back to this page, so you can see what other books are associated with the specified course. And as a hack, I munge together a Google search query that at least tries to track the course down based on course code and HEI name (e.g. http://www.google.com/search?q=BSc%28H%29+Computer+Games+Programming+%22University+of+Huddersfield%22). I don’t query the UCAS page directly because the UCAS search uses session variables and a handshake as part of a short-lived URI to a set of search results. Many of the Google results are links to timed-out UCAS searches indexed by Google, though. It also amuses me that for some HEIs, searching their public site with the course code for a course they have ‘advertised’ on the UCAS site turns up no results. Zero. Zilch. None.
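The query munging is simple enough to sketch in a few lines. The function and parameter names are mine; the only thing taken from the page is the shape of the example URI, with the course label followed by the HEI name in double quotes.

```python
from urllib.parse import quote_plus


def course_search_url(course_label, hei_name):
    """Build a Google search URI from a course label and a quoted HEI name."""
    # Quote the institution name so Google treats it as an exact phrase.
    query = f'{course_label} "{hei_name}"'
    return "http://www.google.com/search?q=" + quote_plus(query)


print(course_search_url("BSc(H) Computer Games Programming",
                        "University of Huddersfield"))
```

Run against the Huddersfield example, this should give back the same URI as the one quoted above.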

I also posted a complementary bookmarklet that can be used to annotate course search results pages on the UCAS website with a link to the appropriate related books’n’courses page.

You may have noticed that the competition entry was posted in a minimal, unstyled form (I had hoped to make use of a Google visualisation API table widget to display the results, but proper work intruded ;-) This is in part to make the point that it is not – never was meant to be – a production service. It’s a working rapid prototype intended to demonstrate how Library data might be used outside the Library domain to act as a marketing support tool for Higher Education courses on the one hand, and an informal learning/related resources recommendation tool on the other.

These sorts of prototypes can be constructed in 1 to 2 hours and provide something tangible for Library folk to talk around (as opposed to documents produced at length describing the needs of fictional characters and user scenarios generated by break out groups in “let’s reinvent our web presence” departmental away days)…

They’re also the sorts of thing that we should be creating and discussing as throwaways on a regular basis, not hiding away for months on end because they’re in some competition or other…

(As to why I haven’t posted this before – a huge half written blog post backlog; the how to has been available on the demo page, which was tweeted about widely as soon as I’d got it on to the server…)

PS see also @ostephen’s Read To Learn competition entry, which takes a set of ISBNs from an uploaded file, looks them up against the MOSAIC API and returns the UCAS course codes of courses that are associated through book loans with the uploaded ISBNs (and maybe xISBNs too?). Each course code is then looked up against the current UCAS course catalogue, and the search results (i.e. the list of corresponding courses at institutions across the UK) are retrieved and displayed. In short, Read To Learn takes a set of ISBNs and finds related courses from a course code lookup on the UCAS site. My app just took a single course code and tried to find related books (along with reviews of the books) and courses.

Meanwhile, Over on the Arcadia Blog(s)… Redux

A month or so ago, I posted a round-up of items I’d published on the various Arcadia Project blogs (Meanwhile, Over on the Arcadia Blog(s)…). Here’s a follow up to that one, providing a quick review of the various Arcadia posts I’ve produced since then, posts that might in other circumstances have normally appeared on this blog.

PS For completeness in this summary of posts I’ve recently blogged elsewhere, there’s a smattering of stuff on the WriteToReply/Actually blog:

Phew… next week, back to normal – ish – though I intend to carry on posting library related stuff on the Arcadia blogs.

Google Books Library Shelves

It’s been some time since I last had a look at the “My Library” service in Google Books, but with the announcement of the Google eBooks store (currently US only, except for out-of-copyright free downloads) I popped over to my Google Books account to see whether anything else had changed…

One of the little known (I think?) features of Google Books is the “My Library” personalisation which allows you to create a collection of books and search over them. Searching your library finds all the books in your library collection that contain the search phrase; if a preview of the book is available, it returns deep links into the book to the point(s) at which the search terms appear:

Search within a book on google books

I’ve previously commented on the My Library aspect of Google Books in the context of its possible use by libraries for providing a full-text search option over books in their collection (e.g. Complementing the OPAC With a Full Text Search Book Catalogue where I describe the use of the service by Wiltshire Heritage Library (example) and the Penn State University Press booksearch (example)).

(At the moment I don’t think you can get statistics back on the searches carried out on a My Library profile, though Google Books can do stats for publishers, e.g. Google Books for Publishers.)

Anyway – one of the problems I originally had with My Library was that you could only maintain a single collection. But it seems that it’s now possible to create separate collections by tagging books in your Library onto “shelves”:

Google Books - My Library

(Shelves appeared at the start of 2010, it seems: Updated Books Home Page and My Library.)

So what immediately comes to mind is that if you’re running several courses, you could add the books used in the course to a My Library shelf, and then publish a link to a search context for that shelf to give a full text searchable version of the books on the list (assuming they’ve been scanned by the Goog, of course). Where previews are available, deep links into books will be available as part of the search results.

I haven’t really populated any shelves yet, but here’s the idea:

Google books - My library search

I haven’t explored the Book Search Data API yet, but it does seem to offer the ability to search over a particular user’s public library, as well as retrieve lists of books from the library. API options also exist for adding books to a library, though the API seems to only support adding labels, rather than updating shelves (or maybe legacy handlers map labelled books onto shelves?). With a bit of digging, it might be possible to find a route to automate the creation of a library shelf from a list of books. (Hmmm, maybe I should try this with the OU Set books list?!;-)

Google Books shelves thus seem to provide a way of creating different lists of books within a single user library, although I’m not sure if there is a limit on the number of books contained within a shelf, or in the library as a whole. Another nice feature is that it’s possible to select a shelf based filter to just display books from a particular shelf (click on the label in the left-hand sidebar to filter by shelf); this search facet also seems to be passed through to a bookmarkable URL for the filtered search via the as_coll argument (I think?). (Which is to say: you can share a link for a search within a particular shelf in a particular user’s library.)

I’m not sure if Google Books is available through Google Apps for Education, but it could be a useful component of a full text book search context around books on a reading list?

PS As Google Scholar appears to be improving its coverage, it strikes me that the Goog still doesn’t offer a Google service for building searchable reference lists, although it does let you customise the addition of links that will bookmark a reference to a service for you:

Google scholar citation linker

Here’s how the links are displayed:

Google scholar results

Given you can build weblink search contexts using Google custom search engines, full text book search contexts using the Books My Library service, search over content from bundled feeds in Google Reader and even run things like video search by user on Youtube*, the Goog must surely be looking to offer a collection building and searching over service for Google Scholar? So I wonder… could Google end up taking over a service like CiteULike or Mendeley to complement and bootstrap personalisation of their Google Scholar offering? Or would they just build their own (cut down) version of these services?

* Hmm… I wonder if there’s a Youtube API switch that lets you search playlists? It’s definitely possible to get a playlist feed out…

PPS the Goog is also lacking a way of exposing all these personal search contexts to a logged in user through the same interface. If it were down to me, I’d start to expose them in the left hand sidebar of Google websearch, so I’m guessing this will be a labs/experimental service in the new year, if it isn’t already so…

Google search tools

…maybe…?;-)

Positioning @theUL

After a couple of pleasurable days in Cambridge on an Arcadia Project debrief, and an aside comment about the University Library’s social media presence, I thought I’d generate a quick plot of the 1.5 degree egonet showing how followers of @theUL on Twitter follow each other:

@theUL followers 1.5
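For what it’s worth, the edge list behind that sort of plot can be sketched as follows, assuming you’ve already grabbed each follower’s friends list. The data structure (a dict mapping an account to the accounts it follows) and the toy account names are assumptions for the example; this isn’t the code I actually used.

```python
def egonet_edges(friends_of, followers):
    """Return (follower, followed) pairs where both accounts follow the ego."""
    members = set(followers)
    return sorted(
        (user, friend)
        for user in members
        for friend in friends_of.get(user, ())
        if friend in members and friend != user
    )


# Toy data: three followers of the ego account, plus one outsider.
toy_friends = {
    "alice": ["bob", "outsider"],
    "bob": ["alice", "carol"],
    "carol": [],
}

print(egonet_edges(toy_friends, ["alice", "bob", "carol"]))
```

The resulting pairs can then be handed to whatever graph layout tool you prefer.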

If my metadata is correct, here’s a social positioning graph showing Twitter accounts followed by 50 or more Twitter users who also follow @theUL (actually, I think the approach was: for each of the followers of @theUL, sample 250 of their friends and then display folk followed by at least 50 of @theUL’s followers):

@theUL - social positioning
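The sample-and-threshold approach described above can be sketched like this. It’s a reconstruction from that description rather than the original script: the function name, the dict-of-friends-lists input and the seeded random sampling are all assumptions.

```python
import random
from collections import Counter


def positioning_accounts(friends_of, followers, sample_size=250,
                         min_followers=50, seed=0):
    """Count how many followers follow each account, keeping the popular ones."""
    rng = random.Random(seed)  # seeded so repeat runs give the same sample
    counts = Counter()
    for follower in followers:
        friends = list(friends_of.get(follower, ()))
        # Sample at most sample_size friends per follower.
        if len(friends) > sample_size:
            friends = rng.sample(friends, sample_size)
        counts.update(set(friends))  # count each follower at most once
    return {acct: n for acct, n in counts.items() if n >= min_followers}


# Toy data with a much lower threshold than the 50 used for the plot.
toy_friends_of = {"a": ["x", "y"], "b": ["x"], "c": ["x", "z"]}
print(positioning_accounts(toy_friends_of, ["a", "b", "c"], min_followers=2))
```

One consequence of sampling is that accounts followed by only just enough of the followers may drop in or out of the graph from run to run, unless the sample is seeded as it is here.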

Just by the by, I wonder if this is how the University Library folk imagine themselves to be positioned in social media space?!