Mashup Mayhem BCS (Glasgow Branch) Young Professionals Talk

On Monday I gave a presentation for the BCS Glasgow branch at the invite of Daniel Livingstone, who I met in the mashup mart session at the CETIS bash last year.

I’d prepared some slides – even rehearsed a couple of the mashups I was going to do – and then fell apart somewhat when the IE6 browser I was using on the lectern PC failed to play nicely with either Pageflakes or Yahoo Pipes. (I had intended to use my own laptop, but the end of the projector cable was locked away…)

“Why not use Firefox Portable?” came a cry from the floor (and I did, in the end, thanks to Daniel…). And indeed, why not? When I was in the swing of doing regular social bookmarking sessions, often in IT training suites, I always used the local machines, and I always used Portable Firefox.

But whilst I’ve started “playing safe” by uploading at least a basic version of the slides I intend to use to Slideshare before I leave home on the way to a presentation, I’ve stopped using Portable Firefox on a USB key even if I am taking the presentation off one… (There is always a risk that “proxy settings” are required when you use your own browser, of course, but a quick check beforehand usually sorts that…)

So note to self – get back in the habit of taking everything on a USB key, as well as doing the Slideshare backup, and ideally prepping links in a feed somewhere (I half did that on Monday) so they can be referred to via a live bookmark or feedshow.

Anyway, some of the feedback from the session suggested handouts would have been handy, so here are handouts of a sort – a set of repurposed slides in which I’ve taken some of the bits that hopefully worked on Monday, along with a little bit of extra visual explanation added in. The slides probably still don’t work as a standalone resource, but that’s what the talking’s for, right?!;-)

There are also some relevant URLs collected together under the glasgowbcs tag on my delicious account: http://delicious.com/psychemedia/glasgowbcs.

Looking Up Alternative Copies of a Book on Amazon, via ThingISBN

As Amazon improves access to the long tail of books through Amazon’s marketplace sellers and maybe even their ownership of Abebooks, it’s increasingly easy to find multiple editions of the same book. So when I followed a link to a book that Mike Ellis recommended last week (to The Victorian Internet in fact) and found that none of the editions of the book were in stock, as new, on Amazon, I had the tangential thought that it’d be quite handy to have a service that would take an ISBN and then look up the prices for all the various editions of that book on Amazon.

Given an ISBN for a book, there are at least a couple of ways of finding the ISBNs for other editions of the book – the Worldcat xISBN service, and ThingISBN from LibraryThing (now part owned by Amazon through Amazon’s ownership of Abebooks; for who else Amazon owns, see Amazon “Edge Services” – Digital Manufacturing).

So here’s a couple of Yahoo pipes for looking up the alternative editions of a book on the Amazon website, after discovering those editions from ThingISBN.

First of all a pipe that takes an ISBN and looks up alternative editions using ThingISBN:

What this pipe does is construct a URL that calls for the list of alternative ISBNs for a given ISBN. That is, it constructs a URL of the form http://www.librarything.com/api/thingISBN/ISBNHERE, which returns an XML file containing the alternative ISBNs (example); it then grabs the XML file using the Fetch Data block, renames the internal representation of the grabbed XML so that the pipe will generate a valid RSS feed, and outputs the result.

So now we have an RSS feed that contains a list of alternative ISBNs, via ThingISBN, for a given ISBN.
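For anyone who would rather read this step as code than as a pipe, here is a minimal Python sketch of the same idea. The URL pattern is the one given above; the assumption (not confirmed by this post) is that the response wraps each alternative in an `<isbn>` element. The function names are made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

THINGISBN_BASE = "http://www.librarything.com/api/thingISBN/"


def thingisbn_url(isbn):
    """Build the ThingISBN lookup URL for one ISBN."""
    return THINGISBN_BASE + isbn


def parse_isbns(xml_data):
    """Collect the text of every <isbn> element (assumed response layout)."""
    root = ET.fromstring(xml_data)
    return [el.text.strip() for el in root.iter("isbn") if el.text]


def alternative_isbns(isbn):
    """Fetch and parse in one go; this call needs network access."""
    with urllib.request.urlopen(thingisbn_url(isbn)) as response:
        return parse_isbns(response.read())
```

Keeping the parsing separate from the fetching means the parsing can be tried out on a saved copy of the XML without calling the service.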

Now to find out how much these books cost on Amazon. For that, we shall find it convenient to construct a pipe that will look up the details of a book on Amazon using the Amazon Associates web service, given an ISBN. (For a brief intro to Amazon Associates web services, see Calling Amazon Associates/Ecommerce Web Services from a Google Spreadsheet.)

Here’s a pipe to do that:

(If you use the AWSzone scratchpad to construct a URL that calls the Amazon web service with a look up for book by ISBN, you can just paste it into the “Base” entry form in the Pipe’s URL Builder block and hit return, and it will explode the arguments into the appropriate slots for you.)

So now we have a pipe that will look up the details of a book on Amazon given its ISBN.

We can now put the ThingISBN pipe and the Amazon ISBN lookup pipe together, to create a compound pipe that will look up details for all the alternative versions of a particular book, given that particular book’s ISBN:

Okay – so now we have a pipe that takes an ISBN, looks up the alternative ISBNs using ThingISBN, then grabs details for each of those alternatives from Amazon…
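The shape of that compound pipe (one lookup that produces a list, then a loop that calls a second lookup for each entry) can be sketched in a few lines of Python. This is only an illustration of the orchestration: `find_alternatives` and `lookup_details` are hypothetical stand-ins for the two nested pipes, passed in as functions so that no particular web service call is assumed.

```python
def editions_with_details(isbn, find_alternatives, lookup_details):
    """Look up every alternative edition of a book and collect its details.

    find_alternatives: function taking an ISBN and returning a list of ISBNs
    lookup_details:    function taking an ISBN and returning details, or None
    """
    results = []
    for alternative in find_alternatives(isbn):
        details = lookup_details(alternative)
        # Skip editions the second service knows nothing about.
        if details is not None:
            results.append(details)
    return results
```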

Now what? Well, if you use this pipe in your own mashup and construct a URL that calls it with a given ISBN, you may find that, unless your own code handles the ISBN properly, you pass a badly formed ISBN to the pipe. The most common example of this is dropping a leading 0 on the ISBN – so e.g. you pass 441172717 rather than 0441172717.

Now it just so happens that LibraryThing offers another webservice that can correct this sort of error – ISBN check API – and it’s easy enough to create a pipe to call it:

Good – so now we can defensively programme the front end of our pipe to handle badly formed ISBNs by sticking this pipe at the front of the compound pipe that calls ThingISBN and then loops through Amazon calls.
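If the calling code is under your own control, the dropped leading zero can also be caught locally before anything is sent to the pipe. The sketch below is a swapped-in alternative to the LibraryThing check service, not a description of how that service works: it pads a 10-digit ISBN back to length and verifies the standard ISBN-10 checksum. The function names are hypothetical.

```python
DIGITS = "0123456789"


def normalise_isbn10(raw):
    """Strip separators and restore leading zeros lost in transit."""
    cleaned = str(raw).replace("-", "").replace(" ", "").upper()
    return cleaned.zfill(10)


def is_valid_isbn10(isbn):
    """Check the ISBN-10 checksum: the weighted sum must be divisible by 11."""
    if len(isbn) != 10:
        return False
    if any(char not in DIGITS for char in isbn[:9]):
        return False
    if isbn[9] not in DIGITS + "X":
        return False
    total = 0
    for position, char in enumerate(isbn):
        value = 10 if char == "X" else int(char)
        total += (10 - position) * value  # weights run from 10 down to 1
    return total % 11 == 0
```

So an ISBN that arrives as the number 441172717 is padded back to 0441172717, which passes the checksum.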

But there’s something we can do at the other end of the pipe too, and that is make use of a ‘slideshow’ feature that Yahoo pipes offers as an interface to the pipe. If the elements of a feed contain image items that are packaged in an appropriate way, the Yahoo pipes interface will automatically create a slideshow of those images.

What this means is that if we package URLs that point to the book cover image of each alternative version of a book, we can get a slideshow of the bookcovers of all the alternative editions of that book.

Here’s just such a pipe:

And here’s the example output:

If you click on the “Get as Badge” option, you can then embed this slideshow on your own website or start page:

For example, here I’ve added the slideshow to my iGoogle page:

Now to my mind, that’s quite a fun (and practical) way of introducing quite a few ideas about webservice orchestration that can be unpacked at a later date. But of course, it’s not very academic, so it’s unlikely to appear in a course near you anytime soon… ;-) But I’d argue that it does stand up as a demo that could be given to show people how much fun this stuff can be to play with, before we inflict SOAP and WS-* on them…

Amazon Reviews from Different Editions of the Same Book

A couple of days ago I posted a Yahoo pipe that showed how to Look Up Alternative Copies of a Book on Amazon, via ThingISBN. The main inspiration for that hack was that it could be useful to get “as new” prices for different editions of the same book if you’re not so bothered about which edition you get, but you are bothered by the price. (Or maybe you wanted an edition of a book with a different cover…)

It struck me last night that it might also be useful to aggregate the reviews from different editions of the same book, so here’s a hack that will do exactly that: produce a feed listing the reviews for the different editions of a particular book, and label each review with the book it came from via its cover:

The pipe starts exactly as before – get an ISBN, check that the ISBN is valid, then look up the ISBNs of the alternative editions of the book. The next step is to grab the Amazon comments for each book, before annotating each item (that is, each comment) with a link to the book cover that the review applies to; we also grab the ISBN (the ASIN) for each book and make a placeholder using it for the item link and image link:

Then we just create the appropriate URLs back to the Amazon site for that particular book edition:

The patterns are as follows:
– book description page: http://www.amazon.co.uk/exec/obidos/ASIN/ISBN
– book cover image: http://images.amazon.com/images/P/ISBN.01.TZZZZZZZ
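As a minimal sketch, the two patterns can be filled in with plain string concatenation. The function name and the dictionary keys are made up for illustration; only the URL patterns come from the list above.

```python
def amazon_links(isbn):
    """Build the description page and cover image URLs for one edition."""
    return {
        "link": "http://www.amazon.co.uk/exec/obidos/ASIN/" + isbn,
        "image": "http://images.amazon.com/images/P/" + isbn + ".01.TZZZZZZZ",
    }
```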

Here’s how the nested pipe that grabs the comments works (Amazon book reviews lookup by ISBN pipe): first construct the URL to call the webservice that gets details for a book with a particular ISBN – the large report format includes the reviews:

Grab the results XML and point to the reviews (which are at Items.Item.CustomerReviews.Review):

Construct a valid RSS feed containing one comment per item:

And there you have it – a pipe that looks up the different editions of a particular book using ThingISBN, and then aggregates the Amazon reviews for all those editions.
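The "point to the reviews, then emit one item per review" step might look something like this in Python. It is a sketch under assumptions: the path Items.Item.CustomerReviews.Review is the one quoted above, but the child element names used here (ASIN, Summary, Content) and the absence of an XML namespace are assumptions for illustration; a real response would probably need namespace handling.

```python
import xml.etree.ElementTree as ET


def reviews_to_items(xml_data):
    """Flatten the reviews in a lookup response into feed-style items."""
    root = ET.fromstring(xml_data)
    items = []
    for book in root.findall("./Items/Item"):
        asin = book.findtext("ASIN", default="")
        for review in book.findall("./CustomerReviews/Review"):
            items.append({
                "title": review.findtext("Summary", default=""),
                "description": review.findtext("Content", default=""),
                "asin": asin,  # lets each review be tied back to its edition
            })
    return items
```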

Recent OU Programmes on the BBC, via iPlayer

As @liamgh will tell you, Coast is getting quite a few airings at the moment on various BBC channels. And how does @liamgh know this? Because he’s following the open2 openuniversity twitter feed, which sends out alerts when an OU programme is about to be aired on a broadcast BBC channel.

(As well as the feed from the open2 twitter account, you can also find out what’s on from the OU/BBC schedule feed (http://open2.net/feeds/rss_schedule.xml), via the Open2.net schedule page; iCal feeds appear not to be available…)

So to make it easier for him to catch up on any episodes he missed, here’s a quick hack that mines the open2 twitter feed to create a “7 day catch up” site for broadcast OU TV programmes (the page also links through to several video playlists from the OU’s Youtube site).

The page actually displays links to programmes that are currently viewable on BBC iPlayer (either via a desktop web browser, or via a mobile browser – which means you can view this stuff on your iPhone ;-), and a short description of the programme, as pulled from the programme episode‘s web page on the BBC website. You’ll note that the original twitter feed just mentions the programme title; the TinyURLd link goes back to the series web page on the Open2 website.

Thinking about it, I could probably have done the hackery required to get iPlayer URLs from within the page; but I didn’t… Given the clue that the page is put together using a jQuery script I stole from this post on Parsing Yahoo Pipes JSON Feeds with jQuery, you can maybe guess where the glue logic for this site lives?;-)

There are three pipes involved in the hackery – the JSON that is pulled into the page comes from this OU Recent programmes (via BBC iPlayer) pipe.

The first part grabs the feed, identifies the programme title, and then searches for that programme on the BBC iPlayer site.

The nested BBC Search Results scrape pipe searches the BBC programmes site and filters results that point to an actual iPlayer page (so we can watch the result on iPlayer).

Back in the main pipe, we take the list of recently tweeted OU programmes that are available on iPlayer, grab the programme ID (which is used as a key in all manner of BBC URLs :-), and then call another nested pipe that gets the programme description from the actual programme web page.

This second nested pipe just gets the programme description, creates a title and builds the iPlayer URL:

(The logic is all a bit hacked – and could be tidied up – but I was playing through my fingertips and didn’t feel like ‘rearchitecting’ the system once I knew what I wanted it to do… which is what it does do…;-)

As an afterthought, the items in the main pipe are annotated with a link to the mobile iPlayer version of each programme:

So there you have it: a “7 day catch up” site for broadcast OU TV programmes, with replay via iPlayer or mobile iPlayer.

[18/11/08 – the site that the app runs on is down at the moment, as a network security update is carried out; sorry about that – maybe I should use a cloud server?]

Approxi-mapping Mash-ups, with a Google MyMaps Tidy Up to Follow

What do you do when you scrape a data set, geocode it so you can plot it on a map, and find that the geocoding isn’t quite as good as you’d hoped?

I’d promised myself that I wasn’t going to keep on posting “yet another way of scraping data into Google spreadsheets then geocoding it with a Yahoo pipe” posts along the lines of Data Scraping Wikipedia with Google Spreadsheets, but a post on Google Maps mania – Water Quality Google Map – sent me off on a train of thought that has sort of paid dividends…

So first up, the post got me thinking about whether there are maps of Blue Flag beaches in the UK, and where I could find them. A link on the UK page of blueflag.org lists them: UK Blue Flag beaches (but there is a key in the URL, so I’m not sure how persistent that URL is).

Pull it into a Google spreadsheet using:
=ImportHtml("http://www.blueflag.org/tools/beachsearch?q=beach&k={E1BB12E8-A3F7-4EE6-87B3-EC7CD55D3690}&f=locationcategory", "table", "1")

Publish the CSV:

Geocode the beaches using a Yahoo pipe – rather than using the Pipe location API, I’m making a call to the Yahoo GeoPlanet/Where API – I’ll post about that another day…

Grab the KML from the pipe:

Now looking at the map, it looks like some of the markers may be mislocated – like the ones that appear in the middle of the country, hundreds of miles from the coast. So what it might be handy to do is use the scraped data as a buggy, downloaded data set that needs cleaning. (This means that we are not going to treat the data as “live” data any more.)

And here’s where the next step comes in… Google MyMaps lets you seed a map by importing a KML file:

The import can be from a desktop file, or a URL:

Import the KML from the Yahoo pipe, and we now have the data set in the Google MyMap.

So the data set in the map is now decoupled from the pipe, the spreadsheet and the original Blue Flag website. It exists as a geo data set within Google MyMaps. Which means that I can edit the markers, and relocate the ones that are in the wrong place:

And before the post-hegemonic tirade comes in (;-), here’s an attempt at capturing the source of the data on the Google MyMap.

So, to sum up – Google MyMaps can be used to import an approximately geo-coded data set, tidy it up and republish it.

PS don’t forget you can also use Google Maps (i.e. MyMaps) for geoblogging

Merging Several Calendar iCal Feeds With Yahoo Pipes

Following up on Displaying Events from Multiple Google Calendars in a Single Embedded Calendar View, and picking up on a quip Jim Groom made in the post that started this thread (“Patrick suggested Yahoo Pipes!, you ever experiment with this?”), I did have a quick play with pipes, and this is what I found…

The “Fetch Feed” block is happy to accept iCal feeds, as this iCal Merge pipe demonstrates:

(I grabbed the iCal feeds from pages linked to from the Stanford events page. A websearch for “ical lectures events” should pull up other sources;-)

If you import an iCal feed into a Yahoo pipe, you get an iCal output format option:

You can then render this feed in an online calendar such as 30 boxes: pipes merged iCal feeds in 30 boxes (here’s the 30 boxes config page for that calendar).

(NB it’s worth noting that 30 boxes will let you generate a calendar view that will merge up to 3 iCal feeds anyway.)

Using the Pipe’s output iCal URL to try to add the merged calendar feed to Google Calendar didn’t seem to work, but when I converted the URL to a TinyURL (http://tinyurl.com/67bg2d) and used that as the import URL, it worked fine.

Do this:

then this:

and get this:

(I couldn’t get the Yahoo pipe iCal feed to work in iCal on my Mac, nor could I resyndicate the feed from the Google Calendar. I think the problem is with the way the Pipes output URL is constructed… which could be worked around by relaying/republishing the Pipe iCal feed through something with a nice URL, maybe?)

That okay for you, Reverend? :-)

PS having to add the feeds by hand to the pipe is a pain. So how about if we list a set of iCal feeds in an RSS feed (which could be a shared bookmark feed, built around a common tag), then pull that bookmark feed (such as the feed from a delicious page (e.g. http://delicious.com/psychemedia/ical+feedtest)) into a pipe and use it to identify what iCal feeds to pull into the pipe?

Got that? The Loop block grabs the URL for each iCal feed listed in the input RSS feed, and pulls in the corresponding iCal events. It seems to work okay, too:-) That is, the feed powered iCal merge pipe will aggregate events from all the iCal feeds listed in the RSS feed that is pulled into the pipe.
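Outside Pipes, the merge itself is not much code. Here is a rough Python sketch, assuming the iCal feeds have already been downloaded as text: it lifts the VEVENT blocks out of each calendar, orders them by their DTSTART value, and wraps them in a single new calendar. It is deliberately naive (plain string comparison of start times, no time zone handling, no folded-line support), and the PRODID value is a made-up placeholder.

```python
import re

EVENT_PATTERN = re.compile(r"BEGIN:VEVENT.*?END:VEVENT", re.DOTALL)
START_PATTERN = re.compile(r"^DTSTART[^:\n]*:(\S+)", re.MULTILINE)


def extract_events(ical_text):
    """Return each VEVENT block in a calendar as a string."""
    normalised = ical_text.replace("\r\n", "\n")
    return EVENT_PATTERN.findall(normalised)


def event_start(event):
    """Return the raw DTSTART value, or an empty string if there is none."""
    match = START_PATTERN.search(event)
    return match.group(1) if match else ""


def merge_calendars(ical_texts):
    """Combine the events from several calendars into one, earliest first."""
    events = []
    for text in ical_texts:
        events.extend(extract_events(text))
    events.sort(key=event_start)
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0",
             "PRODID:-//example//ical-merge-sketch//EN"]
    for event in events:
        lines.extend(event.split("\n"))
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"
```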

So now the workflow, which could possibly be tidied a little, is this:
– bookmark iCal feed URLs to a common somewhere (this can be as weak as shared tags, which are then used as the basis for aggregation of feed URLs);
– take the feed from that common somewhere and pop it into the feed powered iCal merge pipe;
– get the TinyURL of the iCal output from the pipe, and subscribe to it in Google Calendar (for a personal calendar view).

Hmm… we still can’t publish the Google Calendar though, because we don’t “own” the calendar dates (the iCal feed does)? But I guess we can still use 30boxes as the display surface, and provide a button to add the calendar to Google Calendar?

OKAY – it seems that when you import the feed, it makes sense to tick the box that says “allow other people to find this calendar”:

… because then you can generate some embed code for the calendar, provide a link for anyone else to see the calendar (like this one), and use the tidied up iCal feed that Google calendar now provides to view the calendar in something like iCal:

PPS To make things a little easier, I tweaked the feed powered pipe so now you can just provide it with an RSS feed that points to one or more iCal feeds:

I also added a block to sort the dates in ascending date order. It’s simple enough to add the feed to iGoogle etc, or as a badge in your blog, using the Yahoo Pipes display helper tools:

Hmm, it would be nice if Pipes also offered a “calendar” output view when it knew there was iCal data around, just like it generates a map for when it sniffs geo-data, and a slideshow view when it detects appropriately addressed media objects? Any chance of that, I wonder?

Getting Lots of Results Out of a Google Custom Search Engine (CSE) via RSS

In Getting an RSS Feed Out of a Google Custom Search Engine (CSE), I described a Yahoo! pipe that can be used as a way of getting a RSS feed out of a Google Custom Search Engine using the Google Ajax Search API.

One of the limitations of the API appears to be that it only returns 8 search results at a time, although these can be paged.

So for example, if you run a normal Google search that returns lots of results, those results are presented over several results pages. If you hover over the links for the different pages, and look at the status bar at the bottom of your browser where the link is displayed, you’ll see that the URL for each page of results is largely the same; the difference comes in the &start= argument in the URI that says which number search result should be at the top of the page; something like this:

The same argument – start – can be used to page the CSE results from the AJAX API; which means we can add this in to the URI that calls the Google AJAX Search API within a pipe:

This gives us a quick fix for getting more than 8 results out of a CSE: use the get 8 CSE results starting at a given result pipe to get the first 8 results (counted as results 0..7), then another copy of the pipe to get results 9-16 (counted as 8..15 – i.e. starting at result 8), a further copy of the pipe to get results 17-24 (counted as 16..23), and so on, and then aggregate all the results…

Here’s an example – lots of Google CSE results as RSS pipe:

Notice that each CSE calling pipe is called with the same CSE ID and the same search query, but different start numbers.
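The arithmetic behind those start numbers is just multiples of 8. A small Python sketch makes the pattern explicit: the `start` argument is the one described above, while the endpoint URL is a placeholder and the `cx` and `q` parameter names are assumptions for illustration.

```python
from urllib.parse import urlencode

PAGE_SIZE = 8
SEARCH_ENDPOINT = "http://example.com/search"  # placeholder, not the real API URL


def start_offsets(pages):
    """Start values for successive pages of 8 results: 0, 8, 16, ..."""
    return [page * PAGE_SIZE for page in range(pages)]


def paged_urls(cse_id, query, pages):
    """One URL per page: same CSE ID and query, different start value."""
    urls = []
    for start in start_offsets(pages):
        params = [("cx", cse_id), ("q", query), ("start", start)]
        urls.append(SEARCH_ENDPOINT + "?" + urlencode(params))
    return urls
```

Fetching each URL and concatenating the results then plays the part of the Union block.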

This recipe hopefully also gives you the clue that you could use the Union pipe block to merge results from different CSEs (just make sure you use the right CSE ID and the right start values!).

What Happens If Yahoo! Pipes Dies?

News appeared recently that Yahoo’s video editing site Jumpcut has stopped accepting new uploads, and users are being encouraged to move over to flickr. (On the odd occasion I’ve played with online video suites, I’ve tended to use Jumpcut, so I’m not overjoyed about this. Just FYI, Jaycut or Photobucket (which uses Adobe Premiere Express) are my fallback positions…)

This news got me thinking – again – about what my fallback position would be if Yahoo! Pipes disappeared. (Regular readers – and anyone who’s seen me give a mashup related presentation lately – will know I’m a bit of a pipes junkie;-)

So here’s what I’ve been saying I’m going to do for a long time – and maybe by posting it I’ll provoke myself into doing something about it next year…

  1. Set up a wiki… Yahoo Pipes Code Bindings, or similar;
  2. for each block in Yahoo! Pipes, post the following:
    • an image of the block;
    • a code equivalent for that block; (e.g. a fragment of Python, PHP, Javascript or Google Mashup Editor code that is functionally equivalent to the block);
  3. that’s it… or maybe show a minimal example pipe using the block, and an equivalent, working, PHP, Python, Javascript or Google Mashup Editor programme;

What this would mean is that a screenshot of a Yahoo pipe could act as a specification for a feed processing programme, and the bindings from blocks to code would allow a translation from the visual pipe description to some actual (working) code.
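To give a flavour of what such bindings might look like, here is a hedged Python sketch in which a feed is just a list of dictionaries and each block is a small function. The function names, and the exact behaviour of each one, are illustrative guesses at rough equivalents of the Filter, Sort and Union blocks rather than faithful reimplementations.

```python
def filter_block(items, field, text):
    """Rough Filter stand-in: keep items whose field contains the text."""
    return [item for item in items
            if text.lower() in item.get(field, "").lower()]


def sort_block(items, field, descending=False):
    """Rough Sort stand-in: order items by one field."""
    return sorted(items, key=lambda item: item.get(field, ""),
                  reverse=descending)


def union_block(*feeds):
    """Rough Union stand-in: concatenate several feeds into one."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged


def run_pipe(items, blocks):
    """Wire blocks together: each block takes a feed and returns a feed."""
    for block in blocks:
        items = block(items)
    return items
```

A pipe then becomes a list of blocks, for example `run_pipe(feed, [lambda items: filter_block(items, "title", "coast"), lambda items: sort_block(items, "pubDate")])`, which reads in much the same order as the wiring in the Pipes editor.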

That would be okay for starters, and would at least mean I’d be able to ‘rescue’ large amounts of the functionality of pipes I’ve blogged about without having to rethink all the algorithms, or work out too much (if any) of the code. Cut and paste job from the code equivalents on the wiki… (err…?!)

As well as rescuing the functionality of the pipe, this approach also has the advantage of making Yahoo pipes acceptable as a rapid prototyping tool for a quick rush of code that can be run on a server elsewhere.

How could the process be improved? Well, taking a cue from the AWSZone Scratchpads (which, err, appear to be down at the moment?), it’d be nice to be able to just generate the code from the actual pipe.

How might we be able to do that? I’m not sure, but I’d like to think the following would be possible:

  1. using a browser extension, or Greasemonkey script, capture a Javascript object representation of a pipe from the Edit view of that pipe;
  2. parse the Javascript representation of the pipe and translate each Pipe block to the appropriate code binding;

So the vision here is you could edit a pipe, click a button, and generate the code equivalent of the pipe. (Of course, it’d be really nice if Pipes offered an “export pipe as code” option natively;-)

(After all, Zoho Creator Deploys to Google App Engine: “When you open an application in Zoho Creator in edit mode, you’ll see a new option ‘Deploy in App Engine’ under ‘More Actions’ menu (on the top). This option will let you generate and download the Python code (App Engine supports deployment of Python only apps) of your Zoho Creator application which you can then deploy to Google App Engine. … Zoho Creator essentially acts as an IDE for Google App Engine.” So why shouldn’t Pipes pipelines also deploy elsewhere too? Why shouldn’t “Yahoo pipes essentially act as an IDE for feed-powered pipelines in Python, PHP, Javascript and the Google Mashup Editor”?)

PS If anyone wants to create a wiki and start this process off, please be my guest (I’ll be largely offline over the Christmas period, so won’t be able to run with this idea until the New Year, if then…)

Tinkering With Time

A few weeks ago now, I was looking for a context within which I could have a play with the deprecated BBC Web API. Now this isn’t the most useful of APIs, as far as I’m concerned, because rather than speaking in the language of iPlayer programme identifiers it uses a different set of programme IDs (and I haven’t yet found a way of mapping the one onto the other). But as I’d found the API, I wanted to try to do something with it, and this is what I came up with: a service you can tweet to that will tell you what’s on a specified BBC TV or radio channel now (or sometime in the recent past).

Now I didn’t actually get round to building the tweetbot, and the time handling is a little ropey, but if I write it up it may spark some more ideas. So here goes…

The first part of the pipe parses a message of the form “#remindme BBCChannel time statement”. The BBCChannel needs to be in the correct format (e.g. BBCOne, BBCRFour) and only certain time constructs work (now, two hours ago, 3 hours later all seem to work).
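As a rough idea of what that parsing step involves, here is a Python sketch that splits the message into a channel and a time expression, then resolves the handful of expressions mentioned above. It is a much more limited stand-in for what the pipe does: the function names are hypothetical, only "now" and "N hours ago/later" are handled, and the current time is passed in so the behaviour is predictable.

```python
import re
from datetime import datetime, timedelta

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}
MESSAGE_PATTERN = re.compile(r"^#remindme\s+(\S+)\s+(.+)$")
OFFSET_PATTERN = re.compile(r"^(\w+)\s+hours?\s+(ago|later)$")


def parse_message(message):
    """Split '#remindme BBCChannel time statement' into its two parts."""
    match = MESSAGE_PATTERN.match(message.strip())
    if not match:
        raise ValueError("not a #remindme message: " + message)
    return match.group(1), match.group(2).strip().lower()


def resolve_time(expression, now):
    """Turn 'now' or 'N hours ago/later' into a datetime relative to now."""
    if expression == "now":
        return now
    match = OFFSET_PATTERN.match(expression)
    if not match:
        raise ValueError("unsupported time expression: " + expression)
    amount_text, direction = match.groups()
    if amount_text.isdecimal():
        amount = int(amount_text)
    elif amount_text in NUMBER_WORDS:
        amount = NUMBER_WORDS[amount_text]
    else:
        raise ValueError("unsupported amount: " + amount_text)
    delta = timedelta(hours=amount)
    return now - delta if direction == "ago" else now + delta
```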

The natural language-ish time expression gets converted to an actual time by the Date Builder block, and is then written into a string format that the BBC Web API requires:

Then we construct the URI that references the BBC Web API, grab the data back from that URI and do a tiny bit of tidying up:

If you run the pipe, you get something like this:

Time expressions such as “last Friday” seem to calculate the correct date and use the current time of day. So you could use this service to remind yourself what was on at the current time last week, for example.

A second pipe grabs the programme data from the programme ID, by constructing the web service call:

then grabbing the programme data and constructing a description based on it:

It’s then easy enough to call this description getting pipe at the end of the original pipe, remembering to call the pipe with the appropriate programme ID:

So now we get the description too:

To see what’s on (or what was on) between two times, we need to construct a URI to call the BBC Web API with the appropriate time arguments, suitably encoded:

and then call the web service with that URI.

It’s easy enough to embed this pipe in a variant of the original pipe that generates the two appropriately encoded time strings from two natural language time strings:

If we add the programme details fetching pipe to the end of the pipe, we can grab the details for each programme and get this sort of output from the whole pipeline:

Telling Yahoo Pipes How YOU Want URI Arguments Ordered

I had an email today from OUseful.info reader Stephen Harlow asking a Yahoo Pipes related question (I get this every 2-3 weeks, at the moment, and try to answer them as best I can): “My problem is that the URL Builder in Pipes seems to muddle the order of query parameters in the URL… Is there anyway of fixing the order within Pipes’ URL builder?”

Hmm… so here’s the problem:

(Is this a reverse alphabetical ordering by the Pipe I wonder? One to explore…)

Now most of the time, the order of the arguments in a URI doesn’t matter, although for some systems (as seems to be the case for the crappy Library OPAC Stephen was building a pipe for) it does matter.

So here was my suggested fix: use the String Builder to build the URI:

If necessary, you can get the argument values from a user input in the normal way:

I’m not sure if this sorted Stephen’s problem, but it’s another trick to remember… :-)
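The same trick carries over to code: if the order of the arguments matters, build the query string yourself from an ordered list of pairs rather than trusting a helper that may reorder them. A minimal Python sketch follows; the function name and the example URL are made up for illustration.

```python
from urllib.parse import quote


def build_url(base, ordered_params):
    """Join name/value pairs into a query string in exactly the order given."""
    pairs = []
    for name, value in ordered_params:
        # Percent-encode each part so spaces and symbols survive the trip.
        pairs.append(quote(str(name), safe="") + "=" + quote(str(value), safe=""))
    return base + "?" + "&".join(pairs)
```

Calling `build_url("http://example.org/opac", [("z", "last-first"), ("a", "1 2")])` keeps `z` ahead of `a`, giving `http://example.org/opac?z=last-first&a=1%202`.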