Supplementing Yahoo Pipes with Javascript Functions

Something is happening to me… No time to play myself, so once again I find myself reposting other people’s hacks… but hey, I guess that means loads more to play with when I do have some time to tinker again…

Anyway, here’s a quick round-up of a couple of tricks from @hapdaniel showing how to augment a Yahoo pipe with Javascript functionality.

The first one involves AppJet. This has been languishing on my “to play with” list for so long now, it’s nice to see a one line demo showing how to hook it into pipes (even if it’s potentially a bit ropey in that respect, as I’ll come on to…).

Anyway, for those of you who’ve never heard of it, AppJet is essentially a hosted Javascript app offering: write your Javascript app on AppJet, and run it from there as an app or call it via a web service. (For the geeks, it does persistent storage, Comet and cron too, as well as support for hosting apps yourself.)

So here’s a simple example of how to use AppJet in conjunction with Yahoo Pipes. The issue at hand was how to number each item in a feed within Pipes (there is no support for e.g. an item.yt:index element that carries the index of an item within a pipe, though it is something the Pipes folks are looking at). @hapdaniel’s solution is a simple single-function AppJet app that numbers each item in a feed passed to the app as JSON (I’ve cloned it to my account).
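To give a flavour of what such a numbering function has to do, here is a minimal sketch in plain JavaScript. It is not the AppJet app itself and uses no AppJet APIs: the function name, the assumed feed shape (an object with an items array) and the index property name are all my own illustrative assumptions.

```javascript
// Illustrative sketch only: NOT the original AppJet code.
// Assumes the feed arrives as a JSON string shaped like { "items": [ ... ] };
// the "index" property name and 1-based numbering are also assumptions.
function serializeFeedItems(feedJson) {
  const feed = JSON.parse(feedJson);
  const items = Array.isArray(feed.items) ? feed.items : [];

  // Copy each item and attach its position in the feed.
  const numbered = items.map(function (item, i) {
    return Object.assign({}, item, { index: i + 1 });
  });

  // Return the annotated feed as JSON again, leaving other fields untouched.
  return JSON.stringify(Object.assign({}, feed, { items: numbered }));
}

// Hypothetical two-item feed to try it out.
const sampleFeed = JSON.stringify({
  items: [{ title: "First post" }, { title: "Second post" }]
});
console.log(serializeFeedItems(sampleFeed));
```

The sketch copies items rather than mutating them, so the parsed input is left alone; a hosted version would also need to read the feed from the incoming request and write the response, which is left out here.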

The utility function can now be called (using my cloned version) at http://pipes-item-numbering.appjet.net/serialize_feed_items by passing it a JSON-encoded version of a feed, and getting a JSON response:

(One thing AppJet appears to do is allow you to import other people’s published code? So for example, I think lib-json is community sourced (and published at http://lib-json.appjet.net/) rather than being a native AppJet library?)

As required, AppJet annotates the feed with item index numbers:

There is a downside, though, and that is that time-outs seem to occur for all but the shortest feeds…

Having seen this solution, I wondered whether it would be possible to do a similar thing via Yahoo’s YQL Execute, which allows you to add “arbitrary developer code [server-side JavaScript with E4X (native XML) support] that the YQL data engine runs during the processing of a YQL statement.”

Not having any free time at the mo, I left that one pending, but @hapdaniel was thinking along similar lines:

And here is that example (the pipe contents are pulled in via RSS):

The actual YQL query has the following pattern:

use "http://pdaniel.co.uk/yql/feedrank.xml"; select * from feedrank where url='http://RSS_URL'
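If you want to fill in that pattern programmatically rather than by hand, something like the following sketch would do it. The table URL is the one quoted above; the function name and the example feed URL are hypothetical, and percent-encoding single quotes is simply my own cautious way of keeping the URL inside the quoted string.

```javascript
// Table URL as quoted in the post; everything else here is illustrative.
const FEEDRANK_TABLE = "http://pdaniel.co.uk/yql/feedrank.xml";

// Build the YQL statement for a given RSS feed URL.
function buildFeedrankQuery(rssUrl) {
  // Percent-encode single quotes so the URL cannot end the quoted literal early.
  const safeUrl = rssUrl.replace(/'/g, "%27");
  return 'use "' + FEEDRANK_TABLE + '"; ' +
    "select * from feedrank where url='" + safeUrl + "'";
}

// Hypothetical feed URL, used purely as an example.
const query = buildFeedrankQuery("http://example.com/feed.rss");
console.log(query);

// The statement has to be URL-encoded before it can travel as a query-string value.
console.log(encodeURIComponent(query));
```

The first line printed is the statement in exactly the pattern shown above, with the example feed URL in place of http://RSS_URL.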

That pattern means we can easily construct a pipe that takes in a feed and then runs it through YQL Execute to number each item:

The feed returned from the YQL Execute query is duly annotated with item index numbers:

So what’s the voodoo magic execute code that achieves this? http://pdaniel.co.uk/yql/feedrank.xml is simply(?!;-):

(If anyone wants to pick through the line y.query("select * from xml where url=@url",{url:url}).results; and describe just what it does and how it does it, along with a couple of other examples maybe, feel free to post an explanatory comment or two…:-)

If you have created your own pipe and want to index the items in it, the easiest way is probably to:
1) create your pipe;
2) create a new pipe that contains the YQL Execute indexer and uses the URL of the RSS output of your pipe.
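For step 2 you need the URL of your pipe’s RSS output. As a sketch, and assuming the usual pipe.run address with an _id parameter and _render=rss (check the “Get as RSS” link on your own pipe’s page to be sure), it can be built like this; the function name and the pipe ID are made up for illustration.

```javascript
// Build the RSS output URL for a pipe.
// Assumption: pipes are served from pipe.run with _id and _render parameters.
function pipeRssUrl(pipeId) {
  return "http://pipes.yahoo.com/pipes/pipe.run?_id=" +
    encodeURIComponent(pipeId) + "&_render=rss";
}

// "YOUR_PIPE_ID" is a placeholder, not a real pipe.
const rssUrl = pipeRssUrl("YOUR_PIPE_ID");
console.log(rssUrl);
```

The resulting URL is what you would hand to the indexer pipe as its feed URL.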

PS Just by the by, YQL Execute makes it easy to write CSS selectors which can be really handy when scraping HTML… ;-)

PPS During an exchange on Twitter with @hapdaniel, we were joined by Pipes developer @pjdonelly, so I asked if it would be possible to get the YUI datatable library as a preview UI component on a Pipe’s public page where CSV output was available; and it seems like they’ll look into it.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

4 thoughts on “Supplementing Yahoo Pipes with Javascript Functions”

  1. Much of creating the data table was a matter of copying code such as part of the big green blob at http://nagiworld.net/2009/05/yql-execute-screencast-tutorial
    What I found to be the hard part was working out what the for each loop variables should be. Eventually I was forced to do the sensible thing and add a y.log("search is " + search) to the original code. That put a dump of the output from the select query into the diagnostics section of the YQL console output. Comparing that dump and the original variables used with a similar dump for my own table quickly enabled me to work out the correct variables.

    Thanks for the mentions. I’m now going out to buy a bigger hat :)

  2. Hi Tony … you were always the biggest fan of yahoo pipes I knew. Check this out … a long, but nicely presented take on “web hooks” as a natural extension of pipes in a web world.

    Great ideas, and a couple of cool sites I hadn’t seen before that help you connect services from different apps.
