
Reflections on the Closure of Yahoo Pipes

Last night I popped up a quick post relaying the announcement of the impending closure of Yahoo Pipes, recalling my first post on Yahoo Pipes, and rediscovering a manifesto I put together around the rallying cry We Ignore RSS at OUr Peril.

When Yahoo Pipes first came out, the web was full of the spirit of Web2.0 mashup goodness. At the time, the big web companies were opening up all manner of “open” web APIs – Amazon, Google, and perhaps more than any other, Yahoo – with Google and Yahoo particularly seeming to invest in developer evangelism events.

One of the reasons I became so evangelical about Yahoo Pipes, particularly in working with library communities, was that it enabled non-coders to engage in programming the web. And more than that. It allowed non-coders to use web-based programming tools to build out additional functionality for the web.

Looking back, it seems to me now that the whole mashup thing arose from the idea of the web as a creative medium, and one which the core developers (the coders) were keen to make accessible to a wider community. Folk wanted to share, and folk wanted other folk to build on their services in interoperation with other services. It was an optimistic time for the tinkerers among us.

The web companies produced APIs that did useful things, used simple, standard representations (RSS, and then Atom, as simple protocols for communicating lists of content items, for example, then, later, JSON as a friendlier, more lightweight alternative to scary XML, which also reduced the need for casual web tinkerers to try to make sense of XMLHttpRequests), and seemed happy enough to support interoperability.

When Yahoo Pipes came online (and for a brief time, Microsoft’s Popfly mashup tool), the graphical drag-and-drop, wire-it-together, flow-based programming model allowed non-coders to start developing, publishing, sharing and building on top of each other’s real web applications. You could inspect the internals of other people’s pipes, and clone those pipes so you could extend or modify them yourself, and put pipes inside pipes, fostering reuse and the notion of building stuff on top of and out of stuff you’ve learned how to do before.

And it all seemed so hopeful…

And then the web companies started locking things down a bit more. First my Amazon Pipes started to break, and then my Twitter Pipes, as authentication was introduced to access the feeds published by those companies. It started to seem as if those companies didn’t want their content flows rewired, reflowed and repurposed. And so Yahoo Pipes started to become less useful to me. And a little bit of the spirit of a web as a place where the web companies allowed whosoever, coders and non-coders alike, to build a better web using their stuff started to die.

And perhaps with it, the openness and engagement of the core web developers – the coders – started to close off a little too. True, there are repeated initiatives around learning to code, but whilst I’ve fallen into that camp myself over the last few years – especially the last two, having discovered IPython notebooks and the notion of coding, one line at a time – I think we are complicit in closing off opportunities that help people build out the web using bits of the web.

Perhaps the web is too complicated now. Perhaps the vested interests are too vested. Perhaps the barrage of content, and the peck, peck, click, click, Like, addiction-feeding, pigeon-rat, behaviourist-conditioning, screen-based crack-like business model, has blinded us to the idea that we can use the web to build our own useful tools.

(I also posted yesterday about a planning application map I helped my local hyperlocal – OnTheWight – publish. If the Isle of Wight Council published current applications as an RSS feed, it would have been trivial to use Yahoo Pipes to construct the map. It would have been a five minute hack. As it is, the process we used required building a scraper (in code) and hacking some code to generate the map.)

There still are tools out there that help you build stuff on the web for the web. CartoDB makes map creation relatively straightforward, and things like Mozilla Popcorn allow you to build your own apps around content containers (I think? It’s been a long time since I looked at it).

Taking time out to reflect on this, it seems as if the web cos have become too inward looking. Rather than encouraging wider communities to engage in building out the web, the companies get to a size where their systems become ever more complex, yet have to maintain their own coherence, and a cell wall goes up to contain that activity, and authentication starts to be used to limit access further.

At the same time as the data flows become more controlled, the only way to access them comes through code. Non-coders are disenfranchised, and the lightweight, open protocols that non-coding programming tools can work most effectively with become harder to justify.

When Pipes first appeared, it seemed as if the geeks were interested in building tools that increased opportunities to engage in programming the web, using the web.

And now we have Facebook. Tap, tap, peck, peck, click, click, Like. Ooh shiny… Tap, tap, peck, peck…

Yahoo Pipes Retires…

And so it seems that Yahoo Pipes, a tool I first noted here (February 08, 2007), something I created lots of recipes for (see also on the original, archived OUseful site), ran many a workshop around (and even started exploring a simple recipe book around), is to be retired (end of life announcement)…


It’s not completely unexpected – I stopped using Pipes much at all several years ago, as sites that started making content available via RSS and Atom feeds then started locking it down behind simple authentication, and then OAuth…

I guess I also started to realise that the world I once imagined, as for example in my feed manifesto, We Ignore RSS at OUr Peril, wasn’t going to play out like that…

However, if you still believe in pipe dreams, all is not lost… Several years ago, Greg Gaughan took up the challenge of producing a Python library that could take a Yahoo Pipe JSON definition file and execute the pipe. Looking at the pipe2py project on github just now, it seems the project is still being maintained, so if you’re wondering what to do with your pipes, that may be worth a look…

By the by, the last time I thought Pipes might not be long for this world, I posted a couple of posts that explored how it might be possible to bulk export a set of pipe definitions as well as compiling and running your exported Yahoo Pipes.

Hmmm… thinks… it shouldn’t be too hard to get pipe2py running in a docker container, should it…?

PS I don’t think pipe2py has a graphical front end, but javascript toolkits like jsPlumb look like they may do much of the job. (It would be nice if the Yahoo Pipes team could release the Pipes UI code, of course…;-)

PPS if you need a simple one step feed re-router, there’s always IFTTT. If realtime feed/stream processing apps are more your thing, here are a couple of alternatives that I keep meaning to explore, but never seem to get round to… Node-RED, a node.js thing (from IBM?) for doing internet-of-things inspired stream processing (I did intend to play with it once, but I couldn’t even figure out how to stream the data I had in…); and Streamtools (about), from The New York Times R&D Lab, that I think does something similar?

When a Hack Goes Wrong… Google Spreadsheets and Yahoo Pipes

One of my most successful posts in terms of lifetime traffic numbers has been a recipe for scraping data from a Wikipedia page, pulling it into a Google spreadsheet, publishing it as CSV, pulling it into a Yahoo Pipe, geo-coding it, publishing it as a KML file, displaying the KML in Google maps and embedding the map in another page (which could in principle be the original Wikipedia page): Data Scraping Wikipedia with Google Spreadsheets.

A variant of this recipe in other words:


Running the hack now on a new source web page, we get the following data pulled into the spreadsheet:

pipe broken

And the following appearing in the pipe (I am trying to replace the first line with my own headers):

imported data

The CSV file appears to be misbehaving… Downloading the CSV data and looking at it in TextWrangler, a text editor, we start to see what’s wrong:

text editor

The text editor creates line numbers for things it sees as separate, well-formed rows in the CSV data. We see that the header, which should be a single row, is actually spread over four rows. In addition, the London data is split over two rows. The line for Greater Manchester behaves correctly: if you look at the line numbers, you can see line 7 overflows in the editor (the … in the line number count shows the CSV line (a separate data row) has overflowed the width of the editor and been wrapped round in the editor view).

If I tell the editor to stop “soft wrapping” each line of data in the CSV file, the editor displays each line of the CSV file on a single line in the editor:

text editor nowrap

So… where does this get us in fixing the pipe? Not too far. We can skip the first 5 lines of the file that we import into the pipe, and that gets around all the messed up line breaks at the top of the file, but we lose the row containing the data for London. In the short term, this is probably the pragmatic thing to do.
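
If you’d rather patch the data than wait for a fix, a few lines of Python can rejoin the broken rows instead. This is just a minimal sketch: it assumes you can count how many columns a well formed row should have (by eye, in the text editor), and broken.csv is a stand-in name for the downloaded CSV file.

import csv

def repair_csv(fname, expected):
    # Re-join physical lines until each logical row parses to the
    # expected number of fields, then start on the next row
    fixed, buf = [], ''
    for line in open(fname):
        buf += line.rstrip('\r\n')
        if not buf:
            continue
        row = list(csv.reader([buf]))[0]
        if len(row) >= expected:
            fixed.append(row)
            buf = ''
    return fixed

# print the repaired rows to check them
for row in repair_csv('broken.csv', 10):  # 10 = however many columns a good row has
    print row

The trick is simply to keep appending physical lines to a buffer until the CSV parser sees a row with the expected field count, which should recover both the four line header and the split London row.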

Next up, we might look to the Wikipedia file and see how the elements that appear to be breaking the CSV file might be fixed to unbreak them. Finally, we could go to the Google Spreadsheets forums and complain about the pile of crap broken CSV generation that the Googlers appear to have implemented…

PS MartinH suggests a workaround in the comments, wrapping a QUERY round the import and renaming the columns…

Just in Case – Saving Your Yahoo Pipes…

Yahoo is laying off again, so just in case, if you’re a user of Yahoo Pipes, it may be worth exporting the “source code” of your pipes or any pipes that you make frequent use of in case the Yahoo Pipes service gets cut.

Why? Well, a little known fact about Yahoo Pipes is that you can get hold of a JSON representation of a pipe that describes how the pipe is constructed…

…and some time ago, Greg Gaughan started working on a script that allows you to “compile” these descriptions of your Yahoo Pipes into Python programming code that can be run as a standalone programme on your own server: Pipe2Py. (Greg also did a demo that allowed Pipes to be “migrated” to a version of Pipe2Py running on Google App Engine.)

From a quick skim over the Pipes service, it seems you can get hold of a list of published pipes for a user easily enough, which means we can get a quick dump of the “source code” of all the published pipes for a given user (and then maybe compile them to Python using pipe2py so we can essentially keep the functionality running…). Here’s a first pass at a bulk exporter: Yahoo Pipes Exporter (published pipes by user).
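
Something along these lines is what I have in mind – a minimal Python sketch, with the caveat that the pipe.info endpoint is my best guess at the call pipe2py uses to grab a pipe’s JSON definition, and the ID list is whatever you pull out of the published pipes listing:

import urllib2

# IDs of the pipes to save - e.g. scraped from a user's published pipes page
PIPE_IDS = ['404411a8d22104920f3fc1f428f33642']

for pid in PIPE_IDS:
    # pipe.info appears to return the JSON definition of a pipe
    url = 'http://pipes.yahoo.com/pipes/pipe.info?_out=json&_id=' + pid
    defn = urllib2.urlopen(url).read()
    f = open('pipe_%s.json' % pid, 'w')
    f.write(defn)
    f.close()
    print 'Saved pipe', pid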

To get a full list of pipes by user, I think you need to be logged in as that user?

See also: Yahoo Pipes Code Generator (Python): Pipe2Py

PS do I need to start worrying about flickr too?!

Library Catalogue SRU Queries via YQL and Yahoo Pipes

I got a question from @liwazi last week wondering why a SRU request to the Cambridge Library catalogue wasn’t being handled properly in Yahoo pipes… I think it’s because the Yahoo Pipes XML parser is sometimes like that!

Anyway, here was my fix – to use YQL as a proxy, based around a SRU request URL of the form:

Here’s the form of YQL query (try it in YQL developer console):

select * from xml where url=''

You can find a copy of the pipe here: SRU demo pipe

Note that as well as accessing the data via the pipe, you can also pull the results of a search into a web page directly from YQL as a JSON feed:
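
For example, here’s a minimal Python sketch of that sort of request; the SRU URL is a placeholder (substitute your own catalogue query), and I’m assuming the standard YQL public endpoint, which takes the query in a q parameter and returns JSON when format=json is set:

import json
import urllib
import urllib2

# hypothetical SRU request URL - swap in the real catalogue query
sru_url = 'http://example.ac.uk/sru?operation=searchRetrieve&query=learning+perl'

yql = "select * from xml where url='%s'" % sru_url
params = urllib.urlencode({'q': yql, 'format': 'json'})
url = 'http://query.yahooapis.com/v1/public/yql?' + params

results = json.load(urllib2.urlopen(url))
print results['query']['results']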

If you’re really keen, you might also define a YQL data table that would allow you to make a request of the form “select * from camsru where q=’learning perl'”, and then set up a short alias for the query so you could run it using a construction based on a YQL query of the form select * from camsru where q=@q

PS tomorrow is Mashed Library day at Lincoln – Pancakes and Mash. Be sure to follow #mashlib and chip in if you can ;-)

On flickr, delicious and Yahoo Pipes…

According to Slideshare, it was four years ago that I ran a series of social bookmarking workshops in the OU:

At the time, I was a fan of delicious (still am), because it did what it did and it did it well enough. As part of the workshop, I tried to encourage folk to use delicious, but I also ran an “OU unofficial” version of Scuttle for folk to use if they preferred using a locally hosted social bookmarking app. (A few did, at first, but the folk who got value from social bookmarking tended to then move on to delicious, so I shut the hosted version of Scuttle down.)

With the future of delicious uncertain, I wonder whether Scuttle has continued in development, and whether it’s worth setting up again?

As to the continuity of flickr – I guess I need to have a think about what to do if flickr goes down. As a paid up premium user, I have thousands of images on flickr, many of them screenshots which are served to this blog. If flickr were to die, I’d need to get the images moved elsewhere, and the links updated in this blog. I’m not sure how to do this – anyone got any good ideas, given this is a WordPress hosted blog (and the fact I don’t want to have to pay for image storage on WordPress – unless Automattic buy flickr???)

And then there’s Yahoo Pipes. As far as I know, it hasn’t been mentioned in any of the recent reports around Yahoo’s portfolio reorganisation, but who knows how safe it is? I’ve posted before wondering about what happens if yahoo pipes dies?, and thanks to Greg Gaughan there’s now an exporter and partial runner for pipes using Pipe2Py and the Google Apps Pipes Engine. All that’s needed now is for someone to come up with a UI that generates the Pipes JSON export format… There are a few possible candidates out there, but nothing that hits the sweet spot yet, so if you fancy having a go, let me know (I probably won’t be able to help with the code, but I can try out the UI and help test any outputs within Pipe2Py…)

Backup and Run Yahoo Pipes Pipework on Google App Engine

Wouldn’t it be handy if you could use Yahoo Pipes as a code-free rapid prototyping environment, then export the code and run it on your own server, or elsewhere in the cloud? Well now you can, using Greg Gaughan’s pipe2py and the Google App Engine “Pipes Engine” app.

As many readers of this blog will know, I’m an advocate of using the visual editor/drag and drop feed-oriented programming application that is Yahoo Pipes. Some time ago, I asked the question What Happens if Yahoo Pipes Dies?, partly in response to concerns raised at many of the Pipes workshops I’ve delivered about the sustainability, as well as the reliability, of the Yahoo Pipes platform.

A major issue was that the “programmes” developed in the Pipes environment could only run in that environment. As I learned from pipes guru @hapdaniel, however, it is possible to export a JSON representation of a pipe and so at least grab some sort of copy of a pipes programme. This led to me doodling some ideas around a Yahoo Pipes Documentation Project, which would let you essentially export a functional specification of a pipe (I think the code appears to have rotted or otherwise broken on this? :-()

This in turn led naturally to Starting to Think About a Yahoo Pipes Code Generator, whereby we could take a description of a pipe and generate a code equivalent version from it.

Greg Gaughan took up the challenge with Pipe2Py (described here) to produce a pipes compiler capable of generating and running Python equivalents of a Yahoo pipe (not all Pipes blocks are implemented yet, but it works well for simple pipes).

And now Greg has gone a step further, by hosting pipe2py on Google App engine so you can make a working Python backup of a pipe in that environment, and run it: Running Yahoo! Pipes on Google App Engine.

As with pipe2py, it won’t work for every Yahoo pipe (yet!), but you should be okay with simpler pipes. (Support for more blocks is being added all the time, and implementations of currently supported blocks also get an upgrade if, as and when any issues are found with them. If you have a problem, or suggestion for a missing block, add a comment on Greg’s blog;-)

(Looking back over my old related posts, deploying to Google Apps also seems to be supported by Zoho Creator.)

Quite by chance, @paulgeraghty tweeted a link to an old post by @progrium on the topic of “POSS == Public Open Source Services: … or User Powered Self-sustaining Cloud-based Services of Open Source Software”:

How many useful bits of cool plumbing are made and abandoned on the web because people realize there’s no true business case for it? And by business case, I mean make sense to be able to turn a profit or at least enough to pay the people involved. Even as a lifestyle business, it still has to pay for at least one person … which is a lot! But forget abandoned … how much cool tech isn’t even attempted because there is an assumption that in order for it to survive and be worth the effort, there has to be a business? Somebody has to pay for hosting! Alternatively, what if people built cool stuff because it’s just cool? Or useful (but not useful enough to get people to pay — see Twitter)?

Well this is common in open source. A community driven by passion and wanting to build cool/useful stuff. A lot of great things have come from open source. But open source is just that … source. It’s not run. You have to run it. How do you get the equivalent of open source for services? This is a question I’ve been trying to figure out for years. But it’s all coming together now …

Enter POSS

POSS is an extension of open source. You start with some software that provides a service (we’ll just say web service … so it can be a web app or a web API, whatever — it runs “in the cloud”). The code is open source. Anybody can fix bugs or extend it. But there is also a single canonical instance of this source, running as a service in the cloud. Hence the final S … but it’s a public service. Made for public benefit. That’s it. Not profit. Just “to be useful.” Like most open source.

Hmmm….. ;-)

Yahoo Pipes Code Generator (Python): Pipe2Py

Wouldn’t it be nice if you could use Yahoo Pipes as a visual editor for generating your own feed powered applications running on your own server? Now you can…

One of the concerns occasionally raised around Yahoo Pipes (other than the stability and responsiveness issues) relates to the dependence on the Yahoo Pipes platform that results from creating a pipe. Where a pipe is used to construct an information feed that may get published on an “official” web page, users need to feel that content will always be fed through the pipe, not just when Pipes feels like it. (Actually, I think the Pipes backend is reasonably stable, it’s just the front end editor/GUI that has its moments…)

Earlier this year, I started to have a ponder around the idea of a Yahoo Pipes Documentation Project (the code appears to have rotted unfortunately; I think I need to put a proper JSON parser in place :-(), which would at least display a textual description of a pipe based on the JSON representation of it that you can access via the Pipes environment. Around the same time, I floated an idea for a code generator that would take the JSON description of a pipe and generate Python or PHP code capable of achieving a similar function to the pipe.

Greg Gaughan picked up the challenge and came up with a code generator for doing just that, written in Python. (I didn’t blog it at the time because I wanted to help Greg extend the code to cover more modules, but I never delivered on my part of the bargain :-()

Anyway – the code is in the pipe2py repository on github, and it works as follows. Install the universal feed parser (sudo easy_install feedparser) and simplejson (sudo easy_install simplejson), then download Greg’s code and declare the path to it, maybe something like:
export PYTHONPATH=$PYTHONPATH:/path/to/pipe2py

Given the ID for a pipe on Yahoo pipes, generate a Python compiled version of it:
python compile.py -p PIPEID

This generates a file containing a function pipe_PIPEID() which returns a JSON object equivalent of the output of the corresponding Yahoo pipe, the major difference being that it’s the locally compiled pipe code that’s running, not the Yahoo pipe…

So for example, for the following simple pipe, which just grabs the blog feed and passes it straight through:

Simple pipe for compilation

we generate a Python version of the pipe as follows:
python compile.py -p 404411a8d22104920f3fc1f428f33642

This generates the following code:

from pipe2py import Context
from pipe2py.modules import *

def pipe_404411a8d22104920f3fc1f428f33642(context, _INPUT, conf=None, **kwargs):
    if conf is None:
        conf = {}

    forever = pipeforever.pipe_forever(context, None, conf=None)

    sw_502 = pipefetch.pipe_fetch(context, forever, conf={u'URL': {u'type': u'url', u'value': u''}})
    _OUTPUT = pipeoutput.pipe_output(context, sw_502, conf={})
    return _OUTPUT

We can then run this code as part of our own program. For example, grab the feed items and print out the feed titles:

context = Context()
p = pipe_404411a8d22104920f3fc1f428f33642(context, None)
for i in p:
  print i['title']

running a compiled pipe on the desktop

Not all the Yahoo Pipes blocks are implemented (if you want to volunteer code, I’m sure Greg would be happy to accept it!;-), but for simple pipes, it works a dream…

So for example, here’s a couple of feed mergers and then a sort on the title…

Another pipe compilation demo

And a corresponding compilation, along with a small amount of code to display the titles of each post, and the author:

from pipe2py import Context
from pipe2py.modules import *

def pipe_2e4ef263902607f3eec61ed440002a3f(context, _INPUT, conf=None, **kwargs):
    if conf is None:
        conf = {}

    forever = pipeforever.pipe_forever(context, None, conf=None)

    sw_550 = pipefetch.pipe_fetch(context, forever, conf={u'URL': [{u'type': u'url', u'value': u''}, {u'type': u'url', u'value': u''}]})
    sw_572 = pipefetch.pipe_fetch(context, forever, conf={u'URL': {u'type': u'url', u'value': u''}})
    sw_580 = pipeunion.pipe_union(context, sw_550, conf={}, _OTHER = sw_572)
    sw_565 = pipesort.pipe_sort(context, sw_580, conf={u'KEY': [{u'field': {u'type': u'text', u'value': u'title'}, u'dir': {u'type': u'text', u'value': u'ASC'}}]})
    _OUTPUT = pipeoutput.pipe_output(context, sw_565, conf={})
    return _OUTPUT

context = Context()
p = pipe_2e4ef263902607f3eec61ed440002a3f(context, None)
for i in p:
        print i['title'], ' by ', i['author']

And the result?
MCMT013:pipes ajh59$ python
Build an app to search Delicious using your voice with the Android App Inventor by Liam Green-Hughes
Digging Deeper into the Structure of My Twitter Friends Network: Librarian Spotting by Tony Hirst
Everyday I write the book by mweller

So there we have it… Thanks to Greg, the first pass at a Yahoo Pipes to Python compiler…

PS Note to self… I noticed that the ‘truncate’ module isn’t supported, so as it’s a relatively trivial function, maybe I should see if I can write a compiler block to implement it…
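
Something like the following is what I have in mind – an unofficial sketch that mimics the calling convention visible in the generated code above; the conf structure is a guess at how the block might be configured:

# pipetruncate.py - unofficial sketch of a 'truncate' block for pipe2py
def pipe_truncate(context, _INPUT, conf=None, **kwargs):
    if conf is None:
        conf = {}
    # assume the block is configured with a single 'count' value
    count = int(conf['count']['value'])
    for i, item in enumerate(_INPUT):
        if i >= count:
            break
        yield item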

PPS Greg has also started exploring how to export a pipe so that it can be run on Google App Engine: Running Yahoo! Pipes on Google App Engine

Previewing the Contents of a JSON Feed in Yahoo Pipes

This post builds on the previous one (Grabbing the Output of a Yahoo Pipe into a Web Page) by describing a strategy that can help you explore the structure of a JSON feed that you may be pulling in to a web page so that you can identify how to address the separate elements contained within it.

This strategy is not so much for developers as for folk who don’t really get coding, and don’t want to install developer tools into their browser.

As the “Grabbing the Output of a Yahoo Pipe into a Web Page” post described, it’s easy enough to use JQuery to get a JSON feed into a web page, but what happens then? How do you work out how to “address” the various parts of the Javascript object so that you can get the information or data you want out of it?

Here’s part of a typical JSON feed out of a Yahoo pipe:

{"count":17,
 "value":{
  "title":"Proxy",
  "description":"Pipes Output",
  "link":"http:\/\/...",
  "pubDate":"Mon, 19 Jul 2010 05:15:55 -0700",
  "generator":"http:\/\/...",
  "callback":"",
  "items":[
   {"link":"http:\/\/...\/~r\/ouseful\/~3\/9WBAQqRtH58\/",
    "y:id":{"value":"http:\/\/...\/?p=3800","permalink":"false"},
    "feedburner:origLink":"http:\/\/...\/2010\/07\/19\/grabbing-the-output-of-a-yahoo-pipe-into-a-web-page\/",
    "slash:comments":"0",
    "wfw:commentRss":"http:\/\/...\/2010\/07\/19\/grabbing-the-output-of-a-yahoo-pipe-into-a-web-page\/feed\/",
    "description":"One of the things I tend to take for granted about using Yahoo Pipes is how to actually grab the output of a Yahoo Pipe into a webpage. Here's a simple recipe using the JQuery Javascript framework to do just that. The example demonstrates how to add a bit of code to a web page […]",
    "comments":"http:\/\/...\/2010\/07\/19\/grabbing-the-output-of-a-yahoo-pipe-into-a-web-page\/#comments",
    "dc:creator":"Tony Hirst",
    "y:title":"Grabbing the Output of a Yahoo Pipe into a Web Page",
    "content:encoded":"…"},
   …]}}

However, we can use the Yahoo Pipes environment to help us understand the structure and make-up of this feed. Create a new pipe, and just add a “Fetch Data” block to it. Paste the URL of the JSON feed into the block, and now you can preview the feed – the image below shows a preview of the JSON output from a simple RSS proxy pipe, that just takes in the URL of an RSS feed and then emits it as a JSON feed:

Yahoo pipes JSON browser

(Note that if you find yourself using the Yahoo Pipes V2 engine, you may have to wire the output of the Fetch Data block to the output block before the preview works. You shouldn’t need to save the pipe though…)
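
Alternatively, if you’re happy to run a couple of lines of Python, you can poke at the same structure from the desktop; the pipe.run URL below is a placeholder, so swap in your own pipe’s ID:

import json
import urllib2

# placeholder URL - use your own pipe's JSON output URL here
url = 'http://pipes.yahoo.com/pipes/pipe.run?_id=YOUR_PIPE_ID&_render=json'
data = json.load(urllib2.urlopen(url))

print data.keys()                       # top level: count, value, ...
print data['value'].keys()              # feed level: title, items, ...
print data['value']['items'][0].keys()  # fields available in the first item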

When you load the feed into a web page, if you assign the whole object to the variable data, then you will find the output of the pipe in the object data.value.

In the example shown above, the title of the feed as a whole will be in data.value.title. The separate feed items will be in the collection of data.value.items; data.value.items[0] gives the first item, data.value.items[1] the second, and so on up to data.value.items[data.value.items.length-1]. The title of the third feed item will be data.value.items[2].title and the description of the 10th feed item will be data.value.items[9].description.

This style of referencing the different components of the javascript object loaded into the page is known as the javascript object dot notation.

Here’s a preview of a council feed from OpenlyLocal:

Preview an openly local council feed

In this case, we start to address the data at data.council, find the population at data.council.population, index the wards using data.council.wards[i] and so on.

Feed Aggregation, Truncation and Post Labeling With Google Spreadsheets and Yahoo Pipes

Got another query via Twitter today for a Yahoo Pipe that is oft requested – something that will aggregate a number of feeds and prefix the title of each with a slug identifying the appropriate source blog.

So here’s one possible way of doing that.

Firstly, I’m going to create a helper pipe that will truncate a specified feed to a particular number of items and then annotate each item’s title with a slug of text that identifies the blog: (Advisory: Truncate and Prefix).

The next step is to build a “control panel”, a place where we list the feeds we want to aggregate, the number of items we want to truncate each feed to, and the slug text. I’m going to use a Google spreadsheet.

We can now create a second pipe (Advisory: Spreadsheet fed feed aggregator) that will pull in the list of feeds as a CSV file from the spreadsheet and, for each feed, grab the feed contents, then truncate and badge them as required using the helper pipe:

To keep things tidy, we can sort the posts so they appear in the traditional reverse chronological order.
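
For completeness, here’s a rough Python sketch of how the same recipe might look using feedparser, in case you ever want to run it outside the Pipes environment; the spreadsheet CSV URL and the column ordering (feed URL, item count, slug text) are assumptions standing in for the actual control panel layout:

import csv
import urllib2
import feedparser

# placeholder URL for the published CSV version of the spreadsheet
CSV_URL = 'http://spreadsheets.google.com/pub?key=YOUR_KEY&output=csv'

rows = csv.reader(urllib2.urlopen(CSV_URL))
next(rows)  # skip the header row

items = []
for feed_url, count, slug in rows:
    feed = feedparser.parse(feed_url)
    # truncate each feed to the requested number of items
    # and prefix each title with the source blog's slug
    for entry in feed.entries[:int(count)]:
        entry.title = '%s: %s' % (slug, entry.title)
        items.append(entry)

# traditional reverse chronological order
# (assumes each feed provides a parseable publication date)
items.sort(key=lambda e: e.published_parsed, reverse=True)

for entry in items:
    print entry.title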

PS Hmmm… it might be more useful to be able to limit the feed items by some other criterion, such as all posts in the last two weeks? If so, this sort of helper pipe would do the trick (Advisory: Recent Posts and Prefix):