Search results for: feed

From the Archive – Browser OS and the Facebook Feed Mixing Desk

Skimming through my blog history looking for examples from my past of search thinkses, I came across several things I’d forgotten about; so as part of what may become an occasional series of posts that trip back in time, here are a couple of things I came across, one that imagined the future, another that is maybe revealing from the past.

First up, from my original blog, Micro-Info, a post about Browser OS – A Single Application Operating System.

I’m not sure how this would go down with my colleagues who still believe that everyone has a desktop or laptop computer, rather than a phone, tablet or Chromebook-style computer (what’s the Microsoft equivalent?), as their primary computing device…

What do I really need from my operating system, if I’m doing everything through a browser?

Part of the point behind BOS [Browser Operating System] is that we expect to be online most of the time, ideally with a persistent connection. Once I have the BOS customised for my hardware set-up, I don’t really need a thousand and one drivers available, just in case I add a periheral [sic], if I know I’m going to be able to install the appropriate driver from the web.

What I see for BOS, therefore, is a simple, if hefty, installation profiling client that looks at my system, works out what’s there, gets the drivers I need, and bundles them for me with a single application – my heavyweight browser – in a customised BOS installer.

And that’s what I install.

Just one application – the browser. Only the drivers I need. And only the supporting functions I need to get the browser to work on my particular system.

Okay – so I thought that I’d still be plugging things into the computer and that it would require drivers for them… But the browser-centricity…? Hmmm… (for those of you who don’t particularly follow operating systems, see things like Chrome_OS.)

So, the second thing that caught my eye: the Facebook feed mixing desk, this time captured by the archive of the original blog:

Move the sliders and tune the content of your Facebook feed. What’s revealing about this is that the user was given some control over the ranking factors for posts that appear in the feed, along with an indication of what those ranking factors were.

So for folk who today don’t understand that the content they see is tuned (forgive the pun!) by Facebook algorithms, this provides a visual metaphor for what’s going on and who has the control. Because you can bet that: a) there are many more ranking factors now; and b) it’s up to Facebook how the faders are set. And it also hints at the oft unconsidered point: c) whose ear are the faders tuning the mix to?

(By the by, see the ad? 1 minute response time?!)

Personal Health Calendar Feeds and a Social Care Annunciator?

Over the last few weeks and months I’ve started pondering all sorts of health and care related stuff that may help when trying to support family members a couple of hundred miles away. One of the things we picked up on was a “friendly” digital clock display (often sold as a “dementia clock” or “memory loss calendar”), with a clear screen, and easy to read date and time.

The clock supports a variety of daily reminders (“Take your pills…”) and can also be programmed to display images or videos at set dates and times (“Doctor’s today”).

One of the things this reminded me of was the parliamentary annunciators, that detail the current activity in the House of Commons and House of Lords, and that can be found all over the parliamentary estate.

Which got me thinking:

  • what if I could send a short text message or reminder to the screen via SMS?
  • what if I could subscribe to a calendar feed from the device that could be interpreted to generate a range of alerts leading up to an event (“Doctor’s tomorrow morning”, “Hospital this afternoon at 2pm”)?

(Lots of other ideas came to mind too, but the primary goal is to keep the device as simple as possible and the display as clear as possible, which includes being able to read it from a distance.)
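As a rough illustration of the calendar feed idea, here is a minimal sketch of how an event might be turned into the sort of alert text mentioned above. It assumes the feed has already been parsed into a title and a start time; the function names (`part_of_day`, `clock_time`, `alert_text`) and the morning/afternoon/evening boundaries are hypothetical choices of mine, not features of any existing clock.

```python
from datetime import datetime
from typing import Optional


def part_of_day(when: datetime) -> str:
    """Name the part of the day (the boundaries are arbitrary assumptions)."""
    if when.hour < 12:
        return "morning"
    if when.hour < 18:
        return "afternoon"
    return "evening"


def clock_time(when: datetime) -> str:
    """Format a time as e.g. '2pm' or '9:30am'."""
    hour = when.hour % 12 or 12
    suffix = "am" if when.hour < 12 else "pm"
    minutes = f":{when.minute:02d}" if when.minute else ""
    return f"{hour}{minutes}{suffix}"


def alert_text(title: str, start: datetime, now: datetime) -> Optional[str]:
    """Return a short reminder for the display, or None if no alert is due."""
    days_away = (start.date() - now.date()).days
    if days_away == 1:
        return f"{title} tomorrow {part_of_day(start)}"
    if days_away == 0 and start > now:
        return f"{title} this {part_of_day(start)} at {clock_time(start)}"
    return None
```

Only "tomorrow" and "later today" alerts are handled here; a fuller version would presumably want a configurable schedule of lead times.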

The calendar feed idea also sparked a far more interesting idea – one of the issues of trying to support family members with ongoing health appointments is knowing when they are taking place, whether you need to go along to provide advocacy or support, whether hospital stays are being planned, and so on. Recent experience suggests that different bits of the NHS act independently of each other:

  • the GP doesn’t know when hospital surgery has been booked, let alone when pre-op assessments requiring a hospital visit are scheduled;
  • district nurses don’t know when hospital visits are planned;
  • different parts of the hospital don’t know when other parts of the hospital have visits planned,

and so on…

In short, it seems that the hospital doesn’t have a calendar associated with each patient.

As with “student first” initiatives in HE, I suspect “patient first” initiatives are more to do with tracking internal performance metrics and KPIs rather than initiatives formulated from the patient perspective, but a personal “health and social care calendar” could benefit a whole range of parties:

  • the patient, wanting to keep track of appointments;
  • health and social care agencies wanting to book appointments and follow up on appointments with other parts of the service;
  • family members looking to support the patient.

So I imagine a situation where a patient books a GP appointment, and the receptionist adds it to the patient’s personal calendar.

A hospital appointment is generated by a consultant and, along with the letter informing the patient of the date, the event is added to the patient’s calendar (possibly with an option to somehow acknowledge it, confirm it, cancel it?).

A patient asks the GP to add a family member to the calendar/calendar feed so they can also access it.

A range of privacy controls allow different parts of the health and social care system to request/make use of read access to a patient’s health and social care calendar.

The calendar keeps a memory of historical appointments as well as future ones. Options may be provided to say whether an appointment was attended, cancelled or rescheduled. Such information may be useful to a GP (“I see you had your appointment with the consultant last week…”) or consultant (“I see you have an appointment with your GP next week? It may be worth mentioning to them…”)

Hmmm…thinks… is this sort of thing being explored (or has it been in the past?), or maybe considered at an NHS Hack Day? Or is it the sort of thing I could put together as an NHS tech funding pitch?

PS Some of the features of the Amazon Show could also work well in the context of a health/care annunciator, but… the Amazon Show is too feature rich and could easily lead to feature creep and complexity in use; I’d have “privacy concerns” using the Amazon backend and always on Alexa/Echo mic.

More Observations on the ONS JSON Feeds – Returning Bulletin Text as Data

Whilst starting to sketch out some python functions for grabbing the JSON data feeds from the new ONS website, I also started wondering how I might be able to make use of them in a simple slackbot that could provide a crude conversational interface to some of the ONS stats.

(To this end, it would also be handy to see some ONS search logs to see what sort of things folk search – and how they phrase their searches…)

One of the ways of using the data is as the basis for some simple data2text scripts, that can report the outcomes of some simple canned analyses of the data (comparing the latest figures with those from the previous month, or a year ago, for example). But the ONS also produce commentary on various statistics via their statistical bulletins – and it seems that these, too, are available in JSON form simply by adding /data to the end of the URL path as before:


One thing to note is that whilst the HTML view of bulletins can include a name element to focus the page on a particular element, the name attribute switch doesn’t work to filter the JSON output to that element (though it would be easy enough to script a JSON handler to return that focus) so there’s no point adding it to the JSON feed URL.
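A minimal sketch of the URL handling, assuming the name element appears as a `#fragment` on the HTML page URL: drop the fragment and append `/data` to the path. The function name `json_feed_url` is hypothetical, and the `#summary` fragment in the usage note is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def json_feed_url(page_url: str) -> str:
    """Drop any #fragment and append /data to the path of a bulletin URL."""
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/") + "/data"
    # The final empty string discards the fragment, which the JSON view ignores.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
```

So a path such as `/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/bulletins/uklabourmarket/february2016#summary` would become the same path ending in `/february2016/data`.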

One other thing to note about the JSON feed is that it contains cross-linked elements for items such as charts and tables. If you look closely at the above screenshot, you’ll see it contains a reference to an ons-table.

sections: [
title: "Summary of latest labour market statistics",
markdown: "Table A shows the latest estimates, for October to December 2015, for employment, unemployment and economic inactivity. It shows how these estimates compare with the previous quarter (July to September 2015) and the previous year (October to December 2014). Comparing October to December 2015 with July to September 2015 provides the most robust short-term comparison. Making comparisons with earlier data at Section (ii) has more information. <ons-table path="cea716cc" /> Figure A shows a more detailed breakdown of the labour market for October to December 2015. <ons-image path="718d6bbc" />"

This resource is then described in detail elsewhere in the data feed linked by the same ID value:


tables: [
title: "Table A: Summary of UK labour market statistics for October to December 2015, seasonally adjusted",
filename: "cea716cc",
uri: "/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/bulletins/uklabourmarket/february2016/cea716cc"

Images are identified via the ons-image tag, charts via the ons-chart tag, and so on.
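To show how the cross-linking might be followed in code, here is a minimal sketch that pulls the `ons-*` tags out of a section's markdown and looks each one up by ID. It assumes the feed has been loaded into a Python dict, that resources are matched on their `filename` value as in the fragment above, and that images and charts live in top-level `images` and `charts` lists (only `tables` is shown in the fragment, so those two names are guesses). The function name `linked_resources` is hypothetical.

```python
import re

# Matches tags such as <ons-table path="cea716cc" /> in a section's markdown.
ONS_TAG = re.compile(r'<ons-(table|image|chart)\s+path="([^"]+)"\s*/>')

# Assumed mapping from tag type to the top-level list describing the resource.
# Only "tables" appears in the fragment above; the other two are guesses.
RESOURCE_LISTS = {"table": "tables", "image": "images", "chart": "charts"}


def linked_resources(bulletin: dict, markdown: str) -> list:
    """Return (tag type, path ID, matching resource or None) for each tag."""
    results = []
    for kind, path_id in ONS_TAG.findall(markdown):
        candidates = bulletin.get(RESOURCE_LISTS[kind], [])
        match = next((r for r in candidates if r.get("filename") == path_id), None)
        results.append((kind, path_id, match))
    return results
```

A resource that can't be found is returned as `None` rather than raising an error, which seems a safer default when the feed structure is only partly known.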

So now I’m thinking – maybe this is the place to start thinking about a simple conversational UI? Something that can handle simple references into different parts of a bulletin, and return the ONS text as the response?

The Rise of Transparent Data Journalism – The BuzzFeed Tennis Match Fixing Data Analysis Notebook

The news today was led in part by a story broken by the BBC and BuzzFeed News – The Tennis Racket – about match fixing in Grand Slam tennis tournaments. (The BBC contribution seems to have been done under the ever listenable File on Four: Tennis: Game, Set and Fix?)

One interesting feature of this story was that “BuzzFeed News began its investigation after devising an algorithm to analyse gambling on professional tennis matches over the past seven years”, backing up evidence from leaked documents with “an original analysis of the betting activity on 26,000 matches”. (See also: How BuzzFeed News Used Betting Data To Investigate Match-Fixing In Tennis, and an open access academic paper that inspired it: Rodenberg, R. & Feustel, E.D. (2014), Forensic Sports Analytics: Detecting and Predicting Match-Fixing in Tennis, The Journal of Prediction Markets, 8(1).)

Feature detecting algorithms such as this (where the feature is an unusual betting pattern) are likely to play an increasing role in the discovery of stories from data, step 2 in the model described in this recent Tow Center for Digital Journalism Guide to Automated Journalism.


See also: Notes on Robot Churnalism, Part I – Robot Writers

Another interesting aspect of the story behind the story was the way in which BuzzFeed News opened up the analysis they had applied to the data. You can find it described on Github – Methodology and Code: Detecting Match-Fixing Patterns In Tennis – along with the data and a Jupyter notebook that includes the code used to perform the analysis: Data and Analysis: Detecting Match-Fixing Patterns In Tennis.


You can even run the notebook to replicate the analysis yourself, either by downloading it and running it using your own Jupyter notebook server, or by using the online mybinder service: run the tennis analysis yourself on

(I’m not sure if the BuzzFeed or BBC folk tried to do any deeper analysis, for example poking into point summary data as captured by the Tennis Match Charting Project? See also this Tennis Visuals project that makes use of the MCP data. Tennis betting data is also collected here: If you’re into the idea of analysing tennis stats, this book is one way in: Analyzing Wimbledon: The Power Of Statistics.)

So what are these notebooks anyway? They’re magic, that’s what!:-)

The Jupyter project is an evolution of an earlier IPython (interactive Python) project that included a browser based notebook style interface for allowing users to write and execute code, as well as seeing the result of executing the code, a line at a time, all in the context of a “narrative” text document. The Jupyter project funding proposal describes it thus:

[T]he core problem we are trying to solve is the collaborative creation of reproducible computational narratives that can be used across a wide range of audiences and contexts.

[C]omputation in science is ultimately in service of a result that needs to be woven into the bigger narrative of the questions under study: that result will be part of a paper, will support or contest a theory, will advance our understanding of a domain. And those insights are communicated in papers, books and lectures: narratives of various formats.

The problem the Jupyter project tackles is precisely this intersection: creating tools to support in the best possible ways the computational workflow of scientific inquiry, and providing the environment to create the proper narrative around that central act of computation. We refer to this as Literate Computing, in contrast to Knuth’s concept of Literate Programming, where the emphasis is on narrating algorithms and programs. In a Literate Computing environment, the author weaves human language with live code and the results of the code, and it is the combination of all that produces a computational narrative.

At the heart of the entire Jupyter architecture lies the idea of interactive computing: humans executing small pieces of code in various programming languages, and immediately seeing the results of their computation. Interactive computing is central to data science because scientific problems benefit from an exploratory process where the results of each computation inform the next step and guide the formation of insights about the problem at hand. In this Interactive Computing focus area, we will create new tools and abstractions that improve the reproducibility of interactive computations and widen their usage in different contexts and audiences.

The Jupyter notebooks include two types of interactive cell – editable text cells into which you can write simple markdown and HTML text that will be rendered as text; and code cells into which you can write executable code. Once executed, the results of that execution are displayed as cell output. Note that the output from a cell may be text, a datatable, a chart, or even an interactive map.

One of the nice things about the Jupyter notebook project is that the executable cells are connected via the Jupyter server to a programming kernel that executes the code. An increasing number of kernels are supported (e.g. for R, Javascript and Java as well as Python) so once you hook in to the Jupyter ecosystem you can use the same interface for a wide variety of computing tasks.

There are multiple ways of running Jupyter notebooks, including the mybinder approach described above – I describe several of them in the post Seven Ways of Running IPython Notebooks.

As well as having an important role to play in reproducible data journalism and reproducible (scientific) research, notebooks are also a powerful, and expressive, medium for teaching and learning. For example, we’re just about to start using Jupyter notebooks, delivered via a virtual machine, for the new OU course Data management and analysis.

We also used them in the FutureLearn course Learn to Code for Data Analysis, showing how code could be used a line at a time to analyse a variety of opendata sets from sources such as the World Bank Indicators database and the UN Comtrade (import/export data) database.

PS for sports data fans, here’s a list of data sources I started to compile a year or so ago: Sports Data and R – Scope for a Thematic (Rather than Task) View? (Living Post).

Olympics Data Feeds – Scribbled Notes

This isn’t so much a blog post as a dumping ground for bits and pieces relating to Olympics data coverage…

BBC Internet blog: Olympic Data Services and the Interactive Video Player – has a brief overview of how the BBC gets its data from LOCOG; and Building the Olympic Data Services describes something of the technical architecture.

ODF Data Dictionaries eg ODF Equestrian Data Dictionary [via @alisonw] – describes how lots of data that isn’t available to mortals is published ;-)

Computer Weekly report from Sept 2011: Olympic software engineers enter final leg of marathon IT development project

Examples of some of the Olympics related products you can buy from the Press Association: Press Association: Olympics Graphics (they also do a line of widgets…;-)

I haven’t found a public source of press releases detailing results that has been published as such (seems like you need to register to get them?) but there are some around if you go digging (for example, gymnastics results, or more generally, try a recent websearch for something like this: "report created" filetype:pdf olympics results).

A search for medallists on Freebase (via @mhawksey), and an example of how to query for just the gold medal winners.

[PDFs detailing biographical details of entrants to track and field events at least: games XXX olympiad biographical filetype:pdf]

A really elegant single web page app from @gabrieldance: Was an Olympic Record Set Today? Great use of the data…:-)

This also makes sense – story on how Telegraph builds Olympics graphics tool for its reporters to make it easy to generate graphical views over event results.

PS though it’s not data related at all, you may find this amusing: OU app for working out which Olympic sport you should try out… Olympisize Me (not sure how you know it was an OU app from the landing page though, other than by reading the URL…?)

PPS I tweeted this, but figure it’s also worth a mention here: isn’t it a shame that LOCOG haven’t got into the #opendata thing with the sports results…

Feeding on OU/BBC Co-Produced Content (Upcoming and Currently Available on iPlayer)

What feeds are available listing upcoming broadcasts of OU/BBC co-produced material or programmes currently available on iPlayer?

One of the things I’ve been pondering with respect to my OU/BBC programmes currently on iPlayer demo and OU/BBC co-pros upcoming on iPlayer (code) is how to start linking effectively across from programmes to Open University educational resources.

Chatting with KMi’s Mathieu d’Aquin a few days ago, he mentioned KMi were looking at ways of automating the creation of relevant semantic linkage that could be used to provide linkage between BBC programmes and OU content and maybe feed into the BBC’s dynamic semantic publishing workflow.

In the context of OU and BBC programmes, one high level hook is the course code. Although I don’t think these feeds are widely promoted as a live service yet, I did see a preview(?) of an OU/BBC co-pro series feed that includes linkage options such as related course code (one only? Or does the schema allow for more than one linked course?) and OU nominated academic (one only? Or does the schema allow for more than one linked academic? More than one), as well as some subject terms and the sponsoring Faculty:

    <title><![CDATA[OU on the BBC: Symphony]]></title>
    <description><![CDATA[Explore the secrets of the symphony, the highest form of expression of Western classical music]]></description>
    <image title="The Berrill Building"></image>
    <ou_faculty_reference>Music Department</ou_faculty_reference>
            <showdate>21:00:00 24/11/2011</showdate>
            <location><![CDATA[BBC Four]]></location>
            <showdate>19:30:00 16/03/2012</showdate>
            <location><![CDATA[BBC Four]]></location>
            <showdate>03:00:00 17/03/2012</showdate>
            <location><![CDATA[BBC Four]]></location>
            <showdate>19:30:00 23/03/2012</showdate>
            <location><![CDATA[BBC Four]]></location>
            <showdate>03:00:00 24/03/2012</showdate>
            <location><![CDATA[BBC Four]]></location>
 <category domain="">What's On</category>
 <category domain="">BBC Four</category>
 <category domain="">music</category>
 <category domain="">symphony</category>
 <pubDate>Tue, 18 Oct 2011 10:38:03 +0000</pubDate>
 <guid isPermaLink="false">147728 at</guid>

I’m not sure what the guid is? Nor do there seem to be slots for links to related OpenLearn resources other than the top link element? However, the course code does provide a way into course related educational resources via, the nominated academic link may provide a route to associated research interests (for example, via ORO, the OU open research repository), the BBC programme code provides a route in to the BBC programme metadata, and the category tags may provide other linkage somewhere depending on what vocabulary gets used for specifying categories!
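By way of a first step towards a demo, here is a minimal sketch that pulls the broadcast slots out of an item like the one above. The fragment has no enclosing element, so the sample wraps a cut-down copy of it in an `<item>` tag (an assumption about the real feed), and the code assumes each `showdate` is followed by its own `location`, pairing them in document order. The function name `broadcasts` is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Cut-down copy of the fragment above, wrapped in an assumed <item> element.
SAMPLE = """<item>
  <title><![CDATA[OU on the BBC: Symphony]]></title>
  <showdate>21:00:00 24/11/2011</showdate>
  <location><![CDATA[BBC Four]]></location>
  <showdate>19:30:00 16/03/2012</showdate>
  <location><![CDATA[BBC Four]]></location>
</item>"""


def broadcasts(item_xml: str) -> list:
    """Return (title, broadcast datetime, channel) for each showdate in an item."""
    item = ET.fromstring(item_xml)
    title = item.findtext("title")
    # Dates in the feed look like "21:00:00 24/11/2011" (time, then day/month/year).
    dates = [datetime.strptime(e.text.strip(), "%H:%M:%S %d/%m/%Y")
             for e in item.iter("showdate")]
    places = [e.text.strip() for e in item.iter("location")]
    # Assumes showdate/location elements come in matching order.
    return [(title, when, where) for when, where in zip(dates, places)]
```

Calling `broadcasts(SAMPLE)` gives one tuple per scheduled broadcast, which could then be filtered for upcoming dates or matched against programme listings.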

I guess I need to build myself a little demo to see what we can do with a feed of this sort..?!;-)

I’m not sure if plans are similarly afoot to publish BBC programme metadata at the actual programme instance (“episode”) level? It’s good to see that the OpenLearn What’s On feed has been tidied up a little to include title elements, although it’s still tricky to work out what the feed is actually of?

For example, here’s the feed I saw a few days ago:

    <title><![CDATA[OU on the BBC: Divine Women  - 9:00pm 25/04/2012 - BBC Two and BBC HD]]></title>
    <description><![CDATA[Historian Bettany Hughes reveals the hidden history of women in religion, from dominatrix goddesses to feisty political operators and warrior empresses&nbsp;]]></description>
    <location><![CDATA[BBC Two and BBC HD]]></location>
	<image title="The Berrill Building"></image>
    <showdate>21:00:00 25/04/2012</showdate>
     <pubDate>Tue, 24 Apr 2012 11:19:10 +0000</pubDate>
 <guid isPermaLink="false">151446 at</guid>

It contains an upcoming show date for programmes that will be broadcast over the next week or so, and a link to a related page on OpenLearn for the episode, although no direct information about the BBC programme code for each item to be broadcast.

In the meantime, why not see what OU/BBC co-pros are currently available on iPlayer?

Or for bitesize videos, how about this extensive range of clips from OU/BBC co-pros?

Enjoy! :-)

Government Communications – Department Press Releases and Autodiscoverable Syndication Feeds

A flurry of articles earlier this week (mine will be along shortly) about the Data Strategy Board all broadly rehashed the original press release from BIS. Via the Cabinet Office Transparency minisite, I found a link to the press release via the COI News Distribution Service…

…whereupon I noticed that the COI – Central Office of Information – is to close at the end of this month (31 March 2012), taking with it the News Distribution Service for Government and the Public Sector (soon to be ex- of

In its place is the following advice: “For government press releases please follow this link to find the department that you require”. This leads to a set of alphabetised pages with links to the various government departments… i.e. it points to a starting point for likely fruitless browsing and searching if you’re after aggregated press releases from gov departments.

(I’m not sure where News Sauce: UK Government Edition gets its data from, but if it’s by scrapes of departmental press releases rather than just scraping and syndicating the old COI content, then it’s probably the site I’ll be using to keep tabs on government press releases.)

FWIW, centralisation and aggregation are not the same in terms of architectures of control. Aggregation (then filter on the way out, if needs be) can be a really really useful way of keeping tabs on otherwise distributed systems… I had a quick look to see whether anyone was scraping and aggregating UKGov departmental press releases on Scraperwiki, but only came up with @pezholio’s LGA Press Releases scraper…

An easier way would be to hook up my feed reader to an OPML bundle that collected together RSS/Atom feeds of news releases from the various government websites. I’m not sure if such a bundle is available anywhere (if you know of one, please add a link in the comments below), but if: 1) gov departments do publish RSS/Atom feeds containing their press releases; 2) they make these feeds autodiscoverable via their homepages; and 3) ensure that said feeds are reliably identifiable as press release/media release feeds, it wouldn’t be too hard to build a simple OPML feed generator.
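A minimal sketch of such an OPML generator, assuming the press release feeds have already been found and collected as (name, feed URL) pairs. The function name `build_opml` is hypothetical, and the department names and URLs in the usage note are made up.

```python
import xml.etree.ElementTree as ET


def build_opml(title: str, feeds: list) -> str:
    """Build an OPML document from (name, feed URL) pairs."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, url in feeds:
        # One outline element per feed; xmlUrl is what feed readers subscribe to.
        ET.SubElement(body, "outline", {"type": "rss", "text": name, "xmlUrl": url})
    return ET.tostring(opml, encoding="unicode")
```

For example, `build_opml("Gov press releases", [("Dept A", "https://example.org/dept-a/press.rss")])` returns a string that can be saved as an `.opml` file and imported into a feed reader.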

So for example, trawling through old posts, I note that the post 404 “Page Not Found” Error pages and Autodiscoverable Feeds for UK Government Departments used a Yahoo Pipes pipe to try to automatically audit feed autodiscovery on UK gov departmental homepages, though it may well have rotted by now. If I was to fix it, I’d probably reimplement it in Scraperwiki, as I did with my UK HEI feed autodiscovery thang (UK university autodiscoverable RSS Feeds (Scraperwiki scraper), and Scraperwiki View; about: Autodiscoverable Feeds and UK HEIs (Again…)). If you beat me to that, please post a link to your scraper below;-)

I have to admit I haven’t checked the state of feed autodiscovery on UK gov, local gov, or university websites recently. Sigh… another thing to add to the list of ‘maybe useful’ diversions…;-)

See also: Public Data Principles: RSS Autodiscovery on Government Department Websites?

PS This tool may or may not be handy if feed autodiscovery is new to you? Feed Autodiscovery in Javascript

PPS hmm, from Tracking Down Local Government Consultation Web Pages, I recall there is a list of LGD service ID codes, identifiers for local government services that can be used to tag webpages/URLs on local government sites. Are there service identifiers for central government communication services (eg provision of press releases?) that could be used to find central gov department press releases (or local gov press releases for that matter?) Of course, if departments all had autodiscoverable press release feeds on their homepages, it’d be a more weblike way;-)

Tune Your Feeds…

I’m so glad we’re at year’s end: I’m completely bored of the web, my feeds contain little of interest, I’m drastically in need of a personal reboot, and I’m starting to find myself stuck in a “seen-it-all-before” rut…

Take the “new” Google Circle’s volume slider, for example… Ooh.. shiny… ooh, new feature…

Yawn… Slider widgets have been around for ages, of course (e.g. Slider Widgets Around the Web) and didn’t Facebook allow you to do the volume control thing on your Facebook news feeds way back when, when Facebook’s feeds were themselves news (Facebook News Mixing Desk)?

Facebook Mixing desk

Does Facebook still offer this service I wonder?

On the other hand, there is the new Google Zeitgeist Scrapbook… I’m still trying to decide whether this is interesting or not… The premise is a series of half completed straplines that you can fill in with subheadings that interest you, and reveal a short info paragraph as a result.

Google scrapbook

The finished thing is part scrapbook, part sticker book.

Google scrapbook

The reason why I’m not sure whether this is interesting or not is because I can’t decide whether it may actually hint at a mechanic for customising your own newspaper out of content from your favoured news provider. For example, what would it look like if we tried to build something similar around content from the Guardian Platform API? Might different tag combinations be dragged into the story panels to hook up a feed from that tag or section of the “paper”? And once we’ve acted as editor of our own newspaper, might advanced users then make use of mixing desk sliders to tune the volume of content in each section?
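To make the mixing desk idea a little more concrete, here is a minimal sketch of how slider settings might be turned into a number of story slots per section. It is purely illustrative: the function name `allocate_slots`, the section names and the proportional allocation rule are all my own assumptions and have nothing to do with how any real news platform works.

```python
def allocate_slots(sliders: dict, total_slots: int) -> dict:
    """Share out story slots in proportion to each section's slider value."""
    total_weight = sum(sliders.values())
    if total_weight == 0:
        return {section: 0 for section in sliders}
    shares = {s: total_slots * w / total_weight for s, w in sliders.items()}
    slots = {s: int(share) for s, share in shares.items()}
    # Hand out the slots lost to rounding down, largest fractional part first.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(shares, key=lambda s: shares[s] - slots[s], reverse=True)
    for section in by_remainder[:leftover]:
        slots[section] += 1
    return slots
```

With sliders of `{"politics": 5, "sport": 3, "culture": 2}` and ten slots, the "paper" would carry five politics stories, three sport and two culture.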

This builds on the idea that newspapers provide you with content and story types you wouldn’t necessarily see, whilst still allowing you some degree of control over how weighted the “paper” is to different news sections (something we always had some element of control over before, though at a different level of granularity, for example, by choosing to buy newspapers only on certain days because they came with a supplement you were interested in, though you were also happy to read the rest of the paper since you have it…)

(It also reminds me that I never could decide about Google’s Living Stories either…)

PS in other news, MIT hints at an innovation in the open educational field, in particular with respect to certification… It seems you may soon be able to claim some sort of academic credit, for a fee, if you’ve been tracked through an MITx open course (MIT’s new online courses target students worldwide). Here’s the original news release: MIT launches online learning initiative and FAQ.

So I wonder: a “proven” online strategy is to grab as big an audience as you can as quickly as you can, then worry about how to make the money back. Could MIT’s large online course offerings from earlier this year be seen in retrospect as MIT testing the waters to see whether or not they could grow an audience around online courses quickly?

I just wonder what would have happened if we’d managed to convert a Relevant Knowledge course to an open course accreditation container for a start date earlier this year, and used it to offer credit around the MIT courses ourselves?!;-) As to what other innovations might there be around open online education? I suspect the OU still has high hopes for SocialLearn… but I’m still of the mind that there’s far more interesting stuff to be done in the area of open course production.

Feed Autodiscovery in Javascript

For what it’s worth, I’ve posted a demo showing a couple of feed autodiscovery/autodetection tricks that let you autodiscover feeds in remote pages via a couple of online services: the Google feed api, and YQL (Feed Autodiscovery With YQL).
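For reference, the underlying trick can also be sketched without any third party service. This is a minimal sketch using only the Python standard library, not the approach used in the demo: it assumes you already have the page's HTML as a string, and it looks for `<link rel="alternate">` elements with an RSS or Atom media type. The names `FeedLinkParser` and `discover_feeds` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect the URLs of autodiscoverable feeds declared in a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        kind = (attr.get("type") or "").lower()
        if "alternate" in rels and kind in FEED_TYPES and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html: str, base_url: str) -> list:
    """Return the feed URLs advertised in an HTML page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

Fetching the page itself is left out deliberately, so the sketch can be run against saved HTML.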

Try it out: Feed autodiscovery in Javascript (code)

Single page web app: feed autodetection

I’ve also added in a routine that uses the Google feed api to look up historical entries on an RSS feed. As soon as Google is alerted to a feed (presumably by anyone or any means), it starts cacheing entries. The historical entries API lets you grab up to 250 of the most recent entries from a feed, irrespective of how many items the feed itself currently contains…

Why it matters: Public Data Principles: RSS Autodiscovery on Government Department Websites?, Autodiscoverable Feeds and UK HEIs (Again…)

PS Just by the by, I added a Scraperwiki view to my UK HEI autodiscovered feeds Scraperwiki. I added a little bit of logic to try to pull out feeds on a thematic basis too…

UK HE autodiscoverable feeds

On the to do list is to create some OPML output views so you can easily subscribe to, or display, batches of the feeds in one go.

I guess I should also add a table to the scraper to start logging the number of feeds that are autodiscoverably out there over time?

Extracting Data From Misbehaving RSS/Atom Feeds

A quick recipe prompted by a query from Joss Winn about getting data out of an apparently broken Atom feed:

The feed previews in Google Reader okay and is also viewable in my browser, but neither Google Spreadsheets (via the =importFeed() formula) nor YQL (?!) appear to like it.

[Note: h/t to Joss for pointing this out to me: is a recipe for accessing Google Reader’s archive of a feed, and pulling out e.g. n=150 items (r=n is maybe an ordering argument?) Which is to say: here’s a way of accessing an archive of RSS feed items…:-)]

However, Yahoo Pipes does, so a simple proxy pipe normalises the feed and gives us one that is properly formatted:

Sample Yahoo feed proxy -

The normalised feed can now be accessed via:

We can also get access to a CSV output:

The CSV can be imported in to a Google spreadsheet using the =importData() formula:

[Gotcha: if you have ?&_render in the URL (i.e. ?&), Spreadsheets can’t import the data…]

Once in the spreadsheet it’s easy enough to just pull out e.g. the description text from each feed item because it all appears in a single column.
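The same single column extraction can be sketched outside a spreadsheet. This is a minimal sketch assuming the CSV output has been downloaded as text and that it has a header row containing a `description` column (the column name is an assumption about the pipe's output); the function name `column_values` is hypothetical.

```python
import csv
import io


def column_values(csv_text: str, column: str = "description") -> list:
    """Return the non-empty values of one named column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column] for row in reader if row.get(column)]
```

Using `csv.DictReader` rather than splitting on commas means that descriptions containing commas or quoted text are handled properly.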

Google spreadsheets can also query the feed and just pull in the description element. For example:

=ImportFeed(“;,”items summary”)

(Note that it seemed to time out on me when I tried to pull in the full set of 150 elements in Joss’ original feed, but it worked fine with 15.)

We can also use YQL developer console to just pull out the description elements:

select description from rss where url=’

YQL querying an rss feed

YQL then provides XML or JSON output as required.