Blog Details from an RSS/Atom Feed

Picking up on Feed Autodetection With YQL, where I described a YQL custom query for autodetecting RSS and Atom feed URLs in a web page given the web page URL, here’s a complementary YQL custom query. It passes a feed URL through the YQL feed normaliser and returns the feed’s title and the URL of the alternate HTML page for the feed:

select title,link from feednormalizer where url=@url and output='atom_1.0'

You can call the query using this alias:

(Leave the &format=json off if you want an XML response.)

Here’s an example of that query with a specific feed URL filled in, run in the YQL developer console:

Feed details via YQL

You’ll notice that several links are returned; the HTML page URL is the one in the result where rel="alternate". This is somewhat reminiscent of feed autodetection in an HTML page, where rel="alternate" identifies a <link> element that carries the URL of a feed alternative…
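To make the rel="alternate" selection concrete, here is a minimal Python sketch that pulls the title and the alternate HTML link out of an Atom 1.0 document. It works on raw Atom XML using only the standard library rather than on the YQL response itself; the function name feed_details and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 elements
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed, for illustration only
SAMPLE = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<title>Example Blog</title>"
    '<link rel="self" href="http://example.com/feed/"/>'
    '<link rel="alternate" type="text/html" href="http://example.com/"/>'
    "</feed>"
)


def feed_details(atom_xml):
    """Return (title, alternate_html_url) for an Atom 1.0 document."""
    root = ET.fromstring(atom_xml)
    title = root.findtext(ATOM_NS + "title")
    html_url = None
    # Only look at feed-level links (direct children of <feed>)
    for link in root.findall(ATOM_NS + "link"):
        # In Atom, a link with no rel attribute is treated as rel="alternate"
        if link.get("rel", "alternate") == "alternate":
            html_url = link.get("href")
            break
    return title, html_url


if __name__ == "__main__":
    print(feed_details(SAMPLE))
```

Taking the first alternate link is a simplification; a feed may offer several alternates distinguished by type or hreflang.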

It’s now easy enough to do a two-pass procedure: first autodetect a feed URL from an HTML blog homepage using the autodetection query described previously, then look up the details of the feed using the query described above.
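The two-pass idea can also be sketched without YQL. The Python below is an illustration under assumptions, not the queries themselves: blog_details, FeedLinkFinder and fetch_bytes are hypothetical names, the first advertised feed is simply taken as the one to use, and the title is read directly from the Atom or RSS 2.0 document rather than from a normalised feed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen
import xml.etree.ElementTree as ET

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}
ATOM_NS = "{http://www.w3.org/2005/Atom}"


class FeedLinkFinder(HTMLParser):
    """Collect hrefs of <link rel="alternate"> elements that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feed_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        feed_type = (a.get("type") or "").lower()
        if "alternate" in rels and feed_type in FEED_TYPES and a.get("href"):
            self.feed_hrefs.append(a["href"])


def fetch_bytes(url):
    """Default fetcher: return the raw response body for a URL."""
    with urlopen(url, timeout=10) as resp:
        return resp.read()


def blog_details(page_url, fetch=fetch_bytes):
    """Pass 1: autodetect a feed URL in the page. Pass 2: read the feed title."""
    finder = FeedLinkFinder()
    finder.feed(fetch(page_url).decode("utf-8", errors="replace"))
    if not finder.feed_hrefs:
        return None  # the page advertises no feed
    # The href may be relative, so resolve it against the page URL
    feed_url = urljoin(page_url, finder.feed_hrefs[0])
    root = ET.fromstring(fetch(feed_url))
    # Atom keeps the title under <feed>; RSS 2.0 keeps it under <channel>
    title = root.findtext(ATOM_NS + "title") or root.findtext("channel/title")
    return {"feed_url": feed_url, "title": title}
```

Passing the fetcher in as a parameter keeps the network call swappable, so the two passes can be tried out against canned pages before pointing them at real blogs.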

And why exactly might we want to do this? Because in many HTML docs that do specify an alternate RSS/Atom feed, the title attribute provided on the feed <link> is often something uninformative like “RSS2.0”, rather than the title of the original blog…
