Tagged: Wordpress

Creating a Self-Updating WordPress Post Using WordPress Transients and Data from a Third Party API

In the post Pondering a Jupyter Notebooks to WordPress Publishing Pattern: MultiMarker Map Widget, I described a simple pattern I started exploring last year that used a custom WordPress shortcode plugin to render data added to one or more custom fields associated with a WordPress post; the post text (including shortcode) and custom fields data were themselves posted into WordPress using some Python code executed from a Jupyter notebook. The idea behind that pattern was to provide a way of automating the creation of custom posts largely from a supplied data set, rendered using a generic shortcode plugin.

Another pattern I explored last year used the WordPress Transients API to cache data pulled from a 3rd party API in the WordPress database, and allow that data to be used by a custom plugin to render the post.

Here’s some example code for a plugin that renders a map containing recent planning applications on the Isle of Wight: the data is grabbed via an API from a morph.io webscraper, which scrapes the data from the Isle of Wight council website.

The two key bits of the script are where I check to see if cached data exisits ( get_transient( 'iwcurrplanningitems' ); and if it doesn’t, grab a recent copy from the API and cache it for 8 hours (set_transient('iwcurrplanningitems', $markers, 60*60*8);).

Plugin Name: IWPlanningLeafletMap
Description: Shortcode to render an interactive map displaying clustered markers. Markers are pulled in via JSON from an external URL. Intended primarily to supported automated post creation. Inspired by folium python library and Google Maps v3 Shortcode multiple Markers WordPress plugin
Version: 1.0
Author: Tony Hirst

//Loaded in from multimarker shortcode
add_action( 'wp_enqueue_scripts', 'custom_scripts' );
add_action( 'wp_enqueue_scripts', 'custom_styles' );

// Add stuff to header
add_action('wp_head', 'IWPlanningLeafletMap_header');
add_action('wp_head', 'fix_css');

function fix_css() { 
	echo '<style type="text/css">#map {
      }</style>' . "\n";

function IWPlanningLeafletMap_header() {

function IWPlanningLeafletMap_call($attr) {
// Generate the map template

	// Default attributes - can be overwritten from shortcode
	$attr = shortcode_atts(array(	
									'lat'   => '50.675', 
									'lon'    => '-1.32',
									'id' => 'iwmap_1',
									'zoom' => '11',
									'width' => '800',
									'height' => '500',
									), $attr);

	$html = '<div class="folium-map" id="'.$attr['id'].'" style="width: '. $attr['width'] .'px; height: '. $attr['height'] .'px"></div>

   <script type="text/javascript">
      var base_tile = L.tileLayer("https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png", {
          maxZoom: 18,
          minZoom: 1,
          attribution: "Map data (c) OpenStreetMap contributors - http://openstreetmap.org"

      var baseLayer = {
        "Base Layer": base_tile

      list of layers to be added
      var layer_list = {

      Bounding box.
      var southWest = L.latLng(-90, -180),
          northEast = L.latLng(90, 180),
          bounds = L.latLngBounds(southWest, northEast);

      Creates the map and adds the selected layers
      var map = L.map("'.$attr['id'].'", {
                                       center:['.$attr['lat'].', '.$attr['lon'].'],
                                       zoom: '.$attr['zoom'].',
                                       maxBounds: bounds,
                                       layers: [base_tile]

      L.control.layers(baseLayer, layer_list).addTo(map);

      //cluster group
      var clusteredmarkers = L.markerClusterGroup();
      //section for adding clustered markers
	$markers =  get_transient( 'iwcurrplanningitems' );
	if ( false === $markers ) {
		$json = file_get_contents($url);
		$markers=json_decode($json, true);
		set_transient('iwcurrplanningitems', $markers, 60*60*8);
	for ($i = 0;$i < count($markers);$i ++){
		$arrkeys=['Agent or Applicant','Location','Proposal'];
		foreach($arrkeys as $arrkey){
			$markers[$i][$arrkey] = str_replace("\n", "<br/>", $markers[$i][$arrkey]);
			$markers[$i][$arrkey] = str_replace("\r", "<br/>", $markers[$i][$arrkey]);
		$html .='
			var marker_'.$i.'_icon = L.AwesomeMarkers.icon({ icon: "info-sign",markerColor: "blue",prefix: "glyphicon",extraClasses: "fa-rotate-0"});
      		var marker_'.$i.' = L.marker(['.$markers[$i]['lat'].','.$markers[$i]['lon'].'], {"icon":marker_'.$i.'_icon});
      marker_'.$i.'.bindPopup("<strong>Consultation start:</strong> '.$markers[$i]['Consultation Start Date'].'<br/><strong>Consultation end:</strong> '.$markers[$i]['Consultation End Date'].'<br/><strong>Location:</strong> '.$markers[$i]['Location'].'<br/><em> '.$markers[$i]['Parish'].' parish, '.$markers[$i]['Ward'].' ward.</em><br/><strong>Proposal:</strong> '.$markers[$i]['Proposal'].'<br/><strong>Agent or Applicant:</strong> '.$markers[$i]['Agent or Applicant'].'<br/><strong>Case Officer:</strong> '.$markers[$i]['Case Officer'].'<br/><em><a href=\'https://www.iwight.com/planning/'.$markers[$i]['stub'].'\'>View application</a></em>");
      marker_'.$i.'._popup.options.maxWidth = 300;
     		//add the clustered markers to the group anyway

	$html .= '</script>';
	return $html;

add_shortcode('IWPlanningLeafletMap', 'IWPlanningLeafletMap_call');

One thing I started to wonder over the Christmas break was whether this approach could provide a way of sharing “data2text” content. For example, having a plugin that creates a canned summary of jobseeker’s allowance figures from data cached from the ONS website? A downside of this is that I’d have to write the data2text script using PHP, which means I couldn’t directly build on related code I’ve written previously…

I also wonder if we could use custom fields to permanently store data for a particular post. For example, we might check whether or not a custom field exists for the post, and if it doesn’t we could create and populate it using data pulled from an API, (possibly keyed by plugin/shortcode parameters, or the post publication date), using a WordPress add_post_meta() function call?

Some Idle Thoughts on Managing Temporal Posts in WordPress

Now that I’ve got a couple of my own WordPress blogs running off the back of my Reclaim Hosting account, I’ve started to look again at possible ways of tinkering with WordPress.

The first thing I had a look at was posting a draft WordPress post from a script.

Using a WordPress role editor plugin (e.g. a long the lines of this User Role Editor) it’s easy enough to create a new role with edit and upload permissions only [WordPress roles and capabilities], and create a new ‘autoposter’ user with that role. Code like the following then makes it easy enough to upload an image to WordPress, grab the URL, insert it into a post, and then submit the post – where it will, by default, appear as a draft post:

#Ish Via: http://python-wordpress-xmlrpc.readthedocs.org/en/latest/examples/media.html
from wordpress_xmlrpc import Client, WordPressPost
from wordpress_xmlrpc.compat import xmlrpc_client
from wordpress_xmlrpc.methods import media, posts
from wordpress_xmlrpc.methods.posts import NewPost

wp = Client('http://blog.example.org/xmlrpc.php', ACCOUNTNAME, ACCOUNT_PASSWORD)

def wp_simplePost(client,title='ping',content='pong, <em>pong<em>'):
    post = WordPressPost()
    post.title = title
    post.content = content
    response = client.call(NewPost(post))
    return response

def wp_uploadImageFile(client,filename):

    mimes={'png':'image/png', 'jpg':'image/jpeg'}
    # prepare metadata
    data = {
            'name': filename,
            'type': mimetype,  # mimetype

    # read the binary file and let the XMLRPC library encode it into base64
    with open(filename, 'rb') as img:
            data['bits'] = xmlrpc_client.Binary(img.read())

    response = client.call(media.UploadFile(data))
    return response

def quickTest():
    txt = "Hello World"
    txt=txt+'<img src="{}"/><br/>'.format(wp_uploadImageFile(wp,'hello2world.png')['url'])
    return txt


Dabbling with this then got me thinking about the different sorts of things that WordPress allows you to publish in general. It seems to me that there are essentially three main types of thing you can publish:

  1. posts: the timestamped elements that appear in a reverse chronological order in a WordPress blog. Posts can also be tagged and categorised and viewed via a tag or category page. Posts can be ‘persisted’ at the top of the posts page by setting them as a “sticky” post.
  2. pages: static content pages typically used to contain persistent, unchanging content. For example, an “About” page. Pages can also be organised hierarchically, with child subpages defined relative to a specified ‘parent’ page.
  3. sidebar elements and widgets: these can contain static or dynamic content.

(By the by, a range of third party plugins appear to support the conversion of posts to pages, for example Post Type Switcher [untested] or the bulk converter Convert Post Types [untested].)

Within a page or a post, we can also include a shortcode element that can be used to include a small piece of templated text or generated from the execution of some custom code (which it seems could be python: running a python script from a WordPress shortcode). Shortcodes run each time a page is loaded, although you can use the WordPress Transients database API to implement a simple cache for them to improve performance (eg as described here and here).

Within a post, page or widget, we can also embed dynamic content. For example, we could embed a map that displays dynamically created markers that are essentially out of the control of the page or post publisher. Note that by default WordPress strips iframes from content (and it also seems reluctant to allow the upload of html files to the media gallery, at least by default). The preferred way to include custom embedded content seems to be to define a shortcode to embed the required content, although there are plugins around that allow you to embed iframes. (I didn’t spot one that let you inline the content of the iframe using srcdoc though?)

When we put together the Isle of Wight planning applications : Mapped page, one of the issues related to how updates to the map should be posted over time.


That is, should the map be uploaded to a fixed page and show only the most recent data, should it be posted as a timestamped post, to provide archival copies of the page, or should it be posted to a page and support a timeslider/history function?

Thinking about this again, the distinction seems to rely on what sort of (re)discovery we want to encourage or support. For example, if the page is a destination page, then we should probably use a page with a fixed URL for the most recent map. Older maps could be accessed via archive links, or perhaps subpages, if a time-filter wasn’t available on a single map view. Alternatively, we might want to alert readers to the map, in which case it might make more sense to use a timestamped post. (We could of course use a post to announce an update to the page, perhaps including a screenshot of the latest map in the post.)

It also strikes me that we need to consider publication schedules by a news outlet compared to the publication schedules associated with a particular dataset.

For example, Land Registry House Prices Paid data is published on a monthly basis a few weeks after each month the data has been collected for. In this case, it probably makes sense to publish on a monthly basis.

But what about care home or food outlet inspection data? The CQC publish data as it becomes available, although searches support the retrieval of data for a particular area published over the last week or last month relative the time the search is made. The Food Standards Agency produce updates to data download files on a daily basis, but the file for any particular area is only updated when it contains new data. (So on any given day, you don’t know which, if any, area files will be updated.)

In this case, it may well be that a news outlet may want to do a couple of things:

  • publish summaries of reports over the last week or last month, on a weekly or monthly schedule – “The CQC published reports for N care homes in the region over the last month, of which X were positive and Y were negative”, etc.
  • engage in a more immediate or responsive publication of stories around particular reports as they are published by the responsible agency. In this case, the journalist needs to find a way of discovering stories in a timely fashion, either through signing up to alerts or inspecting the agency site on a regular basis.

Again, it might be that we can use posts and pages in complementary way: pages that act as fixed destination sites with a fixed URL, and perhaps links off to archived historical sub-pages, as well as related news stories, that contain the latest summary; and posts that announce timely reports as well as ‘page updated’ announcements when the slower-changing page is updated.

More abstractly, it probably makes sense to consider the relative frequencies with which data is originally published (also considering whether the data is published according to a fixed schedule, or in a more responsive way as and when data becomes available), the frequency with which journalists check the data site, and the frequency with which journalists actually publish data related stories.

WordPress Quickstart With Docker

I need a WordPress install to do some automated publishing tests, so had a little look around to see how easy it’d be using docker and Kitematic. Remarkably easy, it turns out, once the gotchas are sorted. So here’s the route in four steps:

1) Create a file called docker-compose.yml in a working directory of your choice, containing the following:

  image: mysql
  image: wordpress
    - somemysql:mysql
    - 8082:80

The port mapping sets the WordPress port 80 to be visible on host at port 8082.

2) Using Kitematic, launch the Kitematic command-line interface (CLI), cd to your working directory and enter:

docker-compose up -d

(The -d flag runs the containers in detached mode – whatever that means?!;-)

3) Find the IP address that Kitematic is running the VM on – on the command line, run:

docker-machine env dev

You’ll see something like export DOCKER_HOST="tcp://" – the address you want is the “dotted quad” in the middle; here, it’s

4) In your browser, go to eg (or whatever values your setup us using) – you should see the WordPress setup screen:



Here’s another way (via this docker tutorial: wordpress):

i) On the command line, get a copy of the MySQL image:

docker pull mysql:latest

ii) Start a MySQL container running:

docker run --name some-mysql -e MYSQL_ROOT_PASSWORD=example -d mysql

iii) Get a WordPress image:

docker pull wordpress:latest

iv) And then get a WordPress container running, linked to the database container:

docker run --name wordpress-instance --link some-mysql:mysql -p 8083:80 -d wordpress

v) As before, lookup the IP address of the docker VM, and then go to port 8083 on that address.

WordPress Stats in R

A trackback from Martin Hawksey’s recent post on Analysing WordPress post velocity and momentum stats with Google Sheets (Spreadsheet), which demonstrates how to pull WordPress stats into a Google Spreadsheet and generates charts and reports therein, reminded me of the WordPress stats API.

So here’s a quick function for pulling WordPress reports into R.

#Wordpress Stats
#Wordpress Stats API docs (from http://stats.wordpress.com/csv.php)

#You can get a copy of your API key (required) from Akismet:
#Login with you WordPress account: http://akismet.com/account/
#Resend API key: https://akismet.com/resend/

#Required parameters: api_key, blog_id or blog_uri.
#Optional parameters: table, post_id, end, days, limit, summarize.

#api_key     String    A secret unique to your WordPress.com user account.
#blog_id     Integer   The number that identifies your blog. Find it in other stats URLs.
#blog_uri    String    The full URL to the root directory of your blog. Including the full path.
#table       String    One of views, postviews, referrers, referrers_grouped, searchterms, clicks, videoplays.
#post_id     Integer   For use with postviews table.
#end         String    The last day of the desired time frame. Format is 'Y-m-d' (e.g. 2007-05-01) and default is UTC date.
#days        Integer   The length of the desired time frame. Default is 30. "-1" means unlimited.
#period      String    For use with views table and the 'days' parameter. The desired time period grouping. 'week' or 'month'
#Use 'days' as the number of results to return (e.g. '&period=week&days=12' to return 12 weeks)
#limit       Integer   The maximum number of records to return. Default is 100. "-1" means unlimited. If days is -1, limit is capped at 500.
#summarize   Flag      If present, summarizes all matching records.
#format      String    The format the data is returned in, 'csv', 'xml' or 'json'. Default is 'csv'.
#NOTE: some of the report calls I tried didn't seem to work properly?
#Need to build up a list of tested calls to the API that actually do what you think they should?

wordpress.getstats.demo=function(apikey, blogurl, table='postviews', end=Sys.Date(), days='12', period='week', limit='', summarise=''){
  #default parameters gets back last 12 weeks of postviews aggregated by week
    '&',summarise, #set this to 'summarise=T' if required
  #Martin's post notes that JSON appears to work better than CSV
  #May be worth doing a JSON parsing version?

#Use the URL of a WordPress blog associated with the same account as the API key


getDomain=function(url) str_match(url, "^http[s]?://([^/]*)/.*?")[, 2]

#We can pull out the domains clicks were sent to or referrals came from


#Scruffy bar chart - is there a way of doing this sorted chart using geom_bar? How would we reorder x?
ggplot(c)+geom_bar(aes(x=reorder(Var1,Freq),y=Freq),stat='identity')+theme( axis.text.x=element_text(angle=-90))

ggplot(c)+geom_bar(aes(x=reorder(Var1,Freq),y=Freq),stat='identity')+theme( axis.text.x=element_text(angle=-90))

(Code as a gist.)

I guess there’s scope for coming up with a set of child functions that pull back specific report types? Also, if we pull in the blog XML archive and extract external links from each page, we could maybe start to analyse we pages are sending traffic where? (Of course, you can use Google Analytics to do this more efficiently, for hosted WordPress blogs don’t support Google Analytics (for no very good reason that I can tell…?)

PS for more WordPress tinkerings, see eg How OUseful.Info Posts Link to Each Other…,which links to a Python script for extracting data from WordPress blog export files that show how blogs posts in a particular WordPress blog link to each other.

A Quick Comparison of Several Recent Online Consultations

Several online consultation and review documents that engaged my interest were published recently, so I thought it might be useful to quickly compare how they’re presented and what they have to offer.

Public Data Corporation
Firstly, the Plans for the Public Data Corporation consultation. The consultation is presented as a WordPress blog (with some untidy default widgets left in the right hand sidebar) with a brief summary and list of ten (10) consultation questions listed on the front page, and then a separate page to solicit responses for each particular question:

The comments are captured using Disqus and a pre-moderation policy:

It is hard to see at a glance the extent to which people have engaged with the questions across the consultation. The premoderation policy means that there is a delay (and uncertainty) in publishing comments – so for example, the comments I posted on a Saturday morning (#bigsociety time?!;-) presumably won’t be released (if at all) until Monday morning at the earliest… meaning no on-site discussion in the comment thread over the weekend.

(See also SImon Dickson’s take on this consultation: Another Cabinet Office WP consultation.)

Where WordPress is used as a platform, single page RSS feeds and comment feeds per page are available, although it is up to the publisher to decide whether full or summary feeds are published for each page. The following Netvibes dashboard demonstrates an aggregation of single page and page level comment feeds for the PDC consultation:

This suggests that it may be possible to increase the surface area of a consultation using dashboard services, as well as developing dashboards to support the management and reactive moderation of a consultation.

Commons Committee Inquiry on Peer Review
The House of Commons Science and Technology Committee have just called for a new Inquiry into Peer Review.

. Eight (8) separate issues are identified and up to 3,000 word submissions in Word format with numbered paragraphs are requested by email, with a paper copy submitted as well.

In terms of online engagement, I guess this sets the minimum possible baseline?!

“Protection of Freedom Bill” Public Reading Stage
The Cabinet Office recently released a public reading stage for the Protection of Freedom Bill using a themed WordPress site. This site offers front page navigation with the number of public comments received through the platform to date identified for each page.

Comments are supported at a page level, with partial feeds supported at the page level (using ?feed=rss2&withoutcomments=1) along with full comment feeds.

WordPress comment threads enabled.

Top level navigation across the document is preserved at the page level by means of the left-hand navigation sidebar.

Despite the legalistic nature of the Bill, paragraph level commenting is not directly supported.

(See also Simon Dickson’s response to this consultation: Can Cabinet Office’s WordPress-based commentable bills make a difference?.)

Department of Health Online Consultations
The Department of Health Online Consultations Hub provides a single home for current and recently closed consultations from the DoH. Consultations are split over several pages with clearly marked out text entry forms on at the bottom of pages where feedback is requested.(That is, page level structured commenting is supported.) By providing email credentials, users can obtain a link that allows them to return to their submission to the consultation at a later date.

Resource Discovery Taskforce Request for Comments on Metadata Guidelines on JISCPress
The JISC Resource Discovery Taskforce (RDTF) request for comments on UK Metadata Guidelines was published as a multipage document on JISCPress, a WordPress installation running the digress.it theme.

Front page sidebar navigation allows access to all areas of the document and summarises the number of comments per page. Mousing over a page link on the front page loads a preview of the page in the central pane. Following a link leads to a page with floating comment box that supports threaded commenting at the paragraph level:

Each paragraph is also given a unique URI allowing it to be uniquely referenced in posts on third party sites.

Along with comments by section, comments are viewable by commenter:

[Disclaimer: I was part of the project team that proposed JISCPress and the use of the digress.it WordPress plugin and am also a member of the RDTF technical advisory group associated with this RFC.]

Wordpress appears to be gaining traction as a consultation publishing platform, with either vanilla themes (e.g. Public Data Corporation proposal) or custom commentable document themes (JISC RDTF guidelines). WordPress native comments as well as third party commenting support using Disqus are demonstrated (it would be interesting to hear the rationale behind the choice of Disqus and an evaluation of how well it was deemed to have worked). Reactive and pre-moderation strategies are in evidence.

PS One more, that I should have included the first time round, on @lesteph’s ReadAndComment platform – LG Group Transparency Programme.

Whole document navigation is available from the front page as well as from the right hand sidebar on document pages (though it’s not clear if there would be a count of comments per page?) Comments are at page level via a WordPress comment entry form at the bottom of the page:

Steph hinted I won’t like the feeds… dare I look?!;-)

Viewing WordPress Posts in Chronological Order

A short and sweet blog post this one… if you want to share a list of posts by tag or category, or the results from a search on a WordPress blog in the order in which they were posted, just add ?orderby=ID&order=ASC to the end of the URL.

Like this:


What this means is that you can share tagged posts in a chronological view, rather than the default reverse chronological review. Which means your reader can read them in the right order without having to go through any grief…

[UPDATE: as Simon Dickson points out in a comment below, the above actually returns the order in which posts were created. For the order in which they were published, use ?orderby=date&order=ASC]

PS it works for feeds too…

PPS I just added this hack to my blog sidebar too – as a “View these posts in chronological order” link:


PPPS a whole host of other ordering parameters appear to be available too:

* orderby=author
* orderby=date
* orderby=title
* orderby=modified
* orderby=menu_order Note: Only works with Pages.
* orderby=parent
* orderby=ID
* orderby=rand
* orderby=meta_value Note: A meta_key=keyname must also be present in the query.
* orderby=none – no order (available with Version 2.8)
* orderby=comment_count – (available with Version 2.9)

On wordpess.com blogs at least, the pagination parameters other than order don’t appear to work though? (nopaging=true (i.e. display all corresponding posts), posts_per_page=, paged=)

Single Page RSS Feeds – So What? So this…

Having posted about Single Item RSS Feeds on WordPress blogs: RSS For the Content of This Page, it struck me that whilst this facility might be of interest to a very, very select few, most people would probably have the response: so what?

To answer that question, it might help if I let you into a little secret: I’m not really that into content, open educational or otherwise. What I am interested in is how content can flow around the web, and how it can be re-presented in different ways and different places around the web by different people, all pulling on the same source.

So if we consider single page RSS feeds, what this means is that I can re-present the content of any of my WordPress blogged posts anywhere that accepts RSS. So for example, I could view just that post as a Wordle generated word cloud, or subscribe to the RSS version of single blog post on a Netvibes page (maybe along with other related posts):

and view the post in that location:

(At the moment not many other platforms appear to offer single page RSS feeds. I was hopeful that the Guardian might, because they have quite a well developed feed platform, but I couldn’t find a way to grab a single page feed trivially from a page URI:-(

To see why that might be useful, you need to know another of my little secrets. I don’t really think of RSS feeds being used to transport new content, such as the latest posts from the many blogs I still subscribe to. For sure, they can be used for that purpose, and a great many RSS readers are set up to accommodate that sort of use (only showing you feed items you haven’t already read, for example), but that is a special case. The more general case is simply that feeds are used to transport content that has quite a simple structure around the web. And this content might be fixed, static, immutable. That is, the content of the feed might never change once the feed has been created, as in the case of OpenLearn course unit full content RSS feeds.

AS AN ASIDE… I generally think of RSS feeds as providing a way of transporting simple content “items” around where each item has a quite simple structure:

If you think of a blog post or news article as an item, the title is hopefully obvious (the title of the post/article), the description is the content “body” of the item (e.g. the text content of the news article) and the link is the URL of where that post or article can be found on the web. The other elements are optional: what I refer to as annotations correspond to things like latitude and longitude co-ordinates that can be used add geographical information to the item so that it can b plotted on a map for example; and what I term a payload would be something like an audio file that gets delivered when you subscribe to an RSS podcast feed from somewhere like iTunes or IT Conversations.

Once you start viewing RSS feeds as a general transport mechanism, then you start to see the world in a slightly different way… So for example: the a href=”https://ouseful.wordpress.com/2009/07/08/single-item-rss-feeds-on-wordpress-blogs-rss-for-the-content-of-this-page/”>Single Item RSS Feeds post reveals how to create single item RSS feeds from the URL of a blog post hosted on WordPress. Now if I bookmark a series of WordPress hosted blog posts to somewhere like the delicious.com social bookmarking site, and tag them all in the same way, I can get an RSS feed out that contains a list of posts that can be obtained in XML form (that is, as single item RSS feeds).


So maybe if I find a series of posts from WordPress blogs all over the world on a particular topic, I can create my own custom RSS feed of those posts that I can use as the basis of a reading list, for example, or to feed a Netvibes page on a particular topic, or even to feed an RSS2PDF service*?

* these needn’t be really horrible and divisive… For example, the Feedjournal service will take in an RSS feed and produce a rather nice looking newspaper version of your feed… ;-)

Now it just so happens, I’ve prepared one of these earlier. In particular, I’ve posted a small collection of blog posts on the topic of WordPress from a variety of (WordPress) blogs at http://delicious.com/psychemedia/singlefeeddemo:

You’ll notice that I can get an RSS feed of this list out too: from http://delicious.com/rss/psychemedia/singlefeeddemo in fact.

Now the links I’ve bookmarked are links to the original HTML page version of each blog post; but all it takes is the simple matter of rewriting those URLs by adding ?feed=rss2&withoutcomments=1 on to the end of them to get the RSS version of each post.

Hmm… Yahoo Pipes, where are you? Let’s just pull in the RSS feed of those WordPress hosted blog post bookmarks, and rewrite the URLs to their single item RSS feed equivalent:

Now we can loop through each of those items, and replace it with the actual content of those single item RSS feeds:

The output of the pipe is then a real RSS feed that contains items that correspond to the content of WordPress blog posts that I have bookmarked on delicious.

Now just think about this for a moment: most RSS feeds are transitory – the content that appears in the feed on a blog post is a reverse chronological list of the 10 or 20 most recent items on the blog (or in a particular category on a particular blog). The feed we are pulling in to this pipe may be fixed (e.g. if we create a list of bookmarks tagged in a particular way, and then don’t tag any more bookmarks in that way) and used to create a very specific a list of blog posts from all over the web. By rewriting the URLs to get the RSS version of each bookmarked post, we can create our own full RSS feed of those list items. (Actually, that isn’t quite true – if the blog is configured to only emit partial RSS feeds, we’ll only get a partial version of a post, typically the first sentence or two.)

(Pipes’ homepages only show preview versions of a feed description, even if the full description is available.)

Just to recap, here’s the whole pipe:

We take in a list of bookmarked URLs that correspond to bookmarked WordPress blog posts, and generate the single item RSS feed URL for each post. We then use these URLs to pull in the content for each post, and this create out own, full content custom RSS feed. The pipe itself emits RSS, so w can take the RSS feed from the pipe and feed it into any service that consumes RSS, such as Feedjournal:

Alternatively, I could subscribe to the pipe’s output feed in somewhere like Netvibes (or even a VLE) and then view the contents of my customised feed in that location. Or I could import that feed into a new WordPress blog. And so on…

Now of course I appreciate that many people will still say: so what? But it’s a start… a small step towards a world in which I can declare an arbitrary list of links to content spread all over the web and then pull it into a single location where I can consume it, or process it further, such as converting it into a PDF (which is a preferred way of consuming large chunks of content for many people) or even delivering it in drip feed fashion over an extended period of time as a serialised RSS feed, for example.

An exercise for the interested reader: clone the pipe and modify it so that it will accept as user input an RSS URL so that the pipe can be used to consume any social bookmarking service RSS feed.

Note: as the pipe stands, the order of items in the feed will correspond to the order in which they were bookmarked. It is possible to tag each bookmark with its desired position in the RSS feed, but that is a rather more advanced topic. (See a soon to be(?!)* deprecated solution to that problem here: Ordered Lists of Links from delicious Using Yahoo Pipes.

* If @hapdaniel hasn’t already published a more elegant solution to this problem using YQL Execute somewhere, I’ll try to do so when I get a chance…

PS ho hum, maybe we don’t need RSS after all: Instapaper, Del.icio.us, Yahoo! Pipes and being Slack (via @mediaczar)