Tinkering With Data Consuming WordPress Plugins

Now that I’ve got my own webspace on Reclaim Hosting and set up a handful of my own self-hosted WordPress blogs, I started having a tinker with some custom plugins. I’ve started off with something simple, a couple of shortcode plugins that I can use to embed a simple leaflet map into a post or a page.


So far, I’ve tried a couple of different approaches…

The first one has been pared down to a shortcode that, by default, takes no arguments:

[IWPlanningLeafletMap]

The shortcode calls a bit of code that makes a cached request (refreshed at most three times a day) to a morph.io scraper that returns a list of planning applications to the Isle of Wight Council that are currently open for consultation. (That said, I can pass in a few shortcode parameters relating to the size of the map (which is a fixed size, unfortunately – I haven’t worked out how to make it flexible…) and the default location and zoom.) The scraper itself runs once daily.
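The caching idea is simple enough to sketch. The plugin itself is PHP, so the Python below is only an illustration of the "at most three times a day" logic, not the plugin's code; the names `CachedFetcher` and `fetch_fn`, and the eight hour lifetime, are assumptions of mine.

```python
import time


class CachedFetcher:
    """Call fetch_fn at most once per ttl_seconds; otherwise reuse the last result."""

    def __init__(self, fetch_fn, ttl_seconds=8 * 60 * 60, clock=time.time):
        self.fetch_fn = fetch_fn        # hypothetical function that queries the scraper
        self.ttl_seconds = ttl_seconds  # 8 hours means at most three fetches a day
        self.clock = clock              # injectable so the expiry can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self.clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self.ttl_seconds
        if expired:
            # Cache is empty or stale: hit the scraper and remember when we did
            self._value = self.fetch_fn()
            self._fetched_at = now
        return self._value
```

Every call to `get()` inside the lifetime returns the stored result, so however often the page is loaded the scraper is only queried when the cache has gone stale.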


My feeling is that this sort of code would work best in the context of a WordPress page, acting as a destination that allows folk to just check on currently open applications.

The second plugin embeds a map that displays markers for recent house sales (as recorded by the Land Registry prices paid dataset). This dataset is published as a monthly set of data a month or two after the fact and is downloaded to my local desktop. A python script then reads in the data, creates a new WordPress post containing the shortcode with the data baked in, and uploads the post to WordPress (where it appears in the draft queue).


In this shortcode, the marker data is currently encoded using the PHP serialise format (via the python phpserialize dumps method) and embedded in the post as the value of a shortcode attribute.

[MultiMarkerLeafletMap zoom=11 lat=50.675 lon=-1.32 width=800 height=500 markers='a:112:{i:0;a:5:{s:3:"lat";d:50.699382843;s:4:"date";s:10:"2015-07-10";s:3:"lon";d:-1.29297620442;s:8:"location";s:22:"SAVOY COURT, TOWN...]

In this case, with the marker data baked into the shortcode, there’s a good argument for rendering the map within a timestamped post as a fixed map (at least, ‘fixed’ in the sense that the data is unchanging).

The PHP un/serialize route isn’t ideal because I think it raises security issues? I originally tried to pass the data as serialised JSON, but the data is in the form of a list and it seems the ] breaks things. I guess what I should really do is see if I can pass the data in as serialised JSON between the shortcode tags rather than pass it in as an attribute?
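Sketched from the Python side that generates the post, the "between the tags" idea might look something like the following. The enclosing form of the MultiMarkerLeafletMap shortcode is hypothetical (the plugin as it stands only takes a markers attribute), the function name `build_map_shortcode` is mine, and the PHP handler would need to decode the enclosed JSON itself; I haven't checked how WordPress content filters treat the enclosed text, so treat this as untested.

```python
import json


def build_map_shortcode(markers, zoom=11, lat=50.675, lon=-1.32, width=800, height=500):
    """Return an enclosing shortcode with the marker data as a JSON body."""
    attrs = "zoom={} lat={} lon={} width={} height={}".format(zoom, lat, lon, width, height)
    # The JSON list (square brackets and all) goes between the tags, not in an attribute
    body = json.dumps(markers)
    return "[MultiMarkerLeafletMap {}]{}[/MultiMarkerLeafletMap]".format(attrs, body)


# Illustrative marker only, cut down from the example shortcode above
markers = [{"lat": 50.699382843, "lon": -1.29297620442, "date": "2015-07-10"}]
shortcode = build_map_shortcode(markers)
```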

Another approach might be to define just a simple map embed shortcode, and then support additional shortcodes that add markers to the map?

My PHP coding is also a bit scrappy (and I keep forgetting the ;’s… :-( ). I think one thing I do need to do is pop the simple plugin code functions into a class to keep them safe, and also patch in a hack or two (as seems to be required?) so that the leaflet map libraries are only loaded into post and page headers for posts/pages that actually contain one of the map shortcodes.

I’m also thinking now I need to find a way to support boundary lines and shape colouring?


Hmm…

Some Idle Thoughts on Managing Temporal Posts in WordPress

Now that I’ve got a couple of my own WordPress blogs running off the back of my Reclaim Hosting account, I’ve started to look again at possible ways of tinkering with WordPress.

The first thing I had a look at was posting a draft WordPress post from a script.

Using a WordPress role editor plugin (e.g. along the lines of this User Role Editor) it’s easy enough to create a new role with edit and upload permissions only [WordPress roles and capabilities], and create a new ‘autoposter’ user with that role. Code like the following then makes it easy enough to upload an image to WordPress, grab the URL, insert it into a post, and then submit the post – where it will, by default, appear as a draft post:

#Ish Via: http://python-wordpress-xmlrpc.readthedocs.org/en/latest/examples/media.html
from wordpress_xmlrpc import Client, WordPressPost
from wordpress_xmlrpc.compat import xmlrpc_client
from wordpress_xmlrpc.methods import media, posts
from wordpress_xmlrpc.methods.posts import NewPost

wp = Client('http://blog.example.org/xmlrpc.php', ACCOUNTNAME, ACCOUNT_PASSWORD)

def wp_simplePost(client,title='ping',content='pong, <em>pong</em>'):
    post = WordPressPost()
    post.title = title
    post.content = content
    response = client.call(NewPost(post))
    return response

def wp_uploadImageFile(client,filename):

    #mimemap
    mimes={'png':'image/png', 'jpg':'image/jpeg'}
    mimetype=mimes[filename.split('.')[-1]]
    
    # prepare metadata
    data = {
            'name': filename,
            'type': mimetype,  # mimetype
    }

    # read the binary file and let the XMLRPC library encode it into base64
    with open(filename, 'rb') as img:
            data['bits'] = xmlrpc_client.Binary(img.read())

    response = client.call(media.UploadFile(data))
    return response

def quickTest():
    # upload the image, embed its URL in the post body, then submit the post
    txt = "Hello World"
    txt=txt+'<img src="{}"/><br/>'.format(wp_uploadImageFile(wp,'hello2world.png')['url'])
    return wp_simplePost(wp,content=txt)

quickTest()

Dabbling with this then got me thinking about the different sorts of things that WordPress allows you to publish in general. It seems to me that there are essentially three main types of thing you can publish:

  1. posts: the timestamped elements that appear in a reverse chronological order in a WordPress blog. Posts can also be tagged and categorised and viewed via a tag or category page. Posts can be ‘persisted’ at the top of the posts page by setting them as a “sticky” post.
  2. pages: static content pages typically used to contain persistent, unchanging content. For example, an “About” page. Pages can also be organised hierarchically, with child subpages defined relative to a specified ‘parent’ page.
  3. sidebar elements and widgets: these can contain static or dynamic content.

(By the by, a range of third party plugins appear to support the conversion of posts to pages, for example Post Type Switcher [untested] or the bulk converter Convert Post Types [untested].)

Within a page or a post, we can also include a shortcode element that can be used to include a small piece of templated text, or text generated by executing some custom code (which it seems could be python: running a python script from a WordPress shortcode). Shortcodes run each time a page is loaded, although you can use the WordPress Transients database API to implement a simple cache for them to improve performance (eg as described here and here).

Within a post, page or widget, we can also embed dynamic content. For example, we could embed a map that displays dynamically created markers that are essentially out of the control of the page or post publisher. Note that by default WordPress strips iframes from content (and it also seems reluctant to allow the upload of html files to the media gallery, at least by default). The preferred way to include custom embedded content seems to be to define a shortcode to embed the required content, although there are plugins around that allow you to embed iframes. (I didn’t spot one that let you inline the content of the iframe using srcdoc though?)

When we put together the Isle of Wight planning applications : Mapped page, one of the issues related to how updates to the map should be posted over time.


That is, should the map be uploaded to a fixed page and show only the most recent data, should it be posted as a timestamped post, to provide archival copies of the page, or should it be posted to a page and support a timeslider/history function?

Thinking about this again, the distinction seems to rely on what sort of (re)discovery we want to encourage or support. For example, if the page is a destination page, then we should probably use a page with a fixed URL for the most recent map. Older maps could be accessed via archive links, or perhaps subpages, if a time-filter wasn’t available on a single map view. Alternatively, we might want to alert readers to the map, in which case it might make more sense to use a timestamped post. (We could of course use a post to announce an update to the page, perhaps including a screenshot of the latest map in the post.)

It also strikes me that we need to consider publication schedules by a news outlet compared to the publication schedules associated with a particular dataset.

For example, Land Registry House Prices Paid data is published on a monthly basis, a few weeks after the end of the month to which the data relates. In this case, it probably makes sense to publish on a monthly basis.

But what about care home or food outlet inspection data? The CQC publish data as it becomes available, although searches support the retrieval of data for a particular area published over the last week or last month relative to the time the search is made. The Food Standards Agency produce updates to data download files on a daily basis, but the file for any particular area is only updated when it contains new data. (So on any given day, you don’t know which, if any, area files will be updated.)

In this case, a news outlet may well want to do a couple of things:

  • publish summaries of reports over the last week or last month, on a weekly or monthly schedule – “The CQC published reports for N care homes in the region over the last month, of which X were positive and Y were negative”, etc.
  • engage in a more immediate or responsive publication of stories around particular reports as they are published by the responsible agency. In this case, the journalist needs to find a way of discovering stories in a timely fashion, either through signing up to alerts or inspecting the agency site on a regular basis.

Again, it might be that we can use posts and pages in a complementary way: pages that act as fixed destination sites with a fixed URL, and perhaps links off to archived historical sub-pages, as well as related news stories, that contain the latest summary; and posts that announce timely reports as well as ‘page updated’ announcements when the slower-changing page is updated.

More abstractly, it probably makes sense to consider the relative frequencies with which data is originally published (also considering whether the data is published according to a fixed schedule, or in a more responsive way as and when data becomes available), the frequency with which journalists check the data site, and the frequency with which journalists actually publish data related stories.

Single Page RSS Feeds – So What? So this…

Having posted about Single Item RSS Feeds on WordPress blogs: RSS For the Content of This Page, it struck me that whilst this facility might be of interest to a very, very select few, most people would probably have the response: so what?

To answer that question, it might help if I let you into a little secret: I’m not really that into content, open educational or otherwise. What I am interested in is how content can flow around the web, and how it can be re-presented in different ways and different places around the web by different people, all pulling on the same source.

So if we consider single page RSS feeds, what this means is that I can re-present the content of any of my WordPress blogged posts anywhere that accepts RSS. So for example, I could view just that post as a Wordle generated word cloud, or subscribe to the RSS version of a single blog post on a Netvibes page (maybe along with other related posts):

and view the post in that location:

(At the moment not many other platforms appear to offer single page RSS feeds. I was hopeful that the Guardian might, because they have quite a well developed feed platform, but I couldn’t find a way to grab a single page feed trivially from a page URI :-( )

To see why that might be useful, you need to know another of my little secrets. I don’t really think of RSS feeds being used to transport new content, such as the latest posts from the many blogs I still subscribe to. For sure, they can be used for that purpose, and a great many RSS readers are set up to accommodate that sort of use (only showing you feed items you haven’t already read, for example), but that is a special case. The more general case is simply that feeds are used to transport content that has quite a simple structure around the web. And this content might be fixed, static, immutable. That is, the content of the feed might never change once the feed has been created, as in the case of OpenLearn course unit full content RSS feeds.

AS AN ASIDE… I generally think of RSS feeds as providing a way of transporting simple content “items” around where each item has a quite simple structure:

If you think of a blog post or news article as an item, the title is hopefully obvious (the title of the post/article), the description is the content “body” of the item (e.g. the text content of the news article) and the link is the URL of where that post or article can be found on the web. The other elements are optional: what I refer to as annotations correspond to things like latitude and longitude co-ordinates that can be used to add geographical information to the item so that it can be plotted on a map, for example; and what I term a payload would be something like an audio file that gets delivered when you subscribe to an RSS podcast feed from somewhere like iTunes or IT Conversations.
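As a rough sketch of that item structure, here is one way of building such an item in Python. The function name `make_item` is mine, and I'm assuming the W3C Basic Geo elements for the latitude/longitude annotation and an RSS 2.0 enclosure for the payload; other feeds may carry the same information in other ways.

```python
import xml.etree.ElementTree as ET

GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"
ET.register_namespace("geo", GEO_NS)


def make_item(title, description, link, lat=None, lon=None,
              enclosure_url=None, enclosure_type="audio/mpeg"):
    """Build an RSS <item>: three core elements plus optional annotation and payload."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "description").text = description
    ET.SubElement(item, "link").text = link
    if lat is not None and lon is not None:
        # Annotation: geographical co-ordinates so the item can be plotted on a map
        ET.SubElement(item, "{%s}lat" % GEO_NS).text = str(lat)
        ET.SubElement(item, "{%s}long" % GEO_NS).text = str(lon)
    if enclosure_url is not None:
        # Payload: e.g. the audio file of a podcast episode (length 0 is a placeholder)
        ET.SubElement(item, "enclosure", url=enclosure_url, type=enclosure_type, length="0")
    return item
```

Calling `ET.tostring(make_item("A post", "Body text", "http://example.com/post"), encoding="unicode")` gives the bare title/description/link item; the optional arguments add the extras.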

Once you start viewing RSS feeds as a general transport mechanism, then you start to see the world in a slightly different way… So for example: the Single Item RSS Feeds post (https://ouseful.wordpress.com/2009/07/08/single-item-rss-feeds-on-wordpress-blogs-rss-for-the-content-of-this-page/) reveals how to create single item RSS feeds from the URL of a blog post hosted on WordPress. Now if I bookmark a series of WordPress hosted blog posts to somewhere like the delicious.com social bookmarking site, and tag them all in the same way, I can get an RSS feed out that contains a list of posts that can be obtained in XML form (that is, as single item RSS feeds).

Hmmm….

So maybe if I find a series of posts from WordPress blogs all over the world on a particular topic, I can create my own custom RSS feed of those posts that I can use as the basis of a reading list, for example, or to feed a Netvibes page on a particular topic, or even to feed an RSS2PDF service*?

* these needn’t be really horrible and divisive… For example, the Feedjournal service will take in an RSS feed and produce a rather nice looking newspaper version of your feed… ;-)

Now it just so happens, I’ve prepared one of these earlier. In particular, I’ve posted a small collection of blog posts on the topic of WordPress from a variety of (WordPress) blogs at http://delicious.com/psychemedia/singlefeeddemo:

You’ll notice that I can get an RSS feed of this list out too: from http://delicious.com/rss/psychemedia/singlefeeddemo in fact.

Now the links I’ve bookmarked are links to the original HTML page version of each blog post; but all it takes is the simple matter of rewriting those URLs by adding ?feed=rss2&withoutcomments=1 on to the end of them to get the RSS version of each post.
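For anyone without a pipe to hand, the rewrite is easy enough to sketch in Python. The function name `single_item_feed_url` is just my label for it; the only thing taken from WordPress is the pair of query arguments.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def single_item_feed_url(post_url):
    """Return the single item RSS feed URL for a WordPress post URL."""
    parts = urlsplit(post_url)
    # Keep any query arguments the URL already has, then add the feed ones
    query = dict(parse_qsl(parts.query))
    query.update({"feed": "rss2", "withoutcomments": "1"})
    # Any #fragment is dropped, since it means nothing to the feed
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))
```

Going via the URL parser rather than simply gluing the string on the end means a bookmarked URL that already carries a query string or a fragment still comes out well formed.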

Hmm… Yahoo Pipes, where are you? Let’s just pull in the RSS feed of those WordPress hosted blog post bookmarks, and rewrite the URLs to their single item RSS feed equivalent:

Now we can loop through each of those items, and replace it with the actual content of those single item RSS feeds:

The output of the pipe is then a real RSS feed that contains items that correspond to the content of WordPress blog posts that I have bookmarked on delicious.

Now just think about this for a moment: most RSS feeds are transitory – the content that appears in the feed on a blog is a reverse chronological list of the 10 or 20 most recent items on the blog (or in a particular category on a particular blog). The feed we are pulling in to this pipe may be fixed (e.g. if we create a list of bookmarks tagged in a particular way, and then don’t tag any more bookmarks in that way) and used to create a very specific list of blog posts from all over the web. By rewriting the URLs to get the RSS version of each bookmarked post, we can create our own full RSS feed of those list items. (Actually, that isn’t quite true – if the blog is configured to only emit partial RSS feeds, we’ll only get a partial version of a post, typically the first sentence or two.)

(Pipes’ homepages only show preview versions of a feed description, even if the full description is available.)

Just to recap, here’s the whole pipe:

We take in a list of bookmarked URLs that correspond to bookmarked WordPress blog posts, and generate the single item RSS feed URL for each post. We then use these URLs to pull in the content for each post, and thus create our own, full content custom RSS feed. The pipe itself emits RSS, so we can take the RSS feed from the pipe and feed it into any service that consumes RSS, such as Feedjournal:
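The same recipe can be roughed out in Python, for anyone who would rather not rely on Pipes. This is a sketch rather than a translation of the pipe: it assumes the bookmark feed is laid out as RSS 2.0 (rss/channel/item/link), that the bookmarked URLs carry no query string of their own, and the names `full_content_feed` and `fetch_fn` are mine.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

FEED_SUFFIX = "?feed=rss2&withoutcomments=1"


def fetch(url):
    """Default fetcher: return the raw bytes found at url."""
    with urlopen(url) as response:
        return response.read()


def full_content_feed(bookmark_feed_url, fetch_fn=fetch):
    """Return an RSS document whose items come from each bookmarked post's own feed."""
    bookmarks = ET.fromstring(fetch_fn(bookmark_feed_url))
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Full content feed"
    for link in bookmarks.findall("./channel/item/link"):
        # Rewrite the bookmarked page URL to its single item feed equivalent
        post_feed = ET.fromstring(fetch_fn(link.text.strip() + FEED_SUFFIX))
        # Replace the bookmark with the item(s) carried by the post's own feed
        channel.extend(post_feed.findall("./channel/item"))
    return ET.tostring(rss, encoding="unicode")
```

Passing the fetcher in as an argument is only there so the function can be tried out against canned XML before being pointed at live feeds.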

Alternatively, I could subscribe to the pipe’s output feed in somewhere like Netvibes (or even a VLE) and then view the contents of my customised feed in that location. Or I could import that feed into a new WordPress blog. And so on…

Now of course I appreciate that many people will still say: so what? But it’s a start… a small step towards a world in which I can declare an arbitrary list of links to content spread all over the web and then pull it into a single location where I can consume it, or process it further, such as converting it into a PDF (which is a preferred way of consuming large chunks of content for many people) or even delivering it in drip feed fashion over an extended period of time as a serialised RSS feed, for example.

An exercise for the interested reader: clone the pipe and modify it so that it will accept as user input an RSS URL so that the pipe can be used to consume any social bookmarking service RSS feed.

Note: as the pipe stands, the order of items in the feed will correspond to the order in which they were bookmarked. It is possible to tag each bookmark with its desired position in the RSS feed, but that is a rather more advanced topic. (See a soon to be(?!)* deprecated solution to that problem here: Ordered Lists of Links from delicious Using Yahoo Pipes.)

* If @hapdaniel hasn’t already published a more elegant solution to this problem using YQL Execute somewhere, I’ll try to do so when I get a chance…

PS ho hum, maybe we don’t need RSS after all: Instapaper, Del.icio.us, Yahoo! Pipes and being Slack (via @mediaczar)

Single Item RSS Feeds on WordPress blogs: RSS For the Content of This Page

At Mash Oop North yesterday, Brian Kelly asked me how I got the “RSS for the content of this page” link onto my (hosted) WordPress blog:

Clicking the link on an arbitrary blog post page turns up an RSS feed containing just a single item: the content of that blog post.

The trick is quite simple, and relies on a couple of things.

The first thing you need to know is that you can get a single item RSS feed containing an RSS version of a single WordPress blog page by adding ?feed=rss2&withoutcomments=1 to the end of the page URL.

So for example, the RSS version of the post that lives here:
http://ukwebfocus.wordpress.com/2009/07/06/enthusiastic-amateurs-and-overcoming-institutional-inertia/
on Brian’s blog can be found here:
http://ukwebfocus.wordpress.com/2009/07/06/enthusiastic-amateurs-and-overcoming-institutional-inertia/?feed=rss2&withoutcomments=1

The second thing you need to be aware of is how web browsers handle links that appear in a web page, and in particular how they handle relative links. Relative links are most easily thought of as links in a web page that do not specify the domain of the link. So for example, on this blog, the domain is ouseful.wordpress.com. Links to posts on OUseful.info look something like the following:

https://ouseful.wordpress.com/2009/07/07/mash-oop-north-pipes-mashup-by-way-of-an-apology/

An absolute way of writing this as a link in a web page would be to write the link in an HTML anchor tag as follows:

<a href="https://ouseful.wordpress.com/2009/07/07/mash-oop-north-pipes-mashup-by-way-of-an-apology/">

That is, we specify the domain (https://ouseful.wordpress.com) and the path to the resource as well as the resource page itself.

A relative link would be written as follows:

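<a href="/2009/07/07/mash-oop-north-pipes-mashup-by-way-of-an-apology/">

with the browser filling in the gaps using the domain that the page itself is served from (https://ouseful.wordpress.com). (Note the leading slash: without it, the browser resolves the link relative to the path of the current page rather than the root of the domain.)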

(For a basic grounding in how browsers handle relative links, see Absolute vs. Relative Paths/Links. If you want the hardcore standards stuff, you should read the original RFC: RFC 1808: Relative Uniform Resource Locators.)

One further thing to know about relative links is that if you use something of the form ?foo=bar in the link (e.g. <a href="?foo=bar">), the browser will add the argument to the end of the current page’s URL. So if the page mypage.html being served from http://example.com contains the relative link <a href="?foo=bar"> that link will actually point to http://example.com/mypage.html?foo=bar.
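If you want to check how a query-only link resolves without opening a browser, Python's standard library follows the same resolution rules. A quick sketch (the variable names are mine):

```python
from urllib.parse import urljoin

# A query-only relative link keeps the current page's path
resolved_query = urljoin("http://example.com/mypage.html", "?foo=bar")

# If the current URL already has a query string, the link's one replaces it
resolved_replaced = urljoin("http://example.com/mypage.html?x=1", "?foo=bar")

# The same trick applied to a WordPress post URL
post = "https://ouseful.wordpress.com/2009/07/07/mash-oop-north-pipes-mashup-by-way-of-an-apology/"
resolved_feed = urljoin(post, "?feed=rss2&withoutcomments=1")
```

Here `resolved_query` and `resolved_replaced` both come out as http://example.com/mypage.html?foo=bar, and `resolved_feed` is the post URL with the feed arguments on the end.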

Putting these two things together (how to create a URI for the single item RSS feed version of a post, and how to construct relative URIs), we are now in a position to add an ‘RSS version of this page’ link to a WordPress blog sidebar.

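So, to get the single item RSS feed link, go to the Widgets settings area of your WordPress blog and add a text widget containing a relative link, something along the lines of:

<a href="?feed=rss2&withoutcomments=1">RSS for the content of this page</a>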

Okay, Brian?:-)

WP_LE

And so it came to pass that the campus was divided.

The LMS had given way to the VLE and some little control was given over to the instructors that they might upload some of their own content to the VLE, yet woe betide any who tried to add their own embed codes or script tags, for verily it is evil and the devil’s own work…

And in the dark recesses of the campus, the student masses were mocked with paltry trifles thrown to them in the form of a simple blogging engine, that they might chat amongst each other and feel as if their voice was being heard…

But over time, the blogging engine did grow in stature until such a day that it was revealed in its fullest glory, and verily did the VLE cower beneath the great majesty of that which came to be known as the WP_LE…

…or something like that…

Three posts, from three players, who just cobbled together something that could well work at institutional scale…

  1. New digs for UMW Blogs, or the anatomy of a redesign: an “anatomy of the redesign of UMW Blogs” (WordPress MU), describing sitewide aggregation, tag clouds and all sorts of groovy stuff on the homepage, along with courses, support and contact pages;
  2. Reuse, resources, re-whatever…: showing how Mediawiki can now be used in all sorts of ways to feed wiki content into WordPress… (just think about it: this is the bliki concept working for real on two best-of-breed, open source platforms…);
  3. Batch adding users to a WordPress site: “import users into a site. All you need to provide is a username and email address for each student and it will create the account, generate a password, assign the specified user Role, and send an email to the student so they can login”…

So what do we have here? WordPress MU and Mediawiki working together to provide a sitewide, integrated publishing platform. The multi-user import “doesn’t create blogs for each student” but I think that’s something that could be fixed easily enough, if required…

Thus far, we’ve been pretty quiet here at the OU on the WordPress and Mediawiki front, although both platforms are used internally… but just before the summer, as one of the final OpenLearn projects, we got the folks over at Isotoma to put together a couple of WordPress and WordPress MU widgets.

Hopefully we’ll be making them available soon, along with some demo sites, but for now, here’s a tease of what we’ve pulled together.

Now you may or may not remember the Reverend’s edupunkery that resulted in Proud Spammer of Open University Courses, a demo of how to import an OpenLearn unit content RSS feed into a WordPress blog…?

Well we’ve run with that idea – and generalised it a little – so that you can take any of the OpenLearn topic/subject area feeds (that list a set of units in a particular topic) and set up each of the courses itemised in the list with its own WordPress MU blog. Automatically. At the click of a button. What this means is that if you want to create a collection of course unit blogs using OpenLearn units, you can do it in one go…

Now there are a few issues with some of the links that are pulled into the blogs from the OpenLearn feeds, and there’s some dodgy bits of script that need thinking about, but at the very least we now have a bulk spamming of OpenLearn courses tool… And if we can get a fix going with the imported, internal unit blog links, and maybe some automated blog tagging and categorising done at import time, then there is plenty of scope for emergent uncourse link mapping across and between OpenLearn WP MU course units…

Using separate WordPress MU blogs to publish unchanging “static” courses is one thing of course – the blog environment makes it easy to comment and publicly annotate each separate unit page. But compare these fixed, unchanging blog courses with how you might consume a blogged (un)course the first time it was presented… Assuming that pages were posted as they were written over the life of the course, you get each new section as a new post in your feed reader every day or two…

So step in an old favourite of mine – daily feeds. (Anyone remember the OpenLearn_daily experiment that would deliver an OpenLearn unit via a feed over several days, relative to the day you first subscribed to it?) Our second offering is a daily feeds widget for WordPress. Subscribe to a daily feed, and you’ll get one item a day from a static course unit blog in your feed reader, starting with the first item in the course unit on the first day.
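The release logic behind a daily feed boils down to a date calculation, which can be sketched as follows. This is not the widget's code: how the widget actually records the day you subscribed isn't described here, so the `subscribed_on` argument and the name `items_due` are assumptions for the sake of illustration.

```python
from datetime import date


def items_due(items, subscribed_on, today):
    """Return the items released so far: one per day, starting on the subscription day."""
    days_elapsed = (today - subscribed_on).days
    if days_elapsed < 0:
        # Not subscribed yet, so nothing has been released
        return []
    # Day 0 releases the first item, day 1 the first two, and so on
    return items[: days_elapsed + 1]
```

So for a three item unit, a subscriber sees one item on the day they subscribe, two the day after, and the whole unit from the third day onwards, whenever it was that they happened to subscribe.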

Taking the two widgets together, we can effectively create a version of OpenLearn in which each OpenLearn unit will be delivered via its own WP MU blog, and each unit capable of being consumed via a daily feed…

A couple of people have been trying out the widgets already, and if anyone else would like a “private release” copy of the code to play with before we post it openly, please get in touch….