OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for July 2010

What Else I Have Missed Recently? More Subscription Management Tools in Delicious

with 4 comments

Long time readers of this blog might remember that I used to play in public with the delicious social bookmarking service all the time, advocating the use of social bookmarks through presentations and workshops, as well as demoing various visualisations; but whilst I now use delicious regularly as a place to dump bookmarks for future reference(?!), as well as the occasional bit of syndication through my augmented blog feed via a feedthru tag, I haven’t really been keeping up with the innovations they occasionally, and quietly, roll out.

If you haven’t looked at delicious lately, here are a few things you can do with it that you might not have realised:

- browse links in delicious: for a particular set of links, browse through the web pages one at a time. If you do presentations that demonstrate, or show off, a lot of web pages, this can be a really handy tool (e.g. Browse Links in Delicious – Another OUseful Prototype Unprediction Comes True:-));

- advanced search operators: as well as being able to search through your bookmarks, your network’s bookmarks, or everyone’s:

Search facets in delicious

you can also filter within a particular tag:
Search bookmarks filtered by tag in delicious

Tag based filtered searching is also possible using the tag: search limit, as is searching by site: or filetype:

Search limits in delicious

Something that was definitely new to me as I was having a play yesterday were tag based subscriptions, subscription bundles, and network bundles.

Tag based subscriptions allow you to subscribe to a feed of links that are being bookmarked with a particular tag. For example, you might use this approach to subscribe to a list of links being tagged with a course code:

Delicious tag subscriptions

Subscription bundles allow you to collect a stream of links from several tag subscriptions together. So for example, if you are subscribing to several conference tag bookmark feeds, you could collate them all in one big “conference” subscription bundle. (Essentially, subscription bundles allow you to subscribe to a set of OR’d tags). It’s also possible to subscribe to a tag as used by a particular user:

Add a subscription to a particular user and tag in delicious

Then we can create subscription bundles around separate subscriptions:

manage subscription bundles in delicious

We can also search within the metadata (title, description) of bookmarks collected across my subscriptions.

Network bundles allow you to group different members of your delicious network together so that you can view their combined new bookmarks in a single feed (i.e. providing the ability to look over recent bookmarks from an OR’d set of users in a single place). Creating a bundle is easy:

Creating a network bundle in delicious

Then we can view a feed of the new bookmarks saved by those users:

Network bundles in delicious

As to how you can get hold of this information for your own purposes, there are several relevant feeds provided:

Bookmarks from a user’s subscriptions:
http://feeds.delicious.com/v2/{format}/subscriptions/{username}

Bookmarks from members of a user’s network:
http://feeds.delicious.com/v2/{format}/network/{username}

Bookmarks from members of a user’s network by tag:
http://feeds.delicious.com/v2/{format}/network/{username}/{tag[+tag+...+tag]}

We can also get a feed from a network bundle. For example:
http://feeds.delicious.com/v2/rss/network/psychemedia/bundle:edtech

or from a subscription bundle, as this URL demonstrates:
http://feeds.delicious.com/v2/rss/subscriptions/psychemedia/bundle:Cogdog%20Horizon%20Scan

Finally, we can get feeds out based on a user’s “social network” in delicious. For example:

- a list of a user’s network members:
http://feeds.delicious.com/v2/{format}/networkmembers/{username}

- a list of a user’s network fans:
http://feeds.delicious.com/v2/{format}/networkfans/{username}
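The URL patterns above are regular enough to build programmatically. Here is a minimal sketch that only assembles the URL strings and doesn't fetch anything; the function name `delicious_feed` and its arguments are my own invention for illustration, not part of any delicious library:

```python
from urllib.parse import quote

BASE = "http://feeds.delicious.com/v2"

def delicious_feed(kind, username, fmt="rss", tags=None, bundle=None):
    """Build a feed URL following the patterns listed in the post.

    kind is the path segment, e.g. "subscriptions", "network",
    "networkmembers" or "networkfans" (hypothetical helper).
    """
    url = f"{BASE}/{fmt}/{kind}/{quote(username)}"
    if bundle is not None:
        # bundle names may contain spaces, so percent-encode them
        url += "/bundle:" + quote(bundle)
    elif tags:
        # multiple tags are joined with "+"
        url += "/" + "+".join(quote(t) for t in tags)
    return url

print(delicious_feed("network", "psychemedia", bundle="edtech"))
print(delicious_feed("subscriptions", "psychemedia", bundle="Cogdog Horizon Scan"))
```

The two printed URLs should match the network bundle and subscription bundle examples given above.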

What’s of interest to me about these tools is the way they provide different ways of helping you organise sets of links based on who has bookmarked them and how they have been tagged. Even if you don’t use delicious, I think it’s important to at least be aware that this model exists for managing and routing resources discovered by others in a social setting.

PS [via @deburca]: it seems that you can now attach a license to your feeds…:

Licensing feeds in delicious

I’m not sure I understand how this works? I can see there may be an argument for claiming some sort of database right as a result of the way a collection is put together, but copyright? Hmmm… maybe the copyright really applies to the description and tag content used to describe the bookmarks?

Written by Tony Hirst

July 15, 2010 at 12:56 pm

Posted in Library, Search


Google Charts Now Plot Functions

with 3 comments

I didn’t get round to posting this at the time it was announced, but as I’ve got a few posts on a similar theme already (e.g. RESTful Image Generation – When Text Just Won’t Do) I think it’s worth a quick post for continuity, if nothing else: Google Charts support for TeX images and formula plotting (i.e. provide an equation and it will give you an image back of the formula plotted out); there’s also an interactive Google Charts Playground that I hadn’t seen before…

You can also add labels to images…

So for example, take this URL:

http://chart.apis.google.com/chart
?cht=lc
&chd=t:-1|15,45
&chs=250x150&chco=FF0000,000000
&chfd=0,x,0,11,0.1,sin(x)*50%2B50
&chxt=x,y
&chm=c,00A5C6,0,110,10|a,00A5C6,0,60,10

and it delivers this:

Google chart demo

The following is also plotted:

google chart formula plot

from this URL:

http://chart.apis.google.com/chart
?cht=lxy&chs=250x250&chd=t:0|0|0
&chxs=0,ff0000,12,0,lt|1,0000ff,10,1,lt
&chfd=0,x,0,360,1.9,sin(4*x)*40%2b50|1,y,0,360,1.9,cos(6*y)*40%2b50
&chf=c,lg,90,FFFF00,0,FF9933,1&chco=006699

On the TeX front, this URL:
http://chart.apis.google.com/chart
?cht=tx
&chl=x%20=%20%5Cfrac%7B-b%20%5Cpm%20%5Csqrt%20%7Bb%5E2-4ac%7D%7D%7B2a%7D

delivers:
google TeX chart demo

Getting the escaping right can be a bit of a pain, but the interactive playground makes things slightly easier:

Google chart playground

(You still need to escape things like “+” signs, though…)
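One way to take some of the pain out of the escaping is to let a URL library do it. The following is a rough sketch using Python's standard `urllib.parse.quote`; the helper name `chart_url` and the choice of which characters to leave unescaped are my assumptions, picked so that the output matches the hand-escaped URLs above:

```python
from urllib.parse import quote

CHART_BASE = "http://chart.apis.google.com/chart"

def chart_url(params):
    """Join chart parameters into a URL, percent-encoding the values.

    Characters the chart syntax itself relies on (= , | : ( ) *) are
    left alone; everything else, including "+" and "\\", is escaped.
    """
    query = "&".join(f"{k}={quote(v, safe='=,|:()*')}" for k, v in params.items())
    return f"{CHART_BASE}?{query}"

# the quadratic formula, written as ordinary TeX
tex = r"x = \frac{-b \pm \sqrt {b^2-4ac}}{2a}"
print(chart_url({"cht": "tx", "chl": tex}))

# a function plot: the "+" in the formula becomes %2B
print(chart_url({"cht": "lc", "chs": "250x150", "chfd": "0,x,0,11,0.1,sin(x)*50+50"}))
```

The first URL printed should be the same as the TeX example above, and the second shows the “+” sign being turned into %2B.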

Written by Tony Hirst

July 13, 2010 at 1:34 pm

Posted in Anything you want

Using Twitter Lists to Define Custom Search Engines

with 7 comments

A long time ago, I used to play with search engines all the time, particularly in the context of bounded search (that is, search over a particular set of web pages or web domains, e.g. Search Hubs and Custom Search at ILI2007). Although I’m not at IWMW this year, I can’t not have an IWMW related tinker, so here’s a quick play around IWMW related twittering folk…

To start with, let’s have a look at the IWMW Twitter account:

IWMW lists

We see there are several twitter lists associated with the account, including one for participants…

Looking around the IWMW10 website, I also spy a community area, with a Google Custom search engine that searches over institutional web management blogs that @briankelly, I presume, knows about:

Institutional Web Management blogs search engine

It seems a bit of a pain to manage though… “Please contact Brian Kelly if you would like your blog to be included in this list of blogs which are indexed”

Ever one to take the lazy approach, I wondered whether we could create a useful search engine around the URLs disclosed on the public Twitter profile page of folk listed on the various IWMW Twitter lists. The answer is “not necessarily”, because the URLs folk have posted on their Twitter profiles seem to point all over the place, but it’s easy enough to demonstrate the raw principle.

So here’s the recipe:

- find a Twitter list with interesting folk on it;
- use the Twitter API to grab the list of members on a list;
- the results include profile information of everyone on the list – including the URL they specified as a home page in their profile;
- grab the URLs and generate an annotations file that can be used to import the URLs into a Google Custom Search Engine;
- note that the annotations file should include a label identifier that specifies which CSE should draw on the annotations:

Google CSE config

Once the file is uploaded, you should have a custom search engine built around the URLs that the folk on the Twitter list have revealed in their Twitter profiles (here’s my IWMW Participants CSE (list date: 12:00 12/7/10)).

Note that to create sensibly searchable URLs, I used the heuristics:

- if page URL is example.com or example.com/, search on example.com/*
- by default, if page is example.com/page.foo, just search on that page.
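As a standalone illustration, the two heuristics can be sketched as a small function. The name `cse_pattern` is hypothetical, and treating any URL that ends in a slash as a "search everything beneath it" case is my reading of the first rule rather than something spelled out above:

```python
def cse_pattern(url):
    """Turn a profile URL into a Google CSE URL pattern (hypothetical helper)."""
    # drop the scheme, since the patterns are written without it
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if "/" not in url:
        # bare domain such as example.com: search the whole site
        return url + "/*"
    if url.endswith("/"):
        # example.com/ : search everything beneath it
        return url + "*"
    # a specific page such as example.com/page.foo: search just that page
    return url

for u in ("http://example.com", "http://example.com/", "http://example.com/page.foo"):
    print(u, "->", cse_pattern(u))
```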

I used Python (badly!;-) and the tweepy library to generate my test CSE annotations feed:

import tweepy

#these are the keys you would normally use with oAuth
consumer_key=''
consumer_secret=''

#these are the special keys for single user apps from http://dev.twitter.com/apps
#as described in http://dev.twitter.com/pages/oauth_single_token
#select your app, then My Access Token from the sidebar
key=''
secret=''

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(key, secret)
api = tweepy.API(auth)

#this identifier is the identifier of the Google CSE you want to populate
cseLabelFromGoogle=''

listowner='iwmw'
tag='iwmw10participant'

#the api object created above is already authenticated via OAuth, so no basic auth handler is needed

f=open(tag+'listhomepages.xml','w')

cse=cseLabelFromGoogle

f.write("<GoogleCustomizations>\n\t<Annotations>\n")

#use the Cursor object so we can iterate through the whole list
for un in tweepy.Cursor(api.list_members,owner=listowner,slug=tag).items():
    if type(un) is tweepy.models.User:
        l=un.url
        if l:
            l=l.replace("http://","")
            if "/" not in l:
                #bare domain, e.g. example.com: search the whole site
                l=l+"/*"
            elif l.endswith("/"):
                #trailing slash, e.g. example.com/: search everything beneath it
                l=l+"*"
            #otherwise (e.g. example.com/page.foo) search on just that page
            f.write("\t\t<Annotation about=\""+l+"\" score=\"1\">\n")
            f.write("\t\t\t<Label name=\""+cse+"\"/>\n")
            f.write("\t\t</Annotation>\n")

f.write("\t</Annotations>\n</GoogleCustomizations>")

f.close()

(Here’s the code as a gist, with tweaks so it runs with OAuth.)

Running this code generates a file (named after the list and ending in listhomepages.xml) that contains Google custom search annotations for a particular Google CSE, based around the URLs declared in the public Twitter profiles of people listed in a particular list. This file can then be uploaded to the Google CSE environment and used to help configure a bounded search engine.
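Writing the XML by string concatenation works, but a URL containing a character such as “&” would produce a broken file. An alternative, sketched here with Python's standard `xml.etree.ElementTree`, builds the same element structure and lets the library handle escaping. The function name `build_annotations`, the example patterns and the label value are all made up for illustration:

```python
import xml.etree.ElementTree as ET

def build_annotations(patterns, cse_label):
    """Build a CSE annotations document for a list of URL patterns."""
    root = ET.Element("GoogleCustomizations")
    annotations = ET.SubElement(root, "Annotations")
    for pattern in patterns:
        # one Annotation per URL pattern, labelled with the target CSE
        annotation = ET.SubElement(annotations, "Annotation", about=pattern, score="1")
        ET.SubElement(annotation, "Label", name=cse_label)
    return ET.tostring(root, encoding="unicode")

# hypothetical patterns and label, just to show the shape of the output
xml_text = build_annotations(
    ["example.com/*", "example.org/page?a=1&b=2"],
    "_cse_hypothetical_label",
)
print(xml_text)
```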

So what does this mean? It means that if you have identified a set of people sharing a particular set of interests using a Twitter list, it’s easy enough to generate a custom search engine around the webpages or domains they have declared in their Twitter profile.

Written by Tony Hirst

July 12, 2010 at 12:18 pm

Posted in Tinkering


OPML Support for JISCPress and WriteToReply

with 2 comments

What’s the easiest way to read a document published on JISCPress or WriteToReply? One answer is just to read the document on the parent site, but another way is to pull the content into another space using the RSS/Atom syndication feeds that WordPress makes available, and that the digress.it plugin opens up even further.

In what follows, I’ll use URLs for example docs published on both JISCPress and WriteToReply:

- JISCPress, e.g. http://linkeddata.jiscpress.org
- WriteToReply, e.g. http://writetoreply.org/publicsectortransparencyboard

The simplest subscription option is just to subscribe to the document as an RSS feed:

- http://linkeddata.jiscpress.org/feed/
- http://writetoreply.org/publicsectortransparencyboard/feed/

This will pull the whole document into your feedreader, with each section of the document (i.e. each “page” of the doc as published on JISCPress/WriteToReply) as its own “blog post”.

Note that this form of subscription displays the posts in reverse order – to view the sections that make up the document in the “proper” order, we use URLs of the form:

- http://linkeddata.jiscpress.org/feed/?order=ASC
- http://writetoreply.org/publicsectortransparencyboard/feed/?order=ASC

To view the comments from the document as a whole, we need a URL that looks like:

- http://linkeddata.jiscpress.org/comments/feed/
- http://writetoreply.org/publicsectortransparencyboard/comments/feed/

It is also possible to get a separate RSS feed out of the platform for each page, as well as a separate comment feed for each page. For example, single item RSS feeds, where each page has an RSS feed with one item in it – the content of that page:

- http://linkeddata.jiscpress.org/executive-summary/?feed=rss2&withoutcomments=1
- http://writetoreply.org/publicsectortransparencyboard/ptb-terms-of-reference/?feed=rss2&withoutcomments=1

And for comment feeds on a page basis:
- http://linkeddata.jiscpress.org/the-semantic-web/feed
- http://writetoreply.org/publicsectortransparencyboard/ptb-terms-of-reference/feed

If you were viewing any of these sorts of feed in a feed reader such as Google Reader, you would be able to favourite and share each separate page or each separate comment, for example.

For an example of the sort of thing this makes possible, see An Example Netvibes Dashboard for the Digital Britain Interim Report on WriteToReply.

We can also get feeds out on a page basis where each paragraph has a separate feed item to itself:

- http://linkeddata.jiscpress.org/feed/paragraphlevel/executive-summary
- http://writetoreply.org/publicsectortransparencyboard/feed/paragraphlevel/ptb-terms-of-reference/

If you were viewing these sorts of feed in a feed reader such as Google Reader, you would be able to favourite and share each separate paragraph.
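To keep track of the different feed flavours, here is a small sketch that derives them all from a document's base URL and, optionally, a page slug. It simply mirrors the URL patterns shown in this post; the function name `document_feeds` and the dictionary keys are my own labels:

```python
def document_feeds(base, page=None):
    """Return the feed URLs for a JISCPress/WriteToReply document.

    base is the document URL; page is an optional page slug such as
    "executive-summary" (hypothetical helper, patterns from the post).
    """
    base = base.rstrip("/")
    feeds = {
        "document": f"{base}/feed/",
        "document_in_order": f"{base}/feed/?order=ASC",
        "all_comments": f"{base}/comments/feed/",
    }
    if page:
        feeds["page_content"] = f"{base}/{page}/?feed=rss2&withoutcomments=1"
        feeds["page_comments"] = f"{base}/{page}/feed"
        feeds["page_paragraphs"] = f"{base}/feed/paragraphlevel/{page}"
    return feeds

for name, url in document_feeds("http://linkeddata.jiscpress.org", "executive-summary").items():
    print(name, url)
```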

With so many separate feed URLs available, it can be a problem entering them separately into a dashboard such as Netvibes or a feed reader such as Google Reader, so I’ve created a couple of OPML generators:

- http://ouseful.open.ac.uk/xmltools/jiscpressOPML.php
- http://ouseful.open.ac.uk/xmltools/wtrOPML.php

They both take similar sorts of parameters, which are a little opaque at the moment as I try to work out sensible OPML element configurations.

The first parameter we need for the generator specifies the document:

- http://ouseful.open.ac.uk/xmltools/jiscpressOPML.php?url=linkeddata
- http://ouseful.open.ac.uk/xmltools/wtrOPML.php?url=publicsectortransparencyboard

Then we have the parameters b, c, s, and p… If you set these parameters in the URL (e.g. b=1, c=1&s=1), they act as follows:

$c = (isset($_GET['c'])) ? true : false;
// import feeds corresponding to comment feeds at the page level (i.e. each page will have its own comment feed or tab in the reader/dashboard)

$s = (isset($_GET['s'])) ? true : false;
// import feeds corresponding to single feed item per page feeds (i.e. each page will have its own feed or tab in the reader/dashboard; a single feed item will represent the whole of the page contents)

$p = (isset($_GET['p'])) ? true : false;
// import paragraph level feeds at the page level (i.e. each page will have its own feed or tab in the reader/dashboard and each paragraph will be a separate feed item)

if ((!($c))&&(!($p))) $s=true;
//default behaviour – if no comments and no para level feeds, use single item page level content feeds

$b = (isset($_GET['b'])) ? true : false;
// bundled – one folder – all the feeds will be imported into a single folder/page
// the default should be true, but it isn’t, so you’d be advised to normally set this parameter…
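Because the PHP above uses isset(), it is the presence of a parameter that matters, not its value, so c=0 switches comment feeds on just as c=1 does. A rough Python rendering of the same flag logic, handy for checking what a given generator URL will do; the function name `opml_flags` is hypothetical:

```python
from urllib.parse import urlsplit, parse_qs

def opml_flags(url):
    """Work out the b, c, s and p flags the way the PHP snippet does."""
    # keep_blank_values so that "?c=" still counts as present, like isset()
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    c = "c" in query
    s = "s" in query
    p = "p" in query
    if not c and not p:
        # default: single item page level content feeds
        s = True
    b = "b" in query
    return {"b": b, "c": c, "s": s, "p": p}

print(opml_flags("http://ouseful.open.ac.uk/xmltools/wtrOPML.php?url=ukgovoss&c=1"))
print(opml_flags("http://ouseful.open.ac.uk/xmltools/jiscpressOPML.php?url=linkeddata"))
```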

Using these various parameters, you can create a range of OPML files that can be used for the bulk import of feeds from a document published on WriteToReply or JISCPress. (Typically, you will need to download a copy of the OPML file to your desktop and then upload it to your dashboard/feed reader application. Download the document using File->Save Page As (and then choose the simplest format possible… e.g. Web Page, XML only).)

So for example:

- http://ouseful.open.ac.uk/xmltools/jiscpressOPML.php?url=linkeddata&b=1&c=1&p=1&s=1
- http://ouseful.open.ac.uk/xmltools/wtrOPML.php?url=ukgovoss&c=1

These OPML feeds can be useful for:

- importing feeds into Netvibes in one go, and creating dashboards with either one tab per document, or separate tabs for each document;

- importing feeds into Google Reader, so that you can read, share and favourite parts of documents (even down to the paragraph level if you import paragraph level feeds).
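For anyone who wants to roll their own, an OPML subscription list is just a small XML file of outline elements. The sketch below is not the code behind the generators above (their exact output isn't shown here); it is a minimal guess at the general shape, with a hypothetical function name, where passing a folder name corresponds roughly to the "bundled" option:

```python
import xml.etree.ElementTree as ET

def feeds_to_opml(title, feeds, folder=None):
    """Build an OPML subscription list from (name, feed_url) pairs."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    # with a folder, all feeds are nested inside one parent outline
    parent = ET.SubElement(body, "outline", text=folder) if folder else body
    for name, url in feeds:
        ET.SubElement(parent, "outline", type="rss", text=name, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")

opml_text = feeds_to_opml(
    "linkeddata",
    [
        ("Executive summary", "http://linkeddata.jiscpress.org/executive-summary/?feed=rss2&withoutcomments=1"),
        ("The Semantic Web comments", "http://linkeddata.jiscpress.org/the-semantic-web/feed"),
    ],
    folder="linkeddata",
)
print(opml_text)
```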

[Note: I'm thinking that the generation of paragraph level feeds needs tweaking in digress.it so that the title shows the first 50 or so characters of the paragraph, rather than the page title?]

Written by Tony Hirst

July 9, 2010 at 2:40 pm

Posted in WriteToReply


Open Course Production

with 15 comments

Following a chat with Mark Surman of the Mozilla Foundation a week or two ago, I’ve been pondering a possible “flip” between:

a) the production of course materials as part of a (closed) internal process, primarily for use within a (closed) course in a particular institution, and then released under an open license (such as a Creative Commons license); and

b) the production of course materials in the open that are then:

i) pulled into the institution for use within a (closed) course; or

ii) used (or not) to support self-directed learning towards an assessment only award.

In the OU, the course production model can take a team of several academics, supported by a course manager, media project manager, editor, picture researcher, rights chasers, developers, artists, et al. several years to produce a course that will then last for between five and ten years of presentation. In addition, handover of course materials may take place up to a year before the first presentation of the course. Course units are typically drafted by individual authors, and then passed for comment and critical reading to the rest of the course team. Typically, materials will pass through at least two drafts before final handover.

(After a little digging, and the help of @ostephens, I managed to track down some reports on how course production was managed in the early years of the OU: Course Production: Some Basic Problems, Course Production: Activities and Activity Networks, Course Production: Planning and Scheduling, Course Production: The Problem of Assessment, though I haven’t had chance to read them yet…)

For the OU short course T151 Digital Worlds, the majority of the course team authored content was published as it was being written on a public WordPress blog (Digital Worlds Uncourse Blog); in the current version of the course, students are referred to that public content from within the VLE. (Note that the copyright and licensing of content on the public blog is left deliberately vague!)

Although the Digital Worlds content was written by a single author (me;-), the model was intended to support at the very least a team blog approach, or a distributed blog network authoring approach. Rather than authors writing large chunks of text and then passing them for comment to other course team members, the blogged approach encourages authors to: a) read along with what others are producing; b) create short chunks of material (500-800 words, typical blog post length) on a particular topic (probably linked to other posts on the topic) that are convenient to study in a single study session or interstitial learning break (cf. @lorcand on Interstitial reading); c) link out to related resources; d) act as a focus for trackbacks (passive related resource discovery) and comments that might influence the direction taken in future blog posts.

The use of WordPress as the blogging platform was deliberate, in part because of the wide support WordPress offers for RSS/Atom feed generation. By linking between posts, as well as tagging and categorising posts appropriately, a structure emerges that offers many different possible pathways through the content. RSS feeds with everything means that it’s then relatively straightforward to republish different pathways apparently as linear runs of content elsewhere, if required (e.g. as in an edufeedr environment, perhaps?)

Authoring content in a public forum – ideally under an open content license – means that content becomes available for re-use even as it is being drafted. By opening up comments, feedback can be solicited that allows content to be improved by updating blog posts, if necessary, as well as identifying topics or clarifications that can be addressed in separate backlinking blog posts. By opening up the production process, we make it far more likely that others will contribute to that process, helping shape and influence that content, than if we expect others to take openly licensed content as a large chunk and then produce openly licensed derived works as a result (i.e. forks?!).

In short: maybe we shouldn’t just be releasing content created in a closed process as Open Educational Resources (OERs); rather, we should be producing them in public using an open source production model?

As Cameron Neylon suggests in a critique of academic research publishing (It’s not information overload, nor is it filter failure: It’s a discovery deficit):

It is very easy to say there is too much academic literature – and I do. But the solution which seems to be becoming popular is to argue for an expansion of the traditional peer review process. To prevent stuff getting onto the web in the first place. This is misguided for two important reasons. Firstly it takes the highly inefficient and expensive process of manual curation and attempts to apply it to every piece of research output created. This doesn’t work today and won’t scale as the diversity and sheer number of research outputs increases tomorrow. Secondly it doesn’t take advantage of the nature of the web. They way to do this efficiently is to publish everything at the lowest cost possible, and then enhance the discoverability of work that you think is important. We don’t need publication filters, we need enhanced discovery engines. Publishing is cheap, curation is expensive whether it is applied to filtering or to markup and search enhancement.

Filtering before publication worked and was probably the most efficient place to apply the curation effort when the major bottleneck was publication. Value was extracted from the curation process of peer review by using it reduce the costs of layout, editing, and printing through simple printing less. But it created new costs, and invisible opportunity costs where a key piece of information was not made available. Today the major bottleneck is discovery. …

The problem we have in scholarly publishing is an insistence on applying this print paradigm publication filtering to the web alongside an unhealthy obsession with a publication form, the paper, which is almost designed to make discovery difficult. If I want to understand the whole argument of a paper I need to read it. But if I just want one figure, one number, the details of the methodology then I don’t need to read it, but I still need to be able to find it, and to do so efficiently, and at the right time.

Currently scholarly publishers vie for the position of biggest barrier to communication. The stronger the filter the higher the notional quality. But being a pure filter play doesn’t add value because the costs of publication are now low. The value lies in presenting, enhancing, curating the material that is published.

And so on… (read the whole thing).

Maybe we need to think about educational materials in a similar way? By creating the materials in the open, we start to identify what the good stuff is, as well as being able to benefit from direct and relevant feedback from people who are interested in the topic because they discovered it by looking for it, or at least something like it. (For educators, if they think they are helping shape content, for example through commenting on it, they may be more likely to link back to it and direct their students to it because they have a stake in it, albeit weakly and possibly indirectly.)

In response to a call I put out on Twitter last night for links to work relating to the use of open source production models in course development, @mweller suggested that Andreas Meiszner’s PhD work may be relevant here? “My PhD research is aimed at investigating the impact of the organizational structure and operational organization on ICT enriched education by conducting a comparative study between FLOSS (Free / Libre Open Source Software) communities and Higher Education Institutions (HEIs). This work will conduct a comparative study between FLOSS communities and HEIs. The primary unit of analysis is (i.) the organizational structure of FLOSS communities and HEIs, (ii.) the operational organization of FLOSS communities and HEIs and (iii.) the learning process, outcome and environment in FLOSS communities and HEIs.”

(These are also relevant, I think? OSS-Watch briefings on Community source vs open source and The community source development model.)

By placing content out in the open, we also provide a stepping stone towards producing “assessment only” courses. By decoupling the teaching/learning content from the assessment, we can offer assessment only products (such as derivatives of the OU’s APEL containers, maybe?) that assess students based on their informal study of our open materials. (I’m not sure if any courses are yet assessing students who have studied materials placed on OpenLearn?) Once mechanisms are in place for writing robust assessments under the assumption that students will have been drawing at least in part on the study of open OU materials, we can maybe start to be more flexible in assessing students who have made use of other OERs (or indeed, any resources that they have been able to use to further their understanding on a topic).

Just by the by, it’s also worth noting that decoupling of assessment from teaching at the degree level is in the air at the moment (e.g. New universities could teach but not test for degrees, says Vince Cable) …

Related: an old and confused post about what happens when content on the inside is opened up to the outside so that folk from the inside can work on it on the outside using all their skills from the inside but not having to adhere to any of its constraints… Innovating from the Inside, Outside

Written by Tony Hirst

July 9, 2010 at 12:14 pm

Dazed and Confused…

with one comment

So via several twittering sources, today I learn that:

- ‘the government now says Facebook will be its “primary channel” for communicating with the public about spending cuts’ [BBC News: Ministers turn to Facebook users for cuts suggestions]

As @adrianshort pointed out, “You’ll need a Facebook account to vote in a few years time. Whatever happened to the public web?”, which reminded me of something I skimmed on Technology Review earlier this week (The Government Has an Online Identity Plan for You):

the U.S. government is hoping to step in and improve the state of online identity management. In a draft recently posted online, the Department of Homeland Security outlined a possible National Strategy for Trusted Identities in Cyberspace–a document that suggests how the government could facilitate a system for managing identities. The system could be used not only by government sites such as the Internal Revenue Service, but by other websites, including commercial ones.

The draft document does not suggest creating a national ID card or government-mandated Internet identity system. Instead it proposes a way to combine existing online identity technologies to create a simpler, more privacy-conscious identity system, without the government taking control of the whole thing.

… The draft suggests starting with accounts that users might already have, like those from Google or Facebook. …

If the UK Gov would rather go for a physical ID card by proxy, there’s always the Tesco Clubcard of course (“Tesco now has 16 million active Clubcard holders in the UK, compared to 11.7 million people who have a Barclaycard” [Tesco Clubcard signs up one million customers since relaunch]). Or your mobile phone; even the under 8s have mobile phones…

And the second thing from my Twitter feeds?

- “Money-saving plans to separate teaching from examining in higher education are to be outlined by the business secretary, Vince Cable. The proposals would allow new institutions to teach students for degrees that would be then awarded by prestige universities. … All universities would be offered the opportunity to teach to an externally set, globally recognised exam. One by-product of this would be the emergence of a new breed of private universities.” [Guardian: New universities could teach but not test for degrees, says Vince Cable]

Hmm… maybe I should repitch my idea for a qualification verification webservice so employers three or four years down the line can stand a chance of checking whether job applicants’ degrees are valid or not. (QVS could also play really nicely with Facebook, as it happens…;-) [The QVS doc was a blue sky pitch to the SocialLearn project a couple of years ago, that spawned a small internal project reviewing how a less ambitious service might be used to simplify internal OU processes. The doc linked to above is an edited version of the original draft doc, that was itself revised for presentation to the SocialLearn steering group. The views contained within it barely reflect my own views, let alone those of my employer.]

The move to decouple teaching from assessment using a national HE exam (good for Pearson via EdExcel, methinks?!) is something that might help us make the case internally for some assessment only versions of courses… More about that in a future post, but now I’m going to have another quick peek at the wires to see what else has happened in the last couple of hours!

Written by Tony Hirst

July 9, 2010 at 9:45 am

Posted in Anything you want

Amplified Meetings and Participatory Deliberation…

with one comment

So according to the Guardian (David Cameron tells civil servants he wants to ‘turn government on its head’, via @neillyneil), it seems that:

[David Cameron] told a civil service conference in London that he wants to replace what he described as “the old system of bureaucratic accountability” with a democratic accountability “to the people, not the government machine”.

As part of that, every government department will be required to publish structural reform plans setting out how they will put “people in charge, not politicians”.

So I’m wondering… maybe there are certain boards and committees that might benefit from opening their processes to public view, not just for transparency but also so that folk who are interested (and maybe qualified) can contribute too…? After all, select committees formally solicit views from witnesses called to present evidence to the committee; so why not also require other committees to do the same, although in a more casual way? I started doodling some ideas on this topic in a blog post yesterday (Using WriteToReply to Publish Committee Papers. Is an Active Role for WTR in Meetings Also Possible?), essentially around the idea that by opening up committee papers before a meeting, comments could be solicited from the interested, and optionally drawn on in the meeting itself.

(I’ve never really understood why the business of meetings is required to take place in a particular location at a particular time…?)

By amplifying the business of a committee, both before and after its meetings, the members of the committee also get to draw on the combined wisdom of whoever happens to be following the business of the committee, or who is interested in it, if they wish; and by opening up the closed walls of the committee, we allow the potential for participatory deliberation of the matters at hand.

After all, why should it only be conferences that get amplified online?

Written by Tony Hirst

July 8, 2010 at 8:36 pm

Using WriteToReply to Publish Committee Papers. Is an Active Role for WTR in Meetings Also Possible?

with 4 comments

Last night I spent an hour or two putting the various papers released so far by the Public Sector Transparency Board on WriteToReply, an unofficial act but one that seems to have met with approval:

WTR Public Sector Transparency Board

The majority of documents we’ve published on WriteToReply previously tend to be large, standalone documents, albeit with multiple sections that we tend to map on to separate blog posts. Recently, I’ve also started exploring how it feels to post “single page” consultations within a more general blog setting (e.g. Single Page Commentable Consultation Docs).

The PSTB site is a different matter, however, because there are likely to be several separate documents for each meeting of the board (if nothing else, at least an agenda and the minutes), as well as multiple sittings of the board.

Public Sector Transparency Board, WTR

So how does this change things?

The first thing to realise is that the structure – one post per document – encourages linking between documents. So for example, if the agenda identifies that a particular document was a subject of discussion, a link can be included to that page. If a particular section, or point raised in a document is minuted, then the way that WriteToReply generates unique URLs for each paragraph means that a link can be included that references that particular paragraph.

Capturing the forward path – from document paragraph in to the minutes, for example – relies on WordPress capturing a link from the minutes to a paragraph link via a trackback. It strikes me that in a WTR environment, if notes or minutes on the discussion of a particular paragraph or section were captured as comments, then a comment feed would automatically capture an ordered and referenced set of notes/minutes, that could be fed directly into a derived document?

The ability to syndicate and embed (i.e. transclude) paragraph level content from one WriteToReply document in another web document (as demonstrated in Engaging With the Issues Raised By The Google Book Settlement and described in Taking the Conversation Elsewhere – Embedded Quotes) is also a feature that begins to look attractive once we start thinking about a document ecosystem. So for example, if we minute a reference to a particular paragraph in a tabled document, we might allow the viewer to read that paragraph via a sub-text annotation within the context of the minutes.

A more extreme mechanic might be to allow the reader to view elements of documents considered by the Board using a TiddlyWiki-like mechanic (if you haven’t tried TiddlyWiki, follow the link to it now, and then try clicking on some of the links on the TiddlyWiki page. For some users, the TiddlyWiki user experience is compelling; others hate it…)

In this case, clicking on a link in a minute would dynamically pull in the corresponding paragraph or section from the linked to document. Once read, the user could then “close” the referred to paragraph. (You really do need to have played with TiddlyWiki recently to appreciate what I might mean by that!)

To return to the PSTB example, it will be interesting to see how the evolving nature of the site plays out and whether WriteToReply could play an active role in both the preparation for and run up to a meeting, as well as recording discussions and decisions made by the Board.

For example, if we manage to get prior notice of tabled items and the meeting agenda, then we’ll be able to solicit comments on very specific items that can possibly be referred to within the meetings of the Board itself. Even Board members might use WTR to take notes on – and solicit further comments on – particular sections of the document, as well as being able to refer to them via user feeds during the meeting itself.

As we get more documents, we should be able to increase the linkage between documents (though it will remain to be seen how that might turn out to be useful, if at all). In the sense that the published documents (and comments) provide a corpus for a search engine over issues considered by the Board, I think we’d probably need to offer paragraph level search, as well as comment level search (e.g. see this demonstration hack: Paragraph Level Search Results on WordPress Using Digress.it and Yahoo Pipes; I don’t think it searches comments though? In fact, does WordPress allow users to search through comments?).

I believe that there is certainly scope for using this sort of approach to “amplify” both preparation for and dissemination of PSTB discussions at the very least. If anyone else out there would like to explore the possibility of using a WriteToReply environment to support meeting based activities (“amplified meetings”), either on a public site or a closed site, please get in touch:-)

Written by Tony Hirst

July 7, 2010 at 3:53 pm

Posted in Search, WriteToReply

Mulling Over an Idea for Hashtag Community Maturity Profiles

with 3 comments

A couple of weeks ago, I started cobbling together some clunky scripts to collate network data files from lists of people twittering with a particular hashtag (First Glimpses of the OUConf10 Hashtag Community). I’ve got a Twapperkeeper key now, so the next step is to pull archived hashtagged tweets from there to generate my hashtaggers list, and then use that data as the basis for pulling in friends and followers links for particular individuals from the Twitter API.

One thing I’d like to start pulling together is a set of tools for providing network and backchannel analysis around hashtag communities. Andy Powell has already published a site that summarises hashtag activity in the form of Summarizr using a Twapperkeeper archive:

Summarizr

So what else might we look for?

Mulling over my own Personal Twitter Networks in Hashtag Communities, the metrics I report include:

- Number of hashtaggers [Ngalaxy]
- Hashtaggers as followers (‘hashtag followers’) [Gfollowers]
- Hashtaggers as friends (‘hashtag friends’) [Gfriends]
- Hashtagger followers not friended (‘serfs’) [Gserfs]
- Hashtagger friends not following (‘slebs’) [Gslebs]
- Hashtaggers not friends or followers (‘the hashtag void’) [Gvoid]
- Reach into hashtag community [Greach=Gfollowers/Ngalaxy]
- Reception of hashtag community: the proportion of the hashtag community that are followed by (i.e. are friends of) the named individual [Greception=Gfriends/Ngalaxy]
- Hashtag void (normalised) [Normvoid=Gvoid/Ngalaxy]
- Total personal followers: the total number of followers of the named individual [Nfollowers]
- Total personal friends: the total number of friends of the named individual [Nfriends]
- Hashtag community dominance of personal reach: the extent to which the hashtag community dominates the set of people who follow the named individual, [Domreach=Gfollowers/Nfollowers]
- Hashtag community dominance of personal reception: the extent to which the set of the named individual’s friends is dominated by members of the hashtag community, [Domreception=Gfriends/Nfriends]
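Those definitions boil down to a handful of set operations, so here is a minimal sketch of how they might be calculated for one named individual. The function name and the idea of passing in three plain sets of user names are my assumptions for illustration, not what the clunky scripts actually do; I have also assumed that an empty denominator gives a ratio of 0.

```python
def hashtag_profile(hashtaggers, followers, friends):
    """Personal hashtag community metrics for one named individual.

    hashtaggers: set of users seen using the hashtag (the 'galaxy')
    followers:   set of users who follow the named individual
    friends:     set of users the named individual follows
    Assumption: the galaxy is used as given (the individual is not removed from it).
    """
    galaxy = set(hashtaggers)
    followers = set(followers)
    friends = set(friends)

    g_followers = galaxy & followers      # hashtaggers who follow the individual
    g_friends = galaxy & friends          # hashtaggers the individual follows
    g_serfs = g_followers - g_friends     # hashtagger followers not friended
    g_slebs = g_friends - g_followers     # hashtagger friends not following
    g_void = galaxy - followers - friends # hashtaggers with no link either way

    def ratio(top, bottom):
        # Assumption: report 0.0 rather than fail when the denominator is empty
        return top / bottom if bottom else 0.0

    return {
        "Ngalaxy": len(galaxy),
        "Gfollowers": len(g_followers),
        "Gfriends": len(g_friends),
        "Gserfs": len(g_serfs),
        "Gslebs": len(g_slebs),
        "Gvoid": len(g_void),
        "Greach": ratio(len(g_followers), len(galaxy)),
        "Greception": ratio(len(g_friends), len(galaxy)),
        "Normvoid": ratio(len(g_void), len(galaxy)),
        "Nfollowers": len(followers),
        "Nfriends": len(friends),
        "Domreach": ratio(len(g_followers), len(followers)),
        "Domreception": ratio(len(g_friends), len(friends)),
    }


# Made-up example data: five hashtaggers, three followers, four friends
profile = hashtag_profile(
    hashtaggers={"a", "b", "c", "d", "e"},
    followers={"a", "b", "x"},
    friends={"b", "c", "y", "z"},
)
```

With real data, the three sets would come from the hashtag archive and the friends and followers lists for the individual concerned.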

Anyway, it strikes me that calculating those measures as means (and standard deviations) across all the members of the network, along with more traditional social network analysis network centrality or clustering measures, might help identify different signatures relating to the maturity of different hashtag communities (for example, the extent to which they are just forming, or the extent to which they have largely saturated in terms of members knowing each other).
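As a rough sketch of what such a community-wide summary might look like, assume each member's metrics have already been worked out and stored as a dict keyed by metric name (the function name and the data layout here are hypothetical). I have used the population standard deviation, on the assumption that we are describing the whole hashtag community rather than a sample of it; swapping in the sample version would be a one line change.

```python
from statistics import mean, pstdev


def community_signature(profiles):
    """Mean and standard deviation of each metric across community members.

    profiles: list of dicts, one per member, all sharing the same metric names.
    Returns {metric_name: (mean, population standard deviation)}.
    """
    if not profiles:
        return {}
    signature = {}
    for name in profiles[0]:
        values = [p[name] for p in profiles]
        signature[name] = (mean(values), pstdev(values))
    return signature


# Made-up example: two members, two of the ratio metrics
signature = community_signature([
    {"Greach": 0.2, "Greception": 0.4},
    {"Greach": 0.6, "Greception": 0.4},
])
```

Comparing these signatures between hashtags, or between snapshots of the same hashtag, is where any maturity profile would have to come from.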

These metrics might also change over the course of an event being discussed via a particular hashtag.

Written by Tony Hirst

July 7, 2010 at 2:27 pm

When Open Public Data Isn’t…?

with 5 comments

This year was always going to be an exciting year for open data. The launch of data.gov.uk towards the end of last year, along with commitments from both sides of the political divide before the election that are continuing to be acted upon now, means data is starting to be opened up – scruffily at first, but that’s okay – and commercial enterprises are maybe starting to get interested too…

…which was always the plan…

…but how is it starting to play out?

The story so far…

A couple of weeks ago, the first meeting of the Public Sector Transparency Board was convened, which discussed – and opened up for further discussion – a set of draft public data principles. (Papers relating to the meeting can be found here.)

In a letter to the responsible Minister prior to the meeting (commentable extracts here), Professor Nigel Shadbolt suggested that:

4. … The economic analysis, and the views we regularly hear from the business community themselves, are unequivocal: data must be released for free re-use so that the private sector can add new value and develop innovative new business services from government information. …

8. Transparency principles need to be extended to those who operate public services on a franchised, regulated or subsidised basis. If the state is controlling a service to the public or is franchising or regulating its delivery the data about that activity should be treated as public data and made available. …

11. We need to support the development of licences and supporting policies to ensure that data released by all public bodies can be freely re-used and is interoperable with the internationally recognised Creative Commons model. …

12. A key Government objective is to realise significant economic benefits by enabling businesses and non-profit organisations to build innovative applications and websites using public data. …

The business imperative is further reinforced by the second of three reasons given by the Open Government Data tracking project in Why Open Government Data?:

Releasing social and commercial value. In a digital age, data is a key resource for social and commercial activities. Everything from finding your local post office to building a search engine requires access to data much of which is created or held by government. By opening up data, government can help drive the creation of innovative business and services that deliver social and commercial value.

So how has business been getting involved? As several local councils start to pick up a request contained in a letter from the Prime Minister published at the end of May that they open up their financial data, Chris Taggart/@countculture, developer of OpenlyLocal posted a piece on The open spending data that isn’t… this is not good in which he described how apparently privileged access to financial data from several councils was being used to drive Spikes Cavell’s SpotlightOnSpend website (for a related open equivalent, see Adrian Short’s Armchair Auditor). Downstream use of the data was hampered by a “personal use only” license, and a CAPTCHA that requires a human in the loop in order to access the data. The Public Sector Transparency Board promptly responded to Chris’ post (Work on Local Spending Data), quoting the principle that:

“Public data will be released under the same open licence which enables free reuse, including commercial reuse – all data should be under the same easy to understand licence. Data released under the Freedom of Information Act or the new Right to Data should be automatically released under that licence.”

and further commenting: “We have already reminded those involved of this principle and the existing availability of the ‘data.gov.uk’ licence which meets its criteria, and we understand that urgent measures are already taking place to rectify the problems identified by Chris.”

Spikes Cavell chief executive Luke Spikes responded via an interview with Information Age, (SpotlightOnSpend reacts to open criticism):

[SpotlightOnSpend] is first and foremost a spend analysis software and consultancy supplier, and that it publishes data through SpotlightOnSpend as a free, optional and supplementary service for its local government customers. The hope is that this might help the company to win business, he explains, but it is not a money-spinner in itself.

“The contribution we’re making to transparency is less about what the purists would like to see, it’s simply putting the information out there in a form that is useful for the audience for which it is intended [i.e. citizens and small businesses]” he said. “But there are a few things we haven’t done right, and we’ll fix that.”

Following the criticism, Cavell says that SpotlightOnSpend will make the data available for download in its raw form. “That’s what we thought was the most sensible solution to overcoming this obstacle,” he told me.

Adrian Short, developer of the open Armchair Auditor, then picked up the baton in a comment to the Information Age article:

There is room for Spikes Cavell to develop their applications and I doubt that anyone has any objection to them offering their services to councils commercially just like thousands of other businesses. But they do not have a monopoly of ideas, talent and resources to build great applications with public spending data. Nor does anyone else.

The concerns that @CountCulture raised were not that Spikes Cavell were trading with councils or trying to attract their business but that they are doing so in a way that precludes anyone else developing applications with this data. By legally and technically locking the data into the Spotlight on Spend platform, everyone else is excluded.

It’s understandable that most councils have no understanding of the culture, legalities or technicalities of open data. This is new territory for nearly all of them. Those councils that have put their data straight onto Spotlight on Spend, bypassing the part where it is made genuinely open — cannot be criticised for not complying with what to them must be a very unusual requirement. But that’s why @CountCulture and I and others want to be very clear that the end result of this process is having effective scrutiny of council finances through multiple websites and applications, not just Spotlight on Spend or any other single website or application. The way we get there is with open data.

And Chris Taggart’s response? (Update on the local spending data scandal… the empire strikes back):

Lest we forget, Spikes Cavell is not an agent for change here, not part of those pushing to open public data, but in fact has a business model which seems to be predicated on data being closed, and the maintenance of the existing procurement model which has served us so badly.

(For recommendations on how councils might publish financial data in an open way, see: Publishing itemised local authority expenditure – advice for comment (reiterated here: Open Government Data: Finances). The Office for National Statistics occasionally releases summary statistics (e.g. as republished in Openlocal: Local spending data in OpenlyLocal, and some thoughts on standards) but at only a coarse resolution. As to how much it might cost to do that, some are claiming Cost of publishing ‘£500 and above’ council expenditure prohibitive.)

From my own perspective, I would also add that should consultants like Spikes Cavell create derived data works from open public data, there should be some transparency in declaring how the derived work was created (see for example: So Where Do the Numbers in Government Reports Come From? and Data is not Binary).

Another example of how once open data is becoming “closed” behind a paywall comes from Paul Geraghty (“Closed Data Now” SOCITM does a “Times”):

If my memory serves me well the e-Gov Register (eGR) hosted by Brent has been showing every IT supplier sortable by product type, supplier, local government type and even on maps for about 6 or 7 years (some links below if you hurry up).

I am aware that there are problems with this data, in my own past employer I know that some of the data is out of date.

But it is there, it is useful and informative and it is OPEN to all, even SMEs like me researching on niche markets in local government.

The latest move by SOCITM (and presumably with the knowledge of the LGA and the IDeA) means all that data is going to go behind the SOCITM paywall.

And the response from Socitm, via a comment from Vicky Sargent:

First of all, I’d like to clear up some points of fact. No local authority or other public sector service provider that provides data to the Applications Register will have to pay for their subscription and for them, access to the data will be free, regardless of whether they subscribe to Socitm Insight (as 95% of local authorities do). Anyone who is employed in an organisation that is an Applications Register subscriber – f-o-c or paid, will be able to access the data.
Then there is who pays. Clearly an information service like this that adds value, has to cover the costs of development and delivery. Unlike government departments, LGA, IDeA and local councils, Socitm is not directly funded by the taxpayer, and needs to fund the services it delivers from money raised from fees, subscriptions, events and other services.
The business model we use for the Applications Register is that public bodies that contribute should not pay to use the service, but those that do not contribute pay in cash. Private sector bodies can only pay in cash.

Your article also suggests that Socitm’s support for the move towards open data is hypocritical, set against our business model for the Applications Register. I think this misunderstands the thinking behind ‘open data’, which is to get raw data out of government systems for transparency purposes, also so that it can be re-used. Socitm has been a long-term strong supporter of this.
The open data agenda explicitly acknowledges that ‘re-use’ includes adding value and selling on. If councils were to routinely publish the sort of data we will collect for the Applications Register, there would still be work to be done aggregating and manipulating and re-publishing the information to make it useful, and that is what we do, recovering our costs in the way described.

Adrian Short (can you see how it’s the same few players engaging in these debates?!;-) develops the “keep it free” argument in a further comment:

Your argument presupposes your conclusion, which is that Socitm is the best organisation to be managing/publishing the applications register. Because, as you correctly say, you don’t receive any direct funding from the taxpayer, you have to find other ways of paying for that work. Inevitably this means charging non-contributing users.

What you’re missing is that millions of pounds of public money is spent every year supporting businesses, helping to create new markets and generally oiling the parts of the economy that don’t easily oil themselves. That’s what BIS and the economic development departments of local authorities do. The public interest and private benefit aren’t easily distinguishable unless you contrive that private benefit for a small group to the exclusion of others. But as Paul rightly points out, the potential market for this information is enormous — essentially every business and individual that works for, supplies or wants to work for the public sector, from the individual IT worker to the massive global consultancies, manufacturers and software firms.

Currently it’s a small number of incumbent suppliers that benefit from this relatively inefficient market. Other businesses lose. Public sector buyers lose. The taxpayer loses.

Keeping this information free for everyone to use and enabling it to be used in future when combined with the enormous amount of data that will be released soon will be likely to produce economic benefits to the public through market efficiencies that outstrip its cost by several orders of magnitude. If Socitm can’t publish this data in the most useful, non-discriminatory way then it’s not the best organisation for the job. I can see no reason in principle or practice why it shouldn’t be fully funded by the taxpayer and free at the point of use for everyone. To do otherwise would be an extremely false economy.

(Note that “free vs. open” debates have also been played out in the open source software arena. Maybe it’s worth revisiting them…?)

The previously quoted comment from Vicky Sargent also contains what might be described as an example case study:

This brings me to Better Connected, the annual survey of council websites carried out by Socitm. You say:
Just about every council in the UK has little option but to pay SOCITM hundreds of pounds annually to join their club to find out the exact details of how their website is being ranked.
The data for Better connected only exists because Socitm has devised a methodology for evaluating websites, pays for a team of reviewers collect the data each year, and then analyses and publishes the results. No one has to subscribe, they choose to do so because the information is valuable to them.
Information about how we do the evaluation and ranking is freely available on our website, in our press releases and in our free-to-join Website Usage and Improvement community. The 2010 headline results for all councils are published on socitm.net as open data under a creative commons licence and are linked from data.gov.uk.
If the Better connected report has become a ‘must read’, that is because the investment Socitm has made in the product has led to it being a more cost-effective investment for councils than alternative sources of advice on improving their website. Many users have told us Better connected (cover price £415 for non-subscribers or free as part of the Socitm Insight subscription that starts at £670 pa for a small district council) is worth many days’ consultancy, even when that consultancy is purchased from lower cost SME providers.

As these examples show, the license under which data is originally released can have significant consequences on its downstream use and commercialisation. The open source software community has known this for years, of course, which is why organisations like GNU have two different licenses – GPL, which keeps software open by tainting other software that includes GPL libraries, and LGPL, which allows libraries to be used in closed/proprietary code. There is a good argument that by combining data from different open sources in a particular way valuable results may be created, but it should also be recognised that work may be expended doing this and a financial return may need to be generated (so maybe companies shouldn’t have to open up their aggregated datasets?). Just how we balance commercial exploitation with ongoing openness and access to raw public data is yet to be seen.

(The academic research area – which also has its own open data movement (e.g. Panton Principles) – also suggests a different sort of tension arising from the “potential value” of a data set or aggregated data set. For example, research groups analysing data in one particular way may be loath to release it to others because they want to analyse it in another, value creating way at a later date.)

Getting the licensing right is particularly important if councils become obliged to use third party services to publish their data. For example, the grand vision of the Public Sector Transparency Board identified in this paragraph in Shadbolt’s letter to Maude states:

13. We must promote and support the development and application of open, linked data standards for public data, including the development of appropriate skills in the public services. …

But as a recent report, again from Chris Taggart, on Publishing Local Open Data – Important Lessons from the Open Election Data project suggests, there are certain challenges associated with web related development in local authorities, and in particular a significant lack of experience and expertise in dealing with Linked Data (which is not surprising – it is a relatively new, and so far arcane, technology). Here are the first four lessons, for example:

- There is a lack of ‘corporate’ awareness/understanding of open data issues, and this will inhibit take up of open, linked data publishing unless it is addressed
- There is a lack of even basic web skills at some councils
- Many councils lack web publishing resources, never mind the resources to implement open, linked data publishing
- The understanding of even the basics of linked data and the steps to publishing public data in this way is very, very limited

What this suggests to me is that it is likely that in the short term at least, the capability for publishing Linked Data will reside in specialist third party companies, possibly one of only a few companies. As Paul Geraghty discovers from the eGovernment Register in If #localgovweb supplier says “RDF WTF?” Sack em #opendata #spending:

[I]t seems to me that of 450 or so local government organisations, 357 are listed as having a “Financials” supplier **.

There are only 18 suppliers listed, and of those there are 6 Big Ones.

Between them the 6 Big Ones supply “Financials” to 326 Councils.

Don’t you think that the first one of those 6 Big Ones who natively supports LOD [Linked Open Data] as an export option (or agrees to within, say, 8 months) really ought to be favoured when bidding for new business?

Lets go further, lets say that it should be mandated that all new contracts with “Financials” suppliers include an LOD clause.

Perhaps Mr Pickles could dispatch someone to have a chat with one or two of these suppliers, or that he should have someone check that future contracts for Financial products being sold to Local Government all contain the necessary wording to make this happen?

So instead of trying to train and cajole 450 councils to FTP assorted CSV files into localdata.gov.uk (FFS) all the way through to grokking RDF, namespaces, LOD et al – why does the government not get on and make a strategy to bully and coerce 6 suppliers instead – and potentially get 326 councils teed up to produce useful LOD a bit sharpish?
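As a quick back-of-the-envelope check on the leverage being claimed there, here is the arithmetic using only the figures Paul quotes. The variable names are mine, and the "450 or so" total is approximate in the original, so the share of all councils is indicative only.

```python
# Figures taken from the quoted post; 450 is "450 or so", so approximate
councils_total = 450
councils_with_financials_supplier = 357
suppliers_listed = 18
big_suppliers = 6
councils_served_by_big_six = 326

# Share of councils with a listed "Financials" supplier served by the 6 Big Ones
share_of_listed = councils_served_by_big_six / councils_with_financials_supplier

# Share of all councils (approximate, given the rough total)
share_of_all = councils_served_by_big_six / councils_total

# Average number of councils reached per conversation with a Big One
councils_per_big_supplier = councils_served_by_big_six / big_suppliers

print(f"{share_of_listed:.1%} of councils with a listed supplier")
print(f"{share_of_all:.1%} of all councils (approx.)")
print(f"{councils_per_big_supplier:.1f} councils per big supplier")
```

In other words, a third of the listed suppliers account for roughly nine in ten of the councils that have a listed supplier.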

Another technology option is for councils to publish their own linked data to a commercially hosted datastore. At the moment, the two companies I know of that offer “datastore” services for publishing Linked Data, at scale, are Talis, and the Stationery Office (in partnership with Garlik). It is, of course, open knowledge that one Professor Nigel Shadbolt is a director of Garlik Limited.

Written by Tony Hirst

July 7, 2010 at 10:31 am
