OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for February 2011

BBC “In Our Time” Reading List using Linked Data

with 9 comments

If you’re a regular listener of BBC Radio 4, you will almost certainly have come across In Our Time, a weekly, single topic discussion programme (with a longstanding archive of listen again material) hosted by Melvyn Bragg on matters scientific, philosophical, historical and cultural. In certain respects, In Our Time may be thought of as a discussion-based audio encyclopedia. The format sees a panel of three experts (made up of academics, commentators and critics knowledgeable on the topic for that week) teaching the host about the topic. A diligent student, he will of course have done some background reading, and posted links to the references consulted on the programme’s web page.

I’ve already had a quick play with the In Our Time data, looking to see how easy it is to relate programmes to expert academics from various UK universities (Visualising OU Academic Participation with the BBC’s “In Our Time”), but I also wondered whether it would be possible to do anything with the book references, such as using them to identify courses that may be related to a particular programme. (This is reminiscent of a couple of MOSAIC competition entries that looked at ways of recommending books based on courses, and courses based on books, using @daveyp’s data from Huddersfield University library that associated course codes with the books borrowed by students taking those courses.)

Being a lazy sort, I posted an idea to the OKF Ideas Incubator suggesting that it might be worth considering extracting references from In Our Time programme pages and then reconciling them with Linked Data representations of the corresponding book data.

And then, as if by magic, a solution appeared, from Orangeaurochs: “In Our Time” booklist which describes a method for parsing out the book data and then getting a Linked Data resource reference back from Bibliographica.

The original recipe suggested screenscraping the raw book references from the page HTML, but I posted a comment (at the time of writing, still in the moderation queue) which suggests:

Hi
Great to see you taking this challenge on. Re your step 2 – obtaining the reading list – a possibly more structured way of doing this is to get the appropriate section out of the xml or json representation of the programme page (eg http://www.bbc.co.uk/programmes/b00xhz8d.xml or http://www.bbc.co.uk/programmes/b00xhz8d.json).

I wonder if the BBC will start to structure the data even more – for example by adding explicitly marked up biblio data to book references?

Anyway, you can see an example of the results at pages with URLs of the form http://www.aurochs.org/inourtime_booklist/inourtime_booklist_v1.php?http://www.bbc.co.uk/programmes/b00xhz8d – just add the appropriate IOT programme page URL to extract the data from it.
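As a rough sketch of the URL patterns mentioned above (this is not Orangeaurochs’ actual code), the following Python derives the .xml/.json data URLs and the booklist URL from a programme page URL. The function names are made up for illustration, and because I’m not assuming anything about the structure of the returned data, nothing is fetched or parsed here.

```python
from urllib.parse import urlsplit

# Base of the booklist service, as quoted in the post.
BOOKLIST_BASE = "http://www.aurochs.org/inourtime_booklist/inourtime_booklist_v1.php"


def programme_data_url(page_url, fmt="json"):
    """Return the .xml or .json representation URL for a BBC programme page."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    # Drop any query string, fragment or trailing slash before adding the extension.
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")
    return f"{parts.scheme}://{parts.netloc}{path}.{fmt}"


def booklist_url(page_url):
    """Return the booklist URL for a programme page (page URL passed as the query)."""
    return f"{BOOKLIST_BASE}?{page_url}"


if __name__ == "__main__":
    page = "http://www.bbc.co.uk/programmes/b00xhz8d"
    print(programme_data_url(page, "json"))
    print(programme_data_url(page, "xml"))
    print(booklist_url(page))
```

Pulling the reading list out of the JSON would then be a matter of loading the document and walking down to whichever section holds the references, which needs checking against a real response.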

There are a few hits and misses, but it’s a great start, and something that can be used as a starting point for thinking about how to annotate programme related booklists with structured bibliographic data and exploring what that might mean in a world of linked educational resources that can also reference linked BBC content… :-)

PS Hmmm, I wonder what other programmes are associated with books? A Good Read and Desert Island Discs certainly…

Written by Tony Hirst

February 24, 2011 at 4:06 pm

Posted in BBC, Data, Library, OBU, OU2.0

First Inklings of a Small Contract Market Around Data Services? And a concern…

with one comment

A few days ago I was tipped off to a “bounty” request on Scraperwiki, offering 50 quid for a scrape of the DVLA test centres. The request had been posted on Scraperwiki itself, with a bounty attached (on top of which Scraperwiki seems to add a commission).

Scraperwiki also appears to be offering a “private scraper” service as a business model. Maybe visualisation design around a wiki will be next to be offered on the market?!

Another hint that folk may be willing to pay to get data into a useable form appeared on GetTheData in a request for information about currency data from a professional, non-coder journalist that suggested a payment may be in the offing for anyone who could help.

Given that a lot of data that is apparently out there and readily scrapeable is actually subject to non-commercial, personal use only end user licences, I do wonder if there will be a black market in unlicensed data that gets laundered through a series of steps that don’t respect attribution, let alone other, more stringent licence conditions.

On the other hand, I wonder whether or not GetTheData should have a facility for associating a bounty with a particular query?

And the concern? It’s to do with the ethics of scraping or aggregating large amounts of personal – albeit public – data from folk on social networks. For example, it’s easy enough to find out who’s being wished a happy birthday on Twitter, and I have more than a few tools for grabbing friends and follower lists around hashtags, search terms, Twitter lists and usernames, and so on. Once we start mining data, it may be possible to discover things about folk from the public context they inhabit that maybe reveals something about them they didn’t realise could be deduced from the context? So what should our response be if we get a request on GetTheData asking someone how to mine public social data around a named individual… It may not be phone tapping, but something about that sort of request, should it ever occur, wouldn’t feel quite right to me…?

Written by Tony Hirst

February 23, 2011 at 7:21 pm

Posted in Data

Opening Up Digital Planet…

The second in the OU’s co-produced season of programmes with the BBC World Service Digital Planet radio programme is now available on the Digital Planet podcast feed, this week covering the topic of “Ownership and Openness” and featuring OU Senior Lecturer (and intellectual property geek;-) Ray Corrigan.

In the spirit of openness, wherever possible we’re trying to open up access to full length versions of the interviews used in the programme on the OpenLearn website. So for example, if you want to hear fuller length interviews recorded from Brazil’s Campus Party, as covered in the opening episode of the openness series, you can find them here: Campus Party Brasil 2011 – The Digital Planet Interviews.

Interviewees include Jon “Maddog” Hall on Free as in Freedom, not as in price and Sir Tim Berners-Lee on net neutrality, opening up data, why open data is important and on WikiLeaks.

The OpenLearn site also hosts a recording of Al Gore’s Campus Party 2011 Keynote which I don’t think received an airing on either Digital Planet, or Digital Planet’s sister World Service TV programme, Click?

And as if that’s not enough, the audio clips have been made available as MP3 files, which means you can download them to your own device, or embed them in your own web pages… Like this:

Sir Tim Berners-Lee on why open data is important:

If you can think of any other ways we can open up the programmes, please let us know:-)

To keep up-to-date with the OU Digital Planet extras, keep your eye on the OpenLearn Digital Planet profile page, or even better, subscribe to the OpenLearn/Digital Planet RSS feed:-)

Written by Tony Hirst

February 23, 2011 at 12:17 pm

Coming Soon, A Command Line to the Web in Google Chrome?

Having been a user of the Chrome browser for several months now, I am completely sold on the idea of having a single text box to double up as both a search box and an address box. On the occasions I use the Safari browser, I’m more likely than not to end up with a “page not found” displayed as the result of typing a search query in to the location bar…

As with most browsers, Chrome is extensible through the addition of plugins (I’ve just started using these again, starting with the Awesome Screenshot plugin). With the opening up of an API for the Omnibox (labelled as “Onebox” in the screenshot), developers will now get a chance to “add their own keyword commands to the omnibox”.

(Reminds me of a time when I had umpteen different smart keyword powered searches in my Firefox browser to activate searches on different search endpoints.)

What suddenly occurred to me was that when Chromium devices start shipping (computers where all you get is the browser), the omnibox could also become the command line to the web for folk who like the command line. Google already has a range of command line commands (Google CL: the Google command line tool), so I wonder – could these be supported in the omnibox, and if so, when will they be?!;-)

(In the meantime, the omnibox could also provide a command line from the web for Chrome users who want a web command line in their browser, rather than firing up a terminal?!)

PS on the Chromium front, it seems that the development of Native Client support is continuing apace, which, if I understand it correctly, means you can write and run C applications within a sandboxed container inside Chromium. I guess that also means it could provide support for applications running on their own virtual machine?

Written by Tony Hirst

February 23, 2011 at 10:37 am

Posted in Anything you want

A Quick Comparison of Several Recent Online Consultations

with 2 comments

Several online consultation and review documents that engaged my interest were published recently, so I thought it might be useful to quickly compare how they’re presented and what they have to offer.

Public Data Corporation
Firstly, the Plans for the Public Data Corporation consultation. The consultation is presented as a WordPress blog (with some untidy default widgets left in the right hand sidebar) with a brief summary and a list of ten (10) consultation questions on the front page, and then a separate page to solicit responses for each particular question:

The comments are captured using Disqus and a pre-moderation policy:

It is hard to see at a glance the extent to which people have engaged with the questions across the consultation. The premoderation policy means that there is a delay (and uncertainty) in publishing comments – so for example, the comments I posted on a Saturday morning (#bigsociety time?!;-) presumably won’t be released (if at all) until Monday morning at the earliest… meaning no on-site discussion in the comment thread over the weekend.

(See also Simon Dickson’s take on this consultation: Another Cabinet Office WP consultation.)

Where WordPress is used as a platform, single page RSS feeds and comment feeds per page are available, although it is up to the publisher to decide whether full or summary feeds are published for each page. The following Netvibes dashboard demonstrates an aggregation of single page and page level comment feeds for the PDC consultation:

This suggests that it may be possible to increase the surface area of a consultation using dashboard services, as well as developing dashboards to support the management and reactive moderation of a consultation.

Commons Committee Inquiry on Peer Review
The House of Commons Science and Technology Committee have just called for a new Inquiry into Peer Review. Eight (8) separate issues are identified and up to 3,000 word submissions in Word format with numbered paragraphs are requested by email, with a paper copy submitted as well.

In terms of online engagement, I guess this sets the minimum possible baseline?!

“Protection of Freedom Bill” Public Reading Stage
The Cabinet Office recently released a public reading stage for the Protection of Freedom Bill using a themed WordPress site. This site offers front page navigation with the number of public comments received through the platform to date identified for each page.

Comments are supported at a page level, with partial feeds supported at the page level (using ?feed=rss2&withoutcomments=1) along with full comment feeds.
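By way of illustration, here is a minimal sketch of how the two feed URLs for a page might be built. It assumes the usual WordPress behaviour in which `feed=rss2` on a single page returns that page’s comments unless `withoutcomments=1` is added; the helper names and the example.org address are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query(url, extra):
    """Return url with the extra query parameters merged into its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update(extra)
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


def page_feeds(page_url):
    """Return (content feed URL, comment feed URL) for a single WordPress page."""
    content = with_query(page_url, {"feed": "rss2", "withoutcomments": "1"})
    comments = with_query(page_url, {"feed": "rss2"})
    return content, comments


if __name__ == "__main__":
    # Hypothetical page address, for illustration only.
    for feed in page_feeds("http://example.org/?page_id=12"):
        print(feed)
```

A list of such URLs, one pair per page of the consultation, is the sort of thing that could be dropped into a feed reader or dashboard.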

WordPress comment threads are enabled.

Top level navigation across the document is preserved at the page level by means of the left-hand navigation sidebar.

Despite the legalistic nature of the Bill, paragraph level commenting is not directly supported.

(See also Simon Dickson’s response to this consultation: Can Cabinet Office’s WordPress-based commentable bills make a difference?.)

Department of Health Online Consultations
The Department of Health Online Consultations Hub provides a single home for current and recently closed consultations from the DoH. Consultations are split over several pages with clearly marked out text entry forms at the bottom of pages where feedback is requested. (That is, page level structured commenting is supported.) By providing email credentials, users can obtain a link that allows them to return to their submission to the consultation at a later date.

Resource Discovery Taskforce Request for Comments on Metadata Guidelines on JISCPress
The JISC Resource Discovery Taskforce (RDTF) request for comments on UK Metadata Guidelines was published as a multipage document on JISCPress, a WordPress installation running the digress.it theme.

Front page sidebar navigation allows access to all areas of the document and summarises the number of comments per page. Mousing over a page link on the front page loads a preview of the page in the central pane. Following a link leads to a page with floating comment box that supports threaded commenting at the paragraph level:

Each paragraph is also given a unique URI allowing it to be uniquely referenced in posts on third party sites.

Along with comments by section, comments are viewable by commenter:

[Disclaimer: I was part of the project team that proposed JISCPress and the use of the digress.it WordPress plugin and am also a member of the RDTF technical advisory group associated with this RFC.]

Summary
WordPress appears to be gaining traction as a consultation publishing platform, with either vanilla themes (e.g. Public Data Corporation proposal) or custom commentable document themes (JISC RDTF guidelines). WordPress native comments as well as third party commenting support using Disqus are demonstrated (it would be interesting to hear the rationale behind the choice of Disqus and an evaluation of how well it was deemed to have worked). Reactive and pre-moderation strategies are in evidence.

PS One more, that I should have included the first time round, on @lesteph’s ReadAndComment platform – LG Group Transparency Programme.

Whole document navigation is available from the front page as well as from the right hand sidebar on document pages (though it’s not clear if there would be a count of comments per page?) Comments are at page level via a WordPress comment entry form at the bottom of the page:

Steph hinted I won’t like the feeds… dare I look?!;-)

Written by Tony Hirst

February 19, 2011 at 10:01 pm

Google Apps as a Mashup Environment – Slides from #guug11

FWIW, here are the slides from my presentation on “Mashing Up Google Apps” at the excellent Google Apps UK User Group (#guug11), as hosted by Martin Hamilton at Loughborough University yesterday.

The “mashup environment” diagram was generated using a desktop version of Graphviz, but it can also be generated using the Google Chart Tools Graphviz chart, as in the example below:

[Image: Google Apps mashup environment]

Here’s the “source code” for that image:

digraph googApps {

GoogleSpreadsheet [shape=Msquare]
GoogleCalendar [shape=Msquare]
GoogleMail [shape=Msquare]
GoogleDocs [shape=Msquare]
CSV [shape=diamond]
JSON [shape=diamond]
HTML [shape=diamond]
XML [shape=diamond]
GoogleAppsScript [shape=diamond]
"[GoogleVizDataAPI]" [shape=diamond]
"<GoogleForm>" [shape=doubleoctagon]
"<GoogleGadgets>" [shape=doubleoctagon]
"<GoogleVizDataCharts>" [shape=doubleoctagon]
"<GoogleMaps>" [shape=doubleoctagon]

CSV->URL
HTML->URL
XML->URL
event->GoogleAppsScript
GoogleAppsScript->"<GoogleMaps>"
GoogleAppsScript->GoogleMail
GoogleAppsScript->GoogleCalendar
GoogleAppsScript->GoogleSpreadsheet
GoogleSpreadsheet->GoogleAppsScript
GoogleAppsScript->GoogleDocs
GoogleSpreadsheet->JSON
email->GoogleMail
GoogleMail->email
GoogleDocs->GoogleAppsScript
GoogleCalendar->GoogleAppsScript
"<GoogleForm>"->event
event->GoogleSpreadsheet
time->event
"<GoogleForm>"->GoogleSpreadsheet
URL->GoogleSpreadsheet
GoogleSpreadsheet->"[GoogleVizDataAPI]"
"[GoogleVizDataAPI]"->"<GoogleVizDataCharts>"
GoogleSpreadsheet->"<GoogleGadgets>"
}
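For what it’s worth, here is a minimal sketch of how a DOT description like the one above might be turned into a chart URL. It assumes the Image Charts style endpoint with `cht=gv` (optionally `gv:<engine>`) and the DOT source passed in `chl`; that service has since been deprecated, so treat the URL as illustrative rather than something guaranteed to respond. The function name is made up.

```python
from urllib.parse import urlencode

# Assumed endpoint of the (since deprecated) Google Image Charts service.
CHART_ENDPOINT = "https://chart.googleapis.com/chart"


def graphviz_chart_url(dot_source, engine=None):
    """Build a GraphViz chart URL from a DOT description.

    engine is an optional layout engine name (for example "neato");
    when omitted, the chart type is plain "gv".
    """
    chart_type = "gv" if engine is None else f"gv:{engine}"
    # urlencode percent-encodes braces, arrows, quotes and newlines in the DOT text.
    params = {"cht": chart_type, "chl": dot_source}
    return f"{CHART_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(graphviz_chart_url("digraph g { a->b }"))
```

Long graphs may well run into URL length limits when passed this way, which is one reason the desktop version of Graphviz is handy.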

And finally, here’s a snapshot of the hashtag community around the event as of mid-morning yesterday:

[Image: #guug11 Twitter echo chamber]

Node colour is related to the total number of followers, and node size reflects betweenness centrality.

Written by Tony Hirst

February 16, 2011 at 4:48 pm

Posted in Presentation

UK HE Libraries Using Google Analytics

with 3 comments

How many UK Higher Education Library websites are running Google Analytics, and how many of them are actually using it to report anything other than sitewide pageviews and visitor numbers?

A couple of years ago, I ran a series of posts on Library Analytics where I started to explore some of the ways in which Google Analytics (as it was then) could be used to help us start to understand how a library website was being used by its different sorts of visitors.

Two years on, and I’ve started looking again at Googalytics in the Library, and will hopefully get round to publishing a few posts at least about what I’ve learned about using it as it currently stands for making sense of Library website usage, and for what we may be able to report back to course teams about library website activity of users referred from course pages on the OU VLE.

One thing I thought I’d like to try to do is come up with custom reports, segments and goal recipes that might be transferable, or useful to other HE Library websites, as well as identify “best practice” approaches that are used by other HE libraries running Google Analytics… But which libraries are running Google Analytics?

Using a list of HE Library websites grabbed from a November, 2009 dump of a scrape of the Sconul website (by @ostephens, I think?), I ran a quick Python script to sniff library websites for evidence of Google Analytics tracking codes (results).

Total number of websites checked 181
Number with Google Analytics code detected 110 Percentage: 0.60773480663
Number without Google Analytics code detected 67 Percentage: 0.370165745856
Number of pages failed to load 4 Percentage: 0.0220994475138
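The script itself isn’t reproduced here, but the general approach can be sketched along the following lines. The patterns are an assumed, non-exhaustive list of markers for the tracking snippets in common use, the function names are made up, and page fetching is left out so that the example runs without network access.

```python
import re

# Assumed markers of a Google Analytics snippet; not the original script's checks.
GA_PATTERNS = [
    re.compile(r"google-analytics\.com/(ga|urchin)\.js", re.I),
    re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),
    re.compile(r"_gaq\.push|_getTracker\("),
]


def has_google_analytics(html):
    """Return True if the page source contains any of the assumed GA markers."""
    return any(pattern.search(html) for pattern in GA_PATTERNS)


def summarise(results):
    """Summarise a {url: True/False/None} mapping; None means the page failed to load."""
    total = len(results)
    counts = {
        "detected": sum(1 for v in results.values() if v is True),
        "not_detected": sum(1 for v in results.values() if v is False),
        "failed": sum(1 for v in results.values() if v is None),
    }
    # Proportions (0 to 1) of the total, guarding against an empty result set.
    proportions = {k: (n / total if total else 0.0) for k, n in counts.items()}
    return total, counts, proportions


if __name__ == "__main__":
    sample = "<script>_gaq.push(['_setAccount', 'UA-12345-1']);</script>"
    print(has_google_analytics(sample))
```

Note that the “Percentage” figures in the output above are proportions of the 181 sites checked, so 110 of 181 is roughly 61%.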

So, it seems like a fair few folk are running Google Analytics… but I wonder: what are they reporting, what segments and custom reports do they find most useful, what goals have they defined (and do they carry a meaningful “financial” conversion value? If so, defined how?), are they in any sense “actionable” (that is, have they been used to prompt interventions to increase traffic, influence on-site behaviour, feed in to website design changes, feed in to subscription or book acquisition policies, improve links with course academics, update reading lists, contribute to VLE content or structure, schedule and staff online help services, influence opening hours etc. etc.)

If you are working in an HE library, running Google Analytics, and can provide even fragmentary answers to any of the above questions, please reply in a comment below, or feel free to email me (in confidence, if required) at: a.j.hirst@open.ac.uk

PS I’m even going to start looking to the literature, too… So for example, this is next on my reading list: Turner, S. J. (2010). Website Statistics 2.0: Using Google Analytics to Measure Library Website Effectiveness. Technical Services Quarterly, 27(3), 261-278. doi:10.1080/07317131003765910

PPS I thought I’d follow the single citation to that paper too, but it seems I can’t unless I pay for it…

This is interesting, methinks. Not only is the content of the paper kept behind a paywall, but so is its incoming link context…

Written by Tony Hirst

February 11, 2011 at 12:31 pm

Posted in Library, OU2.0

“Child of the Library”

A reminder that folk music is a living tradition, and often a voice of protest…

“Child of the Library” : taken from – http://soundcloud.com/pdcawley/child-of-the-library/
Author: pdcawley

http://creativecommons.org/licenses/by-nc-sa/3.0/

Full words and more description are at Save Our Libraries

Written by Tony Hirst

February 11, 2011 at 11:00 am

Posted in Anything you want, Library

Shaping the Future of Open Data?

If open public data is your thing, or if open public data is something you think might turn out to be important in some way, here are a couple of ways to have your say, and maybe even shape the future of open public data in the UK at least…

A Vision for the Public Data Corporation
Given the lack of information about what the UK’s mooted Public Data Corporation might turn out to be, the Open Rights Group have taken the fake consultation approach and published a wiki to solicit A Vision for the Public Data Corporation (PDC). Suggested headings include:
- What is the main purpose and priority?
- Which existing trading funds should be covered by the PDC?
- Who would run it?
- How would it function?
- What business models can be proposed?
so if you think you can contribute to the vision in any of those areas, or add any you think are missing, the wiki’s waiting for you..;-)

(For some of the issues, see @hadleybeeman’s Uses for open data and chunks of @timdavies’ Open Data, Democracy and Public Sector Reform.)

The Cabinet Office seem to have been quite open to influence previously from “community” contributions to policy development (they were supportive of what Joss and I tried to achieve with WriteToReply for example), so with work ongoing for the next month or two within the Cabinet Office, if Item 2.6(ii) of the Cabinet Office’s January update to its structural reform plan [pdf] – “Work with the Shareholder Executive to drive the release of core reference data for free re-use from the Public Data Corporation (end Apr 2011)” – is anything to go by, now might be time to try to exert some sort of influence…

Consultation on Code of Practice for Local Authority Data Transparency
The Department for Communities and Local Government (DCLG) have just published a Consultation on Code of Practice for Local Authority Data Transparency

List of Local Government Data Burdens
DCLG have also published a request for comments on local government reporting to central government: Data burdens.

We have now published a draft list of the data that we think central government departments will need to request in the future. … At present, the list goes into detail on the Department’s requirements and we will be updating the list with details from other Departments. As further details become available we want to include information about requests likely to be made by other public bodies outside central government. We’re keen to involve you in the process of developing the list; we hope you can help us to fill in the gaps. … We welcome all comments, whether they are on specific requests, on general topics or on the list as a whole. We have tried to structure the list so it is easy to pick out the area that is relevant to you. We also give a brief explanation of why central government is asking for specific pieces of information.

HEFCE Review of JISC
Whilst neither an official nor a fake consultation document, HEFCE recently turned their spotlight on JISC and produced a review of JISC. If you want to comment on the review, a copy can be found in commentable form on JISCPress: HEFCE Review of JISC.

(Readers interested in that review may also be interested in the uncommentable, PDF-based Online Learning Task Force (OLTF) report to HEFCE: Collaborate to compete – Seizing the opportunity of online learning for UK higher education.)

PS a couple of other things that may be of interest from the Cabinet Office’s January update to its structural reform plan [pdf]:

- pursuant to the action “IT skunk works to assess and develop faster and cheaper ways of using ICT in government”, “IT Skunk Works team was put in place in early January.”

Work is also ongoing relating to:

- 2.3 (iii) Amend Freedom of Information guidance to extend “right to data” to public services (end Mar 2011)
- 2.3 (iv) Introduce legislative amendments to Freedom of Information Act to strengthen “right to data” (end Dec 2011)

- a couple of deadlines relating to the development of open standards and ICT procurement were missed:

– “Establish draft government open standards (including those relating to security) and crowd-source for feedback”; the missed deadline comment states: “Cabinet Office plan to ‘crowd-source’ (published on the internet to allow public inspection and specialist feedback) draft government open standards in February” so it looks as though there may be a chance to contribute here directly?

– “Announce new open standards and procurement rules for ICT, including right for skunk works to be involved prior to launch of procurement”; the missed deadline comment states: “Strong progress has been made on this commitment. A Procurement Policy Note on the procurement rules for ICT has been issued; and by establishing the Major Projects Authority we now have a mechanism whereby skunk works can become involved before the launch of procurement. However, we still need to confirm the new open standards for IT.”

- one for the legislation junkies: plans to “Present proposals to the House of Commons to introduce a new ‘public reading stage’ for Bills to give the public an opportunity to comment on proposed legislation online for use in a dedicated ‘public reading day’ within a Bill’s committee stage” were not complete, although “Progress continues to be made towards an announcement being made to Parliament in due course”

Written by Tony Hirst

February 8, 2011 at 6:12 pm

Posted in Data, Policy

Predictive Ads…? Or Email Address Targeted Advertising…?!

with one comment

As I was getting increasingly annoyed by large flashing display ads in my feedreader this morning, the thought suddenly occurred to me: could Google serve me ads on third party sites based on my unread Gmail emails?

That is, as I check my feeds before my email in a morning, could I be seeing ads that foreshadow the content of the email I’ve been ignoring for way too long? Or could I receive ads that flag the content of my Priority Inbox messages?

Rules regarding sensitivity and privacy would have to be carefully thought through, of course. Here’s how they currently stand regarding contextual ads delivered in Gmail (More on Gmail and privacy: Targeted ads in Gmail):

By offering Gmail users relevant ads and information related to the content of their messages, we aim to offer users a better webmail experience. For example, if you and your friends are planning a vacation, you may want to see news items or travel ads about the destination you’re considering.

To ensure a quality user experience for all Gmail users, we avoid showing ads reflecting sensitive or inappropriate content by only showing ads that have been classified as “Family-Safe.” We also avoid targeting ads to messages about catastrophic events or tragedies. [Google's emphasis]

[See also: Ads in Gmail and your personal data]

Not quite as future predictive as gDay™ with MATE™ that lets you “search tomorrow’s web today” and “[discover] content on the internet before it is created”, but almost…!

It’s also a step on the road to Eric Schmidt’s dream of providing you with results even before you search for them. (For a more recent interview, see Google’s Eric Schmidt predicts the future of computing – and he plans to be involved.)

Here’s another, more practical(?!) thought – suppose Google served me headers of Priority Inbox email messages that were also marked as urgent through Adwords ads, in a full-on attempt to try to attract my attention to “really important” messages?! “Flashmail” messages delivered through the Adwords network… (I can imagine at least one course manager who I suspect would try to contact me via ads when I don’t pick up my email! ;-)

Searching the internet of things may still be a little way off though….

PS thinking email address targeted ads (mailads?) through a bit more, here are a couple of ways of doing it that immediately come to mind. Suppose I want to target an ad at whoever@example.com:

1) Adwords could place that ad in my Gmail sidebar (I think they’d be unlikely to place ads within emails, even if clearly marked, because this approach has been hugely unpopular in the past; it also p****s me off in feeds). That said, Google has apparently started experimenting with (image based) display ads in Gmail;

2) Adwords could place the ad on a third party site if the Goog spots me via a cookie and sees I’m currently logged in to Google, for example, with the whoever@example.com email address.

As Facebook gets into the universal messaging game, email address based ad targeting would also work there?

PPS interesting – the best ads act as content, so maybe ads could be used to deliver linked content? Twitter promoted tweets – the AdWords for live news?. Which reminds me, I need to work up my bid for using something like AdWords to deliver targeted educational content.

Written by Tony Hirst

February 8, 2011 at 11:08 am
