OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for April 2009

Scripting Charts With GraphViz – Hierarchies; and a Question of Attitude

with 3 comments

A couple of weeks ago, my other was finishing off corrections to her PhD thesis. The layout of one of the diagrams – a simple hierarchy written originally using the Draw tools in an old version of MS-Word – had gone wrong, so in the final hours before the final printing session, I offered to recreate it.

Not being a draughtsman, of course I decided to script the diagram, using GraphViz:

The labels are added to the nodes using the GraphViz label command, such as:

n7[label="Trait SE"];

The edges are defined in the normal way:

n4->n8;
n4->n9;

But there was a problem – in the above figure, two nodes are placed by the GraphViz layout in the wrong place – the requirement was that the high and low nodes were ordered according to their parents, as indeed they had been ordered in the GraphViz dot file.

A bit of digging turned up a fix, though:

graph [ ordering="out" ];

is a switch that forces GraphViz to lay out the outgoing edges of each node, and hence its children, left to right in the order in which those edges are declared.

During the digging, I also found the following type of construct

{rank=same;ordering=out;n8;n9;n10;n11;n12;n13;n14;n15}

which will force a set of nodes to be positioned along the same horizontal row. Whilst I didn’t need it for the simple graph I was plotting, I can see this being a useful thing to know.
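As a rough sketch of how these pieces fit together, here is a small Python helper that writes out a dot file using the label, edge, ordering and rank=same constructs mentioned above. The function name build_dot and the label given to n4 are my own inventions for illustration; only the n7 label and the n4 edges come from the original diagram.

```python
def build_dot(labels, edges, same_rank=()):
    """Return GraphViz dot source for a simple top-down hierarchy."""
    lines = ["digraph hierarchy {", '  graph [ ordering="out" ];']
    # Node declarations, each with a display label
    for node, label in labels.items():
        lines.append(f'  {node}[label="{label}"];')
    # Edges, written in the left-to-right order we want for the children
    for parent, child in edges:
        lines.append(f"  {parent}->{child};")
    # Optionally pin a set of nodes to the same horizontal row
    if same_rank:
        lines.append("  {rank=same; " + "; ".join(same_rank) + ";}")
    lines.append("}")
    return "\n".join(lines)


labels = {
    "n7": "Trait SE",
    "n4": "Parent",  # hypothetical label, not from the original figure
    "n8": "High",
    "n9": "Low",
}
edges = [("n4", "n8"), ("n4", "n9")]
dot_source = build_dot(labels, edges, same_rank=("n8", "n9"))
print(dot_source)
```

Saving the printed text to a file and running it through the dot layout engine should give a diagram in which n8 sits to the left of n9.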

There are a few more things, though, that I want to point out about this whole exercise.

Firstly, I now tend to assume that I probably should be able to script a diagram, rather than have to draw it. (See also, for example, Writing Diagrams, RESTful Image Generation – When Text Just Won’t Do and Progressive Enhancement – Some Examples.)

Secondly, when the layout “went wrong”, I assumed there’d be a fix – and set about searching for it – and indeed found it, (along with another possibly useful trick along the way).

This second point is an attitudinal thing; knowing an amount of programming, I know that most of the things I want to do most of the time are probably possible, because they are exactly the sorts of problems that are likely to crop up again and again, and as such solutions are likely to have been coded in, or workarounds found. I assume my problem is nothing special, and I look for the answer; and often find it.

This whole attitude thing is getting to be a big bugbear of mine. Take a lot of the mashups that I post here on OUseful.info. They are generally intended not to be one off solutions. This blog is my notebook, so I use it to record “how to” stuff. And a lot of the posts are contrived to demonstrate minimally worked examples of how to do various things.

So for example, in a recent workshop I demonstrated the Last Week’s Football Reports from the Guardian Content Store API (with a little dash of SPARQL).

Now to me, this is a mashup that shows how to:

- construct a relative date limited query on the Guardian content API;
- create a media RSS feed from the result;
- identify a convention in the Guardian copy that essentially let me finesse metadata from a free text field;
- create a SPARQL query over dbpedia and use the result to annotate each result from the Guardian content API;
- create a geoRSS feed from the result that could be plotted directly on a map.

Now I appreciate that no-one in the (techie) workshop had brought a laptop, and so couldn’t really see inside the pipe (the room layout was poor, the projection screen small, my presentation completely unprepared etc etc), but even so, the discounting of the mashup as “but no-one would want to do anything with football match reviews” was…. typical.

So here’s an issue I’ve come to notice more and more. A lot of people see things literally. I look at the football match review pipe and I see it as giving me a worked example of how to create a SPARQL query in a Yahoo pipe, for example (as well as a whole load of other things, even down to how to construct a complex string, and a host of other tiny little building blocks, as well as how to string them together).

Take GraphViz as another example. I see a GraphViz file as a way of rapidly scripting and laying out diagrams using a representation that can accommodate change. It is possible to view source and correct a typo in a node label, whereas it might not be so easy to see how to do that in a jpg or gif.

“Yes but”, now comes the response, “yes, but: an average person won’t be able to use GraphViz to draw a [complicated] diagram”. Which is where my attitude problem comes in again:

1) most people don’t draw complicated diagrams anyway, ever. A hierarchical diagram with maybe 3 layers and 7 or 8 nodes would be as much as they’d ever draw; and if it was more complicated, most people wouldn’t be able to do it in Microsoft Word anyway… I.e. they wouldn’t be able to draw a presentable diagram anyway…

2) even if writing a simple script is too hard, there are already drag and drop interfaces that allow the construction of GraphViz drawings that can then be tidied up by the layout engine.

So where am I at? I’m going to have a big rethink about presenting workshops (good job I got rejected from presenting at the OU’s internal conference, then…) to try to help people to see past the literal and to the deeper truth of mashup recipes, and try to find ways of helping others shift their attitude to see technology as an enabler.

And I also need a response to the retort that “it won’t work for complicated examples” along the lines of: a) you may be right; but b) most people don’t want to do the complicated things anyway…

Written by Tony Hirst

April 28, 2009 at 9:14 pm

Posted in Anything you want


Using YQL With Yahoo Pipes

with one comment

A couple of days ago, @mikechelen asked:

“can yql plug in to pipes for improved development, compared with other cloud platforms that accept standard languages?”

where YQL is the Yahoo Query Language, a SQL like query language that can run queries on data pulled in from all over the web…

There are a couple of ways at least of doing this: a) calling YQL from Yahoo Pipes; b) calling Yahoo Pipes from within a YQL query.

First up, calling YQL from Yahoo Pipes, using the pipes YQL block and a trick I learned from @hapdaniel that lets me run a query on a couple of Google spreadsheets, where the results from one of the spreadsheets are subselected based on results of a query to a second Google spreadsheet:

select * from csv(2,500) where url = 'http://spreadsheets.google.com/pub?key=phNtm3LmDZEM6HUHUnVkPaA&gid=47&output=csv' and col4 > '70' and col1 in (select col1 from csv(2,500) where url = 'http://spreadsheets.google.com/pub?key=phNtm3LmDZEM6HUHUnVkPaA&gid=40&output=csv' and col4 > '70')

The second approach is to run a YQL query, e.g. with the YQL console, that calls on the JSON output of a Yahoo pipe (in this case, I just happen to be displaying the results from the pipe shown above. That is, a pipe that itself embeds a different YQL query).

Calling Yahoo pipes from YQL  - http://developer.yahoo.com/yql/console/

http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20json%20where%20url%3D%22http%3A%2F%2Fpipes.yahoo.com%2Fpipes%2Fpipe.run%3F_id%3D0IpRD0sP3hGFBtfI3XBDOQ%26_render%3Djson%26minArchitecture%3D70%26minPlanning%3D70%22&format=xml
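For anyone wondering how a URL like that gets put together, here is a minimal Python sketch that percent-encodes a YQL query and appends it to the public endpoint. The helper name yql_url is my own, and the code only builds the string; it makes no attempt to call the service.

```python
from urllib.parse import quote

YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def yql_url(query, fmt="xml"):
    """Build a YQL request URL, percent-encoding the query string."""
    # Everything except letters, digits, "_.-~" and "*" gets escaped
    return f"{YQL_ENDPOINT}?q={quote(query, safe='*')}&format={fmt}"


# The pipe URL, asking the pipe for JSON output with two pipe parameters
pipe_url = (
    "http://pipes.yahoo.com/pipes/pipe.run?_id=0IpRD0sP3hGFBtfI3XBDOQ"
    "&_render=json&minArchitecture=70&minPlanning=70"
)
query = f'select * from json where url="{pipe_url}"'
request_url = yql_url(query)
print(request_url)
```

The important detail is that the pipe URL's own ? and & characters are escaped, so they are read as part of the YQL query rather than as arguments to the YQL endpoint.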

So, there you have it – how to embed YQL in a Yahoo pipe, and how to call a Yahoo pipe from YQL…

PS See also HTML Tables and the Data Web and the Google Visualization API Query Language, which apparently lets you “perform various data manipulations with the query to the data source. The Query Language does not depend on the implementation of any specific data source. These data manipulations are performed by the data source server, reducing the need to perform data manipulations and formatting by developers on the client.”

Written by Tony Hirst

April 27, 2009 at 3:18 pm

Posted in Pipework, Search, Tinkering


A Few More Tweaks to the Pit Stop Strategist Spreadsheet

with 3 comments

For the first Formula One Grand Prix of the year, I put together a spreadsheet that would let you play the role of a pit stop fuel strategist (F1 Pit Stop Strategist – Fuel Stop Spreadsheet).

I missed the last couple of races, but I did get to see today’s, so whilst I was watching I also made a few tweaks to the spreadsheet.

First thing was to tweak the first pit stop estimator, by adding an offset that factors in a 3 lap fuel penalty to account for the procession lap, the formation lap, and some slack!

Secondly, I added a new sheet that allows you to play along with the race so that you can try to work out when all the other cars are likely to be pitting throughout the race.

This is very much the first pass of this spreadsheet – I’m not sure how the BBC calculate or guess at the amount of fuel added on the few occasions they do pop up an info bar, although they do show quite a few of the pit stop timings. So over the next few races (or maybe by watching replays – and with knowledge of when all the stops were actually taken) I’ll try to work on the formula that takes the pit stop time – or an estimate of how long the fuel hose was attached – and calculates the fuel loaded (and hence number of extra laps that car can complete).

The other thing I added to the strategist spreadsheet was a display of the best sector times from each driver in Q3, charted relative to the best sector times of a nominated driver:

(Obviously, a similar chart could also be used to display the best sector times for each driver during the race.)

You can find the race day strategist spreadsheet here: Race Day Strategist Spreadsheet.

As far as post-race stats go, I was intrigued as to whether lap times show any benefit to decreasing car weight as fuel is used up each lap – so here are the time differences between consecutive laps for Button:

(For the pit stops, I limited the time to 3s.)
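A minimal sketch of that calculation in Python might look like the following. The lap times are made-up numbers purely for illustration (not Button's actual times), and I am assuming that "limiting to 3s" means clamping each difference to the range -3s to +3s.

```python
def lap_deltas(lap_times, cap=3.0):
    """Differences between consecutive lap times, clamped to +/- cap seconds."""
    deltas = []
    for previous, current in zip(lap_times, lap_times[1:]):
        diff = current - previous  # negative means the later lap was quicker
        deltas.append(max(-cap, min(cap, diff)))
    return deltas


# Illustrative lap times in seconds; the fourth lap includes a pit stop
example_laps = [90.0, 89.5, 89.25, 110.0, 88.0]
print(lap_deltas(example_laps))
```

With the difference defined as this lap minus the previous lap, an improvement comes out negative and so would plot below the line; flipping the subtraction would put it above.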

I’m not sure whether an improvement in lap time should be shown above the line, or below the line?

Written by Tony Hirst

April 26, 2009 at 7:39 pm

Posted in Tinkering


How To Create Wordcloud from a Hashtag Feed in a Few Easy Steps

with 2 comments

So I was struggling for a quick hit blog post to publish today (busy :-( ), but then I got a tweet from @paulbradshaw asking “Any ideas how you could make mashup showing the frequency of certain words in hashtagged tweets – e.g. tagcloud.”

Hmm – like this maybe?

create word cloud from hashtag feed

:-)

[NOTE - you need to encode the hashtag as %23 in the feed URI.]
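If you would rather script it, the two ingredients are the hashtag encoding and a word frequency count. Here is a minimal Python sketch; the function names and the sample tweets are hypothetical, and fetching the actual feed is left out.

```python
import re
from collections import Counter
from urllib.parse import quote


def encode_hashtag(tag):
    """Percent-encode a hashtag so the # survives inside a feed URI."""
    return quote(tag, safe="")


def word_frequencies(tweets, stopwords=frozenset()):
    """Count how often each word appears across a list of tweet texts."""
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z']+", tweet.lower())
        counts.update(word for word in words if word not in stopwords)
    return counts


print(encode_hashtag("#demo"))

# Made-up tweets standing in for the items in a hashtag search feed
sample_tweets = ["Great talk #demo", "great demo of pipes #demo"]
frequencies = word_frequencies(sample_tweets, stopwords={"of"})
print(frequencies.most_common(3))
```

The counts are what a word cloud tool would then use to size each word.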

I call this technique a screencaptutorial… (err….?!)

The screen capture was made using Jing, and the white background comes from an empty text editor document exploded to fill the screen.

For more info on manipulating Twitter search feeds, see Twitter Powered Subtitles for Conference Audio/Videos on Youtube.

PS I’m not sure whether the wordle app generates a static word cloud from a feed, or a more dynamic one? (That is, does it just grab the feed contents at the time the word cloud is created and use those to generate a one-hit word cloud, or does it keep sampling the feed?) If you want a live word cloud, then a better way is to import the feed into a Google spreadsheet, publish the spreadsheet, take a CSV output from it and drop it into Many Eyes wikified. Or create a web page of your own and generate the word cloud from the feed (maybe pulling it into the page as JSON via Yahoo pipe, so you can get around having to use a proxy to pull the feed into the page) using a word cloud javascript library such as Dynacloud, Cloudinizr or Cloudy.

Written by Tony Hirst

April 23, 2009 at 11:12 am

Ordered Lists of Links from delicious Using Yahoo Pipes

with 7 comments

One of the things that I often use the delicious social bookmarking service for is to push lists of links into web pages, web dashboards, or the feedshow link presenter. However, sometimes it’s important to be able to push the links in a particular order (particularly for the link presenter) rather than the order in which the links were bookmarked (i.e. order by timestamp based on when the bookmark was saved).

So a couple of days ago it occurred to me that I should be able to do this with a simple Yahoo Pipe by using tags to order the sequence of links and sorting on those. So for anyone who remembers programming in BASIC, and numbering the lines 10, 20, 30 (or 100, 200, 300) to give yourself “room” to insert additional lines, the following convention may be familiar…

STEP 1: tag your links according to the convention: ORDERLABEL:nnn. So for example, to provide raw testing material for my pipe I tagged three links with the following variants: orderA:1000, orderB:120, orderC:103; orderA:3000, orderB:110, orderC:102; and orderA:2000, orderB:130, orderC:101. It also makes sense to tag each item with just ORDERLABEL, so you can pull out just those items from delicious.

STEP 2: build the pipe. My idea here was to grab the list of tags for each link as a string, use a regular expression to parse out just the sequence number that follows the chosen order label (e.g. orderA, orderB or orderC in my test case), and then sort the feed on those numbers…

Unfortunately, delicious doesn’t emit all the tags in a single element (at least, not as far as Yahoo! Pipes are concerned):

And even more unfortunately (for me at least), I don’t know an effective way of combining these sub-elements into a single element? (The Sub-element pipe operator will convert every item in each category subelement list to an element in its own right, but that’s not a lot of use as I don’t know how to copy the title, link and description elements into each category subelement…)

So what to do?

Well, it turns out you can use this sort of construction in a regular expression block:
${category.0.content}
which says “use the content of the 0’th category subelement”.

Which means in turn that if I refer to each of n tags explicitly (as in: ${category.n.content}), I can construct a single string containing all n categories (i.e. all the tags in a single string).

We copy the title element as an element of convenience to create an order element within the feed. The string block constructs a single replacement string for the regular expression that will replace the original contents of the order element with the content element from the first 16 category subelements. Following the regular expression replacement, the order element now contains up to the first 16 tags associated with the element in a single string.

The next step is to filter the feed so that we only pass elements that contain tags that are based on the ORDERLABEL root (in this case, I am sorting on things like orderA:1000, orderA:2000, etc):

(Remember that we could use another tag (I used orderedfeedtest in this example) to pull in all the orderA:nnn tagged bookmarks.)

The appropriately order number tagged elements are then processed so that the order element is rewritten with just the “line number” for each feed item (so e.g. orderA:2000 would become 2000), and the items in the feed sorted using this element.

By specifying the appropriate ordering label, we can force the order in which feed items are displayed:

And then:

You can find the pipe here: delicious feed ordered by “tag line numbers”.
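For comparison, the same recipe can be sketched in a few lines of Python: join the tags into one string, pull out the number that follows the chosen order label, drop anything without that label, and sort. The item structure and the function name are my own assumptions; the tags are the test values from STEP 1.

```python
import re


def order_items(items, order_label):
    """Return the items that carry an order_label:nnn tag, sorted by nnn."""
    pattern = re.compile(rf"\b{re.escape(order_label)}:(\d+)\b")
    keyed = []
    for item in items:
        # Like the pipe, only look at the first 16 tags, joined into one string
        tag_string = " ".join(item["tags"][:16])
        match = pattern.search(tag_string)
        if match:  # filter out items without the chosen label
            keyed.append((int(match.group(1)), item))
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]


bookmarks = [
    {"title": "link 1", "tags": ["orderedfeedtest", "orderA:1000", "orderB:120", "orderC:103"]},
    {"title": "link 2", "tags": ["orderedfeedtest", "orderA:3000", "orderB:110", "orderC:102"]},
    {"title": "link 3", "tags": ["orderedfeedtest", "orderA:2000", "orderB:130", "orderC:101"]},
]

for label in ("orderA", "orderB", "orderC"):
    print(label, [item["title"] for item in order_items(bookmarks, label)])
```

Note that this sorts the sequence numbers as integers; if the numbers were compared as text instead, 1000 would sort before 200, which is one reason to keep all the numbers for a given label the same length.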

Written by Tony Hirst

April 22, 2009 at 12:30 am

The OU on iPlayer (Err? Sort of not…)

with 2 comments

Last week, Martin Belam blogged a must-read-it-if-you-haven’t-already post on How the Ian Tomlinson G20 video spread The Guardian brand across the media, describing how the Guardian watermarked video splashed the Guardian brand across numerous other news websites and publications through their embedding of the video footage, or images captured from the video:

Having The Guardian’s logo burned into the video footage meant that many other online news publications had to display images which advertised the fact that they had not been the first paper to get access to this content. They approached reproducing the images and crediting The Guardian in a variety of ways both in print and online.

Now I know that OU co-produced content for the BBC is a different beast altogether, but let’s just see for sake of interest how the OU brand gets splashed across the web via co-produced content that is made available on the BBC iPlayer.

Take Coast, for example:

Start the programme playing, and we get the broadcast channel ident:

BBC iplayer coast ident

(I’m not sure what’s used if a programme is repeated on another channel a week later? Which ident is used then, e.g. 29 days after the broadcast on the original channel but maybe only 22 days after broadcast on the secondary channel?)

Then we get the intro at the top of the programme…

Notice the double dose of BBC Branding?

BBC identified on iPlayer

The watermark in the top left corner is present throughout:

The OU does get a mention, of a sort in the closing credits, but the further details URI is a BBC one:

And finally, at the closing captions:

…no URL though…

Maybe there’s a mention in the programme info?

Hmmm…

Okay, so how about OU/BBC co-pro content that makes it onto the official BBC Youtube channel? Something like a bit of James May’s 20th century?

Well, I guess there’s a textual credit, even though it’s the BBC’s logo watermarked into the actual video. And the video does have embedding disabled, so other people can’t run with the content… (I’m not sure if we’re allowed to put content like that on the open2.net site, though, which has historically been run as a co-branded OU/BBC site and under BBC editorial guidelines (although I believe that may be set to change…).)

So how would it be different on an OU iPlayer (cf. CBeebies iPlayer, and Why I Think an OU iPlayer Presence Would be a “Good Thing”)?

Well if we had a version of iPlayer cf. the CBBC iPlayer, a programme could possibly open with an OU ident and carry an OU watermark?

And failing that, on the main iPlayer site, a semi-transparent, overlaid OU watermark logo somewhere might be appropriate?

Written by Tony Hirst

April 18, 2009 at 4:49 pm

Posted in BBC, OBU

Data DOIs

leave a comment »

Okay, here’s another Friday twitter brainstorm capture post, this time arising from my responses to @jimdowning who made a query in response to a tweet I made about an interesting looking “DOIs for data” proposal…

Here’s what I pondered:

- why might it be useful? Err, “eg allows to resolve either to Guardian data blog data on google docs or national stats copy of a data set?” That is, several of the data sets that have been republished by the Guardian on google docs duplicate (I think) data from National Statistics. A data DOI service could resolve to either of these data sets depending on a user’s preferences…

Hmmm… ;-)

But I can also imagine derived data DOIs that extend eg journal paper DOIs in a standardised way, and then point to data that relates to the original journal article. So for example, an article DOI such as doi:nnn-nnn.n might be used to generate a data DOI that extends the original DOI, such as doi:nnn-nnn.n-data; or we might imagine a parallel data DOI resolution service that reuses the same DOI: data-doi:nnn-nnn.n.

Where multiple data sets are associated with an article, it might be pragmatic to add a suffix to the doi, such as data-doi:nnn-nnn.n-M to represent the M’th dataset associated with the article? For only one dataset, it could be identified as data-doi:nnn-nnn.n-0 (always count from 0, right?;-), with data-doi:nnn-nnn.n (i.e. no suffix) returning the number of data sets associated with the article?
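To make the naming idea concrete, here is a tiny Python sketch of the two conventions floated above. These are purely hypothetical schemes from this brainstorm, not anything a real DOI resolver supports, and the function names are made up.

```python
def strip_scheme(doi):
    """Drop a leading 'doi:' prefix if there is one."""
    return doi[4:] if doi.startswith("doi:") else doi


def derived_data_doi(article_doi):
    """Extend the article DOI itself, e.g. doi:X becomes doi:X-data."""
    return f"doi:{strip_scheme(article_doi)}-data"


def parallel_data_doi(article_doi, index=None):
    """Reuse the article DOI under a parallel data-doi: scheme.

    With no index, the identifier would stand for the count of data sets;
    with an index M (counting from 0), it stands for the M'th data set.
    """
    base = f"data-doi:{strip_scheme(article_doi)}"
    return base if index is None else f"{base}-{index}"


article = "doi:nnn-nnn.n"
print(derived_data_doi(article))
print(parallel_data_doi(article))
print(parallel_data_doi(article, 0))
```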

PS hmmm this reminds me of something the name of which I forget (cf. image extraction from PDFs), where assets associated with an article are unbundled from the article (images, tabulated data and so on); how are these things referenced? Are references derivable from the parent URI?

PPS Maybe related? (I really need to get round to reading this…) How Data RSS Might Work.

Written by Tony Hirst

April 17, 2009 at 2:23 pm

Posted in Thinkses


Lazy Acquisition of Article Citations

with 5 comments

So this post is just to try and capture comments…

The problem: I’m writing some stuff for a course to be delivered via the VLE; we have a hugely elaborate structured authoring system that doesn’t make much use of structure or semantics but I’d like to be able to do something like:

1) enter a doi for an article as a structured element;
2) generate a link to that article through libezproxy (easy – http://libezproxy.open.ac.uk/login?url=http://dx.doi.org/nnn.nn-n.etc);
3) pull in the reference to the article in whatever format we are supposed to use.
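Step 2 really is the easy bit. As a minimal Python sketch (the function name is my own, and the DOI is the same placeholder used in the list above rather than a real identifier):

```python
EZPROXY_LOGIN = "http://libezproxy.open.ac.uk/login?url="
DOI_RESOLVER = "http://dx.doi.org/"


def proxied_doi_link(doi):
    """Build a link that resolves a DOI via the library's ezproxy login."""
    return f"{EZPROXY_LOGIN}{DOI_RESOLVER}{doi}"


# Placeholder DOI, as in the pattern above
print(proxied_doi_link("nnn.nn-n.etc"))
```

Steps 1 and 3 are the ones that need support from the authoring system and from whoever holds the citation data.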

I thought one easy way to do this might be if there was something like a http://dx.doi.org/citation/ service that would return citation info for a DOI, or if journal publishers supported an argument on journal paper web pages along the lines of ?&citationstyle=ALA that would return as plain text the citation for the article in the declared style.

@scottbw suggested using RSS to deliver all the info required to generate a citation – eg I could see this working by adding /rss/ in to the path or just ?output-style=rss.

One huge advantage of getting preformatted text (or text in a json object) would be that it would be trivial to derive a URI from the URI of a journal article page that pointed to the citation, and then embed that URI/citation straightforwardly in a web page.

Looking at reference pages in ORO/eprints, e.g. http://oro.open.ac.uk/6701/:

it’d be nice to be able to say something like http://oro.open.ac.uk/6701/?citationstyle=true and get this out, as text, as a result:

Adams, Anne, Blandford, Ann and Lunt, Peter (2005). Social empowerment and exclusion: A case study on digital libraries. ACM Transactions on Computer-Human Interaction, 12(2), pp. 174–200.

Maybe?

PS see also Scholarly HTML: Simple, rational, modern citations using links as well as the comments below…

Written by Tony Hirst

April 17, 2009 at 1:06 pm

Posted in Thinkses


Finding Rights Cleared Video Resources for Use in Course Materials

with 4 comments

Way back when, the Library piloted a video search engine – DiVA – that would search over some of the video material that had been produced specifically for OU courses (Course Content Image Search) and possibly over some of the content the OU had co-produced with the BBC.

Recently, of course, the OU has got into co-producing flagship programmes for BBC1 and BBC2, as well as the lesser channels, but as far as I know, there is no easy way for us to search over this material (the best way used to be the now deprecated BBC Catalogue search).

As the BBC programme catalogue adds entries, this will become increasingly valuable for resource discovery, and it will also be interesting to see how Box of Broadcasts plays out, too.

For using video in courses, there are three main issues: 1) discovery of the clip; 2) rights clearance; 3) actually getting the video embedded in the VLE.

In an ideal world, I’d quite like to be able to go to an institutional version of Youtube, enter the search terms and get a video clip. This is already possible in the Youtube universe, of course…. For example, I want to use a clip from a James May programme that the OU co-produced, so the easiest way I could think of saying “this is the clip I want” was to search for james may motion capture on Youtube and grab the top result:

Overall time to go from thinking “I’d like this clip” to getting an embed code for it (albeit a copyright infringing one)? Less than a minute.

I have just started the process of trying to get an official version of the clip (start time: 14.00 Weds April 15th, 2009…) so it’ll be interesting to see whether I can get this clip in the VLE in time for when it’s actually needed at the start of June. Indeed, I’m not even sure I sent the email to the right person, so maybe I’ve only false started on actually finding out how to get this clip?!

When it comes to referring students to complete programmes, I’m not sure what the best approach is?

My ad hoc approach would be to try to find out whether a programme was likely to be broadcast on the BBC somewhere during the presentation of the course, and if it was, telling students to find it on iPlayer.

I’d possibly also look for links to what I needed from a BBC Programmes catalogue listing, the BBC World Service documentaries archive, the BBC Four interviews archive (deprecated), the BBC Learning Zone class clips website, or the BBC Archive and so on. (If they were no good, I’d end up on Redux….).

…and that’s just the BBC of course: the other UK terrestrial channels (or at least, ITV and Channel 4) now happily stream catch-up services on the web, as well as making some of their content available (at least in Channel 4’s case) to services such as Joost (e.g. Channel 4 shows on Joost).

I’m not sure whether it’d also be useful to start compiling lists of links to BBC programme pages for OU co-pro programmes, because there’s nothing that obviously fulfills that role on Open2.net. (The closest I have at the moment is the OU/BBC iPlayer catch-up mashups here and here).

Written by Tony Hirst

April 15, 2009 at 2:00 pm

Posted in BBC, OBU

Mashing Up Government the RSS Way: Raw Materials

with 3 comments

Three or four weeks ago, @adrianshort tipped me off about a campaign he was trying to put together to encourage local councils to start publishing autodiscoverable feeds from their homepages. Various overcommitments of my own meant I couldn’t contribute anything to this initiative, but it’s great to see it up and running now at Mash the State:

So how does my local council do?

Boo – no autodiscoverable feeds on their homepage…. (I wonder whether it might be an idea to have a link to the council page that is being checked for autodiscoverable links, so that people can see which page it actually is and scout around it for non-autodiscoverable feeds?)
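For reference, a feed is autodiscoverable when the page head carries a link element with rel="alternate" and a feed MIME type. Here is a minimal Python sketch of that check using only the standard library; I have no idea how Mash the State actually does its test, and the sample HTML is invented.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the href of every autodiscoverable feed link in a page."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Invented homepage markup, for illustration only
sample_page = """<html><head><title>Example Council</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="News" href="/news/rss.xml">
</head><body></body></html>"""
print(find_feeds(sample_page))
```

A page with feeds that are only linked from the body text would come back with an empty list, which is exactly the "non-autodiscoverable" case.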

Although the campaign is targeted at encouraging councils to publish RSS news feeds, there’s a range of other feeds that they could usefully publish too, potentially without too much effort.

For example, councils can make use of the Planning Alerts service that scrapes planning info from local council websites (presumably it would get the data via feeds if the data were made available that way? [Update: the link is there, I just hadn't noticed it - the name of the council in the body text is a link to the assumed council home page.]):

These feeds include geo-data too, which means you can plot the feed on a map:

(I started exploring an even richer planning map for the IW Council, who provide (albeit in a hard to find way) audio recordings of council planning meetings. You can find the proof of concept here: Barriers to Open Availability of Information? IW Planning Committee Audio Recordings.)

Roadworks feeds might be another useful service? Elgin (the electronic local government information network) is one source of this information, although their results listings aren’t available as RSS, and in constructing the URLs for the search results, you need to know the Local Authority Area number :-( (Is there a straightforward list of these available anywhere?)

As well as the opening up of the Mash the State Campaign, I also spotted this week that the UK Parliament website was now providing RSS feeds detailing the progress of every bill currently going through Parliament:

Having the RSS feed means it’s trivial to create a timeline view of a Bill’s progress using a service such as Dipity. So for example, here’s a timeline depicting the progress of the Coroners and Justice Bill:

Coroners and Justice bill timeline http://www.dipity.com/psychemedia/Coroners-and-Justice

(I’m not sure if there’s an official way of tracking amendments to already enacted Acts? If not, here’s a workaround I put together some time ago – Tracking UK Parliamentary Act Amendments – although I don’t know whether it’s still working?)

PS this looks like an interesting related collection of links: Mashups in government; and this post – Sign up, sign up for Open Source – describes some innovative looking local council projects (I like the idea of a planning application tracker, cf. the government Bill tracker, maybe?)

PPS Although the percentage of councils that currently have autodiscoverable feeds on their homepage is quite low, it’s still a better uptake than for HEIs: Back from Behind Enemy Lines, Without Being Autodiscovered(?!) and Autodiscoverable RSS Feeds From HEI Library Websites. See also 404 “Page Not Found” Error pages and Autodiscoverable Feeds for UK Government Departments.

Written by Tony Hirst

April 11, 2009 at 10:20 am
