Library Analytics (Part 1)

Having had a wonderful time at ILI2007 last year (summary of my talk, according to Brian Kelly – “For most of the people, most of the time, Google’s good enough – get over it…”, though I like to think I was actually talking about the idea of search hubs), I’ve joined forces with Hassan Sheikh from the OU Library on a paper for this year’s ILI2008 on the topic of using Google Analytics to track user behaviour on the Library website…

First up, it’s probably worth pointing out the unique organisation of the OU, because this impacts on the way the Library website is used.

The OU is a distance learning organisation with tens of thousands of active, offsite students; a campus, which is home to teaching academics (course writers), researchers, “academic related” services (software developers, etc.), and administrators; several regional offices; and part-time Associate Lecturers (group tutors), who typically work from home, although they may also work full- or part-time for other educational institutions.

The Library is a “trad” Library, in that it is home to books and a physical journal collection (as well as an OU course materials archive and several other collections) that are typically used by on-campus academics and researchers. The Library has also been quite go-ahead in obtaining online access to journal, ebook, image and reference collections – online access means that these services can be delivered to our student body (whereas the physical collections are used in the main by OU academic and research staff…. I assume…!;-)).

Anyway, to ease myself back into thinking about “Library Analytics”, (I haven’t looked at the Library stats for several months now), here are some warm-up exercises/starting point observations I made, for whatever they’re worth… (i.e. statements of the bleedin’ obvious;-)

Firstly, can we segment users into onsite and offsite users? (I’m pretty sure Hassan was running separate reports for these different groups, but if he is, I don’t have access to them…)

Even from just the headline report, it appears that a ‘just about significant’ amount of traffic is coming from the intranet.

Just to get my eye in, is this traffic coming from the OU campus at Walton Hall? If we look at the intranet as the traffic source, and segment according to the Network Location of the user (that is, the IP network they’re on), we can see the traffic is predominantly local:

By the by, if I’m reading the following report correctly, we can also see that most of the intranet traffic is incoming from the intranet homepage…

And as you might expect, this traffic comes on weekdays…

So here’s a working assumption then (and one that we could probe later for real insight in any principled cases where it doesn’t hold true!): most referrals from the OU intranet occur Monday to Friday, from onsite users, via the intranet homepage.
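A working assumption like this is easy enough to check against an exported list of visit dates. Here is a minimal sketch, assuming we already have the dates of the intranet-referred visits to hand (the function name and the sample dates are made up for illustration, not taken from the Library reports):

```python
from datetime import date


def weekday_share(visit_dates):
    """Return the fraction of visits that fall on Monday to Friday."""
    if not visit_dates:
        return 0.0
    # date.weekday() gives 0 for Monday through 6 for Sunday
    weekday_visits = sum(1 for d in visit_dates if d.weekday() < 5)
    return weekday_visits / len(visit_dates)


# Made-up sample: three weekday visits and one Saturday visit
sample = [date(2008, 8, 18), date(2008, 8, 19), date(2008, 8, 20), date(2008, 8, 23)]
print(weekday_share(sample))
```

If the share came out well below 1.0 for intranet referrals, that would be one of the cases worth probing further.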

Secondly, how well is the Library front page working? Whilst not as quick to read as a heat map, the Google Analytics site overlay can provide a quick way of summarising the most popular links on a page (notwithstanding its faults, such as appearing not to disambiguate certain links…)

A quick glimpse suggests the search links need dumping, and more real estate should be given over to the “Journals” and “Databases” links that are currently in the left hand sidebar, and which get 20% and 19% of the click-thrus respectively. Despite the large areas of the screen given over to the image-based navigation, they aren’t pulling much traffic. (That said, if we segment the users it might well be the case that the images in the middle of the page disproportionately attract clicks from certain sorts of user? I don’t think it’s possible to segment this out in the general report, however? For that, I guess we need to define some separate reports that are pre-segmented according to referrer?)

Just chasing the traffic a little more, I wonder if there are a few, popular databases or whether traffic is distributed over all of them equally? The Library databases page is pretty horrible – a long alphabetical list of databases – so can the analytics suggest ways of helping people find the pages they want?

So how are things distributed?

Well – it seems like some databases are more popular than others… but just how true is that observation…?

Let’s do a bit more drilling to see what people are clicking through to from the databases pages… I have to admit that here I start to get a bit confused, because the analytics are giving me two places where databases are being reached from, whereas I can only find one of the paths on the website…

Here’s the one I can find – traffic from:
http://library.open.ac.uk/find/databases/index.cfm:

And here’s what I can’t find on the website – traffic from:
http://library.open.ac.uk/databases/database/:

They both identify the same databases as most popular, though which databases those are I’ll leave for another day… because as you’ll see in a minute, this might be false popularity…

Why? Well let’s just see where the traffic for one of the most popular databases is coming from over the sample period I’ve been playing with:

Any idea why the traffic isn’t coming from the OU, but is coming from other HEIs???

Well, I happen to know that Bath, Brighton and Durham are used for OU residential schools, so I suspect that residential school students, after a reminder about the OU online Library services, are having a play, and maybe even participating in some information literacy activities that the OU Library trainers (as well as some of the courses) run at residential school…

Data – don’t ya just love it…? ;-) It sets so many traps for you to fall into!

Searching for Curriculum Development Course Insights

For almost as long as I can remember (?! e.g. Search Powered Predictions), I’ve had the gut feeling that one of the most useful indicators about the courses our students want to study is their search behaviour, both in terms of searches that drive (potential) students to the OU courses and qualifications website from organic search listings, as well as their search behaviour whilst on the OU site, and whilst floundering around within the courses and quals minisite.

A quick skim through our current strategic priorities doc (OU Futures 2008 (internal only), though you can get a flavour from the public site: Open University Strategic Priorities 2007) suggests that there is increased interest in making use of data, for example as demonstrated by the intention to develop a more systematic approach for new curriculum developments, such that the student market, demography and employment sectors are the primary considerations.

So, to give myself something to think about over the next few days/weeks, here’s a marker post about what a “course search insights” tool might offer, inspired in part by the Google Youtube Insights interface.

So, using Youtube Insight as a starting point, let’s see how far we can get…

First off, the atom is not a Youtube video, it’s a course, or to be more exact, a course page on the courses and quals website… Like this page for T320 Ebusiness technologies: foundations and practice for example. The ideas are these: what might an “Insight” report look like for a course page such as this, how might it be used to improve the discoverability of the page (and improve appropriate registration conversion rates), and how might search behaviour inform curriculum development?

Firstly, it might be handy to segment the audience reports into six:

  • people hitting the page from an organic search listing;
  • people hitting the page from an internal (OU search engine) search listing;
  • people hitting the page from an ‘organic’ link on a third party site (e.g. a link to the course page from someone’s blog);
  • people hitting the page from an external campaign/adword etc on a search engine;
  • people hitting the page from any other campaign (banner ads etc);
  • the rest…
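The segmentation above could be roughed out in code along these lines. This is only a sketch under some loud assumptions: that each visit record gives us a referrer URL and a landing URL, that campaign links are tagged with a `utm_medium` query parameter (the Google Analytics campaign tagging convention), and that the search engine markers and the internal search host name below are placeholders I have made up rather than real OU configuration.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative assumptions, not real configuration:
SEARCH_ENGINE_MARKERS = ("google.", "yahoo.", "bing.")
INTERNAL_SEARCH_HOST = "search.example.ac.uk"  # hypothetical internal search host


def classify_visit(referrer, landing_url):
    """Assign a visit to one of the six segments listed above."""
    ref_host = urlparse(referrer).netloc.lower() if referrer else ""
    # Campaign traffic is assumed to arrive on a landing URL tagged with utm_medium
    medium = parse_qs(urlparse(landing_url).query).get("utm_medium", [""])[0].lower()
    from_search_engine = any(m in ref_host for m in SEARCH_ENGINE_MARKERS)

    if ref_host == INTERNAL_SEARCH_HOST:
        return "internal search"
    if medium:
        return "search campaign" if from_search_engine else "other campaign"
    if from_search_engine:
        return "organic search"
    if ref_host:
        return "organic link"
    return "the rest"


page = "http://example.ac.uk/course/t320"  # hypothetical course page URL
print(classify_visit("http://www.google.co.uk/search?q=ebusiness", page))
print(classify_visit("http://someblog.example.org/post", page))
```

Matching on substrings of the host name is crude (it would lump any Google property in with search), so a real report would want a proper list of search hosts.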

For the purposes of this post, I’ll just focus on the first two, search related, referrers… (and maybe the third – ‘organic’ external links). What would be good to know, and how might it be useful?

First off, a summary report of the most popular search terms would be handy:

– The terms used in referrers coming from external organic search results give us some insight into the way that the search engines see the page – and may provide clues relating to how to optimise the page so as to ensure we’re getting the traffic we expect from the search engines.

– The terms used within the open.ac.uk search domain presumably come from (potential) students who have gone through at least one micro-conversion, in that they have reached, and stayed in, the OU domain. Given that we can (sometimes) identify whether users are current students (e.g. they may be logged in to the OU domain as a student) or new to the OU, there’s a possibility of segmenting here between the search terms used to find a page by current students, and new prospects.
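As a rough illustration of where such a summary report might come from, here is a sketch that pulls search terms out of referrer URLs and counts them. It assumes the referrer carries the query in a `q` parameter, which was common practice for search engines but is something many of them no longer pass on, and the sample referrers are invented for the example.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def search_term(referrer, param="q"):
    """Return the lower-cased search phrase from a referrer URL, or None."""
    values = parse_qs(urlparse(referrer).query).get(param)
    return values[0].strip().lower() if values else None


def top_terms(referrers, n=10):
    """Return the n most common search phrases as (phrase, count) pairs."""
    counts = Counter(t for t in map(search_term, referrers) if t)
    return counts.most_common(n)


# Invented sample referrers
referrers = [
    "http://www.google.co.uk/search?q=ebusiness+course",
    "http://www.google.com/search?q=Ebusiness%20course&hl=en",
    "http://example.org/blog/post",
]
print(top_terms(referrers))  # [('ebusiness course', 2)]
```

Running the same count separately over external and internal search referrers would give the two reports described above.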

(Just by the by, I emailed a load of OU course team chairs a month or two ago about what search terms they would expect potential students to use on Google (or on the OU search engine) to find their course page on the courses and quals site. I received exactly zero responses…)

The organic/third party incoming link traffic can also provide useful insight as to how courses are regarded from the outside – an analysis of link text, and maybe keyword analysis of the page containing the link – can provide us with clues about how other people are describing our courses (something which also feeds into the way that the search engines will rank our course pages; inlink/backlink analysis can further extend this approach). I’m guessing there’s not a lot of backlinking out there yet (except maybe from professional societies?), but if and when we get an affiliate scheme going, this may be one to watch…?

So that’s one batch of stuff we can look at – search terms. What else?

As a distance learning organisation, the OU has a national reach (and strategically, international aspirations), so a course insight tool might also provide useful intelligence about the geographical location of users looking at a particular course. Above average numbers of people reading about a course from a particular geo-locale might provide evidence about the effectiveness of a local campaign, or even identify a local need for a particular course (such as the opening or closure of a large employer).

The Youtube Insight reports show how, as the Google monster gets bigger, it knows more and more about us (I’m thinking of the Youtube Insight age demographic/gender report here). So providing insight about the gender split and age range of people viewing a course may be useful (we can find this information out for registered users – incoming users are rather harder to pin down…), and may provide further insight when these figures are compared to the demographics of people actually taking the course, particularly if the demographic of people who view a course on the course catalogue page differs markedly from the demographics of people who take the course…

(Notwithstanding the desire to be an “open” institution, I do sometimes wonder whether we should actually try to pitch different courses at particular demographics, but I’m probably not allowed to say things like that…;-)

As well as looking at search results that (appear to) provide satisfactory hits, it’s also worth looking at the internal searches that don’t get highly relevant results. These searches might indicate weak optimisation of pages – appropriate search terms don’t find appropriate course pages – or they might identify topics or courses that users are looking for that don’t exist in the current OU offerings. Once again, it’s probably worth segmenting these unfulfilled/unsatisfactory searches according to new prospects and current students (and maybe even going further, e.g. by trying to identify the intentions of current students by correlating their course history with their search behaviour, we may gain insight into emerging preferences relating to free choice courses within particular degree programmes).

To sum up… Search data is free, and may provide a degree of ‘at arms length’ insight about potential students before we know anything about them ‘officially’ by virtue of them registering with us, as well as insight relating to emerging interests that might help drive curriculum innovation. By looking at data analysis and insight tools that are already out there, we can start to dream about what course insight tools might look like, that can be used to mine the wealth of free search data that we can collect on a daily basis, and turn it into useful information that can help improve course discovery and conversion, and feed into curriculum development.

Olympic Medal Table Map

Every four years, I get blown away by the dedication of people who have spent the previous four years focussed on their Olympic Challenge (I find it hard to focus for more than an hour or two on any one thing!)

Anyway, I was intrigued to see this post on Google Maps Mania yesterday – Olympic Heat Maps – that displayed the Olympics medal table in the form of a heat map, along with several variants (medal tallies normalised against population, or GDP, for example).
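The normalisation itself is just a division. A minimal sketch, using placeholder country codes and made-up numbers rather than real medal or population figures:

```python
def medals_per_million(medals, population):
    """Return medals per million inhabitants for each country with a known population."""
    return {
        country: medals[country] / (population[country] / 1_000_000)
        for country in medals
        if population.get(country)  # skip countries with missing or zero population
    }


# Placeholder data: same medal count, very different populations
medals = {"AA": 10, "BB": 10}
population = {"AA": 5_000_000, "BB": 50_000_000}
print(medals_per_million(medals, population))  # {'AA': 2.0, 'BB': 0.2}
```

Swapping population for GDP gives the other variant; either way, small countries with a handful of medals can jump to the top of the table.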

The maps were neat, but static – they’d been derived by cutting and pasting a snapshot of a medals table into a Google spreadsheet, and then creating a Heat Map widget using the data…

Hmmm… ;-)

So I had a look round for a ‘live’ data source for the medals table, didn’t find anything obvious, so looked for a widget that might be pulling on a hidden data source somewhere… Whereupon I found a reference to a WordPress Olympic Medal Tally widget.

A quick peek at the code shows the widget pulling on a data feed from the 08:08:08 Olympics blog, so I ‘borrowed’ the feed and some of the widget code to produce a simple HTML table containing the ISO country codes that the Google Heat Map widget requires, linked to it from a Google Spreadsheet (Google Spreadsheets Lets You Import Online Data) and created a live Olympic medal table map (top 10).
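For the curious, the table-building step might look something like the sketch below. This is not the widget code I borrowed, and I am not reproducing the feed format here: it simply assumes the feed has already been parsed into (ISO country code, gold, silver, bronze) rows, with made-up sample rows, and writes out a plain HTML table of the kind a spreadsheet can import.

```python
from html import escape


def medal_table_html(rows, top=10):
    """Render the top rows as a simple HTML table, ranked by gold, then silver, then bronze."""
    ranked = sorted(rows, key=lambda r: (r[1], r[2], r[3]), reverse=True)[:top]
    lines = [
        "<table>",
        "<tr><th>Country</th><th>Gold</th><th>Silver</th><th>Bronze</th></tr>",
    ]
    for code, gold, silver, bronze in ranked:
        lines.append(
            f"<tr><td>{escape(code)}</td><td>{gold}</td><td>{silver}</td><td>{bronze}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Made-up rows using placeholder country codes
rows = [("BB", 1, 0, 0), ("AA", 2, 0, 0)]
print(medal_table_html(rows))
```

The first column holds the country code so that the map widget has something to key on.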

If you want to use the heat map as an iGoogle widget, here it is: Olympic Medal Table Map Widget.

Google Insights for Search (on Youtube too…)

It seems that Google opened up a supercharged variant of Google Trends over the last week or two: Google Insights for Search.

One useful feature the new service offers over the original trends service is the ability to compare the relative volumes for the same search term over several different time periods:

It’s also possible to get a breakdown by geography, or, as with Google Trends, compare volumes for different search terms.

Along with search volume trends, you also get insight into the geographical distribution of where searches are originating from (though this sort of view is always subject to interpretation!), and maybe more interestingly, related search terms and “rising searches” – that is, search phrases that have increased in volume over the specified period.

The URLs appear to be hackable/bookmarkable, too, which means that I can also bookmark them in Trendspotting (which I really need to tinker with on the templates front, at least to display inline graphs on the most recent entry, and maybe offer a preview link, too…).

I have to admit I probably wouldn’t have posted about this were it not for the fact that some of the insight views have also appeared on Youtube, at least for personally uploaded videos:

And here are the views…

Viewing by geography:

Relative popularity by geographical region:

How people came to view the movie… (i.e. “discovery”):

And finally, viewer demographics:

It’ll be interesting to see where Google go with their data products; as well as Google Insights for Search (and Google Trends), there’s also Google Analytics, Feedburner (which hasn’t yet been integrated into Blogspot – which is lacking in any stats/data tools, I think?) and Google Webmaster tools.

(There are also tools relating to Adsense/Adwords as well, of course, including this one I just found – a keyword recommender for a given URL: Google Adwords: Keyword Tool.)

And then, of course, there are all the Google visualisation widgets that are starting to appear for Google Spreadsheets, as well as around the Google visualisation API.

Embedding Youtube Videos on the BBC Website

Although I managed to get third party Youtube movies embedded in an online OU course earlier this year, mentioning the use of embedded Youtube resources in our course materials still causes moments of tension in course team meetings (“what about the rights?”, “can we trust the video will stay at that URL?” and so on), so I keep an eye out for the appearance of embedded Youtube movies on other sites that I can use as examples of how other publishers are happy to make use of embedded resources from other sites…

…like this one for example – embedded Youtube music videos on the bbc.co.uk domain:

:-)

The End of Linear TV Schedules?

Just back from a proper holiday (i.e. off the interweb), though still on a sort of holiday, so here’s a quick rag tag of a post, to follow on from the previous BBC dominated post, with a couple more BBC related things that caught my eye when in catch-up mode earlier today…

First up, it seems that someone’s picking at the linear schedule scab and looking for new ways to promote cross-channel content: When’s the sailing on? Introducing genre schedules….

The schedules can be accessed in a variety of formats (HTML, XML, JSON, iCal, etc) and also “by genre, by channel” (e.g. allowing you to tunnel into drama on Radio 4, for example) using an oh so lovely, hackable URL format – check out Tom Scott’s linked to post above for more info…

I’d say that was a “notable” step, but then, I’m not much of a media pundit…

I fired off the obligatory email to OBU/open2, of course, asking whether we’d be able to get something like this URL working:

http://www.bbc.co.uk/programmes/genres/learning/openuniversity/schedules/

so it’ll be interesting to see whether our agreement with the beeb goes so far as to allow OU co-pros to be defined as a genre in their own right!

What else was there…? Oh yes, this was interesting: Martin Belam spotted that the Beeb are experimenting with ‘in-line text links’ (BBC News in-line text links trial out in the wild).

I’ve been arguing for ages that we should be using the rather sleeker lightbox progressive enhancement for certain sorts of links, such as links to ‘optional’ Youtube videos, within our course materials… I guess I really should try to make a formal case, identifying the conditions under which we might want to open a link in an ‘overlayed’ frame, rather than the same window or a new tab, but that’s always for another day…! (That said, I have been using the approach informally – e.g. follow the “Cheswick/Burch Map of the Internet” link on this page…)

And finally, I’ve been dipping in and out of a report from Ofcom on The Communications Market 2008 (August) all day, and learning all sorts of interesting stuff, as well as finding little bits of evidence for stuff I’ve heard spoken of elsewhere…

Like this for example – a stat showing how TV fails to completely hold anyone’s attention any more!

(I’ll pull out some more graphs in a later post…)

And finally, finally, for anyone who still thinks that 360 plays are not the way forward, you should probably check out the Britain from Above website first…

I’d love to have seen a general interest short course pulled together around this programme, but I don’t think it was a co-pro…. (err, “so what?”, maybe????)

Embedding BBC iPlayer Music Videos – Foals

Having a quick look at the new BBC Music and BBC Artist pages that have been getting a lot of mentions this week, I noticed (again?!) that it’s possible to officially embed at least some iPlayer videos now:

So for example, here’s a clip of Foals from the BBC Introducing stage at the Bestival last year (I was there, they rocked…. totally…)


I have to admit, though, that I suspect that if anyone at Ofcom has a visionary moment about the potential of a BBC backed iPlayer, in the context of all the other BBC web content that’s available (including increasing amounts of semantic/linked data), they’re going to come down on the Beeb like a tonne of bricks – though it may be too late by then…

Anyway, if you’re not keeping up with iPlayer plays, here’s a good round up: BBC iPlayer 2.0: Links Roundup.

And if you’re into music, here’s another take on how the beeb sees music on the web: Sound Index. At the moment, the Sound Index artist pages don’t appear to match the BBC music artist pages (e.g. Foals (BBC Sound Index) and Foals (BBC Music, Artist pages, beta)), nor does Sound Index use the MusicBrainz artist identifier that the Artist pages do in the URL, but maybe these services will merge in the near future?

If I was in the music biz, particularly the “360” music biz where merchandise and sales around the music (and artist profile/fan pages) is arguably more lucrative than music sales themselves, I’d be getting twitchy… (no ads or clicks-to-buy on the BBC…)

PS I’m going to be offline for a week or two, taking a bit of holiday, and catching up on some reading that’ll probably include BBC Trust – PwC Study into the economic impact of the BBC on the UK, Scoring Points and Making Money!