Olympic Medal Table Map

Every four years, I get blown away by the dedication of people who have spent the previous four years focussed on their Olympic Challenge (I find it hard to focus for more than an hour or two on any one thing!).

Anyway, I was intrigued to see this post on Google Maps Mania yesterday – Olympic Heat Maps – that displayed the Olympics medal table in the form of a heat map, along with several variants (medal tallies normalised against population, or GDP, for example).

The maps were neat, but static – they’d been derived by cutting and pasting a snapshot of a medals table into a Google spreadsheet, and then creating a Heat Map widget using the data…

Hmmm… ;-)

So I had a look round for a ‘live’ data source for the medals table, didn’t find anything obvious, and so looked for a widget that might be pulling on a hidden data source somewhere… whereupon I found a reference to a WordPress Olympic Medal Tally widget.

A quick peek at the code shows the widget pulling on a data feed from the 08:08:08 Olympics blog, so I ‘borrowed’ the feed and some of the widget code to produce a simple HTML table containing the ISO country codes that the Google Heat Map widget requires, linked to it from a Google Spreadsheet (Google Spreadsheets Lets You Import Online Data) and created a live Olympic medal table map (top 10).
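
By way of illustration, here’s a minimal sketch of that middle step – turning some medal data into the simple HTML table of ISO country codes and tallies that can then be pulled into the spreadsheet (and from there into the heat map gadget). The figures are hand-entered sample values rather than the live 08:08:08 feed, and the feed-parsing itself is left out:

```python
# Minimal sketch: write an HTML table of ISO country codes and medal tallies
# that a Google spreadsheet (via =importHTML) and the Heat Map gadget can use.
# The numbers below are illustrative sample values, not live feed data.
medals = {
    "CN": 51,   # China
    "US": 36,   # USA
    "RU": 23,   # Russia
    "GB": 19,   # Great Britain
    "DE": 16,   # Germany
}

rows = "\n".join(
    "<tr><td>{}</td><td>{}</td></tr>".format(iso, tally)
    for iso, tally in sorted(medals.items(), key=lambda kv: -kv[1])
)

with open("medal_table.html", "w") as f:
    f.write("<table>\n<tr><th>Country</th><th>Golds</th></tr>\n" + rows + "\n</table>\n")
```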

If you want to use the heat map as an iGoogle widget, here it is: Olympic Medal Table Map Widget.

More Olympics Medal Table Visualisations

So the Olympics is over, and now’s the time to start exploring various views over the data tables in a leisurely way:-)

A quick scout around shows that the New York Times (of course) have an interactive view of the medals table, also showing a historical dimension:

Channel 4’s interactive table explores medal table ‘normalisation’ according to population, GDP and so on…

GDP and population data have also been taken into account in a couple of visualisations created on Many Eyes – like this one:

Not wanting to be left out of the fun, I spent a bit of time this evening scraping data from the Overall medal standing table and popping it into Many Eyes myself.
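
For anyone wondering what the scraping amounts to, here’s a rough sketch of the sort of thing – the URL is a made-up stand-in (the original results pages have long since moved), and the output is the tab-separated text that Many Eyes likes to have pasted in:

```python
# Rough sketch: scrape an HTML results table and save it as tab-separated
# text for pasting into Many Eyes. The URL is a placeholder, and the first
# table on the page is assumed to be the medal standings table.
import pandas as pd

URL = "http://example.com/olympics/2008/medal-standings"  # hypothetical URL

tables = pd.read_html(URL)            # parses every <table> on the page
standings = tables[0]                 # assume the first table is the one we want
standings.to_csv("medal_standings.tsv", sep="\t", index=False)
```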

(Note that there’s lots of mashable stuff – and some nice URLs – on the http://en.beijing2008.cn/ website… why, oh, why didn’t I think to have a play with it over the last couple of weeks?) :-(

Anyway, I’ve uploaded the results, by discipline, for the Olympics 2008 Medal Table (Top 10, by Tally) and had a quick play to see what sort of views might be useful in visualising the wealth of information the data contains.

First up, here are the disciplines that the top 10 countries (by medal tally) were excelling at:

Treemaps are one of my favourite visualisation tools. The Many Eyes treemap, whilst not allowing much control over colour palettes, does make it easy to reorder the hierarchy used for the treemap.

Here’s a view by discipline, then country, that allows you to see the relative number of medals awarded by discipline, and the countries that ‘medalled’ within them:

Rearranging the view, we can see how well each country fared in terms of total medal haul, as well as the number of medals in each medal class.

The search tool makes it easy to see medals awarded in a particular discipline by country and medal class – so for example, here’s where the swimming medals went:

A network diagram view lets us see (sort of) another view of the disciplines that each country took medals in.

The matrix chart is more familiar, and shows relative medal hauls for gold, silver and bronze, by country.

By changing the colour display to show the disciplines medals were awarded in, we can see which of the countries won swimming medals, for example.

Enough for now… the data’s on the Many Eyes site if you want to create your own visualisations with it… You should be able to reduce the data (e.g. by creating copies of the data set with particular columns omitted) to produce simpler visualisations (e.g. simpler treemaps).

You can also take a copy of the data to use in your own data sets (e.g. normalising it by GDP, population, and so on).

If you do create any derived visualisations, please post a link back as a comment to this post :-)

Journal Impact Factor Visualisation

Whilst looking around for inspiration for things that could go into a mashup to jazz up the OU repository, I came across the rather wonderful eigenfactor.org which provides an alternative (the “eigenfactor”) to the Thomson Scientific Impact Factor measure of academic journal “weight”.

The site provides a range of graphical tools for exploring the relative impact of journals in a particular discipline, as well as a traditional search box for tracking down a particular journal.

Here’s how we can start to explore the journals in a particular area using an interactive graphical map:

The top journals in the field are listed on the right hand side, and the related fields are displayed within the central panel view.

A motion chart (you know: Hans Rosling; Gapminder…) shows how well particular journals have fared over time:

As well as providing eigenfactor (cf. impact) ratings for several hundred journals, the site also provides a “cost effectiveness” measure that attempts to reconcile a journal’s eigenfactor with its cost, giving buyers an idea of how much “bang per buck” their patrons are likely to get from a particular journal (e.g. in terms of how well a particular journal provides access to frequent, highly cited papers in a particular area, given its cost).

Reports are also available for each listed journal:

Finally, if you want to know how eigenfactors are calculated (it’s fun :-), the algorithm is described here: eigenfactor calculation.
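
To give a flavour of it (and this is only a simplified sketch of my own, not the published algorithm): the eigenfactor is essentially a PageRank-style eigenvector calculation over the journal-to-journal citation network, along the lines of:

```python
# Simplified, PageRank-flavoured sketch of an eigenfactor-style calculation.
# Z[i][j] counts citations from journal j to journal i (self-citations zeroed);
# the published algorithm also weights by article counts and treats "dangling"
# journals more carefully, so this is purely illustrative.
import numpy as np

Z = np.array([[0, 3, 1],
              [2, 0, 4],
              [5, 1, 0]], dtype=float)   # toy 3-journal citation matrix

H = Z / Z.sum(axis=0)                    # column-normalise into citation "shares"
alpha = 0.85                             # damping factor, as in PageRank
n = H.shape[0]

pi = np.ones(n) / n                      # start from a uniform influence vector
for _ in range(100):                     # power iteration to the leading eigenvector
    pi = alpha * H.dot(pi) + (1 - alpha) / n
    pi /= pi.sum()

print(pi)                                # relative journal influence scores
```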

Rehashing Old Tools to Look at CCK08

I haven’t posted for a few days (nothing to write about, sigh….) so here’s a cheap’n’lazy post reusing a couple of old visual demos (edupunk chatter, More Hyperbolic Tree Visualisations – delicious URL History: Users by Tag) to look at what’s happening around the use of the CCK08 tag that’s being used to annotate – in a distributed way – the Connectivism and Connective Knowledge online course.

For example, here’s a view of people who have been using the cck08 tag on delicious:

People twittering mentions of cck08:

And here’s how people have been tagging the Connectivism and Connective Knowledge course homepage on delicious (along with the people who’ve been using those tags).

The next step is to move from hierarchical info displays (such as the above) to mining networks – groups of people who are talking about the same URLs on delicious and twitter, and maybe even blogging about CCK08 too…

Thoughts on Visualising the OU Twitter Network…

“Thoughts”, because I don’t have time to do this right now (although it shouldn’t take that long to pull together? Maybe half a day, at most?), and also to give a glimpse into the sort of thinking I’d do walking the dog, in between having an initial idea about something to hack together, and actually doing it…

So here’s the premise: what sort of network exists within the OU on Twitter?

Stuff I’d need – a list of all the usernames of people active in the OU on Twitter; Liam is aggregating some on PlanetOU, I think, and I seem to remember I’ve linked to an IET aggregation before.

Stuff to do (“drafting the algorithm” – a rough code sketch follows the list):

– for each username, pull down the list of the people they follow (and the people who follow them?);
– clean each list so it only contains the names of OU folks (we’re gonna start with a first order knowledge flow network, only looking at links within the OU).
– for each person, p_i, with followers F_ij, create pairs username(p_i)->username(F_ij); or maybe build a matrix: M(i,j)=1 if p_j follows p_i??
– imagine two sorts of visualisation: one, an undirected network graph (using Graphviz) that only shows links where following is reciprocated (A follows B AND B follows A); secondly, a directed graph visualisation, where the link simply represents “follows”.
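
Here’s that rough sketch – the usernames and follower lists are hard-coded placeholders (in practice they’d be pulled from the Twitter API and filtered down to OU folks), and the output is a pair of Graphviz dot files, one for each of the two visualisations:

```python
# Rough sketch of the steps above. Follower lists are placeholder data; in
# practice they would come from the Twitter API, filtered to OU usernames.
ou_users = {"alice", "bob", "carol", "dave"}

# follows[u] = set of OU people that u follows (placeholder data)
follows = {
    "alice": {"bob", "carol"},
    "bob":   {"alice"},
    "carol": {"alice", "dave"},
    "dave":  set(),
}

# Directed edges: u -> v means "u follows v"
directed = [(u, v) for u, f in follows.items() for v in f if v in ou_users]

# Reciprocated links: keep a pair only if both directions exist
reciprocal = {tuple(sorted((u, v)))
              for (u, v) in directed if u in follows.get(v, set())}

# Emit Graphviz dot descriptions of both views
with open("ou_follows.dot", "w") as f:
    f.write("digraph ou {\n")
    for u, v in directed:
        f.write('  "{}" -> "{}";\n'.format(u, v))
    f.write("}\n")

with open("ou_reciprocal.dot", "w") as f:
    f.write("graph ou {\n")
    for u, v in sorted(reciprocal):
        f.write('  "{}" -- "{}";\n'.format(u, v))
    f.write("}\n")
```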

Why bother? Because we want to look at how people are connected, and see if there are any natural clusters (this might be most evident in the reciprocal link case?) cf. the author clusters evident in looking at ORO co-authorship stuff. Does the network diagram give an inkling as to how knowledge might flow round the OU? Are there distinct clusters/small worlds connected to other distinct clusters by one or two individuals (I’m guessing people like Martin who follows everyone who follows him?). Are there “supernodes” in the network that can be used to get a message out to different groups?

Re: the matrix view: I need to read up on matrices… maybe there’s something we can do to identify clusters in there?

Now if only I had a few hours spare…

Data Scraping Wikipedia with Google Spreadsheets

Prompted in part by a presentation I have to give tomorrow at an OU eLearning community session (I hope some folks turn up – the 90 minute session on Mashing Up the PLE – RSS edition is the only reason I’m going in…), and in part by Scott Leslie’s compelling programme for a similar duration Mashing Up your own PLE session (scene setting here: Hunting the Wily “PLE”), I started having a tinker with using Google spreadsheets for data table screenscraping.

So here’s a quick summary of (part of) what I found I could do.

The Google spreadsheet function =importHTML("","table",N) will scrape a table from an HTML web page into a Google spreadsheet. The URL of the target web page, and the target table element both need to be in double quotes. The number N identifies the N’th table in the page (counting starts at 0) as the target table for data scraping.

So for example, have a look at the following Wikipedia page – List of largest United Kingdom settlements by population (found using a search on Wikipedia for uk city population – NOTE: URLs (web addresses) and actual data tables may have changed since this post was written, BUT you should be able to find something similar…):

Grab the URL, fire up a new Google spreadsheet, and start to enter the formula “=importHTML” into one of the cells:

Autocompletion works a treat, so finish off the expression:

=ImportHtml("http://en.wikipedia.org/wiki/List_of_largest_United_Kingdom_settlements_by_population","table",1)

And as if by magic, a data table appears:

All well and good – if you want to create a chart or two, why not try the Google charting tools?

Google chart

Where things get really interesting, though, is when you start letting the data flow around…

So for example, if you publish the spreadsheet you can liberate the document in a variety of formats:

As well as publishing the spreadsheet as an HTML page that anyone can see (and that is pulling data from the Wikipedia page, remember), you can also get access to an RSS feed of the data – and a host of other data formats:

See the “More publishing options” link? Lurvely :-)

Let’s have a bit of CSV goodness:

Why CSV? Here’s why:

Lurvely… :-)

(NOTE – Google spreadsheets’ CSV generator can be a bit crap at times and may require some fudging (and possibly a loss of data) in the pipe – here’s an example: When a Hack Goes Wrong… Google Spreadsheets and Yahoo Pipes.)

Unfortunately, the *’s in the element names mess things up a bit, so let’s rename them (don’t forget to dump the original row of the feed; alternatively, tweak the CSV URL so it starts with row 2); we might as well create a proper RSS feed too, by making sure we at least have a title and description element in there:

Make the description a little more palatable using a regular expression to rewrite the description element, and work some magic with the location extractor block (see how it finds the lat/long co-ordinates, and adds them to each item?;-):

DEPRECATED…. The following image is the OLD WAY of doing this and is not to be recommended…

…DEPRECATED

Geocoding in Yahoo Pipes is done more reliably through the following trick – replace the Location Builder block with a Loop block into which you should insert a Location Builder block:

yahoo pipe loop

The location builder will look to a specified element for the content we wish to geocode:

yahoo pipe location builder

The Location Builder block should be configured to output the geocoded result to the y:location element. NOTE: the geocoder often assumes US town/city names. If you have a list of town names that you know come from a given country, you may wish to annotate them with a country identifier before you try to geocode them. A regular expression block can do this:

regex uk

This block says – in the title element, grab a copy of everything – .* – into a variable – (.*) – and then replace the contents of the title element with its original value – $1 – as well as “, UK” – $1, UK

Note that this regular expression block would need to be wired in BEFORE the geocoding Loop block. That is, we want the geocoder to act on a title element containing “Cambridge, UK” for example, rather than just “Cambridge”.
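
If it helps to see that rewrite outside of Pipes, here’s the same thing expressed as a regular expression substitution in a few lines of Python (Python writes \1 where Pipes writes $1):

```python
import re

title = "Cambridge"
# Capture the whole title and append ", UK" - the equivalent of the Pipes
# rule that replaces (.*) with "$1, UK"
geocodable = re.sub(r"^(.*)$", r"\1, UK", title)
print(geocodable)  # Cambridge, UK
```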

Lurvely…

And to top it all off:

And for the encore? Grab the KML feed out of the pipe:

…and shove it in a Google map:

So to recap, we have scraped some data from a Wikipedia page into a Google spreadsheet using the =importHTML formula, published a handful of rows from the table as CSV, consumed the CSV in a Yahoo pipe and created a geocoded KML feed from it, and then displayed it in a Google map.

Kewel :-)

PS If you “own” the web page that a table appears on, there is actually quite a lot you can do to either visualise it, or make it ‘interactive’, with very little effort – see Progressive Enhancement – Some Examples and HTML Tables and the Data Web for more details…

PPS for a version of this post in German, see: http://plerzelwupp.pl.funpic.de/wikitabellen_in_googlemaps/. (Please post a linkback if you’ve translated this post into any other languages :-)

PPPS this is neat – geocoding in Google spreadsheets itself: Geocoding by Google Spreadsheets.

PPPPS Once you have scraped the data into a Google spreadsheet, it’s possible to treat it as a database using the QUERY spreadsheet function. For more on the QUERY function, see Using Google Spreadsheets Like a Database – The QUERY Formula and Creating a Winter Olympics 2010 Medal Map In Google Spreadsheets.

Visualising Financial Data In a Google Spreadsheet Motion Chart

Following on from Data Scraping Wikipedia With Google Spreadsheets, here’s a quick post showing how you can use another handy Google spreadsheet formula:

=GoogleFinance("symbol", "attribute", "start_date", "end_date", "interval")

This function will pull in live – and historical – price data for a stock.

Although I noticed this formula yesterday as I was exploring the “importHTML” formula described in the Wikipedia crawling post, I didn’t have time to have a play with it; but after a quick crib of HOWTO – track stocks in Google Spreadsheets, it struck me that here was something I could have fun with in a motion chart (you know, one of those Hans Rosling Gapminder charts….;-)

NB For the “official” documentation, try here: Google Docs Help – Functions: GoogleFinance.

– Stock quotes and other data may be delayed up to 20 minutes. Information is provided “as is” and solely for informational purposes, not for trading purposes or advice. For more information, please read our Stock Quotes Disclaimer.
– You can enter 250 GoogleFinance functions in a single spreadsheet; two functions in the same cell count as two.

So – let’s have some fun…

Fire up a new Google spreadsheet from http://docs.google.com, give the spreadsheet a name, and save it, and then create a new sheet within the spreadsheet (click on the “Add new sheet” button at the bottom of the page). Select the new sheet (called “Sheet2” probably), and in cell A1 add the following:

=GoogleFinance("AAPL", "all", "1/1/2008", "10/10/2008", "WEEKLY")

In case you didn’t know, AAPL is the stock ticker for Apple. (You can find stock ticker symbols for other companies on many finance sites.)

The formula will pull in the historical price data for Apple at weekly intervals from the start of 2008 to October 10th. (“all” in the formula means that all historical data will be pulled in on each sample date: opening price, closing price, high and low price, and volume.)

(If this was the live – rather than historical – data, it would be updated regularly, automatically…)

It’s easy to plot this data using a Google chart or a Google gadget:

Google spreadsheet create chart/gadget menu

I’m not sure a bar chart or scatter chart are quite right for historical stock pricing… so how about a line chart:

Et voila:

If you want to embed this image, you can:

If I was using live pricing data, I think the image would update with the data…?

Now create another new sheet in your spreadsheet, and into cell A1 of this new sheet (Sheet3) paste the following:

=GoogleFinance("IBM", "all", "1/1/2008", "10/10/2008", "WEEKLY")

This will pull in the historical price data for IBM.

Create two or three more new sheets, and in cell A1 of each pull in some more stock data (e.g. MSFT for Microsoft, YHOO for Yahoo, and GOOG…)

Now click on Sheet1, which should be empty. Fill in the following title cells by hand across cells A1 to G1:

Now for some magic…

Look at the URL of your spreadsheet in the browser address bar – mine’s “http://spreadsheets.google.com/ccc?key=p1rHUqg4g423seyxs3O31LA&hl=en_GB#”

That key value – the value between “key=” and “&hl=en_GB#” – is important: to all intents and purposes it’s the name of the spreadsheet. Generally, the key will be the characters between “key=” and an “&” or the end of the URL; the “&” means “and here’s another variable” – it’s not part of the key.
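
If you’d rather pull the key out programmatically than by eye, it’s just the key query parameter in the URL – a trivial sketch:

```python
from urllib.parse import urlparse, parse_qs

url = "http://spreadsheets.google.com/ccc?key=p1rHUqg4g423seyxs3O31LA&hl=en_GB#"
key = parse_qs(urlparse(url).query)["key"][0]
print(key)  # p1rHUqg4g423seyxs3O31LA
```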

In cell B2, enter the following:

=ImportRange("YOURKEY", "Sheet2!A2:F42")

YOURKEY is, err, your spreadsheet key… So here’s mine:

=ImportRange("p1rHUqg4g423seyxs3O31LA", "Sheet2!A2:F42")

What ImportRange does is pull in a range of cell values from another spreadsheet. In this case, I’m pulling in the AAPL historical price data from Sheet2 (but using a different spreadsheet key, I could pull in data from a different spreadsheet altogether, if I’ve made that spreadsheet public).

In cell A2, enter the ticker symbol AAPL; highlight cell A2, click on the square in the bottom right hand corner and drag it down the column – when you release the mouse, the AAPL stock ticker should be added to all the cells you dragged over. Label each row of the imported data, and then in the next row, B column, import the data from Sheet 3:

These rows will need labelling “IBM”.

Import some more data if you like and then… let’s create a motion chart (info about motion charts).

Highlight all the cells in sheet1 (all the imported data from the other sheets) and then from the Insert menu select Gadget; from the Gadget panel that pops up, we want a motion chart:

Configure the chart, and have a play [DEMO]:

Enjoy (hit the play button… :-)

PS And remember, you could always export the data from the spreadsheet – though there are probably better API powered ways of getting hold of that data…

PPS and before the post-hegemonic backlash begins (the .org link is broken btw? or is that the point?;-) this post isn’t intended to show how to use the Google formula or the Motion Chart well or even appropriately, it’s just to show how to use it to get something done in a hacky mashery way, with no heed to best practice… the post should be viewed as a quick corridor conversation that demonstrates the tech in a casual way, at the end of a long day…

PPPS for a version of this post in French, see here: Créer un graphique de mouvement à partir de Google Docs.

Approxi-mapping Mash-ups, with a Google MyMaps Tidy Up to Follow

What do you do when you scrape a data set, geocode it so you can plot it on a map, and find that the geocoding isn’t quite as good as you’d hoped?

I’d promised myself that I wasn’t going to keep on posting “yet another way of scraping data into Google spreadsheets then geocoding it with a Yahoo pipe” posts along the lines of Data Scraping Wikipedia with Google Spreadsheets, but a post on Google Maps mania – Water Quality Google Map – sent me off on a train of thought that has sort of paid dividends…

So first up, the post got me thinking about whether there are maps of Blue Flag beaches in the UK, and where I could find them. A link on the UK page of blueflag.org lists them: UK Blue Flag beaches (but there is a key in the URL, so I’m not sure how persistent that URL is).

Pull it into a Google spreadsheet using:
=ImportHtml("http://www.blueflag.org/tools/beachsearch?q=beach&k={E1BB12E8-A3F7-4EE6-87B3-EC7CD55D3690}&f=locationcategory","table","1")

Publish the CSV:

Geocode the beaches using a Yahoo pipe – rather than using the Pipe location API, I’m making a call to the Yahoo GeoPlanet/Where API – I’ll post about that another day…

Grab the KML from the pipe:

Now looking at the map, it looks like some of the markers may be mislocated – like the ones that appear in the middle of the country, hundreds of miles from the coast. So what it might be handy to do is use the scraped data as a buggy, downloaded data set that needs cleaning. (This means that we are not going to treat the data as “live” data any more.)

And here’s where the next step comes in… Google MyMaps lets you seed a map by importing a KML file:

The import can be from a desktop file, or a URL:

Import the KML from the Yahoo pipe, and we now have the data set in the Google MyMap.

So the data set in the map is now decoupled from the pipe, the spreadsheet and the original Blue Flag website. It exists as a geo data set within Google MyMaps. Which means that I can edit the markers, and relocate the ones that are in the wrong place:

And before the post-hegemonic tirade comes in (;-), here’s an attempt at capturing the source of the data on the Google MyMap.

So, to sum up – Google MyMaps can be used to import an approximately geocoded data set, tidy it up, and republish it.

PS don’t forget you can also use Google Maps (i.e. MyMaps) for geoblogging.

Writing Diagrams

One of the reasons I don’t tend to use many diagrams in the OUseful.info blog is that I’ve always been mindful that the diagrams I do draw rarely turn out how I wanted them to (the process of converting a mind’s eye vision to a well executed drawing always fails somewhere along the line, I imagine in part because I’ve never really put the time into practising drawing, even with image editors and drawing packages etc etc.)

Which is one reason why I’m always on the lookout for tools that let me write the diagram (e.g. Scripting Diagrams).

So for example, I’m very fond of Graphviz, which I can use to create network diagrams/graphs from a simple textual description of the graph (or a description of the graph that has been generated algorithmically…).
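
To show the sort of thing I mean, here’s a throwaway sketch that writes a dot description of a little graph from a script (render it with something like dot -Tpng graph.dot -o graph.png if you have Graphviz installed):

```python
# Write a simple Graphviz dot description of a small directed graph.
edges = [("blog", "delicious"), ("blog", "twitter"), ("twitter", "delicious")]

with open("graph.dot", "w") as f:
    f.write("digraph g {\n")
    for src, dst in edges:
        f.write('  "{}" -> "{}";\n'.format(src, dst))
    f.write("}\n")
```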

Out of preference, I tend to use the Mac version of Graphviz, although the appearance of a canvas/browser version of graphviz is really appealing… (I did put in a soft request for a Drupal module that would generate a Graphviz plot from a URL that pointed to a dot file, but I’m not sure it went anywhere, and the canvas version looks far more interesting anyway…)

Hmmm – it seems there’s an iPhone/iPod touch Graphviz app too – Instaviz:

Another handy text2image service is the rather wonderful Web sequence diagrams, a service that lets you write out a UML sequence diagram:

There’s an API, too, that lets you write a sequence diagram within a <pre> tag in an HTML page, and a javascript routine will then progressively enhance it and provide you with the diagrammatic version, a bit like MathTran, or the Google Chart API etc etc (RESTful Image Generation – When Text Just Won’t Do).

If graphs or sequence diagrams aren’t your thing, here’s a handy hierarchical mindmap generator: Text2Mindmap:

And finally, if you do have to resort to actually drawing diagrams yourself, there are a few tools out there that look promising: for example, the LucidChart flow chart tool crossed my feedreader the other day. More personally, since Gliffy tried to start charging me at some point during last year, I’ve been using the Project Draw Autodesk online editor on quite a regular basis.

PS Online scripting tool for UML diagrams: YUML

PPS This is neat – a quite general diagramming language for use in e.g. markdown documents: pikchr.

Revisiting the Library Flip – Why Librarians Need to Know About SEO

What does information literacy mean in the age of web search engines? I’ve been arguing for some time (e.g. in The Library Flip) that one of the core skills going forward for those information professionals who “help people find stuff” is going to be SEO – search engine optimisation. Why? Because increasingly people are attuned to searching for “stuff” using a web search engine (you know who I’m talking about…;-); and if your “stuff” doesn’t appear near the top of the organic results listing (or in the paid for links) for a particular query, it might as well not exist…

Whereas once academics and students would have traipsed into the library to ask one of the High Priestesses to perform some magical incantation on a Dialog database through a privileged access terminal, for many people research now starts with a G. Which means that if you want your academics and students to find the content that you’d recommend, then you have to help get that content to the top of the search engine listings.

With the rate of content production growing to seventy three tera-peta-megabits a second, or whatever it is, does it make sense to expect library staffers to know what the good content is any more (in the sense of “here, read this – it’s just what you need”)? Does it even make sense to expect people to know where to find it (in the sense of “try this database, it should contain what you need”)? Or is the business now more one of showing people how to go about finding good stuff, wherever it is (in the sense of “here’s a search strategy for finding what you need”), and helping the search engines see that stuff as good stuff?

Just think about this for a moment. If your service is only usable by members of your institution and only usable within the locked down confines of your local intranet, how useful is it?

When your students leave your institution, how many reusable skills are they taking away? How many people doing informal learning or working within SMEs have access to highly priced, subscription content? How useful is the content in those archives anyway? How useful are “academic information skills” to non-academics and non-students? (I’m just asking the question…;-)

And some more: do academic courses set people up for life outside? Irrespective of whether they do or not, does the library serve students on those courses well within the context of their course? Does the library provide students with skills they will be able to use when they leave the campus and go back to the real world and live with Google? (“Back to”? Hah – I wonder how much traffic on HEI networks is launched by people clicking on links from pages that sit on the google.com domain?) Should libraries help students pass their courses, or give them skills that are useful after graduation? Are those skills the same skills? Or are they different skills (and if so, are they compatible with the course-related skills)?

Here’s where SEO comes in – help people find the good stuff by improving the likelihood that it will be surfaced on the front page of a relevant web search query. For example, “how to cite an article”. (If you click through, it will take you to a Google results page for that query. Are you happy with the results? If not, you need to do one of two things – either start to promote third party resources you do like from your website (essentially, this means you’re doing off-site SEO for those resources), OR start to do onsite and offsite SEO on resources you want people to find on your own site.)

(If you don’t know what I’m talking about, you’re well on the way to admitting that you don’t understand how web search engines work. Which is a good first step… because it means you’ve realised you need to learn about it…)

As to how to go about it, I’d suggest one way is to get a better understanding of how people actually use library or course websites. (Another is Realising the Value of Library Data and finding ways of mining behavioural data to build recommendation engines that people might find useful.)

So to start off – find out what search terms are the most popular in terms of driving traffic to your Library website (ideally relating to some sort of resource on your site, such as a citation guide, or a tutorial on information skills); run that query on Google and see where your page comes in the results listing. If it’s not at the top, try to improve its ranking. That’s all…
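
If you have the keyword report exported from Analytics as a CSV file, even something as crude as the following will surface the queries worth checking (the column names here are guesses – use whatever your export actually contains):

```python
# Crude sketch: list the search terms sending the most visits to the site,
# from a CSV export of the Google Analytics keywords report. The column
# names ("Keyword", "Visits") are assumptions about the export format.
import pandas as pd

keywords = pd.read_csv("library_keywords.csv")
top = keywords.sort_values("Visits", ascending=False).head(20)
print(top[["Keyword", "Visits"]])
```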

For example, take a look at the following traffic (as collected by Google Analytics) coming in to the OU Library site over a short period some time ago.

A quick scan suggests that we maybe have some interesting content on “law cases” and “references”. For the “references” link, there’s a good proportion of new visitors to the OU site, and it looks from the bounce rate that half of those visited more than one page on the OU site. (We really should do a little more digging at this point to see what those people actually did on site, but this is just for argument’s sake, okay?!;-)

Now do a quick Google on “references” and what do we see?

On the first page, most of the links are relating to job references, although there is one citation reference near the bottom:

Leeds University library makes it in at 11 (at the time of searching, on google.co.uk):

So here would be a challenge – try to improve the ranking of an OU page on this results listing (or try to boost the Leeds University ranking). As to which OU page we could improve, first look at what Google thinks the OU library knows about references:

Now check that Google favours the page we favour for a search on “references” and, if it does, try to boost its ranking on the organic SERP. If Google isn’t favouring the page we want as its top hit on the OU site for a search on “references”, do some SEO to correct that (maybe we want “Manage Your References” to come out as the top hit?).

Okay, enough for now – in the next post on this topic I’ll look at the related issue of Search Engine Consequences, which is something that we’re all going to have to become increasingly aware of…

PS Ah, what the heck – here’s how to find out what the people who arrived at the Library website from a Google search on “references” were doing onsite. Create an advanced segment:

Google analytics advanced segment

(PS I first saw these and learned how to use them at a trivial level maybe 5 minutes ago;-)

Now look to see where the traffic came in (i.e. the landing pages for that segment):

Okay? The power of segmentation – isn’t it lovely:-)

We can also go back to the “All Visitors” segment, and see what other keywords people were using who ended up on the “How to cite a reference” page, because we’d possibly want to optimise for those, too.

Enough – time for the weekend to start :-)

PS if you’re not sure what techniques to use to actually “do SEO”, check on Academic Search Premier (or whatever it’s called), because Google and Google Blogsearch won’t return the right sort of information, will they?;-)