Course Librarians and Search Assist…

For all their success in attracting universities to adopt Google Apps (Tradition meets technology: top universities using Apps for Education), it’s not obvious to me how – or even if – Google is actually doing much around search signal detection and innovation in an educational context?

I’ve floated this a couple of times before (eg Could Librarians Be Influential Friends? And Who Owns Your Search Persona? and Integrating Course Related Search and Bookmarking?), but with yet another announcement from Google about how they’re incorporating social signals into search rankings (Hide sites from anywhere in the world: “We’ve … started incorporating data about sites people have blocked into our general search ranking algorithms to help users find more high quality sites.”), I’m going to raise it again…

To what extent are course and subject librarians setting up course/subject personas that engage in recommending and sharing high quality links in an appropriate social context, and encouraging students to follow those accounts in order to benefit from personalisation of search results based on social signals?

Furthermore, to what extent might the development of search personas represent the creation of a “scholarly agent” that can be used to offer “search assist” to followers of that agent/persona?

I don’t find it that hard to imagine myself taking a course, following the course recommender on a social network (an account that might send out course related reminders as well as relevant links), with an icon depicting my university and the associated course, that on occasion appeared to “recommend” links to me when I was searching for topics relating to my course. (In the normal scheme of things, it wouldn’t actively be recommending links to me, of course. For that, I’d need to subscribe to something like Subscribed Links, as mentioned in Integrating Course Related Search and Bookmarking?.)

University Search Engine Sitelinks and (Rich) Snippets

A long time ago, I started running with the idea that an organisation’s homepage on the web was the dominant search engine’s search results page for the most common search term associated with that organisation. So for example, the OU’s effective home page is not its official homepage, but the Google search results page for open university. (You can test the extent to which this claim is supported by checking out your website logs, and comparing how many direct visitors you get to the “official” homepage for your website compared to the amount of traffic referred to your site from Google for common search terms used to find your site. You do know what those search terms are, right?!;-)
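As a rough illustration of that log check, here is a minimal sketch in Python. It assumes the server writes Apache-style "combined" log lines and that the Google referrer still carries the search terms in a q parameter; the function name, the sample log lines and the homepage path are all made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed Combined Log Format:
# ... "GET /path HTTP/1.1" 200 1234 "referrer" "user agent"
LOG_LINE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)

def summarise_homepage_traffic(lines, homepage_path="/"):
    """Count direct hits on the homepage versus visits referred by Google,
    and tally the Google search terms where the referrer exposes them."""
    counts = Counter()
    search_terms = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        referrer = match.group("referrer")
        host = urlparse(referrer).netloc.lower()
        if host.startswith("www.google.") or host.startswith("google."):
            counts["google_referred"] += 1
            query = parse_qs(urlparse(referrer).query).get("q")
            if query:
                search_terms[query[0].lower()] += 1
        elif referrer in ("", "-") and match.group("path") == homepage_path:
            # No referrer at all: treat as a direct visit to the homepage.
            counts["direct_homepage"] += 1
    return counts, search_terms

# Hypothetical sample log lines, for illustration only.
sample_log = [
    '1.2.3.4 - - [01/Sep/2011:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '1.2.3.5 - - [01/Sep/2011:10:00:05 +0000] "GET /courses HTTP/1.1" 200 2048 "http://www.google.co.uk/search?q=open+university" "Mozilla/5.0"',
    '1.2.3.6 - - [01/Sep/2011:10:00:09 +0000] "GET / HTTP/1.1" 200 5120 "http://www.google.com/search?q=Open+University" "Mozilla/5.0"',
]
counts, terms = summarise_homepage_traffic(sample_log)
```

Comparing counts["direct_homepage"] with counts["google_referred"], and looking at terms.most_common(), gives a first feel for how much of the "front door" traffic actually arrives via the search results page.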

At one time, the results page might include several links to different institutional web pages, each listed as a separate individual search result item. Then five years or so ago, Google started introducing sitelinks into the search results page, a display technique that would include a list of several links to specific pages within the same domain within the context of a single result item, headed by the top level domain.

OU snippets

Or maybe how about the library?

OU library snippets and sitelinks

[For an overview, see Anatomy Of A Google Snippet and maybe also Meta Description Mutiny! Take Control of Your Text Snippets; if you want to take some responsibility for what appears, see the Google Webmaster Tools posts on sitelinks, Changing a site title and description and Removing snippets and Instant Preview]

Video: Matt Cutts introduces the original snippets

A recent (August 2011) update sees the Goog placing even more focus on the display of sitelinks (The evolution of sitelinks: expanded and improved).

Here are a few things about this update that I think are worth noting, particularly in light of recommendations emerging from the JISC “Linking You” project, which offers best practice guidance on the design of top-level URI schemes for university websites:

Sitelinks will now be full-size links with a URL and one line of snippet text—similar to regular results—making it even easier to find the section of the site you want. We’re also increasing the maximum number of sitelinks per query from eight to 12. …

In addition, we’re making a significant improvement to our algorithms by combining sitelink ranking with regular result ranking to yield a higher-quality list of links. This reduces link duplication and creates a better organized search results page. Now, all results from the top-ranked site will be nested within the first result as sitelinks, and all results from other sites will appear below them. The number of sitelinks will also vary based on your query—for example, [museum of art nyc] shows more sitelinks than [the met] because we’re more certain you want results from

So what do we learn from this?

  • Sitelinks will now be full-size links with a URL and one line of snippet text—similar: so check your results listing and see if the “one line of snippet text” for each displayed result makes sense. (For some ideas about how to influence snippet text, see e.g. the Google Webmaster Tools links above.)
  • We’re … increasing the maximum number of sitelinks per query from eight to 12: what links would you like to appear in the sitelinks list, compared to what links actually do appear? Would consensus in how UK HEIs architect top-level URLs (as, for example, recommended by Linking You) provide uniformity of display of results in the Google SERPs space? Would consensus open Google up to discussion relating to the most effective way of displaying search results for UK HEI sitelink results?
  • we’re making a significant improvement to our algorithms by combining sitelink ranking with regular result ranking to yield a higher-quality list of links: which is to say – SEO, and URI path design, may play a role in determining what links get displayed as sitelinks.
  • This reduces link duplication and creates a better organized search results page. That is, better organised, as defined by Google. If you want to influence the way links to your site are displayed as sitelinks, you need to figure out how – you don’t control how Google provides this top level navigation to your site as sitelinks, but you may be able to influence the display through good website design….
  • The number of sitelinks will also vary based on your query

If you have pages that mainly contain lists of items, these may also be handled differently in the context of snippets: New snippets for list pages

See also:
Rich snippets microdata – if Google handled edu microdata, what would it describe…?
– Google Webmaster tools: Rich Snippets testing tool – “enter a web page URL to see how it may appear in search results”

PS I wonder how sitelink displays interact with rich snippets…?

PPS There’s a great write up of the Linking You project that I’ve just come across here: Lend Me Your Ears Dear University Web Managers!. Go and read it…. now… Hmmm.. thinks… what would a similar exercise for local council websites look like?

Getting Library Catalogue Searches Out There…

As a long time fan of custom search engine offerings, I keep wondering why Google doesn’t seem to have much active interest in this area? Google Custom Search updates are few and far between, and typically go unreported by the tech blogs. Perhaps more surprisingly, Custom Search Engines don’t appear to have much, if any, recognition in the Google Apps for Education suite, although I think they are available with a Google Apps for education ID?

One of the things I’ve been mulling over for years is the role that automatically created course related search engines might have to play as part of a course’s VLE offering. The search engine would offer search results either over a set of web domains linked to from the actual course materials, or simply boost results from those domains in the context of a “normal” set of search results. I’ve recently started thinking that we could also make use of “promoted” results to highlight specific required or recommended readings when a particular topic is searched for (for example, Integrating Course Related Search and Bookmarking?).

During an informal “technical” meeting around three JISC funded resource discovery projects at Cambridge yesterday (Comet, Jerome, SALDA; disclaimer: I didn’t work on any of them, but I was in the area over the weekend…), there were a few brief mentions of how various university libraries were opening up their catalogues to the search engine crawlers. So for example, if you do a site: limited search on the following paths:


you can get (partial?) search results, with a greater or lesser degree of success, from the Sussex, Lincoln, Huddersfield and Cambridge catalogues respectively.

In a Google custom search engine context, we can tunnel in a little deeper in an attempt to return results limited to actual records:


I’ve added these to a new Catalogues tab on my UK HE library website CSE (about), so we can start to search over these catalogues using Google.

I’m not sure how useful or interesting this is at the moment, except to the library systems developers maybe, who can compare how informatively their library catalogue content is indexed and displayed in Google search results compared to other libraries… (so for example, I noticed that Google appears to be indexing the “related items” that Huddersfield publishes on a record page, meaning that if a search term appears in a related work, you might get a record that at first glance appears to have little to do with your search term, in effect providing a “reverse related work” search (that is, search on related works and return items that have the search term as the related work)).

Searching UK HE library catalogues via a Google CSE

But it’s a start… and with the addition of customised rankings, might provide a jumping off point for experimenting with novel ways of searching across UK HE catalogues using Google indexed content. (For example, a version of the CSE on the domain might boost the Cambridge results; within an institution, works related to a particular course through mention on a reading list might get a boost if a student on that course runs a search… and so on…)

PS A couple of other things that may be worth pondering… could Google Apps for Education account holders be signed up to Subscribed Links offering customised search results in the main Google domain relating to a particular course? (That is, define subscribed link profiles for each course, and automatically add those subscriptions to an Apps for Edu user’s account based on the courses they’re taking?) Or I wonder if it would be possible to associate subscribed links to public access browsers in some way?

And how about finding some way of working with Google to open up “professional” search profiles, where for example students are provided with “read only” versions of the personalised search results of an expert in a particular area who has tuned, through personalisation, a search profile that is highly specialised in a particular subject area, e.g. as mentioned in Google Personal Custom Search Engines? (see also Could Librarians Be Influential Friends? And Who Owns Your Search Persona?).

If anyone out there is working on ways of using Google customised and personalised search as a way of delivering “improved” search results in an educational context, I’d love to hear more about what you’re getting up to…

Autocuration Signals in My Personalised Google Search Results

I spotted this for the first time last night:

Auto-curation signals in my search results

I had actually read the post in the Google Reader context (so Google knew that), but I wonder: if I hadn’t read the post, would it still have shown up like that?

As far as personalised ranking signals go:

– does the fact that I subscribe to the feed in Google Reader affect the rank of items from that feed in my personalised search results?
– if I have read the post in Google reader, does that also affect the rank of that specific post in my personalised search results?

If I have shared a link – through Google+, or Twitter, for example – is the ranking of that link positively affected in my personalised search results? That is, might social search actually be most useful when the Goog picks up on things I have shared myself, and then “reminds” me of them via a ranking boost in my personalised search results when I’m searching on a related topic?

Maybe tweeting and sharing into the void is actually yet another way of invisibly building search refinements into your personalised search context?

Integrating Course Related Search and Bookmarking?

Not surprisingly, I’m way behind on the two eSTEeM projects I put proposals in for – my creative juices don’t seem to have been flowing in those areas for a bit:-( – but as a marking avoidance strategy I thought I’d jot down some thoughts that have been coming to mind about how the custom search project at least might develop (eSTEeM Project: Custom Course Search Engines).

The original idea was to provide a custom search engine that indexes pages and domains that are referenced within a course in order to provide a custom search engine for that course. The OU course T151 is structured as a series of topic explorations using the structure:

– topic overview
– framing questions
– suggested resources
– my reflections on the topic, guided by the questions, drawing on the suggested resources and a critique of them

One original idea for the course was that rather than give an explicit list of suggested resources, we provide a set of links pulled in live from a predefined search query. The list would look as if it was suggested by the course team but it would actually be created dynamically. As instructors, we wouldn’t be specifying particular readings, instead we would be trusting the search algorithm to return relevant resources. (You might argue this is a neglectful approach… a more realistic model might be to have specifically recommended items as well as a dynamically created list of “Possibly related resources”.)

At this point it’s maybe worth stepping back a moment to consider what goes into producing a set of search results. Essentially, there are three key elements:

– the index, the set of content that the search engine has “searched” and from which it can return a set of results;
– the search query; this is run against the index to identify a set of candidate search results;
– a presentation algorithm that determines how to order the search results as presented to the user.

If the search engine and the presentation algorithm are fixed, then for a given set of search terms, and a given index, we can specify a search term and get a known set of results back. So in this case, we could use a fixed custom search engine, with known search terms, and return a known list of suggested readings. The search engine would provide some sort of “ground truth” – same answer for the same query, always.
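To make the three elements concrete, here is a toy sketch in Python of an index, a query step and a presentation step. It is not how Google (or a Google CSE) works internally; the document ids, the texts and the term-count scoring are all invented for the illustration. The point is only that with a fixed index and a fixed presentation algorithm, the same query returns the same ordered list.

```python
from collections import defaultdict

def build_index(documents):
    """The index: map each lower-cased word to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def run_query(index, query):
    """The query: return the candidate documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    candidates = set(index.get(terms[0], set()))
    for term in terms[1:]:
        candidates &= index.get(term, set())
    return candidates

def present(candidates, documents, query):
    """The presentation algorithm: order candidates by how often the query
    terms occur, breaking ties by id so the ordering is fully deterministic."""
    terms = query.lower().split()

    def score(doc_id):
        words = documents[doc_id].lower().split()
        return sum(words.count(term) for term in terms)

    return sorted(candidates, key=lambda doc_id: (-score(doc_id), doc_id))

# Hypothetical "suggested resources" for a topic exploration.
documents = {
    "reading-1": "search engines rank pages using links",
    "reading-2": "custom search engines let you pick which pages to search",
    "reading-3": "course bookmarks and reading lists",
}
index = build_index(documents)
results = present(run_query(index, "search pages"), documents, "search pages")
```

Changing any one of the three pieces (adding a document to the index, rewording the query, or swapping the scoring function for a personalised one) is enough to change what the student sees.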

If we trust the sources and the presentation algorithm, and we trust that we have written an effective search query, then if the index is not fixed, or if a personalised ranking algorithm (that we trust) is used as part of the search engine, we would potentially be returning search results that the instructor has not seen before. For example, the resources may be more recent than the last time the instructor searched for resources to recommend, or they better fit the personalisation criteria for the user under the ranking algorithm used as part of the presentation algorithm.

In this case, the instructor is not saying: “I want you to read this particular resource”. They are saying something more along the lines of: “these are potentially the sorts of resource I might suggest you look at in order to study this topic”. (Lots of caveats in there… If you believe in content led instruction, with students referring to specifically referenced resources, I imagine that you would totally rail against this approach!)

At times, we might want to explicitly recommend one or two particular resources, but also open up some other recommendations to “the algorithm”. It struck me that it might be possible to do this within the context of a Google Custom Search approach using “special results” (e.g. Google CSEs: Creating Special Results/Promotions).

For example, Google CSEs support:

promotions: “A promotion is simply an association between a pre-defined set of query terms and a link to a webpage. When a user types a search that exactly matches one of your query terms, the promotion appears at the top of the page.” So by using a specific search term, we can force the return of a specific result as the top result. In the context of a topic exploration, we could thus prepopulate the search form of an embedded search engine with a known search phrase, and use a promotion to force a “recommended reading” link to the top of the results listing.

Promotion links are stored in a separate config file and have the form:

  <Promotion id="1"
    queries="wanderer, the wanderer" 
    title="Groo the Wanderer" 
    description="Comedy. American series illustrated by Sergio Aragonés."
    image_url="" />

subscribed links: subscribed links allow you to return results in a specific format (such as text, or text and a link, or other structured results) based on a perfect match with a specific search term. In a sense, subscribed links represent a generalised version of promotions. Subscribed links are also available to users outside the context of a CSE. If a user subscribes to a particular subscribed link file, then if there is an exact match against one of the search phrases in the subscribed link file and a search phrase used by a subscribing user on Google web search (i.e. on or, the subscribed link will be returned in the results listing.

In the simplest case, subscribed links can be defined at the individual link level:

Google subscribed link definition

If your search term is an exact match for the term in the subscribed link definition, it will appear in the main search results page:

Google subscribed links

It’s also possible to define subscribed link definition files, either as simple tab separated docs or RSS/Atom feeds, or using a more formal XML document structure. One advantage of creating subscribed links files for use within a custom search engine is that users (i.e. students) can subscribe to them as a way of augmenting or enhancing their own Google search results. This has the joint effect of increasing the surface area of the course, so that course related recommendations can be pushed to the student for relevant queries made through the Google search engine, as well as providing a legacy offering: students can potentially take away a subscription when they finish the course to continue to receive “academically credible” results on relevant search topics. (By issuing subscription links on a per course presentation basis (or even on a personalised, unique feed per student basis), feeds to course alumni might be customised, for example by removing links to subscription content (or suggesting how such content might be obtained through a subscription to the university library), or occasionally adding in advertising related links (so if a student searches using a “course” keyword, make recommendations around that via a subscribed links feed; in the limit, this could even take on the form of a personalised, subscription based advertising channel).)

Another way in which “recommended” links can be boosted in a custom search result listing is through boosting search results via their ranking factors (Changing the Ranking of Your Search Results).

In the case of both subscribed links and boosted search results, it’s possible to create a configuration file dynamically. Where students are bookmarking search results relating to a course, it would therefore be possible to feed these into a course related custom search engine definition file, or a subscribed link file. If subscribed link files are maintained at a personal level, it would also be possible to integrate a student’s bookmarked links in to their subscribed links feed, at least for use on Google websearch (probably not in the custom search engine context?). This would support rediscovery of content bookmarked by the student through subscribed link recommendations.
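As a hedged sketch of what "creating a configuration file dynamically" might look like, the Python below turns a list of bookmarks into a promotions document. The Promotion attributes id, queries, title and description follow the example shown earlier in this post; the Promotions wrapper element, the url attribute and the shape of the bookmark records are assumptions made for the illustration, so check the current CSE documentation before relying on them.

```python
import xml.etree.ElementTree as ET

def promotions_from_bookmarks(bookmarks):
    """Build a promotions document from bookmark dicts.

    Each bookmark is assumed to carry 'title', 'url', 'description' and a
    list of 'queries' that should trigger the promotion.
    """
    root = ET.Element("Promotions")  # wrapper element name is an assumption
    for number, bookmark in enumerate(bookmarks, start=1):
        ET.SubElement(root, "Promotion", {
            "id": str(number),
            "queries": ", ".join(bookmark["queries"]),
            "title": bookmark["title"],
            "url": bookmark["url"],  # assumed attribute, not shown in the example above
            "description": bookmark["description"],
        })
    # ElementTree escapes characters such as & in attribute values for us.
    return ET.tostring(root, encoding="unicode")

# Hypothetical bookmark records, e.g. exported from a course bookmarking tool.
bookmarks = [
    {
        "title": "Topic 1 recommended reading",
        "url": "http://example.org/topic1",
        "description": "Bookmarked by students & staff on the course.",
        "queries": ["topic one", "topic 1 reading"],
    },
]
promotions_xml = promotions_from_bookmarks(bookmarks)
```

The same loop could just as easily emit a per-student file, which is the personalised feed idea described above.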

Just by the by, a PR mailing in my inbox today threw up another example of how search and bookmarking might be brought more closely together: SearchTeam (screenshots [pdf]).

The model here is based around defining search contexts that one or more users can contribute to, and then saving out results from a search into a topic based bookmark area. The video suggests that particular results can also be blocked (and maybe boosted? The greyed plus on the left hand side?) – presumably this is a persistent feature, so if you, or another member of your “search team” runs the search, the blocked result doesn’t appear? (Is a list of blocked results and their corresponding search terms available anywhere I wonder?) In common with the clipping blog model used by sites such as posterous, it’s possible to post links and short blog posts into a topic area. Commenting is also supported.

To say that search was Google’s initial big idea, it’s surprising that it seems to play no significant role in Google’s offerings for education through Google Apps. Thinking back, search related topics were what got me into blogging and quick hacks; maybe it’s time to return to that area…

Google Playing the SEO Link Building Game to Drive Uptake Of Google Profiles?

As you’re probably aware by now, yesterday Google announced its Google+ social network. A key part of every social network is a user’s personal profile page, the “social object” that other people can actually connect to.

Google has offered personal profile pages for some time, (here’s my rather basic ), but they’ve never really been a part of anything, and they’re not really linkable to – which means there’s little reason for PageRank based search algorithms such as Google’s to return Google Profile pages in the top results for you if anyone ever searches for you.

(PageRank is the algorithm that gave Google its early edge in the search engine wars; links from one page to another count as “votes” regarding the quality of the page that is linked to. Crudely put, if people link to you, those links contribute to your PageRank and you’re more likely to make it to the top of a search results page.)

Until now, that is (or at least, until a couple of weeks ago… I missed this announcement at the time it was made…): Authorship markup and web search, a technique for “supporting markup that enables websites to publicly link within their site from content to author pages”.

The method is described as follows:

To identify the author of an article, Google checks for a connection between the content page (such as an article), an author page, and a Google Profile.

A content page can be any piece of content with an author: a news article, blog post, short story …
An author page is a page about a specific author, on the same domain as the content page.
A Google Profile is Google’s version of an author page. It’s how you present yourself to the web and to Google.

In confirming authorship, Google looks for:

Links from the content page to the author page (if the path of links continues to a Google Profile, we can also show Profile information in search results)
A path of links back from your Google Profile to your content.
These reciprocal links are important: without them, anyone could attribute content to you, or you could take credit for any content on the web.
The rel="author" link indicates the author of an article [so for example: <a rel="author" href="">Google Profile: Tony Hirst</a>]

Source: Authorship

Here’s why you might be tempted to do this…:

Many of you create great content on the web, and we work hard to make that content discoverable on Google. Today, we will start highlighting the people creating this content in search results.

Google author identified links

As you can see …, certain results will display an author’s picture and name — derived from and linked to their Google Profile — next to their content on the Google Search results page.

Source: Highlighting content creators in search results; [my emphasis]

So… if you want to assert authorship and be recognised as the author in the Google search results, you need to start linking all your content back to your Google Profile Page…

…and so start feeding PageRank juice to your Google profile page…

…so that when folk search for you on the web, they’re more likely to see that page…

This is a harsh reading, of course: authorship can also be asserted by linking within a domain to a page that you have asserted to Google that represents you: The rel=”author” link indicates the author of an article, and can point to .. an author page on the same domain as the content page: Written by <a rel="author" href="../authors/mattcutts">Matt Cutts</a>. The author page should link to your Google Profile using rel=”me”.

(I wonder why <link rel="author" href="../authors/mattcutts"/> isn’t supported? Or maybe it is?)
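The reciprocal link check described in the quoted method can be sketched in a few lines of Python. This is only an illustration of the idea, not Google's actual verification process: it simplifies "a path of links" to a single direct link in each direction, and the URLs, HTML fragments and function names are all invented for the example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (rel values, href) for every <a> and <link> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "link"):
            attributes = dict(attrs)
            if attributes.get("href"):
                rels = set((attributes.get("rel") or "").lower().split())
                self.links.append((rels, attributes["href"]))

def collect_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

def authorship_confirmed(content_html, content_url, profile_html, profile_url):
    """True when the content page links to the profile with rel="author"
    and the profile page links back to the content page."""
    forward = any("author" in rels and href == profile_url
                  for rels, href in collect_links(content_html))
    back = any(href == content_url for _, href in collect_links(profile_html))
    return forward and back

# Hypothetical pages, for illustration only.
content_url = "http://example.org/posts/search-personas"
profile_url = "http://example.org/profiles/author"
content_html = ('<p>Written by <a rel="author" '
                'href="http://example.org/profiles/author">the author</a></p>')
profile_html = ('<ul><li><a href="http://example.org/posts/search-personas">'
                'A post of mine</a></li></ul>')
unlinked_profile_html = "<p>No links here.</p>"
```

Without the link back from the profile, the check fails, which is exactly the property that stops anyone attributing arbitrary content to you.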

Algorithmically, the assertion of authorship might also help in Google’s fight against spamblogs, which republish content blindly from original sources. That is, by asserting authorship of a page, if someone reposts your content, google will be able to identify you as the original author and return a link back to your page in the search results listing, rather than the republished page.

I imagine there might also be personal reputation benefits – for example, if people +1 a page you have claimed authorship of, it might give you a “Reputation Rank” boost for the subject area associated with that page?

Google Correlate: What Search Terms Does Your Time Series Data Correlate With?

Just a few days over three years ago, I blogged about a site I’d put together to try to crowdsource observations about correlated searchtrends: TrendSpotting.

One thing that particularly interested me then, as it still does now, was the way that certain search trends reveal rhythmic behaviour over the course of weeks, months or years.

At the start of this year, I revisited the topic with a post on Identifying Periodic Google Trends, Part 1: Autocorrelation (followed by Improving Autocorrelation Calculations on Google Trends Data).
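For anyone who hasn't met autocorrelation before, here is a minimal Python sketch of the general idea of using it to spot a periodic trend. It is not the code from the posts mentioned above; the function names are made up and the "search volume" series is synthetic.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag (0 <= lag < len)."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((value - mean) ** 2 for value in series)
    if variance == 0:
        return 0.0  # a flat series has no structure to correlate
    covariance = sum((series[i] - mean) * (series[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance

def strongest_period(series, max_lag):
    """Return the lag in 1..max_lag with the highest autocorrelation."""
    return max(range(1, max_lag + 1),
               key=lambda lag: autocorrelation(series, lag))

# Synthetic weekly "search volume" with a 12-week cycle (illustrative data only).
weekly_volume = [50 + 10 * math.sin(2 * math.pi * week / 12)
                 for week in range(120)]
period = strongest_period(weekly_volume, max_lag=20)
```

On real trends data the peak is rarely this clean, so it usually helps to detrend the series first and to look at the whole autocorrelation curve rather than a single maximum.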

Anyway today it seems that Google has cracked the scaling issues with discovering correlations between search trends (using North American search trend data), as well as opening up a service that will identify what search trends correlate most closely with your own uploaded time series data: Correlate (announcement: Mining patterns in search data with Google Correlate)

For the quick overview, check out the Google Correlate Comic.

So what’s on offer? First, enter a search term and see what it’s correlated with:

As well as the line chart, correlations can also be plotted as a scatterplot:

You can also run “spatial correlations”, though at the moment this appears to be limited to US states. (I *think* this works by looking for search terms that are popular in the requested areas and not popular in the other listed areas. To generalise this, I guess you need three things: the total list of areas that work for the spatial correlation query; the areas you want the search volume for the “to be discovered correlated phrase” to be high; the areas you want the search volume for the “to be discovered correlated phrase” to be low?)

At this point it’s maybe worth remembering that correlation does not imply causation…

A couple of other interesting things to note: firstly, you can offset the data (so shift it a few weeks forwards or backwards in time, as you might do if you were looking for lead/lag behaviour); secondly, you can export/download the data.
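The offset idea is easy to try on exported data. The sketch below, in Python, computes a Pearson correlation between two series at a range of offsets and reports the offset with the strongest correlation. It says nothing about how Google Correlate itself is implemented; the function names and the two short series are invented for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)

def shifted_correlation(reference, candidate, offset):
    """Correlate reference[t] with candidate[t + offset] over their overlap.
    A positive offset means the candidate lags the reference."""
    if offset >= 0:
        xs = reference[:len(reference) - offset]
        ys = candidate[offset:]
    else:
        xs = reference[-offset:]
        ys = candidate[:len(candidate) + offset]
    return pearson(xs, ys)

def best_offset(reference, candidate, max_offset):
    """Offset in -max_offset..max_offset giving the highest correlation."""
    offsets = range(-max_offset, max_offset + 1)
    return max(offsets,
               key=lambda offset: shifted_correlation(reference, candidate, offset))

# Illustrative weekly series: the candidate repeats the reference two weeks later.
reference = [3, 5, 9, 4, 2, 8, 7, 1, 6, 5, 9, 2, 4, 7, 3, 8]
candidate = [0, 0] + reference[:-2]
lag = best_offset(reference, candidate, max_offset=4)
```

A strong correlation at a non-zero offset is the lead/lag behaviour mentioned above, though the usual warning about correlation and causation applies just as much to shifted series.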

You can also upload your own data to see what terms correlate with it:

(I wonder if they’ll start offering time series analysis features on uploaded, as well as other trend data, too? For example, frequency analysis or trend analysis? This is presumably going on in the background (though I haven’t read the white paper [PDF] yet…))

As if that’s not enough, you can also draw a curve/trendline and then see what correlates with it (so this is a weak alternative to uploading your own data, right? Just draw something that looks like it…). (h/t to Mike Ellis for first pointing this out to me.)

I’m not convinced that search trends map literally onto the well known “hype cycle” curve, but I thought I’d try out a hype cycle reminiscent curve where the hype was a couple of years ago, and we’re now maybe seeing it start to reach mainstream maturity, with maybe the first inklings of a plateau…

Hmmm… the pr0n industry is often identified as a predictor of certain sorts of technology adoption… maybe the 5ex searchers are too?! (Note that correlated hand-drawn charts are linkable).

So – that’s Google Correlate; nifty, eh?

PS Here’s another reason why I blog… my blog history helps me work out how far in the future I live;-) So currently about three years in the future.. how about you?!;-)

PPS I can imagine Google’s ThinkInsights (insight marketing) loving the thought that folk are going to check out their time series data against Google Trends so the Goog can weave that into its offerings… A few additional thoughts leading on from that: 1) when will correlations start to appear in Google AdWords support tools to help you pick adwords based on your typical web traffic patterns or even sales patterns? 2) how far are we off seeing a Google Insights box to complement the Google Search Appliances, that will let you run correlations – as well as Google Prediction type services – onsite without feeling as if you have to upload your data to Google’s servers, and instead, becoming part of Google’s our-kit-in-your-racks offering; 3) when is Google going to start buying up companies like Prism and will it then maybe go after the likes of Experian and Dunnhumby to become a company that organises information about the world of people, as well as just the world’s information…?!

PPPS Seems like as well as “traditional” link sharing offerings, you can share the link via your Google Reader account…


Random Thoughts on Search Demographics

In his post Using Yahoo! Clues to target your headlines by demographic, Paul Bradshaw picked up on my post from earlier today about Yahoo Clues, which describes some of the demographic information behind who’s using what search terms.

In particular, Paul wondered: “But what if your publication is specifically aimed at women – or men? Or under-25s? Or over-40s? Or the wealthy?”, the implication being that we can tune our words to hit particular searcher demographics (a more refined approach to SEO than the norm).

This is the norm in ad placement, of course, where ad words are chosen according to demographic. I don’t spend as much time looking at ad tools as I should (one backburner project I really should try to get going properly is a consideration of how we can use contextual ad servers to place content), so I’m not really up to speed with what’s out there but a quick trawl turns up a couple of tools like those appearing on Yahoo Clues on the Microsoft AdLab site.

So for example, we have a demographics prediction tool, that predicts demographics based on keywords, in the example below comparing library with newspaper:

Microsoft AdLab: demographics

There are also Search funnels, cf. the Yahoo Clues “Search Flow” tool:

Microsoft adlab - search funnel

As well as searching for “in” terms, you can also look for “out” terms (i.e. terms that follow the term of interest in a search session). (You can hack the URL to choose between these.)

The AdLab Audience Intelligence tool also has a go at predicting demographics, either of users of particular search terms, or visitors to a particular URL:

AdLab - Audience intelligence

(I have no idea if the above prediction bears any resemblance to reality…?;-)

I think that Google AdWords supports demographic placement, so it probably has a similar tool available. Facebook has unrivalled access to demographic data, so its ad optimisation tools may also be worth looking at (I’m sure it has some? If you know where they live, please post a link below:-) And of course, Yahoo has its own range of ad management tools and services (look through the Yahoo Advertising Solutions to get a flavour of what’s possible, from geographic and demographic targeting, to behavioural marketing).

Of course, if you want real data, you’ll need to pay for commercial analytics services; have a trawl through some of Experian’s public sector data products to see what’s available… (Note to self: start blogging about some of these tools…heh heh…;-)

And where does this thinking lead? Maybe authors who want their content to be discoverable via search need to start using some of the tools the marketers use to optimise their content and place it appropriately? (So for example, when writing course catalogue pages, don’t fill the course description with the words that you’d expect someone who has completed the course to use…)

PS since looking through the Experian product catalogue, I now rank them right up there with Dunnhumby in terms of what they know about us…

Searching By Looking Elsewhere

A couple of weeks or so ago, I got an email requesting a link to something I’d spoken about at a department meeting some time ago (the Gartner hype cycle, actually). Now normally I’d check my delicious bookmarks for a good link, or maybe even run a Google web search, but instead I ran a search for ‘gartner hypecycle 2008’ on Google Images

…which is when it struck me that searching Google Images may on occasion lead to better quality, or more relevant, results than doing a normal web search, particularly if you use a level of indirection. In particular, it can often lead to a web document or post that provides some sort of analysis around a topic. (Remember, Google image search links to the web pages that contain the images that are displayed in the image search results, not just the images.)

So for example, a web search for games console sales chart [web search] turns up a different set of results to an image search for games console sales chart [image search]. And here’s where my gut feeling comes in about using the fact that documents contain images as a filter – if people have gone to the trouble of including a relevant image in something they have published, their post may be more considered on a particular topic than one that doesn’t. That is, the inclusion of a relevant image can be used as a valuable ranking term when searching for results. Essentially, you are running an advanced, search limited query around an image document type.

Note that it’s often sensible, when sharing image queries, to make the search a ‘safe’ (i.e. adult content filtered) one: in Google, just add &safe=active to the end of the URL.
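As a rough sketch of that URL tweak, the Python below sets safe=active on a search URL before it is shared. The helper name make_safe and the example URL are made up for illustration; only the safe=active parameter comes from the note above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def make_safe(url):
    """Return the URL with safe=active set in its query string."""
    parts = urlsplit(url)
    # Drop any existing 'safe' setting so the parameter is not duplicated.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "safe"
    ]
    params.append(("safe", "active"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search URL, used only to show the shape of the result.
shareable = make_safe("https://www.google.com/search?q=gartner+hypecycle+2008")
print(shareable)
```

Rebuilding the query string, rather than just appending text to the end, means the result is still well formed if the URL already carries a safe setting.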

(The image search approach also lets me quickly scan the results for one that appears to contain the sort of chart data I want. Supporting visual filtering is one reason why some search engines have experimented with including an image from each linked to page in the search engine results listing.)

Limiting searches by document type can also be achieved in a normal web search too, of course. For example, if you are looking for a report on knife crime in UK cities, then it might be reasonable to suspect that the most relevant documents were published as PDFs – so limit on that:

If you’d rather use the normal Google search box as a command line, the search query is: uk+knife+crime+report+filetype:pdf

If you’re looking for actual data, it might make sense to search on spreadsheet documents? uk knife crime statistics filetype:xls
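If you want to generate links for this sort of document-type-limited query, a minimal sketch might look like the following. The function name filetype_search_url is hypothetical, and the base URL with its q parameter is assumed to be the usual Google web search address; only the filetype: operator is taken from the examples above.

```python
from urllib.parse import urlencode


def filetype_search_url(terms, filetype):
    """Build a web search URL limited to one document type via the filetype: operator."""
    query = f"{terms} filetype:{filetype}"
    # urlencode turns spaces into '+' and percent-encodes the colon.
    return "https://www.google.com/search?" + urlencode({"q": query})


report_search = filetype_search_url("uk knife crime report", "pdf")
data_search = filetype_search_url("uk knife crime statistics", "xls")
print(report_search)
print(data_search)
```

The same helper covers both cases above: pdf when you are after a report, xls when you are after the underlying data.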

As well as variously using the keyword ‘chart’ or ‘statistics’, the word ‘data’ or ‘table’ can also help tune results, particularly when running an image search. Remember, the point may not necessarily be to find a chart, or set of data, directly. Instead, it may be to use the fact that a document contains a chart or a table to limit the results you get back (assuming that documents or posts containing charts, tables, etc., are likely to be more considered on a particular topic simply because the author has gone to the trouble of including a chart or a table etc.)

Increasingly, I find I’m also using Youtube to search for particular items of BBC content. Note that my motivation here is not necessarily to use the video clip I have found directly, mainly because a lot of BBC related footage on Youtube has not been put there by the BBC – i.e. it is more likely to be copyright infringing content uploaded by an individual.

Instead, I am making use of:

1) the segmenting of video clips that individuals have done (chopping a 3 minute clip out of an hour long documentary, for example);
2) the user provided metadata around the clip – the title they have given it, the description text, the tags used to annotate it;
3) the automatically generated ‘related video’ service provided by Youtube,

to help me deep search into BBC content so that I can quickly find a clip that can then be obtained in a rights approved manner, without having to wade through hours and hours of video searching for a clip I want to use.

That is, it is possible to use Youtube as a great big index of BBC ‘deep clips’, in the sense that they are clipped from deep within a longer programme, to locate a particular clip that can then be obtained in a rights cleared fashion: searching Youtube to find something that I will then go elsewhere for.

So the take home message from this post? The best place to search for a particular resource may not be the obvious one.

Applying SEO to the Course Catalogue

Just before Christmas I gave a talk at the department awayday that I’d intended to do in the style of a participatory lecture but, as is the way of these things, it turned into a total palaver and lost most of the lunch-addled audience within the first 20s;-)

Anyway, anyway, one of the parts of the talk was to get everyone to guess what course was being described based on a tag cloud analysis of the course description on the corresponding page of the course catalogue (got that?)

Here’s the relevant part of the presentation:

(The course codes were actually click-revealed during the presentation.)

Note that the last slide actually shows a tag cloud of the search terms that brought visitors into the OU website and delivered visitors to the specified course page, rather than a tag cloud of the actual course description.

See if you can spot which is which – remember, one of the following is generated from the actual course description, the other from incoming search terms to that page:


T209 description tag cloud

I’m not going to explore what any of this “means” in this post (my blogging time is being increasingly sidelined, unfortunately:-( suffice to say that whilst I was giving the original presentation I heard myself strongly arguing something along the lines of the following:

It’s pointless writing the course description on the course catalogue web pages using the terminology you want students to come out of the course with (that is, using the language you expect the course to teach them). What the course description has to do is attract people who want to learn those terms; so YOU have to use the words that they are likely to be using on Google to find the course in the first place.
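One rough way to test a course description against that argument is to compare the terms that would drive its tag cloud with the terms in the searches that actually bring people to the page. The sketch below is only an illustration: the description text and the search queries are invented, the stopword list is deliberately tiny, and the function names are my own.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real analysis would want a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "how", "is", "on", "with"}


def term_counts(text):
    """Count lowercase words, ignoring stopwords (the weights behind a tag cloud)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(word for word in words if word not in STOPWORDS)


def vocabulary_gap(description, search_queries):
    """Split terms into shared, description-only and search-only groups."""
    described = set(term_counts(description))
    searched = set(term_counts(" ".join(search_queries)))
    return (
        sorted(described & searched),
        sorted(described - searched),
        sorted(searched - described),
    )


# Invented sample data, not taken from any real course page or search log.
description = "An introduction to packet switching protocols and network architectures"
queries = ["how does the internet work", "learn computer networking", "network course"]

shared, only_description, only_searches = vocabulary_gap(description, queries)
print("Shared terms:", shared)
print("Only in the description:", only_description)
print("Only in the searches:", only_searches)
```

Terms that only appear in the searches are candidates for working into the description; terms that only appear in the description may be the “after the course” vocabulary.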

It strikes me that a similar sense of before/after language might also apply to the way we phrase learning objectives at the start of a learning activity in everyday, why-we’re-bothering-learning-this-at-all type language, and then clarify the learning outcomes in jargon heavy, terminology laden, worthy sounding terms at the end of the activity?;-)

See also: Measuring the Success of the Online Course Catalog, which looks at the design of a course catalogue from an SEO/actionable analytics point of view.