Search Engine Powered Courses…

How can we use customised search engines to support uncourses, or the course models used to support MOOC style offerings?

To set the scene, here’s what Stephen Downes wrote recently on the topic of How to partcipate in a MOOC:

You will notice quickly that there is far too much information being posted in the course for any one person to consume. We tried to start slowly with just a few resources, but it quickly turns into a deluge.

You will be provided with summaries and links to dozens, maybe hundreds, maybe even thousands of web posts, articles from journals and magazines, videos and lectures, audio recordings, live online sessions, discussion groups, and more. Very quickly, you may feel overwhelmed.

Don’t let it intimidate you. Think of it as being like a grocery store or marketplace. Nobody is expected to sample and try everything. Rather, the purpose is to provide a wide selection to allow you to pick and choose what’s of interest to you.

This is an important part of the connectivist model being used in this course. The idea is that there is no one central curriculum that every person follows. The learning takes place through the interaction with resources and course participants, not through memorizing content. By selecting your own materials, you create your own unique perspective on the subject matter.

It is the interaction between these unique perspectives that makes a connectivist course interesting. Each person brings something new to the conversation. So you learn by interacting rather than by mertely consuming.

When I put together the the OU course T151, the original vision revolved around a couple of principles:

1) the course would be built in part around materials produced in public as part of the Digital Worlds uncourse;

2) each week’s offering would follow a similar model: one or two topic explorations, plus an activity and forum discussion time.

In addition, the topic explorations would have a standard format: scene setting, and maybe a teaser question with answer reveal or call to action in the forums; a set of topic exploration questions to frame the topic exploration; a set of resources related to the topic at hand, organised by type (academic readings (via a libezproxy link for subscription content so no downstream logins are required to access the content), Digital Worlds resources, weblinks (industry or well informed blogs, news sites etc), audio and video resources); and a reflective essay by the instructor exploring some of the themes raised in the questions and referring to some of the resources. The aim of the reflective essay was to model the sort of exploration or investigation the student might engage in.

(I’d probably just have a mixed bag of resources listed now, along with a faceting option to focus in on readings, videos, etc.)

The idea behind designing the course in this way was that it would be componentised as much as possible, to allow flexibility in swapping resources or even topics in and out, as well as (though we never managed this), allowing the freedom to study the topics in an arbitrary order. Note: I realised today that to make the materials more easily maintainable, a set of ‘Recent links’ might be identified that weren’t referred to in the ‘My Reflections’ response. That is, they could be completely free standing, and would have no side effects if replaced.

As far as the provision of linked resources went, the original model was that the links should be fed into the course materials from an instructor maintained bookmark collection (for an early take on this, see Managing Bookmarks, with a proof of concept demo at CourseLinks Demo (Hmmm, everything except the dynamic link injection appears to have rotted:-().

The design of the questions/resources page was intended to have the scoping questions at the top of the page, and then the suggested resources presented in a style reminiscent of a search engine results listing, the idea being that we would present the students with too many resources for them to comfortably read in the allocated time, so that they would have to explore the resources from their own perspective (eg given their current level of understanding/knowledge, their personal interests, and so on). In one of my more radical moments, I suggested that the resources would actually be pulled in from a curated/custom search engine ‘live’, according to search terms specially selected around the current topic and framing questions, but I was overruled on that. However, the course does have a Google custom search engine associated with it which searches over materials that are linked to from the course.

So that’s the context…

Where I’m at now is pondering how we can use an enhanced custom search engine as a delivery platform for a resource based uncourse. So here’s my first thought: using a Google Custom Search Engine populated with curated resources in a particular area, can we use Google CSE Promotions to help scaffold a topic exploration?

Here’s my first promotions file:

<Promotions>
   <Promotion id="t151_1a" 
        queries="topic 1a, Topic 1A, topic exploration 1a, topic exploration 1A, topic 1A, what is a game, game definition" 
        title="T151 Topic Exploration 1A - So what is a game?" 
        url="http://digitalworlds.wordpress.com/2008/03/05/so-what-is-a-game/"
        description="The aim of this topic is to think about what makes a game a game. Spend a minute or two to come up with your own definition. If you're stuck, read through the Digital Worlds post 'So what is a game?'"
        image_url="http://kmi.open.ac.uk/images/ou-logo.gif" />
</Promotions>

It’s running on the Digital Worlds Search Engine, so if you want to try it out, try entering the search phrase what is a game or game definition.

T151 CSE promotion - game definition

(This example suggests to me that it would also make sense to use result boosting to boost the key readings/suggested resources I proposed in the topic materials so that they appear nearer the top of the results (that’ll be the focus of a future post;-))

The promotion displays at the top of the results listing if the specified queries match the search terms the user enters. My initial feeling is that to bootstrap the process, we need to handle:

– queries that allow a user to call on a starting point for a topic exploration by specifically identifying that topic;
– “naive queries”: one reason for using the resource-search model is to try to help students develop effective information skills relating to search. Promotions (and result boosting) allow us to pick up on anticipated naive queries (or popular queries identified from search logs), and suggest a starting point for a sensible way in to the topic. Alternatively, they could be used to offer suggestions for improved or refined searches, or search strategy hints. (I’m reminded of Dave Pattern’s work with guided searches/keyword refinements in the University of Huddersfield Library catalogue in this context).

Here’s another example using the same promotion, but on a different search term:

T151 CSE - topic 1a

Of course, we could also start to turn the search engine into something like an adventure game engine. So for example, if we type: start or about, we might get something like:

T151 CSE - start

(The link I associated with start should really point to the course introduction page in the VLE…)

We can also use the search context to provide pastoral or study skills support:

T151 CSE - pastoral

These sort of promotions/enhancements might be produced centrally and rolled out across course search engines, leaving the course and discipline related customisations to the course team and associated subject librarians.

Just a final note: ignoring resource limitations on Google CSEs for a moment, we might imagine the following scenarios for their role out:

1) course wide: bespoke CSEs are commissioned for each course, although they may be supplemented by generic enhancements (eg relating to study skills);

2) qualification based: the CSE is defined at the qualification level, and students call on particular course enhancements by prefacing the search with the course code; it might be that students also see a personalised view of the qualification CSE that is tuned to their current year of study.

3) university wide: the CSE is defined at the university level, and students students call on particular course or qualification level enhancements by prefacing the search with the course or qualification code.

From “Special Result” to “Promotion” in Google CSEs

In passing, I noticed I had a broken link to a Google CSE documentation page:

http://code.google.com/apis/customsearch/docs/special_results.html

Searching a little, I found the page had moved to

http://code.google.com/apis/customsearch/docs/promotions.html

A cached version of the originally linked page is still available, so I did a side-by-side comparison:

From 'special result' to 'promotion'

Hmmm…

Course Librarians and Search Assist…

For all their success in attracting universities to adopt Google Apps (Tradition meets technology: top universities using Apps for Education), it’s not obvious to me how – or even if – Google is actually doing much around search signal detection and innovation in an educational context?

I’ve floated this a couple of times before (eg Could Librarians Be Influential Friends? And Who Owns Your Search Persona? and Integrating Course Related Search and Bookmarking?), but with yet another announcement from Google about how they’re incorporating social signals into search rankings (Hide sites from anywhere in the world: “We’ve … started incorporating data about sites people have blocked into our general search ranking algorithms to help users find more high quality sites.”), I’m going to raise it again…

To what extent are course and subject librarians setting up course/subject personas that engage in recommending and sharing high quality links in an appropriate social content, and encouraging students to follow those accounts in order to benefit from personalisation of search results based on social signals?

Furthermore, to what extent might the development of search personas represent the creation of a “scholarly agent” that can be used to offer “search assist” to followers of that agent/persona?

I don’t find it that hard to imagine myself taking a course, following the course recommender on a social network (an account that might send out course related reminders as well as relevant links), with an icon depicting my university and the associated course, that on occasion appeared to “recommend” links to me when I was searching for topics relating to my course. (In the normal scheme of things, it wouldn’t actively be recommending links to me, of course. For that, I’d need to subscribe to something like Subscribed Links, as mentioned in Integrating Course Related Search and Bookmarking?.)

University Search Engine Sitelinks and (Rich) Snippets

A long time ago, I started running with the idea that an organisation’s homepage on the web was the dominant search engine’s search results page for the most common search term associated with that organisation. So for example, the OU’s effective home page is not http://www.open.ac.uk, but the Google search results page for open university. (You can test the extent to which this claim is supported by checking out your website logs, and comparing how many direct visitors you get to the “official” homepage for your website compared to the amount of traffic referred to your site from google for common search terms used to find your site. You do know what those search terms are, right?!;-)

At one time, the results page might include several links to different institutional web pages, each listed as a separate individual search result item. Then five years or so ago, Google started introducing sitelinks into the search results page, a display technique that would include a list of several links to specific pages within the same domain within the context of a single result item, headed by the top level domain.

OU snippets

Or maybe how about the library?

OU library snippets and sitelinks

[For an overview, see Anatomy Of A Google Snippet and maybe also Meta Description Mutiny! Take Control of Your Text Snippets; if you want to take some responsibility for what appears, see the Google Webmaster Tools posts on sitelinks, Changing a site title and description and Removing snippets and Instant Preview]


Video: Matt Cutts introduces the original snippets

A recent (August 2011) update sees the Goog placing even more focus on the display of sitelinks (The evolution of sitelinks: expanded and improved).

Here are few things about this update that I think are worth noting, particularly in light of recommendations emerging from the JISC “Linking You” project, which offers best practice guidance on the design of top-level URI schemes for university websites:

Sitelinks will now be full-size links with a URL and one line of snippet text—similar to regular results—making it even easier to find the section of the site you want. We’re also increasing the maximum number of sitelinks per query from eight to 12. …

In addition, we’re making a significant improvement to our algorithms by combining sitelink ranking with regular result ranking to yield a higher-quality list of links. This reduces link duplication and creates a better organized search results page. Now, all results from the top-ranked site will be nested within the first result as sitelinks, and all results from other sites will appear below them. The number of sitelinks will also vary based on your query—for example, [museum of art nyc] shows more sitelinks than [the met] because we’re more certain you want results from http://www.metmuseum.org.

So what do we learn from this?

  • Sitelinks will now be full-size links with a URL and one line of snippet text—similar: so check your results listing and see if the “one line of snippet text” for each displayed result makes sense. (For some ideas about how to influence snippet text, see e.g. the Google Webmaster Tools links above.)
  • We’re … increasing the maximum number of sitelinks per query from eight to 12: what links would you like to appear in the sitelinks list, compared to what links actually do appear? Would consensus in how UK HEIs architect top-level URLs (as, for example, recommended by Linking You, provide uniformity of display of results in the Google SERPs space? Would consensus open Google up to discussion relating to the most effective way of displaying search results for UK HEI sitelink results?
  • we’re making a significant improvement to our algorithms by combining sitelink ranking with regular result ranking to yield a higher-quality list of links: which is to say – SEO, and URI path design, may play a role in determining what links get displayed as sitelinks.
  • This reduces link duplication and creates a better organized search results page. That is, better orgainised, as defined by Google. If you want to influence the way links to your site are displayed as sitelinks, you need to figure how – you don’t control how Google provides this top level navigation to your site as sitelinks, but you may be able to influence the display through good website design….
  • The number of sitelinks will also vary based on your query

If you have pages that mainly contain lists of items, these may also be handled differently in the context of snippets: New snippets for list pages

See also:
Rich snippets microdata – if Google handled edu microdata, what would it describe…?
– Google Webmaster tools: Rich Snippets testing tool – “enter a web page URL to see how it may appear in search results”

PS I wonder how sitelink displays interact with rich snippets…?

PPS There’s a great write up of the Linking You project that I’ve just come across here: Lend Me Your Ears Dear University Web Managers!. Go and read it…. now… Hmmm.. thinks… what would a similar exercise for local council websites look like?

Getting Library Catalogue Searches Out There…

As a long time fan of custom search engine offerings, I keep wondering why Google doesn’t seem to have much active interest in this area? Google Custom Search updates are few and far between, and typically go unreported by the tech blogs. Perhaps more surprisingly, Custom Search Engines don’t appear to have much, if any, recognition in the Google Apps for Education suite, although I think they are available with a Google Apps for education ID?

One of the things I’ve been mulling over for years is the role that automatically created course related search engines might have to play as part of a course’s VLE offering. The search engine would offer search results either over a set of web domains linked to from the actual course materials, or simply boost results from those domains in the context of a “normal” set of search results. I’ve recently started thinking that we could also make use “promoted” results to highlight specific required or recommended readings when a particular topic is searched for (for example, Integrating Course Related Search and Bookmarking?).

During an informal “technical” meeting around three JISC funded reseource discovery projects at Cambridge yesterday (Comet, Jerome, SALDA; disclaimer: I didn’t work on any of them, but I was in the area over the weekend…), there were a few brief mentions of how various university libraries were opening up their catalogues to the search engine crawlers. So for example, if you do a site: limited search on the following paths:

– sabre.sussex.ac.uk/vufindsmu/Record/
– jerome.library.lincoln.ac.uk/catalogue/
– webcat.hud.ac.uk/catlink/bib/
– search.lib.cam.ac.uk/

you can get (partial?) search results, with a greater or lesser degree of success, from the Sussex, Lincoln, Huddersfield and Cambridge catalogues respectively.

In a Google custom search engine context, we can tunnel in a little deeper in an attempt to returns results limited to actual records:

– sabre.sussex.ac.uk/vufindsmu/Record/*/Description
– jerome.library.lincoln.ac.uk/catalogue/*
– webcat.hud.ac.uk/catlink/bib/*
– search.lib.cam.ac.uk/?itemid=*

I’ve added these to a new Catalogues tab on my UK HE library website CSE (about), so we can start to search over these catalogues using Google.

I’m not sure how useful or interesting this is at the moment, except to the library systems developers maybe, who can compare how informatively their library catalogue content is indexed and displayed in Google search results compared to other libraries… (so for example, I noticed that Google appears to be indexing the “related items” that Huddersfield publishes on a record page, meaning that if a search term appears in a related work, you might get a record that at first glance appears to have little to do with your search term, in effect providing a “reverse related work” search (that is, search on related works and return items that have the search term as the related work)).

Searching UK HE library catalogues via a Google CSE

But it’s a start… and with the addition of customised rankings, might provide a jumping off point for experimenting with novel ways of searching across UK HE catalogues using Google indexed content. (For example, a version of the CSE on the cam.ac.uk domain might boost the Cambridge results; within an institution, works related to a particular course through mention on a reading list might get a boost if a student on that course runs a search… and so on…

PS A couple of other things that may be worth pondering… could Google Apps for Education account holders be signed up to to Subscribed Links offering customised search results in the main Google domain relating to a particular course. (That is, define subscribed link profiles for a each course, and automatically add those subscriptions to an Apps for Edu user’s account based on the courses they’re taking?) Or I wonder if it would be possible to associate subscribed links to public access browsers in some way?

And how about finding some way of working with Google to open up “professional” search profiles, where for example students are provided with “read only” versions of the personalised search results of an expert in a particular area who has tuned, through personalisation, a search profile that is highly specialised in a particular subject area, e.g. as mentioned in Google Personal Custom Search Engines? (see also Could Librarians Be Influential Friends? And Who Owns Your Search Persona?).

If anyone out there is working on ways of using Google customised and personalised search as a way of delivering “improved” search results in an educational context, I’d love to hear more about what you’re getting up to…

Tweaking Ranking Factors in the Course Detective Custom Search Engine

This is a note-to-self as much as anything, relating to the Course Detective custom search engine that searches over UK HE course prospectus web pages about the extent to which we might be able to use data such as the student satisfaction survey results (as made available via Unistats) to boost search results around particular subjects in line with student satisfaction ratings, or employment prospects, for particular universities?

It’s possible to tweak rankings in Google CSEs in a variety of ways. On the one hand, we can BOOST (improve the ranking), FILTER (limit results to members of a given set) or ELIMINATE (exclude) sites appearing in the search results listing. In the simplest case, we assign a BOOST, FILTER or ELIMINATE weight to a label, and then apply labels to annotations so that they benefit from the corresponding customisation. We can further refine the effect of the modification by applying a score to each annotation. The product of score and weight values determines the overall ranking modification that is applied to each result for a label applied to an annotation.

So here’s what I’m thinking:

– define labels for things like achievement or satisfaction that apply a boost to a result;
– allow uses to apply a label to a search;
– for each university annotation (that is, the listing that identifies the path to the pages for a particular university’s online prospectus), add a label with a score modifier determined by the achievement or satisfaction value, for example, for that institution;
– for refinement labels that tweak search rankings within a particular subject area, define labels corresponding to those subject areas and apply score modifiers to each institution based on, for example, the satisfaction level with that subject area. (Note: I’m not sure if the same path can have several different annotations provided to it with different scores?

For example, an annotation file typically contains a fragment that looks like:

<Annotations>
  <Annotation about="webcast.berkeley.edu/*" score="1">
    <Label name="university_boost_highest"/>
    <Label name="lectures"/>
  </Annotation>

  <Annotation about="www.youtube.com/ucberkeley/*" score="1">
    <Label name="university_boost_highest"/>
    <Label name="videos_boost_mid"/>
    <Label name="lectures"/>
  </Annotation>
</Annotations>

I don’t know if this would work:

<Annotations>
  <Annotation about="example.com/prospectus/*" score="1">
    <Label name="chemistry"/>
  </Annotation>
<Annotations>
  <Annotation about="example.com/prospectus/*" score="0.5">
    <Label name="physics"/>
  </Annotation>
</Annotations>

That said, if the URLs are nicely structured, we might be able to do something like:

<Annotations>
  <Annotation about="example.com/prospectus/chemistry/*" score="1">
    <Label name="chemistry"/>
  </Annotation>
<Annotations>
  <Annotation about="example.com/prospectus/physics/*" score="0.5">
    <Label name="physics"/>
  </Annotation>
</Annotations>

albeit at the cost of having to do a lot more work in terms of identifying appropriate URI paths.

I also need to start thinking a bit more about how to apply refinements and ranking adjustments in course based CSEs.

Autocuration Signals in My Personalised Google Search Results

I spotted this for the first time last night:

Auto-curation signals in my search results

I had actually read the post in the Google Reader context (so Google knew that), but I wonder: if I hadn’t read the post, would it still have shown up like that?

As far as personalised ranking signals go:

– does the fact that I subscribe to the feed in Google Reader affect the rank of items from that feed in my personalised search results?
– if I have read the post in Google reader, does that also affect the rank of that specific post in my personalised search results?

If I have shared a link – through Google+, or Twitter, for example – are the ranking of those links positively affected in my personalised search results. That is, might social search actually be most useful when the Goog picks up on things I have shared myself, and then “reminds” me of them via a ranking boost in my personalised search results when I’m searching on a related topic?

Maybe tweeting and sharing into the void is actually yet another way of invisibly building search refinements into your personalised search context?