Structured Data for Course Web Pages and Customised Custom Search Engine Sorting

As visitors to any online shopping site will know, it’s often possible to sort search query results by price, or number of ‘review stars’, or filter items to show only books by a specified author, or publisher, for example. Via Phil Bradley, I see it’s now possible to introduce custom filtering and sorting elements into Google Custom Search Engine results.

(If you’re not familiar with Google’s Custom Search Engines (CSEs), they’re search engines that only search over, or prioritise results from, a limited set of web pages/web domains. Google CSEs power my Course Detective and UK University Libraries search engines. Hmm… I suspect Course Detective has rotted a bit by now… :-( )

What this means is that if web pages are appropriately marked up, they can be sorted, filtered or ranked accordingly when returned as a search result in a Google CSE. So for example, if course pages were marked up with academic level, start date, NSS satisfaction score, or price, they could be sorted along those lines.

So how do pages need to be marked up in order to benefit from this feature? There are several ways:

  • Simply add meta-tags to a web page. For example, <meta name="course.identifier" content="B203" />
  • Use Rich Snippets supporting markup (i.e. microdata/microformats/RDFa)
  • Add PageMap data to a sitemap or web page. PageMap data also allows for the definition of actions, such as “Download”, that can be emphasised as such within a custom search result. (Facebook is similarly going down the path of trying to encourage developers to use verb-driven, action-related semantics: Facebook Actions.)
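To make the first and third options a little more concrete, here is a minimal, untested-against-Google sketch in Python that renders a handful of course attributes either as meta tags or as a PageMap block wrapped in an HTML comment (which, as I understand the documentation, is how PageMap data is embedded in a page's head). The function names, the "course" prefix/object type and the "level" and "price" values are my own illustrative assumptions; only the B203 identifier comes from the example above.

```python
from xml.sax.saxutils import escape, quoteattr


def meta_tags(attrs, prefix="course"):
    """Render each attribute as a <meta name="prefix.name" content="value" /> tag."""
    return "\n".join(
        f"<meta name={quoteattr(prefix + '.' + name)} content={quoteattr(str(value))} />"
        for name, value in attrs.items()
    )


def pagemap_comment(attrs, object_type="course"):
    """Render the attributes as one PageMap DataObject inside an HTML comment."""
    lines = ["<!--", "<PageMap>", f"  <DataObject type={quoteattr(object_type)}>"]
    for name, value in attrs.items():
        lines.append(
            f"    <Attribute name={quoteattr(name)}>{escape(str(value))}</Attribute>"
        )
    lines += ["  </DataObject>", "</PageMap>", "-->"]
    return "\n".join(lines)


# Hypothetical course record: "level" and "price" are made-up illustrative values.
COURSE = {"identifier": "B203", "level": "2", "price": "1250"}

if __name__ == "__main__":
    print(meta_tags(COURSE))
    print(pagemap_comment(COURSE))
```

The sketch does not guard against values containing "--", which would not be legal inside an HTML comment, so real course data would need checking first.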

I wonder about the extent to which JISC’s current course data programme of activities could be used to encourage institutions to explore the publication of some of their course data in this way? For example, might it be possible to transform XCRI feeds, such as the Open University XCRI feed, into PageMap-annotated sitemaps?
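As a rough idea of what such a transformation might look like, the following Python sketch turns a course feed into a sitemap with a PageMap block attached to each URL. Big caveat: the sample feed is a simplified, made-up stand-in and not real XCRI-CAP (which uses its own namespaces and element names), and the FIELD_MAP dictionary, the "course" object type and the example.com URL are all assumptions for illustration. The PageMap namespace is the one I believe Google documents for sitemap use, and whether Google accepts a prefixed rather than default namespace on the PageMap element is something I have not checked.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
PAGEMAP_NS = "http://www.google.com/schemas/sitemap-pagemap/1.0"  # assumed from Google's docs

# Simplified, hypothetical feed: NOT real XCRI-CAP, just a stand-in with the same flavour.
SAMPLE_FEED = """<catalog>
  <course>
    <url>http://example.com/courses/b203</url>
    <identifier>B203</identifier>
    <title>Example course title</title>
    <level>2</level>
  </course>
</catalog>"""

# Hypothetical mapping from feed element names to PageMap attribute names.
FIELD_MAP = {"identifier": "identifier", "title": "title", "level": "level"}


def qname(namespace, tag):
    """Build an ElementTree-style qualified name: {namespace}tag."""
    return "{%s}%s" % (namespace, tag)


def feed_to_sitemap(feed_xml):
    """Return a sitemap XML string with one PageMap-annotated <url> per course."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("pagemap", PAGEMAP_NS)
    urlset = ET.Element(qname(SITEMAP_NS, "urlset"))
    for course in ET.fromstring(feed_xml).iter("course"):
        url = ET.SubElement(urlset, qname(SITEMAP_NS, "url"))
        ET.SubElement(url, qname(SITEMAP_NS, "loc")).text = course.findtext("url")
        pagemap = ET.SubElement(url, qname(PAGEMAP_NS, "PageMap"))
        data = ET.SubElement(pagemap, qname(PAGEMAP_NS, "DataObject"), {"type": "course"})
        for feed_name, attr_name in FIELD_MAP.items():
            value = course.findtext(feed_name)
            if value is not None:  # skip fields the feed does not supply
                attr = ET.SubElement(data, qname(PAGEMAP_NS, "Attribute"), {"name": attr_name})
                attr.text = value
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(feed_to_sitemap(SAMPLE_FEED))
```

Keeping the element-to-attribute mapping in a single dictionary means a real XCRI vocabulary could be swapped in later without touching the rest of the code.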

Something like a tweaked Course Detective CSE could then act as a quick demonstrator of the benefits that can be immediately realised. So for example, from the Google CSE documentation on Filtering and sorting search results (I have to admit I haven’t played with any of this yet…), it seems that as well as filtering results by attribute, it’s also possible to use attributes to sort and rank (or at least, bias) results.
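Going by my reading of that documentation (and bearing in mind I haven't tried it), filters are written into the query as more:pagemap:TYPE-NAME:VALUE terms, while sorting and biasing use a sort parameter of the form TYPE-NAME:DIRECTION, with an optional strength suffix for biasing. The Python sketch below just assembles such query strings; the helper names and the "course", "level" and "price" attribute names are hypothetical, and the operator syntax should be checked against the current documentation before relying on it.

```python
from urllib.parse import urlencode


def filter_term(object_type, name, value):
    """Query term restricting results to pages whose attribute has the given value."""
    return f"more:pagemap:{object_type}-{name}:{value}"


def sort_param(object_type, name, direction="d", strength=None):
    """Sort value: 'a' ascending or 'd' descending; strength 's' or 'w' biases instead of hard sorting."""
    if direction not in ("a", "d"):
        raise ValueError("direction must be 'a' or 'd'")
    if strength not in (None, "s", "w"):
        raise ValueError("strength must be None, 's' (strong) or 'w' (weak)")
    parts = [f"{object_type}-{name}", direction]
    if strength:
        parts.append(strength)
    return ":".join(parts)


def cse_query_string(query, filters=(), sort=None):
    """URL-encode the query, any filter terms, and an optional sort parameter."""
    params = {"q": " ".join([query, *filters])}
    if sort:
        params["sort"] = sort
    return urlencode(params)


if __name__ == "__main__":
    # Hypothetical: level 2 history courses, cheapest first.
    print(cse_query_string(
        "history",
        filters=[filter_term("course", "level", "2")],
        sort=sort_param("course", "price", "a"),
    ))
```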

Note to self: have a rummage around the XCRI data definitions/vocabularies resources… I also wonder if there is a mapping of XCRI elements onto simple attribute names that could be used to populate, e.g., meta tag or PageMap name attributes?