OU Goes Social with “Platform”

Earlier this week, the OU quietly opened up its new social site – Platform – with a mailing going out today to inform students and alumni about its availability…

…and at first sight, it’s looking really good:

As a distance learning institution, our students potentially miss out on the sense of community that you get as a student in a traditional university, although we work hard at engaging students in online forums at a course level and the students association (OUSA) try to support general interest groups, again with online forums. At a regional and local level, course tutorials offer students a chance to meet face to face (although there is an increasing number of wholly online courses), and our students also take it on themselves to create their own local groups, Facebook groups, and so on.

So I’m guessing that one of the functions of the Platform site is to help develop the wider community feeling that membership of a university provides, alongside the course cohort communities.

But more than that – the site is open to anyone, whether or not they are a current student or part of the OU alumni. And there’s no hard sell…

So what’s on Platform?

The front page is a general news page that also currently includes a couple of “interactive” features, specifically a poll and a YouTube video from one of the OU View channels on YouTube (The Open University, OU Life or OU Learn). (I assume that the polls, and maybe the video, will change on a regular basis?)

There’s also what looks like a “learning fact of the day” panel that provides a link to an actual “course sales” page in a reasonably un-intrusive way.

Just in passing, it’s worth comparing this panel with the OU “Learning Fact of the Day” widget, which actually links through to an OpenLearn course from which the fact was pulled, rather than driving the viewer to a page on the course selling catalogue.

Something that is not obviously on the site is a schedule of OU/BBC programmes, or even an OU/BBC iPlayer channel? Maybe that’s because the placement of this site in comparison to the open2.net site is not fully clear yet? Certainly I could see Platform cannibalising open2’s traffic if Platform started publicising OU/BBC programmes? But Open2 is looking rather tired… (That said, things are happening on that site. For example, the site is starting to include extra video features around our broadcast TV programmes, as the Barristers wraparound site shows (if you can manage to navigate round it to actually find the content, that is ;-) and commenting around the programme pages is slowly starting to take off (see for example the comments around the James May’s Big Ideas: Man-Machine programme).)

But back to the Platform site…

The News tab links to a set of news stories I guess created by OU staff (at the moment?). And I’m guessing there’ll be a mix of text stories as well as audio packages. (Though I do take issue with calling linked-to audio a “podcast”, I do have to admit ;-)

Two more things to note about that audio link: firstly, it’s a link rather than an embedded player plus a link – clicking the link opened a player in a new window on my browser. That’s a shame… it would have been much neater if there was an embedded player there. Secondly, here’s where it’s pointing to: http://podcast.open.ac.uk/feeds/platform/20081124T124715_is_reality_tv_ruining_music.mp3. That’s the OU podcast site, which is: a) still not out of testing/really launched yet, and b) not the OU iTunesU site. (I’m not sure how much the content from those sites will overlap.) And from a little tweet I heard a week or two ago, the podcast site actually uses Amazon S3 for storage and delivery…

A few other things to notice about the News pages – ratings, tagging and comments are all available… (I’m not sure what the moderation policy is, e.g. whether or not Platform staffers are actively moderating (= not scalable/sustainable in the long run, if the site takes off?) or using a lazy approach (report this post). Same with the tags – e.g. if people use inappropriate or offensive tags, can these be moderated or deleted?)

The Blogs area links to a set of blogs on different topics. At the moment this looks like they’ve commissioned people to write posts for the Platform blogs (Open2 uses a similar sort of approach for their topic blogs), so it’ll be interesting to see how that plays out. Certainly I don’t fully engage with writing posts to the Open2 Science and Technology blog, for a variety of reasons (I don’t like the blog engine they use; posts need to go through an editorial policy that strips out movies and maps in case of rights issues, but lets typos through that I can’t go in and change once the post is published, the traffic is lousy compared to the views I can get posting here on OUseful.info etc etc).

Each blog appears to have its own RSS feed, which is good (I haven’t checked which feed type they went for… it would be nice to think it was Atom).

The call to action around the feed – “Get Updates” – is well chosen, I think, and it’s nice that feed autodiscovery is enabled. I have to admit that the feed URL looks a bit odd, though… http://www.open.ac.uk/platform/blogs/alumni/%2A/%2A/feed. Hmm… (%2A renders as * if you hover over the URL in the browser status window)
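Just to make the %2A point concrete, here’s a minimal sketch of the same percent-decoding done outside the browser, using Python’s standard library. The variable names are mine; the URL is the feed URL quoted above.

```python
from urllib.parse import unquote

# Feed URL exactly as it appears in the page
feed_url = "http://www.open.ac.uk/platform/blogs/alumni/%2A/%2A/feed"

# unquote() turns percent-escapes such as %2A back into the characters they encode
decoded_url = unquote(feed_url)
print(decoded_url)
```

The decoded form has a literal * in each of the two path segments, which is what the browser status bar shows on hover.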

The Campus area looks to be an attempt to bring something of the OU campus alive, with voices and tales from people who work there. (I’m guessing this part of the site will feed from the OUlife YouTube channel and maybe the research channel, when it launches?)

If anywhere, this is the page on the Platform site that looks most like the place that is linking out to other OU web properties on the “main” OU website. In which case, I guess it’s really an info point? And in many respects, the thing that is closest to a traditional university homepage (although, err, Where is the Open University Homepage??).

The Join In area is where forums can be found (also linked to as “Forums” from the front page, I think?).

The Timeout area is where the games are… ;-) The OU actually has quite a long history of releasing games (e.g. here’s a round-up I did a couple of years ago: OU Online Games and Interactives), but the explosion in casual game formats and libraries means that they must be far easier (=quicker and cheaper) to make now, as well as being more acceptable, maybe?

Finally, it’s worth mentioning that the commenting and “joining in” features require you to log in. There are two huge things happening here. Firstly, to log in to the site, you don’t need to be a member of the OU (that is, you don’t need to be staff, a student, or an alumnus). Secondly, you can – if you want – log in using OpenID:

The OU has actually been running an experimental OU OpenID server for some time, which allows anyone with OU credentials to use those credentials as an OpenID, but as far as I know, this is one of the first production services running on the open.ac.uk domain that lets users in with an OpenID, although take note here – the OpenID doesn’t let you in to any OU authenticated areas: it’s just for Platform. (I’m not sure if Cloudworks or Cohere do OpenID yet?)

Although there’s little customisation you can do by virtue of registering – the benefits arise from being able to comment, and join in the forums – the site design certainly has the look and feel of a site that might, one day, let you drag and drop panels around, and rearrange the page furniture webtop fashion. (Or maybe we need to clarify the widget strategy first?!)

As yet, there’s no link to the Platform site from the Open University homepage, so it’ll be interesting to see how the relationship between the OU homepage and the Platform homepage evolves over the coming weeks and months (and also how the relationship between Platform and open2 is managed?).

It will also be interesting to see how the relationship between Platform and the new generation of departmental websites evolves over time. For example, my own Communication and Systems Department homepage is experimenting with “voices from the department” with a range of blog and audio content, and the team responsible are also looking for ways to make the site a destination site around communication related technologies (hence the “Gadgets” area):

Hmm – maybe I should offer to do a “speedmash” or “half hour hack” area for them?;-)

And finally, for a review of some “older” OU 2.0 services, Brian Kelly did a write up some time ago: The Open University’s Portfolio Of Web 2.0 Services. You can find links to most of them here: /use – From us, to you, and back again.

PS in case you’re wondering, I think I’m correct in saying that the OU Platform site is built on Drupal…

PPS Brilliant job folks – it’ll be interesting to see how people engage with it…

My CETIS 2008 Presentations

I’ve just spent a most enjoyable couple of days at the CETIS 2008 event in Birmingham, where I participated in a couple of sessions on the future of the VLE, and HE APIs.

Just for the record, here’s the presentation I gave in the VLE session (“Web 2.ools and the VLE“):

(Mark Stiles was kind enough to say how he liked the slides, and in particular the way in which the pictures weren’t about anything at all…;-)

[Transcript of liveblog/tweeting: I was on around 3pm]

And here are the presentations I didn’t give in the APIs session – “APIs Wot I Play Wiv“:

And “What I’d Like From JISC APIs“:

Instead, I ran through the Data Scraping Wikipedia with Google Spreadsheets mashup; somewhere along the way, the idea of a “speed mashup” was introduced… this is maybe something I’ll try out at the Mashed Library event tomorrow….. err, later today… One thing that did come out of the session for me is that maybe there really is an opportunity for some sort of roadshow/masterclass around the very idea of mashups, with some quick and effective mashup demos along the way (which are, apparently, quite “intimidating” compared to what you can and can’t do with educational system APIs…;-)

There are a few more notes – and some blatant self-promotion – on the CETIS08 APIs session wiki. Note to self: play with the PROD project discovery API.

Open Content Anecdotes

Reading Open Content is So, Like, Yesterday just now, the following bits jumped out at me:

Sometimes– maybe even most of the time– what I find myself needing is something as simple as a reading list, a single activity idea, a unit for enrichment. At those times, that often-disparaged content is pure gold. There’s a place for that lighter, shorter, smaller content… one place among many.

I absolutely agree that content is just one piece of the open education mosaic that is worth a lot less on its own than in concert with practices, context, artifacts of process, and actually– well, you know– teaching. Opening content up isn’t the sexiest activity. And there ain’t nothin’ Edupunk about it. But I would argue that in one way if it’s not the most important, it’s still to be ranked first among equals. Not just for reasons outlined above, but because for the most part educators have to create and re-create anew the learning context in their own environment. Artifacts from the processes of others– the context made visible– are powerful and useful additions that can invigorate one’s own practice, but I still have to create that context for myself, regardless of whether it is shared by others or not. Content, however, can be directly integrated and used as part of that necessary process. When all is said and done, neither content nor “context” stand on their own particularly well.

For a long time now, I’ve been confused about what ‘remixing’ and ‘reusing’ open educational content means in practical terms that will see widespread, hockey stick growth in the use of such material.

So here’s where I’m at… (err, maybe…?!)

Open educational content at the course level: I struggle to see the widespread reuse of courses, as such; that is, one institution delivering another’s; if someone from another institution wants to reuse our course materials (pedagogy built in!), we license it to them; for a fee. And maybe we also run the assessment, or validate it. It might be that some institutions direct their students to a pre-existing, open ed course produced by another institution where the former institution doesn’t offer the course; maybe several institutions will hook up together around specialist open courses so they can offer them to small numbers of their own students in a larger, distributed cohort, and as such gain some mutual benefit from bringing the cohort up to a size where it works as a community, or where it becomes financially viable to provide an instructor to lead students through the material.

For individuals working through a course on their own, it’s worth bearing in mind that most OERs released by “trad” HEIs are not designed as distance education materials, created with the explicit intention that they are studied by an individual at a remote location. The distance educational materials we create at the OU often follow a “tutorial-in-print” model, with built in pacing and “pedagogical scaffolding” in the form of exercises and self-assessment questions. Expecting widespread consumption of complete courses by individuals is, I think, unrealistic. As with a distributed HEI cohort model, it may be that groups of individuals will come together around a complete course, and maybe even collectively recruit a “tutor”, but again, I think this could only ever be a niche play.

The next level of granularity down is what would probably have been termed a “learning object” not very long ago, and is probably called something like an ‘element’ or ‘item’ in a ‘learning design’, but which I shall call instead a teaching or learning anecdote (i.e. a TLA ;-); be it an exercise, a story, an explanation or an activity, it’s a narrative something that you can steal, reuse and repurpose in your own teaching or learning practice. And the open licensing means that you know you can reuse it in a fair way. You provide the context, and possibly some customisation, but the original narrative came from someone else.

And at the bottom is the media asset – an image, video, quote, or interactive that you can use in your own works, again in a fair way, without having to worry about rights clearance. It’s just stuff that you can use. (Hmmm I wonder: if you think about a course as a graph, a TLA is a fragment of that graph (a set of nodes connected by edges), and a node, (and maybe even an edge?) is an asset?)
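To make that graph idea a bit more concrete, here’s a minimal sketch. The course, its node names and the `fragment` helper are all made up for illustration; the only thing taken from the paragraph above is the mapping of course to graph, TLA to fragment, and asset to node.

```python
# Hypothetical course graph: each node is an asset, each edge says "this leads on to that"
course = {
    "intro_video": ["worked_example"],
    "worked_example": ["exercise"],
    "exercise": ["self_assessment"],
    "self_assessment": ["summary_diagram"],
    "summary_diagram": [],
}

def fragment(graph, nodes):
    """Return the subgraph induced by `nodes`: those nodes plus the edges between them."""
    keep = set(nodes)
    return {n: [m for m in graph[n] if m in keep] for n in graph if n in keep}

# A TLA is a fragment of the course graph; an asset is a single node
tla = fragment(course, ["worked_example", "exercise", "self_assessment"])
asset = "summary_diagram"
print(tla)
```

Pulling a fragment out drops the edges that connected it to the rest of the course, which is one way of seeing why the reuser has to supply their own context.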

The finer the granularity, the more likely it is that something can be reused. To reuse a whole course maybe requires that I invest hours of time on that single resource. To reuse a “teaching anecdote”, exercise or activity takes minutes. To drop a video or an image into my teaching means I can use it for a few seconds to illustrate a point, and then move on.

As educators, we like to put our own spin on the things we teach; as learners viewed from a constructivist or constructionist stance, we bring our own personal context to what we are learning about. The commitment required to teach, or follow, a whole course is a significant one. The risk associated with investing a large amount of attention in that resource is not trivial. But reusing an image, or quoting someone else’s trick or tip, that’s low risk… If it doesn’t work out, so what?

For widespread reuse of the smaller open ed fragments, we need to be able to find them quickly and easily. A major benefit of reuse is that a reused component allows you to construct your story quicker, because you can find readymade pieces to drop into it. But if the pieces are hard to find, then it becomes easier to create them yourself. The bargain is something like this:

if (quality of resource x fit with my story/time spent looking for that resource) < (quality of resource x fit with my story/time spent creating that resource), then I’m probably better off creating it myself…

(The “fit with my story” is the extent to which the resource moves my teaching or learning on in the direction I want it to go…)
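Here’s a rough sketch of that bargain as a calculation, on the reading that whichever option gives more “quality x fit” per hour spent is the better deal. The scores, the times and the `payoff` function name are all hypothetical, picked only to show the comparison working.

```python
def payoff(quality, fit, hours):
    """Value per hour: (quality x fit with my story) / time spent."""
    return quality * fit / hours

# Hypothetical scores on a 0-1 scale, and times in hours
find_it = payoff(quality=0.8, fit=0.6, hours=0.5)   # found resource: decent fit, quick search
make_it = payoff(quality=0.7, fit=1.0, hours=3.0)   # home-made: perfect fit, slow to produce

# Reuse wins when finding gives more value per hour than creating
decision = "reuse" if find_it > make_it else "create"
print(decision)
```

Stretch the search time far enough (say, to a full working day) and the same numbers tip the other way, which is the “hard to find” problem in a nutshell.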

And this is possibly where the ‘we need more‘ OERs comes in; we need to populate something – probably a search engine – with enough content so that when I make my poorly formed query, something reasonable comes back; and even if the results don’t turn up the goods with my first query, the ones that are returned should give me the clues – and the hope – that I will be able to find what I need with a refinement or two of my search query.

I’m not sure if there is a “flickr for diagrams” yet (other than flickr itself, of course), maybe something along the lines of O’Reilly’s image search, but I could see that being a useful tool. Similarly, a deep search tool into the slides on slideshare (or at least the ability to easily pull out single slides from appropriately licensed presentations).

Now it might be that any individual asset is only reused once or twice; and that any individual TLA is only used once or twice; and that any given course is only used once or twice; but there will be more assets than TLAs (because resources can be disaggregated from TLAs), and more TLAs than courses (because TLAs can be disaggregated from courses), so the “volume reuse” of assets summed over all assets might well generate a hockey stick growth curve?

In terms of attention – who knows? If a course consumes 100x as much attention as a TLA, and a TLA consumes 10x as much attention as an asset, maybe it will be the course level open content that gets the hockey stick in terms of “attention consumption”?
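A quick back-of-the-envelope sketch of those two ways of counting. Only the 100x and 10x attention ratios come from the paragraph above; the item counts and the “reused twice” figure are invented purely to show how the sums fall out.

```python
# Hypothetical inventory: many more assets than TLAs, many more TLAs than courses
counts = {"course": 10, "tla": 200, "asset": 5000}
reuses_per_item = 2   # "only reused once or twice"

# Relative attention per use: course = 100 x TLA, TLA = 10 x asset
attention_per_use = {"course": 1000, "tla": 10, "asset": 1}

total_uses = {level: n * reuses_per_item for level, n in counts.items()}
total_attention = {level: total_uses[level] * attention_per_use[level] for level in counts}

for level in counts:
    print(level, total_uses[level], total_attention[level])
```

With these made-up numbers, assets come out on top by volume of reuse while courses come out on top by attention consumed, so which curve looks like a hockey stick depends on which of the two you count.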

PS being able to unlock things at the “asset” level is one of the reasons why I don’t much like it when materials are released just as PDFs. For example, if a PDF is released as CC non-derivative, can I take a screenshot of a diagram contained within it and just reuse that? Or the working through of a particular mathematical proof?

PPS see also “Misconceptions About Reuse”.

On Writing “Learning Content” in the Cloud

A couple of weeks ago, I posted about an experiment looking at the “mass authoring” of a book on Processing (2.0 1.0, and a Huge Difference in Style).

Darrel Ince, who’s running the experiment, offered to post a public challenge for me to produce 40,000 words as an adjunct to the book using my own approach… I declined, partly because I’m not sure what I really had in mind would work to produce 40,000 words of “book adjunct”, partly because I don’t know what my approach would be (and I don’t have the time to invest in finding out at the moment, more’s the pity :-( )…

Anyway, here’s some of my further thinking on the whole “mass authoring experiment”…

Firstly, three major things came to my mind as ‘issues’ with the process originally suggested for the ‘mass authoring experiment’ – two related to the technology choice, the third to the production model.

To use an application such as Google docs, or even a wiki, to write a book in a sense respects the structure of the book. Separate documents represent separate chapters, or sections, and multiple authors can have access to the document. If “version control” is required – that is, if separate discrete drafts are required – then separate documents can be spawned for each draft. Alternatively, if the process is one of continual refinement, each chapter can evolve in a single document, potentially authored, edited, critically read and commented on by several people.

There are quite a few books out there that have been written by one or two people round a blog, but there the intent was to create posts that acted as tasters or trial balloons for content and get feedback from the community relating to it. John Battelle’s book on search (Dear Blog: Today I Worked on My Book), and the Groundswell book (7 ways the Web makes writing a book better & faster) are prime examples of this. “The Googlization of Everything” is another, and is in progress at the moment (Hi. Welcome to my book.).

The Google Hacks book I contributed a single hack to (;-) used separate Google docs docs for each hack, as described in Writing a Book in Google Docs. (In part, the use of Google docs as the authoring environment was a ‘medium is the message’ hack!) There the motivation was to author a ‘trad’ book in a new environment – and it seemed to work okay.

In each case, it’s worth remembering that the motivation of the authors was to write a book book, as with the mass authoring experiment, so in that sense it will provide another data point to consider in the “new ways of authoring books” landscape.

The second technology choice issue was the medium chosen for doing the code development. In a book book, intended for print, you necessarily have to refer the reader to a computer in order for them to run the code – offline or online doesn’t really come into it. But if you are writing for online delivery, then there is the option of embedding interactive code development activities within the text, using something like Obsessing, for example. Potentially, Obsessing, and even the processing.js library, might be pretty unstable, which would provide for an unsatisfactory learning experience for a novice working through the materials (“is my code broken or is the environment broken?”), but with use and a community around it, either the original developer might be motivated to support the libraries, or someone else might be minded to provide maintenance and ongoing development and support an engaged and contributory audience. After all, having a community finding bugs and testing fixes for you is one of the reasons people put time into their open code.

The other major issue I had was with respect to the structuring and organising of the “book”. If you want to play to network strengths in recruiting authors, critical readers, editors and testers, I’m not sure that providing a comprehensively broken down book structure is necessarily the best model? At its worst, this is just farming out word creation to “word monkeys” who need to write up each structural element until they hit the necessary word count (that may be a little harsh, but you maybe get the gist of what I’m trying to say?). The creativity that comes from identifying what needs to go into a particular section, and how it relates to other sections, is, in the worst case, denied to the author.

In contrast, if you provide a book stub wiki page as a negotiation environment and then let “the community” create further stub pages identifying possible book topics, then the ‘outline’ of the book – or the topics that people feel are important – would have more play – and more sense of ownership would belong with the community.

A more ‘natural’ way of using the community, to my mind, would be to explore the issue of a ‘distributed uncourse’ in a little more detail, and see how a structure could emerge from a community of bloggers cross referencing each other through posts, comments and trackbacks – Jim Groom’s UMW edu-publishing platform or D’Arcy Norman’s UCalgary Blogs platform are examples of what a hacked-off-the-shelf solution might look like to support this “within” an institution?

The important thing is that the communities arise from discovering a shared purpose. Rather than being given a set of explicit tasks to do, the community identifies what needs doing and then does it. Scott Leslie recently considered another dimension to this problem, in considering how “getting a community off the shelf” is a non-starter: Planning to Share versus Just Sharing.

It strikes me that the “mass authoring” experiment is trying to source and allocate resource to perform a set of pre-defined tasks, rather than allowing a community to grow organically through personal engagement and identify meaningful tasks that need to be completed within that community – that is, allowing the tasks to be identified on an ‘as required’ basis, or as itches that occur that come to need scratching?

The output of an emergent community effort would potentially be non-linear and would maybe require new ways of being read, or new ways of having the structure exposed to the reader? I tried to explore some of these issues as they came to mind when I was writing the Digital Worlds uncourse blog:

(though it probably doesn’t make a lot of sense without me talking to it!)

As part of the challenge, I was advised that I would need about 16 authors. I’m really intrigued about how this number was arrived at. On the basis of productivity (circa 2,500 words per person, assuming a 40,000 word deliverable)? When I was doing the uncourse posts, my gut feeling was that an engaging 500-800 word blog post might get say a handful of 50-200 word comments back, and possibly even a link back from another blog post. But what does that mean in terms of word count and deliverables?
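For what it’s worth, here’s the arithmetic as a rough sketch. The 40,000 words and 16 authors are from the challenge; everything else (mid-points of my gut-feeling ranges, and “a handful” read as five comments) is an assumption for the sake of having some numbers to push around.

```python
target_words = 40_000
authors = 16
words_per_author = target_words / authors   # the "circa 2,500 words per person" guess

# Hypothetical conversational yield, using mid-points of the ranges above
post_words = 650          # mid-point of 500-800
comments_per_post = 5     # "a handful" (assumed)
comment_words = 125       # mid-point of 50-200
words_per_thread = post_words + comments_per_post * comment_words

# Ceiling division: number of post-plus-comments threads needed to reach the target
threads_needed = -(-target_words // words_per_thread)
print(words_per_author, words_per_thread, threads_needed)
```

On those assumptions, a few dozen reasonably lively posts would cover the raw word count, though of course a comment thread is not the same kind of deliverable as a book chapter.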

Another issue that I had with taking the ‘recruit from cold’ approach, were I to take up the challenge, is that there is potentially already a community around Resig’s processing library, the Obsessing interactive editor for it, and Processing itself.

For example, there are plenty of resources already out in the wild to support Processing (eg at the Processing.org website) that might just need some scaffolding or navigation wrapped around them in order to make a “processing course” (copyright and license restrictions allowing, of course…)? So why not use them? (cf. Am I missing the point on open educational resources? and Content Is Infrastructure.) Of course, if the aim was to manufacture a “trad book” according to a prespecified design, this approach may not be appropriate, compared to seeing the structure of the “unbook” arise as engagement in an emergent and ongoing conversation – the next chapter is the next post I read or write on the topic.

From my own experience of Digital Worlds, I wrote a post or two a day for maybe 12 weeks, and then the flow was broken. That required maybe 2-4 hours a day commitment, learning about the topics, tinkering with ideas, seeing what other conversations were going on. It was time consuming, and the community I was engaging with (in terms of people commenting and emailing me) was quite small. Playing a full role in a larger community is more time consuming still, and is maybe one downside to managing an effective community process?

The idea behind the experiment – of looking for new ways to author content – is a good one, but for me the bigger question is to find new ways of reading and navigating content that already exists, or that might emerge through conversation. If we assume the content is out there, how can we aggregate it into sensible forms, or scaffold it so that it is structured in an appropriate way for students studying a particular “course”? If the content is produced through conversation, then does it make sense to talk about creating a content artefact that can be picked up and reused? Or is the learning achieved through the conversation, and should instructor interventions in the form of resource discovery and conducting behaviour, maybe, replace the “old” idea of course authoring?

In terms of delivering content that is authored in a distributed fashion on a platform such as the UMW WPMU platform, I am still hopeful that a “daily feed” widget that produces 1 or more items per day from a “static blog” according to a daily schedule, starting at the day the reader subscribes to the blog, will be one way of providing pacing to linearised feed powered content. (I need to post the WP widget we had built to do this, along with a few more thoughts about a linear feed powered publishing system built to service it.)
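The scheduling logic behind a daily feed is simple enough to sketch in a few lines. This is not the WP widget mentioned above, just an illustration of the idea; the `items_due` function, its parameters and the sample posts are all hypothetical.

```python
from datetime import date

def items_due(items, subscribed_on, today, per_day=1):
    """Return the items a reader should have received so far.

    The first batch is released on the day of subscription, so the schedule
    is relative to each reader rather than tied to the original post dates.
    """
    days_elapsed = (today - subscribed_on).days
    if days_elapsed < 0:
        return []   # not subscribed yet
    return items[: (days_elapsed + 1) * per_day]

# Hypothetical static blog, already in reading order
static_blog = ["Post 1", "Post 2", "Post 3", "Post 4", "Post 5"]

# A reader who subscribed two days ago has received three posts
print(items_due(static_blog, date(2008, 11, 24), date(2008, 11, 26)))
```

Two readers subscribing a week apart see the same sequence, just offset by a week, which is the pacing effect I’m after.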

For example, if you define a static feed – maybe one that replays a blog conversation – then maybe this serves as an artefact that can be reused by other people down the line, and maybe you can post in your own blog posts in “relative time”. I have lots of half formed ideas about a platform that could support this, e.g. on WPMU, but it requires a reengineering (I think), or at least a reimagining, of the whole trackback and commenting engine (you essentially have to implement a notion of sequence rather than time…).

(To see some related examples of ‘daily feeds’, see this ‘daily feeds’ bookmark list.)

So to sum up what has turned out to be far too long a post? Maybe we need to take some cues from this:

and learn to give up some of the control we strive for, and allow our “students” to participate a little more creatively?

See also: Learning Outcomes – again. It strikes me that predefining the contents of the book is like an overkill example of predefining learning outcomes written to suit the needs of a “course author”, rather than the students…?

Approxi-mapping Mash-ups, with a Google MyMaps Tidy Up to Follow

What do you do when you scrape a data set, geocode it so you can plot it on a map, and find that the geocoding isn’t quite as good as you’d hoped?

I’d promised myself that I wasn’t going to keep on posting “yet another way of scraping data into Google spreadsheets then geocoding it with a Yahoo pipe” posts along the lines of Data Scraping Wikipedia with Google Spreadsheets, but a post on Google Maps mania – Water Quality Google Map – sent me off on a train of thought that has sort of paid dividends…

So first up, the post got me thinking about whether there are maps of Blue Flag beaches in the UK, and where I could find them. A link on the UK page of blueflag.org lists them: UK Blue Flag beaches (but there is a key in the URL, so I’m not sure how persistent that URL is).

Pull it into a Google spreadsheet using:

Publish the CSV:

Geocode the beaches using a Yahoo pipe – rather than using the Pipe location API, I’m making a call to the Yahoo GeoPlanet/Where API – I’ll post about that another day…

Grab the KML from the pipe:

Now looking at the map, it looks like some of the markers may be mislocated – like the ones that appear in the middle of the country, hundreds of miles from the coast. So what it might be handy to do is use the scraped data as a buggy, downloaded data set that needs cleaning. (This means that we are not going to treat the data as “live” data any more.)
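Once the data is being treated as a downloaded file, a few lines of code can do a first pass over it. Here’s a minimal sketch that reads placemarks out of a KML document and flags any that fall outside a rough bounding box. The sample KML, the beach names and the box limits are all my assumptions (and the pipe’s actual output may use a different KML namespace version); note too that a bounding box only catches gross errors, so markers that land inland but still within the UK need eyeballing on the map.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical two-marker sample standing in for the pipe's KML output
sample_kml = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark><name>Beach A</name><Point><coordinates>-1.30,50.58,0</coordinates></Point></Placemark>
    <Placemark><name>Beach B</name><Point><coordinates>-77.03,38.90,0</coordinates></Point></Placemark>
  </Document>
</kml>"""

def placemarks(kml_text):
    """Yield (name, lat, lon) per Placemark; KML stores coordinates as lon,lat[,alt]."""
    root = ET.fromstring(kml_text)
    for pm in root.iterfind(".//kml:Placemark", KML_NS):
        name = pm.findtext("kml:name", default="", namespaces=KML_NS)
        coords = pm.findtext(".//kml:coordinates", default="", namespaces=KML_NS)
        lon, lat = (float(v) for v in coords.strip().split(",")[:2])
        yield name, lat, lon

# Very rough UK bounding box (an assumption, not an authoritative boundary)
UK_BOX = {"lat": (49.8, 60.9), "lon": (-8.7, 1.8)}

def outside_box(lat, lon, box=UK_BOX):
    """True if the point lies outside the bounding box."""
    in_lat = box["lat"][0] <= lat <= box["lat"][1]
    in_lon = box["lon"][0] <= lon <= box["lon"][1]
    return not (in_lat and in_lon)

suspects = [name for name, lat, lon in placemarks(sample_kml) if outside_box(lat, lon)]
print(suspects)
```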

And here’s where the next step comes in… Google MyMaps lets you seed a map by importing a KML file:

The import can be from a desktop file, or a URL:

Import the KML from the Yahoo pipe, and we now have the data set in the Google MyMap.

So the data set in the map is now decoupled from the pipe, the spreadsheet and the original Blue Flag website. It exists as a geo data set within Google MyMaps. Which means that I can edit the markers, and relocate the ones that are in the wrong place:

And before the post-hegemonic tirade comes in (;-), here’s an attempt at capturing the source of the data on the Google MyMap.

So, to sum up – Google MyMaps can be used to import an approximately geo-coded data set, tidy it up and republish it.

PS don’t forget you can also use Google Maps (i.e. MyMaps) for geoblogging

Where is the Open University Homepage?

Several weeks ago, I was listening to one of the programmes delivered to me every week via my subscription to the IT Conversations podcast feed, when I came across this Technometria episode on Search Engine Marketing (if you have a daily commute, it’s well worth listening to on one of your trips this week…).

One of the comments that resonated quite strongly with me, in part because I’ve heard several people in the OU comms team asking several times over the last few months “what’s the point of the OU homepage?”, was that to all intents and purposes, Google is now the de facto homepage for many institutions.

That is, this is the OU homepage for many people:

rather than this:

(As far as I know, very little of our online marketing sends traffic to the homepage – most campaigns send traffic to a URL deeper in the site more relevant to the particular campaign).

Just in passing, a post on Google Blogoscoped today – What Do People Search For? – picked up on an item from Search Engine Land describing a new tool from Google: Search based Keyword Tool.

What this tool does is to “suggest keywords based on actual Google search queries” that are “matched to specific pages of your website”:

Hmmm…. (and yes, that Savings Interest Rates page is on an OU domain…)

PS this search based keyword tool is also in the ball park of Google Trends, Google Insights for Search, and Google Trends for websites, which I’ve been playing with a lot recently (e.g. Playing with Google Search Data Trends and Recession, What Recession?), as well as the Google Adwords keywords tool:

which looks a lot more reasonable than the Search based Keyword tool?!

PPS Again in passing, and something I intend to pick up on a little more in a later post, Yahoo have just opened up a Key Terms service as part of the BOSS platform that will let you see the keywords that Yahoo has used to index a particular web page (Key Terms provide “an ordered terminological representation of what a document is about. The ordering of terms is based on each term’s frequency and its positional and contextual heuristics.”).

Services like Reuters’ OpenCalais already allow you to do ‘semantic tagging’ of free text, and Yahoo’s Term Extraction service also extracts keywords from text. I’m not sure how the BOSS exposed keywords compare with the keywords identified by the Term Extraction service as applied to a particular web page?

If I get a chance to run some tests, I’ll let you know, unless anyone can provide more info in the meantime?

Will Lack of Relevancy be the Downfall of Google?

Every so often, posts come around about new search engines that are going to make a bid to become a Google search killer, but I wonder if the changing nature of the web itself will lead people to a search engine that appears to do search better in those bits of the web that they’re spending most time, and so lead them away from Google?

It’s hard thinking back the 10 years or so to a time before Google, so I’m not sure what prompted me to switch allegiance from Metacrawler to Google? Maybe it was that Google results were dominating the Metacrawler results page? (For those of you who have no idea what I’m talking about, Metacrawler (which lives on to this day, as… MetaCrawler;-) was essentially a federated search engine that pooled results from several early web search engines. Before Metacrawler, I used Webcrawler, which was one of the first search engines to do full text search, I think?)

In those early days, Google won out on producing “better” results in part because of its PageRank algorithm, in part because of its speedy response. PageRank essentially determines the authority of a page by the number of pages that link to it, and the authority of those pages. There’s lots of other voodoo magic in the ranking and relevancy algorithm now, of course, but that was at the heart of what made Google different in the early days.
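To make the “authority flows along links” idea a bit more concrete, here’s a toy sketch of the textbook power-iteration version of PageRank. It is not Google’s actual ranking code: the three-page link graph is invented, the damping factor of 0.85 is just the conventionally quoted value, and the handling of pages with no outgoing links (spreading their score evenly) is one common simplification.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank over a dict mapping each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal authority
    for _ in range(iterations):
        # Every page gets a small baseline score...
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # ...plus an equal share of the rank of each page linking to it.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for t in pages:
                    new[t] += share
        rank = new
    return rank


# Invented example: a and b both link to c, and c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"]}
for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In the toy graph, c comes out on top because two pages link to it, and a does well simply because the highly ranked c links to it, which is the “authority of those pages” part of the description.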

So Google came good in large part because it used the structure of the web to help people better navigate the web.

But what of the structure of the web now? Many of the recently launched search engines have made great play of being “social” or “people powered” search engines, that leverage personal recommendations to improve search results. The big search engines are experimenting with tools that let searchers “vote up” more relevant results, and so on (e.g. Google’s experiment, or Microsoft’s URank experiment).

But it might be that the nature of recommending a page to someone else is now less to do with publishing a link to another site on a web page or in a blog post, than sharing a link with someone in a more conversational way (though as to how you found that link in the first place – there lies a problem;-)

So although Google won’t be able to snoop on link sharing in “walled garden” social networks like Facebook, I wonder if they are tracking link sharing in services like Twitter? (Google owns rival microblogging site Jaiku, but since buying it, all has been quiet. Maybe they’re waiting for the masses to become conscious of the thing called Twitter, then they’ll go prime time with Jaiku?)

Just by the by, there’s also the “problem” that many shared links are now being obfuscated by URL shortening services, which means that TinyURLs, bit.ly URLs and is.gd URLs all need resolving back to the pages they point to in order to rank those pages. (Hmm…. so when will the Goog be pushing its own URL shortening service, I wonder?)

This link resolution is easy enough to achieve, though. For example, the Twitturly service tracks the most popular links being shared on Twitter over a 24 hour period (I think they also used to let you see who was tweeting about a particular URL, because I built a pipe around it – A Pipe for Twitturly – although that functionality appears to have disappeared?)
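By way of illustration, here’s a minimal sketch of how shortened links might be resolved and then tallied. I have no idea how Twitturly actually does it; this just leans on the fact that `urllib.request.urlopen` follows HTTP redirects and reports the final address via `geturl()`. The `resolve` and `most_shared` names, and the example URLs, are made up for the sketch.

```python
from collections import Counter
import urllib.request


def resolve(url, timeout=10):
    """Follow any redirects and return the URL we finally land on.

    Note: this makes a real network request.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.geturl()


def most_shared(urls, resolver=resolve):
    """Count shared links by the page they resolve to, most popular first."""
    return Counter(resolver(u) for u in urls).most_common()


# Offline demo: a hypothetical lookup table stands in for the network.
fake_redirects = {
    "http://tinyurl.example/a": "http://example.com/post",
    "http://bitly.example/b": "http://example.com/post",
    "http://example.com/other": "http://example.com/other",
}
print(most_shared(list(fake_redirects), resolver=fake_redirects.get))
```

Passing the resolver in as a parameter means the counting can be tried out without hitting the network; swap in the default `resolve` to follow real short URLs.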

PS Maybe that Jaiku launch moment will actually be on mobile devices – on the iPhone (which now has Google Voice search), and on Android devices? Maybe Jaiku’s relaunch (and remember, Jaiku was heavy on the mobile stuff) will be a defining moment that hails: “the era of the PC [i]s over,… the future belong[s] to cloud applications accessed via phones”, via Daddy, Where’s Your Phone? which also includes a lovely story to illustrate this: a child overhears her dad answering “I don’t know” to a question, and asks:

“Daddy, where’s your phone?”

“What do you mean, where’s my phone?” She explained that she’d overheard the question. Why wasn’t he just looking up the answer on his phone?

Cf. the apocryphal story of the child looking behind the TV set for a mouse, and the idea that “a screen without a mouse is broken”… ;-)