Playing With Google Search Data Trends

Early last week, Google announced a Google Flu trends service that leverages the huge number of searches on Google to provide a near real-time indicator of ‘flu outbreaks in the US. Official reports from medical centres and doctors can lag actual outbreaks by up to a couple of weeks, but by correlating search trend data with real medical data, the Google folks were able to show that their data led the official reports.
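One way to picture what “led the official reports” means is a lagged correlation: slide the search series against the report series and see which offset lines them up best. This is only a minimal sketch with made-up weekly numbers; the function names (`pearson`, `best_lead`) and the data are my own illustrations, not anything Google has described.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_lead(searches, reports, max_lead=4):
    """Return (lead, r): how many weeks `searches` best leads `reports`."""
    scores = {}
    for lead in range(max_lead + 1):
        # Pair searches in week t with reports in week t + lead.
        xs = searches[:len(searches) - lead]
        ys = reports[lead:]
        scores[lead] = pearson(xs, ys)
    lead = max(scores, key=scores.get)
    return lead, scores[lead]


# Made-up weekly counts: the reports are the searches delayed by two weeks.
searches = [1, 2, 5, 9, 6, 3, 2, 1, 1, 1]
reports = [1, 1, 1, 2, 5, 9, 6, 3, 2, 1]
lead, r = best_lead(searches, reports)
print(lead, round(r, 2))
```

With real data the peak would be far less clean, and a high correlation at some lead says nothing by itself about why the two series move together.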

John Naughton picked up on this service in his Networker Observer column this week, and responded to an email follow-up comment I sent him idly wondering what search terms might be indicators of recession in this post on Google as a predictor. “Jobseeker’s allowance” appears to be on the rise, unfortunately (as does “redundancy”).

For some time, I’ve been convinced that spotting clusters of related search terms, or meaningful correlations between clusters of search terms, is going to be the next big step towards, err, something(?!), and Google Flu trends is one of the first public appearances of this outside the search, search marketing and ad sales area.

Which is why, on the playful side, I tried to pitch something like Trendspotting to the Games With a Purpose (GWAP) folks (so far unreplied to!), the idea being that players would have to try to identify search terms whose trends were correlated in some “folk reasonable” way. Search terms like “flowers” and “valentine”, for example, which appear to be correlated according to the Google Trends service:

Just out of interest, can you guess what causes the second peak? Here’s one way of finding out – take a look at those search terms on the Google Insights for Search service (like Google Trends on steroids!):

Then narrow down the date over which we’re looking at the trend:

By inspection, it looks like the peak hits around May, so narrow the trend display to that period:

If you now scroll down the Google Insights for Search page, you can see what terms were “breaking out” (i.e. being searched for in volumes way out of the norm) over that period:

So it looks like a Mother’s Day holiday? If you want to check, the Mother’s Day breakout (and ranking in the top searches list) is even more evident if you narrow down the date range even further.
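The “breaking out” idea, a term being searched for in volumes way out of the norm, can be roughed out as a z-score test against each term’s own recent history. To be clear, this is my own toy definition with invented numbers and an arbitrary threshold; I don’t know how Google Insights for Search actually decides what counts as a breakout.

```python
from statistics import mean, stdev


def breakouts(volumes, threshold=3.0):
    """Return terms whose latest volume sits more than `threshold`
    standard deviations above the mean of their earlier volumes."""
    flagged = []
    for term, series in volumes.items():
        history, latest = series[:-1], series[-1]
        spread = stdev(history)
        if spread == 0:
            continue  # flat history: no meaningful "norm" to compare against
        z = (latest - mean(history)) / spread
        if z > threshold:
            flagged.append(term)
    return flagged


# Invented weekly search volumes; the last value is the week being checked.
weekly = {
    "mothers day": [10, 12, 11, 9, 10, 60],
    "flowers": [40, 42, 39, 41, 40, 41],
}
print(breakouts(weekly))
```

Narrowing the date range, as in the walkthrough above, amounts to changing which weeks count as “history”.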

Just by the by, what else can we find out? That the “Mother’s Day” holiday at the start of May is not internationally recognised, maybe?

There are several other places that are starting to collect trend data – not just search trend data – from arbitrary sources, such as Microsoft Research’s DataDepot (which I briefly described in Chasing Data – Are You Datablogging Yet?) and Trendrr.

The Microsoft service allegedly allows you to tweet data in, and the Trendrr service has a RESTful API for getting data in.

Although I’ve not seen it working yet (?!), the DataDepot looks like it tries to find correlations between data sets:

Next stop convolution of data, maybe?

So whither the future? In an explanatory blog post on the flu trends service – How we help track flu trends – the Googlers let slip that “[t]his is just the first launch in what we hope will be several public service applications of Google Trends in the future.”

It’ll be interesting to see what exactly those are going to be?

PS I’m so glad I did electronics as an undergrad degree. Discrete maths and graph theory drove web 2.0 social networking theory algorithms, and signal processing – not RDF – will drive web 3.0…

Google MyMaps Now With RSS (= Easy Geoblogging)

A tweet from Jim Groom earlier this week alerted me to a post he had just written entitled Google My Maps with RSS.

Now some time earlier this year, I’d hacked together a Yahoo Pipe that would generate an RSS feed from the KML feed output of a map, and so provide an ad hoc geoblogging environment from Google MyMaps (MyMaps GeoBlogger – Blogging From Google Maps).

But the availability of the RSS feed direct from the MyMap just makes this a whole lot easier…
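For the curious, the sort of thing the KML-to-RSS pipe was doing can be sketched in a few lines: walk the placemarks, and for each one emit an RSS item with a GeoRSS point. This is a rough reconstruction of the idea rather than the pipe itself; the KML namespace, the sample placemark and the `kml_to_georss` name are all assumptions of mine. The one real gotcha is that KML coordinates are written longitude first, while a `georss:point` is latitude then longitude.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: the KML 2.2 namespace and the GeoRSS Simple namespace.
KML_NS = "{http://www.opengis.net/kml/2.2}"
GEORSS_NS = "http://www.georss.org/georss"


def kml_to_georss(kml_text, feed_title="My map"):
    """Turn each KML Placemark that has a Point into an RSS item with a georss:point."""
    ET.register_namespace("georss", GEORSS_NS)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    for placemark in ET.fromstring(kml_text).iter(KML_NS + "Placemark"):
        coords = placemark.findtext(f"{KML_NS}Point/{KML_NS}coordinates")
        if coords is None:
            continue  # lines and polygons are skipped in this sketch
        lon, lat = coords.strip().split(",")[:2]  # KML order is lon,lat[,alt]
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = placemark.findtext(KML_NS + "name", default="")
        ET.SubElement(item, "description").text = placemark.findtext(
            KML_NS + "description", default="")
        # GeoRSS Simple points are "lat lon", separated by a space.
        ET.SubElement(item, f"{{{GEORSS_NS}}}point").text = f"{lat} {lon}"
    return ET.tostring(rss, encoding="unicode")


# A made-up placemark standing in for a map marker with a blog-post-like description.
sample = """<kml xmlns="http://www.opengis.net/kml/2.2"><Document>
<Placemark><name>Milton Keynes</name><description>A post</description>
<Point><coordinates>-0.76,52.04,0</coordinates></Point></Placemark>
</Document></kml>"""
feed = kml_to_georss(sample)
print(feed)
```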

And yes, the RSS output is geocoded (that is, the feed is GeoRSS):

PS Google also just changed the Terms of Service on Google Maps. As with all rights issues, I’m not totally sure I understand what the actual consequences are… For a discussion, see Ed Parsons’ Who reads the Terms of Service anyway...

PPS for easy maps data mashups, check out GeoCommons.com. The CogDog gives an eduview here: Geocommons Makes it Easy for Anyone to Mashup Data & Maps. You might also find this technique for geocoding data from a Google spreadsheet useful…

Innovation in Online Higher Education

In an article in the Guardian a couple of days ago – UK universities should take online lead, it was reported that “UK universities should push to become world leaders in online higher education”, with universities secretary, John Denham, “likely to call” for the development of a “global Open University in the UK”. (Can you imagine how well that call went down here?;-)

Anyway, the article gave me a heads-up about the imminent publication of a set of reports to feed into a Debate on the Future of Higher Education being run out of the Department for Innovation, Universities and Skills.

The reports cover

The “World leader in elearning” report, (properly titled “On-line Innovation in Higher Education“), by Professor Sir Ron Cooke is the only one I’ve had a chance to skim through so far, so here are some of the highlights from it for me…

HE and the research funding bodies should continue to support and promote a world class ICT infrastructure and do more to encourage the innovative exploitation of this infrastructure through … a new approach to virtual education based on a corpus of open learning content

Agreed – but just making more content available under an open license won’t necessarily mean that anyone will use this stuff… free content works when there’s an ecosystem around it capable of consuming that content, which means confusion about rights, personal attitudes towards reuse of third party material, and a way of delivering and consuming that material all need to be worked on.

The OERs “[need] to be supported by national centres of excellence to provide quality control, essential updating, skills training, and research and development in educational technology, e-pedagogy and educational psychology”.

“National Centres of Excellence”? Hmmm… I’d rather that networked communities had a chance of taking this role on. Another centre of excellence is another place to not read the reports from… Distributed (or Disaggregated) Centres of Excellence I could maybe live with… The distributed/disaggregated model is where the quality – and resilience – comes in. The noise the distributed centre would have to cope with because it is distributed, and because its “nodes” are subject to different local constraints, means that the good will out. Another centralised enclave (black hole, money sink, dev/null) is just another silo…

“[R]evitalised investment into e-infrastructures” – JISC wants more money…

[D]evelopment of institutional information strategies: HEIs should be encouraged and supported to develop integrated information strategies against their individual missions, which should include a more visionary and innovative use of ICT in management and administration

I think there’s a lot of valuable data locked up in HEIs, and not just research data; data about achievement, intent and successful learning pathways, for example. Google has just announced a service where it can track flu trends, which is “just the first launch in what we hope will be several public service applications of Google Trends in the future”. Google extracts value from search data and delivers services built on mining that data. So in a related vein, I’ve been thinking for a bit now about how HEIs should be helping alumni extract ongoing value from their relationship with their university, rather than just giving them 3 years of content, then tapping them every so often with a request to “donate us a fiver, guv?” or “remember us? We made you who you are… So don’t forget us in your will”. (I once had a chat with some university fundraisers who try to pull in bequests… vultures, all of ’em ;-)

“It is however essential that central expenditure on ICT infrastructure (both at the national level through JISC and within institutions in the form of ICT services and libraries) are maintained.” – JISC needs more cash. etc etc. I won’t mention any more of these – needless to say, similar statements appear every page or two… ;-)

“The education and research sectors are not short of strategies but a visionary thrust across the UK is lacking” – that’s because people like to do their own thing, in their own place, in their own way. And retain “ownership” of their ideas. And they aren’t lazy enough…;-) I’d like to see people trying to mash-up and lash-up the projects that are already out there…

the library as an institutional strategic player is often overlooked because the changes and new capabilities in library services over the past 15 years are not sufficiently recognised

Academic Teaching Library 2.0 = Teaching University 2.0 – discuss… The librarians need to get over their hang-ups about information (the networked, free text search environment is different – get over it, move on, and make the most of it…;-) and the academics need to get their heads round the fact that the content that was hard to access even 20 years ago is now googleable; academics are no longer the only gateways to esoteric academic content – get over it, move on, and make the most of it…;-)

Growth in UK HE can come from professional development, adult learning etc. but might be critically dependent on providing attractive educational offerings to this international market.

A different model would be to encourage some HEIs to make virtual education offerings aimed at the largely untapped market of national and overseas students who cannot find (or do not feel comfortable finding) places in traditional universities. This approach can exploit open educational resources but it would be naïve to expect all HEIs to contribute open education resources if only a few exploit the potential offered. All HEIs should be enabled to provide virtual education but a few exemplar universities should be encouraged (the OU is an obvious candidate).

Because growth in business is good, right? (err….) and HE is a business, right? (err….) And is that a recommendation that the OU become a global online education provider?

A step change is required. To exploit ICT it follows that UK HEIs must be flexible, innovative and imaginative.

Flexible… innovative… imaginative…

ICT has greatly increased and simplified access by students to learning materials on the Internet. Where, as is nearly universal in HE, this is coupled with a Virtual Learning Environment to manage the learning process and to provide access to quality materials there has been significant advances in distance and flexible learning.

But there is reason to believe this ready access to content is not matched by training in the traditional skills of finding and using information and in “learning how to learn” in a technology, information and network-rich world. This is reducing the level of scholarship (e.g. the increase in plagiarism, and lack of critical judgement in assessing the quality of online material). The Google and Facebook generation are at ease with the Internet and the world wide web, but they do not use it well: they search shallowly and are easily content with their “finds”. It is also the case that many staff are not well skilled in using the Internet, are pushed beyond their comfort zones and do not fully exploit the potential of Virtual Learning Environments; and they are often not able to impart new skills to students.

The use of Web 2.0 technologies is greatly improving the student learning experience and many HEIs are enhancing their teaching practices as a result. A large majority of young people use online tools and environments to support social interaction and their own learning represents an important context for thinking about new models of delivery.

It’s all very well talking about networked learners, but how does the traditional teacher and mode of delivery and assessment fit into that world? I’m starting to think the educator role might well be fulfilled by the educator as “go to person” for a topic, but what we’re trying to achieve with assessment still confuses the hell out of me…

Open learning content has already proved popular…

A greater focus is needed on understanding how such content can be effectively used. Necessary academic skills and the associated online tutoring and support skills need to be fostered in exploiting open learning content to add value to the higher education experience. It is taken for granted in the research process that one builds on the work of others; the same culture can usefully be encouraged in creating learning materials.

Maybe if the materials were co-created, they would be more use? We’re already starting to see people reusing slides from presentations that people they know and converse with (either actively, by chatting, or passively, by ‘just’ following) have posted to Slideshare. It’d be interesting to know just how the rate of content reuse on Slideshare compares with the rate of reuse in the many learning object repositories? Or how image reuse from flickr compares with reuse from learning object repositories? Or how video reuse from Youtube compares with reuse from learning object repositories? Or how resource reuse from tweeting a link or sharing a bookmark compares with reuse from learning object repositories?

…”further research”… yawn… (and b******s;-) More playing with, certainly ;-) Question: do you need a “research question” if you or your students have an itch you can scratch…? We need a more playful attitude, not more research… What was that catchphrase again? “Flexible… innovative… imaginative…”

A comprehensive national resource of freely available open learning content should be established to provide an “infrastructure” for broadly based virtual education provision across the community. This needs to be curated and organised, based on common standards, to ensure coherence, comprehensive coverage and high quality.

Yay – another repository… lots of standards… maybe a bit of SOAP? Sigh…

There is also growing pressure for student data transfer between institutions across the whole educational system, requiring compliance with data specifications and the need for interoperable business systems.

HEIs should consider how to exploit strategically the world class ICT infrastructure they enjoy, particularly by taking an holistic approach to information management and considering how to use ICT more effectively in the management of their institution and in outreach and employer engagement activities.

There’s a huge amount of work that needs doing there, and there may even be some interesting business opportunities. But I’m not allowed to talk about that…

ICT is also an important component in an institution’s outreach and business and community engagement activities. This is not appreciated by many HEIs. Small and medium enterprise (SME) managers need good ICT resources to help them deliver their learning needs. Online resources and e-learning are massively beneficial to work based learning. Too little is being done to exploit ICT in HE in this area although progress is being made.

I’ve started trying to argue – based on some of the traffic coming into my email inbox – that OUseful.info actually serves a useful purpose in IT skills development in the “IT consultancy” sector. OUseful.info is often a bit of a hard read at times, but I’m not necessarily trying to show SMEs how to solve their problems – this blog is my notebook, right? – though at times I do try to reach the people who go into SMEs, and hopefully give them a few ideas that they can make (re)use of in particular business contexts.

Okay – that was a bit longer and a bit more rambling than I’d anticipated… if you want to read the report, it’s at On-line Innovation in Higher Education. There’s also a discussion blog available at The future of Higher Education: On-Line Higher Education Learning.

Just by the by, here are a couple more reports I haven’t linked to before on related matters:

It’s just a shame there’s no time to read any of this stuff ;-) Far easier to participate in the debate in a conversational way, either by commenting on, or tracking back to, The future of Higher Education: On-Line Higher Education Learning.

PS here’s another report, just in… Macarthur Study: “Living and Learning with New Media: Summary of Findings from the Digital Youth Project”

iPhone 7 Day OU Programme CatchUp, via BBC iPlayer

Somewhen last week, I posted about a Recent OU Programmes on the BBC, via iPlayer hack that uses an Open2 twitter feed to identify recently broadcast OU programmes on the BBC, to create a feed of links to watchable versions of those programmes via BBC iPlayer.

So yesterday I had a little play and put an iPhone/iPod Touch web front end onto the pipe.

Here’s the front page (captured using an old version of iPhoney) – I’ve given myself the option of adding more than just the seven day catchup service…

The 7 day Catchup Link takes you through to a listing of the programmes that should, according to the BBC search results, link to a watchable version of the programme on iPlayer (but sometimes don’t?).

Clicking on the programme link takes you to the programme description – and a link to the programme on mobile iPlayer itself:

Clicking through the programme link takes you to the appropriate iPlayer page – where you can (hopefully) watch the programme… :-)

As is the way of these things, I gave myself half an hour to do the app, expecting it to take maybe 90 mins or so. The interface uses the iUI library, which I used previously to build iTwitterous/serendiptwitterous (various bits of which broke ages ago when Twitter switched off the friends RSS feeds, and which I haven’t tried to work around :-( ), so all I expected to do was hack around that…

…which was okay, but then the final link out to the iPlayer site didn’t work… Hmmm… now the URLs to the iPlayer mobile programme pages look like http://www.bbc.co.uk/mobile/iplayer/index.html#episode/b00fj0y4, and the way that the iUI pages work is to display various parts of a single HTML page using anchor/name tags of the form http://ouseful.open.ac.uk/i/ioutv.php#_proglist. So my guess was that the interface library was doing something different to normal whenever it saw a # (which I later refined to the assumption that it was intercepting the onclick event whenever that sort of link was clicked on).

My first thought at a fix was to just add another bit of pipework that would create a TinyURL to the mobile link (and so hide the # from iUI). I found an is.gd pipe and cloned it, but it didn’t work… it looked like is.gd had actually followed the link, got an error page back (“we don’t support that mobile device”) and shortened the iPlayer error page URL. V early hours of the morning now, so I wasn’t tempted to build a TinyURL shortener pipe and went to bed…

Next morning, and in the OU, Pipes wasn’t working for me very well over the guest network… so I thought I’d set up an Apache RewriteRule that would take a BBC programme ID and generate the mobile iPlayer URL. Nope – the # got encoded and the link didn’t work (I used something like RewriteRule ^ipm/(.*) http://www.bbc.co.uk/mobile/iplayer/index.html#episode/$1, but couldn’t get %23 rewritten as #??? Any ideas???)
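Why an encoded # breaks the link is easy to show outside Apache. The bit after a literal # is the URL fragment, which the browser keeps to itself; once the # is percent-encoded as %23 it is just more path, so the iPlayer page never sees an `#episode/...` fragment. A small sketch using Python’s standard `urllib.parse` (the `mobile_iplayer_url` helper is a name I’ve made up; the URL pattern is the one quoted earlier in the post):

```python
from urllib.parse import quote, urlsplit

MOBILE_BASE = "http://www.bbc.co.uk/mobile/iplayer/index.html"


def mobile_iplayer_url(programme_id):
    """Build the mobile iPlayer URL, keeping '#' as a literal fragment separator."""
    return f"{MOBILE_BASE}#episode/{quote(programme_id, safe='')}"


good = mobile_iplayer_url("b00fj0y4")
# What an over-eager encoder produces: the '#' itself gets escaped to %23.
broken = MOBILE_BASE + quote("#episode/b00fj0y4", safe="/")

print(urlsplit(good).fragment)    # the fragment survives
print(urlsplit(broken).fragment)  # no fragment left at all
print(urlsplit(broken).path)      # the %23... has been absorbed into the path
```

On the Apache side, mod_rewrite documents an NE (“noescape”) flag that is meant to stop exactly this escaping of the substitution; I haven’t tried it against the rule above, so treat that as a pointer rather than a tested fix.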

Next thought – a PHP header redirect – didn’t work… a PHP page that returns some Javascript to reset the page location? Nope… (I later realised I was using the wrong mobile iPlayer URL pattern – I’d transposed mobile and iplayer, but I don’t think that was the only problem ;-)

A short walk to a meeting on ********************* (super secret censored project – I even used an arbitrary number of *’s there; and can’t even tell you who was at the meeting) gave me the crib – use javascript to reset the location in the link (<a href="javascript:window.location.href='http://www.bbc.co.uk/mobile/iplayer/index.html#episode/b00fj0y4'">).

Still no…. hmmm, maybe I need to add that to the onclick too? Success!:-)

So there we have it, multiple failure and blind hackery, little or no understanding of what’s not working or why, but always the option to try to find another way of doing it; not pretty, not clever, but not beholden to a particular way of doing it. Come across a problem, and route around it… just do it the internet way;-)

OU Programme 7 day catchup, iPlayer’n’iPhone app. Seen anything interesting lately?;-)

PS see also OpenLearn ebooks, for free, (and readable on iPhone) courtesy of OpenLearn RSS and Feedbooks…

[18/11/08 – the site that the app runs on is down at the moment, as a network security update is carried out; sorry about that – maybe I should use a cloud server?]

Can SocialLearn Be Built As Such? Plus an OU Jobs RoundUp

A tweet from Scott Leslie on Saturday alerted me to the fact he had a major post brewing…

And here it is: Planning to Share versus Just Sharing.

Do yourself a favour and go and read it now… Then come back and finish reading this post… or not… but read that one…

Here’s the link again: Planning to Share versus Just Sharing.

‘Nuff said? Here’s one thing it made me think of: Planning to Build versus Just Building.

Speaking of which, I wonder if we have any more SocialLearn planning meetings this week? ;-)

On another tack, it looks like the OU’s recruiting to some interesting posts again:

  • Director of Research and Enterprise, Research School, Strategy Unit: “The Open University plans to increase the range and volume of research of international quality and to expand its knowledge transfer activity at national and regional levels. We need an experienced, proactive and forward looking Director of Research and Enterprise who can help us achieve these ambitions.” I’d personally argue blogs like OUseful.info are in the KT business – if you get the post, feel free to buy me a coffee and vehemently disagree;-)
  • Online Marketing Manager, Marketing and Sales: “In this role, you will contribute to the new media strategy, setting strategies to achieve the online objectives to achieve student targets. You will manage the implementation and evaluation of PPC, affiliate programmes and third party partnerships and manage the development of existing and future marketing websites.” = you will spend lots of money with Google. Just beware Simpson’s Paradox
  • Development Advisor – Collaborative Tools, Learning & Teaching Solutions (LTS): “Collaborative tools are a key part of the online learning experience of Open University students. You will play a key role in both promoting the effective use of collaborative tools in new OU courses and the introduction of new collaborative tools across existing courses.” IMHO, don’t even think about mentioning Second Life, unless it’s to advocate the use of flamethrowers ;-)
  • Programmer/ Web developer, The Library and Learning Resource Centre: “Would you like to contribute in a key role in the development of the Open University’s Library systems, services and products to support all its business processes for both customers and Library staff? You will be providing technical input to projects and service developments, in particular maintaining and developing new services for the Library website.” Far be it from me to say that any Library website redesign should be informed by at least a passing familiarity with what the Library website analytics have to say about how the site is used… And if you persuade them to dump Voyager, I’ll buy you a pint of whatever you want…
  • Broadcast Project Manager, Open Broadcasting Unit (OBU): “we need an additional Broadcast Project Manager to work with OU colleagues, the BBC and others to develop and manage detailed project plans for TV, radio and broadband commissions and associated support elements (e.g. print items). You’ll have your own group of projects and opportunities to contribute to process developments.” Tell ’em you watch OU programmes via the “OU Catchup Channel” on MythTV – the panel won’t have a clue what you’re talking about, so you could maybe follow up by suggesting a quick project that would produce a Wii front end for the OU CatchUp Channel;-) (Hint condition: steal the BBC iPlayer Wii interface and ask Guy to make ice from it ;-)
  • e-Learning Developer, Learning and Teaching Solutions: “We are looking for an experienced e-learning developer with a web/software background. Working as part of a project team and in close collaboration with academics and other media specialists, you will play a key role in developing effective OU distance learning materials for delivery online or via disc.”
  • Research Fellow – SocialLearn, Knowledge Media Institute (KMi): “your responsibility will be to use your understanding of learning and sensemaking online to improve the SocialLearn platform.” I have no idea what this post is about? Maybe trying to think about ways we can mine the platform for data. I can offer you the 5k user records we have on Course Profiles to get started with, and suggestions about how to scale that app in terms of numbers and the data it can collect, but to date no one else seems to think this is in any way relevant to the data/insight that SocialLearn will collect, so maybe that’s just a red herring…;-)
  • Web Developer – cohere.open.ac.uk, Knowledge Media Institute (KMi): a Cohere hacking post. IMHO, Cohere isn’t yet what it may turn out to be useful as…. (My attempts at grokking a simpler, more literal version of it, are Linktracks? Trackmarks? Linkmarks? and Doublemarks!)
  • Publicity and Evaluation Officer, Personalised Integrated Learning Support (PILS), Centre for Excellence in Teaching and Learning: “in this new role we are looking for an experienced secretary to support one of our PILS managers and our Publicity and Evaluation Manager. You will be required to use your IT, written communication and numeric skills to support the production of publicity and evaluation materials, and to update our websites.” Personally, I’d look to appoint an evangelist to the Open CETL, but I suppose we still have to service the old-fashioned markets (that aren’t so amenable to social network leverage) somehow?;-)

As ever, I have nothing to do with any of the above…

Orange Broadband ISP Hijacks Error Pages

An article in the FT last week (referenced here: British ISP Orange Shuns Phorm) described how ISP Orange have decided not to go with the behavioural advertising service Phorm, which profiles users’ internet activity in order to serve them with relevant ads.

But one thing I have noticed them doing over the last week is hijacking certain “domain not found” pages:

Orange broadband intercepting (some) page not found pages...

…which means that Orange must be looking at my HTML page headers for certain error codes?

Now I wonder if anyone from Orange Customer services or the Orange Press Office would like to comment on whether this is reasonable or not, and/or how they are doing it?
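One thing worth checking before blaming page headers: a genuinely non-existent domain has no web server to send any headers back, so the substitution may well be happening at the DNS lookup stage instead. That is an assumption on my part, not something I know about Orange’s setup. A quick way to probe it is to look up a random name that should never resolve and see whether an answer comes back anyway. The helper names and the stand-in resolvers below are made up so the sketch runs without touching the network; calling `looks_hijacked()` with no argument does a real lookup.

```python
import socket
import uuid


def resolves(hostname, resolver=socket.gethostbyname):
    """Return the IP address `hostname` resolves to, or None if the lookup fails."""
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def looks_hijacked(resolver=socket.gethostbyname):
    """A random 32-character .com name is almost certainly unregistered,
    so if it resolves, something is answering on its behalf."""
    bogus = f"{uuid.uuid4().hex}.com"
    return resolves(bogus, resolver) is not None


# Stand-in resolvers for an offline demonstration.
def honest_resolver(name):
    raise socket.gaierror("name not known")


def hijacking_resolver(name):
    return "203.0.113.10"  # documentation-range address, purely illustrative


print(looks_hijacked(honest_resolver))
print(looks_hijacked(hijacking_resolver))
```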

Just by the by, I found the Orange Customer Services web page interesting – not just the links to all the premium rate phone lines, more the font size;-) (click through for the full size image):

//www.orange.co.uk/contact/internet/default.htm?&article=contactussplitterwanadoo - check out the font size

I’ve also noticed what appear to be a few geo-targeted ads coming at me through my browser, so wonder if Orange is revealing my approximate location data to online ad targeting services (I’ll try to remember to grab a screenshot next time I see one). The reason I suspect it’s Orange is because I ran a test using a cookie blocking browser…

PS note to self: try to find out how ad services like NebuAd, Tacoda and of course Phorm make use of user data, and see just how far their reach goes…

PPS Hmmm… so just like there is a “junk mail opt out“, “unaddressed mail opt out” and “junk phone call opt out” in the UK, it seems like there is a (cookie based….?!) initiative for opting out of online ad targeting from the Network Advertising Initiative. Does anyone know anything about this? Is it legitimate, or a gateway to yet more unwanted ads? I’d maybe trust it more if it was linked to from mydm, which I trust because it was linked to from the Royal Mail…

Recent OU Programmes on the BBC, via iPlayer

As @liamgh will tell you, Coast is getting quite a few airings at the moment on various BBC channels. And how does @liamgh know this? Because he’s following the open2 openuniversity twitter feed, which sends out alerts when an OU programme is about to be aired on a broadcast BBC channel.

(As well as the feed from the open2 twitter account, you can also find out what’s on from the OU/BBC schedule feed (http://open2.net/feeds/rss_schedule.xml), via the Open2.net schedule page; iCal feeds appear not to be available…)

So to make it easier for him to catch up on any episodes he missed, here’s a quick hack that mines the open2 twitter feed to create a “7 day catch up” site for broadcast OU TV programmes (the page also links through to several video playlists from the OU’s Youtube site).

The page actually displays links to programmes that are currently viewable on BBC iPlayer (either via a desktop web browser, or via a mobile browser – which means you can view this stuff on your iPhone ;-), and a short description of the programme, as pulled from the programme episode’s web page on the BBC website. You’ll note that the original twitter feed just mentions the programme title; the TinyURLd link goes back to the series web page on the Open2 website.

Thinking about it, I could probably have done the hackery required to get iPlayer URLs from within the page; but I didn’t… Given the clue that the page is put together using a JQuery script I stole from this post on Parsing Yahoo Pipes JSON Feeds with jQuery, you can maybe guess where the glue logic for this site lives?;-)

There are three pipes involved in the hackery – the JSON that is pulled into the page comes from this OU Recent programmes (via BBC iPlayer) pipe.

The first part grabs the feed, identifies the programme title, and then searches for that programme on the BBC iPlayer site.

The nested BBC Search Results scrape pipe searches the BBC programmes site and filters results that point to an actual iPlayer page (so we can watch the result on iPlayer).

Back in the main pipe, we take the list of recently tweeted OU programmes that are available on iPlayer, grab the programme ID (which is used as a key in all manner of BBC URLs :-), and then call another nested pipe that gets the programme description from the actual programme web page.

This second nested pipe just gets the programme description, creates a title and builds the iPlayer URL:

(The logic is all a bit hacked – and could be tidied up – but I was playing through my fingertips and didn’t feel like ‘rearchitecting’ the system once I knew what I wanted it to do… which is what it does do…;-)

As an afterthought, the items in the main pipe are annotated with a link to the mobile iPlayer version of each programme:

So there you have it: a “7 day catch up” site for broadcast OU TV programmes, with replay via iPlayer or mobile iPlayer.
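If pipes aren’t your thing, the same flow (title in, search results filtered down to iPlayer pages, programme ID pulled out, mobile link built) can be roughed out in code. This is a sketch of the shape of the logic only: the search function is a stand-in fed with made-up results, the URL patterns other than the mobile one are my assumptions, the eight-character programme ID pattern is a guess from the one example ID, and the step that fetches the programme description is left out.

```python
import re

# Assumed: programme IDs are 8 lowercase alphanumerics following /episode/ or /programmes/.
PID_PATTERN = re.compile(r"/(?:episode|programmes)/([a-z0-9]{8})\b")
MOBILE_URL = "http://www.bbc.co.uk/mobile/iplayer/index.html#episode/{pid}"


def catch_up(titles, search):
    """For each tweeted title, keep the first search hit that is an iPlayer page."""
    items = []
    for title in titles:
        for url in search(title):
            match = PID_PATTERN.search(url)
            if "/iplayer/" in url and match:
                pid = match.group(1)  # the ID is the key into the other BBC URLs
                items.append({
                    "title": title,
                    "pid": pid,
                    "iplayer": url,
                    "mobile": MOBILE_URL.format(pid=pid),
                })
                break  # one watchable link per programme is enough
    return items


# Stand-in for the BBC search scrape: hypothetical results keyed by title.
FAKE_RESULTS = {
    "Coast": [
        "http://www.bbc.co.uk/programmes/b00fj0y4",
        "http://www.bbc.co.uk/iplayer/episode/b00fj0y4",
    ],
    "Not On iPlayer": ["http://www.bbc.co.uk/programmes/b0000000"],
}
items = catch_up(["Coast", "Not On iPlayer"], lambda t: FAKE_RESULTS.get(t, []))
print(items)
```

Passing the search in as a function mirrors the nested pipe: the outer logic doesn’t care how the results are found, only whether any of them point at iPlayer.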

[18/11/08 – the site that the app runs on is down at the moment, as a network security update is carried out; sorry about that – maybe I should use a cloud server?]