FutureLearn Data Doodles Notebook and a Reflection on unLearning Analytics

With LAK16 (this year’s Learning Analytics and Knowledge Conference) upon us, not that I’m there, I thought I’d post an (updated) notebook showing some of my latest data sketches’n’doodles around the data made available to FutureLearn partners arising from courses they have presented on FutureLearn. You can find the notebook in the notebooks folder of this Github repository: psychemedia/futurelearnStatsSketches.

Recalling the takedown notice I received for posting “unauthorised” screenshots around some of the data from a FutureLearn course I’d worked on, the notebook doesn’t actually demonstrate the analysis of any data at all. Instead, it provides a set of recipes that can be applied to your own FutureLearn course data to try to help you make sense of it.

In contrast to many learning analytics approaches, where the focus is on building models of learners so that you can adaptively force them to do particular things to make your metrics look better, my interest is simply in checking whether the course materials themselves appear to be working.

My thinking hasn’t really moved on that much since my original take on course analytics in 2007, or the presentation I gave in 2008 (Course Analytics in Context presentation), and it can (still) be summarised pretty much as follows:

Insofar as we are producers of online courses for delivery “at scale” (that is, to large numbers of learners), our first duty is to ensure that the course materials appear to be working. That is, we should regard the online materials in the same way as the publisher of any content focussed website, as pages that can be optimised in terms of their own performance against any expectations we place on them.

So, if a page has links on it, we should keep track of whether folk click on the links in the volumes we expect. If we expect a person to spend a certain amount of time on a page, we should be concerned if, en masse, they appear to be spending a much shorter, or longer, period of time on the page. In short, we should be catering to the mass behaviour of the visitors, to try to ensure that the page appears to be delivering (albeit at a surface level) the sort of experience we expect of it for the majority of visitors. (Unless the page has been designed to target a very particular audience, in which case we need to segment our visitor stats to ensure that for that particular audience, the page meets our expectations of it in terms of crude user engagement metrics.) This is not about dynamically trying to manage the flow of users through the course materials, it’s about making sure the static site content is behaving as we expect it to.
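To give a flavour of what that sort of check might look like in pandas, here’s a minimal sketch: given a table of step visits with a time-on-step column, flag steps where the typical visitor spends much more or much less time than we budgeted. The data, column names and expected-durations mapping are all invented placeholders, not the actual FutureLearn schema.

```python
# A minimal sketch, assuming a "step activity" table with one row per visit
# recording how long (in minutes) a learner spent on each step.
import pandas as pd

activity = pd.DataFrame({
    "step": ["1.1", "1.1", "1.2", "1.2", "1.3", "1.3"],
    "minutes_on_step": [9.0, 11.0, 3.0, 4.0, 22.0, 18.0],
})

# Our design-time expectation of how long each step should take (hypothetical).
expected = pd.Series({"1.1": 10, "1.2": 10, "1.3": 20}, name="expected_minutes")

# Use the median as a robust "typical visitor" measure.
observed = activity.groupby("step")["minutes_on_step"].median()
report = pd.concat([observed, expected], axis=1)
report["ratio"] = report["minutes_on_step"] / report["expected_minutes"]

# Flag steps where the typical visitor spends less than half, or more than
# double, the time we budgeted for the step.
flagged = report[(report["ratio"] < 0.5) | (report["ratio"] > 2)]
print(flagged)
```

The thresholds are arbitrary; the point is simply that the check is a static, site-wide sanity test, not a per-learner model.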

This is possibly naive, and could be seen as showing a certain level of disinterest in users’ individual learning behaviours, but I think it reflects how we tend to write static materials. In the case of the OU, this tends to be with a very strong, single narrative line, almost as if the materials were presented as a set of short books. I suspect that writing material that is intended to be dynamically served up in response to an algorithm’s perceived model of the user needs to be authored differently, using chunks that can be connected in ways that allow for multiple narrative pathways through them.

In certain respects, this is a complementary approach to learning design, where educators are encouraged, in advance of writing a course, to identify various sorts of structural activity that I suspect LD advocates would then like to see being used as the template for an automated course production process; templated steps conforming to the structural design elements could then be dropped into the academic workflow for authors to fill out. (At the same time, my experience of authoring workflows for online material delivery is that they generally suck, despite my best efforts…. See also: here.)


The notebook is presented as a Jupyter notebook with code written using Python3. It requires pandas and seaborn but no other special libraries and should work on a variety of notebook hosting workbenches (see for example Seven Ways of Running IPython / Jupyter Notebooks). I’ve also tested it against the psychemedia/ou-tm351-pystack container image on Github, which is a clone of the Jupyter set-up we’re using in the current presentation of the OU course TM351 Data management and analysis. My original FutureLearn data analysis notebook only used techniques developed in the FutureLearn course Learn to Code for Data Analysis, but the current one goes a little bit further than that…

The notebook includes recipes that analyse all four FutureLearn data file types, both individually and in combination with each other. It also demonstrates a few interactive widgets. Aside from a few settings (identifying the location and name of the data files), and providing some key information such as course enrolment opening date and start date, and any distinguished steps (specific social activity or exercise steps, for example, that you may want to highlight), the analyses should all run themselves. (At least, they ran okay on the dataset I had access to. If you run them and get errors, please send me a copy of any error messages and/or fixes you come up with.) All the code is provided though, so if you want to edit or otherwise play with any of it, you are free to do so. The code is provided without warranty and may not actually do what I claim for it (though if you find any such examples, please let me know).
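To illustrate the sort of settings cell described above, here’s a hypothetical sketch: file locations, key course dates and distinguished steps are declared up front, then the data files are loaded with pandas. The column names follow the FutureLearn CSV exports as I remember them, and the inline CSV string stands in for a real enrolments file, so check everything against your own data.

```python
# Hypothetical configuration cell, followed by a stand-in data load.
import io
import pandas as pd

COURSE_START = pd.Timestamp("2016-06-06")
ENROLMENT_OPEN = pd.Timestamp("2016-03-01")
DISTINGUISHED_STEPS = ["1.4", "2.6"]  # e.g. key social or exercise steps

# Stand-in for pd.read_csv("enrolments.csv") against a real data file;
# the columns here are assumptions, not a guaranteed schema.
enrolments_csv = io.StringIO(
    "learner_id,enrolled_at,unenrolled_at\n"
    "a1,2016-05-30 10:00:00,\n"
    "a2,2016-06-07 09:30:00,\n"
)
enrolments = pd.read_csv(enrolments_csv,
                         parse_dates=["enrolled_at", "unenrolled_at"])

# A simple derived measure: how many whole days before (negative) or after
# the course start date did each learner enrol?
enrolments["days_from_start"] = (enrolments["enrolled_at"] - COURSE_START).dt.days
print(enrolments[["learner_id", "days_from_start"]])
```

Pointing the `read_csv` call at your own file paths instead of the `StringIO` stand-in is the only change needed to run this against real data, assuming the column names match.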

The notebook and code are licensed as attribution-required works. I thought about an additional clause expressly forbidding commercial use or financial gain from the content by FutureLearn Ltd, but on reflection I thought that might get me into more grief than it was worth!;-) (It could also come over as perhaps a bit arrogant, and I’m not sure the notebooks have anything that novel or interesting in them; they’re more of a travel diary that records my initial meanderings around the data, as well as a few sketches and doodles as I tried to work out how to wrangle the data.)

As I was about to post the notebooks, I happened to come across a report of a recent investigation into the financial flows in academic publishing (It’s time to stand up to greedy academic publishers) which raises questions about “the issue of how research is communicated in society … that cut to the heart of what academics do, and what academia is about”; this resonated with a couple of recent quotes from Downes that made me smile, an off-the-cuff remark from Martin Weller last week mooting whether there was – or wasn’t – a book around the idea of guerrilla research as compared to digital scholarship, and the observation that I wasn’t the only person who took holiday and covered my own expenses to attend the OER conference in Edinburgh last week (though I am most grateful to the organisers for letting me in and giving me the opportunity to bounce a few geeky ideas around with Jim Groom, Brian Lamb, Grant Potter, Martin Hawksey and David Kernohan that I still need to think through). I need to start pondering the data driven stand-up and panel show games too…!:-)

Arising from that confused melee of ideas around what I guess is the economics of gonzo academia, I decided to post a version of the notebook on Leanpub as the first part of a possible work in progress: Course Analytics – Wrangling FutureLearn Data With Python and R. (I’ve been pondering a version of the notebook recast as an R shiny app, a Jupyter dashboard, and an RMarkdown report, and I think that title will accommodate such ramblings under the same cover.) So if you find value in the notebook and feel as if you should pay for it, you now have an opportunity to do so. (Any monies generated will be used to cover costs of activities related to the topic of the work, along with the progression and dissemination of ideas related to it. Receipts and expenditure arising therefrom will be itemised in full in the repository.) And if you don’t think it’s worth anything, the book is flexibly priced with a starting price of free.

PS the notebook is a Jupyter notebook, as used in the OU/FutureLearn course Learn to Code for Data Analysis. My original FutureLearn data analysis notebook used only techniques developed in that course, although the current version uses a few more, including widgets that let you analyse the data interactively within the notebook. If you need any further reasons as to why you should take the course, here’s a marketing pitch…

Want to Get Started With Open Data? Looking for an Introductory Programming Course?

Want to learn to code but never got round to it? The next presentation of OUr FutureLearn course Learn to Code for Data Analysis will teach you how to write your own programme code, a line at a time, to analyse real open data datasets. The next presentation starts on 6 June, 2016, runs for 4 weeks, and takes about 5 hrs per week.

I’ve often thought that there are several obstacles to getting started with programming. Firstly, there’s the rationale or context: why bother/what could I possibly use programming for? Secondly, there are the practical difficulties: to write and execute programmes, you need to get a programming environment set up. Thirdly, there’s the so what: “okay, so I can programme now, but how do I use this in the real world?”

Many introductory programming courses reuse educational methods and motivational techniques or contexts developed to teach children (and often very young children) the basics of computer programming to set the scene: programming a “turtle” that can drive around the screen, for example, or garishly coloured visual programming environments that let you plug logical blocks together as if they were computational Lego. Great fun, and one way of demonstrating some of the programming principles common to all programming languages, but they don’t necessarily set you up for seeing how such techniques might be directly relevant to an IT problem or issue you face in your daily life. And it can be hard to see how you might use such environments or techniques at work to help you perform real tasks… (Because programmes can actually be good at that – automating the repetitive and working through large amounts of stuff on your behalf.) At the other extreme are professional programming environments, like geekily bloated versions of Microsoft Word or Excel, with confusing preference setups and menus and settings all over the place. And designed by hardcore programmers for hardcore programmers.

So the approach we’ve taken in the OU FutureLearn course Learn to Code for Data Analysis is slightly different to that.

The course uses a notebook style programming environment that blends text, programme code, and the outputs of running that code (such as charts and tables) in a single, editable web page accessed via your web browser.


To motivate your learning, we use real world, openly licensed data sets from organisations such as the World Bank and the United Nations – data you can download and access for yourself – that you can analyse and chart using your own programme code. A line at a time. Because each line does its own thing, each line is useful, and you can see what each line does to your dataset directly.
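The line-at-a-time style can be sketched as follows; the tiny table here is a made-up stand-in for the sort of World Bank indicator data used in the course, and each line does exactly one thing to the data that you can inspect directly.

```python
# A made-up "indicator table" standing in for real World Bank open data.
import pandas as pd

gdp = pd.DataFrame({
    "Country": ["Aland", "Borduria", "Syldavia"],
    "GDP (US$)": [2.1e9, 5.4e10, 8.7e9],
    "Population": [3.0e5, 6.2e6, 1.1e6],
})

gdp["GDP per capita"] = gdp["GDP (US$)"] / gdp["Population"]  # one derived column
richest = gdp.sort_values("GDP per capita", ascending=False)  # one reordering
print(richest[["Country", "GDP per capita"]].head(1))         # one inspection
```

In a notebook, each of those lines would typically sit in its own cell, with its output displayed immediately below it.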

So that’s the rationale: learn to code so you can work with data (and that includes datasets much larger than you can load into Excel…)

The practicalities of setting up the notebook environment still have to be negotiated, of course. But we try to help you there too. If you want to download and install the programming environment on your computer, you can do so, in the form of the freely available Anaconda Scientific Computing Python Distribution. Or you can access an online version of the notebook-based programming environment via SageMathCloud and do all your programming online, through your browser.

So that’s the practical issues hopefully sorted.

But what about the “so what”? Well, the language you’ll be learning is Python, a widely used programming language that makes it ridiculously easy to do powerful things.

Python cartoon - via https://xkcd.com/353/

But not that easy, perhaps..?!

The environment you’ll be using – Jupyter notebooks – is also a “real world” technology, originating as an open source platform for scientific computing but increasingly being used by journalists (data journalism, anyone?) and educators. It’s also attracted the attention of business, with companies such as IBM supporting the development of a range of interactive dashboard tools and backend service hooks that allow programmes written using the notebooks to be deployed as standalone online interactive dashboards.

The course won’t take you quite that far, but it will get you started, and safe in the knowledge that whatever you learn, as well as the environment you’re learning in, can be used directly to support your own data analysis activities at work, or at home as a civically minded open data armchair analyst.

So what are you waiting for? Sign up now and I’ll see you in the comments:-)

Fragmentary Observations from the Outside About How FutureLearn’s Developing

I’m outside the loop on all matters FutureLearn related, so I’m interested to see what I can pick up from fragments that do make it onto the web.

So for example, from a presentation by Hugh Davis to the M25 Libraries conference April 2013 about Southampton’s involvement with FutureLearn, Collaboration, MOOCs and Futurelearn, we can learn a little bit about the FutureLearn pitch to partners:

FutureLearn Overview

More interesting, I think, is this description of what some of the FutureLearn MOOCs might look like:

MOOC Structure

“miniMOOCs” containing 2 to 3 learning units, each 2-6 hours of study time, broken into 2-3 self-contained learning blocks (which suggests 1-2 hours per block).

So I wonder, based on the learning block sequence diagram, and the following learning design elements slide:

learning design

Will the platform be encouraging a learning design approach, with typed sequences of blocks that offer templated guides as to how to structure that sort of design element? Or is that way off the mark? (Given the platform is currently being built (using Go Free Range for at least some of the development, I believe), it’s tricky to see how this is being played out, given courses and platform both need to be ready at the same time, and it’s hard to write courses using platform primitives if the platform isn’t ready yet?)

Looking elsewhere (or at least, via @patlockley), we may be able to get a few more clues about the line partners are taking towards FutureLearn course development:

futurelearn job ad - Leeds

Hmm, I wonder – would it be worth subscribing to jobs feeds from the partner universities over the next few months to see whether any other FutureLearn related posts are being opened up? And does this also provide an opportunity for the currently rather sparse FutureLearn website to start promoting those job ads? And come to that, how come the posts that have been filled at FutureLearn weren’t advertised on the FutureLearn website…?

Because posts have been filled, as LinkedIn suggests… Here’s who’s declaring an association with the company at the moment:

futurelearn on linkedIN

We can also do a slightly broader search:

futurelearn search

There’s also a recently closed job ad with a role that doesn’t yet appear on anyone’s byline:

global digital marketing strategist

So what roles have been filled according to this source?

  • CEO
  • Head of Content
  • Head of UK Education & HE Partnerships
  • CTO
  • Senior Project Manager / Scrum Master (Contract)
  • Agile Digital Project Manager
  • Product Manager
  • Marketing and Communications Assistant
  • Interim HR Consultant
  • Learning Technologist
  • Commercial and Operations Director for Launch
  • Global Digital Marketing Strategist

Here’s another one, Academic Lead [src].

By the by, I also notice that the OU VC, Martin Bean, has just been appointed as a director of FutureLearn Ltd.

Exciting times, eh…?!;-)

Related: OU Launches FutureLearn Ltd

PS v loosely related (?!) – (Draft) Coursera data export policy

PPS I also noticed this the other day – OpenupEd (press release), an EADTU co-ordinated portal that looks like a clearing house for OER powered MOOCs from universities across the EU (particularly open universities, including, I think, The OU…;-)

MOOC Platforms and the A/B Testing of Course Materials

[The following is my *personal* opinion only. I know as much about FutureLearn as Google does. Much of the substance of this post was circulated internally within the OU prior to posting here.]

In common with other MOOC platforms, one of the possible ways of positioning FutureLearn is as a marketing platform for universities. Another might see it as a tool for delivering informal versions of courses to learners who are not currently registered with a particular institution. [A third might position it in some way around the notion of “learning analytics”, eg as described in a post today by Simon Buckingham Shum: The emerging MOOC data/analytics ecosystem] If I understand it correctly, “quality of the learning experience” will be at the heart of the FutureLearn offering. But what of innovation? In the same way that there is often a “public benefit feelgood” effect for participants in medical trials, could FutureLearn provide a way of engaging, at least to a limited extent, in “learning trials”?

This need not be onerous, but could simply relate to trialling different exercises or wording or media use (video vs image vs interactive) in particular parts of a course. In the same way that Google may be running dozens of different experiments on its homepage in different combinations at any one time, could FutureLearn provide universities with a platform for trying out differing learning experiments whilst running their MOOCs?

The platform need not be too complex – at first. Google Analytics provides a mechanism for running A/B tests and “experiments” across users who have not disabled Google Analytics cookies, and as such may be appropriate for initial trialling of learning content A/B tests. The aim? Deciding on metrics is likely to prove a challenge, but we could start with simple things to try out – does the ordering or wording of resource lists affect click-through or download rates for linked resources, for example? (And what should we do about those links that never get clicked and those resources that are never downloaded?) Does offering a worked-through exercise before an interactive quiz improve success rates on the quiz? And so on.
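The analysis side of such a trial need not be complex either. Here’s a back-of-the-envelope sketch of the kind of comparison described above (did variant B’s resource link get clicked more often than variant A’s?) using a standard two-proportion z-test; the counts are invented for illustration.

```python
# A standard two-proportion z-test on click-through rates, stdlib only.
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Return (z, two-sided p-value) for the difference in click-through rates."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)  # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Invented counts: 120/1000 clicks on variant A, 165/1000 on variant B.
z, p = two_proportion_z(clicks_a=120, n_a=1000, clicks_b=165, n_b=1000)
print(round(z, 2), round(p, 4))
```

With cohorts of MOOC size, even modest differences in click-through rate become detectable, which is precisely the opportunity being mooted here.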

The OU has traditionally been cautious when running learning experiments, delivering fee-waived pilots rather than testing innovations as part of A/B testing on live courses with large populations. In part this may be through a desire to be ‘equitable’ and not jeopardise the learning experience for any particular student by providing them with a lesser quality offering than we could*. (At the same time, the OU celebrates the diversity and range of skills and abilities of OU students, which makes treating them all in exactly the same way seem rather incongruous?)

* Medical trials face similar challenges. But it must be remembered that we wouldn’t trial a resource we thought stood a good chance of being /less/ effective than one we were already running… For a brief overview of the broken worlds of medical trials and medical academic publishing, as well as how they could operate, see Ben Goldacre’s Bad Pharma.

FutureLearn could start to change that, and open up a pathway for experimentally testing innovations in online learning as well as, at a more micro-level, tuning images and text in order to optimise content for its anticipated use. By providing course publishers with a means of trialling slightly different versions of their course materials, FutureLearn could provide an effective environment for trialling e-learning innovations. Branding FutureLearn not only as a platform for quality learning, but also as a platform for “doing” innovation in learning, gives it a unique point of difference. Organisations trialling on the platform do not face the threat of challenges made about them delivering different learning experiences to students on formally offered courses, but participants in courses are made aware that they may be presented with slightly different variants of the course materials to each other. (Or they aren’t told… if an experiment is based on success in reading a diagram where the labels are presented in different fonts or slightly different positions, or with or without arrows, and so on, does that really matter if the students aren’t told?)

Consultancy opportunities are also likely to arise in the design and analysis of trials and new interventions. The OU is also provided with both an opportunity to act according to its beacon status as far as communicating innovative adult online learning/pedagogy goes, and access to large trial populations.

Note that what I’m proposing is not some sort of magical, shiny learning analytics dashboard; it’d be a procedural, could-have-been-doing-it-for-years application of web analytics that makes use of online learning cohorts that are at least an order of magnitude or two larger than is typical in a traditional university course setting. Numbers that are maybe big enough to spot patterns of behaviour in (either positive, or avoidant).

There are ethical challenges and educational challenges in following such a course of action, of course. But in the same way that doctors might randomly prescribe between two equally good (as far as they know) treatments, or who systematically use one particular treatment over another that is equally good, I know that folk who create learning materials also pick particular pedagogical treatments “just because”. So why shouldn’t we start trialling on a platform that is branded as such?

Once again, note that I am not part of the FutureLearn project team and my knowledge of it is largely limited to what I have found on Google.

See also: Treating MOOC Platforms as Websites to be Optimised, Pure and Simple…. For some very old “course analytics” ideas about using Google Analytics, see Online Course Analytics, and the posts collected in the OUseful blog archive under “course analytics”. Note that these experiments never got as far as content optimisation, A/B testing, search log analysis etc. The approach I started to follow with the Library Analytics series had a little more success, but still never really got past the starting post and into a useful analyse/adapt cycle. Google Analytics has moved on since then of course… If I were to start over, I’d probably focus on creating custom dashboards to illustrate very particular use cases, as well as REDACTED.

Treating MOOC Platforms as Websites to be Optimised, Pure and Simple…

A month or so on from its PR launch, and with a steady trickle of press mentions since then (though no new updates on the website?), I’m guessing that the folk over at FutureLearn must be putting the hours in trying to work out what the platform offering will actually consist of, or what the sustainability/business model will actually be. (I have no inside information on the FutureLearn project…)


One of the things I have sort of picked up from online glimpses of things said and commented upon is that the USP is going to relate to the quality of teaching/pedagogy (erm, I think?!). I’m not sure if “proven” learning designs will be baked into the platform, constraining the way courses are delivered (in which case, there’s likely to be something of a bootstrap problem in getting the first courses out if they have to wait for the platform?) or whether the quality will flow “naturally” from the fact that the courses will be provided by British universities (?!), but if innovation is also to flow, it’ll be interesting to see how it’ll be supported…?

…and whether it will be done through “open” means? (I can haz API? But what would it do?!?) If it is built up from open code, I wonder to what extent it might draw on code and ideas used in other learning platforms (for example, Moodle, to which the OU is already a core contributor, I think, or Class2Go) as well as drawing on learning from whatsoever folk managed to learn from the OU’s other open learning builds – OpenLearn/Labspace (content and community), iSpot (community and reputation), Cloudworks (community and resource sharing) or the very many expensive attempts at SocialLearn (wtf?!) that never saw the light of day? I can’t imagine a FutureLearn offering being based on the Google Coursebuilder, but it wouldn’t surprise me if it ended up with something being bought in… Time to start watching the tender site, maybe, though surely that would knock any start date back too far?

One thing that would be nice to see would be the project using something akin to the open, agile development process used by the @GDSteam, which is opening up the backend to View Source as well as the front-end…

I also wonder about the extent to which it might be possible to reuse ideas from commercial website design and development in the way the site is architected. This will be anathema to many, but I wonder just how far the idea could be pushed? Start with the idea of analytics, and define funnels for how folk might be expected to move through course units. Associate activities with some sort of intentional action, such as popping items into a shopping basket, or maybe the equivalent of 1-click purchases. Making it through to the end of a course can be seen as completing such a purchase (chuck in some open badge framework badges as a reward for good measure;-). Ad-delivery mechanisms can be rethought as personalised content delivery (eg contextual content delivery, banner ads as signage or email-pre-emptive ads). Use search data to help refine content pages, and A/B testing to try out multiple variants of course materials and exercises (weak example). (I have never understood why the OU doesn’t engage in A/B tested delivery of course materials as a matter of course? OU courses are delivered at large enough scale, and contain more than enough content, to trial different ways of delivering content and assessment without jeopardising overall outcomes for any individual student.)
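The funnel idea is simple enough to sketch: given a table of step visits, count how many distinct learners reach each step and express that as a percentage of those who started. The step-visit data here is invented for illustration.

```python
# A quick funnel sketch: what fraction of learners who start a course unit
# make it through each subsequent step?
import pandas as pd

visits = pd.DataFrame({
    "learner_id": ["a", "a", "a", "b", "b", "c"],
    "step":       ["1.1", "1.2", "1.3", "1.1", "1.2", "1.1"],
})

# Distinct learners seen at each step, in step order.
funnel = visits.groupby("step")["learner_id"].nunique().sort_index()

# Express each step as a percentage of the learners who reached the first step.
funnel_pct = (100 * funnel / funnel.iloc[0]).round(1)
print(funnel_pct)
```

Steep drop-offs between adjacent steps would then point at exactly the pages worth inspecting under the content-optimisation approach described above.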

All of the above – search analysis, web analytics, contextual content/ad-serving and A/B testing – can be managed through ad servers and Google Analytics (and to a lesser extent Piwik, though they are open to additional contributions), which could provide a minimum-viable product tooling basis for a testing and analytics framework that’s ready to go now? Such an approach is far too scruffy and ad hoc, of course, for a “proper” platform project…

PS by the by, I notice that JISC Advance’s Generic eMarketplace (or GeM) for Work Based Learning (“gemforwbl”, or, looking at the logo, “gee em for weeble” – will it wobble? will it fall down?) is now open and ready for business… and as for the logo, what on earth is it supposed to represent?


Answers in the comments, etc etc, please…

PPS As ever, the opinions expressed herein are not necessarily even reflective of my own, let alone those of my employer…;-)

OU Launches FutureLearn Ltd

So it seems the Open University press office must have had an embargoed press release lined up for midnight, with a flurry of stories – and a reveal of the official press release on the OU site, partner quotes and briefing doc – about FutureLearn Ltd (Twitter: @future_learn).

Futurelearn Ltd logo

Futurelearn original press release

Apparently, Futurelearn (not FutureLearn? The UEA press release uses CamelCase…) “will bring together a range of free, open, online courses from leading UK universities, in the same place and under the same brand.”

A bit like edX, then…?

future of online education

…only that’s for US unis… Or Coursera, which is open to all-comers, I think? Whereas Futurelearn looks as if it’ll be championing the cause of UK universities – apparently, Birmingham [UK universities embrace the free, open, online future of higher education], Bristol [UK universities embrace the free, open, online future of higher education powered by The Open University], Cardiff [Online future of higher education], East Anglia [UK universities embrace the online future of higher education], Exeter [UK universities embrace the free, open, online future of higher education powered by The Open University], King’s College London [Futurelearn – new online higher education initiative], Lancaster [Lancaster signs up for Futurelearn], Leeds [Leeds joins partners in offering free online access to education], Southampton [University of Southampton embraces the open, online future of higher education], St Andrews [news feed] and Warwick [Warwick joins other leading UK universities to create multiple MOOC giving free access to some of those Universities’ most innovative courses] have all signed up to join Futurelearn… (It’ll be interesting to see if HEIs that are trying out Coursera, such as Edinburgh, will join Futurelearn, or whether exclusive agreements are in place? I also wonder about whether membership of any of the particular university groups will influence which “open” online course marketing outfit particular universities join?) [Other press releases: QAA: Open University launches UK-based Moocs platform]

[For what it’s worth, the OU and UEA were the only press offices to break the story just after midnight. St Andrews is the last to release a press release. Birmingham and Kings were also tardy… I wonder whether some of the partners were waiting to see whether anyone picked up on the story before putting out their own press releases?]

Here’s some of the press coverage so far – I guess I should grab these reports and give each a churnalism score…?

Simon Nelson, whom I remember gave a presentation at the OU a few years ago when he was BBC multiplatform commissioner, has been appointed as CEO, so that could prove interesting… (FWIW, Simon Nelson Linked In page, directorships: Sineo Ltd, and I think Ludifi Ltd?) What might this mean for the OpenLearn brand, I wonder? Or for the Open University Apps, iBooks and Stores?

Structurally, “Futurelearn will be independent but majority-owned by the OU”, although as far as “partners” announced so far go, this “do[es] not constitute a partnership in the legal sense and the Parties shall not have authority to bind each other in any way. The term is used to indicate their support and intent to work together on this project.”

One possible response is that this is a playing out of an Emperor’s New Clothes marketing battle, but as with the evolution of any novel communication technology (seeing “MOOC’s” as such as thing), some of them do manage to lock-in… (And as George Siemens comments in Finally, alternatives to prominent MOOCs, “Even if MOOCs disappear from the landscape in the next few years, the change drivers that gave birth to them will continue to exert pressure and render slow plodding systems obsolete (or, perhaps more accurately, less relevant). If MOOCs are eventually revealed to be a fad, the universities that experiment with them today will have acquired experience and insight into the role of technology in teaching and learning that their conservative peers won’t have. It’s not only about being right, it’s about experimenting and playing in the front line of knowledge”.)

Futurelearn Ltd

Leagas Delaney, it seems, is some sort of brand communications agency. So much style on their website, I couldn’t actually work out the substance of what it is they actually do at this late hour (all I did was check my feeds quickly, just after midnight, as I was on my way to bed, and catch sight of the OU news release…).

PS No-one mention the warUKeU… (via Seb Schmoller (Futurelearn – an OU-led response to Coursera, Udacity, and MITx), I am reminded of Paul Bacsich’s Lessons to be learned from the failure of UKeU.)

PPS Now I’m wondering whether @dkernohan knew something I didn’t when he launched the MOOCAS/”MOOC Advisory Service” search engine a couple of days ago…?!;-)

[UPDATE: this post was an early response that collated press stories released at end of embargo time. For a more considered review, check out Doug Clow’s Futurelearn may or may not succeed but is well worth a try. Via @dkernohan, William Hammonds on the Universities UK blog: Are we witnessing higher education’s “digital moment”?]

[The views expressed within this post are barely even my personal ones, let alone anybody else’s…]