A few weeks ago I spotted a review paper of “data wrangling” activities at the OU (Making sense of learner and learning Big Data: reviewing five years of Data Wrangling at the Open University UK). I saw it being linked to/promoted again today.
Apparently, “Data Wranglers [DWs] are a group of academics who analyse data about student learning and prepare reports with actionable recommendations based upon that data”. Also apparently, “[i]n practice” they also do “Big Data insights”. Or something. I’m not sure we have any “Big Data” do we? (Big data, meh.)
Furthermore, it seems that “Learning analytics are now increasingly taken into consideration at the OU when designing, writing and revising modules, and in the evaluation of specific teaching approaches and technologies”.
Looks around, confused…
…because something that I’ve been failing to understand for years and years and years and years is why no-one seems interested in taking the view that we are, in a lot of courses, delivering online content just like any other web publisher would, and as such we could be looking at ways of making our content “work better”, for some definition of “better”. Or even “work”.
In the learning analytics world, this possibly means building predictive models based on previous cohorts that show how students who dwelled this long on those content pages did well, while others who didn’t reveal that hidden answer or visit that page, or who didn’t appear to visit any course pages, failed.
At this point, it’s probably worth mentioning that the OU, as a distance learning organisation, used to deliver course materials to students as print material, but increasingly we deliver material (that looks just like the print material) as HTML via a Moodle VLE. Each section of “as if” print material appears as a separate HTML page. (We also make PDFs available that students can download… It’d be interesting to know how many then print those PDF downloads out…)
It’s also worth mentioning that a lot of the teaching related activity pursued by the OU’s central academics relates to the production of course materials and assessment materials, which is to say, writing stuff, rather than delivery to students: when the course runs, it’s the moderators of online forums (which may include the occasional central academic) and the students’ personal tutors (Associate Lecturers, in OU parlance), who are the people who actually engage with students directly.
So to a large extent, once the stuff is written, that’s job done. Despite a laborious editing and publishing process to get the material onto the website, errors do slip through, and when spotted (often by pathfinder/vanguard students studying course material weeks ahead of the course schedule), they are corrected in another lengthy process (authors don’t have edit/write permissions on the course materials, and in some cases errors may be left uncorrected in situ, with students expected to pick up the corrections via errata notices. Just like the print days…)
So what I keep on not understanding is why we don’t have someone paying attention to the course material as web content with a view to helping us better understand the obvious (because it’s nothing f****g difficult I want to learn from the pages), as I demoed nine years ago. For example:
- what’s the course dynamic in terms of content use (when are most students studying particular parts of the course)? – have we got the pacing about right?
- what’s the weekly rhythm of the course (what time of day are most students accessing the content pages?) – this could help forum moderators schedule their time;
- how much time are students spending, on average, in a particular study session, and does this vary (e.g. 1-2 hours on a weekday evening, 3-4 hours for daytime or weekend study, 45 mins over lunch periods), and so on; i.e. what user stories might we create *from the data*?
- how much time are students spending on particular pages? Are some pages just too long, or maybe have an idea or activity that is taking a lot of time to complete – or less time than we expect? Handy to know as a content designer (which is what course authors are). For the learning analytics surveillance freaks, can they spot students who spend more or less time than average on a particular page as a “likely fail” feature that they can celebrate?
- are those links to external resources clicked on? Ever?
- are the “optional activities” linked to on separate pages visited? Ever? Again, the learning analytics folk may be able to wet themselves finding correlation features on those pages, but I don’t really care about that. I just want to know, in the first instance, are the pages visited. Ever. (If they are, and it’s only a fraction of students who visit those pages/follow those links, then maybe it becomes useful to track the learning analytics stuff to see if we can figure what sort of student is making use of those resources. But rather than caring about a particular student, I’m more interested in getting a better user story dialled in that I can use to help as one more focal point to motivate content production in future courses.)
- are students using particular devices, or the same users using different sorts of devices at different times of day? With our insistence on still delivering software that needs to be installed on a traditional desktop computer, it would be useful to know if this can affect what a student might be able to study, and when, based on device availability. And if it comes to trying to pitch particular computer requirements, it would be handy to know what the baseline is (which course webstats can provide an indicator of), and the extent to which this may vary across faculties or course levels.
Sometimes it can be comforting to see that your expectations about how the content would be used appear to be being met. Sometimes it can be revealing to find out that they’re not.
This is all basic stuff, and someone can probably have a fun time building some dashboards to report it. (Maybe there are some already, but no-one’s directed me to them despite my asking everyone I can think of.)
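To give a flavour of how basic, here is a minimal Python sketch of the sort of thing I mean. Everything in it is an assumption for illustration: the log format (anonymised student id, page, timestamp), the page names, the made-up rows, and the 60 minute gap used to decide when one study session ends and another begins. Time on page is only estimated, as the gap to the next page view in the same session, so the last page of each session gets no estimate.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta
from statistics import median

# Hypothetical page-view log: (anonymised student id, page, ISO timestamp).
# These rows are invented purely to exercise the functions below.
VIEWS = [
    ("s1", "week1/intro", "2024-02-05T19:00:00"),
    ("s1", "week1/activity1", "2024-02-05T19:10:00"),
    ("s1", "week1/activity2", "2024-02-05T19:55:00"),
    ("s2", "week1/intro", "2024-02-06T12:30:00"),
    ("s2", "week1/activity1", "2024-02-06T12:35:00"),
    ("s2", "week1/intro", "2024-02-10T09:00:00"),
]

# Assumption: a silence longer than this starts a new study session.
SESSION_GAP = timedelta(minutes=60)


def parse(views):
    """Return (student, timestamp, page) tuples sorted by student, then time."""
    return sorted((s, datetime.fromisoformat(t), p) for s, p, t in views)


def views_by_hour(views):
    """Count page views by hour of day, across the whole cohort."""
    return Counter(ts.hour for _, ts, _ in parse(views))


def sessions(views, gap=SESSION_GAP):
    """Split each student's views into sessions wherever the gap exceeds `gap`."""
    by_student = defaultdict(list)
    for s, ts, p in parse(views):
        by_student[s].append((ts, p))
    out = []
    for s, events in by_student.items():
        current = [events[0]]
        for prev, nxt in zip(events, events[1:]):
            if nxt[0] - prev[0] > gap:
                out.append((s, current))
                current = []
            current.append(nxt)
        out.append((s, current))
    return out


def median_minutes_per_page(views, gap=SESSION_GAP):
    """Median estimated minutes per page (gap to the next view in a session)."""
    durations = defaultdict(list)
    for _, events in sessions(views, gap):
        for (t0, page), (t1, _) in zip(events, events[1:]):
            durations[page].append((t1 - t0).total_seconds() / 60)
    return {page: median(mins) for page, mins in durations.items()}
```

Calling `views_by_hour(VIEWS)` gives the weekly rhythm question a first answer, and `median_minutes_per_page(VIEWS)` flags pages that seem to take much longer (or shorter) than expected. Note that the outputs are aggregates: nothing here needs to be reported per student.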
To reiterate on the why: I just want to be able to tell myself more informed stories about how the content appears to be being used en masse, and maybe also identifying different audience segments in the data (e.g. weekend studiers, weekday nighters, full-timers). Looking across courses (faculties, levels) it may be that we get different sorts of pattern / segmentation, which could be interesting from a user / user story informed content design perspective. It may well also prompt “learning analytics” discussions. (Writing this, I’ve come to realise I associate learning analytics with tracking back into individual data from “success” criteria such as assessment scores. For the content analysis, in the first instance, I’m just interested in how it’s generally being consumed. No individual data necessary. Once I’ve got broad usage pattern segments down, then maybe looking at performance level segments would be useful. But then, I’d rather just track the whole cohort score distribution to try and improve that.)
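As a rough illustration of the audience segment idea, the following Python sketch labels study sessions by when they start and reports the share of sessions in each segment. The segment names and the hour boundaries (18:00 for evenings, 12:00 to 14:00 for lunchtime) are my own guesses rather than anything derived from real data, and the input is just a list of session start times with no student identifiers attached.

```python
from collections import Counter
from datetime import datetime


def segment(start):
    """Label a session by its start time. Boundaries are illustrative guesses."""
    if start.weekday() >= 5:  # Saturday = 5, Sunday = 6
        return "weekend"
    if start.hour >= 18:
        return "weekday evening"
    if 12 <= start.hour < 14:
        return "weekday lunchtime"
    return "weekday daytime"


def segment_shares(starts):
    """Return the fraction of sessions falling in each segment."""
    counts = Counter(segment(s) for s in starts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Invented session start times, for demonstration only.
STARTS = [
    datetime(2024, 2, 5, 19, 0),   # Monday evening
    datetime(2024, 2, 6, 12, 30),  # Tuesday lunchtime
    datetime(2024, 2, 7, 10, 0),   # Wednesday morning
    datetime(2024, 2, 10, 9, 0),   # Saturday morning
]
```

Comparing the output of `segment_shares` across courses, faculties or levels would be one cheap way of seeing whether different sorts of pattern show up.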
From looking at VLE pages, it looks as if there are Google Analytics and Optimizely tracking scripts linked in the pages, although asking around I can’t find anyone who does anything with that data from the VLE pages. (Maybe the “DW”s do?) So I’m guessing the data is there?
PS One of the things I think Optimizely may be used for is A/B testing by the Marketing folk on other bits of the website. Something I’ve pitched before is A/B testing on course materials (e.g. differently phrased or worked versions of the same activity).
This has generally been treated with disdain, but if it works for medical trials I don’t see why we can’t try it in education too. There is an argument here that we would need to track effect on attainment (the learning analytics thing), but I’m wary of the idea that changing a single page in several hundred could wildly affect attainment, unless it related to a particular key concept that the whole course hinged on. More realistically, if we see a page on average is taking students an hour to work through when we estimated it at 20 minutes, I’d be tempted to do A/B tests on it within a cohort. (Managing that if students chat about the topic in the common forums could represent a challenge!) The idea would be to see if we could bring the content performance more in line with expectations. As it is, the current approach would be to wait until the next presentation and give that whole cohort the new version. Which would of course be previously untested at scale. And may end up with students taking even longer to work through it.
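For what a within-cohort A/B test on a single page might look like mechanically, here is a hedged Python sketch. It is not how Optimizely does it, and the experiment name, student ids and timings are all hypothetical. Students are assigned to a version by hashing their id together with the experiment name, so each student always sees the same version, and the comparison is simply the median time on page for each version.

```python
import hashlib
from statistics import median


def assign_variant(student_id, experiment, variants=("A", "B")):
    """Deterministically assign a student to a variant of one experiment."""
    key = f"{experiment}:{student_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first byte of the hash to pick a variant.
    return variants[digest[0] % len(variants)]


def median_minutes_by_variant(observations):
    """observations: iterable of (variant, minutes spent on the page)."""
    by_variant = {}
    for variant, minutes in observations:
        by_variant.setdefault(variant, []).append(minutes)
    return {variant: median(mins) for variant, mins in by_variant.items()}


# Invented timings: version A is the original page, B the reworked one.
OBSERVED = [
    ("A", 60), ("A", 50), ("A", 70),
    ("B", 25), ("B", 20), ("B", 30),
]
```

With real data you would also want enough students per version, and some sort of significance check, before believing any difference between the medians. The sketch only shows that the assignment and the headline comparison are not the hard part.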