Using Selenium to Support Teaching and the Production and Maintenance of Teaching Materials?

At the OU, we tell ourselves lots of myths, but don’t necessarily act them out. I believe more than a few of them, not least the one that we’re a content factory. I also believe we used to be really innovative in our production methods, but I think that’s largely fallen by the wayside in recent years.

The following is an example of a few hours’ play, though each step has probably taken me longer to write up in this post than the documented proof of concept code for each step took to produce.

It’s based on a couple of observations about Selenium that I hadn’t fully grokked until I played with it over the weekend, as described in the previous post (a recipe for automating bulk uploads of Jupyter notebooks to nbgallery), and then a riff or two off the back of them.

First up, I think we can use Selenium to support teaching in several ways.

One of the strategies we use in the OU for documenting how to use software applications is to use narrated screencasts, which is to say, screen-recordings of how to use an application with a narrated audio track explaining what’s going on, and/or overlaid captions.

I wrote my nbgallery script as a way of automating bulk uploads, but it’s not hard to see how it can also be used to help in the automation of a screencast:

In that case, I did a test run to see where the browser was opened, then used Giphy to record a video of that part of the screen as I replayed the script.

The last time I recorded one of these was a couple of years ago and, as I recall, it was a bit of a faff as I read from a script to dub the audio (I’m not a natural when it comes to the studio; I’m still not that comfortable, but still find it easier, recording an ad libbed take, although this may become a bit fiddly when trying at the same time to control an application with a reasonable cadence).

What might have been easier would have been to script the sequence of button presses and mouse actions (though mousing actions would be lost?)

That said, it is possible to script in some highlighting too…

For example:

import time

def highlight(element, sleep=1.0):
    """Highlights (blinks) a Selenium Webdriver element"""
    driver = element._parent
    def apply_style(s):
        driver.execute_script("arguments[0].setAttribute('style', arguments[1]);",
                              element, s)
    original_style = element.get_attribute('style')
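    apply_style("background: yellow; border: 2px solid red;")
    # Hold the highlight for `sleep` seconds, then restore the original style
    time.sleep(sleep)
    apply_style(original_style)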

gives something like this:

A couple of different workflows are possible here.

Firstly, we could bake timings in and record a completely automated screen-capture using time.sleep() commands to hold each step as long as we need (or long enough so an editor can easily pause the video at a particular point for as many frames as are required).
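As a minimal sketch of the baked-in timings idea, assuming each step is a plain Python callable paired with a hold time in seconds: the names `run_timed_steps` and `steps` are hypothetical, and the lambdas are placeholders for real Selenium calls such as clicking an element.

```python
import time

def run_timed_steps(steps, sleep=time.sleep):
    """Run (label, action, hold_seconds) steps in order, pausing after each.

    `sleep` is injectable so the pauses can be skipped in a dry run.
    Returns the labels in the order they were executed.
    """
    done = []
    for label, action, hold in steps:
        action()      # e.g. element.click() in a real script
        sleep(hold)   # hold the frame long enough for an editor to pause on
        done.append(label)
    return done

# Placeholder actions; a real script would drive the browser here
log = []
steps = [
    ("open login page", lambda: log.append("open"), 0.01),
    ("click sign in", lambda: log.append("click"), 0.01),
]
executed = run_timed_steps(steps)
```

Keeping the hold times in the step list, rather than scattered through the script, should make it easier to retime a recording without touching the actions themselves.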

Alternatively, we could use the notebook to allow us to step through the automation of particular actions.

What’s more, the notebook could include a script. Here’s an example in a step-through style:

One of the big issues with creating assets such as these is knowing the storyboard: what you expect to see at each step. This is particularly true if a software application or webpage is updated, and an automation script breaks.

At a technical level, knowing what the original page looked like as HTML can help, but the best crib is often a view of the original rendered display.

Which makes me think: it’s trivial to grab a screenshot of each step and insert those back into the notebook?

Here’s a code fragment for that:

import tempfile
from IPython.display import Image

#Create a temporary file for now
imgfile = tempfile.mktemp(suffix='.png')

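#Get a browser element - this would be any old step
#(assumes an existing Selenium driver that has already loaded a page)

#Grab a screenshot of the browser
driver.save_screenshot(imgfile)

#Display the screenshot in the notebook
Image(imgfile)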

Not only can this help us document the script at a step level, but it also sets up an opportunity to create a text document (rather than a video screencast) that describes what steps to do when.
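One way the text-document idea might be sketched: run each step, save a screenshot, and build up a markdown document that pairs each description with its image. The function `document_steps` and the `FakeDriver` class are hypothetical; the fake stands in for a real Selenium driver so the sketch runs without a browser, and the only driver method assumed is `save_screenshot()`.

```python
import os
import tempfile

def document_steps(driver, steps, outdir):
    """Run each (description, action) step, save a screenshot, and
    return a markdown document pairing descriptions with images."""
    lines = []
    for i, (description, action) in enumerate(steps, start=1):
        action(driver)
        imgfile = os.path.join(outdir, f"step_{i:02d}.png")
        driver.save_screenshot(imgfile)  # Selenium WebDriver method
        lines.append(f"{i}. {description}")
        lines.append("")
        lines.append(f"![Step {i}]({os.path.basename(imgfile)})")
        lines.append("")
    return "\n".join(lines)

class FakeDriver:
    """Stand-in for a Selenium driver (an assumption for this sketch)."""
    def save_screenshot(self, path):
        with open(path, "wb") as f:
            f.write(b"")  # a real driver writes PNG data here
        return True

outdir = tempfile.mkdtemp()
steps = [
    ("Open the gallery home page.", lambda d: None),
    ("Select the person icon at the top right.", lambda d: None),
]
doc = document_steps(FakeDriver(), steps, outdir)
```

If the application changes and a step breaks, the saved screenshots also give a record of what each step used to look like.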

Can we also record a video of the automation? Selenium appears not to offer that out of the can, but maybe ffmpeg can help (ffmpeg docs)? Alternatively this Selenium docker image looks to support video capture, though I don’t see offhand how to drive it from Python?
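As a rough, untested-in-anger sketch of the ffmpeg route, the following builds a command line for recording a region of the screen. It assumes a Linux desktop running X11 and an ffmpeg build with the `x11grab` input device; macOS and Windows would need a different input device (`avfoundation` or `gdigrab`). The helper name `ffmpeg_capture_cmd` and its parameters are hypothetical.

```python
def ffmpeg_capture_cmd(outfile, x, y, width, height, seconds,
                       display=":0.0", framerate=25):
    """Build an ffmpeg command that records a screen region on Linux/X11.

    Assumption: ffmpeg is installed with x11grab support.
    """
    return [
        "ffmpeg", "-y",                      # overwrite the output file
        "-f", "x11grab",                     # X11 screen capture input
        "-framerate", str(framerate),
        "-video_size", f"{width}x{height}",  # size of the captured region
        "-i", f"{display}+{x},{y}",          # display plus top-left offset
        "-t", str(seconds),                  # stop after this many seconds
        outfile,
    ]

cmd = ffmpeg_capture_cmd("demo.mp4", x=0, y=0, width=1280, height=720,
                         seconds=30)
# To record for real, pass cmd to subprocess.Popen() and then replay
# the Selenium script while ffmpeg is running.
```

The region would need to match wherever the automated browser window opens, which is the same test-run step I needed for the screen recording above.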

I also wonder: do the folk who do testing use this sort of automation, and if so, why don’t they share the knowledge and scripts back with us as a way of helping automate production as well as test? After all, that’s where factories are useful: mechanisation / automation helps with the scaling.

Once we start thinking about creating these sorts of media asset, it’s natural to ask: could we also create a soundtrack?

I don’t see why not…

For example, pyttsx3 is a cross-platform text-to-speech library, albeit with not necessarily the best voice:

#!pip3 install pyobjc pyttsx3

import pyttsx3
engine = pyttsx3.init()

def sayItToMe(txt):
    ''' Simple text to speech. '''
    engine.say(txt)
    engine.runAndWait()

We can explicitly create text strings, but I don’t see why we shouldn’t also find a way of grabbing relevant text from markdown cells?

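TXT = '''
The first thing we need to do is log in.
'''
sayItToMe(TXT)

TXT = '''
Select the person icon at the top right of the screen.
'''
sayItToMe(TXT)

from selenium.webdriver.common.by import By
# find_element(By.ID, ...) replaces find_element_by_id, removed in Selenium 4
element = driver.find_element(By.ID, "gearDropdown")
highlight(element)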
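On the idea of grabbing narration text from markdown cells: a notebook file is just JSON, so a minimal sketch needs nothing beyond the standard library. The function name `markdown_cell_text` is hypothetical, and the tiny in-memory notebook below stands in for a real .ipynb file.

```python
import json

def markdown_cell_text(notebook_json):
    """Return the text of each markdown cell in an .ipynb document."""
    nb = json.loads(notebook_json)
    texts = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "markdown":
            source = cell.get("source", "")
            # The notebook format stores source as a string or a list of lines
            if isinstance(source, list):
                source = "".join(source)
            texts.append(source)
    return texts

# A tiny in-memory notebook standing in for a real .ipynb file
example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["The first thing we need to do ", "is log in."]},
        {"cell_type": "code", "metadata": {}, "source": "driver.get(url)",
         "outputs": [], "execution_count": None},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 2,
})
narration = markdown_cell_text(example)
# Each string could then be passed to a text-to-speech function
```

In practice we would probably want some way of marking which markdown cells are narration (a cell tag, perhaps) rather than reading out everything.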

Okay, so that’s one way in which we may be able to make use of Selenium, as a way of creating reproducible scripts for creating documentation in a variety of media of how to use a particular web application or website.

How about the second?

I think that one of the claims made for using Scratch in our introductory computing module is that you can get it to control animated things, which can help novices see the actions of particular steps in an animated way originally designed to appeal to primary school children. (And yes, I am prepared to argue an andragogy vs. pedagogy thing, as well as a curriculum thing, about why I think we should have used BlockPy.)

If you want shiny, and animated, and perhaps a little bit frightening, perhaps (surprisingly) useful, and contextualised by all sorts of other basic computing stuff, like how browsers work, and what HTML and the DOM are (so you can probably make synoptic claims too…), then automatically launching a browser from a script and getting it to click things and take pictures might be seen as a really cool, or fun, thing to do (did you know you can…? etc.) and along the way provide a foil for learning a bit about scripting too.


PS longtime readers will note the themes of this post fit in with a couple of oft-repeated ideas contained elsewhere in this blog. For example, the notion I’m trying to work up of “reproducible educational materials” (which also doubles as the automation of rich media assets, and which is something I think is useful from production, testing and maintenance perspectives in a content factory, though no-one else seems to agree :-( ), and the use of notebooks for everything (which again, most people I know think is just me going off on one again… :-( ).

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
