With a bit of luck, we’ll be updating software for our databases course starting in Sept/Oct to use JupyterLab. It’s a bit late for settling this, but I find the adrenaline of going live, and the interaction with pathfinder students in particular, to be really invigorating (a bit like the old days of residential schools) so here’s hoping…
[I started this post last night, closed the laptop, opened it this morning and did 75 mins of work. The sleep broke WordPress’ autobackup, so when I accidentally backswiped the page, I seem to have lost over an hour’s work. Which is to say, this post was originally better written, but now it’s rushed. F**k you, WordPress.]
If we are to do the update, one thing I need to do is update the software guide. The software guide looks a bit like this:
Which is to say:
- process text describing what to do;
- a (possibly annotated) screenshot showing what to do, or the outcome of doing it;
- more process text, etc.
Maintaining this sort of documentation can be a faff:
- if the styling of the target website/application changes, but the structure and process steps remain the same, the screenshots drift from actuality, even if they are still functionally correct;
- if the application structure or process steps change, then the documentation properly breaks. This can be hard to check for unless you rerun everything and manually test things before each presentation, at least.
So is there a better way?
There is, and it’s another example of why it’s useful for folk to learn to code, which is to say, have some sort of understanding about how the web works and how to construct simple computational scripts.
Regular readers will know that I’ve tinkered on and off with the Selenium browser automation toolkit in the past, using it for scraping as well as automating repetitive manual tasks (such as downloading scores of student scripts, one at a time, from one exams system, and grabbing a 2FA code for each of them for use in submitting marks into a second system). But `playwright`, Microsoft’s (freely available) browser automation tool, seems to be what all the cool kids are using now, so I thought I’d try that.
The `playwright` app itself is a Node app, which makes me twitchy, because Node is a pain to install. But the `playwright-python` package, which is installable from PyPI and which wraps `playwright`, bundles its own version of Node, which makes things much simpler. (See Simon Willison’s Bundling binary tools in Python wheels for a discussion of this; for any edtechies out there, this is a really useful pattern, because if students have Python installed, you can use it as a route to deploy other things…)
Just as a brief aside, `playwright` is also wrapped by @simonw’s `shot-scraper` command line tool, which makes it dead easy to grab screenshots. For example, we can grab a screenshot of the OU home page as simply as typing `shot-scraper https://www.open.ac.uk/`.
Note that because the session runs in a new, incognito, browser, we get the cookie notice.
We can also grab a screenshot of just a particular, located CSS element: `shot-scraper https://www.open.ac.uk/ -s '#ou-org-footer'`. See the `shot-scraper` docs for many more examples.
In many cases, screenshots that appear in OU course materials and mismatch with reality are identified and reported by students. Tools like `pytest` can be used as part of a documentation testing suite where we create gold master images and then, as required, test “current” screenshots to see if they match the distributed originals.
But back to the creation or reflowing of documentation.
As well as command line control using `shot-scraper`, we can also drive `playwright` from Python code, executed synchronously, as in a `pytest` test, or asynchronously, as in the case of Python running inside a Jupyter notebook. This is what I spent yesterday exploring: in particular, whether we could create reproducible documentation, in the sense of something that has the form text, screenshot, text, … and looks like this:
but is actually created by something that has the form text, code, (code output), text, … and looks like this:
And as you’ve probably guessed, we can.
For some reason, my local version of `nbconvert` now seems to default to no-input display settings and I can’t see why (there are no `nbconvert` settings files that I can see). Anyone got any ideas how/why this might be happening? The only way I can work around it atm is to explicitly enable the display of the inputs: `jupyter nbconvert --to pdf --TemplateExporter.exclude_output_prompt=True --TemplateExporter.exclude_input=False --TemplateExporter.exclude_input_prompt=False notebook.ipynb`.
It’s worth noting a couple of things here:
- if we reflow the document to generate new output screenshots, they will be a faithful representation of what the screen looks like at the time the document is reflowed. So if the visual styling (and nothing else) has been updated, we can capture the latest version;
- the code should ideally follow the text description of the process steps, so if the code stops working for some reason, that might suggest the process has changed and so the text might well be broken too.
Generating the automation code requires knowledge of a couple of things:
- how to write the code itself: the documentation helps here (eg in respect of how to wait for things, how to grab screenshots, how to move to things so they are in focus when you take a screenshot), but a recipe / FAQ / crib sheet would also be really handy;
- how to find locators.
In terms of finding locators, one way is to do it manually, using browser developer tools to inspect elements and grab locators for required elements. But another, simpler, way is to record a set of `playwright` steps using the `playwright codegen URL` command line tool: simply walk through (click through) the process you want to document in the automatically launched interactive browser, and record the corresponding script.
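For example (assuming the `playwright` Python package is installed; the output filename here is just illustrative), recording against the OU home page might look like:

```shell
# Launch an interactive browser; actions are recorded as Python code
playwright codegen --target python -o recorded_steps.py https://www.open.ac.uk
```

The `--target` flag selects the language the recorded script is emitted in (e.g. `python` or `python-async`).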
With the script recorded, you can then use it as the basis for your screenshot-generating reproducible documentation.
For any interested OU internal readers, there is an example software guide generated using `playwright` in MS Teams > Jupyter notebook working group team. I’m happy to share any scripts etc. I come up with, and am interested to see other examples of using browser automation to test and generate documentation.
Referring back to the original software guide, we note that some screenshots have annotations. Another nice feature of scripting screenshots this way is that we can restyle elements on the page before capturing them. It’s trivial to amend the screenshot-grabbing function to add a border round an element:
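Something like the following, perhaps. (This is a sketch: the function name and signature match the call in the script below, but the style-injection details are my assumption, not the original implementation.)

```python
def border_style(border="5px solid red", padding=2, margin=0):
    """Build the inline CSS we inject into each matched element."""
    return f"border: {border}; padding: {padding}px; margin: {margin}px;"


async def screenshot_bounded_selectors(page, selectors, selectors_all,
                                       border="5px solid red", padding=2,
                                       margin=0, full_page=True,
                                       path="screenshot.png"):
    """Add a highlight border to located elements, then grab a screenshot."""
    style = border_style(border, padding, margin)
    # Style just the first match of each selector...
    for s in selectors:
        await page.locator(s).first.evaluate(
            "(el, css) => el.style.cssText += css", style)
    # ...and every match of each of the selectors_all selectors
    for s in selectors_all:
        for loc in await page.locator(s).all():
            await loc.evaluate("(el, css) => el.style.cssText += css", style)
    await page.screenshot(path=path, full_page=full_page)
```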
Then we can use a simple script to grab a screenshot with a highlighted area:
```python
from time import sleep
from playwright.async_api import async_playwright

playwright = await async_playwright().start()

# If we run headless=False, playwright will launch a
# visible browser we can track progress in
browser = await playwright.chromium.launch(headless=False)

# Create a reference to a browser page
page = await browser.new_page()

# The page we want to visit
PAGE = "https://www.open.ac.uk"

# And a locator for the "Accept all cookies" button
cookie_id = "#cassie_accept_all_pre_banner"

# Load the page
await page.goto(PAGE)

# Accept the cookies
await page.locator(cookie_id).click()

# The selectors we want to screenshot
selectors = selectors_all = [".int-grid3"]

# Grab the screenshot
await screenshot_bounded_selectors(page, selectors, selectors_all,
                                   border='5px solid red', padding=2,
                                   margin=0, full_page=True)
```
Or we can just screenshot and highlight the element of interest:
Simon has an open `shot-scraper` issue on adding things like arrows to screenshots, so this is obviously something that might repay a bit more exploration.
I note that the Jupyter accessibility docs have a section on the DOM / locator structure of the JupyterLab UI that includes highlighted screenshots annotated, I think, using drawio. It might be interesting to try to replicate / automatically generate those using `playwright`.
Finally, it’s worth noting that there is another `playwright` based tool, `galata`, that provides a set of high level tools for controlling JupyterLab and scripting JupyterLab actions (this will be the subject of another post). However,
`galata` is currently a developer-only tool, in that it is expected to run against a wide open (no authentication) JupyterLab environment, and only a JupyterLab environment. It does this by overriding the `.goto()` method to expect a particular JupyterLab locator, which means that if you want to test a deployment that sits behind the Jupyter authenticator (which is the recommended and default way of running a Jupyter server), or you want to go through some browser steps involving arbitrary web pages, you can’t. (I have opened an issue regarding at least getting through Jupyter authentication here, and a related Jupyter discourse discussion here.) What would be useful in the general case would be a trick to use a generic `playwright` script to automate steps up to a JupyterLab page and then hand over browser state to a `galata` script. But I don’t know how to “just” do that. Which is why I say this is a developer tool, and as such is hostile to non-developer users, who, for example, might only be able to run scripts against a hosted server accessed through multiple sorts of institutional and Jupyter based authentication. The current set up also means it’s not possible to use `galata` out of the can for testing a deployment. Devs only, n’est-ce pas?