Automated Testing of Educational Jupyter Notebook Distributions Using GitHub Actions

For years and years (since 2016) we’ve been updating both the notebooks and the environment we ship to students on our data management and analysis course on an annual basis.

The notebooks are substantially the same from one presentation to the next (we update maybe 20% of the material each presentation) and the environment updates are typically year-on-year updates to Python packages. There are some breaking changes, but these are generally flagged by deprecation warnings that start to appear in package releases a year or so before the changes actually break anything.

The notebooks are saved to a private GitHub repo with cells pre-run. The distribution process then involves clearing the output cells and zipping the cleaned notebooks, along with any required data files, into a zip file for distribution. On my to do list is automating the release steps. More on that when I get round to it…
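For reference, the clear-and-zip step amounts to something like the following minimal sketch (the notebooks/ and data/ paths and the zip name are just placeholders here):

# Clear the output cells in place (nbconvert ships as part of Jupyter)
jupyter nbconvert --clear-output --inplace notebooks/*.ipynb

# Zip the cleaned notebooks and any required data files for distribution
zip -r distribution.zip notebooks/ data/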

Checking that the notebooks all run correctly in the updated environment is currently a manual process. It doesn’t have to be, because nbval, which re-runs notebooks and tests cell outputs against the reference outputs in a previously run version of the notebook, has been around for years. But for all the “the OU is a content factory with industrial production techniques” bluster, module teams are cottage industries and centralised production doesn’t come anywhere near our Jupyter notebook workflows. (FWIW, I don’t do the checking… If I did, I’d have got round to automating it properly years ago!;-)
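For anyone who hasn’t met it, nbval runs as a pytest plugin, so checking a single pre-run notebook locally is a one-liner (a sketch; the filename is made up, complete with a space for reasons that will become apparent below):

# Install the nbval pytest plugin
pip install nbval

# Re-run the notebook and compare each cell's fresh output
# against the reference output saved in the notebook
py.test --nbval "A First Notebook.ipynb"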

In a recent fragmentary post, Structural Testing of Jupyter Notebook Cell Outputs With nbval, I described some of the tweaks I’ve been making to nbval to reduce the false positive cell output test errors and add elements of structural testing so we don’t ignore those cell outputs completely.

Today, I wasted hours and hours not understanding why a simple GitHub Actions automation script to run the tests wasn’t working. (Answer: I think I was running Linux commands in the wrong shell.) On top of that, our file names have spaces in them; some may even have punctuation, although I haven’t hit errors associated with that sort of filename yet. A simple tweak to the delimiter I use to separate filenames (e.g. moving away from a comma separator to |) might be a quick fix for that…
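By way of illustration, the filename parsing dance in bash looks something like the following (the filenames are made up); switching the delimiter would just mean changing the IFS value and the separator the changed-files action is asked to use:

# Split a comma-separated list of filenames into a bash array;
# a | separator would dodge any commas lurking in filenames
files='notebooks/A First Notebook.ipynb,notebooks/Another One.ipynb'
IFS="," read -a changed_files <<< "$files"
for f in "${changed_files[@]}"; do
  echo "Would test: $f"
done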

Anyway, I think I now have automation scripts that work: one to check all the notebooks in a repo on demand, and one to check just the notebooks that have changed.

On the to do list is coping with things like markdown files that load as notebooks using jupytext, as well as tweaking nbval to allow me to exclude certain notebooks.
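For the jupytext case, one likely route is to convert the markdown files to .ipynb before the test run; and until nbval itself supports exclusions, pytest’s --ignore flag offers a crude workaround (a sketch; the filenames are made up):

# Convert a markdown notebook to .ipynb so nbval can run it
jupytext --to ipynb "notebooks/A Markdown Notebook.md"

# pytest's --ignore flag gives a crude way of skipping particular paths
py.test --nbval notebooks/ --ignore="notebooks/Skip This One.ipynb"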

So, here’s the “check changed notebooks” script, set up in this case to pull in the container we use for the databases course. Note that for some reason the database services don’t seem to autorun in the container, so I need to start them manually. The action is triggered by a push; a first job then tests whether there are any .ipynb files in the commits and sets a flag to that effect. If there are notebook files in the push, a second job grabs a list of the changed notebook filenames, and those files are then tested individually using nbval. (It would be easier if we could just pass nbval the list of files we want to test. But the crappy filenames with spaces and punctuation in them, which repeatedly cause ALL SORTS OF PROBLEMS IN ALL SORTS OF WAYS, not least for students, would probably cause issues here too…)

name: nbval-partial-test
on:
  push

jobs:
  changes:
    runs-on: ubuntu-latest
    # Set job outputs to values from filter step
    outputs:
      notebooks: ${{ steps.filter.outputs.notebooks }}
    steps:
    # (For pull requests it's not necessary to checkout the code)
    - uses: actions/checkout@v2
    - uses: dorny/paths-filter@v2
      id: filter
      with:
        filters: |
          notebooks:
            - '**.ipynb'


  nbval-partial-demo:
    needs: changes
    if: ${{ needs.changes.outputs.notebooks == 'true' }}
    runs-on: ubuntu-latest
    container:
      image: ouvocl/vce-tm351-monolith
    steps:
    - uses: actions/checkout@master
      with:
        fetch-depth: 0 # or 2?
#        ref: nbval-test-tags
    - id: changed-files
      uses: tj-actions/changed-files@v11.2
      with:
        separator: ','
        files: |
          .ipynb$
    - name: Install nbval (TH edition)
      run: |
        python3 -m pip install --upgrade https://github.com/ouseful-PR/nbval/archive/table-test.zip
        #python3 -m pip install --upgrade git+https://github.com/innovationOUtside/nb_workflow_tools.git
    - name: Restart postgres
      run: |
        sudo service postgresql restart
    - name: Start mongo
      run: |
        sudo mongod --fork --logpath /dev/stdout --dbpath ${MONGO_DB_PATH}
    - name: test changed files
      run: |
        # The read may be redundant...
        # I re-ran this script maybe 20 times trying to get it to work...
        # Having discovered the shell: switch, we may
        # be able to simplify this back to just the IFS setting
        # and a for loop, without the need to set the array
        IFS="," read -a added_modified_files <<< "${{ steps.changed-files.outputs.all_modified_files }}"
        for added_modified_file in "${added_modified_files[@]}"; do
          py.test --nbval "$added_modified_file" || continue
        done
      # The IFS commands require we're in a bash shell
      # By default, I think the container may drop users into sh
      shell: bash
      continue-on-error: true

At the moment, the output report just exists in the Action report window.

The action will also pass even if errors are detected: removing the continue-on-error: true line will ensure that the action fails if an error is found.

I should probably also add an automated test to spell check all modified notebooks and at least publish a spelling report.
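A first pass at that might just run a spellchecker over the changed files inside the same loop and report, rather than fail on, what it finds (a sketch only; codespell is one candidate tool, not something we currently use):

# Install a candidate spellchecker
python3 -m pip install codespell

# Report suspect spellings in each changed notebook without failing the job
for added_modified_file in "${added_modified_files[@]}"; do
  codespell "$added_modified_file" || true
done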

The other script will check all the notebooks in the repo based on a manual trigger:

name: nbval-test
on:
  workflow_dispatch:
    inputs:
      logLevel:
        description: 'Log level'     
        required: true
        default: 'warning'
      tags:
        description: 'Testing nbval' 

jobs:
  nbval-demo:
    runs-on: ubuntu-latest
    container:
      image: ouvocl/vce-tm351-monolith
    steps:
    - uses: actions/checkout@master
    - name: Install nbval (TH edition)
      run: |
        python3 -m pip install --upgrade https://github.com/ouseful-PR/nbval/archive/table-test.zip
    - name: Restart postgres
      run: |
        sudo service postgresql restart
    - name: Start mongo
      run: |
        sudo mongod --fork --logpath /dev/stdout --dbpath ${MONGO_DB_PATH}
    - name: Test notebooks in notebooks/ path
      run: |
        py.test --nbval ./notebooks/*
      continue-on-error: true

So… with these scripts, we should be able to:

  • test updated notebooks to check they are correctly committed into the repo;
  • manually test all notebooks in a repo, e.g. when we update the environment.

Still to do is some means of checking notebooks that we want to release. This probably needs doing as part of a release process that allows us to:

  • specify which notebook subdirectories are to form the release;
  • get those subdirectories and the reference pre-run notebooks they contain;
  • test those notebooks; note that we could also run additional tests at this point, such as a spell checker;
  • clear the output cells of those notebooks; we could also run other bits of automation here, such as checking that activity answers are collapsed, etc.;
  • zip those cleared cell notebooks into a distribution zip file.
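Strung together, the heart of that release step might look something like the following sketch (the directory names, data path and zip name are all placeholders, and checks such as collapsing activity answers would need their own tooling):

# Which notebook subdirectories form the release (placeholder names)
release_dirs=("notebooks/Part 01" "notebooks/Part 02")

for d in "${release_dirs[@]}"; do
  # Test the reference pre-run notebooks
  py.test --nbval "$d"/*.ipynb
  # Additional tests, such as a spell check, could slot in here
  # Clear the output cells ready for distribution
  jupyter nbconvert --clear-output --inplace "$d"/*.ipynb
done

# Zip the cleared notebooks and any required data files
zip -r release.zip "${release_dirs[@]}" data/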

But for today, I am sick of GitHub f****g Actions.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
