Following on from Thinking About Things That Might Be Autogradeable or Useful for Automated Marking Support, via Chris Holdgraf I get something else that might be worth considering both for profiling notebooks as well as assessing code.
The response came following an idle tweet I’d posted wondering “If folk can read 600wpm (so 10wps), what’s a reasonable estimate for reading/understanding code blocks eg in jupyter notebook?”; if you’re trying to make sense of a code chunk in a notebook, I’m minded to assume that the number of lines may have an effect, as well as the line length.
Context for this: I’ve started mulling over a simple tool to profile / audit our course notebooks to try to get a baseline for how long it might reasonably take for a student to work through them. We could instrument the notebooks (eg using the nbgoogleanalytics or jupyter-analytics extensions to inject Google Analytics tracking codes into notebooks) and collect data on how long it actually takes, but we don’t. And whilst our course compute environment is on my watch, we won’t (at least, not using a commercial analytics company, even if their service is “free”, even though it would be really interesting…). If we were to explore logging, it might be interesting to add an open source analytics engine like Matomo (Piwik, as was) to the VM and let students log their own activity… Or maybe explore jupyter/telemetry collection with a local log analyser that students could look at…
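For what it’s worth, here’s a minimal sketch of the sort of baseline profiler I have in mind, using nothing but the stdlib to walk a notebook’s JSON. The reading rates are pure assumptions (the 10 words-per-second figure from the tweet, and a guessed 5 seconds per line of code), not measurements:

```python
import json

# Assumed rates -- these are guesses, not measured values:
# ~600wpm (10 words/sec) for markdown prose, and 5 seconds per code line.
WORDS_PER_SECOND = 10
SECONDS_PER_CODE_LINE = 5

def estimate_reading_time(nb):
    """Crude reading-time estimate (in seconds) for a notebook dict."""
    total = 0
    for cell in nb.get("cells", []):
        source = "".join(cell.get("source", []))
        if cell.get("cell_type") == "markdown":
            total += len(source.split()) / WORDS_PER_SECOND
        elif cell.get("cell_type") == "code":
            # Count only non-blank code lines
            lines = [line for line in source.splitlines() if line.strip()]
            total += len(lines) * SECONDS_PER_CODE_LINE
    return total

# Usage:
# with open("notebook.ipynb") as f:
#     print(estimate_reading_time(json.load(f)))
```

Obviously the per-code-line figure is doing a lot of work there, which is exactly the question the tweet was asking…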
So, Chris’ suggestion pointed me towards wily, “an application for tracking, reporting on timing and complexity in Python code”. Out of the can, wily can be used to analyse and report on the code complexity of a git repo over a period of time. It also looks like it can cope with notebooks: “Wily will detect and scan all Python code in .ipynb files automatically”. There also seems to be the ability to “disable reporting on individual cells”, so maybe I can get reports on a per-notebook or per-cell basis?
My requirement is much simpler than tracking the evolution of code complexity over time, however: I just want to run the code complexity tools over a single set of files, at one point in time, and generate reports on that. (Thinks: letting students plot the complexity of their code over time might be interesting, eg in a mini-project setting?) However, from the briefest of skims of the wily docs, I can’t fathom out how to do that (there is support for analysing across the current filesystem rather than a git repo, but that doesn’t seem to do anything for me… Is it looking to build a cache and search for diffs? I DON’T WANT A DIFF! ;-)
There is an associated blog post that builds up the rationale for wily here — Refactoring Python Applications for Simplicity — so maybe by reading through that, and perhaps poking through the wily repo, I will be able to find an easy way of using wily, somehow, to profile my notebooks…
But the coffee break I gave myself to look at this and give it a spin has run out, so it’s consigned back to the back of the queue I’ve started for this side-project…
PS From a skim of the associated blog post, wily‘s not the tool I need: radon is, “a Python tool which computes various code metrics, including raw metrics (SLOC (source lines of code), comment lines, blank lines, etc.), Cyclomatic Complexity (i.e. McCabe’s Complexity), Halstead metrics (all of them), the Maintainability Index (a Visual Studio metric)”. So I’ll be bumping that to the head of the queue…
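As a quick reminder of what Cyclomatic Complexity actually measures, here’s a toy stdlib-only approximation — this is emphatically not radon (radon’s `cc_visit()` does the job properly, handling `elif`, boolean operands, comprehensions and so on), just an illustration of the idea that complexity is 1 plus the number of decision points in the code:

```python
import ast

# Node types treated as decision points in this toy version;
# radon's real visitor is rather more careful than this.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def crude_complexity(source):
    """Rough McCabe-style score: 1 + number of branch nodes in the AST."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))

src = """
def clip(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
"""
print(crude_complexity(src))  # two if statements -> 3
```

Running something like that over the source of each notebook code cell would at least give a per-cell complexity number to play with.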
PPS Here’s another tool that could be handy for static analysis: [returntocorp/semgrep](https://github.com/returntocorp/semgrep); originally developed for finding security vulnerabilities, at least one set of folks have found a way to repurpose it for educational testing purposes: [Autograding code structure using CodeGrade and Semgrep](https://www.codegrade.com/blog/autograding-code-structure-using-codegrade-and-semgrep)
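To give a flavour of the sort of structural check that post describes, here’s a minimal semgrep rule; the rule id and message are made up for illustration, but the pattern syntax is semgrep’s own:

```yaml
rules:
  - id: no-bare-print        # made-up id for this example
    pattern: print(...)
    message: Use the logging module rather than bare print() calls
    languages: [python]
    severity: WARNING
```

You’d run it with something like `semgrep --config rule.yaml mycode.py`, and in a marking context the presence or absence of matches becomes the thing you test against.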