I pitched the best ideas I had, garbled by lone working, incoherent hysterical naked.

On Tuesday, I listened to myself start to pitch ideas around “reproducible reports” incoherently until the situation was saved by convener @fantasticlife who reset the proceedings by starting at the beginning and setting out the problem the ideas were supposed to address.

Yesterday, I listened to myself ranting incoherent demands for getting Jupyter notebooks installed NOW on OU servers.

In each case I went in naked: unprepared, without having set the scene for the problem as I perceived it, or for the benefits that might arise if we solved it in the way I would then advocate.

In each case, I got hysterical, fumbling words and ideas, because to see how it all makes sense (as it does in my head) you need to understand the whole process or system the problem is situated in, and how the solution addresses it.

Lone working doesn’t help much in this respect – hours spent typing meaningless words through a keyboard trying to get jumbled ideas out. No time spent in human conversation, rehearsing, trying out storylines, addressing questions and quizzical looks.

Lack of respect doesn’t help much either – “everyone else is doing it WRONG”!;-)

Whatever…

(See what I did there?!;-)

Anyway – I need to start working on better advocacy skills that try to crystallise out the problems I think the tools and approaches I want to advocate might help to address.

First step, try to match message to audience. The following, for example, could be a starting point to trying to advocate the use of Jupyter notebooks as an environment to support the teaching of programming to folk who are interested in teaching programming and the selection of environments for teaching programming…

Second step: pithy identification of a problem. For example, in the Atlantic article The Coming Software Apocalypse, James Somers quotes John Resig, who in observing student programmers realised that “the students who did well—in fact the only ones who survived at all—were those who could step through that text one instruction at a time in their head, thinking the way a computer would, trying to keep track of every intermediate calculation”.

Third step: ways of addressing the problem. One of the reasons I like Jupyter notebooks so much is that a natural way of using them is to use them to develop (and implicitly test) code by writing it a line at a time.

Writing a line of code at a time, and displaying the output after each step, means you don’t have to keep the complete state of the programme in your head.

In writing many programs, the aim is often to get from one state to another state that allows you to do something more easily. For example, it may be possible to generate a complex chart from a data set directly if the data is correctly organised.

The Jupyter notebook allows you to lay out the multiple steps required to get from your original state to the desired state and check your progress as you go along. (Implicitly, this provides a form of testing each step.)

This is good for exposition in teaching, but also in learning, as the student:

  • constructs the program one line of code at a time;
  • checks the output resulting from that line;
  • compares the new state with the previous state to check the correct sort of operation or transformation has been applied.
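As a rough sketch of those three steps, here is the sort of thing I mean, with each commented chunk standing in for one notebook cell. The data and variable names are made up for illustration, and in a real notebook you would probably end each cell with the bare variable name rather than a print() call.

```python
# Cell 1: start with some messy raw data (made-up example values)
raw = [" 12", "7 ", "n/a", "30", ""]
print(raw)

# Cell 2: strip stray whitespace, then look at the new state
stripped = [item.strip() for item in raw]
print(stripped)

# Cell 3: keep only the items that are whole numbers
digits_only = [item for item in stripped if item.isdigit()]
print(digits_only)

# Cell 4: convert the surviving strings to integers
values = [int(item) for item in digits_only]
print(values)

# Cell 5: with the data in the right state, the summary is a one-liner
total = sum(values)
print(total)
```

Comparing the output of each cell with the one before it is the check: if cell 3 had thrown away "30" as well as "n/a", you would see it straight away.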

Having worked out the programme, which is a series of steps, and checked its implementation in code by visualising the intermediate state at each step, the student may then start to package the code contained in several cells in a single cell that contains many steps – and check that works correctly, essentially treating the cell containing multiple lines of code as a single line of more complex code. The next step may be to package those multiple lines of code that are bundled into one cell into a single function that allows those cells to be executed directly as a single line of code.
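A minimal sketch of that packaging progression might look something like the following. The word-list data and the function name tidy_words are hypothetical, chosen only to keep the example short.

```python
words = ["Pear ", " apple", "", "Fig", " cherry "]  # made-up example data

# Stage 1: one step per cell, inspecting each intermediate state
trimmed = [w.strip() for w in words]
non_empty = [w for w in trimmed if w]
stepwise = sorted(w.lower() for w in non_empty)
print(stepwise)

# Stage 2: the same steps bundled into a single function,
# so the whole recipe runs as one line of code
def tidy_words(items):
    """Trim, drop empties, lower-case and sort a list of words."""
    trimmed = [w.strip() for w in items]
    non_empty = [w for w in trimmed if w]
    return sorted(w.lower() for w in non_empty)

packaged = tidy_words(words)

# The stepwise result acts as the reference for the packaged version
print(packaged == stepwise)
```

The comparison on the last line is the implicit test: the function is only trusted because it reproduces what the line-at-a-time cells already showed.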

All the while, the original line-at-time code cells from further up the notebook act as a reference, and stepwise documentation supporting visual testing, of how each line of code works and the state changes it produces (that is, the input state it expects and the output state it produces).

Sigh… I still don’t see why folk don’t grok that?:-(

PS From the same Atlantic article: Bret Victor’s frustration that “when someone wanted to do something interesting with a computer, they had to write code”. This is one area where I know I see the world very differently from computing colleagues: they want to teach programming; I want to help people use computers to get stuff done. But where I differ from someone like Bret Victor is that I see value in people having access to the single lines of code that do things and the intermediate states that arise because it facilitates a “scripting” approach to programming.

Note that I don’t mean scripting vs programming in the sense of compiled versus interpreted programmes; I mean it more in the sense of scripts of linear sequences of instructions that can be used to perform a particular task.

In the notebook context, I see each line of code, or at least each code cell, as the line of code, plus its output (or at least, an output that depicts any resulting change in state, such as the new value of a parameter if the line of code updates the value of that parameter). There is also an expectation of what the input to that cell block might look like, in the form of outputs from previous code blocks that initiate the state expected as input to the code cell. The result is that an executed notebook can be read as a narrative trace that shows:

  • an arrangement of various lines of code,
  • the effect of applying each line of code in terms of how it transforms the output or outputs from previous cells into the output of the current cell.

In the Scratch inspired visual programming environments beloved of many of my colleagues, differently shaped and coloured blocks limit how programming blocks can be joined together to ensure syntactic correctness. The following screenshot from another Block.ly inspired programming environment, Open RobertaLab, further groups different sorts of functional blocks in a colour coded command palette:

Whilst these environments do often provide a code view, it’s often not possible to keep track of the intermediate state or the state transformations applied by each block:

(I’m not sure if OpenBuild(?), which I’m guessing is the OU’s fork of MIT Scratch, that’s about to be used in our new level 1 course, offers a code view?)

That is, whilst the block style interfaces help maintain syntactic correctness, they don’t allow you to monitor changes in state from applying a particular block; the notebook style, by contrast, does let you inspect the effect of applying a particular operation. In many ways, the notebook view is like an exploded step tracer that lets you keep track of the state of a linear programme as it works its way through a series of sequential steps.
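To make the "exploded step tracer" idea a bit more concrete, here is a toy sketch in plain Python. The trace function and the steps list are my own illustrative inventions, not anything provided by Jupyter; the notebook gives you this view for free simply by showing the output under each cell.

```python
def trace(state, steps):
    """Apply each step in turn, recording the state after every one."""
    history = [("start", state)]
    for label, step in steps:
        state = step(state)
        history.append((label, state))
    return history

# A linear "script": each entry is a label plus a one-line transformation
steps = [
    ("lower case", str.lower),
    ("split into words", str.split),
    ("drop short words", lambda ws: [w for w in ws if len(w) > 3]),
    ("count what is left", len),
]

history = trace("The Quick brown fox Jumps", steps)
for label, state in history:
    print(f"{label}: {state!r}")
```

Each printed line plays the part of a cell output: the state going in is the line above, and the state coming out is the line itself.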

That’s another feature of what I’m calling ‘scripting’ – information processing recipes that get a thing done by transforming stuff into other stuff.

(I can hear my colleagues now – “Ah yes, all notebooks are good for is one-shot linear programmes”. Whatever. If a line of code calls a function repeatedly, you don’t necessarily need to see the output of the function at each step (though you could display that if you wanted to); what is important is that you can see how the function works in a one shot mode (and maybe test it with various parameters) and also see what the overall effect is by applying it however many times in the particular cell that calls it repeatedly.)
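By way of a small sketch, using a made-up temperature conversion function: try the function in one shot mode first, poke it with a few parameters, and only then call it repeatedly in a single cell.

```python
def to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Cell 1: one shot, to see how the function behaves
boiling = to_fahrenheit(100)
print(boiling)

# Cell 2: test it with various parameters
freezing = to_fahrenheit(0)
crossover = to_fahrenheit(-40)
print(freezing, crossover)

# Cell 3: apply it repeatedly and look only at the overall effect
readings = [0, 20, 100]  # made-up example readings
converted = [to_fahrenheit(c) for c in readings]
print(converted)
```

The loop in the last cell hides the individual calls, but the earlier cells have already shown what each call does.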

Part of the reason I’m in favour of having lines-of-code-that-do-things to hand, as in a Jupyter notebook, is that building graphical user interfaces is hard. Even if you can build a nice graphical programming environment to support the development of a particular sort of application or support the user in performing a particular sort of task (tidying up a messy data set and generating a chart from it), the GUI elements are themselves going to trigger, and perhaps insert particular parameter values into, lines of code.

For example, by going to a menu and selecting an item, you are triggering the execution of a particular block of code a line at a time. By ticking a checkbox, or checking a radio button, particularly in a responsive interface, you are setting the value of a parameter that is passed to a line of code for it to do something with.

What the notebook does is let you arrange cell blocks of code that can perform similar actions in terms of manipulating state to the actions triggered by invoking those interactive user elements. Rather than “select that menu option, check that box” in a graphical interface, you “use the block of code that would be triggered by that menu” and “use a block of code to set the value that would be updated by the checkbox and apply it”.
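As a hedged sketch of that correspondence: the draw_chart function, its show_values and sort_bars options, and the sales data below are all hypothetical stand-ins for whatever a real charting GUI would be driving behind the scenes.

```python
def draw_chart(data, show_values=False, sort_bars=False):
    """Return a crude text bar chart, one line per item."""
    items = list(data.items())
    if sort_bars:
        items = sorted(items, key=lambda pair: pair[1], reverse=True)
    lines = []
    for name, value in items:
        line = f"{name:<6} {'#' * value}"
        if show_values:
            line = f"{line} {value}"
        lines.append(line)
    return lines

sales = {"apples": 3, "pears": 5, "plums": 2}  # made-up example data

# "Tick the 'show values' box" becomes a visible assignment
show_values = True
# "Tick the 'sort bars' box" likewise
sort_bars = True

# "Select the draw chart menu item" becomes the line that runs the code
chart = draw_chart(sales, show_values=show_values, sort_bars=sort_bars)
print("\n".join(chart))
```

Unlike the ticked checkbox, the assignments stay on the page, so anyone reading the notebook can see which settings produced the chart.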

4 comments

  1. Douglas Blank

    Thanks for the insightful self-reflection, and laying out the struggle. I, too, am wrestling with some of the same issues. Some computer scientists would agree with your methods. I guess my view is that people will learn about CS because they are able to write these scripts. Some CS types hate the notebook for one reason: it looks like a linear progression of ideas, but as you develop the notebook story, there are “leaks” in the dependencies. For example, if I define a function “foo”, but later name it “bar”, “foo” is still around in the hidden state. I think that can be fixed with clever dependency tracking (eg, rename foo to bar, and foo is removed from env). This will have to be done on a language-by-language basis, I suspect.

    • Tony Hirst

      Hi Doug – thanks for the comment.

      I think another issue folk have is with “false state” reports, eg where I display the value of a variable from one cell then update it in another, but from a reading of the notebook you might still imagine the variable had the original (still displayed) value.

      Another issue is the way that a user can execute cells in a non-linear fashion (although this is flagged to the reader by out-of-sequence cell run numbers).

      For me, notebooks are a medium that allow me to engage in an interactive conversation with an evolving programme.

      I think there are corollaries to visualisation too, in particular exploration vs “presentation” graphics. If I am exploring a data set interactively, the fact that I am part of the process means that if axes are zoomed in to part of the range or I don’t label things correctly, it doesn’t matter in the way it might if I were producing a properly titled, labelled presentation graphic for a formal report.

      Another thing I think folk who haven’t used notebooks miss out on is the fact that resetting the kernel and running a notebook from zero state can become part of the process quite easily (which is something that helps implicit testing of the code).

      • Doug Blank

        Agreed! I think a “dependency graph” would help with most of these issues: if you change something (function name, variable value) that has an effect downstream, all of that downstream output should disappear/become un-executed. That would help make it clear where there are dependencies.

        • Tony Hirst

          @Doug which raises the question – is there an extension for that?! (Actually, may not be that simple in UI terms – e.g. I may want a reference for the sort of thing a previous output looked like. Grey it out, or do a diff between last displayed value and current value, or updated value after running a particular cell if several cells change the value?)
