I keep hearing myself in meetings talking about the “need” to get people coding, but that’s not really what I mean, and it immediately puts people off because I’m not sure they know what programming/coding is or what it’s useful for.
So here’s an example of the sort of thing I regularly do, pretty much naturally – automating simple tasks, a line or two at a time.
The problem was generating some data files containing weather data for several airports. I’d already got a pattern for the URL for the data file, now I just needed to find some airport codes (for airports in the capital cities of the BRICS countries) and grab the data into a separate file for each [code]:
In other words – figuring out what steps I need to do to solve a problem, then writing a line of code to do each step – often separately – looking at the output to check it’s what I expect, then using it as the input to the next step. (As you get more confident, you can start to bundle several lines together.)
The print statements are a bit overkill – I added them as commentary…
On its own, each line of code is quite simple. There are lots of high level packages out there to make powerful things happen with a single command. And there are lots of high level data representations that make it easier to work with particular things.
pandas dataframes, for example, allow you to work natually the contents of a CSV data file or an Excel spreadsheet. And if you need to work with maps, there are packages to help with those too. (So for example, as an afterthought I added a quick example to the notebook showing how to add markers for the airports to a map… (I’m not sure if the map will render in the embed or the gist?) That code represents a recipe that can be copied and pasted and used with other datasets more or less directly.
So when folk talk about programming and coding, I’m not sure what they mean by it. The way we teach it in computing departments sucks, because it doesn’t represent the sort of use case above: using a line of code at a time, each one a possible timesaver, to do something useful. Each line of code is a self-made tool to do a particular task.
Enterprise software development has different constraints to the above, of course, and more formalised methods for developing and deploying code. But the number of people who could make use of code – doing the sorts of things demonstrated as per the example above – is far larger than than the number of developers we’ll ever need. (If more folk could build their own single line tools, or work through tasks a line of a code at a time, we may not need so many developers?)
So when it comes to talk of developing “digital skills” at scale, I think of the above example as being at the level we should be aspiring to. Scripting, rather then developer coding/programming (h/t @RossMackenzie for being the first to comment back with that mention). Because it’s in the reach of many people, and it allows them to start putting together their own single line code apps from the start, as well as developing more complex recipes, a line of code at a time.
And one of the reasons folk can become productive is because there are lots of helpful packages and examples of cribbable code out there. (Often, just one or two lines of code will fix the problem you can’t solve for yourself.)
Real programmers don’t write a million lines of code at a time – they often write a functional block – which may be just a line or a placeholder function – one block at a time. And whilst these single lines of code or simple blocks may combine to create a recipe that requires lots of steps, these are often organised in higher level functional blocks – which are themselves single steps at a higher level of abstraction. (How does the joke go? Recipe for world domination: step 1 – invade Poland etc.)
The problem solving process then becomes one of both top-down and bottom up: what do I want to do, what are the high-level steps that would help me achieve that, within each of those: can I code it as a single line, or do I need to break the problem into smaller steps?
Knowing some of the libraries that exist out there can help in this problem solving / decomposing the problem process. For example, to get Excel data into a data structure, I don’t need to know how to open a file, read in a million lines of XML, parse the XML, figure out how to represent that as a data structure, etc. I use the
pandas.read_excel() function and pass it a filename.
If we want to start developing digital skills at scale, we need to get the initiatives out of the computing departments and into the technology departments, and science departments, and engineering departments, and humanities departments, and social science departments…