A couple of weeks ago I posted a demo of how to automate the production of a templated report (catchment for GP practices by LSOA on the Isle of Wight) using
knitr (Reporting in a Repeatable, Parameterised, Transparent Way).
Today, I noticed another report, with data, from the House of Commons Library on Superfast Broadband Coverage in the UK. This reports at the ward level rather than the LSOA level the GP report was based on, so I wondered how easy it would be to reuse the GP/LSOA code for a broadband/ward map…
After fighting with the Excel data file (metadata rows before the header and at the end of the table, cruft rows between the header and data table proper) and the R library I was using to read the file (it turned the data into a tibble, with spacey column names I couldn’t get to work with
ggplot, rather than a dataframe – I ended saving to CSV then loading back in again…), not many changes were required to the code at all… What I really should have done was abstracted the code in to an R file (and maybe some importable Rmd chunks) and tried to get the script down to as few lines of bespoke code to handle the new dataset as possible – maybe next time…
I also had a quick play at generating a shiny app from the code (again, cut and pasting rather than abstracting into a separate file and importing… I guess at least now I have three files to look at when trying to abstract the code and to test against…!)
So this has got me thinking – what are the commonly produced “types” of report or report section, and what bits of common/reusuble code would make it easy to generate new automation scripts, at least at a first pass, for a new dataset?