With “data-driven decision making” and “evidence-based policy making” both in fashion at the moment, it’s quite likely that we’ll see more stories along the lines of Critique of Reinhart & Rogoff Garners International Attention.
The story so far: an academic paper used to bolster austerity policy arguments was shown to include errors, frame the data in a particular way, and make a hardline point around an arbitrary threshold value, as described, for example, in The 90% question [The Economist] or Reinhart-Rogoff Revisited: Why We Need Open Data in Economics [Open Economics blog]. The story of the graduate student project that led to the mistakes being discovered is described in the BBC News Magazine: Reinhart, Rogoff… and Herndon: The student who caught out the profs.
Here’s the BBC’s “More Or Less” take on it (personal archive copy):
Some observations on related matters:
- if you ever work with data, you’ll know that you have to be selective about what data you include in an analysis; sometimes data isn’t available, sometimes it comes from different sources that use slightly different interpretations or definitions, sometimes you throw data away because it clutters a storyline you are trying to explore. For me, data analysis is a conversation with the data. The conversation is caveated and explores a particular issue “all other things being equal”, which they often aren’t; but we build that into our caveats as we tell stories to ourselves. Those caveats, qualifiers and particular rationales maybe don’t get communicated to a wider audience, or their consequences aren’t fully appreciated by that audience, even though they are by the analyst/researcher (“you took it out of context”, “but it was obviously a joke”, etc etc). You can look at a sculpture from many different directions. Same with data.
- things get picked up by official reports and become “truth”, sometimes losing their flaky or biased provenance as they get “washed” through being cited in more and more “weighty” reports or publications (eg Sleight of Hand and Data Laundering in Evidence Based Policy Making). Others become zombie statistics, refusing to die and repeatedly being cited as evidence even after they have been debunked.
- I used to share an office with psychology researchers, whose day was made if they had an experimental result “significant at the 5% level”, and crushed if the significance fell outside it. (What are good references containing arguments against null hypothesis significance testing?). Tell yourself a story, then make the judgement. Hard threshold values can make for dodgy decisions (one reason I like the notions of fuzzy logic and hysteresis). If your results have accuracy bounds, rerun the numbers a few times with errors drawn from within those tolerances (there’s a quick sketch of this after the list). Can you make an insignificant result significant, at least once, and then quote that? #FFS There’s no truth there… Or maybe there are lots of sort-of truths? (See also: Ben Goldacre’s Bad Pharma, especially the chapters on “Missing Data” and “Bad Trials”.)
- spreadsheets can be really hard to debug. I don’t think they have a “View Source” option that just shows all the formulae and outlines their ranges, as well as maybe highlighting fixed cells that have been overwritten into otherwise calculated cell ranges, do they? (A rough attempt at such a thing appears in the second sketch after the list.) See also: EuSpRIG – the European Spreadsheet Risks Interest Group, which I guess has come to many folks’ attention for the first time over the last week, and whose summer conference enrolment numbers may well take a leap, not least from press access requests?!
- there are shed loads of papers out there in the “peer reviewed” literature, though that’s not to say the peer reviewers actually tried to replicate the results.
- although this has slipped onto the backburner a bit (where does the time go?! :-( ), I tried to frame my own personal learning journey into the world of statistics around replicating various academic papers about motor sport (and in particular, Formula One) results and timing data. I’ve only managed to attack one so far (here and here), but already it threw up some interesting observations about the data that was used to generate correlations, such as whether you rank cars that weren’t classified. The use of significance tests also seemed a bit pointless to me. Anyway, the point is this: there are a load of papers out there on a whole range of topics that might provide a good basis for “textbook examples” to help folk learn how to use particular analysis tools or techniques. As well as (supposedly!) describing the method, many papers include “answers” that you can use to check your working (or theirs!). In much the same way that the School of Data is trying to develop the idea of “Data Expedition” style uncourses, where folk come together to find, analyse and tell stories from datasets in particular topic areas, how about a notMOOC pedagogical style based around working through and replicating the findings of particular published academic papers, which might also involve learning the precursor stuff you need to know in order to make sense of the paper or try out its analyses?
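By way of illustration of the “rerun the numbers within the tolerances” point above, here’s a minimal sketch (made-up numbers, not from any real study) that jitters two hypothetical sets of measurements within a stated tolerance and counts how often a simple t-test comes out “significant at the 5% level”:

```python
# A minimal sketch of "rerun the numbers within the error bounds":
# perturb two made-up measurement series by their stated tolerance and
# see how often a t-test crosses the 5% threshold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical measurements, each reading good to +/- tolerance
group_a = np.array([10.1, 9.8, 10.4, 10.0, 9.9, 10.2])
group_b = np.array([10.3, 10.6, 10.2, 10.5, 10.4, 10.1])
tolerance = 0.3

p_values = []
for _ in range(1000):
    # jitter each reading uniformly within its tolerance band
    a = group_a + rng.uniform(-tolerance, tolerance, size=group_a.size)
    b = group_b + rng.uniform(-tolerance, tolerance, size=group_b.size)
    p_values.append(stats.ttest_ind(a, b).pvalue)

p_values = np.array(p_values)
print(f"fraction of reruns 'significant' at 5%: {(p_values < 0.05).mean():.2f}")
print(f"p-value range: {p_values.min():.3f} to {p_values.max():.3f}")
```

If the “significant” fraction is anywhere between 0 and 1, the hard 5% threshold is doing a lot of work that the measurement accuracy can’t really support.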
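And on the spreadsheet “View Source” point, here’s a rough sketch of a poor man’s version using the Python openpyxl library: it dumps every formula cell it finds and flags literal values sitting in columns that otherwise contain formulae (the filename and the overwrite heuristic are made up for the example):

```python
# A rough "View Source" for a spreadsheet: list every formula cell, then
# flag literal values that sit in columns which mostly hold formulae
# (a crude proxy for "fixed cells overwritten into calculated ranges").
from openpyxl import load_workbook

wb = load_workbook("some_spreadsheet.xlsx", data_only=False)

for ws in wb.worksheets:
    formula_cols = set()
    literal_cells = []
    for row in ws.iter_rows():
        for cell in row:
            if cell.value is None:
                continue
            if cell.data_type == "f":  # formula cell: value holds the formula text
                print(f"{ws.title}!{cell.coordinate}: {cell.value}")
                formula_cols.add(cell.column_letter)
            else:
                literal_cells.append(cell)
    # crude heuristic: literal values in columns that also contain formulae
    for cell in literal_cells:
        if cell.column_letter in formula_cols:
            print(f"  possible overwrite? {ws.title}!{cell.coordinate} = {cell.value!r}")
```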
Hmmm…