Sharing the Data Load

A few weeks ago, I put together a post listing a few Data Journalism Units on GitHub. These repos (that is, repositories) are being used to share code (for particular interactives, for example), data, and analysis scripts. They’re starting to hint at ways in which support for public, reproducible local data journalism might emerge from the development of (standardised) data repositories and the reproducible workflows built around them.

Here are a handful of other signals I’ve come across in the last few weeks that I think support this trend (if they haven’t appeared in your own feeds, a great shortcut to many of them is via @digidickinson’s weekly Media Mill Gazette):




And here’s another one, from today: the Associated Press is putting together a pilot with a data publishing platform “to help newsrooms find local stories within large datasets” (Localizing data, quantifying stories, and showing your work at The Associated Press). I’m not sure what the pilot will involve, but the rationale sounds interesting:

Transparency is important. It’s a standard we hold the government to, and it’s a standard we should hold the press to. The more journalists can show their work, whether it’s a copy of a crucial document or the data underlying an analysis, the more reason their audience has to accept their findings (or take issue with them in an informed way). When we share our data and methodology with our members, those journalists give us close scrutiny, which is good for everyone. And when we can release the data more broadly and invite our readers to check our work, we create a more secure grounding for the relationship with the reader.

:-) S’what we need… Show your working…