Data Analysis Packages…?

Chasing the thought of Frictionless Data Analysis – Trying to Clarify My Thoughts, I wonder: what if, in addition to the datapackage.json specification, there were a data analysis package specification, or a data analysis toolkit package specification? Perhaps the toolkit package might be something that unpacks rather like the fig.yml file described in Using Docker to Build Linked Container Course VMs, and the data analysis package a combination of a datapackage and a data analysis toolkit package: something that downloads a datapackage and opens it in the toolkit configuration specified by the data analysis toolkit package. We’d perhaps also want to be able to define a set of data analysis scripts (a data analysis script package???) relevant to working with a particular datapackage in the specified tools (for example, some baseline IPython notebooks or R/Rmd scripts?).
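To make the idea a little more concrete, here is a minimal sketch of what such a descriptor might look like and how a launcher might read it. Everything in it is an assumption made up for illustration: the field names (`datapackage`, `toolkit`, `containers`, `scripts`), the example.org URL, the container image names and the `plan` function do not come from any existing specification. The `toolkit` section is only loosely modelled on the way a fig.yml file names linked containers.

```python
import json

# Hypothetical "data analysis package" descriptor. None of these field
# names, URLs or image names come from a real specification.
ANALYSIS_PACKAGE = json.loads("""
{
  "name": "example-analysis-package",
  "datapackage": "https://example.org/data/datapackage.json",
  "toolkit": {
    "containers": {
      "notebook": {"image": "example/ipython-notebook", "ports": ["8888:8888"]},
      "rstudio": {"image": "example/rstudio", "ports": ["8787:8787"]}
    }
  },
  "scripts": ["notebooks/baseline.ipynb", "reports/baseline.Rmd"]
}
""")

# Keys this sketch treats as mandatory (an assumption, not a spec rule).
REQUIRED_KEYS = ("name", "datapackage", "toolkit")


def plan(package):
    """Return the ordered steps a launcher might take for a descriptor."""
    missing = [key for key in REQUIRED_KEYS if key not in package]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))

    # 1. Fetch the data described by the referenced datapackage.json.
    steps = [("fetch-data", package["datapackage"])]

    # 2. Start each tool container named in the toolkit section.
    for name, spec in sorted(package["toolkit"]["containers"].items()):
        steps.append(("start-container", name + "=" + spec["image"]))

    # 3. Load any baseline scripts bundled for this dataset (optional).
    for script in package.get("scripts", []):
        steps.append(("load-script", script))

    return steps


if __name__ == "__main__":
    for action, target in plan(ANALYSIS_PACKAGE):
        print(action, target)
```

The sketch only produces a plan; it does not download anything or start any containers. Keeping the data reference, the toolkit definition and the scripts as three separate sections means each could, in principle, be split out into its own package file and shared independently.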

One comment

  1. John David Smith

    John Tukey was here before, as usual. :-) See his Styles of Data Analysis (p. 11 of “Modern Data Analysis”, Academic Press, 1982). It has a very provocative diagram showing “input” plugging into “automatic data expanders” plugging into other exotic Tukey coinages.