Confused Fragments About Open Data Economics…

Some fragments…

the public paid for it so the public has a right to it: the public presumably paid for it through their taxes. If companies that use open public data don’t fully and fairly participate in the tax regime of the country that produced the data, then they didn’t pay their fair share for access to it.

data quality will improve: with open license conditions that allow users to take open (public) data and do what they want with it, without any requirement to make derived data available in bulk form under an open data license, how does the closed bit of the feedback loop work? I’ve looked at a lot of open public data releases on council and government websites and seen some companies making use of that data in presumably a cleaned form (if it hasn’t been cleaned, then they’re working with a lot of noise…). But if they have cleaned and normalised the data, have they provided this back in an open form to the public body that gifted them access to it? Is there an open data quality improvement cycle working there? Erm… no… I suspect if anything, the open data users would try to sell the improved quality data back to the publisher. This may be their sole business model, or it may be a spin-off as a result of using the (cleaned and normalised) data for some other commercial purpose.

4 comments

  1. Owen Boswarva (@owenboswarva)

    Re “the public paid for it so the public has a right to it”, the key point is that production of the data has already been funded to deliver a public task. If the marginal cost of reproduction for reuse is nil (or near enough) then the “free rider” problem doesn’t occur. There is a collective interest in maximising reuse of the data, including by overseas corporations, because that increases familiarity and interoperability.

    Re improvements to data quality, IMO that’s never been a credible economic argument for open data in general. Any feedback loop would have to be created as an additional outreach effort by the publisher; it is not a natural feature of open data itself.

    • Tony Hirst

      @owen: “the key point is that production of the data has already been funded to deliver a public task” – there is an ongoing cost associated with maintaining a data set and its publication (an archive problem) and there are also costs associated with continuing the production of the data. Or is the idea that public bodies start paying private corps to collect and publish what was once the publicly collected data, because there is neither a financial return (direct through sales or indirect through taxation) nor a quality return?

      • Owen Boswarva (@owenboswarva)

        Of course there’s an ongoing cost associated with maintaining a dataset. But if the dataset is necessary to deliver a public task, that cost arises irrespective of whether the dataset is also released for reuse as open data. Similarly the costs of continuing to produce that dataset.

        In general, costs to release and archive data for reuse (i.e. beyond the public task) are nugatory. That’s what I mean by the marginal cost of reproduction being nil. There are some types of data — usually very large datasets or high-volume feeds — where a business case is required to justify the additional support costs; but those are not the norm.

        IMO it is not any part of the economic case for open data that public authorities should continue to produce or fund datasets that they don’t first require to deliver a public task. The “public paid for it” argument assumes that open data release is secondary to the core purposes for which the data is produced.

        I can imagine circumstances in which it might be in the national interest for government to fund the production and release of open datasets that are not required specifically for the purposes of government — as it already does with physical infrastructure. However that’s rather different from the existing open data model.

        • Tony Hirst

          @Owen Agreed, I apologise for being short in my previous response (long day!). What the notes in the post referred to were observations relating to (at least once) commonly given arguments around open data that I spotted in a couple of readings today. As I spot them, I keep meaning to start logging my reactions to them, which is what I did today. The intention then is to pull them together into something more considered.

          I have no argument with the idea that eg local gov should *additionally* publish as open data the data that they have been collecting as part of their other duties. But retaking the stance from the post, the benefits of the data release should go to the members of the society that the council serves. And if companies act in ways that go against the norms of the society, why should they benefit?

          If we start looking at data quality improvement loops, added value/financial benefit loops etc where there is opendata involved, I do start to wonder what the picture looks like and where the value is actually accruing…?