Using CSV Docs As a Database

Earlier today, my twinterwebs were full of the story about the COI announcing a reduction in the number of government websites:

Multiple sources of news... not...

(The told the story differently, at least…)

A little while later(?), a press released appeared on the COI website: Clamp down on Government websites to save millions, although by that time, via @lesteph, I’d found a copy of the interim report and a CSV dataset Reporting on progress: Central Government websites 2009-10.

So what can we do with that CSV file? How about turning it into a database, simply by uploading it to Google docs? Or importing it live (essentially, synching a spreadsheet with it) from the source URL on the COI website?

<a href="COI website report - costs column headings” title=”Photo Sharing”>Import csv into google docs

I had a quick play pruning the code on my Guardian Datastore explorer, and put together this clunky query tool that lets you explore the COI website data as if it was a database.

COI Website review CSV data explorer

The explorer allows you to bookmark views over the data, to a limited extent (the ascending/descending views aren’t implemented:-(, so for example, we can see:

websites with a non-zero “Very Poor Editorial” score
Age profile of visitors (where available)
Costs

(Feel free to share bookmarks to other views over the data in the comments to this post.)

Note that the displayed results table is an activie one so you can click on column headings to order the results by column values.

SOrting a table by colun

Note that there seem to be issues with columns not being recognised as containing numerical data (maybe something to do in part with empty cells in a column?), which means the chart views don’t work, but this page is not trying to do anything clever – it’s just a minimal query interface over the visualisation API from a spreadsheet. (To build a proper explorer for this dataset, we’d check the column data types were correct, and so on.)

Looking at the app, I think it’d probably useful to display a “human readable” version of the query too, that translates column identifiers to column headings for example, but that’s for another day…

GOTCHAS: use single quote (‘) rather than double quote (“) in the WHERE statements.

Related: Using Google Spreadsheets as a Database with the Google Visualisation API Query Language
Using Google Spreadsheets Like a Database – The QUERY Formula

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

%d bloggers like this: