Using Aggregated Local Council Spending Data for Reverse Spending (Payments to) Lookups
Knowing how much a council spent on this or that activity may contribute to transparency, but it also provides us with an opportunity for tracking the extent to which a council may provide services to other councils in exchange for payment (a “reverse spend”, if you will…)
In Inter-Council Payments and the Google Fusion Tables Network Graph, I demonstrated a recipe for graphing the extent to which county councils made payments to each other using data scraped from OpenlyLocal. But how about if we stick with a tabular view, and just work directly with the data contained in the Scraperwiki database?
One of the nice things about Scraperwiki is that it provides us with API access to the datatables in each scraper. Here’s the API for my “local spend” scraper:
We can use the API explorer to generate URLs to HTML tables containing the results of a query, or JSON or CSV feeds of the results. We can also preview the result of a query:
So what sorts of query can we run? As the database behind Scraperwiki scrapers is SQLite, the way in is through SQL like queries. SQL (Structured Query Language) is a query language for making very powerful searches over database tables (it also provides the basis for queries that treat Google spreadsheets or Google Fusion Tables as a database).
Here’s a sample of some queries we can run over the local spending data on my Scraperwiki scraper. The table I’m calling (think of a table like a particular worksheet in a spreadsheet) is publicMesh, where I have aggregated data from OpenlyLocal that relates to total spend made to various public bodies from other public bodies. supplier is the entity that received a sum from another body; payer and supplyingTo both relate to the entity that paid another body. (I guess “supplier” is not necessarily always the right term, e.g. when a payer is making a grant or award payment?)
The queries are structured as follows: select oneThing, anotherThing from 'table' says “grab the columns ‘oneThing’ and ‘anotherThing’ from the database table ‘table’”; where item relation condition lets us limit the results to those where the value of the ‘item’ column meets some condition (such as, total > 1000, which only selects results where the corresponding value in the total column is greater than 1000; or thing like ‘%term%’ which searches for rows where the contents of the ‘thing’ column contains ‘term’ (the % is like a wildcard character)).
- select supplier,supplyingTo,total from `publicMesh` where normsupplier like '%county hampshire%' and normsupplyingTo like '%arts%' order by total desc – look (crudely) for Arts Council payments to Hampshire County Council (result)
- select supplier,supplyingTo,total from `publicMesh` where normsupplier like '%county hampshire%' and normsupplyingTo like '%county%' order by total desc – see which other County Councils Hampshire is providing service to (result)
- select supplier,supplyingTo,total from `publicMesh` where normsupplier like '%county hampshire%' and normsupplyingTo like '%borough%' order by total desc – alternatively, what councils with ‘Borough” in their name has Hampshire County Council received funding from? (result)
- select supplyingTo,sum(total) as amount,count(payer) from `publicMesh` where normsupplier like '%county hampshire%' group by payer order by amount desc – if you click through on the “arts” link above, you’ll see that the Arts Council makes various payments to different entities associated with Hampshire County Council (Hampshire County Council; Arts Service, Hampshire County Council; Hampshire County Council – Schools Landscape Programme; Hampshire County Council Music Service). It’s possible aggregate the separate totals for each of this under a single “Hampshire County Council” banner (though note that it may not always make sense to do this sort of grouping operation. The sum operator adds together (sums) all the totals from items that are grouped together by the group by payer statement, that bundles together items with the same payer. count(payer) counts just how many lines from the same payer are grouped together (result)
Hmm…methinks there may be an opportunity here for an tutorial on writing SQL queries…? Maybe this would be a good context for a pathway on SocialLearn…?