University Funding – A Wider View

A post on the Guardian Datablog yesterday (Higher education funding: which institutions will be affected?) alerted me to the release of HEFCE’s “provisional allocations of recurrent funding for teaching and research, and the setting of student number control limits for institutions, for academic year 2012-13” (funding data).

Here are the OU figures for teaching:

Funding for old-regime students (mainstream): 59,046,659
Funding for old-regime students (co-funding): 0
High cost funding for new-regime students: 2,637,827
Widening participation: 23,273,796
Teaching enhancement and student success: 17,277,704
Other targeted allocations: 22,619,320
Other recurrent teaching grants: 3,991,473
Total teaching funding: 128,846,779

HEFCE preliminary teaching funding allocations to the Open University, 2012-13

Of the research funding for 2012-13, mainstream funding was 8,030,807, the RDP supervision fund came in at 1,282,371, along with 604,103 “other”, making up the full 9,917,281 research allocation.

Adding Higher Education Innovation Funding of 950,000, the OU’s total allocation was 139,714,060.
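As a quick sanity check that the quoted figures hang together, here is a minimal Python sketch that re-adds them. The numbers are copied from the text above; the variable names and dictionary labels are just my own shorthand, not anything HEFCE uses.

```python
# Figures copied from the OU rows quoted above (HEFCE provisional
# allocations, 2012-13). Labels are shorthand for this sketch only.
teaching = {
    "old-regime students (mainstream)": 59_046_659,
    "old-regime students (co-funding)": 0,
    "high cost funding, new-regime students": 2_637_827,
    "widening participation": 23_273_796,
    "teaching enhancement and student success": 17_277_704,
    "other targeted allocations": 22_619_320,
    "other recurrent teaching grants": 3_991_473,
}
research = {
    "mainstream": 8_030_807,
    "RDP supervision fund": 1_282_371,
    "other": 604_103,
}
heif = 950_000  # Higher Education Innovation Funding

teaching_total = sum(teaching.values())
research_total = sum(research.values())
grand_total = teaching_total + research_total + heif

print(f"teaching: {teaching_total:,}")
print(f"research: {research_total:,}")
print(f"total:    {grand_total:,}")
```

The three sums come out at the published teaching, research and overall totals, so at least the components quoted here are complete.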

So what other funding comes into the universities from public funds?

Open Spending publishes data on spending by government departments with named organisations, so we can search it for money spent by government departments with the universities (for example, here is a search for “open university”):

Given the amounts spent by public bodies on consultancy (try searching OpenCorporates for mentions of PriceWaterhouseCoopers, or any of EDS, Capita, Accenture, Deloitte, McKinsey, BT’s consulting arm, IBM, Booz Allen, PA, KPMG (h/t @loveitloveit)), university-based consultancy may come in reasonably cheaply?

The universities also receive funding for research via the UK research councils (EPSRC, ESRC, AHRC, MRC, BBSRC, NERC, STFC) along with innovation funding from JISC. Unpicking the research council funding awards to universities can be a bit of a chore, but scrapers are appearing on Scraperwiki that make for easier access to individual grant awards data:

  • AHRC funding scraper; [grab data using queries of the form select * from `swdata` where organisation like "%open university%" on scraper arts-humanities-research-council-grants]
  • EPSRC funding scraper; [grab data using queries of the form select * from `grants` where department_id in (select distinct id as department_id from `departments` where organisation_id in (select id from `organisations` where name like "%open university%")) on scraper epsrc_grants_1]
  • ESRC funding scraper; [grab data using queries of the form select * from `grantdata` where institution like "%open university%" on scraper esrc_research_grants]
  • BBSRC funding [broken?] scraper;
  • NERC funding [broken?] scraper;
  • STFC funding scraper; [grab data using queries of the form select * from `swdata` where institution like "%open university%" on scraper stfc-institution-data]
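To show how the nested EPSRC query above hangs together, here is a minimal sketch that runs the same shape of query against a throwaway in-memory SQLite database. The table names and the id/name/organisation_id/department_id columns are taken from the query itself; every other column and all of the rows are made up for illustration, and the real scraper's schema may well differ.

```python
import sqlite3


def grants_for(conn, name_pattern):
    """Return grant rows for organisations whose name matches the pattern."""
    query = """
        select * from grants
        where department_id in (
            select distinct id as department_id from departments
            where organisation_id in (
                select id from organisations where name like ?
            )
        )
    """
    return conn.execute(query, (name_pattern,)).fetchall()


# Hypothetical schema and rows, just enough to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table organisations (id integer primary key, name text);
    create table departments (id integer primary key,
                              organisation_id integer, name text);
    create table grants (id integer primary key, department_id integer,
                         title text, value integer);
    insert into organisations values (1, 'The Open University');
    insert into organisations values (2, 'Example University');
    insert into departments values (10, 1, 'Computing');
    insert into departments values (20, 2, 'Physics');
    insert into grants values (100, 10, 'Made-up grant A', 1000);
    insert into grants values (200, 20, 'Made-up grant B', 2000);
""")

# SQLite's LIKE ignores ASCII case by default, so a lower-case pattern
# still matches 'The Open University'.
rows = grants_for(conn, "%open university%")
for row in rows:
    print(row)
```

The query works from the inside out: find the organisation ids, then the departments belonging to them, then the grants belonging to those departments. Passing the pattern as a parameter rather than pasting it into the SQL string sidesteps quoting problems.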

In order to get a unified view over the detailed funding of the institutions from these different sources, the data needs to be reconciled. There are several ID schemes for identifying universities (eg UCAS or HESA codes; see for example GetTheData: Universities by Mission Group) but even official data releases tend not to make use of these, preferring instead to rely solely on institution names, as for example in the case of the recent HEFCE provisional funding data release [D’oh! This is not the case – identifiers are there, apparently. I have to admit, I didn’t check and was being a little hasty… See the contribution/correction from David Kernohan in the comments to this post…]:

For some time, I’ve been trying to put my finger on why data releases like this are so hard to work with, and I think I’ve twigged it… even when released in a spreadsheet form, the data often still isn’t immediately “database-ready” data. Getting data from a spreadsheet into a database often requires an element of hands-on crafting – coping with rows that contain irregular comment data, as well as handling columns or rows with multicolumn and multirow labels. So here are a couple of things that would make life easier in the short term, though they maybe don’t represent best practice in the longer term…:

1) release data as simple CSV files (odd as it may seem), because these can be easily loaded into applications that can actually work on the data as data. (I haven’t started to think too much yet about pragmatic ways of dealing with spreadsheets where cell values are generated by formulae, because they provide an audit trail from one data set to derived views generated from that data.)

2) have a column containing regular identifiers using a known identification scheme, for example, HESA or UCAS codes for HEIs. If the data set is a bit messy, and you can only partially fill the ID column, then only partially fill it; it’ll make life easier joining those rows at least to other related datasets…
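Here is a rough sketch of why those two suggestions help, using nothing but Python's standard csv module. The column names (hesa_code, institution, total_funding, grant_value), the codes and all of the values are invented for the example; they are not real HESA codes or real figures.

```python
import csv
import io

# Two made-up CSV releases sharing an identifier column. The second row
# of the funding data has no code, i.e. the ID column is only partly filled.
funding_csv = """hesa_code,institution,total_funding
X001,Example University,1000
,Unidentified College,500
X002,Sample Institute,750
"""

grants_csv = """hesa_code,grant_value
X001,200
X002,50
X001,25
"""


def load(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_code(funding_rows, grant_rows, key="hesa_code"):
    """Attach summed grant values to each funding row that has a code."""
    totals = {}
    for row in grant_rows:
        totals[row[key]] = totals.get(row[key], 0) + int(row["grant_value"])
    joined = []
    for row in funding_rows:
        code = row[key]
        # Rows with no identifier are kept, but left unmatched.
        grant_total = totals.get(code) if code else None
        joined.append({**row, "grant_total": grant_total})
    return joined


joined = join_on_code(load(funding_csv), load(grants_csv))
for row in joined:
    print(row["institution"], row["grant_total"])
```

The rows that carry a code join straight away, and the one without a code is simply left for manual (or name-based) reconciliation later, which is the point of partially filling the ID column rather than leaving it out.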

As far as UK HE goes, the JISC monitoring unit/JISCMU has an API over various administrative data elements relating to UK HEIs (eg GetTheData: Postcode data for HE and FE institutes), but I don’t think it offers a Google Refine reconciliation service (ideally with some sort of optional string similarity service)…? Yet?! ;-) Maybe that’d make for a good rapid innovation project???
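To give a flavour of what the string similarity part of such a reconciliation service might do, here is a minimal sketch using difflib from the Python standard library. This is not how Google Refine or JISCMU do it; the reference codes are placeholders rather than real HESA codes, and the 0.85 threshold and the stop-word list are arbitrary assumptions.

```python
import re
from difflib import SequenceMatcher

# Placeholder identifiers, not real HESA or UCAS codes.
reference = {
    "X001": "The Open University",
    "X002": "University College London",
    "X003": "University of Manchester",
}


def normalise(name):
    """Lower-case, strip punctuation and drop a few filler words."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in {"the", "of"})


def reconcile(name, reference, threshold=0.85):
    """Return (code, score) for the best match, or (None, score) if weak."""
    target = normalise(name)
    best_code, best_score = None, 0.0
    for code, ref_name in reference.items():
        score = SequenceMatcher(None, target, normalise(ref_name)).ratio()
        if score > best_score:
            best_code, best_score = code, score
    if best_score >= threshold:
        return best_code, best_score
    return None, best_score


print(reconcile("Open University (The)", reference))
print(reconcile("Imaginary Polytechnic", reference))
```

Names that fall below the threshold come back without a code, so they can be queued up for a human to look at rather than being silently mismatched.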

PS I’m reminded of a couple of related things: 1) Test Your RESTful API With YQL, a corollary to the idea that you can check your data at least works by trying to use it (eg generate a simple chart from it), mapped to the world of APIs: if you can’t easily generate a YQL table/wrapper for it, it’s maybe not that easy to use? 2) the scraperwiki/okf post from @frabcus and @rufuspollock on the need for data management systems not content management systems.

PPS Looking at the actual Guardian figures reveals all sorts of market levers appearing… Via @dkernohan, FT: A quiet Big Bang in universities


  1. Maxine

    Fascinating post, thank you. In the science research arena, universities can also get substantial amounts of funding from charities, notably the Wellcome but also significant amounts from, eg, Cancer Research UK, British Heart Foundation, etc. Much of these grants are for specific research programmes rather than to the university directly, of course, but a proportion of them goes to “overheads” which can include bricks, mortar, test-tubes (!) etc.

    • Tony Hirst

      @maxine do charities have to report the extent of their charitable donations? Are they FOIable? I guess HEIs could be FOI’d about sizeable grants from charities if they aren’t otherwise itemised in public accounts? (Hmmm… how much detail/data is available in those public accounts anyway?) Thinks: just as local authorities have to publish detailed itemisation of spend over £500, could we imagine an expectation, at least, that universities publish itemised breakdowns of (grant/award) income over £500???

    • Maxine

      Sorry, I am afraid I don’t know the technical details of reporting: I am sure one could find a list of awards given by charity, or a list of awards a particular university is given, but I do not know if this is centralised or reported in a way that is quantitatively analysable. Probably not, but it would be nice if it were. I know a bit about the scientific community (not the humanities) and I know that funders of all kinds are very interested in knowing more about the outcomes of their investments than they get at present, and I know that technical means are being devised that could help to enable this, eg via

  2. David Kernohan (@dkernohan)

    Hey Tony – for the HEFCE funding tables the HESAcode is often (but not always) in a hidden first column on the Excel sheet. It is on table 2 and table 4 of the recently released HEFCE grant tables. Not table 1 sadly, but you can always do a look up.

    Table 2 column 1 is VERY interesting, as it contains what is affectionately known as HEFCE institutional code, which is used internally to provide an ordered quasi-alphabetic list of institutions based on the most significant name (University College London filed under L for London rather than U for University). This changes from year to year, which makes building a time series immensely annoying. I often wondered if they did it deliberately.

    Hope this helps

    • Tony Hirst

      @david AH, okay, thanks for the tip; I wasn’t looking as closely as I probably should have been… rushing to get the post out;-) For the year to year list, doesn’t anyone (JISCMU perhaps?) maintain a watching brief and sameas facility to help with year on year reconciliation?

    • David Kernohan (@dkernohan)

      Tony – to clarify – HESAcodes only change if there is an institutional merger, and even then only if there’s not an obvious single code out of the two (or more) to standardise on [eg Manchester + UMIST became Manchester, so they kept the Manchester code]. For name changes the underlying code remains the same. It’s the HEFCE institutional ordering code that changes as they decide what order they want institutions in this year.

      I agree that there should be a better way of publicly documenting the HESA changes – I guess there are internal tools in HEFCE and HESA. UCAS, of course, use completely different codes for reasons best known to themselves.
