Notes on the JISC Grant Funding Call 8/11: “Course Data: Making the most of Course Information” Capital Programme – Call for Letters of Commitment
This post builds on quick commentaries around other reports in the area of Higher Education course data: Immediate Thoughts on the “Provision of information about higher education” and Getting Access to University Course Code Data (or not… (yet…)). It doesn’t necessarily represent my own opinions, let alone those of my employer.
1. The Joint Information Systems Committee (JISC) and the Higher Education Funding Council for England (HEFCE) invite English Universities and FE colleges (teaching over 400 HE FTEs) to become involved in a new programme of work which will help prepare the sector for increasing demands on course data.
3. Funding is available for projects starting from Monday 12 September 2011 for an initial period of approximately three months. Projects selected to go forward into Stage 2 will continue for an additional 12 to 15 months. All projects must be complete by 29 March 2013.
So how does this fit with the timeline for HEFCE Key Information Set (KIS) development if the called for work is relevant to that? (Note: HEFCE makes available much of the monies disbursed by JISC, and HEFCE is managing the KIS work directly.)
|As soon as possible and not later than the end of September 2011||Technical guidance published by HEFCE|
|January to March 2012||Submission system open for KISs to be published in September 2012: Institutions submit their data to
|June to early July 2012||2012 NSS and DLHE data available to HEFCE|
|July to August 2012||HEFCE merges data submitted by institutions with 2012 NSS and DLHE data. Institutions quality check and sign off their final
|September 2012||KISs available for institutions to upload. All KISs to be accessible via institutional web-sites by the end of the month|
[HEFCE: Provision of information about higher education]
So given the timings, the JISC second phase work looks as if it is supporting processes relating to, and publication of, different sorts of data to the KIS data, although phase 1 work may be relevant to KIS releases?
10. There are 3 main drivers for making it easier for people to find and compare courses:
– prospective fee paying students want to know more about the academic experience a course will provide and be able to compare this with other courses;
– better informed students are more likely to choose a course that they will complete, and be more motivated to achieve better results;
– increased scrutiny by quality assurance agencies and the Government’s requirement for transparency of publicly funded bodies.
11. JISC have made it easier for prospective students to decide which course to study by creating an internationally recognised data standard for course information, known as XCRI-CAP which is conformant with the new European standard for Advertising Learning Opportunities. This will make transferring and advertising information about courses between institutions and organisations more efficient and effective. Placing this data at a consistent COOL URI makes it easier to find.
So there are two end-user groups in mind for the course related information: prospective students, and the scrutineers. XCRI-CAP relates to the publication of information describing at a high level the subject content of a course, rather than the sorts of “metadata” around courses that the KIS will provide. If we were building a course comparison website, the XCRI-CAP data might provide course descriptions relating to a course, whereas the KIS data would provide student satisfaction ratings, teaching hours, assessment strategies, graduate employment rates and salaries. Pricing related information might be common to both sets?
Example of what the KIS display might look like.
Within the university website, developers will be required to identify which course a page relates to, and then call in the appropriate KIS widget from HEFCE or its agent, presumably by passing parameters relating to: institution identifier; course identifier.
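To make the "passing parameters" idea concrete, here is a minimal sketch of how a course page template might build the widget call. Everything specific in it is an assumption on my part: the widget address, the parameter names `institution` and `course`, and the use of an iframe are all hypothetical, since no widget interface is described in the documents discussed here.

```python
import html
from urllib.parse import urlencode

# Hypothetical widget address: no real endpoint is given in the source documents.
WIDGET_BASE = "https://widget.example.org/kis"


def kis_widget_url(institution_id, course_id):
    """Build the widget URL from the two identifiers a course page would need to know."""
    # Parameter names are assumptions made for illustration only.
    query = urlencode({"institution": institution_id, "course": course_id})
    return f"{WIDGET_BASE}?{query}"


def kis_widget_embed(institution_id, course_id):
    """Return an HTML fragment that a course page template could include."""
    # Escape the URL so the '&' in the query string is valid inside an attribute.
    src = html.escape(kis_widget_url(institution_id, course_id), quote=True)
    return f'<iframe src="{src}" title="Key Information Set"></iframe>'


print(kis_widget_embed("inst-a", "F300"))
```

The point is simply that the page needs to know exactly two things about itself: which institution it belongs to, and which course it describes.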
In order to display both XCRI-CAP style data and KIS data on the same third party web page, the third party will need to be able to identify the course identifier and the university identifier. It will also need a way of identifying which course codes are offered by each institution. In order to satisfy requests from potential applicants searching for a particular topic anywhere in the country*, the third party would ideally have access to an index (or at least a comprehensive list either of courses for each institution, or of institutions by course) that allows it to identify and return the set of (institution, course) pairs for which the course satisfies the search term. (Alternatively, for every request, the third party could query every university separately for related courses, aggregate these responses, and then annotate each result with a link to the corresponding KIS information, or its widget.) If the aggregator were to offer a service whereby potential applicants could rank each result according to one or more KIS data elements, it would need to associate the KIS data for each identified course with the corresponding (institution, course) pair, and then use this aggregated data set to present the results to the end user. Again, this could be achieved by making separate requests to the KIS information server, once for each (institution, course) pair; or it could draw on its own index of this data if the information was openly licensed.
* when thinking about course selection, I often have four scenarios in mind: a) I know what course I want to do and where I want to do it; b) I know where I want to go but don’t know what course to do; c) I know what course I want to do, but don’t know where to do it; d) I don’t know what course to do or where to do it…
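As a rough sketch of the index-and-rank idea described above, the following builds a word index over course titles, returns the matching (institution, course) pairs for a search term, and orders them by a single KIS-style data element. The catalogue entries, the identifiers and the `satisfaction` figures are all made up for illustration, and a real service would need something far better than matching on title words.

```python
from collections import defaultdict

# Made-up catalogue entries: (institution id, course code, course title).
CATALOGUE = [
    ("inst-a", "F300", "Physics"),
    ("inst-a", "G100", "Mathematics"),
    ("inst-b", "F300", "Physics with Astronomy"),
    ("inst-c", "G100", "Mathematics and Statistics"),
]

# Made-up KIS-style figures, keyed by (institution, course) pair.
KIS = {
    ("inst-a", "F300"): {"satisfaction": 84},
    ("inst-b", "F300"): {"satisfaction": 91},
    ("inst-a", "G100"): {"satisfaction": 88},
    ("inst-c", "G100"): {"satisfaction": 79},
}


def build_index(catalogue):
    """Map each lower-cased title word to the set of (institution, course) pairs."""
    index = defaultdict(set)
    for institution, course, title in catalogue:
        for word in title.lower().split():
            index[word].add((institution, course))
    return index


def search(index, term):
    """Return the (institution, course) pairs whose title contains the term."""
    return sorted(index.get(term.lower(), set()))


def rank(pairs, kis, element):
    """Order pairs by one KIS data element, highest first; pairs with no data go last."""
    return sorted(
        pairs,
        key=lambda pair: kis.get(pair, {}).get(element, float("-inf")),
        reverse=True,
    )


index = build_index(CATALOGUE)
print(rank(search(index, "physics"), KIS, "satisfaction"))
```

Note that the ranking step only works if the aggregator holds (or can fetch) the KIS data for every pair it returns, which is where the licensing question comes in.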
The KIS data only partially overlaps with the XCRI-CAP data, so I wonder: to what extent will it be possible to JOIN the two data sets (that is, how will we be able to link XCRI-CAP and KIS data? Via HEI+coursecode keys, presumably?)
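Assuming both data sets do carry an institution identifier and a course code (which is my presumption, not something either specification has confirmed to me), the join might look something like the sketch below. The records and field names are invented; the sketch keeps every XCRI-CAP style record and attaches KIS data only where the composite key matches.

```python
# Made-up XCRI-CAP style descriptive records.
XCRI = [
    {"institution": "inst-a", "course": "F300", "title": "Physics"},
    {"institution": "inst-b", "course": "F300", "title": "Physics with Astronomy"},
    {"institution": "inst-c", "course": "G100", "title": "Mathematics and Statistics"},
]

# Made-up KIS style records; note there is none for inst-c.
KIS = [
    {"institution": "inst-a", "course": "F300", "satisfaction": 84},
    {"institution": "inst-b", "course": "F300", "satisfaction": 91},
]

KEY_FIELDS = ("institution", "course")


def pair_key(record):
    """The composite HEI+coursecode key assumed to be shared by both data sets."""
    return (record["institution"], record["course"])


def join_course_data(xcri, kis):
    """Left join: keep every descriptive record, attach KIS data where the key matches."""
    kis_by_key = {pair_key(record): record for record in kis}
    joined = []
    for record in xcri:
        match = kis_by_key.get(pair_key(record))
        merged = dict(record)
        merged["kis"] = None if match is None else {
            field: value for field, value in match.items() if field not in KEY_FIELDS
        }
        joined.append(merged)
    return joined


for row in join_course_data(XCRI, KIS):
    print(row)
```

If the two sources turn out to use different course identifiers, some sort of lookup table between them would be needed before a join like this could work at all.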
12. The proposed programme will support the sector to prepare for the increasing demand for course information, and increase the availability of high-quality, accurate information about part-time, online and distance learning opportunities offered by UK HEIs by:
– funding institutions to make the process and technical innovations necessary to release a structured, machine-readable feed of their course-related information, and;
– creating a proof-of-concept aggregator and discovery service to bring together this course information and enable prospective students to search it.
So – what I think the JISC are suggesting is that they are looking to fund work on the “wider information set” of information around courses? That JISC are also looking to create a “proof-of-concept aggregator and discovery service to bring together this course information and enable prospective students to search it” sounds interesting. I wonder how this would sit in the context of:
- UCAS (which currently concentrates course listings as a basis for a single point of application for entry (how will entry work for the private universities? Cf. also the OU, which has only just started to make use of the UCAS entry route, and which also supports a significant direct entry route onto modules?))
- third party services such as ???Hotcourses
- custom search engines such as CourseDetective, which search over online course prospectuses (and which cost approx. 2 volunteered FTE days to put together at a hackday…;-)
It’s also worth bearing in mind that my TSO OpenUp competition entry also suggested the opening up of course code scaffolding data so that third parties could start to create aggregated and enriched datasets around courses, as well as building services on top of that data that would potentially be revenue generating and commercially sustainable…
Just on the topic of “wider information sets”, here’s what the HEFCE KIS consultation report had to say on the matter:
The wider information set
32. Higher education providers already publish a wide range of information about their institution and the courses they deliver. The information published has been considered by QAA in the context of institutional audit (for publicly funded higher education institutions and those privately funded providers that subscribe to QAA) or of Integrated Quality and Enhancement Review (for further education colleges (FECs) offering HE courses) and is subject to a ‘comment’ in that context. The consultation proposed that institutions should make this information more public-facing, noting that published information would, in due course, be subject to a judgement in QAA review processes.
33. It was proposed that this wider information set has two purposes:
– to provide information about higher education to a wide variety of audiences including: prospective and current students; students’ parents and advisers; employers; the media; and the institution itself;
– to form part of the evidence used in QAA audit and review.
34. The required information set was presented in the consultation document as a minimum requirement, with institutions continuing to publish as much other information as they wished. Institutions were asked to consider whether any of the information could be presented in more accessible ways.
Information about aspects of course/awards (not available in the KIS):
|Information to be provided||Level of information||Availability|
|prospectuses, programme guides, module descriptors or similar programme specifications; results of internal student surveys; links with employers – where employers have input into a course or programme (this could be quite a high-level statement); partnership agreements, links with awarding bodies/delivery partners||Course/programme level||All apart from results of internal surveys to be publicly available. Results of internal surveys should be available internally|
If there is such pent-up demand for aggregated course discovery services, then they should also be able to run as commercial services? One thing that I would argue currently limits innovation in this area is access to a comprehensive qualification catalogue across the UK. UCAS do have this data, and they do sell it. But I want to play with it and see if I can build a service round it, rather than deciding to quit my job, raise finance, buy the data from UCAS and then see if I can make a go of building a commercial service around the data. UCAS would still benefit from traffic driven to the UCAS site for course registrations. (But then, if aggregators were also aggregating information about courses in the private sector that supported direct entry and did not require central applications and clearing, aggregators might also start recommending courses outside the scope of UCAS…? Hmmm… Because the private universities would probably provide a commercial incentive to drive traffic to them in the form of affiliate fees based on registrations resulting from referrals… Hmmm… This is all starting to put me in mind of things like FOTA, Formula One and the FIA…!)
Another route to a comprehensive course catalogue is through indexing catalogue feeds (akin to website sitemap feeds that detail all the pages on a website to make it easy for search engines to index them) published directly by the universities, such as XCRI-CAP feeds…
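By way of illustration, here is a minimal sketch of how an indexer might pull course page URLs out of a standard sitemap feed. The sitemap XML namespace is the real one from the sitemaps.org protocol; the sample feed, the example.ac.uk addresses and the assumption that course pages live under a `/courses/` path are all invented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Invented sample feed standing in for one fetched from a university website.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.ac.uk/courses/f300</loc><lastmod>2011-08-01</lastmod></url>
  <url><loc>https://www.example.ac.uk/courses/g100</loc></url>
  <url><loc>https://www.example.ac.uk/about</loc></url>
</urlset>"""


def course_urls(sitemap_xml, path_prefix="/courses/"):
    """Return the page URLs whose path starts with the (assumed) course prefix."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if urlparse(url).path.startswith(path_prefix):
            urls.append(url)
    return urls


print(course_urls(SAMPLE_SITEMAP))
```

A feed dedicated to courses would remove the need for the path filter, which is really just guesswork about how a given prospectus site is organised.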
13. The availability of useable course data feeds, and the demonstration of the proof-of-concept aggregator, is intended to provide a catalyst to the feeds being used within existing aggregators, catalogues or information, advice and guidance services, or to form the basis of new services.
I’m not sure an incentive is required… just open access to the data, free in the first instance. (And if companies do start to make money from it, then license fees can kick in. I don’t think people would have a problem with that…)
15. Between September 2011 and March 2013, JISC intends to fund projects that help institutions review and adapt their internal processes to permit easier access to their course data to meet the needs of various stakeholders. As a minimum, and to provide a clear focus for this overarching activity, the programme will concentrate on the implementation of an XCRI-CAP standard system-generated feed. The programme will be staged to ensure maximum benefit is achieved.
If this data is already exposed via online course prospectuses, a developer with data scraper in hand could probably get a large chunk of this data anyway over the next three to six months. (The CourseDetective CSE definition file already provides a basis for anyone wanting to spider university course catalogues… Hmmm… maybe that’s a good reason for me to get to grips with Lucene…? Ideally, course prospectuses would also produce a sitemap (or XCRI) feed providing URLs for all the course pages currently published via the online prospectus to make it easy for third parties to index, or harvest, this data. The provision of semantic markup in a page, whether through RDFa, microformats, microdata or metadata would also simplify the scraping (i.e. machine parseability) of the course pages. At the very least, using template based, sensibly structured presentation markup that enforces markup conventions that suggest de facto semantics makes pages reliably scrapeable and provides one way of supporting the harvesting of data (if license conditions allow…)) Because, of course, a major reason why potentially commercial services don’t just scrape the data to build course comparison sites relates to the licensing/copyright restrictions that may exist, deliberately or by default, over the university prospectus data that is published online… (Not everyone’s a pirate;-)
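To show why semantic markup helps, here is a small sketch that pulls microdata-style `itemprop` values out of a course page using only the Python standard library. The sample page and the property names (`name`, `courseCode`, `description`) are invented, and the parser assumes `itemprop` elements are not nested inside one another, so treat it as a toy rather than a proper microdata parser.

```python
from html.parser import HTMLParser

# Invented course page fragment; the property names are illustrative only.
SAMPLE_PAGE = """
<div itemscope>
  <h1 itemprop="name">BSc Physics</h1>
  <span itemprop="courseCode">F300</span>
  <p itemprop="description">Three years of
     <em>full-time</em> study.</p>
</div>
"""


class ItemPropParser(HTMLParser):
    """Collect the text content of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._name = None  # itemprop currently being read, if any
        self._tag = None   # tag that opened it
        self._text = []

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        if name and self._name is None:
            self._name, self._tag, self._text = name, tag, []

    def handle_data(self, data):
        if self._name is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._name is not None and tag == self._tag:
            # Collapse runs of whitespace left over from the page layout.
            self.props[self._name] = " ".join("".join(self._text).split())
            self._name = None


def extract_props(page_html):
    parser = ItemPropParser()
    parser.feed(page_html)
    parser.close()
    return parser.props


print(extract_props(SAMPLE_PAGE))
```

Without the `itemprop` hints, the scraper would have to rely on each site's particular template structure, which is exactly the fragile, site-by-site work that a standard feed is supposed to avoid.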
16. In Stage 1, institutions will review the maturity of their management of course data using the XCRI Self Assessment Framework. This could cover the full course data life cycle, but must include a particular focus on prospectus and course advertising information. Based on the outcomes of this review, institutions will produce an implementation plan for how they will improve processes to, as a minimum, create a system-generated course advertising feed in a XCRI CAP 1.2 format with a COOL URI.
So I wonder, would JISC indemnify a third party looking to scrape, aggregate, and republish this data in a standard form via an open API and a permissive license, against actions taken against them by UCAS and the universities for breach of copyright?! I also wonder whether JISC will be providing guidance about what license conditions they expect XCRI-CAP data to be published under? Or is that out of scope?
19. The anticipated outcomes from this programme of work are:
– There will be increased usage of appropriate technology to streamline course data processes leading to:
— More standardised, and therefore comparable, course information in a consistent location making discovery easier.
— Improved quality and therefore more efficient and effective course data.
— Increased ease in finding and comparing courses, especially types of courses that are currently hard to find, such as ones delivered by distance learning.
– Institutions are able to make appropriate and informed decisions about their processes for managing course-related data, leading to a reduced administrative data burden, cost-effective working, and better business intelligence.
Ah… this is actually different to getting the data out there, then, in a way that third parties can use it? It’s more about tweaking systems and processes inside the institution to support the provisioning of data in ways that make it more accessible to third party aggregators? The course aggregator is then a red herring – it’s just there to provide a reference/candidate client/consumer against which the released data can be targeted.
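For what it's worth, the "system-generated feed" end of this need not be complicated once the course records are held somewhere sensible. The sketch below turns a list of internal course records into a simple XML catalogue. To be clear about the assumptions: the element names are loosely inspired by the catalogue/provider/course shape of a course feed, but this is not schema-valid XCRI-CAP, it omits namespaces entirely, and the records are made up.

```python
import xml.etree.ElementTree as ET

# Made-up internal course records, e.g. as exported from a student records system.
COURSES = [
    {"code": "F300", "title": "BSc Physics", "url": "https://www.example.ac.uk/courses/f300"},
    {"code": "G100", "title": "BSc Mathematics", "url": "https://www.example.ac.uk/courses/g100"},
]


def build_feed(provider_name, courses):
    """Serialise course records as a simplified catalogue (NOT valid XCRI-CAP)."""
    catalog = ET.Element("catalog")
    provider = ET.SubElement(catalog, "provider")
    ET.SubElement(provider, "title").text = provider_name
    for record in courses:
        course = ET.SubElement(provider, "course")
        ET.SubElement(course, "identifier").text = record["code"]
        ET.SubElement(course, "title").text = record["title"]
        ET.SubElement(course, "url").text = record["url"]
    return ET.tostring(catalog, encoding="unicode")


print(build_feed("Example University", COURSES))
```

The hard part, as the call seems to recognise, is presumably not the serialisation but getting the underlying course data into a state where it can be trusted and generated automatically.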
25. There will be a support and synthesis project that will be working with projects from the start of the programme to help them shape their implementation plans in Stage 1 and other outputs in Stage 2 that are of most use to the sector. Projects are expected to engage with the support and synthesis project and to be proactive in sharing outputs throughout the project. This information will be synthesised and shared with the sector; where that information is sensitive, it will be shared in an aggregated, anonymised form.
A “support and synthesis project” within JISC, presumably (i.e. run by the usual suspects)? Rather than sponsoring and indemnifying the open data community on the one hand, or encouraging potential startups on the other, to start building user facing (potential student) services, along with the necessary business model that will make them sustainable, and maybe even profitable?
26. Funding is provided to enable institutions to carry out project work, but also to release key staff to prepare for, take part in and follow up on these programme-level activities. Projects should allow at least 5 person-days in Stage 1 and 10 person-days in Stage 2.
Such is the price of funding HE based developer activities. 5 days project work: £10k. 10 days project work: £40k-80k. So now you know…