One of the challenges that needs to be addressed when developing data driven applications is how to get data from a data base to the point of use. Another relates to time-relevance. For example, in one document you might want to look at figures or data relating to a specific, fixed date or period (March 2012, for example, or the academic years 2012-13). At other times, you might want current data: values of some indicator over the last 6 weeks for example. Where live data feeds are available, you might want them to be displayed on a dashboard. In other circumstances, you might want a more traditional printed report or Powerpoint presentation, but one that contains absolutely up to the minute information. Ideally, the report or presentation would be “self-updating” so that each time you printed it off, it contained the latest data values.
One common user environment for data related activities is a spreadsheet. When developing an API that produces data that might be usefully manipulated within a spreadsheet context, it can be useful to provide a set of connectors that allow data to be pulled directly from the API and inserted into the spreadsheet.
Here’s a quick example of how we might start to pull data into a Google Spreadsheets context from the University of Lincoln course data API. The example calls broadly reproduce those described in Getting ONCourse With Course Data.
Although the new script management tools in the Google Apps environment confuse the process of defining spreadsheet specific scripts, at its heart defining custom spreadsheet functions for Google Spreadsheets is a trivial exercise: create a custom function (for example, myFunction(val)), pass in a value (such as a cell value) and return a value or array. You can can call the function using a formula, =myFunction(A3), for example. If a single value is returned, this will be returned into the calling cell. If a one-dimensional list is returned, it will add alues to the current row. If a two dimensional array is returned, it will populate rows and columns.
Here are a few helper functions for calling the ONCourse data into a Google Spreadsheet:
function signed(url){ //Use this to add the API key, for example: //return url+'?key='+KEYVAL return url } function gurl(url){ //Fetch the JSON data and parse it as such raw=UrlFetchApp.fetch(signed(url)) json=Utilities.jsonParse(raw.getContentText()) return json } function jgrabber(u){ root='https://n2.online.lincoln.ac.uk' return gurl(root+u) } function grabModulesId(id){ path= '/modules/id/'+id json= jgrabber(path ) return json['result'] } function grabProgrammesId(id){ path= '/programmes/id/'+id json= jgrabber(path ) return json['result'] } function assessmentByModuleID(moduleID){ d=grabModulesId(moduleID) return d['assessments'] }
Let’s now see if we can pull data into the spreadsheet, and try to get a feel for whether this sort of approach looks as if it may be useful…
First up, how about getting a list of programmes associated with a module?
function printProgsFromModuleID(moduleID){ d=grabModulesId(moduleID) var arr = []; c=[] c.push(d['title']) c.push(d['module_code']['code']) c.push(d['level']['description']) c.push(d['credit_rating'] + ' points') arr.push(c) arr.push([]) c=['Programme', 'Course', 'Course Code'] arr.push(c) for (var i=0;i< d['module_links'].length;i++){ r=d['module_links'][i]['programme'] c=[ r['programme_title'], r['course_title'], r['course_code']['code'] ] arr.push(c) } return arr }
Here’s the result:
How about learning outcomes per module (via the assessments associated with each module)?
function learningOutcomesByModuleID(moduleID){ d=assessmentByModuleID(moduleID) arr=[] for (i=0; i<d.length; i++){ r=d[i] assessment=gurl(r['nucleus_url'])['result'] c=[ assessment['module']['title'], assessment['module']['module_code']['code'], assessment['assessment_method'] ] arr.push(c) learningOutcomes=assessment['learning_outcomes'] for (j=0; j< learningOutcomes.length; j++) arr.push([ '','', learningOutcomes[j]['description'] ]) } return arr }
Which gives us:
And how about programme outcomes for a particular programme, broken out into outcome types:
function programmeOutcomesByProgrammeID(id){ d=grabProgrammesId(id) arr=[] var types= new Object(); for (i=0;i<d['programme_outcomes'].length; i++){ r=d['programme_outcomes'][i] if (types[ r['category']['title'] ] == undefined ){ types[r['category']['title']]=[] } types[r['category']['title']].push(r['description']) } for ( j in types ) arr.push([j,types[j] ]) return arr }
The result? A handy table:
Okay, so that all seems easy enough. I think that Google docs are also scriptable, so it should be possible to populate a templated “word” document using data pulled from the API (for example, using a snippet like this one of Martin Hawksey’s: Google Apps Script to fill in a Document template with Spreadsheet data).
One thing I realised about the API from playing with this recipe was that it is defined very much as a little-l, little-d “linked data” API that works well for browsing the data. The well defined URIset and use of internal unique identifiers make it easy to traverse the data space once you are in it. However, it’s not immediately obvious to me how I could either search my way into the data, or access it via natural identifiers such as programme codes or module codes.
[Seems I missed a trick on the API front… As described in APMS -> Nucleus -> APIs. How? Why?, it seems that you could get details of all assessments that are the final assessment for the module, that contain group work at: https://n2.online.lincoln.ac.uk/assessments?access_token=ACCESSTOKEN&group_work=1&final_assessment=1]
For example, to print the programmes associated with a module, I might use the above formula =printProgsFromModuleID(97), but the identifier I need to pass in (97) is an internal, machine generated ID. This is all well and good if you are working within the machine-ID space, but this is not the human user space. For that, it would be slightly more natural to make use of a module code, and call something like printProgsFromModuleCode('FRS2002M'). There are issues with this approach of course, with module codes being carried over from presentation to presentation, and breaking a simple bijective (one-one onto) relationship between internal module IDs and module codes. As a result, we might need to further qualify these calls with a presentation year (or by default assume the presentation from the current academic year), or whatever other arguments are required to recapture the bijectivity.
PS in passing, here’s another view over modules by programme, broken down into core and option modules.
There are a few problems with this view – for example, the levels need ordering properly; where there are no core or no optional modules the display is not as good as it could bel the label sizing is a bit small; there is no information relating to pre-requisites for optional modules – but it’s a start, and it’s reasonably clean to look at.
It should also be easy enough to tweak the data generator script to allow us to use the same display script to show assessment types for each module in a programme, as demonstrated using enclosure charts in VIsually Revealing Gaps in ONCourse Data , and maybe even learning outcomes too.
If I get a chance, I’ll look through the sorts of thing requested in the ONCourse Focus Group – 14th March 2012 and try to pull out some views that match some of the requested ones. What would also be interesting would be to have a list of use cases from people who work with the data, too…
Tony
If you are interested, here is a link to the latest OU XCRI-CAP 1.2 data set: http://www.alanpaull.co.uk/OpenUniversityXCRI-CAP1-2.xml. Technically this is a test feed, but it’s reasonably accurate I believe. It includes both module and ‘qualification-based’ courses. I am thinking that we should provide (in XCRI-CAP) a vocabulary for course type that labels ‘course objects’, such as ‘module’, ‘programme’, ‘stage’, ‘year’, ‘pathway’, et al. Do you think that would help with analysis? Otherwise, how can you tell whether you’ve got a module or a programme?
Alan
alan@alanpaull.co.uk
@Alan Do you know if this feed (or the live feed): a) has any consumers; b) what they do do with it?
I need to have a proper look at this feed I think (when does it go live?). There seems to be some useful structure in there, but also some messiness, eg in the learning outcomes, which are not enumerated and which do not appear to be itemised in the free text in a conventional way, which makes parsing individual LOs out, eg for displaying as a list, trickier?
Tony
No consumers as yet that I know of. However, we’re currently doing some demonstrators in the Course Data Programme, and we’re using the data (loading it into a database) for subject search services and to link with KIS data.
It won’t have properly ‘live’ status (I think) till the communications dept in the OU are able to develop their own version – and at present there are more pressing concerns. This is why I’ve produced this version. It contains live data, which we update 6x per year.
In relation to learning outcomes (and some of the other descriptive text fields), these are presented for marketing purposes as a single field, bearing in mind that the information is published for marketing, not for academic, purposes. Parsing these descriptive text fields will be tricky, because they’re designed to be ‘final form’ not internally structured.
With XCRI-CAP (Course Advertising Profile) we have a tension between publishing information for direct use in marketing websites, versus data exchange for added value. Certain decisions in the design of XCRI-CAP help with the former, but don’t with the latter, unfortunately.
Alan
If I’m producing a marketing website, that mean pushing text based on search and other customisation factors. Is the learning objective stuff just intended as a hook for free text searches then? being able to print the LOs out in a pretty way is currently non-trivial?
Separate LOs are not available via the XCRI-CAP source. I’m sure they’re available somewhere in OU systems, but not publicly, and not used on the OU website. Maybe via OU linked data? I’ve not investigated that.
Hmm. Maybe it would be possible to add in OU linked data to the XCRI-CAP feed itself (or vice versa). I should perhaps follow that up.
Alan
The LO field can be parsed to pull out the separate LOs, but there doesn’t seem to be a conventional way of writing the LOs across courses (which means several parsers are required to detect and pull out LOs according to the different ways they are expressed).
There are also conventional ways of carving up LOs (Knowledge and understanding, Practical Skills, etc) though I’m not sure if this carving up is standardised, follows a universal convention, or is just defined within the context of an institution?
I think that there is an emerging consensus (??) about carving up LOs, particularly in the light of QAA guidance and suchlike. However, whether this will ever be enough to standardise a vocab or internal structure is moot. I suspect we may be in a 60/40 and heading for 80/20.
It’s still institution-based as far as I can tell, with a strong push from some to link LOs more clearly to assessment and content within the information structures, as well as just ‘conceptually’.
Alan