data.open.ac.uk Linked Data Now Exposing Module Information

As HE becomes more and more corporatised, I suspect we’re going to see online supermarkets appearing that help you identify – and register on – degree courses in exchange for an affiliate/referral fee from the university concerned. For those sites to appear, they’ll need access to course catalogues, of course. UCAS currently holds the most comprehensive one that I know of, but it’s a pain to scrape and all but useless as a datasource. But if the universities publish course catalogue information themselves in a clean way (and ideally, a standardised way), it shouldn’t be too hard to construct aggregation sites ourselves…

So it was encouraging to see earlier this week an announcement that the OU’s data.open.ac.uk site has started publishing module data from the course catalogue – that is, data about the modules (as we now call them – they used to be called courses) that you can study with the OU.

The data includes various bits of administrative information about each module, the territories it can be studied in, and (most importantly?!) pricing information;-)

data.open.ac.uk - module data

You may remember that the data.open.ac.uk site itself launched a few weeks ago with the release of Linked Data sets including data about deposits in the open repository, as well as OU podcasts on iTunes (data.open.ac.uk Arrives, With Linked Data Goodness. Where podcasts are associated with a course, the magic of Linked Data means that we can easily get to the podcasts via the course/module identifier:

data.open.ac.uk

It’s also possible to find modules that bear an isSimilarTo relation to the current module, where isSimilarTo means (I think?) “was also studied by students taking this module”.

As an example of how to get at the data, here’s a Python script using the Python YQL library that lets me run a SPARQL query over the data.open.ac.uk course module data (the code includes a couple of example queries):

import yql

def run_sparql_query(query, endpoint):
    y = yql.Public()
    query='select * from sparql.search where query="'+query+'" and service="'+endpoint+'"'
    env = "http://datatables.org/alltables.env"
    return y.execute(query, env=env)

endpoint='http://data.open.ac.uk/query'

# This query finds the identifiers of postgraduate technology courses that are similar to each other
q1='''
select distinct ?x ?z from <http://data.open.ac.uk/context/course> where {
?x a <http://purl.org/vocab/aiiso/schema#Module>.
?x <http://data.open.ac.uk/saou/ontology#courseLevel> <http://data.open.ac.uk/saou/ontology#postgraduate>.
?x <http://purl.org/dc/terms/subject> <http://data.open.ac.uk/topic/technology>.
?x <http://purl.org/goodrelations/v1#isSimilarTo> ?z
} limit 10
'''

# This query finds the names and course codes of 
# postgraduate technology courses that are similar to each other
q2='''
select distinct ?code1 ?name1 ?code2 ?name2 from <http://data.open.ac.uk/context/course> where {
?x a <http://purl.org/vocab/aiiso/schema#Module>.
?x <http://data.open.ac.uk/saou/ontology#courseLevel> <http://data.open.ac.uk/saou/ontology#postgraduate>.
?x <http://purl.org/dc/terms/subject> <http://data.open.ac.uk/topic/technology>.
?x <http://courseware.rkbexplorer.com/ontologies/courseware#has-title> ?name1.
?x <http://purl.org/goodrelations/v1#isSimilarTo> ?z.
?z <http://courseware.rkbexplorer.com/ontologies/courseware#has-title> ?name2.
?x <http://purl.org/vocab/aiiso/schema#code> ?code1.
?z <http://purl.org/vocab/aiiso/schema#code> ?code2.
}
'''

# This query finds the names and course codes of 
# postgraduate courses that are similar to each other
q3='''
select distinct ?code1 ?name1 ?code2 ?name2 from <http://data.open.ac.uk/context/course> where {
?x a <http://purl.org/vocab/aiiso/schema#Module>.
?x <http://data.open.ac.uk/saou/ontology#courseLevel> <http://data.open.ac.uk/saou/ontology#postgraduate>.
?x <http://courseware.rkbexplorer.com/ontologies/courseware#has-title> ?name1.
?x <http://purl.org/goodrelations/v1#isSimilarTo> ?z.
?z <http://courseware.rkbexplorer.com/ontologies/courseware#has-title> ?name2.
?x <http://purl.org/vocab/aiiso/schema#code> ?code1.
?z <http://purl.org/vocab/aiiso/schema#code> ?code2.
}
'''

result=run_sparql_query(q3, endpoint)

for row in result.rows:
	for r in row['result']:
		print r

I’m not sure what purposes we can put any of this data to yet, but for starters I wondered just how connected the various postgraduate courses are based on the isSimilarTo relation. Using q3 from the code above, I generated a Gephi GDF/network file using the following snippet:

# Generate a Gephi GDF file showing connections between 
# modules that are similar to each other
fname='out2.gdf'
f=open(fname,'w')

f.write('nodedef> name VARCHAR, label VARCHAR, title VARCHAR\n')
ccodes=[]
for row in result.rows:
	for r in row['result']:
		if r['code1']['value'] not in ccodes:
			ccodes.append(r['code1']['value'])
			f.write(r['code1']['value']+','+r['code1']['value']+',"'+r['name1']['value']+'"\n')
		if r['code2']['value'] not in ccodes:
			ccodes.append(r['code2']['value'])
			f.write(r['code2']['value']+','+r['code2']['value']+',"'+r['name2']['value']+'"\n')
		
f.write('edgedef> c1 VARCHAR, c2 VARCHAR\n')
for row in result.rows:
	for r in row['result']:
		#print r
		f.write(r['code1']['value']+','+r['code2']['value']+'\n')

f.close()

to produce the following graph. (Size is out degree, colour is in degree. Edges go from ?x to ?z. Layout: Fruchterman Reingold, followed by Expansion.)

OU postgrad courses in gephi

The layout style is a force directed algorithm, which in this case has had the effect of picking out various clusters of highly connected courses (so for example, the E courses are clustered together, as are the M courses, B courses, T courses and so on.)

If we run the ego filter over this network on a particular module code, we can see which modules were studying alongside it:

ego filter on course codes

Note that in the above diagram, the nodes are sized/coloured according to in-degree/out-degree in the original, complete graph, If we re-calculate those measures on just this partition, we get the following:

Recoloured course network

If we return to the whole network, and run the Modularity class statistic, we can identify several different course clusters:

Modules - modularity class

Here’s one of them expanded:

A module cluster

Here are some more:

COurse clusters

I’m not sure what use any of this is, but if nothing else, it shows there’s structure in that data (which is exactly what we’d expect, right?;-)

PS as to how I wrote my first query on this data, I copied the ‘postgraduate modules in computing’ example query from data.open.ac.uk:

http://data.open.ac.uk/query?query=select%20distinct%20%3Fx%20from%20%3Chttp://data.open.ac.uk/context/course%3Ewhere%20{%3Fx%20a%20%3Chttp://purl.org/vocab/aiiso/schema%23Module%3E.%0A%3Fx%20%3Chttp://data.open.ac.uk/saou/ontology%23courseLevel%3E%20%3Chttp://data.open.ac.uk/saou/ontology%23postgraduate%3E.%0A%3Fx%20%3Chttp://purl.org/dc/terms/subject%3E%20%3Chttp://data.open.ac.uk/topic/computing%3E%0A}%0A&limit=200

and pasted it into a tool that “unescapes” encoded URLs, which encodes the SPARQL query:

Unescaping text

I was then able to pull out the example query:
select distinct ?x from <http://data.open.ac.uk/context/course&gt;
where {?x a <http://purl.org/vocab/aiiso/schema#Module&gt;.
?x <http://data.open.ac.uk/saou/ontology#courseLevel&gt; <http://data.open.ac.uk/saou/ontology#postgraduate&gt;.
?x <http://purl.org/dc/terms/subject&gt; <http://data.open.ac.uk/topic/computing&gt;
}

Just by the by, there’s a host of other handy text tools at Text Mechanic.

4 comments

  1. Allyn J Radford

    One possible use, and somewhat relevant to OU is the ability to provide students with course planning tools that may also align with career management scenarios. While some different visual representation might be useful an interesting application might be organisations like Open Universities Australia who register students into single subjects and who then might want to see where there interests take them into awards after collecting a few subjects, or SeekLearning (AU again) who take the career-path focus. Maybe someone like Ufi/LearnDirect would also gain benefit. Just a few thoughts…

  2. Pingback: OU Related Courses Network Visualisation Using Protovis and Open University Open Data « OUseful.Info, the blog…
  3. Pingback: Open University Undergraduate Module Map « OUseful.Info, the blog…