Lorcan Dempsey was revisiting an old favourite last week, in a discussion about inside-out and outside-in library activities (Discovery vs discoverability …). Outside-in relates to managing collections of, and access to, external resources. Under the inside-out strategy, by contrast, the library accepts that discovery happens elsewhere, and sees its role as making library-mediated resources (and resources published by the host institution) available in the places where the local patrons are likely to be engaging in resource discovery (i.e. on the public web…).
A similar notion can be applied to innovation, as fumblingly described in this old post Innovating from the Inside, Outside. The idea there was that if institutions made their resources and data public and openly licensed, then internal developers would be able to make use of them for unofficial and skunkwork internal projects. (Anyone who works for a large institution will know how painful it can be getting hold of resources that are “owned” by other parts of the institution.) A lot of the tinkering I’ve done around OU services has only been possible because I’ve been able to get hold of the necessary resources via public (and unauthenticated) URLs. A great example of this relates to my OpenLearn tinkerings (e.g. as described in both the above linked “Innovation” post and more recently in Derived Products from OpenLearn/OU XML Documents).
But with the recent migration of OpenLearn to the open.edu domain, it seems as if the ability to just add ?content=1 to the end of a unit URL and as a result get access to the “source” XML document (essentially, a partially structured “database” of the course unit) has been disabled.
Of course, this could just be an oversight, a switch that failed to be flicked when the migration happened; although from the unit homepage, there is no obvious invitation to download an XML version of the unit.
[UPDATE: see comments – seems as if this should be currently classed as “broken” rather than “removed”.]
In a sense, then, access to a useful format of the course materials for the purpose of deriving secondary products has been removed. (I also note that the original, machine readable ‘single full list’ of available OpenLearn units has disappeared, making the practical act of harvesting harder even if the content is available…) Which means I can no longer easily generate meta-glossaries over all the OpenLearn units, nor image galleries or learning objective directories, all of which are described in the Derived Products from OpenLearn post. (If I started putting scrapes on the OU network, which I’ve considered many times, I suspect the IT police would come calling…) Which is a shame, especially at a time when the potential usefulness of text mining appears to be being recognised (e.g. the BIS press release on ‘Consumers given more copyright freedom’, December 20, 2012: “Data analytics for non-commercial research – to allow non-commercial researchers to use computers to study published research results and other data without copyright law interfering;”, interpreted by Peter Murray Rust as the UK government saying it’s legal to mine content for the purposes of non-commercial research. By the by, I notice that the press release also mentions “Research and private study – to allow sound recordings, films and broadcasts to be copied for non-commercial research and private study purposes without permission from the copyright holder.” Which could be handy…).
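For anyone who never saw the trick working, here is a rough sketch of what it amounted to: tack content=1 onto a unit URL, then mine the XML that comes back for something like glossary items. The example.org URL, the element names (GlossaryItem, Term, Definition) and the sample document are all made up for illustration; I am not claiming they match the real OU XML schema, and nothing here touches the network.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode
import xml.etree.ElementTree as ET


def source_xml_url(unit_url):
    """Return unit_url with content=1 added to its query string."""
    parts = urlsplit(unit_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if ("content", "1") not in query:
        query.append(("content", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def glossary_items(xml_text):
    """Pull (term, definition) pairs out of a unit's XML.

    The element names are hypothetical stand-ins, not the real schema.
    """
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("GlossaryItem"):
        term = (item.findtext("Term") or "").strip()
        definition = (item.findtext("Definition") or "").strip()
        if term:
            items.append((term, definition))
    return items


# Invented sample document standing in for a downloaded unit.
SAMPLE_XML = """<Unit>
  <Glossary>
    <GlossaryItem><Term>Discovery</Term><Definition>Finding resources.</Definition></GlossaryItem>
    <GlossaryItem><Term>Harvesting</Term><Definition>Collecting documents in bulk.</Definition></GlossaryItem>
  </Glossary>
</Unit>"""

if __name__ == "__main__":
    print(source_xml_url("http://example.org/unit/view.php?id=123"))
    for term, definition in glossary_items(SAMPLE_XML):
        print(term, "-", definition)
```

Run the same extraction over every unit in a full list and you have a meta-glossary; swap the element names and you have an image gallery or a learning objective directory.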
This effective closing down of once open services (deliberate or not) is, of course, familiar to anyone who plays with web APIs, which are often open and free in their early beta development phase, but then get locked down as companies are faced with the need to commercialise them.
Returning to Lorcan’s post for a moment: he notes, on the one hand, “growing interest in connecting the library’s collections to external discovery environments so that the value of the library investment is actually released for those for whom it was made”, and on the other, “a parallel interest in making institutional resources (research and learning materials, digitized special materials, faculty expertise, etc) more actively discoverable.”
If part of the mission is also to promote reuse of content, as well as affording the possibility of third parties opening up additional discovery channels (for example, through structured indices and recommendation engines), not to say creating derived and value-add products, then making content available in “source” form, where structural metadata can be mined for added value discovery (for example, faceted search over learning objectives, or images or glossary items, blah, blah, blah..) is good for everyone.
Unless you’re precious about the product of course, and don’t really want it to be open (whatever “open” means…).
As a pragmatist, and a personal learner/researcher, I often tend not to pay too much attention to things like copyright. In effect, I assert the right to read and “reuse” content for my own personal research and learning purposes. So the licensing part of openness doesn’t really bother me in that respect too much anyway. It might become a problem if I built something that I made public that started getting use and started “stealing” from, or misrepresenting, the original publisher, and then I’d have to worry about the legal side of things… But not for personal research.
Note that as I play with things like Scraperwiki more and more, I find myself more and more attracted to the idea of pulling content into a database so that I can add enhanced discovery services over the content for my own purposes, particularly if I can pull structural elements out of the scraped content to enable more particular search queries. When building scrapers, I tend to limit myself to scraping sites that do not present authentication barriers, and whose content is generally searchable via public web search engines (i.e. it has already been indexed and is publicly discoverable).
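To make the “pull it into a database” idea a bit more concrete, here is a minimal sketch using Python’s built-in sqlite3 module rather than Scraperwiki itself. The table layout, the kind labels and the sample rows are all invented for illustration; the point is only that once each scraped fragment is tagged with its structural type, a query can be limited to, say, learning objectives.

```python
import sqlite3


def build_index(records):
    """Load (source, kind, body) tuples into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fragments (source TEXT, kind TEXT, body TEXT)")
    conn.executemany("INSERT INTO fragments VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def search(conn, kind, term):
    """Find fragments of one structural kind whose text contains term."""
    rows = conn.execute(
        "SELECT source, body FROM fragments "
        "WHERE kind = ? AND body LIKE ? ORDER BY source",
        (kind, "%" + term + "%"),
    )
    return rows.fetchall()


# Invented rows standing in for structural elements pulled out by a scraper.
SAMPLE_RECORDS = [
    ("unit-a", "learning_objective", "Describe how web search engines index pages"),
    ("unit-a", "glossary", "Index: a structured list used to speed up search"),
    ("unit-b", "learning_objective", "Explain the basics of copyright licensing"),
    ("unit-c", "learning_objective", "Build a simple search index over scraped text"),
]

if __name__ == "__main__":
    conn = build_index(SAMPLE_RECORDS)
    for source, body in search(conn, "learning_objective", "index"):
        print(source, "-", body)
```

The same term searched under a different kind gives a different view of the same scraped content, which is the sort of faceted query a flat web search over the rendered pages can’t easily offer.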
Which brings me to consider a possibly disturbing feature of MOOC platforms such as Coursera. The course may be open (if you enrol), but the content of, and access to, the materials isn’t discoverable. That is, it’s not open as to search. It’s not open as to discovery. (Udacity on the other hand does seem to let you search course content; e.g. search with limits site:udacity.com -site:forums.udacity.com.)
I’m not sure what the business model behind FutureLearn will be, but when (if?!) the platform actually appears, I wonder whether course content will be searchable/outside-discoverable on it? (I also wonder to what extent the initial offerings will relate to course resources that JISC OER funding helped to get openly licensed? And what sort of license will apply to the content on the site (for folk who do pay heed to the legalistic stuff;-)?)
So whilst Martin Weller victoriously proclaims Openness has won – now what?, saying “we’ll never go back to closed systems in academia”, I just hope that we don’t start seeing more and more lock down, that we don’t start seeing less and less discovery of useful content published on ac.uk sites, that competition between increasingly corporatised universities doesn’t mean that all we get access to is HE marketing material in the form of course blurbs, and undiscoverable content that can only be accessed in exchange for credentials and personal tracking data.
In the same way that academics have always worked round the journal subscription racket that the libraries were complicit in developing with academic publishers (if you get a chance, go to UKSG, where publisher reps with hospitality accounts do the schmooze with the academic library folk;-), sharing copies of papers if anyone ever asked, I hope that they do the same with their teaching materials, making them discoverable and sharing the knowledge.