Stirring – OUseful.Info, the blog…

Student Workload Planning – Section Level Word Count Reports in MS Word Docs

One of the things the OU seems to have gone in for big time lately is “learning design”, with all sorts of planning tools and who knows what to try and help us estimate student workloads.

One piece of internal research I saw suggested that we “adopt a University-wide standard for study speed of 35 words per minute for difficult texts, 70 words per minute for normal texts and 120 words per minute for easy texts”. This is complemented by a recommended level 1 (first year equivalent) 60:40 split between module-directed (course text) work and student-directed (activities, exercises, self-assessment questions, forum activity etc) work. Another constraint is the available study time per week – for a 30 CAT point course (300 hours study), this is nominally set at 10 hours study per week. I seem to recall that retention charts show that retention rates go down as mean study time goes up anywhere close to this…

One of the things that seems to have been adopted is the assumption that the first year equivalent study material should all be rated at the 35 words per minute level. For 60% module led work, at 10 hours a week, this gives 35 * 60 * 6 = 12,600 words of reading per week. With novels coming in around 500 words a page, that’s 25 pages of reading or so.
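A minimal Python sketch of that arithmetic, using the reading rates and the 60:40 split quoted above. The function name and its defaults are illustrative assumptions, not part of any OU planning tool.

```python
# Reading rates (words per minute) as quoted from the internal research.
WORDS_PER_MINUTE = {"difficult": 35, "normal": 70, "easy": 120}


def weekly_reading_words(level, hours_per_week=10, module_directed_share=0.6):
    """Words of module-directed reading that fit into one study week.

    hours_per_week and module_directed_share default to the nominal
    10 hours and the 60% module-directed share mentioned in the post.
    """
    reading_minutes = hours_per_week * module_directed_share * 60
    return round(WORDS_PER_MINUTE[level] * reading_minutes)


for level in WORDS_PER_MINUTE:
    words = weekly_reading_words(level)
    # 500 words a page is the novel-page figure used in the post.
    print(f"{level}: {words} words per week, roughly {words / 500:.0f} pages")
```

Changing the rate from "difficult" to "normal" doubles the weekly reading budget, which is why the choice of rate matters so much to the workload estimate.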

This is okay for dense text but we tend to write with a strong narrative, using relatively straightforward prose, explaining things a step at a time, with plenty of examples. Dense sentences are rewritten and the word count goes up (but not the reading rate… Not sure I understand that?)

As part of the production process, materials go through multiple drafts and several stages of critical reading by third parties. Part of the critical reading process is to estimate (or check) workload. To assist this, materials are chunked and should be provided with word counts and estimated study times. The authoring process uses Microsoft Word.

As far as I can tell, there is an increasing drive to segment all the materials and chunk them all to be just so, one more step down the line towards rigidly templated materials. For a level 1 study week, the template seems to be five sections per week with four subsections each, each subsection about 500 words or so. (That is, 10 to 20 blog posts per study week…;-)

I’m not sure what, if any, productivity tools there are to automate the workload guesstimates, but over coffee this morning I thought I’d have a go at writing a Visual Basic macro to do some of the counting for me. I’m not really familiar with VB, in fact, I’m not sure I’ve ever written a macro before, but it seemed to fall together okay if the document was structured appropriately.

To wit, the structure I adopted was: a section break to separate each section and subsection (which meant I could count words in each section); a heading as the first line after a section break (so the word count could be associated with the (sub)section heading). This evening, I also started doodling a convention for activities, where an activity would include a line on its own of the form – Estimated study time: NN minutes – which could then be used as a basis for an activity count and an activity study time count.

Running the macro generates a pop up report and also inserts the report at the cursor insertion point. The report for a section looks something like this:

[Image: tm112block2part6_d2_2nd_attempt_docm]

A final summary report also gives the total number of words.

It should be easy enough to also insert wordcounts into the document at the start of each section, though I’m not sure (yet) how I could put a placeholder in at the start of each section that the macro could update with the current wordcount each time I run it? (Also how the full report could just be updated, rather than appended to the document, which could get really cluttered…) I guess I could also create a separate Word doc, or maybe populate an Excel spreadsheet, with the report data.

Another natural step would be to qualify each subsection with a conventional line declaring the estimated reading complexity level, detecting this, and using it with a WPM rate to estimate the study time of the reading material. Things are complicated somewhat by my version of Word (on a Mac) not supporting regular expressions, but then, in the spirit of trying to build tools at the same level of complexity as the level at which we’re teaching, regex are probably out of scope (too hard, I suspect…)
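As a sketch of how that might work without regular expressions, the Python below looks for a declared level line using plain string methods and converts the remaining word count into reading minutes. The "Reading level:" marker is a hypothetical convention invented for this example, and the default level is an assumption based on the 35 words per minute rate discussed above.

```python
WORDS_PER_MINUTE = {"difficult": 35, "normal": 70, "easy": 120}
LEVEL_MARKER = "Reading level:"  # hypothetical convention, not from the macro


def estimate_reading_minutes(subsection_text, default_level="difficult"):
    """Return (level, word_count, minutes) for one subsection of plain text.

    The marker line itself is not counted as reading material.
    """
    level = default_level
    body_words = 0
    for line in subsection_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(LEVEL_MARKER):
            declared = stripped[len(LEVEL_MARKER):].strip().lower()
            if declared in WORDS_PER_MINUTE:
                level = declared
        else:
            body_words += len(stripped.split())
    return level, body_words, body_words / WORDS_PER_MINUTE[level]


demo_text = "Heading\nReading level: normal\n" + " ".join(["word"] * 139)
print(estimate_reading_minutes(demo_text))
```

Whitespace splitting is a cruder word count than the one Word reports, so the minutes are only a guesstimate in the same spirit as the macro.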

To my mind, exploring such productivity tools is the sort of thing we should naturally do; at least, it’s the sort of thing that felt natural in a technology department. Computing seems different; computing doesn’t seem to be about understanding the technical world around us and getting our hands dirty with it. It’s about… actually, I’m not sure what it’s about. The above sketch really was a displacement activity – I have no illusions that the above will generate any interest at all, not even as a simple daily learning exercise (I still try to learn, build or create something new every day to keep the boredom away…) In fact, the “musical differences” between my view of the world and pretty much everyone else’s are getting to the stage where I’m not sure it’s tenable any more. The holiday break can’t come quickly enough… Roll on HoG at the weekend…

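Sub WordCount()
    ' Reports a word count for each document section (one per subsection),
    ' a summary for each top level "Section", and activity study time totals.
    ' Note: Range.Words.Count also counts punctuation and paragraph marks,
    ' so it overestimates; Range.ComputeStatistics(wdStatisticWords) gives
    ' a figure closer to Word's own word count.

    ' Long rather than Integer, which overflows above 32,767
    Dim NumSec As Long
    Dim S As Long
    Dim P As Long
    Dim MarkerPos As Long
    Dim Summary As String

    Dim SubsectionCnt As Long
    Dim SubsectionWordCnt As Long
    Dim SectionWords As Long
    Dim SectionText As String

    Dim ActivityTime As Long
    Dim OverallActivityTime As Long
    Dim SectionActivities As Long

    Dim ParaText As String

    Const Marker As String = "Estimated study time:"

    NumSec = ActiveDocument.Sections.Count
    Summary = "Word Count" & vbCrLf

    For S = 1 To NumSec
        ' The first paragraph after a section break is the (sub)section heading
        SectionText = ActiveDocument.Sections(S).Range.Paragraphs(1).Range.Text

        ' A heading starting "Section" opens a new top level section, so
        ' report on the previous one before counting anything in this one
        If InStr(SectionText, "Section") = 1 And SubsectionCnt > 0 Then
            OverallActivityTime = OverallActivityTime + ActivityTime
            Summary = Summary & SectionSummary(SubsectionCnt, _
                SubsectionWordCnt, ActivityTime, SectionActivities)
            SubsectionCnt = 0
            SubsectionWordCnt = 0
            ActivityTime = 0
            SectionActivities = 0
        End If

        For P = 1 To ActiveDocument.Sections(S).Range.Paragraphs.Count
            ParaText = ActiveDocument.Sections(S).Range.Paragraphs(P).Range.Text
            MarkerPos = InStr(ParaText, Marker)
            If MarkerPos > 0 Then
                ' Val skips leading spaces and stops at the first non-numeric
                ' character, so " minutes" and the paragraph mark are ignored
                ActivityTime = ActivityTime + _
                    CLng(Val(Mid(ParaText, MarkerPos + Len(Marker))))
                SectionActivities = SectionActivities + 1
            End If
        Next P

        SectionWords = ActiveDocument.Sections(S).Range.Words.Count
        ' SectionText ends with a paragraph mark, which acts as the line break
        Summary = Summary & "[Document Section " & S & "] " _
            & SectionText _
            & "Word count: " & SectionWords & vbCrLf

        SubsectionCnt = SubsectionCnt + 1
        SubsectionWordCnt = SubsectionWordCnt + SectionWords
    Next S

    ' The final top level section has no following heading to trigger its summary
    If SubsectionCnt > 0 Then
        OverallActivityTime = OverallActivityTime + ActivityTime
        Summary = Summary & SectionSummary(SubsectionCnt, _
            SubsectionWordCnt, ActivityTime, SectionActivities)
    End If

    Summary = Summary & vbCrLf & "Overall document wordcount: " & _
        ActiveDocument.Range.Words.Count
    Summary = Summary & vbCrLf & "Activity Time: " & OverallActivityTime & " minutes"

    MsgBox Summary
    Selection.Paragraphs(1).Range.InsertAfter vbCr & Summary & vbCrLf
End Sub

Private Function SectionSummary(Subsections As Long, Words As Long, _
        Minutes As Long, Activities As Long) As String
    ' Formats the summary block for one top level section
    SectionSummary = vbCrLf & "SECTION SUMMARY" & vbCrLf _
        & "Subsections: " & Subsections & vbCrLf _
        & "Section Wordcount: " & Words & vbCrLf _
        & "Section Activity Time: " & Minutes & vbCrLf _
        & "Section Activity Count: " & Activities & vbCrLf & vbCrLf
End Function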

PS I’ve no idea what idiomatic VB is supposed to look like; all the examples I saw seemed universally horrible… If you can give me any pointers to cleaning the above code up, feel free to add them in the comments…

PPS Thinks… I guess each section could also return a readability score? Does VB have a readability score function? VB code anywhere implementing readability scores?
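By way of illustration, here is a rough Python sketch of one common readability measure, the Flesch Reading Ease score (206.835 - 1.015 * words per sentence - 84.6 * syllables per word). The syllable counter is a deliberately naive assumption (it just counts runs of vowels), so the scores are indicative only, and the function names are made up for this example.

```python
import re


def count_syllables(word):
    """Naive syllable estimate: count runs of vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("need at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


easy = "The cat sat on the mat."
hard = ("Institutional productivity considerations necessitate "
        "comprehensive documentation.")
print(round(flesch_reading_ease(easy), 1), round(flesch_reading_ease(hard), 1))
```

The same sum could be done in a VB macro over each section's text; the awkward part in either language is the syllable count.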

Jupyter Notebooks as Part of a Publishing System – “Executable” Inline Maths and Music Notations

One of the books I’m reading at the moment is Michael Hiltzik’s Dealers of Lightning: Xerox PARC and the Dawn of the Computer Age (my copy is second hand, ex-library stock…), PARC being the birthplace of ethernet and the laser printer, as well as many of the computer user interactions we take for granted today. One thing I hadn’t fully appreciated was Xerox’s interests in publishing systems, which is in part what put me in mind of this post. The chapter I just finished reading tells of their invention of a modeless, WYSIWYG word processor, something that would be less hostile than the mode based editors of the time (I like the joke about accidentally entering command mode and typing edit – e: select entire document, d: delete selection, i: insert, t: the letter inserted. Oops – you just replaced your document with the letter t).

It must have been a tremendously exciting time there, having to invent the tools you wanted to use because they didn’t exist yet (some may say that’s still the case, but in a different way now, I think: we have many more building blocks at our disposal). But it’s still an exciting time, because while a lot of stuff has been invented, whether or not there is more to come, there are still ways of figuring out how to make it work easier, still ways of figuring out how to work the technology into our workflows in more sensible ways, still many, many ways of trying to figure out how to use different bits of tech in combination with each other in order to get what feels like much more than we might reasonably expect from considering them as a set of separate parts, piled together.

One of the places this exploration could – should – take place is in education. Whilst in HE we often talk down tools in favour of concepts, introducing new tools to students provides one way of exporting ideas embodied as tools into wider society. Tools like Jupyter notebooks, for example.

The more I use Jupyter notebooks, the more I see their potential as a powerful general purpose tool not just for reproducible research, but also as a general purpose computational workbench and as a powerful authoring medium.

Enlightened publishers such as O’Reilly seem to have got on board with using interactive notebooks in a publishing context (for example, Embracing Jupyter Notebooks at O’Reilly) and colleges such as Bryn Mawr in the US keep coming up with all manner of interesting ways of using notebooks in a course context – if you know of other great (or even not so great) use case examples in publishing or education, please let me know via the comments to this post – but I still get the feeling that many other people don’t get it.

“Initially the reaction to the concept [of the Gypsy, GUI powered wordprocessor that was to become part of the Ginn publishing system] was ‘You’re going to have to drag me kicking and screaming,'” Mott recalled. “But everyone who sat in front of that system and used it, to a person, was a convert within an hour.”
Michael Hiltzik, Dealers of Lightning: Xerox PARC and the Dawn of the Computer Age, p210

For example, in writing computing related documents, the ability to show a line of code and the output of that code, automatically generated by executing the code, and then automatically inserted into the document, means that when writing code examples, “helpful corrections” by an over-zealous editor go out of the window. The human hand should go nowhere near the output text.

[Image: week_3_exercise_notebook]

Similarly when creating charts from data, or plotting equations: the charts should be created from the data or the equation by running a script over a source dataset, or plotting an equation directly.

[Image: week_3_exercise_notebook2]

Again, the editor, or artist, should have no hand in “tweaking” the output to make it look better.

If the chart needs restyling, the artist needs to learn how to use a theme (like this?!) or theme generator rather than messing around with a graphics package (wrong sort of graphic). To add annotations, again, use code because it makes the graphic more maintainable.

We can also use various off-the-shelf libraries to generate HTML/Javascript fragments for creating inline interactives that can be embedded within the notebook, or saved and then reused elsewhere.

There are also several toolkits around for creating other sorts of diagram from code, as I’ve written about previously, such as the tools provided on blockdiag.com:

Aside from making diagrams more easily maintainable, and allowing them to be rendered inline within a Jupyter notebook that also contains the programmatic “source code” for the diagram, written diagrams also provide a way in to the automatic generation of figure longdesc text.

Electrical circuit schematics can also be written and embedded in a Jupyter notebook, as this Schemdraw example shows:

So far, I haven’t found an example of a schematic plotting library that also allows you to simulate the behaviour of the circuit from the same definition though (eg I can’t simulate(d, …) in the above example, though I could presumably parameterise a circuit definition for a simulation package and use the same parameter values to label a corresponding Schemdraw circuit).

There are some notations that are “executable”, though. For example, the sympy (symbolic Python) package lets you write texts using python variables that can be rendered either as a symbol using mathematical notation, or by their value.

[Image: sympydemo1]

(There’s a rendering bug in the generated Mathjax in the notebook I was using – I think this has been corrected in more recent versions.)

We can also use interactive widgets to help us identify and set parameter values to generate the sort of example we want:

[Image: sympydemo2]

Sympy also provides support for a wide range of calculations. For example, we can “write” a formula, render it using mathematical notation, and then evaluate it. A Jupyter notebook plugin (not shown) allows python statements to be included and executed inline, which means that expressions and calculations can be included – and evaluated – inline. Changing the parameters in an example is then easy to achieve, with the added benefit that the guaranteed correct result of automatically evaluating the modified expression can also be inlined.

[Image: sympdemo3]

(For interactive examples, see the notebooks in the sympy folder here; the notebooks are also runnable by launching a mybinder container – click on the launch:binder button to fire one up.)

It looks like there are also tools out there for converting from LaTeX math expressions to sympy equivalents.

As well as writing mathematical expressions that can be both expressed using mathematical notation, and evaluated as a mathematical expression, we can also write music, expressing a score in notational form or creating an admittedly beepy audio file corresponding to it.

(For an interactive example, run the midiMusic.ipynb notebook by clicking through on the launch:binder button from here.)

We can also generate audio files from formulae (I haven’t tried this in a sympy context yet, though) and then visualise them as data.

[Image: audio6]
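As a minimal sketch of generating audio from a formula, the Python below samples a sine wave and writes it out as a WAV file using only the standard library. The function names, the 440 Hz tone, the sample rate and the output filename are all illustrative assumptions; this is not the code from the notebook pictured.

```python
import math
import os
import struct
import tempfile
import wave


def tone_samples(frequency_hz, seconds, sample_rate=8000, amplitude=0.5):
    """Sample amplitude * sin(2 * pi * f * t) at the given sample rate."""
    n_samples = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * i / sample_rate)
            for i in range(n_samples)]


def write_wav(path, samples, sample_rate=8000):
    """Write floats in the range -1..1 as 16 bit mono PCM."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


samples = tone_samples(440, 0.5)
out_path = os.path.join(tempfile.gettempdir(), "a440_demo.wav")
write_wav(out_path, samples)
# The same list of samples is also the data you would plot to see the waveform.
print(len(samples), "samples written to", out_path)
```

Because the audio is just a list of numbers computed from the formula, the same values can be handed straight to a plotting library to visualise the waveform.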

Packages such as librosa also seem to provide all sorts of tools for analysing and visualising audio files.

When we put together the Learn to Code MOOC for FutureLearn, which uses Jupyter notebooks as an interactive exercise environment for learners, we started writing the materials (web pages for the FutureLearn teaching text, notebooks for the interactive exercises) in Jupyter notebooks. The notebooks can export as markdown, the FutureLearn publishing system is based around content entered as markdown, so we should have been able to publish direct from the notebooks to FutureLearn, right? Wrong. The workflow doesn’t support it: editor takes content in Microsoft Word, passes it back to authors for correction, then someone does something to turn it into markdown for FutureLearn. Or at least, that’s the OU’s publishing route (which has plenty of other quirks too…).

Or perhaps that will soon be “was” the OU’s publishing route, because there’s a project on internally (the workshops around which I haven’t been able to make, unfortunately) to look at new authoring environments for producing OU content, though I’m not sure if this is intended to feed into the backend of the current route – Microsoft Word, Oxygen XML editor, OU-XML, HTML/PDF etc output – or envisages a different pathway to final output. I started to explore using Google docs as an OU XML exporter, but that raised little interest – it’ll be interesting to see what sort of authoring environment(s) the current project delivers.

(By the by, I remember being really excited about the OU-XML publishing system route when it was being developed, not least because I could imagine its potential for feeding other use cases, some of which I started to explore a few years later; I was less enthused by its actual execution and the lack of imagination around putting it to work though… I also thought we might be able to use FutureLearn as a route to exploring how we might not just experiment with workflows and publishing systems, but also the tech – and business models around the same – for supporting stateful and stateless interactive, online student activities. Like hosting a mybinder style service, for example, or embedded interactions like the O’Reilly Thebe demo, or even delivering a course as a set of linked Jupyter notebooks. You can probably guess how successful that’s been…)

So could Jupyter notebooks have a role to play in producing semi-automated content (automated, for example in the production of graphical objects and the embedding of automatically evaluated expressions)? Markdown export is already supported and it shouldn’t take someone too long (should it?!) to put together an nbformat exporter that could generate OU-XML (if that is still the route we’re going?)? It’d be interesting to hear how O’Reilly are getting on…
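To give a feel for how little might be involved, here is a rough Python sketch that walks the cells of a version 4 notebook (which is just JSON) and emits a simple XML tree. The element names (Document, Text, Code, Input, Output) are placeholders invented for this example and are not real OU-XML, and only markdown cells, code cells and plain stream outputs are handled.

```python
import xml.etree.ElementTree as ET


def joined(value):
    """Notebook JSON stores text either as a string or as a list of lines."""
    return "".join(value) if isinstance(value, list) else value


def notebook_to_xml(notebook):
    """Map notebook cells onto a made-up XML vocabulary."""
    root = ET.Element("Document")
    for cell in notebook.get("cells", []):
        source = joined(cell.get("source", ""))
        if cell.get("cell_type") == "markdown":
            ET.SubElement(root, "Text").text = source
        elif cell.get("cell_type") == "code":
            listing = ET.SubElement(root, "Code")
            ET.SubElement(listing, "Input").text = source
            for output in cell.get("outputs", []):
                if output.get("output_type") == "stream":
                    ET.SubElement(listing, "Output").text = joined(
                        output.get("text", ""))
    return root


# A tiny stand-in for what json.load() would return for an .ipynb file.
demo_notebook = {
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Week 3"]},
        {"cell_type": "code", "source": ["print(2 + 2)"],
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": ["4\n"]}]},
    ],
}

root = notebook_to_xml(demo_notebook)
print(ET.tostring(root, encoding="unicode"))
```

The point of the sketch is that the executed output travels with the code cell, so the exported document gets the generated output without anyone retyping it.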

Whatever, again…

Fighting With docker – and Pondering Innovation in an Institutional Context

I spent my not-OU day today battling with trying to bundle up a dockerised VM, going round in circles trying to simplify things a bit, and getting confused by docker-compose not working quite so well following an upgrade.

I think there’s still some weirdness going on (eg in docker-ui showing messed container names?) but I’m now way too confused to care or try to unpick it…

I also spent a chunk of time considering the 32 bit problem, but got nowhere with it… Docker is predominantly a 64 bit thing, but the course has decided in its wisdom that we have to support 32 bit machines, which means I need to find a way of getting a 32 bit version of docker into a base box (apt-get install docker.io I think?), finding a way of getting the vagrant docker provisioner to use it (would an alias help?), and checking that vagrant-docker-compose works in a 32 bit VM, then tracking down 32 bit docker images for PostgreSQL, MongoDB, dockerUI and OpenRefine (or finding build files for them so I can build my own 32 bit images).

We then need to be able to test the VM in a variety of regimes: 32 bit O/S on a 32 bit machine, 32 bit O/S on a 64 bit machine, 64 bit O/S on a 64 bit machine, with a variety of hardware virtualisation settings we might expect on students’ machines. I’m on a highly specced Macbook Pro, though, so my testing is skewed…

And I’m not sure I have it in me to try to put together 32 bit installs…:-( Perhaps that’s what LTS are for…?!;-)

(I keep wondering if we could get access to stats about the sorts of machines students are using to log in to the OU VLE from the user-agent strings of their browsers that can be captured in webstats? And take that two ways: 1) look to see how it’s evolving over time; 2) look to see what the profile of machines is for students in computing programmes, particularly those coming up to level 3 option study? That’s the sort of practical, useful data that could help inform course technology choices but that doesn’t have learning analytics buzzword kudos or budget attached to it though, so I suspect it’s not often championed…)
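A rough sketch of what that sort of profiling might look like in Python, guessing 32 bit versus 64 bit from tokens that commonly appear in browser user-agent strings. The token lists are assumptions rather than a complete classification, the sample strings are made up for illustration, and user-agent strings are not a reliable guide to the hardware (Macs, for example, typically don't declare a bitness at all).

```python
from collections import Counter

# Tokens that commonly signal a 64 bit OS (WOW64 = 32 bit browser on 64 bit Windows).
SIXTY_FOUR_BIT_TOKENS = ("win64", "x64", "wow64", "x86_64", "amd64")
THIRTY_TWO_BIT_TOKENS = ("i686", "i386")


def guess_os_bitness(user_agent):
    """Best-effort guess at OS bitness from a user-agent string."""
    ua = user_agent.lower()
    if any(token in ua for token in SIXTY_FOUR_BIT_TOKENS):
        return "64-bit"
    if any(token in ua for token in THIRTY_TWO_BIT_TOKENS):
        return "32-bit"
    if "windows nt" in ua:
        # Windows browsers that declare no 64 bit token are probably on 32 bit Windows.
        return "32-bit (probably)"
    return "unknown"


# Made-up sample log entries, for illustration only.
sample_user_agents = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0",
    "Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Firefox/38.0",
    "Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101 Firefox/38.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/600.8.9",
]

profile = Counter(guess_os_bitness(ua) for ua in sample_user_agents)
print(profile)
```

Run over a term's worth of VLE logs and grouped by module, a tally like this would show how quickly the 32 bit tail is shrinking.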

When LTS was an educational software house, I think there was also more opportunity, support and willingness to try to explore what the technology might be able to do for us and OUr students? Despite the continual round of job ads to support corporate IT, I fear that exploring the educational uses of software has not had much developer support in recent years…

An example of the sort of thing I think we could explore – if only we could find a forum to do so – is the following docker image that contains an OU customised IPython notebook: psychemedia/ouflpy_scipystacknserver

The context is a forthcoming FutureLearn course on introductory programming. We’re currently planning on getting students to use Anaconda to run the IPython Notebooks that provide the programming environment for the course, but I idly wondered what a Kitematic route might be like. (The image is essentially the scipystack and notebook server with a few notebook extensions and OU customisations installed.)

There are some sample (testing) notebooks here that illustrate some of the features.

Here’s the installation recipe:

– download and unzip the notebooks (double click the downloaded file) and keep a note of where you unzipped the notebook directory to.

– download and install Kitematic. This makes use of docker and Virtualbox – but I think it should install them both for you if you don’t already have them installed.

– start Kitematic, search for psychemedia/ouflpy_scipystacknserver and create an application container.

It should download and start up automatically.

When it’s running, click on the Notebooks panel and Enable Volumes. This allows the container to see a folder on your computer (“volumes” are a bit like folders that can be aliased or mapped on to other folders across devices).

Click the cog (settings) symbol in the Notebooks panel to get to the Volumes settings. Select the directory that you created when you unzipped the downloaded notebooks bundle.

Click on the Ports tab. If you click on the link that’s displayed, it should open an IPython notebook server homepage in your browser.

Here’s what you should see…

Click on a notebook link to open the notebook.

The two demo notebooks are just simple demonstrations of some custom extensions and styling features I’ve been experimenting with. You should be able to create your own notebooks, open other people’s notebooks, etc.

You can also run the container in the cloud. Tweak the following recipe to try it out on Digital Ocean: Getting Started With Personal App Containers in the Cloud or Running RStudio on Digital Ocean, AWS etc Using Tutum and Docker Containers. (That latter example you could equally well run in Kitematic – just search for and install rocker/rstudio.)

The potential of using containers still excites me, even after 6 months or so of messing around the fringes of what’s possible. In the case of writing a new level 3 computing course with a major practical element, limiting ourselves to a 32 bit build seems a backward step to me? I fully appreciate the need to make our courses as widely accessible as possible, and in as affordable a way as possible (ahem…) but here’s why I think supporting 32 bit machines for a new level 3 computing course is a backward step.

In the first case, I think we’re making life harder for OUrselves. (Trying to offer backwards compatibility is prone to this.) Docker is built for 64 bit and most of the (reusable) images are 64 bit. If we had the resource to contribute to a 32 bit docker ecosystem, that might be good for making this sort of technology accessible more widely internationally, as well as domestically, but I don’t think there’s the resource to do that? Secondly, we arguably worsen the experience for students with newer, more powerful machines (though perhaps this could be seen as levelling the playing field a bit?) I always liked the idea of making use of progressive enhancement as a way of trying to offer the best possible experience using the technology they have, though we’d always have to ensure we weren’t then favouring some students over others. (That said, the OU celebrates diversity across a whole range of dimensions in every course cohort…)

Admittedly, students on a computing programme may well have bought a computer to see them through their studies – if the new course is the last one they do, that might mean the machine they bought for their degree is now 6 years old. But on the other hand, students buying a new computer recently may well have opted for an affordable netbook, or even a tablet computer, neither of which can support the installation of “traditional” software applications.

The solution I’d like to explore is a hybrid offering, where we deliver software that makes use of browser based UIs and software services that communicate using standard web protocols (http, essentially). Students who can install software on their computers can run the services locally and access them through their browser. Students who can’t install the software (because they have an older spec machine, or a newer netbook/tablet spec machine, or who do their studies on a public access machine in a library, or using an IT crippled machine in their workplace (cough, optimised desktop, cOUgh..)) can access the same applications running in the cloud, or perhaps even from one or more dedicated hardware app runners (docker’s been running on a Raspberry Pi for some time I think?). Whichever you opt for, exactly the same software would be running inside the container and exposing it in the same way through a browser… (Of course, this does mean you need a network connection. But if you bought a netbook, that’s the point, isn’t it?!)

There’s a cost associated with running things in the cloud, of course – someone has to pay for the hosting, storage and bandwidth. But in a context like FutureLearn, that’s an opportunity to get folk paying and then top slice them with a (profit generating…) overhead, management or configuration fee. And in the context of the OU – didn’t we just get a shed load of capital investment cash to spend on remote experimentation labs and yet another cluster?

There are also practical consequences – running apps on your own machine makes it easier to keep copies of files locally. When running in the cloud, the files have to live somewhere (unless we start exploring fast routes to filesharing – Dropbox can be a bit slow at synching large files, I think…)

Anyway – docker… 32 bit… ffs…

If you give the container a go, please let me know how you get on… I did half imagine we might be able to try this for a FutureLearn course, though I fear the timescales are way too short in OU-land to realistically explore this possibility.

Using Open Public Data to Hold Companies to Account

I’ve been in a ranty mood all day today, so to finish it off, here are some thoughts about how we can start to use #opendata to hold companies to account. The trigger was finding a dataset released by the Care Quality Commission (CQC) listing the locations of premises registered with the CQC, and the operating companies of those locations (early observations on that data here).

The information is useful because it provides a way of generating aggregated lists of companies that are part of the same corporate group (for example, locations operated by Virgin Care companies, or companies operated by Care UK). When we have these aggregation lists, it means we can start to run the numbers across all the companies in a corporate group, and get some data back about how the companies that are part of a group are operating in general. The aggregated lists thus provide a basis for looking at the gross behaviour of a particular company. We can then start to run league tables against these companies (folk love league tables, right? At least, they do when it comes to public sector bashing). So we can start to see how the corporate groupings compare against each other, and perhaps also against public providers. Of course, there is a chance that the private groups will be shown to be performing better than public sector bodies, but that could be a useful basis for a productive conversation about why…

So what sorts of aggregate lists can we start to construct? The CQC data allows us to get lists of locations associated with various sorts of care delivery (care home, GP services, dentistry, more specialist services) and identify locations that are part of the same corporate group. For example, I notice that filtering the CQC data to care homes, the following are significant operators (the number relates to the number of locations they operate):

Voyage 1 Limited                          273
HC-One Limited                            169
Barchester Healthcare Homes Limited       168

When it comes to “brands”, we have the following multiple operators:

BRAND Four Seasons Group                   346
BRAND Voyage                               279
BRAND BUPA Group                           246
BRAND Priory Group                         183
BRAND HC-One Limited                       169
BRAND Barchester Healthcare                168
BRAND Care UK                              130
BRAND Caretech Community Services          118
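Counts like those above can be produced with a few lines of Python. The sketch below is a minimal illustration only: the column names, the "-" convention for a missing brand, and all of the rows are invented stand-ins, not the actual layout or contents of the CQC file.

```python
import csv
import io
from collections import Counter

# Invented sample rows; the real CQC column names may well differ.
SAMPLE_CSV = """Location Name,Care Home,Provider Name,Brand Name
Home 1,Y,Provider A Limited,BRAND Group A
Home 2,Y,Provider A Limited,BRAND Group A
Home 3,Y,Provider B Limited,BRAND Group A
Surgery 1,N,Provider C Limited,-
Home 4,Y,Provider D Limited,-
"""


def count_locations(rows, key, care_homes_only=True):
    """Count locations per value of the given column (provider or brand)."""
    counts = Counter()
    for row in rows:
        if care_homes_only and row["Care Home"] != "Y":
            continue
        value = row[key].strip()
        if value and value != "-":  # skip locations with no brand recorded
            counts[value] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_provider = count_locations(rows, "Provider Name")
by_brand = count_locations(rows, "Brand Name")

for name, n_locations in by_provider.most_common():
    print(f"{name:<30}{n_locations:>5}")
```

Counting by brand rather than by provider is what pulls separate operating companies in the same corporate group into a single line.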

For these operators, we could start to scrape their most recent CQC reports and build up a picture of how well the group as a whole is operating. In the same way that “armchair auditors” (whatever they are?!) are supposed to be able to hold local councils to account, perhaps they can do the same for companies, and give the directors a helping hand… (I would love to see open data activists buying a share and going along to a company shareholder meeting to give some opendata powered grief ;-)

Other public quality data sites provide us with hints at ways of generating additional aggregations. For example, from the Food Standards Agency, we can search on ‘McDonalds’ as a restaurant to bootstrap a search into premises operated by that company (although we’d probably also need to add in searches across takeaways, and perhaps also look for things like ‘McDonalds Ltd’ to catch more of them?).

Note – the CQC data provides a possible steer here for how other data sets might be usefully extended in terms of the data they make available. For example, having a field for “operating company” or “brand” would make for more effective searches across branded or operated food establishments. Having company number (for limited companies and LLPs etc) provided would also be useful for disambiguation purposes.

Hmm, I wonder – would it make sense to start to identify the information that makes registers useful, and that we should start to keep tabs on? We could then perhaps start lobbying for companies to provide that data, and check that such data is being and continues to be collected? It may not be a register of beneficial ownership, but it would provide handy cribs for trying to establish what companies are part of a corporate grouping…

(By the by, picking up on Owen Boswarva’s post The UK National Information Infrastructure: It’s time for the private sector to release some open data too, these registers provide a proxy for the companies releasing certain sorts of data. For example, we can search for ‘Tesco’ as a supermarket on the FSA site. Of course, if companies were also obliged to publish information about their outlets as open data – something you could argue that as a public company they should be required to do, trading their limited liability for open information about where they might exert that right – we could start to run cross-checks (which is the sort of thing real auditors do, right?) and publish complete records of publicly accountable performance in terms of regulated quality inspections.)

The CQC and Food Standards Agency both operate quality inspection registers, so what other registers might we go to to build up a picture of how companies – particularly large corporate groupings – behave?

The Environment Agency publish several registers, including one detailing enforcement actions, which might be interesting to track, though I’m not sure how the data is licensed? The HSE (Health & Safety Executive) publish various notices by industry sector and subsector, but again, I’m not too clear on the licensing? The Chief Fire Officers Association (CFOA) publish a couple of enforcement registers which look as if they cover some of the same categories as the CQC data – though how easy it would be to reconcile the two registers, I don’t know (and again, I don’t know how the license is actually registered). One thing to bear in mind is that where registers contain personally identifiable information, any aggregations we build that incorporate such data (if we are licensed to build such things) mean (I think) that we become data controllers for the purposes of the Data Protection Act (we are not the maintainers and publishers of the public register so we don’t benefit from the exemptions associated with that role).

Looking at the above, I’m starting to think it could be a really interesting exercise to pick some of the care home provider groups and have a go at aggregating any applicable quality scores and enforcement notices from the CQC, FSA, HSE and CFOA (and even the EA if any of their notices apply! Hmm… does any HSCIC data cover care homes at all too?) Coupled with this, a trawl of directors data to see how the separate companies in a group connect by virtue of directors (and what other companies may be indicated by common directors in a group?).
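As a rough sketch of the "connect by virtue of directors" idea, something like the following might do for a first pass. The company names, director names and the shape of the data are all made up for illustration; real director data would need pulling from a company register and cleaning first (name matching on directors is a problem in its own right).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (company, director) appointment pairs; not real data.
appointments = [
    ("Care Co North Ltd", "A. Smith"),
    ("Care Co South Ltd", "A. Smith"),
    ("Care Co South Ltd", "B. Jones"),
    ("Other Homes Ltd", "B. Jones"),
    ("Unrelated Ltd", "C. Brown"),
]


def companies_by_director(pairs):
    """Index the set of companies each director is appointed to."""
    index = defaultdict(set)
    for company, director in pairs:
        index[director].add(company)
    return index


def shared_director_links(pairs):
    """Return {(company_a, company_b): {directors in common}}."""
    links = defaultdict(set)
    for director, companies in companies_by_director(pairs).items():
        # Every pair of companies sharing this director gets a link.
        for a, b in combinations(sorted(companies), 2):
            links[(a, b)].add(director)
    return dict(links)


links = shared_director_links(appointments)
for (a, b), directors in sorted(links.items()):
    print(a, "<->", b, "via", ", ".join(sorted(directors)))
```

Companies with no directors in common with anyone else simply don't appear in the links, which is a handy (if crude) way of spotting candidates for a corporate grouping.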

Other areas perhaps worth exploring – farms incorporated into agricultural groups? (Where would we find that data? One register that could be used to partially hold those locations to account may be the public register of pesticide enforcement notices as well as other EA notices?)

As well as registers, are there any other sources of information about companies we can add in to the mix? There’s lots: for limited companies we can pull down company registration details and lists of directors (and perhaps struck off directors) and some accounting information. Data about charities should be available from the Charity Commission. The HSCIC produces care quality indicators for a range of health providers, as well as prescribing data for individual GP practices. Data is also available about some of the medical trials that particular practices are involved in.

At a local council level, local councils maintain and publish a wide variety of registers, including registers of gaming machine licenses, licensed premises and so on. Where the premises are an outlet of a parent corporate group, we may be able to pick up the name of the parent group as the licensee. (Via @OwenBoswarva, it seems the Gambling Commission has a central list of operating license holders and licensed premises.)

Having identified influential corporate players, we might then look to see whether those same bodies are represented on lobbyist groups, such as the EU register of commission expert groups, or as benefactors of UK Parliamentary All Party groups, or as parties to meetings with Ministers etc.

We can also look across all those companies to see how much money the corporate groups are sinking from the public sector, by inspecting who payments are made to in the masses of transparency spending data that councils, government departments, and services such as the NHS publish. (For an example of this, see Spend Small Local Authority Spending Index; unfortunately, the bulk data you need to run this sort of analysis yourself is not openly available – you need to aggregate and clean it yourself.)

Once we start to get data that lists companies that are part of a group, we can start to aggregate open public data about all the companies in the group and look for patterns of behaviour within the groups, as well as across them. Lapses in one part of the group might suggest a weakness in high level management (useful for the financial analysts?), or act as a red flag for inspection and quality regimes.
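To make that a bit more concrete, here's a minimal sketch of rolling inspection results up to group level. The group mapping, the register labels and the pass/fail records are all invented for the purposes of the example, and "lapse rate" is just my own crude summary measure, not anything a regulator publishes.

```python
from collections import defaultdict

# Hypothetical mapping from operating company to parent group.
group_of = {
    "Care Co North Ltd": "Care Co Group",
    "Care Co South Ltd": "Care Co Group",
}

# Hypothetical inspection records: (company, register, passed).
records = [
    ("Care Co North Ltd", "CQC", True),
    ("Care Co North Ltd", "FSA", False),
    ("Care Co South Ltd", "CQC", False),
    ("Care Co South Ltd", "HSE", True),
    ("Solo Home Ltd", "CQC", True),
]


def lapse_rate_by_group(records, group_of):
    """Return {group: fraction of inspections that were lapses}."""
    totals = defaultdict(lambda: [0, 0])  # group -> [inspections, lapses]
    for company, _register, passed in records:
        # Companies with no known parent are treated as their own group.
        group = group_of.get(company, company)
        totals[group][0] += 1
        if not passed:
            totals[group][1] += 1
    return {g: lapses / n for g, (n, lapses) in totals.items()}


rates = lapse_rate_by_group(records, group_of)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} of inspections flagged")
```

A group whose rate stands out from its peers might then be the one to look at more closely.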

Hmmm… methinks it’s time to start putting some of this open data to work; but put it to work by focussing on companies, rather than public bodies…

I think I also need to do a little bit of digging around how public registers are licensed? Should they all be licensed OGL by default? And what guidance, if any, is there around how we can make use of such data and not breach the Data Protection Act?

PS via @RDBinns, What do they know about me? Open data on how organisations use personal data, describing some of the things we can find from the data protection notifications published by the ICO [ICO data controller register].

Using Open Data to Hold Companies to Account?

Some rambling but possibly associated thoughts… I suggest you put Alice’s Restaurant on…

For some time now, I’ve had an uncomfortable feeling about the asymmetries that exist in the open data world as well as total confusion about the notion of transparency.

Part of the nub of the problem (for me) lies with the asymmetric disclosure requirements of public and private services. Public bodies have disclosure requirements (eg Local Government Transparency Code), private companies don’t. Public bodies disclose metrics and spend data, data that can be used in public contract tendering processes by private bodies against public ones tendering for the same service. The private body uses this information – and prices in a discount associated with not having to carry the cost of public reporting – into the bid. The next time the contract is tendered, the public body won’t have access to the (previously publicly disclosed) information that the private body originally had when making its bid. Possibly. I don’t know how tendering works. But from the outside, that’s how it appears to me. (Maybe there needs to be more transparency about the process?)

Open data is possibly a Big Thing. Who knows? Maybe it isn’t. Certainly the big consulting firms are calling it as something worth squillionty billionty of pounds. I’m not sure how they cost it. Maybe I need to dig through the references and footnotes in their reports (Cap Gemini’s Open Data Economy: Unlocking Economic Value by Opening Government and Public Data, Deloitte’s Open growth: Stimulating demand for open data in the UK or McKinsey’s Open data: Unlocking innovation and performance with liquid information). I don’t know how much those companies have received in fees for producing those reports, or how much they have received in consultancy fees associated with public open data initiatives – somehow, that spend data doesn’t seem to have been curated in a convenient way, or as a #opendatadata bundle? – but I have to assume they’re not doing it to fleece the public bodies and tee up benefits for their other private corporate clients.

Reminds me – I need to read Owen Boswarva’s Who supports the privatisation of Land Registry? and ODUG benefits case for open data release of an authoritative GP dataset again… And remind myself of who sits on the Open Data User Group (ODUG), and other UK gov departmental transparency boards…

And read the FTC’s report Data Brokers: A Call For Transparency and Accountability…

Just by the by, one thing I’ve noticed about a lot of opendata releases is that, along with many other sorts of data, they are most useful when aggregated over time or space, and/or combined with other data sets. Looking at the month on month reports of local spending data from my local council is all very well, but it gets more interesting when viewed over several months or years. Looking at the month on month reports of local spending data from my local council is all very well, but it gets more interesting when looking at spend across councils, as for example in the case of looking at spend to particular companies.
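By way of illustration, here's a small sketch of that sort of aggregation: the same payment rows summed by supplier across councils, or by council and supplier over time. The councils, suppliers and amounts are made up, and real spending data would need a fair bit of cleaning (supplier name normalisation in particular) before anything like this would work.

```python
from collections import defaultdict

# Hypothetical payment rows: (council, month, supplier, amount in pounds).
payments = [
    ("Council A", "2014-01", "Acme Services Ltd", 1000.0),
    ("Council A", "2014-02", "Acme Services Ltd", 1500.0),
    ("Council B", "2014-01", "Acme Services Ltd", 700.0),
    ("Council B", "2014-01", "Other Supplies Ltd", 200.0),
]

FIELDS = ("council", "month", "supplier")


def total_by(payments, *fields):
    """Sum payment amounts grouped by the named fields."""
    positions = [FIELDS.index(f) for f in fields]
    totals = defaultdict(float)
    for row in payments:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]
    return dict(totals)


# Spend with each supplier across all councils and months.
by_supplier = total_by(payments, "supplier")
# Spend with each supplier by each council, summed over time.
by_council_supplier = total_by(payments, "council", "supplier")

for key, amount in sorted(by_supplier.items()):
    print(key, amount)
```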

Aggregating public data is one of the business models that helps create some of the GDP figure that contributes to the claimed, anticipated squillionty billionty pounds of financial benefit that will arise from open data – companies like opencorporates aggregating company data, or Spend Network aggregating UK public spending data who hope to start making money selling products off the back of public open data they have curated. Yes – I know a lot of work goes in to cleaning and normalising that data, and that exploiting the data collection as a whole is what their business models are about – and why they don’t offer downloads of their complete datasets, though maybe licenses require they do make links to, or downloads of, the original (“partial”) datasets available?

But you know where I think the real value of those companies lies? In being bought out. By Experian, or Acxiom (if there’s even a hint of personally identifiable data through reverse engineering in the mix), or whoever… A weak, cheap, cop out business model. Just like this: Farmers up in arms over potential misuse of data. (In case you missed it, Climate Corporation was one of the OpenData500 that aggregated shed loads of open data – according to Andrew Stott’s Open Data for Economic Growth report for the World Bank, Climate Corp “uses 60 years of detailed crop yield data, weather observations from one million locations in the United States and 14 terabytes of soil quality data – all free from the US Government – to provide applications that help farmers improve their profits by making better informed operating and financing decisions”. It was also recently acquired by Monsanto – Monsanto – for just under a billion US $. That’s part of the squillionty billionties I guess. Good ol’ open data. Monsanto.)

Sort of related to this – that is, companies buying others to asset strip them for their data – you know all that data of yours locked up in Facebook and Google? Remember MySpace? Remember NNTP? According to the Sophos blog, Just Because You Don’t Give Your Personal Data to Google Doesn’t Mean They Can’t Acquire It. Or that someone else might buy it.

And as another aside – Google – remember Google? They don’t really “read” your email, at least, people don’t, they just let algorithms process it so the algorithms can privately just use that data to send you ads, but no-one will ever know what the content of the email was to trigger you getting that ad (‘cos the cookie tracking, cookie matching services can’t unpick ad bids, ad displays, click thrus, surely, can they?!), well – maybe there are side effects: Google tips off cops after spotting child abuse images in email (for some reason, after initially being able to read that article, my browser can’t load it atm. Server fatigue?). Of course, if Google reads your ads for blind business purposes and ad serving is part of that blind process you accept it. But how does the law enforcement ‘because we can even though you didn’t warrant us to?’ angle work? Does the Post Office look inside the envelope? Is surveillance actually part of Google’s business model?

If you want to up the paranoia stakes, this (from Ray Corrigan, in particular: “Without going through the process of matching each government assurance with contradictory evidence, something I suspect would be of little interest, I would like to draw your attention to one important misunderstanding. It seems increasingly to be the belief amongst MPs that blanket data collection and retention is acceptable in law and that the only concern should be the subsequent access to that data. Assertions to this effect are simply wrong.”) + that. Because one day, one day, they may just find your name on an envelope of some sort under a tonne of garbage. Or an algorithm might… Kid.

But that’s not what this post is about – what this post is about is… Way back when, so very long ago, not so very long ago, there was a license called GPL. GPL. And GPL was a tainting license. findlaw describes the consequences of reusing GPL licensed code as follows: Kid, ‘if a user of GPL code decides to distribute or publish a work that “in whole or in part contains or is derived from the [open source] or any part thereof,” it must make the source code available and license the work as a whole at no charge to third parties under the terms of the GPL (thereby allowing further modification and redistribution).

‘In other words, this can be a trap for the unwary: a company can unwittingly lose valuable rights to its proprietary code.’

Now, friends, GPL scared people so much that another license called LGPL was created, and LGPL allowed you to use LGPL licensed code without fear of tainting your own code with the requirement to open up your own code as GPL would require of it. ‘Cos licenses can be used against you.

And when it comes to open data licenses, they seem to be like LGPL. You can take open public data and aggregate it, and combine it, and mix it and mash it and do what you like with it and that’s fine… And then someone can come along and buy that good work you’ve done and do what they want with it. Even Monsanto. Even Experian. And that’s good and right, right? Wrong. The ODUG. Remember the ODUG? The ODUG is the Open Data User Group that lobbies government for what datasets to open up next. And who’s on the ODUG? Who’s there, sitting there, on the ODUG bench, right there, right next to you?

Kid… you wanna be the all-open, all-liberal open data advocate? You wanna see open data used for innovation and exploitation and transparency and all the Good Things (big G, big T) that open data might be used for? Or you wanna sit down on the ODUG bench? With Deloitte, and Experian, and so on…

And if you think that using a tainting open data license so anyone who uses that data has to share it likewise, aggregated, congregated, conjugated, disaggregated, mixed, matched, joined, summarised or just otherwise and anyways processed, is a Good Thing…? Then kid… they’ll all move away from you on the bench there…

Because when they come to buy you, they won’t want your data to be tainted in any way that means they’ll have to give up the commercial advantage they’ll have from buying up your work on that open data…

But this post? That’s not what this post is about. This post is about holding companies to account. Open data used to hold companies to account. There’s a story to be told that’s not been told about Dr Foster, and open NHS data and fear-mongering and the privatisation of the NHS and that’s one thing…

But another thing is how government might use data to help us protect ourselves. Because government can’t protect us. Government can’t make companies pay taxes and behave responsibly and not rip off consumers. Government needs our help to do that. But can government help us do that too? Protect and Survive.

There’s a thing that DECC – the Department of Energy and Climate Change – do, and that’s publish statistics about domestic energy price statistics and industrial energy price statistics and road fuel and other petroleum product price statistics, and they’re all meaningless. Because they bear little resemblance to spot prices paid when consumers pay their domestic energy bills and road fuel and other petroleum product bills.

To find out what those prices are you have to buy the data from someone like Experian, from something like Experian’s Catalist fuel price data – daily site retail fuel prices – data product. You may be able to calculate the DECC statistics from that data (or you may not) but you certainly can’t go the other way, from the DECC statistics to anything like the Experian data.

But can you go into your local library and ask to look at a copy of the Experian data? A copy of the data that may or may not be used to generate the DECC road fuel and other petroleum product price statistics (how do they generate those statistics anyway? What raw data do they use to generate those statistics?)

Can you imagine ant-eye-ant-eye-consumer data sets being published by your local council or your county council or your national government that can be used to help you hold companies to account and help you tell them that you know they’re ripping you off and your council off and your government off and that together, you’re not going to stand for it?

Can you imagine your local council publishing the forecourt fuel prices for one petrol station, just one petrol station, in your local council area every day? And how about if they do it for two petrol stations, two petrol stations, each day? And if they do it for three forecourts, three, can you imagine if they do it for three petrol stations…? And can you, can you imagine prices for 50 petrol stations a day being published by your local council, your council helping you inform yourself about how you’re being manipulated, can you imagine…? (It may not be so hard – food hygiene ratings are published for food retail environments across England, Northern Ireland and Wales…)

So let’s hear it for open data, and how open data can be used to hold corporates to account, and how public bodies can use open data to help you make better decisions (which is a good neoliberal position to take and one which the other folk on the bench tell you is what you want, and that markets work, though they also fall short of telling you that the models say that markets work with full information but you don’t have the information, and even if you did, you wouldn’t understand it, because you don’t really know how to make a good decision, but at the end of the day you don’t want a decision, you just want a good service fairly delivered, but they don’t tell you that it’s all right to just want that…)

And let’s hear it for public bodies making data available whether it’s open or not, making it available by paying for it if they have to and making it available via library services so that we can start using it to start holding companies to account and start helping our public services, and ourselves, protect ourselves from the attacks being mounted on us by companies, and their national government supporters, who take on debt, and who allow them to take on debt, to make dividend payouts but not capital investment and subsidise the temporary driving down of prices (which is NOT a capital investment) through debt subsidised loss leading designed to crush competition in a last man standing contest that will allow monopolistic last man standing price hikes at the end of it…

And just remember, if there’s anything you want, you know where you can get it… At Alice’s… or the library… only they’re shutting them down, aren’t they…? So that leaves what..? Google?

Regularly Scheduled FOI Requests as a None Too Subtle Regular OpenData Release Request? And Some Notes on Extending FOI

A piece of contextualisation in an interview piece with Green MP Caroline Lucas in Saturday’s Guardian (I didn’t do this because I thought it was fun) jumped out at me as I read it: More than 50 energy company employees have been seconded to the government free of charge, and dozens of them to the department of energy and climate change.

Hmm…. /site:gov.uk (secondment OR seconded) DECC/

How about the gov.uk site?

(I don’t know what’s going on in the fight between GDS and the data.gov.uk folk ito getting central gov #opendata info discoverable on the web, but the http://www.gov.uk domain seems to be winning out, not least because for departments who’re in that empire, that’s where any data that is eventually linked to from data.gov.uk will actually be published?)

So – it seems folk have been FOIing this sort of information, but it doesn’t look as if this sort of information is being published according to a regular schedule under an #opendata transparency agenda.

Anyone would think that the UK government wasn’t in favour of a little bit of light being shone on lobbying activity…

(What did happen to the lobbying bill? Oh, I remember, it got through in a form that doesn’t allow for much useful light shedding at all (enactment), and now Labour want to repeal it.)

I guess I could put a request in to the ODUG (Open Data User Group) for this data to be released as open data, but my hunch is it’s not really the sort of thing they’re interested in (I get the feeling they’re not about open data for transparency, but (perhaps unfairly…?!) see them more as a lobbying group (ODUG membership) for companies who can afford to buy data but who would rather the tax payer pays for its collection and the government then gifts it to them).

More direct would be to find a way of automating FOI requests using something like WhatDoTheyKnow that would fire off an FOI request to each central government department once a month asking for the list of secondments into and out of that department in the preceding month (or in the month one or two months before that, if they need a monthly salary payment cycle or two for that data to become available).
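I haven't looked at how (or whether) requests could actually be submitted automatically, so the sketch below only covers the easy bit: working out which month to ask about, given a lag, and generating the request text for each department. The department list, the wording and the function names are all my own assumptions; sending the request is left as an exercise.

```python
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

# Hypothetical, partial list of departments to send requests to.
DEPARTMENTS = ["Department of Energy and Climate Change", "HM Treasury"]


def target_month(today, lag_months=1):
    """Return (year, month) for the month `lag_months` before `today`."""
    index = today.year * 12 + (today.month - 1) - lag_months
    return index // 12, index % 12 + 1


def foi_request_text(department, today, lag_months=1):
    """Draft the text of a request for one department and one month."""
    year, month = target_month(today, lag_months)
    label = f"{MONTH_NAMES[month - 1]} {year}"
    return (
        f"Dear {department},\n\n"
        "Under the Freedom of Information Act 2000, please provide a list "
        "of all secondments into and out of the department that were "
        f"in place during {label}, including the seconding organisation.\n"
    )


# Draft this month's batch, asking about the month two months back.
for dept in DEPARTMENTS:
    print(foi_request_text(dept, date.today(), lag_months=2))
```

Run from a monthly scheduled job, that would give one draft request per department per month, with the lag giving the department a payroll cycle or two to get its records straight.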

Of course, it does seem a bit unfair that each government department should have to cover the costs of these requests, but as it stands I can’t make an FOI request of companies that choose to engage in this sort of presumably public service.

Making private companies offering public services under contract subject to FOI does seem to be on the agenda again though, after being knocked back around this time last year?:

An extension to the scope of the FOI Act was proposed a few weeks ago in the Public Bill Committee debate of the morning of Tuesday 18 March 2014 on the Criminal Justice & Courts Bill, columns 186-193:

Dan Jarvis: I beg to move amendment 37, in clause 6, page 6, line 29, at end insert—

‘(1A) The Code of Practice must include a requirement that a person carrying out electronic monitoring who is not a public authority as defined by section 3 of the Freedom of Information Act 2000 shall provide information in respect of the carrying out of electronic monitoring in the same manner as if they were such a public authority.’.

The Chair: With this it will be convenient to discuss amendment 38, in schedule 4, page 73, line 25, at end insert—

‘(1A) Where the Secretary of State enters into a contract with another person under paragraph 1(1), and that person is not a public authority for the purposes of section 3 of the Freedom of Information Act 2000, that person shall be designated by the Secretary of State as a public authority for the purposes of that section in relation to that contract.’.

I remind the Committee that this group is about freedom of information provisions as they apply to aspects of the Bill. Members will have the opportunity to debate the detail of secure colleges later.

Dan Jarvis: I serve notice that, unless sufficient assurances are received, we intend to put the amendments to a vote. [ Interruption. ] Dramatic! I sensed for a moment that there was a higher authority raising a concern about these amendments, but I shall plough on regardless, confident in the knowledge that they are true and right.

Anyone who knows the story of Jajo the rabbit will understand what I am about to say. For those members of the Committee who do not know, Jajo was the pet rabbit successfully registered as a court translator and then booked in for shifts following the Ministry of Justice’s outsourcing of language service contracts. Jajo’s short-lived translation career says less about his talent and much more about the importance of ensuring that public contracts delivered by private providers are properly managed.

As was touched on, Ministers now have to manage another fall-out. Two private providers of electronic monitoring overcharged the taxpayer by millions of pounds for tagging offenders who had died or moved abroad, or who were already back in prison. That underlines the case for the amendments.

Both amendments would seek to bring non-public providers of public services contracted out under the Bill within the scope of the Freedom of Information Act. Amendment 37 relates to clause 6 and the code of practice that would be issued by the Secretary of State on the processing of data related to electronic monitoring. It would require anyone carrying out monitoring related to the clauses to comply with FOI requests in the same way as public bodies do. Amendment 38 relates to schedule 4 and the arrangements for contracting out secure colleges, which are detailed in part 2. It would require anyone contracted to provide a secure college to comply with freedom of information in the same way. Both our proposals are worthy of consideration by the Committee.

We all know that the landscape of how public services are delivered is changing. The Government spend £187 billion on goods and services with third parties each year, about half of which is estimated to be on contracting out services. About half of all spending on public services now ends up in private providers’ hands and more and more private providers are bidding to take on the responsibility and financial rewards that come with large-scale public contracts. As outsourcing is stepped up, more and more information about public services and public money is being pulled out of the public domain. That presents a particular challenge that we must tackle.

As the Information Commissioner told the Justice Committee last year,

“if more and more services are delivered by alternative providers who are not public authorities, how do we get accountability?”

The rewards that third parties stand to gain need to go hand in hand with the duties of transparency and information sharing. The public should be able to ask about how, and how well, the service they are paying for is being run.

The Freedom of Information Act does provide for supply-chain companies to be considered to be holding information on behalf of a public authority. In practice, however, contracted providers in the justice sector are not subject to anywhere near the same transparency requirements as publicly-run services. Private prisons, for example, are not subject to FOI in the same way as public prisons and the experience of G4S, Serco and others will have influenced many other companies not to be as forthcoming as they might have been. That is why we need to build freedom of information into the contracts that the Government make with third parties.

The Committee will be aware that such an approach was recommended by the Public Accounts Committee in its excellent report published last week. It made the point that many Departments are not providing information on how those contracts work on the grounds of commercial confidentiality. The public will not accept that excuse for much longer.

Let me conclude my remarks by offering the Committee a final quote. Someone once said:

“Information is power. It lets people hold the powerful to account”

and it should be used by them to hold their

“public services to account”.

I agree with the Prime Minister. Two years ago, he spoke about

“the power of transparency”

and

“why we need more of it.”

He also spoke of leading

“the most transparent Government ever.”

Labour has pledged that the next Labour Government will deal with the issue by bringing companies providing public contracts into the scope of FOI legislation.

Freedom of information can be uncomfortable. It can shed light on difficult issues and be problematic for Government Ministers, but that is the point. The Committee has the opportunity today to improve the Bill and to get a head start.

Dr Huppert: I will not detain the Committee. I share the concern about the lack of FOI for private organisations providing public services. My colleagues and I have expressed concerns about that for many years, and the previous Government were not very good at accepting that. It is good news that the Labour party may undo that error.

Mr Slaughter: Can the hon. Gentleman say what steps he and the coalition have taken to extend FOI in the past four years?

Dr Huppert: Not as many as I would like, but we have seen progress in some areas; we did not see any at all when the hon. Gentleman was a Minister. I hope we will see the correct drive. I share the concern that we need transparency when public services are delivered by private companies. They must not be shielded. I look forward to hearing what the Minister has to say, because he has commented on such issues before.

It is important that the matter should be dealt with on a global scale. I think the shadow Minister would agree that the case is broader. I hope to hear from the Minister that there will be more work to look at how the issue can be addressed more generally, rather than just in a specific case. That would probably require amendment of the Freedom of Information Act. That is probably the best way to resolve the issue, rather than tacking it on to this area, but I absolutely share the concerns. I hope we can see more transparency, both from the Government—we are seeing that—and from the private sector as it performs public functions.

Yasmin Qureshi: The Justice Committee, of which I am a member, looked into the Freedom of Information Act and how it has been operating since it was passed many years ago. We spoke to different groups of people, representatives of councils and local authorities, the Information Commissioner and pressure groups. Generally, the view was that the Freedom of Information Act has been a force for good. The thing that people mentioned time and again was the fact that it applies only to public authorities and has a narrow remit in private companies. A lot of concern was expressed about that.

As my hon. Friend the Member for Barnsley Central said, just under £200 billion is being spent by the Government for private companies to carry out public work. The number of outsourcings could increase, especially in the criminal justice system. In the probation service there will be contracting out and privatisation, as well as changes in the criminal justice system in relation to legal aid and suchlike. We have concerns about the criminal justice system and the number of companies that will be carrying out work that the state normally does. It is an important issue.

Will the Minister give us an undertaking for whenever Government money is given to carry out work on behalf of the Government? Local authorities and Government Departments have to provide information, and it should be the same for private companies. At the moment, as the shadow Minister mentioned, the agencies providing some of the public work give some information, but it is not enough.

It is often hard to get information from private companies. It is important for the country that we know where public money is being spent and how private companies respond to such things. We can have party political banter, but freedom of information was introduced many years ago and has been working well. Freedom of information needs to be extended in light of the new circumstances. I ask for a clear commitment from the Government that they will encapsulate that in the Bill. They now have that opportunity; the Labour party has said that, if it was in government, it would certainly do so. The lacunae and the gaps would be addressed by the amendment, which would make it clear exactly how the regime applies. [Interruption.]

10.30 am
The Chair: I apologise for the background noise. We are looking into the cause.

Jeremy Wright: Thank you, Mr Crausby. I hope Jajo the rabbit is not responsible.

As the hon. Member for Barnsley Central said, amendment 37 seeks to introduce a requirement as to the contents of the code of practice that the Secretary of State will issue under proposed new section 62B of the Criminal Justice and Court Services Act 2000, which is to be introduced through clause 6. The Secretary of State would have to include provisions in the code of practice requiring providers of outsourced electronic monitoring services to make information available in the same manner as if they were subject to the provisions of the Freedom of Information Act. The aim of the amendment seems essentially to extend the Act to providers of electronic monitoring not already subject to its provisions.

Amendment 38 has the same basic intention in that it seeks to extend the Freedom of Information Act to providers of secure colleges that have entered a contract with the Secretary of State to do so under schedule 4. The approach differs, however, because amendment 38 would extend the Act directly, whereas amendment 37 seeks to extend its obligations through code of practice guidance.

In other words, both amendments would require private providers not currently subject to the Freedom of Information Act to make information available both in response to FOI requests and proactively through publication schemes. Section 5 of the Act already provides a power to extend the Act’s provisions to contractors providing public services. For reasons I will try to outline, the Government do not currently propose to adopt that approach and are adopting an alternative method to ensure transparency. I am aware, however, of the long-standing and serious concerns raised on the position under the Act of private providers of public services. It might help the hon. Member for Hammersmith to know that the Government are committed to, and have taken steps to extend, the Act. More than 100 additional organisations have been included since 2010, and we are considering other ways in which its scope may be widened.

The issue of outsourced public services was considered in some detail during post-legislative scrutiny of the Freedom of Information Act by the Select Committee on Justice in 2012. I do not know whether the hon. Member for Bolton South East was a member of the Committee at that time, but the Committee rightly issued a reminder that

“the right to access information is crucial to ensuring accountability and transparency for the spending of taxpayers’ money”.

The Committee recommended the use of contractual provisions, rather than the formal extension of the Act, to ensure that transparency and accountability are maintained. In particular, the Committee said:

“We believe that contracts provide a more practical basis for applying…outsourced services than partial designation of commercial companies under section 5 of the Act”.

The Committee also felt that

“the use of contractual terms to protect the right to access information is currently working relatively well.”

The Government’s approach is consistent with that recommended by the Justice Committee.

In addition to information being made available proactively, the Government are taking steps to issue a revised code of practice under section 45 of the Freedom of Information Act to promote transparency on outsourced public services in response to FOI requests. The code of practice will be issued later this year and will promote and encourage the use and enforcement of contractual obligations to ensure that private bodies not subject to the Act provide appropriate assistance where information about outsourced public services is requested from bodies that are subject to the Act.

The Government recognise that only a small amount of information held by private providers is currently often formally subject to the Act. Our code of practice will encourage public authorities to go further, to interpret their freedom of information obligation broadly and to release more information on a voluntary basis, where it would be in the public interest to do so. In the event of non-compliance, it will also be possible for the Information Commissioner to issue and publish a practice recommendation setting out steps that, in his view, the public authority should take to promote conformity with the guidance.

Mr Slaughter: I seem to remember taking part in the Westminster Hall debate arising out of the Justice Committee’s deliberation and I do not think that it was very happy with the approach that the Government are taking, particularly where they are seeking to restrict freedom of information further. Does the hon. Gentleman accept on the basis of what he has just said that this will not be a level playing field and that the same requirements that apply to public bodies will not apply to private organisations undertaking an effectively identical role? Does he accept that, whatever the merits of his scheme, it does not go far enough and does not address the comments of my hon. Friend the Member for Barnsley Central?

Jeremy Wright: The hon. Gentleman will recognise that the organisations we are talking about extending the provisions of the Act to cover vary hugely in size and level of resources. The concern is to draw the appropriate balance between giving correct access to information and not imposing intolerable burdens on organisations, particularly smaller ones. That is the balance that has to be struck. We are looking at ways in which we can continue to make public authorities responsible for supplying information but ensure that it comes from the place where it originated, which may be those other organisations.

Mr Slaughter: That is a different argument and one that is often tried. It was tried in relation to universities and to the smaller district councils much beloved of the hon. Member for Bromley and Chislehurst. There are already limitations within the Act. There are safeguards for organisations in terms of the amount of time and cost. Why are they not sufficient?

Jeremy Wright: As I said, there is a balance to be struck. We attempt to strike that balance correctly with our proposals. If I can explain what we want to do a little more fully, perhaps the hon. Gentleman will be reassured—although frankly I doubt it. There is an opportunity for us to look at the issue in a sensible way with the code of practice. Applying our forthcoming code of practice guidance across the public sector will ensure that transparency and response to freedom of information requests will be maintained in a consistent way. This is preferable—I agree with my hon. Friend the Member for Cambridge—to the more piecemeal approach promoted by amendments 37 and 38.

The success of our own code of practice will be monitored by the Ministry of Justice and the Information Commissioner. We were clear in our response to post-legislative scrutiny of the Freedom of Information Act that, should this approach yield insufficient dividends, we will consider what other steps are necessary. In summary, we are committed to ensuring transparency in relation to all outsourced public services, including electronic monitoring and secure colleges. We are taking steps to ensure that through the code of practice to be issued later this year. On that basis, I invite the hon. Gentleman to withdraw his amendment.

Yasmin Qureshi: The Minister referred to the Select Committee on Justice and its recommendations. As you know, without going into the detail of that discussion, Select Committee recommendations sometimes tend to be compromises. At the time, three issues were in the mind of the Select Committee. First, it did not realise that a legislative opportunity would come so soon in which to put the measure in a more codified way with a clearer legal obligation. Secondly, there was quite a lot of discussion about private companies. The Select Committee accepted that the Freedom of Information Act should not apply to purely private companies carrying out purely private work; it was not really arguing against that. However, here we have an opportunity to codify once and for all in legislation the provision that the FOIA should apply whenever public money is paid to a private company to carry out work. That would be a fairly straightforward provision. I do not see why we need to go down the complicated route of using a code of practice, putting in a specific provision in a new contract each time something happens. Why can we not just have a general provision that applies to every situation?

Jeremy Wright: I was a member of the Justice Committee before the hon. Lady was, so I understand her point that recommendations of the Select Committee are a matter of discussion and compromise. However, they are made on a cross-party basis, and paid all the more attention to for that reason. I quoted directly from the Select Committee’s conclusions in what I said earlier.

On the hon. Lady’s other point, this may be an earlier legislative opportunity than the Select Committee anticipated, but of course, it is only an opportunity in relation to specific policies. Again, I rather agree with the point made earlier by my hon. Friend the Member for Cambridge: there is an argument for addressing the issue, not on a piecemeal basis, but more comprehensively.

The hon. Lady’s final point is that the approach that we have set out—using a code of practice—is inadequate and that a statutory approach should be introduced by amending primary legislation. An initial approach of using a code of practice is a sensible one. She will recognise that amendment 37, tabled by the hon. Member for Barnsley Central, deals with a requirement in a code of practice, not primary legislation. Amendment 38 is different, but in relation to electronic monitoring, on which a number of concerns have been expressed, the hon. Gentleman’s chosen vehicle is a code of practice. The code of practice approach appears to be welcomed by both sides of the Committee.

Dan Jarvis: I have listened carefully to the Minister’s response. Clearly, we will want to look carefully at the detail of what he has said about a code of practice.

I agree with my hon. Friend the Member for Bolton South East that the Committee has an opportunity this morning to make progress on redefining the freedom of information. I have heard the Minister’s response to that point, but the reality is that the move would be popular with the public.

There is no doubt that the landscape in which public services are delivered is changing. The Opposition have pledged to reform freedom of information if we are in government from 2015. I am mindful of the Prime Minister’s comments, which I quoted earlier. He said:

“Information is power. It lets people hold the powerful to account”,

and it should be used by them to hold their public services to account.

Mike Kane: Does my hon. Friend agree that, as the contracting out of public services expands, the public’s right to information shrinks?

Dan Jarvis: I agree absolutely. There is a degree of inevitability that we will see change in the area. The debate is about how we do it, and it is important that we have that debate. We have tabled the amendments partly so that we can take the opportunity to debate such issues.

Mr Slaughter: There is another point here, which is that the Ministry of Justice is particularly vulnerable on the issue. We have had the privatisation of the probation service and the scandals regarding tagging. We will come later in the Bill to proposals about the externalisation of the collection of fines and other matters. First, that is going on wholesale in the Department, and secondly, it is defective in many aspects. It is particularly relevant that the Minister should accept that the proposals in the Bill are not sufficient.

Dan Jarvis: My hon. Friend is right. In the context of the delivery of public services within the Ministry of Justice remit, this is a particularly relevant, timely and important issue. It has been incredibly useful to have the opportunity to debate it owing to the tabling of the amendments.

10.45 am
I mentioned that I was mindful of the Prime Minister’s comments, and I am mindful of the fact that the Justice Secretary has also indicated a desire to reform freedom of information. Given that there is a general acknowledgment that the status quo is not acceptable and despite what the Minister has said in response to our amendment, I will press it to a vote.

The amendment was defeated.

An hour or so later, the government took this line:

Daily Hansard, Commons, Tuesday 18 March 2014 – c.639

Freedom of Information Act
23. Lindsay Roy (Glenrothes) (Lab): What plans he has to bring forward legislative proposals to expand the scope of the Freedom of Information Act 2000.

The Minister of State, Ministry of Justice (Simon Hughes): There has been good progress in extending the implementation of the Freedom of Information Act because the coalition Government pledged to extend its scope to provide greater transparency. We extended it in 2010 to academies, in 2011 to the Association of Chief Police Officers, the Financial Ombudsman Service and the Universities and Colleges Admissions Service, and last year to 100 companies wholly owned by more than one public authority. The next item on the agenda is to do with Network Rail, and we are awaiting a view from the Department for Transport as to whether it thinks it would be appropriate for that to be implemented this year.

Lindsay Roy: What benefits have accrued to the Government and citizens from the implementation of the Act, and when does the Minister plan to extend its scope further?

Simon Hughes: We intend to extend it further as soon as is practical. One specific issue that I hope will be of interest to the hon. Gentleman—as it is to colleagues of his, including those who have come to see me about it—is that we intend to publish a revised code of practice to make sure that private companies that carry out public functions have freedom of information requirements in their contracts and go further than that. We hope that that will be in place by the end of this year.

Mr Mark Harper (Forest of Dean) (Con): There is one area where the Minister should perhaps look at narrowing the scope of the Act, because my understanding is that requests can be made by anybody anywhere on the face of the earth; they do not have to be British citizens. It is not the role of the British Government to be a taxpayer-funded research service for anyone on the globe. May I suggest that he narrow the scope to those for whom the Government work—citizens of our country?

Simon Hughes: I well understand my hon. Friend’s point. There will be two consultations this year: first, on precisely such issues about the scope of the current legislation to make sure that it is not abused while we retain freedom of information as a principle of Government; and secondly, on extending it to other areas where we have not gone so far.

Dr Huppert: I read out the quote from someone who has made the position clear when it comes to private companies carrying out public functions. Indeed, the code of practice has exactly the wording used in amendment 11, which the hon. Gentleman supported when we debated it on Tuesday. I do not want to take up too much of the Chairman’s kindness to discuss an issue that was rejected at that point, but it is happening as we wanted.

The matter was also touched upon a couple of days later in a Public Bill Committee on the Criminal Justice and Courts Bill (Official Report, Thursday 20 March 2014, 257-259) where accountability around public contracts delivered by private providers was being discussed:

Mr Slaughter: Absolutely not. I hope that the hon. Gentleman has read the article about Jago the rabbit that my hon. Friend the Member for Barnsley Central (Dan Jarvis) and I wrote for The Independent yesterday [It’s time we extended Freedom of Information to public services run by private companies – just ask Jago the Rabbit], which dealt with what should be done, which is to bring these companies within the ambit of FOI, and what the Minister of State did—with his usual skill, shall we say?—at Justice questions on Tuesday. He implied that that was what was going to happen, whereas in fact he was doing nothing more than putting round the line that the Cabinet Office has already indicated.

If I am wrong about that, I will give way in a moment and the hon. Gentleman can come back to me, but my understanding is that the Government—both parts of it, as long as they are just about coalescing—are of the view that the contracts that are drawn up should include this notional transparency. That is to say that they will encourage public authorities to encourage private companies to put clauses into contracts that will expose as much as possible, within the realms of commercial confidentiality. So the contracts will be open, with publication on websites and so forth of as much information about the contract as the two parties think fit. What we will not have is a duty on those private companies—in so far as they are carrying out public functions—to comply with the terms of the Freedom of Information Act, as would be the case in the public sector.

I accept that they are two sides of the same coin. On the one hand, of course it is a good idea that the information is made available voluntarily, but if it is not—either because the company does not choose to do so or because the contract is not drafted sufficiently well to ensure that it must—the citizen must have the right, through FOI, to require that information to be made available. As far as I am concerned, that is not what was said on Tuesday. I know that there is consultation going on, but if it is the intention of the Government—at least the Liberal Democrat part of the Government—to follow the line taken by my right hon. Friend the Member for Tooting (Sadiq Khan), the shadow Lord Chancellor, which he has repeated often in recent months, and require all those private companies performing public functions to come within the requirements of the Freedom of the Information Act, I would be pleased if the hon. Gentleman said so now.

Mr Slaughter: I take from that comment that even the hon. Gentleman does not understand what the Minister of State, Ministry of Justice, the right hon. Member for Bermondsey and Old Southwark, says, so opaque is it. If nobody, including the Minister, is going to answer my question, the answer will no doubt come out in the wash on a later occasion. However, it seems to me that that is not what is being promised. If it were, the Minister would be jumping up and claiming credit for it, but he is not. Therefore, I assume that that is not the case.

The significance of that is that those four companies about which I have just raised doubts—G4S, Serco, Capita, and we can safely add Atos—all told the Public Accounts Committee that they were prepared to accept the measures that the Committee proposed. It therefore appears that the main barrier to greater transparency lies within Government.

That is where we are. Even the companies that want to put themselves and the interests of their shareholders first are more keen on transparency and on answering the legitimate questions that are being asked by everyone— from ourselves to the chief inspector of prisons—than this Government are.

I say that because if we are to take this further leap down that path, it is only right that the Government do not just challenge, as the Minister has said, acknowledged frauds, but look at the entire performance behaviour, as well as the number of available companies that could step into the breach and deal with these matters.

What we must conclude from the conjunction of clauses 17 and 18 is that, first, the Government are prepared to take this leap in the dark, in terms of the reconfiguration of the youth estate and, secondly, that they are prepared to leave that entirely in the hands of the people who failed so many times in so many contracts, not least in running parts of the adult prison service.

For more on some of the specifics, see the House of Commons Public Accounts Committee report on “Contracting out public services to the private sector”, which for example recommended “that the Cabinet Office should explore how the FOI regime could be extended to cover contracts with private providers, including the scope for an FOI provision to be included in standard contract terms; that neither the Cabinet Office nor departments should routinely use commercial confidentiality as a reason for withholding information about contracts with private providers; [and that] The Cabinet Office should set out a plan for departments to publish routinely standard information on their contracts with private providers”.

There are also a couple of related private members’ bills floating around at the moment – Grahame Morris’ empty Freedom of Information (Private Healthcare Companies) Bill 2013-14 “to amend the Freedom of Information Act 2000 to apply to private healthcare companies”, and Caroline Lucas’ Public Services (Ownership and User Involvement) Bill 2013-14 “to put in place mechanisms to increase the accountability, transparency and public control of public services, including those operated by private companies”. The latter proposes (http://www.publications.parliament.uk/pa/bills/cbill/2013-2014/0160/cbill_2013-20140160_en_2.htm#l1g5):

5 Transparency
(1) Where a relevant authority starts the process of procurement for a public services contract, it must make available to the public details of all bids received prior to the conclusion of the procurement exercise.
(2) Where a relevant authority enters into a public services contract, details of that contract shall be made available to the public within 28 days of the procurement decision.

6 Freedom of information
(1) The Secretary of State must designate as a public authority, pursuant to section 5(1)(b) of the Freedom of Information Act 2000, companies or other bodies which enter into a public services contract.
(2) “Public services contract” has the meaning contained within section 8 of this Act.
(3) The Secretary of State shall maintain a list of companies designated under section 6(1) of this Act.
(4) Requests under the Freedom of Information Act 2000 in respect of such companies or bodies can only be made in respect of information relevant to the provision of a public services contract.
(5) The Secretary of State must designate as a public authority, pursuant to section 5(1)(b) of the Freedom of Information Act 2000, any utility company subject to regulation by regulatory authorities as defined in section 8.

Finally, on the accountability and transparency thing, there’s a consultation on at the moment regarding “smaller authorities with an annual turnover not exceeding £25,000, including parish councils, [who] will be exempt from routine external audit” but instead will be subject to a transparency code (Draft transparency code for parish councils – consultation).

Related: Spending & Receipts Transparency as a Consequence of Accepting Public Money? If you accept public money for contracts that would otherwise be provided by a public service you should be subject to the same levels of FOI and transparency reporting. Why should public services have to factor this in to their bids for running a service when private companies don’t?

Other reading to catch up on: Commons Public Administration Select Committee [PASC] Report on Statistics and Open Data (evidence).

MOOC Busting: Personal Googalytics…

Reading Game Analytics: Maximizing the Value of Player Data earlier this morning (which I suggest might be a handy read if you’re embarking on a learning analytics project…) I was struck by the mention of “player dossiers”. A Game Studies article from 2011 by Ben Medler, Player Dossiers: Analyzing Gameplay Data as a Reward, describes them as follows:

Recording player gameplay data has become a prevalent feature in many games and platform systems. Players are now able to track their achievements, analyze their past gameplay behaviour and share their data with their gaming friends. A common system that gives players these abilities is known as a player dossier, a data-driven reporting tool comprised of a player’s gameplay data. Player dossiers presents a player’s past gameplay by using statistical and visualization methods while offering ways for players to connect to one another using online social networking features.

Which is to say – you can grab your own performance and achievement data and then play with it, maybe in part to help you game the game.

The Game Analytics book also mentioned the availability of third party services, built on top of game APIs, that provide users with analytics tools not otherwise supported by the game publishers.

Hmmm…

What I started to wonder was – are there any services out there that allow you to aggregate dossier material from different games to provide a more rounded picture of your performance as a gamer, or maybe services that homologate dossiers from different games to give overall rankings?

In the learning analytics space, this might correspond to getting your data back from a MOOC provider, for example, and giving it to a third party to analyse. As a user of a MOOC platform, I doubt that you’ll be allowed to see much of the raw data that’s being collected about you; I’m also wary that institutions that sign up to MOOC platforms will also get screwed by the platform providers when it comes to asking for copies of the data. (I suggest folk signing their institutions up to MOOC platforms talk to their library colleagues, and ask how easy it is for them to get data (metadata, transaction data, usage data etc etc) out of the library system vendors, and what sort of contracts got them into the mess they may admit to being in.)

(By the by, again the Game Analytics book made a useful distinction – that of viewing folk as customers (i.e. people you can eventually get money from), or as players of the game (or maybe in MOOC land, learners). Whilst you may think of yourself as a player (learner), what they really want to do is develop you as a customer. In this respect, I think one of the great benefits of the arrival of MOOCs is that it allows us to see just how we can “monetise” education and lets us talk freely and, erm, openly, in cold hard terms about the revenue potential of these things, and how they can be used as part of a money making/sales venture, without having to pretend to talk about educational benefits, which we’d probably feel obliged to do if we were talking about universities. Just like game publishers create product (games) to make money, MOOCspace is about businesses making money from education. (If it isn’t, why is venture capital interested?))

Anyway, that’s all by the by, not just the by the by bit: this was just supposed to be a quick post, rather than a rant, about how we might do a little bit to open up part of the learning analytics data collection process to the community. (The technique generalises to other sectors…) The idea is built on appropriating a technology that many website publishers use to collect data, the third party service that is Google Analytics (eg from 2012, 88% of Universities UK members use Google Analytics on their public websites). I’m not sure how many universities use Google Analytics to track VLE activity though? Or how many MOOC operators use Google Analytics to track activity on course related pages? But if there are some, I think we can grab that data and pop it into a communal data pool; or grab that data into our own Google Account.

So how might we do that?

Almost seven years ago now – SEVEN YEARS! – in a post entitled They Stole OUr Learning Environment – Now We’re Stealing It Back, I described a recipe for customising a VLE (virtual learning environment – the thing that MOOC operators are reimagining and will presumably start (re)selling back to educational institutions as “Cloud based solutions”) – by injecting a panel that allowed you to add your own widgets from third party providers. The technique relied on a browser extension that allowed you to write your own custom JavaScript programs that would be injected into the page just before it finished loading. In short, it used an extension that essentially allowed you to create your own additional extensions within it. It was an easy way of writing browser extensions.

That’s all a rather roundabout way of saying we can quite easily write extensions that change the behaviour of a web page. (Hmm… can we do this for mobile devices?) So what I propose – though I don’t have time to try it and test it right now (the rant used up the spare time I had!) – is an extension that simply replaces the Google Analytics tracking code with another tracking code:

– either a “common” one, that pools data from multiple individuals into the same Google Analytics account;
– or a “personal” one, that lets you collect all the data that the course provider was using Google Analytics to collect about you.

(Ideally the rewrite would take place before the tracking script is loaded? Or we’d have to reload the script with the new code if the rewrite happens too late? I’m not sure how the injection/replacement of the original tracking code with the new one actually takes place when the extension loads?)

Another “advantage” of this approach is that you hijack the Google Analytics data so it doesn’t get sent to the account of the person whose site you’re visiting. (Google Analytics docs suggest that using multiple tracking codes is “not supported”, though this doesn’t mean it can’t be done if you wanted to just overload the data collection (i.e. let the publisher collect the data to their account, and you just grab a copy of it too…).)

(An alternative, cruder, approach might be to create an extension that purges Google Analytics code within a page, and then inject your own Google Analytics scripts/code. This would have the downside of not incorporating the instrumentation that the original page publisher added to the page. Hmm.. seems I looked at this way back when too… Collecting Third Party Website Statistics (like Yahoo’s) with Google Analytics.)
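As a rough, untested sketch of the "replace the tracking code" idea, here is how the rewriting step might look in TypeScript. It assumes the page carries the classic inline Google Analytics snippet with a property ID of the form `UA-<account>-<property>`; the function names, the `ScriptLike` type and the placeholder ID used in the usage note are all made up for illustration. It also sidesteps the timing question raised above: rewriting the text of a script that has already run does not re-run it, so this only helps if the rewrite happens before the snippet executes.

```typescript
// Hypothetical sketch: swap the publisher's Google Analytics property ID
// for our own. Assumes the classic "UA-<account>-<property>" ID format.
const UA_ID_PATTERN = /\bUA-\d{4,10}-\d{1,4}\b/g;

// Minimal stand-in for an inline <script> element (assumed shape).
interface ScriptLike {
  textContent: string | null;
}

// Return the snippet text with every tracking ID replaced.
function rewriteTrackingId(snippet: string, replacementId: string): string {
  return snippet.replace(UA_ID_PATTERN, replacementId);
}

// List the distinct tracking IDs found in a snippet.
function findTrackingIds(snippet: string): string[] {
  const matches: string[] = snippet.match(UA_ID_PATTERN) || [];
  return matches.filter((id, index) => matches.indexOf(id) === index);
}

// Rewrite a collection of inline scripts in place and
// report how many of them were changed.
function rewriteScripts(scripts: ScriptLike[], replacementId: string): number {
  let changed = 0;
  for (const script of scripts) {
    const original = script.textContent || "";
    const rewritten = rewriteTrackingId(original, replacementId);
    if (rewritten !== original) {
      script.textContent = rewritten;
      changed += 1;
    }
  }
  return changed;
}
```

In an extension or userscript, the list of scripts might come from something like `Array.from(document.querySelectorAll("script:not([src])"))`, with your own property ID (say a placeholder such as `UA-00000000-1`) passed as `replacementId`. Using `findTrackingIds` first would let you log which account the publisher was sending data to before you swap it out.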

All good fun, eh? And for folk operating cMOOCs, maybe this represents a way of tracking user activity across multiple sites (though to mollify ethical considerations, tracking/analytics code should probably only be injected onto whitelisted course related domains, or users presented with a “track my activity on this site” button…?)
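The whitelisting idea could be as simple as a hostname check that runs before any tracking code is injected. A minimal sketch, assuming a hand-maintained list of course domains (the function name and the example domains in the usage note are placeholders, not real course sites):

```typescript
// Hypothetical sketch: only allow tracking on whitelisted course domains.
// A hostname matches if it equals a listed domain or is a subdomain of it.
function isCourseDomain(hostname: string, allowedDomains: string[]): boolean {
  const host = hostname.toLowerCase();
  return allowedDomains.some((domain) => {
    const suffix = "." + domain.toLowerCase();
    return host === domain.toLowerCase() || host.slice(-suffix.length) === suffix;
  });
}
```

An extension might call `isCourseDomain(window.location.hostname, ["example.org", "courses.example.com"])` and bail out when it returns `false`. Matching on the dot-prefixed suffix means `www.example.org` is accepted but a lookalike such as `badexample.org` is not.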

Fragmentary Observations from the Outside About How FutureLearn’s Developing

I’m outside the loop on all matters FutureLearn related, so I’m interested to see what I can pick up from fragments that do make it onto the web.

So for example, from a presentation by Hugh Davis to the M25 Libraries conference April 2013 about Southampton’s involvement with FutureLearn, Collaboration, MOOCs and Futurelearn, we can learn a little bit about the FutureLearn pitch to partners:

More interesting, I think, is this description of what some of the FutureLearn MOOCs might look like:

“miniMOOCs” containing 2 to 3 learning units, each 2-6 hours of study time, broken into 2-3 self-contained learning blocks (which suggests 1-2 hours per block).

So I wonder, based on the learning block sequence diagram, and the following learning design elements slide:

Will the platform be encouraging a learning design approach, with typed sequences of blocks that offer templated guides as to how to structure that sort of design element? Or is that way off the mark? (Given the platform is currently being built (using Go Free Range for at least some of the development, I believe), it’s tricky to see how this is being played out, given courses and platform both need to be ready at the same time, and it’s hard to write courses using platform primitives if the platform isn’t ready yet?)

Looking elsewhere (or at least, via @patlockley), we may be able to get a few more clues about the line partners are taking towards FutureLearn course development:

Hmm, I wonder – would it be worth subscribing to jobs feeds from the partner universities over the next few months to see whether any other FutureLearn related posts are being opened up? And does this also provide an opportunity for the currently rather sparse FutureLearn website to start promoting those job ads? And come to that, how come the jobs that have been appointed at FutureLearn weren’t advertised on the FutureLearn website…?

Because appointments have been made, as LinkedIn suggests… Here’s who’s declaring an association with the company at the moment:

We can also do a slightly broader search:

There’s also a recently closed job ad with a role that doesn’t yet appear on anyone’s byline:

So what roles have been filled according to this source?

CEO
Head of Content
Head of UK Education & HE Partnerships
CTO
Senior Project Manager / Scrum Master (Contract)
Agile Digital Project Manager
Product manager
Marketing and Communications Assistant
Interim HR Consultant
Learning Technologist
Commercial and Operations Director for Launch
Global Digital Marketing Strategist

Here’s another one, Academic Lead [src].

By the by, I also notice that the OU VC, Martin Bean, has just been appointed as a director of FutureLearn Ltd.

Exciting times, eh…?!;-)

Related: OU Launches FutureLearn Ltd

PS v loosely related (?!) – (Draft) Coursera data export policy

PPS I also noticed this the other day – OpenupEd (press release) an EADTU co-ordinated portal that looks like a clearing house for OER powered MOOCs from universities across the EU (particularly open universities, including, I think, The OU…;-)

Moving Machines…

I’ve just taken on a new desktop computer – the first desktop machine I’ll have used as a daily machine for seven or eight years. As with every new toy, there is the danger of immediately filling it with the same crud that I’ve got on my current laptop, but I’m going to try to limit myself to installing things that I actually use…

My initial download list (the computer is a Mac):

A lot of files I work with are on Google docs, so I don’t actually need to install them at all – I just need a browser to access them
an alternative browser: Macs come with Safari preinstalled but I tend to use Chrome; I don’t sign in to Chrome, although I do use it on several machines. Being able to synch bookmarks would be handy, but I’m not sure I want to inflict the scores of open tabs I have onto every browser I open…
Dropbox desktop: I need to rethink my Dropbox strategy, and indeed the way I organise files, but Dropbox on the desktop is really handy…having downloaded and configured the client, it started synching my Dropbox files by itself (of course…;-). I’ll probably add the Google Drive desktop client at some point too, but in that case I definitely need a better file management strategy…
Gephi: for playing with network visualisations, and one of the main reasons for getting the new machine. As Gephi is a Java app, I also needed to download a Java runtime in order to be able to run it
RStudio: I considered not bothering with this, pondering whether I could move wholesale to the hosted RStudio at crunch.kmi.open.ac.uk, but then went with the desktop version for several reasons: a) I tinker with RStudio all the time, and don’t necessarily want to share everything on Crunch (not because users can see each other’s files even if they aren’t public, rather: there’s the risk Crunch may disappear/become unavailable/I might be cast out of the OU etc etc); b) the desktop version plays nicely with git/github…
Git and Git for Mac: I originally downloaded Git for Mac, a rather handy UI client, thinking it would pull down a version of Git for the commandline that RStudio could play with. It didn’t seem to, so I pulled a git installer down too;
Having got Git in place, I cloned one project I’m currently working on from Github using RStudio, and another using Git for Mac; the RStudio project had quite a few package dependencies (ggplot2, twitteR, igraph, googleVis, knitr) so I installed them by hand. I really need to refactor my R code so that it installs any required packages if they haven’t already been installed.
One of the things I pulled from Github is a Python project; it has a few dependencies (simplejson (which I need to update away from?), tweepy, networkx, YQL), so I grabbed them too (using easy_install).
For my Python scribbles, I needed a text editor. I use TextWrangler on my laptop, and saw no reason to move away from it, so I grabbed that too. (I really need to become a more powerful user of TextWrangler – I don’t really know how to make proper use of it at all…)
Another reason for the big screen/bigger machine was to start working with SVG files – so I grabbed a copy of Inkscape and had a quick play with it. It’s been a long time since I used a mouse, and the Mac magic mouse seems to have a mind of its own (I far prefer two-finger click to RSI inducing right-click but haven’t worked out how/if magic mouse supports that?) but I’ve slowly started to find my way round it. Trying to import .eps files, I also found I needed to download and install Ghostscript (which required a little digging around until I found someone who’d built a Mac package/installer…)
I am reluctant to install a Twitter client – I think I shall keep the laptop open and running social tools so as not to distract myself with social conversation tools on the other machine…
I guess I’ll need to install a VPN client when I need to log in to the OU VPN network…
I had a brief go at wiring up Mac mail and iCal to the OU’s Outlook client using a Faculty cribsheet, but after a couple of attempts I couldn’t get it to take so guess I’ll just stick with the Outlook Web App.
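On the "install any required packages if they haven’t already been installed" point: I haven’t done it for the R code yet, but the same idea for the Python project might look something like the sketch below. A couple of caveats: I’ve used pip here rather than the easy_install I actually used, the import names are my assumption (they don’t always match the name you install a package under, which is why I’ve left YQL out), and the install call is commented out so that running the script only reports what’s missing.

```python
import importlib.util
import subprocess
import sys

# Assumed import names for some of the project's dependencies.
REQUIRED = ["simplejson", "tweepy", "networkx"]

def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [n for n in names if importlib.util.find_spec(n) is None]

def install(names):
    """Install packages with pip, using the same interpreter that runs this script."""
    if names:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *names])

if __name__ == "__main__":
    todo = missing_packages(REQUIRED)
    print("missing:", todo or "nothing")
    # install(todo)  # uncomment to actually install whatever is missing
```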

PS One of the reasons for grabbing this current snapshot of my daily tools is because the OU IT powers that be are currently looking at installing OU standard desktops that are intended to largely limit the installation of software to software from an approved list (and presumably offer downloads from an approved repository). I can see this has advantages for management (and might also have simplified my migration?), but it is also highly restrictive. One of the problems with instituting too much process is that folk find workarounds, like acquiring admin passwords (rather than being given their own admin/root accounts from the outset) or resetting machines to factory defaults to get around pre-installed admin bottlenecks. I appreciate this may go against the Computing Code of Conduct, but I rarely connect my machines directly to the OU network, instead favouring eduroam when on campus (better port access!) and using VPN if I ever need access to OU network services. Software is the stuff that allows computers to take on the form of an infinite number of tools – the IT stance seems to take the view that it’s a limited purpose tool and they’re the ones who set the limits. Which makes me wonder: maybe this is just another front on the “Coming Civil War over General-purpose Computing”…?

Enter the Market – Course Data

I’m not at Dev8Ed this week, though I probably should be, but here’s what I’d have probably tinkered with had I gone – a recipe for creating a class of XCRI (course marketing data) powered websites to support course choice on a variety of themes and that could be used to ruthlessly and shamelessly exploit any and every opportunity for segmenting audiences and fragmenting different parts of the market for highly targeted marketing campaigns. So for example:

let’s start with something easy and obvious: russelgroupunis.com (sic;-), maybe? Search for courses from Russell Group (research intensive) universities on a conservatively branded site, lots of links to research inspired resources, pre-emptively posted reading lists (with Amazon affiliate codes attached); then bring in a little competition, and set this site up as a Waitrose to the Sainsburys of 1994andallthat.com, a course choice site based around the 1994 Group Universities (hmmm: seems like some of the 1994 Group members are deserting and heading off to join the Russell Group?); worthamillionplus.com takes the Tesco ads for the Million+ group, maybe, and unireliance.com (University Alliance) the Morrisons(?) traffic. (I have no idea if these uni group-supermarket mappings work? What would similarly tongue-in-cheek broadsheet/tabloid mappings be I wonder?!). If creative arts are more your thing, there could be artswayforward.com for the UKIAD folk, perhaps?
there are other ways of segmenting the market, of course. University groupings organise universities from the inside, looking out, but how about groupings based on consumers looking in? At fiveAgrades.com, you know where the barrier is set, as you do with 9kQuality.com, whereas cheapestunifees.com could be good for bottom of the market SEO. wetakeanyone.com could help at clearing time (courses could be identified by looking at grade mappings in course data feeds), as could the slightly more upmarket universityclearingcourses.com. And so on.
National Student Survey data could also play a part in automatically partitioning universities into different verticals, maybe in support of FTSE-30 like regimes where only courses from universities in the top 30 according to some ranking scheme or other are included. NSS data could also power rankings of course. (Hmm… did I start to explore this for Course Detective? I don’t remember…Hmmm…)

The intention would be to find a way of aggregating course data from different universities onto a common platform, and then to explore ways of generating a range of sites, with different branding, and targeted at different markets, using different views over the same aggregated data set but similar mechanics to drive the sites.
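To make the "different views over the same aggregated data set" idea a bit more concrete, here’s a toy sketch. The course records are made up purely for illustration, and the field names are my own invention rather than anything from the XCRI schema; a real version would populate the list from the aggregated feeds.

```python
from dataclasses import dataclass

@dataclass
class Course:
    title: str
    university: str
    group: str         # mission group label, e.g. "Russell Group"
    annual_fee: int    # in pounds
    entry_grades: str  # e.g. "AAA"

# Made-up records for illustration only; not real course data.
COURSES = [
    Course("Physics BSc", "Uni A", "Russell Group", 9000, "AAA"),
    Course("Fine Art BA", "Uni B", "UKIAD", 7500, "BBC"),
    Course("Computing BSc", "Uni C", "Million+", 6000, "CCC"),
]

# Each site is just a name plus a filter over the shared data set.
SITES = {
    "russelgroupunis.com": lambda c: c.group == "Russell Group",
    "9kQuality.com": lambda c: c.annual_fee >= 9000,
    "cheapestunifees.com": lambda c: c.annual_fee <= 6000,
}

def listing(site):
    """Return the course titles a given site would show, cheapest first."""
    keep = SITES[site]
    return [c.title for c in sorted(COURSES, key=lambda c: c.annual_fee) if keep(c)]
```

Spinning up another site is then a matter of adding another entry to the dictionary (plus the branding, of course), which is rather the point: same mechanics, same data, different shop window.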

PS For a little inspiration about building course comparison websites based around XCRI data, NSS data and KIS data, it may be worth looking at how the NHS does it (another UK institution that’s hurtling towards privatisation…): for example, check out NHS Choices hospitals near you service, or alternatively compare GPs.

PPS If anyone did start to build out a rash of different course comparison sites on a commercial basis, you can bet that as well as seeking affiliate fees for things like lead generation (prospectuses downloaded/mailed, open day visits booked (in exchange for some sort of ‘discount’ to the potential student if they actually turn up to the open day), registrations/course applications made etc) advertising would play a major role in generating site revenue. If a single operator was running a suite of course choice sites, it would make sense for them to look at how cross-site exploitation of user data could be used to track users across sites and tune offerings for them. I suspect we’d also see the use of paid placement on some sites (putting results to the top of a search results listing based on payment rather than a more quality driven ranking algorithm), recreating some of the confusion of the early days of web search engines.

I suspect there’d also be the opportunity for points-make-prizes competitions, and other giveaways…

Or like this maybe?

Ahem…

[Disclaimer: the opinions posted herein are, of course, barely even my own, let alone those of my employer.]