OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for the ‘Stirring’ Category

Fragmentary Observations from the Outside About How FutureLearn’s Developing

I’m outside the loop on all matters FutureLearn related, so I’m interested to see what I can pick up from fragments that do make it onto the web.

So for example, from a presentation by Hugh Davis to the M25 Libraries conference April 2013 about Southampton’s involvement with FutureLearn, Collaboration, MOOCs and Futurelearn, we can learn a little bit about the FutureLearn pitch to partners:

FutureLearn Overview

More interesting, I think, is this description of what some of the FutureLearn MOOCs might look like:

MOOC Structure

“miniMOOCs” containing 2 to 3 learning units, each 2-6 hours of study time, broken into 2-3 self-contained learning blocks (which suggests 1-2 hours per block).

So I wonder, based on the learning block sequence diagram, and the following learning design elements slide:

learning design

Will the platform be encouraging a learning design approach, with typed sequences of blocks that offer templated guides as to how to structure that sort of design element? Or is that way off the mark? (Given the platform is currently being built (using Go Free Range for at least some of the development, I believe), it’s tricky to see how this is being played out: courses and platform both need to be ready at the same time, and it’s hard to write courses using platform primitives if the platform isn’t ready yet.)

Looking elsewhere (or at least, via @patlockley), we may be able to get a few more clues about the line partners are taking towards FutureLearn course development:

FutureLearn job ad - Leeds

Hmm, I wonder – would it be worth subscribing to jobs feeds from the partner universities over the next few months to see whether any other FutureLearn related posts are being opened up? And does this also provide an opportunity for the currently rather sparse FutureLearn website to start promoting those job ads? And come to that, how come the jobs that have been appointed at FutureLearn weren’t advertised on the FutureLearn website…?

Because jobs have been appointed, as LinkedIn suggests… Here’s who’s declaring an association with the company at the moment:

FutureLearn on LinkedIn

We can also do a slightly broader search:

futurelearn search

There’s also a recently closed job ad with a role that doesn’t yet appear on anyone’s byline:

Global digital marketing strategist

So what roles have been filled according to this source?

  • CEO
  • Head of Content
  • Head of UK Education & HE Partnerships
  • CTO
  • Senior Project Manager / Scrum Master (Contract)
  • Agile Digital Project Manager
  • Product manager
  • Marketing and Communications Assistant
  • Interim HR Consultant
  • Learning Technologist
  • Commercial and Operations Director for Launch
  • Global Digital Marketing Strategist

Here’s another one, Academic Lead [src].

By the by, I also notice that the OU VC, Martin Bean, has just been appointed as a director of FutureLearn Ltd.

Exciting times, eh…?!;-)

Related: OU Launches FutureLearn Ltd

PS v loosely related (?!) – (Draft) Coursera data export policy

PPS I also noticed this the other day – OpenupEd (press release) an EADTU co-ordinated portal that looks like a clearing house for OER powered MOOCs from universities across the EU (particularly open universities, including, I think, The OU…;-)

Written by Tony Hirst

April 25, 2013 at 9:46 am

Posted in Stirring

Moving Machines…

with 5 comments

I’ve just taken on a new desktop computer – the first desktop machine I’ll have used as a daily machine for seven or eight years. As with every new toy, there is the danger of immediately filling it with the same crud that I’ve got on my current laptop, but I’m going to try to limit myself to installing things that I actually use…

My initial download list (the computer is a Mac):

  • A lot of files I work with are on Google docs, so I don’t actually need to install them at all – I just need a browser to access them
  • an alternative browser: Macs come with Safari preinstalled but I tend to use Chrome; I don’t sign in to Chrome, although I do use it on several machines. Being able to synch bookmarks would be handy, but I’m not sure I want to inflict the scores of open tabs I have onto every browser I open…
  • Dropbox desktop: I need to rethink my Dropbox strategy, and indeed the way I organise files, but Dropbox on the desktop is really handy…having downloaded and configured the client, it started synching my Dropbox files by itself (of course…;-). I’ll probably add the Google Drive desktop client at some point too, but in that case I definitely need a better file management strategy…
  • Gephi: for playing with network visualisations, and one of the main reasons for getting the new machine. As Gephi is a Java app, I also needed to download a Java runtime in order to be able to run it
  • RStudio: I considered not bothering with this, pondering whether I could move wholesale to the hosted RStudio at crunch.kmi.open.ac.uk, but then went with the desktop version for several reasons: a) I tinker with RStudio all the time, and don’t necessarily want to share everything on Crunch (not because users can see each others’ files even if they aren’t public, rather: there’s the risk Crunch may disappear/become unavailable/I might be cast out of the OU etc etc); b) the desktop version plays nicely with git/github…
  • Git and Git for Mac: I originally downloaded Git for Mac, a rather handy UI client, thinking it would pull down a version of Git for the commandline that RStudio could play with. It didn’t seem to, so I pulled a git installer down too;
  • Having got Git in place, I cloned one project I’m currently working on from Github using RStudio, and another using Git for Mac; the RStudio project had quite a few package dependencies (ggplot2, twitteR, igraph, googleVis, knitr) so I installed them by hand. I really need to refactor my R code so that it installs any required packages if they haven’t already been installed.
  • One of the things I pulled from Github is a Python project; it has a few dependencies (simplejson (which I need to update away from?), tweepy, networkx, YQL), so I grabbed them too (using easy_install).
  • For my Python scribbles, I needed a text editor. I use TextWrangler on my laptop, and saw no reason to move away from it, so I grabbed that too. (I really need to become a more powerful user of TextWrangler – I don’t really know how to make proper use of it at all…)
  • Another reason for the big screen/bigger machine was to start working with SVG files – so I grabbed a copy of Inkscape and had a quick play with it. It’s been a long time since I used a mouse, and the Mac magic mouse seems to have a mind of its own (I far prefer two-finger click to RSI inducing right-click but haven’t worked out how/if magic mouse supports that?) but I’ve slowly started to find my way round it. Trying to import .eps files, I also found I needed to download and install Ghostscript (which required a little digging around until I found someone who’d built a Mac package/installer…)
  • I am reluctant to install a Twitter client – I think I shall keep the laptop open and running social tools so as not to distract myself by social conversation tools on the other machine…
  • I guess I’ll need to install a VPN client when I need to login to the OU VPN network…
  • I had a brief go at wiring up Mac mail and iCal to the OU’s Outlook client using a Faculty cribsheet, but after a couple of attempts I couldn’t get it to take so guess I’ll just stick with the Outlook Web App.
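On the "install any required packages if they haven’t already been installed" point, the same check is easy to sketch for the Python project. This is a minimal sketch rather than anything from the project itself: the import names are taken from the dependency list above, and the import name used for the YQL library in particular is a guess.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found on this machine."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names assumed from the dependency list above; the name used
# for the YQL library ("yql") is a guess and may differ.
REQUIRED = ["simplejson", "tweepy", "networkx", "yql"]

for name in missing_modules(REQUIRED):
    print("missing:", name)
```

Anything reported as missing could then be installed by hand (or the list fed to an installer), rather than finding out one ImportError at a time.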

PS One of the reasons for grabbing this current snapshot of my daily tools is because the OU IT powers that be are currently looking at installing OU standard desktops that are intended to largely limit the installation of software to software from an approved list (and presumably offer downloads from an approved repository). I can see this has advantages for management, (and might also have simplified my migration?) but it is also highly restrictive. One of the problems with instituting too much process is that folk find workarounds (like acquiring admin passwords, rather than being given their own admin/root accounts from the outset) or resetting machines to factory defaults to get around pre-installed admin bottlenecks. I appreciate this may go against the Computing Code of Conduct, but I rarely connect my machines directly to the OU network, instead favouring eduroam when on campus (better port access!) and using VPN if I ever need access to OU network services. Software is the stuff that allows computers to take on the form of an infinite number of tools – the IT stance seems to take the view that it’s a limited purpose tool and they’re the ones who set the limits. Which makes me wonder: maybe this is just another front on the “Coming Civil War over General-purpose Computing”…?

Written by Tony Hirst

September 13, 2012 at 8:41 pm

Posted in Admin..., Stirring

Enter the Market – Course Data

with one comment

I’m not at Dev8Ed this week, though I probably should be, but here’s what I’d have probably tinkered with had I gone – a recipe for creating a class of XCRI (course marketing data) powered websites to support course choice on a variety of themes and that could be used to ruthlessly and shamelessly exploit any and every opportunity for segmenting audiences and fragmenting different parts of the market for highly targeted marketing campaigns. So for example:

  • let’s start with something easy and obvious: russelgroupunis.com (sic;-), maybe? Search for courses from Russell Group (research intensive) universities on a conservatively branded site, lots of links to research inspired resources, pre-emptively posted reading lists (with Amazon affiliate codes attached); then bring in a little competition, and set this site up as a Waitrose to the Sainsburys of 1994andallthat.com, a course choice site based around the 1994 Group Universities (hmmm: seems like some of the 1994 Group members are deserting and heading off to join the Russell Group?); worthamillionplus.com takes the Tesco ads for the Million+ group, maybe, and unireliance.com (University Alliance) the Morrisons(?) traffic. (I have no idea if these uni group-supermarket mappings work? What would similarly tongue-in-cheek broadsheet/tabloid mappings be I wonder?!). If creative arts are more your thing, there could be artswayforward.com for the UKIAD folk, perhaps?
  • there are other ways of segmenting the market, of course. University groupings organise universities from the inside, looking out, but how about groupings based on consumers looking in? At fiveAgrades.com, you know where the barrier is set, as you do with 9kQuality.com, whereas cheapestunifees.com could be good for bottom of the market SEO. wetakeanyone.com could help at clearing time (courses could be identified by looking at grade mappings in course data feeds), as could the slightly more upmarket universityclearingcourses.com. And so on.
  • National Student Survey data could also play a part in automatically partitioning universities into different verticals, maybe in support of FTSE-30 like regimes where only courses from universities in the top 30 according to some ranking scheme or other are included. NSS data could also power rankings of course. (Hmm… did I start to explore this for Course Detective? I don’t remember…Hmmm…)

The intention would be to find a way of aggregating course data from different universities onto a common platform, and then to explore ways of generating a range of sites, with different branding, and targeted at different markets, using different views over the same aggregated data set but similar mechanics to drive the sites.
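The "different views over the same aggregated data set" idea can be sketched in a few lines. Everything here is illustrative: the course records and their field names are made up for the example (they are not the XCRI-CAP schema), and each "site" is reduced to a filter over the shared records.

```python
# Hypothetical aggregated course records; field names are illustrative,
# not the XCRI-CAP schema.
COURSES = [
    {"title": "Physics BSc", "provider": "Uni A", "group": "Russell Group", "fee": 9000},
    {"title": "Fine Art BA", "provider": "Uni B", "group": "UKIAD", "fee": 7500},
    {"title": "History BA", "provider": "Uni C", "group": "1994 Group", "fee": 9000},
    {"title": "Business BA", "provider": "Uni D", "group": "Million+", "fee": 6000},
]

# Each "site" is just a name plus a filter over the same data set.
SITES = {
    "russelgroupunis.com": lambda c: c["group"] == "Russell Group",
    "9kQuality.com": lambda c: c["fee"] >= 9000,
    "cheapestunifees.com": lambda c: c["fee"] <= 6000,
}

def site_listing(site, courses=COURSES):
    """Return the course titles a given site would list."""
    keep = SITES[site]
    return [c["title"] for c in courses if keep(c)]
```

The branding and templates would differ per site, but the mechanics (one data set, one filter per site) stay the same.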

PS For a little inspiration about building course comparison websites based around XCRI data, NSS data and KIS data, it may be worth looking at how the NHS does it (another UK institution that’s hurtling towards privatisation…): for example, check out NHS Choices hospitals near you service, or alternatively compare GPs.

PPS If anyone did start to build out a rash of different course comparison sites on a commercial basis, you can bet that as well as seeking affiliate fees for things like lead generation (prospectuses downloaded/mailed, open day visits booked (in exchange for some sort of ‘discount’ to the potential student if they actually turn up to the open day), registrations/course applications made etc) advertising would play a major role in generating site revenue. If a single operator was running a suite of course choice sites, it would make sense for them to look at how cross-site exploitation of user data could be used to track users across sites and tune offerings for them. I suspect we’d also see the use of paid placement on some sites (putting results to the top of a search results listing based on payment rather than a more quality driven ranking algorithm), recreating some of the confusion of the early days of web search engines.

I suspect there’d also be the opportunity for points-make-prizes competitions, and other giveaways…

Or like this maybe?

Ahem…

[Disclaimer: the opinions posted herein are, of course, barely even my own, let alone those of my employer.]

Written by Tony Hirst

May 29, 2012 at 12:38 pm

Posted in Stirring

Cognitive Waste and the Project Funding Bind

with one comment

As I tweeted earlier today: “A problem with project funding is that you’re expected to know what you’re going to do in advance – rather than discover what can be done..”

This was prompted by reading a JISC ITT (Invitation to Tender) around coursedata: Making the most of course information – xcri-cap feed use demonstrators. Here’s an excerpt from the final call:

JISC is seeking to fund around 6-10 small, rapid innovation projects to create innovative, engaging examples that demonstrate the use of the #coursedata xcri-cap feeds (either directly, or via the JISC Aggregator API). These innovative examples will be shared openly through the JISC web site and events to promote the good practice that has been adopted.
13. The demonstrators could use additional data sources such as geolocation data to provide a mash-up, or may focus on using a single institutional feed to meet a specific need.
14. The demonstrators should include a clear and compelling use case and usage scenario.
15. The range of demonstrators commissioned will cover a number of different approaches and is likely to include examples of:
• an online prospectus, such as a specialist courses directory;
• a mobile app, such as a course finder for a specific geographical area;
• a VLE block or module, such as a moodle block that identifies additional learning opportunities offered by the host institution;
• an information dashboard, such as a course statistics dashboard for managers providing an analysis of the courses your institution offers mashed up with search trends from the institutional website;
• a lightweight service or interface, such as an online study group that finds peers based on course description;
• a widget for a common platform, such as a Google Gadget that identifies online courses, and pushes updates to the users iGoogle page.
16. All demonstrators should be working code and must be available under an open source licence or reusable with full documentation. Project deliverables can build on proprietary components but wherever possible the final deliverables should be open source. If possible, a community-based approach to working with open source code should be taken rather than just making the final deliverables available under an open source licence.
17. The demonstrators should be rapidly developed and be ready to use within 4 months. It is expected most projects would not require more than 30 – 40 chargeable person days.

In addition:

23. Funding will not be allocated to allow a simple continuation of an existing project or activity. The end deliverable must address a specific need that is accepted by the community for which it is intended and produce deliverables within the duration of the project funding.
24. There should be no expectation that future funding will be available to these projects. The grants allocated under this call are allocated on a finite basis. Ideally, the end deliverables should be sustainable in their own right as a result of providing a useful solution into a community of practice.

The call appears to be open to all comers (for example, sole traders) and represents a way of spending money on bootstrapping innovation around course data feeds using HEFCE funding, in a similar way to how the Technology Strategy Board disburses money (more understandably?) to commercial enterprises, SMEs, and so on. (Although JISC isn’t a legal entity – yet – maybe we’ll start to see JISC trying to find ways in which it can start to act as a vehicle that generates returns from which it can benefit financially, eg as a venture funder, or as a generator of demonstrable financial growth?)

As with many JISC calls, the intention is that something “sustainable” will result:

22. Without formal service level agreements, dependency on third party systems can limit the shelf life of deliverables. For these types of projects, long term sustainability although always desirable, is not an expected outcome. However making the project deliverables available for at least one year after the end of the project is essential so opportunities are realised and lessons can be learned.

24. There should be no expectation that future funding will be available to these projects. The grants allocated under this call are allocated on a finite basis. Ideally, the end deliverables should be sustainable in their own right as a result of providing a useful solution into a community of practice.

All well and good. Having spent a shedload (technical term ;-) on getting institutions to open up their course data, the funders now need some uptake. That there aren’t more apps around course data to date is partly my fault. The TSO Open Up Competition prize I won secured a certain amount of TSO resource to build something around course code scaffolding data as held by UCAS (my proposal was more to do with seeing this data opened up as enabling data, rather than actually pitching a specific application…). As it turned out, UCAS (a charity operated by the HEIs, I think) were (still are?) too precious over the data to release it as open data for unspecified uses, so the prize went nowhere… Instead, HEFCE spent millions through JISC to get universities to open up course data (albeit probably more comprehensive than the UCAS data)…and now there’s an unspecified amount for startups and businesses to build services around the XCRI data. (Note to self: are UCAS using XCRI as an import format or not? If not, is HEFCE/JISC also paying the HEIs to maintain/develop systems that publish XCRI data as well as systems that publish data in an alternative way to UCAS?)

I think TSO actually did some work aggregating datasets around a, erm, model of the UCAS course data; so if they want a return on that work, they could probably pitch an idea for something they’ve already prepped and try to get HEFCE to pay for it, 9 months on from when I was talking to them at their expense…

Which brings me in part back to my tweet earlier today (“A problem with project funding is that you’re expected to know what you’re going to do in advance – rather than discover what can be done..”), as well as the mantra I was taught way back when I was a research student, that the route to successful research bids was to bid to do work you had already done (in part because then you could propose to deliver what you knew you could already deliver, or could clearly see how to deliver…)

This is fine if you know what you’re pitching to do (essentially, doing something you know how to do), as opposed to setting out to discover what sorts of things might be possible if you set about playing with them. Funders don’t like the play of course, because it smacks of frivolity and undirectedness, even though it may be a deeply focussed and highly goal directed activity, albeit one where the goal emerges during the course of the activity rather than being specified in advance.

As it is, funders tend to fund projects. They tell bidders what they want, bidders tell funders back how they’ll do it: either something they’ve already done (= guaranteed deliverable, paid for post hoc), or something they *think* they intend to do (couched in project management and risk assessment speak to mask the fact they don’t really know what’ll happen when they try to execute the plan, but that doesn’t really matter, because at the end of the day they have a plan and a set of deliverables against which they can measure (lack of) progress). In the play world, you generally do or deliver something because that’s the point – you are deeply engaged in and highly focussed on whatever it is that you’re doing (you are typically intrinsically motivated, and maybe also extrinsically motivated by whatever constraints or goals you have adopted as defining the play context/play world). During play, you work hard to play well. And then there’s the project world. In the project world, you deliver or you don’t. So what.

Projects also have overheads associated with them: from preparing, issuing, marking, awarding, tracking and reporting on proposals and funded projects on the funders’ side, to preparing, submitting, and managing the project on the other (aside from actually doing the project work – or at least, writing up what has previously been done in an appropriate way;-).

And then there’s the waste.

Clay Shirky popularised the notion of cognitive surplus to characterise creative (and often collaborative creative) acts done in folks’ free time. Things like Wikipedia. I’d characterise this use of cognitive surplus capacity as a form of play – in part because it’s intrinsically motivated, but also because it is typically based around creative acts.

But what about cognitive waste, such as arises from time spent putting together project proposals that are unfunded and then thrown away (why aren’t these bids, along with the successful ones, made open as a matter of course, particularly when the application is for public money from an applicant funded by public money?). (Or the cognitive waste associated with maintaining a regular blog… erm… oops…)

I’ve seen everything from bids containing literature reviews that rival anything in the (for fee, paywall protected, subscription required, author/institution copyright waived) academic press, as well as proposals that could be taken up, maybe in partnership, by SMEs for useful purpose (rather than by academic partners for conference papers), to time spent pursuing project processes, milestones and deliverables for the sole reason that they are in a plan that was defined before the space the project was pitched into was properly understood through engaging with it, rather than because they continue to make sense (if indeed they ever did). (And yes, I know that the unenlightened project manager who sees more merit in trying to stick to the project plan and original deliverables, rather than pivoting if a far more productive, valuable or useful opportunity reveals itself, is a mythical beast…;-).

Maybe the waste is important. Evolution is by definition a wasteful process, and maybe the route to quality is through a similar sort of process. Maybe the time, thought and effort that goes into unsuccessful bids really is cognitive waste, bad ideas that don’t deserve to be shared (and more than that, shouldn’t be shared because they are dangerously wrong). But then, I’m not sure how that fits in with project funding schemes that are over-subscribed, where even highly rated proposals (that would ordinarily receive funding) are rejected, whereas in an undersubscribed call (maybe because it is mis-positioned or even irrelevant), weak bids (that ordinarily wouldn’t be considered) get funding.

Or maybe cognitive waste arises from a broken system and broken processes, and really is something valuable that is being wasted in the sense of squandered?

Right – rant over, (no) (late)lunchtime over… back to the “work” thing, I guess…

PS via @raycorrigan: “Newton, Galileo, Maxwell, Faraday, Einstein, Bohr, to name but a few; evidence of paradigm shifting power of ‘cognitive waste’” – which is another sense of “waste” I hadn’t considered, which is waste (as in loss, or loss to an organisation) of good ideas through rejecting or not supporting the development of a particular proposal or idea..?

Written by Tony Hirst

May 25, 2012 at 2:16 pm

Posted in Stirring

Mapping the Tesco Corporate Organisational Sprawl – An Initial Sketch

with 8 comments

A quick sketch, prompted by Tesco Graph Hunting on OpenCorporates of how some of Tesco’s various corporate holdings are related based on director appointments and terminations:

The recipe is as follows:

- grab a list of companies that may be associated with “Tesco” by querying the OpenCorporates reconciliation API for tesco
- grab the filings for each of those companies
- trawl through the filings looking for director appointments or terminations
- store a row for each directorial appointment or termination including the company name and the director.

You can find the scraper here: Tesco Sprawl Grapher

import scraperwiki, simplejson,urllib

import networkx as nx

#Keep the API key private - via http://blog.scraperwiki.com/2011/10/19/tweeting-the-drilling/
import os, cgi
try:
    qsenv = dict(cgi.parse_qsl(os.getenv("QUERY_STRING")))
    ockey=qsenv["OCKEY"]
except:
    ockey=''

rurl='http://opencorporates.com/reconcile/gb?query=tesco'
#note - the opencorporates api also offers a search:  companies/search
entities=simplejson.load(urllib.urlopen(rurl))

def getOCcompanyData(ocid):
    ocurl='http://api.opencorporates.com'+ocid+'/data'+'?api_token='+ockey
    ocdata=simplejson.load(urllib.urlopen(ocurl))
    return ocdata

#need to find a way of playing nice with the api, and not keep retrawling

def getOCfilingData(ocid):
    ocurl='http://api.opencorporates.com'+ocid+'/filings'+'?per_page=100&api_token='+ockey
    tmpdata=simplejson.load(urllib.urlopen(ocurl))
    ocdata=tmpdata['filings']
    print 'filings',ocid
    #print 'filings',ocid,ocdata
    #print 'filings 2',tmpdata
    while tmpdata['page']<tmpdata['total_pages']:
        page=str(tmpdata['page']+1)
        print '...another page',page,str(tmpdata["total_pages"]),str(tmpdata['page'])
        ocurl='http://api.opencorporates.com'+ocid+'/filings'+'?page='+page+'&per_page=100&api_token='+ockey
        tmpdata=simplejson.load(urllib.urlopen(ocurl))
        ocdata=ocdata+tmpdata['filings']
    return ocdata

def recordDirectorChange(ocname,ocid,ffiling,director):
    ddata={}
    ddata['ocname']=ocname
    ddata['ocid']=ocid
    ddata['fdesc']=ffiling["description"]
    ddata['fdirector']=director
    ddata['fdate']=ffiling["date"]
    ddata['fid']=ffiling["id"]
    ddata['ftyp']=ffiling["filing_type"]
    ddata['fcode']=ffiling["filing_code"]
    print 'ddata',ddata
    scraperwiki.sqlite.save(unique_keys=['fid'], table_name='directors', data=ddata)

def logDirectors(ocname,ocid,filings):
    print 'director filings',filings
    for filing in filings:
        if filing["filing"]["filing_type"]=="Appointment of director" or filing["filing"]["filing_code"]=="AP01":
            desc=filing["filing"]["description"]
            director=desc.replace('DIRECTOR APPOINTED ','')
            recordDirectorChange(ocname,ocid,filing['filing'],director)
        elif filing["filing"]["filing_type"]=="Termination of appointment of director" or filing["filing"]["filing_code"]=="TM01":
            desc=filing["filing"]["description"]
            director=desc.replace('APPOINTMENT TERMINATED, DIRECTOR ','')
            director=director.replace('APPOINTMENT TERMINATED, ','')
            recordDirectorChange(ocname,ocid,filing['filing'],director)

for entity in entities['result']:
    ocid=entity['id']
    ocname=entity['name']
    filings=getOCfilingData(ocid)
    logDirectors(ocname,ocid,filings)

The next step is to graph the result. I used a Scraperwiki view (Tesco sprawl demo graph) to generate a bipartite network connecting directors (either appointed or terminated) with companies and then published the result as a GEXF file that can be loaded directly into Gephi.

import scraperwiki
import urllib
import networkx as nx

import networkx.readwrite.gexf as gf

from xml.etree.cElementTree import tostring

scraperwiki.sqlite.attach( 'tesco_sprawl_grapher')
# scraperwiki.sqlite.select() prepends SELECT to the query string
q = '* FROM "directors"'
data = scraperwiki.sqlite.select(q)

DG=nx.DiGraph()

directors=[]
companies=[]
for row in data:
    if row['fdirector'] not in directors:
        directors.append(row['fdirector'])
        DG.add_node(directors.index(row['fdirector']),label=row['fdirector'],name=row['fdirector'])
    if row['ocname'] not in companies:
        companies.append(row['ocname'])
        DG.add_node(row['ocid'],label=row['ocname'],name=row['ocname'])   
    DG.add_edge(directors.index(row['fdirector']),row['ocid'])

scraperwiki.utils.httpresponseheader("Content-Type", "text/xml")


writer=gf.GEXFWriter(encoding='utf-8',prettyprint=True,version='1.1draft')
writer.add_graph(DG)

print tostring(writer.xml)

Saving the output of the view as a gexf file means it can be loaded directly into Gephi. (It would be handy if Gephi could load files in from a URL, methinks?) A version of the graph, laid out using a force directed layout, with nodes coloured according to modularity grouping, suggests some clustering of the companies. Note that parts of the whole graph are disconnected.

In the fragment below, we see Tesco Property Nominees are only loosely linked to each other, and from the previous graphic, we see that Tesco Underwriting doesn’t share any recent director moves with any other companies that I trawled. (That said, the scraper did hit the OpenCorporates API limiter, so there may well be missing edges/data…)

And what is it with accountants naming companies after colours?! (It reminds me of sys admins naming servers after distilleries and Lord of the Rings characters!) Is there any sense in there, or is it arbitrary?

Written by Tony Hirst

April 12, 2012 at 3:56 pm

Tesco Graph Hunting on OpenCorporates

with one comment

A quick lunchtime post on some thoughts around constructing corporate graphs around OpenCorporates data. To ground it, consider a search for “tesco” run on gb registered companies via the OpenCorporates reconciliation API.

{"result":[{"id":"/companies/gb/00445790", "name":"TESCO PLC", "type":[{"id":"/organization/organization","name":"Organization"}], "score":78.0, "match":false, "uri":"http://opencorporates.com/companies/gb/00445790"}, {"id":"/companies/gb/05888959", "name":"TESCO AQUA (FINCO1) LIMITED", "type":[{"id":"/organization/organization", "name":"Organization"}], "score":71.0, "match":false, "uri":"http://opencorporates.com/companies/gb/05888959"}, { ...

Some or all of these companies may or may not be part of the same corporate group. (That is, there may be companies in that list with Tesco in the name that are not part of the group of companies associated with a major UK supermarket.)

If we treat the companies returned in that list as one class of nodes in a graph, we can start to construct a range of graphs that demonstrate linkage between companies based on a variety of factors. For example, a matching registered address (say, a post office box mediated address in an offshore tax haven) suggests there may be at least a weak tie between companies:

(Alternatively, we might construct bipartite graphs containing company nodes and address nodes, for example, then collapse the graph about common addresses.)
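Collapsing a company/address bipartite graph about common addresses might look something like the following. It is a sketch using only the standard library; the company names and addresses are hypothetical placeholders.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (company, registered address) pairs, for illustration only.
REGISTRATIONS = [
    ("Company A", "PO Box 1, Somewhere"),
    ("Company B", "PO Box 1, Somewhere"),
    ("Company C", "PO Box 1, Somewhere"),
    ("Company D", "2 High Street, Elsewhere"),
]

def collapse_on_address(registrations):
    """Project the company-address bipartite graph onto companies.

    Returns a dict mapping each (company, company) pair to the set of
    addresses the two companies share.
    """
    by_address = defaultdict(set)
    for company, address in registrations:
        by_address[address].add(company)
    ties = defaultdict(set)
    for address, companies in by_address.items():
        for a, b in combinations(sorted(companies), 2):
            ties[(a, b)].add(address)
    return dict(ties)
```

Companies with no shared address (Company D here) simply drop out of the collapsed graph.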

Shared directors would be another source of linkage, although at the moment, I don’t think OpenCorporates publishes directors associated with UK companies (I suspect that data is still commercially licensed?). However, there is associated information available in the OpenCorporates database already…. For example, if we look at the various company filings, we can pick up records relating to director appointments and terminations?

By monitoring filings, we can then start to build up a record of directorial involvement with companies? From looking at the filings, it also suggests that it would make sense to record commencement and cessation dates for directorial appointments…
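One way of recording that, sketched under the assumption that each filing can be reduced to a director name, a company, and commencement/cessation dates (the Appointment record, the names and the dates below are all made up for illustration), is to link two companies only where the same director held both posts at the same time:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Appointment:
    director: str
    company: str
    commenced: date
    ceased: Optional[date] = None  # None means no termination filing seen yet


def overlaps(a, b):
    """True if the two appointments share at least one day."""
    a_end = a.ceased or date.max
    b_end = b.ceased or date.max
    return a.commenced <= b_end and b.commenced <= a_end


def shared_director_links(appointments):
    """Return (company, company, director) triples for concurrent appointments."""
    links = set()
    for i, a in enumerate(appointments):
        for b in appointments[i + 1:]:
            if a.director == b.director and a.company != b.company and overlaps(a, b):
                first, second = sorted((a.company, b.company))
                links.add((first, second, a.director))
    return links


# Hypothetical appointments, invented for illustration.
APPOINTMENTS = [
    Appointment("J. Bloggs", "Company A", date(2008, 1, 1), date(2010, 6, 30)),
    Appointment("J. Bloggs", "Company B", date(2010, 1, 1)),
    Appointment("J. Bloggs", "Company C", date(2011, 1, 1)),
    Appointment("A. Other", "Company A", date(2009, 1, 1)),
]

links = shared_director_links(APPOINTMENTS)
for link in sorted(links):
    print(link)
```

Matching directors on name alone is obviously fragile (two different people can share a name), so treat any link found this way as a prompt for checking rather than as a fact.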

There may also be weak secondary evidence linking companies. For example, two companies that file trademarks using the same agent have a weak tie through that agent. (Of course, that agent may be acting for two completely independent companies.)

If we weight edges between nodes according to the perceived strength of a tie and then lay out the graph in a way that is sensitive to the number or weight of edge connections between company nodes, we may be able to start mapping out the corporate structure of these large, distributed corporations, either in network map terms, or maybe by mapping geolocated nodes based on registered addresses; and then we can start asking questions about why these distributed corporate entities are structured the way they are…
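As a sketch of the weighting step, suppose each kind of tie is given a score and the scores are summed per company pair. The weights below are arbitrary guesses at "perceived strength", and the tie names and companies are hypothetical; the resulting edge weights are the sort of thing a weight-sensitive layout algorithm could then be fed.

```python
from collections import defaultdict

# Hypothetical weights for the perceived strength of each kind of tie.
TIE_WEIGHTS = {
    "shared_director": 3.0,
    "shared_address": 1.0,
    "shared_trademark_agent": 0.5,
}

# Hypothetical observations: (company, company, kind of tie).
TIES = [
    ("Company A", "Company B", "shared_director"),
    ("Company B", "Company A", "shared_address"),
    ("Company A", "Company C", "shared_trademark_agent"),
]


def weighted_edges(ties, weights):
    """Sum tie weights into one undirected edge weight per company pair."""
    totals = defaultdict(float)
    for a, b, kind in ties:
        # Sort the pair so (A, B) and (B, A) land on the same edge.
        totals[tuple(sorted((a, b)))] += weights[kind]
    return dict(totals)


edges = weighted_edges(TIES, TIE_WEIGHTS)
strongest_first = sorted(edges.items(), key=lambda item: item[1], reverse=True)
for (a, b), weight in strongest_first:
    print(f"{a} -- {b}: {weight}")
```

Simple addition is only one choice; it treats independent weak ties as accumulating into a stronger one, which may or may not be what you want.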

PS note to self – OpenCorporates API limit with key: 1000/hr, 10k/day
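For what it's worth, a scraper could keep itself inside both of those limits with something like the sliding-window counter sketched below. CallBudget is a made-up helper, and I'm assuming the limits behave like rolling windows; the API may well count calls some other way (fixed clock hours, say).

```python
from collections import deque


class CallBudget:
    """Sliding-window limiter; limits is a list of (max_calls, window_seconds)."""

    def __init__(self, limits):
        self.limits = limits
        self.longest = max(window for _, window in limits)
        self.calls = deque()  # timestamps of recorded calls, oldest first

    def wait_needed(self, now):
        """Seconds to wait at time now before another call fits every limit."""
        # Forget calls that have fallen out of even the longest window.
        while self.calls and now - self.calls[0] >= self.longest:
            self.calls.popleft()
        wait = 0.0
        for max_calls, window in self.limits:
            recent = [t for t in self.calls if now - t < window]
            if len(recent) >= max_calls:
                # This call must age out of the window before we can proceed.
                blocking = recent[len(recent) - max_calls]
                wait = max(wait, blocking + window - now)
        return wait

    def record(self, now):
        """Note that a call was made at time now."""
        self.calls.append(now)


# The limits from the note above: 1000 calls per hour, 10k per day.
budget = CallBudget([(1000, 3600), (10000, 86400)])
print(budget.wait_needed(0.0))
```

In a real scraper you would pass in time.monotonic() as now, sleep for whatever wait_needed returns, make the request, then call record.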

Written by Tony Hirst

April 12, 2012 at 12:36 pm

Posted in Anything you want, Stirring


Autodiscoverable Feeds and UK HEIs (Again…)

with 8 comments

It’s that time of year again when Brian’s banging on about IWMW, the Institutional[ised?] Web Managers’ Workshop, and hence that time of year again when he reminds me* about my UK HE Feed Autodiscovery app that trawls through various UK HEI home pages (the ones on .ac.uk, rather than the one you get by searching for a uni name in Google;-)

* that is, tells me the script is broken and, by implication, gently suggests that I should fix it…;-)

As ever, most universities don’t seem to be supporting autodiscoverable feeds (neither are many councils…), so here are a few thoughts about what feeds you might link to, and why…

- news feeds: the canonical example. News feeds can be used to pipe news around various university websites, and also syndicate content to any local press or hyperlocal news sites. If every UK HEI published a news feed that was autodiscoverable as such, it would be trivial to set up a UK universities aggregated newswire.

- research announcements: I was told that one reason for putting out press releases was simply to build up an institutional memory/archive of notable events. Many universities run research newsletters that remark on awarded grants. How about a “funded research” feed from each university detailing grant awards and other research funding? Again, at a national level, this could be aggregated to provide a research funding newswire, as well as contributing data to local archives of research funding success.

- jobs: if every UK HEI published a jobs/vacancies RSS feed, it would be trivial to build an aggregator and let people roll their own versions of jobs.ac.uk.

- events: universities contribute a lot to local culture through public talks and exhibitions. Make it easy for the local press and hyperlocal news sites to syndicate this info, and add events to their own aggregated “what’s on” calendars. (And as well as RSS, give ‘em an iCal feed for your events.)

- recent submissions to local repository: provide a feed listing recent submissions to the local research output/paper repository (and/or maybe a feed of the most popular downloads); if local feeds are your thing, the library quite possibly makes things like recent acquisition feeds available…

- YouTube uploads: you might as well add an autodiscoverable feed to your university’s recent uploads on YouTube. If nothing else, it contributes an informal ownership link to the web for folk who care about things like that.

- your university Twitter feed: if you’ve got one. I noticed Glasgow Caledonian linked to their Twitter feed through an autodiscoverable link on their university homepage.

- tenders: there’s a whole load of work going on in gov at the moment regarding transparency as it relates to procurement and tendering. So why not get open with your procurement and tendering data, and increase the chances of SMEs finding out what you’re tendering around. If the applications have to go through a particular process, no problem: link to the appropriate landing page in each feed item.

- energy data: releasing this data may well become a requirement in the not so far off future, so why not get ahead of the game, e.g. as Lincoln are starting to do (Lincoln U energy data)? If everyone was publishing energy data feeds, I’m sure DevCSI hackday folk would quickly roll together something like the aggregating service built by college student @issyl0 out of a Rewired State hack that pulls together UK gov department energy data: GovSpark

- XCRI-CAP course marketing data feeds: JISC is giving away shed loads of cash to support this, so pull your finger out and get the thing published.

- location data: got a KML feed yet? If not, why not? e.g. Innovations in Campus Mapping

PS the backend of my RSS feed autodiscovery app (founded: 2008) is a Yahoo pipe. Just because, I thought I’d take half an hour out to try and build something related on Scraperwiki. The code is here: UK University Autodiscoverable RSS feeds. Please feel free to improve it, fork it, etc. University homepage URLs are identified by scraping a page on the Universities UK website, but I probably should use a feed from the JISC Monitoring Unit (e.g. getting UK University location/contact data).
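For anyone unfamiliar with how autodiscovery works: a page advertises its feeds through link elements in the HTML head with rel="alternate" and a feed MIME type. The following is a minimal sketch of the detection step using only the Python standard library; it is not the Yahoo pipe or the Scraperwiki code mentioned above, and the sample page and example.ac.uk URLs are invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect <link rel="alternate"> elements that advertise a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        if "alternate" in rels and attr.get("type", "").lower() in FEED_TYPES:
            self.feeds.append({
                "title": attr.get("title", ""),
                # Resolve relative hrefs against the page URL.
                "href": urljoin(self.base_url, attr.get("href", "")),
            })


def discover_feeds(html, base_url):
    """Return the feeds advertised in an HTML page, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


# Hypothetical homepage markup, invented for illustration.
PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="News" href="/news/rss.xml">
<link rel="alternate" type="application/atom+xml" title="Events" href="http://events.example.ac.uk/feed">
</head><body></body></html>"""

feeds = discover_feeds(PAGE, "http://www.example.ac.uk/")
for feed in feeds:
    print(feed["title"], feed["href"])
```

To run it over real homepages you would fetch each page first (urllib.request.urlopen would do) and pass the decoded HTML and the page URL to discover_feeds.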

PPS this could be handy for some folk – the code that runs the talks@cam events site: http://source.caret.cam.ac.uk/svn/projects/talks.cam/. (Thanks Laura:-) – does it do feeds nicely now?! Related: Keeping Up With Events, a quickly hacked app from my Arcadia project that (used to) aggregate Cambridge events feeds.)

Written by Tony Hirst

July 26, 2011 at 6:59 pm

Getting Access to University Course Code Data (or not… (yet…))

with 8 comments

A couple of weeks or so ago, having picked up the TSO OpenUp competition prize for suggesting that it would be a Good Thing for UCAS/university course code data to be made available, I had a meeting with the TSO folk to chat over “what next?” The meeting was an upbeat one with a plan to get started as soon as possible with a scrape of the UCAS website… so what’s happened since…?

First up – a reading of the UCAS website Terms and Conditions suggests that scraping is a no-no…

6. Intellectual property rights
e. Copying, distributing or any use of the material contained on the website for any commercial purpose is prohibited.
f. You may not create a database by systematically downloading substantial parts of the website

(In the finest traditions of the web, you aren’t allowed to deep link into the site without permission either: 6.c Links to the website are not permitted, other than links to the homepage for your personal use, except with our prior written permission. Links to the website from within a frameset definition are not permitted except with our prior written permission.)

So, err, I guess my link to the terms and conditions breaks those terms and conditions? Oops…;-) Should I be sending them something like this do you think?

Dear enquiries@ucas.ac.uk,
As per your terms and conditions, (paragraph 6 c) please may I publish a link to your terms and conditions web page [ http://www.ucas.com/terms_and_conditions ] in a blog post I am writing that, in part, refers to your terms and conditions?
Luv'n'hugs,
tony

As a fallback, I put a couple of trial balloon FOI requests in to a couple of universities asking for the course names and UCAS course codes for courses offered in 2010/11, along with the search keywords associated with each course (doh! I did it again, deep linking into the UCAS site…)

PS Please may I also link to the page describing course search keywords [ http://www.ucas.com/he_staff/courses/coursesearchkeywords ] ?

The first request went to the University of Southampton, in part because I knew that they already publish chunks of the data (as data) as part of their #opensoton Open Data initiative. (This probably means I was abusing the FOI system, but a point maybe needed to be made…?!;-) The second request was put in to the University of Bristol.

The requests were of the form:

I would be grateful if you could send me in spreadsheet, machine readable electronic form or plain text a copy of the course codes, course titles and search keywords for each course as submitted to UCAS for the 2010-2011 (October 2010) student entry.

If possible, would you also provide HESA subject category codes associated with each course.

So how did I get on?

Bristol’s response was as follows:

On discussion with our Admissions and Student Information teams, it appears that the University does not actually hold this data – it is held on a UCAS database. UCAS are not currently subject to the Freedom of Information Act (they will be in due course) but it may be worth talking to them directly to see if they are willing to assist.

And Southampton’s FOI response?

Course codes and titles may be found here: http://www.soton.ac.uk/corporateservices/foi/request-66210-6124d691.pdf Keywords were not held by the University – you should inquire with UCAS (http://www.ucas.com). HESA subject category codes may be found here: http://www.hesa.ac.uk/index.php/content/view/1806/296/

So what did I learn?

  1. I don’t seem to have made it clear enough to Southampton that I wanted the 2-tuple (course code, HESA code) for each course. So how should I have asked for that data? (The response pointed me to the list of all HESA codes. What I wanted was, for each course code, the course code/HESA code pair.)
  2. Generalising from an example of one;-), there seems to be a disconnect between FOI and open data branches of organisations. In my ideal world, the FOI person (an advocate for the person making the request) would also be on good terms with the Open Data team in the organisation, if not a data wrangler themselves. For data requests, the FOI person would make sure the data is released as open data as part of the process of fulfilling the request and then refer the person making the request to the open data site (see also: Open Data Processes – Taps, Query Paths/Audit Trails and Round Tripping). Southampton have part of this process already – the course data is in a PDF on their site and I was referred to it. (Note that the PDF is not just any PDF – have a look at it! – rather than the spreadsheet, machine readable electronic form or plain text I requested, even though @cgutteridge had posted a link to the SPARQL opendata query for the course code/UCAS code information I’d requested as a reply to my FOI request on the WhatDoTheyKnow site.)
  3. Universities don’t necessarily have any record of the search keywords they associate with the courses they post on UCAS. The UCAS website suggests that (doh!) “[r]ecent analysis of unique IP address use of the UCAS Course Search indicates that the subject search is by far the most popular of the 3 search options currently available”, such that “[w]hen an applicant uses our Course Search facility to search for available courses, they can choose a keyword by which to search, known as the ‘subject search’.” Which is to say, universities have no local record of the terms they use to describe courses that are the primary way of discovering their courses on UCAS? Blimey… (I wonder how much universities spend on Google AdWords for advertising particular courses on their own course prospectus websites and how they go about selecting those terms?)
  4. Asking for a machine readable “data as data” response has no teeth at the current time. I don’t know if the Protection of Freedoms bill clause that “extends Freedom of Information rights by requiring datasets to be available in a re-usable format” will change this? It seems like it might?

    Where—
    (a) an applicant makes a request for information to a public authority in respect of information that is, or forms part of, a dataset held by the public authority, and
    (b) on making the request for information, the applicant expresses a preference for communication by means of the provision to the applicant of a copy of the information in electronic form, the public authority must, so far as reasonably practicable, provide the information to the applicant in an electronic form which is capable of re-use.

  5. So what next? UCAS is a charity that appears to be operated by, for, and on behalf of UK Higher Education (e.g. UCAS Directors’ Report and Accounts 2009). Whilst not FOIable yet, it looked set to become FOIable from October 2011 (Ministry of Justice: Greater transparency in Freedom of Information, though I haven’t been able to find the SI and commencement date that enact this…?). IF it does become FOIable, we may be able to get the data out that way (although memories of the battle between open data advocates and the Ordnance Survey come to mind…). Hopefully, though, we’ll be able to get the data open by more amicable means before then…:-)

    PS a couple of other things that I’ve been dipping into relating to this project. Firstly, the UCAS Business Plan 2009-2012 (doh!):

    PPS Please may I also link to your Corporate Business Plan 2009-2012 [ http://www.ucas.com/documents/corporate/corpbusplan09-12.pdf ]

    Secondly, the Cabinet Office’s “Better Choices: Better Deals” strategy document [PDF], which as well as its “MyData” right to personal data initiative, also encourages business to put their information (and data…) to work. Whether or not you agree that more information may help to make for better choices from potential students, or that comparison sites have a role to play in this, the UK government appears to believe it and looks set to support the development of businesses operating in this area. For example:

    Effective consumer choices are also important in the public sector – such as decisions about what and where to study.
    However, unlike in private markets, public services are generally:
    ● Free at the point of delivery, so prices do not give us clues about quality or popularity.
    ● Not motivated by profits, so there is little incentive to highlight differences and encourage switching.
    ● Supplied under a universal service obligation, such that they serve a particularly broad range of users, from the very informed to the highly vulnerable.
    In the same way that comparison and feedback sites have developed for private markets, some choice-tools have already emerged for public services. For example, parents and prospective students can use league tables to compare school and university performance, while patients can access websites comparing waiting times for treatments across different healthcare providers, and feedback from fellow consumers about the performance of a local GP practice. Their role is likely to become more important in future as public service markets are opened up and there is scope for further choice-tools to be developed [Better Choices: Better Deals, p. 32]

    If you’re looking to put a bid or business plan together based on using public data as a basis for comparison services, the Better Choices document has more than a few quotable sections;-)

    [Related: Course Detective metasearch/custom search across UK University prospectus websites]

Written by Tony Hirst

April 26, 2011 at 12:58 pm

Posted in Data, Stirring, Thinkses


Predictive Ads…? Or Email Address Targeted Advertising…?!

with one comment

As I was getting increasingly annoyed by large flashing display ads in my feedreader this morning, the thought suddenly occurred to me: could Google serve me ads on third party sites based on my unread Gmail emails?

That is, as I check my feeds before my email in a morning, could I be seeing ads that foreshadow the content of the email I’ve been ignoring for way too long? Or could I receive ads that flag the content of my Priority Inbox messages?

Rules regarding sensitivity and privacy would have to be carefully thought through, of course. Here’s how they currently stand regarding contextual ads delivered in Gmail (More on Gmail and privacy: Targeted ads in Gmail):

By offering Gmail users relevant ads and information related to the content of their messages, we aim to offer users a better webmail experience. For example, if you and your friends are planning a vacation, you may want to see news items or travel ads about the destination you’re considering.

To ensure a quality user experience for all Gmail users, we avoid showing ads reflecting sensitive or inappropriate content by only showing ads that have been classified as “Family-Safe.” We also avoid targeting ads to messages about catastrophic events or tragedies. [Google's emphasis]

[See also: Ads in Gmail and your personal data]

Not quite as future predictive as gDay™ with MATE™ that lets you “search tomorrow’s web today” and “[discover] content on the internet before it is created”, but almost…!

It’s also a step on the road to Eric Schmidt’s dream of providing you with results even before you search for them. (For a more recent interview, see Google’s Eric Schmidt predicts the future of computing – and he plans to be involved.)

Here’s another, more practical(?!) thought – suppose Google served me headers of Priority Inbox email messages that were also marked as urgent through Adwords ads, in a full-on attempt to try to attract my attention to “really important” messages?! “Flashmail” messages delivered through the Adwords network… (I can imagine at least one course manager who I suspect would try to contact me via ads when I don’t pick up my email! ;-)

Searching the internet of things may still be a little way off though….

PS thinking email address targeted ads (mailads?) through a bit more, here are a couple of ways of doing it that immediately come to mind. Suppose I want to target an ad at whoever@example.com:

1) Adwords could place that ad in my GMail sidebar (I think they’d be unlikely to place ads within emails, even if clearly marked, because this approach has been hugely unpopular in the past; it also p****s me off in feeds); that said, Google has apparently started experimenting with (image based) display ads in gmail;

2) Adwords could place the ad on a third party site if the Goog spots me via a cookie and sees I’m currently logged in to Google, for example, with the whoever@example.com email address.

As Facebook gets into the universal messaging game, email address based ad targeting would also work there?

PPS interesting – the best ads act as content, so maybe ads could be used to deliver linked content? (See: Twitter promoted tweets – the AdWords for live news?) Which reminds me, I need to work up my bid for using something like AdWords to deliver targeted educational content.

Written by Tony Hirst

February 8, 2011 at 11:08 am

So What Do Universities Sell?

with 13 comments

When I joined the OU as an academic over a decade ago, I spent my first 6 months or so asking everyone I met what it was the OU sold, only to be met with “go away, silly boy” sort of looks. (I still don’t know: courses/modules? degrees/qualifications? CPD products? consultancy? research interests, or capacity (though not development or innovation;-)?! If nothing else, the demographics of our paying customers have changed over that period (“Open University may be in its 40s – but students are getting younger“); but does that mean that what we’re selling has actually changed too? Who knows?!)

That universities are now businesses competing in a marketplace is undeniable, and that marketplace increasingly looks as if it is opening up to private enterprises (Publishing giant Pearson looks set to offer degrees) who are expected to talk up the ability to generate profits (rather than, err, building up reserves and new buildings;-). See for example Doug Clow’s piece on Apollo Group results – BPP and University of Phoenix where he starts to unpick Apollo Group’s reported financials. (It’s worth remembering where the profits are expected to come from, of course… e.g. Doug again: Tuition Fees and the costs of HE).

So what happens when the market hits the university? More and more marketing, maybe…?

Here’s a round-up of the latest OU job ads…

  • Director of Communications (£88,769 – £100,763): “The Open University has been providing life-changing learning experiences for over 40 years and now has 250,000 students. We make a major contribution to choice and innovation in higher education, social mobility and enriching the skills of the workforce through world class teaching and research.
    “The Director of Communications works closely with the Vice-Chancellor and Executive to develop and implement a communications strategy to support delivery of the corporate strategy, build and develop relationships with key external stakeholders, ensure consistent delivery of brand, protect and develop reputation and develop organisational culture.
    “This is a rare and exciting opportunity for an energetic and visionary person with a passion for education to drive communications activities to build our reputation as the world leader in flexible learning.”
  • Marketing Planning and Programme Manager, Marketing (B2B) (£46,510- £52,347): “The post has been created to assist the University to develop its marketing capacity specifically to the B2B Employer Engagement area. It will be essential to harness the energies of academic and academic related staff in the University’s Business Development units, service units and regions to develop a more effective marketing strategy. This will require planning, modelling, project management, influencing and networking skills of the highest order, and an ability to adapt leadership/management style to an academic context.”
  • Two Programme Communications Managers, Marketing (£36,715 – £43,840): “Working within a small team, you will be responsible for planning, developing and delivering a broad range of marketing acquisition or retention campaigns to meet student number targets.
    “The position requires a proven ability to develop & implement successful marcomms strategies that have the support of key stakeholders. The successful candidate will have a full mix of marketing experience, including a clear understanding of disciplines such as direct & digital marketing, advertising & event management to name a few. This role requires excellent communications & project management skills, ideally twinned with a strong commercial background.”
  • Web Assistant Producer Open Learn (Explore), Open Broadcasting Unit (£29,853 – £35,646): “Earlier this year the OU launched an updated public facing, topical news and media driven site. The site bridges the gap between BBC TV viewing and OU services and functions as the new ‘front door’ to Open Learn and all of the Open University’s open, public content. We are looking for a Web Assistant Producer with web production/editing skills.
    “You will work closely with a Producer, 2 Web Assistant Producers, the Head of Online Commissioning and many others in the Open University, as well as the BBC.”

And whilst the OU – like many other HEIs – is doing its utmost to keep recruitment of new academic staff to a minimum, and allowing natural wastage to reduce staffing further, it’s good to know that at least posts like the above count as academic related:

Academic related jobs at the OU

PS if you have any ideas about what it is that universities actually sell, please let me know in a comment…;-)

PPS Relevant to the above ads, and picking up on a couple of tweets I posted last week, I’m intrigued to know how university communications departments measure their impact? Presumably (despite being academic related) it’s not got a lot to do with being referenced in academic journals?;-) But how do they measure their impact? Answers in the comments, please…:-)

Written by Tony Hirst

January 6, 2011 at 11:47 am

Posted in Stirring
