Innovation Starts At Home…?

Mention was made a couple of times last week in the VC’s presentation to the OU about the need to be more responsive in our curriculum design and course production. At the moment it can take a team of up to a dozen academics over two years to put an introductory course together, that is then intended to last, without significant change, other than in the preparation of assessment material, for five years or more.

The new “agile” production process is currently being trialled with a new authoring tool, OpenCreate, that is available to a few select course teams as a partially complete “beta”. I think it is “cloud” based. And maybe also promoting the new “digital” first strategy. (I wonder how many letters in the KPMG ABC bingo card consulting product the OU paid for, and how much per letter? Note: A may also stand for “analytics”.)

I asked if I could have a play with the OpenCreate tool, such as it is, last week, but was told it was still in early testing (so a good time to be able to comment, then?) and so, “no”. (So instead, I went back to one of the issues I’d raised a few days ago on somebody else’s project on Github to continue helping with the testing of a feature suggestion. (A few days ago; the suggestion has already been implemented and the issue is now closed as completed, making my life easier and hopefully improving the package too.) Individuals know how to do agile. Organisations don’t. ;-))

So why would I want to play with OpenCreate now, while it’s still flaky? Partly because I suspect the team are working on a UI and have settled elements of the backend. For all the f**kwitted nonsense the consultants may have been spouting about agile, beta, cloud, digital solutions, any improvements are going to come from the way the users use the tools. And maybe workarounds they find. And by looking at how the thing works, I may be able to explore other bits of the UI design space, and maybe even bits of the output space…

Years ago, the OU moved to an XML authoring route, defining an XML schema (OU-XML) that could be used to repurpose content for multiple output formats (HTML, epub, docx, Word). By the by, these are all standardised document formats, which means other people also build tooling around them. The OU-XML document was an internal standard. Which meant only the OU developed tools for it. Or people we paid. I’m not sure if, or how much, Microsoft were paid to produce the OU’s custom authoring extensions for Word that would output OU-XML, for example… Another authoring route was an XML editor (currently, oXygen, I believe). OU-XML also underpinned OpenLearn content.

That said, OU-XML was a standard, so it was in principle possible for people who had knowledge of it to author tools around it. I played with a few myself, though they never generated much interest internally.

  • generating mind maps from OU/OpenLearn structured authoring XML documents: these provided the overview of a whole course and could also be used as a navigation surface (revisited here and here); I made these sorts of mindmaps available as an additional asset in the T151 short course, but they were never officially recognised;
  • I then started treating a whole set of OU-XML documents *as a database* which meant we could generate *ad hoc* courses on a particular topic by searching for keywords across OpenLearn courses and then returning a mindmap constructed around components in different courses, again displaying the result as a mindmap (Generating OpenLearn Navigation Mindmaps Automagically). Note this was all very crude and represented playtime. I’d have pushed it further if anyone internally had shown any interest in exploring this more widely.
  • I also started looking at ways of liberating assets and content, which meant we could perform OpenLearn Searches over Learning Outcomes and Glossary Items. That is, take all the learning outcomes from OpenLearn docs and search into that to find units with learning outcomes on that topic. Or provide a “metaglossary” generated (for free) from glossary terms introduced in all OpenLearn materials. Note that I *really* wanted to do this as a cross-OU course content demo, but as the OU has become more digital, access to content has become less open. (You used to be able to look at complete course, OU print materials in academic libraries. Now you need a password to access the locked down digital content; I suspect access expires to students after a period of time too; and it also means students can’t sell on their old course materials);
  • viewing OU-XML documents as a structured database meant we could also asset strip OpenLearn for images, providing a search tool to lookup images related to a particular topic. (Internally, we are encouraged to reuse previously created assets, but the discovery problem about helping authors discover what previously created assets are available has never really been addressed; I’m not sure the OU Digital Archive is really geared up for this, either?)
  • we could also extract links from courses and use them as a course powered custom search engine. This wasn’t very successful at the course level (not enough links), but might have been interesting across multiple courses;
  • a first proof of concept pass at a tool to export OU-XML documents from Google docs, so you could author documents using Google docs and then upload the result into the OU publishing system.
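
As a rough illustration of the “documents as a database” idea in the list above, here is a minimal Python sketch that pulls learning outcomes out of a set of XML documents and searches across them. The element name `LearningOutcome` and the unit ids are assumptions made for the example rather than a statement of what the OU-XML schema actually contains, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def index_outcomes(docs, tag="LearningOutcome"):
    """Build (unit_id, outcome_text) rows from {unit_id: xml_string}.

    The tag name is an assumption for illustration; swap in whatever
    element the real schema uses for learning outcomes.
    """
    rows = []
    for unit_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for node in root.iter(tag):
            # Flatten any inline markup and normalise whitespace.
            text = " ".join("".join(node.itertext()).split())
            if text:
                rows.append((unit_id, text))
    return rows


def search_outcomes(rows, keyword):
    """Return the sorted ids of units with an outcome mentioning keyword."""
    needle = keyword.lower()
    return sorted({unit for unit, text in rows if needle in text.lower()})
```

The same pattern (walk every document, pull out one kind of element, search the resulting rows) would apply to glossary items, images or links, given the relevant element names.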

Something that has also been on my to do list for a long time is a set of templates to convert Rmd (Rmarkdown) and Jupyter notebook ipynb documents to OU-XML.

So… if I could get to see the current beta OpenCreate tool, I might be able to see what document format authors were being encouraged to author into. I know folk often get the “woahh, too complicated…” feeling when reading blog posts*, but at the end of the day whatever magic dreams folk have for using tech, it boils down to a few poor sods having to figure out how to do that using three things: code, document formats (which we might also view as data representations more generally) and transport mechanisms (things like http; and maybe we could also class things like database connections here). Transport moves stuff between stuff. Representations represent the stuff you want to move. Code lets you do stuff with the represented stuff, and also move it between other things that do black box transformations to it (for example, transforming it from one representation to another).

That’s it. (My computing colleagues might disagree. But they don’t know how to think about systems properly ;-)

If OpenCreate is a browser based authoring tool, the content stuff created by authors will be structured somehow, and possibly previewed somehow. There’ll also be a mechanism for posting the authored stuff into the OU backend.

If I know what (document) format the content is authored in, I can use that as a standard and develop my own demonstration authoring tools and routes around that on the input side. For example, a converter that converts Jupyter notebook, or Rmd, or Google docs authored content into that format.

If there is structure in the format (as there was in OU-XML), I can use that as a basis for exploring what might be done if we can treat the whole collection of OU authored course materials as a database and exploring what sorts of secondary products, or alternative ways of using that content, might be possible.

If the formats aren’t sorted yet, maybe my play would help identify minor tweaks that could make content more, or less, useful. (Of course, this might be a distraction.)

I might also be able to comment on the UI…

But is this likely to happen? Is it f**k, because the OU is an enterprise that’s sold corporate, enterprise IT thinking from muppets who only know “agile” (or is that “analytics”?), “beta”, “cloud” and “digital” as bingo terms that people pay handsomely for. And we don’t do any of them because nobody knows what they mean…

* So for example, in Pondering What “Digital First” and “University of the Cloud” Mean…, I mention things like “virtual machines” and “Docker” and servers and services. If you think that’s too technical, you know what you can do with your cloud briefings…

The OU was innovative because folk understood technologies of all sorts and made creative use of them. Many of our courses included emerging technologies that were examples of the technologies being taught in the courses. We ate the dogfood we were telling students about. Now we’ve put the dog down and just show students cat pictures given to us by consultants.

Computers May Structure the World But We Don’t Make Use of That

An email:


Erm… a Word document with some images and captions – styled as such:


Some basic IT knowledge – at least – it should be basic in what amounts to a publishing house:


The .docx file is just a zip file… That is, a compressed folder and its contents… So use the .zip

So here’s the unzipped folder listing – can you spot the images?


The XML content of the doc – viewed in Firefox (drag and drop the file into a Firefox browser window). Does anything jump out at you?


Computers can navigate to the tags that contain the caption text by looking for the Caption style. It can be a faff associating the image captions with the images though (you need to keep tallies…) because the Word XML for the figure doesn’t seem to include the filename of the image… (I think you need to count your way through the images, then relate that image index number with the following caption block?)
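
The counting approach described above could be sketched roughly as follows in Python, using only the standard library. It assumes the captions use a paragraph style whose id is `Caption`, that each caption sits after its image, and that images show up as `w:drawing` elements in `word/document.xml`; the function name is made up for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used in word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def captions_by_image_index(docx_file, caption_style="Caption"):
    """Map the n-th image in a .docx (counting from 1) to the caption after it.

    Assumes authors put the caption straight after the image and style
    it with the given paragraph style id.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    captions = {}
    image_count = 0  # the running tally of images seen so far
    for para in root.iter(f"{{{W}}}p"):
        image_count += len(para.findall(".//w:drawing", NS))
        style = para.find("w:pPr/w:pStyle", NS)
        is_caption = style is not None and style.get(f"{{{W}}}val") == caption_style
        if is_caption and image_count:
            text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
            # Keep only the first caption found after each image.
            captions.setdefault(image_count, text)
    return captions
```

Calling `captions_by_image_index("report.docx")` on a suitably styled document (the filename is hypothetical) would return something like a dictionary from image number to caption text, which is the tally the script has to keep.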

So re: the email – if authors tag the captions and put captions immediately below an image – THE MACHINE CAN DO IT, if we give someone an hour or two to knock up the script and then probably months and months and months arguing about the workflow.

PS I’d originally screencaptured and directly pasted the images shown above into a Powerpoint presentation:


I could have recaptured the screenshots, but it was much easier to save the Powerpoint file, change the .pptx suffix to .zip, unzip the folder, browse the unzipped Powerpoint media folder to see which image files I wanted:


and then just upload them directly to WordPress…
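
The same trick can be scripted. This is a minimal Python sketch, assuming the images live under `ppt/media/` inside the .pptx (which is where they were in my unzipped folder); the function name and filenames are hypothetical. Python’s `zipfile` module reads the file by content, so there is no need to rename the suffix first.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_media(office_file, out_dir, media_dir="ppt/media/"):
    """Copy every file under media_dir in an Office zip into out_dir.

    Returns the saved filenames. For a .docx, pass media_dir="word/media/".
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(office_file) as zf:
        for name in sorted(zf.namelist()):
            if name.startswith(media_dir) and not name.endswith("/"):
                # Flatten the path: keep just the image filename.
                target = out / PurePosixPath(name).name
                target.write_bytes(zf.read(name))
                saved.append(target.name)
    return saved
```

So something like `extract_media("slides.pptx", "images")` would leave the image files in an `images` folder ready to upload.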

See also: Authoring Multiple Docs from a Single IPython Notebook for another process that could be automated but lack of imagination and understanding just blanks out.

Personalised Learning Means Big Differences?

Back when the OU used to push all its course materials out to students in print form, I think the first presentation of a course used to have its own print run. Errata lists for mistakes identified during the presentation would be mailed out to students as supplementary print items (with their own publication number) every so often, and changes made to master copies of what would become revised versions of the main print items for later presentations. When a student received an errata list, it was up to them to mechanically make changes to their print items (scribbling out the wrong bits and writing in the corrections, for example), but at least then they’d have a copy of in-place corrected materials.

There generally aren’t that many errata in an OU course, but there always seem to be some that slip through the net, so how do we deal with them now?

With online delivery, I think we’ve got ourselves in a bit of a pickle when it comes to handling errata. (This post/rant is a bit of a mountain/molehill thing but it’s symptomatic of something-I-don’t-know-what. Fed-up-ness, perhaps.) Changes can’t be made to content that has gone live to students in case some students have already seen it (or something?!), and to ensure that everyone gets to see the “same” version of the course materials irrespective of when they saw it. Which of course they don’t, because some folk go through the materials before an error is spotted, and some of them don’t read or spot the errata list that gets published in an announcements feed in the VLE sidebar. For those students who do read the errata list, it doesn’t really help much because you can’t update the material unless you print it all out and make changes to the hard copy, or grab your own, annotatable electronic copy and update and work from that. So I reckon the workflow you end up with is that you have to keep an eye on the errata list whenever you read anything. Which sucks.

One thing I did wonder was whether we could add an errata annotation layer on top of the course materials. For several years, the OU Annotate tool has provided a browser bookmarklet that can overlay an annotation tool on top of a well structured HTML page (which rules out things like annotating PDFs). Highlighting broken text in a suitably vivid colour, putting the errata note in as a comment, tagging the comment with an errata tag, and making it public seemed to provide a quick solution:


The experience could be improved by adding an errata channel or filter that could be used to highlight just errata items, rather than all comments/annotations. I even started wondering whether there could be a VLE setting that would pull in errata tagged items and display them by default, overlaying them onto the course materials without the need for firing up the OU Annotate toolbar. But that would be a bit like publishing the VLE hosted materials with track changes switched on though, which would look a bit rubbish and make it obvious that there were errors we knew about but hadn’t fixed. Which there are; but we can’t; because the materials once published have to be left set in stone for that presentation of the course. (Except when they aren’t.)

The presentation could be improved further for the majority, who reach the errata’d item after the mistake has been found (“pathfinder” students who work through the materials quickly often spot errors before the majority even get to them), simply by us making the change when the error is spotted and before the student gets to see it…

Alternatively, we could make the change but highlight the text in some way to show that it had been changed, perhaps popping up a full errata note – including the original and the change that was made – if a student hovered their mouse cursor over the changed item. An even cleaner view could be provided with a setting that disabled any highlighting of error-correction terms.

One way of doing this would be to go back to the source and annotate that…: the original course materials are written in an XML document which is then rendered down to HTML and various ebook offerings. (For some reason, PDFs aren’t necessarily always produced, perhaps because of accessibility issues. The students who want the PDF but don’t care about the accessibility features are left to create their own workaround for generating the PDF. Perfect. Enemy. Good. Got to be equitable myth, etc.) Tagging the doc with the change as a change and leaving the original as an annotation then reflowing the HTML would mean the VLE materials would get the update and also be able to reveal the historical view. Of course, students who downloaded an ebook version or generated a PDF before an update and reflow wouldn’t get the update, and, yada, yada, too difficult to even think about, don’t bother, stick with errata lists, make the fix the student’s responsibility… (Of course, if you downloaded all the ebooks at the start of the course and don’t go back to the VLE to check the errata list, then, erm… arrgh: remember Rule 1: check the VLE for the errata list before you read anything.)

Another route might be to base every student’s view of the course on a fork of the original that uses a form of version control that only displays changes to the materials that the student has already encountered. So if I read chapter 1, and an error is found, when I revisit chapter 1 the change is made and highlighted as a change. If I get to chapter 2 after a chapter 2 error has been found, the update is made before I reach it and not flagged to me as a change. This would mean everyone’s copy of the course materials could be different of course – I hesitate to say “personalised”…!;-) – which could be hugely complicated, but might also allow students to make changes directly to their own copy of the course materials. Git-tastic…
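
The rule in that last paragraph is simple enough to write down. Here is a minimal Python sketch of it, leaving out the actual version control and just deciding, per student, which corrections get flagged as changes. The `Correction` record, the use of a “day of presentation” number and the function name are all assumptions for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    chapter: int
    fixed_on: int  # day of the presentation on which the fix was made
    note: str      # the errata note (original text and the change)


def view_for_student(corrections, first_read_on):
    """Decide how each correction appears in one student's copy.

    first_read_on maps chapter -> day the student first read it.
    Returns {chapter: [(note, flagged)]}, where flagged is True only
    if the student saw the chapter before the fix was made.
    """
    view = {}
    for c in corrections:
        read_day = first_read_on.get(c.chapter)
        flagged = read_day is not None and read_day < c.fixed_on
        view.setdefault(c.chapter, []).append((c.note, flagged))
    return view
```

Every correction is applied to every copy; the only thing that differs between students is whether the change is highlighted, which is where the “everyone’s copy could be different” worry comes from.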

Rather, it seems to me that we have taken a completely depersonalised route to our materials that means we can’t countenance any situation that requires change to a document that everyone is supposed to have in exactly the same form. (One reason for this is to prevent confusion in the sense of different people talking about possibly different versions of something that is ostensibly the same.)

Anyway – all of this makes me think: is personalised learning about offering students stuff that only contains significant differences, but not minor differences? Because minor differences (like corrected typos in my copy but not yours) are just different enough to make you uncomfortable, but major differences, (you get a completely different paragraph or sequence/ordering of content) are “personalised”. Uncanny, that…

Reflections on the Closure of Yahoo Pipes

Last night I popped up a quick post relaying the announcement of impending closure of Yahoo Pipes, recalling my first post on Yahoo Pipes, and rediscovering a manifesto I put together around the rallying cry We Ignore RSS at OUr Peril.

When Yahoo Pipes first came out, the web was full of the spirit of Web2.0 mashup goodness. At the time, the big web companies were opening up all manner of “open” web APIs – Amazon, Google, and perhaps more than any other, Yahoo – with Google and Yahoo particularly seeming to invest in developer evangelism events.

One of the reasons I became so evangelical about Yahoo Pipes, particularly in working with library communities, was that it enabled non-coders to engage in programming the web. And more than that. It allowed non-coders to use web based programming tools to build out additional functionality for the web.

Looking back, it seems to me now that the whole mashup thing arose from the idea of the web as a creative medium, and one which the core developers (the coders) were keen to make accessible to a wider community. Folk wanted to share, and folk wanted other folk to build on their services in interoperation with other services. It was an optimistic time for the tinkerers among us.

The web companies produced APIs that did useful things, used simple, standard representations (RSS, and then Atom, as simple protocols for communicating lists of content items, for example, then, later, JSON as a friendlier, more lightweight alternative to scary XML, which also reduced the need for casual web tinkerers to try to make sense of XMLHttpRequests), and seemed happy enough to support interoperability.

When Yahoo Pipes came online (and for a brief time, Microsoft’s Popfly mashup tool), the graphical drag-and-drop, wire it together, flow based programming model allowed non-coders to start developing, publishing, sharing and building on top of each other’s real web applications. You could inspect the internals of other people’s pipes, and clone those pipes so you could extend or modify them yourself, and put pipes inside pipes, fostering reuse and the notion of building stuff on top of and out of stuff you’ve learned how to do before.

And it all seemed so hopeful…

And then the web companies started locking things down a bit more. First my Amazon Pipes started to break, and then my Twitter Pipes, as authentication was introduced to access the feeds published by those companies. It started to seem as if those companies didn’t want their content flows rewired, reflowed and repurposed. And so Yahoo Pipes started to become less useful to me. And a little bit of the spirit of a web as a place where the web companies allowed whosoever, coders and non-coders alike, to build a better web using their stuff started to die.

And perhaps with it, the openness and engagement of the core web developers – the coders – started to close off a little too. True, there are repeated initiatives about learning to code, but whilst I’ve fallen into that camp myself over the last few years, and especially over the last two years, having discovered IPython notebooks and the notion of coding, one line at a time, I think we are complicit in closing off opportunities that help people build out the web using bits of the web.

Perhaps the web is too complicated now. Perhaps the vested interests are too vested. Perhaps the barrage of content, and the peck, peck, click, click, Like, addiction feeding, pigeon rat, behaviourist conditioning, screen based crack-Like business model, has blinded us to the idea that we can use the web to build our own useful tools.

(I also posted yesterday about a planning application map I helped my local hyperlocal – OnTheWight – publish. If The Isle of Wight Council published current applications as an RSS feed, it would have been trivial to use Yahoo Pipes to construct the map. It would have been a five minute hack. As it is, the process we used required building a scraper (in code) and hacking some code to generate the map.)
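
To give a feel for how little would be needed if such a feed existed, here is a minimal Python sketch that turns an RSS feed with GeoRSS points into GeoJSON that a web map could plot. The feed itself is hypothetical (the Council publishes no such thing, as far as I know), as is the assumption that each item carries a `georss:point` with “lat lon” coordinates; the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# GeoRSS-Simple point element, in ElementTree's {namespace}tag form
GEORSS_POINT = "{http://www.georss.org/georss}point"


def rss_to_geojson(rss_text):
    """Convert RSS items that carry a georss:point into a GeoJSON dict."""
    root = ET.fromstring(rss_text)
    features = []
    for item in root.iter("item"):
        point = item.findtext(GEORSS_POINT)
        if not point:
            continue  # skip items with no location
        lat, lon = (float(v) for v in point.split())
        features.append({
            "type": "Feature",
            # GeoJSON puts longitude first, GeoRSS puts latitude first.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
            },
        })
    return {"type": "FeatureCollection", "features": features}
```

That is code, of course, which rather proves the point: with Pipes, a non-coder could have wired the same thing together by dragging boxes around.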

There still are tools out there that help you build stuff on the web for the web. CartoDB makes map creation relatively straightforward, and things like Mozilla Popcorn allow you to build your own apps around content containers (I think? It’s been a long time since I looked at it).

Taking time out to reflect on this, it seems as if the web cos have become too inward looking. Rather than engaging wider communities to engage in building out the web, the companies get to a size where their systems become ever more complex, yet have to maintain their own coherence, and a cell wall goes up to contain that activity, and authentication starts to be used to limit access further.

At the same time as the data flows become more controlled, the only way to access them comes through code. Non-coders are disenfranchised and the lightweight, open protocols that non-coding programming tools can work most effectively with become harder to justify.

When Pipes first appeared, it seemed as if the geeks were interested in building tools that increased opportunities to engage in programming the web, using the web.

And now we have Facebook. Tap, tap, peck, peck, click, click, Like. Ooh shiny… Tap, tap, peck, peck…

Confused Fragments About Open Data Economics…

Some fragments…

the public paid for it so public has a right to it: the public presumably paid for it through their taxes. If companies that use open public data don’t fully and fairly participate in the tax regime of the country that produced the data, then they didn’t pay their fair share for access to it.

data quality will improve: with open license conditions that allow users to take open (public) data and do what they want with it without the requirement to make derived data available in a bulk form under an open data license, how does the closed bit of the feedback loop work? I’ve looked at a lot of open public data releases on council and government websites and seen some companies making use of that data in presumably a cleaned form (if it hasn’t been cleaned, then they’re working with a lot of noise…) But if they have cleaned and normalised the data, have they provided this back in an open form to the public body that gifted them access to it? Is there an open data quality improvement cycle working there? Erm… no… I suspect if anything, the open data users would try to sell the improved quality data back to the publisher. This may be their sole business model, or it may be a spin-off as a result of using the (cleaned and normalised) data for some other commercial purpose.

Confused by MOOCs, Still…

All I am nowadays is confused… about everything. Take MOOCs (What Are MOOCs (Good For)? I Don’t Really Know…) – folk seem to think that something (I don’t know what) about MOOCs makes sense, but I don’t understand what it is they think is interesting or what it is they think is happening.

In the same way that I never did understand what folk were talking about when OERs (that is, open educational resources) were all the rage in ed tech circles, I really have no idea what they think they’re talking about now MOOC is the de rigueur topic of conversation.

(See for example Bits and Pieces Around OERs… or OERs: Public Service Education and Open Production. I also note that folk tend not to appreciate the value of linking. Or maybe I misunderstand it. Whatever.)

From the scraps of stats that are making it out of odds and sods of some of the online platforms (data is not generally available; data will pay the bills when the marketing spend gets cut back and until the MOOC platform providers start making money from selling analytics and course platform/VLE “solutions” to institutions or eking out affiliate and referral fees from recruiters) it’s hard to know who’s taking the courses and why, and even whether the different platforms are appealing to the same markets.

My gut feeling in the absence of a proper review is that folk taking courses from the US MOOCx providers are as likely to have a degree as not (eg Participation And performance In 8.02x Electricity And Magnetism: The First Physics MOOC From MITx); I have no idea what the demographics of learners signing up for Futurelearn courses are (Futurelearn has far more of a “casual learner”/hobbyist learner (one might even say, “edutainment”…) vibe about it, though it also seems as if it could be positioned quite well as a taster site).

So here are a few of the things I particularly don’t get:

– if advanced courses are attractive to graduates, does that mean there is a gap in the market for courses for graduates? I’ve largely given up trying to convince anyone that universities should do what the banks used to do and treat the first degree as an opportunity to recruit someone for life as part of a lifelong learning package. The professional institutions have traditionally filled this role in the professions, but it’s hard to know how their membership figures are doing? Could/should the universities be signing up their recent graduates to a lifelong learning top-up package, potentially made up from MOOCs provided by their alma mater?

– if graduates like taking courses, why is the OU so keen on a) making it difficult for folk to take individual were-called-courses-are-now-called-modules? b) pricing individual courses out of the leisure-learner or professional-occasional-top-up market? c) insisting on competing with other universities on their terms rather than breaking open new markets for higher education and widening access to it? (Arguably, FutureLearn is a play at widening access.)

– if MOOCs are going to be important as part of a taster style marketing funnel, how would it be if FutureLearn MOOCs were eligible as additional/alternative courses in the International Baccalaureate (have any MOOC platforms benefitted from PR around such an end-use yet? There are possibly also potential tie-ups there around the provision of invigilated assessment centres?); or received some amount of CAT point credit equivalent that counted towards university applications? Again, something I don’t really understand is why the OU has given up on the Young Applicants in Schools scheme at just the time when it’s starting to compete for 18 year old entry?

As I said, I’m increasingly confused, increasingly don’t understand what’s going on, increasingly don’t see whatever the hell it is that everybody else seems to see as emerging from the latest eduhype.

What’s education good for anyway, when we have the web to hand? Does the web change anything, or nothing? Why did we need universities when we had libraries – and university libraries – with books in them? Why does everybody need a degree? If graduates are the only people who make it to the end of an ‘advanced’ (rather than ‘course taster’) MOOC, what the hell are the universities doing? Why do folk who have become graduates need to take courses when we’ve got the web lying around? What is going on? I just don’t understand…

Google’s New Terms Mean You Could Soon Be Acting as a Product Endorser

If you’re a Google account holder, you may have noticed an announcement recently that Google has changed its terms and conditions, in part to allow it to use your +1s and comments as “shared endorsements” in ads published through Google ad services.


So it seems as if there are now at least two ways Google uses you, me, us, to generate revenue in an advertising context. Firstly, we’re sold as “audience” within a particular segment: “35-50 males into tech”, for example, an audience that advertisers can buy access to. This may even get to the level of individual targeting (for example, Centralising User Tracking on the Web – Let Google Track Everyone For You). Now, secondly, as personal endorsers of a particular company, service or product.

The ‘recent changes’ announcement URL looks like a general “change notice” URL – – so I’ll repost key elements from the announcement here….

“Because many of [us] are allergic to legalese”, the announcement goes, “here’s a plain English summary for [our] convenience.”

We’ve made three changes:

Firstly, clarifying how your Profile name and photo might appear in Google products (including in reviews, advertising and other commercial contexts).

You can control whether your image and name appear in ads via the Shared Endorsements setting.

Secondly, a reminder to use your mobile devices safely.
Thirdly, details on the importance of keeping your password confidential.

The first change – how my Profile name and photo might appear in Google products – is the one I’m interested in.

How your Profile name and photo may appear (including in reviews and advertising)

We want to give you, and your friends and connections, the most useful information. Recommendations from people that you know can really help. So your friends, family and others may see your Profile name and photo, and content like the reviews that you share or the ads that you +1’d. This only happens when you take an action (things like +1’ing, commenting or following) – and the only people who see it are the people that you’ve chosen to share that content with. On Google, you’re in control of what you share. This update to our Terms of Service doesn’t change in any way who you’ve shared things with in the past or your ability to control who you want to share things with in the future.

Feedback from people you know can save you time and improve results for you and your friends across all Google services, including Search, Maps, Play and in advertising. For example, your friends might see that you rated an album 4 stars on the band’s Google Play page. And the +1 you gave your favourite local bakery could be included in an ad that the bakery runs through Google. We call these recommendations shared endorsements and you can learn more about them here.

When it comes to shared endorsements in ads, you can control the use of your Profile name and photo via the Shared Endorsements setting.

Here’s a direct link to the setting… [if you have a Google+ account, I suggest you go there, uncheck the box, and hit “Save”]. I never knowingly checked this – so presumably the default is set to checked (that is, with me opted in to the “service”)?

If you turn the setting to “off,” …

you’ll get hassled:

F**k you, google...

or to put it another way,

…your Profile name and photo will not show up on that ad for your favourite bakery or any other ads. This setting only applies to use in ads, and doesn’t change whether your Profile name or photo may be used in other places such as Google Play.

I have no idea what the context of Google Play might mean. I do have a Google Android phone, and it is tied to a Google account. It is largely a mystery to me, particularly when it comes to knowing who has access to – or has taken copies of – my contacts. I have no idea what Google Play services I have or have not been opted in to.

If you previously told Google that you did not want your +1’s to appear in ads, then of course we’ll continue to respect that choice as a part of this updated setting.

I’m not sure what that means? If I’ve checked the “do not want my +1’s to appear in ads” box, will the current setting be set to unchecked (opt out of shared endorsements)? Does the original setting still exist somewhere, or has it been replaced by the new setting? Or is there another level of privacy setting somewhere, and if so how do the various levels interact?

This is on my current Google+ settings page:

shared endorsements

and I can’t see anything about +1 ad opt outs, so presumably the setting has changed? I’d have thought I’d have opted out of allowing +1s to appear in ads (had I known: a) that +1s may have been used in ads; and b) that such a setting existed), but presumably that fact passed me by (more on this later in the post…) Or I had opted out and the opt-out wasn’t respected? But surely not that…?

For users under 18, their actions won’t appear in shared endorsements in ads and certain other contexts.

Which is to say, ‘if you lied about your age in order to get access to particular services, we’re gonna sell the ability for advertisers to use you to endorse their products to your friends’.

So that’s the “helpful” explanation of the terms.. what do the actual terms say?

When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide licence to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes that we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights that you grant in this licence are for the limited purpose of operating, promoting and improving our Services, and to develop new ones. This licence continues even if you stop using our Services (for example, for a business listing that you have added to Google Maps). Some Services may offer you ways to access and remove content that has been provided to that Service. Also, in some of our Services, there are terms or settings that narrow the scope of our use of the content submitted in those Services. Make sure that you have the necessary rights to grant us this licence for any content you submit to our Services. [This para, or one very much like it, is in the current terms.]

If you have a Google Account, we may display your Profile name, Profile photo and actions you take on Google or on third-party applications connected to your Google Account (such as +1’s, reviews you write and comments you post) in our Services, including displaying in ads and other commercial contexts. We will respect the choices you make to limit sharing or visibility settings in your Google Account. For example, you can choose your settings so that your name and photo do not appear in an ad.

Hmmm.. so maybe the settings do – or will – have a finer level of control (and complexity…) associated with them? I wonder also whether those two paragraphs can work together? If I comment on a Google+ page, or maybe tag a brand or product in an image I have uploaded, could Google create a derivative work as part of a shared endorsement by me?

Looking Around Some Other Google+ Settings

Finding myself on my Google+ settings page, I had a look at some of the other settings…

Be wary of implicit reveals?

Hmm… this could be an issue, if checked? If things are shared to people in my circles, and folk get automatically added to my circles if I just search for them, then, erm, I could maybe unwittingly opt a page in to my circles?

circle shares

But if I do search for someone and they’re added to my circles on my behalf, what circle are they added to?

so which do they get added to?

Not being paranoid or anything, but I can now also imagine something like the following setting appearing on my main Google account insofar as it relates to search, for example:

Google Search Pages
_ Automatically add a Google+ Author to my circles if I click through on a search result marked with a Google+ Author tag.

So what other settings are there that may be of interest?

Several to do with automatically tampering with my content (as if false memory syndromes aren’t bad enough!)

mess with my stuff...

do stuff to my stuff

I seem to remember these being announced, but didn’t think to check that I would automatically be opted in.

Note to self: When Google announces a new Google+ service, or service related to Google accounts, assume I get automatically opted in.

Any others? Ah, ha… a little something that invisibly enmeshes me a little deeper in the Google knowledge web:

link me in to the Google knowledge graph

Here’s the blurb, rather bluntly entitled Find My Face: “Find my face makes finding pictures and videos of you easy and more social. Find my face offers name tag suggestions so you, or people that you know, can quickly tag photos. Any time someone tags you in a photo or video, you’ll be able to accept or reject name tags created by people you know.”

So I’m guessing if I opt in to this, if Google recognises that I’m in a photo, and someone I know views that photo, they’ll be prompted to tag me in it. I wonder if Google actually has a belief graph and a knowledge graph? In the first case, the belief graph would associate me with photos Google’s algorithms think I’m in. In the second case, the knowledge graph, Google would associate me with photos where someone confirms that I am in the photo. If you want to get geeky, this knowledge vs. belief distinction, where knowledge means “justified true belief”, has a basis in things like epistemic logic (which I came across in the context of agent logics) – I’d never really thought about Google’s graph in this way… Hmmm…

Here’s how it works, apparently:

After you turn on Find my Face, Google+ uses the photos or videos you’re tagged in to create a model of your face. The model updates as tags of you are added or removed and you can delete the entire face model at any time by turning off Find my Face.

If you turn on Find my Face, we can use your face model to make it easier to find photos or videos of you. For example, we’ll show a suggestion to tag you when you or someone you know looks at a photo or video that matches your face model. Name tag suggestions by themselves do not change the sharing setting of photos or albums or videos. However, when someone approves the suggestion to add a name tag, the photo and relevant album or video are shared with the person tagged.

So can Google sell that face model of me to other parties? Or just sell recognition of my face in photos and videos as a service, or as part of an audience construction process?

I guess at least I get to approve any photo tags though… Or do I?

Act on my behalf

So if I search for someone on Google+, they’re added to my circles, which means that if they tag me in a photo when prompted by Google+ to do so, their tag is automatically accepted by me by virtue of this proxy setting I seem to have been automatically opted in to? Or am I reading these settings all wrong?

Ho hum, I guess it’s not even the legalese I’m allergic to… it’s understanding the emergent complexity and consequences that arise from different combinations of settings on personal account pages…

A Tracking Inspired Hack That Breaks the Web…? Naughty OpenLearn…

So it’s not just me who wonders Why Open Data Sucks Right Now and comes to this conclusion:

What will make open data better? What will make it usable and useful? What will push people to care about the open data they produce?
Simply that. If we start using the data, we can email, write, text and punch people until their data is in a standard, useful and usable format. How do I know if my data is correct until someone tries to put pins on a map for ever meal I’ve eaten? I simply don’t. And this is the rock/hard place that open data lies in at the moment:

It’s all so moon-hoveringly bad because no-one uses it.
No-one uses it because what is out there is moon-hoveringly bad

Or broken…

Earlier today, I posted some, erm, observations about OpenLearn XML, and in doing so appear to have logged, in a roundabout and indirect way, a couple of bugs. (I did think about raising the issues internally within the OU, but as the above quote suggests, the iteration has to start somewhere, and I figured it may be instructive to start it in the open…)

So here’s another, erm, issue I found relating to accessing OpenLearn XML content. It’s actually something I have a vague memory of colliding with before, but I don’t seem to have blogged it, and since moving to an institutional mail server that limits mailbox size, I can’t check back with my old email messages to recap on the conversation around the matter from last time…

The issue started with this error message that was raised when I tried to parse an OU XML document via Scraperwiki:

Line 85 - tree = etree.parse(cr)
lxml.etree.pyx:2957 -- lxml.etree.parse (src/lxml/lxml.etree.c:56230)(())
parser.pxi:1533 -- lxml.etree._parseDocument (src/lxml/lxml.etree.c:82313)(())
parser.pxi:1562 -- lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:82606)(())
parser.pxi:1462 -- lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:81645)(())
parser.pxi:1002 -- lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:78554)(())
parser.pxi:569 -- lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:74498)(())
parser.pxi:650 -- lxml.etree._handleParseResult (src/lxml/lxml.etree.c:75389)(())
parser.pxi:590 -- lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74722)(())
XMLSyntaxError: Entity 'nbsp' not defined, line 155, column 34

nbsp is an HTML entity that shouldn’t appear untreated in an arbitrary XML doc. So I assumed this was a fault of the OU XML doc, and huffed and puffed and sighed for a bit and tried with another XML doc; and got the same result. A trawl around the web looking for whether there were workarounds for the lxml Python library I was using to parse the “XML” turned up nothing… Then I thought I should check…
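To see why that error message points at an HTML entity, rather than at anything peculiar to lxml, here's a minimal sketch using Python's standard library ElementTree (swapped in for lxml so that it runs without any extra installs; parses_as_xml is a made-up helper name). XML only predefines five named entities (amp, lt, gt, quot, apos), so &nbsp; is rejected unless a DTD declares it, whereas the numeric character reference for the same character is fine:

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text):
    """Return True if text is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# &nbsp; is an HTML named entity that XML does not predefine
print(parses_as_xml("<Paragraph>a&nbsp;b</Paragraph>"))
# &#160; is the numeric reference for the same non-breaking space
print(parses_as_xml("<Paragraph>a&#160;b</Paragraph>"))
```

So an "Entity 'nbsp' not defined" error is at least a hint that what reached the parser may have been HTML rather than XML.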

A command line call to an OU XML URL using curl:


returned the following:

<meta http-equiv="refresh" content="0; url=" /><script type="text/javascript">

Ah… vague memories… there’s some sort of handshake that goes on when you first try to access OpenLearn content (maybe something to do with tracking?), before the actual resource that was called is returned to the calling party. Browsers handle this handshake automatically, but the etree.parse(URL) function I was calling to load in and parse the XML document doesn’t. It just sees the HTML response and chokes, raising the error that first alerted me to the problem.

[Seems the redirect is a craptastic Moodle fudge /via @ostephens]

So now it’s two hours later than it was when I started a script, full of joy and light and happy intentions, that would generate an aggregated glossary of glossary items from across OpenLearn and allow users to look up terms, link to associated units, and so on (the OU-XML document schema that OpenLearn uses has markup for explicitly describing glossary items). Then I got the error message, ran round in circles for a bit, got ranty and angry and developed a really foul mood, probably tweeted some things that I may regret, one day, figured out what the issue was, but not how to solve it, thus driving my mood fouler and darker… (If anyone has a workaround that lets me get an XML file back directly from OpenLearn (or that hides the handshake in a Python script I can simply cut and paste), please enlighten me in the comments.)
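For what it's worth, here is the rough shape such a workaround might take. It is untested against OpenLearn itself and rests on two guesses: that the handshake sets a cookie, and that the meta refresh tag is what sends the browser on to the real resource. The function names are made up for illustration, and the standard library's ElementTree stands in for lxml. An empty url= in the refresh tag is treated as "ask for the same URL again":

```python
import html
import re
import urllib.request
import xml.etree.ElementTree as ET
from http.cookiejar import CookieJar
from urllib.parse import urljoin

# Matches tags like: <meta http-equiv="refresh" content="0; url=..." />
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']refresh["\'][^>]*'
    r'content=["\']\s*\d+\s*;\s*url=([^"\']*)["\']',
    re.IGNORECASE,
)


def extract_meta_refresh_url(body):
    """Return the meta refresh target ('' if left blank), or None if no tag."""
    match = META_REFRESH.search(body)
    if match is None:
        return None
    return html.unescape(match.group(1).strip())


def fetch_xml(url, max_hops=3):
    """Follow meta refresh pages by hand, then parse whatever comes back."""
    # Cookie-aware opener: an assumption that the handshake relies on a cookie
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    for _ in range(max_hops):
        with opener.open(url) as response:
            raw = response.read()
        target = extract_meta_refresh_url(raw.decode("utf-8", errors="replace"))
        if target is None:
            # No refresh tag, so treat the response as the XML document
            return ET.fromstring(raw)
        # urljoin returns the original URL unchanged when target is ''
        url = urljoin(url, target)
    raise RuntimeError("still being redirected after %d hops" % max_hops)
```

If the page relies on the script that follows the meta tag rather than on the refresh itself, this won't be enough, and the parse will fail in the same way as before.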

I also found at least one OpenLearn unit that has glossary items, but just dumps them in paragraph tags and doesn’t use the glossary markup. Sigh…;-)
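The glossary aggregation itself, once the XML can actually be fetched, might look something like the sketch below. The element names (GlossaryItem, Term, Definition) are assumptions and have not been checked against the OU-XML schema, the unit identifiers are made up, and units that dump their glossary into plain paragraphs would simply be missed:

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Assumed element names, not verified against the OU-XML schema
ITEM, TERM, DEFINITION = "GlossaryItem", "Term", "Definition"


def _text(element):
    """Flatten an element's text, nested markup included; '' if it is missing."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def aggregate_glossary(units):
    """Map each lower-cased term to a list of (unit_id, definition) pairs.

    units maps a unit identifier to that unit's XML, as a string.
    """
    glossary = defaultdict(list)
    for unit_id, xml_text in units.items():
        root = ET.fromstring(xml_text)
        for item in root.iter(ITEM):
            term = _text(item.find(TERM))
            if term:
                definition = _text(item.find(DEFINITION))
                glossary[term.lower()].append((unit_id, definition))
    return dict(glossary)


# Hypothetical unit identifiers and content, purely for illustration
units = {
    "UNIT_1": "<Item><GlossaryItem><Term>Agent</Term>"
              "<Definition>Something that <i>acts</i>.</Definition>"
              "</GlossaryItem></Item>",
    "UNIT_2": "<Item><GlossaryItem><Term>agent</Term>"
              "<Definition>An actor.</Definition></GlossaryItem></Item>",
}
print(aggregate_glossary(units))
```

Keying on the lower-cased term is what lets a lookup for one term link out to every unit that defines it.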

So… how was your day?! I’ve given up on mine…

Learning in Real Time..?

Believe me, I know, this blog is generally all over the place, with occasional 2-3 week forays down very narrowly focussed (if you’re outside the area) rabbit holes…

And I know, I know, comments like: “you’re so productive it’s hard to keep up” do recur, which makes me feel both good and bad.. because just STOP there for a minute…

…if you’re in full time HE studying how many hours a week, you’re expected to absorb how many new concepts and big ideas a week as part of your studies…?

…and I blog maybe 1-2 hours of study-time material a day (how many lecture hours a day does a full time student cope with)?

So where are we at? Folk do a degree to give them lifelong learning skills, and maybe get a grip on some power concepts/models that will help them keep making sense of the world after uni?

And to get the grades, they need to spend 20+ hours a week learning things new to them that bore the hell out of their lecturers because it’s f***ing obvious to a PhD in the subject (but the students are A-level educated, not 3 years full time PhDing about arcana in the subject, remember…)

I spend maybe 20 hours a week trying to learn stuff, and maybe 10 hours a week blogging what I’ve learned or observed. The learning generally comes from me asking myself a question about how to do or build something, and then trying to figure out how to do it given: a) what I already know/have blogged about; b) what I think I need to search for given what I know I don’t know how to do.

Two things come to mind here: 1) I refer to my own blog posts a lot because: a) I typically don’t remember how to do something I’ve done before; but b) can generally remember if I’ve figured out how to do something before and blogged it; 2) I enjoy figuring things out for the first time for me…. Then I blog it as an offboard memory device. If I come across a new problem, I try to recall a related problem I’ve blogged a solution to, or I go to Stack Overflow.

So.. am I productive? What does that mean???? I think I’m on a daily learning journey, and I blog the result. Students are too… The only difference is, they’re following a path that is curriculum decided and known in advance to their instructor, and I’m making my journey up on an hourly basis.

Here’s a question I used to ask to wind up folk in my department: what did you learn how to do for the first time today? My day generally doesn’t start until I’ve come up with a problem and figured out how to solve it. Which is maybe why I’m not productive in a corporate/institutional sense at all.

Just sayin’…

NOTE: this blog post was written/posted at way past my bed time in my own time…so please bear in mind that maybe it’s the cider talkin’…;-)