Unthinkable Thinking Made Real?

A few days ago was one of the highlights of my conference year, Internet Librarian International, which started on Monday when I joined up with Brian Kelly once again for a second (updated) outing of our Preparing for the Future workshop (Brian’s resources; my reference slides (unannotated for now, will be updated at some point)).

I hope to post some reflections on that over the next few days, but for now would like to mention one of the presentations on Tuesday – Thinking the unthinkable: a library without a catalogue by Johan Tilstra (@johantilstra) from Utrecht University Library. This project seems to have been in progress for some time, the main idea being that discovery happens elsewhere: the university library should not be focussing on providing discovery services, but instead should be servicing the delivery of content surfaced or discovered elsewhere. To support this, Utrecht are developing a browser extension – UU Easy Access – that will provide full text access to remote resources. As the blurb puts it, “[the extension] detects when you are on a website which the Utrecht University Library has a subscription to. This makes it easy for you to get access through the Library Proxy.”

This reminded me of an old experiment back from the days I hassled the library regularly, the OU Library Traveller extension (actually, a Greasemonkey script; remember Greasemonkey?;-)

It seems I only posted fragmentary posts about this doodle (OU Library Traveller – Title Lookup and OU Library Traveller – eBook Greedy, for example) but for those without long memories, here’s a brief recap: a long time ago, Jon Udell published a library lookup bookmarklet that would scan the URL of a web page you were on to see if it contained an ISBN (an International Standard Book Number), and if so, it would try to open up the page corresponding to that book on your local library catalogue.

I forget the various iterations involved, or related projects in the area (such as scripts that looked for ISBNs or DOIs in a webpage and rewrote them as links to a search on that resource via a library catalogue, or dereferencing of the DOI through a DOI lookup and libezproxy service), but at some point I ended up with a Greasemonkey script that would pop up a floating panel on any page that contained an ISBN, showing whether that book was in the OU Library, or available as a full text e-book. (Traffic light colour coded links also showed if the resource was available, owned by the library but currently unavailable, or not available.) I also had – still have, still use regularly – a bookmarklet that will rewrite the URL for subscription based content, such as an academic journal paper, so it goes via the OU library and (hopefully) provides me with full text access: OU libezproxy bookmarklet (see also Arcadia project: bookmarklets; I think some original, related “official-ish, not quite, yet, in testing” OU Library bookmarklets are still available here).
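
(Jon Udell’s original bookmarklet was a snippet of JavaScript; purely for illustration, here’s a minimal Python sketch of the same idea – spot something ISBN-shaped in a page URL and build a catalogue lookup link from it. The regex is crude and the catalogue search URL is a made-up placeholder, not a real service.)

```python
# Toy sketch of the LibraryLookup idea; the catalogue search URL below
# is an illustrative placeholder, not a real library service.
import re

ISBN_RE = re.compile(r"\b(?:\d{13}|\d{9}[\dXx])\b")  # bare ISBN-13 or ISBN-10

def catalogue_link(page_url, catalogue="https://library.example.ac.uk/search?isbn="):
    """Return a catalogue lookup URL if the page URL contains an ISBN-like string."""
    match = ISBN_RE.search(page_url)
    return catalogue + match.group(0) if match else None

print(catalogue_link("https://www.amazon.co.uk/dp/0262510871"))
# -> https://library.example.ac.uk/search?isbn=0262510871
```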

So the “Thinking the Unthinkable” presentation got me thinking that perhaps I had also been thinking along similar lines, and that perhaps I should revisit the code to provide an extension that would automatically enhance pages that contained somewhere about them an ISBN, DOI or web domain recognised by the OU’s libezproxy. (If any OU library devs are reading this, (Owen?!;-) it’d be really useful to have a service that could take a URL and then return a boolean flag to say whether or not the OU libezproxy service could do something useful with that URL… or provide me with a list of domains that the OU libezproxy service likes so I could locally decide whether to try to reroute a URL through it…) Hmm….
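
(As far as I know, no such lookup service exists; but if the library did publish a plain list of the domains its libezproxy instance knows about, the client-side check could be as simple as the following sketch. The domain file and proxy prefix are made-up placeholders, and the login?url= form is just a common EZProxy convention rather than anything OU-specific.)

```python
# Sketch only: reroute a URL through a library proxy if its host is on a
# locally cached allowlist. The filename and proxy prefix are hypothetical.
from urllib.parse import urlparse, quote

with open("libezproxy_domains.txt") as f:           # one domain per line
    PROXIED_DOMAINS = {line.strip() for line in f if line.strip()}

PROXY_PREFIX = "https://libezproxy.example.ac.uk/login?url="

def reroute(url):
    """Rewrite the URL via the proxy if its host matches a proxied domain."""
    host = (urlparse(url).hostname or "").lower()
    if any(host == d or host.endswith("." + d) for d in PROXIED_DOMAINS):
        return PROXY_PREFIX + quote(url, safe="")
    return url
```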

As I dug through old blog posts, I was also reminded of a couple of other things. Firstly, another competition hack that tried to associate courses with books using a service published by Dave Patten at the University of Huddersfield. Hmm… Thinks… Related content… or alternative content, maybe… so if I’m on a journal page somewhere, maybe I could identify whether an open access copy is available from a university repository…? (Which I guess is what Google Scholar often does when it links to a PDF copy of a paper?)

Secondly, I was reminded of another presentation I gave at ILI six years ago (the slides are indecipherable and without annotation) on “The Invisible Library” (which built on a similarly titled internal OU presentation from a few weeks earlier).

The original idea was that libraries could provide invisible helpdesk support through monitoring social media channels, but also included elements of providing locally mediated access to remotely discovered items in an invisible way through things like the OU Library Traveller. It also seems to refer to “contentless” libraries, (eg as picked up in this April Fool), and perhaps foreshadows the idea of an open access academic library.

So I wonder – time to revisit this properly, and try to recapture the (unthinkable?) thinking I was thinking back then?

PS I also notice that around that time I was experimenting with Google Custom search engines. This is the second time in as many months I’ve rediscovered my CSE doodles (previously with Creating a Google Custom Search Engine Over Hyperlocal Sites). Maybe it’s time I revisited them again, too…?

When Your Identity Can Be Used Against You…

A couple of news stories came to my attention today, one via a news stand, one via a BBC news report that – it turns out – was recapitulating an earlier story.

Both of them demonstrate how you, the user of the online service, are only a “valued customer” insofar as you help generate revenue.

In the first case, it seems that Amazon – doyens of good employment and tax practice, ahem – are going to start suing folk who publish fake reviews on the site.


Ah, bless… Amazon fighting on behalf of the consumer…

A more timely report – including a copy of the complaint – is posted on Geekwire.

Apparently, “[d]efendants are misleading Amazon’s customers and tarnishing Amazon’s brand for their own profit and the profit of a handful of dishonest sellers and manufacturers. Amazon is bringing this action to protect its customers from this misconduct, by stopping defendants and uprooting the ecosystem in which they participate.”.

It seems that each reviewer of a product has agreed to and is bound by the Conditions of Use of the Amazon site, so I guess that before posting a review, we should all be reading the Ts & Cs… Of course, Amazon ensures that folk using any of its websites have read – and understood – all the terms and conditions of the relevant site. Ahem, again… Cough, splutter, choke….

In case you’re interested, here are the US Terms and Conditions, under acceptance of which “you agree that the Federal Arbitration Act, applicable federal law, and the laws of the state of Washington” apply – presumably because, even though Amazon is a Delaware corporation (not that Delaware not being the most transparent of jurisdictions is likely to have anything to do with that?!), its principal place of business is in Seattle, Washington. Sort of. In the UK, for example – which is to say, the EU – it’s based in Luxembourg, presumably to help its tax position…? It must be nice being a big enough company to choose what jurisdiction to put what part of your business in, so you can, erm, maximise the benefits…

Although the current case is playing out against Amazon.com users, in case you’re interested, here are the UK Ts & Cs. Read and digest… Remember: you almost undoubtedly signed up to them… Ahem…

(Another by the by on partially related matters – it seems that if you work for Facebook in the UK, business has been good and you can expect a pretty good bonus, but if you’re HM Revenue & Customs, there’s not been that much business coming your way as far as tax is concerned (Facebook paid £4,327 corporation tax despite £35m staff bonuses). In passing, I wonder if Facebook pay the cleaning and ancillary staff that service Facebook’s UK premises a living wage, or whether we should be sleeping happy that soon enough these employees won’t be receiving as much “UK taxpayer’s money” in the form of (soon to be cut) working tax credits – that form of corporate welfare payment used to support companies at state expense, in lieu of them paying a reasonable wage (even leaving tax affairs aside)…) I’m not sure who’s taking the p**s more – international corporations or UK Gov…?

Here’s the root of the second story that caught my eye, in which it seems that successful online gamblers are deemed persona non grata if they’re no longer “revenue positive”, or whatever the phrase is.


So bear in mind that when companies collect your personal data, the benefit that they ultimately want to derive is to the company, not to you, the individual user. Sucker…

Idle Thoughts on “Data Literacy” in the Library…

In part for a possible OU Library workshop, in part trying to mull over possible ideas for an upcoming ILI2015 workshop with Brian Kelly, I’ve been pondering what sorts of “data literacy” skills are in-scope for a typical academic library.

As a starting point, I wonder if this slicing is useful, based on the ideas of data management, discovery, reporting and sensemaking.


It identifies four different, though interconnected, sorts of activity, or concern:

  • Data curation questions – research focus – covering the management, archiving and dissemination of research data. This is mainly about policy, but raises the question of who to go to for the technical “data engineering” issues, and assumes that the researcher can do the data analysis/data science bits.
  • Data resourcing – teaching focus – finding data, and perhaps helping identify processes to preserve it, for use in a teaching context.
  • Data reporting – internal process focus – capturing, making sense of/analysing, and communicating data relating to library related resources or activities; to what extent should each librarian be able to use and invoke data as evidence relating to their day job activities? Could include giving data to course teams about resource utilisation, or to research teams to demonstrate impact in terms of tracking downloads and use of OU published resources.
  • Data sensemaking – info skills focus – PROMPT in a data context, but also raising the question of who to go to for “data computing” applications or skills support (cf academic/scientific computing support, application training); also relates to ‘visual literacy’ in the sense of interpreting data visualisations, and to methods for engaging in data storytelling and academic communication.

Poking in to each of those areas a little further, here’s what comes to mind at first thought…

Data curation

The library is often the nexus of activity around archiving and publishing research papers as part of an open access archive (in the OU, this is via ORO: Open Research Online). Increasingly, funders (and publishers) require that researchers make data available too, often under an open data license. Into this box I’m putting those activities related to supporting the organisation, management, archiving, and publication of data related to research. It probably makes sense to frame this in the context of a formal lifecycle of a research project and either the various touchpoints that the lifecycle might have with the library, or those areas of the lifecycle where particular data issues arise. I’m sure such things exist, but what follows is an off-the-top-of-my-head informal take on it…!

Initial questions might relate to putting together (and costing) a research data management plan (planning/bidding, data quality policies, metadata plans etc). There might also be requests for advice about sharing data across research partners (which might extend privacy or data protection issues over and above any immediate local ones). In many cases, there may be concerns about linking to other datasets (for example, in terms of licensing or permissions, or relating to linked or derived data use; mapping is often a big concern here), or other, more mundane, operational issues (how do I share large datafiles that are too big to email?). Increasingly, there are likely to be publication/dissemination issues (how/where/in what format do I publish my data so it can be reused, how should I license it?) and legacy data management issues (how/where can I archive my data? what file formats should I use?). A researcher might also need support in thinking through the consequences – or requirements – of managing data in a particular way. For example, particular dissemination or archiving requirements might inform the choice of data management solution from the start: if you use an Access database, or a directory full of spreadsheets, during the project with one set of indexing, search or analysis requirements, you might find a certain amount of re-engineering work needs to be done in the dissemination phase if there is a requirement that the data is published at record level on a public webpage with different search or organisational requirements.

What is probably out of scope for the library in general terms, although it may be in scope for more specialised support units working out of the library, is providing support in actual technology decisions (as opposed to raising technology specification concerns…) or operations: choice of DBMS, for example, or database schema design. That said, who does provide this support, or who should the library suggest might be able to provide such support services?

(Note that these practical, technical issues are totally in scope for the forthcoming OU course TM351 – Data management and analysis…;-)

Data resourcing

For the reference librarian, requests are likely to come in from teaching staff, students, or researchers about where to locate or access different sources of data for a particular task. For teaching staff, this might include identifying datasets that can be used in the context of a particular course, possibly over several years. This might require continuity of access via a persistent URL to different sorts of dataset: a fixed (historical) dataset, for example, or a current, “live” dataset, reporting the most recent figures month on month or year on year. Note that there may be some overlap with data management issues, for example, ensuring that data is both persistent and provided in a format that will remain appropriate for student use over several years.

Researchers too might have third party data discovery or access requests, particularly with respect to accessing commercial or privately licensed data. Again, there may be overlaps with data management concerns, such as how to manage secondary or third party data appropriately so it doesn’t taint the future licensing or distribution of first party or derived data.

Students, like researchers, might have very specific data access requests – either for particular datasets, or for specific facts – or require more general support, such as advice in citing or referencing sources of secondary data they have accessed or used.

Data reporting

In the data reporting bin, I’m thinking of the various data reporting tasks the library might be asked to perform by teaching staff or researchers, as well as data work that has to be done internally within the library, by librarians, for themselves. That is, tasks within the library that require librarians to employ their own data handling skills.

So for example, a course team might want to know which library managed resources referenced from course material are being used, when, and by how many students. Or learning analytics projects may request access to data to help build learner retention models.

A research team might be interested in the number of research paper or data downloads from the local repository, or in citation analyses, or other sources of bibliometric data, such as journal metrics or altmetrics, for assessing the impact of a particular project.

And within the library, there may be a need for working with and analysing data to support the daily operations of the library – staffing requirements on the helpdesk based on an analysis of how and when students call on it, perhaps – or to feed into future planning. Looking at journal productivity, for example (how often journals are accessed, or cited, within the institution), when it comes to renewal (or subscription checking) time; or at a more technical level, building recommendation systems on top of library usage data. Monitoring the performance of particular areas of the library website through website analytics, or even linking out to other datasets and looking at the impact of library resource utilisation by individual students on their performance.
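
(By way of a concrete, if hypothetical, illustration of that helpdesk example: given an export of enquiries with a timestamp column – the filename and column name below are invented – a few lines of pandas gives a weekday-by-hour picture of when the desk is busiest.)

```python
# Sketch: when is the helpdesk busiest? "enquiries.csv" and its
# "timestamp" column are hypothetical; one row per enquiry is assumed.
import pandas as pd

enquiries = pd.read_csv("enquiries.csv", parse_dates=["timestamp"])
busy = (enquiries
        .assign(weekday=enquiries["timestamp"].dt.day_name(),
                hour=enquiries["timestamp"].dt.hour)
        .groupby(["weekday", "hour"])
        .size()                    # count of enquiries per weekday/hour slot
        .unstack(fill_value=0))    # weekday rows, hour-of-day columns
print(busy)
```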

Data sensemaking

In this category, I’m lumping together a range of practical tools and skills to complement the tools and skills that a library might nurture through information skills training activities (something that’s also in scope for TM351…). So for example, one area might be providing advice about how to visualise data as part of a communication or reporting activity, both in terms of general data literacy (use a bar chart, not a pie chart, for this sort of data; switch the misleading colours off; sort the data to better communicate this rather than that, etc) as well as tool recommendations (try using this app to generate these sorts of charts, or this webservice to plot that sort of map). Another might be how to read, interpret, or critique a data visualisation (looking at crappy visualisations can help here!;-), or rate the quality of a dataset in much the same way you might rate the quality of an article.

At a more specialist level, there may be a need to service requests about what tools to use to work with a particular dataset – a digital humanities researcher looking for advice on a text mining project, for example.

I’m also not sure how far along the scale of search skills library support needs to go, or whether different levels of (specialist?) support need to be provided for undergrads, postgrads and researchers. Certainly, if your data is in a tabular format, even just as a Google spreadsheet, you become much more powerful as a user if you can frame complex data queries (pivot tables, anyone?) or start customising SQL queries. Being able to merge datasets, filter them (by row, or by column), facet them, cluster them or fuzzy join them are really powerful data skills to have – and ones that can conveniently be developed within a single application such as OpenRefine!;-)
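
(To make a couple of those moves concrete, here’s a minimal pandas sketch of a join and a spreadsheet-style pivot table over two hypothetical files; the filenames and column names are invented for illustration, and a proper fuzzy join would need more than this.)

```python
# Sketch of a join, a pivot table and a row filter over hypothetical files.
import pandas as pd

loans = pd.read_csv("loans.csv")      # e.g. columns: course_code, item_type, loans
courses = pd.read_csv("courses.csv")  # e.g. columns: course_code, faculty

merged = loans.merge(courses, on="course_code", how="left")   # merge datasets
by_faculty = merged.pivot_table(index="faculty",
                                columns="item_type",
                                values="loans",
                                aggfunc="sum",
                                fill_value=0)                  # pivot table
science_only = merged[merged["faculty"] == "Science"]          # filter by row
print(by_faculty)
```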

Note that there is likely to be some cross-over here also between the resource discovery role described above and helping folk develop their own data discovery and criticism skills. And there may also be requirements for folk in the library to work on their own data sensemaking skills in order to do the data reporting stuff…


So, is that a useful way of carving up the world of data, as the library might see it?

The four different perspectives on data related activities within the library described above not only cover data related support services offered by the library to other units, but also suggest a need for data related skills within the library to service its own operations.

What I guess I need to do is flesh out each of the topics with particular questions that exemplify the sort of question that might be asked in each context by different sorts of patron (researcher, educator, learner). If you have any suggestions/examples, please feel free to chip them in to the comments below…;-)

Patently Imagined Futures, (or, what’s Facebook been getting up to recently?)

One of the blogs on my “must read” list is Bill Slawski’s SEO by the Sea, which regularly comments on a wide variety of search related patents, both recent and in the past, obtained by Google and what they might mean…

The US patent system is completely dysfunctional, of course, acting as a way of preventing innovative competition in a way that I think probably wasn’t intended by its framers, but it does provide an insight into some of the crazy bar talk ideas that Silicon Valley types thought they might just go and try out on millions of people, or perhaps already are trying out.

As an example, here are a couple of patents from Facebook that recently crossed my radar.


Images uploaded by users of a social networking system are analyzed to determine signatures of cameras used to capture the images. A camera signature comprises features extracted from images that characterize the camera used for capturing the image, for example, faulty pixel positions in the camera and metadata available in files storing the images. Associations between users and cameras are inferred based on actions relating users with the cameras, for example, users uploading images, users being tagged in images captured with a camera, and the like. Associations between users of the social networking system related via cameras are inferred. These associations are used beneficially for the social networking system, for example, for recommending potential connections to a user, recommending events and groups to users, identifying multiple user accounts created by the same user, detecting fraudulent accounts, and determining affinity between users.

Which is to say: traces of the flaws in a particular camera that are passed through to each photograph are distinctive enough to uniquely identify that camera. (I note that academic research picked up on by Bruce Schneier demonstrated this getting on for a decade ago: Digital Cameras Have Unique Fingerprints.) So when a photo is uploaded to Facebook, Facebook can associate it with a particular camera. And by association with who’s uploading the photos, a particular camera, as identified by the camera signature baked into a photograph, can be associated with a particular person. Another form of participatory surveillance, methinks.
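
(Not Facebook’s actual method, obviously, but as a toy sketch of the sensor-noise idea: average the high-frequency residual left after smoothing several photos from the same camera and you get a “fingerprint” that can be correlated against the residual of a new photo. This assumes same-sized greyscale images and glosses over everything that makes real PRNU analysis robust.)

```python
# Toy sketch of a sensor-noise ("PRNU"-style) camera fingerprint; images
# are assumed to be identically sized greyscale numpy arrays.
import numpy as np
from scipy.ndimage import uniform_filter

def residual(img, size=3):
    """High-frequency noise left after subtracting a local average."""
    img = img.astype(float)
    return img - uniform_filter(img, size=size)

def camera_fingerprint(images):
    """Average residual across several photos from the same camera."""
    return np.mean([residual(img) for img in images], axis=0)

def match_score(img, fingerprint):
    """Normalised correlation between a photo's residual and a fingerprint."""
    r, f = residual(img), fingerprint
    r = (r - r.mean()) / (r.std() + 1e-9)
    f = (f - f.mean()) / (f.std() + 1e-9)
    return float(np.mean(r * f))
```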

Note that this is different to the various camera settings that get baked into photograph metadata (you know, that “administrative” data stuff that folk would have you believe doesn’t really reveal anything about the content of a communication…). I’m not sure to what extent that data helps narrow down the identity of a particular camera, particularly when associated with other bits of info in a data mosaic, but it doesn’t take that many bits of data to uniquely identify a device. Like your web-browser’s settings, for example, that are revealed to webservers of sites you visit through browser metadata, and uniquely identify your browser. (See eg this paper from the EFF – How Unique Is Your Web Browser? [PDF] – and the associated test site: test your browser’s uniqueness.) And if your camera’s also a phone, there’ll be a wealth of other bits of metadata that let you associate camera with phone, and so on.
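
(The back-of-the-envelope arithmetic is worth spelling out: around 33 bits is enough to single out one device among everyone on the planet, and each browser attribute chips in its own few bits. The figures below are purely illustrative.)

```python
import math

# ~33 bits is enough to pick out one device among roughly 8 billion people.
print(math.log2(8e9))        # ≈ 32.9

# Each attribute contributes -log2(p) bits if a fraction p of browsers
# share your value; the 1/1500 figure is illustrative only.
print(-math.log2(1 / 1500))  # ≈ 10.6 bits from a single header
```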

Facebook’s face recognition algorithms can also work out who’s in an image, so more relationships and associations there. If kids aren’t being taught about graph theory in school from a very young age, they should be… (So for example, here’s a nice story about what you can do with edges: SELECTION AND RANKING OF COMMENTS FOR PRESENTATION TO SOCIAL NETWORKING SYSTEM USERS. Here’s a completely impenetrable one: SYSTEMS, METHODS, AND APPARATUSES FOR IMPLEMENTING AN INTERFACE TO VIEW AND EXPLORE SOCIALLY RELEVANT CONCEPTS OF AN ENTITY GRAPH.)

Here’s another one – hinting at Facebook’s role as a future publisher:


An online publisher provides content items such as advertisements to users. To enable publishers to provide content items to users who meet targeting criteria of the content items, an exchange server aggregates data about the users. The exchange server receives user data from two or more sources, including a social networking system and one or more other service providers. To protect the user’s privacy, the social networking system and the service providers may provide the user data to the exchange server without identifying the user. The exchange server tracks each unique user of the social networking system and the service providers using a common identifier, enabling the exchange server to aggregate the users’ data. The exchange server then applies the aggregated user data to select content items for the users, either directly or via a publisher.

I don’t really see what’s clever about this – using an ad serving engine to serve content – even though Business Insider try to talk it up (Facebook just filed a fascinating patent that could seriously hurt Google’s ad revenue). I pondered something related to this way back when, but never really followed it through: Contextual Content Server, Courtesy of Google? (2008), Contextual Content Delivery on Higher Ed Websites Using Ad Servers (2010), or Using AdServers Across Networked Organisations (2014). Note also this remark on the University of Bedfordshire using Google Banner Ads as On-Campus Signage (2011).

(By the by, I also note that Google has a complementary service where it makes content recommendations relating to content on your own site via AdSense widgets: Google Matched Content.)

PS not totally unrelated, perhaps, a recent essay by Bruce Schneier on the need to regulate the emerging automatic face recognition industry: Automatic Face Recognition and Surveillance.

How Much Time Should an Online Course Take?

Five years or so ago, when MOOCs were still a new thing, I commented on what seemed to be the emerging typical duration of open online courses: Open Courses: About 10 Weeks Seems To Be It, Then?

For the OU’s 10 week short courses, which nominally required up to 10 hours study a week (the courses were rated at 10 CAT points), this meant a duration of 100 hours. The cost (at the time) of those courses was about £150, I think. So about £1.50 an hour purchase cost.

Looking at the upcoming OU FutureLearn course Learn to code for data analysis, the time commitment is 4 weeks at 3-4 hours per week, so about 15 hours. If you don’t want to pay anything, you don’t have to.

Although I can’t offhand find any previous OUseful.info blog posts comparing courses to things like books or games (and, I guess, DVD/streamed TV “box sets”) as “cultural content consumption items”, it’s one of the reference points I often think about when it comes to trying to imagine how a course – formal (for credit), or informal – fits into the life of the student amongst other competing demands on their time, attention and finances. If someone is going to take a course for the first time and spend time/attention/cash on it, does the study pattern neatly replace or substitute a previous pattern of activity, or does it require a more significant change in a learner’s daily or weekly habits? In other words, what are the attention economics associated with taking a course?

This was all brought to mind again lately when I spotted this post – Forty Hours – which opens with the observation that “the majority of videogames were made on the assumption that they would be played for forty hours. Now, games are being made to be played for longer and longer.” (I’ve no idea if this is true or not; I don’t really follow game culture. Maybe the longer games are ones where there is an element of social (especially 2-way audio) enhanced gameplay?)

If true, this seems to contrast with the shortening of courses that is perhaps taking place on FutureLearn (again, I don’t have the data to back this up; it’s just an impression; nor do I have the data about evolving course length more widely in MOOC space. Presumably, the Open Education Research Hub is the sort of place where I should be able to find this sort of data?)

If that is the case, then why are games getting longer and online open courses shorter (if, indeed, they are? And in formal ed, where does semesterisation sit in all this?). As the Forty Hours post goes on:

[E]very major commercial game now attempts to ‘capture’ its audience for at least 200 hours, with multiplayer modes being the core method of retention. The forty hour model was a consequence of selling games-as-products, as boxed content that would be played then thrown onto a pile of completed games (although it turns out that the minority of players finish games). The 200 hour model is a consequence of selling games-as-services, with monetization now an on-going process throughout the time the players are engaged with the title in question. …

The big money is no longer out to hold a player’s attention for forty hours, but to hold a player’s attention long enough to get the next game out, or to hold on to groups of players in the hope to pull in a few big spenders, or to hold the player’s attention throughout the year with events crafted to maintain appeal and bring back those who are slipping away into other games. Hobby players – those who commit to a game service over the long term – often play other games on the side, which is a tiny crumb of good news for indies making smaller games. …

The game-as-product approach where the forty hour model had dominated still survives, but only where it has proved difficult or impossible to tie players down for longer lengths of time. The market for videogames is ceasing to be one of packaged experience (like movies and novels) and becoming a fight for retention, as more and more games in the upper market shift their design towards training new hobby players in a ongoing economy.

In other words, why are we looking to shorten the relationship someone has with a course? Is this so we can extend the relationship the platform has with the learner by getting them to take more, shorter courses rather than fewer, longer ones? (UPDATE: Or as Helen Noble points out in a comment, is it because the MOOC is actually a loss-leading tease intended to draw students into a longer formal commitment? As opposed to being an alumni touch point, encouraging a graduate to maintain some sort of contact with their alma mater in the hope of getting a donation or bequest out of them later in life?!)

In terms of the completion commitment pitch (that is, what sort of commitment is required of folk to complete a course, or a game), what do the attention-spending cultural content consumers respond to? And how do the economics of competing concerns play out?

(That sounds like a marketing concern, doesn’t it? But it presumably also impacts on learning design within and across courses?)

So It Seems I’m Now A Senior Lecturer…

…although I haven’t actually seen the letter yet, but our HoD’s announcement about the latest successful promotions went round earlier today, so I’m hoping that counts…!

I took considerable persuading to put a case in (again…) but thanks to everyone (particularly David, Karen and Mark, and Andy from attempts past) who put the hours in improving the multiple revisions of the case as it progressed through the OU promotions process and supporting me along the way – as well as those Deans and HoDs past who’ve allowed me to get away with what I’ve been doing over the last few years;-)

If anyone wants to see a copy of the case I made, I’m happy to let you have a copy…

Anyway… is this now the time to go traditional, stop blogging, and start working on getting the money in and preparing one or two journal publications a year for an academic readership in the tens?!


We Are Watching, and You Will be Counted

Two or three weeks ago, whilst in Cardiff, I noticed one of these things for the first time:


It counts the number of cyclists who pass by it, and it’s a great example of the sort of thing that could perhaps be added to a “data walk”, along with several other examples of data-revealing street furniture as described by Leigh Dodds in Data and information in the city.

It looks like it could be made by a company called Falco – this Falco Cycle Counter CB650 (“[a]lready installed for Cardiff County Council as well as in Copenhagen and Nijmegen”)? (Falco also make another, cheaper one, the CB400.)

From the blurb:

The purpose of the Falco Cycle Counter is to show the number of cyclists on a bicycle path. It shows the number of cyclists per day and year. At the top of the Counter there is a clock indicating time and date. On the reverse it is possible to show city map or other information, alternatively for a two-way bicycle path it is possible to have display on both side of the unit. Already installed for Cardiff County Council as well as in Copenhagen and Nijmegen, three very strong cycling areas, the cycle counter is already proving to be an effective tool in managing cycle traffic.

As with many of these sorts of exhibit, it can phone home:

When configured as a Cycle Counter, the GTC can provide a number of functions depending on the configuration of the Counter. It is equipped with a modem for a SIM card use which provides a platform for mobile data to be exported to a central data collection system.

This makes possible a range of “on-… services”, for example: [g]enerates individual ‘buy-in’ from local people via a website and web feed plus optional Twitter RSS enabling them to follow the progress of their own counter personally.

I was reminded of this appliance (and should really have blogged it sooner) by a post today from Pete Warden – Semantic Sensors – in which he remarked on spotting an article about “people counters” in San Francisco that count passing foot traffic.

In that case, the counters seem to be provided by a company called Springboard who offer a range of counting services using a camera based counting system: a small counting device … mounted on either a building or lighting/CCTV column, a virtual zone is defined and pedestrians and cars who travel through the zone are recorded.

Visitor numbers are recorded using the very latest counting software based on “target specific tracking”. Data is audited each day by Springboard and uploaded daily to an internet server where it is permanently stored.

Target specific tracking software monitors flows by employing a wide range of characteristics to determine a target to identify and track.

Here’s an example of how it works:

As Pete Warden remarked, [t]raditionally we’ve always thought about cameras as devices to capture pictures for humans to watch. People counters only use images as an intermediate stage in their data pipeline, their real output is just the coordinates of nearby pedestrians.

He goes on:

Right now this is a very niche application, because the systems cost $2,100 each. What happens when something similar costs $2, or even 20 cents? And how about combining that price point with rapidly-improving computer vision, allowing far more information to be derived from images?

Those trends are why I think we’re going to see a lot of “Semantic Sensors” emerging. These will be tiny, cheap, all-in-one modules that capture raw noisy data from the real world, have built-in AI for analysis, and only output a few high-level signals.

For all of these applications, the images involved are just an implementation detail, they can be immediately discarded. From a systems view, they’re just black boxes that output data about the local environment.
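
(To make the “images as an implementation detail” point concrete, here’s a minimal sketch of a people counter built from OpenCV’s stock HOG pedestrian detector: a frame goes in, only the count of detections inside a virtual zone comes out, and the frame itself is thrown away. The zone coordinates and camera index are made-up values.)

```python
# Sketch of a "semantic sensor": a frame goes in, only a count comes out.
# The zone coordinates and camera index are illustrative placeholders.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

ZONE = (100, 200, 500, 400)  # x1, y1, x2, y2 of the virtual counting zone

def count_people_in_zone(frame):
    """Count pedestrian detections whose centres fall inside ZONE."""
    rects, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    x1, y1, x2, y2 = ZONE
    centres = ((x + w // 2, y + h // 2) for (x, y, w, h) in rects)
    return sum(x1 <= cx <= x2 and y1 <= cy <= y2 for cx, cy in centres)

cap = cv2.VideoCapture(0)                 # first attached camera
ok, frame = cap.read()
if ok:
    print(count_people_in_zone(frame))    # the image itself is then discarded
cap.release()
```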

Using cameras to count footfall appears to be nothing new – for example, the Leeds Data Mill openly publish Leeds City Centre footfall data collected by the council from “8 cameras located at various locations around the city centre [which monitor] numbers of people walking past. These cameras calculate numbers on an hourly basis”. I’ve also briefly mentioned several examples regarding the deployment of related technologies before, for example The Curse of Our Time – Tracking, Tracking Everywhere.

From my own local experience, it seems cameras are also being used (apparently) to gather evidence about possible “bad behaviour” by motorists. Out walking the dog recently, I noticed a camera I hadn’t spotted before:


It’s situated at the start of a footpath, hedged on both sides, that runs alongside the road, although the mounting suggests that it doesn’t have a field of view down the path. Asking in the local shop, it seems as if the camera was mounted to investigate complaints of traffic accelerating off the mini-roundabout and cutting up pedestrians about to use the zebra crossing:


(I haven’t found any public consultations about mounting this camera, and should really ask a question, or even make an FOI request, to clarify by what process the decision was made to install this camera, when it was installed, for how long, for what purpose, and whether it could be used for other ancillary purposes.)

On a slightly different note, I also see that earlier this year Amazon acquired internet of things platform operator 2lemetry, “an IoT version of Enterprise Application Integration (EAI) middleware solutions, providing device connectivity at scale, cross-communication, data brokering and storage”, apparently. In part, this made me think of an enterprise version of Pachube, as was (now Xively?).

So is Amazon going to pitch against Google (in the form of Nest), or maybe Apple, perhaps building “home things” services around a home hub server? After all, they already have a listening, connected voice device for your home in the form of the voice controlled Amazon Echo (a bit like a standalone Siri, Cortana or Google Now). (Note to self: check out the Amazon Alexa Voice Services Developer Kit some time…)

As Pete Warden concluded, “it seems obvious to me that machine vision is becoming a commodity”. What might we expect as and when listening and voice services also become a commodity?

Related: a recent article from the Guardian posing the question What happens when you ask to see CCTV footage? – as is your right, by making a subject access request under the Data Protection Act – picks up on a recently posted paper by OU academic Keith Spiller: Experiences of accessing CCTV data: the urban topologies of subject access requests.