Several years ago, I used to run a joint EPSRC & AHRB funded research network, the Creative Robotics Research Network (CRRN). The idea behind the network was to provide a forum for academics and practitioners with an interest in creative applications of robotics to share ideas, experience and knowledge.
We had a lot of fun with the network – the mailing list was active, we hosted several events, and on one network visit to a special effects company, I have a hazy memory of flamethrowers being involved… Erm…
Anyway, last weekend I went to a Raspberry Pi hackday organised by ex-IW resident Dr Lucy Rogers at Robin Hill, site of the Bestival for any festival goers out there, and currently taking the form of the electric woods, an atmospheric woodland sound and light show with a great curry along the way. If you can get on to the Island for half term, make an evening of it…
The event was sponsored by Alec Dabell, owner of Vectis Ventures, which also runs the Island’s theme park – Blackgang Chine. (If you’ve ever holidayed on the Island as a child or with kids of your own, you’ll know it… :-) The idea? To play with some tech that can be worked up for controlling Blackgang’s animatronic dinosaurs or the light shows at Robin Hill and Blackgang Chine, as well as learning something along the way. (IBM’s Andy Stanford-Clark, another Island resident, pitched in with a talk on LoRa, a low power, long range radio protocol for the internet of things, as well as being on hand to help out those of us getting to grips with NodeRED and MQTT for the first time ;-)
Here’s a clip from a previous event…
Also at the event was another ex-CRRN member, Mat Walker, with his latest creation: Ohbot.
Designed as a desktop “talking head” robot for educational use, the Arduino controlled Ohbot has seven servos to control the motion of the head, lips and eyes and eyelids, as well as colour LEDs in the eyes themselves.
Text-to-speech support also provides a good motivation for trying to get the lip synching to work properly. The Ohbot has a surprisingly expressive face, more so even than the remarkably similar one rendered in the simulator that comes as part of the programming environment. With an extra web cam, Ohbot can be programmed to move its head – and eyes – to follow you around the room…
Needless to say, Ohbot got me thinking… And here’s how…
One of the things being developed in the OU at the moment is a remote engineering lab, part of the wider OpenSTEM lab. The engineering lab, which is being put together by uberhacker Tim Drysdale, should go live to second year equivalent OU engineering students in October next year (I think?) and third year equivalent students the year after.
The lab itself has multiple bays for different physical experiments, with several instances of each experiment to allow several students individual access to the same experiment at the same time.
One of the first experiments to be put together is a mechanical pendulum – students can log in to the apparatus, control the motion of the pendulum, and observe its behaviour in real time via a live video feed, as well as data traces from instrumentation applied to the apparatus. One of the things Tim has been working on is getting the latency of the control signals and the video feed right down – and it seems to be looking good.
Another couple of courses in production at the OU at the moment are two first year equivalent computing courses. The first one of these teaches students basic programming using Scratch (I have issues with this, but anyway…); Ohbot also uses a blockly style user interface, although it’s currently built just for Windows machines, I think?
Hmmm… as part of the Open Engineering Lab, the OU has bought three (?) Baxter robots, with the intention that students will be able to log in and programmatically control them in real time. I seem to recall there was also some discussion about whether we could run some Lego EV3 robots, perhaps even mobile ones. The problem with mobile robots, of course, is the “activity reset” problem. The remote experimentation lab activities need to run without technician support, which means they need to clear down in a safe way at the end of each student’s activity and reset themselves for the next student to log in to them. With mobile robots, this is an issue. But with Ohbot, it should be a doddle? (We’d probably have to rework the software, but that in turn may be something that could be done in collaboration with the Ohbot guys…)
Keenly priced at under a couple of hundred squids, with sensors, I can easily imagine a shelf with 8 or so Ohbot bays providing an interactive remote robot programming activity for our first year computing, as well as engineering, students. The question is, can I persuade anyone else that this might be worth exploring..?
Erm… a Word document with some images and captions – styled as such:
Some basic IT knowledge – at least – it should be basic in what amounts to a publishing house:
The .docx file is just a zip file… That is, a compressed folder and its contents… So use the .zip…
So here’s the unzipped folder listing – can you spot the images?
The XML content of the doc – viewed in Firefox (drag and drop the file into a Firefox browser window). Does anything jump out at you?
Computers can navigate to the tags that contain the caption text by looking for the Caption style. It can be a faff associating the image captions with the images though (you need to keep tallies…) because the Word XML for the figure doesn’t seem to include the filename of the image… (I think you need to count your way through the images, then relate that image index number with the following caption block?)
So re: the email – if authors tag the captions and put captions immediately below an image – THE MACHINE CAN DO IT, if we give someone an hour or two to knock up the script and then probably months and months and months arguing about the workflow.
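By way of a rough sketch of the sort of script I have in mind (and it is only a sketch): the following walks through the paragraphs in word/document.xml in order, keeps a tally of the images it has passed, and pairs each paragraph styled as a caption with the most recent image. The function names are my own, and I'm assuming the caption paragraphs carry the style id Caption (the usual id for Word's built-in caption style, but check your own document) and that each caption sits immediately below its image.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used throughout word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def pair_captions(document_xml, caption_style="Caption"):
    """Return (image_index, caption_text) pairs in document order.

    image_index is a 1-based tally of the images seen so far, or None
    if a caption turns up with no image before it.
    """
    root = ET.fromstring(document_xml)
    pairs = []
    image_count = 0
    pending = None  # index of the latest image still waiting for a caption
    for para in root.iter(f"{{{W}}}p"):
        drawings = para.findall(".//w:drawing", NS)
        if drawings:
            image_count += len(drawings)
            pending = image_count
        style = para.find("w:pPr/w:pStyle", NS)
        if style is not None and style.get(f"{{{W}}}val") == caption_style:
            text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
            pairs.append((pending, text))
            pending = None
    return pairs


def captions_from_docx(path):
    """Open the .docx as the zip file it is and pair up its captions."""
    with zipfile.ZipFile(path) as archive:
        return pair_captions(archive.read("word/document.xml"))
```

Calling captions_from_docx("report.docx") (a made-up filename) should then give a list along the lines of image number, caption text, which is most of what the workflow argument would be about.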
PS I’d originally screencaptured and directly pasted the images shown above into a Powerpoint presentation:
I could have recaptured the screenshots, but it was much easier to save the Powerpoint file, change the .pptx suffix to .zip, unzip the folder, browse the unzipped Powerpoint media folder to see which image files I wanted:
and then just upload them directly to WordPress…
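The rename-and-unzip step isn't strictly needed if you're happy to run a few lines of Python, because the zipfile module will open a .pptx or .docx file directly. A minimal sketch, with function names of my own invention, and assuming the images live in a media folder inside the archive (ppt/media for Powerpoint, word/media for Word):

```python
import zipfile
from pathlib import Path


def list_media(office_file):
    """Return the names of files held in any media/ folder in the archive."""
    with zipfile.ZipFile(office_file) as archive:
        return sorted(
            name for name in archive.namelist()
            if "/media/" in name and not name.endswith("/")
        )


def extract_media(office_file, out_dir):
    """Copy the media files into out_dir (flattening the folder structure)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(office_file) as archive:
        for name in list_media(office_file):
            target = out_dir / Path(name).name
            target.write_bytes(archive.read(name))
            saved.append(target)
    return saved
```

So something like extract_media("slides.pptx", "images") (both names made up) should leave the image files sitting in an images folder ready for uploading.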
See also: Authoring Multiple Docs from a Single IPython Notebook for another process that could be automated, but that lack of imagination and understanding just blanks out.
Via my feeds, a post on the Google Operating System blog that notes Google Converts Queries Into Questions:
When searching for [alcohol with the highest boiling], Google converted my query into a question: “Which alcohol has the highest boiling point?”
I ran a related query – alcohol with highest boiling point – which offered a range of related questions, albeit further down the results list:
Google results trying to draw you into a conversation – and hence running more queries (or questions…)?
A few months ago I bought a Raspberry Pi starter kit that included a Raspberry Pi 3, a box for it, a micro-usb power supply and a micro-SD card with a suitable O/S loaded onto it.
Today I tried to get it up and running…
I put the SD card into the Pi, plugged the power in, the red light went on, solid – and nothing happened… Plug an ethernet cable from my Mac into the back – no lights there either…
I unplugged one of the boxes from the telly and shoved the HDMI cable in the Pi – zilch.
I tried to look at the SD card on my Mac – and it didn’t mount. Another Mac, and another “frame” to put the micro-SD card in… still nothing…
So I suspect, the SD card, which I’m guessing added about a tenner to the price, didn’t work at all… pile of crap.
Go digging around the Lego EV3 kits for another micro-SD card; find one; download an image of the Raspbian O/S, and unzip. Install the dead easy to use etcher app (discovered via the EV3 site) and copy the image onto the micro-SD card.
Try the new SD card, plug the pi into the mains – solid red and a bit of flickering of a second LED. Plug in the ethernet cable connected to my Mac – lights there… success…
But what’s the IP address?
Apparently, a discovery service is running on the Pi, so from the Mac command line: ping raspberrypi.local
PING raspberrypi.local (192.168.2.2): 56 data bytes
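The same lookup can be done from Python, which might be handy if the address needs to be fed into a script. This is just a sketch: resolve is my own helper name, and it leans on the operating system's resolver knowing how to handle .local names (which the ping above suggests the Mac does).

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if it can't be resolved."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


# Example use, with the Pi plugged in and advertising itself:
# print(resolve("raspberrypi.local"))
```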
Then ssh in as the default user (ssh pi@raspberrypi.local): yes to the host key prompt, then password: raspberry
Enable internet sharing on the Mac…
..so I can update the Pi…
Next up: see if I can get TM351 services running on it…
This looks like it could be a good place to start, though I’d need to add version numbers: https://github.com/kleinee/jns – a Jupyter server and a scientific stack, a recipe originally guided by this post, (though I’ll probably omit the TeX stuff…; it also includes some optimisations for improving performance, but doesn’t mention setting the gpu_mem? as per eg http://raspberrypi.stackexchange.com/a/23976 ? In correspondence, Eckhard has “GPU memory to 16MB (can conveniently done [sic] via raspi-config)”); or this: http://geoffboeing.com/2016/03/scientific-python-raspberry-pi/
Then it’s just(?) a case of adding the PostgreSQL and (32 bit) MongoDB (like this?) databases? And OpenRefine…. Hmmm… what else is in there too?! Maybe I can get away with using my original installation scripts… (wishing I’d pulled everything neatly into .sh files for https://github.com/psychemedia/ou-tm351 now…)
PS OpenRefine seemed to go in okay:
apt-get install -y openjdk-7-jre-headless
wget https://github.com/OpenRefine/OpenRefine/releases/download/v2.6-rc1/openrefine-linux-2.6-rc1.tar.gz
tar xzf openrefine-linux-2.6-rc1.tar.gz
(Should really specify the download path, and also delete the gz file after unpacking it.)
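For what it's worth, here's a sketch of what that tidier version might look like if done from Python rather than the shell: download to a named directory, unpack to a named directory, then delete the archive. The function names and the directory names in the usage comment are my own assumptions; only the download URL comes from the commands above.

```python
import os
import tarfile
import urllib.request


def fetch(url, download_dir):
    """Download url into download_dir and return the local file path."""
    os.makedirs(download_dir, exist_ok=True)
    local_path = os.path.join(download_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, local_path)
    return local_path


def unpack_and_remove(archive_path, install_dir):
    """Unpack a .tar.gz into install_dir, then delete the archive."""
    os.makedirs(install_dir, exist_ok=True)
    # Newer Python versions support extraction filters; use the safer
    # "data" filter where it exists, otherwise fall back to the default.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(install_dir, **kwargs)
    os.remove(archive_path)
    return sorted(os.listdir(install_dir))


# Example use (directory names are assumptions):
# url = ("https://github.com/OpenRefine/OpenRefine/releases/download/"
#        "v2.6-rc1/openrefine-linux-2.6-rc1.tar.gz")
# unpack_and_remove(fetch(url, "/tmp/downloads"), "/home/pi")
```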
To configure the Raspberry Pi to autorun OpenRefine on boot, edit the /etc/rc.local startup file using:
sudo nano /etc/rc.local
and add something like the following to autostart the application on port 3333:
/home/pi/openrefine-2.6-rc1/refine -i 0.0.0.0 -p 3333 -d /mnt/refine &
(Maybe should set the IP address to the actual IP address?)
PS I’m also starting to wonder whether a simple service like monit or supervisord might be handy for checking services are running and letting the user start/stop them via a browser UI. I think I also need a simple flask app on port 80 that can act as a homepage for all the browser accessible services running via the Pi?
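As a first stab at the homepage idea, here's a minimal sketch using only the standard library rather than flask: it checks whether anything is listening on each service's port and builds a list of links. The HTML it returns could just as easily be handed back from a flask view. The function names, the (name, port) list format and the raspberrypi.local link host are all assumptions of mine; port 3333 for OpenRefine is the one used above.

```python
import socket
from html import escape


def is_listening(host, port, timeout=0.5):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def render_homepage(services, check_host="127.0.0.1",
                    link_host="raspberrypi.local"):
    """Build an HTML list of links to services, flagged as up or down.

    services is a list of (name, port) pairs.
    """
    rows = []
    for name, port in services:
        state = "up" if is_listening(check_host, port) else "down"
        rows.append(
            f'<li><a href="http://{link_host}:{port}/">{escape(name)}</a>'
            f" ({state})</li>"
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


# Example use on the Pi:
# print(render_homepage([("OpenRefine", 3333)]))
```

This only tells you whether a port is open, not whether the service behind it is healthy, and it doesn't start or stop anything, so monit or supervisord would still have a job to do.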
One of the things that course teams work hard at at the OU is making materials accessible. This isn’t just because as an educational institution there is a legal obligation to do so: it’s built into the institutional DNA.
In the course of a module production meeting yesterday we had a short workshop on writing figure descriptions – long text descriptions that can provide a student with a screen reader with an equivalent experience of a figure included in the course text, often in the form of a narrated description of the salient points in the image. For readers with a sight impairment, the long description may be read out by a screen reader to provide an alternative to looking at the figure directly.
There is an art to writing text descriptions that I’m not sure I’ve ever mastered – I guess I should read the guidance produced by the UK Association for Accessible Formats (which I think draws on OU expertise).
There are some rules of thumb that I do try to bear in mind though (please feel free to correct me in the comments if you take issue with any of these): you don’t want to duplicate what’s in the text that refers to the figure, nor the figure caption. Where the sighted reader is expected to read something for themselves from the figure, you don’t want the figure description to describe the answer as well as the figure. Where the exercise is critiquing a figure, or learning how to read it or extract salient points from it in order to critique it (for example, in the case of an art history course), the long description shouldn’t give away the reading, highlight the salient point specifically, or turn into critique. Generally, the figure description shouldn’t add interpretation to the figure – that comes from the reading of the figure (or the figure description). You also need to take care about the extent to which the figure description describes the semantics of the figure; for example, identifying a decision symbol in a flow chart as such (a semantic description) compared to describing it as a diamond (which you might want to do when teaching someone how to read a flow chart for the first time).
Sometimes, a figure appears in a document that doesn’t appear to need much of a description at all; for example, an image that appears purely as an illustration, a portrait of a historical figure, for example, whose brief biographical details appear in the main text. In such a case, it could be argued that a figure description is not really required, or if it is, it should be limited to something along the lines of “A portrait of X”. (A quick way in to generating the description for such an image might also be to refer to any search terms used to discover the image by the original author if it was discovered using a search tool…)
But if the purpose of the image is to break up the flow of the text on the printed page, give the reader a visual break in the text and a brief respite from reading, or help set the atmosphere of the reading, then what should an equivalent experience be for the student accessing the materials via a screen reader? For example, in the workshop I wondered whether the figure description should provide a poetic description to evoke the same sentiment that the author who included the image intended to evoke with it? (A similar trick applied in text is to include a quotation at the start of a section, or as an aside, for example.) A claim could be made that this provides information over and above that contained in the image, but if the aim is to provide an equivalent experience then isn’t this legitimate?
Similarly, if an image is used to lighten the presentation of the text on the page by introducing a break in the text, essentially including an area of white space, how might a light break be introduced into the audio description of the text? By changing the text-to-speech voice, perhaps, or its intonation? On the other hand, an interlude might break a sense of flow if the student is engaged with the academic text and doesn’t want the interruption of an aside?
Another example, again taken from the workshop, concerns the use of photographic imagery that may be intended to evoke a memory of a particular news event, perhaps through the use of an iconic image. In this case, the purpose of the imagery may be emotionally evocative, as well as illustrative; rather than providing a very simple, literal, figure description, could we go further in trying to provide an equivalent experience? For example, could we use a sound effect, perhaps overlaid with a recording of a news headline either taken from a contemporary radio news source (perhaps headed with a leading audio ident likely to be familiar to the listener to bring to mind a news bulletin) or a written description then recorded by a voice actor especially to evoke a memory of the event?
In other words, to provide an equivalent experience, should we consider treating the figure description (which will be read by a screen reader) as a radio programme style fill where a sound effect, rather than just a text description, may be more appropriate? For a “poetic aside” intended to substitute for a visual break, should we use a prerecorded, human voice audio clip, rather than triggering the screen reader, even if with a different voice to break up the (audio) flow?
Just as an aside, I note that long descriptions are required for our electronic materials, but I’m not sure how they are handled when materials are produced for print? The OU used to record human readers reading the course texts, which were then delivered to students as audio versions, presumably with the human reader also inserting the figure descriptions at an appropriate point. I wonder, did the person recording the audio version of the text use a different tone of voice for the different sorts of figures to break up the rest of the recorded text? I also wonder if rather than human reader voiced recordings, the OU now delivers electronic copies of documents that must be converted to speech by students’ own text-to-speech applications? In which case, how do the audio versions compare to the human recorded versions in terms of student experience and understanding?
A couple of other things I wondered about related to descriptions of “annotated” diagrams on the one hand, and descriptions of figures for figures that could be “written” (with the figures generated from the written description) on the other.
In the first case, consider the example of an annotation of a piece of Python code, such as the following clumsy annotation of a Python function.
In this case, the figure is annotated (not very clearly!) in such a way to help a sighted reader parse the visual structure of a piece of code – there are semantics in the visual structure. So what’s the equivalent experience for an unsighted or visually impaired student using a screen reader? Such a student is likely to experience the code through a screen reader which will have its own idiosyncratic way of reading aloud the code statement. (There are also tools that can be used to annotate python functions to make them clearer, such as pindent.py.) For an unsighted reader using a screen reader, an equivalent experience is presumably an audio annotated version of the audio description of the code that the student might reasonably expect their screen reader to create from that piece of code?
When it comes to diagrams that can be generated from a formally written description of them (such as some of the examples I’ve previously described here), where the figure itself can be automatically generated from the formal text description, could we also generate a long text description automatically? A couple of issues arise here relating to our expectations of the sighted reader for whom the figure was originally created (assuming that the materials are originally created with a sighted reader in mind), such as whether we expect them to be able to extract some sort of meaning or insight from the figure, for example.
As an example, consider a figure that represents a statistical chart. The construction of such charts can be written using formulations such as Leland Wilkinson’s Grammar of Graphics, operationalised by Hadley Wickham in the ggplot2 R library (or the Yhat python clone, ggplot). I started exploring how we could generate a literal reading of a chart constructed using ggplot (or via a comment, in matplotlib) in First Thoughts on Automatically Generating Accessible Text Descriptions of ggplot Charts in R; a more semantic reading would come from generating text about the analysis of the chart, or describing “insight” generated from it, as things like Automated Insights’ Wordsmith try to do (eg as a Tableau plugin).
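To give a flavour of what a literal reading might look like, here's a toy sketch in Python. It is not the code from the post mentioned above, and it doesn't touch ggplot at all: the chart is represented by a plain dictionary whose keys (geom, title, x, y, colour) I've made up for the purposes of illustration. The point is simply that if the chart is written down formally, a description that says what is plotted, without saying what it means, can fall out of the same specification.

```python
def describe_chart(spec):
    """Build a literal, non-interpretive description from a chart spec dict.

    The spec keys used here are hypothetical, not any plotting library's API.
    """
    opening = f"A {spec['geom']} chart"
    if spec.get("title"):
        opening += f" titled '{spec['title']}'"
    parts = [
        opening,
        f"The x-axis shows {spec['x']}",
        f"The y-axis shows {spec['y']}",
    ]
    if spec.get("colour"):
        parts.append(f"Colour is used to distinguish {spec['colour']}")
    return ". ".join(parts) + "."


# Example use:
# describe_chart({"geom": "scatter", "x": "engine displacement",
#                 "y": "highway miles per gallon"})
```

Note that this deliberately stays on the literal side of the line: it reads out the construction of the chart and leaves any "insight" to the reader.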
Something else I picked up on in passing was that work is ongoing in making maths notation expressed in MathJax accessible via a browser using screen readers (this project maybe? MathJax a11y tool). By the by, it’s perhaps worth noting that MathJax is used to render LaTeX expressions from Jupyter markdown cells, as well as output cells of a Jupyter notebook. In addition, symbolic maths expressions described using sympy are rendered using MathJax. I haven’t tested maths expressions in the notebooks with the simple jupyter-a11y extension though (demo; I suspect it’s just the LaTeX that gets read aloud…). It would be interesting to hear how well maths expressions rendered in Jupyter notebooks are supported by screen reader tools.
Finally, I realise that I am writing from my own biased perspective and I don’t have a good model in my head for how our unsighted students access our materials – which is my fault. Apologies if any offence is caused – please feel free to correct any misunderstandings or bad assumptions on my part via the comments.
PS one thing I looked for last night but couldn’t find were any pages containing example HTML pages along with audio recordings of how a user using a screen reader might hear the page read out. I know I should really install some screen reader tools and try them out for myself, but it would take me time to learn them. Seeing examples of variously complex pages – including ones containing maths expressions, figure descriptions, and so on – and how they sound when rendered using a screen reader as used by an expert user, would be a useful resource I think?
PPS Of course, when it comes to figure captions for illustrative imagery, we could always give the bots a go; for example, I notice this just appeared on the Google Research blog: Show and Tell: image captioning open sourced in TensorFlow.
Via @Charlesarthur, a twitter thread from @nickbaum, one time product manager of Google Reader:
I realized this weekend that it’s my fault that @Google shut down Google Reader. /1
I was the PM from 06-07. We launched a major redesign that significantly changed our growth rate… but didn’t take us to “Google scale”. /2
I used to think it was unfair and short-sighted that Google didn’t give us enough resources to execute to our full potential. /3
… but as a founder, I know resources aren’t something you are owed or deserve. They’re something you earn. /4
I should have realized that not reaching ~100m actives was an existential threat, and worked to convince the team to focus 100% on that. /5
As a service, Google Reader allowed users to curate their own long form content stream by subscribing to web feeds (RSS, Atom). When it shut down, I moved my subscriptions over to feedly.com, where I still read them every day.
If, as the thread above suggests, Google isn’t interested in “free”, “public” services with less than 100m – 100 million – active users, it means that “useful for some”, even if that “some” counts in the tens of millions, just won’t cut it.
Such are the economics of scale, I guess…
100. million. active. users.
So…. the Guardian reports: Mark Zuckerberg accused of abusing power after Facebook deletes ‘napalm girl’ post.
“While we recognize that this photo is iconic, it’s difficult to create a distinction between allowing a photograph of a nude child in one instance and not others.
“We try to find the right balance between enabling people to express themselves while maintaining a safe and respectful experience for our global community. Our solutions won’t always be perfect, but we will continue to try to improve our policies and the ways in which we apply them.”
That’s what happens when you live by algorithms.
For more on this, see the new book: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Mathbabe, Cathy O’Neil.
PS Cf algorithmic false positives, false negatives, wtf – not our fault, it’s algorithmics and we aren’t accountable: Facebook loses legal bid to prevent girl suing over naked picture.