A First Attempt at An Amazon Echo Alexa Skills App Using Python: Parlibot, A UK Parliament Agent

Over the last couple of years, I’ve been dabbling with producing simple textual reports from datasets that can be returned in response to simple natural language style queries using chat interfaces such as Slack (for example, Sketching a Slack Slash Parliamentary Auto-Responder Using AWS Lambda Functions). The Amazon Echo, which launches in the UK at the end of September, provides another context for publishing natural language style responses, in this case in the form of spoken responses to spoken requests.

In the same way that apps brought a large amount of feature level functionality to mobile phones, the Amazon Echo provides an opportunity for publishers to develop “skills” that can respond to particular voice commands issued within hearing of the Echo. Amazon is hopeful that one class of commands – Smart Home Skills – will be used to bootstrap a smart home ecosystem that allows you to interact with smart-home devices through voice commands, such as commands to turn your lights on and off, or questions about the status of your home (“is the garage door still open?”, for example). Another class of services relates to more general information based services, or even games, which can be developed using a second API environment, the Alexa Skills Kit. For a full range of available skills, see the Alexa Skills Store.

The Alexa Skills Kit has a similar sort of usability to other AWS services (i.e. it’s a bit rubbish…), but I thought I’d give it a go repurposing some old functions around the UK Parliament API, such as finding out which committees a particular MP sits on, or who are the members of a particular committee, as well as some new ones.

For example, I thought it might be amusing to try to implement a skill that could respond to questions like the following:

  • what written statements were published last week?
  • were there any written statements published last Tuesday?

using some of the “natural language” date-related Python functions I dabbled with yesterday.

One of the nice things about the Alexa Skills API is that it also supports conversational contexts. For example, an answer to one of the above questions (generated by my code) might take the form “There were 27 written statements published then”, but session state associated with that response can also be passed back as metadata to the Alexa service, and then returned from Alexa as session metadata attached to a follow-up question. The answer to the follow-up question can then draw on context generated earlier in the conversation. So, for example, exchanges such as the following now become possible:

  • Q: were there any written statements published last Tuesday?
  • A: There were 27 written statements published then. Do you want to know them all?
  • Q: No, just the ones from DCLG.
  • A: Okay, there were three written statements issued by the Department for Communities and Local Government last week. One on …. by….; etc etc 

So how can we build an Alexa Skill? I opted for implementing one using Python, with the answer engine running on my Reclaim Hosting webserver rather than as an AWS Lambda Function, which I think Amazon would prefer. (The AWS Lambda functions are essentially free, but it means you have to go through the pain of using another AWS service.) For an example of getting a Python application up and running on your own web host using cPanel, see here.

To make life simpler, I installed the Flask-ASK library (docs), which extends the Flask web application framework so that it plays nicely with the Alexa Skills API. (There’s a standalone tutorial that runs without the need for any web hosting described here: Flask-Ask: A New Python Framework for Rapid Alexa Skills Kit Development.)

The Flask-Ask library allows you to create two sorts of response types in your application that can respond to “intents” defined as part of the Alexa skill itself:

  • a statement, which is a response from Alexa that essentially closes a session;
  • and a question, which keeps the session open and allows you to pop session state into the response so you can get it back as part of the next intent command issued from Alexa in that conversation.

The following bit of code shows how to decorate a function that will handle a particular Alexa Skill intent. The session variable can be used to pass session state back to Alexa that can be returned as part of the next intent. The question() wrapper packages up the response (txt) appropriately and keeps the conversational session alive.

@ask.intent("WrittenStatementIntent")
def writtenStatement(period,myperiod):
    txt,tmp=statementGrabber(period=period,myperiod=myperiod)
    session.attributes['period'] = period
    session.attributes['myperiod'] = myperiod
    session.attributes['typ'] = 'WrittenStatementIntent'
    if tmp!='': txt='{} Do you want to hear them all?'.format(txt)
    else: txt='I don't know of any.'
    return question(txt)

We might then handle a response identified as being in the affirmative (“yes, tell me them all”) using something like the following, which picks up the session state from the response, generates a paragraph describing all the written statements and returns it, suitably packaged, as a session ending statement().

@ask.intent("AllOfThemIntent")
def sayThemAll():
    period= session.attributes['period']
    myperiod= session.attributes['myperiod']
    typ=session.attributes['typ']
    txt,tmp=statementGrabber(period=period,myperiod=myperiod)
    return statement(tmp)
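For testing purposes, a stand-in statementGrabber() only needs to return a couple of text strings. Something along these lines is enough to exercise the handlers above (a sketch only; the real function queries the UK Parliament API):

# Hypothetical stand-in for statementGrabber(), for testing the skill logic only.
# The real function queries the UK Parliament API; this one just fakes a response.
def statementGrabber(period=None, myperiod=None):
    # txt is the short summary read out first; tmp is the full listing
    if period or myperiod:
        txt = 'There were 3 written statements published then.'
        tmp = 'One from Department X on topic Y; two from Department Z.'
    else:
        txt, tmp = '', ''
    return txt, tmp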

So how do we define things on the Alexa side?  (An early draft of my config can be found here.) To start with, we need to create a new skill and give it a name. A unique ID is created for the application and passed in all service requests; we can use it as a key to decide whether or not to accept and respond to a request from the Alexa Skill server in our application logic. (For convenience, I defined an open service that can accept all requests. I’m not sure if Flask-Ask has a setting that allows the application to be tied to one or more Alexa Skill IDs?)

amazon_apps___services_developer_portal

The second thing we need to do is actually define the interactions that the skill will engage in. This is composed of three parts:

  • an Intent Schema, defined as a JSON object, that specifies a list of intents that the skill can handle. Each intent must be given a unique label (for example, “AllOfThemIntent”), and may be associated with one or more slots. Each slot has a name and a type: the name corresponds to the name of a variable that may be captured and passed (under that name) to the application handler; the type is either a predefined Amazon data type (for example, AMAZON.DATE, which captures date-like things, including some simple natural language date terms such as yesterday) or a custom data type (a minimal example schema is sketched after this list);
  • one or more user-defined custom data types, defined as a list of keywords that Alexa will try to match exactly (I think? I don’t think fuzzy match, partial match or regular expression matching is supported? If it is, please let me know how via the comments…)
  • some sample utterances, keyed by intent and giving an example of a phrase that the skill should be able to handle; slots may be included in the example utterances, using the appropriate name as provided in the corresponding intent definition.
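For illustration, a minimal intent schema and set of sample utterances along the lines used in this post might look something like the following. This is a sketch only, shown as a Python dict mirroring the JSON the developer console expects; the LIST_OF_PERIODS and LIST_OF_DEPARTMENTS custom types and the dept slot are hypothetical names, and the exact utterance syntax may differ from what the console requires:

intent_schema = {
    "intents": [
        {"intent": "WrittenStatementIntent",
         "slots": [{"name": "period", "type": "AMAZON.DATE"},
                   {"name": "myperiod", "type": "LIST_OF_PERIODS"}]},  # custom type
        {"intent": "AllOfThemIntent"},
        {"intent": "LimitByDeptIntent",
         "slots": [{"name": "dept", "type": "LIST_OF_DEPARTMENTS"}]}  # custom type
    ]
}

# Corresponding sample utterances (intent label, then an example phrase with slots):
# WrittenStatementIntent were there any written statements published {period}
# WrittenStatementIntent what written statements were published {myperiod}
# AllOfThemIntent yes tell me them all
# LimitByDeptIntent no just the ones from {dept}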

amazon_apps___services_developer_portal_2

In the above case, I start to define a conversation where a WrittenStatementIntent is intended to identify written statements published on a particular day or over a particular period, and then a follow up AllOfThemIntent can be used to list the details of all of them or a  LimitByDeptIntent can be used to limit the reporting to just statements from a specific department.

When you update the interaction model, it needs rebuilding, which may take some time (wait for the spinny thing over the Interaction Model menu item to stop before you try to test anything).

The next part of the definition is used to specify where the application logic can be found. As mentioned, this may be defined as an AWS Lambda function, or you can host it yourself on an https server. In the latter case, for a Flask app, you need to provide a URL where the root of the application is served from.

amazon_apps___services_developer_portal_3

If you are using your own host, you need to provide some information about the trust certificate. I published my application logic as an app on Reclaim Hosting, which appears to offer https out of the box (though I haven’t tried it for a live/published Alexa skill yet.)

amazon_apps___services_developer_portal_4

With the config stuff all in place, you now just need to make sure some application logic is in place to handle it.

For reference, along with the stub of application logic shown above (which just needs a dummy statementGrabber() function that optionally accepts a couple of arguments and that returns a couple of text strings for testing purposes) I also topped my application with the following set-up components (note that as part of the WSGI handling that cPanel uses to run the app, I am creating an application variable that points to it).

import logging
from random import randint
from flask import Flask, render_template
from flask_ask import Ask, statement, question, session

app = Flask(__name__)

# cPanel's Passenger/WSGI handling expects to find a variable called application
application=app

# Register the Alexa skill handler at the root path of the app
ask = Ask(app, "/")

At the end of the application code, we can fire it up…

if __name__ == '__main__':
    app.run(debug=True)

Get the app running on the server, and now we can test it from the Alexa Skills environment. Unlike deployed skills accessed via the Echo, we don’t need to “summon” the app for testing purposes – we can just enter the utterance directly. The JSON code passed to the server is displayed as the Service Request, and the Service Response from the application server is also displayed.

amazon_apps___services_developer_portal_6


The test panel can also handle conversations established by using Flask-Ask question() wrappers, as shown below:

amazon_apps___services_developer_portal_7

In this case, we filter down on the written statements for last Thursday to just report on the ones issued by the Department for Culture, Media and Sport.

It’s worth noting that Alexa seems to have a limit on the number of characters allowed when generating a voice output (8000 characters). This suggests that adding some sort of sensible paging handler to the application logic could make sense if you need to return a large response; for example, something that chunks up the response, tells it to you piece by piece, and prompts you between each chunk to check you want to hear the next part.
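As a rough illustration, a paging handler along those lines might stash the list of statement descriptions in the session and hand them back a chunk at a time. This is a sketch only – the NextChunkIntent intent and the items session attribute are hypothetical names, and the chunk size is an arbitrary choice:

CHUNK_SIZE = 5  # number of statements to read out per turn (arbitrary)

def sayNextChunk():
    # Assumes the full list of statement descriptions was stashed in the session
    items = session.attributes.get('items', [])
    chunk, rest = items[:CHUNK_SIZE], items[CHUNK_SIZE:]
    txt = ' '.join(chunk)
    if rest:
        session.attributes['items'] = rest
        return question('{} Do you want to hear the next part?'.format(txt))
    return statement(txt)

@ask.intent("NextChunkIntent")  # hypothetical follow-up intent ("yes, carry on")
def nextChunk():
    return sayNextChunk()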

With testing done, and a working app up and running, all that remains is to go through the legal fluff required to submit the app for publishing (which I haven’t done; a note says you can’t edit the app whilst it’s undergoing approval, but I’m not sure if you can then go back to editing it once it is published?)

A couple of things I learned along the way: firstly, when defining slots, it can be useful to have a controlled vocabulary to hand. For Parliament, things like the Members’ API Reference Data Service can be handy, eg for generating a list of MP names or committee names (in another post I’ll give some more examples of the queries I can run). Secondly, when thinking about conversation design, you need to think about the various bits of state that can be associated with a conversation. For example, when making a query about an MP, it makes sense to retain the name of (or an identifier for) the MP as part of the session state so that you can refer to it later. If a conversation went “who is the MP for the Isle of Wight?”, “what committees are they on?”, “who else is on those committees?”, it would make sense to capture the list of committees as state somehow when responding to the second question.

One approach I took to managing state within the application was to cache calls to URLs requested in forming the response to one question. If I preserved enough session state to allow me to pull that cached data, I could reanalyse it without having to re-request it from the original URL when putting together a response to a follow up question.
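A crude sketch of that sort of caching, assuming the requests library is available; the cache here is just an in-memory dict keyed by URL, which would need persisting somewhere (a file or database) between requests in a real deployment:

import requests

_url_cache = {}  # in-memory cache; a real app would persist this between requests

def cached_get_json(url):
    # Return a previously fetched response for a URL if we already have it
    if url not in _url_cache:
        _url_cache[url] = requests.get(url).json()
    return _url_cache[url]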

Something it would be nice to have is a list of synonyms for terms in the slots definition, and maybe even a crude lookup that could be used as part of an OpenRefine style reconciliation service to try to partially match slot terms. (I’m not sure how well the model building does this anyway, eg if you put near misses in the slot definitions; or whether it just does exact matching?)

Another takeaway is that it probably makes sense to try to design the code for generating text from data or APIs so that it can be used in a variety of contexts – Slack, Alexa/Echo, email, press release generation, etc – without much, if any, retooling. Ideally, it would make sense to define a set of text generation functions or API calls that could in turn be called via use-case application wrappers (eg one for Slack, one for Alexa, etc). Issues arise here when it comes to conversation management. Alexa manages conversations via session state, for example. But maybe api.ai can help here, by acting as application independent conversational middleware? That’ll be the next app I need to play with…

PS If you would like to see further posts here exploring Amazon Echo/Alexa skills, why not help me explore the context and gift me an Echo from my Patronage Wishlist?

“Natural Language” Time Periods in Python

Mulling over a search feed that includes date range limits, I had a quick look for a python library that includes “natural language” functions for describing different date ranges. Not finding anything offhand, I popped some quick starter-for-ten functions up at this gist, which should also be embedded below.

It includes things like today(), tomorrow(), last_week(), later_this_month() and so on.
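For a flavour of the sort of thing covered, functions along these lines can be built on top of the standard datetime module (a sketch only; the gist has more cases and may differ in the details, eg in how week boundaries are defined):

from datetime import date, timedelta

def today():
    return date.today()

def yesterday():
    return today() - timedelta(days=1)

def last_week():
    # Return the start and end dates of the previous Monday-to-Sunday week
    start_of_this_week = today() - timedelta(days=today().weekday())
    start = start_of_this_week - timedelta(days=7)
    return start, start + timedelta(days=6)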

If you know of a “proper” library that does this, please let me know via the comments…

Figure Descriptions: Accessibility and Equivalent Experience

One of the things that course teams at the OU work hard at is making materials accessible. This isn’t just because as an educational institution there is a legal obligation to do so: it’s built into the institutional DNA.

In the course of a module production meeting yesterday we had a short workshop on writing figure descriptions – long text descriptions that can provide a student using a screen reader with an equivalent experience of a figure included in the course text, often in the form of a narrated description of the salient points in the image. For readers with a sight impairment, the long description may be read out by a screen reader to provide an alternative to looking at the figure directly.

There is an art to writing text descriptions that I’m not sure I’ve ever mastered – I guess I should read the guidance produced by the UK Association for Accessible Formats (which I think draw on OU expertise).

There are some rules of thumb that I do try to bear in mind though (please feel free to correct me in the comments if you take issue with any of these). You don’t want to duplicate what’s in the text that refers to the figure, nor the figure caption. Where the sighted reader is expected to read something for themselves from the figure, you don’t want the figure description to describe the answer as well as the figure. Where the exercise is critiquing a figure, or learning how to read it or extract salient points from it in order to critique it (for example, in the case of an art history course), the long description shouldn’t give away the reading, highlight the salient point specifically, or turn into critique. Generally, the figure description shouldn’t add interpretation to the figure – that comes from the reading of the figure (or the figure description). You also need to take care about the extent to which the figure description describes the semantics of the figure; for example, identifying a decision symbol in a flow chart as such (a semantic description) compared to describing it as a diamond (which you might want to do when teaching someone how to read a flow chart for the first time).

Sometimes, a figure appears in a document that doesn’t appear to need much of a description at all; for example, an image that appears purely as an illustration, a portrait of a historical figure, for example, whose brief biographical details appear in the main text. In such a case, it could be argued that a figure description is not really required, or if it is, it should be limited to something along the lines of “A portrait of X”. (A quick way in to generating the description for such an image might also be to refer to any search terms used to discover the image by the original author if it was discovered using a search tool…)

But if the purpose of the image is to break up the flow of the text on the printed page, give the reader a visual break in the text and a brief respite from reading, or help set the atmosphere of the reading, then what should an equivalent experience be for the student accessing the materials via a screen reader? For example, in the workshop I wondered whether the figure description should provide a poetic description to evoke the same sentiment that the author who included the image intended to evoke with it? (A similar trick applied in text is to include a quotation at the start of a section, or as an aside, for example.) A claim could be made that this provides information over and above that contained in the image, but if the aim is to provide an equivalent experience then isn’t this legitimate?

Similarly, if an image is used to lighten the presentation of the text on the page by introducing a break in the text, essentially including an area of white space, how might a light break be introduced into the audio description of the text? By changing the text-to-speech voice, perhaps, or its intonation? On the other hand, an interlude might break a sense of flow if the student is engaged with the academic text and doesn’t want the interruption of an aside?

Another example, again taken from the workshop, concerns the use of photographic imagery that may be intended to evoke a memory of a particular news event, perhaps through the use of an iconic image. In this case, the purpose of the imagery may be emotionally evocative, as well as illustrative; rather than providing a very simple, literal, figure description, could we go further in trying to provide an equivalent experience? For example, could we use a sound effect, perhaps overlaid with a recording of a news headline, either taken from a contemporary radio news source (perhaps headed with a leading audio ident likely to be familiar to the listener, to bring to mind a news bulletin) or a written description then recorded by a voice actor especially to evoke a memory of the event?

In other words, to provide an equivalent experience, should we consider treating the figure description (which will be read by a screen reader) as a radio programme style fill where a sound effect, rather than just a text description, may be more appropriate? For a “poetic aside” intended to substitute for a visual break, should we use a prerecorded, human voice audio clip, rather than triggering the screen reader, even if with a different voice to break up the (audio) flow?

Just as an aside, I note that long descriptions are required for our electronic materials, but I’m not sure how they are handled when materials are produced for print? The OU used to record human readers reading the course texts, with the recordings delivered to students as audio versions of those texts, presumably with the human reader also inserting the figure descriptions at an appropriate point. I wonder, did the person recording the audio version of the text use a different tone of voice for the different sorts of figures to break up the rest of the recorded text? I also wonder if, rather than human reader voiced recordings, the OU now delivers electronic copies of documents that must be converted to speech by students’ own text-to-speech applications? In which case, how do the audio versions compare to the human recorded versions in terms of student experience and understanding?

A couple of other things I wondered about related to descriptions of “annotated” diagrams on the one hand, and descriptions of figures for figures that could be “written” (with the figures generated from the written description) on the other.

In the first case, consider the example of an annotation of a piece of Python code, such as the following clumsy annotation of a Python function.

function-annotation

In this case, the figure is annotated (not very clearly!) in such a way to help a sighted reader parse the visual structure of a piece of code – there are semantics in the visual structure. So what’s the equivalent experience for an unsighted or visually impaired student using a screen reader? Such a student is likely to experience the code through a screen reader which will have its own idiosyncratic way of reading aloud the code statement. (There are also tools that can be used to annotate python functions to make them clearer, such as pindent.py.) For an unsighted reader using a screen reader, an equivalent experience is presumably an audio annotated version of the audio description of the code that the student might reasonably expect their screen reader to create from that piece of code?

When it comes to diagrams that can be generated from a formally written description of them (such as some of the examples I’ve previously described here), where the figure itself can be automatically generated from the formal text description, could we also generate a long text description automatically? A couple of issues arise here relating to our expectations of the sighted reader for whom the figure was originally created (assuming that the materials are originally created with a sighted reader in mind), such as whether we expect them to be able to extract some sort of meaning or insight from the figure, for example.

As an example, consider a figure that represents a statistical chart. The construction of such charts can be written using formulations such as Leland Wilkinson’s Grammar of Graphics, operationalised by Hadley Wickham in the ggplot2 R library (or the Yhat Python clone, ggplot). I started exploring how we could generate a literal reading of a chart constructed using ggplot (or via a comment, in matplotlib) in First Thoughts on Automatically Generating Accessible Text Descriptions of ggplot Charts in R; a more semantic reading would come from generating text about the analysis of the chart, or describing “insight” generated from it, as things like Automated Insights’ Wordsmith try to do (eg as a Tableau plugin).
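As a crude illustration of the “literal reading” idea in Python, a description could be assembled from the components of a matplotlib Axes object. This is a sketch only; a real version would need to cover many more chart types and describe the plotted data much more carefully:

import matplotlib.pyplot as plt

def describe_axes(ax):
    # Build a literal, non-interpretive description from the chart's components
    parts = []
    if ax.get_title():
        parts.append('The chart is titled "{}".'.format(ax.get_title()))
    parts.append('The x axis shows {} and the y axis shows {}.'.format(
        ax.get_xlabel() or 'an unlabelled quantity',
        ax.get_ylabel() or 'an unlabelled quantity'))
    parts.append('There are {} line(s) and {} bar-like patch(es) plotted.'.format(
        len(ax.lines), len(ax.patches)))
    return ' '.join(parts)

# Example usage
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 4, 3])
ax.set(title='Example chart', xlabel='Day', ylabel='Count')
print(describe_axes(ax))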

Something else I picked up on in passing was that work is ongoing in making maths notation expressed in MathJax accessible via a browser using screen readers (this project maybe? MathJax a11y tool). By the by, it’s perhaps worth noting that MathJax is used to render LaTeX expressions from Jupyter markdown cells, as well as output cells of a Jupyter notebook. In addition, symbolic maths expressions described using sympy are rendered using MathJax. I haven’t tested maths expressions in the notebooks with the simple jupyter-a11y extension though (demo); I suspect it’s just the raw LaTeX that gets read aloud. It would be interesting to hear how well maths expressions rendered in Jupyter notebooks are supported by screen reader tools.

Finally, I realise that I am writing from my own biased perspective and I don’t have a good model in my head for how our unsighted students access our materials – which is my fault. Apologies if any offence is caused – please feel free to correct any misunderstandings or bad assumptions on my part via the comments.

PS one thing I looked for last night but couldn’t find were any pages containing example HTML pages along with audio recordings of how a user using a screen reader might hear the page read out. I know I should really install some screen reader tools and try them out for myself, but it would take me time to learn them. Seeing examples of variously complex pages – including ones containing maths expressions, figure descriptions, and so on – and hearing how they sound when rendered using a screen reader as used by an expert user, would be a useful resource I think?

PPS Of course, when it comes to figure captions for illustrative imagery, we could always give the bots a go; for example, I notice this just appeared on the Google Research blog: Show and Tell: image captioning open sourced in TensorFlow.

Creating a Simple Python Flask App via cPanel on Reclaim Hosting

I’ve had my Reclaim Hosting package for a bit over a year now, and not really done anything with it, so I had a quick dabble tonight looking for a way of installing and running a simple Python Flask app.

Searching around, it seems that cPanel offers a way in to creating a Python application:

cpanel_-_main

It seems I then get to choose a Python version that will be installed into a virtualenv for the application. I also need to specify the name of a folder in which the application code will live and select the domain and path I want the application to live at:

cpanel_-_setup_python_app

Setting up the app generates a folder into which to put the code, along with a public folder (into which resources should go) and a passenger_wsgi.py file that is used by a piece of installed sysadmin voodoo magic (Phusion Passenger) to actually handle the deployment of the app. (An empty folder is also created in the public_html folder corresponding to the app’s URL path.)

cpanel_file_manager_v3

Based on the Minimal Cyborg How to Deploy a Flask Python App for Cheap tutorial, passenger_wsgi.py needs to link to my app code.

Passenger is a web application server that provides a scriptable API for managing the running of web apps (Passenger/py documentation).

For running Python apps, the passenger_wsgi.py file is used to launch the application; if you change the wsgi file, I think you need to restart the application for the change to be picked up.

A Flask app is normally called by running a command of the form python app.py on the command line. In the case of a Python application, the Passenger web application manager uses a passenger_wsgi.py file associated with the application to manage it. In the case of our simple Flask application, this corresponds to creating an object called application that represents it. If we create an application in a file myapp.py, and create a variable application that refers to it, we can run it via the passenger_wsgi.py file by simply importing it: from myapp import application.

WSGI works by defining a callable object called application inside the WSGI file. This callable expects a request object, which the WSGI server provides; and returns a response object, which the WSGI server serializes and sends to the client.

Flask’s application object, created by a MyApp = Flask(__name__) call, is a valid WSGI callable object. So our WSGI file is as simple as importing the Flask application object (MyApp) from app.py, and calling it application.

But first we need to create the application – for our demo, we can do this using a single file in the app directory. First create the file:

cpanel_file_manager_v3_2

then open it in the online editor:

cpanel_file_manager_v3_3

Borrowing the Minimal Cyborg “Hello World” code:

from flask import Flask
app = Flask(__name__)
application = app # our hosting requires application in passenger_wsgi

@app.route("/")
def hello():
    return "This is Hello World!\n"

if __name__ == "__main__":
    app.run()

I popped it into the myapp.py file and saved it.

(Alternatively, I could have written the code in an editor on my desktop and uploaded the files.)

We now need to edit the passenger_wsgi.py file so that it loads in the app code and gets from it an object that the Passenger runner can work with. The simplest approach seemed to be to load in the file (from myapp) and get the variable pointing to the Flask application from it (import application). I think that Passenger requires the object be made available in a variable called application?

cpanel_x_-_file_manager

That is, comment out the original contents of the file (just in case we want to crib from them later!) and import the application from the app file: from myapp import application.
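In other words, once the Passenger-generated boilerplate is commented out, my whole passenger_wsgi.py is essentially a single import. The comment lines here just stand in for whatever stub cPanel generated, so treat this as a sketch of the file rather than its exact contents:

# passenger_wsgi.py
# (original Passenger-generated stub commented out, in case it's needed later)

# Import the Flask app object, exposed as `application` in myapp.py,
# so that Passenger can find the WSGI callable it expects.
from myapp import application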

So what happens if I now try to run the app?

web_application_could_not_be_started

Okay – it seemed to do something but threw an error – the flask package couldn’t be imported. Minimal Cyborg provides a hint again, specifically “make sure the packages you need are installed”. Back in the app config area, we can identify packages we want to add, and then update the virtualenv used for the app to install them.

cpanel_-_setup_python_app2

And if we now try to run the app again:

ouseful_org_testapp2_2

Yeah! :-)

So now it seems I have a place I can pop some simple Python apps – like some simple Slack/slash command handlers, perhaps…

PS if you want to restart the application, I’m guessing all you have to do is click the Restart button in the appropriate Python app control panel.

Simple Demo of Green Screen Principle in a Jupyter Notebook Using MyBinder

One of my favourite bits of edtech in the form of open educational technology infrastructure at the moment is mybinder (code), which allows you to fire up a semi-customised Docker container and run Jupyter notebooks based on the contents of a github repository. This makes it trivial to share interactive, Jupyter notebook demos, as long as you’re happy to make your notebooks public and pop them into github.

As an example, here’s a simple notebook I knocked up yesterday to demonstrate how we could create a composited image from a foreground image captured against a green screen, and a background image we wanted to place behind our foregrounded character.

The recipe was based on one I found in a Bryn Mawr College demo (Bryn Mawr is one of the places I look to for interesting ways of using Jupyter notebooks in an educational context.)

The demo works by looking at each pixel in turn in the foreground (greenscreened) image and checking its RGB colour value. If it looks to be green, use the corresponding pixel from the background image in the composited image; if it’s not green, use the colour values of the pixel in the foreground image.

The trick comes in setting appropriate threshold values to detect the green coloured background. Using Jupyter notebooks and ipywidgets, it’s easy enough to create a demo that lets you try out different “green detection” settings using sliders to select RGB colour ranges. And using mybinder, it’s trivial to share a copy of the working notebook – fire up a container and look for the Green screen.ipynb notebook: demo notebooks on mybinder.

green_screen_-_tm112

(You can find the actual notebook code on github here.)
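For reference, the core of the recipe, stripped of the widget machinery, amounts to something like the following. This is a sketch using numpy and PIL rather than the exact code in the notebook, and the threshold values are arbitrary examples of the sort of settings the sliders control:

import numpy as np
from PIL import Image

def composite(foreground_path, background_path, r_max=100, g_min=120, b_max=100):
    # Load both images as RGB arrays of the same size
    fg_img = Image.open(foreground_path).convert('RGB')
    bg_img = Image.open(background_path).convert('RGB').resize(fg_img.size)
    fg, bg = np.array(fg_img), np.array(bg_img)
    # A pixel counts as "green screen" if green is high and red/blue are low
    r, g, b = fg[..., 0], fg[..., 1], fg[..., 2]
    mask = (g > g_min) & (r < r_max) & (b < b_max)
    out = fg.copy()
    out[mask] = bg[mask]  # use the background wherever green was detected
    return Image.fromarray(out)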

I was going to say that one of the things I don’t think you can do at the moment is share a link to an actual notebook, but in that respect I’d be wrong… The reason I thought that was that to launch a mybinder instance, eg from the psychemedia/ou-tm11n github repo, you’d use a URL of the form http://mybinder.org/repo/psychemedia/ou-tm11n; this then launches a container instance at a dynamically created location – eg http://SOME_IP_ADDRESS/user/SOME_CONTAINER_ID – with a URL and container ID that you don’t know in advance.

The notebook contents of the repo are copied into a notebooks folder in the container when the container image is built from the repo, and accessed down that path on the container URL, such as http://SOME_IP_ADDRESS/user/SOME_CONTAINER_ID/notebooks/Green%20screen%20-%20tm112.ipynb.

However, on checking, it seems that any path added to the mybinder call is passed along and appended to the URL of the dynamically created container.

Which means you can add the path to a notebook in the repo to the notebooks/ path when you call mybinder – http://mybinder.org/repo/psychemedia/ou-tm11n/notebooks/Green%20screen%20-%20tm112.ipynb – and the path will be passed through to the launched container.

In other words, you can share a link to a live notebook running on dynamically created container – such as this one – by calling mybinder with the local path to the notebook.

You can also go back up to the Jupyter notebook homepage from a notebook page by going up a level in the URL to the notebooks folder, eg http://mybinder.org/repo/psychemedia/ou-tm11n/notebooks/ .

I like mybinder a bit more each day:-)

Making Music and Embedding Sounds in Jupyter Notebooks

It’s looking as if the new level 1 courses won’t be making use of Jupyter notebooks (unless I can find a way of sneaking them in via the single unit I’ve put together!;-) but I still think they’re worth spending time exploring for course material production as well as presentation.

So to this end, as I read through the materials being drafted by others for the course, I’ll be looking for opportunities to do the quickest of quick demos, whenever the opportunity arises, to flag things that might be worth exploring more in future.

So here’s a quick example. One of the nice design features of TM112, the second of the two new first level courses, is that it incorporates some mini-project activities for students to work on across the course. One of the project themes relates to music, so I wondered what doing something musical in a Jupyter notebook might look like.

The first thing I tried was taking the outline of one of the activities – generating an audio file using Python and MIDI – to see how the embedding might work in a notebook context, without the faff of having to generate an audio file from Python and then find a means of playing it:

midimusic

Yep – that seems to work… Poking around music related libraries, it seems we can also generate musical notation…

midimusic2

In fact, we can also generate musical notation from a MIDI file too…

midimusic3

(I assume the mappings are correct…)

So there may be opportunities there for creating simple audio files, along with the corresponding score, within the notebooks. Then any changes required to the audio file, as well as the score, can be effected in tandem.
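I haven’t included the notebook code inline here, but one library that supports this sort of workflow is music21; a minimal sketch (not necessarily the library or the code used in the notebooks above) might look like this:

# A sketch using the music21 library to build a short phrase, render its
# notation as text and write it out as a MIDI file.
from music21 import stream, note

phrase = stream.Stream()
for pitch in ['C4', 'E4', 'G4', 'C5']:
    phrase.append(note.Note(pitch, quarterLength=1))

phrase.show('text')                     # text rendering of the notation
phrase.write('midi', fp='phrase.mid')   # export as a MIDI file
# phrase.show()  # with MuseScore/Lilypond configured, renders the score itself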

I also had a quick go at generating audio files “from scratch” and then embedding the playable audio file:


audio

That seems to work too…

We can also plot the waveform:

audio2

This might be handy for a physics or electronics course?
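For reference, the “from scratch” approach amounts to something like the following (a sketch, assuming numpy, matplotlib and IPython.display are available in the notebook environment; it’s not the exact code from the notebook):

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import Audio

rate = 44100                        # samples per second
t = np.linspace(0, 2, 2 * rate)     # two seconds of sample times
wave = np.sin(2 * np.pi * 440 * t)  # a 440 Hz sine tone

# As the last expression in a notebook cell, this embeds a playable audio widget
Audio(wave, rate=rate)

# And the waveform plot (here, just the first thousand samples)
plt.plot(t[:1000], wave[:1000])
plt.xlabel('time (s)')
plt.ylabel('amplitude')
plt.show()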

As well as providing an environment for creating “media-ful” teaching resources, the code could also provide the basis of interactive student explorations. I don’t have a demo of any widget powered examples to hand in a musical context (maybe later!), but for now, if you do want to play with the notebooks that generated the above, you can do so on mybinder – http://mybinder.org/repo/psychemedia/ou-tm11n – in the midiMusic.ipynb and Audio.ipynb notebooks. The original notebooks are here: https://github.com/psychemedia/OU-TM11N

The Cost of Scaling…

Via @Charlesarthur, a twitter thread from @nickbaum, one time project manager of Google Reader:

I realized this weekend that it’s my fault that @Google shut down Google Reader. /1

I was the PM from 06-07. We launched a major redesign that significantly changed our growth rate… but didn’t take us to “Google scale”. /2

I used to think it was unfair and short-sighted that Google didn’t give us enough resources to execute to our full potential. /3

… but as a founder, I know resources aren’t something you are owed or deserve. They’re something you earn. /4

I should have realized that not reaching ~100m actives was an existential threat, and worked to convince the team to focus 100% on that. /5

As a service, Google Reader allowed users to curate their own long form content stream by subscribing to web feeds (RSS, Atom). When it shut down, I moved my subscriptions over to feedly.com, where I still read them every day.

If, as the thread above suggests, Google isn’t interested in “free”, “public” services with less than 100m – 100 million – active users, it means that “useful for some”, even if that “some” counts in the tens of millions, just won’t cut it.

Such are the economics of scale, I guess…

100. million. active. users.