Exposing Services Running in a Docker Container Running in Virtualbox to Other Computers on a Local Network

Most of my experiments with Docker on my desktop machine to date have been focused on reducing installation pain and side-effects by running applications and services that I can access from a browser on the same desktop.

The services are exposed against the IP address of the virtual machine running docker, rather than localhost of the host machine, which also means that the containerised services can’t be accessed by other machines connected to the same local network.

So how do we get the docker container ports exposed on the host’s localhost, and hence on the host’s local network IP address?

If docker is running the containers via Virtualbox in the virtual machine named default, it seems all we need to do is tweak a couple of port forwarding rules in Virtualbox. So if I’m trying to get port 32769 on the docker IP address relayed to the same port on the host localhost, I can issue the following terminal command if the Docker Virtualbox VM is currently running:

VBoxManage controlvm "default" natpf1 "tcp-port32769,tcp,,32769,,32769"

which has syntax:

natpf<1-N> [<rulename>],tcp|udp,[<hostip>],<hostport>,[<guestip>],<guestport>
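As a rough sketch of how that rule syntax fits together, the following Python helper assembles the rule string and the matching VBoxManage argument list. The function names are hypothetical; only the rule layout and the controlvm / modifyvm forms are taken from the commands shown in this post.

```python
def natpf_rule(name, host_port, guest_port, protocol="tcp", host_ip="", guest_ip=""):
    """Build a rule string: name,protocol,hostip,hostport,guestip,guestport."""
    if protocol not in ("tcp", "udp"):
        raise ValueError("protocol must be 'tcp' or 'udp'")
    # Empty host/guest IP fields are left blank, as in the example rule.
    return ",".join([name, protocol, host_ip, str(host_port), guest_ip, str(guest_port)])


def add_forwarding_command(vm, rule, running=True):
    """Return the VBoxManage arguments that add the rule to NAT adapter 1."""
    if running:
        return ["VBoxManage", "controlvm", vm, "natpf1", rule]
    return ["VBoxManage", "modifyvm", vm, "--natpf1", rule]


rule = natpf_rule("tcp-port32769", 32769, 32769)
print(rule)
print(" ".join(add_forwarding_command("default", rule)))
```

The argument list could be handed to subprocess.run on a machine with VirtualBox installed; the sketch only builds and prints the command.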

Alternatively, the rule can be created from the Network – Port Forwarding Virtualbox settings for the default box:

default_-_Network

To clear the rule, use:

VBoxManage controlvm "default" natpf1 delete "tcp-port32769"

or delete from the Virtualbox box settings Network – Port Forwarding rule dialogue.

If the box is not currently running, use:

VBoxManage modifyvm "default" --natpf1 "tcp-port32769,tcp,,32769,,32769"
VBoxManage modifyvm "default" --natpf1 delete "tcp-port32769"

The port should now be visible at localhost:32769 and, by extension, can be exposed to machines on the same network as the host machine by calling the IP address of the host machine with the forwarded port on the host.
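One quick way to check whether the forwarded port is actually reachable is to try opening a TCP connection to it. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and from another machine you would pass the host’s local IP address instead of localhost.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


print(port_is_open("localhost", 32769))
```

A True result only says that something is listening on the port; it does not confirm that it is the containerised service.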

On a Mac, you can find the local IP address of the machine from the Mac’s Network settings:

Network

Simples:-)

OpenRobertaLab – Simple Robot Programming Simulator and UI for Lego EV3 Bricks

Rather regretting not having done a deep dive into programming environments for the Lego EV3 somewhat earlier, I came across the block.ly inspired OpenRobertaLab (code, docs) only a couple of days ago.

Open_Roberta_Lab

(Way back when, in the first incarnation of the OU Robotics Outreach Group, we were part of the original Roberta project which was developing a European educational robotics pack, so it’s nice to see it’s continued.)

OpenRobertaLab is a browser accessible environment that allows users to use block.ly blocks to program a simulated robot.

Open_Roberta_Lab2

I’m not sure how easy it is to change the test track used in the simulator? That said, the default does have some nice features – a line to follow, colour bars to detect, a square to drive round.

The OU Robotlab simulator supported a pen down option that meant you could trace the path taken by the robot – I’m not sure if RobertaLab has a similar feature?

robotlab

It also looks as if user accounts are available, presumably so you can save your programmes and return to them at a later date:

Open_Roberta_Lab5

Account creation looks to be self-service:

Open_Roberta_Lab6

OpenRobertaLab also allows you to program a connected EV3 robot running leJOS, the community developed Java programming environment for the EV3s. It seems that it’s also possible to connect a brick running ev3dev to OpenRobertaLab using the robertalab-ev3dev connector. This package is preinstalled in ev3dev, although it needs enabling (and the brick rebooting) to run. ssh into the brick and then from the brick commandline, run:

sudo systemctl unmask openrobertalab.service
sudo systemctl start openrobertalab.service

Following a reboot, the Open Robertalab client should now automatically run and be available from the OpenRobertaLab menu on the brick. To stop the service / cancel it from running automatically, run:

sudo systemctl stop openrobertalab.service
sudo systemctl mask openrobertalab.service

If the brick has access to the internet, you should now be able to simply connect to the OpenRobertalab server (lab.open-roberta.org).

Requesting a connection from the brick gives you an access code you need to enter on the OpenRobertaLab server. From the robots menu, select connect...:

Open_Roberta_Lab3

and enter the provided connection code (use the connection code displayed on your EV3):

Open_Roberta_Lab4

On connecting, you should hear a celebratory beep!

Note that this was as far as I got – Open Robertalab told me a more recent version of the brick firmware was available and suggested I install it. Whilst claiming it may still be possible to run commands using old firmware, that didn’t seem to be the case?

As well as accessing the public Open Robertalab environment on the web, you can also run your own server. There are a few dependencies required for this, so I put together a Docker container psychemedia/robertalab (Dockerfile) containing the server, which means you should be able to run it using Kitematic:

kitematic_robertalab

(For persisting things like user accounts and saved programmes, there should probably be a shared data container to persist that info?)

A random port will be assigned, though you can change this to the original default (1999):

kitematic_robertalab

The simulator should run fine using the IP address assigned to the docker machine, but in order to connect a robot on the same local WiFi network to the Open RobertaLab server, or connect to the programming environment from another computer on the local network, you will need to set up port forwarding from the Docker VM:

virtualboxroboertacontainer

See Exposing Services Running in a Docker Container Running in Virtualbox to Other Computers on a Local Network for more information on exposing the containerised Open Robertalab server to a local network.

On the EV3, you will need to connect to a custom Open Robertalab server. The settings will be the IP address of the computer on which the server is running, which you can find on a Mac from the Mac Network settings, along with the port number the server is running on:

So for example, if Kitematic has assigned the port number 32567, and you didn’t otherwise change it, and your host computer IP address is 192.168.1.86, you should connect to: 192.168.1.86:32567 from the Open Robertalab connection settings on the brick. On connecting, you will be presented with a pass code as above, which you should enter on your local OpenRobertaLab webpage.

Note that when trying to run programmes on a connected brick, I suffered the firmware mismatch problem again.

From Linked Data to Linked Applications?

Pondering how to put together some Docker IPython magic for running arbitrary command line functions in arbitrary docker containers (this is as far as I’ve got so far), I think the commands must include a couple of things:

  1. the name of the container (perhaps rooted in a particular repository): psychemedia/contentmine or dockerhub::psychemedia/contentmine, for example;
  2. the actual command to be called: for example, one of the contentmine commands: getpapers -q {QUERY} -o {OUTPUTDIR} -x

We might also optionally specify mount directories with the calling and called containers, using a conventional default otherwise.
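To make those two ingredients concrete, here is a minimal sketch that splits a namespaced container name and fills in a command template before wrapping it in a disposable docker run call. The function names are hypothetical, the registry:: prefix is just the notation used above rather than anything docker itself understands, and the sketch assumes the image accepts the tool command directly as its arguments.

```python
import shlex


def parse_image_ref(ref, default_registry="dockerhub"):
    """Split 'registry::owner/name' into (registry, image); the prefix is optional."""
    registry, sep, image = ref.partition("::")
    if not sep:
        return default_registry, ref
    return registry, image


def docker_command(ref, template, **params):
    """Fill the {PLACEHOLDER} fields in the template and build a docker run call."""
    _, image = parse_image_ref(ref)
    # Quote each value so that multi-word queries survive as single arguments.
    quoted = {key: shlex.quote(str(value)) for key, value in params.items()}
    filled = template.format(**quoted)
    return ["docker", "run", "--rm", image] + shlex.split(filled)


cmd = docker_command(
    "dockerhub::psychemedia/contentmine",
    "getpapers -q {QUERY} -o {OUTPUTDIR} -x",
    QUERY="rhinocerous",
    OUTPUTDIR="/notebooks/rhinocerous",
)
print(cmd)
```

The output directory used here is only an example value; mounting a shared directory would need an extra -v or --volumes-from argument.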

This got me thinking that the called functions might be viewed as operating in a namespace (psychemedia/contentmine or dockerhub::psychemedia/contentmine, for example). And this in turn got me thinking about “big-L, big-A” Linked Applications.

According to Tim Berners-Lee’s four rules of Linked Data, the web of data should:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
  4. Include links to other URIs, so that they can discover more things.

So how about a web of containerised applications, that would:

  1. Use URIs as names for container images
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information (in the minimal case, this corresponds to a Dockerhub page for example; in a user-centric world, this could just return a help file identifying the commands available in the container, along with help for individual commands)
  4. Include a Dockerfile, so that they can discover what the application is built from (also may link to other Dockerfiles).

Compared with Linked Data, where the idea is about relating data items one to another, the identifying HTTP URI actually represents the ability to make a call into a functional, execution space. Linkage into the world of linked web resources might be provided through Linked Data relations that specify that a particular resource was generated from an instance of a Linked Application or that the resource can be manipulated by an instance of a particular application.

So for example, files linked to on the web might have a relation that identifies the filetype, and the filetype is linked by another relation that says it can be opened in a particular linked application. Another file might link to a description of the workflow that created it, and the individual steps in the workflow might link to function/command identifiers that are linked to linked applications through relations that associate particular functions with a particular linked application.

Workflows may be defined generically, and then instantiated within a particular experiment. So for example: load file with particular properties, run FFT on particular columns, save output file becomes instantiated within a particular run of an experiment as load file with this URI, run the FFT command from this linked application on particular columns, save output file with this URI.
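The generic-versus-instantiated distinction can be sketched as a template with named slots that get bound to URIs for a particular run. Everything in this example (the step names, field names and example.org URIs) is made up purely for illustration.

```python
# A generic workflow: each step names its slots rather than concrete resources.
GENERIC_WORKFLOW = [
    {"step": "load", "file": "{input_uri}"},
    {"step": "fft", "application": "{fft_app}", "columns": "{columns}"},
    {"step": "save", "file": "{output_uri}"},
]


def instantiate(workflow, bindings):
    """Return a copy of the workflow with every slot filled from the bindings."""
    return [
        {key: value.format(**bindings) for key, value in step.items()}
        for step in workflow
    ]


run = instantiate(
    GENERIC_WORKFLOW,
    {
        "input_uri": "http://example.org/data/run1.csv",
        "fft_app": "http://example.org/apps/signal-tools",
        "columns": "2,3",
        "output_uri": "http://example.org/data/run1-fft.csv",
    },
)
for step in run:
    print(step)
```

The generic definition is left untouched, so the same workflow can be instantiated again with different bindings for another experiment.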

Hmm… thinks.. there is a huge amount of work already done in the area of automated workflows and workflow execution frameworks/environments for scientific computing. So this is presumably already largely solved? For example, Integrating Containers into Workflows: A Case Study Using Makeflow, Work Queue, and Docker, C. Zheng & D. Thain, 2015 [PDF]?

A handful of other quick points:

  • the model I’m exploring in the Docker magic context is essentially a stateless/serverless computing approach, where a commandline container is created on demand and treated in a disposable way to just run a particular function before being destroyed; (see also the OpenAPI approach).
  • The Linked Application notion extends to other containerised applications, such as ones that expose an HTML user interface over http that can be accessed via a browser. In such cases, things like WSDL (or WADL; remember WADL?) provided a machine readable formalised way of describing functional resource availability.
  • In the sense that commandline containerised Linked Applications are actually services, we can also think about web services publishing an http API in a similar way?
  • services such as Sandstorm, which have the notion of self-running containerised documents, have the potential to actually bind a specific document within an interactive execution environment for that document.

Hmmm… so how much nonsense is all of the above, then?

Steps Towards Some Docker IPython Magic – Draft Magic to Call a Contentmine Container from a Jupyter Notebook Container

I haven’t written any magics for IPython before (and it probably shows!) but I started sketching out some magic for the Contentmine command-line container I described in Using Docker as a Personal Productivity Tool – Running Command Line Apps Bundled in Docker Containers.

What I’d like to explore is a more general way of calling command line functions accessed from arbitrary containers via a piece of generic magic, but I need to learn a few things along the way, such as handling arguments for a start!

The current approach provides crude magic for calling the contentmine functions included in a public contentmine container from a Jupyter notebook running inside a container. The commandline contentmine container is started from within the notebook container and uses --volumes-from the notebook container to pass files between the containers. The path to the directory mounted from the notebook is identified by a bit of jiggery pokery, as is the method for spotting what container the notebook is actually running in (I’m all ears if you know of a better way of doing either of these things?:-)

The magic has the form:

%getpapers /notebooks rhinocerous

to run the getpapers query (with fixed switch settings for now) and the search term rhinocerous; files are shared back from the contentmine container into the /notebooks folder of the Jupyter container.

Other functions include:

%norma /notebooks rhinocerous
%cmine /notebooks rhinocerous

These functions are applied to files in the same folder as was created by the search term (rhinocerous).
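For a flavour of what a magic like %getpapers has to do with its argument line, here is a minimal sketch (not the code from the gist) that turns the line into docker run arguments. The function name, the notebook container name and the choice of output folder are assumptions; the getpapers switches are the ones quoted elsewhere on this blog, and --volumes-from is the sharing mechanism described above.

```python
def getpapers_command(line, notebook_container, image="psychemedia/contentmine"):
    """Turn a magic line such as '/notebooks rhinocerous' into docker arguments."""
    mount_dir, query = line.split(maxsplit=1)
    # Assumption: results go into a folder named after the search term.
    out_dir = f"{mount_dir}/{query}"
    return [
        "docker", "run", "--rm",
        "--volumes-from", notebook_container,  # share the notebook's volumes
        image,
        "getpapers", "-q", query, "-o", out_dir, "-x",
    ]


print(getpapers_command("/notebooks rhinocerous", "my_notebook"))
```

Inside a notebook, a function along these lines could be wrapped with IPython’s register_line_magic decorator and the argument list passed to subprocess; working out the name of the notebook’s own container is the fiddly part that this sketch simply takes as a parameter.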

The magic needs updating so that it will also work in a Jupyter notebook that is not running within a container – this should simply be a case of switching in a different directory path. The magics also need tweaking so we can pass parameters in. I’m not sure if more flexibility should also be allowed on specifying the path (we need to make sure that the paths for the mounted directories are the correct ones!)

What I’d like to work towards is some sort of line magic along the lines of:

%docker psychemedia/contentmine -mountdir /CALLING_CONTAINER_PATH -v ${MOUNTDIR}:/PATH COMMAND -ARGS etc

or cell magic:

%%docker psychemedia/contentmine -mountdir /CALLING_CONTAINER_PATH -v ${MOUNTDIR}:/PATH
COMMAND -ARGS etc
...
COMMAND -ARGS etc

Note that these go against the docker command line syntax – should they be closer to it?
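As a thought experiment, the cell magic form could be parsed along the following lines. This is only one possible reading of the proposed syntax: the header carries the image name plus the -mountdir and -v options, ${MOUNTDIR} in the -v value is replaced by the -mountdir value, and each line of the cell body becomes one disposable docker run call. The function name and the /contentmine container path are hypothetical.

```python
import argparse
import shlex
from string import Template


def parse_docker_magic(header, body=""):
    """Return one docker run argument list per non-empty line of the cell body."""
    parser = argparse.ArgumentParser(prog="%%docker", add_help=False)
    parser.add_argument("image")
    parser.add_argument("-mountdir", default=None)
    parser.add_argument("-v", dest="volume", default=None)
    opts = parser.parse_args(shlex.split(header))

    base = ["docker", "run", "--rm"]
    if opts.volume:
        # Substitute ${MOUNTDIR} with the directory given via -mountdir.
        volume = Template(opts.volume).safe_substitute(MOUNTDIR=opts.mountdir or "")
        base += ["-v", volume]
    base.append(opts.image)
    return [base + shlex.split(line) for line in body.splitlines() if line.strip()]


commands = parse_docker_magic(
    "psychemedia/contentmine -mountdir /notebooks -v ${MOUNTDIR}:/contentmine",
    "getpapers -q rhinocerous -o rhinocerous -x\ngetpapers -q aardvark -o aardvark -x\n",
)
for command in commands:
    print(command)
```

Note that a path that is valid inside the calling container is not necessarily valid on the docker host, so a real implementation would still need to translate the mount directory before handing it to -v.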

The code, and a walked through demo, are included in the notebook available via this gist, which should also be embedded below.


More Docker Doodlings – Accessing GUI Apps Via a Browser from a Container Using Guacamole

In a PS to Using Docker as a Personal Productivity Tool – Running Command Line Apps Bundled in Docker Containers, I linked to a demonstration by Jessie Frazelle on how to connect to GUI based apps running in a container via X11. This is all very well if you have an X client, but it would be neater if we could find a way of treating the docker container as a virtual desktop container, and then accessing the app running inside it via the desktop presented through a browser.

Digging around, Guacamole looks like it provides a handy package for exposing a Linux desktop via a browser based user interface [video demo].

Very nice… Which got me wondering: can we run guacamole inside a container, alongside an X11 producing app, to expose that app?

Via the Dockerfile referenced in Digikam on Mac OS/X or how to use docker to run a graphical app on Mac OS/X I tracked down linuxserver/dockergui, a Docker image that “makes it possible to use any X application on a headless server through a modern web browser such as chrome”.

Exciting:-) [UPDATE: note that that image uses an old version of the guacamole packages; I tried updating to the latest versions of the packages but it doesn’t just work so I rolled back. Support for Docker was introduced after the version used in the linuxserver/dockergui, but I don’t fully understand what that support does! Ideally, it’d be nice to run a guacamole container and then use docker-compose to link in the applications you want to expose to it? Is that possible? Anyone got an example of how to do it?]

So I gave it a go with Audacity. The files I used are contained in this gist that should also be embedded at the bottom of this post.

(Because the original linuxserver/dockergui was quite old, I downloaded their source files and built a current one of my own to seed my Audacity container.)

Building the Audacity container with:

docker build -t psychemedia/audacitygui .

and then running it with:

docker run -d -p 8080:8080 -p 3389:3389 -e "TZ=Europe/London" --name AudacityGui psychemedia/audacitygui

this is what pops up in the browser:

Guacamole_0_9_6

If we click through on the app, we’re presented with a launcher:

Audacity

Select the app, and Hey, Presto!, Audacity appears…

Audacity1

It seems to work, too… Create a chirp, and then analyse it:

Audacity2

We seem to be able to load and save files in the nobody directory:

Audacity3

I tried exposing the /nobody folder by adding VOLUME /nobody to the Dockerfile and running a -v "${PWD}/files":/nobody switch, but it seemed to break things which is presumably a permissions thing? There are various user roles settings in the linuxserver/dockergui build files, so maybe poking around with those would fix things? Otherwise, we might have to seed the container directly with any files we want in it?:-(

UPDATE: adding RUN mkdir -p /audacityfiles && adduser nobody root to the Dockerfile along with VOLUME /audacityfiles and then adding -v "${PWD}/files":/audacityfiles when I create the container allows me to share files in to the container, but I can’t seem to save to the folder? Nor do variations on the theme work, such as creating a subfolder in the nobody folder and giving it the same ownership and permissions as the nobody folder. I can save into the nobody folder though. (Just not share it?)

WORKAROUND: in Kitematic, the Exec button on the container view toolbar takes you into the container shell. From there, you can copy files into the shared directory. For example: cp /nobody/test.aup /nobody/share/test.aup

Moving the share to a folder outside /nobody, eg to /audacityfiles, means we can simply copy everything from /nobody to /audacityfiles.

Another niggle is with the sound – one of the reasons I tried the Audacity app… (If we can have the visual of the desktop, we want to try to push for sound too, right?!)

Unfortunately, when I tried to play the audio file I’d created, it wasn’t having any of it:

Audacity4

Looking at the log file of the container launch in Kitematic, it seems that ALSA (the Advanced Linux Sound Architecture project) wasn’t happy?

alsa_no

I suspect trying to fix this is a bit beyond my ken, as too is sorting out the shared folder permissions, I suspect… (I don’t really do sysadmin – which is why I like the idea of ready-to-run application containers).

UPDATE 2: using a different build of the image – hurricane/dockergui:x11rdp1.3, from the linuxserver/dockergui x11rdp1.3 branch, audio does work, though at times it seemed to struggle a bit. I still can’t save files to shared folder though:-(

UPDATE 3: I pushed an image as psychemedia/audacity2. It works from the command line as:
docker run -d -p 8080:8080 -p 3389:3389 -e "TZ=Europe/London" --name AudacityGui -v "${PWD}":/nobody/share psychemedia/audacity2

Anyway – half-way there. If nothing else, we could create and analyse audio files visually in the browser using Audacity, even if we can’t get hold of those audio files or play them!

I’d hope there was a simple permissions fix to get the file sharing to work (anyone? anyone?! ;-) but I suspect the audio bit might be a little bit harder? But if you know of a fix, please let me know:-)

PS I just tried launching the public psychemedia/audacity2 Dockerhub image via Docker Cloud, and it seemed to work…


Docker as a Personal Application Runner

Consider, for a moment, the following scenarios associated with installing and running a desktop based application on your own computer:

  • a learner installing software for a distance education course: course materials are produced in advance of the course and may be written with a particular version of the software in mind, distributed as part of the course materials. Learners may have arbitrary O/S (various versions of Windows and OS/X), may be working on work computers with aggressive IT enforced security policies, or may be working on shared/public computers. Some courses may require links between different applications (for example, a data analysis package and a database system); in addition, some students may not be able to install any software on their own computer – how can we support them?
  • academic research environment: much academic software is difficult to install and may require an element of sysadmin skills, as well as a particular o/s and particular versions of supporting libraries. Why should a digital humanities researcher who wants to work with text analysis tools provided in a particular text analysis package also have to learn sys admin skills to install the software before they can use the functions that actually matter to them? Or consider a research group environment, where it’s important that research group members have access to the same software configuration but on their own machines.
  • data journalism environment: another twist on the research environment, data journalists may want to compartmentalise and preserve a particular analysis of a particular dataset, along with the tools associated with running those analyses, as “evidence”, in case the story they write on it is challenged in court. Or maybe they need to fire up a particular suite of interlinked tools for producing a particular story in quick time (from accessing the raw data for the first time to publishing the story within a few hours), making sure they work from a clean set up each time.

What we have here is a packaging problem. We also have a situation where the responsibility for installing a single copy of the application or linked applications falls to an individual user or small team working on an arbitrary platform with few, if any, sys admin skills.

So can Docker help?

A couple of recent posts on the Docker blog set out to explore what Docker is not.

The first – Containers are not VMs – argues that Docker “is not a virtualization technology, it’s an application delivery technology”. The post goes on:

In a VM-centered world, the unit of abstraction is a monolithic VM that stores not only application code, but often its stateful data. A VM takes everything that used to sit on a physical server and just packs it into a single binary so it can be moved around. But it is still the same thing. With containers the abstraction is the application; or more accurately a service that helps to make up the application.

With containers, typically many services (each represented as a single container) comprise an application. Applications are now able to be deconstructed into much smaller components which fundamentally changes the way they are managed in production.

So, how do you backup your container, you don’t. Your data doesn’t live in the container, it lives in a named volume that is shared between 1-N containers that you define. You backup the data volume, and forget about the container. Optimally your containers are completely stateless and immutable.

The key idea here is that with Docker we have a “something” (in the form of a self-contained container) that implements an application’s logic and publishes the application as a service, but isn’t really all that interested in preserving the state of, or any data associated with, the application. If you want to preserve data or state, you need to store it in a separate persistent data container, or alternative data storage service, that is linked to application containers that want to call on it.

The second post – There’s Application Virtualization and There’s Docker – suggests that “Docker is not application virtualization” in the sense of “put[ting] the application inside of a sandbox that includes the app and all its necessary DLLs. Or, … hosting the application on a server, and serving it up remotely…”, but I think I take issue with this in the way it can be misinterpreted as a generality.

The post explicitly considers such application virtualisation in the context of applications that are “monolithic in that they contain their own GUI (vs. a web app that is accessed via a browser)”, things like Microsoft Office or other “traditional” desktop based applications, for example.

But many of the applications I am interested in are ones that publish their user interface as a service, of sorts, over HTTP in the form of a browser based HTML API, or that are accessed via the commandline. For these sorts of applications, I believe that Docker represents a powerful environment for personal, disposable, application virtualisation. For example, dedicated readers of this blog may already be aware of my demonstrations of how to:

Via Paul Murrell, I also note this approach for defining a pipeline approach for running docker containers: An OpenAPI Pipeline for NZ Crime Data. Pipeline steps are defined in separate XML modules, and the whole pipeline defined in another XML file. For example, the module step OpenAPI/NZcrime/region-Glendowie.xml runs a specified R command in a Docker container fired up to execute just that command. The pipeline definition file identifies the component modules as nodes in some sort of execution graph, along with the edges connecting them as steps in the pipeline. The pipeline manager handles the execution of the steps in order and passes state between the steps in one of several ways (for example, via a shared file or a passed variable). (Further work on the OpenAPI pipeline approach is described in An Improved Pipeline for CPI Data.)
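The general idea of a pipeline manager walking an execution graph can be sketched in a few lines of Python. This is not the OpenAPI implementation (which uses XML definitions); the step names are invented, each step is a plain function standing in for "run a command in a container", and state is passed between steps as a dictionary. It relies on graphlib, so it assumes Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, edges, state=None):
    """Run each step after all of its upstream steps, threading state through.

    steps maps a step name to a function taking and returning a state dict;
    edges is a list of (upstream, downstream) name pairs.
    """
    graph = {name: set() for name in steps}
    for upstream, downstream in edges:
        graph[downstream].add(upstream)
    state = dict(state or {})
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        state = steps[name](state)
    return order, state


steps = {
    "load": lambda s: {**s, "rows": [3, 1, 2]},
    "sort": lambda s: {**s, "rows": sorted(s["rows"])},
    "report": lambda s: {**s, "report": f"{len(s['rows'])} rows"},
}
edges = [("load", "sort"), ("sort", "report")]

order, final_state = run_pipeline(steps, edges)
print(order)
print(final_state)
```

In a container based version, each function would instead launch a container for its command and the shared state would more likely be files in a mounted directory.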

What these examples show is that as well as providing devops provisioning support for scaleable applications on the one hand, and an environment for effective testing and rapid development of applications on the other, Docker containers may also have a role to play in “user-pulled” applications.

This is not so much thinking of Docker from an enterprise perspective in which it acts as an environment that supports development and auto-scaled deployment of containerised applications and services, nor is it a view that a web hosting service might take of Docker images providing an appropriate packaging format for the self-service deployment of long-lived services, such as blogs or wiki applications, (a Docker hub and deployment system to rival cPanel, for example).

Instead, it’s a user-centric, rather than devops-centric view, seeing containers from a single user, desktop perspective, seeing Docker and its ilk as providing an environment that can support off-the-shelf, ready to run tools and applications, that can be run locally or in the cloud, individually or in concert with each other.

Next up in this series: a reflection on the possibilities of a “Digital Library Application Shelf”.

Using Docker as a Personal Productivity Tool – Running Command Line Apps Bundled in Docker Containers

With its focus on enterprise use, it’s probably with good reason that the Docker folk aren’t that interested in exploring the role that Docker may have to play as a technology that supports the execution of desktop applications, or at least, applications for desktop users. (The lack of significant love for Kitematic seems to be representative of that.)

But I think that’s a shame; because for educational and scientific/research applications, docker can be quite handy as a way of packaging software that ultimately presents itself using a browser based user interface delivered over http, as I’ve demonstrated previously in the context of Jupyter notebooks, OpenRefine, RStudio, R Shiny apps, linked applications and so on.

I’ve also shown how we can use Docker containers to package applications that offer machine services via an http endpoint, such as Apache Tika.

I think this latter use case shows how we can start to imagine things like a “digital humanities application shelf” in a digital library (fragmentary thoughts on this), that allows users to take either an image of the application off the shelf (where an image is a thing that lets you fire up a pristine instance of the application), or a running instance of the application off the shelf. (Furthermore, the application can be run locally, on your own desktop computer, or in the cloud, for example, using something like a mybinder like service). The user can then use the application directly (if it has a browser based UI), or call on it from elsewhere (eg in the case of Apache Tika). Once they’re done, they can keep a copy of whatever files they were working with and destroy their running version of the application. If they need the application again, they can just pull a new copy (of the latest version of the app, or the version they used previously) and fire up a new instance of it.

Another way of using Docker came to mind over the weekend when I saw a video demonstrating the use of the contentmine scientific literature analysis toolset. The contentmine installation instructions are a bit of a fiddle for the uninitiated, so I thought I’d try to pop them into a container. That was easy enough (for a certain definition of easy – it was a faff getting node to work and npm to be found, the Java requirements took a couple of goes, and I’m sure the image is way bigger than it really needs to be…), as the Dockerfile below/in the gist shows.

But the question then was how to access the tools? The tools themselves are commandline apps, so the first thing we want to do is to be able to call into the container to run the command. A handy post by Mike English entitled Distributing Command Line Tools with Docker shows how to do this, so that’s all good then…

The next step is to consider how to retain copies of the files created by the command line apps, or pass files to the apps for processing. If we have a target host directory and mount it into the container as a shared volume, we can keep the files on our desktop or allow the container to create files into the host directory. Then they’ll be accessible to us all the time, even if we destroy the container.
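By way of illustration (and separate from the batch file in the gist), a small Python wrapper could build the docker run call that mounts a host directory into the container and runs one command there. The function name and the /workdir container path are assumptions, as is the idea that the image accepts the tool command directly as its arguments.

```python
from pathlib import Path


def run_in_container(image, command, host_dir, container_dir="/workdir"):
    """Build docker arguments that run a command with host_dir mounted in."""
    host_path = Path(host_dir).resolve()  # docker needs an absolute host path
    return [
        "docker", "run", "--rm",
        "-v", f"{host_path}:{container_dir}",  # share the host directory
        "-w", container_dir,                   # run the command inside it
        image,
        *command,
    ]


args = run_in_container(
    "psychemedia/contentmine",
    ["getpapers", "-q", "rhinocerous", "-o", "rhinocerous", "-x"],
    ".",
)
print(" ".join(args))
```

Passing the list to subprocess.run would execute it; because of --rm the container is thrown away afterwards, while anything written under the mounted directory stays on the host.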

The gist that should be embedded below shows the Dockerfile and a simple batch file that passes the Contentmine tool commands into the container, which then executes them. The batch file idea could be further extended to produce a set of command shortcuts that essentially alias the Contentmine commands (eg a ./getpapers command rather than a ./contentmine getpapers command), or that combine the various steps associated with a particular pipeline or workflow – getpapers/norma/cmine, for example – into a single command.

UPDATE: the CenturyLinkLabs DRAY docker pipeline looks interesting in this respect for sequencing a set of docker containers and passing the output of one as the input to the next.

If there are other folk out there looking at using Docker specifically for self-managed “pull your own container” individual desktop/user applications, rather than as a devops solution for deploying services at scale, I’d love to chat…:-)

PS for several other examples of using Docker for desktop apps, including accessing GUI based apps using X Windows / X11, see Jessie Frazelle’s post Docker Containers on the Desktop.

PPS See also More Docker Doodlings – Accessing GUI Apps Via a Browser from a Container Using Guacamole for an attempt at exposing a GUI based app, such as Audacity, running in a container via a browser. Note that I couldn’t get a shared folder or the audio to work, although the GUI bit did…

PPPS I wondered how easy it would be to run command-line containers from within a Jupyter notebook itself running inside a container, but got stuck. Related question on Stack Overflow here.

The rest of the way this post is published is something of an experiment – everything below the line is pulled in from a gist using the WordPress – embedding gists shortcode…