When I first started using MyBinder in 2015, it came with the option of autostarting selectable services — PostgreSQL and a Spark server — within the container along with the Jupyter notebook server (early review). I’m not sure when it disappeared (the Github repo commit history should show it, if anyone’s feeling forensic investigative and wants to let me know via a comment) but for some time I’ve been wondering how to start my own services, such as a database server, or OpenRefine server, in a Binderised container.
I guess I shoulda read the docs again…
…although the pace of change with: a) documented features; b) undocumented features, means it can be hard to keep up with what’s possible in the Jupyterverse (I’m nowhere close to cracking that with the TrackingJupyter in its current formulation…).
So via this issue, some handy leads…
Binderised containers are built using
repo2docker. A new-to-me config file for
repo2docker is the
start file, “a script that can contain simple commands to be run at runtime. If you want this to be a shell script, make sure the first line is #!/bin/bash. The last line must be
exec "$@" equivalent”.
So for example, if we want to autorun a headless OpenRefine instance in MyBinder that I can access via a notebook using an OpenRefine Python client, (see an example notebook here), we can just add the following
start file to the repo:
#!/bin/bash #Start OpenRefine OPENREFINE_DIR="$HOME/openrefine" mkdir -p $OPENREFINE_DIR nohup openrefine-2.8/refine -p 3333 -d OPENREFINE_DIR > /dev/null 2>&1 & #Do the normal Binder start thing here... exec "$@"
PS thanks as ever to the ever helpful Jupyter devs for putting up with my “issues”…
PPS I got stuck trying to generalise the running auto-starting, arbitrary services thing any further. You can see the state of my ignorance here. As is often the case, “permissions” makes me realise I don’t really understand how any of this stuff actually works. That’s why I tend to work: a) locally, b) as