A Simple Pattern for Embedding Third Party Javascript Generated Graphics in Jupyter Notebooks

Part of the rationale for this blog is to capture and record my ignorance. That’s why there’s often a long rambling opening and the lede is buried way down the post: the rambling is contextualisation for what follows. It’s the reflective practice bit.

If you want to cut to the chase, scroll down for examples of how to embed mermaid.js, wavedrom.js, flowchart.js and wavesurfer.js diagrams in a notebook from a Python function call or by using IPython magic.

UPDATE: the magics are now packaged and available on GitHub and from PyPI: innovationOUtside/nb_js_diagrammers.

The Rambling Context Bit

Much teaching material, and many research papers, are revisionist. Researchers battle their way up after many false starts, misunderstandings, dead ends, lots of backtracking, and unwarranted assumptions that cause them to just try that one ridiculous thing that turns out to be the right thing. Then, from the summit of their successful result, they look down the mountain, see a route that looks like it would have been an easier way up, follow that back down the mountain, and write up that easier journey, the reverse ascent essentially, as the method.

Educators have it even worse. Not only are they writing from a position of knowledge, there is also the temptation to teach as they were taught. There is also a canon of “stuff that must be taught” (based largely on what they were taught) which further limits the teaching tour, and hence the learning journey.

The “expert’s dilemma” takes many forms…

So, in this blog, as well as trying to capture recipes that work (paths up the mountain), I also try to capture the original forward path, with all the false steps, ignorance, and gaps in my own understanding as I try the climb for the first time, or the second, in a spirit of exploration rather than knowledge.

That is to say, this blog is, as much as anything, a learning diary. And at times it’s knowledge-reflective too, as I try to identify what I didn’t know when I started that I did at the end, knowledge gaps or misapprehensions that perhaps made the journey harder than it needed to be.

I’m reminded of a site that a few of us, who’d put together the original, unofficial OU Facebook app (is there still an OU Facebook app?), mulled over building, and that we monikered “Kwink: Knowing What I Now Know”. The site would encourage folk to openly reflect on their own journey and to share the misunderstandings they had before a moment of revelation, then the thing they learned, the trick, that solved a particular problem or opened a particular pathway. But it never went past the whiteboard stage. Sites like Stack Overflow achieve a similar effect in another way: there is the naive question from the position of ignorance or confusion, then the expert answer related in a teaching style that starts from the point of the questioner’s ignorance and/or confusion and then tries to present a solution in a way the questioner will understand.

So the position of ignorance that this post describes relates to my complete lack of understanding about how to load and access arbitrary Javascript packages in a classic Jupyter notebook (let alone the more complex JupyterLab environment) in an attempt to identify some of the massive gaps in understanding a have-a-go tinkerer might have compared to someone happy to work as a “proper developer” in either of those environments.

The Problem – Rendering Javascript Created Assets in Jupyter Notebooks

The basic problem statement is a general one: given a third party Javascript package that generates a diagram or interactive application based on some provided data, typically provided as a chunk of JSON data, how do we write a simple Python package that will work in a Jupyter notebook context to render the Javascript rendered diagram from data that currently sits in a Python object?
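The first half of that problem, getting data out of a Python object and into a form a Javascript package can read, is usually just a matter of serialising it as JSON. Here is a minimal sketch using a made-up data structure (the nodes and edges keys are purely illustrative and not any particular package's format):

```python
import json

# Hypothetical diagram data held in an ordinary Python object
diagram_data = {
    "nodes": ["A", "B", "C"],
    "edges": [["A", "B"], ["A", "C"]],
}

# JSON text can be pasted into a page as a Javascript value
json_text = json.dumps(diagram_data)
print(json_text)
```

The harder half, getting that text in front of the Javascript package inside a notebook, is what the rest of this post is about.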

There are a few extensions that we might also add to the problem:

  • the ability to add multiple diagrams to a notebook at separate points, to only render each one once, and to have no interference between diagrams if we render more than one;
  • if multiple diagrams are loaded in the same notebook, ideally we only want to load the Javascript packages once;
  • if there are no diagrams generated in the notebook, we don’t want to load the packages at all;
  • once the image is created, how do we save it to disk as an image file we can reuse elsewhere.

One solution I have used before is to wrap the Javascript application using Aaron Watters’ jp_proxy_widget as an ipywidget. This provides convenient tools for:

  • loading in the required Javascript packages either from a remote URL or a file bundled into the package;
  • for passing state from Python to Javascript, which means you can pass the Javascript the data it needs to generate the diagram, for example; and
  • for passing state from Javascript to Python, which means you can pass the image data back from Javascript to Python and let the Python code save to disk, for example.

It may be that it is easy enough to create your own ipywidget around the Javascript package, but I found the jp_proxy_widget worked when I tried it, it had examples I could crib from, and I don’t recall getting much of a sense of knowing what I was doing or why when I’d tried looking at the ipywidgets docs (this was several years ago now: things may have changed…).

But the jp_proxy_widget has overheads in terms of loading things, you can only have one widget in the same notebook, and (but I need to check this) I don’t think the widgets rendered in a notebook will directly render in a Jupyter Book version of a notebook.

Another solution is to load the Javascript app into another HTML page and then embed it as an IFrame in the notebook. The folium (interactive maps) and nicolaskruchten/jupyter_pivottablejs packages both take this approach, I think. This has the advantage of being relatively easy to do, but it complicates generating an output image. One approach I have used to grab an image of an interactive created this way is to take the generated HTML page, render it in a headless browser using selenium, and then grab a screenshot. Another approach might be to render the page using selenium and then scrape a generated image from it.

Rendering Javascript Generated Assets in Jupyter Notebooks Using Embedded IFrames

So here’s the pattern; the code is essentially cribbed from jupyter_pivottablejs.

import uuid
from pathlib import Path
from IPython.display import IFrame

def js_ui(data, template, out_fn=None, out_path='.',
          width="100%", height="", **kwargs):
    """Generate an IFrame containing a templated javascript package."""
    # Note: **kwargs is accepted but not currently used
    if not out_fn:
        # No filename given, so generate a unique one
        out_fn = Path(f"{uuid.uuid4()}.html")

    # Generate the path to the output file
    out_path = Path(out_path)
    filepath = out_path / out_fn
    # Check the required directory path exists
    filepath.parent.mkdir(parents=True, exist_ok=True)

    # The open "wt" parameters are: write, text mode;
    with open(filepath, 'wt', encoding='utf8') as outfile:
        # The data is passed in as a dictionary so we can pass different
        # arguments to the template
        outfile.write(template.format(**data))

    # Pass the path as a plain string with forward slashes
    return IFrame(src=filepath.as_posix(), width=width, height=height)

One of the side effects of the above approach is that we generate an HTML file that is saved to disk and then loaded back in to the page. This may be seen as a handy side effect, or it may be regarded as generating clutter.

If we had access to a full HTML iframe API, we would be able to pass in the HTML data using the srcdoc attribute, rather than an external file reference, but the IPython IFrame() display function doesn’t support that.
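For what it's worth, here is a rough sketch of how the HTML iframe element's srcdoc attribute might be used to avoid the file on disk. It assumes the notebook front end will happily display a raw iframe element passed through IPython.display.HTML(), which may not hold everywhere; the iframe_srcdoc helper is hypothetical and not part of IPython:

```python
import html

def iframe_srcdoc(page_html, width="100%", height=300):
    """Return an <iframe> element that carries the page in its srcdoc attribute.

    Hypothetical helper, not part of IPython.
    """
    # Escape &, <, > and quotes so the page survives inside an attribute value
    escaped = html.escape(page_html, quote=True)
    return (f'<iframe srcdoc="{escaped}" width="{width}" '
            f'height="{height}" frameborder="0"></iframe>')

# Example page; in a notebook the result could be shown with:
#   from IPython.display import HTML
#   HTML(iframe_srcdoc(page))
page = '<html><body><p class="note">Hello</p></body></html>'
snippet = iframe_srcdoc(page, height=100)
print(snippet)
```

IPython may warn that IFrame should be used instead when HTML() is handed an iframe element, and large pages make for large notebook outputs, so this is a trade-off rather than a clear win.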

Whatever…

We can use that function to render objects from a wide variety of packages. For example, a flowchart.js flowchart:

TEMPLATE_FLOWCHARTJS = u"""
<!DOCTYPE html>
<html>
    <head>
        <meta charset="UTF-8">
        <title>Flowchart.js</title>
        <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/raphael/2.3.0/raphael.min.js"></script>
        <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/flowchart/1.14.1/flowchart.js"></script>
        </head>
        <body>
        
        <div id="diagram"></div>
<script>
  var diagram = flowchart.parse(`{src}`);
  diagram.drawSVG('diagram');
</script>

        </body>
</html>
"""

Note that the template is rather sensitive when it comes to braces ({}). A single brace is used for template substitution, so if the template code has a { } in it, you need to double them up as {{ }}. This is a real faff… There must be a better way?
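One possible way round the brace doubling is the standard library's string.Template, which uses $name placeholders and leaves braces alone. This is a sketch rather than something the js_ui() function above supports as written: the template text here is a cut-down illustration, and js_ui() would need to call template.substitute(**data) instead of template.format(**data).

```python
from string import Template

# $src is the only placeholder; Javascript braces need no doubling
TEMPLATE_DEMO = Template("""<html>
    <body>
        <div id="diagram"></div>
        <script>
            var opts = { startOnLoad: true };
            var source = `$src`;
        </script>
    </body>
</html>
""")

page = TEMPLATE_DEMO.substitute(src="st=>start: Start")
print(page)
```

The faff doesn't disappear entirely: a literal dollar sign in the template has to be written as $$, which matters if the Javascript uses ${...} inside its own template literals.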

Here’s an example:

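fcode='''
st=>start: Start
e=>end: End
op1=>operation: Generate
op2=>parallel: Evaluate
st(right)->op1(right)->op2
op2(path1, top)->op1
op2(path2, right)->e
'''

# Render it (the height is a guess; adjust to suit)
js_ui({"src": fcode}, TEMPLATE_FLOWCHARTJS, height=300)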

Or how about a wavedrom/wavedrom timing diagram:

TEMPLATE_WAVEDROM = """<!DOCTYPE html>
<html>
    <head>
        <meta charset="UTF-8">
        <title>wavedrom.js</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/wavedrom/2.6.8/skins/default.js" type="text/javascript"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/wavedrom/2.6.8/wavedrom.min.js" type="text/javascript"></script>
</head>
        <body onload="WaveDrom.ProcessAll()">
<script type="WaveDrom">
{src}
</script>
        </body>
</html>
"""

If you’re wondering where the template code comes from, it’s typically a copy of the simplest working example I can find on the original Javascript package documentation website. Note also that you can often find simple, minimal example code fragments that don’t appear in the docs in the README on the package’s GitHub repository homepage.

Here’s some example wavedrom source code…

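wcode="""{ signal : [
  { name: "clk",  wave: "p......" },
  { name: "bus",  wave: "x.34.5x",   data: "head body tail" },
  { name: "wire", wave: "0.1..0." },
]}
"""

# Render it (the height is a guess; adjust to suit)
js_ui({"src": wcode}, TEMPLATE_WAVEDROM, height=200)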

The mermaid-js package supports several diagram types including flowcharts, sequence diagrams, state diagrams and entity relationship diagrams:

TEMPLATE_MERMAIDJS="""<html>
    <body>
        <script src="https://cdn.jsdelivr.net/npm/mermaid/dist/mermaid.min.js"></script>
        <script>
            mermaid.initialize({{ startOnLoad: true }});
        </script>

        <div class="mermaid">
            {src}
        </div>

    </body>
</html>
"""

For example, a flow chart:

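mcode = """
graph TD;
    A-->B;
    A-->C;
    B-->D;
    C-->D;
"""

# Render it (the height is a guess; adjust to suit)
js_ui({"src": mcode}, TEMPLATE_MERMAIDJS, height=300)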

Or a sequence diagram:

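mcode="""
sequenceDiagram
    Alice->>John: Hello John, how are you?
    John-->>Alice: Great!
    Alice-)John: See you later!
"""

# Render it (the height is a guess; adjust to suit)
js_ui({"src": mcode}, TEMPLATE_MERMAIDJS, height=300)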

Note to self: create a Jupyter notebook server proxy package for mermaid.js server

https://blog.ouseful.info/2020/01/11/rapid-widget-prototyping-using-third-party-javascript-packages-in-jupyter-notebooks/ wavesurfer.js

The wavedrom and mermaid templates actually allow multiple charts to be rendered in the same page as long as they are in their own appropriately classed div element, so we could tweak the template pattern to support that if passed multiple chart source data objects…
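As a sketch of what that tweak might look like for mermaid, the helper below (mermaid_divs is a made-up name) wraps each diagram source in its own div with class "mermaid". It assumes a variant of the mermaid template in which the {src} slot sits directly in the body rather than inside an existing div, since nesting the generated divs inside the original one would probably confuse the renderer.

```python
def mermaid_divs(sources):
    """Wrap each diagram source in its own div with class "mermaid"."""
    blocks = [f'<div class="mermaid">\n{src.strip()}\n</div>' for src in sources]
    return "\n".join(blocks)

flow = """
graph TD;
    A-->B;
"""
seq = """
sequenceDiagram
    Alice->>John: Hello John, how are you?
"""

# One chunk of HTML containing both diagrams
combined = mermaid_divs([flow, seq])
print(combined)
```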

Here’s another example: the wavesurfer-js package, which provides a whole range of audio player tools, including spectrograms:

TEMPLATE_WAVESURFERJS="""<html>
    <body>
        <script src="https://unpkg.com/wavesurfer.js/dist/wavesurfer.js"></script>
        <div id="wavesurfer">
            <div id="waveform"></div>
            <div class="controls">
                <button class="btn btn-primary" data-action="play">
                    <i class="glyphicon glyphicon-play"></i>
                    Play
                    /
                    <i class="glyphicon glyphicon-pause"></i>
                    Pause
                </button>
            </div>
        </div>
        <script>
            var GLOBAL_ACTIONS = {{ // eslint-disable-line
                play: function() {{
                    window.wavesurfer.playPause();
                }},

                back: function() {{
                    window.wavesurfer.skipBackward();
                }},

                forth: function() {{
                    window.wavesurfer.skipForward();
                }},

                'toggle-mute': function() {{
                    window.wavesurfer.toggleMute();
                }}
            }};

            // Bind actions to buttons and keypresses
            document.addEventListener('DOMContentLoaded', function() {{
                document.addEventListener('keydown', function(e) {{
                    let map = {{
                        32: 'play', // space
                        37: 'back', // left
                        39: 'forth' // right
                    }};
                    let action = map[e.keyCode];
                    if (action in GLOBAL_ACTIONS) {{
                        if (document == e.target || document.body == e.target || e.target.attributes["data-action"]) {{
                            e.preventDefault();
                        }}
                        GLOBAL_ACTIONS[action](e);
                    }}
                }});

                [].forEach.call(document.querySelectorAll('[data-action]'), function(el) {{
                    el.addEventListener('click', function(e) {{
                        let action = e.currentTarget.dataset.action;
                        if (action in GLOBAL_ACTIONS) {{
                            e.preventDefault();
                            GLOBAL_ACTIONS[action](e);
                        }}
                    }});
                }});
            }});
        </script>

        <script>
            var wavesurfer = WaveSurfer.create({{
                container: '#waveform',
                waveColor: 'violet',
                backend: 'MediaElement',
                progressColor: 'purple'
            }});
        </script>
        <script>
            wavesurfer.load("{src}");
        </script>
    </body>
</html>
"""

We can pass a local or remote (URL) path to an audio file into the player:

wscode = "https://ia902606.us.archive.org/35/items/shortpoetry_047_librivox/song_cjrg_teasdale_64kb.mp3"

# Render the player
js_ui({"src": wscode}, TEMPLATE_WAVESURFERJS, height=200)

The wavesurfer.js template would probably benefit from some elaboration to allow configuration of the player from passed-in parameters.
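One way that elaboration might go, sketched under assumptions: build the options object in Python and serialise it with json.dumps, then have the template call WaveSurfer.create() on an {options} placeholder instead of the hard-coded object. The wavesurfer_options helper and the {options} placeholder are both hypothetical, and the height option used in the example should be checked against the wavesurfer.js docs for the version being loaded.

```python
import json

def wavesurfer_options(**overrides):
    """Return a Javascript object literal for WaveSurfer.create()."""
    # Defaults copied from the hard-coded values in the template
    options = {
        "container": "#waveform",
        "waveColor": "violet",
        "backend": "MediaElement",
        "progressColor": "purple",
    }
    options.update(overrides)
    # JSON text is also valid Javascript object literal syntax
    return json.dumps(options)

opts = wavesurfer_options(waveColor="steelblue", height=96)
print(opts)
```

Because the braces arrive in the substituted value rather than in the template itself, they don't need doubling up.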

Do It By Magic

It’s easy enough to create some magic to allow diagramming from block magicked code cells:

from IPython import get_ipython
from IPython.core.magic import Magics, magics_class, cell_magic, line_cell_magic
from IPython.core import magic_arguments

@magics_class
class JSdiagrammerMagics(Magics):
    """Magics for Javascript diagramming."""
    def __init__(self, shell):
        super().__init__(shell)

    @line_cell_magic
    @magic_arguments.magic_arguments()
    @magic_arguments.argument(
        "--file", "-f", help="Source for audio file."
    )
    def wavesurfer_magic(self, line, cell=None):
        "Send code to wavesurfer.js."
        args = magic_arguments.parse_argstring(self.wavesurfer_magic, line)
        # Nothing to play if no audio file was given
        if not args.file:
            return
        return js_ui({"src":args.file}, TEMPLATE_WAVESURFERJS, height=200)

    @cell_magic
    @magic_arguments.magic_arguments()
    @magic_arguments.argument(
        "--height", "-h", default="300", help="IFrame height."
    )
    def mermaid_magic(self, line, cell):
        "Send code to mermaid.js."
        args = magic_arguments.parse_argstring(self.mermaid_magic, line)
        return js_ui({"src":cell}, TEMPLATE_MERMAIDJS, height=args.height)

def load_ipython_extension(ip):
    """Load the extension in IPython."""
    ip.register_magics(JSdiagrammerMagics)

# When run directly in a notebook cell, register the magics straight away
ip = get_ipython()
ip.register_magics(JSdiagrammerMagics)

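Here’s how it works: in a code cell, put %%mermaid_magic on the first line (optionally with -h to set the IFrame height) and the diagram source on the lines below it; for audio, use %wavesurfer_magic -f followed by the path or URL of the audio file.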

General Ignorance

So, the pattern is simple, but there are a couple of things that would make it a lot more useful. At the moment, it requires loading the Javascript in from a remote URL. It would be much more useful if we could use the package offline and bundle the Javascript so it could be accessed offline, but I don’t know how to do that. (The files can be bundled in a Python package easily enough, but what URL would they be loaded in from in the IFrame and how would I generate such a URL?) I guess one way is to create an extension that would load the Javascript files in when the notebook loads, and then embed code into the notebook using an IPython.display.HTML() wrapper rather than using an IPython.display.IFrame()?
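One way to sidestep the URL question altogether, at least for the IFrame pattern, might be to read the bundled Javascript file and inline its text into the generated HTML page, so the page carries its own copy of the library. The sketch below is untested against any of the packages above; inline_script is a made-up helper, and the template would need a placeholder (say {lib}) in place of the script tag that points at the CDN.

```python
from pathlib import Path
import tempfile

def inline_script(js_path):
    """Return a <script> element holding the text of a local Javascript file."""
    js_text = Path(js_path).read_text(encoding="utf8")
    # A literal "</script>" inside the code would end the element early
    js_text = js_text.replace("</script>", "<\\/script>")
    return f'<script type="text/javascript">\n{js_text}\n</script>'

# Demo with a throwaway file standing in for a bundled library
with tempfile.TemporaryDirectory() as tmpdir:
    demo_js = Path(tmpdir) / "demo.js"
    demo_js.write_text("var answer = 42;", encoding="utf8")
    tag = inline_script(demo_js)

print(tag)
```

The cost is that every generated HTML file gets its own full copy of the library, which rather works against the "only load the packages once" wish from earlier.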

UPDATE: here’s another way… read the script in from a bundled js file then add it to the notebook UI via an IPython.display.HTML() call.

# Via https://github.com/JupyterPhysSciLab/jupyter-datainputtable

import os
from IPython.display import display, HTML

# Locate input_table package directory
mydir = os.path.dirname(__file__)  # absolute path to directory containing this file.

# Load Javascript file
with open(os.path.join(mydir, 'javascript', 'input_table.js')) as tempJSfile:
    tmp = f'<script type="text/javascript">{tempJSfile.read()}</script>'
    display(HTML(tmp))

The same developer also has a crazy hack for one-time execution of notebook Javascript code...

As mentioned previously, there’s also no obvious way of accessing the created diagram so it can be saved to a file, unless we perhaps add some logic into the template to support downloading the created asset? Another route would be to load the HTML into a headless browser and then either screenshot it (as in the example here), or scrape the asset from it.
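The headless browser route might look something like the following sketch. It assumes selenium and a matching Chrome driver are installed, and that the saved HTML file can still reach any scripts it loads from a CDN; the function names are made up for illustration. It also takes the screenshot as soon as the page has loaded, so a diagram that renders slowly may need an explicit wait adding.

```python
from pathlib import Path

def file_url(html_path):
    """Convert a local path into a file:// URL a browser can open."""
    return Path(html_path).resolve().as_uri()

def screenshot_html(html_path, png_path, width=800, height=600):
    """Render a saved HTML page in headless Chrome and save a screenshot."""
    # Imported here so the rest of the module works without selenium
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument(f"--window-size={width},{height}")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(file_url(html_path))
        driver.save_screenshot(str(png_path))
    finally:
        # Always shut the browser down, even if the page failed to load
        driver.quit()

print(file_url("diagram.html"))
```

Passing an explicit out_fn to js_ui() gives a predictable filename to hand on to screenshot_html().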

In terms of ignorance lessons, the above recipe shows a workaround for not having any clue about how to properly load and access Javascript into a notebook (let alone JupyterLab). It doesn’t require a development environment (all the above was created in a single notebook), it doesn’t require knowledge of require or async or frameworks. It does require some simple knowledge of HTML and writing templates, and does require a bit of knowledge, or at least, cut-and-paste skills, in creating the magics.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
