Via the Guardian Developer blog, a post — Leaving Scribe — describing how the Guardian is moving away from its Scribe in-browser text editor to a new one based on ProseMirror, an open-source toolkit “for building rich-text editors on the web” that is also used by the New York Times.
In-browser editors are not something I know much (i.e. anything) about, but the Leaving Scribe post provides a handy review of what’s good to know (like how markup is handled). Go and read it now…
It seems like the Guardian folk have many of the same issues as we do in the OU. For example:
Another area where HTML as a model falls down is editor-only annotations (markup that helps the writer but is detrimental to the reader). Take for example the need to highlight a word in the text that meets some criteria (a suggested tag, or some legal issue around using this word). You may want to show an inline annotation to ask the editor whether they want to add this as a tag.
The problem here is that now we have data that is not part of the document, and yet it is modelled as part of our document. This is technically solvable but again, the DOM API is not well suited for handling this sort of data modelling, especially when the usage of these features becomes more complex. As you start to force more complex features through an HTML data model you have to do more and more work to get around HTML’s limitations around modelling a rich text document and you hit more and more of the browser inconsistencies.
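To make the "data that is not part of the document" point concrete for myself, here is a minimal Python sketch of the alternative: keep the annotations as character ranges held alongside the text, and only merge them in when drawing the editor view. This is my own toy illustration, not how Scribe or ProseMirror actually do it; the `Annotation` class, the function names and the `[kind: ...]` display format are all made up, and it assumes annotations never overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """An editor-only note about a range of text (hypothetical structure)."""
    start: int  # offset of the first annotated character
    end: int    # offset one past the last annotated character
    kind: str   # e.g. "suggested-tag" or "legal"


def find_annotations(text, term, kind):
    """Return an Annotation for each occurrence of term in text."""
    found = []
    pos = text.find(term)
    while pos != -1:
        found.append(Annotation(pos, pos + len(term), kind))
        pos = text.find(term, pos + len(term))
    return found


def render_for_editor(text, annotations):
    """Wrap annotated ranges in [kind: ...] markers, for the editing view only."""
    out = []
    cursor = 0
    for ann in sorted(annotations, key=lambda a: a.start):
        out.append(text[cursor:ann.start])
        out.append(f"[{ann.kind}: {text[ann.start:ann.end]}]")
        cursor = ann.end
    out.append(text[cursor:])
    return "".join(out)


document = "The minister denied the allegation."
notes = find_annotations(document, "allegation", "legal")

editor_view = render_for_editor(document, notes)  # what the writer sees
reader_view = document                            # the stored text is untouched
```

The stored document never contains the annotation, so there is nothing to strip out again before publishing.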
Features of ProseMirror-based editors apparently include collaborative editing and an extensible schema. The extensible schema is interesting from an OU perspective, because we have a workflow in which content is published from an internal XML document feedstock.
The important difference between Scribe and ProseMirror is that ProseMirror implements its own model layer that has a one-to-one mapping from semantics to the model, and an API that is made with document transformation in mind – not least collaborative editing.
In ProseMirror, inline content is flat rather than a tree, which means operations like changing styles on text don’t require any tree manipulation. And while nodes (h1, p, blockquote etc.) are still modelled as a tree but again, this accurately models how users think about things like paragraphs and lists, and it’s almost always how they’re rendered when consuming an article.
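As I read it, "flat" here means a paragraph's inline content is a plain sequence of text runs, each carrying a set of style marks, rather than nested tags. A rough Python sketch of that idea follows; it is my own illustration under that assumption, not ProseMirror's actual API, and the `add_mark` helper and the `(text, marks)` tuple layout are invented for the purpose.

```python
def add_mark(spans, start, end, mark):
    """Apply mark to characters [start, end) of a flat list of (text, marks) spans.

    Spans that straddle the range boundary are split; nothing is nested.
    """
    result = []
    offset = 0
    for text, marks in spans:
        span_start, span_end = offset, offset + len(text)
        offset = span_end
        lo = max(start, span_start)
        hi = min(end, span_end)
        if lo >= hi:
            # This span lies entirely outside the range: keep it unchanged.
            result.append((text, marks))
            continue
        before = text[:lo - span_start]
        middle = text[lo - span_start:hi - span_start]
        after = text[hi - span_start:]
        if before:
            result.append((before, marks))
        result.append((middle, marks | {mark}))
        if after:
            result.append((after, marks))
    return result


paragraph = [("Hello brave new world", frozenset())]
paragraph = add_mark(paragraph, 6, 15, "strong")  # "brave new"
paragraph = add_mark(paragraph, 12, 21, "em")     # "new world"
```

The two styled ranges overlap on "new", which in an HTML tree would mean splitting and re-nesting elements; in the flat list it is just one more run whose mark set happens to contain both `strong` and `em`.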
I’m not sure if the halted OU Create project was using ProseMirror? (I never really found out any technical details and I was banned from posting screenshots or discussing [di(scu)ssing?!] it in public!;-)
We hope in time to be able to get our editor to a point that it is able to be open-sourced but we’ll only do this if we believe we have the documentation and resource in place for that to be useful to users outside the Guardian.
Ah ha… It’d be nice if an OU solution could work in an open-sourcey way, or perhaps join forces with others to get such code out there…
One of the things I’ve been pondering lately is how to generate OU XML from Jupyter notebooks, as well as how to demonstrate rich text authoring in notebooks using things like the jupyter-wysiwyg editor (I wonder how easy it is to modify that extension to work with other rich editors?)
So I wonder a couple of things:
- how easy would it be to extend ProseMirror to support the OU XML schema?
- could this customised editor then be used as a rich editor inside a Jupyter notebook markdown cell? (Would it need tweaks to the markdown2html renderer, or an OU-XML2HTML previewer?)
I’m also thinking that OU-XML has a lot of metadata elements which could be embedded as notebook metadata, with just a subset of the OU-XML being supported within the markdown cells. (Markdown cells could also have metadata associated with them.)
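By way of a very rough sketch of that split, the following Python takes a notebook (as the dict you get from loading the `.ipynb` JSON), turns a block of notebook-level metadata into XML elements, and maps markdown cells onto a tiny subset of body elements. Lots of assumptions here: the `ouxml` metadata key is something I have made up, the element names (`Item`, `Session`, `Title`, `Paragraph`) and the metadata fields are placeholders that have not been checked against the real OU-XML schema, and the markdown handling is deliberately naive (a `# ` line becomes a title, every other block becomes a paragraph, inline markdown is passed through untouched).

```python
import xml.etree.ElementTree as ET


def notebook_to_xml(notebook):
    """Build an XML string from a notebook dict (nbformat 4 style JSON)."""
    root = ET.Element("Item")  # placeholder root element name

    # Notebook-level metadata under a hypothetical "ouxml" key becomes elements.
    meta = notebook.get("metadata", {}).get("ouxml", {})
    for name, value in meta.items():
        ET.SubElement(root, name).text = str(value)

    session = ET.SubElement(root, "Session")  # placeholder container element
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # only markdown cells are considered in this sketch
        source = cell.get("source", "")
        if isinstance(source, list):
            # Notebook files may store cell source as a list of lines.
            source = "".join(source)
        for block in source.split("\n\n"):
            block = block.strip()
            if not block:
                continue
            if block.startswith("# "):
                ET.SubElement(session, "Title").text = block[2:]
            else:
                ET.SubElement(session, "Paragraph").text = block
    return ET.tostring(root, encoding="unicode")


example = {
    "metadata": {"ouxml": {"CourseCode": "TM000", "ItemTitle": "Demo unit"}},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Introduction\n", "\n", "Some *rich* text."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "source": "print('skipped')",
            "outputs": [],
            "execution_count": None,
        },
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
}

xml_text = notebook_to_xml(example)
```

For a real notebook the dict would come from `json.load()` on the `.ipynb` file. Per-cell metadata could be handled in the same way as the notebook-level block, for example by copying it onto attributes of the elements generated from that cell.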
I think we could probably get a clunky workflow going quite quickly for authoring OU-XML docs from within Jupyter notebooks if anyone else was interested in exploring it with me…