Search results for: "writing diagrams"

Writing Diagrams (Incl. Mathematical Diagrams)

Continuing an occasional series of posts on approaches to “writing” diagrams in a textual form and then letting the machine render them, here are some recent examples that caught my eye…

Via this Jupyter notebook on inverse kinematics, I came across Asymptote, “a standard for typesetting mathematical figures, just as TeX/LaTeX is the de-facto standard for typesetting equations” (file suffix: .asy). The language uses LaTeX for labels, and is a hight level programming language in it’s own right – which means is can do calculations in it’s own right as part of the diagram creation process.

Asymptote is also available via IPython magic, as demonstrated in this Asymptote demo notebook:

The inverse kinematics notebook is also worth reviewing in a couple of other respects. Firstly, it demonstrates embedding of another “written diagram” approach, using Graphviz:

One of the easiest ways to use Graphviz scripting in Jupyter notebooks is via some IPython Graphviz magic.

It also demonstrates how to use Sympy to to “implement” equations relating to the diagram and then generating animations based on them. (I still think it would be nice if we could unify the various maths rendering and calculating scripts.)

Way back when I learned to programme, I remember being given “railroad diagrams” (though I’m not sure they were called that? Syntax diagrams, maybe?) that described the programming language grammar defined in BNF in a visual way. Here’s a tool for generating them:

It’s a bit shiny, and a bit of pain that it doesn’t just take BNF. On the other hand, this Railroad Diagram Generator looks far more powerful:

Unfortunately, it looks to be web only and there’s no source. However, if you’re okay running Java, here’s an alternative – Chrriis/RRDiagram:

I did find a Python railroad diagram generator, Syntrax [code], but it didn’t accept BNF.

Along similar lines to the blockdiag tools I’ve described before is the purely in browser mermaid.js.

Supported chart types include flowcharts, sequence diagrams and Gantt charts.

Finally, another Grammar of Graphics style charting language – Brunel – for generating d3 output (among other things?). It can be used in notebooks, but it does require Java to be installed (as you might expect of something with an IBM relationship…?!)

PS although I think of writing diagrams more in the sense of generating rendered diagrams from text (which makes generating the diagram reproducible and maintable), these are maybe also relevant as a capture step:

Writing Diagrams – Boxes and Arrows

If you’ve ever had to draw “blocks and arrows” diagrams, you’ll know how irritating it can be if you spend hours laying out the diagram using a presentation editor or drawing tool, only to find you need to edit the drawing, add another box, and lay the whole thing out again.

Surely there must be a better way?

Let’s just think about what a box and arrow diagram is intended to show: when describing a process, the connections typically represent a flow from one thing to another; furthermore, the layout is often rectilinear, laid out along straight lines, the boxes tidily spaced and their edges lined up with each other. In a diagram such as a mindmap, different ideas or concepts are related to each other by drawing a line between them and the layout may be more fluid, with like or related concepts grouped together in space, or by the additional use of colour themes, for example.

The primary information contained in the diagram are the text elements and the connections between them. The positioning on the page often reflects the structure of these connections. When we lay out a diagram, we unconsciously favour layouts that minimise the number of crossed lines (to keep the diagram “clean” looking), and group connected items close together (unless some other information requires us to separate them – for example, we might be using a timeline basis for a horizontal x-axis and placing boxes in areas of the canvas we are working on that are associated with a particular month).

The online Google Drawing document type is typical of drawing tools included in many office applications. As well as being able to draw boxes and connect registration points on each box by lines or arrows, a range of layout tools provides support for aligning and spacing boxes.

google draw

Tools such as popplet provide a friendlier environment for generating similar sorts of diagram:


Whilst drawing tools such as these allow you to craft your diagram by hand, building it up as you go along, actually putting previously collected information into blocks on the canvas, let alone connecting the blocks together and laying them out nicely, may be quite an involved and error prone affair.

In these circumstances, it may make more sense to take a raw representation of the block contents and a simple representation of connections between appropriate blocks and just write the relationships down, letting a drawing tool do the hard work of drawing the blocks, connecting them together and laying them out, at least in draft layout form. To provide for a final layer of customisation, it might also be useful to be able to take a vector/SVG representation of the automatically sketched layout into a drawing package where it can be tidied up by hand and the application of a human designer’s eye.

There are several online tools available that you can use to sketch box and arrow diagrams from simple text descriptions.


Text2Mindmap allows you to construct tree based mindmaps from a simple outline style description of the mindmap.


The layout has a radial basis. Designs can be saved and images downloaded as JPG or PDF files.


Diagrammr allows you to draw simple graph based network structures in which text labelled block elements can be connected to other blocks by labelled edges.


Designs are given a persistent URL, but anyone with access to the URL can edit the diagram.

JS Sequence Diagrams

JS Sequence Diagrams is a javascript library for generating sequence diagrams from a textual description of them.


Diagrams can be saved as SVG files. The source code is available and depends on Jison and Raphaël as the graphics library.


blockdiag (application) uses a language similar to the DOT language for describing a range of block diagram types (block diagrams, sequence diagrams, activity diagrams, logical network diagrams).


Diagrams can be saved as SVG diagrams and associated with a URL that contains all the information used to recreate the diagram. As such, large diagrams are not supported if they make the URL too long. Source code is available.

Several diagram types are available using blockdiag, including graphviz diagrams constructed using the DOT language.


GraphvizFiddle is a fiddler style application that lets you enter Graphviz DOT language descriptions and preview the result.


Files can be generated in SVG format, or a textual definitions (for example, in the dot layout language).


Generating boxes and arrows style diagrams can be a pain at times because the semantics of the diagram – how one item is related to another – is represented in a graphical rather than data based form. By writing down the relations and then automatically generating visual representations of them, we retain access to the data representation whilst letting the hard work of generating the initial draft, at least, of the layout to a machine.

Several tools are available to support the creation of such literally described box and arrow diagrams using a variety of description languages and generating a range of output image formats (SVG probably being the most useful if you need to edit the sketch diagram for yourself to tweak the layout for its final presentation). Code for some of the tools (JS Sequence diagrams, blockdiag) is available.

Arguably the most powerful tools allow you to “write” diagrams using the Graphviz DOT layout language. Whilst there is a certain overhead associated with learning this language, it does save time in the long run if you regularly need to create network style diagrams. Graphviz also supports a range of layout algorithms – see the Graphviz gallery for examples.

PS If you want to write your own diagramming application, the JointJS library looks like a handy library to have on hand… The Venn.js library also looks quite pretty – if you have to generate Venn diagrams, that is!

Writing Diagrams

One of the reasons I don’t tend to use many diagrams in the blog is that I’ve always been mindful that the diagrams I do draw rarely turn out how I wanted them to (the process of converting a mind’s eye vision to a well executed drawing always fails somewhere along the line, I imagine in part because I’ve never really put the time into practising drawing, even with image editors and drawing packages etc etc.)

Which is one reason why I’m always on the lookout for tools that let me write the diagram (e.g. Scripting Diagrams).

So for example, I’m very fond of Graphviz, which I can use to create network diagrams/graphs from a simple textual description of the graph (or a description of the graph that has been generated algorithmically…).

Out of preference, I tend to use Mac version of Graphviz, although the appearance of a canvas/browser version of graphviz is really appealing… (I did put in a soft request for a Drupal module that would generate a Graphviz plot from a URL that pointed to a dot file, but I’m not sure it went anywhere, and the canvas version looks far more interesting anyway…)

Hmmm – it seems there’s an iPhone/iPod touch Graphviz app too – Instaviz:

Another handy text2image service is the rather wonderful Web sequence diagrams, a service that lets you write out a UML sequence diagram:

There’s an API, too, that lets you write a sequence diagram within a <pre> tag in an HTML page, and a javascript routine will then progressively enhance it and provide you with the diagrammatic version, a bit like MathTran, or the Google Chart API etc etc (RESTful Image Generation – When Text Just Won’t Do).

If graphs or sequence diagrams aren’t your thing, here’s a handy hierarchical mindmap generator: Text2Mindmap:

And finally, if you do have to resort to actually drawing diagrams yourself, there are a few tools out there that look promising: for example, the LucidChart flow chart tool crossed my feedreader the other day. More personally, since Gliffy tried to start charging me at some point during last year, I’ve been using the Project Draw Autodesk online editor on quite a regular basis.

PS Online scripting tool for UML diagrams: YUML

More Tukey Gems

Via a half quote by Adam Cooper in his SoLAR flare talk today, elucidated in his blog post Exploratory Data Analysis, I am led to a talk by John Tukey – The Technical Tools of Statistics – read at the 125th Anniversary Meeting of the American Statistical Association, Boston, November 1964.

As ever (see, for example, Quoting Tukey on Visual Storytelling with Data), it contains some gems… The following is a spoiler of the joy of reading the paper itself. I suggest you do that instead – you’ll more than likely find your own gems in the text: The Technical Tools of Statistics.

If you’re too lazy to click away, here are some of the quotes and phrases I particularly enjoyed.

To start with, the quote referenced by Adam:

Some of my friends felt that I should be very explicit in warning you of how much time and money can be wasted on computing, how much clarity and insight can be lost in great stacks of computer output. In fact, I ask you to remember only two points:

  • The tool that is so dull that you cannot cut yourself on it is not likely to be sharp enough to be either useful or helpful.
  • Most uses of the classical tools of statistics have been, are, and will be, made by those who know not what they do.

And here’s one I’m going to use when talking about writing diagrams:

Hand-drawing of graphs, except perhaps for reproduction in books and in some journals, is now economically wasteful, slow, and on the way out.

(It strikes me that using a spreadsheet wizard to create charts in a research or production setting, where we are working in a reproducible, document generation context, is akin to the “hand-drwaing of graphs” of yesteryear?)

“I know of no person or group that is taking nearly adequate advantage of the graphical potentialities of the computer.”

Nothing’s changed?!

[W]e are going to reach a position we should have reached long ago. We are going, if I have to build it myself, to have a programming system — a “language” if you like — with all that that implies, suited to the needs of data analysis. This will be planned to handle numbers in organized patterns of very different shapes, to apply a wide variety of data-analytical operations to make new patterns from old, to carry out the oddest sequences of apparently unrelated operations, to provide a wide variety of outputs, to automatically store all time-expensive intermediate results “on disk” until the user decides whether or not he will want to do something else with them, and to do all this and much more easily.

Since I’ve started playing with pandas, my ability to have written conversations with data has improved. Returning to R after a few months away, I’m also finding that easier to write as well (the tabular data models, and elements of the syntax, are broadly similar across the two).

Most of the technical tools of the future statistician will bear the stamp of computer manufacture, and will be used in a computer. We will be remiss in our duty to our students if we do not see that they learn to use the computer more easily, flexibly, and thoroughly than we ever have; we will be remiss in our duties to ourselves if we do not try to improve and broaden our own uses.

This does not mean that we shall have to continue to teach our students the elements of computer programming; most of the class of ’70 is going to learn that as freshmen or sophomores. Nor does it mean that each student will write his own program for analysis of variance or for seasonal adjustment, this would be a waste. … It must mean learning to put together, effectively and easily — on a program-self-modifying computer and by means of the most helpful software then available — data analytical steps appropriate to the need, whether this is to uncover an anticipated specific appearance or to explore some broad area for unanticipated, illuminating appearances, or, as is more likely, to do both.

Interesting to note that in the UK, “text-based programming” has made it into the curriculum. (Related: Text Based Programming, One Line at a Time (short course pitch).)

Tukey also talks about how computing will offer flexibility and fluidity. Flexibility includes the “freedom to introduce new approaches; freedom, in a word, to be a journeyman carpenter of data-analytical tools”. Fluidity “means that we are prepared to use structures of analysis that can flow rather freely … to fit the apparent desires of the data”.

As the computer revolution finally penetrates into the technical tools of statistics, it will not change the essential characteristics of these tools, no matter how much it changes their appearance, scope, appositeness and economy. We can only look for:

  • more of the essential erector-set character of data analysis techniques, in which a kit of pieces are available for assembly into any of a multitude of analytical schemes,
  • an increasing swing toward a greater emphasis on graphicality and informality of inference,
  • a greater and greater role for, graphical techniques as aids to exploration and incisiveness,
  • steadily increasing emphasis on flexibility and on fluidity,
  • wider and deeper use of empirical inquiry, of actual trials on potentially interesting data, as a way to discover new analytic techniques,
  • greater emphasis on parsimony of representation and inquiry, on the focussing, in each individual analysis, of most of our attention on relatively specific questions, usually in combination with a broader spreading of the remainder of our attention to the exploration of more diverse possibilities.

In order that our tools, and their uses, develop effectively … we shall have to give still more attention to doing the approximately right, rather than the exactly wrong, …

All quotes from John Tukey, The Technical Tools of Statistics, 1964.


ILI2012 Workshop Prep – Appropriating IT: innovative uses of emerging technologies

Given that workshops at ILI2012 last a day (10 till 5), I thought I’d better start prepping the workshop I’m delivering with Martin Hawksey at this year’s Internat Librarian International early… W2 – Appropriating IT: innovative uses of emerging technologies:

Are you concerned that you are not maximising the potential of the many tools available to you? Do you know your mash-ups from your APIs? How are your data visualisation skills? Could you be using emerging technologies more imaginatively? What new technologies could you use to inspire, inform and educate your users? Learn about some of the most interesting emerging technologies and explore their potential for information professionals.

The workshop will combine a range of presentations and discussions about emerging information skills and techniques with some practical ‘makes’ to explore how a variety of free tools and applications can be appropriated and plugged together to create powerful information handling tools with few, if any, programming skills required.

Topics include:

– Visualisation tools
– Maps and timelines
– Data wrangling
– Social media hacks
– Screenscraping and data liberation
– Data visualisation

(If you would like to join in with the ‘makes’, please bring a laptop)

I have some ideas about how to fill the day – and I’m sure Martin does too – but I thought it might be worth asking what any readers of this blog might be interested in learning about in a little more detail and using slightly easier, starting from nowhere baby steps than I usually post.

My initial plan is to come up with five or six self contained elements that can also be loosely joined, structuring the day something like this:

  • opening, and an example of the sort of thing you’ll be able to do by the end of the day – no prior experience required, handheld walkthroughs all the way; intros from the floor along with what folk expect to get out of the day/want to be able to do at the day (h/t @briankelly in the comments; of course, if folks’ expectations differ from what we had planned….;-). As well as demo-ing how to use tools, we’ll also discuss why you might want to do these things and some of the strategies involved in trying to work out how to do them, knowing what you already know, or how to find out/work out how to do them if you don’t..
  • The philosophy of “appropriation”, “small pieces, lightly joined”, “minimum viability” and ‘why Twitter, blogs and Stack Overflow are Good Things”;
  • Visualising Data – because it’s fun to start playing straight away…
    • Google Fusion Tables – visualisations and queries
    • Google visualisation API/chart components

    Payoff: generate some charts and dashboards using pre-provided data (any ideas what data sets we might use…? At least one should have geo-data for a simple mapping demo…)

  • — Morning coffee break? —
  • Data scraping:
    • Google spreadsheets – import CSV, import HTML table;
    • Google Refine – import XLS, import JSON, import XML
    • (Briefly) – note the existence of other scraper tools, incl. Scraperwiki, and how they can be used

    Payoff: scrape some data and generate some charts/views… Any ideas what data to use? For the JSON, I thought about finishing with a grab of Twitter data, to set up after lunch…

  • — Lunch? —
  • (Social) Network Analysis with Gephi
    • Visually analyse Twitter data and/or Facebook data grabbed using Google Refine and/or TAGSExplorer
    • Wikipedia graphing using DBPedia
    • Other examples of how to think in graphs…
  • The scary session…
    • Working with large data files – examples of some simple text processing command line tools
    • Data cleansing and shaping – Google Refine, for the most part, including the use of reconciliation; additional examples based on regular expressions in a text editor, Google spreadsheets as a database, Stanford Data Wrangler, and R…
  • — Afternoon coffee break? —
  • Writing Diagrams – examples referring back to Gephi, mentioning Graphviz, then looking at R/ggplot2, finishing with R’s googleVis library as a way of generating Google Visualisation API Charts…
  • Wrap up – review of the philosophy, showing how it was applied throughout the exercises; maybe a multi-step mashup as a final demo?

Requirements: we’d need good wifi/network connections; also, it would help if participants pre-installed – and checked the set up of: a) a Google account; b) a modern browser (standardising on Google Chrome might be easiest?) c) Google Refine; d) Gephi (which may also require the installation of a Java runtime, eg on a new-ish Mac); e) R; f) RStudio and a raft of R libraries (ggplot2, plyr, reshape, RCurl, stringr, googleViz); g) a good text editor (?I use TextWrangler on a Mac); h) commandline tools (Windows machines);

Throughout each session, participants will be encouraged to identify datasets or IT workflow issues they encounter at work and discuss how the ideas presented in the workshop may be appropriated for use in those contexts…

Of course, this is all subject to change (I haven’t asked Martin how he sees the day panning out yet;-), but it gives a flavour of my current thinking… So: what sorts of things would you like to see? And would you like to book any of the sessions for a workshop at your place…?!;-)

Scripted Diagrams Getting Easier

A quick heads-up on an another tool (diagrammr) that makes it easy to create network/graph diagrams like this:

Just type in a description of the graph and the diagram will be generated at the same time [video]:

[Infoskills note to self: when making a screencast with Jing, after clicking in a text box area, remember to move the mouse cursor out of the way…]

(Regular readers will know I’ve been this sort of thing for some time; for example, see Scripting Charts WIth GraphViz – Hierarchies; and a Question of Attitude, Writing Diagrams, RESTful Image Generation – When Text Just Won’t Do or Visual Gadgets: Scripting Diagrams).)

As well as creating diagrams, Diagrammr allows you to embed them, providing an image/PNG URI for your diagram; you can also edit the image (that is, edit the script that generates the image) after the fact via a shareable URI.

The URI for the editor page can be generated from the image URI, though, so without the ability to set a password on the editor page when you first crate a new image, this means that any time you embed a Diagrammr image, someone else could go and edit the image?

In an educational context, tools like this make it much easier for students to create their own diagrams (typing in a graph description is far quicker than trying to lay it out by hand in a drawing package). As you script the diagram, your attention is focussed on the local structural components/relations that define the graph, whilst at the same time the automatically generated diagram visualises the overall structure and brings alive its complexity at the network level.

(I’m not sure how the graph layouts are generated – maybe using Graphviz on the server to generate the image and return it to the browser? If so, an improved version of diagrammr might be able to return the compiled xdot version of the graph back to an interactive canviz component running in the browser?)

If you’re working in an insitutional VLE context, where the powers that be are still trying to retain control of everything, the Canviz component might offer one solution – an HTML 5 canvas library for displaying ‘compiled’ Graphviz network descriptions.

Although I haven’t tried it out, there is apparently a recipe for integrating Graphviz with Drupal (Graphviz Filter) and a suggestion for including Canviz into the mix (GraphMapping Framework (graphviz_api + graphviz_fields + graphviz_views + graphviz_filter) – has this been implemented yet by anyone, I wonder?). I’ve no idea if anyone has tried to do something similar in a Moodle environment…

PS here’s another one – a UML editor: YUML

Scripting Charts WIth GraphViz – Hierarchies; and a Question of Attitude

A couple of weeks ago, my other was finishing off corrections to her PhD thesis. The layout of one of the diagrams – a simple hierarchy written originally using the Draw tools in an old versioof MS-Word – had gone wrong, so in the final hours before the final printing session, I offered to recreate it.

Not being a draughtsman, of course I decided to script the diagram, using GraphVIz:

The labels are added to the nodes using the GraphViz label command, such as:

n7[label="Trait SE"];

The edges are defined in the normal way:


But there was a problem – in the above figure, two nodes are placed by the GraphvViz layout in the wrong place – the requirement was that the high and low nodes were ordered according to their parents, and as, indeed, they had been ordered in the GraphViz dot file.

A bit of digging turned up a fix, though:

graph [ ordering="out" ];

is a switch that forces GraphViz to place the nodes in a left-to-right fashion in the order in which they are declared.

During the digging, I also found the following type of construct


which will force a set of nodes to be positioned along the same horizontal row. Whilst I didn’t need it for the simple graph I was plotting, I can see this being a useful thing to know.

There are a few more things, though, that i want to point out about this whole exercise.

Firstly, I now tend to assume that I probably should be able to script a diagram, rather than have to draw it. (See also, for example, Writing Diagrams, RESTful Image Generation – When Text Just Won’t Do and Progressive Enhancement – Some Examples.)

Secondly, when the layout “went wrong”, I assumed there’d be a fix – and set about searching for it – and indeed found it, (along with another possibly useful trick along the way).

This second point is an attitudinal thing; knowing an amount of programming, I know that most of the things I want to do most of the time are probably possible because they the exactly the sorts of problems are likely to crop up again and again, and as such solutions are likely to have been coded in, or workarounds found. I assume my problem is nothing special, and I look for the answer; and often find it.

This whole attitude thing is getting to be a big bugbear of mine. Take a lot of the mashups that I post here on They are generally intended not to be one off solutions. This blog is my notebook, so I use it to record “how to” stuff. And a lot of the posts are contrived to demonstrate minimally worked examples of how to do various things.

So for example, in a recent workshop I demonstrated the Last Week’s Football Reports from the Guardian Content Store API (with a little dash of SPARQL).

Now to me, this is a mashup that shows how to :

– construct a relative date limited query on the Guardian content API;
– create a media RSS feed from the result;
– identify a convention in the Guardian copy that essentially let me finesse metadata from a free text field;
– create a SPARQL query over dbpedia and use the result to annotate each result from the Guardian content API;
– create a geoRSS feed from the result that could be plotted directly on a map.

Now I appreciate that no-one in the (techie) workshop had brought a laptop, and so couldn’t really see inside the pipe (the room layout was poor, the projection screen small, my presentation completely unprepared etc etc), but even so, the discounting of the mashup as “but no-one would want to do anything with football match reviews” was…. typical.

So here’s an issue I’ve some to notice more and more. A lot of people see things literally. I look at the football match review pipe and I see it as giving me a worked example of how to create a SPARQL query in a Yahoo pipe, for example (as well as a whole load of other things, even down to how to construct a complex string, and a host of other tiny little building blocks, as well as how to string them together).

Take GraphViz as another example. I see a GraphViz file as a way of rapidly scripting and laying out diagrams using a representation that can accommodate change. It is possible to view source and correct a typo in a node label, whereas it might not be so easy to see how to do that in a jpg or gif.

“Yes but”, now comes the response, “yes, but: an average person won’t be able to use GraphViz to draw a [complicated] diagram”. Which is where my attitude problem comes in again:

1) most people don’t draw complicated diagrams anyway, ever. A hierarchical diagram with maybe 3 layers and 7 or 8 nodes would be as much as they’d ever draw; and if it was more complicated, most people wouldn’t be able to do it in Microsoft Word anyway… I.e. they wouldn’t be able to draw a presentable diagram anyway…

2) even if writing a simple script is too hard, there are already drag and drop drop interfaces that allow the construction of GraphViz drawings that can then be tidied up by the layout engine.

So where am I at? I’m going to have a a big rethink about presenting workshops (good job I got rejected from presenting at the OU’s internal conference, then…) to try to help people to see past the literal and to the deeper truth of mashup recipes, and try to find ways of helping others shift their attitude to see technology as an enabler.

And I also need a response to the retort that “it won’t work for complicated examples” along the lines of: a) you may be right; but b) most people don’t want to do the complicated things anyway…