Scripting Charts With GraphViz – Hierarchies; and a Question of Attitude

A couple of weeks ago, my other half was finishing off corrections to her PhD thesis. The layout of one of the diagrams – a simple hierarchy written originally using the Draw tools in an old version of MS-Word – had gone wrong, so in the final hours before the final printing session, I offered to recreate it.

Not being a draughtsman, of course I decided to script the diagram, using GraphViz:

The labels are added to the nodes using the GraphViz label attribute, such as:

n7[label="Trait SE"];

The edges are defined in the normal way:

n4->n8;
n4->n9;
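Put together, a complete dot file along these lines might look like the following minimal sketch. Only the n7 label and the two n4 edges come from the fragments above; the graph name and the other labels are placeholders I've assumed for illustration.

```dot
// Minimal sketch of a complete dot file.
// Only n7's label and the two n4 edges are taken from the post;
// the graph name and the remaining labels are hypothetical.
digraph hierarchy {
    // Node declarations, each with a display label
    n4 [label="Parent"];     // hypothetical label
    n7 [label="Trait SE"];   // label quoted in the post
    n8 [label="High"];       // hypothetical label
    n9 [label="Low"];        // hypothetical label

    // Edges run from parent to child
    n4 -> n8;
    n4 -> n9;
}
```

Saved as, say, hierarchy.dot (a made-up filename), something like `dot -Tpng hierarchy.dot -o hierarchy.png` should render it to an image.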

But there was a problem – in the above figure, the GraphViz layout put two nodes in the wrong place. The requirement was that the high and low nodes be ordered according to their parents, as indeed they had been ordered in the GraphViz dot file.

A bit of digging turned up a fix, though:

graph [ ordering="out" ];

is a switch that forces GraphViz to lay out each node's outgoing edges, and so its child nodes, in a left-to-right fashion in the order in which they are declared.
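As a small illustration of where that line goes, here is a sketch with made-up node names (root, a, b and c are not from the original diagram). With the default top-to-bottom layout, I'd expect the three children to come out left to right in the order their edges are listed.

```dot
// Sketch only: node names are hypothetical, not from the thesis diagram.
digraph ordered {
    // Ask dot to keep each node's outgoing edges in declaration order
    graph [ ordering="out" ];

    // Declared in the order a, b, c, so they should be drawn a, b, c
    root -> a;
    root -> b;
    root -> c;
}
```

As far as I know, the ordering attribute is only honoured by the dot layout engine, so other GraphViz layouts such as neato may ignore it.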

During the digging, I also found the following type of construct

{rank=same;ordering=out;n8;n9;n10;n11;n12;n13;n14;n15}

which will force a set of nodes to be positioned along the same horizontal row. Whilst I didn’t need it for the simple graph I was plotting, I can see this being a useful thing to know.
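To show the effect on a smaller graph, here is a sketch using invented node names (none of them come from the original diagram). One leaf hangs off a longer branch than the others, so it would normally be drawn a row lower; the rank=same group asks the layout to line all three leaves up.

```dot
// Sketch only: all node names are hypothetical.
digraph ranked {
    top -> left;
    top -> right;
    left -> leaf1;
    right -> leaf2;
    right -> mid;
    mid -> leaf3;    // leaf3 is one step deeper than leaf1 and leaf2

    // Ask dot to put all three leaves on the same horizontal row
    { rank=same; leaf1; leaf2; leaf3; }
}
```

Commenting out the rank=same line and rendering again is a quick way to see the difference.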

There are a few more things, though, that I want to point out about this whole exercise.

Firstly, I now tend to assume that I probably should be able to script a diagram, rather than have to draw it. (See also, for example, Writing Diagrams, RESTful Image Generation – When Text Just Won’t Do and Progressive Enhancement – Some Examples.)

Secondly, when the layout “went wrong”, I assumed there’d be a fix – and set about searching for it – and indeed found it (along with another possibly useful trick along the way).

This second point is an attitudinal thing; knowing an amount of programming, I know that most of the things I want to do most of the time are probably possible, because they are exactly the sorts of problems that are likely to crop up again and again, and as such solutions are likely to have been coded in, or workarounds found. I assume my problem is nothing special, and I look for the answer; and often find it.

This whole attitude thing is getting to be a big bugbear of mine. Take a lot of the mashups that I post here on OUseful.info. They are generally not intended to be one-off solutions. This blog is my notebook, so I use it to record “how to” stuff. And a lot of the posts are contrived to demonstrate minimally worked examples of how to do various things.

So for example, in a recent workshop I demonstrated the Last Week’s Football Reports from the Guardian Content Store API (with a little dash of SPARQL).

Now to me, this is a mashup that shows how to:

– construct a relative date limited query on the Guardian content API;
– create a media RSS feed from the result;
– identify a convention in the Guardian copy that essentially let me finesse metadata from a free text field;
– create a SPARQL query over dbpedia and use the result to annotate each result from the Guardian content API;
– create a geoRSS feed from the result that could be plotted directly on a map.

Now I appreciate that no-one in the (techie) workshop had brought a laptop, and so couldn’t really see inside the pipe (the room layout was poor, the projection screen small, my presentation completely unprepared etc etc), but even so, the discounting of the mashup as “but no-one would want to do anything with football match reviews” was… typical.

So here’s an issue I’ve come to notice more and more. A lot of people see things literally. I look at the football match review pipe and I see it as giving me a worked example of how to create a SPARQL query in a Yahoo pipe, for example (as well as a whole load of other things, even down to how to construct a complex string, and a host of other tiny little building blocks, as well as how to string them together).

Take GraphViz as another example. I see a GraphViz file as a way of rapidly scripting and laying out diagrams using a representation that can accommodate change. It is possible to view source and correct a typo in a node label, whereas it might not be so easy to see how to do that in a jpg or gif.

“Yes but”, now comes the response, “yes, but: an average person won’t be able to use GraphViz to draw a [complicated] diagram”. Which is where my attitude problem comes in again:

1) most people don’t draw complicated diagrams anyway, ever. A hierarchical diagram with maybe 3 layers and 7 or 8 nodes would be as much as they’d ever draw; and if it was more complicated, most people wouldn’t be able to do it in Microsoft Word anyway… I.e. they wouldn’t be able to draw a presentable diagram anyway…

2) even if writing a simple script is too hard, there are already drag and drop interfaces that allow the construction of GraphViz drawings that can then be tidied up by the layout engine.

So where am I at? I’m going to have a big rethink about presenting workshops (good job I got rejected from presenting at the OU’s internal conference, then…) to try to help people to see past the literal and to the deeper truth of mashup recipes, and try to find ways of helping others shift their attitude to see technology as an enabler.

And I also need a response to the retort that “it won’t work for complicated examples” along the lines of: a) you may be right; but b) most people don’t want to do the complicated things anyway…

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

3 thoughts on “Scripting Charts With GraphViz – Hierarchies; and a Question of Attitude”

  1. Here’s a contribution towards a response to “it won’t work for complicated examples”
    A graphical representation is usually a summary of available data, that has been created for a particular purpose. For example, the London Underground map doesn’t show the changes in depths between the stations, or the real distances between the stations, or a lot of other information about tube lines and stations that is available. It shows which stations are connected, and by which lines, hence helping travellers to plan a route.

    Your mashups can be thought of in a similar way. They ignore a load of data that’s available from each particular data source, to enable a summary of connected data to be presented. E.g. in the football example, you extract the football reports from the content store (ignoring all the other data), then filter that to leave just recent reports.

    It doesn’t matter how complicated each particular data source is, as long as it’s possible to extract data that’s useful for the particular representation that the mashup delivers.

  2. Don’t give up on the presentations yet. I realise it may be dispiriting to have your audience undermine your good intentions but I would suggest the fault is with them more than us. As someone who has tried to teach and use diagrams, even just on paper, and other modelling techniques, for many years, I fully concur with your views on people’s abilities to do much with diagrams. Part of the reason for that is there is no universal ‘grammar’ for ‘writing’ and ‘reading’ diagrams, however produced. We are familiar with the grammar of words and of text, with still and moving pictures, and even numbers but not diagrammatic representations.

    Any diagram is only as useful as it is for its said purpose, and this is irrespective of how ‘complicated’ the data set or situation being modelled. There is no absolute need for people to do complicated data mashups themselves, but be sure that more and more people will (or do) produce them and so understanding how they were created is important (and by people I also mean companies and agencies giving or selling us products and services). If we just accept the output uncritically, or think the output is all that matters, then we deserve what we get (e.g. fancy financial products that no one can evaluate properly).

    PS This is probably a temporal and enculturation issue – who would have believed that so many could do some simple modelling until spreadsheets became so universal a technology?

