More Tukey Gems

Via a half quote by Adam Cooper in his SoLAR flare talk today, elucidated in his blog post Exploratory Data Analysis, I am led to a talk by John Tukey – The Technical Tools of Statistics – read at the 125th Anniversary Meeting of the American Statistical Association, Boston, November 1964.

As ever (see, for example, Quoting Tukey on Visual Storytelling with Data), it contains some gems… The following is a spoiler of the joy of reading the paper itself. I suggest you do that instead – you’ll more than likely find your own gems in the text: The Technical Tools of Statistics.

If you’re too lazy to click away, here are some of the quotes and phrases I particularly enjoyed.

To start with, the quote referenced by Adam:

Some of my friends felt that I should be very explicit in warning you of how much time and money can be wasted on computing, how much clarity and insight can be lost in great stacks of computer output. In fact, I ask you to remember only two points:

  • The tool that is so dull that you cannot cut yourself on it is not likely to be sharp enough to be either useful or helpful.
  • Most uses of the classical tools of statistics have been, are, and will be, made by those who know not what they do.

And here’s one I’m going to use when talking about writing diagrams:

Hand-drawing of graphs, except perhaps for reproduction in books and in some journals, is now economically wasteful, slow, and on the way out.

(It strikes me that using a spreadsheet wizard to create charts in a research or production setting, where we are working in a reproducible, document generation context, is akin to the “hand-drawing of graphs” of yesteryear?)

“I know of no person or group that is taking nearly adequate advantage of the graphical potentialities of the computer.”

Nothing’s changed?!

[W]e are going to reach a position we should have reached long ago. We are going, if I have to build it myself, to have a programming system — a “language” if you like — with all that that implies, suited to the needs of data analysis. This will be planned to handle numbers in organized patterns of very different shapes, to apply a wide variety of data-analytical operations to make new patterns from old, to carry out the oddest sequences of apparently unrelated operations, to provide a wide variety of outputs, to automatically store all time-expensive intermediate results “on disk” until the user decides whether or not he will want to do something else with them, and to do all this and much more easily.

Since I started playing with pandas, my ability to have written conversations with data has improved. Returning to R after a few months away, I’m finding that easier to write too (the tabular data models, and elements of the syntax, are broadly similar across the two).

Most of the technical tools of the future statistician will bear the stamp of computer manufacture, and will be used in a computer. We will be remiss in our duty to our students if we do not see that they learn to use the computer more easily, flexibly, and thoroughly than we ever have; we will be remiss in our duties to ourselves if we do not try to improve and broaden our own uses.

This does not mean that we shall have to continue to teach our students the elements of computer programming; most of the class of ’70 is going to learn that as freshmen or sophomores. Nor does it mean that each student will write his own program for analysis of variance or for seasonal adjustment, this would be a waste. … It must mean learning to put together, effectively and easily — on a program-self-modifying computer and by means of the most helpful software then available — data analytical steps appropriate to the need, whether this is to uncover an anticipated specific appearance or to explore some broad area for unanticipated, illuminating appearances, or, as is more likely, to do both.

Interesting to note that in the UK, “text-based programming” has made it into the curriculum. (Related: Text Based Programming, One Line at a Time (short course pitch).)

Tukey also talks about how computing will offer flexibility and fluidity. Flexibility includes the “freedom to introduce new approaches; freedom, in a word, to be a journeyman carpenter of data-analytical tools”. Fluidity “means that we are prepared to use structures of analysis that can flow rather freely … to fit the apparent desires of the data”.

As the computer revolution finally penetrates into the technical tools of statistics, it will not change the essential characteristics of these tools, no matter how much it changes their appearance, scope, appositeness and economy. We can only look for:

  • more of the essential erector-set character of data analysis techniques, in which a kit of pieces are available for assembly into any of a multitude of analytical schemes,
  • an increasing swing toward a greater emphasis on graphicality and informality of inference,
  • a greater and greater role for graphical techniques as aids to exploration and incisiveness,
  • steadily increasing emphasis on flexibility and on fluidity,
  • wider and deeper use of empirical inquiry, of actual trials on potentially interesting data, as a way to discover new analytic techniques,
  • greater emphasis on parsimony of representation and inquiry, on the focussing, in each individual analysis, of most of our attention on relatively specific questions, usually in combination with a broader spreading of the remainder of our attention to the exploration of more diverse possibilities.

In order that our tools, and their uses, develop effectively … we shall have to give still more attention to doing the approximately right, rather than the exactly wrong, …

All quotes from John Tukey, The Technical Tools of Statistics, 1964.

Wonderful:-)

Quoting Tukey on Visual Storytelling with Data

Time was when I used to be a reasonably competent scholar, digging into the literature chasing down what folk actually said, and chasing forward to see whether claims had been refuted. Then I fell out of love with the academic literature – too many papers that said nothing, too many papers that contained errors, too many papers…

…but as we start production on a new OU course on “data”, I’m having to get back into the literature so I can defuse any claims that what I want to say is wrong by showing that someone else has said it before (which, if what is said has been peer reviewed, makes it right…).

One thing I’d forgotten about chasing thought lines through the literature was that at times it can be quite fun… and that it can quite often turn up memorable, or at least quotable, lines.

For example, last night I had a quick skim through some papers by a folk hero of the statistics world, John Tukey, and turned up the following:

The habit of building one technique on another — of assembling procedures like something made of erector-set parts — can be especially useful in dealing with data. So too is looking at the same thing in many ways or many things in the same way; an ability to generalize in profitable ways and a liking for a massive search for order. Mathematicians understand how subtle assumptions can make great differences and are used to trying to trace the paths by which this occurs. The mathematician’s great disadvantage in approaching data is his—or her—attitude toward the words “hypothesis” and “hypotheses”.

When you come to deal with real data, formalized models for its behavior are not hypotheses in the mathematician’s sense… . Instead these formalized models are reference situations—base points, if you like — things against which you compare the data you actually have to see how it differs. There are many challenges to all the skills of mathematicians — except implicit trust in hypotheses — in doing just this.

Since no model is to be believed in, no optimization for a single model can offer more than distant guidance. What is needed, and is never more than approximately at hand, is guidance about what to do in a sequence of ever more realistic situations. The analyst of data is lucky if he has some insight into a few terms of this sequence, particularly those not yet mathematized.

Picturing of Data

Picturing of data is the extreme case. Why do we use pictures? Most crucially to see behavior we had not explicitly anticipated as possible – for what pictures are best at is revealing the unanticipated; crucially, often as a way of making it easier to perceive and understand things that would otherwise be painfully complex. These are the important uses of pictures.

We can, and too often do, use picturing unimportantly, often wastefully, as a way of supporting the feeble in heart in their belief that something we have just found is really true. For this last purpose, when and if important, we usually need to look at a summary.

Sometimes we can summarize the data neatly with a few numbers, as when we report:
– a fitted line—two numbers,
– an estimated spread of the residuals around the “best” line—one more number,
– a confidence interval for the slope of the “best” line—two final numbers.

When we can summarize matters this simply in numbers, we hardly need a picture and often lose by going to it. When the simplest useful summary involves many more numbers, a picture can be very helpful. To meet our major commitment of asking what lies beyond, in the example asking “What is happening beyond what the line describes!”, a picture can be essential.

The main tasks of pictures are then:
– to reveal the unexpected,
– to make the complex easier to perceive.

Either may be effective for that which is important above all: suggesting the next step in analysis, or offering the next insight. In doing either of these there is much room for mathematics and novelty.

How do we decide what is a “picture” and what is not? The more we feel that we can “taste, touch, and handle” the more we are dealing with a picture. Whether it looks like a graph, or is a list of a few numbers is not important. Tangibility is important—what we strive for most.

[Tukey, John W. “Mathematics and the picturing of data.” In Proceedings of the international congress of mathematicians, vol. 2, pp. 523-531. 1975.]
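
To make Tukey's "few numbers" summary of a fitted line concrete, here is a minimal sketch of how the five numbers might be computed. The function name `line_summary` and the data are made up for illustration, and the confidence interval uses a normal quantile rather than Student's t, which is only a reasonable assumption for larger samples.

```python
import math
from statistics import NormalDist, mean


def line_summary(xs, ys, level=0.95):
    """Summarise a least-squares line in five numbers (hypothetical helper)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # The fitted line: two numbers.
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Spread of the residuals around the line: one more number
    # (residual standard deviation with n - 2 degrees of freedom).
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = math.sqrt(sum(r * r for r in residuals) / (n - 2))

    # Confidence interval for the slope: two final numbers.
    # Assumption: normal quantile used in place of Student's t.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * spread / math.sqrt(sxx)

    return {
        "intercept": intercept,
        "slope": slope,
        "residual_spread": spread,
        "slope_ci": (slope - half_width, slope + half_width),
    }


# Made-up data lying close to y = 1 + 2x.
xs = [0, 1, 2, 3, 4, 5]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
summary = line_summary(xs, ys)
print(summary)
```

If those five numbers tell the whole story, Tukey's point is that a picture adds little; the picture earns its keep when we ask what is happening beyond what the line describes.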

Wonderful!

Or how about these quotes from Tukey, John W. “Data-based graphics: visual display in the decades to come.” Statistical Science 5, no. 3 (1990): 327-339?

Firstly, on exploratory versus explanatory graphics:

I intend to treat making visual displays as something done by many people who want to communicate – often, on the one hand, to communicate identified phenomena to others, and often, on the other, to communicate unidentified phenomena to themselves. This broad clientele needs a “consumer product,” not an art course. To focus on a broad array of users is not to deny the existence of artists of visual communication, only to recognize how few they are and how small a share in the total volume of communication they can contribute. For such artists, very many statements that follow deserve escape clauses or caveats.

More thoughts on exploration and explanation (transfer of recognition), as well as a distinction between exploration and prospecting:

We all need to be clear that visual display can be very effective in serving two quite different functions, but only if used in correspondingly different ways. On the one hand, it can be essential in helping – or, even, in permitting – us to search in some data for phenomena, just as a prospector searches for gold or uranium.

Our task differs from the usual prospector’s task, in that we are concerned both with phenomena that do occur and with those that might occur but do not. On the other hand, visual display can be very helpful in transferring (to reader, viewer or listener) a recognition of the appearances that indicate the phenomena that deserve report. Indeed, when sufficient precomputation drives appropriately specialized displays, visual display can even also convey the statistical significances or non-significances of these appearances.

There is no reason why a good strategy for prospecting will also be a good strategy for transfer. We can expect some aspects of prospecting strategy (and most techniques) to carry over, but other aspects may not. (We say prospecting because we are optimistic that we may in due course have good lists of possible phenomena. If we do not know what we seek, we are “exploring” not “prospecting.”)

One major difference is in prospecting’s freedom to use multiple pictures. If it takes 5 or 10 kinds of pictures to adequately explore one narrow aspect of the data, the only question is: Will 5 or 10 pictures be needed, or can we condense this to 3 or 4 pictures, without appreciable loss? If “yes” we condense; if “no” we stick to the 5 or 10.

If it takes 500 to 1000 (quite different) pictures, however, our choice can only be between finding a relatively general way to do relatively well with many fewer pictures and asking the computer to sort out some number, perhaps 10 or 20 pictures of “greatest interest.”

In doing transfer, once we have one set of pictures to do what is needed, economy of paper (or plastic) and time (to mention or to read) push even harder toward “no more pictures than needed.” But, even here, we must be very careful not to insist, as a necessity not a desideratum, that a single picture can do it all. If it takes two pictures to convey the message effectively, we must use two.

For prospecting, we will want a bundle of pictures, probably of quite different kinds, so chosen that someone will reveal the presence of any one of possible phenomena, of potentially interesting behaviors, which will often have to be inadequately diverse. Developing, and improving, bundles of pictures for selected combinations of kinds of aspects and kinds of situations will be a continuing task for a long time.

For transfer, we will need a few good styles to transfer each phenomenon of possible importance, so that, when more than one phenomenon deserves transfer, we can choose compatible styles and try to transfer two, or even all, of these phenomena in a single picture. (We can look for the opportunity to do this, but, when we cannot find it, we will use two, or more, pictures as necessary.)

For prospecting, we look at long lists of what might occur – and expect to use many pictures. For transfer, we select short lists of what must be made plain – and use as few pictures as will serve us well.

Then on the power of visual representations, whilst recognising they may not always be up to the job…:

The greatest possibilities of visual display lie in vividness and inescapability of the intended message. A visual display can stop your mental flow in its tracks and make you think. A visual display can force you to notice what you never expected to see. (“Why, that scatter diagram has a hole in the middle!”) On the other hand, if one has to work almost as hard to drag something out of a visual display as one would to drag it out of a table of numbers, the visual display is a poor second to the table, which can easily provide so much more precision. (Here, as elsewhere, artists may deserve an escape clause.)

On design:

Another important aspect of impact is immediacy. One should see the intended at once; one should not even have to wait for it to gradually appear. If a visual display lacks immediacy in thrusting before us one of the phenomena for whose presentation it had been assigned responsibility, we ought to ask why and use the answer to modify the display so its impact will be more immediate.

(For a great example of how to progressively refine a graphic to support the making of a particular point, see this Storytelling With Data post on multifaceted data and story.)

Tukey, who we must recall was writing at a time when powerful statistical graphics tools such as ggplot had yet to be implemented, also suggests that lessons are to be learned from graphic design for the production of effective statistical charts:

The art of statistical graphics was for a long time a pen-and-pencil cottage industry, with the top professionals skilled with the drafting or mapping pen. In the meantime, graphic designers, especially for books, have had access to different sorts of techniques (the techniques of graphic communication), such as grays of different screen weights, against which, for instance, both white and black lines (and curves) are effective. They also have a set of principles shared in part with the Fine Arts (some written down by Leonardo da Vinci). I do not understand all this well enough to try to tell you about “visual centers” and how attention usually moves when one looks at a picture, but I do know enough to find this area important – and to tell you that more of us need to learn a lot more about it.

Data – what is it good for?

Almost everything we do with data involves comparison – most often between two or more values derived from the data, sometimes between one value derived from the data and some mental reference or standard. The dedication of Richard Hamming’s book on numerical analysis reads “The purpose of computation is insight, not numbers.” We need a book on visual display that at least implies “The purpose of display is comparison (recognition of phenomena), not numbers.”

Tukey also encourages us to think about what data represents, and how it is represented:

Much of what we want to know about the world is naturally expressed as phenomena, as potentially interesting things that can be described in non numerical words. That an economic growth rate has been declining steadily throughout President X’s administration, for example, is a phenomenon, while the fact that the GNP has a given value is a number. With exceptions like “I owe him 27 dollars!” numbers are, when we look deeply enough, mainly of interest because they can be assembled, often only through analysis, to describe phenomena. To me phenomena are the main actors, numbers are the supporting cast. Clearly we most need help with the main actors.

If you really want numbers, presumably for later assembly into a phenomenon, a table is likely to serve you best. The graphic map of Napoleon’s incursion into Russia that so stirs Tufte’s imagination and admiration does quite well in showing the relevant phenomena, in giving the answers to “About where?”, “About when?” and “With roughly what fraction of the original army left?” It serves certain phenomena well. But if we want numbers, we can do better either by reading the digits that may be attached to the graphic – a simple but often effective form of table – or by going to a conventional table.

The questions that visual display (in some graphic mode) answers best are phenomenological (in the sense of the first sentence of this section). For instance:

* Is the value small, medium or large?
* Is the difference, or change, up, down or neutral?
* Is the difference, or change, small, medium or large?
* Do the successive changes grow, shrink or stay roughly constant?
* What about change in ratio terms, perhaps thought of as percent of previous?
* Does the vertical scatter change, as we move from left to right?
* Is the scatter pattern doughnut-shaped?

One way that we will enhance the usefulness of visual display is to find new phenomena of potential interest and then learn how to make displays that will be likely to reveal them, when they are present.

The absence of a positive phenomenon is itself a phenomenon! Such absences as:

* the values are all about the same
* there does not seem to be any definite curvature
* the vertical scatter does not seem to change, as we go from left to right!

are certainly potentially interesting. (We can all find instances where they are interesting.) Thus they are, honestly, phenomena in themselves. We need to be able to view apparent absence of specific phenomena effectively as well as noticing them when they are present! This is one of the reasons why fitting scatter plots with summarizing devices like middle traces (Tukey, 1977a, page 279 ff.) can be important.
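
As a rough sketch of the sort of summarizing device Tukey mentions, the following cuts a scatter into slices along x and takes the median x and median y within each slice. This is only in the spirit of a middle trace: the function name, the equal-count slicing and the data are my own assumptions, and the procedure in Tukey (1977a) differs in its details.

```python
from statistics import median


def middle_trace(points, n_slices=3):
    """Median (x, y) per x-slice; a simplified, hypothetical middle trace."""
    ordered = sorted(points)  # order the points by x
    n = len(ordered)
    trace = []
    for i in range(n_slices):
        # Roughly equal-count slices along the x axis.
        chunk = ordered[i * n // n_slices:(i + 1) * n // n_slices]
        if chunk:
            xs = [x for x, _ in chunk]
            ys = [y for _, y in chunk]
            trace.append((median(xs), median(ys)))
    return trace


# Made-up scatter with no trend in y as x increases.
points = [(1, 5), (2, 4), (3, 6), (4, 5), (5, 4), (6, 6), (7, 5), (8, 4), (9, 6)]
trace = middle_trace(points)
print(trace)
```

A trace that stays level from slice to slice is one way of seeing the "absence" phenomenon directly: the values are all about the same.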

Phenomena are also picked up later in the paper:

A graph or chart should not be just another form of table, in which we can look up the facts. If it is to do its part effectively, its focus – or so I believe – will have to be one or more phenomena.

Indeed, the requirement that we can directly read values from a chart seems to be something Tukey takes issue with:

As one who denies “reading off numbers” as the prime purpose of visual display, I can only denounce evaluating displays in terms of how well (given careful study) people read numbers off. If such an approach were to guide us well, it would have to be a very unusual accident.

He even goes so far as to suggest that we might consider being flexible in the way we geometrically map from measurement scales to points on a canvas, moving away from proportionality if that helps us see a phenomenon better:

The purpose of display is to make messages about phenomena clear. There is no place for a doctrinaire approach to “truth in geometry.” We must be honest and say what we did, but this need not mean plotting raw data.

The point is that we have a choice, not only in (x, y)-plots, but more generally. Planned disproportionality needs to be a widely-available option, one that requires the partnership of computation and display.
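
One familiar example of a non-proportional mapping is a log scale, which I'm offering here as my own illustration rather than something Tukey spells out in this passage. The sketch below, using made-up values, shows why it can help: a series growing by a constant ratio has ever larger steps on the raw scale, but equal steps on a log scale.

```python
import math

# Made-up series growing by 50% at each step.
values = [100, 150, 225, 337.5]

# Step sizes as they would appear on a proportional (raw) axis.
raw_steps = [b - a for a, b in zip(values, values[1:])]

# Step sizes as they would appear on a log10 axis.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(values, values[1:])]

print("raw steps:", raw_steps)
print("log steps:", [round(s, 4) for s in log_steps])
```

On the log scale, "constant percentage growth" becomes something you can see at a glance, provided, as Tukey insists, we are honest and say what we did (for example, by labelling the axis as logarithmic).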

Tukey is also willing to rethink how we use familiar charts:

Take simple bar charts as an example, which I would define much more generally than many classical authors. Why have they survived? Not because they are geometrically true, and not because they lead to good numerical estimates by the viewer! In my thoughts, their virtue lies in the fact that we can all compare two bars, perhaps only roughly, in two quite different ways, both “About how much difference?” and “About what ratio?” The latter, of course, is often translated into “About how much percent change?” (Going on to three or more successive bars, we can see globally whether the changes in amount are nearly the same, but asking the same question about ratios – rather than differences – requires either tedious assessment of ratios between adjacent bars for one adjacent pair after another…

So – what other Tukey papers should I read?