How to lie with charts

Sun 20 Mar 2011 by mskala

I'm generally a fan of the IAEA, but this image I just grabbed from their Web site is a textbook example of slanting (literally!) a graphic image so that it misleads the reader.

[Image: Fuel temperature chart]

The chart shows the temperature of two spent fuel pools at Fukushima Daiichi. Until the morning of March 19, UTC, the temperatures were slowly but steadily increasing. After that, they decreased significantly. These data could well be presented with a two-dimensional line chart.

But the makers of the chart above chose to project it into three dimensions in such a way that the lines slant downward even where the temperature is increasing - obscuring one of the most important pieces of information the numbers represent, which is the direction of change. A human being looking at the chart and not reading the numbers would get the incorrect impression that the temperature was consistently decreasing over the entire time period. Thus the chart has failed in its purpose of making the numbers understandable. The chart actually conveys a negative amount of information, because it conveys incorrect information. If you erased all the graphics and left only the numbers, readers would be better informed.

If they were going to make a chart, there is no reason it needed to be a three-dimensional chart; and if they were going to make a three-dimensional chart, there is no reason they had to use that particular choice of projection angle. I hate to ascribe to malice that which could be explained by stupidity, but this really does look the way deliberate deception would look; and it has the same effect as deliberate deception even if it isn't actually deliberate deception.


This seems like the kind of thing that should really be thrown into the hall of shame at Edward Tufte's BBS: http://www.edwardtufte.com/bboard/q-and-a?topic_id=1

The useless three-dimensionality aside: A scatter plot would be even better than any line chart, because a line chart would falsely imply the existence of a linear progression between any two points. Maybe the IAEA doesn't think simple, slickly-produced graphics are sexy enough.

Or maybe they've all been made stupid by Excel. trythil - 2011-03-20 18:50
Well, I don't think the use of a line is so bad. Temperature *is* a smooth progression over time. Two successive measurements on the same pool are meaningfully connected in a way that, for instance, measurements of two different pools at the same time are not. So it's not the classic case of unrelated data items for which a scatter plot is optimal.

The time progression of temperature may not be linear, but plotting a piecewise linear function between the measurements seems reasonable to me. One thing Excel does that really annoys me is that by default it plots a smooth spline interpolating the input data - creating inflection points in between that probably don't exist. Matt - 2011-03-20 19:06
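To make the piecewise linear idea concrete, here is a minimal Python sketch. The function name `piecewise_linear` and the readings are hypothetical, invented for illustration; they are not the IAEA's numbers.

```python
from bisect import bisect_right


def piecewise_linear(times, values, t):
    """Interpolate linearly between the two samples that bracket t."""
    if not times or len(times) != len(values):
        raise ValueError("times and values must be non-empty and equal length")
    if t < times[0] or t > times[-1]:
        raise ValueError("t lies outside the sampled range; no extrapolation")
    i = bisect_right(times, t) - 1  # index of the last sample at or before t
    if i == len(times) - 1:  # t coincides with the final sample
        return values[-1]
    t0, t1 = times[i], times[i + 1]
    v0, v1 = values[i], values[i + 1]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical readings: hours since the first measurement, temperature in deg C.
hours = [0.0, 6.0, 20.0]
temps = [60.0, 63.0, 40.0]
print(piecewise_linear(hours, temps, 3.0))  # halfway along the first segment
```

An interpolated value of this kind always lies between the two measurements on either side of it, so the drawn line cannot introduce a peak or dip that the data do not contain. A smoothing spline offers no such guarantee.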
Temperature is smooth, but I don't understand what additional information is given by connecting lines.

When discretely sampling a continuous process, I think it's more honest to admit that you've got samples taken at regular (or, in the case of the IAEA plot, irregular) intervals and to just plot what you have. Large gaps between data points connected by a line can cover up anomalies that may be unimportant (or undesirable) to the author's agenda, but may be of interest to someone else. With a scatter plot, you're still missing those anomalies, but at least that omission is out in the open.

I guess a connecting line might be useful as a visual aid, but you can accomplish that in other ways, such as eliminating background junk or excessively heavy gridlines.

I'm working on redesigning the IAEA graphic to suit my tastes and standards; once I've got that done I'll post it here for criticism. Always open to improvement... trythil - 2011-03-20 19:45
Heh, speaking of junk:

> "discretely sampling"

...bleh, that should just read "sampling". Not sure how else you'd sample a signal. trythil - 2011-03-20 19:46
If we're going by a strict "does it convey additional information?" standard, we don't need a chart at all; a table would do. I think the lines connecting points are desirable because they allow for quick recognition of the signs of the first two differences: "Did it go up or down since the last measurement?" and "Is the difference between this measurement and the next one bigger or smaller than the difference between this measurement and the previous one?" Both those are especially interesting pieces of information in this particular case; and they'd be much harder to recognize from a scatter plot or a table of numbers.

But of course that's only if the visual design allows for those bits to be recognized both easily and correctly. This one doesn't. Matt - 2011-03-20 19:53
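For readers who want the "first two differences" spelled out, the following Python sketch computes their signs. The helper names and the temperature values are hypothetical, chosen only to illustrate the idea; they are not the IAEA readings.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def difference_signs(values):
    """Signs of the first and second differences of a sequence."""
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return [sign(d) for d in first], [sign(d) for d in second]


# Hypothetical temperatures in deg C, in measurement order.
temps = [60.0, 62.0, 63.0, 40.0]
up_down, change_in_step = difference_signs(temps)
print(up_down)         # 1 = rose since the previous measurement, -1 = fell
print(change_in_step)  # -1 = this step was smaller (or more negative) than the last
```

These are differences between successive readings, not rates of change. If the readings are unevenly spaced in time, the second difference in particular says nothing reliable about acceleration unless each step is first divided by its time interval.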
I think I now see (one of) your points re: the line chart's advantages. It definitely does make the trend much more obvious.

A line chart redesign: http://depot.ninjawedding.org/lines.png
A scatter plot redesign: http://depot.ninjawedding.org/scatter.png

Both of those were generated using Protovis.

I maintained the IAEA's colors because I had no other preferences, but a friend with protanopia and deuteranopia told me that the color scheme made the data series very hard to differentiate. Color in data graphics is something I need more practice with. trythil - 2011-03-20 23:53
I like those a lot better; still prefer the version with lines. You've also corrected another problem I hadn't even noticed, which is that the measurements are shown as equally spaced in time on the IAEA chart even though actually they are not equally spaced. That's a pretty serious problem, too, because using the false equal spacing distorts the slopes of the lines. Matt - 2011-03-21 07:06
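The slope distortion from false equal spacing is easy to see with a small calculation. This Python sketch uses hypothetical measurement times and temperatures, invented for illustration rather than taken from the IAEA chart.

```python
def slopes(xs, ys):
    """Slope of each straight segment joining consecutive (x, y) points."""
    pts = list(zip(xs, ys))
    return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(pts, pts[1:])]


# Hypothetical data: readings taken 0, 2, and 12 hours after the first one.
hours = [0.0, 2.0, 12.0]
temps = [60.0, 62.0, 64.0]

true_slopes = slopes(hours, temps)                # deg C per hour
drawn_slopes = slopes(range(len(temps)), temps)   # deg C per chart step

print(true_slopes)
print(drawn_slopes)
```

With these numbers, the temperature rises five times more slowly in the second interval than in the first, yet a chart that spaces the readings evenly draws both segments at the same angle.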
Some years ago I tried to talk a couple of professional statistician friends into creating a news agency called StatsWatch (its logo would be a sasquatch holding a clipboard). The idea was to keep those media who subscribed informed about all the statistical stupidity with which they are continually showered. Unfortunately they didn't bite.

I think Darrell Huff's 1954 book, How to Lie with Statistics, is still in print. It had some lovely tips on how to make deceptive graphs. Axel - 2011-03-21 08:53
Tony H.
It's not just the charts at the IAEA site that deceive; there is plenty of slanted wording. "Temperatures measured at the feed water nozzle and at the bottom of the Reactor Pressure Vessel (RPV) continue to decrease slightly at Units 1 and 2, except the temperature at the feed water nozzle of Unit 1's RPV, which has slightly increased to 274 °C." Unless I read this very wrongly, three of four temperatures went down slightly, and one went up. But the wording says they all went down, except for one, which went up. Fair enough if there were, say, 50 readings, but there are only four.

I looked late at the line chart posted by trythil above, and the difference is amazing. Despite reading all the discussion and looking closely at the IAEA chart, I still managed to miss that the original also hid the crossover point in the final interval. Tony H. - 2011-03-29 12:56
Oh, one slight quibble with Trythil's charts: there is important information (namely, which line is which) indicated only by colour. That's a bad thing because of the many people who don't have normal colour vision. Using colour is good, because it's easy to see for those who can see it; but nothing should be shown *only* by colour. One way to fix it might be to give the data points different shapes - like circles and triangles - for the two different reactors. Matt - 2011-03-29 23:02

