Statistics: No Box-and-Whiskers; A Better Histogram
Many of you know that I have ‘been around’ for a long time. My first statistics course was around 1970, and I started teaching some statistics in 1973. I’ve had some concerns about a tool invented about that time (box and whisker plots), and want to propose a replacement graphic.
Here are two box & whisker plots (done in horizontal format, which I prefer):
There are two basic flaws in the box & whisker display:
- The display implies information about variation that the underlying summary (the quartiles) does not contain.
- The display requires the reader to invert the visual relationship: each section holds about a quarter of the data, so a larger ‘box’ means a smaller density, and a smaller ‘box’ means a larger density.
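The second flaw can be checked with a few lines of arithmetic. Here is a minimal sketch in Python, assuming a made-up data set (`sample`) and a helper name (`quarter_densities`) that are mine, not part of any standard tool. It splits the data at the quartiles and reports how wide each quarter is and how densely the values are packed inside it.

```python
import statistics

def quarter_densities(data):
    """Return (width, density) for each of the four quartile sections.

    Each section holds about 25% of the values, so its density is
    roughly 0.25 divided by its width.
    """
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    edges = [min(data), q1, q2, q3, max(data)]
    result = []
    for lo, hi in zip(edges, edges[1:]):
        width = hi - lo
        density = 0.25 / width if width > 0 else float("inf")
        result.append((width, density))
    return result

# Hypothetical data: tightly packed low values, spread-out high values.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80]

for width, density in quarter_densities(sample):
    print(f"width={width:6.1f}  density={density:.4f}")
```

The widest sections come out with the smallest densities, which is exactly the inversion a reader of the box plot has to perform mentally.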
Here are the underlying data sets, presented in histogram format (which is not perfect, but avoids both of those issues):
Some of the problems with box plots are well documented, and a number of more sophisticated displays have been proposed. See http://vita.had.co.nz/papers/boxplots.pdf. These better displays are seldom used, especially in introductory statistics courses.
The main attraction of the box plot was that it provided an easy visual display of 5 numbers: minimum, first quartile, median, third quartile, maximum. The problem with creating a visual display of such simple summary data is that it will always imply more information than exists in the summary. We’ve got a solution at hand, much simpler than the alternatives used (which are based on maintaining the box concept):
Replace basic box-and-whisker plots with a “quartiled histogram”.
A quartiled histogram adds the quartile markers to a normal histogram display. Here are two examples; compare these to the box plots above:
The quartiled histogram combines the basic histogram with a simplified cumulative frequency chart — without losing the independent information of each category.
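For readers who want to experiment, here is a rough text-only sketch of the idea in Python. It is not the tool used to draw the graphics in this post. The function name (`quartiled_histogram`), the bin count, and the sample data are all assumptions chosen for illustration. Instead of drawing vertical quartile lines on a chart, it prints one row per category and notes which row contains each quartile.

```python
import statistics

def quartiled_histogram(data, bins=8, bar_width=40):
    """Return text rows: one histogram bar per bin, with quartile notes."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    step = (hi - lo) / bins

    def bin_index(x):
        # Clamp so the maximum value falls in the last bin.
        return min(int((x - lo) / step), bins - 1)

    counts = [0] * bins
    for x in data:
        counts[bin_index(x)] += 1

    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    markers = {"Q1": q1, "Median": med, "Q3": q3}

    scale = bar_width / max(counts)
    rows = []
    for i, count in enumerate(counts):
        left, right = lo + i * step, lo + (i + 1) * step
        labels = [name for name, value in markers.items()
                  if bin_index(value) == i]
        bar = "#" * round(count * scale)
        note = "  <-- " + ", ".join(labels) if labels else ""
        rows.append(f"{left:7.1f} to {right:7.1f} | {bar} ({count}){note}")
    return rows

# Hypothetical data: tightly packed low values, spread-out high values.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80]
print("\n".join(quartiled_histogram(sample)))
```

With this sample, the first quartile and the median both land in the tallest bar, which makes the high density of the lower half of the data visible directly, with no inversion required.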
Perhaps a basic box and whisker plot works when the audience is sophisticated in understanding statistics (researchers, statisticians, etc.). Because of the known perceptual weaknesses, I think we would be better served either by not covering box & whisker plots in intro classes, or by covering them briefly with a caution that they are to be avoided in favor of more sophisticated displays.