
Box-and-Whisker Plots

Sections: Quartiles, boxes, and whiskers, Five-number summary, Interquartile ranges and outliers

Statistics assumes that your data points (the numbers in your list) are clustered around some central value. The "box" in the box-and-whisker plot contains, and thereby highlights, the middle half of these data points.

To create a box-and-whisker plot, you start by ordering your data (putting the values in numerical order), if they aren't ordered already. Then you find the median of your data. The median divides the data into two halves. To divide the data into quarters, you then find the medians of these two halves.

Note: If you have an even number of values, so the first median was the average of the two middle values, then you include the middle values in your sub-median computations. If you have an odd number of values, so the first median was an actual data point, then you do not include that value in your sub-median computations. That is, to find the sub-medians, you're only looking at the values that haven't yet been used.

You have three points: the first middle point (the median), and the middle points of the two halves (what I call the "sub-medians"). These three points divide the entire data set into quarters, called "quartiles". The top point of each quartile has a name, being a "Q" followed by the number of the quarter. So the top point of the first quarter of the data points is "Q_{1}", and so forth. Note that Q_{1} is also the middle number for the first half of the list, Q_{2} is also the middle number for the whole list, Q_{3} is the middle number for the second half of the list, and Q_{4} is the largest value in the list.

Once you have these three points, Q_{1}, Q_{2}, and Q_{3}, you have all you need in order to draw a simple box-and-whisker plot. Here's an example of how it works.
4.3, 5.1, 3.9, 4.5, 4.4, 4.9, 5.0, 4.7, 4.1, 4.6, 4.4, 4.3, 4.8, 4.4, 4.2, 4.5, 4.4

My first step is to order the set. This gives me:

3.9, 4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.4, 4.4, 4.5, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1

The first number I need is the median of the entire set. Since there are seventeen values in this list, I need the ninth value. The median is Q_{2} = 4.4.

The next two numbers I need are the medians of the two halves. Since I used the "4.4" in the middle of the list, I can't reuse it, so my two remaining data sets are:

3.9, 4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.4

and

4.5, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1

The first half has eight values, so the median is the average of the middle two:

Q_{1} = (4.3 + 4.3)/2 = 4.3

The median of the second half is:

Q_{3} = (4.7 + 4.8)/2 = 4.75
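
If you'd like to check this kind of computation by machine, here is a minimal Python sketch of the median-of-halves procedure described in this lesson. The function name "quartiles" is my own invention for illustration, and the sketch assumes the list has at least two values. Note that statistical software often uses other quartile conventions, so built-in quartile functions may give slightly different answers than this method.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Hypothetical helper for illustration; assumes at least two values.
    """
    ordered = sorted(values)          # step 1: put the values in order
    n = len(ordered)
    half = n // 2

    lower = ordered[:half]            # values below the middle position
    if n % 2:
        # Odd count: the median is an actual data point, so leave it
        # out of both halves.
        upper = ordered[half + 1:]
    else:
        # Even count: the median was an average, so both middle
        # values stay in their halves.
        upper = ordered[half:]

    return median(lower), median(ordered), median(upper)


data = [4.3, 5.1, 3.9, 4.5, 4.4, 4.9, 5.0, 4.7, 4.1,
        4.6, 4.4, 4.3, 4.8, 4.4, 4.2, 4.5, 4.4]

q1, q2, q3 = quartiles(data)
print("Q1 =", q1, " Q2 =", q2, " Q3 =", q3)
```

The printed values should agree with the ones worked out by hand above.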
By the way, box-and-whisker plots don't have to be drawn horizontally as I did above; they can be vertical, too.


This lesson may be printed out for your personal use.

Copyright © 2004-2014 Elizabeth Stapel




