
Box-and-Whisker Plots: Interquartile Ranges and Outliers
The "interquartile range", abbreviated "IQR", is just the width of the box in the box-and-whisker plot. That is, IQR = Q_{3} – Q_{1}. The IQR can be used as a measure of how spread out the values are. Statistics assumes that your values are clustered around some central value. The IQR tells how spread out the "middle" values are; it can also be used to tell when some of the other values are "too far" from the central value. These "too far away" points are called "outliers", because they "lie outside" the range in which we expect them.
An outlier is any value that lies more than one and a half times the length of the box from either end of the box. That is, if a data point is below Q_{1} – 1.5×IQR or above Q_{3} + 1.5×IQR, it is viewed as being too far from the central values to be reasonable. Maybe you bumped the weigh-scale when you were making that one measurement, or maybe your lab partner is an idiot and you should never have let him touch any of the equipment. Who knows? But whatever their cause, the outliers are those points that don't seem to "fit".
(Why one and a half times the width of the box? Why does that particular value mark the difference between "acceptable" and "unacceptable" values? Because, when John Tukey was inventing the box-and-whisker plot in 1977 to display these values, he picked 1.5×IQR as the demarcation line for outliers. This has worked well, so we've continued using that value ever since.)
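
Written out as formulas, the rule above reads roughly as follows. The names L and U for the two cutoffs are labels chosen here for convenience, not notation from the lesson:

```latex
\begin{align*}
\mathrm{IQR} &= Q_{3} - Q_{1} \\
L &= Q_{1} - 1.5 \times \mathrm{IQR} \\
U &= Q_{3} + 1.5 \times \mathrm{IQR} \\
x \text{ is an outlier} &\iff x < L \ \text{ or } \ x > U
\end{align*}
```
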
10.2, 14.1, 14.4, 14.4, 14.4, 14.5, 14.5, 14.6, 14.7, ...
To find out if there are any outliers, I first have to find the IQR. There are fifteen data points, so the median will be at position (15 + 1) ÷ 2 = 8. Then Q_{2} = 14.6. There are seven data points on either side of the median, so Q_{1} is the fourth value in the list and Q_{3} is the twelfth: Q_{1} = 14.4 and Q_{3} = 14.9. Then IQR = 14.9 – 14.4 = 0.5.
Outliers will be any points below Q_{1} – 1.5×IQR = 14.4 – 0.75 = 13.65 or above Q_{3} + 1.5×IQR = 14.9 + 0.75 = 15.65. Then the outliers are at 10.2, 15.9, and 16.4.
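
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes an odd number of sorted data points and the "median of each half" rule used in this lesson. The helper name `quartile_positions` is made up for illustration, and `Decimal` is used only so that values like 13.65 come out exactly rather than as floating-point approximations.

```python
from decimal import Decimal

def quartile_positions(n):
    """Return the 1-based positions of Q1, Q2, Q3 in a sorted list of n values.

    Assumption (not a general rule): n is odd, and each half on either side
    of the median also holds an odd number of values, as in the
    fifteen-point example.
    """
    median_pos = (n + 1) // 2      # (15 + 1) / 2 = 8
    half = median_pos - 1          # points on either side of the median
    if n % 2 == 0 or half % 2 == 0:
        raise ValueError("this sketch only handles the simple odd/odd case")
    q1_pos = (half + 1) // 2       # middle of the lower half
    q3_pos = median_pos + q1_pos   # middle of the upper half
    return q1_pos, median_pos, q3_pos

positions = quartile_positions(15)

# Quartile values read off the list at those positions, as stated in the text.
q1, q3 = Decimal("14.4"), Decimal("14.9")
iqr = q3 - q1
lower_fence = q1 - Decimal("1.5") * iqr
upper_fence = q3 + Decimal("1.5") * iqr

print(positions)                      # (4, 8, 12)
print(iqr, lower_fence, upper_fence)  # 0.5 13.65 15.65
```
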
The values for Q_{1} – 1.5×IQR and Q_{3} + 1.5×IQR are the "fences" that mark off the "reasonable" values from the outlier values. Outliers lie outside the fences.
If your assignment is having you consider outliers and "extreme values", then the values for Q_{1} – 1.5×IQR and Q_{3} + 1.5×IQR are the "inner" fences and the values for Q_{1} – 3×IQR and Q_{3} + 3×IQR are the "outer" fences. The outliers (marked with asterisks or open dots) are between the inner and outer fences, and the extreme values (marked with whichever symbol you didn't use for the outliers) are outside the outer fences.
By the way, your book may refer to the value of "1.5×IQR" as being a "step". Then the outliers will be the numbers that are between one and two steps from the hinges, and the extreme values will be the numbers that are more than two steps from the hinges.
Looking again at the previous example, the outer fences would be at 14.4 – 3×0.5 = 12.9 and 14.9 + 3×0.5 = 16.4. Since 16.4 is right on the upper outer fence, this would be considered to be only an outlier, not an extreme value. But 10.2 is fully below the lower outer fence, so 10.2 would be an extreme value.
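
The inner/outer fence rule can be sketched in Python as below. The function name `classify` and its three labels are hypothetical, chosen for this illustration. The convention that a value sitting exactly on a fence counts as being inside it follows the 16.4 example above. `Decimal` is used because, with ordinary floats, a value that lands exactly on a fence might compare as slightly inside or slightly outside it.

```python
from decimal import Decimal

def classify(x, q1, q3):
    """Label x as 'extreme', 'outlier', or 'ordinary' using inner and outer fences."""
    step = Decimal("1.5") * (q3 - q1)        # one "step" = 1.5 x IQR
    inner = (q1 - step, q3 + step)           # one step beyond the quartiles
    outer = (q1 - 2 * step, q3 + 2 * step)   # two steps = 3 x IQR
    if x < outer[0] or x > outer[1]:
        return "extreme"
    if x < inner[0] or x > inner[1]:
        return "outlier"
    return "ordinary"

# Quartiles from the fifteen-point example.
q1, q3 = Decimal("14.4"), Decimal("14.9")
for value in ("10.2", "15.9", "16.4", "14.6"):
    print(value, classify(Decimal(value), q1, q3))
```

Here 16.4 equals the upper outer fence exactly, so the strict comparison `x > outer[1]` is false and the value is labelled only an outlier.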
If you're using your graphing calculator to help with these plots, make sure you know which setting you're supposed to be using and what the results mean, or the calculator may give you a perfectly correct but "wrong" answer.
21, 23, 24, 25, 29, 33, 49
To find the outliers and extreme values, I first have to find the IQR. Since there are seven values in the list, the median is the fourth value, so Q_{2} = 25. The first half of the list is 21, 23, 24, so Q_{1} = 23; the second half is 29, 33, 49, so Q_{3} = 33. Then IQR = 33 – 23 = 10.
The outliers will be any values below 23 – 1.5×10 = 23 – 15 = 8 or above 33 + 1.5×10 = 33 + 15 = 48. The extreme values will be those below 23 – 3×10 = 23 – 30 = –7 or above 33 + 3×10 = 33 + 30 = 63.
So I have an outlier at 49 but no extreme values. I won't have a top whisker because Q_{3} is also the highest non-outlier, so my plot has a box from 23 to 33 with the median line at 25, a bottom whisker reaching down to 21, and the outlier marked at 49.
It should be noted that the methods, terms, and rules outlined above are what I have taught and what I have most commonly seen taught. However, your course may have different specific rules, or your calculator may do computations slightly differently. You may need to be somewhat flexible in finding the answers specific to your curriculum.
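
To see how the choice of quartile rule can change the numbers, the sketch below works the seven-value example two ways: with a hand-written "median of each half" rule (the helpers `quartiles_by_halves` and `fences` are made-up names for this illustration) and with Python's standard-library `statistics.quantiles` using its `method="inclusive"` option. This only illustrates the caveat above; it is not a statement about what any particular calculator or textbook does.

```python
from statistics import median, quantiles

def quartiles_by_halves(values):
    """Q1, Q2, Q3 as the median of each half, leaving the overall
    median out of both halves when the count is odd."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]
    upper = data[(n + 1) // 2 :]
    return median(lower), median(data), median(upper)

def fences(q1, q3, k):
    """Return (low, high) cutoffs at k x IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [21, 23, 24, 25, 29, 33, 49]

q1, q2, q3 = quartiles_by_halves(data)
inner = fences(q1, q3, 1.5)
outer = fences(q1, q3, 3)
# Extreme values are outside the outer fences; outliers are outside the
# inner fences but not extreme.
extremes = [x for x in data if x < outer[0] or x > outer[1]]
outliers = [x for x in data
            if (x < inner[0] or x > inner[1]) and x not in extremes]

print(q1, q2, q3)          # 23 25 33
print(inner, outer)        # (8.0, 48.0) (-7, 63)
print(outliers, extremes)  # [49] []

# A different, equally legitimate interpolation rule gives other quartiles.
alt_q1, alt_q2, alt_q3 = quantiles(data, n=4, method="inclusive")
print(alt_q1, alt_q2, alt_q3)  # 23.5 25.0 31.0
```

With the inclusive rule the IQR shrinks to 7.5, though 49 still lands above the upper inner fence of 31 + 11.25 = 42.25 and below the upper outer fence of 31 + 22.5 = 53.5, so the conclusion for this data set does not change.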


This lesson may be printed out for your personal use.

Copyright © 2004-2014 Elizabeth Stapel




