A group of girl scouts is selling boxes of cookies to raise money for charity. The line plot shows how many boxes of cookies they sold. State whether the mean would be a good measure of central tendency for the data or not. Explain your answer. Is it (A) yes, the mean closely represents the middle of the data or (B) no, the mean is skewed too high because of the data point 36?
The mean is a good measure of central tendency when there are no outliers or extreme values in our data set. In order to answer this question, we need to calculate the mean with and without the possible outlier of 36, the highest number of boxes of cookies sold. If the mean changes significantly when this point is removed, then the mean will not be a good measure of central tendency for this data.
We recall that in order to calculate the mean, we need to divide the sum of the values by the number of values. In this case, we need to add two 20s, four 22s, two 24s, and so on. An easier way to calculate the sum would be to multiply each value by the number of data points at that value. We have two data points at 20. And two multiplied by 20 is 40. There are four data points at 22. Four multiplied by 22 is 88. Two multiplied by 24 is 48. Three multiplied by 26 is 78. Repeating this process for the one data point at 28, the three at 30, the one at 32, and the one at 36, we have 28, 90, 32, and 36.
There are no data points at 34, 38, and 40. This means that none of the girl scouts sold 34, 38, or 40 boxes of cookies. Adding the products 40, 88, 48, 78, 28, 90, 32, and 36 gives a sum of 440. We need to divide this by 17 as there are 17 data points. This means that the group contained 17 girl scouts. Dividing 440 by 17 gives us 25.882 and so on. Rounding this to one decimal place, we have 25.9. The mean number of boxes of cookies sold by the 17 girls was 25.9.
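The calculation above can be checked with a short Python sketch. The dictionary name `boxes_sold` is just an illustrative choice, and the counts for 28, 30, 32, and 36 are inferred from the products 28, 90, 32, and 36 quoted in the working, since the line plot itself is not reproduced here.

```python
# Frequency table: number of boxes sold -> number of girl scouts.
# Counts for 28, 30, 32 and 36 are inferred from the products in the working.
boxes_sold = {20: 2, 22: 4, 24: 2, 26: 3, 28: 1, 30: 3, 32: 1, 36: 1}

# Multiply each value by its frequency, then add the products.
total = sum(value * count for value, count in boxes_sold.items())

# The number of data points is the sum of the frequencies.
n = sum(boxes_sold.values())

mean = total / n
print(total, n, round(mean, 1))  # 440 17 25.9
```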
When we remove the girl who sold 36 boxes, we have 404 boxes in total as 440 minus 36 is 404. We need to divide this by 16 as this is the number of other girl scouts. 404 divided by 16 is 25.25. After we have removed the value of 36, the mean has decreased from 25.9 to 25.25. Whilst the mean has decreased slightly, it is still between two of the data points, 24 and 26. This means that the value of 36 has had very little impact on the mean. We can, therefore, conclude that 36 is not an outlier. And this means that the correct answer is: yes, the mean closely represents the middle of the data.
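As a rough check of this comparison, the sketch below rebuilds the 17 data points as a list and computes the mean with and without the value 36. The variable names are illustrative, and the list assumes the frequencies implied by the working (one data point each at 28, 32, and 36, and three at 30).

```python
# The 17 data points, rebuilt from the frequencies used in the working.
data = [20] * 2 + [22] * 4 + [24] * 2 + [26] * 3 + [28] + [30] * 3 + [32] + [36]

# Drop the single suspected outlier.
without_36 = [x for x in data if x != 36]

mean_all = sum(data) / len(data)
mean_without = sum(without_36) / len(without_36)

print(round(mean_all, 1))  # 25.9
print(mean_without)        # 25.25

# How far the mean moves when 36 is removed.
print(round(mean_all - mean_without, 2))  # 0.63
```

A shift of about 0.6 boxes is small compared with the spread of the data, which runs from 20 to 36, and this matches the conclusion that 36 is not an outlier.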
Had the value of 36 been significantly higher, then it would’ve impacted the mean. In that situation, the mean wouldn’t have been a good measure of central tendency.