Video Transcript
The frequency table shows the
distribution of masses in kilograms of items picked up by a removal company. Estimate the mean mass of the items
picked up.
The data set has been presented in
a grouped frequency distribution. The first row of the table contains
the masses of the items, which have been organized into classes written as 25-,
35-, and so on, up to 65-. The second row of the table
contains the frequencies for each class, and we can see that the total frequency is
50. We’re asked to estimate the mean
mass of the items picked up. Because the data has been
grouped, we don’t know any of the exact data values, so we can only
estimate, rather than calculate, the mean.
In general, the mean of a data set
is found by dividing the sum of all the data values by the number of values in the
data set. When we’re estimating the mean,
however, we can only estimate the sum of the data values. We do this by choosing a value that
is most representative of each class to use as an approximation for each value in
that class and then multiplying this value by the class frequency to give an
estimate of the sum of the values within each class.
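As a rough illustration of why the result is only an estimate, the Python sketch below compares the exact mean of a few raw values with the estimate obtained after grouping them. The raw values, the representative values, and the function name `estimate_mean` are hypothetical and chosen only for illustration; they are not the data from the question.

```python
def estimate_mean(representatives, frequencies):
    """Estimate a mean from one representative value and one frequency per class."""
    estimated_sum = sum(r * f for r, f in zip(representatives, frequencies))
    return estimated_sum / sum(frequencies)

# Hypothetical raw masses in kilograms, invented for this illustration only.
raw = [26, 31, 33, 38, 44]
exact_mean = sum(raw) / len(raw)

# After grouping: three values fall in the class 25 to 35 (represented by 30)
# and two values fall in the class 35 to 45 (represented by 40).
grouped_estimate = estimate_mean([30, 40], [3, 2])

print(exact_mean, grouped_estimate)
```

The two results are close but not equal, which is why the grouped figure is described as an estimate.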
The value that we use is the
midpoint of each class, which is the mean of the upper and lower class
boundaries. From the table, though, it may
appear as if we don’t know the upper boundary of each class. To work these out, we assume
that, because the data is continuous, there are no gaps between the
classes. So the upper boundary of one
class is the same as the lower boundary of the class that follows it.
The upper boundary for the first
class is therefore 35. And we can write this value in
parentheses to indicate that the first class contains all the masses that are
greater than or equal to 25 kilograms but strictly less than 35 kilograms. A mass of exactly 35 kilograms
would belong in the next class. The upper boundaries for the next
three classes are 45, 55, and 65. For the final class, it’s a little
different because there’s no class following it to tell us what the upper boundary
should be.
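The rule that a class contains its lower boundary but not its upper one can be sketched in Python as follows. The function name `class_index` is hypothetical, and the sketch assumes that any mass of 65 kilograms or more is placed in the final class, since the table gives no upper boundary for it.

```python
from bisect import bisect_right

# Lower class boundaries in kilograms, as read from the table.
LOWER_BOUNDS = [25, 35, 45, 55, 65]

def class_index(mass):
    """Return the zero-based index of the class that a mass belongs to."""
    if mass < LOWER_BOUNDS[0]:
        raise ValueError("mass is below the first class")
    # bisect_right counts how many lower boundaries are <= mass,
    # so a mass equal to a boundary lands in the class that starts there.
    return bisect_right(LOWER_BOUNDS, mass) - 1

print(class_index(34.9))  # first class (index 0)
print(class_index(35))    # second class (index 1)
```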
We make the assumption that the
final class has the same width as the previous one. So we assume it has a width of 10,
and hence its upper boundary is 75. Having completed the class
boundaries, we can now calculate the midpoints. The midpoint for the first class is
25 plus 35, all over two, which is 30. The remaining midpoints are 40, 50,
60, and 70.
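The boundary and midpoint steps above can be sketched in Python. The variable names are illustrative, and the width of the final class is an assumption, exactly as in the explanation.

```python
# Lower class boundaries in kilograms, as read from the table.
lower_bounds = [25, 35, 45, 55, 65]

# No gaps between classes: each upper boundary is the next class's lower boundary.
upper_bounds = lower_bounds[1:]

# Assumption: the final class has the same width as the class before it.
last_width = lower_bounds[-1] - lower_bounds[-2]
upper_bounds.append(lower_bounds[-1] + last_width)

# The midpoint is the mean of the lower and upper class boundaries.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_bounds, upper_bounds)]

print(upper_bounds)
print(midpoints)
```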
We now have a single value that is
representative of the data within each class. To estimate the sum of the values
in each class, we multiply each midpoint by the class frequency. With frequencies of 7, 9, 15, 10, and 9, that gives 210, 360, 750, 600, and
630. The sum of these values, which is
2,550, is our estimate of the sum of all the values in the data set. To estimate the mean of the
distribution, we now divide this estimated sum by the total frequency, which is
50. This gives 51. Our estimate of the mean mass of
the items picked up by the removal company is 51 kilograms.
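As a check on the arithmetic, here is a minimal Python sketch of the final calculation. The transcript does not read out the individual class frequencies, so the sketch recovers them by dividing each stated product by its midpoint; the variable names are illustrative.

```python
midpoints = [30, 40, 50, 60, 70]
products = [210, 360, 750, 600, 630]  # midpoint times frequency, as stated above

# Frequencies inferred from the stated products (not read from the table).
frequencies = [p // m for p, m in zip(products, midpoints)]

total_frequency = sum(frequencies)
estimated_sum = sum(m * f for m, f in zip(midpoints, frequencies))
estimated_mean = estimated_sum / total_frequency

print(frequencies, total_frequency, estimated_sum, estimated_mean)
```

The inferred frequencies add up to 50, which agrees with the total frequency given in the table.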