Question Video: Estimating the Mean of a Frequency Distribution

Consider the following frequency distribution. Find an estimate for the mean.


Video Transcript

Consider the following frequency distribution. Find an estimate for the mean.

Let’s have a closer look at this frequency distribution. The data has been grouped into classes, which are labeled “10-”, “20-”, “30-”, and so on. We’re given the frequency, or number of items, in each class. We don’t know the exact value of the data points because they have been grouped. But we do know that in the first class, for example, there are nine pieces of data. We’re asked to find an estimate for the mean of this distribution. And it’s because we don’t know the exact data values that we can only estimate, rather than calculate, the mean.

We recall that, in general, the mean of a data set is found by dividing the sum of all the data values by how many values there are. When we’re estimating the mean, however, we need to find an estimate for the sum of all the data values. We do know the number of data values. This corresponds to the total frequency in the table, which is 50. So, let’s think about how we can estimate the sum of all the data values. We first need to find a single value that is most representative of each class. We want to choose the central value, or midpoint, of each class, which is the mean of the class boundaries.
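The idea in this paragraph can be written as a single formula. The symbols are introduced here for illustration and are not used in the video: f_i is the frequency of class i and m_i is its midpoint.

```latex
\bar{x} \approx \frac{\sum_i f_i \, m_i}{\sum_i f_i}
```

The numerator is the estimated sum of all the data values, and the denominator is the total frequency.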

From the table, it may appear as if we don’t know what the upper boundaries are due to the way the classes have been presented. To work these out, we need to assume that there are no gaps in the data. So, the lower boundary of one class is the upper boundary of the previous one. The first class then will contain all the data values that are greater than or equal to 10, but strictly less than 20. The next class will contain all the data values that are greater than or equal to 20, but strictly less than 30.

By writing the inequalities in this way, with a strict inequality at the upper boundary of each class and a weak inequality at the lower boundary, we ensure there are no gaps but also no overlaps between the classes. When we come to the final class, we have to make an assumption about its upper boundary, as there is no class that follows it. We assume this class has the same width as the class immediately before it. In this distribution, all classes have the same width of 10. And so, we assume that the final class also has a width of 10, and hence its upper boundary is 60.
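The boundary rules just described can be sketched in a few lines of Python. This is a minimal sketch, not something taken from the video: the function name `class_boundaries` is hypothetical, and the lower bounds are the class labels 10, 20, 30, 40, and 50 implied by the transcript.

```python
def class_boundaries(lower_bounds):
    """Return (lower, upper) pairs for classes given only their lower bounds.

    Assumes there are no gaps, so each class ends where the next begins,
    and that the final class has the same width as the class before it.
    """
    if len(lower_bounds) < 2:
        raise ValueError("need at least two classes to infer the final width")

    # Upper boundary of each class is the lower boundary of the next one.
    upper_bounds = list(lower_bounds[1:])

    # Assumed width of the final class: the width of the class before it.
    last_width = lower_bounds[-1] - lower_bounds[-2]
    upper_bounds.append(lower_bounds[-1] + last_width)

    return list(zip(lower_bounds, upper_bounds))


print(class_boundaries([10, 20, 30, 40, 50]))
# [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
```

Each pair is read as "greater than or equal to the first number, but strictly less than the second".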

Having found the upper boundary of each class, we’re now ready to calculate the midpoint for each class. Each midpoint is the mean of the lower and upper boundaries for that class. The first midpoint is the sum of 10 and 20, divided by two, which is 15. The remaining midpoints are 25, 35, 45, and 55. We’ve now found a single value that we can use to represent each class and hence to estimate the sum of the values in each class.
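As a quick check of these midpoints, here is a short Python sketch. The helper name `midpoints` is hypothetical, and the boundaries are the ones worked out in the transcript, including the assumed upper boundary of 60 for the final class.

```python
def midpoints(boundaries):
    """Return the midpoint of each (lower, upper) class boundary pair."""
    return [(lower + upper) / 2 for lower, upper in boundaries]


boundaries = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
print(midpoints(boundaries))
# [15.0, 25.0, 35.0, 45.0, 55.0]
```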

In the first class, there are nine data values, which all have a value close to 15. Hence, the estimated sum of the values in the first class is nine multiplied by 15, which is 135. We can estimate the sum of the values in each of the remaining classes in the same way, each time multiplying the frequency for that class by its midpoint.

To find the estimated sum of all the data values, we add together the estimated totals for each class, which gives 1,800. So, our estimate of the sum of all the data values is 1,800. To estimate the mean, we divide this estimated total by the total frequency of 50. 1,800 divided by 50 is 36.

So, by recalling the process for estimating the mean of a frequency distribution, which requires us to first find the midpoint of each class, we’ve estimated the mean of this frequency distribution to be 36.
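The whole process can be sketched in Python as follows. The function name `estimate_mean` is hypothetical. The frequency table itself is not reproduced in this transcript, so only the first frequency (9) and the totals (50 and 1,800) come from the video; the other four frequencies below are placeholder values chosen to be consistent with those totals, not the actual table.

```python
def estimate_mean(class_midpoints, frequencies):
    """Estimate the mean of grouped data from class midpoints and frequencies."""
    if len(class_midpoints) != len(frequencies):
        raise ValueError("each class needs exactly one midpoint and one frequency")

    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("total frequency must be greater than zero")

    # Estimated sum of all values: frequency times midpoint, added over classes.
    estimated_sum = sum(f * m for f, m in zip(frequencies, class_midpoints))
    return estimated_sum / total_frequency


class_midpoints = [15, 25, 35, 45, 55]
# Placeholder frequencies: only the first value (9) is stated in the transcript.
frequencies = [9, 9, 10, 12, 10]

print(sum(frequencies))                              # 50
print(estimate_mean(class_midpoints, frequencies))   # 36.0
```

With these values, the first class contributes 9 × 15 = 135 to the estimated sum, matching the figure in the transcript.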
