In this explainer, we will learn how to estimate the mean using grouped frequency tables.
Let’s begin by recapping what a grouped frequency table is.
Definition: Grouped Frequency Table
A grouped frequency table is a frequency table with data organized into smaller groups, often referred to as sets or classes.
Grouped frequency tables can be very useful when we are working with large data sets or with data sets with a large range of values. A grouped frequency table is a manageable way of representing the data. However, the disadvantage of grouped frequency tables is that we cannot extract the original data values from them. Consider the table below that represents the grades that students received in an examination.
Grade | 0– | 10– | 20– | 30– |
---|---|---|---|---|
Frequency | 1 | 5 | 5 | 4 |
The groups are given as open intervals, 0–, 10–, 20–, 30–. The data values in the first group can be considered as 0 or more, but less than 10. Its boundary values would be 0 and 10. We can observe that 1 student achieved a mark in this interval. However, we do not know the exact mark. So, when it comes to performing any statistical calculations based on a grouped frequency table, such as finding the mean, this can be more difficult. Let’s recall how to calculate the mean of a set of data.
Definition: Mean
The mean is a measure of the center of a data set. It is calculated by adding all of the data values together and then dividing by the number of data values.
So, the mean is given by
$$\text{mean} = \frac{\text{sum of the data values}}{\text{number of data values}}.$$
In the example of the examination scores above, if we knew each individual score, we could add them up and divide by 15 in order to find the mean.
Say that the individual scores were listed as
To find the mean, we would add the scores above and then divide by 15. This would give a mean of 24.
Without knowing these individual scores, we could not calculate the mean; so, instead, we find an estimate for the mean.
We do this by first finding the midpoint of each group. This gives us a single value that is representative of that particular group. If we do this for the grouping 10– above, the midpoint can be calculated by adding the group boundary values (10 and 20) and dividing by two. Hence, for the group 10–, we have
$$\frac{10 + 20}{2} = 15.$$
Therefore, the midpoint of 15 is a single representative value for the group 10–. We can calculate the midpoints of the 4 groups in the table above as 5, 15, 25, and 35, as shown below.
Grade | 0– | 10– | 20– | 30– |
---|---|---|---|---|
Frequency | 1 | 5 | 5 | 4 |
Midpoint | 5 | 15 | 25 | 35 |
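The midpoint calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `class_midpoints` is hypothetical, and the sketch assumes that the final, open-ended class has the same width as the class before it, which is what gives the class 30– an upper boundary of 40.

```python
def class_midpoints(lower_bounds):
    """Return the midpoint of each class, given only the lower boundaries.

    Assumption (not stated in the table itself): the final, open-ended
    class has the same width as the class before it.
    """
    if len(lower_bounds) < 2:
        raise ValueError("need at least two classes to infer a class width")
    # The upper boundary of each class is the lower boundary of the next one.
    upper_bounds = list(lower_bounds[1:])
    # Assumed upper boundary for the last class: reuse the previous width.
    last_width = lower_bounds[-1] - lower_bounds[-2]
    upper_bounds.append(lower_bounds[-1] + last_width)
    return [(lo + hi) / 2 for lo, hi in zip(lower_bounds, upper_bounds)]


grade_midpoints = class_midpoints([0, 10, 20, 30])
print(grade_midpoints)  # [5.0, 15.0, 25.0, 35.0]
```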
Let’s consider the values in the second group. We know that there will be 5 values that are approximately 15. If we calculated the total of these 5 approximate values, we would have $5 \times 15 = 75$. This leads us to the general rule that the sum of the values in each group can be estimated by frequency × midpoint.
Ordinarily, to find the mean of a set of data values, we would sum all of the values and divide by the number of data values. In a grouped frequency table, we will need to work out the frequency × midpoint of each group (the estimated sum of the values in each class) and find the total of these values (the estimated total of the values). We will also need to calculate the total frequency.
Grade | 0– | 10– | 20– | 30– | Total |
---|---|---|---|---|---|
Frequency | 1 | 5 | 5 | 4 | 15 |
Midpoint | 5 | 15 | 25 | 35 | - |
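Frequency × Midpoint | 5 | 75 | 125 | 140 | 345 |
An estimate for the mean can be given as follows:
$$\text{estimated mean} = \frac{\text{total of } (\text{frequency} \times \text{midpoint})}{\text{total frequency}}.$$
In the example above, the estimate for the mean can be found as follows:
$$\text{estimated mean} = \frac{345}{15} = 23.$$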
We found, using the individual scores, that the actual mean was 24. Therefore, the estimate of 23 is relatively close. Of course, if the values in a data set are skewed such that they are all lower or all higher in each class range, then the estimate will not produce an accurate reflection of the data. It is simply an estimate.
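To get a feel for how far such an estimate can be from the true mean, the sketch below places every value at the lower boundary of its class, then at the midpoint, then at the upper boundary. The function name `mean_range` is hypothetical, and the upper boundary of 40 for the class 30– is an assumption (the same class width as the other groups).

```python
def mean_range(classes):
    """classes: list of (lower, upper, frequency), with lower <= value < upper.

    Returns (lowest possible mean, midpoint estimate, upper limit of the mean).
    """
    total_f = sum(f for _, _, f in classes)
    lowest = sum(lo * f for lo, _, f in classes) / total_f
    estimate = sum((lo + hi) / 2 * f for lo, hi, f in classes) / total_f
    upper_limit = sum(hi * f for _, hi, f in classes) / total_f
    return lowest, estimate, upper_limit


# Assumed upper boundary of 40 for the open-ended class 30-.
grades = [(0, 10, 1), (10, 20, 5), (20, 30, 5), (30, 40, 4)]
lowest, estimate, upper_limit = mean_range(grades)
print(lowest, estimate, upper_limit)  # 18.0 23.0 28.0
```

Under these assumptions, the true mean must be at least 18 and less than 28. Both the estimate of 23 and the actual mean of 24 lie in this range.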
We will now see an example where we need to find the midpoints of groups in a grouped frequency table.
Example 1: Calculating the Midpoints For Classes in a Grouped Frequency Table
A number of teenagers were surveyed on how many days a month they exercise. The results are given below. Complete the table by finding the midpoints of each group.
Days | 0– | 5– | 10– | 15– | 20– | 25– |
---|---|---|---|---|---|---|
Frequency | 3 | 7 | 8 | 6 | 4 | 1 |
Midpoint (Days) | | 7.5 | | 17.5 | | |
Answer
When given a grouped frequency table, as part of a calculation to find an estimate of the mean of the data, we need to establish the midpoints of each group, or class, in the table.
Each given class in the table, 0–, 5–, 10–, 15–, 20–, and 25–, represents a given number of days per month on which the teenagers exercised. Although each group has an open interval, we can use the successive group as an upper boundary. The first group, 0–, represents days that are greater than or equal to 0 but less than 5. This is because the next class begins with values greater than or equal to 5. We do not have overlapping values in a grouped frequency table. However, we can take 0 and 5 as the lower and upper boundaries for this group.
To calculate the midpoint of a group, we add the boundary values and divide by 2:
$$\text{midpoint} = \frac{\text{lower boundary} + \text{upper boundary}}{2}.$$
The midpoint of the first group can be calculated as follows:
$$\frac{0 + 5}{2} = 2.5.$$
For the midpoint of group 10–, we use the boundary values of 10 and 15 to give
$$\frac{10 + 15}{2} = 12.5.$$
We can find the midpoint of the group 20– using the boundary values 20 and 25, as follows:
$$\frac{20 + 25}{2} = 22.5.$$
For the final group, 25–, we recognize that since there are no further groups, there is no upper boundary. To calculate its midpoint, we can assume that this group has the same class width as the other groups. As such, its upper boundary could be taken as 30. Therefore, the midpoint will be found by
$$\frac{25 + 30}{2} = 27.5.$$
We can complete the table as shown.
Days | 0– | 5– | 10– | 15– | 20– | 25– |
---|---|---|---|---|---|---|
Frequency | 3 | 7 | 8 | 6 | 4 | 1 |
Midpoint (Days) | 2.5 | 7.5 | 12.5 | 17.5 | 22.5 | 27.5 |
The missing values can be listed as 2.5, 12.5, 22.5, and 27.5.
Note that when estimating the mean of a grouped frequency distribution, a common mistake is to find the total of the midpoints and use this in place of the total frequency. We do not need to find this total or use it in our calculations.
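The following sketch illustrates this mistake using the frequencies and midpoints from the example above. The exercise itself does not ask for the mean, so the numbers here are purely an illustration, and the variable names are hypothetical.

```python
frequencies = [3, 7, 8, 6, 4, 1]
midpoints = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5]

# Estimated total number of exercise days: sum of frequency * midpoint.
total_fx = sum(f * x for f, x in zip(frequencies, midpoints))

correct = total_fx / sum(frequencies)  # divide by the total frequency (29)
mistaken = total_fx / sum(midpoints)   # wrong: divides by the total of the midpoints (90)

print(round(correct, 2), mistaken)  # 13.19 4.25
```

Dividing by the total of the midpoints gives a very different value, which is why only the total frequency belongs in the denominator.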
We will now see an example that takes us step-by-step through the calculations that we need to find an estimate for the mean.
Example 2: Completing an Extended Frequency Table to Help Calculate an Estimate for the Mean
- In an extract of a book, the number of words per sentence was counted. Find the missing numbers in the following table.
Number of Words | Frequency | Midpoint | Frequency × Midpoint |
---|---|---|---|
1–7 | 15 | 4 | 60 |
8–14 | 20 | | |
15–21 | 45 | 18 | 810 |
22–28 | 47 | 25 | 1 175 |
29–35 | 23 | 32 | 736 |
36–42 | 10 | 39 | 390 |
- Use the previous table to calculate an estimate for the mean number of words. Give your answer to two decimal places.
Answer
Part 1
The given grouped frequency table lists the number of words per sentence in a book, along with their frequencies (the number of sentences that were found with this number of words). In order to calculate an estimate for the mean in the second part of this question, the first step is to determine the midpoint of each group.
To find the midpoint, we add the two boundary values in each group and divide by 2. For example, the midpoint of the first group, 1–7, has been calculated as 4, since $\frac{1 + 7}{2} = 4$.
Therefore, the missing midpoint of the second group, 8–14, is calculated as follows:
$$\frac{8 + 14}{2} = 11.$$
We then need to find the missing value in the Frequency × Midpoint column. To calculate the values in this column, we multiply the midpoint of every group by its frequency.
Hence, for the group of 8–14, we have
$$20 \times 11 = 220.$$
Number of Words | Frequency | Midpoint | Frequency × Midpoint |
---|---|---|---|
1–7 | 15 | 4 | 60 |
8–14 | 20 | 11 | 220 |
15–21 | 45 | 18 | 810 |
22–28 | 47 | 25 | 1 175 |
29–35 | 23 | 32 | 736 |
36–42 | 10 | 39 | 390 |
The two missing values can be given as 11 and 220.
Part 2
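We can now use the table to calculate an estimate for the mean. Recall that this can only be an estimate, as we cannot establish the original data values solely from a grouped frequency table. Usually, to calculate the mean of a set of data values, we calculate
$$\text{mean} = \frac{\text{sum of the data values}}{\text{number of data values}}.$$
An estimate for the sum of the data values in a grouped frequency table is given by the total of the Frequency × Midpoint column. The total number of data values is represented by the total frequency. An estimate for the mean of values in a grouped frequency table is given by
$$\text{estimated mean} = \frac{\text{total of } (\text{frequency} \times \text{midpoint})}{\text{total frequency}}.$$
From the table, we can calculate the total frequency as
$$15 + 20 + 45 + 47 + 23 + 10 = 160.$$
The sum of the Frequency × Midpoint column is calculated as
$$60 + 220 + 810 + 1\,175 + 736 + 390 = 3\,391.$$
Hence, the estimate for the mean is given as follows:
$$\text{estimated mean} = \frac{3\,391}{160} = 21.19375.$$
Approximating this to two decimal places, we can give the answer that an estimate for the mean number of words per sentence is 21.19.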
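As a cross-check, the extended table and the estimate can be rebuilt with a short Python sketch. The class boundaries and frequencies are copied from the question; the variable names are hypothetical.

```python
rows = [
    # (lower, upper, frequency), copied from the table in the question
    (1, 7, 15),
    (8, 14, 20),
    (15, 21, 45),
    (22, 28, 47),
    (29, 35, 23),
    (36, 42, 10),
]

table = []
for lower, upper, frequency in rows:
    midpoint = (lower + upper) / 2
    table.append((lower, upper, frequency, midpoint, frequency * midpoint))

total_frequency = sum(row[2] for row in table)
total_fx = sum(row[4] for row in table)
estimated_mean = total_fx / total_frequency

for lower, upper, frequency, midpoint, fx in table:
    print(f"{lower}-{upper}: f={frequency}, midpoint={midpoint}, f*midpoint={fx}")
print(total_frequency, total_fx, round(estimated_mean, 2))  # 160 3391.0 21.19
```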
In order to find an estimate for the mean of values given in a grouped frequency table, we can apply the following steps.
How To: Calculating an Estimate for the Mean of a Grouped Frequency Table
We can calculate an estimate for the mean by following the steps below:
- Find the midpoint, $x$, of each group in the table by adding the boundary values and dividing by 2.
- Multiply the midpoints by the frequencies, $f$, of the corresponding classes to give $fx$. Adding additional rows or columns to the frequency table can be useful for recording these products.
- Find the sum of the $fx$ values, total $fx$.
- Divide this sum by the total frequency, total $f$:
  $$\text{estimated mean} = \frac{\text{total } fx}{\text{total } f}.$$
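The four steps above can be sketched as a single Python function. This is a minimal sketch rather than a definitive implementation: the function name `estimate_grouped_mean` and its parameters are hypothetical, and each class is assumed to be supplied with both of its boundary values.

```python
def estimate_grouped_mean(boundaries, frequencies):
    """Estimate the mean of grouped data.

    boundaries:  list of (lower, upper) boundary pairs, one per class
    frequencies: list of class frequencies, in the same order
    """
    if len(boundaries) != len(frequencies):
        raise ValueError("each class needs exactly one frequency")
    # Step 1: midpoint x of each class.
    midpoints = [(lower + upper) / 2 for lower, upper in boundaries]
    # Step 2: product f * x for each class.
    products = [f * x for f, x in zip(frequencies, midpoints)]
    # Step 3: total fx.
    total_fx = sum(products)
    # Step 4: divide by the total frequency, total f.
    total_f = sum(frequencies)
    if total_f == 0:
        raise ValueError("total frequency must be greater than zero")
    return total_fx / total_f


# Examination grades from the start of the explainer (upper boundary 40 assumed).
grade_estimate = estimate_grouped_mean(
    [(0, 10), (10, 20), (20, 30), (30, 40)], [1, 5, 5, 4]
)
print(grade_estimate)  # 23.0
```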
We can now see how these steps can be applied in the following example.
Example 3: Finding an Estimate for the Mean of Grouped Data
The frequency table shows the distribution of daily wages of 50 workers in a factory. Find an estimate for the mean wage received.
Wages | 25– | 35– | 45– | 55– | 65– | Total |
---|---|---|---|---|---|---|
Frequency | 7 | 9 | 15 | 10 | 9 | 50 |
Answer
To begin finding an estimate for the mean in this grouped frequency table, we first need to find the midpoint of each group. This allows us to represent the values in the group as one single number. It is helpful to add another row to the table to allow us to record these midpoints.
To find the midpoint of each group, we add the boundary values and divide by 2. The groups are given as 25–, 35–, 45–, and so on. Therefore, we can recognize that the group 25– indicates values that are 25 or more but less than 35. We can say that the boundary values of the group 25– are 25 and 35. Similarly, the boundary values of the group 35– are 35 and 45.
We can calculate the midpoint of the first group as
$$\frac{25 + 35}{2} = 30.$$
The midpoints of the next 3 groups are 40, 50, and 60. Although there is no upper limit to the last group, we can approach this by considering the class width to be the same as in the other groups. In this case, we would be finding the midpoint of a group that can be modeled as having the boundary values 65 and 75. The midpoint can be given as
$$\frac{65 + 75}{2} = 70.$$
We can fill in the midpoints as shown below. There is no need to calculate the totals of the midpoints.
Wages | 25– | 35– | 45– | 55– | 65– | Total |
---|---|---|---|---|---|---|
Frequency | 7 | 9 | 15 | 10 | 9 | 50 |
Midpoint, $x$ | 30 | 40 | 50 | 60 | 70 | |
The next stage of finding an estimate for the mean is to multiply each midpoint by the frequency of the corresponding group. The product of the frequency and midpoint of the first group would be calculated as follows:
$$7 \times 30 = 210.$$
We can add another row to the table to record these products.
We now need to add the values in this final row to find the sum of the frequency × midpoint values. This can be calculated as
$$210 + 360 + 750 + 600 + 630 = 2\,550.$$
Wages | 25– | 35– | 45– | 55– | 65– | Total |
---|---|---|---|---|---|---|
Frequency | 7 | 9 | 15 | 10 | 9 | 50 |
Midpoint, $x$ | 30 | 40 | 50 | 60 | 70 | |
Frequency × Midpoint | 210 | 360 | 750 | 600 | 630 | 2 550 |
Finally, to find an estimate for the mean, we divide the sum of the frequency × midpoint values by the total frequency. This gives us
$$\text{estimated mean} = \frac{2\,550}{50} = 51.$$
The answer is that the mean wage can be estimated to be 51.
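Another way to read this calculation is that every wage is replaced by the midpoint of its class, and then an ordinary mean is taken. The short sketch below makes that explicit; the list name `stand_in_wages` is hypothetical, and the midpoint of 70 for the last class relies on the assumed upper boundary of 75.

```python
from statistics import mean

frequencies = [7, 9, 15, 10, 9]
midpoints = [30, 40, 50, 60, 70]

# Replace every wage by the midpoint of its class: 7 copies of 30, 9 of 40, ...
stand_in_wages = []
for f, x in zip(frequencies, midpoints):
    stand_in_wages.extend([x] * f)

estimated_mean_wage = mean(stand_in_wages)
print(len(stand_in_wages), sum(stand_in_wages), estimated_mean_wage)
```

The list has 50 entries summing to 2 550, so its mean matches the estimate of 51 found above.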
We will now see another example of how to find an estimate for the mean in a real-life context.
Example 4: Finding an Estimate for the Mean of Grouped Data
The following table shows the salaries of employees in a certain company, given in Egyptian pounds (LE).
Estimate the mean salary (in LE), giving your answer approximated to two decimal places.
Answer
The given table is a grouped frequency table. To understand the boundaries of the groups, we can recognize that, in the first group, the inequality indicates salaries greater than or equal to 1 000 LE and less than 3 000 LE. Since 5 employees had a salary in this range, the frequency of this group is 5.
Because we cannot establish the exact salaries of any of the employees solely from the table, we cannot determine the exact mean. Instead, we find an estimate for the mean. To do this, we perform the following steps:
- Find the midpoint, $x$, of each group in the table by adding the boundary values and dividing by 2.
- Multiply the midpoints by the frequencies, $f$, of the corresponding classes to give $fx$. Adding additional rows or columns to the frequency table can be useful for recording these products. Here, the total frequency represents the number of employees whose salary is recorded.
- Find the sum of the $fx$ values, total $fx$.
- Divide this sum by the total frequency, total $f$:
  $$\text{estimated mean} = \frac{\text{total } fx}{\text{total } f}.$$
In this context, $f$ is the number of employees whose salary is in the given range, $x$ is the estimated mean salary of a class (the midpoint of its range), total $f$ is the total number of employees, and total $fx$ is an estimate for the sum of the salaries of the 21 employees.
We can add two additional rows to the table: one row for the midpoint, $x$, and one row for the product, $fx$ (the number of employees multiplied by $x$). We can also add an additional “Total” column for the total frequency and the total of $fx$.
We are now ready to determine the midpoint of each group by adding the boundary values and dividing by 2. These can be calculated as
We can then calculate $fx$ by multiplying each of the midpoints by the frequencies of the corresponding groups.
Next, we find the sum of the $fx$ values, total $fx$, and the total frequency, total $f$. These can be calculated as follows:
Finally, we calculate an estimate for the mean using total $fx$ and total $f$:
Rounding this to two decimal places, we can give the answer that an estimate for the mean (in LE) can be given as
Note that it is always worth checking whether our answer seems appropriate. We would expect our value to lie within the range of 1 000–9 000 LE. As it is, we can confirm that our answer has an appropriate value.
We can now see one final example.
Example 5: Finding an Estimate for the Mean of Grouped Data
The frequency table shows the distribution of grades that 50 students attained in an exam. Estimate the mean grade.
Grade | 10– | 20– | 30– | 40– | 50– | Total |
---|---|---|---|---|---|---|
Frequency | 9 | 8 | 11 | 13 | 9 | 50 |
Answer
The given grouped frequency table allows us to see a distribution of the scores that 50 students attained in an exam. Since we cannot use this table to ascertain the individual grades of all 50 students, we can only calculate an estimate of the mean grades. We do this by using the following steps.
- Find the midpoint, $x$, of each group in the table by adding the boundary values and dividing by 2.
- Multiply the midpoints by the frequencies, $f$, of the corresponding classes to give $fx$.
- Find the sum of the $fx$ values, total $fx$.
- Divide this sum by the total frequency, total $f$:
  $$\text{estimated mean} = \frac{\text{total } fx}{\text{total } f}.$$
Beginning with the midpoint of each group, we find the midpoint of the first group, 10–, by recognizing that this indicates values that are 10 or greater. As the next group begins with values of 20 or more, the first group must contain values of 10 or more but less than 20. We can take 10 and 20 to be the boundary values of the group. Hence, its midpoint will be calculated as follows:
$$\frac{10 + 20}{2} = 15.$$
We can determine the midpoints of the other groups in the same way. Although we do not know the total grades available in the examination, we assume the class width of the final group to be the same as the other classes. The boundary values can be taken as 50 and 60.
Grade | 10– | 20– | 30– | 40– | 50– | Total |
---|---|---|---|---|---|---|
Frequency, $f$ | 9 | 8 | 11 | 13 | 9 | 50 |
Midpoint, $x$ | 15 | 25 | 35 | 45 | 55 | |
Next, we find the products, $fx$. For each group, we multiply the midpoint, $x$, by the corresponding frequency.
We can calculate total $fx$ as
$$135 + 200 + 385 + 585 + 495 = 1\,800.$$
Grade | 10– | 20– | 30– | 40– | 50– | Total |
---|---|---|---|---|---|---|
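Frequency, $f$ | 9 | 8 | 11 | 13 | 9 | 50 |
Midpoint, $x$ | 15 | 25 | 35 | 45 | 55 | - |
$fx$ | 135 | 200 | 385 | 585 | 495 | 1 800 |
To calculate the mean, we substitute total $fx$ and total $f$ into the formula:
$$\text{estimated mean} = \frac{\text{total } fx}{\text{total } f} = \frac{1\,800}{50} = 36.$$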
Hence, the mean grade in the exam can be estimated as 36.
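The whole calculation, together with a quick plausibility check, can be sketched as follows. The variable names are hypothetical, and the sketch assumes that every class, including the open-ended class 50–, has the same width of 10.

```python
lower_bounds = [10, 20, 30, 40, 50]
frequencies = [9, 8, 11, 13, 9]

# Assumption: every class, including the open-ended class 50-, has width 10.
class_width = lower_bounds[1] - lower_bounds[0]
upper_bounds = [lower + class_width for lower in lower_bounds]

midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_bounds, upper_bounds)]
total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
estimate = total_fx / sum(frequencies)

# Sanity check: the estimate must lie between the lowest and highest boundary.
is_plausible = lower_bounds[0] <= estimate <= upper_bounds[-1]
print(estimate, is_plausible)  # 36.0 True
```

An estimate outside the range from 10 to 60 would signal an arithmetic slip, such as dividing by the wrong total.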
We can summarize the key points below.
Key Points
- A grouped frequency table is a frequency table with data organized into smaller groups, often called classes.
- Because we cannot determine the individual data values from a grouped frequency table, we cannot calculate the exact mean.
- We can calculate an estimate for the mean of a grouped frequency table by following the steps below:
- Find the midpoint, $x$, of each group in the table by adding the boundary values and dividing by 2.
- Multiply the midpoints by the frequencies, $f$, of the corresponding classes to give $fx$.
- Find the sum of the $fx$ values, total $fx$.
- Divide this sum by the total frequency, total $f$:
  $$\text{estimated mean} = \frac{\text{total } fx}{\text{total } f}.$$
- It is useful to add rows and/or columns to the grouped frequency table to record the $x$ and $fx$ values and the totals.