# Video: Pack 2 • Paper 1 • Question 5

### Video Transcript

The table describes the exam marks achieved by 30 students. Part a) Estimate the mean mark.

To estimate the mean from a grouped frequency table, we need to find the midpoint of each group and multiply it by the frequency for that group. We then find the sum of these values and divide by the total frequency.

The midpoints of the five groups are 10, 30, 50, 70, and 90. The total frequency is the sum of the five individual frequencies, which is also given in the question: 30. The estimate of the mean is, therefore, three multiplied by 10, plus zero multiplied by 30, plus nine multiplied by 50, plus 12 multiplied by 70, plus six multiplied by 90, all divided by 30.

The five products are 30, 0, 450, 840, and 540. We can find the sum of these values using a column addition method. The total is 1860. So the calculation for our estimate of the mean is 1860 divided by 30, which can be simplified by cancelling a factor of 10 from both the numerator and denominator to give 186 over three. We can evaluate 186 divided by three using short division. It’s equal to 62. This is our estimate of the mean mark.
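The arithmetic above can be checked with a short script. This is only a sketch: the midpoints and frequencies are the ones read out in the video, while the function name `estimated_mean` is made up for illustration.

```python
# Midpoints and frequencies as read out in the transcript.
midpoints = [10, 30, 50, 70, 90]
frequencies = [3, 0, 9, 12, 6]


def estimated_mean(mids, freqs):
    """Estimate the mean of grouped data.

    Each midpoint is multiplied by its group's frequency, the products
    are summed, and the sum is divided by the total frequency.
    (Hypothetical helper name, used only for this illustration.)
    """
    weighted_sum = sum(m * f for m, f in zip(mids, freqs))
    return weighted_sum / sum(freqs)


weighted_sum = sum(m * f for m, f in zip(midpoints, frequencies))
total_frequency = sum(frequencies)
mean = estimated_mean(midpoints, frequencies)

print(weighted_sum, total_frequency, mean)  # 1860 30 62.0
```

The result is an estimate because every mark in a group is treated as if it sat at the group's midpoint.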

Now there’s a second part to this question. So if you need to jot down any of the working out for the mean, pause the video and do so now.

Part b of the question is this: John says the median may be a better way to summarize this data. Do you agree? Give a reason to support your answer.

Let’s look at the data table more closely. We notice that three of the students had a score between zero and 20. And in fact, no students had a score between 20 and 40. This means that, of the 30 students, 27 of them had a score between 40 and 100. This suggests that the three students who scored between zero and 20 marks may in fact be outliers, which will have a significant effect on our estimate of the mean. They will bring it down.

However, the median is less affected by outlying values. We need to make a comment to this effect. So firstly, yes, we agree with John. Why? Because the median is less affected by outliers than the mean. However, this is only relevant if there are indeed potential outliers in the dataset. So we also need to say that the three low marks may be outliers.
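To see this effect in numbers, here is a rough sketch that compares the estimated mean and the group containing the median, with and without the three low marks. The group boundaries (0 to 20, 20 to 40, and so on) are an assumption inferred from the midpoints and the ranges mentioned in the video, and the helper names are hypothetical.

```python
# Upper boundaries of the five groups: an assumption inferred from the
# midpoints (10, 30, 50, 70, 90) and the ranges mentioned in the video.
upper_bounds = [20, 40, 60, 80, 100]
midpoints = [10, 30, 50, 70, 90]
frequencies = [3, 0, 9, 12, 6]

# The same table with the three possible outliers removed.
frequencies_without_low = [0, 0, 9, 12, 6]


def estimated_mean(mids, freqs):
    """Sum of midpoint * frequency, divided by the total frequency."""
    return sum(m * f for m, f in zip(mids, freqs)) / sum(freqs)


def median_class(bounds, freqs):
    """Return (lower, upper) for the group containing the middle of the data.

    Walks through the cumulative frequencies until at least half of the
    total frequency has been reached. The first group is assumed to
    start at zero.
    """
    half = sum(freqs) / 2
    cumulative = 0
    lower = 0
    for upper, f in zip(bounds, freqs):
        cumulative += f
        if cumulative >= half:
            return (lower, upper)
        lower = upper


mean_with = estimated_mean(midpoints, frequencies)
mean_without = estimated_mean(midpoints, frequencies_without_low)

print(mean_with, round(mean_without, 1))                  # 62.0 67.8
print(median_class(upper_bounds, frequencies))             # (60, 80)
print(median_class(upper_bounds, frequencies_without_low)) # (60, 80)
```

Under these assumptions, removing the three low marks shifts the estimated mean from 62 to about 67.8, while the median stays in the same group. This is consistent with the reason given for agreeing with John.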