### Video Transcript

The table describes the exam marks
achieved by 30 students. Part a) Estimate the mean mark.

To estimate the mean from a grouped
frequency table, we need to find the midpoint of each group and multiply it by the
frequency for that group. We then find the sum of these
values and divide by the total frequency.

The midpoints of the five groups
are 10, 30, 50, 70, and 90. The total frequency is the sum of
the five individual frequencies. And it’s also given in the
question. It’s 30. The calculation for the estimate of
the mean is, therefore, equal to three multiplied by 10 plus zero multiplied by 30
plus nine multiplied by 50 plus 12 multiplied by 70 plus six multiplied by 90, all
divided by 30.

We can find the sum of these values
using a column addition method. The total is 1860. So the calculation for our estimate
of the mean is 1860 divided by 30. We can simplify this by cancelling a factor of
10 from both the numerator and denominator, giving 186 over three. We can evaluate 186 divided by
three using short division. It’s equal to 62. This is our estimate of the mean
mark.
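
If we want to check this arithmetic, a short Python sketch can repeat the same steps. The midpoints and frequencies are the ones read out above. The function name `estimate_mean` is only an illustrative choice and not part of the question.

```python
# Midpoints and frequencies as stated in the worked solution.
midpoints = [10, 30, 50, 70, 90]
frequencies = [3, 0, 9, 12, 6]


def estimate_mean(midpoints, frequencies):
    """Estimate the mean of grouped data.

    Multiply each midpoint by its frequency, add the products,
    then divide by the total frequency.
    """
    total = sum(m * f for m, f in zip(midpoints, frequencies))
    return total / sum(frequencies)


print(sum(frequencies))                       # 30 students in total
print(estimate_mean(midpoints, frequencies))  # 62.0
```

Remember that this is only an estimate, because using the midpoint assumes the marks in each group are spread evenly across it.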

Now there’s a second part to this
question. So if you need to jot down any of
the working out for the mean, pause the video and do so now.

Part b of the question is this:
John says the median may be a better way to summarize this data. Do you agree? Give a reason to support your
answer.

Let’s look at the data table more
closely. We notice that three of the
students had a score between zero and 20. And in fact, no students had a
score between 20 and 40. This means that, of the 30
students, 27 of them had a score between 40 and 100. This suggests that the three
students who scored between zero and 20 marks may in fact be outliers, which will
have a significant effect on our calculation of the mean. These low marks will bring the mean down.

However, the median is less
affected by outlying values. We need to make a comment to this
effect. So firstly, yes, we agree with
John. Why? Because the median is less affected
by outliers than the mean. However, this is only relevant if
there are indeed potential outliers in the dataset. So we also need to say that the
three low marks may be outliers.
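
To see this effect in numbers, here is a rough Python sketch that compares the estimated mean and the group containing the median, with and without the three low marks. The group boundaries (0 to 20, 20 to 40, and so on) are an assumption inferred from the midpoints and the ranges mentioned above, and the function names are illustrative only.

```python
# Group boundaries are assumed from the midpoints 10, 30, 50, 70, 90.
groups = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [3, 0, 9, 12, 6]
# The same data with the three possible outliers left out.
without_low_marks = [0, 0, 9, 12, 6]


def estimate_mean(groups, frequencies):
    """Sum of midpoint times frequency, divided by total frequency."""
    total = sum((low + high) / 2 * f for (low, high), f in zip(groups, frequencies))
    return total / sum(frequencies)


def median_group(groups, frequencies):
    """Return the group containing the middle value.

    Simple sketch: it does not treat the special case where the
    median falls exactly on the boundary between two groups.
    """
    position = (sum(frequencies) + 1) / 2
    running_total = 0
    for group, f in zip(groups, frequencies):
        running_total += f
        if running_total >= position:
            return group


print(estimate_mean(groups, frequencies))                  # 62.0
print(round(estimate_mean(groups, without_low_marks), 1))  # 67.8
print(median_group(groups, frequencies))                   # (60, 80)
print(median_group(groups, without_low_marks))             # (60, 80)
```

Under these assumptions, the three low marks pull the estimated mean down from about 67.8 to 62, while the median stays in the 60 to 80 group either way. This supports the comment that the median is less affected by the possible outliers.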