# Video: Pack 2 • Paper 2 • Question 9

### Video Transcript

The scatter diagram shows the scores that 10 students received on their science test compared to the number of lessons that they missed. Part a) State the coordinates of the point that is an outlier. Part b) Ignoring the outlier, 1) first draw the line of best fit for the remaining points then 2) describe the type of correlation. Another student in the class missed six lessons. Part c) Estimate the score they received.

There is also a part d that we will look at later. The first part of our question asked us to identify the outlier. Well, an outlier is a point on the scatter diagram that doesn’t follow the general pattern or trend of the other points.

In this question, the circled point doesn’t follow the general trend. The coordinates of this point are (6, 70): we go along the corridor, the horizontal axis, to six and then up to 70. This means that the student missed six lessons and scored 70 in the science test.
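One way to make "doesn't follow the general trend" concrete is to measure how far each point sits from a rough trend line and pick the point that is furthest away. The Python sketch below does this under stated assumptions: apart from (6, 70), the data points and the trend line `score = 90 - 8 * lessons` are hypothetical stand-ins, not values read from the question's diagram.

```python
# Hypothetical (lessons missed, score) pairs.
# Only the point (6, 70) comes from the question itself.
points = [(0, 88), (1, 84), (2, 72), (3, 68), (4, 56),
          (5, 52), (6, 40), (6, 70), (7, 36), (7, 30)]


def trend(lessons):
    """Hypothetical rough trend line: score = 90 - 8 * lessons."""
    return 90 - 8 * lessons


def find_outlier(pts):
    """Return the point with the largest vertical distance from the trend line."""
    return max(pts, key=lambda p: abs(p[1] - trend(p[0])))


outlier = find_outlier(points)
print(outlier)  # (6, 70)
```

In the exam, the outlier is spotted by eye; the distance calculation is only a way of checking that judgement.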

The next part of our question asked us to draw a line of best fit. A line of best fit needs to have roughly the same number of points above and below the line. It must also have points above and below the line at the start and at the end of the line. Our line of best fit will look something like this.

It’s important to note that everyone’s line of best fit will be slightly different. Therefore, the mark scheme in the exam allows a margin of error. This will also affect our answer to part c.
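In the exam, the line of best fit is drawn by eye. As a rough comparison, the sketch below computes a least-squares line, which is a different technique from drawing by eye, and then counts how many points lie above and below it. The data points are hypothetical (the outlier is already left out) and are not read from the question's diagram.

```python
# Hypothetical (lessons missed, score) pairs with the outlier removed.
points = [(0, 88), (1, 84), (2, 72), (3, 68), (4, 56),
          (5, 52), (6, 40), (7, 36), (7, 30)]


def fit_line(pts):
    """Return (gradient, intercept) of the least-squares line through pts."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def count_above_below(pts, gradient, intercept):
    """Count the points strictly above and strictly below the line."""
    above = sum(1 for x, y in pts if y > gradient * x + intercept)
    below = sum(1 for x, y in pts if y < gradient * x + intercept)
    return above, below


gradient, intercept = fit_line(points)
above, below = count_above_below(points, gradient, intercept)
print(f"score = {gradient:.2f} * lessons + {intercept:.2f}")
print(f"{above} points above the line, {below} points below")
```

A hand-drawn line will not match the computed one exactly, which is why a range of answers is accepted.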

The second part of part b asked us to describe the correlation. Once we have drawn our line of best fit, we can see whether the correlation is positive or negative. If our line of best fit slopes upwards from left to right, it is a positive correlation, as the slope has a positive gradient. If our line slopes downwards from left to right, it is a negative correlation, as the gradient is negative.

In this case, we have a negative correlation. This means that the more lessons a student missed, the lower their mark in the science test. As one variable increases, the other one decreases.
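The link between the slope of the line and the type of correlation can be sketched in a few lines of Python. The two points used here, (0, 90) and (5, 50), are hypothetical readings from a line of best fit, not values taken from the question's diagram.

```python
def gradient(p, q):
    """Gradient of the straight line through points p and q (rise over run)."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)


def describe_correlation(m):
    """Name the type of correlation from the sign of the gradient m."""
    if m > 0:
        return "positive"
    if m < 0:
        return "negative"
    return "no correlation"


# Two hypothetical points read off a line of best fit.
m = gradient((0, 90), (5, 50))
print(m)                        # -8.0
print(describe_correlation(m))  # negative
```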

Part c of our question told us about another student in the class who missed six lessons. We need to use our line of best fit to estimate the score they received. Remember everyone’s line of best fit will be slightly different. So your answer here might be slightly different to ours. This will be taken into account in the mark scheme in any exam.

In order to estimate the score they received, we first need to go up from six lessons. Once we hit our line of best fit, we need to go horizontally across to the 𝑦-axis. On our diagram, this gives us an answer of 42 marks, as each little square is worth two marks and we are one square above 40. We can, therefore, estimate that a student who missed six lessons would have a science score of 42.
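Reading a value off the line of best fit is the same as substituting the number of lessons into the equation of the line. The sketch below assumes a hypothetical line with gradient -8 and intercept 90, chosen only because it agrees with the reading of 42 in the video; the real line depends on how it was drawn. The result is snapped to the nearest two marks, since each little square on the diagram is worth two marks.

```python
GRADIENT = -8          # hypothetical gradient of the line of best fit
INTERCEPT = 90         # hypothetical intercept on the score axis
MARKS_PER_SQUARE = 2   # from the transcript: each little square is two marks


def estimate_score(lessons_missed):
    """Estimate a score from the line, rounded to the nearest grid square."""
    exact = GRADIENT * lessons_missed + INTERCEPT
    return round(exact / MARKS_PER_SQUARE) * MARKS_PER_SQUARE


print(estimate_score(6))  # 42
```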

The final part of the question said the following.

Any student who scored less than 40 failed the test. Part d) Do you agree or disagree with the following statement? “Any student who missed seven lessons or more failed the test.” Justify your answer.

Well, the key bit of information here is that we’re interested in any student who missed seven lessons or more. The scatter diagram only shows students who missed seven lessons or fewer. This means that we have no data for students who missed eight or more lessons. It is also important to note that the line of best fit is just an estimate, and therefore it doesn’t show guaranteed scores.

This means that we cannot say the statement is true. So the correct answer is disagree, because the line of best fit doesn’t show guaranteed scores and we have no data for students who missed eight lessons or more.
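The point about having no data beyond seven lessons can be expressed as a simple range check: an estimate is only supported when the value lies inside the range of the observed data. The list of lessons missed below is hypothetical, apart from its maximum of seven, which matches the transcript.

```python
# Hypothetical numbers of lessons missed by the ten students.
# The largest value, 7, matches what the transcript says about the diagram.
lessons_missed = [0, 1, 2, 3, 4, 5, 6, 6, 7, 7]


def within_data(lessons, observed):
    """Return True if lessons lies inside the range of the observed data."""
    return min(observed) <= lessons <= max(observed)


for lessons in (6, 7, 8, 10):
    if within_data(lessons, lessons_missed):
        print(lessons, "lessons: inside the data, an estimate is reasonable")
    else:
        print(lessons, "lessons: outside the data, no estimate is supported")
```

The statement in part d covers eight, nine, ten lessons and so on, all of which fall outside the data.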

This question has shown us how we can use a scatter diagram to draw a line of best fit, identify the correlation, and estimate scores. However, any estimate must stay within the range of the data in the question, and our line of best fit doesn’t guarantee the result, as shown by the outlier in part a of the question.