The scatter diagram shows the scores that 10 students received on their science test compared to the number of lessons that they missed. a) State the coordinates of the point that is an outlier.
Scatter graphs are a good way of displaying two sets of data to see if there is a correlation or a connection. For this reason, scatter diagrams often have a pattern. We call a data point an outlier if it doesn’t fit this pattern.
We can see that the majority of the points lie roughly on a line. The circled point, though, does not fit in with this pattern, so we can assume that it is the outlier. Let’s see if we can work out its coordinates.
Remember, when we list coordinates, we list the 𝑥-coordinate (that’s the horizontal coordinate) before the 𝑦-coordinate (the vertical coordinate). We can see that the horizontal coordinate is six and the vertical coordinate is 70. So the coordinates of the point that’s an outlier are six, 70.
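We picked out the outlier by eye, but the same idea can be sketched in code. The example below is only an illustration: the data values are made up (they are not read from the diagram), the helper names `fit_line` and `find_outlier` are hypothetical, and it swaps our by-eye judgment for a least-squares line, flagging the point that sits furthest from it vertically.

```python
# Hypothetical data: (lessons missed, test score) for 10 students.
# These values are invented for illustration, not taken from the diagram.
points = [(0, 98), (1, 92), (2, 80), (3, 72), (4, 58),
          (5, 52), (6, 70), (7, 28), (8, 22), (9, 8)]


def fit_line(pts):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def find_outlier(pts):
    """Return the point with the largest vertical distance from the fitted line."""
    slope, intercept = fit_line(pts)
    return max(pts, key=lambda p: abs(p[1] - (slope * p[0] + intercept)))


outlier = find_outlier(points)
print(outlier)  # (6, 70)
```

With this made-up data, the point that doesn’t fit the pattern is six, 70, matching what we read from the diagram.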
b) Ignoring the outlier, part i) draw the line of best fit for the remaining points.
We can only draw a line of best fit if there is a pattern. And a pattern indicates there’s a correlation — that’s a relationship — between the two variables, in this case a relationship between the scores that they received and the number of lessons that they missed.
The line of best fit is a straight line that passes as close as possible to the plotted points. We’re aiming to have approximately the same number of points on either side of the line. It does not need to pass through the origin, but it should have roughly the same steepness as our points.
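In the exam, we draw this line by eye. As a rough check on the idea, here is a minimal sketch that uses a least-squares fit in place of the by-eye line and then counts how many points fall on either side. The data values are hypothetical, with the outlier already left out, and `fit_line` is just an illustrative helper name.

```python
# Hypothetical (lessons missed, score) pairs with the outlier removed.
# These values are invented for illustration, not taken from the diagram.
remaining = [(0, 98), (1, 92), (2, 80), (3, 72), (4, 58),
             (5, 52), (7, 28), (8, 22), (9, 8)]


def fit_line(pts):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(remaining)

# Count the points lying above and below the fitted line.
above = sum(1 for x, y in remaining if y > slope * x + intercept)
below = sum(1 for x, y in remaining if y < slope * x + intercept)

print(round(slope, 3), round(intercept, 3))
print(above, below)
```

For this made-up data, the fitted line slopes downwards, does not pass through the origin, and has four points above it and five below, which is the kind of balance we’re aiming for when drawing by eye.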
Part ii) Describe the type of correlation.
The question has specifically told us to describe the type of correlation. It’s, therefore, not enough to say that the more lessons that were missed, the lower the scores the students achieved.
There are three types of correlation that we’re interested in. The first is positive correlation. When two things have a positive correlation, that means as one increases, so does the other. This looks like the points are sloping upwards. The second is negative correlation: as one increases, the other decreases. This looks like the points are sloping downwards. And then, there is no correlation, where the points don’t form any sort of real pattern.
In this case, we can see that our points slope downwards, as does our line of best fit. That means we have a negative correlation. Now, it is enough just to write the word negative when asked to describe the type of correlation. But it’s good to get into the habit of writing the word correlation as well, because if we had been asked to describe the relationship between the two variables, negative correlation would get us the mark and negative on its own wouldn’t.
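The three types of correlation can also be told apart numerically. The sketch below is one possible way to do it, not the method used in the question: it computes Pearson’s correlation coefficient and looks at its sign. The data values, the function names, and the cutoff of 0.3 for "no correlation" are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_correlation(r, cutoff=0.3):
    """Name the type of correlation; the cutoff is an arbitrary illustrative choice."""
    if r >= cutoff:
        return "positive correlation"
    if r <= -cutoff:
        return "negative correlation"
    return "no correlation"


# Hypothetical data with the outlier removed (not read from the diagram).
lessons_missed = [0, 1, 2, 3, 4, 5, 7, 8, 9]
scores = [98, 92, 80, 72, 58, 52, 28, 22, 8]

description = describe_correlation(pearson_r(lessons_missed, scores))
print(description)  # negative correlation
```

Notice that the function returns the full phrase, with the word correlation included, which is the habit we want to get into.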
c) Another student in the class missed six lessons. Estimate the score they received.
This is sometimes called interpolation. We’ll begin by drawing a line going up from six lessons to the line of best fit. At the point where we hit the line of best fit, we then go across to the score. For our line of best fit, that’s 40 marks.
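Going up to the line and then across is the graphical version of substituting into the line’s equation. Here is a minimal sketch, assuming a line of best fit with a slope of negative 10 and an intercept of 100. These two values are hypothetical, chosen only so that the line gives 40 marks at six lessons; they are not measured from the diagram.

```python
# Hypothetical line of best fit: score = INTERCEPT + SLOPE * lessons_missed.
# Both constants are assumptions for illustration, not read from the diagram.
SLOPE = -10
INTERCEPT = 100


def estimate_score(lessons_missed):
    """Read the estimated score off the line for a given number of missed lessons."""
    return INTERCEPT + SLOPE * lessons_missed


estimate = estimate_score(6)
print(estimate)  # 40
```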
Remember, though, that since we drew the line of best fit by eye, we may get an answer either side of this: 39 or 41. In this case, we would need to be really careful with the scale. We can see that five little squares represent 10 marks. We can divide 10 by five, and that tells us that one of these little squares is equal to two marks. And in turn, half a square would be one mark.
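The scale arithmetic can be written out as a short sketch. The function name `read_score` is hypothetical; the only facts it uses are the ones from the graph, namely that five little squares cover 10 marks.

```python
MARKS_PER_BLOCK = 10    # marks covered by one labelled interval on the score axis
SQUARES_PER_BLOCK = 5   # little squares in that interval

marks_per_square = MARKS_PER_BLOCK / SQUARES_PER_BLOCK  # 10 / 5 = 2.0


def read_score(gridline_value, squares_above):
    """Convert a position on the score axis to marks.

    gridline_value: the labelled value at or below the reading
    squares_above:  little squares above that gridline (negative if below)
    """
    return gridline_value + squares_above * marks_per_square


print(read_score(40, 0))     # 40.0
print(read_score(40, 0.5))   # 41.0
print(read_score(40, -0.5))  # 39.0
```

So a line drawn half a little square higher or lower at six lessons would give 41 or 39 rather than 40.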
For our line of best fit though, we can estimate that the score they received was 40 marks.