Question Video: Choosing the Most Likely Correlation Coefficient Based on a Scatter Plot

What is the most likely value of the product moment correlation coefficient for the data shown in the diagram? [A] −0.58 [B] 0 [C] −0.94 [D] 0.78 [E] 0.37

Video Transcript

What is the most likely value of the product-moment correlation coefficient for the data shown in the diagram? Is it (A) negative 0.58, (B) zero, (C) negative 0.94, (D) 0.78, or (E) 0.37?

In estimating Pearson’s correlation coefficient from a scatter plot, there are two things we look at. The first is the direction of the linear pattern, which in our case is top left to bottom right. And the second thing is the spread of the data points around a possible line of best fit, that is, how close our data points are to that line. Generally speaking, we know that if the data follow a linear pattern from bottom left to top right, then we have positive or direct correlation. Conversely, if our data follow a linear pattern from top left to bottom right, we say our data are negatively or inversely correlated. And if our data are directly, that is, positively, correlated, then our coefficient is between zero and one, whereas if our data are inversely correlated, the coefficient is between negative one and zero.
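To make the link between direction and sign concrete, here is a minimal Python sketch that computes the product-moment correlation coefficient from its definition. The function name `pearson_r` and the data values are hypothetical, chosen only for illustration; they are not read off the diagram.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, not taken from the diagram.
xs = [1, 2, 3, 4, 5, 6]
rising = [2, 3, 5, 4, 7, 8]    # pattern from bottom left to top right
falling = [8, 7, 4, 5, 3, 2]   # pattern from top left to bottom right

r_rising = pearson_r(xs, rising)    # positive: between 0 and 1
r_falling = pearson_r(xs, falling)  # negative: between -1 and 0
print(r_rising, r_falling)
```

Because `falling` is just `rising` in reverse order, the two coefficients have the same size and opposite signs.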

In our case, our linear pattern is from top left to bottom right, so ours is the second case. This means our correlation coefficient must be between negative one and zero. And this means we can eliminate both (D) and (E) since these are both positive. And now if we look at the spread of the data, we know that the wider the spread away from a potential line of best fit, the weaker the correlation and that the closer the data points are to a potential line of best fit, the stronger the correlation. We know that Pearson’s correlation coefficient takes values from negative one to positive one and that the closer the coefficient is to positive or negative one, the stronger the correlation. And we know that the closer the correlation coefficient gets to zero, the weaker the correlation.
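The effect of spread on the strength of the correlation can be sketched in the same way. The example below assumes a hypothetical downward line y = 10 − x and moves the points off it by a fixed pattern of small or large offsets; none of these values come from the diagram.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: points scattered around the line y = 10 - x.
xs = list(range(1, 11))
offsets = [1, -1, 1, -1, -1, 1, -1, 1, 1, -1]  # fixed up/down pattern

# Small offsets keep the points close to the line; large offsets spread them out.
tight = [10 - x + 0.3 * d for x, d in zip(xs, offsets)]
loose = [10 - x + 3.0 * d for x, d in zip(xs, offsets)]

r_tight = pearson_r(xs, tight)  # close to -1: strong negative correlation
r_loose = pearson_r(xs, loose)  # nearer to 0: weaker negative correlation
print(r_tight, r_loose)
```

Both coefficients are negative because the underlying pattern slopes downward, but the tightly clustered points give a value much nearer to negative one.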

In the given plot, most of the data points are very close to a possible line of best fit. And remembering that our coefficient is negative, this means our coefficient must be close to negative one. We can eliminate (B) since we know that a correlation coefficient of zero means there’s no linear correlation at all, and we have a very strong correlation. And so we’re left with option (A) and option (C). Option (A) with the value negative 0.58 would indicate a moderate correlation. That’s because it’s just over halfway between zero and negative one. And since our correlation is very strong, we can eliminate option (A). Option (C) is the closest to negative one with a value of negative 0.94. So the most likely value of the product-moment correlation coefficient for the data shown is option (C), negative 0.94.
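The elimination steps used here can be written as a short Python sketch. The option values are those given in the question; the variable names and the rule "pick the remaining value closest to negative one" are an illustrative way of encoding the reasoning, not a general method for reading a coefficient off a plot.

```python
# Answer options from the question.
options = {"A": -0.58, "B": 0.0, "C": -0.94, "D": 0.78, "E": 0.37}

# Step 1: the pattern runs from top left to bottom right, so the coefficient
# lies between -1 and 0. This removes the positive options (D) and (E).
not_positive = {label: r for label, r in options.items() if r <= 0}

# Step 2: the points lie very close to a line of best fit, so the correlation
# is very strong. Keep the remaining value closest to -1.
best = min(not_positive, key=lambda label: abs(not_positive[label] - (-1)))

print(sorted(not_positive))  # options left after step 1
print(best, options[best])   # chosen option and its value
```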