Question Video: Finding Spearman’s Correlation Coefficient for Bivariate Data Mathematics

Name: Finding Spearman’s Correlation Coefficient for Bivariate Data
Uploaded: 2020-09-29
Description: In a study of the relation between students’ grades in mathematics and science, the results were found for six students. Find the Spearman’s correlation coefficient. Round your answer to three decimal places.

Video Transcript

In a study of the relation between students’ grades in mathematics and science, the following results were found for six students. Find the Spearman’s correlation coefficient. Round your answer to three decimal places.

The Spearman’s correlation coefficient applies to what’s called bivariate data. This means data where each point is represented by two variables. In our case, we can say that each point in our data set consists of the grades of a given student. The two variables involved then are, first, that student’s grade in mathematics and then the student’s grade in science. So we do have a bivariate data set, and therefore we can calculate Spearman’s correlation coefficient.

Now, interestingly, this coefficient isn’t calculated directly from the data values themselves. Rather, it is based on what’s called the relative rank of these data. To see what that means, let’s add two more rows to our table. The first new row represents the rank of each student’s mathematics grade and the second the rank of their science grade. To fill in these rows, we’re going to assign a number to each letter grade. We’ll set it up so that the lowest grade in each row gets the lowest rank. This means that the lowest grade in a given row gets a rank of one, the next lowest gets a rank of two, and so on.

Looking at our list of mathematics grades though, right away we see there’s a difficulty. There is actually a three-way tie for the lowest mathematics grade. Technically, these are the first-, second-, and third-lowest grades. But since they’re all the same grade, it wouldn’t make sense to assign them ranks of one, two, and three. Instead, what we’ll do is take the average of these three ranks. The average of one, two, and three is two. So we’ll say that the rank of each mathematics grade of 𝐷 is two.

Looking at the next lowest mathematics grade, we see that there are two grades of 𝐵. These then are the fourth- and fifth-lowest grades in the mathematics set. But because they are the same grade, instead of assigning them different ranks, we’ll again assign each of them the average of the two rankings, four and five. The average of four and five is 4.5. Then lastly, we have the sixth-lowest grade, which is the highest grade, an 𝐴. Since we’re ranking from low to high, the rank of this grade will be six.

We’ve now ranked the mathematics grades, and we can move on to ranking the science ones. Once again, we’ll give lower grades lower ranks. We can therefore start off with the lowest grade of 𝐹. This has a rank of one. The next lowest grade is a grade of 𝐶, which three students received. These grades then occupy the second-, third-, and fourth-lowest positions. Taking the average of those rankings, we get a result of three. Next, there’s a grade of 𝐵. This is the fifth-lowest grade, so its rank is five. And lastly, there’s a grade of 𝐴, which is the sixth lowest, so its rank is six.
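As a rough sketch of the tie-averaged ranking described above, the following Python gives each grade the mean of the positions it occupies when the grades are sorted from low to high. The grade lists are an assumption: the table itself is not reproduced in this transcript, so the per-student grades are inferred from the ranks narrated above. The names `GRADE_ORDER` and `average_ranks` are hypothetical.

```python
# Assumed ordering of letter grades, lowest to highest.
GRADE_ORDER = {"F": 0, "D": 1, "C": 2, "B": 3, "A": 4}


def average_ranks(grades):
    """Rank grades from low to high; tied grades share the mean of their positions."""
    # Student indices sorted by grade, lowest grade first.
    order = sorted(range(len(grades)), key=lambda i: GRADE_ORDER[grades[i]])
    ranks = [0.0] * len(grades)
    start = 0
    while start < len(order):
        # Extend the run while the next grade equals the first grade in the run.
        end = start
        while end + 1 < len(order) and grades[order[end + 1]] == grades[order[start]]:
            end += 1
        # Positions are 1-based, so this run covers start + 1 up to end + 1.
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Grades per student, inferred from the narrated ranks (not given in the transcript).
maths = ["D", "B", "A", "B", "D", "D"]
science = ["C", "C", "B", "A", "C", "F"]

maths_ranks = average_ranks(maths)
science_ranks = average_ranks(science)
print(maths_ranks)    # [2.0, 4.5, 6.0, 4.5, 2.0, 2.0]
print(science_ranks)  # [3.0, 3.0, 5.0, 6.0, 3.0, 1.0]
```

The three-way tie for 𝐷 lands on rank two and the two grades of 𝐵 on 4.5, matching the ranks worked out by hand.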

Now that we know the relative rankings of each of these data points, we have the numbers that Spearman’s correlation coefficient is actually calculated from. Specifically, this coefficient is built from the differences between our math grade rankings and our science grade rankings. What we can do then is add yet another row to our table. This row, 𝑑 sub 𝑖, represents the differences between the math and science grade ranks. So for our first data point, the first student’s grades, that difference equals two minus three or negative one. Then for the second student, it’s 4.5 minus three or 1.5. And we continue on along the row. Six minus five is one. 4.5 minus six is negative 1.5. Two minus three is negative one. And two minus one is one.
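The rank differences can be checked with a few lines of Python. This is only a sketch: the two rank lists are copied from the values worked out above, and the variable names are hypothetical.

```python
# Ranks as worked out above, one entry per student.
maths_ranks = [2, 4.5, 6, 4.5, 2, 2]
science_ranks = [3, 3, 5, 6, 3, 1]

# d_i = mathematics rank minus science rank for each student.
d = [m - s for m, s in zip(maths_ranks, science_ranks)]
print(d)  # [-1, 1.5, 1, -1.5, -1, 1]
```

Because both rows of ranks add up to the same total, the differences should sum to zero, which makes a quick check on the arithmetic.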

At this point, we can recall the equation for Spearman’s correlation coefficient. It’s equal to one minus a fraction. The numerator of that fraction is six times the sum, that’s what the symbol 𝛴 means, of all the differences 𝑑 sub 𝑖 squared. The denominator is 𝑛, where 𝑛 is the number of data points in our data set, times the quantity 𝑛 squared minus one. We see then that to calculate this coefficient, there are two bits of information we need. First, we need to know the sum of all the 𝑑 sub 𝑖 terms squared. And second, we need to know the number of points in our data set. Because in our case we have six students’ grades, we know that 𝑛 equals six. To help us find the sum of 𝑑 sub 𝑖 squared then, we can create one final row in our table.
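Written symbolically, the equation just described might look like the following. The symbol 𝑟 sub 𝑠 for the coefficient is an assumed label that the transcript itself does not use.

```latex
r_s = 1 - \frac{6 \sum d_i^2}{n\left(n^2 - 1\right)},
\qquad
n = 6 \;\Rightarrow\; n\left(n^2 - 1\right) = 6 \times 35 = 210
```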

To fill out this row, we’ll just square the values in our 𝑑 sub 𝑖 row. Negative one squared is positive one. 1.5 squared is 2.25. One squared is one, and so on all along the row. To calculate the sum of 𝑑 sub 𝑖 squared, we’ll add up all the values in this last row. One plus 2.25 plus one plus 2.25 plus one plus one equals 8.5. Knowing this value now, we can substitute it into our equation for the Spearman’s correlation coefficient. Substituting in 8.5 for the sum of 𝑑 sub 𝑖 squared and six for 𝑛, we get one minus six times 8.5 over six times the quantity six squared minus one. Notice that in our fraction one factor of six cancels from numerator and denominator. Along with this, since six squared is 36 and 36 minus one is 35, we can rewrite our fraction as 8.5 over 35. One minus 8.5 over 35 is equal to a decimal value of 0.757142 and so on.

But we remember that we want to round our answer to three decimal places. Since the number in the fourth decimal place is less than five, our answer will round to 0.757. This is the Spearman’s correlation coefficient to three decimal places.
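The final arithmetic can be reproduced with a short Python sketch. The differences are copied from the values found above, and the variable names are hypothetical.

```python
# Rank differences d_i, as found above.
d = [-1, 1.5, 1, -1.5, -1, 1]
n = len(d)  # six students

# Sum of the squared differences.
sum_d_squared = sum(x ** 2 for x in d)

# One minus six times the sum of d_i squared, over n times (n squared minus one).
coefficient = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(sum_d_squared)          # 8.5
print(round(coefficient, 3))  # 0.757
```

This shortcut equation is exact only when no ranks are tied. With tied ranks, as here, a routine that instead computes the correlation of the two rank lists directly can report a slightly different value.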
