# Question Video: Determining the Spearman’s Rank Correlation Coefficient and Type of Correlation between Two Variables

Using the information from the table, find the Spearman’s rank correlation coefficient and determine the type of correlation between the variables 𝑋 and 𝑌. Give the numerical part of your answer to four decimal places.

### Video Transcript

Using the information from the table, find the Spearman’s rank correlation coefficient and determine the type of correlation between the variables 𝑋 and 𝑌. Give the numerical part of your answer to four decimal places.

The Spearman’s rank correlation coefficient is a measure of the tendency for one variable in a bivariate dataset to increase or decrease as the other does, although not necessarily in a linear way. Calculation of this statistic doesn’t use the raw data but instead the rank or position of each value within the dataset. We’ll begin by assigning a rank to each value of the 𝑋-variable and a rank to each value of the 𝑌-variable. It doesn’t matter whether we choose the smallest or the largest value to have the rank of one, as long as we’re consistent for the two variables.

For the 𝑋-variable, the smallest data value is six, so we assign this rank one. There are then two data values of nine, so we need to understand how to deal with cases of equal data values. If we were to write out the ordered list of 𝑋-values, these two values of nine would take the second and third places in the list. As they are equal, we give both data values the average of these two ranks. So, both values of nine are given the rank of 2.5. Treating tied ranks in this way ensures that the sum of the ranks will be the same for both variables. We then assign rank four to the data value 10 and ranks five and six to the values 13 and 14, respectively.
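The tie-handling rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` is hypothetical, and the 𝑋-values are listed in sorted order as read out in the transcript, since the original table is not reproduced here.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward.

    Tied values each receive the mean of the positions they occupy
    in the ordered list.
    """
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' to cover the whole run of equal values.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # 0-based positions pos..end correspond to ranks pos+1..end+1.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# X-values as stated in the transcript (sorted order).
x_values = [6, 9, 9, 10, 13, 14]
x_ranks = average_ranks(x_values)
print(x_ranks)  # [1.0, 2.5, 2.5, 4.0, 5.0, 6.0]
```

The two values of nine occupy positions two and three, so each receives rank 2.5, and the ranks still sum to 21, the same total as the untied ranks one to six.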

Next, we perform the same process for the 𝑌-variable. This time, the two smallest values are equal, so we assign them both the same rank of 1.5, which is the average of one and two. The next two values are also equal, so we award each of these the average of three and four, which is 3.5. Finally, we award rank five to 21 and rank six to the largest value of 23. Now that we’ve assigned all of the ranks, let’s introduce the formula for calculating the Spearman’s rank correlation coefficient. It is one minus six multiplied by the sum of 𝑑 𝑖 squared over 𝑛 multiplied by 𝑛 squared minus one. Here, 𝑑 𝑖 represents the difference in the ranks for each pair of data, and 𝑛 represents the number of data pairs, which in this question is six.
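Written out symbolically, the formula just described is as follows, where the symbol for the coefficient, 𝑟 sub 𝑠, is the one used later in the calculation:

```latex
r_s = 1 - \frac{6 \sum d_i^2}{n\left(n^2 - 1\right)}
```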

The next thing we need to calculate then is the difference in ranks for each pair of data, so we’ll add another row to the table to do this. It doesn’t actually matter which way round we subtract the ranks, but let’s subtract the rank of 𝑌 from the rank of 𝑋 for consistency. First, we have six minus five, which is one, then 2.5 minus 3.5, which is negative one. The remaining differences are negative two, 3.5, negative 2.5, and one. At this point, there’s a useful check we can perform, because the sum of the differences should always be equal to zero. Summing the six values in the bottom row of the table does indeed give zero, which is consistent with the ranks having been assigned and subtracted correctly.
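The difference row and the zero-sum check can be reproduced with a short sketch. The first two rank pairs are stated in the transcript; the pairing of the remaining ranks is an assumption, inferred from the stated differences because the table itself is not shown.

```python
# Ranks in table order. Only the first two pairs are read out in the
# transcript; the rest are inferred from the stated differences.
x_ranks = [6, 2.5, 4, 5, 1, 2.5]
y_ranks = [5, 3.5, 6, 1.5, 3.5, 1.5]

# Difference in ranks for each pair: rank of X minus rank of Y.
differences = [rx - ry for rx, ry in zip(x_ranks, y_ranks)]
print(differences)  # [1, -1.0, -2, 3.5, -2.5, 1.0]

# Both sets of ranks sum to the same total, so the differences sum to zero.
print(sum(differences))  # 0.0
```

A nonzero sum would signal a slip in the ranking or the subtraction, although a sum of zero does not by itself rule out every possible error.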

Next, we need to find the square of each difference, so we can add another row to the table to do so. We’re now ready to calculate the Spearman’s rank correlation coefficient for this dataset. Summing the squared differences in the final row of the table gives 25.5. We then substitute this value for the sum of 𝑑 𝑖 squared and six for 𝑛 into the Spearman’s rank correlation coefficient formula to give 𝑟 sub 𝑠 equals one minus six multiplied by 25.5 over six multiplied by six squared minus one. Evaluating gives one minus 51 over 70, which is 19 over 70. We’re asked to give the answer to four decimal places, so evaluating this fraction as a decimal first gives 0.271428 continuing. And then rounding to four decimal places gives 0.2714.
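The arithmetic in this step can be checked with exact fractions. The sketch below takes the six rank differences stated in the transcript and applies the formula as described; the variable names are illustrative only.

```python
from fractions import Fraction

# Rank differences as stated in the transcript: 1, -1, -2, 3.5, -2.5, 1.
differences = [
    Fraction(1), Fraction(-1), Fraction(-2),
    Fraction(7, 2), Fraction(-5, 2), Fraction(1),
]
n = len(differences)  # number of data pairs

# Sum of the squared differences.
sum_d_squared = sum(d * d for d in differences)
print(float(sum_d_squared))  # 25.5

# r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
r_s = 1 - 6 * sum_d_squared / (n * (n**2 - 1))
print(r_s)                   # 19/70
print(round(float(r_s), 4))  # 0.2714
```

Because the data contain tied ranks, a routine that computes the coefficient as the Pearson correlation of the ranks may return a slightly different value; the sketch follows the formula used in this solution.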

The final part of the question asks us to determine the type of correlation that exists between the variables 𝑋 and 𝑌. This means we need to use the value of the Spearman’s rank correlation coefficient we’ve just calculated to determine whether the two variables have a tendency to increase together or for one to decrease as the other increases. We recall that the value of the Spearman’s rank correlation coefficient is always between negative one and positive one, inclusive. A positive value indicates that the two variables have a tendency to increase together, which we refer to as direct correlation. A negative value indicates that as one variable increases, the other tends to decrease, which we refer to as inverse correlation.
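The sign rule can be expressed as a small helper. The function name `correlation_type` is hypothetical, and the label returned for a coefficient of exactly zero is an assumption, since the transcript only covers the positive and negative cases.

```python
def correlation_type(r_s):
    """Classify a Spearman's rank correlation coefficient by its sign."""
    if not -1 <= r_s <= 1:
        raise ValueError("the coefficient must lie between -1 and 1 inclusive")
    if r_s > 0:
        return "direct"   # the variables tend to increase together
    if r_s < 0:
        return "inverse"  # one tends to decrease as the other increases
    return "none"         # assumption: zero treated as no monotonic correlation


print(correlation_type(0.2714))  # direct
```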

As the Spearman’s rank correlation coefficient we’ve calculated is positive, this means that there is a direct correlation between 𝑋 and 𝑌. The value is quite close to zero though, so whilst a direct correlation does exist, it is relatively weak. We’ve found then that the value of the Spearman’s rank correlation coefficient for this dataset to four decimal places is 0.2714. And hence, there is a direct correlation between 𝑋 and 𝑌.