Video Transcript
The data shows the relation between a company’s production and its employees’ salaries over five years. Find Spearman’s correlation coefficient between production and salaries.
Our data consists of five pairs, each of which has one measurement for production and one measurement for salary. For example, in the first pair, production takes the value of 1,000 and salary takes the value of 150. To find Spearman’s correlation coefficient between our two variables, we use the formula shown. In the formula, 𝑛 is the number of data pairs, which in our case is five, and 𝑑 subscript 𝑖 is the difference in ranks for data pair 𝑖 where 𝑖 runs from one to 𝑛. To use this formula, we first need to rank the data associated with each variable separately. We can rank our data from either lowest to highest or highest to lowest as long as we stick to the same system for both variables.
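The formula shown on screen is not reproduced in this transcript. Written out, the standard form of Spearman’s rank correlation coefficient, which matches the calculation carried out later in the video, is as follows. The symbols 𝑅 subscript p and 𝑅 subscript s are the production and salary ranks introduced in the next steps.

```latex
r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\left(n^{2} - 1\right)},
\qquad d_i = R_{p,i} - R_{s,i}
```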
Let’s rank our lowest value one so it takes first place in our ranking, our next lowest two, and so on. If we call the ranks of the production data 𝑅 subscript p, then the lowest value is 1,000, and this takes the rank of one. The next lowest production value is 2,000. This takes second place and is therefore assigned the rank of two. 2,300 is our next lowest value, which we rank third. 2,500 is next, which is ranked fourth. And 4,000 is the largest, which is ranked fifth.
Now let’s do the same for the salaries data. Calling the ranks for the salary values 𝑅 subscript s, the lowest value is 150. So this is assigned the rank one. 180 is the next lowest figure. So this is assigned the rank two. The next lowest is 200, which is assigned the rank three. 250 is assigned rank four. And finally, 700 is assigned the rank of five.
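The ranking step can be sketched in Python. The data table itself is not part of this transcript, so the year-by-year order of the pairs below is an assumption, reconstructed from the rank differences quoted in the next step. The helper name `rank_ascending` is hypothetical, and it assumes there are no tied values, which holds for this data.

```python
# Year-by-year pairs. The order is an assumption reconstructed from the
# rank differences quoted in the transcript; the original table is not shown.
production = [1000, 2000, 2500, 4000, 2300]
salaries = [150, 200, 250, 700, 180]


def rank_ascending(values):
    """Rank values from lowest (rank 1) to highest. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


rank_production = rank_ascending(production)  # the ranks called R_p
rank_salaries = rank_ascending(salaries)      # the ranks called R_s

print("R_p:", rank_production)
print("R_s:", rank_salaries)
```

Ranking from highest to lowest instead would work equally well, provided the same direction is used for both variables.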
Now for our formula, we’re going to need to find the difference in the ranks for each pair. That’s 𝑑 subscript 𝑖, which is 𝑅 subscript p minus 𝑅 subscript s. So, for example, 𝑑 one is one minus one, which of course is equal to zero. So 𝑑 one is zero. Similarly, 𝑑 two is two minus three, which is negative one. So 𝑑 two is negative one. 𝑑 three is four minus four, which is equal to zero. 𝑑 four is equal to five minus five, which is also zero. And 𝑑 five is three minus two, which is equal to one. And now for our formula, we need 𝑑 subscript 𝑖 squared, which is the difference in ranks for each pair squared. For our first data pair, we have zero squared is equal to zero. For our second, we have negative one squared, which is equal to positive one. For our third pair, zero squared is zero, and the same is true for our fourth pair. And for our final pair, one squared is equal to one.
Now for our formula, remember, we want the sum of the differences squared. That’s what the 𝛴 symbol means. And the sum of our differences squared is equal to zero plus one plus zero plus zero plus one, which is two. So now we have everything we need for our formula, with 𝑛 equal to five and the sum of the differences squared equal to two. Spearman’s rank correlation coefficient is then one minus six times two over five times the quantity five squared minus one, where two is the sum of our differences squared and 𝑛 is equal to five. Evaluating our fraction, this gives us one minus 12 divided by 120. That is one minus 0.1, which is 0.9. The Spearman’s correlation coefficient between production and salaries is therefore 0.9.
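As a check on the arithmetic, here is a minimal Python sketch of the same calculation. It starts from the ranks read off in the video, in pair order, and the variable names are illustrative only.

```python
# Ranks for each data pair, as read off in the video (pair order 1 to 5).
rank_production = [1, 2, 4, 5, 3]  # R_p
rank_salaries = [1, 3, 4, 5, 2]    # R_s

n = len(rank_production)

# Difference in ranks for each pair, then each difference squared.
d = [rp - rs for rp, rs in zip(rank_production, rank_salaries)]
d_squared = [x ** 2 for x in d]
sum_d_squared = sum(d_squared)

# Spearman's formula: 1 - 6 * (sum of d squared) / (n * (n squared - 1)).
r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print("d:", d)
print("d squared:", d_squared)
print("sum of d squared:", sum_d_squared)
print("r_s:", round(r_s, 4))
```

The denominator is 5 × (25 − 1) = 120 and the numerator is 6 × 2 = 12, so the script reproduces the 0.9 found by hand.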
Now, 𝑟 𝑠 is a measure of how the ranks of the data associated with the two variables agree or disagree, and it can take values from negative one to positive one. A Spearman’s correlation coefficient with the value of positive one represents perfect agreement between the rankings. A Spearman’s coefficient of negative one represents entirely opposing rankings. And a Spearman’s correlation coefficient close to zero means that the rankings neither agree nor disagree. In our case, our coefficient has a value of 0.9. And since this is very close to positive one, this means our rankings are in very strong agreement.
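To see the two extremes for five pairs, the formula can be applied to identical rankings and to fully reversed rankings. This is a small illustrative sketch; the function name `spearman_from_ranks` is hypothetical and it assumes ranks without ties.

```python
def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's coefficient from two lists of ranks. Assumes no ties."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


same = spearman_from_ranks([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
opposite = spearman_from_ranks([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
ours = spearman_from_ranks([1, 2, 4, 5, 3], [1, 3, 4, 5, 2])

print("identical rankings:", same)
print("reversed rankings:", opposite)
print("production vs salaries:", round(ours, 4))
```

Identical rankings give every difference as zero, so the coefficient is positive one. Reversed rankings give a sum of squared differences of 40, so the coefficient is one minus 240 over 120, which is negative one.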
Making some room then, we can write down an interpretation of our coefficient. We say that since 𝑟 𝑠 is close to one, the ranks are in strong agreement. And we can conclude that high production is associated with high salaries and vice versa. We could say then that the years in which the company spent more on salaries tended to be the years with better production figures, which, of course, is what they would hope for.