Video Transcript
Find the Spearman’s rank
correlation coefficient between the product price and its lifetime from the given
data. Round your answer to four decimal
places.
We’re given a table with lifetime
in years and price in dollars. And we’re asked to find Spearman’s
rank correlation coefficient between the paired data. We use the term paired because each
pair of values refers to a single product; for example, the product with a lifetime of
one year has a price of 79 dollars. Now to use the given formula to
calculate Spearman’s rank correlation coefficient, we need to know the number of
pairs of data 𝑛. And we need to know the difference
in ranks for each pair of data, and we then work out the sum of the differences
squared.
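The transcript refers to a given formula without reproducing it. The calculation carried out later is consistent with the standard form below, where the symbols 𝑟ₛ (the coefficient) and 𝑑ᵢ (the difference in ranks for the 𝑖th pair) are notation assumed here rather than taken from the video.

```latex
r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\left(n^{2} - 1\right)}
```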
Now, since the lifetime values are
exactly the whole numbers from one to six, with no
omissions or repeats, the lifetime data is already ranked. So we can simply use the data
itself as the rank. However, for the sake of clarity,
let’s write this down again in a new row. And next we need to rank our price
data. Since a low lifetime
has a low rank, we rank the prices in the same direction, from lowest to highest, so
that the lowest price, 79 dollars, is ranked as one. Our next lowest price is 103
dollars, which can be ranked as second. Our third lowest is 105, which is
ranked third, and so on so that 125 is ranked fourth, 160 dollars is ranked fifth,
and 214 dollars is ranked sixth.
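The ranking step can be sketched in Python as follows. The table itself is not part of the transcript, so the column order below is reconstructed from the narration and should be read as an assumption; the helper name `rank_low_to_high` is hypothetical and only handles data with no repeated values.

```python
# Data reconstructed from the narration; the column order is an assumption.
lifetime_years = [1, 5, 4, 2, 6, 3]
price_dollars = [79, 160, 125, 105, 214, 103]


def rank_low_to_high(values):
    """Rank distinct values so that the smallest value gets rank 1."""
    ordered = sorted(values)
    # Position in the sorted list (1-based) is the rank when there are no ties.
    return [ordered.index(v) + 1 for v in values]


lifetime_rank = rank_low_to_high(lifetime_years)
price_rank = rank_low_to_high(price_dollars)

print(lifetime_rank)  # [1, 5, 4, 2, 6, 3]
print(price_rank)     # [1, 5, 4, 3, 6, 2]
```

As in the narration, the lifetime ranks come out identical to the lifetime values themselves.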
Our next step is to find the
difference in ranks for each pair of data. We subtract the price rank from the
lifetime rank so that, in the first column, we have one minus one is equal to
zero. And for a lifetime of five years
and a price of 160 dollars, we have five minus five is equal to zero. Next, four minus four is equal to
zero, two minus three is negative one, six minus six is zero, and three minus two is
equal to positive one. Our next calculation is the
difference in ranks squared so that we have zero squared is zero and so on for the
rest of our differences. And now to use our formula for Spearman’s rank
correlation coefficient, we need the sum of the differences squared, that is, zero
plus zero plus zero plus one plus zero plus one, which is equal to two.
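A minimal sketch of the difference and squared-difference rows, assuming the two rank lists below match the ranks described in the narration (lifetime rank first, price rank second):

```python
# Ranks as described in the narration; the column order is an assumption.
lifetime_rank = [1, 5, 4, 2, 6, 3]
price_rank = [1, 5, 4, 3, 6, 2]

# Difference in ranks: lifetime rank minus price rank, pair by pair.
differences = [l - p for l, p in zip(lifetime_rank, price_rank)]

# Square each difference, then add the squares.
squared_differences = [d ** 2 for d in differences]
sum_d_squared = sum(squared_differences)

print(differences)          # [0, 0, 0, -1, 0, 1]
print(squared_differences)  # [0, 0, 0, 1, 0, 1]
print(sum_d_squared)        # 2
```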
It’s worth noting at this point
that if we were to sum the differences in ranks, we get zero, and this should always
be the case. In our case, we have zero plus zero
plus zero plus negative one plus zero plus positive one, and that’s equal to
zero. In order to use the formula, we
also need to know the number of data pairs, and we have six data pairs so 𝑛 is
equal to six.
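One way to see why the differences in ranks always sum to zero is sketched below. The notation 𝑅(𝑥ᵢ) and 𝑅(𝑦ᵢ) for the two ranks of the 𝑖th pair is assumed here and does not appear in the video. Each set of ranks adds up to the same total, so the two totals cancel.

```latex
\sum_{i=1}^{n} d_i
  = \sum_{i=1}^{n} R(x_i) - \sum_{i=1}^{n} R(y_i)
  = \frac{n(n+1)}{2} - \frac{n(n+1)}{2}
  = 0
```

For the six pairs here, each set of ranks sums to 21, so the check gives 21 minus 21, which is zero.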
So now making some room, we have
everything we need for our formula so that Spearman’s rank correlation coefficient
for this data is one minus six times two all over six times the quantity six squared minus
one. That is one minus 12 over six times
35, where six times 35 is 210, so we have one minus 12 over 210, which is approximately equal to one minus
0.05714. This gives us Spearman’s rank
correlation coefficient approximately equal to 0.94286. And so to four decimal places,
Spearman’s rank correlation coefficient for this data is 0.9429. Since this value is very close to
positive one, we can interpret this as a very strong direct relationship or
association between a product’s lifetime in years and its price in dollars. That is, the higher the price, the
longer the product tends to last.
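The arithmetic above can be checked with a short Python sketch. The function name `spearman_from_d_squared` is hypothetical, and the function simply evaluates one minus six times the sum of squared differences over 𝑛 times 𝑛 squared minus one, using exact fractions so that rounding only happens at the end.

```python
from fractions import Fraction


def spearman_from_d_squared(sum_d_squared, n):
    """Evaluate 1 - 6 * (sum of d squared) / (n * (n squared - 1)) exactly."""
    return 1 - Fraction(6 * sum_d_squared, n * (n ** 2 - 1))


# Values from the worked example: sum of squared differences 2, six pairs.
r_s = spearman_from_d_squared(2, 6)

print(r_s)                   # 33/35
print(round(float(r_s), 4))  # 0.9429
```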
It’s perhaps worth noting that had
our coefficient been negative at negative 0.9429, our interpretation would be the
exact opposite. In that case, we would interpret
the value as the higher the price, the shorter the lifetime. The relationship would still be
extremely strong since now negative 0.9429 is very close to negative one. But in this case, it would be an
inverse association.
Often when we have bivariate data
that we wish to find Spearman’s rank correlation coefficient for, we find that we
have tied ranks. This occurs when, in ranking data, two or more data points are
identical. Their rank is then the average of the place numbers they take up in the
ordered list. Suppose, for example, we have a
data set for the variable 𝑋 with values 20, 30, 20, 10, and five. If we wish to rank our data from
low to high, we note that five is the lowest value, so this has rank one. 10 is the next lowest value, so
this has rank two.
But now we have two values of 20 so
that the value of 20 takes up both third and fourth places in our ordered list. So we take the average of the place
numbers that these two 20s take up. That’s three plus four divided by
two and that’s equal to 3.5 so that both instances of 20 are ranked 3.5. And since third and fourth places
are now taken up, we rank our final piece of data, 30, fifth.
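The tied-rank rule can be sketched in Python as follows. The function name `average_ranks` is hypothetical; it gives each value the average of the places it occupies in the list ordered from low to high, which is the rule described above.

```python
def average_ranks(values):
    """Rank from low to high, giving tied values the average of their places."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first_place = ordered.index(v) + 1     # first place occupied by v
        places_taken = ordered.count(v)        # number of places v occupies
        last_place = first_place + places_taken - 1
        ranks.append((first_place + last_place) / 2)
    return ranks


x = [20, 30, 20, 10, 5]
print(average_ranks(x))  # [3.5, 5.0, 3.5, 2.0, 1.0]
```

Both values of 20 receive rank 3.5, and the ranks still add up to 15, the same total as the untied ranks one to five.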