Video Transcript
For a given data set, the sum of 𝑥 equals 102, the sum of 𝑦 equals 1092, the sum of 𝑥 squared is 1382, the sum of 𝑦 squared is 100392, sum of 𝑥𝑦 is 8656, and 𝑛 equals 12. Find the equation of the regression line of 𝑦 on 𝑥 in the form 𝑦 equals 𝑚𝑥 plus 𝑏, giving 𝑚 and 𝑏 correct to two decimal places.
Remember, the least squares regression line for a set of bivariate data with variables 𝑥 and 𝑦 is given as 𝑦 hat equals 𝑎 plus 𝑏𝑥. 𝑏 is essentially the slope or gradient of the line of best fit and is given by 𝑆 𝑥𝑦 over 𝑆 𝑥𝑥, where 𝑆 𝑥𝑦 is found by subtracting the sum of 𝑥 times the sum of 𝑦 over 𝑛 from the sum of 𝑥𝑦. And 𝑆 𝑥𝑥 is the sum of 𝑥 squared minus the sum of 𝑥 all squared divided by 𝑛. And 𝑎 is the 𝑦-intercept. It’s 𝑦 bar minus 𝑏𝑥 bar, where 𝑦 bar and 𝑥 bar are the means of 𝑦 and 𝑥, respectively.
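Written out symbolically, the formulas just described look like this. The notation is an assumption about how the spoken formulas would appear on screen; here 𝑏 is the slope and 𝑎 is the 𝑦-intercept, as in the paragraph above.

```latex
\hat{y} = a + bx, \qquad b = \frac{S_{xy}}{S_{xx}}, \qquad a = \bar{y} - b\,\bar{x}

S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n}, \qquad
S_{xx} = \sum x^{2} - \frac{\left(\sum x\right)^{2}}{n}
```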
We’ve been given all of the values we need to be able to calculate each of these. Let’s begin with 𝑆 𝑥𝑦, which is the sum of 𝑥𝑦, that’s 8656, minus the sum of 𝑥 times the sum of 𝑦 divided by 𝑛. In other words, it’s 8656 minus 102 times 1092 over 12, which is negative 626. Similarly, 𝑆 𝑥𝑥 is 1382, that’s the sum of 𝑥 squared, minus the sum of 𝑥 all squared, which is 102 squared, divided by 12. That’s 1382 minus 867, which is 515. 𝑏 is the quotient of these, so it’s negative 626 divided by 515, or negative 1.22 correct to two decimal places.
Now that we have the slope, we can work out the value of the 𝑦-intercept by using the formula 𝑎 equals 𝑦 bar minus 𝑏 times 𝑥 bar. 𝑦 bar is the sum of all of the 𝑦-values divided by 12, so 1092 divided by 12, which is equal to 91. Similarly, the mean of 𝑥 is 102 divided by 12, which is 8.5. Then, 𝑎 is 𝑦 bar minus 𝑏 times 𝑥 bar. So it’s 91 minus negative 626 over 515 times 8.5, which correct to two decimal places is 101.33. And so the regression line can be written as 𝑦 hat equals 101.33 minus 1.22𝑥, or equivalently 𝑦 equals negative 1.22𝑥 plus 101.33. In the form 𝑦 equals 𝑚𝑥 plus 𝑏 that the question asks for, 𝑚 is negative 1.22 and 𝑏 is 101.33.
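As a quick check on the arithmetic, the whole calculation can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `regression_line` and its parameter names are hypothetical and do not come from the video. Note that it uses the unrounded slope when computing the intercept, just as the working above does.

```python
def regression_line(sum_x, sum_y, sum_x2, sum_xy, n):
    """Least squares line of y on x from summary statistics.

    Returns (s_xy, s_xx, slope, intercept). The name and signature
    are illustrative assumptions, not part of the original working.
    """
    s_xy = sum_xy - sum_x * sum_y / n      # S_xy = sum(xy) - sum(x) * sum(y) / n
    s_xx = sum_x2 - sum_x ** 2 / n         # S_xx = sum(x^2) - (sum(x))^2 / n
    slope = s_xy / s_xx                    # gradient of the regression line
    x_bar = sum_x / n                      # mean of x
    y_bar = sum_y / n                      # mean of y
    intercept = y_bar - slope * x_bar      # y-intercept, using the unrounded slope
    return s_xy, s_xx, slope, intercept


# Summary statistics given in the question
s_xy, s_xx, slope, intercept = regression_line(
    sum_x=102, sum_y=1092, sum_x2=1382, sum_xy=8656, n=12
)

print(f"S_xy = {s_xy}, S_xx = {s_xx}")
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

Rounding only at the final step matters here: substituting a slope already rounded to negative 1.22 into the intercept formula would give 101.37 rather than 101.33.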