Question Video: Finding the Equation of a Regression Line of a Linear Regression Model Mathematics

The table shows the relation between the variables 𝑥 and 𝑦. Find the equation of the regression line in the form 𝑦 hat = 𝑎 + 𝑏𝑥. Approximate 𝑎 and 𝑏 to 3 decimal places.

Video Transcript

The table shows the relation between the variables 𝑥 and 𝑦. Find the equation of the regression line in the form 𝑦 hat is equal to 𝑎 plus 𝑏𝑥. Approximate 𝑎 and 𝑏 to three decimal places.

In this question, we’re given a table of data points which show a relationship between two variables, the variable 𝑥 and the variable 𝑦. We need to use this table to find the equation of the regression line linking 𝑥 and 𝑦. That’s the line of best fit. We’re told to give our answer in the form 𝑦 hat is equal to 𝑎 plus 𝑏𝑥. We’re also told we only need to approximate the values of 𝑎 and 𝑏 to three decimal places.

To answer this question, let’s start by recalling how we find the least squares regression line linking two variables 𝑥 and 𝑦. We can use the following formula. 𝑏 will be equal to 𝑠 sub 𝑥𝑦 divided by 𝑠 sub 𝑥𝑥, where 𝑠 sub 𝑥𝑦 is a measure of the covariance of 𝑥 and 𝑦 and 𝑠 sub 𝑥𝑥 is a measure of the variance of 𝑥. And our value of 𝑎 is going to be equal to 𝑦 bar minus 𝑏 times 𝑥 bar, where 𝑥 bar is the mean 𝑥-value and 𝑦 bar is the mean 𝑦-value. Of course, this alone is not quite enough to find the values of 𝑎 and 𝑏. We also need the formulae for 𝑠 sub 𝑥𝑦, 𝑠 sub 𝑥𝑥, 𝑦 bar, and 𝑥 bar.

First, we recall 𝑠 sub 𝑥𝑥 is equal to the sum of 𝑥 squared minus the sum of 𝑥 all squared over 𝑛 and 𝑠 sub 𝑥𝑦 is the sum of 𝑥 times 𝑦 minus the sum of 𝑥 times the sum of 𝑦 over 𝑛. Similarly, we know how to find the average value of 𝑥 and 𝑦. The mean value of 𝑥 will be the total of all of our data points of 𝑥 divided by the number of data points. That’s the sum of 𝑥 over 𝑛. Similarly, the mean value of 𝑦 will be the sum of 𝑦 over 𝑛.
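Written out in symbols, the formulae just described in words look like this. The capital 𝑆 and the summation notation are a notational choice made here, not something fixed by the transcript.

```latex
\[
\hat{y} = a + bx, \qquad
b = \frac{S_{xy}}{S_{xx}}, \qquad
a = \bar{y} - b\,\bar{x},
\]
\[
S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \qquad
S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n}, \qquad
\bar{x} = \frac{\sum x}{n}, \qquad
\bar{y} = \frac{\sum y}{n}.
\]
```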

We’re now ready to start finding the equation of our regression line. However, there’s a lot to take in. Although this seems very complicated, there are only five things we need to find. We need to find the value of 𝑛, the sum of 𝑥, the sum of 𝑦, the sum of 𝑥 times 𝑦, and the sum of 𝑥 squared. Once we’ve found these five values, we just need to substitute them into our formulae to find the values of 𝑎 and 𝑏.

We’ll do these one at a time. Let’s start with the value of 𝑛. 𝑛 is the number of data points. We can actually see this directly from our table. We can see that there are only six data points in this example. So our value of 𝑛 is equal to six. We can also find the sum of 𝑥 and the sum of 𝑦 from our table. Let’s start with the sum of 𝑥. We just need to add all of the 𝑥-values in our table together. So in this case, the sum of 𝑥 is 10 plus 22 plus 22 plus 13 plus 16 plus 21. And we can calculate this. We see that it’s equal to 104.

And we can then do exactly the same thing to find the sum of 𝑦. We just want to add all of the values of 𝑦 in our table together. So in this case the sum of 𝑦 is going to be 25 plus 18 plus 24 plus 25 plus 12 plus 17. And then we can calculate this. We get that the sum of 𝑦 is equal to 121. This means we’ve so far managed to find the values of 𝑛, the sum of 𝑥, and the sum of 𝑦. We have two more things we need to calculate. Next, let’s find the value of the sum of 𝑥 squared. To find the sum of 𝑥 squared, we need to square all of our values of 𝑥 in the table and then add these together. So from our table, we get the sum of 𝑥 squared is equal to 10 squared plus 22 squared plus 22 squared plus 13 squared plus 16 squared plus 21 squared. And if we calculate this, we get 1934.

There’s only one more thing we need to calculate. We need to find the sum of 𝑥 multiplied by 𝑦. This is more tricky because we need to find the product of 𝑥 and 𝑦 for each of the data points in our table. So let’s start with the first column in our table. The 𝑥-value is 10, and the 𝑦-value is 25. We need to multiply these together to get 10 times 25. We need to do the same with the second column in our table. The 𝑥-value is 22, and the 𝑦-value is 18. We need to multiply these together to get 22 times 18. And we need to add this to our previous product.

And we need to follow this process for all of the columns in our table, giving us the following expression for the sum of 𝑥𝑦: 10 times 25 plus 22 times 18 plus 22 times 24 plus 13 times 25 plus 16 times 12 plus 21 times 17. And if we calculate this expression, we see we get the sum of 𝑥𝑦 is equal to 2048. Now that we’ve found the value of 𝑛, the sum of 𝑥, the sum of 𝑦, the sum of 𝑥 squared, and the sum of 𝑥 times 𝑦, we’re ready to find the values of 𝑎 and 𝑏. And it’s worth pointing out we should always start with the value of 𝑏 because we need the value of 𝑏 to find the value of 𝑎. And of course to find the value of 𝑏, we first need to find 𝑠 sub 𝑥𝑥 and 𝑠 sub 𝑥𝑦.
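The five quantities found so far can be checked with a few lines of Python. This is a minimal sketch: the lists `xs` and `ys` are reconstructed from the values read out in the transcript, and the variable names are illustrative only.

```python
# Data points as read out in the transcript (the table itself is not reproduced here).
xs = [10, 22, 22, 13, 16, 21]
ys = [25, 18, 24, 25, 12, 17]

n = len(xs)                                   # number of data points
sum_x = sum(xs)                               # sum of x
sum_y = sum(ys)                               # sum of y
sum_x2 = sum(x * x for x in xs)               # sum of x squared
sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of x times y, pair by pair

print(n, sum_x, sum_y, sum_x2, sum_xy)  # 6 104 121 1934 2048
```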

Let’s start with 𝑠 sub 𝑥𝑥. First, we need the sum of 𝑥 squared. And we know this is equal to 1934. Then we need to subtract the sum of 𝑥 all squared divided by 𝑛. Remember the sum of 𝑥 is equal to 104. We then need to square this, and we need to divide this by the value of 𝑛, which is six. Therefore, we’ve shown that 𝑠 sub 𝑥𝑥 is equal to 1934 minus 104 squared over six. And we can calculate this. It’s equal to 394 divided by three. And it’s important we find this value exactly because rounding before the end might make our final answer incorrect.

We can then do exactly the same to find 𝑠 sub 𝑥𝑦. We need to substitute in our values for the sum of 𝑥𝑦, the sum of 𝑥, the sum of 𝑦, and 𝑛. Doing this, we get that 𝑠 sub 𝑥𝑦 is equal to 2048 minus 104 times 121 over six. And if we calculate this expression exactly, we get negative 148 divided by three. And now we can find the value of 𝑏. Remember, this is the quotient of these two values. 𝑏 is equal to 𝑠 sub 𝑥𝑦 divided by 𝑠 sub 𝑥𝑥. So in this case, 𝑏 is equal to negative 148 over three divided by 394 over three. And we could evaluate this by using our calculator. Or we could remember that to divide by a fraction, we can instead multiply by its reciprocal.

Either way, we get our value of 𝑏 is negative 74 divided by 197. And remember, it’s important to find this value exactly. We’ll round our values at the end. We’re now ready to find the value of 𝑎, but let’s first clear some space. To find the value of 𝑎, we first need to find the mean value of 𝑥 and the mean value of 𝑦. Let’s start with the mean value of 𝑥. That’s the total value of 𝑥 divided by the number of data points, in this case, 104 over six. And in this case, we can cancel the shared factor of two in the numerator and denominator to get that 𝑥 bar is equal to 52 over three.
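To keep every value exact, as the transcript recommends, one option is Python’s `fractions.Fraction`. The sketch below starts from the five sums already found; the variable names are illustrative.

```python
from fractions import Fraction

# The five quantities found from the table.
n, sum_x, sum_y, sum_x2, sum_xy = 6, 104, 121, 1934, 2048

# Exact arithmetic: nothing is rounded at this stage.
s_xx = sum_x2 - Fraction(sum_x ** 2, n)
s_xy = sum_xy - Fraction(sum_x * sum_y, n)
b = s_xy / s_xx
x_bar = Fraction(sum_x, n)

print(s_xx, s_xy, b, x_bar)  # 394/3 -148/3 -74/197 52/3
```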

We can do the same to find the mean value of 𝑦. That’s the total value of 𝑦 divided by the number of data points, in this case, 121 over six. And this fraction doesn’t simplify any further. We’re now ready to find the value of 𝑎. Remember 𝑎 is equal to the mean value of 𝑦 minus 𝑏 multiplied by the mean value of 𝑥. So in our case, 𝑎 is going to be equal to 121 over six minus negative 74 over 197 multiplied by 52 over three. And we can simplify this expression. First, subtracting a negative number is the same as just adding a positive number. Next, we can simplify the second term in this expression by just multiplying the numerators and multiplying the denominators.

This gives us 121 over six plus 3848 divided by 591. And now we could find our value of 𝑎 exactly as a fraction. However, it’s not necessary to answer this question since we only need to find the values of 𝑎 and 𝑏 to three decimal places. So we’ll write this as a decimal expansion. 𝑎 is equal to 26.6776, and this expansion continues. We want to round this to three decimal places. So we need to look at the fourth decimal place in our expansion. This is equal to six, which is greater than or equal to five. So we need to round up. This gives us that 𝑎 is equal to 26.678 to three decimal places.
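The same calculation of 𝑎 can be sketched in Python, again with `fractions.Fraction` so that rounding only happens in the final formatting step. The exact fraction printed first is an extra that the transcript skips as unnecessary.

```python
from fractions import Fraction

b = Fraction(-74, 197)     # exact slope found earlier
x_bar = Fraction(104, 6)   # mean of x, reduces to 52/3
y_bar = Fraction(121, 6)   # mean of y

a = y_bar - b * x_bar      # 121/6 + 3848/591

print(a)                    # 10511/394
print(f"{float(a):.3f}")    # 26.678
```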

We can do exactly the same with our value of 𝑏. We write out the value of 𝑏 as a decimal expansion. We get negative 0.3756, and this expansion continues. We want this to three decimal places. So we need to look at the fourth decimal place. This is also equal to six. So we’re going to need to round up. So to three decimal places, our value of 𝑏 is negative 0.376.

But we’re not done yet. Remember, the question wants us to give our answer as the equation of a line in the form 𝑦 hat is equal to 𝑎 plus 𝑏𝑥. Substituting the values of 𝑎 and 𝑏 into this equation and writing the 𝑥-term first, we get 𝑦 hat is equal to negative 0.376𝑥 plus 26.678, which is our final answer. Therefore, in this question, we were able to find the least squares regression line between the variables 𝑥 and 𝑦. And we found the values of 𝑎 and 𝑏 to three decimal places. We got that 𝑦 hat will be equal to negative 0.376𝑥 plus 26.678.
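As an independent check on the final answer, the slope and intercept can be recomputed from the raw data using the centred form of the sums, 𝑠 sub 𝑥𝑦 as the sum of (𝑥 minus 𝑥 bar) times (𝑦 minus 𝑦 bar). This form is algebraically equivalent to the one in the transcript but is not the method used there. The sketch uses ordinary floating-point arithmetic, and the data lists are reconstructed from the values read out above.

```python
# Data points as read out in the transcript.
xs = [10, 22, 22, 13, 16, 21]
ys = [25, 18, 24, 25, 12, 17]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Centred sums: equivalent to the formulas used in the transcript.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)

b = s_xy / s_xx          # slope
a = y_bar - b * x_bar    # intercept

equation = f"y_hat = {a:.3f} + ({b:.3f})x"
print(equation)  # y_hat = 26.678 + (-0.376)x
```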
