# Lesson Worksheet: Least Squares Regression Line

In this worksheet, we will practice finding and using the least squares regression line equation.

**Q1:**

The scatterplot shows a set of data for which a linear regression model appears appropriate.

The data used to produce this scatterplot is given in the table shown.

$x$ | 0.5 | 1 | 1.5 | 2 | 2.5 | 3 | 3.5 | 4 |
---|---|---|---|---|---|---|---|---|
$y$ | 9.25 | 7.6 | 8.25 | 6.5 | 5.45 | 4.5 | 1.75 | 1.8 |

Calculate the equation of the least squares regression line of $y$ on $x$, rounding the regression coefficients to the nearest thousandth.

- A
- B
- C
- D
- E
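
A candidate answer to Q1 can be checked by computing the coefficients from the sums of deviations. The sketch below assumes the first row of the table is the explanatory variable $x$ and the second row is the response $y$, and writes the line as $y = a + bx$; the helper name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # S_xy and S_xx: sums of products and of squares of deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx        # slope
    a = y_bar - b * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return a, b


# Q1 data, assuming row 1 is x and row 2 is y.
xs = [0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4]
ys = [9.25, 7.6, 8.25, 6.5, 5.45, 4.5, 1.75, 1.8]

a, b = least_squares_line(xs, ys)
print(f"y = {a:.3f} + ({b:.3f})x")
```

Because the intercept is derived from the slope, any rounding of $b$ before computing $a$ changes the third decimal place; the sketch rounds only when printing.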

**Q2:**

Using the information in the table, find the regression line $y = a + bx$. Round $a$ and $b$ to 3 decimal places.

Cultivated Land in Feddan | 126 | 13 | 104 | 180 | 38 | 161 | 14 | 99 | 55 | 177 |
---|---|---|---|---|---|---|---|---|---|---|
Production of a Summer Crop in Kilograms | 160 | 40 | 80 | 340 | 260 | 200 | 280 | 280 | 140 | 100 |

- A
- B
- C
- D

**Q6:**

Using the information in the table, find the error in if . Give your answer to the nearest integer.

26 | 22 | 28 | 15 | 30 | 10 | 25 | 29 | |

5 | 4 | 12 | 7 | 14 | 10 | 13 | 15 |

**Q7:**

The table shows the price of a barrel of oil and the economic growth. Using the information in the table, estimate the economic growth if the price of a barrel of oil is 35.40 dollars.

Price of a Barrel of Oil in Dollars | 26 | 13.30 | 22.90 | 12.40 | 26.70 | 17.90 | 23.60 | 37.40 |
---|---|---|---|---|---|---|---|---|
Economic Growth Rate | 1.8 | 0.4 | 3.7 | 2.3 | 3.2 | 2.7 | 0.5 | 0.3 |

- A. 1.5
- B. 2.5
- C. 0.2
- D. 2.4
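
The estimate in Q7 comes from fitting the least squares line of growth rate on oil price and then substituting the new price. A minimal sketch, assuming price is the explanatory variable and growth rate the response (the variable names are illustrative):

```python
prices = [26, 13.30, 22.90, 12.40, 26.70, 17.90, 23.60, 37.40]
growth = [1.8, 0.4, 3.7, 2.3, 3.2, 2.7, 0.5, 0.3]

n = len(prices)
mean_price = sum(prices) / n
mean_growth = sum(growth) / n

# Slope = S_xy / S_xx, using deviations from the means.
slope = (
    sum((p - mean_price) * (g - mean_growth) for p, g in zip(prices, growth))
    / sum((p - mean_price) ** 2 for p in prices)
)
# Intercept chosen so the line passes through the point of means.
intercept = mean_growth - slope * mean_price

# Substitute the price given in the question.
estimate = intercept + slope * 35.40
print(round(estimate, 1))
```

The price 35.40 lies inside the observed range (12.40 to 37.40), so this is an interpolation rather than an extrapolation.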

**Q11:**

The latitude ($x$) and the average temperature in February ($y$, measured in °C) of 10 world cities were recorded. The calculated least squares linear regression model for this data was $y = 35.7 - 0.713x$.

What is the interpretation of the value of $-0.713$ in the model?

- A. For every additional degree of latitude, the average temperature decreased by .
- B. It is the $y$-intercept of the regression line.
- C. It is the average temperature in February for a city of latitude 0 (on the equator).
- D. For every additional 0.713 degrees of latitude, the average temperature decreased by .
- E. For every additional degree of latitude, the average temperature increased by .

What is the interpretation of the value of 35.7 in the model?

- A. For every additional degree of latitude, the average temperature increased by .
- B. For every additional 0.713 degrees of latitude, the average temperature decreased by .
- C. For every additional degree of latitude, the average temperature decreased by .
- D. It is the average temperature in February for a city of latitude 0 (on the equator).
- E. It is the gradient of the regression line.

**Q12:**

A city council is investing in improving its bus services. Over a five-year period, it collects data on the amount of money invested in each bus route ($x$, measured in hundreds of dollars) and the percent of bus services that run on time ($y$, measured in %). It finds that the data can be described by the linear regression model $y = 52.3 + 2.7x$.

What is the interpretation of the value of 2.7 in the regression model?

- A. For every additional $100 of investment, an additional of bus services run on time.
- B. It represents the percent of bus services that would run on time with no investment.
- C. It is the $y$-intercept of the regression line.
- D. For every additional $52.3 of investment, an additional of bus services run on time.

What is the interpretation of the value of 52.3 in the regression model?

- A. For every additional $100 of investment, an additional of bus services run on time.
- B. It is the gradient of the regression line.
- C. It represents the percent of bus services that would run on time with $100 of investment.
- D. It represents the percent of bus services that would run on time with no investment.

**Q13:**

The relationship between the distances jumped by competitors in the long jump ($x$) and high jump ($y$) during the women's heptathlon at the 2016 Rio Olympics can be modeled by the regression line $y = 0.483 + 0.218x$.

What is the interpretation of the value 0.218 in the regression model?

- A. For every extra meter jumped in the high jump, the competitors jumped, on average, an extra 0.218 meters in the long jump.
- B. For every extra meter jumped in the long jump, the competitors jumped, on average, an extra 0.218 meters in the high jump.
- C. This is the predicted high jump result for a competitor who jumped 0 meters in the long jump competition.
- D. It is the $y$-intercept of the regression line.

What is the interpretation of the value 0.483 in the regression model?

- A. This is the predicted high jump result, in meters, for a competitor who jumped 0 meters in the long jump competition.
- B. It is the slope of the regression line.
- C. For every extra meter jumped in the long jump, the competitors jumped, on average, an extra 0.483 meters in the high jump.
- D. It is the $y$-intercept of the regression line.
- E. This is the predicted long jump result, in meters, for a competitor who jumped 0 meters in the high jump competition.

Does the interpretation of the value 0.483 seem reasonable in the context of the data?

- A. Yes
- B. No, the model has been extrapolated a long way and is therefore unreliable.

Estimate, to the nearest hundredth of a meter, the expected high jump result for a competitor who jumped 6.03 m in the long jump competition.
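
The estimate is a direct substitution into the fitted line. A minimal sketch, assuming the model has intercept 0.483 and slope 0.218 (the two values the questions above ask about) with long jump as the explanatory variable:

```python
intercept = 0.483  # assumed: predicted high jump (m) at a long jump of 0 m
slope = 0.218      # assumed: extra high jump meters per extra long jump meter

long_jump = 6.03
predicted_high_jump = intercept + slope * long_jump
print(f"{predicted_high_jump:.2f} m")
```

A long jump of 6.03 m sits well inside the range of results recorded for this competition, so the prediction does not suffer from the extrapolation problem that affects the intercept.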

**Q14:**

A linear model was fitted to three data sets. The residual plot for each data set is shown. For which data set is a linear model appropriate?

- A. (A)
- B. (B)
- C. (C)

**Q15:**

Given the regression line , find the expected value of when .

**Q16:**

An ice cream salesman records data on the number of ice creams sold each day and the temperature at midday during the April-November period. He fits a linear regression model of the form $y = a + bx$ to the data. Would you expect the regression coefficient $b$ to be positive or negative in this context?

- A. negative
- B. positive

**Q17:**

Two variables $x$ and $y$ have a correlation coefficient of $r$, and their means and standard deviations are denoted by $\bar{x}$, $\bar{y}$ and $s_x$, $s_y$, respectively. Which of the following is the formula for calculating the slope, $b$, of the least squares regression line $y = a + bx$?

- A
- B
- C
- D
- E

**Q18:**

The scatterplot shows the high jump and long jump results achieved by 15 competitors in the women's heptathlon competition in the 2016 Rio Olympics.

Does a linear model appear to be appropriate for modeling this data set?

- A. Yes
- B. No

Would you expect the regression coefficient of this model to be positive or negative?

- A. Positive
- B. Negative

The data table shows the numerical data used to produce the scatter diagram.

Long Jump (m) | 5.51 | 5.72 | 5.81 | 5.88 | 5.91 | 6.05 | 6.08 | 6.10 | 6.16 | 6.19 | 6.31 | 6.31 | 6.34 | 6.48 | 6.58 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
High Jump (m) | 1.65 | 1.77 | 1.83 | 1.77 | 1.77 | 1.77 | 1.8 | 1.77 | 1.8 | 1.86 | 1.86 | 1.83 | 1.89 | 1.86 | 1.98 |

Representing long jump by and high jump by , find the values of , and to the nearest thousandth.

- A
- B
- C
- D
- E

Hence, calculate the equation of the regression line of $y$ on $x$.

- A
- B
- C
- D
- E
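
The two calculation steps in Q18 can be sketched as follows. The summary quantities the question asks for are not legible in this copy, so the sketch assumes they are the deviation sums $S_{xx}$, $S_{yy}$ and $S_{xy}$, with long jump as $x$ and high jump as $y$; all variable names are illustrative.

```python
long_jump = [5.51, 5.72, 5.81, 5.88, 5.91, 6.05, 6.08, 6.10,
             6.16, 6.19, 6.31, 6.31, 6.34, 6.48, 6.58]
high_jump = [1.65, 1.77, 1.83, 1.77, 1.77, 1.77, 1.8, 1.77,
             1.8, 1.86, 1.86, 1.83, 1.89, 1.86, 1.98]

n = len(long_jump)
x_bar = sum(long_jump) / n
y_bar = sum(high_jump) / n

# Assumed summary quantities: sums of squared and cross deviations.
s_xx = sum((x - x_bar) ** 2 for x in long_jump)
s_yy = sum((y - y_bar) ** 2 for y in high_jump)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(long_jump, high_jump))

# Regression line of y on x: slope from S_xy / S_xx, intercept from the means.
b = s_xy / s_xx
a = y_bar - b * x_bar

print(f"S_xx = {s_xx:.3f}, S_yy = {s_yy:.3f}, S_xy = {s_xy:.3f}")
print(f"y = {a:.3f} + {b:.3f}x")
```

Only $S_{xx}$ and $S_{xy}$ are needed for the line of $y$ on $x$; $S_{yy}$ would be used for the correlation coefficient or for the line of $x$ on $y$.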

**Q19:**

Liam conducted a statistical experiment to measure the number of goals as a function of the number of soccer games. With the number of soccer games as his independent variable and the number of goals as his dependent variable, the line of best fit had a slope of 2.28. What does this mean?

- A. For every goal, 2.28 games were played.
- B. The unit of the slope is 2.28 goals per game.
- C. The unit of the slope is 2.28 games per goal.

**Q20:**

A variable $x$ has a mean of 67.9 with a standard deviation of 3.1.

A variable $y$ has a mean of 29.3 with a standard deviation of 1.2.

Given that the correlation coefficient between $x$ and $y$ is 0.37, calculate the least squares regression line of $y$ on $x$. Round the final values for $a$ and $b$ to 3 decimal places.

- A
- B
- C
- D
- E
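
When only summary statistics are given, the slope follows from the correlation coefficient and the two standard deviations, and the intercept from the two means. A minimal sketch, assuming the first variable described (mean 67.9) is the explanatory variable $x$ and the second (mean 29.3) is the response $y$, with the line written as $y = a + bx$:

```python
r = 0.37                  # correlation coefficient
x_mean, x_sd = 67.9, 3.1  # assumed explanatory variable
y_mean, y_sd = 29.3, 1.2  # assumed response variable

# Slope of the regression line of y on x.
b = r * y_sd / x_sd
# The line passes through the point of means.
a = y_mean - b * x_mean

print(f"y = {a:.3f} + {b:.3f}x")
```

Swapping the roles of the two variables gives a different line, so the direction of the regression ("of $y$ on $x$") matters.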

**Q21:**

Use the information in the table to find the equation of the least squares regression line of $y$ on $x$. Write the equation in the form $y = a + bx$, where $a$ and $b$ are accurate to three decimal places.

| | $x$ | $y$ | $xy$ | $x^2$ | $y^2$ |
|---|---|---|---|---|---|
| 1 | 22 | 18 | 396 | 484 | 324 |
| 2 | 22 | 19 | 418 | 484 | 361 |
| 3 | 23 | 20 | 460 | 529 | 400 |
| 4 | 26 | 18 | 468 | 676 | 324 |
| 5 | 31 | 23 | 713 | 961 | 529 |
| 6 | 32 | 24 | 768 | 1,024 | 576 |
| 7 | 34 | 22 | 748 | 1,156 | 484 |
| 8 | 37 | 25 | 925 | 1,369 | 625 |
| 9 | 41 | 29 | 1,189 | 1,681 | 841 |
| 10 | 42 | 27 | 1,134 | 1,764 | 729 |
| Sum | 310 | 225 | 7,219 | 10,128 | 5,193 |

- A
- B
- C
- D
- E
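
With the column totals already computed, the coefficients can be found from the sums alone. The sketch below assumes the table columns are, in order, the observation number, $x$, $y$, $xy$, $x^2$ and $y^2$ (consistent with the first row, where $22 \times 18 = 396$, $22^2 = 484$ and $18^2 = 324$), and writes the line as $y = a + bx$.

```python
n = 10
sum_x, sum_y = 310, 225
sum_xy, sum_x2 = 7219, 10128

# Slope from the raw sums: (n*Sxy_raw - Sx*Sy) / (n*Sx2 - Sx^2).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept from the means of y and x.
a = sum_y / n - b * sum_x / n

print(f"y = {a:.3f} + {b:.3f}x")
```

The $y^2$ column is not needed for the regression line of $y$ on $x$; it would be used to compute the correlation coefficient.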

**Q22:**

Amelia hits a golf ball. She knows that the height, $h$, of the golf ball above the ground is 0 at time $t = 0$. She suspects that subsequently $h$ is a quadratic function of $t$. If Amelia is correct, plotting which of the following would give a straight line graph?

- A against
- B against
- C against
- D against
- E against
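
The idea behind Q22 is linearization: a quadratic with no constant term can be divided through by the time variable to give a linear relationship. Writing the height as $h$ and the time as $t$ (symbols assumed here), $h = pt^2 + qt$ gives $h/t = pt + q$. The sketch below checks this numerically with hypothetical coefficients `p` and `q`; it does not reproduce the answer options.

```python
# Hypothetical coefficients for h = p*t**2 + q*t.
# There is no constant term because h = 0 at t = 0.
p, q = -4.9, 20.0

times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]          # equally spaced, t > 0
heights = [p * t ** 2 + q * t for t in times]
ratios = [h / t for h, t in zip(heights, times)]  # h/t = p*t + q

# For equally spaced t, a straight-line relationship has constant steps.
ratio_steps = [r2 - r1 for r1, r2 in zip(ratios, ratios[1:])]
height_steps = [h2 - h1 for h1, h2 in zip(heights, heights[1:])]

print("steps in h/t:", [round(s, 6) for s in ratio_steps])
print("steps in h:  ", [round(s, 6) for s in height_steps])
```

The steps in $h/t$ are all equal, while the steps in $h$ are not, which is what distinguishes a straight-line plot from a curved one.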

**Q23:**

Jacob has collected data on the amount of fertilizer, , he uses for each of his tomato plants, and the amount they grow as a result, . He suspects there is a linear relationship between these two variables. If Jacob is correct, plotting which of the following would give a straight line graph?

- A against
- B against
- C against
- D against
- E against

**Q24:**

A gardener is investigating what effect the volume of weed killer used ($x$) has on the number of weeds ($y$) in her garden. She collects data and then fits a linear regression model of the form $y = a + bx$ to the data. Would you expect the regression coefficient $b$ to be positive or negative in this context?

- A. negative
- B. positive

**Q25:**

A teacher was investigating the effect of an online revision program on her students' results. She asked them to record how long they spent using the program over a weekly period and then compared this with their scores in an end-of-unit test.

The teacher then fit a regression model to the data, using $x$ to represent the number of hours spent on the revision program and $y$ to represent the score in the end-of-unit test. The equation of the fitted least squares regression model is $y = 35 + 5.36x$.

What is the interpretation of the value 5.36 in the regression model?

- A. It is the predicted score for a student who spent no time using the online revision program.
- B. It is the $y$-intercept of the regression line.
- C. For every extra hour a student spent on the online revision program, they achieved an extra 5.36 marks (on average) in the end-of-unit test.

What is the interpretation of the value 35 in the regression model?

- A. It is the slope of the regression line.
- B. For every extra hour a student spent on the online revision program, they achieved an extra 35 marks (on average) in the end-of-unit test.
- C. It is the $y$-intercept of the regression line.
- D. It is the predicted score for a student who spent no time using the online revision program.