### Video Transcript

In this video, we’re gonna look at some data presented in a scatter plot, and we’re gonna draw a line of best fit. Then we’re gonna work out an equation of that line and use it to interpret the meaning of the rate of change, or slope, and the value of the 𝑦-intercept.

The fares charged by some cabs for journeys of different lengths are shown in the table of values below. And then I’ve got a table of values showing the distance in miles and the fare in dollars of these various different journeys within the city. Now this isn’t actually a question; it’s just a statement. But what we’re gonna do is create a scatter plot, and then we’re going to try and find a line of best fit, and then we’re gonna try and interpret the meaning of that line of best fit and just sort of talk around the issue and interpret the data.

So first let’s think about our 𝑥- and 𝑦-variables. Now generally speaking, we would’ve thought that the further you went in a taxi, the more they’re going to charge. So we think the distance would be the 𝑥, the independent variable, and the fare would be 𝑦, the dependent variable. So that’s what we’re gonna use. Now with the data we’ve got, the 𝑥-variable goes up to about twenty and the 𝑦-variable goes up to fifty-five. So we’ve just drawn some axes there and labelled them up: the 𝑦-axis is the fare and the 𝑥-axis is the distance. So now we’re just gonna take all of those points individually and plot them on the axes.

So for each ordered pair, we’re taking the 𝑥-value and the 𝑦-value and using those as our 𝑥- and 𝑦-coordinates and just kind of plotting them all. Now the distribution of those points on the graph there strongly suggests a straight line. So what we’re gonna do is try to create a line that goes through as many of those points, or as close to as many of those points, as possible, with a pretty even distribution of points above and below that line. So I reckon that would look roughly like that.

So we can use this line then for making predictions about fares. So for example, if we had a journey of about five miles to do, if we draw a line going from five miles up to our line of best fit and then we map across to the 𝑦-axis, we can see that that’s gonna cost us about eighteen dollars. And likewise, if we had forty dollars to spend, how far can we get on forty dollars? So if we go across from forty dollars to the line of best fit and then map that down to the 𝑥-axis, that’s gonna get us about fourteen miles. So the line of best fit on that graph has a meaning. It’s enabling us to make predictions about the cab fare given how far we’ve travelled, or about how far we can travel given a certain fare.

Now the fact that these data points are not all exactly on that line of best fit, they’re all slightly different, is telling us that this line of best fit isn’t giving us an exact value; it’s only giving us an approximation. So the line of best fit is an approximate rule describing the relationship between the number of miles that we’re travelling and the actual cab fare. Now the fact that the points are generally quite close to that line tells us that it seems to represent quite a good approximation to the rule, and that the numbers that we’re gonna get out, the predictions we’re gonna get, are gonna be a pretty good approximation. But remember, it’s only based on the data that we’ve got for journeys between three and twenty miles, and it’s only based on eight actual cab rides as well, so it wouldn’t be sensible to necessarily expect the same rule to apply for journeys of fifty or a hundred or even a thousand miles. So within the constraints that we just talked about, the cab fare seems to approximately follow a straight line relationship, as we’ve shown in this graph. Now the other thing is that when we use the graph to make the predictions, it’s quite difficult to read the exact values. So what we’re gonna do is calculate the equation of that line, and then we can put numbers into our equation and generate numbers as results.

Now to work out the equation of that line, remember we need two things: we need the slope of the line and we need the 𝑦-intercept. Now the 𝑦-intercept’s relatively easy to see. That looks like that’s gonna be about five. And the slope, remember, is every time I increase the 𝑥-coordinate by one, by how much does the 𝑦-coordinate increase. Now if I was actually just gonna do this here, sort of say what’s the distance between here and here kinda thing, that’s actually quite difficult to do on these particular axes with the scales that we’ve got. So what I’m gonna do is I’m gonna look for points on the line which sit exactly on grid lines and which are as far apart as possible, so I’m gonna take this one and this one over here. And I’m gonna use the definition of slope, which is the difference in 𝑦-coordinates divided by the difference in 𝑥-coordinates. So between these two points, the 𝑦-coordinate is going up from ten here to fifty here, so the difference is gonna be fifty minus ten. And for the 𝑥-coordinates, we’re going up from two here to eighteen here, so the difference is eighteen minus two. And that becomes forty over sixteen, which simplifies to five over two.

So this means we’ve got a slope of five over two, and we’ve got a 𝑦-intercept of five. Since the relationship is a linear relationship, we’ve got a straight line graph, we’re gonna use that general form of the equation 𝑦 equals 𝑚𝑥 plus 𝑏, and the slope is five over two, and the intercept is five, so we can plug those numbers in. Now the 𝑦 is the fare in dollars; so although saying that the multiplier is five over two is perfectly accurate, when we’re talking about money it’s probably better to say it’s two dollars fifty cents or two point five dollars, so our equation becomes 𝑦 equals two point five 𝑥 plus five.

And the way that we interpret those numbers is the intercept being five; that’s the 𝑦-coordinate when 𝑥 equals zero. This means that just to get in the cab, these taxi drivers are charging you five dollars, so that’s kind of a start fee for your journey. And then the slope tells you how much they’re charging every time they increase 𝑥 by one, and 𝑥 is the number of miles that you travel in the taxi. So basically for every mile that you go, they’re charging you two dollars and fifty cents. So our interpretation is that each fare consists of a fixed fee of five dollars plus two dollars fifty per mile travelled. Now this is only an approximation, as we said; none of the fares exactly match that, ’cause none of the points are exactly on the line, but they’re all pretty close to that. That’s the general rule they sort of follow quite closely.
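
The slope calculation described above can be sketched in a few lines of Python. The two points (2, 10) and (18, 50) and the intercept of about five are the values read off the graph in the video; the function name `slope_between` is just an illustrative choice, and the use of `Fraction` is an assumption made here to keep "five over two" exact.

```python
from fractions import Fraction

def slope_between(p1, p2):
    """Slope = difference in y-coordinates / difference in x-coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    return Fraction(y2 - y1, x2 - x1)

# Two points on the line of best fit, as read off the graph in the video.
slope = slope_between((2, 10), (18, 50))  # (50 - 10) / (18 - 2) = 40/16
intercept = 5                             # read off the y-axis, approximate

print(slope)         # 5/2
print(float(slope))  # 2.5, i.e. two dollars fifty per mile
```

Picking points that are far apart, as the video does, keeps small reading errors from having a big effect on the computed slope.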

And now we’ve got the equation, we can use that probably more easily than we can use the graph for making predictions about how much each journey would cost. So on the graph, if we were going eight miles, we’d have to go up to the graph here and then sort of come across, and the rough guess is that’s gonna be twenty-five, twenty-four, twenty-six dollars. But if we put the number straight into the equation, we can see that the cost is gonna be two point five times eight plus five. So that’s twenty plus five, which is twenty-five dollars. So it’s easier to get sort of more accurate answers by using the equation.

Now to use the equation to make predictions in the other direction, so say we’ve got thirty-five dollars and we wanna know roughly how far we’ll get with our thirty-five dollars, we’re gonna have to rearrange that equation, so we’re gonna have to make 𝑥 the subject. So what I’m gonna do here is take away five from both sides of that equation, which gives us 𝑦 minus five on the left-hand side and two point five 𝑥 on the right-hand side, ’cause five take away five is nothing. And now if I divide both sides by two point five, I’ll know what 𝑥 is equal to. So the distance that I can travel for a given fare is 𝑦 minus five, all over two point five. So let’s say we did have thirty-five dollars; we can put thirty-five in for 𝑦. So 𝑥, the number of miles that we can go, will be thirty-five take away five, that’s thirty, over two point five, which is twelve miles.
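
As a rough sketch, both directions of prediction can be written as two small Python functions. The equation 𝑦 = 2.5𝑥 + 5 and its rearrangement 𝑥 = (𝑦 − 5) / 2.5 come from the video; the names `predict_fare` and `predict_distance` are made up here for illustration.

```python
SLOPE = 2.5      # dollars charged per mile
INTERCEPT = 5.0  # fixed fee in dollars

def predict_fare(miles):
    """Approximate fare from the line of best fit: y = 2.5x + 5."""
    return SLOPE * miles + INTERCEPT

def predict_distance(fare):
    """The same equation with x as the subject: x = (y - 5) / 2.5."""
    return (fare - INTERCEPT) / SLOPE

print(predict_fare(8))       # 25.0 dollars for an eight-mile journey
print(predict_distance(35))  # 12.0 miles for thirty-five dollars
```

These are only as good as the line of best fit itself, so the outputs are approximations rather than exact fares.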

So on the graph it would look like this, but I think it’s kind of easier to get a more accurate answer when you’re actually working with equations and numbers. So we’ve followed the process right through. We started off with a table of values over here. From that, we plotted the graph. From the graph, we calculated this equation, and we’ve seen how we can use that equation to make predictions of fares based on how far we’re travelling, or how far we can get with a given amount of money. We’ve also interpreted that equation, so that we know that five tells us what the fixed fee is for every journey, and the slope, two point five, tells us that we’re charging roughly two dollars fifty per mile that we travel.

Now the only thing that we haven’t considered at all in all of this is, for that particular function, that equation representing the relationship between the distance and the fare, what would be a suitable domain? Now given that we’re not gonna charge negative amounts if we start driving backwards to different places, it probably makes sense that the distance we’re travelling is always gonna be positive. So in terms of the maths, it makes sense to put a restriction on the numbers that we can put into this equation, saying that the 𝑥-values, the number of miles, have to be at least zero for this to make any kind of sense.
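
One way to picture the domain restriction is as a check on the input before the equation is used. This is only a sketch: the function name `fare_for` and the choice to raise a `ValueError` are assumptions made here, not something from the video.

```python
def fare_for(miles):
    """Approximate fare y = 2.5x + 5, defined only for x >= 0."""
    if miles < 0:
        # Outside the domain: a negative distance makes no sense here.
        raise ValueError("distance must be at least zero miles")
    return 2.5 * miles + 5

print(fare_for(0))  # 5.0, just the fixed fee for getting in the cab
```

The video also points out that the data only covers journeys between three and twenty miles, so predictions far outside that range should be treated with caution even though this check allows them.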

So having done all of that, let’s do one more example, and we’re going to do that a little bit more quickly.

So nine students were asked to measure the diameter and circumference of nine different circles, that was one each, and the results are shown in the table below. So for each student, they’ve measured, so they haven’t done any calculations here; they’ve just, you know, taken a ruler or a piece of string and measured these distances. So for example, the first student had a diameter of two inches on their circle and a circumference of six inches. So what we’re gonna do is we’re gonna plot that on a scatter graph. And then we’re gonna do a line of best fit, work out the equation of that line of best fit, and then try and interpret some of the parameters. So first things first, we need to define which our 𝑥- and 𝑦-coordinates are. So I’m gonna say you set the diameter of the circle and then that determines what the circumference is, so I’m gonna use 𝑥 for the diameter and 𝑦 for the circumference, and then I’m gonna treat each of these as an ordered pair and use the 𝑥-coordinate and the 𝑦-coordinate and plot those points.

And that’s what we get. Now most of the points pretty strongly suggest a straight line relationship between 𝑥 and 𝑦, between the diameter and the circumference, but there is one point that looks very different to the others. Now what’s going on here? So a number of different possibilities come to mind. I mean, it could be that the ruler that that student was using was extraordinarily sensitive to changes in temperature, so it expands and contracts as it heats up or cools down, and maybe they were doing their measuring in an environment where the temperature was rapidly changing. It could be that there was a bizarre gravitational event nearby which massively warped space-time while the student was making their measurements. It could be that they found a bizarre circle which looks very different and has different properties to all the other circles, or maybe it could be that the student was just very bad at measuring, or possibly they just wrote down the numbers the wrong way round.

Well, we don’t know which one of those is the real situation, and we can’t really make any assumptions. It looks very likely that they’ve just transposed the diameter and the circumference. But because this is secondary data, we don’t have access to the original circles, we don’t have access to the original students, I think what we’re gonna do is just assume that, because it looks very different, it’s probably wrong; we’re going to ignore that piece of data for now. And you do have to be very careful about throwing away bits of data that you just don’t like the look of, because you can obviously skew your results. But from what we know about circles and the way that they work and about geometry, I think it’s pretty clear that that does look like a dodgy piece of data, so I think in this particular case we’re safe to ignore it. So let’s draw a line of best fit through the rest of the points.

Now that looks like a reasonable line of best fit for the rest of those points. So to work out the equation of that line, we’re gonna have to find the intercept and the slope. Well, that line seems to go through the origin. So the 𝑦-intercept, when 𝑥 is zero, the 𝑦-value is zero. And to work out the slope, I’m gonna pick two points which are on my grid lines, and I’m gonna work out the difference in 𝑥 and the difference in 𝑦 again. So in going from this point to this point, the 𝑦-coordinate’s gone up from zero up to thirty, so the difference in 𝑦-coordinates is thirty. And between those same two points, the 𝑥-coordinate is going up from zero up to nine point five. So the slope looks like it’s thirty divided by nine point five, which is about three point one six. And because we’ve got a linear relationship, our equation is gonna look something like 𝑦 equals 𝑚𝑥 plus 𝑏, and we’ve calculated the slope is three point one six and the intercept is zero. So there’s our equation: 𝑦 equals three point one six 𝑥 plus zero. Then again, we don’t usually bother writing plus zero on the end of our equations, so we’re just gonna go with 𝑦 equals three point one six 𝑥.
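
The same slope and intercept calculation can be sketched in Python for the circle data. The two points (0, 0) and (9.5, 30) are the ones read off the line in the video; the function name `estimate_circumference` is hypothetical and only there to show the resulting equation in use.

```python
# Two points on the line of best fit, as read off the graph in the video.
x1, y1 = 0, 0
x2, y2 = 9.5, 30

slope = (y2 - y1) / (x2 - x1)  # difference in y / difference in x = 30 / 9.5
intercept = y1 - slope * x1    # the line passes through the origin, so this is 0

print(round(slope, 2))  # 3.16

def estimate_circumference(diameter):
    """Approximate circumference in inches from y = slope * x + intercept."""
    return slope * diameter + intercept
```

Because the intercept works out to zero, the equation reduces to a simple proportion: the circumference is roughly 3.16 times the diameter.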

So just thinking back, that 𝑥 represents the diameter in inches and 𝑦 represents the circumference in inches, then for those circles we’re saying, with this data that we have, we reckon that the circumference is roughly equal to three point one six times the diameter. And just sort of interpreting those parameters, the intercept here at zero, that makes sense; so if we’ve got a circle that’s got a diameter of zero, we haven’t really got a circle, so the circumference will be zero as well. So we’re sort of happy with the interpretation of that, and the slope means that whatever the diameter of our circle is, we’re gonna multiply that by three point one six to get the circumference. So every extra inch on the diameter adds three point one six inches to the circumference of the circle.

Now even from here, I can hear those of you who are paying attention in your geometry classes screaming at me that, “But we know the circumference of a circle is 𝜋 times the diameter!” So what we’ve done in our little experiment here with these nine students is we’ve calculated, using statistical techniques, an approximation for the value of 𝜋. These two things are completely compatible with each other, except one of them is a bit more inaccurate than the other one. Because our students are doing measuring, they’re not always doing that a hundred percent accurately, so some of these points are not quite on the line, although in theory they should all be exactly on a straight line. But with all these errors, when we add all these errors up, our estimation of the value of 𝜋 has come out slightly wrong; it’s three point one six instead of three point one four one five nine blah blah blah blah blah. But nonetheless, it’s not a bad estimate.

So hopefully, the couple of examples we’ve just looked at have given you a chance to see the value of scatter plots and how useful they can be in interpreting data. But more importantly perhaps, they’ve enabled you to calculate the equation of a straight line and to interpret some of the values. So the 𝑦-intercept and the slope of that line, we’ve interpreted those in some sort of a real-life context.
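
Going back to the circle example, the phrase "not a bad estimate" can be put into numbers with a quick Python check. This is just a sketch: it compares the slope of 3.16 from the video with `math.pi`, and the variable names are illustrative.

```python
import math

estimate = 3.16  # slope of the line of best fit from the circle data

abs_error = estimate - math.pi  # how far the estimate is above pi
rel_error = abs_error / math.pi # the same gap as a fraction of pi

print(f"{abs_error:.4f}")  # 0.0184
print(f"{rel_error:.2%}")  # 0.59%
```

So the students' hand measurements, combined through a line of best fit, land within about one percent of the true value.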