Video Transcript
Before watching this video, you
should already be familiar with standard deviation and how to calculate it from a
list of values. We’ve got another video covering
that if you’re not sure about it. In this video, we’ll be calculating
the standard deviation of a set of values presented in the form of a frequency
table.
Remember that standard deviation is
a number that tells you about the amount of variability in a set of numbers. It’s the root mean square deviation
from the mean, and we write it as 𝜎, that funny little symbol there,
which is encapsulated in this formula here. We take each individual score
minus the mean, square it, add all of those up, and divide by 𝑛. And then, we take the square root
of the whole thing.
Although we’ve actually got another
formula which can be easier to use. And we sum that up as the mean of
the squares minus the square of the mean. And then, we take the square root
of all that. So, if you’ve got a straightforward
list of numbers that you want to calculate the standard deviation for, then it’s a
relatively straightforward, if sometimes a bit lengthy, process.
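
Written out in symbols, the two formulas just described might look like the following. This is only a sketch: the on-screen formula isn't reproduced in the transcript, so the notation here (𝜎 for the standard deviation, x-bar for the mean, 𝑛 for the number of values) is an assumption.

```latex
\[
\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}
       = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}
\]
```

The second form is the "mean of the squares minus the square of the mean" version, with the square root taken at the end.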
For example, if you want to
calculate the standard deviation of three, seven, eight, 10, and 11, we can just
call those the 𝑥-values and write them down, then square them. And if you add up the 𝑥-values in
this case, we get 39. And if you add up the 𝑥 squared
values in this case, you get 343. You’ll also notice we’ve got five
bits of data because there were five numbers that we were given in the first
place. So, to work out the standard
deviation, we just need to plug some of those values into our formula.
Well, the sum of the 𝑥 squareds is
343, and 𝑛 was five, and the sum of the 𝑥s was 39. So, under the square root we’ve got 343 over five minus 39 over five all squared, which simplifies down to 194
over 25. Taking the square root, our answer is that the
standard deviation for this set of data is 2.79, correct to two decimal places.
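
As a quick check of that arithmetic, here is a minimal Python sketch of the "mean of the squares minus the square of the mean" method. The function name `std_from_list` is made up for illustration and is not something from the video.

```python
import math

def std_from_list(values):
    """Standard deviation as sqrt(mean of the squares - square of the mean)."""
    n = len(values)                         # number of values
    sum_x = sum(values)                     # sum of the x-values
    sum_x_sq = sum(x * x for x in values)   # sum of the x squared values
    return math.sqrt(sum_x_sq / n - (sum_x / n) ** 2)

values = [3, 7, 8, 10, 11]
sigma = std_from_list(values)

print(sum(values), sum(x * x for x in values))  # 39 343
print(round(sigma, 2))                          # 2.79
```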
Brilliant, but that’s not what this
video is about.
Let’s say we’re given some data in
a frequency table. How can we calculate the standard
deviation then? For example, here we’ve got a
series of prices and their frequencies. So, three items had a price of
maybe 10 dollars, two items had a price of 20 dollars and four items had a price of
30 dollars. What’s the standard deviation in
the prices?
Well, we could just write out the
data in a big list and use the method that we already know. So, that’d be three lots of 10, two
lots of 20, and four lots of 30. Meaning, we’ve got nine bits of
data in total. And if we add them all up, we get
190. So, now, let’s go on and write down
the 𝑥 squared values. And of course, 10 squared is 100,
20 squared is 400, and 30 squared is 900. And then, the sum of the 𝑥
squareds is 4700. And we can plug those numbers into
our formula. So, we’ve got to calculate 4700
divided by nine minus 190 over nine all squared and then take the square root of that
answer. And that works out to be 8.75, to
two decimal places.
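
The "write it all out as a big list" approach can be sketched in Python like this. The dictionary name `price_frequencies` is an assumption for illustration, and `statistics.pstdev` from the standard library is used here as an independent way of getting the divide-by-𝑛 standard deviation rather than the formula from the video.

```python
import statistics

# price -> frequency, as in the table: three 10s, two 20s, four 30s
price_frequencies = {10: 3, 20: 2, 30: 4}

# Expand the table into the full list of individual prices
expanded = [price for price, freq in price_frequencies.items() for _ in range(freq)]

print(expanded)                     # [10, 10, 10, 20, 20, 30, 30, 30, 30]
print(len(expanded), sum(expanded)) # 9 190
print(round(statistics.pstdev(expanded), 2))  # 8.75
```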
Now, I rushed through that using
the power of video manipulation, but can you imagine if these frequencies were a lot
higher? Then we would have had a massively
long list of numbers here. And it would have taken a long time
to do that calculation. Now, luckily, there’s an
alternative formula which can save us a bit of time.
So, if we call 𝑓 the frequencies
and the prices, in this case, 𝑥, that’s all the individual scores, then we work out
the sum of 𝑓 times 𝑥 squared divided by the sum of 𝑓, which is the number of pieces of
data we’ve got, minus the sum of the 𝑓𝑥s divided by the sum of the frequencies all
squared. And then, 𝜎 is the square root of
that. Well, let’s see that in operation
now.
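
In symbols, the frequency-table formula just described could be written as follows. The on-screen version isn't in the transcript, so treat the notation as an assumption: 𝜎 for the standard deviation, 𝑓 for the frequencies and 𝑥 for the scores.

```latex
\[
\sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}
\]
```

Here the sum of the 𝑓s plays the role that 𝑛 plays in the list version of the formula.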
Well, first, I’m gonna add another
row to our table. So, these are gonna be the 𝑥
squareds. So, that’s 10 squared, 20 squared,
and 30 squared, which is 100, 400, and 900. Now, I’m gonna add a row for 𝑓
times 𝑥, the frequency times the 𝑥-score. So, for the first column, that’s
gonna be three, the frequency score, times 10, the 𝑥-score. The second column’s gonna be two
times 20. And the third is gonna be four
times 30, giving us 30, 40, and 120.
And next, I’m gonna add a row for
the 𝑓 times 𝑥 squareds. So, the frequencies are gonna be
the same as we just used in our calculations, but this time rather than multiplying
each frequency by its corresponding 𝑥-value, we’re multiplying by its corresponding
𝑥 squared value. So, that’s 100, 400, and 900. And when I complete those
calculations, my 𝑓𝑥 squared scores are gonna be 300, 800, and 3600.
So, next, I need to do some adding
up. The sum of the frequencies, three
plus two plus four is nine. The sum of the 𝑓𝑥s, that’s 30
plus 40 plus 120, is 190. And the sum of the 𝑓𝑥 squared, so
that’s 300 plus 800 plus 3600, is 4700. So, now, I just need to plug those
numbers into my formula. So, the sum of the 𝑓𝑥 squared is
4700. And the sum of the 𝑓s is nine. And then, from that we’re
subtracting the sum of the 𝑓𝑥s, which is 190, over the sum of the 𝑓s, which is nine, all
squared. And we’re taking the square root of
the whole thing.
And luckily, it comes up with
exactly the same answer, 8.75 to two decimal places. Now, in this particular example,
because the frequencies were so low, it probably took slightly longer to do this
second method than it did the original method of writing out all the individual
scores. But you can imagine if all those
frequencies were much higher, writing out all the individual scores would’ve taken a
lot longer. So, this second method would
certainly have saved some time.
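
The table method for the prices example can be mirrored row by row in Python. This is a sketch only, and the variable names (`prices`, `freqs`, `x_sq`, `fx`, `fx_sq`) are chosen here to match the rows of the table rather than taken from the video.

```python
import math

prices = [10, 20, 30]   # the x row
freqs = [3, 2, 4]       # the f row

x_sq = [x * x for x in prices]                    # x squared row
fx = [f * x for f, x in zip(freqs, prices)]       # f times x row
fx_sq = [f * x2 for f, x2 in zip(freqs, x_sq)]    # f times x squared row

sum_f, sum_fx, sum_fx_sq = sum(freqs), sum(fx), sum(fx_sq)

# Square root of (sum of fx squared over sum of f) minus (sum of fx over sum of f) squared
sigma = math.sqrt(sum_fx_sq / sum_f - (sum_fx / sum_f) ** 2)

print(x_sq, fx, fx_sq)           # [100, 400, 900] [30, 40, 120] [300, 800, 3600]
print(sum_f, sum_fx, sum_fx_sq)  # 9 190 4700
print(round(sigma, 2))           # 8.75
```

No list of nine prices is ever built, so the amount of work depends on the number of columns in the table, not on how big the frequencies are.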
Right, then, let’s just go through
one final example before we finish the video.
Calculate the standard deviation of
the following data. So, our scores were one, two,
three, four, and five. And three people scored one, nine
people scored two, twelve people scored three, five people scored four, and four
people scored five.
So, the first thing we’re gonna
have to do is add three rows, one for the 𝑥 squared values, one for the 𝑓 times
𝑥-values, and one for the 𝑓 times 𝑥 squared values. So, first of all, let’s calculate
the 𝑥 squared values. So, one squared is one, two squared
is four, three squared is nine, four squared is 16, and five squared is 25. And now, we can fill in the 𝑓𝑥
row. So, the frequencies are three,
nine, 12, five, and four, remember. So, three times one is three, nine
times two is 18, 12 times three is 36, five times four is 20, and four times five is
20.
And now, we can fill out the 𝑓𝑥
squared row. Three times one again is three,
nine times four is 36, 12 times nine is 108, five times 16 this time is 80, and then
four times 25 is 100. And now, we’re gonna need to
calculate the sum of the 𝑓s, the sum of the 𝑓 times 𝑥s, and the sum of the 𝑓
times 𝑥 squareds. Well, adding up the numbers on the
𝑓 row, three plus nine plus 12 plus five plus four, gives us 33. Adding up the numbers on the 𝑓𝑥
row, three plus 18 plus 36 plus 20 plus 20 is equal to 97. And adding up the 𝑓𝑥 squared,
three plus 36 plus 108 plus 80 plus 100 is 327.
And just recapping our formula for
𝜎, the standard deviation is the square root of the sum of the 𝑓𝑥 squared divided
by the sum of the 𝑓s minus the sum of the 𝑓𝑥s divided by the sum of the 𝑓s all
squared. So, just filling in those values
there, the sum of the 𝑓𝑥 squareds is 327, the sum of the 𝑓s was 33. So, that’s gonna go here and
here. And the sum of the 𝑓𝑥s was 97. And popping that all in the
calculator and rounding it to two decimal places gives us an answer of 1.13.
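
The final example can be wrapped up as a small reusable function. This is a hedged sketch: `std_from_frequency_table` and the `scores` dictionary are illustrative names, and the cross-check against the written-out list uses `statistics.pstdev` from the Python standard library.

```python
import math
import statistics

def std_from_frequency_table(table):
    """Standard deviation from a {score: frequency} table, dividing by the sum of f."""
    sum_f = sum(table.values())
    sum_fx = sum(f * x for x, f in table.items())
    sum_fx_sq = sum(f * x * x for x, f in table.items())
    return math.sqrt(sum_fx_sq / sum_f - (sum_fx / sum_f) ** 2)

# score -> number of people with that score
scores = {1: 3, 2: 9, 3: 12, 4: 5, 5: 4}
sigma = std_from_frequency_table(scores)
print(round(sigma, 2))  # 1.13

# Cross-check: write out all 33 scores and compare
expanded = [x for x, f in scores.items() for _ in range(f)]
print(len(expanded))                                       # 33
print(math.isclose(sigma, statistics.pstdev(expanded)))    # True
```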
So, hopefully, that last example
there shows what happens when the frequencies get a little bit higher.
We’d have had to write out 33 numbers. So, it would have been quite a bit
more difficult to actually do that question if we hadn’t used our table formula.
So, to summarise what we’ve learned
then, for values given in a frequency table where 𝑥 is the individual scores and 𝑓
is the frequency of each score, the standard deviation 𝜎 is equal to
the square root of the sum of the 𝑓𝑥 squareds divided by the sum of the 𝑓s minus
the sum of the 𝑓𝑥s divided by the sum of the 𝑓s all squared.