Before watching this video, you should already be familiar with standard
deviation and how to calculate it from a list of values. We’ve got another video covering that if
you’re not sure about it. In this video we’ll be calculating the standard deviation of a set of values
presented in the form of a frequency table.
Remember, the standard deviation is a number that tells you about the amount
of variability in a set of numbers; it’s the root mean square deviation from the mean. We write the
standard deviation as 𝜎, that little funny symbol there, and it’s
encapsulated in this formula here: the sum of each individual score minus the mean, squared,
divided by 𝑛, and then we take the square root of the whole thing. We’ve actually got another formula which can be easier to use, and we sum
that up as the mean of the squares minus the square of the mean, and then we take the
square root of all that.
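Written out in symbols, the two formulas just described look roughly like this. The notation is an assumption on my part: 𝜎 for the standard deviation, x̄ for the mean, and 𝑛 for the number of values.

```latex
\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}
\qquad\text{or equivalently}\qquad
\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}
```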
So if you’ve got a straightforward list of numbers that you want to calculate
the standard deviation for, then it’s a relatively straightforward, if sometimes a bit lengthy,
process. For example, if you want to calculate the standard deviation of three, seven,
eight, ten, and eleven, we can just call those the 𝑥-values and write them down, then square them. And if you add up the 𝑥-values in this case, we get thirty-nine. And if you add up the 𝑥 squared values in this case, you get three hundred and forty-three.
You’ll also notice we’ve got five bits of data because there were five
numbers that we were given in the first place. So to work out the standard deviation, we just need to plug some of those
values into our formula. Well the sum of the 𝑥 squareds is three hundred and forty-three,
and 𝑛 was five, and
the sum of the 𝑥s was thirty-nine.
And this all simplifies down to the square root of a hundred and ninety-four over twenty-five. So our answer would be that the standard deviation for this set of data is
two point seven nine correct to two decimal places.
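If you’d like to check that arithmetic on a computer, here’s a minimal Python sketch of the "mean of the squares minus the square of the mean" formula. The function name `std_dev` is just an illustrative choice, not something from the video.

```python
import math

def std_dev(values):
    """Square root of (mean of the squares minus the square of the mean)."""
    n = len(values)
    mean_of_squares = sum(x * x for x in values) / n
    mean = sum(values) / n
    return math.sqrt(mean_of_squares - mean ** 2)

scores = [3, 7, 8, 10, 11]
print(sum(scores))                 # 39
print(sum(x * x for x in scores))  # 343
print(round(std_dev(scores), 2))   # 2.79
```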
Brilliant! But that’s not what this video is about. Let’s say we’re given some data
in a frequency table, how can we calculate the standard deviation then? For example, here we’ve got a series of prices and their frequencies. So three
items had a price of maybe ten dollars, two items had a price of twenty dollars
and four items had a price of thirty dollars. What’s the standard deviation in the prices?
Well, we could just write out the data in a big list and use the method that we
already know. So that’d be three lots of ten,
two lots of twenty, and four lots of thirty, meaning we’ve got nine bits of data in total. And if we add them all up, we get a hundred and ninety. So now let’s go on and write down
the 𝑥 squared values.
And of course, ten squared is a hundred, twenty squared is four hundred,
and thirty squared is nine hundred, and then the sum of the 𝑥 squareds is four thousand seven hundred.
And we can plug those numbers into our formula, so we’ve got to calculate four thousand seven hundred divided by nine, minus
a hundred and ninety over nine all squared, and then take the square root of that answer.
And that works out to be eight point seven five to two decimal places.
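The "write it all out as a big list" approach can be sketched in Python like this. The variable names are only illustrative; the prices and frequencies are the ones from the table.

```python
import math

prices = [10, 20, 30]
frequencies = [3, 2, 4]

# Repeat each price as many times as its frequency says.
expanded = [x for x, f in zip(prices, frequencies) for _ in range(f)]

n = len(expanded)
sum_x = sum(expanded)
sum_x_squared = sum(x * x for x in expanded)
sigma = math.sqrt(sum_x_squared / n - (sum_x / n) ** 2)

print(expanded)                 # [10, 10, 10, 20, 20, 30, 30, 30, 30]
print(n, sum_x, sum_x_squared)  # 9 190 4700
print(round(sigma, 2))          # 8.75
```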
Now I rushed through that using the power of video manipulation, but can
you imagine if these frequencies were a lot higher? Then we would have had a massive long list
of numbers here, and it would have taken a long time to do that calculation. Now luckily, there’s
an alternative formula which can save us a bit of time. So if we call 𝑓 the frequencies and the prices in this case
𝑥, that’s the individual scores, then 𝜎 is found by taking
the sum of 𝑓 times 𝑥 squared divided by the sum of 𝑓, the number of pieces of data we’ve got, minus
the sum of the 𝑓 𝑥s divided by the sum of the frequencies all squared, and then taking the
square root of that. Well let’s see that in operation now.
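In symbols, with 𝑓 for the frequencies and 𝑥 for the scores, the frequency-table formula just described can be written as:

```latex
\sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}
```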
Well first, I’m gonna add another row to our table, so these are gonna be the
𝑥 squareds. So that’s ten squared, twenty squared,
and thirty squared, which is a hundred, four hundred, and nine hundred.
Now I’m gonna add a row for 𝑓 times 𝑥, the frequency times the 𝑥-score.
So for the first column, that’s gonna be three, the frequency score, times ten, the
𝑥 score. The second column’s gonna be two times twenty, and the third is gonna be four times thirty, giving us thirty, forty, and a hundred and twenty.
And next, I’m gonna add a row for the 𝑓 times 𝑥 squareds.
So the frequencies are gonna be the same as we just used in our
calculations, but this time rather than multiplying each frequency by its corresponding
𝑥-value, we’re multiplying by its corresponding 𝑥 squared value, so that’s
a hundred, four hundred, and nine hundred. And when I complete those calculations, my 𝑓 𝑥 squared scores are
gonna be three hundred, eight hundred, and three thousand six hundred.
So next, I need to do some adding up. The sum of the frequencies, three plus two plus four, is nine.
The sum of the 𝑓 𝑥s, that’s thirty plus forty plus a hundred and twenty,
is one hundred and ninety, and the sum of the 𝑓 𝑥 squareds, so that’s three hundred plus eight hundred plus three thousand six hundred, is
four thousand seven hundred.
So now I just need to plug those numbers into my formula, so
the sum of the 𝑓 𝑥 squareds is four thousand seven hundred, and the sum of the 𝑓s is nine. And then from that
we’re subtracting the sum of the 𝑓 𝑥s, which is a hundred and ninety, over the sum of the 𝑓s, which is nine, all squared, and
we’re taking the square root of the whole thing.
And luckily, it comes up with exactly the same answer: eight point seven five to two decimal places.
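Here’s a minimal sketch of the table formula as a reusable Python function. The name `std_dev_from_table` is a hypothetical one chosen for illustration.

```python
import math

def std_dev_from_table(xs, fs):
    """Standard deviation from scores xs and their frequencies fs."""
    sum_f = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    return math.sqrt(sum_fx2 / sum_f - (sum_fx / sum_f) ** 2)

# The price table from the example above.
result = std_dev_from_table([10, 20, 30], [3, 2, 4])
print(round(result, 2))  # 8.75

# Made-up larger frequencies in the same proportions (not from the video):
# the calculation is no longer, and no list of 9000 values is needed.
scaled = std_dev_from_table([10, 20, 30], [3000, 2000, 4000])
print(round(scaled, 2))  # 8.75
```

Because the function only ever works with the three totals, the amount of work depends on the number of columns in the table, not on how big the frequencies are.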
Now in this particular example, because the frequencies were so low, it
probably took slightly longer to do this second method than it did the original method
writing out all the individual scores. But you can imagine if all those frequencies were much
higher, writing out all the individual scores would’ve taken a lot longer. So this second
method would certainly have saved some time.
Right then, let’s just go through one final example before we finish the video.
Calculate the standard deviation of the following data. So our scores are one, two, three, four,
and five, and three people scored one, nine people scored two,
twelve people scored three, five people scored four,
and four people scored five. So the first thing we’re gonna have to do is add three rows: one for the
𝑥 squared values, one for the 𝑓 times 𝑥 values, and one for the 𝑓 times 𝑥 squared
values. So first of all, let’s calculate the 𝑥 squared values. So one squared is one, two
squared is four, three squared is nine, four squared is sixteen, and five squared is twenty-five.
And now we can fill in the 𝑓 𝑥 row. So the frequencies are three, nine,
twelve, five, and four, so three times one is three, nine times two is eighteen,
twelve times three is thirty-six, five times four is twenty, and four times five is twenty. And now we can fill out the 𝑓 𝑥 squared row; three times one again is three, nine
times four is thirty-six, twelve times nine is one hundred and eight, five times sixteen this time is eighty,
and then four times twenty-five is a hundred.
And now we’re gonna need to calculate the sum of the 𝑓s, the sum
of the 𝑓 times 𝑥s, and the sum of the 𝑓 times 𝑥 squareds.
Well, adding up the numbers on the 𝑓 row, three plus nine plus twelve plus five plus four, gives us
thirty-three; adding up the numbers on the 𝑓 𝑥 row, three plus eighteen plus thirty-six plus twenty plus twenty is equal to
ninety-seven; and adding up the 𝑓 𝑥 squareds, three plus thirty-six plus one hundred and eight plus eighty plus a hundred is three hundred and twenty-seven.
And just recapping our formula for 𝜎, the standard deviation
is the square root of the sum of the 𝑓 𝑥 squareds divided by the sum of the 𝑓s minus the
sum of the 𝑓 𝑥s divided by the sum of the 𝑓s all squared.
So just filling in those values there, the sum of the 𝑓 𝑥 squareds is three hundred and twenty-seven,
the sum of the 𝑓s was thirty-three, so that’s gonna go here and here, and the sum of the 𝑓 𝑥s was
ninety-seven. And popping that all in the calculator and rounding it to two decimal places
gives us an answer of one point one three.
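As a rough check on this example, the sketch below builds the three extra rows of the table in Python and then compares the result against `statistics.pstdev` from the standard library, run on the written-out list of thirty-three scores. The variable names are illustrative only.

```python
import math
import statistics

scores = [1, 2, 3, 4, 5]
freqs = [3, 9, 12, 5, 4]

# The three extra rows of the table.
x_squared = [x * x for x in scores]                       # 1, 4, 9, 16, 25
fx = [f * x for f, x in zip(freqs, scores)]               # 3, 18, 36, 20, 20
fx_squared = [f * x2 for f, x2 in zip(freqs, x_squared)]  # 3, 36, 108, 80, 100

# The three totals that go into the formula.
sum_f, sum_fx, sum_fx2 = sum(freqs), sum(fx), sum(fx_squared)
sigma = math.sqrt(sum_fx2 / sum_f - (sum_fx / sum_f) ** 2)

# Cross-check against the long way: write out every score and use pstdev.
expanded = [x for x, f in zip(scores, freqs) for _ in range(f)]
check = statistics.pstdev(expanded)

print(sum_f, sum_fx, sum_fx2)  # 33 97 327
print(round(sigma, 2))         # 1.13
```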
So hopefully that last example there highlights the fact that when the
frequencies get a little bit higher, we’d have had to write out thirty-three numbers, so
it would have been quite a bit more difficult to actually do that question if we hadn’t used our
table formula. So to summarise what we’ve learned then, for values given in a frequency table
where 𝑥 is the individual scores and 𝑓 are the frequencies of those
scores happening, the standard deviation 𝜎 is equal to the square root of the sum of the
𝑓 𝑥 squareds divided by the sum of the 𝑓s minus the sum of the 𝑓 𝑥s divided by the sum of the 𝑓s