Before watching this video, you should already be familiar with standard deviation and how to calculate it from a list of values. We’ve got another video covering that if you’re not sure about it. In this video, we’ll be calculating the standard deviation of a set of values presented in the form of a frequency table.
Remember, the standard deviation is a number that tells you about the amount of variability in a set of numbers; it’s the root mean square deviation from the mean. The standard deviation, σ, that funny little symbol there, is encapsulated in this formula here: the sum of each individual score minus the mean, squared, divided by 𝑛, and then we take the square root of the whole thing. We’ve actually got another formula which can be easier to use, and we sum that up as the mean of the squares minus the square of the mean, and then we take the square root of all that.
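Written out in symbols, the two formulas described above would look something like this. The notation is an assumption on our part: σ is the standard deviation, the capital Σ (written \sum) means "the sum of", x̄ is the mean, and 𝑛 is the number of values.

```latex
\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}
       = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}
```

The second form is the "mean of the squares minus the square of the mean" version, which only needs the totals of the 𝑥-values and the 𝑥 squared values.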
So if you’ve got a straightforward list of numbers that you want to calculate the standard deviation for, then it’s a relatively straightforward, if sometimes a bit lengthy, process. For example, if you want to calculate the standard deviation of three, seven, eight, ten, and eleven, we can just call those the 𝑥-values and write them down, then square them. And if you add up the 𝑥-values in this case, you get thirty-nine. And if you add up the 𝑥 squared values in this case, you get three hundred and forty-three.
You’ll also notice we’ve got five bits of data because there were five numbers that we were given in the first place. So to work out the standard deviation, we just need to plug those values into our formula. Well, the sum of the 𝑥 squareds is three hundred and forty-three, and 𝑛 was five, and the sum of the 𝑥s was thirty-nine. So that’s three hundred and forty-three over five minus thirty-nine over five, all squared, and this all simplifies down to the square root of a hundred and ninety-four over twenty-five. So our answer would be that the standard deviation for this set of data is two point seven nine, correct to two decimal places.
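If you’d like to check that working, here’s a minimal Python sketch of the list method. The function name std_dev is just a name made up for this illustration; it uses the "mean of the squares minus the square of the mean" formula and divides by 𝑛.

```python
import math


def std_dev(values):
    """Standard deviation as sqrt(mean of the squares - square of the mean)."""
    n = len(values)
    sum_x = sum(values)                   # sum of the x-values
    sum_x2 = sum(x * x for x in values)   # sum of the x squared values
    return math.sqrt(sum_x2 / n - (sum_x / n) ** 2)


values = [3, 7, 8, 10, 11]
print(sum(values))                  # 39
print(sum(x * x for x in values))   # 343
print(round(std_dev(values), 2))    # 2.79
```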
Brilliant! But that’s not what this video is about. Let’s say we’re given some data in a frequency table, how can we calculate the standard deviation then? For example, here we’ve got a series of prices and their frequencies. So three items had a price of maybe ten dollars, two items had a price of twenty dollars and four items had a price of thirty dollars. What’s the standard deviation in the prices?
Well, we could just write out the data in a big list and use the method that we already know. So that’d be three lots of ten, two lots of twenty, and four lots of thirty, meaning we’ve got nine bits of data in total. And if we add them all up, we get a hundred and ninety. So now let’s go on and write down the 𝑥 squared values.
And of course, ten squared is a hundred, twenty squared is four hundred, and thirty squared is nine hundred, and then the sum of the 𝑥 squareds is four thousand seven hundred. And we can plug those numbers into our formula, so we’ve got to calculate four thousand seven hundred divided by nine, minus a hundred and ninety over nine all squared, and then take the square root of that answer. And that works out to be eight point seven five to two decimal places.
Now I rushed through that using the power of video manipulation, but can you imagine if these frequencies were a lot higher? Then we would have had a massive long list of numbers here, and it would have taken a long time to do that calculation. Now luckily, there’s an alternative formula which can save us a bit of time. So if we call the frequencies 𝑓 and the prices in this case 𝑥, that’s the individual scores, then σ will be equal to the sum of 𝑓 times 𝑥 squared divided by the sum of 𝑓, which is the number of pieces of data we’ve got, minus the sum of the 𝑓 𝑥s divided by the sum of the frequencies, all squared, and then we take the square root of that. Well, let’s see that in operation now.
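In symbols, the frequency table formula just described would look something like this, where σ is the standard deviation and \sum means "the sum of" (the notation here is our own assumption, based on the spoken description):

```latex
\sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}
```

It’s the same "mean of the squares minus the square of the mean" idea as before, with the sum of the frequencies playing the role of 𝑛.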
Well first, I’m gonna add another row to our table, so these are gonna be the 𝑥 squareds. So that’s ten squared, twenty squared, and thirty squared, which is a hundred, four hundred, and nine hundred. Now I’m gonna add a row for 𝑓 times 𝑥, the frequency times the 𝑥-score. So for the first column, that’s gonna be three, the frequency score, times ten, the 𝑥 score. The second column’s gonna be two times twenty, and the third is gonna be four times thirty, giving us thirty, forty, and a hundred and twenty.
And next, I’m gonna add a row for the 𝑓 times 𝑥 squareds. So the frequencies are gonna be the same as we just used in our calculations, but this time rather than multiplying each frequency by its corresponding 𝑥-value, we’re multiplying by its corresponding 𝑥 squared value, so that’s a hundred, four hundred, and nine hundred. And when I complete those calculations, my 𝑓 𝑥 squared scores are gonna be three hundred, eight hundred, and three thousand six hundred. So next, I need to do some adding up. The sum of the frequencies, three plus two plus four, is nine. The sum of the 𝑓 𝑥s, that’s thirty plus forty plus a hundred and twenty, is one hundred and ninety. And the sum of the 𝑓 𝑥 squareds, so that’s three hundred plus eight hundred plus three thousand six hundred, is four thousand seven hundred.
So now I just need to plug those numbers into my formula. The sum of the 𝑓 𝑥 squareds is four thousand seven hundred, and the sum of the 𝑓s is nine. And then from that we’re subtracting the sum of the 𝑓 𝑥s, which is a hundred and ninety, over the sum of the 𝑓s, which is nine, all squared, and we’re taking the square root of the whole thing. And luckily, it comes out with exactly the same answer, eight point seven five to two decimal places.
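Here’s a short Python sketch of that table method, checked against the long-list method. The names std_dev_from_table, prices, and freqs are made up for this illustration; statistics.pstdev is Python’s built-in standard deviation that divides by 𝑛.

```python
import math
import statistics


def std_dev_from_table(xs, fs):
    """Standard deviation from scores xs and their frequencies fs."""
    sum_f = sum(fs)                                    # number of pieces of data
    sum_fx = sum(f * x for x, f in zip(xs, fs))        # sum of the f x row
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))   # sum of the f x squared row
    return math.sqrt(sum_fx2 / sum_f - (sum_fx / sum_f) ** 2)


prices = [10, 20, 30]
freqs = [3, 2, 4]

table_answer = std_dev_from_table(prices, freqs)

# Write out the data in a big list, as in the first method, to compare.
long_list = [x for x, f in zip(prices, freqs) for _ in range(f)]
list_answer = statistics.pstdev(long_list)

print(round(table_answer, 2))   # 8.75
print(round(list_answer, 2))    # 8.75
```

Both routes give the same answer; the table version just never has to build the big list.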
Now in this particular example, because the frequencies were so low, it probably took slightly longer to do this second method than it did the original method writing out all the individual scores. But you can imagine if all those frequencies were much higher, writing out all the individual scores would’ve taken a lot longer. So this second method would certainly have saved some time.
Right then, let’s just go through one final example before we finish the video. Calculate the standard deviation of the following data. So our scores are one, two, three, four, and five, and three people scored one, nine people scored two, twelve people scored three, five people scored four, and four people scored five. So the first thing we’re gonna have to do is add three rows: one for the 𝑥 squared values, one for the 𝑓 times 𝑥 values, and one for the 𝑓 times 𝑥 squared values. So first of all, let’s calculate the 𝑥 squared values. So one squared is one, two squared is four, three squared is nine, four squared is sixteen, and five squared is twenty-five.
And now we can fill in the 𝑓 𝑥 row. So the frequencies are three, nine, twelve, five, and four, so three times one is three, nine times two is eighteen, twelve times three is thirty-six, five times four is twenty, and four times five is twenty. And now we can fill out the 𝑓 𝑥 squared row; three times one again is three, nine times four is thirty-six, twelve times nine is one hundred and eight, five times sixteen this time is eighty, and then four times twenty-five is a hundred. And now we’re gonna need to calculate the sum of the 𝑓s, the sum of the 𝑓 times 𝑥s, and the sum of the 𝑓 times 𝑥 squareds.
Well, adding up the numbers on the 𝑓 row, three plus nine plus twelve plus five plus four, gives us thirty-three; adding up the numbers on the 𝑓 𝑥 row, three plus eighteen plus thirty-six plus twenty plus twenty is equal to ninety-seven; and adding up the 𝑓 𝑥 squareds, three plus thirty-six plus one hundred and eight plus eighty plus a hundred is three hundred and twenty-seven.
And just recapping our formula for σ, the standard deviation is the square root of the sum of the 𝑓 𝑥 squareds divided by the sum of the 𝑓s, minus the sum of the 𝑓 𝑥s divided by the sum of the 𝑓s all squared. So just filling in those values there, the sum of the 𝑓 𝑥 squareds is three hundred and twenty-seven, the sum of the 𝑓s was thirty-three, so that’s gonna go here and here, and the sum of the 𝑓 𝑥s was ninety-seven. And popping that all in the calculator and rounding it to two decimal places gives us an answer of one point one three.
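As a rough check on that final example, here’s a Python sketch that builds the three extra rows and then plugs the totals into the formula. The variable names are made up for this illustration.

```python
import math

scores = [1, 2, 3, 4, 5]
freqs = [3, 9, 12, 5, 4]

# The three extra rows of the table.
x_squared = [x * x for x in scores]
fx = [f * x for x, f in zip(scores, freqs)]
fx_squared = [f * x2 for x2, f in zip(x_squared, freqs)]

# The three totals the formula needs.
sum_f = sum(freqs)
sum_fx = sum(fx)
sum_fx2 = sum(fx_squared)

sigma = math.sqrt(sum_fx2 / sum_f - (sum_fx / sum_f) ** 2)

print(x_squared)                # [1, 4, 9, 16, 25]
print(fx)                       # [3, 18, 36, 20, 20]
print(fx_squared)               # [3, 36, 108, 80, 100]
print(sum_f, sum_fx, sum_fx2)   # 33 97 327
print(round(sigma, 2))          # 1.13
```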
So hopefully that last example there highlights the fact that when the frequencies get a little bit higher, the table formula really helps. We’d have had to write out thirty-three numbers, so it would have been quite a bit more difficult to actually do that question if we hadn’t used our table formula. So to summarise what we’ve learned then, for values given in a frequency table where 𝑥 is the individual scores and 𝑓 are the frequencies of those scores happening, the standard deviation σ is equal to the square root of the sum of the 𝑓 𝑥 squareds divided by the sum of the 𝑓s, minus the sum of the 𝑓 𝑥s divided by the sum of the 𝑓s all squared.