Lesson Video: Standard Deviation of a Data Set | Nagwa

Lesson Video: Standard Deviation of a Data Set Mathematics • Third Year of Preparatory School

In this video, we will learn how to find and interpret the standard deviation from a given data set.

16:20

Video Transcript

In this video, we will learn how to find and interpret the standard deviation from a given data set.

In order to understand the meaning of the standard deviation of a data set, we first recall the definition of the mean of a data set. The mean, also known as the average or expected value of a data set, is used as a measure of central tendency. For a data set 𝑥 containing values 𝑥 sub one, 𝑥 sub two, 𝑥 sub three, and so on, up to 𝑥 sub 𝑛, where there are 𝑛 values, the mean, denoted by the Greek letter 𝜇, is calculated by taking the sum of the values in the data set and dividing it by the number of values 𝑛. This can be written as the sum from 𝑖 equals one to 𝑛 of 𝑥 sub 𝑖 all divided by 𝑛.

Let’s now define what we mean by the standard deviation. The standard deviation of a data set is used to measure the dispersion of data from the mean. The larger the standard deviation, the more dispersed the data is from the mean. And the smaller the standard deviation, the less dispersed the data is from the mean. For the same data set 𝑥 with values 𝑥 sub one, 𝑥 sub two, and so on, up to 𝑥 sub 𝑛, where there are 𝑛 values, the standard deviation, denoted by 𝜎 𝑥, is calculated as follows. We subtract the mean 𝜇 from each value of the data set, square each of these differences, find their sum, divide by the number of values 𝑛, and take the square root. This can be simplified as shown. The two shorthand formulae to calculate the mean and standard deviation will be key in solving the examples in this video.
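The on-screen formulae are not reproduced in this transcript. Based on the verbal descriptions above, they can plausibly be written as follows; the exact layout used in the video is an assumption.

```latex
\mu = \frac{\sum_{i=1}^{n} x_i}{n},
\qquad
\sigma_x = \sqrt{\frac{\sum_{i=1}^{n} \left(x_i - \mu\right)^2}{n}}

% Shorthand form, dropping the index:
\sigma_x = \sqrt{\frac{\sum \left(x - \mu\right)^2}{n}}
```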

We will begin by using the formula for the standard deviation of a data set to determine the standard deviation when given the sum of the squared differences from the mean and the number of data points.

If the sum of 𝑥 minus 𝑥 bar all squared for a set of six values equals 25, find the standard deviation of the set, and round the result to the nearest thousandth.

We begin by recalling what some of the notation in the question means. 𝑥 bar, also sometimes written as the Greek letter 𝜇, is the mean of the data set. We are asked to find the standard deviation of the set. This is denoted 𝜎 𝑥 and satisfies the equation shown. 𝑛 is the number of values in the data set, in this question six. And we’re also told that the sum of 𝑥 minus 𝑥 bar all squared is equal to 25. Substituting these values, we see that the standard deviation 𝜎 𝑥 is equal to the square root of 25 over six. Typing this into our calculator gives us an answer of 2.041241 and so on. We are asked to give the result to the nearest thousandth. So we need to round to three decimal places. And the standard deviation is therefore equal to 2.041.
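As a quick check of this calculation, here is a minimal Python sketch. The variable names are illustrative only; the two input values are the ones stated in the question.

```python
import math

# Values given in the question.
sum_squared_deviations = 25  # sum of (x - x bar) squared
n = 6                        # number of values in the set

# Standard deviation: square root of the sum of squared deviations over n.
sigma = math.sqrt(sum_squared_deviations / n)

print(f"{sigma:.3f}")  # 2.041
```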

In this question, we were given the sum of the squared differences between the values of the data set and the mean. However, in general, we will just be given the data set. Our next step will therefore be to consider the four-step process we can use to find the standard deviation of a data set.

We begin by recalling the formula to calculate the standard deviation 𝜎 𝑥 that we have already seen. When given a data set, our first step is to find the mean 𝜇 or 𝑥 bar of the data set. Our second step is to find the difference between the mean and the value of each of the data points. Next, we find the sum of the squares of each of the values we found in step two. Finally, we substitute the sum of the squares and the value of 𝑛 into the formula and then square root to calculate the standard deviation, noting that this value will never be negative. We will now look at an example where we need to follow this four-step process.
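The four steps can be sketched as a short Python function. This is an illustrative sketch rather than anything shown in the video; the function name `population_std` and the sample data in the usage line are assumptions chosen for the illustration.

```python
import math

def population_std(values):
    """Standard deviation with divisor n, following the four steps above."""
    n = len(values)
    # Step 1: find the mean of the data set.
    mean = sum(values) / n
    # Step 2: find the difference between each value and the mean.
    differences = [x - mean for x in values]
    # Step 3: find the sum of the squares of those differences.
    sum_of_squares = sum(d ** 2 for d in differences)
    # Step 4: divide by n and take the square root.
    return math.sqrt(sum_of_squares / n)

# Illustrative data (not from the video): mean 5, sum of squares 32, n = 8.
print(population_std([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```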

Calculate the standard deviation of the values 45, 35, 42, 49, 39, and 34. Give your answer to three decimal places.

We begin by recalling that the formula to calculate the standard deviation 𝜎 𝑥 of a data set is as shown, where 𝑛 is the number of members of the data set and 𝜇 is its mean. We recall that we can calculate the mean of a data set by finding the sum of the values and dividing by how many values there are. The mean 𝜇 in this case is equal to the sum of the six values divided by six. This is equal to 244 divided by six, which equals 40.6 recurring. We will now set up a table which will enable us to follow a step-by-step process to calculate the standard deviation.

In the first row of our table, we have the six values in our data set 𝑥 sub 𝑖. We begin by subtracting the mean 𝜇 from each of these values. 45 minus 40.6 recurring is equal to 4.3 recurring. Subtracting the mean from 35 gives us negative 5.6 recurring. Repeating this process for the other four values in our data set, we have 1.3 recurring, 8.3 recurring, negative 1.6 recurring, and negative 6.6 recurring. Our next step is to find the square of each of these values. Noting that all of these must be positive, we have the six values shown. We are now in a position to find the sum of 𝑥 sub 𝑖 minus 𝜇 all squared from 𝑖 equals one to 𝑖 equals six. This is the sum of the six values in the third row.

Typing this into our calculator gives us 169.3 recurring. The standard deviation 𝜎 𝑥 is therefore equal to the square root of 169.3 recurring divided by six, which is equal to 5.312459 and so on. As we are asked to give our answer to three decimal places, we can conclude that the standard deviation of the values 45, 35, 42, 49, 39, and 34 is 5.312.
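The table used in this example is not reproduced in the transcript. The following Python sketch rebuilds its three rows from the six values in the question and repeats the calculation. The variable names are illustrative, and the final cross-check uses `statistics.pstdev` from the standard library, which computes the standard deviation with divisor 𝑛.

```python
import math
import statistics

values = [45, 35, 42, 49, 39, 34]
n = len(values)

mean = sum(values) / n                    # 244 / 6
deviations = [x - mean for x in values]   # second row of the table
squares = [d ** 2 for d in deviations]    # third row of the table

# Print the table one data value per line: x, x - mean, (x - mean) squared.
for x, d, s in zip(values, deviations, squares):
    print(f"{x:>3} {d:>8.3f} {s:>8.3f}")

sigma = math.sqrt(sum(squares) / n)
print(f"sum of squares = {sum(squares):.3f}, sigma = {sigma:.3f}")

# Cross-check against the standard library's population standard deviation.
assert math.isclose(sigma, statistics.pstdev(values))
```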

Before looking at one final example, we will consider how we can calculate the mean and standard deviation of a data set in a frequency table. For a data set 𝑥 containing values 𝑥 sub one, 𝑥 sub two, and so on, up to 𝑥 sub 𝑛, with corresponding frequencies 𝑓 sub one, 𝑓 sub two, and so on, up to 𝑓 sub 𝑛, where 𝑛 is the number of distinct values in the data set, the mean 𝜇 is calculated as follows. It is the sum of 𝑥 sub 𝑖 𝑓 sub 𝑖 from 𝑖 equals one to 𝑛 divided by the sum of 𝑓 sub 𝑖 from 𝑖 equals one to 𝑛. When answering any questions of this type, we’ll need to add a row to our table containing the values of 𝑥 sub 𝑖 multiplied by 𝑓 sub 𝑖.

We can then use this value of the mean to calculate the standard deviation in a similar way. The standard deviation 𝜎 𝑥 is equal to the square root of the sum of 𝑥 sub 𝑖 minus 𝜇 all squared multiplied by 𝑓 sub 𝑖 from 𝑖 equals one to 𝑛 divided by the sum of 𝑓 sub 𝑖 from 𝑖 equals one to 𝑛. After we find the square of the differences, we need to multiply each of these values by the frequency before finding their sum. Let’s now look at an example of this type.
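Written out from the verbal descriptions, the two frequency-table formulae would read as follows; the on-screen versions are not in the transcript, so the notation here is an assumption.

```latex
\mu = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i},
\qquad
\sigma_x = \sqrt{\frac{\sum_{i=1}^{n} \left(x_i - \mu\right)^2 f_i}{\sum_{i=1}^{n} f_i}}
```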

The table shows the distribution of goals scored in the first half of a football season. Find the standard deviation of the number of goals scored. Give your answer to three decimal places.

We can see from the table that in five matches, there were no goals scored in the first half. In two matches, one goal was scored. There were seven matches in which three goals were scored and seven matches in which four goals were scored. And there were four matches where six goals were scored in the first half. We are asked to find the standard deviation of the number of goals scored. And this can be calculated using the following formula when a data set is given in a frequency table. In this question, 𝑥 sub 𝑖 will be the number of goals. 𝑓 sub 𝑖 will be the number of matches. And 𝜇 will be the mean number of goals scored per match.

This mean value can be calculated by finding the sum of 𝑥 sub 𝑖 multiplied by 𝑓 sub 𝑖 from 𝑖 equals one to 𝑛 divided by the sum of 𝑓 sub 𝑖 from 𝑖 equals one to 𝑛. Before using either of our formulae, we will add some extra rows to our table. In order to calculate the mean, we begin by multiplying each value of 𝑥 sub 𝑖 by the corresponding value of 𝑓 sub 𝑖. Multiplying zero goals by five matches gives us a total of zero goals. One multiplied by two is equal to two. Completing this row, we obtain values of 21, 28, and 24. We then add an extra column for the sums and find the totals from 𝑖 equals one to 𝑛 of the second and third rows.

The sum of the frequencies is 25, which means there were a total of 25 matches played. This will be the denominator when calculating both the mean and the standard deviation. Adding zero, two, 21, 28, and 24 gives us 75. And the mean is therefore equal to 75 divided by 25. The average or mean number of goals scored per match was three. In the fourth row of our table, we will subtract this mean from each of our 𝑥-values. Zero minus three is equal to negative three. And subtracting 𝜇 from each of the other 𝑥-values gives us negative two, zero, one, and three. Our next step is to square all five of these values. Noting that squaring a negative number gives a positive answer, we have nine, four, zero, one, and nine.

Finally, we need to multiply each of these values by the corresponding frequency. Nine multiplied by five is 45. Next, we multiply four by two to give us eight. Our last three values are zero, seven, and 36. We now need to find the sum of the values in the bottom row. And this is equal to 96. The standard deviation 𝜎 𝑥 is therefore equal to the square root of 96 over 25. This simplifies to the square root of 3.84. And recalling that we were asked to give our answer to three decimal places, we can type this into our calculator. To three decimal places, the standard deviation of the number of goals scored is 1.960.
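The frequency table itself does not appear in the transcript, but its values can be recovered from the narration. The Python sketch below repeats the calculation with those values; the variable names are illustrative.

```python
import math

goals = [0, 1, 3, 4, 6]    # x_i: number of goals
matches = [5, 2, 7, 7, 4]  # f_i: number of matches

total_matches = sum(matches)                                   # 25
total_goals = sum(x * f for x, f in zip(goals, matches))       # 75
mean = total_goals / total_matches                             # 3.0

# Squared difference from the mean, multiplied by the frequency.
weighted_squares = [f * (x - mean) ** 2 for x, f in zip(goals, matches)]

sigma = math.sqrt(sum(weighted_squares) / total_matches)       # sqrt(96 / 25)
print(f"{sigma:.3f}")  # 1.960
```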

Whilst we will not cover it in this video, it is worth noting that we can also find the standard deviation of a grouped data set in a similar way. When dealing with a grouped data set, we find the midpoint of each group, and these will be our values of 𝑥 sub 𝑖. We then proceed in the exact same manner as in this question.
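As a rough illustration of the midpoint approach, here is a short Python sketch. The intervals and frequencies are hypothetical, invented purely for this example, and do not come from the video.

```python
import math

# Hypothetical grouped data: (lower bound, upper bound) and frequency.
intervals = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

# The midpoint of each interval plays the role of x_i.
midpoints = [(low + high) / 2 for low, high in intervals]  # 5.0, 15.0, 25.0

total = sum(frequencies)
mean = sum(x * f for x, f in zip(midpoints, frequencies)) / total
sigma = math.sqrt(
    sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)) / total
)

print(mean, sigma)  # 16.0 7.0
```

Because the midpoints stand in for the actual values in each interval, the result is an estimate of the standard deviation rather than an exact value.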

We will now finish this video by summarizing the key points. We saw in this video that the standard deviation of a data set is used to measure the dispersion of data from the mean. For data presented in a list, the formula for the standard deviation 𝜎 𝑥 is as shown, where the set of data 𝑥 has values 𝑥 sub one, 𝑥 sub two, and so on, up to 𝑥 sub 𝑛, with 𝑛 members and mean 𝜇. When the data is presented in a frequency table and each element of the data set has corresponding frequency 𝑓 sub one, 𝑓 sub two, and so on, up to 𝑓 sub 𝑛, then we can calculate the standard deviation as shown. We also note that for grouped frequency tables where data is given in intervals, the midpoint of the interval is used to represent the values of 𝑥 sub 𝑖.
