In this explainer, we will learn how to apply the normal distribution in real-life situations.
Many continuous variables in the real world approximately follow the normal distribution. Variables like heights and weights collected from unbiased samples are expected to be normally distributed. If we collect the values of such variables from a large random sample, then we expect the distribution to resemble the following histogram.
If the distribution of a continuous variable is symmetric and concentrated near the mean (like the data set pictured above), then we can assume that the variable is approximately normally distributed. To approximate the percentage of data points lying within a given range in such variables, we can use the normal probability distribution.
For example, say that we want to approximate the percentage of people from France whose heights are between 160 cm and 180 cm. Assume that, after collecting the data from a large random sample, we have computed the mean height of the sample to be 175 cm and the standard deviation to be 5 cm. We would begin this problem by defining a normal random variable $X$ with mean 175 cm and standard deviation 5 cm. We recall that $X \sim N(\mu, \sigma^2)$ denotes that the variable $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$. Using this notation, $X \sim N(175, 5^2)$. Then, the percentage of people from France whose heights are between 160 cm and 180 cm is approximated by the probability $P(160 < X < 180)$. We remember that since $X$ is a continuous random variable, the strict inequality $<$ and the weak inequality $\leq$ are interchangeable. This is because the probability that $X$ will take a particular value is zero; that is, $P(X = x) = 0$ for any $x$. In particular, this probability can also be written in several different ways:
$$P(160 < X < 180) = P(160 \leq X < 180) = P(160 < X \leq 180) = P(160 \leq X \leq 180).$$
So, we do not need to be concerned about whether or not the phrase "between 160 cm and 180 cm" includes the endpoints 160 cm and 180 cm.
Recall that if $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$ is the standard normal variable $Z \sim N(0, 1^2)$. Then, the probability for $X$ is obtained using the bell curve and the standard normal table. In this explainer, we will use the standard normal table that provides probabilities of the form $P(0 \leq Z \leq z)$ for $z \geq 0$.
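As a minimal sketch, entries of a standard normal table of this form can be reproduced in Python. The helper name `table_value` is hypothetical and not part of this explainer, and the sketch assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`.

```python
from statistics import NormalDist

# Standard normal variable Z ~ N(0, 1).
Z = NormalDist(mu=0.0, sigma=1.0)


def table_value(z: float) -> float:
    """Return P(0 <= Z <= z) for z >= 0, the form listed in the table."""
    if z < 0:
        raise ValueError("the table only lists values for z >= 0")
    # cdf(z) is P(Z <= z); subtracting P(Z <= 0) = 0.5 leaves P(0 <= Z <= z).
    return Z.cdf(z) - 0.5


print(round(table_value(1.0), 4))  # 0.3413
print(round(table_value(3.0), 4))  # 0.4987
```

Probabilities for negative values of $z$ are not listed in such a table; they are recovered from the symmetry of the bell curve.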
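To compute the probability $P(160 < X < 180)$, we begin by standardizing the normal distribution:
$$P(160 < X < 180) = P\left(\frac{160 - 175}{5} < \frac{X - 175}{5} < \frac{180 - 175}{5}\right) = P(-3 < Z < 1).$$
Since $Z$ is the standard normal random variable, we analyze the region $-3 < Z < 1$ by drawing the bell curves.
Then, using the symmetry of the bell curve, we can write
$$P(-3 < Z < 1) = P(-3 < Z < 0) + P(0 \leq Z < 1) = P(0 \leq Z \leq 3) + P(0 \leq Z \leq 1).$$
From the standard normal table, we find $P(0 \leq Z \leq 3) = 0.4987$ and $P(0 \leq Z \leq 1) = 0.3413$. Summing these numbers, we get
$$P(-3 < Z < 1) = 0.4987 + 0.3413 = 0.8400.$$
So $P(160 < X < 180) \approx 0.84$. We remember that, to convert a probability to a percentage, we need to multiply the probability by 100. So, $0.84 \times 100\% = 84\%$, which means that the heights of approximately $84\%$ of people from France are between 160 cm and 180 cm.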
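The result for the heights example can be checked numerically. The following is a small sketch, assuming Python 3.8 or later and the standard library class `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Heights in cm: mean 175, standard deviation 5.
heights = NormalDist(mu=175, sigma=5)

# P(160 < X < 180) = P(X < 180) - P(X < 160).
prob = heights.cdf(180) - heights.cdf(160)

print(f"{prob:.4f}")         # 0.8400
print(f"{prob * 100:.0f}%")  # 84%
```

Here the cumulative distribution function replaces the table lookup, so no standardization step is needed.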
Given a value of a random variable, the $z$-score represents its position relative to the mean value, measured by the number of standard deviations.
Definition: $z$-Score
Let $x$ be a data point from a variable with mean $\mu$ and standard deviation $\sigma$. Then, the $z$-score associated with $x$ is given by
$$z = \frac{x - \mu}{\sigma}.$$
The probability associated with a $z$-score is $P(Z < z)$, where $Z$ is the standard normal variable.
We note that the formula above is analogous to that of standardizing a normal distribution, except that both $x$ and $z$ are in lowercase. For example, a negative $z$-score of $-k$, where $k > 0$, indicates that the value $x$ is $k\sigma$ to the left of $\mu$. In other words, $x = \mu - k\sigma$.
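The $z$-score formula can be sketched as a short Python function. The function name `z_score` is hypothetical, and the numbers are taken from the heights example earlier in this explainer (mean 175 cm, standard deviation 5 cm).

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Return the number of standard deviations that x lies from mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


z = z_score(160, mu=175, sigma=5)
print(z)  # -3.0

# A negative z-score places x to the left of the mean: x = mu + z * sigma.
print(175 + z * 5)  # 160.0

# Probability associated with the z-score, P(Z < z).
print(round(NormalDist().cdf(z), 4))  # 0.0013
```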
Let us look at a few examples to familiarize ourselves with different contexts.
Example 1: Estimating Normal Distribution Probabilities in Context
A crop of apples has a mean weight of 105 g and a standard deviation of 3 g. It is assumed that a normal distribution is an appropriate model for this data. What is the approximate probability that a randomly selected apple from the crop has a weight less than 105 g?
Answer
We begin by using $X$ to represent the weight of an apple, which is assumed to follow the normal distribution with mean $\mu = 105$ and standard deviation $\sigma = 3$. In other words, we can write $X \sim N(105, 3^2)$. We need to compute the probability $P(X < 105)$.
To standardize the normal distribution, we first subtract $\mu = 105$ from each side. Then, we divide each side by $\sigma = 3$. Lastly, we replace $\frac{X - \mu}{\sigma}$ with $Z$:
$$P(X < 105) = P(X - 105 < 0) = P\left(\frac{X - 105}{3} < 0\right) = P(Z < 0).$$
By symmetry of the bell curve, $P(Z < 0) = 0.5$, as seen in the picture below.
We remember that, to convert a probability to a percentage, we need to multiply the probability by 100. So $0.5 \times 100\% = 50\%$, which means that the probability that a randomly selected apple from the crop has a weight less than 105 g is $50\%$.
In the last example, we presented the process of standardizing the normal distribution to compute its probability. However, this was unnecessary for this particular example, since we were asked simply to compute the probability that a randomly selected apple has a weight less than the mean. Since the weights of apples are assumed to be normally distributed, the distribution is symmetric about the mean. In other words, approximately half of the apples have weights less than the mean, and the other half have weights above it. Using this reasoning, we could have inferred straight away that $P(X < 105) = 0.5$.
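The symmetry argument can be illustrated with a brief sketch in Python, assuming the standard library class `statistics.NormalDist`. The distance `d` is an arbitrary value chosen only for illustration.

```python
from statistics import NormalDist

# Apple weights in grams: mean 105, standard deviation 3.
apples = NormalDist(mu=105, sigma=3)

# Probability of a weight below the mean.
print(apples.cdf(105))  # 0.5

# Symmetry: the tails at equal distances on either side of the mean match.
d = 4.0  # arbitrary distance from the mean, for illustration
left_tail = apples.cdf(105 - d)       # P(X < 101)
right_tail = 1 - apples.cdf(105 + d)  # P(X > 109)
print(abs(left_tail - right_tail) < 1e-12)  # True
```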
In our next example, we will demonstrate the process for computing the probability for a nontrivial region.
Example 2: Calculating Probabilities from a Normal Distribution in Context
The monthly salaries of workers at a factory are normally distributed with mean 210 pounds and standard deviation 10 pounds. Determine the probability of choosing at random a worker with a salary between 184 and 233 pounds.
Answer
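Let $X$ represent the monthly salary, which is normally distributed with $\mu = 210$ and $\sigma = 10$. We need to compute $P(184 < X < 233)$. Standardizing the normal distribution,
$$P(184 < X < 233) = P\left(\frac{184 - 210}{10} < \frac{X - 210}{10} < \frac{233 - 210}{10}\right) = P(-2.6 < Z < 2.3).$$
Since $-2.6 < Z < 2.3$ involves positive and negative values of $Z$, we need to split this into the positive and the negative regions. Let us think through the process using pictures of the bell curve.
This leads to the following equations:
$$P(-2.6 < Z < 2.3) = P(-2.6 < Z < 0) + P(0 \leq Z < 2.3) = P(0 \leq Z \leq 2.6) + P(0 \leq Z \leq 2.3).$$
Using the standard normal table, we get $P(0 \leq Z \leq 2.6) = 0.4953$ and $P(0 \leq Z \leq 2.3) = 0.4893$. Adding up the probabilities,
$$P(-2.6 < Z < 2.3) = 0.4953 + 0.4893 = 0.9846.$$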
So, the probability of choosing at random a worker with a salary between 184 and 233 pounds is 0.9846.
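As a rough check of this answer, the sketch below follows the same two steps in Python (splitting the region at zero, then adding the two table-style values) and compares the result with a direct computation. It assumes the standard library class `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

salary = NormalDist(mu=210, sigma=10)  # monthly salary in pounds
Z = NormalDist()                       # standard normal, mean 0 and sd 1

# Standardize the endpoints.
z_low = (184 - 210) / 10   # -2.6
z_high = (233 - 210) / 10  # 2.3

# Table-style values P(0 <= Z <= z), rounded to four decimals like a table.
negative_part = round(Z.cdf(-z_low) - 0.5, 4)  # 0.4953, using symmetry
positive_part = round(Z.cdf(z_high) - 0.5, 4)  # 0.4893
table_total = round(negative_part + positive_part, 4)

# Direct computation without standardizing.
direct = round(salary.cdf(233) - salary.cdf(184), 4)

print(table_total, direct)  # 0.9846 0.9846
```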
In the next two examples, we will consider the percentage of data lying within a given range. We remember that the probability is converted into the percentage after multiplying by 100.
Example 3: Estimating Population Percentages from a Normal Distribution in Context
The masses of a population of blackbirds are normally distributed with mean 103 g and standard deviation 11 g.
- To the nearest integer, what percentage of blackbirds have masses less than 110 g?
- To the nearest tenth, what percentage of blackbirds have masses greater than 124 g?
- To the nearest integer, what percentage of blackbirds have masses between 95 g and 120 g?
Answer
Let $X$ be the mass of a blackbird. Then, $X \sim N(103, 11^2)$.
Part 1
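Let us find the percentage of blackbirds with masses less than 110 g. Using probability notation, we need to compute $P(X < 110)$. We begin by standardizing the normal distribution:
$$P(X < 110) = P\left(\frac{X - 103}{11} < \frac{110 - 103}{11}\right) = P\left(Z < \frac{7}{11}\right).$$
To use the standard normal table, we need to round $\frac{7}{11} = 0.6363\ldots$ to the nearest hundredth, 0.64. Then, we split $Z < 0.64$ into the positive and the negative regions as pictured below.
Then,
$$P(Z < 0.64) = P(Z < 0) + P(0 \leq Z \leq 0.64).$$
We recall that $P(Z < 0) = 0.5$, while $P(0 \leq Z \leq 0.64) = 0.2389$ is obtained from the standard normal table. Summing the probabilities,
$$P(Z < 0.64) = 0.5 + 0.2389 = 0.7389.$$
Converting the probability into a percentage, we get $0.7389 \times 100\% = 73.89\%$. Rounding to the nearest integer, $74\%$ of blackbirds have masses less than 110 g.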
Part 2
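Let us find the percentage of blackbirds with masses greater than 124 g. In probability notation, we need to compute $P(X > 124)$. We begin by standardizing the normal distribution:
$$P(X > 124) = P\left(\frac{X - 103}{11} > \frac{124 - 103}{11}\right) = P\left(Z > \frac{21}{11}\right).$$
We need to round $\frac{21}{11} = 1.9090\ldots$ to the nearest hundredth, 1.91. Then, the right-hand side of the equation above is approximately equal to $P(Z > 1.91)$. We graph the bell curves below to analyze the region $Z > 1.91$.
Then, we get
$$P(Z > 1.91) = P(Z > 0) - P(0 \leq Z \leq 1.91).$$
We know that $P(Z > 0) = 0.5$, and we use the standard normal table to obtain $P(0 \leq Z \leq 1.91) = 0.4719$. Then,
$$P(Z > 1.91) = 0.5 - 0.4719 = 0.0281.$$
Converting the probability into a percentage, we get $0.0281 \times 100\% = 2.81\%$. Rounding to the nearest tenth, $2.8\%$ of blackbirds have masses greater than 124 g.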
Part 3
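Let us find the percentage of blackbirds with masses between 95 g and 120 g. In probability notation, we need to compute $P(95 < X < 120)$. We begin by standardizing the normal distribution:
$$P(95 < X < 120) = P\left(\frac{95 - 103}{11} < \frac{X - 103}{11} < \frac{120 - 103}{11}\right) = P\left(-\frac{8}{11} < Z < \frac{17}{11}\right).$$
We need to round $-\frac{8}{11} = -0.7272\ldots$ and $\frac{17}{11} = 1.5454\ldots$ to the nearest hundredth, $-0.73$ and 1.55 respectively. Then, the right-hand side of the equation above is approximately equal to $P(-0.73 < Z < 1.55)$. We use the symmetry of the bell curve to analyze this probability.
Then, we get
$$P(-0.73 < Z < 1.55) = P(-0.73 < Z < 0) + P(0 \leq Z < 1.55) = P(0 \leq Z \leq 0.73) + P(0 \leq Z \leq 1.55).$$
Using the standard normal table, we get $P(0 \leq Z \leq 0.73) = 0.2673$ and $P(0 \leq Z \leq 1.55) = 0.4394$. Summing the probabilities,
$$P(-0.73 < Z < 1.55) = 0.2673 + 0.4394 = 0.7067.$$
Converting the probability into a percentage, we get $0.7067 \times 100\% = 70.67\%$. Rounding to the nearest integer, $71\%$ of blackbirds have masses between 95 g and 120 g.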
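The three parts of this example can be cross-checked with a short Python sketch that works with the unrounded $z$-values. It assumes the standard library class `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Blackbird masses in grams: mean 103, standard deviation 11.
mass = NormalDist(mu=103, sigma=11)

# Part 1: percentage with mass less than 110 g.
pct_less_110 = 100 * mass.cdf(110)

# Part 2: percentage with mass greater than 124 g (complement of the cdf).
pct_more_124 = 100 * (1 - mass.cdf(124))

# Part 3: percentage with mass between 95 g and 120 g.
pct_between = 100 * (mass.cdf(120) - mass.cdf(95))

print(round(pct_less_110))     # 74
print(round(pct_more_124, 1))  # 2.8
print(round(pct_between))      # 71
```

Because this sketch does not round the $z$-values to two decimal places, the unrounded percentages differ slightly from the table-based ones, but they round to the same final answers.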
Example 4: Using Normal Distribution Probabilities to Solve a Real-Life Problem
In a school with 1 000 students, the heights of students are normally distributed with mean 113 cm and standard deviation 5 cm. How many students are shorter than 121 cm?
Answer
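Let $X$ represent the height of a student. Then, $X \sim N(113, 5^2)$. To answer the question, we first need to determine approximately what percentage of the students are shorter than 121 cm. So we compute $P(X < 121)$. Standardizing the normal distribution,
$$P(X < 121) = P\left(\frac{X - 113}{5} < \frac{121 - 113}{5}\right) = P(Z < 1.6).$$
We draw the bell curve to analyze the probability.
This leads to
$$P(Z < 1.6) = P(Z < 0) + P(0 \leq Z \leq 1.6).$$
By symmetry, $P(Z < 0) = 0.5$. Using the standard normal table, we get $P(0 \leq Z \leq 1.6) = 0.4452$. Adding up the probabilities,
$$P(Z < 1.6) = 0.5 + 0.4452 = 0.9452.$$
So $P(X < 121) = 0.9452$, which means that approximately $94.52\%$ of the students are shorter than 121 cm. Since we have 1 000 students total, $94.52\%$ of the total students is
$$0.9452 \times 1\,000 = 945.2 \approx 945.$$
We have rounded the right-hand side of the equation above to the nearest integer, since the number of students must be an integer.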
So, approximately 945 students are shorter than 121 cm.
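The step from a probability to an expected count can be sketched in Python as follows, assuming the standard library class `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

# Student heights in cm: mean 113, standard deviation 5.
student_height = NormalDist(mu=113, sigma=5)
total_students = 1000

# Expected number of students shorter than 121 cm.
expected = total_students * student_height.cdf(121)

# The number of students must be a whole number, so we round.
print(round(expected))  # 945
```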
The two parameters $\mu$ and $\sigma$ characterize a normally distributed random variable. If we are given the values of these two parameters, we can standardize the normal distribution and find the probabilities using the standard normal table. Some problems leave one or both of these parameters unknown. In the next two examples, we will consider problems with unknown parameters.
Example 5: Finding the Mean Using Normal Distribution
The heights of a sample of flowers are normally distributed with mean $\mu$ and standard deviation 12 cm. Given that $10.56\%$ of the flowers are shorter than 47 cm, determine $\mu$.
Answer
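Let $X$ represent the height of a flower. Then, $X \sim N(\mu, 12^2)$. Since $10.56\%$ of the flowers are shorter than 47 cm, we know that $P(X < 47) = 0.1056$. We converted the percentage to a decimal number by dividing by 100.
Standardizing the normal distribution,
$$P(X < 47) = P\left(\frac{X - \mu}{12} < \frac{47 - \mu}{12}\right) = P\left(Z < \frac{47 - \mu}{12}\right).$$
Let us denote $z = \frac{47 - \mu}{12}$. We are given that $P(Z < z) = 0.1056$. In other words, 0.1056 is the probability associated with the $z$-score given by $z = \frac{47 - \mu}{12}$. Since the probability 0.1056 is less than 0.5, $z$ must be negative.
We use the pictures below to think through the process.
This leads to the equations
$$P(Z < z) = P(Z > -z) = P(Z > 0) - P(0 \leq Z \leq -z).$$
Since $P(Z < z) = 0.1056$ and $P(Z > 0) = 0.5$, this means
$$P(0 \leq Z \leq -z) = 0.5 - 0.1056 = 0.3944.$$
Using the standard normal table, the value corresponding to the probability of 0.3944 is 1.25: so $-z = 1.25$, or equivalently $z = -1.25$. Recall that we have defined $z = \frac{47 - \mu}{12}$. Then,
$$\frac{47 - \mu}{12} = -1.25, \qquad 47 - \mu = -15, \qquad \mu = 62.$$
So, $\mu = 62$ cm.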
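Reading the table in reverse corresponds to evaluating the inverse of the cumulative distribution function. The sketch below assumes the standard library method `statistics.NormalDist.inv_cdf` (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

sigma = 12   # standard deviation in cm
x = 47       # height threshold in cm
p = 0.1056   # proportion of flowers shorter than x

# z-score whose associated probability P(Z < z) equals p.
z = NormalDist().inv_cdf(p)
print(round(z, 2))  # -1.25

# Rearranging z = (x - mu) / sigma gives mu = x - z * sigma.
mu = x - z * sigma
print(round(mu))  # 62
```

The value of `z` returned here is not rounded to two decimal places as in the table, so `mu` comes out very close to, rather than exactly, 62.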
Example 6: Finding the Mean Using Normal Distribution
The heights of a group of students follow a normal distribution with a standard deviation of 20 cm. The probability that a student's height is less than or equal to 180 cm is equal to the probability that a standard normal variable is less than or equal to 2.2. Find the mean height of the group of students.
Answer
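Let $X$ represent the height of a student, which is normally distributed with $\sigma = 20$. Then, $X \sim N(\mu, 20^2)$. We notice here that the mean, $\mu$, is unknown and the question asks us to find this value.
We are given that $P(X \leq 180) = P(Z \leq 2.2)$, and we remember that $Z = \frac{X - \mu}{\sigma}$, so
$$P(X \leq 180) = P\left(\frac{X - \mu}{\sigma} \leq \frac{180 - \mu}{\sigma}\right) = P\left(Z \leq \frac{180 - \mu}{\sigma}\right).$$
So this means $\frac{180 - \mu}{\sigma} = 2.2$. Since we know $\sigma = 20$,
$$\frac{180 - \mu}{20} = 2.2, \qquad 180 - \mu = 44, \qquad \mu = 136.$$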
So, the mean height of the group of students is 136 cm.
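This answer can be verified with a short Python sketch that rearranges the standardization formula and then confirms that the two probabilities in the question agree. It assumes the standard library class `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

sigma = 20  # standard deviation in cm
x = 180     # height threshold in cm
z = 2.2     # matching standard normal value

# Rearranging z = (x - mu) / sigma gives mu = x - z * sigma.
mu = x - z * sigma
print(mu)  # 136.0

# Check: P(X <= 180) should equal P(Z <= 2.2).
lhs = NormalDist(mu, sigma).cdf(x)
rhs = NormalDist().cdf(z)
print(abs(lhs - rhs) < 1e-12)  # True
```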
Key Points
- For application problems involving the normal distribution, we begin by defining $X$ to be the normal variable with mean $\mu$ and standard deviation $\sigma$.
- If the problem provides variance instead of standard deviation, then we should remember to take the square root to obtain the standard deviation $\sigma$.
- Given a normal random variable $X$ with mean $\mu$ and standard deviation $\sigma$, we can standardize it using the formula $Z = \frac{X - \mu}{\sigma}$.
- Given a value $x$ of a random variable, its $z$-score is $z = \frac{x - \mu}{\sigma}$. The probability associated with the $z$-score is $P(Z < z)$.