In this explainer, we will learn how to find the measures of central tendency like the mean, median, and mode.

To find information from *a numerical data set*, one of the first things we do is to look for what might be
a *typical* value or *average* in the set. This gives us an idea of where
the center of the data set sits. The mean, the median, and the mode are all different measures of central tendency.

Note: *A numerical data set* is one in which the values are measurements. Data collected on height, weight,
and temperature are examples of numerical data, whereas, for example, color of car, type of breakfast cereal, and make of cell phone are not numerical.

Let us remind ourselves how to work out the mean, median, and mode of a numerical data set.

### The Mean, Median, and Mode of a Data Set

**The Mean**: To calculate the mean of a numerical data set, add up all the data values and then divide the total by the number of values in the data set.

**The Median**: To work out the median of a data set:

- Organize the data according to size: either from the smallest value to the largest, or from the largest value to the smallest.
- Count the number of values. If the number of values is an odd number, then the median is the middle value of the ordered data set. If the number of values is even, then the median is the mean of the two values at the center of the ordered data set.

**The Mode**: The mode is the most commonly or frequently occurring value (or values). We sometimes call this the modal value.
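
The three definitions above can be sketched in a few lines of Python. This is only an illustration: the function names `mean`, `median`, and `modes`, the choice to return an empty list when no value repeats, and the `sample` data are assumptions made for this sketch, not part of the explainer.

```python
from collections import Counter


def mean(values):
    """Add up all the values, then divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Order the data, then take the middle value (or the mean of the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # odd count: a single middle value
    return (ordered[middle - 1] + ordered[middle]) / 2  # even count: mean of the two middle values


def modes(values):
    """Return every value that occurs most often; an empty list means no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so we treat the data as having no mode
    return [value for value, count in counts.items() if count == highest]


sample = [2, 3, 3, 5, 7, 7, 8]  # hypothetical data, chosen to have two modes
print(mean(sample), median(sample), modes(sample))
```

For this made-up sample the mean and median are both 5, and the data is bimodal with modes 3 and 7.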

**Notes**

- A data set can have only one mean and one median but it may have one mode, more than one mode, or no mode at all. If a data set has two modes, for example, we would say it is bimodal.
- An outlier is a value that is much smaller or much larger than most of the other values in a data set. If there are one or more outliers in a data set, they can reduce the usefulness of the mean as a measure of central tendency.
- There are many different kinds of mean and the one we are using, which is most commonly used, is called the arithmetic mean.
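
To see how an outlier can affect the mean much more than the median, here is a small sketch using Python's standard `statistics` module. The data values are hypothetical and chosen only to make the effect visible.

```python
from statistics import mean, median

values = [4, 5, 5, 6, 7]       # hypothetical data with no outlier
with_outlier = values + [60]   # the same data plus one unusually large value

print(mean(values), median(values))              # 5.4 5
print(mean(with_outlier), median(with_outlier))  # 14.5 5.5
```

The single large value moves the mean from 5.4 to 14.5, while the median only moves from 5 to 5.5.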

In our first example we will find the mean, median, and mode of a data set.

### Example 1: Finding the Mean, Median, and Mode of a Data Set

The data set shows the number of tomatoes growing on each tomato plant in a garden.

7 | 12 | 8 | 3 | 0 | 4 | 4 | 6 | 5 |

- Calculate the mean of the data, giving your answer correct to the nearest integer.
- Find the median of the data.
- Find the mode of the data.

### Answer

**Part 1**

To calculate the mean of the data we first add all the data values together. In this case, that means adding up the numbers of tomatoes on every plant:

$$7 + 12 + 8 + 3 + 0 + 4 + 4 + 6 + 5 = 49.$$

The sum of the values is 49, so there is a total of 49 tomatoes in the garden.

The next step is to work out how many tomato plants there are in the garden. There is 1 plant with 7 tomatoes, 1 plant with 12 tomatoes, 1 plant with 8 tomatoes, 1 plant with 3 tomatoes, 1 plant with no tomatoes, 2 plants with 4 tomatoes, 1 plant with 6 tomatoes, and 1 plant with 5 tomatoes. This gives us $1 + 1 + 1 + 1 + 1 + 2 + 1 + 1 = 9$ tomato plants. To work out the mean, we divide the total number of tomatoes by the number of plants:

$$\text{mean} = \frac{49}{9} = 5.444\ldots$$

Hence, to the nearest integer (or whole number), the mean of the data is 5. We interpret this as follows: the mean number of tomatoes on a plant in the garden is 5.

**Part 2**

To find the median of the data, we first order the data according to size. If we start from the smallest number of tomatoes on a plant, the data can be ordered as follows:

$$0,\ 3,\ 4,\ 4,\ 5,\ 6,\ 7,\ 8,\ 12.$$

We know from when we worked out the mean that there are 9 data values, that is, 9 tomato plants. This is an odd number so the median is the middle value of the ordered data set. With 9 values, the middle one is the 5th value:

$$0,\ 3,\ 4,\ 4,\ \mathbf{5},\ 6,\ 7,\ 8,\ 12.$$

The middle value is 5. So, the median number of tomatoes on a plant in the garden is 5, which, in this case, is the same as the mean number of tomatoes.

**Part 3**

To find the mode, we can look at our ordered data to see if any of the values occur more than the others:

$$0,\ 3,\ \mathbf{4},\ \mathbf{4},\ 5,\ 6,\ 7,\ 8,\ 12.$$

The only value to occur more than once in this data set is 4. So the mode or the modal value is 4. That is, the most frequently occurring number of tomatoes on a plant is 4.
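
As a quick check of the three answers, the tomato data can be passed to Python's standard `statistics` module. This is just a verification sketch; the variable names are chosen here for illustration.

```python
from statistics import mean, median, mode

tomatoes = [7, 12, 8, 3, 0, 4, 4, 6, 5]

total = sum(tomatoes)          # 49 tomatoes in the garden
plants = len(tomatoes)         # 9 tomato plants

print(total, plants)           # 49 9
print(round(mean(tomatoes)))   # 5  (49 / 9 = 5.444..., rounded to the nearest integer)
print(median(tomatoes))        # 5
print(mode(tomatoes))          # 4
```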

In this next example, we will think about which measure of average is most appropriate.

### Example 2: Choosing the Most Appropriate Average

Liam wants to know the most popular age at which to ride on a particular roller coaster in a theme park. Should he calculate the mean, median, or modal age of the riders?

### Answer

Liam wants to know the **most popular** age at which to ride on the roller coaster. Of our three possible
calculations, the mean gives us the average age to ride on the roller coaster, and the median gives us the middle of the riders’ ages.
The mode is defined as the most frequently occurring (i.e., the most common) value. So the most popular age would be the modal age,
and this is what Liam should calculate.

Let us look at an example of the type of data where the mean is not an appropriate measure of central tendency.

### Example 3: When the Mean Is Not Appropriate

A small company makes and sells cakes. There are 11 employees (including the owner) in the company whose salaries are given in the table below.

| Employee | Salary ($ per year) |
|---|---|
| Store Person | $15,000 |
| Mixer 1 | $16,500 |
| Mixer 2 | $16,500 |
| Baker 1 | $17,500 |
| Baker 2 | $17,500 |
| Head Baker | $20,000 |
| Icer 1 | $18,500 |
| Icer 2 | $18,500 |
| Sales Person | $19,000 |
| Finance Director | $75,000 |
| Owner | $90,000 |

Find the mean, median, and mode for the salaries data and determine which is the most appropriate measure of central tendency.

### Answer

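To find the mean, we add up all the salaries and divide by the number of employees. The sum of all the salaries (in dollars) is

$$15{,}000 + 2(16{,}500) + 2(17{,}500) + 2(18{,}500) + 19{,}000 + 20{,}000 + 75{,}000 + 90{,}000 = 324{,}000.$$

The mean salary is, therefore, as follows:

$$\frac{324{,}000}{11} = 29{,}454.545\ldots \approx 29{,}455.$$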

So, the mean salary for an employee of the cake company is $29,455. To find the median salary, we will need to put the data in order of size, starting at the smallest salary, which is $15,000.

| Employee | Salary ($ per year) |
|---|---|
| Store Person | $15,000 |
| Mixer 1 | $16,500 |
| Mixer 2 | $16,500 |
| Baker 1 | $17,500 |
| Baker 2 | $17,500 |
| Icer 1 | $18,500 |
| Icer 2 | $18,500 |
| Sales Person | $19,000 |
| Head Baker | $20,000 |
| Finance Director | $75,000 |
| Owner | $90,000 |

There are 11 employees, which is an odd number, so the median will be the middle value. Half of 11 is 5.5, so rounding up to 6, the median will be the 6th value in the ordered data set.

The median salary is therefore $18,500. To find the mode, we note that there are two employees earning $16,500, two employees earning $17,500, and two employees earning $18,500, and none of the other employees have the same salary. The salary data is therefore trimodal; that is, it has 3 modal salaries, which are $16,500, $17,500, and $18,500.

To summarize our findings, the mean salary is $29,455, the median salary is $18,500, and there are three modes which are $16,500, $17,500, and $18,500.

To determine which of these measures of central tendency is the most appropriate, we must look back at our data set. We can see that out of 11 employees, 9 earned $20,000 or less.

That leaves only 2 employees who earned more than $20,000. (In fact they both earned substantially more
than $20,000!) So, the mean salary of $29,455
does not give us a realistic indication of where the center of the data is. Remember that we are looking for a measure of the central tendency of
the data: a *typical* value. But the vast majority of the employees earn nowhere near the mean amount.

The median salary, on the other hand, is $18,500, which is much closer to what most employees earn. It is not particularly useful to quote three modes, so the most appropriate measure of central tendency for this data is the median.
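
The salary figures can be checked with a short Python sketch. It uses `multimode` from the standard `statistics` module (available from Python 3.8), which returns every most frequent value; the variable name `salaries` is chosen here for illustration.

```python
from statistics import mean, median, multimode

salaries = [15000, 16500, 16500, 17500, 17500, 20000,
            18500, 18500, 19000, 75000, 90000]

print(sum(salaries))           # 324000
print(round(mean(salaries)))   # 29455
print(median(salaries))        # 18500
print(multimode(salaries))     # [16500, 17500, 18500]
```

The two largest salaries pull the mean well above what nine of the eleven employees earn, while the median stays among the typical salaries.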

### Example 4: Measures of Central Tendency

William collected the following data that represents the number of books his friends read last year. Which measure of central tendency results in an answer of 9.5 books?

1 | 5 | 12 | 6 | 2 | 7 | 3 | 14 | 15 |

13 | 13 | 8 | 2 | 15 | 11 | 15 | 5 | 15 |

### Answer

To work out which measure of central tendency is represented by 9.5 books, let us work out each of the mean, median, and mode and see which one (or more) is 9.5.

**The Mean**: We calculate the mean by adding up all of the data and dividing by the number of values in our data set. The sum of all the values is

$$1 + 5 + 12 + 6 + 2 + 7 + 3 + 14 + 15 + 13 + 13 + 8 + 2 + 15 + 11 + 15 + 5 + 15 = 162.$$

And the number of values in our data set is 18. So the mean number of books read by William’s 18 friends was as follows:

$$\frac{162}{18} = 9.$$

Unfortunately, this is not the number we are looking for, which is 9.5. So let us find the median of the data and see if the median number of books read was 9.5.

**The Median**: First we put the data in order of size:

$$1,\ 2,\ 2,\ 3,\ 5,\ 5,\ 6,\ 7,\ 8,\ 11,\ 12,\ 13,\ 13,\ 14,\ 15,\ 15,\ 15,\ 15.$$

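Since there are 18 values in this data set, which is an even number, the median is the average of the two middle values. Half of 18 is 9, so the middle two values in the ordered data set are the 9th and 10th:

$$1,\ 2,\ 2,\ 3,\ 5,\ 5,\ 6,\ 7,\ \mathbf{8},\ \mathbf{11},\ 12,\ 13,\ 13,\ 14,\ 15,\ 15,\ 15,\ 15, \qquad \frac{8 + 11}{2} = 9.5.$$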

The median number of books is 9.5, and this is the measure we were looking for. Let us finish this example by checking what the mode of the data is.

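**The Mode**: We can see from the ordered data that 2 of William’s friends read 2 books, 2 read 5 books, 2 read 13 books, and 4 friends read 15 books:

$$1,\ \mathbf{2},\ \mathbf{2},\ 3,\ \mathbf{5},\ \mathbf{5},\ 6,\ 7,\ 8,\ 11,\ 12,\ \mathbf{13},\ \mathbf{13},\ 14,\ \mathbf{15},\ \mathbf{15},\ \mathbf{15},\ \mathbf{15}.$$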

The remaining 8 of William’s friends each read a different number of books from each other. Since the number of books occurring most commonly is 15, this is the mode, which, of course, is not 9.5. Note that it would be impossible in this case for the mode to be equal to 9.5, since in our data set we have only whole books (i.e., they are counted in integers—there are no fractions of books).

The measure of central tendency represented by 9.5 books is the median.
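
The same search can be sketched in Python: compute all three measures and see which one equals 9.5. The dictionary `measures` and the list `matches` are names introduced here for illustration only.

```python
from statistics import mean, median, mode

books = [1, 5, 12, 6, 2, 7, 3, 14, 15,
         13, 13, 8, 2, 15, 11, 15, 5, 15]

measures = {
    "mean": mean(books),      # 162 / 18
    "median": median(books),  # mean of the 9th and 10th ordered values
    "mode": mode(books),      # most frequent value
}

# Keep only the measures whose value is 9.5.
matches = [name for name, value in measures.items() if value == 9.5]

print(measures)
print(matches)  # ['median']
```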

### Example 5: Mean, Median, and Mode with Negative Numbers

Find the mean, median, and mode of the following set of daily temperatures: , , , , , , , .

### Answer

To calculate the mean temperature, we first add all of the data values together:

We then divide this by the number of numbers, that is, by the number of temperatures recorded, which is 8:

To work out the median temperature, we first put the data in order of size. Since we are dealing with temperatures and some of these are negative values, we order the data from coldest to warmest:

There are eight data values (or temperature readings), which is an even number, so the median will be the average of the middle two values, that is, the average of the 4th and 5th values:

The fourth value is , and the fifth is . The average of these two temperatures is

So, the median temperature is .

The last measure we want to find is the mode, which is the most frequently occurring temperature. From the ordered data set we can see that the only temperature that occurs more than once is . This occurs twice in the data set, so the modal temperature is .

Our next example shows what can happen to our measures of the average if we add some data to our data set.

### Example 6: How Does Adding Data Affect the Mean, Median, and Mode?

The table shows the average April rainfall, in inches, for 12 cities.

4.5 | 2.1 | 4.4 | 2.1 | 3.2 | 3.9 |

1.9 | 2.3 | 1.3 | 2.8 | 3.0 | 3.1 |

If another city, with the value 2.1, is added to this list, which of the following would be true?

- A. The mean would decrease.
- B. The mean would increase.
- C. The median would increase.
- D. The mode would decrease.

### Answer

To decide which of the statements is correct, we will begin by working out the values of the mean, median, and mode for this rainfall data.

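**The Mean**: We calculate the mean by adding up all the data values and dividing by the number of values in our data set, that is, by the number of cities:

$$\frac{4.5 + 2.1 + 4.4 + 2.1 + 3.2 + 3.9 + 1.9 + 2.3 + 1.3 + 2.8 + 3.0 + 3.1}{12} = \frac{34.6}{12} = 2.883\ldots$$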

Now if we add in the 2.1 inches of rainfall for the other city, this will increase our total rainfall by 2.1, and we now have thirteen cities. So, the new mean is

$$\frac{34.6 + 2.1}{13} = \frac{36.7}{13} = 2.823\ldots$$

The mean rainfall has therefore decreased slightly, from 2.883 to 2.823 inches, and so statement A is true. Let us now check what happens to the median if we add 2.1 to our data set.

**The Median**: To find the median, we put our original data in order of size, starting with the smallest:

$$1.3,\ 1.9,\ 2.1,\ 2.1,\ 2.3,\ 2.8,\ 3.0,\ 3.1,\ 3.2,\ 3.9,\ 4.4,\ 4.5.$$

There are 12 cities, which is an even number, so the median will be the average of the middle two, that is, the average of the 6th and 7th values in the ordered list:

$$\frac{2.8 + 3.0}{2} = 2.9.$$

Now if we add in the rainfall for the 13th city, the ordered list becomes

$$1.3,\ 1.9,\ 2.1,\ 2.1,\ 2.1,\ 2.3,\ \mathbf{2.8},\ 3.0,\ 3.1,\ 3.2,\ 3.9,\ 4.4,\ 4.5.$$

Since 13 is an odd number, the median will be the middle value of the new ordered list. To find which is the middle value, we divide the number of values (or cities) by 2: $13 \div 2 = 6.5$. There is no “6.5th” data value so we must round this up to 7. Our middle value (the median) is then the 7th value in the ordered list. The new median rainfall is therefore 2.8 inches, which is less than the previous median of 2.9 inches. So, the median rainfall has decreased. This is not one of the statements in our list.

Next we will consider what happens to the mode if we add the other city.

**The Mode**: The mode is the most frequently occurring value. In our original data set, there is one mode, which is 2.1 inches. This occurs twice in the data set, whereas all the other values occur only once each. If we add the data for our 13th city to our list (i.e., another instance of 2.1), there are now three instances of 2.1 inches:

$$1.3,\ 1.9,\ \mathbf{2.1},\ \mathbf{2.1},\ \mathbf{2.1},\ 2.3,\ 2.8,\ 3.0,\ 3.1,\ 3.2,\ 3.9,\ 4.4,\ 4.5.$$

The mode has not changed.

In summary, by adding the rainfall (2.1 inches) for another city, both the mean and the median rainfall decreased, but the mode remained the same.
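
The before-and-after comparison can be reproduced with a short Python sketch using the standard `statistics` module. The names `rainfall` and `updated` are chosen here for illustration, and the results are rounded for display.

```python
from statistics import mean, median, multimode

rainfall = [4.5, 2.1, 4.4, 2.1, 3.2, 3.9,
            1.9, 2.3, 1.3, 2.8, 3.0, 3.1]
updated = rainfall + [2.1]  # add the 13th city

# Print mean (3 d.p.), median (2 d.p.), and the list of modes for each data set.
for label, data in (("12 cities", rainfall), ("13 cities", updated)):
    print(label, round(mean(data), 3), round(median(data), 2), multimode(data))
```

The first line of output gives the original measures and the second gives the measures after the extra city is included, so the decrease in the mean and median and the unchanged mode can be read off directly.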

### Note

We could consider this question without necessarily calculating the mean and median for the larger data set. Having worked out the mean and median for the original data (mean rainfall = 2.883 inches and median rainfall = 2.9 inches), we know that we are adding a value (2.1 inches) that is **less** than either of these. This smaller value will effectively pull the mean and median towards it. So in fact we would expect these two measures of average to **decrease** slightly because of this. In this particular case the change is very small. If our data set were larger and/or contained more values equal to or around the median value, the change might have been negligible.

Since the mode is 2.1 inches and we are adding in another instance of 2.1, the mode will not change.
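
Returning to the mean, one way to see why it must decrease is the following short derivation. The symbols are introduced here for illustration and do not appear elsewhere in this explainer: $n$ is the number of original values, $\bar{x}$ is their mean, and $x$ is the value being added.

$$\bar{x}_{\text{new}} = \frac{n\bar{x} + x}{n + 1}, \qquad \bar{x}_{\text{new}} - \bar{x} = \frac{x - \bar{x}}{n + 1}.$$

The difference is negative whenever $x < \bar{x}$, and dividing by $n + 1$ makes the change smaller as the data set grows. For the rainfall data, $(2.1 - 2.883)/13 \approx -0.060$, which matches the drop from 2.883 to 2.823 inches.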

### Key Points

**The Mean**: The mean of a numerical data set is obtained by adding up all the data values in the set and
then dividing the total by the number of values in the data set.

**The Median**: To find the median of a data set:

- Organize the data according to size, either from the smallest value to the largest, or from the largest value to the smallest.
- Count the number of values. If the number of values is an odd number, then the median is the middle value of the ordered data set. If the number of values is even, then the median is the mean of the two values at the center of the ordered data set.

**The Mode**: The mode is the most commonly or frequently occurring value (or values). We sometimes call this the modal value.

### Notes

- A data set can have only one mean and one median but it may have one mode, more than one mode, or no mode at all. If a data set has two modes, for example, we would say it is bimodal.
- An outlier is a value that is much smaller or much larger than most of the other values in a data set. Outliers in a data set can have an effect on the usefulness of the mean as a measure of central tendency. If one or more outliers are present in a data set, it may be better to use the median or the mode to measure central tendency.
- There are many different kinds of mean and the one we are using, which is most commonly used, is called the arithmetic mean.