In this explainer, we will learn how to find the median and upper and lower quartiles of a data set.

Our goal in organizing and analyzing a data set is to gain information from it, and there are a number of “summary statistics” that can help us with this. For example, the mean, median, and mode of a data set give us an indication of where the center of the data or the most frequently occurring values lie. And the range, interquartile range, variance, and standard deviation are all measures of how the data spread out from, or cluster around, the central values.

We can also split the data into quarters, so that 25% of the data values lie within each quarter. The values that mark the boundaries between the quarters are called quartiles.

The picture below represents a data set with 16 data values.

**Q1** The first (or lower) quartile (Q1) marks the center of the lowest half of a data
set. So, 25% of the data sit below Q1 and 75% of the data sit above Q1.

**Q2** The second quartile (Q2), which is the median, marks the middle of a data set. So,
50% of the data set is below the median and 50% is above the median.

**Q3** The third (or upper) quartile (Q3) marks the center of the top half of a data set.
So, 75% of the data set is below Q3 and 25% is above it.
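The three definitions above can be sketched in a few lines of Python. This is only an illustration, not a standard library routine: the helper name `median_and_quartiles` is our own, and we assume that when the number of values is odd, the median is left out of both halves (the convention followed in the worked examples of this explainer). The list of 16 values is made up purely to show the split.

```python
from statistics import median

def median_and_quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: for an odd number of values, the median itself is
    excluded from both the lower and the upper half.
    Needs at least two values, otherwise the halves are empty.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2                   # number of values in each half
    lower = ordered[:half]          # the lowest `half` values
    upper = ordered[n - half:]      # the highest `half` values
    return median(lower), median(ordered), median(upper)

# A made-up data set of 16 values: 1, 2, ..., 16.
q1, q2, q3 = median_and_quartiles(range(1, 17))
print(q1, q2, q3)  # 4.5 8.5 12.5
```

With 16 values, each quartile falls halfway between two data points, so four values lie in each quarter.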

Let us see how this all works with an example.

### Example 1: The Median and Quartiles of a Data Set

The number of Bonus Bugs won by each of 15 students in the first level of a computer game tournament was recorded. The results are in the table below.

- Find the median (Q2) and the lower and upper quartiles (Q1 and Q3) for the number of Bonus Bugs won.
- If the organizers of the tournament decide that the top 25% of students can compete in Level 2, above what number of Bonus Bugs must a student win to go to the next level?

### Answer

**Part 1**

To find the median and the quartiles, Q1 and Q3, we first need to order the data from smallest to largest. The least number of bugs won was 14, so this is our first item in an ordered list. The second lowest number of Bonus Bugs won was 15, and the next highest was 16, and so on, until the highest number of Bonus Bugs won was 35. Our ordered list then looks like this:

We can see how the number of Bonus Bugs won is spread out on the number line by putting the ordered data into a line plot. (Each bug in the plot below represents one student.)

Let us first find the median, which is the central value of the data set. To do this, we note that we have 15 students who recorded their scores, so 15 data points in our data set. The middle value (the median) will be the eighth value in the ordered data set, since the eighth value splits the data set in two, with seven data points on either side of this.

We can see that the eighth value in the ordered data set is 22. So, the median number of Bonus Bugs won by a student was 22. We can interpret this as follows: approximately 50% of the students won fewer than 22 Bonus Bugs and approximately 50% won more than 22 Bonus Bugs while playing level 1.

Now let us find the lower and upper quartiles, Q1 and Q3, of the Bonus Bugs data. The median splits the data in half and the quartiles split the halves in half. As we have seven data points in the lower half below the median, the middle of this half will be the fourth data point, which in this case has a value of 17. This has three data points on either side of it below the median. If we look at our line plot, we can see where the lower quartile sits.

The lower quartile, Q1, is therefore 17 and we can interpret this as follows: approximately 25% of the students won fewer than 17 Bonus Bugs in level 1, so approximately 75% of the students won 17 or more Bonus Bugs in level 1.

We can do the same with the top half of the data to find the upper quartile, Q3. There are seven data points above the median, and the upper quartile will split these in half. The upper quartile, Q3, is therefore the twelfth data point, which has a value of 29. This has three data points on either side of it above the median.

The upper quartile, Q3, is therefore 29. We can interpret this as follows: approximately 75% of the students won fewer than 29 Bonus Bugs playing level 1, and approximately 25% of the students won 29 or more Bonus Bugs.

**Part 2**

If the top scoring 25% of students can compete at level 2, then our upper quartile, Q3, tells us that any student scoring 29 or above can go on to the next level.

Note that we include a score of 29 in our top 25%. This is because we have 15 students and the exact cut-off point for the top 25% is not a whole number. If we work out what 75% of 15 is, we get $0.75 \times 15 = 11.25$, but we cannot have 11.25 students, so we must approximate. We take our third quartile, Q3, to be the 12th data point, since this splits the top half of the data in half. This is above the “11.25th” point marking the lower 75% and corresponds to 29 Bonus Bugs. So we include this in our top 25%.
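The position arithmetic used in this example can be checked with a short sketch. The variable names are our own, and the code only reproduces the counting for 15 ordered values; it does not use the students' actual scores.

```python
import math

n = 15                                # number of students

median_pos = (n + 1) // 2             # 8th value in the ordered list
half_size = median_pos - 1            # 7 values on each side of the median
q1_pos = (half_size + 1) // 2         # 4th value: middle of the lower half
q3_pos = median_pos + q1_pos          # 12th value: middle of the upper half

exact_cutoff = 0.75 * n               # 11.25, not a whole position
rounded_up = math.ceil(exact_cutoff)  # next whole position above 11.25

print(median_pos, q1_pos, q3_pos)     # 8 4 12
print(exact_cutoff, rounded_up)       # 11.25 12
```

Rounding the "11.25th" position up gives the 12th value, which agrees with the position of Q3 found by halving the top half.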

In the next example, we find the quartiles of a small data set.

### Example 2: Quartiles of a Data Set

Maged’s history test scores are 74, 96, 85, 90, 71, and 98. Determine the upper and lower quartiles of his scores.

### Answer

The first thing we need to do to find the upper and lower quartiles of Maged’s history test scores is to put them in order of size. The lowest score was 71, so that comes first. The next lowest was 74, so that goes next in our new list, and so on, until the highest score, 98, is last in our ordered list:

71, 74, 85, 90, 96, 98

The midpoint of the data is between the scores 85 and 90, so we can mark this off in our ordered data set.

We know that the lower quartile, Q1, is the midpoint of the lowest half of the data. The midpoint of the lowest 50% of our ordered data is the second score, 74, as there is one score on either side of this in the lower half.

So, the lower quartile, Q1, is the score 74. We can interpret this as follows: approximately 75% of Maged’s history scores were 74 or above. (Or approximately 25% of Maged’s history scores were below 74.)

We can do exactly the same for the upper quartile in the top half of the ordered data. In this case, the midpoint of the highest 50% of history scores is 96.

The upper quartile of Maged’s history scores is then 96. We can interpret this as follows: approximately 25% of Maged’s scores were 96 or above.

Note that the midpoint of the ordered data set corresponds to the median of the data (which is also Q2, the middle quartile). We can work out its value by finding the average of the two central values, 85 and 90:

$$\frac{85 + 90}{2} = 87.5$$

We can then say that Maged’s median history score was 87.5.
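As a quick check of this example, the same steps can be written out in Python. This is a minimal sketch that follows the halving method used above; the variable names are our own.

```python
from statistics import median

scores = [74, 96, 85, 90, 71, 98]
ordered = sorted(scores)        # [71, 74, 85, 90, 96, 98]

# Six values: the midpoint falls between the 3rd and 4th scores.
lower_half = ordered[:3]        # [71, 74, 85]
upper_half = ordered[3:]        # [90, 96, 98]

q1 = median(lower_half)         # middle of the lower half
q2 = median(ordered)            # average of 85 and 90
q3 = median(upper_half)         # middle of the upper half

print(q1, q2, q3)               # 74 87.5 96
```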

In the next example, we will work out the median and quartiles for a data set.

### Example 3: The Median and Quartiles of a Data Set

Determine the median and quartiles of the following set of data: 1 350, 1 400, 1 250, 1 050, 1 450, 1 150, 1 000.

### Answer

Our first step is to put the data in order of size from smallest to largest. The smallest value is 1 000, so this begins the ordered list. The next smallest value is 1 050, so this comes next, and so on, up to the largest value, which is 1 450. The ordered data set is, then,

1 000, 1 050, 1 150, 1 250, 1 350, 1 400, 1 450

The middle or central value in this ordered set is the number 1 250. It is the fourth value in the ordered list of seven values, with three values on either side of it.

The median of the data set is therefore 1 250. 50% of the values are above 1 250 and 50% are below 1 250.

To find the upper and lower quartiles, we split both the top and bottom halves of the data set in half.

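The lower half of the data is 1 000, 1 050, 1 150, whose middle value is 1 050, and the upper half is 1 350, 1 400, 1 450, whose middle value is 1 400. The lower quartile, Q1, is therefore 1 050 and the upper quartile, Q3, is 1 400. We can then say, for example, that approximately 75% of the data are below 1 400 and that approximately 50% of the data are between 1 050 and 1 400.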
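For this data set, the result can be cross-checked with Python's standard library. The sketch below uses `statistics.quantiles` with its default `"exclusive"` method, which places the quartiles of seven values at the 2nd, 4th, and 6th ordered positions, so it agrees with the halving method here. Note that this function interpolates between values for other data set sizes, so it will not always match the halving method used in this explainer.

```python
from statistics import quantiles

data = [1350, 1400, 1250, 1050, 1450, 1150, 1000]

# n=4 asks for quartiles; the input does not need to be pre-sorted.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")

print(q1, q2, q3)  # 1050.0 1250.0 1400.0
```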

In our next example, we will see how adding a value to a data set with large values affects the lower quartile.

### Example 4: The Effect of Adding Data to a Data Set on the Lower Quartile

The table shows the capacities of 6 sports stadiums around the world. Describe how the lower quartile will be affected if Emirates Stadium, London, United Kingdom, which has a capacity of 60 338, is included in the data.

Stadium | Location | Capacity |
---|---|---|
Gelora Sriwijaya Stadium | Palembang, Indonesia | 40 000 |
Olympiastadion | Berlin, Germany | 74 228 |
Frankenstadion | Nuremberg, Germany | 48 548 |
Memorial Stadium | Nebraska, United States | 86 047 |
Borg El Arab Stadium | Alexandria, Egypt | 86 000 |
Stamford Bridge | London, United Kingdom | 42 055 |

### Answer

The first step in determining the effect, on the lower quartile, of adding another stadium capacity to the data set is to put the original data in order of size. We will then find the lower quartile of both the original data set and the new data set and compare the results.

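Taking the stadium names and the capacities from the original table and ordering the capacities from smallest to largest, we have the following table.

Stadium | Capacity |
---|---|
Gelora Sriwijaya Stadium | 40 000 |
Stamford Bridge | 42 055 |
Frankenstadion | 48 548 |
Olympiastadion | 74 228 |
Borg El Arab Stadium | 86 000 |
Memorial Stadium | 86 047 |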

To find the lower quartile, Q1, we first need to split the data in half. The center of the lower half of the data is then Q1.

The capacity for Stamford Bridge, at 42 055, has one value on either side of it in the lower half, so this is at the center of the lower half. The lower quartile is therefore 42 055.

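Now let us find the lower quartile if we add the capacity for Emirates Stadium, which is 60 338. The ordered data set now looks like this.

Stadium | Capacity |
---|---|
Gelora Sriwijaya Stadium | 40 000 |
Stamford Bridge | 42 055 |
Frankenstadion | 48 548 |
Emirates Stadium | 60 338 |
Olympiastadion | 74 228 |
Borg El Arab Stadium | 86 000 |
Memorial Stadium | 86 047 |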

Finding the center of the data set now, we can see that in fact the capacity for Emirates is the central value of the data set: there are exactly three stadium capacities on either side of it in the ordered list.

The lower quartile, Q1, is the central value of the lower half of the data set.

The capacity for Stamford Bridge, 42 055, is still the lower quartile, Q1. This is because, in the original data set, there were six values and the median lay between the middle two values, leaving three values in the lower half. In the new data set, there are seven values and the median is the fourth value (Emirates), which again leaves three values in the lower half. So there is no change in the lower quartile.
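The comparison can be reproduced with a short sketch. The helper name `lower_quartile` is our own, and it assumes the halving method used in this explainer (for an odd number of values, the median is not counted in the lower half).

```python
from statistics import median

def lower_quartile(values):
    """Median of the lower half; the median itself is excluded when
    the number of values is odd (assumed convention)."""
    ordered = sorted(values)
    return median(ordered[: len(ordered) // 2])

capacities = [40000, 74228, 48548, 86047, 86000, 42055]
emirates = 60338

before = lower_quartile(capacities)
after = lower_quartile(capacities + [emirates])

print(before, after)  # 42055 42055
```

In both cases the lower half consists of the same three capacities, so its middle value does not change.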

### Example 5: The Median and Quartiles of a Grouped Data Set

In the second year of a computer game tournament, there were forty-two participants and the number of Bonus Bugs each one won in level 1 was recorded. The data are shown in the graph below where each bug represents one participant.

- Find the median number of Bonus Bugs won and the upper and lower quartiles, Q1 and Q3.
- The top 25% of participants can go on to play level 2 in the tournament. What score must the participants achieve to play level 2?

### Answer

**Part 1**

To find the median number of Bonus Bugs won, we need to find the middle value of the data. As there are 42 participants, and half of 42 is 21, the median will lie between the 21st and 22nd values in the ordered data. Since the data are displayed in a graph along a number axis, they are already ordered, so we can count along the graph to find the number of Bonus Bugs that corresponds to the 21st and 22nd values.

Both the 21st and 22nd values are 26 Bonus Bugs, so the median score in level 1 was 26 Bonus Bugs.

To find the lower and upper quartiles, Q1 and Q3, we need to know the score at the center of both the lower and the upper half of the data. For Q1, there are 21 values in the lower half of the set, so Q1 will be the 11th data point (there are 10 values on either side of this in the lower half).

The 11th value corresponds to a Bonus Bug score of 23, so the lower quartile, Q1, is 23 Bonus Bugs.

We can do the same for the upper quartile, Q3. The central value of the upper half of the data is the 32nd value, since there are 10 participants’ scores on either side of this in the top half.

From the graph, we can see that the 32nd value in our data set corresponds to a score of 29 Bonus Bugs. So the upper quartile, Q3, is 29 Bonus Bugs.

**Part 2**

The cut-off point for the top 25% of scores is actually Q3, the upper quartile. So, the top 25% of participants achieved a score of 29 Bonus Bugs or more. The qualifying score for entry to level 2 is therefore 29 Bonus Bugs.
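The positions counted in this example (the 11th, the 21st and 22nd, and the 32nd values) can be found for any number of data points with a small sketch. The function names below are our own, and the code assumes the halving method used throughout this explainer. It returns positions in the ordered list, not scores, so the scores still have to be read from the graph.

```python
def middle_positions(first, last):
    """1-based position(s) at the center of the run first..last.

    Returns one position if the run has an odd number of values, or the
    two central positions (to be averaged) if it has an even number.
    """
    total = first + last
    if total % 2 == 0:
        return [total // 2]
    return [total // 2, total // 2 + 1]

def quartile_positions(n):
    """Positions of Q1, the median, and Q3 among n ordered values."""
    half = n // 2  # size of each half (median excluded when n is odd)
    return (
        middle_positions(1, half),          # Q1: center of the lower half
        middle_positions(1, n),             # median: center of all values
        middle_positions(n - half + 1, n),  # Q3: center of the upper half
    )

print(quartile_positions(42))  # ([11], [21, 22], [32])
print(quartile_positions(15))  # ([4], [8], [12])
```

The second call reproduces the positions used for the 15 students in Example 1.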

### Key Points

- The first (or lower) quartile (Q1) marks the center of the lowest half of a data set. So, 25% of the data sit below the value of Q1 and 75% of the data sit above Q1.
- The second quartile (Q2), which is the median, marks the middle of a data set. So, 50% of the data set is below the median and 50% is above the median.
- The third (or upper) quartile (Q3) marks the center of the top half of a data set. So, 75% of the data set is below Q3 and 25% is above it.