In this explainer, we will learn how to estimate the median for data presented in a grouped frequency table using cumulative frequency graphs.

Let's begin by recalling what the median of a data set is.

### Definition: Median

The median of a data set represents the middle value.

Half of the data is above the median, and half of the data is below the median.

We can calculate the median of a data set by ordering the values in the set, either from least to greatest or from greatest to least, and finding the value at the middle position.

Consider the ordered set of values

There are a total of 8 values, so the middle position lies between the 4th and 5th values.

The 4th value is 7 and the 5th value is 10. The median is calculated as the midpoint of the 4th and 5th values. Thus, we have

$$\text{median} = \frac{7 + 10}{2} = 8.5.$$

In the case of an odd number of values, the median is the middle value.
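As a quick check of the even-count case, the sketch below takes the midpoint of the two middle values and compares it with Python's `statistics.median`. The eight values in `values` are hypothetical: they are not the explainer's data and are chosen only so that the 4th value is 7 and the 5th value is 10.

```python
from statistics import median

# Hypothetical ordered data (NOT from the explainer): chosen only so that
# the 4th value is 7 and the 5th value is 10.
values = [1, 3, 6, 7, 10, 12, 14, 18]

n = len(values)
fourth, fifth = values[n // 2 - 1], values[n // 2]

# With an even number of values, the median is the midpoint of the middle pair.
manual_median = (fourth + fifth) / 2

print(manual_median)
print(median(values))  # the standard library should agree
```

Any other ordered list with the same two middle values would give the same median, since only the middle pair matters here.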

However, when dealing with grouped data, we cannot find the median in the same way. Let's take the situation in which the data above was grouped into the classes 0–, 5–, 10–, and 15–. The first class, 0–, indicates values that are 0 or greater but less than 5 (the lower boundary of the following class). A grouped frequency table of this data set would be as follows.

Value | 0– | 5– | 10– | 15– |
---|---|---|---|---|
Frequency | 2 | 2 | 3 | 1 |

We can observe that there is no way to extract an exact median from a grouped table, as we cannot tell the original, exact data
values from a grouped distribution. Instead, we determine an *estimate* for the median. One way in which we can visualize the
numbers represented in a grouped frequency table is by graphing the cumulative frequency, either an ascending or descending
cumulative frequency. Remember that when finding the median, the data needs to be in ascending or descending order, which is why
the cumulative frequency is so useful.

Let's recap what cumulative frequency is.

### Definition: Ascending and Descending Cumulative Frequency

The ascending cumulative frequency, often referred to simply as the cumulative frequency, of a value $x$ indicates the frequency of values that are less than $x$.

The descending cumulative frequency of a value $x$ indicates the frequency of values that are greater than or equal to $x$.

We can use either type of cumulative frequency graph to find the median. The total frequency of a data set is the total number of data values. In an ascending cumulative frequency diagram, this is the final cumulative frequency value. In a descending cumulative frequency diagram, this is the first descending cumulative frequency value.
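The two definitions can be sketched in a few lines of Python using the small grouped table from earlier (frequencies 2, 2, 3, 1). The names `boundaries`, `ascending`, and `descending` are illustrative, and the closing boundary of 20 is an assumption that the final class has the same width as the others.

```python
from itertools import accumulate

# Class boundaries and frequencies from the small grouped table.
# The closing boundary 20 is an ASSUMPTION: the final class "15-" is taken
# to have the same width (5) as the other classes.
boundaries = [0, 5, 10, 15, 20]
frequencies = [2, 2, 3, 1]

total = sum(frequencies)

# Ascending: how many values are less than each boundary (running total from 0).
ascending = [0] + list(accumulate(frequencies))

# Descending: how many values are greater than or equal to each boundary.
descending = [total - a for a in ascending]

for b, a, d in zip(boundaries, ascending, descending):
    print(f"x = {b:>2}: less than x -> {a}, at least x -> {d}")
```

The last ascending value and the first descending value both equal the total frequency, 8, which matches the description above.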

The median position, which represents the middle value, can be found by

$$\text{median position} = \frac{\text{total frequency}}{2}.$$

When the ascending or descending cumulative frequency is graphed on the $y$-axis, we draw a horizontal line from the median position on the $y$-axis until it meets the curve. Then, we draw a vertical line from this point down to the $x$-axis. This value on the $x$-axis is the estimate for the median.

As half of the data is above the median and half is below the median, it does not matter which type of cumulative frequency graph we use to find the estimate for the median. If we were to draw both the ascending and descending cumulative frequency curves for a data set on a single graph, the point at which the curves intersect would be at the median position.

Unlike when finding the median directly from a data set, we do not need to add 1 to the frequency and then divide by 2 to find the position of the middle value. The main reason for this is that grouped frequency data sets represented as cumulative frequency graphs are often large. As we are finding an estimate for the median, it is sufficient simply to divide the total frequency by 2 to find the value of the median position.
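The graph-reading procedure can be imitated numerically. The sketch below is one possible approach, not the only one: it assumes the plotted points are joined by straight line segments, whereas a hand-drawn smooth curve may give a slightly different reading. The function name `estimate_median` is hypothetical, and the points come from the small grouped table (with an assumed closing boundary of 20).

```python
def estimate_median(points):
    """Estimate the median from (x, cumulative frequency) points.

    Works for ascending or descending cumulative frequencies. Joining the
    points with straight lines is an ASSUMPTION; a smooth curve drawn by
    hand can give a slightly different value.
    """
    total = max(y for _, y in points)
    position = total / 2  # median position on the y-axis
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 != y1 and min(y0, y1) <= position <= max(y0, y1):
            # The horizontal line meets this segment; drop down to the x-axis.
            return x0 + (position - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("the graph never reaches the median position")


# Points for the small grouped table (frequencies 2, 2, 3, 1).
ascending = [(0, 0), (5, 2), (10, 4), (15, 7), (20, 8)]
descending = [(0, 8), (5, 6), (10, 4), (15, 1), (20, 0)]

print(estimate_median(ascending))
print(estimate_median(descending))
```

Both calls give 10 for this table, while the ungrouped data had a median of 8.5. The gap illustrates why the value read from a cumulative frequency graph is only an estimate, particularly for a very small data set.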

We will now see how we can find an estimate for the median using a cumulative frequency graph.

### Example 1: Estimating the Median from a Cumulative Frequency Graph

From the following cumulative frequency graph that represents the masses of some balls that have different colors, find an estimate for the median.

### Answer

We can recall that cumulative frequency is a "running total" of frequencies. The cumulative frequency of a value $x$ indicates the frequency of values that are less than $x$. The median of a data set represents the middle value.

In order to find an estimate for the median, using a cumulative frequency graph, we need to find the value that is halfway through the frequencies. In context, if we had all the balls ordered from lightest to heaviest (or heaviest to lightest), we would need to determine the mass of the ball at the middle position.

Using the graph, we can observe that the highest cumulative frequency is 15 balls, corresponding to 5 kg on the $x$-axis. This tells us that 15 balls had a mass of less than 5 kg and, more importantly, that the total number of balls is 15.

The position of the median is at half of the total frequency and therefore at 7.5 on the $y$-axis. We draw a horizontal line from 7.5 on the $y$-axis until it meets the curve and then draw a vertical line from this point to the $x$-axis.

We observe that the line meets the $x$-axis at 2.1. Hence, we can give the answer that an estimate for the median mass of the balls is 2.1 kg.

A common error when finding an estimate for the median using cumulative frequency graphs is to use the halfway point of the variable under study (on the $x$-axis) rather than half of the total frequency (on the $y$-axis). For example, if the incorrect method was applied in the previous example, drawing a vertical line at 2.5 kg on the $x$-axis (the halfway point between 0 kg and 5 kg) up to the curve and then drawing a horizontal line from this point to the $y$-axis would result in a cumulative frequency of approximately 9.7. This does not give us an estimate for the median; rather, it tells us that approximately 9.7 balls have a mass of less than 2.5 kg.

We may occasionally have a median value that is equivalent to the halfway point on the $x$-axis; however, this is usually not the case, and it is not an equivalent way of determining an estimate for the median.

We will now see an example of how we can estimate the median from a descending cumulative frequency graph.

### Example 2: Estimating the Median from a Descending Cumulative Frequency Graph

An employer surveyed 30 employees to determine the distance in kilometres of their commute to work. The data is given in the descending cumulative frequency graph.

Determine an estimate for the median commuting distance.

### Answer

We can recall that the descending cumulative frequency of a value $x$ indicates the frequency of values that are greater than or equal to $x$. Using the given descending cumulative frequency graph, we can see, for example, that all 30 employees had a commuting distance that was greater than or equal to 0 km. The next coordinate, $(10, 24)$, indicates that 24 employees had a commuting distance that was greater than or equal to 10 km.

The median of a data set represents the middle value. We can only use the graph to find an estimate for the median, as we do not have every individual value in the data set (the distance commuted by each employee).

The median position is found by

$$\text{median position} = \frac{\text{total frequency}}{2}.$$

The total frequency in this problem is the total number of employees surveyed. This is the first, highest value in the descending cumulative frequency diagram: 30. Hence, the median position is

$$\frac{30}{2} = 15.$$

We draw a horizontal line from 15 on the $y$-axis until it meets the curve and then draw a vertical line from this point to the $x$-axis.

We can then give the answer that an estimate for the median distance commuted is 19 km.

As previously mentioned, we can use either an ascending cumulative frequency diagram or a descending cumulative frequency diagram to estimate the median. Let's now see an example of using both.

### Example 3: Estimating the Median from a Cumulative Frequency Graph and a Descending Cumulative Frequency Graph

The time per day, in minutes, that a group of people spend on their hobbies is given in the table below. This data has also been represented as an ascending cumulative frequency diagram and a descending cumulative frequency diagram.

Time per Day (Minutes) | 0– | 30– | 60– | 90– | 120– | 150– |
---|---|---|---|---|---|---|
Frequency | 20 | 35 | 39 | 17 | 9 | 0 |

Determine an estimate for the median time per day spent on hobbies.

### Answer

We can observe that the two graphs represent the same data. The ascending cumulative frequency, often referred to simply as the cumulative frequency, of a value $x$ indicates the frequency of values that are less than $x$. The descending cumulative frequency of a value $x$ indicates the frequency of values that are greater than or equal to $x$.

The median of a data set represents the middle value. Its position will be at half of the total frequency.

The total frequency can be found from the table by adding the frequencies. This gives us

$$20 + 35 + 39 + 17 + 9 + 0 = 120.$$

Alternatively, we can observe that the total frequency in a cumulative frequency graph is the highest value. We can see on the graph that this is 120. The median position can be calculated as

$$\text{median position} = \frac{120}{2} = 60.$$

On the cumulative frequency diagram, we draw a horizontal line at 60 on the $y$-axis to the curve, and then we draw a vertical line from this point to the $x$-axis.

Therefore, we can determine that the median is estimated to be 63 minutes.

Let's compare this to the median value that we would get from the descending cumulative frequency curve. Using this graph, the total frequency is still the highest value, and we can see that this is the first value. The median will still be at the position $\frac{120}{2} = 60$.

We draw a horizontal line at 60 on the $y$-axis to the curve, and then we draw a vertical line from this point to the $x$-axis.

This also gives us that the median is estimated to be 63 minutes.

For any data set, if we represented the same data as both ascending and descending cumulative frequencies on the same graph, the point of intersection of the curves would be at the median position. We can also observe that the descending cumulative frequency graph is a reflection of the ascending cumulative frequency graph in the horizontal line through the median position.

Using either of these graphs, we obtain the estimate for the median of 63 minutes.
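The intersection property can be checked numerically for this example. The sketch below builds both sets of cumulative frequencies from the table and finds where the ascending values overtake the descending ones. It assumes straight line segments between plotted points, so the crossing it finds (about 63.8 minutes) is close to, but not identical to, the 63 minutes read from the curves. The variable names are illustrative.

```python
from itertools import accumulate

# Class boundaries (minutes) and frequencies from the table in Example 3.
boundaries = [0, 30, 60, 90, 120, 150]
frequencies = [20, 35, 39, 17, 9]  # the empty 150- class adds nothing

total = sum(frequencies)
ascending = [0] + list(accumulate(frequencies))
descending = [total - a for a in ascending]

# Find the interval where the ascending curve overtakes the descending one.
for i in range(len(boundaries) - 1):
    gap0 = ascending[i] - descending[i]
    gap1 = ascending[i + 1] - descending[i + 1]
    if gap0 <= 0 <= gap1:
        # ASSUMPTION: straight line segments between plotted points.
        t = -gap0 / (gap1 - gap0)
        x_cross = boundaries[i] + t * (boundaries[i + 1] - boundaries[i])
        y_cross = ascending[i] + t * (ascending[i + 1] - ascending[i])
        break

print(round(x_cross, 1), round(y_cross, 1))
```

The height of the crossing point is 60, which is half of the total frequency of 120, so the two curves meet at the median position as described.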

We will now see an example of how we can estimate the median of a data set by first drawing a cumulative frequency graph.

### Example 4: Drawing a Cumulative Frequency Graph and Estimating the Median

The cost, in dollars, of cans of soda in different places is recorded in the table below.

Cost ($) | $0– | $0.50– | $1.00– | $1.50– | $2.00– |
---|---|---|---|---|---|
Frequency | 1 | 6 | 15 | 21 | 7 |

Determine an estimate for the median cost of soda approximated to the nearest hundredth.

### Answer

The costs of soda are given in a grouped frequency distribution, with the first class in the distribution, $0–, indicating costs that are greater than or equal to $0 but less than $0.50 (the lower boundary of the subsequent class). We cannot determine an exact median from this table, but we can calculate an estimate by drawing a cumulative frequency diagram.

The cumulative frequency, or the ascending cumulative frequency, of a value $x$ indicates the frequency of values that are less than $x$. Thus, in order to draw the cumulative frequency curve, we need to calculate the cumulative frequencies of values that are "less than" the lower boundaries of each of the classes in the table. It can be helpful to draw a new table like the one below. We also include the upper boundary of the given final class. To do this, we assume that the final class, \$2.00–, has the same class width as the other classes, so its upper boundary will be \$2.50. Costs in this final class must be less than \$2.50.

Notice that it is common to include an initial cumulative frequency of 0. In this context, we know that the frequency of soda being sold for less than $0 will be 0.

Next, since 1 place sold soda for an amount equal to $0 or greater but less than $0.50, we can write that the second cumulative frequency, of sodas that cost less than $0.50, must be 1.

To find the third cumulative frequency, we add the second frequency to the second cumulative frequency total. We have $1 + 6 = 7$.

We can then continue adding the frequencies as a "running total" to create the cumulative frequency values.

To plot the cumulative frequency graph, we have "Cost (\$)" on the $x$-axis and "Cumulative Frequency" on the $y$-axis. The $x$-coordinate of each point will be the "less than" value, and the $y$-coordinate is the corresponding cumulative frequency.

Thus, we have the coordinates

$$(0, 0),\ (0.50, 1),\ (1.00, 7),\ (1.50, 22),\ (2.00, 43),\ (2.50, 50).$$

The cumulative frequency graph can be drawn as follows.

To find an estimate for the median, we use the fact that the median position is calculated by

$$\text{median position} = \frac{\text{total frequency}}{2}.$$

The total frequency can be found either by adding all the frequencies in the table or by using the final cumulative frequency. The total frequency is 50.

Hence,

$$\text{median position} = \frac{50}{2} = 25.$$

We draw a horizontal line on the graph from 25 on the $y$-axis until it meets the curve. Then, we draw a vertical line from this point to the $x$-axis.

Therefore, we can give the answer that the median cost of soda in the different places is .
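As a numerical counterpart to reading the graph, the sketch below locates the class that contains the median position and moves through it proportionally. This is equivalent to joining the plotted points with straight lines, which is an assumption: a value read from a smooth hand-drawn curve may differ slightly. The variable names are illustrative.

```python
# Lower class boundaries (dollars), frequencies, and class width from Example 4.
lower_bounds = [0.00, 0.50, 1.00, 1.50, 2.00]
frequencies = [1, 6, 15, 21, 7]
class_width = 0.50  # ASSUMED equal for every class, including the last

position = sum(frequencies) / 2  # median position: 50 / 2 = 25

below = 0  # cumulative frequency below the current class
for lower, freq in zip(lower_bounds, frequencies):
    if below + freq >= position:
        # The median position falls in this class; move through the class
        # in proportion to how far the position lies above `below`.
        estimate = lower + (position - below) / freq * class_width
        break
    below += freq

print(f"median class starts at ${lower:.2f}; estimate = ${estimate:.2f}")
```

Under these assumptions, the median position of 25 falls in the $1.50– class, and the straight-line estimate rounds to $1.57.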

In the final example, we will see a similar problem, but instead we first draw a descending cumulative frequency diagram.

### Example 5: Drawing a Descending Cumulative Frequency Graph and Estimating the Median

The lengths, in metres, of a number of vehicles boarding a ship are given in the table below.

Length (m) | Number of Vehicles |
---|---|
2.00– | 3 |
3.00– | 25 |
4.00– | 12 |
5.00– | 6 |
6.00– | 2 |

- Draw a descending cumulative frequency diagram to represent the data.
- Determine an estimate for the median vehicle length.

### Answer

**Part 1**

We recall that the descending cumulative frequency of a value $x$ indicates the frequency of values that are greater than or equal to $x$. A descending cumulative frequency graph will start with a descending cumulative frequency that is equal to the total frequency of the data set and decrease until it reaches a descending cumulative frequency of 0.

In order to create a descending cumulative frequency diagram, we first need to calculate the descending cumulative frequency of each class. Since descending cumulative frequency represents values that are "greater than or equal to," we use the lower boundary of each class. We can set up a table into which we input the values.

The first descending cumulative frequency is equal to the total frequency. This is because we know that the length of every vehicle in the table is greater than or equal to 2.00 m. We can calculate the total frequency by adding all the frequencies, to give

$$3 + 25 + 12 + 6 + 2 = 48.$$

Thus, the first descending cumulative frequency value is 48.

Next, we calculate the second descending cumulative frequency value by subtracting the first frequency from the first descending cumulative frequency value. This gives us $48 - 3 = 45$. We know that 45 vehicles had a length greater than or equal to 3.00 m, as we exclude the 3 vehicles in the first class whose length was less than 3.00 m.

We can continue to calculate the descending cumulative frequency values by subtracting the frequency of the previous class from the previous descending cumulative frequency value.

To obtain a final descending cumulative frequency of 0, we can assume that the final class, 6.00–, has the same class width as the other classes. The upper boundary of this class would then be 7.00 m. Since no vehicles have a length of 7.00 m or greater, the descending cumulative frequency at 7.00 m is 0.

In order to plot a descending cumulative frequency diagram, we take the variable under study (in this example, the length) on the $x$-axis and the descending cumulative frequency on the $y$-axis. The $x$-coordinate of each point is given by the "greater than or equal to" value, and the $y$-coordinate is the corresponding descending cumulative frequency. Therefore, we will plot the coordinates

$$(2.00, 48),\ (3.00, 45),\ (4.00, 20),\ (5.00, 8),\ (6.00, 2),\ (7.00, 0).$$

Hence, we can draw the graph as shown.
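The repeated subtraction used in this part can be written as a short loop. The sketch below starts from the total frequency and removes one class at a time; the closing boundary `final_upper = 7.00` is the same equal-class-width assumption made in the text, and the variable names are illustrative.

```python
# Lower class boundaries (metres) and frequencies from the table in Example 5.
lower_bounds = [2.00, 3.00, 4.00, 5.00, 6.00]
frequencies = [3, 25, 12, 6, 2]
final_upper = 7.00  # ASSUMPTION: the last class has the same width as the rest

remaining = sum(frequencies)  # every vehicle is at least 2.00 m long
points = []
for lower, freq in zip(lower_bounds, frequencies):
    points.append((lower, remaining))
    remaining -= freq  # vehicles in this class are shorter than the next boundary
points.append((final_upper, remaining))

for length, count in points:
    print(f"at least {length:.2f} m: {count} vehicles")
```

The loop starts at 48 and ends at 0, as a descending cumulative frequency diagram should.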

**Part 2**

We can use the graph we have drawn to estimate the median. We cannot determine an exact median as we do not have the individual data values, only the values in the grouped distribution. The median of a data set represents the middle value, and its position on either a cumulative frequency diagram (ascending cumulative frequency diagram) or a descending cumulative frequency diagram is given by

$$\text{median position} = \frac{\text{total frequency}}{2}.$$

We have already determined that the total frequency is 48; hence,

$$\text{median position} = \frac{48}{2} = 24.$$

We then draw a horizontal line on the graph from 24 on the $y$-axis until it meets the curve. Then, we draw a vertical line from this point to the $x$-axis.

We can then give the answer that an estimate for the median vehicle length is 3.90 m.

We now summarize the key points.

### Key Points

- The ascending cumulative frequency, often referred to simply as the cumulative frequency, of a value $x$ indicates the frequency of values that are less than $x$.
- The descending cumulative frequency of a value $x$ indicates the frequency of values that are greater than or equal to $x$.
- The median of a data set represents the middle value.
- We cannot calculate an exact median from a grouped frequency distribution; instead we determine an estimate for the median.
- We can determine the median position using either a cumulative frequency diagram or a descending cumulative frequency diagram as $\text{median position} = \frac{\text{total frequency}}{2}$.
- Using either type of cumulative frequency diagram, we draw a horizontal line from the value of the median position on the $y$-axis until it meets the curve, and then we draw a vertical line from this point to the $x$-axis. This value on the $x$-axis is the estimate for the median.