Video Transcript
In this video, we will learn how to
draw a cumulative frequency diagram and how to use it to make estimations about the
data. Let’s begin by understanding what
cumulative frequency is.
Cumulative frequency is the sum of
all the previous frequencies up to the current point. It is often referred to as the
running total of frequencies. The ascending cumulative frequency
of a value 𝑥 can be found by adding the frequencies of all values less than 𝑥. The use of cumulative frequencies
is a statistical method that is typically applied to grouped frequency tables, where
data is organized into smaller groups or classes. Let’s look at an example of how we
find the cumulative frequency of a set of data that is given in a grouped frequency
table.
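The running-total idea can also be written in symbols. The notation here is our own rather than the video's: suppose the classes, taken in ascending order, have frequencies f_1, f_2, and so on up to f_n, and let F_k be the cumulative frequency at the upper boundary of the k-th class.

```latex
F_k = \sum_{i=1}^{k} f_i = F_{k-1} + f_k, \qquad F_0 = 0
```

The second form is the one used in practice: each cumulative frequency is the previous cumulative frequency plus the frequency of the current class.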
The table shows the number of
hours that 100 students spent revising for an exam. Determine the missing
cumulative frequency results.
The grouped frequency table
presents the data on the number of hours that students spent studying. Each group or class includes its lower
boundary but not its upper boundary, so the first group, zero dash, represents values of zero hours
or greater but less than two. This is because the next group
begins with values greater than or equal to two. We do not have overlapping
values in a grouped frequency table.
We are asked to complete the
cumulative frequency table based on the frequencies. The cumulative frequency gives
the running total of the frequencies. An ascending cumulative
frequency will always represent the frequencies of values that are less than a
particular value. The first group in the
frequency table has a frequency of zero, so its cumulative frequency is also zero. In other words,
the frequency table tells us that no students revised for less than
two hours. To find the second cumulative
frequency value, we add the frequency of the second group to the previous
cumulative frequency. The second group contains 10 students, who
revised for two hours or more but less than four hours. Hence, the second cumulative
frequency is zero plus 10, which equals 10.
We now need to determine the
cumulative frequency of students who revised for less than six hours. The class four dash in the
grouped frequency table indicates that 19 students revised for four hours or more
but less than six hours. However, the 10 students in the
previous group also revised for less than six hours. Hence, the cumulative frequency
for less than six hours is equal to 19 plus 10, which equals 29. This third cumulative frequency
was found by adding the frequency of the third class to the previous cumulative
frequency.
We can then continue this
process to find each of the cumulative frequency values. It is worth noting that the
final cumulative frequency will be the same as the total frequency. This is useful for checking
whether our values are correct. The total frequency can be
calculated as zero plus 10 plus 19 plus 37 plus 24 plus 10, which equals
100. Since the final cumulative
frequency is also 100, we have confirmed that the missing cumulative
frequency values are zero, 10, 29, 66, 90, and 100.
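The running total in this example can be sketched in a few lines of Python. This is only an illustration: the variable names are our own, and the frequencies are the six values read out in the example.

```python
from itertools import accumulate

# Frequencies of the six classes in the revision table, in ascending order,
# as read out in the example.
frequencies = [0, 10, 19, 37, 24, 10]

# Each cumulative frequency is the previous one plus the current frequency.
cumulative = list(accumulate(frequencies))

print(cumulative)  # [0, 10, 29, 66, 90, 100]

# Check: the final cumulative frequency equals the total frequency.
assert cumulative[-1] == sum(frequencies) == 100
```

The final assertion mirrors the check described in the example: if the last running total did not match the total number of students, one of the earlier additions would be wrong.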
We will now see the most common way
in which cumulative frequency is presented, as a cumulative frequency graph. A cumulative frequency graph
displays the cumulative frequency of a data set. This can be a cumulative frequency
polygon, where straight lines join the points, or a cumulative frequency curve. The cumulative frequency for a
value 𝑥 is the total number of data values that are less than 𝑥. Since cumulative frequency is a
running total of frequencies, which cannot be negative, the graph of the cumulative frequency will never
descend. It may have horizontal portions
where the cumulative frequency remains the same if the frequency of a group is
zero.
We will now look at an example
where we need to identify the correct representation of a data set as a cumulative
frequency graph.
A manufacturer samples the
mass, in grams, of 30 pencils from their production line. Their masses are recorded in
the table. No pencil has a mass greater
than 60 grams. Which cumulative frequency
graph correctly shows this information? Is it graph (A), (B), (C), (D),
or (E)?
In order to identify the
correct graph, we need to calculate the cumulative frequencies for the values in
the table. This will give us a running
total for values that are less than a given point. The "less than" value that we
use is the upper boundary of each class.
We begin by recognizing that
the first group in the table represents masses that are 10 grams or greater but
less than 20 grams. We can add the cumulative
frequency row to our table, which will represent pencils with a mass of less
than 20 grams, less than 30 grams, less than 40 grams, and so on. Since no pencil has a mass
greater than 60 grams, the last element of the cumulative frequency row
represents the number of pencils with masses less than 60 grams.
Recalling that cumulative
frequency is the running total, we have values of three, nine, 20, 27, and
30. Note that we calculate these
values by adding the frequency in that column to the previous cumulative
frequency value. The final cumulative frequency
will be the same as the total frequency. In this case, this will be the
value 30, since 30 pencils were sampled.
When drawing or identifying the
cumulative frequency graph in this context, we have the mass on the 𝑥-axis and
cumulative frequency on the 𝑦-axis. The 𝑥-coordinate values will
be the "less than" mass values, that is, the upper boundaries of each class. This allows us to use a
cumulative frequency curve to identify values that are less than any particular
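value. The coordinates that would be
plotted can be given as (10, 0), (20, 3), (30, 9), (40, 20), (50, 27), and (60,
30). The first point, (10, 0), sits at the lower boundary of the first class, since no pencil has a mass of less than 10 grams. For every other point, as previously stated, the
𝑥-coordinate is the upper boundary of each group and the 𝑦-coordinate is the
corresponding cumulative frequency. The graph that matches these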
coordinates is that of graph (B). And so this is the cumulative
frequency graph for the given information.
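As a rough sketch, the points to plot for the pencil data can be built as follows. The class frequencies used here are an assumption on our part: they are inferred by differencing the cumulative frequencies three, nine, 20, 27, and 30 from the example, and the variable names are illustrative only.

```python
from itertools import accumulate

# Class boundaries in grams: 10 to 20, 20 to 30, and so on up to 50 to 60.
boundaries = [10, 20, 30, 40, 50, 60]

# Class frequencies, inferred by differencing the cumulative values
# 3, 9, 20, 27, 30 from the example (an inference, not a quoted table).
frequencies = [3, 6, 11, 7, 3]

cumulative = list(accumulate(frequencies))

# Start at the lowest boundary with a cumulative frequency of zero, then pair
# each upper boundary with its cumulative frequency.
points = [(boundaries[0], 0)] + list(zip(boundaries[1:], cumulative))

print(points)
# [(10, 0), (20, 3), (30, 9), (40, 20), (50, 27), (60, 30)]
```

These are the same coordinates that were used to pick out the matching graph.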
Whilst graphs (A), (D), and (E) are
cumulative frequency graphs, they do not match the data in the table. Graph (C) is a frequency polygon
and is not a cumulative frequency graph. When creating a cumulative
frequency diagram, it is preferable to join the points with a smooth curve, rather
than with straight lines. This gives us a better
approximation for the data and allows us to make more accurate estimations for
cumulative frequencies that do not lie on boundaries of classes.
We will now see an example of this,
where we are given a cumulative frequency graph and we use it to help us estimate
values that are less than, greater than, or equal to particular values.
Mason took a sample of 100
balls from a box. He weighed each ball and
recorded its weight in the table. He used the data to draw the
cumulative frequency graph shown on the grid. Estimate how many balls had a
weight of less than 80 grams. Estimate how many balls had a
weight of 130 grams or more.
Cumulative frequency is the sum
of all the previous frequencies up to the current point. It is often referred to as the
running total of frequencies. The given graph shows the
cumulative frequency of the weights of 100 balls. We can see from the graph that
the highest cumulative frequency is 100. Any point on the cumulative
frequency graph indicates the total number of balls that weigh less than the corresponding
weight.
In order to find an estimate
for the number of balls that weigh less than 80 grams, we can draw a vertical line
from 80 on the 𝑥-axis until it meets the curve. We then draw a horizontal line
from this point to the 𝑦-axis to allow us to read the corresponding 𝑦-value,
the cumulative frequency. Observing that each minor grid
line on the 𝑦-axis represents a frequency of two, we can give the answer to the
first part of this question. The number of balls weighing less than
80 grams can be estimated as 26.
Although each value on the
cumulative frequency curve represents frequencies that are less than a
particular value, we can still use the curve to find frequencies for values that are greater than
or equal to a particular value. To estimate the number of balls
that weigh 130 grams or more, we use the same process. We draw a vertical line from
130 on the 𝑥-axis to the curve and then draw a horizontal line from this point
to the 𝑦-axis. We can read the cumulative
frequency of 78 balls from the 𝑦-axis, which means that 78 balls had a weight
less than 130 grams.
In order to find the number of
balls that had a weight of 130 grams or more, we subtract this from the total
frequency. The total frequency is the
total number of balls that have been weighed. Hence, it is 100. Therefore, we have 100 minus
78, which is equal to 22. The answer to the second part
of the question is that we estimate that there are 22 balls with a weight of 130
grams or more.
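Reading a value from the graph and then subtracting from the total can be imitated in code. The sketch below makes two assumptions: it joins the plotted points with straight lines (a cumulative frequency polygon), so its estimates may differ slightly from readings taken from a smooth curve, and because the table of ball weights is not reproduced here, it uses the pencil coordinates from the earlier example as its data. The function name `estimate_less_than` is hypothetical.

```python
def estimate_less_than(points, x):
    """Estimate the cumulative frequency at x by straight-line interpolation.

    `points` is a list of (boundary, cumulative frequency) pairs in
    ascending order, as plotted on a cumulative frequency polygon.
    """
    if x <= points[0][0]:
        return float(points[0][1])
    if x >= points[-1][0]:
        return float(points[-1][1])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# Pencil coordinates from the earlier example.
pencil_points = [(10, 0), (20, 3), (30, 9), (40, 20), (50, 27), (60, 30)]
total_pencils = pencil_points[-1][1]

less_than_35 = estimate_less_than(pencil_points, 35)
at_least_35 = total_pencils - less_than_35
print(less_than_35, at_least_35)  # 14.5 15.5

# The same subtraction with the reading taken from the balls graph:
# 78 balls weigh less than 130 grams, out of 100 balls in total.
balls_at_least_130 = 100 - 78
print(balls_at_least_130)  # 22
```

The "greater than or equal to" estimate is always the total frequency minus the "less than" estimate, whichever way the "less than" value was obtained.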
In a grouped frequency table, the
groups or classes may be described using different notation. We have seen how a class of 110
dash represents values that are 110 or greater and less than the lower boundary of
the subsequent class. We can also use inequalities to
represent the boundaries in continuous data sets. For example, data representing
heights ℎ may be allocated different intervals written as ℎ is greater than or equal
to 110 and less than 120, as shown. Whilst we have not seen an example
of this type in this video, we would follow the exact same process when calculating
cumulative frequency totals, drawing cumulative frequency graphs, and making
estimations about the data.
We will now summarize the key
points from this video. Cumulative frequency is the sum of
all the previous frequencies up to the current point. It is often referred to as the
running total of frequencies. To draw a cumulative frequency
graph, we first determine all the cumulative frequency totals for values that are
less than the upper boundary of each class. To plot the coordinates for each
cumulative frequency value, we take the upper boundary of a class as the
𝑥-coordinate and the corresponding cumulative frequency as the 𝑦-coordinate. Any point on a cumulative frequency
curve represents the number of data values that are less than the
corresponding 𝑥-coordinate. To find the frequency of values
that are greater than or equal to any 𝑥-coordinate, we subtract the value of the
𝑦-coordinate from the total frequency.