Video Transcript
In this video, we will learn how to
differentiate composite functions by applying the chain rule. We will see how to apply this to
simple functions initially. And then, we’ll consider more
complex functions such as trigonometric and reciprocal trigonometric functions.
Firstly, a reminder of what
composite functions are. They are essentially functions of a
function. Suppose we have two functions, 𝑓
of 𝑥 equals two 𝑥 plus five and 𝑔 of 𝑥 equals 𝑥 cubed. The composite functions 𝑓 of 𝑔 of
𝑥 and 𝑔 of 𝑓 of 𝑥 are what we get if we compose these two functions in either
order. We apply one, and then we apply the
other.
𝑓 of 𝑔 of 𝑥 means we apply 𝑔
first, giving 𝑥 cubed. And then, we take this as our input
for the function 𝑓, which will give two 𝑥 cubed plus five. 𝑔 of 𝑓 of 𝑥, however, is the
composite function we would get if we apply 𝑓 first to give two 𝑥 plus five and
then take this as our input to the function 𝑔, which would give two 𝑥 plus five
all cubed. And if we distribute the
parentheses and simplify, this gives eight 𝑥 cubed plus 60𝑥 squared plus 150𝑥
plus 125.
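The two compositions described above can be sketched in a few lines of Python. This is only an illustration: the helper names `f_of_g`, `g_of_f`, and `g_of_f_expanded` are made up for this sketch and do not come from the video.

```python
def f(x):
    """f(x) = 2x + 5, as defined in the video."""
    return 2 * x + 5


def g(x):
    """g(x) = x cubed, as defined in the video."""
    return x ** 3


def f_of_g(x):
    """Apply g first, then f: 2x^3 + 5."""
    return f(g(x))


def g_of_f(x):
    """Apply f first, then g: (2x + 5)^3."""
    return g(f(x))


def g_of_f_expanded(x):
    """The expanded form of (2x + 5)^3 quoted in the transcript."""
    return 8 * x**3 + 60 * x**2 + 150 * x + 125


# The factored and expanded forms agree at every integer tested.
for x in range(-3, 4):
    assert g_of_f(x) == g_of_f_expanded(x)

print(f_of_g(2), g_of_f(2))  # prints: 21 729
```

The different results at 𝑥 = 2 show that the order of composition matters.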
So, we know how to compose
functions. But what about finding their
derivatives? Well, in this case, if you’re asked
to find the derivatives of either 𝑓 of 𝑔 of 𝑥 or 𝑔 of 𝑓 of 𝑥, it wouldn’t be
too bad, because we could compose the
functions first, manipulate them algebraically, and then differentiate the resulting
polynomial.
But suppose, instead, the power of
𝑥 in the function 𝑔 of 𝑥 had been 10 or 20 rather than just three. Then it would be
extremely time-consuming and tedious to distribute all these parentheses in order to
give a polynomial. So, it would be much more helpful
for us to have a rule that allows us to differentiate a composite function. And indeed, there is one. It’s called the chain rule.
We’ll illustrate the chain rule by
first finding the derivative of the composite function 𝑔 of 𝑓 of 𝑥. So, we’ve defined the functions 𝑓
of 𝑥 and 𝑔 of 𝑥 to be two 𝑥 plus five and 𝑥 cubed, respectively. And we saw that the composite
function 𝑔 of 𝑓 of 𝑥 was two 𝑥 plus five all cubed, which simplifies to eight 𝑥
cubed plus 60𝑥 squared plus 150𝑥 plus 125. Now let’s consider finding the
derivative of this function.
To do so, we need to recall the
power rule, which tells us that the derivative with respect to 𝑥 of 𝑎, that’s a
constant, multiplied by 𝑥 to the power of 𝑛 is 𝑎𝑛𝑥 to the power of 𝑛 minus
one. And we recall also that in order to
find the derivative of a sum or difference, we can just differentiate each term
separately and then add them together.
So, differentiating 𝑔 of 𝑓 of 𝑥
then gives the derivative 𝑔 of 𝑓 of 𝑥 prime, which is 24𝑥 squared plus 120𝑥
plus 150. Remember, the derivative of a
constant is just zero. So, when we differentiate that term
of plus 125, it just gives zero. Now let’s see if we can manipulate
this derivative to identify any relationship with the derivatives of
𝑓 and 𝑔 individually.
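As a rough check of the term-by-term differentiation just carried out, here is a minimal Python sketch of the power rule applied to a polynomial. The function name `differentiate_polynomial` and the coefficient-list representation (lowest power first) are assumptions made for this illustration.

```python
def differentiate_polynomial(coeffs):
    """Differentiate a polynomial given as coefficients, lowest power first.

    coeffs[k] is the coefficient of x**k. By the power rule, the term
    a * x**k becomes k * a * x**(k - 1), and the constant term drops out.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]


# 8x^3 + 60x^2 + 150x + 125, written lowest power first.
g_of_f_coeffs = [125, 150, 60, 8]

# Result is 150 + 120x + 24x^2, that is, 24x^2 + 120x + 150.
print(differentiate_polynomial(g_of_f_coeffs))  # prints: [150, 120, 24]
```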
We’ll first take out a common
factor of six to give six multiplied by four 𝑥 squared plus 20𝑥 plus 25. We may then notice that four 𝑥
squared plus 20𝑥 plus 25 is actually a perfect square. It’s equal to two 𝑥 plus five all
squared. And as two 𝑥 plus five is our
expression for 𝑓 of 𝑥, this is actually equal to 𝑓 of 𝑥 squared. But what about that six? Well, six is equal to two times
three. So, we can write this whole
derivative as two times three times 𝑓 of 𝑥 all squared. But how does this help?
Well, to see this, we need to find
the derivatives of 𝑓 and 𝑔. Applying the power rule, we see
that 𝑓 prime of 𝑥 is equal to two and 𝑔 prime of 𝑥 is equal to three 𝑥
squared. So, that two in our derivative of
the composite function is the same as 𝑓 prime of 𝑥. Now three times 𝑓 of 𝑥 all
squared is actually the derivative of 𝑔 evaluated at 𝑓 of 𝑥. 𝑔 prime of 𝑥 is three 𝑥
squared. So, 𝑔 prime of 𝑓 of 𝑥 is three
𝑓 of 𝑥 squared.
So, what have we found? Well, for this example, we found
that the derivative of 𝑔 of 𝑓 of 𝑥 is equal to the derivative of 𝑓, that’s the
derivative of the inner function, multiplied by the derivative of 𝑔, that’s the
outer function, with the inner function still inside. Now this is an illustration of the
chain rule. It’s not a proof, as that is beyond
the scope of what we’re going to look at in this video.
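The pattern found in this example can be checked with a short Python sketch. It is a spot check at a few integer inputs, not a proof, and the helper names `chain_rule_derivative` and `expanded_derivative` are invented for this illustration.

```python
def f(x):
    """Inner function f(x) = 2x + 5."""
    return 2 * x + 5


def f_prime(x):
    """Derivative of the inner function: f'(x) = 2."""
    return 2


def g_prime(x):
    """Derivative of the outer function g(x) = x^3: g'(x) = 3x^2."""
    return 3 * x ** 2


def chain_rule_derivative(x):
    """f'(x) multiplied by g'(f(x)), the pattern observed in the video."""
    return f_prime(x) * g_prime(f(x))


def expanded_derivative(x):
    """Derivative of the expanded polynomial: 24x^2 + 120x + 150."""
    return 24 * x ** 2 + 120 * x + 150


# Both expressions give the same value at each test point.
for x in [-2, -1, 0, 1, 2, 3]:
    assert chain_rule_derivative(x) == expanded_derivative(x)

print(chain_rule_derivative(1))  # prints: 294
```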
So, the chain rule tells
us that the derivative of the composite function 𝑔 of 𝑓 of 𝑥 is equal to 𝑓 prime
of 𝑥 multiplied by 𝑔 prime of 𝑓 of 𝑥. We can also express the chain rule
using Leibniz’s notation. If 𝑦 is equal to 𝑔 of 𝑓 of 𝑥,
and we let 𝑢 equal 𝑓 of 𝑥 so that 𝑦 becomes 𝑔 of 𝑢, a function of 𝑢, then d𝑦
by d𝑥 is equal to d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥.
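Written symbolically, the two spoken statements of the chain rule above correspond to the following, using the same names 𝑓, 𝑔, 𝑢, and 𝑦 as the video:

```latex
\[
\frac{d}{dx}\, g\bigl(f(x)\bigr) = f'(x)\, g'\bigl(f(x)\bigr)
\]
\[
y = g(u), \quad u = f(x)
\quad\Longrightarrow\quad
\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}
\]
```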
This may look quite complicated,
but it’s actually a relatively straightforward process, as we’ll see in our
examples. Leibniz’s notation is really
helpful because it makes the chain rule a little bit more intuitive. Remember that finding derivatives
is all about small changes in 𝑥. So, let’s allow Δ𝑢 to represent a
small change in 𝑢 as a result of a small change in 𝑥.
In order to find the derivative of
𝑦 with respect to 𝑥, d𝑦 by d𝑥, we consider the difference quotient Δ𝑦 by
Δ𝑥. We see that by multiplying both the
numerator and denominator by Δ𝑢, which must be nonzero, and then reordering the
terms, we get Δ𝑦 over Δ𝑢 multiplied by Δ𝑢 by Δ𝑥. As Δ𝑥 tends to zero, so will both
Δ𝑢 and Δ𝑦, giving d𝑦 by d𝑥 equals d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥. That is the chain rule. The chain rule allows us to
differentiate a wide class of complex functions. Let’s look at some examples.
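Before moving on to the examples, the difference-quotient argument above can be illustrated numerically. The sketch below reuses 𝑓 of 𝑥 equals two 𝑥 plus five and 𝑔 of 𝑢 equals 𝑢 cubed from earlier in the video. The function name `difference_quotients` and the chosen step sizes are assumptions made for this illustration.

```python
def f(x):
    """Inner function u = f(x) = 2x + 5."""
    return 2 * x + 5


def g(u):
    """Outer function y = g(u) = u^3."""
    return u ** 3


def difference_quotients(x, dx):
    """Return delta-y / delta-x computed directly and as a product.

    The second value is (delta-y / delta-u) * (delta-u / delta-x), which
    requires delta-u to be nonzero (true here because f is not constant).
    """
    du = f(x + dx) - f(x)
    dy = g(f(x + dx)) - g(f(x))
    return dy / dx, (dy / du) * (du / dx)


x = 1.0
for dx in [0.1, 0.01, 0.001]:
    direct, factored = difference_quotients(x, dx)
    # Both columns agree, and both approach the derivative 294 as dx shrinks.
    print(dx, direct, factored)
```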
Find the first derivative of the
function 𝑦 equals five 𝑥 squared minus six all to the power of six.
Now we see that this is an example
of a composite function. We consider the first function
to be five 𝑥 squared minus six and the second to be 𝑥 to the power of six. We then take five 𝑥 squared minus six
as the input to our second function, giving five 𝑥 squared minus six all to the
power of six. As this is a composite function, we
can apply the chain rule.
The chain rule tells us that if 𝑦
is a function of 𝑢 and 𝑢 is a function of 𝑥, then d𝑦 by d𝑥 is equal to d𝑦 by
d𝑢 multiplied by d𝑢 by d𝑥. So, we need to decide how we’re
going to define the function 𝑢. Well, we take 𝑢 to be our first
function. It’s the part inside the
parentheses, 𝑢 equals five 𝑥 squared minus six. 𝑦, therefore, becomes a function
of 𝑢. 𝑦 equals 𝑢 to the sixth, and 𝑢
is a function of 𝑥.
We need to find both d𝑦 by d𝑢 and
d𝑢 by d𝑥, which we can do by applying the power rule. In the case of d𝑦 by d𝑢, we
just need to think of all the 𝑥’s in the power rule as being 𝑢’s. We have then that d𝑦 by d𝑢 is
equal to six 𝑢 to the five, and d𝑢 by d𝑥 is equal to 10𝑥. We write down the chain rule and
then make the relevant substitutions, giving d𝑦 by d𝑥 is equal to six 𝑢 to the
five multiplied by 10𝑥.
Now here’s a really important
point. That derivative of 𝑦 with respect
to 𝑥 must be in terms of 𝑥, and, at the moment, we still have the variable 𝑢
involved. So, we must make sure that we
reverse our substitution. 𝑢 is equal to five 𝑥 squared
minus six, so we have six multiplied by five 𝑥 squared minus six to the power of
five multiplied by 10𝑥. Simplifying then, we have that the
first derivative of the function 𝑦 equals five 𝑥 squared minus six to the power of
six is 60𝑥 multiplied by five 𝑥 squared minus six to the power of five.
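One way to sanity-check this answer is to compare it with a numerical estimate of the slope. The sketch below uses a central difference, which is a technique chosen for this illustration and not one used in the video. The helper name `central_difference` and the step size are likewise assumptions.

```python
import math


def y(x):
    """The function from the example: (5x^2 - 6)^6."""
    return (5 * x ** 2 - 6) ** 6


def dy_dx(x):
    """The derivative found with the chain rule: 60x(5x^2 - 6)^5."""
    return 60 * x * (5 * x ** 2 - 6) ** 5


def central_difference(func, x, h=1e-6):
    """Numerical slope estimate: (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# The formula and the numerical estimate agree closely at each test point.
for x in [0.5, 1.0, 1.5]:
    assert math.isclose(central_difference(y, x), dy_dx(x), rel_tol=1e-6)

print(dy_dx(1))  # prints: -60
```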
Now this illustrates a really
powerful application of the chain rule, in fact, a general rule for finding the
derivative of a bracket raised to a power. If we express the derivative as
10𝑥 multiplied by six multiplied by five 𝑥 squared minus six to the power of five,
then we see what we have is the derivative of the bracket, or the derivative of
what’s inside the parentheses, that’s 10𝑥, multiplied by the original power, six,
multiplied by that bracket with the power reduced by one from what it was
originally.
This gives us the chain rule
extension to the power rule. This tells us that if we have a
function 𝑓 of 𝑥 raised to a power, then the derivative is equal to 𝑓 prime of 𝑥,
that’s the derivative of what’s inside the parentheses, multiplied by 𝑛 and then
multiplied by 𝑓 of 𝑥 with the power reduced by one, 𝑓 of 𝑥 to the 𝑛 minus
one. This is particularly useful if we
have really high powers. So, let’s see how we can apply this
rule to another example.
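In symbols, the chain rule extension to the power rule just stated reads:

```latex
\[
\frac{d}{dx}\bigl[f(x)\bigr]^{n} = f'(x) \cdot n \cdot \bigl[f(x)\bigr]^{n-1}
\]
```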
Determine the derivative of 𝑦
equals negative two 𝑥 squared minus three 𝑥 plus four all to the power of 55.
Now this is where we really see
the importance of the chain rule. When we have an exponent as
high as 55, we certainly don’t want to attempt to distribute all the
parentheses. Instead, we’re going to use the
chain rule extension of the power rule, which tells us that the derivative of 𝑓
of 𝑥 to the 𝑛 is 𝑓 prime of 𝑥 multiplied by 𝑛 multiplied by 𝑓 of 𝑥 to the
𝑛 minus one.
So, 𝑓 of 𝑥 will be that
function inside the parentheses, negative two 𝑥 squared minus three 𝑥 plus
four. We can apply the power rule to
differentiate 𝑓 of 𝑥, giving negative four 𝑥 minus three. Now we can work out d𝑦 by
d𝑥. It’s equal to 𝑓 prime of 𝑥,
that’s negative four 𝑥 minus three, multiplied by 𝑛, that’s 55, multiplied by
𝑓 of 𝑥 to the power of 𝑛 minus one, that’s negative two 𝑥 squared minus
three 𝑥 plus four to the power of 54.
There’s no need to expand the
parentheses. So, we’ve found that d𝑦 by d𝑥
is equal to 55 multiplied by negative four 𝑥 minus three multiplied by negative
two 𝑥 squared minus three 𝑥 plus four to the power of 54. And we’ve done this by applying
the chain rule extension to the power rule.
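The chain rule extension to the power rule can be sketched as a small reusable Python helper and applied to this example. The name `power_chain_derivative` is hypothetical, and the central-difference comparison is an added sanity check, not part of the video's method.

```python
import math


def power_chain_derivative(f, f_prime, n):
    """Return a function computing the derivative of f(x)**n.

    Uses f'(x) * n * f(x)**(n - 1), the chain rule extension of the power rule.
    """
    def derivative(x):
        return f_prime(x) * n * f(x) ** (n - 1)
    return derivative


def f(x):
    """Inner function from the example: -2x^2 - 3x + 4."""
    return -2 * x ** 2 - 3 * x + 4


def f_prime(x):
    """Its derivative by the power rule: -4x - 3."""
    return -4 * x - 3


def y(x):
    """The full function from the example: f(x) to the power of 55."""
    return f(x) ** 55


def central_difference(func, x, h=1e-6):
    """Numerical slope estimate: (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)


dy_dx = power_chain_derivative(f, f_prime, 55)

# Compare the formula with a numerical estimate at a sample point.
x = 0.1
assert math.isclose(central_difference(y, x), dy_dx(x), rel_tol=1e-6)
```

Passing the inner function and its derivative as arguments keeps the helper independent of this particular example, so the same sketch covers any power 𝑛.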
We can also apply the chain rule
more than once within the same problem. So, let’s consider an example of
this.
Find the first derivative of
the function 𝑦 equals the square root of eight 𝑥 minus sin of nine 𝑥 to the
power of eight.
Here we have 𝑦 is equal to the
square root of another function, so we have a composite function. We’re, therefore, going to
apply the chain rule. We’re going to define 𝑢 to be
the function underneath the square root, so 𝑢 is equal to eight 𝑥 minus sin of
nine 𝑥 to the power of eight. Then, 𝑦 is equal to the square
root of 𝑢, which we can express using index notation as 𝑢 to the power of
one-half.
The chain rule tells us that
d𝑦 by d𝑥 is equal to d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥. So, we need to find each of
these derivatives. d𝑦 by d𝑢 is relatively straightforward. Using the power rule, we get
one-half 𝑢 to the power of negative one-half. For d𝑢 by d𝑥, the derivative
of eight 𝑥 is just eight. But what about the derivative
of sin of nine 𝑥 to the power of eight?
We actually need to apply the
chain rule again. We can let 𝑔 equal this
function. And we can change the notation
a little to write it as sin nine 𝑥, in parentheses, all raised to the power of eight. It’s an equivalent notation,
but it might make it a little clearer how we’re going to find the
derivative.
We recall the chain rule
extension to the power rule, which told us that if we had a function 𝑓 of 𝑥
raised to a power 𝑛, then its derivative was 𝑓 prime of 𝑥 multiplied by 𝑛
multiplied by 𝑓 of 𝑥 to the power of 𝑛 minus one. Here, we have a function, sin
of nine 𝑥 raised to a power eight, so we can apply the chain rule extension to
the power rule. We need to recall one more rule
which is that the derivative with respect to 𝑥 of sin 𝑎𝑥 is 𝑎 cos 𝑎𝑥.
So, we begin. The derivative of the part
inside the parentheses is nine cos nine 𝑥. Then, we multiply by the power
eight. And then, we have the function
inside the parentheses written out again, but with the power reduced by one. Simplifying gives 72 cos nine
𝑥 sin to the power of seven nine 𝑥. So, now that we found both d𝑦
by d𝑢 and d𝑢 by d𝑥, we can substitute into the chain rule.
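Collected in symbols, the derivatives found so far in this example are as follows, with 𝑢 as defined above:

```latex
\[
\frac{dy}{du} = \frac{1}{2}\, u^{-1/2}
\]
\[
\frac{d}{dx}\bigl(\sin 9x\bigr)^{8}
= 9\cos 9x \cdot 8 \cdot \bigl(\sin 9x\bigr)^{7}
= 72 \cos 9x \, \sin^{7} 9x
\]
\[
\frac{du}{dx} = 8 - 72 \cos 9x \, \sin^{7} 9x
\]
```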
We have then that d𝑦 by d𝑥 is
equal to a half 𝑢 to the power of negative a half multiplied by eight minus 72 cos
nine 𝑥 sin nine 𝑥 to the power of seven. We must also remember to
replace 𝑢 in terms of 𝑥. So, 𝑢 is equal to eight 𝑥
minus sin of nine 𝑥 to the power of eight. We’ll also simplify the
fractions. Dividing by that denominator of
two leaves coefficients of four and 36 in the numerator.
And we recall also that 𝑢 to
the power of negative a half is equal to one over root 𝑢. So, our derivative d𝑦 by d𝑥
simplifies to four minus 36 cos nine 𝑥 sin nine 𝑥 to the power of seven all
over the square root of eight 𝑥 minus sin nine 𝑥 to the power of eight.
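The final answer can be spot-checked numerically in Python. As before, the central-difference comparison is an added check, not the video's method, and the helper names and sample points are assumptions. The sample points are chosen so that the expression under the square root is positive.

```python
import math


def y(x):
    """The function from the example: sqrt(8x - sin^8(9x))."""
    return math.sqrt(8 * x - math.sin(9 * x) ** 8)


def dy_dx(x):
    """The derivative found above: (4 - 36 cos 9x sin^7 9x) / sqrt(8x - sin^8 9x)."""
    s = math.sin(9 * x)
    c = math.cos(9 * x)
    return (4 - 36 * c * s ** 7) / math.sqrt(8 * x - s ** 8)


def central_difference(func, x, h=1e-6):
    """Numerical slope estimate: (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# The formula and the numerical estimate agree closely at each sample point.
for x in [0.5, 1.0, 2.0]:
    assert math.isclose(central_difference(y, x), dy_dx(x), rel_tol=1e-6)
```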
So, we’ve seen within this
question that we can apply the chain rule more than once within the same
problem. In fact, we can apply it as
many times as is necessary.
Let’s remind ourselves then of some
of the key points that we’ve seen in this video. The chain rule is useful for
differentiating composite functions, that’s functions of other functions. If 𝑦 is equal to the composite
function 𝑔 of 𝑓 of 𝑥, then d𝑦 by d𝑥 is equal to 𝑓 prime of 𝑥, that’s the
derivative of the inner function, multiplied by 𝑔 prime of 𝑓 of 𝑥. That’s the derivative of the outer
function with the inner function still inside.
We’ve also seen that if we make
the substitution 𝑢 equals 𝑓 of 𝑥, then 𝑦 becomes a function of 𝑢. And the chain rule can be expressed
as d𝑦 by d𝑥 is equal to d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥. We find the derivative of 𝑦 with
respect to 𝑢 and multiply by the derivative of 𝑢 with respect to 𝑥. We must make sure that we undo our
substitution at the end, so that d𝑦 by d𝑥 is in terms of 𝑥 only.
We’ve also seen the chain rule
extension to the power rule, which tells us that the derivative of a function 𝑓 of
𝑥 to the power of 𝑛 is 𝑓 prime of 𝑥 multiplied by 𝑛 multiplied by 𝑓 of 𝑥 to
the 𝑛 minus one. Finally, we saw that we can apply
the chain rule as many times as we like within a particular problem. The chain rule is a really powerful
tool. And it opens up a really wide class
of functions which we’re able to differentiate.