Lesson Video: The Chain Rule
Mathematics • Higher Education

In this video, we will learn how to find the derivatives of composite functions using the chain rule.

Video Transcript

In this video, we will learn how to differentiate composite functions by applying the chain rule. We will see how to apply this to simple functions initially. And then, we’ll consider more complex functions such as trigonometric and reciprocal trigonometric functions.

Firstly, a reminder of what composite functions are. They are essentially functions of a function. Suppose we have two functions, 𝑓 of 𝑥 equals two 𝑥 plus five and 𝑔 of 𝑥 equals 𝑥 cubed. The composite functions 𝑓 of 𝑔 of 𝑥 and 𝑔 of 𝑓 of 𝑥 are what we get if we compose these two functions in either order. We apply one, and then we apply the other.

𝑓 of 𝑔 of 𝑥 means we apply 𝑔 first, giving 𝑥 cubed. And then, we take this as our input for the function 𝑓, which will give two 𝑥 cubed plus five. 𝑔 of 𝑓 of 𝑥, however, is the composite function we would get if we apply 𝑓 first to give two 𝑥 plus five and then take this as our input to the function 𝑔, which would give two 𝑥 plus five all cubed. And if we distribute the parentheses and simplify, this gives eight 𝑥 cubed plus 60𝑥 squared plus 150𝑥 plus 125.
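The two orders of composition can be checked with a short Python sketch. The helper name `compose` is an illustrative choice, not something from the video, and the check only samples a few integer inputs.

```python
# f and g as defined in the transcript.
def f(x):
    return 2 * x + 5

def g(x):
    return x ** 3

def compose(outer, inner):
    """Return the function x -> outer(inner(x)). Hypothetical helper name."""
    return lambda x: outer(inner(x))

f_of_g = compose(f, g)   # apply g first, then f: 2x^3 + 5
g_of_f = compose(g, f)   # apply f first, then g: (2x + 5)^3

def expanded(x):
    # The expanded form of g(f(x)) quoted in the transcript.
    return 8 * x**3 + 60 * x**2 + 150 * x + 125

# Compare both composites with their closed forms on a few integers.
for x in range(-3, 4):
    assert f_of_g(x) == 2 * x**3 + 5
    assert g_of_f(x) == expanded(x)
```

Because the inputs are integers, the comparisons are exact, so this confirms the expansion at the sampled points without any rounding concerns.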

So, we know how to compose functions. But what about finding their derivatives? Well, in this case, if you’re asked to find the derivatives of either 𝑓 of 𝑔 of 𝑥 or 𝑔 of 𝑓 of 𝑥, it wouldn’t be too bad. Because we could compose the functions first, manipulate them algebraically, and then differentiate the resulting polynomial.

But suppose, instead, the power of 𝑥 in the function 𝑔 of 𝑥 had been 10 or 20 rather than just three, it would be extremely time consuming and tedious to distribute all these parentheses in order to give a polynomial. So, it would be much more helpful for us to have a rule that allows us to differentiate a composite function. And indeed, there is one. It’s called the chain rule.

We’ll illustrate the chain rule by first finding the derivative of the composite function 𝑔 of 𝑓 of 𝑥. So, we’ve defined the functions 𝑓 of 𝑥 and 𝑔 of 𝑥 to be two 𝑥 plus five and 𝑥 cubed, respectively. And we saw that the composite function 𝑔 of 𝑓 of 𝑥 was two 𝑥 plus five all cubed, which simplifies to eight 𝑥 cubed plus 60𝑥 squared plus 150𝑥 plus 125. Now let’s consider finding the derivative of this function.

To do so, we need to recall the power rule, which tells us that the derivative with respect to 𝑥 of 𝑎, that’s a constant, multiplied by 𝑥 to the power of 𝑛 is 𝑎𝑛𝑥 to the power of 𝑛 minus one. And we recall also that in order to find the derivative of a sum or difference, we can just differentiate each term separately and then add or subtract the results.

So, differentiating 𝑔 of 𝑓 of 𝑥 then gives the derivative 𝑔 of 𝑓 of 𝑥 prime, which is 24𝑥 squared plus 120𝑥 plus 150. Remember, the derivative of a constant is just zero. So, when we differentiate that term of plus 125, it just gives zero. Now let’s see if we can manipulate this derivative to see if we can identify any relationship with the derivatives of 𝑓 and 𝑔 individually.
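As a small illustration of the power rule applied term by term, the sketch below stores the expanded polynomial as a list of coefficients, lowest power first. That representation and the function name `differentiate` are assumptions made for this example only.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...],
    where a_k is the coefficient of x^k.

    Each term a*x^n becomes a*n*x^(n-1); the constant term drops out.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]

# 125 + 150x + 60x^2 + 8x^3, lowest power first.
g_of_f = [125, 150, 60, 8]
derivative = differentiate(g_of_f)
print(derivative)  # [150, 120, 24], i.e. 24x^2 + 120x + 150
```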

We’ll first take out a common factor of six to give six multiplied by four 𝑥 squared plus 20𝑥 plus 25. We may then notice that four 𝑥 squared plus 20𝑥 plus 25 is actually a perfect square. It’s equal to two 𝑥 plus five all squared. And as two 𝑥 plus five is our expression for 𝑓 of 𝑥, this is actually equal to 𝑓 of 𝑥 squared. But what about that six? Well, six is equal to two times three. So, we can write this whole derivative as two times three times 𝑓 of 𝑥 all squared. But how does this help?

Well, to see this, we need to find the derivatives of 𝑓 and 𝑔. Applying the power rule, we see that 𝑓 prime of 𝑥 is equal to two and 𝑔 prime of 𝑥 is equal to three 𝑥 squared. So, that two in our derivative of the composite function is the same as 𝑓 prime of 𝑥. Now three times 𝑓 of 𝑥 all squared is actually the derivative of 𝑔 evaluated at 𝑓 of 𝑥. 𝑔 prime of 𝑥 is three 𝑥 squared. So, 𝑔 prime of 𝑓 of 𝑥 is three 𝑓 of 𝑥 squared.

So, what have we found? Well, for this example, we found that the derivative of 𝑔 of 𝑓 of 𝑥 is equal to the derivative of 𝑓, that’s the derivative of the inner function, multiplied by the derivative of 𝑔, that’s the outer function, with the inner function still inside. Now this is an illustration of the chain rule rather than a proof. A proof is beyond the scope of what we’re going to look at in this video.

So, the chain rule then, it tells us that the derivative of the composite function 𝑔 of 𝑓 of 𝑥 is equal to 𝑓 prime of 𝑥 multiplied by 𝑔 prime of 𝑓 of 𝑥. We can also express the chain rule using Leibniz’s notation. If 𝑦 is equal to 𝑔 of 𝑓 of 𝑥, and we let 𝑢 equal 𝑓 of 𝑥 so that 𝑦 becomes 𝑔 of 𝑢, a function of 𝑢, then d𝑦 by d𝑥 is equal to d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥.
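The statement above can be checked against the expanded derivative for this particular pair of functions. The following is a minimal sketch with illustrative function names. It compares the two expressions at a handful of integers, which is a sanity check and not a proof.

```python
def f(x):
    return 2 * x + 5

def f_prime(x):
    return 2

def g_prime(x):
    return 3 * x ** 2

def chain_rule_derivative(x):
    # f'(x) * g'(f(x)): the derivative of the inner function times the
    # derivative of the outer function evaluated at the inner function.
    return f_prime(x) * g_prime(f(x))

def expanded_derivative(x):
    # Derivative of the expanded polynomial 8x^3 + 60x^2 + 150x + 125.
    return 24 * x**2 + 120 * x + 150

for x in range(-5, 6):
    assert chain_rule_derivative(x) == expanded_derivative(x)
```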

This may look quite complicated, but it’s actually a relatively straightforward process, as we’ll see in our examples. Leibniz’s notation is really helpful because it makes the chain rule a little bit more intuitive. Remember that finding derivatives is all about small changes in 𝑥. So, let’s allow Δ𝑢 to represent a small change in 𝑢 as a result of a small change in 𝑥.

In order to find the derivative of 𝑦 with respect to 𝑥, d𝑦 by d𝑥, we consider the difference quotient Δ𝑦 by Δ𝑥. We see that by multiplying both the numerator and denominator by Δ𝑢, which must be nonzero, and then reordering the terms, we get Δ𝑦 by Δ𝑢 multiplied by Δ𝑢 by Δ𝑥. As Δ𝑥 tends to zero, so will both Δ𝑢 and Δ𝑦, giving d𝑦 by d𝑥 equals d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥. That is the chain rule. The chain rule allows us to differentiate a wide class of complex functions. Let’s look at some examples.
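Written out symbolically, the informal argument above looks roughly as follows. This is a sketch of the intuition only: it assumes Δ𝑢 is nonzero and that both limits on the right exist, which is exactly where a full proof needs more care.

```latex
\[
\frac{\Delta y}{\Delta x}
  = \frac{\Delta y}{\Delta x}\cdot\frac{\Delta u}{\Delta u}
  = \frac{\Delta y}{\Delta u}\cdot\frac{\Delta u}{\Delta x},
  \qquad \Delta u \neq 0,
\]
\[
\frac{dy}{dx}
  = \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x}
  = \left(\lim_{\Delta u \to 0}\frac{\Delta y}{\Delta u}\right)
    \left(\lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}\right)
  = \frac{dy}{du}\cdot\frac{du}{dx}.
\]
```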

Find the first derivative of the function 𝑦 equals five 𝑥 squared minus six all to the power of six.

Now we see that this is an example of a composite function if we consider the first function to be five 𝑥 squared minus six and the second to be 𝑥 to the power of six. We take five 𝑥 squared minus six as the input to our second function, giving five 𝑥 squared minus six all to the power of six. As this is a composite function, we can apply the chain rule.

The chain rule tells us that if 𝑦 is a function of 𝑢 and 𝑢 is a function of 𝑥, then d𝑦 by d𝑥 is equal to d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥. So, we need to decide how we’re going to define the function 𝑢. Well, we take 𝑢 to be our first function. It’s the part inside the parentheses, 𝑢 equals five 𝑥 squared minus six. 𝑦, therefore, becomes a function of 𝑢. 𝑦 equals 𝑢 to the sixth, and 𝑢 is a function of 𝑥.

We need to find both d𝑦 by d𝑢 and d𝑢 by d𝑥, which we can do by applying the power rule. In the case of d𝑦 by d𝑢, we just need to think of all the 𝑥’s in the power rule as being 𝑢’s. We have then that d𝑦 by d𝑢 is equal to six 𝑢 to the five, and d𝑢 by d𝑥 is equal to 10𝑥. We write down the chain rule and then make the relevant substitutions, giving d𝑦 by d𝑥 is equal to six 𝑢 to the five multiplied by 10𝑥.

Now here’s a really important point. That derivative of 𝑦 with respect to 𝑥 must be in terms of 𝑥, and, at the moment, we still have the variable 𝑢 involved. So, we must make sure that we reverse our substitution. 𝑢 is equal to five 𝑥 squared minus six, so we have six multiplied by five 𝑥 squared minus six to the power of five multiplied by 10𝑥. Simplifying then, we have that the first derivative of the function 𝑦 equals five 𝑥 squared minus six to the power of six is 60𝑥 multiplied by five 𝑥 squared minus six to the power of five.
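One way to sanity-check the result 60𝑥 multiplied by five 𝑥 squared minus six to the power of five is to compare it with a numerical estimate of the slope. The sketch below uses a central difference. The helper name `central_difference`, the step size, and the sample points are all choices made for this illustration.

```python
import math

def y(x):
    return (5 * x**2 - 6) ** 6

def dy_dx(x):
    # Chain rule result from the transcript: 60x(5x^2 - 6)^5.
    return 60 * x * (5 * x**2 - 6) ** 5

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Sample points chosen away from the zeros of the derivative, so that a
# relative tolerance is meaningful.
for x in (-1.5, -0.5, 0.5, 1.0, 2.0):
    assert math.isclose(dy_dx(x), central_difference(y, x), rel_tol=1e-6)
```

A numerical check like this cannot prove the formula, but a mistake in the substitution step, such as leaving out the factor of 10𝑥, would make the assertion fail.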

Now this illustrates a really powerful application of the chain rule, in fact, a general rule for finding the derivative of a bracket raised to a power. If we express the derivative as 10𝑥 multiplied by six multiplied by five 𝑥 squared minus six to the power of five, then we see what we have is the derivative of the bracket, or the derivative of what’s inside the parentheses, that’s 10𝑥, multiplied by the original power, six, multiplied by that bracket with the power reduced by one from what it was originally.

This gives us the chain rule extension to the power rule. This tells us that if we have a function 𝑓 of 𝑥 raised to a power, then the derivative is equal to 𝑓 prime of 𝑥, that’s the derivative of what’s inside the parentheses, multiplied by 𝑛 and then multiplied by 𝑓 of 𝑥 with the power reduced by one, 𝑓 of 𝑥 to the 𝑛 minus one. This is particularly useful if we have really high powers. So, let’s see how we can apply this rule to another example.
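The rule just stated translates almost word for word into code. In this sketch, `power_chain_rule` is a hypothetical helper that takes the inner function, its derivative, and the power, and returns the derivative as a new function. It is checked against the previous example.

```python
def power_chain_rule(f, f_prime, n):
    """Return the derivative of x -> f(x)**n as f'(x) * n * f(x)**(n - 1)."""
    return lambda x: f_prime(x) * n * f(x) ** (n - 1)

# Re-deriving the previous example: f(x) = 5x^2 - 6, f'(x) = 10x, n = 6.
derivative = power_chain_rule(lambda x: 5 * x**2 - 6, lambda x: 10 * x, 6)

# Exact comparison on integers against 60x(5x^2 - 6)^5.
for x in range(-3, 4):
    assert derivative(x) == 60 * x * (5 * x**2 - 6) ** 5
```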

Determine the derivative of 𝑦 equals negative two 𝑥 squared minus three 𝑥 plus four all to the power of 55.

Now this is where we really see the importance of the chain rule. When we have an exponent as high as 55, we certainly don’t want to attempt to distribute all the parentheses. Instead, we’re going to use the chain rule extension of the power rule, which tells us that the derivative of 𝑓 of 𝑥 to the 𝑛 is 𝑓 prime of 𝑥 multiplied by 𝑛 multiplied by 𝑓 of 𝑥 to the 𝑛 minus one.

So, 𝑓 of 𝑥 will be that function inside the parentheses, negative two 𝑥 squared minus three 𝑥 plus four. We can apply the power rule to differentiate 𝑓 of 𝑥, giving negative four 𝑥 minus three. Now we can work out d𝑦 by d𝑥. It’s equal to 𝑓 prime of 𝑥, that’s negative four 𝑥 minus three, multiplied by 𝑛, that’s 55, multiplied by 𝑓 of 𝑥 to the power of 𝑛 minus one, that’s negative two 𝑥 squared minus three 𝑥 plus four to the power of 54.

There’s no need to expand the parentheses. So, we’ve found that d𝑦 by d𝑥 is equal to 55 multiplied by negative four 𝑥 minus three multiplied by negative two 𝑥 squared minus three 𝑥 plus four to the power of 54. And we’ve done this by applying the chain rule extension to the power rule.
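Although expanding by hand would be impractical, a computer can expand the power exactly, which gives an independent check of the answer. The sketch below is an illustration under stated assumptions: polynomials are stored as integer coefficient lists, lowest power first, and the helper names `poly_mul`, `poly_pow`, and `poly_diff` are made up for this example.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power by repeated multiplication."""
    result = [1]
    for _ in range(n):
        result = poly_mul(result, p)
    return result

def poly_diff(p):
    """Differentiate term by term with the power rule."""
    return [k * a for k, a in enumerate(p)][1:]

f = [4, -3, -2]          # 4 - 3x - 2x^2, i.e. -2x^2 - 3x + 4
f_prime = poly_diff(f)   # [-3, -4], i.e. -4x - 3

# Left side: expand f^55 (degree 110) and differentiate it directly.
brute_force = poly_diff(poly_pow(f, 55))

# Right side: 55 * f'(x) * f(x)^54 from the chain rule extension.
chain_rule = [55 * c for c in poly_mul(f_prime, poly_pow(f, 54))]

assert brute_force == chain_rule
```

Python integers have arbitrary precision, so the two coefficient lists are compared exactly and no floating-point tolerance is involved.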

We can also apply the chain rule more than once within the same problem. So, let’s consider an example of this.

Find the first derivative of the function 𝑦 equals the square root of eight 𝑥 minus sin to the power of eight of nine 𝑥.

Here we have 𝑦 is equal to the square root of another function, so we have a composite function. We’re, therefore, going to apply the chain rule. We’re going to define 𝑢 to be the function underneath the square root, so 𝑢 is equal to eight 𝑥 minus sin of nine 𝑥 to the power of eight. Then, 𝑦 is equal to the square root of 𝑢, which we can express using index notation as 𝑢 to the power of one-half.

The chain rule tells us that d𝑦 by d𝑥 is equal to d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥. So, we need to find each of these derivatives. d𝑦 by d𝑢 is relatively straightforward. Using the power rule, we get one-half 𝑢 to the power of negative one-half. For d𝑢 by d𝑥, the derivative of eight 𝑥 is just eight. But what about the derivative of sin of nine 𝑥 to the power of eight?

We actually need to apply the chain rule again. We can let 𝑔 equal this function. And we can change the notation a little to write it as sin of nine 𝑥, all to the power of eight. It’s an equivalent notation, but it might make it a little clearer how we’re going to find the derivative.

We recall the chain rule extension to the power rule, which told us that if we had a function 𝑓 of 𝑥 raised to a power 𝑛, then its derivative was 𝑓 prime of 𝑥 multiplied by 𝑛 multiplied by 𝑓 of 𝑥 to the power of 𝑛 minus one. Here, we have a function, sin of nine 𝑥 raised to a power eight, so we can apply the chain rule extension to the power rule. We need to recall one more rule which is that the derivative with respect to 𝑥 of sin 𝑎𝑥 is 𝑎 cos 𝑎𝑥.

So, we begin. The derivative of the part inside the parentheses is nine cos nine 𝑥. Then, we multiply by the power eight. And then, we have the function inside the parentheses written out again, but with the power reduced by one. Simplifying gives 72 cos nine 𝑥 multiplied by sin nine 𝑥 to the power of seven. So, now that we’ve found both d𝑦 by d𝑢 and d𝑢 by d𝑥, we can substitute into the chain rule.

We have then that d𝑦 by d𝑥 is equal to a half 𝑢 to the power of negative a half multiplied by eight minus 72 cos nine 𝑥 sin nine 𝑥 to the power of seven. We must also remember to replace 𝑢 in terms of 𝑥. So, 𝑢 is equal to eight 𝑥 minus sin of nine 𝑥 to the power of eight. We’ll also simplify the fractions. Dividing by that denominator of two leaves coefficients of four and 36 in the numerator.

And we recall also that 𝑢 to the power of negative a half is equal to one over root 𝑢. So, our derivative d𝑦 by d𝑥 simplifies to four minus 36 cos nine 𝑥 sin nine 𝑥 to the power of seven all over the square root of eight 𝑥 minus sin nine 𝑥 to the power of eight.
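The final expression can be sanity-checked numerically in the same spirit. This sketch reads the function as the square root of 8𝑥 − (sin 9𝑥)⁸, which is the interpretation the working above relies on. The helper name `central_difference`, the step size, and the sample points are assumptions of the example.

```python
import math

def y(x):
    return math.sqrt(8 * x - math.sin(9 * x) ** 8)

def dy_dx(x):
    # Result from the transcript:
    # (4 - 36 cos(9x) sin^7(9x)) / sqrt(8x - sin^8(9x)).
    numerator = 4 - 36 * math.cos(9 * x) * math.sin(9 * x) ** 7
    return numerator / math.sqrt(8 * x - math.sin(9 * x) ** 8)

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Sample points where 8x - sin^8(9x) is positive, so the square root is real.
for x in (0.5, 1.0, 2.0):
    assert math.isclose(dy_dx(x), central_difference(y, x), rel_tol=1e-6)
```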

So, we’ve seen within this question that we can apply the chain rule more than once within the same problem. In fact, we can apply it as many times as is necessary.

Let’s remind ourselves then of some of the key points that we’ve seen in this video. The chain rule is useful for differentiating composite functions, that’s functions of other functions. If 𝑦 is equal to the composite function 𝑔 of 𝑓 of 𝑥, then d𝑦 by d𝑥 is equal to 𝑓 prime of 𝑥, that’s the derivative of the inner function, multiplied by 𝑔 prime of 𝑓 of 𝑥. That’s the derivative of the outer function with the inner function still inside.

We’ve also seen that if we make the substitution 𝑢 equals 𝑓 of 𝑥, then 𝑦 becomes a function of 𝑢. And the chain rule can be expressed as d𝑦 by d𝑥 is equal to d𝑦 by d𝑢 multiplied by d𝑢 by d𝑥. We find the derivative of 𝑦 with respect to 𝑢 and multiply by the derivative of 𝑢 with respect to 𝑥. We must make sure that we undo our substitution at the end, so that d𝑦 by d𝑥 is in terms of 𝑥 only.

We’ve also seen the chain rule extension to the power rule, which tells us that the derivative of a function 𝑓 of 𝑥 to the power of 𝑛 is 𝑓 prime of 𝑥 multiplied by 𝑛 multiplied by 𝑓 of 𝑥 to the 𝑛 minus one. Finally, we saw that we can apply the chain rule as many times as we like within a particular problem. The chain rule is a really powerful tool. And it opens up a really wide class of functions which we’re able to differentiate.
