In this explainer, we will learn how to find the derivatives of composite functions using the chain rule.
Once we have learned how to differentiate simple functions, we might start to wonder how we can differentiate more complex functions. Generally, more complex functions are created from simpler ones by combining them in various ways. There are a few basic ways to combine two functions $f$ and $g$:
- addition or subtraction: $f(x) \pm g(x)$;
- multiplication or division: $f(x)g(x)$ or $\frac{f(x)}{g(x)}$;
- composition: $f(g(x))$.
To be able to differentiate more complex functions, it would be very helpful to have rules that tell us how to differentiate functions combined in these particular ways. At this point in a calculus course, we already know that the derivative of a sum is the sum of the derivatives:
$$\frac{d}{dx}\bigl(f(x) + g(x)\bigr) = f'(x) + g'(x).$$
Furthermore, we know that differentiation is actually a linear operation. This means that, in addition to the sum rule, we have the following rule for multiplication by a constant $c$:
$$\frac{d}{dx}\bigl(c\,f(x)\bigr) = c\,f'(x).$$
There are also rules for multiplication and division known as the product and quotient rules. However, in this explainer, we will focus on the rule for differentiating composite functions.
Let's begin by considering an example where we differentiate a composite function by first simplifying the composite expression and then applying the power rule to the resulting expression. This example will lead us to the general formula for differentiating composite functions.
Example 1: Derivatives of Composite Functions
Consider the function .
- By expanding the binomial, find the derivative of .
- Let and . Find the derivative of and .
- Express in terms of , , and .
Answer
Part 1
We begin by expanding the parentheses. We can do this using the binomial theorem or by simply multiplying out the parentheses. Below, we simply multiply out the parentheses:
We can now use the power rule, $\frac{d}{dx}\left(x^n\right) = nx^{n-1}$, to differentiate each term as follows:
Part 2
Using the power rule, we can easily find the derivatives of and as follows:
Part 3
We would like to try to find an expression for in terms of , , and . To do this, we begin by factoring the expression we found for in part 1. We start by factoring out the common factor of 6 from the expression:
We can now factor the expression in parentheses as follows:
Since , we can rewrite this as
Furthermore, we know that ; therefore,
Finally, since , we have
In the previous example, we had a function defined as the composition of an outer function and an inner function. We found that its derivative was the derivative of the outer function, evaluated at the inner function, multiplied by the derivative of the inner function. Although we considered this for two specific functions, the rule itself generalizes to any composition of differentiable functions; this result is known as the chain rule.
Rule: The Chain Rule
Given a function $g$ that is differentiable at $x$ and a function $f$ that is differentiable at $g(x)$, their composition $f \circ g$ defined by $(f \circ g)(x) = f(g(x))$ is differentiable at $x$ and its derivative is given by
$$(f \circ g)'(x) = f'(g(x))\,g'(x).$$
We can write this in Leibniz notation as
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx},$$
where $y = f(u)$ and $u = g(x)$.
The nice thing about using Leibniz notation is that it makes the chain rule very intuitive since the fractional notations on the right-hand side of the equation formally simplify to the expression on the left-hand side.
We let $\Delta u$ be the change in $u$ as a result of a small change in $x$, $\Delta x$, which we can write as
$$\Delta u = g(x + \Delta x) - g(x).$$
This change in $u$ has a corresponding change in $y$:
$$\Delta y = f(u + \Delta u) - f(u).$$
We can now consider the difference quotient $\frac{\Delta y}{\Delta x}$. If $\Delta u \neq 0$, we can multiply the numerator and denominator by $\Delta u$ to get
$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x}.$$
We can then take limits as $\Delta x \to 0$ and arrive at the expression for the chain rule. This reasoning is not quite good enough for a proof since it is quite possible that $\Delta u$ is zero even if $\Delta x \neq 0$. Therefore, to prove the chain rule, we need to be careful with this point. However, reasoning like this demonstrates how reasonable and intuitive the formula for the chain rule is.
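The difference-quotient argument can be checked numerically. The sketch below is only an illustration: the outer function `f(u) = u**3`, the inner function `g(x) = x**2 + 1`, and the point `x = 1` are assumptions chosen for convenience and do not come from the examples in this explainer. It computes the quotient of the change in the composite by the change in `x`, checks that it factors into the two intermediate quotients when the change in `u` is nonzero, and compares it with the value given by the chain rule.

```python
def f(u):
    """Hypothetical outer function (an assumption for this sketch)."""
    return u ** 3


def g(x):
    """Hypothetical inner function (an assumption for this sketch)."""
    return x ** 2 + 1


def difference_quotients(x, dx):
    """Return (dy/dx, dy/du, du/dx) as difference quotients for y = f(g(x))."""
    du = g(x + dx) - g(x)            # change in u caused by the change in x
    dy = f(g(x + dx)) - f(g(x))      # corresponding change in y
    if du == 0:
        raise ValueError("du is zero, so dy/du is undefined for this step")
    return dy / dx, dy / du, du / dx


x = 1.0
# Chain rule value f'(g(x)) * g'(x) = 3 * (x^2 + 1)^2 * 2x for these functions.
exact = 3 * g(x) ** 2 * (2 * x)

for dx in (1e-1, 1e-3, 1e-5):
    total, outer, inner = difference_quotients(x, dx)
    print(f"dx={dx:g}  dy/dx={total:.6f}  product={outer * inner:.6f}  exact={exact:.6f}")
```

As the step size shrinks, the printed quotient should approach the chain rule value, while the factorization into two quotients holds at every step size up to rounding. The explicit check for a zero change in `u` mirrors the gap in the informal argument.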
Let's consider an example where we apply the chain rule using the Leibniz notation.
Example 2: Finding Derivatives Using the Chain Rule
Determine the derivative of .
Answer
An example like this really demonstrates the importance of the chain rule. It is of course possible to expand the parentheses and get a long polynomial expression. However, clearly, this would be a considerable amount of algebra. Instead, we can apply the chain rule which will prove much simpler and less prone to error.
We begin by identifying the inner and outer functions. We let , and then . We now find the derivatives of and . Using the power rule, we have
Substituting into the expression , we obtain
Substituting these into the formula for the chain rule, we can find the derivative of as follows:
As we can see from the last example, one of the key skills in applying the chain rule is identifying the function composition.
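To make the step of identifying the composition concrete, here is a short worked sketch for an illustrative function, $y = (5x^3 - 2x)^7$. This function is an assumption chosen for the illustration and is not the one from the example above. The inner function is the expression inside the parentheses and the outer function is the seventh power.

```latex
\begin{align*}
u &= 5x^3 - 2x, & y &= u^7, \\
\frac{du}{dx} &= 15x^2 - 2, & \frac{dy}{du} &= 7u^6, \\
\frac{dy}{dx} &= \frac{dy}{du} \cdot \frac{du}{dx}
  = 7\left(5x^3 - 2x\right)^6\left(15x^2 - 2\right).
\end{align*}
```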
Let us consider another example where we apply the chain rule. In this example, we will use the prime notation rather than the Leibniz notation for the chain rule.
Example 3: Using the Chain Rule
Determine the derivative of .
Answer
The function is the composition of two functions. We first need to identify the correct choice of inner and outer functions. In this case, the natural choice of inner function is , which gives an outer function of . We can now find the derivatives of and . Using the power rule, the derivative of is simply
Similarly, we can use the power rule to find the derivative of :
Substituting these into the formula for the chain rule, we can find the derivative of as follows:
In the previous example, there was one obvious choice of inner and outer functions when applying the chain rule. Often, there is a natural choice; however, sometimes we will find that there is more than one possible choice. In these cases, we try to pick the functions that minimize the work we need to do. Let us consider an example where we need to choose the inner and outer functions carefully.
Example 4: Finding the Derivative at a Point Using the Chain Rule
Evaluate at , where .
Answer
For a question like this, we have more than one possible choice for our inner and outer functions. We could choose our inner function to be , which would result in an outer function of , or we could choose an inner function of , which yields as the outer function. If we choose the first option, we would need to apply the chain rule a second time to find the derivative of ; for this reason, the second choice of inner and outer functions is better since it requires only one application of the chain rule. Therefore, we set and . Using the power rule, we can find the derivatives of and as follows:
Replacing with in the expression , we obtain
Substituting these into the formula for the chain rule, we have
We can now evaluate this at as follows:
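The effect of the choice can be seen side by side on a small illustrative function, $y = \sqrt{(2x+1)^3}$ for $x > -\tfrac{1}{2}$. This function is an assumption made for the sketch and is not the one from the example above. Both decompositions give the same derivative, but the first needs a second application of the chain rule to find $\frac{du}{dx}$, while the second does not.

```latex
\begin{align*}
\text{Choice 1: } & u = (2x+1)^3,\quad y = u^{1/2}, \\
& \frac{du}{dx} = 3(2x+1)^2 \cdot 2 = 6(2x+1)^2 \quad \text{(chain rule again)}, \\
& \frac{dy}{dx} = \frac{1}{2}u^{-1/2} \cdot 6(2x+1)^2
  = \frac{3(2x+1)^2}{(2x+1)^{3/2}} = 3\sqrt{2x+1}. \\[1ex]
\text{Choice 2: } & u = 2x+1,\quad y = u^{3/2}, \\
& \frac{dy}{dx} = \frac{3}{2}u^{1/2} \cdot 2 = 3\sqrt{2x+1}.
\end{align*}
```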
Sometimes, we might need to apply the chain rule in situations where we do not have an expression for a particular function, but we have information about the value of the derivative at a given point. The following question is an example of this.
Example 5: Using the Chain Rule with Unknown Functions
Given that , , and , determine at .
Answer
Given that , we can apply the chain rule to find the derivative where our inner function is and our outer function is . We begin by calculating the derivatives and as follows:
We can now substitute these expressions into the chain rule as follows:
To evaluate this at , we have
Substituting in and , we get
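The same idea can be sketched in code for the general case where only point values are known. The numbers below are hypothetical and are not the values from the example above: we assume an inner function with value 5 and slope -3 at the point 2, and an outer function with slope 4 at the input 5. The helper name `chain_rule_at_point` is likewise invented for this illustration.

```python
def chain_rule_at_point(outer_slope_at_inner_value, inner_slope):
    """Slope of f(g(x)) at x = a, given f'(g(a)) and g'(a)."""
    return outer_slope_at_inner_value * inner_slope


# Hypothetical tabulated data (assumptions for this sketch):
g_at = {2: 5}          # g(2) = 5
g_prime_at = {2: -3}   # g'(2) = -3
f_prime_at = {5: 4}    # f'(5) = 4

a = 2
# The outer derivative must be looked up at g(a), not at a itself.
slope = chain_rule_at_point(f_prime_at[g_at[a]], g_prime_at[a])
print(slope)
```

Note that no formula for either function is needed; the only subtlety is evaluating the outer derivative at the value of the inner function.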
In our final example, we will consider a function that is the composition of multiple functions.
Example 6: Applying the Chain Rule Multiple Times
Find the derivative of the function .
Answer
The first thing we need to do is identify our outer and inner functions. We set ; then, . We can now find the derivatives of each of these parts and apply the chain rule. We begin by finding the derivative as follows:
We now need to find the derivative of with respect to . The first term is easy to differentiate, but the second term is a composition of functions. Hence, to find the derivative of this term, we will need to apply the chain rule. We begin by writing
We let ; then, we can set our inner function as , which results in an outer function of . The definition of corresponds exactly to the definition of . Hence, its derivative will be
We can now find the derivative of with respect to , which we can easily do using the product rule as follows:
We can now apply the chain rule to to get
Substituting this back into the expression for , we have
We can now apply the chain rule to as follows:
When we are applying the chain rule multiple times, we should take a top-down approach, as if we were peeling the layers off an onion. Hence, we should find the outermost function and then deal with the inner function, which might require a fresh application of the chain rule.
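The layer-by-layer approach can be sketched numerically. The three functions below are assumptions chosen for illustration, giving the composition $y = \sqrt{(x^2+1)^3 + 1}$; they are not the function from the example above. The sketch multiplies the derivative of each layer, working from the outside in, and compares the result with a central-difference estimate.

```python
def k(x):
    return x ** 2 + 1          # innermost layer (hypothetical)


def k_prime(x):
    return 2 * x


def g(v):
    return v ** 3 + 1          # middle layer (hypothetical)


def g_prime(v):
    return 3 * v ** 2


def f(u):
    return u ** 0.5            # outermost layer (hypothetical)


def f_prime(u):
    return 0.5 * u ** -0.5


def derivative_of_composition(x):
    """Chain rule applied twice: f'(g(k(x))) * g'(k(x)) * k'(x)."""
    v = k(x)                   # value passed to the middle layer
    u = g(v)                   # value passed to the outermost layer
    return f_prime(u) * g_prime(v) * k_prime(x)


def central_difference(func, x, h=1e-6):
    """Numerical estimate of func'(x), used only as a sanity check."""
    return (func(x + h) - func(x - h)) / (2 * h)


x = 0.5
analytic = derivative_of_composition(x)
numeric = central_difference(lambda t: f(g(k(t))), x)
print(analytic, numeric)
```

Each factor in the product corresponds to one layer of the onion, and each is evaluated at the output of the layer beneath it.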
Let's recap a few important points from this explainer.
Key Points
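- The chain rule states that, given a function $g$ that is differentiable at $x$ and a function $f$ that is differentiable at $g(x)$, their composition $f \circ g$ defined by $(f \circ g)(x) = f(g(x))$ is differentiable at $x$ and its derivative is given by $(f \circ g)'(x) = f'(g(x))\,g'(x)$. We can write this in Leibniz notation as $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$, where $y = f(u)$ and $u = g(x)$.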
- Sometimes we will find that there is more than one possible choice of inner and outer functions. In these cases, we try to pick the functions that minimize the work we need to do.
- When we differentiate the composition of three or more functions, we need to apply the chain rule multiple times. We begin with the outermost function, and finding the derivative of the inner function will require a further application of the chain rule.