Explainer: The Chain Rule

In this explainer, we will learn how to find the derivatives of composite functions using the chain rule.

Once we have learned how to differentiate simple functions, we might start to wonder how we can differentiate more complex functions. Generally, more complex functions are created from simpler ones by combining them together in various ways. There are a few basic ways to combine two functions 𝑓(𝑥) and 𝑔(𝑥):

  1. addition or subtraction: 𝑓(𝑥)±𝑔(𝑥);
  2. multiplication or division: 𝑓(𝑥)𝑔(𝑥) or 𝑓(𝑥)/𝑔(𝑥);
  3. composition: 𝑓(𝑔(𝑥)) or 𝑔(𝑓(𝑥)).

To differentiate more complex functions, it helps to have rules that tell us how to differentiate functions combined in each of these ways. At this point in a calculus course, we already know that the derivative of a sum or difference is the sum or difference of the derivatives: $(f \pm g)' = f' \pm g'$.

Furthermore, we know that differentiation is a linear operation. This means that, in addition to the sum rule, we have the following rule for multiplication by a constant $c$: $(cf)' = cf'$.

There are also rules for multiplication and division known as the product and quotient rules. However, in this explainer, we will focus on the rule for differentiating composite functions.

Example 1: Derivatives of Composite Functions

Consider the function $f(x) = (2x+1)^3$.

  1. By expanding the binomial, find the derivative of $f$.
  2. Let $g(x) = x^3$ and $h(x) = 2x+1$. Find the derivatives of $g$ and $h$.
  3. Express $f'$ in terms of $h'$, $g'$, and $h$.

Answer

Part 1

We begin by expanding the parentheses. We can do this using the binomial theorem or by simply multiplying out the parentheses. Below, we multiply out the parentheses: $f(x) = (2x+1)^3 = (2x+1)\left(4x^2+4x+1\right) = 8x^3+8x^2+2x+4x^2+4x+1 = 8x^3+12x^2+6x+1$.

We can now use the power rule, $\frac{\mathrm{d}}{\mathrm{d}x}x^n = nx^{n-1}$, to differentiate each term as follows: $f'(x) = 3(8)x^2 + 2(12)x + 6 = 24x^2+24x+6$.

Part 2

Using the power rule, we can find the derivatives of $g$ and $h$ as follows: $g'(x) = 3x^2$, $h'(x) = 2$.

Part 3

We would like to find an expression for $f'$ in terms of $h'$, $g'$, and $h$. To do this, we begin by factoring the expression we found for $f'$ in part 1. We start by factoring out the common factor of 6: $f'(x) = 6\left(4x^2+4x+1\right)$.

We can now factor the expression in parentheses as follows: $f'(x) = 6(2x+1)^2$.

Since $h(x) = 2x+1$, we can rewrite this as $f'(x) = 6(h(x))^2$.

Furthermore, we know that $g'(x) = 3x^2$, so $g'(h(x)) = 3(h(x))^2$; therefore, $f'(x) = 2g'(h(x))$.

Finally, since $h'(x) = 2$, we have $f'(x) = h'(x)g'(h(x))$.
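The identity found in Part 3 can be checked with exact integer arithmetic. The sketch below assumes the functions of this example are $f(x) = (2x+1)^3$, $g(x) = x^3$, and $h(x) = 2x+1$; the Python function names are illustrative only.

```python
# Check of Example 1, assuming f(x) = (2x + 1)^3, g(x) = x^3, h(x) = 2x + 1.

def f_prime_expanded(x):
    # Derivative of the expanded polynomial 8x^3 + 12x^2 + 6x + 1.
    return 24 * x**2 + 24 * x + 6

def h(x):
    return 2 * x + 1

def h_prime(x):
    return 2

def g_prime(u):
    return 3 * u**2

def f_prime_chain(x):
    # The form found in Part 3: h'(x) * g'(h(x)).
    return h_prime(x) * g_prime(h(x))

# Both forms should agree at every integer we try.
for x in range(-3, 4):
    print(x, f_prime_expanded(x), f_prime_chain(x))
```

Agreement at a handful of points is not a proof, but two polynomials of degree 2 that agree at more than two points must be identical, so here it is conclusive.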

In the previous question, we had a function $f$ defined as the composition of two functions $g$ and $h$; that is, $f(x) = g(h(x))$.

We found that the derivative of this function was given by $f'(x) = h'(x)g'(h(x))$.

Although we considered this for two specific functions, the rule itself generalizes to any composition of differentiable functions; this result is known as the chain rule.

The Chain Rule

Given a function $h$ that is differentiable at $x$ and a function $g$ that is differentiable at $h(x)$, their composition $f = g \circ h$ defined by $f(x) = g(h(x))$ is differentiable at $x$ and its derivative $f'$ is given by $f'(x) = h'(x)g'(h(x))$.

We can write this in Leibniz notation as $\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x}$, where $y = g(u)$ and $u = h(x)$.
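As a short worked instance of the Leibniz form, we can revisit the function from Example 1, taken here to be $y = (2x+1)^3$ with the substitution $u = 2x+1$:

```latex
\begin{align*}
y &= u^3, \qquad u = 2x + 1, \\
\frac{\mathrm{d}y}{\mathrm{d}u} &= 3u^2, \qquad \frac{\mathrm{d}u}{\mathrm{d}x} = 2, \\
\frac{\mathrm{d}y}{\mathrm{d}x} &= \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x}
  = 3u^2 \cdot 2 = 6(2x+1)^2 .
\end{align*}
```

The last step substitutes $u = 2x+1$ back in, so that the answer is expressed in terms of $x$ alone.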

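The nice thing about Leibniz notation is that the chain rule then looks like the cancellation of a common factor in a product of fractions, which makes the formula easy to remember and motivates the following intuitive argument.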

We let $\Delta u$ be the change in $u$ as a result of a small change in $x$, $\Delta x$, which we can write as $\Delta u = h(x+\Delta x) - h(x)$.

This change in $u$ has a corresponding change in $y$: $\Delta y = g(u+\Delta u) - g(u)$.

We can now consider the difference quotient $\frac{\Delta y}{\Delta x}$; if $\Delta u \neq 0$, we can multiply the numerator and denominator by $\Delta u$ to get $\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x}$.

We can then take limits as $\Delta x \to 0$ and arrive at the expression for the chain rule. This reasoning is not quite good enough for a proof, since it is quite possible that $\Delta u$ is zero even if $\Delta x \neq 0$. Therefore, to prove the chain rule, we need to be careful with this point. However, reasoning like this demonstrates how natural the formula for the chain rule is.
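The factored difference quotient can also be watched numerically. The sketch below uses the illustrative choice $g(u) = u^3$ and $h(x) = 2x+1$ (an assumption made for this demonstration), for which $\Delta u = 2\Delta x$ is never zero when $\Delta x \neq 0$, so the division by $\Delta u$ is safe.

```python
# Illustrative choice (an assumption for this sketch): g(u) = u^3, h(x) = 2x + 1.

def h(x):
    return 2 * x + 1

def g(u):
    return u**3

def difference_quotients(x, dx):
    """Return the finite quotients (dy/dx, dy/du, du/dx) for a step dx."""
    u = h(x)
    du = h(x + dx) - h(x)      # change in u caused by the step dx
    dy = g(u + du) - g(u)      # corresponding change in y
    return dy / dx, dy / du, du / dx

x = 1.0
for dx in (0.1, 0.01, 0.001):
    total, outer, inner = difference_quotients(x, dx)
    # The product outer * inner matches total, and both approach the
    # derivative h'(1) * g'(h(1)) = 2 * 27 = 54 as dx shrinks.
    print(dx, total, outer * inner)
```

This only illustrates the intuition; it does not address the case $\Delta u = 0$ that a full proof has to handle.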

We will now consider some examples where we apply the chain rule to find the derivatives of different functions.

Example 2: Using the Chain Rule

Determine the derivative of $f(x) = 2\sqrt{2x-1}$.

Answer

The function $f$ is the composition of two functions. We first need to identify the correct choice of inner and outer functions. In this case, the natural choice of inner function is $h(x) = 2x-1$, which gives an outer function of $g(u) = 2\sqrt{u}$. We can now find the derivatives of $g$ and $h$. Using the power rule, the derivative of $h$ is simply $h'(x) = 2$.

Similarly, writing $g(u) = 2u^{\frac{1}{2}}$, we can use the power rule to find the derivative of $g$: $g'(u) = \frac{1}{\sqrt{u}}$.

Substituting these into the formula for the chain rule, $f'(x) = h'(x)g'(h(x))$, we can find the derivative of $f$ as follows: $f'(x) = 2 \cdot \frac{1}{\sqrt{2x-1}} = \frac{2}{\sqrt{2x-1}}$.
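A quick numerical sanity check of this result, assuming the function is $f(x) = 2\sqrt{2x-1}$: the sketch compares the chain rule answer with a central difference approximation. The helper `central_difference` and the step size are choices made for this illustration.

```python
import math

def f(x):
    return 2 * math.sqrt(2 * x - 1)

def f_prime(x):
    # Chain rule result with h(x) = 2x - 1 and g(u) = 2 * sqrt(u).
    return 2 / math.sqrt(2 * x - 1)

def central_difference(func, x, step=1e-6):
    # Symmetric difference quotient; approximates func'(x) for small steps.
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (1.0, 2.5, 5.0):
    print(x, f_prime(x), central_difference(f, x))
```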

As we can see from the last example, one of the key skills in applying the chain rule is identifying the function composition. Often, there is a natural choice; however, sometimes there is more than one possible choice, and in these cases we try to pick the functions that minimize the work we need to do. We will see one example of this later on.

Example 3: Finding Derivatives Using the Chain Rule

Determine the derivative of $y = \left(2x^2-3x+4\right)^{55}$.

Answer

An example like this really demonstrates the importance of the chain rule. It is, of course, possible to expand the parentheses and get a long polynomial expression, but this would require a considerable amount of algebra. Instead, we can apply the chain rule, which is much simpler and less prone to error.

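We begin by identifying the inner and outer functions. We let $u = 2x^2-3x+4$, and then $y = u^{55}$. We now find the derivatives $\frac{\mathrm{d}y}{\mathrm{d}u}$ and $\frac{\mathrm{d}u}{\mathrm{d}x}$. Using the power rule, we have $\frac{\mathrm{d}u}{\mathrm{d}x} = 4x-3$, $\frac{\mathrm{d}y}{\mathrm{d}u} = 55u^{54}$.

Substituting these into the formula for the chain rule, $\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x}$, we can find the derivative of $y$ as follows: $\frac{\mathrm{d}y}{\mathrm{d}x} = 55(4x-3)\left(2x^2-3x+4\right)^{54}$.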
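The chain rule is also what lets a computer differentiate such an expression without expanding it. The sketch below is a minimal forward-mode differentiation with "dual numbers", a technique that is not part of this explainer and is shown only as an illustration; the class `Dual` is hypothetical and supports just the operations needed here. It assumes the function is $y = \left(2x^2-3x+4\right)^{55}$ and uses exact integer arithmetic.

```python
class Dual:
    """A value paired with its derivative with respect to x."""

    def __init__(self, value, deriv=0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(other):
        # Treat plain numbers as constants (derivative 0).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rmul__(self, constant):
        # Constant multiple rule: (c * f)' = c * f'.
        return Dual(constant * self.value, constant * self.deriv)

    def __pow__(self, n):
        # Power rule combined with the chain rule: (u^n)' = n * u^(n-1) * u'.
        return Dual(self.value**n, n * self.value ** (n - 1) * self.deriv)


def y(x):
    return (2 * x**2 - 3 * x + 4) ** 55

def closed_form(x):
    # The derivative obtained by hand with the chain rule.
    return 55 * (4 * x - 3) * (2 * x**2 - 3 * x + 4) ** 54

for x0 in (0, 1, 2):
    result = y(Dual(x0, 1))  # seed with dx/dx = 1
    print(x0, result.deriv == closed_form(x0))
```

Each arithmetic step carries the derivative of the inner expression along, and `__pow__` multiplies by it, which is exactly the factor $\frac{\mathrm{d}u}{\mathrm{d}x}$ in the chain rule.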

Example 4: Finding the Derivative at a Point Using the Chain Rule

Evaluate $\frac{\mathrm{d}y}{\mathrm{d}x}$ at $x = 1$, where $y = \frac{1}{\sqrt{4x^3-1}}$.

Answer

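For a question like this, we have more than one possible choice for our inner and outer functions. We could choose our inner function to be $u = \sqrt{4x^3-1}$, which would result in an outer function of $y = \frac{1}{u}$, or we could choose an inner function of $u = 4x^3-1$, which yields $y = \frac{1}{\sqrt{u}}$ as the outer function. If we choose the first option, we would find that we need to apply the chain rule a second time to find the derivative of $\sqrt{4x^3-1}$; for this reason, the second choice of inner and outer functions is better, since it only requires us to apply the chain rule once. Therefore, we set $u = 4x^3-1$ and $y = \frac{1}{\sqrt{u}} = u^{-\frac{1}{2}}$. Using the power rule, we can find the derivatives $\frac{\mathrm{d}y}{\mathrm{d}u}$ and $\frac{\mathrm{d}u}{\mathrm{d}x}$ as follows: $\frac{\mathrm{d}u}{\mathrm{d}x} = 12x^2$, $\frac{\mathrm{d}y}{\mathrm{d}u} = -\frac{1}{2u^{\frac{3}{2}}}$.

Substituting these into the formula for the chain rule, $\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x}$, we have $\frac{\mathrm{d}y}{\mathrm{d}x} = 12x^2 \cdot \left(-\frac{1}{2\left(4x^3-1\right)^{\frac{3}{2}}}\right) = -\frac{6x^2}{\left(4x^3-1\right)^{\frac{3}{2}}}$.

We can now evaluate this at $x = 1$ as follows: $\left.\frac{\mathrm{d}y}{\mathrm{d}x}\right|_{x=1} = -\frac{6}{(4-1)^{\frac{3}{2}}} = -\frac{6}{3\sqrt{3}} = -\frac{2\sqrt{3}}{3}$.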
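Both choices of inner function lead to the same derivative; the second simply needs fewer steps. The sketch below, which assumes $y = \frac{1}{\sqrt{4x^3-1}}$, codes each decomposition separately and compares them at $x = 1$. The function names are illustrative.

```python
import math

def derivative_single(x):
    # One application: u = 4x^3 - 1, y = u^(-1/2).
    u = 4 * x**3 - 1
    du_dx = 12 * x**2
    dy_du = -0.5 * u ** (-1.5)
    return dy_du * du_dx

def derivative_nested(x):
    # Two applications: w = 4x^3 - 1, u = sqrt(w), y = 1 / u.
    w = 4 * x**3 - 1
    dw_dx = 12 * x**2
    u = math.sqrt(w)
    du_dx = dw_dx / (2 * math.sqrt(w))  # chain rule applied to sqrt(w)
    dy_du = -1 / u**2
    return dy_du * du_dx                # chain rule applied to 1 / u

exact = -2 * math.sqrt(3) / 3
print(derivative_single(1), derivative_nested(1), exact)
```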

Example 5: When to Use the Chain Rule

Find $\frac{\mathrm{d}}{\mathrm{d}x}\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)^2$ at $x = 1$.

Answer

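We can apply the chain rule to a function like this. However, we could also expand the parentheses and then differentiate. We will demonstrate both methods here. We will begin by applying the chain rule, $f'(x) = h'(x)g'(h(x))$, and the power rule to get $\frac{\mathrm{d}}{\mathrm{d}x}\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)^2 = 2\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)\frac{\mathrm{d}}{\mathrm{d}x}\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right) = 2\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)\left(\frac{1}{\sqrt{x}}-\frac{1}{x\sqrt{x}}\right)$.

We can expand the parentheses to get $\frac{\mathrm{d}}{\mathrm{d}x}\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)^2 = 2\left(2-\frac{2}{x}+\frac{2}{x}-\frac{2}{x^2}\right) = 4-\frac{4}{x^2}$.

As we can see, this method involved a lot of algebra and many steps. We will now consider the method of first expanding the parentheses and then finding the derivative. Expanding the square gives $\frac{\mathrm{d}}{\mathrm{d}x}\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)^2 = \frac{\mathrm{d}}{\mathrm{d}x}\left(4x+8+\frac{4}{x}\right)$.

Using the power rule to find the derivative, we have $\frac{\mathrm{d}}{\mathrm{d}x}\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)^2 = 4-\frac{4}{x^2}$.

As we can see, simplifying the expression before taking the derivative was much simpler and cleaner.

We have been asked to evaluate the derivative at $x = 1$. Therefore, substituting this in, we have $\left.\frac{\mathrm{d}}{\mathrm{d}x}\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)^2\right|_{x=1} = 4-\frac{4}{1^2} = 0$.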
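The two routes can be compared directly. The sketch below assumes the function is $\left(2\sqrt{x}+\frac{2}{\sqrt{x}}\right)^2$ and evaluates both the unsimplified chain rule expression and the simplified derivative $4-\frac{4}{x^2}$; the function names are illustrative.

```python
import math

def chain_rule_form(x):
    # 2 * (inner function) * (derivative of inner function), left unsimplified.
    root = math.sqrt(x)
    return 2 * (2 * root + 2 / root) * (1 / root - 1 / (x * root))

def simplified_form(x):
    # Derivative of the expanded expression 4x + 8 + 4/x.
    return 4 - 4 / x**2

for x in (0.5, 1.0, 2.0, 9.0):
    print(x, chain_rule_form(x), simplified_form(x))
```

Both expressions vanish at $x = 1$, in agreement with the answer above.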

Sometimes, we might need to apply the chain rule in situations where we do not have an expression for a particular function, but we have information about the value of the derivative at a given point. The following question is an example of this.

Example 6: Using the Chain Rule with Unknown Functions

Given that $y = \sqrt{f(x)}$, $f'(4) = 2$, and $f(4) = 7$, determine $\frac{\mathrm{d}y}{\mathrm{d}x}$ at $x = 4$.

Answer

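Given that $y = \sqrt{f(x)}$, we can apply the chain rule to find the derivative, where our inner function is $u = f(x)$ and our outer function is $y = \sqrt{u}$. We begin by calculating the derivatives $\frac{\mathrm{d}y}{\mathrm{d}u}$ and $\frac{\mathrm{d}u}{\mathrm{d}x}$ as follows: $\frac{\mathrm{d}y}{\mathrm{d}u} = \frac{1}{2\sqrt{u}}$, $\frac{\mathrm{d}u}{\mathrm{d}x} = f'(x)$.

We can now substitute these expressions into the chain rule as follows: $\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x} = \frac{1}{2\sqrt{f(x)}} \cdot f'(x) = \frac{f'(x)}{2\sqrt{f(x)}}$.

To evaluate this at $x = 4$, we have $\left.\frac{\mathrm{d}y}{\mathrm{d}x}\right|_{x=4} = \frac{f'(4)}{2\sqrt{f(4)}}$.

Substituting in $f'(4) = 2$ and $f(4) = 7$, we get $\left.\frac{\mathrm{d}y}{\mathrm{d}x}\right|_{x=4} = \frac{2}{2\sqrt{7}} = \frac{\sqrt{7}}{7}$.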
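The answer depends only on the two given values, not on the formula for $f$. To illustrate this, the sketch below picks two hypothetical functions that both satisfy $f(4) = 7$ and $f'(4) = 2$ (they are assumptions made for the demonstration, not part of the question) and differentiates $y = \sqrt{f(x)}$ numerically for each.

```python
import math

def dy_dx_at_point(f_value, f_prime_value):
    # Chain rule result for y = sqrt(f(x)): f'(x) / (2 * sqrt(f(x))).
    return f_prime_value / (2 * math.sqrt(f_value))

def central_difference(func, x, step=1e-6):
    # Symmetric difference quotient; approximates func'(x) for small steps.
    return (func(x + step) - func(x - step)) / (2 * step)

# Two hypothetical choices of f, both with f(4) = 7 and f'(4) = 2.
candidates = [
    lambda x: 2 * x - 1,
    lambda x: x**2 / 4 + 3,
]

expected = dy_dx_at_point(7, 2)  # equals sqrt(7) / 7
for f in candidates:
    numeric = central_difference(lambda x: math.sqrt(f(x)), 4.0)
    print(expected, numeric)
```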

In our final example, we will consider a function that is the composition of multiple functions.

Example 7: Applying the Chain Rule Multiple Times

Find the derivative of the function $y = \sqrt{x+\sqrt{x+\sqrt{x}}}$.

Answer

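The first thing we need to do is identify our outer and inner functions. We set $u = x+\sqrt{x+\sqrt{x}}$; then, $y = \sqrt{u}$. We can now find the derivatives of each of these parts and apply the chain rule. We begin by finding the derivative $\frac{\mathrm{d}y}{\mathrm{d}u}$ as follows: $\frac{\mathrm{d}y}{\mathrm{d}u} = \frac{1}{2\sqrt{u}}$.

We now need to find the derivative of $u$ with respect to $x$. The first term is easy to differentiate, but the second term is a composition of functions. Hence, to find the derivative of this term, we will need to apply the chain rule. We begin by writing $\frac{\mathrm{d}u}{\mathrm{d}x} = 1+\frac{\mathrm{d}}{\mathrm{d}x}\sqrt{x+\sqrt{x}}$.

We let $z = \sqrt{x+\sqrt{x}}$; then, we can set our inner function as $v = x+\sqrt{x}$, which results in an outer function of $z = \sqrt{v}$. The definition of $z$ in terms of $v$ has exactly the same form as the definition of $y$ in terms of $u$. Hence, its derivative will be $\frac{\mathrm{d}z}{\mathrm{d}v} = \frac{1}{2\sqrt{v}}$.

We can now find the derivative of $v$ with respect to $x$, which we can easily do using the power rule as follows: $\frac{\mathrm{d}v}{\mathrm{d}x} = 1+\frac{1}{2\sqrt{x}} = \frac{2\sqrt{x}+1}{2\sqrt{x}}$.

We can now apply the chain rule to $z$ to get $\frac{\mathrm{d}z}{\mathrm{d}x} = \frac{\mathrm{d}z}{\mathrm{d}v} \cdot \frac{\mathrm{d}v}{\mathrm{d}x} = \frac{1}{2\sqrt{x+\sqrt{x}}} \cdot \frac{2\sqrt{x}+1}{2\sqrt{x}} = \frac{2\sqrt{x}+1}{4\sqrt{x}\sqrt{x+\sqrt{x}}}$.

Substituting this back into the expression for $\frac{\mathrm{d}u}{\mathrm{d}x}$, we have $\frac{\mathrm{d}u}{\mathrm{d}x} = 1+\frac{2\sqrt{x}+1}{4\sqrt{x}\sqrt{x+\sqrt{x}}} = \frac{4\sqrt{x}\sqrt{x+\sqrt{x}}+2\sqrt{x}+1}{4\sqrt{x}\sqrt{x+\sqrt{x}}}$.

We can now apply the chain rule to $y$ as follows: $\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x} = \frac{1}{2\sqrt{x+\sqrt{x+\sqrt{x}}}} \cdot \frac{4\sqrt{x}\sqrt{x+\sqrt{x}}+2\sqrt{x}+1}{4\sqrt{x}\sqrt{x+\sqrt{x}}} = \frac{4\sqrt{x}\sqrt{x+\sqrt{x}}+2\sqrt{x}+1}{8\sqrt{x}\sqrt{x+\sqrt{x}}\sqrt{x+\sqrt{x+\sqrt{x}}}}$.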

When we are applying the chain rule multiple times, we should work from the outside in, as if we are peeling the layers off an onion. Hence, we should differentiate the outermost function first and then deal with the inner function, which might require a fresh application of the chain rule.
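The layered structure of Example 7 can be mirrored in code. The sketch below assumes the function is $y = \sqrt{x+\sqrt{x+\sqrt{x}}}$. Unlike the outside-in derivation by hand, it evaluates numerically from the innermost layer outward, applying the chain rule once per square root; the helper `sqrt_layer` is a hypothetical name introduced for this illustration.

```python
import math

def sqrt_layer(inner_value, inner_deriv):
    """Apply sqrt to an inner (value, derivative) pair using the chain rule."""
    value = math.sqrt(inner_value)
    # (sqrt(u))' = u' / (2 * sqrt(u))
    return value, inner_deriv / (2 * value)

def y_and_derivative(x):
    root, d_root = sqrt_layer(x, 1.0)            # sqrt(x)
    z, dz = sqrt_layer(x + root, 1.0 + d_root)   # sqrt(x + sqrt(x))
    return sqrt_layer(x + z, 1.0 + dz)           # sqrt(x + sqrt(x + sqrt(x)))

def closed_form(x):
    # The final expression for dy/dx obtained by hand.
    r = math.sqrt(x)
    s = math.sqrt(x + r)
    t = math.sqrt(x + s)
    return (4 * r * s + 2 * r + 1) / (8 * r * s * t)

for x in (0.5, 1.0, 4.0):
    value, deriv = y_and_derivative(x)
    print(x, deriv, closed_form(x))
```

Each call to `sqrt_layer` corresponds to one layer of the onion, so a deeper nesting would only need more calls of the same helper.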

Key Points

  • The chain rule states that, given a function $h$ that is differentiable at $x$ and a function $g$ that is differentiable at $h(x)$, their composition $f = g \circ h$ defined by $f(x) = g(h(x))$ is differentiable at $x$ and its derivative $f'$ is given by $f'(x) = h'(x)g'(h(x))$. We can write this in Leibniz notation as $\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x}$, where $y = g(u)$ and $u = h(x)$.
  • Given the composition of simple algebraic functions, it is sometimes best to simplify rather than apply the chain rule.
  • The chain rule can be applied to the composition of multiple functions.
