Lesson Explainer: The Chain Rule | Nagwa

Lesson Explainer: The Chain Rule Mathematics

In this explainer, we will learn how to find the derivatives of composite functions using the chain rule.

Once we have learned how to differentiate simple functions, we might start to wonder how we can differentiate more complex functions. Generally, more complex functions are created from simpler ones by combining them together in various ways. There are a few basic ways to combine two functions 𝑓(𝑥) and 𝑔(𝑥):

  1. addition or subtraction: $f(x) \pm g(x)$;
  2. multiplication or division: $f(x)g(x)$ or $\frac{f(x)}{g(x)}$;
  3. composition: $f(g(x))$.

To be able to differentiate more complex functions, it would be very helpful to have rules that tell us how to differentiate functions combined in these particular ways. At this point in a calculus course, we already know that the derivative of a sum or difference is the sum or difference of the derivatives: $(f \pm g)' = f' \pm g'$.

Furthermore, we know that differentiation is actually a linear operation. This means that, in addition to the sum rule, we have the following rule for multiplication by a constant $c$: $(cf)' = cf'$.

There are also rules for multiplication and division known as the product and quotient rules. However, in this explainer, we will focus on the rule for differentiating composite functions.

Let’s begin by considering an example where we differentiate a composite function by first simplifying the composite expression and then applying the power rule to the resulting expression. This example will lead us to the general formula for differentiating composite functions.

Example 1: Derivatives of Composite Functions

Consider the function $f(x) = (2x+1)^3$.

  1. By expanding the binomial, find the derivative of 𝑓.
  2. Let $g(x) = x^3$ and $h(x) = 2x+1$. Find the derivatives of $g$ and $h$.
  3. Express $f'$ in terms of $h$, $g'$, and $h'$.

Answer

Part 1

We begin by expanding the parentheses. We can do this using the binomial theorem or by simply multiplying out the parentheses. Below, we simply multiply out the parentheses:

$$f(x) = (2x+1)^3 = (2x+1)\left(4x^2+4x+1\right) = 8x^3+8x^2+2x+4x^2+4x+1 = 8x^3+12x^2+6x+1.$$

We can now use the power rule, $\frac{d}{dx}\left(x^n\right) = nx^{n-1}$, to differentiate each term as follows: $f'(x) = 3(8)x^2 + 2(12)x + 6 = 24x^2 + 24x + 6$.

Part 2

Using the power rule, we can easily find the derivatives of $g$ and $h$ as follows: $g'(x) = 3x^2$, $h'(x) = 2$.

Part 3

We would like to find an expression for $f'$ in terms of $h$, $g'$, and $h'$. To do this, we begin by factoring the expression we found for $f'$ in part 1. We start by factoring out the common factor of 6 from the expression: $f'(x) = 6\left(4x^2+4x+1\right)$.

We can now factor the expression in parentheses as follows: $f'(x) = 6(2x+1)^2$.

Since $h(x) = 2x+1$, we can rewrite this as $f'(x) = 6(h(x))^2$.

Furthermore, we know that $g'(x) = 3x^2$; therefore, $f'(x) = 2g'(h(x))$.

Finally, since $h'(x) = 2$, we have $f'(x) = h'(x)g'(h(x))$.
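As a quick sanity check, the relationship found in Part 3 can be tested on a few inputs. The following is a minimal Python sketch and not part of the original explainer; the function names `f_prime_expanded` and `f_prime_chain` are illustrative choices.

```python
def h(x):
    # Inner function from Example 1.
    return 2 * x + 1

def h_prime(x):
    # Derivative of the inner function.
    return 2

def g_prime(u):
    # Derivative of the outer function g(u) = u^3.
    return 3 * u ** 2

def f_prime_expanded(x):
    # Derivative obtained in Part 1 by expanding the binomial.
    return 24 * x ** 2 + 24 * x + 6

def f_prime_chain(x):
    # Derivative written in the form found in Part 3.
    return h_prime(x) * g_prime(h(x))

# With integer inputs the arithmetic is exact, so the two forms must agree.
for x in range(-3, 4):
    assert f_prime_expanded(x) == f_prime_chain(x)
```

Agreement on a handful of points is not a proof, but it is a cheap way to catch algebra slips in the factoring step.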

In the previous example, we had a function $f$ defined as the composition of two functions $g$ and $h$; that is, $f(x) = g(h(x))$.

We found that the derivative of this function was given by $f'(x) = h'(x)g'(h(x))$.

Although we considered this for two specific functions, the rule itself generalizes to any composition of differentiable functions; this result is known as the chain rule.

Rule: The Chain Rule

Given a function $h(x)$ that is differentiable at $x = x_0$ and a function $g(u)$ that is differentiable at $h(x_0)$, their composition $f = g \circ h$ defined by $f(x) = g(h(x))$ is differentiable at $x_0$ and its derivative $f'(x_0)$ is given by $f'(x_0) = h'(x_0)g'(h(x_0))$.

We can write this in Leibniz notation as $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$, where $y = g(u)$ and $u = h(x)$.

The nice thing about using Leibniz notation is that it makes the chain rule very intuitive since the fractional notations on the right-hand side of the equation formally simplify to the expression on the left-hand side.

We let $\Delta u$ be the change in $u$ as a result of a small change in $x$, $\Delta x$, which we can write as $\Delta u = h(x + \Delta x) - h(x)$.

This change in $u$ has a corresponding change in $y$: $\Delta y = g(u + \Delta u) - g(u)$.

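We can now consider the difference quotient $\frac{\Delta y}{\Delta x}$; if $\Delta u \neq 0$, we can multiply the numerator and denominator by $\Delta u$ to get $\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x}$.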

We can then take limits as $\Delta x \to 0$ and arrive at the expression for the chain rule. This reasoning is not quite good enough for a proof since it is quite possible that $\Delta u$ is zero even if $\Delta x \neq 0$, in which case the step of multiplying and dividing by $\Delta u$ is not valid. Therefore, to prove the chain rule, we need to be careful with this point. However, reasoning like this demonstrates how natural and intuitive the formula for the chain rule is.
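The factorization of the difference quotient can be observed numerically. The sketch below is an illustration under stated assumptions: it reuses $g(u) = u^3$ and $h(x) = 2x + 1$ from Example 1, evaluates at $x = 1$, and the helper name `difference_quotients` is hypothetical. For this choice of $h$, $\Delta u$ is never zero when $\Delta x \neq 0$, so the problematic case mentioned above does not arise.

```python
def h(x):
    # Inner function u = h(x), taken from Example 1.
    return 2 * x + 1

def g(u):
    # Outer function y = g(u), taken from Example 1.
    return u ** 3

def difference_quotients(x, delta_x):
    """Return (delta_y / delta_u, delta_u / delta_x, delta_y / delta_x)."""
    u = h(x)
    delta_u = h(x + delta_x) - h(x)
    delta_y = g(u + delta_u) - g(u)
    return delta_y / delta_u, delta_u / delta_x, delta_y / delta_x

# As delta_x shrinks, both columns approach f'(1) = 24 + 24 + 6 = 54.
for delta_x in (1e-1, 1e-2, 1e-3, 1e-4):
    dy_du, du_dx, dy_dx = difference_quotients(1.0, delta_x)
    print(delta_x, dy_du * du_dx, dy_dx)
```

The two printed columns agree up to floating-point rounding for every step size, which is the finite version of the identity that becomes the chain rule in the limit.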

Let’s consider an example where we apply the chain rule using the Leibniz notation.

Example 2: Finding Derivatives Using the Chain Rule

Determine the derivative of $y = \left(2x^2 - 3x + 4\right)^{55}$.

Answer

An example like this really demonstrates the importance of the chain rule. It is of course possible to expand the parentheses and get a long polynomial expression. However, clearly, this would be a considerable amount of algebra. Instead, we can apply the chain rule which will prove much simpler and less prone to error.

We begin by identifying the inner and outer functions. We let $u = 2x^2 - 3x + 4$, and then $y = u^{55}$. We now find the derivatives $\frac{dy}{du}$ and $\frac{du}{dx}$. Using the power rule, we have $\frac{du}{dx} = 4x - 3$ and $\frac{dy}{du} = 55u^{54}$.

Substituting $u = 2x^2 - 3x + 4$ into the expression for $\frac{dy}{du}$, we obtain $\frac{dy}{du} = 55\left(2x^2 - 3x + 4\right)^{54}$.

Substituting these into the formula for the chain rule, $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$, we can find the derivative of $y$ as follows: $\frac{dy}{dx} = 55\left(2x^2 - 3x + 4\right)^{54}(4x - 3)$.
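One way to gain confidence in a result like this without expanding a degree-110 polynomial is to compare it with a numerical estimate of the slope. The following Python sketch assumes a central-difference approximation with a step of `1e-6`; the helper `central_difference` and the sample points are illustrative choices. Because the values are very large, the comparison uses a relative tolerance.

```python
import math

def y(x):
    # The function from Example 2.
    return (2 * x ** 2 - 3 * x + 4) ** 55

def dy_dx(x):
    # Chain rule result: (dy/du) * (du/dx) with u = 2x^2 - 3x + 4.
    return 55 * (2 * x ** 2 - 3 * x + 4) ** 54 * (4 * x - 3)

def central_difference(func, x, step=1e-6):
    # Symmetric difference quotient as a numerical slope estimate.
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (0.0, 0.5, 1.0):
    assert math.isclose(dy_dx(x), central_difference(y, x), rel_tol=1e-6)
```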

As we can see from the last example, one of the key skills in applying the chain rule is identifying the function composition.

Let us consider another example where we apply the chain rule. In this example, we will use the prime notation rather than the Leibniz notation for the chain rule.

Example 3: Using the Chain Rule

Determine the derivative of $f(x) = 2\sqrt{2x - 1}$.

Answer

The function $f$ is the composition of two functions. We first need to identify the correct choice of inner and outer functions. In this case, the natural choice of inner function is $h(x) = 2x - 1$, which gives an outer function of $g(u) = 2\sqrt{u}$. We can now find the derivatives of $g$ and $h$. Using the power rule, the derivative of $h$ is simply $h'(x) = 2$.

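Similarly, we can use the power rule to find the derivative of $g$: $g'(u) = \frac{d}{du}\left(2\sqrt{u}\right) = \frac{d}{du}\left(2u^{\frac{1}{2}}\right) = 2 \times \frac{1}{2}u^{-\frac{1}{2}} = \frac{1}{\sqrt{u}}$.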

Substituting these into the formula for the chain rule, $f'(x) = h'(x)g'(h(x))$, we can find the derivative of $f$ as follows: $f'(x) = 2 \cdot \frac{1}{\sqrt{2x - 1}} = \frac{2}{\sqrt{2x - 1}}$.
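The prime-notation form of the rule translates almost directly into code. The sketch below is a minimal illustration, assuming sample points with $x > \frac{1}{2}$ so that the square root is defined and the derivative exists; `central_difference` is a hypothetical helper used only as a numerical cross-check.

```python
import math

def h(x):
    # Inner function.
    return 2 * x - 1

def h_prime(x):
    return 2

def g(u):
    # Outer function.
    return 2 * math.sqrt(u)

def g_prime(u):
    return 1 / math.sqrt(u)

def f(x):
    # The composition f(x) = g(h(x)).
    return g(h(x))

def f_prime(x):
    # Chain rule in prime notation: f'(x) = h'(x) * g'(h(x)).
    return h_prime(x) * g_prime(h(x))

def central_difference(func, x, step=1e-6):
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (1.0, 2.5, 5.0):
    assert math.isclose(f_prime(x), central_difference(f, x), rel_tol=1e-6)
```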

In the previous example, there was one obvious choice of inner and outer functions for the chain rule. Often, there is a natural choice; however, sometimes we will find that there is more than one possible choice. In these cases, we try to pick the functions that minimize the work we need to do. Let us consider an example where we need to consider our choice of the inner and outer functions carefully.

Example 4: Finding the Derivative at a Point Using the Chain Rule

Evaluate $\frac{dy}{dx}$ at $x = 1$, where $y = \frac{1}{\sqrt{4x^3 - 1}}$.

Answer

For a question like this, we have more than one possible choice for our inner and outer functions. We could choose our inner function to be $u = \sqrt{4x^3 - 1}$, which would result in an outer function of $y = \frac{1}{u}$, or we could choose an inner function of $u = 4x^3 - 1$, which yields $y = \frac{1}{\sqrt{u}}$ as the outer function. If we choose the first option, we would need to apply the chain rule a second time to find the derivative of $\sqrt{4x^3 - 1}$; for this reason, the second choice of inner and outer functions is better since it only requires us to apply the chain rule once. Therefore, we set $u = 4x^3 - 1$ and $y = \frac{1}{\sqrt{u}}$. Using the power rule, we can find the derivatives $\frac{dy}{du}$ and $\frac{du}{dx}$ as follows: $\frac{du}{dx} = 12x^2$ and $\frac{dy}{du} = \frac{d}{du}\left(\frac{1}{\sqrt{u}}\right) = \frac{d}{du}\left(u^{-\frac{1}{2}}\right) = -\frac{1}{2}u^{-\frac{3}{2}} = -\frac{1}{2u^{\frac{3}{2}}}$.

Replacing $u$ with $4x^3 - 1$ in the expression for $\frac{dy}{du}$, we obtain $\frac{dy}{du} = -\frac{1}{2\left(4x^3 - 1\right)^{\frac{3}{2}}}$.

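Substituting these into the formula for the chain rule, $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$, we have $\frac{dy}{dx} = -\frac{1}{2\left(4x^3 - 1\right)^{\frac{3}{2}}} \cdot 12x^2 = -\frac{6x^2}{\left(4x^3 - 1\right)^{\frac{3}{2}}}$.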

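We can now evaluate this at $x = 1$ as follows: $\left.\frac{dy}{dx}\right|_{x=1} = -\frac{6}{(4 - 1)^{\frac{3}{2}}} = -\frac{6}{3\sqrt{3}} = -\frac{2\sqrt{3}}{3}$.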
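Both choices of inner and outer functions must lead to the same derivative; the second is simply less work. The following Python sketch illustrates this under the assumption that the first choice is handled by applying the chain rule twice, with an extra intermediate variable `w` introduced here for that purpose. The function names are hypothetical.

```python
import math

def derivative_second_choice(x):
    # Inner u = 4x^3 - 1, outer y = u^(-1/2): one application of the chain rule.
    u = 4 * x ** 3 - 1
    dy_du = -0.5 * u ** -1.5
    du_dx = 12 * x ** 2
    return dy_du * du_dx

def derivative_first_choice(x):
    # Inner u = sqrt(4x^3 - 1), outer y = 1/u.
    # Finding du/dx needs the chain rule again, with w = 4x^3 - 1.
    w = 4 * x ** 3 - 1
    u = math.sqrt(w)
    dy_du = -1 / u ** 2
    du_dw = 1 / (2 * math.sqrt(w))
    dw_dx = 12 * x ** 2
    return dy_du * du_dw * dw_dx

# Value obtained in the worked example at x = 1.
expected = -2 * math.sqrt(3) / 3
assert math.isclose(derivative_second_choice(1), expected)
assert math.isclose(derivative_first_choice(1), expected)
```

The first choice carries three factors instead of two, which mirrors the extra step it would require by hand.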

Sometimes, we might need to apply the chain rule in situations where we do not have an expression for a particular function, but we have information about the value of the derivative at a given point. The following question is an example of this.

Example 5: Using the Chain Rule with Unknown Functions

Given that $y = \sqrt{f(x)}$, $f'(4) = 2$, and $f(4) = 7$, determine $\frac{dy}{dx}$ at $x = 4$.

Answer

Given that $y = \sqrt{f(x)}$, we can apply the chain rule to find the derivative, where our inner function is $u = f(x)$ and our outer function is $y = \sqrt{u}$. We begin by calculating the derivatives $\frac{dy}{du}$ and $\frac{du}{dx}$ as follows: $\frac{dy}{du} = \frac{1}{2\sqrt{u}}$ and $\frac{du}{dx} = f'(x)$.

We can now substitute these expressions into the chain rule as follows: $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = \frac{1}{2\sqrt{f(x)}} \cdot f'(x) = \frac{f'(x)}{2\sqrt{f(x)}}$.

To evaluate this at $x = 4$, we have $\left.\frac{dy}{dx}\right|_{x=4} = \frac{f'(4)}{2\sqrt{f(4)}}$.

Substituting in $f'(4) = 2$ and $f(4) = 7$, we get $\left.\frac{dy}{dx}\right|_{x=4} = \frac{2}{2\sqrt{7}} = \frac{\sqrt{7}}{7}$.
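The answer depends only on the two given values, not on the rest of $f$. To illustrate, the Python sketch below invents two different functions, `linear` and `quadratic`, that both satisfy $f(4) = 7$ and $f'(4) = 2$. These are hypothetical stand-ins chosen for this check; the question itself does not specify $f$.

```python
import math

def sqrt_composite_slope(f_value, f_slope):
    # dy/dx for y = sqrt(f(x)), given f(x) and f'(x) at the point of interest.
    return f_slope / (2 * math.sqrt(f_value))

def central_difference(func, x, step=1e-6):
    return (func(x + step) - func(x - step)) / (2 * step)

def linear(x):
    # Hypothetical f with f(4) = 7 and f'(4) = 2.
    return 2 * x - 1

def quadratic(x):
    # A different hypothetical f, also with f(4) = 7 and f'(4) = 2.
    return x ** 2 / 4 + 3

expected = sqrt_composite_slope(7, 2)
for f in (linear, quadratic):
    numeric = central_difference(lambda x: math.sqrt(f(x)), 4.0)
    assert math.isclose(numeric, expected, rel_tol=1e-6)
```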

In our final example, we will consider a function that is the composition of multiple functions.

Example 6: Applying the Chain Rule Multiple Times

Find the derivative of the function $y = \sqrt{x + \sqrt{x + \sqrt{x}}}$.

Answer

The first thing we need to do is identify our outer and inner functions. We set $u = x + \sqrt{x + \sqrt{x}}$; then, $y = \sqrt{u}$. We can now find the derivatives of each of these parts and apply the chain rule. We begin by finding the derivative $\frac{dy}{du}$ as follows: $\frac{dy}{du} = \frac{1}{2\sqrt{u}}$.

We now need to find the derivative of $u$ with respect to $x$. The first term is easy to differentiate, but the second term is a composition of functions. Hence, to find the derivative of this term, we will need to apply the chain rule. We begin by writing $\frac{du}{dx} = 1 + \frac{d}{dx}\sqrt{x + \sqrt{x}}$.

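We let $z = \sqrt{x + \sqrt{x}}$; then, we can set our inner function as $v = x + \sqrt{x}$, which results in an outer function of $z = \sqrt{v}$. The definition of $z$ in terms of $v$ has exactly the same form as the definition of $y$ in terms of $u$. Hence, its derivative will be $\frac{dz}{dv} = \frac{1}{2\sqrt{v}}$.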

We can now find the derivative of $v$ with respect to $x$, which we can easily do using the power rule as follows: $\frac{dv}{dx} = 1 + \frac{1}{2\sqrt{x}} = \frac{2\sqrt{x} + 1}{2\sqrt{x}}$.

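We can now apply the chain rule to $z$ to get

$$\frac{dz}{dx} = \frac{dz}{dv} \cdot \frac{dv}{dx} = \frac{1}{2\sqrt{x + \sqrt{x}}} \cdot \frac{2\sqrt{x} + 1}{2\sqrt{x}} = \frac{2\sqrt{x} + 1}{4\sqrt{x}\sqrt{x + \sqrt{x}}}.$$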

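Substituting this back into the expression for $\frac{du}{dx}$, we have

$$\frac{du}{dx} = 1 + \frac{2\sqrt{x} + 1}{4\sqrt{x}\sqrt{x + \sqrt{x}}} = \frac{4\sqrt{x}\sqrt{x + \sqrt{x}} + 2\sqrt{x} + 1}{4\sqrt{x}\sqrt{x + \sqrt{x}}}.$$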

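We can now apply the chain rule to $y$ as follows:

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = \frac{1}{2\sqrt{x + \sqrt{x + \sqrt{x}}}} \cdot \frac{4\sqrt{x}\sqrt{x + \sqrt{x}} + 2\sqrt{x} + 1}{4\sqrt{x}\sqrt{x + \sqrt{x}}} = \frac{4\sqrt{x}\sqrt{x + \sqrt{x}} + 2\sqrt{x} + 1}{8\sqrt{x}\sqrt{x + \sqrt{x}}\sqrt{x + \sqrt{x + \sqrt{x}}}}.$$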
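With this many nested radicals, it is easy to drop a factor. The Python sketch below rebuilds the derivative one layer at a time, following the same intermediate quantities $v$, $z$, and $u$ used in the answer, and compares it with the final closed form and with a numerical estimate. It assumes $x > 0$; the names `dy_dx_stepwise`, `dy_dx_closed_form`, and `central_difference` are illustrative.

```python
import math

def y(x):
    return math.sqrt(x + math.sqrt(x + math.sqrt(x)))

def dy_dx_stepwise(x):
    # Innermost layer: v = x + sqrt(x).
    root_x = math.sqrt(x)
    v = x + root_x
    dv_dx = 1 + 1 / (2 * root_x)
    # Middle layer: z = sqrt(v), so dz/dx = (dz/dv) * (dv/dx).
    root_v = math.sqrt(v)
    dz_dx = dv_dx / (2 * root_v)
    # Outer layer: u = x + z and y = sqrt(u).
    u = x + root_v
    du_dx = 1 + dz_dx
    return du_dx / (2 * math.sqrt(u))

def dy_dx_closed_form(x):
    # Final single-fraction expression from the worked example.
    root_x = math.sqrt(x)
    root_v = math.sqrt(x + root_x)
    root_u = math.sqrt(x + root_v)
    numerator = 4 * root_x * root_v + 2 * root_x + 1
    return numerator / (8 * root_x * root_v * root_u)

def central_difference(func, x, step=1e-6):
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (1.0, 4.0, 9.0):
    assert math.isclose(dy_dx_stepwise(x), dy_dx_closed_form(x))
    assert math.isclose(dy_dx_stepwise(x), central_difference(y, x), rel_tol=1e-6)
```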

When we are applying the chain rule multiple times, we should take a top-down approach, as if we are peeling the layers off an onion. Hence, we should find the outermost function and then deal with the inner function, which might require a fresh application of the chain rule.

Let’s recap a few important points from this explainer.

Key Points

  • The chain rule states that, given a function $h$ that is differentiable at $x$ and a function $g$ that is differentiable at $h(x)$, their composition $f = g \circ h$ defined by $f(x) = g(h(x))$ is differentiable at $x$ and its derivative $f'$ is given by $f'(x) = h'(x)g'(h(x))$. We can write this in Leibniz notation as $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$, where $y = g(u)$ and $u = h(x)$.
  • Sometimes we will find that there is more than one possible choice of inner and outer functions. In these cases, we try to pick the functions that minimize the work we need to do.
  • When we differentiate the composition of three or more functions, we need to apply the chain rule multiple times. We begin with the outermost function, and the derivative of the inner function will require an additional application of the chain rule.
