Let me share with you something I found particularly weird when I was a student first
learning calculus. Let’s say that you have a circle with radius five centered at the origin of the
𝑥𝑦-plane. This is something defined with the equation 𝑥 squared plus 𝑦 squared equals five
squared. That is, all of the points on this circle are a distance five from the origin, as
encapsulated by the Pythagorean theorem. Where the sum of the squares of the two legs on this triangle equal the square of the
hypotenuse, five squared. And suppose that you wanna find the slope of a tangent line to this circle, maybe at
the point 𝑥, 𝑦 equals three, four.
Now, if you’re savvy with geometry, you might already know that this tangent line is
perpendicular to the radius touching it at that point. But let’s say you don’t already know that, or maybe you want a technique that
generalizes to curves other than just circles. As with other problems about the slopes of tangent lines to curves, the key thought
here is to zoom in close enough that the curve basically looks just like its own
tangent line. And then ask about a tiny step along that curve. The 𝑦-component of that little step is what you might call d𝑦. And the 𝑥-component is a little d𝑥. So the slope that we want is the rise over run, d𝑦 divided by d𝑥.
But unlike other tangent-slope problems in calculus, this curve is not the graph of a
function. So we can’t just take a simple derivative, asking about the size of some tiny nudge
to the output of a function caused by some tiny nudge to the input. 𝑥 is not an input and 𝑦 is not an output. They’re both just interdependent values related by some equation.
This is what’s called an implicit curve. It’s just the set of all points 𝑥, 𝑦 that satisfy some property written in terms of
the two variables 𝑥 and 𝑦. The procedure for how you actually find d𝑦 d𝑥 for curves like this is the thing
that I found very weird as a calculus student. You take the derivative of both sides, like this. For 𝑥 squared, you write two 𝑥 times d𝑥. And similarly, 𝑦 squared becomes two 𝑦 times d𝑦. And then, the derivative of that constant, five squared, on the right is just
zero. Now, you can see why this feels a little strange, right? What does it mean to take the derivative of an expression that has multiple variables
in it? And why is it that we’re tacking on the little d𝑦 and the little d𝑥 in this
But, if you just blindly move forward with what you get, you can rearrange this
equation and find an expression for d𝑦 divided by d𝑥. Which, in this case, comes out to be negative 𝑥 divided by 𝑦. So at the point with coordinates 𝑥, 𝑦 equals three, four, that slope would be
negative three divided by four, evidently. This strange process is called implicit differentiation. And don’t worry, I have an explanation for how you can interpret taking a derivative
of an expression with two variables like this. But first, I wanna set aside this particular problem and show how it’s connected to a
different type of calculus problem, something called a related-rates problem.
Imagine a five-meter-long ladder held up against a wall. Where the top of the ladder starts four meters above the ground, which, by the
Pythagorean theorem, means that the bottom is three meters away from the wall. And let’s say it’s slipping down in such a way that the top of the ladder is dropping
at a rate of one meter per second. The question is, in that initial moment, what’s the rate at which the bottom of the
ladder is moving away from the wall? It’s interesting, right? That distance from the bottom of the ladder to the wall is 100 percent determined by
the distance from the top of the ladder to the floor. So we should have enough information to figure out how the rates of change for each
of those values actually depend on each other. But it might not be entirely clear how exactly you relate those two.
First things first, it’s always nice to give names to the quantities that we care
about. So let’s label that distance from the top of the ladder to the ground 𝑦 of 𝑡,
written as a function of time cause it’s changing. Likewise, label the distance between the bottom of the ladder and the wall 𝑥 of
𝑡. The key equation that relates these terms is the Pythagorean theorem, 𝑥 of 𝑡
squared plus 𝑦 of 𝑡 squared equals five squared. What makes that a powerful equation to use is that it’s true at all points of
time. Now, one way that you could solve this would be to isolate 𝑥 of 𝑡. And then you figure out what 𝑦 of 𝑡 has to be based on that one-meter-per-second
drop rate. And you could take the derivative of the resulting function, d𝑥 d𝑡, the rate at
which 𝑥 is changing with respect to time.
And that’s fine; it involves a couple layers of using the chain rule. And it’ll definitely work for you. But I wanna show a different way that you can think about the same problem. This left-hand side of the equation is a function of time, right? It just so happens to equal a constant, meaning the value evidently doesn’t change
while time passes. But it’s still written as an expression dependent on time, which means we can
manipulate it like any other function that has 𝑡 as an input. In particular, we can take a derivative of this left-hand side. Which is a way of saying, “If I let a little bit of time pass, some small d𝑡, which
causes 𝑦 to slightly decrease and 𝑥 to slightly increase, how much does this
On the one hand, we know that that derivative should be zero, since the expression is
a constant. And constants don’t care about your tiny nudges in time. They just remain unchanged. But on the other hand, what do you get when you compute this derivative? Well, the derivative of 𝑥 of 𝑡 squared is two times 𝑥 of 𝑡 times the derivative
of 𝑥. That’s the chain rule that I talked about last video. Two 𝑥 d𝑥 represents the size of a change to 𝑥 squared caused by some change to 𝑥,
and then we’re dividing out by d𝑡. Likewise, the rate at which 𝑦 of 𝑡 squared is changing is two times 𝑦 of 𝑡 times
the derivative of 𝑦.
Now evidently, this whole expression must be zero. And that’s an equivalent way of saying that 𝑥 squared plus 𝑦 squared must not
change while the ladder moves. At the very start, time 𝑡 equals zero, the height, 𝑦 of 𝑡, is four meters, and
that distance, 𝑥 of 𝑡, is three meters. And since the top of the ladder is dropping at a rate of one meter per second, that
derivative, d𝑦 d𝑡, is negative one meters per second. Now this gives us enough information to isolate the derivative, d𝑥 d𝑡. And when you work it out, it comes out to be four-thirds meters per second.
The reason I bring up this ladder problem is that I want you to compare it to the
problem of finding the slope of a tangent line to the circle. In both cases, we had the equation 𝑥 squared plus 𝑦 squared equals five
squared. And in both cases, we ended up taking the derivative of each side of this
expression. But for the ladder question, these expressions were functions of time. So taking the derivative has a clear meaning. It’s the rate at which the expression changes as time changes. But what makes the circle situation strange is that rather than saying that small
amount of time, d𝑡, has passed, which causes 𝑥 and 𝑦 to change. The derivative just has these tiny nudges, d𝑥 and d𝑦, just floating free, not tied
to some other common variable, like time. Let me show you a nice way to think about this.
Let’s give this expression, 𝑥 squared plus 𝑦 squared, a name, maybe 𝑆. 𝑆 is essentially a function of two variables. It takes every point 𝑥, 𝑦 on the plane and associates it with a number. For points on this circle, that number happens to be 25. If you step off the circle away from the center, that value would be bigger. For other points 𝑥, 𝑦 closer to the origin, that value would be smaller. Now what it means to take a derivative of this expression, a derivative of 𝑆, is to
consider a tiny change to both of these variables. Some tiny change, d𝑥, to 𝑥 and some tiny change, d𝑦, to 𝑦. And not necessarily one that keeps you on the circle, by the way. It’s just any tiny step in any direction of the 𝑥𝑦-plane. And from there you ask, how much does the value of 𝑆 change? And that difference, the difference in the value of 𝑆 before the nudge and after the
nudge, is what I’m writing as d𝑆.
For example, in this picture, we’re starting off at a point where 𝑥 equals three and
where 𝑦 equals four. And let’s just say that that step I drew has d𝑥 at negative 0.02 and d𝑦 at negative
0.01. Then the decrease in 𝑆, the amount that 𝑥 squared plus 𝑦 squared changes over that
step, would be about two times three times negative 0.02 plus two times four times
negative 0.01. That’s what this derivative expression, two 𝑥 d𝑥 plus two 𝑦 d𝑦, actually
means. It’s a recipe for telling you how much the value 𝑥 squared plus 𝑦 squared changes
as determined by the point 𝑥, 𝑦 where you start and the tiny step d𝑥, d𝑦 that
you take. And as with all things derivative, this is only an approximation. But it’s one that gets truer and truer for smaller and smaller choices of d𝑥 and
d𝑦. The key point here is that when you restrict yourself to steps along the circle,
you’re essentially saying you want to ensure that this value of 𝑆 doesn’t
change. It starts at a value of 25, and you wanna keep it at a value of 25. That is, d𝑆 should be zero.
So setting this expression two 𝑥 d𝑥 plus two 𝑦 d𝑦 equal to zero is the condition
under which one of these tiny steps actually stays on the circle. Again, this is only an approximation. Speaking more precisely, that condition is what keeps you on the tangent line of the
circle, not the circle itself. But for tiny enough steps, those are essentially the same thing. Of course, there’s nothing special about the expression 𝑥 squared plus 𝑦 squared
equals five squared. It’s always nice to think through more examples. So let’s consider this expression sin of 𝑥 times 𝑦 squared equals 𝑥. This corresponds to a whole bunch of U-shaped curves on the plane. And those curves, remember, represent all of the points 𝑥, 𝑦 where the value of sin
of 𝑥 times 𝑦 squared happens to equal the value of 𝑥.
Now, imagine taking some tiny step with components d𝑥, d𝑦 and not necessarily one
that keeps you on the curve. Taking the derivative of each side of this equation is gonna tell us how much the
value of that side changes during the step. On the left side, the product rule that we talked through last video tells us that
this should be left d right plus right d left. That is, sin of 𝑥 times the change to 𝑦 squared, which is two 𝑦 times d𝑦, plus 𝑦
squared times the change to sin of 𝑥, which is cos of 𝑥 times d𝑥. The right side is simply 𝑥, so the size of a change to that value is exactly d𝑥,
right? Now setting these two sides equal to each other is a way of saying, “Whatever your
tiny step with coordinates d𝑥 and d𝑦 is, if it’s gonna keep us on the curve, the
values of both the left-hand side and the right-hand side must change by the same
amount.” That’s the only way that this top equation can remain true.
From there, depending on what problem you’re trying to solve, you have something to
work with algebraically. And maybe the most common goal is to try to figure out what d𝑦 divided by d𝑥
is. As a final example here, I wanna show how you can actually use this technique of
implicit differentiation to figure out new derivative formulas. I’ve mentioned that the derivative of 𝑒 to the 𝑥 is itself. But what about the derivative of its inverse function, the natural log of 𝑥? Well, the graph of the natural log of 𝑥 can be thought of as an implicit curve. It’s all of the points 𝑥, 𝑦 on the plane where 𝑦 happens to equal ln of 𝑥. It just happens to be the case that the 𝑥s and 𝑦s of this equation aren’t as
intermingled as they were in our other examples. The slope of this graph, d𝑦 divided by d𝑥, should be the derivative of ln of 𝑥,
right? Well, to find that, first rearrange this equation, 𝑦 equals ln of 𝑥, to be 𝑒 to
the 𝑦 equals 𝑥. This is exactly what the natural log of 𝑥 means. It’s saying 𝑒 to the what equals 𝑥.
Since we know the derivative of 𝑒 to the 𝑦, we can take the derivative of both
sides here. Effectively asking how a tiny step with components d𝑥, d𝑦 changes the value of each
one of these sides. To ensure that a step stays on the curve, the change to this left side of the
equation, which is 𝑒 to the 𝑦 times 𝑑𝑦, must equal the change to the right side,
which in this case is just d𝑥. Rearranging, that means that d𝑦 divided by d𝑥, the slope of our graph, equals one
divided by 𝑒 to the 𝑦. And when we’re on the curve, 𝑒 to the 𝑦 is by definition the same thing as 𝑥. So evidently, the slope is one divided by 𝑥. And of course, an expression for the slope of a graph of a function written in terms
of 𝑥 like this is the derivative of that function. So evidently, the derivative of ln of 𝑥 is one divided by 𝑥.
By the way, all of this is a little sneak peek into multivariable calculus. Where you consider functions that have multiple inputs and how they change as you
tweak those multiple inputs. The key, as always, is to have a clear image in your head of what tiny nudges are at
play and how exactly they depend on each other.