The last several videos have been about the idea of a derivative. And before moving on to integrals, I wanna take some time to talk about limits. To be honest, the idea of a limit is not really anything new. If you know what the word approach means, you pretty much already know what a limit
is. You could say that it’s a matter of assigning fancy notation to the intuitive idea of
one value that just gets closer to another.
But there actually are a few reasons to devote a full video to this topic. For one thing, it’s worth showing how the way that I’ve been describing derivatives
so far lines up with the formal definition of a derivative as it’s typically
presented in most courses and textbooks. I wanna give you a little confidence that thinking in terms of d𝑥 and d𝑓 as
concrete nonzero nudges is not just some trick for building intuition. It’s actually backed up by the formal definition of a derivative in all of its
rigor. I also wanna shed light on what exactly mathematicians mean when they say approach in
terms of something called the epsilon–delta definition of limits. Then we’ll finish off with a clever trick for computing limits called L’Hôpital’s rule.
So first things first, let’s take a look at the formal definition of the
derivative. As a reminder, when you have some function 𝑓 of 𝑥, to think about its derivative at
a particular input, maybe 𝑥 equals two. You start by imagining nudging that input some little d𝑥 away and looking at the
resulting change to the output, d𝑓. The ratio d𝑓 divided by d𝑥, which can be nicely thought of as the rise-over-run
slope between the starting point on the graph and the nudged point, is almost what
the derivative is. The actual derivative is whatever this ratio approaches as d𝑥 approaches zero. And just to spell out a little of what’s meant there. That nudge to the output, d𝑓, is the difference between 𝑓 at the starting input
plus d𝑥 and 𝑓 at the starting input, the change to the output caused by d𝑥.
To express that you wanna find what this ratio approaches as d𝑥 approaches zero, you
write lim, for limit, with d𝑥 arrow zero below it. Now, you’ll almost never see terms with a lowercase d, like d𝑥, inside a limit
expression like this. Instead, the standard is to use a different variable, something like Δ𝑥, or commonly
ℎ for whatever reason. The way I like to think of it is that terms with this lowercase d in the typical
derivative expression have built into them this idea of a limit. The idea that d𝑥 is supposed to eventually go to zero. So in a sense, this left-hand side here, d𝑓 over d𝑥, the ratio we’ve been thinking
about for the past few videos, is just shorthand for what the right-hand side here
spells out in more detail. Writing out exactly what we mean by d𝑓 and writing out this limit process
explicitly. And this right-hand side here is the formal definition of a derivative, as you would
commonly see it in any calculus textbook.
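Spelled out symbolically, with ℎ playing the role of the nudge d𝑥, that right-hand side reads:

```latex
\frac{df}{dx}(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
```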
And if you’ll pardon me for a small rant here, I wanna emphasize that nothing about
this right-hand side references the paradoxical idea of an infinitely small
change. The point of limits is to avoid that. This value ℎ is the exact same thing as the d𝑥 I’ve been referencing throughout the
series. It’s a nudge to the input of 𝑓 with some nonzero, finitely small size, like
0.001. It’s just that we’re analyzing what happens for arbitrarily small choices of ℎ. In fact, the only reason that people introduce a new variable name into this formal
definition — rather than just, you know, using d𝑥 — is to be super extra clear that
these changes to the input are just ordinary numbers that have nothing to do with
infinitesimals. Because the thing is, there are others who like to interpret this d𝑥 as an
infinitely small change, whatever that would mean. Or to just say that d𝑥 and d𝑓 are nothing more than symbols that we shouldn’t take too seriously.
But by now in the series, you know I’m not really a fan of either of those views. I think you can and should interpret d𝑥 as a concrete, finitely small nudge just so
long as you remember to ask what happens when that thing approaches zero. For one thing, and I hope the past few videos have helped convince you of this, that
helps to build stronger intuition for where the rules of calculus actually come
from. But it’s not just some trick for building intuition. Everything I’ve been saying about derivatives with this concrete-finitely-small-nudge
philosophy is just a translation of this formal definition we’re staring at right
now. So long story short, the big fuss about limits is that they let us avoid talking
about infinitely small changes, by instead asking what happens as the size of some small change to our variable approaches zero.
And this brings us to goal number two, understanding exactly what it means for one
value to approach another. For example, consider the function two plus ℎ cubed minus two cubed all divided by
ℎ. This happens to be the expression that pops out when you unravel the definition of a
derivative of 𝑥 cubed evaluated at 𝑥 equals two. But let’s just think of it as any old function with an input ℎ. Its graph is this nice continuous-looking parabola, which would make sense because it’s a cubic term divided by a linear term. But actually, if you think about what’s going on at ℎ equals zero, plugging that in,
you would get zero divided by zero, which is not defined. So really, this graph has a hole at that point, and you have to kind of exaggerate to
draw that hole, often with a little empty circle like this.
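If it helps to see that hole numerically, here’s a quick sketch in Python (purely illustrative; the video only shows the graph) that samples the function at inputs closer and closer to zero:

```python
# The function from the video: ((2 + h)^3 - 2^3) / h.
# Plugging in h = 0 gives 0/0, so the function is undefined there,
# but it is perfectly well-defined for every other input.
def f(h):
    return ((2 + h) ** 3 - 2 ** 3) / h

# Sampling inputs closer and closer to zero, from both sides:
for h in [0.1, 0.01, 0.001, -0.001]:
    print(h, f(h))  # the outputs settle toward 12
```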
But keep in mind, the function is perfectly well-defined for inputs as close to zero
as you want. And wouldn’t you agree that as ℎ approaches zero, the corresponding output, the
height of this graph, approaches 12? And it doesn’t matter which side you come at it from: the limit of this ratio as ℎ approaches zero is equal to 12. But imagine that you are a mathematician inventing calculus and someone skeptically
asks you, “Well, what exactly do you mean by approach?” That would be kind of an annoying question. I mean, come on, we all know what it means for one value to get closer to
another. But let’s start thinking about ways that you might be able to answer that person.
For a given range of inputs within some distance of zero, excluding the forbidden point zero itself, look at all of the corresponding outputs, all possible heights of the graph above that range. As the range of input values closes in more and more tightly around zero, that range of output values closes in more and more tightly around 12. And importantly, the size of that range of output values can be made as small as you
want. As a counterexample, consider a function that looks like this, which is also not defined at zero but kinda jumps up at that point. When you approach ℎ equals zero from the right, the function approaches the value two. But as you come at it from the left, it approaches one. Since there’s not a single, clear, unambiguous value that this function approaches as ℎ approaches zero, the limit is simply not defined at that point.
One way to think of this is that when you look at any range of inputs around zero and consider the corresponding range of outputs, as you shrink that input range, the corresponding outputs don’t narrow in on any specific value. Instead, those outputs straddle a range that never shrinks smaller than one, even as you make that input range as tiny as you could imagine. And this perspective of shrinking an input range around the limiting point, and seeing whether or not you’re restricted in how much that shrinks the output range, leads to something called the epsilon–delta definition of limits.
Now I should tell you, you could argue that this is needlessly heavy-duty for an
introduction to calculus. Like I said, if you know what the word approach means, you already know what a limit
means. There’s nothing new on the conceptual level here. But this is an interesting glimpse into the field of real analysis. And it gives you a taste for how mathematicians make the intuitive ideas of calculus
a little more airtight and rigorous. You’ve already seen the main idea here. When a limit exists, you can make this output range as small as you want. But when the limit doesn’t exist, that output range cannot get smaller than some particular value, no matter how much you shrink the input range around the limiting input.
Let’s phrase that same idea a little more precisely, maybe in the context of this example, where the limiting value is 12. Think about any distance away from 12, where for some reason it’s common to use the Greek letter 𝜀 to denote that distance. And the intent here is gonna be that this distance, 𝜀, is as small as you want. What it means for the limit to exist is that you will always be able to find a range of inputs around our limiting point, some distance 𝛿 around zero, so that any input within 𝛿 of zero corresponds to an output within a distance 𝜀 of 12.
And the key point here is that that’s true for any 𝜀, no matter how small. You’ll always be able to find the corresponding 𝛿. In contrast, when a limit does not exist, as in this example here, you can find a sufficiently small 𝜀, like 0.4, so that no matter how small you make your range around zero, no matter how tiny 𝛿 is, the corresponding range of outputs is just always too big. There is no limiting output where everything is within a distance 𝜀 of that output.
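The epsilon–delta game can even be played numerically for the earlier example, where ((2 + ℎ)³ − 2³)/ℎ has limit 12 at ℎ = 0. A small Python sketch (the choice 𝛿 = min(1, 𝜀/7) comes from the algebra in the comments, and is just one of many choices that work):

```python
# For f(h) = ((2 + h)^3 - 2^3) / h, algebra gives f(h) = 12 + 6h + h^2
# for h != 0. So for |h| <= 1 we have |f(h) - 12| <= 7|h|, which means
# delta = min(1, eps/7) guarantees outputs within eps of 12.

def f(h):
    return ((2 + h) ** 3 - 2 ** 3) / h

def delta_for(eps):
    return min(1.0, eps / 7.0)

# Spot-check: every sampled h within delta of 0 (excluding 0 itself)
# gives an output within eps of 12.
for eps in [1.0, 0.1, 0.001]:
    d = delta_for(eps)
    samples = [d * t / 1000 for t in range(-999, 1000) if t != 0]
    assert all(abs(f(h) - 12) < eps for h in samples)
```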
So far, this is all pretty theory heavy, don’t you think? Limits being used to formally define the derivative, and then 𝜀s and 𝛿s being used
to rigorously define the limit itself. So let’s finish things off here with a trick for actually computing limits. For instance, let’s say for some reason you were studying the function sin of 𝜋
times 𝑥 divided by 𝑥 squared minus one. Maybe this was modeling some kind of dampened oscillation. When you plot a bunch of points to graph this, it looks pretty continuous. But there’s a problematic value at 𝑥 equals one. When you plug that in, sin of 𝜋 is, well, zero. And the denominator also comes out to zero. So the function is actually not defined at that input. And the graph should really have a hole there.
This also happens by the way at 𝑥 equals negative one. But let’s just focus our attention on a single one of these holes for now. The graph certainly does seem to approach a distinct value at that point, wouldn’t
you say? So you might ask, how exactly do you find what output this approaches as 𝑥
approaches one, since you can’t just plug in one. Well, one way to approximate it would be to plug in a number that’s just really,
really close to one, like 1.00001. Doing that, you’d find that there should be a number around negative 1.57. But is there a way to know precisely what it is? Some systematic process to take an expression like this one that looks like zero
divided by zero at some input, and ask what its limit is as 𝑥 approaches that input?
After limits so helpfully let us write the definition for derivatives, derivatives
can actually come back here and return the favor to help us evaluate limits. Let me show you what I mean. Here’s what the graph of sin of 𝜋 times 𝑥 looks like. And here’s what the graph of 𝑥 squared minus one looks like. That’s kind of a lot to have up on the screen, but just focus on what’s happening
around 𝑥 equals one. The point here is that sin of 𝜋 times 𝑥 and 𝑥 squared minus one are both zero at
that point. They both cross the 𝑥-axis. In the same spirit as plugging in a specific value near one, like 1.00001, let’s zoom in on that point and consider what happens just a tiny nudge, d𝑥, away
from it. The value of sin of 𝜋 times 𝑥 is bumped down, and the size of that nudge, which was caused by the nudge d𝑥 to the input, is what we might call d(sin of 𝜋𝑥).
And from our knowledge of derivatives, using the chain rule, that should be around
cos of 𝜋 times 𝑥 times 𝜋 times d𝑥. Since the starting value was 𝑥 equals one, we plug in 𝑥 equals one to that
expression. In other words, the amount that this sin of 𝜋 times 𝑥 graph changes is roughly proportional to d𝑥, with a proportionality constant equal to cos of 𝜋 times 𝜋. And cos of 𝜋, if we think back to our trig knowledge, is exactly negative one. So we can write this whole thing as negative 𝜋 times d𝑥.
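That chain-rule step, written out with the evaluation at 𝑥 = 1 plugged in:

```latex
d\big(\sin(\pi x)\big) \approx \cos(\pi x) \cdot \pi \cdot dx,
\qquad \text{and at } x = 1:\quad \cos(\pi) \cdot \pi \cdot dx = -\pi \, dx
```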
Similarly, the value of the 𝑥 squared minus one graph changes by some d(𝑥 squared minus one). And taking the derivative, the size of that nudge should be two 𝑥 times d𝑥. Again, we were starting at 𝑥 equals one, so we plug in 𝑥 equals one to that
expression. Meaning, the size of that output nudge is about two times one times d𝑥. What this means is that for values of 𝑥 which are just a tiny nudge, d𝑥, away from one, the ratio sin of 𝜋𝑥 divided by 𝑥 squared minus one is approximately negative 𝜋 times d𝑥 divided by two times d𝑥. The d𝑥s here cancel out, so what’s left is negative 𝜋 over two. And importantly, those approximations get more and more accurate for smaller and
smaller choices of d𝑥, right? So this ratio, negative 𝜋 over two, actually tells us the precise limiting value as
𝑥 approaches one.
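A quick numerical check in Python corroborates this (again, just illustrative; the video does this by eye on the graph):

```python
import math

# The ratio sin(pi*x) / (x^2 - 1), which is undefined at x = 1.
def ratio(x):
    return math.sin(math.pi * x) / (x ** 2 - 1)

# Sampling inputs approaching 1 from both sides:
for x in [1.1, 1.001, 1.00001, 0.99999]:
    print(x, ratio(x))

print(-math.pi / 2)  # the predicted limit, about -1.5708
```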
And remember, what that means is that the limiting height on our original graph is,
evidently, exactly negative 𝜋 over two. Now what happened there is a little subtle, so I wanna go through it again, but this time a little more generally. Instead of these two specific functions, which are both equal to zero at 𝑥 equals one, think of any two functions 𝑓 of 𝑥 and 𝑔 of 𝑥, which are both zero at some common
value, 𝑥 equals 𝑎. The only constraint is that these have to be functions where you’re able to take a derivative of them at 𝑥 equals 𝑎, which means that they each basically look like a line when you zoom in close enough
to that value. Now even though you can’t compute 𝑓 divided by 𝑔 at this trouble point, since both of them equal zero, you can ask about this ratio for values of 𝑥 really, really close to 𝑎, the limit as 𝑥 approaches 𝑎.
And it’s helpful to think of those nearby inputs as just a tiny nudge, d𝑥, away from
𝑎. The value of 𝑓 at that nudged point is approximately its derivative, d𝑓 over d𝑥,
evaluated at 𝑎 times d𝑥. Likewise, the value of 𝑔 at that nudged point is approximately the derivative of 𝑔
evaluated at 𝑎 times d𝑥. So near that trouble point, the ratio between the outputs of 𝑓 and 𝑔 is actually
about the same as the derivative of 𝑓 at 𝑎 times d𝑥 divided by the derivative of
𝑔 at 𝑎 times d𝑥. Those d𝑥s cancel out, so the ratio of 𝑓 and 𝑔 near 𝑎 is about the same as the
ratio between their derivatives. Because each of those approximations gets more and more accurate for smaller and smaller nudges, this ratio of derivatives gives the precise value for the limit.
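In symbols, assuming 𝑓(𝑎) = 𝑔(𝑎) = 0 and that the derivative of 𝑔 at 𝑎 is nonzero:

```latex
\lim_{x \to a} \frac{f(x)}{g(x)}
= \frac{\left.\dfrac{df}{dx}\right|_{x = a}}{\left.\dfrac{dg}{dx}\right|_{x = a}}
```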
This is a really handy trick for computing a lot of limits. Whenever you come across some expression that seems to equal zero divided by zero when you plug in some particular input, just try taking the derivative of the top and bottom expressions and plugging in that
same trouble input. This clever trick is called L’Hôpital’s rule. Interestingly, it was actually discovered by Johann Bernoulli. But L’Hôpital was a wealthy dude who essentially paid Bernoulli for the rights to
some of his mathematical discoveries. Academia was weird back then. But hey, in a very literal way, it pays to understand these tiny nudges.
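As a computational footnote, the rule can be mimicked numerically with finite differences, mirroring the tiny-nudge picture. This is just a sketch (the function name lhopital and the step size dx are my own choices, not anything standard):

```python
import math

def lhopital(f, g, a, dx=1e-6):
    """Estimate the limit of f(x)/g(x) as x approaches a, when
    f(a) = g(a) = 0, by taking the ratio of finite-difference
    approximations to the two derivatives at a."""
    df = (f(a + dx) - f(a)) / dx  # approximately f'(a)
    dg = (g(a + dx) - g(a)) / dx  # approximately g'(a)
    return df / dg

# The example from the video: sin(pi*x) / (x^2 - 1) as x approaches 1.
approx = lhopital(lambda x: math.sin(math.pi * x),
                  lambda x: x ** 2 - 1,
                  a=1.0)
print(approx)  # close to -pi/2
```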
Now right now, you might be remembering that the definition of a derivative for a
given function comes down to computing the limit of a certain fraction that looks
like zero divided by zero. So you might think that L’Hôpital’s rule could give us a handy way to discover new
derivative formulas. But that would actually be cheating, since presumably you don’t know what the
derivative of the numerator here is. When it comes to discovering derivative formulas, something that we’ve been doing a fair amount of in this series, there is no systematic plug-and-chug method. But that’s a good thing. Whenever creativity is needed to solve problems like these, it’s a good sign that
you’re doing something real. Something that might give you a powerful tool to solve future problems.
And speaking of powerful tools, up next, I’m gonna be talking about what an integral
is as well as the fundamental theorem of calculus. And this is another example of where limits can be used to help give a clear meaning
to a pretty delicate idea that flirts with infinity.