Pop Video: Eigenvectors and Eigenvalues

Grant Sanderson • 3Blue1Brown • Boclips


Video Transcript

Eigenvectors and eigenvalues is one of those topics that a lot of students find particularly unintuitive. Questions like “Why are we doing this?” and “What does this actually mean?” are too often left just floating away in an unanswered sea of computations.

And as I’ve put out the videos of this series, a lot of you have commented about looking forward to visualizing this topic in particular. I suspect that the reason for this is not so much that eigen-things are particularly complicated or poorly explained. In fact, it’s comparatively straightforward, and I think most books do a fine job explaining it.

The issue is that it only really makes sense if you have a solid visual understanding for many of the topics that precede it. Most important here is that you know how to think about matrices as linear transformations, but you also need to be comfortable with things like determinants, linear systems of equations, and change of basis.

Confusion about eigen-stuffs usually has more to do with a shaky foundation in one of these topics than it does with eigenvectors and eigenvalues themselves.

To start, consider some linear transformation in two dimensions like the one shown here. It moves the basis vector 𝑖-hat to the coordinates three, zero and 𝑗-hat to one, two. So it’s represented with a matrix whose columns are three, zero and one, two.

Focus in on what it does to one particular vector, and think about the span of that vector, the line passing through the origin and its tip. Most vectors are gonna get knocked off their span during the transformation. I mean, it would seem pretty coincidental if the place where the vector landed also happened to be somewhere on that line. But some special vectors do remain on their own span, meaning the effect that the matrix has on such a vector is just to stretch it or squish it like a scalar.

For this specific example, the basis vector 𝑖-hat is one such special vector. The span of 𝑖-hat is the 𝑥-axis. And from the first column of the matrix, we can see that 𝑖-hat moves over to three times itself, still on that 𝑥-axis.

What’s more, because of the way linear transformations work, any other vector on the 𝑥-axis is also just stretched by a factor of three and, hence, remains on its own span. A slightly sneakier vector that remains on its own span during this transformation is negative one, one. It ends up getting stretched by a factor of two.

And again, linearity is gonna imply that any other vector on the diagonal line spanned by this guy is just gonna get stretched out by a factor of two. And for this transformation, those are all the vectors with this special property of staying on their span, those on the 𝑥-axis, getting stretched out by a factor of three, and those on this diagonal line, getting stretched by a factor of two.
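As a quick check of the claims above, here is a minimal sketch in plain Python. The helper name `apply` and the test vector (1, 1) are illustrative choices, not something from the video. It multiplies the matrix with columns three, zero and one, two by a vector on the 𝑥-axis, a vector on the diagonal line, and one other vector.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (given as a tuple of rows) by a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

# Columns are (3, 0) and (1, 2), so the rows are (3, 1) and (0, 2).
A = ((3, 1), (0, 2))

on_x_axis = apply(A, (1, 0))     # lands on 3 times (1, 0)
on_diagonal = apply(A, (-1, 1))  # lands on 2 times (-1, 1)
knocked_off = apply(A, (1, 1))   # lands on (4, 2), not a multiple of (1, 1)

print(on_x_axis, on_diagonal, knocked_off)
```

The first two results are scalar multiples of the inputs, so those vectors stay on their spans. The third is not, so (1, 1) gets knocked off its line.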

Any other vector is gonna get rotated somewhat during the transformation, knocked off the line that it spans. As you might have guessed by now, these special vectors are called the eigenvectors of the transformation, and each eigenvector has associated with it what’s called an eigenvalue, which is just the factor by which it’s stretched or squished during the transformation.

Of course, there’s nothing special about stretching versus squishing or the fact that these eigenvalues happen to be positive. In another example, you could have an eigenvector with eigenvalue negative one-half, meaning that the vector gets flipped and squished by a factor of one-half. But the important part here is that it stays on the line that it spans out without getting rotated off of it.

For a glimpse of why this might be a useful thing to think about, consider some three-dimensional rotation. If you can find an eigenvector for that rotation, a vector that remains on its own span, what you’ve found is the axis of rotation. And it’s much easier to think about a 3D rotation in terms of some axis of rotation and an angle by which it’s rotating rather than thinking about the full three-by-three matrix associated with that transformation.

In this case, by the way, the corresponding eigenvalue would have to be one, since rotations never stretch or squish anything. So the length of the vector would remain the same.

This pattern shows up a lot in linear algebra. With any linear transformation described by a matrix, you could understand what it’s doing by reading off the columns of this matrix as the landing spots for basis vectors. But often, a better way to get at the heart of what the linear transformation actually does, less dependent on your particular coordinate system, is to find the eigenvectors and eigenvalues.
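To make the 3D rotation example concrete, here is a small sketch. It assumes, purely for illustration, a rotation about the 𝑧-axis by 40 degrees; the function name `rotate_about_z` is hypothetical. A vector along the axis comes back unchanged, which is what an eigenvector with eigenvalue one looks like, while a vector off the axis gets turned.

```python
import math

def rotate_about_z(vector, angle):
    """Rotate a 3D vector about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vector
    return (c * x - s * y, s * x + c * y, z)

angle = math.radians(40)  # an arbitrary angle chosen for this sketch

# A vector on the axis of rotation stays exactly where it is.
axis_image = rotate_about_z((0.0, 0.0, 1.0), angle)

# A vector off the axis picks up a y-component, so it leaves its span.
other_image = rotate_about_z((1.0, 0.0, 0.0), angle)

print(axis_image, other_image)
```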

I won’t cover the full details on methods for computing eigenvectors and eigenvalues here, but I’ll try to give an overview of the computational ideas that are most important for a conceptual understanding.

Symbolically, here’s what the idea of an eigenvector looks like: 𝐴𝐯 = 𝜆𝐯. 𝐴 is the matrix representing some transformation, 𝐯 is the eigenvector, and 𝜆 is a number, namely, the corresponding eigenvalue. What this expression is saying is that the matrix-vector product, 𝐴 times 𝐯, gives the same result as just scaling the eigenvector 𝐯 by some value 𝜆.

So finding the eigenvectors and their eigenvalues of the matrix 𝐴 comes down to finding the values of 𝐯 and 𝜆 that make this expression true. It’s a little awkward to work with at first because that left-hand side represents matrix-vector multiplication, but the right-hand side here is scalar-vector multiplication.

So let’s start by rewriting that right-hand side as some kind of matrix-vector multiplication, using a matrix which has the effect of scaling any vector by a factor of 𝜆. The columns of such a matrix will represent what happens to each basis vector, and each basis vector is simply multiplied by 𝜆, so this matrix will have the number 𝜆 down the diagonal, with zeros everywhere else.

The common way to write this guy is to factor that 𝜆 out and write it as 𝜆 times 𝐼, where 𝐼 is the identity matrix with ones down the diagonal. With both sides looking like matrix-vector multiplication, we can subtract off that right-hand side and factor out the 𝐯.

So what we now have is a new matrix 𝐴 minus 𝜆 times the identity, and we’re looking for a vector 𝐯 such that this new matrix times 𝐯 gives the zero vector. Now this will always be true if 𝐯 itself is the zero vector, but that’s boring. What we want is a nonzero eigenvector. And if you watched Chapters 5 and 6, you’ll know that the only way it’s possible for the product of a matrix with a nonzero vector to become zero is if the transformation associated with that matrix squishes space into a lower dimension. And that squishification corresponds to a zero determinant for the matrix.
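Written out, the chain of steps described in the last few paragraphs looks roughly like this. It is a reconstruction from the spoken description, not a copy of the on-screen formulas.

```latex
\begin{align*}
A\mathbf{v} &= \lambda \mathbf{v}
  && \text{definition of an eigenvector } \mathbf{v} \text{ with eigenvalue } \lambda \\
A\mathbf{v} &= (\lambda I)\mathbf{v}
  && \text{rewrite scaling by } \lambda \text{ as a matrix} \\
A\mathbf{v} - (\lambda I)\mathbf{v} &= \mathbf{0}
  && \text{subtract off the right-hand side} \\
(A - \lambda I)\mathbf{v} &= \mathbf{0}
  && \text{factor out } \mathbf{v} \\
\det(A - \lambda I) &= 0
  && \text{needed for a nonzero solution } \mathbf{v}
\end{align*}
```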

To be concrete, let’s say your matrix 𝐴 has columns two, one and two, three and think about subtracting off a variable amount 𝜆 from each diagonal entry. Now imagine tweaking 𝜆, turning a knob to change its value. As that value of 𝜆 changes, the matrix itself changes, and so the determinant of the matrix changes.

The goal here is to find a value of 𝜆 that will make this determinant zero, meaning the tweaked transformation squishes space into a lower dimension. In this case, the sweet spot comes when 𝜆 equals one. Of course, if we had chosen some other matrix, the eigenvalue might not necessarily be one. The sweet spot might be hit at some other value of 𝜆.

So this is kind of a lot, but let’s unravel what this is saying. When 𝜆 equals one, the matrix 𝐴 minus 𝜆 times the identity squishes space onto a line. That means there’s a nonzero vector 𝐯 such that 𝐴 minus 𝜆 times the identity times 𝐯 equals the zero vector. And remember, the reason we care about that is because it means 𝐴 times 𝐯 equals 𝜆 times 𝐯, which you can read off as saying that the vector 𝐯 is an eigenvector of 𝐴, staying on its own span during the transformation 𝐴.

In this example, the corresponding eigenvalue is one. So 𝐯 would actually just stay fixed in place. Pause and ponder if you need to make sure that that line of reasoning feels good.
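The knob-turning picture can be sketched in a few lines of plain Python. The helper names `det_shifted` and `apply` and the sample values of 𝜆 are assumptions made for this illustration. The code subtracts 𝜆 from the diagonal of the matrix with columns two, one and two, three, and evaluates the determinant at several settings of the knob.

```python
def det_shifted(matrix, lam):
    """Determinant of (matrix - lam * identity) for a 2x2 matrix given as rows."""
    (a, b), (c, d) = matrix
    return (a - lam) * (d - lam) - b * c

def apply(matrix, vector):
    """Multiply a 2x2 matrix (given as rows) by a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

# Columns are (2, 1) and (2, 3), so the rows are (2, 2) and (1, 3).
A = ((2, 2), (1, 3))

# Turn the knob: the determinant changes with lam and hits zero at lam = 1.
dets = {lam: det_shifted(A, lam) for lam in (0, 1, 2, 3, 4)}
print(dets)

# With lam = 1, A - I has rows (1, 2) and (1, 2), which sends (2, -1) to zero.
v = (2, -1)
print(apply(((1, 2), (1, 2)), v))  # the zero vector
print(apply(A, v))                 # v itself, so v is fixed in place
```

The determinant is also zero at 𝜆 equals four, which is this matrix's second eigenvalue, beyond the one the narration singles out.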

This is the kind of thing I mentioned in the introduction. If you didn’t have a solid grasp of determinants and why they relate to linear systems of equations having nonzero solutions, an expression like this would feel completely out of the blue. To see this in action, let’s revisit the example from the start.

With the matrix whose columns are three, zero and one, two, to find if a value 𝜆 is an eigenvalue, subtract it from the diagonals of this matrix and compute the determinant. Doing this, we get a certain quadratic polynomial in 𝜆: three minus 𝜆 times two minus 𝜆.

Since 𝜆 can only be an eigenvalue if this determinant happens to be zero, you can conclude that the only possible eigenvalues are 𝜆 equals two and 𝜆 equals three. To figure out what the eigenvectors are that actually have one of these eigenvalues, say 𝜆 equals two, plug that value of 𝜆 into the matrix and then solve for which vectors this diagonally altered matrix sends to zero. If you computed this the way you would any other linear system, you’d see that the solutions are all the vectors on the diagonal line spanned by negative one, one.

This corresponds to the fact that the unaltered matrix, with columns three, zero and one, two, has the effect of stretching all those vectors by a factor of two. Now a 2D transformation doesn’t have to have eigenvectors. For example, consider a rotation by 90 degrees. This doesn’t have any eigenvectors since it rotates every vector off of its own span.

If you actually try computing the eigenvalues of a rotation like this, notice what happens. Its matrix has columns zero, one and negative one, zero. Subtract off 𝜆 from the diagonal elements and look for when the determinant is zero.

In this case, you get the polynomial 𝜆 squared plus one. The only roots of that polynomial are the imaginary numbers 𝑖 and negative 𝑖. The fact that there are no real number solutions indicates that there are no eigenvectors.
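Both computations can be reproduced with a short sketch. For a two-by-two matrix with rows (a, b) and (c, d), the determinant of the diagonally altered matrix expands to 𝜆 squared minus (a + d) times 𝜆 plus (ad − bc), so the quadratic formula gives the roots. The function name `eigenvalues_2x2` is an assumption for this example, and `cmath` is used so that imaginary roots come out as complex numbers instead of raising an error.

```python
import cmath

def eigenvalues_2x2(matrix):
    """Roots of det(matrix - lam * I) = 0 for a 2x2 matrix given as rows."""
    (a, b), (c, d) = matrix
    trace = a + d
    det = a * d - b * c
    # Quadratic formula applied to lam^2 - trace * lam + det = 0.
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)

stretch = ((3, 1), (0, 2))    # columns (3, 0) and (1, 2)
rotation = ((0, -1), (1, 0))  # columns (0, 1) and (-1, 0): a 90-degree rotation

print(eigenvalues_2x2(stretch))   # two real roots, 3 and 2
print(eigenvalues_2x2(rotation))  # purely imaginary roots, i and -i
```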

Another pretty interesting example, worth holding in the back of your mind, is a shear. This fixes 𝑖-hat in place and moves 𝑗-hat one over. So its matrix has columns one, zero and one, one.

All of the vectors on the 𝑥-axis are eigenvectors with eigenvalue one since they remain fixed in place. In fact, these are the only eigenvectors. When you subtract off 𝜆 from the diagonals and compute the determinant, what you get is one minus 𝜆, all squared, and the only root of this expression is 𝜆 equals one.

This lines up with what we see geometrically, that all of the eigenvectors have eigenvalue one. Keep in mind, though, it’s also possible to have just one eigenvalue, but with more than just a line full of eigenvectors. A simple example is a matrix that scales everything by two. The only eigenvalue is two. But every vector in the plane gets to be an eigenvector with that eigenvalue.
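The contrast between the shear and the uniform scaling can be checked by counting how many independent directions the diagonally altered matrix sends to zero. This is a minimal sketch for two-by-two matrices only; the function name `eigenspace_dimension` is hypothetical. It relies on the fact that the dimension of the set of solutions is two minus the rank of the altered matrix.

```python
def eigenspace_dimension(matrix, lam):
    """Dimension of the space of vectors v with (matrix - lam * I) v = 0 (2x2 only)."""
    (a, b), (c, d) = matrix
    p, q, r, s = a - lam, b, c, d - lam
    if p == q == r == s == 0:
        rank = 0          # the zero matrix sends every vector to zero
    elif p * s - q * r != 0:
        rank = 2          # nonzero determinant: only the zero vector is sent to zero
    else:
        rank = 1          # space is squished onto a line
    return 2 - rank

shear = ((1, 1), (0, 1))         # columns (1, 0) and (1, 1)
scale_by_two = ((2, 0), (0, 2))  # scales everything by two

print(eigenspace_dimension(shear, 1))         # a single line of eigenvectors
print(eigenspace_dimension(scale_by_two, 2))  # the whole plane
```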

Now is another good time to pause and ponder some of this before I move on to the last topic.

I wanna finish off here with the idea of an eigenbasis, which relies heavily on ideas from the last video. Take a look at what happens if our basis vectors just so happen to be eigenvectors.

For example, maybe 𝑖-hat is scaled by negative one and 𝑗-hat is scaled by two. Writing their new coordinates as the columns of a matrix, notice that those scalar multiples, negative one and two, which are the eigenvalues of 𝑖-hat and 𝑗-hat, sit on the diagonal of our matrix, and every other entry is a zero.

Anytime a matrix has zeros everywhere other than the diagonal, it’s called, reasonably enough, a diagonal matrix. And the way to interpret this is that all the basis vectors are eigenvectors, with the diagonal entries of this matrix being their eigenvalues.

There are a lot of things that make diagonal matrices much nicer to work with. One big one is that it’s easier to compute what will happen if you multiply this matrix by itself a whole bunch of times. Since all one of these matrices does is scale each basis vector by some eigenvalue, applying that matrix many times, say 100 times, is just gonna correspond to scaling each basis vector by the 100th power of the corresponding eigenvalue.
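Here is a small sketch of that claim, using the diagonal matrix from the example above with negative one and two on its diagonal. The helper `matmul` is written out by hand for this illustration. Multiplying the matrix by itself 100 times gives the same result as raising each diagonal entry to the 100th power.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

D = ((-1, 0), (0, 2))

# The slow way: apply the matrix 100 times, starting from the identity.
power = ((1, 0), (0, 1))
for _ in range(100):
    power = matmul(power, D)

# The shortcut: raise each eigenvalue on the diagonal to the 100th power.
shortcut = (((-1) ** 100, 0), (0, 2 ** 100))

print(power == shortcut)
```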

In contrast, try computing the 100th power of a nondiagonal matrix. Really, try it for a moment. It’s a nightmare. Of course, you’ll rarely be so lucky as to have your basis vectors also be eigenvectors. But if your transformation has a lot of eigenvectors, like the one from the start of this video, enough so that you can choose a set that spans the full space, then you could change your coordinate system so that these eigenvectors are your basis vectors.

I talked about change of basis last video, but I’ll go through a superquick reminder here of how to express the transformation currently written in our coordinate system into a different system.

Take the coordinates of the vectors that you want to use as a new basis, which, in this case, means our two eigenvectors. Then make those coordinates the columns of a matrix, known as the change of basis matrix. When you sandwich the original transformation, putting the change of basis matrix on its right and the inverse of the change of basis matrix on its left, the result will be a matrix representing that same transformation, but from the perspective of the new basis vectors’ coordinate system.

The whole point of doing this with eigenvectors is that this new matrix is guaranteed to be diagonal, with its corresponding eigenvalues down that diagonal. This is because it represents working in a coordinate system where what happens to the basis vectors is that they get scaled during the transformation.

A set of basis vectors, which are also eigenvectors, is called, again reasonably enough, an eigenbasis. So if, for example, you needed to compute the 100th power of this matrix, it would be much easier to change to an eigenbasis, compute the 100th power in that system, then convert back to our standard system.
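The whole procedure can be sketched for the matrix from the start of the video, using its eigenvectors one, zero and negative one, one as the new basis. The names `P`, `P_inv`, and the helpers below are assumptions for this illustration. `Fraction` keeps the inverse exact, although here the determinant of the change of basis matrix happens to be one.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(M):
    """Exact inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return ((d / det, -b / det), (-c / det, a / det))

A = ((3, 1), (0, 2))   # columns (3, 0) and (1, 2)
P = ((1, -1), (0, 1))  # columns are the eigenvectors (1, 0) and (-1, 1)
P_inv = inverse(P)

# Sandwich A: inverse on the left, change of basis matrix on the right.
D = matmul(matmul(P_inv, A), P)
print(D)  # diagonal, with the eigenvalues 3 and 2 on the diagonal

# 100th power via the eigenbasis: power the diagonal entries, then convert back.
D_100 = ((D[0][0] ** 100, 0), (0, D[1][1] ** 100))
via_eigenbasis = matmul(matmul(P, D_100), P_inv)

# Direct computation for comparison: multiply A by itself 100 times.
direct = ((1, 0), (0, 1))
for _ in range(100):
    direct = matmul(direct, A)

print(via_eigenbasis == direct)
```

Converting back uses the sandwich in the opposite order, with the change of basis matrix on the left and its inverse on the right.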

You can’t do this with all transformations. A shear, for example, doesn’t have enough eigenvectors to span the full space. But if you can find an eigenbasis, it makes matrix operations really lovely.

For those of you willing to work through a pretty neat puzzle to see what this looks like in action and how it can be used to produce some surprising results, I’ll leave up a prompt here on the screen. It takes a bit of work, but I think you’ll enjoy it.

The next and final video of this series is gonna be on abstract vector spaces. See you then.
