In this explainer, we will learn how to find the LU decomposition (factorization) of a matrix using Doolittle’s method.
Despite all of the elegance and simplicity of the algebraic rules which govern many of the associated operations, there is no escaping the fact that linear algebra is a computationally heavy discipline and that this is especially true for those who are new to the subject. In order to properly understand and appreciate the benefits of linear algebra, it is necessary for a new student to undertake many of the common calculations manually, before these are either streamlined or replaced with more efficient methods. However, there are some operations such as matrix multiplication where there is little room for maneuver in terms of simplifying a calculation and, ultimately, there is no escape from having to perform a large number of arithmetic operations.
Due to the number of computations involved in linear algebra, we are constantly motivated to find better, less burdensome methods for completing calculations. There are few advantages in completing more calculations than is strictly necessary, especially in terms of the time taken and the possibility of introducing mistakes with each superfluous stage of the working. Even in the modern age of computing, where calculations can be completed flawlessly and autonomously at a rate of billions per second, we are still motivated to decrease the number of calculations that are needed, given that there are so few reasons not to!
In linear algebra, one of the most common tricks when working with a matrix is to see if it can be shifted or split up into a form that will be easier for subsequent calculations (especially matrix multiplication, exponentiation, and inversion). In many of these methods, it is often desirable to shift a matrix of interest into specific forms, with one of the most common being the triangular forms. In this explainer, we will show, where possible, how to take a general matrix and split it into two separate triangular matrices that can be combined together with matrix multiplication to obtain the original matrix. Although the process for completing this is actually less complicated than it often appears, it will help us to first clarify several definitions before demonstrating this method. We will begin with the notion of upper and lower-triangular matrices, which are common objects of interest in linear algebra.
Definition: Upper- and Lower-Triangular Matrices
Consider a square matrix A. If all entries below the diagonal entries are zero, then the matrix is called “upper triangular.” If all entries above the diagonal entries are zero, then the matrix is called “lower triangular.”
We will demonstrate this concept with several examples. Notice that all of the following matrices are upper triangular, because all entries below the (highlighted) diagonal entries are zero:
As shown in the middle matrix, it is perfectly valid for any of the diagonal entries to be equal to zero. The definition only requires that the entries below the diagonal entries all have a value of zero; it says nothing about the values of the diagonal entries themselves or of the entries above them. Conversely, a matrix is not in upper-triangular form if any entry below the diagonal entries is nonzero. Taking the matrices above, we could easily add in some errant entries that would ruin the upper-triangular form:
Lower-triangular matrices are defined in a very similar way, except we require that all entries above the diagonal entries be zero. For example, all of the following are lower-triangular matrices:
This property can be removed by adding some nonzero entries above the diagonal entries. For example,
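These defining conditions are easy to check computationally. The following sketch (using NumPy, with an illustrative matrix of our own choosing rather than one from the displays above) compares a matrix against its own upper- and lower-triangular parts:

```python
import numpy as np

def is_upper_triangular(M):
    # All entries strictly below the diagonal must be zero.
    return np.allclose(M, np.triu(M))

def is_lower_triangular(M):
    # All entries strictly above the diagonal must be zero.
    return np.allclose(M, np.tril(M))

# Illustrative matrix: note the zero diagonal entry, which is allowed.
U = np.array([[1.0, 2.0, 3.0],
              [0.0, 0.0, 4.0],
              [0.0, 0.0, 5.0]])

print(is_upper_triangular(U))    # upper triangular
print(is_lower_triangular(U.T))  # its transpose is lower triangular
```

Transposing swaps the roles of “above” and “below” the diagonal, which is why the transpose of an upper-triangular matrix is lower triangular.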
The aim of this explainer will be to take a general matrix A and to split it into its “LU decomposition,” meaning that A = LU, where L is a lower-triangular matrix and U is an upper-triangular matrix. We should say at the outset that this will not always be possible and that we may occasionally need the “PLU” decomposition, for which there is a separate explainer. We will continue undaunted by this fact and begin by describing the general method that we will use.
As we begin to develop our method for finding the LU decomposition of a matrix, it will be easiest to think in terms of elementary row operations, as this is the method we would normally choose when manipulating the entries of a matrix, whether as part of Gauss–Jordan elimination or in one of the many other methods that rely on row operations. More specifically, we will show how each type of row operation can be written in terms of a simple, invertible, lower-triangular matrix, hence allowing a series of row operations to be encoded exactly and in full by an accompanying series of matrices. There are three types of elementary row operations, the simplest being the row swap operation r_i ↔ r_j. This is the only one of the three that we do not cover in this explainer, as it pertains to PLU decomposition rather than LU decomposition. For the other two types of row operations, we will define the equivalent elementary matrix. Please be aware that the literature on linear algebra has never reached a consensus on how the elementary matrices should be labeled, so we are using our own terminology.
Definition: The Second Type of Elementary Row Operation and the Corresponding Elementary Matrix
Consider a matrix A and the second type of elementary row operation r_i → c r_i, where c ≠ 0, giving the row-equivalent matrix B. Then, we can define the corresponding “elementary” matrix D_i(c), which is essentially the identity matrix but with the entry in the ith row and ith column being equal to c instead of 1. Then, the row-equivalent matrix can be written as the matrix multiplication B = D_i(c)A.
This result is simple enough to demonstrate by example. Suppose that we had the square matrix A and that we wished to perform the row operation r_3 → 5r_3 in order to obtain the row-equivalent matrix B. Then, we would expect to find that
Using the definition above, we can construct the corresponding elementary matrix D_3(5). This will just be a copy of the identity matrix but with the diagonal entry in the third row being equal to 5 instead of 1. The matrix is
From this, we can check that the matrix multiplication D_3(5)A has produced the row-equivalent matrix that we were expecting. Practically speaking, the second type of row operation is not as useful as the third type, and we will make less use of it. However, given that we may choose to use this row operation, we will still explain how to calculate the inverse of D_i(c). Fortunately, the result is simple to both state and verify, so we will summarize it in the following theorem without proof.
Theorem: Inverse of the Matrix 𝐷𝑖(𝑐)
The inverse of the elementary matrix D_i(c) is given by the formula D_i(c)⁻¹ = D_i(1/c).
If we had scaled a row by a constant c and wanted to undo this operation, then we would just multiply the same row in the new matrix by the reciprocal constant 1/c, thus returning the original matrix. This rationale is respected by the theorem above and can be demonstrated by example. We can check that the theorem is correct in the case of the example above by noting that the inverse matrix of D_3(5) is the matrix D_3(1/5). We can then apply this to the row-equivalent matrix B to recover the original matrix A:
We could equivalently have checked directly that D_3(5)D_3(1/5) = I, where I is the identity matrix.
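The behavior of D_i(c) and its inverse can also be reproduced numerically. The sketch below (a NumPy illustration on a 3 × 3 matrix of our own choosing, not the one displayed above) builds D_3(5), applies it, and undoes it with D_3(1/5):

```python
import numpy as np

def D(n, i, c):
    # Identity matrix of order n with the (i, i) entry replaced by c
    # (1-based i), encoding the row operation r_i -> c * r_i.
    M = np.eye(n)
    M[i - 1, i - 1] = c
    return M

# Illustrative matrix, chosen for this sketch.
A = np.array([[1.0, 2.0, 0.0],
              [4.0, 1.0, 3.0],
              [2.0, 0.0, 1.0]])

B = D(3, 3, 5) @ A          # scales the third row of A by 5
back = D(3, 3, 1 / 5) @ B   # the inverse operation recovers A
```

The final line confirms the theorem in action: scaling by 1/5 undoes scaling by 5.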
To complete the LU decomposition of a matrix, the most important type of row operation is the third type, defined by r_i → r_i + c r_j. Indeed, it is possible to complete the LU decomposition of a matrix without using the second type of row operation described above; generally speaking, however, it would be impossible without the third type. This importance is easy to appreciate, given that the third type of row operation is the one used most commonly in any process involving the elementary row operations. For example, there would generally be little hope of completing Gauss–Jordan elimination if we did not have license to use the third type of row operation. With this in mind, we now define the third type of elementary matrix.
Definition: The Third Type of Elementary Row Operation and the Corresponding Elementary Matrix
Consider a matrix A and the third type of elementary row operation r_i → r_i + c r_j, giving the row-equivalent matrix B. Then, we can define the corresponding “elementary” matrix E_ij(c), which is essentially the identity matrix but with the entry in the ith row and jth column being equal to c instead of 0. Then, the row-equivalent matrix can be written as the matrix multiplication B = E_ij(c)A.
We will demonstrate this in relation to the example above, where we had the matrix
If we wanted to apply the row operation r_2 → r_2 + 3r_1, then we would expect the resulting row-equivalent matrix
Because of the definition above, we know that we can encode the elementary row operation by the elementary matrix E_21(3). This matrix will be identical to the identity matrix I, except for the entry in the second row and first column having a value of 3 instead of 0. This matrix therefore has the form
We can verify that this matrix can be used to recover the row-equivalent matrix by matrix multiplication:
With the second type of row operation, we said that it was important to understand both the matrix itself and the corresponding inverse matrix, so that we can complete the LU decomposition. This is also the case for the third type of row operation described in the definition above. Even though the third type of row operation is more complicated than the second type, the inverse operation is scarcely more difficult to express.
Theorem: Inverse of the Matrix 𝐸𝑖𝑗(𝑐)
The inverse of the matrix E_ij(c) is given by the formula E_ij(c)⁻¹ = E_ij(−c).
Suppose that we had taken a matrix and performed the elementary row operation r_i → r_i + c r_j. If we wanted to undo this effect, then we would use the row operation r_i → r_i − c r_j on the new matrix, hence returning the original matrix. This understanding is totally encapsulated by the statement of the theorem above, for which we now provide a check. To test this, we can produce the inverse matrix corresponding to the example above. Since we used E_21(3), we have that the inverse matrix is E_21(−3). We can then check that
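The same numerical check works for E_ij(c). The sketch below (NumPy, on an illustrative 2 × 2 matrix of our own choosing) encodes r_2 → r_2 + 3r_1 as E_21(3) and undoes it with E_21(−3):

```python
import numpy as np

def E(n, i, j, c):
    # Identity matrix of order n with the (i, j) entry set to c
    # (1-based indices), encoding r_i -> r_i + c * r_j.
    M = np.eye(n)
    M[i - 1, j - 1] = c
    return M

# Illustrative matrix, chosen for this sketch.
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])

B = E(2, 2, 1, 3) @ A        # adds 3 * (row 1) to row 2
back = E(2, 2, 1, -3) @ B    # the inverse operation recovers A
```

Multiplying E_21(3) by E_21(−3) directly also returns the identity matrix, matching the theorem.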
Now that we have discussed the necessary ingredients for LU decomposition, we will both define this concept and then show how it can be achieved. We will do so initially by way of example, wherein we will briefly justify all of the steps that we are taking. Many of the techniques are ostensibly similar to those used in Gauss–Jordan elimination and other related methods. The completion of the LU decomposition focuses on achieving an upper-triangular form, whereas Gauss–Jordan elimination focuses on achieving an echelon form. These two concepts are similar but are not identical, and in fact when attempting to complete the LU decomposition we will have fewer options than we do generally for Gauss–Jordan.
Definition: The LU Decomposition of a Matrix
Consider a matrix A. Then, the “LU decomposition” requires finding a lower-triangular matrix L and an upper-triangular matrix U such that A = LU. This is sometimes referred to as the “LU factorization” of a matrix.
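For reference, numerical libraries typically compute the closely related PLU form, in which a permutation matrix P is also allowed. A brief sketch using SciPy’s scipy.linalg.lu on an illustrative matrix of our own choosing:

```python
import numpy as np
from scipy.linalg import lu

# Illustrative matrix, chosen for this sketch.
A = np.array([[2.0, 1.0],
              [4.0, 5.0]])

# scipy.linalg.lu returns P, L, U with A = P @ L @ U,
# where P is a permutation matrix.
P, L, U = lu(A)
```

When no row swaps are needed, P is simply the identity matrix and this reduces to the plain LU decomposition defined above.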
We will demonstrate this with the simplest possible example of a 2 × 2 matrix. We take the matrix A and task ourselves with finding a lower-triangular matrix L and an upper-triangular matrix U such that A = LU. Suppose that we begin with the familiar method of first highlighting all of the pivot entries of A:
After having practiced Gauss–Jordan elimination, our first instinct might be to simplify A, which in this case would involve removing the factor of 3 that appears in every entry in the top row. To do this, we would use the elementary row operation r_1 → (1/3)r_1, which has the corresponding elementary matrix
We will also state the inverse of this matrix:
Given that these two matrices are inverses of each other, they must obey the relationship D_1(3)D_1(1/3) = I, where I is the identity matrix. This is certainly the case, which means that we can write this equation in full:
It is this statement that allows us to begin finding the LU decomposition of A. Given that A = IA for any square matrix A, we can write
Now the reason for writing this equation is explained by equation (1), which we duly substitute into the equation immediately above, giving
Matrix multiplication is associative, which means that we can combine the second and third matrices using the operation of matrix multiplication. This gives precisely the desired effect, as the new middle matrix is the same as the original matrix but with the constant factor of 3 now removed from the first row:
The leftmost matrix is a lower-triangular matrix, as all of the entries above the diagonal entries are zero. In the case of a 2 × 2 matrix, this leaves only the entry in the top right, which is shown to be zero. Unfortunately, the middle matrix is not yet in upper-triangular form, and so we have not yet found our LU decomposition. Now, once again inspecting the pivot entries shows that we can remove the pivot in the second row to turn the middle matrix into an upper-triangular matrix. This is completed with the row operation . The corresponding elementary matrix is which has the inverse elementary matrix
Once again, these two elementary matrices are multiplicative inverses of each other, so we have the relationship . Written out in long form, this is
We can take equation (2) and insert the identity matrix as shown, without changing the validity of the equation:
Substituting equation (3) then gives
We once again rely on the property of matrix multiplication being associative. Given that this is always the case, we can order the calculations in any way we would prefer. In this situation, it is best to order the calculations as shown:
Then, completing the two matrix multiplications gives the required decomposition:
We have therefore obtained the two matrices
such that A = LU. It is easily checked that L is a lower-triangular matrix and that U is an upper-triangular matrix, and the final confirmation of the result can be completed by checking that equation (4) is true.
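The two-step pattern of this worked example can be traced numerically. The sketch below applies the same procedure to a hypothetical 2 × 2 matrix of our own choosing, with a common factor of 3 in its top row so that the same two kinds of row operations apply; the specific numbers here are illustrative assumptions, not those displayed above:

```python
import numpy as np

# Hypothetical 2 x 2 matrix with a common factor of 3 in the top row.
A = np.array([[3.0, 6.0],
              [2.0, 5.0]])

D = np.array([[1 / 3, 0.0],       # encodes r1 -> (1/3) r1
              [0.0,   1.0]])
D_inv = np.array([[3.0, 0.0],     # its inverse, D_1(3)
                  [0.0, 1.0]])

E = np.array([[1.0,  0.0],        # encodes r2 -> r2 - 2 r1
              [-2.0, 1.0]])
E_inv = np.array([[1.0, 0.0],     # its inverse, E_21(2)
                  [2.0, 1.0]])

U = E @ (D @ A)                   # row operations drive A to upper-triangular form
L = D_inv @ E_inv                 # product of the inverses, lower triangular
```

The assertions L @ U = A, L lower triangular, and U upper triangular can all be verified by direct multiplication, mirroring the check of equation (4).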
The calculation of an LU decomposition, once practiced, would normally be completed without much of the supporting explanation given in the example above. While elementary row operations can be used to manipulate a matrix toward any achievable decomposition, obtaining an LU decomposition requires that we only ever produce lower-triangular matrices on the leftmost side of our equations. This means that our elementary matrices (and hence our elementary row operations) must be chosen accordingly, as it would be counterproductive to use elementary row operations that incur a non-lower-triangular form in the leftmost terms. This is not as challenging as it might seem at first, and the criteria are essentially those described by Doolittle’s algorithm, which we will discuss in more detail later in this explainer.
Example 1: LU Decomposition for a 3 × 3 Matrix
Find the LU decomposition for the matrix
We begin by highlighting the pivot entries in the matrix A, which are the first nonzero entries of each row:
Our aim will be to move the pivot entries to the right, hence moving closer toward an upper-triangular form for A. There are infinitely many ways to achieve an upper-triangular form, although we will begin by aiming to remove the pivot that is in the second row. If we were thinking purely in terms of elementary row operations, then we could use to achieve this. Thinking in terms of the corresponding matrix, we would require the elementary matrix (and inverse)
Given that the two matrices above are multiplicative inverses of each other, their matrix product will necessarily return the identity matrix. This means that we can multiply on the left-hand side of A by this combined expression without changing A overall (since the identity matrix preserves any matrix under matrix multiplication). Thus, we could write which is essentially just the same as writing the equation A = IA. However, once we combine the second and third matrices together by using matrix multiplication, we find that we have actually obtained the desired row-equivalent matrix as the middle matrix:
The leftmost term is a lower-triangular matrix, but the middle matrix is not yet in upper-triangular form, so we still have some work to do. Now suppose that we wished to remove the pivot entry in the third row, to help manipulate the middle matrix toward upper-triangular form. One option would be the row operation , which has the equivalent elementary matrices
We can now inject both of these matrices between the left and middle matrices in equation (5). Overall, this has no effect on the validity of the equation because the two inserted matrices are each other’s inverse. We obtain
Because matrix multiplication is associative, we could equally write this as
If we multiply together the first pair of matrices and then multiply together the second pair of matrices, we obtain the alternative form
The leftmost matrix is still in lower-triangular form, which is necessary for the LU decomposition to be complete. However, we have not quite finished yet because the middle matrix is not in upper-triangular form. We now have many options but, for the sake of neatness, we will choose to simplify both the second and the third rows. For the second row, we can use the row scaling operation , which has the elementary matrices
Injecting these into equation (6) gives
By using the same approach as above, we recall that matrix multiplication is associative. Multiplying together the first pair of matrices and then multiplying together the second pair of matrices, we obtain
A similar trick can be applied to the third row, this time using the row operation . The corresponding elementary matrices are
By placing these two matrices into equation (7), we find
By again multiplying together the first pair of matrices and then, separately, the second pair of matrices, we have the equation
The leftmost matrix is still in lower-triangular form, as will be required for the LU decomposition. The middle matrix is very nearly in upper-triangular form and this can be completed with the row operation . The elementary matrices that we require are
Injecting these into equation (8) gives which then simplifies to give the desired form
We have now achieved exactly what we set out to achieve. Namely, we have written the matrix A in the form A = LU, where L is a lower-triangular matrix and U is an upper-triangular matrix. Indeed, we could (and arguably should) check that the above equation holds, which is a simple matter of completing the matrix multiplication.
The main point of understanding before the statement of Doolittle’s algorithm is that, for each elementary matrix, we are also calculating the corresponding inverse matrix. When a matrix is multiplied by its inverse, the result is the identity matrix, and when an identity matrix is multiplied by any other matrix, the result is that matrix. Doolittle’s algorithm requires both of these properties, although it is normally phrased in a way that is more technical than the presentation we give below.
Theorem: Doolittle’s Method for Calculating the LU Decomposition of a Matrix
Consider a square matrix A for which it is possible to find the LU decomposition. Also, suppose that there exist elementary, lower-triangular matrices E_1, E_2, …, E_k which place A into upper-triangular form, as shown:
Then, the lower-triangular matrix L is constructed as the product of the inverses in the following way:
It is then the case that A = LU.
There are two key points to this theorem. Firstly, the matrices E_1, E_2, …, E_k are lower triangular, which means that their inverses will also be lower triangular, and so will the product of these matrices. This guarantees that the matrix L is in lower-triangular form, as needed. The second key point is that this decomposition, if possible, does actually recover the matrix A. This is a simple matter of checking that
The most useful property in this situation is that matrix multiplication is associative, meaning that we can bracket the terms in any order that we find most helpful. In this case, we will pair together the matrices, working from the central point outward. Given that E_k⁻¹ is the inverse of E_k, we would find that E_k⁻¹E_k = I, the identity matrix. This allows equation (9) to be written in the following way: where the final line has been achieved by recalling that IA = A for any matrix A. It is now clear that, because every elementary matrix will become paired with its own inverse, which will then return the identity matrix, the next step follows exactly the same pattern:
Continuing to pair up these matrices until the final pair gives
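Written out symbolically, the pairing argument above telescopes as follows (a sketch, with E_1, …, E_k denoting the elementary matrices in the order they were applied):

```latex
\begin{aligned}
LU &= \left(E_1^{-1}E_2^{-1}\cdots E_k^{-1}\right)\left(E_k\cdots E_2E_1A\right)\\
   &= E_1^{-1}\cdots E_{k-1}^{-1}\left(E_k^{-1}E_k\right)E_{k-1}\cdots E_1A\\
   &= E_1^{-1}\cdots E_{k-1}^{-1}\,E_{k-1}\cdots E_1A\\
   &\;\;\vdots\\
   &= E_1^{-1}E_1A = A.
\end{aligned}
```

At every stage, the innermost pair cancels to the identity matrix, until only A remains.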
It is now clear why Doolittle’s algorithm works. The only reason why this approach would not work would be if one of the elementary matrices were not lower triangular. If it is not possible to create the LU decomposition without the use of non-lower-triangular matrices, then a minor adjustment is needed, which is referred to as the PLU decomposition, wherein non-lower-triangular matrices of a certain type are temporarily permitted in the calculations. For the remainder of this explainer, we will give examples where it is possible to find the LU decomposition, with the PLU decomposition being treated in a separate explainer.
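The whole procedure can be gathered into a short routine. The sketch below is one possible NumPy implementation of elimination without row swaps; it follows the common Doolittle convention in which L has a unit diagonal (the worked example above instead scales each pivot to 1, which moves those diagonal factors into L, but the two variants differ only by a diagonal matrix). The example matrix is an illustrative choice of our own:

```python
import numpy as np

def lu_doolittle(A):
    """LU decomposition without pivoting (a sketch).

    Assumes every pivot encountered is nonzero; otherwise a plain LU
    decomposition does not exist and a PLU decomposition is needed.
    Returns (L, U) with L unit lower triangular and U upper triangular.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for j in range(n - 1):
        if U[j, j] == 0:
            raise ValueError("zero pivot: a PLU decomposition is needed")
        for i in range(j + 1, n):
            # Row operation r_i -> r_i - m * r_j eliminates U[i, j];
            # the multiplier m records the inverse operation inside L.
            m = U[i, j] / U[j, j]
            U[i, :] -= m * U[j, :]
            L[i, j] = m
    return L, U

# Illustrative matrix, chosen for this sketch.
A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_doolittle(A)
```

Each recorded multiplier m is exactly the entry of the inverse elementary matrix E_ij(−m)⁻¹ = E_ij(m), so accumulating the multipliers in L is the same as forming the product of inverses in the theorem.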
Example 2: LU Decomposition for a 3 × 3 Matrix
Find the LU decomposition for the matrix
We will begin immediately by highlighting the pivot entries of A, as this will help us guide the matrix into upper-triangular form by borrowing the principles of Gauss–Jordan elimination. We have
We need to remove both of the pivots in the second and third rows, using elementary row operations to replace these entries with zeroes. A good first step would be the elementary row operation , which has the corresponding elementary matrices:
Since these matrices are inverses of each other, their product will be the identity matrix I. Introducing the product of the matrices will therefore have no effect. Applying this to the trivial equation A = IA gives
Multiplying together the second and third matrices leaves a lower-triangular matrix in the first position and puts the middle matrix into the desired form, with the pivot in the second row no longer appearing in the first column:
Although the leftmost matrix is in lower-triangular form, the middle matrix is not yet in upper-triangular form. We can move closer to this by removing the pivot in the third row with the row operation . This has the corresponding elementary matrices
As before, we place these matrices between the first and second matrices, this time of equation (10), which gives
By multiplying together the first pair of matrices and then the second pair of matrices, we obtain a new decomposition:
The first matrix is still in lower-triangular form, as we will need for the completion of the LU decomposition. However, the middle matrix is still one row operation away from being placed in upper-triangular form. The final row operation that we will need to apply is , which gives the associated elementary matrices
These matrices are injected into equation (11), giving
Finally, we multiply together the first pair of matrices and the second pair of matrices, giving the desired form:
We therefore have the following triangular matrices: such that A = LU.
As ever with linear algebra, once a technique has been mastered for a matrix of a particular order, it normally takes little extra conceptual effort to apply the same approach to matrices of different orders. This will again prove to be the case when applying Doolittle’s algorithm to a matrix of order 4 × 4. Although the calculations for a 4 × 4 matrix are more numerous, the overall method is no more difficult than in the previous examples, relying on exactly the same principles to decompose the matrix into LU form.
Example 3: LU Decomposition for a 4 × 4 Matrix
Find the LU decomposition for the matrix
The initial step is to highlight the pivot entries of each row, as shown:
The pivot in the second row is already in the correct position, given that we are aiming to turn A into a row-equivalent, upper-triangular matrix. We should instead focus on removing the pivot entries in the third and fourth rows. The pivot in the third row can be removed with the row operation , which has the corresponding elementary matrices
Given that the product of these matrices is just the identity matrix I, we can introduce them into the trivial equation A = IA without changing its validity. Thus, we obtain
Multiplying together the second and third matrices gives an equation where the middle matrix no longer has a pivot in the third row, as was our intention:
Note that the first matrix is in lower-triangular form but the middle matrix is not yet in upper-triangular form, which means that there is more work to do. Our next target will be the pivot in the fourth row, which can be removed with the row operation . The corresponding elementary matrices are