Explainer: LU Decomposition: Doolittle's Method

In this explainer, we will learn how to find the LU decomposition (factorization) of a matrix using Doolittle's method.

Despite the elegance and simplicity of the algebraic rules that govern many of its operations, linear algebra is a computationally heavy discipline, especially for those who are new to the subject. To properly understand and appreciate its benefits, a new student must carry out many of the common calculations manually before these are streamlined or replaced with more efficient methods. For some operations, such as matrix multiplication, there is little room for maneuver in terms of simplifying a calculation, and ultimately there is no escape from performing a large number of arithmetic operations.

Because of the number of computations involved in linear algebra, we are constantly motivated to find better, less burdensome methods for completing calculations. There is little advantage in completing more calculations than strictly necessary, both in terms of the time taken and the possibility of introducing a mistake at each superfluous stage of the working. Even in the modern age of computing, where calculations can be completed flawlessly and autonomously at a rate of billions per second, we are still motivated to decrease the number of calculations needed.

In linear algebra, one of the most common tricks when working with a matrix is to see whether it can be split into a form that makes subsequent calculations easier (especially matrix multiplication, exponentiation, and inversion). Among the most useful target forms are the triangular forms. In this explainer, we will show, where possible, how to take a general matrix and split it into two separate triangular matrices whose product, under matrix multiplication, is the original matrix. Although the process is less complicated than it often appears, it will help to first clarify several definitions. We begin with the notion of upper- and lower-triangular matrices, which are common objects of interest in linear algebra.

Definition: Upper- and Lower-Triangular Matrices

Consider a square matrix A. If all entries below the diagonal entries are zero, then the matrix is called "upper triangular." If all entries above the diagonal entries are zero, then the matrix is called "lower triangular."

We will demonstrate this concept with several examples. Notice that all of the following matrices are upper triangular, because all entries below the diagonal entries are zero:

$$\begin{bmatrix} 1 & 2 & -3 \\ 0 & 3 & 4 \\ 0 & 0 & 1 \end{bmatrix},\quad \begin{bmatrix} 2 & 0 & 5 & 1 \\ 0 & 5 & 0 & -2 \\ 0 & 0 & 0 & 4 \\ 0 & 0 & 0 & -1 \end{bmatrix},\quad \begin{bmatrix} 3 & 2 & 1 & 0 & -3 \\ 0 & 4 & -2 & 0 & 2 \\ 0 & 0 & 3 & 3 & -1 \\ 0 & 0 & 0 & -1 & 2 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}.$$

As shown in the middle matrix, it is perfectly valid for any of the diagonal entries to be equal to zero. The definition only requires that the entries below the diagonal all be zero; it says nothing about the value of the diagonal entries themselves or of the entries above them. A matrix is not in upper-triangular form if any entry below the diagonal is nonzero. Taking the matrices above, we could easily add some errant entries that would ruin the upper-triangular form:

$$\begin{bmatrix} 1 & 2 & -3 \\ 0 & 3 & 4 \\ 2 & 0 & 1 \end{bmatrix},\quad \begin{bmatrix} 2 & 0 & 5 & 1 \\ 0 & 5 & 0 & -2 \\ 0 & 0 & 0 & 4 \\ 0 & 0 & -3 & -1 \end{bmatrix},\quad \begin{bmatrix} 3 & 2 & 1 & 0 & -3 \\ 0 & 4 & -2 & 0 & 2 \\ 0 & 8 & 3 & 3 & -1 \\ -1 & 0 & 0 & -1 & 2 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}.$$

Lower-triangular matrices are defined in a very similar way, except we require that all entries above the diagonal entries be zero. For example, all of the following are lower-triangular matrices:

$$\begin{bmatrix} 1 & 0 \\ -1 & 4 \end{bmatrix},\quad \begin{bmatrix} 3 & 0 & 0 & 0 \\ 2 & 8 & 0 & 0 \\ 3 & 0 & -1 & 0 \\ 6 & -1 & 2 & 7 \end{bmatrix},\quad \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & -3 & 0 & 0 & 0 \\ 0 & 4 & -2 & 0 & 0 \\ 1 & 7 & 3 & -1 & 0 \\ 2 & 5 & 1 & 4 & 4 \end{bmatrix}.$$

This property can be destroyed by adding some nonzero entries above the diagonal entries. For example, none of the following matrices are lower triangular:

$$\begin{bmatrix} 1 & 4 \\ -1 & 4 \end{bmatrix},\quad \begin{bmatrix} 3 & 0 & 1 & 0 \\ 2 & 8 & 0 & 0 \\ 3 & 0 & -1 & 3 \\ 6 & -1 & 2 & 7 \end{bmatrix},\quad \begin{bmatrix} 1 & 0 & -1 & 0 & 0 \\ 1 & -3 & 0 & 0 & 0 \\ 0 & 4 & -2 & 0 & 0 \\ 1 & 7 & 3 & -1 & 0 \\ 2 & 5 & 1 & 4 & 4 \end{bmatrix}.$$
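The two definitions above translate directly into a check on matrix entries. The following is a minimal sketch in plain Python (the function names are our own, not from the explainer), applied to two of the example matrices above:

```python
def is_upper_triangular(M):
    # All entries below the diagonal (row index > column index) must be zero.
    n = len(M)
    return all(M[i][j] == 0 for i in range(n) for j in range(n) if i > j)

def is_lower_triangular(M):
    # All entries above the diagonal (row index < column index) must be zero.
    n = len(M)
    return all(M[i][j] == 0 for i in range(n) for j in range(n) if i < j)

# The second upper-triangular example above (a zero diagonal entry is allowed)
U = [[2, 0, 5, 1], [0, 5, 0, -2], [0, 0, 0, 4], [0, 0, 0, -1]]
# The second lower-triangular example above
L = [[3, 0, 0, 0], [2, 8, 0, 0], [3, 0, -1, 0], [6, -1, 2, 7]]

print(is_upper_triangular(U), is_lower_triangular(L))  # True True
```

Note that the checks say nothing about the diagonal itself, exactly as in the definition.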

The aim of this explainer is to take a general matrix A and split it into its "LU decomposition," meaning that A = LU, where L is a lower-triangular matrix and U is an upper-triangular matrix. We should say at the outset that this will not always be possible and that we may occasionally need the "PLU" decomposition, for which there is a separate explainer. Undaunted by this fact, we now begin to describe the general method that we will use.

As we develop our method for finding the LU decomposition of a matrix, it will be easiest to think in terms of elementary row operations, as this is the approach we would normally choose to manipulate the entries of a matrix, whether as part of Gauss–Jordan elimination or one of the many other methods that rely on row operations. More specifically, we will show how each type of row operation can be written in terms of a simple, invertible, lower-triangular matrix, allowing a series of row operations to be encoded exactly and in full by an accompanying series of matrices. There are three types of elementary row operations, the simplest being the row swap r_i ↔ r_j. This is the only one of the three types that we do not cover in this explainer, as it pertains to the PLU decomposition rather than the LU decomposition. For the other two types of row operations, we will define the equivalent elementary matrix. Please be aware that the literature on linear algebra has never reached a consensus on how the elementary matrices should be labeled, so we use our own terminology.

Definition: The Second Type of Elementary Row Operation and the Corresponding Elementary Matrix

Consider a matrix A and the second type of elementary row operation r_i → c r_i, where c ≠ 0, giving the row-equivalent matrix Ã. Then, we can define the corresponding "elementary" matrix

$$D_i(c) = \begin{bmatrix} 1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & c & \cdots & 0 \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & 1 \end{bmatrix},$$

which is essentially the identity matrix but with the entry in the ith row and ith column equal to c instead of 1. Then, the row-equivalent matrix Ã can be written as the matrix multiplication Ã = D_i(c) A.

This result is simple enough to demonstrate by example. Suppose that we had the square matrix

$$A = \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 1 & 2 & 5 \end{bmatrix}$$

and that we wished to perform the row operation r_3 → 5 r_3 in order to obtain the row-equivalent matrix Ã. Then, we would expect to find that

$$\tilde{A} = \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 5 & 10 & 25 \end{bmatrix}.$$

Using the definition above, we can construct the corresponding elementary matrix. This is just a copy of the 3×3 identity matrix but with the diagonal entry in the third row equal to 5 instead of 1:

$$D_3(5) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \end{bmatrix}.$$

From this, we can check that

$$\tilde{A} = D_3(5) A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \end{bmatrix} \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 1 & 2 & 5 \end{bmatrix} = \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 5 & 10 & 25 \end{bmatrix}$$

has produced the row-equivalent matrix that we were expecting. Practically speaking, the second type of row operation is not as useful as the third type, and we will make less use of it. However, given that we may choose to use this row operation, we will still explain how to calculate the inverse of D_i(c). Fortunately, the result is simple to both state and verify, so we summarize it in the following theorem without proof.

Theorem: Inverse of the Matrix D_i(c)

The inverse of the elementary matrix D_i(c) is given by the formula

$$D_i^{-1}(c) = D_i\left(\frac{1}{c}\right).$$

If we had scaled a row by a constant c and wanted to undo this operation, then we would just multiply the same row in the new matrix by the constant 1/c, thus returning the original matrix. This rationale is respected by the theorem above and can be demonstrated by example. We can check that the theorem is correct in the case of the example above by noting that the inverse of D_3(5) is the matrix D_3^{-1}(5) = D_3(1/5). We can then apply this to the row-equivalent matrix Ã to recover the original matrix A:

$$D_3\left(\frac{1}{5}\right)\tilde{A} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{5} \end{bmatrix} \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 5 & 10 & 25 \end{bmatrix} = \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 1 & 2 & 5 \end{bmatrix} = A.$$

We could equivalently have checked directly that D_3(1/5) D_3(5) = D_3(5) D_3(1/5) = I_3, where I_3 is the 3×3 identity matrix.
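The action of D_i(c) and its inverse can also be checked numerically. Below is a small sketch in plain Python (helper names are our own; indices are 1-based to match the text), reproducing the r_3 → 5 r_3 example:

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def D(i, c, n):
    # Identity matrix with entry (i, i) replaced by c (1-based index).
    M = identity(n)
    M[i - 1][i - 1] = c
    return M

A = [[3, 2, 1], [0, -1, 3], [1, 2, 5]]
A_tilde = matmul(D(3, 5, 3), A)              # the row operation r3 -> 5*r3
print(A_tilde)                               # [[3, 2, 1], [0, -1, 3], [5, 10, 25]]
recovered = matmul(D(3, 1 / 5, 3), A_tilde)  # D_3(1/5) undoes the scaling
```

Here `recovered` equals A, up to the usual floating-point rounding introduced by the division 1/5.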

To complete the LU decomposition of a matrix, the most important type of row operation is the third type, defined by r_i → r_i + c r_j. Indeed, it is possible to complete the LU decomposition of a matrix without using the second type of row operation described above, but it would generally be impossible to do so without the third type. This importance is easy to appreciate, given that the third type of row operation is the one used most commonly in any process involving elementary row operations. For example, there would generally be little hope of completing Gauss–Jordan elimination without license to use the third type of row operation. With this in mind, we now define the third type of elementary matrix.

Definition: The Third Type of Elementary Row Operation and the Corresponding Elementary Matrix

Consider a matrix A and the third type of elementary row operation r_i → r_i + c r_j, giving the row-equivalent matrix Ã. Then, we can define the corresponding "elementary" matrix

$$E_{ij}(c) = \begin{bmatrix} 1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots & & \vdots \\ 0 & \cdots & 1 & \cdots & 0 & \cdots & 0 \\ \vdots & & \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & c & \cdots & 1 & \cdots & 0 \\ \vdots & & \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & 0 & \cdots & 1 \end{bmatrix},$$

which is essentially the identity matrix but with the entry in the ith row and jth column equal to c instead of 0. Then, the row-equivalent matrix Ã can be written as the matrix multiplication Ã = E_{ij}(c) A.

We will demonstrate this in relation to the example above, where we had the matrix

$$A = \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 1 & 2 & 5 \end{bmatrix}.$$

If we wanted to apply the row operation r_2 → r_2 + 3 r_1, then we would expect the resulting row-equivalent matrix

$$\tilde{A} = \begin{bmatrix} 3 & 2 & 1 \\ 9 & 5 & 6 \\ 1 & 2 & 5 \end{bmatrix}.$$

Because of the definition above, we know that we can encode the elementary row operation r_2 → r_2 + 3 r_1 by the elementary matrix E_{21}(3). This matrix is identical to the identity matrix I_3, except that the entry in the second row and first column has the value 3 instead of 0. This matrix therefore has the form

$$E_{21}(3) = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

We can verify that this matrix recovers the row-equivalent matrix Ã by matrix multiplication:

$$\tilde{A} = E_{21}(3) A = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 1 & 2 & 5 \end{bmatrix} = \begin{bmatrix} 3 & 2 & 1 \\ 9 & 5 & 6 \\ 1 & 2 & 5 \end{bmatrix}.$$

With the second type of row operation, we said that it was important to understand both the matrix itself and the corresponding inverse matrix, so that we can complete the LU decomposition. The same is true of the third type of row operation described in the definition above. Even though the third type of row operation is more complicated than the second type, its inverse is no more difficult to express.

Theorem: Inverse of the Matrix E_ij(c)

The inverse of the matrix E_{ij}(c) is given by the formula

$$E_{ij}^{-1}(c) = E_{ij}(-c).$$

Suppose that we had taken a matrix and performed the elementary row operation r_i → r_i + c r_j. If we wanted to undo this effect, then we would use the row operation r_i → r_i − c r_j on the new matrix, hence returning the original matrix. This understanding is totally encapsulated by the statement of the theorem above, for which we now provide a check. Since we used E_{21}(3) in the example above, the inverse matrix is E_{21}^{-1}(3) = E_{21}(-3). We can then check that

$$E_{21}^{-1}(3)\tilde{A} = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 2 & 1 \\ 9 & 5 & 6 \\ 1 & 2 & 5 \end{bmatrix} = \begin{bmatrix} 3 & 2 & 1 \\ 0 & -1 & 3 \\ 1 & 2 & 5 \end{bmatrix} = A.$$
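The matrix E_ij(c) and the inverse rule E_ij(c)^-1 = E_ij(-c) can be checked in the same numerical style. A sketch in plain Python (helper names our own, 1-based indices), reproducing the r_2 → r_2 + 3 r_1 example:

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def E(i, j, c, n):
    # Identity with entry (i, j) set to c; left-multiplying applies r_i -> r_i + c*r_j.
    M = identity(n)
    M[i - 1][j - 1] = c
    return M

A = [[3, 2, 1], [0, -1, 3], [1, 2, 5]]
A_tilde = matmul(E(2, 1, 3, 3), A)           # r2 -> r2 + 3*r1
print(A_tilde)                               # [[3, 2, 1], [9, 5, 6], [1, 2, 5]]
print(matmul(E(2, 1, -3, 3), A_tilde) == A)  # True: E_21(-3) undoes E_21(3)
```

Because the entries here are integers, the round trip recovers A exactly.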

Now that we have discussed the necessary ingredients for LU decomposition, we will define this concept and then show how it can be achieved. We will do so initially by way of example, briefly justifying each step that we take. Many of the techniques are ostensibly similar to those used in Gauss–Jordan elimination and other related methods. However, the LU decomposition focuses on achieving an upper-triangular form, whereas Gauss–Jordan elimination focuses on achieving an echelon form. These two concepts are similar but not identical, and when completing the LU decomposition we will in fact have fewer options than we generally do for Gauss–Jordan elimination.

Definition: The LU Decomposition of a Matrix

Consider a matrix A. Then, the "LU decomposition" requires finding a lower-triangular matrix L and an upper-triangular matrix U such that LU = A. This is sometimes referred to as the "LU factorization" of a matrix.

We will demonstrate this with the simplest possible example of a 2×2 matrix. We take the matrix

$$A = \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix}$$

and task ourselves with finding a lower-triangular matrix L and an upper-triangular matrix U such that A = LU. We begin with the familiar method of first identifying the pivot entries of A, which are the first nonzero entries of each row: here, the 3 in the first row and the 1 in the second row.

After having practiced Gauss–Jordan elimination, our first instinct might be to simplify A, which in this case involves removing the factor of 3 that appears in every entry of the top row. To do this, we would use the elementary row operation r_1 → (1/3) r_1, which has the corresponding elementary matrix

$$D_1\left(\frac{1}{3}\right) = \begin{bmatrix} \frac{1}{3} & 0 \\ 0 & 1 \end{bmatrix}.$$

We will also state the inverse of this matrix:

$$D_1^{-1}\left(\frac{1}{3}\right) = D_1(3) = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.$$

Given that these two matrices are inverses of each other, they must obey the relationship D_1(3) D_1(1/3) = I_2, where I_2 is the 2×2 identity matrix. This is certainly the case, which means that we can write this equation in full:

๏”3001๏ ๏™13001๏ฅ=๏”1001๏ .(1)

It is this statement that allows us to begin finding the LU decomposition of A. Given that I_2 A = A for any matrix A of order 2×2, we can write

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix}.$$

The reason for writing this equation is explained by equation (1), which we duly substitute into the equation immediately above, giving

$$\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix}.$$

Matrix multiplication is associative, which means that we can combine the second and third matrices using the operation of matrix multiplication. This gives precisely the desired effect, as the new middle matrix is the same as the original matrix but with the constant factor of 3 now removed from the first row:

๏”3001๏ ๏”1โˆ’314๏ =๏”3โˆ’914๏ .(2)

The leftmost matrix is a lower-triangular matrix, as all of the entries above the diagonal are zero; in a 2×2 matrix, this leaves only the entry in the top right, which is indeed zero. Unfortunately, the middle matrix is not yet in upper-triangular form, so we have not yet found our LU decomposition. Inspecting the pivot entries once again shows that we can remove the pivot in the second row to turn the middle matrix into an upper-triangular matrix. This is achieved with the row operation r_2 → r_2 − r_1. The corresponding elementary matrix is

$$E_{21}(-1) = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix},$$

which has the inverse elementary matrix

$$E_{21}^{-1}(-1) = E_{21}(1) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}.$$

Once again, these two elementary matrices are multiplicative inverses of each other, so we have the relationship E_{21}^{-1}(-1) E_{21}(-1) = I_2. Written out in long form, this is

๏”1011๏ ๏”10โˆ’11๏ =๏”1001๏ .(3)

We can take equation (2) and insert the identity matrix as shown, without changing the validity of the equation:

$$\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & -3 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix}.$$

Substituting equation (3) then gives

$$\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \left( \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} \right) \begin{bmatrix} 1 & -3 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix}.$$

We once again rely on the associativity of matrix multiplication, which allows us to group the calculations in any way we prefer. In this situation, it is best to group them as shown:

$$\left( \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \right) \left( \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & -3 \\ 1 & 4 \end{bmatrix} \right) = \begin{bmatrix} 3 & -9 \\ 1 & 4 \end{bmatrix}.$$

Then, completing the two matrix multiplications gives the required decomposition:

๏”3011๏ ๏”1โˆ’307๏ =๏”3โˆ’914๏ .(4)

We have therefore obtained the two matrices

$$L = \begin{bmatrix} 3 & 0 \\ 1 & 1 \end{bmatrix}, \quad U = \begin{bmatrix} 1 & -3 \\ 0 & 7 \end{bmatrix},$$

such that LU = A. It is easily checked that L is a lower-triangular matrix and that U is an upper-triangular matrix, and the final confirmation of the result can be completed by checking that equation (4) is true.
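That final confirmation is a one-line numerical check; a short plain-Python sketch (the matmul helper is our own):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

L = [[3, 0], [1, 1]]
U = [[1, -3], [0, 7]]
print(matmul(L, U))  # [[3, -9], [1, 4]] -- the original matrix A
```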

Once practiced, the calculation of an LU decomposition would normally be completed without much of the supporting explanation given in the above example. Although elementary row operations can be used to manipulate a matrix in many ways, obtaining an LU decomposition requires that we only ever produce lower-triangular matrices on the leftmost side of our equations. This means that our elementary matrices (and hence our elementary row operations) must be chosen accordingly, as it would be counterproductive to use elementary row operations that incur a non-lower-triangular form in the leftmost terms. This is not as challenging as it might first seem, and the criteria are essentially those described by Doolittle's algorithm, which we will discuss in more detail later in this explainer.

Example 1: LU Decomposition for a 3 × 3 Matrix

Find the LU decomposition for the matrix

$$A = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

Answer

We begin by identifying the pivot entries of the matrix A, which are the first nonzero entries of each row:

$$A = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

Our aim is to move the pivot entries to the right, hence moving closer toward an upper-triangular form for A. There are infinitely many ways to achieve an upper-triangular form, but we will begin by removing the pivot in the second row. Thinking purely in terms of elementary row operations, we could use r_2 → r_2 − 2 r_1 to achieve this. In terms of the corresponding matrices, we require the elementary matrix (and its inverse)

$$E_{21}(-2) = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_{21}^{-1}(-2) = E_{21}(2) = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Given that the two matrices above are multiplicative inverses of each other, the product E_{21}^{-1}(-2) E_{21}(-2) necessarily returns the 3×3 identity matrix. This means that we can multiply on the left-hand side of A by this combined expression without changing A overall (since the identity matrix preserves any matrix under matrix multiplication). Thus, we could write

$$\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix},$$

which is essentially just the same as writing the equation A = A. However, once we combine the second and third matrices by matrix multiplication, we find that we have obtained the desired row-equivalent matrix in the middle position:

๏˜100210001๏ค๏˜1830โˆ’9โˆ’6301๏ค=๏˜183270301๏ค.(5)

The leftmost term is a lower-triangular matrix, but the middle matrix is not yet in upper-triangular form, so we still have some work to do. Suppose now that we wish to remove the pivot entry in the third row, to help move the middle matrix toward upper-triangular form. One option is the row operation r_3 → r_3 − 3 r_1, which has the equivalent elementary matrices

$$E_{31}(-3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix}, \quad E_{31}^{-1}(-3) = E_{31}(3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

We can now inject both of these matrices between the left and middle matrices of equation (5). Overall, this has no effect on the validity of the equation, because the two inserted matrices are each other's inverse. We obtain

$$\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix} \right) \begin{bmatrix} 1 & 8 & 3 \\ 0 & -9 & -6 \\ 3 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

Because matrix multiplication is associative, we could equally write this as

$$\left( \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix} \right) \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 8 & 3 \\ 0 & -9 & -6 \\ 3 & 0 & 1 \end{bmatrix} \right) = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

If we multiply together the first pair of matrices and then multiply together the second pair of matrices, we obtain the alternative form

๏˜100210301๏ค๏˜1830โˆ’9โˆ’60โˆ’24โˆ’8๏ค=๏˜183270301๏ค.(6)

The leftmost matrix is still in lower-triangular form, which is necessary for the LU decomposition to be complete. However, we have not quite finished, because the middle matrix is not in upper-triangular form. We now have many options but, for the sake of neatness, we will choose to simplify both the second and the third rows. For the second row, we can use the row scaling operation r_2 → −(1/3) r_2, which has the elementary matrices

$$D_2\left(-\frac{1}{3}\right) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\frac{1}{3} & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad D_2^{-1}\left(-\frac{1}{3}\right) = D_2(-3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Injecting these into equation (6) gives

$$\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\frac{1}{3} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 8 & 3 \\ 0 & -9 & -6 \\ 0 & -24 & -8 \end{bmatrix} = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

By using the same approach as above, we recall that matrix multiplication is associative. Multiplying together the first pair of matrices and then multiplying together the second pair of matrices, we obtain

๏˜1002โˆ’30301๏ค๏˜1830320โˆ’24โˆ’8๏ค=๏˜183270301๏ค.(7)

A similar trick can be applied to the third row, this time using the row operation r_3 → −(1/8) r_3. The corresponding elementary matrices are

$$D_3\left(-\frac{1}{8}\right) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -\frac{1}{8} \end{bmatrix}, \quad D_3^{-1}\left(-\frac{1}{8}\right) = D_3(-8) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -8 \end{bmatrix}.$$

By placing these two matrices into equation (7), we find

$$\begin{bmatrix} 1 & 0 & 0 \\ 2 & -3 & 0 \\ 3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -8 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -\frac{1}{8} \end{bmatrix} \begin{bmatrix} 1 & 8 & 3 \\ 0 & 3 & 2 \\ 0 & -24 & -8 \end{bmatrix} = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

By again multiplying together the first pair of matrices and then, separately, the second pair of matrices, we have the equation

๏˜1002โˆ’3030โˆ’8๏ค๏˜183032031๏ค=๏˜183270301๏ค.(8)

The leftmost matrix is still in lower-triangular form, as required for the LU decomposition. The middle matrix is very nearly in upper-triangular form, and this can be completed with the row operation r_3 → r_3 − r_2. The elementary matrices that we require are

$$E_{32}(-1) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}, \quad E_{32}^{-1}(-1) = E_{32}(1) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}.$$

Injecting these into equation (8) gives

$$\begin{bmatrix} 1 & 0 & 0 \\ 2 & -3 & 0 \\ 3 & 0 & -8 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 8 & 3 \\ 0 & 3 & 2 \\ 0 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix},$$

which then simplifies to give the desired form

$$\begin{bmatrix} 1 & 0 & 0 \\ 2 & -3 & 0 \\ 3 & -8 & -8 \end{bmatrix} \begin{bmatrix} 1 & 8 & 3 \\ 0 & 3 & 2 \\ 0 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 8 & 3 \\ 2 & 7 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

We have now achieved exactly what we set out to achieve. Namely, we have written the matrix A in the form LU = A, where L is a lower-triangular matrix and U is an upper-triangular matrix. Indeed, we could (and arguably should) check that the above equation holds, which is a simple matter of completing the matrix multiplication.
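That check can be carried out in a few lines of plain Python (the matmul helper is our own):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

L = [[1, 0, 0], [2, -3, 0], [3, -8, -8]]
U = [[1, 8, 3], [0, 3, 2], [0, 0, -1]]
print(matmul(L, U))  # [[1, 8, 3], [2, 7, 0], [3, 0, 1]] -- the original A
```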

The main point to understand before the statement of Doolittle's algorithm is that, for each elementary matrix, we are also calculating the corresponding inverse matrix. When a matrix is multiplied by its inverse, the result is the identity matrix, and when the identity matrix is multiplied by any other matrix, the result is that matrix. Doolittle's algorithm relies on both of these properties, and it is normally phrased in a way that is more technical than the presentation we give below.

Theorem: Doolittle's Method for Calculating the LU Decomposition of a Matrix

Consider a square matrix A for which it is possible to find the LU decomposition. Also, suppose that there exist elementary, lower-triangular matrices F_1, F_2, …, F_n which place A into upper-triangular form, as shown:

$$U = F_n F_{n-1} \cdots F_2 F_1 A.$$

Then, the lower-triangular matrix is constructed as the product of the inverses in the following way:

$$L = F_1^{-1} F_2^{-1} \cdots F_{n-1}^{-1} F_n^{-1}.$$

It is then the case that LU = A.

There are two key points to this theorem. Firstly, the matrices F_k are lower triangular, which means that the inverses F_k^{-1} are also lower triangular, as is their product. This guarantees that the matrix L = F_1^{-1} F_2^{-1} \cdots F_{n-1}^{-1} F_n^{-1} is in lower-triangular form, as needed. The second key point is that this decomposition, if possible, does actually recover the matrix A. This is a simple matter of checking that

$$A = LU = \left( F_1^{-1} F_2^{-1} \cdots F_{n-1}^{-1} F_n^{-1} \right) \left( F_n F_{n-1} \cdots F_2 F_1 A \right). \tag{9}$$

The most useful property in this situation is that matrix multiplication is associative, meaning that we can bracket the terms in whichever order we find most helpful. In this case, we pair the matrices working from the central point outward. Given that F_n^{-1} is the inverse of F_n, we have F_n^{-1} F_n = I_n, the n×n identity matrix. This allows equation (9) to be written in the following way:

$$A = LU = F_1^{-1} F_2^{-1} \cdots F_{n-1}^{-1} \left( F_n^{-1} F_n \right) F_{n-1} \cdots F_2 F_1 A = F_1^{-1} F_2^{-1} \cdots F_{n-1}^{-1} I_n F_{n-1} \cdots F_2 F_1 A = F_1^{-1} F_2^{-1} \cdots F_{n-1}^{-1} F_{n-1} \cdots F_2 F_1 A,$$

where the final expression has been obtained by recalling that I_n M = M I_n = M for any matrix M. Because each elementary matrix becomes paired with its own inverse, which then returns the identity matrix, the next step follows exactly the same pattern:

$$A = F_1^{-1} F_2^{-1} \cdots \left( F_{n-1}^{-1} F_{n-1} \right) \cdots F_2 F_1 A = F_1^{-1} F_2^{-1} \cdots I_n \cdots F_2 F_1 A = F_1^{-1} F_2^{-1} \cdots F_2 F_1 A.$$

Continuing to pair up these matrices until the final pair gives

$$A = F_1^{-1} F_1 A = I_n A = A.$$

It is now clear why Doolittle's algorithm works. The only way this approach could fail is if one of the elementary matrices F_k^{-1} were not lower triangular. If it is not possible to create the LU decomposition without the use of non-lower-triangular matrices, then a minor adjustment is needed, referred to as the PLU decomposition, wherein non-lower-triangular matrices of a certain type are temporarily permitted in the calculations. For the remainder of this explainer, we will give examples where it is possible to find the LU decomposition, with the PLU decomposition being treated in a separate explainer.
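The theorem can be turned into a working procedure. The sketch below is one common way to implement Doolittle's method in plain Python (the function name and conventions are our own): it uses only row operations of the third type, recording each multiplier directly in L, so it produces a unit lower-triangular L (ones on the diagonal). These factors may differ from ones found by hand with row scaling, but their product still recovers A. The sketch deliberately fails on a zero pivot, which is the situation that calls for the PLU decomposition.

```python
def lu_doolittle(A):
    n = len(A)
    U = [row[:] for row in A]                      # working copy; becomes U
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    for j in range(n):
        if U[j][j] == 0:
            raise ValueError("zero pivot: LU without row swaps fails; use PLU")
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]                  # multiplier for r_i -> r_i - m*r_j
            L[i][j] = m                            # the inverse operation is recorded in L
            for k in range(n):
                U[i][k] -= m * U[j][k]
    return L, U

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 8, 3], [2, 7, 0], [3, 0, 1]]              # the matrix from Example 1
L, U = lu_doolittle(A)
P = matmul(L, U)
print(all(abs(P[i][j] - A[i][j]) < 1e-9 for i in range(3) for j in range(3)))  # True
```

Storing the multipliers in L is exactly the product of inverse elementary matrices from the theorem, computed without ever forming those matrices explicitly.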

Example 2: LU Decomposition for a 3 × 3 Matrix

Find the LU decomposition for the matrix

$$A = \begin{bmatrix} 1 & 0 & 2 \\ -4 & 1 & 1 \\ 3 & 2 & 3 \end{bmatrix}.$$

Answer

We will begin immediately by noting the pivot entries of A, as these will help us guide the matrix into upper-triangular form by borrowing the principles from Gauss–Jordan elimination. We have

$$A = \begin{bmatrix} 1 & 0 & 2 \\ -4 & 1 & 1 \\ 3 & 2 & 3 \end{bmatrix}.$$

We need to remove both of the pivots in the second and third rows, using elementary row operations to replace these entries with zeroes. A good first step is the elementary row operation r_2 → r_2 + 4 r_1, which has the corresponding elementary matrices

$$E_{21}(4) = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_{21}^{-1}(4) = E_{21}(-4) = \begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Since these matrices are inverses of each other, their product will be the 3×3 identity matrix I_3. Introducing the product E_{21}^{-1}(4) E_{21}(4) therefore has no effect. Applying this to the trivial equation A = A gives

$$\begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ -4 & 1 & 1 \\ 3 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 2 \\ -4 & 1 & 1 \\ 3 & 2 & 3 \end{bmatrix}.$$

Multiplying together the second and third matrices leaves a lower-triangular matrix on the left and places the middle matrix in the desired form, with the pivot of the second row no longer appearing in the first column:

๏˜100โˆ’410001๏ค๏˜102019323๏ค=๏˜102โˆ’411323๏ค.(10)

Although the leftmost matrix is in lower-triangular form, the middle matrix is not yet in upper-triangular form. We can move closer to this by removing the pivot in the third row with the row operation r_3 → r_3 − 3 r_1. This has the corresponding elementary matrices

$$E_{31}(-3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix}, \quad E_{31}^{-1}(-3) = E_{31}(3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix}.$$

As before, we place these matrices between the first and second matrices, this time of equation (10), which gives

$$\begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 9 \\ 3 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 2 \\ -4 & 1 & 1 \\ 3 & 2 & 3 \end{bmatrix}.$$

By multiplying together the first pair of matrices and then the second pair of matrices, we obtain a new decomposition:

๏˜100โˆ’410301๏ค๏˜10201902โˆ’3๏ค=๏˜102โˆ’411323๏ค.(11)

The first matrix is still in lower-triangular form, as we need for the completion of the LU decomposition. However, the middle matrix is still one row operation away from upper-triangular form. The final row operation that we need is r_3 → r_3 − 2 r_2, which gives the associated elementary matrices

$$E_{32}(-2) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}, \quad E_{32}^{-1}(-2) = E_{32}(2) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}.$$

These matrices are injected into equation (11), giving

$$\begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 9 \\ 0 & 2 & -3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 2 \\ -4 & 1 & 1 \\ 3 & 2 & 3 \end{bmatrix}.$$

Finally, we multiply together the first pair of matrices and the second pair of matrices, giving the desired form:

$$\begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 3 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 9 \\ 0 & 0 & -21 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 2 \\ -4 & 1 & 1 \\ 3 & 2 & 3 \end{bmatrix}.$$

We therefore have the following triangular matrices:

$$L = \begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 3 & 2 & 1 \end{bmatrix}, \quad U = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 9 \\ 0 & 0 & -21 \end{bmatrix},$$

such that A = LU.
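This example also lets us check the theorem's formula for L directly: here the row operations used were F_1 = E_21(4), F_2 = E_31(-3), and F_3 = E_32(-2), so L should equal the product of inverses E_21(-4) E_31(3) E_32(2). A plain-Python sketch (helper names our own, 1-based indices):

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def E(i, j, c, n):
    # Identity with entry (i, j) set to c (1-based indices).
    M = identity(n)
    M[i - 1][j - 1] = c
    return M

# L = F1^-1 F2^-1 F3^-1 = E_21(-4) E_31(3) E_32(2)
L = matmul(matmul(E(2, 1, -4, 3), E(3, 1, 3, 3)), E(3, 2, 2, 3))
print(L)  # [[1, 0, 0], [-4, 1, 0], [3, 2, 1]] -- matching the factor found above
```

Notably, the product simply collects each multiplier into its own position of L, which is what makes Doolittle's method so economical in practice.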

As ever with linear algebra, once a technique has been mastered for a matrix of a particular order, it normally takes little extra conceptual effort to apply the same approach to matrices of other orders. This again proves to be the case when applying Doolittle's algorithm to a matrix of order 4×4. Although the calculations for a 4×4 matrix are more numerous, the overall method is no more difficult than in the previous examples, relying on exactly the same principles to decompose the matrix into LU form.

Example 3: LU Decomposition for a 4 × 4 Matrix

Find the LU decomposition for the matrix

$$A = \begin{bmatrix} 1 & 0 & 1 & -1 \\ 0 & 2 & 0 & -2 \\ 1 & 2 & 2 & 0 \\ 3 & 2 & -1 & 1 \end{bmatrix}.$$

Answer

The initial step is to identify the pivot entries of each row, as shown:

$$A = \begin{bmatrix} 1 & 0 & 1 & -1 \\ 0 & 2 & 0 & -2 \\ 1 & 2 & 2 & 0 \\ 3 & 2 & -1 & 1 \end{bmatrix}.$$

The pivot in the second row is already in the correct position, given that we are aiming to turn A into a row-equivalent, upper-triangular matrix. We should instead focus on removing the pivot entries in the third and fourth rows. The pivot in the third row can be removed with the row operation r_3 → r_3 − r_1, which has the corresponding elementary matrices

$$E_{31}(-1) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad E_{31}^{-1}(-1) = E_{31}(1) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

Given that the product of these matrices is just the identity matrix I_4, we can introduce them into the trivial equation A = A without changing its validity. Thus, we obtain

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 & -1 \\ 0 & 2 & 0 & -2 \\ 1 & 2 & 2 & 0 \\ 3 & 2 & -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 & -1 \\ 0 & 2 & 0 & -2 \\ 1 & 2 & 2 & 0 \\ 3 & 2 & -1 & 1 \end{bmatrix}.$$

Multiplying together the second and third matrices gives an equation where the middle matrix no longer has a pivot in the third row, as was our intention:

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 & -1 \\ 0 & 2 & 0 & -2 \\ 0 & 2 & 1 & 1 \\ 3 & 2 & -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 & -1 \\ 0 & 2 & 0 & -2 \\ 1 & 2 & 2 & 0 \\ 3 & 2 & -1 & 1 \end{bmatrix}. \tag{12}$$

Note that the first matrix is in lower-triangular form but the middle matrix is not yet in upper-triangular form, which means that there is more work to do. Our next target is the pivot in the fourth row, which can be removed with the row operation r_4 → r_4 − 3 r_1. The corresponding elementary matrices are

$$E_{41}(-3) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -3 & 0 & 0 & 1 \end{bmatrix}, \quad E_{41}^{-1}(-3) = E_{41}(3) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 3 & 0 & 0 & 1 \end{bmatrix}.$$