In this explainer, we will learn how to solve a system of three linear equations using the inverse of the matrix of coefficients.
There are a number of perspectives from which linear algebra can be fruitfully and consistently viewed. One such perspective is to understand matrices as a way of encoding information about how vectors are transformed through space, which provides an algebraic understanding of how we morph points, lines, planes, and higher-dimensional objects. Another perspective is that linear algebra is one manifestation of a more general idea known as vector spaces, wherein attractive abstract properties are used to define many systems that share algebraic properties with linear algebra, conventional algebra, and many other areas of mathematics.
Out of all the possible perspectives that we can have on linear algebra, arguably there is one which contributed most to the development of the subject, at least in the initial stages. Mathematicians throughout history have always had a peculiar attraction towards solving equations. Following the popularization of algebra, it soon became interesting for mathematicians to study simultaneous equations, where two variables must be solved for in tandem. This idea generalizes in a natural way to systems of equations with many unknown variables, which is where we begin to witness the versatility and power of linear algebra in driving the historical developments within mathematics, offering an increasingly well-honed and varied toolkit.
Our goal in this explainer will be to solve systems of linear equations by using our understanding of the inverse of a square matrix. As we will see, any system of linear equations can be expressed strictly in terms of matrices, meaning that we can use our understanding of linear algebra to solve these.
Definition: Matrix Form of a System of Linear Equations
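Consider a general system of $m$ linear equations in the variables $x_1, x_2, \ldots, x_n$ and the coefficients $a_{ij}$:
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\ &\;\;\vdots\\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m. \end{aligned}$$
Then we define the coefficient matrix and the two vectors
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \quad \mathbf{u} = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix}, \quad \mathbf{v} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{pmatrix}.$$
Then the system of linear equations can be encapsulated by the matrix equation $A\mathbf{u} = \mathbf{v}$, which in long form would be written as
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{pmatrix}.$$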
We will see later that this way of using matrix multiplication is a very useful approach for expressing a system of linear equations, as it allows us to use any tool from the vast mathematical toolkit of linear algebra. Let us consider, for example, the system of linear equations
At this stage, we would probably be tempted to solve this system of linear equations by using any of the known, standard methods for solving simultaneous equations of two variables. However, we choose to follow the definition above and we express the above system of linear equations by constructing the matrices
The system of linear equations can then be expressed as . In long form this is
It might understandably seem that this representation has not helped us in the slightest, which is where the concept of the matrix inverse can be brought in to provide support. Suppose that we were to construct the matrix inverse of $A$ using the well-known formula for the inverse of a $2 \times 2$ matrix. For a general $2 \times 2$ matrix
$$A = \begin{pmatrix} a & b\\ c & d \end{pmatrix}$$
with $ad - bc \neq 0$, the inverse matrix is
$$A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b\\ -c & a \end{pmatrix}.$$
For our matrix $A$, we would find that
Suppose now that we were to multiply both sides of equation (2) on the left by the matrix inverse $A^{-1}$. We would obtain
Since matrix multiplication is associative, we can group the terms on the left-hand side in the following order:
As if by magic, we find that the term within the curved brackets is just a copy of the identity matrix. Completing this matrix multiplication gives allowing us to take the scaling constant within the matrix to find
Completing the final matrix multiplication on the left-hand side gives
We now have an expression for and in terms of one final matrix multiplication. Carrying this out gives
We can check that and are the only two values which solve the system of linear equations in (1), confirming that we have solved the problem.
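Rather than focus on the specific problem above, we can show that this method can be applied to a general system of linear equations, provided that a few conditions are met. Suppose that we have a linear system where there are as many equations as there are unknown variables. In other words, we have
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\ &\;\;\vdots\\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n. \end{aligned}$$
Then define the matrix and two vectors
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}, \quad \mathbf{u} = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix}, \quad \mathbf{v} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{pmatrix}.$$
Then the system of linear equations can be described by the matrix equation
$$A\mathbf{u} = \mathbf{v},$$
where $A$ is a square matrix. It is crucial that $A$ be a square matrix, as the multiplicative inverse is not defined for non-square matrices. Now supposing that $A^{-1}$ exists, we can multiply both sides of the above equation by it on the left, giving
$$A^{-1}(A\mathbf{u}) = A^{-1}\mathbf{v}.$$
Matrix multiplication is associative, meaning that we can write
$$(A^{-1}A)\mathbf{u} = A^{-1}\mathbf{v}.$$
By definition, we have that $A^{-1}A = I_n$, where $I_n$ is the $n \times n$ identity matrix. This implies that
$$I_n\mathbf{u} = A^{-1}\mathbf{v}.$$
We also know that the identity matrix leaves a matrix unchanged when combined under the operation of matrix multiplication. This allows the final simplification
$$\mathbf{u} = A^{-1}\mathbf{v}.$$
If our goal was to find the vector $\mathbf{u}$, it has clearly been achieved in the above equation. We will now demonstrate how this method can be applied to other systems of linear equations.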
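The general recipe can be sketched in a few lines of Python for the two-by-two case. This is only an illustration under stated assumptions: the system $2x + y = 4$, $5x + 3y = 11$ is a hypothetical example chosen here (it is not one of the systems in this explainer), and the helper names `inverse_2x2` and `mat_vec` are made up for the sketch. Exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Invert a 2x2 matrix [[a, b], [c, d]] using 1/(ad - bc) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    k = Fraction(1, det)
    return [[k * d, -k * b], [-k * c, k * a]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# Hypothetical system (an assumption for illustration):
#   2x +  y = 4
#   5x + 3y = 11
A = [[2, 1], [5, 3]]
v = [4, 11]

A_inv = inverse_2x2(A)   # the inverse of the coefficient matrix
u = mat_vec(A_inv, v)    # the solution vector (x, y) = A^{-1} v

# Substitute back into the original equations as a check.
assert mat_vec(A, u) == v
```

Substituting the result back into $A\mathbf{u}$, as the last line does, mirrors the check against the original equations that closes each worked example.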
Example 1: Using the Inverse Matrix to Solve a System of Linear Equations
Solve the system of linear equations using the inverse of a matrix.
We create the matrices corresponding to the system of linear equations above. If we assign then the problem can be equivalently encoded by the matrix problem
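More succinctly, we could write
$$A\mathbf{u} = \mathbf{v}.$$
Our goal will be to solve this equation for $\mathbf{u}$, given that this vector contains the two variables that we would like to find. Assuming that the inverse $A^{-1}$ exists, we could multiply both sides of the equation by it on the left, giving
$$A^{-1}(A\mathbf{u}) = A^{-1}\mathbf{v}.$$
Given that matrix multiplication is associative, this statement is equivalent to saying that
$$(A^{-1}A)\mathbf{u} = A^{-1}\mathbf{v}.$$
By definition, we know that $A^{-1}A = I_2$, where $I_2$ is the $2 \times 2$ identity matrix, giving
$$I_2\mathbf{u} = A^{-1}\mathbf{v}.$$
The identity matrix will leave the vector $\mathbf{u}$ unchanged when combined under matrix multiplication, as $I_2\mathbf{u} = \mathbf{u}$. This implies that
$$\mathbf{u} = A^{-1}\mathbf{v}. \tag{3}$$
We now have a formula for $\mathbf{u}$, provided that we can find $A^{-1}$. To do this we use the expression for the inverse of a $2 \times 2$ matrix:
$$\begin{pmatrix} a & b\\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b\\ -c & a \end{pmatrix}, \quad ad - bc \neq 0.$$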
Given that we have
We can now use equation (3) to find :
Earlier we purposefully defined
We therefore find that and , as we can check in the original equations.
It might, at this stage, seem as though the method we have just presented is an overly convoluted way of finding the solution to a system of linear equations where there are two equations and two unknowns. Normally we would prefer to use a familiar and simpler technique for solving simultaneous equations of this type. The significance of the matrix inversion method is easier to understand when working with matrices of order $3 \times 3$ and above. Also, the inverse of a square matrix is something that we would likely take an independent interest in, so we may well calculate it regardless of the particular problem that we are trying to solve.
There is a subtler point that we must also consider. Suppose that, in the example above, the system of linear equations was exactly the same except for the quantities on the right-hand side of both equations, as encoded by the vector $\mathbf{v}$. In order to solve the problem, we would still need to find the inverse matrix $A^{-1}$ and complete the calculation $A^{-1}\mathbf{v}$. In this sense, finding the inverse matrix is a task that, once done, solves the system of linear equations for any vector $\mathbf{v}$. Furthermore, it may not be possible to find $A^{-1}$ at all, because the matrix $A$ has a determinant of zero. In this situation the method fails whatever the value of $\mathbf{v}$: the system then has either no solution or infinitely many solutions, and never a unique one.
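Both observations can be illustrated with a short Python sketch. The coefficient matrix, the right-hand sides, and the helper names below are hypothetical choices made for this illustration and do not come from the examples in this explainer. The inverse is computed once and then reused for several right-hand sides, and a matrix with determinant zero is shown to have no inverse.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix, or None if its determinant is zero."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        return None
    k = Fraction(1, det)
    return [[k * d, -k * b], [-k * c, k * a]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# Hypothetical coefficient matrix; its inverse is computed only once.
A = [[2, 1], [5, 3]]
A_inv = inverse_2x2(A)

# The same inverse solves the system for every right-hand side.
right_hand_sides = [[4, 11], [1, 0], [0, 1]]
solutions = [mat_vec(A_inv, v) for v in right_hand_sides]

# A matrix with determinant 1*4 - 2*2 = 0 has no inverse,
# so the method cannot be used for any right-hand side.
singular = [[1, 2], [2, 4]]
singular_inv = inverse_2x2(singular)
```

Returning `None` for the singular case is just one possible design; raising an exception would be equally reasonable.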
Example 2: Using the Inverse Matrix to Solve a System of Linear Equations (with the Gauss–Jordan Method)
Solve the system of linear equations using the inverse of a matrix.
We begin by assigning the values
This allows us to write the above system of linear equations as
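Equivalently, we can now write the equations in the very neat form
$$A\mathbf{u} = \mathbf{v}.$$
Phrased this way, we now aim to find $\mathbf{u}$, as this vector contains all three of the unknown variables. In order to do this, we first assume that the inverse $A^{-1}$ exists and we then multiply both sides of the equation above on the left by this matrix:
$$A^{-1}A\mathbf{u} = A^{-1}\mathbf{v}.$$
Given that by definition $A^{-1}A = I_3$, where $I_3$ is the $3 \times 3$ identity matrix, and also given that $I_3\mathbf{u} = \mathbf{u}$ for any matrix $\mathbf{u}$ with order $3 \times 1$, we have
$$\mathbf{u} = A^{-1}\mathbf{v}. \tag{4}$$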
We now know how to express $\mathbf{u}$ in terms of the inverse matrix $A^{-1}$, which we must now calculate. To do this, we will use the Gauss–Jordan elimination method for calculating the inverse of a square matrix. We remind ourselves of the two matrices $A$ and $I_3$, which we then join together as the augmented matrix $(A \mid I_3)$.
If the inverse exists, then we will be able to use elementary row operations to change the above matrix into the form $(I_3 \mid A^{-1})$. First we highlight the pivots, where a pivot is the first nonzero entry in a row:
The entry in the top-left corner is fairly convenient, but it would be more helpful if this entry had a value of 1. We quickly scale the top row with the operation , giving
To achieve the desired form, we must obtain the identity matrix in the left-hand side of the joined matrix. The identity matrix has a 1 in the top-left entry and the rest of the entries in this column are zero. It is therefore necessary to remove the two pivot entries in the first column using the row operations and :
For similar reasons to the ones above, we would prefer the pivot in the second row to have a value of 1, so we perform the row operation :
Now we will remove the pivot in the third row, since this is directly below the pivot in the second row. The row operation gives the matrix
At this stage, it might be tempting to immediately perform the row operation , which would introduce fractions into the third row and hence the remainder of the calculations. Although it is by no means necessary, it is usually preferable to avoid this if possible. Because of this, we instead choose the row operation , which gives
We also complete the row operations and :
We have completed these row operations as a preparatory measure. Now we will remove the nonzero entries which are above the pivot in the third row, using the row operations and . The resulting matrix is
We now have the penultimate step of removing the nonzero entry above the pivot in the second row. The row operation gives
Instead of achieving the form , we have instead produced the matrix . This is certainly not a failure on our part, since we can now write that
It can be checked that $AA^{-1} = I_3$, which means that we have found the correct inverse. We can now solve the problem by using equation (4). We have that
This gives the final answers that , , and . It can be checked in the original system of linear equations that these are the correct values.
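The Gauss–Jordan procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reproduction of the calculation above. The matrix `A` and vector `v` are hypothetical values chosen here, and the function name `gauss_jordan_inverse` is made up for the sketch. The sketch also differs from the hand calculation in two ways: it scales each pivot to 1 immediately (accepting fractions, which are handled exactly), and it clears the entries above and below each pivot in a single pass.

```python
from fractions import Fraction


def gauss_jordan_inverse(a):
    """Invert a square matrix by row-reducing the joined matrix (A | I) to (I | A^{-1})."""
    n = len(a)
    # Join A with the identity matrix, using exact fractions throughout.
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot_row = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot_row is None:
            raise ValueError("matrix is singular; no inverse exists")
        # Swap that row into the pivot position.
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        # Scale the pivot row so that the pivot equals 1.
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        # Subtract multiples of the pivot row to clear the rest of the column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The right-hand half of the joined matrix is now the inverse.
    return [row[n:] for row in aug]


# Hypothetical 3x3 system (an assumption for illustration only).
A = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]
v = [1, 2, 3]

A_inv = gauss_jordan_inverse(A)
u = [sum(x * y for x, y in zip(row, v)) for row in A_inv]  # u = A^{-1} v
```

Because every row operation is applied to the whole joined row, the right-hand half records the combined effect of all operations, which is exactly the inverse once the left-hand half has become the identity.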
In the question above, we used the Gauss–Jordan method to find the inverse of the coefficient matrix of the system of linear equations. Using row operations to manipulate a matrix is a fundamental skill in linear algebra and questions like the one above are an excellent source of practice. Nonetheless, there are other methods that can be used to calculate the inverse of a matrix that may be preferable depending on the matrix involved. In the following example we will use the adjoint matrix method to calculate the necessary matrix inverse. This method is often considered to be preferable for calculating the inverse of a matrix, especially for matrices of order $3 \times 3$, although it applies to square matrices of any order.
Example 3: Using the Inverse Matrix to Solve a System of Linear Equations (with the Adjoint Matrix Method)
Use the inverse of a matrix to solve the system of linear equations
We will first create the matrices
The system of linear equations can be encoded equivalently by the matrix multiplication
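This allows the simplest expression of the system of equations, as $A\mathbf{u} = \mathbf{v}$. By multiplying both sides on the left by the inverse $A^{-1}$ and then simplifying, we can express the vector $\mathbf{u}$ by the expression
$$\mathbf{u} = A^{-1}\mathbf{v}. \tag{5}$$
We would like to calculate $\mathbf{u}$, since this vector has entries that are the three unknown variables. To use the equation above to find $\mathbf{u}$, we must first calculate $A^{-1}$. To do this we will use the adjoint matrix method, which is described as follows. For each position $(i, j)$ in $A$, the matrix minor $A_{ij}$ is the $2 \times 2$ matrix obtained by deleting row $i$ and column $j$ of $A$, and the corresponding cofactor is $C_{ij} = (-1)^{i+j}\det(A_{ij})$. The cofactor matrix $C$ has entries $C_{ij}$, the adjoint matrix is its transpose, $\operatorname{adj}(A) = C^{T}$, and, provided that $\det(A) \neq 0$,
$$A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A).$$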
Using the adjoint matrix method means that we must calculate the determinant of $A$. We use Sarrus’ rule, which gives
Since the determinant is nonzero, we know that the matrix $A$ is nonsingular and hence the inverse does exist. We have already used 3 of the matrix minors of $A$ in calculating $\det(A)$, but to use the adjoint matrix method to calculate $A^{-1}$, it is necessary to list all 9 matrix minors of $A$:
For these matrices, we can calculate the determinants but we must remember to include the parity term that is used in creating the adjoint matrix. This included, we have
The cofactor matrix is populated by the right-hand terms of the above 9 equations:
The adjoint matrix is the transpose of the cofactor matrix:
The inverse matrix is written in terms of the adjoint matrix and the determinant that we calculated in equation (6), using the formula
$$A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A).$$
Now that we know $A^{-1}$, we can solve the original system of linear equations by using equation (5):
This means that the solution to the original problem is , , and .
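The steps of the adjoint matrix method (determinant, minors, cofactors, transpose, division by the determinant) can be sketched in Python for a $3 \times 3$ matrix. The matrix `A`, the vector `v`, and all function names below are hypothetical choices made for this illustration and are not the values from the example above.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3_sarrus(m):
    """Determinant of a 3x3 matrix by Sarrus' rule."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + c * d * h - c * e * g - b * d * i - a * f * h


def minor(m, i, j):
    """The 2x2 matrix left after deleting row i and column j (0-indexed)."""
    return [[m[r][c] for c in range(3) if c != j] for r in range(3) if r != i]


def adjoint_inverse(m):
    """Invert a 3x3 matrix as adj(A) / det(A)."""
    d = det3_sarrus(m)
    if d == 0:
        raise ValueError("matrix is singular; no inverse exists")
    # Cofactors: determinant of each minor times the parity term (-1)^(i+j).
    cof = [[(-1) ** (i + j) * det2(minor(m, i, j)) for j in range(3)] for i in range(3)]
    # The adjoint matrix is the transpose of the cofactor matrix.
    adj = [[cof[j][i] for j in range(3)] for i in range(3)]
    return [[Fraction(x, d) for x in row] for row in adj]


# Hypothetical 3x3 system (an assumption for illustration only).
A = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]
v = [1, 2, 3]

A_inv = adjoint_inverse(A)
u = [sum(x * y for x, y in zip(row, v)) for row in A_inv]  # u = A^{-1} v
```

With 0-indexed rows and columns the parity term $(-1)^{i+j}$ takes the same values as it does with the usual 1-indexed convention, since both indices shift by one.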
The questions above can be approached with an abstract simplicity once the system of equations is reduced to the matrix equation $A\mathbf{u} = \mathbf{v}$. The benefit of such an expression is that we can treat it in a detached, algebraic sense, which makes it easy to see that the system can be solved by using linear algebra to achieve the matrix equation $\mathbf{u} = A^{-1}\mathbf{v}$. Treating the problem only in this abstract way, however, would belie the computational complexity that is incurred when trying to calculate the inverse matrix $A^{-1}$. Furthermore, it is obviously not possible to solve the system of equations this way if we do not have an exact form for $A^{-1}$, and finding one will normally involve either the Gauss–Jordan method or the adjoint matrix method. The ability to change perspective between the abstract view and the computational view is a defining characteristic of linear algebra, wherein we must frequently shift our perspective in order to fully understand the problem that we are working with and the techniques that we might employ to solve it. For many mathematicians, this is one of the joys of studying linear algebra, but even if this is not the case for everyone, it would be hard not to empathise with this perspective in this particular situation, given the examples above.
- A system of linear equations can be encoded by the matrix equation $A\mathbf{u} = \mathbf{v}$, where the aim is to solve the system by finding $\mathbf{u}$.
- If $A$ is a square matrix and is invertible, then we can find the matrix $A^{-1}$ either by the Gauss–Jordan method or by the adjoint matrix method.
- If the inverse can be found, then we can use linear algebra to find $\mathbf{u} = A^{-1}\mathbf{v}$.