In this explainer, we will learn how to express a system of linear equations as a matrix equation and how to write a set of simultaneous equations from a matrix equation.
When linear algebra was first being developed, one of the main motivations was to help solve systems of linear equations. Many real-world problems of the age were initially written in a long, convoluted manner which included surprisingly few numbers. Suppose that we had the following question:
“The cost of buying three apples and eight bananas is 5 pounds and 10 pence. The cost of buying twelve apples and five bananas is 8 pounds and 25 pence. How much does it cost to buy a single apple or a single banana?”
This problem is easily solved by anybody who is familiar with simultaneous equations or linear algebra. Suppose that we were to let the variables $x$ and $y$, respectively, be the costs in pence of a single apple and a single banana. Then the above question can be phrased as the following system of linear equations:
$$\begin{aligned} 3x + 8y &= 510, \\ 12x + 5y &= 825. \end{aligned} \tag{1}$$
There are several clear advantages to writing the problem above as this system of linear equations. In choosing this format, no information about the problem is lost and we have written all of the relevant information in a way that is more succinct and hence easier to work with. Algebra can be used to quickly show that $x = 50$ and $y = 45$, so an apple costs 50 pence and a banana costs 45 pence.
For a problem such as the one above, it might appear that there is very little room for improvement over the form in the system of equations (1). If we are familiar with matrix multiplication, however, we can actually make a slight improvement to the statement of this problem by using the appropriate matrices. Although we will not justify our working just yet, we will soon see how it is possible to write the system of equations (1) in a form that is even neater:
$$\begin{pmatrix} 3 & 8 \\ 12 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 510 \\ 825 \end{pmatrix}.$$
Writing the system of equations in this way allows us to think of the three matrices somewhat separately. The colossal toolkit of linear algebra can then be applied to any of these matrices independently, although practically we are usually most interested in the left-most matrix.
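As a minimal sketch of treating the matrices separately, the fruit problem can be stored as a coefficient matrix and a right-hand side and then solved with Cramer's rule. The units (pence), the variable names `A`, `b`, `x`, `y`, and the helper `solve_2x2` are assumptions made for this illustration, not part of the original problem statement.

```python
from fractions import Fraction

# Fruit problem with prices in pence (assumed units):
#   3x + 8y = 510
#  12x + 5y = 825
A = [[3, 8],
     [12, 5]]      # coefficient matrix
b = [510, 825]     # right-hand side


def solve_2x2(A, b):
    """Solve a 2x2 system by Cramer's rule; needs a nonzero determinant."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("coefficient matrix is singular")
    # Replace one column of A by b at a time and divide by det(A).
    x = Fraction(b[0] * A[1][1] - A[0][1] * b[1], det)
    y = Fraction(A[0][0] * b[1] - b[0] * A[1][0], det)
    return x, y


x, y = solve_2x2(A, b)
print(x, y)  # 50 45
```

Only `A` and `b` enter the calculation; the labels `x` and `y` appear solely when the answer is reported.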
To justify how we achieved the above matrix representation, we will actually work in reverse by first constructing a different equation featuring matrices. Suppose that we had the following:
The matrix multiplication $AB$ is only well defined if the matrix $A$ has order $m \times n$ and the matrix $B$ has order $n \times p$, in which case the matrix $AB$ has order $m \times p$. In the case of the matrices above, the left-most matrix has order $3 \times 4$ and the middle matrix has order $4 \times 1$, which means that the matrix multiplication is well defined. Furthermore, the order of the resulting matrix will be $3 \times 1$, which is indeed the case for the right-most matrix.
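The order check described above can be sketched in a few lines. The function name `product_order` and the tuple representation of an order are hypothetical choices for this illustration.

```python
def product_order(left_order, right_order):
    """Return the order of the product, or None if it is not defined.

    Orders are (rows, columns) tuples. The product exists only when the
    number of columns on the left equals the number of rows on the right.
    """
    m, n = left_order
    n_right, p = right_order
    if n != n_right:
        return None
    return (m, p)


print(product_order((3, 4), (4, 1)))  # (3, 1)
print(product_order((3, 4), (3, 1)))  # None
```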
The middle matrix contains the variables , , , and . Although we cannot know the possible values of these variables without further working, we can still choose to express the above matrix equation in terms of the matrix multiplications.
The first stage is to pair entries in the first row of the left matrix with the entries in the first and only column of the middle matrix, which will be equal to the entry in the first row and first column of the right matrix, as we have highlighted below:
Pairing the entries in order, we use the definition of matrix multiplication to give . For reasons that will shortly become clear, we choose to write this equation in the very rigid form
We can apply the same working with the second row of the left matrix:
We have , which we write in the same format that we used for equation (2):
Finally we turn to the third row of the left matrix:
Pairing each of the entries gives Now we write this expression in the same format that we used for equations (2) and (3):
Now that we have effectively completed the matrix multiplication, we can write the three equations (2)–(4) together as one aligned system of linear equations:
Had we instead begun with this system of linear equations, then we could simply have reversed the steps in order to write the problem as the matrix equation that we were originally given. Realistically, there is seldom such a need to perform many steps in order to do this because, if performed correctly, the task is little more difficult than reading coefficients from a system of linear equations and then writing these coefficients into an appropriate matrix. This can be complicated if the system of linear equations is not written in a standard way, so it is good practice to always rewrite the system of linear equations into a standard form where the alignment and spacing will help to create the corresponding matrix equation with minimal risk of there being a transcription error.
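Reading coefficients into a matrix, and reversing the step, can be sketched as follows. The equations, the variable names, and the helper functions `system_to_matrices` and `matrices_to_system` are all hypothetical and are not taken from the worked example above. A variable that is missing from an equation is given the coefficient 0, which is exactly what writing the system in aligned standard form makes visible.

```python
def system_to_matrices(equations, variables):
    """Build the coefficient matrix A and right-hand side b.

    Each equation is a pair (coefficients, rhs), where coefficients maps a
    variable name to its coefficient. Absent variables get coefficient 0.
    """
    A = [[coeffs.get(v, 0) for v in variables] for coeffs, _ in equations]
    b = [rhs for _, rhs in equations]
    return A, b


def matrices_to_system(A, variables, b):
    """Expand A and b back into one equation per row of A."""
    if len(A) != len(b) or any(len(row) != len(variables) for row in A):
        raise ValueError("matrix orders are incompatible")
    return [(dict(zip(variables, row)), rhs) for row, rhs in zip(A, b)]


# Hypothetical system:
#   2x -  y      = 4
#        3y + 5z = 7
#    x      - 2z = 0
variables = ["x", "y", "z"]
equations = [
    ({"x": 2, "y": -1}, 4),
    ({"y": 3, "z": 5}, 7),
    ({"x": 1, "z": -2}, 0),
]

A, b = system_to_matrices(equations, variables)
print(A)  # [[2, -1, 0], [0, 3, 5], [1, 0, -2]]
print(b)  # [4, 7, 0]

recovered = matrices_to_system(A, variables, b)
print(recovered[0])  # ({'x': 2, 'y': -1, 'z': 0}, 4)
```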
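Theorem: Matrix Form of a System of Linear Equations
Consider a general system of linear equations in the variables $x_1, x_2, \ldots, x_n$ with the coefficients $a_{ij}$ and the right-hand sides $b_1, b_2, \ldots, b_m$:
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1, \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2, \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m. \end{aligned}$$
Then this can equivalently be written as the matrix equation
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}.$$
The left-most matrix is typically referred to as the “coefficient” matrix corresponding to the system of linear equations.
This theorem summarizes the general result behind the specific example that we just examined. Using this theorem, we can deduce some general properties that we will need to understand before beginning any examples. In the statement of the above theorem, the system of linear equations had $m$ equations, each featuring up to $n$ variables. The coefficient matrix corresponding to this system must therefore have order $m \times n$, whereas the middle matrix will have order $n \times 1$ and the right matrix will have order $m \times 1$. If we are presented with a system of linear equations, we should use this information as a guideline to understand what the orders of the associated matrices must be. In the following two examples, we show how this theorem can be applied to encode two entirely different systems into matrix form.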
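Before turning to those examples, a small instance of the theorem may help fix the orders involved. The system below is hypothetical (the coefficients and the names $x_1$, $x_2$, $x_3$ are chosen only for illustration) and has 2 equations in 3 variables, so the coefficient matrix has order $2 \times 3$, the middle matrix has order $3 \times 1$, and the right matrix has order $2 \times 1$.

```latex
\begin{aligned}
2x_1 - x_2 + 4x_3 &= 7, \\
 x_1 \phantom{{}- x_2} + 3x_3 &= 5
\end{aligned}
\quad\Longleftrightarrow\quad
\begin{pmatrix} 2 & -1 & 4 \\ 1 & 0 & 3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} 7 \\ 5 \end{pmatrix}
```

Note the entry 0 in the second row: the variable $x_2$ does not appear in the second equation, so its coefficient there is zero.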
Example 1: Expressing a System of Two Linear Equations as a Matrix Equation
Express the simultaneous equations as a matrix equation.
Before we write these simultaneous equations in matrix form, it will be helpful to take the two given equations and then rewrite them in a form that is more useful. By aligning each variable in the same column and allowing some spacing between the entries, we choose the form
There are 2 simultaneous equations, meaning that the coefficient matrix must have 2 rows. Given that there are 2 variables, and , the coefficient matrix must have 2 columns. This means that we are trying to create a matrix equation of the following form: where the entries are unknowns that need to be found. The first equation of (5) is which means that we can populate the first row of the coefficient matrix as
The second equation of (5) is
We can inject this information into the second row of the coefficient matrix without changing any entries in the first row. This is achieved as shown in the highlighted entries below:
This final equation represents the full matrix representation of the simultaneous equations and is entirely equivalent (provided that we understand the definition of matrix multiplication).
Example 2: A System of Linear Equations With 3 Variables and 3 Equations
Express the following set of simultaneous equations as a matrix equation:
Before we begin, we should note that there are 3 equations and 3 variables: , , and . This means that the coefficient matrix will have order $3 \times 3$. Our goal will be to find the matrix equation of the form which reproduces the system of linear equations in the statement of the question.
We begin by taking the first of the three given equations, which is
By using the definition of matrix multiplication, we can populate the first row of the coefficient matrix as
We still need to take the second and third lines in the system of equations and write these into the coefficient matrix. The second equation is which we can embed into the second row of the coefficient matrix, crucially without changing the first row. This gives
Now we only have the third equation, which we can write into the third row of the coefficient matrix without affecting the entries in the two rows above. This gives
This represents the full representation of the system of linear equations in matrix form.
In the above example, the original equations were written into matrix form, which has separated the coefficient matrix from the variables and the numbers on the right-hand side of each equation. In this situation we would probably be most interested in the coefficient matrix because this matrix will encode vital information about the system of linear equations, such as whether or not a solution exists and, if so, whether the variables may take single values or infinitely many values.
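One way the coefficient matrix carries this information, for a square system, is through its determinant: a nonzero determinant means that exactly one solution exists, whereas a zero determinant means that the system has either no solution or infinitely many, depending on the right-hand side. The sketch below assumes a $3 \times 3$ coefficient matrix; the two matrices and the function names `det3` and `has_unique_solution` are hypothetical and are not taken from the example above.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def has_unique_solution(M):
    """A square system has exactly one solution iff det(M) is nonzero."""
    return det3(M) != 0


invertible = [[2, 1, 0],
              [1, 3, 1],
              [0, 1, 4]]
singular = [[1, 2, 3],
            [2, 4, 6],   # twice the first row, so the rows are dependent
            [0, 1, 1]]

print(det3(invertible), has_unique_solution(invertible))  # 18 True
print(det3(singular), has_unique_solution(singular))      # 0 False
```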
Example 3: A System of Linear Equations with 3 Variables and 3 Equations
Which of the following shows the set of simultaneous equations that could be solved using the matrix
An efficient way to answer this problem would be to write the matrix equation as a system of linear equations. The left-most matrix (the coefficient matrix) has 3 rows, which means that there will be 3 equations. The coefficient matrix also has 3 columns, which means that there will be 3 variables, which in our example above are , , and . We can effectively turn each row of the coefficient matrix into an equation featuring these variables by performing the matrix multiplication in 3 stages. We begin by highlighting the first row of the left matrix and the first and only column of the middle matrix, which corresponds to the entry in the right matrix which is in the first row and first column:
The definition of matrix multiplication gives the equation
Now, moving onto the second row of the left matrix, we will use the highlighted entries as shown:
This gives the second equation of the corresponding system of linear equations:
Finally, we look at the third row of the coefficient matrix:
Expanding this out gives the final equation of the system:
Now that we have the three equations of the corresponding system of linear equations in equations (6)–(8), we can write them all together as the complete system of linear equations
Now that we have written the system of linear equations in standard form, we can compare it to the 5 possible options that were given in the question. We will work through each option in turn, aiming to dismiss each one as quickly as possible. In order for any of the options to be equal to the system of linear equations defined by the matrix equation in the question, the three equations must agree exactly. If at least one of the equations is different, then the two systems cannot be equal.
(a) The first equation of this system is , which can be rearranged to give
This disagrees with the first-line equation of the complete linear system that is given in equation (6) because the two numbers on the right-hand side of the equations are different. Therefore option (a) is not the correct choice.
(b) The first equation of this system is , which when rearranged gives
This equation is also different to that given in equation (6), so this cannot be the correct choice.
(c) In this system of linear equations, the first line is , which can equivalently be written as
The right-hand side of this equation agrees with the right-hand side of equation (6). However, the coefficients of the variables have opposite signs in the two equations, so the equations are not equivalent and this cannot be the correct answer.
(d) For this system, the first line is , which we write as
This is exactly equal to equation (6), which is the first equation of the given linear system. Given that these two agree exactly, we now examine the second equation of (d), which is that , which can be written in the form
It is an encouraging sign that this is equal to the second equation of the system, as given in equation (7). We now only need to check the final equation of option (d), which is that . We choose to write this as
This equation agrees with that given in (8), meaning that option (d) is the correct option.
(e) Given that option (d) is the correct option and that option (e) is different from option (d), this cannot be correct.
In the following examples, we will not repeat the full method of writing a system of linear equations in matrix form (or vice versa), as this is very easy to do after some practice. Instead, the remaining questions will focus on demonstrating how to treat a system of linear equations that are not written in any standard form. When discussing such a system, there is no requirement to write this out in a way that is neatly formatted. As we will see, however, it is often very useful to take a system of linear equations and write them in a form that is neat and aligned, with all variables on the left-hand sides of the given equations. After having done so, writing the system as a matrix equation is a less risky task.
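The rearrangement into standard form can be sketched as a small routine that moves every variable term to the left-hand side and every constant to the right-hand side. The representation (a dictionary per side, with the key `""` holding the constant term), the function name `standardize`, and the sample equation are hypothetical choices for this illustration.

```python
def standardize(lhs, rhs, variables):
    """Rewrite lhs = rhs with all variables on the left and constants on the right.

    Each side maps a variable name to its coefficient; the key "" holds the
    constant term on that side. Returns (row, constant), where row is the
    corresponding row of the coefficient matrix.
    """
    row = [lhs.get(v, 0) - rhs.get(v, 0) for v in variables]
    constant = rhs.get("", 0) - lhs.get("", 0)
    return row, constant


# Hypothetical equation: 2x + 3 = y - z, which rearranges to 2x - y + z = -3.
row, constant = standardize({"x": 2, "": 3}, {"y": 1, "z": -1}, ["x", "y", "z"])
print(row, constant)  # [2, -1, 1] -3
```

Applying this to each equation in turn gives the rows of the coefficient matrix and the entries of the right-most matrix directly.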
Example 4: A System of Linear Equations with 3 Variables and 3 Equations
Express the set of simultaneous equations as a matrix equation.
The first step is to take each equation in the given system and to rewrite this with all variables on the left-hand side of each equation and with each variable aligned in the same column:
Now that we have these three equations written in this format, we can equivalently encode this system by the matrix equation
In the final example that we will give, we will show how to take principles from the above explainer and then apply these in a slightly different way. The method will essentially be identical, with only cosmetic changes to the process that we would normally follow.
Example 5: A System of Linear Equations That Is Already Partially Written in Matrix Form
Write in the form where is a matrix.
Let us begin by stating the question in a slightly more helpful way. We would like to find the matrix such that
This allows us to determine the order of by using the definition of matrix multiplication. A matrix product is only well defined if the left matrix has order $m \times n$ and the right matrix has order $n \times p$, with the matrix product having order $m \times p$. In the above equation, the middle matrix has order and the right matrix has order , which means that must have order . Therefore we are looking to produce a matrix equation of the following form:
We refer to the matrix on the left as the coefficient matrix of the system of linear equations. It will be helpful to write the right-most matrix in a way which aligns the variables in the same columns:
After having written the right matrix in this form, we deduce that the coefficient matrix on the left must be as follows:
Earlier we stated that the matrix expression of a system of linear equations is particularly helpful, in that it standardizes the form in which such systems are written, eliminating any reference to variable names or any other extraneous information. Although it is not necessary to write a system in matrix form, it is frequently more helpful to do so. When written in matrix form, it is generally easier to perform advanced algorithms such as the Gauss–Jordan method to find the solution of a system of linear equations. Just as helpfully, we can separate the coefficient matrix from the rest of the system and treat this object separately. If the coefficient matrix is a square matrix, then we may be interested in finding the determinant and the matrix inverse, if it exists. Once we have the coefficient matrix, we may use this in conjunction with any other matrix that might be relevant to the problem at hand, allowing us to operate with this matrix using the rules of linear algebra. In this manner, the variables used to describe the system of linear equations are totally irrelevant and are just arbitrary labels, with the real mathematical understanding being derived from the coefficient matrix and the numbers which appear in the right-most matrix of the equivalent matrix expression.
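To illustrate why the matrix form suits such algorithms, here is a minimal sketch of Gauss–Jordan elimination for a square system with a unique solution. It uses exact fractions from the Python standard library. The function name `gauss_jordan` is hypothetical, and the sample input is the fruit problem from the start of the explainer written in pence (an assumed choice of units): $3x + 8y = 510$ and $12x + 5y = 825$.

```python
from fractions import Fraction


def gauss_jordan(A, b):
    """Solve Ax = b for a square A with a unique solution, using exact arithmetic."""
    n = len(A)
    # Augmented matrix [A | b], converted to fractions to avoid rounding.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system does not have a unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    # The last column now holds the solution.
    return [row[-1] for row in M]


solution = gauss_jordan([[3, 8], [12, 5]], [510, 825])
print([int(v) for v in solution])  # [50, 45]
```

The routine never refers to variable names: it works only with the coefficient matrix and the right-most matrix, which is the point made in the paragraph above.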
- When given any system of linear equations, it is wisest to rewrite these equations with all of the variables on the left-hand side and with each coefficient aligned in the column of the variable it refers to.
- When written in this specific format, it is normally a seamless process to write the system in the matrix form.
- If we are initially given a matrix equation of the same type as in this explainer, we can write this as a corresponding system of linear equations.
- Writing a system of linear equations in matrix form allows us to treat the coefficient matrix independently and also shows that the variable labels are completely arbitrary.