Gaussian elimination
In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of row-wise operations performed on the corresponding matrix of coefficients, and can also be used to compute the rank of a matrix, the determinant of a square matrix, and the inverse of an invertible matrix. There are three types of operations:
- Swapping two rows,
- Multiplying a row by a nonzero number,
- Adding a multiple of one row to another row.
Using these operations, a matrix can always be transformed into an upper triangular matrix, and in fact one that is in row echelon form.
Using row operations to convert a matrix into reduced row echelon form is sometimes called Gauss–Jordan elimination. In this case, the term Gaussian elimination refers to the process until it has reached its upper triangular, or (unreduced) row echelon form. For computational reasons, when solving systems of linear equations, it is sometimes preferable to stop row operations before the matrix is completely reduced.
Definitions and example of algorithm
The process of row reduction makes use of elementary row operations, and can be divided into two parts. The first part (sometimes called forward elimination) reduces a given system to row echelon form, from which one can tell whether there are no solutions, a unique solution, or infinitely many solutions. The second part (sometimes called back substitution) continues to use row operations until the solution is found; in other words, it puts the matrix into reduced row echelon form.
Another point of view, which turns out to be very useful to analyze the algorithm, is that row reduction produces a matrix decomposition of the original matrix. The elementary row operations may be viewed as the multiplication on the left of the original matrix by elementary matrices. Alternatively, a sequence of elementary operations that reduces a single row may be viewed as multiplication by a Frobenius matrix. Then the first part of the algorithm computes an LU decomposition, while the second part writes the original matrix as the product of a uniquely determined invertible matrix and a uniquely determined reduced row echelon matrix.
Row operations
There are three types of elementary row operations which may be performed on the rows of a matrix:
- Swap the positions of two rows.
- Multiply a row by a non-zero scalar.
- Add to one row a scalar multiple of another.
If the matrix is associated to a system of linear equations, then these operations do not change the solution set. Therefore, if one's goal is to solve a system of linear equations, then using these row operations could make the problem easier.
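These three operations can be sketched directly in code. The following minimal Python helpers (names are illustrative) act in place on a matrix stored as a list of rows:

```python
# Sketch of the three elementary row operations on a matrix
# represented as a list of lists of floats.

def swap_rows(M, i, j):
    """Type 1: swap the positions of rows i and j."""
    M[i], M[j] = M[j], M[i]

def scale_row(M, i, c):
    """Type 2: multiply row i by a non-zero scalar c."""
    assert c != 0
    M[i] = [c * x for x in M[i]]

def add_multiple(M, i, j, c):
    """Type 3: add c times row j to row i."""
    M[i] = [x + c * y for x, y in zip(M[i], M[j])]

M = [[2.0, 1.0], [4.0, 3.0]]
add_multiple(M, 1, 0, -2.0)   # eliminate the 4 below the pivot 2
print(M)                      # [[2.0, 1.0], [0.0, 1.0]]
```

Because each operation is invertible (swap again, scale by 1/c, add −c times the row), applying them to an augmented system never changes its solution set.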
Echelon form
For each row in a matrix, if the row does not consist of only zeros, then the leftmost nonzero entry is called the leading coefficient (or pivot) of that row. A matrix is in row echelon form if all rows consisting entirely of zeros are at the bottom, and the leading coefficient of every nonzero row is strictly to the right of the leading coefficient of the row above it.
For example, the following matrix is in row echelon form, and its leading coefficients are shown in red:
It is in echelon form because the zero row is at the bottom, and the leading coefficient of the second row (in the third column), is to the right of the leading coefficient of the first row (in the second column).
A matrix is said to be in reduced row echelon form if furthermore all of the leading coefficients are equal to 1 (which can be achieved by using the elementary row operation of type 2), and in every column containing a leading coefficient, all of the other entries in that column are zero (which can be achieved by using elementary row operations of type 3).
Example of the algorithm
Suppose the goal is to find and describe the set of solutions to the following system of linear equations (the equations are labelled L1, L2, L3 for reference below):

$$\begin{alignat}{4}
2x &{}+{}& y &{}-{}& z &{}={}& 8 &\qquad (L_1) \\
-3x &{}-{}& y &{}+{}& 2z &{}={}& -11 &\qquad (L_2) \\
-2x &{}+{}& y &{}+{}& 2z &{}={}& -3 &\qquad (L_3)
\end{alignat}$$
The table below is the row reduction process applied simultaneously to the system of equations and its associated augmented matrix. In practice, one does not usually deal with the systems in terms of equations, but instead makes use of the augmented matrix, which is more suitable for computer manipulations.
The matrix is now in echelon form (also called triangular form); continuing the row reduction yields

$$\begin{alignat}{4}
2x &{}+{}& y &\quad& &{}={}& 7 \\
& & y &\quad& &{}={}& 3 \\
& & &\quad& z &{}={}& -1
\end{alignat}$$
The second column describes which row operations have just been performed. So for the first step, the x is eliminated from L2 by adding 3/2L1 to L2. Next, x is eliminated from L3 by adding L1 to L3. These row operations are labelled in the table as L2 + (3/2)L1 → L2 and L3 + L1 → L3.
Once y is also eliminated from the third row, the result is a system of linear equations in triangular form, and so the first part of the algorithm is complete. From a computational point of view, it is faster to solve for the variables in reverse order, a process known as back-substitution. One sees the solution is z = −1, y = 3, and x = 2. So there is a unique solution to the original system of equations.
Instead of stopping once the matrix is in echelon form, one could continue until the matrix is in reduced row echelon form, as it is done in the table. The process of row reducing until the matrix is reduced is sometimes referred to as Gauss–Jordan elimination, to distinguish it from stopping after reaching echelon form.
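The two phases just described, forward elimination to triangular form followed by back-substitution, can be sketched in Python. The sketch uses exact rational arithmetic, and as an illustration takes the classic three-equation system with the solution x = 2, y = 3, z = −1 found above (the concrete coefficients are an assumed instance):

```python
from fractions import Fraction

def solve(aug):
    """Solve a square system given as an augmented matrix [A | b]."""
    n = len(aug)
    M = [[Fraction(x) for x in row] for row in aug]
    # Forward elimination: reduce to (upper) triangular form.
    for k in range(n):
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            M[i] = [a - f * b for a, b in zip(M[i], M[k])]
    # Back-substitution: solve for the unknowns in reverse order.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# 2x + y - z = 8,  -3x - y + 2z = -11,  -2x + y + 2z = -3
aug = [[2, 1, -1, 8], [-3, -1, 2, -11], [-2, 1, 2, -3]]
print(solve(aug))   # [Fraction(2, 1), Fraction(3, 1), Fraction(-1, 1)]
```

Note that this sketch assumes every pivot is nonzero; a full implementation would swap rows when necessary, as in the pseudocode later in the article.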
History
The method of Gaussian elimination appears – albeit without proof – in the Chinese mathematical text Chapter Eight: Rectangular Arrays of The Nine Chapters on the Mathematical Art. Its use is illustrated in eighteen problems, with two to five equations. The first reference to the book by this title is dated to 179 AD, but parts of it were written as early as approximately 150 BC.[1][2][3] It was commented on by Liu Hui in the 3rd century.
The method in Europe stems from the notes of Isaac Newton.[4][5] In 1670, he wrote that all the algebra books known to him lacked a lesson for solving simultaneous equations, which Newton then supplied. Carl Friedrich Gauss in 1810 devised a notation for symmetric elimination that was adopted in the 19th century by professional hand computers to solve the normal equations of least-squares problems.[6] The algorithm that is taught in high school was named for Gauss only in the 1950s as a result of confusion over the history of the subject.[7]
Some authors use the term Gaussian elimination to refer only to the procedure until the matrix is in echelon form, and use the term Gauss–Jordan elimination to refer to the procedure which ends in reduced echelon form. The name is used because it is a variation of Gaussian elimination as described by Wilhelm Jordan in 1888. However, the method also appears in an article by Clasen published in the same year. Jordan and Clasen probably discovered Gauss–Jordan elimination independently.[8]
Applications
Historically, the first application of the row reduction method was to solving systems of linear equations. Below are some other important applications of the algorithm.
Computing determinants
To explain how Gaussian elimination allows the computation of the determinant of a square matrix, we have to recall how the elementary row operations change the determinant:
- Swapping two rows multiplies the determinant by −1
- Multiplying a row by a nonzero scalar multiplies the determinant by the same scalar
- Adding to one row a scalar multiple of another does not change the determinant.
If Gaussian elimination applied to a square matrix A produces a row echelon matrix B, let d be the product of the scalars by which the determinant has been multiplied, using the above rules. Then the determinant of A is the quotient by d of the product of the elements of the diagonal of B:

$$\det(A) = \frac{b_{11} b_{22} \cdots b_{nn}}{d}.$$
Computationally, for an n × n matrix, this method needs only O(n³) arithmetic operations, while direct application of the Leibniz formula requires O(n!) operations, and recursive Laplace expansion requires O(2ⁿ) operations.
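A minimal sketch of this determinant computation, tracking the factor d according to the three rules above (exact rational arithmetic; names are illustrative):

```python
from fractions import Fraction

def det(A):
    """Determinant via Gaussian elimination, tracking the factor d."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    d = Fraction(1)   # product of scalars the determinant was multiplied by
    for k in range(n):
        # Find a nonzero pivot, swapping rows if necessary.
        p = next((i for i in range(k, n) if M[i][k] != 0), None)
        if p is None:
            return Fraction(0)     # a zero column: determinant is 0
        if p != k:
            M[k], M[p] = M[p], M[k]
            d = -d                 # a swap multiplies the determinant by -1
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            # Type-3 operation: leaves the determinant unchanged.
            M[i] = [a - f * b for a, b in zip(M[i], M[k])]
    prod = Fraction(1)
    for i in range(n):
        prod *= M[i][i]
    return prod / d   # det(A) = (product of the diagonal of B) / d

print(det([[2, 1], [4, 3]]))   # 2
```

Since only swaps (factor −1) and type-3 operations (factor 1) are used here, d is ±1; scaling rows would contribute further factors.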
Finding the inverse of a matrix
A variant of Gaussian elimination called Gauss–Jordan elimination can be used to find the inverse of an n × n matrix A, if it exists. First, the n × n identity matrix is augmented to the right of A, forming the n × 2n block matrix [A | I]. Then, through application of elementary row operations, one finds the reduced row echelon form of this n × 2n matrix. The matrix A is invertible if and only if the left block can be reduced to the identity matrix I; in this case the right block of the final matrix is A⁻¹. If the algorithm is unable to reduce the left block to I, then A is not invertible.
For example, consider the following matrix:
To find the inverse of this matrix, one takes the following matrix augmented by the identity and row-reduces it as a 3 × 6 matrix:
By performing row operations, one can check that the reduced row echelon form of this augmented matrix is
One can think of each row operation as the left product by an elementary matrix. Denoting by B the product of these elementary matrices, we showed, on the left, that BA = I, and therefore B = A⁻¹. On the right, we kept a record of BI = B, which is the desired inverse. This procedure for finding the inverse works for square matrices of any size.
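The [A | I] procedure can be sketched as follows; this is an illustrative implementation in exact rational arithmetic, not a library routine:

```python
from fractions import Fraction

def inverse(A):
    """Invert A by row-reducing the block matrix [A | I] to [I | A^-1]."""
    n = len(A)
    # Augment the identity matrix to the right of A.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        # Find a usable pivot in column k at or below row k.
        p = next((i for i in range(k, n) if M[i][k] != 0), None)
        if p is None:
            raise ValueError("matrix is not invertible")
        M[k], M[p] = M[p], M[k]
        M[k] = [x / M[k][k] for x in M[k]]   # make the pivot equal to 1
        for i in range(n):                    # clear the rest of the column
            if i != k and M[i][k] != 0:
                f = M[i][k]
                M[i] = [a - f * b for a, b in zip(M[i], M[k])]
    return [row[n:] for row in M]             # right block is the inverse

print(inverse([[2, 1], [1, 1]]))
```

For the 2 × 2 example above, the right block comes out as [[1, −1], [−1, 2]], which is indeed the inverse of [[2, 1], [1, 1]].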
Computing ranks and bases
The Gaussian elimination algorithm can be applied to any m × n matrix A. In this way, for example, some 6 × 9 matrices can be transformed to a matrix that has a row echelon form like
All of this applies also to the reduced row echelon form, which is a particular row echelon form.
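In particular, the rank of A is the number of nonzero rows (equivalently, the number of pivots) of its row echelon form. A sketch of rank computation along these lines, using floating-point arithmetic with partial pivoting and a tolerance (names are illustrative):

```python
def rank(A, tol=1e-9):
    """Rank = number of pivots found while reducing to row echelon form."""
    M = [list(map(float, row)) for row in A]
    m, n = len(M), len(M[0])
    h = k = 0   # current pivot row and pivot column
    while h < m and k < n:
        # Partial pivoting: pick the largest entry in the column.
        p = max(range(h, m), key=lambda i: abs(M[i][k]))
        if abs(M[p][k]) < tol:
            k += 1          # no pivot in this column
            continue
        M[h], M[p] = M[p], M[h]
        for i in range(h + 1, m):
            f = M[i][k] / M[h][k]
            M[i] = [a - f * b for a, b in zip(M[i], M[h])]
        h += 1
        k += 1
    return h

print(rank([[1, 2, 3], [2, 4, 6], [0, 1, 1]]))   # 2
```

The nonzero rows of the echelon form are a basis of the row space of A, and the pivot columns index a basis of the column space.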
Computational efficiency
The number of arithmetic operations required to perform row reduction is one way of measuring the algorithm's computational efficiency. For example, to solve a system of n equations for n unknowns by performing row operations on the matrix until it is in echelon form, and then solving for each unknown in reverse order, requires n(n + 1)/2 divisions, (2n³ + 3n² − 5n)/6 multiplications, and (2n³ + 3n² − 5n)/6 subtractions,[9] for a total of approximately 2n³/3 operations. Thus it has an arithmetic complexity (time complexity, where each arithmetic operation takes a unit of time, independently of the size of the inputs) of O(n³).
This complexity is a good measure of the time needed for the whole computation when the time for each arithmetic operation is approximately constant. This is the case when the coefficients are represented by floating-point numbers or when they belong to a finite field. If the coefficients are integers or rational numbers exactly represented, the intermediate entries can grow exponentially large, so the bit complexity is exponential. The Bareiss algorithm, described below, is a variant that avoids this exponential growth of the intermediate entries.
Gaussian elimination and its variants can be used on computers for systems with thousands of equations and unknowns. However, the cost becomes prohibitive for systems with millions of equations. These large systems are generally solved using iterative methods. Specific methods exist for systems whose coefficients follow a regular pattern (see system of linear equations).
Bareiss algorithm
The first strongly polynomial-time algorithm for Gaussian elimination was published by Jack Edmonds in 1967.[11]: 37  Independently, and almost simultaneously, Erwin Bareiss discovered another algorithm, based on the following remark, which applies to a division-free variant of Gaussian elimination.
In standard Gaussian elimination, one subtracts from each row $R_i$ below the pivot row $R_k$ a multiple of $R_k$ by $r_{i,k}/r_{k,k}$, where $r_{i,k}$ and $r_{k,k}$ are the entries in the pivot column of $R_i$ and $R_k$, respectively.
The Bareiss variant consists, instead, of replacing $R_i$ with $\frac{r_{k,k}R_i - r_{i,k}R_k}{r_{k-1,k-1}}$, where $r_{k-1,k-1}$ is the pivot used at the previous step (taken as 1 for the first step). This produces a row echelon form that has the same zero entries as with standard Gaussian elimination.
Bareiss' main remark is that each matrix entry generated by this variant is the determinant of a submatrix of the original matrix.
In particular, if one starts with integer entries, the divisions occurring in the algorithm are exact divisions resulting in integers, so all intermediate and final entries are integers. Moreover, since an upper bound on the size of the final entries is known (by Hadamard's inequality, as each entry is the determinant of a submatrix of the original matrix), a polynomial bit complexity can be obtained.
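A sketch of the division-free variant: each entry below and to the right of the pivot is replaced by (pivot × entry − column entry × row entry) / prev, where prev is the previous pivot (taken as 1 for the first step). All divisions are exact, so integer arithmetic suffices throughout:

```python
def bareiss(A):
    """Fraction-free (Bareiss) elimination on an integer matrix.
    All intermediate entries stay integers; for a square matrix the
    last diagonal entry of the result equals det(A)."""
    M = [list(map(int, row)) for row in A]
    n = len(M)
    prev = 1   # previous pivot, taken as 1 for the first step
    for k in range(n - 1):
        assert M[k][k] != 0, "zero pivot: a row swap would be needed"
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # // is exact here: the quotient is always an integer.
                M[i][j] = (M[k][k] * M[i][j] - M[i][k] * M[k][j]) // prev
            M[i][k] = 0
        prev = M[k][k]
    return M

M = bareiss([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]])
print(M[2][2])   # -1, the determinant of the matrix
```

This sketch omits the row swaps a full implementation needs when a pivot is zero; the point is that every intermediate entry is itself a subdeterminant, hence an integer of bounded size.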
As a corollary, the following problems can be solved in strongly polynomial time, with the same bit complexity:[11]: 40
- Testing whether m given rational vectors are linearly independent
- Computing the determinant of a rational matrix
- Computing a solution of a rational equation system Ax = b
- Computing the inverse matrix of a nonsingular rational matrix
- Computing the rank of a rational matrix
Numeric instability
One possible problem is numerical instability, caused by the possibility of dividing by very small numbers. If, for example, the leading coefficient of one of the rows is very close to zero, then to row-reduce the matrix, one would need to divide by that number, and any error in that near-zero entry would be amplified. For general matrices, Gaussian elimination is usually considered to be stable when used with partial pivoting, that is, choosing as pivot the entry of largest absolute value in the column.
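The effect of a tiny pivot can be seen on a 2 × 2 system; the helper below is an illustrative sketch, not a library routine:

```python
def solve2(a11, a12, a21, a22, b1, b2, pivot):
    """Solve a 2x2 system in floating point, with or without row swapping."""
    if pivot and abs(a21) > abs(a11):
        # Partial pivoting: swap the rows so the larger entry is the pivot.
        a11, a12, b1, a21, a22, b2 = a21, a22, b2, a11, a12, b1
    f = a21 / a11
    a22, b2 = a22 - f * a12, b2 - f * b1   # eliminate x from the second row
    y = b2 / a22
    x = (b1 - a12 * y) / a11               # back-substitute
    return x, y

# System: 1e-20*x + y = 1,  x + y = 2;  the exact solution is x ≈ 1, y ≈ 1.
print(solve2(1e-20, 1.0, 1.0, 1.0, 1.0, 2.0, pivot=False))  # (0.0, 1.0): x is wrong
print(solve2(1e-20, 1.0, 1.0, 1.0, 1.0, 2.0, pivot=True))   # (1.0, 1.0)
```

Without pivoting, the multiplier 1/1e-20 = 1e20 swamps the other entries in floating-point arithmetic and the computed x is 0 instead of approximately 1; a single row swap fixes the result.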
Generalizations
Gaussian elimination can be performed over any field, not just the real numbers.
Computing the rank of a tensor of order greater than 2 is NP-hard. Therefore, if P ≠ NP, there cannot be a polynomial-time analog of Gaussian elimination for higher-order tensors (matrices being array representations of order-2 tensors).
Pseudocode
As explained above, Gaussian elimination transforms a given m × n matrix A into a matrix in row echelon form.
In the following pseudocode, A[i, j] denotes the entry of the matrix A in row i and column j, with the indices starting from 1. The transformation is performed in place, meaning that the original matrix is lost, being eventually replaced by its row echelon form.
h := 1 /* Initialization of the pivot row */
k := 1 /* Initialization of the pivot column */

while h ≤ m and k ≤ n
    /* Find the k-th pivot: */
    i_max := argmax (i = h ... m, abs(A[i, k]))
    if A[i_max, k] = 0
        /* No pivot in this column, pass to next column */
        k := k + 1
    else
        swap rows(h, i_max)
        /* Do for all rows below pivot: */
        for i = h + 1 ... m:
            f := A[i, k] / A[h, k]
            /* Fill with zeros the lower part of pivot column: */
            A[i, k] := 0
            /* Do for all remaining elements in current row: */
            for j = k + 1 ... n:
                A[i, j] := A[i, j] - A[h, j] * f
        /* Increase pivot row and column */
        h := h + 1
        k := k + 1
This algorithm differs slightly from the one discussed earlier, by choosing a pivot with largest absolute value. Such a partial pivoting may be required if, at the pivot place, the entry of the matrix is zero. In any case, choosing the largest possible absolute value of the pivot improves the numerical stability of the algorithm, when floating point is used for representing numbers.[14]
Upon completion of this procedure the matrix will be in row echelon form and the corresponding system may be solved by back substitution.
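The pseudocode above translates almost line for line into Python (with 0-based instead of 1-based indices; the function name is illustrative):

```python
def row_echelon(A):
    """In-place Gaussian elimination with partial pivoting, following the
    pseudocode above (0-based indices)."""
    m, n = len(A), len(A[0])
    h = k = 0                       # pivot row and pivot column
    while h < m and k < n:
        # Find the k-th pivot: the row with the largest absolute value.
        i_max = max(range(h, m), key=lambda i: abs(A[i][k]))
        if A[i_max][k] == 0:
            k += 1                  # no pivot in this column
            continue
        A[h], A[i_max] = A[i_max], A[h]
        for i in range(h + 1, m):   # zero out the column below the pivot
            f = A[i][k] / A[h][k]
            A[i][k] = 0.0
            for j in range(k + 1, n):
                A[i][j] -= A[h][j] * f
        h += 1
        k += 1
    return A

A = [[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]]
row_echelon(A)
```

After the call, A is in row echelon form (upper triangular here, with all entries below each pivot equal to zero), ready for back substitution.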
See also
- Fangcheng (mathematics)
- Gram–Schmidt process - another process for bringing a matrix into some canonical form.
References
- ^ "DOCUMENTA MATHEMATICA, Vol. Extra Volume: Optimization Stories (2012), 9-14". www.emis.de. Retrieved 2022-12-02.
- ^ Calinger 1999, pp. 234–236
- ISBN 978-0-691-11880-2.
- ^ Grcar 2011a, pp. 169–172
- ^ Grcar 2011b, pp. 783–785
- ^ Lauritzen, p. 3
- ^ Grcar 2011b, p. 789
- JSTOR 2322413
- ^ Farebrother 1988, p. 12
- ^ MR 1261419
- ^ Golub & Van Loan 1996
Works cited
- Atkinson, Kendall A. (1989), An Introduction to Numerical Analysis (2nd ed.), New York: ISBN 978-0471624899.
- Bolch, Gunter; Greiner, Stefan; de Meer, Hermann; Trivedi, Kishor S. (2006), Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications (2nd ed.), ISBN 978-0-471-79156-0.
- Calinger, Ronald (1999), A Contextual History of Mathematics, ISBN 978-0-02-318285-3.
- Farebrother, R.W. (1988), Linear Least Squares Computations, STATISTICS: Textbooks and Monographs, Marcel Dekker, ISBN 978-0-8247-7661-9.
- Lauritzen, Niels, Undergraduate Convexity: From Fourier and Motzkin to Kuhn and Tucker.
- Golub, Gene H.; Van Loan, Charles F. (1996), Matrix Computations (3rd ed.), Baltimore: Johns Hopkins University Press, ISBN 978-0-8018-5414-9.
- Grcar, Joseph F. (2011a), "How ordinary elimination became Gaussian elimination", Historia Mathematica, 38 (2): 163–218, S2CID 14259511
- Grcar, Joseph F. (2011b), "Mathematicians of Gaussian elimination" (PDF), Notices of the American Mathematical Society, 58 (6): 782–792
- Higham, Nicholas (2002), Accuracy and Stability of Numerical Algorithms (2nd ed.), SIAM, ISBN 978-0-89871-521-7.
- Katz, Victor J. (2004), A History of Mathematics, Brief Version, ISBN 978-0-321-16193-2.
- Kaw, Autar; Kalu, Egwu (2010). "Numerical Methods with Applications: Chapter 04.06 Gaussian Elimination" (PDF) (1st ed.). University of South Florida. Archived (PDF) from the original on 2012-09-07.
- Lipson, Marc; Lipschutz, Seymour (2001), Schaum's outline of theory and problems of linear algebra, New York: ISBN 978-0-07-136200-9
- Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (2007), "Section 2.2", Numerical Recipes: The Art of Scientific Computing (3rd ed.), New York: Cambridge University Press, ISBN 978-0-521-88068-8, archived from the original on 2012-03-19, retrieved 2011-08-08