Generalized minimal residual method
In mathematics, the generalized minimal residual method (GMRES) is an iterative method for the numerical solution of an indefinite nonsymmetric system of linear equations. The method approximates the solution by the vector in a Krylov subspace with minimal residual. The Arnoldi iteration is used to find this vector.
The GMRES method was developed by Yousef Saad and Martin H. Schultz in 1986. It is a generalization and improvement of the MINRES method of Paige and Saunders from 1975, which applies only to symmetric systems.
The method
Denote the Euclidean norm of any vector v by ‖v‖. Denote the (square) system of linear equations to be solved by Ax = b. The matrix A is assumed to be invertible of size m-by-m. Furthermore, it is assumed that b is normalized, i.e., that ‖b‖ = 1.
The n-th Krylov subspace for this problem is Kn = Kn(A, r0) = span{r0, A r0, A^2 r0, ..., A^(n−1) r0}, where r0 = b − A x0 is the initial residual given an initial guess x0.
GMRES approximates the exact solution of Ax = b by the vector xn in x0 + Kn that minimizes the Euclidean norm of the residual rn = b − A xn.
The vectors r0, A r0, ..., A^(n−1) r0 might be close to linearly dependent, so instead of this basis, the Arnoldi iteration is used to find orthonormal vectors q1, q2, ..., qn which form a basis for Kn. In particular, q1 = r0 / ‖r0‖.
Therefore, the vector xn can be written as xn = x0 + Qn yn with yn in R^n, where Qn is the m-by-n matrix formed by the columns q1, ..., qn. In other words, finding the n-th approximation of the solution (i.e., xn) is reduced to finding the vector yn, which is determined via minimizing the residual as described below.
The Arnoldi process also constructs H̃n, an (n+1)-by-n upper Hessenberg matrix which satisfies A Qn = Qn+1 H̃n.
Because the columns of Qn+1 are orthonormal, we have ‖rn‖ = ‖b − A xn‖ = ‖r0 − A Qn yn‖ = ‖Qn+1 (β e1 − H̃n yn)‖ = ‖β e1 − H̃n yn‖, where e1 = (1, 0, 0, ..., 0)^T is the first vector in the standard basis of R^(n+1) and β = ‖r0‖. Hence, xn can be found by minimizing the Euclidean norm of β e1 − H̃n yn over yn, which is a linear least squares problem of size n.
This yields the GMRES method. On the n-th iteration:
- calculate qn with the Arnoldi method;
- find the yn which minimizes ‖rn‖ = ‖β e1 − H̃n yn‖;
- compute xn = x0 + Qn yn;
- repeat if the residual is not yet small enough.
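As a compact illustration of these steps, the following pure-Python sketch (the 2-by-2 matrix and right-hand side are made up for this example) runs the full Arnoldi process with x0 = 0 and solves the small least-squares problem by normal equations, which is adequate at this tiny size; practical implementations use Givens rotations instead, as in the MATLAB code later in this article.

```python
import math

# Hypothetical illustration: GMRES on a made-up 2-by-2 system, x0 = 0.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
m = 2

def matvec(M, v):
    return [sum(Mij * vj for Mij, vj in zip(row, v)) for row in M]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

r0 = b[:]                                  # residual of the guess x0 = 0
beta = norm(r0)
Q = [[ri / beta for ri in r0]]             # q1 = r0 / ||r0||
H = [[0.0] * m for _ in range(m + 1)]      # (m+1)-by-m Hessenberg matrix

for k in range(m):                         # Arnoldi iteration
    w = matvec(A, Q[k])
    for i in range(k + 1):                 # modified Gram-Schmidt
        H[i][k] = dot(w, Q[i])
        w = [wi - H[i][k] * qi for wi, qi in zip(w, Q[i])]
    H[k + 1][k] = norm(w)
    if H[k + 1][k] > 1e-14:                # guard against (lucky) breakdown
        Q.append([wi / H[k + 1][k] for wi in w])

# Normal equations for  min_y ||beta*e1 - H y||:  (H^T H) y = beta * H^T e1
HtH = [[sum(H[i][a] * H[i][c] for i in range(m + 1)) for c in range(m)]
       for a in range(m)]
Htb = [beta * H[0][a] for a in range(m)]
det = HtH[0][0] * HtH[1][1] - HtH[0][1] * HtH[1][0]
y = [(Htb[0] * HtH[1][1] - HtH[0][1] * Htb[1]) / det,
     (HtH[0][0] * Htb[1] - HtH[1][0] * Htb[0]) / det]

x = [sum(y[j] * Q[j][i] for j in range(m)) for i in range(m)]  # x = Q y
residual = norm([bi - axi for bi, axi in zip(b, matvec(A, x))])
print(x, residual)   # exact solution [1/11, 7/11], residual ~ 0
```

Since m = 2, the iteration reaches the exact solution after m steps, as predicted by the convergence discussion below.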
At every iteration, a matrix-vector product A qn must be computed. This costs about 2m^2 floating-point operations for general dense matrices of size m, but the cost can decrease to O(m) for sparse matrices. In addition to the matrix-vector product, O(n m) floating-point operations must be computed at the n-th iteration.
Convergence
The nth iterate minimizes the residual in the Krylov subspace . Since every subspace is contained in the next subspace, the residual does not increase. After m iterations, where m is the size of the matrix A, the Krylov space Km is the whole of Rm and hence the GMRES method arrives at the exact solution. However, the idea is that after a small number of iterations (relative to m), the vector xn is already a good approximation to the exact solution.
This does not happen in general. Indeed, a theorem of Greenbaum, Pták and Strakoš states that for every nonincreasing sequence a1, ..., am−1, am = 0, one can find a matrix A such that ‖rn‖ = an for all n, where rn is the residual defined above. In particular, it is possible to find a matrix for which the residual stays constant for m − 1 iterations, and only drops to zero at the last iteration.
In practice, though, GMRES often performs well. This can be proven in specific situations. If the symmetric part of A, that is M = (A^T + A)/2, is positive definite, then ‖rn‖ ≤ (1 − λmin(M)^2 / λmax(A^T A))^(n/2) ‖r0‖, where λmin and λmax denote the smallest and largest eigenvalue, respectively.
If A is symmetric and positive definite, then we even have ‖rn‖ ≤ ((κ2(A)^2 − 1) / κ2(A)^2)^(n/2) ‖r0‖, where κ2(A) denotes the condition number of A in the Euclidean norm.
In the general case, where A is not positive definite, we have ‖rn‖/‖b‖ ≤ inf over p in Pn of κ2(V) max over λ in σ(A) of |p(λ)|, where Pn denotes the set of polynomials of degree at most n with p(0) = 1, V is the matrix appearing in the spectral decomposition of A (assumed diagonalizable), and σ(A) is the spectrum of A. Roughly speaking, this says that fast convergence occurs when the eigenvalues of A are clustered away from the origin and A is not too far from normality.
All these inequalities bound only the residuals instead of the actual error, that is, the distance between the current iterate xn and the exact solution.
Extensions of the method
Like other iterative methods, GMRES is usually combined with a preconditioning method in order to speed up convergence.
The cost of the iterations grows as O(n^2), where n is the iteration number. Therefore, the method is sometimes restarted after a number, say k, of iterations, with xk as initial guess. The resulting method is called GMRES(k) or Restarted GMRES. For non-positive definite matrices, this method may suffer from stagnation in convergence as the restarted subspace is often close to the earlier subspace.
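The restart idea can be sketched as follows, again on a made-up 2-by-2 system; this is a hypothetical pure-Python illustration of GMRES(1), the smallest possible restart length, where the inner least-squares problem reduces to a scalar (practical restart lengths are more like k = 20 to 50).

```python
import math

# Hypothetical sketch of GMRES(1): each outer cycle runs one Arnoldi
# step from the current iterate, then restarts with the improved guess.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def matvec(M, v):
    return [sum(Mij * vj for Mij, vj in zip(row, v)) for row in M]

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

x = [0.0, 0.0]
for _ in range(100):                        # outer restart loop
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    beta = norm(r)
    if beta < 1e-12:                        # converged
        break
    q1 = [ri / beta for ri in r]            # single Arnoldi vector
    w = matvec(A, q1)
    h11 = sum(wi * qi for wi, qi in zip(w, q1))
    h21 = norm([wi - h11 * qi for wi, qi in zip(w, q1)])
    # minimize ||beta*e1 - (h11, h21)^T y|| over the scalar y
    y = beta * h11 / (h11 * h11 + h21 * h21)
    x = [xi + y * qi for xi, qi in zip(x, q1)]   # restart from updated x

print(x)   # approaches the solution of A x = b
```

The example matrix is symmetric positive definite, so the restarted iteration converges; for indefinite matrices the same loop can stagnate, as noted above.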
The shortcomings of GMRES and restarted GMRES are addressed by the recycling of Krylov subspaces in GCRO-type methods such as GCROT and GCRODR.[6] Recycling of Krylov subspaces in GMRES can also speed up convergence when sequences of linear systems need to be solved.[7]
Comparison with other solvers
The Arnoldi iteration reduces to the Lanczos iteration for symmetric matrices. The corresponding Krylov subspace method is the minimal residual method (MinRes) of Paige and Saunders. Unlike the unsymmetric case, the MinRes method is given by a three-term recurrence relation. It can be shown that there is no Krylov subspace method for general matrices which is given by a short recurrence relation and yet minimizes the norms of the residuals, as GMRES does.
Another class of methods builds on the unsymmetric Lanczos iteration, in particular the BiCG method. These use a three-term recurrence relation, but they do not attain the minimum residual, and hence the residual does not decrease monotonically for these methods. Convergence is not even guaranteed.
The third class is formed by methods like CGS and BiCGSTAB. These also work with a three-term recurrence relation (hence, without optimality) and they can even terminate prematurely without achieving convergence. The idea behind these methods is to choose the generating polynomials of the iteration sequence suitably.
None of these three classes is the best for all matrices; there are always examples in which one class outperforms the others. Therefore, multiple solvers are tried in practice to see which one is best for a given problem.
Solving the least squares problem
One part of the GMRES method is to find the vector yn which minimizes ‖H̃n yn − β e1‖. Note that H̃n is an (n+1)-by-n matrix, hence it gives an over-constrained linear system of n+1 equations for n unknowns.
The minimum can be computed using a QR decomposition: find an (n + 1)-by-(n + 1) orthogonal matrix Ωn and an (n + 1)-by-n upper triangular matrix R̃n such that Ωn H̃n = R̃n. The triangular matrix has one more row than it has columns, so its bottom row consists of zeros. Hence, it can be decomposed as R̃n = [Rn; 0], where Rn is an n-by-n (thus square) upper triangular matrix.
The QR decomposition can be updated cheaply from one iteration to the next, because the Hessenberg matrices differ only by a row of zeros and a column: H̃n+1 is H̃n with one extra column and a row of zeros appended. Applying Ωn (extended by one row and column of the identity) to H̃n+1 therefore yields a matrix that is upper triangular except for a single subdiagonal entry in its last column. One Givens rotation Gn acting on the last two rows annihilates this entry, giving Ωn+1 = Gn · diag(Ωn, 1) and the new triangular factor; when Ωn is stored as a product of Givens rotations, the update requires only O(n) work beyond rotating the new column.
Given the QR decomposition, the minimization problem is easily solved by noting that ‖H̃n yn − β e1‖ = ‖Ωn (H̃n yn − β e1)‖ = ‖R̃n yn − β Ωn e1‖. Denoting the vector β Ωn e1 by g̃n = [gn; γn], with gn in R^n and γn a scalar, the square of this norm is ‖Rn yn − gn‖^2 + |γn|^2. The minimizer is therefore yn = Rn^(−1) gn, and the norm of the minimal residual is |γn|.
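The procedure can be checked on a small numeric example; in the following pure-Python sketch, the 3-by-2 Hessenberg matrix and β are made up for illustration. The rotations triangularize H̃ and simultaneously transform β e1, after which back substitution yields the minimizer and the last transformed entry gives the residual norm.

```python
import math

# Hypothetical example: minimize ||beta*e1 - H y|| for a small
# (n+1)-by-n Hessenberg matrix H via Givens rotations.
H = [[2.0, 1.0],
     [1.0, 3.0],
     [0.0, 1.0]]
n = 2
beta = 1.0
g = [beta, 0.0, 0.0]                 # g starts as beta * e1

for k in range(n):
    rho, sigma = H[k][k], H[k + 1][k]
    t = math.hypot(rho, sigma)
    c, s = rho / t, sigma / t        # rotation zeroing the subdiagonal entry
    for j in range(k, n):            # rotate rows k and k+1 of H
        hk, hk1 = H[k][j], H[k + 1][j]
        H[k][j] = c * hk + s * hk1
        H[k + 1][j] = -s * hk + c * hk1
    g[k], g[k + 1] = c * g[k] + s * g[k + 1], -s * g[k] + c * g[k + 1]

# H now holds [R; 0]; solve R y = g by back substitution
y = [0.0] * n
for i in reversed(range(n)):
    y[i] = (g[i] - sum(H[i][j] * y[j] for j in range(i + 1, n))) / H[i][i]

print(y)          # the minimizer y_n
print(abs(g[n]))  # |gamma_n|, the norm of the minimal residual
```

For this H and β the minimizer is y = (17/30, −1/6) and the minimal residual norm is 1/sqrt(30), which can be confirmed against the normal equations.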
Example code
Regular GMRES (MATLAB / GNU Octave)
function [x, e] = gmres(A, b, x, max_iterations, threshold)
  m = max_iterations;

  % use x as the initial vector
  r = b - A * x;

  b_norm = norm(b);
  error = norm(r) / b_norm;

  % initialize the 1D vectors
  sn = zeros(m, 1);
  cs = zeros(m, 1);
  e1 = zeros(m + 1, 1);
  e1(1) = 1;
  e = [error];
  r_norm = norm(r);
  Q(:, 1) = r / r_norm;
  % Note: this is not the beta scalar in section "The method" above but
  % the beta scalar multiplied by e1
  beta = r_norm * e1;

  for k = 1:m
    % run arnoldi
    [H(1:k+1, k), Q(:, k+1)] = arnoldi(A, Q, k);

    % eliminate the subdiagonal element in the k-th column of H
    % and update the rotation matrices
    [H(1:k+1, k), cs(k), sn(k)] = apply_givens_rotation(H(1:k+1, k), cs, sn, k);

    % update the residual vector
    beta(k + 1) = -sn(k) * beta(k);
    beta(k)     = cs(k) * beta(k);
    error       = abs(beta(k + 1)) / b_norm;

    % save the error
    e = [e; error];

    if (error <= threshold)
      break;
    end
  end
  % if threshold is not reached, k = m at this point (and not m+1)

  % calculate the result
  y = H(1:k, 1:k) \ beta(1:k);
  x = x + Q(:, 1:k) * y;
end
%----------------------------------------------------%
% Arnoldi Function %
%----------------------------------------------------%
function [h, q] = arnoldi(A, Q, k)
  q = A * Q(:, k);   % Krylov vector
  for i = 1:k        % modified Gram-Schmidt, keeping the Hessenberg matrix
    h(i) = q' * Q(:, i);
    q = q - h(i) * Q(:, i);
  end
  h(k + 1) = norm(q);
  q = q / h(k + 1);
end
%---------------------------------------------------------------------%
% Applying Givens Rotation to H col %
%---------------------------------------------------------------------%
function [h, cs_k, sn_k] = apply_givens_rotation(h, cs, sn, k)
  % apply the rotations from previous iterations to the new column
  for i = 1:k-1
    temp     = cs(i) * h(i) + sn(i) * h(i + 1);
    h(i + 1) = -sn(i) * h(i) + cs(i) * h(i + 1);
    h(i)     = temp;
  end

  % compute the sin and cos values for the next rotation
  [cs_k, sn_k] = givens_rotation(h(k), h(k + 1));

  % eliminate H(k + 1, k)
  h(k)     = cs_k * h(k) + sn_k * h(k + 1);
  h(k + 1) = 0.0;
end
%%----Calculate the Givens rotation matrix----%%
function [cs, sn] = givens_rotation(v1, v2)
  t = sqrt(v1^2 + v2^2);
  cs = v1 / t;   % see http://www.netlib.org/eispack/comqr.f
  sn = v2 / t;
end
References
- Saad, Y.; Schultz, M. H. (1986). "GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems". SIAM Journal on Scientific and Statistical Computing. 7 (3): 856–869. doi:10.1137/0907058. ISSN 0196-5204.
- ^ Paige, C. C.; Saunders, M. A. (1975). "Solution of Sparse Indefinite Systems of Linear Equations". SIAM Journal on Numerical Analysis. 12 (4): 617–629. doi:10.1137/0712047.
- ^ Nifa, Naoufal (2017). Solveurs performants pour l'optimisation sous contraintes en identification de paramètres [Efficient solvers for constrained optimization in parameter identification problems] (Thesis) (in French).
- ^ Eisenstat, Elman & Schultz 1983. NB all results for GCR also hold for GMRES, cf. Saad & Schultz 1986
- Trefethen, Lloyd N.; Bau, David III (1997). Numerical Linear Algebra. Philadelphia: SIAM. ISBN 978-0-89871-361-9.
- Stoer, Josef; Bulirsch, Roland (2002). Introduction to Numerical Analysis (3rd ed.). New York: Springer. ISBN 978-0-387-95452-3.
- Meister, Andreas; Vömel, Christof (2005). Numerik linearer Gleichungssysteme. Wiesbaden: Vieweg. ISBN 978-3-528-13135-7.
- Saad, Y. (2003). Iterative Methods for Sparse Linear Systems (2nd ed.). Philadelphia: SIAM. ISBN 978-0-89871-534-7.
- Eisenstat, Stanley C.; Elman, Howard C.; Schultz, Martin H. (1983). "Variational Iterative Methods for Nonsymmetric Systems of Linear Equations". SIAM Journal on Numerical Analysis. 20 (2): 345–357. ISSN 0036-1429.
- Dongarra et al., Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd Edition, SIAM, Philadelphia, 1994
- Imankulov, Timur; Lebedev, Danil; Matkerim, Bazargul; Daribayev, Beimbet; Kassymbek, Nurislam (2021-10-08). "Numerical Simulation of Multiphase Multicomponent Flow in Porous Media: Efficiency Analysis of Newton-Based Method". Fluids. 6 (10): 355. ISSN 2311-5521.