Article

Error Estimators for a Krylov Subspace Iterative Method for Solving Linear Systems of Equations with a Symmetric Indefinite Matrix

by Mohammed Alibrahim 1, Mohammad Taghi Darvishi 2, Lothar Reichel 1,* and Miodrag M. Spalević 3
1 Department of Mathematical Sciences, Kent State University, Kent, OH 44242, USA
2 Department of Mathematics, Razi University, Kermanshah 67149, Iran
3 Department of Mathematics, Faculty of Mechanical Engineering, University of Belgrade, Kraljice Marije 16, 11120 Belgrade 35, Serbia
* Author to whom correspondence should be addressed.
Axioms 2025, 14(3), 179; https://doi.org/10.3390/axioms14030179
Submission received: 27 January 2025 / Revised: 11 February 2025 / Accepted: 16 February 2025 / Published: 28 February 2025

Abstract:
This paper describes a Krylov subspace iterative method designed for solving linear systems of equations with a large, symmetric, nonsingular, and indefinite matrix. This method is tailored to enable the evaluation of error estimates for the computed iterates. The availability of error estimates makes it possible to terminate the iterative process when the estimated error is smaller than a user-specified tolerance. The error estimates are calculated by leveraging the relationship between the iterates and Gauss-type quadrature rules. Computed examples illustrate the performance of the iterative method and the error estimates.

1. Introduction

Consider the linear system of equations
$A x = b,$
where $A \in \mathbb{R}^{m \times m}$ is a large, symmetric, nonsingular, and indefinite matrix and $x$ and $b$ are real vectors. Such systems arise in various areas of applied mathematics and engineering. When A is too large to make its factorization feasible or attractive, an iterative solution method has to be employed. Among the most well-known iterative methods for solving linear systems of the kind (1) are MINRES and SYMMLQ by Paige and Saunders (see [1,2]). However, neither of these methods allows for easy estimation of the error in the computed iterates. This can make it difficult to decide when to terminate the iterations.
For a symmetric, positive definite matrix A, the conjugate gradient method is typically used to solve (1). Various techniques are available to estimate the A-norm of the error in the iterates determined by the conjugate gradient method. These techniques leverage the relationship between the conjugate gradient method and Gauss-type quadrature rules applied to integrate the function $f(t) = t^{-1}$. The quadrature rules are determined with respect to an implicitly defined non-negative measure defined by the matrix A, the right-hand side $b$, and the initial iterate $x_0$ (see, for example, Almutairi et al. [3], Golub and Meurant [4,5], Meurant and Tichý [6] and references therein).
Error estimation of iterates in cases where the matrix A is nonsingular, symmetric, and indefinite has not received much attention in the literature. We observe that the application of A-norm estimates of the error in the iterates is meaningful when A is symmetric and positive definite [7]. However, this is not the case when A is symmetric and indefinite. In this paper, we estimate the Euclidean norm of the error for each iterate produced by an iterative method, which is described below.
In their work, Calvetti et al. [8] introduced a Krylov subspace method for the iterative solution of (1). They proposed estimating the Euclidean norm of the error in the iterates generated by their method using pairs of associated Gauss and anti-Gauss quadrature rules. However, the quality of the error norm estimates determined in this manner is mixed. Examples of computed results demonstrate that some error norm estimates are significantly exaggerated compared to the actual error norm in the iterates.
This paper presents novel methods for calculating the Euclidean norm of the error in the iterates computed using the iterative method described in Section 2 and [8]. In particular, the anti-Gauss rule used in [8] is replaced by other quadrature rules.
For notational simplicity, we start with the initial approximate solution x 0 = 0 . Consequently, the kth approximate solution x k determined by the iterative method discussed in this paper lives in the Krylov subspace:
$\mathcal{K}_k(A, b) = \operatorname{span}\{ b, A b, \ldots, A^{k-1} b \}, \qquad k = 1, 2, \ldots,$
i.e.,
$x_k = p_{k-1}(A)\, b, \qquad k = 1, 2, \ldots,$
for a suitable iteration polynomial $p_{k-1} \in \Pi_{k-1}$, where $\Pi_{k-1}$ is the set of all polynomials of degree less than or equal to $k-1$. We require that the iteration polynomials satisfy
$p_{k-1}(0) = 0, \qquad k = 1, 2, \ldots.$
Then
$p_{k-1}(t) = t\, q_{k-2}(t)$
fulfills condition (4) for a polynomial $q_{k-2} \in \Pi_{k-2}$.
We introduce the residual error related to $x_k$ as
$r_k = b - A x_k$
and let $x_*$ denote the solution of (1). Then the error $\tilde{e}_k = x_* - x_k$ in $x_k$ can be expressed as
$\tilde{e}_k = A^{-1} r_k.$
Equations (6) and (7) yield
$\tilde{e}_k^t \tilde{e}_k = r_k^t A^{-2} r_k = (b - A x_k)^t A^{-2} (b - A x_k) = b^t A^{-2} b - 2 b^t A^{-1} x_k + x_k^t x_k,$
where the superscript t denotes transposition. We may calculate the Euclidean norm of the vector $\tilde{e}_k$ by using the terms found on the right-hand side of Equation (8). It is straightforward to compute the term $x_k^t x_k$, and using (5), the expression $2 b^t A^{-1} x_k$ can be calculated as
$2 b^t A^{-1} x_k = 2 b^t q_{k-2}(A)\, b.$
Hence, the expression $2 b^t A^{-1} x_k$ can be evaluated without using $A^{-1}$. The iterative method has to be chosen so that recursion formulas for the polynomials $q_{k-2}$, $k = 2, 3, \ldots$, can easily be computed. Finally, we have to estimate $b^t A^{-2} b$, which, by setting $f(t) = 1/t^2$, can be written as the following matrix functional
$F(A) = b^t f(A)\, b.$
We use Gauss-type quadrature rules determined by the recurrence coefficients of the iterative method to approximate (9).
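As a quick numerical check of the identity (8), the following small sketch (a toy example of ours; the matrix and data are arbitrary and not taken from the paper) compares both sides for an arbitrary approximate solution.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 8
# Symmetric, (very likely) nonsingular and indefinite toy matrix.
M = rng.standard_normal((m, m))
A = (M + M.T) / 2 + np.diag(np.r_[np.full(4, -6.0), np.full(4, 6.0)])
b = rng.standard_normal(m)
x_star = np.linalg.solve(A, b)

x_k = x_star + 0.1 * rng.standard_normal(m)   # some approximate solution
r_k = b - A @ x_k
e_k = x_star - x_k

Ainv = np.linalg.inv(A)
lhs = e_k @ e_k                                # squared Euclidean error norm
rhs = b @ Ainv @ Ainv @ b - 2 * b @ Ainv @ x_k + x_k @ x_k
print(lhs, rhs, r_k @ Ainv @ Ainv @ r_k)       # all three agree to rounding error
```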
The structure of this paper is as follows. Section 2 outlines the iterative Krylov subspace method employed for the solution of (1). This method was discussed in [8]. We review the method for the convenience of the reader. Our presentation differs from that in [8]. The iterative method is designed to facilitate the evaluation of the last two terms on the right-hand side of (8). Section 3 explores various Gauss-type quadrature rules that are employed to estimate the first term on the right-hand side of (8). The quadrature rules used in this study include three kinds of Gauss-type rules, namely averaged and optimally averaged rules by Laurie [9] and Spalević [10], respectively, as well as Gauss-Radau rules with a fixed quadrature node at the origin. Additionally, we describe how to update the quadrature rules cost-effectively as the number of iterations grows. This section improves on the quadrature rules considered in [8]. Section 4 describes the use of these quadrature rules to estimate the Euclidean norm of the errors e ˜ k , k = 1 , 2 , . Section 5 provides three computed examples and Section 6 contains concluding remarks.
The iterates determined by an iterative method often converge significantly faster when a suitable preconditioner is applied (see, e.g., Saad [11] for discussions and examples of preconditioners). We assume that the system (1) is preconditioned when this is appropriate.

2. The Iterative Scheme

This section revisits the iterative method considered in [8] for solving linear systems of Equation (1) with a nonsingular, symmetric, and indefinite matrix A. We begin by discussing some fundamental properties of the method. Subsequently, we describe updating formulas for the approximate solutions x k . This method can be seen as a variation of the SYMMLQ scheme discussed in [1,12].
It is convenient to introduce the spectral factorization of the coefficient matrix of (1),
$A = U \Lambda U^t,$
where $\Lambda \in \mathbb{R}^{m \times m}$ is a diagonal matrix with diagonal elements $\{\lambda_k\}_{k=1}^{m}$ and the matrix $U \in \mathbb{R}^{m \times m}$ is orthogonal. The spectral factorization is used for the description of the iterative method but does not have to be computed for the solution of (1). Defining $\hat{b} = [\hat{b}_1, \hat{b}_2, \ldots, \hat{b}_m]^t = U^t b$, the functional (9) can be written as
$F(A) = \hat{b}^t f(\Lambda)\, \hat{b} = \sum_{k=1}^{m} f(\lambda_k)\, \hat{b}_k^2 = \int f(x)\, dw(x),$
where the measure d w has jump discontinuities at the eigenvalues λ k of A.
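The following short sketch (ours, with illustrative data) verifies this rewriting of the functional as a sum over the spectrum of A for $f(t) = 1/t^2$.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 6
U, _ = np.linalg.qr(rng.standard_normal((m, m)))     # random orthogonal matrix
lam = np.array([-4.0, -2.0, -1.0, 1.5, 3.0, 5.0])    # indefinite spectrum
A = U @ np.diag(lam) @ U.T
b = rng.standard_normal(m)

f = lambda t: 1.0 / t**2
F_matrix = b @ U @ np.diag(f(lam)) @ U.T @ b          # b^t f(A) b
F_sum = np.sum(f(lam) * (U.T @ b)**2)                 # sum_k f(lambda_k) * (U^t b)_k^2
print(F_matrix, F_sum)                                # agree to rounding error
```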
Our iterative method is based on the Lanczos algorithm. Let $I_k \in \mathbb{R}^{k \times k}$ denote the identity matrix. By applying $1 \le k \le m$ steps of the Lanczos process to A with initial vector $b$, the following Lanczos decomposition is obtained:
$A V_k = V_k T_k + f_k e_k^t,$
where $f_k \in \mathbb{R}^{m}$ and $V_k = [v_1, v_2, \ldots, v_k] \in \mathbb{R}^{m \times k}$ are such that $V_k^t V_k = I_k$, $V_k^t f_k = 0$, and
$v_1 = b / \| b \|.$
Additionally, $T_k$ is a symmetric, tridiagonal $k \times k$ matrix, $e_k$ denotes the kth column of the identity matrix, and $\| \cdot \|$ represents the Euclidean vector norm. The columns of the matrix $V_k$ span the Krylov subspace (2), i.e.,
$\operatorname{range}(V_k) = \mathcal{K}_k(A, b).$
It is assumed that all subdiagonal entries of T k are nonvanishing; otherwise, the recursion formulas of the Lanczos process break down, and the solution of (1) can be formulated as a linear combination of the vectors v j that are available at the time of breakdown. The recursion relation for the columns of V k is established by Equation (11) and, in conjunction with (12), shows that
$v_j = S_{j-1}(A)\, b, \qquad j = 1, 2, \ldots, k,$
for certain polynomials S j of degree j.
Theorem 1.
The polynomials $S_j$ determined by (14) are orthonormal with respect to the inner product
$(g, h) = \mathcal{I}(g h)$
induced by the operator $\mathcal{I}(f) = \int f(x)\, dw(x)$.
Proof. 
We have
$(S_{j-1}, S_{\ell-1}) = \int S_{j-1}(x) S_{\ell-1}(x)\, dw(x) = b^t U S_{j-1}(\Lambda) S_{\ell-1}(\Lambda) U^t b = b^t U S_{j-1}(\Lambda) U^t\, U S_{\ell-1}(\Lambda) U^t b = b^t S_{j-1}(A) S_{\ell-1}(A)\, b = (S_{j-1}(A) b)^t (S_{\ell-1}(A) b) = v_j^t v_\ell = \begin{cases} 0, & j \neq \ell, \\ 1, & j = \ell, \end{cases}$
because the columns $v_j$, $1 \le j \le k$, of the matrix $V_k$ are orthogonal and of unit norm (see (11)).    □
We also use the following decomposition, which is related to (11):
$A V_{k-1} = V_k T_{k,k-1}, \qquad k > 1,$
where $T_{k,k-1}$ is the leading submatrix of $T_k$ of order $k \times (k-1)$.
We use the QR factorization of $T_k$, that is,
$T_k = Q_k R_k, \qquad Q_k, R_k \in \mathbb{R}^{k \times k},$
where $Q_k^t Q_k = I_k$ and the matrix $R_k = [r_{j\ell}^{(k)}]_{j,\ell=1}^{k}$ is upper triangular. Similarly, we introduce the factorization
$T_{k,k-1} = Q_k \begin{bmatrix} \bar{R}_{k-1} \\ 0 \end{bmatrix} = Q_{k,k-1} \bar{R}_{k-1},$
where the $(k-1) \times (k-1)$ matrix $\bar{R}_{k-1}$ is the leading submatrix of $R_k$ and $Q_{k,k-1} \in \mathbb{R}^{k \times (k-1)}$ is the leading submatrix of $Q_k$.
Theorem 2.
Combine the QR factorization (17) with the Lanczos decomposition (11). This defines a new iterative process with iteration polynomials that comply with (4).
Proof. 
Let $k > 1$. By applying the QR factorization (17) within the Lanczos decomposition (11), we obtain
$A V_k = V_k Q_k R_k + f_k e_k^t.$
Multiplying (19) by Q k from the right-hand side, letting V ˜ k : = V k Q k , and defining T ˜ k = R k Q k , we obtain
$A \tilde{V}_k = \tilde{V}_k \tilde{T}_k + f_k e_k^t Q_k.$
The column vectors of the matrix expressed as V ˜ k = [ v ˜ 1 ( k ) , v ˜ 2 ( k ) , , v ˜ k ( k ) ] are orthonormal, and the matrix T ˜ k is symmetric and tridiagonal. To expose the relation between the first column v 1 of V k and the first column v ˜ 1 ( k ) of V ˜ k , we multiply (19) by e 1 from the right. This yields
$A V_k e_1 = \tilde{V}_k R_k e_1 + f_k e_k^t e_1, \qquad k > 1,$
which simplifies to
$A v_1 = r_{11}^{(k)}\, \tilde{v}_1^{(k)},$
where $r_{11}^{(k)} e_1 = R_k e_1$. For a suitable choice of the sign of $r_{11}^{(k)}$, we have
$\tilde{v}_1^{(k)} = A b / \| A b \|.$
Since T k is tridiagonal, the orthogonal matrix Q k in the QR factorization (17) has upper Hessenberg form. As a result, only the last two entries of the vector expressed as e k t Q k are non-zero. Hence, the decomposition (20) differs from a Lanczos decomposition by potentially having non-zero entries in the last two columns of the matrix f k e k t Q k .
Suppose that the matrix V ¯ k 1 consists of the first k 1 columns of V ˜ k . Then,
$\bar{V}_{k-1} = V_k Q_{k,k-1},$
where $Q_{k,k-1}$ is defined by Equation (18). Typically, $\bar{V}_{k-1} \neq \tilde{V}_{k-1}$; additional details can be found in Section 4. When the last column is removed from each term in (20), the following decomposition results:
$A \bar{V}_{k-1} = \bar{V}_{k-1} \tilde{T}_{k-1} + \bar{f}_{k-1} e_{k-1}^t.$
In (23), the matrix $\tilde{T}_{k-1}$ is the $(k-1) \times (k-1)$ leading principal submatrix of $\tilde{T}_k$. Furthermore, $\bar{V}_{k-1}^t \bar{f}_{k-1} = 0$ and $\bar{V}_{k-1}^t \bar{V}_{k-1} = I_{k-1}$. As a result, according to (21), we have that (23) is a Lanczos decomposition whose starting vector $\tilde{v}_1^{(k)}$, the first column of $\bar{V}_{k-1}$, is proportional to the vector $A b$. Similarly to (13), we have
$\operatorname{range}(\bar{V}_{k-1}) = \mathcal{K}_{k-1}(A, A b).$
To determine the iteration polynomials (3) and the corresponding approximate solutions $x_k$ of (1), we impose the following requirement for certain vectors $w_{k-1} \in \mathbb{R}^{k-1}$:
$x_k = p_{k-1}(A)\, b = \bar{V}_{k-1} w_{k-1}.$
It follows from (24) that any polynomial p k 1 determined by (25) fulfills (4). This completes the proof.    □
Remark 1.
We choose $w_{k-1}$ in (25) and, thereby, $p_{k-1} \in \Pi_{k-1}$ in a manner that guarantees that the residual error (6) for the approximate solution $x_k$ of (1) satisfies the Petrov-Galerkin condition, i.e.,
$0 = V_{k-1}^t r_k = V_{k-1}^t b - V_{k-1}^t A \bar{V}_{k-1} w_{k-1},$
which, according to (12) and the factorization (22), simplifies to
$\| b \|\, e_1 = (A V_{k-1})^t V_k Q_{k,k-1} w_{k-1}.$
Remark 2.
Replacing the matrix V ¯ k 1 in (26) with V k 1 recovers the SYMMLQ method [1]. However, the iteration polynomial p k 1 associated with the SYMMLQ method typically does not satisfy condition (4). Our method implements a QR factorization of matrix T k , akin to the SYMMLQ method implementation by Fischer ([12], Section 6.5). In contrast, Paige and Saunders’ [1] implementation of the SYMMLQ method relies on an LQ factorization of T k .
Remark 3.
Equation (26) shows that the iterative method is a Petrov-Galerkin method. In each step of the method, the dimension of the solution subspace (24) is increased by one, and the residual error is required to be orthogonal to the subspace $\mathcal{K}_{k-1}(A, b)$, cf. (13). This secures convergence of the iterates (25) to the solution $x_*$ of (1) as k increases.
Using Theorems 1 and 2, along with Remark 1, we can simplify the right-hand side of (27). First, using (16) and (18), we obtain
$(A V_{k-1})^t V_k Q_{k,k-1} = T_{k,k-1}^t Q_{k,k-1} = \bar{R}_{k-1}^t.$
Subsequently, by substituting (28) into (27), we obtain
$\bar{R}_{k-1}^t w_{k-1} = \| b \|\, e_1.$
We can evaluate $w_{k-1}$ by forward substitution using (29).
Section 4 presents iterative formulas to efficiently update the approximate solutions $x_k$. The remainder of this section discusses the evaluation of the right-hand side of (8). From (24) and (25), it can be deduced that $x_k$ lives in $\mathcal{K}_{k-1}(A, A b)$. Consequently, there is a vector $y_{k-1} \in \mathbb{R}^{k-1}$ such that
$A^{-1} x_k = V_{k-1} y_{k-1}.$
Using the decomposition (16), the kth iterate generated by our iterative method can be written as
$x_k = A V_{k-1} y_{k-1} = V_k T_{k,k-1} y_{k-1}.$
Furthermore, according to (22) and (25), we have
$x_k = V_k Q_{k,k-1} w_{k-1}.$
This implies that
$Q_{k,k-1} w_{k-1} = T_{k,k-1} y_{k-1}.$
Multiplying (31) by Q k , k 1 t from the left and using (18) yields
$w_{k-1} = Q_{k,k-1}^t T_{k,k-1} y_{k-1} = \bar{R}_{k-1} y_{k-1}.$
By successively applying (12), (29), (30), and (32), we obtain
$b^t A^{-1} x_k = b^t V_{k-1} y_{k-1} = \| b \|\, e_1^t y_{k-1} = \| b \|\, e_1^t \bar{R}_{k-1}^{-1} w_{k-1} = w_{k-1}^t w_{k-1}.$
According to (25), it follows that x k t x k = w k 1 t w k 1 . Combining this with (33) shows that Equation (8) can be represented as
$\tilde{e}_k^t \tilde{e}_k = r_k^t A^{-2} r_k = b^t A^{-2} b - w_{k-1}^t w_{k-1}.$
The term $w_{k-1}^t w_{k-1}$ can easily be computed using (29). Section 3 describes several Gauss-type quadrature rules that are applied to compute estimates of $b^t A^{-2} b$ in Section 4.
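To indicate the connection concretely, the n-node Gauss approximation of $b^t A^{-2} b$ can be evaluated directly from the Lanczos recurrence coefficients, since for $f(t) = 1/t^2$ the Gauss rule with respect to the (unnormalized) measure (10) equals $\|b\|^2\, e_1^t T_n^{-2} e_1$. The sketch below is our illustration; the function name and interface are ours, not the authors'.

```python
import numpy as np

def gauss_estimate_btA2b(alpha, beta, b_norm):
    # Gauss-rule approximation of b^t A^{-2} b from the Lanczos coefficients
    # alpha_1..alpha_n and beta_1..beta_{n-1} (cf. (36)); b_norm = ||b||.
    # Since T_n is symmetric, e_1^t T_n^{-2} e_1 = ||T_n^{-1} e_1||^2.
    n = len(alpha)
    Tn = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    y = np.linalg.solve(Tn, np.eye(n)[:, 0])   # T_n^{-1} e_1
    return b_norm**2 * (y @ y)
```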

3. Quadrature Rules

This section considers the approximation of integrals like
$\mathcal{I}(f) = \int_a^b f(x)\, dw(x)$
by Gauss-type quadrature rules, where $[a, b] \subseteq [-\infty, \infty]$ and $dw$ denotes a non-negative measure with an infinite number of support points such that all moments $\mu_j = \int_a^b x^j\, dw(x)$ exist, for $j = 0, 1, 2, \ldots$. In this section, we assume that $\mu_0 = 1$. Let
$\mathcal{Q}_n(f) = \sum_{i=1}^{n} w_i^{(n)} f(\xi_i^{(n)})$
denote an n-node quadrature rule to approximate (34). Then
$\int_a^b f(x)\, dw(x) = \mathcal{Q}_n(f) + E_n(f),$
where $E_n(f)$ is the remainder term. This term vanishes for all polynomials in $\Pi_d$ for some non-negative integer d. The value of d is referred to as the degree of precision of the quadrature rule. It is well known that the maximum value of d for an n-node quadrature rule is $2n-1$. This value is achieved by the n-node Gauss rule (see, e.g., [13] for a proof). The latter rule can be written as
$G_n(f) = \sum_{i=1}^{n} w_i^{(n)} f(\xi_i^{(n)}).$
The nodes $\xi_i^{(n)}$, $i = 1, 2, \ldots, n$, are the eigenvalues of the matrix
$T_n = \begin{bmatrix} \alpha_1 & \beta_1 & & & 0 \\ \beta_1 & \alpha_2 & \beta_2 & & \\ & \ddots & \ddots & \ddots & \\ & & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} \\ 0 & & & \beta_{n-1} & \alpha_n \end{bmatrix} \in \mathbb{R}^{n \times n},$
and the weights $w_i^{(n)}$ are the squares of the first components of the associated normalized eigenvectors.
The entries $\alpha_i$ and $\beta_i$ of $T_n$ are obtained from the recursion formula for the sequence of monic orthogonal polynomials $\{P_i\}_{i=0}^{\infty}$ associated with the inner product (15):
$P_{i+1}(x) = (x - \alpha_i) P_i(x) - \beta_i^2 P_{i-1}(x), \qquad i = 0, 1, 2, \ldots,$
where $P_{-1}(x) \equiv 0$ and $P_0(x) \equiv 1$. The values of $\alpha_i$ and $\beta_i$ in (37) can be determined from the following formulas (see, e.g., Gautschi [13] for details):
$\alpha_i = \frac{(x P_i, P_i)}{(P_i, P_i)}, \qquad \beta_i^2 = \frac{(P_i, P_i)}{(P_{i-1}, P_{i-1})};$
They also can be computed by the Lanczos process, which is presented in Algorithm 1.
Algorithm 1: The Lanczos algorithm.
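The pseudo-code figure is not reproduced here. The following is a minimal NumPy sketch of the standard symmetric Lanczos process (our own rendering; the variable names and interface are illustrative and need not match the authors' implementation):

```python
import numpy as np

def lanczos(A, b, k):
    # Minimal sketch of the symmetric Lanczos process in standard form.
    # Produces the decomposition A V_k = V_k T_k + f_k e_k^t of (11),
    # with T_k = tridiag(beta, alpha, beta).
    m = b.shape[0]
    V = np.zeros((m, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)                 # beta[k-1] equals ||f_k||, cf. Section 4
    v, v_old, b_old = b / np.linalg.norm(b), np.zeros(m), 0.0
    for j in range(k):
        V[:, j] = v
        w = A @ v - b_old * v_old      # three-term recurrence
        alpha[j] = v @ w
        w -= alpha[j] * v
        b_old = np.linalg.norm(w)
        beta[j] = b_old
        if b_old == 0.0:               # breakdown: solution lies in current subspace
            break
        v_old, v = v, w / b_old
    return V, alpha, beta
```

The tridiagonal matrix $T_k$ of (36) is then assembled from alpha and beta[:k-1]; the final entry beta[k-1] is the quantity $\|f_k\|$ used for the Gauss-Radau matrices in Section 4.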
It is straightforward to demonstrate that
$G_n(f) = e_1^t f(T_n)\, e_1,$
where $e_1 = [1, 0, \ldots, 0]^t$.
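As an illustration (our own sketch, assuming the normalization $\mu_0 = 1$ used in this section), the rule (35) and the representation (38) can be evaluated from the matrix (36) as follows:

```python
import numpy as np

def gauss_rule(alpha, beta):
    # Nodes and weights of the n-node Gauss rule from the tridiagonal matrix (36)
    # (Golub-Welsch): nodes are eigenvalues, weights are the squared first
    # components of the normalized eigenvectors (assuming mu_0 = 1).
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    nodes, S = np.linalg.eigh(T)
    weights = S[0, :]**2
    return nodes, weights

def gauss_functional(alpha, beta, f):
    # Evaluates G_n(f) = e_1^t f(T_n) e_1, cf. (38).
    nodes, weights = gauss_rule(alpha, beta)
    return np.sum(weights * f(nodes))
```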
We are interested in measures d w with support in two real intervals [ a , b ] and [ c , d ] , where a < b < 0 < c < d . The following result sheds light on how the nodes of the Gauss rule (35) are allocated for such measures.
Theorem 3.
Let d ω be a non-negative measure with support on the union of bounded real intervals [ a , b ] and [ c , d ] , where a < b < c < d . Then, the Gauss rule (35) has at most one node in the open interval ( b , c ) .
Proof. 
The result follows from [14] (Theorem 3.41.1). □
The following subsection reviews some Gauss-type quadrature rules that are used to estimate the error in approximate solutions x k of (1) that are generated by the iterative method proposed in Section 2.

Selected Gauss-Type Quadrature Rules

In [9], Laurie presented anti-Gauss quadrature rules. A recent analysis of anti-Gauss rules was also carried out by Díaz de Alba et al. [15]. Related investigations can be found in [16,17]. The ( n + 1 ) -point anti-Gauss rule G ˘ n + 1 , which is associated with the Gauss rule (35), is defined by the property
$(\mathcal{I} - \breve{G}_{n+1})(f) = -(\mathcal{I} - G_n)(f), \qquad \text{for all } f \in \Pi_{2n+1}.$
The following tridiagonal matrix is used to determine the rule G ˘ n + 1 :
$\breve{T}_{n+1} = \begin{bmatrix} \alpha_1 & \beta_1 & & & & 0 \\ \beta_1 & \alpha_2 & \beta_2 & & & \\ & \ddots & \ddots & \ddots & & \\ & & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} & \\ & & & \beta_{n-1} & \alpha_n & \sqrt{2}\, \beta_n \\ 0 & & & & \sqrt{2}\, \beta_n & \alpha_{n+1} \end{bmatrix} \in \mathbb{R}^{(n+1) \times (n+1)}.$
Similarly to (38), we have
$\breve{G}_{n+1}(f) = e_1^t f(\breve{T}_{n+1})\, e_1.$
Moreover,
$\breve{G}_{n+1}(f) = \mathcal{I}(f), \qquad \text{for all } f \in \Pi_{2n-1}.$
Further, Laurie [9] introduced the averaged Gauss quadrature rule associated with G n . It has 2 n + 1 nodes and is given by
$A_{2n+1} = \tfrac{1}{2}\left( G_n + \breve{G}_{n+1} \right).$
The property (39) suggests that the quadrature error for A 2 n + 1 is smaller than the error for G n . Indeed, it follows from (39) that the degree of precision of A 2 n + 1 is no less than 2 n + 1 . This implies that the difference
$A_{2n+1}(f) - G_n(f)$
can be used to estimate the quadrature error
$\mathcal{I}(f) - G_n(f).$
Computed results in [18] illustrate that, for numerous integrands and various values of n, the difference (41) provides a fairly accurate approximation of the quadrature error (42). The accuracy of these estimates depends both on the integrand and on the value of n.
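For illustration, the following sketch (ours; the interface is hypothetical) assembles the anti-Gauss matrix above and evaluates Laurie's averaged rule (40) from the recurrence coefficients $\alpha_1, \ldots, \alpha_{n+1}$ and $\beta_1, \ldots, \beta_n$:

```python
import numpy as np

def averaged_gauss(alpha, beta, f):
    # Laurie's averaged rule A_{2n+1}(f) = (G_n(f) + antiGauss_{n+1}(f)) / 2, cf. (40).
    # alpha: alpha_1..alpha_{n+1}, beta: beta_1..beta_n (numpy arrays).
    n = len(alpha) - 1
    Tn = np.diag(alpha[:n]) + np.diag(beta[:n-1], 1) + np.diag(beta[:n-1], -1)
    # Anti-Gauss matrix: last off-diagonal entry scaled by sqrt(2).
    b_anti = np.array(beta, dtype=float)
    b_anti[n-1] = np.sqrt(2.0) * beta[n-1]
    Tanti = np.diag(alpha) + np.diag(b_anti, 1) + np.diag(b_anti, -1)

    def rule(T):
        nodes, S = np.linalg.eigh(T)
        return np.sum(S[0, :]**2 * f(nodes))   # e_1^t f(T) e_1

    return 0.5 * (rule(Tn) + rule(Tanti))
```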
In [10], Spalević presented optimal averaged Gauss quadrature rules, which usually have a higher degree of precision than averaged Gauss rules with the same number of nodes. The symmetric tridiagonal matrix for the optimal averaged Gauss quadrature rule $\hat{A}_{2n+1}$ with $2n+1$ nodes is defined as follows. Introduce the reverse matrix $T_n'$ of $T_n$, which is given by
$T_n' = \begin{bmatrix} \alpha_n & \beta_{n-1} & & & 0 \\ \beta_{n-1} & \alpha_{n-1} & \beta_{n-2} & & \\ & \ddots & \ddots & \ddots & \\ & & \beta_2 & \alpha_2 & \beta_1 \\ 0 & & & \beta_1 & \alpha_1 \end{bmatrix} \in \mathbb{R}^{n \times n},$
as well as the concatenated matrix
$\hat{T}_{2n+1} = \begin{bmatrix} T_n & \beta_n e_n & 0 \\ \beta_n e_n^t & \alpha_{n+1} & \beta_{n+1} e_1^t \\ 0 & \beta_{n+1} e_1 & T_n' \end{bmatrix} \in \mathbb{R}^{(2n+1) \times (2n+1)}.$
The nodes of the rule A ^ 2 n + 1 are the eigenvalues, and the weights are the squared first components of normalized eigenvectors of the matrix T ^ 2 n + 1 . It is worth noting that n of the nodes of A ^ 2 n + 1 agree with the nodes of G n . Similarly to Equation (38), we have
$\hat{A}_{2n+1}(f) = e_1^t f(\hat{T}_{2n+1})\, e_1.$
The degree of precision for this quadrature rule is at least 2 n + 2 . Analyses of the degree of precision of the rules A ^ 2 n + 1 and the location of their largest and smallest nodes for several measures for which explicit expressions for coefficients α i and β i are known can be found in [19] and references therein. An estimate of the quadrature error in the Gauss rule (35) is given by
$\hat{A}_{2n+1}(f) - G_n(f).$
Numerical examples provided in [18] show this estimate to be quite accurate for a wide range of integrands. As the rule A ^ 2 n + 1 typically has strictly higher degree of precision than Laurie’s averaged Gauss rule (40), we expect the quadrature error estimate (43) to generally be more accurate than the estimate (41), particularly for integrands with high-order differentiability.
In the computations, we use the representation
$\hat{A}_{2n+1}(f) = \frac{\beta_{n+1}^2}{\beta_n^2 + \beta_{n+1}^2}\, G_n(f) + \frac{\beta_n^2}{\beta_n^2 + \beta_{n+1}^2}\, G_{n+1}^*(f),$
where
$G_{n+1}^*(f) = e_1^t f(T_{n+1}^*)\, e_1$
with
$T_{n+1}^* = \begin{bmatrix} \alpha_1 & \beta_1 & & & & 0 \\ \beta_1 & \alpha_2 & \beta_2 & & & \\ & \ddots & \ddots & \ddots & & \\ & & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} & \\ & & & \beta_{n-1} & \alpha_n & \beta_n^* \\ 0 & & & & \beta_n^* & \alpha_{n+1} \end{bmatrix} \in \mathbb{R}^{(n+1) \times (n+1)}$
and $\beta_n^* = \sqrt{\beta_n^2 + \beta_{n+1}^2}$.
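A corresponding sketch (ours, with an illustrative interface) of the representation above; it uses the coefficients $\alpha_1, \ldots, \alpha_{n+1}$ and $\beta_1, \ldots, \beta_{n+1}$:

```python
import numpy as np

def optimal_averaged_gauss(alpha, beta, f):
    # Spalevic's optimal averaged rule via the weighted representation above:
    # hatA_{2n+1}(f) = w1 * G_n(f) + w2 * G*_{n+1}(f).
    # alpha: alpha_1..alpha_{n+1}, beta: beta_1..beta_{n+1} (numpy arrays).
    n = len(alpha) - 1
    bn2, bn1_2 = beta[n-1]**2, beta[n]**2          # beta_n^2 and beta_{n+1}^2

    def rule(a, b):                                # e_1^t f(T) e_1, T = tridiag(b, a, b)
        T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
        nodes, S = np.linalg.eigh(T)
        return np.sum(S[0, :]**2 * f(nodes))

    Gn = rule(alpha[:n], beta[:n-1])
    b_star = np.concatenate([beta[:n-1], [np.sqrt(bn2 + bn1_2)]])   # beta_n -> beta_n^*
    Gstar = rule(alpha, b_star)
    return (bn1_2 * Gn + bn2 * Gstar) / (bn2 + bn1_2)
```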
We finally consider the Gauss-Radau quadrature rule  R n + 1 , 0 ( f ) , which has n + 1 nodes, with one node anchored at 0. This rule can be written as
$R_{n+1,0}(f) = w_0^{(n+1)} f(0) + \sum_{i=1}^{n} w_i^{(n+1)} f(\xi_{i,0}^{(n+1)}).$
To maximize the degree of precision, which is 2 n , the n nodes ξ i , 0 ( n + 1 ) , i = 1 , 2 , , n , are suitably chosen. The rule R n + 1 , 0 ( f ) can be expressed as
$R_{n+1,0}(f) = e_1^t f(T_{n+1,0})\, e_1,$
where
$T_{n+1,0} = \begin{bmatrix} \alpha_1 & \beta_1 & & & & 0 \\ \beta_1 & \alpha_2 & \beta_2 & & & \\ & \ddots & \ddots & \ddots & & \\ & & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} & \\ & & & \beta_{n-1} & \alpha_n & \beta_n \\ 0 & & & & \beta_n & \tilde{\alpha}_{n+1} \end{bmatrix} \in \mathbb{R}^{(n+1) \times (n+1)},$
and the entry $\tilde{\alpha}_{n+1}$ is chosen so that $T_{n+1,0}$ has an eigenvalue at the origin. Details on how to determine $\tilde{\alpha}_{n+1}$ are provided by Gautschi [13,20] and Golub [21].
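Concretely, requiring $T_{n+1,0}$ to be singular and eliminating its last row and column by a Schur complement gives $\tilde{\alpha}_{n+1} = \beta_n^2\, e_n^t T_n^{-1} e_n$, provided $T_n$ is nonsingular. A brief sketch (ours) of this computation:

```python
import numpy as np

def radau_modified_entry(alpha, beta):
    # Computes alpha_tilde_{n+1} so that T_{n+1,0} in (45) has the eigenvalue 0:
    # T_{n+1,0} is singular exactly when alpha_tilde_{n+1} = beta_n^2 * e_n^t T_n^{-1} e_n
    # (Schur complement; T_n assumed nonsingular).
    # alpha: alpha_1..alpha_n, beta: beta_1..beta_n (beta_n couples T_n to the new row).
    n = len(alpha)
    Tn = np.diag(alpha) + np.diag(beta[:n-1], 1) + np.diag(beta[:n-1], -1)
    e_n = np.zeros(n)
    e_n[-1] = 1.0
    delta = np.linalg.solve(Tn, e_n)          # last column of T_n^{-1}
    return beta[n-1]**2 * delta[-1]
```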
Theorem 4.
Let the nodes $\{\xi_{i,0}^{(n+1)}\}_{i=0}^{n}$ of the rule (44) and the nodes $\{\xi_i^{(n)}\}_{i=1}^{n}$ of the rule (35) be ordered according to increasing magnitude. Then $\xi_{0,0}^{(n+1)} = 0$ and
$|\xi_{i,0}^{(n+1)}| > |\xi_i^{(n)}|, \qquad i = 1, \ldots, n.$
Proof. 
The last subdiagonal entry of the matrix (45) is non-vanishing since the measure d w in (34) has infinitely many support points. By the Cauchy interlacing theorem, the eigenvalues of the leading principal n × n submatrix (36) of the symmetric tridiagonal ( n + 1 ) × ( n + 1 ) matrix (45) strictly interlace the eigenvalues of the latter. Since one of the eigenvalues of (45) vanishes, the theorem follows. □
The measure d w has no support at the origin. We therefore apply the Gauss-Radau rules with f ( 0 ) = 0 in (44).
Finally, we compute error-norm estimates by using the “minimum rule”:
$A_{4n+2}^{\min}(f) = \min\left\{ A_{2n+1}(f),\ \hat{A}_{2n+1}(f) \right\}.$
It typically has 3 n + 2 distinct nodes. This rule is justified by the observation that the rules A 2 n + 1 ( f ) and A ^ 2 n + 1 ( f ) sometimes overestimate the error norm.

4. Error-Norm Estimation

This section outlines how the quadrature rules discussed in the previous section can be used to estimate the Euclidean norm of the error in the iterates x k , k = 0 , 1 , , determined by the iterative method described in Section 2. The initial iterate is assumed to be x 0 = 0 . Then the iterate x n lives in the Krylov subspace (cf. (24) and (25)),
$\mathcal{K}_{n-1}(A, A b) = \operatorname{span}\left\{ A b, A^2 b, \ldots, A^{n-1} b \right\}.$
The residual corresponding to the iterate x n is defined in (6). We use the relation (7) to obtain the Euclidean norm of the error, cf. (8). Our task is to estimate the first term in the right-hand side of (8). In this section, the measure is defined by (10). In particular, the measure has support in intervals on the negative and positive real axis. These intervals exclude an interval around the origin.
We turn to the computation of the iterates x k described by (25). The computations can be structured in such a way that only a few m-vectors need to be stored. Let the k × k real matrix T k be given by (36) with n = k , i.e.,
$T_k = \begin{bmatrix} \alpha_1 & \beta_1 & & & & 0 \\ \beta_1 & \alpha_2 & \beta_2 & & & \\ & \beta_2 & \alpha_3 & \ddots & & \\ & & \ddots & \ddots & \beta_{k-2} & \\ & & & \beta_{k-2} & \alpha_{k-1} & \beta_{k-1} \\ 0 & & & & \beta_{k-1} & \alpha_k \end{bmatrix}.$
Based on the discussion following Equation (12), we may assume that the $\beta_j$ are nonzero. This ensures that the eigenvalues of $T_k$ are distinct. We can compute the QR factorization (17) of $T_k$ by applying a sequence of $k-1$ Givens rotations to $T_k$,
$G_k^{(j)} := \begin{bmatrix} I_{j-1} & & \\ & \begin{matrix} c_j & s_j \\ -s_j & c_j \end{matrix} & \\ & & I_{k-j-1} \end{bmatrix} \in \mathbb{R}^{k \times k}, \qquad c_j^2 + s_j^2 = 1, \quad s_j \ge 0.$
This yields an orthogonal matrix Q k and an upper triangular matrix R k given by
$Q_k := G_k^{(1)T} G_k^{(2)T} \cdots G_k^{(k-1)T}, \qquad R_k := G_k^{(k-1)} G_k^{(k-2)} \cdots G_k^{(1)} T_k.$
For a discussion on Givens rotations, see, e.g., ([2], Chapter 5). In our iterative method, the matrix Q k is not explicitly formed; instead, we use the representation in (49). Since T k is tridiagonal, the upper triangular matrix R k has nonzero entries solely on the diagonal and the two adjacent superdiagonals.
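For illustration, the following sketch (ours) carries out the factorization (49) for a symmetric tridiagonal matrix; here $Q_k$ is formed explicitly for clarity, whereas the iterative method keeps it in the factored form (49).

```python
import numpy as np

def givens_qr_tridiag(T):
    # QR factorization of a symmetric tridiagonal matrix by k-1 Givens rotations,
    # cf. (48)-(49).  R has nonzeros only on the diagonal and the two adjacent
    # superdiagonals; Q is the product of the transposed rotations.
    k = T.shape[0]
    R = T.astype(float).copy()
    Q = np.eye(k)
    for j in range(k - 1):
        a, b = R[j, j], R[j + 1, j]
        r = np.hypot(a, b)
        c, s = a / r, b / r                  # rotation zeroing the subdiagonal entry
        G = np.array([[c, s], [-s, c]])
        R[j:j+2, :] = G @ R[j:j+2, :]
        Q[:, j:j+2] = Q[:, j:j+2] @ G.T
    return Q, R
```

In practice only the rotation pairs (c_j, s_j) and the three nonzero diagonals of R need to be stored, which is what keeps the updating formulas below inexpensive.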
Application of k steps of the Lanczos process to the matrix A with initial vector b results in the matrix T k , as shown in (47). By performing one more step, analogous to (11), we obtain the following Lanczos decomposition:
$A V_{k+1} = V_{k+1} T_{k+1} + f_{k+1} e_{k+1}^t.$
We observe that the last subdiagonal element of the symmetric tridiagonal matrix $T_{k+1}$ can be calculated as $\beta_k = \| f_k \|$ right after the kth Lanczos step is completed. This is convenient when evaluating the tridiagonal matrices (45) associated with Gauss-Radau quadrature rules.
We can express the matrix T k + 1 in terms of its QR factorization as follows:
$T_{k+1} = Q_{k+1} R_{k+1},$
whose factors can be computed from Q k and R k in a straightforward manner. Indeed, we have
$Q_{k+1} = \begin{bmatrix} Q_k & 0 \\ 0^t & 1 \end{bmatrix} G_{k+1}^{(k)T}, \qquad Q_{k+1,k} = \begin{bmatrix} Q_k & 0 \\ 0^t & 1 \end{bmatrix} G_{k+1,k}^{(k)T},$
wherein Q k + 1 is a ( k + 1 ) × ( k + 1 ) real orthogonal matrix; Q k + 1 , k is a ( k + 1 ) × k real matrix; G k + 1 ( k ) is defined by (48); and G k + 1 , k ( k ) is a real ( k + 1 ) × k matrix, which is made up of the first k columns of G k + 1 ( k ) .
We express the matrices in terms of their columns to derive updating formulas for the computation of the triangular matrix R k + 1 in (51) from R k in (49):
$R_k = [r_1^{(k)}, r_2^{(k)}, \ldots, r_k^{(k)}], \qquad R_{k+1} = [r_1^{(k+1)}, r_2^{(k+1)}, \ldots, r_k^{(k+1)}, r_{k+1}^{(k+1)}].$
A comparison between (17) and (51) yields the following results:
$r_j^{(k+1)} = \begin{bmatrix} r_j^{(k)} \\ 0 \end{bmatrix}, \qquad 1 \le j < k,$
and
$r_k^{(k+1)} = G_{k+1}^{(k)} \begin{bmatrix} r_k^{(k)} \\ \beta_k \end{bmatrix}, \qquad r_{k+1}^{(k+1)} = G_{k+1}^{(k)} G_{k+1}^{(k-1)} T_{k+1} e_{k+1}.$
Thus, the elements of all the matrices R 1 , R 2 , , R k + 1 can be calculated in just O ( k 2 ) arithmetic floating-point operations (flops).
As defined by (18), the matrix $\bar{R}_k = [\bar{r}_{j\ell}^{(k)}]_{j,\ell=1}^{k}$ is the leading principal submatrix of $R_{k+1}$ of order k, and differs from $R_k = [r_{j\ell}^{(k)}]_{j,\ell=1}^{k}$ only in its last diagonal entry. From Equation (53) and the fact that $\beta_k$ is nonzero, it follows that $\bar{r}_{kk}^{(k)} > r_{kk}^{(k)} \ge 0$. When $T_k$ is nonsingular, we obtain that $r_{kk}^{(k)} > 0$. Here we assume that the diagonal entries of the upper triangular matrix in all QR factorizations are non-negative.
We turn to the computation of the columns of the matrix V ˜ k + 1 , that is,
$\tilde{V}_{k+1} = [\tilde{v}_1^{(k+1)}, \tilde{v}_2^{(k+1)}, \ldots, \tilde{v}_{k+1}^{(k+1)}] := V_{k+1} Q_{k+1},$
from columns of V ˜ k , where V k + 1 is obtained by the modified Lanczos scheme (50), see Algorithm 1, and Q k + 1 is determined by (52). Substituting (52) into the right-hand side of (54) yields
$\tilde{V}_{k+1} = [V_k, v_{k+1}]\, Q_{k+1} = [\tilde{V}_k, v_{k+1}]\, G_{k+1}^{(k)T} = [\bar{V}_{k-1},\ c_k \tilde{v}_k^{(k)} + s_k v_{k+1},\ -s_k \tilde{v}_k^{(k)} + c_k v_{k+1}].$
As a result, the initial k 1 columns of the matrix V ˜ k + 1 correspond to those of V ¯ k 1 . The columns v ˜ k ( k + 1 ) and v ˜ k + 1 ( k + 1 ) in V ˜ k + 1 are linear combinations obtained from the last columns of both V ˜ k and V k + 1 .
Given the solution $w_{k-1}$ of the linear system (29) and considering that $\bar{R}_k$ is upper triangular, with $\bar{R}_{k-1}$ as the leading principal submatrix of order $k-1$, the computation of the solution $w_k = [\eta_1, \eta_2, \ldots, \eta_k]^t$ of $\bar{R}_k^t w_k = \| b \|\, e_1$ is inexpensive. We find that
$w_k = \begin{bmatrix} w_{k-1} \\ \eta_k \end{bmatrix}, \qquad \eta_k = -\left( \bar{r}_{k-2,k}^{(k)} \eta_{k-2} + \bar{r}_{k-1,k}^{(k)} \eta_{k-1} \right) / \bar{r}_{kk}^{(k)}.$
Note that the computation of w k from w k 1 only requires the last column of the matrix R ¯ k .
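A small sketch (ours; the calling convention is illustrative) of this update:

```python
import numpy as np

def extend_w(w, rbar_last_col, b_norm):
    # Appends eta_k to w_{k-1}, using only the (at most three) nonzero entries of
    # the last column of Rbar_k, ordered top to bottom:
    # [rbar_{k-2,k}, rbar_{k-1,k}, rbar_{k,k}].
    k = len(w) + 1
    if k == 1:
        eta = b_norm / rbar_last_col[-1]      # first equation: rbar_11 * eta_1 = ||b||
    else:
        s = rbar_last_col[-2] * w[-1]         # rbar_{k-1,k} * eta_{k-1}
        if k >= 3:
            s += rbar_last_col[-3] * w[-2]    # rbar_{k-2,k} * eta_{k-2}
        eta = -s / rbar_last_col[-1]
    return np.append(w, eta)
```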
We are now in a position to compute x k + 1 from x k . Using Equations (25) and (55), we obtain
$x_{k+1} = \bar{V}_k w_k = \bar{V}_{k-1} w_{k-1} + \eta_k \tilde{v}_k^{(k+1)} = x_k + \eta_k \tilde{v}_k^{(k+1)}.$
Note that only the last few columns of V k and V ˜ k are required to update the iterate x k + 1 .
Algorithm 1 shows pseudo-code for computing the nontrivial elements of the matrix T n in (36) and the matrix V in (50). Each iteration requires the evaluation of one matrix-vector product with the matrix A and a few operations with m-vectors. The latter only require O ( m ) flops.

5. Computed Examples

This section illustrates the performance of the iterative method of Section 2 and the error estimates of Section 4 when applied to three linear systems of Equation (1). In all examples, the matrix A is symmetric, nonsingular, and indefinite. All computations were carried out on a laptop computer using MATLAB R2024b. The initial approximate solution was set to $x_0 = 0$, and iterations were terminated once the Euclidean norm of the error fell below $10^{-6}$.
The matrices $A \in \mathbb{R}^{m \times m}$ in the computed examples were determined from their spectral factorization, described in Section 2, where the eigenvector matrix U is a random orthogonal matrix and the eigenvalues are distributed on the positive and negative real axis. The exact solution for all examples is $x_* = [1, 1, \ldots, 1]^t \in \mathbb{R}^m$. The approximation of the Euclidean norm of the errors in the iterates was done using the quadrature rules of Section 3.
Problem 1.
In this example, we set m = 491. The spectrum of the symmetric matrix A is in the union of the intervals $[-150, -10] \cup [1, 350]$. The eigenvalues of A are given by
$\lambda_i = \begin{cases} i - 151, & 1 \le i \le 141, \\ i - 141, & 142 \le i \le 491. \end{cases}$
We have $\operatorname{cond}(A) = 350$. Figure 1 displays numerical results for this example. The plots show the convergence history of the Gauss, Gauss-Radau, optimal averaged Gauss, and minimum $A_{4n+2}^{\min}(f)$ quadrature rules. All rules considered are seen to converge, and the Gauss-Radau and $A_{4n+2}^{\min}(f)$ rules display the fastest convergence.
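Test problems of this kind can be generated as in the following sketch (ours), shown here with the Problem 1 spectrum; the right-hand side is chosen so that the exact solution is the vector of all ones.

```python
import numpy as np

def make_test_problem(eigenvalues, seed=0):
    # Builds A = U diag(lambda) U^t with a random orthogonal U and b = A x_*,
    # so that the exact solution is x_* = (1, ..., 1)^t.
    rng = np.random.default_rng(seed)
    m = len(eigenvalues)
    U, _ = np.linalg.qr(rng.standard_normal((m, m)))
    A = U @ np.diag(eigenvalues) @ U.T
    x_star = np.ones(m)
    return A, A @ x_star, x_star

# Problem 1 spectrum: lambda_i = i - 151 (1 <= i <= 141), i - 141 (142 <= i <= 491).
i = np.arange(1, 492)
lam = np.where(i <= 141, i - 151, i - 141).astype(float)
A, b, x_star = make_test_problem(lam)
```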
Problem 2.
Let m = 200 and let the spectrum of the symmetric matrix A be in the union of the two intervals $[-7.3891, -2.7456] \cup [1.0513, 148.4132]$. The eigenvalues of A are given by
$\lambda_i = \begin{cases} \exp(i/20), & 1 \le i \le 100, \\ -\exp(i/100), & 101 \le i \le 200. \end{cases}$
Thus, the matrix A has 100 negative eigenvalues, and its condition number is $\operatorname{cond}(A) \approx 147$. Figure 2 shows the convergence history for the Gauss, Gauss-Radau, optimal averaged Gauss, and $A_{4n+2}^{\min}(f)$ rules.
Problem 3.
In this example, we use the symmetric matrix $A \in \mathbb{R}^{100 \times 100}$ with eigenvalues
$\lambda_i = \begin{cases} -i^{-2}, & 1 \le i \le 50, \\ i^{-3}, & 51 \le i \le 100. \end{cases}$
Thus, the matrix A has 50 negative eigenvalues, with its spectrum contained in $[-1, -4 \cdot 10^{-4}] \cup [10^{-6}, 7.5386 \cdot 10^{-6}]$ and $\operatorname{cond}(A) = 1.25 \cdot 10^{8}$. Figure 3 displays the convergence history for the Gauss, Gauss-Radau, optimal averaged Gauss, and minimum $A_{4n+2}^{\min}(f)$ rules.

6. Concluding Remarks

This paper describes novel techniques for estimating the Euclidean norm of the error in iterates obtained with a Krylov subspace iterative method. The method is designed to solve linear systems of equations with a symmetric, nonsingular, indefinite matrix. The error norm estimates are obtained by using the relation between the Krylov subspace iterative method and Gauss-type quadrature rules. The computed results demonstrate the performance of the iterative method and the error norm estimates. Among the considered quadrature rules, the rules (44) and (46) provide the most accurate error-norm estimates.

Author Contributions

Investigation, M.A., M.T.D., L.R. and M.M.S. All authors have read and agreed to the published version of the manuscript.

Funding

The research by M.M.S. was supported, in part, by the Serbian Ministry of Science, Technological Development, and Innovations according to Contract 451-03-65/2024-03/200105 dated 5 February 2024.

Data Availability Statement

The data that support the findings of the study are available from the authors upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Paige, C.C.; Saunders, M.A. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 1975, 12, 617–629. [Google Scholar] [CrossRef]
  2. Golub, G.H.; Van Loan, C.F. Matrix Computations, 4th ed.; Johns Hopkins University Press: Baltimore, MD, USA, 2013. [Google Scholar]
  3. Almutari, H.; Meurant, G.; Reichel, L.; Spalević, M.M. New error estimates for the conjugate gradient method. J. Comput. Appl. Math. 2025, 459, 116357. [Google Scholar] [CrossRef]
  4. Golub, G.H.; Meurant, G. Matrices, moments and quadrature. In Numerical Analysis 1993; Griffiths, D.F., Watson, G.A., Eds.; Longman: Essex, UK, 1994; pp. 105–156. [Google Scholar]
  5. Golub, G.H.; Meurant, G. Matrices, Moments and Quadrature with Applications; Princeton University Press: Princeton, NJ, USA, 2010. [Google Scholar]
  6. Meurant, G.; Tichý, P. Error Norm Estimation in the Conjugate Gradient Algorithm; SIAM: Philadelphia, PA, USA, 2024. [Google Scholar]
  7. Meurant, G.; Tichý, P. On computing quadrature-based bounds for the A-norm of the error in conjugate gradients. Numer. Algorithms 2013, 62, 163–191. [Google Scholar] [CrossRef]
  8. Calvetti, D.; Morigi, S.; Reichel, L.; Sgallari, F. An iterative method with error estimators. J. Comput. Appl. Math. 2001, 127, 93–119. [Google Scholar] [CrossRef]
  9. Laurie, D.P. Anti-Gaussian quadrature formulas. Math. Comp. 1996, 65, 739–747. [Google Scholar] [CrossRef]
  10. Spalević, M.M. On generalized averaged Gaussian formulas. Math. Comput. 2007, 76, 1483–1492. [Google Scholar] [CrossRef]
  11. Saad, Y. Iterative Methods for Sparse Linear Systems, 2nd ed.; SIAM: Philadelphia, PA, USA, 2003. [Google Scholar]
  12. Fischer, B. Polynomial Based Iteration Methods for Symmetric Linear Systems; Teubner-Wiley: New York, NY, USA, 1996. [Google Scholar]
  13. Gautschi, W. Orthogonal Polynomials, Computation and Approximation; Oxford University Press: Oxford, UK, 2004. [Google Scholar]
  14. Szego, G. Orthogonal Polynomials, 4th ed.; American Mathematical Society: Providence, RI, USA, 1975. [Google Scholar]
  15. Díaz de Alba, P.; Fermo, L.; Rodriguez, G. Solution of second kind Fredholm integral equations by means of Gauss and anti-Gauss quadrature rules. Numer. Math. 2020, 146, 699–728. [Google Scholar] [CrossRef]
  16. Hascelik, A.I. Modified anti-Gauss and degree optimal average formulas for Gegenbauer measure. Appl. Numer. Math. 2008, 58, 171–179. [Google Scholar] [CrossRef]
  17. Notaris, S.E. Anti-Gaussian quadrature formulae based on the zeros of Stieltjes polynomials. BIT Numer. Math. 2018, 58, 179–198. [Google Scholar] [CrossRef]
  18. Reichel, L.; Spalević, M.M. Averaged Gauss quadrature formulas: Properties and applications. J. Comput. Appl. Math. 2022, 410, 114232. [Google Scholar] [CrossRef]
  19. Djukić, D.L.; Mutavdzić Djukić, R.M.; Reichel, L.; Spalević, M.M. Weighted averaged Gaussian quadrature rules for modified Chebyshev measures. Appl. Numer. Math. 2024, 200, 195–208. [Google Scholar] [CrossRef]
  20. Gautschi, W. The interplay between classical analysis and (numerical) linear algebra—A tribute to Gene H. Golub. Electron. Trans. Numer. Anal. 2002, 13, 119–147. [Google Scholar]
  21. Golub, G.H. Some modified matrix eigenvalue problems. SIAM Rev. 1973, 15, 318–334. [Google Scholar] [CrossRef]
Figure 1. Problem 1. Plots (ad): Convergence history of quadrature rules. (a) Convergence history of averaged (the dashed red curve) and optimal averaged (the blue curve) rules in comparison with the exact solution (the black curve). (b) Convergence history of Gauss-Radau (the dashed red curve) rule in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (c) Convergence history of Gauss rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (d) Convergence history of the A 4 n + 2 min ( f ) rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (e) Absolute error of Gauss (the dotted blue curve), Gauss-Radau (the red curve), and A 4 n + 2 min ( f ) (the green curve) rules, in a semi-logarithmic plot.
Figure 2. Problem 2. Plots (ad): convergence history of quadrature rules. (a) Convergence history of averaged (the dashed red curve) and optimal averaged (the blue curve) rules in comparison with the exact solution (the black curve). (b) Convergence history of the Gauss–Radau rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (c) Convergence history of the Gauss rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (d) Convergence history of the A 4 n + 2 min ( f ) rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (e) Absolute error of the Gauss (the dotted blue curve), Gauss-Radau (the red curve), and A 4 n + 2 min ( f ) (the green curve) rules in a semi-logarithmic plot.
Figure 3. Problem 3. Plots (ad): convergence history of quadrature rules. (a) Convergence history of averaged (the dashed red curve) and optimal averaged (the blue curve) rules in comparison with the exact solution (the black curve). (b) Convergence history of the Gauss–Radau rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (c) Convergence history of the Gauss rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (d) Convergence history of the A 4 n + 2 min ( f ) rule (the dashed red curve) in comparison with the exact solution (the black curve) and the residual vector (the dotted blue curve). (e) Absolute error of the Gauss (the dotted blue curve), Gauss-Radau (the red curve), and A 4 n + 2 min ( f ) (the green curve) rules in a semi-logarithmic plot.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
