Article

On the Lanczos Method for Computing Some Matrix Functions

Ying Gu, Hari Mohan Srivastava and Xiaolan Liu

1 School of Sciences and Arts, Suqian University, Suqian 223800, China
2 Department of Mathematics and Statistics, University of Victoria, Victoria, BC V8W 3R4, Canada
3 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan
4 Department of Mathematics and Informatics, Azerbaijan University, AZ197 Baku, Azerbaijan
5 Section of Mathematics, International Telematic University Uninettuno, I-00186 Rome, Italy
6 Center for Converging Humanities, Kyung Hee University, 26 Kyungheedae-ro, Dongdaemun-gu, Seoul 02447, Republic of Korea
* Author to whom correspondence should be addressed.
Axioms 2024, 13(11), 764; https://doi.org/10.3390/axioms13110764
Submission received: 7 October 2024 / Revised: 30 October 2024 / Accepted: 1 November 2024 / Published: 4 November 2024
(This article belongs to the Section Mathematical Analysis)

Abstract

The study of matrix functions is highly significant and has important applications in control theory, quantum mechanics, signal processing, and machine learning. Previous work has mainly focused on using Krylov-type methods to efficiently compute $f(A)\beta$ and $\beta^T f(A)\beta$ when $A$ is symmetric. In this paper, we analyze the convergence of the Lanczos method for these computations, using polynomial approximation theory, in the case where $A$ is symmetric positive definite. Numerical results illustrate the effectiveness of our theoretical results.

1. Introduction

The study of matrix functions is highly significant and has important applications in control theory, quantum mechanics, signal processing, and machine learning. Discrete matrix functions have been investigated in [1,2]. Let $A \in \mathbb{R}^{n \times n}$ be a large sparse matrix and $\beta \in \mathbb{R}^n$ a nonzero real vector. Much research has focused on computing $f(A)\beta$ and $\beta^T f(A)\beta$ using Krylov-type methods [3,4,5], where $f$ is a function for which $f(A) \in \mathbb{R}^{n \times n}$ is well defined; $\beta^T f(A)\beta$ is called a bilinear form when $A$ is symmetric [6]. Theoretical results and numerical experiments have shown that $f(A)\beta$ and $\beta^T f(A)\beta$ can be computed efficiently by Krylov-type methods [7,8,9,10].
In this paper, we consider the special case where the matrix $A$ is symmetric positive definite and $f(\cdot) = \log(\cdot)$, $f(\cdot) = (\cdot)^{-1/2}$, or $f(\cdot) = (\cdot)^{-1}$. A logarithm of $A$ is any matrix $X$ satisfying $\exp(X) = A$ (refer to ([11], Chapter 11), [12,13], or [14]). Let the eigen decomposition of $A$ be
$$A = U \Lambda U^T,$$
with
$$\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n),$$
where $U \in \mathbb{R}^{n \times n}$, $U^T U = I$, and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0$. If $A$ has a nonpositive eigenvalue, then $\log(A)$ is not well defined as a real matrix ([11], Chapter 11); thus, a symmetric $A$ must be positive definite here. The matrix logarithm function is defined for such $A$ as follows ([11], Definition 1.2):
$$\log(A) := U \cdot \mathrm{diag}\big(\log(\lambda_1), \log(\lambda_2), \dots, \log(\lambda_n)\big) \cdot U^T \in \mathbb{R}^{n \times n}.$$
Similarly, the matrix square root is defined for $A$ as follows ([11], Definition 1.2):
$$A^{1/2} := U \cdot \mathrm{diag}\big(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \dots, \sqrt{\lambda_n}\big) \cdot U^T \in \mathbb{R}^{n \times n},$$
so that
$$A^{-1/2} = \big(A^{1/2}\big)^{-1} = U \cdot \mathrm{diag}\big(\lambda_1^{-1/2}, \lambda_2^{-1/2}, \dots, \lambda_n^{-1/2}\big) \cdot U^T.$$
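Both spectral definitions are easy to check numerically on a small example. The following sketch is our own illustration (not part of the original paper), assuming NumPy and SciPy are available; it compares the formulas above with SciPy's dense `logm` and `sqrtm`.

```python
import numpy as np
from scipy.linalg import logm, sqrtm

rng = np.random.default_rng(0)

# Build a small symmetric positive definite matrix.
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)              # SPD by construction

# Eigen decomposition A = U diag(lam) U^T with lam > 0.
lam, U = np.linalg.eigh(A)

# Spectral definitions used above.
logA  = U @ np.diag(np.log(lam))  @ U.T  # log(A)
sqrtA = U @ np.diag(np.sqrt(lam)) @ U.T  # A^{1/2}

print(np.allclose(logA,  logm(A)))       # True
print(np.allclose(sqrtA, sqrtm(A)))      # True
```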
The estimation of the log-determinant of a matrix $A$ is widely used in, for example, Markov random field models [15], lattice quantum chromodynamics [16], statistical learning [17], and so on. Since $\log\det(A) = \operatorname{trace}(\log(A))$, the calculation of the bilinear form $\beta^T \log(A)\beta$ is crucial for estimating the log-determinant of $A$. The computation of $A^{-1/2}\beta$ arises in the context of Markov function problems and domain decomposition methods [18]. The computation of $\operatorname{trace}(A^{-1})$ has also received much attention [6,19]: the use of modified moments [6,20], Monte Carlo methods [21], and Gaussian quadrature [6,22] to estimate $\operatorname{trace}(A^{-1})$ for symmetric positive definite matrices has been proposed, with a wide range of applications in mathematics [6], physics [23], and statistics [24]. Moreover, the computation of $\beta^T A^{-1}\beta$ is important for estimating the trace of the matrix inverse [25,26]. The Lanczos method is commonly used to compute $\log(A)\beta$, $\beta^T \log(A)\beta$, $A^{-1/2}\beta$, and $\beta^T A^{-1}\beta$.
Using polynomial approximation theory, we prove in this paper that the Lanczos method converges at the rates
$$\mathcal{O}\!\left(\frac{1}{k}\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{k}\right), \quad \mathcal{O}\!\left(\frac{1}{k}\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{2k-1}\right), \quad \mathcal{O}\!\left(\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{k}\right), \quad \text{and} \quad \mathcal{O}\!\left(\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{2k-1}\right)$$
for computing $\log(A)\beta$, $\beta^T \log(A)\beta$, $A^{-1/2}\beta$, and $\beta^T A^{-1}\beta$, respectively. Here, $k$ denotes the number of iterations and $\varkappa = \|A\|\,\|A^{-1}\|$ is the condition number of $A$, with $\|\cdot\|$ denoting the Euclidean norm of a matrix or a vector throughout this paper.
Throughout the paper, $A$ is assumed to be symmetric positive definite. The paper is organized as follows. We present the Lanczos approximations to $\log(A)\beta$, $\beta^T \log(A)\beta$, $A^{-1/2}\beta$, and $\beta^T A^{-1}\beta$ in Section 2. In Section 3, a convergence analysis of the Lanczos method for computing these four quantities is provided. Numerical results are presented in Section 4 to illustrate the effectiveness of our bounds, and Section 5 concludes.

2. Lanczos Approximation of $\log(A)\beta$, $\beta^T \log(A)\beta$, $A^{-1/2}\beta$, and $\beta^T A^{-1}\beta$

For the Krylov subspace
$$\mathcal{K}_k(A, \beta) = \mathrm{span}\{\beta, A\beta, \dots, A^{k-1}\beta\},$$
an orthonormal basis $Q_k$ can be obtained using the Lanczos process [6]. Moreover, it holds that
$$A Q_k = Q_k S_k + \theta_k q_{k+1} e_k^T, \qquad (1)$$
where $e_j$ denotes the $j$-th column of the identity matrix (in particular, $e_1 = (1, 0, \dots, 0)^T$),
$$Q_k = [q_1, q_2, \dots, q_k] \in \mathbb{R}^{n \times k}, \quad q_1 = \frac{\beta}{\|\beta\|}, \quad [Q_k\ q_{k+1}]^T [Q_k\ q_{k+1}] = I_{k+1},$$
and
$$S_k = Q_k^T A Q_k = \begin{pmatrix} \varpi_1 & \theta_1 & & & \\ \theta_1 & \varpi_2 & \theta_2 & & \\ & \ddots & \ddots & \ddots & \\ & & \theta_{k-2} & \varpi_{k-1} & \theta_{k-1} \\ & & & \theta_{k-1} & \varpi_k \end{pmatrix} \in \mathbb{R}^{k \times k}.$$
Thus, $\|\beta\|\, Q_k \log(S_k) e_1$, $\|\beta\|^2\, e_1^T \log(S_k) e_1$, $\|\beta\|\, Q_k S_k^{-1/2} e_1$, and $\|\beta\|^2\, e_1^T S_k^{-1} e_1$ are taken as approximations of $\log(A)\beta$, $\beta^T \log(A)\beta$, $A^{-1/2}\beta$, and $\beta^T A^{-1}\beta$, respectively ([11], Section 13.2).
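The following sketch shows how these approximations can be formed in practice; it is our illustration rather than the authors' code, assuming NumPy and SciPy. It runs $k$ Lanczos steps with full reorthogonalization (a standard practical safeguard not discussed here) and evaluates $f(S_k) e_1$ via the eigen decomposition of the tridiagonal matrix $S_k$.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def lanczos(A, beta, k):
    """k steps of the Lanczos process (1): returns Q_k together with
    the diagonal w and off-diagonal t of the tridiagonal S_k."""
    n = beta.size
    Q = np.zeros((n, k))
    w = np.zeros(k)                       # varpi_1, ..., varpi_k
    t = np.zeros(max(k - 1, 0))           # theta_1, ..., theta_{k-1}
    Q[:, 0] = beta / np.linalg.norm(beta)
    for j in range(k):
        v = A @ Q[:, j]
        w[j] = Q[:, j] @ v
        # Full reorthogonalization against all previous Lanczos vectors.
        v -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ v)
        if j < k - 1:
            t[j] = np.linalg.norm(v)      # theta_j = 0 signals termination
            Q[:, j + 1] = v / t[j]
    return Q, w, t

def lanczos_f(A, beta, k, f):
    """Approximations ||beta|| Q_k f(S_k) e_1 and ||beta||^2 e_1^T f(S_k) e_1."""
    Q, w, t = lanczos(A, beta, k)
    d, W = eigh_tridiagonal(w, t)         # S_k = W diag(d) W^T
    fSk_e1 = W @ (f(d) * W[0, :])         # f(S_k) e_1, since W^T e_1 = W[0, :]
    nb = np.linalg.norm(beta)
    return nb * (Q @ fSk_e1), nb**2 * fSk_e1[0]
```

Calling `lanczos_f(A, beta, k, np.log)` gives the approximations to $\log(A)\beta$ and $\beta^T \log(A)\beta$; passing `lambda x: 1/np.sqrt(x)` or `lambda x: 1/x` instead gives those to $A^{-1/2}\beta$ and $\beta^T A^{-1}\beta$.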
If the Lanczos process (1) terminates for the first time at iteration $k_{\max}$, then $\theta_{k_{\max}} = 0$ and
$$A Q_{k_{\max}} = Q_{k_{\max}} S_{k_{\max}}. \qquad (2)$$
Let the eigen decomposition of $S_{k_{\max}}$ be
$$S_{k_{\max}} = W \cdot \mathrm{diag}(d_1, d_2, \dots, d_{k_{\max}}) \cdot W^T.$$
Hence, $\{d_1, d_2, \dots, d_{k_{\max}}\} \subseteq \{\lambda_1, \lambda_2, \dots, \lambda_n\}$. By ([11], Definition 1.4), there exist three polynomials $q$, $p$, and $h$ such that $A^{-1/2} = q(A)$, $\log(A) = p(A)$, and $A^{-1} = h(A)$. Hence, $q(\lambda_i) = \lambda_i^{-1/2}$, $p(\lambda_i) = \log(\lambda_i)$, and $h(\lambda_i) = \lambda_i^{-1}$ for $i = 1, 2, \dots, n$. In particular, $q(d_i) = d_i^{-1/2}$, $p(d_i) = \log(d_i)$, and $h(d_i) = d_i^{-1}$ for $i = 1, 2, \dots, k_{\max}$. Consequently,
$$\log(A)\beta = p(A)\beta = \|\beta\| \cdot p(A) Q_{k_{\max}} e_1 \overset{(2)}{=} \|\beta\| \cdot Q_{k_{\max}} p(S_{k_{\max}}) e_1 = \|\beta\| \cdot Q_{k_{\max}} W \cdot \mathrm{diag}\big(\log(d_1), \dots, \log(d_{k_{\max}})\big) \cdot W^T e_1 = \|\beta\| \cdot Q_{k_{\max}} \log(S_{k_{\max}}) e_1,$$
and
$$\beta^T \log(A)\beta = \|\beta\| \cdot \beta^T Q_{k_{\max}} \log(S_{k_{\max}}) e_1 = \|\beta\|^2 \cdot e_1^T \log(S_{k_{\max}}) e_1. \qquad (3)$$
Similarly, it holds that
$$A^{-1/2}\beta = \|\beta\| \cdot Q_{k_{\max}} S_{k_{\max}}^{-1/2} e_1 \quad \text{and} \quad \beta^T A^{-1}\beta = \|\beta\|^2 \cdot e_1^T S_{k_{\max}}^{-1} e_1.$$
Thus, when the Lanczos process terminates, $\log(A)\beta$, $\beta^T \log(A)\beta$, $A^{-1/2}\beta$, and $\beta^T A^{-1}\beta$ are obtained exactly.
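This termination property can be checked directly; a small sketch of ours, reusing `lanczos_f` from the sketch above: if $A$ has only $r$ distinct eigenvalues, the Krylov subspace generated by a vector with components in all eigenspaces has dimension $r$, and $k = r$ steps already reproduce $\log(A)\beta$ exactly.

```python
import numpy as np

# Diagonal A with r = 4 distinct eigenvalues, each repeated 25 times.
d = np.repeat([1.0, 2.0, 5.0, 10.0], 25)          # n = 100
A = np.diag(d)
beta = np.ones(100)

vec, bil = lanczos_f(A, beta, 4, np.log)          # k = r = 4 steps
exact = np.log(d) * beta                          # log(A) beta for diagonal A
print(np.allclose(vec, exact))                    # True (up to rounding)
print(np.isclose(bil, beta @ exact))              # True
```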

3. Main Results

This section studies the convergence of the Lanczos approximations $\|\beta\| Q_k \log(S_k) e_1$, $\|\beta\|^2 e_1^T \log(S_k) e_1$, $\|\beta\| Q_k S_k^{-1/2} e_1$, and $\|\beta\|^2 e_1^T S_k^{-1} e_1$ to the corresponding exact quantities. We first give the following three lemmas.
Lemma 1 
(([27], eq. (5.6.2)) and ([28], p. 449)). For any $|z| < 1$ and $x \in [-1, 1]$, we have
$$\log(1 - 2 x z + z^2) = -2 \sum_{n=1}^{\infty} \frac{T_n(x)}{n}\, z^n,$$
and
$$\frac{1}{\sqrt{1 - 2 x z + z^2}} = \sum_{n=0}^{\infty} P_n(x)\, z^n, \qquad (4)$$
where $T_n(x)$ denotes the Chebyshev polynomial of the first kind of degree $n$,
$$T_n(x) := \cos(n \arccos x), \quad x \in [-1, 1],$$
and $P_n(x)$ denotes the Legendre polynomial of degree $n$,
$$P_n(x) = \frac{(-1)^n}{2^n n!} \cdot \frac{d^n}{dx^n} (1 - x^2)^n, \quad x \in [-1, 1].$$
Moreover,
$$\max_{|x| \le 1} |P_n(x)| = 1. \qquad (5)$$
Lemma 2 
([8], Lemma 2.2). If $f$ is analytic on $[\lambda_n, \lambda_1]$, then
$$\big\| f(A)\beta - \|\beta\|\, Q_j f(S_j) e_1 \big\| \le 2 \|\beta\| \cdot \min_{p \in \mathcal{P}_{j-1}} \max_{\lambda \in [\lambda_n, \lambda_1]} |f(\lambda) - p(\lambda)|, \qquad (6)$$
$$\big| \beta^T f(A)\beta - \|\beta\|^2\, e_1^T f(S_j) e_1 \big| \le 2 \|\beta\|^2 \cdot \min_{p \in \mathcal{P}_{2j-1}} \max_{\lambda \in [\lambda_n, \lambda_1]} |f(\lambda) - p(\lambda)|, \qquad (7)$$
where $j \in \mathbb{N}^+$ and $\mathcal{P}_s$ denotes the set of polynomials of degree at most $s \in \mathbb{N}^+$.
Lemma 3 
([29]). For any $\omega > 1$, we have
$$\min_{\phi \in \mathcal{P}_j} \max_{x \in [-1, 1]} \left| \frac{1}{\omega - x} - \phi(x) \right| \le \frac{\big(\omega + \sqrt{\omega^2 - 1}\big)^{-j}}{\sqrt{\omega^2 - 1}}.$$
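The two expansions in Lemma 1, including the sign conventions written above, admit a quick numerical sanity check; the following small sketch is ours, assuming NumPy and SciPy's `eval_legendre`.

```python
import numpy as np
from scipy.special import eval_legendre

x, z, N = 0.3, 0.6, 200                          # x in [-1, 1], |z| < 1

n = np.arange(1, N + 1)
Tn = np.cos(n * np.arccos(x))                    # Chebyshev T_n(x)
lhs = np.log(1 - 2 * x * z + z**2)
rhs = -2 * np.sum(Tn * z**n / n)                 # truncated Chebyshev series
print(abs(lhs - rhs))                            # ~ machine precision

m = np.arange(0, N + 1)
lhs = 1 / np.sqrt(1 - 2 * x * z + z**2)
rhs = np.sum(eval_legendre(m, x) * z**m)         # truncated Legendre series
print(abs(lhs - rhs))
```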
The main results of this article are presented below.
Theorem 1. 
Let $\varkappa = \|A\|\,\|A^{-1}\| = \lambda_1/\lambda_n$. For $k \ge 1$,
$$\epsilon_1(k) := \frac{\big\| \log(A)\beta - \|\beta\|\, Q_k \log(S_k) e_1 \big\|}{\|\log(A)\beta\|} \le \frac{2 \|\beta\|}{\|\log(A)\beta\|} \cdot \frac{\sqrt{\varkappa} + 1}{k} \left( \frac{\sqrt{\varkappa} - 1}{\sqrt{\varkappa} + 1} \right)^{k}, \qquad (8)$$
and
$$0 \le \epsilon_2(k) := \frac{\|\beta\|^2\, e_1^T \log(S_k) e_1 - \beta^T \log(A)\beta}{|\beta^T \log(A)\beta|} \le \frac{\|\beta\|^2}{|\beta^T \log(A)\beta|} \cdot \frac{\sqrt{\varkappa} + 1}{k} \left( \frac{\sqrt{\varkappa} - 1}{\sqrt{\varkappa} + 1} \right)^{2k}. \qquad (9)$$
Proof. 
Let
$$\eta := \frac{\lambda_1 + \lambda_n}{\lambda_1 - \lambda_n} = \frac{\varkappa + 1}{\varkappa - 1}. \qquad (10)$$
Then,
$$z_0 := \eta - \sqrt{\eta^2 - 1} = \frac{\sqrt{\varkappa} - 1}{\sqrt{\varkappa} + 1} \in (0, 1), \quad \text{and} \quad 1 + z_0^2 = 2 z_0 \eta. \qquad (11)$$
In particular, $1 - 2 x z_0 + z_0^2 = 2 z_0 (\eta - x)$, so, according to Lemma 1, we have
$$\log\big( 2 z_0 (\eta - x) \big) = \log(1 - 2 x z_0 + z_0^2) = -2 \sum_{n=1}^{\infty} \frac{T_n(x)}{n}\, z_0^n. \qquad (12)$$
Consider the linear transformation
$$\lambda = \frac{\lambda_n - \lambda_1}{2}\, x + \frac{\lambda_1 + \lambda_n}{2} = \frac{\lambda_1 - \lambda_n}{2} (\eta - x), \quad x \in [-1, 1],$$
which maps $[-1, 1]$ bijectively onto $[\lambda_n, \lambda_1]$. Then,
$$\frac{\|\log(A)\beta\|\, \epsilon_1(k)}{\|\beta\|} \overset{(6)}{\le} 2 \min_{p \in \mathcal{P}_{k-1}} \max_{\lambda \in [\lambda_n, \lambda_1]} |\log \lambda - p(\lambda)| = 2 \min_{q \in \mathcal{P}_{k-1}} \max_{-1 \le x \le 1} \left| \log\Big( \frac{\lambda_1 - \lambda_n}{2} (\eta - x) \Big) - q(x) \right| = 2 \min_{q \in \mathcal{P}_{k-1}} \max_{-1 \le x \le 1} \left| \log\big( 2 z_0 (\eta - x) \big) + \log\frac{\lambda_1 - \lambda_n}{4 z_0} - q(x) \right|.$$
Let
$$q(x) - \log\frac{\lambda_1 - \lambda_n}{4 z_0} = -2 \sum_{n=1}^{k-1} \frac{T_n(x)}{n}\, z_0^n \in \mathcal{P}_{k-1}.$$
By (12) and $|T_n(x)| \le 1$ for $x \in [-1, 1]$, we obtain
$$\frac{\|\log(A)\beta\|\, \epsilon_1(k)}{\|\beta\|} \le 2 \max_{x \in [-1, 1]} \left| \log\big( 2 z_0 (\eta - x) \big) + 2 \sum_{n=1}^{k-1} \frac{T_n(x)}{n}\, z_0^n \right| \overset{(12)}{\le} 4 \sum_{n=k}^{\infty} \frac{z_0^n}{n} = 4 z_0^k \sum_{n=0}^{\infty} \frac{z_0^n}{n + k} \le \frac{4 z_0^k}{k} \sum_{n=0}^{\infty} z_0^n = \frac{4 z_0^k}{k (1 - z_0)},$$
which, together with $1 - z_0 = \frac{2}{\sqrt{\varkappa} + 1}$, yields (8).
Furthermore, a linear map $\Psi: \mathbb{R}^{n_1 \times n_1} \to \mathbb{R}^{n_2 \times n_2}$ is called a unital positive linear map ([30], p. 5) if the following hold:
(i) $A$ is symmetric positive definite $\Rightarrow$ $\Psi(A)$ is also symmetric positive definite;
(ii) $\Psi(I_{n_1}) = I_{n_2}$.
Define
$$\Phi(\Delta) = \Delta(1\!:\!k, 1\!:\!k) \in \mathbb{R}^{k \times k}, \quad \Delta \in \mathbb{R}^{k_{\max} \times k_{\max}},$$
that is, $\Phi$ extracts the leading principal $k \times k$ submatrix; in particular, $\Phi(S_{k_{\max}}) = S_k$. Then $\Phi$ is a unital positive linear map, and since $\log$ is operator concave, ([30], Corollary 1.8) gives that
$$\log(S_k) - \Phi(\log(S_{k_{\max}})) = \log(\Phi(S_{k_{\max}})) - \Phi(\log(S_{k_{\max}})) \quad \text{is positive semidefinite}.$$
It follows that
$$0 \le e_1^T \log(S_k) e_1 - e_1^T \Phi(\log(S_{k_{\max}})) e_1 = e_1^T \log(S_k) e_1 - e_1^T \log(S_{k_{\max}}) e_1 \overset{(3)}{=} e_1^T \log(S_k) e_1 - \|\beta\|^{-2}\, \beta^T \log(A)\beta,$$
which yields the first inequality in (9). In light of (7), we see that
$$\frac{|\beta^T \log(A)\beta|\, \epsilon_2(k)}{\|\beta\|^2} \le 2 \min_{h \in \mathcal{P}_{2k-1}} \max_{\lambda \in [\lambda_n, \lambda_1]} |\log(\lambda) - h(\lambda)| = 2 \min_{h \in \mathcal{P}_{2k-1}} \max_{-1 \le x \le 1} \left| \log\big( 2 z_0 (\eta - x) \big) + \log\frac{\lambda_1 - \lambda_n}{4 z_0} - h\Big( \frac{\lambda_n - \lambda_1}{2} x + \frac{\lambda_1 + \lambda_n}{2} \Big) \right|.$$
Analogously, consider
$$h\Big( \frac{\lambda_n - \lambda_1}{2} x + \frac{\lambda_1 + \lambda_n}{2} \Big) - \log\frac{\lambda_1 - \lambda_n}{4 z_0} = -2 \sum_{n=1}^{2k-1} \frac{T_n(x)}{n}\, z_0^n \in \mathcal{P}_{2k-1}.$$
Recall that $|T_n(x)| \le 1$ for $x \in [-1, 1]$. We then have
$$0 \le \frac{|\beta^T \log(A)\beta|\, \epsilon_2(k)}{\|\beta\|^2} \le 2 \max_{-1 \le x \le 1} \left| \log\big( 2 z_0 (\eta - x) \big) + 2 \sum_{n=1}^{2k-1} \frac{T_n(x)}{n}\, z_0^n \right| \overset{(12)}{\le} 4 \sum_{n=2k}^{\infty} \frac{z_0^n}{n} = 4 z_0^{2k} \sum_{n=0}^{\infty} \frac{z_0^n}{n + 2k} \le \frac{2 z_0^{2k}}{k} \sum_{n=0}^{\infty} z_0^n = \frac{2 z_0^{2k}}{k (1 - z_0)},$$
which yields (9). □
Theorem 2. 
For $k \ge 1$,
$$\epsilon_3(k) := \frac{\big\| A^{-1/2}\beta - \|\beta\|\, Q_k S_k^{-1/2} e_1 \big\|}{\|A^{-1/2}\beta\|} \le \frac{2 \|\beta\| (\sqrt{\varkappa} + 1)}{\|A^{-1/2}\beta\| \sqrt{\lambda_1 - \lambda_n}} \left( \frac{\sqrt{\varkappa} - 1}{\sqrt{\varkappa} + 1} \right)^{k + \frac{1}{2}}. \qquad (13)$$
Proof. 
By (11) and Lemma 1,
$$\big( 2 z_0 (\eta - x) \big)^{-\frac{1}{2}} = (1 - 2 x z_0 + z_0^2)^{-\frac{1}{2}} = \sum_{n=0}^{\infty} P_n(x)\, z_0^n.$$
It holds that
$$\big\| A^{-1/2}\beta - \|\beta\|\, Q_k S_k^{-1/2} e_1 \big\| \overset{(6)}{\le} 2 \|\beta\| \min_{p \in \mathcal{P}_{k-1}} \max_{\lambda \in [\lambda_n, \lambda_1]} \left| \frac{1}{\sqrt{\lambda}} - p(\lambda) \right| \overset{(10)}{=} 2 \|\beta\| \sqrt{\frac{4 z_0}{\lambda_1 - \lambda_n}} \min_{\tilde q \in \mathcal{P}_{k-1}} \max_{x \in [-1, 1]} \left| \big( 2 z_0 (\eta - x) \big)^{-\frac{1}{2}} - \tilde q(x) \right| \le \frac{4 \|\beta\| \sqrt{z_0}}{\sqrt{\lambda_1 - \lambda_n}} \max_{x \in [-1, 1]} \left| \big( 2 z_0 (\eta - x) \big)^{-\frac{1}{2}} - \sum_{n=0}^{k-1} P_n(x)\, z_0^n \right| \overset{(4)}{\le} \frac{4 \|\beta\| \sqrt{z_0}}{\sqrt{\lambda_1 - \lambda_n}} \sum_{n=k}^{\infty} \max_{|x| \le 1} |P_n(x)|\, z_0^n \overset{(5)}{=} \frac{4 \|\beta\| \sqrt{z_0}}{\sqrt{\lambda_1 - \lambda_n}} \cdot \frac{z_0^k}{1 - z_0} = \frac{2 \|\beta\| (\sqrt{\varkappa} + 1)}{\sqrt{\lambda_1 - \lambda_n}}\, z_0^{k + \frac{1}{2}},$$
which yields (13). □
Finally, we analyze the convergence of $\|\beta\|^2\, e_1^T S_k^{-1} e_1$ to $\beta^T A^{-1}\beta$.
Theorem 3. 
For $k \ge 1$,
$$0 \le \epsilon_4(k) := \frac{\beta^T A^{-1}\beta - \|\beta\|^2\, e_1^T S_k^{-1} e_1}{\beta^T A^{-1}\beta} \le \frac{\|\beta\|^2}{\lambda_n\, \beta^T A^{-1}\beta} \left( \frac{\sqrt{\varkappa} - 1}{\sqrt{\varkappa} + 1} \right)^{2k - 1}. \qquad (14)$$
Proof. 
On the one hand, using $\beta = \|\beta\|\, Q_{k_{\max}} e_1$ and (2), we deduce that
$$\beta^T A^{-1}\beta - \|\beta\|^2\, e_1^T S_k^{-1} e_1 = \|\beta\|^2 \big( e_1^T Q_{k_{\max}}^T A^{-1} Q_{k_{\max}} e_1 - e_1^T S_k^{-1} e_1 \big) = \|\beta\|^2 \big( e_1^T S_{k_{\max}}^{-1} e_1 - e_1^T S_k^{-1} e_1 \big) \ge 0.$$
On the other hand,
$$\frac{\beta^T A^{-1}\beta - \|\beta\|^2\, e_1^T S_k^{-1} e_1}{\|\beta\|^2} \overset{(7)}{\le} 2 \min_{p \in \mathcal{P}_{2k-1}} \max_{\lambda \in [\lambda_n, \lambda_1]} \left| \frac{1}{\lambda} - p(\lambda) \right| \overset{(10)}{=} \frac{4}{\lambda_1 - \lambda_n} \min_{\tilde p \in \mathcal{P}_{2k-1}} \max_{x \in [-1, 1]} \left| \frac{1}{\eta - x} - \tilde p(x) \right| \overset{\text{Lemma 3}}{\le} \frac{4}{\lambda_1 - \lambda_n} \cdot \frac{\big( \eta + \sqrt{\eta^2 - 1} \big)^{-(2k-1)}}{\sqrt{\eta^2 - 1}} = \frac{4}{\lambda_n (\varkappa - 1)} \cdot \frac{\big( \eta + \sqrt{\eta^2 - 1} \big)^{-(2k-1)}}{\sqrt{\eta^2 - 1}}.$$
Recall that $\eta = \frac{\varkappa + 1}{\varkappa - 1}$; it follows that
$$\eta + \sqrt{\eta^2 - 1} = \frac{\sqrt{\varkappa} + 1}{\sqrt{\varkappa} - 1} \quad \text{and} \quad \sqrt{\eta^2 - 1} = \frac{2 \sqrt{\varkappa}}{\varkappa - 1} \ge \frac{4}{\varkappa - 1} \ \text{(the latter for } \varkappa \ge 4\text{)},$$
which yields (14). □
Remark 1. 
It is well known that the conjugate gradient method for solving the linear system $A y = \beta$ converges at the rate
$$\varsigma_k := \frac{\|\beta - A y_k\|}{\|\beta\|} = \mathcal{O}\!\left( \left( \frac{\sqrt{\varkappa} - 1}{\sqrt{\varkappa} + 1} \right)^{k} \right),$$
where $y_k$ is the approximation to $A^{-1}\beta$ obtained at the $k$-th step of the conjugate gradient method. It follows that $\epsilon_4(k) = \mathcal{O}(\varsigma_k^2)$. Thus, in practice, we can estimate the size of $\epsilon_4(k)$ from the size of $\varsigma_k$.
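A minimal sketch of this estimation strategy (our illustration, not the authors' code): run plain conjugate gradient on $A y = \beta$, record the relative residuals $\varsigma_k$, and use $\varsigma_k^2$ as a practical proxy for the decay of $\epsilon_4(k)$. We implement CG directly to keep the residual history explicit.

```python
import numpy as np

def cg_residuals(A, b, kmax):
    """Plain conjugate gradient for A y = b; returns the relative
    residuals ||b - A y_k|| / ||b|| for k = 1, ..., kmax."""
    y = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    nb = np.linalg.norm(b)
    hist = []
    for _ in range(kmax):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        y = y + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        hist.append(np.linalg.norm(r) / nb)
    return np.array(hist)      # varsigma_k; its square tracks eps_4(k)
```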
Remark 2. 
According to Theorems 1–3, the convergence rates of $\epsilon_1(k)$, $\epsilon_2(k)$, $\epsilon_3(k)$, and $\epsilon_4(k)$ are
$$\mathcal{O}\!\left(\frac{1}{k}\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{k}\right), \quad \mathcal{O}\!\left(\frac{1}{k}\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{2k-1}\right), \quad \mathcal{O}\!\left(\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{k}\right), \quad \text{and} \quad \mathcal{O}\!\left(\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{2k-1}\right),$$
respectively. The smaller the condition number $\varkappa$ of $A$, the faster these four quantities converge to zero. It should be noted that the convergence rate of $\epsilon_2(k)$ is significantly higher than those of $\epsilon_1(k)$ and $\epsilon_3(k)$, and slightly higher than that of $\epsilon_4(k)$. The numerical experiments in Section 4 illustrate these two facts.
If $f$ is analytic on $[\lambda_n, \lambda_1]$, then
$$\big| \|\beta\|^2 \cdot e_1^T f(S_k) e_1 - \beta^T f(A)\beta \big| = \mathcal{O}\!\left( \left( \frac{\sqrt{\varkappa} - 1}{\sqrt{\varkappa} + 1} \right)^{2k} \right).$$
In contrast, for $f(\cdot) = \log(\cdot)$, the convergence rate provided by (9) is superior, by a factor of $k^{-1}$.

4. Numerical Experiments

We illustrate the effectiveness of the bounds provided by (8), (9), (13), and (14) with two examples.
Example 1. 
Let the matrix $B = \mathrm{diag}(t_i^n[a, b])$ [31], where
$$t_i^n[a, b] = \frac{b - a}{2} \left( t_i^n + \frac{a + b}{b - a} \right) \quad \text{with} \quad t_i^n = \cos\frac{(2i - 1)\pi}{2n}, \quad i = 1, 2, \dots, n.$$
Here, $n = 5000$, $a = -10$, $b = 10$, and $t_i^n[a, b]$ is the $i$-th zero of the degree-$n$ Chebyshev polynomial translated to $[a, b]$. Let
$$A = B + \zeta I, \qquad \beta = \frac{v}{\|v\|},$$
with
$$v = (1, 1, \dots, 1)^T \in \mathbb{R}^{5000}, \quad \text{and} \quad \zeta > 10.$$
The selected values of $\zeta$ are listed in Table 1.
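For concreteness, the diagonal of the Example 1 matrix can be assembled in a few lines (a sketch we add, assuming NumPy; $A$ is diagonal, so its condition number is read off directly):

```python
import numpy as np

n, a, b = 5000, -10.0, 10.0
i = np.arange(1, n + 1)
t = np.cos((2 * i - 1) * np.pi / (2 * n))        # Chebyshev zeros on [-1, 1]
nodes = (b - a) / 2 * (t + (a + b) / (b - a))    # translated to (a, b)

zeta = 10 + 1e-2
diagA = nodes + zeta                             # A = B + zeta I (diagonal)
beta = np.ones(n) / np.sqrt(n)                   # beta = v / ||v||
print(f"{diagA.max() / diagA.min():.2e}")        # about 2.00e+03, cf. Table 1
```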
Example 2. 
Consider the Strakoš matrix
$$A = \mathrm{diag}\big( \{\alpha_i\}_{i=1}^{n} \big) \quad \text{with} \quad \alpha_i = \alpha_1 + \frac{i - 1}{n - 1} (\alpha_n - \alpha_1)\, \rho^{\,n - i}, \quad i = 1, 2, \dots, n = 10000.$$
The eigenvalue distribution is controlled by the parameter $\rho$. We take
$$\alpha_1 = 1, \quad \rho = 0.99, \quad \text{and} \quad \beta = \frac{v}{\|v\|}, \quad \text{with} \quad v = (1, 1, \dots, 1)^T \in \mathbb{R}^{10000}$$
in this example. The selected values of $\alpha_n$ are listed in Table 2.
Different choices of $\zeta$ and $\alpha_n$ thus yield different values of $\varkappa$; see Table 1 and Table 2. In Figure 1, Figure 2, Figure 3 and Figure 4, we plot the curves of $\epsilon_1(k)$, $\epsilon_2(k)$, $\epsilon_3(k)$, and $\epsilon_4(k)$ for the different values of $\varkappa$, together with the bounds in (8), (9), (13), and (14). Figures 1–4 show that: (i) the number of iterations required grows as $\varkappa$ increases, which is consistent with our results (see (8), (9), (13), and (14)); (ii) the convergence rate of $\epsilon_2(k)$ is significantly higher than those of $\epsilon_1(k)$ and $\epsilon_3(k)$ and slightly higher than that of $\epsilon_4(k)$, which is also consistent with our results; (iii) the curves of the bounds in (8), (9), (13), and (14) are almost parallel to the curves of the true errors, except for $\epsilon_3(k)$ in Example 2 (see Figure 4c,d). That is, in most instances, our bounds capture the convergence rates of $\epsilon_1(k)$, $\epsilon_2(k)$, $\epsilon_3(k)$, and $\epsilon_4(k)$ effectively. All of this demonstrates the effectiveness of the theoretical results proposed in our paper.

5. Concluding Remarks

We established the convergence rates of the Lanczos method for computing $\log(A)\beta$, $\beta^T \log(A)\beta$, $A^{-1/2}\beta$, and $\beta^T A^{-1}\beta$ as
$$\mathcal{O}\!\left(\frac{1}{k}\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{k}\right), \quad \mathcal{O}\!\left(\frac{1}{k}\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{2k-1}\right), \quad \mathcal{O}\!\left(\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{k}\right), \quad \text{and} \quad \mathcal{O}\!\left(\Big(\frac{\sqrt{\varkappa}-1}{\sqrt{\varkappa}+1}\Big)^{2k-1}\right),$$
respectively, where $A$ is assumed to be symmetric positive definite. In particular, the convergence rate of $\epsilon_2(k)$ is significantly higher than those of $\epsilon_1(k)$ and $\epsilon_3(k)$, and slightly higher than that of $\epsilon_4(k)$. Numerical experiments illustrate the effectiveness of these theoretical results.
The Lanczos method can also be used to compute other matrix functions, e.g., $\exp(A)\beta$, $A^{-p}\beta$ ($0 < p < 1$), $\sin(A)\beta$, $\cos(A)\beta$, and so on. The key to the corresponding convergence analysis is to find suitable Chebyshev-type expansions of these functions. Furthermore, our analysis is confined to the scenario where $A$ is symmetric positive definite. The situation is more complicated for a nonsymmetric matrix, whose eigenvalues are in general complex, so one must resort to the theory of polynomial approximation over complex domains; extending the analysis to nonsymmetric $A$ warrants further investigation. Finally, the numerical experiments make clear that the bounds we provide are not sharp when the condition number is large, and whether these upper bounds can be improved further is also an interesting question.

Author Contributions

Y.G. prepared the Mathematica programs, tables, and figures; Y.G., H.M.S., and X.L. wrote the manuscript text. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Suqian University Youth Foundation (Grant No. 2023XQNA15) and Suqian Sci & Tech Program (Grant No. K202419).

Data Availability Statement

This manuscript has no associated data.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Cuchta, T.; Grow, D.; Wintz, N. Discrete matrix hypergeometric functions. J. Math. Anal. Appl. 2023, 518, 126716.
2. Cuchta, T.; Luketic, R. Discrete hypergeometric Legendre polynomials. Mathematics 2021, 9, 2546.
3. Güttel, S. Rational Krylov approximation of matrix functions: Numerical methods and optimal pole selection. GAMM-Mitteilungen 2013, 36, 8–31.
4. Ilic, M.D.; Turner, I.W.; Simpson, D.P. A restarted Lanczos approximation to functions of a symmetric matrix. IMA J. Numer. Anal. 2009, 30, 1044–1061.
5. Ubaru, S.; Chen, J.; Saad, Y. Fast estimation of tr(f(A)) via stochastic Lanczos quadrature. SIAM J. Matrix Anal. Appl. 2017, 38, 1075–1099.
6. Golub, G.H.; Meurant, G. Matrices, Moments and Quadrature with Applications; Princeton University Press: Princeton, NJ, USA, 2009; Volume 30.
7. Frommer, A.; Schweitzer, M. Error bounds and estimates for Krylov subspace approximations of Stieltjes matrix functions. BIT Numer. Math. 2015, 56, 865–892.
8. Chen, T.; Hallman, E. Krylov-aware stochastic trace estimation. SIAM J. Matrix Anal. Appl. 2023, 44, 1218–1244.
9. Druskin, V.L.; Knizhnerman, L.A. Error bounds in the simple Lanczos procedure for computing functions of symmetric matrices and eigenvalues. Comput. Math. Math. Phys. 1991, 31, 20–30.
10. Frommer, A.; Kahl, K.; Lippert, T.; Rittich, H. 2-norm error bounds and estimates for Lanczos approximations to linear systems and rational matrix functions. SIAM J. Matrix Anal. Appl. 2013, 34, 1046–1065.
11. Higham, N.J. Functions of Matrices: Theory and Computation; SIAM: Philadelphia, PA, USA, 2008.
12. Cuchta, T.; Grow, D.; Wintz, N. A dynamic matrix exponential via a matrix cylinder transformation. J. Math. Anal. Appl. 2019, 479, 733–751.
13. Golub, G.H.; Van Loan, C.F. Matrix Computations; Johns Hopkins University Press: Baltimore, MD, USA, 2012.
14. Saad, Y. Analysis of some Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 1992, 29, 209–228.
15. Wainwright, M.J.; Jordan, M.I. Log-determinant relaxation for approximate inference in discrete Markov random fields. IEEE Trans. Signal Process. 2006, 54, 2099–2109.
16. Thron, C.; Dong, S.J.; Liu, K.F.; Ying, H.P. Padé–Z2 estimator of determinants. Phys. Rev. D 1998, 57, 1642–1653.
17. Affandi, R.H.; Fox, E.; Adams, R.; Taskar, B. Learning the parameters of determinantal point process kernels. In Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 21–26 June 2014; pp. 1224–1232.
18. Arioli, M.; Loghin, D. Matrix Square-Root Preconditioners for the Steklov–Poincaré Operator; Technical Report RAL-TR-2008-003; Rutherford Appleton Laboratory: Didcot, UK, 2008.
19. Tang, J.; Saad, Y. A Probing Method for Computing the Diagonal of the Inverse of a Matrix; Report UMSI-2010-42; Minnesota Supercomputer Institute, University of Minnesota: Minneapolis, MN, USA, 2010.
20. Meurant, G. Estimates of the trace of the inverse of a symmetric matrix using the modified Chebyshev algorithms. Numer. Algorithms 2009, 51, 309–318.
21. Bai, Z.; Fahey, M.; Golub, G.H. Some large-scale matrix computation problems. J. Comput. Appl. Math. 1996, 74, 71–89.
22. Bai, Z.; Golub, G.H. Bounds for the trace of the inverse and the determinant of symmetric positive definite matrices. Ann. Numer. Math. 1997, 4, 29–38.
23. Dong, S.J.; Liu, K.F. Stochastic estimation with Z2 noise. Phys. Lett. B 1994, 328, 130–136.
24. Ortner, B.; Krauter, A.R. Lower bounds for the determinant and the trace of a class of Hermitian matrices. Linear Algebra Appl. 1996, 236, 147–180.
25. Brezinski, C.; Fika, P.; Mitrouli, M. Moments of a linear operator, with applications to the trace of the inverse of matrices and the solution of equations. Numer. Linear Algebra Appl. 2012, 19, 937–953.
26. Wu, L.; Laeuchli, J.; Kalantzis, V.; Stathopoulos, A.; Gallopoulos, E. Estimating the trace of the matrix inverse by interpolating from the diagonal of an approximate inverse. J. Comput. Phys. 2016, 326, 828–844.
27. Beals, R.; Wong, R. Special Functions and Orthogonal Polynomials; Cambridge University Press: Cambridge, UK, 2016.
28. Olver, F.; Lozier, D.; Boisvert, R.; Clark, C. The NIST Handbook of Mathematical Functions; Cambridge University Press: New York, NY, USA, 2010.
29. Bernstein, S.N. Sur l'ordre de la meilleure approximation des fonctions continues par les polynômes de degré donné. Mém. Acad. Roy. Belg. 1912, 4, 1–104.
30. Zhan, X. Matrix Inequalities; Springer: Berlin/Heidelberg, Germany, 2002.
31. Zhang, L.; Shen, C.; Li, R. On the generalized Lanczos trust-region method. SIAM J. Optim. 2017, 27, 2110–2142.
Figure 1. Example 1: Lines correspond to $\epsilon_1(k)$ and $\epsilon_2(k)$; refer to Theorem 1.
Figure 2. Example 1: Lines correspond to $\epsilon_3(k)$ and $\epsilon_4(k)$; refer to Theorems 2 and 3.
Figure 3. Example 2: Lines correspond to $\epsilon_1(k)$ and $\epsilon_2(k)$; refer to Theorem 1.
Figure 4. Example 2: Lines correspond to $\epsilon_3(k)$ and $\epsilon_4(k)$; refer to Theorems 2 and 3.
Table 1. Example 1: Different values of $\varkappa$ for different choices of $\zeta$.

$\zeta$:      $10 + 10^{-1}$    $10 + 10^{-2}$    $10 + 10^{-3}$    $10 + 10^{-4}$
$\varkappa$:  $2.01 \times 10^2$    $2.00 \times 10^3$    $2.00 \times 10^4$    $1.99 \times 10^5$
Table 2. Example 2: Different values of $\varkappa$ for different choices of $\alpha_n$.

$\alpha_n$:   $10^2$    $10^3$    $10^4$    $10^5$
$\varkappa$:  $1 \times 10^2$    $1 \times 10^3$    $1 \times 10^4$    $1 \times 10^5$