Abstract
A flexible extended Krylov subspace method (-EKSM) is considered for the numerical approximation of the action of a matrix function f(A) on a vector b, where the function f is of Markov type. -EKSM has the same framework as the extended Krylov subspace method (EKSM), but replaces the zero pole in EKSM with a properly chosen fixed nonzero pole. For symmetric positive definite matrices, the optimal fixed pole is derived for -EKSM to achieve the lowest possible upper bound on the asymptotic convergence factor, which is lower than that of EKSM. The analysis is based on properties of Faber polynomials. For large and sparse matrices that can be handled efficiently by LU factorizations, numerical experiments show that -EKSM and a variant of RKSM based on a small number of fixed poles outperform EKSM in both storage and runtime, and usually have an advantage over adaptive RKSM in runtime.
MSC:
65F50; 65F60; 65E10
1. Introduction
Consider a large square matrix A and a function f such that the matrix function f(A) is well-defined [,]. The numerical approximation of
f(A)b, (1)
where b is a vector, is a common problem in scientific computing. It arises in the numerical solution of differential equations [,,,], matrix function integrators [,], model order reduction [,], and optimization problems [,]. Note that approximating the action of f(A) on a vector b and approximating the matrix f(A) itself are different tasks. For a large sparse matrix A, f(A) is usually fully dense and infeasible to form explicitly.
Numerical methods for approximating the action of f(A) on a vector have been extensively studied in recent years, especially for large-scale problems; see, e.g., [,,] and references therein. Existing algorithms often construct certain polynomial or rational approximations to f over the spectrum of A and apply such approximations directly to the vector b without forming any dense matrices of the same size as A. A common class of projection methods is based on polynomial Krylov subspaces; however, for many large matrices this may require approximation spaces of very large dimension. Rational Krylov subspace methods have been investigated to decrease the size of the subspaces needed for the approximation; see, e.g., [,,,,]. Two well-known examples are the extended Krylov subspace method (EKSM) [,,] and the adaptive rational Krylov subspace method (adaptive RKSM) [,].
In this paper, we explore a generalization of EKSM that uses one fixed nonzero pole alternately with the infinite pole for approximating the action of Markov-type (Cauchy–Stieltjes) functions []. Markov-type functions can be written as
where is a measure that ensures that the integral converges absolutely. Note that this definition can be generalized to integrals defined on , where . We consider the interval here, as this is sufficient for the most widely-studied Markov-type functions with , and for their simple modifications, such as with . Our analysis can be extended to integrals on as long as the measure satisfies , as assumed in [].
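For concreteness, two classical examples of Markov-type (Cauchy–Stieltjes) functions are the inverse square root and the scaled logarithm. The identities below are standard facts added here for illustration; they are not a reconstruction of the exact formulas lost from the extracted text above.

```latex
% Two classical Markov-type (Cauchy--Stieltjes) functions, written as integrals
% of the kernel 1/(z+t) against a positive measure:
\[
  z^{-1/2} \;=\; \frac{1}{\pi}\int_{0}^{\infty} \frac{t^{-1/2}}{z+t}\,\mathrm{d}t,
  \qquad
  \frac{\log(1+z)}{z} \;=\; \int_{1}^{\infty} \frac{1}{t\,(z+t)}\,\mathrm{d}t,
  \qquad z\in\mathbb{C}\setminus(-\infty,0].
\]
```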
This study is motivated by [], which provided an upper bound on the convergence factor of EKSM for approximating f(A)b. Our work concerns a generalization of EKSM that replaces the zero pole used in EKSM with a fixed nonzero pole s; hence, we call it the flexible extended Krylov subspace method (-EKSM). For symmetric matrices, this algorithm can apply a three-term recurrence to enlarge the subspace, just as EKSM does, whereas a full orthogonalization process may be necessary for adaptive RKSM regardless of the symmetry of the matrix. Beckermann and Reichel [] studied the asymptotic convergence rate of RKSM with different pole selections for approximating Markov functions of matrices via the Faber transform; however, they did not provide explicit expressions for the optimal pair of cyclic poles or the corresponding rate of convergence, which could be obtained by solving a quartic equation analytically. In this paper, we derive explicit expressions for the optimal pole s and the corresponding convergence factor using analytic tools different from those of []. While our bounds on the convergence factor appear numerically not as tight as the bounds in [], our poles and bounds are given by explicit expressions; in addition, our pole usually leads to faster convergence for discrete Laplacian matrices based on finite difference discretizations and for many practical nonsymmetric matrices.
We explore the optimal pole s to achieve the lowest upper bound on the asymptotic convergence factor, which is guaranteed to outperform EKSM on symmetric positive definite (SPD) matrices. For nonsymmetric matrices with an elliptic numerical range, we provide numerical evidence to demonstrate the advantage of -EKSM over EKSM in convergence rate. Numerical experiments show that -EKSM converges at least as rapidly as the upper bound suggests. In practice, if the linear systems needed for constructing rational Krylov subspaces can be solved efficiently by LU factorizations, then -EKSM outperforms EKSM in both time and storage cost over a wide range of matrices, and it could run considerably faster than adaptive RKSM for many problems. In this paper, we only consider factorization-based direct methods for solving the inner linear systems; for the use and implications of iterative linear solvers see, e.g., [].
Rational Krylov subspace methods may exhibit superlinear convergence for approximating f(A)b. As the algorithms proceed and more rational Ritz values converge to the exterior eigenvalues of A, the ‘effective spectrum’ of A not covered by the converged Ritz values shrinks, leading to gradual acceleration of the convergence. Such an analysis has been performed for EKSM applied to the 1D discrete Laplacian ([], Section 5.2) based on the result from [], leading to a sharper explicit bound on the convergence. The same idea could be explored for -EKSM; however, it is not considered here, as we did not observe superlinear convergence in our experiments. This was probably because the effective spectrum of our large test matrices did not shrink quickly enough to exhibit a convergence speedup before the stopping criterion was satisfied.
Though -EKSM is closely connected to EKSM, we emphasize that the convergence of -EKSM cannot be derived directly from that of EKSM applied to a shifted matrix. Admittedly, with the same vector b, -EKSM with a pole s applied to A and EKSM applied to the shifted matrix A - sI generate the same subspaces; however, the existing theory of EKSM [] can only provide a bound on the convergence factor for approximating f(A - sI)b, which is not what is needed and has no obvious relationship with the convergence for approximating f(A)b from the same subspace. Our analysis is based on a special min–min optimization instead of on the results of EKSM applied to a shifted matrix.
The remainder of this paper is organized as follows. In Section 2, we review rational Krylov subspace methods and introduce -EKSM. In Section 3, we discuss the implementation of -EKSM. In Section 4, we analyze the linear convergence factor of -EKSM and provide the optimal pole with which the lowest upper bound on the convergence factor can be achieved for SPD matrices. In addition, we numerically explore the optimal pole and the convergence factor of -EKSM for nonsymmetric matrices with an elliptic numerical range. In Section 5, we consider a variant of RKSM that applies a few fixed cyclic poles to provide faster approximations than -EKSM for certain challenging nonsymmetric matrices. In Section 6, we show the results of numerical experiments for the different methods on a variety of matrices. Our conclusions are provided in Section 7, followed by proofs of several lemmas in the Appendices.
2. Rational Krylov Subspace Methods and -EKSM
For a wide range of matrix function approximation problems, polynomial Krylov subspace methods converge very slowly [,]. To speed up convergence, a more efficient approach is to apply rational Krylov subspace methods; see, e.g., [,] and references therein.
The procedure of RKSM is outlined as follows. Starting with an initial nonzero vector b that generates , where , RKSM keeps expanding the subspaces to search for increasingly accurate approximate solutions to our problem of interest. In order for RKSM to expand the current subspace to , we apply the linear operator to a vector . To build an orthonormal basis of the enlarged subspace , we may choose and adopt the modified Gram–Schmidt orthogonalization, obtaining
where . To ensure that the linear operator is well-defined and nonsingular, we require that , , up to a scaling factor, and that and (if and are nonzero) not be an eigenvalue of A. The use of four parameters () provides the flexibility to accommodate both the zero (, ) and the infinite (, ) poles in a unified framework. The expansion of rational Krylov subspaces does not have to be based on the last orthonormal basis vector , as in (2). There are alternative ways to choose the continuation vector to expand the subspaces; see, e.g., [].
The shift-inverse matrix vector product is equivalent to solving (the inner linear system) for w. Multiplying both sides of (2) by , moving all terms containing A to the left-hand side and all other terms to the right-hand side, we have
Note that the above relation should hold for each index value ; thus, it is not hard to see that (3) can be written in the following matrix form:
where contains the orthonormal basis vectors of the rational Krylov subspace:
and , , and are all upper Hessenberg matrices. Specifically,
where .
The idea of RKSM as a projection method is the same as that of standard Krylov methods: first, solve the Galerkin-projected problem defined for the small projected matrix, then map the solution back to the m-dimensional subspace as an approximate solution of the original problem defined for A.
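The following minimal Python sketch illustrates this projection framework. It is our own illustration (the experiments in this paper were carried out in MATLAB), written under simplifying assumptions: real arithmetic, one continuation vector per step, and a generic matrix function f applied to the projected matrix; the names rksm_fAb, poles, and f are hypothetical.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import expm            # stand-in matrix function for illustration
from scipy.sparse.linalg import splu

def rksm_fAb(A, b, poles, f=lambda T: expm(-T)):
    """Hypothetical sketch of a rational Krylov projection for f(A)b.

    A     : sparse square matrix
    b     : right-hand-side vector
    poles : finite poles used to expand the space (np.inf means multiply by A)
    f     : matrix function applied to the small projected matrix
    """
    n = b.size
    V = np.zeros((n, len(poles) + 1))
    V[:, 0] = b / np.linalg.norm(b)
    for m, pole in enumerate(poles):
        if np.isinf(pole):                         # infinite pole: apply A
            w = A @ V[:, m]
        else:                                      # finite pole: solve (A - pole*I) w = v_m
            # (a fixed repeated pole would allow this factorization to be computed once)
            w = splu((A - pole * sp.identity(n)).tocsc()).solve(V[:, m])
        for j in range(m + 1):                     # modified Gram-Schmidt against current basis
            w -= (V[:, j] @ w) * V[:, j]
        V[:, m + 1] = w / np.linalg.norm(w)
    Vm = V
    Tm = Vm.T @ (A @ Vm)                           # Galerkin projection of A onto the subspace
    e1 = np.zeros(Vm.shape[1]); e1[0] = np.linalg.norm(b)
    return Vm @ (f(Tm) @ e1)                       # lift the small solution back to the full space
```

For EKSM, the pole list alternates 0 and np.inf; for -EKSM, it alternates a fixed nonzero pole s and np.inf.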
The parameters can be changed in each iteration to determine the poles. For example, if we set , and for all , it is the standard Krylov subspace method; if we set , , and for , it becomes EKSM. A special variant of EKSM [,] constructs the following extended subspaces:
A practical choice for the two indices l and m leads to subspaces of the form for some , which requires an orthonormal basis for the Krylov subspaces with vectors
However, there is no convergence theory for this special variant.
The general rational Krylov space of order m is provided by [,]:
with prescribed in (2). For EKSM, the rational Krylov subspace of dimension is
EKSM applies the operators A^{-1} and A in an alternating manner in each iteration.
For adaptive RKSM, the operation at step m can be written as follows:
where is a nonzero pole and is a zero of the underlying rational function. To find the optimal poles and zeros at each step, we first restrict the poles and zeros to disjoint sets and , respectively, where and [] and where is the numerical range of A. The pair is called a condenser [,]. An analysis of RKSM considers a sequence of rational nodal functions
where the zeros and the poles . Adaptive RKSM tries to obtain asymptotically optimal rational functions by defining and recursively with the following conditions: after choosing and of minimal distance, define []:
The points are called generalized Leja points [,]. In practice, we compute approximations with respect to the poles and zeros defined in (6) as the iteration progresses. Adaptive RKSM usually converges in fewer iterations than EKSM while using a smaller approximation subspace [,,]. While usually converging in fewer iterations than variants with a few cyclic poles [], each step of adaptive RKSM requires the solution of a shifted linear system with a new shift, which is more expensive than using existing LU factorizations to solve linear systems with a coefficient matrix that has already been factorized. If the linear system at each RKSM step is solved by a direct method, adaptive RKSM tends to require longer runtimes than variants with a few cyclic poles that reuse the LU factorization of each distinct pole. Adaptive RKSM is most competitive if the linear systems arising at each step need to be solved approximately by an iterative method and if effective preconditioners can be constructed for the shifted linear systems.
In this paper, we consider generating rational Krylov subspaces with the cyclic poles s, ∞, s, ∞, … (s ≠ 0), which we call the flexible extended Krylov subspace method (-EKSM). The corresponding linear operators are (A - sI)^{-1} and A, which are applied in an alternating manner. To this end, we set , , , and for . The approximation space of -EKSM is
The choice of the repeated pole s influences the convergence rate of -EKSM. Our goal is to find the optimal pole for Markov-type functions of matrices such that -EKSM achieves the lowest upper bound on the linear convergence factor. This subspace is identical to the one generated by EKSM applied to the shifted matrix A - sI; the convergence theory of EKSM [] would provide a convergence factor bound for approximating f(A - sI)b instead of f(A)b. However, this is not our concern here, as our results are derived via a special min–min optimization analysis, not from the results of EKSM applied to a shifted matrix.
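A minimal sketch (ours, not the authors' MATLAB code; the name feksm_basis and the breakdown tolerance are illustrative) of how the -EKSM space can be expanded by alternating a single reusable sparse factorization of A - sI with multiplications by A:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def feksm_basis(A, b, s, num_cycles):
    """Hypothetical sketch: orthonormal basis of the -EKSM space built by
    alternating solves with (A - s I) and products with A, using one sparse
    LU factorization of (A - s I) computed once and reused for all solves."""
    n = b.size
    lu = splu((A - s * sp.identity(n)).tocsc())    # factorize once, reuse at every solve step
    V = [b / np.linalg.norm(b)]
    for _ in range(num_cycles):
        for apply_shift_inverse in (True, False):  # alternate (A - sI)^{-1} and A
            w = lu.solve(V[-1]) if apply_shift_inverse else A @ V[-1]
            for v in V:                            # modified Gram-Schmidt
                w -= (v @ w) * v
            nrm = np.linalg.norm(w)
            if nrm < 1e-14:                        # breakdown: space has become invariant
                return np.column_stack(V)
            V.append(w / nrm)
    return np.column_stack(V)
```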
3. Implementation of -EKSM for Approximating f(A)b
Without loss of generality, suppose that in (1) and let the initial subspace be . The approximation to f(A)b after steps is
where is an orthonormal set of basis vectors of the subspace
and denotes the restriction of matrix A in .
Because , we have . To construct this subspace, we can apply EKSM to the shifted matrix A - sI, obtain as the orthonormal set of basis vectors of , and obtain
which is a block upper Hessenberg matrix (see Section 3 in []). From (7), we obtain . Following the derivation of EKSM [], we can derive a similar Arnoldi relation for our proposed -EKSM, where is a block upper Hessenberg matrix with blocks, and we obtain . More implementation details of RKSM can be found in [,]. Note that, similar to EKSM, for symmetric problems the orthogonalization cost of -EKSM can be reduced by using the block three-term recurrence to enlarge the subspace.
The residual norm of -EKSM is ; however, it is not directly computable because f(A)b is unknown. One stopping criterion for the Arnoldi approximation is to compute [,,]; however, this may not be valid for RKSM. Another possibility is to compute , which is the norm of the difference between two computed approximations; see, e.g., [,]. Alternatively, it is possible to monitor the angle between the approximations [] from two consecutive iterations. This convergence criterion is sometimes used in the literature on eigenvalue computations; see, e.g., [,]. The two criteria usually exhibit very similar behavior. In this section, we choose the latter.
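A minimal sketch of the angle-based stopping test between two consecutive approximations (our own illustration; the tolerance value is a placeholder):

```python
import numpy as np

def angle_between(x_prev, x_new):
    """Angle between two consecutive approximate solutions; used as a stopping test."""
    c = abs(np.vdot(x_prev, x_new)) / (np.linalg.norm(x_prev) * np.linalg.norm(x_new))
    return np.arccos(min(1.0, c))   # clip to avoid NaN from rounding slightly above 1

def converged(x_prev, x_new, tol=1e-10):
    # stop when the angle (almost equivalently, for small angles, the relative
    # difference between the two iterates) falls below a prescribed tolerance
    return angle_between(x_prev, x_new) < tol
```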
All RKSM variants require linear system solves, as the action of shifted and inverted matrices on vectors is needed in (2). For EKSM and -EKSM, we need the action of A^{-1} and (A - sI)^{-1}, respectively, on vectors. If these linear systems can be solved efficiently by direct methods, both methods need only one LU factorization, computed once and reused for all linear solves. However, because of the adaptive poles, adaptive RKSM must solve a linear system with a different coefficient matrix in every iteration. Although adaptive RKSM achieves an asymptotically optimal convergence rate, it can be more time-consuming than EKSM and -EKSM, as a new LU factorization at each step is usually much more expensive than a solve using existing LU factors.
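The cost gap between reusing one factorization and refactorizing at every step can be illustrated with a small timing sketch; the matrix, sizes, and shifts below are arbitrary placeholders chosen only for demonstration:

```python
import time
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n, k = 40_000, 30                        # placeholder problem size and number of solves
A = sp.diags([-1, 4, -1], [-1, 0, 1], shape=(n, n), format="csc")
rhs = np.random.default_rng(0).standard_normal((k, n))

t0 = time.perf_counter()
lu = splu(A)                             # one factorization, reused (EKSM / -EKSM style)
x1 = [lu.solve(r) for r in rhs]
t_reuse = time.perf_counter() - t0

t0 = time.perf_counter()
shifts = np.linspace(1.0, 2.0, k)        # adaptive-RKSM style: a new shifted matrix each step
x2 = [splu(A + s * sp.identity(n, format="csc")).solve(r) for s, r in zip(shifts, rhs)]
t_refactor = time.perf_counter() - t0

print(f"reuse one LU: {t_reuse:.2f}s   refactorize each step: {t_refactor:.2f}s")
```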
4. Convergence Analysis of -EKSM
Next, we study the optimal pole s for -EKSM to achieve the lowest upper bound on the convergence factor through min–min optimization.
4.1. General Convergence Analysis
In this section, we explore the asymptotic convergence of -EKSM. Consider the class of Markov-type functions in (8). For any , a Markov-type function can be split into the sum of two integrals []:
Here, we let be the numerical range of matrix A and define , where s is the repeated pole for -EKSM. We assume that is symmetric with respect to the real axis and lies strictly in the right half of the complex plane. Then, for , is symmetric with respect to the real axis and lies in the left half of the complex plane. We define as the direct Riemann mapping [] for (), where D is the unit disk, and define as the inverse Riemann mapping.
Our convergence analysis initially follows the approach in [], then analyzes a special min–min optimization. It first uses the Faber polynomials [], providing a rational expansion of functions for investigating the approximation behavior of -EKSM. The main challenge is to find how the fixed real pole s impacts the convergence and to determine the optimal s to achieve the lowest upper bound on the convergence factor.
Lemma 1.
For the Markov-type function defined by (8) (where μ is a measure such that the integral converges absolutely) and some given , the following inequalities hold for any :
where and where , , and are all greater than 1. Here, for , are some real numbers and denotes the Faber polynomials of degree k associated with the Riemann mapping , while and are constant positive real numbers independent of m.
Proof.
The proof is provided in Appendix A. □
Lemma 2.
Assume that . For any given used in (9), the error of approximating by -EKSM with cyclic poles s, , s, , satisfies
Proof.
Let us define
Because both g and h are analytic in , and as , from Theorem 2 in [] we have
Next, we follow the proof in Section 3, Theorem 3.4 in [], and use the above inequality:
□
Remark 1.
The results of our analysis can additionally be applied to a linear combination of several Markov-type functions with monomials , . One example of these functions is , , as and is a Markov function. In addition, if the support of the underlying measure of the Markov function is a proper subset of , the error bound may not be sharp. The asymptotic convergence might be superlinear as well; see, e.g., []. While this idea could be explored with -EKSM, it is not considered here because we did not observe superlinear convergence in our experiments. This was probably because the effective spectrum of our large test matrices did not shrink quickly enough to exhibit convergence speedup before the stopping criterion was satisfied.
To find the optimal pole to achieve the lowest upper bound on the asymptotic convergence factor of -EKSM, we need to determine such that
is maximized. Let us define
Therefore, -EKSM converges at least linearly with a convergence factor that depends on s and a. For any given pole s, we can find the artificial parameter a used in (9) that minimizes this factor. Let the minimized factor be regarded as a function of s; then, we need to find the s that minimizes it. We denote the resulting minimum by , which is the lowest upper bound on the asymptotic convergence factor.
In summary, to find the optimal pole s needed to obtain the lowest upper bound on the asymptotic convergence factor of -EKSM, we can solve the following optimization problem:
The asymptotic convergence factor of -EKSM in (13) is dependent on the Riemann mapping . The formula of is different for matrices with different numerical ranges, which leads to different values of . In the following section, we show that this problem has an analytical solution if A is symmetric positive definite.
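Because the explicit expression of the bound is not reproduced above, the following generic Python sketch only illustrates the structure of the min–min problem (13): for each candidate pole s, minimize the bound over the auxiliary splitting parameter a, and then minimize the result over s. The function rho and the search intervals are placeholders, not the bound derived in the text.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def optimal_pole(rho, s_range, a_range):
    """Generic min-min search: rho(s, a) is a user-supplied upper bound on the
    convergence factor (placeholder for the actual expression)."""
    def inner(s):                                     # best splitting parameter a for this pole
        return minimize_scalar(lambda a: rho(s, a), bounds=a_range, method="bounded").fun
    res = minimize_scalar(inner, bounds=s_range, method="bounded")
    return res.x, res.fun                             # optimal pole and its bound

# usage with a toy placeholder bound (NOT the bound from the paper):
s_opt, rho_opt = optimal_pole(lambda s, a: (s - 3.0) ** 2 + (a - 1.0) ** 2 + 0.5,
                              s_range=(0.1, 10.0), a_range=(0.1, 10.0))
```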
4.2. The Symmetric Positive Definite Case
To explore the optimal pole s and the corresponding bound on the convergence factor of -EKSM, we can consider a symmetric positive definite matrix A. Assume that are the smallest and the largest eigenvalues of A, respectively. The Riemann mappings are
where , , and . It follows that
Note that all three expressions are values of the function at different values of t. Therefore, to compare , , and , it is sufficient to compare M, , and , which is much easier. This comparison is formalized in Lemma 3.
Lemma 3.
Proof.
The proof is provided in Appendix B. □
We are now ready to state the main result regarding the optimal pole and the corresponding lowest upper bound on the asymptotic convergence factor of -EKSM for approximating f(A)b with f of Markov type.
Theorem 1.
Let be the convergence factor of -EKSM for approximating as defined in (12), where the matrix A is symmetric positive definite. Then, for the optimization problem
the optimal solution is
and the optimal objective function value is
Proof.
It is equivalent to find the optimal s that solves the following problem:
From Lemma 3, T is a piecewise function with variable s, and we only need to find its maximum value for .
For , is a monotonically increasing function; therefore, when , has its maximum value on this interval.
For , . We find the first derivative of to be
Because decreases monotonically on , it has its maximum value when . Therefore,
To sum up, for both and , the maximizer of is . Consequently, it is the global optimal solution for . □
4.3. Nonsymmetric Case
Similar to the SPD case, to explore the lowest upper bound on the convergence factor of -EKSM we can consider a nonsymmetric matrix A with eigenvalues located in the right half of the complex plane. Let , , and assume that the numerical range of matrix A can be covered by an ellipse centered at point with a semi-major axis and semi-minor axis .
The Riemann mapping is provided by
where . Although the Riemann mapping is not easy to derive explicitly, for a given s we can first approximate by a polygon, then use the Schwarz–Christoffel mapping toolbox [] to approximate numerically. Then, we can compare , , and for different values of a. Based on (13), we tested different values of s to find the optimal pole such that is maximized. Table 1 shows the optimal pole s and the upper bound on the asymptotic convergence factor for matrices with different elliptic numerical ranges.
Table 1.
Convergence factor of EKSM and -EKSM for matrices with different elliptical numerical ranges ().
In Table 1, indicates the SPD case; with , the numerical range becomes a disk. It can be seen that when decreases, the convergence factor for both EKSM and -EKSM increases, which implies that in the case of an elliptic numerical range both methods converge significantly more slowly than in the SPD case. In particular, when and , it takes about times as many steps as are needed for the corresponding SPD case (). It is worthwhile to compare these two methods with adaptive RKSM to determine whether the slowdown is severe.
Another observation from Table 1 is that the optimal pole in the nonsymmetric case is not far away from that in the SPD case. Hence, it is reasonable to approximate the optimal shift for the nonsymmetric case using the one for the SPD case (see (17)), as the actual optimum based on an accurate estimate of the numerical range is generally difficult to evaluate, if it is possible at all. In fact, for a nonsymmetric matrix A, the approximation obtained using (16) is exactly the optimal pole for its symmetric part (A + A^T)/2. Because
it is clear that . If has an elliptical boundary centered at point with a semi-major axis and semi-minor axis , it follows that . To obtain such an approximation with respect to , we only need to run a modest number of Arnoldi steps, within an acceptable amount of time, in order to obtain the approximations with respect to and that are needed in (17).
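In a Python setting, the eigenvalue estimates of the symmetric part needed in (17) can be obtained with a few Lanczos steps through ARPACK. This is a sketch of ours (the paper's experiments use MATLAB's 'eigs'), and the loose tolerance reflects the observation that the pole is insensitive to small perturbations of these estimates.

```python
from scipy.sparse.linalg import eigsh

def spd_pole_estimates(A, tol=1e-2):
    """Estimate the extreme eigenvalues of the symmetric part of A, which are then
    plugged into the SPD pole formula (17); a loose tolerance is sufficient."""
    H = (A + A.T) * 0.5                       # symmetric part of A
    lam_max = eigsh(H, k=1, which="LA", tol=tol, return_eigenvectors=False)[0]
    lam_min = eigsh(H, k=1, which="SA", tol=tol, return_eigenvectors=False)[0]
    # for very ill-conditioned H, shift-invert (sigma=0) may find lam_min faster
    return lam_min, lam_max
```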
4.4. Convergence Analysis with Blaschke Product
Another convergence analysis for approximating functions of matrices can be seen in []. Using the same notation as above and combining Theorem 5.2 with Equation (6.4) in [], we obtain a bound with the following form:
where is called the Blaschke product and . Using the same cyclic poles as -EKSM, we find , () to minimize
Note that for , achieves its maximum either when or when . The problem then becomes the following optimization problem for w:
It can be shown that the minimum is achieved when . The optimal w is then one root of a fourth-order equation, which is greater than :
where .
For the symmetric positive definite case, where is defined as in (14), ; thus, the optimal w in (20) only depends on the condition number of the matrix A.
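As a purely mechanical illustration of this step, the admissible root of the quartic can be computed numerically. The coefficients below are placeholders, since the actual coefficients of the quartic in (20) are not recoverable from the extracted text.

```python
import numpy as np

def admissible_quartic_root(coeffs, lower_bound):
    """Return a real root of the quartic (coefficients in descending powers of w)
    that exceeds the given lower bound, as required for the optimal w in (20)."""
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-12].real
    candidates = real[real > lower_bound]
    return candidates.min() if candidates.size else None

# placeholder coefficients for illustration only (not the quartic from the paper):
w_opt = admissible_quartic_root([1.0, -2.0, -5.0, 6.0, 1.0], lower_bound=1.0)
```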
The convergence analysis for the optimal pole based on [] involves a quartic function in (20), and it is difficult to find an explicit formula for the optimal pole. On the other hand, our analysis based on [] provides an explicit formula for the optimal pole in Theorem 1. In Section 6, we compare the theoretical convergence rates and actual performance of these two optimal poles on different benchmark problems.
5. RKSM with Several Cyclic Poles for Approximating f(A)b
In our problem setting, the shift-inverse matrix–vector operations for RKSM are performed by factorization-based direct linear solvers, and -EKSM usually outperforms EKSM in both subspace size and runtime. Compared with adaptive RKSM, -EKSM often takes more steps but less time to converge for large sparse SPD matrices, although its performance in both space and time can become inferior to that of adaptive RKSM for certain challenging nonsymmetric problems. To improve the performance of -EKSM, we consider using a few more fixed repeated poles. The rationale of this strategy is to strike a balanced tradeoff between -EKSM and the adaptive variants of RKSM, ensuring that this variant of RKSM has modest storage and runtime costs.
For example, we can consider such a method based on four repeated poles; a sketch of the resulting cyclic procedure is given after this paragraph. Starting with the optimal pole of -EKSM (17) and (the negative of the largest real part of all eigenvalues), we apply several steps of adaptive RKSM to find and use new poles until we find at least one pole smaller than and one pole greater than (both in terms of modulus). Among all poles obtained adaptively during this procedure, we let be the smallest one (in modulus) and be the largest one. It is not hard to see that the adaptive RKSM steps terminate with the last pole being either or . Our numerical experience suggests that an additional simple adjustment to or can help to improve convergence. Specifically, if , then is divided by a factor of ; otherwise, is multiplied by the same factor. Experimentally, we found that provides the best overall performance. Thus, we keep the LU factorizations associated with the four poles, and in each step we choose the pole cyclically from the set .
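The sketch below shows how the four poles can then be used, with one cached LU factorization per distinct pole and the poles applied cyclically. It is our own illustration: the adjustment factor c, the condition triggering the adjustment, and the ordering of the poles in the cycle are placeholders, because those details are not recoverable from the extracted text.

```python
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def build_pole_cycle(A, s_opt, s_neg, s_min, s_max, c=2.0):
    """Hypothetical sketch of the four-pole variant. s_opt and s_neg are the two
    starting poles, s_min/s_max the smallest/largest poles (in modulus) found by
    the adaptive warm-up; c is a placeholder for the adjustment factor."""
    # placeholder adjustment rule: shrink s_min or enlarge s_max by the factor c
    # (the exact condition used in the paper is not recoverable here)
    if abs(s_min) < abs(s_opt):
        s_min = s_min / c
    else:
        s_max = s_max * c
    poles = [s_opt, s_min, s_neg, s_max]           # placeholder ordering of the cycle
    n = A.shape[0]
    factor = {s: splu((A - s * sp.identity(n, format="csc")).tocsc()) for s in poles}
    def apply_step(m, v):
        # at step m, solve with the cached factorization of the cyclically chosen pole
        return factor[poles[m % len(poles)]].solve(v)
    return apply_step
```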
In fact, we can also apply the Blaschke product convergence analysis from Section 4.4: if we want to use four cyclic poles, we can solve the following optimization problem:
It takes time to compute the optimal poles numerically for a specific problem setting, and our numerical experience shows that the resulting method takes a similar number of iterations to converge as the above -EKSM variant with four poles.
6. Numerical Experiments
We tested different variants of RKSM for approximating f(A)b, where the functions were , , , , and . The first four consist of Markov-type functions and Markov-type functions multiplied by monomials , ; while the last function is not of Markov type, our algorithms exhibit similar behavior when approximating it as for the other functions. All experiments were carried out in MATLAB R2019b on a laptop with 16 GB DDR4 2400 MHz memory, the Windows 10 operating system, and a 2.81 GHz Intel dual-core CPU.
6.1. Asymptotic Convergence of EKSM and -EKSM
For a real symmetric positive definite matrix A, EKSM with cyclic poles 0, ∞, 0, ∞, … converges at least linearly as follows:
where (see Proposition 3.6 in []). Similarly, -EKSM with cyclic poles s, ∞, s, ∞, … converges at least linearly with factor
Because , -EKSM has a smaller upper bound on the convergence factor than EKSM.
For the optimal pole obtained from (20) using the Blaschke product technique, we denote the corresponding method as -EKSM* and its optimal pole as , with the convergence factor given in (19). Because is a root of a fourth-order equation, it is difficult to find its value explicitly; thus, we list several examples computing the poles and convergence factors for both single-pole methods. In addition, we list the convergence factors for the shift-inverse Arnoldi method (SI) based on one fixed nonzero pole:
(see [], Corollary 6.4a).
For a matrix A with the smallest eigenvalue and largest eigenvalue , Table 2 shows the difference in the upper bounds of their convergence factors; note that the asymptotic convergence factors are independent of the function f.
Table 2.
Bounds on the asymptotic convergence factor for EKSM, -EKSM, and -EKSM* with optimal pole .
It can be seen from Table 2 that -EKSM has a lower upper bound on the asymptotic convergence factor than EKSM, with -EKSM* having an even lower upper bound. The optimal pole for -EKSM is roughly two to three times that for -EKSM*. The shift-inverse Arnoldi method has the largest convergence factor; thus, we did not compare it with the other methods in our later tests.
To check the asymptotic convergence factor of each method, it is necessary to know the exact vector f(A)b for a given matrix A and vector b in order to calculate the norm of the error at each step. For relatively large problems, the exact f(A)b can only be evaluated directly within a reasonable time when A is diagonal. Because each SPD matrix is orthogonally similar to the diagonal matrix of its eigenvalues, our experimental results can be expected to reflect the behavior of EKSM, -EKSM, and -EKSM* applied to more general SPD matrices.
We constructed two diagonal matrices . The diagonal entry of is , 10,000, where the s are uniformly distributed on the interval and , . The diagonal entry of is , 20,000, . We approximated f(A)b using EKSM, -EKSM, and -EKSM*, where b is a fixed vector with standard normally distributed random entries. The experimental results are shown in Figure 1.
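A sketch of such a diagonal test setup follows; the matrix size, spectral interval, and eigenvalue distribution are assumptions for illustration only, since the exact values used by the authors are not recoverable from the extracted text.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(1)
n, lam_min, lam_max = 10_000, 1e-2, 1e4           # assumed size and spectral interval
eigvals = np.sort(lam_min * (lam_max / lam_min) ** rng.random(n))  # log-uniform spread (assumed)
D = sp.diags(eigvals)                              # diagonal SPD test matrix
b = rng.standard_normal(n)

f = lambda x: 1.0 / np.sqrt(x)                     # a Markov-type test function, f(z) = z^{-1/2}
exact = f(eigvals) * b                             # exact f(D) b, available since D is diagonal

def relative_error(approx):
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```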
Figure 1.
Actual and asymptotic convergence of EKSM, -EKSM, and -EKSM* for SPD matrices: (a) , and (b) , .
From the figures for different Markov-type functions, it can be seen that all methods converge with factors no worse than their theoretical bounds, which verifies the validity of the results shown in Theorem 1. Moreover, -EKSM and -EKSM* always converge faster than EKSM for all the test functions and matrices. In particular, for approximating , our -EKSM only takes about one quarter as many iterations to achieve a relative error less than as compared to EKSM. Furthermore, the theoretical bounds are not always sharp for those methods.
For nonsymmetric matrices with elliptic numerical ranges, the theoretical convergence rate cannot be derived by an explicit formula; Table 1 shows the numerical results. To verify these results, we constructed a block diagonal matrix with diagonal blocks with eigenvalues that lie on the circle centered at with radius . We then constructed another block diagonal matrix with eigenvalues that lie in the ellipse centered at , with a semi-major axis and semi-minor axis . The optimal pole for -EKSM can be computed using the strategy described in Section 4.2 ( and ). Figure 2 shows the spectra of the matrices , and for . Figure 3 shows the results for the nonsymmetric matrices. Table 1 shows the asymptotic convergence factors of EKSM and -EKSM for matrices and , where , , , and for , while , , , and for .
Figure 2.
Spectra of several matrices: (a) spectrum of ; (b) spectrum of ; (c) spectrum of ; (d) spectrum of ; (e) spectrum of ; (f) spectrum of .
Figure 3.
Performance and theoretical asymptotic convergence of EKSM and -EKSM for nonsymmetric matrices with elliptic spectra: (a) approximating ; (b) approximating .
Similar to the results for SPD matrices, -EKSM converges faster than EKSM for the two artificial nonsymmetric problems. The upper bounds on the convergence factor in Table 1 match the actual convergence factor quite well.
6.2. Test for Practical SPD Matrices
Next, we tested EKSM, -EKSM, -EKSM*, and adaptive RKSM on several SPD matrices and compared their runtimes and the dimension of their approximation subspaces. Note that for general SPD matrices the largest and smallest eigenvalues are usually not known in advance. The Lanczos method or its restarted variants can be applied to estimate them, and this computation time should be taken into consideration for -EKSM and -EKSM*. The variant of EKSM in (5) is not considered in this section, as there is no convergence theory to compare with the actual performance and the convergence rate largely depends on the choice of l and m in (5).
For the SPD matrices, we used Cholesky factorization with approximate minimum degree ordering to solve the linear systems involving A or A - sI for all four methods. The stopping criterion of all methods was to check whether the angle between the approximate solutions obtained at two successive steps fell below a certain tolerance. EKSM, -EKSM, and -EKSM* all apply the Lanczos three-term recurrence and perform local re-orthogonalization to enlarge the subspace, whereas adaptive RKSM applies a full orthogonalization process with global re-orthogonalization.
We tested four 2D discrete Laplacian matrices of orders , , , and based on standard five-point stencils on a square. For all problems, the vector b was a vector with entries of standard normally distributed random numbers, allowing the behavior of all four methods to be compared for matrices with different condition numbers.
Table 3 reports the runtimes and the dimensions of the rational Krylov subspaces that the four methods entail when applied to all test problems; in the table, EK, FEK, and ARK are abbreviations of EKSM, -EKSM, and adaptive RKSM, respectively. The stopping criterion was that the angle between the approximate solutions from two successive steps was less than . The single poles used for -EKSM are , and , respectively, while those for -EKSM* are , and , respectively. The shortest CPU time appearing in each line of the table is marked in bold.
Table 3.
Performance of EKSM, -EKSM, -EKSM*, and adaptive RKSM on SPD problems.
With only one exception, that of , it is apparent that -EKSM converges the fastest of the four methods in terms of wall clock time for all the test functions and matrices with different condition numbers. While -EKSM takes more steps than adaptive RKSM to converge, it requires fewer steps than EKSM. Furthermore, the advantage of -EKSM becomes more pronounced for matrices with a larger condition number. Notably, the advantage of -EKSM in terms of computation time is stronger than its advantage in terms of subspace dimension, because the orthogonalization cost is proportional to the square of the subspace dimension. -EKSM* takes slightly more steps than -EKSM to converge in these examples, and both methods have similar computation times.
The unusual behavior of all methods for can be explained as follows. For these Laplacian matrices, the largest eigenvalues range from to . Because and (because decreases monotonically on ), the eigenvector components in vector b associated with relatively large eigenvalues would be eliminated in vector in double precision. In fact, because , all eigenvalues of A greater than are essentially ‘invisible’ for under tolerance , and the effective condition number of all four Laplacian matrices is about . As a result, it takes EKSM the same number of steps to converge for all these matrices; the shift for -EKSM and -EKSM* computed using and of these matrices is in fact not optimal for matrices with such a small effective condition number.
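To make the ‘invisible eigenvalues’ argument concrete, a small check of ours (a decaying exponential is used purely as an illustration, since the particular function named in the original sentence is lost): in double precision, exp(−t) falls below the unit roundoff once t exceeds roughly 36.7, so eigencomponents of b associated with eigenvalues beyond that point make no representable contribution to exp(−A)b.

```python
import numpy as np

t = -np.log(np.finfo(float).eps / 2)   # threshold where exp(-t) reaches unit roundoff
print(t, np.exp(-t))                   # ~36.7, ~1.1e-16: larger eigenvalues are 'invisible'
```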
The pole in (17) for the SPD case is optimal, as we have proved that it yields the smallest upper bound on the asymptotic convergence factor among all choices of a single pole. In order to numerically compare the behavior for different settings of the single pole, we tested the matrices Lap. B and Lap. C in Table 3 with and , respectively. For each problem, we tested -EKSM with the single pole s set to , , , , and with for -EKSM*. Figure 4 shows the experimental results. It can be seen that -EKSM has the fastest asymptotic convergence rate among all six values of the single pole when the optimal single pole is used, which confirms that is indeed optimal in our experiments.
Figure 4.
Convergence of -EKSM for different settings of the single pole: (a) on Lap. B and (b) on Lap. C.
6.3. Test for Practical Nonsymmetric Matrices
We consider 18 nonsymmetric real matrices, all of whose eigenvalues lie in the right half of the complex plane. While these are all real sparse matrices, they all have complex eigenvalues (with positive real parts). Half are of the form , where both M and K are sparse and is not formed explicitly. Table 4 reports several features of each matrix A: the matrix size is n, the smallest and largest eigenvalues in absolute value are and , respectively, the smallest and largest real parts of the eigenvalues are and , respectively, and the largest imaginary part of the eigenvalues is . Note that all of these original matrices have spectra strictly in the left half of the complex plane; we simply switched their signs to make f(A) well-defined for Markov-type functions.
Table 4.
Selected features of the test problems.
For the single pole of -EKSM and the initial pole of the four-pole variant (4Ps) in Section 5, we used the optimal pole for the SPD case in (17) by setting and . The same settings of and for building the Riemann mapping in (14) were applied to compute the optimal pole for -EKSM* in (20). In particular, because precise evaluation of and is time-consuming, we approximated them using the ‘eigs’ function in MATLAB, which is based on the Krylov–Schur algorithm; see, e.g., []. We set the residual tolerance for ‘eigs’ to , ensuring that the largest and smallest eigenvalues could be found for all test matrices within a reasonable time. Higher accuracy in computing the eigenvalues is not required when determining the optimal pole, as the convergence of -EKSM does not change noticeably under tiny changes in the value of the pole. Note that the single pole we used for each problem is independent of the Markov-type function; see Table 4. For the nonsymmetric matrices, we used LU factorizations to solve the linear systems with coefficient matrices A or for all methods. The stopping criterion was either that the angle between the approximate solutions of two successive steps was less than a tolerance, or that the dimension of the Krylov subspace reached 1000. There have been a few discussions of restarting for approximating f(A)b, though only for polynomial approximations based on Arnoldi-like methods; see, e.g., [,]. In this paper, we only focus on comparing the convergence rates of several Krylov methods without restarting. Here, we need to choose a proper tolerance for each problem that is small enough to fully exhibit the rate of convergence of all methods while not being so small that it cannot be satisfied. The last column of Table 4 reports the tolerances, which are fixed regardless of the Markov-type function.
In Table 5, Table 6, Table 7, Table 8 and Table 9, we report the runtime and the dimension of the approximation spaces that the five methods entail for approximating f(A)b to the specified tolerances. The runtime includes the time spent on the evaluation of optimal poles for -EKSM, -EKSM*, and the four-pole variant. The “−” symbol indicates failure to find a sufficiently accurate solution when the maximum dimension of the approximation space (1000) was reached. The shortest CPU time appearing in each line of the tables is marked in bold. Figure 5 shows an example plot for each method, with the relative error (where is the approximation to at step k) plotted against the dimension of the approximation space for each function.
Table 5.
Performance of five rational Krylov subspace methods for the function .
Table 6.
Performance of five rational Krylov subspace methods for the function .
Table 7.
Performance of five rational Krylov subspace methods for the function .
Table 8.
Performance of five rational Krylov subspace methods for the function .
Table 9.
Performance of five rational Krylov subspace methods for the function .
Figure 5.
Decay of as the approximation space dimension increases: (a) on aerofoilA; (b) on matRe500E; (c) on plate; (d) on step.
In the ninety total cases for eighteen problems and five functions shown in Table 5, Table 6, Table 7, Table 8 and Table 9, the four-pole variant is the fastest in runtime in sixty cases and -EKSM is the fastest in fourteen cases. Among all cases when the four-pole variant is not the fastest, it is no more than slower than the fastest in twelve cases and 10– slower in eight cases.
Overall, the four-pole variant is the best in terms of runtime, though there are several exceptions. The first is tolosa, which is the only problem on which adaptive RKSM ran the fastest for all functions. For this problem, the dimension of the matrix is relatively small, which makes it affordable to compute a new factorization at each step, as the LU cost is low. Moreover, for tolosa, is close to , meaning that the algorithms based on repeated poles converge more slowly; see Table 1. The second exception is venkat, where the four-pole variant is not the fastest for . In fact, for , EKSM, -EKSM, and -EKSM* converge with a much smaller dimension of the approximation space than for the other functions. A possible explanation is that in computer arithmetic it is difficult to accurately capture the relative change in the values of for small arguments, and the majority of venkat’s eigenvalues are small in absolute value. For example, , which means that a relative change of in the argument of near leads only to a relative change of in the function value; thus, such a difference in the input cannot be observed above the given tolerance . The third exception is convdiffA, for which EKSM is the fastest for four out of the five functions. In fact, EKSM takes fewer steps to converge than -EKSM, which can be seen at the bottom right of Figure 5. A possible reason is that and can only be evaluated approximately by several iterations of the Arnoldi method, and sometimes their values cannot be found accurately; the ‘optimal’ pole based on inaccurate estimates can be far away from the true optimal pole. Notably, for this exception -EKSM* takes fewer steps to converge than EKSM, though it requires more computation time. This is because EKSM uses infinite poles, and for a matrix of the form with M an identity matrix, no linear solve is needed for the infinite poles, only a simple matrix–vector multiplication. For the other cases in which -EKSM or -EKSM* runs fastest, the four-pole variant usually runs only slightly slower, as in those cases it takes less time to enlarge the approximation space than to compute the additional LU factorizations needed for more repeated poles.
It is important to underscore that the runtime needed for -EKSM with our optimal pole is less than that for -EKSM* with an optimal pole derived based on [] for a majority of the nonsymmetric test matrices, which is similar to the minor advantage in runtime of our -EKSM shown in Table 3 for Laplacian matrices. In addition, the runtime of the four-pole variant suggests that if sparse LU factorization is efficient for the shifted matrices needed for RKSM, then using a small number of near optimal poles seems to be an effective way to achieve the lowest overall runtime.
In terms of the dimension of the approximation space, adaptive RKSM always needs the smallest subspace to converge, with the four-pole variant in second place except for on tolosa. In most cases, EKSM needs the largest subspace to converge, while in the others -EKSM needs the largest subspace. In the cases where both -EKSM and the four-pole variant converge, the four-pole variant takes to fewer steps.
In summary, our experiments suggest that -EKSM, -EKSM*, and the four-pole variant are competitive in reducing the runtime of rational Krylov methods based on direct linear solvers for approximating f(A)b; on the other hand, if the goal is to save storage cost, adaptive RKSM is preferable.
7. Conclusions
In this paper, we have studied an algorithm called the flexible extended Krylov subspace method (-EKSM) for approximating f(A)b for Markov-type functions f. The central idea is to find an optimal pole to replace the zero pole in EKSM such that -EKSM needs only a single LU factorization, as EKSM does, while converging more rapidly.
In the main theoretical contribution of this work, Theorem 1, we prove that for a symmetric positive definite matrix there exists a unique optimal pole yielding the lowest upper bound on the asymptotic convergence factor of -EKSM, which is always lower than the corresponding bound for EKSM. The theorem provides formulas for both the optimal pole and the upper bound on the convergence factor. Numerical experiments show that -EKSM is more efficient than EKSM and that it is competitive in runtime compared with adaptive RKSM when the shifted linear systems needed for rational Krylov methods are solved with a direct linear solver.
-EKSM may lose its advantages for challenging nonsymmetric matrices because of possible failure to compute the optimal poles numerically and due to its relatively slow convergence rate for these problems. This performance can be improved by using four fixed poles chosen flexibly in the early stage of computation. Our numerical results show that the four-pole variant is the most efficient in terms of runtime for many problems.
Author Contributions
Conceptualization, S.X. and F.X.; methodology, S.X. and F.X.; software, S.X.; validation, S.X.; formal analysis, S.X. and F.X.; investigation, S.X.; resources, F.X.; data curation, F.X.; writing—original draft preparation, S.X.; writing—review and editing, F.X.; visualization, S.X.; supervision, F.X.; project administration, F.X.; funding acquisition, F.X. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the U.S. National Science Foundation under grants DMS-1719461, DMS-1819097, and DMS-2111496.
Data Availability Statement
Not applicable.
Acknowledgments
The authors would like to thank Jeong-Rock Yoon for useful conversations and suggestions on the proof of the bounded rotation of . In addition, we thank the two anonymous reviewers for their helpful comments and suggestions.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Proof of Lemma 1
Because : () and , , , we can conclude that , and are all greater than 1.
Equation (10) has been proved in Section 3, Lemma 3.1 in []. For (11) involving , our proof is analogous to what was done for .
Because denotes the Riemann mapping, for a fixed we can write the Faber polynomials by means of their generating function []:
We define a new variable . Then, from (9), we have
Because is symmetric with respect to the real axis and lies in the left half of the complex plane, monotonically maps onto and onto . We then have the following properties for and :
where are constants independent of .
It follows that the integral in the last expression of (A1) satisfies
Note that for any it is the case that . It follows that .
We note that the map () from to can be written as a composition of three maps , where , , and , all of which are bijective. We denote and as the boundary of and , respectively. Let be the image of under and let be the image of under (which is the preimage of under ). Because is the numerical range of A, it is convex and compact per the Hausdorff–Toeplitz theorem; therefore, has a boundary rotation of . As translates horizontally to the direction of the positive real axis, it preserves the shape of the preimage such that is of bounded rotation as well. In addition, it can be shown that maps to with bounded rotation (see details in Lemma A1 shown below). Finally, because preserves the shape of the preimage, has bounded rotation. From Chapter IX, Section 3, Theorem 11 in [], we have , and the following inequality holds for :
where is some real positive constant independent of m.
In light of the above observations, if we denote
then
where is an upper bound of due to .
For , there are two cases to derive the minimum of .
Case 1: if , then . Because lies strictly in the right half of the plane, by definition lies between the vertical lines and ; thus, and we have
Case 2: if , then ; clearly, , and because , we have
Note that the conclusion of Case 2 is valid for Case 1; therefore,
Lemma A1.
Define the mapping , , where is the boundary of a compact convex domain lying strictly in the right half of the complex plane and is symmetric with respect to the real axis. Let be the image of the interval under the injection , which is assumed to be absolutely continuous. Then, has bounded rotation.
Proof.
We define as the subset where is continuous. Note that the directional angle of the tangent line to at t is . The boundary rotation of is defined in Page 270 of reference [] as follows:
which is due to the convexity of the domain enclosed by .
Note that is the image of the interval under the injection . Similarly, the directional angle of the tangent line to at t is
Because f is a conformal mapping that preserves angles, the variations of the directional angle at the discontinuities of (if any) are preserved. This can be written as . The boundary rotation of can be written as
where
Because is the boundary of a compact convex domain, lies strictly in the right half of the complex plane, and is symmetric with respect to the real axis, there exists such that
and we can select , such that
Note that while these may not be unique, they split into two disjoint continuous branches, denoted as and , on both of which is monotonic (though not necessarily strictly) with respect to t. It follows that
because the two end points of and are and , respectively. Here, ; thus, we have
Moreover, because , we have . The claim is established. □
Appendix B. Proof of Lemma 3
We first define and . For a fixed pole , are functions of the variable a; specifically, M is a linear function of a, is a constant independent of a, and is linear in the reciprocal of a shifted value of a (see (16)). Here, we are interested in their absolute values.
We can set up a Cartesian coordinate system to illustrate M, , and as functions of and compare their values. The horizontal asymptote of the function is . First, we need to compare this with .
Case 1: .
The illustration is shown in Figure A1. Because is constant, , and for , ; thus, such that . To sum up, in this case we have .
Figure A1.
Sketch map of Case 1 in proof of Lemma 3.
Case 2: .
There is an intersection between and for , which we denote as . Solving the equation for a, we obtain . Note that can be either above or below the line of M.
First, if is below or on the line of M, then
Letting , we need to find the interval of s such that . Here, is a cubic function and its two stationary points have positive function values; thus, it only has one real root. We can use the Cardano formula [] to obtain its real root:
Therefore, for we have .
We denote the intersection between M and as . As shown in Figure A2, when is between and , .
Figure A2.
Sketch map of Case 2 in proof of Lemma 3: (a) when is below or on the line of M and (b) when is above the line of M.
Second, if is above the line of M, then ; see Figure A2.
When , we have
We only need the root for , which is . Therefore,
References
- Higham, N.J. Functions of Matrices: Theory and Computation; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 2008; pp. xx+425.
- Schilders, W.H.A.; van der Vorst, H.A.; Rommes, J. (Eds.) Model Order Reduction: Theory, Research Aspects and Applications; Mathematics in Industry; Springer: Berlin/Heidelberg, Germany, 2008; Volume 13, pp. xii+471.
- Grimm, V.; Hochbruck, M. Rational approximation to trigonometric operators. BIT 2008, 48, 215–229.
- Burrage, K.; Hale, N.; Kay, D. An efficient implicit FEM scheme for fractional-in-space reaction-diffusion equations. SIAM J. Sci. Comput. 2012, 34, A2145–A2172.
- Bloch, J.C.; Heybrock, S. A nested Krylov subspace method to compute the sign function of large complex matrices. Comput. Phys. Commun. 2011, 182, 878–889.
- Kressner, D.; Tobler, C. Krylov subspace methods for linear systems with tensor product structure. SIAM J. Matrix Anal. Appl. 2010, 31, 1688–1714.
- Hochbruck, M.; Ostermann, A. Exponential Runge-Kutta methods for parabolic problems. Appl. Numer. Math. 2005, 53, 323–339.
- Hochbruck, M.; Lubich, C. Exponential integrators for quantum-classical molecular dynamics. BIT 1999, 39, 620–645.
- Bai, Z. Krylov subspace techniques for reduced-order modeling of large-scale dynamical systems. Appl. Numer. Math. 2002, 43, 9–44.
- Freund, R.W. Krylov-subspace methods for reduced-order modeling in circuit simulation. J. Comput. Appl. Math. 2000, 123, 395–421.
- Wang, S.; de Sturler, E.; Paulino, G.H. Large-scale topology optimization using preconditioned Krylov subspace methods with recycling. Internat. J. Numer. Methods Engrg. 2007, 69, 2441–2468.
- Biros, G.; Ghattas, O. Parallel Lagrange-Newton-Krylov-Schur methods for PDE-constrained optimization. I. The Krylov-Schur solver. SIAM J. Sci. Comput. 2005, 27, 687–713.
- Simoncini, V. Computational methods for linear matrix equations. SIAM Rev. 2016, 58, 377–441.
- Druskin, V.; Knizhnerman, L. Extended Krylov subspaces: Approximation of the matrix square root and related functions. SIAM J. Matrix Anal. Appl. 1998, 19, 755–771.
- Bergamaschi, L.; Caliari, M.; Martínez, A.; Vianello, M. Comparing Leja and Krylov approximations of large scale matrix exponentials. In Proceedings of the Computational Science–ICCS 2006, Reading, UK, 28–31 May 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 685–692.
- Knizhnerman, L.; Simoncini, V. Convergence analysis of the extended Krylov subspace method for the Lyapunov equation. Numer. Math. 2011, 118, 567–586.
- Eiermann, M.; Ernst, O.G. A restarted Krylov subspace method for the evaluation of matrix functions. SIAM J. Numer. Anal. 2006, 44, 2481–2504.
- Frommer, A.; Simoncini, V. Stopping criteria for rational matrix functions of Hermitian and symmetric matrices. SIAM J. Sci. Comput. 2008, 30, 1387–1412.
- Popolizio, M.; Simoncini, V. Acceleration techniques for approximating the matrix exponential operator. SIAM J. Matrix Anal. Appl. 2008, 30, 657–683.
- Bai, Z.Z.; Miao, C.Q. Computing eigenpairs of Hermitian matrices in perfect Krylov subspaces. Numer. Algorithms 2019, 82, 1251–1277.
- Knizhnerman, L.; Simoncini, V. A new investigation of the extended Krylov subspace method for matrix function evaluations. Numer. Linear Algebra Appl. 2010, 17, 615–638.
- Simoncini, V. A new iterative method for solving large-scale Lyapunov matrix equations. SIAM J. Sci. Comput. 2007, 29, 1268–1288.
- Güttel, S.; Knizhnerman, L. A black-box rational Arnoldi variant for Cauchy-Stieltjes matrix functions. BIT 2013, 53, 595–616.
- Druskin, V.; Simoncini, V. Adaptive rational Krylov subspaces for large-scale dynamical systems. Syst. Control Lett. 2011, 60, 546–560.
- Benzi, M.; Simoncini, V. Decay bounds for functions of Hermitian matrices with banded or Kronecker structure. SIAM J. Matrix Anal. Appl. 2015, 36, 1263–1282.
- Beckermann, B.; Reichel, L. Error estimates and evaluation of matrix functions via the Faber transform. SIAM J. Numer. Anal. 2009, 47, 3849–3883.
- Xu, S.; Xue, F. Inexact rational Krylov subspace methods for approximating the action of functions of matrices. Electron. Trans. Numer. Anal. 2023, 58, 538–567.
- Beckermann, B.; Güttel, S. Superlinear convergence of the rational Arnoldi method for the approximation of matrix functions. Numer. Math. 2012, 121, 205–236.
- Hochbruck, M.; Lubich, C. On Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 1997, 34, 1911–1925.
- Druskin, V.; Knizhnerman, L. Krylov subspace approximation of eigenpairs and matrix functions in exact and computer arithmetic. Numer. Linear Algebra Appl. 1995, 2, 205–217.
- Druskin, V.; Knizhnerman, L.; Zaslavsky, M. Solution of large scale evolutionary problems using rational Krylov subspaces with optimized shifts. SIAM J. Sci. Comput. 2009, 31, 3760–3780.
- Druskin, V.; Knizhnerman, L.; Simoncini, V. Analysis of the rational Krylov subspace and ADI methods for solving the Lyapunov equation. SIAM J. Numer. Anal. 2011, 49, 1875–1898.
- Berljafa, M.; Güttel, S. Generalized rational Krylov decompositions with an application to rational approximation. SIAM J. Matrix Anal. Appl. 2015, 36, 894–916.
- Jagels, C.; Reichel, L. Recursion relations for the extended Krylov subspace method. Linear Algebra Appl. 2011, 434, 1716–1732.
- Jagels, C.; Reichel, L. The extended Krylov subspace method and orthogonal Laurent polynomials. Linear Algebra Appl. 2009, 431, 441–458.
- Ruhe, A. Rational Krylov sequence methods for eigenvalue computation. Linear Algebra Appl. 1984, 58, 391–405.
- Ruhe, A. Rational Krylov algorithms for nonsymmetric eigenvalue problems. In Recent Advances in Iterative Methods; The IMA Volumes in Mathematics and Its Applications; Springer: New York, NY, USA, 1994; Volume 60, pp. 149–164.
- Güttel, S. Rational Krylov approximation of matrix functions: Numerical methods and optimal pole selection. GAMM-Mitt. 2013, 36, 8–31.
- Bagby, T. The modulus of a plane condenser. J. Math. Mech. 1967, 17, 315–329.
- Gončar, A.A. The problems of E. I. Zolotarev which are connected with rational functions. Mat. Sb. 1969, 78, 640–654.
- Caliari, M.; Vianello, M.; Bergamaschi, L. Interpolating discrete advection-diffusion propagators at Leja sequences. J. Comput. Appl. Math. 2004, 172, 79–99.
- Caliari, M.; Vianello, M.; Bergamaschi, L. The LEM exponential integrator for advection-diffusion-reaction equations. J. Comput. Appl. Math. 2007, 210, 56–63.
- Druskin, V.; Lieberman, C.; Zaslavsky, M. On adaptive choice of shifts in rational Krylov subspace reduction of evolutionary problems. SIAM J. Sci. Comput. 2010, 32, 2485–2496.
- Ruhe, A. Rational Krylov: A practical algorithm for large sparse nonsymmetric matrix pencils. SIAM J. Sci. Comput. 1998, 19, 1535–1551.
- Botchev, M.A.; Grimm, V.; Hochbruck, M. Residual, restarting, and Richardson iteration for the matrix exponential. SIAM J. Sci. Comput. 2013, 35, A1376–A1397.
- Saad, Y. Analysis of some Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 1992, 29, 209–228.
- van den Eshof, J.; Hochbruck, M. Preconditioning Lanczos approximations to the matrix exponential. SIAM J. Sci. Comput. 2006, 27, 1438–1457.
- Björck, A.; Golub, G.H. Numerical methods for computing angles between linear subspaces. Math. Comp. 1973, 27, 579–594.
- De Sturler, E. Truncation strategies for optimal Krylov subspace methods. SIAM J. Numer. Anal. 1999, 36, 864–889.
- Elman, H.C.; Su, T. Low-rank solution methods for stochastic eigenvalue problems. SIAM J. Sci. Comput. 2019, 41, A2657–A2680.
- Bak, J.; Newman, D.J. Complex Analysis, 3rd ed.; Undergraduate Texts in Mathematics; Springer: New York, NY, USA, 2010; pp. xii+328.
- Suetin, P.K. Series of Faber Polynomials; Analytical Methods and Special Functions; Gordon and Breach Science Publishers: Amsterdam, The Netherlands, 1998; Volume 1, pp. xx+301.
- Crouzeix, M. Numerical range and functional calculus in Hilbert space. J. Funct. Anal. 2007, 244, 668–690.
- Driscoll, T.A. Algorithm 843: Improvements to the Schwarz-Christoffel toolbox for MATLAB. ACM Trans. Math. Softw. 2005, 31, 239–251.
- Stewart, G.W. A Krylov-Schur algorithm for large eigenproblems. SIAM J. Matrix Anal. Appl. 2001, 23, 601–614.
- Frommer, A.; Güttel, S.; Schweitzer, M. Convergence of restarted Krylov subspace methods for Stieltjes functions of matrices. SIAM J. Matrix Anal. Appl. 2014, 35, 1602–1624.
- Curtiss, J.H. Faber polynomials and the Faber series. Am. Math. Mon. 1971, 78, 577–596.
- Duren, P.L. Univalent Functions; Fundamental Principles of Mathematical Sciences; Springer: New York, NY, USA, 1983; Volume 259, pp. xiv+382.
- Bronshtein, I.N.; Semendyayev, K.A.; Musiol, G.; Mühlig, H. Handbook of Mathematics, 6th ed.; Springer: Berlin/Heidelberg, Germany, 2015; pp. xliv+1207.