Abstract
This paper considers the computation of approximations of matrix functionals of the form $v^{T} f(A)\, v$, where A is a large symmetric positive definite matrix, v is a vector, and f is a Stieltjes function. The functional is approximated by a rational Gauss quadrature rule with poles on the negative real axis (or part thereof) in the complex plane, and we focus on the allocation of the poles. Specifically, we propose that the poles, when considered as positive point charges, be allocated so that the negative real axis (or part thereof) approximates an equipotential curve. This is easily achieved with the aid of conformal mapping.
MSC:
65D15; 65D32; 65E05
1. Introduction
We are concerned with the development of numerical methods for the approximation of expressions of the form

$v^{T} f(A)\, v,$  (1)

where $A \in \mathbb{R}^{n \times n}$ is a large symmetric positive definite matrix, the vector $v \in \mathbb{R}^{n}$ is of unit Euclidean norm, f is a Stieltjes function, and the superscript T denotes transposition. Thus, f can be represented as

$f(z) = \int_{0}^{\infty} \frac{d\mu(t)}{t + z},$  (2)

where $\mu$ is a nonnegative distribution function defined on the interval $[0, \infty)$ such that the integral (2) is defined; see, e.g., [1,2,3,4] for discussions and illustrations of Stieltjes functions. Examples of Stieltjes functions include $f(z) = z^{-1/2}$ and $f(z) = \log(1 + z)/z$.
When the matrix A is large, straightforward evaluation of (1) by first computing $f(A)$ may be prohibitively expensive; see Higham [5] for discussions and analyses of many methods for the evaluation of matrix functions. We are, therefore, interested in exploring the possibility of reducing A to a fairly small matrix by a rational Krylov subspace method and approximating (1) by evaluating the corresponding functional for the reduced matrix. This results in a rational approximant of f, whose poles are allocated by the user on the negative real axis. The rational approximant can be chosen to converge to zero at infinity, just as f does. This paper focuses on the choice of the poles. In particular, we propose to allocate the poles so that, when considered as positive point charges, they roughly make the negative real axis, or part thereof, an equipotential curve. This can easily be achieved with the aid of conformal mapping.
As in [6], we express the rational approximant as a linear combination of rational orthogonal functions with specified poles and use the fact that these functions satisfy a recurrence relation with only a few terms. We use a rational Gauss quadrature rule to determine an approximation of an integral of this rational approximant. The novelty of the present paper is that it addresses the allocation of the poles of the rational approximant; this issue was not considered in [6]. A nice introduction to rational Gauss quadrature is provided by Gautschi (Section 3.1.4 in [7]).
The choice of poles for rational approximants of Stieltjes functions has previously been discussed by Güttel and Knizhnerman [8], who propose a greedy algorithm, and by Massei and Robol [4], who express the approximation problem as a Zolotarev minimization problem. We will allocate the poles with the aid of conformal mapping. This approach is based on the observation that the approximation error is invariant under conformal mapping. Specifically, we allocate the poles equidistantly on the unit circle and then map the circle conformally to the negative real axis or a subset thereof. A benefit of this approach is that it is very easy to implement; in particular, it does not require the solution of a minimization problem.
This paper is organized as follows. Section 2 expresses (1) as an integral of the Stieltjes function over the positive real axis with respect to a measure that is determined by both the matrix A and vector v. This kind of transformation is described, e.g., by Golub and Meurant (Chapter 7 in [9]). Properties of orthogonal rational functions with real poles are reviewed in Section 3, and rational Gauss quadrature is discussed in Section 4. Our approach to choosing the poles is described in Section 5. A few computed examples are presented in Section 6, and concluding remarks can be found in Section 7.
2. Integration of Stieltjes Functions
We derive an integral that will be approximated by a rational Gauss quadrature rule. A similar derivation can be found in [6]. Our derivation uses the spectral factorization

$A = U \Lambda U^{T}, \qquad \Lambda = \operatorname{diag}[\lambda_{1}, \lambda_{2}, \ldots, \lambda_{n}],$  (3)

where the matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal and the eigenvalues of A are ordered according to $0 < \lambda_{1} \le \lambda_{2} \le \cdots \le \lambda_{n}$. We remark that the spectral factorization (3) is helpful for deriving our approximation method. However, the application of the quadrature rules of this paper does not require the computation of this factorization.
We introduce the vector $\widehat{v} = U^{T} v = [\widehat{v}_{1}, \ldots, \widehat{v}_{n}]^{T}$. Using the factorization (3), the expression (1) with f defined by (2) can be written as

$v^{T} f(A)\, v = \widehat{v}^{\,T} f(\Lambda)\, \widehat{v} = \sum_{i=1}^{n} f(\lambda_{i})\, \widehat{v}_{i}^{\,2} = \int f(\lambda)\, d\omega(\lambda),$  (4)

where the nonnegative measure $d\omega$ has support at the eigenvalues of A. The distribution function $\omega$ can be chosen to be a nondecreasing piecewise constant function with jumps of size $\widehat{v}_{i}^{\,2}$ at the eigenvalues of A. We will approximate the integral (4) by a rational Gauss quadrature rule.
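The identity (4) is easy to check numerically. The following sketch (in Python rather than the paper's MATLAB; the test matrix, vector, and the choice $f(z) = z^{-1/2}$ are illustrative, not taken from the paper) confirms that the quadratic form equals the spectral sum:

```python
import numpy as np

# Illustrative check of (4): v^T f(A) v = sum_i f(lambda_i) * vhat_i^2,
# where vhat = U^T v and A = U diag(lambda) U^T.
rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)          # symmetric positive definite test matrix
v = rng.standard_normal(n)
v /= np.linalg.norm(v)               # unit Euclidean norm, as assumed in (1)

f = lambda z: z ** -0.5              # a Stieltjes function

lam, U = np.linalg.eigh(A)           # spectral factorization (3)
vhat = U.T @ v
integral = vhat ** 2 @ f(lam)        # the integral (4) as a finite sum

direct = v @ (U @ np.diag(f(lam)) @ U.T @ v)   # v^T f(A) v directly
assert abs(integral - direct) < 1e-12
```
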
3. Recursion Relations for Orthogonal Rational Functions
We first review results in [10] on recursion relations for certain orthogonal rational functions. In this reference, functions with real poles and complex conjugate poles are allowed. Here we focus on rational functions with real poles only. The number of terms in the recursion relations depends on the number of distinct poles of the rational functions as well as on the ordering of certain elementary rational basis functions to be defined below.
We introduce linear spaces of rational functions with finite distinct real poles of multiplicity ,
In our application, the poles will be allocated on the negative real axis. Let
and define the -dimensional linear space
Here is chosen so that . The space (7) then contains linear functions. Let
denote an elementary basis for the space , i.e., and each basis function , for , is one of the functions
for some positive integers i and j. We use the notation to indicate that the basis function comes before . We refer to the ordering of the basis functions (8) as natural if and the remaining functions , , satisfy:
- for all integers ,
- for all integers and every real pole .
In particular, if for some , then some preceding elementary basis function with equals . An analogous statement also holds for negative powers for .
It is shown in [10] that a Stieltjes procedure for orthogonalizing elementary basis functions with respect to some inner product requires only short recursion relations when the basis functions are in natural order. A recursion relation is said to be short if the number of terms in the relation can be bounded independently of the number of orthogonal functions generated. A well-known special case is the situation in which all elementary basis functions are monomials. Then the Stieltjes procedure yields a family of orthogonal polynomials, with the jth polynomial of degree j for all j, and the polynomials satisfy a three-term recurrence relation.
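For the all-monomial special case, the Stieltjes procedure is easy to state concretely. The sketch below (Python; the discrete measure is hypothetical test data) orthogonalizes $1, x, x^2, \ldots$ against a discrete measure and records the recurrence coefficients, which form a symmetric tridiagonal (Jacobi) matrix:

```python
import numpy as np

# Sketch of the Stieltjes procedure in the all-monomial case: the orthonormal
# polynomials satisfy the three-term recurrence
#   x p_j = b_j p_{j+1} + a_j p_j + b_{j-1} p_{j-1}.
# The nodes/weights of the discrete measure are illustrative, not from the paper.
nodes = np.linspace(1.0, 10.0, 40)        # support of the measure
weights = np.full(nodes.size, 1.0 / 40)   # jumps of the distribution function

def stieltjes(nodes, weights, m):
    """Recurrence coefficients of the first m orthonormal polynomials."""
    p_prev = np.zeros_like(nodes)
    p = np.ones_like(nodes) / np.sqrt(weights.sum())
    alpha, beta = [], []
    for _ in range(m):
        a = weights @ (nodes * p * p)                  # <x p, p>
        q = nodes * p - a * p - (beta[-1] if beta else 0.0) * p_prev
        b = np.sqrt(weights @ (q * q))                 # norm of the residual
        alpha.append(a)
        beta.append(b)
        p_prev, p = p, q / b
    return np.array(alpha), np.array(beta[:-1])

a, b = stieltjes(nodes, weights, 5)
J = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)  # Jacobi matrix: short recurrence
```

The eigenvalues of J are the Gauss quadrature nodes for this measure and lie in the convex hull of its support.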
Similarly, when orthogonalizing elementary basis functions in the natural order with respect to a bilinear form , where is a linear functional on , the orthogonal rational basis functions obtained satisfy short recurrence relations, provided that for all j. We will orthogonalize the elementary basis functions with respect to the bilinear form
This bilinear form is an inner product for all elementary basis functions of low enough order.
When the elementary basis functions are of the form (9), the number of terms required in the recurrence relations depends on how often powers of the same function appear in the sequence (8). We review two types of recursion relations from [10] that are required below. First, consider the recursion relation
where is the largest integer smaller than r such that is a monomial if there is such a monomial (otherwise ), and is the smallest integer larger than r such that is a monomial. To introduce a real pole α we need the recursion relation
where is the largest integer smaller than r such that is a rational function with a pole at , provided that there is such a rational function (otherwise ), and is the smallest integer larger than r such that is a rational function with a pole at .
We introduce the vector of orthonormal rational functions,
Assume that is a monomial and that is the largest integer smaller than m such that is a monomial. Then the recursion formulas (12) can be written in the form
where denotes the jth axis vector. The matrix in (13) has the following block-diagonal structure: the matrix has square blocks along the diagonal, such that any two consecutive blocks overlap in one diagonal element. The jth block of is of dimension , where is the number of rational functions between consecutive monomials and for some j. In detail, the jth block of is the submatrix (using MATLAB notation) with the entries for , where and . Further, . The nonvanishing entries of are recursion coefficients for the orthogonal rational functions . The recursion coefficients satisfy
This shows that the matrix depends only on the first m elementary basis functions , . In particular, is independent of . Throughout this paper, we will assume that is a monomial because then the zeros of are eigenvalues of .
The structure of the matrix is illustrated by the following example.
Example 1.
Consider the elementary basis
This elementary basis, together with the function , where k is defined by (6) and , satisfies the requirements of natural ordering. The basis (14) spans the space . The matrix determined by this basis has k 3 × 3 blocks along the diagonal,
Here we mark matrix entries that may be nonvanishing by ∗. If we place one more elementary rational basis function between the two consecutive monomials and in the elementary basis, then the size of the jth block of increases by one. Analogously, if we remove one elementary rational basis function between two consecutive monomials, then the size of the corresponding block decreases by one.
Assume that the functions , , are orthonormal with respect to the bilinear form (10). Then the vectors
form an orthonormal basis for the rational Krylov subspace
for , with respect to the bilinear form (10). Indeed,
We note that the vectors and the orthogonal rational functions satisfy the same recursion relations. The vectors, therefore, can be determined by the rational Lanczos process (Algorithm 1). This process is analogous to the Stieltjes-type procedure (Algorithm 3.1 in [10]) for computing an orthonormal basis for the space . The rational Lanczos process is based on the recursion relations (12). The norm in Algorithm 1 is the Euclidean vector norm.
The execution of Algorithm 1 requires the solution of linear systems of equations with matrices of the form $A - \alpha_{j} I$, where the $\alpha_{j}$ are suitable real scalars determined by the poles. We assume, in this paper, that the matrix A has a structure that allows fairly efficient solution of these systems by a direct or an iterative method. This is, for instance, the case when A is banded with a small bandwidth or when A is a symmetric positive definite Toeplitz matrix; see, e.g., ref. [11] for fast solvers for the latter kind of systems of equations. We note here that since A is symmetric positive definite and the poles are allocated on the negative real axis, the matrices $A - \alpha_{j} I$ also are symmetric positive definite. Moreover, when A is obtained by discretizing an elliptic partial differential equation, it may be possible to solve the required linear systems rapidly by a multigrid method.
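When the systems are solved by factorization, the factorization can be computed once per distinct pole and reused for every subsequent solve with that shift. A minimal sketch (Python with SciPy; the matrix and pole values are illustrative):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Each distinct negative pole alpha needs solves with the SPD matrix
# A - alpha*I; factor once per pole, then every later solve with that shift
# costs only two triangular substitutions. Test data below is illustrative.
rng = np.random.default_rng(1)
n = 200
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)           # symmetric positive definite
poles = [-0.5, -2.0, -8.0]            # distinct poles on the negative real axis

factors = {a: cho_factor(A - a * np.eye(n)) for a in poles}  # one Cholesky each

w = rng.standard_normal(n)
for a in poles:
    x = cho_solve(factors[a], w)      # reuse the stored factorization
    assert np.allclose((A - a * np.eye(n)) @ x, w)
```

This is why fewer distinct poles can be cheaper overall even if more quadrature steps are needed: the factorization cost scales with the number of distinct shifts, not with the pole multiplicities.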
| Algorithm 1 The rational Lanczos process |
| 1: Input: , a sequence of matrix-valued elementary basis functions , and functions for evaluating matrix-vector products with A and for solving linear systems of equations with matrices of the form . Thus, we do not explicitly form the elementary basis functions and . The sequence of elementary basis functions implicitly defines the integers j and ℓ used below. |
| 2: Output: Orthonormal basis . |
| 3: Initialization: |
| 4: while do |
| 5: if for some then |
| 6: ; |
| 7: for do |
| 8: ; ; |
| 9: end for |
| 10: ; ; |
| 11: |
| 12: else if for some then |
| 13: ; |
| 14: for do |
| 15: ; ; |
| 16: end for |
| 17: ; ; |
| 18: |
| 19: end if |
| 20: end while |
In Algorithm 1, we assume that the basis (8) satisfies conditions 1 and 2 of natural ordering. The value of is such that is a basis function with the same poles as , with at least one of the poles of smaller order; if no such basis function exists, then . When carrying out m steps of Algorithm 1, we obtain an orthonormal basis for the rational Krylov subspace .
The matrix
and the symmetric matrix
satisfy the relation
where the vector is such that . The subspace , , is spanned by the first i columns of the matrix . In particular, the rational Krylov subspaces are nested, i.e., . The orthogonal projection of A onto the rational Krylov subspace (15) is given by
We will assume that m is small enough so that the decomposition (16) with the stated properties exists. This is the generic situation. Note that the determination of the matrix only requires that steps of Algorithm 1 be carried out.
4. Rational Gauss Quadrature
This section describes properties of rational Gauss quadrature rules for the approximation of functionals (1) when f is a Stieltjes function (2) and A is a symmetric positive definite matrix. Then
It is shown in [12] that
is a rational Gauss quadrature rule for the approximation of (17). The rational Gauss quadrature rule is characterized by the property
where
denotes a -dimensional linear subspace spanned by the chosen elementary rational functions.
In view of (4), the rational Gauss rule (18) can be expressed as
where we recall that the measure $d\mu$ in (2) defines the function f; the rational Gauss rule is with respect to the measure $d\omega$ in (4).
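In the special case when all poles are placed at infinity, the rule reduces to the standard Gauss rule generated by the polynomial Lanczos process, which is convenient for a quick sanity check. A sketch under that assumption (Python; Golub–Meurant-style evaluation, cf. [9]; test matrix and $f(z) = z^{-1/2}$ are illustrative):

```python
import numpy as np

# Polynomial special case of the Gauss rule: m Lanczos steps give a symmetric
# tridiagonal T_m, and v^T f(A) v (with ||v|| = 1) is approximated by
# e_1^T f(T_m) e_1, evaluated through the eigendecomposition of T_m.
rng = np.random.default_rng(2)
n, m = 300, 12
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

def lanczos(A, v, m):
    a, b = np.zeros(m), np.zeros(m - 1)
    V = np.zeros((len(v), m))
    V[:, 0] = v
    for j in range(m):
        w = A @ V[:, j] - (b[j - 1] * V[:, j - 1] if j > 0 else 0.0)
        a[j] = V[:, j] @ w
        w -= a[j] * V[:, j]
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)   # reorthogonalization
        if j < m - 1:
            b[j] = np.linalg.norm(w)
            V[:, j + 1] = w / b[j]
    return np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

f = lambda z: z ** -0.5
T = lanczos(A, v, m)
theta, Q = np.linalg.eigh(T)
gauss = Q[0] ** 2 @ f(theta)          # e_1^T f(T_m) e_1

lam, U = np.linalg.eigh(A)
exact = (U.T @ v) ** 2 @ f(lam)       # v^T f(A) v via (4)
assert abs(gauss - exact) / exact < 1e-6
```

The rational rule (19) replaces the tridiagonal matrix by the block-structured projected matrix of (13), which incorporates the finite poles.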
The orthonormal rational function has m distinct zeros that lie in the convex hull of the support of the measure , and they are the eigenvalues of in (13); see (Theorem 2.5 in [12]). Recall that we assume to be a monomial. Thus the rational function in (13) can be written as
where
is a polynomial of degree k with k defined by (6). The scalars are the poles of the rational approximant; see the beginning of Section 3. Moreover, is a scalar. A remainder term for the rule (19) is derived in [6].
5. Allocation of Poles
The formulas in the previous section are valid independently of the location of the distinct negative real poles and their multiplicities. However, their location and multiplicities affect the size of the quadrature error
Example 5.3 reported in [6] indicates that when the rational approximant of the Stieltjes function f has more distinct poles, the rational Gauss rule (19) may give a smaller quadrature error (21) than when a rational approximant of the same order has fewer distinct poles. However, no discussion of the allocation of the poles is provided in [6]. Here we propose to allocate the poles so that, when considered as positive point charges, they approximately make the negative real axis an equipotential curve. An allocation of poles that satisfies this property is very easy to determine with the aid of conformal mapping. Computed examples in Section 6 illustrate the performance of this pole allocation technique.
We remark that using many distinct poles may not be advantageous because each distinct pole requires the solution of a linear system of equations of the form
for some vector w; see the discussion in Section 3. When the systems of equations are solved by factorization of the system matrix, each distinct pole requires the computation of a factorization. It follows that the computing time required to achieve a quadrature error (21) that is smaller than a specified tolerance may be smaller when using a rational approximant with fewer distinct poles than when using a rational approximant of the same order with more distinct poles. The optimal choice of the number of distinct poles depends on the structure of A, i.e., on how demanding the computation of the factorization of the matrix is. This section considers the allocation of ℓ distinct poles; cf. (5).
We set the poles to be images under a conformal mapping of equidistant points on the unit circle. These poles are easy to compute. The application of conformal mappings is justified by the fact that the approximation errors of analytic functions by rational functions are invariant under conformal mapping because conformal mappings map equipotential curves onto equipotential curves.
Consider the problem of approximating an analytic function outside the unit circle by a rational function with poles on the unit circle. If no particular properties of the analytic function are known, then it is natural to allocate the poles equidistantly on the unit circle; see, e.g., Walsh (Chapter 9 in [13]) for a discussion of rational approximation and equipotential curves. Let the unit circle be in the z-plane. We can map the unit circle to the interval $[-1, 1]$ by the Joukowski map

$w = \frac{1}{2}\left(z + \frac{1}{z}\right);$
see, e.g., Henrici (Chapter 5 in [14]). As z traverses the unit circle, w traverses the interval twice. We now map w to the semi-infinite interval with by the Moebius transformation
Then when , and for . When z traverses the top half of the unit circle counterclockwise, starting at and ending at , traverses the interval starting at . Similarly, when z traverses the bottom half of the unit circle clockwise, also traverses the interval starting at .
To allocate ℓ distinct poles on the interval , we first define the ℓ equidistant points
on the upper unit semicircle. The Joukowski map yields the points
on the interval . Finally, using (22) gives the distinct poles
on . We will use these poles in the computed examples of the following section.
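The recipe of this section can be summarized in a few lines of code. The sketch below is in Python; the particular Moebius map (sending $(-1, 1]$ onto $(-\infty, -d]$ for a parameter $d > 0$) and the choice of equidistant angles are illustrative stand-ins, not the paper's exact formulas:

```python
import numpy as np

# Pole allocation by conformal mapping (illustrative formulas):
#  1) ell equidistant points on the upper unit semicircle,
#  2) the Joukowski map w = (z + 1/z)/2, which sends e^{i theta} to cos(theta),
#  3) a Moebius map of (-1, 1] onto the negative real axis, here
#     zeta(w) = -2d / (1 + w), so w = 1 -> -d and w -> -1 gives zeta -> -inf.
def allocate_poles(ell, d=1.0):
    theta = np.pi * (2 * np.arange(1, ell + 1) - 1) / (2 * ell)  # equidistant angles
    w = np.cos(theta)                   # Joukowski images in (-1, 1)
    return -2.0 * d / (1.0 + w)         # distinct poles in (-inf, -d)

poles = allocate_poles(4)
```

No minimization problem needs to be solved; the cost of allocating any number of poles is negligible.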
6. Computed Examples
This section illustrates the application of rational Gauss rules to Stieltjes functions of a symmetric positive definite matrix. We illustrate the pole allocation described in Section 5 and compare it with ad hoc pole allocation. The computations were carried out using MATLAB R2017b on a 64-bit MacBook Pro personal computer; arithmetic was performed with about 15 significant decimal digits. We determine the quadrature error by explicitly evaluating the functionals (1). This limits the size of the matrices A that we can consider.
Example 2.
Consider the approximation of the functional
where is a symmetric positive definite Toeplitz matrix with first row , and . The smallest eigenvalue of A is , and the largest one is . The value of is about . We compute approximations of (24) by rational Gauss rules. The computations require the solution of linear systems of equations with symmetric positive definite Toeplitz matrices $A - \alpha_{j} I$, where the $\alpha_{j}$ are the poles. Ammar and Gragg [11] describe fast algorithms for the solution of systems of equations with this kind of matrix.
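Since a diagonal shift of a Toeplitz matrix is again Toeplitz, each required solve can use a fast Toeplitz solver. A sketch with SciPy's Levinson-type solver (the first column below is an illustrative stand-in, not the matrix of this example; the superfast algorithms of [11] would be faster still):

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# Solving (A - alpha*I) x = b for a symmetric positive definite Toeplitz A:
# the shift only changes the diagonal entry of the defining first column,
# so the shifted matrix stays Toeplitz. Data here is illustrative.
n = 500
col = np.zeros(n)
col[0], col[1] = 2.0, -1.0            # stand-in SPD Toeplitz first column
alpha = -0.5                          # a pole on the negative real axis
col_shift = col.copy()
col_shift[0] -= alpha                 # first column of A - alpha*I

b = np.ones(n)
x = solve_toeplitz(col_shift, b)      # Levinson recursion, O(n^2) work

assert np.allclose(toeplitz(col_shift) @ x, b)
```
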
To calculate the exact value of (24), we evaluate $f(A)v$ by first computing the matrix square root $A^{1/2}$ with the MATLAB function sqrtm and then solving a linear system of equations with the matrix A. The rational Gauss rule is evaluated as
where is determined by first computing the matrix using the MATLAB function sqrtm and then solving a linear system of equations with the matrix .
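The exact-value computation described above amounts to evaluating $A^{-1/2} v$ as $A^{-1}(A^{1/2} v)$. A sketch (Python, with SciPy's sqrtm standing in for MATLAB's; the test matrix and vector are illustrative, not those of this example):

```python
import numpy as np
from scipy.linalg import sqrtm, solve

# Evaluate v^T A^{-1/2} v by forming the matrix square root and then
# solving one linear system with A, as described in the text.
rng = np.random.default_rng(3)
n = 60
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)           # symmetric positive definite
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

S = np.real(sqrtm(A))                 # A^{1/2}; real for SPD input
y = solve(A, S @ v)                   # A^{-1} A^{1/2} v = A^{-1/2} v
exact = v @ y                         # v^T A^{-1/2} v

lam, U = np.linalg.eigh(A)            # cross-check via the spectrum
assert abs(exact - (U.T @ v) ** 2 @ lam ** -0.5) < 1e-8
```
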
The first row of Table 1 displays the error in the approximation of (24) obtained by rational Gauss rules with the somewhat arbitrarily chosen distinct poles
of multiplicity two. On the second row, Table 1 shows the error in the approximation obtained with rational Gauss rules with poles defined by (23) with and . Thus, the two distinct poles used are of multiplicity two. The distinct poles are and . Table 1 shows the latter poles to yield a smaller error than the poles (25).
Table 1.
Example 2: Errors in rational Gauss rules for approximating (24) with A a symmetric positive definite Toeplitz matrix. For the first row, the distinct poles are , each of multiplicity two. The second row shows the error when the poles are allocated with the aid of conformal mapping using (23) with and . Then the distinct poles are .
We now turn to the use of four distinct poles. The first row of Table 2 displays the errors in approximations determined by rational Gauss rules with the arbitrarily chosen distinct poles
of multiplicity one. Row two of Table 2 displays the error of the approximation obtained with four poles of multiplicity one, defined by (23) with and . The latter poles can be seen to give a smaller approximation error than the poles (26).
Table 2.
Example 2: Errors in rational Gauss rules for approximating (24) with A a symmetric positive definite Toeplitz matrix. In the first row, the poles are of multiplicity one. The second row shows the error when the poles are allocated by using (23) with and . The poles are of multiplicity one.
Example 3.
We would like to calculate an approximation of the functional
where is a symmetric positive definite Toeplitz matrix with first row . The vector v is the same as in Example 2. The smallest eigenvalue of A is , and the largest one is . The value of is approximately .
The first row of Table 3 shows the error for the rational Gauss quadrature rule with somewhat arbitrarily chosen poles of multiplicity two and of multiplicity one. The second row displays the error for the rational Gauss rule with poles of multiplicity two and of multiplicity one. The latter poles are determined by (23) with and . We observe that the approach of choosing poles with the aid of conformal mapping provides a more accurate result.
Table 3.
Example 3: Errors in rational Gauss rules for approximating (27) with A a symmetric positive definite Toeplitz matrix. In the first row, the poles are of multiplicity two and of multiplicity one. In the second row, the poles are allocated using (23) with and . The pole is of multiplicity two, and the pole is of multiplicity one.
The first and second rows of Table 4 display the errors in rational Gauss rules with somewhat arbitrarily chosen poles of multiplicity two and with poles of multiplicity two, respectively. The latter poles are defined by (23) with and . It can be seen that the error in the second row is smaller than the error in the first row.
Table 4.
Example 3: Errors in rational Gauss rules for approximating (27) with A a symmetric positive definite Toeplitz matrix. In the first row, the poles are and of multiplicity two. In the second row, the poles are allocated with (23) with and . This gives the distinct poles and of multiplicity two.
Example 4.
In this example, we determine an approximation of
where the matrix A and vector v are the same as in Example 2. The value of is approximately . We note that
is a Stieltjes function.
Table 5 displays the errors in approximations determined by rational Gauss rules. In the first row, we choose four somewhat arbitrary equidistant poles, , each of multiplicity one. In the second row, the poles are of multiplicity one, and the poles in the third row are of multiplicity two. The poles in the second and third rows are allocated using (23). It can be seen that the errors in the second and third rows are of a smaller magnitude than the error in the first row.
Table 5.
Example 4: Errors in rational Gauss rules for computing approximations of (28) when A is a symmetric positive definite Toeplitz matrix. In the first row, we choose four equidistant poles of multiplicity one. The second row displays the approximation error when the poles are allocated by (23) with and ; the poles are of multiplicity one. The poles on the third row are obtained by using (23) with and . Then the distinct poles are of multiplicity two.
7. Conclusions
This paper discusses the evaluation of expressions of the form (1), where A is a symmetric positive definite matrix and f is a Stieltjes function. We approximate f by a rational function with user-specified poles and integrate the rational function by a rational Gauss quadrature rule. The rational function is expressed as a linear combination of rational orthogonal functions, which can be generated with recursion formulas with few terms. The paper focuses on the choice of poles of the rational function. They are allocated on the negative real axis and chosen to be images of equidistant points on the upper unit semicircle under a suitable conformal mapping. The computed examples show this approach of allocating the poles to give a smaller quadrature error than poles chosen in an ad hoc manner.
Author Contributions
All authors contributed equally to this publication. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
No data are required.
Acknowledgments
The authors would like to thank the referees for comments that led to improvements of the presentation.
Conflicts of Interest
There is no conflict of interest.
References
- Frommer, A. Monotone convergence of the Lanczos approximations to matrix functions of Hermitian matrices. Electron. Trans. Numer. Anal. 2009, 35, 118–128.
- Frommer, A.; Schweitzer, M. Error bounds and estimates for Krylov subspace approximations of Stieltjes matrix functions. BIT Numer. Math. 2016, 56, 865–892.
- Henrici, P. Applied and Computational Complex Analysis; Wiley: New York, NY, USA, 1977; Volume 2.
- Massei, S.; Robol, L. Rational Krylov for Stieltjes matrix functions: Convergence and pole selection. BIT Numer. Math. 2021, 61, 237–273.
- Higham, N.J. Functions of Matrices; SIAM: Philadelphia, PA, USA, 2008.
- Alahmadi, J.; Pranić, M.; Reichel, L. Rational Gauss quadrature rules for the approximation of matrix functionals involving Stieltjes functions. Numer. Math. 2022, 151, 443–473.
- Gautschi, W. Orthogonal Polynomials: Computation and Approximation; Oxford University Press: Oxford, UK, 2004.
- Güttel, S.; Knizhnerman, L. A black box rational Arnoldi variant for Cauchy–Stieltjes matrix functions. BIT Numer. Math. 2013, 53, 595–616.
- Golub, G.H.; Meurant, G. Matrices, Moments and Quadrature with Applications; Princeton University Press: Princeton, NJ, USA, 2010.
- Pranić, M.S.; Reichel, L. Recurrence relations for orthogonal rational functions. Numer. Math. 2013, 123, 629–642.
- Ammar, G.S.; Gragg, W.B. Superfast solution of real positive definite Toeplitz systems. SIAM J. Matrix Anal. Appl. 1988, 9, 61–76.
- Pranić, M.S.; Reichel, L. Rational Gauss quadrature. SIAM J. Numer. Anal. 2014, 52, 832–851.
- Walsh, J.L. Interpolation and Approximation by Rational Functions in the Complex Domain, 5th ed.; American Mathematical Society: Providence, RI, USA, 1969.
- Henrici, P. Applied and Computational Complex Analysis; Wiley: New York, NY, USA, 1974; Volume 1.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).