Abstract
This paper considers the computation of approximations of matrix functionals of the form $v^{T} f(A)\, v$, where A is a large symmetric positive definite matrix, v is a vector, and f is a Stieltjes function. The functional is approximated by a rational Gauss quadrature rule with poles on the negative real axis (or part thereof) in the complex plane, and we focus on the allocation of the poles. Specifically, we propose that the poles, when considered as positive point charges, be allocated so that the negative real axis (or part thereof) approximates an equipotential curve. This is easily achieved with the aid of conformal mapping.
MSC:
65D15; 65D32; 65E05
1. Introduction
We are concerned with the development of numerical methods for the approximation of expressions of the form

$v^{T} f(A)\, v,$  (1)

where $A \in \mathbb{R}^{n \times n}$ is a large symmetric positive definite matrix, the vector $v \in \mathbb{R}^{n}$ is of unit Euclidean norm, f is a Stieltjes function, and the superscript T denotes transposition. Thus, f can be represented as

$f(z) = \int_{0}^{\infty} \frac{d\mu(t)}{t + z},$  (2)

where $\mu$ is a nonnegative distribution function defined on the interval $[0, \infty)$ such that the integral (2) is defined; see, e.g., [1,2,3,4] for discussions and illustrations of Stieltjes functions. Examples of Stieltjes functions include $f(z) = z^{-1/2}$ and $f(z) = \log(1 + z)/z$.
When the matrix A is large, straightforward evaluation of (1) by first computing $f(A)$ may be prohibitively expensive; see Higham [5] for discussions and analyses of many methods for the evaluation of matrix functions. We are, therefore, interested in exploring the possibility of reducing A to a fairly small matrix by a rational Krylov subspace method and approximating (1) by evaluating the corresponding functional for the reduced matrix. This results in a rational approximant of f, whose poles are allocated by the user on the negative real axis. The rational approximant can be chosen to converge to zero at infinity, just as f does. This paper focuses on the choice of the poles. In particular, we propose to allocate the poles so that, when considered as positive point charges, they roughly make the negative real axis, or part thereof, an equipotential curve. This can easily be achieved with the aid of conformal mapping.
As in [6], we express the rational approximant as a linear combination of rational orthogonal functions with specified poles and use the fact that these functions satisfy a recurrence relation with only a few terms. We use a rational Gauss quadrature rule to determine an approximation of an integral of this rational approximant. The novelty of the present paper is that it addresses the allocation of the poles of the rational approximant; this issue was not considered in [6]. A nice introduction to rational Gauss quadrature is provided by Gautschi (Section 3.1.4 in [7]).
The choice of poles for rational approximants of Stieltjes functions has previously been discussed by Güttel and Knizhnerman [8], who propose a greedy algorithm, and by Massei and Robol [4], who express the approximation problem as a Zolotarev minimization problem. We will allocate the poles with the aid of conformal mapping. This approach is based on the observation that the approximation error is invariant under conformal mapping. Specifically, we allocate the poles equidistantly on the unit circle and then map the circle conformally to the negative real axis or a subset thereof. A benefit of this approach is that it is very easy to implement; in particular, it does not require the solution of a minimization problem.
This paper is organized as follows. Section 2 expresses (1) as an integral of the Stieltjes function over the positive real axis with respect to a measure that is determined by both the matrix A and vector v. This kind of transformation is described, e.g., by Golub and Meurant (Chapter 7 in [9]). Properties of orthogonal rational functions with real poles are reviewed in Section 3, and rational Gauss quadrature is discussed in Section 4. Our approach to choosing the poles is described in Section 5. A few computed examples are presented in Section 6, and concluding remarks can be found in Section 7.
2. Integration of Stieltjes Functions
We derive an integral that will be approximated by a rational Gauss quadrature rule. A similar derivation can be found in [6]. Our derivation uses the spectral factorization

$A = U \Lambda U^{T}, \qquad \Lambda = \operatorname{diag}[\lambda_{1}, \lambda_{2}, \ldots, \lambda_{n}],$  (3)

where the matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal and the eigenvalues of A are ordered according to $0 < \lambda_{1} \le \lambda_{2} \le \cdots \le \lambda_{n}$. We remark that the spectral factorization (3) is helpful for deriving our approximation method. However, the application of the quadrature rules of this paper does not require the computation of this factorization.
We introduce the vector $\widehat{v} = U^{T} v = [\widehat{v}_{1}, \ldots, \widehat{v}_{n}]^{T}$. Using the factorization (3), the expression (1) with f defined by (2) can be written as

$v^{T} f(A)\, v = \widehat{v}^{\,T} f(\Lambda)\, \widehat{v} = \sum_{i=1}^{n} f(\lambda_{i})\, \widehat{v}_{i}^{\,2} = \int f(\lambda)\, d\omega(\lambda),$  (4)

where the nonnegative measure $d\omega$ has support at the eigenvalues of A. The distribution function $\omega$ can be chosen to be a nondecreasing piecewise constant function with jumps of size $\widehat{v}_{i}^{\,2}$ at the eigenvalues of A. We will approximate the integral (4) by a rational Gauss quadrature rule.
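The identity (4) is easy to check numerically. The following sketch (in Python rather than the paper's MATLAB; the test matrix, vector, and the choice $f(z) = z^{-1/2}$ are illustrative, not taken from the paper) confirms that the quadratic form equals the spectral sum:

```python
import numpy as np

# Illustrative check of (4): v^T f(A) v = sum_i f(lambda_i) * vhat_i^2,
# where vhat = U^T v and A = U diag(lambda) U^T.
rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)          # symmetric positive definite test matrix
v = rng.standard_normal(n)
v /= np.linalg.norm(v)               # unit Euclidean norm, as assumed in (1)

f = lambda z: z ** -0.5              # a Stieltjes function

lam, U = np.linalg.eigh(A)           # spectral factorization (3)
vhat = U.T @ v
integral = vhat ** 2 @ f(lam)        # the integral (4) as a finite sum

direct = v @ (U @ np.diag(f(lam)) @ U.T @ v)   # v^T f(A) v directly
assert abs(integral - direct) < 1e-12
```
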
3. Recursion Relations for Orthogonal Rational Functions
We first review results in [10] on recursion relations for certain orthogonal rational functions. In this reference, functions with real poles and complex conjugate poles are allowed. Here we focus on rational functions with real poles only. The number of terms in the recursion relations depends on the number of distinct poles of the rational functions as well as on the ordering of certain elementary rational basis functions to be defined below.
We introduce linear spaces of rational functions with finite distinct real poles of multiplicity ,
In our application, the poles will be allocated on the negative real axis. Let
and define the -dimensional linear space
Here is chosen so that . The space (7) then contains linear functions. Let
denote an elementary basis for the space , i.e., and each basis function , for , is one of the functions
for some positive integers i and j. We use the notation to indicate that the basis function comes before . We refer to the ordering of the basis functions (8) as natural if and the remaining functions , , satisfy:
- for all integers ,
- for all integers and every real pole .
In particular, if for some , then some preceding elementary basis function with equals . An analogous statement also holds for negative powers for .
It is shown in [10] that a Stieltjes procedure for orthogonalizing elementary basis functions with respect to some inner product requires only short recursion relations when the basis functions are in natural order. A recursion relation is said to be short if the number of terms in the relation can be bounded independently of the number of orthogonal functions generated. A well-known special case is the situation in which all elementary basis functions are monomials. Then the Stieltjes procedure yields a family of orthogonal polynomials, with the jth polynomial of degree j for all j, and the polynomials satisfy a three-term recurrence relation.
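For the all-monomial special case, the Stieltjes procedure is easy to state concretely. The sketch below (Python; the discrete measure is hypothetical test data) orthogonalizes $1, x, x^2, \ldots$ against a discrete measure and records the recurrence coefficients, which form a symmetric tridiagonal (Jacobi) matrix:

```python
import numpy as np

# Sketch of the Stieltjes procedure in the all-monomial case: the orthonormal
# polynomials satisfy the three-term recurrence
#   x p_j = b_j p_{j+1} + a_j p_j + b_{j-1} p_{j-1}.
# The nodes/weights of the discrete measure are illustrative, not from the paper.
nodes = np.linspace(1.0, 10.0, 40)        # support of the measure
weights = np.full(nodes.size, 1.0 / 40)   # jumps of the distribution function

def stieltjes(nodes, weights, m):
    """Recurrence coefficients of the first m orthonormal polynomials."""
    p_prev = np.zeros_like(nodes)
    p = np.ones_like(nodes) / np.sqrt(weights.sum())
    alpha, beta = [], []
    for _ in range(m):
        a = weights @ (nodes * p * p)                  # <x p, p>
        q = nodes * p - a * p - (beta[-1] if beta else 0.0) * p_prev
        b = np.sqrt(weights @ (q * q))                 # norm of the residual
        alpha.append(a)
        beta.append(b)
        p_prev, p = p, q / b
    return np.array(alpha), np.array(beta[:-1])

a, b = stieltjes(nodes, weights, 5)
J = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)  # Jacobi matrix: short recurrence
```

The eigenvalues of J are the Gauss quadrature nodes for this measure and lie in the convex hull of its support.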
Similarly, when orthogonalizing elementary basis functions in the natural order with respect to a bilinear form , where is a linear functional on , the orthogonal rational basis functions obtained satisfy short recurrence relations, provided that for all j. We will orthogonalize the elementary basis functions with respect to the bilinear form
This bilinear form is an inner product for all elementary basis functions of low enough order.
When the elementary basis functions are of the form (9), the number of terms required in the recurrence relations depends on how often powers of the same function appear in the sequence (8). We review two types of recursion relations from [10] that are required below. First, consider the recursion relation
where is the largest integer smaller than r such that is a monomial if there is such a monomial (otherwise ), and is the smallest integer larger than r such that is a monomial. To introduce a real pole α we need the recursion relation
where is the largest integer smaller than r such that is a rational function with a pole at , provided that there is such a rational function (otherwise ), and is the smallest integer larger than r such that is a rational function with a pole at .
We introduce the vector of orthonormal rational functions,
Assume that is a monomial and that is the largest integer smaller than m such that is a monomial. Then the recursion formulas (12) can be written in the form
where denotes the jth axis vector. The matrix in (13) has the following block-diagonal structure: the matrix has square blocks along the diagonal, such that any two consecutive blocks overlap in one diagonal element. The jth block of is of dimension , where is the number of rational functions between consecutive monomials and for some j. In detail, the jth block of is the submatrix (using MATLAB notation) with the entries for , where and . Further, . The nonvanishing entries of are recursion coefficients for the orthogonal rational functions . The recursion coefficients satisfy
This shows that the matrix depends only on the first m elementary basis functions , . In particular, is independent of . Throughout this paper, we will assume that is a monomial because then the zeros of are eigenvalues of .
The structure of the matrix is illustrated by the following example.
Example 1.
Consider the elementary basis
This elementary basis, together with the function , where k is defined by (6) and , satisfies the requirements of natural ordering. The basis (14) spans the space . The matrix determined by this basis has k 3 × 3 blocks along the diagonal,
Here we mark matrix entries that may be nonvanishing by ∗. If we place one more elementary rational basis function between the two consecutive monomials and in the elementary basis, then the size of the jth block of increases by one. Analogously, if we remove one elementary rational basis function between two consecutive monomials, then the size of the corresponding block decreases by one.
Assume that the functions , , are orthonormal with respect to the bilinear form (10). Then the vectors
form an orthonormal basis for the rational Krylov subspace
for , with respect to the bilinear form (10). Indeed,
We note that the vectors and the orthogonal rational functions satisfy the same recursion relations. The vectors, therefore, can be determined by the rational Lanczos process (Algorithm 1). This process is analogous to the Stieltjes-type procedure (Algorithm 3.1 in [10]) for computing an orthonormal basis for the space . The rational Lanczos process is based on the recursion relations (12). The norm in Algorithm 1 is the Euclidean vector norm.
The execution of Algorithm 1 requires the solution of linear systems of equations with matrices of the form $A - \alpha_{j} I$, where the $\alpha_{j}$ are suitable real scalars determined by the poles. We assume, in this paper, that the matrix A has a structure that allows fairly efficient solution of these systems by a direct or an iterative method. This is, for instance, the case when A is banded with a small bandwidth or when A is a symmetric positive definite Toeplitz matrix; see, e.g., ref. [11] for fast solvers for the latter kind of systems of equations. We note here that since A is symmetric positive definite and the poles are allocated on the negative real axis, the matrices $A - \alpha_{j} I$ also are symmetric positive definite. Moreover, when A is obtained by discretizing an elliptic partial differential equation, it may be possible to solve the required linear systems rapidly by a multigrid method.
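When the systems are solved by factorization, the factorization can be computed once per distinct pole and reused for every subsequent solve with that shift. A minimal sketch (Python with SciPy; the matrix and pole values are illustrative):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Each distinct negative pole alpha needs solves with the SPD matrix
# A - alpha*I; factor once per pole, then every later solve with that shift
# costs only two triangular substitutions. Test data below is illustrative.
rng = np.random.default_rng(1)
n = 200
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)           # symmetric positive definite
poles = [-0.5, -2.0, -8.0]            # distinct poles on the negative real axis

factors = {a: cho_factor(A - a * np.eye(n)) for a in poles}  # one Cholesky each

w = rng.standard_normal(n)
for a in poles:
    x = cho_solve(factors[a], w)      # reuse the stored factorization
    assert np.allclose((A - a * np.eye(n)) @ x, w)
```

This is why fewer distinct poles can be cheaper overall even if more quadrature steps are needed: the factorization cost scales with the number of distinct shifts, not with the pole multiplicities.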
| Algorithm 1 The rational Lanczos process |
| 1: Input: , a sequence of matrix-valued elementary basis functions , and functions for evaluating matrix-vector products with A and for solving linear systems of equations with matrices of the form . Thus, we do not explicitly form the elementary basis functions and . The sequence of elementary basis functions implicitly defines the integers j and ℓ used below. |
| 2: Output: Orthonormal basis . |
| 3: Initialization: |
| 4: while do |
| 5: if for some then |
| 6: ; |
| 7: for do |
| 8: ; ; |
| 9: end for |
| 10: ; ; |
| 11: |
| 12: else if for some then |
| 13: ; |
| 14: for do |
| 15: ; ; |
| 16: end for |
| 17: ; ; |
| 18: |
| 19: end if |
| 20: end while |
In Algorithm 1, we assume that the basis (8) satisfies conditions 1 and 2 of natural ordering. The value of is such that is a basis function with the same poles as , with at least one of the poles of smaller order; if no such basis function exists, then . When carrying out m steps of Algorithm 1, we obtain an orthonormal basis for the rational Krylov subspace .
The matrix
and the symmetric matrix
satisfy the relation
where the vector is such that . The subspace , , is spanned by the first i columns of the matrix . In particular, the rational Krylov subspaces are nested, i.e., . The orthogonal projection of A onto the rational Krylov subspace (15) is given by
We will assume that m is small enough so that the decomposition (16) with the stated properties exists. This is the generic situation. Note that the determination of the matrix only requires that steps of Algorithm 1 be carried out.
4. Rational Gauss Quadrature
This section describes properties of rational Gauss quadrature rules for the approximation of functionals (1) when f is a Stieltjes function (2) and A is a symmetric positive definite matrix. Then
It is shown in [12] that
is a rational Gauss quadrature rule for the approximation of (17). The rational Gauss quadrature rule is characterized by the property
where
denotes a -dimensional linear subspace spanned by the chosen elementary rational functions.
In view of (4), the rational Gauss rule (18) can be expressed as
where we recall that the measure $d\mu$ in (2) defines the function f; the rational Gauss rule is with respect to the measure $d\omega$ in (4).
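In the special case when all poles are placed at infinity, the rule reduces to the standard Gauss rule generated by the polynomial Lanczos process, which is convenient for a quick sanity check. A sketch under that assumption (Python; Golub–Meurant-style evaluation, cf. [9]; test matrix and $f(z) = z^{-1/2}$ are illustrative):

```python
import numpy as np

# Polynomial special case of the Gauss rule: m Lanczos steps give a symmetric
# tridiagonal T_m, and v^T f(A) v (with ||v|| = 1) is approximated by
# e_1^T f(T_m) e_1, evaluated through the eigendecomposition of T_m.
rng = np.random.default_rng(2)
n, m = 300, 12
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

def lanczos(A, v, m):
    a, b = np.zeros(m), np.zeros(m - 1)
    V = np.zeros((len(v), m))
    V[:, 0] = v
    for j in range(m):
        w = A @ V[:, j] - (b[j - 1] * V[:, j - 1] if j > 0 else 0.0)
        a[j] = V[:, j] @ w
        w -= a[j] * V[:, j]
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)   # reorthogonalization
        if j < m - 1:
            b[j] = np.linalg.norm(w)
            V[:, j + 1] = w / b[j]
    return np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

f = lambda z: z ** -0.5
T = lanczos(A, v, m)
theta, Q = np.linalg.eigh(T)
gauss = Q[0] ** 2 @ f(theta)          # e_1^T f(T_m) e_1

lam, U = np.linalg.eigh(A)
exact = (U.T @ v) ** 2 @ f(lam)       # v^T f(A) v via (4)
assert abs(gauss - exact) / exact < 1e-6
```

The rational rule (19) replaces the tridiagonal matrix by the block-structured projected matrix of (13), which incorporates the finite poles.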
The orthonormal rational function has m distinct zeros that lie in the convex hull of the support of the measure , and they are the eigenvalues of in (13); see (Theorem 2.5 in [12]). Recall that we assume to be a monomial. Thus the rational function in (13) can be written as
where
is a polynomial of degree k with k defined by (6). The scalars are the poles of the rational approximant; see the beginning of Section 3. Moreover, is a scalar. A remainder term for the rule (19) is derived in [6].
5. Allocation of Poles
The formulas in the previous section are valid independently of the location of the distinct negative real poles and their multiplicities. However, their location and multiplicities affect the size of the quadrature error
Example 5.3 reported in [6] indicates that when the rational approximant of the Stieltjes function f has more distinct poles, the rational Gauss rule (19) may give a smaller quadrature error (21) than when a rational approximant of the same order has fewer distinct poles. However, no discussion of the allocation of the poles is provided in [6]. Here we propose to allocate the poles so that, when considered as positive point charges, they approximately make the negative real axis an equipotential curve. An allocation of poles that satisfies this property is very easy to determine with the aid of conformal mapping. Computed examples in Section 6 illustrate the performance of this pole allocation technique.
We remark that using many distinct poles may not be advantageous because each distinct pole requires the solution of a linear system of equations of the form
for some vector w; see the discussion in Section 3. When the systems of equations are solved by factorization of the system matrix, each distinct pole requires the computation of a factorization. It follows that the computing time required to achieve a quadrature error (21) that is smaller than a specified tolerance may be smaller when using a rational approximant with fewer distinct poles than when using a rational approximant of the same order with more distinct poles. The optimal choice of the number of distinct poles depends on the structure of A, i.e., on how demanding the computation of the factorization of the matrix is. This section considers the allocation of ℓ distinct poles; cf. (5).
We set the poles to be images under a conformal mapping of equidistant points on the unit circle. These poles are easy to compute. The application of conformal mappings is justified by the fact that the approximation errors of analytic functions by rational functions are invariant under conformal mapping because conformal mappings map equipotential curves onto equipotential curves.
Consider the problem of approximating an analytic function outside the unit circle by a rational function with poles on the unit circle. If no particular properties of the analytic function are known, then it is natural to allocate the poles equidistantly on the unit circle; see, e.g., Walsh (Chapter 9 in [13]) for a discussion of rational approximation and equipotential curves. Let the unit circle be in the z-plane. We can map the unit circle to the interval $[-1, 1]$ by the Joukowski map

$w = \frac{1}{2}\left(z + \frac{1}{z}\right);$
see, e.g., Henrici (Chapter 5 in [14]). As z traverses the unit circle, w traverses the interval twice. We now map w to the semi-infinite interval with by the Moebius transformation
Then when , and for . When z traverses the top half of the unit circle counterclockwise, starting at and ending at , traverses the interval starting at . Similarly, when z traverses the bottom half of the unit circle clockwise, also traverses the interval starting at .
To allocate ℓ distinct poles on the interval , we first define the ℓ equidistant points
on the upper unit semicircle. The Joukowski map yields the points
on the interval . Finally, using (22) gives the distinct poles
on . We will use these poles in the computed examples of the following section.
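The recipe of this section can be summarized in a few lines of code. The sketch below is in Python; the particular Moebius map (sending $(-1, 1]$ onto $(-\infty, -d]$ for a parameter $d > 0$) and the choice of equidistant angles are illustrative stand-ins, not the paper's exact formulas:

```python
import numpy as np

# Pole allocation by conformal mapping (illustrative formulas):
#  1) ell equidistant points on the upper unit semicircle,
#  2) the Joukowski map w = (z + 1/z)/2, which sends e^{i theta} to cos(theta),
#  3) a Moebius map of (-1, 1] onto the negative real axis, here
#     zeta(w) = -2d / (1 + w), so w = 1 -> -d and w -> -1 gives zeta -> -inf.
def allocate_poles(ell, d=1.0):
    theta = np.pi * (2 * np.arange(1, ell + 1) - 1) / (2 * ell)  # equidistant angles
    w = np.cos(theta)                   # Joukowski images in (-1, 1)
    return -2.0 * d / (1.0 + w)         # distinct poles in (-inf, -d)

poles = allocate_poles(4)
```

No minimization problem needs to be solved; the cost of allocating any number of poles is negligible.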
6. Computed Examples
This section illustrates the application of rational Gauss rules to Stieltjes functions of a symmetric positive definite matrix. We illustrate the pole allocation described in Section 5 and compare it with ad hoc pole allocation. The computations were carried out using MATLAB R2017b on a 64-bit MacBook Pro personal computer; arithmetic was performed with about 15 significant decimal digits. We determine the quadrature error by explicitly evaluating the functionals (1). This limits the size of the matrices A that we can consider.
Example 2.
Consider the approximation of the functional
where is a symmetric positive definite Toeplitz matrix with first row , and . The smallest eigenvalue of A is , and the largest one is . The value of is about . We compute approximations of (24) by rational Gauss rules. The computations require the solution of linear systems of equations with symmetric positive definite Toeplitz matrices $A - \alpha_{j} I$, where the $\alpha_{j}$ are the poles. Ammar and Gragg [11] describe fast algorithms for the solution of systems of equations with this kind of matrix.
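Since a diagonal shift of a Toeplitz matrix is again Toeplitz, each required solve can use a fast Toeplitz solver. A sketch with SciPy's Levinson-type solver (the first column below is an illustrative stand-in, not the matrix of this example; the superfast algorithms of [11] would be faster still):

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# Solving (A - alpha*I) x = b for a symmetric positive definite Toeplitz A:
# the shift only changes the diagonal entry of the defining first column,
# so the shifted matrix stays Toeplitz. Data here is illustrative.
n = 500
col = np.zeros(n)
col[0], col[1] = 2.0, -1.0            # stand-in SPD Toeplitz first column
alpha = -0.5                          # a pole on the negative real axis
col_shift = col.copy()
col_shift[0] -= alpha                 # first column of A - alpha*I

b = np.ones(n)
x = solve_toeplitz(col_shift, b)      # Levinson recursion, O(n^2) work

assert np.allclose(toeplitz(col_shift) @ x, b)
```
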
To calculate the exact value of (24), we evaluate $f(A)v$ by first computing the matrix square root $A^{1/2}$ with the MATLAB function sqrtm and then solving a linear system of equations with the matrix A. The rational Gauss rule is evaluated as
where is determined by first computing the matrix using the MATLAB function sqrtm and then solving a linear system of equations with the matrix .
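The exact-value computation described above amounts to evaluating $A^{-1/2} v$ as $A^{-1}(A^{1/2} v)$. A sketch (Python, with SciPy's sqrtm standing in for MATLAB's; the test matrix and vector are illustrative, not those of this example):

```python
import numpy as np
from scipy.linalg import sqrtm, solve

# Evaluate v^T A^{-1/2} v by forming the matrix square root and then
# solving one linear system with A, as described in the text.
rng = np.random.default_rng(3)
n = 60
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)           # symmetric positive definite
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

S = np.real(sqrtm(A))                 # A^{1/2}; real for SPD input
y = solve(A, S @ v)                   # A^{-1} A^{1/2} v = A^{-1/2} v
exact = v @ y                         # v^T A^{-1/2} v

lam, U = np.linalg.eigh(A)            # cross-check via the spectrum
assert abs(exact - (U.T @ v) ** 2 @ lam ** -0.5) < 1e-8
```
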
The first row of Table 1 displays the error in the approximation of (24) obtained by rational Gauss rules with the somewhat arbitrarily chosen distinct poles
of multiplicity two. On the second row, Table 1 shows the error in the approximation obtained with rational Gauss rules with poles defined by (23) with and . Thus, the two distinct poles used are of multiplicity two. The distinct poles are and . Table 1 shows the latter poles to yield a smaller error than the poles (25).
Table 1.
Example 2: Errors in rational Gauss rules for approximating (24) with A a symmetric positive definite Toeplitz matrix. For the first row, the distinct poles are , each of multiplicity two. The second row shows the error when the poles are allocated with the aid of conformal mapping using (23) with and . Then the distinct poles are .
We now turn to the use of four distinct poles. The first row of Table 2 displays the errors in approximations determined by rational Gauss rules with the arbitrarily chosen distinct poles
of multiplicity one. Row two of Table 2 displays the error of the approximation obtained with four poles of multiplicity one, defined by (23) with and . The latter poles can be seen to give a smaller approximation error than the poles (26).
Table 2.
Example 2: Errors in rational Gauss rules for approximating (24) with A a symmetric positive definite Toeplitz matrix. In the first row, the poles are of multiplicity one. The second row shows the error when the poles are allocated by using (23) with and . The poles are of multiplicity one.
Example 3.
We would like to calculate an approximation of the functional
where is a symmetric positive definite Toeplitz matrix with first row . The vector v is the same as in Example 2. The smallest eigenvalue of A is , and the largest one is . The value of is approximately .
The first row of Table 3 shows the error for the rational Gauss quadrature rule with somewhat arbitrarily chosen poles of multiplicity two and of multiplicity one. The second row displays the error for the rational Gauss rule with poles of multiplicity two and of multiplicity one. The latter poles are determined by (23) with and . We observe that the approach of choosing poles with the aid of conformal mapping provides a more accurate result.
Table 3.
Example 3: Errors in rational Gauss rules for approximating (27) with A a symmetric positive definite Toeplitz matrix. In the first row, the poles are of multiplicity two and of multiplicity one. In the second row, the poles are allocated using (23) with and . The pole is of multiplicity two, and the pole is of multiplicity one.
The first and second rows of Table 4 display the errors in rational Gauss rules with somewhat arbitrarily chosen poles of multiplicity two and with poles of multiplicity two, respectively. The latter poles are defined by (23) with and . It can be seen that the error in the second row is smaller than the error in the first row.
Table 4.
Example 3: Errors in rational Gauss rules for approximating (27) with A a symmetric positive definite Toeplitz matrix. In the first row, the poles are and of multiplicity two. In the second row, the poles are allocated with (23) with and . This gives the distinct poles and of multiplicity two.
Example 4.
In this example, we determine an approximation of
where the matrix A and vector v are the same as in Example 2. The value of is approximately . We note that
is a Stieltjes function.
Table 5 displays the errors in approximations determined by rational Gauss rules. In the first row, we choose four somewhat arbitrary equidistant poles, , each of multiplicity one. In the second row, the poles are of multiplicity one, and the poles in the third row are of multiplicity two. The poles in the second and third rows are allocated using (23). It can be seen that the errors in the second and third rows are of a smaller magnitude than the error in the first row.
Table 5.
Example 4: Errors in rational Gauss rules for computing approximations of (28) when A is a symmetric positive definite Toeplitz matrix. In the first row, we choose four equidistant poles of multiplicity one. The second row displays the approximation error when the poles are allocated by (23) with and ; the poles are of multiplicity one. The poles on the third row are obtained by using (23) with and . Then the distinct poles are of multiplicity two.
7. Conclusions
This paper discusses the evaluation of expressions of the form (1), where A is a symmetric positive definite matrix and f is a Stieltjes function. We approximate f by a rational function with user-specified poles and integrate the rational function by a rational Gauss quadrature rule. The rational function is expressed as a linear combination of rational orthogonal functions, which can be generated with recursion formulas with few terms. The paper focuses on the choice of poles of the rational function. They are allocated on the negative real axis and chosen to be images of equidistant points on the upper unit semicircle under a suitable conformal mapping. The computed examples show this approach of allocating the poles to give a smaller quadrature error than poles chosen in an ad hoc manner.
Author Contributions
All authors contributed equally to this publication. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
No data are required.
Acknowledgments
The authors would like to thank the referees for comments that led to improvements of the presentation.
Conflicts of Interest
There is no conflict of interest.
References
- Frommer, A. Monotone convergence of the Lanczos approximations to matrix functions of Hermitian matrices. Electron. Trans. Numer. Anal. 2009, 35, 118–128.
- Frommer, A.; Schweitzer, M. Error bounds and estimates for Krylov subspace approximations of Stieltjes matrix functions. BIT Numer. Math. 2016, 56, 865–892.
- Henrici, P. Applied and Computational Complex Analysis; Wiley: New York, NY, USA, 1977; Volume 2.
- Massei, S.; Robol, L. Rational Krylov for Stieltjes matrix functions: Convergence and pole selection. BIT Numer. Math. 2021, 61, 237–273.
- Higham, N.J. Functions of Matrices; SIAM: Philadelphia, PA, USA, 2008.
- Alahmadi, J.; Pranić, M.; Reichel, L. Rational Gauss quadrature rules for the approximation of matrix functionals involving Stieltjes functions. Numer. Math. 2022, 151, 443–473.
- Gautschi, W. Orthogonal Polynomials: Computation and Approximation; Oxford University Press: Oxford, UK, 2004.
- Güttel, S.; Knizhnerman, L. A black box rational Arnoldi variant for Cauchy–Stieltjes matrix functions. BIT Numer. Math. 2013, 53, 595–616.
- Golub, G.H.; Meurant, G. Matrices, Moments and Quadrature with Applications; Princeton University Press: Princeton, NJ, USA, 2010.
- Pranić, M.S.; Reichel, L. Recurrence relations for orthogonal rational functions. Numer. Math. 2013, 123, 629–642.
- Ammar, G.S.; Gragg, W.B. Superfast solution of real positive definite Toeplitz systems. SIAM J. Matrix Anal. Appl. 1988, 9, 61–76.
- Pranić, M.S.; Reichel, L. Rational Gauss quadrature. SIAM J. Numer. Anal. 2014, 52, 832–851.
- Walsh, J.L. Interpolation and Approximation by Rational Functions in the Complex Domain, 5th ed.; American Mathematical Society: Providence, RI, USA, 1969.
- Henrici, P. Applied and Computational Complex Analysis; Wiley: New York, NY, USA, 1974; Volume 1.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).