On Some Extended Block Krylov Based Methods for Large Scale Nonsymmetric Stein Matrix Equations

Abdeslem Hafid Bentbib; Khalide Jbilou; EL Mostafa Sadek

doi:10.3390/math5020021

,

and

¹

Laboratory LAMAI, University of Cadi Ayyad, Marrakesh 40000, Morocco

²

LMPA, 50 rue F. Buisson, ULCO Calais, Calais 62228 , France

³

ENSA d’EL Jadida, University Chouaib Doukkali, EL Jadida 24002, Morocco

^*

Author to whom correspondence should be addressed.

Mathematics2017, 5(2), 21;https://doi.org/10.3390/math5020021

This article belongs to the Special Issue Numerical Linear Algebra with Applications

Version Notes

Order Reprints

Abstract

In the present paper, we consider the large scale Stein matrix equation with a low-rank constant term

A X B - X + E F^{T} = 0

. These matrix equations appear in many applications in discrete-time control problems, filtering and image restoration and others. The proposed methods are based on projection onto the extended block Krylov subspace with a Galerkin approach (GA) or with the minimization of the norm of the residual. We give some results on the residual and error norms and report some numerical experiments.

Keywords:

extended block Krylov subspaces; low-rank approximation; Stein matrix equation; Galerkin approach (GA); minimal residual (MR) methods

1. Introduction

In this paper, we are interested in the numerical solution of large scale nonsymmetric Stein matrix equations of the form:

A X B - X + E F^{T} = 0

(1)

where A and B are real, sparse and square matrices of size

n \times n

and

s \times s,

respectively, and E and F are matrices of size

n \times r

and

s \times r

, respectively.

Stein matrix equations play an important role in many problems in control and filtering theory for discrete-time large-scale dynamical systems, in each step of Newton’s method for discrete-time algebraic Riccati equations, model reduction problems, image restoration techniques and other problems [1,2,3,4,5,6,7,8,9,10].

Direct methods for solving the matrix Equation (1), such as those proposed by Bartels–Stewart [11] and the Hessenberg–Schur [12] algorithms, are attractive if the matrices are of small size. For a general overview of numerical methods for solving the Stein matrix equation [1,2,13].

The Stein matrix Equation (1) can be formulated as an

n s \times n s

large linear system using the Kronecker formulation:

(B^{T} \otimes A - I_{s} \otimes I_{n}) v e c (X) = - v e c (E F^{T})

(2)

where

v e c (X)

is the vector obtained by stacking all the columns of the matrix X,

I_{n}

is the n-by-n identity matrix, and the Kronecker product of two matrices A and B is defined by

A \otimes B = [a_{i j} B],

where

A = [a_{i j}]

. This product satisfies the properties

(A \otimes B) (C \otimes D) = (A C \otimes B D)

,

{(A \otimes B)}^{T} = A^{T} \otimes B^{T}

and

v e c (A X B) = (B^{T} \otimes A) v e c (X)

. Then, the matrix Equation (1) has a unique solution if and only if

λ μ \neq 1

for all

λ \in σ (A)

and

μ \in σ (B)

, where

σ (A)

denotes the spectrum of the matrix A. Throughout the paper, we assume that this condition is satisfied. Moreover, if both A and B are Schur stable, i.e.,

σ (A)

and

σ (B)

lie in the open unit disc, and then the solution of Equation (1) can be expressed as the following infinite matrix series:

X = \sum_{i = 0}^{\infty} A^{i} E F^{T} B^{i}

To solve large linear matrix equations, several Krylov subspace projection methods have been proposed (see, e.g., [1,13,14,15,16,17,18,19,20,21,22,23,24] and the references therein). The main idea developed in these methods is to use a block Krylov subspace or an extended block Krylov subspace and then project the original large matrix equation onto these Krylov subspaces using a Galerkin condition or a minimization property of the obtained residual. Hence, we will be interested in these two procedures to get approximate solutions to the solution of the Stein matrix Equation (1). The rest of the paper is organized as follows. In the next section, we recall the extended block Krylov subspace and the extended block Arnoldi (EBA) algorithm with some properties. In Section 3, we will apply the Galerkin approach (GA) to Stein matrix equations by using the extended Krylov subspaces. In Section 4, we define the minimal residual (MR) method for Stein matrix equations by using the extended Krylov subspaces. We finally present some numerical experiments in Section 5.

2. The Extended Block Krylov Subspace Algorithm

In this section, we recall the EBA algorithm applied to

(A, V),

where

V \in R^{n \times r}

. The block Krylov subspace associated with

(A, V)

is defined as:

K_{m} (A, V) = R a n g e ({V, A V, A^{2} V, \dots, A^{m - 1} V})

The extended block Krylov subspace associated with the pair

(A, V)

is given as:

\begin{matrix} K_{m}^{e} (A, V) & = & R a n g e {V, A^{- 1} V, A V, A^{- 2} V, A^{2} V, \dots, A^{m - 1} V, A^{- m} V}) \\ = & K_{m} (A, V) + K_{m} (A^{- 1}, A^{- 1} V) \end{matrix}

The EBA Algorithm 1 is defined as follows [15,16,18,23]:

Algorithm 1. The Extended Block Arnoldi (EBA) Algorithm

(1)

Inputs: A an

n \times n

matrix, V an

n \times r

matrix and m an integer

(2)

Compute the QR decomposition of

[V, A^{- 1} V]

, i.e.,

[V, A^{- 1} V] = V_{1} Λ

(3)

Set

V_{0} = []

(4)

for

j = 1, 2, \dots, m

(a)

Set

V_{j}^{(1)}

: first r columns of

V_{j}

;

V_{j}^{(2)}

: second r columns of

V_{j}

(b)

V_{j} = [V_{j - 1}, V_{j}]

;

{\hat{V}}_{j + 1} = [A V_{j}^{(1)}, A^{- 1} V_{j}^{(2)}]

(c)

Orthogonalize

{\hat{V}}_{j + 1}

w.r. to

V_{j}

to get

V_{j + 1}

, i.e.

∗: for $i = 1, 2, \dots, j$
∗: $H_{i, j} = V_{i}^{T} {\hat{V}}_{j + 1};$
∗: ${\hat{V}}_{j + 1} = {\hat{V}}_{j + 1} - V_{i} H_{i, j}$
∗: End for

(d)

Compute the

Q R

decomposition of

{\hat{V}}_{j + 1}

, i.e.,

{\hat{V}}_{j + 1} = V_{j + 1} H_{j + 1, j}

(5)

End For

This algorithm allows us to construct an orthonormal matrix

V_{m} = [V_{1}, V_{2}, \dots, V_{m}]

that is a basis of the block extended Krylov subspace

K_{m}^{e} (A, V)

. The restriction of the matrix A to the block extended Krylov subspace

K_{m}^{e} (A, V)

is given by

T_{m} = V_{m}^{T} A V_{m}

.

Let

{\bar{T}}_{m} = V_{m + 1}^{T} A V_{m}

. Then, we have the following relations [25]:

\begin{array}{l} A V_{m} & = & V_{m + 1} {\bar{T}}_{m} \\ = & V_{m} T_{m} + V_{m + 1} T_{m + 1, m} E_{m}^{T} \end{array}

where

E_{m} = {[0_{2 r \times 2 (m - 1) r}, I_{2 r}]}^{T}

is the matrix of the last

2 r

columns of the identity matrix

I_{2 m r}

[23,25]. In the next section, we will define the GA for solving Stein matrix equations.

3. Galerkin-Based Methods

In this section, we will apply the Galerkin projection method to obtain low-rank approximate solutions of the nonsymmetric Stein matrix Equation (1). This approach has been applied for Lyapunov, Sylvester or Riccati matrix equations [1,14,15,19,20,21,23,25,26].

3.1. The Case: Both A and B Are Large Matrices

We consider here a nonsymmetric Stein matrix equation, where A and B are large and sparse matrices with

r ≪ n

and

r ≪ s

. We project the initial problem by using the extended block Krylov subspaces

K_{m}^{e} (A, E)

and

K_{m}^{e} (B^{T}, F)

associated with the pairs

(A, E)

and

(B^{T}, F)

, respectively, and get orthonormal bases

{V_{1}, V_{2}, \dots, V_{m}}

and

{W_{1}, W_{2}, \dots, W_{m}}

. We then consider approximate solutions of the Stein matrix Equation (1) that have the low-rank form:

X_{m}^{G A} = V_{m} Y_{m}^{G A} W_{m}^{T}

(3)

where

V_{m} = [V_{1}, V_{2}, \dots, V_{m}]

and

W_{m} = [W_{1}, W_{2}, \dots, W_{m}]

.

The matrix

Y_{m}^{G A}

is determined from the following Galerkin orthogonality condition:

V_{m}^{T} R_{m}^{G A} W_{m} = V_{m}^{T} (A X_{m}^{G A} B - X_{m}^{G A} + E F^{T}) W_{m} = 0

(4)

Now, replacing

X_{m}^{G A} = V_{m} Y_{m}^{G A} W_{m}^{T}

in Equation (4), we obtain the reduced Stein matrix equation:

T_{A} Y_{m}^{G A} {(T_{B})}^{T} - Y_{m}^{G A} + \tilde{E} {\tilde{F}}^{T} = 0

(5)

where

\tilde{E} = V_{m}^{T} E

,

\tilde{F} = W_{m}^{T} F

,

T_{A} = V_{m}^{T} A V_{m}, and T_{B} = W_{m}^{T} B^{T} W_{m} .

Assuming that

λ_{i} (T_{A}) λ_{j} (T_{B}) \neq 1

for any

i = 1, 2, \dots, 2 m r

and

j = 1, 2, \dots, 2 m r

, the solution

Y_{m}

of the low-order Stein Equation (5) can be obtained by a direct method such as those described in [11]. The following result on the norm of the residual

R_{m}

allows us to stop the iterations without having to compute the approximation

X_{m}^{G A}

.

Theorem 1.

Let

X_{m}^{G A}

be the approximation obtained at step m by the EBA algorithm. Then, the Frobenius norm of the residual

R_{m}^{G}

associated to the approximation

X_{m}^{G A}

is given by:

∥ R_{m}^{G} ∥_{F} = \sqrt{α_{m}^{2} + β_{m}^{2} + γ_{m}^{2}}

(6)

where

α_{m} = {∥T_{m}^{A} Y_{m} E_{m} {(T_{m + 1, m}^{B})}^{T}∥}_{F}, β_{m} = {∥T_{m + 1, m}^{A} E_{m}^{T} {(T_{m}^{B})}^{T}∥}_{F},

and:

γ_{m} = {∥T_{m + 1, m}^{A} E_{m}^{T} Y_{m} E_{m} {(T_{m + 1, m}^{B})}^{T}∥}_{F}

Proof.

The proof is similar to the one given at proposition 6 in [17]. ☐

In the following result, we give an upper bound for the norm of the error

X - X_{m}^{G A}

.

Theorem 2.

Assume that

{∥ A ∥}_{2} < 1

and

{∥ B ∥}_{2} < 1

, and let

Y_{m}^{G A}

be the exact solution of projected Stein matrix Equation (5) and

X_{m}^{G A}

be the approximate solution given by running m steps of the EBA algorithm. Then:

∥ X - X_{m}^{G A} ∥_{2} \leq \frac{{∥ A ∥}_{2} ∥ T_{m + 1, m}^{B} ∥_{2} + {∥ B ∥}_{2} ∥ T_{m + 1, m}^{A} ∥_{2} + ∥ T_{m + 1, m}^{A} ∥_{2} {∥ T_{m + 1, m}^{B} ∥}_{2}}{1 - {∥ A ∥}_{2} {∥ B ∥}_{2}} {∥ Y_{m} ∥}_{2}

(7)

Proof.

The proof is similar to the one given at Theorem 2 in [27]. ☐

The approximate solution

X_{m}^{G A}

can be given as a product of two matrices of low rank. Consider the singular value decomposition of the

2 m r \times 2 m r

matrix:

Y_{m}^{G A} = {\tilde{Y}}_{1} Σ {\tilde{Y}}_{2}^{T}

where

Σ

is the diagonal matrix of the singular values of

Y_{m}^{M R}

sorted in decreasing order. Let

Y_{1, l}

and

Y_{2, l}

be the

2 m r \times l

matrices of the first l columns of

{\tilde{Y}}_{1}

and

{\tilde{Y}}_{2},

respectively, corresponding to the l singular values of magnitude greater than some tolerance. We obtain the truncated singular value decomposition:

Y_{m}^{G A} \approx U_{1, l} Σ_{l} {U_{2, l}}^{T}

where

Σ_{l} = diag [σ_{1}, \dots, σ_{l}]

. Setting

Z_{1, m} = V_{m} U_{1, l} Σ_{l}^{1 / 2}

, and

Z_{2, m} = W_{m} U_{2, l} Σ_{l}^{1 / 2},

it follows that:

X_{m}^{G A} \approx Z_{1, m} Z_{2, m}^{T}

(8)

This is very important for large problems when one doesn’t need to compute and store the approximation

X_{m}

at each iteration.

The GA is given in Algorithm 2:

Algorithm 2. Galerkin Approach (GA) for the Stein Matrix Equations

(1): Inputs: A an $n \times n$ matrix, B an $s \times s$ matrix, E an $n \times r$ matrix and F an $s \times r$ matrix.
(2): Choose a tolerance $t o l > 0$ , a maximum number of $i t e r m a x$ iterations.
(3): For $m = 1, 2, 3, \dots, i t e r m a x$
(4): Compute $V_{m}, T_{m}^{A}$ , by Algorithm 1 applied to $(A, E) .$
(5): Compute $W_{m}, T_{m}^{B}$ , by Algorithm 1 applied to $(B^{T}, F) .$
(6): Solve the low order Stein Equation (5) and compute $∥ R_{m} ∥_{F}$ given by Equation (6)
(7): if $∥ R_{m} ∥_{F} \leq t o l$ , stop,
(8): Using Equation (8), the approximate solution $X_{m}^{G A}$ is given by $X_{m}^{G A} \approx Z_{1, m} Z_{2, m}^{T}$ .

In the next section, we consider the case where the matrix A is large while B has a moderate or a small size.

3.2. The Case: A Large and B Small

In this section, we consider the Stein matrix equation:

A X B - X + E = 0

(9)

where E is a matrix of size

n \times s

with

s < < n

.

In this case, we will consider approximations of the exact solution X as:

X_{m} = V_{m} Y_{m}

where

V_{m}

is the orthonormal basis obtained by applying the extended block Krylov subspace

K_{m}^{e} (A, E)

. The orthogonality Galerkin condition gives:

V_{m}^{T} R_{m} = 0

(10)

where

R_{m}

is the m-th residual given by

R_{m} = A X_{m} B - X_{m} + E

. Therefore, we obtain the projected Stein matrix equation:

T_{A} Y_{m}^{G A} B - Y_{m}^{G A} + \tilde{E} = 0

(11)

where

T_{A} = V_{m}^{T} A V_{m}

and

\tilde{E} = V_{m}^{T} E

.

The next result gives a useful expression of the norm of the residual.

Theorem 3.

Let

Y_{m}^{G A}

the exact solution of the reduced Stein matrix Equation (11) and let

X_{m}^{G A} = V_{m} Y_{m}^{G A}

be the approximate solution of Equation (9) with

R_{m} = R (X_{m}^{G A})

the corresponding residual. Then:

∥ R_{m} ∥_{F} = {∥T_{m + 1, m}^{A} E_{m}^{T} Y_{m}^{G A} B∥}_{F}

(12)

Proof.

The residual is given by

R_{m} = A X_{m}^{G A} B - X_{m}^{G A} + E

. Since E is belonging to

K_{m}^{e} (A, E)

, then

V_{m} V_{m}^{T} E = E

. Using the relation

A V_{m} = V_{m + 1} {\bar{T}}_{m}^{A}

, we have:

\begin{matrix} ∥ R_{m} ∥_{F} & = & {∥A V_{m} Y_{m}^{G A} B - V_{m} Y_{m}^{G A} + E∥}_{F} \\ = & {∥V_{m + 1} {\bar{T}}_{m}^{A} Y_{m}^{G A} B - V_{m} Y_{m}^{G A} + V_{m} V_{m}^{T} E∥}_{F} \\ = & {∥V_{m + 1} {\bar{T}}_{m}^{A} Y_{m}^{G A} B - V_{m + 1} (\begin{matrix} I \\ 0 \end{matrix}) Y_{m}^{G A} + V_{m + 1} (\begin{matrix} I \\ 0 \end{matrix}) \tilde{E}∥}_{F} \\ = & {∥V_{m + 1} [{\bar{T}}_{m}^{A} Y_{m}^{G A} B - (\begin{matrix} I \\ 0 \end{matrix}) Y_{m}^{G A} + (\begin{matrix} \tilde{E} \\ 0 \end{matrix})]∥}_{F} \end{matrix}

As the matrix

V_{m + 1}

is orthogonal and

{\bar{T}}_{m}^{A} = [\begin{matrix} T_{m}^{A} \\ T_{m + 1, m}^{A} E_{m}^{T} \end{matrix}]

, we have:

\begin{matrix} ∥ R_{m} ∥_{F} & = & {∥[[\begin{matrix} T_{m}^{A} \\ T_{m + 1, m}^{A} E_{m}^{T} \end{matrix}] Y_{m}^{G A} B - (\begin{matrix} Y_{m}^{G A} \\ 0 \end{matrix}) + (\begin{matrix} \tilde{E} \\ 0 \end{matrix})]∥}_{F} \\ = & {∥[(\begin{matrix} T_{m}^{A} Y_{m}^{G A} B - Y_{m}^{G A} + \tilde{E} \\ T_{m + 1, m}^{A} E_{m}^{T} Y_{m}^{G A} B \end{matrix})]∥}_{F} \end{matrix}

Therefore:

∥ R_{m} ∥_{F} = {∥T_{m + 1, m}^{A} E_{m}^{T} Y_{m}^{G A} B∥}_{F}

☐

This result is very important because it allows us to calculate the Frobenius norm of

R_{m} (X_{m}^{G A})

without having to compute the approximate solution.

Next, we give a result showing that the error

X - X_{m}

is an exact solution of a perturbed Stein matrix equation.

Theorem 4.

Let

X_{m}

be the approximate solution of Equation (9) obtained after m iterations of the EBA algorithm. Then:

(A - F_{m}) X_{m} B - X_{m} = E

(13)

where

F_{m} = V_{m + 1} T_{m + 1, m}^{A} V_{m}^{T} .

Proof.

Multiplying the Equation (11) from the left by

V_{m}

, we obtain:

[A V_{m} - V_{m + 1} T_{m + 1, m}^{A} E_{m}^{T}] Y_{m} B - V_{m} Y_{m} = V_{m} \tilde{E}

(14)

As

V_{m} \tilde{E} = E

, we get:

(A - F_{m}) X_{m} B - X_{m} = E

(15)

where:

F_{m} = V_{m + 1} T_{m + 1, m}^{A} V_{m}^{T}

☐

We can now state the following result, which gives an upper bound for the norm of the error.

Theorem 5.

If

{∥ A ∥}_{2} < 1

and

{∥ B ∥}_{2} < 1

, then we have:

∥ X - X_{m} ∥_{2} \leq \frac{∥ T_{m + 1, m}^{A} E_{m} Y_{m} B ∥_{2}}{1 - {∥ A ∥}_{2} {∥ B ∥}_{2}}

(16)

Proof.

By subtracting Equation (13) from Equation (9), we get:

A (X - X_{m}) B - (X - X_{m}) = - F_{m} X_{m} B

(17)

The error

X_{m} - X

is the solution of the Stein matrix Equation (17) and can be expressed as:

X_{m} - X = \sum_{i = 0}^{+ \infty} A^{i} [F_{m} X_{m} B] B^{i}

(18)

\begin{matrix} ∥ X_{m} {- X ∥}_{2} & \leq & \sum_{i = 0}^{+ \infty} ∥ A^{i} ∥_{2} ∥ F_{m} X_{m} {B ∥}_{2} {∥ B^{i} ∥}_{2} \end{matrix}

(19)

\begin{matrix} \leq & ∥ F_{m} X_{m} {B ∥}_{2} \sum_{i = 0}^{+ \infty} {({∥ A ∥}_{2} {∥ B ∥}_{2})}^{i} \end{matrix}

(20)

\begin{matrix} \leq & \frac{∥ F_{m} X_{m} {B ∥}_{2}}{1 - {∥ A ∥}_{2} {∥ B ∥}_{2}} \end{matrix}

(21)

\begin{matrix} \leq & \frac{∥ T_{m + 1, m}^{A} E_{m} Y_{m} {B ∥}_{2}}{1 - {∥ A ∥}_{2} {∥ B ∥}_{2}} \end{matrix}

(22)

☐

In the next section, we present projection methods based on extended block Krylov subspaces and MR property.

4. Minimal Residual Method for Large Scale Stein Matrix Equations

In this section, we present a MR method for solving large scale Stein matrix equations. A MR method for solving large scale Lyapunov matrix equation is given in [22].

4.1. The Case: Both A and B Are Large

Instead of using a Galerkin condition as we explained in the preceding section, we consider here approximate solutions

X_{m}^{M R} = V_{m} Y_{m}^{M R} W_{m}^{T}

satisfying the following minimization property:

X^{M R} = arg min_{X_{m} = V_{m} Y_{m} W_{m}^{T}} {∥A X_{m} B - X_{m} + E F^{T}∥}_{F}

(23)

We have the following result.

Theorem 6.

The solution

X_{m}^{M R}

of the the minimization problem:

X_{m}^{M R} = arg min_{X_{m} = V_{m} Y_{m} W_{m}^{T}} {∥A X_{m} B - X_{m} + E F^{T}∥}_{F}

is given by:

X_{m}^{M R} = V_{m} Y_{m}^{M R} W_{m}^{T}

where

Y_{m}^{M R}

solves the following low dimentional minimization problem:

Y_{m}^{M R} = arg min {∥{\bar{T}}_{m}^{A} Y_{m} {({\bar{T}}_{m}^{B})}^{T} - (\begin{matrix} I \\ 0 \end{matrix}) Y_{m} (\begin{matrix} I & 0 \end{matrix}) + (\begin{matrix} R_{E} R_{F}^{T} & 0 \\ 0 & 0 \end{matrix})∥}_{F}

(24)

with

E = V_{1} R_{E}

and

F = W_{1} R_{F}

, the

Q R

factorization of E and F, respectively.

Proof.

We have:

\begin{matrix} min_{X = V_{m} Y_{m} W_{m}^{T}} {∥A X B - X + E F^{T}∥}_{F} \\ = min_{Y_{m}} {∥A V_{m} Y_{m} W_{m}^{T} B - V_{m} Y_{m} W_{m}^{T} - V_{1} R_{E} R_{F}^{T} W_{1}^{T}∥}_{F} \\ = min_{Y_{m}} {∥V_{m + 1} [{\bar{T}}_{m}^{A} Y_{m} {({\bar{T}}_{m}^{B})}^{T} - (\begin{matrix} I \\ 0 \end{matrix}) Y_{m} (\begin{matrix} I & 0 \end{matrix}) + (\begin{matrix} R_{E} R_{F}^{T} & 0 \\ 0 & 0 \end{matrix})] W_{m + 1}^{T}∥}_{F} \\ = min_{Y_{m}} {∥[{\bar{T}}_{m}^{A} Y_{m} {({\bar{T}}_{m}^{B})}^{T} - (\begin{matrix} I \\ 0 \end{matrix}) Y_{m} (\begin{matrix} I & 0 \end{matrix}) + (\begin{matrix} R_{E} R_{F}^{T} & 0 \\ 0 & 0 \end{matrix})]∥}_{F} \end{matrix}

☐

One advantage of using the minimization approach is the fact that the projected problem (24) always has a solution that is not the case when one uses a GA.

The main problem is now how to solve the reduced order minimization problem (24). One possibility is the use of the preconditioned global conjugate gradient (PGCG) method.

4.2. The Preconditioned Global CG Method for Solving the Reduced Minimization Problem

In this section, we adopt the preconditioned conjugate gradient method (PCG) [28,29] to solve the reduced minimization problem (24). The normal equation associated with (24) is given by:

L_{m}^{*} (L_{m} (Y)) = L_{m}^{*} (C)

(25)

where:

L_{m} (Y) = {\bar{T}}_{m}^{A} Y {({\bar{T}}_{m}^{B})}^{T} - (\begin{matrix} I \\ 0 \end{matrix}) Y (\begin{matrix} I & 0 \end{matrix})

Notice that

L_{m}^{*}

is the adjoint of the linear operator

L_{m}

with respect to the Frobenius inner product is given by:

L_{m}^{*} (Z) = {({\bar{T}}_{m}^{A})}^{T} Z {\bar{T}}_{m}^{B} - (\begin{matrix} I & 0 \end{matrix}) Z (\begin{matrix} I \\ 0 \end{matrix})

and:

C = (\begin{matrix} R_{E} R_{F}^{T} & 0 \\ 0 & 0 \end{matrix})

We can decompose the matrices

{\bar{T}}_{m}^{A}

and

{\bar{T}}_{m}^{B}

as follows:

{\bar{T}}_{m}^{A} = (\begin{matrix} T_{m}^{A} \\ h_{m}^{A} \end{matrix}) and {\bar{T}}_{m}^{B} = (\begin{matrix} T_{m}^{B} \\ h_{m}^{B} \end{matrix})

where

h_{m}^{A}

and

h_{m}^{B}

represent the last

2 r

rows of the matrices

{\bar{T}}_{m}^{A}

and

{\bar{T}}_{m}^{B}

, respectively. Therefore, the normal Equation (25) can be written as:

{\bar{T}}_{m}^{A}^{T} {\bar{T}}_{m}^{A} Y {\bar{T}}_{m}^{B}^{T} {\bar{T}}_{m}^{B} + Y - {T_{m}^{A}}^{T} Y T_{m}^{B} - T_{m}^{A} Y {(T_{m}^{B})}^{T} - L_{m}^{*} (C) = 0

(26)

Considering the singular value decomposition (SVD) of the matrices

{\bar{T}}_{m}^{A}

and

{\bar{T}}_{m}^{B}

:

{\bar{T}}_{m}^{A} = {\bar{U}}_{A} {\bar{Σ}}_{A} {\bar{V}}_{A}^{T}; {\bar{T}}_{m}^{B} = {\bar{U}}_{B} {\bar{Σ}}_{B} {\bar{V}}_{B}^{T}

(27)

we get the eigendecomposition:

{\bar{T}}_{m}^{A}^{T} {\bar{T}}_{m}^{A} = Q_{A} D_{A} Q_{A}^{T}, {\bar{T}}_{m}^{B}^{T} {\bar{T}}_{m}^{B} = Q_{B} D_{B} Q_{B}^{T}

(28)

where

Q_{A} = {\bar{V}}_{A}

,

Q_{B} = {\bar{V}}_{B}

and

D_{A} = {\bar{Σ}}_{A}^{T} {\bar{Σ}}_{A}

.

Let

\tilde{Y} = Q_{A}^{T} Y Q_{B}

and

\tilde{C} = Q_{A}^{T} L_{m}^{*} (C) Q_{B}

, and then the normal Equation (26) is now expressed as:

D_{A} \tilde{Y} D_{B} + \tilde{Y} - {\tilde{T}}_{m}^{A} \tilde{Y} {({\tilde{T}}_{m}^{B})}^{T} - {({\tilde{T}}_{m}^{A})}^{T} \tilde{Y} ({\tilde{T}}_{m}^{B}) - \tilde{C} = 0

(29)

where

{\tilde{T}}_{m}^{A} = Q_{A}^{T} T_{m}^{A} Q_{A}

and

{\tilde{T}}_{m}^{B} = Q_{B}^{T} T_{m}^{B} Q_{B}

. This expression suggests that one can use the first part as a preconditioner, that is, the matrix operator:

P (\tilde{Y}) = D_{A} \tilde{Y} D_{B} + \tilde{Y}

(30)

It can be seen that the expression (29) corresponds to the normal equation of the following matrix operator:

{\tilde{L}}_{m} (\tilde{Y}) = {\tilde{T}}_{m}^{A} \tilde{Y} {({\tilde{T}}_{m}^{B})}^{T} - (\begin{matrix} Q_{A} \\ 0 \end{matrix}) \tilde{Y} (\begin{matrix} Q_{B}^{T} & 0 \end{matrix})

(31)

where

{\tilde{T}}_{m}^{A} = {\bar{T}}_{m}^{A} Q_{A}

and

{\tilde{T}}_{m}^{B} = {\bar{T}}_{m}^{B} Q_{B}

. Then, the preconditioned global CG algorithm is obtained by applying the preconditioner (30) to the normal equation associated with the matrix linear operator defined by Equation (31). This is summarized in Algorithm 3.

Algorithm 3. The Preconditioned Global Conjugate Gradient (PGCG) Algorithm.

(1)

Set

{\tilde{Y}}_{0} = 0

Compute

{\tilde{R}}_{0} = C - {\tilde{L}}_{m} ({\tilde{Y}}_{0})

;

S_{0} = {\tilde{L}}_{m}^{*} ({\tilde{R}}_{0})

,

Z_{0} = P^{- 1} (S_{0})

;

P_{0} = S_{0}

(2)

For

j = 0, 1, 2, \dots, j_{m a x}

(a): $W_{j} = {\tilde{L}}_{m} (P_{j})$
(b): $α_{j} = {⟨S_{j}, Z_{j}⟩}_{F} / {| W_{j} |}_{F}^{2}$
(c): ${\tilde{Y}}_{j + 1} = {\tilde{Y}}_{j} + α_{j} P_{j}$
(d): ${\tilde{R}}_{j + 1} = {\tilde{R}}_{j} - α_{j} W_{j}$
(e): If $∥ {\tilde{R}}_{j + 1} ∥_{F}$ is small enough, then stop
Else
(f): $S_{j + 1} = {\tilde{L}}_{m}^{*} ({\tilde{R}}_{j + 1})$
(g): $Z_{j + 1} = P^{- 1} (S_{j + 1})$
(h): $β_{j} = {⟨S_{j + 1}, Z_{j + 1}⟩}_{F} / {⟨S_{j}, Z_{j}⟩}_{F}$
(i): $P_{j + 1} = Z_{j + 1} + β_{j} P_{j}$

(3)

End For

Notice that the use of the preconditioner

P

requires the solution, at each iteration, of a Stein equation. As the matrices

D_{A}

and

D_{B}

of these Stein matrix equations are diagonal matrices, this reduces the costs.

The MR Algorithm 4 for the Stein matrix equations is summarized as follows:

Algorithm 4. The Minimal Residual (MR) Method for Nonsymmetric Stein Matrix Equations

(1): Choose a tolerance $t o l > 0$ , a maximum number of $i t e r m a x$ iterations
(2): For $m = 1, 2, 3, \dots, i t e r m a x$
(3): Update $V_{m}, {\bar{T}}_{m}^{A}$ , by algorithm 1 (EBA) applied to $(A, E)$
(4): Update $W_{m}, {\bar{T}}_{m}^{B}$ , by algorithm 1 (EBA) applied to $(B^{T}, F)$
(5): Solve the low order problem (24)
(6): if $∥ R_{m} ∥_{F} \leq t o l$ , stop
(7): Using Equation (8), the approximate solution $X_{m}^{M R}$ is given by $X_{m} \approx Z_{1, m} Z_{2, m}^{T}$

4.3. The Case: A Large and B Small

In this subsection, we apply the MR norm method to the nonsymmetric Stein Equation (9) in the case A large and B small. The approximate solution is given by:

X_{m}^{M R} = V_{m} Y_{m}^{M R}

with:

X^{M R} = arg min_{X_{m} = V_{m} Y_{m}} {∥A X_{m} B - X_{m} + E∥}_{F}

(32)

We have the following result, which is not difficult to prove.

Theorem 7.

The solution of the minimization problem:

X_{m}^{M R} = arg min_{X_{m} = V_{m} Y_{m}} {∥A X_{m} B - X_{m} + E∥}_{F}

is given by:

X_{m}^{M R} = V_{m} Y_{m}^{M R}

where:

Y_{m}^{M R} = arg min_{Y_{m}} {∥[{\bar{T}}_{m}^{A} Y_{m} B - [\begin{matrix} Y_{m} \\ 0 \end{matrix}] + [\begin{matrix} R_{E} \\ 0 \end{matrix}]]∥}_{F}

(33)

with

E = V_{1} R_{E}

being the

Q R

decomposition of E.

The reduced minimization problem (33) can also be solved by using the preconditioned global CG method (PGCG), as we did for the problem (24).

5. Numerical Experiments

In this section, we present some numerical experiments of large and sparse Stein matrix equations. We compared EBA-MR and EBA-GA methods. For the GA and at each iteration m, we solved the projected Stein matrix equations by using the Bartels–Stewart algorithm [11]. When solving the minimization reduced problem by the PGCG, we stopped the iterations when the relative norm of the residual was less than

t o l_{l} = 10^{- 12}

or when a maximum of

k m a x = 200

iterations was achieved. The algorithms were coded in Matlab 8.0 (2014). The stopping criterion used for EBA-MR and GA was

∥ R (X_{m}) ∥_{F} < 10^{- 7}

or a maximum of

m_{m a x} = 100

iterations was achieved.

In all of the examples, the coefficients of the matrices E and F were random values uniformly distributed on

[0, 1]

.

Example 1.

In this first example, the matrices A and B are obtained from the centered finite difference discretization of the operators:

L_{A} (u) = Δ u + f_{1} (x, y) \frac{\partial u}{\partial x} +, f_{2} (x, y) \frac{\partial u}{\partial y} + f (x, y) u

L_{B} (u) = Δ u + g_{1} (x, y) \frac{\partial u}{\partial x} + g_{2} (x, y) \frac{\partial u}{\partial y} + g (x, y) u

on the unit square

[0, 1] \times [0, 1]

with homogeneous Dirichlet boundary conditions. The number of inner grid points in each direction was

n_{0}

and

s_{0}

for the operators

L_{A}

and

L_{B}

, respectively. The matrices A and B were obtained from the discretization of the operator

L_{A}

and

L_{B}

with the dimensions

n = n_{0}^{2}

and

s = s_{0}^{2}

, respectively. The discretization of the operator

L_{A} (u)

and

L_{B} (u)

yields matrices extracted from the Lyapack package [30] using the command fdm_2d_matrix and denoted as A = fdm(n0,’f_1(x,y)’,’f_2(x,y)’,’f(x,y)’). In this example,

n = 10, 000

and

s = 4900

, respectively, and are named as

A = fdm (n 0, f_{1} (x, y), f_{2} (x, y), f (x, y))

and

B = fdm (s 0, g_{1} (x, y), g_{2} (x, y), g (x, y))

with

f_{1} (x, y) = - e^{x y}

,

f_{2} (x, y) = - sin (x y)

,

f (x, y) = y^{2}

,

g_{1} (x, y) = - 100 e^{x}

,

g_{2} (x, y) = - 12 x y

and

g (x, y) = \sqrt{x^{2} + y^{2}}

. For this experiment, we used

r = 3

.

In Figure 1, we plotted the Frobenius norms of the residuals versus the number of iterations for the MR and the GAs.

Figure 1. Galerkin approach (GA): dashed line, minimal residual (MR): solid line.

In Table 1, we compared the performances of the MR method and the GA. For both methods, we listed the residual norms, the maximum number of iteration and the corresponding execution time.

Table 1. Results for Example 1.

Example 2.

For the second set of experiments, we considered matrices from the University of Florida Sparse Matrix Collection [31] and from the Harwell Boeing Collection (http://math.nist.gov/MatrixMarket).

In Figure 2, we used the matrices A = pde2961 and B = fdm

(s 0, 100 e^{x}, 12 x y, \sqrt{x^{2} + y^{2}})

with dimensions

n = 2961

and

s = 3600

, respectively, and

r = 3

.

Figure 2. GA: dashed line, MR: solid line.

In Figure 3, we used the matrices A=Themal and B=fdm

(s_{0}, e^{x y}, s i n (x y), x^{2} - y^{2})

with dimensions

n = 3456

and

s = 6400

, respectively, and

r = 3

.

Figure 3. GA: dashed line, MR: solid line.

In Table 2, we compared the performances of the MR method and the GA. For both methods, we listed the residual norms, the maximum number of iterations and the corresponding execution time.

Table 2. Results for Example 5.2.

6. Conclusions

We presented in this paper two iterative methods for computing numerical solutions for large scale Stein matrix equations with low rank right-hand sides. The proposed methods are based on projection onto extended block Krylov subspaces with a Galerkin or a minimal residual approach. The approximate solutions are given as products of two low rank matrices and allow for saving memory for large problems. The numerical experiments show that the proposed Krylov-based methods are effective for large and sparse matrices.

Author Contributions

Authors have contributed equally in the mathematical part, the editorial as well as the experimental part.

Conflicts of Interest

The authors declare no conflict of interest.

References

Bouhamidi, A.; Heyouni, M.; Jbilou, K. Block Arnoldi-based methods for large scale discrete-time algebraic Riccati equations. J. Comput. Appl. Math. 2011, 236, 1531–1542. [Google Scholar] [CrossRef]
Bouhamidi, A.; Jbilou, K. Sylvester Tikhonov-regularization methods in image restoration. J. Comput. Appl. Math. 2007, 206, 86–98. [Google Scholar] [CrossRef]
Zhou, B.; Lam, J.; Duan, G.-R. On Smith-type iterative algorithms for the Stein matrix equation. Appl. Math. Lett. 2009, 22, 1038–1044. [Google Scholar] [CrossRef]
Zhou, B.; Lam, J.; Duan, G.-R. Toward solution of matrix equation X = Af (X)B + C. Linear Algebra Appl. 2011, 435, 1370–1398. [Google Scholar] [CrossRef]
Zhou, B.; Duan, G.-R.; Lam, J. Positive definite solutions of the nonlinear matrix equation. Appl. Math. Comput. 2013, 219, 7377–7391. [Google Scholar] [CrossRef]
Li, Z.-Y.; Zhou, B.; Lam, J. Towards positive definite solutions of a class of nonlinear matrix equations. Appl. Math. Comput. 2014, 237, 546–559. [Google Scholar] [CrossRef]
Van Dooren, P. Gramian Based Model Reduction of Large-Scale Dynamical Systems. In Numerical Analysis; Chapman and Hall/CRC Press: London, UK, 2000; pp. 231–247. [Google Scholar]
Datta, B.N. Numerical Methods for Linear Control Systems; Academic Press: New York, NY, USA, 2003. [Google Scholar]
Datta, B.N.; Datta, K. Theoretical and computational aspects of some linear algebra problems in control theory. In Computational and Combinatorial Methods in Systems Theory; Byrnes, C.I., Lindquist, A., Eds.; Elsevier: Amsterdam, The Netherlands, 1986; pp. 201–212. [Google Scholar]
Calvetti, D.; Levenberg, N.; Reichel, L. Iterative methods for X − AXB = C. J. Comput. Appl. Math. 1997, 86, 73–101. [Google Scholar] [CrossRef]
Bartels, R.H.; Stewart, G.W. Solution of the matrix equation A X + X B = C, Algorithm 432. Commun. ACM 1972, 15, 820–826. [Google Scholar] [CrossRef]
Golub, G.H.; Nash, S.; van Loan, C. A Hessenberg Schur method for the problem AX + XB = C. IEEE Trans. Autom. Control 1979, 24, 909–913. [Google Scholar] [CrossRef]
Simoncini, V. Computational methods for linear matrix equations. SIAM Rev. 2016, 58, 377–441. [Google Scholar] [CrossRef]
Agoujil, S.; Bentbib, A.H.; Jbilou, K.; Sadek, E.L.M. A minimal residual norm method for large-scale Sylvester matrix equations. Elect. Trans. Numer. Anal. 2014, 43, 45–59. [Google Scholar]
Bentbib, A.H.; Jbilou, K.; Sadek, E.M. On some Krylov subspace based methods for large-scale nonsymmetric algebraic Riccati problems. Comput. Math. Appl. 2015, 2555–2565. [Google Scholar] [CrossRef]
Druskin, V.; Knizhnerman, L. Extended Krylov subspaces: Approximation of the matrix square root and related functions. SIAM J. Matrix Anal. Appl. 1998, 19, 755–771. [Google Scholar] [CrossRef]
El Guennouni, A.; Jbilou, K.; Riquet, A.J. Block Krylov subspace methods for solving large Sylvester equations. Numer. Algorithms 2002, 29, 75–96. [Google Scholar] [CrossRef]
Heyouni, M. Extended Arnoldi methods for large low-rank Sylvester matrix equations. Appl. Numer. Math. 2010, 60, 1171–1182. [Google Scholar] [CrossRef]
Jaimoukha, I.M.; Kasenally, E.M. Krylov subspace methods for solving large Lyapunov equations. SIAM J. Numer. Anal. 1994, 31, 227–251. [Google Scholar] [CrossRef]
Jbilou, K. Low-rank approximate solution to large Sylvester matrix equations. Appl. Math. Comput. 2006, 177, 365–376. [Google Scholar] [CrossRef]
Jbilou, K.; Riquet, A.J. Projection methods for large Lyapunov matrix equations. Linear Algebra Appl. 2006, 415, 344–358. [Google Scholar] [CrossRef]
Lin, Y.; Simoncini, V. Minimal residual methods for large scale Lyapunov equations. Appl. Numer. Math. 2013, 72, 52–71. [Google Scholar] [CrossRef]
Simoncini, V. A new iterative method for solving large-scale Lyapunov matrix equations. SIAM J. Sci. Comput. 2007, 29, 1268–1288. [Google Scholar] [CrossRef]
Jagels, C.; Reichel, L. Recursion relations for the extended Krylov subspace method. Linear Algebra Appl. 2011, 434, 1716–1732. [Google Scholar] [CrossRef]
Heyouni, M.; Jbilou, K. An extended Block Arnoldi algorithm for large-scale solutions of the continuous-time algebraic Riccati equation. Elect. Trans. Numer. Anal. 2009, 33, 53–62. [Google Scholar]
Saad, Y. Numerical solution of large Lyapunov equations. In Signal Processing, Scattering, Operator Theory and Numerical Methods; Kaashoek, M.A., van Shuppen, J.H., Ran, A.C., Eds.; Birkhaser: Boston, MA, USA, 1990; pp. 503–511. [Google Scholar]
Bouhamidi, A.; Hached, M.; Heyouni, M.; Jbilou, K. A preconditioned block Arnoldi method for large Sylvester matrix equations. Numer. Linear Algebra Appl. 2011, 20, 208–219. [Google Scholar] [CrossRef]
Saad, Y. Iterative Methods for Sparse Linear Systems, 2nd ed.; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2003. [Google Scholar]
Saad, Y.; Yeung, M.; Erhel, J.; Guyomarc’h, F. A deflated version of the conjugate gradient algorithm. SIAM J. Sci. Comput. 2000, 21, 1909–1926. [Google Scholar] [CrossRef]
Penzl, T. LYAPACK A MATLAB Toolbox for Large Lyapunov and Riccati Equations, Model Reduction Problems, and Linear-Quadratic Optimal Control Problems. Available online: http://www.tu-chemintz.de/sfb393/lyapack (Accessed on 10 June 2016).
Davis, T. The University of Florida Sparse Matrix Collection, NA Digest, Volume 97, No. 23, 7 June 1997. Available online: http://www.cise.ufl.edu/research/sparse/matrices (Accessed on 10 June 2016).

Figure 1. Galerkin approach (GA): dashed line, minimal residual (MR): solid line.

Figure 2. GA: dashed line, MR: solid line.

Figure 3. GA: dashed line, MR: solid line.

Table 1. Results for Example 1.

Test Problem	Method	Iterations	Residual Norm	Times (s)
$n = 8100$ , $s = 3600$ , $r = 2$	GA	43	7.56 × 10 $^{- 8}$	4.80
$n = 8100$ , $s = 3600$ , $r = 2$	MR	3	$1.46 \times 10^{- 8}$	1.87
$n =$ 10,000, $s = 4900$ , $r = 4$	GA	45	$4.99 \times 10^{- 8}$	26.52
$n =$ 10,000, $s = 4900$ , $r = 4$	MR	3	$6.28 \times 10^{- 8}$	3.75
$n =$ 12,100, $s = 7900$ , $r = 3$	GA	49	$8.93 \times 10^{- 8}$	12.96
$n =$ 12,100, $s = 7900$ , $r = 3$	MR	3	$4.98 \times 10^{- 8}$	3.63

Table 2. Results for Example 5.2.

Test Problem	Method	Iterations	Residual Norm	Time (s)
$n = 2961$ , $s = 3600$ , $r = 2$ , A = `pde`2961	GA	45	$9.10 \times 10^{- 9}$	3.7440
and B = `fdm` $(s_{0}, 100 e^{x}, 12 x y, \sqrt{x^{2} + y^{2}})$	MR	7	$1.54 \times 10^{- 9}$	1.0296
$n = 3456$ , $s = 8100$ , $r = 3$ , A = `Thermal`	GA	40	$3.27 \times 10^{- 8}$	10.1245
and B = `fdm` $(e^{x}, sin (x y), x^{2} - y^{2})$	MR	8	$7.29 \times 10^{- 9}$	7.3008

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

On Some Extended Block Krylov Based Methods for Large Scale Nonsymmetric Stein Matrix Equations

Abstract

1. Introduction

2. The Extended Block Krylov Subspace Algorithm

3. Galerkin-Based Methods

3.1. The Case: Both A and B Are Large Matrices

3.2. The Case: A Large and B Small

4. Minimal Residual Method for Large Scale Stein Matrix Equations

4.1. The Case: Both A and B Are Large

4.2. The Preconditioned Global CG Method for Solving the Reduced Minimization Problem

4.3. The Case: A Large and B Small

5. Numerical Experiments

6. Conclusions

Author Contributions

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics