1. Introduction
In this paper, we are interested in the numerical solution of large-scale nonsymmetric Stein matrix equations of the form:
$$AXB-X+E{F}^{T}=0,\qquad (1)$$
where $A$ and $B$ are real, sparse and square matrices of size $n\times n$ and $s\times s$, respectively, and $E$ and $F$ are matrices of size $n\times r$ and $s\times r$, respectively.
Stein matrix equations play an important role in many problems in control and filtering theory for discrete-time large-scale dynamical systems, in each step of Newton’s method for discrete-time algebraic Riccati equations, in model reduction problems, in image restoration techniques and in other problems [1,2,3,4,5,6,7,8,9,10].
Direct methods for solving the matrix Equation (1), such as the Bartels–Stewart [11] and the Hessenberg–Schur [12] algorithms, are attractive if the matrices are of small size. For a general overview of numerical methods for solving the Stein matrix equation, see [1,2,13].
The Stein matrix Equation (1) can be formulated as an $ns\times ns$ large linear system using the Kronecker formulation:
$$\left({B}^{T}\otimes A-{I}_{ns}\right)vec\left(X\right)=-vec\left(E{F}^{T}\right),$$
where $vec\left(X\right)$ is the vector obtained by stacking all the columns of the matrix $X$, ${I}_{n}$ is the $n$-by-$n$ identity matrix, and the Kronecker product of two matrices $A$ and $B$ is defined by $A\otimes B=\left[{a}_{ij}B\right]$, where $A=\left[{a}_{ij}\right]$. This product satisfies the properties $(A\otimes B)(C\otimes D)=(AC\otimes BD)$, ${(A\otimes B)}^{T}={A}^{T}\otimes {B}^{T}$ and $vec\left(AXB\right)=({B}^{T}\otimes A)vec\left(X\right)$. Then, the matrix Equation (1) has a unique solution if and only if $\lambda \mu \ne 1$ for all $\lambda \in \sigma \left(A\right)$ and $\mu \in \sigma \left(B\right)$, where $\sigma \left(A\right)$ denotes the spectrum of the matrix $A$. Throughout the paper, we assume that this condition is satisfied. Moreover, if both $A$ and $B$ are Schur stable, i.e., $\sigma \left(A\right)$ and $\sigma \left(B\right)$ lie in the open unit disc, then the solution of Equation (1) can be expressed as the following infinite matrix series:
$$X=\sum_{i=0}^{\infty }{A}^{i}E{F}^{T}{B}^{i}.$$
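As a small illustration of the Kronecker formulation and of the series representation, the following sketch solves a tiny Stein equation $AXB-X+E{F}^{T}=0$ in both ways and compares the results. It is only a minimal dense example under stated assumptions: the function names `solve_stein_kron` and `solve_stein_series`, the test sizes and the scaling of the random matrices are illustrative choices, not part of the methods studied in this paper, and the Kronecker system is only practical for very small $n$ and $s$.

```python
import numpy as np

def solve_stein_kron(A, B, C):
    """Solve A X B - X + C = 0 via the Kronecker form (dense, small sizes only)."""
    n, s = C.shape
    # vec(A X B) = (B^T kron A) vec(X), where vec stacks columns (Fortran order).
    K = np.kron(B.T, A) - np.eye(n * s)
    x = np.linalg.solve(K, -C.flatten(order="F"))
    return x.reshape((n, s), order="F")

def solve_stein_series(A, B, C, terms=200):
    """Truncated series X = sum_i A^i C B^i (assumes A and B are Schur stable)."""
    X = np.zeros_like(C)
    term = C.copy()
    for _ in range(terms):
        X += term
        term = A @ term @ B
    return X

# Illustrative data: A and B are scaled so that their 2-norms equal 0.5.
rng = np.random.default_rng(0)
n, s, r = 6, 4, 2
A = rng.standard_normal((n, n)); A *= 0.5 / np.linalg.norm(A, 2)
B = rng.standard_normal((s, s)); B *= 0.5 / np.linalg.norm(B, 2)
E = rng.standard_normal((n, r))
F = rng.standard_normal((s, r))
C = E @ F.T

X_kron = solve_stein_kron(A, B, C)
X_series = solve_stein_series(A, B, C)
residual = np.linalg.norm(A @ X_kron @ B - X_kron + C)
difference = np.linalg.norm(X_kron - X_series)
```

Both routes should agree up to rounding errors here, because the scaling makes $A$ and $B$ Schur stable and guarantees $\lambda \mu \ne 1$.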
To solve large linear matrix equations, several Krylov subspace projection methods have been proposed (see, e.g., [1,13,14,15,16,17,18,19,20,21,22,23,24] and the references therein). The main idea developed in these methods is to use a block Krylov subspace or an extended block Krylov subspace and then project the original large matrix equation onto these Krylov subspaces using a Galerkin condition or a minimization property of the obtained residual. Hence, we will be interested in these two procedures to obtain approximate solutions of the Stein matrix Equation (1). The rest of the paper is organized as follows. In the next section, we recall the extended block Krylov subspace and the extended block Arnoldi (EBA) algorithm with some properties. In Section 3, we will apply the Galerkin approach (GA) to Stein matrix equations by using the extended Krylov subspaces. In Section 4, we define the minimal residual (MR) method for Stein matrix equations by using the extended Krylov subspaces. We finally present some numerical experiments in Section 5.
2. The Extended Block Krylov Subspace Algorithm
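In this section, we recall the EBA algorithm applied to $(A,V)$, where $V\in {\mathbb{R}}^{n\times r}$. The block Krylov subspace associated with $(A,V)$ is defined as:
$${\mathcal{K}}_{m}(A,V)=\mathrm{Range}\left(\left[V,AV,{A}^{2}V,\dots ,{A}^{m-1}V\right]\right).$$
The extended block Krylov subspace associated with the pair $(A,V)$ is given as:
$${\mathcal{K}}_{m}^{e}(A,V)=\mathrm{Range}\left(\left[V,{A}^{-1}V,AV,{A}^{-2}V,\dots ,{A}^{m-1}V,{A}^{-m}V\right]\right).$$
The EBA algorithm (Algorithm 1) is defined as follows [15,16,18,23]: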
Algorithm 1. The Extended Block Arnoldi (EBA) Algorithm
 (1) Inputs: $A$ an $n\times n$ matrix, $V$ an $n\times r$ matrix and $m$ an integer.
 (2) Compute the QR decomposition of $[V,{A}^{-1}V]$, i.e., $[V,{A}^{-1}V]={V}_{1}\mathsf{\Lambda}$.
 (3) Set ${\mathbb{V}}_{0}=\left[\,\right]$.
 (4) For $j=1,2,\dots ,m$
   (a) Set ${V}_{j}^{\left(1\right)}$: first $r$ columns of ${V}_{j}$; ${V}_{j}^{\left(2\right)}$: second $r$ columns of ${V}_{j}$.
   (b) ${\mathbb{V}}_{j}=\left[{\mathbb{V}}_{j-1},{V}_{j}\right]$; ${\widehat{V}}_{j+1}=\left[A{V}_{j}^{\left(1\right)},{A}^{-1}{V}_{j}^{\left(2\right)}\right]$.
   (c) Orthogonalize ${\widehat{V}}_{j+1}$ with respect to ${\mathbb{V}}_{j}$ to get ${V}_{j+1}$, i.e.,
     ∗ For $i=1,2,\dots ,j$
     ∗ ${H}_{i,j}={V}_{i}^{T}{\widehat{V}}_{j+1}$;
     ∗ ${\widehat{V}}_{j+1}={\widehat{V}}_{j+1}-{V}_{i}{H}_{i,j}$;
     ∗ End for
   (d) Compute the QR decomposition of ${\widehat{V}}_{j+1}$, i.e., ${\widehat{V}}_{j+1}={V}_{j+1}{H}_{j+1,j}$.
 (5) End for
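The steps of Algorithm 1 can be sketched in a few lines of dense linear algebra. The sketch below is an illustration under assumptions, not the implementation used in the experiments: the function name `extended_block_arnoldi` is hypothetical, a dense solve stands in for a sparse factorization of $A$, the block Gram-Schmidt sweep is applied twice (an added safeguard that Algorithm 1 does not prescribe), and no breakdown handling is included. It returns the basis ${\mathbb{V}}_{m+1}$ so that the relation between $A{\mathbb{V}}_{m}$ and ${\mathbb{V}}_{m+1}$ can be checked numerically.

```python
import numpy as np

def extended_block_arnoldi(A, V, m):
    """Return [V_1, ..., V_{m+1}], an orthonormal basis built by m EBA steps."""
    r = V.shape[1]
    # Step (2): QR factorization of [V, A^{-1} V] gives the first block V_1.
    Vj, _ = np.linalg.qr(np.hstack([V, np.linalg.solve(A, V)]))
    blocks = [Vj]
    for _ in range(m):
        # Step (b): candidate block [A V_j^(1), A^{-1} V_j^(2)].
        W = np.hstack([A @ Vj[:, :r], np.linalg.solve(A, Vj[:, r:])])
        # Step (c): block Gram-Schmidt against all previous blocks,
        # repeated twice here for numerical safety (an assumption of this sketch).
        for _sweep in range(2):
            for Vi in blocks:
                W = W - Vi @ (Vi.T @ W)
        # Step (d): QR factorization of the orthogonalized block.
        Vj, _ = np.linalg.qr(W)
        blocks.append(Vj)
    return np.hstack(blocks)

# Illustrative data: a well-conditioned nonsymmetric matrix.
rng = np.random.default_rng(0)
n, r, m = 60, 2, 4
A = 2.0 * np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)
V = rng.standard_normal((n, r))

V_big = extended_block_arnoldi(A, V, m)      # n x 2(m+1)r
V_m = V_big[:, : 2 * m * r]                  # first m blocks
T_bar = V_big.T @ A @ V_m                    # 2(m+1)r x 2mr
orth_err = np.linalg.norm(V_big.T @ V_big - np.eye(V_big.shape[1]))
rel_err = np.linalg.norm(A @ V_m - V_big @ T_bar)
```

Here `orth_err` measures the loss of orthonormality of the basis and `rel_err` measures how well $A{\mathbb{V}}_{m}$ is reproduced by ${\mathbb{V}}_{m+1}{\mathbb{V}}_{m+1}^{T}A{\mathbb{V}}_{m}$; both are expected to be at the level of rounding errors for this well-conditioned example.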

This algorithm allows us to construct an orthonormal matrix ${\mathbb{V}}_{m}=\left[{V}_{1},{V}_{2},\dots ,{V}_{m}\right]$ that is a basis of the block extended Krylov subspace ${\mathcal{K}}_{m}^{e}(A,V)$. The restriction of the matrix A to the block extended Krylov subspace ${\mathcal{K}}_{m}^{e}(A,V)$ is given by ${\mathbb{T}}_{m}={\mathbb{V}}_{m}^{T}\text{}A\text{}{\mathbb{V}}_{m}$.
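Let ${\overline{\mathbb{T}}}_{m}={\mathbb{V}}_{m+1}^{T}A{\mathbb{V}}_{m}$. Then, we have the following relations [25]:
$$A{\mathbb{V}}_{m}={\mathbb{V}}_{m+1}{\overline{\mathbb{T}}}_{m}={\mathbb{V}}_{m}{\mathbb{T}}_{m}+{V}_{m+1}{T}_{m+1,m}{\mathbb{E}}_{m}^{T},$$
where ${T}_{m+1,m}$ is the $(m+1,m)$ block of ${\overline{\mathbb{T}}}_{m}$, of size $2r\times 2r$, and ${\mathbb{E}}_{m}={[{0}_{2r\times 2(m-1)r},{I}_{2r}]}^{T}$ is the matrix of the last $2r$ columns of the identity matrix ${I}_{2mr}$ [23,25]. In the next section, we will define the GA for solving Stein matrix equations.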
3. Galerkin-Based Methods
In this section, we will apply the Galerkin projection method to obtain low-rank approximate solutions of the nonsymmetric Stein matrix Equation (1). This approach has been applied to Lyapunov, Sylvester or Riccati matrix equations [1,14,15,19,20,21,23,25,26].
3.1. The Case: Both A and B Are Large Matrices
We consider here a nonsymmetric Stein matrix equation, where $A$ and $B$ are large and sparse matrices with $r\ll n$ and $r\ll s$. We project the initial problem by using the extended block Krylov subspaces ${\mathcal{K}}_{m}^{e}(A,E)$ and ${\mathcal{K}}_{m}^{e}({B}^{T},F)$ associated with the pairs $(A,E)$ and $({B}^{T},F)$, respectively, and get orthonormal bases $\{{V}_{1},{V}_{2},\dots ,{V}_{m}\}$ and $\{{W}_{1},{W}_{2},\dots ,{W}_{m}\}$. We then consider approximate solutions of the Stein matrix Equation (1) that have the low-rank form:
$${X}_{m}^{GA}={\mathbb{V}}_{m}{Y}_{m}^{GA}{\mathbb{W}}_{m}^{T},$$
where ${\mathbb{V}}_{m}=\left[{V}_{1},{V}_{2},\dots ,{V}_{m}\right]$ and ${\mathbb{W}}_{m}=\left[{W}_{1},{W}_{2},\dots ,{W}_{m}\right]$.
The matrix ${Y}_{m}^{GA}$ is determined from the following Galerkin orthogonality condition:
$${\mathbb{V}}_{m}^{T}\left(A{X}_{m}^{GA}B-{X}_{m}^{GA}+E{F}^{T}\right){\mathbb{W}}_{m}=0.\qquad (4)$$
Now, replacing ${X}_{m}^{GA}={\mathbb{V}}_{m}{Y}_{m}^{GA}{\mathbb{W}}_{m}^{T}$ in Equation (4), we obtain the reduced Stein matrix equation:
$${\mathbb{T}}_{A}{Y}_{m}^{GA}{\mathbb{T}}_{B}^{T}-{Y}_{m}^{GA}+\tilde{E}{\tilde{F}}^{T}=0,\qquad (5)$$
where $\tilde{E}={\mathbb{V}}_{m}^{T}E$, $\tilde{F}={\mathbb{W}}_{m}^{T}F$, ${\mathbb{T}}_{A}={\mathbb{V}}_{m}^{T}A{\mathbb{V}}_{m}$ and ${\mathbb{T}}_{B}={\mathbb{W}}_{m}^{T}{B}^{T}{\mathbb{W}}_{m}$. Assuming that ${\lambda}_{i}\left({\mathbb{T}}_{A}\right){\lambda}_{j}\left({\mathbb{T}}_{B}\right)\ne 1$ for any $i=1,2,\dots ,2mr$ and $j=1,2,\dots ,2mr$, the solution ${Y}_{m}^{GA}$ of the low-order Stein Equation (5) can be obtained by a direct method such as those described in [11]. The following result on the norm of the residual ${\mathcal{R}}_{m}$ allows us to stop the iterations without having to compute the approximation ${X}_{m}^{GA}$.
Theorem 1. Let ${X}_{m}^{GA}$ be the approximation obtained at step $m$ by the EBA algorithm. Then, the Frobenius norm of the residual ${\mathcal{R}}_{m}$ associated with the approximation ${X}_{m}^{GA}$ is given by:
$${\parallel {\mathcal{R}}_{m}\parallel}_{F}=\sqrt{{\alpha}_{m}^{2}+{\beta}_{m}^{2}+{\gamma}_{m}^{2}},\qquad (6)$$
where ${\alpha}_{m}={\parallel {\mathbb{T}}_{m}^{A}{Y}_{m}{\mathbb{E}}_{m}{\left({T}_{m+1,m}^{B}\right)}^{T}\parallel}_{F}$, ${\beta}_{m}={\parallel {T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}{Y}_{m}{\left({\mathbb{T}}_{m}^{B}\right)}^{T}\parallel}_{F}$, and:
$${\gamma}_{m}={\parallel {T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}{Y}_{m}{\mathbb{E}}_{m}{\left({T}_{m+1,m}^{B}\right)}^{T}\parallel}_{F}.$$
Proof. The proof is similar to the one given for Proposition 6 in [17]. ☐
In the following result, we give an upper bound for the norm of the error $X-{X}_{m}^{GA}$.
Theorem 2. Assume that ${\parallel A\parallel}_{2} < 1$ and ${\parallel B\parallel}_{2} < 1$, and let ${Y}_{m}^{GA}$ be the exact solution of the projected Stein matrix Equation (5) and ${X}_{m}^{GA}$ be the approximate solution given by running $m$ steps of the EBA algorithm. Then: Proof. The proof is similar to that of Theorem 2 in [27]. ☐
The approximate solution ${X}_{m}^{GA}$ can be given as a product of two matrices of low rank. Consider the singular value decomposition of the $2mr\times 2mr$ matrix:
$${Y}_{m}^{GA}={\tilde{Y}}_{1}\Sigma {\tilde{Y}}_{2}^{T},$$
where $\Sigma $ is the diagonal matrix of the singular values of ${Y}_{m}^{GA}$ sorted in decreasing order. Let ${Y}_{1,l}$ and ${Y}_{2,l}$ be the $2mr\times l$ matrices of the first $l$ columns of ${\tilde{Y}}_{1}$ and ${\tilde{Y}}_{2}$, respectively, corresponding to the $l$ singular values of magnitude greater than some tolerance. We obtain the truncated singular value decomposition:
$${Y}_{m}^{GA}\approx {Y}_{1,l}{\Sigma}_{l}{Y}_{2,l}^{T},$$
where ${\Sigma}_{l}=\mathrm{diag}[{\sigma}_{1},\dots ,{\sigma}_{l}]$. Setting ${Z}_{1,m}={\mathbb{V}}_{m}{Y}_{1,l}{\Sigma}_{l}^{1/2}$ and ${Z}_{2,m}={\mathbb{W}}_{m}{Y}_{2,l}{\Sigma}_{l}^{1/2}$, it follows that:
$${X}_{m}^{GA}\approx {Z}_{1,m}{Z}_{2,m}^{T}.\qquad (8)$$
This is very important for large problems, since one does not need to compute and store the approximation ${X}_{m}^{GA}$ at each iteration.
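The low-rank factorization described above can be illustrated as follows. This is a minimal sketch under assumptions: the function name `low_rank_factors`, the tolerance and the random test data are hypothetical, and the orthonormal matrices simply stand in for the bases ${\mathbb{V}}_{m}$ and ${\mathbb{W}}_{m}$ produced by the EBA algorithm.

```python
import numpy as np

def low_rank_factors(Vm, Wm, Y, tol=1e-12):
    """Return Z1, Z2 with Vm @ Y @ Wm.T ~ Z1 @ Z2.T, using a truncated SVD of Y."""
    U1, sigma, U2t = np.linalg.svd(Y)          # Y = U1 @ diag(sigma) @ U2t
    l = max(1, int(np.sum(sigma > tol)))       # number of singular values kept
    root = np.sqrt(sigma[:l])
    Z1 = Vm @ (U1[:, :l] * root)               # V_m Y_{1,l} Sigma_l^{1/2}
    Z2 = Wm @ (U2t[:l, :].T * root)            # W_m Y_{2,l} Sigma_l^{1/2}
    return Z1, Z2

# Illustrative data: orthonormal bases and a reduced solution of rank 3.
rng = np.random.default_rng(0)
n, s, k = 50, 40, 12
Vm, _ = np.linalg.qr(rng.standard_normal((n, k)))
Wm, _ = np.linalg.qr(rng.standard_normal((s, k)))
Y = rng.standard_normal((k, 3)) @ rng.standard_normal((3, k))

Z1, Z2 = low_rank_factors(Vm, Wm, Y, tol=1e-10)
approx_err = np.linalg.norm(Vm @ Y @ Wm.T - Z1 @ Z2.T)
```

Only the two thin factors `Z1` and `Z2` need to be stored; the full $n\times s$ product is formed here solely to measure the approximation error.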
The GA is given in Algorithm 2:
Algorithm 2. Galerkin Approach (GA) for the Stein Matrix Equations
 (1) Inputs: $A$ an $n\times n$ matrix, $B$ an $s\times s$ matrix, $E$ an $n\times r$ matrix and $F$ an $s\times r$ matrix.
 (2) Choose a tolerance $tol>0$ and a maximum number of iterations $itermax$.
 (3) For $m=1,2,3,\dots ,itermax$
 (4) Compute ${\mathbb{V}}_{m},{\mathbb{T}}_{m}^{A}$ by Algorithm 1 applied to $(A,E)$.
 (5) Compute ${\mathbb{W}}_{m},{\mathbb{T}}_{m}^{B}$ by Algorithm 1 applied to $({B}^{T},F)$.
 (6) Solve the low-order Stein Equation (5) and compute $\parallel {\mathcal{R}}_{m}{\parallel}_{F}$ given by Equation (6).
 (7) If $\parallel {\mathcal{R}}_{m}{\parallel}_{F}\le tol$, stop.
 (8) Using Equation (8), the approximate solution ${X}_{m}^{GA}$ is given by ${X}_{m}^{GA}\approx {Z}_{1,m}{Z}_{2,m}^{T}$.
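The loop of Algorithm 2 can be sketched on a small dense problem as follows. This is a hedged illustration, not the code used for the experiments: the helper names `eba_basis`, `solve_small_stein` and `stein_galerkin` are hypothetical, the bases are rebuilt from scratch at every step for brevity, the reduced equation is solved through a dense Kronecker system instead of the Bartels–Stewart algorithm, and the test matrices are artificial. The residual norm is evaluated from the projected quantities only, by forming ${\overline{\mathbb{T}}}_{m}^{A}{Y}_{m}{\left({\overline{\mathbb{T}}}_{m}^{B}\right)}^{T}$ and discarding its leading block, which vanishes by the Galerkin condition.

```python
import numpy as np

def eba_basis(A, V, m):
    """Orthonormal basis [V_1, ..., V_{m+1}] from m extended block Arnoldi steps."""
    r = V.shape[1]
    Vj, _ = np.linalg.qr(np.hstack([V, np.linalg.solve(A, V)]))
    blocks = [Vj]
    for _ in range(m):
        W = np.hstack([A @ Vj[:, :r], np.linalg.solve(A, Vj[:, r:])])
        for _sweep in range(2):              # Gram-Schmidt applied twice (assumption)
            for Vi in blocks:
                W = W - Vi @ (Vi.T @ W)
        Vj, _ = np.linalg.qr(W)
        blocks.append(Vj)
    return np.hstack(blocks)

def solve_small_stein(TA, TBt, C):
    """Solve TA Y TBt - Y + C = 0 through the Kronecker form (small sizes only)."""
    k1, k2 = C.shape
    K = np.kron(TBt.T, TA) - np.eye(k1 * k2)
    return np.linalg.solve(K, -C.flatten(order="F")).reshape((k1, k2), order="F")

def stein_galerkin(A, B, E, F, tol=1e-8, itermax=12):
    r = E.shape[1]
    for m in range(1, itermax + 1):
        k = 2 * m * r
        V_big = eba_basis(A, E, m)           # basis of size 2(m+1)r, rebuilt each step
        W_big = eba_basis(B.T, F, m)
        Vm, Wm = V_big[:, :k], W_big[:, :k]
        TA_bar = V_big.T @ (A @ Vm)          # its leading k x k block is T_A
        TB_bar = W_big.T @ (B.T @ Wm)        # its leading k x k block is T_B
        C = (Vm.T @ E) @ (Wm.T @ F).T
        Y = solve_small_stein(TA_bar[:k], TB_bar[:k].T, C)
        # Residual norm from projected quantities: the three off-leading blocks.
        S = TA_bar @ Y @ TB_bar.T
        res = np.sqrt(np.linalg.norm(S[:k, k:]) ** 2
                      + np.linalg.norm(S[k:, :k]) ** 2
                      + np.linalg.norm(S[k:, k:]) ** 2)
        if res <= tol:
            break
    return Vm, Y, Wm, res, m

# Illustrative Schur stable test matrices (not taken from the experiments).
rng = np.random.default_rng(1)
n, s, r = 80, 60, 2
G = rng.standard_normal((n, n))
H = rng.standard_normal((s, s))
A = 0.4 * np.eye(n) + 0.05 * G / np.linalg.norm(G, 2)
B = 0.3 * np.eye(s) + 0.05 * H / np.linalg.norm(H, 2)
E = rng.random((n, r))
F = rng.random((s, r))

Vm, Y, Wm, res, m = stein_galerkin(A, B, E, F)
X = Vm @ Y @ Wm.T
explicit = np.linalg.norm(A @ X @ B - X + E @ F.T)
```

The explicit residual `explicit` is computed only to check the cheap estimate `res`; in a large-scale setting one would rely on the projected formula and keep the approximation in factored form.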

In the next section, we consider the case where the matrix A is large while B has a moderate or a small size.
3.2. The Case: A Large and B Small
In this section, we consider the Stein matrix equation:
$$AXB-X+E=0,\qquad (9)$$
where $E$ is a matrix of size $n\times s$ with $s\ll n$.
In this case, we will consider approximations of the exact solution $X$ as:
$${X}_{m}^{GA}={\mathbb{V}}_{m}{Y}_{m}^{GA},$$
where ${\mathbb{V}}_{m}$ is the orthonormal basis of the extended block Krylov subspace ${\mathcal{K}}_{m}^{e}(A,E)$ obtained by applying the EBA algorithm to the pair $(A,E)$. The Galerkin orthogonality condition gives:
$${\mathbb{V}}_{m}^{T}{\mathcal{R}}_{m}=0,$$
where ${\mathcal{R}}_{m}$ is the $m$-th residual given by ${\mathcal{R}}_{m}=A{X}_{m}B-{X}_{m}+E$. Therefore, we obtain the projected Stein matrix equation:
$${\mathbb{T}}_{A}{Y}_{m}^{GA}B-{Y}_{m}^{GA}+\tilde{E}=0,\qquad (11)$$
where ${\mathbb{T}}_{A}={\mathbb{V}}_{m}^{T}A{\mathbb{V}}_{m}$ and $\tilde{E}={\mathbb{V}}_{m}^{T}E$.
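The next result gives a useful expression of the norm of the residual.
Theorem 3. Let ${Y}_{m}^{GA}$ be the exact solution of the reduced Stein matrix Equation (11) and let ${X}_{m}^{GA}={\mathbb{V}}_{m}{Y}_{m}^{GA}$ be the approximate solution of Equation (9) with ${\mathcal{R}}_{m}=\mathcal{R}\left({X}_{m}^{GA}\right)$ the corresponding residual. Then:
$${\parallel {\mathcal{R}}_{m}\parallel}_{F}={\parallel {T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}{Y}_{m}^{GA}B\parallel}_{F}.$$
Proof. The residual is given by ${\mathcal{R}}_{m}=A{X}_{m}^{GA}B-{X}_{m}^{GA}+E$. Since $E$ belongs to ${\mathcal{K}}_{m}^{e}(A,E)$, we have ${\mathbb{V}}_{m}{\mathbb{V}}_{m}^{T}E=E$. Using the relation $A{\mathbb{V}}_{m}={\mathbb{V}}_{m+1}{\overline{\mathbb{T}}}_{m}^{A}$, we have:
$${\mathcal{R}}_{m}={\mathbb{V}}_{m+1}\left({\overline{\mathbb{T}}}_{m}^{A}{Y}_{m}^{GA}B-\left[\begin{array}{c}{Y}_{m}^{GA}\\ 0\end{array}\right]+\left[\begin{array}{c}\tilde{E}\\ 0\end{array}\right]\right).$$
As the matrix ${\mathbb{V}}_{m+1}$ has orthonormal columns and ${\overline{\mathbb{T}}}_{m}^{A}=\left[\begin{array}{c}{\mathbb{T}}_{m}^{A}\\ {T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}\end{array}\right]$, we have:
$${\parallel {\mathcal{R}}_{m}\parallel}_{F}={\parallel \left[\begin{array}{c}{\mathbb{T}}_{m}^{A}{Y}_{m}^{GA}B-{Y}_{m}^{GA}+\tilde{E}\\ {T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}{Y}_{m}^{GA}B\end{array}\right]\parallel}_{F}={\parallel {T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}{Y}_{m}^{GA}B\parallel}_{F},$$
since the first block vanishes by Equation (11). ☐
This result is very important because it allows us to calculate the Frobenius norm of ${\mathcal{R}}_{m}\left({X}_{m}^{GA}\right)$ without having to compute the approximate solution.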
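Next, we give a result showing that the approximation ${X}_{m}$ is an exact solution of a perturbed Stein matrix equation.
Theorem 4. Let ${X}_{m}$ be the approximate solution of Equation (9) obtained after $m$ iterations of the EBA algorithm. Then:
$$(A-{F}_{m}){X}_{m}B-{X}_{m}+E=0,\qquad (13)$$
where ${F}_{m}={V}_{m+1}{T}_{m+1,m}^{A}{V}_{m}^{T}.$ Proof. Multiplying Equation (11) from the left by ${\mathbb{V}}_{m}$, we obtain:
$${\mathbb{V}}_{m}{\mathbb{T}}_{m}^{A}{Y}_{m}B-{\mathbb{V}}_{m}{Y}_{m}+{\mathbb{V}}_{m}\tilde{E}=0.$$
As ${\mathbb{V}}_{m}\tilde{E}=E$ and ${\mathbb{V}}_{m}{\mathbb{T}}_{m}^{A}=A{\mathbb{V}}_{m}-{V}_{m+1}{T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}$, we get:
$$A{X}_{m}B-{F}_{m}{X}_{m}B-{X}_{m}+E=0,$$
where:
$${F}_{m}{X}_{m}={V}_{m+1}{T}_{m+1,m}^{A}{V}_{m}^{T}{\mathbb{V}}_{m}{Y}_{m}={V}_{m+1}{T}_{m+1,m}^{A}{\mathbb{E}}_{m}^{T}{Y}_{m}.$$
☐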
We can now state the following result, which gives an upper bound for the norm of the error.
Theorem 5. If ${\parallel A\parallel}_{2} < 1$ and ${\parallel B\parallel}_{2} < 1$, then we have: Proof. By subtracting Equation (13) from Equation (9), we get:
$$A(X-{X}_{m})B-(X-{X}_{m})+{F}_{m}{X}_{m}B=0.\qquad (17)$$
The error $X-{X}_{m}$ is the solution of the Stein matrix Equation (17) and can be expressed as:
$$X-{X}_{m}=\sum_{i=0}^{\infty }{A}^{i}{F}_{m}{X}_{m}{B}^{i+1}.$$
☐
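The mechanism behind this kind of error bound can be checked numerically on a small example. The sketch below does not reproduce the bound stated in Theorem 5; it only verifies the generic consequence of the series representation that, when ${\parallel A\parallel}_{2}{\parallel B\parallel}_{2} < 1$, the error of a Galerkin approximation satisfies ${\parallel X-{X}_{m}\parallel}_{2}\le {\parallel {\mathcal{R}}_{m}\parallel}_{2}/(1-{\parallel A\parallel}_{2}{\parallel B\parallel}_{2})$. The names are hypothetical, and an arbitrary orthonormal basis whose range contains $E$ is used as a stand-in for the EBA basis.

```python
import numpy as np

def stein_solve(A, B, C):
    """Solve A X B - X + C = 0 through the Kronecker form (small sizes only)."""
    n, s = C.shape
    K = np.kron(B.T, A) - np.eye(n * s)
    return np.linalg.solve(K, -C.flatten(order="F")).reshape((n, s), order="F")

rng = np.random.default_rng(2)
n, s, k = 30, 4, 8
A = rng.standard_normal((n, n)); A *= 0.6 / np.linalg.norm(A, 2)
B = rng.standard_normal((s, s)); B *= 0.7 / np.linalg.norm(B, 2)
E = rng.standard_normal((n, s))

X = stein_solve(A, B, E)                       # reference solution

# Orthonormal basis containing range(E): a stand-in for the EBA basis (assumption).
V, _ = np.linalg.qr(np.hstack([E, rng.standard_normal((n, k - s))]))
Y = stein_solve(V.T @ A @ V, B, V.T @ E)       # projected Stein equation
Xm = V @ Y
Rm = A @ Xm @ B - Xm + E                       # residual of the approximation

galerkin_defect = np.linalg.norm(V.T @ Rm)     # should vanish up to rounding
error_eq_defect = np.linalg.norm(A @ (X - Xm) @ B - (X - Xm) + Rm)
err = np.linalg.norm(X - Xm, 2)
bound = np.linalg.norm(Rm, 2) / (1 - np.linalg.norm(A, 2) * np.linalg.norm(B, 2))
```

The quantity `error_eq_defect` confirms that the error solves a Stein equation whose right-hand side is the residual, which is what makes the bound `err <= bound` hold.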
In the next section, we present projection methods based on extended block Krylov subspaces and the MR property.
5. Numerical Experiments
In this section, we present some numerical experiments on large and sparse Stein matrix equations. We compared the EBA-MR and EBA-GA methods. For the GA, at each iteration $m$, we solved the projected Stein matrix equations by using the Bartels–Stewart algorithm [11]. When solving the reduced minimization problem by the PGCG method, we stopped the iterations when the relative norm of the residual was less than $to{l}_{l}={10}^{-12}$ or when a maximum of $kmax=200$ iterations was achieved. The algorithms were coded in Matlab 8.0 (2014). The stopping criterion used for EBA-MR and EBA-GA was $\parallel \mathcal{R}\left({X}_{m}\right){\parallel}_{F} < {10}^{-7}$ or a maximum of ${m}_{max}=100$ iterations.
In all of the examples, the coefficients of the matrices E and F were random values uniformly distributed on $[0,1]$.
Example 1. In this first example, the matrices $A$ and $B$ are obtained from the centered finite difference discretization of the operators:
$${L}_{A}\left(u\right)=\Delta u-{f}_{1}(x,y)\frac{\partial u}{\partial x}-{f}_{2}(x,y)\frac{\partial u}{\partial y}-f(x,y)u,\qquad {L}_{B}\left(u\right)=\Delta u-{g}_{1}(x,y)\frac{\partial u}{\partial x}-{g}_{2}(x,y)\frac{\partial u}{\partial y}-g(x,y)u,$$
on the unit square $[0,1]\times [0,1]$ with homogeneous Dirichlet boundary conditions. The number of inner grid points in each direction was ${n}_{0}$ and ${s}_{0}$ for the operators ${L}_{A}$ and ${L}_{B}$, respectively, so that the matrices $A$ and $B$ have the dimensions $n={n}_{0}^{2}$ and $s={s}_{0}^{2}$. The matrices were generated with the Lyapack package [30] using the command fdm_2d_matrix and are denoted as A = fdm(n0,’f_1(x,y)’,’f_2(x,y)’,’f(x,y)’). In this example, $n=10,000$ and $s=4900$, with $A=\mathrm{fdm}(n0,{f}_{1}(x,y),{f}_{2}(x,y),f(x,y))$ and $B=\mathrm{fdm}(s0,{g}_{1}(x,y),{g}_{2}(x,y),g(x,y))$, where ${f}_{1}(x,y)={e}^{xy}$, ${f}_{2}(x,y)=sin\left(xy\right)$, $f(x,y)={y}^{2}$, ${g}_{1}(x,y)=100{e}^{x}$, ${g}_{2}(x,y)=12xy$ and $g(x,y)=\sqrt{{x}^{2}+{y}^{2}}$. For this experiment, we used $r=3$. In Figure 1, we plotted the Frobenius norms of the residuals versus the number of iterations for the MR method and the GA.
In Table 1, we compared the performances of the MR method and the GA. For both methods, we listed the residual norms, the maximum number of iterations and the corresponding execution time.
Example 2. For the second set of experiments, we considered matrices from the University of Florida Sparse Matrix Collection [31] and from the Harwell Boeing Collection (http://math.nist.gov/MatrixMarket). In Figure 2, we used the matrices A = pde2961 and B = fdm$(s0,100{e}^{x},12xy,\sqrt{{x}^{2}+{y}^{2}})$ with dimensions $n=2961$ and $s=3600$, respectively, and $r=3$.
In Figure 3, we used the matrices A = Thermal and B = fdm$({s}_{0},{e}^{xy},sin\left(xy\right),{x}^{2}{y}^{2})$ with dimensions $n=3456$ and $s=6400$, respectively, and $r=3$.
In Table 2, we compared the performances of the MR method and the GA. For both methods, we listed the residual norms, the maximum number of iterations and the corresponding execution time.