# Modified Jacobi-Gradient Iterative Method for Generalized Sylvester Matrix Equation

by
Nopparut Sasaki
and
Pattrawut Chansangiam
*
Department of Mathematics, Faculty of Science, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand
*
Author to whom correspondence should be addressed.
Symmetry 2020, 12(11), 1831; https://doi.org/10.3390/sym12111831
Submission received: 14 October 2020 / Revised: 28 October 2020 / Accepted: 2 November 2020 / Published: 5 November 2020

## Abstract

We propose a new iterative method for solving a generalized Sylvester matrix equation $A 1 X A 2 + A 3 X A 4 = E$ with given square matrices $A 1 , A 2 , A 3 , A 4$ and an unknown rectangular matrix X. The method constructs a sequence of approximated solutions converging to the exact solution, regardless of the initial value. We decompose the coefficient matrices into the sum of their diagonal parts and remainders. The recursive formula for the iteration is derived from the gradients of quadratic norm-error functions, together with the hierarchical identification principle. We find equivalent conditions on the convergence factor, based on the eigenvalues of the associated iteration matrix, under which the method converges as desired. The convergence rate and error estimates of the method are governed by the spectral norm of the related iteration matrix. Furthermore, we present numerical examples that show the capability and efficiency of the proposed method, compared to recent gradient-based iterative methods.
MSC:
65F45; 15A12; 15A60; 15A69

## 1. Introduction

In control engineering, certain problems concerning the analysis and design of control systems can be formulated as the Sylvester matrix equation:
$A 1 X + X A 2 = C$
where $X ∈ R m × n$ is an unknown matrix, and $A 1 , A 2 , C$ are known matrices of appropriate dimensions. Here, $R m × n$ stands for the set of $m × n$ real matrices. Let us denote by $( · ) T$ the transpose of a matrix. When $A 2 = A 1 T$, the equation is reduced to the Lyapunov equation, which is often found in continuous- and discrete-time stability analysis [1,2]. The Sylvester equation is a special case of a generalized Sylvester matrix equation:
$A 1 X A 2 + A 3 X A 4 = E$
where $X ∈ R m × n$ is unknown, and $A 1 , A 2 , A 3 , A 4 , E$ are known constant matrices of appropriate dimensions. This equation also includes the equation $A 1 X A 2 = C$ and the Kalman–Yakubovich equation $A 1 X A 2 − X = C$ as special cases. All of these equations have profound applications in linear system theory and related areas.
Normally, a direct way to solve the generalized Sylvester Equation (2) is to reduce it to a linear system by taking the vector operator. Then, Equation (2) reduces to $P x = b$ where
$P = A 2 T ⊗ A 1 + A 4 T ⊗ A 3 , x = vec ( X ) , and b = vec ( E ) .$
Here, $vec ( · )$ is the vectorization operator and ⊗ is the Kronecker product. Hence, Equation (2) has a unique solution if and only if the square matrix P is invertible. However, it is not easy to compute $P − 1$ when the sizes of $A 1 , A 2 , A 3 , A 4$ are not small, since the size of P can be very large. Such a size problem leads to computational difficulty, in that excessive computer memory is required for the inversion of large matrices. An alternative approach transforms the coefficient matrices into a Schur or Hessenberg form, for which solutions may be readily computed; see [3,4].
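For a small system, the vectorized form above can be solved directly. The following Python/NumPy sketch (with randomly generated illustrative matrices, not taken from the paper) builds $P = A 2 T ⊗ A 1 + A 4 T ⊗ A 3$ and recovers X by solving $P x = vec ( E )$:

```python
import numpy as np

# Illustrative random instance of A1 X A2 + A3 X A4 = E (matrices are assumptions, not from the paper).
rng = np.random.default_rng(0)
m, n = 3, 2
A1, A3 = rng.standard_normal((m, m)), rng.standard_normal((m, m))
A2, A4 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
X_true = rng.standard_normal((m, n))
E = A1 @ X_true @ A2 + A3 @ X_true @ A4

# P = A2^T (x) A1 + A4^T (x) A3 acts on x = vec(X) (column-stacking convention).
P = np.kron(A2.T, A1) + np.kron(A4.T, A3)
x = np.linalg.solve(P, E.flatten(order="F"))   # solve P x = vec(E)
X = x.reshape((m, n), order="F")               # un-vectorize back to an m-by-n matrix
```

Note that P is $mn × mn$, which is exactly the size problem described above: for large m and n, forming and factoring P is impractical.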
For matrix equations of large dimensions, iterative algorithms that produce an approximate or exact solution are attractive. There are many techniques for constructing an iterative procedure for Equation (2) or its special cases, e.g., the matrix sign function [5], block successive over-relaxation [6], block recursion [7,8], Krylov subspaces [9,10], and truncated low-rank algorithms [11]. Lately, there have been some variants of Hermitian and skew-Hermitian splitting, e.g., a generalized modified Hermitian and skew-Hermitian splitting algorithm [12], an accelerated double-step scale splitting algorithm [13], the PHSS algorithm [14], and the four-parameter PSS algorithm [15]. Furthermore, the idea of the conjugate gradient leads to finite-step iterative methods that find the exact solution, such as the generalized conjugate direction algorithm [16], the conjugate gradient least-squares algorithm [17], and generalized product-type methods based on the bi-conjugate gradient algorithm [18].
In the last decade, many authors have developed gradient-based iterative (GI) algorithms for certain linear matrix equations that satisfy asymptotic stability (AS) in the following sense:
(AS): The sequence of approximated solutions converges to the exact solution, regardless of the initial value.
The first GI algorithm for solving (1) was developed by Ding and Chen [19]. In that paper, a sufficient condition in terms of a convergence factor is determined so that the algorithm satisfies the (AS) property. By introducing a relaxation parameter, Niu et al. [20] suggested a relaxed gradient-based iterative (RGI) algorithm for solving (1). Numerical studies show that, when the relaxation factor is selected correctly, the convergence of Niu’s algorithm is faster than that of Ding’s algorithm. Zhang and Sheng [21] introduced an RGI algorithm for finding the symmetric (skew-symmetric) solution of Equation (1). Xie et al. [22] improved the RGI algorithm into an accelerated gradient-based iterative (AGBI) algorithm, on the basis of the information generated in the previous half-step and a relaxation factor. Ding and Chen [23] also applied the ideas of gradients and least squares to formulate the least-squares iterative (LSI) algorithm. In [24], Fan et al. realized that the matrix multiplications in GI would take considerable time and space if $A 1$ and $A 2$ were large and dense, so they proposed the following Jacobi-gradient iterative (JGI) method.
Method 1
(Jacobi-Gradient based Iterative (JGI) algorithm [24]). For $i = 1 , 2$, let $D i$ be the diagonal part of $A i$. Given any initial matrices $X 1 ( 0 ) , X 2 ( 0 )$. Set $k = 0$ and compute $X ( 0 ) = ( 1 / 2 ) ( X 1 ( 0 ) + X 2 ( 0 ) )$. For $k = 1 , 2 , … , E n d$, do:
$X 1 ( k ) = X ( k − 1 ) + μ D 1 [ C − A 1 X ( k − 1 ) − X ( k − 1 ) A 2 ] , X 2 ( k ) = X ( k − 1 ) + μ [ C − A 1 X ( k − 1 ) − X ( k − 1 ) A 2 ] D 2 , X ( k ) = 1 2 ( X 1 ( k ) + X 2 ( k ) ) .$
After that, Tian et al. [25] proposed an accelerated Jacobi-gradient iterative (AJGI) algorithm for solving the Sylvester matrix equation, which relies on two relaxation factors and a half-step update. However, the parameter values for this algorithm are difficult to find, since they are given in terms of a nonlinear inequality. For the generalized Sylvester Equation (2), the gradient iterative (GI) algorithm [19] and the least-squares iterative (LSI) algorithm [26] were established as follows.
Method 2
(GI algorithm [19]). Given any two initial matrices $X 1 ( 0 ) , X 2 ( 0 )$. Set $k = 0$ and compute $X ( 0 ) = ( 1 / 2 ) ( X 1 ( 0 ) + X 2 ( 0 ) )$. For $k = 1 , 2 , … , E n d$, do:
$X 1 ( k ) = X ( k − 1 ) + μ A 1 T [ E − A 1 X ( k − 1 ) A 2 − A 3 X ( k − 1 ) A 4 ] A 2 T , X 2 ( k ) = X ( k − 1 ) + μ A 3 T [ E − A 1 X ( k − 1 ) A 2 − A 3 X ( k − 1 ) A 4 ] A 4 T , X ( k ) = 1 2 ( X 1 ( k ) + X 2 ( k ) ) .$
A sufficient condition for which the algorithm satisfies (AS) is
$0 < μ < 2 ‖ A 1 ‖ 2 2 ‖ A 2 ‖ 2 2 + ‖ A 3 ‖ 2 2 ‖ A 4 ‖ 2 2 .$
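As an illustration, the GI update above is straightforward to implement. The following Python/NumPy sketch applies Method 2 to a small, artificially constructed system (the matrices are illustrative assumptions, not from the paper), with μ taken inside the stated sufficient interval:

```python
import numpy as np

# One sweep of the GI algorithm (Method 2).
def gi_step(X, A1, A2, A3, A4, E, mu):
    R = E - A1 @ X @ A2 - A3 @ X @ A4          # common residual
    X1 = X + mu * A1.T @ R @ A2.T              # half-update from the first subsystem
    X2 = X + mu * A3.T @ R @ A4.T              # half-update from the second subsystem
    return 0.5 * (X1 + X2)                     # average the two half-updates

# Illustrative well-conditioned test system (an assumption, not from the paper).
A1, A2 = np.diag([2.0, 3.0]), np.eye(2)
A3, A4 = np.eye(2), np.diag([1.0, 2.0])
X_true = np.array([[1.0, -2.0], [0.5, 4.0]])
E = A1 @ X_true @ A2 + A3 @ X_true @ A4

# Half of the sufficient bound 2 / (||A1||^2 ||A2||^2 + ||A3||^2 ||A4||^2).
mu = 1.0 / (np.linalg.norm(A1, 2) ** 2 * np.linalg.norm(A2, 2) ** 2
            + np.linalg.norm(A3, 2) ** 2 * np.linalg.norm(A4, 2) ** 2)
X = np.zeros((2, 2))
for _ in range(300):
    X = gi_step(X, A1, A2, A3, A4, E, mu)
```

With these diagonal coefficients the iteration contracts every vectorized error mode, so X approaches the exact solution geometrically.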
Method 3
(LSI algorithm [26]). Given any two initial matrices $X 1 ( 0 ) , X 2 ( 0 )$. Set $k = 0$ and compute $X ( 0 ) = 1 2 ( X 1 ( 0 ) + X 2 ( 0 ) )$. For $k = 1 , 2 , … , E n d$, do:
$X 1 ( k ) = X ( k − 1 ) + μ ( A 1 T A 1 ) − 1 A 1 T [ E − A 1 X ( k − 1 ) A 2 − A 3 X ( k − 1 ) A 4 ] A 2 T ( A 2 A 2 T ) − 1 , X 2 ( k ) = X ( k − 1 ) + μ ( A 3 T A 3 ) − 1 A 3 T [ E − A 1 X ( k − 1 ) A 2 − A 3 X ( k − 1 ) A 4 ] A 4 T ( A 4 A 4 T ) − 1 , X ( k ) = 1 2 ( X 1 ( k ) + X 2 ( k ) ) .$
If $0 < μ < 4$, then the algorithm satisfies (AS).
In this paper, we propose a new iterative method for solving the generalized Sylvester matrix Equation (2), where $A 1 , A 3 ∈ R m × m$, $A 2 , A 4 ∈ R n × n$ and $X , E ∈ R m × n$. The algorithm requires only one initial value $X ( 0 )$ and only one parameter, called the convergence factor. We decompose the coefficient matrices into the sum of their diagonal parts and remainders. The recursive formula for the iteration is derived from the gradients of quadratic norm-error functions together with the hierarchical identification principle. Under assumptions on the signs of the real parts of the eigenvalues of the iteration matrix, we find necessary and sufficient conditions on the convergence factor for which (AS) holds. The convergence rate and error estimates are governed by the spectral radius of the iteration matrix. In particular, when the iteration matrix is symmetric, we obtain a convergence criterion, error estimates, and the optimal convergence factor in terms of spectral norms and the condition number. Moreover, numerical simulations are provided to illustrate our results for (2) and (1). We compare the efficiency of our algorithm to the LSI, GI, RGI, AGBI and JGI algorithms.
Let us recall some terminology from matrix analysis; see, e.g., [27]. For any square matrix X, denote by $σ ( X )$ its spectrum, by $ρ ( X )$ its spectral radius, and by $tr ( X )$ its trace. Let us denote the largest and smallest eigenvalues of a matrix by $λ max ( · )$ and $λ min ( · )$, respectively. Recall that the spectral norm $‖ · ‖ 2$ and the Frobenius norm $‖ · ‖ F$ of $A ∈ R m × n$ are, respectively, defined by
$\| A \|_2 = \sqrt{\lambda_{\max} ( A^T A )} \quad \text{and} \quad \| A \|_F = \sqrt{\operatorname{tr} ( A^T A )} .$
The condition number of $A ≠ 0$ is defined by
$\kappa_A = \sqrt{\dfrac{\lambda_{\max} ( A^T A )}{\lambda_{\min} ( A^T A )}} .$
Denote the real part of a complex number z by $ℜ ( z )$.
The rest of paper is organized as follows. We propose a modified Jacobi-gradient iterative algorithm in Section 2. Convergence criteria, convergence rate, error estimates, and optimal convergence factor are discussed in Section 3. In Section 4, we provide numerical simulations of the algorithm. Finally, we conclude the paper in Section 5.

## 2. A Modified Jacobi-Gradient Iterative Method for the Generalized Sylvester Equation

In this section, we propose an iterative algorithm for solving the generalized Sylvester equation, called a modified Jacobi-gradient iterative algorithm.
Throughout, let $m , n ∈ N$ and $A 1 , A 3 ∈ R m × m$, $A 2 , A 4 ∈ R n × n$ and $E ∈ R m × n$. We would like to find a matrix $X ∈ R m × n$, such that
$A 1 X A 2 + A 3 X A 4 = E .$
Write $A 1 = D 1 + F 1 , A 2 = D 2 + F 2 , A 3 = D 3 + F 3 and A 4 = D 4 + F 4$, where $D 1 , D 2 , D 3 , D 4$ are the diagonal parts of $A 1 , A 2 , A 3 , A 4$, respectively. A necessary and sufficient condition for (3) to have a unique solution is the invertibility of the square matrix
$P : = A 2 T ⊗ A 1 + A 4 T ⊗ A 3 .$
In this case, the solution is given by $vec X = P − 1 vec E$.
To obtain an iterative algorithm for solving (3), we recall the hierarchical identification principle of [19]. We rewrite (3) as
$( D 1 + F 1 ) X ( D 2 + F 2 ) + A 3 X A 4 = E ,$
$A 1 X A 2 + ( D 3 + F 3 ) X ( D 4 + F 4 ) = E .$
Define two matrices
$M : = E − F 1 X D 2 − D 1 X F 2 − F 1 X F 2 − A 3 X A 4 ,$
$N : = E − F 3 X D 4 − D 3 X F 4 − F 3 X F 4 − A 1 X A 2 .$
From (4) and (5), we shall find the approximated solution of the following two subsystems
$D 1 X D 2 = M and D 3 X D 4 = N$
so that the following norm-error functions are minimized:
$L 1 ( X ) : = ‖ D 1 X D 2 − M ‖ F 2 and L 2 ( X ) : = ‖ D 3 X D 4 − N ‖ F 2 .$
Using the derivative formula
$\dfrac{d}{d X} \operatorname{tr} ( A X ) = A^T ,$
we can deduce the gradient of the error $L 1$ as follows:
$\dfrac{\partial}{\partial X} L_1 ( X ) = \dfrac{\partial}{\partial X} \operatorname{tr} \big[ ( D_1 X D_2 - M )^T ( D_1 X D_2 - M ) \big] = \dfrac{\partial}{\partial X} \big[ \operatorname{tr} ( X^T D_1^2 X D_2^2 ) - 2 \operatorname{tr} ( X^T D_1 M D_2 ) + \operatorname{tr} ( M^T M ) \big] = 2 D_1^2 X D_2^2 - 2 D_1 M D_2 = 2 D_1 ( D_1 X D_2 - M ) D_2 .$
Similarly, we have
$\dfrac{\partial}{\partial X} L_2 ( X ) = 2 D_3 ( D_3 X D_4 - N ) D_4 .$
Let $X 1 ( k )$ and $X 2 ( k )$ be the estimates or iterative solutions of the system (6) at k-th iteration. The recursive formulas of $X 1 ( k )$ and $X 2 ( k )$ come from the gradient formulas (8) and (9), as follows:
$X_1 ( k ) = X ( k - 1 ) + \mu D_1 \big( M - D_1 X ( k - 1 ) D_2 \big) D_2 = X ( k - 1 ) + \mu D_1 \big( E - A_1 X ( k - 1 ) A_2 - A_3 X ( k - 1 ) A_4 \big) D_2 , \quad X_2 ( k ) = X ( k - 1 ) + \mu D_3 \big( N - D_3 X ( k - 1 ) D_4 \big) D_4 = X ( k - 1 ) + \mu D_3 \big( E - A_1 X ( k - 1 ) A_2 - A_3 X ( k - 1 ) A_4 \big) D_4 .$
Based on the hierarchical identification principle, the unknown variable X is replaced by its estimates at the $( k − 1 )$-th iteration. To avoid duplicated computation, we introduce a matrix
$S ( k ) = E − ( A 1 X ( k ) A 2 + A 3 X ( k ) A 4 ) ,$
so we have
$X ( k ) = \frac{1}{2} \big( X_1 ( k ) + X_2 ( k ) \big) = X ( k - 1 ) + \mu \big( D_1 S ( k - 1 ) D_2 + D_3 S ( k - 1 ) D_4 \big) ,$ where the constant factor $\frac{1}{2}$ produced by the average has been absorbed into the convergence factor $\mu$.
Since any diagonal matrix is sparse, the operation counts in the computation (10) can be substantially reduced. Let us denote $S ( k ) = [ s i j ( k ) ]$, $X ( k ) = [ x i j ( k ) ]$, and $D l = [ d i j ( l ) ]$ for each $l = 1 , 2 , 3 , 4$. Indeed, the multiplication of $D 1 S ( k ) D 2$ results in a matrix whose $( i j )$-th entry is the product of the i-th entry in the diagonal of $D 1$, the $( i j )$-th entry of $S ( k )$, and the j-th entry of $D 2$—i.e., $D 1 S ( k ) D 2 = [ d i i ( 1 ) s i j ( k ) d j j ( 2 ) ]$. Similarly, $D 3 S ( k ) D 4 = [ d i i ( 3 ) s i j ( k ) d j j ( 4 ) ]$. Thus,
$D 1 S ( k ) D 2 + D 3 S ( k ) D 4 = [ ( d i i ( 1 ) d j j ( 2 ) + d i i ( 3 ) d j j ( 4 ) ) s i j ( k ) ] .$
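This elementwise identity is what makes the update cheap, and it is easy to verify numerically. The sketch below (random illustrative diagonals, not from the paper) compares the dense matrix products against the $O ( m n )$ elementwise form:

```python
import numpy as np

# Check of the elementwise identity D1 S D2 + D3 S D4 = [(d1_ii d2_jj + d3_ii d4_jj) s_ij].
rng = np.random.default_rng(2)
m, n = 4, 3
d1, d3 = rng.standard_normal(m), rng.standard_normal(m)
d2, d4 = rng.standard_normal(n), rng.standard_normal(n)
S = rng.standard_normal((m, n))

dense = np.diag(d1) @ S @ np.diag(d2) + np.diag(d3) @ S @ np.diag(d4)
elementwise = (np.outer(d1, d2) + np.outer(d3, d4)) * S   # O(mn) work, no matrix products

assert np.allclose(dense, elementwise)
```

Note that the weight matrix $[ d_{ii}^{(1)} d_{jj}^{(2)} + d_{ii}^{(3)} d_{jj}^{(4)} ]$ does not depend on k, so it can be precomputed once before the iteration starts.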
The above discussion leads to the following Algorithm 1.
 Algorithm 1: Modified Jacobi-gradient based iterative (MJGI) algorithm
The operation count for each step of the algorithm is $2 m n ( m + n + 5 )$. When $m = n$, this count is $4 n^3 + 10 n^2 \in O ( n^3 )$, so the runtime complexity per iteration is cubic. The convergence of the algorithm relies on the convergence factor $μ$; the appropriate range for this parameter is determined in the next section.
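The update (10) can be implemented with only elementwise work beyond the residual computation, as the following Python/NumPy sketch shows (the function name, zero initial value, and fixed iteration count are illustrative assumptions, not part of the paper's statement):

```python
import numpy as np

def mjgi(A1, A2, A3, A4, E, mu, iters=400):
    """MJGI sketch: X(k) = X(k-1) + mu * (D1 S(k-1) D2 + D3 S(k-1) D4),
    where S(k) = E - A1 X(k) A2 - A3 X(k) A4 and Di is the diagonal part of Ai."""
    d1, d3 = np.diag(A1), np.diag(A3)
    d2, d4 = np.diag(A2), np.diag(A4)
    W = np.outer(d1, d2) + np.outer(d3, d4)    # fixed elementwise weight matrix
    X = np.zeros(E.shape)
    for _ in range(iters):
        S = E - A1 @ X @ A2 - A3 @ X @ A4      # residual S(k-1)
        X = X + mu * W * S                     # elementwise update, O(mn) beyond the residual
    return X
```

In practice one would stop once the residual norm falls below a tolerance rather than after a fixed number of sweeps.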

## 3. Convergence Analysis of the Proposed Method

In this section, we carry out a convergence analysis of Algorithm 1. First, we transform it into a linear iterative process of the first order, $x ( k ) = T x ( k − 1 )$, where $x ( k )$ is a vector variable and T is a matrix. The iteration matrix T reflects the convergence criteria, convergence rate, and error estimates of the algorithm.

#### 3.1. Convergence Criteria

Theorem 1.
Assume that the generalized Sylvester matrix Equation (3) has a unique solution $X *$. Denote $H = D ( P ) P$, where $D ( P )$ is the diagonal part of P, and write $σ ( H ) = \{ λ 1 , … , λ m n \}$. Let ${ X ( k ) }$ be a sequence generated from Algorithm 1.
(1)
Then, (AS) holds if and only if$ρ ( I m n − μ H ) < 1$.
(2)
If$ℜ ( λ j ) > 0$for all$j = 1 , … , m n$, then (AS) holds if and only if
$0 < \mu < \min_{j = 1 , \ldots , m n} \dfrac{2 \, \Re ( \lambda_j )}{| \lambda_j |^2} .$
(3)
If$ℜ ( λ j ) < 0$for all$j = 1 , … , m n$, then (AS) holds if and only if
$\max_{j = 1 , \ldots , m n} \dfrac{2 \, \Re ( \lambda_j )}{| \lambda_j |^2} < \mu < 0 .$
(4)
If H is symmetric, then (AS) holds if and only if$λ max ( H )$and$λ min ( H )$have the same sign, and μ is chosen so that
$0 < \mu < \dfrac{2}{\lambda_{\max} ( H )}$ if $\lambda_{\min} ( H ) > 0$, and $\dfrac{2}{\lambda_{\min} ( H )} < \mu < 0$ if $\lambda_{\max} ( H ) < 0$.
Proof.
Define $\tilde{X} ( k ) = X ( k ) - X^* , \ \tilde{X}_1 ( k ) = X_1 ( k ) - X^*$ and $\tilde{X}_2 ( k ) = X_2 ( k ) - X^* .$
We will show that $X ˜ ( k ) → 0 ,$ or equivalently, $vec X ˜ ( k ) → 0$ as $k → ∞$. A direct computation reveals that
$\tilde{X} ( k ) = \tilde{X} ( k - 1 ) - \mu D_1 \big( A_1 \tilde{X} ( k - 1 ) A_2 + A_3 \tilde{X} ( k - 1 ) A_4 \big) D_2 - \mu D_3 \big( A_1 \tilde{X} ( k - 1 ) A_2 + A_3 \tilde{X} ( k - 1 ) A_4 \big) D_4 .$
By taking the vector operator and using properties of the Kronecker product, we have
$\operatorname{vec} \tilde{X} ( k ) = \operatorname{vec} \tilde{X} ( k - 1 ) - \mu \operatorname{vec} \big( D_1 A_1 \tilde{X} ( k - 1 ) A_2 D_2 + D_1 A_3 \tilde{X} ( k - 1 ) A_4 D_2 \big) - \mu \operatorname{vec} \big( D_3 A_1 \tilde{X} ( k - 1 ) A_2 D_4 + D_3 A_3 \tilde{X} ( k - 1 ) A_4 D_4 \big) = \big\{ I_{mn} - \mu \big[ ( D_2 \otimes D_1 ) ( A_2^T \otimes A_1 ) + ( D_2 \otimes D_1 ) ( A_4^T \otimes A_3 ) + ( D_4 \otimes D_3 ) ( A_2^T \otimes A_1 ) + ( D_4 \otimes D_3 ) ( A_4^T \otimes A_3 ) \big] \big\} \operatorname{vec} \tilde{X} ( k - 1 ) = \big[ I_{mn} - \mu ( D_2 \otimes D_1 + D_4 \otimes D_3 ) ( A_2^T \otimes A_1 + A_4^T \otimes A_3 ) \big] \operatorname{vec} \tilde{X} ( k - 1 ) .$
Let us denote the diagonal part of P by $D ( P )$. Indeed,
$D ( P ) = D 2 ⊗ D 1 + D 4 ⊗ D 3 .$
Thus, we arrive at a linear iterative process
$vec X ˜ ( k ) = [ I m n − μ H ] vec X ˜ ( k − 1 ) ,$
where $H = D ( P ) P$. Hence, the following statements are equivalent:
(i)
$vec X ˜ ( k ) → 0$ for any initial value $vec X ˜ ( 0 )$.
(ii)
System (11) has an asymptotically-stable zero solution.
(iii)
The iteration matrix $I m n − μ H$ has spectral radius less than 1.
Indeed, since $I m n − μ H$ is a polynomial of H, we get
$ρ ( I m n − μ H ) = max λ ∈ σ ( H ) | 1 − μ λ | .$
Thus, $ρ ( I m n − μ H ) < 1$ if and only if $| 1 − μ λ | < 1$ for all $λ ∈ σ ( H )$. Write $λ j = a j + i b j$ with $a j , b j ∈ R$. It follows that the condition $| 1 − μ λ j | < 1$ is equivalent to $( 1 - \mu \lambda_j ) \overline{( 1 - \mu \lambda_j )} < 1$, or
$μ ( − 2 a j + μ ( a j 2 + b j 2 ) ) < 0 .$
Thus, we arrive at two alternative conditions:
(i)
$μ > 0$ and $− 2 a j + μ ( a j 2 + b j 2 ) < 0$ for all $j = 1 , 2 , 3 , … , m n$;
(ii)
$μ < 0$ and $− 2 a j + μ ( a j 2 + b j 2 ) > 0$ for all $j = 1 , 2 , 3 , … , m n$.
Case 1
$a j = ℜ ( λ j ) > 0$ for all j. In this case, $ρ ( I m n − μ H ) < 1$ if and only if
$0 < \mu < \min_{j = 1 , \ldots , m n} \dfrac{2 a_j}{a_j^2 + b_j^2} .$
Case 2
$a j = ℜ ( λ j ) < 0$ for all j. In this case, $ρ ( I m n − μ H ) < 1$ if and only if
$\max_{j = 1 , \ldots , m n} \dfrac{2 a_j}{a_j^2 + b_j^2} < \mu < 0 .$
Now, suppose that H is a symmetric matrix. Then $I m n − μ H$ is also symmetric, and thus all its eigenvalues are real. Hence,
$\rho ( I_{mn} - \mu H ) = \max \big\{ | 1 - \mu \lambda_{\min} ( H ) | , \; | 1 - \mu \lambda_{\max} ( H ) | \big\} .$
It follows that $ρ ( I m n − μ H ) < 1$ if and only if
$0 < μ λ min ( H ) < 2 and 0 < μ λ max ( H ) < 2 .$
So, $λ min ( H )$ and $λ max ( H )$ cannot be zero.
Case 1
If $λ max ( H ) ≥ λ min ( H ) > 0 ,$ then the condition (16) is equivalent to
$0 < \mu < \dfrac{2}{\lambda_{\max} ( H )} .$
Case 2
If $λ min ( H ) ≤ λ max ( H ) < 0 ,$ then the condition (16) is equivalent to
$\dfrac{2}{\lambda_{\min} ( H )} < \mu < 0 .$
Case 3
If $λ min ( H ) < 0 < λ max ( H )$, then the condition (16) would require both $μ < 0$ and $μ > 0$ simultaneously, which is impossible.
Therefore, the condition (16) holds if and only if $λ max ( H )$ and $λ min ( H )$ have the same sign and $μ$ is chosen according to the above conditions. □
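Theorem 1 is straightforward to check numerically. The sketch below constructs illustrative, diagonally dominant coefficients (an assumption made so that every $λ j$ has positive real part), forms $H = D ( P ) P$, and verifies that any μ below the bound in part (2) yields $ρ ( I m n − μ H ) < 1$:

```python
import numpy as np

# Numerical check of Theorem 1(2). The coefficients are illustrative and diagonally
# dominant, an assumption made so that every eigenvalue of H has positive real part.
rng = np.random.default_rng(3)
A1 = np.diag([3.0, 4.0, 5.0]) + 0.05 * rng.standard_normal((3, 3))
A3 = 2.0 * np.eye(3) + 0.05 * rng.standard_normal((3, 3))
A2 = np.eye(3)
A4 = np.eye(3)

P = np.kron(A2.T, A1) + np.kron(A4.T, A3)
H = np.diag(np.diag(P)) @ P                    # H = D(P) P
lam = np.linalg.eigvals(H)
assert np.all(lam.real > 0)                    # hypothesis of part (2)

mu_bound = np.min(2.0 * lam.real / np.abs(lam) ** 2)
mu = 0.5 * mu_bound                            # any value in (0, mu_bound) works
rho = np.max(np.abs(1.0 - mu * lam))           # spectral radius of I - mu*H
assert rho < 1.0
```

The last line holds because the eigenvalues of $I − μ H$ are exactly $1 − μ λ j$.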

#### 3.2. Convergence Rate and Error Estimate

We now discuss the convergence rate and error estimates of Algorithm 1 from the iterative process (11).
Suppose that Algorithm 1 satisfies the (AS) property—i.e., $ρ ( I m n − μ H ) < 1$. From (11), we have
$‖ X ( k ) − X * ‖ F = ‖ vec X ˜ ( k ) ‖ F = ‖ ( I m n − μ H ) vec X ˜ ( k − 1 ) ‖ F ≤ ‖ I m n − μ H ‖ 2 ‖ X ˜ ( k − 1 ) ‖ F = ‖ I m n − μ H ‖ 2 ‖ X ( k − 1 ) − X * ‖ F .$
It follows inductively that for each $k ∈ N$,
$‖ X ( k ) − X * ‖ F ≤ ‖ I m n − μ H ‖ 2 k ‖ X ( 0 ) − X * ‖ F .$
Hence, the spectral norm of $I m n − μ H$ describes how fast the approximated solution $X ( k )$ converges to the exact solution $X *$: the smaller this norm, the faster $X ( k )$ approaches $X *$. In that case, since $‖ I m n − μ H ‖ 2 < 1$, if $‖ X ( k − 1 ) − X * ‖ F ≠ 0$ (i.e., $X ( k − 1 )$ is not the exact solution), then
$‖ X ( k ) − X * ‖ F < ‖ X ( k − 1 ) − X * ‖ F .$
Thus, the error at each iteration gets smaller than the previous one.
The above discussion is summarized in the following theorem.
Theorem 2.
Suppose that the parameter μ is chosen as in Theorem 1, so that Algorithm 1 satisfies (AS). Then, the convergence rate of the algorithm is governed by the spectral norm of the iteration matrix $I m n − μ H$. Moreover, the error $‖ X ( k ) − X * ‖ F$ is estimated relative to the previous step and to the first step by (17) and (18), respectively. In particular, the error at each iteration is smaller than the (nonzero) previous one, as in (19).
If the eigenvalues of $μ H$ are close to 1, then the spectral radius of the iteration matrix is close to 0; hence, the error $vec X ˜ ( k )$, or equivalently $X ˜ ( k )$, converges to 0 faster.
Remark 1.
The convergence criteria and the convergence rate of Algorithm 1 depend on $A 1 , A 2 , A 3$ and $A 4$, but not on E. However, the matrix E can be used in the stopping criterion.
The next proposition determines the number of iterations after which the approximated solution $X ( k )$ is close to the exact solution $X *$ in the sense that $‖ X ( k ) − X * ‖ F < ϵ$.
Proposition 1.
According to Algorithm 1, for each given error $ϵ > 0$, we have $‖ X ( k ) − X * ‖ F < ϵ$ after $k^*$ iterations, for any $k^*$ such that
$k^* > \dfrac{\log \epsilon - \log \| X ( 0 ) - X^* \|_F}{\log \| I_{mn} - \mu H \|_2} .$
Proof.
From the estimation (18), we have
$‖ X ( k ) − X * ‖ F ≤ ‖ I m n − μ H ‖ 2 k ‖ X ( 0 ) − X * ‖ F → 0 as k → ∞ .$
This means precisely that for each given $ϵ > 0$, there is a $k * ∈ N$ such that for all $k ≥ k *$,
$‖ I m n − μ H ‖ 2 k ‖ X ( 0 ) − X * ‖ F < ϵ .$
Taking logarithms, we have that the above condition is equivalent to (20). Thus, if we run Algorithm 1 $k *$ times, then we get $‖ X ( k ) − X * ‖ < ϵ$ as desired. □
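The bound of Proposition 1 can be checked on a small instance. In the Python/NumPy sketch below (illustrative diagonal coefficients, an assumption made so that $‖ I m n − μ H ‖ 2$ can be read off directly), running $k^*$ iterations of the update (10) brings the error below ϵ:

```python
import numpy as np

# Worked check of Proposition 1 on a small system with diagonal coefficients,
# chosen so that H = D(P)P is diagonal and ||I - mu H||_2 is just max |1 - mu*lambda|.
A1, A2 = np.diag([2.0, 3.0]), np.eye(2)
A3, A4 = np.eye(2), np.diag([1.0, 2.0])
X_true = np.array([[1.0, 2.0], [-1.0, 0.5]])
E = A1 @ X_true @ A2 + A3 @ X_true @ A4

P = np.kron(A2.T, A1) + np.kron(A4.T, A3)      # diagonal here
H = np.diag(np.diag(P)) @ P
mu = 0.05
T_norm = np.max(np.abs(1.0 - mu * np.diag(H))) # = ||I - mu H||_2 for diagonal H

eps = 1e-6
X = np.zeros((2, 2))
err0 = np.linalg.norm(X - X_true)              # ||X(0) - X*||_F
k_star = int(np.ceil((np.log(eps) - np.log(err0)) / np.log(T_norm))) + 1

W = np.outer(np.diag(A1), np.diag(A2)) + np.outer(np.diag(A3), np.diag(A4))
for _ in range(k_star):
    S = E - A1 @ X @ A2 - A3 @ X @ A4
    X = X + mu * W * S                         # MJGI update (10)
```

After the loop, $‖ X ( k^* ) − X^* ‖ F < ϵ$ as the proposition guarantees.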

#### 3.3. Optimal Parameter

We discuss the fastest convergence factor for Algorithm 1.
Theorem 3.
The optimal convergence factor μ for which Algorithm 1 satisfies (AS) is one that minimizes $‖ I m n − μ H ‖ 2$. If, in addition, H is symmetric, then the optimal convergence factor for which the algorithm satisfies (AS) is determined by
$\mu_{opt} = \dfrac{2}{\lambda_{\min} ( H ) + \lambda_{\max} ( H )} .$
In this case, the convergence rate is governed by
$\rho ( I_{mn} - \mu_{opt} H ) = \dfrac{\lambda_{\max} ( H ) - \lambda_{\min} ( H )}{\lambda_{\max} ( H ) + \lambda_{\min} ( H )} = \dfrac{\kappa - 1}{\kappa + 1} ,$
where κ denotes the condition number of H, and we have the following estimates:
$\| X ( k ) - X^* \|_F \le \dfrac{\kappa - 1}{\kappa + 1} \, \| X ( k - 1 ) - X^* \|_F ,$
$\| X ( k ) - X^* \|_F \le \left( \dfrac{\kappa - 1}{\kappa + 1} \right)^{k} \| X ( 0 ) - X^* \|_F .$
Proof.
From Theorem 2, it is clear that the fastest convergence factor is attained at a convergence factor that minimizes $‖ I m n − μ H ‖ 2$. Now, assume that H is symmetric. Then, $I m n − μ H$ is also symmetric, thus all its eigenvalues are real and
$‖ I m n − μ H ‖ 2 = ρ ( I m n − μ H ) .$
For convenience, denote $a = λ min ( H )$, $b = λ max ( H )$, and
$f ( \mu ) := \rho ( I_{mn} - \mu H ) = \max \big\{ | 1 - \mu a | , \; | 1 - \mu b | \big\} .$
First, we consider the case $λ min ( H ) > 0$. To obtain the fastest convergence factor, according to (15), we must solve the following optimization problem
$\min_{0 < \mu < 2 / \lambda_{\max} ( H )} \| I_{mn} - \mu H \|_2 = \min_{0 < \mu < 2 / b} f ( \mu ) .$
We obtain that the minimizer is given by $μ o p t = 2 / ( a + b )$, so that $f ( μ o p t ) = ( b − a ) / ( b + a )$. For the case $λ max ( H ) < 0$, we solve the following optimization problem
$\min_{2 / \lambda_{\min} ( H ) < \mu < 0} \| I_{mn} - \mu H \|_2 = \min_{2 / a < \mu < 0} f ( \mu ) .$
A similar argument yields the same minimizer (21) and the same convergence rate (22). From (17), (18) and (25), we obtain the bounds (23) and (24). □
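The optimality claim is easy to verify numerically when H is symmetric. In the sketch below (H is an illustrative diagonal matrix, chosen for transparency), $μ_{opt} = 2 / ( λ_{\min} + λ_{\max} )$ attains the rate $( κ − 1 ) / ( κ + 1 )$ and nearby step sizes do no better:

```python
import numpy as np

# Optimal convergence factor for a symmetric H (illustrative diagonal H, an assumption).
H = np.diag([9.0, 16.0, 16.0, 25.0])
eigs = np.linalg.eigvalsh(H)                   # ascending order
a, b = eigs[0], eigs[-1]                       # lambda_min(H), lambda_max(H)

mu_opt = 2.0 / (a + b)
rho_opt = np.max(np.abs(1.0 - mu_opt * eigs))  # spectral radius of I - mu_opt H
assert np.isclose(rho_opt, (b - a) / (b + a))  # = (kappa - 1)/(kappa + 1)

# No other step size does better:
for mu in (0.5 * mu_opt, 0.9 * mu_opt, 1.1 * mu_opt):
    assert np.max(np.abs(1.0 - mu * eigs)) >= rho_opt - 1e-12
```

At $μ_{opt}$ the two extreme modes contract at exactly the same speed, which is why this choice balances $| 1 − μ a |$ against $| 1 − μ b |$.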

## 4. Numerical Simulations

In this section, we report numerical results that illustrate the effectiveness of Algorithm 1. We consider various sizes of matrix systems, namely small $( 2 × 2 )$, medium $( 10 × 10 )$ and large $( 100 × 100 )$. For the generalized Sylvester equation, we compare the performance of Algorithm 1 to the GI and LSI algorithms. For the Sylvester equation, we compare our algorithm with the GI, RGI, AGBI and JGI algorithms. All iterations were carried out in the same environment: MATLAB R2017b, Intel(R) Core(TM) i7-7660U CPU @ 2.5GHz, RAM 8.00 GB, bus speed 2133 MHz. We abbreviate IT and CPU for the iteration count and the CPU time (in seconds), respectively. At the k-th step of the iteration, we consider the following error:
$δ ( k ) : = ‖ E − A 1 X ( k ) A 2 − A 3 X ( k ) A 4 ‖ F$
where $X ( k )$ is the k-th approximated solution of the corresponding system.

#### 4.1. Numerical Simulation for the Generalized Sylvester Matrix Equation

Example 1.
Consider the matrix equation $A 1 X A 2 + A 3 X A 4 = E$ where
$A 1 = 0.6959 − 0.6385 0.6999 0.0336 , A 2 = − 0.0688 − 0.5309 0.3196 0.6544 , A 3 = 0.4076 0.7184 − 0.8200 0.9686 , A 4 = 0.5313 0.1056 0.3251 0.6110 , E = 0.7788 0.0908 0.4235 0.2665 .$
Then, the exact solution of X is
$X * = 1.3036 − 0.0532 1.2725 1.2284 .$
Choose $X ( 0 ) = zeros ( 2 )$. In this case, all eigenvalues of H have positive real parts. The effect of changing the convergence factor μ is illustrated in Figure 1. According to Theorem 1, the criterion for the convergence of $X ( k )$ is that $μ ∈ ( 0 , 4.1870 )$. Since $μ 1 , μ 2 , μ 3 , μ 4$ satisfy this criterion, the error becomes smaller and goes to zero as k increases, as in Figure 1. Among them, $μ 4 = 4.0870$ gives the fastest convergence. For $μ 5$ and $μ 6$, which do not meet the criterion, the error $δ ( k )$ does not converge to zero.
Example 2.
Suppose that $A 1 X A 2 + A 3 X A 4 = E ,$ where $A 1 , A 2 , A 3 , A 4 and E$ are $10 × 10$ matrices where
$A 1 = tridiag ( 1 , 3 , − 1 ) , A 2 = tridiag ( 1 , 1 , − 2 ) , A 3 = tridiag ( − 2 , − 2 , 3 ) , A 4 = tridiag ( − 3 , 2 , − 1 ) and E = heptadiag ( 1 , − 2 , 1 , − 2 , − 2 , 1 , − 3 ) .$
Here, E is a heptadiagonal matrix, i.e., a band matrix with bandwidth 3. Choose an initial matrix $X ( 0 ) = zeros ( 10 )$, where $zeros ( n )$ is an n-by-n matrix containing 0 in every position. We compare Algorithm 1 with the direct method and the LSI and GI algorithms. Table 1 shows the errors at the final step of the iteration, as well as the computation time after 75 iterations. Figure 2 illustrates that the approximated solutions via LSI diverge, while those via GI and MJGI converge. Table 1 and Figure 2 imply that our algorithm requires significantly less computation time and attains smaller errors than the others.
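The $tridiag ( a , b , c )$ notation is not spelled out in the text; a reasonable reading, assumed here, is a matrix with sub-diagonal a, main diagonal b and super-diagonal c. A small helper makes such test matrices easy to reproduce:

```python
import numpy as np

# Assumed convention: tridiag(a, b, c) has sub-diagonal a, main diagonal b, super-diagonal c.
def tridiag(a, b, c, n):
    return a * np.eye(n, k=-1) + b * np.eye(n) + c * np.eye(n, k=1)

A1 = tridiag(1, 3, -1, 10)   # the paper's A1 = tridiag(1, 3, -1), under this convention
```

The same helper, applied band by band, would produce the heptadiagonal right-hand side E.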
Example 3.
We consider the equation $A 1 X A 2 + A 3 X A 4 = E$ in which $A 1 , A 2 , A 3 , A 4 and E$ are $100 × 100$ matrices determined by
$A 1 = tridiag ( 1 , 1 , − 1 ) , A 2 = tridiag ( 1 , 2 , − 2 ) , A 3 = tridiag ( − 1 , − 2 , 3 ) , A 4 = tridiag ( − 2 , 1 , − 1 ) and E = heptadiag ( 1 , 2 , − 4 , 1 , − 2 , 2 , − 3 ) .$
The initial matrix is given by $X ( 0 ) = zeros ( 100 ) .$ We run LSI, GI and MJGI algorithms by using
$μ = 0.1 , μ = ( ‖ A 1 ‖ 2 ‖ A 2 ‖ 2 + ‖ A 3 ‖ 2 ‖ A 4 ‖ 2 ) − 1 , μ = 2 ( ‖ A 1 ‖ 2 ‖ A 2 ‖ 2 + ‖ A 3 ‖ 2 ‖ A 4 ‖ 2 ) − 1 ,$
respectively. The reported result in Table 2 and Figure 3 illustrate that the approximated solution generated from LSI diverges, while those from GI or MJGI converge. Both computational time and the error $δ ( 100 )$ from MJGI are less than those from GI.

#### 4.2. Numerical Simulation for Sylvester Matrix Equation

Assume that the Sylvester equation
$A 1 X + X A 4 = E$
has a unique solution. This condition is equivalent to the invertibility of the Kronecker sum $A 4 T ⊕ A 1$, i.e., all possible sums of an eigenvalue of $A 1$ and an eigenvalue of $A 4$ are nonzero. To solve (26), we propose Algorithm 2, which uses the residual
$T ( k ) : = E − A 1 X ( k ) − X ( k ) A 4 .$
 Algorithm 2: Modified Jacobi-gradient based iterative (MJGI) algorithm for Sylvester equation
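Since Algorithm 2 specializes Algorithm 1 to $A 2 = A 3 = I$, its update should reduce to $X ( k ) = X ( k − 1 ) + μ ( D 1 T ( k − 1 ) + T ( k − 1 ) D 4 )$. The following Python/NumPy sketch implements this reading (the function name and fixed iteration count are illustrative assumptions):

```python
import numpy as np

def mjgi_sylvester(A1, A4, E, mu, iters=400):
    """Sketch of Algorithm 2, read as Algorithm 1 specialized to A2 = A3 = I:
    X(k) = X(k-1) + mu * (D1 T(k-1) + T(k-1) D4)."""
    D1 = np.diag(np.diag(A1))                  # diagonal part of A1
    D4 = np.diag(np.diag(A4))                  # diagonal part of A4
    X = np.zeros(E.shape)
    for _ in range(iters):
        T = E - A1 @ X - X @ A4                # residual T(k-1)
        X = X + mu * (D1 @ T + T @ D4)
    return X
```

As in Algorithm 1, the products with $D 1$ and $D 4$ amount to row and column scalings, so each sweep costs little beyond forming the residual.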
Example 4.
Consider the equation $A 1 X + X A 4 = E ,$ in which E is the same matrix as in the previous example,
$A 1 = tridiag ( 2 , − 1 , 1 ) ∈ R 10 × 10 and A 4 = tridiag ( − 1 , 1 , − 2 ) ∈ R 10 × 10 .$
In this case, all eigenvalues of the iteration matrix have positive real parts, so our algorithm is applicable. We compare our algorithm with the GI, RGI, AGBI and JGI algorithms. The results after running 100 iterations are shown in Figure 4 and Table 3. According to the errors and CPU times in Table 3 and Figure 4, our algorithm uses less computation time and attains smaller errors than the others.

## 5. Conclusions and Suggestion

A modified Jacobi-gradient (MJGI) algorithm (Algorithm 1) is proposed for solving the generalized Sylvester matrix Equation (3). In order for the MJGI algorithm to be applicable for any size of matrix system and any initial matrix, the convergence factor $μ$ must be chosen properly, according to Theorem 1. In this case, the iteration matrix $I m n − μ H$ has a spectral radius less than 1. When the iteration matrix is symmetric, we determine the optimal convergence factor $μ o p t$, for which the algorithm attains its fastest convergence rate. The asymptotic convergence rate of the algorithm is governed by the spectral radius of $I m n − μ H$; thus, if the eigenvalues of $μ H$ are close to 1, the algorithm converges faster in the long run. The numerical examples reveal that our algorithm is suitable for small ($2 × 2$), medium ($10 × 10$) and large ($100 × 100$) matrix systems. In addition, the MJGI algorithm performs well compared to recent gradient-based iterative algorithms. For future work, we may add another parameter in an updating step to make the algorithm converge faster; see [25]. Another possible direction is to apply the ideas of this paper to derive iterative algorithms for nonlinear matrix equations.

## Author Contributions

Supervision, P.C.; software, N.S.; writing—original draft preparation, N.S.; writing—review and editing, P.C. All authors contributed equally and significantly in writing this article. All authors have read and agreed to the published version of the manuscript.

## Funding

This research received no external funding.

## Acknowledgments

The first author received financial support through the RA-TA graduate scholarship from the Faculty of Science, King Mongkut’s Institute of Technology Ladkrabang, Grant No. RA/TA-2562-M-001, during his Master’s study.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Shang, Y. Consensus seeking over Markovian switching networks with time-varying delays and uncertain topologies. Appl. Math. Comput. 2016, 273, 1234–1245. [Google Scholar] [CrossRef]
2. Shang, Y. Average consensus in multi-agent systems with uncertain topologies and multiple time-varying delays. Linear Algebra Appl. 2014, 459, 411–429. [Google Scholar] [CrossRef]
3. Golub, G.H.; Nash, S.; Van Loan, C.F. A Hessenberg-Schur method for the matrix AX + XB = C. IEEE Trans. Automat. Control. 1979, 24, 909–913. [Google Scholar] [CrossRef] [Green Version]
4. Ding, F.; Chen, T. Hierarchical least squares identification methods for multivariable systems. IEEE Trans. Automat. Control 1997, 42, 408–411. [Google Scholar] [CrossRef]
5. Benner, P.; Quintana-Orti, E.S. Solving stable generalized Lyapunov equations with the matrix sign function. Numer. Algorithms 1999, 20, 75–100. [Google Scholar] [CrossRef]
6. Starke, G.; Niethammer, W. SOR for AX − XB = C. Linear Algebra Appl. 1991, 154–156, 355–375. [Google Scholar] [CrossRef] [Green Version]
7. Jonsson, I.; Kagstrom, B. Recursive blocked algorithms for solving triangular systems—Part I: One-sided and coupled Sylvester-type matrix equations. ACM Trans. Math. Softw. 2002, 28, 392–415. [Google Scholar] [CrossRef]
8. Jonsson, I.; Kagstrom, B. Recursive blocked algorithms for solving triangular systems—Part II: Two-sided and generalized Sylvester and Lyapunov matrix equations. ACM Trans. Math. Softw. 2002, 28, 416–435. [Google Scholar] [CrossRef] [Green Version]
9. Kaabi, A.; Kerayechian, A.; Toutounian, F. A new version of successive approximations method for solving Sylvester matrix equations. Appl. Math. Comput. 2007, 186, 638–648. [Google Scholar] [CrossRef]
10. Lin, Y.Q. Implicitly restarted global FOM and GMRES for nonsymmetric matrix equations and Sylvester equations. Appl. Math. Comput. 2005, 167, 1004–1025. [Google Scholar] [CrossRef]
11. Kressner, D.; Sirkovic, P. Truncated low-rank methods for solving general linear matrix equations. Numer. Linear Algebra Appl. 2015, 22, 564–583. [Google Scholar] [CrossRef]
12. Dehghan, M.; Shirilord, A. A generalized modified Hermitian and skew-Hermitian splitting (GMHSS) method for solving complex Sylvester matrix equation. Appl. Math. Comput. 2019, 348, 632–651. [Google Scholar] [CrossRef]
13. Dehghan, M.; Shirilord, A. Solving complex Sylvester matrix equation by accelerated double-step scale splitting (ADSS) method. Eng. Comput. 2019. [Google Scholar] [CrossRef]
14. Li, S.Y.; Shen, H.L.; Shao, X.H. PHSS iterative method for solving generalized Lyapunov equations. Mathematics 2019, 7, 38. [Google Scholar] [CrossRef] [Green Version]
15. Shen, H.L.; Li, Y.R.; Shao, X.H. The four-parameter PSS method for solving the Sylvester equation. Mathematics 2019, 7, 105. [Google Scholar] [CrossRef] [Green Version]
16. Hajarian, M. Generalized conjugate direction algorithm for solving the general coupled matrix equations over symmetric matrices. Numer. Algorithms 2016, 73, 591–609. [Google Scholar] [CrossRef]
17. Hajarian, M. Extending the CGLS algorithm for least squares solutions of the generalized Sylvester-transpose matrix equations. J. Frankl. Inst. 2016, 353, 1168–1185. [Google Scholar] [CrossRef]
18. Dehghan, M.; Mohammadi-Arani, R. Generalized product-type methods based on Bi-conjugate gradient (GPBiCG) for solving shifted linear systems. Comput. Appl. Math. 2017, 36, 1591–1606. [Google Scholar] [CrossRef]
19. Ding, F.; Chen, T. Gradient based iterative algorithms for solving a class of matrix equations. IEEE Trans. Automat. Control 2005, 50, 1216–1221. [Google Scholar] [CrossRef]
20. Niu, Q.; Wang, X.; Lu, L.-Z. A relaxed gradient based algorithm for solving Sylvester equation. Asian J. Control 2011, 13, 461–464. [Google Scholar] [CrossRef]
21. Zhang, X.D.; Sheng, X.P. The relaxed gradient based iterative algorithm for the symmetric (skew symmetric) solution of the Sylvester equation AX + XB = C. Math. Probl. Eng. 2017, 2017, 1624969. [Google Scholar] [CrossRef]
22. Xie, Y.J.; Ma, C.F. The accelerated gradient based iterative algorithm for solving a class of generalized Sylvester-transpose matrix equation. Appl. Math. Comput. 2012, 218, 5620–5628. [Google Scholar] [CrossRef]
23. Ding, F.; Chen, T. Iterative least-squares solutions of coupled Sylvester matrix equations. Syst. Control Lett. 2005, 54, 95–107. [Google Scholar] [CrossRef]
24. Fan, W.; Gu, C.; Tian, Z. Jacobi-gradient iterative algorithms for Sylvester matrix equations. In Proceedings of the 14th Conference of the International Linear Algebra Society, Shanghai University, Shanghai, China, 16–20 July 2007. [Google Scholar]
25. Tian, Z.; Tian, M.; Gu, C.; Hao, X. An accelerated Jacobi-gradient based iterative algorithm for solving Sylvester matrix equations. Filomat 2017, 31, 2381–2390. [Google Scholar] [CrossRef]
26. Ding, F.; Liu, P.X.; Chen, T. Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 2008, 197, 41–50. [Google Scholar] [CrossRef]
27. Horn, R.A.; Johnson, C.R. Topics in Matrix Analysis; Cambridge University Press: New York, NY, USA, 1991. [Google Scholar]
Figure 1. Errors for Example 1.
Figure 2. Errors for Example 2.
Figure 3. Comparison of errors for Example 3.
Figure 4. Errors for Example 4.
Table 1. Computational time and error for Example 2 (IT: number of iterations; CT: computational time).

| Method | IT | CT | Error: $\delta(75)$ |
|---|---|---|---|
| Direct | - | 0.0364 | - |
| LSI | 75 | 0.0125 | $1.1296 \times 10^{5}$ |
| GI | 75 | 0.0049 | 1.4185 |
| MJGI | 75 | 0.0022 | 0.5251 |
Table 2. Computational time and error for Example 3.

| Method | IT | CT | Error: $\delta(100)$ |
|---|---|---|---|
| Direct | - | 34.6026 | - |
| LSI | 100 | 0.1920 | $2.7572 \times 10^{4}$ |
| GI | 100 | 0.0849 | 4.7395 |
| MJGI | 100 | 0.0298 | 1.8844 |
Table 3. Computational times and errors for Example 4.

| Method | IT | CT | Error: $\delta(100)$ |
|---|---|---|---|
| Direct | - | 0.0118 | - |
| GI | 100 | 0.0051 | 2.5981 |
| RGI | 100 | 0.0061 | 3.4741 |
| AGBI | 100 | 0.0051 | 7.3306 |
| JGI | 100 | 0.0038 | 17.2652 |
| MJGI | 100 | 0.0028 | 0.4281 |
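The GI baseline appearing in the tables above is the gradient-based iteration of Ding and Chen [19,26], which averages two gradient steps built from a common residual, following the hierarchical identification principle. The following is a minimal NumPy sketch of that baseline for $A_1 X A_2 + A_3 X A_4 = E$; the function names (`gi_solve`, `safe_mu`) and the conservative spectral-norm step size are our illustrative choices, not the tuned parameters used in the paper's experiments.

```python
import numpy as np

def gi_solve(A1, A2, A3, A4, E, X0, mu, iters):
    """Gradient-based iterative (GI) sweep for A1 X A2 + A3 X A4 = E.

    Each step forms the shared residual R = E - A1 X A2 - A3 X A4,
    takes one gradient update per additive term, and averages the
    two sub-iterates (hierarchical identification principle)."""
    X = X0.astype(float).copy()
    for _ in range(iters):
        R = E - A1 @ X @ A2 - A3 @ X @ A4
        X1 = X + mu * A1.T @ R @ A2.T   # update from the A1 X A2 term
        X2 = X + mu * A3.T @ R @ A4.T   # update from the A3 X A4 term
        X = 0.5 * (X1 + X2)
    return X

def safe_mu(A1, A2, A3, A4):
    """A conservative convergent step size based on spectral norms.

    Sufficient but not optimal; sharper choices use eigenvalues of the
    iteration matrix, as analyzed in the paper."""
    s = lambda A: np.linalg.norm(A, 2)
    return 1.0 / (s(A1) ** 2 * s(A2) ** 2 + s(A3) ** 2 * s(A4) ** 2)
```

The per-step cost is a handful of matrix products, which is why the iterative columns in the tables beat the direct solve on CT for larger problems; the MJGI method additionally splits the coefficient matrices into diagonal and off-diagonal parts to accelerate convergence.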
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
