The Weighted, Relaxed Gradient-Based Iterative Algorithm for the Generalized Coupled Conjugate and Transpose Sylvester Matrix Equations

: By applying the weighted relaxation technique to the gradient-based iterative (GI) algorithm and taking proper weighted combinations of the solutions, this paper proposes the weighted, relaxed gradient-based iterative (WRGI) algorithm to solve the generalized coupled conjugate and transpose Sylvester matrix equations. With the real representation of a complex matrix as a tool, the necessary and sufﬁcient conditions for the convergence of the WRGI algorithm are determined. Also, some sufﬁcient convergence conditions of the WRGI algorithm are presented. Moreover, the optimal step size and the corresponding optimal convergence factor of the WRGI algorithm are given. Lastly, some numerical examples are provided to demonstrate the effectiveness, feasibility and superiority of the proposed algorithm


Introduction
Matrix equations are often used in mathematics and engineering applications, such as control theory, signal processing and computational mathematics [1][2][3][4].For example, the forward and backward periodic Sylvester matrix equations (PSMEs) with the following forms are an indispensable part of pole assignment and the design of state observers for linear discrete periodic systems [5].Thus, studying the computational methods of matrix equations has become an important subject in the field of computational mathematics and control.For matrix equations, computing their exact solutions is very meaningful for many practical problems.However, in many applications, such as stability analysis of control systems, it is usually not necessary to calculate the exact solution, as the approximate solution is sufficient.Therefore, the research of iterative solutions to matrix equations has attracted many researchers [6][7][8][9][10][11][12].
One of the important ways to study the approximate solutions of matrix equations is to establish iterative methods.In recent years, many researchers have proposed a great deal of iterative methods to solve different kinds of Sylvester matrix equations.For the generalized coupled Sylvester matrix equations where A ij ∈ R r i ×m j , B ij ∈ R n j ×s i , F i ∈ R r i ×s i and X j ∈ R m j ×n j , i = 1, 2, • • • , p, j = 1, 2, • • • , q, Ding and Chen [13] applied the hierarchical identification principle and introduced the block matrix inner product to propose the GI algorithm.Based on the idea of the GI algorithm, some researchers established many improved versions of the GI algorithm and investigated their convergence properties [14,15].To improve the convergence rate of the GI algorithm, Zhang [16] proposed the residual norm steepest descent (RNSD), conjugate gradient normal equation (CGNE) and biconjugate gradient stabilized (Bi-CGSTAB) algorithms to solve the matrix in Equation (1).Subsequently, Zhang [17] constructed the full-column rank, full-row rank and reduced-rank gradient-based algorithms, with the main idea of them being to construct an objective function and use the gradient search.Afterward, Zhang and Yin [18] developed the conjugate gradient least squares (CGLS) algorithm for Equation (1), which can be convergent within the finite iteration steps in the absence of round off errors.
The generalized coupled Sylvester-conjugate matrix equations have the following form: where The matrix in Equation ( 2) can be regarded as the generalization of Equation (1) in the complex field.For the matrix in Equation (2), Wu et al. [19] extended the GI algorithm to solve it and derived the sufficient condition for the convergence of the GI algorithm.Due to the fact that the sufficient condition in [19] is somewhat conservative, Huang and Ma [20] established the sufficient and necessary conditions for convergence of the GI algorithm based on the properties of the real representation of a complex matrix and the vec operator.Also, they made use of different definitions of the real representation to derive another sufficient and necessary condition for the convergence of the GI algorithm.In [21], Huang and Ma introduced l relaxation factors into the GI algorithm and proposed two relaxed gradient-based iterative (RGI) algorithms.They proved that the RGI algorithms are convergent under suitable restrictions in light of the real representation of a complex matrix and the vec operator.Quite recently, Wang et al. [22] developed a cyclic gradient-based iterative (CGI) algorithm by introducing a modular operator, which is different from previous iterative methods.The most remarkable advantage of the CGI algorithm is that less information is used in each iteration update, which helps save memory and improve efficiency.
In addition, the generalized, coupled Sylvester-transpose matrix equations where A ij ∈ R m i ×r j , C ij ∈ R m i ×t j , B ij ∈ R t j ×n i , D ij ∈ R r j ×n i , F i ∈ R m i ×n i and X j ∈ R r j ×t j , i = 1, 2, • • • , s, j = 1, 2, • • • , l are related to fault detection, observer design and so forth.Due to the important role of the generalized, coupled Sylvester-transpose matrix equations in several applied problems, numerous methods have been developed to solve them.For example, Song et al. [23] constructed the GI algorithm for the matrix in Equation (3) by using the principle of hierarchical identification.According to the rank of the related matrices of the matrix in Equation (3), Huang and Ma [24] developed three relaxed gradient-based iterative (RGI) algorithms recently by minimizing the objective functions.
In [25], Beik et al. considered the following matrix equations: where • • • , N are the known matrices with suitable dimensions in a complex number field, X = X 1 , X 2 , • • • , X p is a group of unknown matrices and X H j represents the conjugate transpose of the matrix X j .The matrix in Equation ( 4) is quite general and includes several kinds of Sylvester matrix equations, and it can be viewed as a general form of the aforementioned matrices in Equations ( 1)- (3).By using the hierarchical identification principle, the authors in [25] put forward the GI algorithm over a group of reflexive (anti-reflexive) matrices.
Inspired by the above work, this paper focuses on solving the iterative solution of the generalized coupled conjugate and transpose Sylvester matrix equations: where • • • , l are the known matrices and X j ∈ C r j ×s j for j = 1, 2, • • • , l are the unknown matrices that need to be determined.When l = 1, Wang et al. [26] presented the relaxed gradient iterative (RGI) algorithm for Equation ( 5), which has four system parameters.Note that the matrices in Equations ( 1)-( 3) are special cases of the matrix in Equation ( 5), and thus the results obtained in this work contain the existing ones in [20,23,24,27].Owing to the fact that the convergence speed of the GI algorithm is slow in many cases, it is quite meaningful to further improve the numerical performance of the GI algorithm for the matrix in Equation ( 5).According to [21,24,26,28,29], it can be seen that the relaxation technique can ameliorate the numerical behaviors of the existing GI-like algorithms.Then, this motivated us to apply the weighted relaxation technique to the GI algorithm.By using different step size factors and the weighted technique, we construct the weighted, relaxed gradient-based iterative (WRGI) algorithm for solving the matrix in Equation (5).The WRGI algorithm contains s relaxation factors.When all relaxation factors are equal, the WRGI algorithm will reduce to the GI one proposed in [25].
In [25], the optimal convergence factor of the GI algorithm was not derived.Compared with the GI algorithm in [25], the WRGI algorithm proposed in this paper can own a higher computational efficiency, and its convergence properties are analyzed in detail, including the convergence conditions, optimal parameters and corresponding optimal convergence factor.The proposed WRGI algorithm has a faster convergence rate than the GI one by adjusting the values of the relaxation factors so it is more conducive to solving matrix equations in control theory, signal processing and computational mathematics, among other applications.The main contributions of this paper are given below: • By using a series of step size factors and the weighted relaxation technique, we establish the weighted, relaxed gradient-based iterative (WRGI) algorithm for solving the matrix in Equation ( 5), which generalizes and improves the existing GI one in [25].Aside from that, the proposed WRGI algorithm contains the RGI ones in [21,24].

•
We analytically provide the necessary and sufficient condition for the convergence of the proposed WRGI algorithm.Also, the expressions of the optimal step size and the corresponding optimal convergence factor of the WRGI algorithm are derived.
The rest of this paper is organized as follows.In Section 2, some definitions and previous results are given, and the GI algorithm that has been proposed before is reviewed.In Section 3, we first design a new algorithm referred to as the WRGI algorithm to solve the matrix in Equation (5).Then, we prove that the WRGI algorithm is convergent for any started matrices under proper conditions, and we explicitly give the optimal step factor such that the convergence rate of the WRGI algorithm is maximized.Section 4 gives two numerical examples to demonstrate the effectiveness and advantages of the proposed WRGI algorithm.Finally, we give some concluding remarks to end this paper in Section 5.

Preliminaries
In this section, as a matter of convenience to discuss the main results of this paper, we describe the following notations which will be used throughout this paper.Let C n×n be the set of all n × n complex matrices.For a given matrix A ∈ C n×n , some related notations are the following:

•
A denotes the conjugate of the matrix A; Moreover, some useful definitions and lemmas are given below: Definition 1 ([30]).For two matrices E ∈ C m×n and F ∈ C k×l , the Kronecker product of the matrices E and F is defined as follows: Definition 2 ([27]).Let e in be the n-dimensional column vector whose ith element of e in is one, and other elements are zero.Then, the vec permutation matrix P(m, n) can be defined as follows: with a i being the ith column of A. The vector-stretching function of A is defined as follows: Lemma 1 ([20]).Let CYD = G be a matrix equation, with the matrices C and D being full-column rank and full-row rank, respectively.Then, the iterative solution Y(l) produced by the GI iteration process Y(l converges to the exact solution Y * of CYD = G for any initial matrix Y(0) if and only if In addition, the optimal step size τ 0 is Lemma 2 ([30]).Let A ∈ C m×n , B ∈ C s×t and X ∈ C n×s .Then, we have the following: (i) vec(AXB) = (B T ⊗ A)vec(X); (ii) vec(X T ) = P(n, s)vec(X).Next, we introduce two definitions of the real representation of a complex matrix.For A ∈ C m×n , A can be uniquely expressed as We define the operators (•) ∇ and (•) as follows: It can be seen from Equation ( 11) that the sizes of A ∇ and A are two times that of A. Then, by combining Equation (11) with the definition of the Frobenius norm, we can obtain Aside from that, for the identity matrix I n with the matrix order n, we define the following matrices: The properties of the real representation of several complex matrices are given by the following lemma: Lemma 4 ( [21]).Let A ∈ C m×n and B ∈ C n×r .Then, the following statements hold: (1) (AB) The lemma below gives the norm relationship between the block matrices and its submatrices: Lemma 5 ([31]).Let B be a block-partitioned matrix with and B ij be a matrix with a proper size.Then, it holds that If rank(B) = 1, or if B is a vector, then we have The following result gives the norm inequality for the p norms of the block matrix and its submatrices, which is essential for analyzing the convergence of the WRGI algorithm: Lemma 6 ([21]).Let B = [B ij ] m×n be a partitioned matrix with the form in Equation (13), and let the orders of the matrices Then, for any deduced q norm, we have

The Weighted, Relaxed Gradient-Based Iterative (WRGI) Algorithm and Its Convergence Analysis
In this section, we first propose the weighted, relaxed gradient-based iterative (WRGI) algorithm to solve the matrix in Equation ( 5) based on the hierarchical identification principle.Then, we discuss the convergence properties of the WRGI algorithm, which include the convergence conditions, optimal step size and the corresponding optimal convergence factor of the WRGI algorithm.
First, we define the following intermediate matrices for i = 1, • • • , s, p = 1, • • • , l by applying the hierarchical identification principle: For the sake of the following discussions, we define Then, Equations ( 14)-( 17) can be written as Therefore, the matrix in Equation ( 5) can be decomposed into the following matrix equations By applying Lemma 1 to Equations ( 22)-( 25), we can construct the recursive forms as follows where µ is a step size factor.
For convenience, we define the following notations: Substituting Equations ( 18)-( 21) into Equations ( 26)-(29), respectively, and then using By taking the average of X 1,i p (k + 1), X 2,i p (k + 1), X 3,i p (k + 1) and X 4,i p (k + 1) in the above equations, one can obtain the following iterative algorithm: Then, we construct the calculation forms of X p (k + 1) (p = 1, • • • , l) by introducing the suitable relaxation parameters and using the balanced strategies Remark 1.We apply Lemma 1 to Equations ( 22)-( 25) and then use the same step size factor µ to establish the iterative sequences in Equations (26)-(29), which is conducive to deducing the convergence conditions of the proposed algorithm.It is noteworthy that we also can use different step size factors in Equations (26)-( 29) and design a new algorithm, but it may be difficult to derive the convergence conditions of the new algorithm this way.Using different step size factors to construct iterative sequences and proposing a new algorithm will be the direction and focus in our future work.
Based on the above discussions, we obtain the following weighted, relaxed gradientbased iterative (WRGI) algorithm (Algorithm 1) to solve the generalized coupled conjugate and transpose Sylvester matrix equations.

Algorithm 1: The weighted, relaxed gradient-based iterative (WRGI) algorithm
Step 1: Given the matrices Step 3: Compute X p (k + 1) using In the following, we will discuss the convergence properties of the WRGI algorithm by applying the properties of the real representation of a complex matrix and the Kronecker product of the matrices.For convenience, we introduce the following notations.Let and where .
Theorem 1.The matrix in Equation (5) has a unique solution if and only if rank(R) = l ∑ p=1 4s p r p (that is, R is of a full-column rank) and rank(R) = rank((R, f )).The unique solution is given by and the corresponding homogeneous matrix equations l and R is defined by Equation (31).
Proof.Applying the real representation of the complex matrix to Equation (5) leads to By using the Kronecker products of the matrices and the vector-stretching operator in the above equations, we have which can be equivalently transformed into the following linear system: Therefore, Equation (33) has a unique solution if and only if R has a full-column rank and rank(R) = rank((R, f )).In this case, the unique solution in Equation (32) for the matrix in Equation ( 5) can be obtained.The conclusions follow immediately.
Based on the properties of the matrix norms and Algorithm 1, we establish the sufficient condition for the convergence of the proposed WRGI algorithm in the following theorem: Theorem 2. Suppose that the matrix in Equation (5) has a unique solution X Proof.Define the error matrices It follows from Algorithm 1 that Taking the Frobenius norm in Equation (36) and using the properties of the norm leads to It is not difficult to verify that and With the relations in Equations ( 38) and (39), it follows from Equation (37) that By making use of the relation in Equation ( 12), as well as Lemmas 2 and 4, we have where V is defined by Equation ( 30) and By applying the properties of the 2-norms of the matrices and vectors, we obtain that (42) That aside, in light of the relation in Equation ( 12), we deduce that Substituting Equation (43) into Equation (42) leads to (44) By combining Equations ( 40) and (44) and changing the order of addition, we derive If the parameter µ is chosen to satisfy 0 < µ < min 1≤i≤s 8 and therefore According to the convergence theorem of series, we have lim Having in mind that the matrix in Equation ( 5) has a unique solution, then it follows from the definition of The proof is completed.
In the sequel, by applying the properties of the real representation of a complex matrix and the vector-stretching operator, we study the necessary and sufficient condition for the convergence of the WRGI algorithm.This result can be stated as in Theorem 3: Theorem 3. Suppose that the matrix in Equation (5) has a unique solution X Then, the iterative sequences {X p (k)} (p = 1, • • • , l) generated by Algorithm 1 converge to X * for any initial matrices X p (0 Proof.For convenience, we define From Equation (36) and the definition in Equation (34) for Z i (k) in Theorem 2, it can be obtained that Then, by combining the real representation with Equation (47) and applying Lemma 4, we obtain By taking the vector-stretching operator on both sides of Equation (48) and using Lemma 2, we have This gives the following result: and R is defined by Equation (31).Equation ( 50) implies that the sufficient and necessary condition for the convergent of Algorithm 1 (the WRGI algorithm) is where It follows from Equation (52) that After simple computations, we derive , which completes the proof.
Remark 2. When the relaxation parameters α i of the WRGI algorithm are α i = 1 s , i = 1, 2, • • • , s, then the condition in Equation (46) reduces to the following condition: Based on Theorem 3, we will study the optimal step size µ and the corresponding optimal convergence factor of the WRGI algorithm in the following theorem: Theorem 4. Assume that the conditions of Theorem 2 are valid.Then, it holds that where In addition, the optimal convergence factor µ opt is Under this situation, the convergence rate is maximized, and we have Proof.Due to the fact that In light of Equations ( 50) and ( 56), one has Combining the relations in Equations ( 12) and (57) results in where It can be seen from Equation (59) that the smaller the ρ I − µ 4 R T T from which one can deduce that Substituting Equation (61) into Equation (59) yields Equation (55).The proof is completed.
Although the convergence conditions of the WRGI algorithm are given in Theorems 2 and 3, the Kronecker product and the real representations of the system matrices are involved in computing V 2 and T 1 2 R 2 .This may bring about a high dimensionality problem and lead to high computing consumption.To overcome this drawback, we will give a sufficient condition for the convergence of the proposed WRGI algorithm below.

Corollary 1. Suppose that the matrix in Equation
.
Proof.According to Lemmas 4-6 and the fact that (62) Substituting Equation (62) into the relation in Equation ( 46) gives the conclusion of this corollary.

Numerical Experiments
In this section, two numerical examples are provided to verify the effectiveness and advantage of the proposed WRGI algorithm for solving the matrix in Equation ( 5) in terms of the number of iterations (IT) and the computing times in seconds (CPU).All the computations were performed in MATLAB R2018b on a personal computer with an AMD Ryzen 7 5800H, where the CPU was 3.20 GHz and the memory was 16.0 GB.Example 1 ([32]).Consider the generalized coupled conjugate and transpose Sylvester matrix equations , with the following coefficient matrices: It can be verified that the unique solution of this matrix equation is We take the initial iterative matrices 4]   and set to be the termination condition with a constant ξ > 0, where Here, the prescribed maximum iterative number is k max = 20,000, and X p (k) (p = 1, 2, 3, 4) represents the kth iteration solution.
It follows from Remark 2 that the WRGI algorithm reduces to the GI one [25] as α 1 = α 2 = α 3 = α 4 = 0.25.In this case, according to Theorem 4, the optimal convergence factor is µ opt = 4.5603 × 10 −6 .However, the optimal convergence factor µ opt calculated by Theorem 4 may not minimize the IT of the GI algorithm, and the reason for this is that there are errors in the calculation process.Thus, the parameter µ adopted in the GI algorithm is the experimentally found optimal one, which minimizes the IT of the GI algorithm.After experimental debugging, the experimental optimal parameters of the GI algorithm are µ GI1 = 4.2 × 10 −6 , µ GI2 = 4.53 × 10 −6 , µ GI3 = 4.556 × 10 −6 and µ GI4 = 4.558 × 10 −6 when ξ = 0.1, ξ = 0.01, ξ = 0.001 and ξ = 0.0001, respectively.Aside from that, we can obtain that the optimal parameter of the WRGI algorithm is µ = 4.6493 × 10 −6 in terms of Theorem 4. Owing to the existence of computation errors, the parameter µ in the WRGI algorithm is adopted to be the experimentally found one as demonstrated above.Through experimental debugging, the experimental optimal parameters of the WRGI algorithm are µ WRGI1 = 4.2 × 10 −6 , µ WRGI2 = 4.61 × 10 −6 , µ WRGI3 = 4.645 × 10 −6 and µ WRGI4 = 4.647 × 10 −6 when ξ = 0.1, ξ = 0.01, ξ = 0.001 and ξ = 0.0001, respectively.Under these circumstances, the relaxation factors are taken to be α 1 = 0.26, α 2 = 0.26, α 3 = 0.24 and α 4 = 0.24.Table 1 lists the numerical results of the GI and the WRGI algorithms.As observed earlier in the comparisons of the GI and the WRGI algorithms in Table 1, the two tested algorithms were convergent for all cases, and the IT of them increased gradually with the decreases in the value of ξ.Additionally, it can be seen from Table 1 that the proposed WRGI algorithm performed better than the GI one with respect to the IT and CPU times.This implies that by applying the weighted relaxation technique, the WRGI algorithm can own better numerical behaviors than the GI one.
To better validate the advantage of the WRGI algorithm, Figures 1-4 describe the convergence curves of the GI and WRGI algorithms for four different values of ξ.These curves reflect that the WRGI algorithm is convergent and its convergence speed is faster than the GI one, except for the case of ξ = 0.1.Additionally, the advantage of the WRGI algorithm becomes more pronounced as the value of ξ decreases.This is consistent with the results in Table 1.The unique solution for these matrix equations is In this example, we chose as the initial matrices and took as the termination condition with a positive number ξ, where , and the prescribed maximum iterative number was k max = 20,000, while X p (k) (p = 1, 2) denotes the kth iteration solution.
In Table 2, we compare the numerical results of the GI and WRGI algorithms to solve the generalized coupled conjugate and transpose Sylvester matrix equations in Example 2 with respect to two different values of ξ.Since the IT of the algorithms exceeded the maximum number of iterations when ξ = 0.001 and ξ = 0.0001, we did not list the corresponding numerical results here.From the numerical results listed in Table 2, we can conclude that when the value of ξ decreases, the IT and CPU times of the tested algorithms increase.The changing scope of the IT of the proposed WRGI algorithm was smaller than that for the GI one.In addition, the WRGI algorithm outperformed the GI one from the point of view of computing efficiency, and the advantage of the WRGI algorithm became more pronounced as ξ became small.
Figure 5 describes the convergence curves of the GI and WRGI algorithms for Example 2 with ξ = 0.1 and ξ = 0.01, respectively.It follows from Figure 5 that the error curves of the WRGI algorithm decreased faster than those for the GI one, which means that the proposed WRGI algorithm is superior to the GI one in terms of IT.

Figure 1 .
Figure 1.Convergence curves of the tested algorithms for Example 1 with ξ = 0.1.

Figure 2 .
Figure 2. Convergence curves of the tested algorithms for Example with ξ = 0.01.

Figure 3 .
Figure 3. Convergence curves of the tested algorithms for Example with ξ = 0.001.

Figure 4 .
Figure 4. Convergence curves of the tested algorithms for Example with ξ = 0.0001.
R , the faster the convergence rate of the WRGI algorithm.Direct calculations give that ρ I

Table 1 .
Numerical results of the tested algorithms for Example 1 with different values of ξ.

Table 2 .
Numerical results of the tested algorithms for Example 2 with different values of ξ.