Gradient Iterative Method with Optimal Convergent Factor for Solving a Generalized Sylvester Matrix Equation with Applications to Diffusion Equations

Abstract: We introduce a gradient iterative scheme with an optimal convergent factor for solving a generalized Sylvester matrix equation ∑_{i=1}^p A_i X B_i = F, where A_i, B_i and F are conformable rectangular matrices. The iterative scheme is derived from the gradients of the squared norm-errors of the associated subsystems for the equation. The convergence analysis reveals that the sequence of approximated solutions converges to the exact solution for any initial value if and only if the convergent factor is chosen properly in terms of the spectral radius of the associated iteration matrix. We also discuss the convergence rate and error estimations. Moreover, we determine the fastest convergent factor so that the associated iteration matrix has the smallest spectral radius. Furthermore, we provide numerical examples to illustrate the capability and efficiency of this method. Finally, we apply the proposed scheme to discretized equations for boundary value problems involving convection and diffusion.


Introduction
It is well known that several problems in control and system theory are closely related to a generalized Sylvester matrix equation of the form

∑_{i=1}^p A_i X B_i = F,   (1)

where A_i, B_i and F are given matrices of conforming dimensions. Equation (1) includes the following special cases:

AX + XB = F,   (2)
AX + XA^T = F,   (3)
AXB + X = F,   (4)

known respectively as the Sylvester equation, the Lyapunov equation, and the Kalman-Yakubovich equation. Equations (1)-(4) have important applications in stability analysis, optimal control, observer design, the output regulation problem, and so on; see e.g., [1-3]. Equation (1) can be solved directly using the vector operator and the Kronecker product. Here, recall that the vector operator vec(·) stacks the columns of a matrix into a single column vector and satisfies vec(AXB) = (B^T ⊗ A) vec(X). Hence (1) is equivalent to the linear system Px = b, where P = ∑_{i=1}^p B_i^T ⊗ A_i, x = vec(X) and b = vec(F).

A well-known iterative scheme for (1) is the following gradient-based iterative (GI) method.

Method 1 ([15]). Assume that the matrix Equation (1) has a unique solution X. Construct

X_i(k) = X(k−1) + μ A_i^T [F − ∑_{j=1}^p A_j X(k−1) B_j] B_i^T, i = 1, 2, . . . , p,
X(k) = (1/p) ∑_{i=1}^p X_i(k),

with convergent factor μ = [∑_{i=1}^p ‖A_i‖² ‖B_i‖²]^{−1}; then the sequence {X(k)}_{k=0}^∞ converges to the exact solution X for any given initial matrices X_1(0), X_2(0), ..., X_p(0).
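For concreteness, the direct Kronecker-product solution described above can be sketched in a few lines of NumPy (a minimal illustration of our own; the function name solve_direct and the random test data are not from the paper):

```python
import numpy as np

# Solve sum_i A_i X B_i = F directly via the vectorized system
# (sum_i B_i^T kron A_i) vec(X) = vec(F), with column-stacking vec.
def solve_direct(As, Bs, F):
    n = As[0].shape[1]      # X is n x p
    p = Bs[0].shape[0]
    P = sum(np.kron(B.T, A) for A, B in zip(As, Bs))
    x = np.linalg.solve(P, F.flatten(order="F"))   # order="F" = column-stacking
    return x.reshape((n, p), order="F")

# Usage: a random 2-term instance with a known solution
rng = np.random.default_rng(0)
As = [rng.standard_normal((3, 3)) for _ in range(2)]
Bs = [rng.standard_normal((4, 4)) for _ in range(2)]
X_true = rng.standard_normal((3, 4))
F = sum(A @ X_true @ B for A, B in zip(As, Bs))
X = solve_direct(As, Bs, F)
```

This direct method forms the np × np matrix P explicitly, which is what motivates the iterative schemes below for larger problems.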
In this paper, we propose a gradient-based iterative method with an optimal convergent factor (GIO) for solving the generalized Sylvester matrix Equation (1). This method is derived from least-squares optimization and the hierarchical identification principle (see Section 2). Convergence analysis (see Section 3) reveals that the sequence of approximated solutions converges to the exact solution for any initial value if and only if the convergent factor is chosen properly. Then we discuss the convergence rate and error estimates for the method. Moreover, the convergent factor will be determined so that the convergence rate is fastest, or equivalently, the spectral radius of the associated iteration matrix is minimized. In particular, the GIO method can solve the Sylvester Equation (2) (see Section 4). To illustrate the efficiency of the proposed method, we provide numerical experiments in Section 5. We compare the efficiency of our method for solving (2) with other iterative methods such as the gradient-based iterative method (GI) [15], the least-squares iterative method (LS) [17], the relaxed gradient-based iterative method (RGI) [18], the modified gradient-based iterative method (MGI) [19], the Jacobi-gradient-based iterative method (JGI) [20,21] and the accelerated Jacobi-gradient-based iterative method (AJGI) [22]. In Section 6, we apply the GIO method to the convection-diffusion equation and the diffusion equation. Finally, we conclude the overall work in Section 7.

Introducing a Gradient Iterative Method
Let us denote by R^{r×s} the set of r × s real matrices. Let m, n, p, q ∈ N be such that mq = np. Consider the matrix Equation (1) where A_i ∈ R^{m×n}, B_i ∈ R^{p×q}, F ∈ R^{m×q} are given constant matrices and X ∈ R^{n×p} is an unknown matrix to be found. Suppose that (1) has a unique solution, i.e., the matrix P = ∑_{i=1}^p B_i^T ⊗ A_i is invertible. Now, we discuss how to solve (1) indirectly using an effective iterative method.

According to the hierarchical identification principle, the system (1) is decomposed into p subsystems. For each i ∈ {1, 2, ..., p}, set

F_i = F − ∑_{j≠i} A_j X B_j.   (5)

Our aim is to approximate the solution of the p subsystems

A_i X B_i = F_i, i = 1, 2, . . . , p,   (6)

so that the following least-squares error is minimized:

L_i(X) = (1/2) ‖A_i X B_i − F_i‖²_F.   (7)

The gradient of each L_i can be computed as follows:

∇L_i(X) = A_i^T (A_i X B_i − F_i) B_i^T.   (8)

Let X_i(k) be the estimate or iterative solution at iteration k, associated with the subsystem (6). From the gradient formula (8), the iterative scheme for X_i(k) is given by the following equation:

X_i(k) = X(k−1) + τ A_i^T [F_i − A_i X(k−1) B_i] B_i^T,   (9)

where τ is a convergent factor. According to the hierarchical identification principle, the unknown parameter X in (9), which enters through F_i, is replaced by its estimate X(k−1). After taking the arithmetic mean of the X_i(k) and absorbing the averaging factor 1/p into the convergent factor τ, we obtain the following process:

X(k) = X(k−1) + τ ∑_{i=1}^p A_i^T [F − ∑_{j=1}^p A_j X(k−1) B_j] B_i^T.

Method 3. Gradient-based iterative method with optimal convergent factor.
Initializing step: For i = 1, 2, . . . , p, set Ã_i = A_i^T and B̃_i = B_i^T. Choose τ ∈ R. Set k := 0. Choose an initial matrix X(0).
Updating step: For k = 1 to end, do:

E(k−1) = F − ∑_{i=1}^p A_i X(k−1) B_i,
X(k) = X(k−1) + τ ∑_{i=1}^p Ã_i E(k−1) B̃_i.

Note that the terms E(k), Ã_i, B̃_i were introduced in order to eliminate duplicated computations. To stop the process, one may impose a stopping rule such as requiring the relative error ‖E(k)‖_F / ‖F‖_F to be less than a tolerance ε. The convergence of this method depends on the convergent factor τ. A discussion of possible/optimal values of τ will be presented in the next section.
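A minimal NumPy sketch of Method 3 may help fix ideas (the function name gio and the test data are ours, not the paper's code; for simplicity we pick τ from the optimal rule of Theorem 3, computed via the Kronecker matrix P):

```python
import numpy as np

# Sketch of the GIO iteration (Method 3).
def gio(As, Bs, F, tau, X0, iters=1000, tol=1e-12):
    X = X0.copy()
    At = [A.T for A in As]               # initializing step: precompute transposes
    Bt = [B.T for B in Bs]
    for _ in range(iters):
        E = F - sum(A @ X @ B for A, B in zip(As, Bs))       # residual E(k-1)
        if np.linalg.norm(E) / np.linalg.norm(F) < tol:      # stopping rule
            break
        X = X + tau * sum(a @ E @ b for a, b in zip(At, Bt)) # updating step
    return X

# Usage on a small, well-conditioned 2-term system
rng = np.random.default_rng(1)
As = [np.eye(3), 0.2 * rng.standard_normal((3, 3))]
Bs = [np.eye(3), 0.2 * rng.standard_normal((3, 3))]
X_true = rng.standard_normal((3, 3))
F = sum(A @ X_true @ B for A, B in zip(As, Bs))

P = sum(np.kron(B.T, A) for A, B in zip(As, Bs))
lam = np.linalg.eigvalsh(P.T @ P)
tau = 2.0 / (lam[0] + lam[-1])           # optimal factor (see Theorem 3)
X = gio(As, Bs, F, tau, np.zeros((3, 3)))
```

Unlike the direct method, the iteration never forms P; P is computed here only to choose τ, and in practice λ_min and λ_max can be estimated instead.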

Convergence Analysis
In this section, we show that the approximated solutions derived from Method 3 converge to the exact solution. First, we transform a recursive equation for the error of the approximated solutions into a first-order linear iterative system x(k) = Tx(k−1), where x(k) is a vector and T is an iteration matrix. Then, we investigate the iteration matrix T to obtain the convergence rate and error estimations. Finally, we discuss the fastest convergent factor and find the number of iterations corresponding to a given satisfactory error.

Theorem 1. Assume that the matrix Equation (1) has a unique solution X. Let τ ∈ R. Then the approximate solutions derived from (9) converge to the exact solution for any initial value X(0) if and only if

0 < τ < 2 / λ_max(P^T P).   (10)

In this case, the spectral radius of the associated iteration matrix T = I_np − τP^T P is given by

ρ(T) = max{ |1 − τ λ_min(P^T P)|, |1 − τ λ_max(P^T P)| }.   (11)

Proof. At each k-th iteration, consider the error matrix X̃(k) = X(k) − X. We have

X̃(k) = X̃(k−1) − τ ∑_{i=1}^p A_i^T [∑_{j=1}^p A_j X̃(k−1) B_j] B_i^T.

We shall show that X(k) → X by showing that X̃(k) → 0, or equivalently vec[X̃(k)] → 0. By taking the vector operator on the above equation, we get

vec X̃(k) = (I_np − τP^T P) vec X̃(k−1).   (12)

We see that (12) is a first-order linear iterative system of the form x(k) = Tx(k−1). Thus, vec X̃(k) → 0 for any initial value X(0) if and only if the iteration matrix T has spectral radius less than 1. Since T is symmetric, all its eigenvalues are real. Note that any eigenvalue of T is of the form 1 − τλ, where λ is an eigenvalue of P^T P. Thus, the spectral radius of T is given by (11). It follows that ρ(T) < 1 if and only if

|1 − τλ| < 1 for every eigenvalue λ of P^T P.   (13)

Since P is invertible, the matrix P^T P is positive definite. Thus, λ_max(P^T P) > 0, and the condition (13) becomes 0 < τ < 2/λ_max(P^T P). Hence, we arrive at (10).

Theorem 2.
Assume the hypothesis of Theorem 1, so that the sequence {X(k)} converges to the exact solution X for any initial value X(0). Then the following hold:

(i) We have the error estimates

‖X(k) − X‖_F ≤ ρ(T) ‖X(k−1) − X‖_F,   (14)
‖X(k) − X‖_F ≤ ρ(T)^k ‖X(0) − X‖_F.   (15)

Moreover, the asymptotic convergence rate of Method 3 is governed by ρ(T) in (11).

(ii) Let ε > 0 be a satisfactory error. We have ‖X(k) − X‖_F < ε after the k-th iteration for any k ∈ N that satisfies

k > (log ε − log ‖X(0) − X‖_F) / log ρ(T).   (16)

Proof. According to (12), we have ‖X(k) − X‖_F = ‖vec X̃(k)‖₂ ≤ ‖T‖₂ ‖vec X̃(k−1)‖₂ = ρ(T) ‖X(k−1) − X‖_F, since T is symmetric and hence ‖T‖₂ = ρ(T). Thus for each k ∈ N, the estimate (14) holds. By induction, we obtain the estimate (15). The estimate (15) implies that the asymptotic convergence rate of the method depends on ρ(T). To prove the second assertion, note that 0 < ρ(T) < 1 so that log ρ(T) < 0, and by taking logarithms the condition (16) is equivalent to ρ(T)^k ‖X(0) − X‖_F < ε. Thus if (16) holds, then ‖X(k) − X‖_F < ε.
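The estimate (15) can be checked numerically in the vectorized form (12) (a small NumPy experiment of our own; the stand-in matrix P is arbitrary):

```python
import numpy as np

# Check ||x(k)||_2 <= rho(T)^k ||x(0)||_2 for x(k) = T x(k-1), T = I - tau P^T P,
# where x(k) plays the role of vec(X(k) - X).
rng = np.random.default_rng(2)
P = np.eye(6) + 0.3 * rng.standard_normal((6, 6))   # stands in for sum_i B_i^T kron A_i
lam = np.linalg.eigvalsh(P.T @ P)
tau = 1.0 / lam[-1]                                  # satisfies 0 < tau < 2/lambda_max
T = np.eye(6) - tau * (P.T @ P)
rho = max(abs(1 - tau * lam[0]), abs(1 - tau * lam[-1]))

x = rng.standard_normal(6)                           # vec of the initial error
e0 = np.linalg.norm(x)
bound_ok = True
for k in range(1, 50):
    x = T @ x
    bound_ok &= np.linalg.norm(x) <= rho**k * e0 + 1e-12
```

Because T is symmetric, ρ(T) equals the spectral norm ‖T‖₂, so the per-step contraction (14) holds exactly, not just asymptotically.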
The convergence rate exhibits how fast the approximated solutions converge to the exact solution. Theorem 2 reveals that the smaller the spectral radius ρ(T), the faster the approximated solutions go to the exact solution. Moreover, by taking ε = 0.5 × 10^−n in (16), we have that X(k) has an accuracy of n decimal digits if k satisfies

k > (log 0.5 − n log 10 − log ‖X(0) − X‖_F) / log ρ(T).

Recall that the condition number of a matrix A (relative to the spectral norm) is defined by κ(A) = ‖A‖₂ ‖A^{−1}‖₂ = [λ_max(A^T A)/λ_min(A^T A)]^{1/2}.

Theorem 3. Assume the hypothesis of Theorem 1. Then the optimal value of τ > 0 for which Method 3 has the fastest asymptotic convergence rate is determined by

τ_opt = 2 / (λ_min(P^T P) + λ_max(P^T P)).   (17)

In this case, the spectral radius of the iteration matrix is given by

ρ(T) = (λ_max(P^T P) − λ_min(P^T P)) / (λ_max(P^T P) + λ_min(P^T P)) = (κ²(P) − 1) / (κ²(P) + 1).   (18)

Proof. The convergence of Method 3 implies that (10) holds. Then, Method 3 has the same convergence rate as the linear iteration (12), and thus it is governed by the spectral radius ρ(T) in (11). The fastest convergence rate is equivalent to the smallest value of ρ(T). Thus, we consider the minimization over τ of max{ |1 − τ λ_min(P^T P)|, |1 − τ λ_max(P^T P)| }. Since the first term decreases and the second increases in τ on the admissible interval, the minimum is attained when 1 − τ λ_min(P^T P) = −[1 − τ λ_max(P^T P)]. Thus, the optimal value is reached at (17), so that the minimum is given by (18).
We see that if the condition number of P is closer to 1 then the approximate solutions converge faster to the exact solution. Note that the condition number of P is close to 1 if and only if the maximum eigenvalue of P T P is close to the minimum eigenvalue of P T P.
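The relationship between τ_opt, ρ(T) and κ(P) in Theorem 3 is easy to verify numerically (our own sketch; the matrix P here is an arbitrary stand-in):

```python
import numpy as np

# Compare rho(T) at the optimal tau of Theorem 3 with the condition-number
# form (kappa(P)^2 - 1) / (kappa(P)^2 + 1), and check optimality of tau_opt.
rng = np.random.default_rng(3)
P = np.eye(5) + 0.4 * rng.standard_normal((5, 5))
lam = np.linalg.eigvalsh(P.T @ P)
lmin, lmax = lam[0], lam[-1]

tau_opt = 2.0 / (lmin + lmax)
rho_opt = (lmax - lmin) / (lmax + lmin)

kappa = np.sqrt(lmax / lmin)                  # kappa(P) in the spectral norm
rho_kappa = (kappa**2 - 1) / (kappa**2 + 1)

# Any other admissible tau gives a spectral radius at least rho_opt
def rho(tau):
    return max(abs(1 - tau * lmin), abs(1 - tau * lmax))

no_better = all(rho(t) >= rho_opt - 1e-12
                for t in np.linspace(1e-3, 2 / lmax - 1e-3, 100))
```

The algebra behind the check: dividing numerator and denominator of (λ_max − λ_min)/(λ_max + λ_min) by λ_min gives exactly (κ² − 1)/(κ² + 1).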

The GIO Method for the Sylvester Equation
In this section, we discuss the gradient-based iterative method with optimal convergent factor for solving the Sylvester matrix equation. Moreover, we provide convergence criteria, the convergence rate, error estimates and the optimal factor. Let m, n, p, q ∈ N be such that m = n and p = q. Consider the Sylvester matrix Equation (2), where A ∈ R^{m×n}, B ∈ R^{p×q}, F ∈ R^{m×q} are given constant matrices and X ∈ R^{n×p} is an unknown matrix to be found. Suppose that (2) has a unique solution, i.e., Q := I_p ⊗ A + B^T ⊗ I_n is invertible, or equivalently, A and −B have no common eigenvalues. Method 3 then reduces to the following:

Method 4. Initializing step: Set Ã = A^T and B̃ = B^T. Choose τ ∈ R. Set k := 0. Choose an initial matrix X(0).

Updating step: For k = 1 to end, do:

E(k−1) = F − A X(k−1) − X(k−1) B,
X(k) = X(k−1) + τ [Ã E(k−1) + E(k−1) B̃].

Corollary 1. Assume that the Sylvester matrix Equation (2) has a unique solution X. Let τ ∈ R. Then the following hold:

(i) The approximate solutions generated by Method 4 converge to the exact solution for any initial value X(0) if and only if

0 < τ < 2 / λ_max(Q^T Q).   (19)

In this case, the spectral radius of the associated iteration matrix S = I_np − τQ^T Q is given by

ρ(S) = max{ |1 − τ λ_min(Q^T Q)|, |1 − τ λ_max(Q^T Q)| }.   (20)

(ii) The asymptotic convergence rate of Method 4 is governed by ρ(S) in (20).

(iii) The optimal value of τ > 0 for which Method 4 has the fastest asymptotic convergence rate is determined by

τ_opt = 2 / (λ_min(Q^T Q) + λ_max(Q^T Q)),   (21)

in which case ρ(S) = (κ²(Q) − 1) / (κ²(Q) + 1).
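As an illustration, here is a NumPy sketch of Method 4 with the optimal factor chosen from Q (function name and test data are ours, not the paper's code):

```python
import numpy as np

# GIO iteration for the Sylvester equation AX + XB = F:
# E(k-1) = F - A X(k-1) - X(k-1) B,  X(k) = X(k-1) + tau (A^T E + E B^T).
def gio_sylvester(A, B, F, tau, X0, iters=2000):
    X = X0.copy()
    At, Bt = A.T, B.T                      # initializing step
    for _ in range(iters):
        E = F - A @ X - X @ B
        X = X + tau * (At @ E + E @ Bt)
    return X

# Usage on a small well-conditioned instance
rng = np.random.default_rng(4)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
B = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
X_true = rng.standard_normal((4, 4))
F = A @ X_true + X_true @ B

Q = np.kron(np.eye(4), A) + np.kron(B.T, np.eye(4))   # I_p (x) A + B^T (x) I_n
lam = np.linalg.eigvalsh(Q.T @ Q)
tau = 2.0 / (lam[0] + lam[-1])                        # optimal factor
X = gio_sylvester(A, B, F, tau, np.zeros_like(F))
```

Note that vec(A^T E + E B^T) = (I ⊗ A^T + B ⊗ I) vec E = Q^T vec E, so this update is exactly the specialization of Method 3 with iteration matrix S = I_np − τQ^T Q.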

Numerical Examples for Generalized Sylvester Matrix Equation
In this section, we show the capability and efficiency of the proposed method by illustrating some numerical examples. To compare the performance of the algorithms, we use the same PC environment, and consider the reported errors together with iteration numbers (IT) and computational times (CT, in seconds). Our iterations have been carried out in MATLAB R2013a on a PC with an Intel(R) Core(TM) i5-760 CPU @ 2.80 GHz and 8.00 GB of RAM. We measure the computational time taken by an iterative process with the MATLAB functions tic and toc. In Example 1, we show that our method is efficient even when the matrices are non-square, and we discuss the effect of changing the convergent factor τ. In Example 2, we consider a larger square matrix system and show that our method is still efficient. In Example 3, we compare the efficiency of our method with other recent iterative methods. The matrix equation considered in this example is the Sylvester equation with square coefficient matrices, since it fits all of the recent methods. In all illustrated examples, we compare the efficiency of the iterative methods with the direct method x = P^{−1}b mentioned in the Introduction. Let us denote by tridiag(u, v, w) the tridiagonal matrix with subdiagonal entries u, diagonal entries v, and superdiagonal entries w.
The optimal convergent factor can be computed as τ_opt = 2/(λ_min(P^T P) + λ_max(P^T P)). The effect of changing the convergent factor τ is illustrated in Figure 1. We see that, for k large enough, the relative error ‖E(k)‖_F / ‖F‖_F for τ_opt goes to 0 faster than for the other convergent factors. If τ does not satisfy the condition (10), then the approximated solutions diverge for the given initial matrices. Moreover, Table 1 shows that the computational time of our algorithm (GIO) is significantly less than that of the direct method. Table 1 also demonstrates that, when we fix the error ‖E(k)‖_F to be less than 5 × 10^−3, the GIO algorithm outperforms the other GI algorithms with different convergent factors in both iteration numbers and computational times.
We can compute τ_opt ≈ 0.002553. Figure 2 shows that the relative error ‖E(k)‖_F / ‖F‖_F for τ_opt goes to 0 faster than for the other convergent factors. If τ does not satisfy (10), then the approximate solutions diverge for the given initial matrices. From Table 2, we see that the computational time of our algorithm is significantly less than that of the direct method. Furthermore, when the satisfactory error ‖E(k)‖_F is required to be less than ε = 0.5, the GIO algorithm is more efficient than the other GI algorithms in both iteration numbers and computational times.

In Example 3, we compare the efficiency of our method (GIO) with the other iterative methods GI, LS, RGI, MGI, JGI and AJGI. We choose the same convergent factor τ = 0.01836 and the same initial matrix X(0) = tridiag(0, 10^−6, 0). To compare the efficiency of these methods, we fix the iteration number to be 50 and consider the relative errors ‖E(k)‖_F / ‖F‖_F. The results are displayed in Figure 3. The iteration numbers and the computational times when we fix the error ‖E(k)‖_F to be less than 5 × 10^−3 are illustrated in Table 3. We see that our method outperforms the direct method and the other iterative methods, with fewer iterations and lower computational time. In particular, the approximated solutions generated by the JGI method diverge.

An Application to Discretization of the Convection-Diffusion Equation
In this section, we apply the GIO method to a discretization of the convection-diffusion equation

∂u/∂t + μ ∂u/∂x = α ∂²u/∂x²,   (22)

where μ and α are the convection and diffusion coefficients, respectively. Discretizing Equation (22) by finite differences and rearranging the resulting equation leads to a relation in the unknowns u_m^n, where r = μl/h and p = αl/h² are the convection and diffusion numbers, respectively, and l and h denote the step sizes in time and space. We can transform (22) into a linear system of MN unknowns u_11, ..., u_MN in the form

P_CD vec(U) = b,   (24)

where U = [u_m^n], and P_CD ∈ R^{MN×MN} consists of N × N blocks of size M × M, with blocks I_M on its diagonal and blocks tridiag(−p − ½r, −1 + 2p, −p + ½r) under its diagonal. The vector b is partitioned into N blocks b_1, . . . , b_N of size M, determined by the initial and boundary conditions. We can see that Equation (24) is the generalized Sylvester Equation (1) with p = 1, A = P_CD, X = vec(U), B = I and F = b. From Method 3, we obtain the following:

Method 5. Input M, N ∈ N as the numbers of partitions. Initializing step: Set P̃_CD = P_CD^T. Choose τ ∈ R. Set k := 0. Choose an initial vector u(0).
Updating step: For k = 1 to end, do:

E(k−1) = b − P_CD u(k−1),
u(k) = u(k−1) + τ P̃_CD E(k−1).

To stop the method, one may impose a stopping rule such as ‖E(k)‖_F / ‖b‖_F < ε, where ε is a tolerance error.
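Since (24) is the p = 1, B = I case of Equation (1), Method 5 is simply the GIO iteration applied to a linear system. As a self-contained toy illustration (the tridiagonal stand-in matrix and all names are ours, not the exact P_CD above), the scheme reads:

```python
import numpy as np

# GIO for a linear system M x = b (the p = 1, B = I specialization):
# x(k) = x(k-1) + tau * M^T (b - M x(k-1)).
def gio_linear(M, b, tau, iters=5000):
    x = np.zeros_like(b)
    Mt = M.T                          # initializing step: precompute the transpose
    for _ in range(iters):
        x = x + tau * (Mt @ (b - M @ x))
    return x

# Toy diffusion-style stand-in: tridiag(-p, 1 + 2p, -p) with diffusion number p
N, pnum = 20, 0.3
M = np.diag((1 + 2 * pnum) * np.ones(N)) \
    + np.diag(-pnum * np.ones(N - 1), 1) + np.diag(-pnum * np.ones(N - 1), -1)
b = np.ones(N)

lam = np.linalg.eigvalsh(M.T @ M)
tau = 2.0 / (lam[0] + lam[-1])        # optimal convergent factor
x = gio_linear(M, b, tau)
```

Diagonally dominant diffusion-type matrices like this are well conditioned, so by Theorem 3 the spectral radius at τ_opt is well below 1 and the iteration converges quickly.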
Now, we provide a numerical experiment for a convection-diffusion equation.
After running Method 5 for 100 iterations, we see from Figure 4 that the relative error ‖E(k)‖_F goes to 0 faster than for the other methods GI, LS, RGI, MGI, JGI and AJGI. Moreover, Table 4 reports the corresponding iteration numbers and computational times.

A particular case μ = 0 of Equation (22) is called the diffusion equation. In this case, the formulas for P_CD and b_1, . . . , b_N reduce to those obtained by setting r = 0.
After running Method 5 for 200 iterations (Figure 5), we see that our method outperforms the other iterative methods, with fewer iterations and lower computational time. The 3D plot in Figure 6 shows that the iterative solution approximates the exact solution well.

Conclusions
We propose a gradient-based iterative method with an optimal convergent factor for solving a generalized Sylvester matrix equation. The convergence analysis reveals that the sequence of approximated solutions converges to the exact solution for any initial value if and only if the convergent factor is chosen properly. The convergence rate and error estimations depend on the spectral radius of the associated iteration matrix. Moreover, we obtain the fastest convergent factor, for which the associated iteration matrix has the smallest spectral radius. Furthermore, the proposed algorithm is applicable to discretizations of diffusion equations. The numerical experiments illustrate that our method is applicable to conformable square/rectangular matrices of small/large sizes. Moreover, they reveal that our method performs well compared to recent iterative methods.
Author Contributions: N.B. and P.C. contributed equally and significantly in writing this article. All authors read and approved the final manuscript.
Funding: The first author would like to thank Science Achievement Scholarship of Thailand (SAST), Grant No. 01/2560, from Ministry of Education for financial support during the Ph.D. study.