A General Approach to Sylvester-Polynomial-Conjugate Matrix Equations

Abstract: Sylvester-polynomial-conjugate matrix equations unify many well-known versions and generalizations of the Sylvester matrix equation AX − XB = C, which have a wide range of applications. In this paper, we present a general approach to Sylvester-polynomial-conjugate matrix equations via groupoids, vector spaces, and matrices over skew polynomial rings.


Introduction
The Sylvester matrix equation is the equation AX − XB = C, where all matrices are complex, the matrices A, B, C are given, and X is unknown. Special cases of the equation already appear in introductory courses on linear algebra, e.g., as the matrix form Ax = b of a system of linear equations, the equation Ax = 0 defining the nullspace of A, the normal equation A^T Ax = A^T b determining the solution to a linear least squares problem, the equation (A − λI)x = 0 defining eigenvalues and eigenvectors of A, the equation AX = I defining an inverse of a square matrix A, and the equation AX − XA = 0 defining commuting matrices (see [1] and [2], Chapter 16).
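As a quick computational aside (not taken from the paper; a standard vectorization approach, with `solve_sylvester` a helper name of ours), the equation AX − XB = C can be rewritten as an ordinary linear system via the Kronecker product, using vec(AX − XB) = (I ⊗ A − B^T ⊗ I) vec(X) for column-stacking vec:

```python
import numpy as np

def solve_sylvester(A, B, C):
    """Solve AX - XB = C by vectorization: (I_p kron A - B^T kron I_n) vec(X) = vec(C).

    Uses column-stacking vec; the system is uniquely solvable exactly when
    A and B have no common eigenvalue."""
    n, p = C.shape
    K = np.kron(np.eye(p), A) - np.kron(B.T, np.eye(n))
    x = np.linalg.solve(K, C.flatten(order="F"))
    return x.reshape((n, p), order="F")

# A and B with disjoint spectra, so the solution is unique
A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = np.array([[-1.0, 1.0], [0.0, -2.0]])
C = np.array([[1.0, 0.0], [2.0, 1.0]])
X = solve_sylvester(A, B, C)
assert np.allclose(A @ X - X @ B, C)
```

This direct method costs O((np)^3) and is only a sketch; dedicated algorithms (e.g., Bartels–Stewart) are used in practice.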
The Sylvester matrix equation has numerous applications in systems and control theory, signal processing, image restoration, engineering, and differential equations (see [3,4] and [5], Section 1, for a concise review of the literature and of methods for solving this equation). To present a concrete example, let us consider the restoration of images, i.e., the reconstruction or estimation of the original image on the basis of its noised or degraded version. As described in [6], in the presence of white Gaussian noise and under suitable assumptions on a two-dimensional image, the minimum mean square error estimate F of the original image is a solution of a Sylvester matrix equation T_1^{−1}F + FT_2 = C, where the matrix T_1 is defined with the help of the covariance matrix of the vector of samples from the image in the vertical direction, T_2 is defined similarly but with respect to the horizontal direction, and C is defined with the help of a vector of the noised image (see [6] for particularities). An interesting and unexpected appearance of a Sylvester equation in ring theory is presented in [7]. Recall that an element a of a unital and not necessarily commutative ring R is said to be suitable if, for every left ideal L of the ring R such that a − a^2 ∈ L, there exists an idempotent e = e^2 ∈ R with e − a ∈ L.
The ring R is called an exchange ring if all elements of R are suitable; the study of such rings is an important topic of research in ring theory. In [7], Khurana, Lam, and Nielsen presented a new criterion for the suitability of an element. They proved that an element A of the ring R^{n×n} of n-by-n matrices over R is suitable if and only if there exists an idempotent matrix B = B^2 ∈ R^{n×n} such that the Sylvester matrix equation XA − BX = I is solvable in R^{n×n} (in [7], the result is presented in the case where n = 1). In [7], the authors expressed their "total surprise" that studying the solvability of the Sylvester equation XA − BX = I with B idempotent turned out to be precisely equivalent to studying when A is suitable over the ring R.
Also, conjugate versions and generalizations of the Sylvester matrix equation are extensively studied (see, e.g., [8,9] for more information); in the rest of this paragraph and the next one, we list some of them. The matrix equation AX̄ − XB = C, where X̄ denotes the matrix obtained by taking the complex conjugate of each element of X, is called the normal Sylvester-conjugate matrix equation. The matrix equation X − AX̄F = C is called the Kalman-Yakubovich-conjugate matrix equation (also known as the Stein-conjugate matrix equation; see [10]). The matrix equation X − AX̄F = BY is called the Yakubovich-conjugate matrix equation ([11,12]).
For positive integers k and l, let C^{k×l} denote the set of k-by-l complex matrices. In [13], Wu, Duan, Fu, and Wu studied the so-called generalized Sylvester-conjugate matrix equation

AX + BY = EX̄F + S, (1)

where A, E ∈ C^{n×n}, B ∈ C^{n×r}, F ∈ C^{p×p}, and S ∈ C^{n×p} are known matrices, whereas X ∈ C^{n×p} and Y ∈ C^{r×p} are the matrices to be determined. When B = 0 and E = I, this matrix equation becomes the normal Sylvester-conjugate matrix equation; when A = I and B = 0, it becomes the Kalman-Yakubovich-conjugate matrix equation; when A = I and S = 0, it becomes the Yakubovich-conjugate matrix equation. Moreover, when A = I, the matrix Equation (1) becomes the nonhomogeneous Yakubovich-conjugate matrix equation X + BY = EX̄F + S investigated in [11]; when B = 0, the matrix Equation (1) becomes the extended Sylvester-conjugate matrix equation AX = EX̄F + S investigated in [14]; when E = I (and X is interchanged with X̄), the matrix Equation (1) becomes the nonhomogeneous Sylvester-conjugate matrix equation AX̄ + BY = XF + S investigated in [8], Section 3, and furthermore, if S = 0, it becomes the homogeneous Sylvester-conjugate matrix equation AX̄ + BY = XF investigated in [8], Section 2. Hence, Equation (1) unifies many important conjugate versions of the Sylvester matrix equation.
In [9], Wu, Feng, Liu, and Duan proposed a unified approach to solving a more general class of Sylvester-polynomial-conjugate matrix equations that includes the matrix Equation (1) and the Sylvester polynomial matrix equation (see [15]) as special cases. To present the main result of [9], which is done in Theorem 1 below, we first recall some definitions and notations introduced in [9,16] (alternatively, see [17], pp. 98, 99, 368, 389; we refer the reader to [17], Chapter 10, for more detailed information on the Sylvester-polynomial-conjugate matrix equations). These definitions and notations may look a bit complicated at first glance, but we have to cite them in order to be able to present the main result of [9], which was our motivation for this paper and which we generalize broadly by putting it in a new context. In Section 3, we express these definitions and notations in the language of matrices over skew polynomial rings, and from this new perspective, they become clear and easy to understand. The reader who does not wish to become familiar with these specific objects now can move to the paragraph after Theorem 1 and possibly return to the skipped text later.
For any complex matrix V and non-negative integer k, the matrix V^{*k} is defined inductively by taking V^{*k} to be the (componentwise) complex conjugate of V^{*(k−1)}, with V^{*0} = V. The set of polynomials over C^{n×m} in the indeterminate s is denoted by C^{n×m}[s], and its elements are called complex polynomial matrices. Given T(s) = ∑_{i=0}^{t} T_i s^i ∈ C^{n×r}[s], V ∈ C^{r×p}, and F ∈ C^{p×p}, the Sylvester-conjugate sum is defined as

T(s) ⊞_F V = ∑_{i=0}^{t} T_i V^{*i} F^{*(i−1)} F^{*(i−2)} ⋯ F^{*1} F, (2)

where for i = 0 the product F^{*(i−1)} ⋯ F^{*1} F is understood to be the identity matrix. For complex polynomial matrices A(s) = ∑_{i=0}^{t} A_i s^i ∈ C^{n×r}[s] and B(s) = ∑_{j=0}^{w} B_j s^j ∈ C^{r×q}[s], their conjugate product is defined as

A(s) ⊛ B(s) = ∑_{i=0}^{t} ∑_{j=0}^{w} A_i B_j^{*i} s^{i+j}. (3)

In [9], the authors investigated a general type of complex matrix equations, which they called the Sylvester-polynomial-conjugate matrix equation,

∑_{i=0}^{t} A_i X^{*i} F^{*(i−1)} ⋯ F^{*1} F + ∑_{j=0}^{w} B_j Y^{*j} F^{*(j−1)} ⋯ F^{*1} F = C, (4)

where A_i ∈ C^{n×n}, B_j ∈ C^{n×r}, C ∈ C^{n×p}, and F ∈ C^{p×p} are known matrices, and X ∈ C^{n×p} and Y ∈ C^{r×p} are the unknown matrices to be determined. It is easy to see that by using the Sylvester-conjugate sum (2), Equation (4) can be written as

A(s) ⊞_F X + B(s) ⊞_F Y = C, (5)

where A(s) = ∑_{i=0}^{t} A_i s^i and B(s) = ∑_{j=0}^{w} B_j s^j. Moreover, for A(s) = A − Es, B(s) = B, and C = S, we have A(s) ⊞_F X + B(s) ⊞_F Y = AX − EX̄F + BY, and thus, for such A(s), B(s), and C, Equation (5) becomes the generalized Sylvester-conjugate matrix Equation (1). Thus, each method for solving the polynomial matrix Equation (5) automatically provides a method for solving the matrix Equation (1) and, hence, the conjugate variants of the Sylvester matrix equation listed in the first four paragraphs of this section.
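The degree-one reduction just described can be checked numerically. The sketch below (the helper `conj_sum` is ours, implementing the Sylvester-conjugate sum as iteration of V ↦ V̄F, consistent with the reconstruction used in this paper; for degree ≤ 1 this is simply T_0 V + T_1 V̄ F) confirms that with A(s) = A − Es and B(s) = B the polynomial equation coincides with the generalized Sylvester-conjugate equation:

```python
import numpy as np

def conj_sum(coeffs, V, F):
    """Sylvester-conjugate sum T(s) applied to V: the i-th term is
    T_i Phi^i(V), where Phi(V) = conj(V) @ F.  For degree <= 1 this is
    T_0 V + T_1 conj(V) F, which is all the reduction below needs."""
    out = np.zeros((coeffs[0].shape[0], V.shape[1]), dtype=complex)
    W = V.astype(complex)
    for T in coeffs:
        out += T @ W
        W = np.conj(W) @ F
    return out

rng = np.random.default_rng(1)
cm = lambda a, b: rng.standard_normal((a, b)) + 1j * rng.standard_normal((a, b))
n, r, p = 3, 2, 4
A, E, Bm = cm(n, n), cm(n, n), cm(n, r)
F, X, Y = cm(p, p), cm(n, p), cm(r, p)

# A(s) = A - E s and B(s) = B: the left side of A(s) [+]_F X + B(s) [+]_F Y = S ...
lhs = conj_sum([A, -E], X, F) + conj_sum([Bm], Y, F)
# ... equals the left side of the generalized equation A X + B Y - E conj(X) F = S
assert np.allclose(lhs, A @ X + Bm @ Y - E @ np.conj(X) @ F)
```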
Recall from [18] that polynomial matrices A(s) ∈ C^{n×n}[s] and B(s) ∈ C^{n×r}[s] are left coprime in the frame of the conjugate product if there exists a polynomial matrix U(s) ∈ C^{(n+r)×(n+r)}[s] such that U(s) is invertible with respect to the conjugate product ⊛ and [A(s) B(s)] ⊛ U(s) = [I 0]. Below, we recall the main result of [9], which provides a complete solution of the Sylvester-polynomial-conjugate matrix Equation (5) in the case where A(s) and B(s) are left coprime.
Theorem 1. Let A(s) ∈ C^{n×n}[s] and B(s) ∈ C^{n×m}[s] be left coprime in the framework of the conjugate product. Hence, there exist polynomial matrices P(s) ∈ C^{n×n}[s], G(s) ∈ C^{n×m}[s], D(s), Q(s) ∈ C^{m×n}[s], and H(s), W(s) ∈ C^{m×m}[s] such that A(s) ⊛ P(s) + B(s) ⊛ Q(s) = I_n, D(s) ⊛ G(s) + W(s) ⊛ H(s) = I_m, and A(s) ⊛ G(s) + B(s) ⊛ H(s) = 0. Then, for any matrices F ∈ C^{p×p} and C ∈ C^{n×p}, a pair (X, Y) ∈ C^{n×p} × C^{m×p} satisfies the equation

A(s) ⊞_F X + B(s) ⊞_F Y = C (6)

if and only if X = P(s) ⊞_F C + G(s) ⊞_F Z and Y = Q(s) ⊞_F C + H(s) ⊞_F Z, where Z ∈ C^{m×p} is an arbitrarily chosen parameter matrix.
Throughout the paper, the set of positive integers is denoted by N, the imaginary unit of the field C of complex numbers is denoted by i, and the imaginary basis elements of the division ring H of quaternions are denoted by i, j, k.
In this paper, we put the Sylvester-conjugate matrix Equation (6) in the much more general context of matrices over skew polynomial rings. Recall that if R is a ring (not necessarily commutative) and σ : R → R is an endomorphism of the ring R, then the skew polynomial ring R[s; σ] consists of polynomials over R in one indeterminate s (i.e., polynomials of the form ∑_{i=0}^{n} a_i s^i with a_i ∈ R) that are added in the obvious way and multiplied formally subject to the rule sa = σ(a)s for any a ∈ R, along with the rules of distributivity and associativity. Clearly, if σ is the identity map id_R of R, then the ring R[s; σ] coincides with the usual polynomial ring R[s], and thus, the usual polynomial ring is a special case of the skew polynomial ring construction. Skew polynomial rings are a well-known tool in algebra for providing examples of the lack of symmetry between many ring objects defined by multiplication from the left and their counterparts defined by multiplication from the right.
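The rule sa = σ(a)s completely determines the multiplication of skew polynomials. A minimal sketch of ours, over R = C with σ the complex conjugation (coefficient lists, lowest degree first):

```python
def skew_mul(a, b, sigma):
    """Multiply skew polynomials given as coefficient lists (low degree first)
    in R[s; sigma], using s * c = sigma(c) * s, i.e.
    (a_i s^i)(b_j s^j) = a_i sigma^i(b_j) s^{i+j}."""
    c = [0.0 * (a[0] * b[0])] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            t = bj
            for _ in range(i):      # apply sigma i times
                t = sigma(t)
            c[i + j] = c[i + j] + ai * t
    return c

conj = lambda z: z.conjugate()
s = [0.0 + 0j, 1.0 + 0j]            # the polynomial s
a = [2.0 + 3.0j]                    # the constant polynomial a
# The defining rule s * a = sigma(a) * s, and its contrast with a * s:
assert skew_mul(s, a, conj) == [0j, 2.0 - 3.0j]   # = conj(a) * s
assert skew_mul(a, s, conj) == [0j, 2.0 + 3.0j]   # = a * s
```

Taking `sigma` to be the identity recovers ordinary polynomial multiplication, mirroring the remark that R[s] is the special case σ = id_R.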
The main advantage of our approach to Sylvester-conjugate matrix equations via skew polynomial rings lies in the freedom of choosing both the ring R and its endomorphism σ. For instance, as we will see in Section 4, to obtain the matrix Equation (4), it suffices to take R = C and σ the complex conjugation (i.e., σ(z) = z̄). On the other hand, by taking R = C with σ = id_C, we obtain the non-conjugate version of (4) and non-conjugate versions of all the matrix equations mentioned in the first four paragraphs of this section. Moreover, to obtain j-conjugate versions of these Sylvester-like matrix equations (which are well studied in the literature; see Section 4), it suffices to take for R the division ring of quaternions H and for σ the j-conjugation (i.e., σ(h) = −jhj).
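The j-conjugation σ(h) = −jhj can be verified to be a ring endomorphism by direct computation; the tiny `Quat` class below is an illustration of ours, not part of the paper:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quat:
    """Quaternion a + b i + c j + d k with ij = -ji = k, i^2 = j^2 = k^2 = -1."""
    a: float; b: float; c: float; d: float

    def __mul__(self, o):
        # Hamilton product
        return Quat(
            self.a*o.a - self.b*o.b - self.c*o.c - self.d*o.d,
            self.a*o.b + self.b*o.a + self.c*o.d - self.d*o.c,
            self.a*o.c - self.b*o.d + self.c*o.a + self.d*o.b,
            self.a*o.d + self.b*o.c - self.c*o.b + self.d*o.a,
        )
    def __neg__(self):
        return Quat(-self.a, -self.b, -self.c, -self.d)

J = Quat(0.0, 0.0, 1.0, 0.0)

def sigma(h):
    """The j-conjugation sigma(h) = -j h j."""
    return -(J * h * J)

h = Quat(1.0, 2.0, 3.0, 4.0)
# sigma negates the i- and k-parts and fixes the real and j-parts:
assert sigma(h) == Quat(1.0, -2.0, 3.0, -4.0)
# sigma is multiplicative and involutive, hence an automorphism of H:
g = Quat(-2.0, 0.5, 1.0, -1.0)
assert sigma(g * h) == sigma(g) * sigma(h)
assert sigma(sigma(h)) == h
```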
The paper is organized as follows. In Section 2, we present a general approach to equations of the form (6) based on groupoids and vector spaces. In Section 3, we apply the result of Section 2 to matrices over skew polynomial rings, obtaining Theorem 3, which describes all solutions to equations of the form (6) in the case where A(s) and B(s) are left coprime. As immediate consequences, in Section 4, we obtain Theorem 1 along with its version for the Sylvester-polynomial-j-conjugate matrix equation over quaternions. In particular, we develop some ideas of [9].

Main Result
Recall that a groupoid is a set M together with a binary operation ⊕ on M. If M, N are groupoids, then a map φ : M → N is called a groupoid homomorphism if φ(x ⊕ y) = φ(x) ⊕ φ(y) for all x, y ∈ M.

Let M_{11}, M_{12} be groupoids (soon, it will become clear why they are doubly indexed) and V_1, V_2 be vector spaces over a field K. Assume that for any j ∈ {1, 2}, an operation ⊠ : M_{1j} × V_j → V_1 is given. In this section, we consider the following problem of solving equations of the form of (5):

(∗) Given a ∈ M_{11}, b ∈ M_{12}, and c ∈ V_1, describe all pairs (x, y) ∈ V_1 × V_2 such that

(a ⊠ x) + (b ⊠ y) = c. (7)

Obviously, the structure consisting of groupoids M_{11}, M_{12} and vector spaces V_1, V_2, for which we have formulated Problem (∗), is too poor to provide a satisfactory solution. In the theorem below, we enrich the structure appropriately, obtaining a solution to Problem (∗).

Theorem 2. Let M_{11}, M_{12}, M_{21}, M_{22} be groupoids with operations commonly denoted by ⊕, and let V_1, V_2 be finite-dimensional vector spaces over a field K. Assume that for any i, j, k ∈ {1, 2}, operations ⊠ : M_{ij} × V_j → V_i and ⊙ : M_{ij} × M_{jk} → M_{ik} are given such that:
(1) The operation ⊠ induces groupoid homomorphisms with respect to the first operand and linear maps with respect to the second operand, that is,
(a) (S ⊕ T) ⊠ v = (S ⊠ v) + (T ⊠ v) for any S, T ∈ M_{ij} and v ∈ V_j;
(b) S ⊠ (ku + lv) = k(S ⊠ u) + l(S ⊠ v) for any S ∈ M_{ij}, k, l ∈ K, and u, v ∈ V_j;
(2) S ⊠ (T ⊠ v) = (S ⊙ T) ⊠ v for any S ∈ M_{ij}, T ∈ M_{jk}, and v ∈ V_k.
Let a ∈ M_{11} and b ∈ M_{12} be such that for some p ∈ M_{11}, g ∈ M_{12}, d, q ∈ M_{21}, and h, w ∈ M_{22}, the following conditions hold:
(i) ((a ⊙ p) ⊕ (b ⊙ q)) ⊠ v = v for any v ∈ V_1;
(ii) ((d ⊙ g) ⊕ (w ⊙ h)) ⊠ v = v for any v ∈ V_2;
(iii) ((a ⊙ g) ⊕ (b ⊙ h)) ⊠ v = 0 for any v ∈ V_2.
Then, for any c ∈ V_1, a pair (x, y) ∈ V_1 × V_2 satisfies Equation (7) if and only if x = (p ⊠ c) + (g ⊠ z) and y = (q ⊠ c) + (h ⊠ z) for some z ∈ V_2.

Proof. To prove the result, we consider the following two maps:

α : V_1 × V_2 → V_1, α(x, y) = (a ⊠ x) + (b ⊠ y), and β : V_2 → V_1 × V_2, β(z) = (g ⊠ z, h ⊠ z).

Let us note that α(x, y) is just the left side of Equation (7) that we want to solve, and by (1)(b), both α and β are linear maps. For any c ∈ V_1, by using (i), (1)(a), and (2), we obtain

α(p ⊠ c, q ⊠ c) = (a ⊠ (p ⊠ c)) + (b ⊠ (q ⊠ c)) = (((a ⊙ p) ⊕ (b ⊙ q)) ⊠ c = c,

and thus, (x, y) = (p ⊠ c, q ⊠ c) is a particular solution of Equation (7). Next, by (iii), (1)(a), and (2), for any z ∈ V_2 we have α(β(z)) = ((a ⊙ g) ⊕ (b ⊙ h)) ⊠ z = 0, so the image of β is contained in the kernel of α. Moreover, β is injective, since if β(z) = 0, then z = ((d ⊙ g) ⊕ (w ⊙ h)) ⊠ z = (d ⊠ (g ⊠ z)) + (w ⊠ (h ⊠ z)) = 0 by (ii). Since α is surjective (a particular solution exists for every c ∈ V_1), we get dim ker α = dim(V_1 × V_2) − dim V_1 = dim V_2 = dim im β, and hence ker α = im β. Consequently, the solutions of Equation (7) are exactly the pairs (p ⊠ c, q ⊠ c) + β(z) with z ∈ V_2, which completes the proof.

An Application to Matrices over Skew Polynomial Rings
Let R be a (possibly noncommutative) ring with unity, and let σ : R → R be a ring endomorphism of R. Then the set of polynomials over R in one indeterminate s, with the usual addition of polynomials and multiplication subject to the rule sa = σ(a)s for any a ∈ R (along with distributivity and associativity), is a ring, called the skew polynomial ring and denoted by R[s; σ] (see, e.g., [19], p. 10). Thus, elements of R[s; σ] are polynomials of the form ∑_{i=0}^{n} a_i s^i with usual (i.e., coefficientwise) addition and multiplication given by

(∑_{i=0}^{n} a_i s^i)(∑_{j=0}^{m} b_j s^j) = ∑_{i=0}^{n} ∑_{j=0}^{m} a_i σ^i(b_j) s^{i+j}.

For any n, m ∈ N and A = [a_{ij}] ∈ R^{n×m}, we put σ(A) = [σ(a_{ij})], i.e., σ(A) ∈ R^{n×m} is the matrix obtained by taking the value of σ at each entry of A. We denote by R^{n×m}[s; σ] the set of polynomials over R^{n×m} in the indeterminate s with usual addition of polynomials and with multiplication of any polynomials A(s) = ∑_{i=0}^{t} A_i s^i ∈ R^{n×m}[s; σ] and B(s) = ∑_{j=0}^{w} B_j s^j ∈ R^{m×l}[s; σ] given by

A(s)B(s) = ∑_{i=0}^{t} ∑_{j=0}^{w} A_i σ^i(B_j) s^{i+j}. (10)

Let us note that for any matrix A = [a_{ij}] ∈ R^{n×m}, the monomial As^k can be seen as the matrix of monomials [a_{ij} s^k] ∈ (R[s; σ])^{n×m}, and thus, R^{n×m}[s; σ] can be viewed as the set of n × m matrices over the skew polynomial ring R[s; σ], with addition and multiplication induced by those in the ring R[s; σ]; this is why elements of R^{n×m}[s; σ] are called polynomial matrices.
Let T be a ring, let n, m ∈ N, and let A ∈ T^{n×n} and B ∈ T^{n×m}. We say the matrices A and B are left coprime if there exists an invertible matrix U ∈ T^{(n+m)×(n+m)} such that for the block matrix [A B] ∈ T^{n×(n+m)} we have (cf. [17], Theorem 9.20)

[A B] U = [I_n 0].

Let us partition U and U^{−1} as

U = [P G; Q H] and U^{−1} = [F Z; D W]

with P, F ∈ T^{n×n}, G, Z ∈ T^{n×m}, Q, D ∈ T^{m×n}, and H, W ∈ T^{m×m}. Hence,

AP + BQ = I_n, DG + WH = I_m, AG + BH = 0. (11)

We use this observation in the following theorem, which solves Problem (∗) for left coprime matrices over a skew polynomial ring.

Theorem 3. Let R be a ring with an endomorphism σ such that R is a finite-dimensional vector space over a field K. Let n, m, p ∈ N, and assume that for any i, j ∈ {n, m}, an operation ⊠ : R^{i×j}[s; σ] × R^{j×p} → R^{i×p} is given with the following properties:
(1) (a) (S(s) + T(s)) ⊠ V = S(s) ⊠ V + T(s) ⊠ V for any S(s), T(s) ∈ R^{i×j}[s; σ] and V ∈ R^{j×p};
(b) S(s) ⊠ (kU + lV) = k(S(s) ⊠ U) + l(S(s) ⊠ V) for any k, l ∈ K, S(s) ∈ R^{i×j}[s; σ], and U, V ∈ R^{j×p};
(2) S(s) ⊠ (T(s) ⊠ V) = (S(s)T(s)) ⊠ V for any S(s) ∈ R^{i×j}[s; σ], T(s) ∈ R^{j×k}[s; σ], and V ∈ R^{k×p}, the product S(s)T(s) being the skew multiplication (10);
(3) I_i ⊠ V = V for any V ∈ R^{i×p}, where I_i is the identity matrix of R^{i×i}.
Let polynomial matrices A(s) ∈ R^{n×n}[s; σ] and B(s) ∈ R^{n×m}[s; σ] be left coprime. Hence, there exist polynomial matrices P(s) ∈ R^{n×n}[s; σ], G(s) ∈ R^{n×m}[s; σ], D(s), Q(s) ∈ R^{m×n}[s; σ], and H(s), W(s) ∈ R^{m×m}[s; σ] satisfying (11). Then, for any matrix C ∈ R^{n×p}, a pair (X, Y) ∈ R^{n×p} × R^{m×p} satisfies the equation

A(s) ⊠ X + B(s) ⊠ Y = C

if and only if X = P(s) ⊠ C + G(s) ⊠ Z and Y = Q(s) ⊠ C + H(s) ⊠ Z, where Z ∈ R^{m×p} is an arbitrarily chosen parameter matrix.

Proof. We apply Theorem 2 with M_{11} = R^{n×n}[s; σ], M_{12} = R^{n×m}[s; σ], M_{21} = R^{m×n}[s; σ], M_{22} = R^{m×m}[s; σ], V_1 = R^{n×p}, and V_2 = R^{m×p}, with ⊕ being the usual addition of polynomial matrices and ⊙ being the skew multiplication (10) of polynomial matrices. Properties (1) and (2) are exactly assumptions (1) and (2) of Theorem 2, and since R is a finite-dimensional vector space over K, so are V_1 and V_2. Moreover, by property (3), the identities (11) for A(s), B(s), P(s), G(s), D(s), Q(s), H(s), W(s) yield conditions (i)-(iii) of Theorem 2 with a = A(s), b = B(s), p = P(s), g = G(s), d = D(s), q = Q(s), h = H(s), and w = W(s), which completes the proof.
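The identities (11) and the resulting solution formulas are easy to see numerically in the simplest setting T = R (ordinary real matrices, i.e., Theorem 3 with σ = id and degree-0 polynomial matrices). In the sketch below (ours), any invertible U produces a left coprime pair by relabeling the first block row of U^{−1} as [A B]:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2
# Any invertible U yields a left coprime pair [A B] := first n rows of U^{-1},
# since then [A B] U = [I_n 0].
U = rng.standard_normal((n + m, n + m))
Uinv = np.linalg.inv(U)
A, B = Uinv[:n, :n], Uinv[:n, n:]
P, G = U[:n, :n], U[:n, n:]
Q, H = U[n:, :n], U[n:, n:]
D, W = Uinv[n:, :n], Uinv[n:, n:]

# The identities (11), read off from [A B] U = [I 0] and U^{-1} U = I:
assert np.allclose(A @ P + B @ Q, np.eye(n))
assert np.allclose(A @ G + B @ H, 0)
assert np.allclose(D @ G + W @ H, np.eye(m))

# They parametrize the full solution set of A X + B Y = C:
# X = P C + G Z, Y = Q C + H Z for an arbitrary parameter Z.
p = 4
C = rng.standard_normal((n, p))
Z = rng.standard_normal((m, p))
X, Y = P @ C + G @ Z, Q @ C + H @ Z
assert np.allclose(A @ X + B @ Y, C)
```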
We present examples of operations ⊠ on matrices over skew polynomial rings to which Theorem 3 can be applied.

Example 1.
Let R be a ring with an endomorphism σ such that R is a vector space over a field K. Let ⋆ : R[s; σ] × R → R be an operation with the following properties:
(1) (a) (S(s) + T(s)) ⋆ v = S(s) ⋆ v + T(s) ⋆ v for any S(s), T(s) ∈ R[s; σ] and v ∈ R;
(b) S(s) ⋆ (ku + lv) = k(S(s) ⋆ u) + l(S(s) ⋆ v) for any k, l ∈ K, S(s) ∈ R[s; σ], and u, v ∈ R;
(2) S(s) ⋆ (T(s) ⋆ v) = (S(s)T(s)) ⋆ v for any S(s), T(s) ∈ R[s; σ] and v ∈ R;
(3) 1 ⋆ v = v for any v ∈ R.
Let i, j, and p be positive integers. Below, we show how one can extend the operation ⋆ to an operation ⊠ : R^{i×j}[s; σ] × R^{j×p} → R^{i×p} satisfying conditions (1)-(3) of Theorem 3. For each m and r, let the (m, r) entry of T(s) ⊠ V, where T(s) = [T_{kl}(s)] ∈ R^{i×j}[s; σ] is viewed as a matrix of skew polynomials and V = [v_{lr}] ∈ R^{j×p}, be defined as (T(s) ⊠ V)_{mr} = ∑_{l=1}^{j} T_{ml}(s) ⋆ v_{lr}.

Example 2. Let R be a ring with an endomorphism σ such that R is a vector space over a field K, and let ⋆ : R × R → R be an operation with the following properties:
(1) (a) (S + T) ⋆ v = S ⋆ v + T ⋆ v for any S, T, v ∈ R;
(b) S ⋆ (ku + lv) = k(S ⋆ u) + l(S ⋆ v) for any k, l ∈ K and S, u, v ∈ R;
(2) S ⋆ (T ⋆ v) = (ST) ⋆ v for any S, T, v ∈ R;
(3) 1 ⋆ v = v for any v ∈ R.
Let φ : R → R be a K-linear map such that φ(T ⋆ v) = σ(T) ⋆ φ(v) for any T, v ∈ R. We extend the operation ⋆ : R × R → R to an operation ⋆ : R[s; σ] × R → R by setting (∑_{m=0}^{t} T_m s^m) ⋆ v = ∑_{m=0}^{t} T_m ⋆ φ^m(v). It is easy to verify that the extended operation ⋆ : R[s; σ] × R → R satisfies conditions (1)-(3) listed in Example 1, so by the method described in Example 1, ⋆ can be further extended to an operation ⊠ : R^{i×j}[s; σ] × R^{j×p} → R^{i×p} satisfying conditions (1)-(3) of Theorem 3.
Example 3. Let f, g, h ∈ R be such that f^2 + gh + 1 = 0. We define an operation ⋆ : C × C → C by setting, for any complex numbers T = a + bi and v = c + di (written in the algebraic form),

T ⋆ v = (ac + fbc + hbd) + (ad + gbc − fbd)i,

that is, T acts on v ≅ (c, d) ∈ R^2 as the real 2 × 2 matrix aI + bJ with J = [f h; g −f] (note that J^2 = (f^2 + gh)I = −I). Then, for R = C, K = R, and σ the complex conjugation, the conditions (1)-(3) of Example 2 hold, and thus, by using (as described in Example 2) the map φ : C → C defined by φ(x + yi) = xr + ys + (xt − yr)i with r, s, t ∈ R such that 2fr + gs + ht = 0, the operation ⋆ : C × C → C can be extended to an operation ⊠ : C^{i×j}[s; σ] × C^{j×p} → C^{i×p} satisfying the assumptions of Theorem 3.

Example 4. Let R be a ring with an endomorphism σ, and let K ⊆ R be a field such that σ(k) = k and kr = rk for any k ∈ K and r ∈ R. For T(s) = ∑_{m=0}^{t} T_m s^m ∈ R^{i×j}[s; σ] and V ∈ R^{j×p}, we define T(s) ⊠ V = ∑_{m=0}^{t} T_m σ^m(V). It is clear that the operation ⊠ : R^{i×j}[s; σ] × R^{j×p} → R^{i×p} satisfies conditions (1)-(3) of Theorem 3.
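The two constraints in Example 3 can be checked numerically. The sketch below is our concrete reading of the example (with i acting on C ≅ R^2 as the matrix [[f, h], [g, −f]]); it verifies the unit condition, the associativity condition, and the φ-compatibility for a sample choice of f, g, h, r, s, t:

```python
# Example 3 made concrete: pick f, g, h with f^2 + g*h + 1 = 0 and
# r, s, t with 2*f*r + g*s + h*t = 0.
f, g, h = 1.0, 1.0, -2.0            # 1 + (1)(-2) + 1 = 0
r, s, t = 1.0, 0.0, 1.0             # 2*1*1 + 1*0 + (-2)*1 = 0

def star(T, v):
    """T * v for T = a+bi, v = c+di: T acts as the 2x2 real matrix a*I + b*J
    with J = [[f, h], [g, -f]]."""
    a, b, c, d = T.real, T.imag, v.real, v.imag
    return complex(a*c + f*b*c + h*b*d, a*d + g*b*c - f*b*d)

def phi(v):
    """The K-linear map phi(x + yi) = (xr + ys) + (xt - yr)i."""
    x, y = v.real, v.imag
    return complex(x*r + y*s, x*t - y*r)

S, T, v = 2 - 1j, -1 + 3j, 0.5 + 2j
assert star(1, v) == v                                             # condition (3)
assert abs(star(S, star(T, v)) - star(S * T, v)) < 1e-12           # condition (2)
assert abs(phi(star(T, v)) - star(T.conjugate(), phi(v))) < 1e-12  # phi-compatibility
```

The checks pass because J^2 = −I (equivalent to f^2 + gh + 1 = 0) makes T ↦ aI + bJ a ring homomorphism, while 2fr + gs + ht = 0 makes the matrix of φ anticommute with J.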

Example 5.
Let R be a ring with an endomorphism σ, F ∈ R^{p×p}, and let K ⊆ R be a field such that σ(k) = k and kr = rk for any k ∈ K and r ∈ R. For T(s) = ∑_{m=0}^{t} T_m s^m ∈ R^{i×j}[s; σ] and V ∈ R^{j×p}, we define

T(s) ⊠ V = ∑_{m=0}^{t} T_m σ^m(V) σ^{m−1}(F) σ^{m−2}(F) ⋯ σ(F) F,

where for m = 0 the product σ^{m−1}(F) ⋯ σ(F)F is understood to be the identity matrix; equivalently, T(s) ⊠ V = ∑_{m=0}^{t} T_m Φ^m(V) for the map Φ : R^{j×p} → R^{j×p} given by Φ(V) = σ(V)F. One can verify that the operation ⊠ : R^{i×j}[s; σ] × R^{j×p} → R^{i×p} satisfies assumptions (1)-(3) of Theorem 3.
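Example 5's operation and, in particular, condition (2) of Theorem 3 can be tested numerically over R = C with σ the complex conjugation. The implementation below (ours, following the iterated-map reading of the definition) checks the mixed associativity against the skew multiplication (10):

```python
import numpy as np

def skew_mul(Ps, Qs):
    """Multiply polynomial matrices over C[s; conj]: coefficient c_{i+j} picks
    up P_i @ conj^i(Q_j), the matrix form of the rule s A = conj(A) s."""
    out = [np.zeros((Ps[0].shape[0], Qs[0].shape[1]), dtype=complex)
           for _ in range(len(Ps) + len(Qs) - 1)]
    for i, P in enumerate(Ps):
        for j, Q in enumerate(Qs):
            out[i + j] += P @ (np.conj(Q) if i % 2 else Q)
    return out

def box(Ts, V, F):
    """Example 5 for R = C, sigma = conjugation:
    T(s) boxtimes V = sum_m T_m Phi^m(V) with Phi(V) = conj(V) @ F."""
    out, W = np.zeros((Ts[0].shape[0], V.shape[1]), dtype=complex), V
    for T in Ts:
        out += T @ W
        W = np.conj(W) @ F
    return out

rng = np.random.default_rng(3)
cm = lambda a, b: rng.standard_normal((a, b)) + 1j * rng.standard_normal((a, b))
n, m, p = 2, 3, 2
S = [cm(n, m) for _ in range(3)]          # a degree-2 polynomial matrix S(s)
T = [cm(m, m) for _ in range(3)]          # a degree-2 polynomial matrix T(s)
V, F = cm(m, p), cm(p, p)

# Condition (2) of Theorem 3: S(s) box (T(s) box V) = (S(s)T(s)) box V
assert np.allclose(box(S, box(T, V, F), F), box(skew_mul(S, T), V, F))
```

The check relies on Φ(BW) = σ(B)Φ(W) for constant B, which is exactly why the iterated-map form is compatible with the skew multiplication.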

Solution to the Sylvester-Polynomial-Conjugate Matrix Equations over Complex Numbers and Quaternions
In Section 1, we recalled Theorem 1, the main result of [9], which gives the complete solution to the Sylvester-polynomial-conjugate matrix Equation (6) in the case where A(s) and B(s) are left coprime. Below, we obtain Theorem 1 as a direct corollary of Theorem 3.
Proof of Theorem 1. Let σ be the complex conjugation. Referring to the notation recalled in Section 1, it is easy to see that V^{*k} = σ^k(V) and that the conjugate product ⊛ is just the skew multiplication (10) of polynomial matrices over C[s; σ]. Furthermore, for any k ∈ R and z ∈ C, we have σ(k) = k and kz = zk, and by (2), T(s) ⊞_F V = ∑_{i=0}^{t} T_i σ^i(V) σ^{i−1}(F) ⋯ σ(F) F; thus, the Sylvester-conjugate sum ⊞_F is a special case of the operation ⊠ in Example 5 (with R = C and K = R). Hence, Theorem 1 is an immediate consequence of Theorem 3.
Let H be the skew field of quaternions, that is, H = R ⊕ Ri ⊕ Rj ⊕ Rk, with the multiplication performed subject to the rules ij = −ji = k and i^2 = j^2 = k^2 = −1. For a quaternion matrix A ∈ H^{m×n}, the matrix X̃ = −jAj is called the j-conjugate of A (the notion of the j-conjugate of a quaternion matrix was introduced in [20]; the notation for it differs between [20] and [21], and below we denote the j-conjugate of A by Ã). Similarly to Sylvester-conjugate matrix equations, their counterparts for matrices over quaternions are intensively studied. For instance, the normal Sylvester-j-conjugate matrix equation AX̃ − XB = C was investigated in [22-25], the Kalman-Yakubovich-j-conjugate matrix equation X − AX̃B = C was investigated in [22,26-29], and the homogeneous Sylvester-j-conjugate matrix equation AX̃ + BY = XF was investigated in [30]. Furthermore, the homogeneous Yakubovich-j-conjugate matrix equation X + BY = EX̃F was investigated in [28], and the nonhomogeneous Yakubovich-j-conjugate matrix equation X + BY = EX̃F + S was investigated in [31]. The two-sided generalized Sylvester matrix equation A_1XB_1 + A_2YB_2 = C over H was investigated in [32].
In [21], Wu, Liu, Li, and Duan defined the notion of the j-conjugate product of quaternion polynomial matrices. First, for any quaternion matrix A and non-negative integer k, they defined inductively the quaternion matrix A^{•k} by setting A^{•k} = −jA^{•(k−1)}j with A^{•0} = A. For quaternion polynomial matrices A(s) = ∑_{i=0}^{t} A_i s^i ∈ H^{n×r}[s] and B(s) = ∑_{j=0}^{w} B_j s^j ∈ H^{r×q}[s], their j-conjugate product is defined as

A(s) ⊛ B(s) = ∑_{i=0}^{t} ∑_{j=0}^{w} A_i B_j^{•i} s^{i+j}. (12)

Let σ : H → H be the map defined by σ(h) = −jhj for any h ∈ H. Then, σ is an automorphism of the division ring H, and for any B ∈ H^{q×r} and non-negative integer i, we have that B^{•i} = σ^i(B). Hence, the j-conjugate product (12) is simply the product of matrices over the skew polynomial ring H[s; σ]. Given T(s) = ∑_{i=0}^{t} T_i s^i ∈ H^{n×r}[s], V ∈ H^{r×p}, and F ∈ H^{p×p}, analogously as in [9], we define the Sylvester-j-conjugate sum as

T(s) ⊞_F V = ∑_{i=0}^{t} T_i V^{•i} F^{•(i−1)} F^{•(i−2)} ⋯ F^{•1} F.

Similarly as in Section 1, one can easily see that each of the aforementioned j-conjugate matrix equations is a special case of the polynomial equation A(s) ⊞_F X + B(s) ⊞_F Y = C.

Here A(s) ∈ H^{n×n}[s], B(s) ∈ H^{n×m}[s], F ∈ H^{p×p}, and C ∈ H^{n×p} are given, and X ∈ H^{n×p}, Y ∈ H^{m×p} are unknown. Since σ(r) = r and rh = hr for any r ∈ R and h ∈ H, with the use of the argument of Example 5, we can apply Theorem 3 to matrices over the skew polynomial ring H[s; σ], obtaining the following result as a direct corollary.

Theorem 4. Let A(s) ∈ H^{n×n}[s] and B(s) ∈ H^{n×m}[s] be left coprime in the framework of the j-conjugate product ⊛. Hence, there exist polynomial matrices P(s) ∈ H^{n×n}[s], G(s) ∈ H^{n×m}[s], D(s), Q(s) ∈ H^{m×n}[s], and H(s), W(s) ∈ H^{m×m}[s] such that A(s) ⊛ P(s) + B(s) ⊛ Q(s) = I_n, D(s) ⊛ G(s) + W(s) ⊛ H(s) = I_m, and A(s) ⊛ G(s) + B(s) ⊛ H(s) = 0. Then, for any matrices F ∈ H^{p×p} and C ∈ H^{n×p}, a pair (X, Y) ∈ H^{n×p} × H^{m×p} satisfies the equation A(s) ⊞_F X + B(s) ⊞_F Y = C if and only if X = P(s) ⊞_F C + G(s) ⊞_F Z and Y = Q(s) ⊞_F C + H(s) ⊞_F Z, where Z ∈ H^{m×p} is an arbitrarily chosen parameter matrix.

Conclusions

This study focuses on Sylvester-polynomial-conjugate matrix equations, which unify many known versions and generalizations of the Sylvester matrix equation and have numerous applications, for instance, in systems and control theory, signal processing, image restoration, engineering, and differential equations. In this paper, we propose a new general approach to Sylvester-polynomial-conjugate matrix equations based on algebraic structures such as groupoids, vector spaces, and skew polynomial rings. The novelty and broad scope of our approach lie mainly in the freedom of choosing both the ring of coefficients and its endomorphism in the construction of a skew polynomial ring and, consequently, of the polynomial matrix structures appearing in Sylvester-polynomial-conjugate matrix equations.