A Comprehensive Review on the Generalized Sylvester Equation AX − YB = C

Wang, Qing-Wen; Gao, Jiale

doi:10.3390/sym17101686

Open AccessReview

A Comprehensive Review on the Generalized Sylvester Equation AX − YB = C

by

Qing-Wen Wang

^1,2,*

and

Jiale Gao

¹

Department of Mathematics, Shanghai University, Shanghai 200444, China

²

Collaborative Innovation Center for the Marine Artificial Intelligence, Shanghai 200444, China

^*

Author to whom correspondence should be addressed.

Symmetry 2025, 17(10), 1686; https://doi.org/10.3390/sym17101686

Submission received: 12 August 2025 / Revised: 14 September 2025 / Accepted: 18 September 2025 / Published: 8 October 2025

(This article belongs to the Special Issue Mathematics: Feature Papers 2025)

Download

Browse Figures

Versions Notes

Abstract

Since Roth’s work on the generalized Sylvester equation (GSE)

A X - Y B = C

in 1952, related research has consistently attracted significant attention. Building on this, this review systematically summarizes relevant research on GSE from five perspectives: research methods, constrained solutions, various generalizations, iterative algorithms, and applications. Furthermore, we provide comments on current research, put forward several intriguing questions, and offer prospects for future research trends. We hope this work can fill the gap in the review literature on GSE and offer some inspiration for subsequent studies in the field.

Keywords:

generalized Sylvester equation; Moore–Penrose inverse; matrix decompositions; semi-tensor product; constrained solutions; operator equations; tensor equations; polynomial matrix equations; iterative algorithms; hand–eye calibration; encryption and decryption schemes

MSC:

15A03; 15A09; 15A24; 15B33; 15B57; 65F10; 65F45

Contents

1.	Introduction.................................................................................................................................................................................................	3

2.	Preliminaries...............................................................................................................................................................................................	4

3.	Roth’s Equivalence Theorem......................................................................................................................................................................................................	6

4.	Different Methods on GSE...............................................................................................................................................................................................................	7
	4.1. Method by Linear Transformations and Subspace Dimensions................................................................................................	7
	4.2. Method by Generalized Inverses...................................................................................................................................................	8
	4.3. Method by Singular Value Decompositions.................................................................................................................................	10
	4.4. Method by Simultaneous Decompositions...................................................................................................................................	12
	4.5. Method by Real (Complex) Representations................................................................................................................................	13
	4.6. Method by Determinable Representations...................................................................................................................................	15
	4.7. Method by Semi-Tensor Products..................................................................................................................................................	20

5.	Constrained Solutions of GSE..................................................................................................................................................................	24
	5.1. Chebyshev Solutions and lp-Solutions..........................................................................................................................................	25
	5.2. ★-Congruent Solutions...................................................................................................................................................................	26
	5.3. (Minimum-Norm Least-Squares) Symmetric Solutions...............................................................................................................	27
	5.4. Self-Adjoint and Positive (Semi)Definite Solutions......................................................................................................................	28
	5.5. Per(Skew)Symmetric and Bi(Skew)Symmetric Solutions.............................................................................................................	29
	5.6. Maximal and Minimal Ranks of the General Solution..................................................................................................................	31
	5.7. Re-(Non)negative and Re-(Non)positive Definite Solutions.........................................................................................................	32
	5.8. η-Hermitian and η-Skew-Hermitian Solutions..............................................................................................................................	35
	5.9. ϕ-Hermitian Solutions......................................................................................................................................................................	37
	5.10. Equality-Constrained Solutions.......................................................................................................................................................	38

6.	Various Generalizations of GSE.................................................................................................................................................................	40
	6.1. Generalizing RET over Different Rings...........................................................................................................................................	40
	6.1.1. Generalizing RET over Unit Regular Rings......................................................................................................................	40
	6.1.2. Generalizing RET over Principal Ideal Domains..............................................................................................................	41
	6.1.3. Generalizing RET over Division and Module-Finite Rings...........................................................................................	42
	6.1.4. Generalizing RET over Commutative Rings...................................................................................................................	43
	6.1.5. Generalizing RET over Artinian and Noncommutative Rings.....................................................................................	44
	6.2. Generalizing RET to a Rank Minimization Problem....................................................................................................................	45
	6.3. GSE over Dual Numbers and Dual Quaternions..........................................................................................................................	46
	6.4. Linear Operator Equations on Hilbert Spaces...............................................................................................................................	49
	6.5. Tensor Equations...............................................................................................................................................................................	52
	6.6. Polynomial Matrix Equations..........................................................................................................................................................	56
	6.6.1. By the Divisibility of Polynomials.....................................................................................................................................	56
	6.6.2. By Skew-Prime Polynomial Matrices................................................................................................................................	57
	6.6.3. By the Realization of Matrix Fraction Descriptions.........................................................................................................	58
	6.6.4. By the Unilateral Polynomial Matrix Equation................................................................................................................	59
	6.6.5. By the Equivalence of Block Polynomial Matrices...........................................................................................................	61
	6.6.6. By Jordan Systems of Polynomial Matrices......................................................................................................................	62
	6.6.7. By Linear Matrix Equations...............................................................................................................................................	63
	6.6.8. By Root Functions of Polynomial Matrices......................................................................................................................	65
	6.7. Sylvester-Polynomial-Conjugate Matrix Equations.......................................................................................................................	66
	6.8. Generalized Forms of GSE................................................................................................................................................................	70

7.	Iterative Algorithms....................................................................................................................................................................................	75

8.	Applications to GSE....................................................................................................................................................................................	83
	8.1. Theoretical Applications..................................................................................................................................................................	84
	8.1.1. Solvability of Matrix Equations.........................................................................................................................................	84
	8.1.2. UTV Decomposition of Dual Matrices............................................................................................................................	85
	8.1.3. Microlocal Triangularization of Pseudo-Differential Systems......................................................................................	85
	8.2. Practical Applications.........................................................................................................................................................................	86
	8.2.1. Calibration Problems...........................................................................................................................................................	86
	8.2.2. Encryption and Decryption Schemes for Color Images...................................................................................................	87

9.	Conclusions........................................................................................................................................................................................................	89

10.	References..........................................................................................................................................................................................................	91

1. Introduction

This review commences with the famous Sylvester equation

A X - X B = C

with unknown X, named after the British mathematician James Joseph Sylvester (1814–1897) [1]. For a detailed discussion of this equation, refer to the review article [2] by Rajendra Bhatia, a Fellow of the Indian National Science Academy. The core of our review revolves around the generalized Sylvester equation (abbreviated as GSE):

A X - Y B = C

with unknown X and Y.

In 1952, Roth established the necessary and sufficient conditions for the solvability of GSE over a field by means of block matrix equivalence. Since then, a large number of papers on GSE have been published in rapid succession. The solvability conditions and explicit expressions of the general solution for GSE have been investigated from diverse perspectives: linear transformations, generalized inverses, matrix decompositions, real (complex) representations, determinant representations, and semi-tensor products, etc. Extensive studies have been conducted on various constrained solutions (e.g., ★-congruent, symmetric, self-adjoint, positive (semi)definite, per(skew)symmetric, bi(skew)symmetric, re-(non)negative definite, re-(non)positive definite,

η

-Hermitian,

η

-skew-Hermitian,

ϕ

-Hermitian, and equality-constrained solutions) of GSE, as well as its various best approximate solutions when the solvability conditions are not satisfied. Furthermore, the generalizations of GSE in different algebraic structures (e.g., unit regular rings, principal ideal domains, division rings, module-finite rings, commutative rings, Artinian rings, noncommutative rings, dual numbers, dual quaternions), operator equations, tensor equations, polynomial matrix equations, Sylvester-polynomial-conjugate matrix equations, and formal aspects have also shown remarkable vitality. Iterative algorithms for solving various solutions to GSE and its extended forms have also attracted considerable attention, with continuous updates and optimizations. Finally, the relevant results of GSE have demonstrated extraordinary significance in both theoretical applications (e.g., solvability of matrix equations, dual number matrix factorizations, and microlocal triangularization of pseudo-differential systems) and practical applications (e.g., hand-eye calibration problems, and encryption and decryption of color images).

Qing-Wen Wang, the first author of this paper, has been engaged in researching and teaching on the theory of matrix equations since the early 1990s, and has published over 100 related articles. Numerous scholars both domestically and internationally have suggested that we systematically present the theory of solving linear matrix equations. To date, our team has published three review articles, focusing, respectively, on solving the equations

A X B = C

[3],

A X = C

and

X B = D

[4], as well as

A_{1} X B_{1} = C_{1}

and

A_{2} X B_{2} = C_{2}

[5].

As mentioned earlier, in the development history of research on GSE, its conclusions are interrelated, mutually inspiring, mutually promotive, and progressively advanced, forming a complex, intertwined, and extensive network. By reviewing numerous literature, analyzing and comparing, summarizing, questioning, and prospecting, we strive to identify the commonalities, main threads, and approaches in these studies, aiming to provide insights and inspiration for subsequent research on GSE. Thus, this work is of significant value. In this paper, we intend to focus on GSE as the core theme, and unfold its solution methods, solution categories, generalizations, iterative algorithms, as well as theoretical and practical applications in a step-by-step manner, so as to provide a comprehensive overview.

The remainder of this paper consists of eight sections. Section 2 presents some necessary notations and notes. In Section 3, we introduce Roth’s work on GSE published in [6], which serves as the starting point for this paper. In Section 4, we summarize seven methods for studying GSE, demonstrating diverse perspectives and research schemes. Various solutions to GSE are discussed in detail in Section 5. We devote Section 6 to introducing the generalizations of GSE in certain algebraic aspects. In Section 7, we enumerate several classic iterative algorithms for solving various solutions to GSE and its generalized forms. The theoretical and practical applications of GSE are presented in Section 8. Conclusions are stated in Section 9.

2. Preliminaries

This section recalls some notations and definitions used throughout the paper. Additionally, other necessary terminology for subsequent (sub)sections will be introduced within each. A few symbols, due to their bulk, will be referenced via specific indices in original sources, rather than being reproduced. This not only does not hinder readers’ understanding but also makes the paper more concise and readable.

Let

F

be a ring; let

F [λ]

be the polynomial ring over

F

with the variable

λ

; let

F^{m \times n}

be the set of all

m \times n

matrices over

F

; let

F^{m \times n} [λ]

be the set of all

m \times n

polynomial matrices over

F

with the variable

λ

. In particular,

F^{m} = F^{m \times 1}

. Let

deg f (λ)

be the degree of

f (λ) \in F [λ]

; let

rank (A (λ))

be the rank of

A (λ) \in F^{m \times n} [λ]

; let

A^{T}

be the transpose of

A \in F^{m \times n}

. The symbol

det (A (λ))

denotes the determinant of

A (λ) \in F^{n \times n} [λ]

for a field

F

. The component-wise representation of

A \in F^{m \times n}

can be denoted in the three following forms:

A = [a_{i j}] = [a_{i, j}] = {[a_{i, j}]}_{m \times n} \in F^{m \times n} .

The Kronecker product of

A = [a_{i j}] \in F^{m \times n}

and

B \in F^{s \times t}

is defined by

A \otimes B = [\begin{matrix} a_{11} B & a_{12} B & \dots & a_{1 n} B \\ a_{21} B & a_{22} B & \dots & a_{2 n} B \\ ⋮ & ⋮ & ⋮ & ⋮ \\ a_{m 1} B & a_{m 2} B & \dots & a_{m n} B \end{matrix}] \in F^{m s \times n t} .

The matrices

A \in F^{m \times n}

and

B \in F^{m \times n}

are equivalent if there exist two nonsingular matrices

P \in F^{m \times m}

and

Q \in F^{n \times n}

such that

A = P B Q

. Moreover,

A (λ) \in F^{m \times n} [λ]

and

B (λ) \in F^{m \times n} [λ]

are equivalent if there exist two invertible polynomial matrices

P (λ) \in F^{m \times m} [λ]

and

Q (λ) \in F^{n \times n} [λ]

such that

A (λ) = P (λ) B (λ) Q (λ)

.

For a subspace

T \subseteq F^{n}

, let

P_{T}

be the orthogonal projector onto

T

; and let

T^{⊥}

and

dim (T)

be the orthogonal complement and dimension of

T

, respectively. In addition, denote

A T = {A t ∣ t \in T}

for

A \in F^{m \times n}

. For two vector spaces V and W over a field F, let

τ

be a linear transformation from V to W; then

Ker (τ)

and

Im (τ)

represent the kernel space and the image of

τ

, respectively.

Let

Z^{+}

,

N

,

R

, and

C

be the sets of all positive integers, natural numbers, real numbers, and complex numbers, respectively. The symbol

sign (a)

stands for the sign of

a \in R

, and ∅ represents the empty set. Let the set of all (real) quaternions be

H = {q_{1} + q_{2} i + q_{3} j + q_{4} k ∣ i^{2} = j^{2} = k^{2} = - 1, i j k = - 1, q_{1}, q_{2}, q_{3}, q_{4} \in R},

which is a four-dimensional noncommutative division algebra over

R

[7]. Let the conjugate of a quaternion

q = q_{1} + q_{2} i + q_{3} j + q_{4} k \in H

be

\bar{q} = q_{1} - q_{2} i - q_{3} j - q_{4} k .

Let

A = [a_{i j}] \in H^{m \times n}

. The conjugate of A is

\bar{A} = [{\bar{a}}_{i j}] \in H^{m \times n}

, and the conjugate transpose of A is

A^{*} = {\bar{A}}^{T}

. Let

A^{η} = - η A η and {A^{η}}^{*} = - η A^{*} η,

where

η \in {i, j, k}

. If

A = {A^{η}}^{*}

(

A = - {A^{η}}^{*}

) for

η \in {i, j, k}

, then A is called

η

-Hermitian (

η

-anti-Hermitian) [8]. An

η

-anti-Hermitian matrix is also called an

η

-skew-Hermitian matrix. Moreover,

{A^{i}}^{*} = A^{*}

and

{A^{j}}^{*} = {A^{k}}^{*} = A^{T}

for

A \in C^{m \times n}

.

Denote

P_{A} = A^{†} A, Q_{A} = A A^{†}, L_{A} = I - A^{†} A, R_{A} = I - A A^{†},

where

A^{†}

is the Moore–Penrose inverse [9] of a matrix A. The symbols

A^{- 1}

,

rank (A)

,

R (A)

, and

N (A)

denote the inverse, rank, range, and null space of a matrix A, respectively. In addition,

\bar{A}

,

A^{*}

, and

{∥ A ∥}_{F}

are the conjugate, conjugate transpose, and Frobenius norm of

A \in C^{m \times n}

, respectively. Let I and 0 be the identity matrix and the null matrix, respectively, with appropriate orders. Specifically,

I_{n}

denotes the

n \times n

identity matrix. For a symmetric matrix

A \in R^{m \times m}

,

A > 0

and

A \geq 0

denote the positive definite matrix and the positive semidefinite matrix, respectively. In addition, denote

A \geq B

if

A - B \geq 0

for matrices A and B. The matrix

A = [A_{1}, A_{2}, \dots, A_{n}] = [\begin{matrix} A_{1} & A_{2} & \dots & A_{n} \end{matrix}]

denotes a new matrix formed by arranging matrix blocks

A_{1}, A_{2}, \dots, A_{n}

(with the same number of rows) column-wise. The symbol

diag (σ_{1}, \dots, σ_{r})

denotes a (block) diagonal matrix with diagonal entries

σ_{1}, \dots, σ_{r}

.

For a ring

F

and

a \in F

, a is regular (or inner-invertible) if there exists

a^{-} \in F

such that

a a^{-} a = a

; a is

(1, 2)

-invertible (or reflexive) if there exists

a^{(1, 2)} \in F

such that

a a^{(1, 2)} a = a

and

a^{(1, 2)} a a^{(1, 2)} = a^{(1, 2)}

(see [10]). The characteristic of a field

F

is denoted as

char (F)

.

An equation or a system of equations is said to be solvable (consistent) if it has at least one solution. The symbols “⇒” and “⇔” denote “imply” and “if and only if”, respectively. For the reader’s convenience, we provide the main abbreviations used in this paper along with their full names in Table 1.

At the end of this section, we present three specific notes regarding this review.

(1)

The selection of references follows the core principle of focusing on the theoretical research and practical applications of GSE, with specific criteria as follows:

(i): Time frame: It covers 19th-century foundational studies to 2025’s latest achievements. It includes classic works like Hamilton’s quaternion research (1844) and Sylvester’s matrix equation study (1884), and emphasizes 2015–2025 recent studies (over 30% of total);
(ii): Publication venues: Priority is given to peer-reviewed works, including top journals (e.g., SIAM J. Matrix Anal. Appl.), authoritative monographs (e.g., by Gohberg), and key conference papers (e.g., from IEEE ICMA);
(iii): Content types: Original research is the main focus, with a small number of GSE-related review literature included. Only a few preprints/arXiv works are selected, due to their significance for subsequent research.
(iv): Relevance scope: Though some research does not focus directly on GSE itself, its traces of relevance to GSE are easily detectable. Thus, we regard such content as an integral part of GSE-related research.

(2)

Results closely related to the theme are rigorously presented as theorems, while less relevant conclusions are briefly summarized narratively. Furthermore, the proofs of these theorems are omitted here.

(3)

The remarks in this paper include comments and suggestions on relevant results, encompassing both previous researchers’ views and our reflections, questions, and prospects.

3. Roth’s Equivalence Theorem

First, we specifically present the paper’s core equation:

A X - Y B = C,

(1)

where

A = [a_{i, j}] \in F^{m \times r}

,

B = [b_{i, j}] \in F^{s \times n}

, and

C = [c_{i, j}] \in F^{m \times n}

are given over a ring

F

.

Let

F

be a field. In 1952, Roth [6] first studied the necessary and sufficient conditions for the solvability to the polynomial matrix form of Equation (1), i.e.,

A (λ) X (λ) - Y (λ) B (λ) = C (λ),

(2)

where

A (λ), B (λ), C (λ) \in F^{n \times n} [λ]

, by using the normal form of polynomial matrices.

Theorem 1

([6]). Equation (2) has a solution pair

X (λ), Y (λ) \in F^{n \times n} [λ]

if and only if the polynomial matrices

[\begin{matrix} A (λ) & C (λ) \\ 0 & B (λ) \end{matrix}] a n d [\begin{matrix} A (λ) & 0 \\ 0 & B (λ) \end{matrix}]

are equivalent.

Roth [6] has stated that Theorem 1 remains valid for rectangular polynomial matrices of appropriate orders. Thus, the following theorem can be derived immediately.

Theorem 2

(Theorem 1, [6]). Let

A \in F^{m \times r}

,

B \in F^{s \times n}

, and

C \in F^{m \times n}

. Then,

Equation (1) has a solution pair X \in F^{r \times n} and Y \in F^{m \times s}

(3)

if and only if

[\begin{matrix} A & C \\ 0 & B \end{matrix}] and [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] are equivalent .

(4)

We call this theorem Roth’s equivalence theorem (abbreviated as RET).

Remark 1.

It is easy to find that the study of Equation (1) is equivalent to that of the equation

A X + Y B = C

(5)

with given

A = [a_{i, j}] \in F^{m \times r}

,

B = [b_{i, j}] \in F^{s \times n}

, and

C = [c_{i, j}] \in F^{m \times n}

, by regarding

- B

in Equation (1) as B in Equation (5). Thus, in this paper, we collectively refer to Equations (1) and (5) as GSE.

4. Different Methods on GSE

Roth [6] considered Equation (1) based on the canonical forms of the polynomial matrices. Subsequently, many scholars have provided different proofs of RET from various perspectives. These proofs demonstrate the effectiveness and uniqueness of different mathematical methods in considering Equation (1). In addition, these methods have also had a profound impact on the subsequent study of different Sylvester-type equations. This is precisely one of the enchantments of mathematics: reaching the same endpoint through different paths, or even completely unrelated ones. Furthermore, mathematicians ceaselessly pursue groundbreaking innovations and perpetually strive to discover the most elegant path.

Remark 2.

It is notable that it is easy to see that (3) implies (4) by

[\begin{matrix} I & - Y \\ 0 & I \end{matrix}] [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] [\begin{matrix} I & X \\ 0 & I \end{matrix}] = [\begin{matrix} A & A X - Y B \\ 0 & B \end{matrix}] .

Therefore, one only needs to consider the sufficiency of RET, i.e., (4) ⇒ (3).

4.1. Method by Linear Transformations and Subspace Dimensions

Let

F

be a field. Flanders and Wimmer [11] proved RET for

m = r

and

s = n

by means of linear transformations and subspace dimension arguments. This method is more fundamental and elementary than Roth’s method.

Proof of RET [11].

Step 1:: Define $ψ_{i} : M_{r + s, 2 (r + s)} \to M_{r + s}$ by

$\begin{matrix} ψ_{0} (U, W) = [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] U - W [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] and ψ_{1} (U, W) = [\begin{matrix} A & C \\ 0 & B \end{matrix}] U - W [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] . \end{matrix}$

Then, the condition (4) yields

dim (Ker (ψ_{0})) = dim (Ker (ψ_{1})) .

Step 2:: Let

$U = [\begin{matrix} U_{11} & U_{12} \\ U_{21} & U_{22} \end{matrix}] and W = [\begin{matrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{matrix}] .$

Then,

\begin{matrix} Ker (ψ_{0}) = \{(U, W) |\begin{matrix} A U_{11} = W_{11} A, & A U_{12} = W_{12} B, \\ B U_{21} = W_{21} A, & B U_{22} = W_{22} B, \end{matrix}\}, \\ Ker (ψ_{1}) = \{(U, W) |\begin{matrix} A U_{11} + C U_{21} = W_{11} A, & A U_{12} + C U_{22} = W_{12} B, \\ B U_{21} = W_{21} A, & B U_{22} = W_{22} B, \end{matrix}\} . \end{matrix}

Let

Z = {[\begin{matrix} U_{21} & U_{22} & W_{21} & W_{22} \end{matrix}] ∣ B U_{21} = W_{21} A, B U_{22} = W_{22} B} .

For

i = 0, 1

, define

ν_{i} : Ker (ψ_{i}) \to Z, ν_{i} (U, W) = [\begin{matrix} U_{21} & U_{22} & W_{21} & W_{22} \end{matrix}] .

Then,

Im (ν_{1}) \subseteq Im (ν_{0}) = Z and Ker (ν_{1}) = Ker (ν_{0}) .

So,

Im (ν_{1}) = Im (ν_{0})

.

Step 3:: Since $(U, W) \in K e r (ψ_{0})$ with $U_{22} = - I$ , there also exists such in $Ker (ψ_{1})$ . Therefore, $A U_{12} - W_{12} B = C$ , i.e., (3) holds.

□

Remark 3.

(1): In [11], Flanders and Wimmer mentioned that, by making small modifications to the above proof, one can similarly obtain the proof of RET under the condition of rectangular matrices A, B, and C.
(2): In terms of linear transformations and subspace dimensions, Dmytryshyn et al. discussed two highly complex systems of equations (see Theorem 1.1 in [12] and Theorem 1 in [13]).

4.2. Method by Generalized Inverses

Let

F

be a field. In 1955, Penrose [9] concisely defined the Moore–Penrose inverse of any rectangular matrix through four matrix equations.

Definition 1

([9,10]). Let

A \in F^{m \times n}

. If there exists

X \in F^{n \times m}

such that

\begin{matrix} (1) A X A = A, (2) X A X = X, (3) {(A X)}^{*} = A X, (4) {(X A)}^{*} = X A, \end{matrix}

then X is called the Moore–Penrose (abbreviated as MP) inverse of A and is denoted by

A^{†}

. Especially if

X \in F^{n \times m}

satisfies the above Equation

(1)

, then X is called an inner inverse of A and is denoted by

A^{-}

. Clearly,

A^{†}

is a special inner inverse of A.

Since then, the theory of generalized inverses has flourished (see [10,14,15,16,17,18]), and it has been widely used to study the solvability conditions of linear equations and represent the explicit expressions of their general solutions when solvable. Penrose’s study on the matrix equation

A X B = C

[9] is the most renowned.

In 1979, Baksalary and Kala [19] naturally utilized inner inverses to establish solvability conditions and representations of the general solution for Equation (1).

Theorem 3

([19]). Equation (1) has a solution pair

X \in F^{r \times n}

and

Y \in F^{m \times s}

if and only if

(I - A A^{-}) C (I - B^{-} B) = 0,

(6)

in which case

\begin{matrix} X & = A^{-} C + A^{-} Z B + (I - A^{-} A) W, \\ Y & = - (I - A A^{-}) C B^{-} + Z - (I - A A^{-}) Z B B^{-}, \end{matrix}

where W and Z are arbitrary matrices with appropriate orders.

Remark 4.

In views of RET and Theorem 3, Baksalary and Kala (Remark 2, [19]) noted that

(6) \Leftrightarrow (4) \Leftrightarrow rank [\begin{matrix} A & C \\ 0 & B \end{matrix}] = rank (A) + rank (B) .

They also pointed out that this result can be directly derived by using (Formula (8.7), [20]) in a particular case, i.e.,

rank [\begin{matrix} A & C \\ 0 & B \end{matrix}] = rank (A) + rank (B) + rank ((I - A A^{-}) C (I - B^{-} B)) .

Moreover, they said that proof using generalized inverses is simpler than Roth’s [6] and Flanders et al.’s [11]. In fact, this is indeed the case when considering the simplicity of the proof and its length.

Remark 5.

Under the hypotheses of Theorem 3, it is easy to obtain that

\begin{matrix} (6) & \Leftrightarrow R (C (I - B^{-} B)) \subseteq N ((I - A A^{-})) \end{matrix}

\begin{matrix} \Leftrightarrow C N (B) \subseteq R (A), \end{matrix}

(7)

which is also proved by Woude in (Lemma 3.2, [21]) according to an elementary method. In addition, Woude applied this result to a control problem that occurs in almost non-stationary stochastic processes via measurement feedback (see Theorems 3.3 and 4.1, [21]).

Remark 6.

It is easy to show that the solvability of Equation (1) is equivalent to the existence of X and Y such that

[\begin{matrix} I & - Y \\ 0 & I \end{matrix}] [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] [\begin{matrix} I & X \\ 0 & I \end{matrix}] = [\begin{matrix} A & C \\ 0 & B \end{matrix}] .

(8)

In terms of a geometrical method, Olshevsky gave a cyclic argument over

C

as follows:

(3) \Rightarrow (8) \Rightarrow (4) \Rightarrow (7) \Rightarrow (3),

(see the proof of Theorem 1.2 in [22]).

Let

F = C

. Meyer [23] revealed an interesting result, that is, the solvability of Equation (1) is equivalent to the existence of the upper block inner inverse of a certain block matrix.

Theorem 4

(Theorems 1 and 2, [23]). The block triangular complex matrix

T = [\begin{matrix} A & C \\ 0 & B \end{matrix}]

has an upper block triangular inner inverse if and only if

rank (T) = rank (A) + rank (B),

in which case,

T^{-} = [\begin{matrix} A^{-} & - A^{-} C B^{-} \\ 0 & B^{-} \end{matrix}],

is an inner inverse of T for any

A^{-}

and

B^{-}

.

4.3. Method by Singular Value Decompositions

Let

F = R

. The singular value decomposition (abbreviated as SVD) [24], as an important tool for the study of matrix theory, also plays a significant role in the research on solutions of matrix equations. Chu [25] utilized the SVD to study the solvability conditions and the representation of the general solution of Equation (5).

In fact, let the SVDs of A and B given in Equation (5) be

A = U_{A} D_{A} V_{A}^{T} and B = U_{B} D_{B} V_{B}^{T},

(9)

where

U_{A}

,

U_{B}

,

V_{A}

, and

V_{B}

are orthogonal matrices with appropriate orders,

D_{A} = [\begin{matrix} Σ_{A} & 0 \\ 0 & 0 \end{matrix}], D_{B} = [\begin{matrix} Σ_{B} & 0 \\ 0 & 0 \end{matrix}],

Σ_{A} = diag (α_{1}, \dots, α_{k})

with

α_{i} > 0

(

1 \leq i \leq k

), and

Σ_{B} = diag (δ_{1}, \dots, δ_{l})

with

δ_{j} > 0

(

1 \leq i \leq l

). Then,

(5) \Leftrightarrow U_{A} D_{A} V_{A}^{T} X + Y U_{B} D_{B} V_{B}^{T} = C \Leftrightarrow D_{A} \tilde{X} + \tilde{Y} D_{B} = \tilde{C},

where

\tilde{X} = V_{A}^{T} X V_{B}

,

\tilde{Y} = U_{A}^{T} Y U_{B}

, and

\tilde{C} = U_{A}^{T} C V_{B}

. Partitioning the above equation analogously to the partitioning of

D_{A}

and

D_{B}

, we obtain

\begin{matrix} [\begin{matrix} Σ_{A} & 0 \\ 0 & 0 \end{matrix}] [\begin{matrix} {\tilde{X}}_{11} & {\tilde{X}}_{12} \\ {\tilde{X}}_{21} & {\tilde{X}}_{22} \end{matrix}] + [\begin{matrix} {\tilde{Y}}_{11} & {\tilde{Y}}_{12} \\ {\tilde{Y}}_{21} & {\tilde{Y}}_{22} \end{matrix}] [\begin{matrix} Σ_{B} & 0 \\ 0 & 0 \end{matrix}] = [\begin{matrix} {\tilde{C}}_{11} & {\tilde{C}}_{12} \\ {\tilde{C}}_{21} & {\tilde{C}}_{22} \end{matrix}] \\ \Leftrightarrow & [\begin{matrix} Σ_{A} {\tilde{X}}_{11} + {\tilde{Y}}_{11} Σ_{B} & Σ_{A} {\tilde{X}}_{12} \\ {\tilde{Y}}_{21} Σ_{B} & 0 \end{matrix}] = [\begin{matrix} {\tilde{C}}_{11} & {\tilde{C}}_{12} \\ {\tilde{C}}_{21} & {\tilde{C}}_{22} \end{matrix}] . \end{matrix}

(10)

Based on the above discussion, the following theorem can be obtained.

Theorem 5

(Theorem 2, [25]). Under the notations in (9) and (10), let

U_{A} = [\begin{matrix} U_{A_{1}} & U_{A_{2}} \end{matrix}] and V_{B} = [\begin{matrix} V_{B_{1}} & V_{B_{2}} \end{matrix}] .

(1): Then, Equation (5) is solvable if and only if

${\tilde{C}}_{22} = 0, i . e ., U_{A_{2}}^{T} C V_{B_{2}} = 0 .$

(11)
(2): Suppose that (11) holds. Denote

$\tilde{C} = ({\tilde{c}}_{i j}) and M_{i j} = [\begin{matrix} α_{i} & δ_{j} \end{matrix}] .$

Then, ${\tilde{X}}_{21}$ , ${\tilde{X}}_{22}$ , ${\tilde{Y}}_{12}$ , and ${\tilde{Y}}_{22}$ are arbitrary,

${\tilde{X}}_{12} = Σ_{A}^{- 1} {\tilde{C}}_{12}, {\tilde{Y}}_{21} = {\tilde{C}}_{21} Σ_{B}^{- 1}, {\tilde{X}}_{11} = ({\tilde{x}}_{i j}), and {\tilde{Y}}_{11} = ({\tilde{y}}_{i j}),$

where

$[\begin{matrix} {\tilde{x}}_{i j} \\ {\tilde{y}}_{i j} \end{matrix}] = M_{i j}^{†} {\tilde{c}}_{i j} + (I - M_{i j}^{†} M_{i j}) Z_{i j}$

for arbitrary $Z_{i j}$ . Moreover, if ${\tilde{Y}}_{11}$ is arbitrary, then

${\tilde{X}}_{11} = Σ_{A}^{- 1} ({\tilde{C}}_{11} - {\tilde{Y}}_{11} Σ_{B}) .$
(3): If (11) holds, then

$\begin{matrix} X & = V_{A} [\begin{matrix} Σ_{A}^{- 1} (U_{A_{1}}^{T} C V_{B_{1}} - Z_{1} Σ_{B}) & Σ_{A}^{- 1} U_{A_{1}}^{T} C V_{B_{1}} \\ Z_{2} & Z_{3} \end{matrix}] V_{B}^{T}, \\ Y & = U_{A} [\begin{matrix} Z_{1} & Z_{4} \\ U_{A_{2}}^{T} C V_{B_{1}} Σ_{B}^{- 1} & Z_{5} \end{matrix}] U_{B}^{T}, \end{matrix}$

where $Z_{1}, \dots, Z_{5}$ are arbitrary matrices with appropriate orders.

Remark 7.

Interestingly, Chu in Theorem 1 of [25] utilized the generalized singular value decomposition (abbreviated as GSVD) [26] to study the extended form of Equation (5) over

R

, i.e.,

A X E + F Y B = C,

(12)

where

A, E, F, B

, and C are given real matrices with appropriate orders. Eleven years later, Xu et al. [27] once again discussed the solvability conditions, the general solution, and least-squares solutions of Equation (12) over

C

by using the canonical correlation decomposition (abbreviated as CCD) introduced in [28].

Remark 8.

Inspired by solving Equation (5) via SVD, one can utilize equivalent normal forms of A and B to study Equation (5) over a field, along with an approach similar to Theorem 5. In fact, we only need to regard orthogonal matrices

U_{A}

,

U_{B}

,

V_{A}

, and

V_{B}

in (9) as invertible matrices, the transpose

V_{A}^{T}

and

V_{B}^{T}

as the inverse

V_{A}^{- 1}

and

V_{B}^{- 1}

, and

Σ_{A}

and

Σ_{B}

as identity matrices of appropriate orders.

It can be found that using the equivalent normal form of matrices to solve matrix equations is also a very convenient method. This viewpoint has been confirmed multiple times in subsequent research. Wang et al. used the elementary row and column transformations of matrices to give the equivalent normal form of a matrix triplet

[\begin{matrix} A & B & C \end{matrix}]

with the same row number over an arbitrary division ring

F

(see Theorem 2.1, [29]). Similarly, the equivalent normal form of another matrix triplet

{[\begin{matrix} D^{T} & E^{T} & F^{T} \end{matrix}]}^{T}

with the same column number can also be obtained (see Theorem 2.2, [29]). The two equivalent normal forms are applied to solve the matrix equation

A X D + B Y E + C Z F = G,

(13)

where X, Y, and Z are unknown (see Theorem 3.2, [29]).

Interestingly, He et al. utilized Theorem 2.1 of [29] to propose a simultaneous decomposition of seven matrices over

H

(see Theorem 2.3, [30]), i.e.,

[\begin{matrix} G & A & B & C \\ D \\ E \\ F \end{matrix}]

and discussed Equation (13) once again by this decomposition (see Theorem 3.1, [30]). Compared with the method in [29], which directly applies the equivalent normal forms of two matrix triplets to matrix G, the simultaneous decomposition is more concise. In addition, Ref. [31] once again brilliantly demonstrates the important role of simultaneous decomposition in solving matrix equations.

4.4. Method by Simultaneous Decompositions

Let

F

be a field. Remark 8 notes using equivalent canonical forms of the matrices A and B to solve Equation (1). Can we find an equivalent canonical form that is simultaneously related to the matrix triplet

[\begin{matrix} C & A \\ B \end{matrix}]

(14)

so as to discuss Equation (1) more simply? Gustafson [32] gave a positive answer to this problem.

Theorem 6

([32]). Let A, B, and C be given in Equation (1). Then, there exist invertible matrices T, U, V, and W of appropriate orders such that

[\begin{matrix} T C W & T A U \\ V B W \end{matrix}] = [\begin{matrix} C^{'} & A^{'} \\ B^{'} \end{matrix}],

(15)

where

A^{'} = [\begin{matrix} I_{z} & 0 & 0 & 0 \\ 0 & I_{t_{3}} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{r_{2}} \\ 0 & 0 & 0 & 0 \end{matrix}], B^{'} = [\begin{matrix} I_{z} & 0 & 0 & 0 & 0 & 0 \\ 0 & I_{t_{1}} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{r_{1}} & 0 & 0 \end{matrix}], C^{'} = [\begin{matrix} I_{z} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{r_{1}} & 0 & 0 \\ 0 & 0 & 0 & 0 & I_{r_{2}} & 0 \\ 0 & 0 & 0 & 0 & 0 & I_{t_{2}} \end{matrix}],

z + t_{3} + r_{2} = rank (A)

,

z + t_{1} + r_{1} = rank (B)

, and

z + t_{2} + r_{1} + r_{2} = rank (C)

. We call (15) the simultaneous decomposition of (14).

Remark 9.

By elementary matrix row and column transformations, Gustafson designed the 12-step algorithm to obtain the simultaneous decomposition of (14) (see (Section 3, [32])).

Applying the simultaneous decomposition of (14) to Equation (1) yields

\begin{matrix} (1) & \Leftrightarrow T^{- 1} A^{'} U^{- 1} X - Y V^{- 1} B^{'} W^{- 1} = T^{- 1} C^{'} W^{- 1} \\ \Leftrightarrow A^{'} X^{'} - Y^{'} B^{'} = C^{'}, \end{matrix}

(16)

where

X^{'} = U^{- 1} X W

and

Y^{'} = T Y V^{- 1}

. Thus,

X = U X^{'} W^{- 1}

and

Y = T^{- 1} Y^{'} V

.

Theorem 7

([32]). Equation (16) is solvable if and only if

t_{2} = 0

, in which case,

A^{'} = [\begin{matrix} I_{z} & 0 & 0 & 0 & 0 \\ 0 & I_{t_{3}} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{r_{2}} & 0 \\ 0 & 0 & 0 & 0 & 0 \end{matrix}], B^{'} = [\begin{matrix} I_{z} & 0 & 0 & 0 & 0 \\ 0 & I_{t_{1}} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{r_{1}} & 0 \end{matrix}], C^{'} = [\begin{matrix} I_{z} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{r_{1}} & 0 \\ 0 & 0 & 0 & 0 & I_{r_{2}} \end{matrix}],

X^{'} = [\begin{matrix} X_{11} & X_{12} & 0 & X_{14} & 0 \\ X_{21} & X_{22} & 0 & X_{24} & 0 \\ X_{31} & X_{32} & X_{33} & X_{35} & X_{35} \\ X_{41} & X_{42} & 0 & X_{44} & I_{r_{2}} \end{matrix}], Y^{'} = [\begin{matrix} X_{11} - I_{z} & X_{12} & Y_{13} & X_{14} \\ X_{21} & X_{22} - I_{t_{1}} & Y_{23} & X_{24} \\ 0 & 0 & Y_{33} & 0 \\ 0 & 0 & Y_{43} & - I_{r_{1}} \\ X_{41} & X_{42} & Y_{53} & X_{44} \end{matrix}],

where the

X_{i j}

and

Y_{i j}

are arbitrary matrices with appropriate orders.

Remark 10.

The quiver theory introduced by Gabriel [33] is an important tool in the representation theory of algebras. Gustafson [32] gave a novel interpretation of the existence of the simultaneous decomposition of (14) from the perspective of the quiver theory. Finally, he gave a necessary and sufficient condition for Equation (1) to have a solution by using the corresponding representation of arrows (see Section 10, [32]). Note that refs. [12,34] also make use of the graph theory to discuss linear matrix equations.

Remark 11.

Wang et al. in Theorem 2.1 of [35] continued with the idea of simultaneous decomposition and decomposed the following matrices over a division ring

F

:

[\begin{matrix} A & B & C \\ D \end{matrix}],

(17)

where A, B, and C are of the same row number, and A and D are of the same column number. So, Theorem 6 is a corollary of Theorem 2.1 of [35] (see Corollary 2.2, [35]). Also, it should be noted that Theorem 2.1 of [29] is indeed a special case of Theorem 2.1 of [35] (see Corollary 2.3, [35]). In addition, Wang et al. applied the simultaneous decomposition of (17) to solving two types of systems of matrix equations. Interestingly, He et al. [36] further refined the simultaneous decomposition presented by [35].

He et al. [37] further considered the simultaneous decompositions of two more general forms over

H

:

[\begin{matrix} A \\ B & C & D \\ E \end{matrix}] and [\begin{matrix} A \\ B & C \\ D \\ E \end{matrix}],

where matrices in the same row (or column) have the same number of rows (or columns), and applied them to solving systems of quaternion matrix equations. Recently, Huo et al. [38] generalized the simultaneous decomposition of (17) to quaternion tensors under the Einstein product.

4.5. Method by Real (Complex) Representations

It is well known that an associative algebra

A

with finite dimensions over a field

F

is isomorphic to a subalgebra of the algebra

F^{n \times n}

, where n is the dimension of

A

over

F

(see [39]). We now consider the case of complex numbers, i.e.,

F = C

.

Let

A = A_{0} + A_{1} i \in C^{m \times r}

, where

A_{0}, A_{1} \in R^{m \times r}

, and

i

is the imaginary unit such that

i^{2} = - 1

. Define a map

ϕ : C^{m \times r} \to R^{2 m \times 2 r} with ϕ (A) = ϕ (A_{0} + A_{1} i) = [\begin{matrix} A_{0} & - A_{1} \\ A_{1} & A_{0} \end{matrix}] .

We call

ϕ (A)

a real representation of the complex matrix A [40]. Then, one can check that

ϕ (\cdot)

is an isomorphism of the real algebra

C^{m \times r}

onto the real subalgebra

\{[\begin{matrix} S_{0} & - S_{1} \\ S_{1} & S_{0} \end{matrix}] ∣ S_{0}, S_{1} \in R^{m \times r}\} .

Then,

(5) \Leftrightarrow ϕ (A) ϕ (X) + ϕ (Y) ϕ (B) = ϕ (C) .

On the other hand, suppose that a real matrix pair

\hat{X} = [\begin{matrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{matrix}] and \hat{Y} = [\begin{matrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{matrix}]

is a solution pair of the following equation

ϕ (A) \hat{X} + \hat{Y} ϕ (B) = ϕ (C) .

(18)

Then,

K_{2 r}^{- 1} \hat{X} K_{2 n}

and

K_{2 m}^{- 1} \hat{Y} K_{2 s}

are also a solution pair of Equation (18), where

K_{2 t} = [\begin{matrix} 0 & I_{t} \\ - I_{t} & 0 \end{matrix}] for t = m, n, r, s .

Let

\bar{X} = \frac{1}{2} (X_{11} + X_{22}) + \frac{1}{2} (X_{21} - X_{12}) i

and

\bar{Y} = \frac{1}{2} (Y_{11} + Y_{22}) + \frac{1}{2} (Y_{21} - Y_{12}) i

. Then,

\begin{matrix} ϕ (\bar{X}) & = \frac{1}{2} [\begin{matrix} X_{11} + X_{22} & X_{12} - X_{21} \\ X_{21} - X_{12} & X_{11} + X_{22} \end{matrix}] = \frac{1}{2} (\hat{X} + K_{2 r}^{- 1} \hat{X} K_{2 n}), \\ and ϕ (\bar{Y}) & = \frac{1}{2} [\begin{matrix} Y_{11} + Y_{22} & Y_{12} - Y_{21} \\ Y_{21} - Y_{12} & Y_{11} + Y_{22} \end{matrix}] = \frac{1}{2} (\hat{Y} + K_{2 m}^{- 1} \hat{Y} K_{2 s}) \end{matrix}

satisfy (18), so

\bar{X}

and

\bar{Y}

satisfy Equation (5).

Based on the above analysis, the problem of a complex matrix equation is transformed into a problem of a real matrix equation by using the real representation of complex matrices. Based on this consideration, Liu [40] proposed the following theorem for solving Equation (5) over

C

.

Theorem 8

(Lemma 1.3, [40]). Let

A = A_{0} + A_{1} i, B = B_{0} + B_{1} i, and C = C_{0} + C_{1} i

be given in Equation (5). Then, Equation (5) is consistent over

C

if and only if

[\begin{matrix} A_{0} & - A_{1} \\ A_{1} & A_{0} \end{matrix}] [\begin{matrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{matrix}] + [\begin{matrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{matrix}] [\begin{matrix} B_{0} & - B_{1} \\ B_{1} & B_{0} \end{matrix}] = [\begin{matrix} C_{0} & - C_{1} \\ C_{1} & C_{0} \end{matrix}]

(19)

is consistent over

R

, in which case,

\{\begin{matrix} X = X_{0} + X_{1} i = \frac{1}{2} (X_{11} + X_{22}) + \frac{1}{2} (X_{21} - X_{12}) i, \\ Y = Y_{0} + Y_{1} i = \frac{1}{2} (Y_{11} + Y_{22}) + \frac{1}{2} (Y_{21} - Y_{12}) i, \end{matrix}

(20)

where

X_{11}, X_{12}, X_{21}, X_{22}, Y_{11}, Y_{12}, Y_{21}

, and

Y_{22}

constitute the general solution of Equation (19). Furthermore, the explicit forms of X and Y given in (20) are

\begin{matrix} X_{0} = & \frac{1}{2} P_{1} ϕ {(A)}^{-} ϕ (C) Q_{1} + \frac{1}{2} P_{2} ϕ {(A)}^{-} ϕ (C) Q_{2} + [U_{1}, U_{2}] [\begin{matrix} ϕ (B) Q_{1} \\ ϕ (B) Q_{2} \end{matrix}] + [P_{1} F_{ϕ (A)}, P_{2} F_{ϕ (A)}] [\begin{matrix} V_{1} \\ V_{2} \end{matrix}], \\ X_{1} = & \frac{1}{2} P_{2} ϕ {(A)}^{-} ϕ (C) Q_{1} - \frac{1}{2} P_{1} ϕ {(A)}^{-} ϕ (C) Q_{2} + [U_{1}, U_{2}] [\begin{matrix} - ϕ (B) Q_{2} \\ ϕ (B) Q_{1} \end{matrix}] + [P_{2} F_{ϕ (A)}, - P_{1} F_{ϕ (A)}] [\begin{matrix} V_{1} \\ V_{2} \end{matrix}], \\ Y_{0} = & \frac{1}{2} S_{1} E_{ϕ (A)} ϕ (C) ϕ {(B)}^{-} T_{1} + \frac{1}{2} S_{2} E_{ϕ (A)} ϕ (C) ϕ {(B)}^{-} T_{2} - [S_{1} ϕ (A), S_{2} ϕ (A)] [\begin{matrix} {\hat{U}}_{1} \\ {\hat{U}}_{2} \end{matrix}] + [W_{1}, W_{2}] [\begin{matrix} E_{ϕ (B)} T_{1} \\ E_{ϕ (B)} T_{2} \end{matrix}], \\ Y_{1} = & \frac{1}{2} S_{2} E_{ϕ (A)} ϕ (C) ϕ {(B)}^{-} T_{1} - \frac{1}{2} S_{1} E_{ϕ (A)} ϕ (C) ϕ {(B)}^{-} T_{2} + [- S_{2} ϕ (A), S_{1} ϕ (A)] [\begin{matrix} {\hat{U}}_{1} \\ {\hat{U}}_{2} \end{matrix}] + [W_{1}, W_{2}] [\begin{matrix} E_{ϕ (B)} T_{1} \\ - E_{ϕ (B)} T_{2} \end{matrix}], \end{matrix}

where

P_{1} = [I_{r}, 0], P_{2} = [0, I_{r}], S_{1} = [I_{m}, 0], S_{2} = [0, I_{m}]

,

\begin{matrix} Q_{1} & = [\begin{matrix} I_{n} \\ 0 \end{matrix}], Q_{2} = [\begin{matrix} 0 \\ I_{n} \end{matrix}], T_{1} = [\begin{matrix} I_{s} \\ 0 \end{matrix}], T_{2} = [\begin{matrix} 0 \\ I_{s} \end{matrix}], \end{matrix}

and

U_{i}, V_{i}, {\hat{U}}_{i}, W_{i} (i = 1, 2)

are arbitrary matrices over

R

with appropriate orders.

Remark 12.

Liu [40] further used Theorem 8 to discuss the maximal and minimal ranks of Equation (5)’s solutions, as seen in Section 5.6 of this paper.

Remark 13.

We know that the quaternion algebra (or the quaternion division ring) cannot be a field because it does not satisfy the commutative law. However, when studying matrix equations over

H

, the method of converting the quaternion matrix equations to the real (or complex) matrix equations by using the real (or complex) representation of quaternions [7] is widely applied. For example, refs. [41,42] studied the special least-squares solutions of a class of quaternion matrix equations by using the real representation and the complex representation of quaternions, respectively. Moreover, the real (or complex) representation of some other quaternions mentioned in Remark 43 have been explored in [43,44,45,46].

On the other hand, because real (or complex) representations have the drawback of a significant increase in computational load, Wei et al. [47] introduced real structure-preserving methods over

H

for LU decomposition, QR decomposition, and SVD, thereby solving quaternion linear systems.

4.6. Method by Determinable Representations

As is well known, Cramer’s rule for the linear equation

A x = b

for unknown vector x is an effective means of expressing its unique solution. In addition, it can be found in Section 4.2 that the theory of generalized inverses is closely related to the study of Equation (1). Naturally, we can consider whether Cramer’s rule for the solutions of Equation (1) can be obtained through the determinant representations of generalized inverses.

Generalized inverses (especially the MP inverse) over fields (particularly the complex field) have been thoroughly discussed and applied to characterize various solutions of matrix equations (see [10,15,48]). Notably, Kyrchei [49] presented the determinantal representations of the MP inverse and the Drazin inverse [50] from the new perspective of limit representations of generalized inverses in 2008.

When we consider the determinantal representations of generalized inverses in the quaternion algebra, it relies on the theory of determinants of quaternion matrices. However, due to the noncommutativity of quaternions, the determinants of quaternion matrices become much more complicated (see [51,52]). It was not until several decades later, after Kyrchei [53,54] introduced the theory of column-row determinants over

H

, that this problem was effectively solved.

Definition 2

(Column-row determinants over

H

). (Definitions 2.4 and 2.5, [53]) Let

A = [a_{i j}] \in H^{n \times n}

, and let

S_{n}

be the symmetric group on

I_{n} = {1, 2, \dots, n}

.

(1): For $i = 1, 2, \dots, n$ , the i-th row determinant of A is defined by

$\begin{matrix} {rdet}_{i} A & = \sum_{σ \in S_{n}} {(- 1)}^{n - r} (a_{i i_{k_{1}}} a_{i_{k_{1}} i_{k_{1} + 1}} \dots a_{i_{k_{1} + l_{1}} i}) \dots (a_{i_{k_{r}} i_{k_{r} + 1}} \dots a_{i_{k_{r} + l_{r}} i_{k_{r}}}), \\ σ & = (i i_{k_{1}} i_{k_{1} + 1} \dots i_{k_{1} + l_{1}}) (i_{k_{2}} i_{k_{2} + 1} \dots i_{k_{2} + l_{2}}) \dots (i_{k_{r}} i_{k_{r} + 1} \dots i_{k_{r} + l_{r}}), \end{matrix}$

where $i_{k_{2}} < i_{k_{3}} < \dots < i_{k_{r}}$ , and $i_{k_{t}} < i_{k_{t} + s}$ for $t = 2, \dots, r$ and $s = 1, \dots, l_{t}$ .
(2): For $j = 1, 2, \dots, n$ , the j-th column determinant of A is defined by

$\begin{matrix} {cdet}_{j} A & = \sum_{τ \in S_{n}} {(- 1)}^{n - r} (a_{j_{k_{r}} j_{k_{r} + l_{r}}} \dots a_{j_{k_{r} + 1} j_{k_{r}}}) \dots (a_{j j_{k_{1} + l_{1}}} \dots a_{j_{k_{1} + 1} j_{k_{1}}} a_{j_{k_{1}} j}), \\ τ & = (j_{k_{r} + l_{r}} \dots j_{k_{r} + 1} j_{k_{r}}) \dots (j_{k_{2} + l_{2}} \dots j_{k_{2} + 1} j_{k_{2}}) (j_{k_{1} + l_{1}} \dots j_{k_{1} + 1} j_{k_{1}} j), \end{matrix}$

where $j_{k_{2}} < j_{k_{3}} < \dots < j_{k_{r}}$ and $j_{k_{t}} < j_{k_{t} + s}$ for $t = 2, \dots, r$ and $s = 1, \dots, l_{t}$ .

Remark 14.

Kyrchei in Theorem 3.1 of [53] showed that if

A \in H^{n \times n}

is Hermitian, i.e.,

A = A^{*}

, then

{rdet}_{1} A = \dots = {rdet}_{n} A = {cdet}_{1} A = \dots = {cdet}_{n} A \in R .

Thus, Remark 3.1 [53] defines the determinant of a Hermitian matrix A by

det A = {rdet}_{i} A = {cdet}_{i} A, (i = 1, 2, \dots, n) .

We now introduce some necessary notations. Let

A \in H^{m \times n}

. Then,

R_{l} (A)

,

R_{r} (A)

,

N_{l} (A)

, and

N_{r} (A)

denote the left row space, the right column space, the left null space, and the right null space of A, respectively. Let

α = {α_{1}, \dots, α_{k}} \subseteq {1, \dots, m} and β = {β_{1}, \dots, β_{k}} \subseteq {1, \dots, n},

where

1 \leq k \leq min {m, n}

. By

A_{β}^{α}

, we denote the submatrix of A whose rows are indexed by

α

and whose columns are indexed by

β

. If A is Hermitian, then

| A_{α}^{α} |

is the corresponding principal minor of A. For

1 \leq k \leq n

, let

L_{k, n} = {α ∣ α = (α_{1}, \dots, α_{k}), 1 \leq α_{1} < \dots < α_{k} \leq n} .

And, for

i \in α

and

j \in β

, let

I_{r, m} {i} = {α ∣ α \in L_{r, m}, i \in α} and J_{r, n} {j} = {β ∣ β \in L_{r, n}, j \in β} .

We denote the j-th column of A by

a_{. j}

and its i-th row by

a_{i .}

. And,

A_{. j} (b)

is the matrix that is obtained from A by replacing its j-th column by the column vector

b \in H^{m \times 1}

, and

A_{i .} (b)

is the matrix obtained from A by replacing its i-th row by the row vector

b \in H^{1 \times n}

. Symbols

a_{\cdot j}^{*}

and

a_{i \cdot}^{*}

denoted the j-th column and the i-th row of

A^{*}

, respectively.

Kyrchei [55] obtained the Cramer’s rules for some left, right and two-sided quaternion matrix equations by the theory of the column-row determinants over

H

. Song et al. [56] then utilized results in [55] to further study the Cramer’s rule for the quaternion matrix equation:

A X E + F Y B = C,

(21)

where

A \in H^{m \times n}

,

E \in H^{s \times q}

,

F \in H^{m \times q}

,

B \in H^{t \times q}

, and

C \in H^{m \times q}

are given.

Theorem 9

(Theorem 3.1, [56]). Suppose that Equation (21) is consistent. Let

\begin{matrix} T = A^{*} (I + R_{F}) A, S = E (I + L_{B}) E^{*}, A_{11} = A^{*} R_{F} A T^{†}, A_{22} = A^{*} A T^{†}, \\ E_{11} = S^{†} E E^{*}, E_{22} = S^{†} E L_{B} E^{*}, Y_{10} = A_{22}^{†} A_{11} A^{*} C L_{B} E^{*} + L_{A_{22}} A^{*} R_{F} C E^{*} E_{22} E_{11}^{†}, \\ Y_{20} = A_{11}^{†} A_{22} A^{*} R_{F} C E^{*} + L_{A_{11}} A^{*} C L_{B} E^{*} E_{11} E_{22}^{†} . \end{matrix}

Let

K^{*}

, L,

M^{*}

, and N be of full column rank matrices over

H

such that

N_{r} (T) = R_{r} (K^{*}), N_{r} (S) = R_{r} (L), N_{r} (F) = R_{r} (M^{*}), N_{r} (B) = R_{r} (N) .

Then, the general solution of Equation (21) is

\begin{matrix} x_{i j} & = \frac{{rdet}_{j} {(S + L L^{*})}_{j .} (c_{i .}^{A})}{\det (T + K^{*} K) \det (S + L L^{*})} = \frac{{cdet}_{i} {(T + K^{*} K)}_{. i} (c_{. j}^{E})}{\det (T + K^{*} K) \det (S + L L^{*})}, \\ y_{h l} & = \frac{{rdet}_{l} {(B B^{*} + N N^{*})}_{l .} (g_{h .}^{A})}{\det (F^{*} F + M^{*} M) \det (B B^{*} + N N^{*})} = \frac{{cdet}_{h} {(F^{*} F + M^{*} M)}_{. h} (g_{. l}^{E})}{\det (F^{*} F + M^{*} M) \det (B B^{*} + N N^{*})}, \end{matrix}

where

\begin{matrix} c_{i .}^{A} & = [{cdet}_{i} {(T + K^{*} K)}_{. i} (d_{. 1}), \dots, {cdet}_{i} {(T + K^{*} K)}_{. i} (d_{. s})], \\ g_{h .}^{A} & = [{cdet}_{h} {(F^{*} F + M^{*} M)}_{. h} (k_{. 1}), \dots, {cdet}_{h} {(F^{*} F + M^{*} M)}_{. h} (k_{. t})], \\ c_{. j}^{E} & = {[{rdet}_{j} {(S + L L^{*})}_{j .} (d_{1 .}), \dots, {rdet}_{j} {(S + L L^{*})}_{j .} (d_{n .})]}^{T}, \\ g_{. l}^{E} & = {[{rdet}_{l} {(B B^{*} + N N^{*})}_{l .} (k_{1 .}), \dots, {rdet}_{l} {(B B^{*} + N N^{*})}_{l .} (k_{q .})]}^{T} \end{matrix}

with

d_{i .}

and

d_{. j}

are the i-th row and j-th column of

(T + K^{*} K) T^{†} (A^{*} R_{F} C E^{*} + A^{*} C L_{B} E^{*} + Y_{10} + Y_{20}) S^{†} (S + L L^{*}) + M + W,

respectively, and

k_{. i}

and

k_{j .}

are the i-th column and j-th row of

(F^{*} F + M^{*} M) (F^{†} (C - A X E) B^{†}) (B B^{*} + N N^{*}) + Q,

respectively, for

i = 1, \dots, m

,

j = 1, \dots, s

,

h = 1, \dots, q

, and

l = 1, \dots, t

, where

\begin{matrix} M & = (T + K^{*} K) T^{†} (L_{A_{22}} V_{1} R_{E_{11}} + L_{A_{11}} V_{2} R_{E_{22}}) S^{†} (S + L L^{*}), \\ W & = T Z L L^{*} + K^{*} K Z S + K^{*} K Z L L^{*}, Q = F^{*} F H N N^{*} + M^{*} M H (B B^{*} + N N^{*}) \end{matrix}

for arbitrary

V_{1}

,

V_{2}

, Z, and H with appropriate orders.

Remark 15.

When E and F in Equation (21) are identity matrices, by Theorem 9, we immediately derived the Cramer’s rule to the matrix equation over

H

:

A X + Y B = C .

(22)

Note that Theorem 9 uses the auxiliary matrices, i.e., K, L, M, and N, to derive the determinant representation of the general solution of Equation (22). However, these auxiliary matrices are not always easy to obtain in practical applications. Then, can we obtain Cramer’s rule for Equation (22) only through its given coefficient matrices?

In order to answer the above problem, let us first take a look at another work of Kyrchei [57]. Based on the column-row determinant theory and the limit representation of the MP inverse, he gave the determinant representation of the MP inverse over

H

in [57]. This research has greatly promoted the study of Cramer’s rules to the various solutions of matrix equations over

H

(see [58,59,60,61,62,63,64]). Denote

H_{r}^{m \times n} = {A \in H^{m \times n} | rank (A) = r} .

Theorem 10

(Theorem 5, [57]). Let

A \in H_{r}^{m \times n}

and

A^{†} = [a_{i, j}^{+}] \in H^{n \times m}

. Then,

a_{i, j}^{+} = \frac{\sum_{β \in J_{r, n} {i}} {cdet}_{i} {({(A^{*} A)}_{\cdot i} (a_{\cdot j}^{*}))}_{β}^{β}}{\sum_{β \in J_{r, n}} |{(A^{*} A)}_{β}^{β}|} = \frac{\sum_{α \in I_{r, m} {j}} {rdet}_{j} {({(A A^{*})}_{j \cdot} (a_{i \cdot}^{*}))}_{α}^{α}}{\sum_{α \in I_{r, m}} |{(A A^{*})}_{α}^{α}|},

where

i = 1, 2 . \dots, n

and

j = 1, 2, \dots, m

.

A result similar to Theorem 3 was given by Kyrchei in [58].

Theorem 11

(Lemma 5.1, [58]). The following are equivalent:

(1): Equation (22) is solvable;
(2): $R_{A} C L_{B} = 0$ ;
(3): $rank [\begin{matrix} A & C \\ 0 & B \end{matrix}] = rank [\begin{matrix} A & 0 \\ 0 & B \end{matrix}]$ ;

in which case,

\begin{matrix} X = A^{†} C - A^{†} V B + L_{A} U and Y = R_{A} C B^{†} + A A^{†} V + W R_{B}, \end{matrix}

(23)

where U, V, and W are arbitrary matrices of appropriate orders over

H

.

It is easy to see that

\begin{matrix} X = A^{†} C and Y = C B^{†} - A A^{†} C B^{†} \end{matrix}

(24)

are the partial solution pair of Equation (22) by taking U, V, and W as zero matrices in (23). Then, by applying Theorem 10 to (24), Kyrchei [58] presented a new Cramer’s rule for the partial solution of Equation (22), which only makes use of the coefficient matrices, i.e., A, B, and C.

Theorem 12

(Theorem 5.2, [58]). Let

A \in H_{r_{1}}^{m \times n}

and

B \in H_{r_{2}}^{t \times q}

. Then, the partial solution pair

X = [x_{i j}] \in H^{n \times q}

and

Y = [y_{g f}] \in H^{m \times t}

in (24) can be expressed as

x_{i j} = \frac{\sum_{β \in J_{r_{1}, n} {i}} {cdet}_{i} {({(A^{*} A)}_{. i} (c_{. j}^{(1)}))}_{β}^{β}}{\sum_{β \in J_{r_{1}, n}} |{(A^{*} A)}_{β}^{β}|},

where

c_{. j}^{(1)}

is the j-th columns of

A^{*} C

, and

\begin{matrix} y_{g f} & = \frac{\sum_{α \in I_{r_{2}, t} {f}} {rdet}_{f} {({(B B^{*})}_{. f} (c_{g .}^{(2)}))}_{α}^{α}}{\sum_{α \in I_{r_{2}, t}} |{(B B^{*})}_{α}^{α}|} \\ - \frac{\sum_{l = 1}^{m} \sum_{α \in I_{r_{1}, m} {l}} {rdet}_{l} {({(A A^{*})}_{l .} ({\ddot{a}}_{g .}^{(1)}))}_{α}^{α} \sum_{α \in I_{r_{2}, t} {f}} {rdet}_{f} {({(B B^{*})}_{. f} (c_{l .}^{(2)}))}_{α}^{α}}{\sum_{α \in I_{r_{1}, m}} |{(A A^{*})}_{α}^{α}| \sum_{α \in I_{r_{2}, t}} |{(B B^{*})}_{α}^{α}|}, \end{matrix}

where

c_{g .}^{(2)}

and

{\ddot{a}}_{g .}^{(1)}

are the g-th rows of

C B^{*}

and

A A^{*}

, respectively.

Remark 16.

Note that although Kyrchei [58] gave the determinant representation of the particular solution of Equation (22), the problem proposed in Remark 15 is not solved completely. That is to say, Cramer’s rule for representing the general solution of Equation (22) only using the coefficient matrices remains an unsolved problem.

Interestingly, Song [65] considered the determinant representation for the general solution of Equation (22) with the restricted conditions, i.e.,

Equation (22) subject to \{\begin{matrix} R_{r} (X) \subseteq T_{1}, N_{r} (X) \supseteq S_{1}, \\ R_{l} (Y) \subseteq T_{2}, N_{l} (Y) \supseteq S_{2}, \end{matrix}

(25)

where

T_{1} \subseteq H^{n}

,

S_{1} \subseteq H^{p}

,

T_{2} \subseteq H^{1 \times t}

, and

S_{2} \subseteq H^{1 \times m}

. Let

P_{T} \in H^{n \times n}

(

Q_{T} \in H^{n \times n}

) denote the right (left)

H

-orthogonal projector onto a right (left)

H

-vector subspace

T \in H^{n \times 1}

(

T \in H^{1 \times n}

) along

T^{⊥}

.

Theorem 13

(Theorem 3.2, [65]). Let A, B, C,

T_{1}

,

T_{2}

,

S_{1}

, and

S_{2}

be given in (25), and let

M = R_{A P_{T_{1}}} Q_{S_{2}^{⊥}}, N = Q_{T_{2}} B P_{S_{1}^{⊥}}, T = A^{*} R_{Q_{S_{2}^{⊥}}} A + A^{*} A, S = I + L_{Q_{T_{2}} B},

and

Y_{1}

be such that

A^{*} R_{Q_{S_{2}^{⊥}}} A T^{†} Y_{1} = A^{*} A T^{†} A^{*} R_{Q_{S_{2}^{⊥}}} C and Y_{1} L_{Q_{T_{2}} B} = A^{*} C L_{Q_{T_{2}} B} .

(1): The restricted Equation (25) is solvable if and only if

$\begin{matrix} R_{M} R_{A P_{T_{1}}} C = 0, R_{A P_{T_{1}}} C L_{Q_{T_{2}} B} = 0, C L_{P_{S_{1}^{⊥}}} L_{N} = 0, R_{Q_{S_{2}^{⊥}}} C L_{Q_{T_{2}} B} = 0, \end{matrix}$

in which case,

$\begin{matrix} X & = {(T P_{T_{1}})}^{†} \tilde{C} P_{S_{1}^{⊥}} + P_{T_{1}} L_{A P_{T_{1}}} U_{1} P_{S_{1}^{⊥}} + P_{T_{1}} V_{1} P_{S_{1}^{⊥}} \\ = {(T P_{T_{1}})}^{†} \tilde{C} P_{S_{1}^{⊥}} + P_{T_{1}} Z_{1} P_{S_{1}^{⊥}} - P_{T_{11}} Z_{1} P_{S_{1}^{⊥}} \\ = {(T P_{T_{1}})}^{†} \tilde{C} P_{S_{1}^{⊥}} + P_{T_{1} \cap N_{r} (A)} U_{2} P_{S_{1}^{⊥}} + P_{T_{1}} V_{2} P_{S_{1}^{⊥}}, \\ Y & = Q_{S_{2}^{⊥}} (C - A X) {(Q_{T_{2}} B)}^{†} + Q_{S_{2}^{⊥}} V_{3} R_{Q_{T_{2}} B} Q_{T_{2}}, \end{matrix}$

where

$\tilde{C} = (A^{*} R_{Q_{S_{2}^{⊥}}} C + A^{*} C L_{Q_{T_{2}} B} + A^{*} R_{Q_{S_{2}^{⊥}}} C L_{Q_{T_{2}} B} + Y_{1}) S^{- 1},$

and $Z_{1}$ , $V_{1}$ , $U_{1}$ , $V_{2}$ , $U_{2}$ , and $V_{3}$ are arbitrary matrices over $H$ with appropriate dimensions.
(2): Let $C_{1}^{*}, K_{1}^{*}, C_{1}$ , and $K_{2}$ be full column rank matrices such that

$T_{1} = N_{r} (C_{1}), T_{1} \cap N_{r} (T) = R_{r} (K_{1}^{*}), T_{2} = N_{l} (C_{2}), T_{2} \cap N_{r} (B) = R_{l} (K_{2}^{*}) .$

Denote $X = [x_{i j}] \in H^{n \times q}$ and $Y = [y_{k l}] \in H^{m \times t}$ . If Equation (25) is solvable, then

$\begin{matrix} x_{i j} & = \frac{{cdet}_{i} {(A^{*} A + C_{1}^{*} C_{1} + K_{1}^{*} K_{1})}_{. i} (d_{. j})}{det (A^{*} A + C_{1}^{*} C_{1} + K_{1}^{*} K_{1})}, \\ y_{k l} & = \frac{{rdet}_{l} {(B B^{*} + C_{2} C_{2}^{*} + K_{2} K_{2}^{*})}_{l .} (d_{k .})}{det (B B^{*} + C_{2} C_{2}^{*} + K_{2} K_{2}^{*})} \end{matrix}$

with $d_{. j}$ is the j-th column of

$T (A^{*} R_{Q_{S_{2}^{⊥}}} C + A^{*} C L_{Q_{T_{2}} B} + A^{*} R_{Q_{S_{2}^{⊥}}} C L_{Q_{T_{2}} B} + Y_{1}) + K_{1}^{*} K_{1} Z_{1},$

and $d_{k .}$ is the i-th row of

$(C - A X) B^{*} + Z_{2} K_{2} K_{2}^{*},$

where $i = 1, \dots, n$ , $j = 1, \dots, q$ , $k = 1, \dots, m$ , $l = 1, \dots, t$ , and $Z_{1}$ and $Z_{2}$ are arbitrary matrices over $H$ with appropriate orders.

4.7. Method by Semi-Tensor Products

To effectively handle multidimensional arrays and nonlinear problems, Chinese mathematician Daizhan Cheng [66] pioneered a new matrix product in 2001: the semi-tensor product (abbreviated as STP). It exactly coincides with traditional matrix products when the two factor matrices meet the dimension requirements. Due to the excellent properties of STP [67], namely

(1): It is applied to any two matrices;
(2): It has certain commutative properties;
(3): It inherits all properties of the conventional matrix product;
(4): It enables easy expression of multilinear functions (mappings);

STP has been effectively applied to Boolean (control) networks, logical dynamic systems, system biology, graph theory, formation control, finite automata and symbolic dynamics, circuit design and failure detection, coding and cryptography, fuzzy control, engineering, game theory, and so on (see [67,68,69,70,71] and references therein).

Definition 3

([66]). Let

F

be a ring,

A \in F^{m \times n}

, and

B \in F^{p \times q}

. The left STP of A and B is defined as

A ⋉ B = (A \otimes I_{\frac{t}{n}}) (B \otimes I_{\frac{t}{p}}),

where

t = lcm (n, p)

is the least common multiple of n and p.

Remark 17.

Similar to the left STP, Cheng [68] defined the right STP of

A \in F^{m \times n}

and

B \in F^{p \times q}

, i.e.,

A ⋊ B = (I_{\frac{t}{n}} \otimes A) (I_{\frac{t}{p}} \otimes B),

where

t = lcm (n, p)

. Both (Proposition 1.3, [67]) and (Proposition 2.3.2, [70]) show that the block multiplication of matrices can be extended to the left STP. Unfortunately, the right STP does not satisfy the block multiplication rule, which is a significant difference between the left STP and right STP. Next, we use a simple example from [70] to demonstrate this limitation of the right STP: let

A = [a_{1} a_{2} a_{3} a_{4}]

and

B = {[b_{1} b_{2}]}^{T}

. Then,

A ⋉ B = [a_{1} b_{1} + a_{3} b_{2} a_{2} b_{1} + a_{4} b_{2}] and A ⋊ B = [a_{1} b_{1} + a_{2} b_{2} a_{3} b_{1} + a_{4} b_{2}] .

Partition A and B equally into

A = [A_{1} A_{2}]

and

B = {[B_{1} B_{2}]}^{T}

, respectively. According to the block multiplication rule, we have

\begin{matrix} A_{1} ⋉ B_{1} + A_{2} ⋉ B_{2} = [a_{1} a_{2}] ⋉ b_{1} + [a_{3} a_{4}] ⋉ b_{2} = [a_{1} b_{1} + a_{3} b_{2} a_{2} b_{1} + a_{4} b_{2}], \\ A_{1} ⋊ B_{1} + A_{2} ⋊ B_{2} = [a_{1} a_{2}] ⋊ b_{1} + [a_{3} a_{4}] ⋊ b_{2} = [a_{1} b_{1} + a_{3} b_{2} a_{2} b_{1} + a_{4} b_{2}] . \end{matrix}

Obviously,

A ⋉ B = A_{1} ⋉ B_{1} + A_{2} ⋉ B_{2}, but A ⋊ B \neq A_{1} ⋊ B_{1} + A_{2} ⋊ B_{2} .

This drawback makes the right STP unable to replace the left STP in many applications, greatly limiting its use. Therefore, most scholars focus exclusively on research into the left STP. Based on this, the STP referred to in this paper specifically denotes the left STP.

Remark 18.

As a generalization of the traditional matrix product, and given its extensive application significance, the study of matrix equations under STP has emerged as an inevitable and pivotal research direction. In 2016, Yao et al. [72] were the first to conduct research on the classical complex matrix equation under STP:

A ⋉ X = B,

where X is unknown. Subsequently, building on the research framework of [72], investigations into diverse matrix equations under STP have flourished (see [73,74,75,76,77,78]).

Regrettably, no studies directly addressing Equation (5) under STP have been published to date. However, when the dimension of Y is constrained to equal that of

X^{T}

, the exploration of Sylvester-transpose matrix Equation [74]

A ⋉ X + X^{T} ⋉ B = C

with unknown X may provide valuable guidance for this problem.

Notably, in 2019, Cheng and Liu [79], inspired by the research on cross-dimensional linear systems [80], further proposed the second matrix–matrix semi-tensor product (abbreviated as MM-2 STP), denoted by

\circ_{l}

. Subsequently, Wang [81] investigated the complex matrix equation under the MM-2 STP:

A \circ_{l} X = B,

where X is unknown, and A and B are given. So, exploring Equation (1) under MM-2 STP emerges as a potential research topic. Additionally, Cheng [67] introduced the dimension-free matrix theory, which systematically elucidates the deep mathematical insights underlying STP.

Since 2020, Liaocheng University’s team led by Professors Ying Li and Jianli Zhao has utilized STP to propose several novel matrix representations, including complex, quaternion, and octonion matrices. These representations have also been applied to solving corresponding matrix equations, and have yielded effective numerical experimental results.

Ding et al. [82] first proposed the real vector representation for quaternion matrices, whose properties were characterized by STP, as follows:

Definition 4

(Definitions 3.1–3.3, [82]). Let

x = x_{1} + x_{2} i + x_{3} j + x_{4} k \in H

. Denote

v^{R} (x) = {[\begin{matrix} x_{1} & x_{2} & x_{3} & x_{4} \end{matrix}]}^{T} .

Let

x = [\begin{matrix} x^{1} & \dots & x^{n} \end{matrix}] \in H^{1 \times n}

and

y = {[\begin{matrix} y^{1} & \dots & y^{n} \end{matrix}]}^{T} \in H^{n}

. Denote

v^{R} (x) = [\begin{matrix} v^{R} (x^{1}) \\ ⋮ \\ v^{R} (x^{n}) \end{matrix}] and v^{R} (y) = [\begin{matrix} v^{R} (y^{1}) \\ ⋮ \\ v^{R} (y^{n}) \end{matrix}] .

For

A \in H^{m \times n}

, the real column stacking form

v_{c}^{R} (A)

and the real row stacking form

v_{r}^{R} (A)

of A are defined as

v_{c}^{R} (A) = [\begin{matrix} v^{R} ({Col}_{1} (A)) \\ v^{R} ({Col}_{2} (A)) \\ ⋮ \\ v^{R} ({Col}_{n} (A)) \end{matrix}] and v_{r}^{R} (A) = [\begin{matrix} v^{R} ({Row}_{1} (A)) \\ v^{R} ({Row}_{2} (A)) \\ ⋮ \\ v^{R} ({Row}_{m} (A)) \end{matrix}],

where

{Col}_{j} (A) (j = 1, \dots, n)

and

{Row}_{i} (A) (j = 1, \dots, m)

are the j-th column and i-th row of A, respectively.

Remark 19.

For

A \in H^{m \times n}

and

B \in H^{n \times p}

, Theorem 3.3(3) of [82] shows

v_{r}^{R} (A B) = G (v_{r}^{R} (A) ⋉ v_{c}^{R} (B)),

where

\begin{matrix} G & = [\begin{matrix} F ⋉ {(δ_{m}^{1})}^{T} ⋉ [I_{4 m n} \otimes {(δ_{p}^{1})}^{T}] \\ ⋮ \\ F ⋉ {(δ_{m}^{1})}^{T} ⋉ [I_{4 m n} \otimes {(δ_{p}^{p})}^{T}] \\ ⋮ \\ F ⋉ {(δ_{m}^{m})}^{T} ⋉ [I_{4 m n} \otimes {(δ_{p}^{1})}^{T}] \\ ⋮ \\ F ⋉ {(δ_{m}^{m})}^{T} ⋉ [I_{4 m n} \otimes {(δ_{p}^{p})}^{T}] \end{matrix}], F = M_{Q} ⋉ (\sum_{i = 1}^{n} {(δ_{n}^{i})}^{T} ⋉ (I_{4 n} \otimes {(δ_{n}^{i})}^{T})), \end{matrix}

\begin{matrix} M_{Q} & = [\begin{matrix} 1 & 0 & 0 & 0 & 0 & - 1 & 0 & 0 & 0 & 0 & - 1 & 0 & 0 & 0 & 0 & - 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & - 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & - 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & - 1 & 0 & 0 & 1 & 0 & 0 & 0 \end{matrix}], \end{matrix}

and

δ_{m}^{i} (i = 1, \dots, m)

is the i-th column of

I_{m}

.

Using the real vector representation for quaternion matrices, Ding et al. [82] further discussed the special least-squares solutions of quaternion matrix equation

A X B + C Y D = E,

(26)

where

A \in H^{m \times n}

,

B \in H^{n \times s}

,

C \in H^{m \times k}

,

D \in H^{k \times s}

, and

E \in H^{m \times s}

. Denote

δ_{k} [i_{1}, \dots, i_{s}] = [δ_{k}^{i_{1}}, \dots, δ_{k}^{i_{s}}],

W_{[m, n]} = δ_{m n} [1, \dots, (n - 1) m + 1, \dots, m, \dots, n m],

and

J^{η} = [\begin{matrix} J_{1}^{η} \\ ⋮ \\ J_{m}^{η} \\ ⋮ \\ J_{n}^{η} \end{matrix}], J_{m}^{η} = [\begin{matrix} J_{1 m}^{η} \\ ⋮ \\ J_{r m}^{η} \\ ⋮ \\ J_{n m}^{η} \end{matrix}], R^{η} = [\begin{matrix} R_{1}^{η} \\ ⋮ \\ R_{m}^{η} \\ ⋮ \\ R_{n}^{η} \end{matrix}], R_{m}^{η} = [\begin{matrix} R_{1 m}^{η} \\ ⋮ \\ R_{r m}^{η} \\ ⋮ \\ R_{n m}^{η} \end{matrix}], m = 1, 2, \dots, n,

where

\begin{matrix} for η = i, J_{r m}^{i} = \{\begin{matrix} {(δ_{n (n + 1) / 2}^{\frac{(r - 1) (2 n - r + 2)}{2} + m - r + 1})}^{T} \otimes R_{4}, & r < m, \\ {(δ_{n (n + 1) / 2}^{\frac{(m - 1) (2 n - m + 2)}{2} + r - m + 1})}^{T} \otimes I_{4}, & r \geq m, \end{matrix} R_{4} = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & - 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}]; \\ for η = j, J_{r m}^{j} = \{\begin{matrix} {(δ_{n (n + 1) / 2}^{\frac{(r - 1) (2 n - r + 2)}{2} + m - r + 1})}^{T} \otimes L_{4}, & r < m, \\ {(δ_{n (n + 1) / 2}^{\frac{(m - 1) (2 n - m + 2)}{2} + r - m + 1})}^{T} \otimes I_{4}, & r \geq m, \end{matrix} L_{4} = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & - 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}]; \\ for η = k, J_{r m}^{k} = \{\begin{matrix} {(δ_{n (n + 1) / 2}^{\frac{(r - 1) (2 n - r + 2)}{2} + m - r + 1})}^{T} \otimes S_{4}, & r < m, \\ {(δ_{n (n + 1) / 2}^{\frac{(m - 1) (2 n - m + 2)}{2} + r - m + 1})}^{T} \otimes I_{4}, & r \geq m, \end{matrix} S_{4} = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & - 1 \end{matrix}]; \end{matrix}

\begin{matrix} for η = i, R_{r m}^{i} = \{\begin{matrix} {(δ_{n (n + 1) / 2}^{\frac{(r - 1) (2 n - r + 2)}{2} + m - r + 1})}^{T} \otimes R_{4}^{'}, & r < m, \\ {(δ_{n (n + 1) / 2}^{\frac{(m - 1) (2 n - m + 2)}{2} + r - m + 1})}^{T} \otimes I_{4}, & r \geq m, \end{matrix} R_{4}^{'} = [\begin{matrix} - 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & - 1 & 0 \\ 0 & 0 & 0 & - 1 \end{matrix}]; \\ for η = j, R_{r m}^{j} = \{\begin{matrix} {(δ_{n (n + 1) / 2}^{\frac{(r - 1) (2 n - r + 2)}{2} + m - r + 1})}^{T} \otimes L_{4}^{'}, & r < m, \\ {(δ_{n (n + 1) / 2}^{\frac{(m - 1) (2 n - m + 2)}{2} + r - m + 1})}^{T} \otimes I_{4}, & r \geq m, \end{matrix} L_{4}^{'} = [\begin{matrix} - 1 & 0 & 0 & 0 \\ 0 & - 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & - 1 \end{matrix}]; \\ for η = k, R_{r m}^{k} = \{\begin{matrix} {(δ_{n (n + 1) / 2}^{\frac{(r - 1) (2 n - r + 2)}{2} + m - r + 1})}^{T} \otimes S_{4}^{'}, & r < m, \\ {(δ_{n (n + 1) / 2}^{\frac{(m - 1) (2 n - m + 2)}{2} + r - m + 1})}^{T} \otimes I_{4}, & r \geq m, \end{matrix} S_{4}^{'} = [\begin{matrix} - 1 & 0 & 0 & 0 \\ 0 & - 1 & 0 & 0 \\ 0 & 0 & - 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}] . \end{matrix}

Theorem 14

(Theorem 4.3 and Corollary 4.4, [82]). Let A, B, C, D, and E be given in Equation (26), and let

\hat{M} = [\begin{matrix} M_{1} & M_{2} \end{matrix}],

where

\begin{matrix} M_{1} & = G_{2} ⋉ G_{3} ⋉ v_{r}^{R} (A) ⋉ W_{[4 n s, 4 n^{2}]} ⋉ v_{c}^{R} (B) ⋉ J^{η}, \\ M_{2} & = G_{4} ⋉ G_{5} ⋉ v_{r}^{R} (C) ⋉ W_{[4 k s, 4 k^{2}]} ⋉ v_{c}^{R} (D) ⋉ R^{η}, \end{matrix}

and

G_{i}

has the same structure as G given in Remark 19 but differs in orders.

(1): Then, Equation (26) is consistent if and only if

$(\hat{M} {\hat{M}}^{†} - I_{4 m s}) v_{r}^{R} (E) = 0 .$
(2): Let

$S_{M} = \{(X, Y) | X = {X^{η}}^{*}, Y = - {Y^{η}}^{*}, ∥ A X B + C Y D - E ∥ = min\} .$

Then,

$S_{M} = \{(X, Y) | [\begin{matrix} v_{s}^{R} (X) \\ v_{s}^{R} (Y) \end{matrix}] = {\hat{M}}^{†} v_{r}^{R} (E) + (I_{2 (n^{2} + k^{2}) + 2 (n + k)} - {\hat{M}}^{†} \hat{M}) y, y \in R^{2 (n^{2} + k^{2}) + 2 (n + k)}\} .$
(3): If $(\hat{X}, \hat{Y}) \in S_{M}$ satisfies

$∥ \hat{X} ∥^{2} + {∥ \hat{Y} ∥}^{2} = min_{(X, Y) \in S_{M}} ({∥ X ∥}^{2} + {∥ Y ∥}^{2}),$

then

$[\begin{matrix} v_{s}^{R} (\hat{X}) \\ v_{s}^{R} (\hat{Y}) \end{matrix}] = {\hat{M}}^{†} v_{r}^{R} (E) .$

Remark 20.

Setting B and C in Equation (26) as identity matrices, Theorem 14 yields the corresponding results for Equation (5) over

H

.

Remark 21.

During the same period, Ding et al. [83] and Wang et al. [84] also developed the real vector representation of quaternion matrices and applied this method to address problems in quaternion matrix equations. Inspired by this idea, Liu et al. [85] proposed the left and right real element representations of octonion matrices to solve the classical octonion matrix equation

A X B = C,

where X is unknown. Recently, Chen and Song [86] used the real vector representation of quaternion matrices to study the least-squares lower (upper) triangular Toeplitz solutions and (anti)centrosymmetric solutions of quaternion matrix equations.

Remark 22.

It is known that real representations of quaternion matrices are not unique. For instance, Liu et al. [87] defined three real representations of quaternion matrices. Interestingly, Fan et al. first [88,89] proposed the

L

-representation for quaternion matrices by STP to systematically study the real representations of quaternion matrices. Moreover,

L

-representation serves as an effective tool in solving quaternion matrix equations. Meanwhile, Zhang et al. [90] also defined the

L

-representation for commutative quaternion matrices and applied it to solving the corresponding matrix equations.

Following this idea, Fan et al. [91] established the

C

-representation of quaternion matrices, which generalizes the complex representation of quaternion matrices. This new representation is also applied to study η-Hermitian solutions of the quaternion matrix equation

A X B = C,

with unknown X. Similarly, Xi et al. [92] defined the

L_{C}

-representation of reduced biquaternion matrices to investigate the mixed solutions of the reduced biquaternion matrix equation

\sum_{i = 1}^{n} A_{i} X_{i} B_{i} = E,

(27)

where

X_{i} (i = 1, \dots, n)

is unknown. Evidently, Equation (5) is the special case of Equation (27) over reduced biquaternions.

Contemporaneously with [88,89], Fan et al. [93] also derived the minimal norm least-squares (anti)-Hermitian solution of the quaternion matrix equation directly by the vectorization properties of STP (i.e., Theorems 2.9 and 2.10, [93]) and the complex representation of quaternion matrix. Recently, Liu et al. [94] directly used the vectorization properties of STP to investigate (skew) bisymmetric solutions of the generalized Lyapunov quaternion matrix equation.

We contend that the four aforementioned STP-based methods—specifically,

L

-representation,

C

-representation,

L_{C}

-representation, and vectorization properties of STP—provide distinct perspectives for the investigation of Equation (1).

5. Constrained Solutions of GSE

When GSE fails to satisfy solvability conditions, find its least-squares solution under a certain matrix norm; since such solutions are always non-unique, further seeking for the minimum-norm least-squares solution (also known as the best approximate solution) is a common research approach. Furthermore, in practical applications, specific constraints are often imposed on the solutions of GSE (e.g., symmetric solutions, re-(non)positive definite solutions, equality-constrained solutions). Thus, this section focuses on various solutions to GSE.

5.1. Chebyshev Solutions and $l_{p}$ -Solutions

Let

F = R

. Ziętak studied the Chebyshev solutions [95] and the

l_{p}

-solutions [96] of Equation (5) by using the Chebyshev norm and the

l_{p}

-norm of a matrix, respectively.

Definition 5.

Let

A = [a_{i, j}] \in R^{m \times n}

and

1 < p < \infty

. The Chebyshev norm of A, denoted by

{∥ A ∥}_{\infty}

, is defined as

{∥ A ∥}_{\infty} = max_{1 \leq i \leq m, 1 \leq j \leq n} | a_{i, j} |,

where

| a_{i, j} |

is the absolute value of

a_{i, j}

for

1 \leq i \leq m

and

1 \leq j \leq n

. And, the

l_{p}

-norm of A, denoted by

{∥ A ∥}_{p}

, is defined as

{∥ A ∥}_{p} = {(\sum_{i = 1}^{m} \sum_{j = 1}^{n} {| a_{i, j} |}^{p})}^{1 / p} .

Theorem 15

(Theorem 2.2, [95]). Let

r < m

and

s < n

. Suppose that (4) is not satisfied. Then, the matrices

X_{\infty} = [x_{k j}]

and

Y_{\infty} = [y_{i l}]

are a Chebyshev solution pair of Equation (5), i.e.,

∥ A X_{\infty} + Y_{\infty} {B - C ∥}_{\infty} = min_{X, Y} {∥ A X + Y B - C ∥}_{\infty},

if and only if there exists

V = [v_{i j}] (\neq 0) \in R^{m \times n}

such that

V^{T} A = 0

,

V B^{T} = 0

,

\begin{matrix} v_{i j} sign (r_{i j}) > 0 for (i, j) \in J_{1}, and v_{i j} = 0 for (i, j) \notin J_{1}, \end{matrix}

where

r_{i j} = \sum_{k = 1}^{r} a_{i k} x_{k j} + \sum_{l = 1}^{s} y_{i l} b_{l j} - c_{i j},

and

J_{1}

is an appropriate subset of

J = \{(i, j) ∣ | r_{i j} {| = ∥ A X + Y B - C ∥}_{\infty}\}

.

Remark 23.

Moreover, Ziętak formulated the equivalent conditions for the Chebyshev solution of Equation (5) by Theorem 3.3 of [95] and Theorem 4.1 of [95] under the assumption:

m = r + 1, n = s + 1, rank (A) = r, rank (B) = s,

(28)

and another assumption:

A \neq 0, B \neq 0, r = s = 1,

(29)

respectively.

Theorem 16

(Theorem 2.1, [96]). Let

r < m

,

s < n

, and

1 < p < \infty

. Then, the matrices

X_{p} = [x_{i, j}]

and

Y_{p} = [y_{i, j}]

are an

l_{p}

-solution of Equation (5), i.e.,

∥ A X_{p} + Y_{p} {B - C ∥}_{p} = min_{X, Y} {∥ A X + Y B - C ∥}_{p},

if and only if

V^{T} A = 0 and V B^{T} = 0,

where

\begin{matrix} V = {[v_{i, j}]}_{m \times n} = {[sign (r_{i, j}) {| r_{i, j} |}^{p - 1}]}_{m \times n} and r_{i, j} = \sum_{k = 1}^{r} a_{i, k} x_{k, j} + \sum_{l = 1}^{s} y_{i, l} b_{l, j} - c_{i, j} \end{matrix}

for

i = 1, 2, \dots, m

and

j = 1, 2, \dots, n

.

Ziętak [96] further presented additional characterizations of the

l_{p}

-solutions of Equation (5). Based on these characterizations, an algorithm was designed for solving such solutions (see Section 7 for details).

Theorem 17

(Theorems 2.2 and 2.3, [96]). Let A, B, and C be given in Equation (5) and let

1 < p < \infty

. Then, the following are equivalent:

(1): The matrices $X_{*} \in R^{r \times n}$ and $Y_{*} \in R^{m \times s}$ are the $l_{p}$ -solution of Equation (5);
(2): The following equalities hold:

$min_{X} ∥ A X + Y_{*} {B - C ∥}_{p} = min_{Y} ∥ A X_{*} {+ Y B - C ∥}_{p} = {∥ A X_{*} + Y_{*} B - C ∥}_{p};$
(3): The columns $x_{j}^{(*)}$ of $X_{*} \in R^{r \times n}$ are the $l_{p}$ -solutions of the linear systems

$A x = c_{j} - Y_{*} b_{j} (j = 1, 2, \dots, n)$

and the columns $y_{i}^{(*)}$ of $Y_{*}^{T} \in R^{s \times m}$ are the $l_{p}$ -solutions of the linear systems

$B^{T} y = d_{i} - X_{*}^{T} a_{i} (i = 1, 2, \dots, m),$

where $x \in R^{r}$ , $y \in R^{s}$ , $c_{j}$ , $b_{j}$ , $d_{i}$ and $a_{i}$ are the appropriate columns of C, B, $C^{T}$ , and $A^{T}$ , respectively.

Remark 24.

Since

A (X + (I - A^{-} A) W - A^{-} Z B) + (Y + Z (I - B B^{-}) + A A^{-} Z B B^{-}) B = A X + Y B

for arbitrary W and Z, neither the Chebyshev solution nor the

l_{p}

-solution of Equation (5) is unique.

Remark 25.

Note that, in Theorem 2.1 of [27], Xu et al. gave the explicit expression of the

l_{2}

-solutions of Equation (12). Moreover, in Theorems 4.2 and 4.3 of [97], Liao et al. also consider the best approximate solution of (12) to a given matrix pair

(X_{f}, Y_{f})

by using GSVD and CCD. Therefore, when both E and F are identity matrices, we can immediately obtain decomposed expressions of

l_{2}

-solutions and the best approximate solution of Equation (5).

5.2. ★-Congruent Solutions

We now discuss Equation (5) under the constraint condition

Y = X^{★}

, i.e.,

A X + X^{★} B = C,

(30)

where

X^{★}

denotes either

X^{T}

or

X^{*}

. We call X satisfying Equation (30) a ★-congruent solution of Equation (5).

Wimmer is the first to study the necessary and sufficient conditions for the solvability of Equation (30) under

★ = *

over

C

(see Theorem 2, [98]). After that, De Terán and Dopico [99] generalize the Wimmer’s work to a field

F

with

char (F) \neq 2

, in which case,

X^{★}

denotes the transpose of X, except in the particular case

F = C

, where it may be either the transpose or the conjugate transpose of X. Moreover, we call

A \in F^{n \times n}

and

B \in F^{n \times n}

are ★-congruent if there exists a nonsingular matrix

P \in F^{n \times n}

such that

P^{★} A P = B

.

Theorem 18

(Theorem 2.3, [99]). Let

F

be a field with

char (F) \neq 2

. Then, Equation (30) is solvable if and only if

[\begin{matrix} C & A \\ B & 0 \end{matrix}] and [\begin{matrix} 0 & A \\ B & 0 \end{matrix}]

are ★-congruent.

Remark 26.

Additionally, in Lemma 5.10 of [100], Byers and Kressner established the equivalent conditions for the existence of a unique solution to Equation (30) only when

★ = T

. Kressner et al. generalized this result in Lemma 8 of [101] to include the case where

★ = *

. In Theorems 2.1 and 2.2 of [102], Cvetković-Ilić investigated Equation (30) for the bounded linear operators under certain conditions, in which case

X^{★}

denotes the adjoint operator of X.

5.3. (Minimum-Norm Least-Squares) Symmetric Solutions

Let

F = R

,

s = m

,

r = n

, and

A = B

for Equation (5), i.e.,

A X + Y A = C .

(31)

Chang and Wang [103] studied the symmetric, minimum-2-norm symmetric, least-squares symmetric, and minimum-2-norm least-squares symmetric solutions of Equation (31) by SVD over

R

. Let

{SR}^{m \times m}

be the set of all

m \times m

real symmetric matrices, and let

A \circ B = [a_{i j} b_{i j}] \in F^{m \times n}

denote the Hadamard product of

A = [a_{i j}] \in F^{m \times n}

and

B = [b_{i j}] \in F^{m \times n}

.

Theorem 19

(Theorem 2.1, [103]). Let the SVD of A be

A = U [\begin{matrix} Σ & 0 \\ 0 & 0 \end{matrix}] V^{T},

where

Σ = diag (σ_{1}, \dots, σ_{r}) > 0

,

r = rank (A)

, and

U = [\begin{matrix} U_{1} & U_{2} \end{matrix}] \in R^{m \times m} and V = [\begin{matrix} V_{1} & V_{2} \end{matrix}] \in R^{n \times n}

are real orthogonal with

U_{1} \in R^{m \times r}, U_{2} \in R^{m \times (m - r)}, V_{1} \in R^{n \times r}, and V_{2} \in R^{n \times (n - r)} .

Denote

\begin{matrix} W_{1} = Σ U_{1}^{T} C V_{1} - V_{1}^{T} C^{T} U_{1} Σ, W_{2} = Σ^{- 1} (U_{1}^{T} C V_{1} + V_{1}^{T} C^{T} U_{1}), \\ M_{1} = Σ V_{1}^{T} C^{T} U_{1} - U_{1}^{T} C V_{1} Σ, M_{2} = Σ^{- 1} U_{1}^{T} C V_{1} . \end{matrix}

Define

Φ = [φ_{i j}] \in R^{r \times r} and Ψ = [ψ_{i j}] \in R^{r \times r},

where

φ_{i j} = \{\begin{matrix} \frac{1}{σ_{i}^{2} - σ_{j}^{2}}, & σ_{i} \neq σ_{j}, \\ 0, & σ_{i} = σ_{j}, \end{matrix} and ψ_{i j} = 1 - | sign (σ_{i} - σ_{j}) |

for

i, j = 1, \dots, r

.

(1): Let

$L_{I} = {[X, Y] ∣ X \in {SR}^{n \times n}, Y \in {SR}^{m \times m}, A X + Y A = C} .$

Then, $L_{I} \neq \emptyset$ if and only if

$U_{2}^{T} C V_{2} = 0 and Ψ \circ (U_{1}^{T} C V_{1} - V_{1}^{T} C^{T} U_{1}) = 0,$

in which case

$\begin{matrix} L_{I} = {[V & [\begin{matrix} Φ \circ W_{1} + Ψ \circ (M_{2} - Y_{11}) & Σ^{- 1} U_{1}^{T} C V_{2} \\ V_{2}^{T} C^{T} U_{1} Σ^{- 1} & X_{22} \end{matrix}] V^{T}, \\ U & [\begin{matrix} Φ \circ M_{1} + Ψ \circ Y_{11} & Σ^{- 1} V_{1}^{T} C^{T} U_{2} \\ U_{2}^{T} C V_{1} Σ^{- 1} & Y_{22} \end{matrix}] U^{T}] \\ ∣ X_{22} \in {SR}^{(n - r) \times (n - r)}, Y_{11} \in {SR}^{r \times r}, Y_{22} \in {SR}^{(m - r) \times (m - r)}} . \end{matrix}$
(2): If $[\hat{X}, \hat{Y}] \in L_{I}$ satisfies

$∥ [\hat{X}, \hat{Y}] ∥_{F} = min_{[X, Y] \in L_{I}} (∥ \hat{X} ∥_{F}^{2} + ∥ \hat{Y} {∥_{F}^{2})}^{\frac{1}{2}},$

then $[\hat{X}, \hat{Y}] \in L_{I}$ is unique and

$\begin{matrix} \hat{X} & = V [\begin{matrix} Φ \circ W_{1} + \frac{1}{2} Ψ \circ M_{2} & Σ^{- 1} U_{1}^{T} C V_{2} \\ V_{2}^{T} C^{T} U_{1} Σ^{- 1} & 0 \end{matrix}] V^{T}, \\ \hat{Y} & = U [\begin{matrix} Φ \circ M_{1} + \frac{1}{2} Ψ \circ M_{2} & Σ^{- 1} V_{1}^{T} C^{T} U_{2} \\ U_{2}^{T} C V Σ^{- 1} & 0 \end{matrix}] U^{T} . \end{matrix}$
(3): Let

$L_{I L S} = {[X, Y] | X \in {SR}^{n \times n}, Y \in {SR}^{m \times m} {, ∥ A X + Y A - C ∥}_{F} = min} .$

Then,

$\begin{matrix} L_{I L S} = {[V & [\begin{matrix} Φ \circ W_{1} + Ψ \circ (\frac{1}{2} W_{2} - Y_{11}) & Σ^{- 1} U_{1}^{T} C V_{2} \\ V_{2}^{T} C^{T} U_{1} Σ^{- 1} & X_{22} \end{matrix}] V^{T}, \\ U & [\begin{matrix} Φ \circ M_{1} + Ψ \circ Y_{11} & Σ^{- 1} V_{1}^{T} C^{T} U_{2} \\ U_{2}^{T} C V_{1} Σ^{- 1} & Y_{22} \end{matrix}] U^{T}] \\ ∣ X_{22} \in {SR}^{(n - r) \times (n - r)}, Y_{11} \in {SR}^{r \times r}, Y_{22} \in {SR}^{(m - r) \times (m - r)}} . \end{matrix}$
(4): If $[\hat{X}, \hat{Y}] \in L_{I L S}$ satisfies

$∥ [\hat{X}, \hat{Y}] ∥_{F} = min_{[X, Y] \in L_{I L S}} (∥ \hat{X} ∥_{F}^{2} + ∥ \hat{Y} {∥_{F}^{2})}^{\frac{1}{2}},$

then $[\hat{X}, \hat{Y}]$ is unique and

$\begin{matrix} \hat{X} & = V [\begin{matrix} Φ \circ W_{1} + \frac{1}{4} Ψ \circ W_{2} & Σ^{- 1} U_{1}^{T} C V_{2} \\ V_{2}^{T} C^{T} U_{1} Σ^{- 1} & 0 \end{matrix}] V^{T}, \\ \hat{Y} & = U [\begin{matrix} Φ \circ M_{1} + \frac{1}{4} Ψ \circ W_{2} & Σ^{- 1} V_{1}^{T} C^{T} U_{2} \\ U_{2}^{T} C V_{1} Σ^{- 1} & 0 \end{matrix}] U^{T} . \end{matrix}$

5.4. Self-Adjoint and Positive (Semi)Definite Solutions

In this subsection, we consider Equation (1) with

r = n

,

s = m

and

C = 0

, i.e.,

A X - Y B = 0,

(32)

which is required in optimal control theory [104].

Jameson et al. [105] explored the symmetric, positive semidefinite, and positive definite real solutions of Equation (32) over

R

. Subsequently, Dobovišek [106] further studied the self-adjoint, positive semidefinite, positive definite, minimal, and extreme solutions of Equation (32) over

C

.

Definition 6

([106]). If the nonnegative matrix

Y_{m}

is such that

(X, Y_{m})

is a solution pair of Equation (32), and

Y_{m} \leq Y

(or

Y_{m} \geq Y

) for all other solution pairs

(X, Y)

of Equation (32) with the same X, then

Y_{m}

is called a minimal (or maximal) solution of Equation (32). The minimal and maximal solutions of Equation (32) are collectively referred to as the extreme solutions of Equation (32).

Theorem 20

([106]). Let A be of full column rank,

m \geq n

,

A = [\begin{matrix} 0 \\ I \end{matrix}], B = [\begin{matrix} B_{1} \\ B_{2} \end{matrix}], and Y = [\begin{matrix} Y_{11} & Y_{12} \\ Y_{12}^{*} & Y_{22} \end{matrix}] .

(1): (Theorem 6, [106]) Assume that $A B^{*}$ has at least one real eigenvalue or at least one conjugate pair of eigenvalues. Then, Equation (32) has a non-zero self-adjoined solution pair, i.e., $X^{*} = X$ and $Y^{*} = Y$ .
(2): (Theorem 7, [106]) Equation (32) has a nonzero positive semidefinite solution Y, i.e., $Y \geq 0$ , if and only if $A B^{*}$ has at least one real eigenvalue.
(3): (Theorem 8, [106]) Equation (32) has a positive definite solution Y, i.e., $Y > 0$ , if and only if $A B^{*}$ has only real eigenvalues and is diagonalizable.
(4): (Theorem 9, [106]) Equation (32) has a positive semidefinite solution, i.e., $X \geq 0$ , if and only if the matrix A has at least one real eigenvalue, in which case, $X \geq 0$ is nonzero if $rank (A B^{*}) > rank (B_{2})$ .
(5): (Theorem 10, [106]) Equation (32) has a positive definite solution, i.e., $X > 0$ , if and only if all eigenvalues of $B_{2}^{*}$ are real, $B_{2}^{*}$ is diagonalizable, and $rank (A B^{*}) = rank (A)$ . Moreover, if $Y \geq 0$ , then $X > 0$ if and only if all eigenvalues of $B^{*}$ are positive and $B^{*}$ is diagonalizable.
(6): (Theorem 11, [106]) If $rank (B) < m$ , then Equation (32) with a fixed solution X does not have an extreme solution for Y. If $rank (B) = m$ , the solution Y is unique.
(7): (Theorem 12, [106]) If Equation (32) has a solution pair $(X, Y)$ and $Y \geq 0$ , then there exists a minimal solution $Y_{m} \geq 0$ .

Remark 27.

The expressions of self-adjoint, positive semidefinite, and positive definite solutions are also presented when the solvability conditions are met (see (Theorems 6–10, [106])). Additionally, Dobovišek discussed these solutions for

m < n

(see (Theorems 13–17, [106])) and for matrices A and B without full rank (see (Theorems 18–20, [106])).

5.5. Per(Skew)Symmetric and Bi(Skew)Symmetric Solutions

It is well known that (skew)selfconjugate, per(skew)symmetric, and centro(skew) symmetric matrices be applied to information theory, linear system theory, and numerical analysis theory (see [107,108,109,110]). Let

F = Ω

be a finite dimensional central algebra with an involution

σ

and

char (Ω) \neq 2

(see [111]). Wang et al. in [112,113] gave necessary and sufficient conditions for the existence of per(skew)symmetric solutions and bi(skew)symmetric solutions to Equation (1) over

Ω

, respectively.

Definition 7.

For

A = [a_{i, j}] \in Ω^{m \times n}

, let

A^{*} = [σ (a_{i, j})] \in Ω^{n \times m}, A^{(*)} = [σ (a_{n - j + 1, m - i + 1})] \in Ω^{n \times m}, A^{#} = [a_{m - i + 1, n - j + 1}] \in Ω^{m \times n} .

Then, A is called to be (skew)selfconjugate if

A = A^{*} (- A^{*})

, per(skew)symmetric if

A = A^{(*)} (- A^{(*)})

, and centro(skew)symmetric if

A = A^{#} (- A^{#})

. If A is both (skew)selfconjugate and per(skew)symmetric, then A is called to be bi(skew)symmetric. Moreover, a solution pair

(X, Y)

of Equation (1) is called to be per(skew)symmetric (or bi(skew)symmetric) if both X and Y are per(skew)symmetric (or bi(skew)symmetric).

Theorem 21

([112]). Let

A, B, C \in Ω^{m \times n} [λ]

.

(1): (Corollary 2.3, [112]) Equation (1) has a persymmetric solution pair $(X, Y)$ if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that

$\begin{matrix} Q [\begin{matrix} A & C \\ 0 & B \end{matrix}] P^{- 1} & = [\begin{matrix} A & 0 \\ 0 & B \end{matrix}], \end{matrix}$

(33)

$\begin{matrix} P [\begin{matrix} - I_{n} & 0 \\ 0 & I_{n} \end{matrix}] P^{(*)} & = [\begin{matrix} - I_{n} & 0 \\ 0 & I_{n} \end{matrix}], \end{matrix}$

(34)

$\begin{matrix} Q [\begin{matrix} - I_{m} & 0 \\ 0 & I_{m} \end{matrix}] Q^{(*)} & = [\begin{matrix} - I_{m} & 0 \\ 0 & I_{m} \end{matrix}] . \end{matrix}$

(35)
(2): (Corollary 2.8, [112]) Equation (1) has a perskewsymmetric solution pair $(X, Y)$ if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that (33), $P^{(*)} P = I_{2 n}$ , and $Q^{(*)} Q = I_{2 m}$ .
(3): (Corollary 2.13, [112]) Equation (1) has a solution pair $(X, Y)$ such that X is persymmetric and Y is perskewsymmetric if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that (33), (34), and $Q^{(*)} Q = I_{2 m}$ .
(4): (Corollary 2.16, [112]) Equation (1) has a solution pair $(X, Y)$ such that X is perskewsymmetric and Y is persymmetric if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that (33), (35), and $P^{(*)} P = I_{2 n}$ .

Theorem 22

([113]). Let

A, B, C \in Ω^{m \times n} [λ]

.

(1): (Corollary 2, [113]) Equation (1) has a bisymmetric solution pair $(X, Y)$ if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that

$\begin{matrix} Q [\begin{matrix} A & C \\ 0 & B \end{matrix}] P^{- 1} & = [\begin{matrix} A & 0 \\ 0 & B \end{matrix}], \end{matrix}$

(36)

$\begin{matrix} P [\begin{matrix} 0 & I_{n} \\ - I_{n} & 0 \end{matrix}] P^{*} & = [\begin{matrix} 0 & I_{n} \\ - I_{n} & 0 \end{matrix}], \end{matrix}$

(37)

$\begin{matrix} P [\begin{matrix} - I_{n} & 0 \\ 0 & I_{n} \end{matrix}] P^{(*)} & = [\begin{matrix} - I_{n} & 0 \\ 0 & I_{n} \end{matrix}], \end{matrix}$

(38)

$\begin{matrix} P^{#} [\begin{matrix} 0 & I_{n} \\ I_{n} & 0 \end{matrix}] P^{- 1} & = [\begin{matrix} 0 & I_{n} \\ I_{n} & 0 \end{matrix}], \end{matrix}$

(39)

$\begin{matrix} Q [\begin{matrix} 0 & I_{m} \\ - I_{m} & 0 \end{matrix}] Q^{*} & = [\begin{matrix} 0 & I_{m} \\ - I_{m} & 0 \end{matrix}], \end{matrix}$

(40)

$\begin{matrix} Q [\begin{matrix} - I_{m} & 0 \\ 0 & I_{m} \end{matrix}] Q^{(*)} & = [\begin{matrix} - I_{m} & 0 \\ 0 & I_{m} \end{matrix}], \end{matrix}$

(41)

$\begin{matrix} Q^{#} [\begin{matrix} 0 & I_{m} \\ I_{m} & 0 \end{matrix}] Q^{- 1} & = [\begin{matrix} 0 & I_{m} \\ I_{m} & 0 \end{matrix}] . \end{matrix}$

(42)
(2): (Corollary 5, [113]) Equation (1) has a biskewsymmetric solution pair $(X, Y)$ if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that (36),

$\begin{matrix} P^{(*)} P = I_{2 n}, P [\begin{matrix} 0 & I_{n} \\ I_{n} & 0 \end{matrix}] P^{*} = [\begin{matrix} 0 & I_{n} \\ I_{n} & 0 \end{matrix}] = P^{#} [\begin{matrix} 0 & I_{n} \\ I_{n} & 0 \end{matrix}] P^{- 1}, \end{matrix}$

(43)

$\begin{matrix} Q^{(*)} Q = I_{2 m}, Q [\begin{matrix} 0 & I_{m} \\ I_{m} & 0 \end{matrix}] Q^{*} = [\begin{matrix} I_{m} & 0 \\ 0 & I_{m} \end{matrix}] = Q^{#} [\begin{matrix} 0 & I_{m} \\ I_{m} & 0 \end{matrix}] Q^{- 1} . \end{matrix}$

(44)
(3): (Corollary 8, [113]) Equation (1) has a solution pair $(X, Y)$ such that X is bisymmetric and Y is biskewsymmetric if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that (36)–(39), and (44).
(4): (Corollary 10, [113]) Equation (1) has a solution pair $(X, Y)$ such that X is biskewsymmetric and Y is bisymmetric if and only if there exist invertible matrices $P \in Ω^{2 n \times 2 n}$ and $Q \in Ω^{2 m \times 2 m}$ such that (36), (40)–(42), and (43).

5.6. Maximal and Minimal Ranks of the General Solution

Let

F = C

, and let a solution pair

(X, Y)

of Equation (5) be

X = X_{0} + X_{1} i \in C^{r \times n} and Y = Y_{0} + Y_{1} i \in C^{m \times s},

where

X_{0}, X_{1} \in R^{r \times n}

and

Y_{0}, Y_{1} \in R^{m \times s}

. Liu [40] determined the maximal and minimal ranks for X, Y,

X_{0}

,

X_{1}

,

Y_{0}

, and

Y_{1}

.

Theorem 23

(Theorems 2.1 and 2.2, [40]). Let Equation (5) be consistent.

(1): Then,

$\begin{matrix} max_{A X + Y B = C} rank (X) & = min \{n, r, r - rank (A) + rank [\begin{matrix} B \\ C \end{matrix}]\}, \\ max_{A X + Y B = C} rank (Y) & = min {m, s, s - rank (B) + rank [A, C]}, \\ min_{A X + Y B = C} rank (X) & = rank [\begin{matrix} B \\ C \end{matrix}] - rank (B), \\ min_{A X + Y B = C} rank (Y) & = rank [A, C] - rank (A) . \end{matrix}$
(2): Let

$\begin{matrix} S_{1} & = \{X_{0} \in R^{r \times n} ∣ A (X_{0} + i X_{1}) + (Y_{0} + i Y_{1}) B = C\}, \\ S_{2} & = \{X_{1} \in R^{r \times n} ∣ A (X_{0} + i X_{1}) + (Y_{0} + i Y_{1}) B = C\} . \end{matrix}$

Then,

\begin{matrix} max_{X_{0} \in S_{1}} r (X_{0}) = min \{r, n, rank [\begin{matrix} B_{0} & 0 \\ B_{1} & 0 \\ C_{1} & A_{0} \\ - C_{0} & A_{1} \end{matrix}] - 2 rank (A) + r\}, \\ min_{X_{0} \in S_{1}} r (X_{0}) = rank [\begin{matrix} B_{0} & 0 \\ B_{1} & 0 \\ C_{1} & A_{0} \\ - C_{0} & A_{1} \end{matrix}] - rank [\begin{matrix} A_{0} \\ A_{1} \end{matrix}] - rank [\begin{matrix} B_{0} \\ B_{1} \end{matrix}], \end{matrix}

\begin{matrix} max_{X_{1} \in S_{2}} r (X_{1}) = min \{r, n, rank [\begin{matrix} B_{0} & 0 \\ B_{1} & 0 \\ C_{0} & A_{0} \\ C_{1} & A_{1} \end{matrix}] - 2 rank (A) + r\}, \\ min_{X_{1} \in S_{2}} r (X_{1}) = rank [\begin{matrix} B_{0} & 0 \\ B_{1} & 0 \\ C_{0} & A_{0} \\ C_{1} & A_{1} \end{matrix}] - rank [\begin{matrix} A_{0} \\ A_{1} \end{matrix}] - rank [\begin{matrix} B_{0} \\ B_{1} \end{matrix}] . \end{matrix}

(3): Let

$\begin{matrix} S_{3} & = \{Y_{0} \in R^{m \times s} ∣ A (X_{0} + i X_{1}) + (Y_{0} + i Y_{1}) B = C\}, \\ S_{4} & = \{Y_{1} \in R^{m \times s} ∣ A (X_{0} + i X_{1}) + (Y_{0} + i Y_{1}) B = C\} . \end{matrix}$

Then,

\begin{matrix} max_{Y_{0} \in S_{3}} r (Y_{0}) = min \{m, s, rank [\begin{matrix} A_{0} & A_{1} & C_{1} & - C_{0} \\ 0 & 0 & B_{0} & B_{1} \end{matrix}] - 2 rank (B) + s\}, \\ min_{Y_{0} \in S_{3}} r (Y_{0}) = rank [\begin{matrix} A_{0} & A_{1} & C_{1} & - C_{0} \\ 0 & 0 & B_{0} & B_{1} \end{matrix}] - rank [A_{0}, A_{1}] - rank [B_{0}, B_{1}], \\ max_{Y_{1} \in S_{4}} r (Y_{1}) = min \{m, s, rank [\begin{matrix} A_{0} & A_{1} & C_{0} & C_{1} \\ 0 & 0 & B_{0} & B_{1} \end{matrix}] - 2 rank (B) + s\}, \\ min_{Y_{1} \in S_{4}} r (Y_{1}) = rank [\begin{matrix} A_{0} & A_{1} & C_{0} & C_{1} \\ 0 & 0 & B_{0} & B_{1} \end{matrix}] - rank [A_{0}, A_{1}] - rank [B_{0}, B_{1}] . \end{matrix}

Remark 28.

In (Corollary 2.3, [40]), Liu also presented equivalent conditions for Equation (5) to have a (all) real solution pair(s), i.e.,

X = X_{0} and Y = Y_{0},

and a (all) pure imaginary solution pair(s), i.e.,

X = i X_{1} and Y = i Y_{1} .

However, in (Section 3, [114]), Wang et al. provided two counterexamples to illustrate that the items (a) and (c) in (Corollary 2.3, [40]) are incorrect.

5.7. Re-(Non)negative and Re-(Non)positive Definite Solutions

Let

F = C

. For a Hermitian matrix

A \in C^{n \times n}

,

i_{+} (A)

,

i_{-} (A)

, and

i_{0} (A)

represent the numbers of the positive, negative, and zero eigenvalues of A, respectively. In Corollary 5.7 of [115], Wang and He established the maximal and minimal values of

i_{\pm} (X + X^{*}) and i_{\pm} (Y + Y^{*})

for a solution pair

(X, Y)

of the complex matrix Equation

A X B + C Y D = E,

(45)

where

A \in C^{m \times k_{2}}

,

B \in C^{k_{2} \times q}

,

C \in C^{m \times k_{3}}

,

D \in C^{k_{3} \times q}

, and

E \in C^{m \times q}

are given. This result directly yields the equivalent conditions for re-positive definite, re-negative definite, re-nonnegative definite, and re-nonpositive definite solutions to Equation (45).

Definition 8

([115]). Let

A \in C^{n \times n}

and

H (A) = A + A^{*}

. The A is called to be re-positive definite if

H (A) > 0

, re-nonnegative definite if

H (A) \geq 0

, re-negative definite if

H (A) < 0

, and re-nonpositive definite if

H (A) \leq 0

.

Theorem 24

(Corollary 5.8, [115]). Let

X \in C^{k_{2} \times k_{2}}

and

Y \in C^{k_{3} \times k_{3}}

be a solution pair of Equation (45). Denote

\begin{matrix} G_{1} & = [\begin{matrix} 0 & - D & C^{*} & 0 \\ - D^{*} & 0 & E^{*} & 0 \\ C & E & 0 & A \\ 0 & 0 & A^{*} & 0 \end{matrix}], G_{2} = [\begin{matrix} 0 & - C^{*} & D & 0 \\ - C & 0 & E & 0 \\ D^{*} & E^{*} & 0 & B^{*} \\ 0 & 0 & B & 0 \end{matrix}], \\ G_{3} & = [\begin{matrix} 0 & - B & A^{*} & 0 \\ - B^{*} & 0 & E^{*} & 0 \\ A & E & 0 & C \\ 0 & 0 & C^{*} & 0 \end{matrix}], G_{4} = [\begin{matrix} 0 & - A^{*} & B & 0 \\ - A & 0 & E & 0 \\ B^{*} & E^{*} & 0 & D^{*} \\ 0 & 0 & D & 0 \end{matrix}], \end{matrix}

\begin{matrix} s_{1} & = rank [\begin{matrix} 0 & - A^{*} & B & 0 & 0 \\ - A & 0 & E & C & 0 \\ B^{*} & E^{*} & 0 & 0 & D^{*} \end{matrix}], s_{2} = rank [\begin{matrix} 0 & - B & A^{*} & 0 & 0 \\ - A & E & 0 & C & 0 \\ B^{*} & 0 & E^{*} & 0 & D^{*} \\ 0 & 0 & C^{*} & 0 & 0 \end{matrix}], \\ s_{3} & = rank [\begin{matrix} 0 & - B & A^{*} & 0 & 0 \\ - A & E & 0 & C & 0 \\ B^{*} & 0 & E^{*} & 0 & D^{*} \\ 0 & D & 0 & 0 & 0 \end{matrix}], s_{4} = rank [\begin{matrix} A \\ B^{*} \end{matrix}], \end{matrix}

\begin{matrix} w_{1} & = rank [\begin{matrix} 0 & - C^{*} & D & 0 & 0 \\ - C & 0 & E & A & 0 \\ D^{*} & E^{*} & 0 & 0 & B^{*} \end{matrix}], w_{2} = rank [\begin{matrix} 0 & - D & C^{*} & 0 & 0 \\ - C & E & 0 & A & 0 \\ D^{*} & 0 & E^{*} & 0 & B^{*} \\ 0 & 0 & A^{*} & 0 & 0 \end{matrix}], \\ w_{3} & = rank [\begin{matrix} 0 & - D & C^{*} & 0 & 0 \\ - C & E & 0 & A & 0 \\ D^{*} & 0 & E^{*} & 0 & B^{*} \\ 0 & B & 0 & 0 & 0 \end{matrix}], w_{4} = rank [\begin{matrix} C \\ D^{*} \end{matrix}] . \end{matrix}

Then,

(1): X is re-positive definite if and only if

$\begin{matrix} i_{+} (G_{3}) & = rank [\begin{matrix} A & C \end{matrix}] + rank (B) and i_{+} (G_{3}) \geq rank [\begin{matrix} D \\ B \end{matrix}] + rank (A), \\ or i_{+} (G_{3}) & \geq rank [\begin{matrix} A & C \end{matrix}] + rank (B) and i_{+} (G_{4}) = rank [\begin{matrix} D \\ B \end{matrix}] + rank (A) . \end{matrix}$
(2): X is re-negative definite if and only if

$\begin{matrix} i_{-} (G_{3}) & = rank [\begin{matrix} A & C \end{matrix}] + rank (B) and i_{-} (G_{4}) \geq rank [\begin{matrix} D \\ B \end{matrix}] + rank (A), \\ or i_{-} (G_{3}) & \geq rank [\begin{matrix} A & C \end{matrix}] + rank (B) and i_{-} (G_{4}) = rank [\begin{matrix} D \\ B \end{matrix}] + rank (A) . \end{matrix}$
(3): X is re-nonnegative definite if and only if

$\begin{matrix} s_{1} - s_{4} + i_{-} (G_{3}) - s_{2} & = 0 and s_{1} - s_{4} + i_{-} (G_{4}) - s_{3} \leq 0, \\ or s_{1} - s_{4} + i_{-} (G_{3}) - s_{2} & \leq 0 and s_{1} - s_{4} + i_{-} (G_{4}) - s_{3} = 0 . \end{matrix}$
(4): X is re-nonpositive definite if and only if

$\begin{matrix} s_{1} - s_{4} + i_{+} (G_{3}) - s_{2} & = 0 and s_{1} - s_{4} + i_{+} (G_{3}) - s_{3} \leq 0, \\ or s_{1} - s_{4} + i_{+} (G_{3}) - s_{2} & \leq 0 and s_{1} - s_{4} + i_{+} (G_{4}) - s_{3} = 0 . \end{matrix}$
(5): Y is re-positive definite if and only if

$\begin{matrix} i_{+} (G_{1}) & = rank [\begin{matrix} A & C \end{matrix}] + rank (D) and i_{+} (G_{2}) \geq rank [\begin{matrix} D \\ B \end{matrix}] + rank (C), \\ or i_{+} (G_{1}) & \geq rank [\begin{matrix} A & C \end{matrix}] + rank (D) and i_{+} (G_{2}) = rank [\begin{matrix} D \\ B \end{matrix}] + rank (C) . \end{matrix}$
(6): Y is re-negative definite if and only if

$\begin{matrix} i_{-} (G_{1}) & = rank [\begin{matrix} A & C \end{matrix}] + rank (D) and i_{-} (G_{2}) \geq rank [\begin{matrix} D \\ B \end{matrix}] + rank (C), \\ or i_{-} (G_{1}) & \geq rank [\begin{matrix} A & C \end{matrix}] + rank (D) and i_{-} (G_{2}) = rank [\begin{matrix} D \\ B \end{matrix}] + rank (C) . \end{matrix}$
(7): Y is re-nonnegative definite if and only if

$\begin{matrix} w_{1} - w_{4} + i_{-} (G_{1}) - w_{2} & = 0 and w_{1} - w_{4} + i_{-} (G_{2}) - w_{3} \leq 0, \\ or w_{1} - w_{4} + i_{-} (G_{1}) - w_{2} & \leq 0 and w_{1} - w_{4} + i_{-} (G_{2}) - w_{3} = 0 . \end{matrix}$
(8): Y is re-nonpositive definite if and only if

$\begin{matrix} w_{1} - w_{4} + i_{+} (G_{1}) - w_{2} & = 0 and w_{1} - w_{4} + i_{+} (G_{2}) - w_{3} \leq 0, \\ or w_{1} - w_{4} + i_{+} (G_{1}) - w_{2} & \leq 0 and w_{1} - w_{4} + i_{+} (G_{2}) - w_{3} = 0 . \end{matrix}$

Remark 29.

When both B and C in Theorem 24 are taken as identity matrices with

k_{2} = q

and

m = k_{3}

, we can immediately obtain the equivalent conditions for the existence of re-positive definite, re-negative definite, re-nonnegative definite, and re-nonpositive definite solutions of Equation (5).

5.8. $η$ -Hermitian and $η$ -Skew-Hermitian Solutions

Let

F = H

. The

η

-Hermitian matrices have been employed in statistical multichannel processing and widely linear modeling (see [116,117]). Yuan and Wang [118] considered the least-squares

η

-Hermitian solution with the least norm to the quaternion matrix equation

A X E + F X B = C

with unknown X. After that, He and Wang [119] investigated the

η

-Hermitian solutions of the matrix equation

A X {A^{η}}^{*} + B Y {B^{η}}^{*} = C,

(46)

where A, B, and C are given quaternion matrices with appropriate orders. Moreover, a solution pair

(X, Y)

of Equation (46) is called to be

η

-Hermitian (

η

-skew-Hermitian) if both X and Y are

η

-Hermitian (

η

-skew-Hermitian).

Theorem 25

(Corollaries 3.5 and 4.3, [119]). Let A, B, and C be given over

H

such that

C = {C^{η}}^{*}

. Set

M = R_{A} B and S = B L_{M} .

Then, the following are equivalent:

(1): Equation (46) has an η-Hermitian solution pair $(X, Y)$ ;
(2): $R_{M} R_{A} C = 0$ and $R_{A} C {(R_{B})}^{η *} = 0$ ;
(3): $rank [\begin{matrix} A & C \\ 0 & B^{η *} \end{matrix}] = rank (A) + rank (B)$ and $rank [\begin{matrix} A & B & C \end{matrix}] = rank [\begin{matrix} A & B \end{matrix}]$ .

In this case,

\begin{matrix} X = & A^{†} C {(A^{†})}^{η *} - \frac{1}{2} A^{†} B M^{†} C [I + {(B^{†})}^{η *} S^{η *}] {(A^{†})}^{η *} \\ - \frac{1}{2} A^{†} (I + S B^{†}) C {(M^{†})}^{η *} B^{η *} {(A^{†})}^{η *} - A^{†} S W_{2} S^{η *} {(A^{†})}^{η *} + L_{A} U + U^{η *} {(L_{A})}^{η}, \\ Y = & \frac{1}{2} M^{†} C {(B^{†})}^{η *} [I + {(S^{†} S)}^{η}] + \frac{1}{2} (I + S^{†} S) B^{†} C {(M^{†})}^{η *} \\ + L_{M} W_{2} {(L_{M})}^{η} + V L_{B}^{η} + L_{B} V^{η *} + L_{M} L_{S} W_{1} + W_{1}^{η *} {(L_{S})}^{η} {(L_{M})}^{η}, \end{matrix}

where

W_{1}

, U, V, and

W_{2} = W_{2}^{η *}

are arbitrary quaternion matrices with appropriate sizes, and

\begin{matrix} min_{A X A^{η *} + B Y B^{η *} = C} rank (X) = 2 rank [\begin{matrix} C & B \end{matrix}] - rank [\begin{matrix} 0 & B^{η *} \\ B & C \end{matrix}], \\ min_{A X A^{η *} + B Y B^{η *} = C} rank (Y) = 2 rank [\begin{matrix} A & C \end{matrix}] - rank [\begin{matrix} 0 & A^{η *} \\ A & C \end{matrix}] . \end{matrix}

Remark 30.

Inspired by Theorem 25, we now consider the η-Hermitian solutions of Equation (5) over

H

, namely finding X and Y over

H

such that

A X + Y B = C, X = {X^{η}}^{*}, and Y = {Y^{η}}^{*},

(47)

where A, B, and C are given over

H

. By means of

X = \frac{1}{2} (\hat{X} + {\hat{X}}^{η *}) and Y = \frac{1}{2} (\hat{Y} + {\hat{Y}}^{η *}),

it is easy to check that the following are equivalent:

(1): The statement (47) holds.
(2): There exist the matrices $\hat{X}$ and $\hat{Y}$ over $H$ such that

$A \hat{X} + \hat{Y} B = C and \hat{X} {A^{η}}^{*} + {B^{η}}^{*} \hat{Y} = {C^{η}}^{*} .$
(3): There exist the matrices $\hat{X}$ and $\hat{Y}$ over $H$ such that

$[\begin{matrix} A & I \end{matrix}] \hat{X} [\begin{matrix} I & 0 \\ 0 & {A^{η}}^{*} \end{matrix}] + [\begin{matrix} I & {B^{η}}^{*} \end{matrix}] \hat{Y} [\begin{matrix} B & 0 \\ 0 & I \end{matrix}] = [\begin{matrix} C & {C^{η}}^{*} \end{matrix}] .$

Therefore, solving the η-Hermitian solutions of Equation (5) reduces to solving an equation of the form

A X B + C Y D = E,

which has been solved by Baksalary and Kala [120]. By the same method, we have

(1): There exists a matrix pair $(X, Y)$ such that

$A X + Y B = C and X = {X^{η}}^{*},$

if and only if there exist the matrices $\hat{X}$ , $\hat{Y}$ , and $\hat{Z}$ such that

$A \hat{X} + \hat{Y} B = C and \hat{X} {A^{η}}^{*} + {B^{η}}^{*} \hat{Z} = {C^{η}}^{*} .$

(48)
(2): There exists a matrix pair $(X, Y)$ such that

$A X + Y B = C and Y = {Y^{η}}^{*},$

if and only if there exist the matrices $\hat{X}$ , $\hat{Y}$ , and $\hat{Z}$ such that

$A \hat{X} + \hat{Y} B = C and \hat{Z} {A^{η}}^{*} + {B^{η}}^{*} \hat{Y} = {C^{η}}^{*} .$

(49)

Moreover, solvability conditions and expressions of the general solution to Equations (48) and (49) can be obtained by Theorem 2.1 of [121].

Kyrchei [122] further discussed the

η

-skew-Hermitian solutions of Equation (46) under

C = - {C^{η}}^{*}

using a method similar to Theorem 25.

Theorem 26

(Corollary 4.4, [122]). Under the hypotheses of Theorem 25, the following are equivalent:

(1): Equation (46) has an η-skew-Hermitian solution pair $(X, Y)$ ;
(2): $R_{M} R_{A} C = 0$ and $R_{A} C {(R_{B})}^{η *} = 0$ ;
(3): $rank [\begin{matrix} A & C \\ 0 & B^{η^{*}} \end{matrix}] = rank (A) + rank (B)$ and $rank [\begin{matrix} A & B & C \end{matrix}] = rank [\begin{matrix} A & B \end{matrix}]$ ;

in which case, the η-skew-Hermitian solutions of Equation (46) are

\begin{matrix} X = & A^{†} C {(A^{†})}^{η *} - \frac{1}{2} [A^{†} B M^{†} C {(A^{†})}^{η *} + A^{†} C {(A^{†} B M^{†})}^{η *}] \\ - \frac{1}{2} [A^{†} B M^{†} C {(A^{†} S B^{†})}^{η *} + A^{†} S B^{†} C {(A^{†} B M^{†})}^{η *}] - A^{†} S W_{2} {(A^{†} S)}^{η *} - L_{A} U + {(L_{A} U)}^{η *}, \\ Y = & \frac{1}{2} [M^{†} C {(B^{†})}^{η *} + B^{†} C {(M^{†})}^{η *}] + \frac{1}{2} [M^{†} C {(B^{†})}^{η *} {(Q_{S})}^{η *} + Q_{S} B^{†} C {(M^{†})}^{η *}] \\ + L_{M} W_{2} L_{M}^{†} + V L_{B}^{†} - L_{B} V^{η *} + L_{M} L_{S} W_{1} - W_{1}^{η *} L_{S}^{†} L_{M}^{η *}, \end{matrix}

where

W_{1}

, U, V, and

W_{2} = W_{2}^{η *}

are arbitrary over

H

with appropriate sizes.

Remark 31.

Using a method similar to that in Remark 30, one can also consider the η-skew-Hermitian solutions of Equation (5) over

H

, i.e., finding X and Y such that

A X + Y B = C, X = - {X^{η}}^{*}, and Y = - {Y^{η}}^{*} .

Remark 32.

In terms of the determinant representations of the MP inverse over

H

(i.e., Theorem 10), Kyrchei [122] also showed the Cramer’s rules for the partial η-Hermitian and η-skew-Hermitian solution to Equation (46) under

C = {C^{η}}^{*}

and

C = - {C^{η}}^{*}

, respectively.

5.9. $ϕ$ -Hermitian Solutions

Let

F = H

. Rodman [7] introduced the quaternion matrix

A_{ϕ}

over

H

, a generation of

A^{*}

and

{A^{η}}^{*}

, and defined the

ϕ

-Hermitian quaternion matrix as follows:

Definition 9

([7]). Let

ϕ : H \to H

be a map.

(1): We call ϕ an anti-endomorphism if for any $α, β \in H$ , ϕ satisfies

$ϕ (α β) = ϕ (β) ϕ (α) and ϕ (α + β) = ϕ (α) + ϕ (β) .$

An anti-endomorphism ϕ is called an involution if $ϕ^{2}$ is the identity map.
(2): Let ϕ be a nonzero involution. Then, ϕ can be represented as a matrix in $R^{4 \times 4}$ with respect to the basis ${1, i, j, k}$ , i.e.,

$ϕ = [\begin{matrix} 1 & 0 \\ 0 & T \end{matrix}],$

where either $T = - I_{3}$ (in which case ϕ is called a standard involution), or $T \in R^{3 \times 3}$ is an orthogonal symmetric matrix with the eigenvalues ${1, 1, - 1}$ (in which case ϕ is called a nonstandard involution).
(3): Let ϕ be a nonstandard involution and $A = [a_{i, j}] \in H^{m \times n}$ . Define

$ϕ (A) = [ϕ (a_{i, j})] \in H^{m \times n} and A_{ϕ} = ϕ (A^{T}) \in H^{n \times m} .$

If $A = A_{ϕ}$ with $m = n$ , then A is called a ϕ-Hermitian matrix.

He et al. [123] considered the

ϕ

-Hermitian solution

Z = Z_{ϕ}

of the following system

\{\begin{matrix} A_{1} X - Y B_{1} = C_{1}, \\ A_{2} Z - Y B_{2} = C_{2}, \end{matrix}

(50)

where

A_{i}

,

B_{i}

, and

C_{i} (i = 1, 2)

are given matrices over

H

with appropriate orders.

Theorem 27

(Theorem 4.5, [123]). Let

\begin{matrix} A_{11} = R_{B_{2}} B_{1}, B_{11} = R_{A_{2}} A_{2}, C_{11} = B_{1} L_{A_{11}}, D_{11} = R_{A_{1}} (R_{A_{2}} C_{2} B_{2}^{†} B_{1} - C_{1}) L_{A_{11}}, A_{22} = [L_{A_{2}}, - {(R_{C_{11}} B_{2})}_{ϕ}], \\ B_{22} = [\begin{matrix} R_{C_{11}} B_{2} \\ - {(L_{A_{2}})}_{ϕ} \end{matrix}], C_{22} = {(A_{2}^{†} C_{2} L_{B_{2}})}_{ϕ} + {(B_{2})}_{ϕ} {(C_{11}^{†})}_{ϕ} D_{22} {(B_{11}^{†})}_{ϕ} - A_{2}^{†} C_{2} - B_{11}^{†} D_{11} C_{11}^{†} B_{2}, \\ A = R_{A_{22}} L_{B_{11}}, B = B_{2} L_{B_{22}}, C = - R_{A_{22}} {(B_{2})}_{ϕ}, D = {(L_{B_{11}})}_{ϕ} L_{B_{22}}, E = R_{A_{22}} C_{22} L_{B_{22}}, M = R_{A} C, \end{matrix}

Then, the following are equivalent:

(1): The system (50) has a solution $(X, Y, Z)$ such that $Z = Z_{ϕ}$ .
(2): The following rank equalities hold:

$\begin{matrix} rank [\begin{matrix} C_{i} & A_{i} \\ B_{i} & 0 \end{matrix}] = rank (A_{i}) + rank (B_{i}), i = 1, 2, \\ rank [\begin{matrix} C_{1} & C_{2} & A_{1} & A_{2} \\ B_{1} & B_{2} & 0 & 0 \end{matrix}] = rank [\begin{matrix} A_{1} & A_{2} \end{matrix}] + rank [\begin{matrix} B_{1} & B_{2} \end{matrix}], \\ rank [\begin{matrix} C_{1} & C_{2} {(A_{2})}_{ϕ} - A_{2} {(C_{2})}_{ϕ} & A_{1} & A_{2} {(B_{2})}_{ϕ} \\ B_{1} & B_{2} {(A_{2})}_{ϕ} & 0 & 0 \end{matrix}] = rank [\begin{matrix} A_{1} & A_{2} {(B_{2})}_{ϕ} \end{matrix}] + rank [\begin{matrix} B_{1} & B_{2} {(A_{2})}_{ϕ}) \end{matrix}], \end{matrix}$

$\begin{matrix} rank [\begin{matrix} C_{2} {(A_{2})}_{ϕ} - A_{2} {(C_{2})}_{ϕ} & A_{2} {(B_{2})}_{ϕ} \\ B_{2} {(A_{2})}_{ϕ} & 0 \end{matrix}] = 2 rank (A_{2} {(B_{2})}_{ϕ}), \\ rank [\begin{matrix} C_{1} & C_{2} {(A_{2})}_{ϕ} - A_{2} {(C_{2})}_{ϕ} & A_{1} & A_{2} {(B_{2})}_{ϕ} \\ 0 & - {(C_{1})}_{ϕ} & 0 & {(B_{1})}_{ϕ} \\ B_{1} & B_{2} {(A_{2})}_{ϕ} & 0 & 0 \\ 0 & {(A_{1})}_{ϕ} & 0 & 0 \end{matrix}] = 2 rank [\begin{matrix} A_{1} & A_{2} {(B_{2})}_{ϕ} \\ 0 & {(B_{1})}_{ϕ} \end{matrix}] . \end{matrix}$
(3): The following equations hold:

$R_{A_{2}} C_{2} L_{B_{2}} = 0, D_{11} L_{C_{11}} = 0, R_{B_{11}} D_{11} = 0, R_{M} R_{A} E = 0, R_{C} E L_{B} = 0, R_{A} E L_{D} = 0 .$

In this case,

\begin{matrix} X & = \frac{1}{2} (X_{1} + {(X_{5})}_{ϕ}), Y = \frac{1}{2} (X_{2} + {(X_{4})}_{ϕ}), and Z = \frac{1}{2} (X_{3} + {(X_{3})}_{ϕ}), \end{matrix}

where

X_{1}, X_{2}, \dots, X_{5}

are given in (Formulas (4.24)–(4.39), [123]).

Remark 33.

(1): When $A_{1} = B_{1} = C_{1} = 0$ , Theorem 27 yields the result for

$A_{2} Z - Y B_{2} = C_{2} subject to Z = Z_{ϕ},$

which can be regarded as Equation (1) under the constrain that X is ϕ-Hermitian, i.e.,

$A X - Y B = C subject to X = X_{ϕ} .$

(51)
(2): Note that ϕ-Hermitian matrices are a generalization of Hermitian matrices. In Theorems 5.1 and 5.2 of [121], He and Wang have investigated the following problem over $C$ :

$\begin{matrix} A X - Y B = C subject to X = X^{*} (or Y = Y^{*}), \end{matrix}$

which is clearly similar to the problem (51).
(3): By the same method as in Remark 30, we can also discuss the following problem:

$A X - Y B = C subject to X = X_{ϕ} and Y = Y_{ϕ} .$

5.10. Equality-Constrained Solutions

Let

F = C

. Wang et al. [124] considered the solvability conditions and the general solution for Equation (5) over

C

under the following equality constraints:

A_{1} X = C_{1}, Y B_{2} = C_{2}, A_{3} X B_{3} = C_{3}, and A_{4} Y B_{4} = C_{4},

(52)

where

A_{1}

,

A_{3}

,

A_{4}

,

B_{2}

,

B_{3}

,

B_{4}

,

C_{1}

,

C_{2}

,

C_{3}

, and

C_{4}

are given.

Theorem 28

(Theorem 3.2, [124]). Let

A_{1}

,

B_{1}

,

C_{1}

,

B_{2}

,

C_{2}

,

A_{3}

,

B_{3}

,

C_{3}

,

A_{4}

,

B_{4}

,

C_{4}

, A, B, and C be given matrices over

C

with appropriate sizes. Set

\begin{matrix} T = A_{3} L_{A_{1}}, K = R_{B_{2}} B_{4}, ϕ_{1} = A_{1}^{†} C_{1} + L_{A_{1}} T^{†} (C_{3} - A_{3} A_{1}^{†} C_{1} B_{3}) B_{3}^{†}, \\ ϕ_{2} = C_{2} B_{2}^{†} + A_{4}^{†} (C_{4} - A_{4} C_{2} B_{2}^{†} B_{4}) K^{†} R_{B_{2}}, A_{11} = A L_{A_{1}} L_{T}, B_{11} = R_{K} R_{B_{2}} B, \\ C_{33} = A L_{A_{1}}, D_{33} = R_{B_{3}}, C_{44} = L_{A_{4}}, D_{44} = R_{B_{2}} B, E_{11} = C - A ϕ_{1} - ϕ_{2} B, \\ A_{a} = R_{A_{11}} C_{33}, B_{b} = D_{33} L_{B_{11}}, C_{c} = R_{A_{a}} C_{44}, D_{d} = D_{44} L_{B_{11}}, \\ E = R_{A_{11}} E_{11} L_{B_{11}}, M = R_{A_{a}} C_{c}, N = D_{d} L_{B_{b}}, S = C_{c} L_{M} . \end{matrix}

Then, the following are equivalent:

(1): Equation (5) under the constraints (52) is consistent.
(2): The following rank equations hold:

$rank [A_{1}, C_{1}] = rank (A_{1}), rank [\begin{matrix} C_{3} \\ B_{3} \end{matrix}] = rank (B_{3}), rank [\begin{matrix} A_{1} & C_{1} B_{3} \\ A_{3} & C_{3} \end{matrix}] = rank [\begin{matrix} A_{1} \\ A_{3} \end{matrix}],$

$rank [\begin{matrix} C_{2} \\ B_{2} \end{matrix}] = rank (B_{2}), rank [A_{4}, C_{4}] = rank (A_{4}), rank [\begin{matrix} C_{4} & A_{4} C_{2} \\ B_{4} & B_{2} \end{matrix}] = rank [B_{4}, B_{2}],$

$rank [\begin{matrix} 0 & B_{2} & B B_{3} \\ A & C_{2} & C B_{3} \\ A_{3} & 0 & C_{3} \\ A_{1} & 0 & C_{1} B_{3} \end{matrix}] = rank [\begin{matrix} 0 & B_{2} & B B_{3} \\ A & 0 & 0 \\ A_{3} & 0 & 0 \\ A_{1} & 0 & 0 \end{matrix}], rank [\begin{matrix} 0 & B & B_{2} \\ A & C & C_{2} \\ A_{1} & C_{1} & 0 \end{matrix}] = rank [\begin{matrix} 0 & B & B_{2} \\ A & 0 & 0 \\ A_{1} & 0 & 0 \end{matrix}],$

$\begin{matrix} rank [\begin{matrix} 0 & B & B_{4} & B_{2} \\ A_{4} A & A_{4} C & C_{4} & A_{4} C_{2} \\ A_{1} & C_{1} & 0 & 0 \end{matrix}] = rank [\begin{matrix} 0 & B & B_{4} & B_{2} \\ A_{4} A & 0 & 0 & 0 \\ A_{1} & 0 & 0 & 0 \end{matrix}], \\ rank [\begin{matrix} 0 & B_{4} & B_{2} & B B_{3} \\ A_{3} & 0 & 0 & C_{3} \\ A_{1} & 0 & 0 & C_{1} B_{3} \\ A_{4} A & C_{4} & A_{4} C_{2} & A_{4} C B_{3} \end{matrix}] = rank [\begin{matrix} 0 & B_{4} & B_{2} & B B_{3} \\ A_{3} & 0 & 0 & 0 \\ A_{1} & 0 & 0 & 0 \\ A_{4} A & 0 & 0 & 0 \end{matrix}] . \end{matrix}$
(3): The following equations hold:

$\begin{matrix} R_{A_{1}} C_{1} = 0, R_{T} (C_{3} - A_{3} A_{1}^{†} C_{1} B_{3}) = 0, C_{3} L_{B_{3}} = 0, \\ C_{2} L_{B_{2}} = 0, R_{A_{4}} C_{4} = 0, (C_{4} - A_{4} C_{2} B_{2}^{†} B_{4}) L_{K} = 0, \\ R_{M} R_{A_{a}} E = 0, E L_{B_{b}} L_{N} = 0, R_{A_{a}} E L_{D_{d}} = 0, R_{C_{c}} E L_{B_{b}} = 0 . \end{matrix}$

In this case,

\begin{matrix} X & = A_{1}^{†} C_{1} + L_{A_{1}} T^{†} (C_{3} - A_{3} A_{1}^{†} C_{1} B_{3}) B_{3}^{†} + L_{A_{1}} L_{T} Z_{1} + L_{A_{1}} W_{1} R_{B_{3}}, \\ Y & = C_{2} B_{2}^{†} + A_{4}^{†} (C_{4} - A_{4} C_{2} B_{2}^{†} B_{4}) K^{†} R_{B_{2}} + L_{A_{4}} W_{2} R_{B_{2}} + Z_{2} R_{K} R_{B_{2}}, \\ Z_{1} & = A_{11}^{†} (E_{11} - C_{33} W_{1} D_{33} - C_{44} W_{2} D_{44}) - A_{11}^{†} V_{7} B_{1} + L_{A_{11}} V_{6}, \\ Z_{2} & = R_{A_{11}} (E_{11} - C_{32} W_{1} D_{33} - C_{44} W_{2} D_{44}) B_{11}^{†} + A_{11} A_{11}^{†} V_{7} + V_{8} R_{B_{11}}, \\ W_{1} & = A_{a}^{†} E B_{b}^{†} - A_{a}^{†} C_{c} M^{†} E B_{b}^{†} - A_{a}^{†} S C_{c}^{†} E N^{†} D_{d} B_{b}^{†} - A_{a}^{†} S V_{4} R_{N} D_{d} B_{b}^{†} + L_{A_{a}} V_{1} + V_{2} R_{B_{b}}, \\ W_{2} & = M^{†} E D_{d}^{†} + S^{†} S C_{c}^{†} E N^{†} + L_{M} L_{S} V_{3} + L_{M} V_{4} R_{N} + V_{5} R_{D_{d}}, \end{matrix}

where

V_{1}, \dots, V_{8}

are arbitrary matrices over

C

with appropriate orders.

Remark 34.

Inspired by Theorem 28, using the Kronecker product and the vectorization operation, Wang et al. [125] investigated the minimum-norm least-squares solution for the quaternion tensor system under the Einstein product:

\begin{matrix} A *_{N} X + Y *_{N} B = C \\ subject to & \{\begin{matrix} A_{1} *_{N} X = C_{1}, A_{3} *_{N} X *_{N} B_{3} = C_{3}, \\ Y *_{N} B_{2} = C_{2}, A_{4} *_{N} Y *_{N} B_{4} = C_{4}, \end{matrix} \end{matrix}

(53)

where

X

and

Y

are unknown tensors and the others are given tensors over

H

. Thus, the minimum-norm least-squares solution of the tensor Equation (53) is directly given in (Corollary 3.4, [125]), which is also expounded in Section 6.5.

6. Various Generalizations of GSE

This section shows the generalizations of GSE to diverse domains such as various rings, dual numbers, dual quaternions, linear operators, tensors, and matrix polynomials, as well as its more general forms. This embodies another enchantment of mathematics: constantly exploring more general problems to ultimately reveal the most essential conclusions.

6.1. Generalizing RET over Different Rings

RET given in Section 3 characterizes the equivalent condition for the solvability of GSE over a field. This subsection mainly generalizes RET over the following algebraic structures: unit regular rings, principal ideal rings, commutative rings, division rings, Artinian rings, etc. To simplify expressions, we first introduce Guralnick’s definition in [126].

Definition 10

([126]). If (3) is equivalent to (4) over a ring

F

, then we say that

F

has the equivalence property.

6.1.1. Generalizing RET over Unit Regular Rings

A ring

F

is called to be unit regular if for any

a \in F

, there exists a unit

u \in F

such that

a = a u a .

Hartwig [127] generalized RET to a unit regular ring, also extending Theorems 3 and 4.

Theorem 29

([127]). Let

F

be a unit regular ring, and

M = [\begin{matrix} a & c \\ 0 & b \end{matrix}] \in F^{2 \times 2} .

Then, the following statements are equivalent:

(1): M has an inner inverse with the form of $[\begin{matrix} r & s \\ 0 & t \end{matrix}] \in F^{2 \times 2}$ ;
(2): $a x - y b = c$ has a solution pair $x, y \in F$ ;
(3): $(1 - a a^{-}) c (1 - b^{-} b) = 0$ for all $a^{-}$ and $b^{-}$ ;
(4): $a_{1} c b_{1} = 0$ , where $a_{1} \in {x \in F ∣ x a = 0}$ and $b_{1} \in {x \in F ∣ b x = 0}$ ;
(5): $M = p [\begin{matrix} a & 0 \\ 0 & b \end{matrix}] q$ , where $p, q \in F$ are invertible;
(6): $(1 - a a^{(1, 2)}) c (1 - b^{(1, 2)} b) = 0$ for all $a^{(1, 2)}$ and $b^{(1, 2)}$ ;
(7): $M^{(1, 2)} = [\begin{matrix} a^{(1, 2)} & - a^{(1, 2)} c b^{(1, 2)} \\ 0 & b^{(1, 2)} \end{matrix}]$ is a reflexive inverse of M.

If

F

is a skewfield (or a commutative ring without zero divisors) and

a, c, b \in F^{n \times n}

, then items (1)–(7) are also equivalent to

(5a): $rank (M) = rank (a) + rank (b)$ ;
(5b): $rank ((1 - a a^{(1, 2)}) c (1 - b^{(1, 2)} b)) = 0$ .

Remark 35.

In the conclusions of [127], Hartwig mentioned that RET also holds for matrices over Euclidean domains and unit regular rings. These rings are finite and elementary divisor rings satisfying the cancellation law. However, whether these properties are sufficient to ensure that the validity of RET remains an open problem. In [126], Guralnick considered parts of this problem.

6.1.2. Generalizing RET over Principal Ideal Domains

Building on Theorem 2 of [128], Feinberg [129] considered RET over principal ideal domains and further extended it to a more general form.

Theorem 30

(Theorems 1 and 2, [129]). Let

F

be a principal ideal domain.

(1): Let $A \in F^{r \times r}$ , $B \in F^{s \times s}$ , and $C \in F^{r \times s}$ . Then, the matrix equation

$A X + Y B = C$

is consistent if and only if

$[\begin{matrix} A & C \\ 0 & B \end{matrix}] and [\begin{matrix} A & 0 \\ 0 & B \end{matrix}]$

are equivalent.
(2): Let $M_{i j} \in F^{r_{j} \times r_{j}}$ for $1 \leq i \leq j \leq t$ . Then,

$[\begin{matrix} M_{11} & M_{12} & \dots & M_{1 t} \\ 0 & M_{22} & \dots & M_{2 t} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & \dots & 0 & M_{t t} \end{matrix}] and [\begin{matrix} M_{11} & 0 & \dots & 0 \\ 0 & M_{22} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & \dots & 0 & M_{t t} \end{matrix}]$

are equivalent if and only if there exist $X_{i j}, Y_{i j} \in F^{r_{i} \times r_{j}}$ such that

$M_{i j} = M_{i i} X_{i j} + \sum_{k = i + 1}^{j} Y_{i k} M_{k j}$

for $1 \leq i < j \leq t$ .

Remark 36.

Feinberg [129] also noted that it is easy to generalize Theorem 1 to a principal ideal domain by a method similar to that of Theorem 1 of [129].

Let

F = C

. In view of Remark 6, we can see that (4) implies that there exist the equivalence matrices to be of the upper block triangular, i.e.,

[\begin{matrix} I & - Y \\ 0 & I \end{matrix}] and [\begin{matrix} I & X \\ 0 & I \end{matrix}] .

Olshevsky [22] generalized the above conclusion by showing that, if any block upper triangular matrix is equivalent to its block diagonal part, then the equivalent matrices can also be chosen in the form of a block upper triangular matrix.

Theorem 31

(Theorem 3.2, [22]). Let

A_{i j} \in C^{n_{i} \times n_{j}}

for

1 \leq i \leq j \leq k

. If

G = [\begin{matrix} A_{11} & 0 & \dots & \dots & 0 \\ 0 & A_{22} & 0 & \dots & 0 \\ ⋮ & 0 & ⋱ & ⋱ & ⋮ \\ ⋮ & ⋱ & ⋱ & 0 \\ 0 & \dots & \dots & 0 & A_{k k} \end{matrix}] and H = [\begin{matrix} A_{11} & A_{12} & \dots & \dots & A_{1 k} \\ 0 & A_{22} & A_{23} & \dots & A_{2 k} \\ ⋮ & 0 & ⋱ & ⋮ \\ ⋮ & ⋱ & ⋱ & ⋱ & ⋮ \\ 0 & \dots & \dots & 0 & A_{k k} \end{matrix}]

are equivalent, then there exist

X_{i j}, Y_{i j} \in C^{n_{i} \times n_{j}}

(

1 \leq i < j \leq k

) such that

[\begin{matrix} I_{n_{1}} & Y_{12} & \dots & \dots & Y_{1 k} \\ 0 & I_{n_{2}} & Y_{23} & \dots & Y_{2 k} \\ ⋮ & 0 & ⋱ & ⋮ \\ ⋮ & ⋱ & ⋱ & ⋮ \\ 0 & \dots & \dots & 0 & I_{n_{k}} \end{matrix}] G [\begin{matrix} I_{n_{1}} & X_{12} & \dots & \dots & X_{1 k} \\ 0 & I_{n_{2}} & X_{23} & \dots & X_{2 k} \\ ⋮ & 0 & ⋱ & ⋮ \\ ⋮ & ⋱ & ⋱ & ⋮ \\ 0 & \dots & \dots & 0 & I_{n_{k}} \end{matrix}] = H .

Remark 37.

Theorem 31 can also be derived from Theorem 30 (30).

6.1.3. Generalizing RET over Division and Module-Finite Rings

Let

F

be a ring with identity. For

G \in F^{a \times b}

and

H \in F^{c \times d}

, let

E_{R} (G, H) = \{(T, S) ∣ T \in F^{c \times a}, S \in F^{d \times b} and T G = H S\} .

Denote

M = [\begin{matrix} A & C \\ 0 & B \end{matrix}],

where A, B, and C are given in Equation (1). For

(T, S) \in E_{R} (M, A)

, denote

T = [\begin{matrix} T_{1} & T_{2} \end{matrix}] and S = [\begin{matrix} S_{1} & S_{2} \end{matrix}],

where

T_{1} \in F^{m \times m}

,

T_{2} \in F^{m \times s}

,

S_{1} \in F^{r \times r}

, and

S_{2} \in F^{r \times n}

. Then,

T M = A S \Leftrightarrow \{\begin{matrix} T_{1} A = A S_{1}, \\ T_{1} C + T_{2} B = A S_{2} . \end{matrix}

Define a map

g_{M, R} : E_{R} (M, A) \to E_{R} (A, A) by g ((T_{1}, T_{2}), (S_{1}, S_{2})) = (T_{1}, S_{1}) .

(54)

Gustafson et al. [130] gave a general characterization of the solvability of Equation (1) over

F

based on the map

g_{M, R}

, which is essentially similar to the method used by Hartwig in [127].

Theorem 32

(Lemma 1, [130]). Equation (1) is consistent if and only if the map

g_{M, R}

defined in (54) is surjective.

Using Theorem 32, Gustafson et al. [130] further proved that RET also holds for a division ring and for a ring which is finitely generated as modules over its center.

Theorem 33

([130]). A division ring (or a ring that is module-finite over its center) has the equivalence property.

6.1.4. Generalizing RET over Commutative Rings

Let

F

be a commutative ring with identity. Gustafson [131] showed that RET remain valid over the commutative ring

F

. By reducing this problem to Artinian rings, Gustafson employed a simple argument with the composition length. This approach parallels that of Flanders and Wimmer [11], who used linear transformations and subspace dimensions to discuss RET over fields.

Theorem 34

(Theorem 1, [131]). A commutative ring with identity has the equivalence property.

Guralnick [132] further generalized Theorem 34 to finite sets of matrices over the commutative ring

F

. Let

F [x_{1}, x_{2}, \dots, x_{t}]

be the polynomial ring over

F

, where

x_{i}

(i = 1, \dots, t)

is commute with each other and any element in

F

. For

\tilde{C} = \{C_{i} \in F^{m \times n} ∣ 1 \leq i \leq r\} and \tilde{D} = \{D_{i} \in F^{m \times n} ∣ 1 \leq i \leq r\},

we call that

\tilde{C}

and

\tilde{D}

are simultaneously equivalent if there exist invertible matrices

U \in F^{m \times m}

and

V \in F^{n \times n}

such that

U C_{i} V = D_{i}

for any

1 \leq i \leq r

(see [132,133]).

Theorem 35

(Theorem B(i), [132]). For

A_{i} \in F^{m \times r}

,

B_{i} \in F^{s \times n}

, and

C_{i} \in F^{m \times n}

, let

M_{i} = [\begin{matrix} A_{i} & C_{i} \\ 0 & B_{i} \end{matrix}] and N_{i} = [\begin{matrix} A_{i} & 0 \\ 0 & B_{i} \end{matrix}],

where

1 \leq i \leq t

. Suppose that the polynomial ring

F [x_{1}, x_{2}, \dots, x_{t}]

has the equivalence property. Then, the system of matrix equations

\{\begin{matrix} A_{1} X - Y B_{1} = C_{1}, \\ A_{2} X - Y B_{2} = C_{2}, \\ \dots \\ A_{r} X - Y B_{r} = C_{r}, \end{matrix}

(55)

has a common solution pair

X \in F^{r \times n}

and

Y \in F^{m \times s}

if and only if

\tilde{M} = \{M_{i} ∣ 1 \leq i \leq r\} and \tilde{N} = \{N_{i} ∣ 1 \leq i \leq r\}

are simultaneously equivalent.

Remark 38.

Dmytryshyn and their collaborators [12,13] have done significant contributions on the further generalizations of the system (55). Moreover, Dmytryshyn was awarded the SIAM Student Paper Prize 2015 for their work [12].

Let

F

be a field with

char (F) \neq 2

. We stipulate that, in the following systems from [12,13], except for the unknown matrices, the remaining matrices are given with appropriate orders over

F

. In Theorem 4.1 of [12], Dmytryshyn et al. first study the system

A_{i} X_{k} \pm X_{j} B_{i} = C_{i},

(56)

where

i = 1, \dots, n, k, j \in {1, \dots, m}, 1 \leq m \leq 2 n,

and

X_{1}, \dots, X_{m}

are unknown. Their method for solving the system (56) extends that used in [11]. Based on the system (56), they then considered the main research subject of [12], i.e.,

\{\begin{matrix} A_{i} X_{k} \pm X_{j} B_{i} = C_{i}, i = 1, \dots, n_{1}, \\ F_{i^{'}} X_{k^{'}} \pm X_{j^{'}}^{★} G_{i^{'}} = H_{i^{'}}, i^{'} = 1, \dots, n_{2}, \end{matrix}

(57)

where

(i): $k, j, k^{'}, j^{'} \in {1, \dots, m}$ , $1 \leq m \leq (2 n_{1} + 2 n_{2})$ , and $X_{1}, \dots, X_{m}$ are unknown;
(ii): For $1 \leq l \leq m$ , the symbol $X_{l}^{★}$ denotes the matrix transpose $X_{l}^{T}$ and for the complex number field, also the matrix conjugate transpose $X_{l}^{*}$ ,

(see Theorem 1.1, [12]). In Theorem 6.1 of [12], they further generalized the system (57) to the following form:

\{\begin{matrix} A_{i} X_{k} K_{i} - L_{i} X_{j} B_{i} = C_{i}, i = 1, \dots, n_{1}, \\ F_{i^{'}} X_{k^{'}} M_{i^{'}} + N_{i^{'}} X_{j^{'}}^{★} G_{i^{'}} = H_{i^{'}}, i^{'} = 1, \dots, n_{2}, \end{matrix}

(58)

where

j, k, j^{'}, k^{'} \in {1, \dots, m}

,

1 \leq m \leq (2 n_{1} + 2 n_{2})

, and

X_{1}, X_{2}, \dots, X_{m}

are unknown.

Let

F

be a skew field of

char (F) \neq 2

that is finite dimensional over its center. Interestingly, two years later, Dmytryshyn et al. [13] generalized the system (58) over

F

including the complex conjugation of unknown matrices, i.e.,

A_{i} X_{i^{'}}^{ε_{i}} M_{i} - N_{i} X_{i^{''}}^{δ_{i}} B_{i} = C_{i},

(i): Of complex matrix equations, in which $ε_{i}, δ_{i} \in {1, C, T, *}$ and $X^{C} = \bar{X}$ is the complex conjugate of X;
(ii): Of quaternion matrix equations, in which $ε_{i}, δ_{i} \in {1, *}$ and $X^{*}$ is the quaternion conjugate transpose of X,

where

i^{'}, i^{''} \in {1, \dots, t}

,

i = 1, \dots, s

,

1 \leq t \leq 2 s

, and

X_{1}, \dots, X_{t}

are unknown (see Theorem 2, [13]). The system (57) is also extended over

F

(see Theorem 1, [13]).

6.1.5. Generalizing RET over Artinian and Noncommutative Rings

Let N be a subgroup of a finitely generated Abelian group M. If M and

N \oplus M / N

are isomorphic, then N is a direct summand of M? This problem is raised by H. Matsumura, and it is subsequently answered affirmatively by H. Toda. Miyata [134] showed that this result can also be generalized to the more general case of module, i.e., if

F

is a commutative Noetherian ring and M is a finitely presented

F

-module with a submodule N, then

M ≅ N \oplus M / N \Leftrightarrow N is a summand of M .

(59)

We say that

F

has the extension property if (59) holds.

Guralnick [126] proved the equivalence between the equivalence property and the extension property for an Artinian ring

F

.

Theorem 36

(Corollary 2.7, [126]). Let

F

be a right Artinian ring. Then,

F

has the extension property if and only if

F

has the equivalence property.

Remark 39.

In the proof of Theorem 3.4 of [126], Guralnick proposed a new perspective to prove Theorem 34. Moreover, Guralnick in Theorem 3.5 of [126] showed that a commutative ring has the extension property. Differing from Miyata [134] and Gustafson [131], this proof avoids the completion of a local Noetherian ring.

Guralnick [126] found that two special classes of Artinian rings (i.e., semisimple Artinian rings and Artinian principal ideal rings) possess the equivalence property.

Theorem 37

(Theorem 2.4 and Corollary 4.6, [126]).

(1): A semisimple Artinian ring has the equivalence property.
(2): An Artinian principal ideal ring has the equivalence property.

Guralnick [126] finally discussed the more general case where

F

is a regular ring, which generalizes Theorem 37 as well items (2) and (5) in Theorem 29.

Theorem 38

(Theorem 4.3, [126]). Let

F

be a regular ring. Then,

F

has the equivalence property if and only if

F^{n \times n}

is directly finite for all n.

Interestingly, Guralnick [135] gave a generalized definition of the equivalence property, i.e., the generalized equivalence property.

Definition 11

([135]). Let

F

be a ring with identity. For

A_{i j} \in F^{n_{i} \times m_{j}}

(

1 \leq i \leq j \leq k

), denote

\tilde{A} = [\begin{matrix} A_{11} & 0 \\ ⋱ \\ 0 & A_{k k} \end{matrix}] and \tilde{B} = [\begin{matrix} A_{11} & A_{i j} \\ ⋱ \\ 0 & A_{k k} \end{matrix}] .

We define that

F

has the generalized equivalence property if

\tilde{A}

and

\tilde{B}

are and equivalent imply that there exist

X_{i j} \in F^{n_{i} \times n_{j}}

and

Y_{i j} \in F^{m_{i} \times m_{j}}

such that

[\begin{matrix} I & X_{i j} \\ ⋱ \\ 0 & I \end{matrix}] \tilde{A} = \tilde{B} [\begin{matrix} I & Y_{i j} \\ ⋱ \\ 0 & I \end{matrix}] .

Guralnick [135] then proved that not only semisimple Artinian rings and Artinian principal ideal rings but also module finite R-algebras for a commutative ring R possess the generalized equivalence property. This evidently generalizes Theorem 37.

Theorem 39

(Theorems 3.3, 3.6 and 3.7, [135]). A semisimple Artinian ring, an Artinian principal ideal ring, or a module finite R-algebra for a commutative ring R has the generalized equivalence property.

Remark 40.

It is worth noting that the generalized form of Equation (5), i.e.,

A X B + C Y D = E

with unknown X and Y, has also been discussed over fields, principal ideal domains, simple Artinian rings, regular rings with identity, and the associative rings with unit by [120,136,137,138,139], respectively.

6.2. Generalizing RET to a Rank Minimization Problem

In a brief three-page article [140], Lin and Wimmer revealed that RET is essentially a special case of a rank minimization problem over a field

F

. Let

GI (k)

be the set of all invertible matrices of order k.

Theorem 40

(Theorem 2, [140]). Let

F

be a field, and let

A \in F^{m \times m}

,

B \in F^{n \times n}

, and

C \in F^{m \times n}

. Then,

\begin{matrix} min \{rank (A X - Y B - C) ∣ X, Y \in F^{m \times n}\} \\ = & min \{rank (P [\begin{matrix} A & C \\ 0 & B \end{matrix}] - [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] Q) ∣ P, Q \in GI (m + n)\} . \end{matrix}

Subsequently, Ito and Wimmer [141] generalized Theorem 40 to Bezout domains under the condition that A and B are regular. An integral domain

F

with identity is called a Bezout domain if it satisfies that all finitely generated ideals are principal.

Theorem 41

(Theorem 3.1, [141]). Let

F

be a Bezout domain, and let

A \in F^{m \times m}

,

B \in F^{n \times n}

, and

C \in F^{m \times n}

. If A and B are regular over

F

, then

\begin{matrix} min \{rank (A X - Y B - C) ∣ X, Y \in F^{m \times n}\} \\ = & min \{rank (P [\begin{matrix} A & C \\ 0 & B \end{matrix}] - [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] Q) ∣ P, Q \in GI (m + n)\} \\ = & rank [\begin{matrix} A & C \\ 0 & B \end{matrix}] - rank [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] \\ = & \dim (R (A) + C N (B)) - rank (A) . \end{matrix}

Remark 41.

Theorem 41 also yields the equivalence of (3), (4), and (7) over a Bezout domain directly (see (Corollary 3.2, [141])).

6.3. GSE over Dual Numbers and Dual Quaternions

In 1843, the Irish mathematician William Rowan Hamilton [142] invented quaternions

H

(also called Hamilton quaternions or real quaternions). The set

H

is a noncommutative associative division algebra, and it also generalizes the real field

R

and the complex field

C

. The quaternion algebra has been effectively applied to mechanics, optics, color image processing, signal processing, computer graphics, flight mechanics, quantum physics, and so on (see [7,47,143,144,145,146,147]).

On the other hand, the British mathematician William Kingdon Clifford [148] invented dual numbers and dual quaternions in 1873. Up to now, dual numbers have been widely used in fields such as kinematics, statics, dynamics, robotics, and brain dynamics (see [149,150,151,152,153,154]).

Definition 12

([148]). A dual number is defined as

\hat{a} = a_{0} + a_{1} ε,

where

a_{0}, a_{1} \in R

, and ε is the dual unit such that

ε \neq 0, 0 ε = ε 0 = 0, 1 ε = ε 1 = ε and ε^{2} = 0 .

(60)

In this case,

a_{0}

is called primal/real/standard part of

\hat{a}

, and

a_{1}

is called dual/infinitesimal part of

\hat{a}

. The set of all dual numbers is denoted by

D

, which is a commutative ring.

Fan et al. [155] established solvability conditions and expressions of the general solution to Equation (5) over

D

by using the MP inverse and SVD.

Theorem 42

(Theorem 3, [155]). Let

A = A_{0} + A_{1} ε \in D^{m \times r}, B = B_{0} + B_{1} ε \in D^{s \times n}, C = C_{0} + C_{1} ε \in D^{m \times n},

where

A_{i} \in R^{m \times r}

,

B_{i} \in R^{s \times n}

, and

C_{i} \in R^{m \times n}

(i = 0, 1)

. The SVDs of

A_{0}

and

B_{0}

are given by

A_{0} = P [\begin{matrix} Ω & 0 \\ 0 & 0 \end{matrix}] Q^{T} and B_{0} = M [\begin{matrix} Λ & 0 \\ 0 & 0 \end{matrix}] N^{T},

where

Ω = diag (ω_{1}, \dots, ω_{l})

,

l = rank (A_{0})

,

Λ = diag (λ_{1}, \dots, λ_{t})

,

t = rank (B_{0})

, and

P \in R^{m \times m}

,

Q \in R^{r \times r}

,

M \in R^{s \times s}

, and

N \in R^{n \times n}

are orthogonal. Let

P = [\begin{matrix} P_{1} & P_{2} \end{matrix}], Q = [\begin{matrix} Q_{1} & Q_{2} \end{matrix}], M = [\begin{matrix} M_{1} & M_{2} \end{matrix}], N = [\begin{matrix} N_{1} & N_{2} \end{matrix}],

where

P_{2} \in R^{m \times (m - l)}

,

Q_{2} \in R^{r \times (r - l)}

,

M_{2} \in R^{s \times (s - t)}

, and

N_{2} \in R^{n \times (n - t)}

. Denote

J = C_{1} - A_{1} A_{0}^{†} C_{0} - R_{A_{0}} C_{0} B_{0}^{†} B_{1}, K_{1} = P_{2}^{T} A_{1} Q_{2}, K_{2} = M_{2}^{T} B_{1} N_{2} .

Then, Equation (5) has a solution pair

X \in D^{r \times n}

and

Y \in D^{m \times s}

if and only if

R_{A_{0}} C_{0} L_{B_{0}} = 0 and R_{K_{1}} P_{2}^{T} J N_{2} L_{K_{2}} = 0,

in which case,

X = X_{0} + X_{1} ε and Y = Y_{0} + Y_{1} ε

with

\begin{matrix} X_{0} = & A_{0}^{†} C_{0} + Q_{2} K_{1}^{†} P_{2}^{T} J L_{B_{0}} - A_{0}^{†} P_{1} V_{11} M_{1}^{T} B_{0} + Q_{2} V_{23} N_{1}^{T} + Q_{2} (L_{K_{1}} W_{4} - K_{1}^{†} W_{3} K_{2}) N_{2}^{T}, \\ Y_{0} = & R_{A_{0}} C_{0} B_{0}^{†} + P_{2} R_{K_{1}} P_{2}^{T} J N_{2} K_{2}^{†} M_{2}^{T} + P_{1} (V_{11} M_{1}^{T} + V_{12} M_{2}^{T}) + P_{2} (W_{3} - R_{K_{1}} W_{3} K_{2} K_{2}^{†}) M_{2}^{T}, \\ X_{1} = & A_{0}^{†} J - A_{0}^{†} A_{1} Q_{2} (K_{1}^{†} P_{2}^{T} J N_{2} - K_{1}^{†} W_{3} K_{2} + L_{K_{1}} W_{4}) N_{2}^{T} - A_{0}^{†} P_{1} (V_{11} M_{1}^{T} + V_{12} M_{2}^{T}) B_{1} \\ - A_{0}^{†} R_{1} B_{0} + L_{A_{0}} R_{2} + A_{0}^{†} A_{1} (A_{0}^{†} P_{1} V_{11} M_{1}^{T} B_{0} - Q_{2} V_{23} N_{1}^{T}), \\ Y_{1} = & R_{A_{0}} J B_{0}^{†} + R_{A_{0}} A_{1} (A_{0}^{†} P_{1} V_{11} M_{1}^{T} B_{0} - Q_{2} V_{23} N_{1}^{T}) B_{0}^{†} - P_{2} R_{K_{1}} P_{2}^{T} J N_{2} K_{2}^{†} M_{2}^{T} B_{1} B_{0}^{†} \\ - P_{2} (W_{3} - R_{K_{1}} W_{3} K_{2} K_{2}^{†}) M_{2}^{T} B_{1} B_{0}^{†} + R_{1} - R_{A_{0}} R_{1} B_{0} B_{0}^{†}, \end{matrix}

where

V_{11}

,

V_{12}

,

V_{23}

,

W_{3}

,

W_{4}

,

R_{1}

, and

R_{2}

are arbitrary matrices over

D

with appropriate sizes.

At the same time, due to the excellent property that dual quaternions can represent both rotation and translation, the theory of dual quaternions is not only one of the most powerful tools for handling rigid-body motion but also finds applications in computer graphics, medical procedures, neural networks, proximity operations in spacecraft, modern robotics, and so on (see [156,157,158]).

Definition 13

([148]). A dual quaternion is defined as

\hat{q} = q_{0} + q_{1} ε,

where

q_{0}, q_{1} \in H

. The set of all dual quaternions is denoted by

DH

, which a noncommutative ring with zero divisors.

Recently, Xie et al. [159], inspired by the hand–eye calibration problem in robotics research, studied Equation (1) over

DH

.

Theorem 43

(Theorem 3.1, [159]). Let

A = A_{0} + A_{1} ε \in {DH}^{m \times r}, B = B_{0} + B_{1} ε \in {DH}^{s \times n}, C = C_{0} + C_{1} ε \in {DH}^{m \times n},

where

A_{i} \in R^{m \times r}

,

B_{i} \in R^{s \times n}

, and

C_{i} \in R^{m \times n}

(

i = 0, 1

). Set

\begin{matrix} A_{11} = A_{1} L_{A_{0}}, A_{2} = R_{A_{0}} A_{11}, A_{3} = R_{A_{0}}, C_{3} = - R_{A_{0}} C_{0}, B_{2} = R_{B_{0}} B_{1}, \\ A_{4} = R_{A_{2}} R_{A_{0}}, A_{5} = - A_{4} A_{1} A_{0}^{†}, C_{4} = A_{4} (A_{1} A_{0}^{†} C_{0} + C_{0} B_{0}^{†} B_{1} - C_{1}), \\ A_{6} = R_{A_{4}} A_{5}, C_{5} = R_{A_{4}} C_{4}, B_{3} = B_{2} L_{B_{0}}, B_{4} = C_{4} L_{B_{0}} . \end{matrix}

Then, the following are equivalent:

(1): Equation (1) has a solution pair $X \in {DH}^{r \times n}$ and $Y \in {DH}^{m \times s}$ ;
(2): $C_{3} L_{B_{0}} = 0$ and $B_{4} L_{B_{3}} = 0$ ;
(3): The following rank equations hold:

$\begin{matrix} rank [\begin{matrix} B_{0} & 0 \\ - C_{0} & A_{0} \end{matrix}] = rank (B_{0}) + rank (A_{0}), \\ rank [\begin{matrix} B_{0} & 0 & 0 & 0 \\ B_{1} & B_{0} & 0 & 0 \\ - C_{1} & - C_{0} & A_{0} & A_{1} \\ 0 & 0 & 0 & A_{0} \end{matrix}] = rank [\begin{matrix} B_{0} & 0 \\ B_{1} & B_{0} \end{matrix}] + rank [\begin{matrix} A_{0} & A_{1} \\ 0 & A_{0} \end{matrix}]; \end{matrix}$

in which case,

X = X_{0} + X_{1} ε and Y = Y_{0} + Y_{1} ε

with

\begin{matrix} X_{0} & = A_{0}^{†} (C_{0} + Y_{0} B_{0}) + L_{A_{0}} W, \end{matrix}

(61)

\begin{matrix} X_{1} & = A_{0}^{†} [C_{1} + Y_{0} B_{1} + Y_{1} B_{0} - A_{1} A_{0}^{†} (C_{0} + Y_{0} B_{0}) - A_{11} W] + L_{A_{0}} W_{1}, \\ Y_{0} & = A_{3}^{†} C_{3} B_{0}^{†} + L_{A_{3}} U_{1} + U_{2} R_{B_{0}}, \\ Y_{1} & = A_{4}^{†} (C_{4} - A_{5} U_{1} B_{0} - A_{4} U_{2} B_{2}) B_{0}^{†} + L_{A_{4}} W_{3} + W_{4} R_{B_{0}}, \\ W & = A_{2}^{†} A_{3} [C_{1} + Y_{0} B_{1} + Y_{1} B_{0} - A_{1} A_{0}^{†} (C_{0} + Y_{0} B_{0})] + L_{A_{2}} W_{2}, \\ U_{1} & = A_{6}^{†} C_{5} B_{0}^{†} + L_{A_{6}} W_{5} + W_{6} R_{B_{0}}, U_{2} = A_{4}^{†} B_{4} B_{3}^{†} + L_{A_{4}} W_{7} + W_{8} R_{B_{3}}, \end{matrix}

(62)

where

W_{1}, W_{2}, \dots, W_{8}

are arbitrary matrices over

DH

with appropriate sizes.

Remark 42.

Subsequently, Xie and Wang [160] further investigated a more general form of Equation (1) over

DH

, namely

A X + E X F = C Y + D,

(63)

where X and Y are unknown. Additionally, the systematic introduction to Equation (63) and its more general forms can be found in the book [161].

Remark 43.

After the Hamilton quaternions, different concepts of quaternions were proposed, greatly enriching the quaternion theory, such as biquaternion (also called complexified quaternions), split quaternions, commutative quaternions (also called Segre biquaternions or reduced biquaternions), generalized commutative quaternions, degenerate quaternions, degenerate pseudo-quaternions, doubly degenerate quaternions, and quaternion algebras over a field (see [43,162,163,164,165,166,167]).

Interestingly, in Remark 8.2.7 of [168], Pottmann and Wallner introduced the concept of the generalized quaternions, which, in specific cases, coincide with Hamilton quaternions, split quaternions, degenerate quaternions, pseudo-degenerate quaternions, and doubly degenerate quaternions.

The theory of dual numbers has also developed rapidly, giving rise to three-dimensional dual numbers (also known as hyper-dual numbers), n-dimensional dual numbers (also known as higher dimensional dual numbers), interval dual numbers, fuzzy dual numbers, neutrosophic dual numbers, and finite complex modulo integer neutrosophic dual numbers (see [169] and references therein).

One can see that dual quaternions are essentially a combination of dual numbers and quaternions. It is then natural to ask: given such a rich variety of quaternions and dual numbers, what sparks will fly when they interact, and what applications will emerge?

6.4. Linear Operator Equations on Hilbert Spaces

Generalizing matrix equations to operator equations on Hilbert spaces or Hilbert

C^{*}

-modules has been a mainstream research direction. For a

C^{*}

-algebra

A

, a Hilbert

C^{*}

-module

E

[170,171] is a right

A

-module equipped with an

A

-valued inner product

〈 \cdot, \cdot 〉 : E \times E \to A

such that its induced norm

∥ x ∥ = {∥ 〈 x, x 〉 ∥}^{\frac{1}{2}}

is complete.

The theory of generalized inverses also serves as an effective tool for studying operator equations. Notably, most research requires the condition that the ranges of related operators are closed to ensure the existence of their MP inverses (see [172,173]). Interestingly, Douglas [174] pioneered an alternative approach in Hilbert spaces without the strong condition of closed ranges, which is known as the Douglas theorem. This work has provided valuable inspiration for subsequent research on operator equations on Hilbert

C^{*}

-modules (see [171,175]).

Let

A

be a

C^{*}

-algebra, and let

E

and

F

be Hilbert

C^{*}

-modules. The set of all bounded

A

-linear maps

A : E \to F

is denoted by

L (E, F)

. Particularly,

L (E) = L (E, E)

. The adjoint of

A \in L (E, F)

is a map

A^{*} \in L (F, E)

such that

〈 A x, y 〉 = 〈 x, A^{*} y 〉 for all x \in E, y \in F .

The range and the null space of an operator A are denoted by

R (A)

and

N (A)

, respectively. A closed submodule

L

of

E

is called to be orthogonally complemented [170] if

E = L \oplus L^{⊥},

where

L^{⊥} = {x \in E : 〈 x, y 〉 = 0 for all y \in L}

. And,

\bar{L}

is the closure of

L

. Let

P_{A^{*}}

be the projection of

E

onto

\bar{R (A^{*})}

. And,

R_{A} = I - P_{A^{*}}

, where I is the identity operator on

E

.

Let

A, B, C \in L (E)

be such that A and B are adjointable. Mousavi et al. [175] investigated Equation (5) in Hilbert

C^{*}

-modules

E

, where only the range closures of adjointable operators need to be orthogonally complemented.

Theorem 44

(Theorem 3.3, [175]). Let

\bar{R (A)}

,

\bar{R (B)}

,

\bar{R (A^{*})}

, and

\bar{R (B^{*})}

are orthogonally complemented. If

R (C R_{B}) \subseteq R (A) and R (P_{B^{*}} C^{*}) \subseteq R (B^{*}),

then Equation (5) is consistent, in which case,

X = X_{h} + X_{p} and Y = Y_{h} + Y_{p},

where

X_{p}

and

Y_{p}

satisfying

A X_{p} = P_{A} C R_{B}

and

B^{*} Y_{p}^{*} = P_{B^{*}} C^{*}

,

X_{h} = R_{A} W_{1} + W_{2} P_{B^{*}} and Y_{h} = P_{A} W_{3} + W_{4} R_{B^{*}},

where

W_{1}, W_{2}, W_{3}, W_{4} \in L (E)

are arbitrary satisfying

A W_{2} P_{B^{*}} + P_{A} W_{3} B = 0 .

Remark 44.

In Example 2.1 of [175], Mousavi et al. gave the example to show that, in a Hilbert

C^{*}

-module, an operator’s range closure being orthogonally complemented is weaker than its range being closed.

Let

A \in L (E, F)

satisfy that

R (A)

is closed. In view of the orthogonal decompositions of closed submodules, i.e.,

E = R (A^{*}) \oplus N (A) and F = R (A) \oplus N (A^{*}),

Karizaki et al. in Corollary 1.2 of [176] showed that the operator A can be decomposed into the following matrix form

A = [\begin{matrix} A_{1} & 0 \\ 0 & 0 \end{matrix}] : [\begin{matrix} R (A^{*}) \\ N (A) \end{matrix}] \to [\begin{matrix} R (A) \\ N (A^{*}) \end{matrix}],

where

A_{1}

is invertible, and thus the MP inverse

A^{†} \in L (F, E)

of A is

A^{†} = [\begin{matrix} A_{1}^{- 1} & 0 \\ 0 & 0 \end{matrix}] : [\begin{matrix} R (A) \\ N (A^{*}) \end{matrix}] \to [\begin{matrix} R (A^{*}) \\ N (A) \end{matrix}] .

Interestingly, four years after [175], Moghani et al. [177] supplemented the conclusions on solving operator Equation (5) on Hilbert

C^{*}

-modules using the matrix forms of adjointable operators and generalized inverses.

Theorem 45

(Theorem 3.2, [177]). Let

A \in L (E, F)

,

B \in L (F, E)

, and

C \in L (F)

be such that

R (A)

and

R (B)

are closed,

R (A) = R (B^{*})

, and

R (A^{*}) = R (B)

. Then, the operator Equation (5) has a solution pair

X \in L (F, E)

and

Y \in L (E, F)

if and only if

(I - A A^{†}) C (I - B^{†} B) = 0,

in which case,

\begin{matrix} X & = \frac{1}{2} A^{†} C + \frac{1}{2} A^{†} C (I - B^{†} B) + \frac{1}{2} W B + (I - A^{†} A) Z, \\ Y & = \frac{1}{2} A A^{†} C B^{†} + (I - A A^{†}) C B^{†} - \frac{1}{2} A W B B^{†} + V (I - B B^{†}), \end{matrix}

where

Z \in L (F, E)

and

V \in L (E, F)

are arbitrary, and

W \in L (E)

satisfies

(I - A A^{†}) W B B^{†} = 0 .

Remark 45.

In Section 4 of [177], Moghani et al. further studied the following operator equation

A X E + F Y B = C,

where X and Y are unknown operators between Hilbert

C^{*}

-modules.

Let

H

be an infinite dimensional separable Hilbert space. Recently, An et al. [178] revisited the operator Equation (1) in

H

. For

A \in L (H)

, a

(1, 2)

-inverse

A^{(1, 2)}

of A is an operator in

L (H)

satisfying

A A^{(1, 2)} A = A

and

A^{(1, 2)} A A^{(1, 2)} = A^{(1, 2)}

.

Theorem 46

(Theorem 2.1, [178]). Let A, B, and

C \in L (H)

. Then, Equation (1) has a solution pair

X = A^{(1, 2)} C and Y = - (I - A A^{(1, 2)}) C B^{(1, 2)}

if and only if

(I - A A^{(1, 2)}) C (I - B^{(1, 2)} B) = 0 .

For

A, B \in L (H)

, we call that the operator pair

(A, B)

has the generalized Fuglede-Putnam property [178,179] if

A X = Y B for X, Y \in L (H) \Rightarrow A^{*} X = Y B^{*} .

An et al. [178] then presented an interesting connection between the generalized orthogonality and the solvability of the operator Equation (1) in

H

.

Theorem 47

(Theorem 2.9, [178]). Let A, B, and

C \in L (H)

.

(1): If the operator Equation (1) is consistent, then invertible operators $U, V \in H \oplus H$ exist such that

$U [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] = [\begin{matrix} A & C \\ 0 & B \end{matrix}] V .$
(2): Suppose that $(B, A)$ and $(B, B)$ satisfy the generalized Fuglede–Putnam property.
If there exist invertible operators $T, S \in H \oplus H$ such that

$T [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] = [\begin{matrix} A & C \\ 0 & B \end{matrix}] S,$

and the $(2, 2)$ -entry of $S T^{*}$ is invertible, then the operator Equation (1) is consistent.

Example 1.

Note that Olshevsky in Section 2 of [22] designed an example to show that RET does not hold in infinite dimensional spaces. Indeed, let

H

be an infinite dimensional separable Hilbert space with the orthonormal basis

{e_{i}}_{i = 1}^{\infty}

. Define the operators

A, C \in L (H)

as

\begin{matrix} A e_{3 k + 1} = 0, A e_{3 k + 2} = 0, A e_{3 k + 3} = e_{3 k + 2} (k = 0, 1, 2, \dots), \\ C e_{1} = e_{1}, and C e_{i} = 0 (i \neq 1) . \end{matrix}

Put

B = A

. Let

E = [\begin{matrix} A & 0 \\ 0 & B \end{matrix}] and F = [\begin{matrix} A & C \\ 0 & B \end{matrix}] .

Then, one can observe that the operator E has the only the eigenvalue

λ_{0} = 0

. Corresponding to this eigenvalue, there are a countable number of Jordan chains of lengths 1 and 2. Additionally, the vectors of these chains form the orthonormal basis of

H \oplus H

. So, F has the same properties. Thus, E and F are equivalent. On the other hand, the operator Equation (1) is not solvable. In fact, assume that X and Y satisfy Equation (1). Then,

A X e_{1} = e_{1}

, which is a contradiction with

e_{1} \notin R (A)

.

Inspired by Bhatia’s characterizations of the unique solution of Sylvester equation (i.e., Theorem VII.2.3, [180]), An et al. [178] further proposed an integral expression for the solution of the operator Equation (1) under the specific conditions. For

A \in L (H)

,

σ (A)

denotes the spectrum of A.

Theorem 48

(Theorem 2.17, [178]). Let A, B, and

C \in L (H)

.

(1): If the spectra of A and B are contained in the open right half-plane and the open left half-plane, respectively, then the operator Equation (1) has the solution pair

$X = \int_{0}^{\infty} e^{- 2 t A} C d t and Y = - \int_{0}^{\infty} C e^{- 2 t B} d t .$
(2): Suppose that A and B are Hermitian operators such that

$σ (A) \cap σ (B) = \emptyset and α + \frac{1}{2} = β,$

where α and β are eigenvalues of A and B, respectively. Assume that, for an absolutely integrable function f defined on $R$ , its Fourier transform $\hat{f} (s)$ satisfies

$\hat{f} (s) = \frac{1}{s},$

where $s \in σ (A) - σ (B)$ . Then, the operator Equation (1) has the solution pair

$X = \int_{- \infty}^{\infty} e^{- i t A} C f (t) d t and Y = \int_{- \infty}^{\infty} C e^{i t B} f (t) d t .$

6.5. Tensor Equations

From the end of the 19th century to the present, the understanding of tensors in multilinear algebra and physics has essentially gone through three ways: as multi-indexed objects satisfying certain transformation rules, as multilinear maps, and as elements in the tensor product of vector spaces (see [181] and references therein). On the other hand, the direct interpretation of “tensors” as multidimensional arrays (or hypermatrices) has also been widely accepted by many scholars (see [182,183,184] and references therein). Following this habit, the tensors in this paper are referred to as multidimensional arrays.

Definition 14.

Let

F

be a ring. An N-order

I_{1} \times \dots \times I_{N}

-dimension tensor

A

over

F

is defined as a multidimensional array with

I_{1} I_{2} \dots I_{N}

entries, i.e.,

A = {[\begin{matrix} a_{i_{1} \dots i_{N}} \end{matrix}]}_{1 \leq i_{j} \leq I_{j}} (j = 1, \dots, N),

where

a_{i_{1} \dots i_{N}} \in F

for

1 \leq i_{j} \leq I_{j}

and

j = 1, \dots, N

. Moreover, denote

{(A)}_{i_{1} \dots i_{N}} = a_{i_{1} \dots i_{N}} .

The set of all N-order

I_{1} \times \dots \times I_{N}

-dimensional tensors over

F

is denoted by

F^{I_{1} \times \dots \times I_{N}}

.

Tensor theory has been effectively applied in diverse fields: image processing [185], handwritten digit classification [186], hypergraphs [182], extreme learning machines [187], signal processing and machine learning [188], quantum physics and mechanics [189], etc. In addition, the review article [190] by Kolda and Bader introduced the theory of tensor decomposition and its applications in psychometrics, chemometrics, numerical linear algebra, computer vision, numerical analysis, neuroscience, and so on.

In the 2017 preprint [191], He et al. first discussed SVD and the MP inverse of quaternion tensors under the Einstein product, establishing a fundamental framework for solving quaternion tensor equations under the Einstein product.

Definition 15

([192]). Let

A = [\begin{matrix} a_{i_{1} \dots i_{N} j_{1} \dots j_{N}} \end{matrix}] \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}} and B = [\begin{matrix} b_{j_{1} \dots j_{N} k_{1} \dots k_{M}} \end{matrix}] \in H^{J_{1} \times \dots \times J_{N} \times K_{1} \times \dots \times K_{M}} .

The Einstein product of

A

and

B

is defined as

A *_{N} B = [\begin{matrix} c_{i_{1} \dots i_{N} k_{1} \dots k_{M}} \end{matrix}] \in H^{I_{1} \times \dots \times I_{N} \times K_{1} \times \dots \times K_{M}},

where

c_{i_{1} \dots i_{N} k_{1} \dots k_{M}} = \sum_{j_{1} \dots j_{N}} a_{i_{1} \dots i_{N} j_{1} \dots j_{N}} b_{j_{1} \dots j_{N} k_{1} \dots k_{M}},

for

1 \leq i_{j} \leq I_{j}

,

j = 1, \dots, N

,

1 \leq k_{t} \leq K_{t}

, and

t = 1, \dots, M

.

The conjugate transpose

A^{*}

of

A = [\begin{matrix} a_{i_{1} \dots i_{N} j_{1} \dots j_{M}} \end{matrix}] \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{M}}

is

A^{*} = [\begin{matrix} b_{j_{1} \dots j_{M} i_{1} \dots i_{N}} \end{matrix}] \in H^{J_{1} \times \dots \times J_{M} \times I_{1} \times \dots \times I_{N}} with b_{j_{1} \dots j_{M} i_{1} \dots i_{N}} = {\bar{a}}_{i_{1} \dots i_{N} j_{1} \dots j_{M}} .

The MP inverse [193] of

A \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}}

is

A^{†} \in H^{J_{1} \times \dots \times J_{N} \times I_{1} \times \dots \times I_{N}}

satisfying

A *_{N} A^{†} *_{N} A = A, A^{†} *_{N} A *_{N} A^{†} = A^{†}, {(A *_{N} A^{†})}^{*} = A *_{N} A^{†}, {(A^{†} *_{N} A)}^{*} = A^{†} *_{N} A .

Moreover, let the unit (or identity) tensor be

I_{I_{N}} = [\begin{matrix} e_{i_{1} \dots i_{N} j_{1} \dots j_{N}} \end{matrix}] \in H^{I_{1} \times \dots \times I_{N} \times I_{1} \times \dots \times I_{N}},

where all diagonal entries

e_{i_{1} \dots i_{N} i_{1} \dots i_{N}}

are 1 and all off-diagonal entries are 0. Definite

L_{A} = I - A^{†} *_{N} A and R_{A} = I - A *_{N} A^{†},

where

I

denotes the unit tensor with appropriate dimensions.

Theorem 49

(Corollary 5.3, [191]). Let

A \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}}, B \in H^{H_{1} \times \dots \times H_{M} \times L_{1} \times \dots \times L_{M}}, C \in H^{I_{1} \times \dots \times I_{N} \times L_{1} \times \dots \times L_{M}} .

Then, Equation (5) over quaternion tensors under the Einstein product, i.e.,

A *_{N} X + Y *_{M} B = C

(64)

has a solution pair

X \in H^{J_{1} \times \dots \times J_{N} \times L_{1} \times \dots \times L_{M}} and Y \in H^{I_{1} \times \dots \times I_{N} \times H_{1} \times \dots \times H_{M}}

if and only if

R_{A} *_{N} C *_{M} L_{B} = 0,

in which case,

\begin{matrix} X & = A^{†} *_{N} C - U_{1} *_{M} B + L_{A} *_{N} U_{2}, \\ Y & = R_{A} *_{N} C *_{M} B^{†} + A *_{N} U_{1} + U_{3} *_{M} R_{B}, \end{matrix}

where

U_{1}

,

U_{2}

, and

U_{3}

are arbitrary tensors over

H

with appropriate dimensions.

Remark 46.

(1): Theorem 49 is a direct corollary of Theorem 5.1 of [191], which establishes the solvability conditions and the general solution for the following quaternion tensor equation:

$A *_{N} X *_{M} D + E *_{N} Y *_{M} B = C,$

where $X$ and $Y$ are unknown and other tensors are given over $H$ .
(2): Inspired by the transformation between tensors and matrices over $R$ (see (Definition 2.8, [194])), He et al. [191,195] defined an analogous transformation over $H$ , i.e., the transformation f is a map defined as

$\begin{matrix} f : & H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}} \to H^{(I_{1} I_{2} \dots \cdot I_{N}) \times (J_{1} J_{2} \dots \cdot J_{N})} \\ A \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}} \mapsto A = f (A) \in H^{(I_{1} I_{2} \dots I_{N}) \times (J_{1} J_{2} \dots J_{N})}, \end{matrix}$

where the components of A are given by

${(A)}_{i_{1} \dots i_{N} j_{1} \dots j_{N}} \overset{f}{\mapsto} {(A)}_{[i_{1} + \sum_{k = 2}^{N} (i_{k} - 1) \prod_{s = 1}^{k - 1} I_{s}] [j_{1} + \sum_{k = 2}^{N} (j_{k} - 1) \prod_{s = 1}^{k - 1} J_{s}]} .$

Lemma 2.2 of [191] shows that the transformation f is a bijection satisfying

$f (A + B) = f (A) + f (B) and f (A *_{N} C) = f (A) f (C)$

for $A, B \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}}$ and $C \in H^{J_{1} \times \dots \times J_{N} \times L_{1} \times \dots \times L_{N}}$ . The transformation f ingeniously bridges quaternion tensors under the Einstein product and quaternion matrices under the ordinary product. By virtue of its isomorphism property, f serves as a powerful tool for studying problems related to quaternion tensors under the Einstein product.
(3): The work [191] on quaternion tensor equations has profoundly influenced subsequent research on tensor equations over $H$ (see [196,197,198,199,200,201,202]).

Subsequently, Wang et al. [125] discussed the minimum-norm least-squares solution of the quaternion tensor equation of the form (64), i.e.,

A *_{N} X + Y *_{N} B = D

with the unknown

X

and

Y

. We now introduce some notations.

Let

A = [a_{i_{1} \dots i_{N} j_{1} \dots j_{M}}] \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{M}}

. The transpose

A^{T}

of

A

is

A^{T} = [b_{j_{1} \dots j_{M} i_{1} \dots i_{N}}] \in H^{J_{1} \times \dots \times J_{M} \times I_{1} \times \dots \times I_{N}},

where

b_{j_{1} \dots j_{M} i_{1} \dots i_{N}} = a_{i_{1} \dots i_{N} j_{1} \dots j_{M}}

, and the conjugate

\bar{A}

of

A

is

\bar{A} = [{\bar{a}}_{i_{1} \dots i_{N} j_{1} \dots j_{M}}] \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{M}} .

The symbol

A_{(i_{1} \dots i_{N} | :)} = [a_{i_{1} \dots i_{N} : \dots :}] \in H^{J_{1} \times \dots \times J_{M}}

stands for a subblock of

A

, and

Vec (A)

is a new tensor obtained by lining up all subtensor in a column, where the t-th subblock of

Vec (A)

is

A_{(i_{1} \dots i_{N} | :)}

for

t = i_{N} + \sum_{K = 1}^{N - 1} [(i_{K} - 1) \prod_{L = K + 1}^{N} I_{L}] .

Since

A = A_{1} + A_{2} j

for

A_{1}, A_{2} \in C^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{M}}

, the complex representation tensor of

A

is defined as

f (A) = [\begin{matrix} A_{1} & A_{2} \\ - {\bar{A}}_{2} & {\bar{A}}_{1} \end{matrix}] \in C^{2 I_{1} \times \dots \times 2 I_{N} \times 2 J_{1} \times \dots \times 2 J_{M}} .

Let

Θ_{A} = [A_{1}, A_{2}]

,

\vec{A} = (Re A_{1}, Im A_{1}, Re A_{2}, Im A_{2})

,

Vec (\vec{A}) = [\begin{matrix} Vec (Re A_{1}) \\ Vec (Im A_{1}) \\ Vec (Re A_{2}) \\ Vec (Im A_{2}) \end{matrix}], and K_{J_{M}} = [\begin{matrix} I & i I_{J_{M}} & 0 & 0 \\ 0 & 0 & I_{J_{M}} & i I_{J_{M}} \\ I_{J_{M}} & - i I_{J_{M}} & 0 & 0 \\ 0 & 0 & I_{J_{M}} & - i I_{J_{M}} \end{matrix}],

where

i

is imaginary unit such that

i^{2} = - 1

.

Let

A \in R^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}}

. The Frobenius norm

{∥ \cdot ∥}_{F}

of

A

is

{∥ A ∥}_{F} = {(\sum_{i_{1} \dots i_{N} j_{1} \dots j_{N}} {|a_{i_{1} \dots i_{N} j_{1} \dots j_{N}}|}^{2})}^{1 / 2} .

For

B = B_{1} + B_{2} j \in H^{J_{1} \times \dots \times J_{N} \times K_{1} \times \dots \times K_{M}}

, define

A • f (B) = [\begin{matrix} A \otimes B_{1} & A \otimes B_{2} \\ A \otimes (- {\bar{B}}_{2}) & A \otimes {\bar{B}}_{1} \end{matrix}],

where ⊗ denotes the Kronecker product of two tensors. The the inverse of

A \in C^{I_{1} \times \dots \times I_{N} \times I_{1} \times \dots \times I_{N}}

is the tensor

A^{- 1} \in C^{I_{1} \times \dots \times I_{N} \times I_{1} \times \dots \times I_{N}}

satisfying

A *_{N} A^{- 1} = A^{- 1} *_{N} A = I .

Theorem 50

(Corollary 3.4, [125]). Let

A, B, D \in H^{J_{1} \times \dots \times J_{N} \times J_{1} \times \dots \times J_{N}}

, and let

\begin{matrix} H_{L 1} = \{[X, Y]\} \end{matrix}

be the set of all

X, Y \in H^{J_{1} \times \dots \times J_{N} \times J_{1} \times \dots \times J_{N}}

such that

∥ A *_{N} X + Y *_{N} {B - D ∥}_{F}^{2} = min_{X_{1}, Y_{1} \in H^{I_{1} \times \dots \times I_{N} \times J_{1} \times \dots \times J_{N}}} {∥ A *_{N} X_{1} + Y_{1} *_{N} B - D ∥}_{F}^{2} .

Denote

A = A_{1} + A_{2} j

. Put

\begin{matrix} P_{01} & = [\begin{matrix} A_{1} • f {(I_{J_{N}})}^{T} & A_{2} • f {(I_{J_{N}} j)}^{*} \end{matrix}] *_{N} K_{J_{N}}, Q_{01} = [\begin{matrix} I_{J_{N}} • f {(B)}^{T} & 0 \end{matrix}] *_{N} K_{J_{N}}, \\ T_{11} & = [\begin{matrix} Re P_{01} & Re Q_{01} \end{matrix}], T_{12} = [\begin{matrix} Im P_{01} & Im Q_{01} \end{matrix}], E_{1} = [\begin{matrix} Vec (Re Θ_{D}) \\ Vec (Im Θ_{D}) \end{matrix}], \\ R_{1} & = (I - T_{11}^{†} *_{N} T_{11}) *_{N} T_{12}^{T}, \\ H_{1} & = R_{1}^{†} + (I - R_{1}^{†} *_{N} R_{1}) *_{N} Z_{1} *_{N} T_{12} *_{N} T_{11}^{†} *_{N} {(T_{11}^{†})}^{T} *_{N} (I - T_{12}^{T} *_{N} R_{1}^{†}), \\ Z_{1} & = {(I + (I - R_{1}^{†} *_{N} R_{1}) *_{N} T_{12} *_{N} T_{11}^{†} *_{N} {(T_{11}^{†})}^{T} *_{N} T_{12}^{T} *_{N} (I - R_{1}^{†} *_{N} R_{1}))}^{- 1} . \end{matrix}

(1): Then,

$\begin{matrix} H_{L 1} = {[X, Y] | [\begin{matrix} Vec (\vec{X}) \\ Vec (\vec{Y}) \end{matrix}] = & [\begin{matrix} T_{11}^{†} - H_{1}^{T} *_{N} T_{12} *_{N} T_{11}^{†} & H_{1}^{T} \end{matrix}] *_{N} E_{1} \\ + (I - T_{11}^{†} *_{N} T_{11} - R_{1} *_{N} R_{1}^{†}) *_{N} W_{1}}, \end{matrix}$

where $W_{1}$ is arbitrary with appropriate dimensions.
(2): If $[X_{l 1}, Y_{l 1}] \in H_{L 1}$ satisfies

${∥[X_{l 1}, Y_{l 1}]∥}_{F}^{2} = min_{[X, Y] \in H_{L 1}} ({∥ X ∥}_{F}^{2} + {∥ Y ∥}_{F}^{2}),$

then $[X_{l 1}, Y_{l 1}] \in H_{L 1}$ is unique and

$[\begin{matrix} Vec (\vec{X_{l 1}}) \\ Vec (\vec{Y_{l 1}}) \end{matrix}] = [\begin{matrix} T_{11}^{†} - H_{1}^{T} *_{N} T_{12} *_{N} T_{11}^{†} & H_{1}^{T} \end{matrix}] *_{N} E_{1} .$

Remark 47.

Recently, some scholars have extended quaternion tensor equations under the Einstein product to different categories. For instance, Jia and Wang [203] investigated split quaternion tensor equations, while Yang et al. [204] explored dual split quaternion tensor equations. On the other hand, tensor theory encompasses a variety of product operations, including the Einstein product [192], k-mode Product [182], contracted product [205], T-product [206], Qt-product [207], general product [208], cosine transform product (c-product) [209], and M-product [210,211]. Combined with Remark 43, the investigation of tensor equations over diverse quaternion algebras under various tensor products reveals significant untapped research potential.

6.6. Polynomial Matrix Equations

As described in Section 3, Roth was the first to study the solvability conditions of polynomial matrix Equation (2) via the equivalence of two block polynomial matrices. Since then, the theoretical research and practical applications regarding this polynomial matrix equation have gradually become more extensive and enriched. For instance, it is successfully applied to multivariable linear discrete systems in stochastic control [212], the algebraic regulator problem [213], etc. Next, we mainly discuss different approaches to studying this polynomial matrix.

6.6.1. By the Divisibility of Polynomials

Let F be a field. Cheng and Pearson [214], in their research on the regulator problem with internal stability, provided an equivalent characterization of the solvability of the polynomial matrix equation

B X + Y D = P

(65)

with given polynomial matrix

B \in F^{p \times n} [λ]

,

D \in F^{p \times p} [λ]

, and

P \in F^{p \times p} [λ]

, by the divisibility of a series of polynomials. Assume that

rank (B) = r

and

rank (D) = p

. Using Lemma 2 of [214] (i.e., Smith normal form theorem for polynomial matrices), there exist unimodular matrices

M_{b}, N_{b}, M_{d}

, and

N_{d}

such that

M_{b} B N_{b} = B_{d} = [\begin{matrix} B_{0} & 0 \\ 0 & 0 \end{matrix}] and M_{d} D N_{d} = diag (d_{1}, d_{2}, \dots, d_{p}),

(66)

where

B_{0} = diag (b_{1}, b_{2}, \dots, b_{r})

,

b_{1} b_{2} \dots b_{p} \neq 0

, and

d_{1} d_{2} \dots d_{p} \neq 0

. Left-multiplying (65) by

M_{b}

and right-multiplying (65) by

N_{d}

yield that

B_{d} X^{'} + Y^{'} D_{d} = P^{'},

(67)

where

X^{'} = N_{b}^{- 1} X N_{d}

,

Y^{'} = M_{b} Y M_{d}^{- 1}

, and

P^{'} = M_{b} P N_{d}

. For

a, b \in F [λ]

,

a | b

denotes a divides b.

Theorem 51

(Lemma 6, [214]). Let B, D, and P be given in (65) with

rank (B) = r

and

rank (D) = p

. Let

b_{1}, b_{2}, \dots, b_{r}

and

d_{1}, d_{2}, \dots, d_{p}

be given in (66), and

P^{'} = [p_{i j}^{'}]

be given in (67). Then, Equation (65) has a solution pair

(X, Y)

if and only if

(1): $g_{i j} | p_{i j}^{'}$ for $i = 1, \dots, r$ and $j = 1, \dots, p$ ;
(2): $d_{j} | p_{i j}^{'}$ if $p > r$ , for $i = r + 1, \dots, p$ and $j = 1, \dots, p$ .

where

g_{i j}

is the monic greatest common divisor of

b_{i}

and

d_{j}

.

Remark 48.

Notably, in Theorem 2 of [214], Cheng and Pearson equivalently transform the solvability of a restricted regulator problem with internal stability into the solvability of Equation (65) in a special form.

6.6.2. By Skew-Prime Polynomial Matrices

Let F be a field. In 1978, Wolovich [215] proposed a new approach to studying the polynomial matrix equation

A (λ) X (λ) + Y (λ) B (λ) = C (λ)

(68)

with given

A (λ) \in F^{p \times m} [λ]

,

B (λ) \in F^{q \times t} [λ]

, and

C (λ) \in F^{p \times t} [λ]

, based on skew-prime polynomial matrices.

Definition 16

([215,216]). Let

A (λ) \in F^{p \times m} [λ]

and

B (λ) \in F^{q \times p} [λ]

with

q + m > p

. Assume that there exist

M (λ) \in F^{m \times p} [λ]

and

N (λ) \in F^{p \times q} [λ]

such that

A (λ) M (λ) + N (λ) B (λ) = I_{p} .

Then,

A (λ)

and

B (λ)

are called externally skew prime (or

B (λ)

and

A (λ)

are called internally skew prime). Moreover,

A (λ)

and

N (λ)

are called relatively left prime while

M (λ)

and

B (λ)

are called relatively right prime.

Suppose that

A (λ)

is nonsingular with

p = m

. Then,

A^{- 1} (s) C (λ)

can be factored in dual prime form:

A^{- 1} (s) C (λ) = \bar{C} (s) {\bar{A}}^{- 1} (s),

(69)

where

\bar{C} (s) \in F^{p \times t} [λ]

and

\bar{A} (s) \in F^{t \times t} [λ]

are relatively right prime.

Theorem 52

(Theorem 3 and Corollary 3, [215]). Let

A (λ) \in F^{p \times p} [λ]

be nonsingular and let

\bar{A} (s)

be given in (69).

(1): If $\bar{A} (s)$ and $B (λ)$ are externally skew, then Equation (68) is consistent.
(2): Suppose that $A (λ)$ and $C (λ)$ are relatively left prime. Then, Equation (68) is consistent if and only if $\bar{A} (s)$ and $B (λ)$ are externally skew prime.

Remark 49.

Note that Wolovich proved the sufficiency of the item (2) in Theorem 52 via a constructive method, implying a new procedure to find a solution pair of Equation (68). Moreover, when

C (λ) = I

, all solutions of Equation (68) are characterized in (Section 5, [215]), and this characterization is further used to obtain the unique solution of Equation (68).

6.6.3. By the Realization of Matrix Fraction Descriptions

Let

F

be a field. For given

Q \in F^{p \times q} [λ]

,

R \in F^{m \times t} [λ]

, and

Φ \in F^{p \times t} [λ]

, consider the polynomial matrix equation

X R + Q Y = Φ,

(70)

where

X \in F^{p \times m} [λ]

and

Y \in F^{q \times t} [λ]

are unknown. According to the realization of matrix fraction descriptions presented in [217], Emre and Silverman [218] transformed Equation (70) into a set of linear matrix equations when Q is nonsingular.

Let

F^{q} [λ]

and

F^{q} (λ)

be the sets of all q-tuples of polynomials in

λ

with coefficients in

F

and all q-tuples of rational functions in

λ

over

F

, respectively. Assume that Q is nonsingular with

p = q

. Let

F_{Q} = {x \in F^{p} [λ] : Q^{- 1} x is strictly proper} .

For

X_{1} \in F^{p \times m} [λ]

satisfying that

Q^{- 1} X_{1}

is strictly proper, define the following

F

-linear maps:

\begin{matrix} G : F^{m} \to F_{Q}, u \mapsto X_{1} u for u \in F^{m}, \\ π : F^{p} (λ) \to F^{p} (λ), q \mapsto strictly proper part of q, \\ π_{Q} : F^{p} [λ] \to F^{p} [λ], x \mapsto Q π (Q^{- 1} x), \\ F : F_{Q} \to F_{Q}, x \mapsto π_{Q} (z x), \\ H : F_{Q} \to F^{p}, x \mapsto {(Q^{- 1} x)}_{- 1}, \end{matrix}

where

{(Q^{- 1} x)}_{- 1}

is the coefficient of

λ^{- 1}

in the formal power series of

Q^{- 1} x

in

λ^{- 1}

. For

Z = Q^{- 1} X_{1}

, we call

Σ = (F, G, H)

the Q-realization of Z [217].

Let

S \in F^{p \times n} [λ]

be such that its columns are a basis of

F_{Q}

. Let

(F, G_{1}, H)

be the Q-realization of

Q^{- 1} S

. Let

\hat{F}

,

{\hat{G}}_{1}

, and

\hat{H}

denote the matrix representations of F,

G_{1}

, and H, respectively, with respect to the canonical bases of

F^{m}

and

F^{p}

, and the columns of S serving as a basis of

F_{Q}

. Put

R = \sum_{j = 0}^{r} u_{- j} λ^{j},

where

u_{- j} \in F^{m \times p}

. Define

\hat{Φ} \in F^{n \times p}

uniquely by

π_{Q} (Φ) = S \hat{Φ}

, and for the unique polynomial matrix

Φ_{1}

, express

Φ

as

Φ = Q Φ_{1} + S \hat{Φ} .

(71)

Moreover, let the linear equations be as follows:

\sum_{j = 0}^{r} {\hat{F}}^{j} \hat{G} u_{- j} = \hat{Φ},

(72)

where

\hat{G} \in F^{n \times m}

are unknown.

Theorem 53

(Theorem 2.5, [218]). Let R, Q, and Φ be given in (70). Suppose that Q is nonsingular with

p = q

. Denote

\bar{E} (Q, R) = {(X_{1}, Y_{1}) ∣ X_{1} R + Q Y_{1} = Φ and Q^{- 1} X_{1} is strictly proper} .

The following are equivalent:

(1): $(X_{1}, Y_{1}) \in \bar{E} (Q, R)$ ;
(2): Equation (72) has a solution $\hat{G}$ such that

$X_{1} = S \hat{G} and Y_{1} = Φ_{1} - Q_{p},$

where $Φ_{1}$ satisfies (71), and $Q_{p}$ is the polynomial part of $Q^{- 1} X_{1} R$ .

Remark 50.

(1): Under the hypotheses of Theorem 53, let

$E (Q, R) = {(X, Y) | X R + Q Y = Φ} .$

In terms of Lemma 2.2 of [218], Emre and Silverman have shown that

$E (Q, R) = {(X_{1}, Y_{1}) + (\bar{X}, \bar{Y}) ∣ (X_{1}, Y_{1}) \in \bar{E} (Q, R) and (\bar{X}, \bar{Y}) \in \bar{H} (Q, R)},$

where $\bar{H} (Q, R) = {(Q Q_{1}, - Q_{1} R) ∣ Q_{1} is an arbitrary polynomial matrix}$ . This implies that, to characterize $E (Q, R)$ , it is sufficient to characterize $\bar{E} (Q, R)$ .
(2): In Section 3, [218], Equation (70) is further generalized to the case where Q is a general polynomial matrix. In fact, for $Q \in F^{p \times q} [λ]$ , there exist unimodular polynomial matrices $M_{1}$ and $M_{2}$ such that

$M_{1} Q M_{2} = [\begin{matrix} \hat{Q} & 0 \\ 0 & 0 \end{matrix}],$

where $\hat{Q}$ is the nonsingular polynomial matrix. Let

$\hat{X} = M_{1} X = [\begin{matrix} {\hat{X}}_{1} \\ {\hat{X}}_{2} \end{matrix}], \hat{Y} = M_{2}^{- 1} Y = [\begin{matrix} {\hat{Y}}_{1} \\ {\hat{Y}}_{2} \end{matrix}], M_{1} Φ = [\begin{matrix} {\tilde{Φ}}_{1} \\ {\tilde{Φ}}_{2} \end{matrix}] .$

Then,

$\begin{matrix} Equation (70) & \Leftrightarrow \hat{X} R + [\begin{matrix} \hat{Q} & 0 \\ 0 & 0 \end{matrix}] [\begin{matrix} {\hat{Y}}_{1} \\ {\hat{Y}}_{2} \end{matrix}] = [\begin{matrix} {\tilde{Φ}}_{1} \\ {\tilde{Φ}}_{2} \end{matrix}] \\ \Leftrightarrow \{\begin{matrix} {\hat{X}}_{2} R = {\tilde{Φ}}_{2}, \\ {\hat{X}}_{1} R + \hat{Q} {\hat{Y}}_{1} = {\tilde{Φ}}_{1} . \end{matrix} \end{matrix}$

(73)

So, Theorem 53 can be applied to the second equation of (73); see [219] for the first Equation of (73).

6.6.4. By the Unilateral Polynomial Matrix Equation

Let

F

be a field. For given

A, B, C \in F^{n \times n} [λ]

, consider the following polynomial matrix Equation (also called the bilateral polynomial matrix equation):

A X + Y B = C,

(74)

where

X, Y \in F^{n \times n} [λ]

are unknown. Żak [220] proposed an algorithm for finding the unique solution pair

(X, Y)

of Equation (74). In fact, let

A = \sum_{i = 0}^{N} λ^{i} A_{i} and B = \sum_{j = 0}^{M} λ^{j} B_{j} .

For

C = {[c_{i j}]}_{n \times n} \in F^{n \times n} [λ]

, denote

{vec}_{r} (C) = {[\begin{matrix} c_{11} & \dots & c_{1 n} & c_{21} & \dots & c_{2 n} & \dots & c_{n 1} & \dots & c_{n n} \end{matrix}]}^{T} .

Using the Kronecker product, Equation (74) can be transformed into the following unilateral polynomial matrix equation:

A X + B Y = C,

(75)

where

X = {vec}_{r} (X)

,

Y = {vec}_{r} (Y)

,

C = {vec}_{r} (C)

,

\begin{matrix} A = \sum_{i = 0}^{N} λ^{i} (A_{i} \otimes I_{n}), and B = \sum_{j = 0}^{M} λ^{j} (I_{n} \otimes B_{j}^{T}) . \end{matrix}

Let

X = \sum_{i = 0}^{M_{1} - 1} λ^{i} X_{i} and Y = \sum_{i = 0}^{N_{1} - 1} λ^{i} Y_{i} .

Denote

A_{i} = A_{i} \otimes I_{n} and B_{j} = I_{n} \otimes B_{j}^{T},

where

i = 0, 1, \dots, N

and

j = 0, 1, \dots, M

. Then, by comparing of like powers, (75) can be rewritten as

\begin{matrix} [\begin{matrix} B_{0} & A_{0} \\ B_{1} & B_{0} & A_{1} & A_{0} \\ ⋮ & B_{1} & B_{0} & ⋮ & A_{1} & A_{0} \\ B_{M} & ⋮ & B_{1} & A_{N} & ⋮ & A_{1} \\ B_{M} & ⋮ & A_{N} & ⋮ \\ B_{M} & A_{N} \end{matrix}] [\begin{matrix} Y_{0} \\ ⋮ \\ Y_{N_{1} - 1} \\ X_{0} \\ ⋮ \\ X_{M_{1} - 1} \end{matrix}] = [\begin{matrix} C_{0} \\ C_{1} \\ ⋮ \\ ⋮ \\ ⋮ \end{matrix}] . \\ N_{1} + M_{1} blocks \end{matrix}

(76)

Let us assume without loss of generality that

B_{0}

is nonsingular, which implies that

B_{0}

is also nonsingular. As shown by Feinstein and Bar-Ness in [221], performing a series of elementary row operations on Equation (76) yields the following form:

[\begin{matrix} \begin{matrix} I_{n} \\ ⋱ \\ I_{n} \end{matrix} & \begin{matrix} L_{0} \\ L_{1} \\ ⋮ \\ L_{0} & \dots & L_{0} \end{matrix} \\ 0 & L \end{matrix}] [\begin{matrix} y \\ x \end{matrix}] = [\begin{matrix} p_{1} \\ p_{2} \end{matrix}] .

(77)

Therefore, a necessary and sufficient condition for the solvability of Equation (74) is

rank [\begin{matrix} L & p_{2} \end{matrix}] = rank (L),

in which case, we can obtain x by solving

L x = P_{2}

, and then compute y recursively. Furthermore, the upper bound on the degree of

Y

in (75) are also given as follows:

Theorem 54

([220]). For

A

and

B

given in (75), assume that

(1): $A$ and $B$ are relatively left prime;
(2): $B$ is nonsingular and satisfies that $B^{- 1}$ is strictly proper;
(3): $A_{1} (λ) B_{1}^{- 1} (λ)$ is the right coprime factorization of $B^{- 1} A$ , where $B_{1} (λ)$ is row reduced.

If Equation (75) is consistent, then it has a solution pair

(X (λ), Y (λ))

such that

{deg}_{r i} X (λ) < {deg}_{r i} B_{1} (λ) and deg Y (λ) < deg A_{1} (λ),

where

{deg}_{r i}

denotes the degree of the i-th row.

Remark 51.

Żak [220] also noted that the number of equations in (77) can be further reduced by using additional information about A and B given in (74). For instance, if a row of B has a degree of zero, then the corresponding row of X is identically zero, so discard the relevant column and row of L given in (77).

6.6.5. By the Equivalence of Block Polynomial Matrices

Let

F

be a field. Building upon Theorem 1, Wimmer [222] investigated the constant solutions of the following polynomial matrix equation:

A (λ) X - Y B (λ) = C (λ),

(78)

where

A (λ) \in F^{m \times n} [λ]

,

B (λ) \in F^{p \times k} [λ]

, and

C (λ) \in F^{m \times k} [λ]

are given.

Theorem 55

(Lemma 2.1, [222]). Equation (78) has a constant solution pair

X \in F^{n \times k} and Y \in F^{m \times p}

if and only if there exist two nonsingular constant matrices

R \in F^{(n + k) \times (n + k)} and S \in F^{(m + p) \times (m + p)}

such that

[\begin{matrix} A (λ) & 0 \\ 0 & B (λ) \end{matrix}] R = S [\begin{matrix} A (λ) & C (λ) \\ 0 & B (λ) \end{matrix}] .

Remark 52.

Theorem 55 also appeared as a lemma in the earlier article (Lemma 3, [223]), though no complete proof was provided.

Remark 53.

Let

A_{i} \in F^{m \times n}

,

B_{i} \in F^{p \times k}

, and

C_{i} \in F^{m \times k}

for

i = 1, 2

. Using Theorem 55, Wimmer (Theorem 1.1, [222]) showed that the system of matrix equations

\{\begin{matrix} A_{1} X - Y B_{1} = C_{1}, \\ A_{2} X - Y B_{2} = C_{2}, \end{matrix}

has a solution pair

X \in F^{n \times k}

and

Y \in F^{m \times p}

if and only if there exist two nonsingular matrices

R \in F^{(n + k) \times (n + k)}

and

S \in F^{(m + p) \times (m + p)}

such that

S ([\begin{matrix} A_{1} & C_{1} \\ 0 & B_{1} \end{matrix}] - λ [\begin{matrix} A_{2} & C_{2} \\ 0 & B_{2} \end{matrix}]) = ([\begin{matrix} A_{1} & 0 \\ 0 & B_{1} \end{matrix}] - λ [\begin{matrix} A_{2} & 0 \\ 0 & B_{2} \end{matrix}]) R .

Obviously, this result is also a generalization of RET. A similar research idea is also introduced in Section 8.1.

6.6.6. By Jordan Systems of Polynomial Matrices

Let

A \in C^{m \times m} [λ]

,

B \in C^{n \times n} [λ]

, and

C \in C^{m \times n} [λ]

be such that

\det (A) \neq 0

and

\det (B) \neq 0

. According to Jordan systems of polynomial matrices, Wimmer [224] discussed the solvability conditions for the polynomial matrix equations

A X - Y B = C,

(79)

where

X, Y \in C^{m \times n} [λ]

are unknown. Let

σ (B) = {λ \in C ∣ det (B (λ)) = 0},

and let the elementary divisors corresponding to

λ_{1} \in σ (B)

be

{(λ - λ_{1})}^{l_{1}}, \dots, {(λ - λ_{1})}^{l_{q}},

where

l_{1} \geq \dots \geq l_{q} \geq 1

and

l = l_{1} + \dots + l_{q}

satisfying

det (B) = {(λ - λ_{1})}^{l} c (λ) and c (λ_{1}) \neq 0 .

For

r \in Z^{+}

, denote

N_{r} = {[\begin{matrix} 0 & 1 \\ ⋱ & ⋱ \\ ⋱ & 1 \\ 0 \end{matrix}]}_{r \times r} .

Let the Jordan matrix of B associated to

λ_{1}

be

J = diag (λ_{1} I - N_{l_{1}}, \dots, λ_{1} I - N_{l_{q}}) .

(80)

Then, there exist

H \in C^{n \times l}

and

\hat{H} \in C^{n \times l} [λ]

such that

B H = \hat{H} (λ I - J),

where the columns of

\hat{H}

are

C

-linearly independent (see [225]). Thus, H is called a right Jordan system of B corresponding to

λ_{1}

. Then, G is called a left Jordan system of A corresponding to

λ_{1} \in σ (A)

if

G^{T}

is a right Jordan system of

A^{T}

.

For

λ_{1} \in σ (A) \cap σ (B)

, let G and H be left and right Jordan systems of A and B corresponding to

λ_{1}

, respectively. Then,

(G, H)

is called a pair of Jordan systems of

(A, B)

corresponding to

λ_{1}

. According to (80), partition

H = [\begin{matrix} H_{1} & H_{2} & \dots & H_{q} \end{matrix}]

with

H_{j} = [h_{j 0}, \dots, h_{j, l_{j} - 1}]

for

j = 1, \dots, q

. Similarly, assume that the elementary divisors belonging to

λ_{1} \in σ (A)

are

{(λ - λ_{1})}^{k_{1}}, \dots, {(λ - λ_{1})}^{k_{p}},

where

k_{1} \geq \dots \geq k_{p} \geq 1

. Partition

G = [\begin{matrix} G_{1} \\ ⋮ \\ G_{p} \end{matrix}] with G_{i} = [\begin{matrix} g_{i 0} \\ ⋮ \\ g_{i, k_{i} - 1} \end{matrix}] for i = 1, \dots, p .

The k-th derivative of

C (λ)

is denoted by

C^{(k)} (λ)

. We call that

(G, H)

has the property

(Σ)

if

\sum_{ν + σ + τ = r_{i j}} g_{i ν} \frac{C^{(σ)} (λ_{1})}{σ!} h_{j τ} = 0,

where

r_{i j} = 0, 1, \dots, min (k_{i}, l_{j}) - 1, i = 1, \dots, p

, and

j = 1, \dots, q

.

Theorem 56

(Theorem 1.1, [224]). Let A, B, and C be given in (79). Then, the following are equivalent:

(1): Equation (79) is consistent;
(2): There exists a pair of Jordan systems $(G, H)$ of $(A, B)$ with property $(Σ)$ for each $λ \in σ (A) \cap σ (B)$ ;
(3): All pairs of Jordan systems of $(A, B)$ have property $(Σ)$ for each $λ \in σ (A) \cap σ (B)$ .

Remark 54.

Wimmer [224] pointed out that Theorem 56 can also be extended to matrices over the ring

O (G)

of complex holomorphic functions in a domain

G

.

6.6.7. By Linear Matrix Equations

Let

F

be a field. Assume that

A (λ) = \sum_{i = 0}^{m} A_{i} λ^{i}, B (λ) = \sum_{i = 0}^{n} B_{i} λ^{i}, and C (λ) = \sum_{i = 0}^{k} C_{i} λ^{i},

(81)

where

A_{i}, B_{i}, C_{i} \in F^{r \times r}

satisfying

A_{m} \neq 0

,

B_{n} \neq 0

, and

C_{k} \neq 0

. Consider the polynomial matrix equation:

A (λ) X (λ) + Y (λ) B (λ) = C (λ),

(82)

where

X (λ), Y (λ) \in F^{r \times r} [λ]

are unknown.

Barnett [226] provided an equivalent condition for Equation (82) to have a unique solution pair

(X (λ), Y (λ))

with

deg X (λ) < n and deg Y (λ) < m .

(83)

For

H (λ) = \sum_{i = 0}^{m} H_{i} λ^{i}

with

H_{m} \neq 0

, we say that

H (λ)

is regular if

det (H_{m}) \neq 0

.

Theorem 57

([226]). Let

A (λ)

,

B (λ)

, and

C (λ)

given in (81) be such that

A (λ)

and

B (λ)

are regular and that

deg C (λ) \leq n + m - 1

. Then, Equation (82) has a unique solution pair

(X (λ), Y (λ))

such that (83) if and only if

det (A (λ))

and

det (B (λ))

are relatively prime, (i.e., the greatest common divisor is a constant independent λ).

Remark 55.

Note that the condition in Theorem 57, i.e., that

det (A (λ))

and

det (B (λ))

are relatively prime, implies that

A (λ)

and

B (λ)

are nonsingular (see [227]).

Subsequently, Feinstein and Bar-Ness [227] conducted further research on the solutions to Equation (82) with (83); such special solutions are also called the minimal solutions.

Theorem 58

(Theorem II, [227]). Let

A (λ)

,

B (λ)

, and

C (λ)

given in (81) be such that

A (λ)

and

B (λ)

are nonsingular and

deg C (λ) \leq n + m - 1

. Assume that

A (λ)

(or

B (λ)

) is regular. Then, Equation (82) has a unique minimal solution if and only if

det (A (λ))

and

det (B (λ))

are relatively prime.

Remark 56.

In Theorem III of [227], Feinstein and Bar-Ness showed that, if Equation (82) has a minimal solution

(X (λ), Y (λ))

, then

(X (λ), Y (λ))

is not unique if and only if both

A (λ)

and

B (λ)

are not regular.

Motivated by Theorem 57, Chen and Tian [228] proved that Equation (81) can be reduced to a linear matrix equation. For

A (λ)

and

B (λ)

given in (81), let

A_{R} = [\begin{matrix} 0 & 0 & \dots & 0 & - A_{0} \\ I & 0 & \dots & 0 & - A_{1} \\ 0 & I & \dots & 0 & - A_{2} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ 0 & 0 & \dots & I & - A_{m - 1} \end{matrix}] and B_{L} = [\begin{matrix} 0 & I & 0 & \dots & 0 \\ 0 & 0 & I & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & 0 & \dots & I \\ - B_{0} & - B_{1} & - B_{2} & \dots & - B_{n - 1} \end{matrix}] .

Theorem 59

(Lemma 3.1 and Theorems 1.1 and 1.3, [228]). Let

A (λ)

,

B (λ)

, and

C (λ)

given in (81) be such that

A_{m} = B_{n} = I_{r}

, and let

H_{C (λ)}

be the set of all solutions

(X (λ), Y (λ))

of Equation (82) with (83).

(1): Let $k < m + n$ . If Equation (82) is solvable, then $H_{C (λ)} \neq \emptyset$ .
(2): Let $k < m$ . There exists $X (λ)$ satisfying $(X (λ), Y (λ)) \in H_{C (λ)}$ if and only if

$A_{R}^{n} Y + \sum_{i = 0}^{n - 1} A_{R}^{i} Y B_{i} = C,$

(84)

where

$Y (λ) = \sum_{i = 0}^{m - 1} Y_{i} λ^{i}, Y = [\begin{matrix} Y_{0} \\ ⋮ \\ Y_{m - 1} \end{matrix}], C = [\begin{matrix} C_{0} \\ ⋮ \\ C_{m - 1} \end{matrix}] .$
(3): Let $k < n$ . There exists $Y (λ)$ satisfying $(X (λ), Y (λ)) \in H_{C (λ)}$ if and only if

$X B_{L}^{m} + \sum_{i = 0}^{n - 1} A_{i} X B_{L}^{i} = \tilde{C},$

(85)

where $X (λ) = \sum_{i = 0}^{n - 1} X_{i} λ^{i}$ , $X = [\begin{matrix} X_{0} & \dots & X_{n - 1} \end{matrix}]$ , and $\tilde{C} = [\begin{matrix} C_{0} & \dots & C_{n - 1} \end{matrix}]$ .

Remark 57.

(1): For $A = [a_{i j}] \in F^{p \times q}$ , let

$row (A) = [\begin{matrix} a_{11} & \dots & a_{p 1} & a_{12} & \dots & a_{p 2} & \dots & a_{1 q} & \dots & a_{p q} \end{matrix}] .$

and $vec (A) = row {(A)}^{T}$ . Then,

$\begin{matrix} (84) & \Leftrightarrow (I \otimes A_{R}^{n} + \sum_{i = 0}^{n - 1} B_{i}^{T} \otimes A_{R}^{i}) vec (Y) = vec (C), \\ (85) & \Leftrightarrow row (X) (I \otimes B_{L}^{m} + \sum_{i = 0}^{m - 1} A_{i} \otimes B_{L}^{i}) = row (\tilde{C}) . \end{matrix}$
(2): The explicit solutions to Equations (84) and (85) have been studied in [229,230], which also serve as a starting point of Section 6.7 in this paper.
(3): Moreover, Sheng and Tian [228] mentioned that Theorem 59 still holds when the field $F$ is extended to a commutative ring with identity.

6.6.8. By Root Functions of Polynomial Matrices

Let

L_{1}^{n \times n} [a, b]

denote the space of

n \times n

matrix-valued functions that are Lebesgue integrable over the interval

[a, b]

. Let

B (λ) = I_{n} + \int_{- ω}^{0} e^{i λ t} b (t) d t, D (λ) = I_{n} + \int_{0}^{ω} e^{i λ t} d (t) d t, G (λ) = \int_{- ω}^{ω} e^{i λ t} g (t) d t,

where

b \in L_{1}^{n \times n} [- ω, 0]

,

d \in L_{1}^{n \times n} [0, ω]

, and

g \in L_{1}^{n \times n} [- ω, ω]

. Consider the following linear entire matrix function equation

X (λ) B (λ) + D (λ) Y (λ) = G (λ), λ \in C,

(86)

where X and Y are unknown

n \times n

matrix functions

X (λ) = \int_{0}^{ω} e^{i λ t} x (t) d t and Y (λ) = \int_{- ω}^{0} e^{i λ t} y (t) d t

with

x \in L_{1}^{n \times n} [0, ω]

and

y \in L_{1}^{n \times n} [- ω, 0]

. Gohberg [231] proposed necessary and sufficient conditions for the solvability of Equation (86) using the root functions of the coefficients.

Definition 17

([231,232]). Let an

n \times n

matrix function

H (λ)

be analytic at

λ_{0} \in C

. A

C^{n}

-valued function φ is called a root function of

H (λ)

at

λ_{0}

of order (at least)

k \in Z^{+}

if φ is analytic at

λ_{0}

,

φ (λ_{0}) \neq 0

, and

H (λ) φ (λ)

has a zero at

λ_{0}

of order (at least) k.

Theorem 60

(Theorem 1.1, [231]). Equation (86) is consistent if and only if for each common zero

λ_{0}

of

det (B (λ))

and

det (D (λ))

, if φ is a root function of

B (λ)

at

λ_{0}

of order p and ψ is a root function of

D {(λ)}^{T}

at

λ_{0}

of order q, then

ψ {(λ)}^{T} G (λ) φ (λ)

has a zero at

λ_{0}

of order at least

min {q, p}

.

Let

H (λ)

be an analytic

r \times r

matrix function, and let

λ_{0}

be a point in the domain of analyticity of

H (λ)

. If

det (H (λ_{0})) = 0

, then

λ_{0}

is called an eigenvalue of

H (λ)

. Let a

C^{r}

-vector valued function

ϕ (λ)

be analytic in a neighborhood of an eigenvalue

λ_{0}

of

F (λ)

. If

ϕ (λ_{0}) \neq 0

and

F (λ_{0}) ϕ (λ_{0}) = 0

, then

ϕ (λ)

is called a right root function of

F (λ)

at

λ_{0}

. The order (at least) k of the right root function

ϕ (λ)

at

λ_{0}

is the order (at least) k of

λ_{0}

as a zero of the analytic function

F (λ) ϕ (λ)

. Similarly, left root functions can be defined (see [233]).

Utilizing the right and left root functions of polynomial matrices, Kaashoek and Lerer [233] then presented a discrete version of Theorem 60. Let

L (λ) = \sum_{j = 0}^{l} λ^{j} L_{j}, M (λ) = \sum_{j = 0}^{m} λ^{j} M_{j}, G (λ) = \sum_{j = 0}^{ℓ + m - 1} λ^{j} G_{j},

(87)

where

L_{j}, M_{j}, G_{j} \in C^{r \times r}

with

L_{l} \neq 0

and

M_{m} \neq 0

. Consider the following polynomial matrix equation:

X (λ) L (λ) + M (λ) Y (λ) = G (λ),

(88)

where

X (λ)

and

Y (λ)

are unknown. Define

\begin{matrix} \hat{L} (λ) = λ^{l} L (λ^{- 1}) = \sum_{j = 0}^{l} λ^{j} L_{l - j}, \hat{M} (λ) = λ^{m} M (λ^{- 1}) = \sum_{j = 0}^{m} λ^{j} M_{m - j}, \\ \bar{G} (λ) = λ^{l + m - 1} G (λ^{- 1}) = \sum_{j = 0}^{l + m - 1} λ^{j} G_{l + m - 1 - j} . \end{matrix}

Theorem 61

(Theorem 1.1, [233]). For

L (λ)

and

M (λ)

given in (87), assume that both

det (L (λ))

and

det (M (λ))

do not vanish identically. Then, Equation (88) has a solution pair

(X (λ), Y (λ))

such that

deg X (λ) \leq m - 1 and deg Y (λ) \leq l - 1

if and only if both of the following two conditions hold:

(1): For each $λ_{0} \in C$ satisfying $det (L (λ_{0})) = det (M (λ_{0})) = 0$ , if $f (λ)$ is a right root function of $L (λ)$ at $λ_{0}$ of order s and $h (λ)$ is a left root function of $M (λ)$ at $λ_{0}$ of order t, then $h (λ) G (λ) f (λ)$ has a zero at $λ_{0}$ of order at least $min {s, t}$ ;
(2): If $f_{\circ} (λ)$ is a right root function of $\hat{L} (λ)$ at zero of order $s_{\circ}$ and $h_{\circ} (λ)$ is a left root function of $\hat{M} (λ)$ at zero of order $t_{\circ}$ , then $h_{\circ} (λ) \bar{G} (λ) f_{\circ} (λ)$ has a zero of order at least $min {s_{\circ}, t_{\circ}}$ .

Remark 58.

Kaashoek and Lerer proved Theorem 61 by means of Theorem 1.2 of [233], which is a direct corollary of Theorems 3.2 and 4.1 of [234]. Although this proof strategy can be regarded as a discrete version of the proof of Theorem 60, in the discrete case, the common spectrum at infinity plays a crucial role, whereas in [231] there are no common root functions at infinity.

6.7. Sylvester-Polynomial-Conjugate Matrix Equations

For given matrices A,

B_{i} (i = 1, \dots, k)

, and C, the matrix equation

\sum_{i = 0}^{k} A^{i} X B_{i} = C

has been thoroughly investigated for its important role in control theory (see [229,230,235,236,237]). Based on results in [230], Wu et al. [238] defined Sylvester sums and Kronecker maps, and used them to discuss the following matrix equation over

R

:

\sum_{i = 0}^{n_{1}} A_{i} X F^{i} = \sum_{i = 0}^{n_{2}} B_{i} R F^{i},

where X is unknown and other matrices are given over

R

. Building on the work in [238], Wu et al. [239,240,241] established the framework of conjugate products and Sylvester-conjugate sums over

C

. Specifically, for

A \in C^{m \times n}

and

k \in N

, define

A^{* k} = \{\begin{matrix} A, & for even k, \\ \bar{A}, & for odd k . \end{matrix}

The conjugate product [239,241] of

A (λ) = \sum_{i = 0}^{m} A_{i} λ^{i} \in C^{p \times q} [λ] and B (λ) = \sum_{j = 0}^{n} B_{j} λ^{j} \in C^{q \times r} [λ]

is defined as

A (λ) ⊛ B (λ) = \sum_{i = 0}^{m} \sum_{j = 0}^{n} A_{i} B_{j}^{* i} λ^{i + j} .

For

A \in C^{n \times n}

and

k \in N

, define

A^{\overset{\leftarrow}{k}} = A^{k - 2 ⌊\frac{k}{2}⌋} {(\bar{A} A)}^{⌊\frac{k}{2}⌋},

where

⌊ \cdot ⌋

is the floor function (downward rounding). For

Z \in C^{r \times p}

,

F \in C^{p \times p}

, and

T (λ) = \sum_{i = 0}^{t} T_{i} λ^{i} \in C^{n \times r} [λ],

we define the Sylvester-conjugate sum [241] of

T (λ)

and Z with respect to F by

T (λ) \overset{F}{⊞} Z = \sum_{i = 0}^{t} T_{i} Z^{* i} F^{\overset{\leftarrow}{i}} .

(89)

Remark 59.

Note that Lemma 7 in ref. [241] reveals an intriguing relationship between the conjugate product and the Sylvester-conjugate sum. Specifically, if

A (λ) \in C^{l \times q} [λ]

,

B (λ) \in C^{q \times r} [λ]

,

F \in C^{p \times p}

, and

Z \in C^{r \times p}

, then

(A (λ) ⊛ B (λ)) \overset{F}{⊞} Z = A (λ) \overset{F}{⊞} (B (λ) \overset{F}{⊞} Z) .

On this basis, Wu et al. [241] investigated the following Sylvester-polynomial-conjugate matrix equation:

\sum_{i = 0}^{ϕ_{1}} A_{i} X^{* i} F^{\overset{\leftarrow}{i}} + \sum_{j = 0}^{ϕ_{2}} B_{j} Y^{* j} F^{\overset{\leftarrow}{j}} = \sum_{k = 0}^{ϕ_{3}} C_{k} R^{* k} F^{\overset{\leftarrow}{k}},

(90)

where

A_{i} \in C^{n \times n} (i = 0, 1, \dots, ϕ_{1})

,

B_{j} \in C^{n \times r} (j = 0, 1, \dots, ϕ_{2})

,

C_{k} \in C^{n \times m} (k = 0, 1, \dots, ϕ_{3})

,

R \in C^{m \times p}

, and

F \in C^{p \times p}

are given, and

X \in C^{n \times p}

and

Y \in C^{r \times p}

are unknown. Denote

A (λ) = \sum_{i = 0}^{ϕ_{1}} A_{i} λ^{i}, B (λ) = \sum_{i = 0}^{ϕ_{2}} B_{i} λ^{i}, C (λ) = \sum_{i = 0}^{ϕ_{3}} C_{i} λ^{i},

(91)

where

A_{i}

,

B_{i}

, and

C_{i}

are given in (90). Thus, by conjugate products and Sylvester-conjugate sums, Equation (90) can be directly rewritten as

A (λ) \overset{F}{⊞} X + B (λ) \overset{F}{⊞} Y = C (λ) \overset{F}{⊞} R .

(92)

Additionally, we say that

A (λ) \in C^{n \times n} [λ]

and

B (λ) \in C^{n \times r} [λ]

are left coprime [239] if all their greatest common left divisors are unimodular, which is also equivalent to the existence of

C (λ) \in C^{n \times r} [λ]

and

D (λ) \in C^{r \times r} [λ]

such that

A (λ) ⊛ C (λ) + B (λ) ⊛ D (λ) = I .

Theorem 62

(Theorem 2, [241]). Assume that

A (λ)

and

B (λ)

given in (91) are left coprime. Suppose that the unimodular polynomial matrix

U (λ) \in C^{(n + r) \times (n + r)} [λ]

satisfies

[\begin{matrix} A (λ) & B (λ) \end{matrix}] ⊛ U (λ) = [\begin{matrix} I & 0 \end{matrix}] .

Then, all solutions of Equation (92) (or (90)) are

[\begin{matrix} X \\ Y \end{matrix}] = (U (λ) ⊛ [\begin{matrix} C (λ) & 0 \\ 0 & I \end{matrix}]) \overset{F}{⊞} [\begin{matrix} R \\ Z \end{matrix}],

(93)

where

Z \in C^{r \times p}

is arbitrary. Furthermore, if partition

U (λ) = [\begin{matrix} H (λ) & N (λ) \\ L (λ) & D (λ) \end{matrix}]

with

N (λ) \in C^{n \times r} [λ]

and

D (λ) \in C^{r \times r} [λ]

, then (93) can be rewritten as

\{\begin{matrix} X = (H (λ) ⊛ C (λ)) \overset{F}{⊞} R + N (λ) \overset{F}{⊞} Z, \\ Y = (L (λ) ⊛ C (λ)) \overset{F}{⊞} R + D (λ) \overset{F}{⊞} Z, \end{matrix}

for arbitrary

Z \in C^{r \times p}

.

Remark 60.

(1): Theorem 9 in ref. [239] guarantees the existence of the polynomial matrix $U (λ)$ in Theorem 62.
(2): Taking

$ϕ_{1} = 0, ϕ_{2} = 1, B_{0} = 0, B_{1} = I, and ϕ_{3} = 0,$

Equation (90) over $R$ reduces to

$A X + Y B = C,$

where $A = A_{0}$ , $B = F$ , and $C = C_{0} R$ . Clearly, Theorem 62 is also a generalization of RET over $R$ .
(3): In Theorem 1 of [241], Wu et al. characterized the homogeneous case of Equation (90) more specifically via a pair of right coprime polynomial matrices. Moreover, in Remark 4 of [241], they utilized the same method to discuss a more general form of Equation (90), i.e.,

$\sum_{k = 1}^{θ} \sum_{i = 0}^{ω_{k}} A_{k i} X_{k} F^{\overset{\leftarrow}{i}} = \sum_{j = 0}^{c} C_{j} R F^{\overset{\leftarrow}{j}},$

where $X_{k} (k = 1, \dots, θ)$ are unknown and others are given.
(4): It can be observed that Lemmas 11 and 12 of [241] are crucial for proving Theorem 62 and Theorem 1 of [239]. Meanwhile, it should be noted that Lemmas 11 and 12 of [241] provide only necessary conditions for left and right coprimeness, respectively. Thus, we contend that exploring the converse problems of these two lemmas is interesting.
(5): Equation (90) generalizes a class of complex conjugate matrix equations (see [242,243,244,245,246,247,248]). For systematic research on complex conjugate matrix equations and their applications in discrete-time antilinear systems, refer to the monograph [249] by Wu and Zhang.

Inspired by [241], Mazurek [250] recently generalized Theorem 62 based on groupoids and vector spaces.

Theorem 63

(Theorem 2, [250]). Let

M_{11}, M_{12}, M_{21},

and

M_{22}

be groupoids with binary operations commonly denoted by ⊕, and let

V_{1}

and

V_{2}

be finite-dimensional vector spaces over a field

F

. Assume that for any

i, j, k \in {1, 2}

, two operations

⊠ : M_{i j} \times V_{j} \to V_{i} and ⊙ : M_{i j} \times M_{j k} \to M_{i k}

are given such that

(1): $(s \oplus t) ⊠ v s . = s ⊠ v s . + t ⊠ v s .$ for $i, j \in {1, 2}$ , $s, t \in M_{i j}$ , and $v s . \in V_{j}$ ;
(2): $s ⊠ (k u + l v) = k (s ⊠ u) + l (s ⊠ v)$ for $i, j \in {1, 2}$ , $s \in M_{i j}$ , $k, l \in F$ , and $u, v s . \in V_{j}$ ;
(3): $s ⊠ (t ⊠ v) = (s ⊙ t) ⊠ v s .$ for $i, j, k \in {1, 2}$ , $s \in M_{i j}$ , $t \in M_{j k}$ , and $v s . \in V_{k}$ .

Suppose that for

a \in M_{11}

and

b \in M_{12}

, there exist

p \in M_{11}

,

g \in M_{12}

,

d, q \in M_{21}

, and

h, w \in M_{22}

such that

(i): $((a ⊙ p) \oplus (b ⊙ q)) ⊠ v s . = v s .$ for any $v s . \in V_{1}$ ;
(ii): $((d ⊙ g) \oplus (w ⊙ h)) ⊠ u = u$ for any $u \in V_{2}$ ,;
(iii): $((a ⊙ g) \oplus (b ⊙ h)) ⊠ u = 0$ for any $u \in V_{2}$ .

Then, for

c \in V_{1}

, the all solutions of the equation

a ⊠ x + b ⊠ y = c

are

(x, y) = (p ⊠ c + g ⊠ z, q ⊠ c + h ⊠ z),

where

z \in V_{2}

is arbitrary.

For a ring

F

with unity and a ring endomorphism

σ

of

F

, the skew polynomial ring

F [λ; σ]

is the set of polynomials over

F

in the indeterminate

λ

with the usual addition and multiplication subject to

λ a = σ (a) λ

for any

a \in F

(see [251]). Specifically, the multiplication in

F [λ; σ]

is given by

(\sum_{i = 0}^{n} a_{i} λ^{i}) (\sum_{j = 0}^{m} b_{j} λ^{j}) = \sum_{i = 0}^{n} \sum_{j = 0}^{m} a_{i} σ^{i} (b_{j}) λ^{i + j}

(94)

for

\sum_{i = 0}^{n} a_{i} λ^{i}, \sum_{j = 0}^{m} b_{j} λ^{j} \in F [λ; σ]

. The set of

m \times n

matrices over the skew polynomial ring

F [s; σ]

is denoted by

F^{m \times n} [s; σ]

. Assume that

F

is also a finite-dimensional vector space over a field

K

, and put

\begin{matrix} M_{11} & = F^{n \times n} [s; σ], M_{12} = F^{n \times m} [s; σ], M_{21} = F^{m \times n} [s; σ], \\ M_{22} & = F^{m \times m} [s; σ], V_{1} = F^{n \times p}, V_{2} = F^{m \times p}, \end{matrix}

with the usual addition (denoted by ⊕) and the skew multiplication (94) (denoted by ⊙) of polynomial matrices. Then, applying Theorem 63, Mazurek obtained the relevant result (i.e., Theorem 3, [250]) for Equation

A (λ) ⊠ X + B (λ) ⊠ Y = C,

where

A (λ) \in F^{n \times n} [λ; σ]

,

B (λ) \in F^{n \times m} [λ; σ]

, and

C \in F^{n \times p}

are given, and

X \in F^{n \times p}

and

Y \in F^{m \times p}

are unknown.

Remark 61.

In the proof of Theorem 1 in [250], Mazurek showed that Theorem 62 is an immediate corollary of Theorem 3 of [250].

Moreover, analogous to the conjugate product of complex matrices, Wu et al. [252] further defined the

j

-conjugate product of quaternion matrices. The

j

-conjugate of

A \in H^{m \times n}

is

\overset{\leftrightarrow}{A} = - j A j .

For

k \in Z^{+}

, inductively define

A^{\circ k} = \overset{\leftrightarrow}{A^{\circ (k - 1)}} with A^{\circ 0} = A

. Then, the

j

-conjugate product [252] of

A (λ) = \sum_{i = 0}^{m} A_{i} λ^{i} \in H^{p \times q} [λ] and B (λ) = \sum_{j = 0}^{n} B_{j} λ^{j} \in H^{q \times r} [λ]

is defined as

A (λ) ⊛ B (λ) = \sum_{i = 0}^{m} \sum_{j = 0}^{n} A_{i} B_{j}^{\circ i} λ^{i + j} .

We say that

A (λ) \in H^{n \times n} [λ]

and

B (λ) \in H^{n \times m} [λ]

are ⊛-left coprime [250] if there exists a unimodular polynomial matrix

U (λ) \in H^{(n + m) \times (n + m)} [λ]

such that

[\begin{matrix} A (λ) & B (λ) \end{matrix}] ⊛ U (λ) = [\begin{matrix} I & 0 \end{matrix}] .

Let the map

σ

be

σ : H \to H

with

σ (h) = - j h j

for

h \in H

. Then,

σ

is an automorphism on

H

. So, the

j

-conjugate product is the product of matrices over

H [λ; σ]

. Similarly to (89), Mazurek [250] defined the Sylvester-

j

-conjugate sum over

H

,

T (λ) \overset{F}{⊞} V = T_{0} V + \sum_{m = 1}^{t} T_{m} σ^{m} (V) σ^{m - 1} (F) \dots σ^{1} (F) σ^{0} (F)

for

T (λ) = \sum_{i = 0}^{t} T_{i} λ^{i} \in H^{n \times r} [λ]

,

V \in H^{r \times p}

, and

F \in H^{p \times p}

. Applying Theorem 3 of [250] to matrices over

H [λ; σ]

yields the following result immediately.

Theorem 64

(Theorem 4, [250]). Let

A (λ) \in H^{n \times n} [λ]

and

B (λ) \in H^{n \times m} [λ]

be ⊛-left coprime. Then, there exist

P (λ) \in H^{n \times n} [λ], G (λ) \in H^{n \times m} [λ], D (λ), Q (λ) \in H^{m \times n} [λ], H (λ), W (λ) \in H^{m \times m} [λ]

such that

\begin{matrix} A (λ) ⊛ P (λ) + B (λ) ⊛ Q (λ) = I_{n}, D (λ) ⊛ G (λ) + W (λ) ⊛ H (λ) = I_{m}, \\ A (λ) ⊛ G (λ) + B (λ) ⊛ H (λ) = 0 . \end{matrix}

Moreover, given

F \in H^{p \times p}

and

C \in H^{n \times p}

, the general solution of the matrix equation

A (λ) \overset{F}{⊞} X + B (λ) \overset{F}{⊞} Y = C

is

\{\begin{matrix} X = P (λ) \overset{F}{⊞} C + G (λ) \overset{F}{⊞} Z, \\ Y = Q (λ) \overset{F}{⊞} C + H (λ) \overset{F}{⊞} Z, \end{matrix}

where

Z \in H^{m \times p}

is arbitrary.

Remark 62.

Theorem 64 not only generalizes Theorem 62 to

H

, but also presents a more general result than the relevant results in [87,253,254,255,256,257,258,259,260,261].

6.8. Generalized Forms of GSE

In Section 4.3, we can see that the SVD plays an important role in solving Equation (1). Then, will the development of the SVD promote research on a more general form of Equation (1)? The work in [31,262,263] shows that it is affirmative.

De Moor and Zha [262] established GSVD of a finite number

k \in Z^{+}

of matrices over

C

.

Theorem 65

(GSVD for any k matrices (Theorem 1, [262])). Let

A_{1} \in C^{n_{0} \times n_{1}}, A_{2} \in C^{n_{1} \times n_{2}}, \dots, A_{k - 1} \in C^{n_{k - 2} \times n_{k - 1}}, and A_{k} \in C^{n_{k - 1} \times n_{k}} .

Then, there exist unitary matrices

U_{1} \in C^{n_{0} \times n_{0}}

and

V_{k} \in C^{n_{k} \times n_{k}}

, and nonsingular matrices

X_{j} \in C^{n_{j} \times n_{j}} (j = 1, \dots, k - 1)

such that

U_{1}^{*} A_{1} X_{1} = Λ_{1}, Z_{1} A_{2} X_{2} = Λ_{2}, \dots, Z_{i - 1} A_{i} X_{i} = Λ_{i}, \dots, Z_{k - 1} A_{k} V_{k} = Λ_{k},

where

Z_{j} = X_{j}^{*}

(or

= X_{j}^{- 1}

) for

j = 1, 2, \dots, k - 1

(i.e., both choices are always possible),

\begin{matrix} Λ_{j} = & [\begin{matrix} I & 0 & 0 & 0 & \dots & 0 & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \\ 0 & I & 0 & 0 & \dots & 0 & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \\ 0 & 0 & I & 0 & \dots & 0 & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & 0 & 0 & \dots & I & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \end{matrix}] \begin{matrix} r_{j}^{1} \\ r_{j - 1}^{1} - r_{j}^{1} \\ r_{j}^{2} \\ r_{j - 1}^{2} - r_{j}^{2} \\ r_{j}^{3} \\ r_{j - 1}^{3} - r_{j}^{3} \\ \dots \\ r_{j}^{j} \\ n_{j - 1} - r_{j - 1} - r_{j}^{j} \end{matrix} \\ \begin{matrix} r_{j}^{1} & r_{j}^{2} & r_{j}^{3} & r_{j}^{4} & \dots & r_{j}^{j} & n_{j} - r_{j} \end{matrix} \end{matrix}

(

j = 1, 2, \dots, k - 1

) with

r_{0} = 0

and

r_{j} = \sum_{i = 1}^{j} r_{j}^{i} = rank (A_{j})

,

\begin{matrix} Λ_{k} = & [\begin{matrix} Λ_{k}^{1} & 0 & 0 & 0 & \dots & 0 & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \\ 0 & Λ_{k}^{2} & 0 & 0 & \dots & 0 & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \\ 0 & 0 & Λ_{k}^{3} & 0 & \dots & 0 & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & 0 & 0 & \dots & Λ_{k}^{k} & 0 \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 \end{matrix}] \begin{matrix} r_{k}^{1} \\ r_{k - 1}^{1} - r_{k}^{1} \\ r_{k}^{2} \\ r_{k - 1}^{2} - r_{k}^{2} \\ r_{k}^{3} \\ r_{k - 1}^{3} - r_{k}^{3} \\ \dots \\ r_{k}^{k} \\ n_{k - 1} - r_{k - 1} - r_{k}^{k} \end{matrix} \\ \begin{matrix} r_{k}^{1} & r_{k}^{2} & r_{k}^{3} & r_{k}^{4} & \dots & r_{k}^{k} & n_{k} - r_{k} \end{matrix} \end{matrix}

(95)

with

r_{k} = \sum_{i = 1}^{k} r_{k}^{i} = rank (A_{k})

, and

Λ_{k}^{i} \in C^{r_{k}^{i} \times r_{k}^{i}} (i = 1, 2, \dots, k)

are diagonal with positive diagonal elements.

It is easy to see that there exists an elementary column transformation that turns

Λ_{k}

given in (95) into the matrix

Λ_{k}^{'}

consisting of only zero and identity matrix blocks. So, under the hypotheses of Theorem 65, there exist nonsingular matrices

U_{1} \in C^{n_{0} \times n_{0}}

,

V_{k}^{'} \in C^{n_{k} \times n_{k}}

, and

X_{j} \in C^{n_{j} \times n_{j}} (j = 1, \dots, k - 1)

such that

U_{1}^{*} A_{1} X_{1} = Λ_{1}, Z_{1} A_{2} X_{2} = Λ_{2}, \dots, Z_{i - 1} A_{i} X_{i} = Λ_{i}, \dots, Z_{k - 1} A_{k} V_{k}^{'} = Λ_{k}^{'} .

(96)

In this case,

Λ_{1}, \dots, Λ_{k - 1}

, and

Λ_{k}^{'}

only have zero and identity blocks.

Following the idea analogous to that in (96), He (Lemma 2.1, [263]) considered the pure product singular value decomposition (PSVD) for four quaternion matrices, i.e.,

[\begin{matrix} A_{1} \\ I & A_{2} \\ I & A_{3} \\ I & A_{4} \end{matrix}],

where

A_{1} \in H^{q_{1} \times q_{2}}

,

A_{2} \in H^{q_{2} \times q_{3}}

,

A_{3} \in H^{q_{3} \times q_{4}}

, and

A_{4} \in H^{q_{4} \times q_{5}}

. Using the PSVD, He further investigated the system of generalized Sylvester matrix equations over

H

:

\{\begin{matrix} X_{1} A_{1} - B_{1} X_{2} & = C_{1}, \\ X_{2} A_{2} - B_{2} X_{3} & = C_{2}, \\ X_{3} A_{3} - B_{3} X_{4} & = C_{3}, \\ X_{4} A_{4} - B_{4} X_{5} & = C_{4}, \end{matrix}

where

X_{1}, X_{2}, \dots, X_{5}

are unknown (see Theorems 4.1 and 4.2, [263]).

Inspired by He’s aforementioned work, since the GSVD of an arbitrary number k of matrices has been established, can we then consider a system of k generalized Sylvester equations? That is to say, consider

\{\begin{matrix} X_{1} A_{1} - B_{1} X_{2} = C_{1}, \\ X_{2} A_{2} - B_{2} X_{3} = C_{2}, \\ ⋮ \\ X_{k} A_{k} - B_{k} X_{k + 1} = C_{k}, \end{matrix}

(97)

where

k \in Z^{+}

, and

X_{1}, X_{2}, \dots, X_{k + 1}

are unknown.

To answer this question, let us first take a look at the work of He et al. in [31]. In terms of Theorem 65, He et al. in Theorem 2.2 of [31] gave the simultaneous decomposition of fifteen matrices over

C

, i.e.,

[\begin{matrix} B_{1} \\ A_{1} & E_{1} & C_{1} \\ D_{1} & B_{2} \\ A_{2} & E_{2} & C_{2} \\ D_{2} & B_{3} \\ A_{3} & E_{3} & C_{3} \\ D_{3} \end{matrix}],

where

A_{i}, B_{i}, C_{i}, D_{i}, E_{i} (i = 1, 2, 3)

are given matrices over

C

with appropriate orders. By this simultaneous decomposition of fifteen matrices, they further studied the following system of complex matrix equations:

\{\begin{matrix} A_{1} X_{1} B_{1} + C_{1} X_{2} D_{1} = E_{1}, \\ A_{2} X_{2} B_{2} + C_{2} X_{3} D_{2} = E_{2}, \\ A_{3} X_{3} B_{3} + C_{3} X_{4} D_{3} = E_{3}, \end{matrix}

where

X_{1}, \dots, X_{4}

are unknown (see Theorems 3.1 and 3.6, [31]). Interestingly, they also demonstrated the simultaneous decomposition of

5 k

matrices by the same means, i.e.,

[\begin{matrix} B_{1} \\ A_{1} & E_{1} & C_{1} \\ D_{1} & B_{2} \\ A_{2} & E_{2} & C_{2} \\ D_{2} & ⋱ \\ ⋱ & ⋱ \\ B_{k} \\ A_{k} & E_{k} & C_{k} \\ D_{k} \end{matrix}],

where

k \in Z^{+}

, and

A_{i}, B_{i}, C_{i}, D_{i}, E_{i} (i = 1, 2, \dots, k)

are given matrices over

C

with appropriate orders (see Theorem 4.1, [31]). Next, it is natural to consider the system

\{\begin{matrix} A_{1} X_{1} B_{1} + C_{1} X_{2} D_{1} = E_{1}, \\ A_{2} X_{2} B_{2} + C_{2} X_{3} D_{2} = E_{2}, \\ ⋮ \\ A_{k} X_{k} B_{k} + C_{k} X_{k + 1} D_{k} = E_{k}, \end{matrix}

(98)

where

X_{1}, X_{2}, \dots, X_{k + 1}

are unknown. Obviously, Problem (97) is a special case of the system (98). However, only one conjecture on the solvability conditions of the system (98) was presented, i.e., (Conjecture 4.2, [31]). The conjecture was not officially solved until 2025 (see Corollary 4.4, [34]). Moreover, in Theorem 2.1 of [34], He et al. further investigated

(98) subject to \{\begin{matrix} G_{1} X_{1} = P_{1}, G_{2} X_{2} = P_{2}, \dots, G_{k + 1} X_{k + 1} = P_{k + 1}, \\ X_{1} H_{1} = Q_{1}, X_{2} H_{2} = Q_{2}, \dots, X_{k + 1} H_{k + 1} = Q_{k + 1}, \end{matrix}

(99)

where

X_{1}, X_{2}, \dots, X_{k + 1}

are unknown. Furthermore, Xie et al. [264] studied the following system over

H

:

\{\begin{matrix} A_{1} X_{1} + Y_{1} B_{1} + C_{1} Z_{1} D_{1} + F_{1} Z_{2} G_{1} = E_{1}, \\ A_{2} X_{2} + Y_{2} B_{2} + C_{2} Z_{2} D_{2} + F_{2} Z_{3} G_{2} = E_{2}, \\ A_{3} X_{3} + Y_{3} B_{3} + C_{3} Z_{3} D_{3} + F_{3} Z_{4} G_{3} = E_{3}, \end{matrix}

(100)

where

X_{i}, Y_{i}, Z_{i} (i = 1, 2, 3)

, and

Z_{4}

are unknown.

Up to this point, we can observe that the systems (99) and (100) encompass most of the current formal generalizations of Equation (1) without considering differences in number sets, such as [121,265,266,267,268,269,270,271].

Remark 63.

However, there are currently no published studies on the more generalized system as follows:

\begin{matrix} \{\begin{matrix} \sum_{j = 1}^{l} A_{1, j} X_{1 j} B_{1, j} + A_{1, l + 1} X_{1, l + 1} B_{1, l + 1} + A_{1, l + 2} X_{2, l + 1} B_{1, l + 2} = C_{1}, \\ \sum_{j = 1}^{l} A_{2, j} X_{2, j} B_{2, j} + A_{2, l + 1} X_{2, l + 1} B_{2, l + 1} + A_{2, l + 2} X_{3, l + 1} B_{2, l + 2} = C_{2}, \\ ⋮ \\ \sum_{j = 1}^{l} A_{i, j} X_{i, j} B_{i, j} + A_{i, l + 1} X_{i, l + 1} B_{i, l + 1} + A_{i, l + 2} X_{i + 1, l + 1} B_{i, l + 2} = C_{i}, \\ ⋮ \\ \sum_{j = 1}^{l} A_{k, j} X_{k, j} B_{k, j} + A_{k, l + 1} X_{k, l + 1} B_{k, l + 1} + A_{k, l + 2} X_{k + 1, l + 1} B_{k, l + 2} = C_{k}, \end{matrix} \\ subject to & \{\begin{matrix} G_{i, j} X_{i, j} = P_{i, j}, X_{i, j} H_{i, j} = Q_{i, j} for 1 \leq i \leq k, 1 \leq j \leq l + 1, \\ G_{k + 1, l + 1} X_{k + 1, l + 1} = P_{k + 1, l + 1}, X_{k + 1, l + 1} H_{k + 1, l + 1} = Q_{k + 1, l + 1}, \end{matrix} \end{matrix}

(101)

where

l, k \in Z^{+},

A_{i, j}, B_{i, j}, C_{i}

(1 \leq i \leq k, 1 \leq j \leq l + 2)

,

G_{i, j}, P_{i, j}, H_{i, j}, Q_{i, j}

(1 \leq i \leq k, 1 \leq j \leq l + 1)

,

G_{k + 1, l + 1}, P_{k + 1, l + 1}, H_{k + 1, l + 1}

, and

Q_{k + 1, l + 1}

are given, and

X_{i, j}

(1 \leq i \leq k, 1 \leq j \leq l + 1)

and

X_{k + 1, l + 1}

are unknown. So, this is also an interesting problem.

Remark 64.

Through the continuous research on the generalizations of Equation (1), it can be found that the GSVD gradually becomes ineffective, while the tool of the generalized inverse has always been an effective method. However, we also find that the generalized inverse theory has little success in studying the systems of equations that simultaneously couple multiple (≥2) unknown matrices.

For instance, for given matrices A, B, and C, the system

\{\begin{matrix} A X - Y B = C, \\ X - Y = 0, \end{matrix}

(102)

is consistent if and only if the Sylvester equation

A X - X B = C

(103)

is consistent. At present, there are almost no articles that directly represent the general solution of the Sylvester equation using only the generalized inverses of the coefficient matrices. However, it can be intuitively observed that, using the Kronecker product, SVD, or STP to discuss the Sylvester equation is a feasible option.

It is noted that Liu put forward an open problem in [40], that is, to study the equivalent conditions for the solvability of the system of matrix equations over

C

:

\{\begin{matrix} A_{1} X + Y B_{1} = C_{1}, \\ A_{2} X + Y B_{2} = C_{2}, \end{matrix}

(104)

where X and Y are unknown. Clearly, the system (104) is a generalization of both Equation (5) and the system (102). Later, using the rank equalities, Wang et al. [114] solved this problem over

H

under the certain condition, i.e.,

rank [\begin{matrix} A_{1} & A_{2} \end{matrix}] = rank (A_{1}) + rank (A_{2}) and rank [\begin{matrix} B_{1} \\ B_{2} \end{matrix}] = rank (B_{1}) + rank (B_{2}) .

(105)

This wonderful work is illuminating for completely solving Equation (103) and the system (104) by using rank equalities or the generalized inverses.

Theorem 66

(Theorem 2.8, [114]). Let

A_{1}, A_{2} \in H^{m \times p}, B_{1}, B_{2} \in H^{q \times n},

and

C_{1}, C_{2} \in H^{m \times n}

be such that every matrix equation in system (104) is consistent. If (105) holds, then system (104) is consistent if and only if

\begin{matrix} rank [\begin{matrix} B_{1} & 0 \\ B_{2} & 0 \\ - C_{1} & A_{1} \\ C_{2} & A_{2} \end{matrix}] & = rank [\begin{matrix} A_{1} \\ A_{2} \end{matrix}] + rank [\begin{matrix} B_{1} \\ B_{2} \end{matrix}], \\ rank [\begin{matrix} A_{1} & A_{2} & - C_{1} & C_{2} \\ 0 & 0 & B_{1} & B_{2} \end{matrix}] & = rank [\begin{matrix} A_{1} & A_{2} \end{matrix}] + rank [\begin{matrix} B_{1} & B_{2} \end{matrix}], \end{matrix}

\begin{matrix} rank [\begin{matrix} 0 & B_{1} & B_{2} \\ A_{1} & 0 & 0 \\ A_{2} & 0 & F \end{matrix}] & = rank [\begin{matrix} A_{1} \\ A_{2} \end{matrix}] + rank [\begin{matrix} B_{1} & B_{2} \end{matrix}], \\ rank [\begin{matrix} 0 & B_{1} & B_{2} \\ A_{1} & 0 & 0 \\ A_{2} & 0 & \hat{F} \end{matrix}] & = rank [\begin{matrix} A_{1} \\ A_{2} \end{matrix}] + rank [\begin{matrix} B_{1} & B_{2} \end{matrix}], \end{matrix}

where

\begin{matrix} F & = A_{1} (A_{2}^{(1, 2)} C_{2} - A_{1}^{(1, 2)} C_{1}) {[\begin{matrix} B_{1} \\ - B_{2} \end{matrix}]}^{(1, 2)} [\begin{matrix} B_{1} \\ - B_{2} \end{matrix}] + Ω B_{1}, \\ \hat{F} & = A_{2} (A_{2}^{(1, 2)} C_{2} - A_{1}^{(1, 2)} C_{1}) {[\begin{matrix} B_{1} \\ - B_{2} \end{matrix}]}^{(1, 2)} [\begin{matrix} B_{1} \\ - B_{2} \end{matrix}] + Ω B_{2}, \\ Ω & = [\begin{matrix} - A_{1} & A_{2} \end{matrix}] {[\begin{matrix} - A_{1} & A_{2} \end{matrix}]}^{(1, 2)} (R_{A_{2}} C_{2} B_{2}^{(1, 2)} - R_{A_{1}} C_{1} B_{1}^{(1, 2)}) \end{matrix}

with

R_{A_{1}} = I - A_{1} A_{1}^{(1, 2)}

.

Remark 65.

For more research on Equation (104), please refer to [222,272].

7. Iterative Algorithms

Analytical solutions of GSE over various algebras have been introduced in Section 5 and Section 6. However, in practical applications, challenges such as high computational complexity, stability, and robustness issues often arise. Therefore, investigating numerical solutions to GSE is imperative. In this section, we primarily present the relevant results on the numerical solutions of GSE. First, several iterative algorithms that directly address the GSE are proposed.

In 1984, Ziętak [96] proposed an algorithm (Algorithm 1) for computing the

l_{p}

-solution of Equation (5) over

R

, using the equivalence between items (1) and (3) of Theorem 17.

Theorem 67.

Let

{X_{t}}

and

{Y_{t}}

be generalized by Algorithm 1, and let

1 < p < \infty

. Then, the sequence

{R_{t} = A X_{t} + Y_{t} B}

is convergent. Moreover, if

\hat{X} \in R^{r \times n}

and

\hat{Y} \in R^{m \times s}

satisfy

\hat{R} = A \hat{X} + \hat{Y} B

, where

\hat{R} = lim_{t \to \infty} R_{t},

then

\hat{X}

and

\hat{Y}

are the

l_{p}

-solution of Equation (5).

Remark 66.

When

p = 2

, Ziętak (Theorem 3.3, [96]) showed that the iterative process in Algorithm 1 ends after two iterations for any arbitrary initial matrix

Y_{0}

, i.e., matrices

X_{1}

and

Y_{1}

are the

l_{2}

-solution of Equation (5).

Algorithm 1 Algorithm [96] for the

l_{p}

-solution of Equation (5) over

R

Require:: Given matrices $A \in R^{m \times r}$ , $B \in R^{s \times n}$ , and $C \in R^{m \times n}$ ; Initial matrices: $Y_{0} \in R^{m \times s}$ ;
Ensure:: Sequences: ${X_{t} \in R^{r \times n}}$ and ${Y_{t} \in R^{m \times s}}$ ;
1:: for $t = 1, 2, \dots$ do
2:: for $j = 1, 2, \dots, n$ do
3:: Let $c_{j}$ and $b_{j}$ be j-th columns of C and B, respectively;
4:: Solve the $l_{p}$ -solution $x_{j}^{(t)}$ of the linear system $A x = c_{j} - Y_{t - 1} b_{j}$ ;
5:: Set the j-th column of $X_{t}$ as $x_{j}^{(t)}$ ;
6:: end for
7:: for $i = 1, 2, \dots, m$ do
8:: Let $d_{i}$ and $a_{i}$ be the i-th columns of $C^{T}$ and $A^{T}$ , respectively;
9:: Solve the $l_{p}$ -solution $y_{i}^{(t)}$ of the linear system $B^{T} y = d_{i} - X_{t}^{T} a_{i}$ ;
10:: Set the i-th column of $Y_{t}^{T}$ as $y_{i}^{(t)}$ ;
11:: end for
12:: Compute $γ_{2 t - 1} = {∥ A X_{t} + Y_{t - 1} B - C ∥}_{p}$ ;
13:: Compute $γ_{2 t} = {∥ A X_{t} + Y_{t} B - C ∥}_{p}$ ;
14:: if the sequence ${γ_{t}}$ is not decreasing then
15:: Break
16:: end if
17:: end for

In 1985, analogous to Algorithm R [273] for a nonlinear matrix equation, Ziętak [95] devised Algorithm T (Algorithm 2). Based on this algorithm, Ziętak [95] further discussed the Chebyshev solution of Equation (5) under the conditions (29) and (28).

Algorithm 2 Algorithm T [95] for the Chebyshev Solution of Equation (5) over

R

Require:: Matrices $A \in R^{m \times r}$ , $B \in R^{s \times n}$ , and $C \in R^{m \times n}$ ; Initial matrices $X_{0} \in R^{r \times n}$ and $Y_{0} \in R^{m \times s}$ ; Tolerance $ϵ > 0$ (for termination criterion);
Ensure:: Convergent sequences: ${X_{k} \in R^{r \times n}}$ and ${Y_{k} \in R^{m \times s}}$ ;
1:: Initial residual matrix: $R_{0} = A X_{0} + Y_{0} B - C$ ;
2:: Initial residual norm: $γ_{0} = {∥ R_{0} ∥}_{\infty}$ ;
3:: Set iteration index: $k \leftarrow 0$ ;
4:: Termination criterion: $| γ_{2 k + 1} - γ_{2 k} | < ϵ$ ;
5:: while Termination criterion not satisfied do
6:: $R_{2 k} = A X_{k} + Y_{k} B - C$ ;
7:: $γ_{2 k} = {∥ R_{2 k} ∥}_{\infty}$ ;
8:: for $j = 1$ to n do
9:: Extract the j-th column of $R_{2 k}$ : $r_{j}^{(2 k)}$ ;
10:: Solve the Chebyshev solution $h_{j}^{(k)}$ of $A h = - r_{j}^{(2 k)}$ ;
11:: Set the j-th column of $X_{k + 1}$ as $x_{j}^{(k + 1)} = x_{j}^{(k)} + h_{j}^{(k)}$ ;
12:: end for
13:: $R_{2 k + 1} = A X_{k + 1} + Y_{k} B - C$ ;
14:: $γ_{2 k + 1} = {∥ R_{2 k + 1} ∥}_{\infty}$ ;
15:: for $i = 1$ to m do
16:: Extract the i-th column of $R_{2 k + 1}^{T}$ : $s_{i}^{(2 k + 1)}$ ;
17:: Solve the Chebyshev solution $g_{i}^{(k)}$ of $B^{T} g = - s_{i}^{(2 k + 1)}$ ;
18:: Set the i-th column of $Y_{k + 1}^{T}$ as $y_{i}^{(k + 1)} = y_{i}^{(k)} + g_{i}^{(k)}$ ;
19:: end for
20:: $k \leftarrow k + 1$
21:: end while

Remark 67.

From Algorithm 2, it follows that

x_{j}^{(k + 1)}

are the Chebyshev solutions of

A x = c_{j} - Y_{k} b_{j} (j = 1, \dots, n)

and

y_{i}^{(k + 1)}

are the Chebyshev solutions of

B^{T} y = d_{i} - X_{k + 1}^{T} a_{i} (i = 1, \dots, m),

where

c_{j}, b_{j}, d_{i}

, and

a_{i}

are the columns of

C, B, C^{T}

and

A^{T}

, respectively. Moreover, one can derive that the sequence

{γ_{l}}

given by Algorithm 2 is convergent.

Theorem 68

(Theorems 5.2 and 5.3, [95]). Let A, B, and C be given in Equation (5).

(1): Suppose that (29) is satisfied. If

$Z_{*} = A X_{*} + Y_{*} B$

is a cluster point of the sequence ${Z_{2 k}}$ generated by Algorithm 2, then $X_{*}$ and $Y_{*}$ are a Chebyshev solution of Equation (5)
(2): Suppose that (28) is satisfied and matrix A satisfies Haar’s condition, i.e., $det (A_{i}) \neq 0$ for $i = 1, \dots, m$ , where $A_{i}$ is the matrix obtained from A by deletion of the i-th row. Then, the matrices $X_{1}$ and $Y_{1}$ generated by Algorithm 2 are the Chebyshev solution of Equation (5) for arbitrary initial matrices $X_{0}$ and $Y_{0}$ .

In 2008, using Kronecker products, Yang and Huang [274] derived the normwise backward errors of the approximate solutions of Equation (5) over

R

, as well as their upper and lower bounds.

Definition 18

([275]). Let

A \in R^{m \times m}

,

B \in R^{n \times n}

, and

C \in R^{m \times n}

, and let

\tilde{X}, \tilde{Y} \in R^{m \times n}

be a numerical solution of Equation (5). Put

\begin{matrix} η = \{(Δ A, Δ B, Δ C) ∣ (A + Δ A) \tilde{X} + \tilde{Y} (B + Δ B) = C + Δ C\} \\ and & η (\tilde{X}, \tilde{Y}) = min_{(Δ A, Δ B, Δ C) \in η} {∥[\frac{1}{θ_{1}} Δ A, \frac{1}{θ_{2}} Δ B, \frac{1}{θ_{3}} Δ C]∥}_{F}, \end{matrix}

where

θ_{1}, θ_{2}, θ_{3} > 0

are parameters. Then,

η (\tilde{X}, \tilde{Y})

is referred to as the relative backward error if

θ_{1} = {∥ A ∥}_{F}

,

θ_{2} = {∥ B ∥}_{F}

,

θ_{3} = {∥ C ∥}_{F}

; it is termed the absolute backward error if

θ_{1} = θ_{2} = θ_{3} = 1

.

Theorem 69

(Theorem 1, [274]). Under the hypotheses Definition 18, let

R = C - A \tilde{X} - \tilde{Y} B and T = [θ_{1} ({\tilde{X}}^{T} \otimes I_{m}), θ_{2} (I_{n} \otimes \tilde{Y}), - θ_{3} (I_{n} \otimes I_{m})] .

Then

\frac{{∥ R ∥}_{F}}{\sqrt{n θ_{3}^{2} + θ_{1}^{2} ∥ \tilde{X} ∥_{F}^{2} + θ_{2}^{2} {∥ \tilde{Y} ∥}_{F}^{2}}} \leq η (\tilde{X}, \tilde{Y}) = {∥ T^{†} Vec (R) ∥}_{2} \leq \frac{{∥ R ∥}_{F}}{\sqrt{θ_{3}^{2} + θ_{1}^{2} λ_{n} ({\tilde{X}}^{T} \tilde{X}) + θ_{1}^{2} λ_{m} (\tilde{Y} {\tilde{Y}}^{T})}},

where

Vec (\cdot)

denotes the column-wise vectorization of a matrix, and

λ_{n} (\cdot)

denotes the largest eigenvalue of a

n \times n

positive semidefinite matrix.

In 2011, Li et al. [276] extended the classical conjugate gradient least-squares algorithm (abbreviated as CGLSA) to compute the optimal solution of Equation (5) over

R

with symmetric pattern constraints, i.e., consider

min_{X \in {SR}^{m \times m}, Y \in {SR}^{n \times n}} {∥ A X + Y B - C ∥}_{F},

(106)

where

A, B, C \in R^{n \times m}

, and

{SR}^{m \times m}

denotes the set of all

m \times m

real symmetric matrices.

Theorem 70

(Theorems 1 and 2, [276]). Let

A, B, C \in R^{n \times m}

.

(1): For any arbitrary initial matrices $X^{(0)} \in {SR}^{m \times m}$ and $Y^{(0)} \in {SR}^{n \times n}$ , the matrix sequence ${X^{(k)}}$ and ${Y^{(k)}}$ generated by Algorithm 3 will converge to a solution of Problem (106) with finitely many steps in exact arithmetic.
(2): Let

$S = \{[X, Y] | X = \frac{1}{2} (A^{T} H + H^{T} A) and Y = \frac{1}{2} (H B^{T} + B H^{T})\}$

for arbitrary $H \in R^{n \times m}$ . If $[X^{(0)}, Y^{(0)}] \in S$ , then by Algorithm 3 we can obtain the unique least norm solution of Problem (106).

Remark 68.

Under the conditions of Theorem 70, if matrix pair

[X, Y] \in S

is a solution of Problem (106), then it is the unique least norm solution. Additionally, Li et al. (Theorem 3, [276]) characterized the minimization property of Algorithm 3, which ensures that the algorithm converges smoothly.

The classic alternating direction method (abbreviated as ADM) is an extension of the augmented Lagrangian multiplier method [277]. In 2017, inspired by [278], Ke and Ma [279] applied ADM to solve the nonnegative solutions of Equation (5) over

R

, i.e.,

find X (\geq 0) \in R^{s \times n} and Y (\geq 0) \in R^{m \times l} s . t . A X + Y B = C,

where

A \in R^{m \times s}

,

B \in R^{l \times n}

, and

C \in R^{m \times n}

. Note that Problem (107) is equivalent to the following quadratic programming problem:

min_{\begin{matrix} X \in R^{s \times n}, Y \in R^{m \times l} \end{matrix}} \frac{1}{2} {∥ A X + Y B - C ∥}_{F}^{2} s . t . X \geq 0 and Y \geq 0,

which is further equivalent to

min_{X, U \in R^{s \times n}, Y, V \in R^{m \times l}} \frac{1}{2} {∥ A X + Y B - C ∥}_{F}^{2} s . t . X = U, Y = V, U \geq 0, and V \geq 0 .

(107)

For

A = (a_{i j}) \in R^{m \times n}

, let

P_{+} (A) = [b_{i j}] \in R^{m \times n}

, where

b_{i j} = max {0, a_{i j}}

.

Definition 19

([279]). A point

(X, Y, U, V)

satisfies the KKT conditions for Problem (107), i.e., there exist the matrices Λ and Π such that

\begin{matrix} A^{T} (A X + Y B - C) + Λ = 0, (A X + Y B - C) B^{T} + Π = 0, X - U = 0, \\ Y - V = 0, Λ \leq 0 \leq U, Λ ⊙ U = 0, Π \leq 0 \leq V, Π ⊙ V = 0, \end{matrix}

where ⊙ denotes the component-wise multiplication.

Theorem 71

(Theorem 4.1 and Corollary 4.1, [279]). Let the sequence

{W_{k} = (X_{k}, Y_{k}, U_{k}, V_{k}, Λ_{k},

Π_{k})}

be generated by Algorithm 4.

(1): Suppose that ${W_{k}}$ satisfies

$\begin{matrix} lim_{k \to \infty} (W_{k + 1} - W_{k}) = 0 . \end{matrix}$

Then, any accumulation point $(X, Y, U, V, Λ, Π)$ of ${W_{k}}$ satisfies the KKT conditions for Problem (107).
(2): If ${W_{k}}$ converges, then it converges to a KKT point of Problem (107).

Algorithm 3 Extended CGLSA [276] for the real symmetric solution of Equation (5)

Require:: Given matrices: $A, B, C \in R^{n \times m}$ ; Initial matrices: $X^{(0)} \in {SR}^{m \times m}$ and $Y^{(0)} \in {SR}^{n \times n}$ ; Tolerance $ϵ > 0$ (for termination criterion);
Ensure:: The symmetric solution: $X^{(k)}$ and $Y^{(k)}$ ;
1:: The initial residual matrix: $R_{0} = C - A X^{(0)} - Y^{(0)} B$ ;
2:: $P_{0, 1} = \frac{1}{2} (A^{T} R_{0} + R_{0}^{T} A)$ and $P_{0, 2} = \frac{1}{2} (R_{0} B^{T} + B R_{0}^{T})$ ;
3:: $Q_{0, 1} = P_{0, 1}$ and $Q_{0, 2} = P_{0, 2}$ ;
4:: $M_{0} = A Q_{0, 1} + Q_{0, 2} B$ ;
5:: Set iteration index: $k = 0$ ;
6:: while $∥ P_{k, 1} ∥_{F}^{2} + {∥ P_{k, 2} ∥}_{F}^{2} > ϵ$ do
7:: Update iteration index: $k \leftarrow k + 1$
8:: $M_{k} = A Q_{k, 1} + Q_{k, 2} B$
9:: $α_{k} = \frac{∥ P_{k, 1} ∥_{F}^{2} + {∥ P_{k, 2} ∥}_{F}^{2}}{∥ M_{k} ∥_{F}^{2}}$
10:: Update solution matrices: $X^{(k + 1)} = X^{(k)} + α_{k} Q_{k, 1}$ and $Y^{(k + 1)} = Y^{(k)} + α_{k} Q_{k, 2}$ ;
11:: $R_{k + 1} = R_{k} - α_{k} M_{k}$ ;
12:: $P_{k + 1, 1} = P_{k, 1} - \frac{α_{k}}{2} (A^{T} M_{k} + M_{k}^{T} A)$ and $P_{k + 1, 2} = P_{k, 2} - \frac{α_{k}}{2} (M_{k} B^{T} + B M_{k}^{T})$ ;
13:: $β_{k} = \frac{∥ P_{k + 1, 1} ∥_{F}^{2} + {∥ P_{k + 1, 2} ∥}_{F}^{2}}{∥ P_{k, 1} ∥_{F}^{2} + {∥ P_{k, 2} ∥}_{F}^{2}}$ ;
14:: $Q_{k + 1, 1} = P_{k + 1, 1} + β_{k} Q_{k, 1}$ and $Q_{k + 1, 2} = P_{k + 1, 2} + β_{k} Q_{k, 2}$ ;
15:: end while
16:: Output the final solution: $X^{(k + 1)}$ and $Y^{(k + 1)}$ .

Algorithm 4 ADM [279] for the nonnegative solution of Equation (5) over

R

Require:: Given matrices: $A \in R^{m \times r}$ , $B \in R^{s \times n}$ , and $C \in R^{m \times n}$ ; Initial matrices: $Y_{0}, U_{0}, V_{0}, Λ_{0}, Π_{0} = 0$ ; Penalty parameters: $α, β > 0$ ; $γ \in (0, 1.618)$ ; Tolerance $ϵ > 0$ ;
Ensure:: The nonnegative solution: $X^{(k + 1)}$ and $Y^{(k + 1)}$ ;
1:: Set iteration index: $k = 0$ ;
2:: while $k < maxiter$ do
3:: $X_{k + 1} = {(A^{T} A + α I)}^{- 1} (A^{T} C + α U_{k} - A^{T} Y_{k} B - Λ_{k})$ ;
4:: $Y_{k + 1} = (C B^{T} + β V_{k} - A X_{k + 1} B^{T} - Π_{k}) {(B B^{T} + β I)}^{- 1}$ ;
5:: $U_{k + 1} = P_{+} (X_{k + 1} + \frac{1}{α} Λ_{k})$ ;
6:: $V_{k + 1} = P_{+} (Y_{k + 1} + \frac{1}{β} Π_{k})$ ;
7:: $Λ_{k + 1} = Λ_{k} + γ α (X_{k + 1} - U_{k + 1})$ ;
8:: $Π_{k + 1} = Π_{k} + γ β (Y_{k + 1} - V_{k + 1})$ ;
9:: if $∥ A X_{k + 1} + Y_{k + 1} {B - C ∥}_{F} \leq ϵ$ and $X_{k + 1}, Y_{k + 1} \geq 0$ then
10:: Output $X_{k + 1}, Y_{k + 1}$ and return
11:: end if
12:: $k \leftarrow k + 1$ ;
13:: end while

We observe that GSE has numerous generalizations, as discussed in Section 6. Consequently, the iterative algorithms for these generalized forms specialize to the corresponding results of GSE as special cases. Next, for convenience, we present the key algorithms related to these generalizations in an enumerated form.

(I)

In 2006, Peng et al. [280] showed an efficient iterative algorithm for solving the following matrix equation over

R

:

A X B + C Y D = E,

(108)

where X and Y are unknown. They noted that the algorithm can also be used to construct symmetric, antisymmetric, and bisymmetric solutions of (108) with only minor changes.

(II)

The condition number is an important topic in numerical analysis, characterizing the worst-case sensitivity of problems to input data perturbations. A large condition number indicates an ill-posed problem. Consider the following matrix equation:

\{\begin{matrix} A X - Y B = C, \\ D X - Y E = F, \end{matrix}

(109)

where X and Y are unknown.

(i): In 1996, Kågström and Poromaa [281] presented LAPACK-style algorithms and software for solving Equation (109) over $C$ .
(ii): In 2007, Lin and Wei [282] studied the perturbation analysis for Equation (109) over $R$ , explicitly deriving the expressions and upper bounds for normwise, mixed, and componentwise condition numbers.
(iii): In 2013, Diao et al. [283] developed the small sample statistical condition estimation algorithm to evaluate the normwise, mixed, and componentwise condition numbers of Equation (109) over $R$ . In [283], they also investigated the effective condition number for Equation (109) and derived sharp perturbation bounds using this condition number.

(III)

In 2008, Dehghan and Hajarian [284] proposed an iterative algorithm to solve the reflexive solutions of Equation (109) over

R

.

(IV)

In 2010, Dehghan and Hajarian [285] presented an iterative algorithm for solving the generalized bisymmetric solutions of the generalized coupled Sylvester matrix equation over

R

:

\{\begin{matrix} A X B + C Y D = M, \\ E X F + G Y H = N, \end{matrix}

(110)

where X and Y are unknown generalized bisymmetric matrices.

(V)

In 2012, inspired by the least-squares QR-factorization algorithm in [286], Li and Huang [287] proposed an iterative method to find the best approximate solution of Equation (110) over

R

, where unknown matrices X and Y are constrained to be symmetric, generalized bisymmetric, or

(R, S)

-symmetric.

(VI)

In 2018, inspired by [288,289], Lv and Ma (Section 3, [290]) proposed a parametric iterative algorithm for Equation (109) over

R

. Moreover, in (Section 4, [290]), they developed an accelerated iterative algorithm based on this parametric approach. Note that Ref. [289] is a monograph on iterative algorithms for the constrained solutions of matrix equations.

(VII)

Interestingly, in 2024, Ma et al. [291] proposed a Newton-type splitting iterative method for the coupled Sylvester-like absolute value equation

R

:

\{\begin{matrix} A_{1} X B_{1} + C_{1} | Y | D_{1} = E_{1}, \\ A_{2} Y B_{2} + C_{2} | X | D_{2} = E_{2}, \end{matrix}

where X and Y are unknown. Here,

| A |

means that each component of a matrix A is absolute-valued.

(VIII)

The algorithms in [292,293,294] suffer from parameter tuning challenges, particularly for large-scale problems. To address this limitation, in 2025, Shirilord and Dehghan [295] recently proposed an advanced gradient descent-based parameter-free method to solve Equation (109) over

R

.

In the final part of this section, we further enumerate some iterative algorithms for solving the generalizations of GSE, which involve an arbitrary number of unknown (or coefficient) matrices.

(A)

In 2005–2006, using the hierarchical identification principle, Ding and Chen [293,294] presented a large family of iterative methods for the more general form of Equation (5) over

R

, i.e.,

A_{i, 1} X_{1} B_{i, 1} + A_{i, 2} X_{2} B_{i, 2} + \dots + A_{i, p} X_{p} B_{i, p} = C_{i}, i = 1, 2, \dots, p,

(111)

where

X_{1}, X_{2}, \dots, X_{p}

are unknown. These iterative methods subsume the well-known Jacobi and Gauss–Seidel iterations. Subsequent scholars have conducted more extensive research on numerical algorithms for Equation (111).

(a): In 2010, based on the conjugate gradient method, Dehghan and Hajarian [296] constructed an iterative method for Equation (111) over $R$ with the generalized bisymmetric solutions $(X_{1}, X_{2}, \dots, X_{p})$ .
(b): In 2012, Dehghan and Hajarian [297] introduced two iterative methods for solving (111) over $R$ with the generalized centro-symmetric and central antisymmetric solutions $(X_{1}, X_{2}, \dots, X_{p})$ .
(c): In 2014, Hajarian [298] solved Equation (111) over $C$ by the matrix form of the conjugate gradients squared method.
(d): In 2016, Hajarian [299] presented the generalized conjugate direction algorithm for computing Equation (111) with the symmetric solutions $(X_{1}, X_{2}, \dots, X_{p})$ .
(e): In 2017, based on the Hestenes–Stiefel version of the biconjugate residual (BCR) algorithm, Hajarian [300] solved the generalized Sylvester matrix equation

$\sum_{i = 1}^{f} (A_{i} X B_{i}) + \sum_{j = 1}^{g} (C_{j} Y D_{j}) = E$

over $R$ with the generalized reflexive solutions $(X, Y)$ . In 2018, Lv and Ma [301] introduced another Hestenes–Stiefel version of BCR method for computing the centrosymmetric or anti-centrosymmetric solutions of Equation (111) over $R$ .
(f): In 2018, inspired by [302], Sheng [292] proposed a relaxed gradient-based iterative (abbreviated as RGI) algorithm to solve Equation (109), and further generalized this algorithm to Equation (111). Moreover, numerical examples in [292] demonstrate that the RGI algorithm outperforms the iterative algorithm in [294] in terms of speed, elapsed time, and iterative steps.
(g)): In 2018, Hajarian [303] extended the Lanczos version of BCR algorithm to find the symmetric solutions $(X, Y, Z)$ of the matrix equation over $R$ :

$A_{i} X B_{i} + C_{i} Y D_{i} + E_{i} Z F_{i} = G_{i}, i = 1, 2, \dots, t .$

In 2020, Yan and Ma [304] also used the Lanczos version of BCR algorithm to study Equation (111) over $R$ with the (anti-)reflexive solutions $(X_{1}, X_{2}, \dots, X_{p})$ .

(B)

In 2009, from an optimization perspective, Zhou et al. [305] developed a novel iterative method for solving Equation (111) over

R

and its more general form, i.e.,

\sum_{j = 1}^{s_{i, 1}} A_{i, 1, j} X_{1} B_{i, 1, j} + \sum_{j = 1}^{s_{i, 2}} A_{i, 2, j} X_{2} B_{i, 2, j} + \dots + \sum_{j = 1}^{s_{i, p}} A_{i, p, j} X_{p} B_{i, p, j} = C_{i}, i = 1, 2, \dots, p

with unknown

X_{1}, X_{2}, \dots, X_{p}

, which contains iterative methods in [293,294] as special cases. In 2015, by extending the generalized product biconjugate gradient algorithms, Hajarian gave [306] four effective matrix algorithms for the coupled matrix equation over

R

:

\sum_{j = 1}^{l} (A_{i, 1, j} X_{1} B_{i, 1, j} + A_{i, 2, j} X_{2} B_{i, 2, j} + \dots + A_{i, l, j} X_{l} B_{i, l, j}) = D_{i}, i = 1, 2, \dots, l,

where

X_{1}, X_{2}, \dots, X_{l}

are unknown.

(C)

In 2011, Wu et al. [307] constructed an iterative algorithm to solve the coupled Sylvester-conjugate matrix equation over

C

:

\sum_{j = 1}^{p} (A_{i j} X_{j} B_{i j} + C_{i j} {\bar{X}}_{j} D_{i j}) = F_{i}, i = 1, 2, \dots, n,

where

X_{1}, X_{2}, \dots, X_{p}

are unknown. In 2021, inspired by [307], Yan and Ma [308] proposed an iterative algorithm for the generalized Hamiltonian solutions of the generalized coupled Sylvester-conjugate matrix equations over

C

:

\{\begin{matrix} \sum_{j = 1}^{q} (A_{i j} X_{j} B_{i j} + C_{i j} {\bar{Y}}_{j} D_{i j}) = M_{i}, \\ \sum_{j = 1}^{q} (E_{i j} Y_{j} F_{i j} + G_{i j} {\bar{X}}_{j} H_{i j}) = N_{i}, \end{matrix}

where

i = 1, \dots, p

, and

X_{j}

and

Y_{j}

(

j = 1, \dots, q

) are unknown generalized Hamiltonian matrices.

(D)

In 2015, inspired by [309,310], Hajarian [311] obtained an iterative method for the coupled Sylvester-transpose matrix equations over

R

:

\{\begin{matrix} \sum_{k = 1}^{l} (A_{1, k} X B_{1, k} + C_{1, k} X^{T} D_{1, k} + E_{1, k} Y F_{1, k}) = M_{1}, \\ \sum_{k = 1}^{l} (A_{2, k} X B_{2, k} + C_{2, k} X^{T} D_{2, k} + E_{2, k} Y F_{2, k}) = M_{2} \end{matrix}

with unknown X and Y, by developing the biconjugate A-orthogonal residual and the conjugate A-orthogonal residual squared methods. Based on this developed method, Hajarian [311] also considered the coupled periodic Sylvester matrix equations over

R

:

\{\begin{matrix} A_{1, j} X_{j} B_{1, j} + C_{1, j} X_{j + 1} D_{1, j} + E_{1, j} Y_{j} F_{1, j} = M_{1, j}, \\ A_{2, j} X_{j} B_{2, j} + C_{2, j} X_{j + 1} D_{2, j} + E_{2, j} Y_{j} F_{2, j} = M_{2, j}, \end{matrix} for j = 1, 2, \dots,

where

X_{j}

and

Y_{j}

are unknown periodic matrices with a period.

(E)

Discrete-time periodic matrix equations are an important tool for analyzing and designing periodic systems [312]. More related studies are as follows:

(a): In 2017, Hajarian [313] introduced a generalized conjugate direction method for solving the general coupled Sylvester discrete-time periodic matrix equations over $R$ :

$\{\begin{matrix} \sum_{j = 1}^{m} (A_{i j} X_{i} B_{i j} + C_{i j} X_{i + 1} D_{i j} + E_{i j} Y_{i} F_{i j} + G_{i j} Y_{i + 1} H_{i j}) = M_{i}, & i = 1, 2, \dots, \\ \sum_{j = 1}^{m} ({\hat{A}}_{i j} X_{i} {\hat{B}}_{i j} + {\hat{C}}_{i j} X_{i + 1} {\hat{D}}_{i j} + {\hat{E}}_{i j} Y_{i} {\hat{F}}_{i j} + {\hat{G}}_{i j} Y_{i + 1} {\hat{H}}_{i j}) = {\hat{M}}_{i}, & i = 1, 2, \dots, \end{matrix}$

where $X_{i}$ and $Y_{i}$ are unknown periodic matrices with a period.
(b): In 2022, Ma and Yan [314] proposed a modified conjugate gradient algorithm for solving the general discrete-time periodic Sylvester matrix equations over $R$ :

$\sum_{j = 1}^{h} (A_{i j} X_{i} B_{i j} + C_{i j} X_{i + 1} D_{i j} + E_{i j} Y_{i} F_{i j} + G_{i j} Y_{i + 1} H_{i j}) = M_{i}, i = 1, 2, \dots, T,$

where $X_{i}$ and $Y_{i}$ are unknown periodic matrices of period T.

(F)

Interestingly, in 2014, Dehghani-Madiseh and Dehghan [315] presented the generalized interval Gauss–Seidel iteration method for the outer estimation of the AE-solution set of the interval generalized Sylvester matrix equation over

R

:

\sum_{i = 1}^{p} A_{i} X_{i} + \sum_{j = 1}^{q} Y_{j} B_{j} = C,

where

X_{i}

(

i = 1, \dots, p

) and

Y_{j}

(

j = 1, \dots, q

) are unknown interval matrices.

(G)

In 2018, Hajarian [316] established the biconjugate residual algorithm for solving the matrix equation over

R

:

\sum_{i = 1}^{s} A_{i} X B_{i} + \sum_{j = 1}^{t} C_{j} Y D_{j} = M,

where X and Y are the unknown generalized reflexive and anti-reflexive matrices, respectively.

(H)

In 2022, based on Kronecker product approximations, Li et al. [317] established a preconditioned modified conjugate residual method for solving the following tensor equation over

R

:

\{\begin{matrix} X_{1} \times_{1} A_{11} + X_{2} \times_{2} A_{12} + \dots + X_{n - 1} \times_{n - 1} A_{1 (n - 1)} + X_{n} \times_{n} A_{1 n} = B_{1}, \\ X_{2} \times_{1} A_{21} + X_{3} \times_{2} A_{22} + \dots + X_{n} \times_{n - 1} A_{2 (n - 1)} + X_{1} \times_{n} A_{2 n} = B_{2}, \\ ⋮ ⋮ ⋮ \\ X_{n} \times_{1} A_{n 1} + X_{1} \times_{2} A_{n 2} + \dots + X_{n - 2} \times_{n - 1} A_{n (n - 1)} + X_{n - 1} \times_{n} A_{n n} = B_{n}, \end{matrix}

where

X_{1}, \dots, X_{n}

are unknown. Here,

A \times_{k} B

denotes the k-mode product [182] of a tensor

A \in R^{I_{1} \times I_{2} \times \dots \times I_{n}}

and a matrix

B \in R^{m \times I_{k}}

.

Remark 69.

We believe that the iterative algorithms for numerical solutions of GSE and its generalizations summarized in this section can also provide certain inspiration and guidance for the study of numerical solutions to other linear equations.

8. Applications to GSE

In this section, we mainly introduce several applications of GSE in both theoretical and practical problems. It is also worth noting that the generalizations of GSE find applications in pole and eigenstructure assignment [318], scalar functional observer design [319], robots and acoustic source localization [320], pseudo-differential system [321], control theory [249], parametric control [322], model reference tracking control [323], etc.

8.1. Theoretical Applications

8.1.1. Solvability of Matrix Equations

In 1988, Wimmer [223] utilized the solvability condition of the polynomial matrix form of Equation (1) (i.e., Theorem 55) to give a necessary and sufficient condition for the consistency of the matrix equation over a field

F

:

X - A X B = C,

(112)

where

A \in F^{p \times p}

,

B \in F^{q \times q}

, and

C \in F^{p \times q}

.

Theorem 72

(Theorem 2, [223]). Equation (112) has a solution

X \in F^{p \times q}

if and only if there exist nonsingular matrices

S, R \in F^{(p + q) \times (p + q)}

such that

S (λ [\begin{matrix} I & 0 \\ 0 & B \end{matrix}] + [\begin{matrix} A & C \\ 0 & I \end{matrix}]) R = (λ [\begin{matrix} I & 0 \\ 0 & B \end{matrix}] + [\begin{matrix} A & 0 \\ 0 & I \end{matrix}]) .

Remark 70.

The core of the proof of Theorem 72 is that

(112) \Leftrightarrow \{\begin{matrix} X - A Y = C, \\ Y = X B, \end{matrix} \Leftrightarrow (A + λ I) Y - X (I + λ B) = - C .

Furthermore, using a special polynomial matrix form of Equation (5), Huang and Liu [237] considered the solvability condition for the matrix equation

\sum_{i = 0}^{k} A^{i} X B_{i} = C

(113)

with unknown X over a ring with identity.

Theorem 73

(Theorems 1 and 2, [237]). Let

F

be a ring with identity, and let

A \in F^{n \times n}

,

B_{i} \in F^{m \times q} (i = 0, 1, \dots, k)

, and

C \in F^{n \times q}

. Denote

B (λ) = \sum_{i = 0}^{k} B_{i} λ^{i} \in F^{m \times q} [λ] .

(1): Then, Equation (113) is solvable if and only if the polynomial matrix equation

$(λ I - A) X (λ) + Y (λ) B (λ) = C$

has a solution pair $X (λ) \in F^{n \times q} [λ]$ and $Y (λ) \in F^{n \times m} [λ]$ .
(2): Suppose that $F$ is a division ring and A is algebraic (or $F$ is a finitely generated as module over its center). Then, Equation (113) is solvable if and only if

$[\begin{matrix} λ I - A & - C \\ 0 & B (λ) \end{matrix}] and [\begin{matrix} λ I - A & 0 \\ O & B (λ) \end{matrix}]$

are equivalent.

Remark 71.

(1): Theorem 73 remains valid when $F$ is a finite dimensional central simple algebra over a field (see Theorems 3 and 4, [235]).
(2): In Corollaries 1–3 of [237], Huang and Liu indicated that relevant results regarding the solvability of the equations $A X - X B = C$ , $X - A X B = C$ , and $A X B = C$ can be directly derived by Theorem 73.

8.1.2. UTV Decomposition of Dual Matrices

In 2024, Xu et al. [324] presented the UTV decomposition of dual complex matrices based on the solvability conditions and general solution representations of Equation (1) over

C

(i.e., Theorem 3).

For

a_{s}, a_{i} \in C

,

a = a_{s} + a_{i} ε

represents a dual complex number, where

ε

is the dual unit given in (60). The set of all dual complex numbers is denoted by

DC

. For

A = A_{s} + A_{i} ε \in {DC}^{n \times p}

, A has unitary columns if

n \geq p

and

A^{*} A = I_{n}

, where

A^{*} = A_{s}^{*} + A_{i}^{*} ε

is the conjugate transpose of A. For

A = A_{s} + A_{i} ε \in {DC}^{n \times n}

, A is unitary if

A^{*} A = A A^{*} = I_{n}

; A is diagonal if both

A_{s}

and

A_{i}

are diagonal; and A is nonsingular if

A A^{- 1} = A^{- 1} A = I_{n}

for some

A^{- 1} \in {DC}^{n \times n}

.

We say that

A = A_{s} + A_{i} ε \in {D C}^{m \times n}

has the UTV decomposition [324] if

A = U T V^{*},

(114)

where

U = U_{s} + U_{i} ε \in {DC}^{m \times k}

has unitary columns,

T = T_{s} \in C^{k \times k}

is triangular and nonsingular, and

V = V_{s} + V_{i} ε \in {DC}^{n \times k}

has a unitary standard part

V_{s}

.

Theorem 74

(Theorem 3.1, [324]). Let

A = A_{s} + A_{i} ε \in {D C}^{m \times n}

. Assume that the UTV decomposition of

A_{s}

is given by

A_{s} = U_{s} T_{s} V_{s}^{*},

where both

U_{s} \in C^{m \times k}

and

V_{s} \in C^{n \times k}

have unitary columns, and

T_{s} \in C^{k \times k}

is triangular and nonsingular. Then, the UTV decomposition of A exists if and only if

(I_{m} - U_{s} U_{s}^{*}) A_{i} (I_{n} - V_{s} V_{s}^{*}) = 0,

in which case,

U_{i} = (I_{m} - U_{s} U_{s}^{*}) A_{i} V_{s} T_{s}^{- 1} + U_{s} P and V_{i} = A_{i}^{*} U_{s} {(T_{s}^{- 1})}^{*} - V_{s} T_{s}^{*} P^{*} {(T_{s}^{- 1})}^{*},

where

P \in C^{k \times k}

is arbitrary skew-Hermitian matrix.

Remark 72.

The proof of Theorem 3.1 in [324] shows that the pivotal step in proving Theorem 74 is that A has UTV decomposition if and only if the matrix equation

U_{i} T_{s} V_{s}^{*} + U_{s} T_{s} V_{i}^{*} = A_{i}

is consistent for unknown

U_{i} \in {DC}^{m \times k}

and

V_{i} \in {DC}^{n \times k}

.

8.1.3. Microlocal Triangularization of Pseudo-Differential Systems

In 2013, using the solvability of Equation (1) over

C

, Kiran [321] constructed a recursive scheme to factorize a pseudo-differential system into lower and upper triangular systems (LU factorization) independent of lower order terms.

Let

O P S_{N}^{0} (Ω)

be the set of all

N \times N

pseudo-differential systems of order 0 defined on an open subset

Ω

of

R^{n}

. In addition, the notation and terminology in this subsection follow that in [325,326].

Definition 20

(Definition 2.2, [321]). A matrix valued operator

A \in O P S_{N}^{0} (Ω)

admits

L U

factorization if

A = L U,

where

L \in O P S_{N}^{0} (Ω)

is an elliptic lower triangular matrix whose principal symbol has the identity on the diagonal entries, and

U \in O P S_{N}^{0} (Ω)

is an upper triangular matrix.

Theorem 75

(Theorem 2.3, [321]). Let

λ_{1} (x, ξ), \dots, λ_{N} (x, ξ)

be N sections of eigenvalues of the principal symbol of

A \in O P S_{N}^{0} (Ω)

, including multiplicities, in a conic neighborhood Γ of

(x_{0}, ξ_{0}) \in T^{*} Ω ∖ {0}

. If

λ_{i}^{- 1} (0) \cap λ_{j}^{- 1} (0) \cap Γ = \emptyset, i \neq j,

then A is microlocally triangularizable in Γ independent of lower order terms. Moreover, the system admits

L U

factorization independent of lower order terms in Γ if and only if the principal symbol of A admits an

L U

factorization where the first

N - 1

eigenvalues of the upper triangular matrix do not vanish in Γ.

Remark 73.

(1): In Sections 3.3 and 3.4 of [321], Kiran showed that the triangularization scheme in Theorem 75 can also be applied to symbolic hierarchies.
(2): Lemma 2.5 of [321] shows that Equation (1) over $C$ has a unique solution X if and only if A or B is nonsingular. However, there is a simple counterexample to its sufficiency. Indeed, if both A and B are identity matrices (and thus nonsingular), the solution X of Equation (1) is obviously not unique for a given C. For instance, take $X = C$ and $Y = 0$ , or $X = 2 C$ and $Y = C$ . This minor error, however, does not affect the existence of solutions to Equation (1).

8.2. Practical Applications

8.2.1. Calibration Problems

The late 1970s to early 1980s witnessed a surge of interest in deploying robotic manipulators for automated manufacturing. However, integrating robots as core components in flexible manufacturing systems remained challenging across many industrial applications, prompting extensive research into the manipulator calibration [327].

Inspired by [328,329], Zhuang et al. [330] first solved a specialized form of Equation (1) to address a robot calibration problem: the calibration of the robot/world (i.e., the BASE transformation) and tool/flange (i.e., the TOOL transformation). As a matter of fact, robot manipulator calibration refers to the procedure of enhancing a robot manipulator’s accuracy by adjusting its control software.

Figure 1 provides a schematic illustration of the geometry of a robotic cell. The world coordinate frame serves as an external reference frame. The base coordinate frame is defined within the robot structure. The flange coordinate frame is defined on the mounting surface of the robot end effector. The tool frame is positioned at a point inside the end effector.

Then, the robot kinematic model can be transformed into the following form:

A X = Y B,

(115)

where

(i): $A$ is the known homogeneous transformation from end effector pose measurements,
(ii): $B$ is derived from the calibrated manipulator internal-link forward kinematics;
(iii): $X$ is the unknown transformation from the tool frame to the flange frame;
(iv): $Y$ is the unknown transformation from the world frame to the base frame.

Assume that there are n pose measurements for

i = 1, 2, \dots n

. Thus, the calibration problem is reduced to solving the system of equations

A_{i} X = Y_{i} B, i = 1, 2, \dots n .

Subsequently, Zhuang et al. elaborated on the solution of Equation (115) using the rotational properties of quaternions in Section 3 of [330].

Subsequently, the robot manipulator calibration problem is further optimized by considering Equation (115) through different approaches: dual quaternion method [331,332,333,334,335], new hybrid calibration method [336], least-squares approach [337], Kronecker product method [338], 3D position measurements [339], nonlinear optimization and evolutionary computation [340], 2D positional features [341], dual Lie algebra [342], symbolic method [343], linear matrix inequality and semi-definite programming optimization [344], probabilistic framework [345], transference principle [346], etc.

8.2.2. Encryption and Decryption Schemes for Color Images

The RGB (red, green, blue) color channels can be directly mapped to the imaginary parts (

i

,

j

,

k

) of a pure imaginary quaternion matrix. Naturally, a dual quaternion matrix can represent two color images since both its standard and infinitesimal parts are quaternion matrices. By solving Equation (1) over

DH

(i.e., Theorem 43), Xie et al. (Section 4, [159]) proposed the encryption and decryption schemes for color images, as shown in Figure 2.

The specific encryption and decryption processes for color images are presented in Algorithms 5 and 6, respectively. To ensure the uniqueness of the decrypted color images

({\hat{X}}_{0}, {\hat{X}}_{1})

output by Algorithmic 6, the encryption dual quaternion matrices A and B must be restricted to satisfy Condition P, i.e., both the standard parts and the infinitesimal parts of A and B are of either full row rank or full column rank.

Algorithm 5 Color image encryption scheme

1:: Input: two original color images $X_{0}$ and $X_{1}$ , two color images $Y_{0}$ and $Y_{1}$ as keys, and two encryption dual quaternion matrices A and B satisfying Condition P;
2:: Output: two encrypted color images $C_{0}$ and $C_{1}$ by

$A (X_{0} + X_{1} ε) - (Y_{0} + Y_{1} ε) B = C_{0} + C_{1} ε .$

Algorithm 6 Color image decryption scheme

1:: Input: the encryption matrices A and B, the keys $Y_{0}$ and $Y_{1}$ , and the encrypted color images $C_{0}$ and $C_{1}$ from Algorithm 5.
2:: Output: two decrypted color images ${\hat{X}}_{0}$ and ${\hat{X}}_{1}$ by (61) and (62) in Theorem 43.

As illustrated in Figure 3, Xie et al. [159] selected two color images (i.e., Bike 1 and Bike 2) as the objects to be encrypted, and another two color images (i.e., Sunflower and Big-Windmill) as the keys. They first encrypted Bike 1 and Bike 2 using Algorithm 5, obtaining the two encrypted color images (see Figure 4). Subsequently, Algorithm 6 was applied to decrypt the encrypted images, resulting in the two decrypted color images (see Figure 5). Moreover, the structural similarity index measures (SSIM) of both decrypted images are shown in Table 2, which fully demonstrates the high effectiveness of their decryption scheme.

Remark 74.

Recently, research on encrypting and decrypting color images and color videos by solving matrix or tensor equations has attracted attention in [203,204,347].

9. Conclusions

Research on GSE and its generalizations has long been a vibrant field, boasting both profound theoretical value and extensive practical application prospects. This comprehensive review encompasses 75 theorems, 74 remarks, and 347 references spanning from 1844 to 2025, covering pure mathematics (linear algebra, abstract algebra, operators, tensors, semi-tensor products, polynomial matrices, etc.), computational mathematics (iterative algorithms, condition numbers, etc.), and applied mathematics (robotics, image processing, encryption/decryption schemes, etc.). Centered on solving GSE, this paper elaborates on five dimensions—methods, constraints, generalizations, algorithms, and applications—distilling the field’s essence through point-by-point analysis and synthesis. A network diagram (i.e., Figure 6) intuitively illustrates the GSE research’s core framework.

As shown in Figure 6, let us start with “

A X - Y B = C

(GSE)” in the red box, which is the core research object of the entire paper. The orange boxes show the various generalizations of GSE in different mathematical fields, which greatly enriches the relevant research. Then, looking down, the red box “SOLUTIONS” represents the main problem discussed in GSE and its generalizations, that is, the problem of solving equations, which is the core of the entire research. The green box on the right imposes the relevant constraint on the solution of the equation due to the demand for special matrices or specific conditions in practical applications. Further down, the purple box is a further elaboration on solving the equation: first, discuss the solvability conditions of the relevant equation; in the solvable case, study the representation of its analytical solution (explicit solution); in the unsolvable case, consider the analytical solution of its best approximation solution (minimum-norm least squares solution); finally, when the relevant analytical solution cannot be obtained, or when constrained by factors such as computational complexity, it is necessary to further study the numerical solution of related problems by designing iterative algorithms and continuously enhance computational efficiency. The blue box on the left contains various methods for tackling related solving problems, and also different approaches that can be chosen to solve the same problem. The gray box at the bottom is the purpose or end-point of all previous theoretical research, that is, to serve applications, which includes two aspects: theoretical applications and practical applications.

In the narrative of this paper, we have provided detailed introductions, in the form of remarks, to several interesting problems worthy of further exploration. For the convenience of readers, we have hereby compiled Table 3 to briefly summarize these problems.

Numerous researchers worldwide have made significant contributions to GSE-related studies. Owing to the authors’ limitations, we have not been able to cover all GSE research findings, and there may inevitably be inadequacies. We offer our apologies here. Moreover, as GSE research advances rapidly with continuous innovations, this paper’s framework and content will grow increasingly rich and substantial as developments unfold. Finally, we kindly request experts and readers to offer their valuable insights, assisting us in further refining the summarization work in this field.

Author Contributions

Conceptualization, Q.-W.W. and J.G.; Methodology, Q.-W.W. and J.G.; Investigation, Q.-W.W. and J.G.; Writing—Original Draft Preparation, Q.-W.W. and J.G.; Writing—Review and Editing, Q.-W.W. and J.G.; Supervision, Q.-W.W.; Funding Acquisition, Q.-W.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China grant number [12371023].

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

The authors sincerely appreciate the editor and the anonymous reviewers for their insightful comments and valuable suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Sylvester, J.J. Sur l’équation en matrices px = xq. C. R. Acad. Sci. Paris 1884, 99, 67–71, 115–116. (In French) [Google Scholar]
Bhatia, R.; Rosenthal, P. How and why to solve the operator equation AX − XB = Y. Bull. Lond. Math. Soc. 1997, 29, 1–21. [Google Scholar]
Wang, Q.W.; Xie, L.M.; Gao, Z.H. A survey on solving the matrix equation AXB=C with applications. Mathematics 2025, 13, 450. [Google Scholar] [CrossRef]
Wang, Q.W.; Gao, Z.H.; Gao, J. A comprehensive review on solving the system of equations AX = C and XB = D. Symmetry 2025, 17, 625. [Google Scholar]
Wang, Q.W.; Gao, Z.H.; Li, Y.F. An overview of methods for solving the system of matrix equations A₁XB₁=C₁ and A₂XB₂=C₂. Symmetry 2025, 17, 1307. [Google Scholar] [CrossRef]
Roth, W.E. The equations AX − YB = C and AX − XB = C in matrices. Proc. Am. Math. Soc. 1952, 3, 392–396. [Google Scholar]
Rodman, L. Topics in Quaternion Linear Algebra; Princeton University Press: Princeton, NJ, USA, 2014. [Google Scholar]
Horn, R.A.; Zhang, F. A generalization of the complex Autonne-Takagi factorization to quaternion matrices. Linear Multilinear Algebra 2012, 60, 1239–1244. [Google Scholar] [CrossRef]
Penrose, R. A generalized inverse for matrices. Math. Proc. Camb. Philos. Soc. 1955, 51, 406–413. [Google Scholar] [CrossRef]
Ben-Israel, A.; Greville, T.N.E. Generalized Inverses: Theory and Applications, 2nd ed.; Springer: New York, NY, USA, 2003. [Google Scholar]
Flanders, H.; Wimmer, H.K. On the matrix equations AX − XB = C and AX − YB = C. SIAM J. Appl. Math. 1977, 32, 707–710. [Google Scholar]
Dmytryshyn, A.; Kågström, B. Coupled Sylvester-type matrix equations and block diagonalization. SIAM J. Matrix Anal. Appl. 2015, 36, 580–593. [Google Scholar] [CrossRef]
Dmytryshyn, A.; Futorny, V.; Klymchuk, T.; Sergeichuk, V.V. Generalization of Roth’s solvability criteria to systems of matrix equations. Linear Algebra Appl. 2017, 527, 294–302. [Google Scholar] [CrossRef]
Rao, C.R.; Mitra, S.K. Generalized Inverse of Matrices and Its Applications; Wiley: New York, NY, USA, 1971. [Google Scholar]
Wang, G.; Wei, Y.; Qiao, S. Generalized Inverses: Theory and Computations; Springer: Singapore, 2018. [Google Scholar]
Cvetković-Ilixcx, D.S.; Wei, Y. Algebraic Properties of Generalized Inverses; Springer: Singapore, 2017. [Google Scholar]
Zhang, D.; Zhao, Y.; Mosić, D. The generalized Drazin inverse of the sum of two elements in a Banach algebra. J. Comput. Appl. Math. 2025, 470, 116701. [Google Scholar] [CrossRef]
Gao, Z.H.; Wang, Q.W.; Xie, L.M. A novel Moore–Penrose inverse to dual quaternion matrices with applications. Appl. Math. Lett. 2026, 172, 109727. [Google Scholar] [CrossRef]
Baksalary, J.K.; Kala, R. The matrix equation AX − YB = C. Linear Algebra Appl. 1979, 25, 41–43. [Google Scholar] [CrossRef]
Marsaglia, G.; Styan, G.P.H. Equalities and inequalities for ranks of matrices. Linear Multilinear Algebra 1974, 2, 269–292. [Google Scholar] [CrossRef]
van der Woude, J.W. Almost non-interacting control by measurement feedback. Syst. Control Lett. 1987, 9, 7–16. [Google Scholar] [CrossRef]
Olshevsky, V. Similarity of block diagonal and block triangular matrices. Integral Equ. Oper. Theory 1992, 15, 853–863. [Google Scholar] [CrossRef]
Meyer, C.D., Jr. Generalized inverses of block triangular matrices. SIAM J. Appl. Math. 1970, 19, 741–750. [Google Scholar] [CrossRef]
Golub, G.H.; Loan, C.F.V. Matrix Computations; The Johns Hopkins University Press: Baltimore, MD, USA, 1983. [Google Scholar]
Chu, K.E. Singular value and generalized singular value decompositions and the solution of linear matrix equations. Linear Algebra Appl. 1987, 88–89, 83–98. [Google Scholar] [CrossRef]
Paige, C.C.; Saunders, M.A. Towards a generalized singular value decomposition. SIAM J. Numer. Anal. 1981, 18, 398–405. [Google Scholar] [CrossRef]
Xu, G.; Wei, M.; Zheng, D. On solutions of matrix equation AXB + CYD = F. Linear Algebra Appl. 1998, 279, 93–109. [Google Scholar] [CrossRef]
Golub, G.H.; Zha, H. Perturbation analysis of the canonical correlations of matrix pairs. Linear Algebra Appl. 1994, 210, 3–28. [Google Scholar] [CrossRef]
Wang, Q.W.; van Woude, J.W.; Yu, S.W. An equivalence canonical form of a matrix triplet over an arbitrary division ring with applications. Sci. China-Math. 2011, 54, 907–924. [Google Scholar] [CrossRef]
He, Z.H.; Wang, Q.W.; Zhang, Y. A simultaneous decomposition for seven matrices with applications. J. Comput. Appl. Math. 2019, 349, 93–113. [Google Scholar] [CrossRef]
He, Z.H.; Agudelo, O.M.; Wang, Q.W.; Moor, B.D. Two-sided coupled generalized Sylvester matrix equations solving using a simultaneous decomposition for fifteen matrices. Linear Algebra Appl. 2016, 496, 549–593. [Google Scholar] [CrossRef]
Gustafson, W.H. Quivers and matrix equations. Linear Algebra Appl. 1995, 231, 159–174. [Google Scholar] [CrossRef]
Gabriel, P. Unzerlegbare Darstellungen I. Manuscripta Math. 1972, 6, 71–103. [Google Scholar] [CrossRef]
He, Z.H.; Dmytryshyn, A.; Wang, Q.W. A new system of Sylvester-like matrix equations with arbitrary number of equations and unknowns over the quaternion algebra. Linear Multilinear Algebra 2025, 73, 1269–1309. [Google Scholar] [CrossRef]
Wang, Q.W.; Zhang, X.; Woude, J.W.v. A new simultaneous decomposition of a matrix quaternity over an arbitrary division ring with applications. Commun. Algebra 2012, 40, 2309–2342. [Google Scholar] [CrossRef]
He, Z.H.; Wang, Q.W.; Zhang, Y. The complete equivalence canonical form of four matrices over an arbitrary division ring. Linear Multilinear Algebra 2018, 66, 74–95. [Google Scholar] [CrossRef]
He, Z.H.; Xu, Y.Z.; Wang, Q.W.; Zhang, C.Q. The equivalence canonical forms of two sets of five quaternion matrices with applications. Math. Meth. Appl. Sci. 2025, 48, 5483–5505. [Google Scholar]
Huo, J.W.; Xu, Y.Z.; He, Z.H. A simultaneous decomposition for a quaternion tensor quaternity with applications. Mathematics 2025, 13, 1679. [Google Scholar] [CrossRef]
Flaut, C.; Shpakivskyi, V. Real matrix representations for the complex quaternions. Adv. Appl. Clifford Algebras 2013, 23, 657–671. [Google Scholar] [CrossRef]
Liu, Y.H. Ranks of solutions of the linear matrix equation AX + YB = C. Comput. Math. Appl. 2006, 52, 861–872. [Google Scholar] [CrossRef]
Zhang, F.; Mu, W.; Li, Y.; Zhao, J. Special least squares solutions of the quaternion matrix equation AXB + CXD = E. Comput. Math. Appl. 2016, 72, 1426–1435. [Google Scholar] [CrossRef]
Yuan, S. Least squares pure imaginary solution and real solution of the quaternion matrix equation AXB + CXD = E with the least norm. J. Appl. Math. 2014, 2014, 857081. [Google Scholar] [CrossRef]
Liu, X.; Zhang, Y. Matrices over quaternion algebras. In Matrix and Operator Equations and Applications; Moslehian, M.S., Ed.; Springer: Cham, Switzerland, 2023; pp. 139–183. [Google Scholar]
Wang, G.; Guo, Z.; Zhang, D.; Jiang, T. Algebraic techniques for least-squares problem over generalized quaternion algebras: A unified approach in quaternionic and split quaternionic theory. Math. Meth. Appl. Sci. 2020, 43, 1124–1137. [Google Scholar] [CrossRef]
Yu, C.; Liu, X.; Zhang, Y. The generalized quaternion matrix equation AXB + CX^★D = E. Math. Meth. Appl. Sci. 2020, 43, 8506–8517. [Google Scholar] [CrossRef]
Ren, B.Y.; Wang, Q.W.; Chen, X.Y. The η-anti-Hermitian solution to a constrained matrix equation over the generalized segre quaternion algebra. Symmetry 2023, 15, 592. [Google Scholar] [CrossRef]
Wei, M.S.; Li, Y.; Zhang, F.; Zhao, J. Quaternion Matrix Computations; Nova Science Publishers: New York, NY, USA, 2018. [Google Scholar]
Stanimirovic, P.S. General determinantal representation of pseudoinverses of matrices. Mat. Vesn. 1996, 48, 1–9. [Google Scholar]
Kyrchei, I.I. Analogs of the adjoint matrix for generalized inverses and corresponding Cramer rules. Linear Multilinear Algebra 2008, 56, 453–469. [Google Scholar] [CrossRef]
Drazin, M.P. Pseudo-inverses in associative rings and semigroups. Am. Math. Mon. 1958, 65, 506–514. [Google Scholar] [CrossRef]
Aslaksen, H. Quaternionic determinants. Math. Intell. 1996, 18, 57–65. [Google Scholar] [CrossRef]
Cohen, N.; Leo, S.D. The quaternionic determinant. Electron. J. Linear Algebra 2000, 7, 100–111. [Google Scholar] [CrossRef]
Kyrchei, I.I. Cramer’s rule for quaternionic systems of linear equations. J. Math. Sci. 2008, 155, 839–858. [Google Scholar] [CrossRef]
Kyrchei, I.I. The theory of the column and row determinants in a quaternion linear algebra. In Advances in Mathematics Research; Baswell, A.R., Ed.; Nova Science Publisher: New York, NY, USA, 2012; Volume 15, pp. 301–358. [Google Scholar]
Kyrchei, I.I. Cramer’s rule for some quaternion matrix equations. Appl. Math. Comput. 2010, 217, 2024–2030. [Google Scholar] [CrossRef]
Song, G.J.; Wang, Q.W.; Yu, S.W. Cramer’s rule for a system of quaternion matrix equations with applications. Appl. Math. Comput. 2018, 336, 490–499. [Google Scholar] [CrossRef]
Kyrchei, I.I. Determinantal representations of the Moore-Penrose inverse matrix over the quaternion skew field. J. Math. Sci. 2012, 180, 23–33. [Google Scholar] [CrossRef]
Kyrchei, I. Cramer’s rules for Sylvester quaternion matrix equation and its special cases. Adv. Appl. Clifford Algebr. 2018, 28, 90. [Google Scholar] [CrossRef]
Kyrchei, I. Explicit representation formulas for the minimum norm least squares solutions of some quaternion matrix equations. Linear Algebra Appl. 2013, 438, 136–152. [Google Scholar] [CrossRef]
Kyrchei, I.I. Explicit determinantal representation formulas for the solution of the two-sided restricted quaternionic matrix equation. J. Appl. Math. Comput. 2018, 58, 335–365. [Google Scholar] [CrossRef]
Kyrchei, I. Determinantal representations of solutions to systems of quaternion matrix equations. Adv. Appl. Clifford Algebr. 2018, 28, 23. [Google Scholar] [CrossRef]
Song, G.J.; Dong, C.Z. New results on condensed Cramer’s rule for the general solution to some restricted quaternion matrix equations. J. Appl. Math. Comput. 2017, 53, 321–341. [Google Scholar] [CrossRef]
Song, G.J.; Wang, Q.W. Condensed Cramer rule for some restricted quaternion linear equations. Appl. Math. Comput. 2011, 218, 3110–3121. [Google Scholar] [CrossRef]
Song, G.J.; Wang, Q.W.; Chang, H.X. Cramer rule for the unique solution of restricted matrix equations over the quaternion skew field. Comput. Math. Appl. 2011, 61, 1576–1589. [Google Scholar] [CrossRef]
Song, G.J. Determinantal expression of the general solution to a restricted system of quaternion matrix equations with applications. Bull. Korean Math. Soc. 2018, 55, 1285–1301. [Google Scholar]
Cheng, D. Semi-tensor product of matrices and its application to Morgan’s problem. Sci. China 2001, 44, 195–212. [Google Scholar]
Cheng, D. From Dimension-Free Matrix Theory to Cross-Dimensional Dynamic Systems; Elsevier: London, UK, 2019. [Google Scholar]
Cheng, D. Matrix and Polynomial Approach to Dynamic Control Systems; Science Press: Beijing, China, 2002. [Google Scholar]
Cheng, D.; Qi, H.; Xue, A. A survey on semi-tensor product of matrices. J. Syst. Sci. Complex. 2007, 20, 304–322. [Google Scholar] [CrossRef]
Cheng, D.; Qi, H. Semi-Tensor Product of Matrices—Theory and Applications, 2nd ed.; Science Press: Beijing, China, 2011. (In Chinese) [Google Scholar]
Cheng, D.; Qi, H.; Li, Z. Analysis and Control of Boolean Networks: A Semi-Tensor Product Approach; Springer: London, UK, 2011. [Google Scholar]
Yao, J.; Feng, J.; Meng, M. On solutions of the matrix equation AX = B with respect to semi-tensor product. J. Franklin Inst. 2016, 353, 1109–1131. [Google Scholar] [CrossRef]
Wang, J. Least squares solutions of matrix equation AXB = C under semi-tensor product. Electron. Res. Arch. 2024, 32, 2976–2993. [Google Scholar] [CrossRef]
Jaiprasert, J.; Chansangiam, P. Solving the Sylvester-transpose matrix equation under the semi-tensor product. Symmetry 2022, 14, 1094. [Google Scholar] [CrossRef]
Wang, N. Solvability of the Sylvester equation AX − XB = C under left semi-tensor product. Math. Model. Control 2022, 2, 81–89. [Google Scholar] [CrossRef]
Ji, Z.; Li, J.; Zhou, X.; Duan, F.; Li, T. On solutions of matrix equation AXB = C under semi-tensor product. Linear Multilinear Algebra 2021, 69, 1935–1963. [Google Scholar] [CrossRef]
Li, J.; Tao, L.; Li, W.; Chen, Y.; Huang, R. Solvability of matrix equations AX = B, XC = D under semi-tensor product. Linear Multilinear Algebra 2017, 65, 1705–1733. [Google Scholar] [CrossRef]
Wang, J.; Feng, J.; Huang, H. Solvability of the matrix equation AX² = B with semi-tensor product. Electron. Res. Arch. 2020, 29, 2249–2267. [Google Scholar] [CrossRef]
Cheng, D.; Liu, Z. A new semi-tensor product of matrices. Control Theory Technol. 2019, 17, 4–12. [Google Scholar] [CrossRef]
Cheng, D.; Xu, Z.; Shen, T. Equivalence-based model of dimension-varying linear systems. IEEE Trans. Autom. Control 2020, 65, 5444–5449. [Google Scholar] [CrossRef]
Wang, J. On solutions of the matrix equation A∘_l X = B with respect to MM-2 semitensor product. J. Math. 2021, 2021, 6651434. [Google Scholar]
Ding, W.; Li, Y.; Wang, D.; Wei, A. Constrained least squares solution of Sylvester equation. Math. Model. Control 2021, 1, 112–120. [Google Scholar] [CrossRef]
Ding, W.; Li, Y.; Wang, D. A real method for solving quaternion matrix equation $X - A \hat{X} B = C$ based on semi-tensor product of matrices. Adv. Appl. Clifford Algebr. 2021, 31, 78. [Google Scholar] [CrossRef]
Wang, D.; Li, Y.; Ding, W.X. Several kinds of special least squares solutions to quaternion matrix equation AXB = C. J. Appl. Math. Comput. 2022, 68, 1881–1899. [Google Scholar] [CrossRef]
Liu, X.; Li, Y.; Ding, W.; Tao, R. A real method for solving octonion matrix equation AXB = C based on semi-tensor product of matrices. Adv. Appl. Clifford Algebr. 2024, 34, 12. [Google Scholar] [CrossRef]
Chen, W.; Song, C. STP method for solving the least squares special solutions of quaternion matrix equations. Adv. Appl. Clifford Algebr. 2025, 35, 6. [Google Scholar] [CrossRef]
Liu, X.; Wang, Q.W.; Zhang, Y. Consistency of quaternion matrix equations AX^★ − XB = C and X − AX^★B = C^★. Electron. J. Linear Algebra 2019, 35, 394–407. [Google Scholar] [CrossRef]
Fan, X.; Li, Y.; Liu, Z.; Zhao, J. Solving quaternion linear system based on semi-tensor product of quaternion matrices. Symmetry 2022, 14, 1359. [Google Scholar] [CrossRef]
Fan, X.; Li, Y.; Liu, Z.; Zhao, J. The (anti)-η-Hermitian solution of quaternion linear system. Filomat 2024, 38, 4679–4695. [Google Scholar] [CrossRef]
Zhang, M.; Li, Y.; Sun, J.; Fan, X.; Wei, A. A new method based on the semi-tensor product of matrices for solving communicative quaternion matrix equation $\sum_{i = 1}^{k} A_{i} X B_{i} = C$ and its application. Bull. Sci. Math. 2025, 199, 103576. [Google Scholar] [CrossRef]
Fan, X.; Li, Y.; Sun, J.; Zhao, J. Solving quaternion linear system AXB = E based on semi-tensor product of quaternion matrices. Banach J. Math. Anal. 2023, 17, 25. [Google Scholar] [CrossRef]
Xi, Y.; Liu, Z.; Li, Y.; Tao, R.; Wang, T. On the mixed solution of reduced biquaternion matrix equation $\sum_{i = 1}^{n} A_{i} B_{i} C_{i} = E$ with sub-matrix constraints and its application. AIMS Math. 2023, 8, 27901–27923. [Google Scholar] [CrossRef]
Fan, X.; Li, Y.; Zhang, M.; Zhao, J. Solving the least squares (anti)-Hermitian solution for quaternion linear systems. Comput. Appl. Math. 2022, 41, 371. [Google Scholar] [CrossRef]
Liu, Z.; Li, Y.; Fan, X.; Ding, W. A new method of solving special solutions of quaternion generalized Lyapunov matrix equation. Symmetry 2022, 14, 1120. [Google Scholar] [CrossRef]
Ziętak, K. The Chebyshev solution of the linear matrix equation AX + YB = C. Numer. Math. 1985, 46, 455–478. [Google Scholar] [CrossRef]
Ziętak, K. The l_p-solution of the linear matrix equation AX + YB = C. Computing 1984, 32, 153–162. [Google Scholar] [CrossRef]
Liao, A.P.; Bai, Z.Z.; Lei, Y. Best approximate solution of matrix equation AXB + CYD = E. SIAM J. Matrix Anal. Appl. 2005, 27, 675–688. [Google Scholar] [CrossRef]
Wimmer, H.K. Roth’s theorems for matrix equations with symmetry constraints. Linear Algebra Appl. 1994, 199, 357–362. [Google Scholar] [CrossRef]
Terán, F.D.; Dopico, F.M. Consistency and efficient solution of the Sylvester equation for ★-congruence. Electron. J. Linear Algebra 2011, 22, 849–863. [Google Scholar] [CrossRef]
Byers, R.; Kressner, D. Structured condition numbers for invariant subspaces. SIAM J. Matrix Anal. Appl. 2006, 28, 326–347. [Google Scholar] [CrossRef]
Kressner, D.; Schröder, C.; Watkins, D.S. Implicit QR algorithms for palindromic and even eigenvalue problems. Numer. Algor. 2009, 51, 209–238. [Google Scholar] [CrossRef]
Cvetković-Ilixcx, D.S. The solutions of some operator equations. J. Korean Math. Soc. 2008, 45, 1417–1425. [Google Scholar] [CrossRef]
Chang, X.W.; Wang, J.S. The symmetric solution of the matrix equations AX + YA = C, AXA^T + BYB^T = C, and (A^TXA, B^TXB) = (C, D). Linear Algebra Appl. 1993, 179, 171–189. [Google Scholar] [CrossRef]
Jameson, A.; Kreindler, E. Inverse problem of linear optimal control. SIAM J. Control 1973, 11, 1–19. [Google Scholar] [CrossRef]
Jameson, A.; Kreindler, E.; Lancaster, P. Symmetric, positive semidefinite, positive definite real solutions of AX = XA^T and AX = YB. Linear Algebra Appl. 1992, 160, 189–215. [Google Scholar] [CrossRef]
Dobovišek, M. On minimal solutions of the matrix equation AX − YB=0. Linear Algebra Appl. 2001, 325, 81–99. [Google Scholar] [CrossRef]
Cantoni, A.; Butler, P. Eigenvalues and eigenvectors of symmetric centrosymmetric matrices. Linear Algebra Appl. 1976, 13, 275–288. [Google Scholar] [CrossRef]
Reid, R.M. Some eigenvalues properties of persymmetric matrices. SIAM Rev. 1997, 39, 313–316. [Google Scholar] [CrossRef]
Andrew, A.L. Centrosymmetric matrices. SIAM Rev. 1998, 40, 697–698. [Google Scholar] [CrossRef]
Weaver, J.R. Centrosymmetric (cross-symmetric) matrices, their basic properties, eigenvalues, eigenvectors. Am. Math. Mon. 1985, 92, 711–717. [Google Scholar] [CrossRef]
Draxl, P.K. Skew Field; Cambridge University Press: London, UK, 1983. [Google Scholar]
Wang, Q.W.; Li, S.Z. Persymmetric and perskewsymmetric solutions to sets of matrix equations over a finite central algebra. Acta Math. Sin. 2004, 47, 27–34. (In Chinese) [Google Scholar]
Wang, Q.W.; Sun, J.H.; Li, S.Z. Consistency for bi(skew)symmetric solutions to systems of generalized Sylvester equations over a finite central algebra. Linear Algebra Appl. 2002, 353, 169–182. [Google Scholar] [CrossRef]
Wang, Q.W.; Zhang, H.S.; Song, G.J. A new solvable condition for a pair of generalized Sylvester equations. Electron. J. Linear Algebra 2009, 18, 289–301. [Google Scholar] [CrossRef]
Wang, Q.W.; He, Z.H. Some matrix equations with applications. Linear Multilinear Algebra 2012, 60, 1327–1353. [Google Scholar] [CrossRef]
Took, C.C.; Mandic, D.P. Augmented second-order statistics of quaternion random signals. Signal Process. 2011, 91, 214–224. [Google Scholar] [CrossRef]
Took, C.C.; Mandic, D.P.; Zhang, F. On the unitary diagonalisation of a special class of quaternion matrices. Appl. Math. Lett. 2011, 24, 1806–1809. [Google Scholar] [CrossRef]
Yuan, S.F.; Wang, Q.W. Two special kinds of least squares solutions for the quaternion matrix equation AXB + CXD = E. Electron. J. Linear Algebra 2012, 23, 257–274. [Google Scholar] [CrossRef]
He, Z.H.; Wang, Q.W. A real quaternion matrix equation with applications. Linear Multilinear Algebra 2013, 61, 725–740. [Google Scholar] [CrossRef]
Baksalary, J.K.; Kala, R. The matrix equation AXB + CYD = E. Linear Algebra Appl. 1980, 30, 141–147. [Google Scholar] [CrossRef]
He, Z.H.; Wang, Q.W. A pair of mixed generalized Sylvester matrix equations. J. Shanghai Univ. Nat. Sci. 2014, 20, 138–156. [Google Scholar]
Kyrchei, I. Cramer’s rules of η-(skew-)Hermitian solutions to the quaternion Sylvester-type matrix equations. Adv. Appl. Clifford Algebras 2019, 29, 56. [Google Scholar] [CrossRef]
He, Z.H.; Liu, J.; Tam, T.Y. The general ϕ-Hermitian solution to mixed pairs of quaternion matrix Sylvester equations. Electron. J. Linear Algebra 2017, 32, 475–499. [Google Scholar] [CrossRef]
Wang, L.; Wang, Q.; He, Z. The common solution of some matrix equations. Algebra Colloq. 2016, 23, 71–81. [Google Scholar] [CrossRef]
Wang, Q.W.; Lv, R.Y.; Zhang, Y. The least-squares solution with the least norm to a system of tensor equations over the quaternion algebra. Linear Multilinear Algebra 2022, 70, 1942–1962. [Google Scholar] [CrossRef]
Guralnick, R.M. Roth’s theorems and decomposition of modules. Linear Algebra Appl. 1980, 39, 155–165. [Google Scholar] [CrossRef]
Hartwig, R.E. Roth’s equivalence problem in unit regular rings. Proc. Am. Math. Soc. 1976, 59, 39–44. [Google Scholar] [CrossRef]
Newman, M. The Smith normal form of a partitioned matrix. J. Res. Nat. Bur. Stand.-B. Math. Sci. 1974, 78B, 3–6. [Google Scholar] [CrossRef]
Feinberg, R.B. Equivalence of partitioned matrices. J. Res. Nat. Bur. Stand.-B. Math. Sci. 1976, 80B, 89–97. [Google Scholar] [CrossRef]
Gustafson, W.H.; Zelmanowitz, J.M. On matrix equivalence and matrix equations. Linear Algebra Appl. 1979, 27, 219–224. [Google Scholar] [CrossRef]
Gustafson, W.H. Roth’s theorems over commutative rings. Linear Algebra Appl. 1979, 23, 245–251. [Google Scholar] [CrossRef]
Guralnick, R.M. Roth’s theorems for sets of matrices. Linear Algebra Appl. 1985, 71, 113–117. [Google Scholar] [CrossRef]
Lee, S.G.; Vu, Q.P. Simultaneous solutions of matrix equations and simultaneous equivalence of matrices. Linear Algebra Appl. 2012, 437, 2325–2339. [Google Scholar] [CrossRef]
Miyata, T. Note on direct summands of modules. J. Math. Kyoto Univ. 1967, 7, 65–69. [Google Scholar] [CrossRef]
Guralnick, R.M. Matrix equivalence and isomorphism of modules. Linear Algebra Appl. 1982, 43, 125–136. [Google Scholar] [CrossRef]
Özgüler, A.B. The matrix equation AXB + CYD = E over a principal ideal domain. SIAM J. Matrix Anal. Appl. 1991, 12, 581–591. [Google Scholar] [CrossRef]
Huang, L.; Zeng, Q. The matrix equation AXB + CYD = E over a simple artinian ring. Linear Multilinear Algebra 1995, 38, 225–232. [Google Scholar]
Wang, Q.W. A system of matrix equations and a linear matrix equation over arbitrary regular rings with identity. Linear Algebra Appl. 2004, 384, 43–54. [Google Scholar] [CrossRef]
Dajić, A. Common solutions of linear equations in a ring, with applications. Electron. J. Linear Algebra 2015, 30, 66–79. [Google Scholar] [CrossRef]
Lin, M.; Wimmer, H.K. The generalized Sylvester matrix equation, rank minimization and Roth’s equivalence theorem. Bull. Aust. Math. Soc. 2011, 84, 441–443. [Google Scholar] [CrossRef]
Ito, N.; Wimmer, H.K. Rank minimization of generalized Sylvester equations over Bezout domains. Linear Algebra Appl. 2013, 439, 592–599. [Google Scholar] [CrossRef]
Hamilton, W.R. Lectures on Quaternions; Hodges and Smith: Dublin, Ireland, 1853. [Google Scholar]
Voight, J. Quaternion Algebras; Springer: Cham, Switzerland, 2021. [Google Scholar]
Adler, S.L. Quaternionic Quantum Mechanics and Quantum Fields; Oxford University Press: New York, NY, USA, 1995. [Google Scholar]
Kuipers, J.B. Quaternions and Rotation Sequences: A Primer with Applications to Orbits, Aerospace and Virtual Reality; Princeton University Press: Princeton, NJ, USA, 1999. [Google Scholar]
Girard, P.R. Quaternions, Clifford Algebras and Relativistic Physics; Birkhäuser: Basel, Switzerland, 2007. [Google Scholar]
Li, W. Quaternion Matrices; National University of Defense Technology Press: Changsha, China, 2002. (In Chinese) [Google Scholar]
Clifford, M.A. Preliminary sketch of biquaternions. Proc. Lond. Math. Soc. 1873, 4, 381–395. [Google Scholar] [CrossRef]
Wei, T.; Ding, W.; Wei, Y. Singular value decomposition of dual matrices and its application to traveling wave identification in the brain. SIAM J. Matrix Anal. Appl. 2024, 45, 634–660. [Google Scholar] [CrossRef]
Fischer, I. Dual-Number Methods in Kinematics, Statics and Dynamics; CRC Press: Boca Raton, FL, USA, 1999. [Google Scholar]
Condurache, D.; Burlacu, A. Dual tensors based solutions for rigid body motion parameterization. Mech. Mach. Theory 2014, 74, 390–412. [Google Scholar] [CrossRef]
Gu, Y.L.; Luh, J.Y.S. Dual-number transformations and its applications to robotics. IEEE J. Robot. Autom. 1987, 3, 615–623. [Google Scholar] [CrossRef]
Udwadia, F.E.; Pennestri, E.; de Falco, D. Do all dual matrices have dual Moore-Penrose inverses? Mech. Mach. Theory 2020, 151, 103878. [Google Scholar] [CrossRef]
Udwadia, F.E. Dual generalized inverses and their use in solving systems of linear dual equations. Mech. Mach. Theory 2021, 156, 104158. [Google Scholar] [CrossRef]
Fan, R.; Zeng, M.; Yuan, Y. The solutions to some dual matrix equations. Miskolc Math. Notes 2024, 25, 679–691. [Google Scholar] [CrossRef]
Farias, J.G.; Pieri, E.D.; Martins, D. A review on the applications of dual quaternions. Machines 2024, 12, 402. [Google Scholar] [CrossRef]
Kenwright, B. A beginner’s guide to dual-quaternions: What they are, how they work, how to use them for 3D character hierarchies. In Proceedings of the 20th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, Pilsen, Czech Republic, 25–28 June 2012. [Google Scholar]
Mukundan, R. Quaternions: From classical mechanics to computer graphics, and beyond. In Proceedings of the 7th Asian Technology Conference in Mathematics, Melaka, Malaysia, 17–21 December 2002; pp. 97–106. [Google Scholar]
Xie, L.M.; Wang, Q.W.; He, Z.H. The generalized hand-eye calibration matrix equation AX − YB = C over dual quaternions. Comput. Appl. Math. 2025, 44, 137. [Google Scholar] [CrossRef]
Xie, L.M.; Wang, Q.W. A generalized Sylvester dual quaternion matrix equation with applications. Authorea 2025. [Google Scholar] [CrossRef]
Duan, G.R. Generalized Sylvester Equations: Unified Parametric Solutions; Taylor and Francis Group: Abingdon, UK; CRC Press: Boca Raton, FL, USA, 2015. [Google Scholar]
Hamilton, W.R., II. On quaternions; or on a new system of imaginaries in algebra. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1844, 25, 10–13. [Google Scholar] [CrossRef]
Cockle, J. On systems of algebra involving more than one imaginary and on equations of the fifth degree. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1849, 35, 434–437. [Google Scholar] [CrossRef]
Segre, C. The real representations of complex elements and extension to bicomplex systems. Math. Ann. 1892, 40, 413–467. (In Italian) [Google Scholar] [CrossRef]
Tian, Y.; Liu, X.; Zhang, Y. Least-squares solutions of the generalized reduced biquaternion matrix equations. Filomat 2023, 37, 863–870. [Google Scholar] [CrossRef]
Yaglom, I.M. Complex Numbers in Geometry; Academic Press: New York, NY, USA, 1968. [Google Scholar]
Lam, T.Y. Introduction to Quadratic Forms over Fields; American Mathematical Society: Providence, RI, USA, 2005. [Google Scholar]
Pottmann, H.; Wallner, J. Computational Line Geometry; Springer: Berlin/Heidelberg, Germany, 2001. [Google Scholar]
Kandasamy, W.B.V.; Smarandache, F. Dual Numbers; Zip Publishing: Columbus, OH, USA, 2012. [Google Scholar]
Lance, E.C. Hilbert C^*-Modules: A Toolkit for Operator Algebraists; Cambridge University Press: Cambridge, UK, 1995. [Google Scholar]
Fang, X.; Yu, J.; Yao, H. Solutions to operator equations on Hilbert C^*-modules. Linear Algebra Appl. 2009, 431, 2142–2153. [Google Scholar] [CrossRef]
Xu, Q. Common Hermitian and positive solutions to the adjointable operator equations AX = C, XB = D. Linear Algebra Appl. 2008, 429, 1–11. [Google Scholar] [CrossRef]
Dajić, A.; Koliha, J.J. Positive solutions to the equations AX = C and XB = D for Hilbert space operators. J. Math. Anal. Appl. 2007, 333, 567–576. [Google Scholar] [CrossRef]
Douglas, R.G. On majorization, factorization and range inclusion of operators on Hilbert space. Proc. Am. Math. Soc. 1966, 17, 413–416. [Google Scholar] [CrossRef]
Mousavi, Z.; Eskandari, R.; Moslehian, M.S.; Mirzapour, F. Operator equations AX + YB = C and AXA^* + BYB^* = C in Hilbert C^*-modules. Linear Algebra Appl. 2017, 517, 85–98. [Google Scholar] [CrossRef]
Karizaki, M.M.; Hassani, M.; Amyari, M.; Khosravi, M. Operator matrix of Moore-Penrose inverse operators on Hilbert C^*-modules. Colloq. Math. 2015, 140, 171–182. [Google Scholar] [CrossRef]
Moghani, Z.N.; Karizaki, M.M.; Khanehgir, M. Solutions of the Sylvester equation in C^*-Modular operators. Ukr. Math. J. 2021, 73, 354–369. [Google Scholar] [CrossRef]
An, I.J.; Ko, E.; Lee, J.E. On the generalized Sylvester operator equation AX − YB = C. Linear Multilinear Algebra 2022, 72, 585–596. [Google Scholar] [CrossRef]
Jo, S.; Kim, Y.; Ko, E. On Fuglede-Putnam properties. Positivity 2015, 19, 911–925. [Google Scholar] [CrossRef]
Bhatia, R. Matrix Analysis; Springer: New York, NY, USA, 1997. [Google Scholar]
Lim, L.H. Tensors in computations. Acta Numer. 2021, 30, 555–764. [Google Scholar] [CrossRef]
Qi, L.; Luo, Z. Tensor Analysis: Spectral Theory and Special Tensors; SIAM: Philadelphia, PA, USA, 2017. [Google Scholar]
Ding, W.; Wei, Y. Theory and Computation of Tensors: Multi-Dimensional Arrays; Elsevier: Amsterdam, The Netherlands; Academic Press: London, UK, 2016. [Google Scholar]
Che, M.; Wei, Y. Theory and Computation of Complex Tensors and its Applications; Springer: Singapore, 2020. [Google Scholar]
Wu, F.; Li, C.; Li, Y. Manifold regularization nonnegative triple decomposition of tensor sets for image compression and representation. J. Optim. Theory Appl. 2022, 192, 979–1000. [Google Scholar] [CrossRef]
Savas, B.; Eldén, L. Handwritten digit classification using higher order singular value decomposition. Pattern Recognit. 2007, 40, 993–1003. [Google Scholar] [CrossRef]
Huang, S.; Zhao, G.; Chen, M. Tensor extreme learning design via generalized Moore-Penrose inverse and triangular type-2 fuzzy sets. Neural Comput. Appl. 2019, 31, 5641–5651. [Google Scholar] [CrossRef]
Sidiropoulos, N.D.; Lathauwer, L.D.; Fu, X.; Huang, K.; Papalexakis, E.E.; Faloutsos, C. Tensor decomposition for signal processing and machine learning. IEEE Trans. Signal Process. 2017, 65, 3551–3582. [Google Scholar] [CrossRef]
Qi, L.; Chen, H.; Chen, Y. Tensor Eigenvalues and Their Applications; Springer: Singapore, 2018. [Google Scholar]
Kolda, T.G.; Bader, B.W. Tensor decompositions and applications. SIAM Rev. 2009, 51, 455–500. [Google Scholar] [CrossRef]
He, Z.H.; Navasca, C.; Wang, Q.W. Tensor decompositions and tensor equations over quaternion algebra. arXiv 2017, arXiv:1710.07552v1. [Google Scholar] [CrossRef]
Einstein, A. The foundation of the general theory of relativity. In The Collected Papers of Albert Einstein; Kox, A.J., Klein, M.J., Schulmann, R., Eds.; Princeton University Press: Princeton, NJ, USA, 1997; Volume 6, pp. 146–200. [Google Scholar]
Sun, L.; Zheng, B.; Bu, C.; Wei, Y. Moore-Penrose inverse of tensors via Einstein product. Linear Multilinear Algebra 2016, 64, 686–698. [Google Scholar] [CrossRef]
Brazell, M.; Li, N.; Navasca, C.; Tamon, C. Solving multilinear systems via tensor inversion. SIAM J. Matrix Anal. Appl. 2013, 34, 542–570. [Google Scholar] [CrossRef]
He, Z.H.; Navasca, C.; Wang, X.X. Decomposition for a quaternion tensor triplet with applications. Adv. Appl. Clifford Algebras 2022, 32, 9. [Google Scholar] [CrossRef]
Wang, Q.W.; Wang, X. A system of coupled two-sided Sylvester-type tensor equations over the quaternion algebra. Taiwan J. Math. 2020, 24, 1399–1416. [Google Scholar] [CrossRef]
Mehany, M.S.; Wang, Q.; Liu, L. A system of Sylvester-like quaternion tensor equations with an application. Front. Math. 2024, 19, 749–768. [Google Scholar] [CrossRef]
Wang, Q.W.; Wang, X.; Zhang, Y. A constraint system of coupled two-sided Sylvester-like quaternion tensor equations. Comput. Appl. Math. 2020, 39, 317. [Google Scholar] [CrossRef]
Qin, J.; Wang, Q.W. Solving a system of two-sided Sylvester-like quaternion tensor equations. Comput. Appl. Math. 2023, 42, 232. [Google Scholar] [CrossRef]
He, Z.H. The general solution to a system of coupled Sylvester-type quaternion tensor equations involving η-Hermicity. Bull. Iran. Math. Soc. 2019, 45, 1407–1430. [Google Scholar] [CrossRef]
Xie, M.; Wang, Q.W. Reducible solution to a quaternion tensor equation. Front. Math. China 2020, 15, 1047–1070. [Google Scholar] [CrossRef]
Xie, M.; Wang, Q.W.; Zhang, Y. The minimum-norm least squares solutions to quaternion tensor systems. Symmetry 2022, 14, 1460. [Google Scholar] [CrossRef]
Jia, Z.R.; Wang, Q.W. The general solution to a system of tensor equations over the split quaternion algebra with applications. Mathematics 2025, 13, 644. [Google Scholar] [CrossRef]
Yang, L.; Wang, Q.W.; Kou, Z. A system of tensor equations over the dual split quaternion algebra with an application. Mathematics 2024, 12, 3571. [Google Scholar] [CrossRef]
Bader, B.W.; Kolda, T.G. Algorithm 862: Matlab tensor classes for fast algorithm prototyping. ACM Trans. Math. Softw. 2006, 32, 635–653. [Google Scholar] [CrossRef]
Kilmer, M.; Martin, C.D. Factorization strategies for third-order tensors. Linear Algebra Appl. 2011, 435, 641–658. [Google Scholar] [CrossRef]
Qin, Z.; Ming, Z.; Zhang, L. Singular value decomposition of third order quaternion tensors. Appl. Math. Lett. 2022, 123, 107597. [Google Scholar] [CrossRef]
Shao, J.Y. A general product of tensors with applications. Linear Algebra Appl. 2013, 439, 2350–2366. [Google Scholar] [CrossRef]
Kernfeld, E.; Kilmer, M.; Aeron, S. Tensor-tensor products with invertible linear transforms. Linear Algebra Appl. 2015, 485, 545–570. [Google Scholar] [CrossRef]
Kilmer, M.; Horesh, L.; Avron, H.; Newman, E. Tensor-tensor products for optimal representation and compression. arXiv 2019, arXiv:2001.00046v1. [Google Scholar] [CrossRef]
Jin, H.; Xu, S.; Wang, Y.; Liu, X. The Moore-Penrose inverse of tensors via the M-product. Comput. Appl. Math. 2023, 42, 294. [Google Scholar] [CrossRef]
Kučera, V. Algebraic approach to discrete stochastic control. Kybernetika 1975, 11, 114–147. [Google Scholar]
Bengtsson, G. Output regulation and internal models–a frequency domain approach. Automatica 1977, 13, 333–345. [Google Scholar] [CrossRef]
Cheng, L.; Pearson, J. Frequency domain synthesis of multivariable linear regulators. IEEE Trans. Autom. Control 1978, 23, 3–15. [Google Scholar] [CrossRef]
Wolovich, W.A. Skew prime polynomial matrices. IEEE Trans. Autom. Control 1978, 23, 880–887. [Google Scholar] [CrossRef]
Wolovich, W.A. Linear Multivariable Systems; Springer: New York, NY, USA, 1974. [Google Scholar]
Fuhrmann, P.A. Algebraic system theory: An analyst’s point of view. J. Franklin Inst. 1976, 301, 521–540. [Google Scholar] [CrossRef]
Emre, E.; Silverman, L.M. The equation XR + QY = Φ: A characterization of solutions. SIAM J. Control Optim. 1981, 19, 33–38. [Google Scholar] [CrossRef]
Emre, E. The polynomial equation QQ_c + RP_c = Φ with application to dynamic feedback. SIAM J. Control Optim. 1980, 18, 611–620. [Google Scholar] [CrossRef]
Żak, S.H. On the polynomial matrix equation AX + YB = C. IEEE Trans. Autom. Control 1985, 30, 1240–1242. [Google Scholar] [CrossRef]
Feinstein, J.; Bar-Ness, Y. The solution of the matrix polynomial A(s)X(s) + B(s)Y(s) = C(s). IEEE Trans. Autom. Control 1984, 29, 75–77. [Google Scholar] [CrossRef]
Wimmer, H.K. Consistency of a pair of generalized Sylvester equations. IEEE Trans. Autom. Control 1994, 39, 1014–1016. [Google Scholar] [CrossRef]
Wimmer, H.K. The matrix equation X − AXB = C and an analogue of Roth’s theorem. Linear Algebra Appl. 1988, 109, 145–147. [Google Scholar] [CrossRef]
Wimmer, H.K. The generalized Sylvester equation in polynomial matrices. IEEE Trans. Autom. Control 1996, 41, 1372–1376. [Google Scholar] [CrossRef]
Wimmer, H.K. The structure of nonsingular polynomial matrices. Math. Syst. Theory 1981, 14, 367–379. [Google Scholar] [CrossRef]
Barnett, S. Regular polynomial matrices having relatively prime determinants. Proc. Camb. Philos. Soc. 1969, 65, 585–590. [Google Scholar] [CrossRef]
Feinstein, J.; Bar-Ness, Y. On the uniqueness minimal solution of the matrix polynomial equation A(λ)X(λ) + Y(λ)B(λ) = C(λ). J. Franklin Inst. 1980, 310, 131–134. [Google Scholar] [CrossRef]
Chen, S.; Tian, Y. On solutions of generalized Sylvester equation in polynomial matrices. J. Franklin Inst. 2014, 351, 5376–5385. [Google Scholar] [CrossRef]
Wimmer, H.K. Explicit solutions of the matrix equation $\sum A^{i} X D_{i} = C$ AⁱXD_i = C. SIAM J. Matrix Anal. Appl. 1992, 13, 1123–1130. [Google Scholar] [CrossRef]
Huang, L. The explicit solutions and solvability of linear matrix equations. Linear Algebra Appl. 2000, 311, 195–199. [Google Scholar] [CrossRef]
Gohberg, I.; Kaashoek, M.A.; Lerer, L. On a class of entire matrix function equations. Linear Algebra Appl. 2007, 425, 434–442. [Google Scholar] [CrossRef]
Gohberg, I.; Kaashoek, M.A.; Schagen, F. Partially Specified Matrices and Operators: Classification, Completion, Applications; Birkhäuser Verlag: Basel, Switzerland, 1995. [Google Scholar]
Kaashoek, M.A.; Lerer, L. On a class of matrix polynomial equations. Linear Algebra Appl. 2013, 439, 613–620. [Google Scholar] [CrossRef]
Gohberg, I.; Kaashoek, M.A.; Lerer, L. The resultant for regular matrix polynomials and quasi commutativity. Indiana Univ. Math. J. 2008, 57, 2793–2813. [Google Scholar] [CrossRef]
Huang, L. The solvability of linear matrix equation over a central simple algebra. Linear Multilinear Algebra 1996, 40, 353–363. [Google Scholar] [CrossRef]
Huang, L. The quaternion matrix equation $\sum A^{i} X B_{i} = E$ . Acta Math. Sin. New Ser. 1998, 14, 91–98. [Google Scholar]
Huang, L.; Liu, J. The extension of Roth’s theorem for matrix equations over a ring. Linear Algebra Appl. 1997, 259, 229–235. [Google Scholar] [CrossRef]
Wu, A.G.; Duan, G.R.; Xue, Y. Kronecker maps and Sylvester-polynomial matrix equations. IEEE Trans. Autom. Control 2007, 52, 905–910. [Google Scholar] [CrossRef]
Wu, A.G.; Liu, W.; Duan, G.R. On the conjugate product of complex polynomial matrices. Math. Comput. Model. 2011, 53, 2031–2043. [Google Scholar] [CrossRef]
Wu, A.G.; Duan, G.R.; Feng, G.; Liu, W. On conjugate product of complex polynomials. Appl. Math. Lett. 2011, 24, 735–741. [Google Scholar] [CrossRef]
Wu, A.G.; Feng, G.; Liu, W.; Duan, G.R. The complete solution to the Sylvester-polynomial-conjugate matrix equations. Math. Comput. Model. 2011, 53, 2044–2056. [Google Scholar] [CrossRef]
Bevis, J.H.; Hall, F.J.; Hartwig, R.E. Consimilarity and the matrix equation AX − XB = C. In Current Trends in Matrix Theory; Uhlig, F., Grone, R., Eds.; North-Holland: New York, NY, USA, 1987; pp. 51–64. [Google Scholar]
Bevis, J.H.; Hall, F.J.; Hartwig, R.E. The matrix equation $A \bar{X} - X B = C$ and its special cases. SIAM J. Matrix Anal. Appl. 1988, 9, 348–359. [Google Scholar] [CrossRef]
Wu, A.G.; Duan, G.R.; Yu, H.H. On solutions of the matrix equations XF − AX = C and $X F - A \bar{X} = C$ . Appl. Math. Comput. 2006, 183, 932–941. [Google Scholar]
Jiang, T.; Wei, M. On solutions of the matrix equations X − AXB = C and $X - A \bar{X} B = C$ . Linear Algebra Appl. 2003, 367, 225–233. [Google Scholar] [CrossRef]
Wu, A.G.; Wang, H.Q.; Duan, G.R. On matrix equations X − AXF = C and $X - A \bar{X} F = C$ . J. Comput. Appl. Math. 2009, 230, 690–698. [Google Scholar] [CrossRef]
Wu, A.G.; Fu, Y.M.; Duan, G.R. On solutions of matrix equations V − AVF = BW and $V - A \bar{V} F = B W$ . Math. Comput. Model. 2008, 47, 1181–1197. [Google Scholar] [CrossRef]
Wu, A.G.; Feng, G.; Hu, J.; Duan, G.R. Closed-form solutions to the nonhomogeneous Yakubovich-conjugate matrix equation. Appl. Math. Comput. 2009, 214, 442–450. [Google Scholar] [CrossRef]
Wu, A.G.; Zhang, Y. Complex Conjugate Matrix Equations for Systems and Control; Springer: Singapore, 2017. [Google Scholar]
Mazurek, R. A general approach to Sylvester-polynomial-conjugate matrix equations. Symmetry 2024, 16, 246. [Google Scholar] [CrossRef]
Lam, T.Y. A First Course in Noncommutative Rings; Springer: New York, NY, USA, 1991. [Google Scholar]
Wu, A.G.; Liu, W.; Li, C.; Duan, G.R. On j-conjugate product of quaternion polynomial matrices. Appl. Math. Comput. 2013, 219, 11223–11232. [Google Scholar] [CrossRef]
Futorny, V.; Klymchuk, T.; Sergeichuk, V.V. Roth’s solvability criteria for the matrix equations $A X - \hat{X} B = C and X - A \hat{X} B = C$ over the skew field of quaternions with an involutive automorphism q ↦ $\hat{q}$ . Linear Algebra Appl. 2016, 510, 246–258. [Google Scholar] [CrossRef]
Jiang, T.; Ling, S. On a solution of the quaternion matrix equation $A \tilde{X} - X B = C$ and its applications. Adv. Appl. Clifford Algebras 2013, 23, 689–699. [Google Scholar] [CrossRef]
Song, C.; Chen, G. On solutions of matrix equation XF − AX = C and $X F - A \tilde{X} = C$ over quaternion field. J. Appl. Math. Comput. 2011, 37, 57–68. [Google Scholar] [CrossRef]
Jiang, T.S.; Wei, M.S. On a solution of the quaternion matrix equation $X - A \tilde{X} B = C$ and its application. Acta Math. Sin. Engl. Ser. 2005, 21, 483–490. [Google Scholar] [CrossRef]
Song, C.; Chen, G.; Liu, Q. Explicit solutions to the quaternion matrix equations X − AXF = C and $X - A \tilde{X} F = C$ . Int. J. Comput. Math. 2012, 89, 890–900. [Google Scholar] [CrossRef]
Song, C.; Feng, J.; Wang, X.; Zhao, J. A real representation method for solving Yakubovich-j-conjugate quaternion matrix equation. Abstr. Appl. Anal. 2014, 2014, 285086. [Google Scholar] [CrossRef]
Yuan, S.; Liao, A. Least squares solution of the quaternion matrix equation $X - A \hat{X} B = C$ with the least norm. Linear Multilinear Algebra 2011, 59, 985–998. [Google Scholar] [CrossRef]
Song, C.; Feng, J. On solutions to the matrix equations XB − AX = CY and $X B - A \hat{X} = C Y$ . J. Franklin Inst. 2016, 353, 1075–1088. [Google Scholar] [CrossRef]
Song, C.; Chen, G. Solutions to matrix equations X − AXB = CY + R and XB − AX = CY and $X - A \hat{X} B = C Y + R$ . J. Comput. Appl. Math. 2018, 343, 488–500. [Google Scholar] [CrossRef]
Moor, B.D.; Zha, H. A tree of generalization of the ordinary singular value decomposition. Linear Algebra Appl. 1991, 147, 469–500. [Google Scholar] [CrossRef]
He, Z.H. Pure PSVD approach to Sylvester-type quaternion matrix equations. Electron. J. Linear Algebra 2019, 35, 266–284. [Google Scholar] [CrossRef]
Xie, M.Y.; Wang, Q.W.; He, Z.H.; Saad, M.M. A system of Sylvester-type quaternion matrix equations with ten variables. Acta Math. Sin. Engl. Ser. 2022, 38, 1399–1420. [Google Scholar] [CrossRef]
Mehany, M.S.; Wang, Q.W. Three symmetrical systems of coupled Sylvester-like quaternion matrix equations. Symmetry 2022, 14, 550. [Google Scholar] [CrossRef]
He, Z.H.; Wang, Q.W.; Zhang, Y. A system of quaternary coupled Sylvester-type real quaternion matrix equations. Automatica 2018, 87, 25–31. [Google Scholar] [CrossRef]
Rehman, A.; Wang, Q.W.; Ali, I.; Akram, M.; Ahmad, M.O. A constraint system of generalized Sylvester quaternion matrix equations. Adv. Appl. Clifford Algebr. 2017, 27, 3183–3196. [Google Scholar] [CrossRef]
Wang, Q.W.; He, Z.H. Solvability conditions and general solution for mixed Sylvester equations. Automatica 2013, 49, 2713–2719. [Google Scholar] [CrossRef]
Wang, Q.W.; He, Z.H. Systems of coupled generalized Sylvester matrix equations. Automatica 2014, 50, 2840–2844. [Google Scholar] [CrossRef]
Wang, Q.W.; Rehman, A.; He, Z.H.; Zhang, Y. Constraint generalized Sylvester matrix equations. Automatica 2016, 69, 60–64. [Google Scholar] [CrossRef]
He, Z.H.; Wang, Q.W. A system of periodic discrete-time coupled Sylvester quaternion matrix equations. Algebra Colloq. 2017, 24, 169–180. [Google Scholar] [CrossRef]
Kågström, B. A perturbation analysis of the generalized Sylvester equation (AR − LB, DR − LE) = (C, F). SIAM J. Matrix Anal. Appl. 1994, 15, 1045–1060. [Google Scholar] [CrossRef]
Ziętak, K. The properties of the minimax solution of a non-linear matrix equation XY = A. IMA J. Numer. Anal. 1983, 3, 229–244. [Google Scholar] [CrossRef]
Yang, X.; Huang, W. Backward error analysis of the matrix equations for Sylvester and Lyapunov. J. Sys. Sci. Math. Scis. 2008, 28, 524–534. (In Chinese) [Google Scholar]
Sun, J.G.; Xu, S.F. Perturbation analysis of the maximal solution of the matrix equation X + A^*X⁻¹ = P. II. Linear Algebra Appl. 2003, 362, 211–228. [Google Scholar] [CrossRef]
Li, J.F.; Hu, X.Y.; Duan, X.F. A symmetric preserving iterative method for generalized Sylvester equation. Asian J. Control 2011, 13, 408–417. [Google Scholar] [CrossRef]
Rockafellar, R.T. The multiplier method of Hestenes and Powell applied to convex programming. J. Optim. Theory Appl. 1973, 12, 555–562. [Google Scholar] [CrossRef]
Hartwig, R.E. A note on light matrices. Linear Algebra Appl. 1987, 97, 153–169. [Google Scholar] [CrossRef]
Ke, Y.; Ma, C. An alternating direction method for nonnegative solutions of the matrix equation AX + YB = C. Comp. Appl. Math. 2017, 36, 359–365. [Google Scholar] [CrossRef]
Peng, Z.; Peng, Y. An efficient iterative method for solving the matrix equation AXB + CYD = E. Numer. Linear Algebra Appl. 2006, 13, 473–485. [Google Scholar] [CrossRef]
Kågström, B.; Poromaa, P. Lapack-style algorithms and software for solving the generalized Sylvester equation and estimating the separation between regular matrix pairs. ACM Trans. Math. Softw. 1996, 22, 78–103. [Google Scholar] [CrossRef]
Lin, Y.; Wei, Y. Condition numbers of the generalized Sylvester equation. IEEE Trans. Autom. Control 2007, 52, 2380–2385. [Google Scholar] [CrossRef]
Diao, H.; Shi, X.; Wei, Y. Effective condition numbers and small sample statistical condition estimation for the generalized Sylvester equation. Sci. China-Math. 2013, 56, 967–982. [Google Scholar] [CrossRef]
Dehghan, M.; Hajarian, M. An iterative algorithm for the reflexive solutions of the generalized coupled Sylvester matrix equations and its optimal approximation. Appl. Math. Comput. 2008, 202, 571–588. [Google Scholar] [CrossRef]
Dehghan, M.; Hajarian, M. An iterative method for solving the generalized coupled Sylvester matrix equations over generalized bisymmetric matrices. Appl. Math. Modell. 2010, 34, 639–654. [Google Scholar] [CrossRef]
Paige, C.C.; Saunders, M.A. LSQR: An algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Softw. 1982, 8, 43–71. [Google Scholar] [CrossRef]
Li, S.K.; Huang, T.Z. LSQR iterative method for generalized coupled Sylvester matrix equations. Appl. Math. Modell. 2012, 36, 3545–3554. [Google Scholar] [CrossRef]
Huang, J. On parameter iteration method for solving the mixed-type Lyapunov matrix equation. Math. Numer. Sinica 2007, 29, 285–292. (In Chinese) [Google Scholar]
Zhang, K. Iterative Algorithms for Constrained Solutions of Matrix Equations; National Defense Industry Press: Beijing, China, 2015. (In Chinese) [Google Scholar]
Lv, C.; Ma, C. Two parameter iteration methods for coupled Sylvester matrix equations. East Asian J. Appl. Math. 2018, 8, 336–351. [Google Scholar] [CrossRef]
Ma, C.; Wu, Y.; Xie, Y. The Newton-type splitting iterative method for a class of coupled Sylvester-like absolute value equation. J. Appl. Anal. Comput. 2024, 14, 3306–3331. [Google Scholar] [CrossRef]
Sheng, X. A relaxed gradient based algorithm for solving generalized coupled Sylvester matrix equations. J. Franklin Inst. 2018, 355, 4282–4297. [Google Scholar] [CrossRef]
Ding, F.; Chen, T. Iterative least-squares solutions of coupled Sylvester matrix equations. Syst. Control Lett. 2005, 54, 95–107. [Google Scholar] [CrossRef]
Ding, F.; Chen, T. On iterative solutions of general coupled matrix equations. SIAM J. Control Optim. 2006, 44, 2269–2284. [Google Scholar] [CrossRef]
Shirilord, A.; Dehghan, M. Gradient descent-based parameter-free methods for solving coupled matrix equations and studying an application in dynamical systems. Appl. Numer. Math. 2025, 212, 29–59. [Google Scholar] [CrossRef]
Dehghan, M.; Hajarian, M. The general coupled matrix equations over generalized bisymmetric matrices. Linear Algebra Appl. 2010, 432, 1531–1552. [Google Scholar] [CrossRef]
Dehghan, M.; Hajarian, M. Iterative algorithms for the generalized centro-symmetric and central anti-symmetric solutions of general coupled matrix equations. Eng. Comput. 2012, 29, 528–560. [Google Scholar] [CrossRef]
Hajarian, M. Matrix form of the CGS method for solving general coupled matrix equations. Appl. Math. Lett. 2014, 34, 37–42. [Google Scholar] [CrossRef]
Hajarian, M. Generalized conjugate direction algorithm for solving the general coupled matrix equations over symmetric matrices. Numer. Algor. 2016, 73, 591–609. [Google Scholar] [CrossRef]
Hajarian, M. Convergence of HS version of BCR algorithm to solve the generalized Sylvester matrix equation over generalized reflexive matrices. J. Franklin Inst. 2017, 354, 2340–2357. [Google Scholar] [CrossRef]
Lv, C.Q.; Ma, C.F. BCR method for solving generalized coupled Sylvester equations over centrosymmetric or anti-centrosymmetric matrix. Comput. Math. Appl. 2018, 75, 70–88. [Google Scholar] [CrossRef]
Niu, Q.; Wang, X.; Lu, L.Z. A relaxed gradient based algorithm for solving Sylvester equations. Asian J. Control 2011, 13, 461–464. [Google Scholar] [CrossRef]
Hajarian, M. Computing symmetric solutions of general Sylvester matrix equations via Lanczos version of biconjugate residual algorithm. Comput. Math. Appl. 2018, 76, 686–700. [Google Scholar] [CrossRef]
Yan, T.; Ma, C. The BCR algorithms for solving the reflexive or anti-reflexive solutions of generalized coupled Sylvester matrix equations. J. Franklin Inst. 2020, 357, 12787–12807. [Google Scholar] [CrossRef]
Zhou, B.; Duan, G.R.; Li, Z.Y. Gradient based iterative algorithm for solving coupled matrix equations. Syst. Control Lett. 2009, 58, 327–333. [Google Scholar] [CrossRef]
Hajarian, M. Matrix GPBiCG algorithms for solving the general coupled matrix equations. IET Control Theory Appl. 2015, 9, 74–81. [Google Scholar] [CrossRef]
Wu, A.G.; Li, B.; Zhang, Y.; Duan, G.R. Finite iterative solutions to coupled Sylvester-conjugate matrix equations. Appl. Math. Modell. 2011, 35, 1065–1080. [Google Scholar] [CrossRef]
Yan, T.; Ma, C. An iterative algorithm for generalized Hamiltonian solution of a class of generalized coupled Sylvester-conjugate matrix equations. Appl. Math. Comput. 2021, 411, 126491. [Google Scholar] [CrossRef]
Carpentieri, B.; Jing, Y.F.; Huang, T.Z. The BiCOR and CORS iterative algorithms for solving nonsymmetric linear systems. SIAM J. Sci. Comput. 2011, 33, 3020–3036. [Google Scholar] [CrossRef]
Jing, Y.F.; Huang, T.Z.; Zhang, Y.; Li, L.; Cheng, G.H.; Ren, Z.G.; Duan, Y.; Sogabe, T.; Carpentieri, B. Lanczos-type variants of the COCR method for complex nonsymmetric linear systems. J. Comput. Phys. 2009, 228, 6376–6394. [Google Scholar] [CrossRef]
Hajarian, M. Developing BiCOR and CORS methods for coupled Sylvester-transpose and periodic Sylvester matrix equations. Appl. Math. Modell. 2015, 39, 6073–6084. [Google Scholar] [CrossRef]
Bittanti, S.; Colaneri, P. Periodic Systems: Filtering and Control; Springer: London, UK, 2009. [Google Scholar]
Hajarian, M. Convergence analysis of generalized conjugate direction method to solve general coupled Sylvester discrete-time periodic matrix equations. Int. J. Adapt. Control Signal Process. 2017, 31, 985–1002. [Google Scholar] [CrossRef]
Ma, C.; Yan, T. A finite iterative algorithm for the general discrete-time periodic Sylvester matrix equations. J. Franklin Inst. 2022, 359, 4410–4432. [Google Scholar]
Dehghani-Madiseh, M.; Dehghan, M. Generalized solution sets of the interval generalized Sylvester matrix equation $\sum_{i = 1}^{p} A_{i} X_{i} + \sum_{i = 1}^{p} A_{i} X_{i} = C$ and some approaches for inner and outer estimations. Comput. Math. Appl. 2014, 68, 1758–1774. [Google Scholar] [CrossRef]
Hajarian, M. Convergence properties of BCR method for generalized Sylvester matrix equation over generalized reflexive and anti-reflexive matrices. Linear Multilinear Algebra 2018, 66, 1975–1990. [Google Scholar]
Li, T.; Wang, Q.W.; Zhang, X.F. A modified conjugate residual method and nearest Kronecker product preconditioner for the generalized coupled Sylvester tensor equations. Mathematics 2022, 10, 1730. [Google Scholar] [CrossRef]
Varga, A. A numerically reliable approach to robust pole assignment for descriptor systems. Futur. Gener. Comp. Syst. 2012, 19, 1221–1230. [Google Scholar] [CrossRef]
Trinh, H.; Tran, T.D.; Nahavandi, S. Design of scalar functional observers of order less than (ν − 1). Int. J. Control 2006, 79, 1654–1659. [Google Scholar] [CrossRef]
Jin, L.; Yan, J.; Du, X.; Xiao, X.; Fu, D. RNN for solving time-variant generalized Sylvester equation with applications to robots and acoustic source localization. IEEE Trans. Ind. Inform. 2020, 16, 6359–6369. [Google Scholar] [CrossRef]
Kiran, N.U. Simultaneous triangularization of pseudo-differential systems. J. Pseudo-Differ. Oper. Appl. 2013, 4, 45–61. [Google Scholar]
Gu, D.; Zhang, D.; Liu, Q. Parametric control to permanent magnet synchronous motor via proportional plus integral feedback. Trans. Inst. Meas. Control 2021, 43, 925–932. [Google Scholar]
Song, W.; Jin, A. Observer-based model reference tracking control of the Markov jump system with partly unknown transition rates. Appl. Sci. 2023, 13, 914. [Google Scholar] [CrossRef]
Xu, R.; Wei, T.; Wei, Y.; Yan, H. UTV decomposition of dual matrices and its applications. Comput. Appl. Math. 2024, 43, 41. [Google Scholar] [CrossRef]
Radjavi, H.; Rosenthal, P. Simultaneous Triangularization; Springer: New York, NY, USA, 2000. [Google Scholar]
Taylor, M. Pseudo Differential Operators; Springer: Berlin/Heidelberg, Germany, 1974. [Google Scholar]
Mooring, B.W.; Roth, Z.S.; Driels, M.R. Fundamentals of Manipulator Calibration; Wiley: New York, NY, USA, 1991. [Google Scholar]
Shiu, Y.C.; Ahmad, S. Calibration of wrist-mounted robotic sensors by solving homogeneous transformation equations of the form AX = XB. IEEE Trans. Robot. Autom. 1989, 5, 16–29. [Google Scholar] [CrossRef]
Zhuang, H.; Roth, Z.S. Comments on “Calibration of wrist-mounted robotic sensors by solving homogeneous transformation equations of the form AX = XB”. IEEE Trans. Robot. Autom. 1991, 7, 877–878. [Google Scholar] [CrossRef]
Zhuang, H.; Roth, Z.S.; Sudhakar, R. Simultaneous robot/world and tool/flange calibration by solving homogeneous transformation equations of the form AX = YB. IEEE Trans. Robot. Autom. 1994, 10, 549–554. [Google Scholar] [CrossRef]
Qi, L. Standard dual quaternion optimization and its applications in hand-eye calibration and SLAM. Commun. Appl. Math. Comput. 2023, 5, 1469–1483. [Google Scholar] [CrossRef]
Li, A.; Wang, L.; Wu, D. Simultaneous robot-world and hand-eye calibration using dual-quaternions and Kronecker product. Int. J. Phys. Sci. 2010, 5, 1530–1536. [Google Scholar]
Chen, Z.; Ling, C.; Qi, L.; Yan, H. A regularization-patching dual quaternion optimization method for solving the hand-eye calibration problem. J. Optim. Theory Appl. 2024, 200, 1193–1215. [Google Scholar] [CrossRef]
Wang, X.; Huang, J.; Song, H. Simultaneous robot-world and hand-eye calibration based on a pair of dual equations. Measurement 2021, 181, 109623. [Google Scholar] [CrossRef]
Daniilidis, K. Hand-eye calibration using dual quaternions. Int. J. Robot. Res. 1999, 18, 286–298. [Google Scholar] [CrossRef]
Wang, J.; Qu, D.; Xu, F. A new hybrid calibration method for extrinsic camera parameters and hand-eye transformation. Preoceedings of the IEEE International Conference on Mechatronics and Automation, Niagara Falls, ON, Canada, 29 July–1 August 2005; Volume 4, pp. 1981–1985. [Google Scholar]
Ernst, F.; Richter, L.; Matthäus, L.; Martens, V.; Bruder, R.; Schlaefer, A.; Schweikard, A. Non-orthogonal tool/flange and robot/world calibration. Int. J. Med. Robotics Comput. Assist. Surg. 2012, 8, 407–420, Erratum in Int. J. Med. Robotics Comput. Assist. Surg. 2017, 13, e1883. [Google Scholar] [CrossRef]
Shah, M. Solving the robot-world/hand-eye calibration problem using the Kronecker product. J. Mech. Robot. 2013, 5, 031007. [Google Scholar] [CrossRef]
Wu, L.; Ren, H. Finding the kinematic base frame of a robot by hand-eye calibration using 3D position data. IEEE Trans. Autom. Sci. Eng. 2017, 14, 314–324. [Google Scholar] [CrossRef]
Tan, N.; Gu, X.; Ren, H. Simultaneous robot-world, sensor-tip, kinematics calibration of an underactuated robotic hand with soft fingers. IEEE Access 2018, 6, 22705–22715. [Google Scholar] [CrossRef]
Shah, M.; Bostelman, R.; Legowik, S.; Hong, T. Calibration of mobile manipulators using 2D positional features. Measurement 2018, 124, 322–328. [Google Scholar] [CrossRef]
Condurache, D.; Ciureanu, I.A. A novel solution for AX = YB sensor calibration problem using dual Lie algebra. In Proceedings of the 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT’19), Paris, France, 23–26 April 2019; pp. 302–307. [Google Scholar]
Wu, J.; Liu, M.; Zhu, Y.; Zou, Z.; Dai, Z.M.; Zhang, C.; Jiang, Y.; Li, C. Globally optimal symbolic hand-eye calibration. IEEE-ASME Trans. Mechatron. 2021, 26, 1369–1379. [Google Scholar] [CrossRef]
Pan, J.; Fu, Z.; Yue, H.; Lei, X.; Li, M.; Chen, X. Toward simultaneous coordinate calibrations of AX = YB problem by the LMI-SDP optimization. IEEE Trans. Autom. Sci. Eng. 2023, 20, 2445–2453. [Google Scholar] [CrossRef]
Ha, J. Probabilistic framework for hand-eye and robot-world calibration AX = YB. IEEE Trans. Robot. 2023, 39, 1196–1211. [Google Scholar] [CrossRef]
Wang, X.; Song, H. One-step solving the robot-world and hand-eye calibration based on the principle of transference. J. Mech. Robot. 2024, 17, 031014. [Google Scholar]
Xie, L.M.; Wang, Q.W. Some novel results on a classical system of matrix equations over the dual quaternion algebra. Filomat 2025, 39, 1477–1490. [Google Scholar] [CrossRef]

Figure 1. Geometry of a robotic system.

Figure 2. Encrypting and decrypting color images.

Figure 3. Two original color images and keys.

Figure 4. Two encrypted color images (generated by Algorithm 5).

Figure 5. Two decrypted color images (generated by Algorithm 6).

Figure 6. Core framework of GSE research.

Table 1. Abbreviations and their full names.

Full Name	Abbreviation
Generalized Sylvester equation	GSE
Roth’s equivalence theorem	RET
Moore–Penrose inverse	MP inverse
Singular value decomposition	SVD
Generalized singular value decomposition	GSVD
Canonical correlation decomposition	CCD
Semi-tensor product	STP
Second matrix–matrix semi-tensor product	MM-2 STP
Conjugate gradient least-squares algorithm	CGLSA
Alternating direction method	ADM
Relaxed gradient-based iterative algorithm	RGI algorithm

Table 2. SSIM values for the decrypted color images.

Color Image	SSIM
Bike 1	0.99
Bike 2	0.99

Table 3. Open problems for further research on GSE.

Number	Remark Number	Open Problem
1	Remarks 15 and 16	Cramer’s rule for GSE only through coefficient matrices (partially solved)
2	Remark 18	Solving GSE under STP (or MM-2 STP)
3	Remark 22	Solving the GSE via $L$ -representation, $C$ -representation, $L_{C}$ -representation, and vectorization properties of STP, respectively,
4	Remark 35	Discussing RET from a single common property of Euclidean domains and unit regular rings (partially solved)
5	Remark 43	Research on the fundamental properties and applications of combinations of different types of quaternions and dual numbers
6	Remark 47	Investigating GSE tensor equations under different tensor products and quaternion algebras.
7	Remark 60 (4)	Exploring converse problems for Lemmas 11 and 12 of [241]
8	Remark 63	Solving the restricted system (101)
9	Remark 64	Solving the system (104) (partially solved)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, Q.-W.; Gao, J. A Comprehensive Review on the Generalized Sylvester Equation AX − YB = C. Symmetry 2025, 17, 1686. https://doi.org/10.3390/sym17101686

AMA Style

Wang Q-W, Gao J. A Comprehensive Review on the Generalized Sylvester Equation AX − YB = C. Symmetry. 2025; 17(10):1686. https://doi.org/10.3390/sym17101686

Chicago/Turabian Style

Wang, Qing-Wen, and Jiale Gao. 2025. "A Comprehensive Review on the Generalized Sylvester Equation AX − YB = C" Symmetry 17, no. 10: 1686. https://doi.org/10.3390/sym17101686

APA Style

Wang, Q.-W., & Gao, J. (2025). A Comprehensive Review on the Generalized Sylvester Equation AX − YB = C. Symmetry, 17(10), 1686. https://doi.org/10.3390/sym17101686

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

A Comprehensive Review on the Generalized Sylvester Equation AX − YB = C

Abstract

1. Introduction

2. Preliminaries

3. Roth’s Equivalence Theorem

4. Different Methods on GSE

4.1. Method by Linear Transformations and Subspace Dimensions

4.2. Method by Generalized Inverses

4.3. Method by Singular Value Decompositions

4.4. Method by Simultaneous Decompositions

4.5. Method by Real (Complex) Representations

4.6. Method by Determinable Representations

4.7. Method by Semi-Tensor Products

5. Constrained Solutions of GSE

5.1. Chebyshev Solutions and l p -Solutions

5.2. ★-Congruent Solutions

5.3. (Minimum-Norm Least-Squares) Symmetric Solutions

5.4. Self-Adjoint and Positive (Semi)Definite Solutions

5.5. Per(Skew)Symmetric and Bi(Skew)Symmetric Solutions

5.6. Maximal and Minimal Ranks of the General Solution

5.7. Re-(Non)negative and Re-(Non)positive Definite Solutions

5.8. η -Hermitian and η -Skew-Hermitian Solutions

5.9. ϕ -Hermitian Solutions

5.10. Equality-Constrained Solutions

6. Various Generalizations of GSE

6.1. Generalizing RET over Different Rings

6.1.1. Generalizing RET over Unit Regular Rings

6.1.2. Generalizing RET over Principal Ideal Domains

6.1.3. Generalizing RET over Division and Module-Finite Rings

6.1.4. Generalizing RET over Commutative Rings

6.1.5. Generalizing RET over Artinian and Noncommutative Rings

6.2. Generalizing RET to a Rank Minimization Problem

6.3. GSE over Dual Numbers and Dual Quaternions

6.4. Linear Operator Equations on Hilbert Spaces

6.5. Tensor Equations

6.6. Polynomial Matrix Equations

6.6.1. By the Divisibility of Polynomials

6.6.2. By Skew-Prime Polynomial Matrices

6.6.3. By the Realization of Matrix Fraction Descriptions

6.6.4. By the Unilateral Polynomial Matrix Equation

6.6.5. By the Equivalence of Block Polynomial Matrices

6.6.6. By Jordan Systems of Polynomial Matrices

6.6.7. By Linear Matrix Equations

6.6.8. By Root Functions of Polynomial Matrices

6.7. Sylvester-Polynomial-Conjugate Matrix Equations

6.8. Generalized Forms of GSE

7. Iterative Algorithms

8. Applications to GSE

8.1. Theoretical Applications

8.1.1. Solvability of Matrix Equations

8.1.2. UTV Decomposition of Dual Matrices

8.1.3. Microlocal Triangularization of Pseudo-Differential Systems

8.2. Practical Applications

8.2.1. Calibration Problems

8.2.2. Encryption and Decryption Schemes for Color Images

9. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

5.1. Chebyshev Solutions and $l_{p}$ -Solutions

5.8. $η$ -Hermitian and $η$ -Skew-Hermitian Solutions

5.9. $ϕ$ -Hermitian Solutions