Abstract
We derive a double-optimal iterative algorithm (DOIA) in an m-degree matrix pencil Krylov subspace to solve a rectangular linear matrix equation. Expressing the iterative solution in a matrix pencil and using two optimization techniques, we determine the expansion coefficients explicitly by inverting an m × m positive definite matrix. The DOIA is a fast, convergent iterative algorithm. Some properties and the estimation of the residual error of the DOIA are given to prove the absolute convergence. Numerical tests demonstrate the usefulness of the double-optimal solution (DOS) and DOIA in solving square or nonsquare linear matrix equations and in inverting nonsingular square matrices. To speed up the convergence, a restarted technique with frequency m is proposed, namely, DOIA(m); it outperforms the DOIA. The pseudoinverse of a rectangular matrix can be sought using the DOIA and DOIA(m). The Moore–Penrose iterative algorithm (MPIA) and MPIA(m) based on the polynomial-type matrix pencil and the optimized hyperpower iterative algorithm OHPIA(m) are developed. They are efficient and accurate iterative methods for finding the pseudoinverse, especially the MPIA(m) and OHPIA(m).
Keywords:
linear matrix equations; matrix pencil Krylov subspace method; double-optimal iterative algorithm; Moore–Penrose pseudoinverse; restarted DOIA; optimized hyperpower method
MSC:
65F10; 65F30
1. Introduction
Since the pioneering works [,], Krylov subspace methods have been studied and developed extensively for the iterative solution of systems of linear equations [,,,,,,], such as the minimum residual algorithm [], the generalized minimal residual method (GMRES) [,], the quasi-minimal residual method [], the biconjugate gradient method [], the conjugate gradient squared method [], and the biconjugate gradient stabilized method []. For more discussion on Krylov subspace methods, one can refer to the review papers [,,] and the textbooks [,].
In this paper, we use the matrix pencil Krylov subspace method to derive some novel numerical algorithms to solve the following linear matrix equation:
where Z is an unknown matrix. We consider two possible cases:
We assume for consistency in Equation (1).
The problem of solving a matrix equation is one of the most important topics in computational mathematics and has been widely studied in many different fields. As shown by Jbilou et al. [], there are many methods for solving square linear matrix equations. Many iterative algorithms based on the block Krylov subspace method have been developed to solve matrix equations [,,]. For a multiple right-hand-side system with A being nonsingular, the block Krylov subspace methods for computing Z in Equation (1) comprise an even larger body of the literature [,,]. There are some novel algorithms for solving Equation (1) with nonsquare A and square Z in [,,,]. Having an effective and fast solution to matrix equations plays a fundamental role in numerous fields of science. To reduce the computational complexity and increase the accuracy, a more efficient method is still required.
One of the applications of Equation (1) is finding the solution to the following linear equation system:
It is noted that for the solution to Equation (4), we need to find the left inverse of A, which is denoted by . When we take , Equation (1) is used to find the right inverse of A, which is denoted by . Mathematically, for nonsingular square matrices; however, from the numerical outputs, , especially for ill-conditioned matrices.
In order to find the left inverse of A, we can solve
taking the transpose leads to
If we use the transpose of A to replace A, then the optimal solution and optimal iterative scheme developed below can be applied to seek ; inserting its transpose into Equation (4), we can obtain the solution x with
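As a rough illustration of this left-inverse route (a sketch only; NumPy's least-squares solver stands in for the optimal iterative scheme developed later, and the matrix sizes are made up for the example):

```python
import numpy as np

# Sketch of the left-inverse route for solving A x = b, assuming A has full
# column rank: the right inverse of A^T, transposed, is a left inverse of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))              # tall matrix, full column rank
b = rng.standard_normal(8)

# Solve A^T Z = I (a Case 2 form); lstsq is a stand-in for the DOIA-type solver.
Z, *_ = np.linalg.lstsq(A.T, np.eye(5), rcond=None)
A_left = Z.T                                 # left inverse: A_left @ A = I
x = A_left @ b                               # least-squares solution of A x = b

print(np.allclose(A_left @ A, np.eye(5)))                     # True
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # matches lstsq on A
```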
There are many algorithms for finding or for nonsingular matrices. Liu et al. [] modified the conjugate gradient (CG)-type method to the matrix conjugate gradient method (MCGM), which performed well in finding the inverse matrix. Other iterative schemes encompass the Newton–Schultz method [], the Chebyshev method [], the Homeier method [], the method by Petkovic and Stanimirovic [], and some cubic and higher-order convergence methods [,,,].
We can apply Equation (1) to find a pseudoinverse with dimensions of a rectangular matrix A with dimensions :
where is a q-dimensional identity matrix. This is a special Case 2 in Equation (3) with and .
Owing to their wide applications, the pseudoinverse and the Moore–Penrose inverse matrices have been investigated and computed using many methods [,,,,,,,,,,,,]. Further higher-order iterative schemes for the Moore–Penrose inverse were recently analyzed by Sayevand et al. [].
In order to compare different solvers for systems of nonlinear equations, some novel goodness and qualification criteria are defined in []. They include convergence order, number of function evaluations, number of iterations, CPU time, etc., for evaluating the quality of a proposed iterative method.
From Equation (1),
is an initial residual matrix, if an initial guess is given. Let
Then, Equation (1) is equivalent to
which is used to search a descent direction X after giving an initial residual. It follows from Equations (9) and (10) that
is a residual matrix. Below, we will develop a novel method to solve Equation (11), and hence Equation (1), if we replace B by F and Z by X. Since we cannot find the exact solutions of Equations (1) and (11) in general, the residual R in Equation (12) is not zero.
1.1. Notation
Throughout this paper, uppercase letters denote matrices, while lowercase letters denote vectors or scalars. If two matrices A and B have the same order, their inner product is defined by , where is the trace of a square matrix, and the superscript T signifies the transpose. As a consequence, the Frobenius norm of A is given by . The component of a vector a is written as , while the component of a matrix A is written as . A single subscript k in a matrix means that it is the kth matrix in a matrix pencil.
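For concreteness, a small sketch of the trace inner product and the induced Frobenius norm used throughout (the helper name `inner` is ours, not the paper's):

```python
import numpy as np

def inner(A, B):
    """Trace inner product <A, B> = tr(A^T B) of two same-sized matrices."""
    return np.trace(A.T @ B)

A = np.arange(6.0).reshape(2, 3)
B = np.ones((2, 3))
print(inner(A, B))                                     # sum of the entries of A
print(np.sqrt(inner(A, A)), np.linalg.norm(A, "fro"))  # identical Frobenius norms
```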
1.2. Main Contribution
In GMRES [], an m-dimensional matrix Krylov subspace is supposed,
The Petrov–Galerkin method searches via a perpendicular property:
The descent matrix can be achieved by minimizing the residual []:
Equation (14) is equivalent to the projection of B on , and R is perpendicular to . To seek a fast convergent iterative algorithm, we must keep the orthogonality and simultaneously maximize the projection quantity:
The main contribution of this paper is that we seek the best descent matrix X simultaneously satisfying (15) and (16). Some excellent properties, including the absolute convergence of the proposed iterative algorithm, are proven. Although the matrix Krylov subspace method in Equation (13) is not suitable for the Case 2 problem, we can treat both Case 1 and Case 2 problems of linear matrix equations in a unified manner. Now, our problem is to construct double-optimal iterative algorithms with an explicit expansion form to solve the linear matrix Equation (1), which involves inverting a low-dimensional positive definite matrix. This problem is more difficult than that of using only (15) to derive iterative algorithms of the GMRES type. For the pseudoinverse problem of a rectangular matrix, higher-order polynomial methods and hyperpower iterative algorithms are unified into the frame of a double-optimal matrix pencil method but with different matrix pencils.
1.3. Outline
We start from an m-degree matrix pencil Krylov subspace to express the solution to the linear matrix equation in Section 2; two cases of linear matrix equations are considered. In Section 3, two merit functions are optimized for the determination of the expansion coefficients. More importantly, an explicit-form double-optimal solution (DOS) to Equation (11) is created, for which we need to invert an m × m positive definite matrix. In Section 4, we propose an iterative algorithm updated using the DOS method, which provides the best descent direction used in the double-optimal iterative algorithm (DOIA). A restarted version with frequency m, namely, the DOIA(m), is proposed. The DOS can be viewed as a single-step DOIA. Linear matrix equations are solved by the DOS, DOIA, and DOIA(m) methods in Section 5 to display some advantages of the presented methodology in finding the approximate solution of Equation (1). The Moore–Penrose inverse of a rectangular matrix is addressed in Section 6, where two optimized methods are proposed based on a new matrix pencil of polynomials and a new modification of the hyperpower method. Numerical testing of the Moore–Penrose inverses of rectangular matrices is carried out in Section 7. Finally, the conclusions are drawn in Section 8.
2. The Matrix Pencil Krylov Subspace Method
This section constructs suitable bases for Case 1 and Case 2 problems of the linear matrix Equation (1). We denote the expansion space with . For Case 1,
while, for Case 2,
In , the matrices are orthonormalized by using the modified Gram–Schmidt process, such that , where : denotes the inner product of and .
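A minimal sketch of this orthonormalization step under the trace inner product (the pencil matrices themselves, which differ between Case 1 and Case 2, are generated randomly here just to exercise the routine):

```python
import numpy as np

def mgs_matrices(Us, tol=1e-14):
    """Modified Gram-Schmidt for a list of matrices w.r.t. <A, B> = tr(A^T B)."""
    Q = []
    for U in Us:
        V = np.array(U, dtype=float)
        for Qi in Q:                      # subtract components along earlier bases
            V -= np.trace(Qi.T @ V) * Qi
        nrm = np.linalg.norm(V, "fro")
        if nrm > tol:                     # drop (nearly) dependent members
            Q.append(V / nrm)
    return Q

rng = np.random.default_rng(1)
Us = [rng.standard_normal((4, 3)) for _ in range(3)]
Q = mgs_matrices(Us)
G = np.array([[np.trace(Qi.T @ Qj) for Qj in Q] for Qi in Q])
print(np.allclose(G, np.eye(len(Q))))     # Gram matrix is the identity
```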
In the space of , X in Equation (11) can be expanded as
where . The coefficients and are determined optimally from a combination of and the m matrices in an m-degree matrix pencil Krylov subspace.
Both Case 1 and Case 2 can be treated in a unified manner as shown below, but with different and . is a matrix pencil consisting of m matrices in the matrix Krylov subspace, and is the product of the pencil with vector . Because the latter is used frequently below, we give a definition of as
In Equation (21), X is spanned by , and we may write it as . Hence, we call our expansion method an m-degree matrix-pencil Krylov subspace method. It is different from the fixed affine matrix Krylov subspace . The concept of fixed and varying affine Krylov subspaces was elaborated in [].
3. Double-Optimal Solution
In this section, we seek the best descent matrix X simultaneously satisfying (15) and (16). The properties of the proposed iterative algorithm, including the orthogonality of the residual and the absolute convergence, are proven. We must emphasize that the derived iterative algorithms have explicit expansion forms to solve the linear matrix Equation (11) by inverting an m × m positive definite matrix.
3.1. Two Minimizations
To simplify the notation, let
we consider the orthogonal projection of B to Y, measured by the error matrix:
The best approximation of X in Equation (11) can be found, when minimizes
or maximizes the orthogonal projection of B to Y:
We can solve by an approximation of X from the above optimization technique, but is not exactly equal to zero, since involves the unknown matrix X, and Y is not exactly equal to B.
Let
which is the reciprocal of the merit function in Equation (26).
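Since the displayed formulas referred to above were lost in extraction, the relations below are only an assumed reconstruction consistent with the surrounding wording (the orthogonal projection of B onto Y, its error matrix, and the reciprocal merit function); the paper's exact forms may differ:

```latex
E = B - \frac{\langle B, Y\rangle}{\|Y\|^{2}}\, Y , \qquad
\|B\|^{2} = \|E\|^{2} + \frac{\langle B, Y\rangle^{2}}{\|Y\|^{2}} , \qquad
f = \frac{\|Y\|^{2}}{\langle B, Y\rangle^{2}} .
```

Under this reading, minimizing the error norm is equivalent to maximizing the projection term, and minimizing f maximizes the merit function in Equation (26).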
3.2. Two Main Theorems
The first theorem determines the expansion coefficients in Equation (21).
Theorem 1.
Proof.
With the help of Equation (29), in (32) can be written as
where the components of are given by
Taking the squared norms of Equation (29) yields
where is an matrix. We can derive
where
and is a symmetric positive definite matrix.
The minimality condition of f is
where
and are m-vectors, while with rank. Thus, the equation to determine is
We can observe from Equation (44) that is proportional to , which is supposed to be
where is a multiplier to be determined from
Then, it follows from Equations (42), (43) and (45) that
where
Inserting Equation (48) into Equations (36) and (38), we have
Inserting Equations (50) and (51) into Equation (46) yields
Cancelling on both sides, we can derive
which renders
Inserting it into Equation (48), is obtained as follows:
Let
Inserting Equation (30) for into Equation (39), and comparing it to Equation (55), we have
By inserting in Equation (48) into Equation (21) and using , we can obtain
where
Upon letting
Equation (57) can be expressed as
Now can be determined by (33). Inserting Equation (62) into (33) yields
where
is derived according to Equations (31) and (58).
To prove Theorem 2, we need the following two lemmas.
Lemma 1.
In terms of the matrix-pencil in Equation (31), we have
Proof.
By using the definition in Equation (22), the left-hand side of Equation (67) can be written as
where and denote, respectively, the jth component of v and , and Equation (37) is used in the last equality. In vector form, the last term is just the right-hand side of Equation (67). Similarly, Equation (68) can be proved in the same manner, i.e.,
This ends the proof of Lemma 1. □
Lemma 2.
In terms of the matrix-pencil in Equation (31), we have
Proof.
By using the definition in Equation (22), the left-hand side of Equation (69) can be written as
where and denote, respectively, the jth component of z and u, and Equation (55) is used in the last equality. In vector form, the last term is just the right-hand side of Equation (69). This ends the proof of Lemma 2. □
Theorem 2.
In Theorem 1, the two parameters and satisfy the following reciprocal relation:
Proof.
Taking the inner product of B with Equation (64) and applying Lemma 1 leads to
With the aid of Equation (59), we have
Then, after multiplying the above equation by and using Equation (65), we can obtain
Next, we prove that Equation (73) is equal to . From Equation (64) and with the aid of Equations (40), (58) and (69), it follows that
where was used in view of Equation (49). Then by comparing Equations (73) and (74), we have proven Equation (70). This ends the proof of . Then by Equation (60), is proven. □
3.3. Estimating Residual Error
To estimate the residual error we begin with the following lemma.
Lemma 3.
In terms of the matrix-pencil in Equation (31), we have
Proof.
By using the definition in Equations (22) and (55), the left-hand side of Equation (75) can be written as
where and denote, respectively, the jth component of u and . In vector form, the last term is just the right-hand side of Equation (75). Similarly, Equation (76) can be proved in the same manner, i.e.,
This ends the proof of Lemma 3. □
Lemma 4.
In terms of the matrix-pencil in Equation (31), we have
Proof.
By using the definition in Equation (22), we have
The repeated indices i and j above are summed automatically from 1 to m according to the Einstein summation convention []. Then, with the aid of Equation (40), we have
where was used. In vector form, the above equation leads to Equation (77). Similarly, we have
In vector form, the above two equations reduce to Equations (78) and (79). This ends the proof of Lemma 4. □
Theorem 3.
For , the residual error of the optimal solution in Equation (57) satisfies:
Proof.
According to Equation (70), we can refine X in Equation (34) to
where
Let us check the squared residual:
where
is obtained from Equation (85), in which .
By using Lemmas 1, 3 and 4, it follows from Equation (88) that
Equation (90) is easily derived by taking the inner product of B with Equation (88):
and using Lemma 1. Taking the squared norm of Equation (88), we have
by using Lemmas 3 and 4, it is simplified to
This yields Equation (89).
Then, inserting the above two equations into Equation (87), we have
Consequently, inserting Equation (86) for into the above equation yields Equation (84).
□
As a consequence, we can prove that the two merit functions and are the same.
Theorem 4.
In the optimal solution of , the values of the two merit functions are the same, i.e.,
where . Moreover, we have
3.4. Orthogonal Projection
Equation (92) indicates that and have the same value when the double-optimality conditions are achieved, in which the key Equation (70) plays a dominant role; it renders , and the equality in Equation (92) holds. The residual error is absolutely decreased in Equation (93), and Equation (84) gives the estimation of the residual error.
Theorem 5.
For , can be orthogonally decomposed into
where and are orthogonal.
Proof.
It follows from Equation (88) that
From Lemma 3, we have
and from Lemma 4, we have
Subtracting the above two equations, we have
This ends the proof. □
Furthermore, can be written as
with the aid of Equations (22) and (37). Now, we can introduce the projection operator by
which acts on B and results in . Accordingly, we can define the projection operator as
Theorem 6.
Proof.
From Equations (100) and (23), we have
Applying the projection operator to the above equation yields
We need to prove
From Equations (103) and (55), we have
On the other hand, from Equations (103), (22), (40), and (49), we have
Subtracting the above two equations, we can prove Equation (107). From Equations (103), (22), (40), (49) and (37), we have
The proof is complete. □
Corollary 1.
In Theorem 3,
hence,
Proof.
First, we need to prove
By using Equations (103) and (55), we have
which can be further written as
because is a projection operator. Thus, we have
because is also a projection operator, where I is an identity operator. In view of Equations (84) and (112), the result in Equation (113) is straightforward. □
Remark 1.
For the Case 1 problem, the conventional generalized minimal residual method (GMRES) [,] only takes the minimization in (15) into account; therefore, in the GMRES. To accelerate the convergence of the iterative algorithm, the maximal projection mentioned in (16) must also be considered. Therefore, in the developed iterative algorithm DOIA, we consider both optimization problems in (15) and (16) simultaneously, and and are derived explicitly.
4. Double-Optimal Iterative Algorithm
According to Theorem 1, the double-optimal iterative algorithm (DOIA) to solve Equation (1) reads as follows (Algorithm 1).
| Algorithm 1 DOIA |
1: Select m and give an initial value of 2: Do 3: (Case 1), (Case 2) 4: (Case 1), (Case 2) 5: (orthonormalization) 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: Enddo, if |
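Because the explicit coefficient formulas in the algorithm steps were lost in extraction, the sketch below shows only the single-optimization (GMRES-type) counterpart of one inner step: minimizing the Frobenius norm of R0 minus the pencil combination reduces to an m × m symmetric positive definite (Gram) system, which is the size of matrix the DOIA also inverts; the DOIA additionally enforces the maximal-projection condition, which is not reproduced here.

```python
import numpy as np

def inner(A, B):
    return np.trace(A.T @ B)                 # trace inner product

def minimal_residual_step(A, R0, Us):
    """One GMRES-type step over a matrix pencil: X = sum_k c_k U_k minimizing
    ||R0 - A X||_F. The coefficients come from an m x m SPD Gram system."""
    V = [A @ U for U in Us]                  # images of the basis matrices
    m = len(Us)
    G = np.array([[inner(V[i], V[j]) for j in range(m)] for i in range(m)])
    r = np.array([inner(V[i], R0) for i in range(m)])
    c = np.linalg.solve(G, r)                # G is SPD when the V[i] are independent
    X = sum(ci * Ui for ci, Ui in zip(c, Us))
    return X, R0 - A @ X                     # descent matrix and new residual

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
R0 = rng.standard_normal((6, 4))
Us = [rng.standard_normal((6, 4)) for _ in range(3)]
X, R = minimal_residual_step(A, R0, Us)
print(np.linalg.norm(R, "fro") < np.linalg.norm(R0, "fro"))   # residual decreased
```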
4.1. Crucial Properties of DOIA
In this section, we prove the crucial properties of the DOIA, including the absolute convergence and the orthogonality of the residual.
Corollary 2.
In the DOIA, Theorem 4 guarantees that the residual is decreased step-by-step, i.e.,
Proof.
Corollary 3.
In the DOIA, the convergence rate is given by
where θ is the intersection angle between and .
Proof.
Corollary 4.
In the DOIA, the residual vector is A-orthogonal to the descent direction , i.e.,
Proof.
The DOIA can provide a good approximation of Equation (11) with a better descent direction in the matrix pencil of the matrix Krylov subspace. Under this situation, we can prove the following corollary, which guarantees that the present algorithm quickly converges to the true solution.
Corollary 5.
In the DOIA, two consecutive residual matrices and are orthogonal by
Proof.
From the last equation in the DOIA, we have
Taking the inner product with and using Lemma 2 yields
which can be rearranged into Equation (128). □
4.2. Restarted DOIA(m)
In the DOIA, we fix the dimension m of the Krylov subspace. An alternative version of the DOIA can perform well by varying m in the range ; like GMRES(m), it is named DOIA(m), where m is the frequency of the restart (Algorithm 2).
| Algorithm 2 DOIA(m) |
1: Select and , and give 2: Do 3: Do , 4: (Case 1), (Case 2) 5: (Case 1), (Case 2) 6: (orthonormalization) 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: Enddo of (3), if 18: Otherwise, , go to (2) |
5. Numerical Examples
To demonstrate the efficiency and accuracy of the presented iterative algorithms DOIA and DOIA(m), several examples were examined. All the numerical computations were carried out with Fortran 77 in Microsoft Developer Studio on an Intel Core i7-3770 CPU (2.80 GHz) with 8 GB of memory. The precision was .
5.1. Example 1
We solve Equation (1) with
Matrix F can be computed by inserting the above matrices into Equation (1).
This problem belongs to Case 1. When we apply the iterative algorithm in Section 4 to solve this problem with and , we fix , and the convergence criterion is . As shown in Figure 1a, the DOIA converges very fast in only three steps. In order to compare the numerical results with the exact solution, the matrix elements are vectorized along each row, and the number of components is given sequentially. As shown in Figure 1b, the numerical and exact solutions are almost coincident, with the numerical error shown in Figure 1c. The maximum error (ME) is , and the residual error is .
Figure 1.
For Example 1 solved by the DOIA: (a) residual, (b) comparison of numerical and exact solutions, and (c) numerical error; (b,c) with the same x axis.
When the problem dimension is raised to and , the original DOIA does not converge within 100 steps. However, using the restarted DOIA(m) with and , it requires three steps under , obtaining a highly accurate solution with ME = and the error of . The CPU time is 1.06 s.
Table 1 compares the results obtained by the DOIA(m) for different n in Equations (131) and (132): for , we take , and ; for , we take , and .
Table 1.
For Example 1, comparing ME, iteration number (IN), and CPU time obtained by DOIA(m) for different n.
5.2. Example 2
In Equation (1), we consider a cyclic matrix A. The first row is given by , where . The algorithm is given as follows (Algorithm 3).
| Algorithm 3 For cyclic matrix |
1: Give N 2: Do 3: Do 4: If , then ; otherwise 5: 6: If , then 7: Enddo of j 8: Enddo of i |
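Because the entries of Algorithm 3 were garbled in extraction, the sketch below assumes the standard circulant structure, in which each row of S is a cyclic shift of the first row; the first-row values used here are illustrative and not the paper's.

```python
import numpy as np

def cyclic_matrix(first_row):
    """Build a circulant matrix whose i-th row is the first row cyclically shifted by i."""
    c = np.asarray(first_row, dtype=float)
    n = c.size
    return np.array([np.roll(c, i) for i in range(n)])

S = cyclic_matrix([1.0, 2.0, 3.0, 4.0])   # illustrative first row, not the paper's
A = S[:3, :]                              # e.g., take the first q = 3 rows for a nonsquare A
print(S)
```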
The nonsquare matrix A is obtained by taking the first q rows from S if or the first n columns from S if . Matrix Z in Equation (1) is given by
where we fix , , and . Matrix F can be computed by inserting the above matrices into Equation (1).
This problem belongs to Case 2. When we apply the iterative algorithm in Section 4 to find the solution, we fix , and the convergence criterion is . As shown in Figure 2a, the DOIA converges very fast in 47 steps. As shown in Figure 2b, the numerical and exact solutions are almost coincident, with the numerical error shown in Figure 2c. The ME is . Then, the DOIA(m), with and , obtains an ME = in 43 steps.
Figure 2.
For Example 2 solved by the DOIA: (a) residual, (b) comparison of numerical and exact solutions, and (c) numerical error; (b,c) with the same x axis.
We consider the inverse matrix obtained in Case 2 with and . Let X be the inverse matrix of A; the Newton–Schultz iterative method is
For the initial value with , the Newton–Schultz iterative method can converge very quickly.
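A minimal sketch of the Newton–Schultz iteration for a square nonsingular A (the displayed formula was lost above; the standard form is X <- X(2I - AX), and the convergent initial value assumed here is X0 = A^T divided by the squared spectral norm of A):

```python
import numpy as np

def newton_schultz(A, tol=1e-12, max_iter=200):
    """Newton-Schultz iteration X <- X (2I - A X) for approximating A^{-1}."""
    n = A.shape[0]
    X = A.T / np.linalg.norm(A, 2) ** 2      # X0 = A^T / sigma_max(A)^2
    I = np.eye(n)
    for k in range(max_iter):
        X = X @ (2.0 * I - A @ X)
        if np.linalg.norm(I - A @ X, "fro") < tol:
            break
    return X, k + 1

A = np.array([[4.0, 1.0], [2.0, 3.0]])       # small illustrative matrix (ours)
X, iters = newton_schultz(A)
print(iters, np.allclose(X, np.linalg.inv(A)))
```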
We consider the cyclic matrix constructed by Algorithm 3 and apply the DOIA(m) to find with the same initial value . Table 2 compares the and the iteration number (IN) obtained by the DOIA(m) and the Newton–Schultz iterative method (NSIM) with . The DOIA(m) can find a more accurate inverse matrix with a lower IN.
Table 2.
For Example 2, (, IN) obtained by DOIA(m) and the Newton–Schultz iterative method (NSIM) for different n.
For , the CPU time of the DOIA(m) is 0.41 s. Even when n is raised to , the DOIA(m), with and , is still applicable, with obtained in 149 steps and a CPU time of 6.85 s.
5.3. Example 3
We consider
We first compute and then the solution of the corresponding linear system. This problem belongs to Case 1.
We take and apply the DOS to find the inverse matrix of A by setting in Equation (11). The values of obtained from the DOS with the accuracy to find the inverse matrix has two orders. The residual error is . In the solution to Equation (4), when we compare the numerical solution with the exact solution , we find that the maximum error is . Since the DOS is just a single-step DOIA, to improve the accuracy, we can apply the DOIA to solve the considered problem.
When we apply the DOIA to solve this problem with , the accuracy can be raised to , while in the solution to Equation (4), ME = . In Figure 3a, we show the residual error obtained by the DOIA, which is convergent in 67 steps under ; Figure 3b compares the errors of the elements of the inverse matrix, using the components of . It can be seen that the DOIA is very effective and is much more accurate than the DOS method; the DOS is a single-step DOIA without iteration.
Figure 3.
For Example 3 solved by the DOS and DOIA, showing (a) residual and (b) comparison of the error of the inverse matrix.
For the DOIA(m) with and , it converges in 28 steps under ; the accuracy can be raised to , and ME = . The CPU time of the DOIA(m) is 0.33 s, which converges faster than the DOIA.
To test the stability of the DOIA, we consider random noise with intensity s to disturb the coefficients in Equation (133). We take the same values of the parameters in the DOIA. Table 3 compares , ME, and iteration number (IN). Essentially, the noise does not influence the accuracy of ; however, ME and IN worsen when s increases.
Table 3.
For Example 3, , ME, and IN obtained by DOIA for different noise s values.
5.4. Example 4
By testing the performance of the DOS on the solution to the linear equation system, we consider the following convex quadratic programming problem with an equality constraint:
where P is an matrix, C is an matrix, and is an vector, which means that Equation (135) provides linear constraints. According to Lagrange theory, we need to solve the linear system (4) with the following b and A:
For a definite solution, we take and with
The dimension of A is , where . This problem belongs to Case 1. When we take and employ the DOS, and meet . With , the solution is at with ; it is almost an exact solution obtained without any iteration, because the DOS can be viewed as a single-step DOIA.
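Since the explicit blocks of b and A were lost above, the following is only a generic sketch of assembling the saddle-point (KKT) system for an equality-constrained convex quadratic program, assuming the standard form min (1/2) x^T P x - f^T x subject to C x = d (the symbols f and d are ours):

```python
import numpy as np

def kkt_system(P, f, C, d):
    """Assemble the KKT system [[P, C^T], [C, 0]] [x; lam] = [f; d] for
    min 1/2 x^T P x - f^T x  s.t.  C x = d  (standard Lagrange conditions)."""
    p = C.shape[0]
    A = np.block([[P, C.T], [C, np.zeros((p, p))]])
    b = np.concatenate([f, d])
    return A, b

P = np.diag([2.0, 3.0, 4.0])                  # illustrative data (ours)
f = np.array([1.0, 1.0, 1.0])
C = np.array([[1.0, 1.0, 1.0]])
d = np.array([1.0])
A, b = kkt_system(P, f, C, d)
sol = np.linalg.solve(A, b)
x, lam = sol[:3], sol[3:]
print(np.allclose(C @ x, d))                  # constraint satisfied
```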
5.5. Example 5: Solving Ill-Conditioned Hilbert Matrix Equation
The Hilbert matrix is highly ill-conditioned:
We consider an exact solution with , and is given by
It is known that the Hilbert matrix is highly ill-conditioned. For , the condition number is , and, for , the condition number is . It can be proved that the asymptotic behavior of the condition number of the Hilbert matrix is []
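A quick sketch constructing the Hilbert matrix, whose (i, j) entry is 1/(i + j - 1) in 1-based indexing, and checking its rapidly growing condition number:

```python
import numpy as np

def hilbert(n):
    """Hilbert matrix H[i, j] = 1 / (i + j + 1) with 0-based indices."""
    i, j = np.indices((n, n))
    return 1.0 / (i + j + 1.0)

for n in (5, 10, 15):
    print(n, np.linalg.cond(hilbert(n)))   # grows roughly exponentially in n
```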
This problem belongs to Case 1. We solve this problem by using the DOS. For , we let m run from 2 to 9 and pick the best m with the minimum error of
In Figure 4a, we plot the above residual with respect to m, where we can observe that the value of is almost equal to 1, as shown in Figure 4b. We take , and the maximum error of x is .
Figure 4.
For Example 5 solved by the DOS, showing (a) residual, (b) , and (c) error of ; (a,b) the same x axis.
Below, we solve the inverse of a five-dimensional Hilbert matrix with by using the DOS method, where we take . It is interesting that is also symmetric. The residual error is . However, when we apply the DOIA to solve this problem with , the accuracy of can be raised to . In Figure 5a, we show the residual error obtained by the DOIA, which does not converge within 100 steps under . Figure 5b compares the error of the elements of the inverse matrix by using the components of . Due to the highly ill-conditioned nature of the Hilbert matrix, the results obtained by the DOS and DOIA are quite different. ME = is obtained for .
Figure 5.
For Example 5 solved by the DOS, DOIA, and DOIA(m), showing (a) residuals, (b) comparing error of inverse matrix.
For the DOIA(m) with and , it converges in 62 steps under ; the accuracy can be raised to , and ME = . Obviously, the DOIA(m) can significantly improve the convergence speed and accuracy.
Then, we compare the numerical results with those obtained with the QR method. For , we apply the DOIA to solve this problem with , and the accuracy is , while that obtained with the QR method is . In Figure 6a, we compare the solutions of linear system (4), where the exact solution is supposed to be . Whereas the DOIA provides quite accurate solutions with ME = , the QR method fails, as the diagonal elements of matrix R are very small for the ill-conditioned Hilbert matrix. The errors of the inverse matrix obtained with these two methods are comparable, as shown in Figure 6b.
Figure 6.
For Example 5 solved by the DOIA, DOIA(m), and QR, comparing errors of (a) , and (b) the inverse matrix.
The DOIA(m) with and does not converge within 100 steps under ; however, the accuracy can be raised to ME = .
We consider the Hilbert matrix in Equation (136) and apply the DOIA(m) to find with an initial value . Because the Newton–Schultz iterative method (NSIM) converges slowly, we give the upper bound IN = 500. Table 4 compares the and IN obtained with the DOIA(m) with and the NSIM. The DOIA(m) can find an accurate inverse matrix in fewer iterations.
Table 4.
For Example 5, the and IN obtained by DOIA(m) and the Newton–Schultz iterative method (NSIM) for different n.
5.6. Example 6
In this example, we apply the DOS with to solve a six-dimensional matrix Equation (11) with the coefficient matrix A given in Equation (133), and the solution is the Hilbert matrix given in Equation (136) with . First, we find the right inverse of A from by using the DOS, and then the solution of the matrix equation is given by . The numerical solution of the matrix elements with , which is obtained by arranging the matrix elements from left to right and top to bottom with consecutive Arabic numbers, is compared with the exact one in Figure 7a; the numerical error is shown in Figure 7b, which is smaller than 0.008. The residual error is 0.0157, which is quite accurate.
Figure 7.
For Example 6 solved by the DOS: (a) comparing numerical and exact solutions, (b) numerical error of matrix elements; (a,b) with the same x axis.
For the DOIA(m) with and , it converges in 28 steps under ; the accuracy rises to , and ME = . The convergence can be accelerated to 14 steps when we take and ; the accuracy is slightly decreased to ME = and .
5.7. Example 7
Consider the mixed boundary value problem of the Laplace equation:
The method of fundamental solutions is taken as
where
We consider
This problem is a special Case 2 with . We take and . ME = is obtained in 174 steps by the DOIA(m) with and . The CPU time is 0.65 s. It can also be treated as a special Case 1 with . We take . ME = is obtained within 300 steps by the DOIA(m) with and . The CPU time is 0.62 s. To improve the accuracy, we can develop the vector form of the DOIA(m), which is, however, another issue not reported here.
According to [], we apply the GMRES(m) with to solve this problem. ME = is obtained in 2250 steps, and the CPU time is 2.19 s.
6. Pseudoinverse of Rectangular Matrix
The Moore–Penrose pseudoinverse of A, denoted as , is the most famous generalized inverse of a rectangular matrix, satisfying the following Penrose equations:
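The four Penrose conditions are A X A = A, X A X = X, (A X)^T = A X, and (X A)^T = X A; below is a sketch of the residual norms of these conditions that the later tables report (the helper name `penrose_errors` and the test matrix are ours):

```python
import numpy as np

def penrose_errors(A, X):
    """Frobenius-norm residuals of the four Penrose equations for a candidate X."""
    return (np.linalg.norm(A @ X @ A - A, "fro"),
            np.linalg.norm(X @ A @ X - X, "fro"),
            np.linalg.norm((A @ X).T - A @ X, "fro"),
            np.linalg.norm((X @ A).T - X @ A, "fro"))

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])                   # rank-deficient 2 x 3 example
print(penrose_errors(A, np.linalg.pinv(A)))       # all four errors near machine zero
```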
6.1. A New Matrix Pencil
We rewrite Equation (8) as
upon giving an initial value , we attempt to seek the next step solution X with
where . The new matrix pencil with degree m is given by
which consists of an m-degree polynomial of . The modified Gram–Schmidt process is employed to orthogonalize and normalize .
The iterative form of Equation (142) is
Notice that it includes several iterative algorithms as special cases: the Newton–Schultz method [], the Chebyshev method [], the Homeier method [], the PS method (PSM) by Petkovic and Stanimirovic [], and the KKRJ []:
In Equation (144), two optimization methods are used to determine the coefficients .
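For reference, the second- and third-order fixed-coefficient members of this family (the Newton–Schultz method and the Chebyshev-type hyperpower iteration) can be written as in the sketch below; the MPIA instead leaves the coefficients free and optimizes them at every step. The sketch assumes the standard initial value X0 = A^T divided by the squared spectral norm of A:

```python
import numpy as np

def pinv_iterations(A, order=3, steps=60):
    """Fixed-coefficient iterations for A^+: order 2 is Newton-Schultz,
    order 3 is the Chebyshev-type hyperpower method."""
    X = A.T / np.linalg.norm(A, 2) ** 2          # X0 = A^T / sigma_max(A)^2
    I = np.eye(A.shape[0])
    for _ in range(steps):
        AX = A @ X
        if order == 2:
            X = X @ (2.0 * I - AX)                   # X (2I - AX)
        else:
            X = X @ (3.0 * I - AX @ (3.0 * I - AX))  # X (3I - AX (3I - AX))
    return X

A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])   # 3 x 2 full-column-rank example (ours)
print(np.allclose(pinv_iterations(A), np.linalg.pinv(A)))
```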
The results in Theorem 1 are also applicable to problem (141). Hence, we propose the following iterative algorithm, namely, the Moore–Penrose iterative algorithm (MPIA).
The MPIA is a new algorithm based on the newly developed DOIA in Section 4; it uses the new matrix pencil in Equation (143) with . The motivation for developing the MPIA using in the matrix pencil is that we can generalize the methods in Equations (145)–(149) to an mth-order method and provide a theoretical foundation for determining the expansion coefficients from the double-optimization technique (Algorithm 4).
| Algorithm 4 MPIA |
1: Select m and give 2: Do 3: 4: (orthonormalization) 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: Enddo, if |
The restarted MPIA(m) can be constructed in the same way as the DOIA(m).
6.2. Optimized Hyperpower Method
Pan et al. [] proposed the following hyperpower method:
also refer to [,]. Equation (150) includes the Newton–Schultz method [], and the Chebyshev method [] as special cases.
Let , and Equation (150) is rewritten as
we generalize it to
Equation (152) is used to find the optimized descent matrix X in the current-step residual equation:
Two optimization methods are used to determine the coefficients . The results in Theorem 1 are also applicable to problem (153). Hence, we propose the following iterative algorithm, namely, the optimized hyperpower iterative algorithm (OHPIA(m)); we take the restarted technique into consideration. Since the matrix pencil consists of the powers of the residual matrices, the orthonormalization for is not suggested (Algorithm 5).
| Algorithm 5 OHPIA(m) |
1: Select and , and give 2: Do 3: Do , 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: Enddo of (3), if 18: Otherwise, , go to (2) |
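For comparison, the classical (non-optimized) hyperpower iteration of order p described by Equation (150) fixes all coefficients to 1: with R = I - A X, it updates X <- X (I + R + R^2 + ... + R^(p-1)). A sketch, again with the assumed initial value X0 = A^T divided by the squared spectral norm of A:

```python
import numpy as np

def hyperpower(A, p=3, steps=40):
    """Classical order-p hyperpower iteration for the Moore-Penrose inverse."""
    X = A.T / np.linalg.norm(A, 2) ** 2
    I = np.eye(A.shape[0])
    for _ in range(steps):
        R = I - A @ X                      # current residual matrix
        S, Rk = I.copy(), I.copy()
        for _ in range(p - 1):             # S = I + R + R^2 + ... + R^(p-1)
            Rk = Rk @ R
            S = S + Rk
        X = X @ S
    return X

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # 3 x 2 example matrix (ours)
print(np.allclose(hyperpower(A, p=3), np.linalg.pinv(A)))
```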
7. Numerical Testing of Rectangular Matrix
7.1. Example 8
We find the Moore–Penrose inverse of the following rank-deficient matrix []:
This problem belongs to Case 2. We apply the iterative method of the DOIA to find the Moore–Penrose inverse, which has an exact solution:
Under the same convergence criterion as used by Xia et al. [], the iteration process of the DOIA with converges very fast in three steps. By comparison to Equation (154), ME = is obtained.
In Table 5, we compare the DOIA ( and ) with other numerical methods specified by Xia et al. [], Petkovic and Stanimirovic [], and Kansal et al. [], to assess the performance of the DOIA, measured using the four numerical errors of the Penrose equations (137)–(140) and the iteration number (IN).
Table 5.
Computed results of a rank-deficient matrix in Example 8.
Recently, Kansal et al. [] proposed the iterative scheme in Equation (149). For this problem, we take . When we take , it overflows; hence, we take .
In the above, Alg. is the abbreviation for Algorithm, IP represents the initial point, and IN is the iteration number. The first two algorithms were reported by Xia et al. []. It can be seen that the DOIA converges much faster and is more accurate than the other algorithms.
In the computation of the pseudoinverse, the last two singular values close to zero can be prevented from appearing in the matrix if m is small enough. For example, we take in Table 5, such that can be computed accurately without the near-zero singular values appearing in the denominator and enlarging the rounding error.
7.2. Example 9: Inverting the Ill-Conditioned Rectangular Hilbert Matrix
We want to find the Moore–Penrose inverse of the Hilbert matrix:
This problem belongs to Case 2 and is more difficult than the previous example. Here, we fix , . The numerical errors of the Penrose Equations (137)–(140) are compared in Table 6. In the DOIA(m) and OHPIA(m), we take , and . The DOIA(m) and OHPIA(m) converge faster than the methods in [,].
Table 6.
Computed results of the Hilbert matrix with , in Example 9.
Next, we find the Moore–Penrose inverse of the Hilbert matrix with and . For the MPIA, ; for the MPIA(m). The method in [] does not converge with ; it is not applicable to finding the inverse of a highly ill-conditioned Hilbert matrix with and (Table 7). Like the KKRJ, the OHPIA(m) is weak for highly ill-conditioned Hilbert matrices; the errors of and rise to the order of .
Table 7.
Computed results of the Hilbert matrix with and .
7.3. Example 10
We find the Moore–Penrose inverse of the cyclic matrix in Example 2, as shown in Table 8, where we take and , for MPIA, and for MPIA(m) and OHPIA(m). It is remarkable that the OHPIA(m) is much better than the other methods.
Table 8.
Computed results of the cyclic matrix with and in Example 10.
7.4. Example 11
We find the Moore–Penrose inverse of a full-rank matrix given by but with . We take and for the MPIA(m); is taken for the OHPIA(m). The method in Equation (145) is denoted as the NSM. Table 9 compares the errors and IN of the different methods; the MPIA(m) outperforms the other methods, and the OHPIA(m) is better still.
Table 9.
Computed results of a full-rank matrix in Example 11 with and .
7.5. Example 12
We find the Moore–Penrose inverse of a randomly generated real matrix of size with . We take and for the MPIA(m). Table 10 compares the errors and IN of the different methods.
Table 10.
Computed results of a random matrix in Example 12 with and .
8. Conclusions
In the m-degree matrix pencil Krylov subspace, an explicit solution (34) of the linear matrix equation was obtained by optimizing the two merit functions in (32) and (33). Then, we derived an optimal iterative algorithm, the DOIA, to solve square or nonsquare linear matrix equation systems. The iterative method DOIA possesses an A-orthogonality property and absolute convergence, and it has good computational efficiency and accuracy in solving linear matrix equations. The restarted version DOIA(m) is proven to speed up the convergence. The Moore–Penrose pseudoinverses of rectangular matrices were also derived by using the DOIA, DOIA(m), MPIA, and MPIA(m). The proposed polynomial pencil method includes the Newton–Schultz method, the Chebyshev method, the Homeier method, and the KKRJ method as special cases; importantly, in the proposed iterative MPIA and MPIA(m), the coefficients are optimized using two minimization techniques. We also proposed a new modification of the hyperpower method, namely, the optimized hyperpower iterative algorithm OHPIA(m), which, through two optimizations, becomes the most powerful iterative algorithm for quickly computing the Moore–Penrose pseudoinverses of rectangular matrices. However, the OHPIA(m), like the KKRJ, is weak in inverting ill-conditioned rectangular matrices.
The idea of varying the affine matrix Krylov subspace is novel for finding a better iterative solution to linear matrix equations based on a dual optimization. The limitations are that several matrix multiplications are needed to construct the projection operator P in the matrix Krylov subspace, and the computational cost is high for an inversion of the matrix with dimension m.
Author Contributions
Conceptualization, C.-S.L. and C.-W.C.; Methodology, C.-S.L. and C.-W.C.; Software, C.-S.L., C.-W.C. and C.-L.K.; Validation, C.-S.L., C.-W.C. and C.-L.K.; Formal analysis, C.-S.L. and C.-W.C.; Investigation, C.-S.L., C.-W.C. and C.-L.K.; Resources, C.-S.L. and C.-W.C.; Data curation, C.-S.L., C.-W.C. and C.-L.K.; Writing—original draft, C.-S.L. and C.-W.C.; Writing—review & editing, C.-W.C.; Visualization, C.-S.L., C.-W.C. and C.-L.K.; Supervision, C.-S.L. and C.-W.C.; Project administration, C.-W.C.; Funding acquisition, C.-W.C. All authors have read and agreed to the published version of the manuscript.
Funding
This work was financially supported by the National Science and Technology Council [grant numbers: NSTC 112-2221-E-239-022].
Data Availability Statement
The data presented in this study are available on request from the corresponding authors.
Acknowledgments
The authors would like to express their thanks to the reviewers, who supplied feedback to improve the quality of this paper.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Hestenes, M.R.; Stiefel, E.L. Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 1952, 49, 409–436.
- Lanczos, C. Solution of systems of linear equations by minimized iterations. J. Res. Nat. Bur. Stand. 1952, 49, 33–53.
- Liu, C.S. An optimal multi-vector iterative algorithm in a Krylov subspace for solving the ill-posed linear inverse problems. CMC Comput. Mater. Contin. 2013, 33, 175–198.
- Dongarra, J.; Sullivan, F. Guest editors’ introduction to the top 10 algorithms. Comput. Sci. Eng. 2000, 2, 22–23.
- Simoncini, V.; Szyld, D.B. Recent computational developments in Krylov subspace methods for linear systems. Numer. Linear Algebra Appl. 2007, 14, 1–59.
- Saad, Y.; Schultz, M.H. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1986, 7, 856–869.
- Saad, Y. Krylov subspace methods for solving large unsymmetric linear systems. Math. Comput. 1981, 37, 105–126.
- Freund, R.W.; Nachtigal, N.M. QMR: A quasi-minimal residual method for non-Hermitian linear systems. Numer. Math. 1991, 60, 315–339.
- van Den Eshof, J.; Sleijpen, G.L.G. Inexact Krylov subspace methods for linear systems. SIAM J. Matrix Anal. Appl. 2004, 26, 125–153.
- Paige, C.C.; Saunders, M.A. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 1975, 12, 617–629.
- Fletcher, R. Conjugate gradient methods for indefinite systems. In Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 1976; Volume 506, pp. 73–89.
- Sonneveld, P. CGS: A fast Lanczos-type solver for nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1989, 10, 36–52.
- van der Vorst, H.A. Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1992, 13, 631–644.
- Saad, Y.; van der Vorst, H.A. Iterative solution of linear systems in the 20th century. J. Comput. Appl. Math. 2000, 123, 1–33.
- Bouyghf, F.; Messaoudi, A.; Sadok, H. A unified approach to Krylov subspace methods for solving linear systems. Numer. Algorithms 2024, 96, 305–332.
- Saad, Y. Iterative Methods for Sparse Linear Systems, 2nd ed.; SIAM: Philadelphia, PA, USA, 2003.
- van der Vorst, H.A. Iterative Krylov Methods for Large Linear Systems; Cambridge University Press: New York, NY, USA, 2003.
- Jbilou, K.; Messaoudi, A.; Sadok, H. Global FOM and GMRES algorithms for matrix equations. Appl. Numer. Math. 1999, 31, 49–63.
- El Guennouni, A.; Jbilou, K.; Riquet, A.J. Block Krylov subspace methods for solving large Sylvester equations. Numer. Algorithms 2002, 29, 75–96.
- Frommer, A.; Lund, K.; Szyld, D.B. Block Krylov subspace methods for functions of matrices. Electron. Trans. Numer. Anal. 2017, 47, 100–126.
- Frommer, A.; Lund, K.; Szyld, D.B. Block Krylov subspace methods for functions of matrices II: Modified block FOM. SIAM J. Matrix Anal. Appl. 2020, 41, 804–837.
- El Guennouni, A.; Jbilou, K.; Riquet, A.J. The block Lanczos method for linear systems with multiple right-hand sides. Appl. Numer. Math. 2004, 51, 243–256.
- Kubínová, M.; Soodhalter, K.M. Admissible and attainable convergence behavior of block Arnoldi and GMRES. SIAM J. Matrix Anal. Appl. 2020, 41, 464–486.
- Lund, K. Adaptively restarted block Krylov subspace methods with low-synchronization skeletons. Numer. Algorithms 2023, 93, 731–764.
- Konghua, G.; Hu, X.; Zhang, L. A new iteration method for the matrix equation AX = B. Appl. Math. Comput. 2007, 187, 1434–1441.
- Meng, C.; Hu, X.; Zhang, L. The skew-symmetric orthogonal solutions of the matrix equation AX = B. Linear Algebra Appl. 2005, 402, 303–318.
- Peng, Z.; Hu, X. The reflexive and anti-reflexive solutions of the matrix equation AX = B. Linear Algebra Appl. 2003, 375, 147–155.
- Zhang, J.C.; Zhou, S.Z.; Hu, X. The (P,Q) generalized reflexive and anti-reflexive solutions of the matrix equation AX = B. Appl. Math. Comput. 2009, 209, 254–258.
- Liu, C.S.; Hong, H.K.; Atluri, S.N. Novel algorithms based on the conjugate gradient method for inverting ill-conditioned matrices, and a new regularization method to solve ill-posed linear systems. Comput. Model. Eng. Sci. 2010, 60, 279–308.
- Higham, N.J. Functions of Matrices: Theory and Computation; SIAM: Philadelphia, PA, USA, 2008.
- Amat, S.; Ezquerro, J.A.; Hernandez-Veron, M.A. Approximation of inverse operators by a new family of high-order iterative methods. Numer. Linear Algebra Appl. 2014, 21, 629.
- Homeier, H.H.H. On Newton-type methods with cubic convergence. J. Comput. Appl. Math. 2005, 176, 425–432.
- Petkovic, M.D.; Stanimirovic, P.S. Iterative method for computing Moore–Penrose inverse based on Penrose equations. J. Comput. Appl. Math. 2011, 235, 1604–1613.
- Dehdezi, E.K.; Karimi, S. GIBS: A general and efficient iterative method for computing the approximate inverse and Moore–Penrose inverse of sparse matrices based on the Schultz iterative method with applications. Linear Multilinear Algebra 2023, 71, 1905–1921.
- Cordero, A.; Soto-Quiros, P.; Torregrosa, J.R. A general class of arbitrary order iterative methods for computing generalized inverses. Appl. Math. Comput. 2021, 409, 126381.
- Kansal, M.; Kaur, M.; Rani, L.; Jantschi, L. A cubic class of iterative procedures for finding the generalized inverses. Mathematics 2023, 11, 3031.
- Cordero, A.; Segura, E.; Torregrosa, J.R.; Vassileva, M.P. Inverse matrix estimations by iterative methods with weight functions and their stability analysis. Appl. Math. Lett. 2024, 155, 109122.
- Petkovic, M.D.; Stanimirovic, P.S. Two improvements of the iterative method for computing Moore–Penrose inverse based on Penrose equations. J. Comput. Appl. Math. 2014, 267, 61–71.
- Katsikis, V.N.; Pappas, D.; Petralias, A. An improved method for the computation of the Moore–Penrose inverse matrix. Appl. Math. Comput. 2011, 217, 9828–9834.
- Stanimirovic, I.; Tasic, M. Computation of generalized inverse by using the LDL* decomposition. Appl. Math. Lett. 2012, 25, 526–531.
- Sheng, X.; Wang, T. An iterative method to compute Moore–Penrose inverse based on gradient maximal convergence rate. Filomat 2013, 27, 1269–1276.
- Toutounian, F.; Ataei, A. A new method for computing Moore–Penrose inverse matrices. J. Comput. Appl. Math. 2009, 228, 412–417.
- Soleimani, F.; Stanimirovic, P.S.; Soleymani, F. Some matrix iterations for computing generalized inverses and balancing chemical equations. Algorithms 2015, 8, 982–998.
- Baksalary, O.M.; Trenkler, G. The Moore–Penrose inverse: A hundred years on a frontline of physics research. Eur. Phys. J. 2021, 46, 9.
- Pavlikova, S.; Sevcovic, D. On the Moore–Penrose pseudo-inversion of block symmetric matrices and its application in the graph theory. Linear Algebra Appl. 2023, 673, 280–303.
- Sayevand, K.; Pourdarvish, A.; Machado, J.A.T.; Erfanifar, R. On the calculation of the Moore–Penrose and Drazin inverses: Application to fractional calculus. Mathematics 2021, 9, 2501.
- AL-Obaidi, R.H.; Darvishi, M.T. A comparative study on qualification criteria of nonlinear solvers with introducing some new ones. J. Math. 2022, 2022, 4327913.
- Liu, C.S.; Kuo, C.L.; Chang, C.W. Solving least-squares problems via a double-optimal algorithm and a variant of Karush–Kuhn–Tucker equation for over-determined system. Algorithms 2024, 17, 211.
- Einstein, A. The foundation of the general theory of relativity. Ann. Phys. 1916, 49, 769–822.
- Todd, J. The condition of finite segments of the Hilbert matrix. In The Solution of Systems of Linear Equations and the Determination of Eigenvalues; Taussky, O., Ed.; National Bureau of Standards: Applied Mathematics Series; National Bureau of Standards: Gaithersburg, MD, USA, 1954; Volume 39, pp. 109–116.
- Pan, Y.; Soleymani, F.; Zhao, L. An efficient computation of generalized inverse of a matrix. Appl. Math. Comput. 2018, 316, 89–101.
- Climent, J.J.; Thome, N.; Wei, Y. A geometrical approach on generalized inverses by Neumann-type series. Linear Algebra Appl. 2001, 332–334, 533–540.
- Soleymani, F.; Stanimirovic, P.S.; Haghani, F.K. On hyperpower family of iterations for computing outer inverses possessing high efficiencies. Linear Algebra Appl. 2015, 484, 477–495.
- Xia, Y.; Chen, T.; Shan, J. A novel iterative method for computing generalized inverse. Neural Comput. 2014, 26, 449–465.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).