Abstract
Consider the computation of the solution for a class of discrete-time algebraic Riccati equations (DAREs) with a low-ranked coefficient matrix G and a high-ranked constant matrix H. A structured doubling algorithm is proposed for large-scale problems in which A is also of low rank. Compared to the existing doubling algorithm, whose flop count at the k-th iteration grows exponentially with k, the newly developed version merely needs a fixed amount of work for preprocessing plus a negligible amount per iteration, and is better suited to large-scale computations when the rank of A is far smaller than n. The convergence and complexity of the algorithm are subsequently analyzed. Illustrative numerical experiments indicate that the presented algorithm, which consists of a dominant, time-consuming preprocessing step and a trivially cheap iterative step, is capable of computing the solution efficiently for large-scale DAREs.
1. Introduction
Consider a discrete-time control system

$$x_{k+1} = A x_k + B u_k, \quad k = 0, 1, \ldots,$$

where $A \in \mathbb{C}^{n \times n}$ and $B \in \mathbb{C}^{n \times m}$ with $m \ll n$. Here, $\mathbb{C}^{p \times q}$ stands for the set of $p \times q$ complex matrices. The linear quadratic regulator (LQR) control minimizes the energy, or the cost functional,

$$\mathcal{J}(u) = \sum_{k=0}^{\infty} \left( x_k^* H x_k + u_k^* R u_k \right),$$

with the Hermitian constant term H being positive semi-definite and R being Hermitian positive definite []. Here, the symbol "*" denotes the conjugate transpose of a vector or a matrix.
The corresponding optimal control is

$$u_k = -F x_k,$$

and the feedback gain matrix

$$F = (R + B^* X B)^{-1} B^* X A$$

can then be expressed in terms of the unique positive semi-definite stabilizing solution X of the discrete-time algebraic Riccati equation (DARE) []

$$X = A^* X (I + G X)^{-1} A + H, \tag{1}$$

where $G = B R^{-1} B^*$ with R Hermitian positive definite, and H is Hermitian and positive semi-definite. In many control problems, the matrix A is sparse in the sense that the matrix-vector product $Av$ and the inverse-vector product $A^{-1}v$ each require only $O(n)$ flops. Recent applications of the discrete-time control system can be found in [], such as the wheeled robot and the airborne pursuer. There are also some applications (e.g., the singular Kalman filter) of the fractional Riccati equation; see [,] and the references therein.
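As a small illustration of the formulas above, the sketch below forms the gain F from a solution X of (1); the dimensions and the placeholder matrices (including X itself) are illustrative data, not taken from any application in the paper.

```matlab
% Minimal sketch: form the LQR feedback gain from a solution X of (1).
% All data below are placeholders for illustration only.
n = 4; m = 2;
A = randn(n); B = randn(n, m); R = eye(m);
X = eye(n);                           % stands in for the true stabilizing solution
F = (R + B' * X * B) \ (B' * X * A);  % gain of the optimal control u_k = -F*x_k
```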
The existence of the unique positive semi-definite solution X of the DARE (1) has been well studied when $(A, B)$ is d-stabilizable and $(A, H)$ is observable; see [,] and their references for more details. The structure-preserving doubling algorithm (SDA) is one of the most efficient methods [] to compute the unique positive semi-definite solution X via the following iteration

$$
\begin{aligned}
A_{k+1} &= A_k (I + G_k H_k)^{-1} A_k, \\
G_{k+1} &= G_k + A_k (I + G_k H_k)^{-1} G_k A_k^*, \\
H_{k+1} &= H_k + A_k^* H_k (I + G_k H_k)^{-1} A_k,
\end{aligned} \tag{2}
$$

with $A_0 = A$, $G_0 = G$, $H_0 = H$. Regardless of the structure of the coefficient matrices, the computational complexity of each iteration is about $O(n^3)$ flops, obviously not fitting for large-scale problems. When the constant matrix H is low-ranked, the solution X is commonly numerically low-ranked and can be approximated in terms of a series of decomposed matrix factors, making the SDA feasible for large-scale DAREs []. If only the feedback gain matrix F is required, without outputting the solution X, an adaptive version of the SDA in [] still works for large-scale problems even if H is high-ranked. In that case, the solution X is no longer numerically low-ranked but can be stored as a sequence of matrix-vector products []. In both situations, however, the flop count of the SDA at the k-th iteration grows exponentially with k (roughly doubling at each step), resulting in intolerable iteration time when k is large.
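For reference, a minimal dense implementation of the recursion (2) might look as follows. This is the textbook $O(n^3)$-per-step scheme discussed above, not the structured Algorithm 1 developed later; the function name, tolerance, and stopping rule are our own illustrative choices.

```matlab
% A minimal dense SDA sketch for the DARE X = A'*X*(I + G*X)^(-1)*A + H.
% This exhibits the O(n^3)-per-iteration cost of (2); names are illustrative.
function [X, Y, k] = sda_dense(A, G, H, tol, kmax)
  I  = eye(size(A, 1));
  Ak = A; Gk = G; Hk = H;                 % A_0 = A, G_0 = G, H_0 = H
  for k = 1:kmax
    W    = I + Gk*Hk;                     % n-by-n kernel: the O(n^3) bottleneck
    WA   = W \ Ak;                        % (I + Gk*Hk)^{-1} * Ak
    Hnew = Hk + Ak' * (Hk * WA);          % H_{k+1}, converging to X
    Gk   = Gk + Ak * (W \ (Gk * Ak'));    % G_{k+1}, converging to the dual solution Y
    Ak   = Ak * WA;                       % A_{k+1}, converging to zero
    if norm(Hnew - Hk, 'fro') <= tol * norm(Hnew, 'fro')
      Hk = Hnew; break;                   % quadratic convergence: few iterations
    end
    Hk = Hnew;
  end
  X = Hk; Y = Gk;
end
```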
In this paper, we consider DAREs with A of the low-ranked structure (which may not be sparse)

$$A = U V^*, \tag{3}$$

with $U, V \in \mathbb{C}^{n \times r}$ and $r \ll n$. The motivation behind this is twofold: the complexity of the SDA at the k-th iteration can be reduced further in this case, and DAREs with the structure (3) have several applications in circuit-control areas, for example, circuit systems whose mesh inductance matrices are composed of products of several mesh matrices (n is the number of meshes) and whose resistance matrix is S []. To obtain the optimal feedback gain to control the circuit system, one is required to find the solution of the DARE (1).
The main contribution we make under the low-ranked structure (3) is that the computational cost of the SDA at the k-th iteration is reduced to operations on small-scale kernels only, far less than the cost of the preprocessing when $r \ll n$. As a result, the most time-consuming part of the SDA lies in the preprocessing step, whose computational cost is fixed, while the remaining iteration part is correspondingly insignificant. Numerical experiments are implemented to validate the effectiveness of the presented algorithm, constituting a useful complement to the solvers for computing the solution of DAREs.
The rest of the paper is organized as follows. In Section 2, we develop the structured SDA for DAREs with a low-ranked A and establish its convergence. A detailed complexity analysis, as well as the design of the termination criterion, is given in Section 3. Section 4 is devoted to numerical experiments indicating the efficiency of the proposed algorithm, and the conclusion is drawn in the last section.
Notation. Symbols $\mathbb{R}^{m \times n}$ and $\mathbb{C}^{m \times n}$ in this paper stand for the sets of $m \times n$ real and complex matrices, respectively. $I_n$ is the $n \times n$ identity matrix. For a matrix A, $\sigma(A)$ and $\rho(A)$ denote, respectively, the spectrum and the spectral radius of A. A Hermitian matrix satisfies $A > 0$ ($A \geq 0$) when all its eigenvalues are positive (non-negative). Additionally, $A > B$ ($A \geq B$) if and only if $A - B > 0$ ($A - B \geq 0$).
We also need the concept of the numerically low-ranked matrix.
Definition 1.
([]) A matrix A is said to be numerically low-ranked with respect to a tolerance ϵ if the number of its singular values exceeding ϵ is bounded by a constant associated with ϵ but independent of the size of A.
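A numerical-rank check in the spirit of Definition 1 simply counts the singular values above a tolerance; in the sketch below, the function name `numrank` and the relative scaling of the threshold are our own illustrative choices.

```matlab
% A possible numerical-rank check in the spirit of Definition 1: count the
% singular values above eps_tol (relative to the largest one). The name
% 'numrank' and the relative thresholding are illustrative choices.
function r = numrank(A, eps_tol)
  s = svd(A);                       % singular values, largest first
  r = sum(s > eps_tol * max(s));    % numerical rank w.r.t. eps_tol
end
```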
2. Structured Doubling Algorithm
In this section, we describe the structured iteration scheme for DAREs with a high-ranked constant term H and a low-ranked A as in (3). To avoid the inversion of large-scale matrices, the Sherman–Morrison–Woodbury formula (SMWF) [,] is first applied to the sparse-plus-low-ranked matrices to represent the corresponding structured matrices. Then, we aim at preserving the sparsity or the low-ranked structure of the iteration sequence rather than forming it explicitly. As a result, the SDA can be implemented using only some small-scale matrices, referred to as kernels, and the complexity of the iteration becomes negligible compared with that of the preprocessing step for large-scale problems. A generic sketch of the kind of SMWF-based solve meant here is given below.
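The following sketch solves a system with a matrix D plus a rank-r correction $UV^*$ via the SMWF; it assumes D admits cheap solves (e.g., sparse or diagonal), and all names are illustrative rather than the paper's kernel notation.

```matlab
% Generic SMWF sketch: solve (D + U*V') x = b without forming the n-by-n sum,
% assuming solves with D are cheap (e.g., D sparse or diagonal). All names
% are illustrative, not the paper's kernel notation.
function x = smw_solve(D, U, V, b)
  Db = D \ b;                        % cheap solve with the structured part
  DU = D \ U;                        % n-by-r block of solves
  K  = eye(size(U, 2)) + V' * DU;    % small r-by-r kernel
  x  = Db - DU * (K \ (V' * Db));    % Sherman-Morrison-Woodbury correction
end
```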
2.1. Iteration Scheme
Given the initial kernel matrices assembled from U, V, S, B, and H in the preprocessing step, the SDA is organized according to the factored format (4), in which every update acts only on small-scale kernels, for $k = 0, 1, \ldots$. One merit of the scheme (4) is that the sizes of certain kernels remain invariant during the iterations. Although the number of columns of the low-rank factors and the sizes of the associated kernels increase linearly with respect to k, the growth is generally small due to the fast convergence of the SDA. The iterate $H_k$ then still hopefully maintains a numerically low-ranked structure relative to the constant term and can be derived and stored in an economical way; a small sketch of this kind of factored storage follows.
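To make the factored storage concrete, the sketch below grows a tall factor by column concatenation and a kernel block-diagonally, consistent with the linear-in-k growth described above; the variable names and placeholder data are ours, and the exact scheme (4) differs in its details.

```matlab
% Factored-storage sketch consistent with the linear-in-k growth described
% above: the tall factor gains columns and the kernel grows block-diagonally.
% All names and data are illustrative; the actual scheme (4) differs in detail.
n = 2000; r = 5;
H    = sprandsym(n, 0.01);              % high-ranked constant term (placeholder)
Zk   = randn(n, r);  Kk   = eye(r);     % current factors of the low-rank part
Znew = randn(n, r);  Knew = eye(r);     % increment from one doubling step
Zk = [Zk, Znew];                        % columns grow linearly with k
Kk = blkdiag(Kk, Knew);                 % kernel grows accordingly
Hk_times = @(v) H*v + Zk*(Kk*(Zk'*v));  % apply the iterate without forming it
```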
By applying the Sherman–Morrison–Woodbury formula (SMWF) [] to the kernel matrix $I + G_k H_k$ in (2), the required inverse can be expressed through small-scale kernels, as displayed in (6).
The main computational tasks of (6) are the updates of the kernels appearing in it and the solutions of two linear systems associated with the small-scale kernel matrix. Regardless of the concrete structure, a direct implementation of these calculations still involves the large dimension n [,]. A deeper observation made here shows that such computations can be brought further down to a cost that no longer grows with n, far less than that of the preprocessing for large-scale problems with $r \ll n$. In fact, it follows from (6) that the quantity in (7) admits a factored representation, and analogous factored representations hold for the other iterates. Furthermore, the update of the kernel matrix in (8) involves only small-scale factors, so its size is independent of n. Now, suppose that the required large-scale products of the factors U, V, B, and H are made available in the preprocessing step; then the quantity in (7) does not require additional computations. Additionally, the remaining iterates can be obtained by updating several small-scale matrix multiplications and replicating them the appropriate number of times (the factors from the last iteration are assumed to be available). Consequently, the computation left lies in solving two small-scale linear systems. We summarize the whole process in Algorithm 1 below; the concrete complexity analysis in the next section shows that each iteration involves only small-scale kernels and its cost is negligible.
Remark 1.
The output matrices are numerically low-ranked with respect to the tolerance ϵ; the limit kernel is the matrix arising from the convergence of the kernel sequence established in the next subsection.
Remark 2.
The QR decomposition computed in the preprocessing step serves the derivation of the relative residual and, therefore, can also be implemented there. The computational cost of the preprocessing part is dominated by several large-scale matrix multiplications and this economic QR decomposition, and it takes the dominant share of the CPU time compared with the iteration part.
Remark 3.
The computations of the iteration part and of the relative residual of the DARE involve only small-scale kernels, costing much less than the preprocessing part when $r \ll n$. Hence, the main computation of Algorithm 1 is concentrated in the preprocessing part.
2.2. Convergence
To establish the convergence of Algorithm 1, we first review some results for iteration format (2).
Algorithm 1. Structured SDA for DAREs.

Input: the low-rank factors U and V of A in (3), S, B, H, and prescribed tolerances;
Output: the factored approximations of the solutions of the DARE (1) and its dual (9), and the normalized relative residual NRRes;
Preprocess: compute the required large-scale matrix products and the economic QR decomposition used for the residual evaluation;
Iteration: set the initial kernels from the input data;
For k = 0, 1, 2, …, do until convergence:
    Compute the relative residual NRRes as in (11);
    If NRRes is below the prescribed tolerance, set the output factors and exit;
    End If
    Update the small-scale kernels of the iteration scheme;
    Obtain the quantity in (8) with the preprocessed matrices;
    Set k ← k + 1;
End Do
Theorem 1.
([]) Let X and Y be the Hermitian positive semi-definite solutions of the DARE (1) and of its dual equation

$$Y = A Y (I + H Y)^{-1} A^* + G, \tag{9}$$

respectively, and set $S = (I + GX)^{-1}A$. Then the sequences generated by (2) satisfy

$$\limsup_{k \to \infty} \|A_k\|^{1/2^k} \le \rho(S), \qquad \limsup_{k \to \infty} \|H_k - X\|^{1/2^k} \le \rho(S)^2, \qquad \limsup_{k \to \infty} \|G_k - Y\|^{1/2^k} \le \rho(S)^2. \tag{10}$$

It follows from (10) that the sequences $A_k$, $H_k$, and $G_k$ converge quadratically to zero, X, and Y, respectively, provided $\rho(S) < 1$. By noting the factored decomposition of $A_k$ maintained in Algorithm 1, the associated kernel sequence must converge to zero. On the other hand, the factored decomposition of $H_k$ implies that its kernel sequence converges to some limit matrix through which the solution X of the DARE is recovered. At last, the factored decomposition of $G_k$ indicates that the solution Y of the dual DARE has a numerically low-ranked decomposition with respect to a sufficiently small tolerance ϵ. So, we have the following corollary.
Corollary 1.
Suppose that X and Y are the Hermitian and positive semi-definite solutions of the DARE (1) and its dual form (9), respectively. Then, for Algorithm 1, the sequence $A_k$ converges to the zero matrix quadratically, and $H_k$ converges quadratically to the solution X. Moreover, for sufficiently large k, the iterate $G_k$ is numerically low-ranked with respect to the tolerance ϵ. That is, the solution Y of the dual Equation (9) has a low-ranked factored approximation whose factors have sizes associated with ϵ but independent of the size of Y.
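A standard way to keep such factored approximations compact is to recompress them. The sketch below truncates a Hermitian product $Z K Z^*$ to its numerically significant part via an economic QR and a small eigendecomposition; the function and variable names are illustrative.

```matlab
% Recompress a factorized Hermitian product Z*K*Z' to its numerical rank,
% a standard truncation step consistent with Corollary 1. Names illustrative.
function [Zr, Kr] = recompress(Z, K, eps_tol)
  [Q, R] = qr(Z, 0);                       % economic QR of the tall factor
  M = R * K * R';                          % small core, Z*K*Z' = Q*M*Q'
  [V, D] = eig((M + M') / 2);              % Hermitian eigendecomposition
  [d, p] = sort(abs(diag(D)), 'descend');  % order modes by magnitude
  idx = p(d > eps_tol * max(d));           % keep numerically significant modes
  Zr = Q * V(:, idx);  Kr = D(idx, idx);   % truncated factors
end
```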
3. Computational Issues
3.1. Residual and Stop Criterion
Recalling the low-ranked structures of G and A, the residual of the DARE at an approximate solution $\widetilde{X}$ is

$$\mathcal{R}(\widetilde{X}) = \widetilde{X} - A^* \widetilde{X} (I + G \widetilde{X})^{-1} A - H,$$

which, by the factored form of $\widetilde{X}$ delivered by Algorithm 1, can be assembled from a tall factor and a small kernel.
Let the economic QR decomposition of the tall residual factor, derived from the preprocessing step, be available. The matrix norm of the residual can then be evaluated through the small triangular factor alone, and Algorithm 1 can be terminated by the normalized relative residual NRRes in (11), whose numerator and denominator are both computed from small-scale kernels.
Note that the calculation of NRRes only involves several matrix operations on small-scale kernels, requiring far fewer flops than the preprocessing step when $r \ll n$.
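For moderate n, the residual can be sanity-checked densely as below. The denominator here is a generic normalization of our own choosing; the paper's NRRes (11) is instead evaluated from the preprocessed small-scale QR kernels.

```matlab
% Dense sanity check of a normalized DARE residual at an approximate solution
% Xt. The denominator below is a generic illustrative choice; the paper's
% NRRes (11) uses its preprocessed small-scale QR kernels instead.
function nrr = nrres_dense(A, G, H, Xt)
  n = size(A, 1);
  T = (eye(n) + G*Xt) \ A;             % (I + G*Xt)^{-1} * A
  R = Xt - A' * (Xt * T) - H;          % residual of the DARE (1)
  nrr = norm(R, 'fro') / (norm(Xt, 'fro') + norm(H, 'fro'));
end
```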
3.2. Complexity Analysis
The main flops of Algorithm 1 come from the preprocessing step of forming the required large-scale matrix products and of QR decomposing the tall residual factor with Householder transformations [,]. Table 1 lists the details, where only the Q-factor is stored in orthonormal form, satisfying $Q^*Q = I$.
Table 1.
Complexity and memory of the preprocessing step in Algorithm 1.
It is seen from Table 1 that both the computation and the storage of the preprocessing step are governed by terms involving the large dimension n when $r \ll n$. We subsequently analyze the complexity of the iteration part. Assume that an LU decomposition is employed for solving the small-scale linear systems. The flops and the memory of the kth iteration are summarized in Table 2 below.
Table 2.
Complexity and memory at kth iteration in Algorithm 1.
Table 2 shows that the complexity of the kth iteration in Algorithm 1 involves only small-scale kernels and is far less than that of the preprocessing step when $r \ll n$. Thus, the dominant computational cost of Algorithm 1 is located at the preprocessing step; nevertheless, that cost is still far less than the exponentially increasing complexity of the algorithms in [,] when k grows large.
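The LU-based solves just mentioned amortize well because one factorization of the small kernel serves every right-hand side; a sketch with placeholder data (the kernel K and vectors b1, b2 are illustrative):

```matlab
% One LU factorization of the small kernel serves all subsequent solves,
% as assumed in the iteration-cost estimates. K, b1, b2 are placeholders.
s = 60;  K = randn(s) + s*eye(s);  b1 = randn(s, 1);  b2 = randn(s, 1);
[Lf, Uf, P] = lu(K);           % O(s^3) once for the small kernel
x1 = Uf \ (Lf \ (P * b1));     % each additional solve costs only O(s^2)
x2 = Uf \ (Lf \ (P * b2));
```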
4. Numerical Experiments
In this section, we show the effectiveness of Algorithm 1 in calculating the solution X of the large-scale DARE (1). The code was programmed in Matlab 2014a [], and all computations were carried out on a ThinkPad notebook with a 2.4 GHz Intel i5-6200 CPU and 8 GB of memory. The stopping criterion is the NRRes in (11) with a proper tolerance ϵ. To show where the dominant computations in Algorithm 1 are located, we record the ratio of the iteration time to the total time as the percentage

$$\delta = \frac{\text{TIME-I}}{\text{TIME-P} + \text{TIME-I}} \times 100\%,$$

where "TIME-P" represents the preprocessing time elapsed for forming the matrices associated with n, and "TIME-I" stands for the CPU time consumed by the iterations.
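A timing harness for this ratio might look as follows; here `preprocess_step` and `iterate_step` are hypothetical placeholders for the two phases of Algorithm 1, not functions defined in the paper.

```matlab
% Timing-harness sketch for the reported ratio delta. The two functions are
% hypothetical placeholders for the phases of Algorithm 1, not a real API.
tic; kernels = preprocess_step(U, V, S, B, H);   timeP = toc;   % TIME-P
tic; [Xfac, nrres] = iterate_step(kernels, tol); timeI = toc;   % TIME-I
delta = 100 * timeI / (timeP + timeI);   % percentage of time spent iterating
```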
Example 1.
The first example is devised to measure the actual error between the true solution X and the approximate solution computed by Algorithm 1. The coefficient matrices are constructed from suitably scaled vectors of ones so that the true solution of the DARE is available in closed form, expressed through the root of an associated scalar equation. The principle of selecting the above vectors and matrices is the convenient construction of the true solution of the DARE, so that the error between the computed approximate solution and the true solution can be evaluated directly.
We consider medium scales, including n = 3000 and 5000, to test the accuracy of Algorithm 1, which is terminated when the NRRes is less than the prescribed tolerance. Numerical experiments show that Algorithm 1 always takes three iterations to obtain the approximate solution for all tested dimensions n. The obtained results on NRRes and the errors are listed in Table 3.
Table 3.
Residual and actual errors in Example 1.
It is seen from the table that Algorithm 1 is efficient in calculating the solution of the DARE. In fact, for different dimensions, the actual error between the computed solution and the solution X is less than the prescribed accuracy after three iterations, and the derived relative residual reaches an even lower level. In particular, the value of δ gradually decreases as the scale n rises, indicating that the CPU time for the iterations takes only a small part of the whole for large-scale problems.
Example 2.
Randomly generate the low-rank factor matrices and define the coefficient matrices of the DARE (1) from them, so that the solution of the DARE is known explicitly. As in Example 1, the principle of selecting the above matrices is the convenience of evaluating the error.
We test the error between the true solution and the computed solution for three problem dimensions. The obtained results, together with the NRRes, are plotted in Figure 1, Figure 2 and Figure 3. Still, δ represents the ratio of the iteration time to the total time.
Figure 1.
History of NRRes and Error for the smallest tested dimension in Example 2.
Figure 2.
History of NRRes and Error for the middle tested dimension in Example 2.
Figure 3.
History of NRRes and Error for the largest tested dimension in Example 2.
Figure 1, Figure 2 and Figure 3 show that, as the number of iterations increases, the NRRes and the errors decrease exponentially, and Algorithm 1 terminates at the 6th iteration. In all experiments, the preprocessing time for the three cases varied from 0.1 to 0.2 s, while the iterative time only took from 0.0032 to 0.0035 s, a small part of the whole CPU time. More experiments also indicated that the ratio δ became smaller as the scale of the problem increased.
Example 3.
This example comes from a proper modification of the circuits arising from the magneto-quasistatic Maxwell equations ([,]). The matrix S represents the DC resistance matrix of each current filament (see Figure 4), and the low-rank factors of A are associated with the mesh matrices. We randomly generate the mesh-related factor matrices and define the coefficient matrices of the DARE accordingly.
Figure 4.
The structure of the DC resistance matrix of each current filament.
The tolerance ϵ is prescribed as in the previous examples, and a sequence of increasing dimensions n is tested.
For all cases in our experiments, Algorithm 1 was observed to attain a relative residual level below the tolerance at the 4th iteration. The elapsed CPU time and the ratio δ are plotted in Figure 5, where "TIME-P" and "TIME-I" record the CPU time for the preprocessing and for the iteration, respectively. One can see from the figure that, as the scale n rises, the preprocessing time becomes more dominant (about 112 s at the largest tested dimension), while the iteration time remains almost unchanged (about 3.5 s for all n). The gradually decreasing ratio δ also illustrates that the main computation of Algorithm 1 when solving large-scale problems lies in the preprocessing step, whose cost is much less than the exponentially increasing one of the algorithms in [,].
Figure 5.
Preprocessing time (TIME-P), iteration time (TIME-I), and the ratio δ for different dimensions in Example 3.
5. Conclusions
We have proposed an efficient algorithm for solving large-scale DAREs with low-ranked matrices A and G and a high-ranked matrix H. Compared with the SDA of exponentially increasing per-iteration complexity in [,], the newly developed algorithm only requires a preprocessing step of fixed cost and an iteration step involving small-scale kernels alone. For large-scale problems with $r \ll n$, the main computations of the whole algorithm lie in the preprocessing step, with several matrix multiplications and an economic QR decomposition, while the elapsed CPU time for the iteration part is trivial. Numerical experiments validate the effectiveness of the proposed algorithm. For future work, we may investigate the possibility of the SDA for solving large-scale DAREs with a sparse-plus-low-rank structure in A, where the possible difficulty might be understanding the concrete structure of the iterative matrices and how to compute and store them efficiently.
Author Contributions
Conceptualization, B.Y.; methodology, N.D.; software, C.J.; validation, B.Y.; formal analysis, N.D. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported in part by the NSF of China (11801163), the NSF of Hunan Province (2021JJ50032, 2023JJ50040), and the Key Foundation of the Educational Department of Hunan Province (20A150).
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Athans, M.; Falb, P.L. Optimal Control: An Introduction to the Theory and Its Applications; McGraw-Hill: New York, NY, USA, 1966. [Google Scholar]
- Lancaster, P.; Rodman, L. Algebraic Riccati Equations; Clarendon Press: Oxford, UK, 1999. [Google Scholar]
- Rabbath, C.A.; Léchevin, N. Discrete-Time Control System Design with Applications; Springer Science and Business Media: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
- Nosrati, K.; Shafiee, M. On the convergence and stability of fractional singular Kalman filter and Riccati equation. J. Frankl. Inst. 2020, 357, 7188–7210. [Google Scholar] [CrossRef]
- Trujillo, J.J.; Ungureanu, V.M. Optimal control of discrete-time linear fractional-order systems with multiplicative noise. Int. J. Control. 2018, 91, 57–69. [Google Scholar] [CrossRef]
- Chu, E.K.-W.; Fan, H.-Y.; Lin, W.-W. A structure-preserving doubling algorithm for continuous-time algebraic Riccati equations. Linear Algebra Appl. 2005, 396, 55–80. [Google Scholar] [CrossRef]
- Chu, E.K.-W.; Fan, H.-Y.; Lin, W.-W.; Wang, C.-S. A structure-preserving doubling algorithm for periodic discrete-time algebraic Riccati equations. Int. J. Control 2004, 77, 767–788. [Google Scholar] [CrossRef]
- Chu, E.K.-W.; Weng, P.C.-Y. Large-scale discrete-time algebraic Riccati equations—Doubling algorithm and error analysis. J. Comput. Appl. Math. 2015, 277, 115–126. [Google Scholar] [CrossRef]
- Yu, B.; Fan, H.-Y.; Chu, E.K.-W. Large-scale algebraic Riccati equations with high-rank constant terms. J. Comput. Appl. Math. 2019, 361, 130–143. [Google Scholar] [CrossRef]
- Kamon, M.; Wang, F.; White, J. Generating nearly optimally compact models from Krylov-subspace based reduced order models. IEEE Trans. Circuits Syst. II Analog Digit. Signal Process. 2000, 47, 239–248. [Google Scholar] [CrossRef]
- Golub, G.H.; Van Loan, C.F. Matrix Computations, 3rd ed.; Johns Hopkins University Press: Baltimore, MD, USA, 1996. [Google Scholar]
- Yu, B.; Li, D.-H.; Dong, N. Low memory and low complexity iterative schemes for a nonsymmetric algebraic Riccati equation arising from transport theory. J. Comput. Appl. Math. 2013, 250, 175–189. [Google Scholar] [CrossRef]
- Lin, W.-W.; Xu, S.-F. Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations. SIAM J. Matrix Anal. Appl. 2006, 28, 26–39. [Google Scholar] [CrossRef]
- Bhatia, R. Matrix Analysis, Graduate Texts in Mathematics; Springer: Berlin/Heidelberg, Germany, 1997. [Google Scholar]
- Higham, N.J. Functions of Matrices: Theory and Computation; SIAM: Philadelphia, PA, USA, 2008. [Google Scholar]
- Higham, D.J.; Higham, N.J. MATLAB Guide; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2016. [Google Scholar]
- Silveira, L.M.; Kamon, M.; Elfadel, I.; White, J. A coordinate-transformed Arnoldi algorithm for generating guaranteed stable reduced order models of RLC circuits. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, USA, 10–14 November 1996; pp. 288–294. [Google Scholar]
- Odabasioglu, A.; Celik, M.; Pileggi, L.T. PRIMA: Passive Reduced-order Interconnect Macromodeling Algorithm. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 1998, 17, 645–654. [Google Scholar] [CrossRef]