1. Introduction
Consider the fractional system [1,2], defined in terms of a fractional derivative of a prescribed order. If the fractional derivative is approximated by the Grünwald–Letnikov rule [3], the system (1) is equivalent to a discrete-time linear system (2) with appropriately defined coefficient matrices. The corresponding optimal control and the feedback gain can be expressed in terms of the unique positive semidefinite stabilizing solution of the discrete-time algebraic Riccati Equation (DARE) (3).
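For orientation, with the shorthand common in the SDA literature (for instance, G = B R⁻¹ Bᵀ in the standard LQR setting), a DARE of this type can be written as
$$X = A^{\top} X (I + G X)^{-1} A + H,$$
where G is the coefficient of the non-linear term and H the constant term; this generic form is only a reminder of the setting, the precise coefficients being those appearing in (3).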
There have been numerous methods, including classical and state-of-the-art techniques, developed over the past few decades to solve this equation in a numerically stable manner; see [4,5,6,7,8,9,10,11,12,13,14,15] and the references therein for more details.
In many large-scale control problems, the matrix G in the non-linear term and the matrix H in the constant term are of low rank. In that case, the unique stabilizing solution of the DARE (3) or of its dual equation can be approximated numerically by a low-rank matrix [16,17]. However, when the constant term H in the DARE has a high-rank structure, the stabilizing solution is no longer numerically low-rank, which makes storing and outputting it difficult. To address this issue, an adapted version of the doubling algorithm, named SDA_h, was proposed in [18]. The main idea behind SDA_h is to exploit the numerical low rank of the stabilizing solution of the dual equation to estimate the residual of the original DARE. In this way, SDA_h can efficiently evaluate the residual and output the feedback gain. An interesting question up to now is whether the doubling approach can be extended to large-scale DAREs in which both the non-linear term and the constant term are of high rank.
The main difficulty in this case lies in the fact that the stabilizing solutions of both the DARE (3) and its dual equation are not of low rank, which makes the direct application of the SDA difficult for large-scale problems, especially the estimation of residuals and the realization of a termination criterion. This paper attempts to overcome this obstacle. Rather than answering the above question completely, we consider the DARE (3) with the banded-plus-low-rank structure (4), in which the banded part is a banded matrix, the accompanying factors are low-rank matrices, and the kernel matrix is of small size. The assumption (4) is not necessary when G and H are of low rank; in that case, A is allowed to be any (sparse) matrix. We also assume that the high-rank non-linear term and the constant term are of the form (5), where the banded parts are positive semidefinite banded matrices and the kernel matrices are symmetric (the low-rank contributions might be zero). In addition, we assume that the banded matrices involved are all banded matrices with banded inverses (BMBI), a structure that has applications in power systems [19,20,21]. See also [22,23,24,25,26,27,28,29] and the references therein for other applications.
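For orientation, the kind of structure assumed in (4) and (5) can be summarized schematically as follows (the symbols below are generic placeholders rather than the paper's exact notation):
$$A = A_b + E\,S\,F^{\top},\qquad G = G_b + G_l K_G G_l^{\top},\qquad H = H_b + H_l K_H H_l^{\top},$$
with $A_b$, $G_b$, $H_b$ banded ($G_b, H_b \succeq 0$), $E$, $F$, $G_l$, $H_l$ tall low-rank factors, and $S$, $K_G$, $K_H$ small kernel matrices, $K_G$ and $K_H$ symmetric.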
The main contributions in this paper are:
Although a hierarchical (e.g., HODLR) structure [30,31] could be employed to run the SDA for large-scale DAREs with both high-rank H and G, this is, to our knowledge, the first work to develop the SDA in a factorized form (FSDA) to deal with such DAREs.
The structure of the FSDA iterative sequence is explicitly revealed to consist of two parts: a banded part and a low-rank part. The banded part can iterate independently, while the low-rank part relies heavily on products of the banded part with the low-rank part.
A deflation process for the low-rank factors is proposed to reduce the number of columns of the low-rank part. The conventional truncation and compression of [17,18], applied to the whole low-rank factor, does not work here, as it destroys the implicit structure and makes the subsequent deflation infeasible. Instead, a partial truncation and compression (PTC) technique is devised that acts only on the exponentially growing part (after deflation), effectively slimming the dimensions of the low-rank factors.
The termination criterion of the FSDA consists of two parts. The residual of the banded part is examined first as a pre-termination test, and only when it is small enough is the full termination criterion involving the low-rank factors evaluated. In this way, the complexity of the otherwise time-consuming detection of the termination condition is reduced.
The research in this field is also motivated by other applications, such as the finite element method (FEM). In FEM, the matrices resulting from discretizing the underlying equations exhibit a sparse and structured pattern [32,33]. By capitalizing on these properties, iterative methods designed for such matrices can significantly enhance computational efficiency, minimize memory usage, and lead to quicker solutions of large-scale problems.
The whole paper is organized as follows. Section 2 describes the FSDA for DAREs (3) with high-rank non-linear and constant terms. The deflation process for the low-rank factors and kernels is given in Section 3. Section 4 dwells on the technique of PTC to slim the dimensions of the low-rank factors and kernels. The way to compute the residual, as well as the concrete implementation of the FSDA, is described in Section 5. Numerical experiments are presented in Section 6 to show the effectiveness of the FSDA.
Notation 1. I denotes the identity matrix (of appropriate order). For a square matrix A, its spectral radius is denoted by ρ(A). For symmetric matrices A and B, we write A > B (A ≥ B) if A − B is a positive definite (semi-definite) matrix. Unless stated otherwise, the norm is the Frobenius norm of a matrix. For a banded matrix B, the corresponding notation represents its bandwidth. Additionally, the Sherman–Morrison–Woodbury (SMW) formula (see [34] for example),
$$(A + UCV)^{-1} = A^{-1} - A^{-1}U\left(C^{-1} + V A^{-1} U\right)^{-1} V A^{-1},$$
is required in the analysis of the iterative scheme.

3. Deflation of Low-Rank Factors and Kernels
It has been shown that the dimensions of the low-rank factors and kernels increase exponentially. Nevertheless, the first three items of the factors in (15) and (17) are the same as the second to fourth items of the factors in (14) and (16), respectively. Deflation of the low-rank factors and kernels is therefore needed to keep these matrices low-ranked. To see the process clearly, we start with the first deflation step.
Case for the first deflation step. Consider the deflation of the low-rank factors first. It follows from (14)–(17) that the factors take the forms given there.
Expanding these low-rank factors with the initial factors, one can see from Appendix A that certain blocks occur twice in the expanded factors. To reduce their dimensions, we remove each duplicated block and retain a single copy, so that the original factors are deflated to factors of smaller dimension; the superscript "d" indicates a matrix after deflation. Analogously, since the corresponding blocks also appear twice in the remaining factors, we apply the same deflation process to them, obtaining the deflated factors in Appendix A, where the blank left in each factor corresponds to a deleted block and the black bold matrices are inherited from the undeflated ones. Note that, to simplify notation, the deflated matrices are still denoted by the original symbols in the next iteration.
For the kernels at this step, the non-zero components are defined in (18)–(20). Here, the deflation of the first kernel is explained explicitly; that of the second is similar. The first kernel consists of 10 block rows and block columns, each of the initial block size. Due to the deflation of the L-factors described above, we add the first and the ninth block rows to the third and the seventh block rows, respectively, and then remove the first and the ninth block rows. We also add the first and the ninth block columns to the third and the seventh block columns, respectively, and then remove the first and the ninth block columns, completing the deflation of this kernel.
Analogously, the second kernel consists of eight block rows and block columns, each of the initial block size. The deflation process simultaneously adds the seventh block column and block row to the third block column and block row, respectively. Then the first column sub-block of the upper-right part and the first row sub-block of the lower-left part overlap with the corresponding retained sub-blocks, completing the deflation of this kernel.
The whole process is depicted in Figure 1 and Figure 2, where each small square is of the elementary block size and each block with a gray background represents a non-zero component of the kernels. The small white squares are inherited from the originally undeflated sub-matrices, and the small black squares represent the sub-matrices after summation.
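To make the mechanism concrete, the following MATLAB sketch (with generic random data rather than the paper's actual factors) verifies that deleting a duplicated column of a low-rank factor, while folding the corresponding row and column of the kernel into the retained copy, leaves the product L*K*L' unchanged:

```matlab
% Deflation idea on a toy example: column 3 of L duplicates column 1,
% so it can be removed once the kernel K absorbs its row and column.
n = 8;  r = 4;
L = randn(n, r);
L(:, 3) = L(:, 1);                 % make column 3 a duplicate of column 1
K = randn(r);  K = (K + K')/2;     % symmetric kernel
X = L*K*L';                        % product before deflation

idx_keep = [1 2 4];                % delete the duplicated column 3
Kd = K;
Kd(1, :) = Kd(1, :) + Kd(3, :);    % fold row 3 of the kernel into row 1
Kd(:, 1) = Kd(:, 1) + Kd(:, 3);    % fold column 3 into column 1
Kd = Kd(idx_keep, idx_keep);
Ld = L(:, idx_keep);

norm(X - Ld*Kd*Ld', 'fro')         % ~1e-15: the product is preserved
```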
Case for the general deflation step. After the previous deflation, the deflated matrices are, for simplicity, denoted by the same symbols as before. At this stage, a number of columns in the current low-rank factors are identical to columns already present in the factors from the previous step. One can therefore remove the repeated columns and keep their counterparts in (A1) (or (A3)), respectively. Consequently, only a small number of matrices of the corresponding order remain in each factor, namely those displayed in (A1) (or in (A3)) in Appendix B. Meanwhile, only one matrix of the corresponding order is left in the remaining factor, i.e., the last item in (A1) (or in (A3)) of Appendix B. We also take one of the factors as an example to describe the above deflation more clearly in Appendix C.
To deflate the other pair of factors, the repeated columns are removed while their counterparts in (A2) (or (A4)) are retained. So only one matrix of the corresponding order is left in each of these factors, i.e., the last item in (A2) (or in (A4)) of Appendix B. Note that the low-rank factors used in the next iteration are those obtained after deflation, truncation, and compression, with the superscript "d" dropped for simplicity. We take one of these factors as an example to describe the above deflation more clearly in Appendix D.
Correspondingly, the kernel matrices are deflated according to their low-rank factors. Here, we describe the deflation of one of the kernels; that of the other is essentially the same. Recalling the positions of the non-zero sub-matrices (the blocks with a gray background in Figure 3) of this kernel in (21), the deflation process essentially adds the removed sub-blocks, columns, and rows to the retained ones, respectively. See Figure 3 for an illustration.
Similarly, recalling the positions of the non-zero sub-matrices (the blocks with a gray background in Figure 4) of the other kernel in (23), the deflation process adds the removed columns and rows to the retained ones. See Figure 4 for an illustration.
4. Partial Truncation and Compression
Although the deflation of the low-rank factors and kernels in the last section reduces the dimensional growth, the exponential increase in the undeflated part is still rapid, making large-scale computation and storage infeasible. Conventionally, an efficient way to shrink the number of columns of the low-rank factors is truncation and compression (TC) [17,18], which, unfortunately, is hard to apply in our case due to the following two main obstacles.
Direct application of TC to the low-rank factors and their corresponding kernels at the k-th step would require four QR decompositions, resulting in relatively high computational complexity and CPU consumption.
Applying TC to the whole low-rank factors at the current step breaks up the implicit structure, making the deflation in the next iteration unrealizable.
In this section, we instead present a partial truncation and compression (PTC) technique to overcome the above difficulties. Our PTC requires only two QR decompositions, applied to the exponentially growing (not the entire) parts of the low-rank factors, and preserves the deflation in subsequent iterations.
PTC for low-rank factors. Recall the deflated forms (A1) and (A3) in Appendix B. Each of the two factors can be divided into three parts: a dominantly growing part, a part whose number of columns increases only linearly with k, and a last part of fixed size. We therefore truncate and compress only the dominantly growing parts by orthogonalization. Consider QR decompositions with column pivoting of these growing parts, in which the permutation matrices are chosen so that the diagonal elements of the triangular factors (associated with G or H) decrease in absolute value, small tolerances control the PTC of the respective parts, and the numbers of retained columns are bounded above by a prescribed limit. The ranks of the truncated parts then satisfy the corresponding bounds. Furthermore, the computed orthogonal factors have orthonormal columns and the retained triangular blocks are of full rank. The growing parts can then be truncated and reorganized accordingly.
Similarly, recalling the deflated forms (A2) and (A4) in Appendix B, the other two factors are divided into two parts. Since their growing parts have already been compressed as above, one directly obtains the corresponding truncated and compressed factors, finishing the PTC process for the low-rank factors in the k-th iteration.
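By way of illustration, a minimal MATLAB sketch of the truncation-by-pivoted-QR step, applied to a generic growing block W with a hypothetical tolerance tau and column bound rmax (placeholder names, not the paper's exact quantities), could read:

```matlab
% Partial truncation and compression of one growing block (sketch).
n = 200;  p = 60;
W = randn(n, 6) * randn(6, p);       % numerically low-rank growing block
tau  = 1e-10;                        % truncation tolerance
rmax = 40;                           % upper bound on the retained rank

[Q, R, prm] = qr(W, 0);              % economy, column-pivoted: W(:,prm) = Q*R
d = abs(diag(R));
r = min(sum(d > tau*d(1)), rmax);    % number of columns to retain

Qr = Q(:, 1:r);                      % orthonormal basis of the retained range
Rr = zeros(r, p);
Rr(:, prm) = R(1:r, :);              % undo the pivoting so that W ≈ Qr*Rr

norm(W - Qr*Rr, 'fro') / norm(W, 'fro')   % relative truncation error
```

In such a scheme the orthonormal factor Qr would replace the growing block, while the small triangular factor Rr would be absorbed into the corresponding kernel, consistent with the kernel update described next; this is only an illustration of the pivoted-QR truncation, not of the paper's exact bookkeeping.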
It is worth noting that the above PTC process carries over to the next iteration. In fact, after the k-th PTC, certain blocks of the new factors coincide with blocks already available, so the new factors can be deflated as before. Applying PTC to the deflated factors again, one obtains the compressed factors, in which the orthonormal matrices come from the QR decompositions, and the PTC in the next iteration is completed.
PTC for kernels. Define auxiliary matrices from the compressed blocks in (32). The truncated and compressed kernels are then assembled from them.
To eliminate items whose contribution falls below the prescribed tolerances in the low-rank factors and kernels, an additional monitoring step is imposed after the PTC process. Specifically, the last item of each low-rank factor, together with the corresponding kernel block, is discarded if its norm is less than the associated tolerance. In this way, the growth of the column dimensions of the low-rank factors and kernels is controlled efficiently while sacrificing a hopefully negligible amount of accuracy. Additionally, their sizes after PTC are further restricted by a reasonable upper bound.
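As a small illustration of this monitoring step (generic data and a hypothetical tolerance tol_drop, not the paper's actual factors), the trailing block of a factor and the matching kernel rows and columns can be dropped as follows:

```matlab
% Monitoring step after PTC (sketch): drop the trailing block of a factor
% and the matching kernel block when its norm falls below a tolerance.
tol_drop = 1e-14;                   % hypothetical dropping tolerance
L = randn(100, 12);                 % stand-in for a low-rank factor
K = randn(12);  K = (K + K')/2;     % stand-in for its kernel
blk = 9:12;                         % columns forming the trailing item
L(:, blk) = 1e-16 * L(:, blk);      % make the trailing block negligible
if norm(L(:, blk), 'fro') < tol_drop
    L(:, blk) = [];                 % discard the trailing block ...
    K(blk, :) = [];                 % ... together with the matching
    K(:, blk) = [];                 % ... kernel rows and columns
end
```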
6. Numerical Examples
In this section, we demonstrate the effectiveness of the FSDA algorithm in computing the approximate solution of the DARE (3). The FSDA algorithm was implemented in MATLAB 2014a [38] on a 64-bit PC running Windows 10 with a 3.0 GHz Intel Core i5 processor (6 cores, 6 threads), 32 GB RAM, and machine unit round-off eps = 2.22 × 10⁻¹⁶. The residual of the DARE was estimated using an upper bound formula, where B_RRes in (39) and LR_RRes in (40) are the relative residuals of the banded part and the low-rank part, respectively. Tolerance values were prescribed for truncation and compression and for termination. We also tried eps as the tolerance value in our experiments, but found that this had no impact on the residual accuracy. The maximum permitted column number in the low-rank factors was also fixed. For comparison, we also ran the ordinary SDA algorithm with hierarchical structure (i.e., HODLR) using the hm-toolbox (http://github.com/numpi/hm-toolbox, accessed on 1 June 2023) [39,40]; this variant is referred to as SDA_HODLR in this paper, and its relative residual is denoted accordingly. In our numerical experiments, the initial bandwidths of all banded matrices in Examples 1 and 3 were relatively small, while those in Example 2 were non-trivial.
Example 1. The first example is of medium scale and measures the error between the true solution and the computed one. The construction depends on a constant determined by positive numbers ζ and η chosen such that the resulting parameter θ is real, and on a random vector e; the coefficient matrices and the stabilizing solution of the DARE are then given in closed form. It is not difficult to see that the solution is stabilizing, since the relevant spectral radius is less than unity for the admissible parameter values.
We first chose parameter values to calculate B_RRes, followed by LR_RRes as well as the upper bound of the residual of the DARE. In our implementation, the relative error between the approximate solution obtained at termination after j iterations and the true stabilizing solution was evaluated, and the numerical results are presented in Table 1. It is seen that, for the different scales tested, the FSDA was able to attain the prescribed banded accuracy in five iterations. The residuals LR_RRes and the upper bound were then evaluated and reached the expected order. The relative error, whose computation time is not included in the reported CPU time, also shows that the computed solution approximates the true solution very well. On the other hand, SDA_HODLR also attains the prescribed residual accuracy in five iterations, but costs more CPU time (in seconds).
We then chose the parameters to make the relevant spectral radius close to 1 and recorded the numerical performance of the FSDA. It is seen from Table 1 that the FSDA takes seven iterations before termination, obtaining almost the same banded residual histories (B_RRes) for different N. As before, LR_RRes and the residual upper bound were small, showing that the computed solution is a good approximation to the true solution of the DARE (3). The final relative error also validates this fact. Analogously, SDA_HODLR requires seven iterations to reach the prescribed residual level. It is also seen that the FSDA costs less CPU time than SDA_HODLR for all N.
Example 2. Consider a generalized model of a power system labelled by PI Sections 20–80 (https://sites.google.com/site/rommes/software, "S10PI_n1.mat", accessed on 1 June 2023). All transmission lines in the network are modelled by RLC ladder networks of cascaded RLC PI-circuits [41]. The original banded-plus-low-rank matrix A is of the small scale 528 (Figure 5) and is then extended to larger scales. Specifically, we extract the banded part of bandwidth 217 from the original matrix and tile it along the diagonal direction 20 times to obtain the banded part of the extended matrix. We then compute an SVD of the remaining matrix to produce the singular value matrix and the unitary factors; the low-rank parts are constructed by tiling the unitary factors 20 times and multiplying from the right, the retained rank being determined by a threshold on the singular values. The kernels are chosen as block diagonal matrices whose diagonal blocks are random matrices (generated by 'rand(3)'), or whose first diagonal entry is a random number with the remaining diagonal blocks random matrices. The matrices G and H are then defined from these banded and low-rank pieces. We ran the FSDA with three different parameter settings, each comprising five random experiments. In all experiments, B_RRes and LR_RRes were observed to attain the pre-terminating condition (39) and the terminating condition (40), respectively.
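For illustration, a schematic MATLAB construction of such an extended banded-plus-low-rank matrix (using generic random data, an illustrative bandwidth, and an illustrative singular-value threshold rather than the actual data of "S10PI_n1.mat") might proceed as follows:

```matlab
% Schematic extension of a small matrix to a banded-plus-low-rank one
% by diagonal tiling of its banded part and an SVD of the remainder.
n0   = 50;                          % order of the original small matrix
A0   = randn(n0);                   % stand-in for the original matrix
bw   = 3;                           % illustrative bandwidth
rep  = 20;                          % number of diagonal tiles
tolS = 5;                           % illustrative singular-value threshold

Ab0 = triu(tril(A0, bw), -bw);      % banded part of the small matrix
Ab  = kron(eye(rep), Ab0);          % tile it along the diagonal

[U, S, V] = svd(A0 - Ab0);          % SVD of the remaining part
r = sum(diag(S) > tolS);            % retained rank
E = kron(ones(rep, 1), U(:, 1:r) * S(1:r, 1:r));   % tiled low-rank factors
F = kron(ones(rep, 1), V(:, 1:r));
A = Ab + E * F';                    % extended banded-plus-low-rank matrix
```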
Figure 6 plots the numerical results of the five experiments, where Rk is the upper bound of the residual of the DARE, and BRes and LRes are the absolute residuals of the banded part and the low-rank part (i.e., the numerators in B_RRes and LR_RRes), respectively. It is seen that the relative residual levels of LR_RRes and B_RRes are lower than those of LRes and BRes in all experiments. In particular, the gap between them increases for larger instances. On the other hand, the residual curve of Rk lies above those of B_RRes and LR_RRes. This demonstrates that the FSDA can obtain a relatively high residual accuracy.
To see clearly the evolution of the bandwidths of the banded matrices and the dimensional growth of the low-rank factors over five iterations, we list the history of the bandwidths of the banded iterates and the column numbers of the low-rank factors in Table 2, where the CPU row records the consumed CPU time in seconds. It is clearly seen that the FSDA requires 5, 4, and 3 iterations, respectively, to reach the prescribed accuracy for the three parameter settings. Further experiments show that the number of iterations required at termination decreases as this parameter increases. Additionally, we see that two of the bandwidths rise considerably in the second iteration but remain almost unchanged in the remaining iterations. Nevertheless, the third bandwidth decreases gradually after reaching its maximal value in the second iteration, which is consistent with the convergence result in Corollary 1. On the other hand, the recorded column numbers show that those in the second iteration are about fourfold those in the first iteration, since the FSDA does not deflate the low-rank factors at the first iteration. However, the column numbers in the fifth iteration (when it exists) are less than twofold those in the fourth iteration. This reflects that deflation and PTC are efficient in reducing the dimensions of the low-rank factors. In our experiments, we also found that nearly half of the CPU time of the FSDA was consumed in forming the quantities required for the pre-termination. Such a time expense might decrease, however, if the initial bandwidths are narrow.
To further compare the numerical performance of the FSDA and SDA_HODLR on larger problems, we extended the original scale to N = 15,840, 21,120, 26,400, and 31,680, and ran both algorithms until convergence. The results are listed in Table 3, where one can see that both the FSDA and SDA_HODLR (i.e., SDA_HD in the table) attain the prescribed residual accuracy within three iterations, and SDA_HODLR requires less CPU time than the FSDA does. However, there is a strong indication that the FSDA will outperform SDA_HODLR in CPU time on larger problems, as the CPU time of SDA_HODLR surges at N = 26,400 and SDA_HODLR ran out of memory at N = 31,680 without producing any numerical results (denoted by "—"). The symbols "*" in the SDA_HODLR columns indicate that there are no corresponding records for the bandwidth and the column number of the low-rank factors.
We further modified this example to have a simpler banded part in order to test both algorithms. Specifically, the relatively data-concentrated banded part of bandwidth 3 is extracted and tiled along the diagonal direction 20 times to form the banded part. As before, an SVD is applied to the remaining matrix to construct the low-rank parts, obtained by tiling the derived unitary matrices 20 times and multiplying from the right as before. We kept the same parameter setting and ran both the FSDA and SDA_HODLR at the scales N = 15,840, 21,120, 26,400, and 31,680 again. The obtained results are recorded in Table 4, where it is readily seen that the FSDA outperforms SDA_HODLR in CPU time. Once again, SDA_HODLR ran out of memory for the case N = 31,680.
Example 3. This example is an extension of a small-scale electric power system network to a large-scale one used for signal stability analysis [19,20,21]. The corresponding matrix is from the power system of New England (https://sites.google.com/site/rommes/software, "ww_36_pemc_36.mat", accessed on 1 June 2023). Figure 7 presents the original structure of the matrix A of order 66, several elements of which were suitably modified. The banded part is then extracted from the diagonal blocks (1:6, 1:6), (7:13, 7:13), (14:20, 14:20), (21:27, 21:27), (28:34, 28:34), (35:41, 35:41), (42:48, 42:48), (49:55, 49:55), (56:62, 56:62), and (63:66, 63:66), admitting a bandwidth of 4. After tiling 200, 400, and 600 times along the diagonal direction, we obtain banded matrices of scales N = 13,200, 26,400, and 39,600. For the low-rank factors, an SVD of the remaining matrix is first computed to produce the diagonal singular value matrix and the unitary factors. The low-rank parts are then constructed by tiling the unitary factors 200, 400, and 600 times and dividing by their F-norms, respectively, with the retained rank determined by a threshold on the singular values. The matrices G and H are built from these pieces. We took different parameter values
and ran the FSDA to compute the stabilizing solution for the dimensions N = 13,200, 26,400, and 39,600. In our experiments, the FSDA always satisfied the pre-terminating condition (39) first and then terminated on LR_RRes. We picked one representative setting and list the derived results in Table 5, where BRes (or LRes) and B_RRes (or LR_RRes) record the absolute and relative residuals of the banded part (or the low-rank part), respectively, and the remaining columns record the histories of the upper bound of the residual of the DARE, the bandwidths of the banded iterates, and the column numbers of the low-rank factors, respectively. In particular, one column describes the accumulated time to compute the residuals (excluding the data marked with "*").
Obviously, for different N, the FSDA is capable of achieving the prescribed accuracy within five iterations. The residuals BRes, B_RRes, LRes, and LR_RRes indicate that the FSDA tends to converge quadratically. In particular, BRes (or B_RRes) at different N are of nearly the same order at termination, and LRes (or LR_RRes) at different N attain comparable orders. Further iterations appeared useless in improving the accuracy of LRes and LR_RRes. Note that the data labelled with the superscript "*" in the columns LRes and LR_RRes come from re-running the FSDA to complement the residual at each iteration, and their corresponding CPU time is not included in the timing column. Lastly, the recorded bandwidths remain invariant and the column numbers of the low-rank factors grow by less than a factor of two in each iteration, demonstrating the effectiveness of the deflation and PTC.
We also ran the FSDA to compute the solution of the DARE for another parameter setting, and the results are recorded in Table 6. In this case, the FSDA requires seven iterations to reach the prescribed accuracy. As before, the last few residuals in the column BRes (or B_RRes) at different N are almost the same. The residuals LRes (or LR_RRes) at different N terminate at comparable levels. In particular, BRes and B_RRes show that the FSDA attained the prescribed banded accuracy at the fifth iteration, but the corresponding residual of the low-rank part was still too large, so two additional iterations were required to meet the termination condition (40), even though the residual level in B_RRes stagnated over the last three iterations. From a structural point of view, it seems that the low-rank part is approaching the critical case while the banded part still lies in the non-critical case. Similarly, the recorded bandwidths indicate that the banded iterates are all block diagonal with fixed block sizes and that the deflation and PTC for the low-rank factors are effective. Moreover, the timing column shows that the CPU time of the current iteration was less than twice that of the previous iteration.
We further compare the numerical performance of the FSDA and SDA_HODLR on large-scale problems. Different parameter values were tried, and the observed behaviors of the two algorithms are analogous. We list representative results in Table 7, where one can see that the FSDA requires fewer iterations and less CPU time than SDA_HODLR to satisfy the stopping criterion. In particular, SDA_HODLR depleted all memory at N = 39,600 and did not yield any numerical results (denoted by "—"). The symbols "*" in the SDA_HODLR columns indicate that there are no corresponding records for the bandwidths and column numbers of the low-rank factors.