Abstract
In this paper, a nonlinear system is interpreted as an operator transforming random vectors. It is assumed that the operator is unknown and the random vectors are available. It is required to find a model of the system represented by a best constructive operator approximation. While the theory of operator approximation with any given accuracy has been well elaborated, the theory of best constrained constructive operator approximation is not so well developed. Despite increasing demands from various applications, this subject is minimally tractable because of intrinsic difficulties with associated approximation techniques. This paper concerns the best constrained approximation of a nonlinear operator in probability spaces. The main conceptual novelty of the proposed approach is that, unlike the known techniques, it targets a constructive optimal determination of all ingredients of the approximating operator, whose degree p is a nonnegative integer. The solution to the associated problem is represented by a combination of new best approximation techniques with a special iterative procedure. The proposed approximating model of the system has several degrees of freedom to minimize the associated error. In particular, one of the specific features of the developed approximating technique is the use of special random vectors called injections. It is shown that the desired injection is determined from the solution of a special Fredholm integral equation of the second kind. Its solution is called the optimal injection. The determination of optimal injections in this way allows us to further minimize the associated error.
1. Introduction
1.1. Motivation
Over the last few decades, the problem of constructive approximation of nonlinear operators has been a topic of intensive research. A number of fundamental papers have appeared, establishing significant advances in this research area. Some relevant references can be found, in particular, in [1,2,3,4,5,6,7,8,9,10,11,12,13,14].
The known related results mainly concern proving the existence and uniqueness of operators approximating a given map, and justifying the bounds of errors arising from the approximation methods. The assumptions are that preimages and images are deterministic and can be represented in an analytical form, that is, by equations. At the same time, in many applications, the sets of preimages and images are stochastic and cannot be described by equations. Nevertheless, it is possible to represent these sets in terms of their numerical characteristics, such as the expectation and covariance matrices. Typical examples include stochastic signal processing [15,16,17,18,19], statistics [20,21,22,23,24], engineering [25,26,27,28], and image processing [29,30]; in the latter case, a digitized image, represented by a matrix, is often interpreted as a sample of a stochastic signal.
While the theory of operator approximation with any given accuracy is well elaborated (see, e.g., [1,2,3,4,5,6,7,8,9,10,11,12,13,14]), the theory of best constrained constructive operator approximation is not particularly well developed, although this is an area of intensive recent research (see, e.g., [31,32,33,34,35,36]). Despite increasing demands from applications [17,18,19,21,22,23,25,26,27,28,30,31,32,33,34,36,37,38,39,40,41,42,43,44,45,46], this subject is minimally tractable because of intrinsic difficulties in best approximation techniques, especially when the approximating operator should have a specific structure implied by the underlying problem. Recent studies related to this topic can be found in [47,48,49,50,51,52,53,54,55]. In particular, in [47], an approach to modeling of nonlinear systems was proposed based on observed input-output data. The proposed method constructs an approximate model by decomposing the unknown function into multiple components, aiming to minimize the associated error. Special random vectors, called injections, are used to refine the approximation while specific transformations simplify the optimization process. The approach provides flexibility in adjusting parameters to improve accuracy.
We wish to extend the known results in this area to the case when the sets of preimages and images of the map are stochastic, and the approximating operator we seek is constructive in the sense that it can be realized numerically and, therefore, is applicable to applied problems.
1.2. Short Description of the Method
Let be a set of outcomes in the probability space for which is a –field of measurable subsets of and is an associated probability measure. Let and be random vectors, be a nonlinear map, and .
A nonlinear system is interpreted as the map where and are random output and input signals, respectively. It is assumed that is unknown, and and are available. We propose and justify a new method for the constructive approximation such that first, an associated error is minimized and second, a structure of the approximating operator satisfies the special constraint related to the dimensionality reduction of vector . More specifically, we develop a new approach to the best constructive approximation of the map in probability spaces subject to a specialized criterion associated with the dimensionality reduction of random preimages. The latter constraint follows from the requirements in applications such as those considered in [20,21,22,23,24,26,41,42]. In particular, in signal processing and system theory, a dimensionality reduction of random signals is used to optimize the cost of signal transmission. It is assumed that the only available information on is given by certain covariance matrices formed from the preimages and images. This is a typical assumption used in applications such as those considered, e.g., in [15,16,17,18,19,20,21,22,23,37,38,44,45,46]. Here, we adopt that assumption. As mentioned, in particular, in [56,57,58,59,60,61], a priori knowledge of the covariances can come either from specific data models, or after sample estimation during a training phase.
The problem we consider (see (7) below) concerns finding the best approximating operator that depends on unknown matrices and , for and p more unknown random vectors . We call them the injections. Here, p is a non-negative integer. The injections aim to further diminish the associated error. The difficulty is that all the unknowns must be determined from the minimization of the single cost function given in (7).
The main difference between the approach in [47] and the method proposed in this paper lies in their approaches to the modeling of nonlinear systems. The method in [47] constructs the model as a sum of specific components, and the original optimization problem is decomposed into simpler sub-problems. The method empirically determines the special random vectors, called injections, to reduce the associated error. In contrast, the method proposed here solves the problem of a best constrained approximation of a nonlinear operator in probability spaces. It introduces the concept of an optimal injection, which is determined by solving a Fredholm integral equation of the second kind. While the method in [47] focuses on a structured numerical implementation, the proposed method provides a more general theoretical framework that optimally determines all parameters through a combination of best approximation techniques and an iterative procedure.
The solution is represented in Section 3 and Section 4, and is based on the following observation. Methods for best approximation are aimed at obtaining the best solution within a certain class; the accuracy of the solution is limited by the extent to which the class is satisfactory. By contrast, iterative methods are normally convergent, but the convergence can be quite slow. Moreover, in practice, only a finite number of iteration loops can be carried out, and therefore, the final approximate solution is often insufficiently accurate. A natural idea is to combine the methods for best approximation and iterative techniques to exploit their advantageous features.
In Section 3 and Section 4, we present an approach which realizes this. First, a special iterative procedure is proposed which aims to improve the accuracy of approximation with each consequent iteration loop. Secondly, the best approximation problem is solved providing the smallest associated error within the chosen class of approximants for each iteration loop. In Section 4, we show that the combination of these techniques allows us to build a computationally efficient and flexible method. In particular, we prove that the error in approximating by the proposed method decreases with an increase in the number of iterations. An application is made to the optimal filtering of stochastic signals.
1.3. Novelty and Advantages
The novelty of the proposed method is in its approach to the best constructive approximation of nonlinear operators in probability spaces. Unlike methods that primarily focus on the existence and uniqueness of approximating operators, the proposed method aims to achieve an optimal constructive approximation by targeting a specific structure for the approximating operator, which is constrained by dimensionality reduction of random preimages. The focus on dimensionality reduction and the optimization of random vectors, known as injections, distinguishes the proposed method from existing approaches.
The advantages of the proposed method are twofold. First, it allows us to determine optimal injections which are derived from solving a Fredholm integral equation of the second kind, thereby providing a more robust and theoretically sound approach to approximating nonlinear systems. Second, the method combines best approximation techniques with a special iterative procedure, ensuring that the accuracy of the approximation improves with each iteration. This combination not only enhances the computational efficiency of the model but also provides a flexible framework that can be adapted to various applications, particularly those associated with signal processing and system theory, where dimensionality reduction plays a crucial role in optimizing signal transmission costs. By approximating using the best constrained constructive operator approximation, one can design efficient algorithms for signal reconstruction or denoising. In system identification, this approach can be used to model and estimate dynamic systems with stochastic inputs, enabling more accurate predictions and adaptive control strategies.
2. The Proposed Approach
2.1. Some Special Notation
Let us write and where , for and , and and for all . Each matrix defines a bounded linear transformation . It is customary to write A rather than since , for each . Let us also write
The covariance matrix formed from and is denoted by such that
The Moore–Penrose pseudo-inverse [62] of matrix M is denoted by .
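As a quick numerical sketch (NumPy; the matrix below is hypothetical and chosen only to be rank-deficient), the Moore–Penrose pseudo-inverse can be computed from the SVD and checked against the Penrose conditions:

```python
import numpy as np

# A hypothetical rank-deficient matrix: an ordinary inverse does not exist.
M = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 0.0]])

M_pinv = np.linalg.pinv(M)  # Moore-Penrose pseudo-inverse, computed via the SVD

# Two of the four Penrose conditions characterizing the pseudo-inverse:
assert np.allclose(M @ M_pinv @ M, M)            # M M^+ M = M
assert np.allclose(M_pinv @ M @ M_pinv, M_pinv)  # M^+ M M^+ = M^+
```

For a singular or rectangular matrix, the pseudo-inverse supplies the minimal-norm least-squares solution, which is the role it plays in the theorems below.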
2.2. Generic Structure of Approximating Operator
Let be random vectors such that , for . We write and . As mentioned before, we call them the injections. This is because they contribute to the decrease in the associated error, as shown in Section 4.4 below. The choice of is considered in Section 4.2, where each , for is defined by a nonlinear transformation of , i.e., . To facilitate the numerical implementation of the approximating technique introduced below, each vector , for , is transformed to vector by transformation so that
where . The choice of is considered in Section 3.1.
Further, for , let where is given, , and
Here, r is a positive integer such that .
It is convenient to set and . To approximate for a given reduction ratio
we consider operator represented by
where and , for , are linear operators (i.e., and are represented by and matrices, respectively; recall that we use the same symbol to denote a matrix and the associated linear operator).
Importantly, operators imply the dimensionality reduction of vector . This is because where , for .
We call p the degree of . It is shown below that approximates an operator of interest , with the accuracy represented by theorems in Section 3.3, Section 3.4, and Section 4.4.
2.3. Statement of the Problem
Let be a continuous operator. We consider the problem as follows: Given and , find matrices and vectors that solve
subject to
and
where and denotes the zero matrix (and the zero vector).
2.4. Related Work
2.4.1. Low-Rank Approximations
2.4.2. Tensor Methods
In [70,71,72], the problem in (10) is generalized and studied in terms of tensors. Together with methods formulated in terms of matrices and mentioned in Section 1 and Section 2.4.1, the tensor methods represent an important research subject both in the theoretical and applied sense. In this paper, the problems in (10) and (11) are generalized in a different way, which was described in Section 2.2 and Section 2.3, and will be justified in Section 3 and Section 4.4 below.
2.4.3. System Identification and Modeling
The problem we consider can also be represented as a black-box problem [73,74] where and are an available random input and output, respectively, and is an unknown system. Then, the approximating operator identifies a model of the system. Its particular features are detailed in Section 4.
2.5. Contribution
2.5.1. Challenges of High-Dimensionality
The proposed approach achieves the dimensionality reduction of preimages in the following way. In (6), let us write as
where , , . Here, H is the block-diagonal matrix with on the main diagonal and the dimensionality of vector is r, which is defined by (4). Therefore, H realizes the dimensionality reduction in vector with the reduction ratio defined by (5). Unlike the known techniques [20,21,22,23,24,26,41,42] for dimensionality reduction, the proposed approach achieves this by using terms, . This allows us to increase the accuracy associated with optimal determination of approximating operator . More details can be found in Section 2.5.3 and Section 4.
2.5.2. Challenges of Accuracy
It is shown in Theorems 3, 4, and 5 below that, for the same reduction ratio, the accuracy associated with the approximating operator improves if the degree p or the dimensions of the injections increase. In the case of the optimal determination of injections , the associated error is decreased further. This is established in Theorems 10 and 11 in Section 4.4.
2.5.3. Novelties and Relation to Existing Concepts
Commonly, an approximating operator is represented by where are known basis functions and are scalars or matrices which should, ideally, be determined from an error minimization.
The main conceptual novelty of the proposed approach is that, unlike the known techniques, it targets a constructive optimal determination of all ingredients of the approximating operator in (6). They are . The solution of the associated problem (7)–(9) is provided in Section 3.1, Section 3.2, Section 4.1, and Section 4.2, and is represented by a combination of a new best approximation technique with a special iterative procedure.
A basic idea of the solution is to reduce the original problem (7)–(9) with unknowns to simpler problems in (25) and (63) so that each of them, for , contains only three unknowns, , and . For , there are only two unknowns, and . This is achieved by exploiting the operators determined in Theorem 1.
The iteration procedure represented in Section 4.1 is based on the idea of the maximum block improvement method [75], which is an efficient technique for solving spherically constrained homogeneous polynomial optimization problems. The associated novelties are in the techniques for the determination of matrices given in Section 3.2, injections represented in Section 4.2, and for the error analysis given in Section 3.3 and Section 4.4. In particular, it is shown in Theorem 8 in Section 4.2 that the desired vector is determined from the solution of a special Fredholm integral equation of the second kind (71). Its solution is called the optimal injection.
Further, unlike the known techniques, the proposed approximating operator has several degrees of freedom to minimize the associated error. They are:
- ‘degree’ p of ,
- matrices ,
- optimal injections ,
- values of in (4), and
- dimensions of injections .
It is shown in Section 3.3 and Section 4.4 below that both the optimal choice of and injections , and the increase in and lead to the decrease in the error associated with approximating operator . Injections represent a new special feature of the proposed technique.
In terms of multilayer networks, the novelties are as follows. First, the associated network consists of four hidden layers, i.e., more than in the known networks. Second, it exploits the interaction between the hidden layers (see Figure 1 where a particular case of for is illustrated). These particular features imply the improvement in the associated accuracy (see Section 3.3 and Section 4.4). In Figure 1, denotes the approximation of determined by the proposed technique, i.e., . Details are provided in Section 4.
Figure 1.
Diagrammatical representation of the proposed technique.
In terms of system identification, alongside the black-box problem as in [73,74], the problem in (7)–(9) can be interpreted as a quite wide generalization of the blind system identification problem considered, in particular, in [17,76,77,78]. This is because vectors are assumed to be unknown. Unlike the existing work on blind system identification, the problem in (7)–(9) is stated in terms of random vectors. Other novelties associated with system identification are similar to those considered above, i.e., the proposed model of the system contains unknowns, the method of their determination is different from the existing techniques, and the associated accuracy is improved by a variation in more degrees of freedom.
3. Preliminary Results
Here, we consider the determination of vectors (in Definition 1 that follows, they are called pairwise uncorrelated vectors) and the solution of a particular case of the problem in (7), (8), (9) where the minimization with respect to is not included. These preliminary results will be used in Section 4, where the solution to the original problem represented by (7), (8), (9) is provided.
Definition 1.
Random vectors are called pairwise uncorrelated if the condition in (9) holds for any pair of vectors and , for , where . Two vectors and belonging to the set of the pairwise uncorrelated vectors are called uncorrelated.
For , let be a null space of matrix .
Definition 2.
Random vectors are called linearly independent in the generalized sense if, for every collection of matrices ,
for almost all , implies that , for each , and almost all .
Definition 3.
Random vector , for is called the well-defined injection if
where is defined by (3). Otherwise, injection is called ill-defined.
An explanation for introducing Definition 3 is provided by Remark 1 at the end of Section 3.2 below.
3.1. Determination of Pairwise Uncorrelated Vectors
Theorem 1.
Let random vectors be linearly independent in the generalized sense. Then they are transformed to the pairwise uncorrelated vectors by transformations as follows:
Proof.
In terms of a multilayer network, this procedure is illustrated diagrammatically in Figure 1.
The solution device of the problem in (7)–(9) is based, in particular, on the solution of the problem
subject to (8) and (9). In the following Section 3.2, matrices that solve this problem are given.
3.2. Determination of Matrices That Solve the Problem in (16), (8) and (9)
First, recall the definition of a truncated singular value decomposition (SVD). Let the SVD of matrix be given by where are unitary matrices, and is a generalized diagonal matrix, with the singular values on the main diagonal. For , and , we denote
and write
For ,
is the truncated SVD of A. For , we write .
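A minimal sketch of the rank-k truncation (NumPy, with a random illustrative matrix); by the Eckart–Young theorem, the truncated SVD is a best rank-k approximation in the Frobenius norm, with error equal to the tail of the singular values:

```python
import numpy as np

def truncated_svd(A, k):
    """Rank-k truncation A_k = U_k diag(s_1..s_k) V_k^T of the SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
A2 = truncated_svd(A, 2)

# The Frobenius-norm error equals the discarded singular values.
s = np.linalg.svd(A, compute_uv=False)
err = np.linalg.norm(A - A2, 'fro')
assert np.isclose(err, np.sqrt(np.sum(s[2:] ** 2)))
```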
Proposition 1.
For any random vector ,
Proof.
We have
Thus, (18) is true. □
Theorem 2.
Let be well-defined injections and vectors be pairwise uncorrelated. Then the minimal Frobenius norm solution to the problem in (16) is given, for , by
Proof.
Let , and . Then, for ,
where and by Theorem 1, matrix is block-diagonal. Thus,
and then
At the same time,
Therefore, (20) and (21) imply
The RHS in (22) is nonnegative. Indeed, for , we have , i.e., . Here, . Further, the case is not possible since matrix is singular. Moreover,
because and
(see [80]).
Remark 1.
Definition 3 of the well-defined injections is motivated by the following observation. It follows from (19) that if, for all vector is such that , then and . In other words, then approximating operator .
Therefore, in Theorem 2 above and in the theorems below, vectors are assumed well-defined.
3.3. Error Analysis Associated with the Solution of Problem in (16), (8) and (9)
In Theorem 3 of this section, we obtain the constructive representation of the error associated with the solution of problem in (16), (8), and (9). In Theorem 4, we show that the error can be improved by the increase in the dimensions of injections . Further, Theorems 3, 4, and 5 establish that the error is also diminished by the increase in the degree of approximating operator .
The error associated with is denoted by
Let us denote the Frobenius norm by .
Theorem 3.
For , let , and . For , let be a singular value of . Let be determined by (19). Then
In particular, the error decreases as p increases.
Proof.
Let us write
where and are entries of matrices and , respectively. Let us also denote
In the following theorem, we show that the injections are useful in the sense that, as their dimensions increase, the error associated with the solution to the problem in (16), (8), and (9) is diminished.
Theorem 4.
Let be well-defined injections and let matrices be defined by Theorem 2. Then the associated error decreases as the sum q of dimensions of injections increases. In particular, there is such that, given , then
if
Proof.
Remark 2.
An empirical explanation of Theorem 4 is that an increase in q increases the dimensions of the matrices in (6) and (19) and, hence, the number of parameters to optimize. As a result, for a fixed parameter r given by (4), the accuracy associated with the approximating operator improves. Further, it follows from (31) that, as q increases, tends to , which is the error associated with the full rank approximating operator (see (40) and (46) below).
Remark 3.
By Theorem 3, the error associated with the solution of problem (16) decreases as the degree p of the approximating operator increases. At the same time, an increase in degree p may involve an increase in parameter r (see (4)). However, the conditions of the applied problem at hand may require r to be fixed. In the following Theorem 5, the case of the error decreasing as the degree p increases, under the condition of fixed r, is detailed.
Theorem 5.
Let r and , for , be given. Let g be a nonnegative integer such that and let . If
where , then
i.e., for the same r, the error associated with the approximating operator of higher degree p is less than the error associated with the approximating operator of lower degree g.
Remark 4.
Example 1.
Here, we wish to numerically illustrate Theorems 3, 4, and 5. To this end, we assume that and are uniformly and normally distributed random vectors, respectively. Injections and are here chosen as uniformly distributed random vectors. Covariance matrices , , for , are represented by where and are sample matrices of and , respectively, for .
We choose and . It follows from Theorems 3, 4, and 5 that the error associated with approximating operator varies if values of p, and vary. We wish to illustrate the decrease in error when, for the same r, values of p and q increase. To this end, we provide Table 1 and Table 2, where values of the errors are given for different values of , , for and . In Table 1 and Table 2, the abbreviation MSE (mean square error) is used to unite notations and used in Theorems 3, 4, and 5. In the tables, Cases 1 and 2 for specific values of , are considered.
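The sample-based covariance estimates used in this example can be sketched as follows (NumPy; the dimensions and sample size are illustrative assumptions): columns of the sample matrices are independent realizations of the vectors, and the covariance-type matrices are formed from averaged outer products, as in the standard training-phase estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
s = 1000                                # number of samples (columns)
X = rng.uniform(-1, 1, size=(5, s))     # samples of a uniform random vector
Y = rng.standard_normal(size=(4, s))    # samples of a normal random vector

# Sample covariance-type matrices standing in for the unknown expectations.
E_yx = (Y @ X.T) / s
E_xx = (X @ X.T) / s
print(E_yx.shape, E_xx.shape)
```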
Table 1.
Numerical characterizations of approximating operators , , and in Case 1.
Table 2.
Numerical characterizations of operators , , and in Case 2.
In Figure 2 the MSE values are illustrated diagrammatically.
Figure 2.
Example 1: Diagrams of the errors associated with , , and .
It follows from Table 1 and Table 2 and Figure 2 that, for the same r, the error associated with the proposed system model decreases if degree p or the sum q of the injection dimensions increases. This is because the increase in the number of parameters to optimize in the operator results in a decrease in the error, as stated in Theorem 5.
3.4. Particular Case: No Reduction of Vector Dimensionality
An important particular case of the problem in (16) and (9) is when matrix is replaced with a full rank matrix , for . Operator given by
is called the full rank approximating operator. The problem then is to find matrices that solve
subject to condition (9).
In particular, the desired matrices could, in principle, be found, for , from the equations
and
respectively. Those equations may have an infinite number of solutions or no solutions at all. Therefore, the following theorem instead provides the solution to the problem in (41), in which cases (42) and (43) are excluded.
Theorem 6.
Let be well-defined injections and vectors be pairwise uncorrelated. Let , for and . Then, the minimal Frobenius norm solution to the problem (41) is given, for , by
Proof.
Theorem 7.
Let . The error associated with the minimal Frobenius norm solution to the problem in (41) is represented by
4. Solution of Problem Given by (7), (8), (9)
4.1. Device of Solution
In comparison with the problem in (16), (8), (9), a specific difficulty of the original problem in (7), (8), (9) is the determination of p additional unknowns, the injections .
The device of the proposed solution is as follows. First, in (13) and (14), arbitrary vectors that are linearly independent in the generalized sense are denoted by and pairwise uncorrelated random vectors are denoted by . In (19), matrices and , for , are denoted by and , respectively. The associated error is still represented by (27). We also write
Then, for , the desired injections and matrices , are determined by the iterative procedure represented below. It will be shown in Theorem 10 below that and further minimize the associated error. The i-th iterative loop of the iterative procedure consists of the following steps.
The i-th iterative loop, for
Step 1. Given , , find that solve
where
The solution is denoted by . We call these the optimal injections.
Step 2. Given , find pairwise uncorrelated random vectors and denote
where, for , we set and , for all and we write
Step 3. Given , find that solve
The solution of problem (55) is denoted by . Further, denote
where, as before, for , we set and , for all .
Step 4. Denote
Step 5. If, for a given tolerance ,
the iterations are stopped. If not, then Steps 1–4 are repeated to form the next iterative loop.
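Steps 1–5 above can be sketched as a generic alternating scheme (Python; the update functions below are placeholders standing in for Theorems 8 and 9, and the toy objective is ours, not the paper's): the injections and the matrices are updated in turn, and iteration stops once the change in the error falls below the tolerance.

```python
import numpy as np

def iterate(update_injections, update_matrices, error, state,
            tol=1e-8, max_loops=100):
    """Schematic loop for Steps 1-5: alternately update the injections and
    the matrices, stopping once the error change is within the tolerance."""
    prev = error(state)
    for _ in range(max_loops):
        state = update_injections(state)   # Step 1: optimal injections
        state = update_matrices(state)     # Steps 2-3: uncorrelate, refit
        cur = error(state)                 # Step 4: error of this loop
        if abs(prev - cur) <= tol:         # Step 5: stopping criterion
            break
        prev = cur
    return state, cur

# Toy illustration (not the paper's operators): alternately minimizing
# f(a, b) = (a - b)^2 + (b - 3)^2 over a, then over b.
f = lambda s: (s[0] - s[1]) ** 2 + (s[1] - 3.0) ** 2
step_a = lambda s: (s[1], s[1])                  # argmin over a for fixed b
step_b = lambda s: (s[0], (s[0] + 3.0) / 2.0)    # argmin over b for fixed a
state, err = iterate(step_a, step_b, f, (0.0, 0.0))
assert err <= 1e-6  # the error decreases monotonically toward the minimum
```

Because each partial update is a minimization, the error is non-increasing from loop to loop, which mirrors the monotone decrease established in Theorem 10.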
The above steps of the solution device are implemented in detail as follows.
4.2. Determination of in Step 1
Let us denote
Here, , and .
Theorem 8.
Let
where , and . Then the optimal injection , for , that solves the problem in (51) is determined by
where matrices are defined by
Here, , for .
Proof.
For let us denote . For any , let us write For , the vector of the smallest Euclidean norm of all minimizers that solves
is given by (see [85], p. 257)
Let be such that
Since then
i.e.,
Then (66) implies that, for all ,
Since it is true for all , then
i.e.,
Therefore,
Taking into account (14), Equation (69) is written as
where we denote
Thus,
Let us now write Equation (70) as follows:
where
Recall that in (72), matrix depends on , not on . Therefore, Equation (71) is a vector version of the Fredholm integral equation of the second kind [86] with respect to . Its solution is provided as follows. Write Equation (71) as
where
Let us now multiply both sides of (73) by , for , and integrate. It implies
and
where, for ,
and
Let . Then the set of matrix equations in (74) can be written as a single equation
If matrix is invertible, then (75) implies
If matrix is singular, then instead of Equation (75), we consider the problem
Its minimal Frobenius norm solution is given by [62]:
Thus, injection follows from (73), (76), and (78). □
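The reduction of the degenerate-kernel equation (71) to the finite linear system (74)–(75) can be illustrated on a scalar toy problem (NumPy; the kernel K(t, s) = t s and data f(t) = t are our illustrative choices, not the paper's operators): substituting x(t) = f(t) + c a(t) collapses the integral equation to a single scalar equation for c, the scalar analogue of (75)–(76).

```python
import numpy as np

# Fredholm equation of the second kind with a degenerate (separable) kernel:
#   x(t) = f(t) + integral_0^1 K(t, s) x(s) ds,   K(t, s) = a(t) * b(s).
# Substituting x(t) = f(t) + c * a(t) gives the scalar equation
#   c = integral(b * f) + c * integral(b * a).
a = lambda t: t
b = lambda s: s
f = lambda t: t

# Midpoint-rule quadrature on [0, 1].
n = 4000
t = (np.arange(n) + 0.5) / n
Ibf = np.mean(b(t) * f(t))   # integral of b*f (exactly 1/3 here)
Iba = np.mean(b(t) * a(t))   # integral of b*a (exactly 1/3 here)
c = Ibf / (1.0 - Iba)        # scalar analogue of (76); here 1 - Iba != 0
x = f(t) + c * a(t)

# The computed x satisfies the integral equation on the grid:
residual = x - (f(t) + a(t) * np.mean(b(t) * x))
assert np.max(np.abs(residual)) < 1e-6
assert abs(c - 0.5) < 1e-3   # closed-form coefficient for this toy case
```

When the analogue of the matrix in (75) is singular, the pseudo-inverse solution (78) replaces the division above.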
4.3. Determination of Matrices in Step 3
In Step 3, the matrices solve problem (55). At the same time, the problems in (16) and (55) differ in notation only. Therefore, in Step 3, matrices are determined, in fact, by Theorem 2, where only the notation should be changed.
Nevertheless, to avoid any confusion, we provide Theorem 9 below, where matrices are represented. To this end, we denote . To simplify the notation, we set
Theorem 9.
Let be well-defined injections and let vectors be pairwise uncorrelated. Then, the minimal Frobenius norm solution to the problem in (55) is given, for , by
Proof.
The proof follows from the proof of Theorem 2. □
4.4. Error Analysis of the Solution of Problem in (7), (8), (9)
Theorem 10.
Proof.
Let us consider the initial case of the proposed technique when .
- Here, for and . Therefore,
i.e., for any , in particular, for ,
and
Taking into account (81), we denote
Then, by (82),
For let us prove inequality (80) by induction. To this end, we first consider the basis step of the induction, which consists of cases and .
- The basis step: Case . If then the i-th iteration loop (see Section 4.1) implies
i.e., for all and with ,
In particular, for and with ,
Further, because for and , then
Therefore, for ,
But since , and , for , then
Taking into account (88), we denote
- That is, for any
At the same time,
Denote . By (91),
Thus,
- The inductive step. For let us suppose that if , and and then
Below, we show that then and , i.e., that (80) is true.
To this end, for the i-th iterative loop, let us consider the case where . We have
That is, for any
At the same time,
Denote . By the assumption , then
Further, for let us now consider the case . Then
Therefore, for any and ,
We also need the following. By (55), and solve
Since, by the assumption, , then (105) is equivalent to
Thus,
Thus, (80) is true.
□
Further, we wish to show that the proposed procedure for the solution of problem in (7), (8), (9) converges to a so-called coordinate-wise minimum, which is defined in Theorem 11 below. To this end, let us denote
where, for as in (20), , , and .
For , we also write , and . As before, .
Let us now define compact sets and such that
and
Theorem 11.
Let and Let and be determined by Theorems 2, 8, and 9, respectively, and let . Then, any cluster point of sequence , say is a coordinate-wise minimum point of , i.e.,
Proof.
Since each , for , is compact, then there is a subsequence such that when . Let and be defined by
Similar to [75], and are called the best responses to P associated with and , respectively. Consider entry of . Then, on the basis of the proof of Theorem 3.1 in [75], we have
By continuity, as ,
This implies that (114) should hold as an equality, since the inequality is false by the definition of the best response . Thus, is the best response for , or equivalently, is the solution of the problem
The proof is similar if we consider entry of . □
Example 2.
In the above Example 1, we illustrated the decrease in the error associated with approximating operator when parameters p and q increase. In Example 1, injections were not optimal.
Here, for the case of optimal injections, we wish to numerically illustrate Theorem 10, i.e., that the error decreases if the number of iterations i increases.
For we denote
and
According to the procedure described in Section 4, for each i-th iteration where , the proposed method is represented by either
or
To simplify the notation, both and are denoted by .
We assume that and are uniformly and normally distributed random vectors, respectively. Initial injections are here chosen as uniformly distributed random vectors. It is assumed that covariance matrices , , for , are given by where and are samples of and , respectively, for .
We choose and . Table 3 presents the values of the error associated with the operators , and , for .
Table 3.
Numerical characterizations of approximating operators , and .
In Figure 3, diagrams of the errors are presented. It follows from Table 3 and Figure 3 that the error decreases as the number of iterations i increases. In particular, the error remains unchanged after .
Figure 3.
Example 2: Diagrams of the errors associated with , and .
As explained in Example 1, the proposed method is more effective because increasing the number of parameters to optimize in the operator leads to a reduction in the error, as stated in Theorem 10.
Author Contributions
Conceptualization, P.S.-Q. and A.T.; methodology, A.T.; software, P.S.-Q.; validation, A.T.; writing—original draft, A.T.; visualization, P.S.-Q. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the authors on request.
Acknowledgments
This work was financially supported by Vicerrectoría de Investigación y Extensión from Instituto Tecnológico de Costa Rica (Research #1440054).
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Amat, S.; Busquier, S.; Negra, M. Adaptive Approximation of Nonlinear Operators. Numer. Funct. Anal. Optim. 2004, 25, 397–405. [Google Scholar] [CrossRef]
- Bruno, V.I. An approximate Weierstrass theorem in topological vector space. J. Approx. Theory 1984, 42, 1–3. [Google Scholar] [CrossRef]
- Chen, T.; Chen, H. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Trans. Neural Netw. 1995, 6, 911–917. [Google Scholar] [CrossRef]
- Dingle, K.; Camargo, C.Q.; Louis, A.A. Input–output maps are strongly biased towards simple outputs. Nat. Commun. 2018, 9, 761. [Google Scholar] [CrossRef]
- Gallman, P.G.; Narendra, K. Representations of nonlinear systems via the Stone-Weierstrass theorem. Automatica 1976, 12, 619–622. [Google Scholar] [CrossRef]
- Howlett, P.G.; Torokhti, A.P.; Pearce, C.E.M. A Philosophy for the Modelling of Realistic Non-linear Systems. Proc. Am. Math. Soc. 2003, 131, 353–363. [Google Scholar] [CrossRef]
- Istrăţescu, V.I. A Weierstrass theorem for real Banach spaces. J. Approx. Theory 1977, 19, 118–122. [Google Scholar] [CrossRef]
- Prenter, P.M. A Weierstrass Theorem for Real, Separable Hilbert Spaces. J. Approx. Theory 1970, 4, 341–357. [Google Scholar] [CrossRef]
- Prolla, J.B.; Machado, S. Weierstrass-Stone theorems for set-valued mappings. J. Approx. Theory 1982, 36, 1–15. [Google Scholar] [CrossRef]
- Rao, N.V. Stone-Weierstrass theorem revisited. Am. Math. Mon. 2005, 112, 726–729. [Google Scholar] [CrossRef]
- Sandberg, I.W. Notes on uniform approximation of time–varying systems on finite time intervals. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 1998, 45, 863–864. [Google Scholar] [CrossRef]
- Sandberg, I.W. Time delay polynomial networks and quality of approximation. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 2000, 47, 40–49. [Google Scholar] [CrossRef]
- Sandberg, I.W. R+ Fading Memory and Extensions of Input-Output Maps. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 2002, 49, 1586–1591. [Google Scholar] [CrossRef]
- Timofte, V. Stone-Weierstrass theorem revisited. J. Approx. Theory 2005, 136, 45–59. [Google Scholar] [CrossRef]
- Piotrowski, T.; Cavalcante, R.; Yamada, I. Stochastic MV-PURE Estimator-Robust Reduced-Rank Estimator for Stochastic Linear Model. IEEE Trans. Signal Process. 2009, 57, 1293–1303. [Google Scholar] [CrossRef]
- Hua, Y.; Nikpour, M.; Stoica, P. Optimal reduced-rank estimation and filtering. IEEE Trans. Signal Process. 2001, 49, 457–469. [Google Scholar]
- Zhu, X.L.; Zhang, X.D.; Ding, Z.Z.; Jia, Y. Adaptive nonlinear PCA algorithms for blind source separation without prewhitening. IEEE Trans. Circuits Syst. I Regul. Pap. 2006, 53, 745–753. [Google Scholar]
- Wang, Q.; Jing, Y. New rank detection methods for reduced-rank MIMO systems. EURASIP J. Wirel. Comm. Netw. 2015, 2015, 1–16. [Google Scholar] [CrossRef]
- Bai, L.; Dou, S.; Xiao, Z.; Choi, J. Doubly iterative multiple-input-multiple-output-bit-interleaved coded modulation receiver with joint channel estimation and randomised sampling detection. IET Signal Process. 2016, 10, 335–341. [Google Scholar] [CrossRef]
- Brillinger, D.R. Time Series: Data Analysis and Theory; SIAM: Philadelphia, PA, USA, 2001. [Google Scholar]
- Jolliffe, I. Principal Component Analysis, 2nd ed.; Springer: New York, NY, USA, 2002. [Google Scholar]
- Bühlmann, P.; van de Geer, S. Statistics for High-Dimensional Data; Springer: New York, NY, USA, 2011. [Google Scholar]
- She, Y.; Li, S.; Wu, D. Robust Orthogonal Complement Principal Component Analysis. J. Am. Stat. Assoc. 2016, 111, 763–771. [Google Scholar] [CrossRef]
- Torokhti, A.; Friedland, S. Towards theory of generic Principal Component Analysis. J. Multivar. Anal. 2009, 100, 661–669. [Google Scholar] [CrossRef]
- Saghri, J.A.; Schroeder, S.; Tescher, A.G. Adaptive two-stage Karhunen-Loeve-transform scheme for spectral decorrelation in hyperspectral bandwidth compression. Opt. Eng. 2010, 49, 057001. [Google Scholar] [CrossRef]
- Kountchev, R.; Kountcheva, R. Hierarchical Adaptive KL-Based Transform: Algorithms and Applications. In Computer Vision in Control Systems-1; Favorskaya, M.N., Jain, L.C., Eds.; Springer: Cham, Switzerland, 2015; Volume 73, pp. 91–136. [Google Scholar]
- de Moura, E.P.; de Abreu Melo Junior, F.; Damasceno, F.; Figueiredo, L.; de Andrade, C.; de Almeida, M. Classification of imbalance levels in a scaled wind turbine through detrended fluctuation analysis of vibration signals. Renew. Energy 2016, 96 Pt A, 993–1002. [Google Scholar] [CrossRef]
- Azimi, R.; Ghayekhloo, M.; Ghofrani, M.; Sajedi, H. A novel clustering algorithm based on data transformation approaches. Expert Syst. Appl. 2017, 76, 59–70. [Google Scholar] [CrossRef]
- Burge, M.J.; Burger, W. Digital Image Processing—An Algorithmic Introduction Using Java; Springer: London, UK, 2006. [Google Scholar]
- Phophalia, A.; Rajwade, A.; Mitra, S.K. Rough set based image denoising for brain MR images. Signal Process. 2014, 103, 24–35. [Google Scholar] [CrossRef]
- Chen, S.; Billings, S.A.; Grant, P. Non-linear system identification using neural networks. Int. J. Control 1990, 51, 1191–1214. [Google Scholar] [CrossRef]
- Chen, S.; Billings, S.A. Neural networks for nonlinear dynamic system modelling and identification. Int. J. Control 1992, 56, 319–346. [Google Scholar] [CrossRef]
- Fomin, V.N.; Ruzhansky, M.V. Abstract Optimal Linear Filtering. SIAM J. Control Optim. 2000, 38, 1334–1352. [Google Scholar] [CrossRef]
- Temlyakov, V.N. Nonlinear Methods of Approximation. Found. Comput. Math. 2003, 3, 33–107. [Google Scholar] [CrossRef]
- Torokhti, A.; Howlett, P. Best approximation of identity mapping: The case of variable memory. J. Approx. Theory 2006, 143, 111–123. [Google Scholar] [CrossRef]
- Liu, W.; Zhang, H.; Yu, K.; Tan, X. Optimal linear filtering for networked systems with communication constraints, fading measurements, and multiplicative noises. Int. J. Adapt. Control Signal Process. 2016. [Google Scholar] [CrossRef]
- Aminzare, Z.; Sontag, E.D. Synchronization of Diffusively-Connected Nonlinear Systems: Results Based on Contractions with Respect to General Norms. IEEE Trans. Netw. Sci. Eng. 2014, 1, 91–106. [Google Scholar] [CrossRef]
- Schneider, M.K.; Willsky, A.S. A Krylov Subspace Method for Covariance Approximation and Simulation of Random Processes and Fields. Multidimens. Syst. Signal Process. 2003, 14, 295–318. [Google Scholar] [CrossRef]
- Chen, S.; Billings, S.A. Representations of non-linear systems: The NARMAX model. Int. J. Control 1989, 49, 1013–1032. [Google Scholar] [CrossRef]
- Piroddi, L.; Spinelli, W. An identification algorithm for polynomial NARX models based on simulation error minimization. Int. J. Control 2003, 76, 1767–1781. [Google Scholar] [CrossRef]
- Alter, O.; Golub, G.H. Singular value decomposition of genome-scale mRNA lengths distribution reveals asymmetry in RNA gel electrophoresis band broadening. Proc. Natl. Acad. Sci. USA 2006, 103, 11828–11833. [Google Scholar] [CrossRef]
- Gianfelici, F.; Turchetti, C.; Crippa, P. A non-probabilistic recognizer of stochastic signals based on KLT. Signal Process. 2009, 89, 422–437. [Google Scholar] [CrossRef]
- Formaggia, L.; Quarteroni, A.; Veneziani, A. (Eds.) Cardiovascular Mathematics—Modeling and Simulation of the Circulatory System; Springer: Milano, Italy, 2009; Volume 1. [Google Scholar]
- Ambrosi, D.; Quarteroni, A.; Rozza, G. (Eds.) Modelling of Physiological Flows; Springer: Milano, Italy, 2011; Volume V. [Google Scholar]
- Piotrowski, T.; Yamada, I. Performance of the stochastic MV-PURE estimator in highly noisy settings. J. Frankl. Inst. 2014, 351, 3339–3350. [Google Scholar] [CrossRef]
- Poor, H.V. An Introduction to Signal Processing and Estimation, 2nd ed.; Springer: New York, NY, USA, 2001. [Google Scholar]
- Torokhti, A.; Soto-Quiros, P. Optimal modeling of nonlinear systems: Method of variable injections. Proyecciones 2024, 43, 189–224. [Google Scholar] [CrossRef]
- Yang, S.; Xing, T.; Ke, C.; Liang, J.; Ke, X. Effect of wavefront distortion on the performance of coherent detection systems: Theoretical analysis and experimental research. Photonics 2023, 10, 493. [Google Scholar] [CrossRef]
- Stanković, L.; Brajović, M.; Stanković, I.; Lerga, J.; Daković, M. RANSAC-based signal denoising using compressive sensing. Circuits Syst. Signal Process. 2021, 40, 3907–3928. [Google Scholar] [CrossRef]
- Wang, Y.; Cheng, K.; Zhao, S.; Xu, E. Human ear image recognition method using PCA and Fisherface complementary double feature extraction. J. Artif. Intell. Technol. 2023, 3, 18–24. [Google Scholar] [CrossRef]
- Al-Saffar, N.F.H.; Al-Saiq, I.R. Symmetric text encryption scheme based Karhunen Loeve transform. J. Discret. Math. Sci. Cryptogr. 2022, 25, 2773–2781. [Google Scholar] [CrossRef]
- Soto-Quiros, P.; Torokhti, A. Fast random vector transforms in terms of pseudo-inverse within the Wiener filtering paradigm. J. Comput. Appl. Math. 2024, 448, 115927. [Google Scholar] [CrossRef]
- Howlett, P.; Torokhti, A. Optimal approximation of a large matrix by a sum of projected linear mappings on prescribed subspaces. Electron. J. Linear Algebra 2024, 40, 585–605. [Google Scholar] [CrossRef]
- Torokhti, A.; Howlett, P. Optimal estimation of distributed highly noisy signals within KLT-Wiener archetype. Digit. Signal Process. 2023, 143, 104225. [Google Scholar] [CrossRef]
- Howlett, P.; Torokhti, A. An optimal linear filter for estimation of random functions in Hilbert space. ANZIAM J. 2020, 62, 274–301. [Google Scholar] [CrossRef]
- Ledoit, O.; Wolf, M. A well-conditioned estimator for large-dimensional covariance matrices. J. Multivar. Anal. 2004, 88, 365–411. [Google Scholar] [CrossRef]
- Ledoit, O.; Wolf, M. Nonlinear shrinkage estimation of large-dimensional covariance matrices. Ann. Stat. 2012, 40, 1024–1060. [Google Scholar] [CrossRef]
- Adamczak, R.; Litvak, A.E.; Pajor, A.; Tomczak-Jaegermann, N. Quantitative estimates of the convergence of the empirical covariance matrix in log-concave ensembles. J. Am. Math. Soc. 2009, 2, 535–561. [Google Scholar] [CrossRef]
- Vershynin, R. How Close is the Sample Covariance Matrix to the Actual Covariance Matrix? J. Theor. Probab. 2012, 25, 655–686. [Google Scholar] [CrossRef]
- Won, J.-H.; Lim, J.; Kim, S.-J.; Rajaratnam, B. Condition-number-regularized covariance estimation. J. R. Stat. Soc. Ser. B 2013, 75, 427–450. [Google Scholar] [CrossRef] [PubMed]
- Yang, R.; Berger, J.O. Estimation of a covariance matrix using the reference prior. Ann. Statist. 1994, 22, 1195–1211. [Google Scholar] [CrossRef]
- Ben-Israel, A.; Greville, T.N.E. Generalized Inverses: Theory and Applications, 2nd ed.; Springer: New York, NY, USA, 2003. [Google Scholar]
- Hotelling, H. Analysis of a Complex of Statistical Variables into Principal Components. J. Educ. Psychol. 1933, 6, 417–441 and 498–520. [Google Scholar] [CrossRef]
- Karhunen, K. Über Lineare Methoden in der Wahrscheinlichkeitsrechnung. Ann. Acad. Sci. Fenn. Ser. A. I. Math.-Phys. 1947, 1, 1–79. [Google Scholar]
- Loève, M. Fonctions aléatoires de second ordre. In Processus Stochastiques et Mouvement Brownien; Lévy, P., Ed.; Hermann: Paris, France, 1948. [Google Scholar]
- Scharf, L. The SVD and reduced rank signal processing. Signal Process. 1991, 25, 113–133. [Google Scholar] [CrossRef]
- Hua, Y.; Liu, W. Generalized Karhunen-Loeve transform. IEEE Signal Process. Lett. 1998, 5, 141–142. [Google Scholar]
- Torokhti, A.; Soto-Quiros, P. Generalized Brillinger-Like Transforms. IEEE Signal Process. Lett. 2016, 23, 843–847. [Google Scholar] [CrossRef]
- Torokhti, A.; Miklavcic, S. Data compression under constraints of causality and variable finite memory. Signal Process. 2010, 90, 2822–2834. [Google Scholar] [CrossRef]
- Lathauwer, L.D.; Moor, B.D.; Vandewalle, J. On the Best Rank-1 and Rank-(R1,R2, …, RN) Approximation of Higher-Order Tensors. SIAM J. Matrix Anal. Appl. 2000, 21, 1324–1342. [Google Scholar] [CrossRef]
- Grasedyck, L.; Kressner, D.; Tobler, C. A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen 2013, 36, 53–78. [Google Scholar] [CrossRef]
- Friedland, S.; Tammali, V. Low-Rank Approximation of Tensors. In Numerical Algebra, Matrix Theory, Differential-Algebraic Equations and Control Theory; Benner, P., Bollhöfer, M., Kressner, D., Mehl, C., Stykel, T., Eds.; Springer: Cham, Switzerland, 2015; pp. 377–411. [Google Scholar]
- Billings, S.A. Nonlinear System Identification-Narmax Methods in the Time, Frequency, and Spatio-Temporal Domains; John Wiley and Sons, Ltd.: Hoboken, NJ, USA, 2013. [Google Scholar]
- Schoukens, M.; Tiels, K. Identification of Block-oriented Nonlinear Systems Starting from Linear Approximations: A Survey. Automatica 2017, 85, 272–292. [Google Scholar] [CrossRef]
- Chen, B.; He, S.; Li, Z.; Zhang, S. Maximum Block Improvement and Polynomial Optimization. SIAM J. Optim. 2012, 22, 87–107. [Google Scholar] [CrossRef]
- Abed-Meraim, K.; Qui, W.; Hua, Y. Blind system identification. Proc. IEEE 1997, 85, 1310–1322. [Google Scholar] [CrossRef]
- Hua, Y. Blind methods of system identification. Circuits Syst. Signal Process. 2002, 21, 91–108. [Google Scholar] [CrossRef]
- Shi, X. Mathematical Description of Blind Signal Processing. In Blind Signal Processing: Theory and Practice; Springer: Berlin/Heidelberg, Germany, 2011; pp. 27–59. [Google Scholar]
- Torokhti, A.; Howlett, P. Optimal Fixed Rank Transform of the Second Degree. IEEE Trans. Circuits Syst. II Analog. Digit. Signal Process. 2001, 48, 309–315. [Google Scholar] [CrossRef]
- Torokhti, A.; Howlett, P. Computational Methods for Modelling of Nonlinear Systems; Elsevier: Amsterdam, The Netherlands, 2007. [Google Scholar]
- Friedland, S.; Torokhti, A. Generalized Rank-Constrained Matrix Approximations. SIAM J. Matrix Anal. Appl. 2007, 29, 656–659. [Google Scholar] [CrossRef]
- Liu, X.; Li, W.; Wang, H. Rank constrained matrix best approximation problem with respect to (skew) Hermitian matrices. J. Comput. Appl. Math. 2017, 319, 77–86. [Google Scholar] [CrossRef]
- Boutsidis, C.; Woodruff, D.P. Optimal CUR Matrix Decompositions. SIAM J. Comput. 2017, 46, 543–589. [Google Scholar] [CrossRef]
- Wang, H. Rank constrained matrix best approximation problem. Appl. Math. Lett. 2015, 50, 98–104. [Google Scholar] [CrossRef]
- Golub, G.; Loan, C.F.V. Matrix Computations; Johns Hopkins University Press: Baltimore, MD, USA, 1996. [Google Scholar]
- Kanwal, R. Linear Integral Equations; Birkhäuser Boston: Boston, MA, USA, 1996. [Google Scholar]
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


