Abstract
Applications of the gradient method for nonlinear optimization in the development of Gradient Neural Networks (GNN) and Zhang Neural Networks (ZNN) are investigated. In particular, the solution of the matrix equation which changes over time is studied using a novel GNN model, termed GGNN. The GGNN model is developed by applying the GNN dynamics to the gradient of the error matrix used in the development of the GNN model. The convergence analysis shows that the neural state matrix of the GGNN design converges asymptotically to the solution of the matrix equation , for any initial state matrix. It is also shown that the convergence result is a least-squares solution which depends on the selected initial matrix. A hybridization of GGNN with the analogous modification GZNN of the ZNN dynamics is considered. The Simulink implementation of the presented GGNN models is carried out on a set of real matrices.
Keywords:
gradient neural network; generalized inverses; Moore–Penrose inverse; linear matrix equations
MSC:
68T05; 15A09; 65F20
1. Introduction and Background
Recurrent neural networks (RNNs) are an important class of algorithms for computing matrix (generalized) inverses. These algorithms are used to find the solutions of matrix equations or to minimize certain nonlinear matrix functions. RNNs are divided into two subgroups: Gradient Neural Networks (GNNs) and Zhang Neural Networks (ZNNs). The GNN design is explicit and mostly applicable to time-invariant problems, which means that the coefficients of the equations that are addressed are constant matrices. ZNN models can be implicit and are able to solve time-varying problems, where the coefficients of the equations depend on the variable representing time [1,2,3].
The Moore–Penrose inverse of a given matrix is the unique matrix that satisfies the well-known Penrose equations [4,5]:
where denotes the transpose matrix. The rank of a matrix A, i.e., the maximum number of linearly independent columns in A, is denoted by .
Applications of linear algebra tools and generalized inverses can be found in important areas such as the modeling of electrical circuits [6], the estimation of DNA sequences [7] and the balancing of chemical equations [8,9], as well as in other important research domains related to robotics [10] and statistics [11]. A number of iterative methods for solving matrix equations based on gradient values have been proposed [12,13,14,15].
In the following sections, we will focus on GNN and ZNN dynamical systems based on the gradient of the objective function and their implementation. The main goal of this research is the analysis of convergence and the study of analytic solutions.
Models with GNN neural designs for computing the inverse or the Moore–Penrose inverse and linear matrix equations were proposed in [16,17,18,19]. Further, various dynamical systems aimed at approximating the pseudo-inverse of rank-deficient matrices were developed in [16]. Wei, in [20], proposed three RNN models for the approximation of the weighted Moore–Penrose inverse. Online matrix inversion in a complex matrix case was considered in [21]. A novel GNN design based on nonlinear activation functions (AFs) was proposed and analyzed in [22,23] for solving the constant Lyapunov matrix equation online. A fast convergent GNN aimed at solving a system of linear equations was proposed and numerically analyzed in [24]. Xiao, in [25], investigated the finite-time convergence of an appropriately accelerated ZNN for the online solution of the time-varying complex matrix equation . A comparison with the corresponding GNN design was considered. Two improved nonlinear GNN dynamical systems for approximating the Moore–Penrose inverse of full-row or full-column rank matrices were proposed and considered in [26]. GNN-type models for solving matrix equations and computing related generalized inverses were developed in [1,3,13,16,18,20,27,28,29]. The acceleration of GNN dynamics to a finite-time convergence has been investigated recently. A finite-time convergent GNN for approximating online solutions of the general linear matrix equation was proposed in [30]. This goal was achieved using two activation functions (AFs) in the construction of the GNN. The influence of AFs on the convergence performance of a GNN design for solving the matrix equation was investigated in [31]. A fixed-time convergent GNN for solving the Sylvester equation was investigated in [32]. Moreover, noise-tolerant GNN models equipped with a suitable activation function (AF) able to solve convex optimization problems were developed in [33].
Our goal is to solve the equation and apply its particular cases in computing generalized inverses in real time by improving the GNN model developed in [34]; the corresponding dynamical system is denoted by GNN. Our motivation is to improve this GNN model and to develop a novel gradient-based model, termed GGNN, utilizing a novel type of dynamical system. The proposed GGNN model is based on the standard GNN dynamics applied along the gradient of the standard error matrix. The convergence analysis reveals the global asymptotic convergence of GGNN without restrictions, while the output belongs to the set of general solutions to the matrix equation .
In addition, we propose gradient-based modifications of the hybrid models developed in [35] as proper combinations of GNN and ZNN models for solving the matrix equations and with constant coefficients. Analogous hybridizations for approximating the matrix inverse were developed in [36], while two modifications of the ZNN design for computing the Moore–Penrose inverse were proposed in [37]. Hybrid continuous-gradient–Zhang neural dynamics for solving linear time-variant equations were investigated in [38,39]. The developed hybrid GNN-ZNN models in this paper are aimed at solving the matrix equations and , denoted by and , respectively.
The implementation was performed in MATLAB Simulink, and numerical experiments were performed with simulations of the GNN, GGNN and HGZNN models.
The GNN used to solve the general linear matrix equation is defined over the error matrix , where is time, and is an unknown state-variable matrix that approximates the unknown matrix X in . The goal function is , where denotes the Frobenius norm of a matrix. The gradient of is equal to
The GNN evolutionary design is defined by the dynamic system
where is a real parameter used to speed up the convergence, and denotes the time derivative of . Thus, the linear GNN aimed at solving is given by the following dynamics:
The dynamical flow (2) is denoted as GNN. The nonlinear GNN for solving is defined by
The function array is based on an appropriate odd and monotonically increasing activation function, applied elementwise to a real matrix , i.e., .
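As a concrete illustration of the flow described above, the following is a minimal MATLAB sketch; it assumes that the general linear matrix equation is AXB = D and that the linear GNN dynamics take the gradient-descent form dV(t)/dt = γAᵀ(D − AV(t)B)Bᵀ, as in [34], while the matrices, the gain and the time horizon are illustrative and are not taken from the paper.

```matlab
% Minimal sketch of the linear GNN flow, assuming the GLME is A*X*B = D and the
% dynamics dV/dt = gamma*A'*(D - A*V*B)*B' (cf. [34]); all data are illustrative.
A = [1 2; 3 4];  B = eye(2);  D = eye(2);        % here the exact solution is inv(A)
gamma = 100;                                     % convergence gain
V0 = zeros(size(A, 2), size(B, 1));              % arbitrary initial state
rhs = @(t, v) reshape(gamma*A'*(D - A*reshape(v, size(V0))*B)*B', [], 1);
[~, v] = ode15s(rhs, [0 1], V0(:));              % stiff solver, as used in the paper
V_end = reshape(v(end, :), size(V0));            % approximate solution of A*X*B = D
disp(norm(A*V_end*B - D, 'fro'))                 % residual norm close to zero
```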
Proposition 1 restates restrictions on the solvability of and its general solution.
Proposition 1
([4,5]). If and , then the fulfillment of the condition
is necessary and sufficient for the solvability of the linear matrix equation . In this case, the set of all solutions is given by
The following results from [34] describe the conditions of convergence and the limit of the unknown matrix from (3) as .
Proposition 2
The research in [40] investigated various ZNN models based on optimization methods. The goal of the current research is to develop a GNN model based on the gradient of instead of the original goal function .
The obtained results are summarized as follows:
- A novel error function is proposed for the development of the GNN dynamical evolution.
- The GNN design based on the error function is developed and analyzed theoretically and numerically.
- A hybridization of GNN and ZNN dynamical systems based on the error matrix is proposed and investigated.
The overall organization of this paper is as follows. The motivation and derivation of the GGNN and GZNN models are presented in Section 2. Section 3 is dedicated to the convergence analysis of GGNN dynamics. A numerical comparison of GNN and GGNN dynamics is given in Section 4. Neural dynamics based on the hybridization of GGNN and GZNN models for solving matrix equations are considered in Section 5. Numerical examples of hybrid models are analyzed in Section 6. Finally, the last section presents some concluding remarks and a vision of further research.
2. Motivation and Derivation of GGNN and GZNN Models
The standard GNN design (2) solves the GLME under constraint (4). Our goal is to resolve this restriction and propose dynamic evolutions based on error functions that tend to zero without restrictions.
Our goal is to define the GNN design for solving the GLME based on the error function
According to known results from nonlinear unconstrained optimization [41], the equilibrium points of (7) satisfy
We continue the investigation from [40]. More precisely, we develop the GNN model based on the error function instead of the error function . In this way, new neural dynamics are aimed at forcing the gradient to zero instead of the standard goal function . It is reasonable to call such an RNN model a gradient-based GNN (abbreviated GGNN).
Proposition 3 gives the conditions for the solvability of the matrix equations and and the general solutions to these systems.
Proposition 3
([40]). Consider the arbitrary matrices , and . The following statements are true:
Proof.
(a) This part of the proof follows from known results on the solvability and the general solution of linear matrix equations in terms of generalized inverses [4] (p. 52, Theorem 1) and their application to the matrix equation .
(b) According to [4] (p. 52, Theorem 1), the matrix equation
is consistent if and only if
is satisfied. Indeed, applying the properties , and , of the Moore–Penrose inverse [5] results in
In addition, based on [4] (p. 52, Theorem 1), the general solution to is
which coincides with (5). □
In this way, the matrix equation is solvable under condition (4), while the equation is always consistent. In addition, the general solutions to equations and are identical [40].
The next step is to define the GGNN dynamics using the error matrix . Let us define the objective function , whose gradient is equal to
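Since the displayed gradient is not reproduced above, a reconstruction by standard matrix calculus, under the assumption that the gradient-based error matrix is $E_G(V)=A^{\mathrm T}(D-AVB)B^{\mathrm T}$ and the objective is $\varepsilon_G(V)=\tfrac{1}{2}\|E_G(V)\|_F^2$, reads
\[
\frac{\partial \varepsilon_G(V)}{\partial V}
  = -A^{\mathrm T}A\,E_G(V)\,BB^{\mathrm T}
  = A^{\mathrm T}A\,A^{\mathrm T}\!\left(AVB-D\right)B^{\mathrm T}BB^{\mathrm T},
\]
so the corresponding gradient flow evolves $V(t)$ in the direction $A^{\mathrm T}A\,E_G(V(t))\,BB^{\mathrm T}$.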
The dynamical system for the GGNN formula is obtained by applying the GNN evolution along the gradient of based on , as follows:
The nonlinear GGNN dynamics are defined as
in which denotes the elementwise application of an odd and monotonically increasing function , as mentioned in the previous section for the GNN model (3). Model (10) is termed GGNN. Three activation functions are used in numerical experiments:
- 1. Linear function;
- 2. Power-sigmoid activation function, where , and is an odd integer;
- 3. Smooth power-sigmoid function, where , and is an odd integer.

(A MATLAB sketch of these activation functions and of the assumed GGNN evolution is given immediately below.)
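Since the displayed formulas above are elided, the following sketch uses the standard power-sigmoid and smooth power-sigmoid forms from the GNN/ZNN literature (an assumption, with parameters denoted here by $\xi$ and $p$), together with an assumed GGNN evolution $\dot V(t)=\gamma A^{\mathrm T}A\,\mathcal F\!\left(A^{\mathrm T}(D-AV(t)B)B^{\mathrm T}\right)BB^{\mathrm T}$ obtained by applying the GNN design to the gradient-based error matrix; all data are illustrative.

```matlab
% Sketch under stated assumptions: standard power-sigmoid and smooth power-sigmoid
% activations (literature forms, not reproduced from the paper) and an assumed GGNN
% evolution dV/dt = gamma*(A'*A)*F(A'*(D - A*V*B)*B')*(B*B').
powersig = @(x, xi, p) (abs(x) >= 1).*x.^p + (abs(x) < 1) .* ...
    ((1 + exp(-xi))./(1 - exp(-xi))).*((1 - exp(-xi.*x))./(1 + exp(-xi.*x)));
smoothpowersig = @(x, xi, p) 0.5*x.^p + ...
    0.5*((1 + exp(-xi))./(1 - exp(-xi))).*((1 - exp(-xi.*x))./(1 + exp(-xi.*x)));

gamma = 100;  xi = 4;  p = 3;                    % illustrative gain and AF parameters
A = [1 0; 1 0];  B = eye(2);  D = A;             % rank-deficient illustrative data
F = @(E) powersig(E, xi, p);                     % elementwise activation of a matrix
V0 = zeros(size(A, 2), size(B, 1));
ggnn = @(t, v) reshape(gamma*(A'*A)* ...
    F(A'*(D - A*reshape(v, size(V0))*B)*B')*(B*B'), [], 1);
[~, v] = ode15s(ggnn, [0 1], V0(:));
V_end = reshape(v(end, :), size(V0));
disp(norm(A'*(D - A*V_end*B)*B', 'fro'))         % gradient-based error norm near zero
```

Replacing powersig with smoothpowersig, or with the identity map, reproduces the three activation choices listed above.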
Figure 1.
Simulink implementation of GGNN evolution (10).
On the other hand, the GZNN model, defined using the ZNN dynamics on the Zhangian matrix , is defined in [40] by the general evolutionary design
3. Convergence Analysis of GGNN Dynamics
In this section, we will analyze the convergence properties of the GGNN model given by dynamics (10).
Theorem 1.
Consider matrices and . If an odd and monotonically increasing array activation function based on an elementwise function is used, then the activation state matrix of the model (10) asymptotically converges to the solution of the matrix equation , i.e., as , for an arbitrary initial state matrix .
Proof.
From statement (b) of Proposition 3, the solvability of is ensured. The substitution transforms the dynamics (10) into
The Lyapunov function candidate that measures the convergence performance is defined by
The conclusion is . According to (16), assuming (15) and using , in conjunction with the basic properties of the matrix trace function, one can express the time derivative of as follows:
Since the scalar-valued function is odd and monotonically increasing, it follows that, for
which implies
Observing the identity
and using the Lyapunov stability theory, globally converges to the zero matrix from an arbitrary initial value . □
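For completeness, the elementwise property of an odd and monotonically increasing scalar activation $f$ that drives the sign argument in the proof can be stated explicitly (a standard observation, written here in generic notation for a matrix $E=(e_{ij})$):
\[
e_{ij}f(e_{ij})>0 \ \text{for } e_{ij}\neq 0
\quad\text{and}\quad
e_{ij}f(e_{ij})=0 \ \text{for } e_{ij}=0
\;\;\Longrightarrow\;\;
\operatorname{tr}\!\left(E^{\mathrm T}\mathcal F(E)\right)=\sum_{i,j}e_{ij}f(e_{ij})\ge 0,
\]
with equality only for $E=0$, which makes the time derivative of the Lyapunov candidate nonpositive and equal to zero only at the equilibrium.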
Theorem 2.
The activation state-variable matrix of the model , defined by (10), is convergent as , and its equilibrium state is
for every initial state matrix .
Proof.
From (10), the matrix satisfies
According to the basic properties of the Moore–Penrose inverse [5], it follows that
which further implies
Consequently, satisfies , which implies
Furthermore, from Theorem 1, , and converges to
as . Therefore, converges to the equilibrium state
The proof is finished. □
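To connect Theorem 2 with a computation, the following MATLAB sketch checks the predicted limit under two assumptions stated here rather than reproduced from the paper: the linear GGNN evolution is dV/dt = γ(AᵀA)(Aᵀ(D − AVB)Bᵀ)(BBᵀ), and the zero initial state yields the minimum-norm least-squares solution A†DB† of AXB = D; the matrices are illustrative.

```matlab
% Sketch checking the equilibrium predicted by Theorem 2 (assumed form: with V(0) = 0
% the linear GGNN flow converges to pinv(A)*D*pinv(B)); all data are illustrative.
rng(1);                                           % reproducible illustrative data
A = rand(4, 3);  B = rand(2, 5);  D = rand(4, 5); % A*X*B = D is, in general, inconsistent
gamma = 1e4;                                      % gain parameter
V0 = zeros(size(A, 2), size(B, 1));               % zero initial state
rhs = @(t, v) reshape(gamma*(A'*A)*(A'*(D - A*reshape(v, size(V0))*B)*B')*(B*B'), [], 1);
[~, v] = ode15s(rhs, [0 10], V0(:));
V_end = reshape(v(end, :), size(V0));
disp(norm(V_end - pinv(A)*D*pinv(B), 'fro'))      % a small value confirms the predicted limit
```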
4. Numerical Experiments on GNN and GGNN Dynamics
The numerical examples in this section are based on the Simulink implementation of the GGNN formula in Figure 1.
The parameter , initial state and parameters and of the nonlinear activation functions (12) and (13) are entered directly into the model, while matrices A, B and D are defined from the workspace. It is assumed that in all examples. The ode15s differential equation solver is used in the configuration parameters. In all examples, denotes the theoretical solution.
The blocks powersig, smoothpowersig and transpmult include the codes described in [34,42].
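For completeness, a sketch of the assumed workspace-driven workflow is given below; the model name GGNN_model is hypothetical and stands for the Simulink diagram of Figure 1, and the data are illustrative.

```matlab
% Hypothetical workflow sketch: 'GGNN_model' stands for the Simulink diagram of
% Figure 1 and is not a file distributed with the paper.
A = rand(5, 3);  B = rand(4, 6);  D = A*rand(3, 4)*B;   % consistent illustrative data
gamma = 1e6;                                            % gain parameter read by the model
load_system('GGNN_model');                              % load the (hypothetical) model
set_param('GGNN_model', 'Solver', 'ode15s', 'StopTime', '10');
out = sim('GGNN_model');                                % logged signals are inspected afterwards
```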
Example 1.
Let us consider the idempotent matrix A from [43,44],
of , and the theoretical Moore–Penrose inverse
The matrix equation corresponding to the Moore–Penrose inverse is [16], which implies the error function . The corresponding GNN model is defined by GNN, where denotes the identity and zero matrix. Constraint (4) reduces to the condition , which is not satisfied. The input parameters of GNN are , , where denotes the zero matrix. The corresponding GGNN design is based on the error matrix . The Simulink implementation of GGNN from Figure 1 and the Simulink implementation of GNN from [34] produce, in this case, the graphical results presented in Figure 2 and Figure 3, which display the behaviors of the norms and , respectively. It is observable that the norms generated by the GGNN formula vanish to zero faster than the corresponding norms in the GNN model. The graphs in the presented figures confirm the fast convergence of the GGNN dynamical system and indicate the applicability of the model (10) to problems that require the computation of the Moore–Penrose inverse.
Figure 2.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. in GGNN compared to GNN in Example 1.
Figure 3.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. in GGNN compared to GNN in Example 1.
Example 2.
Let us consider the matrices
The exact minimum-norm least-squares solution is
The ranks of the input matrices are equal to , and . Constraint (4) is satisfied in this case. The linear GGNN formula (10) is applied to solve the matrix equation . The gain parameter of the model is , , and the stopping time is , which gives
The elementwise trajectories of the state matrix are shown in Figure 4a–c with solid red lines for the linear, power-sigmoid and smooth power-sigmoid activation functions, respectively. The fast convergence of the elementwise trajectories to the corresponding black dashed trajectories of the theoretical solution is notable. In addition, faster convergence caused by the nonlinear AFs and is noticeable in Figure 4b,c. The trajectories in the figures indicate the usual convergence behavior, so the system is globally asymptotically stable. The norms of the error matrix of both the GNN and GGNN models under linear and nonlinear AFs are shown in Figure 5a–c. The power-sigmoid and smooth power-sigmoid activation functions show superior convergence speed compared with linear activation. In each graph in Figure 5a–c, the Frobenius norm of the error matrix in the GGNN formula vanishes to zero faster than that in the GNN model. Moreover, in each graph in Figure 6a–c, the Frobenius norm in the GGNN formula vanishes to zero faster than that in the GNN model, which confirms that the proposed dynamical system (10) achieves accelerated convergence compared to (3).
Figure 4.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. Elementwise convergence trajectories of the GGNN network in Example 2.
Figure 5.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. in GGNN compared to GNN in Example 2.
Figure 6.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. in GGNN compared to GNN in Example 2.
All graphs shown in Figure 5 and Figure 6 confirm the applicability of the proposed GGNN design compared to the traditional GNN design, even if constraint (4) holds.
Example 3.
Let us explore the behavior of GNN and GGNN dynamics for computing the Moore–Penrose inverse of the matrix
The Moore–Penrose inverse of A is equal to
The rank of the input matrix is equal to . Consequently, the matrix A is left invertible and satisfies . The error matrix initiates the GNN dynamics for computing . The gradient-based error matrix
initiates the GGNN design.
The gain parameter of the model is , and the initial state is with a stop time .
The Frobenius norms of the error matrix generated by the linear GNN and GGNN models for different values of γ () are shown in Figure 7a–c. The graphs in these figures confirm an increase in the convergence speed caused by the increase in the gain parameter γ. Because of that, the considered time intervals are , and , respectively. In all three scenarios, a faster convergence of the GGNN model is observable compared to the GNN design. The values of the norm generated by both the GNN and GGNN models with linear and two nonlinear activation functions are shown in Figure 8a–c. As in the previous example, the GGNN converges faster than the GNN model.
Figure 7.
(a) , . (b) , . (c) , . for different in GGNN compared to GNN in Example 3.
Figure 8.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. in GGNN compared to GNN in Example 3.
In addition, the graphs in Figure 8b,c, corresponding to the power-sigmoid and smooth power-sigmoid AFs, respectively, show a certain level of instability in convergence, as well as an increase in the value of .
Example 4.
Consider the matrices
which do not satisfy . Now, we apply the GNN and GGNN formulae to solve the matrix equation . The standard error function is defined as . So, we consider GNN. The error matrix for the corresponding GGNN model is , which initiates the GGNN flow. The gain parameter of the model is , and the final time is . The zero initial state generates the best approximate solution of the matrix equation , given by
The Frobenius norms of the error matrix in the GNN and GGNN models for both linear and nonlinear activation functions are shown in Figure 9a–c, and the error matrix in both models for linear and nonlinear activation functions are shown in Figure 10a–c. It is observable that the GGNN converges faster than GNN.
Figure 9.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. in GGNN compared to GNN in Example 4.
Figure 10.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. in GGNN compared to GNN in Example 4.
Example 5.
Table 1 and Table 2 show the results obtained during experiments conducted with nonsquare matrices, where is the dimension of the matrix. Table 1 lists the input data used to perform the experiments with the Simulink model and to generate the results in Table 2. The best cases in Table 2 are marked in bold.
Table 1.
Input data.
Table 2.
Experimental results based on data presented in Table 1.
The numerical results arranged in Table 2 are divided into two parts by a horizontal line. The upper part corresponds to the test matrices of dimensions , while the lower part corresponds to the dimensions . Considering the first two columns, it is observable from the upper part that the GGNN generates smaller values compared to the GNN. The values of in the lower part generated by the GNN and GGNN are equal. Considering the third and fourth columns, it is observable from the upper part that the GGNN generates smaller values compared to the GNN. On the other hand, the values of in the lower part, generated by the GGNN, are smaller than the corresponding values generated by the GNN. The last two columns show that the GGNN requires less CPU time compared to the GNN. The general conclusion is that the GGNN model is more efficient on rank-deficient test matrices of larger order .
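In the spirit of Table 2, the following MATLAB sketch compares residual norms and CPU times of the linear GNN and GGNN flows on a rank-deficient test matrix; the paper's test data from Table 1 are not reproduced here, so the matrices, the gain and the dynamics are the illustrative assumptions used in the earlier sketches.

```matlab
% Illustrative GNN vs. GGNN comparison (residual norm and CPU time), using the
% dynamics assumed in the earlier sketches and randomly generated test data.
rng(2);
A = rand(20, 8)*rand(8, 10);                    % rank-deficient 20 x 10 test matrix
B = eye(20);  D = A*rand(10, 20)*B;             % consistent equation A*X*B = D
gamma = 1e4;  V0 = zeros(size(A, 2), size(B, 1));
gnn  = @(t, v) reshape(gamma*A'*(D - A*reshape(v, size(V0))*B)*B', [], 1);
ggnn = @(t, v) reshape(gamma*(A'*A)*(A'*(D - A*reshape(v, size(V0))*B)*B')*(B*B'), [], 1);
tic;  [~, v1] = ode15s(gnn,  [0 10], V0(:));  t_gnn  = toc;
tic;  [~, v2] = ode15s(ggnn, [0 10], V0(:));  t_ggnn = toc;
r1 = norm(A*reshape(v1(end, :), size(V0))*B - D, 'fro');
r2 = norm(A*reshape(v2(end, :), size(V0))*B - D, 'fro');
fprintf('GNN:  residual %.2e, CPU %.2f s\nGGNN: residual %.2e, CPU %.2f s\n', r1, t_gnn, r2, t_ggnn);
```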
5. Mixed GGNN-GZNN Model for Solving Matrix Equations
The gradient-based error matrix for solving the matrix equation is defined by
The GZNN design (14) corresponding to the error matrix , designated GZNN, is of the form:
Now, the scalar-valued norm-based error function corresponding to is given by
The following dynamic state equation can be derived using the GGNN design formula based on (10):
Further, using a combination of and the GNN dynamics (23), it follows that
The next step is to define the new hybrid model based on the summation of the right-hand sides in (22) and (24), as follows:
The model (25) is derived from the combination of the GGNN model and the GZNN model. Hence, it is equally justified to use the terms Hybrid GGNN (abbreviated HGGNN) and Hybrid GZNN (abbreviated HGZNN). However, model (25) is implicit, so it is not a type of GGNN dynamics. On the other hand, it is designed for time-invariant matrices, which is not in accordance with the common nature of GZNN models, because the GZNN is usually used in the time-varying case. A formal comparison of (25) and GZNN reveals that both methods possess identical left-hand sides, and the right-hand side of (25) can be derived by multiplying the right-hand side of GZNN by the term .
Formally, (25) is closer to GZNN dynamics, so we will denote the model (25) by HGZNN, bearing in mind that this model is not the exact GZNN neural dynamics and is applicable to the time-invariant case, i.e., the case of the constant coefficient matrices A, I and B. Figure 11 represents the Simulink implementation of the HGZNN dynamics (25).
Figure 11.
Simulink implementation of (25).
Now, we will take into account the process of solving the matrix equation . The error matrix for this equation is defined by
The GZNN design (14) corresponding to the error matrix , denoted by GZNN, is of the form:
On the other hand, the GGNN design formula (10) produces the following dynamic state equation:
The GGNN model (27) is denoted by GGNN. It implies
A new hybrid model based on the summation of the right-hand sides in (26) and (28) can be proposed as follows:
The Model (29) will be denoted by HGZNN. This is the case with the constant coefficient matrices I, C and D.
For the purposes of the proof of the following results, we will use to denote the exponential convergence rate of the model . With and , we denote the smallest and largest eigenvalues of the matrix K, respectively. Continuing the previous work, we use three types of activation functions : linear, power-sigmoid and smooth power-sigmoid.
The following theorem determines the equilibrium state of HGZNN and defines its global exponential convergence.
Theorem 3.
Let be given and satisfy , and let be the state matrix of (25), where is defined by , or .
- (a)
- Then, achieves global convergence and satisfies when , starting from any initial state . The state matrix of HGZNN is stable in the sense of Lyapunov.
- (b)
- The exponential convergence rate of the HGZNN model (25) in the linear case is equal towhere is the minimum singular value of A.
- (c)
- The activation state variable matrix of the model HGZNN is convergent when with the equilibrium state matrix
Proof.
(a) The assumption provides the solvability of the matrix equation .
According to similar results from [45], one can verify the following inequality:
We also consider the following inequality from [46], which is valid for a real symmetric matrix K and a real symmetric positive-semidefinite matrix L of the same size:
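Since the display is not reproduced above, the trace bound from [46] is assumed here in its usual form: for a real symmetric matrix $K$ and a real symmetric positive-semidefinite matrix $L$ of the same size,
\[
\lambda_{\min}(K)\operatorname{tr}(L)\;\le\;\operatorname{tr}(KL)\;\le\;\lambda_{\max}(K)\operatorname{tr}(L).
\]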
Now, the following can be chosen: and . Consider , where is the minimum eigenvalue of A, and is the minimum singular value of A. Then, is the minimum nonzero eigenvalue of , which implies
From (33), it can be concluded
According to (34), the Lyapunov stability theory confirms that is a globally asymptotically stable equilibrium point of the HGZNN model (25). So, converges to the zero matrix, i.e., , from any initial state .
- (b)
- From (a), it follows that
This implies
which confirms the convergence rate (30) of HGZNN.
- (c)
- This part of the proof can be verified with the particular case of Theorem 2.
□
Theorem 4.
Let be given and satisfy , and let be the state matrix of (29), where is defined by , or .
- (a)
- Then, achieves global convergence when , starting from any initial state . The state matrix of HGZNN is stable in the sense of Lyapunov.
- (b)
- (c)
- The activation state variable matrix of the model HGZNN is convergent when with the equilibrium state matrix
Proof.
(a) The assumption ensures the solvability of the matrix equation .
Let us define the Lyapunov function by
Hence, from (29) and , it holds that
Following the principles from [45], one can verify the following inequality:
Consider the inequality (32) with the particular settings , . Let be the minimum eigenvalue of . Then, is the minimal nonzero eigenvalue of , which implies
From (37), it can be concluded
According to (38), the Lyapunov stability theory confirms that is a globally asymptotically stable equilibrium point of the HGZNN model (29). So, converges to the zero matrix, i.e., , from any initial state .
- (b)
- From (a), it follows
- (c)
- This part of the proof can be verified with the particular case of Theorem 2.
□
Corollary 1.
(a) Let the matrices be given and satisfy , and let be the state matrix of (25), with an arbitrary nonlinear activation . Then, and .
- (b) Let the matrices be given and satisfy , and let be the state matrix of (29) with an arbitrary nonlinear activation . Then, and .
From Theorem 3 and Corollary 1(a), it follows that
Similarly, according to Theorem 4 and Corollary 1(b), it can be concluded that
According to (41), it follows
As a result, the following conclusions follow:
- -
- HGZNN is always faster than GGNN;
- -
- HGZNN is faster than GZNN in the case where ;
- -
- GZNN is faster than GGNN in the case where .
Remark 2.
The particular HGZNN and GGNN designs define the corresponding modifications of the improved GNN design proposed in [26] if is invertible. In the dual case, HGZNN and GGNN define the corresponding modifications of the improved GNN design proposed in [26] if is invertible.
Regularized HGZNN Model for Solving Matrix Equations
The convergence of HGZNN (resp. HGZNN), as well as GGNN (resp. GGNN), can be improved in the case where (resp. ). There exist two possible situations when the acceleration terms and improve the convergence. The first case assumes the invertibility of A (resp. C), and the second case assumes the left invertibility of A (resp. right invertibility of C). Still, in some situations, the matrices A and C could be rank-deficient. Hence, in the case where A and C are square and singular, it is useful to use the invertible matrices and , instead of A and C and to consider the models HGZNN and HGZNN. The following presents the convergence results considering the nonsingularity of and .
Corollary 2.
Let , be given and be the state matrix of (25), where is defined by , or . Let be a selected real number. Then, the following statements are valid:
- (a)
- The state matrix of the model HGZNN converges globally towhen , starting from any initial state , and the solution is stable in the sense of Lyapunov.
- (b)
- The exponential convergence rate of HGZNN in the case where is equal to
- (c)
- Let be the limiting value of when . Then,
Proof.
Since is invertible, it follows that .
From (31) and the invertibility of , we conclude the validity of (a). In this case, it follows that
The part (b) is proved analogously to the proof of Theorem 3. The last part (c) follows from (a). □
Corollary 3.
Let , be given and be the state matrix of (29), where or . Let be a selected real number. Then, the following statements are valid:
- (a)
- The state matrix of HGZNN converges globally towhen , starting from any initial state , and the solution is stable in the sense of Lyapunov.
- (b)
- The exponential convergence rate of HGZNN in the case where is equal to
- (c)
- Let be the limiting value of when . Then,
Proof.
It can be proved analogously to Corollary 2. □
6. Numerical Examples on Hybrid Models
In this section, numerical examples are presented based on the Simulink implementation of the HGZNN formula. The previously mentioned three types of activation functions in (11), (12) and (13) will be used in the following examples. The parameters , the initial state and the parameters and of the nonlinear activation functions (12) and (13) are entered directly into the model, while the matrices A, B, C and D are defined from the workspace. We assume that in all examples. The ordinary differential equation solver in the configuration parameters is ode15s.
We present numerical examples in which we compare Frobenius norms and , which are generated by HGZNN, GZNN and GGNN.
Example 6.
Consider the matrix
In this example, we compare the HGZNN model with GZNN and GGNN, considering all three types of activation functions. The gain parameter of the model is , the initial state , and the final time is .
The Frobenius norms of the error matrix in the HGZNN, GZNN and GGNN models for both linear and nonlinear activation functions are shown in Figure 12a–c, and the error matrices of the models for linear and nonlinear activation functions are shown in Figure 13a–c. On each graph, the Frobenius norm of the error from the HGZNN formula vanishes to zero faster than those from the GZNN and GGNN models.
Figure 12.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. of HGZNN compared to GGNN and GZNN in Example 6.
Figure 13.
(a) Linear activation. (b) Power–sigmoid activation. (c) Smooth power–sigmoid activation. of HGZNN compared to GGNN and GZNN in Example 6.
Example 7.
Consider the matrices
In this example, we compare the HGZNN model with GZNN and GGNN, considering all three types of activation functions. The gain parameter of the model is , the initial state , and the final time is .
The elementwise trajectories of the state variable are shown with red lines in Figure 14a–c for the linear, power-sigmoid and smooth power-sigmoid activation functions, respectively. The solid red lines corresponding to HGZNN converge to the black dashed lines of the theoretical solution X. The trajectories indicate the usual convergence behavior, so the system is globally asymptotically stable. The error matrices of the HGZNN, GZNN and GGNN models for both linear and nonlinear activation functions are shown in Figure 15a–c, and the residual matrices of the models for linear and nonlinear activation functions are shown in Figure 16a–c. In each graph, for both error cases, the Frobenius norm of the error of the HGZNN formula is similar to that of the GZNN model, and both converge to zero faster than that of the GGNN model.
Figure 14.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. Elementwise convergence trajectories of the HGZNN network in Example 7.
Figure 15.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. of HGZNN compared to GGNN and GZNN in Example 7.
Figure 16.
(a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. Frobenius norm of error matrix of HGZNN compared to GGNN and GZNN in Example 7.
Remark 4.
In this remark, we analyze the answer to the question, “how are the system parameters selected to obtain better performance?” The answer is complex and consists of several parts.
- 1.
- The gain parameter γ is the parameter with the most influence on the behavior of the observed dynamic systems. The general rule is “the parameter γ should be selected as large as possible”. The numerical confirmation of this fact is investigated in Figure 7.
- 2.
- The influence of γ and the AFs is indisputable. The larger the value of γ, the faster the convergence. Moreover, nonlinear AFs accelerate convergence compared to the linear model. In the presented numerical examples, we investigate the influence of three AFs: linear, power-sigmoid and smooth power-sigmoid.
- 3.
- The right question is as follows: what makes the GGNN better than the GNN under fair conditions that assume an identical environment during testing? Numerical experiments show better performance of the GGNN design compared to the GNN with respect to all three tested criteria: , and . Moreover, Table 2 in Example 5 is aimed at convergence analysis. The general conclusion from the numerical data arranged in Table 2 is that the GGNN model is more efficient compared to the GNN in rank-deficient test matrices of larger order .
- 4.
- The convergence rate of the linear hybrid model depends on γ and the singular value , while the convergence rate of the hybrid model depends on γ and .
- 5.
- The convergence of the linear regularized hybrid model depends on γ, and the regularization parameter , while the convergence of the linear regularized hybrid model depends on γ, and λ.
In conclusion, it is reasonable to analyze the selection of the system parameters in order to obtain better performance, but a universally best choice of parameters cannot be prescribed.
7. Conclusions
We show that the error functions which form the basis of GNN and ZNN dynamical evolutions can be defined using the gradient of the Frobenius norm of the traditional error function . Such a strategy uses the error function as the basis of the GNN dynamics, which yields the proposed GGNN model. The results related to the GNN model (called GNN) for solving the general matrix equation are extended in the GGNN model (called GGNN) in both the theoretical and computational directions. In a theoretical sense, the convergence of the defined GGNN model is considered. It is shown that the neural state matrix of the GGNN model asymptotically converges to the solution of the matrix equation for an arbitrary initial state matrix and coincides with the general solution of the linear matrix equation. A number of applications of GNN(A, B, D) are considered, and all of them are globally convergent. Several particular appearances of the general matrix equation are observed and applied to the computation of various classes of generalized inverses. Illustrative numerical examples and simulation results, obtained using the MATLAB Simulink implementation, are presented to demonstrate the validity of the derived theoretical results. The influence of various nonlinear activations on the GNN models is considered in both the theoretical and computational directions. From the presented examples, it can be concluded that the GGNN model is faster and has a smaller error compared to the GNN model.
Further research can be oriented to the definition of finite-time convergent GGNN or GZNN models, as well as the definition of a noise-tolerant GGNN or GZNN design.
Author Contributions
Conceptualization, P.S.S. and G.V.M.; methodology, P.S.S., N.T., D.G. and V.S.; software, D.G., V.L.K. and N.T.; validation, G.V.M., M.J.P. and P.S.S.; formal analysis, M.J.P., N.T. and D.G.; investigation, M.J.P., G.V.M. and P.S.S.; resources, D.G., N.T., V.L.K. and V.S.; data curation, M.J.P., V.L.K., V.S., D.G. and N.T.; writing—original draft preparation, P.S.S., D.G. and N.T.; writing—review and editing, M.J.P. and G.V.M.; visualization, D.G. and N.T.; supervision, G.V.M.; project administration, M.J.P.; funding acquisition, G.V.M., M.J.P. and P.S.S. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the Ministry of Science and Higher Education of the Russian Federation (Grant No. 075-15-2022-1121).
Data Availability Statement
The data presented in this study are available on request.
Acknowledgments
Predrag Stanimirović is supported by the Science Fund of the Republic of Serbia (No. 7750185, Quantitative Automata Models: Fundamental Problems and Applications—QUAM). Dimitrios Gerontitis receives financial support from the “Savas Parastatidis” named scholarship provided by the Bodossaki Foundation. Milena J. Petrović acknowledges support from the Ministry of Education and Science of the Republic of Serbia, Grant No. 174025.
Conflicts of Interest
The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of the data; in the writing of the manuscript, or in the decision to publish the results.
References
- Zhang, Y.; Chen, K. Comparison on Zhang neural network and gradient neural network for time-varying linear matrix equation AXB = C solving. In Proceedings of the 2008 IEEE International Conference on Industrial Technology, Chengdu, China, 21–24 April 2008; pp. 1–6. [Google Scholar] [CrossRef]
- Zhang, Y.; Yi, C.; Guo, D.; Zheng, J. Comparison on Zhang neural dynamics and gradient-based neural dynamics for online solution of nonlinear time-varying equation. Neural Comput. Appl. 2011, 20, 1–7. [Google Scholar] [CrossRef]
- Zhang, Y.; Xu, P.; Tan, L. Further studies on Zhang neural-dynamics and gradient dynamics for online nonlinear equations solving. In Proceedings of the 2009 IEEE International Conference on Automation and Logistics, Shenyang, China, 5–7 August 2009; pp. 566–571. [Google Scholar] [CrossRef]
- Ben-Israel, A.; Greville, T.N.E. Generalized Inverses: Theory and Applications, 2nd ed.; CMS Books in Mathematics; Springer: New York, NY, USA, 2003. [Google Scholar]
- Wang, G.; Wei, Y.; Qiao, S. Generalized Inverses: Theory and Computations; Science Press, Springer: Beijing, China, 2018. [Google Scholar]
- Dash, P.; Zohora, F.T.; Rahaman, M.; Hasan, M.M.; Arifuzzaman, M. Usage of Mathematics Tools with Example in Electrical and Electronic Engineering. Am. Sci. Res. J. Eng. Technol. Sci. (ASRJETS) 2018, 46, 178–188. [Google Scholar]
- Qin, F.; Lee, J. Dynamic methods for missing value estimation for DNA sequences. In Proceedings of the 2010 International Conference on Computational and Information Sciences, IEEE, Chengdu, China, 9–11 July 2010; pp. 442–445. [Google Scholar] [CrossRef]
- Soleimani, F.; Stanimirović, P.S.; Soleimani, F. Some matrix iterations for computing generalized inverses and balancing chemical equations. Algorithms 2015, 8, 982–998. [Google Scholar] [CrossRef]
- Udawat, B.; Begani, J.; Mansinghka, M.; Bhatia, N.; Sharma, H.; Hadap, A. Gauss Jordan method for balancing chemical equation for different materials. Mater. Today Proc. 2022, 51, 451–454. [Google Scholar] [CrossRef]
- Doty, K.L.; Melchiorri, C.; Bonivento, C. A theory of generalized inverses applied to robotics. Int. J. Robot. Res. 1993, 12, 1–19. [Google Scholar] [CrossRef]
- Li, L.; Hu, J. An efficient second-order neural network model for computing the Moore–Penrose inverse of matrices. IET Signal Process. 2022, 16, 1106–1117. [Google Scholar] [CrossRef]
- Wang, X.; Tang, B.; Gao, X.G.; Wu, W.H. Finite iterative algorithms for the generalized reflexive and anti-reflexive solutions of the linear matrix equation AXB = C. Filomat 2017, 31, 2151–2162. [Google Scholar] [CrossRef]
- Ding, F.; Chen, T. Gradient based iterative algorithms for solving a class of matrix equations. IEEE Trans. Autom. Control 2005, 50, 1216–1221. [Google Scholar] [CrossRef]
- Ding, F.; Zhang, H. Gradient-based iterative algorithm for a class of the coupled matrix equations related to control systems. IET Control Theory Appl. 2014, 8, 1588–1595. [Google Scholar] [CrossRef]
- Zhang, H. Quasi gradient-based inversion-free iterative algorithm for solving a class of the nonlinear matrix equations. Comput. Math. Appl. 2019, 77, 1233–1244. [Google Scholar] [CrossRef]
- Wang, J. Recurrent neural networks for computing pseudoinverses of rank-deficient matrices. SIAM J. Sci. Comput. 1997, 18, 1479–1493. [Google Scholar] [CrossRef]
- Fa-Long, L.; Zheng, B. Neural network approach to computing matrix inversion. Appl. Math. Comput. 1992, 47, 109–120. [Google Scholar] [CrossRef]
- Wang, J. A recurrent neural network for real-time matrix inversion. Appl. Math. Comput. 1993, 55, 89–100. [Google Scholar] [CrossRef]
- Wang, J. Recurrent neural networks for solving linear matrix equations. Comput. Math. Appl. 1993, 26, 23–34. [Google Scholar] [CrossRef]
- Wei, Y. Recurrent neural networks for computing weighted Moore–Penrose inverse. Appl. Math. Comput. 2000, 116, 279–287. [Google Scholar] [CrossRef]
- Xiao, L.; Zhang, Y.; Li, K.; Liao, B.; Tan, Z. A novel recurrent neural network and its finite-time solution to time-varying complex matrix inversion. Neurocomputing 2019, 331, 483–492. [Google Scholar] [CrossRef]
- Yi, C.; Chen, Y.; Lu, Z. Improved gradient-based neural networks for online solution of Lyapunov matrix equation. Inf. Process. Lett. 2011, 111, 780–786. [Google Scholar] [CrossRef]
- Yi, C.; Qiao, D. Improved neural solution for the Lyapunov matrix equation based on gradient search. Inf. Process. Lett. 2013, 113, 876–881. [Google Scholar]
- Xiao, L.; Li, K.; Tan, Z.; Zhang, Z.; Liao, B.; Chen, K.; Jin, L.; Li, S. Nonlinear gradient neural network for solving system of linear equations. Inf. Process. Lett. 2019, 142, 35–40. [Google Scholar] [CrossRef]
- Xiao, L. A finite-time convergent neural dynamics for online solution of time-varying linear complex matrix equation. Neurocomputing 2015, 167, 254–259. [Google Scholar] [CrossRef]
- Lv, X.; Xiao, L.; Tan, Z.; Yang, Z.; Yuan, J. Improved Gradient Neural Networks for solving Moore–Penrose Inverse of full-rank matrix. Neural Process. Lett. 2019, 50, 1993–2005. [Google Scholar] [CrossRef]
- Wang, J. Electronic realisation of recurrent neural network for solving simultaneous linear equations. Electron. Lett. 1992, 28, 493–495. [Google Scholar] [CrossRef]
- Zhang, Y.; Chen, K.; Tan, H.Z. Performance analysis of gradient neural network exploited for online time-varying matrix inversion. IEEE Trans. Autom. Control. 2009, 54, 1940–1945. [Google Scholar] [CrossRef]
- Wang, J.; Li, H. Solving simultaneous linear equations using recurrent neural networks. Inf. Sci. 1994, 76, 255–277. [Google Scholar] [CrossRef]
- Tan, Z.; Chen, H. Nonlinear function activated GNN versus ZNN for online solution of general linear matrix equations. J. Frankl. Inst. 2023, 360, 7021–7036. [Google Scholar] [CrossRef]
- Tan, Z.; Hu, Y.; Chen, K. On the investigation of activation functions in gradient neural network for online solving linear matrix equation. Neurocomputing 2020, 413, 185–192. [Google Scholar] [CrossRef]
- Tan, Z. Fixed-time convergent gradient neural network for solving online sylvester equation. Mathematics 2022, 10, 3090. [Google Scholar] [CrossRef]
- Wang, D.; Liu, X.-W. A gradient-type noise-tolerant finite-time neural network for convex optimization. Neurocomputing 2022, 49, 647–656. [Google Scholar] [CrossRef]
- Stanimirović, P.S.; Petković, M.D. Gradient neural dynamics for solving matrix equations and their applications. Neurocomputing 2018, 306, 200–212. [Google Scholar] [CrossRef]
- Stanimirović, P.S.; Katsikis, V.N.; Li, S. Hybrid GNN-ZNN models for solving linear matrix equations. Neurocomputing 2018, 316, 124–134. [Google Scholar] [CrossRef]
- Sowmya, G.; Thangavel, P.; Shankar, V. A novel hybrid Zhang neural network model for time-varying matrix inversion. Eng. Sci. Technol. Int. J. 2022, 26, 101009. [Google Scholar] [CrossRef]
- Wu, W.; Zheng, B. Improved recurrent neural networks for solving Moore–Penrose inverse of real-time full-rank matrix. Neurocomputing 2020, 418, 221–231. [Google Scholar] [CrossRef]
- Zhang, Y.; Wang, C. Gradient-Zhang neural network solving linear time-varying equations. In Proceedings of the 2022 IEEE 17th Conference on Industrial Electronics and Applications (ICIEA), Chengdu, China, 16–19 December 2022; pp. 396–403. [Google Scholar] [CrossRef]
- Wang, C.; Zhang, Y. Theoretical Analysis of Gradient-Zhang Neural Network for Time-Varying Equations and Improved Method for Linear Equations. In Neural Information Processing; ICONIP 2023, Lecture Notes in Computer Science; Luo, B., Cheng, L., Wu, Z.G., Li, H., Li, C., Eds.; Springer: Singapore, 2024; Volume 14447. [Google Scholar] [CrossRef]
- Stanimirović, P.S.; Mourtas, S.D.; Katsikis, V.N.; Kazakovtsev, L.A.; Krutikov, V.N. Recurrent neural network models based on optimization methods. Mathematics 2022, 10, 4292. [Google Scholar] [CrossRef]
- Nocedal, J.; Wright, S. Numerical Optimization; Springer: New York, NY, USA, 1999. [Google Scholar]
- Stanimirović, P.S.; Petković, M.D.; Gerontitis, D. Gradient neural network with nonlinear activation for computing inner inverses and the Drazin inverse. Neural Process. Lett. 2017, 48, 109–133. [Google Scholar] [CrossRef]
- Smoktunowicz, A.; Smoktunowicz, A. Set-theoretic solutions of the Yang–Baxter equation and new classes of R-matrices. Linear Algebra Its Appl. 2018, 546, 86–114. [Google Scholar] [CrossRef]
- Baksalary, O.M.; Trenkler, G. On matrices whose Moore–Penrose inverse is idempotent. Linear Multilinear Algebra 2022, 70, 2014–2026. [Google Scholar] [CrossRef]
- Wang, X.Z.; Ma, H.; Stanimirović, P.S. Nonlinearly activated recurrent neural network for computing the Drazin inverse. Neural Process. Lett. 2017, 46, 195–217. [Google Scholar] [CrossRef]
- Wang, S.D.; Kuo, T.S.; Hsu, C.F. Trace bounds on the solution of the algebraic matrix Riccati and Lyapunov equation. IEEE Trans. Autom. Control 1986, 31, 654–656. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).