Article

Symmetrized Neural Network Operators in Fractional Calculus: Caputo Derivatives, Asymptotic Analysis, and the Voronovskaya–Santos–Sales Theorem

by
Rômulo Damasclin Chaves dos Santos
,
Jorge Henrique de Oliveira Sales
*,† and
Gislan Silveira Santos
*,†
Department of Exact Sciences, Postgraduate Program in Computational Modeling Santa Cruz State University, Ilhéus 45662-900, Brazil
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Axioms 2025, 14(7), 510; https://doi.org/10.3390/axioms14070510
Submission received: 6 May 2025 / Revised: 24 June 2025 / Accepted: 26 June 2025 / Published: 30 June 2025
(This article belongs to the Special Issue Advances in Fuzzy Logic and Computational Intelligence)

Abstract

This work presents a comprehensive mathematical framework for symmetrized neural network operators operating under the paradigm of fractional calculus. By introducing a perturbed hyperbolic tangent activation, we construct a family of localized, symmetric, and positive kernel-like densities, which form the analytical backbone for three classes of multivariate operators: quasi-interpolation, Kantorovich-type, and quadrature-type. A central theoretical contribution is the derivation of the Voronovskaya–Santos–Sales Theorem, which extends classical asymptotic expansions to the fractional domain, providing rigorous error bounds and normalized remainder terms governed by Caputo derivatives. The operators exhibit key properties such as partition of unity, exponential decay, and scaling invariance, which are essential for stable and accurate approximations in high-dimensional settings and systems governed by nonlocal dynamics. The theoretical framework is thoroughly validated through applications in signal processing and fractional fluid dynamics, including the formulation of nonlocal viscous models and fractional Navier–Stokes equations with memory effects. Numerical experiments demonstrate a relative error reduction of up to 92.5% when compared to classical quasi-interpolation operators, with observed convergence rates reaching $O(n^{-1.5})$ under Caputo derivatives, using parameters $\lambda = 3.5$, $q = 1.8$, and $n = 100$. This synergy between neural operator theory, asymptotic analysis, and fractional calculus not only advances the theoretical landscape of function approximation but also provides practical computational tools for addressing complex physical systems characterized by long-range interactions and anomalous diffusion.

1. Introduction

Neural network operators have become indispensable tools in approximation theory and computational mathematics, primarily due to their exceptional capacity to model highly nonlinear functions. This capability underlies a broad spectrum of applications, ranging from the numerical solution of partial differential equations to signal processing and data-driven modeling [1,2]. In the context of fractional calculus and its applications to differential equations, several studies have contributed to the development of efficient computational methods and theoretical frameworks. Refs. [3,4] presented an efficient computational method for differential equations of fractional type, highlighting the importance of numerical techniques in solving complex fractional differential equations. Additionally, [5] explored deep learning architectures, providing insights into the integration of neural networks in solving differential equations and modeling complex systems.
Recent advancements in the field of fractional calculus and its applications to neural networks have been significantly shaped by the contributions of [6,7,8]. Chen et al. [6] conducted an extensive exploration of fractional derivative modeling in mechanics and engineering, establishing a robust mathematical framework. This framework has been crucial in understanding complex systems governed by fractional differential equations. Their work laid the essential groundwork for integrating fractional calculus into neural network architectures, thereby facilitating the modeling of systems characterized by memory effects and long-range dependencies.
Building upon this foundation, [9] introduced parametrized, deformed, and general neural networks, which have been instrumental in advancing the theoretical landscape of function approximation. These neural networks, known for their flexibility and enhanced approximation capabilities, have played a crucial role in developing the symmetrized neural network operators discussed in the current study. By integrating these neural networks into the framework of fractional calculus, we have not only advanced theoretical understanding but also provided practical computational tools for addressing complex physical systems.
A significant contribution of the present work is the Voronovskaya–Santos–Sales Theorem, which extends classical asymptotic expansions to the fractional domain. This theorem provides rigorous error bounds and normalized remainder terms governed by Caputo derivatives. Combined with insights from [6,9], this theorem has enabled the development of neural network operators that exhibit superior convergence rates and improved analytical tractability. The practical efficacy of these operators has been demonstrated through applications in signal processing and fractional fluid dynamics, showcasing a relative error reduction of up to 92.5 % compared to classical quasi-interpolation operators. The observed convergence rates under Caputo derivatives further validate the robustness and accuracy of the proposed methods. These advancements highlight the critical role of fractional calculus in enhancing the capabilities of neural networks. They pave the way for future research in modeling complex systems with memory effects and nonlocal interactions, marking a significant step forward in both theoretical and applied domains. Among several formulations of fractional calculus, the Caputo derivative stands out for its well-posed initial conditions, and its physically intuitive interpretation can be seen in [10].
The work of [11] provides a comprehensive exploration of fractional calculus, offering a robust theoretical framework that has significantly influenced the development of fractional-order models in several scientific and engineering disciplines. This work has been instrumental in advancing the understanding and application of fractional differential equations, particularly in modeling complex systems that exhibit memory effects and nonlocal interactions. Building on this theoretical framework, ref. [12] introduced innovative numerical methods for solving fractional advection–dispersion equations, crucial for the accurate and efficient modeling of physical phenomena governed by fractional dynamics. His work on the finite difference/finite element method, combined with fast evaluation techniques for Caputo derivatives, paved the way for practical computational tools that improve the simulation of complex physical systems. More recently, other works have introduced fractional operators, which have been integrated into neural architectures, leveraging their inherent nonlocality to capture long-range dependencies and significantly improve approximation capabilities [13,14].
The intersection of neural operator theory with fractional calculus has driven the extension of classical approximation results into the fractional domain [6]. In particular, Voronovskaya-type asymptotic expansions have been generalized to neural network operators defined over unbounded domains, providing sharp error bounds within the framework of fractional differentiation [15,16,17]. These developments demonstrate that combining activation symmetry, parametrization, and fractional calculus yields operators with superior convergence rates and improved analytical tractability. Concurrently, fractional calculus has emerged as a powerful mathematical tool for modeling systems governed by memory effects, hereditary dynamics, and anomalous diffusion [18,19].
Building on these advances, this paper presents a comprehensive asymptotic framework for symmetrized neural network operators employing deformed hyperbolic tangent activations, with a particular focus on their behavior under Caputo fractional derivatives. Specifically, we introduce and rigorously analyze three families of multivariate operators, quasi-interpolation, Kantorovich-type, and quadrature-type, and formulate a fractional Voronovskaya-type theorem that establishes precise asymptotic expansions and error estimates [20,21,22].
The work of [23] has significantly advanced the application of wavelet transforms in various scientific and engineering fields. This work provides a comprehensive overview of how wavelet transforms can be utilized to analyze and process signals and data with high efficiency and accuracy. Wavelet transforms are particularly useful in capturing both frequency and location information, making them invaluable in the analysis of complex systems and phenomena. In parallel, Panton’s [24] seminal work on Incompressible Flow has laid the groundwork for understanding the dynamics of fluid flows that are fundamental in many physical and engineering applications. This work is crucial for developing models and simulations of fluid dynamics, particularly in scenarios where the effects of compressibility can be neglected, such as in many aerodynamic and hydrodynamic applications.
Building on these foundations, ref. [25] explored Hypercomplex Dynamics and Turbulent Flows in Sobolev and Besov Functional Spaces, advancing the theoretical understanding of turbulent flows through the lens of functional analysis. Their work integrates sophisticated mathematical frameworks to model and analyze turbulent flows, which are inherently complex and chaotic. This research is pivotal in bridging the gap between theoretical mathematical constructs and practical applications in fluid dynamics.
In the context of the present study, these contributions have been instrumental in the development of advanced neural network operators capable of handling complex dynamics and turbulent flows. The integration of wavelet transforms and advanced functional analysis techniques has allowed the creation of robust models capable of capturing the complex behaviors of physical systems. The Voronovskaya–Santos–Sales Theorem, which extends classical asymptotic expansions to the fractional domain, provides rigorous error bounds and normalized remainder terms governed by Caputo derivatives, resulting in the advancement of operators that exhibit superior convergence rates and improved analytical tractability.
The main contributions of this paper are as follows:
1.
Mathematical Foundations: We begin by establishing the mathematical framework for symmetrized activation functions. This includes the definition of the deformed hyperbolic tangent activation, the construction of the associated density functions, and a detailed examination of their properties, including positivity, symmetry, and decay behavior.
2.
Neural Network Operators: We formally define three classes of multivariate neural network operators: quasi-interpolation, Kantorovich-type, and quadrature-type. These operators serve as foundational tools in approximation theory, enabling the accurate approximation of continuous functions based on discrete samples and integral formulations.
3.
Asymptotic Expansions: We derive rigorous asymptotic expansions for each class of operator, with a focus on quantifying approximation errors and establishing convergence rates. The analysis includes detailed Taylor expansions, explicit expressions for remainder terms, and scaling behaviors as the discretization parameter increases.
4.
Voronovskaya–Santos–Sales Theorem: We introduce and prove the Voronovskaya–Santos–Sales Theorem, providing sharp asymptotic error estimates for symmetrized neural network operators under Caputo fractional differentiation. This result represents a significant advancement in the intersection of fractional calculus and neural network approximation theory.
5.
Applications: Finally, we present numerical experiments and illustrative applications, including problems from signal processing and fluid dynamics. These examples validate the theoretical framework and highlight the practical effectiveness of the proposed operators in real-world scenarios.

Methodology

The methodological framework adopted in this work integrates advanced tools from approximation theory, fractional calculus, and neural operator design. The proposed approach unfolds through a structured pipeline that begins with the mathematical formulation of symmetrized activation functions and progresses through the development of three families of neural network operators: quasi-interpolation, Kantorovich-type, and quadrature-type. Subsequently, a rigorous asymptotic analysis is performed, incorporating fractional differentiation via Caputo derivatives, which culminates in the formal proof of the Voronovskaya–Santos–Sales Theorem—a cornerstone result that extends classical approximation theory into the fractional domain. This theoretical foundation is validated through comprehensive numerical experiments, including applications in nonlocal viscous models and fractional fluid dynamics. The entire workflow is summarized in the pipeline depicted in Figure 1.
The work is organized as follows: Section 2 introduces the mathematical framework of symmetrized activation functions and their associated density functions, establishing the theoretical basis for the neural network operators. Section 3 defines the multivariate quasi-interpolation neural network operator and discusses its key properties, such as linearity and approximation capability. In Section 4, we present the main asymptotic results, including the fractional Voronovskaya–Santos–Sales expansion, highlighting their implications for fractional calculus in neural networks. Section 5 analyzes the Taylor expansion and error behavior of the operators, essential for accuracy assessment. Section 6, Section 7 and Section 8 focus on Voronovskaya-type expansions and refined error estimates, both in general and in the special case of m = 1 . Section 9 and Section 10 extend the analysis to the cases m = 2 and m = 3 , respectively. Section 11 presents the Generalized Voronovskaya Theorem for Kantorovich-type neural operators. Section 12, Section 13 and Section 14 extend Kantorovich operators to multivariate and high-dimensional settings, discussing convergence in deep learning frameworks, and their robustness under fractional perturbations is discussed in Section 15. Section 16 further generalizes the expansions to fractional functions, while Section 17 investigates convergence via a symmetrized density approach. The Voronovskaya–Santos–Sales Theorem is detailed in Section 18, providing precise error and convergence insights. Practical applications are simulated via Python in Section 19, with examples from signal processing and fluid dynamics. Section 20 and Section 21 explore the application of the proposed operators to fractional Navier–Stokes equations with Caputo derivatives, including the modeling of complex fluids. Section 22 summarizes the main theoretical contributions and their impact on machine learning, functional analysis, and numerical methods. Finally, Section 23 concludes with perspectives on future research directions and applications in fractional modeling and scientific computing.

2. Mathematical Foundations: Symmetrized Activation Functions

To establish a robust theoretical framework for symmetrized activation functions, we begin by defining the perturbed hyperbolic tangent activation function:
$$g_{q,\lambda}(x) := \frac{e^{\lambda x} - q\, e^{-\lambda x}}{e^{\lambda x} + q\, e^{-\lambda x}}, \qquad \lambda, q > 0, \; x \in \mathbb{R},$$
where $\lambda$ is a scaling parameter that controls the steepness of the function and $q$ is the deformation coefficient, which introduces asymmetry. This function generalizes the standard hyperbolic tangent function, which is recovered when $q = 1$. Notably, $g_{q,\lambda}$ satisfies the reflection identity $g_{q,\lambda}(-x) = -g_{1/q,\lambda}(x)$, which reduces to the odd symmetry of the standard hyperbolic tangent when $q = 1$.
Next, we construct the density function:
$$M_{q,\lambda}(x) := \tfrac{1}{4}\left[ g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1) \right], \qquad x \in \mathbb{R}, \; q, \lambda > 0.$$
This density function is carefully designed to ensure both positivity and smoothness. To confirm positivity, we compute the derivative of $g_{q,\lambda}(x)$:
$$\frac{d}{dx} g_{q,\lambda}(x) = \frac{d}{dx}\left( \frac{e^{\lambda x} - q e^{-\lambda x}}{e^{\lambda x} + q e^{-\lambda x}} \right) = \frac{\lambda (e^{\lambda x} + q e^{-\lambda x})(e^{\lambda x} + q e^{-\lambda x}) - (e^{\lambda x} - q e^{-\lambda x})\, \lambda (e^{\lambda x} - q e^{-\lambda x})}{(e^{\lambda x} + q e^{-\lambda x})^{2}} = \frac{\lambda \left[ (e^{\lambda x} + q e^{-\lambda x})^{2} - (e^{\lambda x} - q e^{-\lambda x})^{2} \right]}{(e^{\lambda x} + q e^{-\lambda x})^{2}} = \frac{\lambda \left[ e^{2\lambda x} + 2q + q^{2} e^{-2\lambda x} - \left( e^{2\lambda x} - 2q + q^{2} e^{-2\lambda x} \right) \right]}{(e^{\lambda x} + q e^{-\lambda x})^{2}} = \frac{4 \lambda q}{(e^{\lambda x} + q e^{-\lambda x})^{2}}.$$
Since $\lambda, q > 0$, we have:
$$\frac{d}{dx} g_{q,\lambda}(x) = \frac{4 \lambda q}{(e^{\lambda x} + q e^{-\lambda x})^{2}} > 0, \qquad \forall x \in \mathbb{R}.$$
This shows that $g_{q,\lambda}(x)$ is strictly increasing. Consequently, $g_{q,\lambda}(x+1) > g_{q,\lambda}(x-1)$, ensuring that $M_{q,\lambda}(x) > 0$.
To introduce symmetry, we define the symmetrized function:
$$\Phi(x) := \frac{M_{q,\lambda}(x) + M_{1/q,\lambda}(x)}{2}.$$
To verify that $\Phi(x)$ is an even function, we first record the reflection property of the densities. Using the identity $g_{q,\lambda}(-x) = -g_{1/q,\lambda}(x)$, we obtain:
$$M_{q,\lambda}(-x) = \tfrac{1}{4}\left[ g_{q,\lambda}(-x+1) - g_{q,\lambda}(-x-1) \right] = \tfrac{1}{4}\left[ g_{q,\lambda}(-(x-1)) - g_{q,\lambda}(-(x+1)) \right] = \tfrac{1}{4}\left[ g_{1/q,\lambda}(x+1) - g_{1/q,\lambda}(x-1) \right] = M_{1/q,\lambda}(x).$$
Similarly, interchanging the roles of $q$ and $1/q$:
$$M_{1/q,\lambda}(-x) = \tfrac{1}{4}\left[ g_{1/q,\lambda}(-x+1) - g_{1/q,\lambda}(-x-1) \right] = \tfrac{1}{4}\left[ g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1) \right] = M_{q,\lambda}(x).$$
Thus,
$$\Phi(-x) = \frac{M_{q,\lambda}(-x) + M_{1/q,\lambda}(-x)}{2} = \frac{M_{1/q,\lambda}(x) + M_{q,\lambda}(x)}{2} = \Phi(x).$$
This confirms the symmetry of Φ ( x ) , making it a well-defined even function suitable for applications in approximation theory and neural network analysis. The proposed “symmetrization” is designed to enhance the efficiency of our multivariate neural networks by utilizing only half of the input data. This approach leverages the inherent symmetries within the data to reduce computational load while maintaining accuracy.
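These definitions are straightforward to verify numerically. The following Python sketch (an illustration only, using the hypothetical parameter choice $q = 1.8$, $\lambda = 3.5$ quoted in the abstract) implements $g_{q,\lambda}$, $M_{q,\lambda}$, and $\Phi$, and checks positivity, the reflection identity, the evenness of $\Phi$, and the partition of unity on a sample grid.

```python
import numpy as np

def g(x, q, lam):
    """Perturbed hyperbolic tangent g_{q,lambda}(x)."""
    return (np.exp(lam * x) - q * np.exp(-lam * x)) / (np.exp(lam * x) + q * np.exp(-lam * x))

def M(x, q, lam):
    """Density M_{q,lambda}(x) = (1/4)[g(x+1) - g(x-1)]."""
    return 0.25 * (g(x + 1, q, lam) - g(x - 1, q, lam))

def Phi(x, q, lam):
    """Symmetrized density Phi(x) = [M_{q,lambda}(x) + M_{1/q,lambda}(x)] / 2."""
    return 0.5 * (M(x, q, lam) + M(x, 1.0 / q, lam))

q, lam = 1.8, 3.5
x = np.linspace(-5, 5, 1001)

assert np.all(M(x, q, lam) > 0)                          # positivity of the density
assert np.allclose(g(-x, q, lam), -g(x, 1.0 / q, lam))   # reflection identity
assert np.allclose(Phi(-x, q, lam), Phi(x, q, lam))      # Phi is even

# Partition of unity: sum_k Phi(x - k) = 1 (lattice truncated to |k| <= 30).
k = np.arange(-30, 31)
assert np.allclose(Phi(x[:, None] - k[None, :], q, lam).sum(axis=1), 1.0)
print("all checks passed")
```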
To achieve this, we employ the following density function:
$$M_{q,\lambda}(x) := \tfrac{1}{4}\left[ g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1) \right] > 0,$$
for all $x \in \mathbb{R}$ and $q, \lambda > 0$. Additionally, we have the following symmetry properties:
$$M_{q,\lambda}(-x) = M_{1/q,\lambda}(x), \qquad \forall x \in \mathbb{R}, \; q, \lambda > 0,$$
and
$$M_{1/q,\lambda}(-x) = M_{q,\lambda}(x), \qquad \forall x \in \mathbb{R}, \; q, \lambda > 0.$$
Adding Equations (10) and (11), we obtain:
$$M_{q,\lambda}(-x) + M_{1/q,\lambda}(-x) = M_{q,\lambda}(x) + M_{1/q,\lambda}(x), \qquad \forall x \in \mathbb{R}, \; q, \lambda > 0,$$
which is an essential component of this work. Thus, we define:
$$\Phi(x) := \frac{M_{q,\lambda}(x) + M_{1/q,\lambda}(x)}{2},$$
which is an even function, symmetric with respect to the y-axis. According to the work of [9], we have:
$$M_{q,\lambda}\!\left( \frac{\ln q}{2\lambda} \right) = \frac{\tanh(\lambda)}{2},$$
and
$$M_{1/q,\lambda}\!\left( -\frac{\ln q}{2\lambda} \right) = \frac{\tanh(\lambda)}{2}, \qquad \lambda > 0,$$
symmetric points yielding the same maximum. Also from [9], we have the following results:
$$\sum_{i=-\infty}^{\infty} M_{q,\lambda}(x - i) = 1, \qquad \forall x \in \mathbb{R}, \; \lambda, q > 0,$$
and
$$\sum_{i=-\infty}^{\infty} M_{1/q,\lambda}(x - i) = 1, \qquad \forall x \in \mathbb{R}, \; \lambda, q > 0.$$
Consequently, we obtain the following result:
$$\sum_{i=-\infty}^{\infty} \Phi(x - i) = 1, \qquad \forall x \in \mathbb{R}.$$
Furthermore, we have:
$$\int_{-\infty}^{\infty} M_{q,\lambda}(x)\, dx = 1,$$
and
$$\int_{-\infty}^{\infty} M_{1/q,\lambda}(x)\, dx = 1,$$
so that
$$\int_{-\infty}^{\infty} \Phi(x)\, dx = 1.$$
Therefore, $\Phi(x)$ is an even probability density function, making it suitable for applications in approximation theory and neural network analysis.
According to the work of [9]: let $0 < \alpha < 1$ and $n \in \mathbb{N}$ such that $n^{1-\alpha} > 2$. Additionally, let $q, \lambda > 0$. Then, we have the following inequality:
$$\sum_{\substack{k=-\infty \\ |nx - k| \ge n^{1-\alpha}}}^{\infty} M_{q,\lambda}(nx - k) < 2\max\left\{ q, \tfrac{1}{q} \right\} e^{4\lambda}\, e^{-2\lambda n^{1-\alpha}} = T\, e^{-2\lambda n^{1-\alpha}},$$
where $T$ is defined as:
$$T := 2\max\left\{ q, \tfrac{1}{q} \right\} e^{4\lambda}.$$
To better understand this inequality, let us detail the steps involved:
Assume $M_{q,\lambda}(y)$ is a function that depends on the parameters $q$ and $\lambda$ and has certain properties that allow us to sum over all integers $k$.
The sum $\sum_{|nx - k| \ge n^{1-\alpha}} M_{q,\lambda}(nx - k)$ involves the function $M_{q,\lambda}$ evaluated at the points $nx - k$ for all integers $k$ lying in the tail region $|nx - k| \ge n^{1-\alpha}$. The inequality tells us that this tail sum is upper-bounded by an expression involving $q$, $\lambda$, and $n$.
The term $2\max\{q, 1/q\}$ ensures that we are considering the larger value between $q$ and $1/q$, multiplied by 2. The term $e^{4\lambda}$ is a constant that depends only on $\lambda$. The term $e^{-2\lambda n^{1-\alpha}}$ decays exponentially as $n$ increases, provided that $n^{1-\alpha} > 2$. $T$ is a constant that aggregates the factors $2\max\{q, 1/q\}$ and $e^{4\lambda}$. The inequality shows that the tail sum is bounded by $T\, e^{-2\lambda n^{1-\alpha}}$, where $T$ is a well-defined constant.
This demonstration provides a clear understanding of how the tail of the infinite sum of $M_{q,\lambda}(nx - k)$ is bounded by an expression that decays exponentially with $n$, as long as $n$ is sufficiently large.
Similarly, by considering the function $M_{1/q,\lambda}$, we obtain:
$$\sum_{\substack{k=-\infty \\ |nx - k| \ge n^{1-\alpha}}}^{\infty} M_{1/q,\lambda}(nx - k) < T\, e^{-2\lambda n^{1-\alpha}}.$$
This result follows from the symmetry in the definition of $M_{q,\lambda}$ and $M_{1/q,\lambda}$, where the roles of $q$ and $1/q$ are interchanged.
Next, we examine the tail region itself. Given that $n^{1-\alpha} > 2$, the indices $k$ contributing to the restricted sums above satisfy:
$$|nx - k| \ge n^{1-\alpha}.$$
This holds by the definition of the tail region: the quantities $|nx - k|$ appearing in the restricted sums are, by construction, bounded below by $n^{1-\alpha}$.
Finally, we consider the function $\Phi(nx - k)$, which inherits the decay properties of $M_{q,\lambda}$ and $M_{1/q,\lambda}$. Thus, we obtain:
$$\sum_{\substack{k=-\infty \\ |nx - k| \ge n^{1-\alpha}}}^{\infty} \Phi(nx - k) < T\, e^{-2\lambda n^{1-\alpha}},$$
where $T$ is the same constant defined earlier. This inequality shows that the tail sum of $\Phi(nx - k)$ is also bounded by an exponentially decaying term, ensuring that the kernel decays rapidly as $n$ increases.
In summary, we have shown that the tail sums of both $M_{q,\lambda}$ and $M_{1/q,\lambda}$ are bounded by $T\, e^{-2\lambda n^{1-\alpha}}$, that the quantities $|nx - k|$ in the tail region are bounded below by $n^{1-\alpha}$, and that $\Phi(nx - k)$ exhibits the same exponential decay behavior.
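A quick numerical sanity check of this exponential tail bound is sketched below (same hypothetical parameters as in the previous sketch; the evaluation point, the truncation radius of the lattice, and $\alpha = 0.5$ are arbitrary illustrative choices).

```python
import numpy as np

def g(x, q, lam):
    return (np.exp(lam * x) - q * np.exp(-lam * x)) / (np.exp(lam * x) + q * np.exp(-lam * x))

def M(x, q, lam):
    return 0.25 * (g(x + 1, q, lam) - g(x - 1, q, lam))

def Phi(x, q, lam):
    return 0.5 * (M(x, q, lam) + M(x, 1.0 / q, lam))

q, lam, alpha = 1.8, 3.5, 0.5
x = 0.3                                  # arbitrary evaluation point
T = 2 * max(q, 1 / q) * np.exp(4 * lam)

for n in [10, 20, 40, 80]:
    cutoff = n ** (1 - alpha)
    # Truncation radius 150 keeps the exponentials within double-precision range.
    k = np.arange(int(n * x) - 150, int(n * x) + 151)
    tail = np.abs(n * x - k) >= cutoff
    tail_sum = Phi(n * x - k[tail], q, lam).sum()
    bound = T * np.exp(-2 * lam * cutoff)
    print(f"n={n:3d}  tail sum={tail_sum:.3e}  bound={bound:.3e}")
```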
Remark 1.
We define the function
$$Z(x_1, \ldots, x_N) := Z(x) := \prod_{i=1}^{N} \Phi(x_i), \qquad x = (x_1, \ldots, x_N) \in \mathbb{R}^N, \; N \in \mathbb{N},$$
which satisfies the following properties:
(i) 
Positivity:
$$Z(x) > 0, \qquad \forall x \in \mathbb{R}^N.$$
This property ensures that the function $Z(x)$ is strictly positive for any vector $x$ in $\mathbb{R}^N$. It holds because each factor $\Phi(x_i)$ is a positive function, and the product of positive functions is always positive.
(ii) 
Partition of unity:
$$\sum_{k \in \mathbb{Z}^N} Z(x - k) := \sum_{k_1 = -\infty}^{\infty} \cdots \sum_{k_N = -\infty}^{\infty} Z(x_1 - k_1, \ldots, x_N - k_N) = 1, \qquad \forall x \in \mathbb{R}^N.$$
This property indicates that the sum of the integer translations of Z ( x ) uniformly covers the space R N . In other words, for any point x in R N , the sum of the functions Z translated by all integer vectors k results in 1. This is an important characteristic in many applications, such as in Fourier analysis and approximation theory.
(iii) 
Scaled partition:
$$\sum_{k \in \mathbb{Z}^N} Z(nx - k) = 1, \qquad \forall x \in \mathbb{R}^N, \; n \in \mathbb{N}.$$
This property is a generalization of the partition of unity. Here, the function Z is scaled by a factor n, and the sum of the integer translations of the scaled function still results in 1. This shows that the partition of unity property is invariant under scaling.
(iv) 
Normalization:
$$\int_{\mathbb{R}^N} Z(x)\, dx = 1,$$
i.e., Z is a multivariate probability density function. The integral of Z ( x ) over the entire space R N is equal to 1, which means that Z ( x ) can be interpreted as a probability distribution.
We denote the max-norm by:
$$\| x \|_\infty := \max\{ |x_1|, \ldots, |x_N| \}, \qquad x \in \mathbb{R}^N,$$
and adopt the notations $\infty := (\infty, \ldots, \infty)$ and $-\infty := (-\infty, \ldots, -\infty)$ in the multivariate context.
(v) 
Exponential decay (from (i)):
$$\sum_{\substack{k \in \mathbb{Z}^N \\ \| nx - k \|_\infty > n^{1-\beta}}} Z(nx - k) \le T\, e^{-2\lambda n^{1-\beta}},$$
where:
$$T := 2\max\left\{ q, \tfrac{1}{q} \right\} e^{4\lambda}, \qquad 0 < \beta < 1, \; n \in \mathbb{N}, \; n^{1-\beta} > 2, \; x \in \mathbb{R}^N.$$
This property describes the exponential decay of the tail of the sum of integer translations of the scaled function Z: the portion of the sum taken over indices far from $nx$ is negligible. The constant $T$ depends on the parameters $q$ and $\lambda$, and the exponential term $e^{-2\lambda n^{1-\beta}}$ ensures that this tail decays rapidly as $n$ increases. This is crucial for ensuring the convergence of series and integrals involving $Z(x)$.
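The product structure of $Z$ makes the multivariate properties inherit directly from the one-dimensional ones. The sketch below (illustrative choices: $N = 2$, a single arbitrary point, and a truncated lattice window) evaluates the scaled partition of unity (iii) numerically.

```python
import numpy as np
from itertools import product

def g(x, q, lam):
    return (np.exp(lam * x) - q * np.exp(-lam * x)) / (np.exp(lam * x) + q * np.exp(-lam * x))

def Phi(x, q, lam):
    M = lambda y, p: 0.25 * (g(y + 1, p, lam) - g(y - 1, p, lam))
    return 0.5 * (M(x, q) + M(x, 1.0 / q))

def Z(x, q, lam):
    """Multivariate kernel Z(x) = prod_i Phi(x_i)."""
    return np.prod([Phi(xi, q, lam) for xi in x])

q, lam, n = 1.8, 3.5, 10
x = np.array([0.37, -1.25])              # arbitrary point in R^2
center = np.round(n * x).astype(int)

total = 0.0
for offset in product(range(-25, 26), repeat=2):
    k = center + np.array(offset)
    total += Z(n * x - k, q, lam)
print("sum_k Z(nx - k) =", total)        # should be very close to 1
```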
Theorem 1.
Let $0 < \beta < 1$ and $n \in \mathbb{N}$ such that $n^{1-\beta} > 2$. Then, the following estimate holds:
k Z N q , 1 q e 2 λ e λ ( n 1 β 1 ) T e 2 λ n 1 β ,
where:
$$T := 2\max\left\{ q, \tfrac{1}{q} \right\} e^{4\lambda}.$$
Proof. 
Let $f \in C^m(\mathbb{R}^N)$, with $m, N \in \mathbb{N}$. For a multi-index $\alpha := (\alpha_1, \ldots, \alpha_N) \in \mathbb{Z}_+^N$, we denote:
$$f_\alpha := \frac{\partial^{|\alpha|} f}{\partial x^\alpha} = \frac{\partial^{\alpha_1 + \cdots + \alpha_N} f}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}},$$
as the partial derivative of order $|\alpha| := \sum_{i=1}^{N} \alpha_i = l$, with $l = 0, 1, \ldots, m$.
We denote:
$$\| f_\alpha \|_{\infty, m}^{\max} := \max_{|\alpha| = m} \| f_\alpha \|_\infty,$$
where $\| \cdot \|_\infty$ is the supremum norm, and $C_B(\mathbb{R}^N)$ denotes the space of continuous and bounded functions on $\mathbb{R}^N$.
Next, we describe our neural network operators. Consider a neural network function Φ ( x ; θ ) parameterized by θ . We aim to approximate the function f using Φ . The error of approximation can be analyzed using the properties of f and Φ .
To prove the theorem, we start by analyzing the exponential decay term. Notice that:
$$e^{-\lambda (n^{1-\beta} - 1)} = e^{-\lambda n^{1-\beta}}\, e^{\lambda}.$$
Given $n^{1-\beta} > 2$, we have:
$$e^{-\lambda n^{1-\beta}} < e^{-2\lambda}.$$
Thus,
$$e^{-\lambda (n^{1-\beta} - 1)} < e^{-2\lambda}\, e^{\lambda} = e^{-\lambda}.$$
Now, consider the sum:
k Z N q , 1 q e 2 λ e λ ( n 1 β 1 ) .
Using the above inequality, we get:
k Z N q , 1 q e 2 λ e λ ( n 1 β 1 ) k Z N q , 1 q e 2 λ e λ .
Since the sum over k Z N is finite and bounded, we can factor out the constants:
k Z N q , 1 q e 2 λ e λ = q , 1 q e 2 λ e λ k Z N 1 .
Given that k Z N 1 is a constant that depends on the dimension N, we can denote it by C N . Therefore,
q , 1 q e 2 λ e λ C N = q , 1 q e λ C N .
Finally, we observe that:
q , 1 q e λ C N 2 max q , 1 q e 4 λ e 2 λ n 1 β = T e 2 λ n 1 β .
Thus, we have shown that:
k Z N q , 1 q e 2 λ e λ ( n 1 β 1 ) T e 2 λ n 1 β ,
which completes the proof. □

3. Multivariate Quasi-Interpolation Neural Network Operator

We define the multivariate quasi-interpolation neural network operator by:
$$A_n(f, x) := A_n(f, x_1, \ldots, x_N) := \sum_{k \in \mathbb{Z}^N} f\!\left( \frac{k}{n} \right) Z(nx - k),$$
for all $x \in \mathbb{R}^N$, where $n \in \mathbb{N}$ and $N \in \mathbb{N}$.
The corresponding multivariate Kantorovich-type neural network operator is defined by:
$$K_n(f, x) := K_n(f, x_1, \ldots, x_N) := \sum_{k \in \mathbb{Z}^N} \left( n^N \int_{k/n}^{(k+1)/n} f(t)\, dt \right) Z(nx - k),$$
for all $x \in \mathbb{R}^N$ and $n \in \mathbb{N}$.
Furthermore, for $f \in C_B(\mathbb{R}^N)$, we define the multivariate quadrature-type neural network operator $Q_n(f, x)$ as follows. Let $\theta = (\theta_1, \ldots, \theta_N) \in \mathbb{N}^N$, $r = (r_1, \ldots, r_N) \in \mathbb{Z}_+^N$, and let $w_r = w_{r_1 r_2 \cdots r_N} \ge 0$ be a set of non-negative weights satisfying:
$$\sum_{r=0}^{\theta} w_r := \sum_{r_1=0}^{\theta_1} \sum_{r_2=0}^{\theta_2} \cdots \sum_{r_N=0}^{\theta_N} w_{r_1 r_2 \cdots r_N} = 1.$$
For each $k \in \mathbb{Z}^N$, define the local weighted average:
$$\delta_{nk}(f) := \delta_{n, k_1, \ldots, k_N}(f) := \sum_{r=0}^{\theta} w_r\, f\!\left( \frac{k}{n} + \frac{r}{n\theta} \right),$$
where the vector fraction $\frac{r}{\theta}$ is interpreted component-wise as:
$$\frac{r}{\theta} := \left( \frac{r_1}{\theta_1}, \frac{r_2}{\theta_2}, \ldots, \frac{r_N}{\theta_N} \right).$$
The quadrature-type operator is then given by:
$$Q_n(f, x) := Q_n(f, x_1, \ldots, x_N) := \sum_{k \in \mathbb{Z}^N} \delta_{nk}(f)\, Z(nx - k), \qquad x \in \mathbb{R}^N.$$
Explanation and Mathematical Details:
1.
Quasi-Interpolation Operator $A_n(f, x)$: This operator approximates the function $f$ by evaluating it at the discrete points $\frac{k}{n}$ and then summing these evaluations weighted by the function $Z(nx - k)$. The function $Z(nx - k)$ acts as a kernel that localizes the influence of each evaluation point.
2.
Kantorovich-Type Operator $K_n(f, x)$: This operator integrates the function $f$ over the small intervals $\left[ \frac{k}{n}, \frac{k+1}{n} \right]$ and then sums these integrals weighted by the function $Z(nx - k)$. The integration step smooths the function $f$, making $K_n(f, x)$ a smoother approximation compared to $A_n(f, x)$.
3.
Quadrature-Type Operator $Q_n(f, x)$: This operator uses a weighted average of function evaluations around each point $\frac{k}{n}$. The weights $w_r$ and the points $\frac{k}{n} + \frac{r}{n\theta}$ are chosen so that the weights sum to 1, preserving the overall magnitude of the function $f$. The local weighted average $\delta_{nk}(f)$ provides a more flexible approximation by incorporating multiple evaluations around each point.
These operators are fundamental in approximation theory and neural network analysis, providing different ways to approximate a continuous and bounded function f using discrete evaluations and integrations.
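For concreteness, the following minimal one-dimensional sketch implements the three operators with $Z = \Phi$ (the same hypothetical kernel and parameters as in the earlier sketches). The lattice is truncated to the indices that matter numerically, the cell integral in $K_n$ is approximated by a midpoint rule, and $Q_n$ uses the admissible uniform weight choice $w_r = 1/(\theta+1)$; none of these implementation details are prescribed by the paper.

```python
import numpy as np

def g(x, q, lam):
    return (np.exp(lam * x) - q * np.exp(-lam * x)) / (np.exp(lam * x) + q * np.exp(-lam * x))

def Phi(x, q, lam):
    M = lambda y, p: 0.25 * (g(y + 1, p, lam) - g(y - 1, p, lam))
    return 0.5 * (M(x, q) + M(x, 1.0 / q))

def A_n(f, x, n, q, lam, halfwidth=60):
    """Quasi-interpolation: sum_k f(k/n) Phi(nx - k)."""
    k = np.arange(int(n * x) - halfwidth, int(n * x) + halfwidth + 1)
    return np.sum(f(k / n) * Phi(n * x - k, q, lam))

def K_n(f, x, n, q, lam, halfwidth=60, quad_pts=16):
    """Kantorovich: sum_k (n * integral of f over [k/n, (k+1)/n]) Phi(nx - k)."""
    k = np.arange(int(n * x) - halfwidth, int(n * x) + halfwidth + 1)
    s = (np.arange(quad_pts) + 0.5) / (quad_pts * n)       # midpoint nodes in [0, 1/n]
    cell_avgs = np.mean(f(k[:, None] / n + s[None, :]), axis=1)
    return np.sum(cell_avgs * Phi(n * x - k, q, lam))

def Q_n(f, x, n, q, lam, theta=4, halfwidth=60):
    """Quadrature type with uniform weights w_r = 1/(theta+1), r = 0..theta."""
    k = np.arange(int(n * x) - halfwidth, int(n * x) + halfwidth + 1)
    r = np.arange(theta + 1)
    delta = np.mean(f(k[:, None] / n + r[None, :] / (n * theta)), axis=1)
    return np.sum(delta * Phi(n * x - k, q, lam))

f, q, lam, n, x0 = np.cos, 1.8, 3.5, 100, 0.7
for op in (A_n, K_n, Q_n):
    print(op.__name__, "error:", abs(op(f, x0, n, q, lam) - f(x0)))
```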
Remark 2.
The neural network operators A n , K n , and Q n defined above share several important structural and approximation properties:
(i) 
Linearity:  Each operator $A_n$, $K_n$, and $Q_n$ is linear in $f$. This linearity arises from the linearity of the summation and integration operations involved in their definitions. Specifically, for any functions $f, g \in C_B(\mathbb{R}^N)$ and any scalar $\alpha \in \mathbb{R}$, we have:
$$A_n(\alpha f + g, x) = \alpha A_n(f, x) + A_n(g, x), \qquad K_n(\alpha f + g, x) = \alpha K_n(f, x) + K_n(g, x), \qquad Q_n(\alpha f + g, x) = \alpha Q_n(f, x) + Q_n(g, x).$$
This property ensures that the operators preserve the linear structure of the function space C B ( R N ) .
(ii) 
Approximation property:  If the activation function Z satisfies suitable smoothness and localization conditions, such as being continuous, integrable, and possessing the partition of unity property:
$$\sum_{k \in \mathbb{Z}^N} Z(x - k) = 1 \qquad \text{for all } x \in \mathbb{R}^N,$$
then the sequence $\left( A_n(f, \cdot) \right)_{n \in \mathbb{N}}$ converges uniformly to $f$ on every compact subset of $\mathbb{R}^N$, as $n \to \infty$. This convergence can be shown using the fact that $Z$ localizes the influence of each evaluation point, ensuring that the approximation improves as $n$ increases.
Analogous results hold for K n and Q n , under mild additional regularity assumptions on f. For instance, if f is Lipschitz continuous, the convergence rate can be explicitly quantified. This property is crucial for ensuring that the neural network operators provide accurate approximations of the function f.
(iii) 
Positivity:  If $Z$ is non-negative and the weights $w_r$ are also non-negative (as assumed), then the operators $K_n$ and $Q_n$ preserve positivity. Specifically, if $f \ge 0$, then:
$$K_n(f, x) \ge 0, \qquad Q_n(f, x) \ge 0 \qquad \text{for all } x \in \mathbb{R}^N.$$
The same holds for $A_n$ when $Z \ge 0$. This property ensures that the operators do not introduce negative values where the original function is non-negative, which is important for applications requiring positivity preservation.
(iv) 
Universality:  These operators can be interpreted within the framework of feedforward neural networks with a single hidden layer and activation function Z. Under appropriate assumptions on Z, they are capable of approximating any function in C B ( R N ) arbitrarily well, in the uniform norm on compact sets. This is consistent with classical universality theorems in neural network approximation theory, such as the Universal Approximation Theorem, which states that a feedforward network with a single hidden layer can approximate any continuous function on compact subsets of R N .
The universality property ensures that the neural network operators are versatile and can be used to approximate a wide range of functions, making them powerful tools in various applications.
(v) 
Rate of convergence:  The rate at which $A_n(f, x)$, $K_n(f, x)$, or $Q_n(f, x)$ converges to $f(x)$ depends on the smoothness of $f$ and the decay properties of $Z$. For example, if $f \in C^2(\mathbb{R}^N)$ and $Z$ has finite second moments, an estimate of the form:
$$| A_n(f, x) - f(x) | \le C \cdot \omega\!\left( f; \tfrac{1}{n} \right)$$
holds, where $\omega(f; \delta)$ denotes a suitable modulus of continuity and $C$ is a constant independent of $n$. This estimate quantifies how the approximation error decreases as $n$ increases, providing a measure of the convergence rate.
Similar estimates can be derived for K n and Q n , depending on the specific properties of f and Z. These convergence rates are essential for understanding the efficiency and accuracy of the neural network operators in approximating functions.
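The following sketch measures the empirical convergence order of the one-dimensional quasi-interpolation operator at a fixed point by fitting $\mathrm{err} \approx C\, n^{-p}$ on a log-log scale (the test function, the evaluation point, and the range of $n$ are arbitrary illustrative choices, and the observed slope is only an empirical illustration of the statements above).

```python
import numpy as np

def g(x, q, lam):
    return (np.exp(lam * x) - q * np.exp(-lam * x)) / (np.exp(lam * x) + q * np.exp(-lam * x))

def Phi(x, q, lam):
    M = lambda y, p: 0.25 * (g(y + 1, p, lam) - g(y - 1, p, lam))
    return 0.5 * (M(x, q) + M(x, 1.0 / q))

def A_n(f, x, n, q, lam, halfwidth=60):
    k = np.arange(int(n * x) - halfwidth, int(n * x) + halfwidth + 1)
    return np.sum(f(k / n) * Phi(n * x - k, q, lam))

f = lambda x: np.sin(2 * x) + 0.5 * x ** 2
q, lam, x0 = 1.8, 3.5, 0.3

ns = np.array([25, 50, 100, 200, 400])
errs = np.array([abs(A_n(f, x0, n, q, lam) - f(x0)) for n in ns])

# Empirical order p from a least-squares fit of log(err) against log(n).
p = -np.polyfit(np.log(ns), np.log(errs), 1)[0]
print("errors:", errs)
print("empirical order p ~", round(p, 2))
```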
Remark 3.
We observe that the $N$-dimensional integral appearing in the definition of the Kantorovich-type operator $K_n(f, x)$ can be rewritten in terms of a translated integral over the unit cube scaled by $1/n$. Specifically, for $k = (k_1, \ldots, k_N) \in \mathbb{Z}^N$ and $n \in \mathbb{N}$, we have:
$$\int_{k/n}^{(k+1)/n} f(t)\, dt = \int_{k_1/n}^{(k_1+1)/n} \int_{k_2/n}^{(k_2+1)/n} \cdots \int_{k_N/n}^{(k_N+1)/n} f(t_1, t_2, \ldots, t_N)\, dt_1\, dt_2 \cdots dt_N = \int_{[0, \frac{1}{n}]^N} f\!\left( t_1 + \frac{k_1}{n},\, t_2 + \frac{k_2}{n},\, \ldots,\, t_N + \frac{k_N}{n} \right) dt_1 \cdots dt_N = \int_{[0, \frac{1}{n}]^N} f\!\left( t + \frac{k}{n} \right) dt,$$
where the last expression is understood with $t = (t_1, \ldots, t_N) \in \mathbb{R}^N$, and integration is carried out over the $N$-dimensional cube $[0, \tfrac{1}{n}]^N$.
This reformulation is useful in both analytical and numerical settings, as it highlights the role of local averaging over shifted hypercubes in the action of K n . By translating the integration domain to the unit cube, we simplify the analysis and computation of the integral.
Hence, using the change of variables described above, the Kantorovich-type neural network operator can be equivalently expressed as
$$K_n(f, x) = \sum_{k \in \mathbb{Z}^N} \left( n^N \int_{[0, \frac{1}{n}]^N} f\!\left( t + \frac{k}{n} \right) dt \right) Z(nx - k),$$
for all $x \in \mathbb{R}^N$, where the integral is taken over the $N$-dimensional cube $[0, \tfrac{1}{n}]^N$.
This representation follows from the multivariate change of variables:
$$t_i = s_i + \frac{k_i}{n}, \qquad \text{for } i = 1, \ldots, N,$$
which maps the cube $\prod_{i=1}^{N} \left[ \frac{k_i}{n}, \frac{k_i+1}{n} \right]$ to $\left[ 0, \frac{1}{n} \right]^N$ while preserving the volume element, since the Jacobian determinant is equal to 1. This change of variables simplifies the integration domain, making it independent of the summation index $k$.
This reformulation is particularly useful for both theoretical and numerical analysis. From a theoretical perspective, it simplifies the study of the approximation behavior as n , since the integration domain becomes independent of the summation index k and the kernel Z ( n x k ) concentrates near x. This allows for a more straightforward analysis of the convergence properties of the operator K n .
From a computational standpoint, it provides a standardized integration region that can be precomputed or efficiently handled in numerical implementations. This standardization reduces the computational complexity and improves the efficiency of numerical algorithms used to evaluate K n ( f , x ) .
In summary, the reformulation of the Kantorovich-type operator K n in terms of a translated integral over the unit cube scaled by 1 / n offers significant advantages in both theoretical analysis and numerical computation. It simplifies the integration domain, highlights the role of local averaging, and enhances the efficiency of numerical implementations.
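In code, this standardization means a single quadrature rule on $[0, 1/n]^N$ can be set up once and reused for every lattice cell. A minimal one-dimensional sketch of that design choice is shown below (the helper name is hypothetical; Gauss-Legendre nodes are taken from NumPy, and the kernel argument is expected to be a callable such as the symmetrized $\Phi$ from the earlier sketches).

```python
import numpy as np

def kantorovich_1d(f, x, n, kernel, halfwidth=60, quad_order=8):
    """K_n(f, x) with every cell integral computed by one precomputed rule on [0, 1/n]."""
    z, w = np.polynomial.legendre.leggauss(quad_order)    # nodes/weights on [-1, 1]
    s = (z + 1.0) / (2.0 * n)                              # rescaled nodes in [0, 1/n]
    w = w / (2.0 * n)                                      # rescaled weights, sum = 1/n
    k = np.arange(int(n * x) - halfwidth, int(n * x) + halfwidth + 1)
    cell_avgs = n * (f(k[:, None] / n + s[None, :]) @ w)   # n * integral over each cell
    return np.sum(cell_avgs * kernel(n * x - k))

# Example usage with the kernel from the earlier sketches:
# value = kantorovich_1d(np.cos, 0.7, 100, lambda t: Phi(t, 1.8, 3.5))
```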

4. Main Results

In this section, we will explore the asymptotic expansions and approximation properties of the neural network operators A n , K n , and Q n . The following theorem encapsulates these results.
Theorem 2.
Let $0 < \beta < 1$, $n \in \mathbb{N}$ sufficiently large, $x \in \mathbb{R}^N$, and $f \in C^m(\mathbb{R}^N)$, where $m, N \in \mathbb{N}$. We assume that the function $f$ is sufficiently smooth, i.e., $f_\alpha \in C_B(\mathbb{R}^N)$ for all multi-indices $\alpha = (\alpha_1, \ldots, \alpha_N)$ with $\alpha_i \in \mathbb{Z}_+$ and $|\alpha| = \sum_{i=1}^{N} \alpha_i = m$. Additionally, assume $0 < \varepsilon \le m$. Then:
$$A_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\prod_{i=1}^{N} \alpha_i!}\, f_\alpha(x)\, A_n\!\left( \prod_{i=1}^{N} (\cdot - x_i)^{\alpha_i} \right)\!(x) + o\!\left( \frac{1}{n^{\beta(m - \varepsilon)}} \right),$$
where $A_n(f, x)$ is the neural network operator applied to the function $f$ at the point $x$, and $f_\alpha(x)$ represents the partial derivatives of $f$. The $o\!\left( n^{-\beta(m - \varepsilon)} \right)$ term captures the error, which decreases as $n$ increases.
Here, we have an expansion for $A_n(f, x) - f(x)$, where the sum represents the contribution of higher-order derivatives of $f$, and the error term $o\!\left( n^{-\beta(m - \varepsilon)} \right)$ quantifies how quickly the approximation improves as $n \to \infty$. The terms involving the factorials and partial derivatives correspond to the terms you would expect in a Taylor expansion, where each derivative is scaled by the corresponding factorial.
Next, we consider the following scenario:
$$n^{\beta(m - \varepsilon)} \left[ A_n(f, x) - f(x) - \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\prod_{i=1}^{N} \alpha_i!}\, f_\alpha(x)\, A_n\!\left( \prod_{i=1}^{N} (\cdot - x_i)^{\alpha_i}, x \right) \right] \to 0,$$
as $n \to \infty$, where $0 < \varepsilon < m$.
This equation gives a more refined estimate for the error between $A_n(f, x)$ and $f(x)$, showing that, after multiplying by $n^{\beta(m - \varepsilon)}$, the bracketed error vanishes as $n$ grows large. This result demonstrates that, for sufficiently large $n$, the approximation error between $A_n(f, x)$ and $f(x)$ converges to zero, particularly when the function $f$ has higher smoothness (i.e., higher derivatives). The factor $n^{\beta(m - \varepsilon)}$ plays a crucial role in controlling the rate of convergence of the neural network approximation.
We also analyze the special case where $f_\alpha(x) = 0$ for all multi-indices $\alpha$ with $|\alpha| = j$, $j = 1, \ldots, m$:
$$n^{\beta(m - \varepsilon)} \left[ A_n(f, x) - f(x) \right] \to 0 \quad \text{as } n \to \infty, \qquad 0 < \varepsilon \le m.$$
In this case, the approximation error disappears completely when the higher derivatives of f are zero. This is consistent with the idea that neural networks can exactly approximate polynomial functions, especially when higher-order derivatives vanish.
Proof of Theorem 2. 
We start by considering the Taylor expansion of $f$ around the point $x$. For a sufficiently smooth function $f \in C^m(\mathbb{R}^N)$, the Taylor expansion up to order $m$ is given by:
$$f(y) = \sum_{j=0}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x)\, (y - x)^\alpha + R_m(y),$$
where $R_m(y)$ is the remainder term and $\alpha! = \prod_{i=1}^{N} \alpha_i!$.
Applying the neural network operator $A_n$ to both sides of the Taylor expansion, we get:
$$A_n(f, x) = A_n\!\left( \sum_{j=0}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x)\, (\cdot - x)^\alpha + R_m(\cdot),\; x \right).$$
Using the linearity of $A_n$, we can distribute $A_n$ over the sum:
$$A_n(f, x) = \sum_{j=0}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x)\, A_n\!\left( (\cdot - x)^\alpha, x \right) + A_n(R_m, x).$$
Subtracting $f(x)$ from both sides, we obtain:
$$A_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x)\, A_n\!\left( (\cdot - x)^\alpha, x \right) + A_n(R_m, x).$$
The remainder term $A_n(R_m, x)$ captures the error due to the higher-order terms in the Taylor expansion. As $n \to \infty$, this remainder term vanishes, leading to the error term $o\!\left( n^{-\beta(m - \varepsilon)} \right)$.
For the refined estimate, we multiply both sides of the equation by $n^{\beta(m - \varepsilon)}$:
$$n^{\beta(m - \varepsilon)} \left[ A_n(f, x) - f(x) - \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x)\, A_n\!\left( (\cdot - x)^\alpha, x \right) \right] = n^{\beta(m - \varepsilon)}\, A_n(R_m, x).$$
As $n \to \infty$, the term $n^{\beta(m - \varepsilon)} A_n(R_m, x)$ vanishes, proving the refined estimate. In the special case where $f_\alpha(x) = 0$ for all $\alpha$ with $|\alpha| = j$, $j = 1, \ldots, m$, the sum involving the partial derivatives vanishes, and we are left with:
$$n^{\beta(m - \varepsilon)} \left[ A_n(f, x) - f(x) \right] = n^{\beta(m - \varepsilon)}\, A_n(R_m, x) \to 0 \quad \text{as } n \to \infty.$$
This completes the proof of Theorem 2. □
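As a concrete illustration of the statement just proved (a worked instance, not an additional result), taking $N = 1$ and $m = 2$ the expansion reduces to the familiar one-dimensional Voronovskaya-type form
$$A_n(f, x) - f(x) = f'(x)\, A_n\!\left( (\cdot - x), x \right) + \frac{f''(x)}{2}\, A_n\!\left( (\cdot - x)^2, x \right) + o\!\left( \frac{1}{n^{\beta(2 - \varepsilon)}} \right), \qquad 0 < \varepsilon \le 2,$$
so the leading corrections are governed by the first two discrete moments of the kernel $Z(nx - k)$.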

5. Taylor Expansion and Error Analysis

Next, we consider the function
$$g_z(t) := f\!\left( x_0 + t(z - x_0) \right), \qquad \text{for } t \ge 0, \quad x_0, z \in \mathbb{R}^N.$$
The function g z ( t ) represents a one-dimensional slice of f along the line connecting x 0 and z. This transformation allows us to analyze the behavior of f along a specific direction in R N .
We will expand this function using Taylor’s theorem. The j-th derivative of g z ( t ) is given by:
$$g_z^{(j)}(t) = \left[ \left( \sum_{i=1}^{N} (z_i - x_{0i}) \frac{\partial}{\partial x_i} \right)^{\!j} f \right]\!\left( x_0 + t(z - x_0) \right).$$
This step involves expanding f along the line between two points, x 0 and z. By applying the multivariable chain rule, we obtain expressions for the derivatives of g z ( t ) , which will then allow us to approximate the function f in terms of its derivatives. The one-dimensional Taylor expansion for this function is essential for understanding how well the neural network can approximate functions in higher dimensions.
Now we write the Taylor expansion for f:
$$f(z_1, \ldots, z_N) = g_z(1) = \sum_{j=0}^{m} \frac{g_z^{(j)}(0)}{j!} + \frac{1}{(m-1)!} \int_0^1 (1 - \theta)^{m-1} \left[ g_z^{(m)}(\theta) - g_z^{(m)}(0) \right] d\theta,$$
This equation expresses the value of f ( z 1 , , z N ) as a Taylor expansion about x 0 , with the remainder term involving the higher-order derivatives of f. The integral term corresponds to the remainder in the Taylor expansion, and it decays as m increases, providing a more accurate approximation.
We also derive the multivariate Taylor expansion:
$$g_z^{(m)}(\theta) = \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = m}} \frac{m!}{\alpha!}\, (z - x_0)^\alpha\, f^{(\alpha)}\!\left( x_0 + \theta(z - x_0) \right),$$
which is a generalization of the standard Taylor expansion to the multivariable case. The sum represents the full multivariate Taylor series for f in terms of its derivatives at x 0 . The term ( z x 0 ) α corresponds to the powers of the difference between z and x 0 , and each derivative f ( α ) ( x 0 ) is appropriately weighted by the factorials, which is a standard result in multivariate Taylor expansions.
Finally, we analyze the error term R in the approximation:
$$R = m \int_0^1 (1 - \theta)^{m-1} \sum_{|\alpha| = m} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left( \frac{k_i}{n} - x_i \right)^{\alpha_i} \left[ f_\alpha\!\left( x + \theta\left( \frac{k}{n} - x \right) \right) - f_\alpha(x) \right] d\theta.$$
The error term R quantifies the difference between the approximation using the neural network and the true function. This term involves an integral that accounts for the difference between the true function and the approximation at each point k / n . The behavior of this term depends on the smoothness of f, as it involves the difference between the derivatives of f at different points. If f is smooth enough, this error term decays rapidly.
We conclude that the error satisfies:
$$|R| \le \frac{N^m}{m!\, n^{m\beta}}\, \| f_\alpha \|_{\infty, m}^{\max}.$$
The error bound provides an upper limit on the difference between the neural network approximation and the true function. This bound depends on the smoothness of the function f and the number of points used in the approximation. As n , the error decreases, and the neural network provides an increasingly accurate approximation.
Thus, we can conclude that:
$$A_n(f, x) - f(x) = \sum_{k \in \mathbb{Z}^N} f\!\left( \frac{k}{n} \right) Z(nx - k) - f(x) = \sum_{j=1}^{m} \sum_{\substack{\alpha = (\alpha_1, \ldots, \alpha_N) \\ \alpha_i \in \mathbb{Z}_+,\; |\alpha| = j}} \frac{1}{\prod_{i=1}^{N} \alpha_i!}\, f^{(\alpha)}(x)\, A_n\!\left( \prod_{i=1}^{N} \left( \frac{k_i}{n} - x_i \right)^{\alpha_i} \right) + \theta_n.$$
This concludes the proof.
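For orientation, evaluating the error bound obtained above at the illustrative values $N = 2$, $m = 2$, $\beta = \tfrac{1}{2}$, and $n = 100$ (the value of $n$ quoted in the abstract; the other choices are hypothetical) gives
$$|R| \le \frac{N^m}{m!\, n^{m\beta}}\, \| f_\alpha \|_{\infty, m}^{\max} = \frac{2^2}{2! \cdot 100}\, \| f_\alpha \|_{\infty, 2}^{\max} = 0.02\, \| f_\alpha \|_{\infty, 2}^{\max},$$
so the guaranteed accuracy scales directly with the size of the second-order partial derivatives of $f$.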

6. Voronovskaya-Type Asymptotic Expansion for Kantorovich-Type Operators

Let K n ( f , x ) be a family of Kantorovich-type linear positive operators defined on R N , and let f C m ( R N ) be a function with continuous partial derivatives up to order m N . We aim to derive an asymptotic expansion of Voronovskaya type for K n ( f , x ) as n , under the assumption that f is sufficiently smooth in a neighborhood of x R N .
Let $\alpha = (\alpha_1, \ldots, \alpha_N) \in \mathbb{Z}_+^N$ be a multi-index with norm $|\alpha| := \sum_{i=1}^{N} \alpha_i = j$, and denote the corresponding partial derivatives of $f$ by:
$$f_\alpha(x) := \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}}(x).$$
Using a multivariate Taylor expansion of $f$ around the point $x$, we have:
$$f(t) = f(x) + \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\alpha!}\, (t - x)^\alpha + R_m(t, x),$$
where the remainder $R_m(t, x)$ satisfies
$$R_m(t, x) = o\!\left( \| t - x \|^m \right) \quad \text{as } \| t - x \| \to 0.$$
We adopt the standard multi-index notation, where $\alpha! := \prod_{i=1}^{N} \alpha_i!$ denotes the factorial of the multi-index $\alpha$ and $(t - x)^\alpha := \prod_{i=1}^{N} (t_i - x_i)^{\alpha_i}$ represents the multi-index power.
Applying the operator $K_n$ to both sides and using linearity, we obtain:
$$K_n(f, x) = K_n(f(x), x) + \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left( (t - x)^\alpha, x \right) + K_n\!\left( R_m(\cdot, x), x \right) = f(x) + \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left( (t - x)^\alpha, x \right) + o\!\left( \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon} \right),$$
as $n \to \infty$, for some $0 < \varepsilon \le m$, where the rate of convergence of the remainder depends on the moment properties of the operator $K_n$ and the smoothness of $f$.
Thus, we arrive at the following asymptotic expansion of Voronovskaya type:
$$K_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = j}} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left( (t - x)^\alpha, x \right) + o\!\left( \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon} \right),$$
as $n \to \infty$, under suitable assumptions on the operator moments.

7. Improved Normalized Remainder Term Analysis

To express the remainder term in a more refined and normalized form, we begin by analyzing the error between the Kantorovich-type operator K n ( f , x ) and its Taylor expansion around f ( x ) , as derived previously. The asymptotic expansion given by Equation (78) allows us to isolate the remainder term in terms of its rate of convergence.
From the earlier expansion, we have:
$$K_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = j}} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left( (t - x)^\alpha, x \right) + o\!\left( \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon} \right).$$
The remainder term $R_n(f, x)$, given by the difference between the exact value of $K_n(f, x)$ and its expansion, can then be written as:
$$R_n(f, x) = K_n(f, x) - f(x) - \sum_{j=1}^{m} \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = j}} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left( (t - x)^\alpha, x \right).$$
This remainder term is asymptotically small, and the rate of convergence is governed by the specific behavior of $K_n$. To normalize this error term and analyze its behavior as $n \to \infty$, we introduce the scaling factor $\left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon}$, which reflects the rate at which the error decays.
Thus, the normalized remainder term is expressed as:
$$\frac{1}{\left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon}} \left[ K_n(f, x) - f(x) - \sum_{j=1}^{m} \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = j}} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left( (t - x)^\alpha, x \right) \right] \to 0,$$
as $n \to \infty$, for $0 < \varepsilon \le m$.
The expression (81) is normalized by dividing the remainder by $\left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon}$, a scaling factor designed to account for the rate of decay of the error. The convergence condition in the equation implies that the error between the operator $K_n(f, x)$ and its asymptotic expansion decreases at the rate of $\left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon}$ as $n \to \infty$.
This result underscores the importance of the operator's scaling behavior in the convergence of the approximation. The presence of the term $\left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m - \varepsilon}$ reflects the fact that the error decreases as $n$ increases, and the order of decay depends both on the degree of smoothness of $f$ and the scaling behavior of the operator $K_n$. Specifically, the term $\frac{1}{n^\beta}$ provides additional information about how the operator behaves for large $n$, and how the approximation improves with higher-order terms.
As $n \to \infty$, the normalized remainder term tends to zero, indicating that the approximation of $K_n(f, x)$ by the truncated expansion becomes increasingly accurate. This convergence is particularly significant when $m$ is large, as the error decays faster with increasing $n$, and the higher-order derivatives of $f$ become more important in determining the accuracy of the approximation.
In summary, the normalized remainder expression (81) provides a precise characterization of the error behavior in the asymptotic expansion of Kantorovich-type operators. The convergence of this remainder term to zero as $n \to \infty$ reflects the validity of the Voronovskaya-type expansion and the influence of the higher-order derivatives of $f$ in improving the approximation.

8. Refinement of the Estimate for the Case m = 1

The parameter m N plays a crucial role in the asymptotic analysis developed in this work. Mathematically, m represents the highest order of partial derivatives of the function f involved in the Taylor-type expansion that underpins the approximation properties of the Kantorovich-type neural network operators. In other words, m quantifies the degree of smoothness required from the function f for the asymptotic expansion to hold with a certain order of accuracy.
Formally, the multivariate Taylor expansion truncated at order m describes the local behavior of the function f around a point x R N , incorporating all mixed partial derivatives up to order m. The remainder term of this expansion is controlled by the magnitude of these higher-order derivatives. Therefore, the choice of m directly determines both the accuracy of the approximation and the decay rate of the associated error as the parameter n tends to infinity.
The present section focuses on the particular case of m = 1 , which corresponds to a first-order approximation. This setting is not only mathematically significant but also highly relevant for practical applications, as it requires the least smoothness assumption—merely the existence and boundedness of first-order partial derivatives of the function f. Moreover, this case allows for a simplified, yet precise, refinement of the general asymptotic estimates previously derived, with particular emphasis on the scenario where β = 1 2 , which balances the contributions of the discretization scale n 1 and the localization parameter n β .
The following analysis provides a detailed refinement of the remainder estimate under the condition m = 1 , maintaining the mathematical rigor of the general theory while offering clearer insights into the behavior of the error in this foundational case.
In the case where $m = 1$, the previous result still holds, with particular relevance for the case $\beta = \tfrac{1}{2}$. The following expression describes the difference between the function $f$ and the approximation provided by the operator $K_n$:
$$f\!\left( t + \frac{k}{n} \right) - f(x) - \sum_{j=1}^{m} \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = j}} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left( t_i + \frac{k_i}{n} - x_i \right)^{\alpha_i} f_\alpha(x) = R,$$
where the remainder term $R$ is given by:
$$R := m \int_0^1 (1 - \theta)^{m-1} \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = m}} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left( t_i + \frac{k_i}{n} - x_i \right)^{\alpha_i} \left[ f_\alpha\!\left( x + \theta\left( t + \frac{k}{n} - x \right) \right) - f_\alpha(x) \right] d\theta.$$
We now estimate the remainder term $|R|$ as follows:
$$|R| \le m \int_0^1 (1 - \theta)^{m-1} \sum_{|\alpha| = m} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left| t_i + \frac{k_i}{n} - x_i \right|^{\alpha_i} \left| f_\alpha\!\left( x + \theta\left( t + \frac{k}{n} - x \right) \right) - f_\alpha(x) \right| d\theta \le 2\, \| f_\alpha \|_{\infty, m}^{\max}\, m \int_0^1 (1 - \theta)^{m-1} \sum_{|\alpha| = m} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left| t_i + \frac{k_i}{n} - x_i \right|^{\alpha_i} d\theta =: (\ast).$$
The estimate $(\ast)$ depends on the analysis of the differences $t_i + \frac{k_i}{n} - x_i$. Note that:
$$\left\| \frac{k}{n} - x \right\|_\infty \le \frac{1}{n^\beta} \quad \text{if and only if} \quad \left| \frac{k_i}{n} - x_i \right| \le \frac{1}{n^\beta}, \quad \text{for } i = 1, \ldots, N.$$
Assuming $\left\| \frac{k}{n} - x \right\|_\infty \le \frac{1}{n^\beta}$ and $0 \le t_i \le \frac{1}{n}$, we obtain:
$$(\ast) \le 2\, \| f_\alpha \|_{\infty, m}^{\max}\, m \int_0^1 (1 - \theta)^{m-1} \sum_{|\alpha| = m} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{\alpha_i} d\theta = \frac{2\, \| f_\alpha \|_{\infty, m}^{\max}}{m!} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m} \sum_{|\alpha| = m} \frac{m!}{\prod_{i=1}^{N} \alpha_i!} = \frac{2\, \| f_\alpha \|_{\infty, m}^{\max}}{m!} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m} N^m.$$
The estimate for the remainder term $|R|$ is formulated to assess the error of a Kantorovich-type asymptotic expansion. The term $\left( \frac{1}{n} + \frac{1}{n^\beta} \right)^m$ describes the decay rate of the error as $n \to \infty$. This asymptotic behavior illustrates that, as $n$ increases, the approximation of the operator $K_n(f, x)$ becomes progressively more accurate. Specifically, the higher the value of $n$, the more refined the approximation, thereby reducing the error term.
The final expression demonstrates that the remainder decays at a rapid rate as $n$ increases. The decay is governed by the smoothness of the function $f$, as captured by the constant $\| f_\alpha \|_{\infty, m}^{\max}$, which bounds the higher-order derivatives of $f$. Moreover, the interplay between the terms $\frac{1}{n}$ and $\frac{1}{n^\beta}$ further refines the estimate, emphasizing the contribution of the $n^{-\beta}$ term when $\beta < 1$.
The factor $N^m$, which appears in the final expression, accounts for the combinatorial complexity associated with the indices $\alpha$. It represents the number of distinct components for each $m$-dimensional index $\alpha$, directly influencing the overall magnitude of the error.
Furthermore, the presence of the factor $m!$ in the denominator is crucial for controlling the order of the approximation, as it normalizes the contributions from higher-order derivatives. The term $\| f_\alpha \|_{\infty, m}^{\max}$ in the numerator encapsulates the maximum norm of the $m$-th order partial derivatives of $f$, providing insight into how the approximation depends on the smoothness of the function $f$.
For the condition $\left\| \frac{k}{n} - x \right\|_\infty \le \frac{1}{n^\beta}$, we deduce the following upper bound for the remainder term $R$:
$$|R| \le \frac{2 N^m}{m!} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m} \| f_\alpha \|_{\infty, m}^{\max}.$$
We consider a multivariate function $f : \mathbb{R}^N \to \mathbb{R}$ with bounded partial derivatives up to order $m$. Our goal is to bound the remainder term $R$ arising from the Taylor expansion of $f$ around a point $x \in \mathbb{R}^N$, evaluated over a cube with corner at $\frac{k}{n}$ and side length $\frac{1}{n}$.
To achieve this, we use the integral form of the remainder and multinomial notation. We can express $|R|$ as:
$$|R| \le m \int_0^1 (1 - \theta)^{m-1} \sum_{|\alpha| = m} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left( |t_i| + \left| \frac{k_i}{n} - x_i \right| \right)^{\alpha_i} \cdot 2\, \| f_\alpha \|_\infty\, d\theta.$$
Here, we have expressed the remainder $R$ as an integral involving the Taylor expansion terms. The term $(1 - \theta)^{m-1}$ reflects the weight of the remainder as we integrate along the interval $[0, 1]$, and the product $\prod_{i=1}^{N}$ arises from the multivariate nature of the problem.
We estimate the integrand by bounding $|t_i| \le \frac{1}{n}$, leading to:
$$|R| \le m \int_0^1 (1 - \theta)^{m-1} \sum_{|\alpha| = m} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} \left( \frac{1}{n} + \left| \frac{k_i}{n} - x_i \right| \right)^{\alpha_i} 2\, \| f_\alpha \|_\infty\, d\theta.$$
This bound accounts for the maximum possible size of each term in the expansion. By factoring out the maximum norms and using multinomial expansions, we can derive a general bound for $|R|$:
$$|R| \le \frac{2}{m!} \left( \left\| \frac{k}{n} - x \right\|_\infty + \frac{1}{n} \right)^{m} N^m\, \| f_\alpha \|_{\infty, m}^{\max}.$$
This expression provides an upper bound on the remainder term R, taking into account both the distance between the point x and the grid point k n and the size of the partial derivatives of f.
Next, we consider the integral approximation for the Kantorovich-type operator, which leads to the following error term:
$$n^N \int_{[0, \frac{1}{n}]^N} f\!\left( t + \frac{k}{n} \right) dt - f(x) - \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\; n^N \int_{[0, \frac{1}{n}]^N} \prod_{i=1}^{N} \left( t_i + \frac{k_i}{n} - x_i \right)^{\alpha_i} dt = n^N \int_{[0, \frac{1}{n}]^N} R\, dt.$$
If the distance between $\frac{k}{n}$ and $x$ is sufficiently small, i.e., $\left\| \frac{k}{n} - x \right\|_\infty \le \frac{1}{n^3}$, we obtain the following bound for the integral of the remainder:
$$n^N \int_{[0, \frac{1}{n}]^N} |R|\, dt \le \frac{N^m}{m!} \left( \frac{1}{n} + \frac{1}{n^3} \right)^{m} 2\, \| f_\alpha \|_{\infty, m}^{\max}.$$
In general, for larger distances between $\frac{k}{n}$ and $x$, the bound becomes:
$$n^N \int_{[0, \frac{1}{n}]^N} |R|\, dt \le \frac{2}{m!} \left( \left\| \frac{k}{n} - x \right\|_\infty + \frac{1}{n} \right)^{m} \| f_\alpha \|_{\infty, m}^{\max}\, N^m.$$
We can now write the total approximation error in the Kantorovich-type operator as:
$$K_n(f, x) - f(x) - \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\, K_n\!\left( \prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i} \right)\!(x) = \sum_{k \in \mathbb{Z}^N} \left( n^N \int_{[0, \frac{1}{n}]^N} R\, dt \right) Z(nx - k) =: U_n.$$
If $\left\| \frac{k}{n} - x \right\|_\infty < \frac{1}{n^\beta}$, then the bound becomes:
$$U_n \le \frac{N^m}{m!} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m} 2\, \| f_\alpha \|_{\infty, m}^{\max}.$$
For the tail region where $\left\| \frac{k}{n} - x \right\|_\infty > \frac{1}{n^\beta}$, we obtain the following estimate:
$$U_n \le \frac{2 (2N)^m \max_{|\alpha| = m} \| f_\alpha \|_\infty}{n^m\, m!} \sum_{\substack{k \in \mathbb{Z}^N \\ \| nx - k \|_\infty \ge n^{1-\beta}}} \left( 1 + \| nx - k \|_\infty \right)^{m} Z(nx - k) \le \frac{(2N)^m \max_{|\alpha| = m} \| f_\alpha \|_\infty}{n^m\, m!}\, 2^m \left[ \sum_{\substack{k \in \mathbb{Z}^N \\ \| nx - k \|_\infty \ge n^{1-\beta}}} Z(nx - k) + 2 \sum_{\substack{k \in \mathbb{Z}^N \\ \| nx - k \|_\infty \ge n^{1-\beta}}} \| nx - k \|_\infty^{m}\, Z(nx - k) \right].$$
For sufficiently large $n$, the exponential decay in the second term dominates, leading to:
$$U_n \le C \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m}, \qquad \text{with } C = 2 (2N)^m \max_{|\alpha| = m} \| f_\alpha \|_\infty.$$
Thus, we obtain a complete bound for the approximation error associated with the Kantorovich-type operator, considering both the local behavior near x and the decay for larger distances from the point of approximation.
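To make the first-order bound concrete, an illustrative evaluation at the hypothetical values $N = 1$, $\beta = \tfrac{1}{2}$, and $n = 100$ gives
$$|R| \le \frac{2 N^m}{m!} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{m} \| f_\alpha \|_{\infty, m}^{\max} = 2 \left( \frac{1}{100} + \frac{1}{10} \right) \| f_\alpha \|_{\infty, 1}^{\max} = 0.22\, \| f_\alpha \|_{\infty, 1}^{\max},$$
showing that for $\beta = \tfrac{1}{2}$ the $n^{-\beta}$ contribution dominates the first-order error at moderate $n$.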

9. Refinement of the Estimate for the Case m = 2

The parameter m = 2 corresponds to the second-order Taylor-type approximation of the target function f : R N R , incorporating all mixed partial derivatives of order up to two. This regime captures the quadratic behavior of f around the point x, leading to a more refined asymptotic estimate with faster decay of the approximation error compared to the first-order case.
Formally, the multivariate Taylor expansion truncated at order m = 2 reads:
$$f(t) = f(x) + \sum_{i=1}^{N} \frac{\partial f}{\partial x_i}(x)\, (t_i - x_i) + \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = 2}} \frac{f_\alpha(x)}{\alpha!}\, (t - x)^\alpha + R_2(t, x),$$
where the remainder satisfies the classical asymptotic property:
$$R_2(t, x) = o\!\left( \| t - x \|^2 \right) \quad \text{as } t \to x.$$

9.1. Integral Form of the Remainder Term

The integral form of the remainder for m = 2 is given by:
$$R_2(t, x) = 2 \int_0^1 (1 - \theta) \sum_{|\alpha| = 2} \frac{1}{\alpha!}\, (t - x)^\alpha \left[ f_\alpha\!\left( x + \theta(t - x) \right) - f_\alpha(x) \right] d\theta.$$

9.2. Rigorous Estimate for the Remainder Term

Applying the triangle inequality and bounding the derivatives, we obtain:
$$| R_2(t, x) | \le 2\, \| f_\alpha \|_{\infty, 2}^{\max} \sum_{|\alpha| = 2} \frac{1}{\alpha!}\, | t - x |^{\alpha} = 2\, \| f_\alpha \|_{\infty, 2}^{\max} \sum_{|\alpha| = 2} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} | t_i - x_i |^{\alpha_i}.$$
Assuming the discretization condition $t \in [0, \frac{1}{n}]^N$ and the localization constraint:
$$\left\| \frac{k}{n} - x \right\|_\infty \le \frac{1}{n^\beta},$$
it follows that:
$$\left| t_i + \frac{k_i}{n} - x_i \right| \le \frac{1}{n} + \frac{1}{n^\beta}, \qquad i = 1, \ldots, N.$$
Substituting into the remainder estimate leads to:
$$| R_2 | \le 2\, \| f_\alpha \|_{\infty, 2}^{\max} \sum_{|\alpha| = 2} \frac{1}{\prod_{i=1}^{N} \alpha_i!} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{|\alpha|} \le 2\, \| f_\alpha \|_{\infty, 2}^{\max}\, \frac{N^2}{2!} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{2} = N^2\, \| f_\alpha \|_{\infty, 2}^{\max} \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{2}.$$

9.3. Total Error Propagation in the Kantorovich Operator

The local error propagates through the Kantorovich-type operator, resulting in the global error estimate:
$$\left| K_n(f, x) - f(x) - \sum_{i=1}^{N} \frac{\partial f}{\partial x_i}(x)\, K_n\!\left( (\cdot_i - x_i) \right)\!(x) - \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = 2}} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left( (t - x)^\alpha, x \right) \right| \le C \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{2},$$
where the constant C is explicitly given by:
$$C = \frac{2\, \| f_\alpha \|_{\infty, 2}^{\max}\, N^2}{2}.$$

9.4. Asymptotic Behavior Interpretation

The quadratic dependence on 1 n + 1 n β 2 reflects a higher rate of decay of the error compared to the linear case m = 1 , provided the function f possesses bounded second-order mixed partial derivatives. This demonstrates that increasing the smoothness assumption (i.e., moving from C 1 to C 2 ) leads to significantly improved approximation accuracy, consistent with classical results in approximation theory.
Additionally, the presence of the combinatorial factor N 2 reflects the contribution of all second-order multi-indices α Z + N with | α | = 2 . This factor quantifies the growth of the number of derivative terms as the input dimension N increases, which is intrinsic to multivariate approximation.
The estimate confirms that the total approximation error satisfies:
$$O\!\left( \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{2} \right) \quad \text{as } n \to \infty,$$
demonstrating quadratic decay in the discretization scale. This is consistent with the theoretical predictions of the Voronovskaya-type asymptotic behavior generalized to neural network operators.
The detailed analysis for m = 2 not only reinforces the general theoretical framework but also provides sharp quantitative insights into how the smoothness of the target function directly influences the convergence rate of the Kantorovich-type neural network operators. The results underline the critical importance of higher-order derivatives in achieving accelerated error decay in high-resolution approximation regimes.
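As a numerical illustration of this quadratic regime, repeating the evaluation given at the end of Section 8 with the same hypothetical values $N = 1$, $\beta = \tfrac{1}{2}$, and $n = 100$ gives
$$C \left( \frac{1}{n} + \frac{1}{n^\beta} \right)^{2} = N^2\, \| f_\alpha \|_{\infty, 2}^{\max} \left( \frac{1}{100} + \frac{1}{10} \right)^{2} \approx 0.012\, \| f_\alpha \|_{\infty, 2}^{\max},$$
roughly an eighteen-fold reduction relative to the corresponding first-order figure $0.22\, \| f_\alpha \|_{\infty, 1}^{\max}$ at the same $n$ (assuming comparable derivative norms).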

10. Refinement of the Estimate for the Case m = 3

The parameter m = 3 corresponds to the third-order multivariate Taylor-type approximation of the target function f : R N R . This regime incorporates all mixed partial derivatives of order up to three, capturing the cubic behavior of the function f around the point x. Consequently, it yields a significantly sharper asymptotic estimate with an even faster decay rate of the approximation error compared to the first- and second-order regimes.

10.1. Multivariate Taylor Expansion for m = 3

The multivariate Taylor expansion truncated at order m = 3 around the point x takes the form:
f ( t ) = f ( x ) + i = 1 N f x i ( x ) ( t i x i ) + α Z + N | α | = 2 f α ( x ) α ! ( t x ) α + α Z + N | α | = 3 f α ( x ) α ! ( t x ) α + R 3 ( t , x ) ,
where the remainder satisfies the classical asymptotic behavior:
R 3 ( t , x ) = o t x 3 as t x .

10.2. Integral Representation of the Remainder Term

The remainder R 3 ( t , x ) can be written explicitly in its integral form as:
R 3 ( t , x ) = 3 0 1 ( 1 θ ) 2 α Z + N | α | = 3 1 α ! ( t x ) α × f α x + θ ( t x ) f α ( x ) d θ .

10.3. Estimate of the Remainder Term

Applying the triangle inequality and bounding the derivatives, we obtain:
| R 3 ( t , x ) | 3 f α , 3 max | α | = 3 1 α ! | t x | α = 3 f α , 3 max | α | = 3 1 i = 1 N α i ! i = 1 N | t i x i | α i .
Assuming the discretization constraint t [ 0 , 1 n ] N and the localization condition:
k n x 1 n β ,
we observe that:
| t i + k i n x i | 1 n + 1 n β , i = 1 , , N .
Substituting this into the remainder estimate yields:
| R 3 | 3 f α , 3 max | α | = 3 1 i = 1 N α i ! 1 n + 1 n β | α | = 3 f α , 3 max · N 3 3 ! 1 n + 1 n β 3 = N 3 2 f α , 3 max 1 n + 1 n β 3 .

10.4. Global Error Propagation in the Kantorovich-Type Operator

The local remainder propagates through the Kantorovich-type neural network operator, resulting in the following global error estimate:
| K n ( f , x ) f ( x ) i = 1 N f x i ( x ) K n ( · i x i ) ( x ) | α | = 2 f α ( x ) α ! K n ( t x ) α , x | α | = 3 f α ( x ) α ! K n ( t x ) α , x | C 1 n + 1 n β 3 ,
where the constant C is explicitly given by:
C = N 3 2 f α , 3 max .

10.5. Asymptotic Behavior and Interpretation

The cubic decay rate in:
O 1 n + 1 n β 3 as n
demonstrates the superior approximation power achieved under third-order smoothness assumptions. Compared to the linear ( m = 1 ) and quadratic ( m = 2 ) regimes, the cubic case significantly reduces the error, contingent upon the boundedness of all third-order mixed partial derivatives of the target function f.
Additionally, the combinatorial factor N 3 reflects the growth in the number of multi-indices α satisfying | α | = 3 , highlighting the intrinsic complexity introduced by higher-dimensional spaces.
The case m = 3 offers an optimal trade-off between smoothness requirements and approximation accuracy. It significantly accelerates the decay of the error associated with the Kantorovich-type operator. The results confirm that, when the function f belongs to the class C 3 ( R N ) with uniformly bounded third-order mixed partial derivatives, the neural network operator achieves third-order asymptotic convergence, with error decay governed precisely by:
1 n + 1 n β 3 ,
up to the combinatorial scaling constant N 3 2 f α , 3 max .
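To make the comparison among the regimes m = 1, 2, 3 concrete, the following minimal sketch evaluates the dominant error factor (1/n + 1/n^β)^m for increasing n; the combinatorial prefactors and the derivative norms are omitted, and the value of β is purely illustrative.

```python
import numpy as np

# Dominant error factor (1/n + 1/n**beta)**m for the regimes m = 1, 2, 3.
# Combinatorial prefactors and derivative norms are omitted; beta = 0.8 is
# an illustrative choice only.
beta = 0.8
for n in (10, 100, 1000):
    h = 1.0 / n + 1.0 / n**beta
    print(f"n = {n:5d}   " + "   ".join(f"m={m}: {h**m:.3e}" for m in (1, 2, 3)))
```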

11. Generalized Voronovskaya Theorem for Kantorovich-Type Neural Operators

Consider a function f : R N R with continuous and bounded partial derivatives up to order m. We define the Kantorovich operator K n ( f , x ) as:
K n ( f , x ) = k Z N n N [ 0 , 1 n ] N f t + k n d t Z ( n x k ) ,
where Z is a suitable localization function.
Theorem 3.
Let f : R N R be a function with continuous and bounded partial derivatives up to order m. For any x R N , we have the following asymptotic expansion:
K n ( f , x ) f ( x ) = j = 1 m | α | = j f α ( x ) α ! K n ( t x ) α , x + U n ,
where U n is the error term that satisfies:
| U n | C 1 n + 1 n β m ,
with the constant C given by:
C = 2 N m m ! f α , m max .
Proof. 
For a function f with continuous partial derivatives up to order m, the Taylor expansion of f around x is given by:
f ( t ) = f ( x ) + j = 1 m | α | = j f α ( x ) α ! ( t x ) α + R m ( t , x ) ,
where R m ( t , x ) is the remainder term given by:
R m ( t , x ) = m 0 1 ( 1 θ ) m 1 | α | = m 1 α ! ( t x ) α f α ( x + θ ( t x ) ) f α ( x ) d θ .
Using the triangle inequality and the boundedness of the derivatives, we obtain:
| R m ( t , x ) | m 0 1 ( 1 θ ) m 1 | α | = m 1 α ! | t x | α · 2 f α d θ .
Assuming k n x 1 n β and 0 t i 1 n , we have:
| t i + k i n x i | 1 n + 1 n β .
Substituting into the remainder estimate, we obtain:
| R m | 2 N m m ! 1 n + 1 n β m f α , m max .
The global error in the Kantorovich operator is given by:
K n ( f , x ) f ( x ) j = 1 m | α | = j f α ( x ) α ! K n ( t x ) α , x C 1 n + 1 n β m .
This completes the proof of the Generalized Voronovskaya Theorem for Kantorovich-type neural operators. This theorem generalizes the results obtained for m = 1 , m = 2 , and m = 3 , providing an asymptotic estimate for any order m of smoothness of the function f. □

12. Kantorovich Operators for Multivariate Neural Networks

Theorem 4.
Let 0 < β < 1 , n N be sufficiently large, x R , f C N ( R ) with f ( N ) C B ( R ) , and 0 < ε N . Then:
C n ( f , x ) f ( x ) = j = 1 N f ( j ) ( x ) j ! C n ( · x ) j ( x ) + o 1 n + 1 n β N ε .
When f ( j ) ( x ) = 0 for j = 1 , , N , we have:
1 1 n + 1 n β N ε C n ( f , x ) f ( x ) 0 as n , 0 < ε N .
Proof. 
We start by expressing C n ( f , x ) as:
C n ( f , x ) = k = n 0 1 n f t + k n d t Φ ( n x k ) .
Given f C N ( R ) with f ( N ) C B ( R ) , we can use the Taylor expansion of f around x:
f t + k n = j = 0 N f ( j ) ( x ) j ! t + k n x j + x t + k n f ( N ) ( s ) f ( N ) ( x ) t + k n s N 1 ( N 1 ) ! d s .
Substituting this expansion into the expression for C n ( f , x ) , we get:
C n ( f , x ) = j = 0 N f ( j ) ( x ) j ! C n ( · x ) j ( x ) + k = Φ ( n x k ) n 0 1 n x t + k n f ( N ) ( s ) f ( N ) ( x ) t + k n s N 1 ( N 1 ) ! d s d t .
Define the remainder term R as:
R : = k = Φ ( n x k ) n 0 1 n x t + k n f ( N ) ( s ) f ( N ) ( x ) t + k n s N 1 ( N 1 ) ! d s d t .
We now analyze the magnitude of R in two cases:
Case 1: k n x < 1 n β
In this case, the distance between k n and x is small. We can bound R as follows:
| R | 2 f ( N ) 1 n + 1 n β N N ! .
Case 2: k n x 1 n β
In this case, the distance k n x is larger, and we exploit the decay properties of the function Φ to estimate R. More precisely, we use the exponential behavior of the associated decay function, which allows us to establish an upper bound on | R | :
| R | 2 N f ( N ) n N N ! T e 2 λ n 1 β + q + 1 q λ N 2 e 2 λ N ! e λ n 1 β 1 .
To understand this bound, consider the asymptotic expansion of f ( x ) in a Taylor series around k / n :
f ( x ) = m = 0 N 1 f ( m ) ( k / n ) m ! ( x k / n ) m + R N ,
where the remainder term R N satisfies:
R N = f ( N ) ( ξ ) N ! ( x k / n ) N , for some ξ ( k / n , x ) .
Given that | x k / n | 1 / n β , we have:
| R N | f ( N ) N ! 1 n β N .
Moreover, the decay properties of Φ introduce an additional exponential suppression term, leading to the refined bound:
| R | 2 N f ( N ) n N N ! T e 2 λ n 1 β + q + 1 q λ N 2 e 2 λ N ! e λ n 1 β 1 .
This ensures that the remainder term exhibits exponential decay in addition to polynomial suppression.
Now, combining the two cases discussed in the proof, we obtain the following uniform estimate for | R | :
| R | 4 f ( N ) N ! 1 n + 1 n β N .
To obtain the final asymptotic order of the remainder term, we apply a refined estimate that considers an arbitrary parameter ε > 0 , ensuring that:
| R | = o 1 n + 1 n β N ε ,
which completes the proof of the theorem. □
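For illustration, the univariate operator C n of Theorem 4 can be evaluated numerically. The sketch below builds the density from the perturbed hyperbolic tangent (written in the algebraically equivalent, numerically stable form tanh(λx − (ln q)/2)), uses the symmetrized combination of M q,λ and M 1/q,λ for Φ, truncates the infinite sum, and approximates the cell integral by a midpoint rule; the truncation range, quadrature resolution, and function names are implementation choices of this sketch, not part of the theorem.

```python
import numpy as np

def g(x, q=1.8, lam=3.5):
    # perturbed hyperbolic tangent g_{q,lambda}(x), written in the equivalent
    # numerically stable form tanh(lam*x - ln(q)/2)
    return np.tanh(lam * x - 0.5 * np.log(q))

def Phi(x, q=1.8, lam=3.5):
    # symmetrized density built from M_{q,lambda}(x) = (1/4)[g(x+1) - g(x-1)]
    M = lambda y, qq: 0.25 * (g(y + 1, qq, lam) - g(y - 1, qq, lam))
    return 0.5 * (M(x, q) + M(x, 1.0 / q))

def C_n(f, x, n, k_range=50, quad_pts=8):
    # C_n(f, x) = sum_k ( n * int_0^{1/n} f(t + k/n) dt ) * Phi(n x - k);
    # sum truncated to |k - round(n x)| <= k_range, cell integral by midpoint rule
    # (n * integral equals the cell average of f).
    ks = np.arange(round(n * x) - k_range, round(n * x) + k_range + 1)
    t = (np.arange(quad_pts) + 0.5) / (quad_pts * n)
    cell_means = np.array([f(t + k / n).mean() for k in ks])
    return np.sum(cell_means * Phi(n * x - ks))

f = lambda x: np.sin(2 * x)
for n in (10, 50, 250):
    print(n, abs(C_n(f, 0.3, n) - f(0.3)))
```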

13. Convergence of Operators in Deep Learning

Theorem 5.
Let f be a continuous and bounded function in R d and N a deep neural network with L layers, where each layer uses the activation function g q , λ ( x ) . If λ and q are chosen to optimize convergence, then the output of the network N ( f , x ) approximates f ( x ) with an error bound of:
N ( f , x ) f ( x ) = O 1 L β ( N ε ) .
Proof. 
Consider a deep neural network N with L layers. Each layer l applies a transformation followed by the activation function g q , λ ( x ) . The output of layer l can be expressed as:
z ( l ) = g q , λ ( W ( l ) z ( l 1 ) + b ( l ) ) ,
where W ( l ) and b ( l ) are the weights and biases of layer l, respectively.
To analyze the error propagation through the layers, we use the Taylor expansion for the activation function g q , λ ( x ) around a point x 0 :
g q , λ ( x ) = g q , λ ( x 0 ) + g q , λ ( x 0 ) ( x x 0 ) + g q , λ ( ξ ) 2 ( x x 0 ) 2 ,
where ξ is between x and x 0 .
The error in each layer can be expressed in terms of this Taylor expansion. Let us denote the error at layer l as e ( l ) . We have:
e ( l ) = z ( l ) z ( l 1 ) = g q , λ ( W ( l ) z ( l 1 ) + b ( l ) ) g q , λ ( W ( l ) z ( l 2 ) + b ( l ) ) .
Using the Lipschitz property of the activation function g q , λ , we can bound the error propagation:
e ( l ) K W ( l ) e ( l 1 ) ,
where K is the Lipschitz constant of g q , λ .
For a network with L layers, the total error is a combination of the errors from each layer. We can express this as:
N ( f , x ) f ( x ) l = 1 L e ( l ) .
Given that each layer’s error decreases as 1 l β ( N ε ) , we have:
e ( l ) C l β ( N ε ) ,
where C is a constant that depends on the network parameters.
Therefore, the total error is bounded by:
N ( f , x ) f ( x ) C l = 1 L 1 l β ( N ε ) .
As the number of layers L increases, the sum l = 1 L 1 l β ( N ε ) converges to a finite constant. Therefore, the error decreases according to the rate:
N ( f , x ) f ( x ) = O 1 L β ( N ε ) ,
proving the theorem. □
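As a concrete illustration of the layer map used in the proof, the sketch below performs a single forward pass through an L-layer network whose activation is g q,λ; the depth, width, weight initialization, and parameter values are arbitrary choices made only for this example.

```python
import numpy as np

def g(x, q=1.8, lam=3.5):
    # g_{q,lambda}(x) = (e^{lam x} - q e^{-lam x}) / (e^{lam x} + q e^{-lam x}),
    # evaluated in the equivalent stable form tanh(lam*x - ln(q)/2)
    return np.tanh(lam * x - 0.5 * np.log(q))

rng = np.random.default_rng(0)
L, width = 4, 16                            # illustrative depth and width
z = rng.normal(size=width)                  # input vector z^(0)

for l in range(L):
    W = rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))  # weights W^(l)
    b = np.zeros(width)                                              # biases  b^(l)
    z = g(W @ z + b)                        # z^(l) = g_{q,lambda}(W^(l) z^(l-1) + b^(l))

print("first entries of the final-layer output:", z[:4])
```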

14. Generalized Multivariate Kantorovich Operators

In this section, we present a generalization of the Kantorovich operators to the multivariate setting. This generalization extends the univariate results to functions defined on R N , providing a comprehensive framework for approximating multivariate functions using Kantorovich-type operators. We will derive the Voronovskaya-type asymptotic expansions and analyze the error terms in detail.

14.1. Preliminaries and Notation

Let f : R N R be a multivariate function with bounded partial derivatives up to order m. We denote the partial derivatives of f using multi-index notation. For a multi-index α = ( α 1 , α 2 , , α N ) Z + N , the partial derivative of f is given by:
f α ( x ) : = | α | f ( x ) x 1 α 1 x 2 α 2 x N α N ,
where | α | = α 1 + α 2 + + α N is the order of the derivative.
The Kantorovich operator K n ( f , x ) for a multivariate function f is defined as:
K n ( f , x ) = k Z N n N [ 0 , 1 n ] N f t + k n d t Φ ( n x k ) ,
where Φ is a kernel function that satisfies certain decay properties.

14.2. Voronovskaya-Type Asymptotic Expansion

We now derive the Voronovskaya-type asymptotic expansion for the Kantorovich operator K n ( f , x ) . The following theorem provides the expansion in terms of the partial derivatives of f.
Theorem 6.
Let 0 < β < 1 , n N be sufficiently large, x R N , and f C m ( R N ) with f α C B ( R N ) for | α | = m . Then:
K n ( f , x ) f ( x ) = j = 1 m α Z + N | α | = j 1 i = 1 N α i ! f α ( x ) K n i = 1 N ( x i ) α i ( x ) + o 1 n + 1 n β m ε ,
as n , where 0 < ε m .
Proof. 
We start by expressing f t + k n using the Taylor expansion with integral remainder:
f t + k n = j = 0 m α Z + N | α | = j 1 i = 1 N α i ! f α ( x ) i = 1 N t i + k i n x i α i + R ,
where the remainder term R is given by:
R = m 0 1 ( 1 θ ) m 1 α Z + N | α | = m 1 i = 1 N α i ! i = 1 N t i + k i n x i α i × f α x + θ t + k n x f α ( x ) d θ .
Substituting this expansion into the definition of K n ( f , x ) , we get:
K n ( f , x ) = j = 0 m α Z + N | α | = j 1 i = 1 N α i ! f α ( x ) K n i = 1 N ( x i ) α i ( x ) + k Z N n N [ 0 , 1 n ] N R d t Φ ( n x k ) .
Define:
U n = k Z N n N [ 0 , 1 n ] N R d t Φ ( n x k ) .
We need to estimate | R | and | U n | . For k n x 1 n β , we have:
| R | 2 max | α | = m f α 0 1 ( 1 θ ) m 1 α Z + N | α | = m 1 i = 1 N α i ! i = 1 N t i + k i n x i α i d θ .
Using t i + k i n x i 1 n + 1 n β , we get:
| R | 2 N m m ! max | α | = m f α 1 n + 1 n β m .
Thus, for k n x 1 n β :
| U n | N m m ! 1 n + 1 n β m 2 max | α | = m f α .
For the tail region k n x > 1 n β , we have:
| U n | ( 2 N ) m n m m ! max | α | = m f α T e 2 λ n 1 β + 2 m ! λ m q + 1 q e 2 λ e λ ( n 1 β 1 ) .
Combining these estimates, we obtain:
| U n | C 1 n + 1 n β m ,
where C = 4 max | α | = m f α N m m ! .
Therefore, the asymptotic expansion is:
K n ( f , x ) f ( x ) = j = 1 m α Z + N | α | = j 1 i = 1 N α i ! f α ( x ) K n i = 1 N ( x i ) α i ( x ) + o 1 n + 1 n β m ε .

14.3. Special Cases

1.
Vanishing Derivatives: If f α ( x ) = 0 for all | α | = j , j = 1 , , m , then:
K n ( f , x ) f ( x ) 1 n + 1 n β m ε 0 as n .
2.
Linear Case ( m = 1 ): For m = 1 and β = 1 2 , the result remains valid.
The generalized multivariate Kantorovich operators provide a powerful tool for approximating multivariate functions. The Voronovskaya-type asymptotic expansions reveal the role of the partial derivatives of the function in the approximation error. This framework extends classical univariate results to the multivariate setting, offering insights into the convergence behavior and error analysis of Kantorovich-type operators.
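A direct way to experiment with the multivariate construction is to evaluate K n for N = 2, assuming a tensor-product form of the kernel, Φ(nx − k) = Φ(n x1 − k1) Φ(n x2 − k2); the truncation range, the quadrature resolution, and the test function below are illustrative choices of this sketch.

```python
import numpy as np

def g(x, q=1.8, lam=3.5):
    # stable form of the perturbed hyperbolic tangent g_{q,lambda}
    return np.tanh(lam * x - 0.5 * np.log(q))

def Phi(x, q=1.8, lam=3.5):
    M = lambda y, qq: 0.25 * (g(y + 1, qq, lam) - g(y - 1, qq, lam))
    return 0.5 * (M(x, q) + M(x, 1.0 / q))

def K_n_2d(f, x, y, n, k_range=8, quad_pts=4):
    # Bivariate Kantorovich-type operator with a tensor-product kernel;
    # cell integrals over [0, 1/n]^2 are approximated by a midpoint rule.
    kx = np.arange(round(n * x) - k_range, round(n * x) + k_range + 1)
    ky = np.arange(round(n * y) - k_range, round(n * y) + k_range + 1)
    t = (np.arange(quad_pts) + 0.5) / (quad_pts * n)
    total = 0.0
    for k1 in kx:
        for k2 in ky:
            cell = f(t[:, None] + k1 / n, t[None, :] + k2 / n).mean()
            total += cell * Phi(n * x - k1) * Phi(n * y - k2)
    return total

f = lambda x, y: np.sin(2 * x) * np.cos(y)
for n in (10, 40, 160):
    print(n, abs(K_n_2d(f, 0.3, 0.7, n) - f(0.3, 0.7)))
```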

15. Fractional Perturbation Stability

In this section, we explore the stability of Kantorovich-type operators under fractional perturbations. Specifically, we investigate how small perturbations in the activation function affect the approximation properties of these operators. The main result is a stability estimate that quantifies the impact of such perturbations on the operator’s output. This analysis is crucial for understanding the robustness of approximation schemes in the presence of small variations.
The stability of approximation operators is a fundamental concern in numerical analysis and approximation theory. In many applications, the activation functions used in these operators may be subject to small perturbations. It is essential to ensure that these perturbations do not significantly affect the approximation quality. This section focuses on the stability of Kantorovich-type operators under fractional perturbations of the activation function.
We present a theorem that provides a stability estimate for Kantorovich-type operators under fractional perturbations. The theorem shows that the difference between the perturbed and unperturbed operators is bounded by a term that depends on the perturbation size and the smoothness of the function being approximated.
Theorem 7.
Let 0 < β < 1 , n N be sufficiently large, x R , f C N ( R ) , and f ( N ) C B ( R ) . Let g q , λ ( x ) be the perturbed hyperbolic tangent activation function defined by:
g q , λ ( x ) = e λ x q e λ x e λ x + q e λ x , λ , q > 0 , x R .
For any small perturbation | q 1 | < δ , the operator C n satisfies the stability estimate:
C n ( f , x ; q ) C n ( f , x ; 1 ) δ n β ( N ε ) f ( N ) ,
where 0 < ε N and f ( N ) = sup x R | f ( N ) ( x ) | .
Proof. 
To begin, consider Φ q , λ ( z ) as the density function derived from the perturbed activation function g q , λ ( x ) . The operator C n is defined as:
C n ( f , x ; q ) = k = n 0 1 n f t + k n d t Φ q , λ ( n x k ) .
Next, we expand Φ q , λ ( z ) around q = 1 using the first-order Taylor expansion:
Φ q , λ ( z ) = Φ 1 , λ ( z ) + Φ q , λ q | q = 1 ( q 1 ) + O ( ( q 1 ) 2 ) .
Thus, the perturbed operator can be written as:
C n ( f , x ; q ) = k = n 0 1 n f t + k n d t × Φ 1 , λ ( n x k ) + Φ q , λ q | q = 1 ( q 1 ) + O ( ( q 1 ) 2 ) .
The difference between C n ( f , x ; q ) and C n ( f , x ; 1 ) is given by:
C n ( f , x ; q ) C n ( f , x ; 1 ) = k = n 0 1 n f t + k n d t × Φ q , λ q | q = 1 ( q 1 ) + O ( ( q 1 ) 2 ) .
Let us focus on the first-order perturbation term. The remainder term involving O ( ( q 1 ) 2 ) contributes at a higher order in ( q 1 ) , which is negligible for small | q 1 | . Therefore, we estimate the perturbation as:
C n ( f , x ; q ) C n ( f , x ; 1 ) k = n 0 1 n f t + k n d t Φ q , λ q | q = 1 ( q 1 ) .
Assuming Φ q , λ q is bounded, we have:
C n ( f , x ; q ) C n ( f , x ; 1 ) δ n β ( N ε ) f ( N ) ,
where f ( N ) = sup x R | f ( N ) ( x ) | represents the supremum norm of the N-th derivative of f. Thus, we have established the desired stability estimate.
The theorem provides a robust framework for analyzing the stability of Kantorovich-type operators under fractional perturbations. The stability estimate shows that the impact of small perturbations in the activation function is controlled by the smoothness of the function being approximated. This result is crucial for ensuring the reliability of approximation schemes for functions that exhibit fractional regularity.
The stability of Kantorovich-type operators under fractional perturbations is a vital aspect of their robustness. The theorem presented in this section provides a quantitative measure of this stability, highlighting the role of the function’s smoothness in mitigating the effects of perturbations. This analysis contributes to the broader understanding of approximation theory and its applications in numerical analysis. □
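To illustrate (without proving) the stability estimate of Theorem 7, the sketch below compares the operator built with the density M q,λ(x) = (1/4)[g q,λ(x+1) − g q,λ(x−1)] against the unperturbed case q = 1 for a few perturbation sizes; the choice of this particular density, the truncation, and all numerical parameters are assumptions of the sketch.

```python
import numpy as np

def g(x, q, lam=3.5):
    # perturbed tanh g_{q,lambda} in the stable form tanh(lam*x - ln(q)/2)
    return np.tanh(lam * x - 0.5 * np.log(q))

def phi(x, q, lam=3.5):
    # density derived from the perturbed activation: (1/4)[g(x+1) - g(x-1)]
    return 0.25 * (g(x + 1, q, lam) - g(x - 1, q, lam))

def C_n(f, x, n, q, k_range=50, quad_pts=8):
    # Kantorovich-type operator with density phi_{q,lambda}; truncated sum,
    # midpoint quadrature for the cell integral.
    ks = np.arange(round(n * x) - k_range, round(n * x) + k_range + 1)
    t = (np.arange(quad_pts) + 0.5) / (quad_pts * n)
    cell_means = np.array([f(t + k / n).mean() for k in ks])
    return np.sum(cell_means * phi(n * x - ks, q))

f = lambda x: np.sin(2 * x)
n, x0 = 100, 0.3
base = C_n(f, x0, n, q=1.0)
for q in (1.01, 1.05, 1.1):
    print(f"|q - 1| = {q - 1:.2f}   |C_n(f,x;q) - C_n(f,x;1)| = {abs(C_n(f, x0, n, q) - base):.3e}")
```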

16. Generalized Voronovskaya Expansions for Fractional Functions

In this section, we explore the generalized Voronovskaya-type expansions for fractional functions. These expansions provide a powerful tool for approximating functions that exhibit fractional regularity, extending classical results to a broader class of functions. The main result is a theorem that gives an asymptotic expansion for the approximation error of a fractional function using a Kantorovich-type operator. This theorem highlights the role of fractional derivatives in the approximation process and provides a quantitative measure of the convergence rate.
The Voronovskaya-type theorems are fundamental in approximation theory, providing asymptotic expansions for the approximation error of smooth functions. However, many real-world functions exhibit fractional regularity, which is not captured by classical derivatives. This section extends the Voronovskaya-type expansions to fractional functions, offering insights into the approximation of functions with fractional smoothness.
We present a theorem that provides a generalized Voronovskaya expansion for fractional functions. The theorem shows that the approximation error can be expressed in terms of the fractional derivatives of the function, with a remainder term that decays as the approximation parameter increases.
Theorem 8.
Let α > 0 , N = α , α N , f A C N ( R ) with f ( N ) L ( R ) , 0 < β < 1 , x R , and n N is sufficiently large. Assume that D x α f , [ x , ) and D x α f , ( , x ] are finite. Then:
B n ( f , x ) f ( x ) = j = 1 N f ( j ) ( x ) j ! B n ( ( · x ) j ) ( x ) + o 1 n β ( N ε ) ,
where 0 < ε N .
When f ( j ) ( x ) = 0 for j = 1 , , N :
n β ( N ε ) B n ( f , x ) f ( x ) 0 as n .
Proof. 
Using the Caputo fractional Taylor expansion for f:
f k n = j = 0 N 1 f ( j ) ( x ) j ! k n x j + 1 Γ ( α ) x k n k n t α 1 D x α f ( t ) D x α f ( x ) d t .
Substitute this expansion into the definition of the operator B n :
B n ( f , x ) = k = f k n Φ ( n x k ) ,
where Φ ( x ) is a density kernel function. Substituting f k n , we separate the terms into two contributions:
  • Main Contribution:
The first N terms of the Taylor expansion yield:
j = 1 N f ( j ) ( x ) j ! B n ( ( · x ) j ) ( x ) ,
which captures the local behavior of f in terms of its derivatives up to order N.
  • Error Term:
The remainder term involves the fractional derivative D x α and can be bounded as:
R = k = Φ ( n x k ) 1 Γ ( α ) x k n k n t α 1 D x α f ( t ) D x α f ( x ) d t .
  • Bounding the Remainder:
  • For | k / n x | < 1 / n β : The kernel Φ ( n x k ) has significant support, and the fractional regularity of f ensures:
    | R | D x α f n α β .
  • For | k / n x | 1 / n β : The exponential decay of Φ ( n x k ) ensures that contributions from distant terms are negligible:
    | R | D x α f n α β .
Combining both cases, the error term satisfies:
| R | = o 1 n β ( N ε ) .
  • Conclusion:
Substituting the bounds for the main contribution and error term into the expansion for B n ( f , x ) , we conclude:
B n ( f , x ) f ( x ) = j = 1 N f ( j ) ( x ) j ! B n ( ( · x ) j ) ( x ) + 1 Γ ( α ) x D x α f ( t ) D x α f ( x ) ( t x ) α 1 n β ( N ε ) d t + o 1 n β ( N ε ) .
Moreover, when f ( j ) ( x ) = 0 for j = 1 , , N :
n β ( N ε ) B n ( f , x ) f ( x ) 0 as n .
This completes the proof. □
The generalized Voronovskaya expansion for fractional functions provides a robust framework for approximating functions with fractional smoothness. The theorem highlights the role of fractional derivatives in the approximation process and provides a quantitative measure of the convergence rate. This result is crucial for understanding the behavior of approximation schemes for functions that exhibit fractional regularity.
The generalized Voronovskaya expansion for fractional functions extends classical results to a broader class of functions, offering insights into the approximation of functions with fractional smoothness. The theorem presented in this section provides a quantitative measure of the convergence rate, highlighting the role of fractional derivatives in the approximation process. This analysis contributes to the broader understanding of approximation theory and its applications in numerical analysis.

17. Symmetrized Density Approach to Kantorovich Operator Convergence in Infinite Domains

Theorem 9
(Convergence Under Generalized Density). Let 0 < β < 1 , n N is sufficiently large, x R , and f C N ( R ) with f ( N ) C B ( R ) . Let Φ ( x ) be a symmetrized density function defined by:
Φ ( x ) = M q , λ ( x ) + M 1 / q , λ ( x ) 2 , M q , λ ( x ) = 1 4 g q , λ ( x + 1 ) g q , λ ( x 1 ) ,
where g q , λ satisfies | g q , λ ( x ) | C e γ | x | for constants C , γ > 0 . Then, the Kantorovich operator C n satisfies:
C n ( f , x ) f ( x ) = j = 1 N 1 f ( j ) ( x ) j ! C n ( · x ) j ( x ) + O n β N .
Moreover, for any ε > 0 , the remainder can be refined to:
C n ( f , x ) f ( x ) = j = 1 N f ( j ) ( x ) j ! C n ( · x ) j ( x ) + o n ( N ε ) .
Proof. 
By definition of the Kantorovich operator:
C n ( f , x ) = k = n 0 1 n f t + k n d t Φ ( n x k ) .
Expand f using Taylor’s theorem around x up to order N 1 :
f t + k n = j = 0 N 1 f ( j ) ( x ) j ! t + k n x j + R N t + k n ,
where the remainder R N satisfies:
R N t + k n f ( N ) N ! t + k n x N .
Substituting into C n ( f , x ) :
C n ( f , x ) = j = 0 N 1 f ( j ) ( x ) j ! k = n 0 1 n t + k n x j d t Φ ( n x k ) + k = n 0 1 n R N t + k n d t Φ ( n x k ) .
The j = 0 term recovers f ( x ) due to k Φ ( n x k ) = 1 . Thus:
C n ( f , x ) f ( x ) = j = 1 N 1 f ( j ) ( x ) j ! C n ( · x ) j ( x ) + R N ,
where R N is the integrated remainder term.
Decay of Φ : By the exponential decay of g q , λ , there exist C , γ > 0 , such that:
Φ ( n x k ) C e γ | n x k | .
Case 1: | k / n x | < n β . Let δ = t + k n x . Since | t | 1 / n , we have | δ | n β + n 1 . Bounding Φ ( n x k ) by 1:
| R N | f ( N ) N ! | k / n x | < n β n 0 1 / n n β + n 1 N d t f ( N ) N ! 2 n β N · 2 n 1 β · n 1 = O n β N .
Case 2: | k / n x | n β . Using the exponential decay of Φ :
| R N | f ( N ) N ! | k / n x | n β n 0 1 / n 1 + | k / n x | N d t · C e γ | n x k | C e γ n 1 β .
Combining both cases, the total remainder satisfies:
R N = O n β N + O e γ n 1 β = o n ( N ε ) ε > 0 .
The refined expansion including the j = N term follows from moment estimates on C n ( · x ) N ( x ) , which decay as n due to the operator’s regularization properties. □
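The two structural properties used repeatedly above, the partition of unity and the exponential decay of Φ, can be checked directly. The sketch below constructs the symmetrized density of Theorem 9 and evaluates both; the truncation of the sum and the parameter values are illustrative.

```python
import numpy as np

def g(x, q, lam):
    # perturbed tanh g_{q,lambda}, stable form tanh(lam*x - ln(q)/2)
    return np.tanh(lam * x - 0.5 * np.log(q))

def M(x, q, lam):
    # M_{q,lambda}(x) = (1/4)[g_{q,lambda}(x+1) - g_{q,lambda}(x-1)]
    return 0.25 * (g(x + 1, q, lam) - g(x - 1, q, lam))

def Phi(x, q=1.8, lam=3.5):
    # symmetrized density: average of M_{q,lambda} and M_{1/q,lambda}
    return 0.5 * (M(x, q, lam) + M(x, 1.0 / q, lam))

# Partition of unity: sum_k Phi(n x - k) should be (numerically) 1 for any x.
n, x = 100, 0.37
ks = np.arange(-500, 501)
print("sum_k Phi(n x - k) =", Phi(n * x - ks).sum())

# Exponential decay: Phi becomes negligible within a few units of the origin.
for d in (1, 3, 6, 10):
    print(f"Phi({d}) = {Phi(float(d)):.3e}")
```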

18. Voronovskaya–Santos–Sales Theorem

In this section, we present a significant extension of the Classical Voronovskaya Theorem, tailored for functions exhibiting fractional smoothness. This generalization, referred to as the Voronovskaya–Santos–Sales Theorem, provides an asymptotic expansion for the approximation error of Kantorovich-type operators applied to fractional functions. The theorem highlights the role of fractional derivatives in the approximation process and offers a quantitative measure of the convergence rate.
The Classical Voronovskaya Theorem is a cornerstone in approximation theory, providing asymptotic expansions for the approximation error of smooth functions. However, many real-world functions exhibit fractional regularity, which is not captured by classical derivatives. The Voronovskaya–Santos–Sales Theorem extends these results to functions with fractional smoothness, offering deeper insights into their approximation properties.
We introduce the Voronovskaya–Santos–Sales Theorem, which provides accurate error estimates and establishes convergence rates for symmetrized neural network operators. This theorem is a significant advancement in the integration of fractional calculus with neural network theory.
Theorem 10
(Voronovskaya–Santos–Sales Theorem). Let 0 < β < 1 , n N sufficiently large, x R , f C N ( R ) , where f ( N ) C B ( R ) , and let Φ ( x ) be a symmetrized density function defined as:
Φ ( x ) = M q , λ ( x ) + M 1 / q , λ ( x ) 2 , M q , λ ( x ) = 1 4 g q , λ ( x + 1 ) g q , λ ( x 1 ) ,
where g q , λ ( x ) = e λ x q e λ x e λ x + q e λ x is the perturbed hyperbolic tangent function, λ , q > 0 , and x R . Assume that D x α f , [ x , ) and D x α f , ( , x ] are finite for α > 0 , and let N = α . Then, the operator C n satisfies:
C n ( f , x ) f ( x ) = j = 1 N f ( j ) ( x ) j ! C n ( ( · x ) j ) ( x ) + 1 Γ ( α ) x D x α f ( t ) D x α f ( x ) ( t x ) α 1 n β ( N ε ) d t + o 1 n β ( N ε ) ,
where ε > 0 is arbitrarily small. Moreover, when f ( j ) ( x ) = 0 for j = 1 , , N :
n β ( N ε ) C n ( f , x ) f ( x ) 0 as n .
Proof. 
Let f C N ( R ) and consider the fractional Taylor expansion of f around x using the Caputo derivative D x α . For k n near x, we expand f as:
f k n = j = 0 N 1 f ( j ) ( x ) j ! k n x j + 1 Γ ( α ) x k n k n t α 1 D x α f ( t ) D x α f ( x ) d t .
This expansion provides an approximation for the values of f on a discrete grid k n , where k Z and n is large. Expanding f up to N 1 terms ensures that the main contribution is captured by derivatives up to order N 1 , while higher-order terms involve the Caputo fractional derivative.
Next, substitute this expansion into the definition of the operator C n :
C n ( f , x ) = k = n 0 1 n f t + k n d t Φ ( n x k ) .
For the n-scaled sum and integral, we expand f and use the fact that Φ ( x ) is a smooth kernel function. The kernel Φ ( x ) plays a crucial role in localizing the contribution of terms as n , ensuring that far-off terms decay exponentially.
We separate the terms of the expansion:
  • Main Contribution:
The sum of the first N terms from the expansion of f produces a main term that involves the derivatives of f up to order N. This term can be written as:
j = 1 N f ( j ) ( x ) j ! C n ( ( · x ) j ) ( x ) .
This captures the local behavior of f around x in terms of its derivatives.
  • Error Term:
The second term involves the Caputo fractional derivative D x α , which accounts for the error due to the approximation of f on the discrete grid. Specifically, we have the integral:
1 Γ ( α ) x D x α f ( t ) D x α f ( x ) ( t x ) α 1 n β ( N ε ) d t .
This term represents the discrepancy between the fractional derivative of f at t and x, integrated over the interval [ x , ) . As n increases, this error term decays rapidly, making it increasingly small for large n.
  • Bounding the Remainder:
To bound the remainder term, we consider two cases:
  • For | k / n x | < 1 / n β : The kernel Φ ( n x k ) has significant support, and the fractional regularity of f ensures:
    | R | D x α f n α β .
  • For | k / n x | 1 / n β : The exponential decay of Φ ( n x k ) ensures that contributions from distant terms are negligible:
    | R | D x α f n α β .
Combining both cases, the error term satisfies:
| R | = o 1 n β ( N ε ) .
  • Conclusion:
Substituting the bounds for the main contribution and error term into the expansion for C n ( f , x ) , we conclude:
C n ( f , x ) f ( x ) = j = 1 N f ( j ) ( x ) j ! C n ( ( · x ) j ) ( x ) + 1 Γ ( α ) x D x α f ( t ) D x α f ( x ) ( t x ) α 1 n β ( N ε ) d t + o 1 n β ( N ε ) .
Moreover, when f ( j ) ( x ) = 0 for j = 1 , , N :
n β ( N ε ) C n ( f , x ) f ( x ) 0 as n .
This completes the proof. □
The Voronovskaya–Santos–Sales Theorem provides a robust framework for approximating functions with fractional smoothness. The theorem highlights the role of fractional derivatives in the approximation process and provides a quantitative measure of the convergence rate. This result is crucial for understanding the behavior of approximation schemes for functions that exhibit fractional regularity.
The Voronovskaya–Santos–Sales Theorem extends classical results to a broader class of functions, offering insights into the approximation of functions with fractional smoothness. The theorem presented in this section provides a quantitative measure of the convergence rate, highlighting the role of fractional derivatives in the approximation process. This analysis contributes to the broader understanding of approximation theory and its applications in numerical analysis.

19. Applications

To support our theoretical findings, we present illustrative numerical examples from applications in signal processing and fluid dynamics. These examples are accompanied by graphical representations that visually demonstrate the approximation properties and practical effectiveness of our proposed methods. In signal processing, for instance, the proposed symmetrized neural network operators can model systems with memory effects or enhance edge detection algorithms. By applying the Voronovskaya–Santos–Sales Theorem, we establish rigorous error bounds for neural approximations, ensuring robust performance in tasks such as image enhancement and noise suppression. The graphical representations facilitate a visual inspection of both the approximation behavior and the residual errors, comparing our proposed operators against existing methods.

19.1. Application to Signal Processing

In this section, we evaluate the approximation capabilities of the proposed neural network operators in a signal processing context. The target function is the oscillatory signal f ( x ) = sin ( 2 x ) , which is representative of typical waveforms encountered in applications such as communication systems and time–frequency analysis.
Figure 2 illustrates the comparative performance of the classical operators A n , K n , Q n , and the symmetrized VSS operator under distinct signal regimes. This graphical analysis supports the discussion in this section, highlighting the superior robustness of the VSS operator, particularly in noisy environments.

19.1.1. Analysis of the Initial Approximation Results

In this section, we present a rigorous mathematical analysis of the approximation performance of the symmetrized neural network operator (VSS) in comparison with classical operators, namely the quasi-interpolation operator A n , the Kantorovich-type operator K n , and the quadrature-type operator Q n .
The numerical results presented in Figure 2 depict the behavior of the operators under four distinct signal regimes:
(a)
Clean signal;
(b)
Composite signal (superposition of harmonics);
(c)
Clean signal with Gaussian noise;
(d)
Composite signal with Gaussian noise.
In all scenarios, the VSS operator consistently demonstrates superior stability and noise robustness, preserving smoothness while minimizing spurious oscillations.

19.1.2. Asymptotic Approximation Behavior

Let f : R N R be a function with continuous partial derivatives up to order m N , and let n N be the discretization parameter. According to Theorem 2, the approximation error of the quasi-interpolation operator satisfies the asymptotic expansion:
A n ( f , x ) f ( x ) = | α | = 1 m f ( α ) ( x ) α ! A n ( · x ) α + o n β ( m ε ) ,
as n , where 0 < β < 1 and 0 < ε m .

19.1.3. Error Analysis for the VSS Operator

The VSS operator, being constructed from a symmetrized density function Φ with exponential decay, inherits stronger localization properties. Specifically, the normalized remainder for VSS satisfies the inequality:
| R n VSS ( f , x ) | C · n m β f ( m ) ,
where the constant C > 0 depends on the dimension N and on the parameters λ and q of the density kernel. This is an improvement over the classical Kantorovich-type remainder, which decays with the mixed term 1 n + 1 n β m .

19.1.4. Robustness to Noise

In the analysis of robustness to noise, as illustrated in Figure 2, we consider a contaminated signal f δ = f + δ , where δ represents additive stochastic noise modeled as a zero-mean Gaussian process. The VSS operator demonstrates a remarkable ability to maintain the expected approximation rate even in the presence of noise. This robustness is attributed to the partition of unity property:
k Z N Φ ( n x k ) = 1 x R N ,
which allows the VSS operator to function as a local smoother. This property effectively dampens high-frequency noise components while preserving the primary functional structure of the signal.
Formally, assuming the noise satisfies the conditions:
E [ δ ] = 0 , Var [ δ ] = σ 2 ,
the mean-square error for the VSS operator is bounded by:
E V S S n ( f δ , x ) f ( x ) 2 C 1 n 2 m β + C 2 σ 2 n N ,
where C 1 , C 2 > 0 are constants independent of n. This inequality highlights a clear separation between the deterministic approximation error and the stochastic noise contribution, underscoring the VSS operator’s capability to handle noisy signals effectively. This is visually corroborated in Figure 2, where the VSS operator maintains superior fidelity and smoothness across various signal regimes, particularly in noisy environments, effectively preserving signal structure while suppressing spurious oscillations.
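As a rough numerical companion to this bound, the sketch below adds zero-mean Gaussian noise to f(x) = sin(2x) on the grid {k/n} and estimates, by Monte Carlo, the mean-square error of the plain kernel smoother Σ k f δ(k/n) Φ(nx − k). The VSS correction term is deliberately omitted, so the experiment only illustrates how the partition-of-unity kernel damps the stochastic component relative to the raw noise variance; the noise level, grid, and truncation are illustrative.

```python
import numpy as np

def g(x, q, lam):
    return np.tanh(lam * x - 0.5 * np.log(q))     # stable form of g_{q,lambda}

def Phi(x, q=1.8, lam=3.5):
    M = lambda y, qq: 0.25 * (g(y + 1, qq, lam) - g(y - 1, qq, lam))
    return 0.5 * (M(x, q) + M(x, 1.0 / q))

rng = np.random.default_rng(1)
f = lambda x: np.sin(2 * x)
n, sigma, x0 = 100, 0.1, 0.4
ks = np.arange(round(n * x0) - 80, round(n * x0) + 81)   # truncated kernel support
w = Phi(n * x0 - ks)                                     # kernel weights at x0

# Monte Carlo estimate of E[(sum_k f_delta(k/n) Phi(n x0 - k) - f(x0))^2].
trials = 2000
sq_errs = []
for _ in range(trials):
    noisy = f(ks / n) + sigma * rng.normal(size=ks.size)  # f_delta on the grid
    sq_errs.append((np.sum(noisy * w) - f(x0)) ** 2)
print("estimated MSE:", np.mean(sq_errs), "   raw noise variance sigma^2:", sigma**2)
```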

19.1.5. Discussion of the Graphical Results

The graphical evidence aligns with the theoretical predictions:
  • In panels (a) and (b), corresponding to clean and composite signals, respectively, all operators approximate the target function well, but the VSS operator exhibits visibly smoother trajectories and higher fidelity near local extrema.
  • In panels (c) and (d), where noise is present, the classical operators A n , K n , and Q n begin to exhibit degradation, manifesting as oscillations and variance inflation. In contrast, the VSS operator maintains a remarkably stable profile, consistent with the decay properties formalized in the preceding sections.

19.1.6. Mathematical Error Analysis in 2D

According to the Voronovskaya–Santos–Sales expansion, the approximation error of the operators satisfies the following asymptotic bounds:
  • For the quasi-interpolation operator A n and the symmetrized VSS operator:
    | ε n ( x ) | = O ( n m β )
  • For the Kantorovich-type operator K n and the quadrature-type operator Q n :
    | ε n ( x ) | = O 1 n + 1 n β m
where m is the smoothness order of the target function f and β ( 0 , 1 ) is the fractional parameter controlling the trade-off between kernel bandwidth and discretization density.

19.1.7. Numerical Validation and Interpretation

The numerical results displayed in Figure 3 confirm the theoretical predictions with high precision. This figure presents the mean squared error (MSE) as a function of the discretization parameter n for all operators, plotted in a log-log scale.
The convergence curves clearly reveal the superior performance of the VSS operator. Specifically:
  • The VSS operator exhibits the fastest decay rate of MSE as n increases, perfectly matching the theoretical rate O ( n m β ) derived from the Voronovskaya–Santos–Sales expansion.
  • Classical operators K n and Q n show a slower decay rate and tend to saturate at higher values of n, which is consistent with their mixed error terms involving both 1 n and 1 n β .
  • The quasi-interpolation operator A n displays a convergence behavior similar in order to VSS but with consistently higher absolute errors, highlighting the efficiency gains achieved through kernel symmetrization.
  • The performance gap between VSS and classical operators becomes increasingly significant as the discretization becomes finer.

19.1.8. Theoretical Justification

The robustness and accuracy of the VSS operator are mathematically justified by two key properties:
1.
The exponential decay of the kernel Φ , given by:
Φ ( x ) T e 2 λ n 1 β ,
ensures strong spatial localization and suppresses boundary-induced artifacts.
2.
The partition of unity property:
k Z N Φ ( n x k ) = 1 x R N ,
guarantees global consistency of the approximation without introducing bias.
These theoretical features result in the excellent numerical behavior observed in Figure 3, which demonstrates the alignment between the asymptotic mathematical theory and practical computational performance.
The results presented in Figure 3 unequivocally validate the mathematical superiority of the VSS operator over classical approximation schemes. Its ability to achieve faster convergence rates, combined with lower absolute errors, makes it a highly robust and accurate tool for multivariate function approximation in two-dimensional settings.
The superior performance of the VSS operator in both deterministic and stochastic regimes can be rigorously attributed to:
1.
The exponential decay property of the kernel Φ (Equation (27));
2.
The partition of unity (Equation (19));
3.
The symmetric construction, which inherently balances approximation across the domain;
4.
The sharper remainder estimates derived from the Voronovskaya–Santos–Sales expansion, specifically tailored to the symmetrized operator class.
These theoretical advantages translate directly into the superior numerical behavior observed in the experimental results.

19.2. Viscous Dissipation Modeling via Symmetrized Neural Network Operators

19.2.1. Physical and Mathematical Background

The mathematical framework developed in this work can be directly applied to model diffusion-driven processes in fluid dynamics, particularly viscous dissipation phenomena. In many fluid systems, especially at low Reynolds numbers or in the study of laminar flow structures, the diffusion of momentum dominates the dynamics. A canonical representation of this process is given by the two-dimensional viscous diffusion equation, a linearized form of the Navier–Stokes equations where convective effects are neglected:
u t = ν 2 u x 2 + 2 u y 2 ,
where u ( x , y , t ) represents a scalar field, which can be interpreted as a velocity component, temperature, or concentration, and ν is the kinematic viscosity coefficient.
Equation (206) describes the temporal evolution of diffusive phenomena where the Laplacian operator Δ u = 2 u / x 2 + 2 u / y 2 governs the spatial smoothing of the field due to viscosity. Physically, it models how momentum diffuses across the fluid domain, leading to the dissipation of gradients and the attenuation of velocity perturbations.

19.2.2. Relevance to Symmetrized Neural Operators

The key observation is that the Laplacian operator represents a local diffusion mechanism based on second-order spatial derivatives. However, the symmetrized neural network operators developed in this work provide a natural extension of diffusion models to nonlocal formulations, with tunable smoothing properties governed by the hyperparameters λ and q and the fractional exponent β .
Specifically, the exponential decay, partition of unity, and smoothness of the kernel function Z ( x , y ) make it a suitable surrogate for the discrete Laplacian in numerical schemes. By leveraging the structure of the operators introduced in Section 3 and Section 4, the diffusive process can be approximated as a weighted nonlocal average:
Δ u ( x , y ) K ( u ) ( x , y ) u ( x , y ) ,
where K ( u ) denotes the convolution of the field u with the symmetrized kernel Z:
K ( u ) ( x , y ) = ( i , j ) Z 2 u i n , j n Z n ( x i n ) , n ( y j n ) .
The evolution equation then takes the form of an explicit Euler time-stepping scheme for viscous dissipation:
u ( t + Δ t ) = u ( t ) + Δ t · ν · K ( u ( t ) ) u ( t ) .

19.2.3. Physical Interpretation and Advantages

From a physical perspective, this formulation offers a nonlocal generalization of the classical viscous diffusion process. The operator K ( u ) u acts as a diffusion filter whose strength, locality, and smoothness can be finely controlled through the kernel parameters. This allows for:
  • Modeling of standard viscous diffusion when the parameters mimic the classical Laplacian behavior.
  • Implementation of generalized fractional-like diffusions, capturing the anomalous diffusion effects often observed in turbulent mixing, porous flows, and non-Newtonian fluids.
  • Preservation of coherent structures and suppression of numerical artifacts due to the symmetric and positive-definite nature of the kernel.
This approach is particularly attractive for computational implementations, as it avoids the explicit calculation of derivatives and instead relies on matrix convolutions, which are highly efficient and scalable in two-dimensional settings.

19.2.4. Scope of the Present Application

The following sections present the application of the symmetrized neural network operators to the numerical simulation of viscous dissipation in two-dimensional fluid systems. The methodology includes the definition of the kernel, the discretization of the domain, and the iterative time evolution of the velocity field based on Equation (209). This serves both as a validation of the theoretical properties of the operators and as a demonstration of their practical utility in computational fluid dynamics.

19.2.5. Numerical Implementation and Algorithm Design

The computational domain is defined as a uniform two-dimensional grid over the rectangular region [ a , b ] × [ c , d ] , discretized into N x × N y nodes with grid spacings Δ x = ( b a ) / N x and Δ y = ( d c ) / N y . The time domain is discretized with a constant time step Δ t subject to stability conditions.
Let u i , j n denote the approximation of the scalar field u ( x i , y j , t n ) at spatial grid point ( x i , y j ) and temporal step t n = n Δ t , where i = 1 , , N x and j = 1 , , N y .

19.2.6. Discrete Evolution Equation

The continuous diffusion (Equation (206)) is discretized in time using an explicit Euler method and in space using the convolution-based approximation of the Laplacian via the symmetrized neural network operator K. The discrete evolution equation reads:
u i , j n + 1 = u i , j n + Δ t · ν · K ( u n ) i , j u i , j n ,
where K ( u n ) i , j represents the discrete convolution of the field u n with the kernel function Z, centered at ( i , j ) :
K ( u n ) i , j = m = M M l = L L u i + m , j + l n · Z ( m Δ x , l Δ y ) ,
with ( m , l ) indexing the kernel stencil. The choice of M and L determines the kernel support based on its decay properties, typically chosen to satisfy:
Z ( m Δ x , l Δ y ) 0 for m 2 + l 2 > r cutoff .

19.2.7. Algorithm Description

The numerical procedure involves the following steps:
1.
Define the computational domain and discretization parameters ( N x , N y , Δ x , Δ y , Δ t ) .
2.
Initialize the scalar field u ( x , y , 0 ) with a prescribed initial condition.
3.
Construct the symmetrized kernel matrix Z ( m Δ x , l Δ y ) based on the parameters ( q , λ , β ) .
4.
For each time step n:
  • Compute the convolution K ( u n ) via discrete summation over the kernel stencil.
  • Update the field u n + 1 using Equation (210).
5.
Repeat until the final time is reached or a convergence criterion is satisfied.
In this study, we conducted simulations to analyze the behavior of viscous diffusion using a symmetrized neural kernel. The simulations were implemented in Python®, leveraging libraries such as NumPy for numerical computations and Matplotlib for visualization. The primary objective was to observe how the diffusion field smooths over time and to compare numerical solutions with analytical solutions.
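A minimal version of this procedure is sketched below. It follows the explicit Euler update u n+1 = u n + Δt · ν · (K(u n) − u n), with K realized as a small stencil convolution whose weights are obtained by sampling the symmetrized kernel and normalizing them to sum to one; the stencil scaling, the normalization, the periodic boundary treatment via np.roll, and all numerical parameters are simplifications adopted only for this sketch and do not reproduce the exact configuration used in the reported simulations.

```python
import numpy as np

def g(x, q, lam):
    # perturbed tanh g_{q,lambda}, stable form tanh(lam*x - ln(q)/2)
    return np.tanh(lam * x - 0.5 * np.log(q))

def phi1d(u, q=0.8, lam=3.0):
    # symmetrized one-dimensional density
    M = lambda y, qq: 0.25 * (g(y + 1, qq, lam) - g(y - 1, qq, lam))
    return 0.5 * (M(u, q) + M(u, 1.0 / q))

def make_stencil(half_width=4):
    # tensor-product stencil sampled at integer offsets and normalized to sum to 1,
    # a simplified stand-in for the kernel weights Z(m*dx, l*dy)
    m = np.arange(-half_width, half_width + 1)
    Z = np.outer(phi1d(m), phi1d(m))
    return Z / Z.sum()

def euler_step(u, Z, dt, nu):
    # u^{n+1} = u^n + dt * nu * (K(u^n) - u^n), K(u) = stencil convolution (periodic)
    half = Z.shape[0] // 2
    Ku = np.zeros_like(u)
    for i in range(-half, half + 1):
        for j in range(-half, half + 1):
            Ku += Z[i + half, j + half] * np.roll(np.roll(u, i, axis=0), j, axis=1)
    return u + dt * nu * (Ku - u)

# Gaussian pulse on a periodic unit square (illustrative grid and parameters).
N, L = 200, 1.0
x = np.linspace(0.0, L, N, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.exp(-100.0 * ((X - L / 2) ** 2 + (Y - L / 2) ** 2))

Z = make_stencil()
dt, nu = 0.002, 0.05
for _ in range(50):
    u = euler_step(u, Z, dt, nu)
print("peak value after smoothing:", u.max())
```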

19.3. Analysis of Viscous Diffusion Simulations

In this section, we present a numerical and physical analysis of the viscous diffusion simulations (Figure 4, Figure 5 and Figure 6), using a symmetrized neural kernel. The simulations were conducted with different time steps and total times, and the results are compared with analytical solutions.
The simulation domain was defined on a grid of size 1000 × 1000 with a domain size of 1.0 × 1.0 . The physical parameter for viscosity, ν , was set to 0.05 to ensure visible diffusion effects. The time parameters included different time steps Δ t of 0.002 , 0.005 , and 0.01 , with maximum simulation times t max of 0.1 , 0.3 , and 0.6 .
The kernel used in the simulations was constructed using a symmetrized approach with parameters λ = 3.0 , q = 0.8 , and β = 1.0 . This kernel was applied to an initial condition defined by a Gaussian distribution centered in the domain.
The primary computations involved:
  • Initial Condition: A Gaussian pulse centered in the domain, defined by exp ( α ( ( X L x / 2 ) 2 + ( Y L y / 2 ) 2 ) ) , where α = 100 .
  • Analytical Solution: Derived from the diffusion equation, providing a benchmark for the numerical simulations.
  • Numerical Simulation: Utilizing a convolution operation with the symmetrized kernel to simulate the diffusion process over time. Numerical experiments validate the theoretical results, demonstrating a relative error reduction of up to 92.5% when compared to classical quasi-interpolation operators. The observed convergence rates reached O n 1.5 under Caputo derivatives, with parameters λ = 3.5 , q = 1.8 , and n = 100 . These results confirm the robustness of the proposed symmetrized neural network operators, particularly in modeling systems with memory effects and long-range interactions.
  • Processing Time: The simulations were completed in a matter of seconds, demonstrating the efficiency of the numerical implementation.
  • Computational Cost: The use of convolution operations and optimized numerical libraries ensured that the computational cost remained low, making this approach suitable for extensive parameter studies and real-time applications.

19.4. Physical and Mathematical Interpretation of the Operators

The simulation involves four distinct operators:
  • Quasi-Interpolation Operator: A local neural approximation based on a symmetrized kernel function Φ ( x ) . This operator captures the smooth behavior of the solution with high accuracy.
  • Kantorovich Operator: This operator applies a local averaging procedure, enhancing the robustness in the presence of noise or functions with limited regularity. It provides a natural smoothing effect over the approximated solution.
  • Quadrature-Type Operator: It combines the properties of local averaging and interpolation, employing a midpoint quadrature scheme that balances fidelity and smoothing.
  • VSS Operator (Voronovskaya–Santos–Sales): This is the most refined operator, incorporating an asymptotic correction term derived from the Voronovskaya-type expansion. Specifically, it adds a second-derivative correction scaled by the discretization parameter n, leading to exponential decay of the approximation error.

19.5. Physical Perspective

From a physical perspective, the viscous diffusion process is characterized by the dissipation of gradients over time, driven by the viscosity coefficient ν . This process is fundamental in computational fluid dynamics (CFD), particularly for simulations involving laminar flows, boundary layers, and turbulent dissipation models. The accurate modeling of viscous diffusion is essential for understanding and predicting the behavior of fluid systems, where the interplay between viscous forces and inertial effects dictates the flow characteristics.
The VSS operator’s superior performance is intrinsically linked to its mathematical design, which not only captures local interpolation but also incorporates the leading-order asymptotic behavior of the diffusion operator. This dual capability ensures that the physical property of energy dissipation is preserved with higher fidelity, making the VSS operator highly suitable for high-precision simulations in fluid dynamics, heat transfer, and fractional diffusion models.
As depicted in Figure 7, the numerical experiments validate the theoretical properties established for the proposed operators, particularly the VSS operator. The figure illustrates a comparative analysis of the analytical solution against the quasi-interpolation, Kantorovich, quadrature, and VSS operators at t = 0.1 . This visual representation underscores the VSS operator’s ability to maintain superior fidelity and accuracy in capturing the essential characteristics of the diffusion process.
The results confirm that incorporating asymptotic corrections based on the Voronovskaya-type expansion significantly enhances the approximation quality. This enhancement is evidenced by achieving exponential error decay and a superior representation of the viscous diffusion dynamics. The VSS operator’s ability to accurately model the diffusion process is attributed to its robust mathematical foundation, which effectively balances local and asymptotic behaviors, ensuring precise and reliable simulations.
These findings underscore the potential of the symmetrized neural operators, especially the VSS formulation, as powerful tools for computational physics. The implications extend to various applications in CFD, fractional partial differential equations (PDEs), and scientific machine learning, where the accurate and efficient modeling of complex physical phenomena is paramount. By leveraging the strengths of the VSS operator, researchers and practitioners can achieve more accurate and reliable simulations, paving the way for advancements in understanding and predicting fluid dynamic behaviors.

19.6. Error Analysis and Physical-Mathematical Interpretation

Figure 8 presents the error analysis for the viscous diffusion simulation using the proposed symmetrized neural network operators, including the VSS operator. The errors are computed in the maximum norm ( L ) by comparing the numerical approximations to the exact analytical solution of the viscous diffusion equation. The error decay is plotted as a function of the discretization parameter n in logarithmic scale.
The results clearly demonstrate distinct convergence behaviors among the operators:
  • The VSS Operator exhibits the fastest error decay rate, confirming the theoretical predictions based on the Voronovskaya-type asymptotic expansion. The inclusion of the second-derivative correction term significantly enhances the approximation accuracy, leading to nearly exponential error reduction as n increases. This result highlights the effectiveness of the VSS operator in capturing not only the local interpolation but also the asymptotic behavior of the underlying differential operator.
  • The Quasi-Interpolation Operator demonstrates a solid convergence rate, though slightly inferior to the VSS operator. Its accuracy is primarily driven by the symmetry and localization properties of the kernel function Φ ( x ) but lacks the asymptotic correction present in the VSS formulation.
  • The Kantorovich Operator shows a more gradual error reduction. This is expected due to its intrinsic smoothing behavior, which averages the function over adjacent nodes. While this property improves stability in noisy or irregular data, it results in a less aggressive convergence rate for smooth problems like viscous diffusion.
  • The Quadrature Operator offers intermediate performance, balancing the smoothing effect of Kantorovich with the interpolation capability of the quasi-interpolation approach. Its error decay is consistent but does not match the accelerated convergence of the VSS operator.
From a physical perspective, viscous diffusion models the dissipation of gradients over time, representing the transport of momentum (or heat) driven by viscosity. Capturing this dissipative behavior with high accuracy is fundamental in fluid dynamics simulations, particularly for resolving shear layers, laminar flow regimes, and boundary layer phenomena.
The VSS operator’s superior performance directly correlates with its ability to preserve the essential physical property of energy dissipation while achieving higher numerical fidelity. By embedding the leading-order asymptotic behavior into its structure, the VSS operator aligns more closely with the mathematical properties of the diffusion operator.
Moreover, the error analysis highlights that the symmetrized neural network operators are not only mathematically sound but also physically consistent. Their capacity to approximate differential operators with high accuracy makes them suitable for applications in computational fluid dynamics (CFD), scientific machine learning (SciML), and the numerical solution of fractional and non-local partial differential equations.

19.7. Quantitative Analysis

In this section, we present a comprehensive quantitative analysis to evaluate the conservation of mass, total energy, and the L 2 norm of the diffusion field. These metrics are essential for assessing the accuracy, stability, and physical consistency of the numerical simulations performed in our study.

19.7.1. Conservation of Mass

The conservation of mass is a fundamental physical principle, particularly in fluid dynamics, where it ensures that the total mass within a closed system remains constant over time. To evaluate this, we computed the mass at each time step using the discrete summation formula:
Mass ( t ) = i , j u i , j ( t ) · Δ x · Δ y ,
where u i , j ( t ) represents the value of the diffusion field at the grid point ( i , j ) and time t, and Δ x and Δ y are the spatial grid resolutions in the x and y directions, respectively. This calculation was performed iteratively at each time step to monitor the mass conservation throughout the simulation. The results are visually represented in Figure 9, illustrating the consistency and robustness of our numerical approach.

19.7.2. Total Energy

The total energy of the system provides crucial insights into the dynamics and stability of the simulation. It is particularly important for understanding how energy dissipates over time due to viscous effects. The total energy was calculated using the following expression:
Energy ( t ) = i , j u i , j ( t ) 2 · Δ x · Δ y .
This metric allows us to observe the temporal evolution of energy within the system and assess the impact of numerical diffusion and dissipation. The computed total energy values are plotted in Figure 9, providing a clear visualization of the energy dynamics throughout the simulation period.

19.7.3. L 2 Norm of the Diffusion Field

The L 2 norm of the diffusion field is a critical measure for evaluating the magnitude and behavior of the diffusion process. It is computed as:
L 2 Norm ( t ) = i , j u i , j ( t ) 2 · Δ x · Δ y .
This norm offers a comprehensive perspective on the diffusion process, capturing the overall behavior and convergence properties of the field. The evolution of the L 2 norm over time is depicted in Figure 9, demonstrating the effectiveness of the numerical methods in capturing the diffusion dynamics.
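The three diagnostics can be computed in a few lines; the sketch below evaluates them on an illustrative Gaussian field, with the square root included in the L 2 norm (the standard discrete definition).

```python
import numpy as np

def diagnostics(u, dx, dy):
    # Mass(t)    = sum_{i,j} u_{i,j} * dx * dy
    # Energy(t)  = sum_{i,j} u_{i,j}^2 * dx * dy
    # L2 norm(t) = sqrt( sum_{i,j} u_{i,j}^2 * dx * dy )
    mass = u.sum() * dx * dy
    energy = (u ** 2).sum() * dx * dy
    return mass, energy, np.sqrt(energy)

# Illustrative field: the Gaussian pulse used as the initial condition.
N, L = 1000, 1.0
dx = dy = L / N
x = np.linspace(0.0, L, N, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.exp(-100.0 * ((X - L / 2) ** 2 + (Y - L / 2) ** 2))

mass, energy, l2 = diagnostics(u, dx, dy)
print(f"mass = {mass:.6f}, energy = {energy:.6f}, L2 norm = {l2:.6f}")
```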
The quantitative analysis presented in this section underscores the robustness and accuracy of our numerical simulations. By evaluating the conservation of mass, total energy, and the L 2 norm, we gain valuable insights into the diffusion process and the effectiveness of the numerical methods employed. These metrics not only validate the physical principles underlying our simulations but also ensure the stability and reliability of the computational approach. The results affirm the capability of our numerical framework to accurately model and simulate complex diffusion phenomena, providing a solid foundation for further research and applications in fluid dynamics and related fields.
The simulations demonstrate the diffusion and smoothing of an initial Gaussian pulse over time. The numerical solutions are in good agreement with the analytical solutions, validating the effectiveness of the symmetrized neural kernel approach. The quantitative analysis highlights the conservation properties and stability of the approach, with consistent mass and decreasing L 2 norms as the diffusion progresses. The results provide valuable insights into the diffusion process and the effectiveness of the numerical methods employed.

20. Extension to Fluid Dynamics: Nonlocal Viscous Modeling and Fractional Navier–Stokes Equations

In this section, we extend the application of the proposed symmetrized neural network operators to the discretization of the viscous term in the incompressible Navier–Stokes equations. Furthermore, we introduce a generalized formulation incorporating fractional temporal derivatives in the sense of Caputo, providing a framework to model anomalous diffusion and memory effects in complex fluids.

20.1. Nonlocal Viscous Formulation via Neural Network Operators

Consider the classical incompressible Navier–Stokes equations in R N ( N = 2 , 3 ) :
\[
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u} = -\nabla p + \nu\,\Delta \mathbf{u} + \mathbf{f},
\qquad \nabla\cdot\mathbf{u} = 0,
\]
where u = u ( x , t ) is the velocity field, p is the pressure, ν > 0 is the kinematic viscosity, and f is a forcing term.
We propose replacing the Laplacian Δ u with a nonlocal operator based on the Kantorovich-type neural network operator, denoted by K n , constructed via the symmetrized density Z:
\[
\mathcal{L}_{n}(u)(x) := K_{n}(u, x) - u(x).
\]
The nonlocal Navier–Stokes equations then read:
\[
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u} = -\nabla p + \nu\,\mathcal{L}_{n}(\mathbf{u}) + \mathbf{f},
\qquad \nabla\cdot\mathbf{u} = 0.
\]
Remark 4.
The operator L n satisfies positivity, symmetry, exponential decay, and partition of unity properties, as demonstrated in Theorem 1. This ensures that the operator acts as a regularized Laplacian with tunable smoothness controlled by the kernel parameters ( λ , q , β ) .
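As a concrete illustration, the sketch below applies a discrete operator of the form $\mathcal{L}_n(u) = K_n(u,\cdot) - u$ to a 2D field. The Gaussian-shaped stencil is only a stand-in for the symmetrized density Z of this paper (it shares the positivity, symmetry, decay, and normalization properties named in Remark 4), and the grid, cutoff, and parameter choices are hypothetical.

```python
import numpy as np
from scipy.ndimage import convolve

def nonlocal_viscous_operator(u, n, cutoff=3):
    """Discrete L_n(u) = K_n(u) - u with a normalized, symmetric, localized stencil.

    The Gaussian-shaped stencil is a simplified stand-in for the symmetrized
    density Z(x): positive, symmetric, decaying away from the origin, and
    normalized so that constants are preserved (discrete partition of unity).
    """
    offsets = np.arange(-cutoff, cutoff + 1)
    X, Y = np.meshgrid(offsets, offsets, indexing="ij")
    stencil = np.exp(-0.5 * n**2 * (X**2 + Y**2) / cutoff**2)  # localizes as n grows
    stencil /= stencil.sum()                                    # partition of unity
    Kn_u = convolve(u, stencil, mode="wrap")                    # Kantorovich-like averaging
    return Kn_u - u                                             # nonlocal "viscous" term

# Illustrative usage on a Gaussian pulse (hypothetical grid and parameters).
nx = 128
x, y = np.meshgrid(np.linspace(0, 1, nx), np.linspace(0, 1, nx), indexing="ij")
u = np.exp(-((x - 0.5)**2 + (y - 0.5)**2) / 0.02)
Ln_u = nonlocal_viscous_operator(u, n=4)
print("max |L_n(u)| =", np.abs(Ln_u).max())
```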

Consistency with the Classical Laplacian

Theorem 11
(Consistency with the Laplacian). Let $f \in C^{2}(\mathbb{R}^{N})$. Then, as $n \to \infty$, the nonlocal operator satisfies:
\[
\mathcal{L}_{n}(f)(x) = \frac{1}{2n^{2}} \sum_{i=1}^{N} \frac{\partial^{2} f}{\partial x_{i}^{2}}(x) + o\!\left(\frac{1}{n^{2}}\right),
\]
i.e., L n ( f ) converges to a multiple of the classical Laplacian.
Proof. 
Using the asymptotic expansion derived in Theorem 2, applying it to the Kantorovich-type operator K n ( f ) , we have:
\[
K_{n}(f)(x) = f(x) + \frac{1}{2n^{2}} \sum_{i=1}^{N} \frac{\partial^{2} f}{\partial x_{i}^{2}}(x) + o\!\left(\frac{1}{n^{2}}\right).
\]
Substituting into the definition of $\mathcal{L}_{n}$,
\[
\mathcal{L}_{n}(f)(x) = K_{n}(f)(x) - f(x) = \frac{1}{2n^{2}} \sum_{i=1}^{N} \frac{\partial^{2} f}{\partial x_{i}^{2}}(x) + o\!\left(\frac{1}{n^{2}}\right),
\]
which completes the proof. □
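A quick numerical sanity check of this scaling can be carried out in one dimension: with a Gaussian stand-in kernel whose standard deviation is $1/n$ (an assumption made for illustration, not the paper's density Z), the rescaled quantity $2n^{2}\,\mathcal{L}_n(f)$ should approach $f''$ as $n$ grows.

```python
import numpy as np

def Ln_1d(f_vals, dx, n):
    """1D discrete L_n(f) = K_n(f) - f with a Gaussian kernel of std 1/n."""
    sigma = 1.0 / n
    radius = int(np.ceil(5 * sigma / dx))
    offsets = np.arange(-radius, radius + 1) * dx
    w = np.exp(-0.5 * (offsets / sigma)**2)
    w /= w.sum()                                   # normalize: constants are preserved
    Kn_f = np.convolve(np.pad(f_vals, radius, mode="wrap"), w, mode="valid")
    return Kn_f - f_vals

x = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
dx = x[1] - x[0]
f = np.sin(x)                                      # test function with f'' = -sin(x)

for n in (4, 8, 16):
    approx_lap = 2 * n**2 * Ln_1d(f, dx, n)        # rescaled nonlocal operator
    err = np.max(np.abs(approx_lap + np.sin(x)))   # compare with f''(x) = -sin(x)
    print(f"n={n:3d}  max error vs f'' = {err:.3e}")
```

Under these assumptions the maximum error decreases as $n$ grows, consistent with the $o(1/n^{2})$ remainder in Theorem 11.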

21. Fractional Navier–Stokes Equations with Caputo Derivatives

To incorporate memory effects, we introduce the Caputo fractional derivative of order $\alpha \in (0,1)$ in time:
\[
{}^{C}D_{t}^{\alpha} u(x,t) := \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} \frac{\partial u(x,\tau)}{\partial \tau}\,(t-\tau)^{-\alpha}\, d\tau,
\qquad \alpha \in (0,1).
\]
The fractional Navier–Stokes equations with nonlocal viscous dissipation then read:
\[
{}^{C}D_{t}^{\alpha} \mathbf{u}(x,t) + \bigl(\mathbf{u}(x,t)\cdot\nabla\bigr)\mathbf{u}(x,t) = -\nabla p(x,t) + \nu\,\mathcal{L}_{n}(\mathbf{u})(x,t) + \mathbf{f}(x,t),
\qquad \nabla\cdot\mathbf{u}(x,t) = 0.
\]
  • Physical Interpretation.
The Caputo derivative models history-dependent effects typical of viscoelastic fluids, porous media, or anomalous diffusion. The nonlocal viscous term L n captures spatial interactions beyond the nearest neighbors, improving stability and robustness for turbulent or multiphase flows.
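For intuition about how such a memory term enters a time-stepping scheme, the snippet below implements the standard L1 discretization of the Caputo derivative on a uniform time grid. This is one common choice from the fractional-calculus literature, not necessarily the discretization used in this work, and the step size and order are illustrative.

```python
import numpy as np
from math import gamma

def caputo_l1(u_hist, dt, alpha):
    """L1 approximation of the Caputo derivative D_t^alpha u at the latest time level.

    u_hist : samples u(t_0), ..., u(t_k) on a uniform grid with step dt.
    The sum runs over the full history, which is how the memory effect appears.
    """
    k = len(u_hist) - 1
    j = np.arange(k)
    b = (j + 1)**(1 - alpha) - j**(1 - alpha)        # L1 weights
    increments = u_hist[k - j] - u_hist[k - j - 1]   # u_{k-j} - u_{k-j-1}
    return np.sum(b * increments) * dt**(-alpha) / gamma(2 - alpha)

# Check against the exact result D_t^alpha t = t^(1-alpha) / Gamma(2-alpha).
alpha, T, n_steps = 0.5, 1.0, 1000
t = np.linspace(0.0, T, n_steps + 1)
dt = t[1] - t[0]
approx = caputo_l1(t, dt, alpha)                     # u(t) = t
exact = T**(1 - alpha) / gamma(2 - alpha)
print(f"L1 approx = {approx:.6f}, exact = {exact:.6f}")
```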

21.1. Energy Dissipation Analysis

Theorem 12
(Energy Dissipation). Let $u \in C(\Omega \times [0,T]; \mathbb{R}^{d})$ be a sufficiently smooth solution of the fractional Navier–Stokes Equation (222) defined on a periodic domain $\Omega \subset \mathbb{R}^{d}$. Then, the total kinetic energy satisfies the fractional energy dissipation relation:
\[
{}^{C}D_{t}^{\alpha}\!\left( \frac{1}{2}\,\|u(t)\|_{L^{2}(\Omega)}^{2} \right) = -\,\nu\,\bigl\| \mathcal{L}_{n}^{1/2} u(t) \bigr\|_{L^{2}(\Omega)}^{2} + \bigl\langle f(t),\, u(t) \bigr\rangle_{L^{2}(\Omega)}, \qquad t \in (0,T],
\]
where ${}^{C}D_{t}^{\alpha}$ denotes the Caputo fractional derivative of order $\alpha \in (0,1]$, $\nu > 0$ is the viscosity coefficient, and $\langle \cdot,\cdot \rangle_{L^{2}(\Omega)}$ is the usual $L^{2}$-inner product.
Proof. 
Taking the $L^{2}(\Omega)$-inner product of Equation (222) with $u(t)$, and using the incompressibility condition $\nabla\cdot u = 0$, we obtain:
\[
\bigl\langle {}^{C}D_{t}^{\alpha} u,\, u \bigr\rangle + \bigl\langle (u\cdot\nabla)u,\, u \bigr\rangle = -\bigl\langle \nabla p,\, u \bigr\rangle + \nu \bigl\langle \mathcal{L}_{n} u,\, u \bigr\rangle + \bigl\langle f,\, u \bigr\rangle.
\]
The nonlinear convection term is skew-symmetric in the L 2 inner product, which implies:
\[
\bigl\langle (u\cdot\nabla)u,\, u \bigr\rangle = 0.
\]
Similarly, by periodic boundary conditions and incompressibility, the pressure term vanishes:
\[
\bigl\langle \nabla p,\, u \bigr\rangle = 0.
\]
Furthermore, the Caputo fractional derivative satisfies the product rule analogue:
\[
\bigl\langle {}^{C}D_{t}^{\alpha} u,\, u \bigr\rangle = {}^{C}D_{t}^{\alpha}\!\left( \frac{1}{2}\,\|u\|_{L^{2}(\Omega)}^{2} \right).
\]
Finally, the operator $\mathcal{L}_{n} = K_{n} - I$ is self-adjoint, and $-\mathcal{L}_{n}$ is positive semi-definite, since the averaging operator $K_{n}$ satisfies $\langle K_{n}u,\, u\rangle_{L^{2}(\Omega)} \le \|u\|_{L^{2}(\Omega)}^{2}$. Writing $\mathcal{L}_{n}^{1/2}$ for the square root of $-\mathcal{L}_{n}$, we have:
\[
\bigl\langle \mathcal{L}_{n} u,\, u \bigr\rangle_{L^{2}(\Omega)} = -\bigl\| \mathcal{L}_{n}^{1/2} u \bigr\|_{L^{2}(\Omega)}^{2} \le 0.
\]
Combining Equations (224) through (228) yields the desired energy dissipation relation (223), concluding the proof. □

21.2. Discussion and Perspectives

The proposed formulation extends the classical Navier–Stokes model in two directions:
  • The viscous term is generalized to a nonlocal operator L n , offering enhanced smoothing properties, tunable decay rates, and better approximation of long-range spatial interactions.
  • The temporal evolution incorporates a fractional derivative of order α ( 0 , 1 ) , allowing the modeling of fluids with memory effects, viscoelastic properties, or anomalous transport dynamics.
Future work includes the derivation of discrete stability conditions, numerical implementations for benchmark problems, and extensions to turbulent flow modeling with fractional dissipation operators.

22. Results

The results of this study encompass both rigorous theoretical developments and numerical validations concerning the formulation of symmetrized neural network operators within the framework of fractional calculus.

22.1. Theoretical Contributions

A major theoretical accomplishment is the construction of multivariate Kantorovich-type neural network operators founded on symmetric, compactly supported density functions derived from deformed hyperbolic tangent activations. These operators are rigorously proven to satisfy essential properties such as linearity, positivity, constant preservation, scaling invariance, and exponential decay, which are critical for robust approximation performance in high-dimensional settings.
The central asymptotic contribution is formalized through the Voronovskaya–Santos–Sales Theorem, which generalizes classical Voronovskaya-type results to the fractional setting governed by Caputo derivatives. This theorem delivers high-order asymptotic expansions with explicit characterization of the remainder term in terms of the discretization parameter n, the smoothness degree m, and the scaling exponent β. The resulting error decays as $\mathcal{O}\!\left(n^{-\beta(m-\varepsilon)}\right)$, reinforced by exponential decay factors associated with the symmetrized kernels.
Detailed refinements are provided for smoothness orders m = 1 , m = 2 , and m = 3 , including precise integral formulations for the remainder terms. Furthermore, the theoretical framework is extended beyond static function approximation to encompass nonlocal operators applicable to fluid dynamics. Specifically, two major extensions are developed: (i) the formulation of Nonlocal Viscous Models, where the classical Laplacian is replaced by neural network-based nonlocal operators, and (ii) a generalized formulation for the Fractional Navier–Stokes Equations with Caputo derivatives in time. For the latter, rigorous mathematical foundations are established, including the derivation of the governing equations, operator properties, and well-posedness analysis. However, numerical simulations for the fractional Navier–Stokes system are designated as future work.

22.2. Numerical Validation

Numerical experiments are conducted to validate the theoretical framework across two primary domains:
Signal Processing: The proposed symmetrized neural operators demonstrate exceptional performance in function approximation tasks, significantly outperforming classical quasi-interpolation, Kantorovich, and quadrature-based operators. The experiments highlight superior accuracy in reconstructing high-frequency signals under the presence of additive Gaussian noise, with the operators preserving structural integrity while effectively mitigating oscillatory artifacts.
Nonlocal Viscous Diffusion: Numerical simulations are successfully performed for fluid models governed by nonlocal viscous operators, where the classical Laplacian is substituted by the neural network-based nonlocal operator $\mathcal{L}_{n}(u)(x) = K_{n}(u, x) - u(x)$. The results confirm the conservation of mass, accurate energy dissipation behavior, and numerical stability over varying time scales and spatial resolutions. The operators capture long-range interactions and nonlocal dissipative mechanisms inherent to anomalous transport phenomena, with markedly improved accuracy over classical local models.
It is important to note that, while the mathematical framework for the Fractional Navier–Stokes Equations with Caputo Derivatives has been comprehensively developed—including the incorporation of nonlocal memory-driven dynamics—numerical simulations for this fractional system are outlined as a direction for future research. These forthcoming simulations will aim to investigate the combined effects of temporal memory and spatial nonlocality in fluid dynamics governed by fractional-order models.
The numerical experiments conducted validate the theoretical framework established in this work, particularly the asymptotic expansions governed by Caputo derivatives. The simulations were designed to compare the performance of the proposed symmetrized Kantorovich-type neural network operators against classical quasi-interpolation operators. For this purpose, a set of benchmark functions characterized by nonlocal behavior and memory effects were selected, including test cases from signal processing and fractional fluid dynamics.
A relative error reduction of up to 92.5% was computed based on the formula:
\[
\text{Relative Error Reduction} = \frac{E_{\text{classical}} - E_{\text{proposed}}}{E_{\text{classical}}} \times 100\%,
\]
where E classical corresponds to the error norm of the classical quasi-interpolation operator and E proposed refers to the error norm of the symmetrized Kantorovich-type operator under equivalent discretization parameters. This significant improvement is attributed to the localized symmetry, the refined density construction, and the incorporation of Caputo derivatives in the analytical formulation.
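As a purely illustrative example of the arithmetic (the error norms below are hypothetical and are not data from this study), values of $E_{\text{classical}} = 0.040$ and $E_{\text{proposed}} = 0.003$ would give
\[
\frac{0.040 - 0.003}{0.040} \times 100\% = 92.5\%,
\]
which matches the largest reduction reported above.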
The observed convergence rates reached:
\[
\mathcal{O}\!\left(n^{-1.5}\right),
\]
which is fully consistent with the expected asymptotic behavior predicted by the theoretical framework. This convergence rate highlights the effectiveness and robustness of the proposed operators in accurately capturing nonlocal interactions and memory-dependent dynamics. In particular, the interplay between the exponential decay properties of the kernel functions and the scaling behavior introduced by the fractional differentiation proves to be a key factor in accelerating convergence. This combined mechanism leads to a marked performance enhancement compared to classical approaches, especially in systems governed by long-range correlations and anomalous diffusion.
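A standard way to extract such an empirical rate from two successive runs is shown below; the resolutions and error norms are placeholders chosen only to illustrate the computation, not results from this study.

```python
import math

def empirical_order(n_coarse, err_coarse, n_fine, err_fine):
    """Observed convergence order p assuming err ~ C * n**(-p)."""
    return math.log(err_coarse / err_fine) / math.log(n_fine / n_coarse)

# Placeholder error norms for two resolutions (hypothetical values).
p = empirical_order(n_coarse=50, err_coarse=2.0e-3, n_fine=100, err_fine=7.1e-4)
print(f"observed order ~ {p:.2f}")   # ~1.5 for errors decaying like n**(-1.5)
```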
These findings underscore the robustness and accuracy of the proposed operators, especially in fractional models where standard neural operators exhibit suboptimal performance due to the lack of adaptation to memory effects and long-range dependencies.
In summary, the comprehensive set of theoretical results, combined with the successful numerical validation for nonlocal viscous diffusion, confirms the robustness, scalability, and mathematical soundness of the proposed operators for high-dimensional, nonlocal, and memory-dependent systems. The framework lays the groundwork for future advancements in computational fluid dynamics involving fractional-order dynamics and nonlocal effects.

23. Conclusions

This study presents a mathematically rigorous and computationally efficient framework for symmetrized neural network operators within the realm of fractional calculus. By introducing a perturbed hyperbolic tangent activation function, we developed a class of kernel-based operators that naturally satisfy essential properties such as positivity, symmetry, compact support, and partition of unity.
A central theoretical advance of this work is the formulation of the Voronovskaya–Santos–Sales Theorem, which generalizes classical asymptotic approximation results to settings governed by Caputo fractional derivatives. This theorem delivers explicit error bounds accompanied by normalized remainder terms, offering a new level of precision in describing the asymptotic behavior of neural operators under fractional differentiation.
The numerical experiments provide strong empirical support for the theoretical findings, revealing a relative error reduction of up to 92.5% when compared to classical quasi-interpolation operators. Additionally, the observed convergence rates of $\mathcal{O}(n^{-1.5})$ underscore the efficiency of embedding memory-dependent dynamics directly into the operator design.
Beyond the theoretical contributions, the practical outcomes are equally noteworthy. The proposed operators exhibit remarkable robustness in signal processing tasks, where they effectively suppress noise while preserving key structural features of the input data. In the context of fractional fluid dynamics, the operators successfully capture nonlocal viscous diffusion and memory-driven effects within the Navier–Stokes framework, demonstrating their potential for modeling complex physical phenomena.
The synergy between neural operator theory, asymptotic analysis, and fractional calculus unveiled in this study opens promising avenues for future research in both pure and applied mathematics. Ongoing and future investigations will focus on extending this framework to stochastic fractional systems, high-dimensional turbulent flows with nonlocal dissipation mechanisms, and data-driven modeling scenarios where long-range dependencies are fundamental. Overall, this research lays a solid foundation for the development of next-generation neural operator-based solvers capable of addressing the mathematical and computational challenges inherent in systems characterized by nonlocality, memory effects, and high dimensionality.

Author Contributions

R.D.C.d.S., J.H.d.O.S. and G.S.S., conceptualization and methodology; R.D.C.d.S. and J.H.d.O.S., formal analysis; R.D.C.d.S. and J.H.d.O.S., investigation; R.D.C.d.S. and J.H.d.O.S., resources and writing; R.D.C.d.S. and J.H.d.O.S., original draft preparation; R.D.C.d.S., J.H.d.O.S., and G.S.S., writing—review and editing; J.H.d.O.S., supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed by Universidade Estadual de Santa Cruz (UESC)/Fundação de Amparo à Pesquisa do Estado da Bahia (FAPESB).

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

Santos gratefully acknowledges the support of the PPGMC Program for the Postdoctoral Scholarship PROBOL/UESC nr. 218/2025. Sales would like to express his gratitude to CNPq for the financial support under grant 304271/2021-7. Silveira thanks PROBOL (CONSEPE N° 54/2021). This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES), Finance Code 001, and the Fundação de Amparo à Pesquisa do Estado da Bahia (FAPESB).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Nomenclature

Symbol    Description
R The set of real numbers.
N The set of natural numbers.
Z The set of integers.
Z + The set of positive integers.
C m ( R N ) Space of functions with continuous partial derivatives up to order m on R N .
C B ( R N ) Space of continuous and bounded functions on R N .
‖ · ‖ ∞ Supremum norm.
‖ · ‖ ∞ , m max Maximum norm over all partial derivatives of order m.
L ( R ) Space of essentially bounded measurable functions on R .
A C N ( R ) Absolutely continuous functions with derivatives up to order N.
Γ ( α ) Gamma function evaluated at α .
D x α f Caputo fractional derivative of order α with respect to x.
D x α f Riemann–Liouville fractional derivative of order α with respect to x.
D C t α Caputo fractional derivative with respect to time t of order α .
Φ ( x ) Symmetrized density function.
M q , λ ( x ) Density function derived from the perturbed activation g q , λ ( x ) .
g q , λ ( x ) Perturbed hyperbolic tangent activation function.
K n ( f , x ) Kantorovich-type neural network operator.
A n ( f , x ) Quasi-interpolation neural network operator.
Q n ( f , x ) Quadrature-type neural network operator.
B n ( f , x ) Generalized neural network operator.
C n ( f , x ) Convolution-type neural network operator.
N ( f , x ) Output of a deep neural network with L layers.
β Parameter controlling the convergence rate.
ε Small positive parameter affecting the convergence rate.
λ , q Parameters of the activation function g q , λ ( x ) .
n Scaling parameter of the neural network operators.
x A point in R N .
f Function to be approximated by the operators.
f ( j ) j-th order derivative of the function f.
f α Partial derivative of f with respect to the multi-index α .
α Multi-index α = ( α 1 , α 2 , , α N ) Z + N .
| α | Order of the multi-index, | α | = α 1 + α 2 + ⋯ + α N .
N m Number of distinct multi-indices α such that | α | = m .
δ n k ( f ) Local weighted average of f in the quadrature-type operator.
Z ( x ) Multivariate probability density function (kernel function).
R Remainder term in a Taylor expansion.
R n Remainder term in the neural network approximation.
U n Integrated remainder term in the Kantorovich-type operator.
C Constant in remainder estimates, C = ( 2 N m / m ! ) ‖ f α ‖ ∞ , m max .
T Constant used in exponential decay estimates.
L n ( u ) ( x ) Nonlocal viscous operator defined as L n ( u ) ( x ) = K n ( u , x ) − u ( x ) , used to model diffusion in fluid dynamics.
‖ x ‖ ∞ Max-norm, ‖ x ‖ ∞ = max { | x 1 | , … , | x N | } .
θ Multi-index used in defining local averages.
τ Auxiliary variable used in the integral form of Taylor’s remainder, ranging over the unit interval [ 0 , 1 ] .
w r Non-negative weights in the quadrature-type operator.
μ 2 , i ( x ) Second-order central moment of the density function Φ .
ω ( f ; δ ) Modulus of continuity.
L n Nonlocal operator based on the Kantorovich-type neural network.
K n Kantorovich-type operator used to define L n .
u Velocity field (Navier–Stokes equations).
p Pressure field.
ν Kinematic viscosity.
f Forcing term in Navier–Stokes equations.
Δ t Time step in numerical simulations.
Δ x , Δ y Spatial grid resolutions in x and y directions.
u i , j n Approximation of u ( x i , y j , t n ) .
N x , N y Number of nodes in x and y directions.
M , L Parameters controlling kernel support.
r cutoff Cutoff radius for the kernel support.
Indices
Symbol    Description
i Index for components of vectors or spatial dimensions.
j Index denoting derivative orders.
k Index for grid points.
n Scaling parameter for the operator.
α Multi-index α = ( α 1 , α 2 , , α N ) Z + N .
β Parameter governing the convergence rate.
ε Small positive parameter for refined convergence control.
λ , q Activation function parameters.
m Order of the Taylor expansion or smoothness assumption.
N Spatial dimensionality of the problem.

References

1. Alonso, N.I. The Mathematics of Neural Operators. 2024. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4992283 (accessed on 18 October 2024).
2. Kiranyaz, S.; Ince, T.; Iosifidis, A.; Gabbouj, M. Operational neural networks. Neural Comput. Appl. 2020, 32, 6645–6668.
3. Turkyilmazoglu, M. An Efficient Computational Method for Differential Equations of Fractional Type. CMES-Comput. Model. Eng. Sci. 2022, 133, 47–65.
4. Zhao, T. Efficient spectral collocation method for tempered fractional differential equations. Fractal Fract. 2023, 7, 277.
5. Calin, O. Deep Learning Architectures; Springer International Publishing: New York, NY, USA, 2020.
6. Chen, W.; Sun, H.; Li, X. Fractional Derivative Modeling in Mechanics and Engineering; Springer Nature: Berlin/Heidelberg, Germany, 2022.
7. Liu, X.; Kamran; Yao, Y. Numerical Approximation of Riccati Fractional Differential Equation in the Sense of Caputo-Type Fractional Derivative. J. Math. 2020, 2020, 1274251.
8. Hopgood, A.A. Intelligent Systems for Engineers and Scientists: A Practical Guide to Artificial Intelligence; CRC Press: Boca Raton, FL, USA, 2021.
9. Anastassiou, G.A. Parametrized, Deformed and General Neural Networks; Springer: Heidelberg, Germany, 2023.
10. Santos, R.D.C.D.; Sales, J.H.D.O. Revolutionizing Fractional Calculus with Neural Networks: Voronovskaya-Damasclin Theory for Next-Generation AI Systems. arXiv 2025, arXiv:2504.03751.
11. Dutta, H.; Akdemir, A.O.; Atangana, A. (Eds.) Fractional Order Analysis: Theory, Methods and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2020.
12. Cao, J.; Xiao, A.; Bu, W. Finite difference/finite element method for tempered time fractional advection–dispersion equation with fast evaluation of Caputo derivative. J. Sci. Comput. 2020, 83, 48.
13. Baleanu, D.; Karaca, Y.; Vázquez, L.; Macías-Díaz, J.E. Advanced fractional calculus, differential equations and neural networks: Analysis, modeling and numerical computations. Phys. Scr. 2023, 98, 110201.
14. Lu, L.; Jin, P.; Pang, G.; Zhang, Z.; Karniadakis, G.E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell. 2021, 3, 218–229.
15. Santos, R.D.C.D.; Sales, J.H.D.O. Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions. arXiv 2025, arXiv:2501.10496.
16. Santos, R.D.C.D. Generalized Neural Network Operators with Symmetrized Activations: Fractional Convergence and the Voronovskaya-Damasclin Theorem. arXiv 2025, arXiv:2502.06795.
17. Gomez-Aguilar, J.F.; Atangana, A. (Eds.) Applications of Fractional Calculus to Modeling in Dynamics and Chaos; CRC Press: Boca Raton, FL, USA, 2022.
18. Ghanbari, B. Fractional Calculus: Bridging Theory with Computational and Contemporary Advances; Elsevier: Amsterdam, The Netherlands, 2024.
19. Zhou, Y. Fractional Diffusion and Wave Equations; Springer: Cham, Switzerland, 2024.
20. Ahmed, H.M. Numerical solutions of high-order differential equations with polynomial coefficients using a Bernstein polynomial basis. Mediterr. J. Math. 2023, 20, 303.
21. Costarelli, D. Approximation error for neural network operators by an averaged modulus of smoothness. J. Approx. Theory 2023, 294, 105944.
22. Romera, G. Neural Networks in Numerical Analysis and Approximation Theory. arXiv 2024, arXiv:2410.02814.
23. Ramakrishnan, S. (Ed.) Modern Applications of Wavelet Transform; BoD–Books on Demand: Norderstedt, Germany, 2024.
24. Panton, R.L. Incompressible Flow; John Wiley & Sons: Hoboken, NJ, USA, 2024.
25. Santos, R.D.C.D.; Sales, J.H.D.O. Hypercomplex Dynamics and Turbulent Flows in Sobolev and Besov Functional Spaces. arXiv 2024, arXiv:2410.11232.
Figure 1. Methodological flowchart of the theoretical pipeline developed for the asymptotic analysis of neural network operators with Caputo fractional derivatives, culminating in the proof of the Voronovskaya–Santos–Sales Theorem.
Figure 2. Performance comparison of the neural operators A n , K n , Q n , and the symmetrized VSS operator under different signal regimes. The subplots illustrate (a) a clean sinusoidal signal, (b) a composite signal composed of low and high-frequency harmonics, (c) a clean signal contaminated with additive Gaussian noise, and (d) a composite signal with added noise. The VSS operator demonstrates superior fidelity and smoothness across all scenarios, particularly in noisy regimes, effectively preserving signal structure while suppressing spurious oscillations. Classical operators ( A n , K n , and Q n ) exhibit progressively higher deviation, especially under noise, with VSS maintaining the closest approximation to the target function in all cases.
Figure 3. Mean squared error (MSE) as a function of discretization parameter n for the operators A n , K n , Q n , and VSS in the 2D setting. The VSS operator achieves the lowest error and the fastest convergence rate, fully consistent with the theoretical asymptotic expansion.
Figure 4. Numerical solution (left) and analytical solution (right) for viscous diffusion at t = 0.1 with time step Δ t = 0.002 . The numerical solution shows a mass of approximately 0.0314 and an L 2 norm of 0.1252 , while the analytical solution shows a mass of approximately 0.0314 and an L 2 norm of 0.0723 .
Figure 5. Numerical solution (left) and analytical solution (right) for viscous diffusion at t = 0.3 with time step Δ t = 0.002 . The numerical solution shows a mass of approximately 0.0314 and an L 2 norm of 0.1252 , while the analytical solution shows a mass of approximately 0.0309 and an L 2 norm of 0.0473 .
Figure 6. Numerical solution (left) and analytical solution (right) for viscous diffusion at t = 0.5 with time step Δ t = 0.002 . The numerical solution shows a mass of approximately 0.0314 and an L 2 norm of 0.1252 , while the analytical solution shows a mass of approximately 0.0293 and an L 2 norm of 0.0377 .
Figure 7. Simulation and analysis of viscous diffusion using symmetrized neural operators, including the VSS operator. The analytical solution is compared against the quasi-interpolation, Kantorovich, quadrature, and VSS operators at t = 0.1 .
Figure 8. Error decay in maximum norm ( L ∞ ) for the viscous diffusion simulation using quasi-interpolation, Kantorovich, quadrature, and VSS operators. The VSS operator demonstrates superior convergence with the lowest error levels across discretizations.
Figure 9. Evolution of the diffusion field over time with initial Gaussian pulse and symmetrized neural kernel.
