Abstract
Basic neural network operators are interpreted as positive linear operators, and the related general theory applies to them. These operators are induced by a symmetrized density function derived from the parametrized and deformed hyperbolic tangent activation function. I work in the space of real-valued continuous functions on a compact interval of the real line. I study quantitatively the rate of convergence of these neural network operators to the unit operator. The studied inequalities involve the modulus of continuity of the function under approximation or of its derivative. Via these inequalities, I produce uniform and L_p (p ≥ 1) approximation results. The convexity of functions is also taken into consideration.
Keywords:
neural network operators; positive linear operators; modulus of continuity; quantitative approximation to the unit
MSC:
41A17; 41A25; 41A36
1. Introduction
Since 1985, I have extensively studied the quantitative approximation of positive linear operators to the unit; see, for example, [,,,], which are drawn upon in this work. I began with the quantitative weak convergence of finite positive measures to the unit Dirac measure, using geometric moment theory as the method (see []), and I produced best upper bounds leading to attained (i.e., sharp) Jackson-type inequalities; see, e.g., [,]. These studies can be taken in many directions, univariate and multivariate, though in this work I focus only on the univariate approach.
Also, in 1997, I started studying the quantitative convergence of neural network operators to the unit; since then, I have written numerous related articles and books; see, e.g., [,,], which I draw from here.
The neural network operators I have treated, across a wide range of settings, are by nature positive linear operators.
Here, for the first time in the literature, neural network operators are treated as positive linear operators.
So, all the methods of the theory of positive linear operators apply here to the defined summation neural network operators, producing new and interesting results: pointwise, uniform, and L_p (p ≥ 1) results. Via the Riesz representation theorem, neural networks are connected to measure theory. The convexity of functions produces optimal and sharper results.
The use of the perturbed hyperbolic tangent function as an activation function has been well established and supported in [] (Chapter 18). The hyperbolic tangent activation function is one of the most commonly used activation functions in this area.
The symmetrization method presented here aims to have the operators converge at a high speed to the unit operator using half the relevant data feed.
This article establishes a bridge for the first time connecting neural network approximation (part of AI) to positive linear operators (an important part of functional analysis).
So, this is a theoretical work intended mainly for relevant experts.
This study was greatly inspired by [,]. For classic studies on neural networks, I also recommend [,,,,,].
For newer work on neural networks, I also refer the reader to [,,,,,,,,,]. For recent studies on positive linear operators, see [,,,,,].
2. Basics
Here, we will follow [] (pp. 455–460).
The activation function to be used is
Above, λ is the parameter and q is the deformation coefficient, typically
For further reading, refer to Chapter 18 of [], “q-Deformed and λ-Parametrized Hyperbolic Tangent Based Banach Space Valued Ordinary and Fractional Neural Network Approximation”.
This chapter motivates the current work.
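For orientation, the q-deformed, λ-parametrized hyperbolic tangent that this chapter title refers to is commonly written as below; the display is included as an assumption for the reader's convenience, and the exact normalization used for the present operators is the one given in the cited chapter.

```latex
% Assumed form of the q-deformed, lambda-parametrized hyperbolic tangent (q, lambda > 0)
\[
  g_{q,\lambda}(x) \;=\; \frac{e^{\lambda x} - q\,e^{-\lambda x}}{e^{\lambda x} + q\,e^{-\lambda x}},
  \qquad x \in \mathbb{R},
\]
% which reduces to the classical \tanh x when q = 1 and \lambda = 1.
```

Note also the elementary identity g_{q,λ}(x) = tanh(λx − (ln q)/2), which makes the roles of q (a horizontal shift/deformation) and λ (a scaling parameter) transparent.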
The proposed “symmetrization method” aims to use half the data feed to the relevant neural networks.
We will employ the following density function:
; .
The result is that
is an even function, symmetric with respect to the y-axis.
By (18.18) of [], we have
sharing the same maximum at symmetric points.
By Theorem 18.1, p. 458 of [], we have that
Consequently, we derive that
By Theorem 18.2, p. 459 of [], we have that
so that
Therefore, is a density function.
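For the reader's convenience, one construction consistent with the properties just listed (evenness, shared maxima at symmetric points, unit integral) is sketched below; this is a hedged reconstruction based on the cited Chapter 18 and on the standard way such densities are built, not a verbatim quotation of the source, and the symbols M_{q,λ} and W_{q,λ} are assumed notation.

```latex
% Hedged sketch: density built from the activation, then symmetrized (assumed notation)
\[
  M_{q,\lambda}(x) \;=\; \tfrac{1}{4}\bigl[\, g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1) \,\bigr],
  \qquad
  W_{q,\lambda}(x) \;=\; \tfrac{1}{2}\bigl[\, M_{q,\lambda}(x) + M_{q,\lambda}(-x) \,\bigr].
\]
% W_{q,\lambda} is even by construction and inherits \int_{\mathbb{R}} W_{q,\lambda}(x)\,dx = 1.
```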
By Theorem 18.3, p. 459 of [], we have the following:
Let , and with ; . Then,
where
Similarly, we obtain that
Consequently, we obtain that
where
Here, ⌈·⌉ denotes the ceiling of the number and ⌊·⌋ its integral part.
We mention the following:
Theorem 18.4, p. 459, []: Let and so that . For , we consider , such that , and . Then,
Similarly, we consider , such that , and . Thus,
Hence,
and
Consequently, the following holds:
so that
that is,
We have proved the theorem.
Theorem 1.
Let and so that . For , we consider , such that , and . Also consider , such that , and . Then,
We make the following remark:
Remark 1.
(I) By Remark 18.5, p. 460 of [], we have
and
Therefore, it holds that
Hence,
even if
because then
which is equivalent to
which is true given that
(II) Let . For large n, we always have . Also, iff . So, in general, the following holds:
We need the following definition:
Definition 1.
Let and . We introduce and define the following real-valued symmetrized and perturbed positive linear neural network operators:
In fact, we have and . The modulus of continuity is defined by
In this work, , where is the supremum norm.
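Since this is the step where the operators are actually built, a minimal numerical sketch may help. It assumes the standard quasi-interpolation form A_n(f, x) = Σ_{k=⌈na⌉}^{⌊nb⌋} f(k/n) W(nx − k) / Σ_{k=⌈na⌉}^{⌊nb⌋} W(nx − k), with W an even, positive density as sketched in Section 2, and it also estimates the first modulus of continuity ω₁(f, δ) = sup_{|x−y|≤δ} |f(x) − f(y)| on a grid. All names (A_n, sym_density, the parameter values, etc.) are illustrative and not the source's notation.

```python
import math
import numpy as np

Q, LAM = 1.5, 1.0  # illustrative deformation and parameter values (assumptions)

def g(x, q=Q, lam=LAM):
    """Assumed q-deformed, lambda-parametrized hyperbolic tangent:
    (e^{lam x} - q e^{-lam x}) / (e^{lam x} + q e^{-lam x}) = tanh(lam*x - ln(q)/2)."""
    return np.tanh(lam * x - 0.5 * np.log(q))

def density(x, q=Q, lam=LAM):
    """Assumed density M_{q,lam}(x) = (1/4)[g(x+1) - g(x-1)]; positive, with unit integral."""
    return 0.25 * (g(x + 1.0, q, lam) - g(x - 1.0, q, lam))

def sym_density(x, q=Q, lam=LAM):
    """Symmetrized (even) density, a sketch of the construction: W(x) = (1/2)[M(x) + M(-x)]."""
    return 0.5 * (density(x, q, lam) + density(-x, q, lam))

def A_n(f, x, n, a=-1.0, b=1.0):
    """Assumed quasi-interpolation positive linear operator:
    A_n(f, x) = sum_{k=ceil(na)}^{floor(nb)} f(k/n) W(nx - k) / sum_k W(nx - k)."""
    ks = np.arange(math.ceil(n * a), math.floor(n * b) + 1)
    w = sym_density(n * x - ks)
    return float(np.dot(f(ks / n), w) / np.sum(w))

def modulus_of_continuity(f, delta, a=-1.0, b=1.0, grid=2001):
    """Grid estimate of omega_1(f, delta) = sup_{|x - y| <= delta} |f(x) - f(y)| on [a, b]."""
    xs = np.linspace(a, b, grid)
    fx = f(xs)
    h = (b - a) / (grid - 1)
    shift = max(1, int(round(delta / h)))
    return max(float(np.max(np.abs(fx[s:] - fx[:-s]))) for s in range(1, shift + 1))

if __name__ == "__main__":
    f = np.sin
    for n in (10, 50, 250):
        xs = np.linspace(-0.5, 0.5, 101)
        err = max(abs(A_n(f, x, n) - f(x)) for x in xs)
        print(f"n={n:4d}  sup error on [-0.5, 0.5] ~ {err:.2e}   omega_1(f, 1/n) ~ {modulus_of_continuity(f, 1.0 / n):.2e}")
```

Because the weights in this sketch are nonnegative and normalized to sum to 1, the map f ↦ A_n(f, ·) is a positive linear operator that reproduces constants, which is exactly the structure exploited in Section 3.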
3. Main Results
The following approximation results are valid.
Theorem 2.
Let . Then
so that uniformly.
Proof.
We estimate
That is,
Consequently, it holds that
By Theorem 7.1.7, p. 203, [], the Shisha–Mond inequality ([]) for positive linear operators, and , we have that
Thus, it holds that
proving the claim. □
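For the reader's convenience, the classical Shisha–Mond estimate for a positive linear operator L on C([a, b]) has the following shape (stated here in its textbook form; the exact constants and the second-moment quantity used above are those of the cited theorem):

```latex
% Classical Shisha--Mond estimate for a positive linear operator L on C([a,b])
\[
  \| Lf - f \|_{\infty}
  \;\le\;
  \| f \|_{\infty}\, \| L\mathbf{1} - \mathbf{1} \|_{\infty}
  \;+\;
  \| L\mathbf{1} + \mathbf{1} \|_{\infty}\;
  \omega_1\!\bigl( f, \mu \bigr),
  \qquad
  \mu \;=\; \Bigl\| \, L\bigl( (\,\cdot\, - x)^{2} \bigr)(x) \, \Bigr\|_{\infty}^{1/2}.
\]
```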
The following holds:
Theorem 3.
Let be a continuous and -periodic function with modulus of continuity . Here, denotes the sup-norm over , and the operators are acting on such f over ; , . Then,
Proof.
We want to estimate ():
We have proved that
Therefore, by [], the following inequalities hold:
hence proving the claim. □
We need the following definition.
Definition 2.
Let . The Peetre [] functional is defined as follows:
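In its standard form for the pair (C([a, b]), C¹([a, b])), which is the setting assumed here, the Peetre K-functional reads:

```latex
% Standard Peetre K-functional for the pair (C([a,b]), C^1([a,b]))
\[
  K\!\left( t, f;\, C([a,b]),\, C^{1}([a,b]) \right)
  \;=\;
  \inf_{g \in C^{1}([a,b])} \Bigl\{ \| f - g \|_{\infty} + t \,\| g' \|_{\infty} \Bigr\},
  \qquad t > 0 .
\]
```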
We now present the following theorem.
Theorem 4.
Let . Then,
as
The result is that , uniformly.
For the proof of (44), we use the following:
Corollary 1
([], Point-wise Approximation). Let and let L be a positive linear operator acting on satisfying , all .
Then, we have the attainable (i.e., sharp) inequality
Proof of Theorem 4.
We apply Inequality (45). Clearly, we have that
which is attainable (i.e., sharp). □
A trigonometric result follows.
Theorem 5.
Here, , , . Then,
- (i)
- (ii)
- (iii)
- If , we obtain and
- (iv)
Proof.
Direct application of Theorem 12.3, p. 384, []. □
Next, we give the hyperbolic version of the last result.
Theorem 6.
All as in Theorem 5. Then,
- (i)
- (ii)
- (iii)
- If , we obtain and
- (iv)
Proof.
Direct application of Theorem 12.5, p. 390 of []. □
Remark 2.
We similarly have that
as
Hence, by each of the above Theorems 5 and 6, we obtain that is uniformly convergent as , ∀
These results are valid by Theorem 12.4 (p. 390) and Theorem 12.6 (p. 395) from [], respectively.
We make the following remark:
Remark 3.
Let , ; here, is a positive linear operator from into itself.
Also, . By the Riesz representation theorem, there exists a probability measure on such that
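In its standard form, and with μ_x assumed here as notation for the representing probability measure, this reads as follows:

```latex
% Riesz representation of a positive linear operator L with L(1, x) = 1 (standard form)
\[
  L(f, x) \;=\; \int_{a}^{b} f(t)\, d\mu_{x}(t), \qquad \forall\, f \in C([a,b]),
\]
% where \mu_x is a probability measure on [a,b] depending on the evaluation point x.
```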
Assume here that is convex in t. We consider
Clearly, here, we have (see (32)).
For large enough , we obtain
That is,
By Corollary 8.1.1, p. 245 of [], we obtain the following:
Theorem 7.
The following holds:
for large enough .
Remark 4.
We have that
for large enough .
By (8.1.8) (p. 245, []), we obtain the following:
Theorem 8.
Let ; then,
for large enough
By Theorem 8.1.2, p. 248 of [], we can derive the following result.
Theorem 9.
Here, we consider , and . Let L be a positive linear operator from into itself, such that .
Let
Consider such that is convex in t.
Assume that
Then,
We make the following remark.
Remark 5.
We have that
for large enough .
By Theorem 9 and Remark 5, we have proved the following:
Theorem 10.
Here, and . Consider which is convex in t, .
Then, for sufficiently large , we derive that
Next, we use Theorem 18.1, p. 419 of [].
Theorem 11.
Denote ()
where is the supremum norm.
Let . Then,
Furthermore, the following holds:
and for , we have, similarly, that
By Corollary 18.1, p. 421 of [], we obtain the following:
Corollary 2.
Here, . Let . Then,
Here, we have
By Corollary 18.2, p. 421 of [], we have the following:
Corollary 3.
Here, . Let . Then,
Now, we have
and
By Theorem 18.2, p. 422 of []:
Theorem 12.
Let , and . Let also . Then,
The next results are with respect to the L_p (p ≥ 1) norms, taken with respect to the Lebesgue measure.
By Theorem 18.3, p. 424 of [], we obtain the following:
Theorem 13.
Let
. Let also . Then,
By Corollary 18.3, p. 426 of [], we have the following:
Corollary 4.
Set
Let . Then,
By Corollary 18.4, p. 427 of [], we obtain
Corollary 5.
Set
Let . Then,
In the next Theorems 14 and 15, we use the following:
Let , , be a measure space, where is the Borel -algebra on and is a positive finite measure on . (Note that , for any ) Here, stands for the related norm with respect to . Let such that .
Next, we apply Theorem 18.4 from p. 428 of [].
Theorem 14.
Here, . Set
Let . Then,
Remark 6.
We have that
and
Therefore, we have
By Theorem 18.5 from p. 431 of [], we have the following:
Theorem 15.
Here, . Let
Let . Then,
We now need the following definition.
Definition 3.
Let such that , ∀. A norm on is called monotone iff . We denote a monotone norm by , e.g., norms in general (), Orlicz norms, etc.
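Spelled out, the monotonicity condition assumed here is the standard one: pointwise domination of the absolute values implies domination of the norms.

```latex
% Standard monotonicity condition for a norm \|\cdot\| on functions over [a,b]
\[
  |f(t)| \,\le\, |g(t)| \ \ \forall\, t \in [a,b]
  \quad \Longrightarrow \quad
  \| f \| \,\le\, \| g \| .
\]
% The L_p norms (1 \le p \le \infty) and Orlicz norms are monotone in this sense.
```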
Finally, we apply Corollary 18.5 from p. 432 of [].
Corollary 6.
Let be a monotone norm on . Define
Let . Then,
4. Conclusions
The symmetrization technique presented here aims at having the operators converge at a high speed to the unit operator while using half the relevant data feed. My forthcoming work documents numerical experiments and programming related to this result. This article builds a bridge, for the first time, connecting neural network approximation (part of AI) to positive linear operators (an important part of functional analysis). My work has helped pioneer the study of positive linear operators using geometric moment theory since 1985 and helped found quantitative approximation theory using neural networks since 1997. The complete list of supporting references is provided below.
Funding
This research received no external funding.
Data Availability Statement
No new data were created or analyzed in this study.
Conflicts of Interest
The author declares no conflicts of interest.
References
- Anastassiou, G.A. A “K-Attainable” inequality related to the convergence of Positive Linear Operators. J. Approx. Theory 1985, 44, 380–383.
- Anastassiou, G.A. Moments in Probability and Approximation Theory; Pitman Research Notes in Mathematics Series; Longman Scientific & Technical: Essex, UK; New York, NY, USA, 1993.
- Anastassiou, G.A. Quantitative Approximations; Chapman & Hall/CRC: London, UK; New York, NY, USA, 2001.
- Anastassiou, G.A. Trigonometric and Hyperbolic Generated Approximation Theory; World Scientific: Singapore; New York, NY, USA, 2025.
- Anastassiou, G.A. Parametrized, Deformed and General Neural Networks; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2023.
- Shisha, O.; Mond, B. The degree of convergence of sequences of linear positive operators. Proc. Nat. Acad. Sci. USA 1968, 60, 1196–1200.
- Shisha, O.; Mond, B. The degree of Approximation to Periodic Functions by Linear Positive Operators. J. Approx. Theory 1968, 1, 335–339.
- Chen, Z.; Cao, F. The approximation operators with sigmoidal functions. Comput. Math. Appl. 2009, 58, 758–765.
- Costarelli, D.; Spigler, R. Approximation results for neural network operators activated by sigmoidal functions. Neural Netw. 2013, 44, 101–106.
- Costarelli, D.; Spigler, R. Multivariate neural network operators with sigmoidal activation functions. Neural Netw. 2013, 48, 72–77.
- Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: New York, NY, USA, 1998.
- McCulloch, W.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
- Mitchell, T.M. Machine Learning; WCB-McGraw-Hill: New York, NY, USA, 1997.
- Yu, D.; Cao, F. Construction and approximation rate for feed-forward neural network operators with sigmoidal functions. J. Comput. Appl. Math. 2025, 453, 116150.
- Cen, S.; Jin, B.; Quan, Q.; Zhou, Z. Hybrid neural-network FEM approximation of diffusion coefficient in elliptic and parabolic problems. IMA J. Numer. Anal. 2024, 44, 3059–3093.
- Coroianu, L.; Costarelli, D.; Natale, M.; Pantiş, A. The approximation capabilities of Durrmeyer-type neural network operators. J. Appl. Math. Comput. 2024, 70, 4581–4599.
- Warin, X. The GroupMax neural network approximation of convex functions. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 11608–11612.
- Fabra, A.; Guasch, O.; Baiges, J.; Codina, R. Approximation of acoustic black holes with finite element mixed formulations and artificial neural network correction terms. Finite Elem. Anal. Des. 2024, 241, 104236.
- Grohs, P.; Voigtlaender, F. Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approximation spaces. Found. Comput. Math. 2024, 24, 1085–1143.
- Basteri, A.; Trevisan, D. Quantitative Gaussian approximation of randomly initialized deep neural networks. Mach. Learn. 2024, 113, 6373–6393.
- De Ryck, T.; Mishra, S. Error analysis for deep neural network approximations of parametric hyperbolic conservation laws. Math. Comp. 2024, 93, 2643–2677.
- Liu, J.; Zhang, B.; Lai, Y.; Fang, L. Hull form optimization research based on multi-precision back-propagation neural network approximation model. Int. J. Numer. Methods Fluids 2024, 96, 1445–1460.
- Yoo, J.; Kim, J.; Gim, M.; Lee, H. Error estimates of physics-informed neural networks for initial value problems. J. Korean Soc. Ind. Appl. Math. 2024, 28, 33–58.
- Kaur, J.; Goyal, M. Hyers-Ulam stability of some positive linear operators. Stud. Univ. Babeş-Bolyai Math. 2025, 70, 105–114.
- Abel, U.; Acu, A.M.; Heilmann, M.; Raşa, I. On some Cauchy problems and positive linear operators. Mediterr. J. Math. 2025, 22, 20.
- Moradi, H.R.; Furuichi, S.; Sababheh, M. Operator quadratic mean and positive linear maps. J. Math. Inequal. 2024, 18, 1263–1279.
- Bustamante, J.; Torres-Campos, J.D. Power series and positive linear operators in weighted spaces. Serdica Math. J. 2024, 50, 225–250.
- Acu, A.-M.; Rasa, I.; Sofonea, F. Composition of some positive linear integral operators. Demonstr. Math. 2024, 57, 20240018.
- Patel, P.G. On positive linear operators linking gamma, Mittag-Leffler and Wright functions. Int. J. Appl. Comput. Math. 2024, 10, 152.
- Peetre, J. A Theory of Interpolation of Normed Spaces; Universidade de Brasilia: Brasilia, Brazil, 1963.