Abstract
Essential neural-network operators are interpreted as positive linear operators, and the related general theory applies to them. These operators are induced by a symmetrized density function derived from the parametrized and deformed A-generalized logistic activation function. We act on the space of continuous functions from a compact interval of the real line to the reals. We quantitatively study the rate of convergence of these neural-network operators to the unit operator. Our inequalities involve the modulus of continuity of the function under approximation or of its derivative. We produce uniform and L_p approximation results via these inequalities. The convexity of functions is also used to derive more refined results.
Keywords:
neural-network operators; positive linear operators; modulus of continuity; quantitative approximation to the unit; logistic activation function
MSC:
41A17; 41A25; 41A36
1. Introduction
The author has extensively studied the quantitative approximation of positive linear operators to the unit since 1985; see, for example, [1,2,3,4], which are used in this work. He started from the quantitative weak convergence of finite positive measures to the Dirac unit measure, with geometric moment theory as the method (see [2]), and he produced best upper bounds, attaining sharp Jackson-type inequalities; see, e.g., [1,2]. These studies have gone in all possible directions, univariate and multivariate, though in this work we focus only on the univariate approach.
Since 1997, the author has been studying the quantitative convergence of neural network operators to the unit and has written numerous articles and books, e.g., see [3,4,5], which are used here.
The neural-network operators used by the author are, across their wide range, indeed positive linear operators by nature.
Here, the author continues his study of treating neural-network operators as positive linear operators. This is a totally new approach in the literature; see [6].
Different activation functions allow for different non-linearities, which might work better for solving a specific problem. So, the need to use neural networks with various activation functions is pressing. Thus, performing neural-network approximations using different activation functions is not only necessary but fully justified. Furthermore, brain non-symmetry has been observed in animals and humans in terms of structure, function, and behavior. This lateralization is thought to reflect evolutionary, hereditary, developmental, experiential, and pathological factors. Consequently, for our study, it is natural to consider deformed neural-network activation functions and operators. Thus, this work is a specific study under this philosophy of approaching reality as closely as possible. The author is currently working on the long project of connecting neural-network operators to positive linear operator theory. Some of the most popular activation functions derive from the generalized logistic function, on which this article is based, and the hyperbolic tangent, on which [6] is based. Both cases require different calculations and produce different results of great interest and important application in the author's other papers.
All methods of positive linear operators apply here to our summation-defined neural-network operators, producing new and interesting results of pointwise, uniform, and L_p kinds. Via the Riesz representation theorem, neural networks are connected to measure theory. The convexity of functions produces optimal and sharper results.
The use of the A-generalized logistic function as an activation function is well established and supported in ([5], Chapter 16). The A-generalized logistic activation function behaves well and is one of the most commonly used activation functions in this subject.
The author’s symmetrization method presented here aims for our operators to converge at high speed to the unit operator using a half-feed of data.
This article establishes a bridge connecting neural-network approximation (part of AI) to positive linear operators (a significant part of functional analysis).
So, it is a theoretical work aimed mainly at related experts.
The authors of [7,8] have, as always, been a great inspiration. For classic studies on neural networks, we also recommend [9,10,11,12,13,14].
For newer work on neural networks, we also refer the reader to [15,16,17,18,19,20,21,22,23,24]. For recent studies in positive linear operators, we refer the reader to [6,25,26,27,28,29,30].
For other general related work, please read [31,32,33,34,35,36,37,38], with their justifications to follow:
Justification of [31]: Provides analogous weighted and pointwise error estimates for Kantorovich operators, enriching comparison with our symmetrized kernel bounds.
Justification of [32]: Develops Bézier–Kantorovich constructions using wavelet techniques, which is parallel to our symmetrization approach in handling deformed kernels.
Justification of [33]: Introduces a two-parameter Stancu deformation framework analogous to our A-generalized logistic deformation.
Justification of [34]: Explores lacunary sequences and invariant means underlying averaging arguments similar to our symmetrization sums.
Justification of [35]: Applies Appell-polynomial enhancement to beta-function kernels, directly relating to our beta-based logistic symmetrization.
Justification of [36]: Combines Stancu deformation with Bézier bases, mirroring our combination of deformation and symmetrization.
Justification of [37]: Presents modified Bernstein polynomials via basis-function techniques, offering comparable convergence comparisons.
Justification of [38]: Integrates q-statistical methods with wavelet-aided kernels, paralleling our use of q-deformation and symmetrized densities.
2. Basics
Here, we follow ([5], pp. 395–417).
Our activation function to be used here is the q-deformed and λ-parametrized function
This is the A-generalized logistic function.
For more, read Chapter 16 of [5]: “Banach space valued ordinary and fractional neural-network approximation based on q-deformed and λ-parametrized A-generalized logistic function”.
This chapter motivates our current work.
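To make the discussion concrete, the following minimal numerical sketch assumes the form φ(x) = 1/(1 + q A^(−λx)), with A > 1 and q, λ > 0, familiar from Chapter 16 of [5]; the function name and parameter values below are illustrative only and are not part of the theory.

```python
import numpy as np

def a_generalized_logistic(x, A=np.e, q=1.0, lam=1.0):
    """Assumed form of the q-deformed, lambda-parametrized A-generalized
    logistic activation: phi(x) = 1 / (1 + q * A**(-lam * x)),
    with A > 1 and q, lam > 0.  For A = e, q = lam = 1 it reduces to the
    ordinary logistic function."""
    return 1.0 / (1.0 + q * A ** (-lam * x))

# Sigmoidal behavior: limits 0 and 1, value 1/(1 + q) at the origin.
xs = np.array([-50.0, 0.0, 50.0])
print(a_generalized_logistic(xs, A=2.0, q=1.5, lam=0.7))
```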
The proposed “symmetrization technique” aims to use a half-data feed to our neural networks.
We will employ the following density function
We have
and
Adding (3) and (4), we obtain
which is the key to this work.
Therefore,
is an even function, symmetric with respect to the y-axis.
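Independently of the particular density used, the symmetrization step always produces an even function; a generic one-line verification (with a placeholder density M, purely for illustration) reads:

```latex
% Generic symmetrization step: averaging any function M with its reflection
% yields an even function (illustration only; M is a placeholder).
\[
W(x) := \tfrac{1}{2}\bigl(M(x) + M(-x)\bigr)
\quad\Longrightarrow\quad
W(-x) = \tfrac{1}{2}\bigl(M(-x) + M(x)\bigr) = W(x),
\qquad \forall\, x \in \mathbb{R}.
\]
```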
The global maximum of is given by (16.18), p. 401 of [5] as
In addition, the global maximum of is
both sharing the same maximum at symmetric points.
By Theorem 16.1, p. 401 of [5], we have
and
Consequently, we derive that
By Theorem 16.2, p. 402 of [5], we have
similarly, it holds that
so that
and therefore, W is a density function.
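As a numerical plausibility check of the partition-of-unity property, the sketch below verifies that, for a generic sigmoid σ with limits 0 and 1, the standard difference-type density ψ(x) = ½(σ(x + 1) − σ(x − 1)) used throughout [3,5] satisfies Σ_k ψ(x − k) ≈ 1; this is offered only as an illustration and is not claimed to be the exact W of this paper.

```python
import numpy as np

def sigmoid(x):
    # ordinary logistic, used as a stand-in sigmoid with limits 0 and 1;
    # the clip avoids overflow warnings for large |x|
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60.0, 60.0)))

def psi(x):
    # difference-type density built from the sigmoid (standard construction
    # in the neural-network approximation literature, e.g. [3,5])
    return 0.5 * (sigmoid(x + 1.0) - sigmoid(x - 1.0))

x = 0.37                      # arbitrary test point
ks = np.arange(-200, 201)     # truncated integer lattice
print(np.sum(psi(x - ks)))    # ~ 1.0, by telescoping of the partial sums
```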
By Theorem 16.3, p. 402 of [5], we have:
Let , and with . Then,
where
Similarly, we obtain that
Consequently, we obtain that
where
Here, ⌈·⌉ denotes the ceiling of the number, and ⌊·⌋ its integral part.
We mention (Theorem 16.4, p. 402 of [5]): let and so that . For , we consider the number with , and . Then,
Similarly, we consider , such that , and . Thus,
Hence,
and
Consequently, it holds
so that
that is,
We have proved
Theorem 1.
Let and , so that . For , we consider with , and . Also consider , such that , and . Then
We make the following remark:
Remark 1.
(I) By Remark 16.5, p. 402 of [5], we have
and
Therefore, it holds that
Hence, it is
even if
because then
equivalently
true by
(II) Let . For large n, we always have . Also , iff . So, in general, it holds that
We need:
Definition 1.
Let and . We introduce and define the X-valued linear symmetrized neural-network operators
In fact, it is ; and are positive linear operators.
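A minimal computational sketch follows, assuming the operators of Definition 1 take the usual quasi-interpolation form A_n(f, x) = Σ_k f(k/n) W(nx − k) / Σ_k W(nx − k), with k running from ⌈na⌉ to ⌊nb⌋, as employed throughout [3,5,6]; the density psi below is only a stand-in for the symmetrized W, and the interval [−1, 1] is illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60.0, 60.0)))

def psi(x):
    # stand-in for the symmetrized density W (same illustrative construction
    # as in the previous sketch)
    return 0.5 * (sigmoid(x + 1.0) - sigmoid(x - 1.0))

def neural_operator(f, x, n, a=-1.0, b=1.0, density=psi):
    """Assumed quasi-interpolation form of the operators of Definition 1:
    a density-weighted average of the samples f(k/n), k = ceil(n*a), ..., floor(n*b)."""
    ks = np.arange(np.ceil(n * a), np.floor(n * b) + 1)
    weights = density(n * x - ks)
    return np.sum(f(ks / n) * weights) / np.sum(weights)

x0 = 0.3
for n in (10, 100, 1000):
    # the pointwise error shrinks as n grows
    print(n, abs(neural_operator(np.cos, x0, n) - np.cos(x0)))
```

For a Lipschitz function such as cos, the printed errors shrink as n grows, consistent with the convergence statements of Section 3.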
The modulus of continuity is defined by
The same is defined for (uniformly continuous and bounded functions) and for (bounded and continuous functions) and for (uniformly continuous functions).
In fact, or is equivalent to
In this work, , , where is the supremum norm.
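Since all the estimates that follow are driven by the first modulus of continuity ω₁(f, δ) = sup{|f(x) − f(y)| : |x − y| ≤ δ}, a simple grid-based approximation of it is sketched below; this is an illustration only and plays no role in the proofs.

```python
import numpy as np

def modulus_of_continuity(f, a, b, delta, grid=10_000):
    """Grid approximation of omega_1(f, delta) = sup |f(x) - f(y)| over
    |x - y| <= delta, x, y in [a, b]; slightly underestimates the supremum."""
    xs = np.linspace(a, b, grid)
    fs = f(xs)
    h = (b - a) / (grid - 1)
    shift = int(np.floor(delta / h))   # max index offset with |x - y| <= delta
    best = 0.0
    for s in range(1, shift + 1):
        best = max(best, np.max(np.abs(fs[s:] - fs[:-s])))
    return best

print(modulus_of_continuity(np.sin, -np.pi, np.pi, 0.1))   # ~ 0.0999
```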
3. Main Results
We present uniform, pointwise, and L_p approximation results.
Theorem 2.
Let . Then,
So that uniformly.
Proof.
We estimate that
That is,
Consequently, it holds that
By (Theorem 7.1.7, p. 203 of [2]), using the Shisha–Mond inequality [7] for positive linear operators and , we have
Thus, it holds that
proving the claim. □
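For orientation, the Shisha-Mond estimate invoked in the proof above is recalled here in its commonly quoted form (see [2,7] for the precise version used):

```latex
% Shisha-Mond estimate for a positive linear operator L on C([a,b]),
% recalled in its commonly quoted form:
\[
\|Lf - f\|_\infty \le \|f\|_\infty \,\|L1 - 1\|_\infty
  + \|L1 + 1\|_\infty \,\omega_1(f,\mu),
\qquad
\mu := \bigl\| L\bigl((t-x)^2\bigr)(x) \bigr\|_\infty^{1/2}.
\]
```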
It holds
Theorem 3.
Let be a continuous and periodic function with modulus of continuity . Here, denotes the sup-norm over , and the operators act on such f over ; , . Then,
Proof.
We want to estimate (),
We have proved that
Therefore, by [8], the following inequalities hold:
hence proving the claim. □
We need
Definition 2.
Let . The Peetre K-functional ([39]) is defined as follows:
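For the reader's convenience, the standard first-order form of the Peetre K-functional is recalled below; the exact variant used in Definition 2 follows [39] and [1].

```latex
% Standard first-order Peetre K-functional on C([a,b]):
\[
K_1(f,t) := \inf_{g \in C^1([a,b])}
  \bigl\{ \|f - g\|_\infty + t\,\|g'\|_\infty \bigr\},
\qquad t \ge 0.
\]
```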
We give
Theorem 4.
Let . Then,
as
So that , uniformly.
For the proof of (48), we will use
Corollary 1.
([1]). (Pointwise approximation) Let and let L be a positive linear operator acting on satisfying , for all .
Then, we have the attainable (i.e., sharp) inequality
Proof of Theorem 4.
We apply inequality (49). Clearly, we have
which is attainable (i.e., sharp).
We estimate that
That is,
We have proved that
∀
Now, the validity of (48) is clear. □
A trigonometric result follows.
Theorem 5.
Here , , , Then
(i)
(ii)
(iii) if , we obtain
and
(iv)
Proof.
Direct application of (Theorem 12.3, p. 384 of [4]). □
Next, we give the hyperbolic version of the last result.
Theorem 6.
All as in Theorem 5. Then,
(i)
(ii)
(iii) if , we obtain
and
(iv)
Proof.
Direct application of (Theorem 12.5, p. 390 of [4]). □
Remark 2.
We similarly have
as
Hence, by each of the above Theorems 5 and 6, we obtain that is uniformly convergent as .
This is valid by Theorem 12.4, p. 390, and Theorem 12.6, p. 395, of [4], respectively.
We make the following remark:
Remark 3.
Let , . Here, is a positive linear operator from into itself.
Also, . By the Riesz representation theorem, there exists a probability measure on , such that
For large enough , we obtain
That is,
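Written out generically (the exact normalization follows [2]), the representation described in Remark 3 reads:

```latex
% Riesz representation of the positive linear operator L at a fixed point x_0:
% positivity together with the normalization at x_0 yields a probability
% measure mu_{x_0} on [a, b] such that
\[
L(f, x_0) = \int_{[a,b]} f(t)\, d\mu_{x_0}(t),
\qquad f \in C([a,b]).
\]
```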
By (Corollary 8.1.1, p. 245 of [2]), we obtain:
Theorem 7.
It holds that
for large enough .
Remark 4.
We have
for large enough .
By (8.1.8, p. 245 of [2]), we obtain
Theorem 8.
Let , then
for large enough
By Theorem 8.1.2, p. 248 of [2], we can derive the following result.
Theorem 9.
Here, we consider , and . Let L be a positive linear operator from into itself, such that .
Denote by
Consider , such that is convex in t.
Assume that
Then,
We make the following remark:
Remark 5.
We have
for large enough .
By Theorem 9 and Remark 5, we have proved the following:
Theorem 10.
Here, it is and . Consider is convex in t, .
Then, for sufficiently large , we derive that
Next, we use (Theorem 18.1, p. 419 of [3]).
Theorem 11.
Denote ()
where is the supremum norm.
Let . Then,
Furthermore, it holds that
and for , we have, similarly, that
By (Corollary 18.1, p. 421 of [3]), we obtain
Corollary 2.
Here . Let . Then,
Here, it is
By (Corollary 18.2, p. 421 of [3]), we obtain
Corollary 3.
Here . Let . Then,
Here, they are
and
By (Theorem 18.2, p. 422 of [3]), we obtain:
Theorem 12.
Let , and . Let also . Then
The following results are with respect to L_p norms, taken with respect to the Lebesgue measure.
By (Theorem 18.3, p. 424 of [3]), we obtain
Theorem 13.
Let
. Also let . Then,
By (Corollary 18.3, p. 426 of [3]), we obtain:
Corollary 4.
Set
Let . Then,
By (Corollary 18.4, p. 427 of [3]), we obtain
Corollary 5.
Set
Let . Then
Setting 1.
In the following Theorems 14 and 15, we use:
Let , be a measure space, where is the Borel σ-algebra on and μ is a positive finite measure on . (Please note that , for any .) Here, stands for the related norm with respect to μ. Let such that .
Next, we apply Theorem 18.4, p. 428 of [3].
Theorem 14.
Here . Set
Let . Then
Remark 6.
We have
and
Therefore, it holds that
By (Theorem 18.5, p. 431 of [3]), we obtain
Theorem 15.
Here, . We denote the following:
Let . Then,
We need
Definition 3.
Let , such that , ∀. A norm on is called monotone iff . We denote a monotone norm by , e.g., norms in general (), Orlicz norms, etc.
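As a concrete instance of Definition 3 (added only as an illustration), the L_p norms are monotone:

```latex
% The L_p norms are monotone in the sense of Definition 3:
\[
|f(t)| \le |g(t)| \ \ \forall\, t \in [a,b]
\quad\Longrightarrow\quad
\Bigl(\int_a^b |f(t)|^p\,dt\Bigr)^{1/p}
  \le \Bigl(\int_a^b |g(t)|^p\,dt\Bigr)^{1/p},
\qquad p \ge 1.
\]
```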
Finally, we apply (Corollary 18.5, p. 432 of [3]).
Corollary 6.
Let be a monotone norm on . Denote
Let . Then
4. Conclusions
The author’s recent symmetrization technique presented here aims for our operators to converge at high speed to the unit operator using a half-feed of data. This fact is documented by other authors’ forthcoming work, which includes numerical work and programming. This article builds a bridge connecting neural-network approximation (part of AI) to positive linear operators (a significant part of functional analysis). The author has been a pioneer in the study of positive linear operators by the use of geometric moment theory since 1985 and the founder of quantitative approximation theory by neural networks in 1997. The author was recently the first to connect neural networks to positive linear operators; see [6]. The list of references is complete, supporting the above.
Funding
This research received no external funding.
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
Conflicts of Interest
The author declares no conflicts of interest.
References
- Anastassiou, G.A. A “K-Attainable” inequality related to the convergence of Positive Linear Operators. J. Approx. Theory 1985, 44, 380–383. [Google Scholar] [CrossRef]
- Anastassiou, G.A. Moments in Probability and Approximation Theory; Pitman Research Notes in Mathematics Series; Longman Scientific & Technical: Essex, UK; New York, NY, USA, 1993. [Google Scholar]
- Anastassiou, G.A. Quantitative Approximations; Chapman & Hall/CRC: London, UK; New York, NY, USA, 2001. [Google Scholar]
- Anastassiou, G.A. Trigonometric and Hyperbolic Generated Approximation Theory; World Scientific: Singapore; New York, NY, USA, 2025. [Google Scholar]
- Anastassiou, G.A. Parametrized, Deformed and General Neural Networks; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2023. [Google Scholar]
- Anastassiou, G.A. Neural Networks as Positive Linear Operators. Mathematics 2025, 13, 1112. [Google Scholar] [CrossRef]
- Shisha, O.; Mond, B. The degree of convergence of sequences of linear positive operators. Proc. Natl. Acad. Sci. USA 1968, 60, 1196–1200. [Google Scholar] [CrossRef] [PubMed]
- Shisha, O.; Mond, B. The degree of Approximation to Periodic Functions by Linear Positive Operators. J. Approx. Theory 1968, 1, 335–339. [Google Scholar] [CrossRef]
- Chen, Z.; Cao, F. The approximation operators with sigmoidal functions. Comput. Math. Appl. 2009, 58, 758–765. [Google Scholar] [CrossRef]
- Costarelli, D.; Spigler, R. Approximation results for neural network operators activated by sigmoidal functions. Neural Netw. 2013, 44, 101–106. [Google Scholar] [CrossRef]
- Costarelli, D.; Spigler, R. Multivariate neural network operators with sigmoidal activation functions. Neural Netw. 2013, 48, 72–77. [Google Scholar] [CrossRef]
- Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: New York, NY, USA, 1998. [Google Scholar]
- McCulloch, W.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 7, 115–133. [Google Scholar] [CrossRef]
- Mitchell, T.M. Machine Learning; WCB-McGraw-Hill: New York, NY, USA, 1997. [Google Scholar]
- Yu, D.; Cao, F. Construction and approximation rate for feed-forward neural network operators with sigmoidal functions. J. Comput. Appl. Math. 2025, 453, 116150. [Google Scholar]
- Siyu, C.; Bangti, J.; Qimeng, Q.; Zhi, Z. Hybrid neural-network FEM approximation of diffusion coefficient in elliptic and parabolic problems. IMA J. Numer. Anal. 2024, 44, 3059–3093. [Google Scholar]
- Coroianu, L.; Costarelli, D.; Natale, M.; Pantiş, A. The approximation capabilities of Durrmeyer-type neural network operators. J. Appl. Math. Comput. 2024, 70, 4581–4599. [Google Scholar]
- Warin, X. The GroupMax neural network approximation of convex functions. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 11608–11612. [Google Scholar] [CrossRef] [PubMed]
- Fabra, A.; Guasch, O.; Baiges, J.; Codina, R. Approximation of acoustic black holes with finite element mixed formulations and artificial neural network correction terms. Finite Elem. Anal. Des. 2024, 241, 104236. [Google Scholar] [CrossRef]
- Grohs, P.; Voigtlaender, F. Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approximation spaces. Found. Comput. Math. 2024, 24, 1085–1143. [Google Scholar] [CrossRef]
- Basteri, A.; Trevisan, D. Quantitative Gaussian approximation of randomly initialized deep neural networks. Mach. Learn. 2024, 113, 6373–6393. [Google Scholar] [CrossRef]
- De Ryck, T.; Mishra, S. Error analysis for deep neural network approximations of parametric hyperbolic conservation laws. Math. Comp. 2024, 93, 2643–2677. [Google Scholar] [CrossRef]
- Liu, J.; Zhang, B.; Lai, Y.; Fang, L. Hull form optimization research based on multi-precision back-propagation neural network approximation model. Int. J. Numer. Methods Fluids 2024, 96, 1445–1460. [Google Scholar] [CrossRef]
- Yoo, J.; Kim, J.; Gim, M.; Lee, H. Error estimates of physics-informed neural networks for initial value problems. J. Korean Soc. Ind. Appl. Math. 2024, 28, 33–58. [Google Scholar]
- Kaur, J.; Goyal, M. Hyers-Ulam stability of some positive linear operators. Stud. Univ. Babeş-Bolyai Math. 2025, 70, 105–114. [Google Scholar] [CrossRef]
- Abel, U.; Acu, A.M.; Heilmann, M.; Raşa, I. On some Cauchy problems and positive linear operators. Mediterr. J. Math. 2025, 22, 20. [Google Scholar] [CrossRef]
- Moradi, H.R.; Furuichi, S.; Sababheh, M. Operator quadratic mean and positive linear maps. J. Math. Inequal. 2024, 18, 1263–1279. [Google Scholar] [CrossRef]
- Bustamante, J.; Torres-Campos, J. Power series and positive linear operators in weighted spaces. Serdica Math. J. 2024, 50, 225–250. [Google Scholar] [CrossRef]
- Acu, A.-M.; Rasa, I.; Sofonea, F. Composition of some positive linear integral operators. Demonstr. Math. 2024, 57, 20240018. [Google Scholar] [CrossRef]
- Patel, P.G. On positive linear operators linking gamma, Mittag-Leffler and Wright functions. Int. J. Appl. Comput. Math. 2024, 10, 152. [Google Scholar] [CrossRef]
- Ansari, K.J.; Özger, F. Pointwise and weighted estimates for Bernstein-Kantorovich type operators including beta function. Indian J. Pure Appl. Math. 2024. [Google Scholar] [CrossRef]
- Savaş, E.; Mursaleen, M. Bézier Type Kantorovich q-Baskakov Operators via Wavelets and Some Approximation Properties. Bull. Iran. Math. Soc. 2023, 49, 68. [Google Scholar] [CrossRef]
- Cai, Q.; Aslan, R.; Özger, F.; Srivastava, H.M. Approximation by a new Stancu variant of generalized (λ,μ)-Bernstein operators. Alex. Eng. J. 2024, 107, 205–214. [Google Scholar] [CrossRef]
- Ayman-Mursaleen, M.; Nasiruzzaman, M.; Sharma, S.; Cai, Q. Invariant means and lacunary sequence spaces of order (α, β). Demonstr. Math. 2024, 57, 20240003. [Google Scholar] [CrossRef]
- Ayman-Mursaleen, M.; Nasiruzzaman, M.; Rao, N. On the Approximation of Szász-Jakimovski-Leviatan Beta Type Integral Operators Enhanced by Appell Polynomials. Iran. J. Sci. 2025. [Google Scholar] [CrossRef]
- Alamer, A.; Nasiruzzaman, M. Approximation by Stancu variant of λ-Bernstein shifted knots operators associated by Bézier basis function. J. King Saud Univ. Sci. 2024, 36, 103333. [Google Scholar] [CrossRef]
- Ayman-Mursaleen, M.; Nasiruzzaman, M.; Rao, N.; Dilshad, M.; Nisar, K.S. Approximation by the modified λ-Bernstein-polynomial in terms of basis function. AIMS Math. 2024, 9, 4409–4426. [Google Scholar] [CrossRef]
- Ayman-Mursaleen, M.; Lamichhane, B.P.; Kilicman, A.; Senu, N. On q-statistical approximation of wavelets aided Kantorovich q-Baskakov operators. Filomat 2024, 38, 3261–3274. [Google Scholar] [CrossRef]
- Peetre, J. A Theory of Interpolation of Normed Spaces; Notes Universidade de Brasilia: Brasília, Brazil, 1963. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).