Abstract
In this paper, we create a family of neural network (NN) operators employing a parametrized and deformed half-hyperbolic tangent function as the activation function, together with a density function produced from the same activation function. Moreover, we consider univariate quantitative approximations of complex-valued functions on a compact domain by complex-valued NN operators. Pointwise and uniform convergence results on Banach spaces are obtained through trigonometric, hyperbolic, and hybrid hyperbolic–trigonometric approaches.
Keywords:
parametrized half-hyperbolic tangent function; Banach space-valued neural network approximation; Ostrowski- and Opial-type inequalities; complex-valued neural network operators; trigonometric- and hyperbolic-type Taylor formulae; activation function; neural networks
MSC:
26A33; 41A17; 41A25; 41A30; 46B25
1. Motivation
Under the umbrella of Artificial Intelligence (AI), neural network (NN) operators provide a powerful framework for solving complex real-life problems in various fields of science such as cybersecurity, healthcare, economics, medicine, psychology, and sociology. Thanks to their differentiability properties, their parameters can be optimized directly. Moreover, NN operators need correctly selected activation functions to work effectively, and the literature offers a very wide range of them [1]. In this study, we choose a parametrized and deformed half-hyperbolic tangent function as the activation function; functions of this kind can be more effective in reaching the optimum solution because of their trainability [2]. One of the interesting aspects of this work is that all higher-order approximations are based on inequalities derived from trigonometric- and hyperbolic-type Taylor formulae (see [3,4,5]). Next, we emphasize the following: as is well known, medicine has established that the human brain is not a symmetrical organ. NNs, which try to imitate its operation, are consequently not symmetrical mathematical structures either. In our paper, however, we build an approximation apparatus that is as close to symmetry as possible. Namely, the activation function we initially use is reciprocal anti-symmetrical (see Proposition 1). This is the building block for our density function, which is used in our approximation NN operators, and we prove that this density function is a reciprocal symmetric function (see Equation (3)). This is our study's interesting connection to the general "symmetry" phenomenon.
The first construction of approximation by NN operators of the "Cardaliaguet–Euvrard" and "Squashing" types was given by G.A. Anastassiou in 1997. As a fruit of this construction, he also introduced to the literature quantitative estimates of the convergence speed, expressed as rates in terms of the modulus of continuity [6].
The mathematical expression of the one-hidden-layer neural network (NN) architecture is presented as
for inputs in the relevant Euclidean domain, where the quantities involved are, respectively, the connection weights, the coefficients, the inner product of the weights with the input, the thresholds, and the activation function. For more background on NNs, readers are referred to [7,8,9].
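For orientation, a common way to write such a one-hidden-layer network, with symbols that are our own labels rather than the paper's original notation ($a_j$ for the connection weights, $c_j$ for the coefficients, $b_j$ for the thresholds, $\sigma$ for the activation function), is the following sketch:

```latex
N_n(x) \;=\; \sum_{j=0}^{n} c_j\,\sigma\bigl(\langle a_j \cdot x\rangle + b_j\bigr),
\qquad x \in \mathbb{R}^{s},\ s \in \mathbb{N}.
```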
This paper is organized as follows: in Section 1, we present our motivation to the readers. In Section 2, we build up our activation function step by step; we also include the basic estimates that form the basis for the main results. We devote Section 3 and Section 4 to the construction of the density function and to the creation of the complex (Banach space)-valued linear NN operators, respectively. In Section 5, we perform deformed and parametrized half-hyperbolic tangent function-activated high-order neural network approximations of continuous, complex-valued functions over compact intervals of the real line. All convergences have rates expressed via the modulus of continuity of the involved functions' high-order derivatives, derived from very tight Jackson-type inequalities. We conclude with Section 6.
2. Activation Function: Parametrized Half-Hyperbolic Tangent Function
We consider the following function inspired by [10]
to be a parametrized half-hyperbolic tangent function for the admissible parameter values. Also, one has
Proposition 1.
We observe that
for
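As a numerical illustration only, the sketch below assumes the $q$-deformed, $\beta$-parametrized half-hyperbolic tangent of [10] in the form $g_{q,\beta}(x) = (1 - q e^{-\beta x})/(1 + q e^{-\beta x})$ with $q, \beta > 0$ (an assumption on our part); under this assumption, the reciprocal anti-symmetry of Proposition 1, $g_{q,\beta}(-x) = -g_{1/q,\beta}(x)$, can be checked numerically:

```python
import numpy as np

def g(x, q=2.0, beta=3.0):
    """Assumed q-deformed, beta-parametrized half-hyperbolic tangent (cf. [10])."""
    return (1.0 - q * np.exp(-beta * x)) / (1.0 + q * np.exp(-beta * x))

x = np.linspace(-5.0, 5.0, 1001)

# Reciprocal anti-symmetry: g_{q,beta}(-x) = -g_{1/q,beta}(x) for every x.
assert np.allclose(g(-x, q=2.0), -g(x, q=0.5))

# The function is strictly increasing and takes values in (-1, 1).
assert np.all(np.diff(g(x)) > 0) and np.all(np.abs(g(x)) < 1.0)
```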
Proposition 2.
The function is strictly increasing on its domain, since
for every
Proposition 3.
When we take the second-order derivative of the activation function, we have
According to the above calculation, the following are true:
- Case 1:
- if , then is strictly concave upward, with
- Case 2:
- if , then is strictly concave downward.
3. Construction of Density Function
In this section, we aim to create the density function, and we also present its basic properties, which will be used throughout the paper. For every admissible choice of parameters, let us consider the following density function:
Furthermore, note that
so the x-axis is a horizontal asymptote. One can write that
Thus,
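Under the same assumed form of $g_{q,\beta}$, and taking the standard construction $M_{q,\beta}(x) = \tfrac{1}{4}\bigl(g_{q,\beta}(x+1) - g_{q,\beta}(x-1)\bigr)$ of [10] as a working hypothesis, the bell shape, the decay towards the horizontal asymptote $y = 0$, and the reciprocal symmetry $M_{q,\beta}(-x) = M_{1/q,\beta}(x)$ of Equation (3) can all be observed numerically:

```python
import numpy as np

def g(x, q=2.0, beta=3.0):
    """Assumed activation function (see Section 2)."""
    return (1.0 - q * np.exp(-beta * x)) / (1.0 + q * np.exp(-beta * x))

def M(x, q=2.0, beta=3.0):
    """Assumed density: a quarter of a difference of shifted activations (cf. [10])."""
    return 0.25 * (g(x + 1.0, q, beta) - g(x - 1.0, q, beta))

x = np.linspace(-8.0, 8.0, 3201)
vals = M(x)

# Positivity and decay to the x-axis (horizontal asymptote).
assert np.all(vals > 0) and vals[0] < 1e-6 and vals[-1] < 1e-6

# Reciprocal symmetry: M_{q,beta}(-x) = M_{1/q,beta}(x).
assert np.allclose(M(-x, q=2.0), M(x, q=0.5))

# Bell shape: values rise up to a single peak and fall afterwards
# (non-strict comparisons guard against floating-point flatness in the tails).
peak = int(np.argmax(vals))
assert np.all(np.diff(vals[:peak + 1]) >= 0) and np.all(np.diff(vals[peak:]) <= 0)
```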
Remark 1.
We have
Then,
- (i)
- Let then and , that is, . Thus, is strictly increasing on .
- (ii)
- Let then and , namely, . Therefore, is strictly decreasing on .
Remark 2.
Let . Then,
Explicitly, according to one determines that for , and is strictly concave downward on Therefore, is a bell-shaped function on Moreover, is satisfied.
Remark 3.
The maximum value of is
Theorem 1
([10]). We determine that
Thus
In this manner, it holds that
However,
So
and
Theorem 2
([10]). It holds that
Thus, this means that is a density function over such that
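For orientation, and with $M_{q,\beta}$ as in the sketch above (our assumed notation), the two properties in play here are presumably the standard ones of the constructions in [10]: a partition-of-unity identity over the integer translates and a unit integral,

```latex
\sum_{k=-\infty}^{\infty} M_{q,\beta}(x - k) = 1 \quad \text{for all } x \in \mathbb{R},
\qquad
\int_{-\infty}^{\infty} M_{q,\beta}(x)\,dx = 1 .
```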
Next, the following result is needed.
Theorem 3
([10]). Let , and with ; . Then,
where
In what follows, ⌈·⌉ and ⌊·⌋ denote the ceiling and the integral part of a number, respectively.
Let us continue with the following conclusion:
Theorem 4
([10]). Let and so that . For , we consider the number with and . Then,
We also mention the following:
Remark 4
([10]). (i) We also notice that
where
(ii) Let . For large we always have . Also , iff . In general, it holds that
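In the standard setup of [10], which we take as a working assumption here (with $a \le x \le b$ and $n \in \mathbb{N}$ large), these observations read roughly as follows:

```latex
\lceil na \rceil \le \lfloor nb \rfloor \ \text{for sufficiently large } n,
\qquad
a \le \tfrac{k}{n} \le b \iff \lceil na \rceil \le k \le \lfloor nb \rfloor,
\qquad
\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} M_{q,\beta}(nx - k) \le 1 .
```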
4. Generation of -Valued Linear NN Operators
Let the underlying space be the Banach space of the complex numbers over the reals.
Definition 1.
Let and . We introduce and define the -valued linear neural network operators as follows:
For large enough n, we always obtain Also, iff The same may be used for real-valued functions. Here, we study the pointwise and uniform convergence of to with rates.
Clearly, here,
For convenience, we note the following:
that is,
so that
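To make Definition 1 concrete, here is a minimal numerical sketch that assumes the usual Anastassiou-type quasi-interpolation form $A_n(f,x) = \sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor} f\!\left(\tfrac{k}{n}\right) M_{q,\beta}(nx-k) \big/ \sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor} M_{q,\beta}(nx-k)$ on $[a,b]$; the operator name $A_n$ and this exact form are our assumptions, not a quotation of the paper's formula:

```python
import math
import numpy as np

def g(x, q=2.0, beta=3.0):
    """Assumed q-deformed, beta-parametrized half-hyperbolic tangent (cf. [10])."""
    return (1.0 - q * np.exp(-beta * x)) / (1.0 + q * np.exp(-beta * x))

def M(x, q=2.0, beta=3.0):
    """Assumed density built from the activation."""
    return 0.25 * (g(x + 1.0, q, beta) - g(x - 1.0, q, beta))

def A(f, x, n, a=-1.0, b=1.0):
    """Assumed quasi-interpolation NN operator on [a, b], evaluated at a point x."""
    ks = np.arange(math.ceil(n * a), math.floor(n * b) + 1)
    weights = M(n * x - ks)
    samples = np.array([f(k / n) for k in ks])  # complex-valued samples are fine
    return np.sum(samples * weights) / np.sum(weights)

# A complex-valued continuous function on [-1, 1] and one operator evaluation.
f = lambda t: np.exp(1j * np.pi * t) + t ** 2
print(A(f, 0.3, n=100), "vs", f(0.3))
```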
We will estimate (12) by means of the classical first modulus of continuity, defined below:
Moreover, is equivalent to (see [6]).
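The classical first modulus of continuity referred to here is, in our wording of the standard Banach space-valued definition,

```latex
\omega_1(f,\delta) \;:=\; \sup_{\substack{x,\,y \in [a,b] \\ |x-y| \le \delta}} \left\| f(x) - f(y) \right\|, \qquad \delta > 0,
```

and the equivalence invoked from [6] is presumably the standard one: $f$ is uniformly continuous on $[a,b]$ if and only if $\omega_1(f,\delta) \to 0$ as $\delta \to 0^{+}$.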
5. Approximation Results
Now, we are ready to perform complex-valued neural network high-order approximations, with rates, of a given function f. Let us start with a trigonometric approach.
Theorem 5.
Let , , , Then,
- (1)
- (2)
- If we obtain
Note here that there is a high rate of convergence at
- (3)
- In addition, we obtain
namely, , pointwise and uniformly,
- (4)
- Eventually, it holds that
and a high speed of convergence at is gained.
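As a purely numerical companion to such statements (not a verification of the exact constants, which depend on quantities defined above), one can estimate an empirical convergence order of the assumed operator $A_n$ from the sketch in Section 4 at a fixed point:

```python
import math
import numpy as np

def g(x, q=2.0, beta=3.0):
    return (1.0 - q * np.exp(-beta * x)) / (1.0 + q * np.exp(-beta * x))

def M(x, q=2.0, beta=3.0):
    return 0.25 * (g(x + 1.0, q, beta) - g(x - 1.0, q, beta))

def A(f, x, n, a=-1.0, b=1.0):
    ks = np.arange(math.ceil(n * a), math.floor(n * b) + 1)
    w = M(n * x - ks)
    return np.sum(np.array([f(k / n) for k in ks]) * w) / np.sum(w)

f = lambda t: np.exp(1j * np.pi * t)      # smooth complex-valued test function
x0 = 0.3
ns = np.array([25, 50, 100, 200, 400, 800])
errs = np.array([abs(A(f, x0, n) - f(x0)) for n in ns])

# Empirical order: slope of log(error) against log(n); the theorems quantify
# this decay via the modulus of continuity of high-order derivatives.
order = -np.polyfit(np.log(ns), np.log(errs), 1)[0]
print("empirical convergence order ~", round(float(order), 2))
```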
Proof.
(employing , ∀)
namely,
(employing , ∀ )
Here, we keep in mind that has a high convergence rate.
We state that convergence is pointwise and uniform such that .
Inspired by [3], and applying the trigonometric Taylor’s formula for , let ; then,
Furthermore, it holds that
So, we have
Hence, we gain
where
We assume that
when in other words, when large enough , is assumed. Thus, this yields or .
For the case , the following are obtained:
- (i)
- If , then
- (ii)
- If , then
Therefore,
So, we have proven that when , it is always true that
- (a)
- Again let :
Thus, we obtain
- (b)
- One more time, let
So, we gain
Also, we have
Thus, it follows that
As a result, we derive
Again, we apply , ∀
We obtain
and
We determine that
Then, we calculate
using , ∀, and calculate
i.e.,
As a consequence, we obtain the following:
- (1)
- (2)
- If according to (14), we have
- (3)
- In addition, according to (14), we have
We consider that
Lastly, we obtain (∀, ):
- (4)
The theorem is proved. □
We move ahead with a hyperbolic high-order neural network approximation.
Theorem 6.
Let , , , Then,
- (1)
- (2)
- If we obtain
considering the high rate of convergence at .
- (3)
- In addition, we obtain
it yields that , pointwise and uniformly, and
- (4)
Here, again, is our high convergence speed.
Proof.
that is,
i.e.,
with the high rate of convergence at
Using the mean value theorem, we write
for some in , for any .
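For the reader's convenience, the classical mean value theorem being invoked has the generic form below (with placeholder names $h$, $t_1$, $t_2$, since the specific function to which it is applied appears in the surrounding formulas):

```latex
h(t_2) - h(t_1) = h'(\xi)\,(t_2 - t_1) \quad \text{for some } \xi \in (t_1, t_2).
```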
Thus,
In other words, there exists such that
where
Inspired by [3,4], and applying the hyperbolic Taylor’s formula for , when , then
So, it holds that
Accordingly, we obtain
So, we obtain
where
We assume that
For large enough , let , that is, when .
Hence, or .
For , we have the following:
- (i)
- If , then
- (ii)
- If , then
So, we have verified that when , it gives
Now, let us again assume that ; then,
Thus,
If , then
Hence, it verifies that
Also, we have
Thus, it yields that
Therefore, we determine that
Also, we have that
and
We obtain that
Then, we calculate
One determines that
That is,
- (1)
- (2)
- If according to (20), we achieve that
- (3)
- Moreover, according to (20), we gain
It yields a pointwise and uniform convergence such that .
We note that
Consequently, we gain (∀, ):
- (4)
The theorem is accomplished. □
Now, we go further with a hybrid-type, i.e., hyperbolic–trigonometric high-order NN approximation.
Proof.
Finally, when , we always determine that
Inspired by [4], we employ the hyperbolic–trigonometric Taylor’s formula for :
When , then
From Theorems 5 and 6, we determine that
where
Without loss of generality, let us consider that .
Hence, or .
For , we gain the following cases:
- (i)
- If , then
Namely, if , then
- (ii)
- When , then
Again, let ; then,
Thus,
If we let , we obtain
which is why there exists such that
So,
We try to prove that
We examine that
As a result, we obtain (∀, ):
The theorem is established. □
Now, a general trigonometric result will be considered.
Theorem 8.
Let , , , and such that Then,
- (1)
- (2)
- If we obtain
is the high convergence speed in both (1) and (2).
Proof.
This proof is inspired by [4] (Chapter 3, Theorem 3.13, pp. 84–89) and by the proof of Theorem 7. □
We finalize with a general hyperbolic result.
Theorem 9.
Let , , , , and let with Then,
- (1)
- (2)
- If we obtain
Again, is the high convergence speed in both (1) and (2).
Proof.
This proof is inspired by [4] and by the proof of Theorem 7. □
6. Conclusions
As we have seen, the foundations of neural network operators rest on a rich tapestry of branches such as approximation theory and computational analysis. The sophisticated architecture of single-hidden-layer neural networks offers effective and faster solutions to many problems in engineering and science. Our study stands out in revealing complex-valued hyperbolic, trigonometric, and hybrid convergence cases with the help of an activation function that is relatively easy to train. Our key findings are these types of convergence, which attain a promising speed of approximation. Finally, we would like to point out that, to reach these convergence rate results, we worked with Ostrowski- and Opial-type inequalities, norms, and trigonometric- and hyperbolic-type Taylor formulae.
Author Contributions
Conceptualization, G.A.A. and S.K.; methodology and validation, S.K.; investigation, G.A.A.; resources and writing—original draft preparation, G.A.A. and S.K.; writing—review and editing and visualization, G.A.A. and S.K.; supervision, G.A.A.; project administration, G.A.A. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Acknowledgments
We would like to thank the reviewers who generously shared their time and opinions.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Apicella, A.; Donnarumma, F.; Isgrò, F.; Prevete, R. A survey on modern trainable activation functions. Neural Netw. 2021, 138, 14–32.
- Pappas, C.; Kovaios, S.; Moralis-Pegios, M.; Tsakyridis, A.; Giamougiannis, G.; Kirtas, M.; Van Kerrebrouck, J.; Coudyzer, G.; Yin, X.; Passalis, N.; et al. Programmable tanh-, elu-, sigmoid-, and sin-based nonlinear activation functions for neuromorphic photonics. IEEE J. Sel. Top. Quantum Electron. 2023, 29, 6101210.
- Anastassiou, G.A. Opial and Ostrowski Type Inequalities Based on Trigonometric and Hyperbolic Type Taylor Formulae. Malaya J. Mat. 2023, 11, 1–26.
- Anastassiou, G.A. Perturbed Hyperbolic Tangent Function-Activated Complex-Valued Trigonometric and Hyperbolic Neural Network High Order Approximation. In Trigonometric and Hyperbolic Generated Approximation Theory; World Scientific: Singapore; New York, NY, USA, in press.
- Ali, H.A.; Zsolt, P. Taylor-type expansions in terms of exponential polynomials. Math. Inequalities Appl. 2022, 25, 1123–1141.
- Anastassiou, G.A. Rate of convergence of some neural network operators to the unit-univariate case. J. Math. Anal. Appl. 1997, 212, 237–262.
- Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: New York, NY, USA, 1998.
- McCulloch, W.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
- Mitchell, T.M. Machine Learning; WCB-McGraw-Hill: New York, NY, USA, 1997.
- Anastassiou, G.A. Banach Space Valued Ordinary and Fractional Neural Network Approximation Based on q-Deformed and β-Parametrized Half Hyperbolic Tangent. In Parametrized, Deformed and General Neural Networks; Studies in Computational Intelligence; Springer: Cham, Switzerland, 2023; Volume 1116.