Article

Parametrized Half-Hyperbolic Tangent Function-Activated Complex-Valued Neural Network Approximation

by George A. Anastassiou 1,* and Seda Karateke 2

1 Department of Mathematical Sciences, University of Memphis, Memphis, TN 38152, USA
2 Department of Software Engineering, Faculty of Engineering and Natural Sciences, Istanbul Atlas University, Kagithane, Istanbul 34408, Türkiye
* Author to whom correspondence should be addressed.
Symmetry 2024, 16(12), 1568; https://doi.org/10.3390/sym16121568
Submission received: 21 October 2024 / Revised: 6 November 2024 / Accepted: 19 November 2024 / Published: 23 November 2024

Abstract

In this paper, we create a family of neural network (NN) operators employing a parametrized and deformed half-hyperbolic tangent function as the activation function, together with a density function produced by the same activation function. Moreover, we consider univariate quantitative approximations of complex-valued continuous functions on a compact domain by these complex-valued NN operators. Pointwise and uniform convergence results on Banach spaces are acquired through trigonometric, hyperbolic, and hybrid hyperbolic–trigonometric approaches.

1. Motivation

Under the umbrella of Artificial Intelligence (AI), neural network (NN) operators supply a vigorous structure for solving complex real-life problems in various fields of science, such as cybersecurity, healthcare, economics, medicine, psychology, and sociology. Thanks to their differentiability properties, their parameters can be optimized directly. Moreover, NN operators need correctly selected activation functions to work effectively, and the literature offers a very wide range of them [1]. In this study, we choose $\hat{h}_\alpha$, a parametrized and deformed half-hyperbolic tangent function, as the activation function. $\tanh$-like functions can be more effective in reaching the optimum solution due to their trainability [2]. One of the interesting aspects of this work is that all higher-order approximations are based on trigonometric- and hyperbolic-type Taylor formula inequalities (see [3,4,5]). Next, we emphasize the following: as is well known, the human brain has been proven medically to be a non-symmetrical organ. NNs, which try to imitate its operation, are likewise not symmetrical mathematical structures. In our paper, however, we build an approximation apparatus that is as close to symmetry as possible. Namely, the activation function we initially use is reciprocal anti-symmetric (see Proposition 1). It is the building block for our density function, which is used in our approximation NN operators, and we prove that this density function is a reciprocal symmetric function (see Equation (3)). This is our study's connection to the general "symmetry" phenomenon.
The first construction of approximation by NN operators of the "Cardaliaguet–Euvrard" and "squashing" types was made by G.A. Anastassiou in 1997. As a fruit of this construction, he also introduced to the literature quantitative estimates of the convergence speed, with rates expressed through the modulus of continuity [6].
The mathematical expression of the one-hidden-layer neural network (NN) architecture is
$$N_{\psi,n}(x) = \sum_{i=0}^{n} k_i\,\psi\left(\tilde{\omega}_i \cdot x + \varpi_i\right), \quad x \in \mathbb{R}^s,\ s \in \mathbb{N},$$
where, for $0 \le i \le n$, $\tilde{\omega}_i = \left(\tilde{\omega}_{i,1}, \tilde{\omega}_{i,2}, \ldots, \tilde{\omega}_{i,s}\right) \in \mathbb{R}^s$ are the connection weights, $k_i \in \mathbb{R}$ are the coefficients, $\tilde{\omega}_i \cdot x$ is the inner product of $\tilde{\omega}_i$ and $x$, $\varpi_i \in \mathbb{R}$ is the threshold, and $\psi$ is the activation function. For more knowledge of NNs, readers are recommended to read [7,8,9].
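For concreteness, here is a minimal NumPy sketch of this one-hidden-layer architecture (the weights, coefficients, and thresholds below are randomly generated placeholders for illustration, not values prescribed by the paper):

```python
import numpy as np

def one_hidden_layer_nn(x, weights, coeffs, thresholds, psi):
    """Evaluate N_psi(x) = sum_i k_i * psi(w_i . x + varpi_i)."""
    # weights: (n+1, s) array whose rows are the connection weights w_i
    # coeffs: (n+1,) array of coefficients k_i
    # thresholds: (n+1,) array of thresholds varpi_i
    return np.dot(coeffs, psi(weights @ x + thresholds))

# Illustrative usage with tanh as the activation psi
rng = np.random.default_rng(0)
s, n = 3, 4
x = rng.standard_normal(s)
W = rng.standard_normal((n + 1, s))
k = rng.standard_normal(n + 1)
b = rng.standard_normal(n + 1)
print(one_hidden_layer_nn(x, W, k, b, np.tanh))
```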
This paper is organized as follows: in Section 1, we present our motivation to the readers. In Section 2, step by step, we build up our activation function $\hat{h}_\alpha$; we also include our basic estimations, which form the basis for the main results. We devote Section 3 and Section 4 to the construction of the density function $A_\alpha$ and the creation of the $\mathbb{C}$-valued linear NN operators, respectively. In Section 5, we perform deformed and parametrized half-hyperbolic tangent function-activated high-order neural network approximations of complex-valued continuous functions on compact intervals of the real line. All convergences have rates expressed via the modulus of continuity of the involved functions' high-order derivatives, derived from very tight Jackson-type inequalities. We conclude with Section 6.

2. Activation Function: Parametrized Half-Hyperbolic Tangent Function $\hat{h}_\alpha$

We consider the following function, inspired by [10]:
$$\hat{h}_\alpha(x) := \frac{1 - \alpha e^{-\gamma x}}{1 + \alpha e^{-\gamma x}}, \quad x \in \mathbb{R},$$
which we call a parametrized half-hyperbolic tangent function, for $\alpha, \gamma > 0$. Also, one has $\hat{h}_\alpha(0) = \frac{1 - \alpha}{1 + \alpha}$.
Proposition 1.
We observe that
$$\hat{h}_\alpha(-x) = -\hat{h}_{1/\alpha}(x)$$
for all $x \in \mathbb{R}$; $\alpha > 0$.
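A quick numerical sanity check of the value at the origin and of this reciprocal anti-symmetry (a minimal sketch; the parameter values are arbitrary):

```python
import numpy as np

def h_hat(x, alpha, gamma=1.0):
    """Parametrized half-hyperbolic tangent h_hat_alpha(x)."""
    return (1 - alpha * np.exp(-gamma * x)) / (1 + alpha * np.exp(-gamma * x))

alpha = 2.5
x = np.linspace(-5, 5, 101)
# h_hat_alpha(0) = (1 - alpha) / (1 + alpha)
assert np.isclose(h_hat(0.0, alpha), (1 - alpha) / (1 + alpha))
# Proposition 1: h_hat_alpha(-x) = -h_hat_{1/alpha}(x)
assert np.allclose(h_hat(-x, alpha), -h_hat(x, 1 / alpha))
```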
Proposition 2.
The function $\hat{h}_\alpha$ is strictly increasing on $\mathbb{R}$, because
$$\hat{h}_\alpha'(x) = \left(\frac{1 - \alpha e^{-\gamma x}}{1 + \alpha e^{-\gamma x}}\right)' = \frac{2\alpha\gamma e^{\gamma x}}{\left(e^{\gamma x} + \alpha\right)^2} > 0,$$
for every $x \in \mathbb{R}$; $\alpha, \gamma > 0$.
Proposition 3.
When we take the second-order derivative of $\hat{h}_\alpha$, we have
$$\hat{h}_\alpha''(x) = \left(\frac{2\alpha\gamma e^{\gamma x}}{\left(e^{\gamma x} + \alpha\right)^2}\right)' = \frac{2\alpha\gamma^2 e^{\gamma x}\left(\alpha - e^{\gamma x}\right)}{\left(\alpha + e^{\gamma x}\right)^3} \in C(\mathbb{R}), \quad x \in \mathbb{R};\ \alpha, \gamma > 0.$$
According to the above calculation, the following are true:
Case 1:
if $x < \frac{\ln \alpha}{\gamma}$, then $\hat{h}_\alpha$ is strictly concave upward, with $\hat{h}_\alpha''\left(\frac{\ln \alpha}{\gamma}\right) = 0$.
Case 2:
if $x > \frac{\ln \alpha}{\gamma}$, then $\hat{h}_\alpha$ is strictly concave downward.

3. Construction of the Density Function $A_\alpha$

In this section, we aim to create the density function $A_\alpha$, and we also present its basic properties, which will be used throughout the paper. It is clear that $x + 1 > x - 1$, so $\hat{h}_\alpha(x+1) > \hat{h}_\alpha(x-1)$ by the strict monotonicity of $\hat{h}_\alpha$. Hence, for every $x \in \mathbb{R}$; $\alpha, \gamma > 0$, let us consider the following density function:
$$A_\alpha(x) := \frac{1}{4}\left[\hat{h}_\alpha(x+1) - \hat{h}_\alpha(x-1)\right] > 0.$$
Furthermore, note that
$$\lim_{x \to +\infty} A_\alpha(x) = \lim_{x \to -\infty} A_\alpha(x) = 0,$$
so the $x$-axis is a horizontal asymptote. Using Proposition 1, one can write that
$$A_\alpha(-x) = \frac{1}{4}\left[\hat{h}_\alpha(-x+1) - \hat{h}_\alpha(-x-1)\right] = \frac{1}{4}\left[-\hat{h}_{1/\alpha}(x-1) + \hat{h}_{1/\alpha}(x+1)\right] = \frac{1}{4}\left[\hat{h}_{1/\alpha}(x+1) - \hat{h}_{1/\alpha}(x-1)\right] = A_{1/\alpha}(x), \quad x \in \mathbb{R};\ \alpha > 0.$$
Thus,
$$A_\alpha(-x) = A_{1/\alpha}(x), \quad x \in \mathbb{R};\ \alpha > 0. \tag{3}$$
Remark 1.
We have
$$A_\alpha'(x) = \frac{1}{4}\left[\hat{h}_\alpha'(x+1) - \hat{h}_\alpha'(x-1)\right].$$
Then,
(i)
Let $x < \frac{\ln \alpha}{\gamma} - 1$; then $x - 1 < x + 1 < \frac{\ln \alpha}{\gamma}$, and $\hat{h}_\alpha'(x+1) > \hat{h}_\alpha'(x-1)$ (since $\hat{h}_\alpha'$ is strictly increasing there, by Proposition 3), that is, $A_\alpha'(x) > 0$. Thus, $A_\alpha$ is strictly increasing on $\left(-\infty, \frac{\ln \alpha}{\gamma} - 1\right)$.
(ii)
Let $x - 1 > \frac{\ln \alpha}{\gamma}$; then $x + 1 > x - 1 > \frac{\ln \alpha}{\gamma}$, and $\hat{h}_\alpha'(x+1) < \hat{h}_\alpha'(x-1)$, namely, $A_\alpha'(x) < 0$. Therefore, $A_\alpha$ is strictly decreasing on $\left(\frac{\ln \alpha}{\gamma} + 1, +\infty\right)$.
Remark 2.
Let $\frac{\ln \alpha}{\gamma} - 1 \le x \le \frac{\ln \alpha}{\gamma} + 1$. Then,
$$A_\alpha''(x) = \frac{1}{4}\left[\hat{h}_\alpha''(x+1) - \hat{h}_\alpha''(x-1)\right] = \frac{\alpha\gamma^2}{2}\left[\frac{e^{\gamma(x+1)}\left(\alpha - e^{\gamma(x+1)}\right)}{\left(\alpha + e^{\gamma(x+1)}\right)^3} - \frac{e^{\gamma(x-1)}\left(\alpha - e^{\gamma(x-1)}\right)}{\left(\alpha + e^{\gamma(x-1)}\right)^3}\right]. \tag{4}$$
Explicitly, according to (4), one determines that $A_\alpha''(x) \le 0$ for $x \in \left[\frac{\ln \alpha}{\gamma} - 1, \frac{\ln \alpha}{\gamma} + 1\right]$, so $A_\alpha$ is strictly concave downward on $\left[\frac{\ln \alpha}{\gamma} - 1, \frac{\ln \alpha}{\gamma} + 1\right]$. Therefore, $A_\alpha$ is a bell-shaped function on $\mathbb{R}$. Moreover, $A_\alpha''\left(\frac{\ln \alpha}{\gamma}\right) < 0$ is satisfied.
Remark 3.
The maximum value of $A_\alpha$ is
$$A_\alpha\left(\frac{\ln \alpha}{\gamma}\right) = \frac{\hat{h}_1(1)}{2}.$$
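The symmetry relation (3) and the maximum value of Remark 3 can be checked numerically as well (an illustrative sketch reusing the h_hat helper from the code above; the parameters are arbitrary):

```python
import numpy as np

def A(x, alpha, gamma=1.0):
    """Density A_alpha(x) = (1/4) * [h_hat_alpha(x+1) - h_hat_alpha(x-1)]."""
    return 0.25 * (h_hat(x + 1, alpha, gamma) - h_hat(x - 1, alpha, gamma))

alpha, gamma = 3.0, 1.2
x = np.linspace(-6, 6, 1201)
# Equation (3): A_alpha(-x) = A_{1/alpha}(x)
assert np.allclose(A(-x, alpha, gamma), A(x, 1 / alpha, gamma))
# Remark 3: the maximum sits at x = ln(alpha)/gamma with value h_hat_1(1)/2
x_star = np.log(alpha) / gamma
assert np.isclose(A(x_star, alpha, gamma), h_hat(1.0, 1.0, gamma) / 2)
```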
Theorem 1
([10]). We determine that
$$\sum_{j=-\infty}^{\infty} A_\alpha(x - j) = 1, \quad \forall x \in \mathbb{R},\ \alpha > 0.$$
Thus,
$$\sum_{j=-\infty}^{\infty} A_\alpha(nx - j) = 1, \quad \forall n \in \mathbb{N},\ \forall x \in \mathbb{R}.$$
In this manner, it holds that
$$\sum_{j=-\infty}^{\infty} A_{1/\alpha}(x - j) = 1, \quad \forall x \in \mathbb{R}.$$
However, by (3),
$$A_{1/\alpha}(x - j) = A_\alpha(j - x).$$
So,
$$\sum_{j=-\infty}^{\infty} A_\alpha(j - x) = 1, \quad \forall x \in \mathbb{R},\ \alpha > 0,$$
and
$$\sum_{j=-\infty}^{\infty} A_\alpha(j + x) = 1, \quad \forall x \in \mathbb{R},\ \alpha > 0.$$
Theorem 2
([10]). It holds that
$$\int_{-\infty}^{\infty} A_\alpha(x)\,dx = 1, \quad \alpha > 0.$$
Thus, $A_\alpha$ is a density function over $\mathbb{R}$ for any $\alpha > 0$.
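Both normalizations can be observed numerically by truncating the bilateral series and the integral (a rough sketch reusing the A helper defined above; truncation bounds and parameters are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

alpha, gamma = 2.0, 1.0
j = np.arange(-200, 201)  # truncation of the bilateral series
for x in (-0.7, 0.0, 1.3):
    # Theorem 1: sum_j A_alpha(x - j) = 1
    assert np.isclose(A(x - j, alpha, gamma).sum(), 1.0)
# Theorem 2: the integral of A_alpha over R equals 1
val, _ = quad(A, -60, 60, args=(alpha, gamma))
assert np.isclose(val, 1.0, atol=1e-6)
```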
Next, the following result is needed.
Theorem 3
([10]). Let $0 < \lambda < 1$, and $n \in \mathbb{N}$ with $n^{1-\lambda} > 2$; $\alpha, \gamma > 0$. Then,
$$\sum_{\substack{k = -\infty \\ |nx - k| \ge n^{1-\lambda}}}^{\infty} A_\alpha(nx - k) < 2\max\left(\alpha, \frac{1}{\alpha}\right) e^{2\gamma}\, e^{-\gamma n^{1-\lambda}} =: C e^{-\gamma n^{1-\lambda}}, \tag{5}$$
where $C := 2\max\left(\alpha, \frac{1}{\alpha}\right) e^{2\gamma}$.
Let $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ denote the ceiling and the integral part of a number, respectively.
Let us continue with the following conclusion:
Theorem 4
([10]). Let $x \in [a,b] \subset \mathbb{R}$ and $n \in \mathbb{N}$ so that $\lceil na \rceil \le \lfloor nb \rfloor$. For $\alpha, \gamma > 0$, we consider the number $\gamma_\alpha > \rho_0 > 0$ with $A_\alpha(\rho_0) = A_\alpha(0)$ and $\gamma_\alpha > 1$. Then,
$$\frac{1}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)} < \max\left\{\frac{1}{A_\alpha(\gamma_\alpha)}, \frac{1}{A_{1/\alpha}\left(\gamma_{1/\alpha}\right)}\right\} =: \Lambda_\alpha. \tag{6}$$
We also mention the following:
Remark 4
([10]). (i) We also notice that
$$\lim_{n \to +\infty}\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) \ne 1, \quad \text{for at least some } x \in [a,b],$$
where $\alpha > 0$.
(ii) Let $[a,b] \subset \mathbb{R}$. For large $n$, we always have $\lceil na \rceil \le \lfloor nb \rfloor$. Also, $a \le \frac{k}{n} \le b$ iff $\lceil na \rceil \le k \le \lfloor nb \rfloor$. In general, it holds that
$$\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) \le 1.$$

4. Generation of the $\mathbb{C}$-Valued Linear NN Operators

Let $\left(\mathbb{C}, |\cdot|\right)$ be the Banach space of the complex numbers over the reals.
Definition 1.
Let $f \in C([a,b],\mathbb{C})$ and $n \in \mathbb{N} : \lceil na \rceil \le \lfloor nb \rfloor$. We introduce and define the $\mathbb{C}$-valued linear neural network operators as follows:
$$\Theta_n(f,x) := \frac{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} f\left(\frac{k}{n}\right) A_\alpha(nx - k)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}, \quad x \in [a,b];\ \alpha > 0,\ \alpha \ne 1.$$
For large enough $n$, we always obtain $\lceil na \rceil \le \lfloor nb \rfloor$. Also, $a \le \frac{k}{n} \le b$ iff $\lceil na \rceil \le k \le \lfloor nb \rfloor$. The same $\Theta_n$ may be used for real-valued functions. Here, we study the pointwise and uniform convergence of $\Theta_n(f,x)$ to $f(x)$ with rates.
Clearly, here, $\Theta_n(f) \in C([a,b],\mathbb{C})$.
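A direct implementation of $\Theta_n$ is straightforward (a minimal sketch reusing the density helper A from Section 3's code; the test function and parameters are arbitrary):

```python
import numpy as np

def theta_n(f, x, n, a, b, alpha, gamma=1.0):
    """Evaluate Theta_n(f, x) of Definition 1 at a point x in [a, b]."""
    k = np.arange(np.ceil(n * a), np.floor(n * b) + 1)  # k = ceil(na), ..., floor(nb)
    w = A(n * x - k, alpha, gamma)                      # weights A_alpha(nx - k)
    return np.dot(f(k / n), w) / w.sum()

# Illustrative usage on a complex-valued test function
f = lambda t: np.exp(1j * t) + t**2
a, b, alpha = -1.0, 1.0, 2.0
for n in (10, 100, 1000):
    print(n, abs(theta_n(f, 0.5, n, a, b, alpha) - f(0.5)))
```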
For convenience, we also set
$$\Theta_n^*(f,x) := \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} f\left(\frac{k}{n}\right) A_\alpha(nx - k),$$
that is,
$$\Theta_n(f,x) = \frac{\Theta_n^*(f,x)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)},$$
so that
$$\Theta_n(f,x) - f(x) = \frac{\Theta_n^*(f,x) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}.$$
Therefore, we state that
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\left|\Theta_n^*(f,x) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\right| = \Lambda_\alpha\left|\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor}\left(f\left(\frac{k}{n}\right) - f(x)\right) A_\alpha(nx - k)\right|,$$
where $\Lambda_\alpha$ is as in (6).
We will estimate the last quantity by virtue of the classical first modulus of continuity for $f \in C([a,b],\mathbb{C})$, defined below:
$$\omega_1(f,\delta) := \sup_{\substack{x, y \in [a,b] \\ |x - y| \le \delta}}\left|f(x) - f(y)\right|, \quad \delta > 0.$$
Moreover, $f \in C([a,b],\mathbb{C})$ is equivalent to $\lim_{\delta \to 0}\omega_1(f,\delta) = 0$ (see [6]).
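Since all of the rates below are phrased through $\omega_1$, a crude grid-based estimator may help the reader experiment (an approximation sketch only; the grid size m is an arbitrary choice, and f is assumed vectorized):

```python
import numpy as np

def omega1(f, delta, a, b, m=2000):
    """Grid estimate of the first modulus of continuity of f on [a, b]."""
    t = np.linspace(a, b, m)
    vals = f(t)
    h = t[1] - t[0]
    best = 0.0
    for s in range(1, int(np.floor(delta / h)) + 1):  # shifts with |x - y| <= delta
        best = max(best, float(np.max(np.abs(vals[s:] - vals[:-s]))))
    return best
```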

5. Approximation Results

Now, we are ready to perform $\mathbb{C}$-valued neural network high-order approximations to a given function $f$, with rates. Let us start with a trigonometric approach.
Theorem 5.
Let $f \in C^2([a,b],\mathbb{C})$, $0 < \lambda < 1$, $n \in \mathbb{N} : n^{1-\lambda} > 2$, and $x \in [a,b]$. Then,
(1)
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\left\{\left|f'(x)\right|\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left|f''(x)\right|}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
(2)
If $f'(x) = f''(x) = 0$, we obtain
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\left\{\frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
Note here the high rate of convergence, $n^{-3\lambda}$.
(3)
In addition, we obtain
$$\left\|\Theta_n f - f\right\|_\infty \le \Lambda_\alpha\left\{\left\|f'\right\|_\infty\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left\|f''\right\|_\infty}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\};$$
namely, $\lim_{n \to +\infty}\Theta_n f = f$, pointwise and uniformly.
(4)
Eventually, it holds that
$$\left|\Theta_n(f,x) - f(x) - f'(x)\,\Theta_n\left(\sin(\cdot - x), x\right) - 2 f''(x)\,\Theta_n\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right)\right| \le \Lambda_\alpha\left\{\frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
and a high speed of convergence, $n^{-3\lambda}$, is gained.
Proof.  
Inspired by [3], we apply the trigonometric Taylor formula for $f \in C^2([a,b],\mathbb{C})$: for $\frac{k}{n}, x \in [a,b]$,
$$f\left(\frac{k}{n}\right) = f(x) + f'(x)\sin\left(\frac{k}{n} - x\right) + 2 f''(x)\sin^2\left(\frac{\frac{k}{n} - x}{2}\right) + \int_x^{\frac{k}{n}}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt.$$
Furthermore, it holds that
$$f\left(\frac{k}{n}\right)A_\alpha(nx - k) = f(x)A_\alpha(nx - k) + f'(x)\sin\left(\frac{k}{n} - x\right)A_\alpha(nx - k) + 2 f''(x)\sin^2\left(\frac{\frac{k}{n} - x}{2}\right)A_\alpha(nx - k) + A_\alpha(nx - k)\int_x^{\frac{k}{n}}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt.$$
So, summing over $k$, we have
$$\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} f\left(\frac{k}{n}\right)A_\alpha(nx - k) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) = f'(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\sin\left(\frac{k}{n} - x\right) + 2 f''(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\sin^2\left(\frac{\frac{k}{n} - x}{2}\right) + \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\int_x^{\frac{k}{n}}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt.$$
Hence, we gain
$$\Theta_n^*(f,x) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) = f'(x)\,\Theta_n^*\left(\sin(\cdot - x), x\right) + 2 f''(x)\,\Theta_n^*\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right) + \Gamma_n(x),$$
where
$$\Gamma_n(x) := \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\int_x^{\frac{k}{n}}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt.$$
For a fixed $k$, we set
$$\hat{\Gamma}_n := \int_x^{\frac{k}{n}}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt.$$
For large enough $n \in \mathbb{N}$, namely for $n > (b-a)^{-\frac{1}{\lambda}}$, we have $b - a > \frac{1}{n^\lambda}$. Thus, either $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$ or $\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}$.
For the case $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$, the following are obtained:
(i)
If $\frac{k}{n} \ge x$, then
$$\left|\hat{\Gamma}_n\right| = \left|\int_x^{\frac{k}{n}}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt\right| \le \int_x^{\frac{k}{n}}\omega_1\left(f'' + f, |t - x|\right)\left|\sin\left(\frac{k}{n} - t\right)\right|dt$$
(employing $|\sin x| \le |x|$, $\forall x \in \mathbb{R}$)
$$\le \int_x^{\frac{k}{n}}\omega_1\left(f'' + f, \left|\frac{k}{n} - x\right|\right)\left(\frac{k}{n} - t\right)dt = \omega_1\left(f'' + f, \left|\frac{k}{n} - x\right|\right)\frac{\left(\frac{k}{n} - x\right)^2}{2} \le \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}},$$
namely,
$$\left|\hat{\Gamma}_n\right| \le \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}}.$$
(ii)
If $\frac{k}{n} < x$, then
$$\left|\hat{\Gamma}_n\right| = \left|\int_{\frac{k}{n}}^{x}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt\right| \le \int_{\frac{k}{n}}^{x}\left|\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right|\left|\sin\left(\frac{k}{n} - t\right)\right|dt \le \int_{\frac{k}{n}}^{x}\omega_1\left(f'' + f, \left|x - \frac{k}{n}\right|\right)\left(t - \frac{k}{n}\right)dt = \omega_1\left(f'' + f, \left|x - \frac{k}{n}\right|\right)\frac{\left(x - \frac{k}{n}\right)^2}{2} \le \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}}.$$
Therefore,
$$\left|\hat{\Gamma}_n\right| \le \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}}.$$
So, we have proven that when $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$, it is always true that
$$\left|\hat{\Gamma}_n\right| \le \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}}.$$
Next, we bound $\hat{\Gamma}_n$ without any closeness restriction on $\frac{k}{n}$ and $x$.
(a)
Again, let $\frac{k}{n} \ge x$:
$$\left|\hat{\Gamma}_n\right| = \left|\int_x^{\frac{k}{n}}\left[\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right]\sin\left(\frac{k}{n} - t\right)dt\right| \le \int_x^{\frac{k}{n}}\left|\left(f''(t) + f(t)\right) - \left(f''(x) + f(x)\right)\right|\left|\sin\left(\frac{k}{n} - t\right)\right|dt$$
(employing $|\sin x| \le |x|$, $\forall x \in \mathbb{R}$)
$$\le 2\left\|f'' + f\right\|_\infty\int_x^{\frac{k}{n}}\left(\frac{k}{n} - t\right)dt = 2\left\|f'' + f\right\|_\infty\frac{\left(\frac{k}{n} - x\right)^2}{2} \le \left\|f'' + f\right\|_\infty(b-a)^2.$$
Thus, we obtain
$$\left|\hat{\Gamma}_n\right| \le \left\|f'' + f\right\|_\infty(b-a)^2.$$
(b)
One more time, let $\frac{k}{n} < x$:
$$\left|\hat{\Gamma}_n\right| = \left|\int_{\frac{k}{n}}^{x}\left[\left(f''(x) + f(x)\right) - \left(f''(t) + f(t)\right)\right]\sin\left(\frac{k}{n} - t\right)dt\right| \le \int_{\frac{k}{n}}^{x}\left|\left(f''(x) + f(x)\right) - \left(f''(t) + f(t)\right)\right|\left|\sin\left(\frac{k}{n} - t\right)\right|dt \le 2\left\|f'' + f\right\|_\infty\int_{\frac{k}{n}}^{x}\left(t - \frac{k}{n}\right)dt = \left\|f'' + f\right\|_\infty\left(x - \frac{k}{n}\right)^2 \le \left\|f'' + f\right\|_\infty(b-a)^2.$$
So, we gain
$$\left|\hat{\Gamma}_n\right| \le \left\|f'' + f\right\|_\infty(b-a)^2.$$
Also, we can decompose
$$\Gamma_n(x) = \sum_{\substack{k = \lceil na \rceil \\ \left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}}^{\lfloor nb \rfloor} A_\alpha(nx - k)\,\hat{\Gamma}_n + \sum_{\substack{k = \lceil na \rceil \\ \left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}}^{\lfloor nb \rfloor} A_\alpha(nx - k)\,\hat{\Gamma}_n.$$
Thus, it follows that
$$\left|\Gamma_n(x)\right| \le \sum_{\substack{k = \lceil na \rceil \\ \left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}}^{\lfloor nb \rfloor} A_\alpha(nx - k)\left|\hat{\Gamma}_n\right| + \sum_{\substack{k = \lceil na \rceil \\ \left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}}^{\lfloor nb \rfloor} A_\alpha(nx - k)\left|\hat{\Gamma}_n\right| \le \sum_{\substack{k = \lceil na \rceil \\ \left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}}^{\lfloor nb \rfloor} A_\alpha(nx - k)\,\frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty(b-a)^2\sum_{\substack{k = \lceil na \rceil \\ \left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}}^{\lfloor nb \rfloor} A_\alpha(nx - k)$$
(by Theorem 1 and by (5), i.e., Theorem 3)
$$\le \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}.$$
As a result, we derive
$$\left|\Gamma_n(x)\right| \le \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}.$$
Again, we apply $|\sin x| \le |x|$, $\forall x \in \mathbb{R}$. We have
$$\Theta_n^*\left(\sin(\cdot - x), x\right) = \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\sin\left(\frac{k}{n} - x\right),$$
and
$$\left|\Theta_n^*\left(\sin(\cdot - x), x\right)\right| \le \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\left|\sin\left(\frac{k}{n} - x\right)\right| \le \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left|\frac{k}{n} - x\right| + \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda} + (b-a)\sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k) \overset{(5)}{\le} \frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}.$$
We determine that
$$\left|\Theta_n^*\left(\sin(\cdot - x), x\right)\right| \le \frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}.$$
Then, we calculate
$$\Theta_n^*\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right) = \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\sin^2\left(\frac{\frac{k}{n} - x}{2}\right);$$
using $|\sin x| \le |x|$, $\forall x \in \mathbb{R}$, we calculate
$$\Theta_n^*\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right) \le \frac{1}{4}\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\left(\frac{k}{n} - x\right)^2 = \frac{1}{4}\left[\sum_{\substack{k:\,\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left(\frac{k}{n} - x\right)^2 + \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left(\frac{k}{n} - x\right)^2\right] \le \frac{1}{4}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right),$$
i.e.,
$$\Theta_n^*\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right) \le \frac{1}{4}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right).$$
As a consequence, combining the above estimates, we obtain the following:
(1)
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\left\{\left|f'(x)\right|\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left|f''(x)\right|}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
(2)
If $f'(x) = f''(x) = 0$, the first two terms vanish and (1) yields
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\left\{\frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
Here, we keep in mind that $n^{-3\lambda}$ is a high convergence rate.
(3)
In addition, taking the supremum over $x \in [a,b]$ in (1), we have
$$\left\|\Theta_n f - f\right\|_\infty \le \Lambda_\alpha\left\{\left\|f'\right\|_\infty\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left\|f''\right\|_\infty}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
We state that the convergence $\lim_{n \to +\infty}\Theta_n f = f$ is both pointwise and uniform.
We consider that
$$\Theta_n(f,x) - f(x) - f'(x)\,\Theta_n\left(\sin(\cdot - x), x\right) - 2 f''(x)\,\Theta_n\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right) = \frac{\Theta_n^*(f,x)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)} - f(x) - \frac{f'(x)\,\Theta_n^*\left(\sin(\cdot - x), x\right)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)} - \frac{2 f''(x)\,\Theta_n^*\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)} = \frac{1}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}\left[\Theta_n^*(f,x) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) - f'(x)\,\Theta_n^*\left(\sin(\cdot - x), x\right) - 2 f''(x)\,\Theta_n^*\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right)\right] = \frac{\Gamma_n(x)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}.$$
Lastly, we obtain ($\forall x \in [a,b]$, $\forall n \in \mathbb{N}$):
(4)
$$\left|\Theta_n(f,x) - f(x) - f'(x)\,\Theta_n\left(\sin(\cdot - x), x\right) - 2 f''(x)\,\Theta_n\left(\sin^2\left(\frac{\cdot - x}{2}\right), x\right)\right| \overset{(6)}{\le} \Lambda_\alpha\left|\Gamma_n(x)\right| \le \Lambda_\alpha\left\{\frac{\omega_1\left(f'' + f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' + f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
The theorem is proved. □
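To make the statement tangible, a small numerical experiment (an illustrative sketch, not part of the formal results, reusing theta_n from Section 4's code) observes the decay of the error for $f(t) = e^{it}$; for this choice $f'' + f = 0$, so the $\omega_1$ remainder of Theorem 5 vanishes and the error is governed by the first two terms:

```python
import numpy as np

f = lambda t: np.exp(1j * t)  # satisfies f'' + f = 0
x, a, b, alpha = 0.3, -1.0, 1.0, 2.0
errors = [abs(theta_n(f, x, n, a, b, alpha) - f(x)) for n in (50, 100, 200, 400, 800)]
# Successive error ratios indicate the empirical order of convergence
print([e1 / e2 for e1, e2 in zip(errors, errors[1:])])
```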
We move ahead with a hyperbolic high-order neural network approximation.
Theorem 6.
Let $f \in C^2([a,b],\mathbb{C})$, $0 < \lambda < 1$, $n \in \mathbb{N} : n^{1-\lambda} > 2$, and $x \in [a,b]$. Then,
(1)
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\cosh(b-a)\left\{\left|f'(x)\right|\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left|f''(x)\right|}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
(2)
If $f'(x) = f''(x) = 0$, we obtain
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\cosh(b-a)\left\{\frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
considering the high rate of convergence, $n^{-3\lambda}$.
(3)
In addition, we obtain
$$\left\|\Theta_n f - f\right\|_\infty \le \Lambda_\alpha\cosh(b-a)\left\{\left\|f'\right\|_\infty\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left\|f''\right\|_\infty}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\};$$
it yields that $\lim_{n \to +\infty}\Theta_n f = f$, pointwise and uniformly, and
(4)
$$\left|\Theta_n(f,x) - f(x) - f'(x)\,\Theta_n\left(\sinh(\cdot - x), x\right) - 2 f''(x)\,\Theta_n\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right)\right| \le \Lambda_\alpha\cosh(b-a)\left\{\frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
Here, again, $n^{-3\lambda}$ is our high convergence speed.
Proof. 
Using the mean value theorem, we write
$$\sinh x = \sinh x - \sinh 0 = (\cosh \xi)\,(x - 0),$$
for some $\xi$ between $0$ and $x$, for any $x \in \mathbb{R}$. Thus,
$$\left|\sinh x\right| \le \left\|\cosh\right\|_{\infty,[-(b-a),\,b-a]}\left|x\right|, \quad \forall x \in [-(b-a), b-a].$$
In other words, there exists $M \ge 1$ such that
$$\left|\sinh x\right| \le M\left|x\right|, \quad \forall x \in [-(b-a), b-a], \tag{15}$$
where $M := \left\|\cosh\right\|_{\infty,[-(b-a),\,b-a]} = \cosh(b-a)$.
Inspired by [3,4], we apply the hyperbolic Taylor formula for $f \in C^2([a,b],\mathbb{C})$: when $\frac{k}{n}, x \in [a,b]$, then
$$f\left(\frac{k}{n}\right) = f(x) + f'(x)\sinh\left(\frac{k}{n} - x\right) + 2 f''(x)\sinh^2\left(\frac{\frac{k}{n} - x}{2}\right) + \int_x^{\frac{k}{n}}\left[\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right]\sinh\left(\frac{k}{n} - t\right)dt.$$
Multiplying by $A_\alpha(nx - k)$ and summing over $k = \lceil na \rceil, \ldots, \lfloor nb \rfloor$, exactly as in the proof of Theorem 5, we obtain
$$\Theta_n^*(f,x) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) = f'(x)\,\Theta_n^*\left(\sinh(\cdot - x), x\right) + 2 f''(x)\,\Theta_n^*\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right) + \Gamma_n(x),$$
where
$$\Gamma_n(x) := \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\int_x^{\frac{k}{n}}\left[\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right]\sinh\left(\frac{k}{n} - t\right)dt.$$
For a fixed $k$, we set
$$\hat{\Gamma}_n := \int_x^{\frac{k}{n}}\left[\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right]\sinh\left(\frac{k}{n} - t\right)dt.$$
For large enough $n \in \mathbb{N}$, let $b - a > \frac{1}{n^\lambda}$, that is, $n > (b-a)^{-\frac{1}{\lambda}}$.
Hence, either $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$ or $\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}$.
For $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$, we have the following:
(i)
If $\frac{k}{n} \ge x$, then
$$\left|\hat{\Gamma}_n\right| = \left|\int_x^{\frac{k}{n}}\left[\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right]\sinh\left(\frac{k}{n} - t\right)dt\right| \le \int_x^{\frac{k}{n}}\omega_1\left(f'' - f, |t - x|\right)\left|\sinh\left(\frac{k}{n} - t\right)\right|dt \overset{(15)}{\le} \int_x^{\frac{k}{n}}\omega_1\left(f'' - f, \left|\frac{k}{n} - x\right|\right) M\left(\frac{k}{n} - t\right)dt = M\,\omega_1\left(f'' - f, \left|\frac{k}{n} - x\right|\right)\frac{\left(\frac{k}{n} - x\right)^2}{2} \le \frac{M\,\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}},$$
that is,
$$\left|\hat{\Gamma}_n\right| \le \frac{M\,\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}}.$$
(ii)
If $\frac{k}{n} < x$, then
$$\left|\hat{\Gamma}_n\right| = \left|\int_{\frac{k}{n}}^{x}\left[\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right]\sinh\left(\frac{k}{n} - t\right)dt\right| \le \int_{\frac{k}{n}}^{x}\left|\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right|\left|\sinh\left(\frac{k}{n} - t\right)\right|dt \le M\,\omega_1\left(f'' - f, \left|x - \frac{k}{n}\right|\right)\int_{\frac{k}{n}}^{x}\left(t - \frac{k}{n}\right)dt = M\,\omega_1\left(f'' - f, \left|x - \frac{k}{n}\right|\right)\frac{\left(x - \frac{k}{n}\right)^2}{2} \le \frac{M\,\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}},$$
i.e.,
$$\left|\hat{\Gamma}_n\right| \le \frac{M\,\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}}.$$
So, we have verified that when $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$, it holds that
$$\left|\hat{\Gamma}_n\right| \le \frac{M\,\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}}.$$
Now, we bound $\hat{\Gamma}_n$ in general. Let us again assume that $\frac{k}{n} \ge x$; then,
$$\left|\hat{\Gamma}_n\right| = \left|\int_x^{\frac{k}{n}}\left[\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right]\sinh\left(\frac{k}{n} - t\right)dt\right| \le \int_x^{\frac{k}{n}}\left|\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right|\left|\sinh\left(\frac{k}{n} - t\right)\right|dt \le 2 M\left\|f'' - f\right\|_\infty\int_x^{\frac{k}{n}}\left(\frac{k}{n} - t\right)dt = 2 M\left\|f'' - f\right\|_\infty\frac{\left(\frac{k}{n} - x\right)^2}{2} \le M\left\|f'' - f\right\|_\infty(b-a)^2.$$
Thus,
$$\left|\hat{\Gamma}_n\right| \le M\left\|f'' - f\right\|_\infty(b-a)^2.$$
If $\frac{k}{n} < x$, then
$$\left|\hat{\Gamma}_n\right| = \left|\int_{\frac{k}{n}}^{x}\left[\left(f''(t) - f(t)\right) - \left(f''(x) - f(x)\right)\right]\sinh\left(\frac{k}{n} - t\right)dt\right| \le 2 M\left\|f'' - f\right\|_\infty\int_{\frac{k}{n}}^{x}\left(t - \frac{k}{n}\right)dt = 2 M\left\|f'' - f\right\|_\infty\frac{\left(x - \frac{k}{n}\right)^2}{2} \le M\left\|f'' - f\right\|_\infty(b-a)^2.$$
Hence, it is verified that
$$\left|\hat{\Gamma}_n\right| \le M\left\|f'' - f\right\|_\infty(b-a)^2.$$
Also, we have
$$\Gamma_n(x) = \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}} A_\alpha(nx - k)\,\hat{\Gamma}_n + \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k)\,\hat{\Gamma}_n.$$
Thus, it yields that
$$\left|\Gamma_n(x)\right| \le \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}} A_\alpha(nx - k)\,\frac{M\,\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + M\left\|f'' - f\right\|_\infty(b-a)^2\sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k) \le \frac{M\,\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + M\left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}} \quad \text{(by Theorem 1 and (5))}.$$
Therefore, we determine that
$$\left|\Gamma_n(x)\right| \le M\left[\frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right].$$
Also, we have that
$$\Theta_n^*\left(\sinh(\cdot - x), x\right) = \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\sinh\left(\frac{k}{n} - x\right),$$
and
$$\left|\Theta_n^*\left(\sinh(\cdot - x), x\right)\right| \le \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\left|\sinh\left(\frac{k}{n} - x\right)\right| \overset{(15)}{\le} M\left[\sum_{\substack{k:\,\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left|\frac{k}{n} - x\right| + \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left|\frac{k}{n} - x\right|\right] \le M\left[\frac{1}{n^\lambda} + (b-a)\sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k)\right] \overset{(5)}{\le} M\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right).$$
We obtain that
$$\left|\Theta_n^*\left(\sinh(\cdot - x), x\right)\right| \le M\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right).$$
Then, we calculate
$$\Theta_n^*\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right) = \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\sinh^2\left(\frac{\frac{k}{n} - x}{2}\right).$$
One determines that
$$\Theta_n^*\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right) \le \frac{M}{4}\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\left(\frac{k}{n} - x\right)^2 = \frac{M}{4}\left[\sum_{\substack{k:\,\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left(\frac{k}{n} - x\right)^2 + \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left(\frac{k}{n} - x\right)^2\right] \le \frac{M}{4}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right).$$
That is,
$$\Theta_n^*\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right) \le \frac{M}{4}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right).$$
Combining the above estimates, we acquire the following:
(1)
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha M\left\{\left|f'(x)\right|\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left|f''(x)\right|}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
(2)
If $f'(x) = f''(x) = 0$, it follows from (1) that
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha M\left\{\frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
with the high rate of convergence $n^{-3\lambda}$.
(3)
Moreover, taking the supremum over $x \in [a,b]$ in (1), we gain
$$\left\|\Theta_n f - f\right\|_\infty \le \Lambda_\alpha M\left\{\left\|f'\right\|_\infty\left(\frac{1}{n^\lambda} + (b-a)\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\left\|f''\right\|_\infty}{2}\left(\frac{1}{n^{2\lambda}} + (b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right) + \frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
This yields pointwise and uniform convergence, $\lim_{n \to +\infty}\Theta_n f = f$.
We note that
$$\Theta_n(f,x) - f(x) - f'(x)\,\Theta_n\left(\sinh(\cdot - x), x\right) - 2 f''(x)\,\Theta_n\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right) = \frac{1}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}\left[\Theta_n^*(f,x) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) - f'(x)\,\Theta_n^*\left(\sinh(\cdot - x), x\right) - 2 f''(x)\,\Theta_n^*\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right)\right] = \frac{\Gamma_n(x)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}.$$
Consequently, we gain ($\forall x \in [a,b]$, $\forall n \in \mathbb{N}$):
(4)
$$\left|\Theta_n(f,x) - f(x) - f'(x)\,\Theta_n\left(\sinh(\cdot - x), x\right) - 2 f''(x)\,\Theta_n\left(\sinh^2\left(\frac{\cdot - x}{2}\right), x\right)\right| \overset{(6)}{\le} \Lambda_\alpha\left|\Gamma_n(x)\right| \le \Lambda_\alpha M\left\{\frac{\omega_1\left(f'' - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f'' - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
The theorem is accomplished. □
Now, we go further with a hybrid-type, i.e., hyperbolic–trigonometric high-order NN approximation.
Theorem 7.
Let $f \in C^4([a,b],\mathbb{C})$, $0 < \lambda < 1$, $n \in \mathbb{N} : n^{1-\lambda} > 2$, and $x \in [a,b]$. Then,
(1)
$$\left|\Theta_n(f,x) - f(x) - \frac{f'(x)}{2}\,\Theta_n\left(\sinh(\cdot - x) + \sin(\cdot - x), x\right) - \frac{f''(x)}{2}\,\Theta_n\left(\cosh(\cdot - x) - \cos(\cdot - x), x\right) - \frac{f^{(3)}(x)}{2}\,\Theta_n\left(\sinh(\cdot - x) - \sin(\cdot - x), x\right) - f^{(4)}(x)\,\Theta_n\left(\sinh^2\left(\frac{\cdot - x}{2}\right) - \sin^2\left(\frac{\cdot - x}{2}\right), x\right)\right| \le \Lambda_\alpha\left(\frac{\cosh(b-a) + 1}{2}\right)\left\{\frac{\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f^{(4)} - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
(2)
If $f^{(i)}(x) = 0$, $i = 1, 2, 3, 4$, we obtain
$$\left|\Theta_n(f,x) - f(x)\right| \le \Lambda_\alpha\left(\frac{\cosh(b-a) + 1}{2}\right)\left\{\frac{\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f^{(4)} - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
and in the latter estimate, $n^{-3\lambda}$ appears as the highest speed of convergence.
Proof. 
Inspired by [4], we employ the hyperbolic–trigonometric Taylor formula for $f \in C^4([a,b],\mathbb{C})$: when $\frac{k}{n}, x \in [a,b]$, then
$$f\left(\frac{k}{n}\right) - f(x) - \frac{f'(x)}{2}\left[\sinh\left(\frac{k}{n} - x\right) + \sin\left(\frac{k}{n} - x\right)\right] - \frac{f''(x)}{2}\left[\cosh\left(\frac{k}{n} - x\right) - \cos\left(\frac{k}{n} - x\right)\right] - \frac{f^{(3)}(x)}{2}\left[\sinh\left(\frac{k}{n} - x\right) - \sin\left(\frac{k}{n} - x\right)\right] - f^{(4)}(x)\left[\sinh^2\left(\frac{\frac{k}{n} - x}{2}\right) - \sin^2\left(\frac{\frac{k}{n} - x}{2}\right)\right] = \frac{1}{2}\int_x^{\frac{k}{n}}\left[\left(f^{(4)}(t) - f(t)\right) - \left(f^{(4)}(x) - f(x)\right)\right]\left[\sinh\left(\frac{k}{n} - t\right) - \sin\left(\frac{k}{n} - t\right)\right]dt =: \Phi\left(\frac{k}{n}, x\right).$$
Arguing as in the proofs of Theorems 5 and 6 (multiplying by $A_\alpha(nx - k)$ and summing over $k$), we determine that
$$\Theta_n^*(f,x) - f(x)\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k) - \frac{f'(x)}{2}\,\Theta_n^*\left(\sinh(\cdot - x) + \sin(\cdot - x), x\right) - \frac{f''(x)}{2}\,\Theta_n^*\left(\cosh(\cdot - x) - \cos(\cdot - x), x\right) - \frac{f^{(3)}(x)}{2}\,\Theta_n^*\left(\sinh(\cdot - x) - \sin(\cdot - x), x\right) - f^{(4)}(x)\,\Theta_n^*\left(\sinh^2\left(\frac{\cdot - x}{2}\right) - \sin^2\left(\frac{\cdot - x}{2}\right), x\right) = \hat{\Phi}_n(x),$$
where
$$\hat{\Phi}_n(x) := \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\,\Phi\left(\frac{k}{n}, x\right).$$
Without loss of generality, let us consider $n > (b-a)^{-\frac{1}{\lambda}}$, so that $b - a > \frac{1}{n^\lambda}$.
Hence, either $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$ or $\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}$.
For $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$, we gain the following cases:
(i)
If $\frac{k}{n} \ge x$, then
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| = \frac{1}{2}\left|\int_x^{\frac{k}{n}}\left[\left(f^{(4)}(t) - f(t)\right) - \left(f^{(4)}(x) - f(x)\right)\right]\left[\sinh\left(\frac{k}{n} - t\right) - \sin\left(\frac{k}{n} - t\right)\right]dt\right| \le \frac{1}{2}\int_x^{\frac{k}{n}}\omega_1\left(f^{(4)} - f, |t - x|\right)\left[\left|\sinh\left(\frac{k}{n} - t\right)\right| + \left|\sin\left(\frac{k}{n} - t\right)\right|\right]dt \le \frac{\omega_1\left(f^{(4)} - f, \left|\frac{k}{n} - x\right|\right)}{2}\int_x^{\frac{k}{n}}\left[\cosh(b-a)\left(\frac{k}{n} - t\right) + \left(\frac{k}{n} - t\right)\right]dt = \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \left|\frac{k}{n} - x\right|\right)}{2}\int_x^{\frac{k}{n}}\left(\frac{k}{n} - t\right)dt = \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \left|\frac{k}{n} - x\right|\right)}{4}\left(\frac{k}{n} - x\right)^2 \le \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{4 n^{2\lambda}}.$$
Namely, if $\frac{k}{n} \ge x$, then
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| \le \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{4 n^{2\lambda}}.$$
(ii)
When $\frac{k}{n} < x$, then, similarly,
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| = \frac{1}{2}\left|\int_{\frac{k}{n}}^{x}\left[\left(f^{(4)}(t) - f(t)\right) - \left(f^{(4)}(x) - f(x)\right)\right]\left[\sinh\left(\frac{k}{n} - t\right) - \sin\left(\frac{k}{n} - t\right)\right]dt\right| \le \frac{\omega_1\left(f^{(4)} - f, \left|x - \frac{k}{n}\right|\right)}{2}\int_{\frac{k}{n}}^{x}\left[\cosh(b-a)\left(t - \frac{k}{n}\right) + \left(t - \frac{k}{n}\right)\right]dt = \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \left|x - \frac{k}{n}\right|\right)}{4}\left(x - \frac{k}{n}\right)^2 \le \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{4 n^{2\lambda}}.$$
Finally, when $\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}$, we always determine that
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| \le \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{4 n^{2\lambda}}.$$
Next, we bound $\Phi$ in general. Again, let $\frac{k}{n} \ge x$; then,
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| \le \frac{1}{2}\int_x^{\frac{k}{n}}\left|\left(f^{(4)}(t) - f(t)\right) - \left(f^{(4)}(x) - f(x)\right)\right|\left[\left|\sinh\left(\frac{k}{n} - t\right)\right| + \left|\sin\left(\frac{k}{n} - t\right)\right|\right]dt \le \left\|f^{(4)} - f\right\|_\infty\int_x^{\frac{k}{n}}\left[\cosh(b-a)\left(\frac{k}{n} - t\right) + \left(\frac{k}{n} - t\right)\right]dt = \left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)\frac{\left(\frac{k}{n} - x\right)^2}{2} \le \frac{\left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)(b-a)^2}{2}.$$
Thus,
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| \le \frac{\left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)(b-a)^2}{2}.$$
If we let $\frac{k}{n} < x$, we obtain analogously
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| \le \left\|f^{(4)} - f\right\|_\infty\int_{\frac{k}{n}}^{x}\left[\cosh(b-a)\left(t - \frac{k}{n}\right) + \left(t - \frac{k}{n}\right)\right]dt = \left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)\frac{\left(x - \frac{k}{n}\right)^2}{2} \le \frac{\left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)(b-a)^2}{2}.$$
Hence,
$$\left|\Phi\left(\frac{k}{n}, x\right)\right| \le \frac{\left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)(b-a)^2}{2}$$
holds in all cases.
So,
$$\left|\hat{\Phi}_n(x)\right| \le \sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)\left|\Phi\left(\frac{k}{n}, x\right)\right| = \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| \le \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left|\Phi\left(\frac{k}{n}, x\right)\right| + \sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k)\left|\Phi\left(\frac{k}{n}, x\right)\right| \le \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{4 n^{2\lambda}} + \frac{\left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)(b-a)^2}{2}\sum_{\substack{k:\,\left|\frac{k}{n} - x\right| > \frac{1}{n^\lambda}}} A_\alpha(nx - k) \overset{(5)}{\le} \frac{\left(\cosh(b-a) + 1\right)\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{4 n^{2\lambda}} + \frac{\left\|f^{(4)} - f\right\|_\infty\left(\cosh(b-a) + 1\right)(b-a)^2}{2}\,C e^{-\gamma n^{1-\lambda}}.$$
We have thus proven that
$$\left|\hat{\Phi}_n(x)\right| \le \left(\frac{\cosh(b-a) + 1}{2}\right)\left\{\frac{\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f^{(4)} - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
We examine that
$$\Theta_n(f,x) - f(x) - \frac{f'(x)}{2}\,\Theta_n\left(\sinh(\cdot - x) + \sin(\cdot - x), x\right) - \frac{f''(x)}{2}\,\Theta_n\left(\cosh(\cdot - x) - \cos(\cdot - x), x\right) - \frac{f^{(3)}(x)}{2}\,\Theta_n\left(\sinh(\cdot - x) - \sin(\cdot - x), x\right) - f^{(4)}(x)\,\Theta_n\left(\sinh^2\left(\frac{\cdot - x}{2}\right) - \sin^2\left(\frac{\cdot - x}{2}\right), x\right) = \frac{\hat{\Phi}_n(x)}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)}.$$
As a result, we obtain ($\forall x \in [a,b]$, $\forall n \in \mathbb{N}$):
$$\left|\Theta_n(f,x) - f(x) - \frac{f'(x)}{2}\,\Theta_n\left(\sinh(\cdot - x) + \sin(\cdot - x), x\right) - \frac{f''(x)}{2}\,\Theta_n\left(\cosh(\cdot - x) - \cos(\cdot - x), x\right) - \frac{f^{(3)}(x)}{2}\,\Theta_n\left(\sinh(\cdot - x) - \sin(\cdot - x), x\right) - f^{(4)}(x)\,\Theta_n\left(\sinh^2\left(\frac{\cdot - x}{2}\right) - \sin^2\left(\frac{\cdot - x}{2}\right), x\right)\right| = \frac{\left|\hat{\Phi}_n(x)\right|}{\sum_{k=\lceil na \rceil}^{\lfloor nb \rfloor} A_\alpha(nx - k)} \overset{(6)}{\le} \Lambda_\alpha\left|\hat{\Phi}_n(x)\right| \le \Lambda_\alpha\left(\frac{\cosh(b-a) + 1}{2}\right)\left\{\frac{\omega_1\left(f^{(4)} - f, \frac{1}{n^\lambda}\right)}{2 n^{2\lambda}} + \left\|f^{(4)} - f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
The theorem is established. □
Now, a general trigonometric result will be considered.
Theorem 8.
Let $f \in C^4([a,b],\mathbb{C})$, $0 < \lambda < 1$, $n \in \mathbb{N} : n^{1-\lambda} > 2$, $x \in [a,b]$, and $\tilde{\xi}, \bar{\xi} \in \mathbb{R}$ such that $\tilde{\xi}\,\bar{\xi}\left(\tilde{\xi}^2 - \bar{\xi}^2\right) \ne 0$. Then,
(1)
$$\left|\Theta_n(f,x) - f(x) - \frac{f'(x)}{\tilde{\xi}\,\bar{\xi}\left(\bar{\xi}^2 - \tilde{\xi}^2\right)}\,\Theta_n\left(\bar{\xi}^3\sin\left(\tilde{\xi}(\cdot - x)\right) - \tilde{\xi}^3\sin\left(\bar{\xi}(\cdot - x)\right), x\right) - \frac{f''(x)}{\bar{\xi}^2 - \tilde{\xi}^2}\,\Theta_n\left(\cos\left(\tilde{\xi}(\cdot - x)\right) - \cos\left(\bar{\xi}(\cdot - x)\right), x\right) - \frac{f^{(3)}(x)}{\tilde{\xi}\,\bar{\xi}\left(\bar{\xi}^2 - \tilde{\xi}^2\right)}\,\Theta_n\left(\bar{\xi}\sin\left(\tilde{\xi}(\cdot - x)\right) - \tilde{\xi}\sin\left(\bar{\xi}(\cdot - x)\right), x\right) - \frac{2\left(f^{(4)}(x) + \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f''(x)\right)}{\left(\tilde{\xi}\,\bar{\xi}\right)^2\left(\bar{\xi}^2 - \tilde{\xi}^2\right)}\,\Theta_n\left(\bar{\xi}^2\sin^2\left(\frac{\tilde{\xi}(\cdot - x)}{2}\right) - \tilde{\xi}^2\sin^2\left(\frac{\bar{\xi}(\cdot - x)}{2}\right), x\right)\right| \le \frac{\Lambda_\alpha}{\bar{\xi}^2 - \tilde{\xi}^2}\left\{\frac{\omega_1\left(f^{(4)} + \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f, \frac{1}{n^\lambda}\right)}{n^{2\lambda}} + 2\left\|f^{(4)} + \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
(2)
If $f^{(i)}(x) = 0$, $i = 1, 2, 3, 4$, we obtain
$$\left|\Theta_n(f,x) - f(x)\right| \le \frac{\Lambda_\alpha}{\bar{\xi}^2 - \tilde{\xi}^2}\left\{\frac{\omega_1\left(f^{(4)} + \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f, \frac{1}{n^\lambda}\right)}{n^{2\lambda}} + 2\left\|f^{(4)} + \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
Here, $n^{-3\lambda}$ is the high convergence speed in both (1) and (2).
Proof. 
This proof is inspired by [4] (Chapter 3, Theorem 3.13, pp. 84–89), and also the proof of Theorem 7. □
We finalize with a general hyperbolic result.
Theorem 9.
Let $f \in C^4([a,b],\mathbb{C})$, $0 < \lambda < 1$, $n \in \mathbb{N} : n^{1-\lambda} > 2$, $x \in [a,b]$, and let $\tilde{\xi}, \bar{\xi} \in \mathbb{R}$ with $\tilde{\xi}\,\bar{\xi}\left(\tilde{\xi}^2 - \bar{\xi}^2\right) \ne 0$. Then,
(1)
$$\left|\Theta_n(f,x) - f(x) - \frac{f'(x)}{\tilde{\xi}\,\bar{\xi}\left(\bar{\xi}^2 - \tilde{\xi}^2\right)}\,\Theta_n\left(\bar{\xi}^3\sinh\left(\tilde{\xi}(\cdot - x)\right) - \tilde{\xi}^3\sinh\left(\bar{\xi}(\cdot - x)\right), x\right) - \frac{f''(x)}{\bar{\xi}^2 - \tilde{\xi}^2}\,\Theta_n\left(\cosh\left(\bar{\xi}(\cdot - x)\right) - \cosh\left(\tilde{\xi}(\cdot - x)\right), x\right) - \frac{f^{(3)}(x)}{\tilde{\xi}\,\bar{\xi}\left(\bar{\xi}^2 - \tilde{\xi}^2\right)}\,\Theta_n\left(\tilde{\xi}\sinh\left(\bar{\xi}(\cdot - x)\right) - \bar{\xi}\sinh\left(\tilde{\xi}(\cdot - x)\right), x\right) - \frac{2\left(f^{(4)}(x) - \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f''(x)\right)}{\left(\tilde{\xi}\,\bar{\xi}\right)^2\left(\bar{\xi}^2 - \tilde{\xi}^2\right)}\,\Theta_n\left(\tilde{\xi}^2\sinh^2\left(\frac{\bar{\xi}(\cdot - x)}{2}\right) - \bar{\xi}^2\sinh^2\left(\frac{\tilde{\xi}(\cdot - x)}{2}\right), x\right)\right| \le \frac{\Lambda_\alpha\cosh(b-a)}{\bar{\xi}^2 - \tilde{\xi}^2}\left\{\frac{\omega_1\left(f^{(4)} - \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f, \frac{1}{n^\lambda}\right)}{n^{2\lambda}} + 2\left\|f^{(4)} - \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\},$$
(2)
If $f^{(i)}(x) = 0$, $i = 1, 2, 3, 4$, we obtain
$$\left|\Theta_n(f,x) - f(x)\right| \le \frac{\Lambda_\alpha\cosh(b-a)}{\bar{\xi}^2 - \tilde{\xi}^2}\left\{\frac{\omega_1\left(f^{(4)} - \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f, \frac{1}{n^\lambda}\right)}{n^{2\lambda}} + 2\left\|f^{(4)} - \left(\tilde{\xi}^2 + \bar{\xi}^2\right)f'' + \tilde{\xi}^2\bar{\xi}^2 f\right\|_\infty(b-a)^2\,C e^{-\gamma n^{1-\lambda}}\right\}.$$
Again, $n^{-3\lambda}$ is the high convergence speed in both (1) and (2).
Proof. 
This proof is inspired by [4], and also the proof of Theorem 7. □

6. Conclusions

As we have seen, the foundations of neural network operators rest on a rich tapestry of branches such as approximation theory and computational analysis. The architecture of single-hidden-layer neural networks offers effective and fast solutions to many problems in engineering and science. Our study stands out in revealing complex-valued hyperbolic, trigonometric, and hybrid convergence cases with the help of an activation function that is relatively easy to train. Moreover, the types of convergence that reach a promising speed of approximation are our key findings. Finally, we would like to point out that we worked with Ostrowski- and Opial-type inequalities, norms, and trigonometric- and hyperbolic-type Taylor formulae to reach these convergence rate results.

Author Contributions

Conceptualization, G.A.A. and S.K.; methodology and validation, S.K.; investigation, G.A.A.; resources and writing—original draft preparation, G.A.A. and S.K.; writing—review and editing and visualization, G.A.A. and S.K.; supervision, G.A.A.; project administration, G.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

We would like to thank the reviewers who generously shared their time and opinions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Apicella, A.; Donnarumma, F.; Isgrò, F.; Prevete, R. A survey on modern trainable activation functions. Neural Netw. 2021, 138, 14–32.
2. Pappas, C.; Kovaios, S.; Moralis-Pegios, M.; Tsakyridis, A.; Giamougiannis, G.; Kirtas, M.; Van Kerrebrouck, J.; Coudyzer, G.; Yin, X.; Passalis, N.; et al. Programmable tanh-, elu-, sigmoid-, and sin-based nonlinear activation functions for neuromorphic photonics. IEEE J. Sel. Top. Quantum Electron. 2023, 29, 6101210.
3. Anastassiou, G.A. Opial and Ostrowski Type Inequalities Based on Trigonometric and Hyperbolic Type Taylor Formulae. Malaya J. Mat. 2023, 11, 1–26.
4. Anastassiou, G.A. Perturbed Hyperbolic Tangent Function-Activated Complex-Valued Trigonometric and Hyperbolic Neural Network High Order Approximation. In Trigonometric and Hyperbolic Generated Approximation Theory; World Scientific: Singapore; New York, NY, USA; in press.
5. Ali, A.H.; Páles, Z. Taylor-type expansions in terms of exponential polynomials. Math. Inequal. Appl. 2022, 25, 1123–1141.
6. Anastassiou, G.A. Rate of convergence of some neural network operators to the unit-univariate case. J. Math. Anal. Appl. 1997, 212, 237–262.
7. Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: New York, NY, USA, 1998.
8. McCulloch, W.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
9. Mitchell, T.M. Machine Learning; WCB-McGraw-Hill: New York, NY, USA, 1997.
10. Anastassiou, G.A. Banach Space Valued Ordinary and Fractional Neural Network Approximation Based on q-Deformed and β-Parametrized Half Hyperbolic Tangent. In Parametrized, Deformed and General Neural Networks; Studies in Computational Intelligence; Springer: Cham, Switzerland, 2023; Volume 1116.
