1. Introduction
The author of [1,2] (see Chapters 2–5 therein) pioneered the quantitative approximation of continuous functions by neural networks, via precisely defined neural network operators of Cardaliaguet–Euvrard and “squashing” types, using the modulus of continuity of the given function or of its high-order derivative, and deriving almost sharp Jackson-type inequalities. He dealt with both the univariate and multivariate cases. The “bell-shaped” and “squashing” functions defining these operators were taken to have compact support. Furthermore, in [2] he provides the Nth-order asymptotic expansion for the error of weak approximation of these two operators to a particular natural class of smooth functions; for more, see Chapters 4–5 therein.
Coming back to this line of research, the author, motivated by [3], continued his work on neural network approximation by employing the appropriate quasi-interpolation operators of sigmoidal and hyperbolic tangent type, which resulted in [4,5,6,7,8], treating both the univariate and multivariate cases. He also completed the corresponding fractional cases in [9,10,11].
Let h be a general sigmoid function with , and the horizontal asymptotes . Of course, h is strictly increasing over . Let the parameter and . Then clearly and ; furthermore, it holds . Consequently, the sigmoid has its graph inside the graph of , of course with the same asymptotes . Therefore has derivatives (gradients) that are non-zero, or not as close to zero, at more points x than does, thus killing a smaller number of neurons! Furthermore, of course is more distant from than is, a highly desired fact in neural network theory.
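As a purely illustrative numerical instance of this reasoning (not the author's construction, which is the function g_{q,λ} of Section 2), one may take h(x) = tanh x as the outer sigmoid and the scaled sigmoid x ↦ tanh(λx), 0 < λ < 1, as a hypothetical parameterized variant: its graph lies inside that of h with the same asymptotes ±1, and its gradient stays away from zero on a wider range of x. The sketch below counts grid points where each gradient exceeds a small threshold:

```python
import numpy as np

lam = 0.5                                       # hypothetical parameter, 0 < lam < 1
xs = np.linspace(-10, 10, 2001)

dh = 1.0 - np.tanh(xs) ** 2                     # derivative of h(x) = tanh x
dh_lam = lam * (1.0 - np.tanh(lam * xs) ** 2)   # derivative of tanh(lam * x)

thr = 1e-3                                      # "not too close to zero" threshold
active_h = int(np.sum(dh > thr))
active_h_lam = int(np.sum(dh_lam > thr))

# The scaled sigmoid keeps a non-negligible gradient at more grid points,
# i.e., fewer neurons are "killed" in the saturation regions.
print(active_h, active_h_lam)                   # here active_h_lam > active_h
```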
The brain non-symmetry has been clearly discovered in animals and humans in terms of structure, function and behaviors. This observation reflects evolutionary, hereditary, developmental, experiential and pathological factors. Therefore it is natural to consider for our study deformed neural network activation functions and operators. So this paper is a specific study under this philosophy of approaching reality as close as possible.
Consequently the author here performs q-deformed and -parameterized hyperbolic tangent function activated neural network approximations to continuous functions over closed intervals of reals or over the whole with values to an arbitrary Banach space . Finally he deals with the related X-valued fractional approximation. All convergences here are quantitative expressed via the first modulus of continuity of the on hand function or its X-valued high order derivative, or X-valued fractional derivatives and given by almost attained Jackson-type inequalities.
Our closed intervals are not necessarily symmetric to the origin. Some of our upper bounds to error quantity are very flexible and general. In preparation to derive our results we describe important properties of the basic density function defining our operators which is induced by a q-deformed and -parameterized hyperbolic tangent sigmoid function.
Feed-forward X-valued neural networks (FNNs) with one hidden layer, the only type of networks we use in this work, are mathematically expressed by
where for , are the thresholds, are the connection weights, are the coefficients, is the inner product of and x, and k is the activation function of the network. For more on neural networks, read [12,13,14].
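For orientation, here is a minimal real-valued (X = ℝ) sketch of such a one-hidden-layer network, of the generic form N_n(x) = Σ_{j=0}^{n} c_j k(⟨a_j, x⟩ + b_j); the weights a_j, thresholds b_j, coefficients c_j, and the activation k below are arbitrary placeholders, not quantities from this paper:

```python
import numpy as np

def one_hidden_layer_fnn(x, a, b, c, k):
    """Evaluate N_n(x) = sum_j c_j * k(<a_j, x> + b_j).

    x : (d,) input vector
    a : (n+1, d) connection weights a_j
    b : (n+1,) thresholds b_j
    c : (n+1,) coefficients c_j
    k : activation function, applied elementwise
    """
    return np.sum(c * k(a @ x + b))

# toy usage with arbitrary placeholder parameters
rng = np.random.default_rng(0)
d, n = 3, 10
a = rng.normal(size=(n + 1, d))
b = rng.normal(size=n + 1)
c = rng.normal(size=n + 1)
x = rng.normal(size=d)
print(one_hidden_layer_fnn(x, a, b, c, np.tanh))
```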
2. About q-Deformed and λ-Parameterized Hyperbolic Tangent Function gq,λ
All of this background comes from [15,16].
We use , see (1), and exhibit that it is a sigmoid function; we will present several of its properties related to approximation by neural network operators. It will act as the activation function.
So, let us consider the function
We have that
We notice also that
That is
and
hence
It is
i.e.,
Furthermore,
i.e.,
We find that
therefore
is strictly increasing.
So, in case of , we have that is strictly concave up, with
Furthermore, in case of , we have that is strictly concave down.
Clearly, is a shifted sigmoid function with , and , (a semi-odd function); see also [17].
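As a numerical companion to these properties, the sketch below assumes the standard form of the q-deformed, λ-parameterized hyperbolic tangent from the literature cited, g_{q,λ}(x) = (e^{λx} − q e^{−λx})/(e^{λx} + q e^{−λx}) with q, λ > 0; this formula is an assumption standing in for (1), not a quotation of it. Under that assumption, the function is strictly increasing, has horizontal asymptotes ±1, and equals (1 − q)/(1 + q) at the origin:

```python
import numpy as np

def g(x, q=2.0, lam=1.5):
    """Assumed q-deformed, lambda-parameterized hyperbolic tangent:
    g_{q,lam}(x) = (e^{lam x} - q e^{-lam x}) / (e^{lam x} + q e^{-lam x})."""
    return (np.exp(lam * x) - q * np.exp(-lam * x)) / (
        np.exp(lam * x) + q * np.exp(-lam * x))

xs = np.linspace(-5, 5, 1001)
vals = g(xs)

assert np.all(np.diff(vals) > 0)                             # strictly increasing
assert abs(g(30.0) - 1) < 1e-9 and abs(g(-30.0) + 1) < 1e-9  # asymptotes +1 and -1
q = 2.0
assert abs(g(0.0, q=q) - (1 - q) / (1 + q)) < 1e-12          # shifted value at the origin
print("sigmoid checks passed")
```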
By
,
, we consider the function
;
. Notice that
, so the
x-axis is a horizontal asymptote.
Thus,
a deformed symmetry.
Let , then and (by being strictly concave up for ), that is . Hence, is strictly increasing over
Let now , then , and , that is
Therefore is strictly decreasing over
Let us next consider,
We have that
By
By
Clearly by (
13) we obtain that
, for
More precisely is concave down over , and strictly concave down over
Consequently has a bell-type shape over
Of course it holds
At
, we have
Thus,
That is,
is the only critical number of
over
. Hence at
achieves its global maximum, which is
Conclusion: The maximum value of
is
We mention
Similarly, it holds
However,
, ∀
It follows
So that is a density function on
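The following sketch illustrates these density properties numerically in the scalar setting, under the assumptions that the activation is g_{q,λ}(x) = (e^{λx} − q e^{−λx})/(e^{λx} + q e^{−λx}) and that the density is built from it in the way customary for this family of operators, namely M_{q,λ}(x) = (1/4)(g_{q,λ}(x + 1) − g_{q,λ}(x − 1)); both formulas are stand-ins for the (omitted) displayed definitions, not quotations of them. It checks the deformed symmetry M_{q,λ}(−x) = M_{1/q,λ}(x), the partition of unity Σ_{k∈ℤ} M_{q,λ}(x − k) = 1, and ∫_ℝ M_{q,λ}(x) dx = 1:

```python
import numpy as np

def g(x, q=2.0, lam=1.5):
    # assumed g_{q,lam}; algebraically equal to tanh(lam*x - ln(q)/2),
    # which is the numerically stable way to evaluate it
    return np.tanh(lam * x - 0.5 * np.log(q))

def M(x, q=2.0, lam=1.5):
    # assumed bell-shaped density built from g_{q,lam}
    return 0.25 * (g(x + 1, q, lam) - g(x - 1, q, lam))

x = 0.37                                   # arbitrary test point

# deformed symmetry: M_{q,lam}(-x) = M_{1/q,lam}(x)
assert abs(M(-x, q=2.0) - M(x, q=0.5)) < 1e-12

# partition of unity: sum_k M_{q,lam}(x - k) = 1
ks = np.arange(-200, 201)
assert abs(np.sum(M(x - ks)) - 1.0) < 1e-9

# the integral over the real line equals 1 (simple Riemann sum on a wide grid)
t = np.linspace(-60.0, 60.0, 120001)
assert abs(np.sum(M(t)) * (t[1] - t[0]) - 1.0) < 1e-5
print("density checks passed")
```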
We need the following result
Theorem 3 ([16]). Let , and with ; . Then,
where .
Let denote the ceiling of the number, and the integral part of the number.
Theorem 4 ([16]). Let and so that . For , we consider the number with and . Then,
We also mention
Remark 1 ([16]). (i) We have that
where
(ii) Let . For large n we always have . Furthermore, , iff . In general it holds .
Let be a Banach space.
Definition 1. Let and . We introduce and define the X-valued linear neural network operators
For large enough n we always obtain . Furthermore, iff . The same is used for real-valued functions. We study here the pointwise and uniform convergence of to with rates. For convenience, we also call
(the same can be defined for real-valued functions), that is,
So that
Consequently, we derive that
where is as in (25).
We will estimate the right-hand side of the last quantity.
For that we need, for , the first modulus of continuity
Similarly, it is defined for (uniformly continuous and bounded functions from into X), for (continuous and bounded X-valued), and for (uniformly continuous).
The fact or is equivalent to ; see [18].
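To make these objects concrete in the scalar case X = ℝ, the sketch below evaluates a quasi-interpolation operator of the generic form F_n(f, x) = Σ_k f(k/n) M_{q,λ}(nx − k) / Σ_k M_{q,λ}(nx − k), with ⌈na⌉ ≤ k ≤ ⌊nb⌋ on [a, b], together with a discretized first modulus of continuity ω_1(f, δ) = sup_{|s−t|≤δ} |f(s) − f(t)|. The operator form and the density M_{q,λ} are assumptions consistent with this family of operators, not quotations of Definition 1:

```python
import numpy as np

def g(x, q=2.0, lam=1.5):
    # assumed g_{q,lam}, evaluated in the numerically stable tanh form
    return np.tanh(lam * x - 0.5 * np.log(q))

def M(x):
    # assumed density built from g_{q,lam}
    return 0.25 * (g(x + 1) - g(x - 1))

def F(f, x, n, a=-1.0, b=1.0):
    """Assumed quasi-interpolation operator on [a, b]:
    F_n(f, x) = sum_k f(k/n) M(n x - k) / sum_k M(n x - k),  ceil(n a) <= k <= floor(n b)."""
    ks = np.arange(np.ceil(n * a), np.floor(n * b) + 1)
    w = M(n * x - ks)
    return np.sum(f(ks / n) * w) / np.sum(w)

def omega1(f, delta, a=-1.0, b=1.0, m=4000):
    """Discretized first modulus of continuity sup_{|s-t| <= delta} |f(s) - f(t)| on [a, b]."""
    v = f(np.linspace(a, b, m + 1))
    shift = max(1, int(round(delta * m / (b - a))))
    return max(float(np.max(np.abs(v[s:] - v[:-s]))) for s in range(1, shift + 1))

f = lambda t: np.sin(3 * t)                  # a sample continuous function
print(F(f, 0.3, 50), f(0.3))                 # operator value vs. target value
print(omega1(f, 0.1))                        # roughly 3 * 0.1 for this f
```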
We make
Definition 2. When , or , we define, , the X-valued quasi-interpolation neural network operator. We give
Remark 2. We have that
and
and
and, finally,
a convergent series in . So, the series is absolutely convergent in X, hence it is convergent in X and . We denote by , for , similarly it is defined for
3. Main Results
We present a set of X-valued neural network approximations, with rates, to a given function.
Theorem 5. Let , , , , , . Then,
and
(ii) We obtain that , pointwise and uniformly. Proof. We see that
That is
Using the last equality we derive (37). □
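Continuing the scalar sketch above, one can watch the uniform convergence asserted in Theorem 5 numerically: the sup-norm error of the assumed operator F_n against a continuous test function shrinks as n grows (the code does not attempt to reproduce the exact Jackson-type constants of the theorem):

```python
import numpy as np

# same assumed g, M, F as in the previous sketch (scalar case X = R)
g = lambda x, q=2.0, lam=1.5: np.tanh(lam * x - 0.5 * np.log(q))
M = lambda x: 0.25 * (g(x + 1) - g(x - 1))

def F(f, x, n, a=-1.0, b=1.0):
    ks = np.arange(np.ceil(n * a), np.floor(n * b) + 1)
    w = M(n * x - ks)
    return np.sum(f(ks / n) * w) / np.sum(w)

f = lambda t: np.sin(3 * t)
grid = np.linspace(-0.9, 0.9, 181)
for n in (10, 50, 250, 1250):
    sup_err = max(abs(F(f, x, n) - f(x)) for x in grid)
    print(n, sup_err)            # the sup-norm error decreases as n grows
```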
Next we give
Theorem 6. Let , , , . Then
(ii) For we obtain , pointwise and uniformly. Proof. We observe that
proving the claim. □
We need the X-valued Taylor’s formula in an appropriate form:
Theorem 7 ([19,20]). Let , and , where and X is a Banach space. Let any . Then,
The derivatives , , are defined like the numerical ones; see [21], p. 83. The integral in (46) is of Bochner type; see [22].
By [20,23] we have that: if , then and
Next, we discuss high-order X-valued neural network approximation using the smoothness of f.
Theorem 8. Let , , , , , and . Then,
(ii) assume further that , for some ; then it holds and
(iii) Again we obtain , pointwise and uniformly. Proof. It is lengthy and, being similar to [24], is omitted. □
All integrals from now on are of Bochner type [22].
We need
Definition 3 ([20]). Let , X be a Banach space, ; , ( is the ceiling of the number), . We assume that . We call the Caputo–Bochner left fractional derivative of order α:
If , we set , the ordinary X-valued derivative (defined similarly to the numerical one; see [21], p. 83), and also set
By [19], exists almost everywhere in and .
If , then by [23], ; hence
We mention
Definition 4 ([19]). Let , X be a Banach space, , . We assume that , where . We call the Caputo–Bochner right fractional derivative of order α:
We observe that for , and
By [19], exists almost everywhere on and .
If , and , by [19], ; hence
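For a computational view of these fractional derivatives in the scalar case X = ℝ, the sketch below uses the standard Caputo formulas D^α_{∗a} f(x) = (1/Γ(m − α)) ∫_a^x (x − t)^{m−α−1} f^{(m)}(t) dt and D^α_{b−} f(x) = ((−1)^m/Γ(m − α)) ∫_x^b (t − x)^{m−α−1} f^{(m)}(t) dt, with m = ⌈α⌉; the quadrature and the test function are illustrative choices, not taken from the paper:

```python
import numpy as np
from math import gamma, ceil

def caputo_left(dfm, alpha, a, x, n=400_000):
    """Left Caputo fractional derivative (scalar sketch), m = ceil(alpha):
    D^alpha_{*a} f(x) = 1/Gamma(m - alpha) * int_a^x (x - t)^(m - alpha - 1) f^(m)(t) dt.
    dfm is the m-th ordinary derivative of f; crude midpoint quadrature
    (the endpoint singularity is integrable)."""
    m = ceil(alpha)
    t = np.linspace(a, x, n + 1)
    mid = 0.5 * (t[:-1] + t[1:])
    h = (x - a) / n
    return np.sum((x - mid) ** (m - alpha - 1) * dfm(mid)) * h / gamma(m - alpha)

def caputo_right(dfm, alpha, x, b, n=400_000):
    """Right Caputo fractional derivative (scalar sketch), m = ceil(alpha):
    D^alpha_{b-} f(x) = (-1)^m / Gamma(m - alpha) * int_x^b (t - x)^(m - alpha - 1) f^(m)(t) dt."""
    m = ceil(alpha)
    t = np.linspace(x, b, n + 1)
    mid = 0.5 * (t[:-1] + t[1:])
    h = (b - x) / n
    return (-1) ** m * np.sum((mid - x) ** (m - alpha - 1) * dfm(mid)) * h / gamma(m - alpha)

# test: f(t) = t^2, f'(t) = 2 t, alpha = 1/2, a = 0; the classical closed form is
# D^{1/2}_{*0}(t^2)(x) = Gamma(3) / Gamma(5/2) * x^{3/2}
alpha, x = 0.5, 1.0
approx = caputo_left(lambda t: 2.0 * t, alpha, 0.0, x)
exact = gamma(3) / gamma(3 - alpha) * x ** (2 - alpha)
print(approx, exact)                                      # agree up to quadrature error
print(caputo_right(lambda t: 2.0 * t, alpha, 0.0, 1.0))   # right derivative of t^2 at x = 0 on [0, 1]
```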
We make
Remark 3 ([18]). Let , , , , . Then,
Thus, we observe
Consequently,
Similarly, let , , , , ; then
So for , , , , , we find
and
By [20] we obtain that , and by [19] we obtain that
We present the following X-valued fractional approximation result by neural networks.
Theorem 9. Let , , , , , , , , . Then,
(ii) if , for , we have and
(iv) Above, when , the sum . As we see here, we obtain X-valued fractional-type pointwise and uniform convergence with rates to the unit operator, as
Proof. The proof is very lengthy and similar to [24]; therefore, it is omitted. □
Next we apply Theorem 9 for
Theorem 10. Let , , , . Then
and
When we derive
Corollary 1. Let , , , . Then
and
We make
Remark 4. Some convergence analysis follows based on Corollary 1.
Let , , , . We elaborate on (65). Assume that
and
∀ , ∀ , where . Then it holds
where
The other summand of the right-hand side of (65), for large enough n, converges to zero at the speed , so it is about , where is a constant. Then, for large enough , by (65), (68) and the above comment, we obtain that
where , converging to zero at the high speed of .
In Theorem 5, for and for large enough , the speed is . So, by (69), converges much faster to zero. The latter comes about because we assumed differentiability of f. Notice that in Corollary 1 no initial condition is assumed.