Abstract
In this article, we study the approximation properties of activated singular integral operators over the real line and establish their convergence to the unit operator with rates. The kernels here arise from neural network activation functions and their corresponding density functions. The estimates are mostly sharp, and they are both pointwise and uniform. The derived inequalities involve the higher-order modulus of smoothness.
Keywords: activation functions from neural networks; best constant; singular integral; modulus of smoothness; sharp inequality
MSC: 26A15; 26D15; 41A17; 41A30; 41A35; 41A44
1. Introduction
The rate of convergence of singular integrals was studied earlier in [1,2,3,4]. The seminal monograph [5], Ch. 14, is motivated by these works and is used throughout the present article. Here, we consider activated singular integral operators over the real line and study their degree of approximation to the unit operator with rates over smooth functions. We establish related inequalities involving the higher-order modulus of smoothness with respect to the supremum norm. The estimates are pointwise and uniform. Most of the time they are optimal, in the sense that the inequalities are attained by basic functions. The operators discussed are not, in general, positive.
The novelty here is the reverse process, from applied mathematics back to pure mathematics: our kernels are formed by density functions derived from activation functions used in neural network approximation; see [6,7].
For the history of the topic, we mention the monograph [5] of 2012, which was the first complete source devoted exclusively to the classic theory of the approximation of singular integrals to the identity (unit) operator. In it, the authors studied quantitatively the basic approximation properties of the general Picard, Gauss–Weierstrass and Poisson–Cauchy singular integral operators over the real line, which are not positive linear operators. In particular, they studied the rate of convergence of these operators to the unit operator, as well as the related simultaneous approximation. This is given via inequalities involving the higher-order modulus of smoothness of a high-order derivative of the involved function. Some of these inequalities are proven to be sharp. They also studied the global smoothness preservation property of these operators. Furthermore, they gave asymptotic expansions of Voronovskaya type for the error of approximation. They continued with the study of related properties of the general fractional Gauss–Weierstrass and Poisson–Cauchy singular integral operators. These properties were studied with respect to the Lp norm, 1 ≤ p ≤ ∞. The case of Lipschitz-type function approximation was studied separately and in detail. Furthermore, they presented the corresponding general approximation theory of general singular integral operators, with many applications to the trigonometric singular integral, which had not received much attention until then. Of great interest, and motivating the author, are the articles [8,9,10,11,12]. Given the contemporary intense mathematical activity in the use of neural networks for solving differential equations, our current work is expected to play a pivotal role, just as the earlier versions of singular integrals did in the classic setting.
2. Background
Everything in this section comes from [5], Ch. 14. Here, we consider the following smooth general singular integral operators. Let and let be Borel probability measures on ℝ.
For and , we set
that is, Let be Borel measurable; then, for , we can define the integral
We suppose that this integral is finite for all . We also say that
We notice that , where c is a constant, and
Let , and assume that the rth modulus of smoothness is finite, i.e.,
where
see [13], p. 44.
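For readers who wish to experiment numerically, the following is a minimal sketch that approximates the rth modulus of smoothness on a finite grid. It assumes the standard forward-difference form ω_r(f, h) = sup over |t| ≤ h of the sup-norm of Σ_{j=0}^{r} (−1)^{r−j} C(r, j) f(· + jt), as in [13]; the grid, the test function and the steps h are illustrative choices only and are not taken from the paper.

```python
import numpy as np
from math import comb

def modulus_of_smoothness(f, r, h, xs, n_t=201):
    """Grid approximation of omega_r(f, h) = sup_{|t|<=h} sup_x |Delta_t^r f(x)|,
    with Delta_t^r f(x) = sum_{j=0}^r (-1)^(r-j) C(r, j) f(x + j t)."""
    best = 0.0
    for t in np.linspace(-h, h, n_t):
        diff = sum((-1) ** (r - j) * comb(r, j) * f(xs + j * t) for j in range(r + 1))
        best = max(best, float(np.max(np.abs(diff))))
    return best

# Illustration with f = sin on [-10, 10]; omega_2(sin, h) behaves roughly like h^2 for small h.
xs = np.linspace(-10.0, 10.0, 4001)
for h in (0.4, 0.2, 0.1):
    print(h, modulus_of_smoothness(np.sin, r=2, h=h, xs=xs))
```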
We need to introduce
and the even function
with
We mention the following result.
Theorem 1.
The integrals , are assumed to be finite. Then,
Corollary 1.
Assume , . Then, it holds for that
Inequality (10) is sharp.
Theorem 2.
Inequality (10) at is attained by , with r even.
Corollary 2.
Inequality (11) is sharp, that is, it is attained at by , r even.
We also need the next result.
Theorem 3.
Let , . Set . Assume also that , ∀ . It is also supposed that
Then,
When , the sum on the L.H.S. of (15) collapses.
Here, L.H.S. means left-hand side.
3. On Activation Functions
3.1. About Richards’s Curve
Here, we follow [7], Chapter 1.
A Richards’s curve is
which is strictly increasing on ℝ, and it is a sigmoid function; in particular, it is a generalized logistic function. It is also used as an activation function in neural networks; see [7], Chapter 1.
It is
We consider the function
which is , where all .
It is
and
We also have
We also obtain
and G is a symmetric, bell-shaped function with maximum
Theorem 4.
It holds that
Theorem 5.
It holds that
Therefore, G is a density function.
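The exact Richards curve and the kernel G are given by the displayed formulas above. As a purely illustrative sketch of the kind of verification behind Theorems 5 and 6, one may take a generalized logistic sigmoid φ(x) = 1/(1 + e^(−μx)), μ > 0, and the difference construction G(x) = (1/2)(φ(x + 1) − φ(x − 1)) typical of [6,7]; both formulas are assumptions here, not quotations from the text. The sketch checks numerically that G integrates to 1 and has finite absolute moments.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import expit  # numerically stable logistic sigmoid

mu = 1.5  # illustrative slope parameter; any mu > 0 works

def phi(x):
    """Assumed generalized logistic sigmoid phi(x) = 1 / (1 + exp(-mu * x))."""
    return expit(mu * x)

def G(x):
    """Assumed bell-shaped kernel: half of a shifted difference of the sigmoid."""
    return 0.5 * (phi(x + 1.0) - phi(x - 1.0))

mass, _ = quad(G, -np.inf, np.inf)
print("integral of G over the real line:", mass)  # expected to be 1 (density property)

for k in (1, 2, 3):
    mk, _ = quad(lambda t, k=k: abs(t) ** k * G(t), -np.inf, np.inf)
    print(f"integral of |t|^{k} G(t) dt:", mk)  # finite absolute moments
```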
Remark 2.
Therefore, we have
- (i)
- Let . That is, . Applying the mean value theorem, we obtain: where
- Notice that
- (ii)
- Now, let . That is, . Applying the mean value theorem again, we obtain: where
Hence, we derive that
Consequently, we proved that
Let ; it holds that
Clearly, according to Theorem 5, we have
Therefore, is a density function, and let ; that is, is a Borel probability measure.
We give the following important result.
Theorem 6.
Let , and
Then, are finite and as
Proof.
We can write
We notice that
Next, we have
and we have the gamma function Γ(z) = ∫_0^∞ t^(z−1) e^(−t) dt, z > 0.
We have established that
Finally, we observe that
Therefore, we have proved that
Next, we give the following.
Theorem 7.
It holds that
for
Also, this integral converges to zero, as
Proof.
We observe that
Therefore, we have
and it converges to zero, as □
3.2. About the q-Deformed and -Parametrized Hyperbolic Tangent Function
We consider the activation function and study its related properties; all the basics come from [7], Ch. 17.
Let the activation function
It is
and
with
We consider the function
∀, . We have , so that the x-axis is a horizontal asymptote.
It holds that
and
The maximum is
Theorem 8.
We find that
Theorem 9.
It holds that
Therefore, is a density function on .
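Since the displays defining this activation function appear above and are not repeated here, the following sketch assumes the typical form from [7], Ch. 17, namely g_{q,λ}(x) = (e^(λx) − q e^(−λx)) / (e^(λx) + q e^(−λx)) with q, λ > 0, together with the kernel (1/4)(g_{q,λ}(x + 1) − g_{q,λ}(x − 1)); both are stated as assumptions. It verifies numerically the density property asserted in Theorem 9, using the identity g_{q,λ}(x) = tanh(λx − (ln q)/2) for numerical stability.

```python
import numpy as np
from scipy.integrate import quad

q, lam = 2.0, 1.3  # illustrative deformation and slope parameters (q > 0, lambda > 0)

def g(x):
    """Assumed q-deformed, lambda-parametrized hyperbolic tangent.
    Algebraically equal to (e^(lam x) - q e^(-lam x)) / (e^(lam x) + q e^(-lam x)),
    written via tanh for numerical stability."""
    return np.tanh(lam * x - 0.5 * np.log(q))

def M(x):
    """Assumed kernel: one quarter of a shifted difference of g (g has range (-1, 1))."""
    return 0.25 * (g(x + 1.0) - g(x - 1.0))

mass, _ = quad(M, -np.inf, np.inf)
print("integral of M over the real line:", mass)  # expected to be 1 (Theorem 9)
```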
Remark 3.
(i) Let . That is, . According to the mean value theorem, we obtain
for some
But , and
That is,
Set ; then,
(ii) Now, let . That is, . Again, we have
We have
and
Hence,
Therefore, it holds that
That is,
Set ; then,
We have proved that
∀
Let ; it holds that
According to Theorem 9, we find that
Therefore, is a density function and let
that is, is a Borel probability measure.
We give the following.
Theorem 10.
Let
Then, are finite and as
Proof.
We can write
We notice that
Next, we have
We have established that
Finally, we observe that
Therefore, we have proved that
Furthermore, we present the following result.
Theorem 11.
It holds that
where ;
Also, , as
Proof.
We find that
and it converges to zero, as □
3.3. About the Gudermannian Generated Activation Function
Here, we follow [6], Ch. 2.
Let the related normalized generator sigmoid function be
and the neural network activation function be
We mention the following.
Theorem 12.
It holds that
Therefore, is a density function.
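The Gudermannian function gd(x) = 2 arctan(tanh(x/2)) has range (−π/2, π/2). Assuming, only for illustration, that the normalized generator sigmoid of [6], Ch. 2, is (2/π) gd(x) and that the induced kernel is one quarter of its shifted difference (both are assumptions, since the displays are not reproduced here), the density property of Theorem 12 can be checked numerically as follows.

```python
import numpy as np
from scipy.integrate import quad

def gd(x):
    """Gudermannian function gd(x) = 2 * arctan(tanh(x / 2)), with range (-pi/2, pi/2)."""
    return 2.0 * np.arctan(np.tanh(0.5 * x))

def sigma(x):
    """Assumed normalized generator sigmoid, rescaled to the range (-1, 1)."""
    return (2.0 / np.pi) * gd(x)

def psi(x):
    """Assumed kernel: one quarter of a shifted difference of sigma."""
    return 0.25 * (sigma(x + 1.0) - sigma(x - 1.0))

mass, _ = quad(psi, -np.inf, np.inf)
print("integral of psi over the real line:", mass)  # expected to be 1
```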
According to [6], p. 49, we find that
But
∀.
Therefore,
So the following,
is the related Borel probability measure.
We give the following results; their proofs, being similar to those of Theorems 6 and 7, are omitted.
Theorem 13.
Let , and
Then, are finite and , as
Theorem 14.
It holds that
;
Also, this integral converges to zero, as
3.4. About the q-Deformed and -Parametrized Logistic-Type Activation Function
Here, everything comes from [7], Ch. 15.
The activation function now is
where
The density function here will be
We mention the following.
Theorem 15.
It holds that
According to [7], p. 373, we find
So the following
is the related Borel probability measure.
We give the following results; their proofs, being similar to those of Theorems 10 and 11, are omitted.
Theorem 16.
Let
Then, are finite and , as
Theorem 17.
It holds that
where ;
Also, , as
3.5. About the q-Deformed and -Parametrized Half-Hyperbolic Tangent Function
Here, everything comes from [7], Ch. 19.
The activation function now is
where
The corresponding density function will be
Theorem 18.
According to [7], p. 481, we find that
Thus, the following
is the related Borel probability measure.
We state the following results; their proofs, being similar to those of Theorems 10 and 11, are omitted.
Theorem 19.
Let
Then, are finite and , as
Theorem 20.
It holds that
where ;
Also, , as
In Section 2, we generally assumed that , for all .
The next result shows that this assumption is easily fulfilled.
Theorem 21.
Let be a Lipschitz function; that is, there exists such that , for any . Here, is a general Borel probability measure on ℝ with the property that is finite. Then, , defined by (2), is finite, i.e., ,
Proof.
We find that
Thus,
That is,
Notice that
and
proving the claim. □
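Since the displayed estimates of this proof are not reproduced above, the following is a hedged sketch of the standard bound for a single building block of the operator, under the hypotheses of Theorem 21; the Lipschitz constant K and the probability measure μ are our notation for the quantities in the statement.

```latex
\int_{\mathbb{R}} |f(x+t)|\,d\mu(t)
  \le \int_{\mathbb{R}} \bigl(|f(x)| + K|t|\bigr)\,d\mu(t)
  = |f(x)| + K \int_{\mathbb{R}} |t|\,d\mu(t) < \infty .
```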
4. Main Results
Here, we describe the pointwise and uniform approximation properties of the following activated singular integral operators, which are special cases of ; see Section 2. Their definitions are based on Section 3. Essentially, we apply the results listed in Section 2.
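Since the operators treated below are defined by the displays of Sections 2 and 3, which are not repeated here, the following sketch is only a schematic illustration of "convergence to the unit operator with rates". It uses an assumed convolution-type smoothing against the rescaled logistic-difference kernel from the sketch in Section 3.1; the scaling (1/ξ)G(t/ξ), the kernel, and the first-order form are all assumptions made for illustration and are not the paper's operators.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import expit

def kernel(t, mu=1.5):
    """Assumed bell-shaped kernel: half of a shifted difference of logistic sigmoids."""
    return 0.5 * (expit(mu * (t + 1.0)) - expit(mu * (t - 1.0)))

def smoothed(f, x, xi):
    """Schematic convolution-type operator: f(x + t) integrated against the rescaled
    probability density (1/xi) * kernel(t/xi); an illustration, not the paper's operator."""
    val, _ = quad(lambda t: f(x + t) * kernel(t / xi) / xi, -np.inf, np.inf)
    return val

f = np.cos  # smooth test function
xs = np.linspace(-3.0, 3.0, 61)
for xi in (0.5, 0.25, 0.125):
    err = max(abs(smoothed(f, x, xi) - f(x)) for x in xs)
    print(f"xi = {xi}: maximal grid error = {err:.3e}")  # decreases as xi -> 0+
```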
We give the following results grouped by operator.
Theorem 22.
It holds that
The last, at , is attained by , with r even.
Proof.
Theorems 1 and 6. □
Corollary 3.
Assume ,
Then,
which is attained at by , where r is even.
Remark 4.
We also have the uniform estimates
and
And the following holds.
Theorem 23.
Let , . Assume also that , ∀ . Then,
At , the sum on the L.H.S. collapses.
Proof.
Theorems 3 and 7. □
We continue similarly.
Theorem 24.
It holds that
The last, at , is attained by , with r even.
Proof.
Theorems 1 and 10. □
Corollary 4.
Assume that ,
Then,
which is attained at by , where r is even.
Remark 5.
We also have the uniform estimates
and
And the following holds.
Theorem 25.
Let , . Assume also that , ∀ . Then,
At , the sum on the L.H.S. of (123) collapses.
Proof.
Theorems 3 and 11. □
We continue similarly.
Theorem 26.
It holds that
The last, at , is attained by , with r even.
Proof.
Theorems 1 and 13. □
Corollary 5.
Assume that , Then,
which is attained at by , where r is even.
Remark 6.
We also have the uniform estimates
and
And the following holds.
Theorem 27.
Let , . Assume also that , . Then,
At , the sum on the L.H.S. of (128) collapses.
Proof.
Theorems 3 and 14. □
We continue in this fashion.
Theorem 28.
It holds that
The last, at , is attained by , with r even.
Proof.
Theorems 1 and 16. □
Corollary 6.
Assume that ,
Then,
which is attained at by , where r is even.
Remark 7.
We also have the uniform estimates
and
And the following holds.
Theorem 29.
Let , . Assume also that , . Then,
At , the sum on the L.H.S. of (133) collapses.
Proof.
Theorems 3 and 17. □
The last group of results follows.
Theorem 30.
It holds that
The last, at , is attained by , with r even.
Proof.
Theorems 1 and 19. □
Corollary 7.
Assume that ,
Then,
which is attained at by , where r is even.
Remark 8.
We also have the uniform estimates
and
And the following holds.
Theorem 31.
Let , . Assume also that , . Then,
At , the sum on the L.H.S. of (138) collapses.
Proof.
Theorems 3 and 20. □
5. Conclusions
Here, we presented the new idea of moving from the main tools of neural networks, the activation functions, to singular integral approximation. This is a rare case of applied mathematics feeding back into pure mathematics.
Funding
This research received no external funding.
Data Availability Statement
The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.
Conflicts of Interest
The author declares no conflict of interest.
References
- Anastassiou, G.A.; Gal, S. Convergence of generalized singular integrals to the unit, univariate case. Math. Inequal. Appl. 2000, 3, 511–518.
- Gal, S.G. Remark on the degree of approximation of continuous functions by singular integrals. Math. Nachr. 1993, 164, 197–199.
- Gal, S.G. Degree of approximation of continuous functions by some singular integrals. Rev. d’Anal. Numér. Théor. l’Approx. 1998, 27, 251–261.
- Mohapatra, R.N.; Rodriguez, R.S. On the rate of convergence of singular integrals for Hölder continuous functions. Math. Nachr. 1990, 149, 117–124.
- Anastassiou, G.; Mezei, R. Approximation by Singular Integrals; Cambridge Scientific Publishers: Cambridge, UK, 2012.
- Anastassiou, G.A. Banach Space Valued Neural Network; Springer: Heidelberg, Germany; New York, NY, USA, 2023.
- Anastassiou, G.A. Parametrized, Deformed and General Neural Networks; Springer: Heidelberg, Germany; New York, NY, USA, 2023.
- Aral, A. On a generalized Gauss Weierstrass singular integral. Fasc. Math. 2005, 35, 23–33.
- Aral, A. Pointwise approximation by the generalization of Picard and Gauss-Weierstrass singular integrals. J. Concr. Appl. Math. 2008, 6, 327–339.
- Aral, A. On generalized Picard integral operators. In Advances in Summability and Approximation Theory; Springer: Singapore, 2018; pp. 157–168.
- Aral, A.; Deniz, E.; Erbay, H. The Picard and Gauss-Weierstrass singular integrals in (p, q)-calculus. Bull. Malays. Math. Sci. Soc. 2020, 43, 1569–1583.
- Aral, A.; Gal, S.G. q-generalizations of the Picard and Gauss-Weierstrass singular integrals. Taiwan. J. Math. 2008, 12, 2501–2515.
- DeVore, R.A.; Lorentz, G.G. Constructive Approximation; Springer: Berlin, Germany; New York, NY, USA, 1993; Volume 303.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).