1. Introduction and Preliminaries
In the contemporary artificial intelligence (AI) era, the design and analysis of neural network (NN) operators have become increasingly important, particularly with respect to their approximation properties and convergence behavior. Activation functions play a central role in determining the stability, learning capacity, and expressive power of neural architectures. Adjustable and asymmetric activation mechanisms provide additional flexibility, allowing for improved approximation performance and enhanced adaptability to complex data structures.
Convolution-type operators form a fundamental class in approximation theory and signal processing. Their systematic study within the framework of positive linear operators (PLOs) originated with Bernstein’s constructive proof of the Weierstrass theorem [1], and was further strengthened by Korovkin’s classical convergence theorem [2]. A unified operator-theoretic treatment of convolution, integral, and polynomial operators was developed by Butzer and Nessel [3,4]. In recent decades, Anastassiou has extended these ideas toward probabilistic and NN-inspired operators, providing quantitative approximation estimates and connections with modern learning models [5,6,7,8].
From the viewpoint of NNs, smooth and bounded activation functions such as the hyperbolic tangent are known to promote stable learning and efficient approximation. To overcome limitations such as slow convergence and limited flexibility, several parametrized and deformed variants of the tanh function have been proposed and analyzed [9,10,11]. Experimental evidence indicates that introducing asymmetry into the activation function can significantly improve accuracy and representational capacity [9]. Moreover, adaptive hyperbolic tangent activations have demonstrated superior performance in data mining and learning tasks due to their enhanced generalization ability [10].
Parallel to these developments, symmetrized neural network operators have been shown to exhibit improved convergence behavior and sharper approximation bounds than classical constructions, especially in convolution-type settings [6,8]. Motivated by these advances, the present work investigates convolution-type PLOs generated by the adjustable half-hyperbolic (adj HH) tangent activation function, establishing quantitative and qualitative convergence results supported by numerical validation.
This paper is organized as follows. In Section 1, we introduce the adj HH-tangent function and present the necessary preliminaries and notation. Section 2 gives the technical instruments; in addition, several versions of the operators are constructed, and their approximation behavior is analyzed. Section 3 is devoted to the main theoretical results, where we establish convergence properties and quantitative estimates for the proposed operators using tools from the theory of PLOs. Section 4 provides numerical examples and graphical illustrations that support the theoretical findings via regression-type metrics. Finally, Section 5 contains the concluding remarks and discusses possible directions for future research.
Definition 1 ([5]). We use the following “adjustable half-hyperbolic (adj HH) tangent function” for such that
Here, while ℓ controls the steepness of the transition, κ controls the asymmetry. Moreover, it is obvious that as a composition of exponentials and rational operations.
Definition 2. Let for every Then, let
and
such that
that is symmetric about the for every It is known that
Again,
is valid, and for all Thus, works here as a “density kernel”.
Definition 3 ([12]). For ; , the modulus of continuity is defined by
2. Proposed Operators and Preliminary Results
Here, we focus on establishing a solid foundation for our approximation results with emphasis on the PLO properties of our operators.
Definition 4. For all ; , and the “classical adj HHC-type operator” is defined as follows:
where is as in (4).
Definition 5. For ; , and the “adj HHC-based Kantorovich-type operator” is defined by
Definition 6. For every ; , and the “adj HHC-quadrature-type operator” is defined by
where such that .
Remark 1 ([8]). For , it is known that the operators are PLOs satisfying for . Moreover, if and for , then these operators commute with differentiation up to order i; in other words,
and all
Proposition 1. The inequality
is satisfied for such that
Proof. For every
, we have that
Let
i.e.,
Applying the mean value theorem, there exists
so that
Thus,
Compute
hence,
Therefore,
Similarly,
Adding (20) and (21) and using (4) yields
Let
Then,
□
Proposition 2. For
is satisfied.
Proof. Let us write that
Using the fact that from [8],
and for all
Hence,
This proves (25). □
Proposition 3. Let and . Assume that the operator is defined as in (9). Then,
and
as
Proof. It suffices to observe that
using (25). This is finite and tends to 0 as
. □
Proposition 4. Let the operator be defined as in (10) for and . Then,
and
as
Proposition 5. Consider and . Suppose that the operator is defined as in (11). Then,
and
is finite as
Proof. It is known that
The rest of the proof is identical to the previous proposition. □
3. Main Results
Theorem 1. We denote by any of the operators or for acting on For and , the following holds:
Moreover, if f is uniformly continuous, then
Proof. This proof is obtained through a straightforward application of Theorem 3.6 in [8] and Propositions 1–4. □
Theorem 2. Assume that such that for . Then,
is finite. Thus,
Specifically, if , then we obtain that
Again, is satisfied.
Proof. Applying Theorem 4.1 of [7] together with Propositions 1–4 immediately yields the proof. □
Notation 1. Let be. Then,
and
Corollary 1. Let and , . Assume that for Then,
Thus,
Proof. By a direct application of Proposition 1 and Theorem 2 with (41). □
Corollary 2. Let and , . Assume , Then,
Hence,
Proof. The proof follows directly from Theorem 2, Propositions 2 and 3, and (42). □
Corollary 3. Let Then,
If f is also uniformly continuous, we get , pointwise and uniformly.
Proof. An application of Proposition 1, Theorem 1, and (41) completes the proof. □
Corollary 4. Let Then,
If f is also uniformly continuous, we obtain , pointwise and uniformly.
Proof. Combining Propositions 2 and 3, Theorem 1, and (42) establishes the proof. □
Theorem 3. Let for Then, for
is valid. Moreover, if is uniformly continuous, then we get , pointwise and uniformly.
Proof. Corollary 3 and (12) together yield the proof. □
Theorem 4. Let for Then,
If is uniformly continuous, we obtain , pointwise and uniformly.
Proof. The proof follows from Corollary 4 and (12). □
4. Numerical Approximation and Error Analysis via Regression-Based Metrics
In this section, we illustrate the approximation behavior of the adj HHC-type operators through numerical experiments. All computations are performed on the real line and confirm the theoretical convergence results established in the previous sections. Algorithms 1–6 summarize the computational procedures; for reproducibility, the complete Python implementation is provided in Appendix A (Listings A1–A4).
| Algorithm 1 Activation, kernel, and target function |
Require: Parameters ,
- 1: Define the activation function:
- 2: Define the adj HHC kernel:
- 3: Define the test function:
|
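The steps of Algorithm 1 can be sketched in Python. The exact formulas are those of Definitions 1 and 2; the concrete forms below are plausible stand-ins (assumptions on our part), modeled on parametrized hyperbolic tangent constructions from the cited literature: an activation with steepness ℓ and asymmetry κ, the induced density kernel, and the Gaussian test function of Example 1.

```python
import numpy as np

# Illustrative sketch of Algorithm 1. The formulas below are ASSUMED
# stand-ins for the adj HH activation and adj HHC kernel of the paper,
# not the exact published definitions.

def g(x, ell=1.0, kappa=1.0):
    """Assumed adjustable tanh activation; numerically stable form of
    (e^{l x} - k e^{-l x}) / (e^{l x} + k e^{-l x})."""
    return np.tanh(ell * x - 0.5 * np.log(kappa))

def Z(x, ell=1.0, kappa=1.0):
    """Assumed adj HHC density kernel built from the activation; for
    kappa = 1 this reduces to 0.25 * (tanh(x + 1) - tanh(x - 1))."""
    return 0.25 * (g(x + 1.0, ell, kappa) - g(x - 1.0, ell, kappa))

def f_test(x):
    """Gaussian test function of Example 1."""
    return np.exp(-x ** 2)
```

Under these assumptions the kernel is nonnegative, integrates to one, and (for κ = 1) is symmetric about the origin, which is the "density kernel" behavior required of Z in Definition 2.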
| Algorithm 2 Numerical evaluation of (truncated adaptive quadrature) |
Require: , , parameters , truncation
- 1: ,
- 2:
- 3: Compute using adaptive quadrature
- 4: return
|
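A Python sketch of Algorithm 2. The operator form used here, (A_n f)(y) = ∫ f(v) · n Z(n(y − v)) dv with a tanh-based kernel stand-in, is an assumption consistent with the convolution-type construction of Definition 4, not the paper's exact formula; the adaptive quadrature of the algorithm is replaced by a dense composite rule over the sliding window [y − T, y + T] of Remark 2 to keep the sketch dependency-free.

```python
import numpy as np

def Z(x, ell=1.0, kappa=1.0):
    # ASSUMED adj HHC density kernel (stand-in); for kappa = 1 this is
    # 0.25 * (tanh(x + 1) - tanh(x - 1)), a standard approximate identity.
    g = lambda t: np.tanh(ell * t - 0.5 * np.log(kappa))
    return 0.25 * (g(x + 1.0) - g(x - 1.0))

def A_n(f, y, n, T=10.0, m=20001):
    """Truncated evaluation of the assumed operator (A_n f)(y)."""
    v = np.linspace(y - T, y + T, m)          # sliding window around y
    w = n * Z(n * (y - v))                    # rescaled kernel weights
    return np.sum(f(v) * w) * (v[1] - v[0])   # composite quadrature
```

As n grows, n Z(n·) concentrates around the evaluation point, so A_n f(y) approaches f(y); the error at a fixed y decreases as n increases, in line with the convergence results of Section 3.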
| Algorithm 3 Evaluation of (finite weighted sampling) |
Require: , , , weights with
- 1:
- 2: for to m do
- 3:
- 4: end for
- 5: return
|
| Algorithm 4 Numerical evaluation of (adj quadrature) |
Require: , , , weights , parameters , truncation
- 1: ,
- 2:
- 3:
- 4: Compute using adaptive quadrature
- 5: return
|
| Algorithm 5 Numerical experiment pipeline (tables and figures) |
Require: Interval , number of points N, set of n values, reference , parameters ,
- 1: Construct uniform grid
- 2: Compute for all i
- 3: for each n do
- 4: Compute , ,
- 5: Evaluate metrics for , , via Algorithm 6
- 6: Report a metric table for , ,
- 7: end for
- 8: Plot f and , ,
- 9: Plot pointwise errors , ,
- 10: For each n compute and plot versus n in log–log scale
|
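Under the same assumed kernel and operator forms as in the earlier sketches (illustrative stand-ins, not the paper's exact definitions), the pipeline of Algorithm 5 can be exercised end-to-end; the computed uniform errors mirror the qualitative decay reported in the figures.

```python
import numpy as np

def Z(x):
    # ASSUMED kernel stand-in (kappa = 1 case): 0.25*(tanh(x+1) - tanh(x-1))
    return 0.25 * (np.tanh(x + 1.0) - np.tanh(x - 1.0))

def A_n(f, y, n, T=10.0, m=40001):
    # assumed classical operator, sliding-window composite quadrature
    v = np.linspace(y - T, y + T, m)
    return np.sum(f(v) * n * Z(n * (y - v))) * (v[1] - v[0])

def uniform_error(f, n, grid):
    # discrete sup-norm error over the sampling grid (step 10 of Algorithm 5)
    return max(abs(A_n(f, y, n) - f(y)) for y in grid)

f = lambda x: np.exp(-x ** 2)       # Gaussian reference of Example 1
grid = np.linspace(-2.0, 2.0, 21)   # uniform evaluation grid
errors = {n: uniform_error(f, n, grid) for n in (5, 10, 20, 40)}
# the uniform error shrinks as n grows, as in the log-log decay plots
```

Plotting `errors` against n in log–log scale then reproduces, under these assumptions, the error-decay step of the pipeline.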
| Algorithm 6 Discrete error metrics on a sampling grid |
Require: Grid , targets , approximations
- 1:
- 2:
- 3:
- 4:
- 5: ,
- 6: (small )
- 7: return
|
4.1. Quantitative Error Measures
Let
be a fixed compact interval, containing the main support of
f,
and
We consider a uniform partition of
given by
where
is the mesh size. The values
thus form a sampling grid on
, which is used to evaluate both the target function
f and its operator approximations
. Then, we define the discrete pointwise error by
Motivated by Corollaries 3 and 4, which provide estimates in the uniform norm, we introduce the following quantitative error metrics (see [13]).
- (i)
Maximum (uniform) error
This metric directly reflects the theoretical estimates in (45) and (46).
- (ii)
Mean absolute error (MAE)
where
- (iii)
Mean squared error (MSE)
where
.
- (iv)
Root mean squared error (RMSE)
This quantity penalizes large local deviations and highlights pointwise instabilities.
- (v)
Coefficient of determination ()
Values of
close to 1 indicate strong agreement between
and
f.
These discrete metrics provide numerical counterparts of the theoretical convergence results established in Corollaries 3 and 4 and allow for a detailed comparison of the approximation behavior of the operators , , and for increasing values of n.
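The metrics (i)–(v) translate directly into code; the following sketch mirrors Algorithm 6 (the function and dictionary key names are our own conventions, and the small-ε guard in the R² denominator follows step 6 of the algorithm).

```python
import numpy as np

def error_metrics(f_vals, approx_vals, eps=1e-15):
    """Discrete error metrics of Section 4.1 on a sampling grid."""
    f_vals = np.asarray(f_vals, dtype=float)
    approx_vals = np.asarray(approx_vals, dtype=float)
    e = approx_vals - f_vals                        # discrete pointwise error
    mae = float(np.mean(np.abs(e)))                 # (ii) mean absolute error
    mse = float(np.mean(e ** 2))                    # (iii) mean squared error
    rmse = float(np.sqrt(mse))                      # (iv) root mean squared error
    max_err = float(np.max(np.abs(e)))              # (i) discrete uniform error
    ss_res = float(np.sum(e ** 2))
    ss_tot = float(np.sum((f_vals - np.mean(f_vals)) ** 2))
    r2 = 1.0 - ss_res / (ss_tot + eps)              # (v) coefficient of determination
    return {"MaxError": max_err, "MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}
```

A perfect approximation yields zero errors and R² = 1; a uniform offset of size c yields MAE = RMSE = MaxError = c, which makes the metrics easy to sanity-check.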
4.2. Examples
Example 1. As a test function, we consider the bounded, infinitely differentiable function
This function is a standard benchmark in approximation theory since it is smooth, rapidly decreasing at infinity, and satisfies the assumptions required for convolution-type PLOs.
We consider in (4) as the density kernel, the “classical adj HHC-type operator” as the PLO, and the adj HH-tangent function in (1) as the activation function for
As we observe in Figure 1, Figure 2 and Figure 3 and Table 1, the numerical and graphical convergence results reveal that the proposed operator performs very well in this setting.
Remark 2. Theoretical justification of the “sliding window truncation”:
The HHC-type operator is defined by an improper integral, so a direct numerical evaluation is not feasible due to the infinite integration domain. However, the structure of the kernel allows for a rigorous truncation. Since converges exponentially to as , it follows that there exist suitable constants such that
is satisfied for Consequently, the kernel is rapidly decaying and effectively localized around the origin.
Change of variables and concentration near : Observe that the kernel is evaluated at . Hence, the integrand is significant only when v is close to . For large , the kernel suppresses the contribution exponentially. This yields the estimate
uniformly for .
Sliding window truncation: Based on the previous arguments, the improper integral can be approximated by
where the integration interval slides with the evaluation point y. Please note that this “sliding window” is not an ad hoc numerical trick; it is a direct consequence of the approximate-identity structure of the kernel .
Example 2. In this example, we numerically investigate the approximation behavior of several HHC-type operators acting on a bounded function defined on the real line. We consider the bounded and continuous function
which belongs to the space . This function is not symmetric and exhibits a non-smooth transition at the origin, making it a suitable benchmark for evaluating the proposed operators. We compare , , and generated by the adj HH tangent kernel. For numerical implementation, these integrals are approximated by a sliding-window truncation of the form
where a sufficiently large is chosen to ensure numerical stability and accuracy. As shown in Figure 4, Figure 5 and Figure 6 and Table 2, the numerical experiments and graphical convergence results demonstrate that the proposed operators , , and exhibit satisfactory performance within the considered functional setting. The operators are evaluated on a uniform grid in the interval for several values of n.
Approximation quality is measured using classical regression-based metrics, including the root mean square error (RMSE), mean absolute error (MAE), maximum error, and the coefficient of determination .
5. Discussion and Future Directions
The use of ML error metrics such as RMSE and MAE provides an average-case, regression-oriented interpretation of the approximation behavior of the proposed operators. These metrics complement classical uniform error estimates and allow for a transparent numerical comparison of different operator constructions at finite resolution. From an ML perspective, the proposed convolution-type NN operators can be interpreted as deterministic regression models whose approximation accuracy is assessed using standard ML error metrics.
In this study, we have introduced and rigorously analyzed new classes of convolution-type approximation operators constructed via adj HH-tangent activation function using the framework of PLOs, and we have established quantitative convergence estimates toward the identity operator, supported by precise inequalities involving the modulus of continuity.
Furthermore, by extending the analysis to simultaneous settings, we have demonstrated the robustness and adaptability of these operators. The integration of adjustability within the operator design not only enhances theoretical understanding but also holds potential for practical applications in NN training and deep learning (DL) architectures.
Future research may further explore the performance of these operators in applications to real-world datasets and their adaptation within stochastic frameworks. Related problems arise in fields such as image processing, computer vision, signal processing and acoustics, and natural language processing (NLP), providing natural real-life applications.
Author Contributions
Conceptualization, G.A.A., S.K. and M.Z.; methodology, G.A.A. and S.K.; software, M.Z. and S.K.; validation, G.A.A., S.K. and M.Z.; formal analysis, G.A.A.; investigation, G.A.A., S.K. and M.Z.; resources, G.A.A.; writing—original draft preparation, G.A.A. and S.K.; writing—review and editing, G.A.A., S.K. and M.Z.; visualization, S.K. and M.Z.; supervision, G.A.A.; project administration, S.K. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Acknowledgments
The authors thank the reviewers for their constructive comments.
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A. Python Implementation
References
- Bernstein, S.N. Démonstration du théorème de Weierstrass fondée sur le calcul des probabilités. Commun. Soc. Math. Kharkow 1912, 13, 1–2. [Google Scholar]
- Korovkin, P.P. Linear Operators and Approximation Theory; Hindustan Publ. Corp.: Delhi, India, 1953. [Google Scholar]
- Butzer, P.L.; Nessel, R.J. Fourier Analysis and Approximation; Academic Press: New York, NY, USA, 1971; Volumes I–II. [Google Scholar]
- Butzer, P.L.; Nessel, R.J. Approximation Theory and Functional Analysis; Academic Press: New York, NY, USA, 1971. [Google Scholar]
- Anastassiou, G.A. Parametrized, Deformed and General Neural Networks; Springer: Berlin/Heidelberg, Germany, 2023. [Google Scholar]
- Anastassiou, G.A. Approximation by symmetrized and perturbed hyperbolic tangent activated convolution type operators. Mathematics 2024, 12, 3302. [Google Scholar] [CrossRef]
- Anastassiou, G.A. Neural networks in infinite domain as positive linear operators. Ann. Univ. Sci. Bp. Sect. Comp. 2025, 58, 15–29. [Google Scholar] [CrossRef]
- Anastassiou, G.A. Approximation by symmetrized and perturbed hyperbolic tangent activated convolutions as positive linear operators. Mod. Math. Methods 2025, 3, 72–84. [Google Scholar] [CrossRef]
- Kim, D.; Kim, W.; Kim, S. Tanh works better with asymmetry. Adv. Neural Inf. Process. Syst. 2023, 36, 12536–12554. [Google Scholar]
- Xu, S. Data mining using an adaptive HONN model with hyperbolic tangent neurons. In Knowledge Management and Acquisition for Smart Systems and Services; Springer: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
- Zhang, S.; Ren, G. RoSwish: A novel rotating swish activation function with adaptive rotation around zero. Neural Netw. 2025, 192, 107892. [Google Scholar] [CrossRef] [PubMed]
- Mamedov, R.G. Asymptotic approximation of differentiable functions with linear positive operators. Dokl. Akad. Nauk SSSR 1959, 128, 471–474. (In Russian) [Google Scholar]
- Chicco, D.; Warrens, M.J.; Jurman, G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 2021, 7, e623. [Google Scholar] [CrossRef] [PubMed]
Figure 1.
Approximation of the Gaussian function by HHC-type PLOs for increasing values of n. The operator curves converge uniformly to f on compact subsets of , illustrating the theoretical approximation properties.
Figure 2.
Pointwise absolute errors for the HHC-type operators approximating . As n increases, the error curves uniformly decrease, confirming the theoretical convergence of the operators.
Figure 3.
Uniform norm error decay of the HHC-type operators for the approximation of the bounded function . The decrease in the supremum norm as n increases confirms the quantitative convergence predicted by the theoretical results.
Figure 4.
Comparison of operator approximations for a fixed parameter . The target function is depicted by the dashed curve, while , , and denote the adj classical, adj Kantorovich-type, and adj quadrature HHC-type operators, respectively.
Figure 5.
Pointwise absolute error functions for . The curves represent the absolute deviations , , and corresponding to the adj classical, adj Kantorovich-type, and adj quadrature HHC-type operators, respectively.
Figure 6.
Uniform error decay of the HHC-type operators. The log–log plot illustrates the convergence behavior of , , and in as n increases. Here, represents , , and , respectively.
Table 1.
Error analysis for the approximation of by HHC-type operators . The decay of RMSE, MAE, and uniform error , together with , confirms convergence as n increases.
| n | RMSE | MAE | Max Error | |
|---|---|---|---|---|
| 5 | | | | |
| 10 | | | | |
| 20 | | | | |
| 40 | | | | |
Table 2.
Regression-based error metrics for the HHC-type operators , , and applied to the test function in (60).
| n | Operator | RMSE | MAE | Max Error | |
|---|---|---|---|---|---|
| 5 | | | | | |
| | | | | | |
| | | | | | |
| 10 | | | | | |
| | | | | | |
| | | | | | |
| 20 | | | | | |
| | | | | | |
| | | | | | |
| 40 | | | | | |
| | | | | | |
| | | | | | |
| 80 | | | | | |
| | | | | | |
| | | | | | |