Article

Multiple-Composite Quantitative Approximation by Multivariate Kantorovich–Choquet Neural Networks

by
George A. Anastassiou
Department of Mathematical Sciences, University of Memphis, Memphis, TN 38152, USA
Mathematics 2026, 14(6), 1027; https://doi.org/10.3390/math14061027
Submission received: 28 January 2026 / Revised: 7 March 2026 / Accepted: 13 March 2026 / Published: 18 March 2026

Abstract

In this work we study the univariate and multivariate quantitative approximation by multi-composite Kantorovich–Choquet-type quasi-interpolation neural network operators with respect to the supremum norm. This is achieved with rates via the first univariate and multivariate moduli of continuity. We approximate continuous and bounded non-negative functions on $\mathbb{R}^N$, $N \in \mathbb{N}$. When they are also uniformly continuous we obtain pointwise and uniform convergences, plus $L_p$ estimates. Our multi-composite activation functions are formed by general sigmoid functions.

1. Introduction

Multi-composition of activation functions refers to the technique of chaining, mixing, or stacking multiple activation functions, either in sequence or in parallel (concatenation), to create more complex, flexible, and often trainable non-linearities in neural networks. Instead of using a single activation (like ReLU) throughout the network, this approach leverages multiple functions to improve performance, enhance model expressiveness, and optimize learning.
Types of Activation Composition
  • Sequential Composition: Applying one activation function after another (e.g., $f(g(x))$). A recent development includes “PolyCom” (Polynomial Composition), which combines polynomial functions with others like ReLU to accelerate convergence and improve accuracy. Another example is compounding functions to create “normalized cusp neural network operators”, which can reduce infinite domains to compact supports, enhancing approximation capabilities.
  • Parallel Composition (Concatenation): Computing the outputs of different activation functions ($f_1(x), f_2(x), \ldots$) for the same input and concatenating them into a single vector. This allows the network to utilize multiple non-linearities at once; both the sequential and the parallel pattern are sketched in code at the end of this overview.
  • Dynamic Activation Composition (Dyn): A technique that uses learnable, normalized convex combinations of basis activation functions. This allows the network to adaptively “mix” activations during training, enhancing model adaptability and performance.
  • Multivariate/Multi-dimensional Activations: Moving beyond simple scalar functions to functions that take multiple inputs and produce multiple outputs (e.g., generalizing ReLU to a second-order cone projection). These approaches are shown to have higher expressive power than traditional single-input, single-output activations.
Advantages and Applications
  • Increased Expressiveness: Complex compositions allow networks to learn more intricate patterns and handle non-linearly separable data more effectively.
  • Improved Accuracy: Studies have shown that combining different activation functions can lead to better performance compared to using a single standard activation, particularly for less predictable data distributions.
  • Faster Convergence: Certain compositions, such as multi-kernel activation functions (multi-KAF) or dynamic mixtures, can help models converge faster by better adapting to the data.
  • Controlled Output Range: Composing functions can limit the output to specific, desirable ranges, which helps in focusing on important information and filtering out noise.
  • Better Gradient Flow: Some compositions, such as concatenating Swish and Tanh, can provide paths with non-zero derivatives, helping to mitigate vanishing gradient problems.
Key Research Findings
  • Flexibility: Research indicates that compositing activation functions—such as using a “cusp” function (a composition of two functions)—results in more flexible and powerful neural networks.
  • Learned Activations: Instead of manually selecting the composition, techniques like “Trainable Adaptive Activation Function Structure (TAAFS)” learn the optimal combination of activation functions during training.
  • Efficiency Concerns: While composing functions can improve accuracy, it may add to the computational load. However, the performance gains often justify the added complexity.
This approach is increasingly being explored to improve the performance of deep learning models, including Transformers and Large Language Models (LLMs).
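As a concrete illustration of the sequential and parallel patterns listed above, the following minimal NumPy sketch chains two activations and, separately, concatenates their outputs. It is not taken from any cited implementation; the activation choices (tanh and ReLU) and the input values are arbitrary placeholders.

```python
import numpy as np

def relu(x):
    """Piecewise-linear activation."""
    return np.maximum(x, 0.0)

def sequential(x, fs):
    """Sequential composition (f_1 o f_2 o ... o f_k)(x): the innermost function acts first."""
    for f in reversed(fs):
        x = f(x)
    return x

def parallel(x, fs):
    """Parallel composition: evaluate each activation on the same input and concatenate."""
    return np.concatenate([f(x) for f in fs], axis=-1)

x = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
print(sequential(x, [np.tanh, relu]))   # tanh(relu(x)), a chained non-linearity
print(parallel(x, [np.tanh, relu]))     # [tanh(x) | relu(x)], doubling the feature width
```

Dynamic activation composition, as described above, would replace the fixed chain in this sketch with a learnable, normalized convex combination of such basis activations.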
The author in [1,2] (see Chapters 2–5) was the first to establish neural network approximations to continuous functions with rates, using very specifically defined neural network operators of Cardaliaguet–Euvrard and “Squashing” types, by employing the modulus of continuity of the engaged function or of its high-order derivative and producing very tight Jackson-type inequalities. He treats there both the univariate and multivariate cases. The “bell-shaped” and “squashing” functions defining these operators are assumed to be of compact support. Also in [2], he gives the $N$th-order asymptotic expansion for the error of weak approximation of these two operators on a special natural class of smooth functions; see Chapters 4–5.
The author, inspired by [3], continued his studies on neural network approximation by introducing and using the proper quasi-interpolation operators of sigmoidal and hyperbolic-tangent type, which resulted in [4,5,6,7], treating both the univariate and multivariate cases. He also treated the corresponding fractional cases [5,7].
Here, the author performs multi-composite univariate and multivariate neural network approximations, based on general sigmoid activation functions, to continuous non-negative functions over the whole $\mathbb{R}^N$, $N \in \mathbb{N}$; he then extends his results to complex-valued functions. $L_p$ approximations are also included. All convergences here are with rates, expressed via the modulus of continuity of the involved function and given by very tight Jackson-type inequalities.
The author comes up with the “right”, precisely defined, flexible quasi-interpolation, Kantorovich–Choquet-type integral-coefficient neural network operators associated with multi-composite general sigmoid activation functions. In preparation for proving our results, we establish important properties of the multi-composite general density functions defining our operators.
Feed-forward neural networks (FNNs) with one hidden layer, the only type of networks we deal with in this work, are mathematically expressed as
$$N_n(x) = \sum_{j=0}^{n} c_j\, \sigma(a_j \cdot x + b_j), \quad x \in \mathbb{R}^s,\ s \in \mathbb{N},$$
where for $0 \le j \le n$, $b_j \in \mathbb{R}$ are the thresholds, $a_j \in \mathbb{R}^s$ are the connection weights, $c_j \in \mathbb{R}$ are the coefficients, $a_j \cdot x$ is the inner product of $a_j$ and $x$, and $\sigma$ is the activation function of the network. For neural networks in general, read [8,9,10,11,12]. For recent related works, see [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32].
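For concreteness, the following sketch evaluates a network of exactly this one-hidden-layer form; the random weights, the dimensions, and the choice $\sigma = \tanh$ are illustrative placeholders rather than parameters taken from the paper.

```python
import numpy as np

def feedforward(x, a, b, c, sigma=np.tanh):
    """Evaluate N_n(x) = sum_{j=0}^{n} c_j * sigma(a_j . x + b_j).

    x : input vector in R^s
    a : (n+1, s) array of connection weights a_j
    b : (n+1,) array of thresholds b_j
    c : (n+1,) array of coefficients c_j
    """
    return np.dot(c, sigma(a @ x + b))

rng = np.random.default_rng(0)
s, n = 3, 10                       # input dimension and hidden-unit index range 0..n
a = rng.normal(size=(n + 1, s))
b = rng.normal(size=n + 1)
c = rng.normal(size=n + 1)
x = np.array([0.5, -1.0, 2.0])
print(feedforward(x, a, b, c))     # a single scalar network output
```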

2. Background

2.1. Description of Choquet Integral [33]

We make
Definition 1. 
Consider $\Omega$ and let $\mathcal{C}$ be a $\sigma$-algebra of subsets of $\Omega$.
(i) (see, e.g., [29], p. 63) The set function $\mu : \mathcal{C} \to [0, +\infty)$ is called a monotone set function (or capacity) if $\mu(\emptyset) = 0$ and $\mu(A) \le \mu(B)$ for all $A, B \in \mathcal{C}$ with $A \subseteq B$. Also, $\mu$ is called submodular if
$$\mu(A \cup B) + \mu(A \cap B) \le \mu(A) + \mu(B), \quad \text{for all } A, B \in \mathcal{C}.$$
$\mu$ is called bounded if $\mu(\Omega) < +\infty$ and normalized if $\mu(\Omega) = 1$.
(ii) (see, e.g., [29], p. 233, or [2]) If $\mu$ is a monotone set function on $\mathcal{C}$ and if $f : \Omega \to \mathbb{R}$ is $\mathcal{C}$-measurable (that is, for any Borel subset $B \subseteq \mathbb{R}$ it follows that $f^{-1}(B) \in \mathcal{C}$), then for any $A \in \mathcal{C}$, the Choquet integral is defined by
$$(C)\int_A f\, d\mu = \int_0^{+\infty} \mu\left(F_\beta(f) \cap A\right) d\beta + \int_{-\infty}^{0} \left[\mu\left(F_\beta(f) \cap A\right) - \mu(A)\right] d\beta,$$
where we used the notation $F_\beta(f) = \{\omega \in \Omega : f(\omega) \ge \beta\}$. Notice that if $f \ge 0$ on $A$, then in the above formula the second integral equals $0$.
The integrals on the right-hand side are usual Riemann integrals.
The function $f$ will be called Choquet integrable on $A$ if $(C)\int_A f\, d\mu \in \mathbb{R}$.
Next, we list some well-known properties of the Choquet integral.
Remark 1. 
If μ : C 0 , + is a monotone set function, then the following properties hold:
(i) For all $a \ge 0$, we have $(C)\int_A a f\, d\mu = a \cdot (C)\int_A f\, d\mu$ (if $f \ge 0$, see, e.g., [29], Theorem 11.2, (5), p. 228; if $f$ is of arbitrary sign, see, e.g., [34], p. 64, Proposition 5.1, (ii)).
(ii) For all $c \in \mathbb{R}$ and $f$ of arbitrary sign, we have (see, e.g., [29], pp. 232–233, or [34], p. 65) $(C)\int_A (f + c)\, d\mu = (C)\int_A f\, d\mu + c \cdot \mu(A)$.
If $\mu$ is submodular too, then for all $f, g$ of arbitrary sign and lower bounded, we have (see, e.g., [34], p. 75, Theorem 6.3)
$$(C)\int_A (f + g)\, d\mu \le (C)\int_A f\, d\mu + (C)\int_A g\, d\mu.$$
(iii) If $f \le g$ on $A$, then $(C)\int_A f\, d\mu \le (C)\int_A g\, d\mu$ (see, e.g., [29], p. 228, Theorem 11.2, (3), if $f, g \ge 0$, and p. 232 if $f, g$ are of arbitrary sign).
(iv) Let $f \ge 0$. If $A \subseteq B$, then $(C)\int_A f\, d\mu \le (C)\int_B f\, d\mu$. In addition, if $\mu$ is finitely subadditive, then
$$(C)\int_{A \cup B} f\, d\mu \le (C)\int_A f\, d\mu + (C)\int_B f\, d\mu.$$
(v) It is immediate that $(C)\int_A 1\, d\mu(t) = \mu(A)$.
(vi) If $\mu$ is a countably additive bounded measure, then the Choquet integral $(C)\int_A f\, d\mu$ reduces to the usual Lebesgue-type integral (see, e.g., [34], p. 62, or [29], p. 226).
(vii) If $f \ge 0$, then $(C)\int_A f\, d\mu \ge 0$.
(viii) If $\Omega = \mathbb{R}^N$, $N \in \mathbb{N}$, we call $\mu$ strictly positive if $\mu(A) > 0$ for any open subset $A \subseteq \mathbb{R}^N$. See also [35].
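A small numerical sketch of Definition 1 may help. For a non-negative $f$ and the distorted Lebesgue capacity $\mu(A) = \sqrt{m(A)}$ (a standard example of a monotone, submodular, strictly positive set function; this particular capacity and test function are our own choices for the sketch, not taken from the paper), the Choquet integral can be approximated directly from the level-set formula:

```python
import numpy as np

def trapezoid(y, x):
    """Simple trapezoid rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def choquet_integral(f, a, b, gamma, n_x=20000, n_beta=2000):
    """Approximate (C) int_[a,b] f dmu for f >= 0, where mu(A) = gamma(lebesgue_measure(A)).

    Uses the level-set formula of Definition 1 (ii):
    (C) int f dmu = int_0^infty mu({x in [a,b] : f(x) >= beta}) dbeta.
    """
    xs = np.linspace(a, b, n_x)
    fx = f(xs)
    dx = (b - a) / (n_x - 1)
    betas = np.linspace(0.0, fx.max(), n_beta)
    # Lebesgue measure of each superlevel set (grid count times dx), distorted by gamma.
    levels = np.array([gamma(np.count_nonzero(fx >= beta) * dx) for beta in betas])
    return trapezoid(levels, betas)

f = lambda x: x ** 2                                 # a simple non-negative test function on [0, 1]
print(choquet_integral(f, 0.0, 1.0, np.sqrt))        # distorted (submodular) capacity mu(A) = sqrt(m(A))
print(choquet_integral(f, 0.0, 1.0, lambda t: t))    # gamma(t) = t recovers the Lebesgue integral, ~ 1/3
```

With the identity distortion $\gamma(t) = t$, the routine reproduces the classical integral $\int_0^1 x^2\, dx = 1/3$, in line with property (vi) of Remark 1.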

2.2. On Multi-Composite Activation Functions

We mention:
Definition 2. 
Let $i = 1, 2, \ldots$, and let $h_i : \mathbb{R} \to [-1, 1]$ be general sigmoid activation functions that are strictly increasing, with $h_i(0) = 0$, $h_i(-x) = -h_i(x)$ for all $x \in \mathbb{R}$, $h_i(+\infty) = 1$, $h_i(-\infty) = -1$. Also, $h_i$ is strictly convex over $(-\infty, 0]$ and strictly concave over $[0, +\infty)$, with $h_i^{(2)} \in C(\mathbb{R})$. Examples: $\tanh x$, $\operatorname{erf}(x)$, the normalized $\arctan x$, the normalized Gudermannian function, etc.
Notice here that $0 < h_i(1) \le 1$, $i = 1, 2, \ldots$. Any composition $h_1 \circ h_2 \circ h_3 \circ \cdots \circ h_\lambda$ is meant to be $h_1|_{[-1,1]} \circ h_2|_{[-1,1]} \circ h_3|_{[-1,1]} \circ \cdots \circ h_{\lambda-1}|_{[-1,1]} \circ h_\lambda$, $\lambda \in \mathbb{N}$, and for convenience we denote it by $G_\lambda := h_1 \circ h_2 \circ h_3 \circ \cdots \circ h_\lambda$. We have, for any $\lambda \in \mathbb{N}$: $0 < h_\lambda(1) \le 1$, hence $0 < h_{\lambda-1}(h_\lambda(1)) \le h_{\lambda-1}(1) \le 1$, and $0 < h_{\lambda-2}(h_{\lambda-1}(h_\lambda(1))) \le h_{\lambda-2}(h_{\lambda-1}(1)) \le h_{\lambda-2}(1) \le 1$.
Inductively we derive that $0 < G_\lambda(1) \le 1$, for all $\lambda \in \mathbb{N}$.
Clearly, $G_\lambda(0) = 0$ and $G_\lambda$ is strictly increasing over $\mathbb{R}$. Furthermore, it holds that
$$G_\lambda(-x) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(h_\lambda(-x)) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(-h_\lambda(x)) = \cdots = -(h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1} \circ h_\lambda)(x) = -G_\lambda(x), \quad \forall\, x \in \mathbb{R}.$$
Clearly, it holds that $G_\lambda^{(2)} \in C(\mathbb{R})$.
We notice that
$$G_\lambda(+\infty) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(h_\lambda(+\infty)) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(1) = G_{\lambda-1}(1),$$
and
$$G_\lambda(-\infty) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(h_\lambda(-\infty)) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(-1) = -(h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(1) = -G_{\lambda-1}(1).$$
Consequently, it holds that
$$-G_{\lambda-1}(1) < G_\lambda(x) < G_{\lambda-1}(1), \quad \forall\, x \in \mathbb{R}.$$
Thus, $y = \pm G_{\lambda-1}(1)$ are the horizontal asymptotes of $G_\lambda(x)$, for any $\lambda \in \mathbb{N}$.
Next, we act over $(-\infty, 0]$: let $\lambda, \mu \ge 0$ with $\lambda + \mu = 1$. Then, by the convexity of $h_2$, we have
$$h_2(\lambda x + \mu y) \le \lambda h_2(x) + \mu h_2(y), \quad \forall\, x, y \in (-\infty, 0];$$
and
$$h_1(h_2(\lambda x + \mu y)) \le h_1(\lambda h_2(x) + \mu h_2(y)) \le \lambda h_1(h_2(x)) + \mu h_1(h_2(y)), \quad \forall\, x, y \in (-\infty, 0],$$
so that $h_1 \circ h_2$ is convex over $(-\infty, 0]$.
Now we work on $[0, +\infty)$: let $\lambda, \mu \ge 0$ with $\lambda + \mu = 1$. Then, by the concavity of $h_2$, we have
$$h_2(\lambda x + \mu y) \ge \lambda h_2(x) + \mu h_2(y), \quad \forall\, x, y \in [0, +\infty);$$
and
$$h_1(h_2(\lambda x + \mu y)) \ge h_1(\lambda h_2(x) + \mu h_2(y)) \ge \lambda h_1(h_2(x)) + \mu h_1(h_2(y)), \quad \forall\, x, y \in [0, +\infty).$$
Thus, $h_1 \circ h_2$ is concave over $[0, +\infty)$.
Therefore, $G_2 = h_1 \circ h_2$ is a general sigmoid activation function with asymptotes $y = \pm h_1(1)$, fulfilling the rest of the conditions of Definition 2.
Arguing as above, $h_2 \circ h_3 : \mathbb{R} \to [-1, 1]$ fulfills Definition 2, and $h_1 \circ h_2 \circ h_3$ does the same with asymptotes $y = \pm h_1(h_2(1))$.
Inductively, we prove that $G_\lambda$ fulfills Definition 2 with asymptotes $y = \pm G_{\lambda-1}(1)$.
We have proved the following:
Theorem 1. 
Let $\lambda \in \mathbb{N}$. Then $G_\lambda := h_1 \circ h_2 \circ h_3 \circ \cdots \circ h_\lambda$ fulfills all the properties of Definition 2, with asymptotes $y = \pm G_{\lambda-1}(1)$. That is, $G_\lambda$ is a multi-composite general sigmoid activation function from $\mathbb{R}$ into $[-1, 1]$.
Corollary 1. 
$\frac{G_\lambda}{G_{\lambda-1}(1)}$ fulfills Definition 2 with asymptotes $y = \pm 1$.
We call
$$\tilde{G}_\lambda := \frac{G_\lambda}{G_{\lambda-1}(1)}.$$
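As a numerical illustration of Theorem 1 and Corollary 1, one can compose the example activations of Definition 2 and normalize by $G_{\lambda-1}(1)$. The particular sigmoids and the $\arctan$ normalization below are our own choices for the sketch, not prescriptions of the paper.

```python
import numpy as np
from math import erf, atan, pi, tanh

# Example sigmoids of Definition 2; the arctan normalization here is one admissible choice.
h1 = tanh
h2 = erf
h3 = lambda x: (2.0 / pi) * atan(x)

def G(x, hs):
    """Multi-composite sigmoid G_lambda(x) = (h_1 o h_2 o ... o h_lambda)(x)."""
    for h in reversed(hs):
        x = h(x)
    return x

hs = [h1, h2, h3]                          # lambda = 3
asymptote = G(1.0, hs[:-1])                # G_{lambda-1}(1), the value of the horizontal asymptote
G_tilde = lambda x: G(x, hs) / asymptote   # normalized multi-composite sigmoid, asymptotes +-1

xs = np.linspace(-10.0, 10.0, 9)
vals = np.array([G_tilde(x) for x in xs])
print(vals)                                # odd, strictly increasing, bounded by 1 in absolute value
print(np.allclose(vals, -vals[::-1]))      # numerical check of G_tilde(-x) = -G_tilde(x)
print(G_tilde(50.0))                       # close to the asymptote 1
```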
Remark 2. 
Next, we consider the function
$$T_\lambda(x) := \frac{1}{4}\left(\tilde{G}_\lambda(x + 1) - \tilde{G}_\lambda(x - 1)\right) > 0, \quad x \in \mathbb{R},\ \lambda \in \mathbb{N}.$$
We observe that
$$T_\lambda(-x) = \frac{1}{4}\left(\tilde{G}_\lambda(-x + 1) - \tilde{G}_\lambda(-x - 1)\right) = \frac{1}{4}\left(\tilde{G}_\lambda(-(x - 1)) - \tilde{G}_\lambda(-(x + 1))\right) = \frac{1}{4}\left(-\tilde{G}_\lambda(x - 1) + \tilde{G}_\lambda(x + 1)\right) = \frac{1}{4}\left(\tilde{G}_\lambda(x + 1) - \tilde{G}_\lambda(x - 1)\right) = T_\lambda(x).$$
That is, $T_\lambda$ is an even function:
$$T_\lambda(-x) = T_\lambda(x), \quad \forall\, x \in \mathbb{R},\ \lambda \in \mathbb{N}.$$
We see that
$$T_\lambda(0) = \frac{\tilde{G}_\lambda(1)}{2}.$$
Let $x > 1$; we have that
$$T_\lambda'(x) = \frac{1}{4}\left(\tilde{G}_\lambda'(x + 1) - \tilde{G}_\lambda'(x - 1)\right) < 0,$$
by $\tilde{G}_\lambda'$ being strictly decreasing over $[0, +\infty)$ (as $\tilde{G}_\lambda$ is strictly concave there).
Let now $0 < x < 1$; then $1 - x > 0$ and $0 < 1 - x < 1 + x$. It holds that $\tilde{G}_\lambda'(x - 1) = \tilde{G}_\lambda'(1 - x) > \tilde{G}_\lambda'(x + 1)$ (the derivative of the odd function $\tilde{G}_\lambda$ is even), so that again $T_\lambda'(x) < 0$. Consequently, $T_\lambda$ is strictly decreasing on $(0, +\infty)$.
Clearly, $T_\lambda$ is strictly increasing on $(-\infty, 0)$, and $T_\lambda'(0) = 0$.
We observe that
$$\lim_{x \to +\infty} T_\lambda(x) = \frac{1}{4}\left(\tilde{G}_\lambda(+\infty) - \tilde{G}_\lambda(+\infty)\right) = 0,$$
and
$$\lim_{x \to -\infty} T_\lambda(x) = \frac{1}{4}\left(\tilde{G}_\lambda(-\infty) - \tilde{G}_\lambda(-\infty)\right) = 0.$$
That is, the $x$-axis is the horizontal asymptote of $T_\lambda$.
As a result, $T_\lambda$ is a bell-shaped symmetric function with maximum
$$T_\lambda(0) = \frac{\tilde{G}_\lambda(1)}{2}.$$
We need
Theorem 2. 
It holds that
$$\sum_{i=-\infty}^{\infty} T_\lambda(x - i) = 1, \quad \forall\, x \in \mathbb{R}.$$
Proof. 
Being similar to [6], p. 286, it is omitted. □
Theorem 3. 
We have that
$$\int_{-\infty}^{\infty} T_\lambda(x)\, dx = 1.$$
Proof. 
Being similar to [6], p. 287, it is omitted. □
So $T_\lambda(x)$ can serve as a density function in general.
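The following sketch checks Theorems 2 and 3 numerically for one concrete instance, $G_2 = \tanh \circ \tanh$ normalized by $G_1(1) = \tanh(1)$ (our own choice; any admissible $h_i$ would do), by truncating the series and the integral to a wide window.

```python
import numpy as np

# One concrete normalized multi-composite sigmoid: G_2 = tanh o tanh, divided by G_1(1) = tanh(1).
G_tilde = lambda x: np.tanh(np.tanh(x)) / np.tanh(1.0)

# The bell-shaped density of Remark 2.
T = lambda x: 0.25 * (G_tilde(x + 1.0) - G_tilde(x - 1.0))

# Theorem 2: sum_i T(x - i) = 1 for every real x (series truncated; T decays to 0 very fast).
ks = np.arange(-200, 201)
for x in (0.0, 0.3, -1.7):
    print(x, np.sum(T(x - ks)))          # each value is essentially 1

# Theorem 3: integral of T over R equals 1 (trapezoid rule over a wide window).
xs = np.linspace(-200.0, 200.0, 400001)
ys = T(xs)
print(np.sum((ys[1:] + ys[:-1]) * np.diff(xs)) / 2.0)   # essentially 1
```

Both checks rely only on the asymptotes $\pm 1$ of $\tilde{G}_\lambda$, which is why the normalization of Corollary 1 enters the definition of $T_\lambda$.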
We need
Theorem 4 
([36]). Let $0 < \alpha < 1$, and $n \in \mathbb{N}$ with $n^{1-\alpha} > 2$. Then
$$\sum_{\substack{k=-\infty \\ |nx - k| \ge n^{1-\alpha}}}^{\infty} T_\lambda(nx - k) < 1 - \tilde{G}_\lambda\left(n^{1-\alpha} - 2\right), \quad \forall\, x \in \mathbb{R},$$
and
$$\lim_{n \to +\infty} \left(1 - \tilde{G}_\lambda\left(n^{1-\alpha} - 2\right)\right) = 0.$$
Denote by $\lfloor \cdot \rfloor$ the integral part of a number and by $\lceil \cdot \rceil$ its ceiling.
We also need
Theorem 5. 
Let $x \in [a, b] \subset \mathbb{R}$ and $n \in \mathbb{N}$ so that $\lceil na \rceil \le \lfloor nb \rfloor$. It holds that
$$\frac{1}{\sum_{k = \lceil na \rceil}^{\lfloor nb \rfloor} T_\lambda(nx - k)} < \frac{1}{T_\lambda(1)}, \quad \forall\, x \in [a, b].$$
Proof. 
Being similar to [6], p. 289, it is omitted. □
Remark 3. 
We have that
$$\lim_{n \to \infty} \sum_{k = \lceil na \rceil}^{\lfloor nb \rfloor} T_\lambda(nx - k) \ne 1,$$
for at least some $x \in [a, b]$.
See [6], p. 290, for the same reasoning.
Note 1. 
For large enough $n$, we always obtain $\lceil na \rceil \le \lfloor nb \rfloor$. Also, $a \le \frac{k}{n} \le b$ if $\lceil na \rceil \le k \le \lfloor nb \rfloor$. In general, it holds (by (14)) that
$$\sum_{k = \lceil na \rceil}^{\lfloor nb \rfloor} T_\lambda(nx - k) \le 1.$$
We make
Remark 4. 
We define
$$\hat{Z}(x_1, \ldots, x_N) := \hat{Z}(x) := \prod_{i=1}^{N} T_\lambda(x_i), \quad x = (x_1, \ldots, x_N) \in \mathbb{R}^N,\ N \in \mathbb{N}.$$
It has the properties:
(i)
$$\hat{Z}(x) > 0, \quad \forall\, x \in \mathbb{R}^N,$$
(ii)
$$\sum_{k=-\infty}^{\infty} \hat{Z}(x - k) := \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} \cdots \sum_{k_N=-\infty}^{\infty} \hat{Z}(x_1 - k_1, \ldots, x_N - k_N) = \sum_{k_1=-\infty}^{\infty} \cdots \sum_{k_N=-\infty}^{\infty} \prod_{i=1}^{N} T_\lambda(x_i - k_i) = \prod_{i=1}^{N} \left( \sum_{k_i=-\infty}^{\infty} T_\lambda(x_i - k_i) \right) \overset{(14)}{=} 1.$$
Hence
$$\sum_{k=-\infty}^{\infty} \hat{Z}(x - k) = 1.$$
That is,
(iii)
$$\sum_{k=-\infty}^{\infty} \hat{Z}(nx - k) = 1, \quad \forall\, x \in \mathbb{R}^N,\ n \in \mathbb{N},$$
and
(iv)
$$\int_{\mathbb{R}^N} \hat{Z}(x)\, dx = \int_{\mathbb{R}^N} \prod_{i=1}^{N} T_\lambda(x_i)\, dx_1 \cdots dx_N = \prod_{i=1}^{N} \left( \int_{-\infty}^{\infty} T_\lambda(x_i)\, dx_i \right) \overset{(15)}{=} 1.$$
Thus
$$\int_{\mathbb{R}^N} \hat{Z}(x)\, dx = 1.$$
That is, $\hat{Z}$ is a multivariate density function.
Here, we denote $x = (x_1, \ldots, x_N)$ and $\|x\|_\infty := \max\{|x_1|, \ldots, |x_N|\}$, $x \in \mathbb{R}^N$; we also set $\infty := (\infty, \ldots, \infty)$ and $-\infty := (-\infty, \ldots, -\infty)$ in the multivariate context, and take $0 < \beta < 1$.
(v) We have
$$\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \hat{Z}(nx - k) = \sum_{\substack{k_1=-\infty, \ldots, k_N=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \prod_{i=1}^{N} T_\lambda(nx_i - k_i)$$
(for some $r \in \{1, \ldots, N\}$ it holds $\left|\frac{k_r}{n} - x_r\right| > \frac{1}{n^\beta}$)
$$\le \left( \prod_{\substack{i=1 \\ i \ne r}}^{N} \sum_{k_i=-\infty}^{\infty} T_\lambda(nx_i - k_i) \right) \left( \sum_{\substack{k_r=-\infty \\ \left|\frac{k_r}{n} - x_r\right| > \frac{1}{n^\beta}}}^{\infty} T_\lambda(nx_r - k_r) \right) = \sum_{\substack{k_r=-\infty \\ |nx_r - k_r| > n^{1-\beta}}}^{\infty} T_\lambda(nx_r - k_r) \overset{(16)}{<} 1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right).$$
That is,
$$\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \hat{Z}(nx - k) < 1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right),$$
for $0 < \beta < 1$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$, and $x \in \prod_{i=1}^{N} [a_i, b_i]$.
We denote by
$$\hat{\delta}_N(\beta, n) := 1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right), \quad 0 < \beta < 1.$$
For $f \in C_B^+(\mathbb{R}^N)$ (continuous and bounded functions from $\mathbb{R}^N$ into $\mathbb{R}_+$), we define the first modulus of continuity
$$\omega_1(f, h) := \sup_{\substack{x, y \in \mathbb{R}^N \\ \|x - y\|_\infty \le h}} |f(x) - f(y)|, \quad h > 0.$$
Given that $f \in C_U^+(\mathbb{R}^N)$ (uniformly continuous functions from $\mathbb{R}^N$ into $\mathbb{R}_+$; the same definition of $\omega_1$ applies), we have that
$$\lim_{h \to 0} \omega_1(f, h) = 0.$$
When $N = 1$, $\omega_1$ is defined as in (30) with $\|\cdot\|_\infty$ collapsing to $|\cdot|$, and it has the property (31).
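For the univariate case, a grid-based estimate of $\omega_1(f, h)$ illustrates property (31); the test function, interval, and grid resolution below are arbitrary choices for the sketch.

```python
import numpy as np

def omega1(f, h, a=-10.0, b=10.0, num=20001):
    """Grid estimate of omega_1(f, h) = sup_{|x - y| <= h} |f(x) - f(y)| on [a, b] (univariate case)."""
    xs = np.linspace(a, b, num)
    fx = f(xs)
    step = (b - a) / (num - 1)
    m = max(1, int(np.floor(h / step)))          # largest number of grid steps not exceeding h
    return max(float(np.max(np.abs(fx[k:] - fx[:-k]))) for k in range(1, m + 1))

f = lambda x: np.abs(np.sin(x))                  # non-negative, bounded, uniformly continuous on R
for h in (1.0, 0.5, 0.1, 0.01):
    print(h, omega1(f, h))                       # decreases towards 0 as h -> 0, in line with (31)
```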

3. Main Results

We need
Definition 3. 
Let $\mathcal{L}$ be the Lebesgue $\sigma$-algebra on $\mathbb{R}^N$, $N \in \mathbb{N}$, and let the set function $\mu : \mathcal{L} \to [0, +\infty)$ be monotone, submodular and strictly positive. For $f \in C_B^+(\mathbb{R}^N)$, we define the general multi-composite multivariate Kantorovich–Choquet-type neural network operators, for any $x \in \mathbb{R}^N$:
$$\hat{K}_n^\mu(f, x) = \hat{K}_n^\mu(f, x_1, \ldots, x_N) := \sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k)$$
$$= \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} \cdots \sum_{k_N=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]} \cdots \int_{\left[0, \frac{1}{n}\right]} f\left(t_1 + \frac{k_1}{n}, t_2 + \frac{k_2}{n}, \ldots, t_N + \frac{k_N}{n}\right) d\mu(t_1, \ldots, t_N)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \prod_{i=1}^{N} T_\lambda(nx_i - k_i),$$
where $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$, $k = (k_1, \ldots, k_N)$, $t = (t_1, \ldots, t_N)$, $n \in \mathbb{N}$.
Clearly, here $\mu\left(\left[0, \frac{1}{n}\right]^N\right) > 0$, $\forall\, n \in \mathbb{N}$.
From the above, we notice that
$$\left\|\hat{K}_n^\mu(f)\right\|_\infty \le \|f\|_\infty,$$
so that $\hat{K}_n^\mu(f, x)$ is well-defined.
We make
Remark 5. 
Let $f \in C_B^+(\mathbb{R}^N)$, $t \in \left[0, \frac{1}{n}\right]^N$ and $x \in \mathbb{R}^N$; then
$$f\left(t + \frac{k}{n}\right) = f\left(t + \frac{k}{n}\right) - f(x) + f(x) \le \left|f\left(t + \frac{k}{n}\right) - f(x)\right| + f(x).$$
Hence
$$(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t) + (C)\int_{\left[0, \frac{1}{n}\right]^N} f(x)\, d\mu(t) = (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t) + f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right).$$
That is,
$$(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) - f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t).$$
Similarly, we have that
$$f(x) = f(x) - f\left(t + \frac{k}{n}\right) + f\left(t + \frac{k}{n}\right) \le \left|f\left(t + \frac{k}{n}\right) - f(x)\right| + f\left(t + \frac{k}{n}\right).$$
Hence
$$(C)\int_{\left[0, \frac{1}{n}\right]^N} f(x)\, d\mu(t) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t) + (C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t),$$
and
$$f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right) - (C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t).$$
By (35) and (36), we derive that
$$\left|(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) - f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right)\right| \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t).$$
In particular, it holds that
$$\left|\frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} - f(x)\right| \le \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)}.$$
We present the following approximation result:
Theorem 6. 
Let $f \in C_B^+(\mathbb{R}^N)$, $0 < \beta < 1$, $x \in \mathbb{R}^N$, $N, n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_\mu \left|\hat{K}_n^\mu(f, x) - f(x)\right| \le \omega_1\left(f, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f\|_\infty\, \hat{\delta}_N(\beta, n) =: \sigma(n),$$
and
(ii)
$$\sup_\mu \left\|\hat{K}_n^\mu(f) - f\right\|_\infty \le \sigma(n).$$
Given that $f \in C_U^+(\mathbb{R}^N) \cap C_B^+(\mathbb{R}^N)$, we obtain $\lim_{n \to \infty} \hat{K}_n^\mu(f) = f$, uniformly.
Proof. 
We observe that
$$\left|\hat{K}_n^\mu(f, x) - f(x)\right| = \left|\sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) - \sum_{k=-\infty}^{\infty} f(x)\, \hat{Z}(nx - k)\right|$$
$$= \left|\sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} - f(x) \right) \hat{Z}(nx - k)\right|$$
$$\le \sum_{k=-\infty}^{\infty} \left| \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} - f(x) \right| \hat{Z}(nx - k) \overset{(38)}{\le}$$
$$\sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) =$$
$$\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty \le \frac{1}{n^\beta}}}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) + \sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k)$$
$$\le \sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty \le \frac{1}{n^\beta}}}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \omega_1\left(f, \left\|t + \frac{k}{n} - x\right\|_\infty\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) + 2\|f\|_\infty \sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \hat{Z}(nx - k)$$
$$\overset{(28)}{\le} \omega_1\left(f, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f\|_\infty\, \hat{\delta}_N(\beta, n),$$
proving the claim. □
Additionally, we give
Definition 4. 
Denote $C_B^+(\mathbb{R}^N, \mathbb{C}) := \{f : \mathbb{R}^N \to \mathbb{C} \mid f = f_1 + i f_2, \text{ where } f_1, f_2 \in C_B^+(\mathbb{R}^N)\}$. For $f \in C_B^+(\mathbb{R}^N, \mathbb{C})$, we set
$$\hat{K}_n^\mu(f, x) := \hat{K}_n^\mu(f_1, x) + i\, \hat{K}_n^\mu(f_2, x),$$
$n \in \mathbb{N}$, $x \in \mathbb{R}^N$, $i = \sqrt{-1}$.
We give
Theorem 7. 
Let $f \in C_B^+(\mathbb{R}^N, \mathbb{C})$, $f = f_1 + i f_2$, $N \in \mathbb{N}$, $0 < \beta < 1$, $x \in \mathbb{R}^N$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_\mu \left|\hat{K}_n^\mu(f, x) - f(x)\right| \le \omega_1\left(f_1, \frac{1}{n} + \frac{1}{n^\beta}\right) + \omega_1\left(f_2, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\left(\|f_1\|_\infty + \|f_2\|_\infty\right) \hat{\delta}_N(\beta, n) =: \varphi(n),$$
and
(ii)
$$\sup_\mu \left\|\hat{K}_n^\mu(f) - f\right\|_\infty \le \varphi(n).$$
Proof. 
We have that
$$\left|\hat{K}_n^\mu(f, x) - f(x)\right| = \left|\hat{K}_n^\mu(f_1, x) + i\, \hat{K}_n^\mu(f_2, x) - f_1(x) - i f_2(x)\right| = \left|\left(\hat{K}_n^\mu(f_1, x) - f_1(x)\right) + i\left(\hat{K}_n^\mu(f_2, x) - f_2(x)\right)\right|$$
$$\le \left|\hat{K}_n^\mu(f_1, x) - f_1(x)\right| + \left|\hat{K}_n^\mu(f_2, x) - f_2(x)\right| \overset{(39)}{\le} \omega_1\left(f_1, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f_1\|_\infty\, \hat{\delta}_N(\beta, n) + \omega_1\left(f_2, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f_2\|_\infty\, \hat{\delta}_N(\beta, n),$$
proving the claim. □
We need
Definition 5. 
Let $\mathcal{L}^*$ be the Lebesgue $\sigma$-algebra on $\mathbb{R}$, and let the set function $\mu^* : \mathcal{L}^* \to [0, +\infty)$ be monotone, submodular and strictly positive. For $f \in C_B^+(\mathbb{R})$, we define the general multi-composite univariate Kantorovich–Choquet-type neural network operator, for any $x \in \mathbb{R}$:
$$\hat{M}_n^{\mu^*}(f, x) = \sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]} f\left(t + \frac{k}{n}\right) d\mu^*(t)}{\mu^*\left(\left[0, \frac{1}{n}\right]\right)} \right) T_\lambda(nx - k).$$
Clearly, here $\mu^*\left(\left[0, \frac{1}{n}\right]\right) > 0$, $\forall\, n \in \mathbb{N}$.
From the above, we notice that
$$\left\|\hat{M}_n^{\mu^*}(f)\right\|_\infty \le \|f\|_\infty,$$
so that $\hat{M}_n^{\mu^*}(f, x)$ is well-defined.
Notice that $\hat{K}_n^\mu$, when $N = 1$, collapses to $\hat{M}_n^{\mu^*}$.
Another related result follows.
Corollary 2. 
(to Theorem 6 when $N = 1$)
Let $f \in C_B^+(\mathbb{R})$, $0 < \beta < 1$, $x \in \mathbb{R}$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_{\mu^*} \left|\hat{M}_n^{\mu^*}(f, x) - f(x)\right| \le \omega_1\left(f, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f\|_\infty\left(1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right)\right) =: \tau(n),$$
and
(ii)
$$\sup_{\mu^*} \left\|\hat{M}_n^{\mu^*}(f) - f\right\|_\infty \le \tau(n).$$
Given that $f \in C_U^+(\mathbb{R}) \cap C_B^+(\mathbb{R})$, we obtain $\lim_{n \to \infty} \hat{M}_n^{\mu^*}(f) = f$, uniformly.
Proof. 
Similarly to Theorem 6, it is omitted. □
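To illustrate Corollary 2 numerically, one may take $\mu^*$ to be the Lebesgue measure, so that by Remark 1 (vi) the Choquet averages reduce to ordinary integrals $n \int_0^{1/n} f\left(t + \frac{k}{n}\right) dt$. The sketch below (with $G_2 = \tanh \circ \tanh$, a midpoint quadrature, a truncated series, and the test function $|\sin x|$, all of which are our own implementation choices) shows the pointwise error decreasing as $n$ grows.

```python
import numpy as np

# Normalized multi-composite sigmoid (lambda = 2: tanh o tanh) and its density T_lambda.
G_tilde = lambda x: np.tanh(np.tanh(x)) / np.tanh(1.0)
T = lambda x: 0.25 * (G_tilde(x + 1.0) - G_tilde(x - 1.0))

def M_hat(f, x, n, window=60, quad=16):
    """Univariate Kantorovich-Choquet operator with mu* = Lebesgue measure, for which the
    Choquet averages reduce to n * int_0^{1/n} f(t + k/n) dt (Remark 1 (vi)).
    The series over k is truncated to |k - n x| <= window, since T decays very rapidly."""
    ks = np.arange(int(np.floor(n * x)) - window, int(np.ceil(n * x)) + window + 1)
    t = (np.arange(quad) + 0.5) / (quad * n)                # midpoint nodes in [0, 1/n]
    coeff = np.array([np.mean(f(t + k / n)) for k in ks])   # midpoint rule for n * int_0^{1/n} f(t + k/n) dt
    return float(np.sum(coeff * T(n * x - ks)))

f = lambda x: np.abs(np.sin(x))               # non-negative, bounded, uniformly continuous
x0 = 1.3
for n in (10, 100, 1000):
    print(n, abs(M_hat(f, x0, n) - f(x0)))    # the pointwise error shrinks as n grows
```

A general submodular capacity $\mu^*$ would replace the midpoint averages above with numerical Choquet integrals, as in the earlier sketch following Remark 1.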
We need
Definition 6. 
Let $f \in C_B^+(\mathbb{R}, \mathbb{C})$, where $f = f_1 + i f_2$ with $f_1, f_2 \in C_B^+(\mathbb{R})$. We set
$$\hat{M}_n^{\mu^*}(f, x) := \hat{M}_n^{\mu^*}(f_1, x) + i\, \hat{M}_n^{\mu^*}(f_2, x),$$
$n \in \mathbb{N}$, $x \in \mathbb{R}$.
We continue with
Corollary 3. 
(to Theorem 7 when $N = 1$) Let $f \in C_B^+(\mathbb{R}, \mathbb{C})$, $f = f_1 + i f_2$, $0 < \beta < 1$, $x \in \mathbb{R}$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_{\mu^*} \left|\hat{M}_n^{\mu^*}(f, x) - f(x)\right| \le \omega_1\left(f_1, \frac{1}{n} + \frac{1}{n^\beta}\right) + \omega_1\left(f_2, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\left(\|f_1\|_\infty + \|f_2\|_\infty\right)\left(1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right)\right) =: \psi(n),$$
and
(ii)
$$\sup_{\mu^*} \left\|\hat{M}_n^{\mu^*}(f) - f\right\|_\infty \le \psi(n).$$
Proof. 
Similarly to Theorem 7, it is omitted. □
We finish with $L_p$ estimates.
Theorem 8. 
Let all be as in Theorem 7, $p > 1$, and let $\hat{\Lambda}$ be a compact subset of $\mathbb{R}^N$. Then
$$\left\|\hat{K}_n^\mu(f) - f\right\|_{p, \hat{\Lambda}} \le \varphi(n)\, \left|\hat{\Lambda}\right|^{\frac{1}{p}},$$
where $|\hat{\Lambda}| < \infty$ is the Lebesgue measure of $\hat{\Lambda}$, and $\varphi(n)$ is as in (45).
Proof. 
By (45), etc. □
Theorem 9. 
All as in Theorem 8, in the $N = 1$ case. Then
$$\left\|\hat{M}_n^{\mu^*}(f) - f\right\|_{p, \hat{\Lambda}} \le \psi(n)\, \left|\hat{\Lambda}\right|^{\frac{1}{p}},$$
where $\hat{\Lambda}$ now is a closed interval of $\mathbb{R}$, and $\psi(n)$ is as in (52).
Proof. 
By (52), etc. □

4. Conclusions

We studied the univariate and multivariate quantitative approximation by the multi-composite general Kantorovich–Choquet-type quasi-interpolation integral neural network operators.
The functions under approximation are non-negative-valued, continuous, and bounded. We established a series of multi-composite Jackson-type inequalities giving the corresponding convergences via the modulus of continuity. We extended our results to complex-valued functions. We finished with related $L_p$ results.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Anastassiou, G.A. Rate of Convergence of Some Neural Network Operators to the Unit-Univariate Case. J. Math. Anal. Appl. 1997, 212, 237–262.
2. Anastassiou, G.A. Quantitative Approximations; Chapman & Hall/CRC: Boca Raton, FL, USA, 2001.
3. Chen, Z.; Cao, F. The approximation operators with sigmoidal functions. Comput. Math. Appl. 2009, 58, 758–765.
4. Anastassiou, G.A. Intelligent Systems: Approximation by Artificial Neural Networks; Intelligent Systems Reference Library; Springer: Berlin/Heidelberg, Germany, 2011; Volume 19.
5. Anastassiou, G.A. Intelligent Systems II: Complete Approximation by Neural Network Operators; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2016.
6. Anastassiou, G.A. Intelligent Computations: Abstract Fractional Calculus, Inequalities, Approximations; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2018.
7. Anastassiou, G.A. Parametrized, Deformed and General Neural Networks; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2023.
8. Costarelli, D.; Spigler, R. Approximation results for neural network operators activated by sigmoidal functions. Neural Netw. 2013, 44, 101–106.
9. Costarelli, D.; Spigler, R. Multivariate neural network operators with sigmoidal activation functions. Neural Netw. 2013, 48, 72–77.
10. Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: New York, NY, USA, 1998.
11. McCulloch, W.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 7, 115–133.
12. Mitchell, T.M. Machine Learning; WCB-McGraw-Hill: New York, NY, USA, 1997.
13. Dansheng, Y.; Feilong, C. Construction and approximation rate for feed-forward neural network operators with sigmoidal functions. J. Comput. Appl. Math. 2025, 453, 116150.
14. Siyu, C.; Bangti, J.; Qimeng, Q.; Zhi, Z. Hybrid neural-network FEM approximation of diffusion coefficient in elliptic and parabolic problems. IMA J. Numer. Anal. 2024, 44, 3059–3093.
15. Lucian, C.; Danilo, C.; Mariarosaria, N.; Alexandra, P. The approximation capabilities of Durrmeyer-type neural network operators. J. Appl. Math. Comput. 2024, 70, 4581–4599.
16. Xavier, W. The GroupMax neural network approximation of convex functions. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 11608–11612.
17. Arnau, F.; Oriol, G.; Joan, B.; Ramon, C. Approximation of acoustic black holes with finite element mixed formulations and artificial neural network correction terms. Finite Elem. Anal. Des. 2024, 241, 104236.
18. Philipp, G.; Felix, V. Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approximation spaces. Found. Comput. Math. 2024, 24, 1085–1143.
19. Andrea, B.; Dario, T. Quantitative Gaussian approximation of randomly initialized deep neural networks. Mach. Learn. 2024, 113, 6373–6393.
20. De Ryck, T.; Mishra, S. Error analysis for deep neural network approximations of parametric hyperbolic conservation laws. Math. Comp. 2024, 93, 2643–2677.
21. Jie, L.; Baoji, Z.; Yuyang, L.; Liqiau, F. Hull form optimization research based on multi-precision back-propagation neural network approximation model. Int. J. Numer. Methods Fluids 2024, 96, 1445–1460.
22. Yoo, J.; Kim, J.; Gim, M.; Lee, H. Error estimates of physics-informed neural networks for initial value problems. J. Korean Soc. Ind. Appl. Math. 2024, 28, 33–58.
23. Kaur, J.; Goyal, M. Hyers-Ulam stability of some positive linear operators. Stud. Univ. Babeş-Bolyai Math. 2025, 70, 105–114.
24. Abel, U.; Acu, A.M.; Heilmann, M.; Raşa, I. On some Cauchy problems and positive linear operators. Mediterr. J. Math. 2025, 22, 20.
25. Moradi, H.R.; Furuichi, S.; Sababheh, M. Operator quadratic mean and positive linear maps. J. Math. Inequal. 2024, 18, 1263–1279.
26. Bustamante, J.; Torres-Campos, J.D. Power series and positive linear operators in weighted spaces. Serdica Math. J. 2024, 50, 225–250.
27. Acu, A.-M.; Rasa, I.; Sofonea, F. Composition of some positive linear integral operators. Demonstr. Math. 2024, 57, 20240018.
28. Patel, P.G. On positive linear operators linking gamma, Mittag-Leffler and Wright functions. Int. J. Appl. Comput. Math. 2024, 10, 152.
29. Wang, Z.; Klir, G.J. Generalized Measure Theory; Springer: New York, NY, USA, 2009.
30. Ozger, F.; Aslan, A.R.; Merve, E. Some approximation results on a class of Szasz-Mirakjan-Kantorovich operators including non-negative parameter. Numer. Funct. Anal. Optim. 2025, 46, 461–484.
31. Costarelli, D.; Piconi, M. Strong and weak sharp bounds for neural network operators in Sobolev-Orlicz spaces and their quantitative extensions to Orlicz spaces. Bull. Sci. Math. 2026, 208, 103791.
32. Saini, S.; Singh, U. Kantorovich-Type Stochastic Neural Network Operators for the Mean-Square Approximation of Certain Second-Order Stochastic Processes. arXiv 2026, arXiv:2601.03634.
33. Choquet, G. Theory of capacities. Ann. Inst. Fourier 1954, 5, 131–295.
34. Denneberg, D. Non-Additive Measure and Integral; Kluwer: Dordrecht, The Netherlands, 1994.
35. Gal, S. Uniform and pointwise quantitative approximation by Kantorovich-Choquet type integral operators with respect to monotone and submodular set functions. Mediterr. J. Math. 2017, 14, 205.
36. Anastassiou, G.A. General Multi-Composite Sigmoid Relied Banach Space Valued Univariate Neural Network Approximation. In Parametrized, Deformed and General Neural Networks; Studies in Computational Intelligence; Springer: Cham, Switzerland, 2023.