Article

Multiple-Composite Quantitative Approximation by Multivariate Kantorovich–Choquet Neural Networks

by
George A. Anastassiou
Department of Mathematical Sciences, University of Memphis, Memphis, TN 38152, USA
Mathematics 2026, 14(6), 1027; https://doi.org/10.3390/math14061027
Submission received: 28 January 2026 / Revised: 7 March 2026 / Accepted: 13 March 2026 / Published: 18 March 2026

Abstract

In this work we study the univariate and multivariate quantitative approximation by multi-composite Kantorovich–Choquet-type quasi-interpolation neural network operators with respect to the supremum norm. This is achieved with rates via the first univariate and multivariate moduli of continuity. We approximate continuous and bounded non-negative functions on $\mathbb{R}^N$, $N \in \mathbb{N}$. When they are also uniformly continuous we obtain pointwise and uniform convergences, plus $L_p$ estimates. Our multi-composite activation functions are formed by general sigmoid functions.

1. Introduction

Multi-composition of activation functions refers to the technique of chaining, mixing, or stacking multiple activation functions, either in sequence or in parallel (concatenation), to create more complex, flexible, and often trainable non-linearities in neural networks. Instead of using a single activation (like ReLU) throughout the network, this approach leverages multiple functions to improve performance, enhance model expressiveness, and optimize learning.
Types of Activation Composition
  • Sequential Composition: Applying one activation function after another (e.g., $f(g(x))$). A recent development includes “PolyCom” (Polynomial Composition), which combines polynomial functions with others like ReLU to accelerate convergence and improve accuracy. Another example is compounding functions to create “normalized cusp neural network operators”, which can reduce infinite domains to compact supports, enhancing approximation capabilities.
  • Parallel Composition (Concatenation): Computing the outputs of different activation functions ($f_1(x), f_2(x), \ldots$) for the same input and concatenating them into a single vector. This allows the network to utilize multiple non-linearities at once; both the sequential and the parallel pattern are sketched in code at the end of this overview.
  • Dynamic Activation Composition (Dyn): A technique that uses learnable, normalized convex combinations of basis activation functions. This allows the network to adaptively “mix” activations during training, enhancing model adaptability and performance.
  • Multivariate/Multi-dimensional Activations: Moving beyond simple scalar functions to functions that take multiple inputs and produce multiple outputs (e.g., generalizing ReLU to a second-order cone projection). These approaches are shown to have higher expressive power than traditional single-input, single-output activations.
Advantages and Applications
  • Increased Expressiveness: Complex compositions allow networks to learn more intricate patterns and handle non-linearly separable data more effectively.
  • Improved Accuracy: Studies have shown that combining different activation functions can lead to better performance compared to using a single standard activation, particularly for less predictable data distributions.
  • Faster Convergence: Certain compositions, such as multi-kernel activation functions (multi-KAF) or dynamic mixtures, can help models converge faster by better adapting to the data.
  • Controlled Output Range: Composing functions can limit the output to specific, desirable ranges, which helps in focusing on important information and filtering out noise.
  • Better Gradient Flow: Some compositions, such as concatenating Swish and Tanh, can provide paths with non-zero derivatives, helping to mitigate vanishing gradient problems.
Key Research Findings
  • Flexibility: Research indicates that compositing activation functions—such as using a “cusp” function (a composition of two functions)—results in more flexible and powerful neural networks.
  • Learned Activations: Instead of manually selecting the composition, techniques like “Trainable Adaptive Activation Function Structure (TAAFS)” learn the optimal combination of activation functions during training.
  • Efficiency Concerns: While composing functions can improve accuracy, it may add to the computational load. However, the performance gains often justify the added complexity.
This approach is increasingly being explored to improve the performance of deep learning models, including Transformers and Large Language Models (LLMs).
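As a concrete illustration of the sequential and parallel patterns listed above, the following minimal NumPy sketch chains two activations and, separately, concatenates their outputs. It is not taken from any cited implementation; the activation choices (tanh and ReLU) and the input values are arbitrary placeholders.

```python
import numpy as np

def relu(x):
    """Piecewise-linear activation."""
    return np.maximum(x, 0.0)

def sequential(x, fs):
    """Sequential composition (f_1 o f_2 o ... o f_k)(x): the innermost function acts first."""
    for f in reversed(fs):
        x = f(x)
    return x

def parallel(x, fs):
    """Parallel composition: evaluate each activation on the same input and concatenate."""
    return np.concatenate([f(x) for f in fs], axis=-1)

x = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
print(sequential(x, [np.tanh, relu]))   # tanh(relu(x)), a chained non-linearity
print(parallel(x, [np.tanh, relu]))     # [tanh(x) | relu(x)], doubling the feature width
```

Dynamic activation composition, as described above, would replace the fixed chain in this sketch with a learnable, normalized convex combination of such basis activations.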
The author in [1,2] (see Chapters 2–5) was the first to establish neural network approximations to continuous functions with rates, using very specifically defined neural network operators of Cardaliaguet–Euvrard and “Squashing” types, by employing the modulus of continuity of the engaged function or of its high-order derivative and producing very tight Jackson-type inequalities. He treats there both the univariate and multivariate cases. The “bell-shaped” and “squashing” functions defining these operators are assumed to be of compact support. Also in [2], he gives the $N$th-order asymptotic expansion for the error of weak approximation of these two operators on a special natural class of smooth functions; see Chapters 4–5.
The author, inspired by [3], continued his studies on neural network approximation by introducing and using the proper quasi-interpolation operators of sigmoidal and hyperbolic-tangent type, which resulted in [4,5,6,7], treating both the univariate and multivariate cases. He also treated the corresponding fractional cases [5,7].
Here, the author performs multi-composite univariate and multivariate neural network approximations, based on general sigmoid activation functions, to continuous non-negative functions over the whole $\mathbb{R}^N$, $N \in \mathbb{N}$; he then extends his results to complex-valued functions. $L_p$ approximations are also included. All convergences here are with rates, expressed via the modulus of continuity of the involved function and given by very tight Jackson-type inequalities.
The author comes up with the “right”, precisely defined, flexible quasi-interpolation, Kantorovich–Choquet-type integral-coefficient neural network operators associated with multi-composite general sigmoid activation functions. In preparation for proving our results, we establish important properties of the multi-composite general density functions defining our operators.
Feed-forward neural networks (FNNs) with one hidden layer, the only type of networks we deal with in this work, are mathematically expressed as
$$N_n(x) = \sum_{j=0}^{n} c_j\, \sigma(a_j \cdot x + b_j), \quad x \in \mathbb{R}^s,\ s \in \mathbb{N},$$
where for $0 \le j \le n$, $b_j \in \mathbb{R}$ are the thresholds, $a_j \in \mathbb{R}^s$ are the connection weights, $c_j \in \mathbb{R}$ are the coefficients, $a_j \cdot x$ is the inner product of $a_j$ and $x$, and $\sigma$ is the activation function of the network. For neural networks in general, read [8,9,10,11,12]. For recent related works, see [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32].
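For concreteness, the following sketch evaluates a network of exactly this one-hidden-layer form; the random weights, the dimensions, and the choice $\sigma = \tanh$ are illustrative placeholders rather than parameters taken from the paper.

```python
import numpy as np

def feedforward(x, a, b, c, sigma=np.tanh):
    """Evaluate N_n(x) = sum_{j=0}^{n} c_j * sigma(a_j . x + b_j).

    x : input vector in R^s
    a : (n+1, s) array of connection weights a_j
    b : (n+1,) array of thresholds b_j
    c : (n+1,) array of coefficients c_j
    """
    return np.dot(c, sigma(a @ x + b))

rng = np.random.default_rng(0)
s, n = 3, 10                       # input dimension and hidden-unit index range 0..n
a = rng.normal(size=(n + 1, s))
b = rng.normal(size=n + 1)
c = rng.normal(size=n + 1)
x = np.array([0.5, -1.0, 2.0])
print(feedforward(x, a, b, c))     # a single scalar network output
```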

2. Background

2.1. Description of Choquet Integral [33]

We make
Definition 1. 
Consider $\Omega$ and let $\mathcal{C}$ be a $\sigma$-algebra of subsets of $\Omega$.
(i) (see, e.g., [29], p. 63) The set function $\mu : \mathcal{C} \to [0, +\infty)$ is called a monotone set function (or capacity) if $\mu(\emptyset) = 0$ and $\mu(A) \le \mu(B)$ for all $A, B \in \mathcal{C}$ with $A \subseteq B$. Also, $\mu$ is called submodular if
$$\mu(A \cup B) + \mu(A \cap B) \le \mu(A) + \mu(B), \quad \text{for all } A, B \in \mathcal{C}.$$
$\mu$ is called bounded if $\mu(\Omega) < +\infty$ and normalized if $\mu(\Omega) = 1$.
(ii) (see, e.g., [29], p. 233, or [2]) If $\mu$ is a monotone set function on $\mathcal{C}$ and if $f : \Omega \to \mathbb{R}$ is $\mathcal{C}$-measurable (that is, for any Borel subset $B \subseteq \mathbb{R}$ it follows that $f^{-1}(B) \in \mathcal{C}$), then for any $A \in \mathcal{C}$, the Choquet integral is defined by
$$(C)\int_A f\, d\mu = \int_0^{+\infty} \mu\left(F_\beta(f) \cap A\right) d\beta + \int_{-\infty}^{0} \left[\mu\left(F_\beta(f) \cap A\right) - \mu(A)\right] d\beta,$$
where we used the notation $F_\beta(f) = \{\omega \in \Omega : f(\omega) \ge \beta\}$. Notice that if $f \ge 0$ on $A$, then in the above formula the second integral equals $0$.
The integrals on the right-hand side are usual Riemann integrals.
The function $f$ will be called Choquet integrable on $A$ if $(C)\int_A f\, d\mu \in \mathbb{R}$.
Next, we list some well-known properties of the Choquet integral.
Remark 1. 
If μ : C 0 , + is a monotone set function, then the following properties hold:
(i) For all $a \ge 0$, we have $(C)\int_A a f\, d\mu = a \cdot (C)\int_A f\, d\mu$ (if $f \ge 0$, see, e.g., [29], Theorem 11.2, (5), p. 228; if $f$ is of arbitrary sign, see, e.g., [34], p. 64, Proposition 5.1, (ii)).
(ii) For all $c \in \mathbb{R}$ and $f$ of arbitrary sign, we have (see, e.g., [29], pp. 232–233, or [34], p. 65) $(C)\int_A (f + c)\, d\mu = (C)\int_A f\, d\mu + c \cdot \mu(A)$.
If $\mu$ is submodular too, then for all $f, g$ of arbitrary sign and lower bounded, we have (see, e.g., [34], p. 75, Theorem 6.3)
$$(C)\int_A (f + g)\, d\mu \le (C)\int_A f\, d\mu + (C)\int_A g\, d\mu.$$
(iii) If $f \le g$ on $A$, then $(C)\int_A f\, d\mu \le (C)\int_A g\, d\mu$ (see, e.g., [29], p. 228, Theorem 11.2, (3), if $f, g \ge 0$, and p. 232 if $f, g$ are of arbitrary sign).
(iv) Let $f \ge 0$. If $A \subseteq B$, then $(C)\int_A f\, d\mu \le (C)\int_B f\, d\mu$. In addition, if $\mu$ is finitely subadditive, then
$$(C)\int_{A \cup B} f\, d\mu \le (C)\int_A f\, d\mu + (C)\int_B f\, d\mu.$$
(v) It is immediate that $(C)\int_A 1\, d\mu(t) = \mu(A)$.
(vi) If $\mu$ is a countably additive bounded measure, then the Choquet integral $(C)\int_A f\, d\mu$ reduces to the usual Lebesgue-type integral (see, e.g., [34], p. 62, or [29], p. 226).
(vii) If $f \ge 0$, then $(C)\int_A f\, d\mu \ge 0$.
(viii) If $\Omega = \mathbb{R}^N$, $N \in \mathbb{N}$, we call $\mu$ strictly positive if $\mu(A) > 0$ for any open subset $A \subseteq \mathbb{R}^N$. See also [35].
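A small numerical sketch of Definition 1 may help. For a non-negative $f$ and the distorted Lebesgue capacity $\mu(A) = \sqrt{m(A)}$ (a standard example of a monotone, submodular, strictly positive set function; this particular capacity and test function are our own choices for the sketch, not taken from the paper), the Choquet integral can be approximated directly from the level-set formula:

```python
import numpy as np

def trapezoid(y, x):
    """Simple trapezoid rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def choquet_integral(f, a, b, gamma, n_x=20000, n_beta=2000):
    """Approximate (C) int_[a,b] f dmu for f >= 0, where mu(A) = gamma(lebesgue_measure(A)).

    Uses the level-set formula of Definition 1 (ii):
    (C) int f dmu = int_0^infty mu({x in [a,b] : f(x) >= beta}) dbeta.
    """
    xs = np.linspace(a, b, n_x)
    fx = f(xs)
    dx = (b - a) / (n_x - 1)
    betas = np.linspace(0.0, fx.max(), n_beta)
    # Lebesgue measure of each superlevel set (grid count times dx), distorted by gamma.
    levels = np.array([gamma(np.count_nonzero(fx >= beta) * dx) for beta in betas])
    return trapezoid(levels, betas)

f = lambda x: x ** 2                                 # a simple non-negative test function on [0, 1]
print(choquet_integral(f, 0.0, 1.0, np.sqrt))        # distorted (submodular) capacity mu(A) = sqrt(m(A))
print(choquet_integral(f, 0.0, 1.0, lambda t: t))    # gamma(t) = t recovers the Lebesgue integral, ~ 1/3
```

With the identity distortion $\gamma(t) = t$, the routine reproduces the classical integral $\int_0^1 x^2\, dx = 1/3$, in line with property (vi) of Remark 1.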

2.2. On Multi-Composite Activation Functions

We mention:
Definition 2. 
Let $i = 1, 2, \ldots$, and let $h_i : \mathbb{R} \to [-1, 1]$ be general sigmoid activation functions that are strictly increasing, with $h_i(0) = 0$, $h_i(-x) = -h_i(x)$ for all $x \in \mathbb{R}$, $h_i(+\infty) = 1$, $h_i(-\infty) = -1$. Also, $h_i$ is strictly convex over $(-\infty, 0]$ and strictly concave over $[0, +\infty)$, with $h_i^{(2)} \in C(\mathbb{R})$. Examples: $\tanh x$, $\operatorname{erf}(x)$, the normalized $\arctan x$, the normalized Gudermannian function, etc.
Notice here that $0 < h_i(1) \le 1$, $i = 1, 2, \ldots$. Any composition $h_1 \circ h_2 \circ h_3 \circ \cdots \circ h_\lambda$ is meant to be $h_1|_{[-1,1]} \circ h_2|_{[-1,1]} \circ h_3|_{[-1,1]} \circ \cdots \circ h_{\lambda-1}|_{[-1,1]} \circ h_\lambda$, $\lambda \in \mathbb{N}$, and for convenience we denote it by $G_\lambda := h_1 \circ h_2 \circ h_3 \circ \cdots \circ h_\lambda$. We have, for any $\lambda \in \mathbb{N}$: $0 < h_\lambda(1) \le 1$, hence $0 < h_{\lambda-1}(h_\lambda(1)) \le h_{\lambda-1}(1) \le 1$, and $0 < h_{\lambda-2}(h_{\lambda-1}(h_\lambda(1))) \le h_{\lambda-2}(h_{\lambda-1}(1)) \le h_{\lambda-2}(1) \le 1$.
Inductively we derive that $0 < G_\lambda(1) \le 1$, for all $\lambda \in \mathbb{N}$.
Clearly, $G_\lambda(0) = 0$ and $G_\lambda$ is strictly increasing over $\mathbb{R}$. Furthermore, it holds that
$$G_\lambda(-x) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(h_\lambda(-x)) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(-h_\lambda(x)) = \cdots = -(h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1} \circ h_\lambda)(x) = -G_\lambda(x), \quad \forall\, x \in \mathbb{R}.$$
Clearly, it holds that $G_\lambda^{(2)} \in C(\mathbb{R})$.
We notice that
$$G_\lambda(+\infty) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(h_\lambda(+\infty)) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(1) = G_{\lambda-1}(1),$$
and
$$G_\lambda(-\infty) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(h_\lambda(-\infty)) = (h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(-1) = -(h_1 \circ h_2 \circ \cdots \circ h_{\lambda-1})(1) = -G_{\lambda-1}(1).$$
Consequently, it holds that
$$-G_{\lambda-1}(1) < G_\lambda(x) < G_{\lambda-1}(1), \quad \forall\, x \in \mathbb{R}.$$
Thus, $y = \pm G_{\lambda-1}(1)$ are the horizontal asymptotes of $G_\lambda(x)$, for any $\lambda \in \mathbb{N}$.
Next, we act over $(-\infty, 0]$: let $\lambda, \mu \ge 0$ with $\lambda + \mu = 1$. Then, by the convexity of $h_2$, we have
$$h_2(\lambda x + \mu y) \le \lambda h_2(x) + \mu h_2(y), \quad \forall\, x, y \in (-\infty, 0];$$
and
$$h_1(h_2(\lambda x + \mu y)) \le h_1(\lambda h_2(x) + \mu h_2(y)) \le \lambda h_1(h_2(x)) + \mu h_1(h_2(y)), \quad \forall\, x, y \in (-\infty, 0],$$
so that $h_1 \circ h_2$ is convex over $(-\infty, 0]$.
Now we work on $[0, +\infty)$: let $\lambda, \mu \ge 0$ with $\lambda + \mu = 1$. Then, by the concavity of $h_2$, we have
$$h_2(\lambda x + \mu y) \ge \lambda h_2(x) + \mu h_2(y), \quad \forall\, x, y \in [0, +\infty);$$
and
$$h_1(h_2(\lambda x + \mu y)) \ge h_1(\lambda h_2(x) + \mu h_2(y)) \ge \lambda h_1(h_2(x)) + \mu h_1(h_2(y)), \quad \forall\, x, y \in [0, +\infty).$$
Thus, $h_1 \circ h_2$ is concave over $[0, +\infty)$.
Therefore, $G_2 = h_1 \circ h_2$ is a general sigmoid activation function with asymptotes $y = \pm h_1(1)$, fulfilling the rest of the conditions of Definition 2.
Arguing as above, $h_2 \circ h_3 : \mathbb{R} \to [-1, 1]$ fulfills Definition 2, and $h_1 \circ h_2 \circ h_3$ does the same with asymptotes $y = \pm h_1(h_2(1))$.
Inductively, we prove that $G_\lambda$ fulfills Definition 2 with asymptotes $y = \pm G_{\lambda-1}(1)$.
We have proved the following:
Theorem 1. 
Let $\lambda \in \mathbb{N}$. Then $G_\lambda := h_1 \circ h_2 \circ h_3 \circ \cdots \circ h_\lambda$ fulfills all the properties of Definition 2, with asymptotes $y = \pm G_{\lambda-1}(1)$. That is, $G_\lambda$ is a multi-composite general sigmoid activation function from $\mathbb{R}$ into $[-1, 1]$.
Corollary 1. 
$\frac{G_\lambda}{G_{\lambda-1}(1)}$ fulfills Definition 2 with asymptotes $y = \pm 1$.
We call
$$\tilde{G}_\lambda := \frac{G_\lambda}{G_{\lambda-1}(1)}.$$
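As a numerical illustration of Theorem 1 and Corollary 1, one can compose the example activations of Definition 2 and normalize by $G_{\lambda-1}(1)$. The particular sigmoids and the $\arctan$ normalization below are our own choices for the sketch, not prescriptions of the paper.

```python
import numpy as np
from math import erf, atan, pi, tanh

# Example sigmoids of Definition 2; the arctan normalization here is one admissible choice.
h1 = tanh
h2 = erf
h3 = lambda x: (2.0 / pi) * atan(x)

def G(x, hs):
    """Multi-composite sigmoid G_lambda(x) = (h_1 o h_2 o ... o h_lambda)(x)."""
    for h in reversed(hs):
        x = h(x)
    return x

hs = [h1, h2, h3]                          # lambda = 3
asymptote = G(1.0, hs[:-1])                # G_{lambda-1}(1), the value of the horizontal asymptote
G_tilde = lambda x: G(x, hs) / asymptote   # normalized multi-composite sigmoid, asymptotes +-1

xs = np.linspace(-10.0, 10.0, 9)
vals = np.array([G_tilde(x) for x in xs])
print(vals)                                # odd, strictly increasing, bounded by 1 in absolute value
print(np.allclose(vals, -vals[::-1]))      # numerical check of G_tilde(-x) = -G_tilde(x)
print(G_tilde(50.0))                       # close to the asymptote 1
```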
Remark 2. 
Next, we consider the function
$$T_\lambda(x) := \frac{1}{4}\left(\tilde{G}_\lambda(x + 1) - \tilde{G}_\lambda(x - 1)\right) > 0, \quad x \in \mathbb{R},\ \lambda \in \mathbb{N}.$$
We observe that
$$T_\lambda(-x) = \frac{1}{4}\left(\tilde{G}_\lambda(-x + 1) - \tilde{G}_\lambda(-x - 1)\right) = \frac{1}{4}\left(\tilde{G}_\lambda(-(x - 1)) - \tilde{G}_\lambda(-(x + 1))\right) = \frac{1}{4}\left(-\tilde{G}_\lambda(x - 1) + \tilde{G}_\lambda(x + 1)\right) = \frac{1}{4}\left(\tilde{G}_\lambda(x + 1) - \tilde{G}_\lambda(x - 1)\right) = T_\lambda(x).$$
That is, $T_\lambda$ is an even function:
$$T_\lambda(-x) = T_\lambda(x), \quad \forall\, x \in \mathbb{R},\ \lambda \in \mathbb{N}.$$
We see that
$$T_\lambda(0) = \frac{\tilde{G}_\lambda(1)}{2}.$$
Let $x > 1$; we have that
$$T_\lambda'(x) = \frac{1}{4}\left(\tilde{G}_\lambda'(x + 1) - \tilde{G}_\lambda'(x - 1)\right) < 0,$$
by $\tilde{G}_\lambda'$ being strictly decreasing over $[0, +\infty)$ (as $\tilde{G}_\lambda$ is strictly concave there).
Let now $0 < x < 1$; then $1 - x > 0$ and $0 < 1 - x < 1 + x$. It holds that $\tilde{G}_\lambda'(x - 1) = \tilde{G}_\lambda'(1 - x) > \tilde{G}_\lambda'(x + 1)$ (the derivative of the odd function $\tilde{G}_\lambda$ is even), so that again $T_\lambda'(x) < 0$. Consequently, $T_\lambda$ is strictly decreasing on $(0, +\infty)$.
Clearly, $T_\lambda$ is strictly increasing on $(-\infty, 0)$, and $T_\lambda'(0) = 0$.
We observe that
$$\lim_{x \to +\infty} T_\lambda(x) = \frac{1}{4}\left(\tilde{G}_\lambda(+\infty) - \tilde{G}_\lambda(+\infty)\right) = 0,$$
and
$$\lim_{x \to -\infty} T_\lambda(x) = \frac{1}{4}\left(\tilde{G}_\lambda(-\infty) - \tilde{G}_\lambda(-\infty)\right) = 0.$$
That is, the $x$-axis is the horizontal asymptote of $T_\lambda$.
As a result, $T_\lambda$ is a bell-shaped symmetric function with maximum
$$T_\lambda(0) = \frac{\tilde{G}_\lambda(1)}{2}.$$
We need
Theorem 2. 
It holds that
$$\sum_{i=-\infty}^{\infty} T_\lambda(x - i) = 1, \quad \forall\, x \in \mathbb{R}.$$
Proof. 
Being similar to [6], p. 286, it is omitted. □
Theorem 3. 
We have that
$$\int_{-\infty}^{\infty} T_\lambda(x)\, dx = 1.$$
Proof. 
Being similar to [6], p. 287, it is omitted. □
So $T_\lambda(x)$ can serve as a density function in general.
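The following sketch checks Theorems 2 and 3 numerically for one concrete instance, $G_2 = \tanh \circ \tanh$ normalized by $G_1(1) = \tanh(1)$ (our own choice; any admissible $h_i$ would do), by truncating the series and the integral to a wide window.

```python
import numpy as np

# One concrete normalized multi-composite sigmoid: G_2 = tanh o tanh, divided by G_1(1) = tanh(1).
G_tilde = lambda x: np.tanh(np.tanh(x)) / np.tanh(1.0)

# The bell-shaped density of Remark 2.
T = lambda x: 0.25 * (G_tilde(x + 1.0) - G_tilde(x - 1.0))

# Theorem 2: sum_i T(x - i) = 1 for every real x (series truncated; T decays to 0 very fast).
ks = np.arange(-200, 201)
for x in (0.0, 0.3, -1.7):
    print(x, np.sum(T(x - ks)))          # each value is essentially 1

# Theorem 3: integral of T over R equals 1 (trapezoid rule over a wide window).
xs = np.linspace(-200.0, 200.0, 400001)
ys = T(xs)
print(np.sum((ys[1:] + ys[:-1]) * np.diff(xs)) / 2.0)   # essentially 1
```

Both checks rely only on the asymptotes $\pm 1$ of $\tilde{G}_\lambda$, which is why the normalization of Corollary 1 enters the definition of $T_\lambda$.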
We need
Theorem 4 
([36]). Let $0 < \alpha < 1$, and $n \in \mathbb{N}$ with $n^{1-\alpha} > 2$. Then
$$\sum_{\substack{k=-\infty \\ |nx - k| \ge n^{1-\alpha}}}^{\infty} T_\lambda(nx - k) < 1 - \tilde{G}_\lambda\left(n^{1-\alpha} - 2\right), \quad \forall\, x \in \mathbb{R},$$
and
$$\lim_{n \to +\infty} \left(1 - \tilde{G}_\lambda\left(n^{1-\alpha} - 2\right)\right) = 0.$$
Denote by $\lfloor \cdot \rfloor$ the integral part of a number and by $\lceil \cdot \rceil$ its ceiling.
We also need
Theorem 5. 
Let $x \in [a, b] \subset \mathbb{R}$ and $n \in \mathbb{N}$ so that $\lceil na \rceil \le \lfloor nb \rfloor$. It holds that
$$\frac{1}{\sum_{k = \lceil na \rceil}^{\lfloor nb \rfloor} T_\lambda(nx - k)} < \frac{1}{T_\lambda(1)}, \quad \forall\, x \in [a, b].$$
Proof. 
Being similar to [6], p. 289, it is omitted. □
Remark 3. 
We have that
$$\lim_{n \to \infty} \sum_{k = \lceil na \rceil}^{\lfloor nb \rfloor} T_\lambda(nx - k) \ne 1,$$
for at least some $x \in [a, b]$.
See [6], p. 290, for the same reasoning.
Note 1. 
For large enough $n$, we always obtain $\lceil na \rceil \le \lfloor nb \rfloor$. Also, $a \le \frac{k}{n} \le b$ if $\lceil na \rceil \le k \le \lfloor nb \rfloor$. In general, it holds (by (14)) that
$$\sum_{k = \lceil na \rceil}^{\lfloor nb \rfloor} T_\lambda(nx - k) \le 1.$$
We make
Remark 4. 
We define
$$\hat{Z}(x_1, \ldots, x_N) := \hat{Z}(x) := \prod_{i=1}^{N} T_\lambda(x_i), \quad x = (x_1, \ldots, x_N) \in \mathbb{R}^N,\ N \in \mathbb{N}.$$
It has the properties:
(i)
$$\hat{Z}(x) > 0, \quad \forall\, x \in \mathbb{R}^N,$$
(ii)
$$\sum_{k=-\infty}^{\infty} \hat{Z}(x - k) := \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} \cdots \sum_{k_N=-\infty}^{\infty} \hat{Z}(x_1 - k_1, \ldots, x_N - k_N) = \sum_{k_1=-\infty}^{\infty} \cdots \sum_{k_N=-\infty}^{\infty} \prod_{i=1}^{N} T_\lambda(x_i - k_i) = \prod_{i=1}^{N} \left( \sum_{k_i=-\infty}^{\infty} T_\lambda(x_i - k_i) \right) \overset{(14)}{=} 1.$$
Hence
$$\sum_{k=-\infty}^{\infty} \hat{Z}(x - k) = 1.$$
That is,
(iii)
$$\sum_{k=-\infty}^{\infty} \hat{Z}(nx - k) = 1, \quad \forall\, x \in \mathbb{R}^N,\ n \in \mathbb{N},$$
and
(iv)
$$\int_{\mathbb{R}^N} \hat{Z}(x)\, dx = \int_{\mathbb{R}^N} \prod_{i=1}^{N} T_\lambda(x_i)\, dx_1 \cdots dx_N = \prod_{i=1}^{N} \left( \int_{-\infty}^{\infty} T_\lambda(x_i)\, dx_i \right) \overset{(15)}{=} 1.$$
Thus
$$\int_{\mathbb{R}^N} \hat{Z}(x)\, dx = 1.$$
That is, $\hat{Z}$ is a multivariate density function.
Here, we denote $x = (x_1, \ldots, x_N)$ and $\|x\|_\infty := \max\{|x_1|, \ldots, |x_N|\}$, $x \in \mathbb{R}^N$; we also set $\infty := (\infty, \ldots, \infty)$ and $-\infty := (-\infty, \ldots, -\infty)$ in the multivariate context, and take $0 < \beta < 1$.
(v) We have
$$\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \hat{Z}(nx - k) = \sum_{\substack{k_1=-\infty, \ldots, k_N=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \prod_{i=1}^{N} T_\lambda(nx_i - k_i)$$
(for some $r \in \{1, \ldots, N\}$ it holds $\left|\frac{k_r}{n} - x_r\right| > \frac{1}{n^\beta}$)
$$\le \left( \prod_{\substack{i=1 \\ i \ne r}}^{N} \sum_{k_i=-\infty}^{\infty} T_\lambda(nx_i - k_i) \right) \left( \sum_{\substack{k_r=-\infty \\ \left|\frac{k_r}{n} - x_r\right| > \frac{1}{n^\beta}}}^{\infty} T_\lambda(nx_r - k_r) \right) = \sum_{\substack{k_r=-\infty \\ |nx_r - k_r| > n^{1-\beta}}}^{\infty} T_\lambda(nx_r - k_r) \overset{(16)}{<} 1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right).$$
That is,
$$\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \hat{Z}(nx - k) < 1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right),$$
for $0 < \beta < 1$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$, and $x \in \prod_{i=1}^{N} [a_i, b_i]$.
We denote by
$$\hat{\delta}_N(\beta, n) := 1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right), \quad 0 < \beta < 1.$$
For $f \in C_B^+(\mathbb{R}^N)$ (continuous and bounded functions from $\mathbb{R}^N$ into $\mathbb{R}_+$), we define the first modulus of continuity
$$\omega_1(f, h) := \sup_{\substack{x, y \in \mathbb{R}^N \\ \|x - y\|_\infty \le h}} |f(x) - f(y)|, \quad h > 0.$$
Given that $f \in C_U^+(\mathbb{R}^N)$ (uniformly continuous functions from $\mathbb{R}^N$ into $\mathbb{R}_+$; the same definition of $\omega_1$ applies), we have that
$$\lim_{h \to 0} \omega_1(f, h) = 0.$$
When $N = 1$, $\omega_1$ is defined as in (30) with $\|\cdot\|_\infty$ collapsing to $|\cdot|$, and it has the property (31).
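For the univariate case, a grid-based estimate of $\omega_1(f, h)$ illustrates property (31); the test function, interval, and grid resolution below are arbitrary choices for the sketch.

```python
import numpy as np

def omega1(f, h, a=-10.0, b=10.0, num=20001):
    """Grid estimate of omega_1(f, h) = sup_{|x - y| <= h} |f(x) - f(y)| on [a, b] (univariate case)."""
    xs = np.linspace(a, b, num)
    fx = f(xs)
    step = (b - a) / (num - 1)
    m = max(1, int(np.floor(h / step)))          # largest number of grid steps not exceeding h
    return max(float(np.max(np.abs(fx[k:] - fx[:-k]))) for k in range(1, m + 1))

f = lambda x: np.abs(np.sin(x))                  # non-negative, bounded, uniformly continuous on R
for h in (1.0, 0.5, 0.1, 0.01):
    print(h, omega1(f, h))                       # decreases towards 0 as h -> 0, in line with (31)
```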

3. Main Results

We need
Definition 3. 
Let $\mathcal{L}$ be the Lebesgue $\sigma$-algebra on $\mathbb{R}^N$, $N \in \mathbb{N}$, and let the set function $\mu : \mathcal{L} \to [0, +\infty)$ be monotone, submodular and strictly positive. For $f \in C_B^+(\mathbb{R}^N)$, we define the general multi-composite multivariate Kantorovich–Choquet-type neural network operators, for any $x \in \mathbb{R}^N$:
$$\hat{K}_n^\mu(f, x) = \hat{K}_n^\mu(f, x_1, \ldots, x_N) := \sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k)$$
$$= \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} \cdots \sum_{k_N=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]} \cdots \int_{\left[0, \frac{1}{n}\right]} f\left(t_1 + \frac{k_1}{n}, t_2 + \frac{k_2}{n}, \ldots, t_N + \frac{k_N}{n}\right) d\mu(t_1, \ldots, t_N)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \prod_{i=1}^{N} T_\lambda(nx_i - k_i),$$
where $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$, $k = (k_1, \ldots, k_N)$, $t = (t_1, \ldots, t_N)$, $n \in \mathbb{N}$.
Clearly, here $\mu\left(\left[0, \frac{1}{n}\right]^N\right) > 0$, $\forall\, n \in \mathbb{N}$.
From the above, we notice that
$$\left\|\hat{K}_n^\mu(f)\right\|_\infty \le \|f\|_\infty,$$
so that $\hat{K}_n^\mu(f, x)$ is well-defined.
We make
Remark 5. 
Let $f \in C_B^+(\mathbb{R}^N)$, $t \in \left[0, \frac{1}{n}\right]^N$ and $x \in \mathbb{R}^N$; then
$$f\left(t + \frac{k}{n}\right) = f\left(t + \frac{k}{n}\right) - f(x) + f(x) \le \left|f\left(t + \frac{k}{n}\right) - f(x)\right| + f(x).$$
Hence
$$(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t) + (C)\int_{\left[0, \frac{1}{n}\right]^N} f(x)\, d\mu(t) = (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t) + f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right).$$
That is,
$$(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) - f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t).$$
Similarly, we have that
$$f(x) = f(x) - f\left(t + \frac{k}{n}\right) + f\left(t + \frac{k}{n}\right) \le \left|f\left(t + \frac{k}{n}\right) - f(x)\right| + f\left(t + \frac{k}{n}\right).$$
Hence
$$(C)\int_{\left[0, \frac{1}{n}\right]^N} f(x)\, d\mu(t) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t) + (C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t),$$
and
$$f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right) - (C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t).$$
By (35) and (36), we derive that
$$\left|(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t) - f(x)\, \mu\left(\left[0, \tfrac{1}{n}\right]^N\right)\right| \le (C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t).$$
In particular, it holds that
$$\left|\frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} - f(x)\right| \le \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)}.$$
We present the following approximation result:
Theorem 6. 
Let $f \in C_B^+(\mathbb{R}^N)$, $0 < \beta < 1$, $x \in \mathbb{R}^N$, $N, n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_\mu \left|\hat{K}_n^\mu(f, x) - f(x)\right| \le \omega_1\left(f, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f\|_\infty\, \hat{\delta}_N(\beta, n) =: \sigma(n),$$
and
(ii)
$$\sup_\mu \left\|\hat{K}_n^\mu(f) - f\right\|_\infty \le \sigma(n).$$
Given that $f \in C_U^+(\mathbb{R}^N) \cap C_B^+(\mathbb{R}^N)$, we obtain $\lim_{n \to \infty} \hat{K}_n^\mu(f) = f$, uniformly.
Proof. 
We observe that
$$\left|\hat{K}_n^\mu(f, x) - f(x)\right| = \left|\sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) - \sum_{k=-\infty}^{\infty} f(x)\, \hat{Z}(nx - k)\right|$$
$$= \left|\sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} - f(x) \right) \hat{Z}(nx - k)\right|$$
$$\le \sum_{k=-\infty}^{\infty} \left| \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} f\left(t + \frac{k}{n}\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} - f(x) \right| \hat{Z}(nx - k) \overset{(38)}{\le}$$
$$\sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) =$$
$$\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty \le \frac{1}{n^\beta}}}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) + \sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \left|f\left(t + \frac{k}{n}\right) - f(x)\right| d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k)$$
$$\le \sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty \le \frac{1}{n^\beta}}}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]^N} \omega_1\left(f, \left\|t + \frac{k}{n} - x\right\|_\infty\right) d\mu(t)}{\mu\left(\left[0, \frac{1}{n}\right]^N\right)} \right) \hat{Z}(nx - k) + 2\|f\|_\infty \sum_{\substack{k=-\infty \\ \left\|\frac{k}{n} - x\right\|_\infty > \frac{1}{n^\beta}}}^{\infty} \hat{Z}(nx - k)$$
$$\overset{(28)}{\le} \omega_1\left(f, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f\|_\infty\, \hat{\delta}_N(\beta, n),$$
proving the claim. □
Additionally, we give
Definition 4. 
Denote $C_B^+(\mathbb{R}^N, \mathbb{C}) := \{f : \mathbb{R}^N \to \mathbb{C} \mid f = f_1 + i f_2, \text{ where } f_1, f_2 \in C_B^+(\mathbb{R}^N)\}$. For $f \in C_B^+(\mathbb{R}^N, \mathbb{C})$, we set
$$\hat{K}_n^\mu(f, x) := \hat{K}_n^\mu(f_1, x) + i\, \hat{K}_n^\mu(f_2, x),$$
$n \in \mathbb{N}$, $x \in \mathbb{R}^N$, $i = \sqrt{-1}$.
We give
Theorem 7. 
Let $f \in C_B^+(\mathbb{R}^N, \mathbb{C})$, $f = f_1 + i f_2$, $N \in \mathbb{N}$, $0 < \beta < 1$, $x \in \mathbb{R}^N$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_\mu \left|\hat{K}_n^\mu(f, x) - f(x)\right| \le \omega_1\left(f_1, \frac{1}{n} + \frac{1}{n^\beta}\right) + \omega_1\left(f_2, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\left(\|f_1\|_\infty + \|f_2\|_\infty\right) \hat{\delta}_N(\beta, n) =: \varphi(n),$$
and
(ii)
$$\sup_\mu \left\|\hat{K}_n^\mu(f) - f\right\|_\infty \le \varphi(n).$$
Proof. 
We have that
$$\left|\hat{K}_n^\mu(f, x) - f(x)\right| = \left|\hat{K}_n^\mu(f_1, x) + i\, \hat{K}_n^\mu(f_2, x) - f_1(x) - i f_2(x)\right| = \left|\left(\hat{K}_n^\mu(f_1, x) - f_1(x)\right) + i\left(\hat{K}_n^\mu(f_2, x) - f_2(x)\right)\right|$$
$$\le \left|\hat{K}_n^\mu(f_1, x) - f_1(x)\right| + \left|\hat{K}_n^\mu(f_2, x) - f_2(x)\right| \overset{(39)}{\le} \omega_1\left(f_1, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f_1\|_\infty\, \hat{\delta}_N(\beta, n) + \omega_1\left(f_2, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f_2\|_\infty\, \hat{\delta}_N(\beta, n),$$
proving the claim. □
We need
Definition 5. 
Let $\mathcal{L}^*$ be the Lebesgue $\sigma$-algebra on $\mathbb{R}$, and let the set function $\mu^* : \mathcal{L}^* \to [0, +\infty)$ be monotone, submodular and strictly positive. For $f \in C_B^+(\mathbb{R})$, we define the general multi-composite univariate Kantorovich–Choquet-type neural network operator, for any $x \in \mathbb{R}$:
$$\hat{M}_n^{\mu^*}(f, x) = \sum_{k=-\infty}^{\infty} \left( \frac{(C)\int_{\left[0, \frac{1}{n}\right]} f\left(t + \frac{k}{n}\right) d\mu^*(t)}{\mu^*\left(\left[0, \frac{1}{n}\right]\right)} \right) T_\lambda(nx - k).$$
Clearly, here $\mu^*\left(\left[0, \frac{1}{n}\right]\right) > 0$, $\forall\, n \in \mathbb{N}$.
From the above, we notice that
$$\left\|\hat{M}_n^{\mu^*}(f)\right\|_\infty \le \|f\|_\infty,$$
so that $\hat{M}_n^{\mu^*}(f, x)$ is well-defined.
Notice that $\hat{K}_n^\mu$, when $N = 1$, collapses to $\hat{M}_n^{\mu^*}$.
Another related result follows.
Corollary 2. 
(to Theorem 6 when $N = 1$)
Let $f \in C_B^+(\mathbb{R})$, $0 < \beta < 1$, $x \in \mathbb{R}$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_{\mu^*} \left|\hat{M}_n^{\mu^*}(f, x) - f(x)\right| \le \omega_1\left(f, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\|f\|_\infty\left(1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right)\right) =: \tau(n),$$
and
(ii)
$$\sup_{\mu^*} \left\|\hat{M}_n^{\mu^*}(f) - f\right\|_\infty \le \tau(n).$$
Given that $f \in C_U^+(\mathbb{R}) \cap C_B^+(\mathbb{R})$, we obtain $\lim_{n \to \infty} \hat{M}_n^{\mu^*}(f) = f$, uniformly.
Proof. 
Similarly to Theorem 6, it is omitted. □
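To illustrate Corollary 2 numerically, one may take $\mu^*$ to be the Lebesgue measure, so that by Remark 1 (vi) the Choquet averages reduce to ordinary integrals $n \int_0^{1/n} f\left(t + \frac{k}{n}\right) dt$. The sketch below (with $G_2 = \tanh \circ \tanh$, a midpoint quadrature, a truncated series, and the test function $|\sin x|$, all of which are our own implementation choices) shows the pointwise error decreasing as $n$ grows.

```python
import numpy as np

# Normalized multi-composite sigmoid (lambda = 2: tanh o tanh) and its density T_lambda.
G_tilde = lambda x: np.tanh(np.tanh(x)) / np.tanh(1.0)
T = lambda x: 0.25 * (G_tilde(x + 1.0) - G_tilde(x - 1.0))

def M_hat(f, x, n, window=60, quad=16):
    """Univariate Kantorovich-Choquet operator with mu* = Lebesgue measure, for which the
    Choquet averages reduce to n * int_0^{1/n} f(t + k/n) dt (Remark 1 (vi)).
    The series over k is truncated to |k - n x| <= window, since T decays very rapidly."""
    ks = np.arange(int(np.floor(n * x)) - window, int(np.ceil(n * x)) + window + 1)
    t = (np.arange(quad) + 0.5) / (quad * n)                # midpoint nodes in [0, 1/n]
    coeff = np.array([np.mean(f(t + k / n)) for k in ks])   # midpoint rule for n * int_0^{1/n} f(t + k/n) dt
    return float(np.sum(coeff * T(n * x - ks)))

f = lambda x: np.abs(np.sin(x))               # non-negative, bounded, uniformly continuous
x0 = 1.3
for n in (10, 100, 1000):
    print(n, abs(M_hat(f, x0, n) - f(x0)))    # the pointwise error shrinks as n grows
```

A general submodular capacity $\mu^*$ would replace the midpoint averages above with numerical Choquet integrals, as in the earlier sketch following Remark 1.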
We need
Definition 6. 
Let $f \in C_B^+(\mathbb{R}, \mathbb{C})$, where $f = f_1 + i f_2$ with $f_1, f_2 \in C_B^+(\mathbb{R})$. We set
$$\hat{M}_n^{\mu^*}(f, x) := \hat{M}_n^{\mu^*}(f_1, x) + i\, \hat{M}_n^{\mu^*}(f_2, x),$$
$n \in \mathbb{N}$, $x \in \mathbb{R}$.
We continue with
Corollary 3. 
(to Theorem 7 when $N = 1$) Let $f \in C_B^+(\mathbb{R}, \mathbb{C})$, $f = f_1 + i f_2$, $0 < \beta < 1$, $x \in \mathbb{R}$, $n \in \mathbb{N}$ with $n^{1-\beta} > 2$. Then
(i)
$$\sup_{\mu^*} \left|\hat{M}_n^{\mu^*}(f, x) - f(x)\right| \le \omega_1\left(f_1, \frac{1}{n} + \frac{1}{n^\beta}\right) + \omega_1\left(f_2, \frac{1}{n} + \frac{1}{n^\beta}\right) + 2\left(\|f_1\|_\infty + \|f_2\|_\infty\right)\left(1 - \tilde{G}_\lambda\left(n^{1-\beta} - 2\right)\right) =: \psi(n),$$
and
(ii)
$$\sup_{\mu^*} \left\|\hat{M}_n^{\mu^*}(f) - f\right\|_\infty \le \psi(n).$$
Proof. 
Similarly to Theorem 7, it is omitted. □
We finish with $L_p$ estimates.
Theorem 8. 
Let all be as in Theorem 7, $p > 1$, and let $\hat{\Lambda}$ be a compact subset of $\mathbb{R}^N$. Then
$$\left\|\hat{K}_n^\mu(f) - f\right\|_{p, \hat{\Lambda}} \le \varphi(n)\, \left|\hat{\Lambda}\right|^{\frac{1}{p}},$$
where $|\hat{\Lambda}| < \infty$ is the Lebesgue measure of $\hat{\Lambda}$, and $\varphi(n)$ is as in (45).
Proof. 
By (45), etc. □
Theorem 9. 
All as in Theorem 8, in the $N = 1$ case. Then
$$\left\|\hat{M}_n^{\mu^*}(f) - f\right\|_{p, \hat{\Lambda}} \le \psi(n)\, \left|\hat{\Lambda}\right|^{\frac{1}{p}},$$
where $\hat{\Lambda}$ now is a closed interval of $\mathbb{R}$, and $\psi(n)$ is as in (52).
Proof. 
By (52), etc. □

4. Conclusions

We studied the univariate and multivariate quantitative approximation by the multi-composite general Kantorovich–Choquet-type quasi-interpolation integral neural network operators.
The functions under approximation are non-negative-valued, continuous, and bounded. We established a series of multi-composite Jackson-type inequalities giving the corresponding convergences via the modulus of continuity. We extended our results to complex-valued functions. We finished with related $L_p$ results.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Anastassiou, G.A. Rate of Convergence of Some Neural Network Operators to the Unit-Univariate Case. J. Math. Anal. Appl. 1997, 212, 237–262.
2. Anastassiou, G.A. Quantitative Approximations; Chapman & Hall/CRC: Boca Raton, FL, USA, 2001.
3. Chen, Z.; Cao, F. The approximation operators with sigmoidal functions. Comput. Math. Appl. 2009, 58, 758–765.
4. Anastassiou, G.A. Intelligent Systems: Approximation by Artificial Neural Networks; Intelligent Systems Reference Library; Springer: Berlin/Heidelberg, Germany, 2011; Volume 19.
5. Anastassiou, G.A. Intelligent Systems II: Complete Approximation by Neural Network Operators; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2016.
6. Anastassiou, G.A. Intelligent Computations: Abstract Fractional Calculus, Inequalities, Approximations; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2018.
7. Anastassiou, G.A. Parametrized, Deformed and General Neural Networks; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 2023.
8. Costarelli, D.; Spigler, R. Approximation results for neural network operators activated by sigmoidal functions. Neural Netw. 2013, 44, 101–106.
9. Costarelli, D.; Spigler, R. Multivariate neural network operators with sigmoidal activation functions. Neural Netw. 2013, 48, 72–77.
10. Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: New York, NY, USA, 1998.
11. McCulloch, W.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 7, 115–133.
12. Mitchell, T.M. Machine Learning; WCB-McGraw-Hill: New York, NY, USA, 1997.
13. Dansheng, Y.; Feilong, C. Construction and approximation rate for feed-forward neural network operators with sigmoidal functions. J. Comput. Appl. Math. 2025, 453, 116150.
14. Siyu, C.; Bangti, J.; Qimeng, Q.; Zhi, Z. Hybrid neural-network FEM approximation of diffusion coefficient in elliptic and parabolic problems. IMA J. Numer. Anal. 2024, 44, 3059–3093.
15. Lucian, C.; Danilo, C.; Mariarosaria, N.; Alexandra, P. The approximation capabilities of Durrmeyer-type neural network operators. J. Appl. Math. Comput. 2024, 70, 4581–4599.
16. Xavier, W. The GroupMax neural network approximation of convex functions. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 11608–11612.
17. Arnau, F.; Oriol, G.; Joan, B.; Ramon, C. Approximation of acoustic black holes with finite element mixed formulations and artificial neural network correction terms. Finite Elem. Anal. Des. 2024, 241, 104236.
18. Philipp, G.; Felix, V. Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approximation spaces. Found. Comput. Math. 2024, 24, 1085–1143.
19. Andrea, B.; Dario, T. Quantitative Gaussian approximation of randomly initialized deep neural networks. Mach. Learn. 2024, 113, 6373–6393.
20. De Ryck, T.; Mishra, S. Error analysis for deep neural network approximations of parametric hyperbolic conservation laws. Math. Comp. 2024, 93, 2643–2677.
21. Jie, L.; Baoji, Z.; Yuyang, L.; Liqiau, F. Hull form optimization research based on multi-precision back-propagation neural network approximation model. Int. J. Numer. Methods Fluids 2024, 96, 1445–1460.
22. Yoo, J.; Kim, J.; Gim, M.; Lee, H. Error estimates of physics-informed neural networks for initial value problems. J. Korean Soc. Ind. Appl. Math. 2024, 28, 33–58.
23. Kaur, J.; Goyal, M. Hyers-Ulam stability of some positive linear operators. Stud. Univ. Babeş-Bolyai Math. 2025, 70, 105–114.
24. Abel, U.; Acu, A.M.; Heilmann, M.; Raşa, I. On some Cauchy problems and positive linear operators. Mediterr. J. Math. 2025, 22, 20.
25. Moradi, H.R.; Furuichi, S.; Sababheh, M. Operator quadratic mean and positive linear maps. J. Math. Inequal. 2024, 18, 1263–1279.
26. Bustamante, J.; Torres-Campos, J.D. Power series and positive linear operators in weighted spaces. Serdica Math. J. 2024, 50, 225–250.
27. Acu, A.-M.; Rasa, I.; Sofonea, F. Composition of some positive linear integral operators. Demonstr. Math. 2024, 57, 20240018.
28. Patel, P.G. On positive linear operators linking gamma, Mittag-Leffler and Wright functions. Int. J. Appl. Comput. Math. 2024, 10, 152.
29. Wang, Z.; Klir, G.J. Generalized Measure Theory; Springer: New York, NY, USA, 2009.
30. Ozger, F.; Aslan, A.R.; Merve, E. Some approximation results on a class of Szasz-Mirakjan-Kantorovich operators including non-negative parameter. Numer. Funct. Anal. Optim. 2025, 46, 461–484.
31. Costarelli, D.; Piconi, M. Strong and weak sharp bounds for neural network operators in Sobolev-Orlicz spaces and their quantitative extensions to Orlicz spaces. Bull. Sci. Math. 2026, 208, 103791.
32. Saini, S.; Singh, U. Kantorovich-Type Stochastic Neural Network Operators for the Mean-Square Approximation of Certain Second-Order Stochastic Processes. arXiv 2026, arXiv:2601.03634.
33. Choquet, G. Theory of capacities. Ann. Inst. Fourier 1954, 5, 131–295.
34. Denneberg, D. Non-Additive Measure and Integral; Kluwer: Dordrecht, The Netherlands, 1994.
35. Gal, S. Uniform and pointwise quantitative approximation by Kantorovich-Choquet type integral operators with respect to monotone and submodular set functions. Mediterr. J. Math. 2017, 14, 205.
36. Anastassiou, G.A. General Multi-Composite Sigmoid Relied Banach Space Valued Univariate Neural Network Approximation. In Parametrized, Deformed and General Neural Networks; Studies in Computational Intelligence; Springer: Cham, Switzerland, 2023.