Abstract
Based on the ideas of integral averaging and function extension, an extended Kantorovich-type neural network operator is constructed, and an error estimate for its approximation of continuous functions is obtained by using the modulus of continuity. Furthermore, by introducing a normalization factor, the approximation property of the new version of the extended Kantorovich-type neural network operator (the normalized extended Kantorovich-type neural network operator) is obtained in $L^p$ spaces. Numerical examples show that the newly proposed neural network operator has better approximation performance than the classical one, especially at the endpoints of a compact interval.
Keywords:
neural networks; Kantorovich-type operator; approximation; modulus of continuity; Lp space
MSC:
41A30; 41A25; 47A58
1. Introduction
Neural networks are broadly applied in areas such as visual recognition, healthcare, astrophysics, geology, and cybersecurity. As the most widely used neural networks, feedforward neural networks (FNNs) have been thoroughly studied because of their universal approximation capability. In theory, any continuous function on a compact set can be approximated by an FNN to any desired accuracy, provided that the number of neurons is sufficiently large. Furthermore, upper bounds of the approximation error for FNNs in the uniform metric and the $L^p$ metric have been studied in [1,2,3,4,5], among others.
FNNs with one hidden layer can be mathematically expressed as $N_n(x) = \sum_{k=1}^{n} c_k\, \sigma(\langle a_k, x \rangle + b_k)$, $x \in \mathbb{R}^d$, where $b_k \in \mathbb{R}$ for $k = 1, \ldots, n$ are the thresholds, $a_k \in \mathbb{R}^d$ are the connection weights, $c_k \in \mathbb{R}$ are the coefficients, $\langle a_k, x \rangle$ is the inner product of $a_k$ and $x$, and $\sigma$ is the activation function.
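To make the formula concrete, here is a minimal Python sketch of a single-hidden-layer FNN with a logistic activation; the weights, thresholds, and coefficients are arbitrary placeholder values, not quantities from the paper.

```python
import numpy as np

def sigmoid(t):
    """Logistic activation sigma(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def fnn_one_hidden_layer(x, a, b, c):
    """Evaluate N(x) = sum_k c_k * sigma(<a_k, x> + b_k).

    a : (n, d) connection weights, b : (n,) thresholds, c : (n,) coefficients.
    """
    return c @ sigmoid(a @ x + b)

# Example: d = 2 inputs, n = 3 hidden neurons (placeholder values).
a = np.array([[1.0, -0.5], [0.3, 0.8], [-1.2, 0.4]])
b = np.array([0.1, -0.2, 0.0])
c = np.array([0.5, 1.5, -0.7])
print(fnn_one_hidden_layer(np.array([0.2, 0.9]), a, b, c))
```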
NN operators and their approximation properties have attracted considerable attention since the 1990s. Cardaliaguet and Euvrard [6] first introduced NN operators to approximate the identity operator. Since then, different types of NN operators have been constructed, and their approximation properties have been widely investigated. Many impressive results concerning the convergence of NN operators, as well as the complexity of approximation, have been achieved ([6,7,8,9,10,11,12,13,14,15,16,17], etc.). Classical operators serve as references for NN operators in many respects, such as their construction, the way approximation properties are studied, and their practical applications, yet the two families differ in approximation performance and in their fields of application. Taking the classical Bernstein operator as an example: it is usually used to approximate continuous functions, whereas NN operators can be used to approximate a broader class of functions, such as integrable functions; Bernstein operators are valuable tools in computer-aided geometric design, whereas NN operators are used in a wider range of applications, such as machine learning. Moreover, unlike classical operators, a remarkable feature of NN operators is that they are nonlinear. It is worth mentioning that the main advantage of using NN operators as approximation tools is that they can be viewed as FNNs with multiple layers in which all components (the coefficients, the weights, and the thresholds) are known explicitly for approximating the target function. The construction of NN operators and the related analysis form an important part of the fundamental theory of artificial neural networks, which were introduced to simulate human brain activity.
Next, we review some NN operators and their approximation properties. Let $C[a,b]$ denote the space of continuous functions on a compact interval $[a,b]$, equipped with the uniform norm $\|f\|_\infty = \max_{x \in [a,b]} |f(x)|$. The approximation properties of the classical NN operator have been studied in [12,18,19], where the kernel function will be defined in Section 2. The symbol $\lfloor x \rfloor$ denotes the greatest integer not exceeding $x$, while $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$.
In applications, the sampled values of f may contain errors due to time-jitter or offset of the input signals, while more information is usually known around a point than precisely at that point. In approximation theory, constructing a Kantorovich-type operator is a well-known method for reducing such “time-jitter or offset” errors, and this kind of operator is also very useful in areas such as signal processing. To achieve this, single function values are replaced by an average of f over a small interval around the sampling node, namely its local integral mean.
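As a simple illustration of the averaging idea (not the paper's exact construction), the snippet below replaces the sample value f(k/n) with the mean of f over [k/n, (k+1)/n], computed by a midpoint rule; the choice of this particular interval around the node is an assumption made for illustration.

```python
import numpy as np

def local_mean(f, k, n, num_points=1000):
    """Approximate the mean of f on [k/n, (k+1)/n] by averaging midpoint samples."""
    t = k / n + (np.arange(num_points) + 0.5) / (num_points * n)
    return f(t).mean()

f = lambda t: t ** 2
n, k = 10, 3
print(f(k / n), local_mean(f, k, n))  # point sample vs. local average around k/n
```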
In [13], the authors used a similar idea to construct a Kantorovich-type neural network operator, and the related estimates are given therein. In the one-dimensional case, this NN operator reduces to
It was proved in [13] that this operator converges to the target function in various function spaces.
In [9], the authors extended the continuous function f to a function defined on a slightly larger interval and discovered that, after constructing Kantorovich-type operators from this extension, the approximation rate of the resulting NN operator to the target function was significantly improved.
A natural question is: what happens if we combine the idea of function extension with that of integral averaging? To address this question, we investigate in this paper the approximation behavior, including the convergence properties and quantitative estimates, of our newly constructed NN operator, called the extended Kantorovich-type NN operator. A numerical example shows that this type of neural network operator has better approximation performance than the classical Kantorovich-type NN operator, especially at the endpoints of a compact interval.
The remaining part of this paper is organized as follows. In Section 2, we propose two new types of NN operators, the extended Kantorovich-type NN operator (EKNNO) and the normalized extended Kantorovich-type NN operator (NEKNNO), and investigate their basic properties and the operator-dependent activation function. In Section 3, we establish the approximation theorems of the EKNNO and NEKNNO for continuous functions, including the convergence theorems and quantitative estimates, using the modulus of continuity. This section also demonstrates a numerical example and its results, which verify the validity of the theoretical results and the potential superiority of these operators. In Section 4, we further establish and prove the approximation properties of the EKNNO in the Lebesgue space, as well as the convergence results and the approximation rate of the NEKNNO. The conclusions are included in Section 5.
2. Extended Neural Network Operators of Kantorovich Type
In this section, we explain how to construct two extended neural network operators of a Kantorovich type—EKNNO and NEKNNO.
The activation function plays a significant role in the approximation properties of neural networks. In many fundamental NN models, the activation function is usually taken to be a sigmoid function ([20]), which is defined below.
A function $\sigma: \mathbb{R} \to \mathbb{R}$ is called sigmoid ([20]) if $\lim_{x \to -\infty} \sigma(x) = 0$ and $\lim_{x \to +\infty} \sigma(x) = 1$.
For example, the well-known Logistic function, defined by $\sigma(x) = 1/(1 + e^{-x})$, is a typical sigmoid function.
Next, we introduce the kernel function, a combination of translations of $\sigma$ that was first proposed by Chen and Cao [8]:
We assume in this paper that the sigmoid-type function $\sigma$ is nondecreasing on $\mathbb{R}$ and centrosymmetric with respect to the point $(0, 1/2)$, that is, $\sigma(x) + \sigma(-x) = 1$ for all $x$. For example, the Logistic function is exactly a sigmoid-type function that satisfies these conditions.
The resulting kernel, often called a “bell-shaped function” (a name given by Cardaliaguet and Euvrard [6]), has been discussed in many papers. It has several important properties, and we cite from [11] those that are relevant to the research presented in this paper.
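To make the kernel concrete, the sketch below uses the combination commonly attributed to Chen and Cao [8], namely phi(x) = (sigma(x + 1) - sigma(x - 1)) / 2 with the logistic sigma; since the defining display is not reproduced above, this explicit form is an assumption. The checks confirm the bell shape numerically: the function is even, peaks at 0, and decays away from the origin.

```python
import numpy as np

def sigma(t):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-t))

def phi(x):
    """Assumed Chen-Cao kernel: phi(x) = (sigma(x + 1) - sigma(x - 1)) / 2."""
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

x = np.linspace(-6.0, 6.0, 241)
vals = phi(x)
print(np.allclose(vals, phi(-x)))                    # even (symmetric about 0)
print(np.isclose(x[np.argmax(vals)], 0.0))           # maximum attained at x = 0
print(vals[0] < vals[120] and vals[-1] < vals[120])  # decays away from the peak
```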
Using the definition of the Fourier transformation ([21]) and the Poisson summation formula ([9,22]), we have Theorem 1.
Theorem 1.
Assume that $\sigma$ is of sigmoid type, centrosymmetric with respect to the point $(0, 1/2)$, and that the kernel is defined by (2). If there exist positive constants such that
then we have
where is the Fourier transformation of .
Throughout the paper, C refers to a positive constant whose value may vary under different circumstances.
Remark 1.
If the Logistic function is taken as $\sigma$ in Theorem 1, it was proved in [8,22] that the integer translates of the kernel sum to 1 at every real point; that is, Equality (3) can be simply written as a partition of unity.
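Remark 1 can be checked numerically. The sketch below assumes the commonly used Chen-Cao combination of the logistic sigmoid (the same assumption as in the previous sketch) and verifies that its integer translates sum to 1, up to truncation error, at arbitrary points.

```python
import numpy as np

def sigma(t):
    return 1.0 / (1.0 + np.exp(-t))

def phi(x):
    """Assumed Chen-Cao kernel built from the logistic sigmoid."""
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

def translate_sum(x, k_range=200):
    """Sum of phi(x - k) over k = -k_range, ..., k_range (truncated series)."""
    k = np.arange(-k_range, k_range + 1)
    return phi(x - k).sum()

for x in (0.0, 0.37, 2.5, -4.8):
    print(x, translate_sum(x))  # each sum should be very close to 1
```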
Inspired by the idea of integral averaging and function extension, we propose the first new operator—the extended Kantorovich-type NN operator (EKNNO)—as follows.
Definition 1.
Let the kernel be defined by (2), and denote
Assume that f is continuous and that its extension is defined as in (1). Then, the EKNNO is defined by
The EKNNO has several key characteristics. First, sampled function values are replaced by integral averages to remove the effect of time-jitter. Second, the function f, originally defined on a compact interval, is extended to a slightly larger interval to obtain better approximation behavior, especially around the endpoints. Further, a parameter A is introduced into the activation function; it serves as a flexible quantity for fine-tuning the approximation ability of the operator.
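Since the displayed definitions are not reproduced above, the following Python sketch shows only the general shape of an operator built from these three ingredients: integral averages of an extended function, a bell-shaped kernel, and a scale parameter A inside the activation. The constant extension outside [0, 1], the kernel form, the index range, and the normalization are all illustrative assumptions, not the paper's exact EKNNO.

```python
import numpy as np

def sigma(t):
    return 1.0 / (1.0 + np.exp(-t))

def phi(x, A=1.0):
    """Assumed kernel with a scale parameter A inside the activation."""
    return 0.5 * (sigma(A * (x + 1.0)) - sigma(A * (x - 1.0)))

def extend(f, a=0.0, b=1.0):
    """Illustrative extension of f from [a, b] to the real line by constant continuation."""
    return lambda u: f(np.clip(u, a, b))

def kantorovich_nn(f, x, n=20, A=1.0, a=0.0, b=1.0, quad_points=50):
    """Assumed Kantorovich-type NN operator: normalized weighted sum of local means of the extended f."""
    fe = extend(f, a, b)
    ks = np.arange(int(np.floor(n * a)) - 1, int(np.ceil(n * b)) + 2)  # illustrative node range
    t = (np.arange(quad_points) + 0.5) / (quad_points * n)             # midpoints for local means
    means = np.array([fe(k / n + t).mean() for k in ks])
    weights = phi(n * x - ks, A)
    return (weights * means).sum() / weights.sum()

f = lambda t: np.sin(np.pi * t)
print(f(0.3), kantorovich_nn(f, 0.3, n=50, A=2.0))
```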
Based on Definition 1, by introducing a normalization factor, we further construct the normalized extended Kantorovich-type NN operator (NEKNNO) as follows.
Normalization allows us to further discuss the approximation performance of NN operators in the integrable function space.
Compared with the EKNNO, the NEKNNO has the following properties.
Theorem 2.
Assume that the NEKNNO is defined as in Definition 2. Then, the following properties hold:
(I) .
(II) Denote the constant function identically equal to 1 on the underlying interval. Then,
Proof.
For arbitrary , we have
similarly
Notice that for arbitrary , if , then
The proof of (II) is straightforward because
Theorem 2 is proved. □
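Property (II) can also be checked numerically: any operator of the normalized form (weighted local means divided by the sum of the weights) maps the constant function 1 to itself, because the local means of 1 are all equal to 1. The sketch below verifies this for an illustrative normalized operator with an assumed kernel and index range; it is not the paper's exact NEKNNO.

```python
import numpy as np

def sigma(t):
    return 1.0 / (1.0 + np.exp(-t))

def phi(x, A=1.0):
    return 0.5 * (sigma(A * (x + 1.0)) - sigma(A * (x - 1.0)))

def normalized_operator(f, x, n=30, A=1.0, quad_points=50):
    """Illustrative normalized Kantorovich-type sum: sum_k phi(nx - k) * mean_k(f) / sum_k phi(nx - k)."""
    ks = np.arange(-2, n + 3)                       # illustrative index range around [0, 1]
    t = (np.arange(quad_points) + 0.5) / (quad_points * n)
    means = np.array([f(np.clip(k / n + t, 0.0, 1.0)).mean() for k in ks])
    w = phi(n * x - ks, A)
    return (w * means).sum() / w.sum()

one = lambda t: np.ones_like(t)
for x in (0.0, 0.25, 0.7, 1.0):
    print(x, normalized_operator(one, x))           # equals 1 at every point, by construction
```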
To obtain the error estimates of the two operators, we need the following lemmas.
Lemma 1.
Proof.
For any , set , and then . Further, for any , denote . Then, for sufficiently large , by the definition of and Theorem 1, we have
□
Lemma 2.
Let the relevant quantity be defined as in (4). For any admissible parameters, the following inequality holds:
Proof.
For all , it is easy to see that
Further, we can fix a specific such that ; therefore,
which completes the proof of Lemma 2. □
3. Degree of Approximation by the EKNNO and NEKNNO in the Space of Continuous Functions
The main aim of this section is to prove the convergence theorem as well as the quantitative approximation theorem for the operators EKNNO and NEKNNO applied to continuous functions. For the EKNNO, Theorem 3 can be established.
Theorem 3.
Assume that the function satisfies the condition in Theorem 1. Then, for every continuous function f, there exists a constant such that
where .
The symbol $\omega(f, \delta)$ in Theorem 3 denotes the modulus of continuity of f [23], defined by $\omega(f, \delta) = \sup \{ |f(t) - f(s)| : t, s \text{ in the domain of } f, \ |t - s| \le \delta \}$.
Proof.
In view of Theorem 1,
then
Now, we estimate these terms separately.
For , we have . By the definition of and the property that , we have:
Similarly,
and
While for ,
then
Furthermore, because
thus
while
Therefore,
Combining the above estimates together, we have
□
Remark 2.
For a fixed A, the terms above approach 0 as n tends to infinity; consequently,
for some constant C. In particular, for a suitable choice of the parameters, we have
Remark 3.
Taking a specific activation function, the corresponding quantities simplify in this case. Following a similar (but simpler) procedure, we have
when
Remark 4.
Notice that in Theorem 3, we set
Figure 1. Comparison of the Order.
In approximation theory, Theorem 3 is called a direct theorem of approximation by the operators, since it gives an upper bound for the approximation error. Direct theorems for Kantorovich-type operators have been investigated in much of the literature; see, for example, [24,25].
Such upper-bound results imply the convergence of the NN operators to the target function and also provide a quantitative measure of how accurately the target function can be approximated.
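When checking bounds of this type in experiments, the modulus of continuity appearing in Theorem 3 can be estimated numerically on a grid. The helper below is a standard grid-based estimator, not part of the paper; the grid size is an arbitrary choice.

```python
import numpy as np

def modulus_of_continuity(f, a, b, delta, grid=2000):
    """Grid estimate of omega(f, delta) = sup |f(t) - f(s)| over |t - s| <= delta in [a, b]."""
    x = np.linspace(a, b, grid)
    y = f(x)
    h = (b - a) / (grid - 1)
    best = 0.0
    for s in range(1, int(delta / h) + 1):    # compare points separated by at most delta
        best = max(best, np.max(np.abs(y[s:] - y[:-s])))
    return best

f = lambda t: np.sqrt(t)
print(modulus_of_continuity(f, 0.0, 1.0, 0.01))   # roughly sqrt(0.01) = 0.1 near the origin
```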
As for the NEKNNO , we have the following theorem.
Theorem 4.
Assume that f is bounded on the underlying interval and continuous at a point $x_0$; then
Furthermore, if f is continuous on the whole interval, we have
Proof.
The continuity of f at $x_0$ means that for any $\varepsilon > 0$, there exists $\delta > 0$ such that $|f(t) - f(x_0)| < \varepsilon$ whenever $|t - x_0| < \delta$. For the remaining part, we have
then
Notice that the remaining term becomes arbitrarily small for sufficiently large n, which completes the proof of Theorem 4. □
Remark 5.
In Figure 2, we compare the approximation efficiency of the NEKNNO and the classical Kantorovich-type NN operator when both approximate the quadratic function on a compact interval, while the parameters n and A vary. It is clear from Figure 2a–c that the NEKNNO has better approximation performance, especially at the endpoints. At the same time, Figure 2d shows that changing the parameter A noticeably affects the approximation efficiency of the NEKNNO at the endpoints. Whether an optimal choice of A exists is an open question worth discussing.
Figure 2. Comparison of the approximation effects of the two operators.
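The comparison reported in Figure 2 can be mimicked in a few lines with the illustrative operators used above. The sketch below approximates f(x) = x^2 on [0, 1] by a plain Kantorovich-type NN sum restricted to in-range nodes and by a normalized sum that also uses an extended f and a scale parameter A; both constructions are assumptions in the spirit of the paper, not its exact operators, so the printed errors only illustrate the qualitative endpoint effect.

```python
import numpy as np

def sigma(t):
    return 1.0 / (1.0 + np.exp(-t))

def phi(x, A=1.0):
    return 0.5 * (sigma(A * (x + 1.0)) - sigma(A * (x - 1.0)))

def local_means(f, ks, n, quad_points=50, extend=False):
    t = (np.arange(quad_points) + 0.5) / (quad_points * n)
    g = (lambda u: f(np.clip(u, 0.0, 1.0))) if extend else f   # assumed constant extension
    return np.array([g(k / n + t).mean() for k in ks])

def classical_kantorovich(f, x, n):
    ks = np.arange(0, n)                            # in-range nodes only
    w = phi(n * x - ks)
    return (w * local_means(f, ks, n)).sum() / w.sum()

def normalized_extended(f, x, n, A=2.0):
    ks = np.arange(-2, n + 3)                       # extra nodes beyond the endpoints
    w = phi(n * x - ks, A)
    return (w * local_means(f, ks, n, extend=True)).sum() / w.sum()

f = lambda t: t ** 2
n = 30
for x in (0.0, 0.5, 1.0):                           # endpoints and midpoint
    err_classical = abs(classical_kantorovich(f, x, n) - f(x))
    err_extended = abs(normalized_extended(f, x, n) - f(x))
    print(f"x={x:.1f}  classical={err_classical:.4f}  extended={err_extended:.4f}")
```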
4. Degree of Approximation by the NEKNNO and EKNNO in $L^p$ Spaces
Now, consider the efficiency of the approximation of f by the proposed operators EKNNO and NEKNNO in the Lebesgue space $L^p$, $1 \le p < \infty$, of measurable functions whose p-th power is integrable on the underlying interval, equipped with the norm $\|f\|_p = \left( \int |f(x)|^p \, dx \right)^{1/p}$.
We first give Lemma 3.
Lemma 3.
For any functions , the following inequality holds:
Proof.
By the definition of the operator, Theorem 2, Lemma 2, and Jensen’s inequality (applied owing to the convexity of $t \mapsto |t|^p$), we have
In view of Theorem 1, we obtain
Therefore,
This completes the proof of Lemma 3. □
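For completeness, the Jensen step used in the proof can be written out. The display below is the standard Jensen inequality for the convex function $t \mapsto |t|^p$ ($p \ge 1$) applied to a convex combination; the identification of the weights with normalized kernel values is an assumption about how the proof proceeds.

```latex
% Jensen's inequality for the convex function t -> |t|^p (p >= 1), applied to a
% convex combination with weights w_k >= 0 and sum_k w_k = 1:
\Bigl| \sum_k w_k \, a_k \Bigr|^p \;\le\; \sum_k w_k \, |a_k|^p .
% In the proof, the w_k are taken to be the normalized kernel values and the a_k
% the local integral means of f, so the p-th power of the normalized operator is
% bounded by a weighted sum of p-th powers of the local means.
```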
Utilizing Lemma 3, we can prove the convergence of the operator in the $L^p$ space.
Theorem 5.
Assume that f belongs to the Lebesgue space $L^p$; then
Proof.
It is well known that the continuous functions are dense in $L^p$. Hence, for any $\varepsilon > 0$, there exists a continuous function g such that $\|f - g\|_p < \varepsilon$. By Lemma 3 and (14),
Therefore, for sufficiently large n, we have
Theorem 5 is proved. □
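The omitted displays in the proof of Theorem 5 follow the standard density argument. A hedged reconstruction is sketched below; it assumes that the operator (written here with the placeholder symbol $T_n$, not the paper's notation) is linear in f, that Lemma 3 provides a uniform bound $\|T_n h\|_p \le C \|h\|_p$, and that convergence for continuous functions is already known.

```latex
% Density argument (assumptions: T_n linear in f, \|T_n h\|_p \le C\|h\|_p by
% Lemma 3, and \|T_n g - g\|_p \to 0 for continuous g):
\|T_n f - f\|_p
  \le \|T_n(f - g)\|_p + \|T_n g - g\|_p + \|g - f\|_p
  \le (C + 1)\,\|f - g\|_p + \|T_n g - g\|_p
  \le (C + 1)\,\varepsilon + \|T_n g - g\|_p ,
% and the last term is also smaller than \varepsilon once n is sufficiently large.
```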
Next, we need Lemma 4, whose proof is similar to that of Lemma 3.
Lemma 4.
For any functions , the following inequality holds:
Now, we can establish the approximation theorem of the operator in the $L^p$ space.
Theorem 6.
Let $C^1$ denote the set of all functions that are differentiable and have continuous derivatives on the underlying interval. Then, for f in the Lebesgue space and any g in $C^1$, the following inequality holds:
where , and
Proof.
Let f be in the Lebesgue space and let g be in $C^1$. By Minkowski’s inequality,
In view of Lemma 4,
Therefore,
According to the definition of and , we have
By Jensen’s inequality, Hölder’s inequality, and Lemma 2,
According to the Lagrange mean value theorem, we have
Therefore,
Let . We have
Remark 6.
If we take as the activation function, then we have
Consequently,
Choosing the parameters appropriately leads to
If we define g as the Steklov mean function of f, that is,
then g is absolutely continuous and closely related to f. The corresponding upper bound can then be estimated by using the modulus of smoothness in the $L^p$ space. We will derive this type of estimate in future work.
5. Conclusions
In this paper, we propose two types of NN operators, the EKNNO and the NEKNNO, which can be regarded as feedforward neural networks with multiple layers. We construct the EKNNO and NEKNNO using the following ideas: (1) integral averaging leads to a Kantorovich-type NN operator that removes time-jitter; (2) a function extension improves the approximation ability of the EKNNO and NEKNNO, especially at the endpoints of a compact interval; (3) a flexible parameter A is introduced to fine-tune the approximation ability of the operators; and (4) normalization allows us to further discuss the approximation performance of the NEKNNO in an integrable function space. All these features combined provide better approximation performance. We further prove the convergence of these operators and obtain quantitative estimates, for which important approximation tools, such as the modulus of continuity and the idea of the K-functional, are utilized. Numerical examples verify the validity of the theoretical results and some potential advantages of our NN operators.
However, in this paper we only considered the direct theorems of the NN operators, and the target function is univariate. The converse results and the higher-dimensional case will be investigated in our future work. Moreover, we used a sigmoid-type function as the activation function, while many other activation functions, such as ReLU and its variants (LeakyReLU, PReLU, ELU, SELU, etc.), are widely used in the machine learning and deep learning fields. It is worth exploring whether similar constructions work well with these activation functions.
In conclusion, we utilize methods and tools in approximation theory to obtain some interesting results in the field of neural networks, which may lead to more applications in neural networks.
Author Contributions
Conceptualization, writing—review and editing, visualization, C.X.; conceptualization, formal analysis, writing—review and editing, Y.Z.; conceptualization, writing—review and editing, X.W.; conceptualization, writing—review and editing, P.Y. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China under grant numbers 11671213 and 11601110, and by the Natural Sciences and Engineering Research Council of Canada under grant number RGPIN-2019-05917.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The study did not report any data.
Acknowledgments
The authors would like to thank the Anonymous Reviewers, Academic Editor, and the Journal Editor for their important comments.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of the data; in the writing of the manuscript; or in the decision to publish the results.
Abbreviations
The following abbreviations are used in this manuscript:
| NN | neural network |
| FNNs | feedforward neural networks |
| EKNNO | extended Kantorovich-type neural network operator |
| NEKNNO | normalized extended Kantorovich-type neural network operator |
References
- Barron, A.R. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans. Inform. Theory 1993, 39, 930–945. [Google Scholar] [CrossRef]
- Cao, F.L.; Xu, Z.B.; Li, Y.M. Pointwise approximation for neural networks. Lect. Notes Comput. Sci. 2005, 3496, 39–44. [Google Scholar]
- Cao, F.L.; Xie, T.F.; Xu, Z.B. The estimate for approximation error of neural networks: A constructive approach. Neurocomputing 2008, 71, 626–630. [Google Scholar] [CrossRef]
- Cao, F.L.; Zhang, R. The errors of approximation for feedforward neural networks in the Lp metric. Math. Comput. Model. 2009, 49, 1563–1572. [Google Scholar] [CrossRef]
- Chui, C.K.; Li, X. Approximation by ridge functions and neural networks with one hidden layer. J. Approx. Theory 1992, 70, 131–141. [Google Scholar] [CrossRef]
- Cardaliaguet, P.; Euvrard, G. Approximation of a function and its derivative with a neural network. Neural Netw. 1992, 5, 207–220. [Google Scholar] [CrossRef]
- Cantarini, M.; Coroianu, L.; Costarelli, D.; Gal, S.G.; Vinti, G. Inverse Result of Approximation for the Max-Product Neural Network Operators of the Kantorovich Type and Their Saturation Order. Mathematics 2022, 10, 63. [Google Scholar] [CrossRef]
- Chen, Z.X.; Cao, F.L. The approximation operators with sigmoidal functions. Comput. Math. Appl. 2009, 58, 758–765. [Google Scholar] [CrossRef]
- Chen, Z.X.; Cao, F.L.; Zhao, J.W. The construction and approximation of some neural networks operators. Appl. Math.-A J. Chin. Univ. 2012, 27, 69–77. [Google Scholar] [CrossRef]
- Chen, Z.X.; Cao, F.L. Scattered data approximation by neural network operators. Neurocomputing 2016, 190, 237–242. [Google Scholar] [CrossRef]
- Costarelli, D.; Spigler, R. Approximation results for neural network operators activated by sigmoidal functions. Neural Netw. 2013, 44, 101–106. [Google Scholar] [CrossRef] [PubMed]
- Costarelli, D.; Spigler, R. Multivariate neural network operators with sigmoidal activation functions. Neural Netw. 2013, 48, 72–77. [Google Scholar] [CrossRef] [PubMed]
- Costarelli, D.; Spigler, R. Convergence of a family of neural network operators of the Kantorovich type. J. Approx. Theory 2014, 185, 80–90. [Google Scholar] [CrossRef]
- Costarelli, D.; Vinti, G. Quantitative estimates involving K-functionals for neural network-type operators. Appl. Anal. 2019, 98, 2639–2647. [Google Scholar] [CrossRef]
- Qian, Y.Y.; Yu, D.S. Neural network interpolation operators activated by smooth ramp functions. Anal. Appl. 2022, 20, 791–813. [Google Scholar] [CrossRef]
- Yu, D.S.; Zhou, P. Approximation by neural network operators activated by smooth ramp functions. Acta Math. Sin. (Chin. Ed.) 2016, 59, 623–638. [Google Scholar]
- Zhao, Y.; Yu, D.S. Learning rates of neural network estimators via the new FNNs operators. In Proceedings of the 2014 International Joint Conference on Neural Networks (IJCNN), Beijing, China, 6–11 July 2014. [Google Scholar] [CrossRef]
- Anastassiou, G.A. Univariate hyperbolic tangent neural network approximation. Math. Comput. Model. 2011, 53, 1111–1132. [Google Scholar] [CrossRef]
- Anastassiou, G.A. Intelligent Systems: Approximation by Artificial Neural Networks, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
- Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
- Zygmund, A.; Fefferman, R. Trigonometric Series, 3rd ed.; Cambridge University Press: Cambridge, UK, 2003. [Google Scholar]
- Zhang, Z.; Liu, K.; Zhu, L.; Chen, Y. The new approximation operators with sigmoidal functions. Appl. Math. Comput. 2013, 42, 455–468. [Google Scholar] [CrossRef]
- DeVore, R.A.; Lorentz, G.G. Constructive Approximation, 1st ed.; Springer: Berlin/Heidelberg, Germany, 1992. [Google Scholar]
- Heshamuddin, M.; Rao, N.; Lamichhane, B.P.; Kiliçman, A.; Ayman-Mursaleen, M. On one- and two-dimensional α-Stancu-Schurer-Kantorovich operators and their approximation properties. Mathematics 2022, 10, 3227. [Google Scholar] [CrossRef]
- Rao, N.; Malik, P.; RaniPradeep, M. Blending type approximations by Kantorovich variant of α-Baskakov operators. Palest. J. Math. 2022, 11, 402–413. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).