1. Introduction
The error analysis in learning theory shows that the learning rate of kernel regularized regression depends upon the approximation ability of the kernel function spaces (see, for example, [1,2,3]).
Let
X be a complete metric space and
be a Borel measure on
X. Denote by
, the Hilbert space consisting of (real) square integrable functions with the inner product
Suppose that
is continuous, symmetric and strictly positive definite, i.e., for any given integers
are positive definite matrices for given finite sets
Assume that
, i.e.,
Then the linear operator
defined by
is positive, and its range lies in
. Take
to be the linear operator on
satisfying
and
, the inverse of
. Additionally, define
. Then
is a reproducing kernel Hilbert space associated with
, i.e., (see [1,4,5,6,7]),
where the inner product
is induced by a norm defined as
i.e.,
One of the targets of learning theory is to find an unknown function
from the random observations
drawn i.i.d. (independently and identically distributed) according to an unknown probability
defined on
(see [1,6]). A standard way to achieve this is to solve the following kernel regularized optimization problem:
where
is taken as the hypothesis space,
is a regularization parameter that balances the empirical error term
and the penalty term
. Let
be the regression function. Then
is the least-squares-best predictor (see Section 9.4 of [8]), i.e.,
It is known that the convergence analysis of model (5) reduces to bounding the convergence rate of the error
, which depends upon the decay of the best approximation
defined as (see, e.g., [1,2,6])
as
Formula (6) concerns a decay rate that depends upon the approximation property of
. Many authors have investigated it. For example, D.X. Zhou gives the decay of (6) with the RKHS interpolation theory (see [2,3]). P.X. Ye gives the decay using convolution operators in the Euclidean space
(see [9]). H.W. Sun gives a decay for (6) with the help of operator theory in a Hilbert space (see [10]). It is known that the Fourier–Bessel series is a good approximation tool and has been studied by many authors (see, for example, [11,12,13,14,15,16]). Approximation by RBF networks of Delsarte translates has also been studied, and the essence of RBF approximation reduces to approximation with Fourier–Bessel transforms (see, for example, [17,18,19,20]). It is therefore of interest to investigate the decay of
with both the Fourier–Bessel series and the Fourier–Bessel transforms.
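As an illustration of the kernel regularized scheme (5) discussed above, the following minimal sketch assumes the usual least-squares form, in which the empirical error is the mean squared residual over m samples and the penalty is the regularization parameter times the squared RKHS norm. Under that assumption the minimizer is a finite kernel expansion whose coefficients solve a linear system with the Gram matrix. The Gaussian kernel, the function names and the sample data are illustrative choices, not the kernels constructed in this paper.

```python
import math


def gaussian_kernel(x, t, sigma=1.0):
    """Illustrative Mercer kernel on the real line (an assumption, not the paper's kernel)."""
    return math.exp(-((x - t) ** 2) / (2.0 * sigma ** 2))


def solve_linear_system(a, b):
    """Solve a c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def fit_regularized(xs, ys, lam, kernel=gaussian_kernel):
    """Return the minimizer of (1/m) * sum (f(x_i) - y_i)^2 + lam * ||f||_K^2.

    The minimizer has the form f(x) = sum_i c_i K(x, x_i), where the
    coefficients solve (G + lam * m * I) c = y and G is the Gram matrix.
    """
    m = len(xs)
    gram = [[kernel(xi, xj) for xj in xs] for xi in xs]
    for i in range(m):
        gram[i][i] += lam * m
    coeffs = solve_linear_system(gram, ys)

    def predictor(x):
        return sum(c * kernel(x, xi) for c, xi in zip(coeffs, xs))

    return predictor


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0.0, 1.0, 4.0, 9.0]
    for lam in (1e-10, 1e-2, 1.0):
        f = fit_regularized(xs, ys, lam)
        print(lam, [round(f(x), 4) for x in xs])
```

With a very small parameter the predictor nearly interpolates the samples, while a large parameter shrinks it toward zero. This is the trade-off between the empirical error term and the penalty term described above.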
Let
and
be given real numbers, and
denote the space of all measurable real functions on
such that
where
The normalized Bessel function
of the first kind and order
is
where
is the Bessel function of the first kind and order
, and
is the Gamma function.
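For readers who wish to experiment numerically, the following sketch evaluates a normalized Bessel function. It assumes the standard normalization j_α(x) = 2^α Γ(α+1) J_α(x) / x^α, so that j_α(0) = 1, and uses the corresponding power series. The function names and the truncation length are hypothetical choices made for illustration only.

```python
import math


def normalized_bessel(alpha, x, terms=60):
    """Normalized Bessel function j_alpha(x), assuming alpha > -1.

    Uses the series
        j_alpha(x) = Gamma(alpha + 1) * sum_k (-1)^k (x/2)^(2k) / (k! * Gamma(k + alpha + 1)),
    whose consecutive terms satisfy
        term_k = term_{k-1} * ( -(x/2)^2 / (k * (k + alpha)) ),  term_0 = 1.
    The truncation `terms` is an illustrative choice suited to moderate |x|.
    """
    half_sq = (x / 2.0) ** 2
    term = 1.0
    total = 1.0
    for k in range(1, terms):
        term *= -half_sq / (k * (k + alpha))
        total += term
    return total


def bessel_first_kind(alpha, x):
    """Recover J_alpha(x) for x > 0 from the assumed normalization above."""
    return (x / 2.0) ** alpha / math.gamma(alpha + 1.0) * normalized_bessel(alpha, x)


if __name__ == "__main__":
    # Classical special cases: j_{-1/2}(x) = cos x and j_{1/2}(x) = sin(x) / x.
    x = 1.3
    print(normalized_bessel(-0.5, x), math.cos(x))
    print(normalized_bessel(0.5, x), math.sin(x) / x)
```

The two special cases in the usage lines give a quick sanity check, since for these orders the normalized Bessel function reduces to elementary trigonometric functions.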
For
, the usual Fourier–Bessel transform
is defined as
In the present paper, we investigate the decay of in the case that are constructed with and . Some K-functionals and moduli of smoothness are defined with the help of semigroups of operators, and their equivalence is shown, which is then used to bound the decay. The results obtained are two kinds of upper bound estimates, associated with Fourier–Bessel series and Fourier–Bessel transforms, respectively.
The paper is organized as follows. In Section 2, some notions and results on the Fourier–Bessel series and Fourier–Bessel transforms are provided, with which two kinds of RKHSs are constructed; the corresponding best RKHS approximation problem in this setting is restated. Some K-functionals and moduli of smoothness associated with Fourier–Bessel series and Fourier–Bessel transforms are provided, and their equivalence is shown, with which some upper bounds for the best approximation are established in Section 3 and Section 4, respectively. All the proofs of the propositions, theorems and lemmas are given in Section 5. Some further analysis of the results of the present paper is given in Section 6, which explains the contribution of this manuscript. A general proposition on the strong equivalence of K-functionals and moduli of smoothness is given in Appendix A.
2. Preliminaries
Let
be the positive zeros of
arranged in increasing order. It is well known that
form a complete orthogonal system in
(see, for example, [12,16,21]), i.e.,
Take
. Then
forms an orthonormal basis of
and for any
, the following Fourier–Bessel series expansion holds:
where
and
Lemma 1. We have the following results:
- (i)
- (ii)
The generalized translation operator on defined as where and
- (iii)
The zeros satisfy
Inequality (13) is the theoretical basis for defining the moduli of smoothness with the translation operators
.
Let
be the set of given positive real sequences such that the right side of the series
has uniform convergence for all
. It therefore is a Mercer kernel. Then
Then it is easy to verify that
, and
is an RKHS in
associated with the reproducing kernel
and an inner product
defined as
Equality (6) becomes
as
Let
be the class of even
-functions on
. Denote by
, the space of even
-functions on
R which are rapidly decreasing together with all their derivatives, i.e.,
where
is the set of natural numbers.
Let
denote the space of even
-functions on
R with support in
and
Additionally, define the generalized translation operator
on
as
and define a convolution on
by
For the Bessel operators
we have (see p. 12 or p. 177 of [22])
and therefore
Moreover, we have the following lemma.
Lemma 2. There hold the following:
- (i)
is dense in ;
- (ii)
Both and are dense in and
- (iii)
If , then and ;
- (iv)
is a topological isomorphism from to itself and
- (v)
- (vi)
If , then
- (vii)
Let or . Then
- (viii)
There hold the following relations
Proposition 2.1 of [23] shows that if
satisfies
and
, then
defines a Mercer kernel on
. We make the following assumption.
Assumption 1. Let satisfy and for any there is a real number such that
We point out that functions satisfying Assumption 1 exist, and give two examples.
Example 1. For
the function
defined by
satisfies
and
for
(see Problem 5. VIII 2 in Section 5.VIII Problems of [22]).
Example 2. For
the function
defined by
satisfies
and
for
(see Problem 5. VIII 1 in Section 5.VIII Problems of [22]).
Define
with norm
Define an inner product on
as
It is known that
is a reproducing kernel of
(see [24]), i.e.,
Define for a given real number
an operator as
Then it is easy to show that
,
and
In this case, the decay (6) becomes
for
If
, then we define the corresponding RKHS
and for
, there holds
By (34), we have
for
4. An Upper Bound Estimate with the Fourier–Bessel Transform
To bound
, we define a
K-functional
and a modulus
respectively corresponding to
as
and
where
The K-functional and the modulus are equivalent, i.e., we have the following proposition.
Proposition 2. Let satisfy Assumption 1. Then there holds the equivalence
We now give an upper bound estimate for (34).
Theorem 2. Under the conditions of Proposition 2, there is a constant such that if
For
we define a
K-functional on
as
Define a modulus of smoothness as
where
Then we have the following two corollaries.
Corollary 3. There holds the equivalence relation
Corollary 4. There is a constant such that
We give further computations for
. By Example 1, we know
, which, together with (21), gives
which, with (42), shows that
Substituting (43) into (42), we obtain
Formula (44) shows that the decay of
is controlled by the approximation order of convolution operator
for
.
For
we define
Define a
K-functional on
as
Define a modulus of smoothness as
where
Then we have the following two corollaries.
Corollary 6. There is a constant such that
Additionally, by Example 2, we know
, which, together with (21), gives
which, with (47), shows that
Substituting (48) into (47), we have
We know by (49) that the decay of
is controlled by the approximation order of the convolution operator
for
6. Further Discussions
We now give some comments on the results obtained in the present paper.
A more general problem arising from learning theory is to bound the decay rate of the function (see [2])
where
is a Banach space and
is a dense subspace with
for
It is known that the approximation ability of a function class is determined by the smoothness of its functions. So the decay of is influenced by the smoothness of the functions in
Smale and Zhou (see [2]) give the first estimate for the decay of (61) in the case that
, which is a particular Besov space (in fact, it is the interpolation space of B and H). This work is improved in [9]. For
(the Sobolev space, see [2] for the definition) and the reproducing kernel Hilbert space
, Zhou gives the following estimate (see [3])
if
, where
is the Gaussian kernel
The tool used is RKHS function interpolation.
It is known that the most commonly used tool in approximation theory is the K-functional. The most helpful relation is the strong equivalence between a K-functional and a corresponding modulus of smoothness (see, for example, [26]). The most commonly used quantity for describing the approximation ability of a function class is the Jackson inequality expressed with a K-functional or a modulus of smoothness (see also [26]). As far as we know from the literature, no Jackson inequality has been established for the decay of (6). There is little description of the smoothness of an RKHS. Recent research shows that any RKHS has some smoothness; this can be considered from the viewpoint of fractional derivatives and orthogonal series, and it is shown that the well-known K-functional ([27])
is equivalent to a modulus of smoothness, where X is chosen as some compact set, for example,
and
. It is valuable to extend these results to RKHSs defined on a noncompact set. The set X used in the present paper is
, which is a noncompact set and has essential properties different from those of a compact set (see, for example, [5]). Moreover, it is the first time that a Jackson inequality is established to describe the decay (6). An advantage of this manuscript is the use of the Bessel series and Bessel transforms, which transforms the RKHS approximation problem into the classical Bessel–Fourier approximation problem and gives the decay rate with Bessel–Fourier approximation techniques.
The Jackson inequalities in Theorem 1 and Theorem 2 show that the RKHSs constructed with Bessel series and Bessel transforms have the same approximation ability as the Bessel series and Bessel transforms themselves.
The moduli of smoothness defined in this manuscript are first-order moduli. It would be valuable to define higher-order moduli of smoothness and to establish the corresponding Jackson inequalities for the decay of (6).