Estimation of Star-Shaped Distributions

Eckhard Liebscher; Wolf-Dieter Richter

doi:10.3390/risks4040044

Abstract

Scatter plots of multivariate data sets motivate modeling of star-shaped distributions beyond elliptically contoured ones. We study properties of estimators for the density generator function, the star-generalized radius distribution and the density in a star-shaped distribution model. For the generator function and the star-generalized radius density, we consider a non-parametric kernel-type estimator. This estimator is combined with a parametric estimator for the contours which are assumed to follow a parametric model. Therefore, the semiparametric procedure features the flexibility of nonparametric estimators and the simple estimation and interpretation of parametric estimators. Alternatively, we consider pure parametric estimators for the density. For the semiparametric density estimator, we prove rates of uniform, almost sure convergence which coincide with the corresponding rates of one-dimensional kernel density estimators when excluding the center of the distribution. We show that the standardized density estimator is asymptotically normally distributed. Moreover, the almost sure convergence rate of the estimated distribution function of the star-generalized radius is derived. A particular new two-dimensional distribution class is adapted here to agricultural and financial data sets.

Keywords:

star-shaped distributions; antinorm contoured distributions; norm contoured distributions; non-concentric elliptically contoured distributions; kernel density estimators

AMS Subject Classification:

60E05; 62G07; 62H12

1. Introduction

The classes of multivariate Gaussian and elliptically contoured distributions have served as the probabilistic basis of many multivariate statistical models over several decades. Accounts of the theory of elliptically contoured distributions may be found in [1,2,3]. The book [4] by Fang and Anderson contains an extensive chapter on statistical inference for elliptically contoured distributions. The theory of elliptically contoured distributions, including applications to portfolio theory, is presented in the monograph [5] by Gupta et al. Combining the advantages of several estimators, semiparametric density estimators for elliptical distributions were derived in the papers [6,7,8] by Stute and Werner, by Cui and He and by Liebscher. In [9], Battey and Linton considered a density estimator for elliptical distributions based on Gaussian mixture sieves. The performance of their estimators depends heavily on how well the density can be approximated by a mixture of normal distributions. Scatter plots of multivariate data sets, however, motivate modeling of star-shaped distributions beyond elliptically contoured ones.

The more flexible star-shaped densities were studied in [10] and later in [11]. In [12], the general structure of their normalizing constant for a given density generating function was discovered, a geometric measure representation and, based upon it, a stochastic representation were derived, and a survey of applications of such densities was given. Moreover, two-dimensional non-concentric elliptically contoured distributions are introduced there and, based upon two-dimensional star-shaped densities, a universal star-shaped generalization of the univariate von Mises density is derived. These results are studied in further detail in [13] for several particular classes. The large classes of norm and antinorm contoured distributions, being particular cases of star-shaped distributions, are considered in [14,15] for dimension two and for arbitrary finite dimension, respectively. In this paper we study several of those classes of distributions for arbitrary finite dimension and introduce a particular new class of distributions for dimension two. The rather general class of distributions considered in the present paper covers distributions with convex as well as non-convex contours.

The main goal of this paper is to develop estimation procedures for fitting multivariate generalized star-shaped distributions. The semiparametric procedure combines the flexibility of nonparametric estimators and the simple estimation and interpretation of parametric estimators. Since we apply nonparametric estimation to a univariate function, we avoid the disadvantages of nonparametric estimators in connection with the curse of dimensionality. The semiparametric approach of this paper is based on that in the earlier paper [7] on elliptical distributions but uses partially weaker assumptions. Alternatively, we consider a pure parametric method. In both cases, a parametric model is assumed for the density contours given by the star body and the Minkowski functional of it. For the semiparametric method, we assume that the contours are smooth, more precisely, that the Minkowski functional is continuously differentiable. The parameters are estimated using a method of moments. The star generalized radius density is estimated nonparametrically by use of a kernel density estimator, or parametrically.

The paper is structured as follows. The class of continuous star-shaped distributions and several of its subclasses are considered in Section 2. Section 3 deals with the estimation of the density and the star-generalized radius distribution. It provides the statements on convergence rates and on the asymptotic normality of the density estimator as well as on the convergence rate of the estimated distribution function of the generalized radius. First the case of a given star body is considered; later the more general case of a parametrized star body is taken into consideration. Section 3.4 surveys, to a certain extent, examples in which different subclasses of star-shaped distributions appear in practice, and applies the methods developed here to the analysis of two-dimensional agricultural and financial data. The proofs can be found in Section 4.

2. Continuous Star-Shaped Distributions

2.1. The General Distribution Class

Throughout this paper,

K \subset R^{d}

denotes a star body, i.e., a non-empty star-shaped set that is compact and is equal to the closure of its interior, having the origin as an interior point. The Minkowski functional of K is defined by

h_{K} (x) = inf {λ \geq 0 : x \in λ K} for x \in R^{d} .

The boundary of K is just the set

{{(u, v)}^{T} : h_{K} ({(u, v)}^{T}) = 1}

. Further we find a ball

{y \in R^{d} : ∥y∥ \leq r}

which covers K where

∥.∥

denotes Euclidean norm. Hence

h_{K} (x {∥x∥}^{- 1}) \geq 1 / r

and

h_{K} (x) \geq \frac{1}{r} ∥x∥ .

(1)

The function

h_{K}

is assumed to be homogeneous of degree one,

h_{K} (λ x) = λ h_{K} (x) for x \in R^{d}, λ > 0,

and to satisfy a further assumption.

A countable collection

F = {C_{1}, C_{2}, \dots}

of pairwise disjoint sectors (closed convex cones

C_{j}

containing no half-space, with non-empty interior and vertex being the origin

0_{d}

) such that

R^{d} = ⋃_{j} C_{j}

will be called a fan. By

B_{d}

we denote the Borel-σ-field in

R^{d}

and by S, the boundary of K. We denote

S_{j} = S \cap C_{j}, S_{j} \cap B_{d} = B_{S, j}

and

B_{S} = σ {B_{S, 1}, B_{S, 2}, \dots}

. We shall consider only star bodies K and sets

A \in B_{S}

satisfying the following condition.

Assumption 1.

The star body K and the set

A \in B_{S}

are chosen such that for every j the set

G (A \cap S_{j}) = {ϑ \in R^{d - 1} : \exists η with θ = {(ϑ^{T}, η)}^{T} \in A \cap S_{j}}

is well defined and such that for every

ϑ = {(ϑ_{1}, \dots, ϑ_{d - 1})}^{T} \in G (A \cap S_{j})

there is a uniquely determined

η > 0

satisfying

h_{K} ({(ϑ_{1}, \dots, ϑ_{d - 1}, η)}^{T}) = 1

.

A star body K satisfying this assumption will be called for short an

A1

-star-body.

Let

g : [0, + \infty) \to [0, + \infty)

be a nonnegative function which fulfills the condition

0 < \int_{R^{d}} g (h_{K} (x)) d x < \infty .

Such a function is called a density generating function (dgf).

We consider the class of continuous star-shaped distributions of random vectors X taking values in

R^{d}

:

CStSh^{(d)} = {Φ_{g, K, μ} : μ \in R^{d}, K is an A1-star-body with 0_{d} \in int K, g is a dgf}

where

int K

means the interior of K. Suppose that the distribution law

Φ_{g, K, μ}

has the density

φ_{g, K, μ} (x) = C (g, K) g (h_{K} (x - μ)) for x \in R^{d},

(2)

where

C (g, K)

is a suitable normalizing constant. Moreover, K is called the contour defining star body of

φ_{g, K, μ}

.

We consider the random vector X having the density (2) (in symbols

X \sim Φ_{g, K, μ}

). According to Theorem 8 in [12], this random vector has the representation

X - μ \overset{d}{=} R U,

(3)

where the star-generalized radius variable

R = h_{K} (X - μ)

and the star-generalized uniform basis vector

U = \frac{1}{R} (X - μ)

are independent.

Moreover, R has the density

f_{R} (r) = \frac{1}{I (g, d)} r^{d - 1} g (r)

(4)

with

I (g, d) = \int_{0}^{\infty} r^{d - 1} g (r) d r

, and U has a star-generalized uniform probability distribution on the boundary of K, i.e.,

P (U \in A) = O_{S} (A) / O_{S} (S)

for

A \in B_{S}

. According to [12],

O_{S}

means the star-generalized surface measure, which is non-Euclidean unless S is the Euclidean sphere, and which is well defined if Assumption 1 is fulfilled. Note that

O_{S} (S) = d \cdot vol (K)

. If

{lim}_{r \to 0 + 0} g (r) > 0

is finite, then in view of (4), R takes values in the neighbourhood of zero with a rather small probability. This behaviour is called the volcano effect and is the stronger the higher the dimension is. The density (2) may be written as

φ_{g, K, μ} (x) = \frac{1}{O_{S} (S)} {(h_{K} (x - μ))}^{1 - d} f_{R} (h_{K} (x - μ)), x \in R^{d} .

(5)

Estimating such a density may be studied under various assumptions concerning the degree of knowledge on the groups of parameters

K, h_{K}, O_{S} (S)

and

f_{R}

, as well as μ. Let

X = {(X^{(1)}, \dots, X^{(d)})}^{T}

and

U = {(U_{1}, \dots, U_{d})}^{T}

. The next lemma gives helpful information about the mean and the covariances. Here and in what follows

1 {A}

denotes the indicator function of an event A.

Lemma 1.

If

E R^{2} < + \infty

and K is symmetric w.r.t. the origin, then

E X = μ, and Cov (X^{(j)}, X^{(k)}) = E U_{j} U_{k} E R^{2} for j, k = 1, \dots, d .

Proof.

In view of (3), we have first to show that

E U = 0

. Because of the symmetry of K, U has the same distribution as

- U

, and

U_{j} = 0

with probability 0 for each j. Thus we obtain

\begin{matrix} E X - μ & = & E R E U = E R (E (U 1 \{U_{1} > 0\}) + E (U 1 \{U_{1} < 0\})) \\ & = & E R (E (U 1 \{U_{1} > 0\}) + E (- U 1 \{U_{1} > 0\})) = 0 . \end{matrix}

Moreover, it follows that

Cov (X^{(j)}, X^{(k)}) = E (R^{2} U_{j} U_{k}) = E U_{j} U_{k} E R^{2} .

☐

The general approach followed here includes non-convex bodies which can occur in applications. Obviously,

h_{K} (U) = 1

and the distribution of the random vector U is concentrated on the set

{u \in R^{d} : h_{K} (u) = 1}

.
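The representation (3) suggests a simple way to simulate from a star-shaped distribution. The following Python sketch is illustrative only and rests on several assumptions that are not part of the paper: d = 2, K is the unit ball of the 1-norm, g(r) = e^{-r} (so that, by (4), R is Gamma distributed with shape parameter 2), and U is generated as Z / h_K(Z) with Z uniformly distributed on K, which is consistent with O_S(S) = d · vol(K). All function names are hypothetical.

```python
import random

def h_K(x):
    """Illustrative Minkowski functional: the 1-norm on R^2 (K is a diamond)."""
    return abs(x[0]) + abs(x[1])

def sample_U(rng):
    """Draw U on the boundary of K (assumed construction, see lead-in).

    A uniform point in the square [-1, 1]^2 is kept if it lies in K;
    dividing by h_K projects it radially onto the boundary of K.
    """
    while True:
        p = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        h = h_K(p)
        if 0.0 < h <= 1.0:
            return (p[0] / h, p[1] / h)

def sample_X(mu, rng):
    """One draw of X = mu + R * U with R and U independent, cf. (3)."""
    # Assumption: g(r) = exp(-r) and d = 2, hence f_R(r) = r * exp(-r).
    r = rng.gammavariate(2.0, 1.0)
    u = sample_U(rng)
    return (mu[0] + r * u[0], mu[1] + r * u[1])

def simulate(mu, n, seed=12345):
    """Return n independent draws of X using a seeded generator."""
    rng = random.Random(seed)
    return [sample_X(mu, rng) for _ in range(n)]
```

For a sample produced by `simulate`, the average of h_K(X_i - μ) should be close to E R = 2 under these assumptions, and the sample mean should be close to μ, in line with Lemma 1.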

2.2. A Class of Two-Dimensional Distributions Whose Contour Defining Star Bodies Are Squared Sine Transformed Euclidean Circles

We define

α (u, v) \in (- π, π]

to be the angle in radians between the positive x-axis and the line through the point

(u, v)

and the origin:

α (u, v) = arctan (v / u)

for

u > 0

,

α (u, v) = arctan (v / u) + sgn (v) π

for

u < 0

,

α (0, v) = \frac{π}{2} \cdot sgn (v)

. The Minkowski functional of any two-dimensional star body K can then be written as

h_{K} ({(u, v)}^{T}) = \sqrt{u^{2} + v^{2}} H (α (u, v)),

where

H : (- π, π] \to (0, \infty)

is a bounded function. In the following examples we consider two-dimensional star bodies with smooth boundaries; i.e., H is differentiable. Here, the following generator function is used

g_{0} (r) = \frac{1 + r}{3} e^{- r}

which corresponds to the star-generalized radius density of mixed Erlang type

f_{0 R} (r) = \frac{r + r^{2}}{3} e^{- r} .
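The relation between g_0 and f_{0R} can be checked numerically from formula (4) with d = 2. The short Python sketch below is a plausibility check under that assumption, not part of the original derivation; the helper `integrate` and the truncation of the integral at 60 are illustrative choices. It also makes the term "mixed Erlang type" explicit: f_{0R} is the mixture of an Erlang(2, 1) and an Erlang(3, 1) density with weights 1/3 and 2/3.

```python
import math

def g0(r):
    """Density generator g_0(r) = (1 + r) / 3 * exp(-r) from the text."""
    return (1.0 + r) / 3.0 * math.exp(-r)

def integrate(f, lo, hi, n=100_000):
    """Composite midpoint rule on [lo, hi] (illustrative helper)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def radius_density(r, g, d, I_gd):
    """Formula (4): f_R(r) = r^(d-1) g(r) / I(g, d)."""
    return r ** (d - 1) * g(r) / I_gd

def f0R(r):
    """Closed form stated in the text: (r + r^2) / 3 * exp(-r)."""
    return (r + r * r) / 3.0 * math.exp(-r)

def erlang_mixture(r):
    """1/3 * Erlang(2, 1) + 2/3 * Erlang(3, 1), written out explicitly."""
    erlang2 = r * math.exp(-r)
    erlang3 = r * r * math.exp(-r) / 2.0
    return erlang2 / 3.0 + 2.0 * erlang3 / 3.0

# I(g_0, 2) = int_0^inf r g_0(r) dr; the tail beyond 60 is negligible.
I_g0 = integrate(lambda r: r * g0(r), 0.0, 60.0)
```

Here `I_g0` evaluates to 1 up to the quadrature error, so g_0 is already normalized in the sense I(g, d) = 1 for d = 2.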

Example 1.

Here we consider the Minkowski functional

h_{K} ({(u, v)}^{T}) = \sqrt{u^{2} + v^{2}} (1 + a \frac{v^{2}}{u^{2} + v^{2}}) = \sqrt{u^{2} + v^{2}} (1 + a {sin}^{2} α (u, v))

where

a \in (- 1, + \infty)

is a parameter. Figures 1 to 4 show the boundaries of the body K for several values of a and the contour lines of the resulting density for one choice of a. These figures show that the distribution class includes densities with convex as well as with non-convex contours.
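For readers who wish to reproduce boundary plots of this kind, the following Python sketch evaluates the Minkowski functional of Example 1 and the boundary point of K in a given direction. It is a minimal illustration: the function names are hypothetical, and the boundary formula simply uses that h_K = 1 holds at distance 1 / (1 + a sin^2 α) from the origin.

```python
import math

def h_K(u, v, a):
    """Minkowski functional of Example 1; requires a > -1."""
    r2 = u * u + v * v
    if r2 == 0.0:
        return 0.0  # h_K vanishes at the origin
    return math.sqrt(r2) * (1.0 + a * v * v / r2)

def boundary_point(alpha, a):
    """Point on the boundary of K in direction alpha, i.e. where h_K = 1."""
    rho = 1.0 / (1.0 + a * math.sin(alpha) ** 2)
    return rho * math.cos(alpha), rho * math.sin(alpha)

def boundary_curve(a, n=360):
    """n boundary points for angles spread evenly over (-pi, pi]."""
    return [boundary_point(-math.pi + 2.0 * math.pi * (i + 1) / n, a)
            for i in range(n)]
```

Plotting `boundary_curve(a)` for several values of a with any plotting tool gives curves of the type described in the figure captions.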

Figure 1. Plot of the boundaries of K in the cases

a = 10

(solid line),

a = 3

(dashed line),

a = 1

(dotted line) and

a = 0.3

(dashed/dotted line).

Figure 2. Plot of the boundaries of K in the cases

a = 10

(solid line),

a = 3

(dashed line),

a = 1

(dotted line) and

a = 0.3

(dashed/dotted line).

Figure 3. Plot of the boundaries of K in the cases

a = - 0.8

(solid line),

a = - 0.6

(dashed line),

a = - 0.4

(dotted line),

a = - 0.2

(dashed/dotted line).

Figure 4. Contour plot of the density of

φ_{g, K, μ}

for

a = - 0.8

and levels

0.39, 0.38, \dots 0.19

.

Example 2.

We consider the star body K with Minkowski functional

h_{K} ({(u, v)}^{T}) = \sqrt{u^{2} + v^{2}} (1 + a_{1} \frac{{(u - a_{2} v)}^{2}}{u^{2} + v^{2}}),

where

a_{1}

and

a_{2} \in R

are parameters such that

1 + a_{1} (1 + a_{2}^{2}) > 0

, and

H (α) = 1 + a_{1} \frac{{(1 - a_{2} tan α)}^{2}}{1 + {tan}^{2} α} .

This star body arises from a rotation of K in Example 1 by an angle

α_{0}

where

a_{2} = 1 / tan α_{0}

. In Figure 5 and Figure 6 the boundaries of (multiples of) K are depicted.

Figure 5. Plot of the boundaries of K for

a_{2} = 2

in the cases

a_{1} = 10

(solid line),

a_{1} = 3

(dashed line),

a_{1} = 1

(dotted line) and

a_{1} = 0.3

(dashed/dotted line).

Figure 6. Contour plot of the density

φ_{g, K, μ}

for

a_{1} = 3

and levels as in Figure 4.

A specific motivation for considering a star body K as in Example 2 arises when studying the dataset 5 of [38], see Figure 10 below.

Figure 10. Scatter plot of the dataset 5 of [38].

2.3. Norm-Contoured Distributions

Specific norm-contoured distributions were studied in several papers which are in part surveyed in Richter [14]. A geometric measure representation of arbitrary norm contoured distributions is proved in [15]. The class of all norm-contoured distributions is denoted, according to these papers, by

NC

. The class

NC

is a subclass of the class StSh of star-shaped distributions. Here, we consider the subclass of continuous norm-contoured distributions CNC.

It is well known that there is a one-to-one correspondence between the class of convex bodies being symmetric w.r.t. the origin, where

x \in K

implies

- x \in K

, and the class of norms in

R^{d}

. If K is any such symmetric convex body then

h_{K} (x) = ∥x∥

where

∥.∥

is the uniquely determined norm. On the other hand, if

∥.∥

is any norm, then

K = {x : ∥x∥ \leq 1}

is the corresponding convex symmetric body having the origin as interior point, and

∥x∥ = h_{K} (x)

.

Throughout this section, let

∥.∥

be any norm and

K = {x : ∥x∥ \leq 1}

, and let the density of a norm-contoured distribution be

φ (x; g, ∥.∥, O, μ) = C (g, ∥.∥) g (∥O^{- 1} (x - μ)∥) for x \in R^{d}

where O is any orthogonal

d \times d

-matrix. Because any rotated or mirrored norm-ball is again a norm ball we shall restrict our attention to the case O being the unit matrix and will write then

X \sim C N C (g, ∥.∥, μ)

. In the present situation,

S = {x : ∥x∥ = 1}

is considered to be the unit sphere in the Minkowski space

(R^{d}, ∥.∥)

.

In the following we consider several specific cases of norms and the corresponding norm-contoured distributions.

Example 3.

If K is the Euclidean unit ball then

h_{K}

is the Euclidean norm and (2) defines a shifted spherical distribution.

Example 4.

Let

A \in R^{d, d}

be a

d \times d

-matrix satisfying

det (A) > 0

,

{∥.∥}^{\circ}

any norm, and

∥x∥ = {∥A x∥}^{\circ}

another norm. If X is

{∥.∥}^{\circ}

-contoured distributed and A is symmetric and positive definite then we call the distribution of

A X

an elliptically generalized

{∥.∥}^{\circ}

-contoured distribution.

Example 5.

Let

a = {(a_{1}, \dots, a_{d})}^{T}

be a vector with

a_{i} > 0, i = 1, \dots, d

and

A = diag (1 / a_{1}, \dots, 1 / a_{d})

. If, in Example 4,

{∥.∥}^{\circ}

is the p-norm,

p \geq 1,

then the corresponding norm

∥.∥

is

∥x∥ = {(\sum_{i = 1}^{d} {|\frac{x_{i}}{a_{i}}|}^{p})}^{1 / p} .

The distribution of X is called an axes aligned p-generalized elliptically contoured distribution.

Example 6.

Assume that data are grouped into k groups, and let

k > 1, n = n_{1} + \dots + n_{k}, n_{i} \geq 1, p_{i} \geq 1, i = 1 \dots k

and

h_{K} (x) = \sum_{j = 1}^{k} {(\sum_{i = 1}^{d} {|\frac{x_{i}}{a_{i}}|}^{p_{j}})}^{1 / p_{j}} .

K may be called then an

(a, p_{1}, \dots, p_{k})

-generalized axis-aligned ellipsoid and we will say that X follows a grouped

(a, p_{1}, \dots, p_{k})

-generalized axis-aligned elliptically contoured distribution in

R^{n}

.

Example 7.

In the case of two-dimensional observations, let

P_{n}

denote the polygon having the n vertices

I_{n, i} = {(cos (\frac{2 π}{n} (i - 1)), sin (\frac{2 π}{n} (i - 1)))}^{T}

,

i = 1, \dots, n, n \geq 3

. The convex body which is circumscribed by

P_{n}

will be denoted by K. Then

h_{K}

is a norm defined in

R^{2}

and

φ_{g, K, 0}

a polygonally contoured density which was used implicitly in [13] to construct a corresponding geometric generalization of the von Mises density. For the more general class of multivariate polyhedral star-shaped distributions, see [16].

Example 8.

Given a homogeneous polynomial p of degree k with

p (| x_{1} |, \dots, | x_{d} |) \geq 0

, the function

N (x) : = {(p (| x_{1} |, \dots, | x_{d} |))}^{1 / k}

defines a norm in

R^{d}

if it is subadditive. An example for a homogeneous polynomial of degree 3 and

d = 2

is

p (x_{1}, x_{2}) = x_{1}^{3} + x_{2}^{3} + 2 x_{1}^{2} x_{2}

.

2.4. Antinorm-Contoured Distributions

A function

g : R^{n} \to [0, \infty)

which is continuous, positively homogeneous, non-degenerate and superadditive in some fan is called an antinorm in [17]. Thereby, g is called superadditive in a sector C or in the fan

F

if it satisfies the reverse triangle inequality in C or in every sector of the fan

F

, respectively.

Example 9.

If the

(a, p)

-functional

{| . |}_{a, p}

is defined as

∥.∥

in Example 5 but with

p \in (0, 1)

then it is an antinorm.

For geometric measure representations of elements from a large class of continuous antinorm contoured distributions we refer to [14,15]. For figures of two-dimensional antinorm balls, see [17].

2.5. Continuous Non-Concentric Elliptically Contoured Distributions

Let

a = {(a_{1}, \dots, a_{d})}^{T}, a_{i} > 0, i = 1, \dots, d

and

K_{a} = {x \in R^{d} : \sum_{i = 1}^{d} {(x_{i} / a_{i})}^{2} \leq 1}

. If

e = {(e_{1}, \dots, e_{d})}^{T}

satisfies

\sum_{i = 1}^{d} {(e_{i} / a_{i})}^{2} < 1

then

K_{a, e} = K_{a} - e

is a star body having the origin as an interior point, and

r K_{a, e} = \{x \in R^{d} : \sum_{i = 1}^{d} {(\frac{x_{i} + r e_{i}}{r a_{i}})}^{2} \leq 1\} = K_{r a, r e} .

Moreover,

r_{1} K_{a, e} \subset r_{2} K_{a, e}

for

r_{1} \leq r_{2}

. A Minkowski functional

h_{K_{a, e}}

which is homogeneous of degree one will be called a non-concentric elliptically contoured function and

φ_{g, K_{a, e}, μ}

a non-concentric elliptically contoured density. If

O : R^{d} \to R^{d}

denotes an arbitrary orthogonal transformation then

h_{O K_{a, e}}

is also a non-concentric elliptically contoured function which is homogeneous of degree one. For the special case of

d = 2

see [12,13].
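The Minkowski functional of K_{a, e} can be written in closed form. Setting λ = h_{K_{a, e}}(x) in the boundary condition \sum_{i} ((x_{i} + λ e_{i}) / (λ a_{i}))^{2} = 1 gives the quadratic equation (1 - E) λ^{2} - 2 B λ - A = 0 with A = \sum_{i} (x_{i} / a_{i})^{2}, B = \sum_{i} x_{i} e_{i} / a_{i}^{2} and E = \sum_{i} (e_{i} / a_{i})^{2} < 1, whose unique nonnegative root is λ = (B + \sqrt{B^{2} + (1 - E) A}) / (1 - E). This derivation and the Python sketch below are our own illustration based on the definition of K_{a, e}; the symbols A, B, E and the function name are not used in the paper.

```python
import math

def h_nonconcentric(x, a, e):
    """Minkowski functional of K_{a,e} = K_a - e via the quadratic above."""
    A = sum((xi / ai) ** 2 for xi, ai in zip(x, a))
    B = sum(xi * ei / ai ** 2 for xi, ai, ei in zip(x, a, e))
    E = sum((ei / ai) ** 2 for ei, ai in zip(e, a))
    if not E < 1.0:
        raise ValueError("e must satisfy sum((e_i / a_i)^2) < 1")
    # Nonnegative root of (1 - E) * lam^2 - 2 * B * lam - A = 0.
    return (B + math.sqrt(B * B + (1.0 - E) * A)) / (1.0 - E)
```

For e = 0 the expression reduces to the usual elliptical norm \sqrt{A}.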

3. Estimation for Continuous Star-Shaped Distributions

3.1. Parametric Estimators

Let

X_{1}, \dots, X_{n}

be a sample of independent random vectors, where

X_{i} \sim Φ_{g, K, μ}

and

X_{i} = {(X_{i 1}, \dots, X_{i d})}^{T}

. Assume that the star body K is given and Assumption 1 is satisfied. From now on, we suppose that K is symmetric w.r.t. the origin. We consider a model family

{f_{θ} : θ \in Θ_{1}}

of continuously differentiable densities for the star-generalized radius R on

[0, \infty)

, see (4).

Θ = Θ_{1} \times Θ_{2}, Θ_{1} \subset R^{q}, Θ_{2} \subset R^{d}

is the parameter space which is assumed to be compact. Suppose that

h_{K} (.)

is a continuous function.

Next we give two reasonable model classes for

f_{θ}

:

(1): Modified exponential model. $θ = τ \in (0, + \infty)$ ,

$f_{τ} (r) = \frac{1}{(d + 1) (d - 1)!} τ^{- d} r^{d - 1} (1 + \frac{r}{τ}) e^{- r / τ} for r > 0$

with the expectation

$\int_{0}^{\infty} r f_{θ} (r) d r = \frac{d (d + 2) τ}{d + 1} .$

(6)
(2): Weibull model. $θ = (τ, a) \in (0, + \infty) \times (1, + \infty)$ ,

$f_{θ} (r) = \frac{a}{τ^{d} Γ (d / a)} r^{d - 1} e^{- {(r / τ)}^{a}} for r > 0$

with the expectation

$\int_{0}^{\infty} r f_{θ} (r) d r = \frac{Γ (\frac{d + 1}{a}) τ}{Γ (\frac{d}{a})} .$
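Both model classes are straightforward to code. The Python sketch below implements the two radius densities and the stated expectations so that normalization and the mean formulas can be checked by numerical integration. It is an illustration only; the function names and the midpoint-rule helper are hypothetical choices.

```python
import math

def f_modexp(r, tau, d):
    """Modified exponential model (1) for the radius density."""
    c = 1.0 / ((d + 1) * math.factorial(d - 1) * tau ** d)
    return c * r ** (d - 1) * (1.0 + r / tau) * math.exp(-r / tau)

def mean_modexp(tau, d):
    """Expectation stated in (6)."""
    return d * (d + 2) * tau / (d + 1)

def f_weibull(r, tau, a, d):
    """Weibull model (2) for the radius density."""
    c = a / (tau ** d * math.gamma(d / a))
    return c * r ** (d - 1) * math.exp(-((r / tau) ** a))

def mean_weibull(tau, a, d):
    """Expectation stated for the Weibull model."""
    return math.gamma((d + 1) / a) * tau / math.gamma(d / a)

def integrate(f, lo, hi, n=100_000):
    """Composite midpoint rule on [lo, hi] (illustrative helper)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))
```

For example, with d = 3 and τ = 1.5 the integral of `f_modexp` over a sufficiently long interval is 1 up to quadrature error, and the integral of r times `f_modexp` agrees with `mean_modexp(1.5, 3)`.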

Let

f_{R} \in {f_{θ} : θ \in Θ_{1}}

. In this section the aim is to fit the specific parametric model for the density

φ_{g, K, μ}

to the data by estimating the parameters θ and μ where

φ_{g, K, μ}

is given according to (5) and (4) with

f_{R} = f_{θ}

. Note that the two models (1) and (2) fulfill the condition

{lim}_{r \to 0 + 0} g^{'} (r) = 0

which ensures the differentiability of the density

φ_{g, K, μ}

at zero.

For the statistical analysis we suppose that the data

X_{1}, \dots, X_{n}

are given and these data comprise independent random vectors having density

φ_{g, K, μ}

. Suppose that θ and μ are interior points of

Θ_{1}

and

Θ_{2}

, respectively. The concentrated log likelihood function (constant addends can be omitted) reads as follows

L (θ, μ) = \sum_{i = 1}^{n} (ln f_{θ} (h_{K} (X_{i} - μ)) + (1 - d) ln h_{K} (X_{i} - μ)) .

We introduce the maximum likelihood estimators

{\hat{θ}}_{n}, {\hat{μ}}_{n}

of θ and μ as joint maximizers of the likelihood function:

L ({\hat{θ}}_{n}, {\hat{μ}}_{n}) = max_{(θ, μ) \in Θ} L (θ, μ) .
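To make the objective concrete, the following Python sketch evaluates the concentrated log likelihood L(θ, μ) for the modified exponential model (1). It is a sketch under stated assumptions: the Minkowski functional is passed in as a user-supplied function `h_K`, the dimension d is taken from the length of μ, no observation coincides with μ, and the maximization itself (for example by a generic numerical optimizer) is left out. All names are hypothetical.

```python
import math

def log_f_modexp(r, tau, d):
    """Logarithm of the modified exponential radius density, r > 0."""
    return (-math.log((d + 1) * math.factorial(d - 1)) - d * math.log(tau)
            + (d - 1) * math.log(r) + math.log1p(r / tau) - r / tau)

def concentrated_loglik(tau, mu, data, h_K):
    """L(theta, mu) with theta = tau, summed over all observations."""
    d = len(mu)
    total = 0.0
    for x in data:
        # Star-generalized radius of the centred observation (must be > 0).
        r = h_K([xi - mi for xi, mi in zip(x, mu)])
        total += log_f_modexp(r, tau, d) + (1 - d) * math.log(r)
    return total

def euclid(x):
    """Euclidean norm, used here only as an example of h_K."""
    return math.sqrt(sum(xi * xi for xi in x))
```

Note that the terms (d - 1) ln r and (1 - d) ln r cancel, which reflects that the density (2) depends on the data only through g(h_K(x - μ)).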

Under appropriate assumptions, the maximum likelihood estimators are asymptotically normally distributed (cf. Theorem 5.1 in [18], p. 463)

\sqrt{n} {({\hat{θ}}_{n} - θ, {\hat{μ}}_{n} - μ)}^{T} \overset{d}{⟶} N (0, I {(θ, μ)}^{- 1}) for n \to \infty,

where

\overset{d}{⟶}

is the symbol for convergence in distribution and the information matrix is given by

I (θ, μ) = {(I_{i j} (δ))}_{i, j = 1 \dots d + q}

with

δ^{T} = (δ_{1}, \dots, δ_{d + q}) = (θ^{T}, μ^{T})

and

I_{i j} (δ) = - E (\frac{\partial^{2}}{\partial δ_{i} \partial δ_{j}} (ln f_{θ} (h_{K} (X_{k} - μ)) + (1 - d) ln h_{K} (X_{k} - μ))) .

3.2. Nonparametric Estimators without Scale Fit

In the present section we deal with nonparametric estimators in the context of star-shaped distributions. This type of estimators is of special interest if no suitable parametric model can be found. The cdf of R will be denoted by

F_{R}

.

3.2.1. Estimating μ and $F_{R}$

Let

X_{1}, \dots, X_{n}

be the sample as in Section 3.1. In the following the focus is on the estimation of the parameter μ and the distribution function of the generalized radius R.

First we choose an estimator for μ. For this purpose we assume that

E | X | < + \infty

. In view of Lemma 1,

{\hat{μ}}_{n} = \frac{1}{n} \sum_{i = 1}^{n} X_{i}

(7)

is an unbiased estimator for the unknown parameter μ. Define

R_{i} = h_{K} (X_{i} - μ)

and

{\hat{R}}_{i} = h_{K} (X_{i} - {\hat{μ}}_{n})

for

i = 1, \dots, n

. Using this definition, an estimator for the cdf of

R = h_{K} (X - μ)

(cf. (3)) is given by the formula

{\hat{F}}_{n}^{R} (r) = \frac{1}{n} \sum_{i = 1}^{n} 1 \{{\hat{R}}_{i} \leq r\}

(8)

for

r \geq 0

. At first glance,

{\hat{F}}_{n}^{R} (r)

just approximates the empirical distribution function

F_{n}^{R} (r) = \frac{1}{n} \sum_{i = 1}^{n} 1 \{R_{i} \leq r\}

which is not available from the data because of the unknown μ. We can prove that

{\hat{F}}_{n}^{R}

converges to

F_{R}

a.s.

, in fact at the same rate as every common empirical distribution function converges to the cdf. This is the assertion of the following theorem.
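The estimators (7) and (8) translate directly into code. The Python sketch below computes the sample mean, the estimated radii \hat{R}_{i} and the resulting estimated distribution function for a user-supplied Minkowski functional. The function names and the use of the Euclidean norm in the example are illustrative assumptions.

```python
import bisect
import math

def sample_mean(data):
    """Estimator (7): componentwise mean of the observations."""
    n = len(data)
    d = len(data[0])
    return [sum(x[j] for x in data) / n for j in range(d)]

def estimated_radii(data, h_K):
    """R_hat_i = h_K(X_i - mu_hat_n) for i = 1, ..., n."""
    mu_hat = sample_mean(data)
    return [h_K([xi - mi for xi, mi in zip(x, mu_hat)]) for x in data]

def radius_ecdf(data, h_K):
    """Estimator (8): returns the function r -> F_hat_n^R(r)."""
    r_sorted = sorted(estimated_radii(data, h_K))
    n = len(r_sorted)

    def F_hat(r):
        # Number of estimated radii that are <= r, divided by n.
        return bisect.bisect_right(r_sorted, r) / n

    return F_hat

def euclid(x):
    """Euclidean norm, used here only as an example of h_K."""
    return math.sqrt(sum(xi * xi for xi in x))
```

For the four points (0, 0), (4, 0), (2, 3), (2, -3) the sample mean is (2, 0), the estimated radii are 2, 2, 3, 3, and the estimated distribution function jumps by 1/2 at r = 2 and at r = 3.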

Theorem 1.

Suppose that Assumption 1 is satisfied,

h_{K} (.)

is Lipschitz-continuous on

R^{d}

, and

\int_{0}^{\infty} r^{d + 1} g (r) d r < + \infty .

(9)

If further f_{R} is bounded on

[0, + \infty)

, then, for

n \to \infty

,

sup_{r \geq 0} |{\hat{F}}_{n}^{R} (r) - F_{R} (r)| = O (\sqrt{\frac{ln ln n}{n}}) a.s.

Here the condition (9) ensures that

E R^{2} < + \infty

which in turn is required for the law of the iterated logarithm for

{\hat{μ}}_{n}

.

3.2.2. Density Estimation

In the remainder of Section 3.2, we establish an estimator for the density

φ_{g, K, μ}

in the case of a bounded generator function g, and provide statements on convergence properties of the estimator. An estimator for μ is available by Formula (7), whereas the estimation of g has not yet been addressed. To estimate g, this function must be identifiable. In (2), however, the function g is determined only up to a constant factor. Therefore, we require

I (g, d) = 1

to obtain the uniqueness and identifiability. As a consequence, we get, according to [12]

C (g, K) = \frac{1}{O_{S} (S)} .

In the following we adapt the approach introduced in Section 2 of [7] to the present much more general situation. This approach combines the advantages of two estimators and avoids their disadvantages. Let

ψ : [0, \infty) \to [0, + \infty)

be a function having a derivative

ψ^{'}

with

ψ^{'} (y) > 0

for

y \geq 0,

and the property

ψ (0) = 0

. We introduce the random variable

Y = ψ (h_{K} (X - μ))

and denote the inverse function of ψ by Ψ. The transformation using ψ is applied to adjust the volcano effect described above. In view of (4), the density χ of

Y = ψ (R)

is given by

χ (y) = Ψ {(y)}^{d - 1} g (Ψ (y)) \cdot Ψ^{'} (y)

for

y \geq 0

. This equation implies the following formula for g:

g (z) = z^{1 - d} ψ^{'} (z) χ (ψ (z)) .

The next step is to establish the estimator for χ. Nonparametric estimators have the advantage that they are flexible and there is no need to assume a specific model. Let us consider the transformed sample

Y_{1 n}, \dots, Y_{n n}

with

Y_{i n} = ψ ({\hat{R}}_{i})

. Further we apply the following kernel density estimator for χ:

{\hat{χ}}_{n} (y) = n^{- 1} b^{- 1} \sum_{i = 1}^{n} (k ((y - Y_{i n}) b^{- 1}) + k ((y + Y_{i n}) b^{- 1})) for y \geq 0,

(10)

where

b = b (n)

is the bandwidth and k the kernel function. Note that

{\hat{χ}}_{n}

represents the usual kernel density estimator for χ based upon the

Y_{i n}

’s and including a boundary correction at zero (the second addend in the outer parentheses of (10)). The mirror rule is used as a simple boundary correction. Other more elegant corrections can be applied at the price of a higher technical effort. The properties of

{\hat{χ}}_{n}

are essentially influenced by the bandwidth b. Since the kernel estimator shows reasonable properties only in the case of bounded χ, we have to guarantee by suitable assumptions that

{lim}_{z \to 0 +} z^{1 - d} ψ^{'} (z) > 0

in order to get the boundedness of χ (see below). On the basis of

{\hat{χ}}_{n}

, we can establish the following estimator for

φ_{g, K, μ}

:

{\hat{φ}}_{n} (x) = O_{S} {(S)}^{- 1} {\hat{g}}_{n} (h_{K} (x - {\hat{μ}}_{n})),

where

{\hat{g}}_{n} (z) = z^{1 - d} ψ^{'} (z) {\hat{χ}}_{n} (ψ (z)) .

(11)

This approach has the property that the theory of kernel density estimators applies here (cf. [19]). The kernel estimators are a very popular type of nonparametric density estimators because of their comparatively simple structure. In the literature the reader can find a lot of hints concerning the choice of the bandwidth.
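The construction (10) and (11) can be sketched in a few lines of Python. The sketch assumes concrete choices that the formulas leave open at this point: the Epanechnikov kernel as k and the transformation ψ(z) = -a + (a^d + z^d)^{1/d} with a constant a > 0, for which z^{1-d} ψ'(z) simplifies to (a^d + z^d)^{1/d - 1}. The bandwidth, the data and all function names are illustrative.

```python
import math

def k_epa(t):
    """Epanechnikov kernel with support [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def chi_hat(y, ys, b):
    """Estimator (10): kernel estimator with mirror correction at zero."""
    n = len(ys)
    return sum(k_epa((y - yi) / b) + k_epa((y + yi) / b) for yi in ys) / (n * b)

def psi(z, a, d):
    """Assumed transformation psi(z) = -a + (a^d + z^d)^(1/d)."""
    return -a + (a ** d + z ** d) ** (1.0 / d)

def g_hat(z, radii, b, a, d):
    """Estimator (11) for the generator g, evaluated at z >= 0."""
    ys = [psi(r, a, d) for r in radii]  # transformed sample Y_in
    # For this psi, z^(1-d) * psi'(z) equals (a^d + z^d)^(1/d - 1),
    # which stays finite at z = 0.
    weight = (a ** d + z ** d) ** (1.0 / d - 1.0)
    return weight * chi_hat(psi(z, a, d), ys, b)

def integrate(f, lo, hi, n=20_000):
    """Composite midpoint rule on [lo, hi] (illustrative helper)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))
```

With these choices the estimator of χ integrates to one over [0, ∞), and consequently the estimated generator satisfies the normalization I(g, d) = 1 up to numerical error.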

Let us add a few words on the comparison between this paper and [7]. Although the main idea for the construction of the estimators is the same, there is a difference in the definition of the generator functions (say g and

g_{L}

). Considering the special case

h_{K} (x) = ∥Σ^{- 1 / 2} (x - μ)∥

, identity

g (t) = g_{L} (t^{2})

can be established for

t \geq 0

. This causes some changes in the formulas. For more details in a particular case, see Section 3 in [20].

3.2.3. Assumptions Ensuring Convergence Properties of Estimators

Next we provide the assumptions for the theorems below. Assumption 2 concerns the parameter

b = b (n)

and the kernel function k of the kernel estimator, whereas Assumption 3 is imposed on the function ψ.

Assumption 2.

(a) We assume that

lim_{n \to \infty} b ln ln n = 0 and \bar{b} \geq b \geq C_{1} \cdot n^{- 1 / 5}

with constants

\bar{b}, C_{1} > 0

.

(b) Suppose that the kernel function

k : R ⟶ R

is continuous and vanishes outside the interval

[- 1, 1]

, and has a Lipschitz continuous derivative on

[- 1, 1]

. Moreover, assume that

k (- t) = k (t)

holds for

t \in [- 1, 1]

,

\int_{- 1}^{1} k (t) d t = 1 and \int_{- 1}^{1} t^{j} k (t) d t = 0

(12)

for even

j : 0 < j < p

, where

p \geq 2

is an integer.

Note that continuity of the derivative at an enclosed boundary point means that the one-sided derivative exists and is the limit of the derivatives in a neighbourhood of this point. Symmetric kernel functions k satisfying (12) are called kernels of order p. Assumption 2 ensures that the bias of the density estimator

{\hat{χ}}_{n}

converges to zero at a certain rate. Under Assumption 2 with

p = 2

and

k (t) \geq 0

, the estimator

{\hat{χ}}_{n}

is indeed a density. The case

p > 2

is added to complete the presentation and is of minor practical importance unless we have a very large sample size (cf. the discussion in [21]). From the asymptotic theory for density estimators, it is known that the Epanechnikov kernel

k_{epa} (t) = \begin{cases} \frac{3}{4} (1 - t^{2}) & for t \in [- 1, 1], \\ 0 & otherwise \end{cases}

is an optimal kernel of order 2 (i.e., in the case

p = 2

in Assumption 2) with respect to the asymptotic mean square error (cf. [19]). This kernel function is simple in structure and leads to fast computations. The consideration of optimal kernels can be extended to higher-order kernels. It turned out that their use is advantageous only in the case of sufficiently large sample sizes (for instance, for a size greater than 1000).

Assumption 3.

The

(p + 1)

-th order derivative of

Ψ^{d}

exists and is continuous on

[0, \infty)

,

ψ^{'}

is positive and bounded on

(0, + \infty)

for some integer

p \geq 2

, and

ψ^{″}

is bounded on

(0, + \infty)

. The functions

z ⇝ z^{d - 1} ψ^{'} {(z)}^{- 1}

and

z ⇝ z^{- 1} ψ^{'} (z)

have bounded derivatives on

[0, M_{1}]

with some

M_{1} > 0

. Moreover,

lim_{t \to 0 +} {(Ψ^{d} (t))}^{'} > 0 .

(13)

Notice that in Assumption 3 we require that the right-sided limit of the

(p + 1)

-th order derivative of

Ψ^{d}

is finite at zero. Hence Assumption 3 implies that

\begin{matrix} Ψ^{d} (t) & = & {(Ψ^{d} (\tilde{t}))}^{'} t, \tilde{t} \in (0, t), \\ lim_{t \to 0 +} t^{- 1 / d} Ψ (t) & = & C_{2}, \end{matrix}

(14)

and

lim_{z \to 0 +} z^{- d} ψ (z) = C_{2}^{- d}

with a finite constant

C_{2} > 0

. On the other hand, it follows from (13) that

lim_{z \to 0 +} z^{1 - d} ψ^{'} (z) = lim_{t \to 0 +} Ψ {(t)}^{1 - d} Ψ^{'} {(t)}^{- 1} = d lim_{t \to 0 +} {({(Ψ^{d} (t))}^{'})}^{- 1} = C_{3}

with a finite constant

C_{3} > 0

. Therefore, χ is bounded under Assumption 3.

Example 10.

Let

ψ (z) = - a + {(a^{d} + z^{d})}^{1 / d}

with a constant

a > 0

. Then

Ψ^{d} (t) = {(t + a)}^{d} - a^{d}

,

{(Ψ^{d} (t))}^{'} = d {(t + a)}^{d - 1}

,

\begin{matrix} z^{- 1} ψ^{'} (z) & = & z^{d - 2} {(a^{d} + z^{d})}^{- 1 + 1 / d}, z^{d - 1} ψ^{'} {(z)}^{- 1} = {(a^{d} + z^{d})}^{1 - 1 / d}, \\ {(z^{- 1} ψ^{'} (z))}^{'} & = & {(a^{d} + z^{d})}^{1 / d - 2} z^{d - 3} (a^{d} d - 2 a^{d} - z^{d}), and \\ {(z^{d - 1} ψ^{'} {(z)}^{- 1})}^{'} & = & (d - 1) z^{d - 1} {(a^{d} + z^{d})}^{- 1 / d} . \end{matrix}

Hence, Assumption 3 is satisfied for every

p \geq 2

.
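The closed forms in Example 10 can be checked numerically. The following is a minimal sketch only: the values a = 1.5 and d = 3, the grid, and all function names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def psi(z, a, d):
    """psi(z) = -a + (a^d + z^d)^(1/d), as in Example 10."""
    return -a + (a**d + z**d) ** (1.0 / d)

def Psi(t, a, d):
    """Inverse of psi, from Psi(t)^d = (t + a)^d - a^d."""
    return ((t + a) ** d - a**d) ** (1.0 / d)

def psi_prime(z, a, d):
    """Closed-form derivative psi'(z) = z^(d-1) (a^d + z^d)^(1/d - 1)."""
    return z ** (d - 1) * (a**d + z**d) ** (1.0 / d - 1.0)

a, d = 1.5, 3                      # illustrative values (assumption)
z = np.linspace(0.1, 5.0, 50)

# Psi should invert psi on the grid.
roundtrip_error = np.max(np.abs(Psi(psi(z, a, d), a, d) - z))

# A central finite difference should agree with the closed-form derivative.
h = 1e-6
fd = (psi(z + h, a, d) - psi(z - h, a, d)) / (2 * h)
derivative_error = np.max(np.abs(fd - psi_prime(z, a, d)))

# z^(1-d) psi'(z) tends to a^(1-d) as z -> 0+, a finite positive limit.
z0 = 1e-4
limit_ratio = z0 ** (1 - d) * psi_prime(z0, a, d) / a ** (1 - d)
```

For this ψ the limit of z^{1-d} ψ'(z) at zero equals a^{1-d}, which is the constant C_{3} appearing after Assumption 3.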

Another condition is required now for

h_{K}

.

Assumption 4.

For any bounded subset Q of

R^{d}, 0 \notin Q

, the partial derivatives

G_{1}, \dots, G_{d}

of

h_{K} (.)

exist and are bounded on Q, and

x ⇝ ψ^{'} (h_{K} (x)) G_{j} (x)

is Hölder continuous of order

α > 0.2

on Q for each

j \in {1, \dots, d}

.

Assumption 5.

For any bounded subset Q of

R^{d}, 0 \notin Q

,

h_{K}

is Hölder continuous of order

\bar{α} > 0.2

.

If Assumption 3 is fulfilled, the function

h_{K}

has second order derivatives

\frac{\partial^{2}}{\partial x_{j} \partial x_{J}} h_{K} (x) = G_{j J} (x)

, and these are bounded on bounded subsets of

R^{d}

, then Assumption 4 is satisfied.

Example 11.

We consider the q-norm/antinorm:

h_{K} (x) = {∥x∥}_{q}

,

\begin{matrix} G_{j} (x) & = & \frac{x_{j} {| x_{j} |}^{q - 2}}{{∥x∥}_{q}^{q - 1}} for j = 1, \dots, d, \\ G_{j J} (x) & = & \{\begin{matrix} \frac{(1 - q) x_{j} | x_{j} |^{q - 2} x_{J} {| x_{J} |}^{q - 2}}{{∥x∥}_{q}^{2 q - 1}} for j \neq J, \\ \frac{(q - 1)}{{∥x∥}_{q}^{2 q - 1}} | x_{j} |^{q - 2} \sum_{ν \neq j} {| x_{ν} |}^{q} for j = J . \end{matrix} \end{matrix}

Therefore, Assumption 4 is fulfilled in the case

q > 1.2

, and Assumption 5 is fulfilled in the case

q > 0.2

.
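The gradient formula for the q-norm in Example 11 can be verified against finite differences. This is a sketch under assumptions: the test point, the value q = 1.3 and the helper names are chosen for illustration only.

```python
import numpy as np

def q_norm(x, q):
    """h_K(x) = ||x||_q for q > 0 (a norm for q >= 1, an antinorm otherwise)."""
    return np.sum(np.abs(x) ** q) ** (1.0 / q)

def q_norm_grad(x, q):
    """G_j(x) = x_j |x_j|^(q-2) / ||x||_q^(q-1); used here only where no x_j is zero."""
    return x * np.abs(x) ** (q - 2) / q_norm(x, q) ** (q - 1)

def numeric_grad(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (f(x + e) - f(x - e)) / (2 * h)
    return g

q = 1.3                                   # illustrative value (assumption)
x = np.array([0.7, -1.2, 0.4])            # illustrative point away from the axes

grad_error = np.max(np.abs(q_norm_grad(x, q) - numeric_grad(lambda v: q_norm(v, q), x)))

# Euler's relation for a positively homogeneous function of degree one:
# sum_j x_j G_j(x) = h_K(x).
euler_gap = abs(np.dot(x, q_norm_grad(x, q)) - q_norm(x, q))
```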

Examples 1 and 2: (Continued) One can show that

G_{j} (x)

exists for

x \neq 0, j \in {1, \dots, d}

, and is bounded. Moreover,

ψ^{'} (h_{K} (.)) G_{j}

is Lipschitz continuous on

R^{d} \ {0}

. Hence, Assumption 4 is satisfied.

3.2.4. Properties of the Density Estimator

First we provide the result on strong convergence of the density estimator.

Theorem 2.

Suppose that the p-th order derivative

g^{(p)}

of g exists and is bounded on

[0, \infty)

for some even integer

p \geq 2

. Moreover, assume that condition (9) as well as Assumptions 1 to 3 are satisfied for the given p. Let Assumption 4 or Assumption 5 be satisfied. In the first case define

r_{n} : = \sqrt{ln n} {(n b)}^{- 1 / 2}

, in the latter case

r_{n} : = n^{- \bar{α} / 2} b^{- 1}

. Then, for any compact set D with

μ \notin D

and

n \to \infty

,

sup_{x \in D} |{\hat{φ}}_{n} (x) - φ_{g, K, μ} (x)| = O (r_{n} + b^{p}) a . s .

(15)

For any compact set D with

μ \in D

and

n \to \infty

,

sup_{x \in D} |{\hat{φ}}_{n} (x) - φ_{g, K, μ} (x)| = O (r_{n} + b^{1 / d}) a . s .

Theorem 2 applies in particular to the Euclidean case

h_{K} = {∥.∥}_{2}

. Since Assumption 2 is weaker than the corresponding assumption on the kernel in [7], Theorem 2 extends Theorem 3.1 in [7] even in the case of

h_{K} = {∥.∥}_{2}

. The convergence rate in (15) is the same as that known for one-dimensional kernel density estimators and cannot be improved under the assumptions posed here (cf. [22]).

The next theorem states the asymptotic normality of the estimator

{\hat{φ}}_{n}

.

Theorem 3.

Suppose that the assumptions of Theorem 2 and Assumption 4 are satisfied. Let

x \in R^{d}, x \neq μ

such that

g^{(p)}

is continuous at

\tilde{x} : = h_{K} (x - μ)

.

(i): Define

$\begin{matrix} {\bar{σ}}^{2} (\tilde{x}) & = & O_{S} {(S)}^{- 2} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) g (\tilde{x}) \int_{- 1}^{1} k^{2} (t) d t, \\ Λ (\tilde{x}) & = & O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) \frac{1}{p!} χ^{(p)} (ψ (\tilde{x})) \int_{- 1}^{1} t^{p} k (t) d t . \end{matrix}$

Then

${\hat{φ}}_{n} (x) - φ_{g, K, μ} (x) = Z_{n} + e_{n},$

where $e_{n} = Λ (\tilde{x}) b^{p} + o (b^{p})$ and

$\sqrt{n b} Z_{n} \overset{d}{⟶} N (0, {\bar{σ}}^{2} (\tilde{x})) for n \to \infty .$

(ii): If additionally ${lim}_{n \to \infty} n^{1 / (2 p + 1)} b = C_{4}$ holds true with a constant $C_{4} \geq 0$, then, for $n \to \infty$,

$\sqrt{n b} ({\hat{φ}}_{n} (x) - φ_{g, K, μ} (x)) \overset{d}{⟶} N (C_{4}^{(2 p + 1) / 2} Λ (\tilde{x}), {\bar{σ}}^{2} (\tilde{x})) .$

The assertion of Theorem 3 can be used to construct an asymptotic confidence region for

φ_{g, K, μ} (x)

. The term

e_{n}

describes the asymptotic behaviour of the bias of the estimator

{\hat{φ}}_{n}

whereas the fluctuations of the estimator are represented by

Z_{n}

. In view of Theorem 3,

\sqrt{n b} Z_{n}

converges in distribution to

Z \sim N (0, {\bar{σ}}^{2} (\tilde{x}))

. The mean squared deviation of the leading terms in the asymptotic expansion of

{\hat{φ}}_{n}

is thus given by

E {(n^{- 1 / 2} b^{- 1 / 2} Z + Λ (\tilde{x}) b^{p})}^{2} = n^{- 1} b^{- 1} {\bar{σ}}^{2} (\tilde{x}) + Λ^{2} (\tilde{x}) b^{2 p} .

The minimization of this function w.r.t. b leads to the asymptotically optimal bandwidth

b^{*} = {(\frac{{\bar{σ}}^{2} (\tilde{x})}{2 p Λ^{2} (\tilde{x}) n})}^{1 / (2 p + 1)} .

(16)

The bandwidth

b^{*}

converges at rate

n^{- 1 / (2 p + 1)}

to zero. Under the conditions of Theorem 3(ii),

{\hat{φ}}_{n} (x) - φ_{g, K, μ} (x)

has the convergence rate

n^{- p / (2 p + 1)}

. This convergence rate of

{\hat{φ}}_{n}

is better than that of a nonparametric density estimator but slower than the usual rate

n^{- 1 / 2}

for parametric estimators. In principle, Formula (16) could be used for the optimal choice of the bandwidth. However, one would then need an estimator for

χ^{(p)}

and, typically, estimators of derivatives of densities do not perform well unless n is very large. As a remedy, one can consider a bandwidth that refers to a specific radius distribution.
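To make the trade-off behind Formula (16) concrete, the sketch below evaluates the asymptotic mean squared deviation n^{-1} b^{-1} σ̄² + Λ² b^{2p} and compares the closed-form minimizer with a grid search. The numerical values of σ̄², Λ, n and p are placeholders chosen for illustration; in practice they would have to be estimated or taken from a reference model.

```python
import numpy as np

def amse(b, n, sigma2, lam, p):
    """Leading-term mean squared deviation: variance part plus squared bias part."""
    return sigma2 / (n * b) + lam**2 * b ** (2 * p)

def optimal_bandwidth(n, sigma2, lam, p):
    """Closed-form minimizer b* = (sigma2 / (2 p lam^2 n))^(1 / (2p + 1))."""
    return (sigma2 / (2 * p * lam**2 * n)) ** (1.0 / (2 * p + 1))

n, sigma2, lam, p = 1000, 0.8, 0.5, 2     # placeholder values (assumption)
b_star = optimal_bandwidth(n, sigma2, lam, p)

# Grid search around b* as an independent check of the closed form.
grid = np.linspace(0.5 * b_star, 1.5 * b_star, 2001)
b_grid = grid[np.argmin(amse(grid, n, sigma2, lam, p))]

# Doubling n shrinks b* by the factor 2^(-1 / (2p + 1)).
shrink = optimal_bandwidth(2 * n, sigma2, lam, p) / b_star
```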

To illustrate how the estimators work in practice, we simulated data from a q-norm distribution with

q = 1.3

and the radius distribution to be the modified exponential distribution with

τ = 1.1

. Figures 7 and 8 show graphs of the underlying function g and its estimator for two sample sizes.

Figure 7. Estimator of g (solid line) and the model function (dashed line) for

n = 1000

.

Figure 8. Estimator of g (solid line) and the model function (dashed line) for

n = 10,000

.

3.2.5. Reference Bandwidth

Let us consider an estimator

{\hat{φ}}_{n} (x)

with Epanechnikov kernel, function ψ as in Example 10, and modified exponential radius density in the case

p = 2

. According to (16), the reference bandwidth is then

b^{*} = {(15 (d + 1) (d - 1)! e^{\tilde{x} / τ} {\tilde{x}}^{4 d} τ^{5 / d} {(1 + {\tilde{x}}^{d})}^{- 1 + 5 / d} (\tilde{x} + τ) {(O_{S} (S) D^{2} n)}^{- 1})}^{1 / 5},

where

\begin{matrix} D & = & {\tilde{x}}^{2} (\tilde{x} + (d - 2) τ) + {\tilde{x}}^{d + 2} (2 \tilde{x} - (d + 1) τ) \\ + {\tilde{x}}^{2 d} ({\tilde{x}}^{3} + (d - 2) (d - 1) (\tilde{x} + τ) τ^{2} + {\tilde{x}}^{2} τ (1 - 2 d)) . \end{matrix}

This formula was generated using the computer algebra system Mathematica. The parameter τ can be estimated by utilizing the above Formula (6) for the expectation of the radius.

3.3. Semiparametric Estimators Involving a Scale and a Parameter Fit

In this section we consider the situation where the contour of the body K depends on scale parameters

σ_{1}, \dots, σ_{d}

. Suppose that

I (g, d) = 1

. We introduce the diagonal matrix

Σ = diag (σ_{1}, \dots, σ_{d})

and a master body

K_{0}

, which is symmetric w.r.t. the origin. Define

K = Σ K_{0} : = {Σ x : x \in K_{0}},

and

\tilde{U} = Σ^{- 1} U

. The distribution of

\tilde{U}

is concentrated on the boundary

S_{0}

of

K_{0}

. We assume that

K_{0}

is given such that

E {\tilde{U}}_{j}^{2} = 1 for j = 1, \dots, d .

(17)

Otherwise,

K_{0}

is rescaled. Suppose that

h_{K_{0}}

depends on a further parameter vector

θ \in Θ

where the parameter space

Θ \subset R^{q}

is a compact set. Then

h_{K} (x) = h_{K_{0}} (θ, Σ^{- 1} x) for x \in R^{d} .

The parameter vector θ is able to describe the shape of the boundary of body K, see Examples 1 and 2 (parameters

a_{1}

and

a_{2}

). From Lemma 1, we obtain

\begin{matrix} V (X^{(j)}) & = & σ_{j}^{2} E {\tilde{U}}_{j}^{2} E R^{2} = σ_{j}^{2} E R^{2}, \\ C ov (X^{(j)}, X^{(k)}) & = & σ_{j} σ_{k} E {\tilde{U}}_{j} {\tilde{U}}_{k} E R^{2}, \end{matrix}

and

ρ_{j k} = C orr (X^{(j)}, X^{(k)}) = E {\tilde{U}}_{j} {\tilde{U}}_{k} for j, k = 1, \dots, d .

Here we see that (17) results in

V (X^{(j)}) = σ_{j}^{2} E R^{2}

. The density is given by

φ_{g, K, μ} (x) = O_{S_{0}} {(S_{0})}^{- 1} det {(Σ)}^{- 1} g (h_{K_{0}} (θ, Σ^{- 1} (x - μ))), x \in R^{d} .

In this context, a scaling problem occurs concerning g. Assume that g is a suitably given generator function satisfying

I (g, d) = 1

. Then

x ⇝ g_{t}^{*} (x) : = t^{d} g (t x)

is a modified generator for every

t > 0

with

I (g_{t}^{*}, d) = 1

. For any

t > 0

, we obtain the same model when g is replaced by

g_{t}^{*}

and

σ_{j}

is replaced by

σ_{j} t

for

j = 1, \dots, d

. To get uniqueness, we choose t such that

E R^{2} = \int_{0}^{\infty} r^{d + 1} g_{t}^{*} (r) d r = t^{- 2} \int_{0}^{\infty} r^{d + 1} g (r) d r = 1 .

Let

{\hat{μ}}_{n} = {({\hat{μ}}_{n 1}, \dots, {\hat{μ}}_{n d})}^{T}

as above. Then

σ_{j}^{2}

represents the variance of the j-th component of X. Based on this property, the sample variances of the components of X can be used as estimators for

σ_{j}^{2}

:

{\hat{σ}}_{n j}^{2} = \frac{1}{n - 1} \sum_{i = 1}^{n} {(X_{i j} - {\hat{μ}}_{n j})}^{2} for j = 1, \dots, d .

Moreover, we have the sample correlations

{\hat{ρ}}_{n j k} = {\hat{σ}}_{n j}^{- 1} {\hat{σ}}_{n k}^{- 1} \frac{1}{n - 1} \sum_{i = 1}^{n} (X_{i j} - {\hat{μ}}_{n j}) (X_{i k} - {\hat{μ}}_{n k}) for j, k = 1, \dots, d .
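The scale and correlation estimators above translate directly into array operations. The following sketch assumes that the location estimator is the vector of componentwise sample means; the function name and the simulated data are hypothetical and serve only as an illustration.

```python
import numpy as np

def scale_and_correlation(X):
    """Return (mu_hat, Sigma_hat, rho_hat) for an (n, d) data matrix X.

    mu_hat is taken to be the componentwise sample mean (assumption),
    Sigma_hat is the diagonal matrix of sample standard deviations,
    rho_hat is the matrix of sample correlations.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mu_hat = X.mean(axis=0)
    centered = X - mu_hat
    sigma_hat = np.sqrt((centered**2).sum(axis=0) / (n - 1))
    cov_hat = centered.T @ centered / (n - 1)
    rho_hat = cov_hat / np.outer(sigma_hat, sigma_hat)
    return mu_hat, np.diag(sigma_hat), rho_hat

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2)) @ np.array([[1.0, 0.6], [0.0, 0.8]])  # simulated data
mu_hat, Sigma_hat, rho_hat = scale_and_correlation(X)
```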

In the following we use the notation

{\hat{Σ}}_{n} =

diag

({\hat{σ}}_{n 1}, \dots, {\hat{σ}}_{n d})

. If θ is unknown, we consider moment estimators based on the correlations. For this we need the following assumptions.

Assumption 6.

Let I be a subset of

{(j, k) : j, k = 1, \dots, d, j < k}

with cardinality q. There is a vector

ρ = {(ρ_{j k})}_{(j, k) \in I} \in R^{q}

such that for

l = 1, \dots, q

,

θ_{l} = γ_{l} (ρ) .

Assume that

γ_{l} : C \to Θ

has bounded partial derivatives,

ρ \in C

and θ is an interior point of Θ.

Assumption 7.

For any bounded subset Q of

R^{d}, 0 \notin Q

, the partial derivatives

{\tilde{G}}_{1}, \dots, {\tilde{G}}_{q}

,

G_{1}, \dots, G_{d}

of

(θ, x) ⇝ h_{K_{0}} (θ, x)

exist, are bounded for

θ \in Θ, x \in Q

, and

(θ, x) ⇝ ψ^{'} (h_{K_{0}} (θ, x)) G_{j} (θ, x)

,

ψ^{'} (h_{K_{0}} (θ, x)) {\tilde{G}}_{j} (θ, x)

is Hölder continuous of order

α > 0.2

on

Θ \times Q

for each

j \in {1, \dots, d}

.

Assumption 8.

For any bounded subset Q of

R^{d}, 0 \notin Q

,

h_{K_{0}}

is Hölder continuous of order

\bar{α} > 0.2

.

Examples 1 and 2: (Continued) Similarly as above, it can be proven that Assumption 7 is fulfilled.

Let

\hat{ρ}

be the sample version of ρ. Then

{\hat{θ}}_{n l} = γ_{l} (\hat{ρ}) for l = 1, \dots, q

is the estimator for θ,

{\hat{θ}}_{n} = {({\hat{θ}}_{n l})}_{l = 1 \dots q}

. Define

{\hat{R}}_{i} = {\hat{Σ}}_{n}^{- 1} (X_{i} - {\hat{μ}}_{n})

. With this definition,

{\hat{F}}_{n}^{R}

is determined according to Formula (8). The following result on the convergence rate of

{\hat{F}}_{n}^{R}

can be proven:

Theorem 4.

Suppose that Assumptions 1 and 6 are satisfied, and

\int_{0}^{\infty} r^{d + 3} g (r) d r < + \infty .

(18)

Let

r ⇝ r f_{R} (r)

be bounded on

[0, + \infty)

. Then, for

n \to \infty

,

sup_{r \geq 0} |{\hat{F}}_{n}^{R} (r) - F_{R} (r)| = O (\sqrt{\frac{ln ln (n)}{n}}) a . s .

In this section the transformed sample

Y_{1 n}, \dots, Y_{n n}

is given by

Y_{i n} = ψ (h_{K_{0}} ({\hat{θ}}_{n}, {\hat{R}}_{i}))

with ψ as in Section 3.2. The estimator

{\hat{g}}_{n}

for the generator g is calculated using Formulas (10) and (11) from the previous section. The following estimator for the density has thus been established:

{\hat{φ}}_{n} (x) = O_{S_{0}} {(S_{0})}^{- 1} det {({\hat{Σ}}_{n})}^{- 1} {\hat{g}}_{n} (h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (x - {\hat{μ}}_{n})))

(19)

The next two theorems show the results concerning strong convergence and asymptotic normality of the density estimator:

Theorem 5.

Suppose that the p-th order derivative

g^{(p)}

of g exists and is bounded on

[0, \infty)

for some even integer

p \geq 2

. Where needed, with this p, assume further that Assumptions 1, 2, 3, 6, (1) and (18) are satisfied. Let Assumption 7 or Assumption 8 be satisfied, and define in the first case

r_{n} : = \sqrt{ln n} {(n b)}^{- 1 / 2}

and in the latter case

r_{n} : = n^{- \bar{α} / 2} b^{- 1}

. Then the claim of Theorem 2 holds true for the estimator

{\hat{φ}}_{n}

defined in (19).

Theorem 6.

Suppose that the assumptions of Theorem 5 are satisfied. Let

x \in R^{d}, x \neq μ

such that

g^{(p)}

is continuous at

\tilde{x} : = h_{K_{0}} (θ, Σ^{- 1} (x - μ))

. Assume that

{lim}_{n \to \infty} n^{1 / (2 p + 1)} b = C_{4}

holds true with a constant

C_{4} \geq 0

. Then for

n \to \infty

,

\sqrt{n b} ({\hat{φ}}_{n} (x) - φ_{g, K, μ} (x)) \overset{d}{⟶} N (C_{4}^{(2 p + 1) / 2} Λ (\tilde{x}), {\bar{σ}}^{2} (\tilde{x})),

where

{\hat{φ}}_{n}

is defined in (19),

\begin{matrix} {\bar{σ}}^{2} (\tilde{x}) & = & O_{S_{0}} {(S_{0})}^{- 2} det {(Σ)}^{- 2} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) g (\tilde{x}) \int_{- 1}^{1} k^{2} (t) d t, \\ Λ (\tilde{x}) & = & O_{S_{0}} {(S_{0})}^{- 1} det {(Σ)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) \frac{1}{p!} χ^{(p)} (ψ (\tilde{x})) \int_{- 1}^{1} t^{p} k (t) d t . \end{matrix}

The remarks following Theorems 2 and 3 apply analogously.

3.4. Applications

For many decades, statistical applications of multivariate distribution theory were mainly based upon Gaussian and elliptically contoured distributions. Studies using non-elliptically contoured star-shaped distributions have mostly been carried out during the last two decades and deal in most cases with p-generalized normal distributions. Such distributions are convex or radially concave contoured if

p \geq 1

or

0 < p \leq 1,

respectively, and are also called power exponential distributions. Moreover, the common elliptically contoured power exponential (ecpe) distributions form a particular subclass of the wide class of star-shaped distributions, which allows the modeling of much more flexible contours than elliptical ones.

The class of ecpe distributions is used in a crossover trial on insulin applied to rabbits in [23], in image denoising in [24] and in colour texture retrieval in [25]. Applications of multivariate g-and-h distributions to jointly modeling body mass index and lean body mass are demonstrated in [26] and accompanied by star-shaped contoured density illustrations. The

l_{n, p}

-elliptically contoured distributions form another large class of star-shaped distributions and are used in [27] to explore to what extent orientation selectivity and contrast gain control can be used to model the statistics of natural images. Mixtures of ecpe distributions are considered for bioinformation data sets in [28]. Texture retrieval using the p-generalized Gaussian densities is studied in [29]. A random vector modeling data from quantitative genetics presented in [30] is shown in [31] to be more likely to have a power exponential distribution different from a normal one. The reconstruction of the signal induced by cosmic strings in the cosmic microwave background, from radio-interferometric data, is made in [32] based upon generalized Gaussian distributions. These distributions are also used in [33] for voice detection.

More recently, the considerations in [11] opened a new field of financial applications of more general star-shaped asymptotic distributions, where suitably scaled sample clouds converge onto a deterministic set.

Figure 3 in [34] represents a sample cloud which might be modeled with a density being star-shaped w.r.t. a fan having six cones that include sample points and other cones that do not. Note, however, that Figure 1 d-f in the same paper do not reflect a homogeneous density but might be compared in some sense to the level sets of the characteristic functions of certain polyhedral star-shaped distributions in [16], Figure 5.2.

When modeling Lymphoma data, the authors of [35] analyze sample clouds of points, see Figures 2 and 3, which might be interpreted as mixtures of densities whose contours partly resemble those in [36], where flow cytometric data, Australian Institute of Sport data and Iris data are analyzed, or those of certain skewed densities as they were (analytically derived and) drawn in [37]. In a similar manner, Figures 2 and 5 in [20] indicate that mixtures of different types of star-shaped distributions might be suitable for modeling residuals of certain stock exchange indices. It could be of interest to study possible connections between all of these models more closely in future work.

The following numerical examples are intended to illustrate agricultural and financial applications of the estimators described in this paper. To this end, we make use of the new particular non-elliptically contoured but star-shaped distribution class introduced in Section 2.2 of the present paper.

Example 12.

Example 2 of Section 2.2 continued. We consider the class of bodies K of Example 2. Let

a_{2} = 1

. Figure 9 shows the dependence of the correlation on the parameter

a_{1}

.

Figure 9. Function

γ_{1}^{- 1}

with

a_{1}

on the x-axis, ρ on the y-axis;

γ_{1}

is defined in Assumption 6.

Here we apply the methods described above to dataset 5 of [38]. The yields of grain and straw are the two variables. The sample correlation is 0.73077. Starting from that value, we can calculate the moment estimator for

a_{1}

:

{\hat{a}}_{1} = 0.83641

. Moreover, we obtain

\begin{matrix} O_{S_{0}} (S_{0}) & = & 2.6406, {\hat{μ}}_{1} = 3.9480, {\hat{μ}}_{2} = 6.5148 \\ {\hat{σ}}_{1} & = & 0.45798, {\hat{σ}}_{2} = 0.89831 . \end{matrix}

The data and the shape of the estimated multivariate density are depicted in Figure 10 and Figure 11.

Figure 11. Estimated generator function g at left (bandwidth

b = 0.5

) and contour plot of

\hat{φ}

at right.

Example 13.

We want to illustrate the potential of our approach for applications to financial data and consider daily index data from Morgan Stanley Capital International (MSCI) for Germany and the UK for the period August 2011 to June 2016. The data are the continuous daily returns, computed as the logarithm of the ratio of two subsequent index values. The modelling of MSCI data using elliptical models is considered in [5]. The data are depicted in Figure 12. A visual inspection seems to give some preference to our model from Section 2.2 over the elliptically contoured model. Figure 13 and Figure 14 show the estimated model for the data. The basic numerical results are:

\begin{matrix} O_{S_{0}} (S_{0}) & = & 2.1966, {\hat{a}}_{1} = 1.2088, {\hat{μ}}_{1} = 0.00026725, {\hat{μ}}_{2} = 0.00026855 \\ {\hat{σ}}_{1} & = & 0.013519, {\hat{σ}}_{2} = 0.011715 . \end{matrix}
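The return definition used in this example, the logarithm of the ratio of two subsequent index values, can be sketched as follows. The index levels below are made-up numbers for illustration and are not MSCI data; the function name is hypothetical.

```python
import numpy as np

def log_returns(index_values):
    """Continuous returns log(v_t / v_{t-1}) along the first axis."""
    v = np.asarray(index_values, dtype=float)
    return np.log(v[1:] / v[:-1])

levels = np.array([100.0, 101.0, 99.5, 100.2])   # made-up index levels (assumption)
r = log_returns(levels)

# Log returns are additive: their sum equals the log of the total ratio.
total = r.sum()
```

For two indices observed on the same days, passing an array of shape (T, 2) yields the (T - 1, 2) matrix of bivariate returns to which the estimators would be applied.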

Figure 12. Scatter plot of the MSCI data.

Figure 13. Contour plot of the estimated density.

Figure 14. Estimated generator function g (bandwidth

b = 0.25

).

We now proceed to prove the results.

4. Proofs

Throughout the remainder of the paper, suppose that Assumptions 1–3 are satisfied for some integer

p \geq 2

. First, we prove auxiliary statements which are used in the proof of strong convergence of

{\hat{φ}}_{n}

and later.

4.1. Proof of Auxiliary Statements

The following Lemma 2 clarifies the asymptotic behaviour of χ in the neighbourhood of zero.

Lemma 2.

Suppose that

g^{'}

exists and is bounded. Then

sup_{t, u \in [0, \bar{M}]} |χ (t) - χ (u)| {| t - u |}^{- 1 / d} < + \infty

for any

\bar{M} > 0

.

Proof.

Observe that by the Lipschitz continuity of the functions g and

z ⇝ z^{d - 1} ψ^{'} {(z)}^{- 1}

in view of Assumption 3,

\begin{matrix} |χ (t) - χ (u)| \\ \leq & C \cdot (\frac{Ψ {(t)}^{d - 1}}{ψ^{'} (Ψ (t))} |g (Ψ (t)) - g (Ψ (u))| + |\frac{Ψ {(t)}^{d - 1}}{ψ^{'} (Ψ (t))} - \frac{Ψ {(u)}^{d - 1}}{ψ^{'} (Ψ (u))}| g (Ψ (u))) \\ \leq & C \cdot |Ψ (t) - Ψ (u)| \end{matrix}

(20)

uniformly for

t, u \in [0, \bar{M}]

. Here and in the following C is a generic constant which may differ from formula to formula. By assumption, we have

|Ψ^{d} (t) - Ψ^{d} (u)| \leq C \cdot | t - u | .

On the other hand,

sup_{t, u \geq 0} \frac{| t^{1 / d} - u^{1 / d} |}{{| t - u |}^{1 / d}} < + \infty .

Hence

|Ψ (t) - Ψ (u)| \leq C \cdot {|Ψ^{d} (t) - Ψ^{d} (u)|}^{1 / d} \leq C \cdot {| t - u |}^{1 / d} .

In view of (20), the proof is complete. ☐
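The elementary bound on the supremum used in the proof of Lemma 2 can be justified by subadditivity of the map s ⇝ s^{1/d}. A short worked derivation, assuming d ≥ 1 and, without loss of generality, t ≥ u ≥ 0:

```latex
% s \mapsto s^{1/d} is concave on [0, \infty) and vanishes at s = 0, hence it is subadditive:
t^{1/d} = \bigl( (t - u) + u \bigr)^{1/d} \leq (t - u)^{1/d} + u^{1/d},
\qquad \text{so} \qquad
0 \leq t^{1/d} - u^{1/d} \leq |t - u|^{1/d} .
```

Consequently the supremum in question is at most 1.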

Lemma 3.

Let (9) be fulfilled. Then, for

n \to \infty

,

{∥{\hat{μ}}_{n} - μ∥}_{2} = O (\sqrt{\frac{ln ln (n)}{n}}) a . s .

If in addition (18) is satisfied then, for

n \to \infty

,

max_{j = 1, \dots, d} |{\hat{σ}}_{n j} - σ_{j}| = O (\sqrt{\frac{ln ln (n)}{n}}) a . s .

Proof.

Because of (9), the law of iterated logarithm applies and leads to

\underset{n \to \infty}{lim sup} \sqrt{\frac{n}{ln ln n}} |{\hat{μ}}_{n j} - μ_{j}| = \sqrt{2 V X_{j}} a . s .

(21)

Since there is a constant

C > 0

such that

{∥y∥}_{2} \leq C {∥y∥}_{\infty}

for all

y \in R^{d}

in view of the norm equivalence property, the first part of the lemma follows from (21). The second part can be shown similarly. ☐

In several places, we will use the following property:

Lemma 4.

Suppose that

Λ : R^{d} \to R

is a measurable function with

Λ (- x) = - Λ (x)

. Then

E (Λ (X - μ)) = 0

Proof.

Since

- X + μ

has the same distribution as

X - μ

, we have

\begin{matrix} E (Λ (X - μ)) & = & E (Λ (X - μ) 1 {X_{1} - μ_{1} < 0}) + E (Λ (X - μ) 1 {X_{1} - μ_{1} > 0}) \\ = & E (Λ (- X + μ) 1 {X_{1} - μ_{1} > 0}) + E (Λ (X - μ) 1 {X_{1} - μ_{1} > 0}) = 0 \end{matrix}

which proves the lemma. ☐

4.2. Proving Convergence of ${\hat{F}}_{n}^{R}$

In this section we prove Theorem 1. The law of the iterated logarithm for the empirical process states (cf., for example, [39], p. 268) that

Δ_{n} : = sup_{r \in R} |F_{n}^{R} (r) - F_{R} (r)| = O (\sqrt{\frac{ln ln (n)}{n}}) a . s .

By Lipschitz-continuity of

h_{K}

and Lemma 3,

\begin{matrix} {\bar{Δ}}_{n} & : & = sup_{x \in R^{d}} |h_{K} (x - {\hat{μ}}_{n}) - h_{K} (x - μ)| \\ \leq & C {∥{\hat{μ}}_{n} - μ∥}_{2} = O (\sqrt{\frac{ln ln (n)}{n}}) a . s . \end{matrix}

Moreover,

\begin{matrix} {\hat{F}}_{n}^{R} (r) - F_{R} (r) & = & \frac{1}{n} \sum_{i = 1}^{n} (1 \{h_{K} (X_{i} - {\hat{μ}}_{n}) \leq r\} - 1 \{h_{K} (X_{i} - μ) \leq r\}) + F_{n}^{R} (r) - F_{R} (r) \\ \leq & \frac{1}{n} \sum_{i = 1}^{n} (1 \{h_{K} (X_{i} - {\hat{μ}}_{n}) \leq r, h_{K} (X_{i} - μ) > r\} \\ - 1 \{h_{K} (X_{i} - μ) \leq r, h_{K} (X_{i} - {\hat{μ}}_{n}) > r\}) + F_{n}^{R} (r) - F_{R} (r) \\ \leq & \frac{1}{n} \sum_{i = 1}^{n} 1 \{r < h_{K} (X_{i} - μ) \leq r + {\bar{Δ}}_{n}\} + Δ_{n}, \end{matrix}

and

{\hat{F}}_{n}^{R} (r) - F_{R} (r) \geq - \frac{1}{n} \sum_{i = 1}^{n} 1 \{r - {\bar{Δ}}_{n} < h_{K} (X_{i} - μ) \leq r\} - Δ_{n} .

Hence, by the boundedness of

f_{R}

,

\begin{matrix} sup_{r \geq 0} |{\hat{F}}_{n}^{R} (r) - F_{R} (r)| \\ \leq & \frac{1}{n} sup_{r \geq 0} \sum_{i = 1}^{n} 1 \{r - {\bar{Δ}}_{n} < h_{K} (X_{i} - μ) \leq r + {\bar{Δ}}_{n}\} + Δ_{n} \\ = & sup_{r \geq 0} (F_{n}^{R} (r + {\bar{Δ}}_{n}) - F_{n}^{R} (r - {\bar{Δ}}_{n})) + Δ_{n} \\ \leq & sup_{r \geq 0} (F_{R} (r + {\bar{Δ}}_{n}) - F_{R} (r - {\bar{Δ}}_{n})) + 3 Δ_{n} \\ \leq & 2 sup_{r \geq 0} f_{R} (r) {\bar{Δ}}_{n} + O (\sqrt{\frac{ln ln n}{n}}) = O (\sqrt{\frac{ln ln n}{n}}) a . s ., \end{matrix}

which leads to the theorem. ☐
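The estimator analysed in this proof is the empirical distribution function of the values h_{K}(X_{i} - μ̂_{n}). A minimal sketch, assuming the sample mean as location estimator and using the Euclidean norm as a stand-in for h_{K}; the function name and the simulated sample are hypothetical:

```python
import numpy as np

def radius_ecdf(X, r, h_K, mu_hat=None):
    """Empirical cdf of h_K(X_i - mu_hat), evaluated at the points r."""
    X = np.asarray(X, dtype=float)
    if mu_hat is None:
        mu_hat = X.mean(axis=0)          # sample mean as location estimate (assumption)
    radii = np.array([h_K(x - mu_hat) for x in X])
    r = np.atleast_1d(np.asarray(r, dtype=float))
    # Fraction of estimated radii not exceeding each evaluation point.
    return (radii[None, :] <= r[:, None]).mean(axis=1)

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))        # simulated sample for illustration
F = radius_ecdf(X, [0.0, 1.0, 100.0], lambda v: np.linalg.norm(v))
```

Any other positively homogeneous function, for instance a q-norm, can be passed as `h_K` in the same way.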

4.3. Proving Strong Convergence of the Density Estimator

Let

{\tilde{Y}}_{i} = ψ (h_{K} (X_{i} - μ))

for

i = 1, \dots, n

,

K_{b} (y, t) = k ((y - t) / b) + k ((y + t) / b)

for

y, t \geq 0

,

Y_{i n} = ψ (h_{K} (X_{i} - {\hat{μ}}_{n}))

, and

{\tilde{χ}}_{n} (y) = \frac{1}{n b} \sum_{i = 1}^{n} K_{b} (y, {\tilde{Y}}_{i})

for

y \geq 0

. Then (cf. (10))

{\hat{χ}}_{n} (y) = \frac{1}{n b} \sum_{i = 1}^{n} K_{b} (y, Y_{i n}) .
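The reflection-type kernel sum above can be sketched in a few lines. The Epanechnikov kernel is used here as one admissible choice of k, and an exponential sample stands in for the transformed values Y_{i n}; both, as well as the function names, are illustrative assumptions.

```python
import numpy as np

def epanechnikov(t):
    """k(t) = 0.75 (1 - t^2) on [-1, 1] and zero elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def chi_hat(y, Y, b, kernel=epanechnikov):
    """(1 / (n b)) * sum_i [k((y - Y_i) / b) + k((y + Y_i) / b)] for y >= 0."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    Y = np.asarray(Y, dtype=float)
    u_minus = (y[:, None] - Y[None, :]) / b
    u_plus = (y[:, None] + Y[None, :]) / b     # reflection at the boundary 0
    return (kernel(u_minus) + kernel(u_plus)).sum(axis=1) / (Y.size * b)

rng = np.random.default_rng(0)
Y = rng.exponential(scale=1.0, size=200)       # stand-in for the sample Y_{in}
b = 0.3
grid = np.linspace(0.0, Y.max() + 1.0, 4001)
vals = chi_hat(grid, Y, b)

# Trapezoidal rule: the reflection term keeps the total mass on [0, inf) close to 1.
dy = grid[1] - grid[0]
mass = np.sum(0.5 * (vals[:-1] + vals[1:])) * dy
```

The second kernel term mirrors each observation at zero, so that no probability mass is lost below the boundary of the support.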

Next we prove strong convergence rates for

{\hat{χ}}_{n}

and later for

{\hat{φ}}_{n}

. Throughout this section we suppose that Assumptions 1 to 3 and (9) are fulfilled for some even integer

p \geq 2

. The compact set

[m, M]

with arbitrary m and M,

0 \leq m < M

can be covered with closed intervals

U_{1}, \dots, U_{n}

of length

(M - m) n^{- 1}

and centres

u_{1}, \dots, u_{n}

such that

⋃_{i = 1}^{n} U_{i} = [m, M]

. The constants m and M will be specified later. Note that

\begin{matrix} sup_{y \in [m, M]} |{\hat{χ}}_{n} (y) - χ (y)| & \leq & max_{l = 1, \dots, n} (sup_{y \in U_{l}} |{\hat{χ}}_{n} (y) - {\hat{χ}}_{n} (u_{l})| + |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| \\ + |{\tilde{χ}}_{n} (u_{l}) - χ (u_{l})| + sup_{y \in U_{l}} |χ (u_{l}) - χ (y)|) . \end{matrix}

(22)

The asymptotic behaviour of the right hand side in (22) as

n \to \infty

is analyzed term by term in the next lemmas.

Lemma 5.

Assume that the p-th order derivative

χ^{(p)}

of χ exists for some even integer

p \geq 2

and is bounded on every finite closed subinterval of

(0, \infty)

. Let

g^{'}

be bounded. Then

max_{l = 1, \dots, n} |{\tilde{χ}}_{n} (u_{l}) - χ (u_{l})| = O (\sqrt{ln n} {(n b)}^{- 1 / 2} + β_{n}) a . s .

where

β_{n} = b^{p}

if

m > 0

and

β_{n} = b^{1 / d}

if

m = 0

.

The proof of this lemma is omitted since, with minor changes, this lemma can be proven in the same way as Lemma 4.4 in [7]. The following lemma is used later several times in proofs of almost sure convergence rates. We provide it without proof. The proof is almost identical to that of Lemma 4.6 in [7].

Lemma 6.

Assume that χ is bounded. Let

\bar{k}, λ : R \to R

be bounded measurable functions with

\bar{k} (t) = 0

for

t : | t | > 1

. Then

\begin{matrix} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} (U_{n i l} - E U_{n i l})| & = & O (\sqrt{n b ln (n)}) a . s ., \\ max_{l = 1, \dots, n} |\sum_{i = 1}^{n} ({\bar{U}}_{n i l} - E {\bar{U}}_{n i l})| & = & O (\sqrt{n b ln (n)}) a . s ., \\ max_{l = 1, \dots, n} |\sum_{i = 1}^{n} (V_{n i l} - E V_{n i l})| & = & O (\sqrt{n b ln (n)}) a . s . \end{matrix}

where $U_{n i l} : = \bar{k} ((u_{l} - {\tilde{Y}}_{i}) / b) 1 {| {\tilde{Y}}_{i} - u_{l} | \leq b - w_{n}} λ (X_{i})$ ,
${\bar{U}}_{n i l} : = \bar{k} ((u_{l} + {\tilde{Y}}_{i}) / b) 1 {| {\tilde{Y}}_{i} - u_{l} | \leq b - w_{n}} λ (X_{i})$ ,
$V_{n i l} : = 1 {b - w_{n} < | {\tilde{Y}}_{i} - u_{l} | < b + w_{n}}$ .

We proceed with proving convergence rates of the terms in (22).

Lemma 7.

Suppose that Assumption 4 or 5 is satisfied. Then, as

n \to \infty,

max_{l = 1, \dots, n} |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| = \{\begin{matrix} o ({(n b)}^{- 1 / 2}) under Assumption 4, \\ O (b^{- 1} {(n^{- 1} ln ln n)}^{\bar{α} / 2}) under Assumption 5 \end{matrix} a . s .

Proof.

(a) Let Assumption 4 be satisfied. In view of Lemma 3, we obtain

\begin{matrix} |Y_{i n} - {\tilde{Y}}_{i}| & \leq & sup_{t \in [0, \infty)} |ψ^{'} (t)| |h_{K} (X_{i} - {\hat{μ}}_{n}) - h_{K} (X_{i} - μ)| \\ \leq & C_{5} \cdot \sqrt{\frac{ln ln n}{n}} = : w_{n} \end{matrix}

with a suitable constant

C_{5} > 0

for

n \geq n_{1} (ω)

. We introduce

κ_{n} (u, t) = (k^{'} ((u - t) / b) - k^{'} ((u + t) / b)) 1 (|u - t| \leq b - w_{n}) .

Let

\bar{ψ} (z) : = z^{- 1} ψ^{'} (z)

. Observe that

k^{'}

is bounded and Lipschitz continuous on

[- 1, 1]

,

ψ^{'}, \bar{ψ}

and

{\bar{ψ}}^{'}

are bounded on

[0, + \infty)

, functions

G_{j}

are bounded, and functions

ψ^{'} (h_{K} (.)) G_{j}

are Hölder continuous of order

α > 0.2

. We have then by Taylor expansion

\begin{matrix} k (\frac{u - Y_{i n}}{b}) + k (\frac{u + Y_{i n}}{b}) - k (\frac{u - {\tilde{Y}}_{i}}{b}) - k (\frac{u + {\tilde{Y}}_{i}}{b}) \\ = & - \frac{1}{b} (k^{'} ((u - {\tilde{Y}}_{i}) / b) - k^{'} ((u + {\tilde{Y}}_{i}) / b)) ψ^{'} (h_{K} ({\bar{X}}_{i})) \\ \times \sum_{j = 1}^{d} G_{j} ({\bar{X}}_{i}) ({\hat{μ}}_{n j} - μ_{j}) 1 \{|{\tilde{Y}}_{i} - u| \leq b - w_{n}\} + W_{n i} (u), \end{matrix}

where

\begin{matrix} |W_{n i} (u)| & \leq & C (b^{- 2} w_{n}^{2} + b^{- 1} w_{n}^{1 + α}) 1 \{|{\tilde{Y}}_{i} - u| \leq b - w_{n}\} \\ + C b^{- 1} w_{n} 1 \{b - w_{n} < |{\tilde{Y}}_{i} - u| < b + w_{n}\} a . s . \end{matrix}

uniformly w.r.t.

u \in [0, M]

. Here we have used Assumption 4 and Lipschitz continuity of k on

R

. This leads to

max_{l = 1, \dots, n} |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| \leq C \sum_{j = 1}^{d} B_{1 n j} |{\hat{μ}}_{n j} - μ_{j}| + B_{2 n} + B_{3 n},

(23)

where

\begin{matrix} B_{1 n j} & = & n^{- 1} b^{- 2} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} κ_{n} (u_{l}, {\tilde{Y}}_{i}) ψ^{'} (h_{K} ({\bar{X}}_{i})) G_{j} ({\bar{X}}_{i})|, \\ B_{2 n} & = & C n^{- 1} (b^{- 3} w_{n}^{2} + b^{- 2} w_{n}^{1 + α}) max_{l = 1, \dots, n} \sum_{i = 1}^{n} 1 \{|{\tilde{Y}}_{i} - u_{l}| \leq b - w_{n}\}, \\ B_{3 n} & = & C n^{- 1} b^{- 2} w_{n} max_{l = 1, \dots, n} \sum_{i = 1}^{n} 1 \{b - w_{n} < |{\tilde{Y}}_{i} - u_{l}| < b + w_{n}\} a . s . \end{matrix}

Note that

\begin{matrix} B_{1 n j} & = & n^{- 1} b^{- 2} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} κ_{n} (u_{l}, {\tilde{Y}}_{i}) ψ^{'} (h_{K} ({\bar{X}}_{i})) G_{j} ({\bar{X}}_{i}) \\ - \sum_{i = 1}^{n} E (κ_{n} (u_{l}, ψ (h_{K} ({\bar{X}}_{i}))) ψ^{'} (h_{K} ({\bar{X}}_{i})) G_{j} ({\bar{X}}_{i}))| \end{matrix}

since the expectation in the last term is zero in view of Lemma 4 (

G_{j} (- x) = - G_{j} (x)

holds for all

x \in R^{d}

). Applying Lemma 6, it follows that

\sum_{j = 1}^{d} B_{1 n j} |{\hat{μ}}_{n j} - μ_{j}| \leq C n^{- 3 / 2} b^{- 2} \sqrt{ln ln n} \cdot \sqrt{n b ln n} = o ({(n b)}^{- 1 / 2}) .

(24)

On the other hand, we obtain

\begin{matrix} B_{2 n} & \leq & C \frac{ln ln n}{n^{2} b^{3}} (1 + b n^{(1 - α) / 2}) max_{l = 1, \dots, n} \sum_{i = 1}^{n} I (|{\tilde{Y}}_{i} - u_{l}| \leq b - w_{n}) \leq C \frac{ln ln n}{n^{2} b^{3}} (1 + b n^{(1 - α) / 2}) \\ (max_{l = 1, \dots, n} |\sum_{i = 1}^{n} (1 \{|{\tilde{Y}}_{i} - u_{l}| \leq b - w_{n}\} - P \{|{\tilde{Y}}_{i} - u_{l}| \leq b - w_{n}\})| \\ + n \cdot sup_{v \in [0, \infty)} P \{v \leq {\tilde{Y}}_{i} \leq v + 2 b - 2 w_{n}\}) \\ \leq & C \frac{ln ln n}{n^{2} b^{3}} (1 + n^{(1 - α) / 2} b) (\sqrt{n b ln n} + n b) \\ \leq & C ln ln n (n^{- 1} b^{- 2} + n^{- (1 + α) / 2} b^{- 1}) = o ({(n b)}^{- 1 / 2}) a . s ., \end{matrix}

(25)

by utilizing Lemma 6 and taking

α > 0.2

into account. Similarly, it follows that

\begin{matrix} B_{3 n} & \leq & C \sqrt{\frac{ln ln n}{n^{3} b^{4}}} max_{l = 1, \dots, n} \sum_{i = 1}^{n} 1 \{b - w_{n} \leq |{\tilde{Y}}_{i} - u_{l}| \leq b + w_{n}\} \\ \leq & C \sqrt{\frac{ln ln n}{n^{3} b^{4}}} max_{l = 1, \dots, n} (|\sum_{i = 1}^{n} (1 \{b - w_{n} \leq |{\tilde{Y}}_{i} - u_{l}| \leq b + w_{n}\} \\ - P \{b - w_{n} \leq |{\tilde{Y}}_{i} - u_{l}| \leq b + w_{n}\})| + n \cdot P \{u_{l} + b - w_{n} < {\tilde{Y}}_{i} \leq u_{l} + b + w_{n}\} \\ + n \cdot P \{u_{l} - b - w_{n} < {\tilde{Y}}_{i} \leq u_{l} - b + w_{n}\}) \\ = & C \sqrt{\frac{ln ln n}{n^{3} b^{4}}} (\sqrt{n b ln n} + n w_{n}) \\ = & o ({(n b)}^{- 1 / 2}) a . s . \end{matrix}

(26)

Therefore, an application of (23)–(26) leads to the lemma under Assumption 4.

(b) Let Assumption 5 be satisfied. We obtain

\begin{matrix} |Y_{i n} - {\tilde{Y}}_{i}| & \leq & sup_{t \in [0, \infty)} |ψ^{'} (t)| |h_{K} (X_{i} - {\hat{μ}}_{n}) - h_{K} (X_{i} - μ)| \\ \leq & C_{5} \cdot {(\frac{ln ln n}{n})}^{\bar{α} / 2} = : w_{n} . \end{matrix}

Further, by Lipschitz continuity of k,

\begin{matrix} |k (\frac{u - Y_{i n}}{b}) + k (\frac{u + Y_{i n}}{b}) - k (\frac{u - {\tilde{Y}}_{i}}{b}) - k (\frac{u + {\tilde{Y}}_{i}}{b})| \leq C b^{- 1} w_{n} 1 \{|{\tilde{Y}}_{i} - u| \leq b + w_{n}\} \end{matrix}

uniformly w.r.t.

u \in [0, M]

. Hence

\begin{matrix} max_{l = 1, \dots, n} |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| \\ \leq & C n^{- 1} b^{- 2} w_{n} max_{l = 1, \dots, n} (\sum_{i = 1}^{n} (1 \{|{\tilde{Y}}_{i} - u_{l}| \leq b + w_{n}\} - P \{|{\tilde{Y}}_{i} - u_{l}| \leq b + w_{n}\}) \\ + n \cdot P \{u_{l} - b - w_{n} \leq {\tilde{Y}}_{i} \leq u_{l} + b + w_{n}\}) \\ = & n^{- 1} b^{- 2} w_{n} (\sqrt{n (b + w_{n}) ln n} + n (b + w_{n})) = O (w_{n} b^{- 1}) a . s . \end{matrix}

☐
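The comparison made in Lemma 7 can be explored numerically. The following is a minimal sketch, assuming that the estimator has the reflected form $(n b)^{-1} \sum_{i} [k ((u - Y_{i}) / b) + k ((u + Y_{i}) / b)]$ suggested by the displays above; the Epanechnikov kernel and the helper names `reflected_kde` and `perturbation_gap` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def epanechnikov(t):
    """Lipschitz kernel supported on [-1, 1] (illustrative choice, not prescribed by the text)."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def reflected_kde(u, y, b, kernel=epanechnikov):
    """(n b)^{-1} * sum_i [k((u - y_i)/b) + k((u + y_i)/b)] evaluated at the points u >= 0."""
    u = np.atleast_1d(np.asarray(u, dtype=float))[:, None]
    y = np.asarray(y, dtype=float)[None, :]
    total = (kernel((u - y) / b) + kernel((u + y) / b)).sum(axis=1)
    return total / (y.shape[1] * b)

def perturbation_gap(y_tilde, shift, b, grid):
    """Largest difference on the grid between the estimators built from
    the shifted data |y_tilde + shift| and from the original data y_tilde."""
    y_in = np.abs(np.asarray(y_tilde, dtype=float) + shift)
    return np.max(np.abs(reflected_kde(grid, y_in, b) - reflected_kde(grid, y_tilde, b)))
```

Calling `perturbation_gap` with a shift of the order of $w_{n}$ gives an informal impression of how the gap scales with $w_{n} b^{- 1}$; it is a plausibility check only and does not replace the argument above.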

Lemma 8.

Suppose that the assumptions of Lemma 5 are satisfied. Then

\begin{matrix} (a) & max_{l = 1, \dots, n} sup_{y \in U_{l}} |{\hat{χ}}_{n} (y) - {\hat{χ}}_{n} (u_{l})| & = & O (n^{- 1} b^{- 2}) a . s ., \\ (b) & max_{l = 1, \dots, n} sup_{y \in U_{l}} |{\tilde{χ}}_{n} (y) - {\tilde{χ}}_{n} (u_{l})| & = & O (n^{- 1} b^{- 2}) a . s ., \\ (c) & sup_{y \in [m, M]} |{\hat{χ}}_{n} (y) - χ (y)| & = & O (\sqrt{ln n} {(n b)}^{- 1 / 2} + β_{n}) a . s ., \end{matrix}

β_{n}

as in Lemma 5.

Proof.

In view of Lemma 2, we have

max_{l = 1, \dots, n} sup_{y \in U_{l}} |χ (u_{l}) - χ (y)| = \begin{cases} O (n^{- 1}) & \text{if } m \neq 0, \\ O (n^{- 1 / d}) & \text{if } m = 0 . \end{cases}

Moreover, by the Lipschitz continuity of k, we obtain

\begin{matrix} max_{l = 1, \dots, n} sup_{y \in U_{l}} |{\hat{χ}}_{n} (y) - {\hat{χ}}_{n} (u_{l})| \\ = & n^{- 1} b^{- 1} max_{l = 1, \dots, n} sup_{y \in U_{l}} |\sum_{i = 1}^{n} (K_{b} (y, Y_{i n}) - K_{b} (u_{l}, Y_{i n}))| \\ \leq & C b^{- 2} max_{l = 1, \dots, n} sup_{y \in U_{l}} | y - u_{l} | \\ = & O (n^{- 1} b^{- 2}) a . s . \end{matrix}

which proves assertion (a). Assertion (b) can be shown analogously. In view of (22), the lemma follows from Lemmas 5 and 7. ☐

We are now in a position to prove the result on strong convergence of

{\hat{φ}}_{n}

.

Proof of Theorem 2.

(i) Case

μ \notin D

. By Lemma 3, there are

M_{0} > m_{0} > 0

such that

h_{K} (x - {\hat{μ}}_{n}) \in [m_{0}, M_{0}]

for

x \in D

,

n \geq n_{2} (ω)

. In view of (11), we obtain

\begin{matrix} sup_{x \in D} |{\hat{φ}}_{n} (x) - φ (x)| & \leq & O_{S} {(S)}^{- 1} (sup_{x \in D} |{\hat{g}}_{n} (h_{K} (x - {\hat{μ}}_{n})) - g (h_{K} (x - {\hat{μ}}_{n}))| \\ + sup_{z \geq 0} | g^{'} (z) | sup_{x \in D} |h_{K} (x - {\hat{μ}}_{n}) - h_{K} (x - μ)|) \\ \leq & O_{S} {(S)}^{- 1} sup_{z \in [m_{0}, M_{0}]} |{\hat{g}}_{n} (z) - g (z)| + C \sqrt{\frac{ln ln n}{n}} \\ \leq & O_{S} {(S)}^{- 1} sup_{z \geq 0} (z^{1 - d} ψ^{'} (z)) sup_{z \in [ψ (m_{0}), ψ (M_{0})]} |{\hat{χ}}_{n} (z) - χ (z)| + C \sqrt{\frac{ln ln n}{n}} a . s . \end{matrix}

for

n \geq n_{3} (ω)

. Lemma 8 applies to complete the proof of part (i).

(ii) Case

μ \in D

. The proof can be done analogously to part (i) taking

m_{0} = 0

into account. ☐
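The last inequality in the proof of part (i) suggests that the generator estimator is obtained from the radius-density estimator through $g (z) = z^{1 - d} ψ^{'} (z) χ (ψ (z))$. The sketch below illustrates this back-transformation under stated assumptions: the transformation $ψ (z) = - a + {(a^{d} + z^{d})}^{1 / d}$ is one choice with bounded $ψ^{'}$ and bounded $z^{1 - d} ψ^{'} (z)$, the Epanechnikov kernel is an illustrative choice, and the function names are hypothetical.

```python
import numpy as np

def epanechnikov(t):
    """Kernel supported on [-1, 1] (illustrative choice)."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def psi(z, a=1.0, d=2):
    """Assumed transformation psi(z) = -a + (a^d + z^d)^(1/d)."""
    return -a + (a**d + np.asarray(z, dtype=float) ** d) ** (1.0 / d)

def g_hat(z, radii, b, a=1.0, d=2):
    """Sketch of a generator estimate z^(1-d) * psi'(z) * chi_hat(psi(z))."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    y = psi(radii, a, d)                      # transformed radii, playing the role of Y_i
    u = psi(z, a, d)[:, None]
    chi_hat = (epanechnikov((u - y) / b) + epanechnikov((u + y) / b)).sum(axis=1) / (y.size * b)
    # For this psi, z^(1-d) * psi'(z) simplifies to (a^d + z^d)^(1/d - 1).
    weight = (a**d + z**d) ** (1.0 / d - 1.0)
    return weight * chi_hat
```

For a standard bivariate normal sample with Euclidean radii, the radius density is $r e^{- r^{2} / 2}$, so the estimate can be compared with $e^{- z^{2} / 2}$ as an informal check.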

4.4. Proving Asymptotic Normality of ${\hat{φ}}_{n} (x)$

Throughout this subsection, assume that Assumptions 1–3 and (9) are fulfilled for some integer

p \geq 2

. First, an auxiliary result is proven. Define

\hat{x} : = h_{K} (x - {\hat{μ}}_{n})

and

\tilde{x} : = h_{K} (x - μ)

.

Lemma 9.

Under Assumption 4, we have

\begin{matrix} (a) |{\hat{χ}}_{n} (ψ (\tilde{x})) - {\tilde{χ}}_{n} (ψ (\tilde{x}))| & = & o ({(n b)}^{- 1 / 2}) a . s . \\ (b) |{\hat{χ}}_{n} (ψ (\hat{x})) - {\hat{χ}}_{n} (ψ (\tilde{x}))| & = & o ({(n b)}^{- 1 / 2}) a . s . as n \to \infty . \end{matrix}

Proof.

Let

U_{l}

and

u_{l}

be as in Section 4.3. We can choose

m, M : 0 < m < M

such that

ψ (\hat{x})

,

ψ (\tilde{x}) \in [m, M]

for

n \geq n_{4} (ω)

. By Lemmas 7 and 8,

\begin{matrix} sup_{y \in [m, M]} |{\hat{χ}}_{n} (y) - {\tilde{χ}}_{n} (y)| & \leq & max_{l = 1, \dots, n} sup_{y \in U_{l}} |{\hat{χ}}_{n} (y) - {\hat{χ}}_{n} (u_{l})| + max_{l = 1, \dots, n} |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| \\ + max_{l = 1, \dots, n} sup_{y \in U_{l}} |{\tilde{χ}}_{n} (y) - {\tilde{χ}}_{n} (u_{l})| \\ \leq & C n^{- 1} b^{- 2} + o ({(n b)}^{- 1 / 2}) = o ({(n b)}^{- 1 / 2}) a . s . \end{matrix}

(27)

which immediately yields assertion (a). Since

|\hat{x} - \tilde{x}| = O (w_{n}) a . s .

by Lemma 3, we obtain the inequality

|{\tilde{χ}}_{n} (ψ (\hat{x})) - {\tilde{χ}}_{n} (ψ (\tilde{x}))| \leq D_{1 n} + D_{2 n} + D_{3 n}

by Taylor expansion, where

\begin{matrix} D_{1 n} & = & |n^{- 1} b^{- 2} \sum_{i = 1}^{n} \sum_{j = 1}^{d} k^{'} ((ψ (\tilde{x}) - {\tilde{Y}}_{i}) b^{- 1}) 1 \{|ψ (\tilde{x}) - {\tilde{Y}}_{i}| \leq b - w_{n}\} ψ^{'} (\tilde{x}) G_{j} (x - μ) ({\hat{μ}}_{n j} - μ_{j})|, \\ D_{2 n} & \leq & C n^{- 1} b^{- 3} \cdot w_{n}^{2} \sum_{i = 1}^{n} 1 \{|ψ (\tilde{x}) - {\tilde{Y}}_{i}| \leq b - w_{n}\}, \\ D_{3 n} & \leq & C n^{- 1} b^{- 2} \cdot w_{n} \sum_{i = 1}^{n} 1 \{b - w_{n} < |ψ (\tilde{x}) - {\tilde{Y}}_{i}| < b + w_{n}\} \end{matrix}

a.s. Observe that

k^{'} (- t) = - k^{'} (t)

and

\begin{matrix} E k^{'} ((ψ (\tilde{x}) - {\tilde{Y}}_{i}) b^{- 1}) 1 \{|ψ (\tilde{x}) - {\tilde{Y}}_{i}| \leq b - w_{n}\} = b \int_{- 1 + w_{n} b^{- 1}}^{1 - w_{n} b^{- 1}} k^{'} (t) χ (ψ (\tilde{x}) - t b) d t \\ = & b \int_{0}^{1 - w_{n} b^{- 1}} k^{'} (t) (χ (ψ (\tilde{x}) - t b) - χ (ψ (\tilde{x}) + t b)) d t \\ = & O (b^{2}) . \end{matrix}

Analogously to Lemma 6, we can deduce

\begin{matrix} D_{1 n} & \leq & C n^{- 1} b^{- 2} \cdot w_{n} \cdot \sum_{j = 1}^{d} |\sum_{i = 1}^{n} k^{'} ((ψ (\tilde{x}) - {\tilde{Y}}_{i}) b^{- 1}) 1 \{|ψ (\tilde{x}) - {\tilde{Y}}_{i}| \leq b - w_{n}\} ψ^{'} (\tilde{x}) G_{j} (x - μ)| \\ \leq & C n^{- 3 / 2} b^{- 2} \sqrt{ln ln n} \cdot (\sqrt{n b ln n} + n b^{2}) \\ = & o ({(n b)}^{- 1 / 2}) a . s . \end{matrix}

since the expectation is zero due to Lemma 4. Analogously to the examination of

B_{2 n}

and

B_{3 n}

in Lemma 7, we obtain

D_{2 n} + D_{3 n} = o ({(n b)}^{- 1 / 2}) a . s .

Hence

|{\tilde{χ}}_{n} (ψ (\hat{x})) - {\tilde{χ}}_{n} (ψ (\tilde{x}))| = o ({(n b)}^{- 1 / 2}) a . s .

In view of (27), the proof of assertion (b) is complete. ☐

The following lemma is taken from kernel density estimation theory; see [40]. Subsequently, we prove asymptotic normality of

{\hat{φ}}_{n}

.

Lemma 10.

Suppose that χ is continuous at y. Then

\sqrt{n b} ({\tilde{χ}}_{n} (y) - E {\tilde{χ}}_{n} (y)) \overset{d}{⟶} N (0, σ_{1}^{2}), σ_{1}^{2} = χ (y) \int_{- 1}^{1} k^{2} (t) d t .
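The limiting variance in Lemma 10 can be checked by simulation. The following sketch assumes an Epanechnikov kernel and half-normal data; both are illustrative assumptions and are not taken from the paper. The first function evaluates $σ_{1}^{2} = χ (y) \int_{- 1}^{1} k^{2} (t) d t$ numerically, the second estimates the variance of $\sqrt{n b} {\tilde{χ}}_{n} (y)$ by Monte Carlo.

```python
import numpy as np

def epanechnikov(t):
    """Kernel supported on [-1, 1] (illustrative choice)."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def sigma1_squared(chi_y, kernel=epanechnikov, m=200001):
    """chi(y) * int_{-1}^{1} k(t)^2 dt, computed with a trapezoidal rule."""
    t = np.linspace(-1.0, 1.0, m)
    k2 = kernel(t) ** 2
    return chi_y * np.sum(0.5 * (k2[1:] + k2[:-1]) * np.diff(t))

def simulated_variance(y, n, b, reps, rng):
    """Monte Carlo variance of sqrt(n b) * chi_tilde_n(y) for half-normal samples."""
    est = np.empty(reps)
    for r in range(reps):
        sample = np.abs(rng.standard_normal(n))   # assumed data model, density on [0, inf)
        terms = epanechnikov((y - sample) / b) + epanechnikov((y + sample) / b)
        est[r] = terms.sum() / (n * b)
    return n * b * est.var()
```

For finite n the simulated value is typically somewhat smaller than $σ_{1}^{2}$, because the variance of a single kernel term also contains the squared-mean term, which vanishes only as $b \to 0$.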

Proof of Theorem 3.

Note that

z ⇝ z^{1 - d} ψ^{'} (z)

has a bounded derivative on every finite subinterval of

(0, \infty)

. By Lemmas 3 and 9, we obtain

h_{K} {(x - {\hat{μ}}_{n})}^{1 - d} ψ^{'} (h_{K} (x - {\hat{μ}}_{n})) - {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) = O (\sqrt{\frac{ln ln n}{n}})

and hence

\begin{matrix} {\hat{φ}}_{n} (x) - φ (x) & = & O_{S} {(S)}^{- 1} h_{K} {(x - {\hat{μ}}_{n})}^{1 - d} ψ^{'} (h_{K} (x - {\hat{μ}}_{n})) ({\hat{χ}}_{n} (ψ (h_{K} (x - {\hat{μ}}_{n}))) - {\hat{χ}}_{n} (ψ (\tilde{x}))) \\ + O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) ({\hat{χ}}_{n} (ψ (\tilde{x})) - χ (ψ (\tilde{x}))) \\ + (h_{K} {(x - {\hat{μ}}_{n})}^{1 - d} ψ^{'} (h_{K} (x - {\hat{μ}}_{n})) - {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x})) O_{S} {(S)}^{- 1} {\hat{χ}}_{n} (ψ (\tilde{x})) \\ = & o ({(n b)}^{- 1 / 2}) + (O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) + o ({(n b)}^{- 1 / 2})) ({\hat{χ}}_{n} (ψ (\tilde{x})) - χ (ψ (\tilde{x}))) \\ = & Z_{n} + e_{n} + o ({(n b)}^{- 1 / 2}) a . s ., \end{matrix}

where

\begin{matrix} Z_{n} & = & O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) ({\hat{χ}}_{n} (ψ (\tilde{x})) - E {\hat{χ}}_{n} (ψ (\tilde{x}))), \\ e_{n} & = & O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) (E {\hat{χ}}_{n} (ψ (\tilde{x})) - χ (ψ (\tilde{x}))) . \end{matrix}

Using Lemma 10, we have

\sqrt{n b} Z_{n} \overset{d}{⟶} N (0, {\bar{σ}}^{2})

where

{\bar{σ}}^{2} (\tilde{x}) = O_{S} {(S)}^{- 2} {\tilde{x}}^{2 - 2 d} ψ^{'} {(\tilde{x})}^{2} σ_{1}^{2}

. By Taylor expansion, we obtain

\begin{matrix} e_{n} & = & O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) (\int_{0}^{\infty} k (\frac{ψ (\tilde{x}) - y}{b}) χ (y) d y - χ (ψ (\tilde{x}))) \\ = & O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) \int_{- 1}^{1} k (t) (χ (ψ (\tilde{x}) - t b) - χ (ψ (\tilde{x}))) d t \\ = & O_{S} {(S)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) \cdot \int_{- 1}^{1} k (t) (\sum_{j = 1}^{p - 1} {(j!)}^{- 1} χ^{(j)} (ψ (\tilde{x})) {(- t b)}^{j} + {(p!)}^{- 1} χ^{(p)} ({\tilde{x}}_{n}) t^{p} b^{p}) d t \\ = & O_{S} {(S)}^{- 1} {(p!)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) χ^{(p)} (ψ (\tilde{x})) \int_{- 1}^{1} t^{p} k (t) d t b^{p} + o (b^{p}), \end{matrix}

where

{\tilde{x}}_{n}

lies between

ψ (\tilde{x}) - t b

and

ψ (\tilde{x})

. This completes the proof. ☐
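As a worked instance of the bias expansion above, consider $p = 2$ and, purely as an illustrative assumption, the Epanechnikov kernel $k (t) = \frac{3}{4} (1 - t^{2})$ on $[- 1, 1]$. The first kernel moment vanishes by symmetry, and the leading bias constant can be computed explicitly:

```latex
\int_{-1}^{1} t\,k(t)\,dt = 0, \qquad
\int_{-1}^{1} t^{2} k(t)\,dt
  = \frac{3}{4}\Bigl(\frac{2}{3}-\frac{2}{5}\Bigr) = \frac{1}{5},
\qquad\text{so}\qquad
e_{n} = \frac{1}{10}\, O_{S}(S)^{-1}\, \tilde{x}^{\,1-d}\, \psi'(\tilde{x})\,
        \chi''\bigl(\psi(\tilde{x})\bigr)\, b^{2} + o(b^{2}).
```

The factor $1 / 10$ is the product of ${(p!)}^{- 1} = 1 / 2$ and the second kernel moment $1 / 5$.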

4.5. Proofs When Additional Scale Fit Is Involved

When proving Theorem 4 we shall make use of the following lemma.

Lemma 11.

Let (18) be satisfied. Then

{\hat{θ}}_{n l} - θ_{l} = O (\sqrt{\frac{ln ln n}{n}}) a . s . \quad \text{for } l = 1, \dots, q .

Proof.

By the law of the iterated logarithm and Lemma 3, we obtain

\begin{matrix} {\hat{ρ}}_{n j k} - ρ_{j k} & = & {\hat{σ}}_{n j}^{- 1} {\hat{σ}}_{n k}^{- 1} (\frac{1}{n - 1} \sum_{i = 1}^{n} X_{i j} X_{i k} - E X_{j} X_{k} \\ - \frac{n}{n - 1} (({\hat{μ}}_{n j} - μ_{j}) {\hat{μ}}_{n k} + μ_{j} ({\hat{μ}}_{n k} - μ_{k})) + O (n^{- 1})) \\ + E X_{j} X_{k} {({\hat{σ}}_{n j} {\hat{σ}}_{n k} σ_{j} σ_{k})}^{- 1} (σ_{j} (σ_{k} - {\hat{σ}}_{n k}) + {\hat{σ}}_{n k} (σ_{j} - {\hat{σ}}_{n j})) \\ = & O (\sqrt{\frac{ln ln n}{n}}) a . s . \end{matrix}

for all

j, k = 1, \dots, d

. Since the partial derivatives of the functions

γ_{l}

are bounded, it follows that

\begin{matrix} |{\hat{θ}}_{n l} - θ_{l}| & \leq & C max_{j, k = 1, \dots, d} |{\hat{ρ}}_{n j k} - ρ_{j k}| \\ = & O (\sqrt{\frac{ln ln n}{n}}) a . s . \end{matrix}

☐

Proof of Theorem 4.

Let

u (x) = h_{K_{0}} (θ, Σ^{- 1} (x - μ))

. Note that

u (X_{i}) = R_{i}

. By Lipschitz-continuity of

h_{K_{0}}

, (1) and Lemma 3, we have

\begin{matrix} {\bar{Δ}}_{n} & : & = sup_{x \in R^{d}} |h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (x - {\hat{μ}}_{n})) - u (x)| {(1 + u (x))}^{- 1} \\ \leq & C sup_{x \in R^{d}} (∥{\hat{θ}}_{n} - θ∥ + {\hat{Σ}}_{n}^{- 1} ∥{\hat{μ}}_{n} - μ∥ + \sum_{j = 1}^{d} |{\hat{σ}}_{n j}^{- 1} - σ_{j}^{- 1}| |x_{j} - μ_{j}|) \cdot {(1 + u (x))}^{- 1} \\ \leq & C \sqrt{\frac{ln ln n}{n}} \cdot sup_{x \in R^{d}} (1 + ∥x - μ∥) \cdot {(1 + u (x))}^{- 1} \\ = & O (\sqrt{\frac{ln ln n}{n}}) a . s . \end{matrix}

We obtain the result as follows:

\begin{matrix} sup_{r \geq 0} |{\hat{F}}_{n}^{R} (r) - F_{R} (r)| \\ \leq & \frac{1}{n} sup_{r \geq 0} \sum_{i = 1}^{n} 1 \{r - {\bar{Δ}}_{n} (1 + u (X_{i})) \leq u (X_{i}) \leq r + {\bar{Δ}}_{n} (1 + u (X_{i}))\} + Δ_{n} \\ = & \frac{1}{n} sup_{r \geq 0} \sum_{i = 1}^{n} 1 \{\frac{r - {\bar{Δ}}_{n}}{1 + {\bar{Δ}}_{n}} \leq R_{i} \leq \frac{r + {\bar{Δ}}_{n}}{1 - {\bar{Δ}}_{n}}\} + Δ_{n} \\ = & sup_{r \geq 0} (F_{n}^{R} ((r + {\bar{Δ}}_{n}) / (1 - {\bar{Δ}}_{n})) - F_{n}^{R} ((r - {\bar{Δ}}_{n}) / (1 + {\bar{Δ}}_{n}))) + Δ_{n} \\ \leq & sup_{r \geq 0} (F_{R} ((r + {\bar{Δ}}_{n}) / (1 - {\bar{Δ}}_{n})) - F_{R} ((r - {\bar{Δ}}_{n}) / (1 + {\bar{Δ}}_{n}))) + 3 Δ_{n} \\ \leq & 2 {\bar{Δ}}_{n} sup_{r \geq 0} r f_{R} (r) {(1 - {\bar{Δ}}_{n}^{2})}^{- 1} + C \sqrt{\frac{ln ln n}{n}} = O (\sqrt{\frac{ln ln n}{n}}) a . s . \end{matrix}

☐
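The rate in Theorem 4 can be illustrated in the simplest special case. The sketch below assumes a standard bivariate normal sample, the Euclidean norm in place of $h_{K_{0}}$ and a diagonal scale matrix estimated by sample standard deviations; all of these are assumptions made for the illustration. In that case $F_{R} (r) = 1 - e^{- r^{2} / 2}$, and the supremum distance between the empirical distribution function of the estimated radii and $F_{R}$ can be compared with $\sqrt{ln ln n / n}$.

```python
import numpy as np

def sup_distance_estimated_radius(n, rng):
    """sup_r |F_hat_n^R(r) - F_R(r)| with mu and the scales replaced by estimates."""
    x = rng.standard_normal((n, 2))            # assumed model: mu = 0, Sigma = identity
    mu_hat = x.mean(axis=0)
    sigma_hat = x.std(axis=0, ddof=1)
    r_hat = np.sort(np.linalg.norm((x - mu_hat) / sigma_hat, axis=1))
    f_true = 1.0 - np.exp(-0.5 * r_hat**2)     # F_R at the ordered estimated radii
    upper = np.arange(1, n + 1) / n            # empirical CDF at each jump point
    lower = np.arange(0, n) / n                # empirical CDF just before each jump
    return max(np.max(np.abs(upper - f_true)), np.max(np.abs(f_true - lower)))

def lil_rate(n):
    """The rate sqrt(ln ln n / n) appearing in Theorem 4."""
    return np.sqrt(np.log(np.log(n)) / n)
```

Evaluating the ratio of the two quantities for increasing n gives an informal impression of whether it stays bounded; a single simulated path cannot, of course, verify an almost sure statement.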

We proceed with proving Theorem 5. The next two lemmas are used in the proof of this theorem. Here we define

Y_{i n} = ψ (h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (X_{i} - {\hat{μ}}_{n})))

and

{\tilde{Y}}_{i} = ψ (h_{K_{0}} (θ, Σ^{- 1} (X_{i} - μ)))

. Notice that

{\tilde{Y}}_{i}

has density χ, and Lemmas 5 and 6 hold true for the modified

Y_{i n}

and

{\tilde{Y}}_{i}

, too.

Lemma 12.

Let

g^{'}

be bounded. Then we have

sup_{u \in [0, M]} |E κ_{n} (u, ψ (R_{i})) ψ^{'} (R_{i}) R_{i}| = O (b^{2}) .

Proof.

Note that

\int_{- 1 + w_{n} / b}^{1 - w_{n} / b} k^{'} (t) d t = k (1 - w_{n} / b) - k (- 1 + w_{n} / b) = 0 .

For

u > b

, we obtain

\begin{matrix} E k^{'} ((u - ψ (R_{i})) / b) ψ^{'} (R_{i}) R_{i} 1 \{|u - ψ (R_{i})| \leq b - w_{n}\} \\ = & \int_{0}^{\infty} k^{'} (\frac{u - ψ (t)}{b}) ψ^{'} (t) t^{d} g (t) 1 (|u - ψ (t)| \leq b - w_{n}) d t \\ = & - b \int_{- 1 + w_{n} / b}^{1 - w_{n} / b} k^{'} (t) (Ψ^{d} (u - t b) g (Ψ (u - t b)) - Ψ^{d} (u) g (Ψ (u))) d t . \end{matrix}

Since

g, g^{'}, Ψ^{d}

and

{(Ψ^{d})}^{'}

are bounded, we can conclude

sup_{u \in (b, M]} |E κ_{n} (u, ψ (R_{i})) ψ^{'} (R_{i}) R_{i}| = O (b^{2}) .

Further we obtain

\begin{matrix} sup_{u \in [0, b]} |E (k^{'} (\frac{u - ψ (R_{i})}{b}) - k^{'} (\frac{u + ψ (R_{i})}{b})) ψ^{'} (R_{i}) R_{i} \\ 1 \{|u - ψ (R_{i})| \leq b - w_{n}\}| \\ \leq & C sup_{u \in [- 1, 1]} | k^{'} (u) | sup_{u \geq 0} | g (u) | \int_{0}^{Ψ (2 b)} t^{d} ψ^{'} (t) d t \\ \leq & C \int_{0}^{2 b} Ψ^{d} (t) d t \leq C \int_{0}^{2 b} t d t = O (b^{2}), \end{matrix}

by (14). This inequality completes the proof. ☐

Lemma 13.

Suppose that Assumption 7 or Assumption 8 is satisfied. Then

max_{l = 1, \dots, n} |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| = \begin{cases} o ({(n b)}^{- 1 / 2}) & \text{under Assumption 7}, \\ O (b^{- 1} {(n^{- 1} ln ln n)}^{\bar{α} / 2}) & \text{under Assumption 8} \end{cases} \quad a . s .

Proof.

We prove the lemma under Assumption 7; the proof of the other part is analogous to that of Lemma 7(b). As we shall see later, we need only consider data vectors

X_{i}

with

Y_{i n} = ψ (h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (X_{i} - {\hat{μ}}_{n}))) \leq M + \bar{b}

or

{\tilde{Y}}_{i} = ψ (h_{K_{0}} (θ, Σ^{- 1} (X_{i} - μ))) \leq M + \bar{b}

. This implies

min \{∥X_{i} - {\hat{μ}}_{n}∥, ∥X_{i} - μ∥\} \leq C Ψ (M + \bar{b})

by (1), and therefore

∥X_{i} - μ∥ \leq C_{5}

for

n \geq n_{6} (ω)

with a constant

C_{5} > 0

. In view of Lemma 3 and by Assumption 7, we obtain

\begin{matrix} |Y_{i n} - {\tilde{Y}}_{i}| & \leq & sup_{t \in [0, \infty)} |ψ^{'} (t)| |h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (X_{i} - {\hat{μ}}_{n})) - h_{K_{0}} (θ, Σ^{- 1} (X_{i} - μ))| \\ \leq & C (∥{\hat{θ}}_{n} - θ∥ + \sum_{j = 1}^{d} ({\hat{σ}}_{n j}^{- 1} |μ_{j} - {\hat{μ}}_{n j}| + |X_{i j} - μ_{j}| |{\hat{σ}}_{n j}^{- 1} - σ_{j}^{- 1}|)) \\ \leq & C_{6} \cdot \sqrt{\frac{ln ln n}{n}} = : w_{n} \end{matrix}

(

X_{i} = {(X_{i 1}, \dots, X_{i d})}^{T}

) with a suitable constant

C_{6} > 0

for

n \geq n_{7} (ω)

. We introduce

κ_{n} (u, t) = (k^{'} ((u - t) / b) - k^{'} ((u + t) / b)) 1 (|u - t| \leq b - w_{n}) .

Let

\bar{ψ} (z) : = z^{- 1} ψ^{'} (z), {\bar{X}}_{i} : = Σ^{- 1} (X_{i} - μ) = {({\bar{X}}_{i 1}, \dots, {\bar{X}}_{i d})}^{T}

. Observe that

k^{'}

is bounded and Lipschitz continuous on

[- 1, 1]

,

ψ^{'}, \bar{ψ}

and

{\bar{ψ}}^{'}

are bounded on

[0, + \infty)

, functions

G_{j}, {\tilde{G}}_{j}

are bounded, and functions

ψ^{'} (h_{K} (.)) G_{j}

are Hölder continuous of order

α > 0.2

. By Taylor expansion, Assumption 7 and Lipschitz continuity of k on

R

, we have

\begin{matrix} k (\frac{u - Y_{i n}}{b}) + k (\frac{u + Y_{i n}}{b}) - k (\frac{u - {\tilde{Y}}_{i}}{b}) - k (\frac{u + {\tilde{Y}}_{i}}{b}) \\ = & - \frac{1}{b} (k^{'} ((u - {\tilde{Y}}_{i}) / b) - k^{'} ((u + {\tilde{Y}}_{i}) / b)) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) 1 \{|{\tilde{Y}}_{i} - u| \leq b - w_{n}\} \\ \cdot (\sum_{j = 1}^{d} G_{j} (θ, {\bar{X}}_{i}) ({\hat{σ}}_{n j}^{- 1} (μ_{j} - {\hat{μ}}_{n j}) + (X_{i j} - μ_{j}) ({\hat{σ}}_{n j}^{- 1} - σ_{j}^{- 1})) \\ + \sum_{j = 1}^{q} {\tilde{G}}_{j} (θ, {\bar{X}}_{i}) ({\hat{θ}}_{n j} - θ_{j})) + W_{n i} (u), \end{matrix}

where, uniformly for

u \in [0, M]

,

\begin{matrix} |W_{n i} (u)| & \leq & C b^{- 2} w_{n}^{2} 1 \{|{\tilde{Y}}_{i} - u| \leq b - w_{n}\} \\ + C b^{- 1} w_{n} 1 \{b - w_{n} < |{\tilde{Y}}_{i} - u| < b + w_{n}\} a . s . \end{matrix}

This leads to

\begin{matrix} max_{l = 1, \dots, n} |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| \\ \leq & C (\sum_{j = 1}^{d} (B_{1 n j} {\hat{σ}}_{n j}^{- 1} |{\hat{μ}}_{n j} - μ_{j}| + B_{4 n j} |{\hat{σ}}_{n j}^{- 1} - σ_{j}^{- 1}|) + \sum_{j = 1}^{q} B_{5 n j} |{\hat{θ}}_{n j} - θ_{j}|) \\ + B_{2 n} + B_{3 n}, \end{matrix}

where

B_{2 n}, B_{3 n}

are as above, and

\begin{matrix} B_{1 n j} & = & n^{- 1} b^{- 2} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} κ_{n} (u_{l}, {\tilde{Y}}_{i}) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) G_{j} (θ, {\bar{X}}_{i})|, \\ B_{4 n j} & = & n^{- 1} b^{- 2} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} κ_{n} (u_{l}, {\tilde{Y}}_{i}) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) G_{j} (θ, {\bar{X}}_{i}) {\bar{X}}_{i j}|, \\ B_{5 n j} & = & n^{- 1} b^{- 2} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} κ_{n} (u_{l}, {\tilde{Y}}_{i}) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) {\tilde{G}}_{j} (θ, {\bar{X}}_{i})| . \end{matrix}

Analogously to Lemma 7, we obtain

\sum_{j = 1}^{d} B_{1 n j} |{\hat{μ}}_{n j} - μ_{j}| \leq C n^{- 3 / 2} b^{- 2} \sqrt{ln ln n} \cdot \sqrt{n b ln n} = o ({(n b)}^{- 1 / 2}) .

Observe that

{\bar{X}}_{i} = R_{i} {\bar{U}}_{i}

,

G_{j} (θ, R_{i} {\bar{U}}_{i}) = G_{j} (θ, {\bar{U}}_{i})

and

{\tilde{G}}_{j} (θ, R_{i} {\bar{U}}_{i}) = R_{i} {\tilde{G}}_{j} (θ, {\bar{U}}_{i})

a . s .

(

h_{K_{0}} (θ, .)

is a homogeneous function). Applying Lemmas 6 and 12, we can derive

\begin{matrix} B_{4 n j} & \leq & n^{- 1} b^{- 2} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} (κ_{n} (u_{l}, {\tilde{Y}}_{i}) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) G_{j} (θ, {\bar{X}}_{i}) {\bar{X}}_{i j} \\ & & - E (κ_{n} (u_{l}, ψ (h_{K_{0}} (θ, {\bar{X}}_{i}))) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) G_{j} (θ, {\bar{X}}_{i}) {\bar{X}}_{i j}))| \\ & & + b^{- 2} max_{l = 1, \dots, n} |E (κ_{n} (u_{l}, ψ (R_{i})) ψ^{'} (R_{i}) R_{i}) E {\bar{U}}_{i j} G_{j} (θ, {\bar{U}}_{i})| \\ & = & O (\sqrt{n^{- 1} b^{- 3} ln n} + 1) \end{matrix}

(

{\bar{U}}_{i} = {({\bar{U}}_{i 1}, \dots, {\bar{U}}_{i d})}^{T}

) which implies

\sum_{j = 1}^{d} B_{4 n j} |{\hat{σ}}_{n j}^{- 1} - σ_{j}^{- 1}| = o ({(n b)}^{- 1 / 2}) .

Further, by Lemma 12,

\begin{matrix} B_{5 n j} & \leq & n^{- 1} b^{- 2} max_{l = 1, \dots, n} |\sum_{i = 1}^{n} (κ_{n} (u_{l}, {\tilde{Y}}_{i}) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) {\tilde{G}}_{j} (θ, {\bar{X}}_{i}) \\ & & - E (κ_{n} (u_{l}, ψ (h_{K_{0}} (θ, {\bar{X}}_{i}))) ψ^{'} (h_{K_{0}} (θ, {\bar{X}}_{i})) {\tilde{G}}_{j} (θ, {\bar{X}}_{i})))| \\ & & + b^{- 2} max_{l = 1, \dots, n} |E (κ_{n} (u_{l}, ψ (R_{i})) ψ^{'} (R_{i}) R_{i}) E {\tilde{G}}_{j} (θ, {\bar{U}}_{i})| \\ & = & O (\sqrt{n^{- 1} b^{- 3} ln n} + 1) \end{matrix}

and

\sum_{j = 1}^{q} B_{5 n j} |{\hat{θ}}_{n j} - θ_{j}| = o ({(n b)}^{- 1 / 2}) .

This completes the proof. ☐
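The truncated weight $κ_{n}$ introduced in the proof above, together with the two kernel properties used in Lemmas 9 and 12 (oddness of $k^{'}$ and a vanishing integral of $k^{'}$ over a symmetric interval), can be sketched as follows. The Epanechnikov kernel is an illustrative assumption, and the function names are hypothetical.

```python
import numpy as np

def k(t):
    """Epanechnikov kernel on [-1, 1] (illustrative choice)."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def dk(t):
    """Derivative k'(t) = -1.5 t on [-1, 1], set to zero outside the support."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, -1.5 * t, 0.0)

def kappa_n(u, t, b, w_n):
    """(k'((u - t)/b) - k'((u + t)/b)) * 1{|u - t| <= b - w_n}."""
    u = np.asarray(u, dtype=float)
    t = np.asarray(t, dtype=float)
    weight = dk((u - t) / b) - dk((u + t) / b)
    return weight * (np.abs(u - t) <= b - w_n)

def integral_of_dk(c, m=100001):
    """Trapezoidal approximation of int_{-c}^{c} k'(t) dt for 0 < c <= 1."""
    t = np.linspace(-c, c, m)
    v = dk(t)
    return np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(t))
```

With $c = 1 - w_{n} / b$, `integral_of_dk(c)` approximates $k (c) - k (- c)$, which is zero for any symmetric kernel.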

Proof of Theorem 5.

We consider only the case

μ \notin D

; the proof in the other case is similar. By Lemma 3, there are

M_{0} > m_{0} > 0

such that

h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (x - {\hat{μ}}_{n})) \in [m_{0}, M_{0}]

for

x \in D

,

n \geq n_{8} (ω)

. In view of (11), we obtain

\begin{matrix} sup_{x \in D} |{\hat{φ}}_{n} (x) - φ (x)| \\ \leq & O_{S} {(S)}^{- 1} det {({\hat{Σ}}_{n})}^{- 1} \\ (sup_{x \in D} |{\hat{g}}_{n} (h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (x - {\hat{μ}}_{n}))) - g (h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (x - {\hat{μ}}_{n})))| \\ + sup_{z \geq 0} | g^{'} (z) | sup_{x \in D} |h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (x - {\hat{μ}}_{n})) - h_{K_{0}} (θ, Σ^{- 1} (x - μ))|) \\ \leq & C (sup_{z \in [m_{0}, M_{0}]} |{\hat{g}}_{n} (z) - g (z)| + \sqrt{\frac{ln ln n}{n}}) \\ \leq & C (sup_{z \geq 0} (z^{1 - d} ψ^{'} (z)) sup_{z \in [ψ (m_{0}), ψ (M_{0})]} |{\hat{χ}}_{n} (z) - χ (z)| + \sqrt{\frac{ln ln n}{n}}) a . s . \end{matrix}

for

n \geq n_{9} (ω)

. An application of Lemma 8 completes the proof. ☐

In the last part of this section, we prove the result on asymptotic normality. Define

\hat{x} : = h_{K_{0}} ({\hat{θ}}_{n}, {\hat{Σ}}_{n}^{- 1} (x - {\hat{μ}}_{n}))

and

\tilde{x} : = h_{K_{0}} (θ, Σ^{- 1} (x - μ))

.

Lemma 14.

Under Assumption 7, we have

\begin{matrix} (a) |{\hat{χ}}_{n} (ψ (\tilde{x})) - {\tilde{χ}}_{n} (ψ (\tilde{x}))| & = & o ({(n b)}^{- 1 / 2}) a . s . \\ (b) |{\hat{χ}}_{n} (ψ (\hat{x})) - {\hat{χ}}_{n} (ψ (\tilde{x}))| & = & o ({(n b)}^{- 1 / 2}) a . s . \end{matrix}

Proof.

Let

U_{l}

and

u_{l}

be as in Section 4.3. We can choose

m, M : 0 < m < M

such that

ψ (\hat{x}), ψ (\tilde{x}) \in [m, M]

for

n \geq n_{10} (ω)

. By Lemmas 7 and 8,

\begin{matrix} sup_{y \in [m, M]} |{\hat{χ}}_{n} (y) - {\tilde{χ}}_{n} (y)| & \leq & max_{l = 1, \dots, n} sup_{y \in U_{l}} |{\hat{χ}}_{n} (y) - {\hat{χ}}_{n} (u_{l})| + max_{l = 1, \dots, n} |{\hat{χ}}_{n} (u_{l}) - {\tilde{χ}}_{n} (u_{l})| \\ + max_{l = 1, \dots, n} sup_{y \in U_{l}} |{\tilde{χ}}_{n} (y) - {\tilde{χ}}_{n} (u_{l})| \\ \leq & C n^{- 1} b^{- 2} + o ({(n b)}^{- 1 / 2}) = o ({(n b)}^{- 1 / 2}) a . s . \end{matrix}

(28)

which immediately yields assertion (a). Since

|\hat{x} - \tilde{x}| = O (w_{n}) a . s .

by Lemma 3, we obtain the inequality

|{\tilde{χ}}_{n} (ψ (\hat{x})) - {\tilde{χ}}_{n} (ψ (\tilde{x}))| \leq D_{1 n} + D_{2 n} + D_{3 n}

by Taylor expansion, where

\begin{matrix} D_{1 n} & = & |n^{- 1} b^{- 2} \sum_{i = 1}^{n} k^{'} ((ψ (\tilde{x}) - {\tilde{Y}}_{i}) b^{- 1}) ψ^{'} (\tilde{x}) 1 \{|ψ (\tilde{x}) - {\tilde{Y}}_{i}| \leq b - w_{n}\} \\ & & (\sum_{j = 1}^{d} G_{j} (θ, \tilde{x}) ({\hat{σ}}_{n j}^{- 1} (μ_{j} - {\hat{μ}}_{n j}) + (X_{i j} - μ_{j}) ({\hat{σ}}_{n j}^{- 1} - σ_{j}^{- 1})) \\ & & + \sum_{j = 1}^{q} {\tilde{G}}_{j} (θ, \tilde{x}) ({\hat{θ}}_{n j} - θ_{j}))|, \end{matrix}

D_{2 n}

and

D_{3 n}

are as in the proof of Lemma 9. Analogously to Lemma 6, we can deduce that

\begin{matrix} D_{1 n} & \leq & C n^{- 1} b^{- 2} w_{n} \cdot |\sum_{i = 1}^{n} k^{'} ((ψ (\tilde{x}) - {\tilde{Y}}_{i}) b^{- 1}) 1 \{|ψ (\tilde{x}) - {\tilde{Y}}_{i}| \leq b - w_{n}\}| \\ & & \cdot ψ^{'} (\tilde{x}) (\sum_{j = 1}^{d} |G_{j} (θ, \tilde{x})| + \sum_{j = 1}^{q} |{\tilde{G}}_{j} (θ, \tilde{x})|) \\ & \leq & C n^{- 3 / 2} b^{- 2} \sqrt{ln ln n} \cdot (\sqrt{n b ln n} \\ & & + n |E (k^{'} ((ψ (\tilde{x}) - {\tilde{Y}}_{i}) b^{- 1}) 1 \{|ψ (\tilde{x}) - {\tilde{Y}}_{i}| \leq b - w_{n}\})|) \\ & = & o ({(n b)}^{- 1 / 2}) a . s . \end{matrix}

The remainder of the proof is done as in the proof of Lemma 9. ☐

Proof of Theorem 6.

By Lemmas 3 and 14, and analogously to the proof of Theorem 3, we obtain

\begin{matrix} {\hat{φ}}_{n} (x) - φ (x) & = & (O_{S} {(S)}^{- 1} det {(Σ)}^{- 1} {\tilde{x}}^{1 - d} ψ^{'} (\tilde{x}) + o ({(n b)}^{- 1 / 2})) ({\tilde{χ}}_{n} (ψ (\tilde{x})) - χ (ψ (\tilde{x}))) \\ & & + o ({(n b)}^{- 1 / 2}) a . s . \end{matrix}

The remainder of the proof can be done in the same manner as in the proof of Theorem 3. ☐

Acknowledgments

The authors are grateful to the referees who provided valuable suggestions on improving the paper and additional references.

Author Contributions

The authors contributed equally to this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. M. Bilodeau, and D. Brenner. Theory of Multivariate Statistics. New York, NY, USA: Springer, 1999.
2. K.-T. Fang, S. Kotz, and K. Ng. Symmetric Multivariate and Related Distributions. London, UK: Chapman & Hall, 1990.
3. K.-T. Fang, and Y. Zhang. Generalized Multivariate Analysis. New York, NY, USA: Springer, 1990.
4. K.-T. Fang, and T.W. Anderson, eds. Statistical Inference in Elliptically Contoured and Related Distributions. New York, NY, USA: Allerton Press, 1990.
5. A.K. Gupta, T. Varga, and T. Bodnar. Elliptically Contoured Models in Statistics and Portfolio Theory. New York, NY, USA: Springer, 2013.
6. H. Cui, and X. He. “The consistence of semiparametric estimation of elliptic densities.” Acta Math. Sin. Engl. Ser. 11 (1995): 44–58.
7. E. Liebscher. “A semiparametric density estimator based on elliptical distributions.” J. Multivar. Anal. 92 (2005): 205–225.
8. W. Stute, and U. Werner. “Nonparametric estimation of elliptically contoured densities.” In Nonparametric Functional Estimation and Related Topics (Spetses, 1990). NATO Science Series C: Mathematical and Physical Sciences; Dordrecht, The Netherlands: Kluwer Academic Publisher, 1991, pp. 173–190.
9. H. Battey, and O. Linton. “Nonparametric estimation of multivariate elliptic densities via finite mixture sieves.” J. Multivar. Anal. 123 (2014): 43–67.
10. C. Fernandez, J. Osiewalski, and M.F.J. Steel. “Modeling and inference with ν-spherical distributions.” J. Am. Stat. Assoc. 90 (1995): 1331–1340.
11. A.A. Balkema, P. Embrechts, and N. Nolde. “Meta densities and the shape of their sample clouds.” J. Multivar. Anal. 101 (2010): 1738–1754.
12. W.-D. Richter. “Geometric disintegration and star-shaped distributions.” J. Stat. Distrib. Appl. 1 (2014).
13. T. Dietrich, and W.-D. Richter. “Classes of geometrically generalized von Mises distributions.” Sankhya B, 2016.
14. W.-D. Richter. “Norm contoured distributions in R².” Lect. Notes Semin. Interdiscip. Mat. 12 (2015): 179–199.
15. W.-D. Richter. “Convex and radially concave contoured distributions.” J. Prob. Stat. 2015 (2015): 165468.
16. W.-D. Richter, and K. Schicker. “Polyhedral star-shaped distributions. Representations, properties and applications.” J. Prob. Stat., 2016, in press.
17. M. Moszyńska, and W.-D. Richter. “Reverse triangle inequality. Antinorms and semi-antinorms.” Stud. Sci. Math. Hung. 49 (2012): 120–138.
18. E.L. Lehmann, and G. Casella. Theory of Point Estimation, 2nd ed. New York, NY, USA: Springer, 1998.
19. B.W. Silverman. Density Estimation for Statistics and Data Analysis. London, UK: Chapman & Hall, 1986.
20. K. Müller, and W.-D. Richter. “Exact distributions of order statistics of dependent random variables from l_n,p-symmetric sample distributions.” Depend. Model. 4 (2016).
21. J.S. Marron, and M.P. Wand. “Exact mean integrated squared error.” Ann. Stat. 20 (1992): 712–736.
22. W. Stute. “A law of the logarithm for kernel density estimators.” Ann. Prob. 10 (1982): 414–422.
23. J.K. Lindsey. “Multivariate elliptically contoured distributions for repeated measurements.” Biometrics 55 (1999): 1277–1280.
24. D. Cho, and T.D. Bin. “Multivariate statistical modeling for image denoising using wavelet transforms.” Signal Process. Image Commun. 20 (2005): 77–89.
25. G. Verdoolaege, S. de Becker, and P. Scheunders. “Multiscale colour texture retrieval using the geodesic distance between multivariate generalized models.” In Proceedings of the 15th IEEE International Conference on Image Processing, San Diego, CA, USA, 12–15 October 2008; pp. 169–172.
26. C. Field, and M.C. Genton. “The multivariate g-and-h distribution.” Technometrics 48 (2006): 104–111.
27. F. Sinz, and M. Bethge. “The conjoint effect of divisive normalization and orientation selectivity on redundancy reduction.” In Proceedings of the 2008 Conference on Advances in Neural Information Processing Systems 21, Vancouver, BC, Canada, 8–10 December 2008; pp. 1521–1628.
28. U.J. Dang, R.P. Browne, and P.D. McNicholas. “Mixtures of multivariate power exponential distributions.” Biometrics 71 (2015): 1081–1089.
29. M.N. Do, and M. Vetterli. “Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler Distance.” IEEE Trans. Image Process. 11 (2002): 146–158.
30. E. Santiago, J. Albornoz, A. Domínguez, M.A. Toro, and C. López-Fanjul. “The distribution of spontaneous mutations on quantitative traits and fitness in Drosophila melanogaster.” Genetics 132 (1992): 771–781.
31. E. Gómez, M.A. Gómez-Villegas, and J.M. Marín. “A multivariate generalization of the power exponential family of distributions.” Commun. Stat. Theory Methods 27 (1998): 589–600.
32. Y. Wiaux, G. Puy, and P. Vandergheynst. “Compressed sensing reconstruction of a string signal from interferometric observations of the cosmic microwave background.” Mon. Not. R. Astron. Soc. 402 (2010): 2626–2636.
33. J.-H. Chang, J.W. Shin, and N.S. Kim. “Voice activity detector employing generalized Gaussian distribution.” Electron. Lett. 40 (2004): 24.
34. F. Forbes, and D. Wraith. “A new family of multivariate heavy-tailed distributions with variable marginal amounts of tail weight: Application to robust clustering.” Stat. Comput. 24 (2014): 971–984.
35. D. Wraith, and F. Forbes. “Location and scale mixtures of Gaussians with flexible tail behaviour: Properties, inference and application to multivariate clustering.” Comput. Stat. Data Anal. 90 (2015): 61–73.
36. S.X. Lee, and G.J. McLachlan. “Finite mixtures of canonical fundamental skew t-distributions. The unification of the restricted and unrestricted skew t-mixture models.” Stat. Comput. 26 (2016): 573–589.
37. W.-D. Richter, and J. Venz. “Geometric representations of multivariate skewed elliptically contoured distributions.” Chil. J. Stat. 5 (2014): 71–90.
38. D.F. Andrews, and A.M. Herzberg. Data: A Collection of Problems from Many Fields for the Student and Research Worker. New York, NY, USA: Springer, 1985.
39. A.W. Van der Vaart. Asymptotic Statistics. Cambridge, UK: Cambridge University Press, 1998.
40. E. Parzen. “On estimation of a probability density function and mode.” Ann. Math. Stat. 33 (1962): 1065–1076.

Figure 1. Plot of the boundaries of K in the cases

a = 10

(solid line),

a = 3

(dashed line),

a = 1

(dotted line) and

a = 0.3

(dashed/dotted line).

Figure 2. Plot of the boundaries of K in the cases

a = 10

(solid line),

a = 3

(dashed line),

a = 1

(dotted line) and

a = 0.3

(dashed/dotted line).

Figure 3. Plot of the boundaries of K in the cases

a = - 0.8

(solid line),

a = - 0.6

(dashed line),

a = - 0.4

(dotted line),

a = - 0.2

(dashed/dotted line).

Figure 4. Contour plot of the density of

φ_{g, K, μ}

for

a = - 0.8

and levels

0.39, 0.38, \dots 0.19

.

Figure 5. Plot of the boundaries of K for

a_{2} = 2

in the cases

a_{1} = 10

(solid line),

a_{1} = 3

(dashed line),

a_{1} = 1

(dotted line) and

a_{1} = 0.3

(dashed/dotted line).

Figure 6. Contour plot of the density

φ_{g, K, μ}

for

a_{1} = 3

and levels as in Figure 4.

Figure 7. Estimator of g (solid line) and the model function (dashed line) for

n = 1000

.

Figure 8. Estimator of g (solid line) and the model function (dashed line) for

n = 10,000

.

Figure 9. Function

γ_{1}^{- 1}

with

a_{1}

on the x-axis, ρ on the y-axis;

γ_{1}

is defined in Assumption 6.

Figure 10. Scatter plot of the dataset 5 of [38].

Figure 11. Estimated generator function g at left (bandwidth

b = 0.5

) and contour plot of

\hat{φ}

at right.

Figure 12. Scatter plot of the MSCI data.

Figure 13. Contour plot of the estimated density.

Figure 14. Estimated generator function g (bandwidth

b = 0.25

).

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
