Estimation of Star-Shaped Distributions

Abstract: Scatter plots of multivariate data sets motivate modeling of star-shaped distributions beyond elliptically contoured ones. We study properties of estimators for the density generator function, the star-generalized radius distribution and the density in a star-shaped distribution model. For the generator function and the star-generalized radius density, we consider a non-parametric kernel-type estimator. This estimator is combined with a parametric estimator for the contours which are assumed to follow a parametric model. Therefore, the semiparametric procedure features the flexibility of nonparametric estimators and the simple estimation and interpretation of parametric estimators. Alternatively, we consider pure parametric estimators for the density. For the semiparametric density estimator, we prove rates of uniform, almost sure convergence which coincide with the corresponding rates of one-dimensional kernel density estimators when excluding the center of the distribution. We show that the standardized density estimator is asymptotically normally distributed. Moreover, the almost sure convergence rate of the estimated distribution function of the star-generalized radius is derived. A particular new two-dimensional distribution class is adapted here to agricultural and financial data sets.


Introduction
The classes of multivariate Gaussian and elliptically contoured distributions have served as the probabilistic basis of many multivariate statistical models over a period of several decades. Accounts of the theory of elliptically contoured distributions may be found in [1][2][3]. The book [4] by Fang and Anderson contains an extensive chapter on statistical inference for elliptically contoured distributions. The theory of elliptically contoured distributions, including applications to portfolio theory, is presented in the monograph [5] by Gupta et al. Combining the advantages of several estimators, semiparametric density estimators for elliptical distributions were derived in [6][7][8] by Stute and Werner, by Cui and He, and by Liebscher. In [9], Battey and Linton considered a density estimator for elliptical distributions based on Gaussian mixture sieves. The performance of their estimators depends heavily on how well the density can be approximated by a mixture of normal distributions. Scatter plots of multivariate data sets, however, motivate modeling of star-shaped distributions beyond elliptically contoured ones.
The more flexible star-shaped densities were studied in [10] and later in [11]. The general structure of their normalizing constant given a density generating function was discovered, a geometric measure representation and, based upon it, a stochastic representation were derived, and a survey of applications of such densities was given in [12]. Moreover, two-dimensional non-concentric elliptically contoured distributions are introduced there and, based upon two-dimensional star-shaped densities, a universal star-shaped generalization of the univariate von Mises density is derived. These results are studied in further detail in [13] for several particular classes. The large classes of norm-contoured and antinorm-contoured distributions, being particular cases of star-shaped distributions, are considered in [14,15] for dimension two and for arbitrary finite dimension, respectively. In this paper we study several of those classes of distributions for arbitrary finite dimension and introduce a particular new class of distributions for dimension two. The rather general class of distributions considered in the present paper covers distributions with convex as well as non-convex contours.
The main goal of this paper is to develop estimation procedures for fitting multivariate generalized star-shaped distributions. The semiparametric procedure combines the flexibility of nonparametric estimators with the simple estimation and interpretation of parametric estimators. Since we apply nonparametric estimation to a univariate function, we avoid the disadvantages of nonparametric estimators in connection with the curse of dimensionality. The semiparametric approach of this paper is based on that in the earlier paper [7] on elliptical distributions but uses partially weaker assumptions. Alternatively, we consider a purely parametric method. In both cases, a parametric model is assumed for the density contours given by the star body and its Minkowski functional. For the semiparametric method, we assume that the contours are smooth, more precisely, that the Minkowski functional is continuously differentiable. The parameters are estimated using a method of moments. The star-generalized radius density is estimated either nonparametrically, by use of a kernel density estimator, or parametrically.
The paper is structured as follows. The class of continuous star-shaped distributions and several of its subclasses are considered in Section 2. Section 3 deals with the estimation of the density and the star-generalized radius distribution. In Section 3, the statements on convergence rates and on the asymptotic normality of the density estimator as well as on the convergence rate of the estimated distribution function of the generalized radius are provided. First the case of a given star body is considered; later the more general case of a parametrized star body is taken into consideration. Section 3.4 surveys, to a certain extent, examples where different subclasses of star-shaped distributions appear in practice, and applies the methods developed here to the analysis of two-dimensional agricultural and financial data. The proofs can be found in Section 4.

The General Distribution Class
Throughout this paper, $K \subset \mathbb{R}^d$ denotes a star body, i.e., a non-empty star-shaped set that is compact and is equal to the closure of its interior, having the origin as an interior point. The Minkowski functional of $K$ is defined by $h_K(x) = \inf\{\lambda > 0 : x \in \lambda K\}$ for $x \in \mathbb{R}^d$. The boundary of $K$ is just the set $\{(u, v)^T : h_K((u, v)^T) = 1\}$. Further we find a ball $\{y \in \mathbb{R}^d : \|y\| \le r\}$ which covers $K$, where $\|\cdot\|$ denotes the Euclidean norm. Hence $h_K(x \|x\|^{-1}) \ge 1/r$ and, by homogeneity, $h_K(x) \ge \|x\|/r$ for $x \ne 0$. The function $h_K$ is assumed to be homogeneous of degree one, and to satisfy a further assumption.
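As a numerical illustration of this definition, the Minkowski functional can be approximated by bisection whenever membership in $K$ can be tested. The following Python sketch assumes the standard definition $h_K(x) = \inf\{\lambda > 0 : x \in \lambda K\}$; the membership oracle `contains`, the function names and the elliptical example body are hypothetical and serve only as an example.

```python
import numpy as np

def minkowski_functional(x, contains, tol=1e-10):
    """Approximate h_K(x) = inf{lam > 0 : x in lam*K} by bisection.

    `contains(y)` is a hypothetical membership test for the star body K.
    Since K is star-shaped w.r.t. the origin, x/lam lies in K for all
    lam >= h_K(x), so the search is monotone in lam.
    """
    x = np.asarray(x, dtype=float)
    if not np.any(x):
        return 0.0
    hi = 1.0
    while not contains(x / hi):   # enlarge until x lies in hi*K
        hi *= 2.0
    lo = 0.0                      # x != 0 does not lie in 0*K = {0}
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains(x / mid):
            hi = mid
        else:
            lo = mid
    return hi

# Example body (an assumption for illustration): ellipse with half-axes 2 and 1.
def in_ellipse(y):
    return (y[0] / 2.0) ** 2 + y[1] ** 2 <= 1.0

x = np.array([1.0, 1.5])
h = minkowski_functional(x, in_ellipse)
h_scaled = minkowski_functional(2.5 * x, in_ellipse)  # homogeneity of degree one
```

For this ellipse the functional is known in closed form, $h_K(u, v) = \sqrt{(u/2)^2 + v^2}$, which gives a simple check of the bisection and of the homogeneity property.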
A countable collection $\mathcal{F} = \{C_1, C_2, \ldots\}$ of pairwise disjoint sectors (closed convex cones $C_j$ containing no half-space, with non-empty interior and vertex being the origin $0_d$) such that $\mathbb{R}^d = \bigcup_j C_j$ will be called a fan. By $\mathcal{B}^d$ we denote the Borel σ-field in $\mathbb{R}^d$ and by $S$ the boundary of $K$. We denote $S_j = S \cap C_j$, $\mathcal{B}_{S,j} = S_j \cap \mathcal{B}^d$ and $\mathcal{B}_S = \sigma\{\mathcal{B}_{S,1}, \mathcal{B}_{S,2}, \ldots\}$. We shall consider only star bodies $K$ and sets $A \in \mathcal{B}_S$ satisfying the following condition.
Assumption 1. The star body $K$ and the set $A \in \mathcal{B}_S$ are chosen such that for every $j$ the set is well defined and such that for every ϑ = (ϑ_1, ...,
A star body $K$ satisfying this assumption will be called for short an A1-star-body. Let $g : [0, +\infty) \to [0, +\infty)$ be a nonnegative function which fulfills the condition

Such a function is called a density generating function (dgf).
We consider the class of continuous star-shaped distributions of random vectors $X$ taking values in $\mathbb{R}^d$: where int $K$ means the interior of $K$. Suppose that the distribution law $\Phi_{g,K,\mu}$ has the density

$$\varphi_{g,K,\mu}(x) = C(g, K)\, g(h_K(x - \mu)), \quad x \in \mathbb{R}^d, \qquad (2)$$

where $C(g, K)$ is a suitable normalizing constant. Moreover, $K$ is called the contour defining star body of $\varphi_{g,K,\mu}$. We consider the random vector $X$ having the density (2) (in symbols $X \sim \Phi_{g,K,\mu}$). According to Theorem 8 in [12], this random vector has the representation

$$X = \mu + R\,U, \qquad (3)$$

where the star-generalized radius variable $R = h_K(X - \mu)$ and the star-generalized uniform basis vector $U = (X - \mu)/R$ are stochastically independent, $R$ has the density

$$f_R(r) = \frac{1}{I(g, d)}\, r^{d-1} g(r), \quad r \ge 0, \qquad (4)$$

with $I(g, d) = \int_0^\infty r^{d-1} g(r)\,dr$, and $U$ has a star-generalized uniform probability distribution on the boundary of $K$, i.e., $P(U \in A) = O_S(A)/O_S(S)$ for $A \in \mathcal{B}_S$. According to [12], $O_S$ means the star-generalized surface measure, which is a non-Euclidean one unless $S$ is the Euclidean sphere, and which is well defined if Assumption 1 is fulfilled. Note that $O_S(S) = d \cdot \mathrm{vol}(K)$. If $\lim_{r \to 0+0} g(r) > 0$ is finite, then in view of (4), $R$ takes values in the neighbourhood of zero with a rather small probability. This behaviour is called the volcano effect and becomes stronger as the dimension increases. The density (2) may be written as

$$\varphi_{g,K,\mu}(x) = \frac{1}{O_S(S)}\, h_K(x - \mu)^{1-d}\, f_R(h_K(x - \mu)), \quad x \ne \mu. \qquad (5)$$

Estimating such a density may be studied under various assumptions concerning the degree of knowledge on the groups of parameters $K$, $h_K$, $O_S(S)$ and $f_R$, as well as $\mu$. Let $X = (X^{(1)}, \ldots, X^{(d)})^T$ and $U = (U_1, \ldots, U_d)^T$. The next lemma gives helpful information about the mean and the covariances.
Here and in what follows $1\{A\}$ denotes the indicator function of an event $A$.
Lemma 1. If $E R^2 < +\infty$ and $K$ is symmetric w.r.t. the origin, then $EX = \mu$, and $\mathrm{Cov}(X^{(j)}, X^{(k)}) = E(U_j U_k)\, E R^2$ for $j, k = 1, \ldots, d$.
Proof. In view of (3), we have first to show that $EU = 0$. Because of the symmetry of $K$, $U$ has the same distribution as $-U$, and $U_j = 0$ with probability 0 for each $j$. Thus we obtain Moreover, it follows that
The general approach followed here includes non-convex bodies which can occur in applications. Obviously, $h_K(U) = 1$ and the distribution of the random vector $U$ is concentrated on the set $\{u \in \mathbb{R}^d : h_K(u) = 1\}$.
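Lemma 1 can be checked by simulation in the simplest special case. The following Python sketch assumes that representation (3) has the form $X = \mu + R\,U$ with independent $R$ and $U$, takes $K$ to be the Euclidean unit disc in dimension two (so that $U$ is uniform on the unit circle), and uses a Gamma radius law; the radius law, the sample size and all variable names are illustrative assumptions, not choices made in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
mu = np.array([1.0, -2.0])

# Assumed radius law (illustration only): Gamma(shape=3, scale=1), so E R^2 = 12.
R = rng.gamma(shape=3.0, scale=1.0, size=n)

# For the Euclidean unit disc, U is uniformly distributed on the unit circle.
angle = rng.uniform(-np.pi, np.pi, size=n)
U = np.column_stack((np.cos(angle), np.sin(angle)))

X = mu + R[:, None] * U          # X = mu + R * U with R and U independent

mean_hat = X.mean(axis=0)                      # should be close to mu
cov_hat = np.cov(X, rowvar=False)              # sample covariance of X
cov_lemma = (U.T @ U / n) * np.mean(R ** 2)    # empirical E[U_j U_k] * E[R^2]
```

In this setting $E(U_j U_k) = \delta_{jk}/2$, so both `cov_hat` and `cov_lemma` should be close to $6 I_2$.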

A Class of Two-Dimensional Distributions Whose Contour Defining Star Bodies Are Squared Sine Transformed Euclidean Circles
We define $\alpha(u, v) \in (-\pi, \pi]$ to be the angle in radians between the positive x-axis and the line through the point $(u, v)$ and the origin. The Minkowski functional of any two-dimensional star body $K$ can then be written as where $H : (-\pi, \pi] \to (0, \infty)$ is a bounded function. In the following examples we consider two-dimensional star bodies with smooth boundaries; i.e., $H$ is differentiable. Here, the following generator function is used which corresponds to the star-generalized radius density of mixed Erlang type
Example 1. Here we consider the Minkowski functional where $a \in (-1, +\infty)$ is a parameter. Figures 1-4 show the contour lines of the boundary of the body for several values of $a$ and the resulting density for one choice of $a$. These figures show that the distribution class includes densities with convex as well as with non-convex contours.
Example 2. We consider the star body $K$ with Minkowski functional where $a_1$ and $a_2 \in \mathbb{R}$ are parameters such that 1 + a 1 1 + a 2 2 > 0, and This star body arises from a rotation of $K$ in Example 1 by an angle $\alpha_0$ where $a_2 = 1/\tan \alpha_0$. In Figures 5 and 6 the boundaries of (multiples of) $K$ are depicted. A specific motivation for considering a star body $K$ as in Example 2 arises when studying dataset 5 of [38]; see Figure 10 below.

Norm-Contoured Distributions
Specific norm-contoured distributions were studied in several papers which are in part surveyed in Richter [14]. A geometric measure representation of arbitrary norm-contoured distributions is proved in [15]. The class of all norm-contoured distributions is denoted, according to these papers, by NC. The class NC is a subclass of the class StSh of star-shaped distributions. Here, we consider the subclass CNC of continuous norm-contoured distributions.
It is well known that there is a one-to-one correspondence between the class of convex bodies being symmetric w.r.t. the origin, where $x \in K$ implies $-x \in K$, and the class of norms in $\mathbb{R}^d$. If $K$ is any such symmetric convex body then $h_K(x) = \|x\|$ where $\|\cdot\|$ is the uniquely determined norm. On the other hand, if $\|\cdot\|$ is any norm, then $K = \{x : \|x\| \le 1\}$ is the corresponding convex symmetric body having the origin as interior point, and $\|x\| = h_K(x)$.
Throughout this section, let $\|\cdot\|$ be any norm and $K = \{x : \|x\| \le 1\}$, and let the density of a norm-contoured distribution be where $O$ is any orthogonal $d \times d$ matrix. Because any rotated or mirrored norm ball is again a norm ball, we shall restrict our attention to the case of $O$ being the unit matrix and will then write $X \sim CNC(g, \|\cdot\|, \mu)$. In the present situation, $S = \{x : \|x\| = 1\}$ is considered to be the unit sphere in the Minkowski space $(\mathbb{R}^d, \|\cdot\|)$.
In the following we consider several specific cases of norms and the corresponding norm-contoured distributions.
$K$ may then be called an $(a, p_1, \ldots, p_k)$-generalized axis-aligned ellipsoid, and we will say that $X$ follows a grouped $(a, p_1, \ldots, p_k)$-generalized axis-aligned elliptically contoured distribution in $\mathbb{R}^n$.
Example 7. In the case of two-dimensional observations, let $P_n$ denote the polygon having the $n$ vertices $I_{n,i} = \left(\cos\left(\tfrac{2\pi}{n}(i-1)\right), \sin\left(\tfrac{2\pi}{n}(i-1)\right)\right)^T$, $i = 1, \ldots, n$, $n \ge 3$. The convex body which is circumscribed by $P_n$ will be denoted by $K$. Then $h_K$ is a norm defined in $\mathbb{R}^2$ and $\varphi_{g,K,0}$ a polygonally contoured density which was used implicitly in [13] to construct a corresponding geometric generalization of the von Mises density. For the more general class of multivariate polyhedral star-shaped distributions, see [16].

Antinorm-Contoured Distributions
A function $g : \mathbb{R}^n \to [0, \infty)$ which is continuous, positively homogeneous, non-degenerate and superadditive in some fan is called an antinorm in [17]. Thereby, $g$ is called superadditive in a sector $C$ or in the fan $\mathcal{F}$ if it satisfies the reverse triangle inequality in $C$ or in every sector of the fan $\mathcal{F}$, respectively.
Example 9. If the $(a, p)$-functional $|\cdot|_{a,p}$ is defined as the norm in Example 5 but with $p \in (0, 1)$, then it is an antinorm.
For geometric measure representations of elements from a large class of continuous antinorm-contoured distributions we refer to [14,15]. For figures of two-dimensional antinorm balls, see [17].
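The reverse triangle inequality that characterizes an antinorm can be observed numerically. The following Python sketch assumes that the $(a, p)$-functional has the form $|x|_{a,p} = (\sum_i |x_i/a_i|^p)^{1/p}$ (the definition in Example 5 is not reproduced here, so this form is an assumption) and compares $p = 0.5$ with $p = 2$ on the positive quadrant, which is one sector of a fan; the function name and the test data are hypothetical.

```python
import numpy as np

def ap_functional(x, a, p):
    """Assumed (a, p)-functional: (sum_i |x_i / a_i|^p)^(1/p), evaluated row-wise."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    return np.sum(np.abs(x / a) ** p, axis=-1) ** (1.0 / p)

rng = np.random.default_rng(1)
a = np.array([1.0, 2.0])
# Points in the positive quadrant, i.e., within a single sector.
x = rng.uniform(0.0, 5.0, size=(1000, 2))
y = rng.uniform(0.0, 5.0, size=(1000, 2))

def gap(p):
    """f(x + y) - f(x) - f(y); >= 0 means superadditive, <= 0 subadditive."""
    return ap_functional(x + y, a, p) - ap_functional(x, a, p) - ap_functional(y, a, p)

gap_antinorm = gap(0.5)   # p in (0, 1): reverse triangle inequality
gap_norm = gap(2.0)       # p >= 1: ordinary triangle inequality
```

Within the sector, the gaps for $p = 0.5$ are nonnegative and those for $p = 2$ are nonpositive, in line with the distinction between antinorms and norms.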

Continuous Non-Concentric Elliptically Contoured Distributions
− e is a star body having the origin as an interior point, and Moreover, $r_1 K_{a,e} \subset r_2 K_{a,e}$ for $r_1 \le r_2$. A Minkowski functional $h_{K_{a,e}}$ which is homogeneous of degree one will be called a non-concentric elliptically contoured function and $\varphi_{g,K_{a,e},\mu}$ a non-concentric elliptically contoured density. If $O : \mathbb{R}^d \to \mathbb{R}^d$ denotes an arbitrary orthogonal transformation, then $h_{OK_{a,e}}$ is also a non-concentric elliptically contoured function which is homogeneous of degree one. For the special case of $d = 2$, see [12,13].

Parametric Estimators
Let $X_1, \ldots, X_n$ be a sample of independent random vectors, where $X_i \sim \Phi_{g,K,\mu}$ and $X_i = (X_{i1}, \ldots, X_{id})^T$. Assume that the star body $K$ is given and Assumption 1 is satisfied. From now on, we suppose that $K$ is symmetric w.r.t. the origin. We consider a model family $\{f_\theta : \theta \in \Theta_1\}$ of continuously differentiable densities for the star-generalized radius $R$ on $[0, \infty)$, see (4).
the parameter space which is assumed to be compact. Suppose that $h_K(\cdot)$ is a continuous function.
Next we give two reasonable model classes for $f_\theta$: (1) Modified exponential model. $\theta = \tau \in (0, +\infty)$, In this section the aim is to fit the specific parametric model for the density $\varphi_{g,K,\mu}$ to the data by estimating the parameters $\theta$ and $\mu$, where $\varphi_{g,K,\mu}$ is given according to (5) and (4) with $f_R = f_\theta$. Therefore, the two models (1) and (2) fulfill the condition $\lim_{r \to 0+0} g'(r) = 0$, which ensures the differentiability of the density $\varphi_{g,K,\mu}$ at zero.
For the statistical analysis we suppose that the data $X_1, \ldots, X_n$ are given and that these data comprise independent random vectors having density $\varphi_{g,K,\mu}$. Suppose that $\theta$ and $\mu$ are interior points of $\Theta_1$ and $\Theta_2$, respectively. The concentrated log-likelihood function (constant addends can be omitted) reads as follows We introduce the maximum likelihood estimators $\hat\theta_n$, $\hat\mu_n$ of $\theta$ and $\mu$ as joint maximizers of the likelihood function: Under appropriate assumptions, the maximum likelihood estimators are asymptotically normally distributed (cf. Theorem 5.1 in [18], p. 463) where $\xrightarrow{d}$ is the symbol for convergence in distribution and the information matrix is given by $I(\theta, \mu) = (I_{ij}(\delta))_{i,j=1,\ldots,d+q}$ with $\delta^T = (\delta_1, \ldots, \delta_{d+q}) = (\theta^T, \mu^T)$ and

Nonparametric Estimators without Scale Fit
In the present section we deal with nonparametric estimators in the context of star-shaped distributions.This type of estimators is of special interest if no suitable parametric model can be found.The cdf of R will be denoted by F R .

Estimating µ and F R
Let $X_1, \ldots, X_n$ be the sample as in Section 3.1. In the following the focus is on the estimation of the parameter $\mu$ and the distribution function of the generalized radius $R$.
First we choose an estimator for $\mu$. For this purpose we assume that $E|X| < +\infty$.
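The sample mean

$$\hat\mu_n = \frac{1}{n} \sum_{i=1}^n X_i \qquad (7)$$

is an unbiased estimator for the unknown parameter $\mu$.
With $\hat R_i = h_K(X_i - \hat\mu_n)$, the cdf $F_R$ is estimated by

$$\hat F_n^R(r) = \frac{1}{n} \sum_{i=1}^n 1\{\hat R_i \le r\} \qquad (8)$$

for $r \ge 0$. At first glance, $\hat F_n^R(r)$ just approximates the empirical distribution function of the radii, which is not available from the data because $\mu$ is unknown. We can prove that $\hat F_n^R$ converges to $F_R$ a.s., in fact at the same rate as every common empirical distribution function converges to the cdf. This is the assertion of the following theorem.
Here the condition (9) ensures that $E R^2 < +\infty$, which in turn is an assumption for the law of the iterated logarithm of $\hat\mu_n$.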
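The two estimation steps of this subsection can be sketched in Python as follows. The sketch assumes that the estimator of $\mu$ is the sample mean and that the estimated radius cdf is the empirical distribution function of the estimated radii $h_K(X_i - \hat\mu_n)$; the function name `radius_ecdf`, the Euclidean choice of $h_K$ and the Gamma radius law in the example are illustrative assumptions.

```python
import numpy as np

def radius_ecdf(X, h_K):
    """Return the sample mean and the empirical cdf of the estimated radii.

    `h_K` is a user-supplied Minkowski functional (hypothetical callable
    mapping a d-vector to a nonnegative number).
    """
    X = np.asarray(X, dtype=float)
    mu_hat = X.mean(axis=0)
    R_hat = np.sort(np.array([h_K(x - mu_hat) for x in X]))

    def F_hat(r):
        # Fraction of estimated radii that are <= r (r may be an array).
        return np.searchsorted(R_hat, r, side="right") / R_hat.size

    return mu_hat, F_hat

# Illustration with the Euclidean norm and a Gamma(2, 1) radius (assumptions).
rng = np.random.default_rng(2)
n = 20_000
R = rng.gamma(shape=2.0, scale=1.0, size=n)
angle = rng.uniform(-np.pi, np.pi, size=n)
X = np.array([0.5, 0.5]) + R[:, None] * np.column_stack((np.cos(angle), np.sin(angle)))

mu_hat, F_hat = radius_ecdf(X, np.linalg.norm)
grid = np.linspace(0.0, 10.0, 201)
F_true = 1.0 - np.exp(-grid) * (1.0 + grid)     # cdf of the Gamma(2, 1) law
max_dev = np.max(np.abs(F_hat(grid) - F_true))  # sup-distance on the grid
```

The quantity `max_dev` gives an impression of how close the estimated cdf is to $F_R$ even though $\mu$ has been replaced by its estimate.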

Density Estimation
In the remainder of Section 3.2, we establish an estimator for the density $\varphi_{g,K,\mu}$ in the case of a bounded generator function $g$, and provide statements on convergence properties of the estimator. An estimator for $\mu$ is available by Formula (7); the estimation of $g$ remains to be addressed. If we want to estimate $g$, then it is necessary that this function is identifiable. In (2), however, the function $g$ is determined only up to a constant factor. Therefore, we require $I(g, d) = 1$ to obtain uniqueness and identifiability. As a consequence, we get, according to [12], $C(g, K) = 1/O_S(S)$.
In the following we adapt the approach introduced in Section 2 of [7] to the present much more general situation. This approach combines the advantages of two estimators and avoids their disadvantages. Let $\psi : [0, \infty) \to [0, +\infty)$ be a function having a derivative $\psi'$ with $\psi'(y) > 0$ for $y \ge 0$, and the property $\psi(0) = 0$. We introduce the random variable $Y = \psi(h_K(X - \mu))$ and denote the inverse function of $\psi$ by $\Psi$. The transformation using $\psi$ is applied to adjust for the volcano effect described above. In view of (4), the density $\chi$ of $Y = \psi(R)$ is given by

$$\chi(y) = \Psi'(y)\, f_R(\Psi(y)) = \Psi'(y)\, \Psi(y)^{d-1}\, g(\Psi(y))$$

for $y \ge 0$. This equation implies the following formula for $g$:

$$g(z) = z^{1-d}\, \psi'(z)\, \chi(\psi(z)).$$

The next step is to establish the estimator for $\chi$. Nonparametric estimators have the advantage that they are flexible and there is no need to assume a specific model. Let us consider the transformed sample $Y_{1n}, \ldots, Y_{nn}$ with $Y_{in} = \psi(\hat R_i)$. Further we apply the following kernel density estimator for $\chi$:

$$\hat\chi_n(y) = \frac{1}{nb} \sum_{i=1}^n \left( k\left(\frac{y - Y_{in}}{b}\right) + k\left(\frac{y + Y_{in}}{b}\right) \right), \quad y \ge 0, \qquad (10)$$

where $b = b(n)$ is the bandwidth and $k$ the kernel function. Note that $\hat\chi_n$ represents the usual kernel density estimator for $\chi$ based upon the $Y_{in}$'s and including a boundary correction at zero (the second addend in the outer parentheses of (10)). The mirror rule is used as a simple boundary correction. Other, more elegant corrections can be applied at the price of a higher technical effort. The properties of $\hat\chi_n$ are essentially influenced by the bandwidth $b$. Since the kernel estimator shows reasonable properties only in the case of bounded $\chi$, we have to guarantee by suitable assumptions that $\lim_{z \to 0+} z^{1-d} \psi'(z) > 0$ in order to get the boundedness of $\chi$ (see below). On the basis of $\hat\chi_n$, we can establish the following estimator for $\varphi_{g,K,\mu}$:

$$\hat g_n(z) = z^{1-d}\, \psi'(z)\, \hat\chi_n(\psi(z)), \qquad \hat\varphi_n(x) = \frac{1}{O_S(S)}\, \hat g_n(h_K(x - \hat\mu_n)). \qquad (11)$$

This approach has the property that the theory of kernel density estimators applies here (cf. [19]). Kernel estimators are a very popular type of nonparametric density estimator because of their comparatively simple structure. In the literature the reader can find many suggestions concerning the choice of the bandwidth.
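The kernel step with the mirror rule can be sketched in Python as follows. The sketch assumes that the estimator has the usual reflected form, i.e., each observation $Y_{in}$ contributes $k((y - Y_{in})/b) + k((y + Y_{in})/b)$, scaled by $1/(nb)$; the Epanechnikov kernel, the bandwidth, the exponential test sample and all function names are illustrative assumptions.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel 0.75 * (1 - t^2) on [-1, 1]."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t ** 2), 0.0)

def mirror_kde(y, sample, b, kernel=epanechnikov):
    """Kernel density estimate on [0, inf) with reflection at zero."""
    y = np.atleast_1d(np.asarray(y, dtype=float))[:, None]
    s = np.asarray(sample, dtype=float)[None, :]
    # First addend: ordinary kernel term; second addend: mirrored term.
    terms = kernel((y - s) / b) + kernel((y + s) / b)
    return terms.sum(axis=1) / (s.shape[1] * b)

rng = np.random.default_rng(3)
sample = rng.exponential(scale=1.0, size=500)   # nonnegative test data
b = 0.3
step = 0.002
grid = np.arange(0.0, sample.max() + b, step) + step / 2   # cell midpoints
chi_hat = mirror_kde(grid, sample, b)
total_mass = chi_hat.sum() * step   # midpoint rule for the integral over [0, inf)
```

The reflection moves the kernel mass that would otherwise fall on the negative half-line back to $[0, \infty)$, so the estimate remains a density on the support of $Y$; `total_mass` is therefore close to one.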
A few remarks on the comparison between this paper and [7] are in order. Although the main idea for the construction of estimators is the same, there is a difference in the definition of the generator functions (say $g$ and $g_L$). Considering the special case $h_K(x) = \|\Sigma^{-1/2}(x - \mu)\|$, the identity $g(t) = g_L(t^2)$ can be established for $t \ge 0$. This causes some changes in the formulas. For more details in a particular case, see Section 3 in [20].
for even j : 0 < j < p, where p ≥ 2 is an integer.
Note that continuity of the derivative at an enclosed boundary point means that the one-sided derivative exists and is the limit of the derivatives in a neighbourhood of this point. Symmetric kernel functions $k$ satisfying (12) are called kernels of order $p$. Assumption 2 ensures that the bias of the density estimator $\hat\chi_n$ converges to zero at a certain rate. Under Assumption 2 with $p = 2$ and $k(t) \ge 0$, the estimator $\hat\chi_n$ is indeed a density. The case $p > 2$ is added to complete the presentation and is of minor practical importance unless we have a very large sample size (cf. the discussion in [21]). From the asymptotic theory for density estimators, it is known that the Epanechnikov kernel is an optimal kernel of order 2 (i.e., in the case $p = 2$ in Assumption 2) with respect to the asymptotic mean square error (cf. [19]). This kernel function is simple in structure and leads to fast computations. The consideration of optimal kernels can be extended to higher-order kernels. It turns out that their use is advantageous only in the case of sufficiently large sample sizes (for instance, for a size greater than 1000).
Assumption 3. The $(p + 1)$-th order derivative of $\Psi^d$ exists and is continuous on $[0, \infty)$, ψ is positive and bounded on $(0, +\infty)$ for some integer $p \ge 2$, and ψ is bounded on $(0, +\infty)$. The functions z
Notice that in Assumption 3 we require that the right-sided limit of the $(p + 1)$-th order derivative of $\Psi^d$ is finite at zero. Hence Assumption 3 implies that and lim with a finite constant $C_2 > 0$. On the other hand, it follows from (13) that lim with a finite constant $C_3 > 0$. Therefore, $\chi$ is bounded under Assumption 3.
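The moment properties behind the notion of kernel order can be verified exactly for the Epanechnikov kernel $k(t) = \tfrac34 (1 - t^2)$ on $[-1, 1]$. The following Python sketch integrates $t^j k(t)$ in closed form using NumPy's polynomial class; the helper name `kernel_moment` is hypothetical, and the reading of "order 2" as "unit mass, vanishing first moment, non-vanishing second moment" is the usual convention rather than a quotation of condition (12).

```python
from numpy.polynomial import Polynomial

# Epanechnikov kernel on [-1, 1] as a polynomial in t: 0.75 - 0.75 t^2.
k = Polynomial([0.75, 0.0, -0.75])

def kernel_moment(j):
    """Exact value of the integral of t^j * k(t) over [-1, 1]."""
    integrand = Polynomial([0.0] * j + [1.0]) * k   # t^j * k(t)
    antiderivative = integrand.integ()
    return antiderivative(1.0) - antiderivative(-1.0)

moments = [kernel_moment(j) for j in range(4)]
```

The zeroth moment equals one, the odd moments vanish by symmetry, and the second moment equals $1/5$, so the bias of the kernel estimator is driven by the term of order $b^2$.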
Example 10. Let Hence, Assumption 3 is satisfied for every $p \ge 2$.
Another condition is now required for $h_K$.
Assumption 4. For any bounded subset $Q$ of $\mathbb{R}^d$, $0 \notin Q$, the partial derivatives $G_1, \ldots, G_d$ of $h_K(\cdot)$ exist and are bounded on $Q$, and $x \mapsto \psi'(h_K(x)) G_j(x)$ is Hölder continuous of order $\alpha > 0.2$ on $Q$ for each $j \in \{1, \ldots, d\}$.

Assumption 5. For any bounded subset
If Assumption 3 is fulfilled and the function $h_K$ has second order derivatives which are bounded on bounded subsets of $\mathbb{R}^d$, then Assumption 4 is satisfied.
Example 11. We consider the q-norm/antinorm: h Therefore, Assumption 4 is fulfilled in the case $q > 1.2$, and Assumption 5 is fulfilled in the case $q > 0.2$.

Properties of the Density Estimator
First we provide the result on strong convergence of the density estimator.
Theorem 2. Suppose that the $p$-th order derivative $g^{(p)}$ of $g$ exists and is bounded on $[0, \infty)$ for some even integer $p \ge 2$. Moreover, assume that condition (9) as well as Assumptions 1 to 3 are satisfied for the given $p$. Let Assumption 4 or Assumption 5 be satisfied. In the first case define $r_n := \sqrt{\ln n}\,(nb)^{-1/2}$, in the latter case $r_n := n^{-\bar\alpha/2} b^{-1}$. Then, for any compact set $D$ with $\mu \notin D$ and $n \to \infty$, For any compact set $D$ with $\mu \in D$ and $n \to \infty$,
Theorem 2 applies in particular to the Euclidean case $h_K = \|\cdot\|_2$. Since Assumption 2 is weaker than the corresponding assumption on the kernel in [7], Theorem 2 extends Theorem 3.1 in [7] even in the case of $h_K = \|\cdot\|_2$. The convergence rate in (15) is the same as that known for one-dimensional kernel density estimators and cannot be improved under the assumptions posed here (cf. [22]).
The next theorem represents the result about the asymptotic normality of the estimator $\hat\varphi_n$.
Theorem 3. Suppose that the assumptions of Theorem 2 and Assumption 4 are satisfied.
where $e_n = \Lambda(x)\, b^p + o(b^p)$, (ii) If additionally $\lim_{n \to \infty} n^{1/(2p+1)} b = C_4$ holds true with a constant $C_4 \ge 0$, then, for $n \to \infty$,
The assertion of Theorem 3 can be used to construct an asymptotic confidence region for $\varphi_{g,K,\mu}(x)$. The term $e_n$ describes the asymptotic behaviour of the bias of the estimator $\hat\varphi_n$, whereas the fluctuations of the estimator are represented by $Z_n$. In view of Theorem 3, $\sqrt{nb}\, Z_n$ converges in distribution to $Z \sim N(0, \sigma^2(x))$. The mean squared deviation of the leading terms in the asymptotic expansion of $\hat\varphi_n$ is thus given by The minimization of this function w.r.t. $b$ leads to the asymptotically optimal bandwidth The bandwidth $b^*$ converges at rate $n^{-1/(2p+1)}$ to zero. Under the conditions of Theorem 3(ii), $\hat\varphi_n(x) - \varphi_{g,K,\mu}(x)$ has the convergence rate $n^{-p/(2p+1)}$. This convergence rate of $\hat\varphi_n$ is better than that of a fully nonparametric estimator of a $d$-dimensional density with $d > 1$, but slower than the usual rate $n^{-1/2}$ for parametric estimators. In principle, Formula (16) could be used for the optimal choice of the bandwidth. However, one would then need an estimator for $\chi^{(p)}$ and, typically, estimators of derivatives of densities do not exhibit a good performance unless $n$ is very large. As a remedy, one can consider a bandwidth which makes reference to a specific radius distribution.
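The bias-variance trade-off behind the optimal bandwidth can be made concrete with a small computation. The following Python sketch assumes that the leading terms of the mean squared deviation have the generic form $\Lambda^2 b^{2p} + \sigma^2/(nb)$, which is what a bias of order $\Lambda b^p$ and a variance of order $\sigma^2/(nb)$ suggest; the function names and the numerical values of $\Lambda$ and $\sigma^2$ are hypothetical, and the closed-form minimizer is derived from this assumed form only.

```python
def leading_mse(b, n, p, lam, sigma2):
    """Assumed leading terms: squared bias lam^2 * b^(2p) plus variance sigma2 / (n b)."""
    return lam ** 2 * b ** (2 * p) + sigma2 / (n * b)

def optimal_bandwidth(n, p, lam, sigma2):
    """Minimizer of leading_mse in b: (sigma2 / (2 p lam^2 n))^(1 / (2p + 1))."""
    return (sigma2 / (2.0 * p * lam ** 2 * n)) ** (1.0 / (2 * p + 1))

p, lam, sigma2 = 2, 1.5, 0.8        # illustrative values only
n = 10_000
b_star = optimal_bandwidth(n, p, lam, sigma2)

# Multiplying n by 32 halves the bandwidth when p = 2, since 32^(-1/5) = 1/2.
ratio = optimal_bandwidth(32 * n, p, lam, sigma2) / b_star
```

With $p = 2$ the bandwidth decays like $n^{-1/5}$ and the estimation error like $n^{-2/5}$, which lies between the rate of a fully nonparametric multivariate estimator and the parametric rate $n^{-1/2}$.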
To illustrate how the estimators work in practice, we simulated data from a q-norm distribution with $q = 1.3$ and with the modified exponential distribution with $\tau = 1.1$ as radius distribution. Figures 7 and 8 include graphs of the underlying function $g$ and its estimator in two cases.
This formula was generated using the computer algebra system Mathematica. The parameter $\tau$ can be estimated by utilizing the above Formula (6) for the expectation of the radius.

Semiparametric Estimators Involving a Scale and a Parameter Fit
In this section we consider the situation where the contour of the body $K$ depends on scale parameters $\sigma_1, \ldots, \sigma_d$. Suppose that $I(g, d) = 1$. We introduce the diagonal matrix $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_d)$ and a master body $K_0$, which is symmetric w.r.t. the origin. Define $K = \Sigma K_0 := \{\Sigma x : x \in K_0\}$, and $\tilde U = \Sigma^{-1} U$. The distribution of $\tilde U$ is concentrated on the boundary $S_0$ of $K_0$. We assume that $K_0$ is given such that Otherwise, $K_0$ is rescaled. Suppose that $h_{K_0}$ depends on a further parameter vector $\theta \in \Theta$ where the parameter space $\Theta \subset \mathbb{R}^q$ is a compact set. Then The parameter vector $\theta$ describes the shape of the boundary of the body $K$, see Examples 1 and 2 (parameters $a_1$ and $a_2$). From Lemma 1, we obtain and Here we see that (17) results in $V(X^{(j)}) = \sigma_j^2\, E R^2$. The density is given by In this context, a scaling problem occurs concerning $g$. Assume that $g$ is a suitably given generator function satisfying $I(g, d) = 1$. Then $x \mapsto g_t^*(x) := t^{d-1} g(tx)$ is a modified generator for every $t \in \mathbb{R}$ with $I(g_t^*, d) = 1$. For any $t \in \mathbb{R}$, we obtain the same model when $g$ is replaced by $g_t^*$ and $\sigma_j$ is replaced by $\sigma_j t$ for $j = 1, \ldots, d$. To get uniqueness, we choose $t$ such that Let $\hat\mu_n = (\hat\mu_{n1}, \ldots, \hat\mu_{nd})^T$ as above. Then $\sigma_j^2$ represents the variance of the $j$-th component of $X$. Based on this property, the sample variances of the components of $X$ can be used as estimators for $\sigma_j^2$: Moreover, we have the sample correlations In the following we use the notation $\hat\Sigma_n = \mathrm{diag}(\hat\sigma_{n1}, \ldots, \hat\sigma_{nd})$. If $\theta$ is unknown, we consider moment estimators based on the correlations. For this we need the following assumptions.
Assumption 6. Let $I$ be a subset of $\{(j, k) : j, k = 1, \ldots, d,\ j < k\}$ with cardinality $q$. There is a vector $\rho = (\rho_{jk})_{(j,k) \in I} \in \mathbb{R}^q$ such that for $l = 1, \ldots, q$, $\theta_l = \gamma_l(\rho)$.

Assumption 8. For any bounded subset
Examples 1 and 2: (Continued) Similarly as above, it can be proven that Assumption 7 is fulfilled.
Let $\hat\rho$ be the sample version of $\rho$. Then $\hat\theta_{nl} = \gamma_l(\hat\rho)$ for $l = 1, \ldots, q$ is the estimator for $\theta$, $\hat\theta_n = (\hat\theta_{nl})_{l=1,\ldots,q}$. Define $\hat R_i = \hat\Sigma_n^{-1}(X_i - \hat\mu_n)$. With this definition, $\hat F_n^R$ is determined according to Formula (8). The following result on the convergence rate of $\hat F_n^R$ can be proven:
Theorem 4. Suppose that Assumptions 1 and 6 are satisfied, and Let $r \mapsto r f_R(r)$ be bounded on $[0, +\infty)$. Then, for $n \to \infty$,
In this section the transformed sample $Y_{1n}, \ldots, Y_{nn}$ is given by $Y_{in} = \psi(h_{K_0}(\hat\theta_n, \hat R_i))$ with $\psi$ as in Section 3.2. The estimator $\hat g_n$ for the generator $g$ is calculated using Formulas (10) and (11) from the previous section. The following estimator for the density has thus been established: The next two theorems show the results concerning strong convergence and asymptotic normality of the density estimator:
Theorem 5. Suppose that the $p$-th order derivative $g^{(p)}$ of $g$ exists and is bounded on $[0, \infty)$ for some even integer $p \ge 2$. Where needed, with this $p$, assume further that Assumptions 1, 2, 3, 6, (1) and (18) are satisfied. Let Assumption 7 or Assumption 8 be satisfied, and define in the first case $r_n := \sqrt{\ln n}\,(nb)^{-1/2}$ and in the latter case $r_n := n^{-\bar\alpha/2} b^{-1}$. Then the claim of Theorem 2 holds true for the estimator $\hat\varphi_n$ defined in (19).
Theorem 6. Suppose that the assumptions of Theorem 5 are satisfied. Let where $\hat\varphi_n$ is defined in (19), The remarks following Theorems 2 and 3 are valid similarly.
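The moment-based scale fit described in this section can be sketched in Python as follows. The sketch computes the sample mean, the componentwise sample standard deviations, the sample correlation matrix and the standardized residuals $\hat\Sigma_n^{-1}(X_i - \hat\mu_n)$; the function name `scale_fit`, the use of the $1/n$ normalisation in the variance and the simulated data are assumptions made for illustration, and the mapping $\gamma_l$ from correlations to $\theta$ is model-specific and therefore not included.

```python
import numpy as np

def scale_fit(X):
    """Moment estimators for location, scales and correlations of a sample X (n x d)."""
    X = np.asarray(X, dtype=float)
    mu_hat = X.mean(axis=0)
    sigma_hat = X.std(axis=0)               # 1/n normalisation (an assumption)
    rho_hat = np.corrcoef(X, rowvar=False)  # sample correlation matrix
    # Rows are Sigma_hat^{-1} (X_i - mu_hat) for the diagonal matrix Sigma_hat.
    residuals = (X - mu_hat) / sigma_hat
    return mu_hat, sigma_hat, rho_hat, residuals

rng = np.random.default_rng(4)
X = rng.normal(loc=[1.0, -1.0], scale=[2.0, 0.5], size=(5000, 2))
mu_hat, sigma_hat, rho_hat, residuals = scale_fit(X)
```

The standardized residuals are the input for the subsequent steps: applying $h_{K_0}(\hat\theta_n, \cdot)$ and $\psi$ to them yields the transformed sample used in the kernel estimator.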

Applications
For many decades, statistical applications of multivariate distribution theory were mainly based upon Gaussian and elliptically contoured distributions. Studies using non-elliptically contoured star-shaped distributions were basically made during the last two decades and deal in most cases with p-generalized normal distributions. Such distributions are convex or radially concave contoured if $p \ge 1$ or $0 < p \le 1$, respectively, and are also called power exponential distributions. Moreover, common elliptically contoured power exponential (ecpe) distributions form a particular class within the wide class of star-shaped distributions, which allows modeling of much more flexible contours than elliptical ones.
The class of ecpe distributions is used in a crossover trial on insulin applied to rabbits in [23], in image denoising in [24] and in colour texture retrieval in [25]. Applications of multivariate g-and-h distributions to jointly modeling body mass index and lean body mass are demonstrated in [26] and accompanied by star-shaped contoured density illustrations. The $l_{n,p}$-elliptically contoured distributions form another large class of star-shaped distributions and are used in [27] to explore to which extent orientation selectivity and contrast gain control can be used to model the statistics of natural images. Mixtures of ecpe distributions are considered for bioinformation data sets in [28]. Texture retrieval using the p-generalized Gaussian densities is studied in [29]. A random vector modeling data from quantitative genetics presented in [30] is shown in [31] to be more likely to have a power exponential distribution different from a normal one. The reconstruction of the signal induced by cosmic strings in the cosmic microwave background, from radio-interferometric data, is made in [32] based upon generalized Gaussian distributions. These distributions are also used in [33] for voice detection.
More recently, the considerations in [11] opened a new field of financial applications of more general star-shaped asymptotic distributions, where suitably scaled sample clouds converge onto a deterministic set.
Figure 3 in [34] represents a sample cloud which might be modeled with a density being star-shaped w.r.t. a fan having six cones that include sample points and other cones that do not.Note, however, that Figure 1 d-f in the same paper do not reflect a homogeneous density but might be compared in some sense to the level sets of the characteristic functions of certain polyhedral star-shaped distributions in [16], Figure 5.2.
When modeling lymphoma data, [35] analyze sample clouds of points, see Figures 2 and 3, which might be interpreted as mixtures of densities having contours that in part resemble those in [36], where flow cytometric data, Australian Institute of Sport data and Iris data are analyzed, or those of certain skewed densities as they were (analytically derived and) drawn in [37]. In a similar manner, Figures 2 and 5 in [20] indicate that mixtures of different types of star-shaped distributions might be suitable for modeling residuals of certain stock exchange indices. It could be of interest in future work to study possible connections between the underlying models more closely.
The following numerical examples of the present section are intended to illustrate the agricultural and financial application of the estimators described in this paper. To this end, we make use of the new particular non-elliptically contoured but star-shaped distribution class introduced in Section 2.2 of the present paper. Figure 9 shows the dependence of the correlation on the parameter $a_1$.

Figure 9. Function γ_1^{−1} with a_1 on the x-axis, ρ on the y-axis; γ_1 is defined in Assumption 6.
The data and the shape of the estimated multivariate density are depicted in Figures 10 and 11.

Example 13. We want to illustrate the potential of our approach for applications to financial data and consider daily index data from Morgan Stanley Capital International (MSCI) for the countries Germany and UK for the period August 2011 to June 2016. The data indicate the continuous daily return values computed as the logarithm of the ratio of two subsequent index values. The modelling of MSCI data using elliptical models is considered in [5]. The data are depicted in Figure 12. A visual inspection seems to give some preference for our model from Section 2.2 compared to the elliptically contoured model. Figures 13 and 14

Next, we proceed with proving the results.
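Returning to Example 13, the continuous daily returns used there are the logarithms of ratios of subsequent index values. The following is a minimal sketch of that computation; the function name `log_returns` and the index levels are hypothetical and serve only as an illustration (they are not MSCI data).

```python
import math

def log_returns(index_values):
    """Continuous returns r_t = ln(P_t / P_{t-1}) for consecutive index values."""
    if any(v <= 0 for v in index_values):
        raise ValueError("index values must be positive")
    return [math.log(curr / prev)
            for prev, curr in zip(index_values, index_values[1:])]

# Hypothetical index levels for two countries (illustration only).
germany = [100.0, 101.0, 99.5, 100.2]
uk = [200.0, 199.0, 201.5, 202.0]

# Bivariate sample of daily return pairs, one pair per trading day.
pairs = list(zip(log_returns(germany), log_returns(uk)))
```

Each element of `pairs` corresponds to one point of a scatter plot such as the one of the return data.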

Proofs
Throughout the remainder of the paper, suppose that Assumptions 1-3 are satisfied for some integer p ≥ 2. First, we prove auxiliary statements which are used in the proof of strong convergence of φn and later.

Proof of Auxiliary Statements
The following Lemma 2 clarifies the asymptotic behaviour of χ in the neighbourhood of zero.

Lemma 2. Suppose that g exists and is bounded. Then

Proof. Observe that by the Lipschitz continuity of the functions g and z uniformly for t, v ∈ [0, M]. Here and in the following, C is a generic constant which may differ from formula to formula. By assumption, we have On the other hand, sup t,u≥0 In view of (20), the proof is complete.
Since there is a constant C > 0 such that ‖y‖_2 ≤ C ‖y‖_∞ for all y ∈ R^d in view of the norm equivalence property, the first part of the lemma follows from (21). The second part can be shown similarly.
In several places, we will use the following property: which proves the lemma.

Proving Convergence of FR n
In this section we prove Theorem 1. The law of the iterated logarithm for the empirical process says (cf. [39], p. 268, for example) By Lipschitz continuity of h_K and Lemma 3, Hence, by the boundedness of f_R, which leads to the theorem.

Proving Strong Convergence of the Density Estimator
for y ≥ 0. Then (cf. (10))

Next we prove strong convergence rates for χn and later for φn. Throughout this section we suppose that Assumptions 1 to 3 and (9) are fulfilled for some even integer p ≥

The asymptotic behaviour of the right-hand side in (22) as n → ∞ is analyzed term by term in the next lemmas.
Lemma 5. Assume that the p-th order derivative χ^(p) of χ exists for some even integer p ≥ 2 and is bounded on every finite closed subinterval of (0, ∞). Let g be bounded. Then where

The proof of this lemma is omitted since, with minor changes, it can be carried out in the same way as the proof of Lemma 4.4 in [7]. The following lemma is used several times later in proofs of almost sure convergence rates. We state it without proof, since the proof is almost identical to that of Lemma 4.6 in [7].

Lemma 6. Assume that χ is bounded. Let k, λ : R → R be bounded measurable functions with k(t) = 0 for t where

We proceed with proving convergence rates of the terms in (22).
with a suitable constant C_5 > 0 for n ≥ n_1(ω). We introduce Let ψ(z) := z −1 ψ (z). Observe that k is bounded and Lipschitz continuous on [−1, 1], ψ , ψ and ψ are bounded on [0, +∞), the functions G_j are bounded, and the functions ψ (h_K(·))G_j are Hölder continuous of order α > 0.2. By Taylor expansion, we then have where where Note that since the expectation in the last term is zero in view of Lemma 4 (G_j(−x) = −G_j(x) holds for all x ∈ R^d). Applying Lemma 6, it follows that On the other hand, we obtain by utilizing Lemma 6 and taking α > 0.2 into account. Similarly, it follows that Therefore, an application of (23)-(26) leads to the lemma under Assumption 4.

(b) Let Assumption 5 be satisfied. We obtain Further, by Lipschitz continuity of k, which proves assertion (a). Analogously, the validity of assertion (b) can be shown.

In view of (22), the lemma follows by Lemmas 5 and 7.
We are now in a position to prove the result on strong convergence of φn .

Proof of Theorem 2:
for n ≥ n_3(ω). Lemma 8 applies to complete the proof of part (i).
(ii) Case µ ∈ D. The proof can be done analogously to part (i) taking m 0 = 0 into account.

Proving Asymptotic Normality of φn (x)
Throughout this subsection, assume that Assumptions 1-3 and (9) are fulfilled for some integer p ≥ 2. First, an auxiliary result is proven. Define x := h_K(x − μn) and x := h_K(x − µ).

Lemma 9. Under Assumption 4, we have

Proof. Let U_l and u_l be as in Section 4.
which immediately yields assertion (a). Since by Lemma 3, we obtain the inequality by Taylor expansion, where Analogously to Lemma 6, we can deduce since the expectation is zero due to Lemma 4. Analogously to the examination of B_{3n} and B_{4n} in Lemma 7, we obtain In view of (27), the proof of part (b) is complete.
From kernel density estimation theory, we can take the following lemma, see [40]. Subsequently, we prove asymptotic normality of φn.

Lemma 10. Suppose that χ is continuous at y. Then

Proof of Theorem 3. Note that z ↦ z^{1−d} ψ′(z) has a bounded derivative on every finite subinterval of (0, ∞). By Lemmas 3 and 9, we obtain where Using Lemma 10, we have By Taylor expansion, we obtain and xn lies between ψ( x) − tb and ψ( x). This completes the proof.

Proofs When Additional Scale Fit Is Involved
When proving Theorem 4 we shall make use of the following lemma.

Proof. By the law of the iterated logarithm and Lemma 3, we obtain +EX j X k σnj σnk σ j σ k

We proceed with proving Theorem 5. The next two lemmas are used in the proof of this theorem. Here we define Y_{in} = ψ(h_{K_0}(θ_n, Σ_n^{−1}(X_i − μ_n))) and Ỹ_i = ψ(h_{K_0}(θ, Σ^{−1}(X_i − µ))). Notice that Ỹ_i has density χ, and Lemmas 5 and 6 hold true for the modified Y_{in} and Ỹ_i, too.

Proof. We prove the lemma under Assumption 7; the proof of the other part is analogous to that of Lemma 7(b). As we see later, we only need to involve data vectors X_i with Y_{in} = ψ(h_{K_0}(θ_n, Σ_n^{−1}(X_i − μ_n))) ≤ M + b or Ỹ_i = ψ(h_{K_0}(θ, Σ^{−1}(X_i − µ))) ≤ M + b. This implies min{‖X_i − μ_n‖, ‖X_i − µ‖} ≤ CΨ(M + b) by (1), and therefore ‖X_i − µ‖ ≤ C_5 for n ≥ n_6(ω) with a constant C_5 > 0. In view of Lemma 3 and by Assumption 7, we obtain • ln ln n n =: w_n (X_i = (X_{i1}, . . ., X_{id})^T) with a suitable constant C_6 > 0 for n ≥ n_7(ω). We introduce This completes the proof.

Figure 6. Contour plot of the density ϕ_{g,K,µ} for a_1 = 3 and levels as in Figure 4.

Example 8. Given a homogeneous polynomial p of degree k with p(|x_1|, . . ., |x_d|) ≥ 0, the function N(x) := (p(|x_1|, . . ., |x_d|))^{1/k} defines a norm in R^d if it is subadditive. An example for a homogeneous polynomial of degree 3 and d
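The construction N(x) = (p(|x_1|, . . ., |x_d|))^{1/k} can be explored numerically. The sketch below is only an illustration under stated assumptions: the polynomial `cubic` (u_1^3 + u_2^3, which yields the 3-norm in d = 2) and the helper names are hypothetical and are not taken from the example above. Note that a random search can only fail to find a violation of subadditivity; it does not prove that N is a norm.

```python
import random

def poly_norm(x, p, k):
    """N(x) = p(|x_1|, ..., |x_d|)^(1/k) for a homogeneous polynomial p of degree k."""
    return p([abs(c) for c in x]) ** (1.0 / k)

def cubic(u):
    # Hypothetical polynomial of degree 3 in d = 2: u1^3 + u2^3 (gives the 3-norm).
    return u[0] ** 3 + u[1] ** 3

def violates_subadditivity(p, k, d, trials=10000, seed=0, tol=1e-12):
    """Random search for x, y with N(x + y) > N(x) + N(y).

    Returns a counterexample pair (x, y), or None if none was found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        s = [a + b for a, b in zip(x, y)]
        if poly_norm(s, p, k) > poly_norm(x, p, k) + poly_norm(y, p, k) + tol:
            return x, y
    return None

# No violation is expected here, since the 3-norm is subadditive (Minkowski inequality).
counterexample = violates_subadditivity(cubic, k=3, d=2)
```

For a candidate polynomial whose subadditivity is unknown, a returned counterexample shows that N is not a norm, whereas `None` is merely consistent with N being one.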

3.2.3. Assumptions Ensuring Convergence Properties of Estimators

Next we provide the assumptions for the theorems below. Assumption 2 concerns the parameter b = b(n) and the function k of the kernel estimator, whereas Assumption 3 is imposed on the function ψ.

Assumption 2. (a) We assume that lim_{n→∞} b ln ln n = 0 and b̄ ≥ b ≥ C_1 · n^{−1/5} with constants b̄, C_1 > 0. (b) Suppose that the kernel function k : R → R is continuous and vanishes outside the interval [−1, 1], and has a Lipschitz continuous derivative on [−1, 1]. Moreover, assume that k(

Figure 7. Estimator of g (solid line) and the model function (dashed line) for n = 1000.

Figure 8. Estimator of g (solid line) and the model function (dashed line) for n = 10,000.

3.2.5. Reference Bandwidth

Let us consider an estimator φn (x) with Epanechnikov kernel, function ψ as in Example 10, and modified exponential radius density in the case p = 2. According to (16), the reference bandwidth is then b
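To make the role of the Epanechnikov kernel and the bandwidth b concrete, the following is a minimal sketch of a plain one-dimensional kernel density estimate. It is an assumption-laden illustration, not the estimator φn of this paper: the transformation ψ, the estimated parameters and the reference bandwidth from (16) are not implemented, and the function names are hypothetical.

```python
def epanechnikov(t):
    """Epanechnikov kernel k(t) = 0.75 * (1 - t^2) on [-1, 1], zero outside."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def kernel_density(y, sample, b):
    """Plain kernel estimate (1 / (n b)) * sum_i k((y - Y_i) / b) at the point y."""
    if b <= 0:
        raise ValueError("bandwidth b must be positive")
    n = len(sample)
    return sum(epanechnikov((y - yi) / b) for yi in sample) / (n * b)

# Hypothetical one-dimensional sample and bandwidth (illustration only).
sample = [0.0, 0.5]
estimate_at_quarter = kernel_density(0.25, sample, b=0.5)
```

A smaller b makes the estimate follow the sample more closely, while a larger b smooths it; a reference bandwidth is one way of choosing b from an assumed model.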

Example 12. Example 2 of Section 2.2 continued. We consider the class of bodies K of Example 2. Let a_2 = 1.

Figure 11. Estimated generator function g at left (bandwidth b = 0.5) and contour plot of φ at right.

Figure 12. Scatter plot of the MSCI data.

Figure 13. Contour plot of the estimated density.
Let A ∈ R^{d,d} be a d × d matrix satisfying det(A) > 0, ‖·‖_• any norm, and ‖x‖ = ‖Ax‖_• another norm. If X is ‖·‖_•-contoured distributed and A is symmetric and positive definite, then we call the distribution of AX an elliptically generalized ‖·‖_•-contoured distribution. Let a = (a_1, . . ., a_d)^T be a vector with a_i > 0, i = 1, ..., d, and A = diag(1/a_1, . . ., 1/a_d). If, in Example 4, ‖·‖_• is the p-norm, p ≥ 1, then the corresponding norm ‖·‖ is