Article

A Study of Seven Asymmetric Kernels for the Estimation of Cumulative Distribution Functions

by Pierre Lafaye de Micheaux 1,2,3 and Frédéric Ouimet 4,5,*
1 School of Mathematics and Statistics, UNSW Sydney, Sydney, NSW 2052, Australia
2 Desbrest Institute of Epidemiology and Public Health, INSERM and University Montpellier, 34093 Montpellier, France
3 AMIS, Université Paul Valéry Montpellier 3, 34199 Montpellier, France
4 Department of Mathematics and Statistics, McGill University, Montreal, QC H3A 0B9, Canada
5 Division of Physics, Mathematics and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(20), 2605; https://doi.org/10.3390/math9202605
Submission received: 15 September 2021 / Revised: 10 October 2021 / Accepted: 12 October 2021 / Published: 16 October 2021
(This article belongs to the Special Issue Stochastic Models with Applications)

Abstract: In this paper, we complement a study recently conducted in a paper of H.A. Mombeni, B. Masouri and M.R. Akhoond by introducing five new asymmetric kernel c.d.f. estimators on the half-line $[0,\infty)$, namely the Gamma, inverse Gamma, LogNormal, inverse Gaussian and reciprocal inverse Gaussian kernel c.d.f. estimators. For these five new estimators, we prove the asymptotic normality and we find asymptotic expressions for the following quantities: bias, variance, mean squared error and mean integrated squared error. A numerical study then compares the performance of the five new c.d.f. estimators against traditional methods and the Birnbaum–Saunders and Weibull kernel c.d.f. estimators from Mombeni, Masouri and Akhoond. By using the same experimental design, we show that the LogNormal and Birnbaum–Saunders kernel c.d.f. estimators perform the best overall, while the other asymmetric kernel estimators are sometimes better but always at least competitive against the boundary kernel method from C. Tenreiro.

In the context of density estimation, asymmetric kernel estimators were introduced by Aitchison and Lauder [1] on the simplex and studied theoretically for the first time by Chen [2] on $[0,1]$ (using a Beta kernel), and by Chen [3] on $[0,\infty)$ (using a Gamma kernel). These estimators are designed so that the bulk of the kernel function varies with each point x in the support of the target density. More specifically, the parameters of the kernel function can vary in a way that makes the mode, the median or the mean equal to x. This variable smoothing allows asymmetric kernel estimators to behave better than traditional kernel estimators (see, e.g., Rosenblatt [4], Parzen [5]) near the boundary of the support in terms of their bias. Since the variable smoothing is integrated directly in the parametrization of the kernel function, asymmetric kernel estimators are also usually simpler to implement than boundary kernel methods (see, e.g., Gasser and Müller [6], Rice [7], Gasser et al. [8], Müller [9], Zhang and Karunamuni [10,11]). For these two reasons, asymmetric kernel estimators are, by now, well-known solutions to the boundary bias problem from which traditional kernel estimators suffer. In the past twenty years, various asymmetric kernels have been considered in the literature on density estimation:
  • Beta kernel, when the target density is supported on $[0,1]$, see, e.g., Chen [2], Bouezmarni and Rolin [12], Renault and Scaillet [13], Fernandes and Monteiro [14], Hirukawa [15], Bouezmarni and Rombouts [16], Zhang and Karunamuni [17], Bertin and Klutchnikoff [18,19], Igarashi [20];
  • Gamma, inverse Gamma, LogNormal, inverse Gaussian, reciprocal inverse Gaussian, Birnbaum–Saunders and Weibull kernels, when the target density is supported on $[0,\infty)$, see, e.g., Chen [3], Jin and Kawczak [21], Scaillet [22], Bouezmarni and Scaillet [23], Fernandes and Monteiro [14], Bouezmarni and Rombouts [16], Bouezmarni and Rombouts [24], Bouezmarni and Rombouts [25], Igarashi and Kakizawa [26,27], Charpentier and Flachaire [28], Igarashi [29], Zougab and Adjabi [30], Kakizawa and Igarashi [31], Kakizawa [32], Zougab et al. [33], Zhang [34], Kakizawa [35];
  • Dirichlet kernel, when the target density is supported on the d-dimensional unit simplex, see [1] and the first theoretical study by Ouimet and Tolosana-Delgado [36];
  • Continuous associated kernels, the aim of which is to unify the theory of asymmetric kernels with the one for traditional kernels in both the univariate and multivariate settings, see, e.g., Kokonendji and Libengué Dobélé-Kpoka [37], Kokonendji and Somé [38,39].
The interested reader is referred to Hirukawa [40] and Section 2 of Ouimet and Tolosana-Delgado [36] for a review of some of these papers and an extensive list of papers dealing with asymmetric kernels in other settings.
In contrast, there are almost no papers dealing with the estimation of cumulative distribution functions (c.d.f.s) in the literature on asymmetric kernels. In fact, to the best of our knowledge, [41] seems to be the first (and only) paper in this direction if we exclude the closely related theory of Bernstein estimators. (In the setting of Bernstein estimators, c.d.f. estimation on compact sets was tackled, for example, by Babu et al. [42], Leblanc [43], Leblanc [44], Leblanc [45], Dutta [46], Jmaei et al. [47], Erdoğan et al. [48] and Wang et al. [49] in the univariate setting, and by Babu and Chaubey [50], Belalia [51], Dib et al. [52] and Ouimet [53,54] in the multivariate setting. In [55], the authors introduced Bernstein estimators with Poisson weights (also called Szasz estimators) for the estimation of c.d.f.s that are supported on $[0,\infty)$, see also Ouimet [56]).
In the present paper, we complement the study reported in [41] by introducing five new asymmetric kernel c.d.f. estimators, namely the Gamma, inverse Gamma, LogNormal, inverse Gaussian and reciprocal inverse Gaussian kernel c.d.f. estimators. Our goal is to prove several asymptotic properties for these five new estimators (bias, variance, mean squared error, mean integrated squared error and asymptotic normality) and compare their numerical performance against traditional methods and against the Birnbaum–Saunders and Weibull kernel c.d.f. estimators from [41]. As we will see in the discussion of the results (Section 9), the LogNormal and Birnbaum–Saunders kernel c.d.f. estimators perform the best overall, while the other asymmetric kernel estimators are sometimes better but always at least competitive against the boundary kernel method from [57].

1. The Models

Let $X_1, X_2, \ldots, X_n$ be a sequence of i.i.d. observations from an unknown cumulative distribution function $F$ supported on the half-line $[0,\infty)$. We consider the following seven asymmetric kernel estimators (the first five are new):
$\hat F_{n,b}^{\mathrm{Gam}}(x) = \frac{1}{n}\sum_{i=1}^{n}\bar K_{\mathrm{Gam}}(X_i \mid b^{-1}x+1,\, b),$  (1)
$\hat F_{n,b}^{\mathrm{IGam}}(x) = \frac{1}{n}\sum_{i=1}^{n}\bar K_{\mathrm{IGam}}(X_i \mid b^{-1}+1,\, x^{-1}b),$  (2)
$\hat F_{n,b}^{\mathrm{LN}}(x) = \frac{1}{n}\sum_{i=1}^{n}\bar K_{\mathrm{LN}}(X_i \mid \log x,\, \sqrt{b}),$  (3)
$\hat F_{n,b}^{\mathrm{IGau}}(x) = \frac{1}{n}\sum_{i=1}^{n}\bar K_{\mathrm{IGau}}(X_i \mid x,\, b^{-1}x),$  (4)
$\hat F_{n,b}^{\mathrm{RIG}}(x) = \frac{1}{n}\sum_{i=1}^{n}\bar K_{\mathrm{RIG}}(X_i \mid x^{-1}(1-b)^{-1},\, x^{-1}b^{-1}),$  (5)
$\hat F_{n,b}^{\mathrm{BS}}(x) = \frac{1}{n}\sum_{i=1}^{n}\bar K_{\mathrm{BS}}(X_i \mid x,\, \sqrt{b}),$  (6)
$\hat F_{n,b}^{\mathrm{W}}(x) = \frac{1}{n}\sum_{i=1}^{n}\bar K_{\mathrm{W}}(X_i \mid x/\Gamma(1+b),\, b^{-1}),$  (7)
where $b > 0$ is a smoothing (or bandwidth) parameter, and
$\bar K_{\mathrm{Gam}}(t\mid\alpha,\theta) = \dfrac{\Gamma(\alpha,\, t/\theta)}{\Gamma(\alpha)}, \quad \alpha,\theta > 0,$
$\bar K_{\mathrm{IGam}}(t\mid\alpha,\theta) = 1 - \dfrac{\Gamma(\alpha,\, 1/(t\theta))}{\Gamma(\alpha)}, \quad \alpha,\theta > 0,$
$\bar K_{\mathrm{LN}}(t\mid\mu,\sigma) = 1 - \Phi\Big(\dfrac{\log t - \mu}{\sigma}\Big), \quad \mu\in\mathbb{R},\ \sigma > 0,$
$\bar K_{\mathrm{IGau}}(t\mid\mu,\lambda) = 1 - \Phi\Big(\sqrt{\tfrac{\lambda}{t}}\big(\tfrac{t}{\mu}-1\big)\Big) - e^{2\lambda/\mu}\,\Phi\Big(-\sqrt{\tfrac{\lambda}{t}}\big(\tfrac{t}{\mu}+1\big)\Big), \quad \mu,\lambda > 0,$
$\bar K_{\mathrm{RIG}}(t\mid\mu,\lambda) = \Phi\Big(\sqrt{\lambda t}\big(\tfrac{1}{t\mu}-1\big)\Big) + e^{2\lambda/\mu}\,\Phi\Big(-\sqrt{\lambda t}\big(\tfrac{1}{t\mu}+1\big)\Big), \quad \mu,\lambda > 0,$
$\bar K_{\mathrm{BS}}(t\mid\beta,\alpha) = 1 - \Phi\Big(\dfrac{1}{\alpha}\Big(\sqrt{\tfrac{t}{\beta}}-\sqrt{\tfrac{\beta}{t}}\Big)\Big), \quad \beta,\alpha > 0,$
$\bar K_{\mathrm{W}}(t\mid\lambda,k) = \exp\big(-(t/\lambda)^{k}\big), \quad \lambda,k > 0,$
denote, respectively, the survival function of the
  • Gamma ( α , θ ) distribution (with the shape/scale parametrization);
  • InverseGamma ( α , θ ) distribution (with the shape/scale parametrization);
  • LogNormal ( μ , σ ) distribution;
  • InverseGaussian ( μ , λ ) distribution;
  • ReciprocalInverseGaussian ( μ , λ ) distribution;
  • Birnbaum–Saunders ( β , α ) distribution;
  • Weibull ( λ , k ) distribution.
The function $\Gamma(\alpha,z) = \int_{z}^{\infty} t^{\alpha-1} e^{-t}\,\mathrm{d}t$ denotes the upper incomplete gamma function (where $\Gamma(\alpha) = \Gamma(\alpha,0)$), and $\Phi$ denotes the c.d.f. of the standard normal distribution. The parametrizations are chosen so that
  • The mode of the kernel function in (1) is x;
  • The median of the kernel function in (3) and (6) is x;
  • The mean of the kernel function in (2), (4), (5) and (7) is x.
In this paper, we will compare the numerical performance of the above seven asymmetric kernel c.d.f. estimators against the following three traditional estimators (K here is the c.d.f. of a kernel function):
$\hat F_{n,b}^{\mathrm{OK}}(x) = \frac{1}{n}\sum_{i=1}^{n} K\Big(\frac{x - X_i}{b}\Big),$  (8)
$\hat F_{n,b}^{\mathrm{BK}}(x) = \frac{1}{n}\sum_{i=1}^{n}\Big[K\Big(\frac{x - X_i}{b}\Big)\,\mathbb{1}_{[b,\infty)}(x) + K\Big(\frac{x - X_i}{x}\Big)\,\mathbb{1}_{(0,b)}(x)\Big],$  (9)
$\hat F_{n}^{\mathrm{EDF}}(x) = \frac{1}{n}\sum_{i=1}^{n}\mathbb{1}_{[X_i,\infty)}(x),$  (10)
which denote, respectively, the ordinary kernel (OK) c.d.f. estimator (from Tiago de Oliveira [58], Nadaraja [59] or Watson and Leadbetter [60]), the boundary kernel (BK) c.d.f. estimator (from Example 2.3 in [57]) and the empirical c.d.f. (EDF) estimator.
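To make the definitions in (1)–(7) concrete, the short R sketch below evaluates two of them (the Gamma and LogNormal kernel c.d.f. estimators) on a grid of points. It is an illustrative reimplementation based only on the formulas above, not the authors' released code; the sample, grid and bandwidth are placeholder choices.

```r
# Illustrative sketch (not the authors' code): Gamma and LogNormal kernel
# c.d.f. estimators from (1) and (3), evaluated on a grid of points.
set.seed(1)
X <- rgamma(256, shape = 4, scale = 2)   # placeholder sample on [0, Inf)
b <- 0.05                                # placeholder bandwidth
xgrid <- seq(0.1, 20, by = 0.1)

F_gam <- sapply(xgrid, function(x) {
  # survival function of Gamma(shape = x/b + 1, scale = b) evaluated at the X_i
  mean(pgamma(X, shape = x / b + 1, scale = b, lower.tail = FALSE))
})
F_ln <- sapply(xgrid, function(x) {
  # survival function of LogNormal(meanlog = log(x), sdlog = sqrt(b)) at the X_i
  mean(plnorm(X, meanlog = log(x), sdlog = sqrt(b), lower.tail = FALSE))
})

plot(xgrid, F_gam, type = "l", xlab = "x", ylab = "estimated c.d.f.")
lines(xgrid, F_ln, lty = 2)
curve(pgamma(x, shape = 4, scale = 2), add = TRUE, col = "red")  # true c.d.f.
```

For each point x, the estimator simply averages the kernel survival function over the observations, so a few lines of vectorized R suffice.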

2. Outline, Assumptions and Notation

2.1. Outline

In Section 3, Section 4, Section 5, Section 6 and Section 7, the asymptotic normality, and the asymptotic expressions for the bias, variance, mean squared error (MSE) and mean integrated squared error (MISE), are stated for the Gam , IGam , LN , IGau and RIG kernel c.d.f. estimators, respectively. The proofs can be found in Appendix A, Appendix B, Appendix C, Appendix D and Appendix E, respectively. Aside from the asymptotic normality (which can easily be deduced), these results were obtained for the Birnbaum–Saunders and Weibull kernel c.d.f. estimators in [41]. In Section 8, we compare the performance of all seven asymmetric kernel estimators above with the three traditional estimators OK , BK and EDF , defined in (8)–(10). A discussion of the results and our conclusion follow in Section 9 and Section 10. Technical calculations for the proofs of the asymptotic results are gathered in Appendix F.

2.2. Assumptions

Throughout the paper, we make the following two basic assumptions:
  • The target c.d.f. F has two continuous and bounded derivatives;
  • The smoothing (or bandwidth) parameter $b = b_n > 0$ satisfies $b \to 0$ as $n \to \infty$.

2.3. Notation

Throughout the paper, the notation $u = \mathcal{O}(v)$ means that $\limsup_{n\to\infty} |u/v| \le C < \infty$. The positive constant $C$ can depend on the target c.d.f. $F$, but no other variable unless explicitly written as a subscript. For example, if $C$ depends on a given point $x \in (0,\infty)$, we would write $u = \mathcal{O}_x(v)$. Similarly, the notation $u = o(v)$ means that $\lim_{n\to\infty} |u/v| = 0$. Subscripts indicate which parameters the convergence rate can depend on. The symbol $\mathcal{D}$ over an arrow '$\longrightarrow$' will denote the convergence in law (or distribution).

3. Asymptotic Properties of the c.d.f. Estimator with Gam Kernel

In this section, we find the asymptotic properties of the Gamma ( Gam ) kernel estimator defined in (1).
Lemma 1
(Bias and variance). For any given $x \in (0,\infty)$,
$\mathrm{Bias}[\hat F_{n,b}^{\mathrm{Gam}}(x)] = \mathbb{E}[\hat F_{n,b}^{\mathrm{Gam}}(x)] - F(x) = b\cdot\big(f(x) + \tfrac{x}{2}f'(x)\big) + o_x(b),$  (11)
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{Gam}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\sqrt{\tfrac{x}{\pi}}\,f(x) + \mathcal{O}_x(n^{-1}b).$  (12)
Corollary 1
(Mean squared error). For any given $x \in (0,\infty)$,
$\mathrm{MSE}(\hat F_{n,b}^{\mathrm{Gam}}(x)) = \mathrm{Var}(\hat F_{n,b}^{\mathrm{Gam}}(x)) + \mathrm{Bias}[\hat F_{n,b}^{\mathrm{Gam}}(x)]^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\sqrt{\tfrac{x}{\pi}}\,f(x) + b^2\cdot\big(f(x) + \tfrac{x}{2}f'(x)\big)^2 + \mathcal{O}_x(n^{-1}b) + o_x(b^2).$  (13)
In particular, if $f(x)\cdot\big(f(x) + \tfrac{x}{2}f'(x)\big) \neq 0$, the asymptotically optimal choice of $b$, with respect to MSE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\sqrt{x/\pi}\,f(x)}{4\,\big(f(x) + \tfrac{x}{2}f'(x)\big)^2}\right]^{2/3}$  (14)
with
$\mathrm{MSE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{Gam}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\sqrt{x/\pi}\,f(x)\big)^4}{4\,\big(f(x) + \tfrac{x}{2}f'(x)\big)^2}\right]^{1/3} + o_x(n^{-4/3}).$  (15)
Proposition 1
(Mean integrated squared error). Assuming that the target density $f = F'$ satisfies
$\int_0^{\infty}\sqrt{x}\,f(x)\,\mathrm{d}x < \infty \quad \text{and} \quad \int_0^{\infty}\big(f(x) + \tfrac{x}{2}f'(x)\big)^2\,\mathrm{d}x < \infty,$  (16)
then we have
$\mathrm{MISE}(\hat F_{n,b}^{\mathrm{Gam}}) = \int_0^{\infty}\mathrm{Var}(\hat F_{n,b}^{\mathrm{Gam}}(x))\,\mathrm{d}x + \int_0^{\infty}\mathrm{Bias}[\hat F_{n,b}^{\mathrm{Gam}}(x)]^2\,\mathrm{d}x = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-1}b^{1/2}\int_0^{\infty}\sqrt{\tfrac{x}{\pi}}\,f(x)\,\mathrm{d}x + b^2\int_0^{\infty}\big(f(x) + \tfrac{x}{2}f'(x)\big)^2\,\mathrm{d}x + o(n^{-1}b^{1/2}) + o(b^2).$  (17)
In particular, if $f(x)\cdot\big(f(x) + \tfrac{x}{2}f'(x)\big)$ is not identically zero, the asymptotically optimal choice of $b$, with respect to MISE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\int_0^{\infty}\sqrt{x/\pi}\,f(x)\,\mathrm{d}x}{4\int_0^{\infty}\big(f(x) + \tfrac{x}{2}f'(x)\big)^2\,\mathrm{d}x}\right]^{2/3}$  (18)
with
$\mathrm{MISE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{Gam}}) = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\int_0^{\infty}\sqrt{x/\pi}\,f(x)\,\mathrm{d}x\big)^4}{4\int_0^{\infty}\big(f(x) + \tfrac{x}{2}f'(x)\big)^2\,\mathrm{d}x}\right]^{1/3} + o(n^{-4/3}).$  (19)
Proposition 2
(Asymptotic normality). For any $x > 0$ such that $0 < F(x) < 1$, we have the following convergence in distribution:
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{Gam}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{Gam}}(x)]\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{as } b \to 0,\ n \to \infty,$  (20)
where $\sigma^2(x) = F(x)(1-F(x))$. In particular, Lemma 1 implies
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{Gam}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{if } n^{1/2}b \to 0,$  (21)
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{Gam}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}\big(\lambda\cdot\big(f(x) + \tfrac{x}{2}f'(x)\big),\ \sigma^2(x)\big), \quad \text{if } n^{1/2}b \to \lambda,$  (22)
for any constant $\lambda > 0$.
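As an illustration of how the MISE-optimal bandwidth (18) can be evaluated in practice (in Section 8, $b_{\mathrm{opt}}$ is approximated under a Gamma reference fitted by maximum likelihood), here is a minimal R sketch. The Gamma reference, the truncation of the integrals and the use of pracma::integral are our own illustrative choices, not the authors' exact implementation.

```r
# Minimal sketch: plug-in approximation of the MISE-optimal bandwidth (18)
# for the Gamma kernel c.d.f. estimator, under a Gamma(alpha, theta) reference.
library(pracma)  # for integral()

b_opt_gam <- function(alpha, theta, n, upper = 200) {
  f  <- function(x) dgamma(x, shape = alpha, scale = theta)
  fp <- function(x) f(x) * ((alpha - 1) / x - 1 / theta)  # f'(x) for the Gamma reference
  num <- integral(function(x) sqrt(x / pi) * f(x), 0, upper)             # variance term in (17)
  den <- 4 * integral(function(x) (f(x) + (x / 2) * fp(x))^2, 0, upper)  # squared-bias term in (17)
  n^(-2/3) * (num / den)^(2/3)
}

# Example: reference taken here as a known Gamma(4, 2) target, n = 256
b_opt_gam(alpha = 4, theta = 2, n = 256)
```

The upper limit of 200 is a finite truncation of the integrals over $(0,\infty)$, which is harmless for references with light tails.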

4. Asymptotic Properties of the c.d.f. Estimator with IGam Kernel

In this section, we find the asymptotic properties of the inverse Gamma ( IGam ) kernel estimator defined in (2).
Lemma 2
(Bias and variance). For any given $x \in (0,\infty)$,
$\mathrm{Bias}[\hat F_{n,b}^{\mathrm{IGam}}(x)] := \mathbb{E}[\hat F_{n,b}^{\mathrm{IGam}}(x)] - F(x) = b\cdot\tfrac{x^2}{2}f'(x) + o_x(b),$  (23)
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGam}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + \mathcal{O}_x(n^{-1}b).$  (24)
Corollary 2
(Mean squared error). For any given $x \in (0,\infty)$,
$\mathrm{MSE}(\hat F_{n,b}^{\mathrm{IGam}}(x)) = \mathrm{Var}(\hat F_{n,b}^{\mathrm{IGam}}(x)) + \mathrm{Bias}[\hat F_{n,b}^{\mathrm{IGam}}(x)]^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + b^2\cdot\tfrac{x^4}{4}(f'(x))^2 + \mathcal{O}_x(n^{-1}b) + o_x(b^2).$  (25)
In particular, if $f(x)\cdot f'(x) \neq 0$, the asymptotically optimal choice of $b$, with respect to MSE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{x f(x)/\sqrt{\pi}}{4\cdot\tfrac{x^4}{4}(f'(x))^2}\right]^{2/3}$  (26)
with
$\mathrm{MSE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{IGam}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(x f(x)/\sqrt{\pi}\big)^4}{4\cdot\tfrac{x^4}{4}(f'(x))^2}\right]^{1/3} + o_x(n^{-4/3}).$  (27)
Proposition 3
(Mean integrated squared error). Assuming that the target density $f = F'$ satisfies
$\int_0^{\infty} x f(x)\,\mathrm{d}x < \infty \quad \text{and} \quad \int_0^{\infty} x^4 (f'(x))^2\,\mathrm{d}x < \infty,$  (28)
then we have
$\mathrm{MISE}(\hat F_{n,b}^{\mathrm{IGam}}) = \int_0^{\infty}\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGam}}(x))\,\mathrm{d}x + \int_0^{\infty}\mathrm{Bias}[\hat F_{n,b}^{\mathrm{IGam}}(x)]^2\,\mathrm{d}x = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-1}b^{1/2}\int_0^{\infty}\tfrac{x f(x)}{\sqrt{\pi}}\,\mathrm{d}x + b^2\int_0^{\infty}\tfrac{x^4}{4}(f'(x))^2\,\mathrm{d}x + o(n^{-1}b^{1/2}) + o(b^2).$  (29)
In particular, if $\int_0^{\infty} x^4(f'(x))^2\,\mathrm{d}x > 0$, the asymptotically optimal choice of $b$, with respect to MISE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\int_0^{\infty} x f(x)/\sqrt{\pi}\,\mathrm{d}x}{4\int_0^{\infty}\tfrac{x^4}{4}(f'(x))^2\,\mathrm{d}x}\right]^{2/3}$  (30)
with
$\mathrm{MISE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{IGam}}) = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\int_0^{\infty} x f(x)/\sqrt{\pi}\,\mathrm{d}x\big)^4}{4\int_0^{\infty}\tfrac{x^4}{4}(f'(x))^2\,\mathrm{d}x}\right]^{1/3} + o(n^{-4/3}).$  (31)
Proposition 4
(Asymptotic normality). For any $x > 0$ such that $0 < F(x) < 1$, we have the following convergence in distribution:
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{IGam}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{IGam}}(x)]\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{as } b \to 0,\ n \to \infty,$  (32)
where $\sigma^2(x) = F(x)(1-F(x))$. In particular, Lemma 2 implies
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{IGam}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{if } n^{1/2}b \to 0,$  (33)
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{IGam}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}\big(\lambda\cdot\tfrac{x^2}{2}f'(x),\ \sigma^2(x)\big), \quad \text{if } n^{1/2}b \to \lambda,$  (34)
for any constant $\lambda > 0$.

5. Asymptotic Properties of the c.d.f. Estimator with LN Kernel

In this section, we find the asymptotic properties of the LogNormal (LN) kernel estimator defined in (3).
Lemma 3
(Bias and variance). For any given $x \in (0,\infty)$,
$\mathrm{Bias}[\hat F_{n,b}^{\mathrm{LN}}(x)] := \mathbb{E}[\hat F_{n,b}^{\mathrm{LN}}(x)] - F(x) = b\cdot\tfrac{x}{2}\big(f(x) + x f'(x)\big) + o_x(b),$  (35)
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{LN}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + \mathcal{O}_x(n^{-1}b).$  (36)
Corollary 3
(Mean squared error). For any given $x \in (0,\infty)$,
$\mathrm{MSE}(\hat F_{n,b}^{\mathrm{LN}}(x)) = \mathrm{Var}(\hat F_{n,b}^{\mathrm{LN}}(x)) + \mathrm{Bias}[\hat F_{n,b}^{\mathrm{LN}}(x)]^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + b^2\cdot\tfrac{x^2}{4}\big(f(x) + x f'(x)\big)^2 + \mathcal{O}_x(n^{-1}b) + o_x(b^2).$  (37)
In particular, if $f(x)\cdot\big(f(x) + x f'(x)\big) \neq 0$, the asymptotically optimal choice of $b$, with respect to MSE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{x f(x)/\sqrt{\pi}}{4\cdot\tfrac{x^2}{4}\big(f(x) + x f'(x)\big)^2}\right]^{2/3}$  (38)
with
$\mathrm{MSE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{LN}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(x f(x)/\sqrt{\pi}\big)^4}{4\cdot\tfrac{x^2}{4}\big(f(x) + x f'(x)\big)^2}\right]^{1/3} + o_x(n^{-4/3}).$  (39)
Proposition 5
(Mean integrated squared error). Assuming that the target density $f = F'$ satisfies
$\int_0^{\infty} x f(x)\,\mathrm{d}x < \infty \quad \text{and} \quad \int_0^{\infty} x^2\big(f(x) + x f'(x)\big)^2\,\mathrm{d}x < \infty,$  (40)
then we have
$\mathrm{MISE}(\hat F_{n,b}^{\mathrm{LN}}) = \int_0^{\infty}\mathrm{Var}(\hat F_{n,b}^{\mathrm{LN}}(x))\,\mathrm{d}x + \int_0^{\infty}\mathrm{Bias}[\hat F_{n,b}^{\mathrm{LN}}(x)]^2\,\mathrm{d}x = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-1}b^{1/2}\int_0^{\infty}\tfrac{x f(x)}{\sqrt{\pi}}\,\mathrm{d}x + b^2\int_0^{\infty}\tfrac{x^2}{4}\big(f(x) + x f'(x)\big)^2\,\mathrm{d}x + o(n^{-1}b^{1/2}) + o(b^2).$  (41)
In particular, if $f(x)\cdot\big(f(x) + x f'(x)\big)$ is not identically zero, the asymptotically optimal choice of $b$, with respect to MISE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\int_0^{\infty} x f(x)/\sqrt{\pi}\,\mathrm{d}x}{4\int_0^{\infty}\tfrac{x^2}{4}\big(f(x) + x f'(x)\big)^2\,\mathrm{d}x}\right]^{2/3}$  (42)
with
$\mathrm{MISE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{LN}}) = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\int_0^{\infty} x f(x)/\sqrt{\pi}\,\mathrm{d}x\big)^4}{4\int_0^{\infty}\tfrac{x^2}{4}\big(f(x) + x f'(x)\big)^2\,\mathrm{d}x}\right]^{1/3} + o(n^{-4/3}).$  (43)
Proposition 6
(Asymptotic normality). For any $x > 0$ such that $0 < F(x) < 1$, we have the following convergence in distribution:
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{LN}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{LN}}(x)]\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{as } b \to 0,\ n \to \infty,$  (44)
where $\sigma^2(x) := F(x)(1-F(x))$. In particular, Lemma 3 implies
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{LN}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{if } n^{1/2}b \to 0,$  (45)
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{LN}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}\big(\lambda\cdot\tfrac{x}{2}\big(f(x) + x f'(x)\big),\ \sigma^2(x)\big), \quad \text{if } n^{1/2}b \to \lambda,$  (46)
for any constant $\lambda > 0$.

6. Asymptotic Properties of the c.d.f. Estimator with IGau Kernel

In this section, we find the asymptotic properties of the inverse Gaussian (IGau) kernel estimator defined in (4).
Lemma 4
(Bias and variance). For any given $x \in (0,\infty)$,
$\mathrm{Bias}[\hat F_{n,b}^{\mathrm{IGau}}(x)] := \mathbb{E}[\hat F_{n,b}^{\mathrm{IGau}}(x)] - F(x) = b\cdot\tfrac{x^2}{2}f'(x) + o_x(b),$  (47)
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGau}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + \mathcal{O}_x(n^{-1}b),$  (48)
where $T_1, T_2 \overset{\text{i.i.d.}}{\sim} \mathrm{IGau}(x,\, b^{-1}x)$.
Corollary 4
(Mean squared error). For any given $x \in (0,\infty)$,
$\mathrm{MSE}(\hat F_{n,b}^{\mathrm{IGau}}(x)) = \mathrm{Var}(\hat F_{n,b}^{\mathrm{IGau}}(x)) + \mathrm{Bias}[\hat F_{n,b}^{\mathrm{IGau}}(x)]^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + b^2\cdot\tfrac{x^4}{4}(f'(x))^2 + \mathcal{O}_x(n^{-1}b) + o_x(b^2),$  (49)
where $T_1, T_2 \overset{\text{i.i.d.}}{\sim} \mathrm{IGau}(x,\, b^{-1}x)$. The quantity $\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]$ needs to be approximated numerically. In particular, if $f(x)\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\cdot f'(x) \neq 0$, the asymptotically optimal choice of $b$, with respect to MSE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]}{4\cdot\tfrac{x^4}{4}(f'(x))^2}\right]^{2/3}$  (50)
with
$\mathrm{MSE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{IGau}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\big)^4}{4\cdot\tfrac{x^4}{4}(f'(x))^2}\right]^{1/3} + o_x(n^{-4/3}).$  (51)
Proposition 7
(Mean integrated squared error). Assuming that the target density $f = F'$ satisfies
$\int_0^{\infty} f(x)\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x < \infty \quad \text{and} \quad \int_0^{\infty} x^4 (f'(x))^2\,\mathrm{d}x < \infty,$  (52)
where $T_1, T_2 \overset{\text{i.i.d.}}{\sim} \mathrm{IGau}(x,\, b^{-1}x)$, then we have
$\mathrm{MISE}(\hat F_{n,b}^{\mathrm{IGau}}) = \int_0^{\infty}\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGau}}(x))\,\mathrm{d}x + \int_0^{\infty}\mathrm{Bias}[\hat F_{n,b}^{\mathrm{IGau}}(x)]^2\,\mathrm{d}x = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-1}b^{1/2}\int_0^{\infty}\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x + b^2\int_0^{\infty}\tfrac{x^4}{4}(f'(x))^2\,\mathrm{d}x + o(n^{-1}b^{1/2}) + o(b^2).$  (53)
The quantity $\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]$ needs to be approximated numerically. In particular, if $\int_0^{\infty} x^4(f'(x))^2\,\mathrm{d}x > 0$, the asymptotically optimal choice of $b$, with respect to MISE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\int_0^{\infty}\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x}{4\int_0^{\infty}\tfrac{x^4}{4}(f'(x))^2\,\mathrm{d}x}\right]^{2/3}$  (54)
with
$\mathrm{MISE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{IGau}}) = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\int_0^{\infty}\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x\big)^4}{4\int_0^{\infty}\tfrac{x^4}{4}(f'(x))^2\,\mathrm{d}x}\right]^{1/3} + o(n^{-4/3}).$  (55)
Proposition 8
(Asymptotic normality). For any $x > 0$ such that $0 < F(x) < 1$, we have the following convergence in distribution:
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{IGau}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{IGau}}(x)]\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{as } b \to 0,\ n \to \infty,$  (56)
where $\sigma^2(x) = F(x)(1-F(x))$. In particular, Lemma 4 implies
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{IGau}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{if } n^{1/2}b \to 0,$  (57)
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{IGau}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}\big(\lambda\cdot\tfrac{x^2}{2}f'(x),\ \sigma^2(x)\big), \quad \text{if } n^{1/2}b \to \lambda,$  (58)
for any constant $\lambda > 0$.
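Since the quantity $\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]$ appearing in Corollary 4 and Proposition 7 (and in the analogous results for the RIG kernel in the next section) has to be approximated numerically, here is one possible Monte Carlo sketch for the IGau case. The use of statmod::rinvgauss, the small fixed value of b and the simulation size are illustrative assumptions, not the authors' procedure.

```r
# Illustrative Monte Carlo approximation (not the authors' code) of
# lim_{b -> 0} b^{-1/2} E|T1 - T2|, with T1, T2 i.i.d. IGau(x, b^{-1} x).
library(statmod)  # provides rinvgauss(n, mean, shape)

approx_limit_igau <- function(x, b = 1e-4, nsim = 1e6) {
  T1 <- rinvgauss(nsim, mean = x, shape = x / b)
  T2 <- rinvgauss(nsim, mean = x, shape = x / b)
  mean(abs(T1 - T2)) / sqrt(b)   # should stabilize as b decreases
}

approx_limit_igau(x = 1)
```

Repeating the computation for a few decreasing values of b gives a simple check that the approximation has converged.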

7. Asymptotic Properties of the c.d.f. Estimator with RIG Kernel

In this section, we find the asymptotic properties of the reciprocal inverse Gaussian (RIG) kernel estimator defined in (5).
Lemma 5
(Bias and variance). For any given $x \in (0,\infty)$,
$\mathrm{Bias}[\hat F_{n,b}^{\mathrm{RIG}}(x)] := \mathbb{E}[\hat F_{n,b}^{\mathrm{RIG}}(x)] - F(x) = b\cdot\tfrac{x^2}{2}f'(x) + o_x(b),$  (59)
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{RIG}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + \mathcal{O}_x(n^{-1}b),$  (60)
where $T_1, T_2 \overset{\text{i.i.d.}}{\sim} \mathrm{RIG}(x^{-1}(1-b)^{-1},\, x^{-1}b^{-1})$.
Corollary 5
(Mean squared error). For any given $x \in (0,\infty)$,
$\mathrm{MSE}(\hat F_{n,b}^{\mathrm{RIG}}(x)) = \mathrm{Var}(\hat F_{n,b}^{\mathrm{RIG}}(x)) + \mathrm{Bias}[\hat F_{n,b}^{\mathrm{RIG}}(x)]^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + b^2\cdot\tfrac{1}{4}x^4(f'(x))^2 + \mathcal{O}_x(n^{-1}b) + o_x(b^2),$  (61)
where $T_1, T_2 \overset{\text{i.i.d.}}{\sim} \mathrm{RIG}(x^{-1}(1-b)^{-1},\, x^{-1}b^{-1})$. The quantity $\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]$ needs to be approximated numerically. In particular, if $f(x)\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\cdot f'(x) \neq 0$, the asymptotically optimal choice of $b$, with respect to MSE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]}{4\cdot\tfrac{1}{4}x^4(f'(x))^2}\right]^{2/3}$  (62)
with
$\mathrm{MSE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{RIG}}(x)) = n^{-1}F(x)(1-F(x)) - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\big)^4}{4\cdot\tfrac{1}{4}x^4(f'(x))^2}\right]^{1/3} + o_x(n^{-4/3}).$  (63)
Proposition 9
(Mean integrated squared error). Assuming that the target density $f = F'$ satisfies
$\int_0^{\infty} f(x)\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x < \infty \quad \text{and} \quad \int_0^{\infty} x^4(f'(x))^2\,\mathrm{d}x < \infty,$  (64)
where $T_1, T_2 \overset{\text{i.i.d.}}{\sim} \mathrm{RIG}(x^{-1}(1-b)^{-1},\, x^{-1}b^{-1})$, then we have
$\mathrm{MISE}(\hat F_{n,b}^{\mathrm{RIG}}) = \int_0^{\infty}\mathrm{Var}(\hat F_{n,b}^{\mathrm{RIG}}(x))\,\mathrm{d}x + \int_0^{\infty}\mathrm{Bias}[\hat F_{n,b}^{\mathrm{RIG}}(x)]^2\,\mathrm{d}x = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-1}b^{1/2}\int_0^{\infty}\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x + b^2\int_0^{\infty}\tfrac{x^4}{4}(f'(x))^2\,\mathrm{d}x + o(n^{-1}b^{1/2}) + o(b^2).$  (65)
The quantity $\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]$ needs to be approximated numerically. In particular, if $\int_0^{\infty} x^4(f'(x))^2\,\mathrm{d}x > 0$, the asymptotically optimal choice of $b$, with respect to MISE, is
$b_{\mathrm{opt}} = n^{-2/3}\left[\frac{\int_0^{\infty}\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x}{4\int_0^{\infty}\tfrac{1}{4}x^4(f'(x))^2\,\mathrm{d}x}\right]^{2/3}$  (66)
with
$\mathrm{MISE}(\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{RIG}}) = n^{-1}\int_0^{\infty}F(x)(1-F(x))\,\mathrm{d}x - n^{-4/3}\,\frac{3}{4}\left[\frac{\big(\int_0^{\infty}\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|]\,\mathrm{d}x\big)^4}{4\int_0^{\infty}\tfrac{1}{4}x^4(f'(x))^2\,\mathrm{d}x}\right]^{1/3} + o(n^{-4/3}).$  (67)
Proposition 10
(Asymptotic normality). For any $x > 0$ such that $0 < F(x) < 1$, we have the following convergence in distribution:
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{RIG}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{RIG}}(x)]\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{as } b \to 0,\ n \to \infty,$  (68)
where $\sigma^2(x) = F(x)(1-F(x))$. In particular, Lemma 5 implies
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{RIG}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}(0,\sigma^2(x)), \quad \text{if } n^{1/2}b \to 0,$  (69)
$n^{1/2}\big(\hat F_{n,b}^{\mathrm{RIG}}(x) - F(x)\big) \stackrel{\mathcal{D}}{\longrightarrow} \mathcal{N}\big(\lambda\cdot\tfrac{x^2}{2}f'(x),\ \sigma^2(x)\big), \quad \text{if } n^{1/2}b \to \lambda,$  (70)
for any constant $\lambda > 0$.

8. Numerical Study

As in [41], we generated M = 1000 samples of size n = 256 and n = 1000 from eight target distributions:
  • $\mathrm{Burr}(1, 3, 1)$, with the following parametrization for the density function:
    $f_1(x \mid \lambda, c, k) := \tfrac{c k}{\lambda}\big(\tfrac{x}{\lambda}\big)^{c-1}\big(1 + \big(\tfrac{x}{\lambda}\big)^{c}\big)^{-k-1}\,\mathbb{1}_{(0,\infty)}(x), \quad \lambda, c, k > 0;$
  • $\mathrm{Gamma}(0.6, 2)$, with the following parametrization for the density function:
    $f_2(x \mid \alpha, \theta) := \tfrac{x^{\alpha-1}\exp(-x/\theta)}{\theta^{\alpha}\Gamma(\alpha)}\,\mathbb{1}_{(0,\infty)}(x), \quad \alpha, \theta > 0;$
  • $\mathrm{Gamma}(4, 2)$, with the following parametrization for the density function:
    $f_3(x \mid \alpha, \theta) := \tfrac{x^{\alpha-1}\exp(-x/\theta)}{\theta^{\alpha}\Gamma(\alpha)}\,\mathbb{1}_{(0,\infty)}(x), \quad \alpha, \theta > 0;$
  • $\mathrm{GeneralizedPareto}(0.4, 1, 0)$, with the following parametrization for the density function:
    $f_4(x \mid \xi, \sigma, \mu) := \tfrac{1}{\sigma}\big(1 + \tfrac{\xi(x-\mu)}{\sigma}\big)^{-1/\xi - 1}\,\mathbb{1}_{(\mu,\infty)}(x), \quad \xi, \sigma > 0,\ \mu \in \mathbb{R};$
  • $\mathrm{HalfNormal}(1)$, with the following parametrization for the density function:
    $f_5(x \mid \sigma) := \sqrt{\tfrac{2}{\pi\sigma^2}}\exp\big(-\tfrac{x^2}{2\sigma^2}\big)\,\mathbb{1}_{(0,\infty)}(x), \quad \sigma > 0;$
  • $\mathrm{LogNormal}(0, 0.75)$, with the following parametrization for the density function:
    $f_6(x \mid \mu, \sigma) := \tfrac{1}{x\sqrt{2\pi\sigma^2}}\exp\big(-\tfrac{(\log x - \mu)^2}{2\sigma^2}\big)\,\mathbb{1}_{(0,\infty)}(x), \quad \mu \in \mathbb{R},\ \sigma > 0;$
  • $\mathrm{Weibull}(1.5, 1.5)$, with the following parametrization for the density function:
    $f_7(x \mid \lambda, k) := \tfrac{k}{\lambda}\big(\tfrac{x}{\lambda}\big)^{k-1}\exp\big(-\big(\tfrac{x}{\lambda}\big)^{k}\big)\,\mathbb{1}_{(0,\infty)}(x), \quad \lambda, k > 0;$
  • $\mathrm{Weibull}(3, 2)$, with the following parametrization for the density function:
    $f_8(x \mid \lambda, k) := \tfrac{k}{\lambda}\big(\tfrac{x}{\lambda}\big)^{k-1}\exp\big(-\big(\tfrac{x}{\lambda}\big)^{k}\big)\,\mathbb{1}_{(0,\infty)}(x), \quad \lambda, k > 0.$
For each of the eight target distributions ($i = 1, 2, \ldots, 8$), each of the ten estimators ($j = 1, 2, \ldots, 10$), each sample size ($n = 256, 1000$), and each sample ($k = 1, 2, \ldots, M$), we calculated the integrated squared errors
$\mathrm{ISE}_{i,j,n}^{(k)} := \int_0^{\infty}\big(\hat F_{j,n}^{(k)}(x) - F_i(x)\big)^2\,\mathrm{d}x,$
where
  • $\hat F_{1,n}^{(k)}$ denotes the estimator $\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{Gam}}$ from (1) applied to the $k$-th sample;
  • $\hat F_{2,n}^{(k)}$ denotes the estimator $\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{IGam}}$ from (2) applied to the $k$-th sample;
  • $\hat F_{3,n}^{(k)}$ denotes the estimator $\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{LN}}$ from (3) applied to the $k$-th sample;
  • $\hat F_{4,n}^{(k)}$ denotes the estimator $\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{IGau}}$ from (4) applied to the $k$-th sample;
  • $\hat F_{5,n}^{(k)}$ denotes the estimator $\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{RIG}}$ from (5) applied to the $k$-th sample;
  • $\hat F_{6,n}^{(k)}$ denotes the estimator $\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{BS}}$ from (6) applied to the $k$-th sample;
  • $\hat F_{7,n}^{(k)}$ denotes the estimator $\hat F_{n,b_{\mathrm{opt}}}^{\mathrm{W}}$ from (7) applied to the $k$-th sample;
    (for $\hat F_{1,n}^{(k)}$ to $\hat F_{7,n}^{(k)}$, $b_{\mathrm{opt}}$ is optimal with respect to the MISE (see (18), (30), (42), (54) and (66)) and approximated under the assumption that the target distribution is $\mathrm{Gamma}(\hat\alpha_n^{(k)}, \hat\theta_n^{(k)})$, where $\hat\alpha_n^{(k)}$ and $\hat\theta_n^{(k)}$ are the maximum likelihood estimates for the $k$-th sample) and
  • $\hat F_{8,n}^{(k)}(x) := \frac{1}{n}\sum_{\ell=1}^{n}\mathrm{Epa}\Big(\frac{x - X_{\ell}^{(k)}}{b_{\mathrm{LNO}}}\Big)$, where
    • $\mathrm{Epa}(u) = \big(\tfrac{1}{2} + \tfrac{3u}{4} - \tfrac{u^3}{4}\big)\cdot\mathbb{1}_{(-1,1)}(u) + \mathbb{1}_{[1,\infty)}(u)$ denotes the c.d.f. of the Epanechnikov kernel;
    • $b_{\mathrm{LNO}}$ is selected by minimizing the Leave-None-Out criterion from page 197 in [61];
  • $\hat F_{9,n}^{(k)}(x) := \frac{1}{n}\sum_{\ell=1}^{n}\Big[\mathrm{Epa}\Big(\frac{x - X_{\ell}^{(k)}}{b_{\mathrm{CV}}}\Big)\,\mathbb{1}_{[b_{\mathrm{CV}},\infty)}(x) + \mathrm{Epa}\Big(\frac{x - X_{\ell}^{(k)}}{x}\Big)\,\mathbb{1}_{(0,b_{\mathrm{CV}})}(x)\Big]$ is the boundary modified kernel estimator from Example 2.3 in [57], where
    • $\mathrm{Epa}(u) = \big(\tfrac{1}{2} + \tfrac{3u}{4} - \tfrac{u^3}{4}\big)\cdot\mathbb{1}_{(-1,1)}(u) + \mathbb{1}_{[1,\infty)}(u)$ denotes the c.d.f. of the Epanechnikov kernel;
    • $b_{\mathrm{CV}}$ is selected by minimizing the Cross-Validation criterion from page 180 in [57];
  • $\hat F_{10,n}^{(k)}(x) := \frac{1}{n}\sum_{\ell=1}^{n}\mathbb{1}_{\{X_{\ell}^{(k)} \le x\}}$ is the empirical c.d.f. applied to the $k$-th sample.
Everywhere in our R code, we approximated the integrals on ( 0 , ) using the integral function from the R package pracma (the base function integrate had serious precision issues). Table 1 below shows the mean and standard deviation of the ISE ’s, i.e.,
$\frac{1}{M}\sum_{k=1}^{M}\mathrm{ISE}_{i,j,n}^{(k)} \qquad\text{and}\qquad \sqrt{\frac{1}{M-1}\sum_{k=1}^{M}\Big(\mathrm{ISE}_{i,j,n}^{(k)} - \frac{1}{M}\sum_{k'=1}^{M}\mathrm{ISE}_{i,j,n}^{(k')}\Big)^{2}},$
for the eight target distributions ($i = 1, 2, \ldots, 8$), the ten estimators ($j = 1, 2, \ldots, 10$) and the two sample sizes ($n = 256, 1000$). All the values presented in the table have been multiplied by $10^4$. In Table 2, we computed, for each target distribution and each sample size, the difference between the ISE means and the lowest ISE mean for the corresponding target distribution and sample size (i.e., the ISE means minus the ISE mean of the best estimator on the corresponding line). The totals of those differences are also calculated for each sample size on the two "total" lines. Figure 1 gives a better idea of the distribution of the ISE's by displaying the boxplot of the ISE's for every target distribution and every estimator, when the sample size is $n = 1000$. Finally, Figures 2–9 (one figure for each of the eight target distributions) show a collection of ten c.d.f. estimates from each of the ten estimators when the sample size is 256.
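For concreteness, the following R sketch shows how a single ISE value of the form above can be approximated with pracma::integral, as mentioned in the previous paragraph. The target, estimator and bandwidth are placeholder choices and the truncation of the upper integration limit is our own simplification, not the authors' exact code.

```r
# Illustrative sketch (placeholder choices): one ISE value for the LogNormal
# kernel c.d.f. estimator (3) against a Gamma(4, 2) target, with the integral
# over (0, Inf) truncated at a large upper bound.
library(pracma)

set.seed(2)
X <- rgamma(256, shape = 4, scale = 2)
b <- 0.05                                  # placeholder bandwidth

F_hat  <- function(x) sapply(x, function(xx)
  mean(plnorm(X, meanlog = log(xx), sdlog = sqrt(b), lower.tail = FALSE)))
F_true <- function(x) pgamma(x, shape = 4, scale = 2)

ISE <- integral(function(x) (F_hat(x) - F_true(x))^2, 1e-6, 60)
ISE   # one ISE_{i,j,n}^{(k)}; repeat over the M samples to fill Table 1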
Here are the results, which we discuss briefly in Section 9:
Figure 1. Boxplots of the ISE i , j , n ( k ) , k = 1 , 2 , , M , for the eight target distributions and the ten estimators, when the sample size is n = 1000. (a) Burr(1, 3, 1). (b) Gamma(0.6, 2). (c) Gamma(4, 2). (d) GeneralizedPareto(0.4, 1, 0). (e) HalfNormal(1). (f) LogNormal(0, 0.75). (g) Weibull(1.5, 1.5). (h) Weibull(3, 2).
Table 1. The mean and standard deviation of the ISE i , j , n ( k ) , k = 1 , 2 , , M , for the eight target distributions ( i = 1 , 2 , , 8 ), the ten estimators ( j = 1 , 2 , , 10 ) and the two sample sizes ( n = 256 , 1000 ). All the values presented in the table have been multiplied by 10 4 . The ordinary kernel estimator F ^ 8 is denoted by OK , the boundary kernel estimator F ^ 9 is denoted by BK , and the empirical c.d.f. F ^ 10 is denoted by EDF . For each line in the table, the lowest ISE means are highlighted in cyan.
(values ×10⁴; each cell shows Mean (Std.))
n | i | Gam (j=1) | IGam (j=2) | LN (j=3) | IGau (j=4) | RIG (j=5) | B-S (j=6) | W (j=7) | OK (j=8) | BK (j=9) | EDF (j=10)
256 | 1 | 1.39 (1.27) | 1.37 (1.34) | 1.31 (1.26) | 1.37 (1.32) | 1.37 (1.32) | 1.31 (1.26) | 1.37 (1.32) | 1.54 (1.43) | 1.47 (1.34) | 1.54 (1.44)
256 | 2 | 2.59 (2.36) | 2.50 (2.53) | 2.36 (2.42) | 2.49 (2.46) | 2.49 (2.47) | 2.36 (2.42) | 2.50 (2.51) | 2.76 (2.44) | 2.67 (2.57) | 2.76 (2.45)
256 | 3 | 6.70 (6.28) | 6.77 (6.58) | 6.62 (6.28) | 6.69 (6.45) | 6.69 (6.45) | 6.62 (6.28) | 6.74 (6.54) | 7.44 (7.01) | 6.70 (6.39) | 7.44 (7.00)
256 | 4 | 3.74 (3.14) | 3.60 (3.27) | 3.36 (3.15) | 3.61 (3.20) | 3.61 (3.21) | 3.36 (3.14) | 3.60 (3.26) | 3.96 (3.24) | 3.80 (3.27) | 3.97 (3.24)
256 | 5 | 1.14 (1.10) | 1.18 (1.13) | 1.18 (1.07) | 1.17 (1.13) | 1.17 (1.13) | 1.18 (1.07) | 1.17 (1.12) | 1.26 (1.19) | 1.10 (1.14) | 1.26 (1.19)
256 | 6 | 1.93 (1.83) | 1.91 (1.89) | 1.81 (1.80) | 1.91 (1.87) | 1.91 (1.87) | 1.81 (1.80) | 1.91 (1.88) | 2.13 (1.94) | 2.05 (1.93) | 2.13 (1.95)
256 | 7 | 1.75 (1.82) | 1.77 (1.99) | 1.68 (1.83) | 1.76 (1.96) | 1.76 (1.96) | 1.68 (1.83) | 1.76 (1.95) | 1.95 (1.93) | 1.73 (2.04) | 1.95 (1.92)
256 | 8 | 2.69 (2.71) | 2.75 (2.78) | 2.81 (2.66) | 2.67 (2.71) | 2.67 (2.71) | 2.81 (2.66) | 2.75 (2.75) | 3.02 (2.88) | 2.56 (2.59) | 3.03 (2.88)
1000 | 1 | 0.40 (0.36) | 0.39 (0.36) | 0.38 (0.35) | 0.39 (0.36) | 0.39 (0.36) | 0.38 (0.35) | 0.39 (0.36) | 0.43 (0.39) | 0.41 (0.36) | 0.43 (0.39)
1000 | 2 | 0.72 (0.70) | 0.70 (0.69) | 0.67 (0.67) | 0.70 (0.69) | 0.70 (0.69) | 0.67 (0.67) | 0.70 (0.69) | 0.75 (0.71) | 0.73 (0.72) | 0.75 (0.71)
1000 | 3 | 2.01 (2.09) | 2.05 (2.22) | 2.02 (2.16) | 2.04 (2.15) | 2.04 (2.15) | 2.02 (2.16) | 2.05 (2.20) | 2.23 (2.29) | 1.99 (2.09) | 2.23 (2.30)
1000 | 4 | 0.99 (0.79) | 0.97 (0.82) | 0.93 (0.80) | 0.97 (0.81) | 0.97 (0.81) | 0.93 (0.80) | 0.97 (0.82) | 1.03 (0.82) | 1.00 (0.82) | 1.03 (0.83)
1000 | 5 | 0.31 (0.31) | 0.31 (0.31) | 0.31 (0.30) | 0.31 (0.31) | 0.31 (0.31) | 0.31 (0.30) | 0.31 (0.31) | 0.33 (0.32) | 0.30 (0.32) | 0.33 (0.32)
1000 | 6 | 0.47 (0.43) | 0.47 (0.43) | 0.46 (0.42) | 0.47 (0.43) | 0.47 (0.43) | 0.46 (0.42) | 0.47 (0.43) | 0.50 (0.45) | 0.49 (0.43) | 0.50 (0.45)
1000 | 7 | 0.46 (0.46) | 0.46 (0.48) | 0.44 (0.45) | 0.46 (0.48) | 0.46 (0.48) | 0.44 (0.45) | 0.46 (0.48) | 0.49 (0.50) | 0.46 (0.48) | 0.49 (0.50)
1000 | 8 | 0.72 (0.74) | 0.74 (0.75) | 0.75 (0.74) | 0.73 (0.75) | 0.73 (0.75) | 0.75 (0.74) | 0.74 (0.75) | 0.78 (0.80) | 0.70 (0.72) | 0.78 (0.81)
Table 2. For each of the eight target distributions ( i = 1 , 2 , , 8 ) and each of the two sample sizes ( n = 256 , 1000 ), a cell represents the mean of the ISE i , j , n ( k ) , k = 1 , 2 , , M , minus the lowest ISE mean for that line (i.e., minus the ISE mean of the best estimator for that specific target distribution and sample size). For each estimator ( j = 1 , 2 , , 10 ) and each sample size, the total of those differences to the best ISE means is calculated on the line called “total”. For each sample size, the lowest totals are highlighted in cyan.
(values ×10⁴; each cell shows the difference with the lowest ISE mean on that line)
n | i | Gam (j=1) | IGam (j=2) | LN (j=3) | IGau (j=4) | RIG (j=5) | B-S (j=6) | W (j=7) | OK (j=8) | BK (j=9) | EDF (j=10)
256 | 1 | 0.08 | 0.06 | 0.00 | 0.06 | 0.06 | 0.00 | 0.06 | 0.23 | 0.16 | 0.23
256 | 2 | 0.23 | 0.14 | 0.00 | 0.14 | 0.13 | 0.00 | 0.14 | 0.40 | 0.32 | 0.40
256 | 3 | 0.08 | 0.15 | 0.01 | 0.08 | 0.07 | 0.00 | 0.12 | 0.82 | 0.09 | 0.82
256 | 4 | 0.38 | 0.24 | 0.00 | 0.25 | 0.24 | 0.00 | 0.23 | 0.60 | 0.43 | 0.60
256 | 5 | 0.05 | 0.08 | 0.09 | 0.07 | 0.07 | 0.08 | 0.08 | 0.16 | 0.00 | 0.17
256 | 6 | 0.12 | 0.10 | 0.00 | 0.10 | 0.10 | 0.00 | 0.10 | 0.32 | 0.24 | 0.32
256 | 7 | 0.07 | 0.10 | 0.00 | 0.09 | 0.09 | 0.00 | 0.08 | 0.27 | 0.05 | 0.27
256 | 8 | 0.13 | 0.19 | 0.25 | 0.11 | 0.11 | 0.25 | 0.19 | 0.46 | 0.00 | 0.46
256 | total | 1.14 | 1.07 | 0.35 | 0.89 | 0.88 | 0.34 | 0.99 | 3.26 | 1.29 | 3.28
1000 | 1 | 0.02 | 0.01 | 0.00 | 0.01 | 0.01 | 0.00 | 0.01 | 0.05 | 0.03 | 0.05
1000 | 2 | 0.04 | 0.02 | 0.00 | 0.02 | 0.02 | 0.00 | 0.02 | 0.07 | 0.06 | 0.08
1000 | 3 | 0.02 | 0.06 | 0.03 | 0.05 | 0.05 | 0.03 | 0.06 | 0.24 | 0.00 | 0.24
1000 | 4 | 0.06 | 0.04 | 0.00 | 0.04 | 0.04 | 0.00 | 0.04 | 0.10 | 0.07 | 0.10
1000 | 5 | 0.01 | 0.02 | 0.02 | 0.01 | 0.01 | 0.02 | 0.02 | 0.03 | 0.00 | 0.03
1000 | 6 | 0.02 | 0.02 | 0.00 | 0.02 | 0.02 | 0.00 | 0.02 | 0.04 | 0.04 | 0.04
1000 | 7 | 0.01 | 0.02 | 0.00 | 0.02 | 0.02 | 0.00 | 0.02 | 0.05 | 0.02 | 0.05
1000 | 8 | 0.02 | 0.04 | 0.05 | 0.03 | 0.03 | 0.05 | 0.04 | 0.08 | 0.00 | 0.08
1000 | total | 0.20 | 0.23 | 0.10 | 0.20 | 0.20 | 0.10 | 0.24 | 0.66 | 0.22 | 0.68
Figure 2. The Burr ( 1 , 3 , 1 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the Burr ( 1 , 3 , 1 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .
Figure 3. The Gamma ( 0.6 , 2 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the Gamma ( 0.6 , 2 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .
Figure 4. The Gamma ( 4 , 2 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the Gamma ( 4 , 2 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .
Figure 5. The GeneralizedPareto ( 0.4 , 1 , 0 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the GeneralizedPareto ( 0.4 , 1 , 0 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .
Figure 6. The HalfNormal ( 1 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the HalfNormal ( 1 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .
Figure 7. The LogNormal ( 0 , 0.75 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the LogNormal ( 0 , 0.75 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .
Figure 8. The Weibull ( 1.5 , 1.5 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the Weibull ( 1.5 , 1.5 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .
Figure 9. The Weibull ( 3 , 2 ) density function appears on the top-left, and the target c.d.f. is depicted in red everywhere else. Each plot has ten estimates in blue for the Weibull ( 3 , 2 ) c.d.f. using one of the ten estimators (the name of the corresponding kernel is indicated above each graph) and n = 256 .

9. Discussion of the Simulation Results

In Table 1, the mean and standard deviation of the $\mathrm{ISE}_{i,j,n}^{(k)}$, $k = 1, 2, \ldots, M$, are displayed for the eight target distributions ($i = 1, 2, \ldots, 8$), the ten estimators ($j = 1, 2, \ldots, 10$) and the two sample sizes ($n = 256, 1000$). All the values presented in the table have been multiplied by $10^4$. For each line in the table (i.e., for each target distribution and each sample size), the lowest ISE means are highlighted in cyan. We see that the LogNormal (LN) and Birnbaum–Saunders (B–S) kernel c.d.f. estimators performed the best (had the lowest ISE means) for the majority of the target distributions considered (for $i = 1, 2, 3, 4, 6, 7$ when $n = 256$, and for $i = 1, 2, 4, 6, 7$ when $n = 1000$). They also always did so as a pair, with the same ISE mean up to the second decimal. For the remaining cases, the boundary kernel c.d.f. estimator (BK) from Tenreiro [57] had the lowest ISE means. As expected, the ordinary kernel c.d.f. estimator and the empirical c.d.f. performed the worst. The standard deviations are fairly stable across all estimators for any given target distribution and sample size (this can also be seen in Figure 1), so our analysis focuses on the ISE means. In [41], the authors reported that the empirical c.d.f. performed better than the BK estimator, but this has to be a programming error (especially since the bandwidth was optimized with a plug-in method). Overall, our means and standard deviations in Table 1 seem to be lower than the ones reported in [41], at least in part because we used a more precise option (the pracma::integral function in R) to approximate the integrals involved in the bandwidth selection procedures and the computation of the ISE's. In all cases, the asymmetric kernel estimators were at least competitive with the BK estimator in Table 1. To give an idea of the shape of the eight target distributions and the corresponding estimates for each of the ten estimators, we plotted the eight target c.d.f.s and ten estimates for each estimator (one figure for each of the eight target distributions, ten graphs per figure for the ten estimators, and ten estimates per graph) in Figures 2–9 when the sample size is $n = 256$.
In Table 2, for each of the eight target distributions ($i = 1, 2, \ldots, 8$) and each of the two sample sizes ($n = 256, 1000$), a cell represents the mean of the $\mathrm{ISE}_{i,j,n}^{(k)}$, $k = 1, 2, \ldots, M$, minus the lowest ISE mean for that line (i.e., minus the ISE mean of the best estimator for that specific target distribution and sample size). For each estimator ($j = 1, 2, \ldots, 10$) and each sample size, the total of those differences to the best ISE means is calculated on the line called "total". For each sample size, the lowest totals are highlighted in cyan. We see that Table 2 paints a nice picture of the asymmetric kernel c.d.f. estimators' performance. Indeed, it shows that for each sample size ($n = 256, 1000$), the total of the differences to the best ISE means is significantly lower for the LogNormal (LN) and Birnbaum–Saunders (B–S) kernel c.d.f. estimators compared to all the other alternatives. For instance, the boundary kernel (BK) c.d.f. estimator would have been the go-to method in the past, but our results show that the total (over the eight target distributions) of the ISE mean differences to the best ISE means is more than three times lower for the LN and B–S kernel c.d.f. estimators compared to the BK c.d.f. estimator when $n = 256$, and similarly, it is more than two times lower for the LN and B–S kernel c.d.f. estimators compared to the BK c.d.f. estimator when $n = 1000$. Even if we put aside the best asymmetric kernel c.d.f. estimators, the totals of the ISE mean differences to the best ISE means for all the other asymmetric kernel c.d.f. estimators are also lower than for the BK c.d.f. estimator when $n = 256$, and they are in the same range (or better in the case of LN and B–S) when $n = 1000$. This means that all the asymmetric kernel estimators are overall better alternatives (or at least always remain competitive) compared to the BK estimator, although the advantage seems to dissipate (except for LN and B–S) when $n$ increases.

10. Conclusions

In this paper, we considered five new asymmetric kernel c.d.f. estimators, namely the Gamma (Gam), inverse Gamma (IGam), LogNormal (LN), inverse Gaussian (IGau) and reciprocal inverse Gaussian (RIG) kernel c.d.f. estimators. We proved the asymptotic normality of these estimators and we also found asymptotic expressions for their bias, variance, mean squared error and mean integrated squared error. The expressions for the optimal bandwidth under the mean integrated squared error were used in each case to implement a bandwidth selection procedure in our simulation study. With the same experimental design as Mombeni et al. [41] (but with an improved approximation of the integrals involved in the bandwidth selection procedures and the computation of the ISE ’s), our results show that the LogNormal and Birnbaum–Saunders kernel c.d.f. estimators perform the best overall. The results also show that all seven asymmetric kernel c.d.f. estimators are better in some cases and always at least competitive against the boundary kernel alternative presented by Tenreiro [57]. In that sense, all seven asymmetric kernel c.d.f. estimators are safe to use in place of more traditional methods. We recommend using the LogNormal and Birnbaum–Saunders kernel c.d.f. estimators in the future.

Author Contributions

Conceptualisation, methodology, investigation, formal analysis, software, coding and simulations, validation, visualisation, writing—original draft preparation, writing—review and editing, review of the literature, theoretical results and proofs, F.O.; software, coding and simulations, validation, P.L.d.M. All authors have read and agreed to the published version of the manuscript.

Funding

F. Ouimet is supported by postdoctoral fellowships from the NSERC (PDF) and the FRQNT (B3X supplement and B3XR).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The R code for the simulations in Section 8 is available online.

Acknowledgments

We thank Benedikt Funke for reminding us of the representation $\min\{T_1, T_2\} = \frac{1}{2}(T_1 + T_2) - \frac{1}{2}|T_1 - T_2|$, which helped tighten the MSE and MISE results in Section 6 and Section 7. This research includes computations using the computational cluster Katana supported by Research Technology Services at UNSW Sydney. We thank the referees for their insightful remarks, which led to improvements in the presentation of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of the Results for the Gam Kernel

Proof of Lemma 1. 
If $T$ denotes a random variable with the density
$k_{\mathrm{Gam}}(t \mid \alpha, \theta) = \frac{t^{\alpha-1} e^{-t/\theta}}{\theta^{\alpha}\Gamma(\alpha)}, \quad \text{with } (\alpha,\theta) = (b^{-1}x + 1,\, b),$
then integration by parts yields
$\mathbb{E}[\hat F_{n,b}^{\mathrm{Gam}}(x)] - F(x) = \mathbb{E}[F(T)] - F(x) = f(x)\cdot\mathbb{E}[T - x] + \tfrac{1}{2}f'(x)\cdot\mathbb{E}[(T - x)^2] + o_x\big(\mathbb{E}[(T - x)^2]\big) = f(x)\cdot b + \tfrac{1}{2}f'(x)\cdot b(2b + x) + o_x(b) = b\cdot\big(f(x) + \tfrac{x}{2}f'(x)\big) + o_x(b).$
Now, we want to compute the expression for the variance. Let $S$ be a random variable with density $t \mapsto 2\,k_{\mathrm{Gam}}(t \mid b^{-1}x+1, b)\,\bar K_{\mathrm{Gam}}(t \mid b^{-1}x+1, b)$ and note that $\min\{T_1, T_2\}$ has that particular distribution if $T_1, T_2 \sim \mathrm{Gamma}(b^{-1}x+1, b)$ are independent. Then, integration by parts and Corollary A1 yield, for any given $x \in (0,\infty)$,
$\mathbb{E}\big[\bar K_{\mathrm{Gam}}^2(X_1 \mid b^{-1}x+1, b)\big] = \mathbb{E}[F(S)] = F(x) + f(x)\cdot\mathbb{E}[S - x] + \mathcal{O}_x\big(\mathbb{E}[(S - x)^2]\big) = F(x) + f(x)\cdot\Big(-\sqrt{\tfrac{bx}{\pi}} + \mathcal{O}_x(b)\Big) + \mathcal{O}_x(b) = F(x) - b^{1/2}\cdot\sqrt{\tfrac{x}{\pi}}\,f(x) + \mathcal{O}_x(b),$
so that
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{Gam}}(x)) = n^{-1}\,\mathbb{E}\big[\bar K_{\mathrm{Gam}}^2(X_1 \mid b^{-1}x+1, b)\big] - n^{-1}\big(\mathbb{E}[\hat F_{n,b}^{\mathrm{Gam}}(x)]\big)^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\sqrt{\tfrac{x}{\pi}}\,f(x) + \mathcal{O}_x(n^{-1}b).$
This ends the proof. □
Proof of Proposition 2. 
Note that $\hat F_{n,b}^{\mathrm{Gam}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{Gam}}(x)] = \frac{1}{n}\sum_{i=1}^{n} Z_{i,b}$, where
$Z_{i,b} = \bar K_{\mathrm{Gam}}(X_i \mid b^{-1}x+1, b) - \mathbb{E}\big[\bar K_{\mathrm{Gam}}(X_i \mid b^{-1}x+1, b)\big], \quad 1 \le i \le n,$
are i.i.d. and centered random variables. It suffices to show the following Lindeberg condition for double arrays (see, e.g., Section 1.9.3 in [62]): for every $\varepsilon > 0$,
$s_b^{-2}\,\mathbb{E}\big[Z_{1,b}^2\,\mathbb{1}_{\{|Z_{1,b}| > \varepsilon\, n^{1/2} s_b\}}\big] \to 0, \quad \text{as } n \to \infty,$
where $s_b^2 = \mathbb{E}[Z_{1,b}^2]$ and $b = b(n) \to 0$. This follows from the fact that $|Z_{1,b}| \le 2$ for all $b > 0$, and $s_b = \big(n\,\mathrm{Var}(\hat F_{n,b}^{\mathrm{Gam}}(x))\big)^{1/2} \to \sqrt{F(x)(1-F(x))}$ as $n \to \infty$ by Lemma 1. □

Appendix B. Proof of the Results for the IGam Kernel

Proof of Lemma 2. 
If $T$ denotes a random variable with the density
$k_{\mathrm{IGam}}(t \mid \alpha, \theta) = \frac{t^{-\alpha-1} e^{-1/(t\theta)}}{\theta^{\alpha}\Gamma(\alpha)}, \quad \text{with } (\alpha,\theta) = (b^{-1} + 1,\, x^{-1}b),$
then integration by parts yields (assuming $0 < b < 1/2$)
$\mathbb{E}[\hat F_{n,b}^{\mathrm{IGam}}(x)] - F(x) = \mathbb{E}[F(T)] - F(x) = f(x)\cdot\mathbb{E}[T - x] + \tfrac{1}{2}f'(x)\cdot\mathbb{E}[(T - x)^2] + o_x\big(\mathbb{E}[(T - x)^2]\big) = f(x)\cdot 0 + \tfrac{1}{2}f'(x)\cdot\frac{b x^2}{1 - b} + o_x(b) = b\cdot\tfrac{x^2}{2}f'(x) + o_x(b).$
Now, we want to compute the expression for the variance. Let $S$ be a random variable with density $t \mapsto 2\,k_{\mathrm{IGam}}(t \mid b^{-1}+1, x^{-1}b)\,\bar K_{\mathrm{IGam}}(t \mid b^{-1}+1, x^{-1}b)$ and note that $\min\{T_1, T_2\}$ has that particular distribution if $T_1, T_2 \sim \mathrm{IGam}(b^{-1}+1, x^{-1}b)$ are independent. Then, integration by parts and Corollary A2 yield, for any given $x \in (0,\infty)$,
$\mathbb{E}\big[\bar K_{\mathrm{IGam}}^2(X_1 \mid b^{-1}+1, x^{-1}b)\big] = \mathbb{E}[F(S)] = F(x) + f(x)\cdot\mathbb{E}[S - x] + \mathcal{O}_x\big(\mathbb{E}[(S - x)^2]\big) = F(x) + f(x)\cdot\Big(-x\sqrt{\tfrac{b}{\pi}} + \mathcal{O}_x(b)\Big) + \mathcal{O}_x(b) = F(x) - b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + \mathcal{O}_x(b),$
so that
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGam}}(x)) = n^{-1}\,\mathbb{E}\big[\bar K_{\mathrm{IGam}}^2(X_1 \mid b^{-1}+1, x^{-1}b)\big] - n^{-1}\big(\mathbb{E}[\hat F_{n,b}^{\mathrm{IGam}}(x)]\big)^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + \mathcal{O}_x(n^{-1}b).$
This ends the proof. □
Proof of Proposition 4. 
Note that $\hat F_{n,b}^{\mathrm{IGam}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{IGam}}(x)] = \frac{1}{n}\sum_{i=1}^{n} Z_{i,b}$, where
$Z_{i,b} = \bar K_{\mathrm{IGam}}(X_i \mid b^{-1}+1, x^{-1}b) - \mathbb{E}\big[\bar K_{\mathrm{IGam}}(X_i \mid b^{-1}+1, x^{-1}b)\big], \quad 1 \le i \le n,$
are i.i.d. and centered random variables. It suffices to show the following Lindeberg condition for double arrays (see, e.g., Section 1.9.3 in [62]): for every $\varepsilon > 0$,
$s_b^{-2}\,\mathbb{E}\big[Z_{1,b}^2\,\mathbb{1}_{\{|Z_{1,b}| > \varepsilon\, n^{1/2} s_b\}}\big] \to 0, \quad \text{as } n \to \infty,$
where $s_b^2 = \mathbb{E}[Z_{1,b}^2]$ and $b = b(n) \to 0$. This follows from the fact that $|Z_{1,b}| \le 2$ for all $b > 0$, and $s_b = \big(n\,\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGam}}(x))\big)^{1/2} \to \sqrt{F(x)(1-F(x))}$ as $n \to \infty$ by Lemma 2. □

Appendix C. Proof of the Results for the LN Kernel

Proof of Lemma 3. 
If $T$ denotes a random variable with the density
$k_{\mathrm{LN}}(t \mid \mu, \sigma) = \frac{1}{t\sqrt{2\pi\sigma^2}}\exp\Big(-\frac{(\log t - \mu)^2}{2\sigma^2}\Big), \quad \text{with } (\mu,\sigma) = (\log x,\, \sqrt{b}),$
then integration by parts yields
$\mathbb{E}[\hat F_{n,b}^{\mathrm{LN}}(x)] - F(x) = \mathbb{E}[F(T)] - F(x) = f(x)\cdot\mathbb{E}[T - x] + \tfrac{1}{2}f'(x)\cdot\mathbb{E}[(T - x)^2] + o_x\big(\mathbb{E}[(T - x)^2]\big) = f(x)\cdot x(e^{b/2} - 1) + \tfrac{1}{2}f'(x)\cdot x^2(e^{2b} - 2e^{b/2} + 1) + o_x(b) = b\cdot\tfrac{x}{2}\big(f(x) + x f'(x)\big) + o_x(b).$
Now, we want to compute the expression for the variance. Let $S$ be a random variable with density $t \mapsto 2\,k_{\mathrm{LN}}(t \mid \log x, \sqrt{b})\,\bar K_{\mathrm{LN}}(t \mid \log x, \sqrt{b})$ and note that $\min\{T_1, T_2\}$ has that particular distribution if $T_1, T_2 \sim \mathrm{LN}(\log x, \sqrt{b})$ are independent. Then, integration by parts and Corollary A3 yield, for any given $x \in (0,\infty)$,
$\mathbb{E}\big[\bar K_{\mathrm{LN}}^2(X_1 \mid \log x, \sqrt{b})\big] = \mathbb{E}[F(S)] = F(x) + f(x)\cdot\mathbb{E}[S - x] + \mathcal{O}_x\big(\mathbb{E}[(S - x)^2]\big) = F(x) + f(x)\cdot\Big(-b^{1/2}\,\tfrac{x}{\sqrt{\pi}} + \mathcal{O}_x(b)\Big) + \mathcal{O}_x(b) = F(x) - b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + \mathcal{O}_x(b),$
so that
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{LN}}(x)) = n^{-1}\,\mathbb{E}\big[\bar K_{\mathrm{LN}}^2(X_1 \mid \log x, \sqrt{b})\big] - n^{-1}\big(\mathbb{E}[\hat F_{n,b}^{\mathrm{LN}}(x)]\big)^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{x f(x)}{\sqrt{\pi}} + \mathcal{O}_x(n^{-1}b).$
This ends the proof. □
Proof of Proposition 6. 
Note that $\hat F_{n,b}^{\mathrm{LN}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{LN}}(x)] = \frac{1}{n}\sum_{i=1}^{n} Z_{i,b}$, where
$Z_{i,b} = \bar K_{\mathrm{LN}}(X_i \mid \log x, \sqrt{b}) - \mathbb{E}\big[\bar K_{\mathrm{LN}}(X_i \mid \log x, \sqrt{b})\big], \quad 1 \le i \le n,$
are i.i.d. and centered random variables. It suffices to show the following Lindeberg condition for double arrays (see, e.g., Section 1.9.3 in [62]): for every $\varepsilon > 0$,
$s_b^{-2}\,\mathbb{E}\big[Z_{1,b}^2\,\mathbb{1}_{\{|Z_{1,b}| > \varepsilon\, n^{1/2} s_b\}}\big] \to 0, \quad \text{as } n \to \infty,$
where $s_b^2 = \mathbb{E}[Z_{1,b}^2]$ and $b = b(n) \to 0$. This follows from the fact that $|Z_{1,b}| \le 2$ for all $b > 0$, and $s_b = \big(n\,\mathrm{Var}(\hat F_{n,b}^{\mathrm{LN}}(x))\big)^{1/2} \to \sqrt{F(x)(1-F(x))}$ as $n \to \infty$ by Lemma 3. □

Appendix D. Proof of the Results for the IGau Kernel

Proof of Lemma 4. 
If $T$ denotes a random variable with the density
$k_{\mathrm{IGau}}(t \mid \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi t^3}}\exp\Big(-\frac{\lambda(t - \mu)^2}{2\mu^2 t}\Big), \quad \text{with } (\mu,\lambda) = (x,\, b^{-1}x),$
then integration by parts yields
$\mathbb{E}[\hat F_{n,b}^{\mathrm{IGau}}(x)] - F(x) = \mathbb{E}[F(T)] - F(x) = f(x)\cdot\mathbb{E}[T - x] + \tfrac{1}{2}f'(x)\cdot\mathbb{E}[(T - x)^2] + o_x\big(\mathbb{E}[(T - x)^2]\big) = f(x)\cdot 0 + \tfrac{1}{2}f'(x)\cdot x^2 b + o_x(b) = b\cdot\tfrac{x^2}{2}f'(x) + o_x(b).$
Now, we want to compute the expression for the variance. Let $S$ be a random variable with density $t \mapsto 2\,k_{\mathrm{IGau}}(t \mid x, b^{-1}x)\,\bar K_{\mathrm{IGau}}(t \mid x, b^{-1}x)$ and note that $\min\{T_1, T_2\}$, which can also be written as $\frac{1}{2}(T_1 + T_2) - \frac{1}{2}|T_1 - T_2|$, has that particular distribution if $T_1, T_2 \sim \mathrm{IGau}(x, b^{-1}x)$ are independent. Then, integration by parts together with the fact that $\mathbb{E}[T_1] = \mathbb{E}[T_2] = x$ yield, for any given $x \in (0,\infty)$,
$\mathbb{E}\big[\bar K_{\mathrm{IGau}}^2(X_1 \mid x, b^{-1}x)\big] = \mathbb{E}[F(S)] = F(x) + f(x)\cdot\mathbb{E}[S - x] + \mathcal{O}_x\big(\mathbb{E}[(S - x)^2]\big) = F(x) + f(x)\cdot\Big(-\tfrac{1}{2}\mathbb{E}[|T_1 - T_2|]\Big) + \mathcal{O}_x(b) = F(x) - b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + \mathcal{O}_x(b),$
so that
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGau}}(x)) = n^{-1}\,\mathbb{E}\big[\bar K_{\mathrm{IGau}}^2(X_1 \mid x, b^{-1}x)\big] - n^{-1}\big(\mathbb{E}[\hat F_{n,b}^{\mathrm{IGau}}(x)]\big)^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + \mathcal{O}_x(n^{-1}b).$
This ends the proof. □
Proof of Proposition 8. 
Note that $\hat F_{n,b}^{\mathrm{IGau}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{IGau}}(x)] = \frac{1}{n}\sum_{i=1}^{n} Z_{i,b}$, where
$Z_{i,b} = \bar K_{\mathrm{IGau}}(X_i \mid x, b^{-1}x) - \mathbb{E}\big[\bar K_{\mathrm{IGau}}(X_i \mid x, b^{-1}x)\big], \quad 1 \le i \le n,$
are i.i.d. and centered random variables. It suffices to show the following Lindeberg condition for double arrays (see, e.g., Section 1.9.3 in [62]): for every $\varepsilon > 0$,
$s_b^{-2}\,\mathbb{E}\big[Z_{1,b}^2\,\mathbb{1}_{\{|Z_{1,b}| > \varepsilon\, n^{1/2} s_b\}}\big] \to 0, \quad \text{as } n \to \infty,$
where $s_b^2 = \mathbb{E}[Z_{1,b}^2]$ and $b = b(n) \to 0$. This follows from the fact that $|Z_{1,b}| \le 2$ for all $b > 0$, and $s_b = \big(n\,\mathrm{Var}(\hat F_{n,b}^{\mathrm{IGau}}(x))\big)^{1/2} \to \sqrt{F(x)(1-F(x))}$ as $n \to \infty$ by Lemma 4. □

Appendix E. Proof of the Results for the RIG Kernel

Proof of Lemma 5. 
If $T$ denotes a random variable with the density
$k_{\mathrm{RIG}}(t \mid \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi t}}\exp\Big(-\frac{\lambda(1 - \mu t)^2}{2\mu^2 t}\Big), \quad \text{with } (\mu,\lambda) = (x^{-1}(1-b)^{-1},\, x^{-1}b^{-1}),$
then integration by parts yields
$\mathbb{E}[\hat F_{n,b}^{\mathrm{RIG}}(x)] - F(x) = \mathbb{E}[F(T)] - F(x) = f(x)\cdot\mathbb{E}[T - x] + \tfrac{1}{2}f'(x)\cdot\mathbb{E}[(T - x)^2] + o_x\big(\mathbb{E}[(T - x)^2]\big) = f(x)\cdot 0 + \tfrac{1}{2}f'(x)\cdot x^2 b(1 + b) + o_x(b) = b\cdot\tfrac{1}{2}x^2 f'(x) + o_x(b).$
Now, we want to compute the expression for the variance. Let $S$ be a random variable with density $t \mapsto 2\,k_{\mathrm{RIG}}(t \mid x^{-1}(1-b)^{-1}, x^{-1}b^{-1})\,\bar K_{\mathrm{RIG}}(t \mid x^{-1}(1-b)^{-1}, x^{-1}b^{-1})$ and note that $\min\{T_1, T_2\}$, which can also be written as $\frac{1}{2}(T_1 + T_2) - \frac{1}{2}|T_1 - T_2|$, has that particular distribution if $T_1, T_2 \sim \mathrm{RIG}(x^{-1}(1-b)^{-1}, x^{-1}b^{-1})$ are independent. Then, integration by parts together with the fact that $\mathbb{E}[T_1] = \mathbb{E}[T_2] = x$ yield, for any given $x \in (0,\infty)$,
$\mathbb{E}\big[\bar K_{\mathrm{RIG}}^2(X_1 \mid x^{-1}(1-b)^{-1}, x^{-1}b^{-1})\big] = \mathbb{E}[F(S)] = F(x) + f(x)\cdot\mathbb{E}[S - x] + \mathcal{O}_x\big(\mathbb{E}[(S - x)^2]\big) = F(x) + f(x)\cdot\Big(-\tfrac{1}{2}\mathbb{E}[|T_1 - T_2|]\Big) + \mathcal{O}_x(b) = F(x) - b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + \mathcal{O}_x(b),$
so that
$\mathrm{Var}(\hat F_{n,b}^{\mathrm{RIG}}(x)) = n^{-1}\,\mathbb{E}\big[\bar K_{\mathrm{RIG}}^2(X_1 \mid x^{-1}(1-b)^{-1}, x^{-1}b^{-1})\big] - n^{-1}\big(\mathbb{E}[\hat F_{n,b}^{\mathrm{RIG}}(x)]\big)^2 = n^{-1}F(x)(1-F(x)) - n^{-1}b^{1/2}\cdot\tfrac{f(x)}{2}\lim_{b\to 0} b^{-1/2}\,\mathbb{E}[|T_1 - T_2|] + \mathcal{O}_x(n^{-1}b).$
This ends the proof. □
Proof of Proposition 10. 
Note that $\hat F_{n,b}^{\mathrm{RIG}}(x) - \mathbb{E}[\hat F_{n,b}^{\mathrm{RIG}}(x)] = \frac{1}{n}\sum_{i=1}^{n} Z_{i,b}$, where
$Z_{i,b} = \bar K_{\mathrm{RIG}}(X_i \mid x^{-1}(1-b)^{-1}, x^{-1}b^{-1}) - \mathbb{E}\big[\bar K_{\mathrm{RIG}}(X_i \mid x^{-1}(1-b)^{-1}, x^{-1}b^{-1})\big], \quad 1 \le i \le n,$
are i.i.d. and centered random variables. It suffices to show the following Lindeberg condition for double arrays (see, e.g., Section 1.9.3 in [62]): for every $\varepsilon > 0$,
$s_b^{-2}\,\mathbb{E}\big[Z_{1,b}^2\,\mathbb{1}_{\{|Z_{1,b}| > \varepsilon\, n^{1/2} s_b\}}\big] \to 0, \quad \text{as } n \to \infty,$
where $s_b^2 = \mathbb{E}[Z_{1,b}^2]$ and $b = b(n) \to 0$. This follows from the fact that $|Z_{1,b}| \le 2$ for all $b > 0$, and $s_b = \big(n\,\mathrm{Var}(\hat F_{n,b}^{\mathrm{RIG}}(x))\big)^{1/2} \to \sqrt{F(x)(1-F(x))}$ as $n \to \infty$ by Lemma 5. □

Appendix F. Technical Lemmas

The lemma below computes the first two moments for the minimum of two i.i.d. random variables with a Gamma distribution. The proof is a slight generalization of the answer provided by Felix Marin in the following MathStackExchange post (https://math.stackexchange.com/questions/3910094/how-to-compute-this-double-integral-involving-the-gamma-function) (accessed on 15 September 2021).
Lemma A1.
Let $X, Y \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Gamma}(\alpha,\theta)$; then
$$\mathbb{E}\big[(\min\{X,Y\})^j\big] = \theta^j\,\frac{\Gamma(\alpha+j)}{\Gamma(\alpha)} - \frac{j\,\theta^j}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha+j-1/2)}{\Gamma(\alpha)}, \quad j\in\{1,2\},$$
where $\Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t}\,\mathrm{d}t$ denotes the gamma function. In particular, for all $x\in\mathbb{R}$,
$$\mathbb{E}\big[\min\{X,Y\} - x\big] = \theta\alpha - \frac{\theta}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha+1/2)}{\Gamma(\alpha)} - x,$$
$$\mathbb{E}\big[(\min\{X,Y\} - x)^2\big] = \theta^2\alpha(\alpha+1) - \frac{2\theta^2}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha+3/2)}{\Gamma(\alpha)} - 2x\bigg(\theta\alpha - \frac{\theta}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha+1/2)}{\Gamma(\alpha)}\bigg) + x^2.$$
Proof. 
Assume throughout the proof that $j\in\{1,2\}$. By the simple change of variables $(u,v) = (x/\theta,\, y/\theta)$, we have
$$\mathbb{E}\big[(\min\{X,Y\})^j\big]
= 2\int_0^\infty\!\!\int_0^\infty y^j\,\frac{(xy)^{\alpha-1}\, e^{-(x+y)/\theta}}{\theta^{2\alpha}\,\Gamma^2(\alpha)}\,\mathbb{1}_{[0,\infty)}(x-y)\,\mathrm{d}x\,\mathrm{d}y
= \frac{2\,\theta^j}{\Gamma^2(\alpha)}\int_0^\infty\!\!\int_0^\infty u^{\alpha-1} e^{-u}\, v^{\alpha+j-1} e^{-v}\,\mathbb{1}_{[0,\infty)}(u-v)\,\mathrm{d}u\,\mathrm{d}v.$$
By the integral representation of the Heaviside function
$$\mathbb{1}_{[0,\infty)}(x) = \lim_{\varepsilon\to 0^+} \frac{1}{2\pi\mathrm{i}} \int_{-\infty}^{\infty} \frac{e^{\mathrm{i}x\tau}}{\tau - \mathrm{i}\varepsilon}\,\mathrm{d}\tau,$$
the above is
$$\begin{aligned}
&= \frac{2\,\theta^j}{\Gamma^2(\alpha)}\,\lim_{\varepsilon\to 0^+}\frac{1}{2\pi\mathrm{i}}\int_{-\infty}^{\infty}\frac{1}{\tau-\mathrm{i}\varepsilon}\,\underbrace{\int_0^\infty u^{\alpha-1} e^{-(1-\mathrm{i}\tau)u}\,\mathrm{d}u}_{=\,(1-\mathrm{i}\tau)^{-\alpha}\,\Gamma(\alpha)}\;\underbrace{\int_0^\infty v^{\alpha+j-1} e^{-(1+\mathrm{i}\tau)v}\,\mathrm{d}v}_{=\,(1+\mathrm{i}\tau)^{-\alpha-j}\,\Gamma(\alpha+j)}\,\mathrm{d}\tau \\
&= \frac{2\,\theta^j\,\Gamma(\alpha+j)}{\Gamma(\alpha)}\,\lim_{\varepsilon\to 0^+}\frac{1}{2\pi\mathrm{i}}\int_{-\infty}^{\infty}\frac{(1+\tau^2)^{-\alpha}\,(1+\mathrm{i}\tau)^{-j}}{\tau-\mathrm{i}\varepsilon}\,\mathrm{d}\tau \\
&= \frac{2\,\theta^j\,\Gamma(\alpha+j)}{\Gamma(\alpha)}\left\{\mathrm{P.V.}\,\frac{1}{2\pi\mathrm{i}}\int_{-\infty}^{\infty}\frac{(1+\tau^2)^{-\alpha}\,(1+\mathrm{i}\tau)^{-j}}{\tau}\,\mathrm{d}\tau + \frac{1}{2\pi\mathrm{i}}\int_{-\infty}^{\infty}(1+\tau^2)^{-\alpha}\,(1+\mathrm{i}\tau)^{-j}\cdot\mathrm{i}\pi\,\delta(\tau)\,\mathrm{d}\tau\right\},
\end{aligned}$$
where $\delta$ denotes the Dirac delta function. The second term in the last brace is $1/2$ and the principal value is
$$= \frac{1}{2\pi\mathrm{i}}\int_0^\infty \frac{(1+\tau^2)^{-\alpha}}{\tau}\,\Big[(1+\mathrm{i}\tau)^{-j} - (1-\mathrm{i}\tau)^{-j}\Big]\,\mathrm{d}\tau
= -\frac{j}{\pi}\int_0^\infty (1+\tau^2)^{-\alpha-j}\,\mathrm{d}\tau,$$
where we crucially used the fact that $j\in\{1,2\}$ to obtain the last equality. Putting all the work back in (A36), we obtain
$$\mathbb{E}\big[(\min\{X,Y\})^j\big]
= \frac{2\,\theta^j\,\Gamma(\alpha+j)}{\Gamma(\alpha)}\left\{-\frac{j}{\pi}\int_0^\infty(1+\tau^2)^{-\alpha-j}\,\mathrm{d}\tau + \frac{1}{2}\right\}
= \frac{2\,\theta^j\,\Gamma(\alpha+j)}{\Gamma(\alpha)}\left\{-\frac{j}{2\pi}\int_0^\infty t^{1/2-1}\,(1+t)^{-\alpha-j}\,\mathrm{d}t + \frac{1}{2}\right\}.$$
The remaining integral can be evaluated using Ramanujan’s master theorem. Indeed, note that
$$(1+t)^{-\alpha-j} = \sum_{k=0}^{\infty}\binom{-\alpha-j}{k}\,t^k = \sum_{k=0}^{\infty}\binom{\alpha+j+k-1}{k}\,(-t)^k = \sum_{k=0}^{\infty}\varphi(k)\,\frac{(-t)^k}{k!}, \quad \text{with } \varphi(z) = \frac{\Gamma(\alpha+j+z)}{\Gamma(\alpha+j)}.$$
Therefore,
$$\int_0^\infty t^{1/2-1}\,(1+t)^{-\alpha-j}\,\mathrm{d}t = \Gamma(1/2)\,\varphi(-1/2) = \sqrt{\pi}\,\frac{\Gamma(\alpha+j-1/2)}{\Gamma(\alpha+j)}.$$
By putting this result in (A38), we obtain
$$\mathbb{E}\big[(\min\{X,Y\})^j\big] = \theta^j\,\frac{\Gamma(\alpha+j)}{\Gamma(\alpha)} - \frac{j\,\theta^j}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha+j-1/2)}{\Gamma(\alpha)}.$$
This ends the proof. □
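For the interested reader, a quick Monte Carlo check of the closed form in Lemma A1 is sketched below (an editorial addition, not part of the paper); the values of $(\alpha, \theta)$ and the simulation size are arbitrary.
```python
import numpy as np
from scipy.special import gamma as G

# Monte Carlo check of Lemma A1: E[min(X,Y)^j] for X, Y i.i.d. Gamma(alpha, theta).
rng = np.random.default_rng(4)
alpha, theta, n = 3.7, 1.3, 10**6
X = rng.gamma(alpha, theta, size=n)
Y = rng.gamma(alpha, theta, size=n)
m = np.minimum(X, Y)
for j in (1, 2):
    closed = theta**j * G(alpha + j) / G(alpha) \
             - j * theta**j / np.sqrt(np.pi) * G(alpha + j - 0.5) / G(alpha)
    print(f"j={j}: simulation={np.mean(m**j):.4f}  closed form={closed:.4f}")
```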
Corollary A1.
Let $X, Y \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Gamma}(b^{-1}x+1,\, b)$ for some $x, b\in(0,\infty)$; then
$$\mathbb{E}\big[\min\{X,Y\} - x\big] = b\Big(\frac{x}{b}+1\Big) - \frac{b}{\sqrt{\pi}}\cdot\frac{\Gamma(\frac{x}{b}+3/2)}{\Gamma(\frac{x}{b}+1)} - x = -\sqrt{\frac{bx}{\pi}} + b + \mathcal{O}_x(b^{3/2}),$$
$$\mathbb{E}\big[(\min\{X,Y\} - x)^2\big] = b^2\Big(\frac{x}{b}+1\Big)\Big(\frac{x}{b}+2\Big) - \frac{2b^2}{\sqrt{\pi}}\cdot\frac{\Gamma(\frac{x}{b}+5/2)}{\Gamma(\frac{x}{b}+1)} - 2x\Big(x - \sqrt{\frac{bx}{\pi}} + b + \mathcal{O}_x(b^{3/2})\Big) + x^2 = bx + \mathcal{O}_x(b^{3/2}).$$
The lemma below computes the first two moments for the minimum of two i.i.d. random variables with an inverse Gamma distribution.
Lemma A2.
Let $X, Y \overset{\mathrm{i.i.d.}}{\sim} \mathrm{InverseGamma}(\alpha,\theta)$ and assume $\alpha > 2$; then
$$\mathbb{E}\big[(\min\{X,Y\})^j\big] = \theta^{-j}\,\frac{\Gamma(\alpha-j)}{\Gamma(\alpha)} - \frac{j\,\theta^{-j}}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha-j)\,\Gamma(\alpha-1/2)}{\Gamma^2(\alpha)}, \quad j\in\{1,2\}.$$
In particular, for all $x\in\mathbb{R}$,
$$\mathbb{E}\big[\min\{X,Y\} - x\big] = \frac{\theta^{-1}}{\alpha-1}\bigg(1 - \frac{1}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha-1/2)}{\Gamma(\alpha)}\bigg) - x,$$
$$\mathbb{E}\big[(\min\{X,Y\} - x)^2\big] = \frac{\theta^{-2}}{(\alpha-1)(\alpha-2)}\bigg(1 - \frac{2}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha-1/2)}{\Gamma(\alpha)}\bigg) - 2x\,\frac{\theta^{-1}}{\alpha-1}\bigg(1 - \frac{1}{\sqrt{\pi}}\cdot\frac{\Gamma(\alpha-1/2)}{\Gamma(\alpha)}\bigg) + x^2.$$
Proof. 
Assume throughout the proof that $j\in\{1,2\}$. By the simple change of variables $(u,v) = (x^{-1}/\theta,\, y^{-1}/\theta)$ and the reparametrization $\widetilde{\alpha} = \alpha - j > 0$, we have
$$\begin{aligned}
\mathbb{E}\big[(\min\{X,Y\})^j\big]
&= 2\int_0^\infty\!\!\int_0^\infty y^j\,\frac{(xy)^{-\alpha-1}\, e^{-(x^{-1}+y^{-1})/\theta}}{\theta^{2\alpha}\,\Gamma^2(\alpha)}\,\mathbb{1}_{[0,\infty)}(x-y)\,\mathrm{d}x\,\mathrm{d}y \\
&= \frac{2\,\theta^{-j}}{\Gamma^2(\alpha)}\int_0^\infty\!\!\int_0^\infty u^{\alpha-1} e^{-u}\, v^{\alpha-j-1} e^{-v}\,\mathbb{1}_{[0,\infty)}(v-u)\,\mathrm{d}u\,\mathrm{d}v \\
&= \frac{2\,\theta^{-j}}{\Gamma^2(\alpha)}\int_0^\infty\!\!\int_0^\infty u^{\widetilde{\alpha}+j-1} e^{-u}\, v^{\widetilde{\alpha}-1} e^{-v}\,\mathbb{1}_{[0,\infty)}(v-u)\,\mathrm{d}u\,\mathrm{d}v.
\end{aligned}$$
We already evaluated this double integral in the proof of Lemma A1 (with $\alpha$ instead of $\widetilde{\alpha}$). The above is
$$= \frac{2\,\theta^{-j}}{\Gamma^2(\alpha)}\cdot\Gamma(\widetilde{\alpha})\left\{\frac{\Gamma(\widetilde{\alpha}+j)}{2} - \frac{j}{2\sqrt{\pi}}\cdot\Gamma(\widetilde{\alpha}+j-1/2)\right\}.$$
This ends the proof. □
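The same kind of Monte Carlo check applies to Lemma A2 (again an editorial sketch); here the $\mathrm{InverseGamma}(\alpha,\theta)$ variable is simulated as $1/W$ with $W\sim\mathrm{Gamma}(\alpha,\theta)$ in the scale parametrization, which matches the density used in the proof, and $(\alpha,\theta)$ are arbitrary choices with $\alpha > 2$.
```python
import numpy as np
from scipy.special import gamma as G

# Monte Carlo check of Lemma A2 (alpha > 2 is required for j = 2).  The
# InverseGamma(alpha, theta) variable is taken to be 1/W with W ~ Gamma(alpha,
# scale=theta), consistent with the density appearing in the proof.
rng = np.random.default_rng(5)
alpha, theta, n = 4.5, 0.8, 10**6
X = 1.0 / rng.gamma(alpha, theta, size=n)
Y = 1.0 / rng.gamma(alpha, theta, size=n)
m = np.minimum(X, Y)
for j in (1, 2):
    closed = theta**(-j) * G(alpha - j) / G(alpha) \
             - j * theta**(-j) / np.sqrt(np.pi) * G(alpha - j) * G(alpha - 0.5) / G(alpha)**2
    print(f"j={j}: simulation={np.mean(m**j):.4f}  closed form={closed:.4f}")
```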
Corollary A2.
Let $X, Y \overset{\mathrm{i.i.d.}}{\sim} \mathrm{InverseGamma}(b^{-1}+1,\, x^{-1}b)$ for some $x\in(0,\infty)$ and $b\in(0,1)$; then
$$\mathbb{E}\big[\min\{X,Y\} - x\big] = -\frac{x}{\sqrt{\pi}}\cdot\frac{\Gamma(b^{-1}+1/2)}{\Gamma(b^{-1}+1)} = -x\sqrt{\frac{b}{\pi}} + \mathcal{O}_x(b^{3/2}),$$
$$\mathbb{E}\big[(\min\{X,Y\} - x)^2\big] = x^2\Big(\frac{1}{1-b} - 1\Big) - \frac{2x^2}{\sqrt{\pi}}\cdot\frac{\Gamma(b^{-1}+1/2)}{\Gamma(b^{-1}+1)}\Big(\frac{1}{1-b} - 1\Big) = bx^2 + \mathcal{O}_x(b^{3/2}).$$
The lemma below computes the first two moments for the minimum of two i.i.d. random variables with a LogNormal distribution.
Lemma A3.
Let $X, Y \overset{\mathrm{i.i.d.}}{\sim} \mathrm{LogNormal}(\mu,\sigma)$; then
$$\mathbb{E}\big[(\min\{X,Y\})^a\big] = 2\, e^{a\mu + \frac{(a\sigma)^2}{2}}\,\Phi\Big(\!-\frac{a\sigma}{\sqrt{2}}\Big), \quad a > 0,$$
where $\Phi$ denotes the c.d.f. of the standard normal distribution. In particular, for all $x\in\mathbb{R}$,
$$\mathbb{E}\big[\min\{X,Y\} - x\big] = 2\,e^{\mu+\frac{\sigma^2}{2}}\,\Phi\Big(\!-\frac{\sigma}{\sqrt{2}}\Big) - x, \qquad
\mathbb{E}\big[(\min\{X,Y\} - x)^2\big] = 2\,e^{2\mu+2\sigma^2}\,\Phi\big(\!-\sqrt{2}\,\sigma\big) - 4x\,e^{\mu+\frac{\sigma^2}{2}}\,\Phi\Big(\!-\frac{\sigma}{\sqrt{2}}\Big) + x^2.$$
Proof. 
With the change of variables
$$\begin{pmatrix} u \\ v \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}, \qquad \bigg|\frac{\mathrm{d}(u,v)}{\mathrm{d}(x,y)}\bigg| = 1,$$
we have
$$\begin{aligned}
\mathbb{E}\big[(\min\{X,Y\})^a\big]
&= 2\int_{-\infty}^{\infty}\int_{y}^{\infty} e^{a(\mu+\sigma y)}\,\frac{1}{2\pi}\,e^{-\frac{x^2+y^2}{2}}\,\mathrm{d}x\,\mathrm{d}y
= 2\int_{-\infty}^{\infty}\int_{0}^{\infty} e^{a\big(\mu+\sigma\cdot\frac{v-u}{\sqrt{2}}\big)}\,\frac{1}{2\pi}\,e^{-\frac{u^2+v^2}{2}}\,\mathrm{d}u\,\mathrm{d}v \\
&= 2\,e^{a\mu+\frac{(a\sigma)^2}{2}}\int_{-\infty}^{\infty}\int_{0}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(u+\frac{a\sigma}{\sqrt{2}})^2}{2}}\cdot\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(v-\frac{a\sigma}{\sqrt{2}})^2}{2}}\,\mathrm{d}u\,\mathrm{d}v
= 2\,e^{a\mu+\frac{(a\sigma)^2}{2}}\,\Phi\Big(\!-\frac{a\sigma}{\sqrt{2}}\Big).
\end{aligned}$$
This ends the proof. □
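Finally, the closed form in Lemma A3 can be checked against simulation as follows (an editorial sketch; the values of $(\mu,\sigma)$ and the exponents $a$ are arbitrary illustrative choices).
```python
import numpy as np
from scipy.stats import norm

# Monte Carlo check of Lemma A3: E[min(X,Y)^a] for X, Y i.i.d. LogNormal(mu, sigma).
rng = np.random.default_rng(6)
mu, sigma, n = 0.3, 0.7, 10**6
X = np.exp(mu + sigma * rng.standard_normal(n))
Y = np.exp(mu + sigma * rng.standard_normal(n))
m = np.minimum(X, Y)
for a in (1.0, 2.0):
    closed = 2.0 * np.exp(a * mu + (a * sigma) ** 2 / 2.0) * norm.cdf(-a * sigma / np.sqrt(2.0))
    print(f"a={a}: simulation={np.mean(m**a):.4f}  closed form={closed:.4f}")
```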
Corollary A3.
Let $X, Y \overset{\mathrm{i.i.d.}}{\sim} \mathrm{LogNormal}(\log x, \sqrt{b})$ for some $x, b\in(0,\infty)$; then
$$\mathbb{E}\big[\min\{X,Y\} - x\big] = x\Big(2\,e^{b/2}\,\Phi\Big(\!-\sqrt{\tfrac{b}{2}}\Big) - 1\Big) = -x\sqrt{\frac{b}{\pi}} + \frac{bx}{2} + \mathcal{O}_x(b^{3/2}), \qquad
\mathbb{E}\big[(\min\{X,Y\} - x)^2\big] = x^2\Big(2\,e^{2b}\,\Phi\big(\!-\sqrt{2b}\big) - 4\,e^{b/2}\,\Phi\Big(\!-\sqrt{\tfrac{b}{2}}\Big) + 1\Big) = bx^2 + \mathcal{O}_x(b^{3/2}).$$

References

1. Aitchison, J.; Lauder, I.J. Kernel density estimation for compositional data. J. R. Stat. Soc. Ser. C 1985, 34, 129–137.
2. Chen, S.X. Beta kernel estimators for density functions. Comput. Stat. Data Anal. 1999, 31, 131–145.
3. Chen, S.X. Probability density function estimation using gamma kernels. Ann. Inst. Stat. Math. 2000, 52, 471–480.
4. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. 1956, 27, 832–837.
5. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 1962, 33, 1065–1076.
6. Gasser, T.; Müller, H.G. Kernel estimation of regression functions. In Smoothing Techniques for Curve Estimation; Springer: Berlin/Heidelberg, Germany, 1979; pp. 23–68.
7. Rice, J. Boundary modification for kernel regression. Comm. Stat. A Theory Methods 1984, 13, 893–900.
8. Gasser, T.; Müller, H.G.; Mammitzsch, V. Kernels for nonparametric curve estimation. J. R. Stat. Soc. Ser. B 1985, 47, 238–252.
9. Müller, H.G. Smooth optimum kernel estimators near endpoints. Biometrika 1991, 78, 521–530.
10. Zhang, S.; Karunamuni, R.J. On kernel density estimation near endpoints. J. Stat. Plann. Inference 1998, 70, 301–316.
11. Zhang, S.; Karunamuni, R.J. On nonparametric density estimation at the boundary. J. Nonparametr. Stat. 2000, 12, 197–221.
12. Bouezmarni, T.; Rolin, J.M. Consistency of the beta kernel density function estimator. Canad. J. Stat. 2003, 31, 89–98.
13. Renault, O.; Scaillet, O. On the way to recovery: A nonparametric bias free estimation of recovery rate densities. J. Bank. Financ. 2004, 28, 2915–2931.
14. Fernandes, M.; Monteiro, P.K. Central limit theorem for asymmetric kernel functionals. Ann. Inst. Stat. Math. 2005, 57, 425–442.
15. Hirukawa, M. Nonparametric multiplicative bias correction for kernel-type density estimation on the unit interval. Comput. Stat. Data Anal. 2010, 54, 473–495.
16. Bouezmarni, T.; Rombouts, J.V.K. Nonparametric density estimation for multivariate bounded data. J. Stat. Plann. Inference 2010, 140, 139–152.
17. Zhang, S.; Karunamuni, R.J. Boundary performance of the beta kernel estimators. J. Nonparametr. Stat. 2010, 22, 81–104.
18. Bertin, K.; Klutchnikoff, N. Minimax properties of beta kernel estimators. J. Stat. Plann. Inference 2011, 141, 2287–2297.
19. Bertin, K.; Klutchnikoff, N. Adaptive estimation of a density function using beta kernels. ESAIM Probab. Stat. 2014, 18, 400–417.
20. Igarashi, G. Bias reductions for beta kernel estimation. J. Nonparametr. Stat. 2016, 28, 1–30.
21. Jin, X.; Kawczak, J. Birnbaum-Saunders and lognormal kernel estimators for modelling durations in high frequency financial data. Ann. Econ. Financ. 2003, 4, 103–124. Available online: http://aeconf.com/Articles/May2003/aef040106.pdf (accessed on 15 September 2021).
22. Scaillet, O. Density estimation using inverse and reciprocal inverse Gaussian kernels. J. Nonparametr. Stat. 2004, 16, 217–226.
23. Bouezmarni, T.; Scaillet, O. Consistency of asymmetric kernel density estimators and smoothed histograms with application to income data. Econom. Theor. 2005, 21, 390–412.
24. Bouezmarni, T.; Rombouts, J.V.K. Density and hazard rate estimation for censored and α-mixing data using gamma kernels. J. Nonparametr. Stat. 2008, 20, 627–643.
25. Bouezmarni, T.; Rombouts, J.V.K. Nonparametric density estimation for positive time series. Comput. Stat. Data Anal. 2010, 54, 245–261.
26. Igarashi, G.; Kakizawa, Y. Re-formulation of inverse Gaussian, reciprocal inverse Gaussian, and Birnbaum-Saunders kernel estimators. Stat. Probab. Lett. 2014, 84, 235–246.
27. Igarashi, G.; Kakizawa, Y. Generalised gamma kernel density estimation for nonnegative data and its bias reduction. J. Nonparametr. Stat. 2018, 30, 598–639.
28. Charpentier, A.; Flachaire, E. Log-transform kernel density estimation of income distribution. L’actualité Économique Rev. D’analyse Économique 2015, 91, 141–159.
29. Igarashi, G. Weighted log-normal kernel density estimation. Comm. Stat. Theory Methods 2016, 45, 6670–6687.
30. Zougab, N.; Adjabi, S. Multiplicative bias correction for generalized Birnbaum-Saunders kernel density estimators and application to nonnegative heavy tailed data. J. Korean Stat. Soc. 2016, 45, 51–63.
31. Kakizawa, Y.; Igarashi, G. Inverse gamma kernel density estimation for nonnegative data. J. Korean Stat. Soc. 2017, 46, 194–207.
32. Kakizawa, Y. Nonparametric density estimation for nonnegative data, using symmetrical-based inverse and reciprocal inverse Gaussian kernels through dual transformation. J. Stat. Plann. Inference 2018, 193, 117–135.
33. Zougab, N.; Harfouche, L.; Ziane, Y.; Adjabi, S. Multivariate generalized Birnbaum-Saunders kernel density estimators. Comm. Stat. Theory Methods 2018, 47, 4534–4555.
34. Zhang, S. A note on the performance of the gamma kernel estimators at the boundary. Stat. Probab. Lett. 2010, 80, 548–557.
35. Kakizawa, Y. Multivariate non-central Birnbaum-Saunders kernel density estimator for nonnegative data. J. Stat. Plann. Inference 2020, 209, 187–207.
36. Ouimet, F.; Tolosana-Delgado, R. Asymptotic properties of Dirichlet kernel density estimators. J. Multivar. Anal. 2022, 187, 104832.
37. Kokonendji, C.C.; Libengué Dobélé-Kpoka, F.G.B. Asymptotic results for continuous associated kernel estimators of density functions. Afr. Diaspora J. Math. 2018, 21, 87–97.
38. Kokonendji, C.C.; Somé, S.M. On multivariate associated kernels to estimate general density functions. J. Korean Stat. Soc. 2018, 47, 112–126.
39. Kokonendji, C.C.; Somé, S.M. Bayesian bandwidths in semiparametric modelling for nonnegative orthant data with diagnostics. Stats 2021, 4, 162–183.
40. Hirukawa, M. Asymmetric Kernel Smoothing; SpringerBriefs in Statistics; Springer: Singapore, 2018; p. xii+110.
41. Mombeni, H.A.; Masouri, B.; Akhoond, M.R. Asymmetric Kernels for Boundary Modification in Distribution Function Estimation. Revstat 2019, 1–27. Available online: https://www.ine.pt/revstat/pdf/Asymmetrickernelsforboundarymodificationindistributionfunctionestimation.pdf (accessed on 15 September 2021).
42. Babu, G.J.; Canty, A.J.; Chaubey, Y.P. Application of Bernstein polynomials for smooth estimation of a distribution and density function. J. Stat. Plann. Inference 2002, 105, 377–392.
43. Leblanc, A. Chung-Smirnov property for Bernstein estimators of distribution functions. J. Nonparametr. Stat. 2009, 21, 133–142.
44. Leblanc, A. On estimating distribution functions using Bernstein polynomials. Ann. Inst. Stat. Math. 2012, 64, 919–943.
45. Leblanc, A. On the boundary properties of Bernstein polynomial estimators of density and distribution functions. J. Stat. Plann. Inference 2012, 142, 2762–2778.
46. Dutta, S. Distribution function estimation via Bernstein polynomial of random degree. Metrika 2016, 79, 239–263.
47. Jmaei, A.; Slaoui, Y.; Dellagi, W. Recursive distribution estimator defined by stochastic approximation method using Bernstein polynomials. J. Nonparametr. Stat. 2017, 29, 792–805.
48. Erdoğan, M.S.; Dişibüyük, C.; Ege Oruç, O. An alternative distribution function estimation method using rational Bernstein polynomials. J. Comput. Appl. Math. 2019, 353, 232–242.
49. Wang, X.; Song, L.; Sun, L.; Gao, H. Nonparametric estimation of the ROC curve based on the Bernstein polynomial. J. Stat. Plann. Inference 2019, 203, 39–56.
50. Babu, G.J.; Chaubey, Y.P. Smooth estimation of a distribution and density function on a hypercube using Bernstein polynomials for dependent random vectors. Stat. Probab. Lett. 2006, 76, 959–969.
51. Belalia, M. On the asymptotic properties of the Bernstein estimator of the multivariate distribution function. Stat. Probab. Lett. 2016, 110, 249–256.
52. Dib, K.; Bouezmarni, T.; Belalia, M.; Kitouni, A. Nonparametric bivariate distribution estimation using Bernstein polynomials under right censoring. Comm. Stat. Theory Methods 2020, 1–11.
53. Ouimet, F. Asymptotic properties of Bernstein estimators on the simplex. J. Multivariate Anal. 2021, 185, 104784.
54. Ouimet, F. On the boundary properties of Bernstein estimators on the simplex. arXiv 2021, arXiv:2006.11756.
55. Hanebeck, A.; Klar, B. Smooth distribution function estimation for lifetime distributions using Szasz-Mirakyan operators. Ann. Inst. Stat. Math. 2021, 1–19.
56. Ouimet, F. On the Le Cam distance between Poisson and Gaussian experiments and the asymptotic properties of Szasz estimators. J. Math. Anal. Appl. 2021, 499, 125033.
57. Tenreiro, C. Boundary kernels for distribution function estimation. REVSTAT Stat. J. 2013, 11, 169–190.
58. Tiago de Oliveira, J. Estatística de densidades: Resultados assintóticos. Rev. Fac. Ciências Lisb. 1963, 9, 111–206.
59. Nadaraja, E.A. Some new estimates for distribution functions. Teor. Verojatnost. i Primenen. 1964, 9, 550–554.
60. Watson, G.S.; Leadbetter, M.R. Hazard analysis. II. Sankhyā Ser. A 1964, 26, 101–116.
61. Altman, N.; Léger, C. Bandwidth selection for kernel distribution function estimation. J. Stat. Plann. Inference 1995, 46, 195–214.
62. Serfling, R.J. Approximation Theorems of Mathematical Statistics; Wiley Series in Probability and Mathematical Statistics; John Wiley & Sons, Inc.: New York, NY, USA, 1980; p. xvi+371.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
