Article

Pointwise Optimality of Wavelet Density Estimation for Negatively Associated Biased Sample

1. State Key Laboratory of Mechanics and Control of Mechanical Structures, Department of Mathematics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
2. School of Mathematics and Computation Science, Anqing Normal University, Anqing 246133, China
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(2), 176; https://doi.org/10.3390/math8020176
Submission received: 14 December 2019 / Revised: 27 January 2020 / Accepted: 28 January 2020 / Published: 2 February 2020
(This article belongs to the Special Issue Probability, Statistics and Their Applications)

Abstract

This paper focuses on the density estimation problem that occurs when the sample is negatively associated and biased. We construct a block thresholding wavelet estimator to recover the density function from the negatively associated biased sample. The pointwise optimality of this wavelet density estimation is shown under $L^p$ ($1 \le p < \infty$) risks over Besov space. To validate the effectiveness of the block thresholding wavelet method, we provide some examples and implement numerical simulations. The results indicate that our block thresholding wavelet density estimator is superior in terms of the mean squared error (MSE) compared with the nonlinear wavelet density estimator.

1. Introduction

Let $X_1, X_2, \ldots, X_n$ be the unobserved realizations of a random variable $X$ with density function $g$, and let $Y_1, Y_2, \ldots, Y_n$ be the recorded observations of a random variable $Y$ with density function:
$$f(y) = \frac{\eta(y)\, g(y)}{\mu}, \qquad (1)$$
where $\eta(y)$ represents a biasing function and $\mu = E[\eta(X)] = \int_0^1 \eta(y) g(y)\,dy$ ($0 < \mu < \infty$) is its expectation.
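As a quick numerical illustration of Model (1), the sketch below builds the biased density $f$ from a weight $\eta$ and a density $g$ and checks that $f$ integrates to one. The choices $\eta(y) = y$ and $g(y) = 3y^3 + y/2$ are taken from Example 1 of Section 4; the grid-based trapezoid integration is our own device, not part of the paper.

```python
import numpy as np

# Model (1): f(y) = eta(y) * g(y) / mu, with mu = E[eta(X)] = ∫ eta*g dy.
eta = lambda y: y                      # length-bias weight (Example 1)
g = lambda y: 3 * y**3 + y / 2         # true density on [0, 1] (Example 1)

y = np.linspace(0.0, 1.0, 100_001)

def integrate(vals, grid):
    # composite trapezoid rule on a uniform grid
    return float(np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(grid)))

mu = integrate(eta(y) * g(y), y)       # mu = 23/30 for this example
f = eta(y) * g(y) / mu                 # density of the observed, biased Y

print(round(mu, 4))                    # 0.7667 (= 23/30)
print(round(integrate(f, y), 4))       # 1.0: f is a proper density
```

Sampling from $f$ rather than $g$ is exactly what makes the estimation problem biased: regions where $\eta$ is large are over-represented in the data.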
To recover the density function $g(y)$ in Model (1) from the sample $Y_1, Y_2, \ldots, Y_n$, some statisticians have conducted thorough explorations [1,2,3,4]. The wavelet method, which can be adapted to represent the local features of the density function, has been widely used in this research. For instance, Ramirez and Vidakovic [2] estimated the density from a stratified size-biased sample using a linear wavelet method and proved the consistency of their wavelet estimator under the $L^2$ risk over Besov space. Chesneau [3] considered a pointwise wavelet estimation under the $L^p$ ($1 \le p < \infty$) risks [4] and extended this univariate wavelet estimation to the multivariate case. However, the above results rely on the independence assumption of the size-biased sample, which is a serious restriction in practical applications. Since the size-biased sample is often dependent, many researchers have modelled the dependence of the sample as negative association (NA), the definition of which is as follows.
Definition 1. 
Let $A$ and $B$ be an arbitrary pair of disjoint nonempty subsets of $\{1, 2, \ldots, n\}$. Assume that $f_1$ and $f_2$ are real-valued coordinate-wise nondecreasing functions. For a sequence of random variables $Y_1, Y_2, \ldots, Y_n$, if the covariances of the random variable functions exist and satisfy
$$\mathrm{Cov}\left( f_1(Y_k, k \in A),\, f_2(Y_m, m \in B) \right) \le 0,$$
then $Y_1, Y_2, \ldots, Y_n$ are said to be NA.
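Definition 1 can be sanity-checked on a classic NA family mentioned below, the coordinates of a multinomial vector: category counts compete for a fixed total, so any pair of counts is negatively correlated. The simulation parameters here are illustrative choices of ours.

```python
import numpy as np

# Coordinates of a multinomial vector are a standard NA example.
# With n = 100 trials and p1 = p2 = 0.3, the exact covariance of the
# first two counts is -n*p1*p2 = -9, which is indeed negative.
rng = np.random.default_rng(0)
counts = rng.multinomial(100, [0.3, 0.3, 0.4], size=200_000)

cov = np.cov(counts[:, 0], counts[:, 1])[0, 1]
print(cov < 0)  # True: the counts compete for a fixed total
```

Definition 1 asks for more than pairwise negative correlation (it covers nondecreasing functions of disjoint groups of variables), but the negative covariance of the identity functions is the simplest necessary consequence to observe.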
The concept of negative association was first proposed by Alam and Saxena [5], and its basic properties were investigated by Joag-Dev and Proschan [6]. Negative association has been widely applied in multivariate statistical analysis and systems reliability since it covers many multivariate distributions, such as: (a) the multinomial, (b) convolutions of unlike multinomials, (c) the Dirichlet, (d) negatively correlated normal distributions, and (e) random sampling without replacement. Some studies investigated the fundamental and asymptotic properties of NA samples ([7,8,9,10,11,12]). Recently, the NA sequence has been introduced into Model (1), and wavelet density estimation for an NA size-biased sample has been studied. For example, Chesneau [13] obtained the optimal convergence rate of the linear wavelet method for an NA size-biased sample under the $L^2$ risk. As the linear wavelet method is not adaptive, Liu and Xu [14] and Guo and Kou [15] considered a nonlinear wavelet method to estimate the density function from an NA (stratified) size-biased sample. This nonlinear wavelet density estimation has been shown to be adaptive, and a pointwise convergence rate under $L^p$ risks was established.
As far as we know, a good density estimation procedure should simultaneously achieve two objectives: computational efficiency and adaptivity [16,17]. Although the nonlinear method in [14,15] is adaptive, its convergence rate is only nearly optimal (up to one logarithmic factor). Cai and Chicken [16] proposed a block thresholding method that can remove the logarithmic factor from the convergence rate of the wavelet density estimation. The block thresholding method provides spatial adaptivity to relatively subtle changes in the underlying density function. More specifically, it thresholds the wavelet coefficients in blocks of length $L$ rather than term by term; thus, it produces a degree of graduated smoothing, which amounts to choosing an appropriate local bandwidth in kernel estimation. In contrast, the hard thresholding method of Liu and Xu [14], applied to density estimation, amounts to using a global bandwidth in kernel estimation.
We aimed to remove the logarithmic factor in the convergence rate of wavelet density estimation and thus enhance its efficiency. We adopted the block thresholding technique [16,18] and allowed the sample to be NA size-biased. We constructed an $L^p$ version of the block thresholding wavelet estimator for the density function $g(y)$ in Model (1). This estimator is adaptive and simultaneously achieves the pointwise optimal convergence rate over Besov space. Some examples are provided, and the corresponding simulations were conducted using R software. The results indicate that the block thresholding wavelet density estimator is better than the nonlinear wavelet density estimator in terms of the mean squared error (MSE).

2. Notations and Assumptions

Denote the scaling function and its associated wavelet function by $\varphi$ and $\psi$, respectively. We assume both $\varphi$ and $\psi$ are $r$-regular and have periodic boundary conditions on $[0,1]$. Let $\varphi_{i_0 j}(y) = 2^{i_0/2}\varphi(2^{i_0} y - j)$ and $\psi_{ij}(y) = 2^{i/2}\psi(2^i y - j)$. We define an orthonormal basis (ONB) of $L^2[0,1]$ as:
$$\Omega = \left\{ \varphi_{i_0 j},\ j = 0, 1, \ldots, 2^{i_0} - 1;\ \psi_{ij},\ i \ge i_0 \ge 0,\ j = 0, 1, \ldots, 2^i - 1 \right\}.$$
Then, a function $g \in L^2[0,1]$ can be reconstructed as:
$$g(y) = \sum_{j=0}^{2^{i_0}-1} \alpha_{i_0 j}\,\varphi_{i_0 j}(y) + \sum_{i \ge i_0} \sum_{j=0}^{2^i - 1} \beta_{ij}\,\psi_{ij}(y), \qquad (4)$$
where $\alpha_{i_0 j} = \int_0^1 g(y)\varphi_{i_0 j}(y)\,dy$ and $\beta_{ij} = \int_0^1 g(y)\psi_{ij}(y)\,dy$.
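The expansion above can be sketched numerically with the simplest wavelet basis, the Haar system ($\varphi = 1$ on $[0,1)$, $\psi = \pm 1$ on the two halves of its support); the Haar choice and the test density $g(y) = 3y^3 + y/2$ from Example 1 are assumptions of this sketch, not the basis the paper requires (which is $r$-regular).

```python
import numpy as np

def haar_phi(i, j, y):
    # scaling function phi_{ij}(y) = 2^(i/2) * phi(2^i y - j), Haar case
    return 2**(i / 2) * ((2**i * y - j >= 0) & (2**i * y - j < 1))

def haar_psi(i, j, y):
    # wavelet psi_{ij}(y) = 2^(i/2) * psi(2^i y - j), Haar case
    x = 2**i * y - j
    return 2**(i / 2) * (((x >= 0) & (x < 0.5)).astype(float) - ((x >= 0.5) & (x < 1)))

g = lambda y: 3 * y**3 + y / 2
y = np.linspace(0, 1, 2**15, endpoint=False) + 2**-16   # cell midpoints
dy = 2**-15

def project(i_max, i0=0):
    # partial sum of the expansion: scaling part at level i0 + details up to i_max
    approx = np.zeros_like(y)
    for j in range(2**i0):
        alpha = np.sum(g(y) * haar_phi(i0, j, y)) * dy  # alpha = ∫ g phi
        approx += alpha * haar_phi(i0, j, y)
    for i in range(i0, i_max):
        for j in range(2**i):
            beta = np.sum(g(y) * haar_psi(i, j, y)) * dy  # beta = ∫ g psi
            approx += beta * haar_psi(i, j, y)
    return approx

coarse = np.max(np.abs(project(3) - g(y)))
fine = np.max(np.abs(project(6) - g(y)))
print(fine < coarse)  # True: adding detail levels sharpens the approximation
```

Each added detail level halves the width of the dyadic cells, so for a smooth $g$ the sup-norm error of the partial sum shrinks accordingly.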
The Besov spaces $B^s_{h,q}$ include the common Sobolev ($B^s_{2,2}$) and Hölder ($B^s_{\infty,\infty}$) spaces, so they offer a flexible collection of smooth functions. For a function $g(y)$ that belongs to the Besov space, the Besov sequence norm $b^s_{h,q}$ of the wavelet coefficients is bounded [19], which implies that a positive constant $K$ exists such that:
$$\|\beta\|_{b^s_{h,q}} = \|\alpha_{i_0 \cdot}\|_h + \left( \sum_{i=i_0}^{\infty} \left( 2^{i\gamma} \|\beta_{i\cdot}\|_h \right)^q \right)^{1/q} \le K,$$
where $\beta$ and $\alpha_{i_0}$ denote the coefficient vectors, $s$ is an index of regularity that satisfies $0 < s < r$, $h$ and $q$ satisfy $1 \le h, q \le \infty$, $\gamma = s + 1/2 - 1/h$, and $s > 1/h$.
To establish our theorem, we list some assumptions that are needed in our proofs.
(A1)
The function $g(y)$ is bounded; that is, a positive constant $c_1$ exists such that:
$$g(y) \le c_1.$$
(A2)
The biasing function $\eta(y)$ is non-increasing, and two positive constants $c_2$ and $c_3$ exist such that, for all $y \in [0,1]$:
$$c_2 \le \eta(y) \le c_3.$$
Remark 1. 
The assumptions (A1) and (A2) are reasonable for obtaining the pointwise adaptivity [3,15]. From the definition of $\mu$, the assumption (A2) implies that:
$$\mu = \int_0^1 \eta(y) g(y)\,dy \ge c_2 \int_0^1 g(y)\,dy = c_2.$$

3. Estimator and Main Result

Let $L = \lfloor (\log_2 n)^{p \vee 2} \rfloor$ and divide the set $\{0, 1, \ldots, 2^i - 1\}$ into consecutive, non-overlapping blocks of length $L$ at every resolution level $i$, i.e.,
$$\Delta_{im} = \left\{ j : (m-1)L + 1 \le j \le mL \right\}, \quad m \in \mathbb{Z}.$$
Consider $\hat{\mu}_n = \left( \frac{1}{n} \sum_{t=1}^{n} \frac{1}{\eta(Y_t)} \right)^{-1}$ as the estimator of $\mu$. Define $\hat{\alpha}_{i_0 j} = \frac{\hat{\mu}_n}{n} \sum_{t=1}^{n} \frac{\varphi_{i_0 j}(Y_t)}{\eta(Y_t)}$ and $\hat{\beta}_{ij} = \frac{\hat{\mu}_n}{n} \sum_{t=1}^{n} \frac{\psi_{ij}(Y_t)}{\eta(Y_t)}$. Denote $i_0 = \lceil \frac{p \vee 2}{2} \log_2 \log_2 n \rceil$ and $i_1 = \lfloor \log_2 (n / \log_2 n) \rfloor$, and let $\sum_{(im)}$ represent the summation over $j \in \Delta_{im}$. Then, the block thresholding estimator is given by:
$$\hat{g}(y) = \sum_{j=0}^{2^{i_0}-1} \hat{\alpha}_{i_0 j}\,\varphi_{i_0 j}(y) + \sum_{i=i_0}^{i_1-1} \sum_{m} \sum_{(im)} \hat{\beta}_{ij}\,\psi_{ij}(y)\, I\!\left\{ \hat{B}_{im} > \lambda \right\}, \qquad (5)$$
where $\hat{B}_{im} = \left( L^{-1} \sum_{(im)} |\hat{\beta}_{ij}|^p \right)^{1/p}$ and the threshold is set to $\lambda = 48 (c_2^{-1}\mu + c_1)^2 n^{-1/2}$.
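The keep-or-kill rule in (5) can be sketched in a few lines: at each level, the empirical coefficients are grouped into blocks of length $L$, the normalized block energy $\hat{B}_{im}$ is computed, and a whole block is either retained or zeroed. The values of $L$, $p$, $\lambda$ and the coefficient vector below are illustrative, not the paper's calibrated choices.

```python
import numpy as np

def block_threshold(beta_hat, L, p, lam):
    """Zero out every length-L block whose normalized l^p energy is <= lam."""
    kept = np.zeros_like(beta_hat)
    for m in range(0, len(beta_hat), L):
        block = beta_hat[m:m + L]
        B_im = np.mean(np.abs(block) ** p) ** (1 / p)   # (L^-1 Σ |beta|^p)^(1/p)
        if B_im > lam:                                   # keep the whole block
            kept[m:m + L] = block
    return kept

beta_hat = np.array([0.9, 1.1, 0.8, 0.02, 0.01, 0.03])
out = block_threshold(beta_hat, L=3, p=2, lam=0.1)
print(out)  # first block survives (B ~ 0.94); second is killed (B ~ 0.02)
```

Because the decision is made per block rather than per coefficient, one large coefficient can rescue its neighbors, which is the source of the graduated smoothing discussed in the introduction.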
We state the pointwise convergence rate of the wavelet density estimator $\hat{g}(y)$ in the following theorem.
Theorem 1. 
Let $Y_1, Y_2, \ldots, Y_n$ be an identically distributed NA sequence from Model (1). Suppose $g \in B^s_{h,q}$ and $s > \max\{1/h, 1/2\}$. For the block thresholding wavelet estimator $\hat{g}(y)$ defined by (5), if the conditions (A1) and (A2) are satisfied, then for $1 \le p < \infty$ and $h \ge p/(2s+1)$, a positive constant $C$ exists such that:
$$\sup_{g \in B^s_{h,q}} E\left| \hat{g}(y) - g(y) \right|^p \le \begin{cases} C\, n^{-ps/(2s+1)}, & \text{for } h \ge p, \\ C \left( n^{-1} \log n \right)^{ps/(2s+1)}, & \text{for } p/(2s+1) \le h < p. \end{cases}$$
Remark 2. 
The block thresholding wavelet estimator is adaptive since $i_0$ and $i_1$ do not rely on $s$, $h$, or $q$. If $p = 2$ and $\eta(y) \equiv 1$, Model (1) reduces to the standard density estimation model, and the convergence rate in Theorem 1 is the same as the results of Cai and Chicken [16]. If the NA bias reduces to the independent bias, the results of Theorem 1 become Theorem 4.1 in Chesneau [3].
Remark 3. 
In the presence of NA bias, Liu and Xu [14] and Guo and Kou [15] established nearly optimal convergence rates (up to a logarithmic term) of a nonlinear wavelet estimator for the density function in Model (1). Note that the logarithmic term in the convergence rate has been removed in Theorem 1, which improves the convergence rates in [14,15].

4. Simulation Study

We used the method of Alam and Saxena [5] and Liu and Xu [14] to generate NA samples: first draw $n$ samples $Z_1, Z_2, \ldots, Z_n$ from the distribution $\mathrm{Beta}(2, 2n-2)$, then set $Y_t = Z_t^{\delta}$ ($\delta > 0$). Thus, $Y_1, Y_2, \ldots, Y_n$ are NA (refer to Liu and Xu [14]). Throughout the simulations, we took $\delta = 1/7$.
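The recipe above can be sketched as follows; the $\mathrm{Beta}(2, 2n-2)$ reading of the generation scheme is our interpretation of the text, and the seed is arbitrary. The power transform preserves negative association because monotone functions of NA variables remain NA (Lemma 1 below).

```python
import numpy as np

# NA sample generation as described above: draw Z_1,...,Z_n from
# Beta(2, 2n-2), then apply the monotone map Y_t = Z_t**delta, delta = 1/7.
rng = np.random.default_rng(42)
n, delta = 2048, 1 / 7
Z = rng.beta(2, 2 * n - 2, size=n)
Y = Z ** delta

print(Y.min() > 0 and Y.max() < 1)  # True: the samples land in (0, 1)
```

The small exponent $\delta = 1/7$ spreads the tightly concentrated Beta draws across $(0, 1)$, giving a sample on the support of the densities in the examples below.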
We now provide three examples, covering linear and nonlinear density functions whose derivatives are continuous or discontinuous.
Example 1. 
Let $\eta(y) = y$ and $g(y) = 3y^3 + \frac{y}{2}$, $y \in [0,1]$; then $g(y)$ is a nonlinear density function with a continuous derivative, $\mu = \frac{23}{30}$, and:
$$f(y) = \frac{30}{23}\left( 3y^4 + \frac{y^2}{2} \right), \quad 0 \le y \le 1.$$
Example 2. 
Let $\eta(y) = y + 1$ and:
$$g(y) = \begin{cases} 2y + \frac{1}{2}, & 0 \le y \le \frac{1}{2}, \\ -2y + \frac{5}{2}, & \frac{1}{2} < y \le 1. \end{cases}$$
$g(y)$ is a linear continuous function, but its derivative is discontinuous at $y = \frac{1}{2}$. We have $\mu = \frac{3}{2}$ and:
$$f(y) = \begin{cases} \frac{2}{3}\left( 2y^2 + \frac{5}{2}y + \frac{1}{2} \right), & 0 \le y \le \frac{1}{2}, \\ \frac{2}{3}\left( -2y^2 + \frac{1}{2}y + \frac{5}{2} \right), & \frac{1}{2} < y \le 1. \end{cases}$$
Example 3. 
Let $\eta(y) = y + 1$ and:
$$g(y) = \begin{cases} 3y^2 + \frac{3}{4}, & 0 \le y \le \frac{1}{2}, \\ 3y^2 - 6y + \frac{15}{4}, & \frac{1}{2} < y \le 1. \end{cases}$$
$g(y)$ is a nonlinear continuous function, but its derivative is discontinuous at $y = \frac{1}{2}$. We have $\mu = \frac{3}{2}$ and:
$$f(y) = \begin{cases} \frac{2}{3}\left( 3y^3 + 3y^2 + \frac{3}{4}y + \frac{3}{4} \right), & 0 \le y \le \frac{1}{2}, \\ \frac{2}{3}\left( 3y^3 - 3y^2 - \frac{9}{4}y + \frac{15}{4} \right), & \frac{1}{2} < y \le 1. \end{cases}$$
For the block wavelet threshold, we set $c_2 = \mu$ and $c_1 = \max_{1 \le t \le n} g(Y_t) + 1$. To evaluate the performance of the density estimators, we considered the MSE, defined as:
$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( g_{\mathrm{est}}(Y_t) - g(Y_t) \right)^2.$$
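The MSE criterion above is evaluated at the sample points themselves. A minimal sketch of its computation follows, with a crude histogram estimator standing in for the wavelet estimate $g_{\mathrm{est}}$ and the toy density $g(y) = 2y$ (our assumption, not one of the paper's examples) so that exact inverse-CDF sampling is available.

```python
import numpy as np

g = lambda y: 2 * y                        # toy density on [0, 1]
rng = np.random.default_rng(1)
n, bins = 2048, 16
Y = np.sqrt(rng.random(n))                 # inverse-CDF sample from g(y) = 2y

# histogram stand-in for g_est, evaluated at each sample point Y_t
hist, _ = np.histogram(Y, bins=bins, range=(0.0, 1.0), density=True)
g_est = hist[np.minimum((Y * bins).astype(int), bins - 1)]

# MSE = (1/n) * sum over t of (g_est(Y_t) - g(Y_t))^2
mse = np.mean((g_est - g(Y)) ** 2)
print(mse < 0.05)  # True: a consistent estimator keeps the MSE small here
```

Replacing the histogram by the block thresholding or nonlinear wavelet estimator in this harness reproduces the kind of comparison reported in Table 1.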
Figure 1, Figure 2 and Figure 3 display the recovery of the density functions by the wavelet density estimators for NA samples in the above examples. Table 1 lists the MSEs of these estimators. The results indicate that both the nonlinear wavelet method and the block thresholding wavelet method performed well in the density estimation problem even though the samples were NA size-biased. The block wavelet density estimator performed better than the nonlinear wavelet density estimator in terms of the MSE, and the estimations became increasingly accurate with increasing sample size.

5. Proof of Theorem 1

The proof of Theorem 1 is similar to those of Theorem 4.2 in Chesneau [18] and Theorem 4.1 in Chesneau [3]. The difference is that our samples are not only biased, but also NA; hence, we had to overcome some non-trivial technical difficulties. Before presenting the detailed proof, we introduce some basic properties and inequalities of NA sequences in the following lemmas.
Lemma 1. 
[13]. For a sequence of NA random variables $Z_1, Z_2, \ldots, Z_n$ and non-empty subsets $B_1, B_2, \ldots, B_k$ of $\{1, 2, \ldots, n\}$, if $B_1, B_2, \ldots, B_k$ are pairwise disjoint and $f_1, f_2, \ldots, f_k$ are $k$ coordinate-wise non-decreasing Borel functions, then $f_1(Z_t, t \in B_1), f_2(Z_t, t \in B_2), \ldots, f_k(Z_t, t \in B_k)$ are still NA.
Lemma 2. 
[20]. Let $\{Z_t, t \ge 1\}$ be NA random variables and $\{Z_t^*, t \ge 1\}$ be independent random variables with the same marginal distributions as $\{Z_t\}$. If $f$ is a non-decreasing function, then:
$$E\, f\!\left( \max_{1 \le k \le n} \sum_{t=1}^{k} Z_t \right) \le E\, f\!\left( \max_{1 \le k \le n} \sum_{t=1}^{k} Z_t^* \right).$$
Lemma 3. 
[20]. If $\{Z_t, t \ge 1\}$ are NA random variables with $E Z_t = 0$ and $E Z_t^2 < \infty$, then we have:
(1) 
Rosenthal-type inequality: If $E|Z_t|^p < \infty$ for some $p \ge 2$, then a constant $C_p$ (depending only on $p$) exists such that:
$$E \max_{1 \le k \le n} \left| \sum_{t=1}^{k} Z_t \right|^p \le C_p \left\{ \sum_{t=1}^{n} E|Z_t|^p + \left( \sum_{t=1}^{n} E Z_t^2 \right)^{p/2} \right\}.$$
(2) 
Kolmogorov-type inequality: Denote $b_n^2 = \sum_{t=1}^{n} E Z_t^2$; then, for all $0 < b < 1$, $z > 0$ and $\tau > 0$:
$$P\left( \max_{1 \le k \le n} \left| \sum_{t=1}^{k} Z_t \right| \ge \tau \right) \le 2 P\left( \max_{1 \le k \le n} |Z_k| > z \right) + \frac{1}{1-b} \exp\left( -\frac{\tau^2 b}{2(\tau z + b_n^2)} \right).$$
Proof of Theorem 1. 
According to the proof of Theorem 4.1 in Chesneau [3], the proof of Theorem 1 will be completed if we can show that the moment inequality
$$E\left| \hat{\beta}_{ij} - \beta_{ij} \right|^{2p} \le C n^{-p}, \qquad (6)$$
and the large deviation inequality
$$P\left( \Big( \sum_{j \in \Delta_{im}} \big| \hat{\beta}_{ij} - \beta_{ij} \big|^p \Big)^{1/p} \ge \frac{\lambda}{2} (\log n)^{1/2} \right) \le n^{-p}, \qquad (7)$$
hold at each resolution level $i$ (the case of the primary resolution level $i_0$ can be treated similarly). Therefore, the remainder of the proof is composed of the following two parts.
Part one: Moment inequality. By the proof of Proposition 4.1 in Chesneau [3], inequality (6) follows from
$$E\left| \hat{\beta}_{ij} - \beta_{ij} \right|^{2p} \le C_p \left( G_{ij} + H_{ij} \right), \qquad (8)$$
where:
$$G_{ij} = E\left| \frac{\mu}{n} \sum_{t=1}^{n} \eta^{-1}(Y_t)\,\psi_{ij}(Y_t) - \beta_{ij} \right|^{2p}, \qquad H_{ij} = E\left| \frac{1}{n} \sum_{t=1}^{n} \left( \eta^{-1}(Y_t) - \mu^{-1} \right) \right|^{2p}.$$
We first consider the term $G_{ij}$. Write:
$$\frac{\mu}{n} \sum_{t=1}^{n} \eta^{-1}(Y_t)\,\psi_{ij}(Y_t) - \beta_{ij} = \frac{1}{n} \sum_{t=1}^{n} \left( \mu \eta^{-1}(Y_t)\,\psi_{ij}(Y_t) - \beta_{ij} \right) =: \frac{1}{n} \sum_{t=1}^{n} \xi_t.$$
Since $\psi$ is a function of bounded variation, two bounded nonnegative nondecreasing functions $\bar{\psi}$ and $\tilde{\psi}$ exist such that $\psi = \bar{\psi} - \tilde{\psi}$. Denote:
$$\bar{\xi}_t = \mu \eta^{-1}(Y_t)\,\bar{\psi}_{ij}(Y_t) - \bar{\beta}_{ij}, \qquad \tilde{\xi}_t = \mu \eta^{-1}(Y_t)\,\tilde{\psi}_{ij}(Y_t) - \tilde{\beta}_{ij}, \qquad (9)$$
where $\bar{\beta}_{ij} = \int_0^1 g(y)\bar{\psi}_{ij}(y)\,dy$ and $\tilde{\beta}_{ij} = \int_0^1 g(y)\tilde{\psi}_{ij}(y)\,dy$; hence, $\xi_t = \bar{\xi}_t - \tilde{\xi}_t$. By Lemma 1 and
$$E \xi_1 = \int_0^1 \mu \eta^{-1}(y)\,\psi_{ij}(y) \cdot \frac{\eta(y) g(y)}{\mu}\,dy - \beta_{ij} = 0,$$
$\bar{\xi}_t$ and $\tilde{\xi}_t$ ($t = 1, 2, \ldots, n$) are zero-mean NA random variables, according to the monotonicity of $\eta(y)$ in assumption (A2).
Note that $|\beta_{ij}| \le \left( \int_0^1 g^2(y)\,dy \right)^{1/2} \left( \int_0^1 \psi_{ij}^2(y)\,dy \right)^{1/2} \le c_1$ and $|\psi_{ij}(y)| \le 2^{i/2} \sup_{y \in [0,1]} |\psi(y)|$ by assumption (A1) and Model (4). It follows that:
$$E|\bar{\xi}_1|^p \le 2^{p-1} \left( E\left| \mu \eta^{-1}(Y_1)\,\bar{\psi}_{ij}(Y_1) \right|^p + |\bar{\beta}_{ij}|^p \right) \le 2^{p-1} \left( \mu^p c_2^{-(p-1)}\, 2^{i(p-2)/2} \Big( \sup_{y \in [0,1]} |\psi(y)| \Big)^{p-2} E\left[ \eta^{-1}(Y_1)\,\bar{\psi}_{ij}^2(Y_1) \right] + c_1^p \right).$$
Moreover:
$$E\left[ \eta^{-1}(Y_1)\,\bar{\psi}_{ij}^2(Y_1) \right] = \int_0^1 \eta^{-1}(y)\,\bar{\psi}_{ij}^2(y) \cdot \mu^{-1} \eta(y) g(y)\,dy \le \mu^{-1} c_1 \int_0^1 \bar{\psi}_{ij}^2(y)\,dy \le \mu^{-1} c_1.$$
Hence, for all resolution levels $i_0 \le i \le i_1$:
$$E|\bar{\xi}_1|^p \le 2^{p-1} \left( c_1 \mu^{p-1} c_2^{-(p-1)}\, 2^{i(p-2)/2} \Big( \sup_{y \in [0,1]} |\psi(y)| \Big)^{p-2} + c_1^p \right) \le 2^{p-1} \left( C n^{(p-2)/2} + c_1^p \right) = O\left( n^{(p-2)/2} \right).$$
Resorting to the Rosenthal-type inequality in Lemma 3, we obtain:
$$E\left| \frac{1}{n} \sum_{t=1}^{n} \bar{\xi}_t \right|^{2p} \le C \left( n^{1-2p}\, E|\bar{\xi}_1|^{2p} + n^{-p} \left( E|\bar{\xi}_1|^2 \right)^p \right) = O\left( n^{-p} \right).$$
Similarly, for $\tilde{\xi}_t$, we have:
$$E\left| \frac{1}{n} \sum_{t=1}^{n} \tilde{\xi}_t \right|^{2p} = O\left( n^{-p} \right).$$
Recall that $\xi_t = \bar{\xi}_t - \tilde{\xi}_t$; then we obtain:
$$G_{ij} = E\left| \frac{1}{n} \sum_{t=1}^{n} \xi_t \right|^{2p} \le \frac{2^{2p-1}}{n^{2p}} \left( E\left| \sum_{t=1}^{n} \bar{\xi}_t \right|^{2p} + E\left| \sum_{t=1}^{n} \tilde{\xi}_t \right|^{2p} \right) = O\left( n^{-p} \right). \qquad (10)$$
Now, we consider the term $H_{ij}$. Denote $\zeta_t = \eta^{-1}(Y_t) - \mu^{-1}$. It is easy to check that $E|\zeta_1|^p \le 2^{p-1} \left( E\eta^{-p}(Y_1) + \mu^{-p} \right) \le C$. Resorting to the Rosenthal-type inequality again, we have:
$$H_{ij} = E\left| \frac{1}{n} \sum_{t=1}^{n} \zeta_t \right|^{2p} \le C \left( n^{1-2p}\, E|\zeta_1|^{2p} + n^{-p} \left( E|\zeta_1|^2 \right)^p \right) \le C \left( n^{1-2p} + n^{-p} \right) = O\left( n^{-p} \right). \qquad (11)$$
Consequently, inequality (6) follows from (8), (10), and (11).
Part two: Large deviation inequality. From the proof of Proposition 4.2 in Chesneau [3], we know that the probability in (7) is bounded by:
$$P\left( \Big( \sum_{j \in \Delta_{im}} \big| \hat{\beta}_{ij} - \beta_{ij} \big|^p \Big)^{1/p} \ge 32 (c_2^{-1}\mu + c_1)^2\, n^{-1/2} (\log_2 n)^{1/2} \right) \le P\left( \left| \frac{1}{n} \sum_{t=1}^{n} \left( \frac{1}{\eta(Y_t)} - \frac{1}{\mu} \right) \right| \ge 8 (c_2^{-1}\mu + c_1)^2\, n^{-1/2} (\log_2 n)^{1/2} \right) + P\left( \Big( \sum_{j \in \Delta_{im}} \Big| \frac{1}{n} \sum_{t=1}^{n} \frac{\mu}{\eta(Y_t)} \psi_{ij}(Y_t) - \beta_{ij} \Big|^p \Big)^{1/p} \ge 8 (c_2^{-1}\mu + c_1)^2\, n^{-1/2} (\log_2 n)^{1/2} \right) =: S_1 + S_2.$$
Now, let us consider the upper bounds of $S_1$ and $S_2$ separately.
Bound for $S_1$: Recall that $\zeta_t = \eta^{-1}(Y_t) - \mu^{-1}$, $t = 1, \ldots, n$, are NA random variables with zero mean and finite variance $\sigma^2$. The triangular inequality yields:
$$|\zeta_t| \le \mu^{-1} + \eta^{-1}(Y_t) \le 2 c_2^{-1}.$$
Taking $\tau = C n^{1/2} (\log_2 n)^{1/2}$ with $C = 8 (c_2^{-1}\mu + c_1)^2$, $b = 1/2$ and $z = 2 c_2^{-1}$ in the Kolmogorov-type inequality in Lemma 3, we have:
$$S_1 \le P\left( \max_{1 \le k \le n} \left| \sum_{t=1}^{k} \zeta_t \right| \ge C n^{1/2} (\log_2 n)^{1/2} \right) \le 4 \exp\left( -\frac{C^2\, n \log_2 n}{4 \left( 2 C c_2^{-1}\, n^{1/2} (\log_2 n)^{1/2} + n \sigma^2 \right)} \right) \le 4 \exp\left( -\frac{C^2}{4 \sigma^2} \log_2 n \right) = O\left( n^{-p} \right).$$
Bound for $S_2$: Recall that $L = \lfloor (\log_2 n)^{p \vee 2} \rfloor$ and set $\lambda^* = 8 (c_2^{-1}\mu + c_1)^2\, n^{-1/2} (\log_2 n)^{1/2}$; then $S_2 = O(n^{-p})$ holds provided that:
$$P\left( \Big( \sum_{j \in \Delta_{im}} \Big| \frac{1}{n} \sum_{t=1}^{n} \frac{\mu}{\eta(Y_t)} \psi_{ij}(Y_t) - \beta_{ij} \Big|^p \Big)^{1/p} \ge \lambda^* \right) \le C \exp\left( -\frac{n \lambda^{*2}}{8 (c_2^{-1}\mu + c_1)^2} \right). \qquad (13)$$
Now, we will check the inequality in (13). Let $d = p/(p-1)$ and define the set $\Theta = \left\{ \theta = (\theta_1, \theta_2, \ldots, \theta_L) \in \mathbb{R}^L : \sum_{t=1}^{L} |\theta_t|^d \le 1 \right\}$. By a duality argument, we have, for all integers $i$, $m$:
$$\Big( \sum_{j \in \Delta_{im}} \Big| \frac{1}{n} \sum_{t=1}^{n} \frac{\mu}{\eta(Y_t)} \psi_{ij}(Y_t) - \beta_{ij} \Big|^p \Big)^{1/p} = \sup_{\theta \in \Theta} \sum_{j=1}^{L} \theta_j \left( \frac{1}{n} \sum_{t=1}^{n} \frac{\mu}{\eta(Y_t)} \psi_{ij}(Y_t) - \beta_{ij} \right). \qquad (14)$$
Therefore, substituting Equation (14) into Equation (13), it suffices to show:
$$P\left( \sup_{\theta \in \Theta} \sum_{j=1}^{L} \theta_j \left( \frac{1}{n} \sum_{t=1}^{n} \frac{\mu}{\eta(Y_t)} \psi_{ij}(Y_t) - \beta_{ij} \right) \ge \lambda^* \right) \le C \exp\left( -\frac{n \lambda^{*2}}{8 (c_2^{-1}\mu + c_1)^2} \right). \qquad (15)$$
To prove Equation (15), we define a stochastic process $\{T_n(\theta), \theta \in \Theta\}$ with:
$$T_n(\theta) = \sum_{j=1}^{L} \theta_j \left( \frac{1}{n} \sum_{t=1}^{n} \frac{\mu}{\eta(Y_t)} \psi_{ij}(Y_t) - \beta_{ij} \right).$$
Note that $\beta_{ij} = \int_0^1 g(y)\psi_{ij}(y)\,dy = \frac{1}{n} \sum_{t=1}^{n} g(Y_t)\psi_{ij}(Y_t)$. Denoting $e_t = \frac{\mu}{\eta(Y_t)} - g(Y_t)$, $T_n(\theta)$ can be rewritten as:
$$T_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} e_t \sum_{j=1}^{L} \theta_j \psi_{ij}(Y_t).$$
Similar to the definitions of $\bar{\psi}$ and $\tilde{\psi}$ in Equation (9), we set $\bar{T}_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} e_t \sum_{j=1}^{L} \theta_j \bar{\psi}_{ij}(Y_t)$ and $\tilde{T}_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} e_t \sum_{j=1}^{L} \theta_j \tilde{\psi}_{ij}(Y_t)$. Let $\varepsilon > 0$; if $\varepsilon \bar{T}_n(\theta) \le 1$, then:
$$E \sup_{\theta \in \Theta} \exp\left( \varepsilon \bar{T}_n(\theta) \right) \le 1 + \sum_{k=2}^{\infty} \frac{E \sup_{\theta \in \Theta} \left( \varepsilon \bar{T}_n(\theta) \right)^k}{k!} \le 1 + \varepsilon^2\, E \sup_{\theta \in \Theta} \bar{T}_n^2(\theta) \left( \frac{1}{2!} + \frac{1}{3!} + \cdots \right) \le \exp\left( \varepsilon^2\, E \sup_{\theta \in \Theta} \bar{T}_n^2(\theta) \right). \qquad (16)$$
Recall that $\sum_{j=1}^{L} |\theta_j|^d \le 1$ with $\frac{1}{d} + \frac{1}{p} = 1$, and $|e_t| \le \frac{\mu}{\eta(Y_t)} + g(Y_t) \le c_2^{-1}\mu + c_1$; invoking Hölder's inequality produces:
$$\exp\left( \varepsilon^2\, E \sup_{\theta \in \Theta} \bar{T}_n^2(\theta) \right) \le \sup_{\theta \in \Theta} \prod_{t=1}^{n} \exp\left( \frac{\varepsilon^2}{n^2}\, E\Big[ \Big( \sum_{j=1}^{L} \theta_j \bar{\psi}_{ij}(Y_t) \Big)^2 e_t^2 \Big] \right) \le \sup_{\theta \in \Theta} \prod_{t=1}^{n} \exp\left( \frac{(c_2^{-1}\mu + c_1)^2 \varepsilon^2}{n^2} \Big( \sum_{j=1}^{L} |\theta_j|^d \Big)^{2/d} \Big( \sum_{j=1}^{L} \big| \bar{\psi}_{ij}(Y_t) \big|^p \Big)^{2/p} \right) \le \exp\left( \frac{(c_2^{-1}\mu + c_1)^2 \varepsilon^2}{n^2} \sum_{t=1}^{n} \Big( \sum_{j=1}^{L} \big| \bar{\psi}_{ij}(Y_t) \big|^p \Big)^{2/p} \right).$$
Note that when $t = j$, $|\bar{\psi}_{ij}(Y_t)|^p = 1$, and for all $t \ne j$, $|\bar{\psi}_{ij}(Y_t)|^p = 0$. Hence, for every $j$ ($j = 1, 2, \ldots, L$), only the counterpart $t = j$ exists such that $|\bar{\psi}_{ij}(Y_t)|^p = 1$; that is, $\sum_{t=1}^{n} \big( \sum_{j=1}^{L} |\bar{\psi}_{ij}(Y_t)|^p \big)^{2/p} = L^{2/p}$, which leads to:
$$\exp\left( \varepsilon^2\, E \sup_{\theta \in \Theta} \bar{T}_n^2(\theta) \right) \le C \exp\left( \frac{(c_2^{-1}\mu + c_1)^2 \varepsilon^2 L^{2/p}}{n^2} \right). \qquad (17)$$
Now, we consider the case $\varepsilon \bar{T}_n(\theta) > 1$. By Lemmas 1 and 2, for $\varepsilon > 0$, we obtain:
$$E \sup_{\theta \in \Theta} \exp\left( \varepsilon \bar{T}_n(\theta) \right) \le \exp\left( \varepsilon^2\, E \sup_{\theta \in \Theta} \bar{T}_n^2(\theta) \right) \le \sup_{\theta \in \Theta} \prod_{t=1}^{n} E \exp\left( \frac{\varepsilon^2}{n^2} \Big( \sum_{j=1}^{L} \theta_j \bar{\psi}_{ij}(Y_t) \Big)^2 e_t^2 \right) \le C \exp\left( \frac{(c_2^{-1}\mu + c_1)^2 \varepsilon^2}{n^2} \sum_{t=1}^{n} \Big( \sum_{j=1}^{L} \big| \bar{\psi}_{ij}(Y_t) \big|^p \Big)^{2/p} \right) \le C \exp\left( \frac{(c_2^{-1}\mu + c_1)^2 \varepsilon^2}{n} \right). \qquad (18)$$
Thus, using Equations (16)–(18) together, we derive that:
$$E \sup_{\theta \in \Theta} \exp\left( \varepsilon \bar{T}_n(\theta) \right) \le C \exp\left( \frac{(c_2^{-1}\mu + c_1)^2 \varepsilon^2}{n} \right).$$
Likewise, for the term $\tilde{T}_n(\theta)$:
$$E \sup_{\theta \in \Theta} \exp\left( \varepsilon \tilde{T}_n(\theta) \right) \le C \exp\left( \frac{(c_2^{-1}\mu + c_1)^2 \varepsilon^2}{n} \right).$$
Applying Markov's inequality, for all $\lambda^* > 0$ and $\varepsilon > 0$, we have:
$$P\left( \sup_{\theta \in \Theta} \bar{T}_n(\theta) \ge \lambda^* \right) \le \frac{E \exp\left( \varepsilon \sup_{\theta \in \Theta} \bar{T}_n(\theta) \right)}{\exp(\varepsilon \lambda^*)} \le C \exp\left( -\varepsilon \lambda^* + (c_2^{-1}\mu + c_1)^2 \varepsilon^2 / n \right), \qquad (19)$$
$$P\left( \sup_{\theta \in \Theta} \tilde{T}_n(\theta) \ge \lambda^* \right) \le \frac{E \exp\left( \varepsilon \sup_{\theta \in \Theta} \tilde{T}_n(\theta) \right)}{\exp(\varepsilon \lambda^*)} \le C \exp\left( -\varepsilon \lambda^* + (c_2^{-1}\mu + c_1)^2 \varepsilon^2 / n \right). \qquad (20)$$
Taking $\varepsilon = \frac{n \lambda^*}{2 (c_2^{-1}\mu + c_1)^2}$ in Equations (19) and (20), we obtain:
$$P\left( \sup_{\theta \in \Theta} \bar{T}_n(\theta) \ge \lambda^* \right) \le C \exp\left( -\frac{\lambda^{*2} n}{4 (c_2^{-1}\mu + c_1)^2} \right),$$
$$P\left( \sup_{\theta \in \Theta} \tilde{T}_n(\theta) \ge \lambda^* \right) \le C \exp\left( -\frac{\lambda^{*2} n}{4 (c_2^{-1}\mu + c_1)^2} \right).$$
Note that $\sup_{\theta \in \Theta} T_n(\theta) = \sup_{\theta \in \Theta} \left( \bar{T}_n(\theta) - \tilde{T}_n(\theta) \right) \le \sup_{\theta \in \Theta} \bar{T}_n(\theta) + \sup_{\theta \in \Theta} \tilde{T}_n(\theta)$ (the latter by the symmetry of $\Theta$); hence:
$$P\left( \sup_{\theta \in \Theta} T_n(\theta) \ge \lambda^* \right) \le P\left( \sup_{\theta \in \Theta} \bar{T}_n(\theta) \ge \lambda^*/2 \right) + P\left( \sup_{\theta \in \Theta} \tilde{T}_n(\theta) \ge \lambda^*/2 \right) \le C \exp\left( -\frac{\lambda^{*2} n}{8 (c_2^{-1}\mu + c_1)^2} \right).$$
This is the result of Equation (15) and thus concludes the proof of Theorem 1. □

Author Contributions

R.Y. compiled the manuscript with support from X.L. and Y.Y. All authors read and approved the final manuscript.

Funding

Xinsheng Liu was supported by the National Natural Science Foundation of China (No. 61374183). Renyu Ye was supported by the Key Project Foundation of Natural Science Research in Universities of Anhui Province in China (No. KJ2019A0557) and the Natural Science Foundation of Anhui Province of China (No. 1908085MA01). Yuncai Yu was supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX19_0149).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Efromovich, S. Density estimation for biased data. Ann. Statist. 2004, 32, 1137–1161. [Google Scholar] [CrossRef] [Green Version]
  2. Ramirez, P.; Vidakovic, B. Wavelet density estimation for stratified size-biased sample. J. Stat. Plan. Inference 2010, 140, 419–432. [Google Scholar] [CrossRef]
  3. Chesneau, C. Wavelet block thresholding for density estimation in the presence of bias. J. Korean Stat. Soc. 2010, 39, 43–53. [Google Scholar] [CrossRef] [Green Version]
  4. Guo, H.; Kou, J. Pointwise density estimation for biased sample. J. Comput. Appl. Math. 2019, 361, 444–458. [Google Scholar] [CrossRef]
  5. Alam, K.; Saxena, K.L. Positive dependence in multivariate distributions. Commun. Stat. Theory Method. 1981, 10, 1183–1196. [Google Scholar]
  6. Joag-Dev, K.; Proschan, F. Negative association of random variables with applications. Ann. Stat. 1983, 11, 286–295. [Google Scholar] [CrossRef]
  7. Liang, H.Y. Complete convergence for weighted sums of negatively associated random variables. Statist. Probab. Lett. 2000, 48, 317–325. [Google Scholar] [CrossRef]
  8. Matula, P. A note on the almost sure convergence of sums of negatively dependent random variables. Statist. Probab. Lett. 1992, 5, 209–212. [Google Scholar] [CrossRef]
  9. Roussas, G.G. Asymptotic normality of random fields of positively or negatively associated processes. J. Multivariate Anal. 1994, 50, 152–173. [Google Scholar] [CrossRef] [Green Version]
  10. Chen, P.; Sung, S.H. On the strong convergence for weighted sums of negatively associated random variables. Statist. Probab. Lett. 2014, 92, 45–52. [Google Scholar] [CrossRef]
  11. Miao, Y.; Xu, W.; Chen, S. Some limit theorems for negatively associated random variables. Pro. Math. Sci. 2014, 124, 447–456. [Google Scholar] [CrossRef]
  12. Wu, Y. On complete moment convergence for arrays of rowwise negatively associated random variables. RACSAM 2014, 108, 669–681. [Google Scholar] [CrossRef]
  13. Chesneau, C.; Dewan, I.; Doosti, H. Wavelet linear density estimation for associated stratified size-biased sample. J. Nonparametr. Stat. 2012, 2, 429–445. [Google Scholar] [CrossRef]
  14. Liu, Y.M.; Xu, J.L. Wavelet density estimation for negatively associated stratified size-biased sample. J. Nonparametr. Stat. 2014, 26, 537–554. [Google Scholar] [CrossRef]
  15. Guo, H.J.; Kou, J.K. Pointwise density estimation based on negatively associated data. J. Inequal. Appl. 2019, 206, 1–16. [Google Scholar] [CrossRef]
  16. Cai, T.; Chicken, E. Block thresholding for density estimation: Local and global adaptivity. J. Multivariate Anal. 2005, 95, 76–106. [Google Scholar]
  17. Brown, L.; Cai, T.; Zhang, R.; Zhao, L. The root-unroot algorithm for density estimation as implemented via wavelet block thresholding. Probab. Theory Related Fields 2010, 146, 401–433. [Google Scholar] [CrossRef] [Green Version]
  18. Chesneau, C. Wavelet estimation via block thresholding: A minimax study under the Lp risk. Statist. Sinica. 2008, 18, 1007–1024. [Google Scholar]
  19. Donoho, D.L.; Johnstone, I.M. Minimax estimation via wavelet shrinkage. Ann. Stat. 1998, 26, 879–921. [Google Scholar] [CrossRef]
  20. Shao, Q.M. A comparison theorem on maximum inequalities between negatively associated and independent random variables. J. Theoret. Probab. 2000, 13, 343–356. [Google Scholar] [CrossRef]
Figure 1. Recovery of the density function for NA samples in Example 1 ($n = 2048$).
Figure 2. Recovery of the density function for NA samples in Example 2 ($n = 2048$).
Figure 3. Recovery of the density function for NA samples in Example 3 ($n = 2048$).
Table 1. The MSEs of the block and nonlinear wavelet estimators.

n    | Example 1          | Example 2          | Example 3
     | Block    Nonlinear | Block    Nonlinear | Block    Nonlinear
512  | 0.0047   0.0072    | 0.0064   0.0102    | 0.0067   0.0114
1024 | 0.0043   0.0065    | 0.0056   0.0098    | 0.0054   0.0101
2048 | 0.0039   0.0058    | 0.0053   0.0096    | 0.0048   0.0093

Ye, R.; Liu, X.; Yu, Y. Pointwise Optimality of Wavelet Density Estimation for Negatively Associated Biased Sample. Mathematics 2020, 8, 176. https://doi.org/10.3390/math8020176