Abstract
Two-sided and one-sided empirical Bayes test (EBT) rules for the parameter of a generalized exponential distribution with contaminated data (errors in variables) are constructed by a deconvolution kernel method. For supersmooth error distributions, and for supersmooth errors whose level can be controlled, the proposed EBT rules are shown to be asymptotically optimal uniformly over a class of prior distributions, and uniform rates of convergence of the corresponding regret are obtained under suitable conditions. An example shows that the assumptions and conditions of the main results are easily verified by direct calculation.
1. Introduction
The empirical Bayes test (EBT) has received considerable attention in the recent literature. EBT rules for the parameters of some common distributions are investigated in [1,2,3,4]. For the case of non-identical components, empirical Bayes testing for a guaranteed lifetime is considered for the two-parameter exponential distribution [5]. The merging of Bayesian and empirical Bayes posterior distributions in total variation is discussed in [6]. A double empirical Bayes decision procedure is obtained for multi-experiment studies by means of an empirical Bayes analytical method [7]. To study the relationship between empirical Bayes posterior distributions and false discovery rate control, a spike and slab empirical Bayes multiple testing procedure is constructed in [8]. An empirical Bayes multiple testing procedure for the sparse sequence model is investigated in [9]. Earlier, the EBT for the continuous one-parameter exponential family was studied extensively, and asymptotic optimality and optimal convergence rates of the EBT were obtained [10,11,12,13,14]. Most of these studies treat the empirical Bayes decision problem for non-contaminated (pure) data. In practical applications, however, contaminated data (errors in variables) arise in many fields and have been widely studied [15,16,17,18,19]. More recently, the one-sided empirical Bayes decision problem was investigated for the continuous one-parameter exponential family with contaminated data [20].
Suppose that the random variable X has the generalized exponential distribution (GED) with probability density function (PDF) of the following forms [21]
where is an unknown parameter with the natural parameter space . In this article, we assume that is a known constant and that the sample space is .
The GED is also called the linear exponential distribution. It is a combinatorial distribution, and the exponential and Rayleigh distributions are special cases of the GED when and , respectively. In linear exponential models the hazard function of the GED is a linear function of time (age), which makes it a reasonable model for lifetime distributions of random phenomena. Progressively type-II censored competing risks data with linear exponential lifetimes are studied in [21]. Recurrence relations for single and product moments of generalized order statistics from the linear exponential distribution are derived in [22]. The linear exponential distribution has also been used in reliability and life testing; see, for example, Broadbent [23] and Bain [24].
In this article, two-sided and one-sided EBT rules are constructed for the GED with contaminated data, in both cases by means of the deconvolution kernel method. In an errors-in-variables model, the deconvolution kernel method removes the effect of the additive noise on kernel density estimation. Furthermore, for supersmooth error distributions, and for supersmooth errors whose level can be controlled, asymptotic optimality and uniform rates of convergence are obtained under suitable conditions.
In practical problems, measurement errors often arise from the observation conditions, so the analysis of contaminated data is very important. For the pair of random variables , assume that has a prior distribution , and that X is a one-dimensional real random variable with a marginal density function when is given; X is not directly observable. We observe only Y, where Y = X + ε and ε is the random error. Suppose that ε follows a known distribution on , and is independent of .
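To make the errors-in-variables setting concrete, the following minimal Python sketch simulates contaminated observations. It assumes the additive model Y = X + ε with a normal error, and it assumes a linear exponential law with survival function exp(-(a x + b x²/2)), whose hazard a + b x is linear in x. The names `a`, `b`, `sigma` and both function names are hypothetical and are not meant to reproduce the parametrization used in (1).

```python
import numpy as np


def sample_linear_exponential(a, b, size, rng):
    """Draw from a law with survival S(x) = exp(-(a*x + b*x**2/2)), x >= 0.

    Inverse-transform sampling: solve a*x + b*x**2/2 = E with E ~ Exp(1).
    Assumes a >= 0, b >= 0 and a + b > 0 (illustrative parametrization).
    """
    e = rng.exponential(size=size)
    if b == 0.0:
        return e / a  # plain exponential case
    return (-a + np.sqrt(a * a + 2.0 * b * e)) / b


def contaminate(x, sigma, rng):
    """Return Y = X + eps with eps ~ N(0, sigma**2), independent of X."""
    return x + sigma * rng.standard_normal(x.shape)


rng = np.random.default_rng(0)
x = sample_linear_exponential(a=1.0, b=2.0, size=1000, rng=rng)  # unobserved
y = contaminate(x, sigma=0.5, rng=rng)                           # observed
```

Setting `b = 0` gives exponential draws and `a = 0` gives Rayleigh-type draws, matching the two special cases mentioned for the linear exponential family. Only `y` would be available to the statistician in the setting considered here.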
Firstly, we consider the two-sided test problem as follows
where and are known constants.
Define , , then (2) is equivalent to
For hypothesis test (3), we adopt the following 0–1 weighted square loss function
where is a constant, and is the decision space, indicates accepting , indicates rejecting .
When , then we obtain ; when then we have .
Let the parameter be distributed according to an unknown prior , and assume that belongs to the following class of distributions
where denotes the m-th order derivative of , which is the marginal density of X, is an integer, and is a constant.
We define the randomized decision rule for hypothesis test (3) as follows
Then, the Bayes risk of is given by
where
with
and denotes the density of Y given , i.e., .
Let , and
thus, we have
From (7), we obtain
where is the marginal PDF of random variable X, and denote the first order and the second order derivative of , respectively.
A test is called a Bayes test with respect to if
Since is unknown, is not available in practice, which leads us to the empirical Bayes approach described below.
The rest of this article is organized as follows. In Section 2, the two-sided EBT rule for the GED with contaminated data is proposed. Section 3 gives the asymptotic properties and the uniform convergence rate of the two-sided EBT rule, and the main results for the two-sided EBT are proved in Section 4. Section 5 investigates the one-sided EBT rule for the GED with contaminated data, and an example study is presented in Section 6.
2. The Proposed Two-Sided EBT Rule of GED with Contaminated Data
In the empirical Bayes framework, the following assumptions are standard. Let , , ⋯, , and be independent pairs of random variables; the parameters () and have a common prior distribution ; () and Y share the same marginal distribution with density function ; , ⋯, denote the historical samples, and Y is called the present sample.
Deconvolution is an important problem. It arises when modeling unobservable data and when estimating conditional moments that are useful in likelihood calculations. In non-parametric estimation of priors and in measurement error models, the sample data are noisy because of the measurement error, and the deconvolution kernel method is adopted to remove the effect of the additive noise on kernel density estimation. To obtain the empirical Bayes decision, we employ the deconvolution kernel method of Fan [17,18,19].
Let and be the characteristic function (c.f.) of Y and , respectively. Note that
Thus, a deconvoluted kernel density estimation of is defined by
where as and is called the empirical c.f. of random variable Y. Note that and .
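A deconvolution kernel density estimator of this type can be sketched numerically as follows. The sketch assumes the standard Fan-type form, in which the estimate at x equals (1/2π) times the integral over t of exp(-itx) φ_K(t h) φ̂_n(t) / φ_ε(t), where φ̂_n is the empirical c.f. of the observations. The N(0, σ²) error, the kernel transform φ_K(t) = (1 - t²)³ on [-1, 1], the function name `deconv_kde`, and all argument names are illustrative assumptions rather than the specification used in (13).

```python
import numpy as np


def deconv_kde(x_grid, y, h, sigma, n_t=801):
    """Deconvolution kernel density estimate of the density of X from Y = X + eps.

    Assumes eps ~ N(0, sigma**2) and kernel transform phi_K(t) = (1 - t**2)**3
    on [-1, 1], so the Fourier integral runs over |t| <= 1/h only.
    """
    x_grid = np.asarray(x_grid, dtype=float)
    y = np.asarray(y, dtype=float)
    t = np.linspace(-1.0 / h, 1.0 / h, n_t)
    dt = t[1] - t[0]
    phi_k = (1.0 - (t * h) ** 2) ** 3               # kernel transform at t*h
    phi_eps = np.exp(-0.5 * (sigma * t) ** 2)       # c.f. of N(0, sigma^2)
    ecf = np.exp(1j * np.outer(t, y)).mean(axis=1)  # empirical c.f. of Y
    weights = phi_k * ecf / phi_eps
    integrand = np.exp(-1j * np.outer(x_grid, t)) * weights
    # Riemann sum; the integrand vanishes at both endpoints of the t-range.
    return np.real(integrand.sum(axis=1)) * dt / (2.0 * np.pi)


rng = np.random.default_rng(1)
x = rng.standard_normal(500)              # latent sample (illustrative only)
y = x + 0.3 * rng.standard_normal(500)    # contaminated observations
grid = np.linspace(-8.0, 8.0, 401)
f_hat = deconv_kde(grid, y, h=0.5, sigma=0.3)
```

Dividing by the error c.f. amplifies high frequencies, which is why a kernel with compactly supported transform is used here: the integrand stays bounded for any fixed bandwidth. The estimate need not be non-negative everywhere, which is a known feature of deconvolution kernels rather than a defect of this sketch.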
We define an estimator of the of the random variable X by
where is the kernel density estimator given by (13), and is a sequence of constants.
Hence, we define an estimator of the as follows
Furthermore, an empirical Bayes test rule is defined as
In the following, let E be the expectation with respect to the joint distribution of . Then, the overall Bayes risk of would be
By the definition, for any , if , where , then is called asymptotically optimal uniformly with uniform convergence rate .
3. Asymptotic Properties of Two-Sided EBT Rule
In this section, the asymptotic properties of are investigated. Some assumptions on the kernel function and the error variable are given below.
- (A1)
- The is a symmetric function about zero on and satisfies, for and for some integer ,
- (A2)
- is a symmetric function, having bounded integral derivatives on ,
- (A3)
- ,
- (A4)
- The characteristic function of satisfies for any t,
- (A5)
- uniformly in y, and
- (A6)
- for some , andand, where is given by (9).
The following theorem on the two-sided EBT establishes the rate of convergence of the regret , where and are given by (11) and (18), respectively.
Theorem 1.
For some integer and constants , ϑ is defined by (4). Suppose that and are such that (A1)–(A6) hold, and the following conditions are satisfied:
- (B1)
- for,
- (B2)
- as for some positive constants , , and a constant .
Then, by choosing the bandwidth , we obtain
Remark 1.
If the characteristic function of a random variable ε satisfies condition (B2) of Theorem 1, then the distribution of ε is called supersmooth of order β. Common examples of supersmooth distributions include the normal, Cauchy, and mixture normal distributions. In practice, the conditions of Theorem 1 are easy to verify. Theorem 1 shows that the rate of convergence of the EBT is very slow for very common error distributions, such as the normal. Fan [17,18] pointed out that a supersmooth error distribution results in a worse convergence rate than an ordinary smooth one.
The optimal rate of convergence for Gaussian deconvolution is thus extremely slow. Since the normal distribution is frequently used in applications, we need to study how large a noise level is acceptable. We therefore consider the following model, in which the data are independent and identically distributed samples from
where , parameterizes the noise level.
Theorem 2.
For some integer and constants , ϑ is defined by (4). Suppose that and are such that (A1)–(A6) hold with . Then, let and by choosing the bandwidth , we have
Remark 2.
Although all the data are contaminated with supersmooth errors, the result of Theorem 2 can be as good as that in the uncontaminated data case. Suppose that all the data are contaminated with supersmooth errors while the error level can be controlled, namely, . Fan [19] considered model (20). Theorem 2 indicates that the convergence rate is also very slow. The result of Lemma 3 below is as good as that for ordinary smooth error distributions, whereas the result of Lemma 4 below leads to a worse convergence rate of the empirical Bayes estimator.
4. Proofs
In this section, we first state some lemmas needed to prove the main results of this paper. Lemmas 1 and 2 are due to Fan [17,18], and Lemmas 3 and 4 are due to Fan [19]. The proof of Lemma 4 can be found in Johns and Van Ryzin [10]. Since the proofs of Theorems 1 and 2 are similar, only Theorem 1 is proved in detail. In the following, always stand for some positive constants, which may differ from one occurrence to another even when the same notation is used.
Lemma 1.
Lemma 2.
Lemma 3.
Lemma 4.
Let be given by (15), and suppose that and are bounded. Let satisfy (A1)–(A4) with , and ; then we have
Lemma 5.
Proof of Theorem 1.
By Lemma 5 and by the Markov inequality, for any ,
By applying the -inequality followed by Lyapunov’s inequality and using Fubini’s Theorem, we obtain
From Lemmas 1 and 2, by the assumption conditions of Theorem 1, we have
This completes the proof of Theorem 1. □
Proof of Theorem 2.
The proof is similar to that of Theorem 1, except that Lemmas 3 and 4 are used in place of Lemmas 1 and 2, respectively. □
5. One-Sided EBT Rule and Its Asymptotic Properties
In this section, we study the one-sided EBT for the parameter of the GED with contaminated data. Consider the problem of testing the hypotheses versus , where is a known positive constant. We use the following linear loss function for this testing problem
where a is a positive constant and is the action space, indicates accepting , indicates rejecting , is the indicator of the set A.
As above, assume that X is not directly observable and that, because of measurement error or the nature of the environment, we can only observe , where the error variable has a known distribution on . It is assumed that and are independent, and that the parameter is a realization of a random variable having an unknown prior distribution over the natural parameter space . Let the randomized decision rule for the preceding testing problem be . For the one-sided test, we assume that belongs to the following class of distributions
where denotes the m-th order derivative of , which is the marginal density of X, is an integer, and is a constant.
Let denote the Bayes risk of the test when G is the prior distribution; it can be expressed as
where
From (35), we obtain
where .
Therefore, the Bayes test can be presented as
The Bayes risk of is
Thus, we define the estimator of as
Furthermore, the one-sided empirical Bayes test rule is defined by
Then, the overall Bayes risk of would be
It should be noted that Lemmas 1–4 still hold over the new class of prior distributions for the one-sided EB decision problem. Hence, by Lemmas 1–5, the theorems below establish the rates of convergence of the regret , where and are given by (38) and (41), respectively. For the one-sided EBT, we assume that the following conditions are satisfied:
- (C1)
- uniformly in y, and
- (C2)
- for some , and
, where is given by (36).
Theorem 3.
For any , let be defined by (33), suppose that , such that (A1)–(A4) and (B1)–(B2) of Theorem 1 hold and satisfying conditions (C1) and (C2). Then, by choosing the bandwidth , we obtain
Theorem 4.
For any and some integer , let be defined by (33), suppose that and are such that (A1)–(A4) hold with , and satisfying conditions (C1) and (C2). Then, let and by choosing the bandwidth , we have
Remark 3.
For the one-sided EBT, as in Theorem 1, Theorem 3 considers a random variable ε with a supersmooth distribution, that is, one whose characteristic function satisfies condition (B2). When all the data are contaminated but the error level can be controlled, as in model (20), Theorem 4 gives the rate of convergence of the one-sided EBT by means of Lemmas 1–5; this result can be as good as that in the uncontaminated data case.
Proof of Theorem 3.
By Lemma 5 and by the Markov inequality, for ,
By applying the -inequality followed by Lyapunov’s inequality and using Fubini’s Theorem, we obtain
From Lemmas 1 and 2, by the assumption conditions of Theorem 3, we have
This completes the proof of Theorem 3. □
Proof of Theorem 4.
The proof is similar to that of Theorem 3, except that Lemmas 3 and 4 are used in place of Lemmas 1 and 2, respectively. □
6. An Example Study
In this section, an example is presented to verify that a GED and a prior distribution satisfying the theorems of this paper exist. Suppose that the probability density function of the random variable X is as follows
where is a given parameter, the sample space is , and the parameter space is . Let the prior distribution of the parameter be
where r is a positive known parameter and is a positive unknown parameter. By calculating we obtain
where .
Obviously, exists, and , where is a polynomial in x and . Since , is bounded on , where is an integer. Thus, and are satisfied.
Let the supersmooth error distribution be N(0,1). It is easy to check that satisfies condition (B2) of Theorem 1. Moreover, we can take .
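The claim about the N(0,1) error can be checked numerically. The sketch below assumes that condition (B2) has the usual supersmooth form of Fan [17], in which the modulus of the error c.f. behaves like a power of |t| times exp(-|t|^β/γ) for large |t|. Under that assumption, the standard normal c.f. exp(-t²/2) corresponds to β = 2 and γ = 2 with power exponent zero. The function name, the grid sizes, and the variable names are hypothetical.

```python
import numpy as np


def normal_cf_numeric(t, half_width=12.0, n_x=4801):
    """Approximate E[exp(i*t*eps)] for eps ~ N(0, 1) by a Riemann sum."""
    x = np.linspace(-half_width, half_width, n_x)
    dx = x[1] - x[0]
    density = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
    # The imaginary part vanishes by symmetry, so only the cosine term is kept.
    return np.sum(np.cos(np.outer(t, x)) * density, axis=1) * dx


t = np.linspace(0.0, 4.0, 9)
phi = normal_cf_numeric(t)
beta, gamma = 2.0, 2.0                                   # assumed (B2) constants
ratio = np.abs(phi) * np.exp(np.abs(t) ** beta / gamma)  # should stay near 1
```

If the ratio stays bounded away from zero and infinity as |t| grows, the two-sided bound of the assumed supersmooth form holds with the chosen β and γ; for the standard normal it is identically one up to numerical error.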
For the two-sided EBT case, we use the following kernel function
where , and the Fourier transform of the above kernel is
Then, the deconvolution kernel density estimators (14) are the following kernels, for the type of supersmooth error distribution case,
Similar to [20], it is easily shown that assumptions and conditions of Theorems 1 and 2 are satisfied with the above specifications.
For one-sided EBT case, we choose , then the Fourier transform of is a second order kernel as follows
The corresponding deconvolution kernel density estimators (14) are kernel in the following
Then, as in [20], it is easily shown that the assumptions and conditions of Theorems 3 and 4 are satisfied with the above specifications.
In fact, we can take when , and this choice suits both the one-sided and the two-sided EBT. However, if , the second order kernel (53) satisfies only the kernel conditions of Theorems 3 and 4.
7. Conclusions
In this paper, we studied the empirical Bayes decision problem for the parameter of a generalized exponential distribution with contaminated data. Two-sided and one-sided empirical Bayes test rules were constructed by a deconvolution kernel method. For supersmooth error distributions, asymptotic optimality uniformly over a class of prior distributions and uniform rates of convergence of the corresponding regret were obtained for the proposed EBT rules under the conditions of Theorems 1 and 3. Furthermore, we investigated supersmooth errors whose level can be controlled, , where , parameterizes the noise level, that is, , and obtained Theorems 2 and 4. As an example, taking the supersmooth error distribution to be N(0,1), we showed that the assumptions and conditions of the main results are easily verified by direct calculation.
In many practical problems, not all observations are contaminated; the data may be only partially contaminated. Suppose that only of the data are measured with error and the remaining data are error free. We consider the model , taking and , where is an error variable with distribution and characteristic function . Thus, the characteristic function of is denoted by . Extending the current work to this situation is an interesting topic for future research.
Author Contributions
Conceptualization, J.C.; Data curation, H.Q.; Formal analysis, H.Q.; Funding acquisition, J.C.; Investigation, H.Q.; Methodology, H.Q.; Project administration, Z.Y. and Y.H.; Supervision, J.C.; visualization, Z.Y.; Validation, Y.H.; Writing—original draft, H.Q.; Writing—review and editing, H.Q. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by the National Natural Science Foundation of China under Grant 81671633 to J. Chen.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
All data in this paper have been presented in the manuscript.
Acknowledgments
The authors thank the reviewers for their positive feedback, valuable comments, and constructive suggestions, which helped improve the quality of this article, and the editors for their help and coordination in the publication of this paper.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Karunamuni, R.; Li, J.; Wu, J. Robust empirical Bayes tests for continuous distributions. J. Stat. Plan. Inference 2010, 140, 268–282.
- Chen, L.S.; Yang, M.C. Empirical Bayes testing for equivalence. J. Stat. Plan. Inference 2011, 141, 2670–2681.
- Yuan, M.; Wei, L. Two-sided empirical Bayes test for location parameter in the gamma distribution. Commun. Stat.-Theory Methods 2017, 46, 4215–4225.
- Yuan, M.; Zhang, Q.; Wei, L.S. One-sided empirical Bayes test for location parameter in Gamma distribution. Appl. Math. J. Chin. Univ. 2018, 33, 287–297.
- Chen, L.S. Empirical Bayes testing for guarantee lifetime: Non identical components case. Commun. Stat.-Theory Methods 2017, 46, 683–705.
- Petrone, S.; Rousseau, J.; Scricciolo, C. Bayes and empirical Bayes: Do they merge? Biometrika 2014, 101, 285–302.
- Tansey, W.; Wang, Y.; Rabadan, R.; Blei, D. Double empirical Bayes testing. Int. Stat. Rev. 2020, 88, S91–S113.
- Castillo, I.; Roquain, É. On spike and slab empirical Bayes multiple testing. Ann. Statist. 2020, 48, 2548–2574.
- Abraham, K.; Castillo, I.; Roquain, É. Empirical Bayes cumulative ℓ-value multiple testing procedure for sparse sequences. Electron. J. Stat. 2022, 16, 2033–2081.
- Johns, M., Jr.; Van Ryzin, J. Convergence rates for empirical Bayes two-action problems II. Continuous case. Ann. Math. Stat. 1972, 43, 934–947.
- Singh, R.S.; Laisheng, W. Nonparametric empirical Bayes procedures, asymptotic optimality and rates of convergence for two-tail tests in exponential family. J. Nonparametric Stat. 2000, 475–501.
- Pensky, M. Rates of convergence of empirical Bayes tests for a normal mean. J. Stat. Plan. Inference 2003, 111, 181–196.
- Liang, T.C. On optimal convergence rate of empirical Bayes tests. Stat. Probab. Lett. 2004, 68, 189–198.
- Gupta, S.S.; Li, J. On empirical Bayes procedures for selecting good populations in a positive exponential family. J. Stat. Plan. Inference 2005, 129, 3–18.
- Carroll, R.J.; Hall, P. Optimal rates of convergence for deconvolving a density. J. Am. Stat. Assoc. 1988, 83, 1184–1186.
- Stefanski, L.A. Rates of convergence of some estimators in a class of deconvolution problems. Stat. Probab. Lett. 1990, 9, 229–235.
- Fan, J. On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Stat. 1991, 19, 1257–1272.
- Fan, J. Global behavior of deconvolution kernel estimates. Stat. Sin. 1991, 1, 541–551.
- Fan, J. Deconvolution with supersmooth distributions. Can. J. Stat. 1992, 20, 155–169.
- Karunamuni, R.J.; Zhang, S. Empirical Bayes two-action problem for the continuous one-parameter exponential family with errors in variables. J. Stat. Plan. Inference 2003, 113, 437–449.
- Davies, K.F.; Volterman, W. Progressively Type-II censored competing risks data from the linear exponential distribution. Commun. Stat.-Theory Methods 2022, 51, 1444–1460.
- Ahmad, A.E.B.A. Single and product moments of generalized order statistics from linear exponential distribution. Commun. Stat.-Theory Methods 2008, 37, 1162–1172.
- Broadbent, S. Simple mortality rates. Appl. Stat. 1958, 7, 86–95.
- Bain, L.J. Analysis for the linear failure-rate life-testing distribution. Technometrics 1974, 16, 551–559.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).