Abstract
Second-order Chebyshev–Edgeworth expansions are derived for various statistics from samples with random sample sizes, where the asymptotic laws are scale mixtures of the standard normal or chi-square distributions with scale mixing gamma or inverse exponential distributions. A formal construction of asymptotic expansions is developed. Therefore, the results can be applied to a whole family of asymptotically normal or chi-square statistics. The random mean, the normalized Student t-distribution and the Student t-statistic under non-normality with the normal limit law are considered. With the chi-square limit distribution, Hotelling’s generalized statistics and scale mixture of chi-square distributions are used. We present the first Chebyshev–Edgeworth expansions for asymptotically chi-square statistics based on samples with random sample sizes. The statistics allow non-random, random, and mixed normalization factors. Depending on the type of normalization, we can find three different limit distributions for each of the statistics considered. Limit laws are Student t-, standard normal, inverse Pareto, generalized gamma, Laplace and generalized Laplace as well as weighted sums of generalized gamma distributions. The paper continues the authors’ studies on the approximation of statistics for randomly sized samples.
Keywords:
second-order expansions; random sample size; asymptotically normal statistics; asymptotically chi-square statistics; Student’s t-distribution; normal distribution; inverse Pareto distribution; Laplace and generalized Laplace distribution; weighted sums of generalized gamma distributions
MSC:
62E17 (Primary) 62H10; 60E05 (Secondary)
1. Introduction
In classical statistical inference, the number of observations is usually known. If observations are collected in a fixed time span or we lack observations, the sample size may be a realization of a random variable. The number of failed devices in the warranty period, the number of new infections each week in a flu season, the number of daily customers in a supermarket or the number of traffic accidents per year are all random numbers.
Interest in studying samples with a random number of observations has grown steadily over the past few years. In medical research, the authors of [1,2,3] examine ANOVA models with unknown sample sizes for the analysis of fixed one-way effects in order to avoid false rejection. Applications of orthogonal mixed models to situations with samples of a random number of observations of a Poisson or binomial distributed random variable are presented. Based on a random number of observations, the authors of [4], Al-Mutairi and Raqab [5] and Barakat et al. [6] examined the mean with known and unknown variances and the variance in the normal model, confidence intervals for quantiles and prediction intervals for future observations of generalized order statistics. An overview of statistical inference for samples with random sample sizes and some applications is given in [4]; see also the references therein.
When the non-random sample size is replaced by a random variable, the asymptotic features of statistics can change radically, as shown by Gnedenko [7]. The monograph by Gnedenko and Korolev [8] deals with limit distributions for randomly indexed sequences and their applications.
General transfer theorems for asymptotic expansions of the distribution function of statistics based on samples with non-random sample sizes to their analogues for samples of random sizes are proven in [9,10]. In these papers, rates of convergence and first-order expansions are proved for asymptotically normal statistics. The results depend on the rates of convergence with which the distributions of the normalized random sample sizes approach the corresponding limit distribution.
The difficulty of obtaining second-order expansions for the normalized random sample sizes beyond the rates of convergence was overcome by Christoph et al. [11]. Second-order expansions were proved by the authors of [11,12] for the random mean and the median of samples with random sample sizes and by the authors of [13,14] for the three geometric statistics of Gaussian vectors, namely the length of a vector, the distance between two vectors and the angle between two vectors associated with their correlation coefficient, when the dimension of the vectors is random.
The classical Chebyshev–Edgeworth expansions strongly influenced the development of asymptotic statistics. The fruitful interactions between Chebyshev–Edgeworth expansions and bootstrap methods are demonstrated in [15]. Detailed reviews of applications of Chebyshev–Edgeworth expansions in statistics were given by, e.g., Bickel [16] and Kolassa [17]. If the arithmetic mean of independent random variables is considered as the statistic, only the expected value and the dispersion are taken into account in the central limit theorem or in the Berry–Esseen inequalities. Two further important characteristics of random variables, skewness and kurtosis, have great influence on second-order expansions, provided that the corresponding moments exist. The Cornish–Fisher inversion of the Chebyshev–Edgeworth expansion allows the approximation of the quantiles of the test statistics used, for example, in many hypothesis tests. In [11], Theorems 3 and 6, and [12], Corollaries 6.2 and 6.3, Cornish–Fisher expansions for the random mean and median from samples with random sample sizes are obtained. In the same way, Cornish–Fisher expansions for the quantiles of the statistics considered in the present paper can be derived from the corresponding Chebyshev–Edgeworth expansions.
In the present paper, we continue our research on approximations if the sample sizes are random. To the best of our knowledge, Chebyshev–Edgeworth-type expansions with asymptotically chi-square statistics have not yet been proven in the literature when the sample sizes are random.
The article is structured as follows. Section 2 describes statistical models with random numbers of observations, the assumptions about statistics and random sample sizes and transfer propositions from samples with non-random to random sample sizes. Section 3 presents statistics with non-random sample sizes with Chebyshev–Edgeworth expansions based on standard normal or chi-square distributions. Corresponding expansions of the negative binomial or discrete Pareto distributions as random sample sizes are considered in Section 4. Section 5 describes the influence of non-random, random or mixed normalization factors on the limit distributions of the examined statistics that are based on samples with random sample sizes. Besides the common Student’s t, normal and Laplace distributions, inverse Pareto, generalized gamma and generalized Laplace as well as weighted sums of generalized gamma distributions also occur as limit laws. The main results for statistic families with different normalization factors and examples are given in Section 6. To prove statements about a family of statistics, formal constructions for the expansions are worked out in Section 7, which are used in Section 8 to prove the theorems. Conclusions are drawn in Section 9. We leave four auxiliary lemmas to Appendix A.
2. Statistical Models with a Random Number of Observations
Let and be random variables defined on a common probability space . The random variables denote the observations and form the random sample with a non-random sample size . Let
be some statistic obtained from the sample . Consider now the sample . The random variable denotes the random size of the underlying sample, that is the random number of observations, depending on a parameter . We suppose for each that is independent of random variables and in probability as .
Let be a statistic obtained from a random sample defined as
2.1. Assumptions on Statistics and Random Sample Sizes
In further consideration, we restrict ourselves to only those terms in the expansions that are used below.
We assume that the following condition for the statistic with from a sample with non-random sample size is fulfilled:
Assumption 1.
There are differentiable functions for all distribution function and bounded functions , and real numbers , and so that for all integers
Remark 1.
In contrast to Bening et al. [10], the differentiability of , and is only required for . In the present article, in addition to the normal distribution, the chi-square distribution with p degrees of freedom is used as , which is not differentiable in if or .
The distribution functions of the normalized random variables satisfy the following condition:
Assumption 2.
A distribution function with , a function of bounded variation , a sequence and real numbers and exist so that for all integers
2.2. Transfer Proposition from Samples with Non-Random to Random Sample Sizes
Assumptions 1 and 2 allow the construction of expansions for distributions of normalized random-size statistics based on approximate results for fixed-size normalized statistics in (1) and for the random size in (2).
Proposition 1.
General transfer theorems with more terms are proved in [9,10] for .
Remark 2.
The following statement clarifies the problem.
Proposition 2.
Remark 3.
Remark 4.
Proof of Propositions 1 and 2:
The proof of Proposition 1 follows along arguments similar to those of the more general Transfer Theorem 3.1 in [10] for . The proof was also adapted by Christoph and Ulyanov [13] to negative . Therefore, Proposition 1 applies to .
The present Propositions 1 and 2 differ from Theorems 1 and 2 in [13] only by the additional term and the added condition (7) to estimate this additional term. Therefore, the details are omitted here. □
Remark 5.
In Appendix 2 of the monograph by Gnedenko and Korolev [8], asymptotic expansions for generalized Cox processes are proved (see Theorems A2.6.1–A2.6.3). As random sample size, the authors considered a Cox process controlled by a Poisson process (also known as a doubly stochastic Poisson process) and proved asymptotic expansions for the random sum , where are independent identically distributed random variables. For each , the random variables are independent. The above-mentioned theorems are close to Proposition 1. The structure of the functions in (4) and the bounds on the right-hand side of inequality (3) in Proposition 1 differ from the corresponding terms in Theorems A2.6.1–A2.6.3. Thus, the bounds contain little o-terms.
3. Chebyshev–Edgeworth Expansions Based on Standard Normal and Chi-Square Distributions
We consider two classes of statistics which are asymptotically normal or chi-square distributed.
3.1. Examples for Asymptotically Normally Distributed Statistics
Let be independent identically distributed random variables with
The random variable X is assumed to satisfy Cramér’s condition
Consider the asymptotically normal sample mean:
It follows from Petrov [18], Theorem 5.18 with , that
with C being independent of m and second order expansion
where and are standard normal distribution function and its density and are the Chebyshev–Hermite polynomials
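As a numerical illustration of such a second-order expansion, the following minimal Python sketch evaluates the classical two-term Chebyshev–Edgeworth approximation for the standardized sample mean. It assumes the textbook form with skewness and excess kurtosis multiplying the Chebyshev–Hermite polynomials H_2, H_3 and H_5; the displayed formula of this section is not reproduced here, so the exact arrangement of terms is an assumption. The function names `hermite`, `edgeworth_mean_cdf` and `exact_exponential_mean_cdf` and the parameters `skew` and `ex_kurt` are hypothetical and chosen only for this example.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def hermite(k, x):
    """Chebyshev-Hermite polynomial H_k(x) via H_{j+1} = x H_j - j H_{j-1}."""
    h_prev, h = 1.0, x
    if k == 0:
        return h_prev
    for j in range(1, k):
        h_prev, h = h, x * h - j * h_prev
    return h


def edgeworth_mean_cdf(x, m, skew, ex_kurt):
    """Two-term expansion for P(sqrt(m) * (mean - mu) / sigma <= x).

    Assumed classical form (not copied from the paper):
    Phi(x) - phi(x) * [skew*H2/(6 sqrt(m)) + (ex_kurt*H3/24 + skew^2*H5/72)/m].
    """
    first = skew * hermite(2, x) / (6.0 * math.sqrt(m))
    second = (ex_kurt * hermite(3, x) / 24.0
              + skew ** 2 * hermite(5, x) / 72.0) / m
    return Phi(x) - phi(x) * (first + second)


def exact_exponential_mean_cdf(x, m):
    """Exact value for standard exponential observations (sum is Gamma(m, 1))."""
    t = m + x * math.sqrt(m)
    if t <= 0.0:
        return 0.0
    partial = sum(t ** k / math.factorial(k) for k in range(m))
    return 1.0 - math.exp(-t) * partial


if __name__ == "__main__":
    # Exponential observations: skewness 2, excess kurtosis 6.
    x, m = 0.5, 10
    print("normal   :", Phi(x))
    print("Edgeworth:", edgeworth_mean_cdf(x, m, 2.0, 6.0))
    print("exact    :", exact_exponential_mean_cdf(x, m))
```

The exponential law is used for the comparison only because it satisfies Cramér’s condition and the exact distribution of its sample mean is available in closed form.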
Let the random variable be chi-square distributed with d degrees of freedom having distribution function and density function :
Next, we examine the scale-mixed normalized statistic , where Z and are independent random variables with the standard normal distribution and the chi-square distribution , respectively. Then, the statistic follows the Student’s t-distribution with m degrees of freedom. Example 2.1 in [19] indicates
Chebyshev–Edgeworth expansions of Student’s t-statistic under non-normality are well investigated, but only Hall [20] proved these under minimal moment conditions. Let conditions (12) and (13) be satisfied for independent identically distributed random variables . Define with sample mean and biased sample variance . It follows from Hall [20] that for Student’s t-statistic :
uniformly in x, where as ,
Remark 6.
The estimate (19) does not satisfy (1) in Assumption 1 because we do not have a computable constant C with for all , even if all parameters are given. The remainder in (19) meets the order condition as , but in the equivalent condition for all the values and are unknown. For non-asymptotic bounds and order conditions, see the work of Fujikoshi and Ulyanov [19] (Section 1.1).
In [21], an inequality for a first order approximation is proved:
where is required for arbitrary and is defined in (20).
3.2. Examples for Asymptotically Chi-Square Distributed Statistics
The baseline distribution of the second order expansions is now the chi-square distribution occurring as limit distribution in different multivariate tests (see [22], Chapters 5 and 8–10, [19,23]).
At first, we consider statistic , where and are random matrices independently distributed as Wishart distributions and , respectively, with identity operator in . Note that has Wishart distribution if and its density is
where (see [23], Chapter 2, for some basic properties).
Hotelling’s generalized distribution allows approximation
(see [24], Theorem 4.1), where
If is a scale mixture, where and are independent, allows asymptotic expansion
(see [25], Section 5), where now
Integration by parts gives . Moreover, for and . Then, it follows for both statistics in (22) and in (24) that
where the coefficients and are defined in (23) and (25).
The scaled mixture is considered in the works by Fujikoshi et al. [23] (Example 13.2.2) and Fujikoshi and Ulyanov [19] (Example 2.2). The estimation given there leads to a computable error bound:
4. Chebyshev–Edgeworth Expansions for Distributions of Normalized Random Sample Sizes
As in the articles by, e.g., Bening et al. [9,10], Christoph et al. [11,12] and Christoph and Ulyanov [13,14], we consider as random sample sizes the negative binomial random variable and the maximum of n independent discrete Pareto random variables where and are parameters.
“The negative binomial distribution is one of the two leading cases for count models, it accommodates the overdispersion typically observed in count data (which the Poisson model cannot)” [26]. Moreover, and tends to the gamma distribution with identical shape and rate parameters .
On the other hand, the mean for the discrete Pareto-like variable does not exist, yet tends to the inverse exponential distribution with scale parameter .
Remark 8.
The authors of [1,2,3,4,27], among others, considered the binomial or Poisson distributions as random number N of observations. If is binomial (with parameters n and ) or Poisson (with rate ) distributed, then tends to the distribution degenerate at 1 as . Therefore, Assumption 2 for the Transfer Proposition 1 is not fulfilled. On the other hand, since binomial or Poisson sample sizes are asymptotically normally distributed, if the statistic is also asymptotically normally distributed, then so is the statistic (see [28]). Chebyshev–Edgeworth expansions for lattice distributed random variables exist so far only with bounds of small-o or large-O order (see [29]). For (2) in Assumption 2, computable error bounds are required because the constant in (3) depends on (see also Remark 6 on large-O bounds and computable error bounds).
4.1. The Random Sample Size Has Negative Binomial Distribution with Success Probability
The sample size has a negative binomial distribution shifted by 1 with the parameters and , the probability mass function
and . Bening and Korolev [30] and Schluter and Trede [26] showed
where is the gamma distribution function with its density
In addition to the expansion of , a bound of the negative moment in (3) is required, where is rate of convergence of the Chebyshev–Edgeworth expansion for in (1).
Proposition 3.
Suppose that and the discrete random variables have probability mass function (28) with . Then,
for all , where the constant does not depend on n and
Moreover, the negative moments fulfill the estimate for all ,
and the convergence rate in the case cannot be improved.
Proof.
In [10] (Formula (21)) and in [31] (Formula (11)), the convergence rate is reported for the case . In [11] (Theorem 1), the Chebyshev–Edgeworth expansion for is proved. In the case , for geometric distributed random variable with success probability the proof is straightforward:
where and . Hence, (31) holds for .
In [12] (Corollary 4.2), leading terms for the negative moments of are derived, which lead to (34). □
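The geometric special case can be checked numerically. The sketch below is only an illustration under stated assumptions: the sample size N is geometric on {1, 2, ...} with success probability 1/n, it is normalized by n, and the limit law is the standard exponential distribution (the gamma law with shape and rate 1). The function name `sup_distance_geometric` and the truncation parameter `k_max_factor` are hypothetical.

```python
import math


def sup_distance_geometric(n, k_max_factor=20):
    """Sup-distance between P(N/n <= x) and 1 - exp(-x).

    Assumes N is geometric on {1, 2, ...} with success probability 1/n.
    The supremum of |F_n - G| for a step function F_n and a continuous
    G is attained at a jump point x = k/n, either at the jump or as a
    left limit, so only these points are inspected.
    """
    q = 1.0 - 1.0 / n
    worst = 0.0
    for k in range(1, k_max_factor * n + 1):
        limit = 1.0 - math.exp(-k / n)      # exponential limit at x = k/n
        at_jump = 1.0 - q ** k              # P(N <= k)
        before_jump = 1.0 - q ** (k - 1)    # P(N <= k - 1), left limit at k/n
        worst = max(worst, abs(at_jump - limit), abs(before_jump - limit))
    return worst


if __name__ == "__main__":
    for n in (10, 100, 1000):
        d = sup_distance_geometric(n)
        print(n, d, n * d)
```

In this sketch the product of n and the sup-distance stays close to 1, which is in line with a convergence rate of order 1/n that cannot be improved in the geometric case.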
4.2. The Random Sample Size Is the Maximum of n Independent Discrete Pareto Variables
We consider the continuous Pareto Type II (Lomax) distribution function
The discrete Pareto II distribution is obtained by discretizing the continuous Pareto distribution , : . The random variable is the discrete counterpart on the positive integers to the continuous random variable . Both random variables and have shape parameter 1 and scale parameter (see [32]). The discrete Pareto distributed has probability mass and distribution functions:
Let be a sequence of independent random variables with the common distribution function (35). Define
The random variable is extremely spread over the positive integers.
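A short numerical sketch can make the limit behaviour of this maximum concrete. It assumes that the discrete Pareto variable has distribution function k/(s + k) on the positive integers, so that the maximum of n independent copies has distribution function (k/(s + k))^n, and that the maximum normalized by n approaches the inverse exponential law exp(-s/x); these displayed formulas are not reproduced in this section, so they are assumptions of the example. The function names are hypothetical.

```python
import math


def max_pareto_cdf(k, n, s):
    """P(max of n iid discrete Pareto variables <= k).

    Assumes P(Y <= k) = k / (s + k) for k = 1, 2, ...
    """
    return (k / (s + k)) ** n


def inverse_exponential_cdf(x, s):
    """Distribution function exp(-s/x) of the inverse exponential law."""
    return math.exp(-s / x) if x > 0 else 0.0


def gap_at(x, n, s):
    """|P(N/n <= x) - exp(-s/x)| evaluated at a point x with n*x an integer."""
    k = round(n * x)
    return abs(max_pareto_cdf(k, n, s) - inverse_exponential_cdf(x, s))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, gap_at(1.0, n, 1.0), gap_at(2.0, n, 1.0))
```

The gap shrinks roughly like 1/n at the inspected points. The tail 1 - (k/(s + k))^n decays only like n s / k, which is why no mean exists.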
Proposition 4.
The Chebyshev–Edgeworth expansion (37) is proved in [11] (Theorem 4). In [12] (Corollary 5.2), leading terms for the negative moments are derived, which lead to (39).
Remark 10.
Let the random variable be exponentially distributed with rate parameter . Then, is an inverse exponentially distributed random variable with the continuous distribution function . Both and are heavy tailed with shape parameter 1.
Remark 11.
Since and for all , we choose as normalizing factor for in (37).
Remark 12.
Remark 13.
Lyamin [33] proved a bound for integers .
5. Limit Distributions of Statistics with Random Sample Sizes Using Different Scaling Factors
The statistic from a sample with non-random sample size fulfills condition (1) in Assumption 1. Instead of the non-random sample size m, we consider a random sample size satisfying condition (2) in Assumption 2. Let be a sequence with as . Consider the scaling factor by the statistics with if and or if and . Then, conditioning on and using (1) and (2), we have
If there exists a limit distribution of as , then it has to be a scale mixture of parent distribution and positive mixing parameter : (see, e.g., [23,34], Chapter 13, and [19] and the references therein).
Remark 14.
Formula (40) shows that different normalization factors at lead to different scale mixtures of the limit distribution of the normalized statistics .
5.1. The Case and
The statistics (15), (18) and (21) considered in Section 3.1 have normal approximations . The limit distribution for the normalized random sample size is the gamma distribution with density (30). We investigate the dependence of the limit distributions in as for .
(i) If , then the limit distribution is Student’s t-distribution having density
(ii) If , the standard normal law is the limit one with density .
(iii) For , the generalized Laplace distributions occur with density (see [13], Section 5.1.3):
where is the Macdonald function of order α, or modified Bessel function of the third kind with index . The function is also sometimes called a modified Bessel function of the second kind of order . For properties of these functions, see, e.g., Chapter 51 in [35] or the Appendix on Bessel functions in [36].
If , the so-called Sargan densities and their distribution functions are computable in closed forms (see Formulas (63)–(65) below in Section 7):
where for .
The double exponential or standard Laplace density is with variance 1 and distribution function given in (43). The Sargan distributions are therefore a generalisation of the standard Laplace distribution.
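The scale mixtures of this subsection can be verified numerically. The following sketch integrates the density of a normal scale mixture against a gamma mixing law with equal shape and rate r by a plain midpoint rule. It relies on two classical facts that are assumed to match the cases above: mixing over the precision gives Student’s t-distribution with 2r degrees of freedom, and mixing over the variance with r = 1 gives the Laplace density with variance 1. The function names and the parameter `power` are hypothetical.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gamma_pdf(y, r):
    """Density of the gamma law with shape r and rate r (mean 1)."""
    return r ** r / math.gamma(r) * y ** (r - 1.0) * math.exp(-r * y)


def mixture_density(x, r, power, upper=60.0, steps=60000):
    """Midpoint-rule value of  int_0^inf phi(x y^power) y^power g_{r,r}(y) dy.

    power = 0.5 mixes over the precision, power = -0.5 over the variance.
    Intended for x != 0 (for x = 0 and power = -0.5 the integrand has an
    integrable singularity at 0 and the midpoint rule is less accurate).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        scale = y ** power
        total += normal_pdf(x * scale) * scale * gamma_pdf(y, r)
    return total * h


def student_t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma(0.5 * (df + 1.0)) / (math.sqrt(df * math.pi) * math.gamma(0.5 * df))
    return c * (1.0 + x * x / df) ** (-0.5 * (df + 1.0))


def laplace_pdf(x):
    """Laplace density with variance 1."""
    return math.exp(-math.sqrt(2.0) * abs(x)) / math.sqrt(2.0)


if __name__ == "__main__":
    x, r = 1.0, 2.0
    print(mixture_density(x, r, 0.5), student_t_pdf(x, 2.0 * r))
    print(mixture_density(x, 1.0, -0.5), laplace_pdf(x))
```

The midpoint rule is chosen only to keep the sketch free of external dependencies; any standard quadrature routine would serve the same purpose.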
5.2. The Case and
The statistics considered in Section 3.2 asymptotically approach chi-square distribution . The limit distribution for the normalized random sample size is the inverse exponential distribution .
(i) If , then the generalized gamma distribution occurs with density :
where the Macdonald function already appears in Formula (42) with different and argument. For , where m is an integer, the Macdonald function has a closed form (see Formulas (63)–(65) below in Section 7). Therefore, if is an odd number, then the density may be calculated in closed form. The distribution functions with density functions for and are
Remark 15.
Functions in (45) are Weibull density and distribution functions, in (46) there are density and distribution functions of a generalized gamma distribution, but and are even more general.
The family of generalized gamma distributions contains many absolutely continuous distributions concentrated on the non-negative half-line.
Remark 16.
The generalized gamma distribution corresponds to the density
where α and r are the two shape parameters and λ the scale parameter. The density representation (48) is suggested in the work of Korolev and Zeifman [37] or Korolev and Gorshenin [38], and many special cases are listed therein. In addition to, e.g., Gamma and Weibull distributions (with α > 0), inverse Gamma, Lévy and Fréchet distributions (with α < 0) also belong to that family of generalized gamma distributions.
Remark 17.
The Weibull density in (45) is . Moreover, . The densities , are weighted sums of generalized gamma distribution with different shape parameters r, e.g.,
(ii) If , the chi-square distribution is the limit distribution with density .
(iii) If as limit distribution the inverse Pareto distribution occurs with shape parameter , scale parameter and density :
In [39], a robust and efficient estimator for the shape parameter of the inverse Pareto distribution and applications are given.
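The two non-trivial mixtures of this subsection can also be checked numerically for small degrees of freedom. The sketch assumes the inverse exponential mixing law H_s(y) = exp(-s/y), substitutes t = 1/y (so that t is exponential with rate s) and uses a midpoint rule. The closed forms used for comparison, 1 - exp(-sqrt(2 s x)) for d = 1 in the first case and (x/(x + 2s))^(d/2) in the inverse Pareto case, follow from elementary Laplace transform calculations and are assumed to agree with Formulas (45) and (49), which are not reproduced here. Function names and the parameter `power` are hypothetical, and only d = 1 and d = 2 are supported to avoid external libraries.

```python
import math


def chi2_cdf(u, d):
    """Chi-square distribution function for d = 1 or d = 2 only."""
    if u <= 0.0:
        return 0.0
    if d == 1:
        return math.erf(math.sqrt(0.5 * u))
    if d == 2:
        return 1.0 - math.exp(-0.5 * u)
    raise ValueError("this sketch supports only d = 1 and d = 2")


def mixture_cdf(x, d, s, power, upper=40.0, steps=40000):
    """Midpoint-rule value of  int_0^inf G_d(x y^power) dH_s(y).

    With t = 1/y the integral becomes  int_0^inf G_d(x t^(-power)) s e^{-s t} dt.
    """
    h = upper / (s * steps)
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += chi2_cdf(x * t ** (-power), d) * s * math.exp(-s * t)
    return total * h


def weibull_cdf(x, s):
    """Assumed closed form for d = 1, power = 1: Weibull law with shape 1/2."""
    return 1.0 - math.exp(-math.sqrt(2.0 * s * x))


def inverse_pareto_cdf(x, d, s):
    """Assumed closed form for power = -1: shape d/2 and scale 2s."""
    return (x / (x + 2.0 * s)) ** (0.5 * d)


if __name__ == "__main__":
    print(mixture_cdf(2.0, 1, 1.0, 1.0), weibull_cdf(2.0, 1.0))
    print(mixture_cdf(2.0, 2, 1.0, -1.0), inverse_pareto_cdf(2.0, 2, 1.0))
```

For other degrees of freedom a regularized incomplete gamma function would be needed in `chi2_cdf`; it is omitted here to keep the example self-contained.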
6. Main Results
We examine asymptotic approximations of depending on the scaling factor for , for if the statistic is asymptotically normal or if is asymptotically chi-square distributed.
6.1. Asymptotically Normal Distributed Statistics with Negative Binomial Distributed Sample Sizes
Consider first the statistics estimated in (15), (18) and (21) with the normal limiting distribution . They have the form
The sample size is negative binomial with probability mass function (28).
Theorem 1.
Let . If inequality (50) for the statistic and inequality (31) for random sample size with hold, then, for all , the following expansions apply:
- i:
- The non-random scaling factor by statistic leads to Student’s t-approximation.
- ii:
- The standard normal approximation occurs at random scaling factor by statistic :where
- iii:
- If , the mixed scaling factor by statistic leads to generalized Laplace approximation:where , for and
Remark 18.
The statistics from Section 3.1 are considered with different normalization factors as applications of Theorem 1:
Corollary 1.
Let the conditions of Theorem 1 be satisfied:
- i:
- ii:
- iii:
Remark 19.
The approximating functions in the expansions for with the statistics estimated in (15), (18) and (21) can only be given in closed form for all in the case of non-random () or random () normalization factors. In the case of the mixed () normalization factor, closed forms are available only for positive integer r, while in the other cases Macdonald functions are involved.
6.2. Asymptotically Chi-Square Distributed Statistics with Pareto-Like Distributed Sample Sizes
Consider now the statistics, estimated in (26) and (27) with limit chi-square distributions. They have the form
The sample size is the Pareto-like random variable with probability mass function (35).
Theorem 2.
Let and (36) be the distribution function of the random sample size . If for the statistic the inequality (56) with limiting chi-square distribution and the inequality (37) with for the random sample size hold, then for all one has the following approximation:
- i:
- The non-random scaling factor n by leads to the limiting generalized gamma distributions.andwhere the limit law with density for and are given in (45) and (46).
- ii:
- The random scaling factor by induces the limiting chi-square distribution.
- iii:
- Limiting inverse Pareto distributions occur at mixed scaling factor by .wherewith inverse Pareto distribution having shape parameter , scale parameter and density defined in (49).
Remark 20.
The statistics from Section 3.2 are considered with different normalization factors as applications of Theorem 2.
Corollary 2.
Let the conditions of Theorem 2 be satisfied.
- i:
- ii:
- iii:
Remark 21.
For the statistics estimated in (26) and (27), the approximating functions in the expansions for can only be given in closed form for all integer d in the case of non-random () or random () normalization factors. In the case of the mixed () normalization factor, closed forms are available only for odd integer d; for even integer d, the Macdonald functions are involved.
7. Formal Construction of the Expansions
Expansions of the statistics considered in (15), (18), (21), (26) and (27) have the structure:
with and polynomials of degrees and , respectively. Here, or .
We calculate the integrals with and :
The limit distributions of the random sizes are and with corresponding second approximation .
We use the following formulas several times: Formula 2.3.3.1 in [40]
and Formula 2.3.16.1 in [40] with real and :
where the Macdonald function already appears in Formula (42) with different and argument.
For , where m is an integer, the Macdonald function has a closed form (see Formulas 2.3.16.2 and 2.3.16.3 in [40] with ):
where
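The closed form for half-integer order is easy to test numerically. The sketch below uses the standard finite-sum representation of the Macdonald function K_{m+1/2}(z) and compares it with a midpoint-rule evaluation of the classical integral of x^(α-1) exp(-p x - q/x) over the positive half-line, which equals 2 (q/p)^(α/2) K_α(2 sqrt(p q)). Both identities are standard; that they coincide with the cited formulas of [40] is an assumption, since the displayed formulas are not reproduced here. The function names are hypothetical.

```python
import math


def macdonald_half_integer(m, z):
    """K_{m+1/2}(z) for integer m >= 0 and z > 0 via the finite sum

    sqrt(pi/(2z)) e^{-z} sum_{k=0}^{m} (m+k)! / (k! (m-k)! (2z)^k).
    """
    prefactor = math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)
    total = 0.0
    for k in range(m + 1):
        total += (math.factorial(m + k)
                  / (math.factorial(k) * math.factorial(m - k) * (2.0 * z) ** k))
    return prefactor * total


def integral_representation(alpha, p, q, upper=60.0, steps=60000):
    """Midpoint-rule value of  int_0^inf x^(alpha-1) exp(-p x - q/x) dx."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** (alpha - 1.0) * math.exp(-p * x - q / x)
    return total * h


def closed_form(m, p, q):
    """2 (q/p)^(alpha/2) K_alpha(2 sqrt(p q)) with alpha = m + 1/2."""
    alpha = m + 0.5
    return 2.0 * (q / p) ** (0.5 * alpha) * macdonald_half_integer(m, 2.0 * math.sqrt(p * q))


if __name__ == "__main__":
    print(integral_representation(1.5, 1.0, 1.0), closed_form(1, 1.0, 1.0))
    print(integral_representation(0.5, 2.0, 0.5), closed_form(0, 2.0, 0.5))
```

The three-term recurrence K_{ν+1}(z) = K_{ν-1}(z) + (2ν/z) K_ν(z) offers an independent consistency check of the finite sum.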
7.1. The Case and
Consider statistics that meet the condition (50).
Let with . Then, for and
If is an integer, then, using (64) with , the density of can be calculated with (65) in closed form.
7.2. The Case and
Consider statistics that meet the condition (56). Let , . Then, and
Let . Using (63) with , and , we find
If is an odd number, using the closed form in (65) with , and , then and its density may be calculated in closed form:
The distribution functions and their densities for are given in (45)–(47).
If , we use (62) with , , and the substitution :
where is the density of the inverse Pareto distribution defined in (49).
8. Proof of Theorems
We find from Lemmas A1 and A2 that in (5) in Proposition 1 is bounded and the integrals in (10) and (11) in Proposition 2 have the necessary convergence rates. It remains to calculate the integrals in (9).
Proof of Theorem 1.
Let , and defined in (32).
Suppose with , which are the limit distributions in (9) for under the condition of Theorem 1. Then, for . It follows from (66), (62) for and (65) for that
where is the density of Student’s t-distribution with degrees of freedom and is the density of a generalized Laplace distribution.
Integral is the integral by in the expansion (9). Then, using (67) and (68) with , we obtain
Integral is the integral by in the expansion (9). Then, using again (67) and (68) with , we obtain
Integration by parts in the last integral by in (9) for and leads to
Suppose . We find from (62)
with
where is defined in (33). It follows from Lemma A3 that for
Hence, because of for , we obtain
For , we only consider the case , which results in and
where is proved in Lemma A3.
If , then since .
Proof of Theorem 2.
Let , and defined in (32).
Suppose with which are the limit distributions in (9) for under the condition of Theorem 2.
Then, for . It follows from (69) and (65) for and (70) for that
where is the Weibull density (see (45)), is the generalized gamma density (see (46)) and is the density of the inverse Pareto distribution defined in (49).
Integration by parts in the last integral by in (9) for leads to
where is defined in (33). Suppose . We get with (65)
For using (65), we see that
In Lemma A4, for is proved.
If , then .
Combining the above estimates proves Theorem 2. □
9. Conclusions
Chebyshev–Edgeworth expansions are derived for the distributions of various statistics from samples with random sample sizes. The construction of these asymptotic expansions is based on the given asymptotic expansions for the distributions of statistics of samples with fixed sample sizes as well as those of the distributions of the random sample sizes.
The asymptotic laws are scale mixtures of the underlying standard normal or chi-square distributions with gamma or inverse exponential mixing distributions. The results hold for a whole family of asymptotically normal or chi-square statistics since a formal construction of asymptotic expansions is developed. In addition to the random sample size, the normalization factor for the examined statistics also has a significant influence on the limit distribution. As limit laws, Student’s t, standard normal, Laplace, inverse Pareto, generalized gamma, generalized Laplace and weighted sums of generalized gamma distributions occur. As statistics, the random mean, the scale-mixed normalized Student t-distribution and the Student’s t-statistic under non-normality with normal limit law, as well as Hotelling’s generalized and scale mixture of chi-square statistics with chi-square limit laws, are considered. The bounds for the corresponding residuals are presented in terms of inequalities.
Author Contributions
Conceptualization, G.C. and V.V.U.; methodology, V.V.U. and G.C.; writing—original draft, G.C. and V.V.U.; and writing—review and editing, V.V.U. and G.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was done within the framework of the Moscow Center for Fundamental and Applied Mathematics, Lomonosov Moscow State University and University Basic Research Programs.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors would like to thank the managing editor for the assistance and the reviewers for their careful reading of the manuscript and the relevant comments. Their constructive feedback helped to improve the quality of this paper.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Auxiliary Statements and Lemmas
In Section 3, we consider statistics satisfying (1) in Assumption 1 and in Section 4 random sample sizes satisfying (2) in Assumption 2. The statistics in (15), (18) and (21) satisfy Assumption 1 with the normal limit distribution and in (26) and (27) with chi-square limit distributions and , defined in (17), respectively.
Further, we estimate the functions and , that appear in of Assumption 1 and in the term in . Since the functions are products of a polynomial and a density function with or , it follows for that, if
with some polynomial . For , we have and . Hence, (A1) also holds for .
For example, with occurring in (16) for the sample mean and occurring in (27) with for scale mixture of chi-square statistics, we obtain
with and .
Remark A1.
If , i.e., the absolute term of the polynomial is not equal to zero, then it is also the absolute term of , i.e., .
The functions and in allow obtaining the estimates for
Appendix A.1. Lemmas A1 and A2
Lemma A1.
Proof of Lemma A1.
The statistics in (15), (18) and (21) satisfy Assumption 1 with the normal limit distribution . To estimate , we consider the cases and .
Let . Since for and has constant for and , we find
From (A1) and (A2), it follows that , which proves the first case. Moreover, since for the considered statistics.
Consider now the statistics estimated in (26) and (27) with limit chi-square distributions. We only need to examine and . In the cases now under review, we have and with some real constants A and B. The proof is completed with (A1) and (A2),
. □
Next, the integrals in (10) and (11) in Proposition 2 for the gamma limit distributions and the inverse exponential limit distribution are estimated.
Lemma A2.
(i) The conditions (6), (7) and (8) in Proposition 2 are satisfied for and .
(ii) Let . Consider given in (16) and is given in (16) or (18) for statistics occurring in (15), (18) and (21) with limiting distribution .
(iia) Let the mixing distribution be with , and . Then, we obtain with
(iib) Now, consider the mixing distribution with , and . Then, apply it to in (11)
(iii) Put . Consider for statistics occurring in (26) and (27) satisfying Assumption 1 with the chi-square distribution .
(iiia) The mixing distribution is as in Case (iia) above. Then, for ,
(iiib) The mixing distribution is with and as in Case (iib). Then,
Proof of Lemma A2.
(i) Insertion of with and with and simple calculation result in the necessary estimates in (6)–(8). In the case of , all terms even decrease exponentially fast.
(ii) The limit distribution of the considered statistics is standard normal .
(iia) Let . Using (A2), the estimations (A3) and (A5) for , with , are
Taking into account
the bound (A4) follows from
If with , we find and the bound (A6) follows from
where for using , we obtain
and, in the case of , the substitution for leads to
Finally, if , then does not depend on y. Then, (A7) follows from (A1), (A2), and (A14) for .
Let . Integration by parts for Lebesgue–Stieltjes integrals , , in (11) leads to
with bound given in (A2). Defining , we find
and, with , we obtain
Hence, using for , we obtain (A8) and, for , .
For , the second integral in the line above is an exponential integral. Therefore, with (A14), we find (A8) for , too.
(iib) The mixing distribution is with . Since , only has to be estimated. Integration by parts for in (11) leads to (A15) with , and instead of . Hence, (A9) follows from
(iii) The limit distribution of the statistics in (26) and (27) is the chi-square distribution defined in (17). In the considered cases . Let . Consider with chi-square density .
(iiia) Let . We have to estimate for and for . The bound (A10) for follows from (A2) and
If with , we find and
in the case of using variable transformation for one has
If then , noting (A14) and for , we prove (A11).
Appendix A.2. Lemmas A3 and A4
We show that the integrals and in the proofs of Theorems 1 and 2 are of the same order as the remainder terms. Therefore, the jump-correcting function occurring in (32) and (38) has no effect on the second-order approximation. The function is periodic with period 1. The Fourier series expansion of at all non-integer points y is
(see formula 5.4.2.9 in [40] with ).
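The Fourier expansion can be checked numerically. The sketch below assumes that the jump-correcting function is the usual sawtooth $S_1(y) = [y] - y + 1/2$ with Fourier series $\sum_{k \ge 1} \sin(2\pi k y)/(k\pi)$; the displayed formula is not reproduced in this text, so the symbol, the explicit form and the function names are all assumptions.

```python
import math


def sawtooth(y):
    # ASSUMED form of the jump-correcting function: [y] - y + 1/2 (period 1).
    return math.floor(y) - y + 0.5


def fourier_partial_sum(y, terms):
    # Partial sum of sum_{k>=1} sin(2*pi*k*y) / (k*pi).
    return sum(
        math.sin(2.0 * math.pi * k * y) / (k * math.pi)
        for k in range(1, terms + 1)
    )


if __name__ == "__main__":
    for y in (0.25, 0.7, -1.3):
        print(y, sawtooth(y), fourier_partial_sum(y, 5000))
```

The series converges only conditionally, so the partial sums approach the sawtooth slowly, and more slowly the closer y lies to an integer. At integer points every term vanishes, which is why the expansion is stated for non-integer y only.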
Proof of Lemma A3.
We begin by considering in defined in (A17) following the estimate of in the proof of Theorem 2 in [11]. Inserting the Fourier series expansion of into the integral , interchanging the integral and the sum, and applying formula 2.5.31.4 in [40] with and , we obtain
Now, we split the exponent and obtain
Since and , the first statement in Lemma A3 follows:
To prove the second statement about , we insert again the Fourier series expansion of given in (A17) into and interchange the integral and sum
Further, we use formula 2.5.37.4 in [40]
with , , and . Use of the estimates
leads to the inequalities
and Lemma A3 is proven. □
Lemma A4.
Let be defined by (79), then for .
Proof of Lemma A4.
Using the Fourier series expansion (A17) of the periodic function , given in (33), and interchanging the integral and the sum, we find
We begin by estimating , i.e., the exponent of y in (A19) is . Thus, we can use formula 2.5.37.3 in [40]
where , , and . Since
it results in
Now let and . The main difference from the previous estimate of is that more technical effort is needed to estimate . The exponent of y in (A19) is and we cannot find a closed formula similar to (A19) for this case. To estimate in the proof of Theorem 5 in [11], we show that differentiation with respect to s under the integral sign in (A20) is allowed. Hence,
with the same coefficients p, s, b and as in (A20). The use of (A21) and the obvious inequalities and leads to
Finally, let .
Integration by parts in the integral with , and leads to
and using (62) to
Therefore,
and Lemma A4 is proven. □
References
- Nunes, C.; Capistrano, G.; Ferreira, D.; Ferreira, S.S.; Mexia, J.T. Exact critical values for one-way fixed effects models with random sample sizes. J. Comput. Appl. Math. 2019, 354, 112–122.
- Nunes, C.; Capistrano, G.; Ferreira, D.; Ferreira, S.S.; Mexia, J.T. Random sample sizes in orthogonal mixed models with stability. Comp. Math. Methods 2019, 1, e1050.
- Nunes, C.; Mário, A.; Ferreira, D.; Moreira, E.M.; Ferreira, S.S.; Mexia, J.T. An algorithm for simulation in mixed models with crossed factors considering the sample sizes as random. J. Comput. Appl. Math. 2021.
- Esquível, M.L.; Mota, P.P.; Mexia, J.T. On some statistical models with a random number of observations. J. Stat. Theory Pract. 2016, 10, 805–823.
- Al-Mutairi, J.S.; Raqab, M.Z. Confidence intervals for quantiles based on samples of random sizes. Statist. Pap. 2020, 61, 261–277.
- Barakat, H.M.; Nigm, E.M.; El-Adll, M.E.; Yusuf, M. Prediction of future generalized order statistics based on exponential distribution with random sample size. Statist. Pap. 2018, 59, 605–631.
- Gnedenko, B.V. Estimating the unknown parameters of a distribution with a random number of independent observations. (Probability theory and mathematical statistics (Russian)). Trudy Tbiliss. Mat. Inst. Razmadze Akad. Nauk Gruzin. SSR 1989, 92, 146–150.
- Gnedenko, B.V.; Korolev, V.Y. Random Summation. Limit Theorems and Applications; CRC Press: Boca Raton, FL, USA, 1996.
- Bening, V.E.; Galieva, N.K.; Korolev, V.Y. On rate of convergence in distribution of asymptotically normal statistics based on samples of random size. Ann. Math. Inform. 2012, 39, 17–28. (In Russian)
- Bening, V.E.; Galieva, N.K.; Korolev, V.Y. Asymptotic expansions for the distribution functions of statistics constructed from samples with random sizes. Inform. Appl. 2013, 7, 75–83. (In Russian)
- Christoph, G.; Monakhov, M.M.; Ulyanov, V.V. Second-order Chebyshev-Edgeworth and Cornish-Fisher expansions for distributions of statistics constructed with respect to samples of random size. J. Math. Sci. 2020, 244, 811–839.
- Christoph, G.; Ulyanov, V.V.; Bening, V.E. Second Order Expansions for Sample Median with Random Sample Size. arXiv 2020, arXiv:1905.07765v2.
- Christoph, G.; Ulyanov, V.V. Second order expansions for high-dimension low-sample-size data statistics in random setting. Mathematics 2020, 8, 1151.
- Christoph, G.; Ulyanov, V.V. Short Expansions for High-Dimension Low-Sample-Size Data Statistics in a Random Setting. Recent Developments in Stochastic Methods and Applications. In Proceedings in Mathematics & Statistics; Shiryaev, A.N., Samouylov, K.E., Kozyrev, D.V., Eds.; Springer: Cham, Switzerland, 2021; (to appear).
- Hall, P. The Bootstrap and Edgeworth Expansion; Springer Series in Statistics; Springer: New York, NY, USA, 1992.
- Bickel, P.J. Edgeworth expansions in nonparametric statistics. Ann. Statist. 1974, 2, 1–20.
- Kolassa, J.E. Series Approximation Methods in Statistics, 3rd ed.; Lecture Notes in Statistics 88; Springer: New York, NY, USA, 2006.
- Petrov, V.V. Limit Theorems of Probability Theory, Sequences of Independent Random Variables; Clarendon Press: Oxford, UK, 1995.
- Fujikoshi, Y.; Ulyanov, V.V. Non-Asymptotic Analysis of Approximations for Multivariate Statistics; Springer: Singapore, 2020.
- Hall, P. Edgeworth Expansion for Student’s t-statistic under minimal moment conditions. Ann. Probab. 1987, 15, 920–931.
- Bentkus, V.; Götze, F.; van Zwet, W.R. An Edgeworth expansion for symmetric statistics. Ann. Statist. 1997, 25, 851–896.
- Anderson, T.W. An Introduction to Multivariate Statistical Analysis, 3rd ed.; Wiley Series in Probability and Mathematical Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2003.
- Fujikoshi, Y.; Ulyanov, V.V.; Shimizu, R. Multivariate Statistics. High-Dimensional and Large-Sample Approximations; Wiley Series in Probability and Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2010.
- Fujikoshi, Y.; Ulyanov, V.V.; Shimizu, R. L1-norm error bounds for asymptotic expansions of multivariate scale mixtures and their applications to Hotelling’s generalized $T_0^2$. J. Multivar. Anal. 2005, 96, 1–19.
- Ulyanov, V.V.; Fujikoshi, Y. On accuracy of improved $\chi^2$-approximation. Georgian Math. J. 2001, 8, 401–414.
- Schluter, C.; Trede, M. Weak convergence to the Student and Laplace distributions. J. Appl. Probab. 2016, 53, 121–129.
- Döbler, C. New Berry-Esseen and Wasserstein bounds in the CLT for non-randomly centered random sums by probabilistic methods. ALEA Lat. Am. J. Probab. Math. Stat. 2015, 12, 863–902.
- Robbins, H. The asymptotic distribution of the sum of a random number of random variables. Bull. Am. Math. Soc. 1948, 54, 1151–1161.
- Kolassa, J.E.; McCullagh, P. Edgeworth Series for Lattice Distributions. Ann. Statist. 1990, 18, 981–985.
- Bening, V.E.; Korolev, V.Y. On the use of Student’s distribution in problems of probability theory and mathematical statistics. Theory Probab. Appl. 2005, 49, 377–391.
- Gavrilenko, S.V.; Zubov, V.N.; Korolev, V.Y. The rate of convergence of the distributions of regular statistics constructed from samples with negatively binomially distributed random sizes to the Student distribution. J. Math. Sci. 2017, 220, 701–713.
- Buddana, A.; Kozubowski, T.J. Discrete Pareto distributions. Econ. Qual. Control 2014, 29, 143–156.
- Lyamin, O.O. On the rate of convergence of the distributions of certain statistics to the Laplace distribution. Mosc. Univ. Comput. Math. Cybern. 2010, 34, 126–134.
- Choy, T.B.; Chan, J.E. Scale mixtures distributions in statistical modellings. Aust. N. Z. J. Stat. 2008, 50, 135–146.
- Oldham, K.B.; Myland, J.C.; Spanier, J. An Atlas of Functions, 2nd ed.; Springer Science+Business Media: New York, NY, USA, 2009.
- Kotz, S.; Kozubowski, T.J.; Podgórski, K. The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance; Birkhäuser: Boston, MA, USA, 2001.
- Korolev, V.Y.; Zeifman, A.I. Generalized negative binomial distributions as mixed geometric laws and related limit theorems. Lith. Math. J. 2019, 59, 366–388.
- Korolev, V.Y.; Gorshenin, A. Probability models and statistical tests for extreme precipitation based on generalized negative binomial distributions. Mathematics 2020, 8, 604.
- Safari, M.A.M.; Masseran, N.; Ibrahim, K.; Hussain, S.I. A robust and efficient estimator for the tail index of inverse Pareto distribution. Phys. A Stat. Mech. Its Appl. 2019, 517, 431–439.
- Prudnikov, A.P.; Brychkov, Y.A.; Marichev, O.I. Integrals and Series, Vol. 1: Elementary Functions, 3rd ed.; Gordon & Breach Science Publishers: New York, NY, USA, 1992.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).