Abstract
The convergence rate in the famous Rényi theorem is studied by means of a refinement of the Stein method. Namely, it is demonstrated that the new estimate of the convergence rate of normalized geometric sums to the exponential law, involving the ideal probability metric of the second order, is sharp. Some recent results concerning the convergence rates in the Kolmogorov and Kantorovich metrics are extended as well. In contrast to many previous works, there are no assumptions that the summands of geometric sums are positive and have the same distribution. For the first time, an analogue of the Rényi theorem is established for the model of exchangeable random variables. Also within this model, a sharp estimate of the convergence rate to a specified mixture of distributions is provided. The convergence rate of appropriately normalized random sums of random summands to the generalized gamma distribution is estimated. Here, the number of summands follows the generalized negative binomial law. Sharp estimates of the proximity of the distributions of random sums of random summands to the limit law are established for independent summands and for the model of exchangeable ones. The inverse of the equilibrium transformation of probability measures is introduced, and in this way a new approximation of the Pareto distributions by exponential laws is proposed. Integral probability metrics and the techniques of integration with respect to signed measures are essentially employed.
Keywords:
probability metrics; Stein method; geometric sums; generalization of the Rényi theorem; generalized equilibrium transformation of probability measures and its inverse; generalized gamma distribution
MSC:
60F99; 60E10; 60G50; 60G09
1. Introduction
The theory of sums of random variables belongs to the core of modern probability theory. The fundamental contribution to the formation of the classical core was made by A. de Moivre, J. Bernoulli, P.-S. Laplace, D. Poisson, P.L. Chebyshev, A.A. Markov, A.M. Lyapunov, E. Borel, S.N. Bernstein, P. Lévy, J. Lindeberg, H. Cramér, A.N. Kolmogorov, A.Ya. Khinchin, B.V. Gnedenko, J.L. Doob, W. Feller, Yu.V. Prokhorov, A.A. Borovkov, Yu.V. Linnik, I.A. Ibragimov, A. Rényi, P. Erdös, M. Csörgö, P. Révész, C. Stein, P. Hall, V.V. Petrov, V.M. Zolotarev, J. Jacod and A.N. Shiryaev, among others. The first steps led to limit theorems for appropriately normalized partial sums of sequences of independent random variables. Besides the laws of large numbers, special attention was paid to the emergence of Gaussian and Poisson limit laws. Note that despite many efforts to find necessary and sufficient conditions for the validity of the central limit theorem (the term was proposed by G. Pólya for a class of limit theorems describing weak convergence of distributions of normalized sums of random variables to the Gaussian law), this problem was completely resolved for independent summands only in the second half of the 20th century in the works by V.M. Zolotarev and V.I. Rotar. Also in the last century, the beautiful theory of infinitely divisible and stable laws was constructed. New developments of infinite divisibility along with the classical theory can be found in [1]. For an exposition of the theory of stable distributions and their applications, we refer to [2]; see also references therein.
Parallel to partial sums of a sequence of random variables (and vectors), other significant schemes have appeared, for instance, arrays of random variables. Moreover, in physics, biology and other domains, researchers found that it was essential to study sums of random variables when the number of summands is random. Thus, random sums with random summands became an important object of investigation. One can mention the branching processes which stem from the 19th-century population models by I.J. Bienaymé, F. Galton and H.W. Watson and are still being intensively developed, see, e.g., [3]. In the theory of risk, it is worth recalling the celebrated Cramér–Lundberg model for the dynamics of the capital of an insurance company, see, e.g., Ch. 6 in [4]. Various examples of models described by random sums are considered in Ch. 1 of [5], including (see Example 1.2.1) the relationship between the analysis of certain random sums and the famous Pollaczek–Khinchin formula in queuing theory. A vast literature deals with the so-called geometric sums. There, one studies the sum of independent identically distributed random variables, where the summation index follows the geometric distribution and is independent of the summands. Such random sums can model many real-world phenomena, e.g., in queuing, insurance and reliability; see the Section “Origin of Geometric Sums” in the Introduction of [6]. Furthermore, a multitude of important stochastic models described by systems of dependent random variables arose to meet diverse applications, see, e.g., [7]. In particular, the general theory of stochastic processes and random fields was developed in the last century (for an introduction to random fields, see, e.g., [8]).
An intriguing problem of estimating the convergence rate to a limit law was addressed by A.C. Berry and C.-G. Esseen. Their papers initiated the study of the proximity of the distribution functions of normalized partial sums of independent random variables to the distribution function of the standard Gaussian law within the classical theory of sums of random variables.
To assess the proximity of distributions, we will employ various integral probability metrics. Usually, for random variables Y, Z and a specified class $\mathcal{H}$ of functions $h:\mathbb{R}\to\mathbb{R}$, one sets
$$\mu_{\mathcal{H}}(Y,Z) := \sup_{h\in\mathcal{H}} \big|\mathbb{E}h(Y) - \mathbb{E}h(Z)\big|. \qquad (1)$$
Clearly, the quantity defined by Equation (1) is a functional depending on the distributions of Y and Z. A class $\mathcal{H}$ should be rich enough to guarantee that it possesses the properties of a metric (or semi-metric). The general theory of probability metrics is presented, e.g., in [9,10]. In terms of such metrics, one often compares the distribution of a random variable Y under consideration with that of a target random variable Z. In Section 2, we recall the definitions of the Kolmogorov and Kantorovich (alternatively called Wasserstein) distances and the Zolotarev ideal metrics corresponding to adequate choices of $\mathcal{H}$.
It should be emphasized that for sums of random variables, deep results were established along with the creation and development of different methods of analysis. One can mention the method of characteristic functions due to the works of J. Fourier, P.-S. Laplace and A.M. Lyapunov, the method of moments proposed by P.L. Chebyshev and developed by A.A. Markov, the Lindeberg method of employing auxiliary Gaussian random variables and the Bernstein techniques of large and small boxes. In 1972, C. Stein in [11] (see also [12]) introduced a new method to estimate the proximity of the distribution under consideration to a normal law. Subsequently, this powerful method was developed in the framework of classical limit theorems of probability theory. We describe this method in Section 2. Applying the Stein method along with other tools, one can establish in certain cases sharp estimates of the closeness between a target distribution and other ones in specified metrics (see, e.g., [13,14]). We recommend the books [15,16] and the paper [17] for the basic ideas of the ingenious Stein method. The development of these techniques under mild moment restrictions for summands is treated in [18,19]. We mention in passing that there are deep generalizations of the Stein techniques involving generators of certain Markov processes; a compact exposition is provided, e.g., on p. 2 of [20].
In the theory of random sums of random summands, the limit theorems with the exponential law as a target distribution play a role similar to that of the central limit theorem for (nonrandom) sums of random variables. Here, one has to underline the principal role of Rényi's classical theorem for geometric sums published in [21]. Recall this famous result. Let $X_1, X_2, \ldots$ be a sequence of independent identically distributed (i.i.d.) random variables such that $\mathbb{E}X_1 = 1/\lambda$ for some $\lambda > 0$. Take a geometric random variable $N_p$ with parameter $p \in (0,1)$, defined as follows:
$$\mathbb{P}(N_p = k) = p(1-p)^{k-1}, \quad k \in \mathbb{N}. \qquad (2)$$
Assume that $N_p$ and $(X_n)_{n \in \mathbb{N}}$ are independent. Set $S_0 := 0$, $S_n := X_1 + \ldots + X_n$, $n \in \mathbb{N}$. Then,
$$p\,S_{N_p} \xrightarrow{d} Z, \quad p \to 0+, \qquad (3)$$
where $\xrightarrow{d}$ stands for convergence in distribution, and Z follows the exponential law with parameter $\lambda$, i.e., $\mathbb{P}(Z > x) = e^{-\lambda x}$, $x \ge 0$. In fact, instead of $N_p$, A. Rényi considered the shifted geometric random variable $N_p'$ such that $\mathbb{P}(N_p' = k) = p(1-p)^{k}$, $k = 0, 1, 2, \ldots$. Clearly, $N_p' + 1$ has the same law as $N_p$. He supposed that the i.i.d. random variables are non-negative, and that $N_p'$ and $(X_n)_{n \in \mathbb{N}}$ are independent. Then, $p\,S_{N_p'}$ converges in distribution to Z as $p \to 0+$. It was explained in [22] that both statements are equivalent and the assumption of nonnegativity of summands can be omitted.
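As a quick numerical illustration of the convergence in Equation (3) — our own sketch, not part of the paper's argument — the following plain-Python code simulates geometric sums with Uniform[0, 2] summands (mean 1, so the limit law is taken to be Exp(1); all parameter values here are illustrative choices) and measures the empirical Kolmogorov distance to the limit distribution function.

```python
import math
import random

def geometric_sum(p, draw_summand, rng):
    """Sample p * S_{N_p}, where N_p is geometric on {1, 2, ...} with
    success probability p, independent of the i.i.d. summands."""
    n = 1
    while rng.random() >= p:  # count trials until the first success
        n += 1
    return p * sum(draw_summand(rng) for _ in range(n))

rng = random.Random(42)
# Summands uniform on [0, 2]: mean 1, so the limit law is Exp(1).
sample = sorted(geometric_sum(0.01, lambda r: r.uniform(0.0, 2.0), rng)
                for _ in range(4000))

# Empirical Kolmogorov distance to the Exp(1) distribution function.
n = len(sample)
dist = max(max(abs((i + 1) / n - (1 - math.exp(-x))),
               abs(i / n - (1 - math.exp(-x))))
           for i, x in enumerate(sample))
print(f"Kolmogorov distance for p = 0.01: {dist:.3f}")
```

For small p the distance is dominated by the Monte Carlo error of the 4000 replications, in line with the O(p) convergence rate discussed below.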
Building on the previous investigations discussed below in this section, we study different instances of quantifying the approximation of random sums by limit laws and also extend the use of the Stein method. The main goals of our paper are the following: (1) to find sharp estimates (i.e., optimal ones, which cannot be diminished) of the proximity of geometric sums of independent (in general, non-identically distributed) random variables to the exponential law using the probability metric $\zeta_2$; (2) to prove a new version of the Rényi theorem when the summands are described by a model of exchangeable random variables, establishing the due non-exponential limit law together with an optimal bound on the convergence rate applying $\zeta_2$; (3) to obtain the exact convergence rate of appropriately normalized random sums of random summands to the generalized gamma distribution, when the number of summands follows the generalized negative binomial distribution, employing $\zeta_2$; (4) to introduce the inverse transformation to the “equilibrium distribution transformation”, give a full description of its existence and demonstrate the advantage of applying the Stein method combined with that inverse transform; and (5) to use such an approach in deriving a new approximation in the Kolmogorov metric of the Pareto distribution by an exponential one, which is important in signal processing.
The main idea is to apply the Stein method and deduce (Lemma 2) new estimates of the solution of Stein's equation (corresponding to an exponential law as a target distribution) when a function h appearing in its right-hand side belongs to a suitable functional class. This entails the established sharp estimates. Integral probability metrics and the techniques of integration with respect to signed measures are essentially employed. It should be stressed that we consider random summands which, in general, take both positive and negative values and in certain cases need not have the same law.
Now, we briefly comment on the relevance of the five groups of results mentioned above. Some upper bounds for convergence rates in Equation (3) were obtained previously by different tools (the renewal techniques and the memoryless property of the geometric distribution), and those estimates were not sharp. We refer to the results by A.D. Soloviev, V.V. Kalashnikov and S.Y. Vsekhsvyatskii, M. Brown, V.M. Kruglov and V.Yu. Korolev, where the authors either used the Kolmogorov distance or proved specified nonuniform estimates for differences of the corresponding distribution functions. For instance, in [23] the following estimate was proved
where . Moreover, this estimate is asymptotically exact when . Some improvements are given in [24] under certain (hazard rate) assumptions. E.V. Sugakova obtained a version of the Rényi theorem for independent, in general not identically distributed, random variables. We also mention contributions by V.V. Kalashnikov, E.F. Peköz, A. Röllin, N. Ross and T.L. Hung, who gave estimates in terms of the Zolotarev ideal metrics. We do not reproduce all these results here since they can be found on pages 3 and 4 of [22], together with references to the original publications.
In Corollary 3.6 of [25], for nondegenerate i.i.d. positive random variables with finite mean and finite second moment, it was proved that
where $\zeta_2$ is the Zolotarev ideal metric of order two. In [22], estimates for the proximity of the distributions of geometric sums to the exponential law were provided in the Kolmogorov and Kantorovich metrics. A substantial contribution of the authors of [22] is the study of random summands that need not be positive (see also [26]). A general estimate for the deviation of a geometric sum from the limit law in the ideal metric of order s was proved in [27]. We do not assume that the geometric sum is constructed by means of i.i.d. random variables and, moreover, demonstrate that our estimate (for summands taking real values) involving the metric $\zeta_2$ is sharp.
The exchangeable random variables form an important class having various applications in statistics and combinatorics, see, e.g., [28]. As far as we know, the model of exchangeable random variables is studied here in the context of random sums for the first time. Interestingly, instead of the exponential limit law, we indicate an explicit expression for the new limit law. In addition, we establish a sharp estimate of the proximity of the distributions of random sums to this law using $\zeta_2$.
A natural generalization of the Rényi theorem is to study a summation index following a non-geometric distribution. In this direction, an upper bound on the convergence rate of random sums of random summands to the generalized gamma distribution was proved in [29]. Theorem 3.1 in [30] contains estimates in the Kolmogorov and Kantorovich distances for approximations of a non-negative random variable's law by a specified (nongeneralized) gamma distribution. The proof relies on Stein's identity for the gamma distribution established in H.M. Luk's PhD thesis (see the reference in [30]). New estimates of the solutions of the gamma Stein equation are given in [31]. We derive a sharp estimate for the approximation of random sums by the generalized gamma law using the Zolotarev metric of order two. In a quite recent paper [32], the author established deep results concerning further generalizations of the Rényi theorem. Namely, Theorem 1 of [32] demonstrates how one can provide upper bounds on the convergence rate of specified random sums to a more general law than an exponential one using the estimates in the Rényi theorem. This approach is appealing since the author employs the ideal metric of order s. However, the sharpness of these estimates was not examined.
Note that in [33] the important “equilibrium transformation of distributions” was proposed and employed along with the Stein techniques. We will consider this transformation for a random variable X in Section 7 and also tackle other useful transformations. In the present paper, the inverse to the “equilibrium distribution transformation” is introduced. We completely describe when such a transformation can be constructed and provide an explicit formula for the corresponding density. The idea to apply this inverse transformation, whenever it exists, is based on the result of [33] demonstrating that one can obtain a more precise estimate of the proximity in the Kantorovich metric between $X^e$ and Z than between X and Z, where $X^e$ stands for the result of the equilibrium transformation applied to X. We extend this result. Moreover, we prove that in this way one can obtain a new estimate of the approximation of the Pareto distribution by an exponential one. It is shown that our new estimate is advantageous for a wide range of parameters of the Pareto distribution. Consider the Pareto law, i.e., the distribution function of the form
We show that the preimage . Thus, for any , , one has where and stands for the Kolmogorov distance. This bound is more precise than the previous ones applied in signal processing, see, e.g., [34].
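A quick numerical illustration of the Pareto-to-exponential approximation mentioned above (our sketch, not the paper's computation): we take the Lomax-type Pareto distribution function $F_\alpha(x) = 1 - (1 + x/\alpha)^{-\alpha}$ — an assumed parameterization under which $F_\alpha$ tends to the standard exponential distribution function as $\alpha \to \infty$ — and approximate the Kolmogorov distance to $1 - e^{-x}$ on a grid.

```python
import math

def kolmogorov_pareto_exp(alpha, grid_step=1e-3, x_max=60.0):
    """Grid approximation of sup_x |F_alpha(x) - F_Exp(x)| for the
    assumed Lomax-type Pareto df F_alpha(x) = 1 - (1 + x/alpha)^(-alpha)
    against the standard exponential df 1 - exp(-x)."""
    sup, x = 0.0, 0.0
    while x <= x_max:
        f_par = 1.0 - (1.0 + x / alpha) ** (-alpha)
        f_exp = 1.0 - math.exp(-x)
        sup = max(sup, abs(f_par - f_exp))
        x += grid_step
    return sup

for alpha in (5.0, 10.0, 50.0):
    print(alpha, kolmogorov_pareto_exp(alpha))
```

The distance decays roughly like a constant times $1/\alpha$, which is the regime where an exponential surrogate for the Pareto law is useful in applications such as signal processing.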
This paper is organized as follows. After the Introduction, auxiliary results are provided in Section 2. Here, we include the material important for understanding the main results. We recall the concept of probability metrics, consider the Kolmogorov and Kantorovich distances and examine the Zolotarev ideal metrics. We describe the basic ideas of Stein's method, especially for the exponential target distribution. In this section, we formulate a simple but useful Lemma 1 concerning the essential supremum of the derivative of a Lipschitz function, and an important Lemma 2 giving the solution of the Stein equation for different functional classes. We explain the essential role of the generalized equilibrium transformation proposed in [22], which permits the study of summands taking both positive and negative values. We formulate Lemma 3 to be able to solve an integral equation involving the generalized equilibrium transformation under appropriate conditions. The proofs of the auxiliary lemmas are placed in Appendix A. Section 3 is devoted to the approximation of normalized geometric sums by an exponential law. Here, the sharp convergence rate is found (see Theorem 1) by means of the probability metric $\zeta_2$. The proof is based on the Lebesgue–Stieltjes integration techniques, the formula of integration by parts for functions of bounded variation, Lemma 2, various limit theorems for integrals and the important result of [22] concerning the estimates involving the Kantorovich distance. In Section 4, for the first time, an analogue of the Rényi theorem is proved for a model of exchangeable random variables proposed in [35]. We demonstrate (Theorem 2) that, in contrast to Rényi's theorem, the limit distribution for the random sums under consideration is a specified mixture of two explicitly indicated laws. Moreover, the sharp convergence rate to this limit law is obtained (Theorem 3) by means of $\zeta_2$.
In Section 5, the distance between the generalized gamma law and a suitably normalized sum of independent random variables is estimated when the number of summands has the generalized negative binomial distribution. Theorem 4 demonstrates that this estimate is sharp. For the proof, we employ various truncation techniques, transformations of the parameters of the initial random variables, the monotone convergence theorem and the explicit formula, obtained in [27], for the moments of the generalized gamma distribution. Section 6 provides the pioneering study of the same problem in the framework of exchangeable random variables and also gives the sharp estimate in the metric $\zeta_2$ (Theorem 5). In Section 7, we introduce the inverse to the equilibrium transformation of probability measures. Lemma 6 contains a full description of the situations when a unique preimage X of a random variable exists and gives an explicit formula for the distribution of X. This approach permits us to obtain new estimates of the closeness of probability measures in the Kolmogorov and Kantorovich metrics (Theorem 6). In particular, due to Theorem 6 and Lemmas 2 and 6, it becomes possible to find a useful estimate of the proximity of the Pareto law to the exponential one (Example 2). Section 8, containing the conclusions and indications for further research, is followed by Appendix A and the list of references.
2. Auxiliary Results
Let $\mathbb{1}\{A\} := 1$ if A holds and zero otherwise. The choice $\mathcal{H} = \{h_z(x) := \mathbb{1}\{x \le z\},\ z \in \mathbb{R}\}$ in Equation (1) corresponds to the Kolmogorov distance. Note that h above is a function in x, whereas z is the index parameterizing the class.
A function $h:\mathbb{R}\to\mathbb{R}$ is called Lipschitz if
$$\mathrm{Lip}(h) := \sup_{x \ne y} \frac{|h(x) - h(y)|}{|x - y|} < \infty. \qquad (4)$$
Then,
$$|h(x) - h(y)| \le C\,|x - y|, \quad x, y \in \mathbb{R}, \qquad (5)$$
and in light of Equation (4), $\mathrm{Lip}(h)$ is the smallest possible constant C appearing in Equation (5). We write $h \in \mathrm{Lip}(1)$, where $\mathrm{Lip}(1)$ stands for the collection of Lipschitz functions h having $\mathrm{Lip}(h) \le 1$. For $s > 0$, set $m := \lceil s \rceil - 1$ (where, for $a \in \mathbb{R}$, $\lceil a \rceil$ stands for the minimal integer number which is equal to or greater than a). Introduce a class of functions
$$\mathcal{F}_s := \big\{h:\mathbb{R}\to\mathbb{R}\ \text{having derivatives up to order}\ m\ \text{such that}\ |h^{(m)}(x) - h^{(m)}(y)| \le |x - y|^{\,s-m},\ x, y \in \mathbb{R}\big\}.$$
As usual, $h^{(0)} := h$, $h^{(1)} := h'$, etc. We write $\zeta_s$ for a metric defined according to Equation (1) with $\mathcal{H} = \mathcal{F}_s$. V.M. Zolotarev and many other researchers defined an ideal metric of order s involving only bounded functions from $\mathcal{F}_s$. We will use the collections $\mathcal{F}_1$ and $\mathcal{F}_2$ without the assumption that the functions h are bounded on $\mathbb{R}$. This is the reason why our notation slightly differs from the classical one. Thus, we employ
$$\zeta_s(Y, Z) := \sup_{h \in \mathcal{F}_s} \big|\mathbb{E}h(Y) - \mathbb{E}h(Z)\big|, \quad s = 1, 2.$$
Note that in the definition of $\zeta_2$ we deal with $h \in C^1(\mathbb{R})$, where the space $C^1(\mathbb{R})$ consists of functions h such that $h'(x)$ exists for all $x \in \mathbb{R}$ and is continuous on $\mathbb{R}$ (evidently, a Lipschitz function is continuous). One calls $\zeta_1$ the Kantorovich metric (the term Wasserstein metric appears in the literature as well). One also uses the bounded Kantorovich metric when the class contains all the bounded functions from $\mathcal{F}_1$. The metric $\zeta_s$ was introduced in [36] and called an ideal metric in light of its important properties. The properties of the metrics $\zeta_s$, where $s > 0$, are collected in Sec. 2 of [32]. We mention in passing that various functionals are ubiquitous in assessing the proximity of distributions. In this regard, we refer, e.g., to [37,38].
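For empirical distributions of equal-size samples, the Kantorovich distance admits two elementary, equivalent computations, which the following sketch (our illustration, in plain Python with ad hoc data) compares: optimal matching of sorted samples and the area between the empirical distribution functions.

```python
def w1_sorted(xs, ys):
    """Kantorovich (Wasserstein-1) distance between empirical laws of
    equal-size samples via optimal matching of sorted values."""
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

def w1_cdf(xs, ys):
    """Same distance computed as the integral of |F_n - G_n| between
    the two empirical distribution functions."""
    pts = sorted(set(xs) | set(ys))
    n, total = len(xs), 0.0
    for left, right in zip(pts, pts[1:]):
        fx = sum(1 for v in xs if v <= left) / n
        fy = sum(1 for v in ys if v <= left) / n
        total += abs(fx - fy) * (right - left)
    return total

x = [0.0, 1.0, 2.0, 4.0]
y = [0.5, 1.5, 2.5, 4.5]
print(w1_sorted(x, y), w1_cdf(x, y))
```

Both evaluations agree, reflecting the dual representations of $\zeta_1$ as an optimal transport cost and as the $L^1$ distance between distribution functions.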
To apply the Stein method, we begin with fixing the target random variable Z (or its distribution) and describe a class $\mathcal{H}$ of functions used to assess the proximity of a random variable Y under consideration to Z. Then, the problem is to indicate an operator T (with a specified domain of definition) so that the Stein equation
$$Tf(x) = h(x) - \mathbb{E}h(Z) \qquad (6)$$
has a solution $f = f_h$ for each function $h \in \mathcal{H}$. After that, one can substitute Y instead of x in Equation (6) and take the expectation of both sides, assuming that all these expectations are finite. As a result, one comes to the relation
$$\mathbb{E}\,Tf_h(Y) = \mathbb{E}h(Y) - \mathbb{E}h(Z). \qquad (7)$$
It is not a priori clear why the estimation of the left-hand side of Equation (7) is more adequate than the direct estimation of $|\mathbb{E}h(Y) - \mathbb{E}h(Z)|$ for $h \in \mathcal{H}$. However, in many situations this occurs, justifying the method. The choice of T depends on the distribution of Z. Note that in certain cases (e.g., when Z follows the Poisson law), one considers functions f defined on a subset of $\mathbb{R}$. We emphasize that the construction of the operator T is a nontrivial problem, see, e.g., [33,39,40,41].
The basic idea here is the following. For many probability distributions (Gaussian, Laplace, exponential, etc.), one can find an operator T characterizing the law of a target variable Z. In other words, for a rather large class of functions f, $\mathbb{E}\,Tf(Y) = 0$ if and only if the laws of Y and Z coincide. Thus, if $\mathbb{E}\,Tf_h(Y)$ is small enough for the solutions $f_h$ corresponding to a suitable class of functions h, this leads to the assertion that the law of Y is close (in a sense) to the law of Z. One has to verify that this kind of “continuity” takes place. Clearly, if for any $h \in \mathcal{H}$, where $\mathcal{H}$ defines the integral probability metric in Equation (1), one can find a solution $f_h$ of Equation (6), then smallness of $\sup_{h \in \mathcal{H}} |\mathbb{E}\,Tf_h(Y)|$ yields smallness of the corresponding integral probability metric between Y and Z and, consequently, the proximity of their laws.
Further, we assume that $Z \sim \mathrm{Exp}(\lambda)$, i.e., Z has the exponential distribution with parameter $\lambda > 0$. In this case (see, e.g., Sec. 5 in [17]), one uses the operator
$$Tf(x) := f'(x) - \lambda f(x), \quad x \ge 0, \qquad (8)$$
and writes the Stein Equation (6) as follows
$$f'(x) - \lambda f(x) = h(x) - \mathbb{E}h(Z), \quad x \ge 0. \qquad (9)$$
It should be stipulated that, for a test function h, $\mathbb{E}h(Z)$ is finite and there exists a differentiable solution f of Equation (9). Therefore, if one can find such a solution f, then
$$\mathbb{E}f'(Y) - \lambda\,\mathbb{E}f(Y) = \mathbb{E}h(Y) - \mathbb{E}h(Z) \qquad (10)$$
under the hypothesis that all these expectations are finite. If f is absolutely continuous, then (see, e.g., Theorem 13.18 of [42]) for almost all x with respect to the Lebesgue measure there exists $f'(x)$. Moreover, one can find a function g, integrable on each finite interval, to guarantee, for each x, that
$$f(x) = f(0) + \int_0^x g(t)\,dt,$$
where $g = f'$ for almost all x. Thus, $Tf$ is defined for such f according to Equation (8) for almost all x. In general, for an arbitrary random variable Y, one cannot write $\mathbb{E}\,Tf(Y)$ since the value of this expectation depends on the choice of a version of $f'$. Indeed, let B be a Borel set such that $m(B) = 0$, where m stands for the Lebesgue measure. Assume that Y takes values in B. Then, it is clear that the expectation depends on the choice of the version of $f'$ defined on B. However, if the distribution of a random variable Y has a density with respect to m, then $\mathbb{E}\,Tf(Y)$ will be the same for any version of $f'$ (with respect to the Lebesgue measure). In certain cases, the Stein operator is applied to smoothed functions (see, e.g., [33,43]). Otherwise, Equation (6) does not hold at each point (see, e.g., Lemma 2.2 in [16]), and complementary efforts are needed. For our study, it is convenient to employ in Equation (8) for T, in the capacity of $f'$, the right derivative.

In many cases, for a real-valued function f defined on a fixed set D, one considers its “essential supremum”. Recall that a function $\widetilde{f}$ is a version of f (and vice versa) if the measure (here the Lebesgue measure) of the points x such that $f(x) \ne \widetilde{f}(x)$ is zero. The notation $\operatorname{ess\,sup}_{x \in D} f(x)$ means that one takes $\inf_{\widetilde{f}} \sup_{x \in D} \widetilde{f}(x)$, where $\widetilde{f}$ belongs to the class of all versions of f. Clearly, the essential supremum will be the same if we change f on a subset of D having measure zero. Thus, we write $\operatorname{ess\,sup}$ instead of $\sup$ appearing in Equation (11). The following simple observation is useful. Its proof is provided in Appendix A.
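The characterizing property behind the exponential Stein operator can be checked numerically: $\mathbb{E}[f'(Z) - \lambda f(Z)] = 0$ for $Z \sim \mathrm{Exp}(\lambda)$ and smooth f with $f(0) = 0$. The sketch below (our illustration; the operator form $Tf(x) = f'(x) - \lambda f(x)$ follows the standard references, and the test function $f = \sin$ is an arbitrary choice) evaluates both expectations by the trapezoidal rule.

```python
import math

def expect(g, lam, h=1e-3, x_max=30.0):
    """E g(Z) for Z ~ Exp(lam), computed by the trapezoidal rule
    against the density lam * exp(-lam * x) on [0, x_max]."""
    total, x = 0.0, 0.0
    while x < x_max:
        fa = g(x) * lam * math.exp(-lam * x)
        fb = g(x + h) * lam * math.exp(-lam * (x + h))
        total += 0.5 * (fa + fb) * h
        x += h
    return total

lam = 2.0
f, f_prime = math.sin, math.cos          # f(0) = 0, as the operator requires
lhs = expect(f_prime, lam) - lam * expect(f, lam)
print(lhs)  # close to 0
```

For general f one obtains $\mathbb{E}[f'(Z) - \lambda f(Z)] = -\lambda f(0)$, which is why solutions of the Stein equation are normalized at the origin.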
Lemma 1.
A function h is Lipschitz on $\mathbb{R}$ with $\mathrm{Lip}(h) = C$ if and only if h is absolutely continuous and $\operatorname{ess\,sup}_{x \in \mathbb{R}} |h'(x)| = C$.
Remark 1.
Note that , , for any . If, for some positive constant C, , then Equation (5) yields that . If is a Lipschitz function (with ), then exists for almost all and an application of Lemma 1 gives
Consequently, for some positive A, B (one can take , ) and any . As is continuous on each interval, it follows that for some positive and all (, , ). Therefore, for some positive and each .
Lemma 2.
For any h from the functional classes indicated below and each $\lambda > 0$, the equation
$$f'(x) - \lambda f(x) = h(x) - \mathbb{E}h(Z), \quad x \in \mathbb{R}, \qquad (12)$$
has a solution
$$f_h(x) = e^{\lambda x} \int_x^{\infty} e^{-\lambda t}\big(\mathbb{E}h(Z) - h(t)\big)\,dt, \quad x \in \mathbb{R}, \qquad (13)$$
where $Z \sim \mathrm{Exp}(\lambda)$. If , then for all there exists and . If , then is defined on and . For , a function is defined on and .
The right-hand side of Equation (13) is well defined for each $x \in \mathbb{R}$ in light of Remark 1. Lemma 4.1 of [33] contains, for a particular case, some of the statements of Lemma 2. We will use the above estimates for any $\lambda > 0$. The remaining estimates were not considered in [33]. The proof of Lemma 2 is given in Appendix A.
The following concept was introduced in [33].
Definition 1
([33]). Let X be a non-negative random variable with finite $\mathbb{E}X > 0$. One says that a random variable $X^e$ has the equilibrium distribution with respect to X if, for any Lipschitz function f,
$$\mathbb{E}f(X) - f(0) = \mathbb{E}X\ \mathbb{E}f'(X^e). \qquad (14)$$
Note that Definition 1 deals separately with the distributions of X and $X^e$. One says that $X^e$ is the result of the equilibrium transformation applied to X. The same terminology is used for the transition from the law of X to that of $X^e$. For the sake of completeness, we explain in Appendix A (Comments to Definition 1) why one can take the law of $X^e$ having a density with respect to the Lebesgue measure
$$p_{X^e}(x) = \frac{\mathbb{P}(X > x)}{\mathbb{E}X}, \quad x \ge 0, \qquad (15)$$
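The density in Equation (15) makes the exponential law a fixed point of the equilibrium transformation: for $X \sim \mathrm{Exp}(\lambda)$, one has $\mathbb{P}(X > x)/\mathbb{E}X = \lambda e^{-\lambda x}$, i.e., $X^e$ has the same distribution as X. A minimal sketch (our illustration):

```python
import math

def equilibrium_density(survival, mean):
    """Density of X^e per Equation (15): P(X > x) / E X, for x >= 0."""
    return lambda x: survival(x) / mean

lam = 3.0
surv = lambda x: math.exp(-lam * x)          # survival function of Exp(lam)
dens_e = equilibrium_density(surv, 1.0 / lam)

# The equilibrium density coincides with the original one, lam * exp(-lam x).
for x in (0.0, 0.5, 1.0, 2.0):
    assert abs(dens_e(x) - lam * math.exp(-lam * x)) < 1e-12
print("Exp(3) is a fixed point of the equilibrium transformation")
```

This fixed-point property is precisely what makes the transformation a natural companion of the exponential Stein operator.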
to guarantee the validity of Equation (14).
Remark 2.
For a non-negative random variable X with finite $\mathbb{E}X > 0$, one can construct a random variable $X^e$ having the density (15). Accordingly, we then have a random vector $(X, X^e)$ with specified marginal distributions. However, the joint law of X and $X^e$ is not fixed and can be chosen in an appropriate way. If $(X_n)_{n \in \mathbb{N}}$ is a sequence of independent random variables, we will assume that the sequence $\big((X_n, X_n^e)\big)_{n \in \mathbb{N}}$ consists of independent vectors, and these vectors are independent of all other random variables under consideration.
In the recent paper [22], a generalization of the equilibrium transformation of distributions was proposed without assuming that random variable X is non-negative.
Definition 2
([22]). Let X be a random variable having a distribution function $F(x) = \mathbb{P}(X \le x)$, $x \in \mathbb{R}$. Assume the existence of a finite $\mathbb{E}X \ne 0$. An equilibrium distribution function corresponding to X (or F) is introduced by way of
$$F^e(x) := \frac{1}{\mathbb{E}X} \int_{-\infty}^{x} \big(\mathbb{1}\{t \ge 0\} - F(t)\big)\,dt, \quad x \in \mathbb{R}, \qquad (16)$$
where $\mathbb{1}\{\cdot\}$ is the indicator function. This function can be written as $F^e = G_1 - G_2$, where
$$G_i(x) := \int_{-\infty}^{x} \psi_i(t)\,dt, \quad \psi_1 := \max\{\psi, 0\}, \quad \psi_2 := \max\{-\psi, 0\}, \quad \psi(t) := \frac{\mathbb{1}\{t \ge 0\} - F(t)}{\mathbb{E}X}; \qquad (17)$$
thus, $\psi$ is a density (with respect to the Lebesgue measure) of the signed measure corresponding to $F^e$. In other words, Equation (17) demonstrates the Jordan decomposition (see, e.g., Sec. 29 of [44]) of this signed measure.
Clearly, for a non-negative random variable, the functions defined in Equation (15) and Equation (16) coincide. For a nonpositive random variable, the function appearing in Equation (16) is a distribution function of a probability measure. In general, when X can take positive and negative values, the function introduced in Equation (16) is not a distribution function. We will call $F^e$ the generalized equilibrium distribution function. Note that $|F^e(x) - F^e(y)| \le |x - y|/|\mathbb{E}X|$. Thus, $F^e$ is a Lipschitz function and consequently continuous ($F^e$ is well defined for each x since $\mathbb{E}X$ is finite and nonzero). Moreover, $F^e$ is absolutely continuous, being a Lipschitz function. Each absolutely continuous function has bounded variation. If G is a function of bounded variation, then $G = G_1 - G_2$, where $G_1$ and $G_2$ are nondecreasing functions (see, e.g., [42], Theorem 12.18). One can employ the canonical choice in which $G_1(x)$ means the variation of G on $(-\infty, x]$ (if $G_1$ is chosen in this way, then $G_2 := G_1 - G$). If G is right-continuous (on $\mathbb{R}$), then evidently $G_1$ and $G_2$ are also right-continuous. Thus, for a right-continuous G having bounded variation, each nondecreasing function $G_i$ in its representation corresponds to a $\sigma$-finite measure $\mu_i$ on $\mathcal{B}(\mathbb{R})$, $i = 1, 2$. More precisely, there exists a unique $\sigma$-finite measure $\mu_i$ on $\mathcal{B}(\mathbb{R})$ such that, for each finite interval $(a, b]$, one has $\mu_i((a, b]) = G_i(b) - G_i(a)$. Recall that one writes $\int_{\mathbb{R}} f\,dG$ for the Lebesgue–Stieltjes integral with respect to a function G
$$\int_{\mathbb{R}} f\,dG := \int_{\mathbb{R}} f\,dG_1 - \int_{\mathbb{R}} f\,dG_2, \qquad (18)$$
whenever the integrals on the right-hand side exist (with values in $[-\infty, +\infty]$), and the cases $\infty - \infty$ and $-\infty + \infty$ are excluded. The integral $\int_{\mathbb{R}} f\,dG_i$ means the integration with respect to the measure $\mu_i$, $i = 1, 2$. The signed measure Q corresponding to G is $Q := \mu_1 - \mu_2$. Thus, $\int_{\mathbb{R}} f\,dG$ means the integration with respect to the signed measure Q. Note that if $G = G_1 - G_2 = \widetilde{G}_1 - \widetilde{G}_2$, where each $G_i$ and $\widetilde{G}_i$ is right-continuous and nondecreasing ($i = 1, 2$), then
$$\int_{\mathbb{R}} f\,dG_1 - \int_{\mathbb{R}} f\,dG_2 = \int_{\mathbb{R}} f\,d\widetilde{G}_1 - \int_{\mathbb{R}} f\,d\widetilde{G}_2. \qquad (19)$$
The left-hand side and the right-hand side of Equation (19) make sense simultaneously and, if so, are equal to each other. Indeed, for any finite interval $(a, b]$, the increments of $G_1 - G_2$ and $\widetilde{G}_1 - \widetilde{G}_2$ over $(a, b]$ coincide. Thus, the signed measures corresponding to $G_1 - G_2$ and $\widetilde{G}_1 - \widetilde{G}_2$ coincide on $\mathcal{B}(\mathbb{R})$. We mention in passing that one can also employ the Jordan decomposition of a signed measure.
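Equation (19) states that the Lebesgue–Stieltjes integral does not depend on the chosen decomposition of G into nondecreasing parts. The toy computation below (our illustration; G is taken absolutely continuous with a signed density, so the Stieltjes integrals reduce to ordinary ones) compares the canonical decomposition with one padded by a common nondecreasing component.

```python
def stieltjes(f, dens, a=-5.0, b=5.0, n=100000):
    """Approximate the integral of f with respect to a measure having
    density dens on [a, b], using the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * dens(a + (i + 0.5) * h)
               for i in range(n)) * h

f = lambda x: x * x
psi = lambda x: -x if abs(x) <= 1.0 else 0.0    # signed density of G

pos = lambda x: max(psi(x), 0.0)                # canonical nondecreasing parts
neg = lambda x: max(-psi(x), 0.0)
pad = lambda x: 1.0 if abs(x) <= 2.0 else 0.0   # common nondecreasing component

i1 = stieltjes(f, pos) - stieltjes(f, neg)
i2 = stieltjes(f, lambda x: pos(x) + pad(x)) - stieltjes(f, lambda x: neg(x) + pad(x))
print(i1, i2)
```

Both decompositions return the same value of $\int f\,dG$, as Equation (19) asserts; the padding contributions cancel exactly in the difference.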
For $F^e$ introduced in Equation (16), the analogue of Equation (14) has the form
$$\mathbb{E}f(X) - f(0) = \mathbb{E}X \int_{\mathbb{R}} f'(x)\,\psi(x)\,dx. \qquad (20)$$
Taking into account Equation (17), one can rewrite Equation (20) equivalently as follows
$$\mathbb{E}f(X) - f(0) = \int_{\mathbb{R}} f'(x)\big(\mathbb{1}\{x \ge 0\} - F(x)\big)\,dx. \qquad (21)$$
The right-hand side of the latter relation does not depend on the choice of a version of $f'$. Due to Theorem 1(d) of [22], Equation (20) is valid for any Lipschitz function f. Evidently, an arbitrary function $f \in \mathcal{F}_2$ need not be Lipschitz and vice versa.
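Assuming the generalized equilibrium identity reads $\mathbb{E}f(X) - f(0) = \int_{\mathbb{R}} f'(x)\big(\mathbb{1}\{x \ge 0\} - F(x)\big)\,dx$ (our reading of Equations (20) and (21)), it can be checked numerically for a summand taking values of both signs. Below, X is Uniform[-1, 2] and $f = \sin$, both illustrative choices on our part.

```python
import math

a, b = -1.0, 2.0                         # X ~ Uniform[-1, 2]; takes both signs
F = lambda x: min(max((x - a) / (b - a), 0.0), 1.0)   # distribution function
f, f_prime = math.sin, math.cos

n = 200000
h = (b - a) / n

# Left-hand side: E f(X) - f(0), midpoint rule against the uniform density.
lhs = sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a) - f(0.0)

# Right-hand side: integral of f'(x) (1{x >= 0} - F(x)) dx; the integrand
# vanishes outside [a, b], so integration over the support suffices.
rhs = 0.0
for i in range(n):
    x = a + (i + 0.5) * h
    rhs += f_prime(x) * ((1.0 if x >= 0.0 else 0.0) - F(x))
rhs *= h

print(lhs, rhs)
```

The two sides agree up to discretization error, illustrating how the signed density $\mathbb{1}\{x \ge 0\} - F(x)$ replaces the survival function of Equation (15) for summands that are not non-negative.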
Lemma 3.
Let X be a random variable such that $\mathbb{E}X^2 < \infty$ and $\mathbb{E}X \ne 0$. Then, Equation (20) is satisfied for all $f \in \mathcal{F}_2$.
The proof is provided in Appendix A.
3. Limit Theorem for Geometric Sums of Independent Random Variables
Consider , see Equation (2). In other words, has a geometric distribution with parameter p. Let be a sequence of independent random variables such that , where , , . Assume that and are independent. Consider a normalized geometric sum
introduced in Equation (3). Since can take zero value, set, as usual, . One can see that can be viewed as a random sum normalized by .
Lemma 4.
Let and , where , be random variables described above in this Section. Then, the following relations hold:
Proof.
Recall that
Thus, one has
Clearly, since is finite ). Therefore
Set , . One has
According to Equations (24) and (25) one derives the formula
Convergence of the series having non-negative terms holds simultaneously with the validity of inequality . Changing the order of summation, we obtain
The latter formula and Equations (26), (27) yield
Equation (23) is established. □
The proof of Theorem 3.1 in [45] shows for non-negative i.i.d. random variables (when , see Formula (3.15) in [45]) that the equilibrium transformation of distribution has the following form:
where means that we construct and then take a random index . In other words,
It was explained in Section 2 that a generalized equilibrium distribution function (see Definition 2) need not be a distribution function when the summands can take values of different signs. However, employing this function, one can establish the following result.
Theorem 1.
Let be a sequence of independent random variables having finite , where , . Assume that and are independent, where , . If , then
where was introduced in Equation (22).
Proof.
If , then since, for a function , , belonging to , one has , whereas . According to Equation (23), and are both finite or infinite simultaneously. Consequently, Equation (29) is true when .
Let us turn to the case . At first, we obtain an upper bound for . Take . Applying Lemmas 1 and 2 and Remark 1, one can write due to Stein’s Equation (10) that
Using the generalized equilibrium distribution transformation (20) one obtains:
Due to Lemma 3 this is true, for , because according to Lemma 2 (with ). Next, we employ the relation
Evidently, one can write . The notation in the integral refers to the Lebesgue–Stieltjes integral with respect to a function of bounded variation. In fact, the integral with integrator means that integration employs a signed measure , where and have the following densities with respect to the Lebesgue measure:
we took into account that according to Lemma 4. Then, for any , one ascertains that variation of on is given by formula (see, e.g., Theorem 4.4.7 [46]). Note that for any ,
according to Lemma 4. Thus, is a function of bounded variation. In the right-hand side of Equation (32), we take the Lebesgue–Stieltjes integral with respect to the function of bounded variation , . Let , , where are nondecreasing right-continuous functions (even continuous since is continuous), . Thus,
With the help of Equations (18) and (19) one makes sure that, for each ,
All the integrals in the latter formulas are finite. According to Lemma 2 and Remark 1, one can write , where are positive constants. Thus, the Lebesgue theorem on dominated convergence ensures that
where the latter integral is finite. Indeed,
according to Lemma 4. By the same Lemma, one has . Therefore, on account of Equation (17), the following relation holds:
whereas Corollary 2, Sec. 6, Ch. 2 of [47] and Lemma 4 entail that
The Lebesgue theorem on dominated convergence for -finite measures and Equation (34) yield
where the latter integral is finite. Now, we show that
Note that at each as . To apply the version of the Lebesgue theorem to integrals over a signed measure, it suffices (see, e.g., [48], p. 74) to verify that
where means that one evaluates an integral with respect to the measure corresponding to the total variation of a measure determined by a right-continuous function G of bounded variation. The extension of the Lebesgue theorem on dominated convergence for signed measures is an immediate corollary of the Jordan decomposition mentioned above. Using this decomposition, one obtains the inequality
Due to Remark 1 one has for all and some positive constants . Then, Equations (33) and (34) yield (as generates probability measure)
The functions and are right-continuous and have bounded variation. Then each of them can be represented as the difference of right-continuous nondecreasing functions, and using for any the integration by parts formula (see, e.g., Theorem 11, Sec. 6, Ch. 2, [47]), one has
Since the integral in the right-hand side of Equation (35) is finite, it holds
(the proof is similar to the proof of Corollary 2, Sec. 6, Ch. 2 in [47]). Then,
The function is absolutely continuous according to Lemma 2. Hence (see also Equations (36) and (A12) in Appendix A) we get
because due to Lemmas 1 and 2. Using the homogeneity of the Kantorovich metric for signed measures, which is derived from formula (20) of [22] (see Lemma 1 (a) there), and applying Lemma 3 of that paper, we can write
Relations (30), (31), (32), (37), (38) and Lemmas 1 and 2 guarantee that does not exceed the right-hand side of Equation (29).
Remark 3.
Evidently,
Thus, one obtains
and the latter inequality becomes an equality when for all . Therefore, the statement of Theorem 1 can be written as follows
and this becomes an equality when for all .
Remark 4.
In [22], the authors proved the following inequality
We established the sharp estimate with a factor instead of , having employed Equation (20) for a class of functions comprising solutions of the Stein equation for . The estimate with the factor was also obtained in the recent paper [49], but for i.i.d. summands. Lower bounds were not provided there. In our Theorem 1, the summands have the same expectations but need not have the same distribution.
Remark 5.
If the summands of are non-negative, we consider appearing in Equation (28). Applying Theorem 1(i) [22] to relation (29), one obtains
For , consider a random variable having distribution . Then , and, consequently, . We can choose , , according to Remark 2. Then, the distribution of will be the same if we change to in Equation (28). In such a way, is a normalized sum of a random number of independent random variables. Using the homogeneity of the Kantorovich metric, one has
Therefore, for an arbitrary sequence satisfying conditions of Theorem 1, the upper bound for the left-hand side of Equation (40) is not less than the right-hand side of Equation (40).
4. Limit Theorem for Geometric Sums of Exchangeable Random Variables
Now, we consider exchangeable random variables satisfying the dependence condition proposed in [35]. Namely, assume that for all , () and some
where . The cases of and correspond, respectively, to independent random variables and those possessing the property of comonotonicity. The latter means that for the joint behavior of is strongly correlated and coincides with that of the vector .
Theorem 2.
Let be exchangeable random variables with , satisfying condition (41) for some . Suppose that and are independent, where , . In contrast to the Rényi theorem, one has
where the law of Y is the following mixture
random variables are independent and , .
Proof.
Let be independent copies of , respectively. Suppose that are independent with . Set , , , . Denote the characteristic function of a random variable by , . For each , using Equation (41), one has
For each , one has
where .
According to the classical Rényi theorem, as , where . Note that as , where . In fact, one can apply Theorem 1 with , to check this. For each , taking into account that and are independent and applying the Lebesgue theorem on dominated convergence, we see that
since and V are independent. Hence,
is true. In light of Equation (43),
here the law of Y is the mixture of distributions and Z provided by Equation (42). The proof is complete. □
Theorem 3.
Assume that and satisfy conditions of Theorem 2. Let . Then,
Proof.
Relation (43) for characteristic functions implies that the following equality of distributions holds
where indicator equals 1 and 0 with probabilities and , respectively, and is independent of all the variables under consideration. Assume at first that . Then, for ,
In view of Equation (42) one has
The latter two formulas and the triangle inequality yield
By means of Theorem 1 we have
For each , taking into account the independence of , , V, one can write
Due to homogeneity of we infer from Theorem 1 that
Consequently, it holds
Equations (46), (47) and (48) lead to the upper bound for .
Note that a function , , belongs to and therefore
Note that because and . The random variables are independent. Thus, in light of Equation (42), one has
By means of Equations (45), (23) and (25) we obtain
Equations (50) and (51) permit us to find . Hence, Equation (49) leads to the inequality
Now, let . Then, according to Equation (52). The proof is complete. □
5. Convergence of Random Sums of Independent Summands to Generalized Gamma Distribution
Statements concerning weak convergence of geometric sums distributions to exponential law are often just particular cases of more general results concerning the convergence of random sums of random summands to generalized gamma law when the number of summands follows the generalized negative binomial distribution (see, e.g., [27,29,49]). The recent work [29] demonstrated how the mentioned general case can be studied by employing the estimates of the proximity of geometric sums distributions to exponential law. We introduce some notation in order to apply Theorem 1 to the analysis of the distance between the distributions of random sums and the generalized gamma law.
Introduce a random variable such that , where is the gamma law with positive parameters r and , i.e., its density with respect to the Lebesgue measure has the form
being the gamma function. For , one has . Clearly, for , . Set , where . One says that random variable has the generalized gamma distribution . According to Equation (5) of [29], the density of is given by formula
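As a hedged numerical illustration (the parameterization below, with shape r, power gamma and rate lambda, is our assumption made for concreteness), the generalized gamma law can be realized as a power of a gamma random variable: if xi follows the gamma law with density lambda^r t^(r-1) e^(-lambda t)/Gamma(r), then xi^(1/gamma) has the corresponding generalized gamma density. A change-of-variables check:

```python
import math

def gamma_pdf(t, r, lam):
    # Gamma density with shape r and rate lam (assumed parameterization)
    return lam**r * t**(r - 1) * math.exp(-lam * t) / math.gamma(r)

def gg_pdf(x, r, gamma_, lam):
    # Generalized gamma density for gamma_ > 0 (assumed parameterization)
    return gamma_ * lam**r * x**(gamma_ * r - 1) * math.exp(-lam * x**gamma_) / math.gamma(r)

def gg_pdf_via_power(x, r, gamma_, lam):
    # Density of xi**(1/gamma_) obtained by the change of variables xi = x**gamma_
    return gamma_pdf(x**gamma_, r, lam) * gamma_ * x**(gamma_ - 1)

r, gamma_, lam = 2.5, 1.7, 0.8
for x in (0.3, 1.0, 2.4):
    assert abs(gg_pdf(x, r, gamma_, lam) - gg_pdf_via_power(x, r, gamma_, lam)) < 1e-12
```

The two expressions agree identically, which is just the change-of-variables formula written out term by term.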
Also it is known (see Equation (6) in [29]) that, for , and , the following relation holds
where q is the density of a specified random variable such that the support of its distribution belongs to (see Remark 3 of [49]). We only note that for the density q admits a representation
Consider a random variable having the generalized negative binomial distribution , where , and , i.e.,
Thus has a mixed Poisson distribution. One can verify that coincides with , where is the negative binomial law. Recall that if
Note also that .
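The classical representation of the negative binomial law as a Poisson distribution mixed over a gamma law can be verified numerically. In the sketch below, the gamma mixing law with shape r and rate p/(1-p) is our assumed parameterization, and the mixing integral is approximated by the trapezoidal rule:

```python
import math

def nb_pmf(k, r, p):
    # Negative binomial: P(N = k) = Gamma(k+r)/(k! Gamma(r)) * p**r * (1-p)**k
    return math.gamma(k + r) / (math.factorial(k) * math.gamma(r)) * p**r * (1 - p)**k

def mixed_poisson_pmf(k, r, p, n_steps=50000, upper=60.0):
    # Poisson pmf mixed over Lambda ~ Gamma(shape r, rate p/(1-p)),
    # integrated numerically (an illustrative sketch, not an exact evaluation)
    beta = p / (1 - p)
    h = upper / n_steps
    total = 0.0
    for i in range(1, n_steps):
        lam = i * h
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        gamma_density = beta**r * lam**(r - 1) * math.exp(-beta * lam) / math.gamma(r)
        total += poisson * gamma_density
    return total * h

r, p = 2.0, 0.4
for k in range(5):
    assert abs(nb_pmf(k, r, p) - mixed_poisson_pmf(k, r, p)) < 1e-4
```

The agreement up to numerical integration error illustrates why the generalized negative binomial index leads naturally to gamma-type mixtures in the limit laws.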
Introduce the random variables
where , , , , and , , . We assume that and are independent, where , , .
Theorem 4.
Proof.
Without loss of generality, we can assume that ; otherwise, we consider , . For such a sequence, . Note that has the same distribution as . Applying the homogeneity property of the ideal probability metric of order two, one has
The proof of Theorem 1 [29] starts with establishing for any bounded Borel function h, , and , that
where , and
Let us examine these relations for each . Recall that in light of Remark 1 for some positive constants and (which depend on h), we write , where , . Set , . Then, and are bounded Borel functions such that for each , as . Hence, the monotone convergence theorem yields
Note that, for each , . Applying the monotone convergence theorem once again, we obtain
So, Equation (57) is valid if instead of h belonging to we write . Obviously, , , . Thus,
According to [27] (page 8), for , one has
This permits us to write .
In the same manner, we demonstrate that Equation (57) is valid if instead of we take . Moreover, is finite. Therefore, Equation (57) holds for any , and for such h, is finite.
By the monotone convergence theorem . In a similar way, as , and applying this theorem once again, we obtain
Taking into account that Equation (58) is valid for bounded Borel functions , one ascertains that Equation (58) holds if we replace h by . To show the latter integral is finite, we note that , for some positive and all . Formula (23) of Lemma 4 yields, for each ,
It was assumed above that the right-hand side of Equation (56) is finite. So,
since in light of Equation (57), taking and (these functions belong to ), , we obtain, respectively,
We demonstrate analogously that Equation (58) holds upon replacing with and if the right-hand side of Equation (56) is finite, it follows that
is finite as well. Consequently, Equation (58) is established for each (whenever the right-hand side of Equation (56) is finite) and is finite for such h. Therefore, for and fixed , one has
By Theorem 1, for , it holds
where we take into account that , and coincides with . Thus, can be written as
where , and are independent.
Therefore, for each , is bounded by the right-hand side of Equation (56), and so the desired upper bound is obtained (recall that ).
Corollary 1.
Let conditions of Theorem 4 be satisfied and also . Then, the right-hand side of Equation (56) is finite and
The inequality becomes an equality if for all . In particular, if then .
Proof.
According to Equation (57), for , ,
Thus, the following relation is valid.
Due to [27] (see page 8 there), for , one has . Therefore,
For , we obtain . □
6. Convergence of Random Sums of Exchangeable Summands to Generalized Gamma Distribution
Consider the model of exchangeable random variables described in Section 4. Introduce the distribution of a random variable as the following mixture
where , , , , , random variables are independent, , . Since (see, e.g., page 8 [27]), one has
Due to the properties of generalized gamma distributions, for any positive number c,
where indicator equals 1 and 0 with probabilities and , respectively, and is independent of all the variables under consideration. Note that has the same distribution as a random variable Y, having the law defined in Equation (42). Recall that the generalized negative binomial distribution is the law of a random variable , see Equation (54). We will use the following result.
Lemma 5.
If , , , then for one has
Proof.
According to Equation (54), for each ,
The desired statement follows from the monotone convergence theorem for the Lebesgue integral by letting . □
Theorem 5.
Proof.
Without loss of generality, we can assume that ; otherwise, we consider , . For such a sequence, . Note that Equation (58) is true for dependent summands (see Theorem 1 of [29]). Furthermore, for bounded , , the function is also bounded for any . Thus, an application of Equation (63) gives
Now we apply Equation (57) with bounded and by Fubini’s theorem obtain:
where and are independent and . Apply Equation (57) to the second summand of Equation (68). Then, Equation (69) yields
where and have the same distribution as Y, see Equation (42).
Recall that, for , an inequality holds for all and some positive constants , (see Remark 1). Moreover, according to Equation (64). So, employing bounded tending to as , one can invoke the Lebesgue dominated convergence theorem to claim that . We take into account that
The integral in the right-hand side of the latter formula is finite by Equation (60) and in accord with Equation (64). Thus, it is possible to apply the Lebesgue dominated convergence theorem to obtain
for any . So, Equation (70) holds for all .
In a similar way, for . According to the Cauchy–Bunyakovsky–Schwarz inequality for identically distributed variables we have for and consequently
Equations (59) and (66) entail that . Thus, the dominated convergence theorem guarantees that . Furthermore, one can demonstrate that, for each ,
For this purpose we note that Equation (71) implies
According to Equation (66) one has
The latter integral is finite because one can take and in Equation (57) and invoke Equation (59). Then, it is possible to use the dominated convergence theorem once again to establish Equation (72).
Now, combining Equation (58) and Equation (70) leads for any to the relation
Note that a random variable follows the geometric distribution with parameter . For each and any , by Theorem 3 and in view of homogeneity, we obtain
Employing Equations (73), (74) and (62) one deduces
Equation (65) implies by virtue of homogeneity that
Combining Equations (59), (75) and (76) we conclude that the right-hand side of Equation (67) is an upper bound for .
7. Inverse to Equilibrium Transformation
The development of Stein’s method is closely connected with various transformations of distributions. Let a random variable and . Then, one says that a random variable has the W-size biased distribution if for all f such that exists
The connection of this transformation with Stein’s equation was considered in [50,51]. It was pointed out in [51] that this transformation works well for combinatorial problems, such as counting the number of vertices in a random graph having prespecified degrees, see also [52]. In [53], another transformation was introduced. Namely, if a random variable W has mean zero and variance , then the authors of [53] write (Definition 1.1) that a variable has W-zero biased distribution whenever, for all differentiable f such that exists, the following relation holds
This definition is inspired by an equation characterizing the normal law . The authors of [53] explain that always exists if and . Zero-bias coupling for products of normal random variables is treated in [54]. In Sec. 2 of [30], it is demonstrated that the gamma distribution is uniquely characterized by the property that its size-biased distribution is the same as its zero-biased distribution. Two generalizations of zero biasing were proposed in [55]; see p. 104 of that paper for a discussion of these transformations. We also refer to the survey [56].
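Size biasing is easily illustrated on a concrete law. It is a classical fact that the size-biased Poisson distribution coincides with the law of 1 plus a Poisson variable, which follows directly from the defining relation P(W^s = k) = k P(W = k)/E W. A minimal check:

```python
import math

def poisson_pmf(k, lam):
    # Poisson probabilities P(W = k)
    return math.exp(-lam) * lam**k / math.factorial(k)

def size_biased_pmf(k, lam):
    # Size biasing: P(W^s = k) = k * P(W = k) / E W, and E W = lam for Poisson
    return k * poisson_pmf(k, lam) / lam

lam = 2.3
for k in range(1, 10):
    # the size-biased Poisson(lam) law coincides with that of 1 + Poisson(lam)
    assert abs(size_biased_pmf(k, lam) - poisson_pmf(k - 1, lam)) < 1e-12
```

Algebraically, k e^(-lam) lam^k / (k! lam) = e^(-lam) lam^(k-1)/(k-1)!, so the identity is exact.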
Now, we turn to the equilibrium distribution transformation introduced in [33] and concentrate on approximation of the law under consideration by means of an exponential law, see the corresponding Definition 1 in Section 2.
According to the second part of Theorem 2.1 of [33] (in our notation), for and a non-negative random variable X with and , the following estimate holds
and at the same time
The authors of [33] also proved that . Notice that the estimate for is more precise than that for .
Now we turn to Equation (77) and demonstrate how to find the distribution of X when we know the distribution of . In other words, we concentrate on the inverse of an equilibrium distribution transformation.
Assume that . Recall that a random variable exists if appearing in Equation (16) is a distribution function. The latter statement for is equivalent to nonnegativity of X. Indeed, for non-negative X, coincides with a distribution function having a density (15). If is a distribution function and in Equation (16), then for only if for .
Thus, a random variable has a (version of the) density introduced in Equation (15). Obviously, the function has the following properties. It is nonincreasing on and for . This density is right-continuous on and, consequently, . Now, we can provide a full description of the class of densities of random variables corresponding to all non-negative X with positive mean.
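A minimal numerical sketch of these properties, assuming (as for non-negative X) that the density in Equation (15) has the form P(X > x)/E X: the exponential law is the well-known fixed point of the equilibrium transformation.

```python
import math

def exp_sf(x, lam):
    # survival function of the exponential law with rate lam
    return math.exp(-lam * x)

def equilibrium_density(x, sf, mean):
    # equilibrium density: p_e(x) = P(X > x) / E X (assumed form of Equation (15))
    return sf(x) / mean

lam = 1.7
for x in (0.0, 0.5, 2.0):
    # the exponential law is a fixed point: its equilibrium density is its own density
    assert abs(equilibrium_density(x, lambda t: exp_sf(t, lam), 1.0 / lam)
               - lam * math.exp(-lam * x)) < 1e-14
```

This fixed-point property is exactly what makes the equilibrium transformation a natural tool for exponential approximation.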
Lemma 6.
Let a non-negative random variable have a version of the density (with respect to the Lebesgue measure), , such that this function is nonincreasing on , for , and there is a finite . Then, there exists a unique preimage of the distribution having distribution function F continuous at . Namely,
Proof.
First of all, note that since otherwise for all ( is a nonincreasing function on ). We also know that there exist a left-sided limit and a right-sided limit of at each point , as well as the right-sided limit of at . The set of discontinuity points of is at most countable, and we can take a version which is right-continuous at each point of . Then, Equation (78) introduces a distribution function. Consider a random variable X with distribution function F and check the validity of Equation (14).
The integration by parts formula yields, for any ,
The summands on the right-hand side of Equation (79) are non-negative. Therefore, for any , . Hence, the monotone convergence theorem implies that is finite. According to Equation (78)
since . Taking the limit in Equation (79) as , one obtains . Now, we are ready to verify Equation (14). For any Lipschitz function f, is finite and
Taking into account Equation (80), we infer that as . Consequently, applying integration by parts once again (f has bounded variation), we obtain
Uniqueness of X distribution corresponding to is a consequence of Equation (15) and continuity of at . Indeed, assume that for and one has . Then, Equation (15) yields that for almost all ,
and therefore , where c is a positive constant (the equilibrium distribution in Definition 1 is introduced for random variables with positive expectation only). Since , one has . Let , , where the points belong to the set considered in Equation (81) to ensure that . Thus, distributions of and coincide. □
Remark 6.
Let be the Bernoulli random variable taking values 1 and 0 with probabilities p and , respectively. Then, it is easily seen that the distribution of is uniform on . Thus, in contrast to Lemma 6, without the assumption of continuity of F at the point , one cannot guarantee, in general, the uniqueness of the preimage under the inverse of the equilibrium transformation.
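The Bernoulli example of this remark can be checked directly from the form P(X > x)/E X of the equilibrium density (our reading of Equation (15)):

```python
def bernoulli_sf(x, p):
    # survival function of Bernoulli(p): P(X > x) = p for 0 <= x < 1, else 0
    return p if 0.0 <= x < 1.0 else 0.0

p = 0.3
mean = p  # E X = p for the Bernoulli law
for x in (0.0, 0.25, 0.5, 0.99):
    # equilibrium density P(X > x)/E X equals 1 on [0, 1): the uniform law
    assert bernoulli_sf(x, p) / mean == 1.0
assert bernoulli_sf(1.5, p) / mean == 0.0
```

Since the resulting density does not depend on p, every Bernoulli law is mapped to the same uniform distribution, which is precisely the non-uniqueness phenomenon described above.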
In the proof of Lemma 6, we found that . Set , . Then, . In what follows, we suppose that this choice of is made.
Recall that random variables U and V are stochastically ordered if either , for every , or the opposite inequality holds (for all ). Now, we clarify one of the statements of Theorem 2.1 of [33] (see also Theorem 3 of [22], where a result similar to Theorem 2.1 of [33] is formulated employing the generalized distributions).
Theorem 6.
Let a random variable satisfy the conditions of Lemma 6, let , and let X be a preimage of the equilibrium transformation. Then, Equation (77) holds. Moreover, the inequality becomes an equality when X and are stochastically ordered.
Proof.
Apply the Stein Equation (10) along with equilibrium transformation (14). Then, in light of and , we can write
The last inequality in (82) is true due to Lemma 2. Now, we demonstrate that equality in (82) can be attained. Taking , we have a solution of Equation (12). Then,
Employing the integration by parts formula, one can show that the expression in the right-hand side of the last equality is equal to the Kantorovich distance between X and when these variables are stochastically ordered. Note that , as and , as because and are finite. Thus,
since (or ≤) for all . It is well-known that the Kantorovich distance is the minimal one for the metric (see, e.g., [9], Ch. 1, §1.3). Therefore,
where the infimum is taken over all joint laws such that and (see also Remark 2 and [10], Corollary 5.3.2). Consequently, in the framework of Theorem 6, . □
Remark 7.
By means of Lemma 2 and Equation (82), one can provide the estimate
For each function h belonging to , in a similar way to Equation (82), one can apply Equation (10) together with equilibrium transformation. Now, it is sufficient to study the Stein equation with right derivative. Formula (13) gives a solution of the Stein equation according to Lemma 2. Note that for , the right derivative coincides almost everywhere with the derivative, and the law of is absolutely continuous according to Equation (15). Thus, for the Lipschitz function (see Lemma 2), one can use an equilibrium transformation.
Example 1.
Consider the distribution functions of random variables , taking values and with probabilities , . Formula (15) yields that has the following piecewise-linear structure
If then, for all , the following inequality holds: , i.e., and are stochastically ordered. We see that for , the inequality is violated in the right neighborhood of the point . Thus, besides the stochastically ordered pairs (X, ), there are also pairs of a different kind.
Now, we turn to another example of stochastically ordered X and .
Example 2.
Take having the Pareto distribution. The notation means that has a density () and the corresponding distribution function , where , .
Further, we consider only , since in this case there exists finite . By means of Lemma 6, we obtain the distribution of the preimage of the equilibrium transformation
Thus one can state that . It is not difficult to see that for , i.e., the random variables and X are stochastically ordered. Due to Theorem 6, one has
In such a way we find the bound for the Kolmogorov distance between the distributions and . This relation demonstrates the convergence rate of to zero as . The estimate is nontrivial for .
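The round trip behind this example can be verified numerically. The sketch below assumes the Lomax form of the Pareto density and uses the inversion implicit in Lemma 6: the survival function of the preimage X is p_e(x)/p_e(0) and E X = 1/p_e(0), so the equilibrium density of X must recover p_e.

```python
def lomax_pdf(x, a, s):
    # Pareto (Lomax) density with tail index a and scale s (assumed form)
    return a * s**a / (x + s)**(a + 1)

a, s = 3.2, 1.5          # tail index of the equilibrium law, scale
pe0 = lomax_pdf(0.0, a, s)

def preimage_sf(x):
    # survival function of the preimage X: P(X > x) = p_e(x) / p_e(0)
    return lomax_pdf(x, a, s) / pe0

mean_x = s / a           # E X = 1 / p_e(0), computable in closed form here
for x in (0.0, 0.7, 3.0):
    # round trip: the equilibrium density of X recovers the original p_e
    assert abs(preimage_sf(x) / mean_x - lomax_pdf(x, a, s)) < 1e-12
```

Note that the preimage survival function (s/(x+s))^(a+1) is again of Pareto type with the tail index increased by one, in line with the stochastic ordering observed in this example.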
Remark 8.
It is interesting that estimation of the proximity of the Pareto law to the Exponential one became important in signal processing, see [34] and references therein. Let , where , , and . In [34], the author indicates that the Pinsker–Csiszár inequality was employed to derive
where is the Kullback–Leibler divergence between laws of X and Z. More precisely, in the left-hand side of Equation (85) one can write the total variation distance between distributions of X and Z. Clearly, . By evaluating and performing an optimal choice of parameter , it was demonstrated (formula (19) in [34]) that, for and any ,
if . The author of [34] on page 8 writes that in his previous work [57] the inequality
was established with the same choice of . Next, he also writes that “in the most cases ” and notes that the estimate in Equation (86) involving the Kullback–Leibler divergence is more precise for than the estimate in Equation (87) obtained by the Stein method. Moreover, on page 4 of [34] we read: “The problem with the Stein approach is that the bounds do not suggest a suitable way in which, for a given Pareto model, an appropriate approximating Exponential distribution can be specified”. However, we have demonstrated that application of the inverse equilibrium transformation together with the Stein method permits indicating, whenever , the corresponding Exponential distribution with proximity closer than the right-hand sides of Equation (86) and Equation (87) can provide.
8. Conclusions
Our principal goal was to find the sharp estimates of the proximity of random sums distributions to exponential and more general laws. This goal is achieved when we employ the probability metric . Thus, it would be valuable to find the best possible approximations of random sums distributions by means of specified laws using the metrics of order . The results of [32] provide the basis for this approach.
There are various complementary refinements of the Rényi theorem. One approach is related to the employment of Brownian motion. It is interesting that in [58] (p. 1071) the authors proposed an explanation of the Rényi theorem involving the embedding theorem. We provide a slightly different complete proof. Let be i.i.d. random variables with mean and , whereas denote the corresponding partial sums. According to Theorem 12.6 of [59], which is due to A.V. Skorokhod and V. Strassen, there exists a standard Brownian motion (perhaps it is defined on an extension of the initial probability space) such that
and
where stands for convergence in probability, and a.s. means almost surely. Thus, in light of Equation (89), we can write, for ,
where and a.s. when . Substitute (see Equation (2)) in Equation (90) instead of t. It is easily seen that (i.e., for each , one has as ) and by means of characteristic functions one can verify that as , where . Therefore, , . In the proof of Lemma 4, we showed (Equation (24)) that . Consequently,
Hence, as . Now, we demonstrate that . For any and any ,
In light of Equation (88), for arbitrary and , one can take such that . Then, for any , we obtain
Since , we can find such that if . Therefore, as . The Slutsky lemma yields the desired relation
which implies Equation (3). However, it seems that there is no clear intuitive reason why the law of the random sum converges to an exponential in the Rényi theorem. Moreover, in Ch. 3, Sec. 2 “The Rényi Limit Theorem” of [20] (see Sec. 2.1 “Motivation”), one can find examples demonstrating that intuition behind the Rényi theorem is poor.
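One special case is nevertheless transparent: for i.i.d. Exp(1) summands, the normalized geometric sum is exactly exponential, which can be checked on Laplace transforms. In the sketch below, the geometric index is supported on {1, 2, ...} with success probability p (our convention):

```python
def exp_lt(s):
    # Laplace transform of Exp(1): E e^{-s X} = 1 / (1 + s)
    return 1.0 / (1.0 + s)

def geometric_sum_lt(s, p):
    # Laplace transform of S = X_1 + ... + X_{N_p}, with N_p geometric on {1, 2, ...}
    phi = exp_lt(s)
    return p * phi / (1.0 - (1.0 - p) * phi)

for p in (0.5, 0.1, 0.01):
    for s in (0.2, 1.0, 3.7):
        # the normalized sum p * S has the transform of Exp(1) exactly, for every p
        assert abs(geometric_sum_lt(p * s, p) - exp_lt(s)) < 1e-12
```

Indeed, p (1/(1+ps)) / (1 - (1-p)/(1+ps)) = p/(p + ps) = 1/(1+s), so in this special case the limit law is attained exactly for every p, not only asymptotically.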
Actually, relation (90) leads to refinements of Equation (3). In [58], it is proved that if has finite exponential moments and other specified conditions are satisfied, then there exists a more sophisticated approximation for the distribution of , and its accuracy is estimated. The results are applied to the study of the queue for both light-tailed and heavy-tailed service time distributions. Note that in [58], Section 5, the authors study the model where the distribution of can depend on p. For future research, it would be desirable to establish analogues of our theorems for such a model.
The results concerning the accuracy of approximating a distribution under consideration by an exponential law are applicable to some queueing models. Let, for a queue , the inter-arrival times follow the distribution and S stand for the general service time. Introduce the stationary waiting time W and define to be its load. Due to [60], if then as , where . Theorem 3.1 of [45] contains an upper bound of , where . This estimate is used by the authors for the analysis of queueing systems with a single server. It would be interesting to obtain the sharp approximations in the framework of queueing systems.
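The heavy-traffic exponential approximation can be illustrated on the M/M/1 queue, where the stationary waiting time is known in closed form: P(W > x) = rho e^{-mu(1-rho)x}. The sketch below (a grid-based supremum, hence approximate) shows the Kolmogorov gap between the normalized waiting time and Exp(1) shrinking as the load rho approaches 1:

```python
import math

def mm1_wait_sf(x, rho, mu=1.0):
    # stationary waiting-time tail in M/M/1: P(W > x) = rho * exp(-mu*(1-rho)*x)
    return rho * math.exp(-mu * (1.0 - rho) * x)

def kolmogorov_gap(rho, mu=1.0):
    # sup_x |P(W / E W > x) - exp(-x)| evaluated on a grid; E W = rho / (mu*(1-rho))
    ew = rho / (mu * (1.0 - rho))
    return max(abs(mm1_wait_sf(x * ew, rho, mu) - math.exp(-x))
               for x in [0.01 * i for i in range(1, 1000)])

# the gap shrinks as the load rho approaches 1 (heavy traffic)
gaps = [kolmogorov_gap(rho) for rho in (0.8, 0.9, 0.99)]
assert gaps[0] > gaps[1] > gaps[2]
assert gaps[2] < 0.02
```

Here the normalized tail equals rho e^{-rho x}, so the gap behaves like 1 - rho, which matches the first-order heavy-traffic picture; sharp queueing estimates of the kind discussed above quantify such rates in far greater generality.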
For the model of exchangeable random variables, Theorem 2 in Section 4 ensures the weak convergence of the distributions under consideration to a specified mixture of explicitly indicated laws. Theorem 3 provides the sharp convergence rate estimate to this limit law by means of the ideal probability metric of the second order. It would be worthwhile to establish such an estimate of the distributions' proximity applying the Lévy–Prokhorov distance because convergence in this metric is equivalent to the weak convergence of distributions of random variables. Moreover, at present there is no unified theory of probability metrics. In this regard, one can mention Proposition 1.2 of [17] stating that if a random variable Z has a Lebesgue density bounded by C, then, for any random variable Y,
However, this estimate gives only sub-optimal convergence rates. We also highlight the important total variation distance . The authors of [61] study the sum , where is a family of locally dependent non-negative integer-valued random variables. Using the perturbations of Stein’s operator, they establish the upper bounds for where the law of M is a mixture of the Poisson distribution and either the binomial or the negative binomial distribution. It would be desirable to obtain the sharp estimates and, moreover, consider a more general model where the set of summation is random. In this connection, it seems helpful to employ the paper [62], where the authors proved results concerning the weak convergence of distributions of statistics constructed from samples of random size. In addition, it would be interesting to extend these results to stratified samples by invoking Lemma 1 of [63].
Special attention is paid to various generalizations of the geometric sums. In Theorem 3.3 of [64], the authors consider random sums with summation index , where are i.i.d. random variables following the geometric law , see Equation (2). Then, they show that converge in distribution to the gamma law with certain parameters as . In [62], it is demonstrated that the Linnik and the Mittag–Leffler laws arise naturally in the framework of limit theorems for random sums. Hopefully, in the future a complete picture of limit laws involving the general theory of distribution mixtures will appear. In addition, it is desirable to study various models of random sums of dependent random variables. On this track, it could be useful to consider the decompositions of exchangeable random sequences extending the fundamental de Finetti theorem, see, e.g., [65].
One can try to generalize the results of Section 7 to the accumulative laws proposed in [66]. These laws are akin to both the Pareto distribution and the lognormal distribution. In addition, we refer to [43] where the “variance-gamma distributions” were studied. These distributions form a four-parameter family and comprise as special and limiting cases the normal, gamma and Laplace distributions. Employment of these distributions permits enlarging the range of applications in modeling and fitting real data.
To complete the indication of further research directions, we note that the next essential and nontrivial step is to establish the limit theorem in functional spaces for processes generated by a sequence of random sums of random variables. For such stochastic processes, one can obtain the analogues of the classical invariance principles.
Author Contributions
Conceptualization, A.B. and N.S.; methodology, A.B. and N.S.; formal analysis, A.B. and N.S.; investigation, A.B. and N.S.; writing—original draft preparation, A.B. and N.S.; writing—review and editing, A.B. and N.S.; supervision, A.B.; project administration, A.B.; funding acquisition, A.B. All authors have read and agreed to the published version of the manuscript.
Funding
The work was supported by the Lomonosov Moscow State University project “Fundamental Mathematics and Mechanics”.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors are grateful to Alexander Tikhomirov for the invitation to present the manuscript for this issue. In addition, they would like to thank three anonymous Reviewers for the careful reading of the manuscript and valuable remarks.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
Proof of Lemma 1.
If , then h is absolutely continuous (see, e.g., §13 in [42]), and consequently there exists for almost all . Thus, for almost all in light of Equation (4). Assume that essential supremum . Then, for any , one can find a version of , defined on , such that . (It was explained in Section 2 that one can consider a measurable extension of to ). Then, due to Equation (11) with h instead of f, we obtain Equation (5) with instead of C. Consequently, . We arrive at a contradiction.
On the other hand, let h be absolutely continuous. Then, for almost all , there exists and Equation (11) is valid for h instead of f. Assume that essential supremum . Then, for any there is a version of such that . According to Equation (11), the relation (5) holds with instead of C. Since can be taken arbitrarily small, one can claim that . Suppose that . Then, for almost all , there exists and . Thus, we found a version with . The contradiction shows that . Hence, the desired statement is proved. □
Proof of Lemma 2.
Let be a continuity point of a function . Then, the same is true for a function , . Hence, the function has a derivative at the point (in light of Remark 1, the integral is well defined for any ). Thus, for each point x of continuity of h, there exists
For each fixed and a function , where , Equation (12) is verified in a similar way for the right derivative at the point . Taking in Equation (12), we obtain . Evidently, . Therefore, Equation (A1) yields
If a function h belongs to , then, for any , the following inequality holds: . Consequently, for , one has (where denotes the right derivative of a version of , and we work with the essential supremum).
Taking into account Lemma 1, for a function and any , one can write . For and , by the Lagrange finite-increments formula, , where . Hence, for any and ,
since
Taking into account Equation (12), one can see that, for any , , where and h have derivatives at each point . Using Equation (A2) and Equation (A3), we obtain, for ,
By means of Equation (A3) and the Lagrange finite-increments formula we can write
Let us apply the Taylor formula with integral representation of the residual term:
This representation, known for the Riemann integral (see, e.g., [67], §9.17), holds in the framework of the Lebesgue integral if it is possible to use repeated integration by parts for , i.e.,
The integral on the left-hand side of Equation (A7) exists by virtue of Lemma 1 since . Therefore, is defined for almost all and (essential supremum) . The latter equality in Equation (A7) is obvious since is a continuous function on . The first equality in Equation (A7) is valid due to the integration by parts formula for the Lebesgue integral. Indeed, the functions and are absolutely continuous for t belonging to . Thus, we can apply, e.g., Theorem 13.29 of [42] to justify the first equality in Equation (A7). Consequently, due to Equations (A4) and (A6), one can write
where , . Relations (A5) and (A8) lead to the last statement of Lemma 2. The proof is complete. □
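For the reader's convenience, the two classical tools invoked in this proof can be recalled in their standard forms (generic symbols a, b, ξ, n; the particular order of the Taylor expansion employed above is as in the proof itself):

```latex
% Lagrange finite-increments (mean value) formula:
% for f continuous on [a,b] and differentiable on (a,b),
f(b) - f(a) = f'(\xi)\,(b - a), \qquad \xi \in (a, b).

% Taylor formula with integral representation of the residual term
% (see, e.g., [67], Sec. 9.17):
f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\,(x - a)^k
     + \frac{1}{(n-1)!} \int_a^x (x - t)^{n-1} f^{(n)}(t)\,\mathrm{d}t.
```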
Comments to Definition 1.
For each Lipschitz function f, one can claim that is finite since and, in light of Remark 1, one has , where , . Clearly, it is sufficient to verify Equation (14) for any Lipschitz function f such that (otherwise we take the Lipschitz function , ). Evidently, , , introduced by Equation (15), is a probability density because, for a non-negative random variable X, according to [47], Ch. 2, formula (69),
We will show that, for such f and a density of , one has
where F is the distribution function of X and . We take the integrals over as and for .
We know that a function f has a derivative at almost all points . Therefore, the right-hand side of Equation (A10) does not depend on the choice of a version ( is a measurable bounded function). The integral in the right-hand side of Equation (A10) is finite because in light of Lemma 1 and since the right-hand side of Equation (A9) is finite. One can take the integrals over in Equation (A10) as and , where m stands for the Lebesgue measure.
The function f has finite variation (as f is a Lipschitz function). Therefore, where and are nondecreasing functions. We can take the canonical representation with and , , where is the variation of f on , (see, e.g., [42], Theorem 12.18). If , then . For , one has (see, e.g., [42], Lemma 12.15)
We see that such and are Lipschitz functions when f is. Hence, for almost all , there exist , and . Thus, it is enough to demonstrate that
These integrals are finite since and are Lipschitz functions. Note that
By applying Theorem 11 of Sec. 6, Ch. 2 [47], one obtains, for each , nondecreasing continuous function and a nondecreasing right-continuous function , the following formula:
We take into account that and the -finite measure corresponding to is absolutely continuous w.r.t. m, with the Radon–Nikodým derivative , , . In addition, we can write in Equation (A11) since, for at almost all , the left limit of this function coincides with (there exists at most a countable set of jumps of , ). Obviously, as because for some positive and all . Indeed, according to formula (73) of Sec. 6, Ch. 2 of [47], the condition yields
By the Lebesgue dominated convergence theorem one infers that
and
This permits us to claim the validity of Equation (A10), which entails the desired Equation (15).
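In the standard setting of the equilibrium transformation, which Equation (15) concerns, the equilibrium density of a non-negative random variable X with finite positive mean is p_e(x) = P(X > x)/EX. The following minimal Python sketch, written under that assumption (the function names are illustrative, not from the paper), checks numerically that p_e integrates to one and that the exponential law is a fixed point of the transformation, the property underlying the Rényi theorem:

```python
import math

# Assumed standard form of the equilibrium density of a non-negative
# random variable X with finite mean: p_e(x) = P(X > x) / E[X].

def equilibrium_density(survival, mean):
    """Return the equilibrium density x -> P(X > x) / E[X]."""
    return lambda x: survival(x) / mean

def integrate(f, a, b, n=100_000):
    # Simple midpoint rule; adequate for smooth integrands on [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Example 1: X ~ Uniform(0, 1), so P(X > x) = 1 - x on [0, 1] and E[X] = 1/2.
# The equilibrium density must integrate to one over the support.
p_e = equilibrium_density(lambda x: 1.0 - x, 0.5)
total = integrate(p_e, 0.0, 1.0)

# Example 2: X ~ Exp(lam) is a fixed point: P(X > x)/E[X] = lam * exp(-lam * x),
# i.e., the equilibrium density coincides with the original exponential density.
lam = 2.0
p_e_exp = equilibrium_density(lambda x: math.exp(-lam * x), 1.0 / lam)
err = max(abs(p_e_exp(x) - lam * math.exp(-lam * x)) for x in (0.0, 0.5, 1.0, 3.0))
```

The fixed-point property in Example 2 is exactly why geometric sums, suitably normalized, converge to the exponential law.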
Proof of Lemma 3.
For , in light of Remark 1, one can state that for some positive numbers and . Let F be the distribution function of X. Since , due to Corollary 2, Sec. 6, Ch. 2, v.1, [47], one has
Hence, we obtain that as and as . The continuous function f has bounded variation. Thus, where and are nondecreasing continuous functions. Then, for any and , the integration by parts formula (see, e.g., Theorem 11, Sec. 6, Ch. 2, [47]) and Equation (18) give
We take into account that the integrands are bounded measurable functions and the measures corresponding to F, and are finite on any interval . Therefore, such integrals are finite. According to the Lebesgue theorem on dominated convergence (recall that ), one has
and the limit is finite. The monotone convergence theorem for -finite measure yields
We have seen that as . Hence, in light of Equation (18)
Therefore, for , each integral is finite as is finite. Thus,
as f is absolutely continuous. Indeed, for any ,
where (continuous) for any finite interval . Thus, and . Set
Then and are nondecreasing continuous functions on , and
where these three integrals are finite. For (non-negative) -finite measures corresponding to and , one can write
Thus, one has
The bound follows from Lemma 1. Therefore, the Lebesgue theorem on dominated convergence yields (as )
We have demonstrated that
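For reference, the integration by parts formula for Lebesgue–Stieltjes integrals (Theorem 11, Sec. 6, Ch. 2 of [47]) used repeatedly in the proofs above reads, in one standard form, for F right-continuous and G continuous, both nondecreasing on [a, b], with F(s−) denoting the left limit:

```latex
F(b)G(b) - F(a)G(a)
  = \int_{(a,b]} F(s-)\,\mathrm{d}G(s) + \int_{(a,b]} G(s)\,\mathrm{d}F(s).
```

When G is continuous, the left limits G(s−) and the values G(s) coincide, which is the observation exploited in the Comments to Definition 1.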
References
1. Steutel, F.W.; Van Harn, K. Infinite Divisibility of Probability Distributions on the Real Line; Marcel Dekker: New York, NY, USA, 2004.
2. Nolan, J.P. Univariate Stable Distributions. Models for Heavy Tailed Data; Springer: Cham, Switzerland, 2020.
3. Jagers, P. Branching processes: Personal historical perspective. In Statistical Modeling for Biological Systems; Almudevar, A., Oakes, D., Hall, J., Eds.; Springer: Cham, Switzerland, 2020; pp. 1–12.
4. Schmidli, H. Risk Theory; Springer: Cham, Switzerland, 2017.
5. Gnedenko, B.V.; Korolev, V.Y. Random Summation. Limit Theorems and Applications; CRC Press: Boca Raton, FL, USA, 1996.
6. Kalashnikov, V.V. Geometric Sums: Bounds for Rare Events with Applications; Kluwer Academic: Dordrecht, The Netherlands, 1997.
7. Pinsky, M.A.; Karlin, S. An Introduction to Stochastic Modeling, 4th ed.; Academic Press: Amsterdam, The Netherlands, 2011.
8. Bulinski, A.; Spodarev, E. Introduction to random fields. In Stochastic Geometry, Spatial Statistics and Random Fields. Asymptotic Methods; Spodarev, E., Ed.; Springer: Berlin, Germany, 2013; pp. 277–336.
9. Zolotarev, V.M. Modern Theory of Summation of Random Variables; De Gruyter: Berlin, Germany, 1997.
10. Rachev, S.T.; Klebanov, L.B.; Stoyanov, S.V.; Fabozzi, F.J. The Methods of Distances in the Theory of Probability and Statistics; Springer: New York, NY, USA, 2013.
11. Stein, C. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory; Statistical Laboratory of the University of California: Berkeley, CA, USA, 1972; pp. 583–602.
12. Stein, C. Approximate Computation of Expectations; Institute of Mathematical Statistics Lecture Notes—Monograph Series, 7; Institute of Mathematical Statistics: Hayward, CA, USA, 1986.
13. Slepov, N.A. Convergence rate of random geometric sum distributions to the Laplace law. Theory Probab. Appl. 2021, 66, 121–141.
14. Tyurin, I.S. On the convergence rate in Lyapunov’s theorem. Theory Probab. Appl. 2011, 55, 253–270.
15. Barbour, A.D.; Chen, L.H.Y. (Eds.) An Introduction to Stein’s Method; World Scientific: Singapore, 2005.
16. Chen, L.H.Y.; Goldstein, L.; Shao, Q.-M. Normal Approximation by Stein’s Method; Springer: Heidelberg, Germany, 2011.
17. Ross, N. Fundamentals of Stein’s method. Probab. Surv. 2011, 8, 210–293.
18. Arras, B.; Breton, J.-C.; Deshayes, A.; Durieu, O.; Lachièze-Rey, R. Some recent advances for limit theorems. ESAIM Proc. Surv. 2020, 68, 73–96.
19. Arras, B.; Houdré, C. On Stein’s Method for Infinitely Divisible Laws with Finite First Moment, 1st ed.; Springer: Cham, Switzerland, 2019.
20. Chen, P.; Nourdin, I.; Xu, L.; Yang, X.; Zhang, R. Non-integrable Stable Approximation by Stein’s Method. J. Theor. Probab. 2022, 35, 1137–1186.
21. Rényi, A. A characterization of Poisson processes (In Hungarian). Magyar Tud. Akad. Mat. Kutató Int. Közl. 1957, 1, 519–527.
22. Shevtsova, I.; Tselishchev, M. A generalized equilibrium transform with application to error bounds in the Rényi theorem with no support constraints. Mathematics 2020, 8, 577.
23. Brown, M. Error bounds for exponential approximations of geometric convolutions. Ann. Probab. 1990, 18, 1388–1402.
24. Brown, M. Sharp bounds for exponential approximations under a hazard rate upper bound. J. Appl. Probab. 2015, 52, 841–850.
25. Hung, T.L.; Kien, P.T. On the rates of convergence in weak limit theorems for normalized geometric sums. Bull. Korean Math. Soc. 2020, 57, 1115–1126.
26. Shevtsova, I.; Tselishchev, M. On the accuracy of the exponential approximation to random sums of alternating random variables. Mathematics 2020, 8, 1917.
27. Korolev, V.; Zeifman, A. Bounds for convergence rate in laws of large numbers for mixed Poisson random sums. Stat. Probab. Lett. 2021, 168, 108918.
28. Aldous, D.J. More Uses of Exchangeability: Representations of Complex Random Structures. In Probability and Mathematical Genetics: Papers in Honour of Sir John Kingman; Bingham, N.H., Goldie, C.M., Eds.; Cambridge University Press: Cambridge, UK, 2010.
29. Shevtsova, I.; Tselishchev, M. On the accuracy of the generalized gamma approximation to generalized negative binomial random sums. Mathematics 2021, 9, 1571.
30. Liu, Q.; Xia, A. Geometric sums, size biasing and zero biasing. Electron. Commun. Probab. 2022, 27, 1–13.
31. Döbler, C.; Peccati, G. The Gamma Stein equation and noncentral de Jong theorems. Bernoulli 2018, 24, 3384–3421.
32. Korolev, V. Bounds for the rate of convergence in the generalized Rényi theorem. Mathematics 2022, 10, 4252.
33. Peköz, E.A.; Röllin, A. New rates for exponential approximation and the theorems of Rényi and Yaglom. Ann. Probab. 2011, 39, 587–608.
34. Weinberg, G.V. Kullback-Leibler divergence and the Pareto-Exponential approximation. SpringerPlus 2016, 5, 604.
35. Daly, F. Gamma, Gaussian and Poisson approximations for random sums using size-biased and generalized zero-biased couplings. Scand. Actuar. J. 2022, 24, 471–487.
36. Zolotarev, V.M. Ideal metrics in the problem of approximating the distributions of sums of independent random variables. Theory Probab. Appl. 1977, 22, 433–449.
37. Gibbs, A.L.; Su, F.E. On choosing and bounding probability metrics. Int. Stat. Rev. 2002, 70, 419–435.
38. Janson, S. Probability Distances. 2020. Available online: www2.math.uu.se/~svante (accessed on 1 September 2022).
39. Peköz, E.A.; Röllin, A.; Ross, N. Total variation error bounds for geometric approximation. Bernoulli 2013, 19, 610–632.
40. Slepov, N.A. Generalized Stein equation on extended class of functions. In Proceedings of the International Conference on Analytical and Computational Methods in Probability Theory and Its Applications, Moscow, Russia, 23–27 October 2017; pp. 75–79.
41. Ley, C.; Reinert, G.; Swan, Y. Stein’s method for comparison of univariate distributions. Probab. Surv. 2017, 14, 1–52.
42. Yeh, J. Real Analysis. Theory of Measure and Integration, 2nd ed.; World Scientific: Singapore, 2006.
43. Gaunt, R.E. Wasserstein and Kolmogorov error bounds for variance gamma approximation via Stein’s method I. J. Theor. Probab. 2020, 33, 465–505.
44. Halmos, P.R. Measure Theory; Springer: New York, NY, USA, 1974.
45. Gaunt, R.E.; Walton, N. Stein’s method for the single server queue in heavy traffic. Stat. Probab. Lett. 2020, 156, 108566.
46. Muthukumar, T. Measure Theory and Lebesgue Integration. 2018. Available online: home.iitk.ac.in/~tmk (accessed on 1 September 2022).
47. Shiryaev, A.N. Probability-1; Springer: New York, NY, USA, 2016.
48. Burkill, J.C. The Lebesgue Integral; Cambridge University Press: Cambridge, UK, 1963.
49. Korolev, V.; Zeifman, A. Generalized negative binomial distributions as mixed geometric laws and related limit theorems. Lith. Math. J. 2019, 59, 366–388.
50. Baldi, P.; Rinott, Y.; Stein, C. A normal approximation for the number of local maxima of a random function on a graph. In Probability, Statistics and Mathematics, Papers in Honor of Samuel Karlin; Anderson, T.W., Athreya, K.B., Iglehart, D.L., Eds.; Academic Press: San Diego, CA, USA, 1989; pp. 59–81.
51. Goldstein, L.; Rinott, Y. Multivariate normal approximations by Stein’s method and size bias couplings. J. Appl. Probab. 1996, 33, 1–17.
52. Goldstein, L. Berry-Esseen bounds for combinatorial central limit theorems and pattern occurrences, using zero and size biasing. J. Appl. Probab. 2005, 42, 661–683.
53. Goldstein, L.; Reinert, G. Stein’s method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab. 1997, 7, 935–952.
54. Gaunt, R.E. On Stein’s method for products of normal random variables and zero bias couplings. Bernoulli 2017, 23, 3311–3345.
55. Döbler, C. Distributional transformations without orthogonality relations. J. Theor. Probab. 2017, 30, 85–116.
56. Arratia, R.; Goldstein, L.; Kochman, F. Size bias for one and all. Probab. Surv. 2019, 16, 1–61.
57. Weinberg, G.V. Validity of whitening-matched filter approximation to the Pareto coherent detector. IET Signal Process. 2012, 6, 546–550.
58. Blanchet, J.; Glynn, P. Uniform renewal theory with applications to expansions of random geometric sums. Adv. Appl. Probab. 2007, 39, 1070–1097.
59. Kallenberg, O. Foundations of Modern Probability; Springer: New York, NY, USA, 1997.
60. Kingman, J.F.C. On queues in heavy traffic. J. R. Stat. Soc. Ser. B Stat. Methodol. 1962, 24, 383–392.
61. Su, Z.; Wang, X. Approximation of sums of locally dependent random variables via perturbation of Stein operator. arXiv 2022, arXiv:2209.09770v2.
62. Korolev, V.Y.; Zeifman, A.I. Convergence of statistics constructed from samples with random sizes to the Linnik and Mittag-Leffler distributions and their generalizations. J. Korean Stat. Soc. 2017, 46, 161–181.
63. Bulinski, A.; Kozhevin, A. New version of the MDR method for stratified samples. Stat. Optim. Inf. Comput. 2017, 5, 1–18.
64. Giang, L.T.; Hung, T.L. An extension of random summations of independent and identically distributed random variables. Commun. Korean Math. Soc. 2018, 33, 605–618.
65. Farago, A. Decomposition of Random Sequences into Mixtures of Simpler Ones and Its Application in Network Analysis. Algorithms 2021, 14, 336.
66. Feng, M.; Deng, L.-J.; Chen, F.; Perc, M.; Kurths, J. The accumulative law and its probability model: An extension of the Pareto distribution and the log-normal distribution. Proc. R. Soc. A 2020, 476, 20200019.
67. Nikolsky, S.M. A Course of Mathematical Analysis, Vol. 1; Mir Publishers: Moscow, Russia, 1987.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).