Abstract
Let $(X_n)$ be a sequence of real random variables, $(T_n)$ a sequence of random indices, and $(a_n)$ a sequence of positive constants such that $a_n \to \infty$. The asymptotic behavior of the random sums $L_n = \frac{1}{a_n} \sum_{j=1}^{T_n} X_j$, as $n \to \infty$, is investigated when $(X_n)$ is exchangeable and independent of $(T_n)$. We give conditions for $M_n = \sqrt{a_n}\,(L_n - L) \to M$ in distribution, where $L$ and $M$ are suitable random variables. Moreover, when $(X_n)$ is i.i.d., we find constants $b_n$ and $c_n$ such that $d_{TV}(L_n, L) \le b_n$ and $d_{TV}(M_n, M) \le c_n$ for every $n$. In particular, $L_n \to L$ or $M_n \to M$ in total variation distance provided $b_n \to 0$ or $c_n \to 0$, as happens in certain situations.
Keywords:
exchangeability; random sum; rate of convergence; stable convergence; total variation distance
MSC:
60F05; 60G50; 60B10; 60G09
1. Introduction
All random elements appearing in this paper are defined on the same probability space, say $(\Omega, \mathcal{A}, P)$.
A random sum is a quantity such as $\sum_{j=1}^{T_n} X_j$, where $(X_n)$ is a sequence of real random variables and $(T_n)$ a sequence of $\mathbb{N}$-valued random indices. In the sequel, in addition to $(X_n)$ and $(T_n)$, we fix a sequence $(a_n)$ of positive constants such that $a_n \to \infty$ and we let
$$L_n = \frac{1}{a_n} \sum_{j=1}^{T_n} X_j.$$
Random sums find applications in a number of frameworks, including statistical inference, risk theory and insurance, reliability theory, economics, finance, and the forecasting of market changes. Accordingly, the asymptotic behavior of $L_n$, as $n \to \infty$, is a classical topic in probability theory. The related literature is huge and we do not try to summarize it here. We just mention a general textbook [1] and some useful recent references: [2,3,4,5,6,7,8,9,10].
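Before turning to the results, a quick numerical sketch may help fix ideas. The snippet below (illustrative only: the Poisson index and the exponential summands are our own choices, not assumptions of the paper) draws the normalized random sum $L_n = a_n^{-1}\sum_{j \le T_n} X_j$ for a large $a_n$:

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_random_sum(a_n, rng):
    """One draw of (1/a_n) * sum_{j=1}^{T_n} X_j, with T_n ~ Poisson(a_n)
    and X_1, X_2, ... i.i.d. Exponential(1), independent of T_n."""
    T_n = rng.poisson(a_n)
    return rng.exponential(1.0, size=T_n).sum() / a_n

a_n = 1000.0
draws = np.array([normalized_random_sum(a_n, rng) for _ in range(2000)])
# With E[X_1] = 1 and T_n / a_n concentrating near 1, the draws cluster near 1.
print(draws.mean())
```

Here $T_n/a_n \to 1$ in probability, so the empirical mean of the draws sits near $E(X_1) \cdot 1 = 1$, in line with the law of large numbers discussed below.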
In this paper, the asymptotic behavior of $L_n$ is investigated in the (important) special case where $(X_n)$ is exchangeable and independent of $(T_n)$. More precisely, we assume that:
- (i)
- $(X_n)$ is exchangeable;
- (ii)
- $(X_n)$ is independent of $(T_n)$;
- (iii)
- $T_n / a_n \xrightarrow{P} V$ for some random variable $V$.
Under such conditions, we prove a weak law of large numbers (WLLN), a central limit theorem (CLT), and we investigate the rate of convergence with respect to the total variation distance.
Suppose in fact $E|X_1| < \infty$ and conditions (i)-(ii)-(iii) hold. Define
$$L = V\, E(X_1 \mid \mathcal{T}),$$
where $V$ is the random variable involved in condition (iii) and $\mathcal{T}$ is the tail $\sigma$-field of $(X_n)$. Then, it is not hard to show that $L_n \xrightarrow{P} L$. Obtaining a CLT, instead, is not straightforward. In Section 3, we prove that $M_n = \sqrt{a_n}\,(L_n - L) \to M$ in distribution, where $M$ is a suitable random variable, provided $E(X_1^2) < \infty$ and $\sqrt{a_n}\,(T_n/a_n - V)$ converges stably. Finally, in Section 4, assuming $(X_n)$ i.i.d. and some additional conditions, we find constants $b_n$ and $c_n$ such that
$$d_{TV}(L_n, L) \le b_n \quad \text{and} \quad d_{TV}(M_n, M) \le c_n.$$
In particular, $L_n \to L$ or $M_n \to M$ in total variation distance provided $b_n \to 0$ or $c_n \to 0$, as happens in certain situations.
A last note is that, to our knowledge, random sums have rarely been investigated when the summands are exchangeable. Similarly, convergence of $L_n$ or $M_n$ in total variation distance is usually not taken into account. This paper contributes to filling this gap.
2. Preliminaries
In the sequel, the probability distribution of any random element $U$ is denoted by $\mathcal{L}(U)$. If $S$ is a topological space, $\mathcal{B}(S)$ is the Borel $\sigma$-field on $S$ and $C_b(S)$ the space of real bounded continuous functions on $S$. The total variation distance between two probability measures on $\mathcal{B}(\mathbb{R})$, say $\mu$ and $\nu$, is
$$d_{TV}(\mu, \nu) = \sup_{A \in \mathcal{B}(\mathbb{R})} \lvert \mu(A) - \nu(A) \rvert.$$
With a slight abuse of notation, if $X$ and $Y$ are $S$-valued random variables, we write $d_{TV}(X, Y)$ instead of $d_{TV}(\mathcal{L}(X), \mathcal{L}(Y))$, namely
$$d_{TV}(X, Y) = \sup_{A \in \mathcal{B}(S)} \lvert P(X \in A) - P(Y \in A) \rvert.$$
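For laws on a common countable support, the supremum over Borel sets is attained and the total variation distance equals half the $\ell^1$ distance between the probability mass functions, so it can be computed exactly. A small illustration (the binomial-versus-Poisson pair is our own example, not one from the paper):

```python
import numpy as np
from math import comb, exp, factorial

def tv_discrete(p, q):
    """d_TV = sup_A |P(A) - Q(A)| = (1/2) * sum_k |p_k - q_k|
    for two laws on the same countable support."""
    return 0.5 * float(np.abs(np.asarray(p) - np.asarray(q)).sum())

# Binomial(n, lam/n) versus its Poisson(lam) approximation.
n, lam = 50, 2.0
ks = range(151)
binom = [comb(n, k) * (lam / n) ** k * (1 - lam / n) ** (n - k) if k <= n else 0.0
         for k in ks]
poiss = [exp(-lam) * lam ** k / factorial(k) for k in ks]
d = tv_discrete(binom, poiss)
print(d)
```

By Le Cam's inequality the distance is at most $n (\lambda/n)^2 = 0.08$ here, and the computed value is well below that bound.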
If $X$ is a real random variable, we say that $X$ is absolutely continuous to mean that $\mathcal{L}(X)$ is absolutely continuous with respect to the Lebesgue measure. The following technical fact is useful in Section 4.
Lemma 1.
Let X be a strictly positive random variable. Then,
provided the are constants such that and is absolutely continuous.
Proof.
Let f be a density of X. Since , for some sequence of continuous densities, it can be assumed that f is continuous. Furthermore, since , for each there is such that . For such a b, one obtains
Hence, it can also be assumed a.s. for some .
Let be a density of . Since
it suffices to show that for each . To prove the latter fact, define . For large n, one obtains . In this case, on and can be written as
Therefore, follows from the continuity of f and
☐
Stable Convergence
Stable convergence, introduced by Rényi in [11], is a strong form of convergence in distribution. It actually occurs in a number of frameworks, including the classical CLT, and thus it quickly became popular; see, e.g., [12] and references therein. Here, we just recall the basic definition.
Let $S$ be a metric space, $(Y_n)$ a sequence of $S$-valued random variables, and $K$ a kernel (or a random probability measure) on $S$. The latter is a map $K$ on $\Omega$ such that $K(\omega)$ is a probability measure on $\mathcal{B}(S)$, for each $\omega \in \Omega$, and $\omega \mapsto K(\omega)(B)$ is $\mathcal{A}$-measurable for each $B \in \mathcal{B}(S)$. Say that $Y_n$ converges stably to $K$ if
$$\lim_n E\{f(Y_n) \mid H\} = E\{K(f) \mid H\} \qquad (1)$$
for all $f \in C_b(S)$ and $H \in \mathcal{A}$ with $P(H) > 0$, where $K(f)(\omega) = \int f(x)\, K(\omega)(dx)$.
More generally, take a sub-$\sigma$-field $\mathcal{G} \subset \mathcal{A}$ and suppose $K$ is $\mathcal{G}$-measurable (i.e., $\omega \mapsto K(\omega)(B)$ is $\mathcal{G}$-measurable for fixed $B \in \mathcal{B}(S)$). Then, $Y_n$ converges $\mathcal{G}$-stably to $K$ if condition (1) holds whenever $f \in C_b(S)$ and $H \in \mathcal{G}$ with $P(H) > 0$.
An important special case is when $K$ is a trivial kernel, in the sense that
$$K(\omega) = \mu \quad \text{for all } \omega \in \Omega,$$
where $\mu$ is a fixed probability measure on $\mathcal{B}(S)$. In this case, $Y_n$ converges $\mathcal{G}$-stably to $\mu$ if and only if
$$\lim_n E\{f(Y_n)\, Z\} = \mu(f)\, E\{Z\}$$
whenever $f \in C_b(S)$ and $Z$ is bounded and $\mathcal{G}$-measurable.
3. WLLN and CLT for Random Sums
In this section, we still let
$$L = V\, E(X_1 \mid \mathcal{T}),$$
where $V$ is the random variable involved in condition (iii) and
$$\mathcal{T} = \bigcap_n \sigma(X_n, X_{n+1}, \ldots)$$
is the tail $\sigma$-field of $(X_n)$. Recall that $L_n = a_n^{-1} \sum_{j=1}^{T_n} X_j$. Recall also that, by de Finetti’s theorem, $(X_n)$ is exchangeable if and only if it is i.i.d. conditionally on $\mathcal{T}$, namely
$$P\bigl(X_1 \in A_1, \ldots, X_n \in A_n \mid \mathcal{T}\bigr) = \prod_{j=1}^{n} P\bigl(X_1 \in A_j \mid \mathcal{T}\bigr) \quad \text{a.s.}$$
for all $n \ge 1$ and all $A_1, \ldots, A_n \in \mathcal{B}(\mathbb{R})$.
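De Finetti’s characterization is also a recipe for simulation: draw a latent directing parameter once, then draw the coordinates i.i.d. given it. A hedged sketch (the Gaussian mixture below is our own illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(1)

def exchangeable_rows(n, rows, rng):
    """Each row: draw theta ~ N(0, 1) once, then X_1, ..., X_n i.i.d.
    N(theta, 1) given theta. Each row is then an exchangeable vector."""
    theta = rng.normal(0.0, 1.0, size=(rows, 1))
    return theta + rng.normal(0.0, 1.0, size=(rows, n))

X = exchangeable_rows(5, 100_000, rng)
# Exchangeability forces all pairwise covariances to coincide
# (here Cov(X_i, X_j) = Var(theta) = 1 for i != j).
c12 = float(np.cov(X[:, 0], X[:, 1])[0, 1])
c34 = float(np.cov(X[:, 2], X[:, 3])[0, 1])
print(c12, c34)
```

Unlike the i.i.d. case, distinct coordinates are positively correlated through the latent parameter, which is exactly why the limits below involve conditional moments given the tail $\sigma$-field.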
The following WLLN is straightforward.
Theorem 1.
If $E|X_1| < \infty$ and conditions (i) and (iii) hold, then $L_n \xrightarrow{P} L$.
Proof.
Recall that, if $(Y_n)$ and $Y$ are any real random variables, $Y_n \xrightarrow{P} Y$ if and only if, for each subsequence $(n_k)$, there is a sub-subsequence $(n_{k_j})$ such that $Y_{n_{k_j}} \xrightarrow{a.s.} Y$. Fix a subsequence $(n_k)$. Then, by (iii),
$$\frac{T_{n_{k_j}}}{a_{n_{k_j}}} \xrightarrow{a.s.} V$$
along a suitable sub-subsequence $(n_{k_j})$. Since $a_n \to \infty$, then $T_{n_{k_j}} \xrightarrow{a.s.} \infty$ on the set $\{V > 0\}$. As a result of the SLLN for exchangeable sequences, $\frac{1}{m} \sum_{i=1}^{m} X_i \xrightarrow{a.s.} E(X_1 \mid \mathcal{T})$ as $m \to \infty$. Therefore,
$$L_{n_{k_j}} = \frac{T_{n_{k_j}}}{a_{n_{k_j}}} \cdot \frac{1}{T_{n_{k_j}}} \sum_{i=1}^{T_{n_{k_j}}} X_i \xrightarrow{a.s.} V\, E(X_1 \mid \mathcal{T}) = L.$$
☐
For definiteness, Theorem 1 has been stated in terms of convergence in probability, but analogous results are available. As an example, suppose that $E|X_1| < \infty$ and conditions (i)–(ii) are satisfied. Then, $L_n \to L$ in distribution provided $T_n/a_n \to V$ in distribution. This follows from the Skorohod representation theorem and the current version of Theorem 1. Similarly, $L_n \to L$ almost surely or in $L^1$ whenever $T_n/a_n \to V$ almost surely or in $L^1$.
We also note that, as implicit in the proof of Theorem 1, condition (iii) implies or equivalently
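The WLLN of Theorem 1 is easy to check numerically. In the sketch below (all distributional choices are ours, for illustration), the $X_j$ are i.i.d. with mean $m$, so $E(X_1 \mid \mathcal{T}) = m$, and $T_n = \lfloor a_n V \rfloor$ with $V$ drawn uniformly, so condition (iii) holds trivially; the normalized sum then lands near $L = V m$:

```python
import numpy as np

rng = np.random.default_rng(2)

a_n, m = 100_000, 2.0
errs = []
for _ in range(50):
    V = rng.uniform(0.5, 1.5)          # limit of T_n / a_n
    T_n = int(a_n * V)                 # hence T_n / a_n -> V (condition (iii))
    X = rng.exponential(m, size=T_n)   # i.i.d. summands with mean m
    L_n = X.sum() / a_n
    errs.append(abs(L_n - V * m))      # distance from the limit L = V * m
print(max(errs))
```

Note that the limit $L = Vm$ is random: it changes from trial to trial with $V$, which is the hallmark of the exchangeable/random-index setting.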
We next turn to the CLT. It is convenient to begin with the i.i.d. case. From now on, $U$ and $Z$ are two real random variables such that $Z$ has standard normal distribution and is independent of $(U, V)$. We also let $m = E(X_1)$, $\sigma^2 = \mathrm{Var}(X_1)$, and $M_n = \sqrt{a_n}\,(L_n - mV)$.
Theorem 2.
Suppose $(X_n)$ is i.i.d., $E(X_1^2) < \infty$, condition (ii) holds, and
$$\sqrt{a_n}\,\Bigl(\frac{T_n}{a_n} - V\Bigr) \to U \quad \text{stably}. \qquad (3)$$
Then,
$$M_n \to M := m\,U + \sigma\,\sqrt{V}\,Z \quad \text{in distribution}.$$
Proof.
Let
Since is i.i.d., a.s. Since for every n, the sequence is -bounded, and this implies
Therefore, it suffices to prove in distribution. We prove the latter fact by means of characteristic functions.
Fix . Let be the probability distribution of V under and
Then,
In addition, for each , the classical CLT yields
Since condition (3) implies condition (iii), for all . Given , take such that . As a result of (4), one can find an integer m such that
Since is arbitrary and , it follows that
Finally, since and Z is independent of V,
Therefore,
where the second equality is due to condition (3). Hence, in distribution, and this concludes the proof. ☐
The argument used in the proof of Theorem 2 yields a little bit more. Let and . Then, converges -stably (and not only in distribution) to . Among other things, since , this implies that in distribution, where R denotes a random variable independent of L such that . Moreover, condition (3) can be weakened into converges -stably to .
We also note that, under some extra assumptions, Theorem 2 could be given a simpler proof based on some version of Anscombe’s theorem; see, e.g., [13] and references therein.
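To make the CLT concrete, here is a numerical sketch (all distributional choices are ours; $m$ and $\sigma^2$ denote the mean and variance of $X_1$, and $a_n$, $T_n$, $V$ are as in our discussion). Taking $T_n = \lfloor a_n V \rfloor$ makes the index-fluctuation term vanish, so $\sqrt{a_n}\,(L_n - mV)$ should be close in law to the scale mixture $\sigma \sqrt{V}\, Z$ with $Z$ standard normal, whose variance is $\sigma^2 E[V]$:

```python
import numpy as np

rng = np.random.default_rng(3)

a_n, m, reps = 10_000, 1.0, 5_000      # Exponential(1): mean m = 1, variance 1
V = rng.uniform(0.5, 1.5, size=reps)   # random limit of T_n / a_n, E[V] = 1
M_n = np.empty(reps)
for i in range(reps):
    T = int(a_n * V[i])
    S = rng.exponential(m, size=T).sum()
    M_n[i] = np.sqrt(a_n) * (S / a_n - V[i] * m)
# The limit sigma * sqrt(V) * Z has variance sigma^2 * E[V] = 1.0 here.
print(M_n.var())
```

The limit is a mixture of centered Gaussians rather than a single Gaussian, which is why stable (not merely distributional) convergence is the natural tool in this setting.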
Finally, we adapt Theorem 2 to the exchangeable case. Let
$$W = E(X_1^2 \mid \mathcal{T}) - E(X_1 \mid \mathcal{T})^2.$$
To introduce the next result, it may be useful to recall that
$$\frac{1}{\sqrt{n}} \sum_{j=1}^{n} \bigl(X_j - E(X_1 \mid \mathcal{T})\bigr) \to \mathcal{N}(0, W) \quad \text{stably}$$
provided $(X_n)$ is exchangeable and $E(X_1^2) < \infty$, where $\mathcal{N}(0, W)$ is the Gaussian kernel with mean 0 and random variance $W$ (with $\mathcal{N}(0, 0) = \delta_0$); see, e.g., ([14] Th. 3.1).
Theorem 3.
If $E(X_1^2) < \infty$ and conditions (i)–(ii) and (3) hold, then $\sqrt{a_n}\,(L_n - L) \to M$ in distribution.
Proof.
Just note that is i.i.d. conditionally on , with mean and variance W. Hence, for each , Theorem 2 yields
which in turn implies
☐
4. Rate of Convergence with Respect to Total Variation Distance
To obtain upper bounds for $d_{TV}(L_n, L)$ and $d_{TV}(M_n, M)$, some additional assumptions are needed. In particular, in this section, $(X_n)$ is i.i.d. (with the exception of Remark 1). Hence, $L$ and $M$ reduce to $L = mV$ and $M = mU + \sigma\sqrt{V}\,Z$, where $m = E(X_1)$ and $\sigma^2 = \mathrm{Var}(X_1)$, and condition (2) holds.
We begin with a rough estimate for $d_{TV}(L_n, L)$.
Theorem 4.
Suppose that conditions (ii)–(iii) hold, $(X_n)$ is i.i.d., and $\mathcal{L}(X_1)$ has an absolutely continuous part. Then,
for all , where is a constant independent of m and n.
In order to prove Theorem 4, we recall that
for all and ; see, e.g., ([15] Lem. 3).
Proof of Theorem 4.
Fix . By ([16] Lem. 2.1), up to enlarging the underlying probability space , there is a sequence of random variables, independent of , such that
In addition, by ([17] Th. 2.6), there is a constant depending only on such that
Having noted these facts, define
Then,
Next, since , by conditioning on and applying inequality (5), one obtains
Moreover, since and both and Z are independent of V,
Collecting all these facts together, one finally obtains
☐
The upper bound provided by Theorem 4 is generally large but it becomes manageable under some further assumptions. For instance, if a.s. for some constant , it reduces to
As an example, we discuss a simple but instructive case.
Example 1.
For each , denote by the integer part of x. Suppose a.s. for some constant and define
Suppose also that is independent of V and satisfies the other conditions of Theorem 4. Then,
Hence, letting , inequality (6) reduces to
for some constant . Finally, O if V is bounded above and is absolutely continuous with a Lipschitz density. Hence, under the latter condition on V, one obtains
Incidentally, this bound is essentially of the same order as the bound obtained in [6] when has a mixed Poisson distribution and the total variation distance is replaced by the Wasserstein distance.
One more consequence of Theorem 4 is the following.
Corollary 1.
in total variation distance provided the conditions of Theorem 4 hold, , is absolutely continuous, and
Proof.
First, assume a.s. for some constant . For each , letting , Lemma 1 implies
This concludes the proof if a.s. In general, for each , define
where denotes the integer part of x. Since , the first part of the proof implies
Finally, since and
one obtains . ☐
We next turn to $d_{TV}(M_n, M)$. Following [18], our strategy is to estimate $d_{TV}(M_n, M)$ through the Wasserstein distance between $\mathcal{L}(M_n)$ and $\mathcal{L}(M)$.
Recall that, if $X$ and $Y$ are real integrable random variables, the Wasserstein distance between $\mathcal{L}(X)$ and $\mathcal{L}(Y)$ is
$$d_W(X, Y) = \inf_{H, K} E\,\lvert H - K \rvert = \sup_{f} \bigl\lvert E\,f(X) - E\,f(Y) \bigr\rvert,$$
where the inf is over the real random variables $H$ and $K$ such that $\mathcal{L}(H) = \mathcal{L}(X)$ and $\mathcal{L}(K) = \mathcal{L}(Y)$, while the sup is over the 1-Lipschitz functions $f : \mathbb{R} \to \mathbb{R}$. Define also
where is the characteristic function of .
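In one dimension, and with equally sized samples, the Wasserstein distance between two empirical laws has a closed form: the optimal coupling matches order statistics, so one sorts both samples and averages the coordinatewise gaps. A sketch (the two Gaussian samples are our own example; for Gaussians with equal variance, $d_W$ is exactly the difference of the means):

```python
import numpy as np

rng = np.random.default_rng(4)

def w1_empirical(x, y):
    """Wasserstein-1 distance between the empirical laws of two equally
    sized samples: the optimal coupling matches order statistics."""
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))

x = rng.normal(0.0, 1.0, size=200_000)
y = rng.normal(0.5, 1.0, size=200_000)
d = w1_empirical(x, y)
print(d)  # close to |0.5 - 0.0| = 0.5
```

This monotone-matching formula is specific to the real line; in higher dimensions computing $d_W$ requires solving an optimal transport problem.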
Theorem 5.
Assume the conditions of Theorem 2 and:
- (iv)
- , where , is independent of , and is independent of ;
- (v)
- for some and
Then, . Moreover, letting , one obtains
for each and , where k is a constant independent of n.
Proof.
By Theorem 2, in distribution. By condition (iv),
so that is a mixture of centered Gaussian laws. On noting that
one obtains
Finally, by condition (v), and . To conclude the proof, it suffices to apply Theorem 1 of [18] (see also the subsequent remark) with . ☐
Theorem 5 gives two upper bounds for in terms of and . To avoid trivialities, suppose . Obviously, the second bound makes sense only if . However, since and , the first bound implies if . In particular, if .
Example 2.
Under the conditions of Theorem 5, suppose also that is absolutely continuous with a density f satisfying . Then, conditioning on and V and arguing as in ([18] Ex. 2), it can be shown that . Hence, in total variation distance. Furthermore, if , the second bound of Theorem 5 yields
for all and a suitable constant (independent of n).
We close the paper by briefly discussing the exchangeable case.
Remark 1.
Usually, the upper bounds for the total variation distance are preserved under mixtures. Hence, by conditioning on and making some further assumptions, the results obtained in this section can be extended to the case where is exchangeable. As an example, define L and M as in Section 3 and suppose
for each and for some integrable random variable Q. Then, Corollary 1 and Theorem 5 are still valid even if is exchangeable (and not necessarily i.i.d.) up to replacing with a.s. in Corollary 1.
Author Contributions
Methodology, L.P. and P.R. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Gnedenko, B.V.; Korolev, V. Random Summation: Limit Theorems and Applications; CRC Press: Boca Raton, FL, USA, 1996.
- Kiche, J.; Ngesa, O.; Orwa, G. On generalized gamma distribution and its application to survival data. Int. J. Stat. Probab. 2019, 8, 85–102.
- Korolev, V.; Chertok, A.; Korchagin, A.; Zeifman, A. Modeling high-frequency order flow imbalance by functional limit theorems for two-sided risk processes. Appl. Math. Comput. 2015, 253, 224–241.
- Korolev, V.; Dorofeeva, A. Bounds of the accuracy of the normal approximation to the distributions of random sums under relaxed moment conditions. Lith. Math. J. 2017, 57, 38–58.
- Korolev, V.; Zeifman, A. Generalized negative binomial distributions as mixed geometric laws and related limit theorems. Lith. Math. J. 2019, 59, 366–388.
- Korolev, V.; Zeifman, A. Bounds for convergence rate in laws of large numbers for mixed Poisson random sums. Stat. Prob. Lett. 2021, 168, 1–8.
- Mattner, L.; Shevtsova, I. An optimal Berry–Esseen type theorem for integrals of smooth functions. ALEA Lat. Am. J. Probab. Math. Stat. 2019, 16, 487–530.
- Schluter, C.; Trede, M. Weak convergence to the Student and Laplace distributions. J. Appl. Probab. 2016, 53, 121–129.
- Shevtsova, I.; Tselishchev, M. A generalized equilibrium transform with application to error bounds in the Rényi theorem with no support constraints. Mathematics 2020, 8, 577.
- Sheeja, S.; Kumar, S. Negative binomial sum of random variables and modeling financial data. Int. J. Stat. Appl. Math. 2017, 2, 44–51.
- Rényi, A. On stable sequences of events. Sankhyā A 1963, 25, 293–302.
- Nourdin, I.; Nualart, D.; Peccati, G. Quantitative stable limit theorems on the Wiener space. Ann. Probab. 2016, 44, 1–41.
- Berti, P.; Crimaldi, I.; Pratelli, L.; Rigo, P. An Anscombe-type theorem. J. Math. Sci. 2014, 196, 15–22.
- Berti, P.; Pratelli, L.; Rigo, P. Limit theorems for a class of identically distributed random variables. Ann. Probab. 2004, 32, 2029–2052.
- Pratelli, L.; Rigo, P. Total variation bounds for Gaussian functionals. Stoch. Proc. Appl. 2019, 129, 2231–2248.
- Sethuraman, J. Some extensions of the Skorohod representation theorem. Sankhyā 2002, 64, 884–893.
- Bally, V.; Caramellino, L. Asymptotic development for the CLT in total variation distance. Bernoulli 2016, 22, 2442–2485.
- Pratelli, L.; Rigo, P. Convergence in total variation to a mixture of Gaussian laws. Mathematics 2018, 6, 99.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).