Article

A Generalized Equilibrium Transform with Application to Error Bounds in the Rényi Theorem with No Support Constraints

by Irina Shevtsova 1,2,3,4,* and Mikhail Tselishchev 2,4,*
1 Department of Mathematics, School of Science, Hangzhou Dianzi University, Hangzhou 310018, China
2 Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, GSP-1, 1-52 Leninskiye Gory, Moscow 119991, Russia
3 Federal Research Center “Informatics and Control” of the Russian Academy of Sciences, Moscow 119333, Russia
4 Moscow Center for Fundamental and Applied Mathematics, Moscow 119991, Russia
* Authors to whom correspondence should be addressed.
Mathematics 2020, 8(4), 577; https://doi.org/10.3390/math8040577
Submission received: 18 March 2020 / Revised: 6 April 2020 / Accepted: 8 April 2020 / Published: 13 April 2020
(This article belongs to the Special Issue Stability Problems for Stochastic Models: Theory and Applications)

Abstract:
We introduce a generalized stationary renewal distribution (also called the equilibrium transform) for arbitrary distributions with finite nonzero first moment and study its properties. In particular, we prove an optimal moment-type inequality for the Kantorovich distance between a distribution and its equilibrium transform. Using the introduced transform and Stein's method, we investigate the rate of convergence in the Rényi theorem for the distributions of geometric sums of independent random variables with identical nonzero means and finite second moments, without any constraints on their supports. We derive an upper bound for the Kantorovich distance between the normalized geometric random sum and the exponential distribution which has the exact order of smallness as the expectation of the geometric number of summands tends to infinity. Moreover, we introduce the so-called asymptotically best constant and present a lower bound for it, which also yields a lower bound for the Kantorovich distance under consideration. As a concluding remark, we provide an extension of the obtained estimates of the accuracy of the exponential approximation to non-geometric random sums of independent random variables with non-identical nonzero means.

1. Introduction

Let $X_1, X_2, \ldots$ be a sequence of independent and, for simplicity in this Introduction, identically distributed (i.i.d.) random variables (r.v.s) with $a := E X_1 \neq 0$. Let $N$ be a random variable independent of $\{X_1, X_2, \ldots\}$ and having the geometric distribution $\mathrm{Geom}(p)$ with parameter $p \in (0,1)$, i.e., $P(N = n) = p(1-p)^{n-1}$ for $n \in \mathbb{N}$. Denote also by $N_0 := N - 1$ the shifted geometric r.v. Let $S_n := \sum_{k=1}^{n} X_k$, $n \in \mathbb{N}$, $S_0 := 0$. The well-known Rényi theorem states that the distribution of a properly normalized geometric random sum $S_N$ converges weakly to the exponential law as $p$ tends to zero. More precisely,
$$W := \frac{S_N}{E S_N} \stackrel{d}{\longrightarrow} E \quad \text{as } p \to 0, \quad \text{where } E \sim \mathrm{Exp}(1) \text{ and } E S_N = E N \cdot E X_1 = a/p. \tag{1}$$
Here, the notation $\mathrm{Exp}(\lambda)$ stands for the exponential distribution with density $\lambda e^{-\lambda x}\, \mathbb{1}_{(0,\infty)}(x)$, $\lambda > 0$. Originally, Rényi proved Equation (1) under the additional assumption that the $\{X_n\}$ are nonnegative. However, Equation (1) can be shown to hold also: (i) for alternating $\{X_n\}$ (by an alternating r.v. we mean an r.v. that may take values of both signs); and (ii) for
$$W_0 := \frac{S_{N_0}}{E S_{N_0}} = \frac{p\, S_{N_0}}{a(1-p)}$$
in place of $W$ (still without any support assumptions on the distribution of $\{X_k\}$). This can be done, for example, by showing that the characteristic function (ch.f.) of $W$ (and also of $W_0$) converges pointwise to that of the exponential distribution.
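As a quick numerical illustration (not part of the original argument), the following minimal sketch simulates $W$ for alternating Gaussian summands and estimates its $\zeta_1$-distance to $\mathrm{Exp}(1)$ via the comonotonic coupling of Equation (5) below; numpy and the choice $X_k \sim N(1,1)$ are our assumptions here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_W(p, n_samples=200_000):
    # Alternating summands: X_k = 1 + Z_k with Z_k ~ N(0, 1), so E X_k = 1
    # and P(X_k <= 0) = Phi(-1) > 0.  N ~ Geom(p) on {1, 2, ...}.
    N = rng.geometric(p, size=n_samples)
    # Given N = n, S_n ~ N(n, n), so S_N can be sampled in one step.
    S = N + np.sqrt(N) * rng.standard_normal(n_samples)
    return p * S  # E S_N = 1/p, hence W = p * S_N

for p in (0.5, 0.1, 0.01):
    W = sample_W(p)
    # Crude zeta_1 estimate via sorted samples against exponential quantiles
    # (the comonotonic coupling); sampling noise limits very small p.
    u = (np.arange(W.size) + 0.5) / W.size
    print(p, np.mean(np.abs(np.sort(W) - (-np.log1p(-u)))))
```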
The value of any limit theorem increases considerably when it is accompanied by estimates of the corresponding rate of convergence. There are several bounds on the accuracy of the approximation in Equation (1), mainly w.r.t. the Kolmogorov (uniform) and $\zeta$-metrics, which are cited below. All of them assume additional conditions on the distribution of the random summands, including the finiteness of higher-order moments.
Recall that both the Kolmogorov and $\zeta_s$-metrics are defined as simple probability metrics with $\zeta$-structure (see Section 2 of [1]) between probability distributions (d.f.s $F$, $G$) of r.v.s $X$, $Y$:
$$\zeta_{\mathcal{H}}(F, G) \equiv \zeta_{\mathcal{H}}(\mathcal{L}(X), \mathcal{L}(Y)) \equiv \zeta_{\mathcal{H}}(X, Y) := \sup_{h \in \mathcal{H}} \left| \int_{\mathbb{R}} h\, dF - \int_{\mathbb{R}} h\, dG \right| \tag{2}$$
for specific classes $\mathcal{H}$ of real Borel functions on $\mathbb{R}$ (to simplify the notation, here and in what follows, we use r.v.s as well as their distributions and d.f.s in the arguments of simple probability metrics interchangeably; this should not cause any misunderstanding). The Kolmogorov metric $\rho$ is obtained with $\mathcal{H} = \{\mathbb{1}_{(-\infty, a)} : a \in \mathbb{R}\}$, the class of indicators of all open intervals with unbounded left endpoint:
$$\rho(F, G) := \sup_{x \in \mathbb{R}} |F(x) - G(x)|,$$
while the $\zeta$-metric of order $s > 0$, originally introduced by Zolotarev [2] (see also [3]) as an example of an ideal metric with $\zeta$-structure, is defined as $\zeta_{\mathcal{H}}$ with $\mathcal{H} = \mathcal{F}_s$, where
$$\mathcal{F}_s := \left\{ h \in \overline{\mathcal{F}}_s : h \text{ is bounded} \right\}, \quad \overline{\mathcal{F}}_s := \left\{ h \colon \mathbb{R} \to \mathbb{R} : |h^{(m)}(x) - h^{(m)}(y)| \le |x - y|^{s-m} \ \forall x, y \in \mathbb{R} \right\}, \quad m := \lceil s \rceil - 1 \in \mathbb{N}_0, \quad s > 0,$$
that is,
$$\zeta_s(F, G) := \sup_{h \in \mathcal{F}_s} \left| \int_{\mathbb{R}} h\, dF - \int_{\mathbb{R}} h\, dG \right|. \tag{3}$$
Observe that $h \in \overline{\mathcal{F}}_s$ iff $h' \in \overline{\mathcal{F}}_{s-1}$, $s > 1$. If $E|X|^s < \infty$ and $E|Y|^s < \infty$, then $\zeta_s(F, G) < \infty$, and the least upper bound w.r.t. $h \in \mathcal{F}_s$ in Equation (3) may be replaced with that over the wider class $\overline{\mathcal{F}}_s$. For further properties of $\zeta_s$-metrics, we refer to the works in [3,4] and Section 4 of [5].
In the present paper, we focus mostly on the $\zeta_1$-metric between distributions with finite first moments; under this assumption, the definition of the $\zeta_1$-metric can be rewritten as
$$\zeta_1(F, G) = \sup_{h \in \mathrm{Lip}_1} \left| \int_{\mathbb{R}} h\, dF - \int_{\mathbb{R}} h\, dG \right|, \tag{4}$$
where
$$\mathrm{Lip}_c := \left\{ h \colon \mathbb{R} \to \mathbb{R} : |h(x) - h(y)| \le c|x - y| \ \forall x, y \in \mathbb{R} \right\}, \quad c > 0,$$
so that $\mathrm{Lip}_1 = \overline{\mathcal{F}}_1$. It is worth noting that $\zeta_1$ has several alternative representations. The Kantorovich–Rubinstein theorem states that $\zeta_1(X, Y)$ is minimal with respect to the compound metric $E|X - Y|$, while the results in [6] imply that the optimal coupling is attained at the comonotonic pair (that is, with $(X', Y') = (F^{-1}(U), G^{-1}(U))$, $U$ having the uniform distribution on $(0,1)$, and $F^{-1}$, $G^{-1}$ being the generalized inverse d.f.s):
$$\zeta_1(F, G) = \min_{\mathcal{L}(X', Y'):\ X' \stackrel{d}{=} X,\ Y' \stackrel{d}{=} Y} E|X' - Y'| = \int_0^1 \left| F^{-1}(u) - G^{-1}(u) \right| du = \int_{\mathbb{R}} |F(x) - G(x)|\, dx. \tag{5}$$
The rightmost representation in Equation (5), as the mean metric between the d.f.s $F$ and $G$, follows from the geometrical interpretation. The metric $\zeta_1$ is also called the Kantorovich, or the Wasserstein, distance.
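The representations in Equation (5) can be checked against each other numerically; the sketch below (assuming numpy and scipy, with an arbitrary pair of unit-mean laws chosen by us) compares the sorted-sample estimate of the comonotonic coupling with scipy's implementation of the mean metric between d.f.s.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
x = rng.exponential(1.0, 100_000)   # samples from F = Exp(1)
y = rng.uniform(0.0, 2.0, 100_000)  # samples from G = U(0, 2), also unit mean

# Comonotonic coupling E|F^{-1}(U) - G^{-1}(U)| via sorted samples...
comonotone = np.mean(np.abs(np.sort(x) - np.sort(y)))
# ...against scipy's estimate of the mean metric between the d.f.s.
print(comonotone, wasserstein_distance(x, y))  # the two estimates agree
```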
Thus, coming back to the convergence rate estimates in Equation (1), we first mention the paper by Solovyev [7], which gives the following uniform bound for nonnegative $\{X_k\}$, as pointed out in [8]:
$$\rho(W_0, E) \le 24\, p\, \gamma_r^{\,r-2}, \quad 2 < r \le 3, \tag{6}$$
where $\gamma_r = \left( E X_1^r / a^r \right)^{1/(r-1)}$.
Kalashnikov and Vsekhsvyatskii [9] proved a uniform upper bound for nonnegative summands in terms of their moments of order $s \in (1, 2]$:
$$\rho(W, E) \le C\, p^{s-1}\, \frac{E X_1^s}{a^s}, \tag{7}$$
where $C$ is an absolute constant.
Kruglov and Korolev [10] gave the following nonuniform bound on the accuracy of the exponential approximation to the normalized geometric distribution (i.e., for degenerate $\{X_n\}$):
$$\left| P(pN < x) - (1 - e^{-x}) \right| \le x\, \mathbb{1}\{x < p\} + \left( e^{-x} - e^{-Q(p)x} \right) \mathbb{1}\{x \ge p\} \le x\, \mathbb{1}\{x < p\} + \frac{p}{2(1-p)}\, e^{-x}\, \mathbb{1}\{x \ge p\}, \tag{8}$$
where $Q(p) = (1 - p/2)/(1 - p)$.
Brown [8] proved an asymptotically exact (as $p \to 0$) upper bound for nonnegative summands, which does not require moments of order greater than two:
$$\rho(W_0, E) \le \frac{p\, E X_1^2}{a^2} \max\left\{ 1, \frac{1}{2(1-p)} \right\}. \tag{9}$$
Brown also showed that Equation (9) is tighter than Equation (6) for all $2 < r \le 3$ and $p \in (0, 0.5]$. Moreover, Equation (9) can be treated as a specification of Equation (7) for $s = 2$ with a concrete value of $C$.
Sugakova [11] presented some bounds for the d.f. $F_{S_{N_0}}(t)$ for $t > 1$ using the characteristics of the renewal process built on top of independent and not necessarily identically distributed alternating $\{X_n\}$ with identical means.
Kalashnikov [12] provided estimates of the rate of convergence in the Rényi theorem for i.i.d. alternating $\{X_n\}$ w.r.t. $\zeta_s$-metrics of order $s \in [1, 2]$ and the uniform metric (the latter under the additional assumption of a bounded density); in particular, for any $s \in (1, 2]$,
$$\zeta_s(W, E) \le p^{s-1}\, \zeta_s(X_1, E), \tag{10}$$
$$\zeta_1(W, E) \le p\, \zeta_1(X_1, E) + 2(1-p)\, p^{s-1}\, \zeta_s(X_1, E), \tag{11}$$
provided that $E X_1 = 1$.
Among other valuable things, Peköz and Röllin [13] exploited Stein's method and equilibrium (stationary renewal) distributions (see Section 3) to estimate the Kantorovich distance between the exponential distribution and that of a normalized geometric random sum $W$ of square integrable, independent and not necessarily identically distributed nonnegative random summands $\{X_n\}$ with identical positive means, under the technical assumption $E X_k = 1$:
$$\zeta_1(W, E) \le 2p \sum_{n=1}^{\infty} P(N = n)\, \zeta_1(X_n, X_n^e), \tag{12}$$
where $X_n^e$ has the equilibrium distribution w.r.t. $X_n$, $n \in \mathbb{N}$. Using the trivial bound $\zeta_1(X, Y) \le E|X| + E|Y|$, which follows from representation (5) and holds true for arbitrary r.v.s $X, Y$ with finite first moments, the inequality in Equation (12) can be naturally extended to
$$\zeta_1(W, E) \le 2p \sup_n \zeta_1(X_n, X_n^e) \le p\left( \sup_n E X_n^2 + 2 \right), \tag{13}$$
as done in [14].
Equation (22) of Hung [15] gives the following bound for the Trotter distance between $W$ and $E$ in the case of i.i.d. nonnegative summands $\{X_n\}$ with $E X_1 = 1$:
$$d_T(W, E; h) := \sup_{t \in \mathbb{R}} \left| E h(W + t) - E h(E + t) \right| \le p^{s-1} \left( E X_1^2 + 3 \right), \quad h \in \mathcal{F}_s,\ s \in (1, 2]. \tag{14}$$
Given that $\zeta_s(W, E) = \sup_{h \in \mathcal{F}_s} d_T(W, E; h)$, the estimate in Equation (14) may be rewritten as
$$\zeta_s(W, E) \le p^{s-1} \left( E X_1^2 + 3 \right) \quad \text{for } s \in (1, 2]. \tag{15}$$
To compare Equation (15) with Kalashnikov's bound in Equation (10), observe that, by Theorem 1(i,c) below, the dual representation of the $\zeta_s(X, Y)$-metric as the minimal one w.r.t. the compound metric $E|X - Y|^s$ for $s \in (0, 1]$ (see, e.g., Corollary 5.2.2 of [4]), and, finally, Theorem 1(g) below, for $s \in (1, 2]$, we have
$$\zeta_s(X_1, E) = \zeta_{s-1}(X_1^e, E^e) = \zeta_{s-1}(X_1^e, E) = \inf_{\mathcal{L}(X, Y):\ X \stackrel{d}{=} X_1^e,\ Y \stackrel{d}{=} E} E|X - Y|^{s-1} \le E\left| X_1^e - E \right|^{s-1} \le E\left| X_1^e - E \right| + 1 \le E X_1^e + E E + 1 = E X_1^2 / 2 + 2 < E X_1^2 + 3;$$
hence, Kalashnikov's bound in Equation (10) is tighter than Equation (15).
Thus, most existing estimates of the rate of convergence in the Rényi theorem were obtained under the additional assumption that the random summands $\{X_n\}$ are nonnegative. However, there are many applications where geometric random sums appear with alternating random summands, for example, as profits-or-losses in financial mathematics, risk theory, queuing theory, etc. Hence, extensions of such sharp and natural estimates as Equations (9), (12), and (13) to alternating random summands are not only of theoretical interest, but may also be in great demand in various applications of probability theory.
In the present paper, we focus on $\zeta_1$-estimates; in particular, we extend the bounds in Equations (12) and (13) to the alternating case. More precisely, in Theorem 4 below, we prove that, for square integrable, independent and not necessarily identically distributed random summands $\{X_n\}$ with identical nonzero means (for simplicity, equal to one), the following estimates hold:
$$\zeta_1(W, E) \le 2p \sum_{n=1}^{\infty} P(N = n)\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right) \le p\left( E X_N^2 - 2 P(X_N \le 0) \right), \tag{16}$$
$$\zeta_1(W_0, E) \le \frac{2p}{1-p}\, \zeta_1\left( \delta_0, \mathcal{L}^e(X_N) \right) = \frac{p}{1-p}\, E X_N^2, \tag{17}$$
where $\delta_0$ is the Dirac measure concentrated at zero and $\mathcal{L}^e(X_n)$ is the equilibrium transform of $\mathcal{L}(X_n)$, a generalization of the equilibrium distribution introduced in Section 3 below, which, generally speaking, is no longer a probability measure (therefore, we write $\mathcal{L}^e(X_n)$ instead of $\mathcal{L}(X_n^e)$), but allows eliminating the support constraints on the distribution of $X_n$. The notion of the $\zeta_1$-metric between signed measures is introduced in Section 2 below and coincides with that of the ordinary $\zeta_1$-metric in the case of probability measures. Thus, the intermediate estimate in Equation (16) coincides with estimate (12), but now also holds true for alternating random summands $\{X_n\}$. Furthermore, it can easily be seen that the right-hand side of Equation (16) does not exceed
$$p\, \sup_n \left( E X_n^2 - 2 P(X_n \le 0) \right)$$
and, hence, is tighter than estimate (13), while not requiring that the $\{X_n\}$ take only positive values. The comparison of estimate (16) with Kalashnikov's bound in Equation (11) with $s = 2$,
$$\zeta_1(W, E) \le p\, \zeta_1(X_1, E) + 2p(1-p)\, \zeta_2(X_1, E) = p\, \zeta_1(X_1, E) + 2p(1-p)\, \zeta_1\left( \mathcal{L}^e(X_1), \mathrm{Exp}(1) \right) \tag{18}$$
(for the equality here, see Theorem 1(i) below), is complicated in the general case, since, due to Theorem 3 below, the rightmost expression does not exceed
$$2p(2-p)\, \zeta_1\left( \mathcal{L}(X_1), \mathcal{L}^e(X_1) \right),$$
which is asymptotically twice as large as the intermediate expression in Equation (16), while the intermediate estimate in Equation (16), by the triangle inequality, yields the bound
$$\zeta_1(W, E) \le 2p\, \zeta_1(X_1, E) + 2p\, \zeta_1\left( \mathcal{L}^e(X_1), \mathrm{Exp}(1) \right)$$
with the first term twice as large as that in Equation (18).
We use the same techniques and recipes as in [13]. First, we bound the left-hand side of Equation (16) from above by $\zeta_1\left( \mathcal{L}(W), \mathcal{L}^e(W) \right)$ using Stein's method (see Theorem 3 in Section 4). Second, we estimate $\zeta_1\left( \mathcal{L}(W), \mathcal{L}^e(W) \right)$ by the $\zeta_1$-distances between $X_n$ and their equilibrium transforms $\mathcal{L}^e(X_n)$, $n \in \mathbb{N}$. Third, we construct an optimal upper bound for $\zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right)$ in terms of the second moments of $X_n$ and $P(X_n \le 0)$, $n \in \mathbb{N}$ (see Theorem 2 in Section 3). The resulting upper bounds for $\zeta_1(W, E)$ and $\zeta_1(W_0, E)$ are given in Theorem 4 of Section 5. Furthermore, we provide asymptotic lower bounds for $\zeta_1(W, E)$ and $\zeta_1(W_0, E)$ (see Theorem 5 in Section 5) in terms of the so-called asymptotically best constants introduced in Section 5. The constructed lower bounds turn out to be asymptotically four times smaller than the upper ones. Finally, we extend the obtained estimates of the accuracy of the exponential approximation to non-geometric random sums of independent random variables with non-identical nonzero means of identical signs (see Theorem 6 in Section 5).

2. The Kantorovich Distance between Signed Measures

In the next sections, we need to calculate the Kantorovich (or $\zeta_1$-) distance between measures on $(\mathbb{R}, \mathcal{B})$ that are no longer probabilities, but still have unit mass on $\mathbb{R}$. Denote by $\mathcal{M}^1$ the linear space of signed measures on $(\mathbb{R}, \mathcal{B})$ with finite total variations and finite first moments, and by $\mathcal{M}_0^1$ the subspace of measures $\sigma \in \mathcal{M}^1$ with $\sigma(\mathbb{R}) = 0$.
The Kantorovich norm on $\mathcal{M}_0^1$ is defined as (see Section 3.2 of [16])
$$\|\sigma\|_K := \sup_{f \in \mathrm{Lip}_1} \int_{\mathbb{R}} f\, d\sigma.$$
Now let $\mu, \nu \in \mathcal{M}^1$ with $\mu(\mathbb{R}) = \nu(\mathbb{R})$, so that $\mu - \nu \in \mathcal{M}_0^1$. The induced Kantorovich distance $\zeta_1$ between $\mu$ and $\nu$ is
$$\zeta_1(\mu, \nu) := \|\mu - \nu\|_K = \sup_{f \in \mathrm{Lip}_1} \left( \int_{\mathbb{R}} f\, d\mu - \int_{\mathbb{R}} f\, d\nu \right). \tag{19}$$
It is easy to see that, in the case of probability measures $\mu$ and $\nu$, Equation (19) coincides with the definition of the $\zeta_1$-distance given in Equation (4).
Using the Jordan decompositions $\mu = \mu^+ - \mu^-$ and $\nu = \nu^+ - \nu^-$, as well as the alternative representation in Equation (5) of the $\zeta_1$-distance between the nonnegative measures $\lambda := \mu^+ + \nu^-$ and $\pi := \nu^+ + \mu^-$ with $\lambda(\mathbb{R}) = \pi(\mathbb{R})$ in terms of their d.f.s, after a proper normalization, one can rewrite Equation (19) as
$$\zeta_1(\mu, \nu) = \zeta_1(\lambda, \pi) = \int_{\mathbb{R}} \left| F_\lambda(x) - F_\pi(x) \right| dx = \int_{\mathbb{R}} \left| F_\mu(x) - F_\nu(x) \right| dx, \tag{20}$$
where $F_\mu(x) = \mu((-\infty, x))$ and $F_\nu(x) = \nu((-\infty, x))$, $x \in \mathbb{R}$, are the d.f.s of the signed measures $\mu$ and $\nu$, respectively. In other words, the alternative representation of Zolotarev's $\zeta_1$-distance in terms of d.f.s in Equation (5) is preserved for signed measures with identical masses of $\mathbb{R}$.
We also use the convolution $\mu * \lambda$ of signed measures, which is defined word-for-word as that of probability distributions. The uniqueness and multiplication theorems (see, e.g., Chapter 6 of [17] or Section 3.8 of [18]) state that the characteristic function of $\mu$ (the Fourier–Stieltjes transform of $F_\mu$)
$$\hat\mu(t) := \int_{\mathbb{R}} e^{itx}\, \mu(dx) = \int_{\mathbb{R}} e^{itx}\, dF_\mu(x), \quad t \in \mathbb{R},$$
defines the signed measure $\mu$, as well as its d.f. $F_\mu$, uniquely, and
$$\widehat{\mu * \nu} = \hat\mu \cdot \hat\nu.$$
The following lemma, which is a simple corollary to representation (20), shows that the well-known properties of homogeneity and regularity of the Kantorovich distance between probability distributions are preserved for signed measures, but with a slight correction.
Lemma 1.
The Kantorovich distance $\zeta_1$ on the space $\mathcal{M}_D^1$ of finite signed Borel measures on the real line with masses of $\mathbb{R}$ equal to $D \in \mathbb{R}$ and finite first moments possesses the following properties:
(a)
Homogeneity of order 1. For every $\mu, \nu \in \mathcal{M}_D^1$ and $c \neq 0$, with $\mu_c(B) := \mu(cB)$, $\nu_c(B) := \nu(cB)$ and $cB := \{cx : x \in B\}$, $B \in \mathcal{B}$, we have
$$\zeta_1(\mu_c, \nu_c) = \frac{1}{|c|}\, \zeta_1(\mu, \nu).$$
(b)
Regularity. For all $\mu, \nu \in \mathcal{M}_D^1$ and $\lambda \in \mathcal{M}^1$, we have
$$\zeta_1(\mu * \lambda, \nu * \lambda) \le |\lambda|(\mathbb{R}) \cdot \zeta_1(\mu, \nu),$$
where $|\lambda| := \lambda^+ + \lambda^-$ is the total variation of $\lambda$.
To avoid abusing the notation, in what follows, we also use $\zeta_1(F, G)$ for the Kantorovich distance between (signed) measures uniquely restored (Section 3.5, Theorem 3.29 of [19]) from the distribution functions $F$ and $G$.

3. The Equilibrium Transform of Probability Distributions

The notion of the equilibrium distribution w.r.t. nonnegative r.v.s with finite positive means originally arises in renewal theory as the distribution of the initial delay of a renewal process that makes its renewal rate constant (Chapter 11, § 4 of [20]) and, more generally, the renewal process stationary (Chapter 5, § 4 of [21]), which is why it is also called the stationary renewal distribution. The equilibrium distribution appears also as the limit distribution of the residual waiting times, or hitting probabilities (Chapter 11, § 4 of [20]), and in the celebrated Pollaczek–Khinchin–Beekman formula, which expresses the ruin probability in the classical risk process in terms of a geometric random sum of i.i.d. r.v.s whose common distribution is the equilibrium transform of the distribution of the claims. Due to the definition, given in a more general form in Equation (21) below, the equilibrium distribution is also called the integrated tail distribution ([12], p. 37, [22]). Concerning the equilibrium transform, we would also like to mention the work of Harkness and Shantaram [23], who considered the iterated equilibrium transform for d.f.s with nonnegative support and investigated limit theorems for normalized iterations, the description of the limit laws being given in [24]. In particular, the authors of [23] calculated the ch.f. of the equilibrium transform, which can be used as the definition of the equilibrium transform in the general case and hence, together with the inversion formula, gives a hint towards the definition in Equation (21) of the equilibrium d.f. with no support constraints.
We introduce an extension of the equilibrium distribution that is applicable for alternating random variables with finite nonzero first moments, but leads out of the class of probability distributions.
Let $P$ be a probability measure with d.f. $F(x) = P((-\infty, x))$, $x \in \mathbb{R}$, ch.f. $f(t) = \int e^{itx}\, P(dx) = \int_{\mathbb{R}} e^{itx}\, dF(x)$, $t \in \mathbb{R}$, and finite first moment $a := \int x\, P(dx) = \int_{\mathbb{R}} x\, dF(x)$. If a r.v. $X$ (on some probability space $(\Omega, \Sigma, \mathsf{P})$) has the distribution $P$, we also write $P = \mathcal{L}(X)$, $f(t) = E e^{itX} \equiv f_X(t)$, $F(x) = P(X < x) \equiv F_X(x)$, $a = E X$.
Definition 1.
The equilibrium d.f. (distribution) w.r.t. the d.f. $F$ (probability distribution $P$ / law $\mathcal{L}(X)$) with $a \neq 0$ is the function of bounded variation (the (signed) measure $P^e$ / $\mathcal{L}^e(X)$ on $\mathcal{B}(\mathbb{R})$ with the d.f.)
$$F^e(x) := \begin{cases} -\dfrac{1}{a} \displaystyle\int_{-\infty}^{x} F(y)\, dy, & \text{if } x \le 0, \\[2mm] -\dfrac{E X^-}{a} + \dfrac{1}{a} \displaystyle\int_0^x \left( 1 - F(y) \right) dy, & \text{if } x > 0, \end{cases} \tag{21}$$
$$= \frac{1}{a} \int_{-\infty}^{x} \left( \mathbb{1}_{(0,+\infty)}(y) - F(y) \right) dy, \quad x \in \mathbb{R},$$
where $X^- := \max\{-X, 0\}$.
In Theorem 1(a) below, it is proved that $F^e$ indeed has bounded variation; some useful properties of the equilibrium transform are stated there as well.
We call $F^e$ / $P^e$ / $\mathcal{L}^e(X)$ the equilibrium transform (d.f. / distribution) w.r.t. $F$ / $P$ / $\mathcal{L}(X)$ / $X$, correspondingly, although it may not be a probability d.f. / distribution at all. At the same time, it can easily be seen that $\mathcal{L}^e(X)$ is a probability measure if and only if $X$ does not change sign (that is, if and only if $P$ is concentrated either on $(-\infty, 0]$ or on $[0, \infty)$), in which case one may construct a random variable $X^e$ with the distribution $\mathcal{L}(X^e) = \mathcal{L}^e(X)$ and such that $X$ and $X^e$ are either both nonnegative or both nonpositive.
In what follows, to indicate the r.v. whose equilibrium transform is considered, we use the corresponding lower index and write $F_X^e$ and $f_X^e$ for $(F_X)^e$ and $(f_X)^e$, respectively.
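For a concrete feel of Definition 1, the following numerical sketch (assuming numpy; the two-point alternating law is our illustrative choice, not from the paper) evaluates the density of $F^e$ from Equation (23) below and checks the total mass, total variation, and first moment against the values predicted by Theorem 1(a,g).

```python
import numpy as np

atoms, probs = np.array([-1.0, 3.0]), np.array([0.25, 0.75])
a = float(np.dot(atoms, probs))  # E X = 2 != 0, while P(X <= 0) = 0.25 > 0

def F(x):  # d.f. F(x) = P(X < x)
    return probs[atoms < x].sum()

def p_e(x):  # Lebesgue density of F^e, Equation (23); negative for x <= 0 here
    return -F(x) / a if x <= 0 else (1.0 - F(x)) / a

xs = np.linspace(-1.0, 3.0, 100_001)
dx = xs[1] - xs[0]
vals = np.array([p_e(x) for x in xs])
print(np.sum(vals) * dx)           # total mass ~ 1                (Theorem 1(a))
print(np.sum(np.abs(vals)) * dx)   # total variation ~ E|X|/|E X| = 1.25
print(np.sum(xs * vals) * dx)      # first moment ~ E X^2/(2 E X) = 1.75
```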
Theorem 1.
Let $X$ be a r.v. with d.f. $F$ and $a := E X \neq 0$, and let $F^e$ be the equilibrium d.f. w.r.t. $F$ defined in Equation (21). Then:
(a)
Absolute continuity. The function $F^e$ has bounded variation on $\mathbb{R}$ with
$$\left| \mathcal{L}^e(X) \right|(\mathbb{R}) = E|X| / |E X|, \quad F^e(-\infty) = 0, \quad F^e(+\infty) = 1, \tag{22}$$
and, hence, $\mathcal{L}^e(X)$ is a Borel measure with unit mass on $\mathbb{R}$; moreover, $F^e$ is a.c. with the Lebesgue derivative
$$p^e(x) = \begin{cases} -\frac{1}{a} F(x), & \text{if } x \le 0, \\ \frac{1}{a} \left( 1 - F(x) \right), & \text{if } x > 0, \end{cases} \tag{23}$$
and $\operatorname{supp} \mathcal{L}^e(X)$ coincides with the convex hull of $\operatorname{supp} \mathcal{L}(X)$.
(b)
Characteristic function. The ch.f. (Fourier–Stieltjes transform) of $F^e$ has the form
$$f^e(t) := \int_{\mathbb{R}} e^{itx}\, dF^e(x) = \frac{f(t) - 1}{t f'(0)} = \frac{f(t) - 1}{ita}, \quad t \neq 0, \qquad f^e(0) = 1. \tag{24}$$
(c)
Fixed points. $\mathcal{L}^e(X) = \mathcal{L}(X)$ iff $X \sim \mathrm{Exp}(1/a)$, that is, if and only if $F(x) = (1 - e^{-x/a})\, \mathbb{1}_{(0,\infty)}(x)$ for some $a > 0$.
(d)
Test functions. $F^e$ is the equilibrium d.f. w.r.t. $X$ if and only if
$$E g(X) - g(0) = E X \cdot \int_{\mathbb{R}} g'(x)\, dF^e(x) \tag{25}$$
for all Lipschitz functions $g \colon \mathbb{R} \to \mathbb{R}$.
(e)
Mixture preservation. For arbitrary d.f.s $F_1, F_2, \ldots$ with identical nonzero expectations and a discrete probability distribution $p_n \ge 0$, $n \in \mathbb{N}$, $\sum_{n=1}^{\infty} p_n = 1$, we have
$$\left( \sum_{n=1}^{\infty} p_n F_n \right)^{\!e} = \sum_{n=1}^{\infty} p_n F_n^e. \tag{26}$$
(f)
Homogeneity. For all $c \in \mathbb{R} \setminus \{0\}$, we have
$$F_{cX}^e(x) = F_X^e(x/c), \quad x \in \mathbb{R}, \tag{27}$$
or, in terms of (constant-sign) r.v.s, $(cX)^e \stackrel{d}{=} c X^e$, $c \in \mathbb{R} \setminus \{0\}$. In other words, the equilibrium transform respects scaling.
(g)
Moments. If $E|X|^{r+1} < \infty$ for some $r > 0$, then for all $k \in \mathbb{N} \cap [1, r]$ we have
$$\int_{\mathbb{R}} x^k\, dF^e(x) = \frac{E X^{k+1}}{(k+1)\, E X}, \qquad \int_{\mathbb{R}} |x|^r\, dF^e(x) = \frac{E X |X|^r}{(r+1)\, E X}, \tag{28}$$
$$\int_{\mathbb{R}} x^k\, |dF^e|(x) = \frac{E |X| X^k}{(k+1)\, |E X|}, \qquad \int_{\mathbb{R}} |x|^r\, |dF^e|(x) = \frac{E |X|^{r+1}}{(r+1)\, |E X|}. \tag{29}$$
(h)
Single summand property. Let $N, X_1, X_2, \ldots$ be independent r.v.s such that $a_n := E X_n \in (0, \infty)$, $n \in \mathbb{N}$, $P(N \in \mathbb{N}_0) = 1$, $S_N := X_1 + \cdots + X_N$, $S_0 := 0$, $A := E S_N = \sum_{n=1}^{\infty} a_n P(N \ge n)$ be finite, and let $M$ be a $\mathbb{N}$-valued r.v. with the distribution
$$P(M = m) = \frac{a_m}{A}\, P(N \ge m), \quad m \in \mathbb{N}.$$
Then,
$$\mathcal{L}^e(S_N) = \sum_{m=1}^{\infty} P(M = m)\, \mathcal{L}(S_{m-1}) * \mathcal{L}^e(X_m), \tag{30}$$
where $*$ denotes the convolution of two Borel measures, or, in terms of (constant-sign) r.v.s,
$$S_N^e \stackrel{d}{=} S_{M-1} + X_M^e,$$
where all the r.v.s are independent. In particular, if $N \sim \mathrm{Geom}(p)$ and all the $X_k$ have identical nonzero expectations, then $M \stackrel{d}{=} N$ and
$$\mathcal{L}^e(S_N) = \mathcal{L}^e(S_{N-1}) = \sum_{n=1}^{\infty} p(1-p)^{n-1}\, \mathcal{L}(S_{n-1}) * \mathcal{L}^e(X_n), \tag{31}$$
which can also be rewritten, in the case of i.i.d. $\{X_k\}$, in the form
$$\mathcal{L}^e(S_N) = \mathcal{L}^e(S_{N-1}) = \mathcal{L}(S_{N-1}) * \mathcal{L}^e(X_1).$$
(i)
Relation between $\zeta$-distances. For arbitrary d.f.s $F$ and $G$ with finite moments of order $s > 1$ and identical expectations $a \neq 0$, we have
$$\zeta_s(F, G) = |a|\, \zeta_{s-1}(F^e, G^e). \tag{32}$$
Theorem 2 below also provides an optimal upper bound for $\zeta_1(F, F^e)$ given $F(0+)$ and the second-order moment of $F$.
Remark 1.
Theorem 1(h) shows that the equilibrium transform of the geometric random sum of independent r.v.s with identical nonzero means does not depend on whether or not one takes the geometric distribution starting from zero.
Let us make several historical remarks. Some of the properties of the equilibrium distribution stated in Theorem 1 were known for a nonnegative r.v. $X$. Thus, the characteristic function of $X^e$ given in Equation (24) was found in [23], and Equation (25) was taken as the definition of (the distribution of) $X^e$ in [13,14]. In Theorem 2.1 of [13], it was proved that the exponential distribution is the only fixed point of the equilibrium transform; this fact is also proved directly in Lemma 5.2 of [14]. In [14] (p. 268), it is observed that $(cX)^e \stackrel{d}{=} c X^e$ for $c > 0$. Some moment calculations were given in [22]. The single summand property for $S_N$ was demonstrated in the proof of Theorem 3.1 of [13] for nonnegative, but not necessarily independent, $\{X_k\}$. The fact that $\mathcal{L}^e(S_N) = \mathcal{L}^e(S_{N-1})$ for i.i.d. nonnegative $\{X_k\}$ was observed in [8] (p. 1394). The equality in Equation (32) for $F(0) = G(0) = 0$ and $s = 2$ was stated in [12] (p. 37).
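As a small sanity check of the fixed-point property (c) through the ch.f. identity (24), the following sketch (assuming numpy) verifies that the equilibrium transform returns the ch.f. of $\mathrm{Exp}(1/a)$ unchanged:

```python
import numpy as np

a = 2.0
t = np.linspace(0.1, 5.0, 50)    # nonzero arguments only, since (24) divides by t
f = 1.0 / (1.0 - 1j * t * a)     # ch.f. of Exp(1/a)
fe = (f - 1.0) / (1j * t * a)    # equilibrium ch.f. from Equation (24)
print(np.allclose(f, fe))        # True: Exp(1/a) is a fixed point
```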
To prove Theorem 1, we require the following auxiliary statement.
Lemma 2.
For every $n \in \mathbb{N}$ and $z_1, \ldots, z_n \in \mathbb{C}$, we have
$$\prod_{k=1}^{n} z_k - 1 = \sum_{k=1}^{n} (z_k - 1) \prod_{j=1}^{k-1} z_j = \sum_{k=1}^{n} (z_k - 1) \prod_{j=k+1}^{n} z_j, \tag{33}$$
where $\prod_{j=a}^{b} (\cdot) := 1$ for $b < a$.
Proof. 
We use induction on $n$. For $n = 1$, Equation (33) is trivial. Let Equation (33) hold for $n = 1, \ldots, m-1$; let us prove it for $n = m$. Using the inductive hypothesis in the second equality below, we get
$$\prod_{k=1}^{m} z_k - 1 = (z_m - 1) \prod_{k=1}^{m-1} z_k + \left( \prod_{k=1}^{m-1} z_k - 1 \right) = (z_m - 1) \prod_{k=1}^{m-1} z_k + \sum_{k=1}^{m-1} (z_k - 1) \prod_{j=1}^{k-1} z_j = \sum_{k=1}^{m} (z_k - 1) \prod_{j=1}^{k-1} z_j.$$
The second equality in Equation (33) can be deduced from the first one just by the re-numeration of $\{z_k\}_{k=1}^{n}$: $z_k \mapsto z_{n-k+1}$, $k = 1, \ldots, n$. □
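Equation (33) is elementary but easy to get wrong by one index; here is a direct numerical check (assuming numpy, with random complex inputs of our choosing) of both telescoping forms:

```python
import numpy as np

rng = np.random.default_rng(4)
z = rng.standard_normal(6) + 1j * rng.standard_normal(6)

lhs = np.prod(z) - 1
# First form: (z_k - 1) times the product of the preceding factors.
rhs1 = sum((z[k] - 1) * np.prod(z[:k]) for k in range(6))
# Second form: (z_k - 1) times the product of the following factors.
rhs2 = sum((z[k] - 1) * np.prod(z[k + 1:]) for k in range(6))
print(np.allclose(lhs, rhs1), np.allclose(lhs, rhs2))  # True True
```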
Proof of Theorem 1. 
(a) It follows immediately from the definition in Equation (21) of $F^e$ that $F^e$ is a.c. with the density given in Equation (23). In turn, Equation (23) implies that $\operatorname{supp} \mathcal{L}^e(X)$ is the convex hull of $\operatorname{supp} \mathcal{L}(X)$ and, accounting for $|\mathcal{L}^e(X)|(\mathbb{R}) = \int |p^e(x)|\, dx = E|X| / |E X| < \infty$, also that $F^e$ has bounded variation. The limiting values $F^e(\pm\infty)$ can be found directly from the definition of $F^e$.
(b) Using the density of $F^e$ (see Equation (23)) and integrating by parts, we have
$$f^e(t) = \int_{\mathbb{R}} e^{itx} p^e(x)\, dx = \frac{1}{a} \int_{\mathbb{R}} e^{itx} \left( \mathbb{1}_{(0,\infty)}(x) - F(x) \right) dx = \frac{1}{ita} \int_{\mathbb{R}} \left( \mathbb{1}_{(0,\infty)}(x) - F(x) \right) d e^{itx} = \frac{1}{ita} \left( -e^{itx} F(x) \Big|_{-\infty}^{0} + e^{itx} \left( 1 - F(x) \right) \Big|_{0}^{+\infty} + \int_{\mathbb{R}} e^{itx}\, dF(x) \right) = \frac{f(t) - 1}{ita},$$
which coincides with Equation (24).
(c) This statement follows immediately from the uniqueness of the solution to the linear equation
$$f^e(t) \equiv \frac{f(t) - 1}{ita} = f(t) \iff f(t) = \frac{1}{1 - ita} \iff X \sim \mathrm{Exp}(1/a).$$
(d)–(g) These statements follow from the definition and integration by parts for (d) and (g) or the linearity of the Lebesgue–Stieltjes integral for (e).
(h) Let us denote $f_0(t) := 1$, $f_k(t) := E e^{itX_k}$, $k \in \mathbb{N}$, $t \in \mathbb{R}$. Using the fact that
$$f_{S_N}(t) = \sum_{n=0}^{\infty} P(N = n)\, E e^{itS_n} = \sum_{n=0}^{\infty} P(N = n) \prod_{k=0}^{n} f_k(t),$$
together with the equation for the equilibrium ch.f. in Equations (24) and (33), we get
$$f_{S_N}^e(t) = \frac{f_{S_N}(t) - 1}{t f_{S_N}'(0)} = \frac{1}{itA} \sum_{n=1}^{\infty} P(N = n) \left( \prod_{k=1}^{n} f_k(t) - 1 \right) = \sum_{n=1}^{\infty} P(N = n) \sum_{k=1}^{n} \frac{f_k(t) - 1}{itA} \prod_{j=1}^{k-1} f_j(t) = \sum_{n=1}^{\infty} P(N = n) \sum_{k=1}^{n} \frac{a_k}{A}\, f_k^e(t)\, f_{S_{k-1}}(t).$$
Changing the order of summation, which is possible by virtue of the absolute convergence of the above series, and recalling the definition of $\mathcal{L}(M)$, we obtain
$$f_{S_N}^e(t) = \sum_{k=1}^{\infty} f_k^e(t)\, f_{S_{k-1}}(t) \cdot \frac{a_k}{A} \sum_{n=k}^{\infty} P(N = n) = \sum_{k=1}^{\infty} f_k^e(t)\, f_{S_{k-1}}(t)\, P(M = k),$$
which is equivalent to Equation (30) by virtue of the uniqueness theorem.
If now $N \sim \mathrm{Geom}(p)$ and $a_1 = a_2 = \cdots = a$, then $A = a\, E N = a/p$ and $P(M = k) = p(1-p)^{k-1} = P(N = k)$, $k \in \mathbb{N}$. Denoting by $M_0$ the r.v. corresponding to $N_0 := N - 1$ with the distribution
$$P(M_0 = k) := a_k P(N_0 \ge k) \Big/ \sum_{k=1}^{\infty} a_k P(N_0 \ge k) = P(N_0 \ge k) / E N_0 = p(1-p)^{k-1}, \quad k \in \mathbb{N},$$
we observe that $M_0 \stackrel{d}{=} N \stackrel{d}{=} M$. This proves Equation (31).
(i) This statement follows from Theorem 4.2(a), Equation (4.20), of [5]. It can also be proved independently; namely, by virtue of (d), we have
$$\zeta_s(F, G) = \sup_{h \in \mathcal{F}_s} \left| \int_{\mathbb{R}} h\, dF - \int_{\mathbb{R}} h\, dG \right| = |a| \sup_{h \in \mathcal{F}_s} \left| \int_{\mathbb{R}} h'\, dF^e - \int_{\mathbb{R}} h'\, dG^e \right| = |a| \sup_{g \in \overline{\mathcal{F}}_{s-1}} \left| \int_{\mathbb{R}} g\, dF^e - \int_{\mathbb{R}} g\, dG^e \right| = |a|\, \zeta_{s-1}(F^e, G^e). \qquad \square$$
To conclude this section, we construct an optimal upper bound for the Kantorovich distance between an arbitrary probability distribution with nonzero mean and its equilibrium transform, given its second moment and the mass of the nonpositive axis. Before formulating the corresponding result, we note that Cantelli's (one-sided Chebyshev's) inequality yields $P(X \le 0) \le 1 - 1/E X^2$ for an arbitrary r.v. $X$ with $E X = 1$ and $0 < E X^2 < \infty$, and, hence,
$$E X^2 \ge \frac{1}{1 - P(X \le 0)}.$$
This remark explains the choice of the domain of the parameters $q$ and $b$ in the following Theorem 2.
Theorem 2.
Take any $q \in [0, 1)$ and $b \ge \frac{1}{\sqrt{1-q}}$, and let $X$ be a square integrable r.v. with $E X = 1$, $E X^2 = b^2$, and $P(X \le 0) = q$. Then,
$$\zeta_1\left( \mathcal{L}(X), \mathcal{L}^e(X) \right) \le \frac{b^2}{2} - q, \tag{34}$$
where $\mathcal{L}^e(X)$ is the equilibrium transform of $\mathcal{L}(X)$. The equality in Equation (34) is attained, for every $q \in (0, 1)$ and $b \ge \frac{1}{\sqrt{1-q}}$, on the two-point distribution $\mathcal{L}(X) = q \delta_u + (1-q) \delta_v$ with
$$u = 1 - \sqrt{\tfrac{1-q}{q}\, (b^2 - 1)}, \qquad v = 1 + \sqrt{\tfrac{q}{1-q}\, (b^2 - 1)}, \tag{35}$$
and, for $q = 0$ and $b = 1$, on the degenerate distribution $\mathcal{L}(X) = \delta_1$.
Remark 2.
Taking into account Theorem 1(f) and Lemma 1(a), for arbitrary $E X \neq 0$, Equation (34) takes the form
$$\zeta_1\left( \mathcal{L}(X), \mathcal{L}^e(X) \right) \le \frac{1}{2} \cdot \frac{E X^2}{|E X|} - |E X| \cdot P(X \le 0).$$
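The extremal two-point law of Equation (35) can be checked numerically; the sketch below (assuming numpy, with the illustrative choice $q = 0.2$, $b^2 = 2$, so that $u = -1$, $v = 1.5$) evaluates $\zeta_1(F, F^e)$ via the d.f. representation (20) and reproduces the value $b^2/2 - q$.

```python
import numpy as np

q, u, v = 0.2, -1.0, 1.5   # Equation (35) with b^2 = 2: E X = 1, E X^2 = 2

def F(x):   # d.f. of the extremal two-point law q*delta_u + (1-q)*delta_v
    return 0.0 if x <= u else (q if x <= v else 1.0)

def Fe(x):  # its equilibrium d.f., computed from Equation (21) with a = 1
    if x <= u:
        return 0.0
    if x <= 0.0:
        return -q * (x - u)
    return q * u + (1.0 - q) * min(x, v)

xs = np.linspace(u, v, 100_001)    # F and F^e coincide outside [u, v]
dx = xs[1] - xs[0]
print(sum(abs(F(x) - Fe(x)) for x in xs) * dx)  # ~ 0.8 = b^2/2 - q
```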
Proof of Theorem 2. 
Let $F$ be the d.f. of $X$ and $F^e$ its equilibrium transform. Consider the following functional on the space $\mathbb{F}$ of probability d.f.s with unit mean and finite second moment:
$$J(F) := \zeta_1(F, F^e) - \frac{1}{2} \int_{\mathbb{R}} x^2\, dF(x) + F(0+), \quad F \in \mathbb{F}. \tag{36}$$
Then, Equation (34) would follow from
$$\sup_{F \in \mathbb{F}} J(F) \le 0. \tag{37}$$
Let us prove Equation (37).
Since $h \in \mathrm{Lip}_1$ if and only if $-h \in \mathrm{Lip}_1$, the modulus sign in the definition of $\zeta_1(F, F^e)$ (see Equation (19)) may be omitted. Hence, we can rewrite
$$J(F) = \sup_{h \in \mathrm{Lip}_1} J_1(F, h), \quad \text{where } J_1(F, h) := \int_{\mathbb{R}} h\, dF - \int_{\mathbb{R}} h\, dF^e - \frac{1}{2} \int_{\mathbb{R}} x^2\, dF(x) + F(0+), \quad F \in \mathbb{F}.$$
Note that $J_1(F, h)$ is linear w.r.t. $F \in \mathbb{F}$ for every $h \in \mathrm{Lip}_1$, by definition. According to Theorems 2 and 3 of [25], for any fixed $h \in \mathrm{Lip}_1$, the least upper bound $\sup_{F \in \mathbb{F}} J_1(F, h)$ w.r.t. probability d.f.s $F$ satisfying two linear conditions (we can also fix the value $b^2 \ge 1$ of the second moment and then take the least upper bound w.r.t. all $b \ge 1$) coincides with that over the set of three-point distributions from $\mathbb{F}$. Since every three-point distribution has finite moments of all orders, the condition of finiteness of the second-order moments may be eliminated, so that
$$\sup_{F \in \mathbb{F}} J(F) = \sup_{h \in \mathrm{Lip}_1}\ \sup_{F \in \mathbb{F}_3} J_1(F, h),$$
where $\mathbb{F}_3$ is the space of all discrete probability d.f.s with at most three jumps and unit first moment. Furthermore, according to Hoeffding [26], the least upper bound $\sup_{F \in \mathbb{F}_3} J_1(F, h)$ w.r.t. discrete probability d.f.s $F$ with a finite number of jumps and satisfying one moment condition is attained on two-point distributions; hence,
$$\sup_{F \in \mathbb{F}} J(F) = \sup_{h \in \mathrm{Lip}_1}\ \sup_{F \in \mathbb{F}_2} J_1(F, h) = \sup_{F \in \mathbb{F}_2} J(F),$$
where $\mathbb{F}_2$ is the space of all discrete probability d.f.s with at most two jumps and unit first moment. Therefore, to prove Equation (37), it suffices to show that $J(F) \le 0$ for every $F \in \mathbb{F}_2$.
Let $F$ correspond to a two-point distribution $p \delta_u + (1-p) \delta_v$ with $u < v$ and $p \in [0, 1)$. The condition $\int_{\mathbb{R}} x\, dF(x) = 1$ yields $u < 1 \le v$ and $v = (1 - pu)/(1-p)$, so that there are only three possibilities:
Case 1: $u \le 0 < 1 \le v$ and $p \in [0, 1)$. Then,
$$q = P(X \le 0) = p, \qquad b^2 = E X^2 = \frac{p u^2 - 2pu + 1}{1 - p}, \tag{38}$$
and, by the definition of $F^e$ given in Equation (21), we have
$$F^e(x) = \begin{cases} 0, & \text{for } x \le u, \\ pu - px, & \text{for } u < x \le 0, \\ pu + (1-p)x, & \text{for } 0 < x \le v, \\ 1, & \text{for } x > v. \end{cases}$$
Observing that the difference $F(x) - F^e(x)$ has exactly one sign change, at $x = p(1-u)/(1-p) = v - 1 \in [0, v)$, and using Equation (20), after some elementary calculations we get
$$\zeta_1(F, F^e) = \frac{u^2 p}{2} - up + \frac{(1-u)^2 p^2}{2(1-p)} + \frac{1-p}{2},$$
and, hence,
$$J(F) = \zeta_1(F, F^e) - \frac{p u^2 - 2pu + 1}{2(1-p)} + p = 0,$$
which means that $J(F) = 0$ for an arbitrary two-point probability distribution with unit first moment and a nonpositive atom. Expressing $u$ and $v$ in terms of $q$ and $b^2$ (see Equation (38)), we get Equation (35).
Case 2: $0 < u < 1 \le v$ and $p \in [0, u]$. Then, $q = P(X \le 0) = 0$,
$$F^e(x) = \begin{cases} 0, & \text{for } x \le 0, \\ x, & \text{for } 0 < x \le u, \\ u + (1-p)(x - u), & \text{for } u < x \le v, \\ 1, & \text{for } x > v, \end{cases}$$
and, since $F^e(x) - F(x) \ge 0$ for all $x \in \mathbb{R}$, we get $\zeta_1(F, F^e) = \frac{1}{2} u^2 + \frac{1}{2} (v - u)(u + 1 - 2p) = 1 - \frac{1}{2} E X^2$. Hence,
$$J(F) = \zeta_1(F, F^e) - \frac{1}{2} E X^2 + q = 1 - E X^2 \le 0,$$
since $E X^2 \ge (E X)^2 = 1$ by Jensen's inequality. The equality here and, hence, in Equation (34) is attained in the case of the degenerate distribution $\delta_1$.
Case 3: $0 < u < 1 < v$ and $p \in (u, 1)$. Then, $q = 0$ and $F^e$ has the same form as in the previous case, but the function $F^e(x) - F(x)$ now has exactly one sign change, at $x = p(1-u)/(1-p) = v - 1 \in (u, v)$, and, hence, $\zeta_1(F, F^e) = \frac{1}{2} u^2 + \frac{1}{2} (p - u)^2 \frac{1}{1-p} + \frac{1}{2} (1-p)$. Thus,
$$J(F) = \zeta_1(F, F^e) - \frac{1}{2} E X^2 + q = u^2 - p < 0,$$
since $u^2 < u < p$ in this case, and the equality in Equation (37) (and, hence, in Equation (34)) is not attained. □
Remark 3.
Analyzing the proof, one can make sure that Equation (34) admits a slight improvement:
$$\zeta_1\left( \mathcal{L}(X), \mathcal{L}^e(X) \right) \le \frac{E X^2}{2} - P(X \le 0) - E (1 - X)^2\, \mathbb{1}_{(0,1]}(X)$$
for any r.v. $X$ with $E X = 1$ and finite second moment. The proof differs only by the appearance (subtraction) of the additional term $\int_{(0,1]} (1 - x)^2\, dF(x)$ in the definition in Equation (36) of $J(F)$, which is still linear w.r.t. $F$ and, hence, does not change the logic. One has only to check that the new $J(F)$ is nonpositive for two-point distributions. In Case 1, $J(F)$ is retained. In Cases 2 and 3, the additional term is of the form $p(1-u)^2$, and it can be made sure that this term does not affect the sign of $J(F)$.

4. Stein’s Method

Stein's method, first introduced in [27] for normal approximation, is a powerful technique that allows one to estimate distances with $\zeta$-structure (see Equation (2)) between probability distributions and a fixed target distribution (of a r.v.) $Z$. A complete survey of Stein's method may be found, e.g., in [14]. Suppose that the distance $\zeta_{\mathcal{H}}$ is of the form given in Equation (2) for a specific class $\mathcal{H}$ of real-valued functions. As mentioned in the Introduction, this is the case for both the uniform (Kolmogorov) and Kantorovich distances, with $\mathcal{H} = \{\mathbb{1}_{(-\infty, a)}(\cdot) : a \in \mathbb{R}\}$ and $\mathcal{H} = \mathrm{Lip}_1$, respectively.
The first step of Stein's method is to construct the so-called Stein operator $\mathcal{A}$ on some space $\mathcal{F}$ of real functions, such that
$$E\, \mathcal{A} f(Z) = 0 \quad \forall f \in \mathcal{F}. \tag{39}$$
The second step is to find the solution $f_h$ to the Stein equation
$$\mathcal{A} f_h(x) = h(x) - E h(Z) \tag{40}$$
for every $h \in \mathcal{H}$. Once the solution is found, it becomes possible to estimate the distance between the distributions of $X$ and $Z$ as
$$\zeta_{\mathcal{H}}(X, Z) = \sup_{h \in \mathcal{H}} \left| \int_{\mathbb{R}} h\, dF_X - \int_{\mathbb{R}} h\, dF_Z \right| = \sup_{h \in \mathcal{H}} \left| \int_{\mathbb{R}} h\, dF_X - E h(Z) \right| = \sup_{h \in \mathcal{H}} \left| \int_{\mathbb{R}} \left( h - E h(Z) \right) dF_X \right| = \sup_{h \in \mathcal{H}} \left| \int_{\mathbb{R}} \mathcal{A} f_h\, dF_X \right| = \sup_{h \in \mathcal{H}} \left| E\, \mathcal{A} f_h(X) \right|. \tag{41}$$
The final estimate for $\zeta_{\mathcal{H}}(X, Z)$ is usually derived by bounding the last expression in Equation (41) from above using the properties of the Stein operator $\mathcal{A}$ and those of the solutions $f_h$ to the Stein Equation (40).
It can be made sure that, for $Z \stackrel{d}{=} E \sim \mathrm{Exp}(1)$, the following operator satisfies Equation (39) on the space $\mathcal{F}$ of absolutely continuous functions $f$ with $E|f'(E)| < +\infty$ and thus appears to be a Stein operator:
$$\mathcal{A} f(x) = f'(x) - f(x) + f(0). \tag{42}$$
Peköz and Röllin [13] found an explicit solution to the Stein Equation (40) in this case:
$$f_h(x) = -e^x \int_x^{+\infty} \tilde h(t)\, e^{-t}\, dt, \quad \text{where } \tilde h(t) := h(t) - E h(E), \tag{43}$$
for every $h$ with $E|h(E)| < \infty$. Note that $f_h(0) = 0$.
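The defining property (39) of the operator in Equation (42) is easy to confirm by simulation; a minimal sketch (assuming numpy, with $f = \sin$ as an arbitrary smooth Lipschitz choice of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
Z = rng.exponential(1.0, 1_000_000)

f, fprime = np.sin, np.cos                 # E cos(E) = E sin(E) = 1/2
print(np.mean(fprime(Z) - f(Z) + f(0.0)))  # ~ 0, matching Equation (39)
```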
The following theorem extends the results of Peköz and Röllin ([13], Theorem 2.1) to distributions with no support constraints and provides estimates of the accuracy of the exponential approximation in terms of the Kantorovich distance characterizing the proximity of a distribution to its equilibrium transform.
Theorem 3.
Let $X$ be a square integrable r.v. with $E X = 1$, and let $E \sim \mathrm{Exp}(1)$. Then,
$$\zeta_1(X, E) \le 2\, \zeta_1\left( \mathcal{L}(X), \mathcal{L}^e(X) \right), \qquad \zeta_1\left( \mathcal{L}^e(X), \mathrm{Exp}(1) \right) \le \zeta_1\left( \mathcal{L}(X), \mathcal{L}^e(X) \right),$$
where $\mathcal{L}^e(X)$ is the equilibrium transform of $\mathcal{L}(X)$.
Proof. 
Let $f_h$ be defined by Equation (43). Then, by Equations (41), (42), and (25), we have
$$\zeta_1(X, E) = \sup_{h \in \mathrm{Lip}_1} \left| E\, \mathcal{A} f_h(X) \right| = \sup_{h \in \mathrm{Lip}_1} \left| E f_h'(X) - E f_h(X) \right| = \sup_{h \in \mathrm{Lip}_1} \left| \int_{\mathbb{R}} f_h'\, dF_X - \int_{\mathbb{R}} f_h'\, dF_X^e \right|$$
and
$$\zeta_1\left( \mathcal{L}^e(X), \mathrm{Exp}(1) \right) = \sup_{h \in \mathrm{Lip}_1} \left| \int_{\mathbb{R}} h(x)\, dF_X^e(x) - E h(E) \right| = \sup_{h \in \mathrm{Lip}_1} \left| \int_{\mathbb{R}} \tilde h(x)\, dF_X^e(x) \right| = \sup_{h \in \mathrm{Lip}_1} \left| \int_{\mathbb{R}} \mathcal{A} f_h(x)\, dF_X^e(x) \right| = \sup_{h \in \mathrm{Lip}_1} \left| \int_{\mathbb{R}} f_h'(x)\, dF_X^e(x) - \int_{\mathbb{R}} f_h(x)\, dF_X^e(x) \right| = \sup_{h \in \mathrm{Lip}_1} \left| \int_{\mathbb{R}} f_h(x)\, dF_X(x) - \int_{\mathbb{R}} f_h(x)\, dF_X^e(x) \right|.$$
In Lemma 4.1 of [13] (see also Lemma 5.3 of [14]), it is proved that $f_h \in \mathrm{Lip}_1$ and $f_h' \in \mathrm{Lip}_2$ for $h \in \mathrm{Lip}_1$. This remark, together with the observation that $\mathcal{L}(X)$ and $\mathcal{L}^e(X)$ have finite first moments, immediately leads to the statement of the theorem. □
Less formally, Theorem 3 states that, if $\mathcal{L}(X)$ and $\mathcal{L}^e(X)$ are close, then so are $\mathcal{L}(X)$ and $\mathrm{Exp}(1)$; hence, it may be regarded as a continuity counterpart of the fixed-point property stated in Theorem 1(c).

5. Main Results

Theorem 4.
Let $X_1, X_2, \ldots$ be a sequence of independent square integrable random variables with $E X_n = a \neq 0$, and let $S_n := \sum_{i=1}^{n} X_i$ for $n \in \mathbb{N}$, $S_0 := 0$. Let $p \in (0, 1)$, let $N \sim \mathrm{Geom}(p)$ be independent of all $\{X_n\}$, $N_0 := N - 1$, and let $W := S_N / E S_N = p S_N / a$ and $W_0 := S_{N_0} / E S_{N_0} = p S_{N_0} / (a(1-p))$ be the normalized geometric random sums, $E \sim \mathrm{Exp}(1)$. Then,
$$\zeta_1(W, E) \le \frac{2p}{|a|} \sum_{n=1}^{\infty} P(N = n)\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right) \le p \left( \frac{E X_N^2}{a^2} - 2 P(X_N \le 0) \right), \tag{44}$$
$$\zeta_1(W_0, E) \le \frac{p}{1-p} \cdot \frac{E X_N^2}{a^2}. \tag{45}$$
Before proceeding to the proof, we need the following auxiliary statement.
Lemma 3.
Under the conditions of Theorem 4, we have
$$\zeta_1\left( \mathcal{L}(S_N), \mathcal{L}^e(S_N) \right) \le \sum_{n=1}^{\infty} p(1-p)^{n-1}\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right), \qquad \zeta_1\left( \mathcal{L}(S_{N_0}), \mathcal{L}^e(S_{N_0}) \right) \le \frac{E X_N^2}{2|a|}.$$
Proof. 
Let $F_n$ be the d.f. of $X_n$, $n \in \mathbb{N}$. Then, according to Equation (20), Theorem 1(h), Tonelli's theorem, and the obvious fact that $\mathcal{L}(S_n) = \mathcal{L}(S_{n-1}) * \mathcal{L}(X_n)$, we have
$$\zeta_1\left( \mathcal{L}(S_N), \mathcal{L}^e(S_N) \right) = \int_{\mathbb{R}} \left| F_{S_N}(x) - F_{S_N}^e(x) \right| dx \le \sum_{n=1}^{\infty} p(1-p)^{n-1} \int_{\mathbb{R}} \left| \int_{\mathbb{R}} \left( F_n(x-s) - F_n^e(x-s) \right) dF_{S_{n-1}}(s) \right| dx \le \sum_{n=1}^{\infty} p(1-p)^{n-1} \int_{\mathbb{R}} \int_{\mathbb{R}} \left| F_n(x-s) - F_n^e(x-s) \right| dx\, dF_{S_{n-1}}(s) = \sum_{n=1}^{\infty} p(1-p)^{n-1}\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right),$$
which proves the first claim of the lemma, and, similarly,
$$\zeta_1\left( \mathcal{L}(S_{N_0}), \mathcal{L}^e(S_{N_0}) \right) \le \sum_{n=1}^{\infty} p(1-p)^{n-1} \int_{\mathbb{R}} \int_{\mathbb{R}} \left| \mathbb{1}_{(0,+\infty)}(x-s) - F_n^e(x-s) \right| dF_{S_{n-1}}(s)\, dx = \sum_{n=1}^{\infty} p(1-p)^{n-1} \int_{\mathbb{R}} \left| \mathbb{1}_{(0,+\infty)}(x) - F_n^e(x) \right| dx = \sum_{n=1}^{\infty} p(1-p)^{n-1}\, \zeta_1\left( \delta_0, \mathcal{L}^e(X_n) \right),$$
where $\delta_0$ denotes the Dirac delta-measure concentrated at 0. As can easily be seen from the definition of the equilibrium transform given in Equation (21),
$$\text{if } a > 0, \text{ then } F^e(x) \le 0 \ \text{for } x \le 0 \ \text{ and } F^e(x) \le 1 \ \text{for } x \ge 0; \qquad \text{if } a < 0, \text{ then } F^e(x) \ge 0 \ \text{for } x \le 0 \ \text{ and } F^e(x) \ge 1 \ \text{for } x \ge 0;$$
hence, we may write
$$\left| \mathbb{1}_{(0,+\infty)}(x) - F_n^e(x) \right| = \begin{cases} -F_n^e(x) \operatorname{sign} a, & x \le 0, \\ \left( 1 - F_n^e(x) \right) \operatorname{sign} a, & x > 0, \end{cases}$$
and, also using Equation (28), we obtain
$$\zeta_1\left( \delta_0, \mathcal{L}^e(X_n) \right) = \operatorname{sign} a \cdot \left( -\int_{-\infty}^{0} F_n^e(x)\, dx + \int_0^{+\infty} \left( 1 - F_n^e(x) \right) dx \right) = \operatorname{sign} a \cdot \int_{\mathbb{R}} x\, dF_n^e(x) = \frac{E X_n^2}{2|a|}.$$
The second claim of the lemma now follows by the total probability formula and the independence conditions. □
Proof of Theorem 4. 
Due to the homogeneity of both the Kantorovich metric (Lemma 1(a)) and the equilibrium transform (Theorem 1(f)), without loss of generality we can assume that $a = 1$. The second inequality in Equation (44) is an implication of Theorem 2, so it remains only to prove the first inequality in Equation (44) and the inequality in Equation (45). Indeed, by Theorems 3 and 1(f) and Lemmas 1 and 3, we have
$$\zeta_1(W, E) \le 2\, \zeta_1\left( \mathcal{L}(W), \mathcal{L}^e(W) \right) = 2p\, \zeta_1\left( \mathcal{L}(S_N), \mathcal{L}^e(S_N) \right) \le 2p \sum_{n=1}^{\infty} P(N = n)\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right)$$
and
$$\zeta_1(W_0, E) \le 2\, \zeta_1\left( \mathcal{L}(W_0), \mathcal{L}^e(W_0) \right) = \frac{2p}{1-p}\, \zeta_1\left( \mathcal{L}(S_{N_0}), \mathcal{L}^e(S_{N_0}) \right) \le \frac{p}{1-p}\, E X_N^2. \qquad \square$$
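To gauge how conservative Equation (44) is, the following simulation sketch (assuming numpy and scipy; the i.i.d. choice $X_n \sim N(1, 1)$, with $E X_1^2 = 2$ and $P(X_1 \le 0) = \Phi(-1)$, is ours) compares a Monte Carlo estimate of $\zeta_1(W, E)$ with the right-hand side of Equation (44):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
p = 0.01
N = rng.geometric(p, size=500_000)
W = p * (N + np.sqrt(N) * rng.standard_normal(N.size))  # i.i.d. X_n ~ N(1, 1)

# Sorted-sample estimate of zeta_1(W, E) against exponential quantiles.
u = (np.arange(W.size) + 0.5) / W.size
zeta1_hat = np.mean(np.abs(np.sort(W) - (-np.log1p(-u))))
bound = p * (2.0 - 2.0 * norm.cdf(-1.0))  # p * (E X^2 - 2 P(X <= 0)), a = 1
print(zeta1_hat, bound)  # Monte Carlo estimate vs. the bound of Equation (44)
```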
Corollary 1.
Under the conditions of Theorem 4, if additionally $\sup_n E X_n^2 < \infty$, we have
$$\zeta_1(W, E) \le \frac{2p}{|a|} \sup_n \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right) \le p\, \sup_n \left( \frac{E X_n^2}{a^2} - 2 P(X_n \le 0) \right), \tag{46}$$
$$\zeta_1(W_0, E) \le \frac{p}{(1-p)\, a^2}\, \sup_n E X_n^2. \tag{47}$$
Remark 4.
The right-hand side of Equation (47) is no less than that of Equation (46) because of the factor $\frac{1}{1-p} > 1$ and the absence of the nonpositive term $-2 P(X_n \le 0)$. This agrees with the intuition that $W$ may be closer to $E$ than $W_0$, because $S_N$ contains a.s. one summand more than $S_{N_0}$.
Corollary 2.
Under the conditions of Theorem 4, we have
$$\zeta_2(W, E) \le \frac{3p}{|a|} \sum_{n=1}^{\infty} P(N = n)\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right) \le \frac{3p}{2} \left( \frac{E X_N^2}{a^2} - 2 P(X_N \le 0) \right), \tag{48}$$
$$\zeta_2(W_0, E) \le \frac{p}{1-p} \cdot \frac{3\, E X_N^2}{2 a^2}. \tag{49}$$
Recently, Korolev and Zeifman [28] obtained a bound similar to Equation (49), but with the constant factor $1/2$ on the right-hand side instead of $3/2$, i.e., three times smaller. The estimate in Equation (48) is also worse than Kalashnikov's bound in Equation (10), obtained in the i.i.d. case with $E X_1 = 1$, since Equation (10) with $s = 2$, by Theorem 3, yields
$$\zeta_2(W, E) \le p\, \zeta_1(X_1, E) \le 2p\, \zeta_1\left( \mathcal{L}(X_1), \mathcal{L}^e(X_1) \right),$$
while Equation (48) in the i.i.d. case with $E X_1 = 1$ reduces to
$$\zeta_2(W, E) \le 3p\, \zeta_1\left( \mathcal{L}(X_1), \mathcal{L}^e(X_1) \right),$$
which is 1.5 times greater.
Proof. 
Using subsequently Theorem 1(i,c), the triangle inequality for the Kantorovich metric, Theorem 3, and Lemma 3, together with the homogeneity of the Kantorovich distance and of the equilibrium transform, we obtain
$$\zeta_2(W, E) = \zeta_1\left( \mathcal{L}^e(W), \mathcal{L}^e(E) \right) = \zeta_1\left( \mathcal{L}^e(W), \mathcal{L}(E) \right) \le \zeta_1\left( \mathcal{L}^e(W), \mathcal{L}(W) \right) + \zeta_1(W, E) \le 3\, \zeta_1\left( \mathcal{L}^e(W), \mathcal{L}(W) \right) \le \frac{3p}{|a|} \sum_{n=1}^{\infty} P(N = n)\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right).$$
Similarly,
$$\zeta_2(W_0, E) \le 3\, \zeta_1\left( \mathcal{L}^e(W_0), \mathcal{L}(W_0) \right) \le \frac{3}{2} \cdot \frac{p}{1-p} \cdot \frac{E X_N^2}{a^2}. \qquad \square$$
To study the accuracy of the estimates obtained above in Equations (46) and (47), let us introduce the asymptotically best constant for the Kantorovich distance in the Rényi theorem for geometric random sums of i.i.d. r.v.s, in a way similar to the definition of the asymptotically best constant [29] in the classical Berry–Esseen inequality (see also [3,30,31,32,33,34,35]):
$$C_{\mathrm{AB}} := \sup_{\{X_n\}\ \text{i.i.d.}:\ E X_1 \neq 0,\ E X_1^2 < \infty}\ \limsup_{p \to 0+} \frac{\zeta_1(W, E)\, (E X_1)^2}{p\, E X_1^2}, \tag{50}$$
which serves as a lower bound for the constant $C$ in the inequality
$$\zeta_1(W, E) \le C\, p\, E X_1^2 / (E X_1)^2, \tag{51}$$
even if the latter is supposed to hold only for sufficiently small $p$. Similarly, $C_{\mathrm{AB}}^0$ is defined for $W_0$. The inequality in Equation (46) (similarly, Equation (47)) trivially yields the validity of Equation (51) with $C = 1$ for all $p \in (0, 1)$. Since
$$C \ge C_{\mathrm{AB}},$$
it is easy to conclude that $C_{\mathrm{AB}} \le 1$.
Theorem 5.
For the asymptotically best constants $C_{\mathrm{AB}}$ and $C_{\mathrm{AB}}^0$, defined in Equation (50) for $W$ and $W_0$, respectively, we have
$$C_{\mathrm{AB}} \ge 1/4, \qquad C_{\mathrm{AB}}^0 \ge 1/4.$$
Proof. 
Taking all $X_n \equiv 1$, we get $E X_n = E X_n^2 = 1$ and $W = pN$, $W_0 = p N_0 / (1-p)$, where $N \sim \mathrm{Geom}(p)$ and $N_0 := N - 1$. To estimate $\zeta_1(W, E)$, we use the definition of the Kantorovich distance in Equation (19) and take $h(x) = \frac{1}{t} \sin(tx) \in \mathrm{Lip}_1$ as a test function, where $t \in \mathbb{R} \setminus \{0\}$ is a free parameter to be chosen later. Recalling the ch.f.s of the exponential and the geometric distributions, we obtain
$$E h(E) = \frac{1}{t}\, \mathrm{Im}\, E e^{itE} = \frac{1}{t}\, \mathrm{Im}\, \frac{1}{1 - it} = \frac{1}{1 + t^2},$$
$$E h(W) = E h(pN) = \frac{1}{t}\, \mathrm{Im}\, E e^{itpN} = \frac{1}{t}\, \mathrm{Im}\, \frac{p e^{itp}}{1 - (1-p) e^{itp}} = \frac{1}{t} \cdot \frac{\mathrm{Im}\left( p e^{itp} \left( 1 - (1-p) e^{-itp} \right) \right)}{1 + (1-p)^2 - 2(1-p) \cos(tp)} = \frac{p \sin(tp)}{t \left( p^2 + 2(1-p)\left( 1 - \cos(tp) \right) \right)},$$
$$E h(W_0) = E h\!\left( \frac{p N_0}{1-p} \right) = \frac{1}{t}\, \mathrm{Im}\, E e^{itp N_0 / (1-p)} = \frac{p(1-p) \sin \frac{tp}{1-p}}{t \left( p^2 + 2(1-p)\left( 1 - \cos \frac{tp}{1-p} \right) \right)}.$$
Thus,
$$C_{\mathrm{AB}} \ge \limsup_{p \to 0+}\ \sup_{t \neq 0} \frac{\left| E h(W) - E h(E) \right|}{p} \ge \sup_{t \neq 0}\ \lim_{p \to 0+} \frac{\left| E h(W) - E h(E) \right|}{p} = \sup_{t \neq 0}\ \lim_{p \to 0+} \frac{t^3 p^3 + o(p^3)}{t (t^2 + 1)^2 p^3 + o(p^3)} = \sup_{t \neq 0} \frac{t^2}{(t^2 + 1)^2} = 1/4,$$
and, similarly,
$$C_{\mathrm{AB}}^0 \ge \sup_{t \neq 0}\ \lim_{p \to 0+} \frac{\left| E h(W_0) - E h(E) \right|}{p} = \sup_{t \neq 0} \frac{t^2}{(t^2 + 1)^2} = 1/4. \qquad \square$$
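The limit computation in the proof can be traced numerically; this sketch (assuming numpy) evaluates $|E h(W) - E h(E)|/p$ over a grid of $t$ and shows its supremum approaching $1/4$ as $p \to 0$, with the maximizer near $t = 1$:

```python
import numpy as np

def gap(t, p):
    # |E h(W) - E h(E)| for h(x) = sin(tx)/t and X_n = 1, W = pN, as in the proof
    EhE = 1.0 / (1.0 + t * t)
    EhW = p * np.sin(t * p) / (t * (p * p + 2.0 * (1.0 - p) * (1.0 - np.cos(t * p))))
    return abs(EhW - EhE)

ts = np.linspace(0.05, 5.0, 2000)
for p in (0.1, 0.01, 0.001):
    print(p, max(gap(t, p) for t in ts) / p)  # approaches 1/4
```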
Theorem 1(h) allows extending Theorem 4 to non-geometric random sums of independent random variables with arbitrary means of identical signs. Namely, the following statement holds.
Theorem 6.
Let $X_1, X_2, \ldots$ be a sequence of independent random variables with
$$a_n := E X_n > 0, \qquad b_n := E X_n^2 < \infty, \quad n \in \mathbb{N},$$
and let $S_n := \sum_{i=1}^{n} X_i$ for $n \in \mathbb{N}$, $S_0 := 0$. Let $N$ be a $\mathbb{N}_0$-valued r.v. independent of everything else, with
$$A := E S_N = \sum_{n=1}^{\infty} a_n P(N \ge n) < \infty,$$
and let $M$ be a $\mathbb{N}$-valued r.v. with the distribution
$$P(M = m) = \frac{a_m}{A}\, P(N \ge m), \quad m \in \mathbb{N}.$$
Assume also that $E S_M < \infty$. Then, with $W := S_N / E S_N = A^{-1} S_N$, for any joint distribution $\mathcal{L}(N, M)$, we have
$$\zeta_1(W, E) \le 2 A^{-1} \left( \sup_n E|X_n| \cdot E|N - M| + \sum_{m \in \mathbb{N}} P(M = m)\, \zeta_1\left( \mathcal{L}(X_m), \mathcal{L}^e(X_m) \right) \right) \tag{52}$$
$$\le 2 A^{-1} \left( \sup_n E|X_n| \cdot E|N - M| + E\left( \frac{b_M}{2 a_M} - a_M \cdot P(X_M \le 0 \mid M) \right) \right). \tag{53}$$
Remark 5.
If both expectations $E N$ and $E M$ are finite, then $E|N - M|$ in Equations (52) and (53) can be replaced with $\zeta_1(N, M)$.
Remark 6.
Theorem 6 reduces to ([13], Theorem 3.1) in the case of nonnegative $\{X_n\}$, and to Theorem 4, Equation (44), in the case of $N \sim \mathrm{Geom}(p)$ and identical $a := E X_n \neq 0$, $n \in \mathbb{N}$. For shifted geometric $N$, i.e., $P(N = n) = p(1-p)^n$, $n \in \mathbb{N}_0$, under the assumptions of Theorem 4, Theorem 6 yields the bound
$$\zeta_1(W_0, E) \le \frac{p}{1-p} \left( \frac{2 \sup_n E|X_n|}{|a|} + \frac{2}{|a|} \sum_{n \in \mathbb{N}} P(N = n-1)\, \zeta_1\left( \mathcal{L}(X_n), \mathcal{L}^e(X_n) \right) \right) \le \frac{p}{1-p} \left( \frac{E X_{N+1}^2}{a^2} + \frac{2 \sup_n E|X_n|}{|a|} - 2 P(X_{N+1} \le 0) \right),$$
whose rightmost part is, generally speaking, worse than the estimate in Equation (45) (for example, in the i.i.d. case), since $E|X_n| \ge |a|$ for all $n \in \mathbb{N}$ and $P(X_{N+1} \le 0) \le 1$.
Proof of Theorem 6. 
By Theorem 3 and the homogeneity of the Kantorovich distance and of the equilibrium transform (see Lemma 1(a) and Theorem 1(f)), we have
$$\zeta_1(W, E) \le 2\, \zeta_1\left( \mathcal{L}(W), \mathcal{L}^e(W) \right) = 2 A^{-1}\, \zeta_1\left( \mathcal{L}(S_N), \mathcal{L}^e(S_N) \right). \tag{54}$$
Let us bound $\zeta_1\left( \mathcal{L}(S_N), \mathcal{L}^e(S_N) \right)$ from above.
For a given joint distribution $\mathcal{L}(N, M)$, let $p_{nm} := P(N = n, M = m)$, $n \in \mathbb{N}_0$, $m \in \mathbb{N}$. Denoting $S_{j,k} := \sum_{i=j}^{k} X_i$ for $j \le k$ and using the representation in Equation (20) and Theorem 1(h), we have
$$\zeta_1\left( \mathcal{L}(S_N), \mathcal{L}^e(S_N) \right) = \int_{\mathbb{R}} \left| F_{S_N}(x) - F_{S_N}^e(x) \right| dx = \int_{\mathbb{R}} \left| F_{S_N}(x) - \left( F_{S_{M-1}} * F_{X_M}^e \right)(x) \right| dx = \int_{\mathbb{R}} \left| \sum_{n \in \mathbb{N}_0,\, m \in \mathbb{N}} p_{nm} \left( F_{S_n}(x) - \left( F_{S_{m-1}} * F_{X_m}^e \right)(x) \right) \right| dx \le \sum_{n,m} p_{nm} \int_{\mathbb{R}} \left| F_{S_n}(x) - \left( F_{S_{m-1}} * F_{X_m}^e \right)(x) \right| dx \le \sum_{n < m} p_{nm} \int_{\mathbb{R}} \left| \mathbb{1}_{(0,+\infty)}(x) - \left( F_{S_{n+1,m-1}} * F_{X_m}^e \right)(x) \right| dx + \sum_{n \ge m} p_{nm} \int_{\mathbb{R}} \left| F_{S_{m,n}}(x) - F_{X_m}^e(x) \right| dx.$$
Adding and subtracting $F_{S_{n+1,m}}(x)$ under the modulus sign in the integrands of the first sum (w.r.t. $n < m$) and $F_{X_m}(x)$ in those of the second one (w.r.t. $n \ge m$), and further using the triangle inequality and Lemma 1(b), we obtain
$$\zeta_1\left( \mathcal{L}(S_N), \mathcal{L}^e(S_N) \right) \le \sum_{n < m} p_{nm}\, \zeta_1\left( \delta_0, S_{n+1,m} \right) + \sum_{n \ge m} p_{nm}\, \zeta_1\left( S_{m+1,n}, \delta_0 \right) + \sum_{n,m} p_{nm}\, \zeta_1\left( \mathcal{L}(X_m), \mathcal{L}^e(X_m) \right) = \sum_{n,m} p_{nm}\, E\left| \sum_{i = (n \wedge m) + 1}^{n \vee m} X_i \right| + \sum_{m \in \mathbb{N}} P(M = m)\, \zeta_1\left( \mathcal{L}(X_m), \mathcal{L}^e(X_m) \right) \le \sup_i E|X_i| \cdot \sum_{n,m} p_{nm} |n - m| + \sum_{m \in \mathbb{N}} P(M = m)\, \zeta_1\left( \mathcal{L}(X_m), \mathcal{L}^e(X_m) \right) = \sup_i E|X_i| \cdot E|N - M| + \sum_{m \in \mathbb{N}} P(M = m)\, \zeta_1\left( \mathcal{L}(X_m), \mathcal{L}^e(X_m) \right).$$
Substituting the latter bound into Equation (54) yields Equation (52). The bound in Equation (53) follows from Equation (52) by Theorem 2 (see also Remark 2). □

Author Contributions

Conceptualization, I.S.; methodology, I.S. and M.T.; formal analysis, I.S. and M.T.; investigation, I.S. and M.T.; writing—original draft preparation, I.S. and M.T.; writing—review and editing, I.S. and M.T.; supervision, I.S.; funding acquisition, I.S. All authors have read and agreed to the published version of the manuscript.

Funding

The results of Section 1, Section 2 and Section 3 (including Theorem 1) were obtained with the support of the Russian Science Foundation, project No. 18-11-00155. The rest of the study was funded by RFBR, project number 20-31-70054, and by the grant of the President of Russia No. MD–189.2019.1.

Acknowledgments

The authors would like to thank Professor Victor Korolev for the careful editing of the manuscript and the anonymous referee for a suggestion resulting in Theorem 6.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
r.v.    random variable
i.i.d.  independent, identically distributed
d.f.    distribution function
ch.f.   characteristic function
a.s.    almost sure(ly)
a.c.    absolute continuity, absolutely continuous
w.r.t.  with respect to

References

  1. Zolotarev, V.M. Probability metrics. Theory Probab. Appl. 1984, 28, 278–302.
  2. Zolotarev, V.M. Ideal metrics in the problems of probability theory and mathematical statistics. Austral. J. Statist. 1979, 21, 193–208.
  3. Zolotarev, V.M. Modern Theory of Summation of Random Variables; VSP: Utrecht, The Netherlands, 1997.
  4. Rachev, S. Probability Metrics and the Stability of Stochastic Models; John Wiley and Sons: Chichester, UK, 1991.
  5. Mattner, L.; Shevtsova, I.G. An optimal Berry–Esseen type theorem for integrals of smooth functions. Lat. Am. J. Probab. Math. Stat. 2019, 16, 487–530.
  6. Cambanis, S.; Simons, G.; Stout, W. Inequalities for Ek(X,Y) when the marginals are fixed. Z. Wahrsch. Verw. Geb. 1976, 36, 285–294.
  7. Solovyev, A.D. Asymptotic behaviour of the time of first occurrence of a rare event. Engng. Cybern. 1971, 9, 1038–1048.
  8. Brown, M. Error bounds for exponential approximations of geometric convolutions. Ann. Probab. 1990, 18, 1388–1402.
  9. Kalashnikov, V.V.; Vsekhsvyatskii, S.Y. Metric estimates of the first occurrence time in regenerative processes. In Stability Problems for Stochastic Models; Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 1985; Volume 1155, pp. 102–130.
  10. Kruglov, V.M.; Korolev, V.Y. Limit Theorems for Random Sums; Moscow State University: Moscow, Russia, 1990. (In Russian)
  11. Sugakova, E.V. Estimates in the Rényi theorem for differently distributed terms. Ukr. Math. J. 1995, 47, 1128–1134.
  12. Kalashnikov, V.V. Geometric Sums: Bounds for Rare Events with Applications: Risk Analysis, Reliability, Queueing; Mathematics and Its Applications; Springer: Dordrecht, The Netherlands, 1997.
  13. Peköz, E.A.; Röllin, A. New rates for exponential approximation and the theorems of Rényi and Yaglom. Ann. Probab. 2011, 39, 587–608.
  14. Ross, N. Fundamentals of Stein's method. Probab. Surv. 2011, 8, 210–293.
  15. Hung, T.L. On the rate of convergence in limit theorems for geometric sums. Southeast Asian J. Sci. 2013, 2, 117–130.
  16. Bogachev, V.I. Weak Convergence of Measures; Mathematical Surveys and Monographs, Volume 234; American Mathematical Society: Providence, RI, USA, 2018.
  17. Glivenko, V.I. Stieltjes Integral, 2nd ed.; URSS: Moscow, Russia, 2007.
  18. Bogachev, V.I. Measure Theory, Vols. I, II; Springer: Berlin, Germany, 2007.
  19. Folland, G.B. Real Analysis, 2nd ed.; John Wiley & Sons: New York, NY, USA, 1999.
  20. Feller, W. An Introduction to Probability Theory and Its Applications, Vol. II, 2nd ed.; John Wiley: New York, NY, USA, 1971.
  21. Asmussen, S. Applied Probability and Queues; Springer: New York, NY, USA, 2003.
  22. Lin, X.S. Integrated tail distribution. In Encyclopedia of Actuarial Science; John Wiley & Sons: Chichester, UK, 2006.
  23. Harkness, W.L.; Shantaram, R. Convergence of a sequence of transformations of distribution functions. Pacific J. Math. 1969, 31, 403–415.
  24. Shantaram, R.; Harkness, W.L. On a certain class of limit distributions. Ann. Math. Stat. 1972, 43, 2067–2071.
  25. Mulholland, H.P.; Rogers, C.A. Representation theorems for distribution functions. Proc. Lond. Math. Soc. 1958, s3-8, 177–223.
  26. Hoeffding, W. The extrema of the expected value of a function of independent random variables. Ann. Math. Stat. 1955, 26, 268–275.
  27. Stein, C. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory; University of California Press: Berkeley, CA, USA, 1972; pp. 583–602.
  28. Korolev, V.; Zeifman, A. Bounds for convergence rate in laws of large numbers for mixed Poisson random sums. arXiv 2020, arXiv:2003.12495.
  29. Esseen, C.G. A moment inequality with an application to the central limit theorem. Skand. Aktuarietidskr. 1956, 39, 160–170.
  30. Kolmogorov, A.N. Some recent works in the field of limit theorems of probability theory. Bull. Mosc. Univ. 1953, 10, 29–38. (In Russian)
  31. Chistyakov, G.P. Asymptotically proper constants in the Lyapunov theorem. J. Math. Sci. 1999, 93, 480–483.
  32. Chistyakov, G.P. A new asymptotic expansion and asymptotically best constants in Lyapunov's theorem. I. Theory Probab. Appl. 2002, 46, 226–242.
  33. Chistyakov, G.P. A new asymptotic expansion and asymptotically best constants in Lyapunov's theorem. II. Theory Probab. Appl. 2002, 46, 516–522.
  34. Chistyakov, G.P. A new asymptotic expansion and asymptotically best constants in Lyapunov's theorem. III. Theory Probab. Appl. 2003, 47, 395–414.
  35. Shevtsova, I.G. On the asymptotically exact constants in the Berry–Esseen–Katz inequality. Theory Probab. Appl. 2011, 55, 225–252.
