1. Introduction
All random elements appearing in this paper are defined on the same probability space, say $(\Omega, \mathcal{A}, P)$. A random sum is a quantity such as $\sum_{i=1}^{T_n} X_i$, where $X = (X_1, X_2, \ldots)$ is a sequence of real random variables and $(T_n : n \ge 1)$ a sequence of $\mathbb{N}$-valued random indices. In the sequel, in addition to $X$ and $(T_n)$, we fix a sequence $(a_n : n \ge 1)$ of positive constants such that $a_n \to \infty$, and we let
$$S_n = \frac{\sum_{i=1}^{T_n} X_i}{a_n}.$$
Random sums find applications in a number of frameworks, including statistical inference, risk theory and insurance, reliability theory, economics, finance, and forecasting of market changes. Accordingly, the asymptotic behavior of $S_n$, as $n \to \infty$, is a classical topic in probability theory. The related literature is huge and we do not try to summarize it here. We just mention a general textbook [1] and some useful recent references [2,3,4,5,6,7,8,9,10].
In this paper, the asymptotic behavior of $S_n$ is investigated in the (important) special case where $X$ is exchangeable and independent of $(T_n)$. More precisely, we assume that:
- (i) $X$ is exchangeable;
- (ii) $X$ is independent of $(T_n)$;
- (iii) $T_n / a_n \xrightarrow{P} V$ for some random variable $V$.
Under such conditions, we prove a weak law of large numbers (WLLN), a central limit theorem (CLT), and we investigate the rate of convergence with respect to the total variation distance.
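Before proceeding, it may help to see conditions (i)–(iii) in action. The following Python sketch is purely illustrative and not taken from the paper: it builds an exchangeable $X$ as a Gaussian de Finetti mixture ($X_i = \theta + \epsilon_i$ with a random level $\theta$), draws indices $T_n$ independent of $X$ with $T_n/a_n \to V$, and evaluates the normalized random sum; every distributional choice (normal, uniform, Poisson, $a_n = n$) is an assumption made only for the demo.

    import numpy as np

    rng = np.random.default_rng(0)

    def random_sum_sample(n, size=10_000):
        # S_n = (sum_{i<=T_n} X_i) / a_n with a_n = n (illustrative choice).
        # X is exchangeable: X_i = theta + eps_i, eps_i i.i.d. N(0, 1),
        # theta a random level (de Finetti mixture), so E(X_1 | tail) = theta.
        # T_n ~ Poisson(n V) with V ~ Uniform(1, 2), drawn independently of X,
        # hence T_n / a_n -> V in probability: conditions (i)-(iii) hold.
        theta = rng.normal(0.0, 1.0, size)
        V = rng.uniform(1.0, 2.0, size)
        T = rng.poisson(n * V)
        # sum_{i<=T} X_i = T * theta + N(0, T), by Gaussianity of the eps_i
        S = (T * theta + np.sqrt(T) * rng.normal(0.0, 1.0, size)) / n
        return S, V, theta

    S, V, theta = random_sum_sample(n=2000)
    print(np.mean(np.abs(S - V * theta)))   # small: S_n is already close to V * theta

Here $V\theta$ is exactly the limit $L$ introduced in the next paragraph.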
Suppose in fact $E|X_1| < \infty$ and conditions (i)-(ii)-(iii) hold. Define
$$L = V\, E(X_1 \mid \mathcal{T}),$$
where $V$ is the random variable involved in condition (iii) and $\mathcal{T}$ the tail $\sigma$-field of $X$. Then, it is not hard to show that $S_n \xrightarrow{P} L$. To obtain a CLT, instead, is not straightforward. In Section 3, we prove that $\sqrt{a_n}\,(S_n - L) \longrightarrow M$ in distribution, where $M$ is a suitable random variable, provided $E(X_1^2) < \infty$ and $\sqrt{a_n}\,\big(T_n/a_n - V\big)$ converges stably. Finally, in Section 4, assuming $X$ i.i.d. and some additional conditions, we find constants $u_n$ and $v_n$ such that
$$d_{TV}(S_n, L) \le u_n \quad\text{and}\quad d_{TV}\big(\sqrt{a_n}\,(S_n - L),\, M\big) \le v_n.$$
In particular, $S_n \to L$ or $\sqrt{a_n}\,(S_n - L) \to M$ in total variation distance provided $u_n \to 0$ or $v_n \to 0$, as it happens in some situations.
A last note is that, to our knowledge, random sums have been rarely investigated when $X$ is exchangeable. Similarly, convergence of $S_n$ or $\sqrt{a_n}\,(S_n - L)$ in total variation distance is usually not taken into account. This paper contributes to filling this gap.
  2. Preliminaries
In the sequel, the probability distribution of any random element $U$ is denoted by $\mathcal{L}(U)$. If $S$ is a topological space, $\mathcal{B}(S)$ is the Borel $\sigma$-field on $S$ and $C_b(S)$ the space of real bounded continuous functions on $S$. The total variation distance between two probability measures on $\mathcal{B}(S)$, say $\mu$ and $\nu$, is
$$d_{TV}(\mu, \nu) = \sup_{A \in \mathcal{B}(S)}\, |\mu(A) - \nu(A)|.$$
With a slight abuse of notation, if $X$ and $Y$ are $S$-valued random variables, we write $d_{TV}(X, Y)$ instead of $d_{TV}\big(\mathcal{L}(X), \mathcal{L}(Y)\big)$, namely
$$d_{TV}(X, Y) = \sup_{A \in \mathcal{B}(S)}\, \big|P(X \in A) - P(Y \in A)\big|.$$
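When the two laws are discrete, the sup above is attained and equals half the $\ell^1$ distance between the probability mass functions; a minimal Python sketch of this elementary identity:

    import numpy as np

    def tv_distance(p, q):
        # Total variation distance between two probability vectors supported
        # on the same finite set: sup_A |p(A) - q(A)| = 0.5 * sum |p - q|.
        p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
        return 0.5 * np.abs(p - q).sum()

    print(tv_distance([0.5, 0.5], [0.4, 0.6]))   # 0.1 for Bernoulli(1/2) vs Bernoulli(0.6)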
If $X$ is a real random variable, we say that $X$ is absolutely continuous to mean that $\mathcal{L}(X)$ is absolutely continuous with respect to the Lebesgue measure. The following technical fact is useful in Section 4.
Lemma 1. Let X be a strictly positive random variable. Then,
$$\lim_n d_{TV}\big(c_n X,\, c X\big) = 0,$$
provided the $c_n$ are constants such that $c_n \to c > 0$ and $\mathcal{L}(X)$ is absolutely continuous.  Proof.  Let $f$ be a density of $X$. Since $\int |f - g_n|\, dx \to 0$, for some sequence $(g_n)$ of continuous densities, it can be assumed that $f$ is continuous. Furthermore, since $P(X > 0) = 1$, for each $\epsilon > 0$ there is $b > 0$ such that $P(X \le b) < \epsilon$. For such a $b$, one obtains
$$d_{TV}\big(c_n X,\, c X\big) \le P(X \le b) + d_{TV}\big(c_n X_b,\, c X_b\big) \le \epsilon + d_{TV}\big(c_n X_b,\, c X_b\big),$$
where $X_b$ is a random variable distributed as $X$ conditionally on $\{X > b\}$. Hence, it can also be assumed that $X \ge b$ a.s. for some $b > 0$.
Let $f_n$ be a density of $c_n X$. Since
$$d_{TV}\big(c_n X,\, c X\big) = \frac{1}{2} \int \big| f_n(x) - g(x) \big|\, dx,$$
where $g(x) = \frac{1}{c}\, f\big(x/c\big)$ is a density of $cX$, by Scheffé's theorem it suffices to show that $f_n(x) \to g(x)$ for each $x$. To prove the latter fact, note that, for large $n$, one obtains $c_n > 0$. In this case, $f_n$ vanishes on $(-\infty,\, c_n b]$ and $f_n$ can be written as
$$f_n(x) = \frac{1}{c_n}\, f\big(x / c_n\big).$$
Therefore, $f_n(x) \to g(x)$ follows from the continuity of $f$ and
$$\lim_n \frac{1}{c_n}\, f\big(x / c_n\big) = \frac{1}{c}\, f\big(x / c\big).$$
 ☐
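Lemma 1 is easy to probe numerically: for absolutely continuous laws, $d_{TV}$ is half the $L^1$ distance between densities. The sketch below uses illustrative choices only ($X$ exponential, $c = 1$, $c_n = 1 + 1/n$) and evaluates the integral on a grid:

    import numpy as np

    x = np.linspace(1e-6, 30.0, 200_001)
    dx = x[1] - x[0]
    f = lambda t: np.exp(-t)            # density of X ~ Exponential(1)
    c = 1.0
    g = f(x / c) / c                    # density of c X

    for n in (2, 10, 100, 1000):
        cn = c + 1.0 / n
        fn = f(x / cn) / cn             # density of c_n X
        print(n, round(0.5 * np.sum(np.abs(fn - g)) * dx, 5))
    # the printed values decrease to 0, in agreement with Lemma 1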
   Stable Convergence
Stable convergence, introduced by Rényi in [11], is a strong form of convergence in distribution. It actually occurs in a number of frameworks, including the classical CLT, and thus it quickly became popular; see, e.g., [12] and references therein. Here, we just recall the basic definition.
Let $S$ be a metric space, $(Y_n)$ a sequence of $S$-valued random variables, and $K$ a kernel (or a random probability measure) on $S$. The latter is a map $K$ on $\Omega$ such that $K(\omega)$ is a probability measure on $\mathcal{B}(S)$, for each $\omega \in \Omega$, and $\omega \mapsto K(\omega)(B)$ is $\mathcal{A}$-measurable for each $B \in \mathcal{B}(S)$. Say that $(Y_n)$ converges stably to K if
$$\lim_n E\big(f(Y_n) \mid H\big) = E\big(K(f) \mid H\big) \qquad (1)$$
for all $f \in C_b(S)$ and $H \in \mathcal{A}$ with $P(H) > 0$, where $K(f) = \int_S f(x)\, K(\cdot)(dx)$.
More generally, take a sub-$\sigma$-field $\mathcal{G} \subset \mathcal{A}$ and suppose $K$ is $\mathcal{G}$-measurable (i.e., $\omega \mapsto K(\omega)(B)$ is $\mathcal{G}$-measurable for fixed $B \in \mathcal{B}(S)$). Then, $(Y_n)$ converges $\mathcal{G}$-stably to K if condition (1) holds whenever $f \in C_b(S)$ and $H \in \mathcal{G}$ with $P(H) > 0$.
An important special case is when $K$ is a trivial kernel, in the sense that
$$K(\omega) = \nu \quad\text{for each } \omega \in \Omega,$$
where $\nu$ is a fixed probability measure on $\mathcal{B}(S)$. In this case, $(Y_n)$ converges $\mathcal{G}$-stably to $\nu$ if and only if
$$\lim_n E\big(g\, f(Y_n)\big) = \nu(f)\, E(g)$$
whenever $f \in C_b(S)$ and $g$ is bounded and $\mathcal{G}$-measurable.
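As a concrete instance of condition (1): the classical CLT for i.i.d. centered variables holds stably with a trivial Gaussian kernel ("mixing" in Rényi's terminology), so conditioning on an event such as $H = \{X_1 > 0\}$ does not change the limit. A numerical sketch with $f = \cos$ (all choices here are for illustration only):

    import numpy as np

    rng = np.random.default_rng(1)

    # Y_n = (sum_{i<=n} X_i) / sqrt(n) with X_i i.i.d. signs (mean 0, variance 1).
    # Condition (1) with f = cos and H = {X_1 > 0} predicts
    # E(cos(Y_n) | H) -> E(cos(N(0,1))) = exp(-1/2), the same as without H.
    n, size = 10_000, 500_000
    X1 = rng.choice([-1.0, 1.0], size)
    rest = 2.0 * rng.binomial(n - 1, 0.5, size) - (n - 1)   # sum of the other n-1 signs
    Y = (X1 + rest) / np.sqrt(n)
    H = X1 > 0

    print(np.cos(Y[H]).mean(), np.cos(Y).mean(), np.exp(-0.5))   # all ~ 0.6065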
3. WLLN and CLT for Random Sums
In this section, we still let
$$L = V\, E(X_1 \mid \mathcal{T}),$$
where $V$ is the random variable involved in condition (iii) and
$$\mathcal{T} = \bigcap_n \sigma(X_n, X_{n+1}, \ldots)$$
is the tail $\sigma$-field of $X$. Recall that $S_n = \big(\sum_{i=1}^{T_n} X_i\big)/a_n$. Recall also that, by de Finetti's theorem, $X$ is exchangeable if and only if it is i.i.d. conditionally on $\mathcal{T}$, namely
$$P\big(X_1 \in A_1, \ldots, X_n \in A_n \mid \mathcal{T}\big) = \prod_{i=1}^n P\big(X_1 \in A_i \mid \mathcal{T}\big) \quad\text{a.s.}$$
for all $n \ge 1$ and all $A_1, \ldots, A_n \in \mathcal{B}(\mathbb{R})$.
The following WLLN is straightforward.
Theorem 1. If $E|X_1| < \infty$ and conditions (i) and (iii) hold, then $S_n \xrightarrow{P} L$.
 Proof.  Recall that, if $(Y_n)$ and $Y$ are any real random variables, $Y_n \xrightarrow{P} Y$ if and only if, for each subsequence $(n_j)$, there is a sub-subsequence $(n_{j_k})$ such that $Y_{n_{j_k}} \longrightarrow Y$ a.s. Fix a subsequence $(n_j)$. Then, by (iii),
$$\frac{T_{n_{j_k}}}{a_{n_{j_k}}} \longrightarrow V \quad\text{a.s.}$$
along a suitable sub-subsequence $(n_{j_k})$. Since $a_{n_{j_k}} \to \infty$, then $T_{n_{j_k}} \to \infty$ a.s. on $\{V > 0\}$, while $S_{n_{j_k}} \to 0 = L$ a.s. on $\{V = 0\}$ (the averages $\frac{1}{m}\sum_{i=1}^m X_i$ being a.s. bounded in $m$). As a result of the SLLN for exchangeable sequences, $\frac{1}{m}\sum_{i=1}^m X_i \longrightarrow E(X_1 \mid \mathcal{T})$ a.s. as $m \to \infty$. Therefore, a.s. on $\{V > 0\}$,
$$S_{n_{j_k}} = \frac{T_{n_{j_k}}}{a_{n_{j_k}}} \cdot \frac{\sum_{i=1}^{T_{n_{j_k}}} X_i}{T_{n_{j_k}}} \longrightarrow V\, E(X_1 \mid \mathcal{T}) = L.$$
 ☐
 For definiteness, Theorem 1 has been stated in terms of convergence in probability, but other analogous results are available. As an example, suppose that $E|X_1| < \infty$ and conditions (i)–(ii) are satisfied. Then, $S_n \longrightarrow L$ in distribution provided $T_n/a_n \longrightarrow V$ in distribution. This follows from the Skorohod representation theorem and the current version of Theorem 1. Similarly, $S_n \longrightarrow L$ a.s. or in $L^1$ whenever $T_n/a_n \longrightarrow V$ a.s. or in $L^1$.
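Theorem 1 is easily visualized by simulation. The sketch below reuses the toy model of the Introduction (Gaussian de Finetti mixture, Poisson indices, $a_n = n$; hypothetical choices, not from the paper) and estimates $P(|S_n - L| > 0.1)$ for growing $n$:

    import numpy as np

    rng = np.random.default_rng(2)

    size = 100_000
    theta = rng.normal(0.0, 1.0, size)          # E(X_1 | tail) = theta
    V = rng.uniform(1.0, 2.0, size)             # limit of T_n / a_n
    L = V * theta

    for n in (10, 100, 1000, 10_000):
        T = rng.poisson(n * V)
        S = (T * theta + np.sqrt(T) * rng.normal(0.0, 1.0, size)) / n
        print(n, np.mean(np.abs(S - L) > 0.1))  # -> 0, i.e. S_n -> L in probability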
We also note that, as implicit in the proof of Theorem 1, condition (iii) implies $T_n \xrightarrow{P} \infty$ on the set $\{V > 0\}$, or equivalently
$$\lim_n P\big(\{T_n \le c\} \cap \{V > 0\}\big) = 0 \quad\text{for every constant } c > 0.$$
We next turn to the CLT. It is convenient to begin with the i.i.d. case. From now on, $b = E(X_1)$, $\sigma^2 = \mathrm{Var}(X_1)$, and $U$ and $Z$ are two real random variables such that
$$Z \text{ is independent of } V \quad\text{and}\quad \mathcal{L}\big(U \mid V, Z\big) = \mathcal{N}\big(0,\, \sigma^2\, V\big). \qquad (2)$$
Theorem 2. Suppose $X$ is i.i.d., $E(X_1^2) < \infty$, condition (ii) holds, and
$$\sqrt{a_n}\,\Big(\frac{T_n}{a_n} - V\Big) \longrightarrow Z \quad\text{stably}. \qquad (3)$$
Then, $\sqrt{a_n}\,(S_n - L) \longrightarrow U + bZ$ in distribution.  Proof.  Since $X$ is i.i.d., $E(X_1 \mid \mathcal{T}) = b$ a.s., so that $L = bV$ a.s. Define
$$C_n = \frac{\sum_{i=1}^{T_n} (X_i - b)}{\sqrt{a_n}} \quad\text{and}\quad D_n = \sqrt{a_n}\,\Big(\frac{T_n}{a_n} - V\Big),$$
and note that $\sqrt{a_n}\,(S_n - L) = C_n + b\,D_n$ for every $n$.
Therefore, it suffices to prove $C_n + b\,D_n \longrightarrow U + bZ$ in distribution. We prove the latter fact by means of characteristic functions.
Fix $t \in \mathbb{R}$ and let $\phi$ denote the characteristic function of $X_1 - b$. Since, by (ii), $X$ is independent of $(T_n)$, while $V$ and $D_n$ are (up to null sets) measurable with respect to $\mathcal{G} = \sigma(T_1, T_2, \ldots)$, conditioning on $\mathcal{G}$ yields
$$E\Big(e^{it(C_n + bD_n)}\Big) = E\Big(\phi\big(t/\sqrt{a_n}\big)^{T_n}\; e^{itbD_n}\Big).$$
In addition, for each $v \ge 0$ and each sequence of integers $(m_n)$ with $m_n/a_n \to v$, the classical CLT yields
$$\lim_n \phi\big(t/\sqrt{a_n}\big)^{m_n} = e^{-\frac{t^2\sigma^2 v}{2}}. \qquad (4)$$
Since condition (3) implies condition (iii), $T_n/a_n \xrightarrow{P} V$. Hence, as a result of (4) and a subsequence argument, one obtains
$$\phi\big(t/\sqrt{a_n}\big)^{T_n} - e^{-\frac{t^2\sigma^2 V}{2}} \xrightarrow{P} 0.$$
Since all the involved quantities are bounded by 1, it follows that
$$E\Big(e^{it(C_n + bD_n)}\Big) = E\Big(e^{-\frac{t^2\sigma^2 V}{2}}\; e^{itbD_n}\Big) + o(1).$$
Finally, since $\mathcal{L}(U \mid V, Z) = \mathcal{N}(0, \sigma^2 V)$ and Z is independent of V,
$$E\Big(e^{it(U + bZ)}\Big) = E\Big(E\big(e^{itU} \mid V, Z\big)\, e^{itbZ}\Big) = E\Big(e^{-\frac{t^2\sigma^2 V}{2}}\, e^{itbZ}\Big).$$
Therefore,
$$\lim_n E\Big(e^{it(C_n + bD_n)}\Big) = \lim_n E\Big(e^{-\frac{t^2\sigma^2 V}{2}}\, e^{itbD_n}\Big) = E\Big(e^{-\frac{t^2\sigma^2 V}{2}}\, e^{itbZ}\Big) = E\Big(e^{it(U + bZ)}\Big),$$
where the second equality is due to condition (3) (the factor $e^{-t^2\sigma^2 V/2}$ is a bounded random variable, so it can be included in the test defining stable convergence). Hence, $C_n + bD_n \to U + bZ$ in distribution, and this concludes the proof. ☐
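For a numerical illustration of Theorem 2, a convenient hypothetical index is $T_n = \lfloor a_n V \rfloor$: condition (3) then holds with $Z = 0$, so the limit reduces to $M = U$, a mixture of centered Gaussians with random variance $\sigma^2 V$. A sketch (exponential $X_i$, uniform $V$, $a_n = n$ are demo choices, not taken from the paper):

    import numpy as np

    rng = np.random.default_rng(3)

    # X_i i.i.d. Exponential(1): b = 1, sigma^2 = 1.  T_n = floor(n V) with
    # V ~ Uniform(1, 2), so Z = 0 and M = U with L(U | V) = N(0, V).
    n, size = 100_000, 200_000
    V = rng.uniform(1.0, 2.0, size)
    T = np.floor(n * V).astype(np.int64)

    S = rng.gamma(shape=T, scale=1.0) / n            # Gamma(T,1) = sum of T exponentials
    Wn = np.sqrt(n) * (S - V)                        # sqrt(a_n) (S_n - b V)
    M = np.sqrt(V) * rng.normal(0.0, 1.0, size)      # a sample from the limit law

    for q in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(q, round(np.quantile(Wn, q), 3), round(np.quantile(M, q), 3))
    # matching quantiles: sqrt(a_n)(S_n - L) and M are close in distribution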
 The argument used in the proof of Theorem 2 yields a little bit more. Let 
 and 
. Then, 
 converges 
-stably (and not only in distribution) to 
. Among other things, since 
, this implies that 
 in distribution, where 
R denotes a random variable independent of 
L such that 
. Moreover, condition (
3) can be weakened into 
 converges 
-stably to 
.
We also note that, under some extra assumptions, Theorem 2 could be given a simpler proof based on some version of Anscombe's theorem; see, e.g., [13] and references therein.
Finally, we adapt Theorem 2 to the exchangeable case. Let
$$W = E\big(X_1^2 \mid \mathcal{T}\big) - E\big(X_1 \mid \mathcal{T}\big)^2 \quad\text{and}\quad M = U + E\big(X_1 \mid \mathcal{T}\big)\, Z,$$
where now $Z$ is independent of $(V, \mathcal{T})$ and $\mathcal{L}\big(U \mid V, Z, \mathcal{T}\big) = \mathcal{N}(0,\, V\, W)$. To introduce the next result, it may be useful to recall that
$$\frac{\sum_{i=1}^n \big(X_i - E(X_1 \mid \mathcal{T})\big)}{\sqrt{n}} \longrightarrow \mathcal{N}(0, W) \quad\text{stably},$$
provided $X$ is exchangeable and $E(X_1^2) < \infty$, where $\mathcal{N}(0, W)$ is the Gaussian kernel with mean 0 and random variance W (with $\mathcal{N}(0, 0) = \delta_0$); see, e.g., ([14] Th. 3.1).
Theorem 3. If $E(X_1^2) < \infty$ and conditions (i)–(ii) and (3) hold, then $\sqrt{a_n}\,(S_n - L) \longrightarrow M$ in distribution.  Proof.  Just note that $X$ is i.i.d. conditionally on $\mathcal{T}$, with mean $E(X_1 \mid \mathcal{T})$ and variance W. Hence, for each $t \in \mathbb{R}$, Theorem 2 yields
$$E\Big(e^{it\sqrt{a_n}\,(S_n - L)} \,\Big|\, \mathcal{T}\Big) \longrightarrow E\big(e^{itM} \mid \mathcal{T}\big) \quad\text{a.s.},$$
which in turn implies
$$\lim_n E\Big(e^{it\sqrt{a_n}\,(S_n - L)}\Big) = E\big(e^{itM}\big)$$
by dominated convergence. ☐
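The same experiment adapts to Theorem 3, now with an exchangeable, non-i.i.d. $X$ (again $T_n = \lfloor a_n V \rfloor$ is a hypothetical choice forcing $Z = 0$; here $E(X_1 \mid \mathcal{T}) = \theta$ and $W = 1$, so the limit is a mixture of $\mathcal{N}(0, V)$ laws):

    import numpy as np

    rng = np.random.default_rng(4)

    # X_i = theta + eps_i with eps_i i.i.d. N(0,1): conditionally on the tail
    # sigma-field, X is i.i.d. with mean theta and variance W = 1.
    n, size = 100_000, 200_000
    theta = rng.normal(0.0, 1.0, size)
    V = rng.uniform(1.0, 2.0, size)
    T = np.floor(n * V).astype(np.int64)

    S = (T * theta + np.sqrt(T) * rng.normal(0.0, 1.0, size)) / n
    Wn = np.sqrt(n) * (S - V * theta)                # sqrt(a_n) (S_n - L)
    M = np.sqrt(V) * rng.normal(0.0, 1.0, size)      # N(0, V W) mixture, W = 1

    for q in (0.1, 0.5, 0.9):
        print(q, round(np.quantile(Wn, q), 3), round(np.quantile(M, q), 3))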
   4. Rate of Convergence with Respect to Total Variation Distance
To obtain upper bounds for $d_{TV}(S_n, L)$ and $d_{TV}\big(\sqrt{a_n}\,(S_n - L),\, M\big)$, some additional assumptions are needed. In particular, in this section, $X$ is i.i.d. (with the exception of Remark 1). Hence, $L$ and $M$ reduce to
$$L = b\,V \quad\text{and}\quad M = U + b\,Z,$$
where $b = E(X_1)$, $\sigma^2 = \mathrm{Var}(X_1)$ and $(U, Z)$ satisfies condition (2).
We begin with a rough estimate for $d_{TV}(S_n, L)$.
Theorem 4. Suppose that conditions (ii)–(iii) hold, $X$ is i.i.d.,  and $\mathcal{L}(X_1)$ has an absolutely continuous part. Then,  for all , where  is a constant independent of m and n.  In order to prove Theorem 4, we recall that
      
      for all 
 and 
; see, e.g., ([
15] Lem. 3).
Proof of Theorem 4. Fix 
. By ([
16] Lem. 2.1), up to enlarging the underlying probability space 
, there is a sequence 
 of random variables, independent of 
, such that
        
In addition, by ([
17] Th. 2.6), there is a constant 
 depending only on 
 such that
        
Having noted these facts, define
        
Next, since 
, by conditioning on 
 and applying inequality (
5), one obtains
        
Moreover, since 
 and both 
 and 
Z are independent of 
V,
        
Collecting all these facts together, one finally obtains
        
 ☐
 The upper bound provided by Theorem 4 is generally large but it becomes manageable under some further assumptions. For instance, if 
 a.s. for some constant 
, it reduces to
      
As an example, we discuss a simple but instructive case.
Example 1. For each real $x$, denote by $\lfloor x \rfloor$ the integer part of $x$. Suppose  a.s. for some constant  and define  Suppose also that  is independent of V and satisfies the other conditions of Theorem 4. Then,  Hence, letting , inequality (6) reduces to  for some constant . Finally, O if V is bounded above and $\mathcal{L}(V)$ is absolutely continuous with a Lipschitz density. Hence, under the latter condition on V, one obtains  Incidentally, this bound is essentially of the same order as the bound obtained in [6] when  has a mixed Poisson distribution and the total variation distance is replaced by the Wasserstein distance.  One more consequence of Theorem 4 is the following.
Corollary 1. $S_n \longrightarrow L$ in total variation distance provided the conditions of Theorem 4 hold, $V > 0$ a.s., $\mathcal{L}(V)$ is absolutely continuous, and   Proof.  First, assume $V \ge u$ a.s. for some constant $u > 0$. For each $n$, letting , Lemma 1 implies
        
Conditioning on 
Z and taking inequality (
6) into account, it follows that
        
This concludes the proof if $V \ge u$ a.s. In general, for each $u > 0$, define
        
where $\lfloor x \rfloor$ denotes the integer part of $x$. Since , the first part of the proof implies
, the first part of the proof implies
        
Finally, since 
 and
        
        one obtains 
. ☐
 We next turn to $d_{TV}\big(\sqrt{a_n}\,(S_n - L),\, M\big)$. Following [18], our strategy is to estimate this distance through the Wasserstein distance between $\sqrt{a_n}\,(S_n - L)$ and $M$.
Recall that, if $X$ and $Y$ are real integrable random variables, the Wasserstein distance between $\mathcal{L}(X)$ and $\mathcal{L}(Y)$ is
$$d_W(X, Y) = \inf_{H, K} E\,|H - K| = \sup_f\, \big| E f(X) - E f(Y) \big|,$$
where inf is over the real random variables $H$ and $K$ such that $\mathcal{L}(H) = \mathcal{L}(X)$ and $\mathcal{L}(K) = \mathcal{L}(Y)$, while sup is over the 1-Lipschitz functions $f : \mathbb{R} \to \mathbb{R}$. Define also
      
      where 
 is the characteristic function of 
.
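On the real line, the Wasserstein distance between two equal-size samples is attained by the monotone (sorted) coupling, which makes empirical evaluation immediate; a minimal sketch:

    import numpy as np

    def wasserstein_1d(x, y):
        # Empirical W1 distance between equal-size samples: in dimension one
        # the optimal coupling pairs the order statistics.
        return np.mean(np.abs(np.sort(x) - np.sort(y)))

    rng = np.random.default_rng(5)
    x = rng.normal(0.0, 1.0, 100_000)
    y = rng.normal(0.5, 1.0, 100_000)
    print(wasserstein_1d(x, y))   # ~ 0.5, the shift between the two Gaussians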
Theorem 5. Assume the conditions of Theorem 2 and:
- (iv)
- , where ,  is independent of , and  is independent of ; 
- (v)
-  for some  and 
Then, . Moreover, letting , one obtains  for each  and , where k is a constant independent of n.  Proof.  By Theorem 2, $\sqrt{a_n}\,(S_n - L) \longrightarrow M$ in distribution. By condition (iv),

so that $\mathcal{L}(M)$ is a mixture of centered Gaussian laws. On noting that
        
        one obtains
        
Finally, by condition (v), 
 and 
. To conclude the proof, it suffices to apply Theorem 1 of [
18] (see also the subsequent remark) with 
. ☐
 Theorem 5 gives two upper bounds for $d_{TV}\big(\sqrt{a_n}\,(S_n - L),\, M\big)$ in terms of  and . To avoid trivialities, suppose . Obviously, the second bound makes sense only if . However, since  and , the first bound implies  if . In particular,  if .
Example 2. Under the conditions of Theorem 5, suppose also that  is absolutely continuous with a density f satisfying . Then, conditioning on  and V and arguing as in ([18] Ex. 2), it can be shown that . Hence,  in total variation distance. Furthermore, if , the second bound of Theorem 5 yields for all  and a suitable constant  (independent of n).
 We close the paper by briefly discussing the exchangeable case.
Remark 1. Usually, the upper bounds for the total variation distance are preserved under mixtures. Hence, by conditioning on $\mathcal{T}$ and making some further assumptions, the results obtained in this section can be extended to the case where $X$ is exchangeable. As an example, define L and M as in Section 3 and suppose  for each  and for some integrable random variable Q. Then, Corollary 1 and Theorem 5 are still valid even if $X$ is exchangeable (and not necessarily i.i.d.), up to replacing  with  a.s. in Corollary 1.