1. Introduction
All random elements involved in the sequel are defined on a common probability space $(\Omega, \mathcal{A}, P)$. We let $\mathcal{B}$ denote the Borel $\sigma$-field on $\mathbb{R}$ and $\mathcal{N}(a, b)$ the Gaussian law on $\mathcal{B}$ with mean $a$ and variance $b$, where $a \in \mathbb{R}$, $b \ge 0$, and $\mathcal{N}(a, 0) = \delta_a$. Moreover, $Z$ always denotes a real random variable such that:
$$Z \sim \mathcal{N}(0, 1).$$
In plenty of frameworks, it happens that:
$$X_n \longrightarrow VZ \quad\text{in distribution}, \tag{1}$$
where $X_n$ and $V$ are real random variables and $V$ is independent of $Z$. Condition (1) actually occurs in the CLT, both in its classical form (with $V$ constant) and in its exchangeable and martingale versions (Examples 3 and 4). In addition, condition (1) arises in several recent papers with various distributions for $V$. See, e.g., [1,2,3,4,5,6,7,8].
An intriguing feature of condition (1) is that the probability distribution of the limit $VZ$ is a mixture of centered Gaussian laws with (random) variance $V^2$. Moreover, condition (1) can often be strengthened into:
$$d_W(X_n,\, VZ) \longrightarrow 0, \tag{2}$$
where $d_W(X_n, VZ)$ is the Wasserstein distance between the probability distributions of $X_n$ and $VZ$. In fact, condition (2) amounts to (1) provided the sequence $(X_n)$ is uniformly integrable; see Section 2.1.
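To make the mixture structure explicit, the following display (ours, but it follows at once by conditioning on $V$, which is independent of $Z$) can be kept in mind:
$$P(VZ \in A) \;=\; E\bigl[P(VZ \in A \mid V)\bigr] \;=\; E\bigl[\mathcal{N}(0, V^2)(A)\bigr] \qquad \text{for every } A \in \mathcal{B},$$
with the convention $\mathcal{N}(0, 0) = \delta_0$ on the event $\{V = 0\}$.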
A few (engaging) problems are suggested by conditions (1) and (2). One is:

(*) Give conditions for $d_{TV}(X_n, VZ) \rightarrow 0$, where
$$d_{TV}(X_n, VZ) = \sup_{A \in \mathcal{B}}\,\bigl| P(X_n \in A) - P(VZ \in A)\bigr|.$$
Under such (or stronger) conditions, estimate the rate of convergence, i.e., find quantitative bounds for $d_{TV}(X_n, VZ)$.
Problem (*) is addressed in this paper. Before turning to results, however, we mention an example.
Example 1. Let $B$ be a fractional Brownian motion with Hurst parameter $H$ and
The asymptotics of $X_n$ and other analogous functionals of the $B$-paths (such as weighted power variations) are investigated in various papers. See, e.g., [5,7,8,9,10] and references therein. We note also that:
where the stochastic integral is meant in Skorohod's sense (it reduces to an Itô integral if $H = 1/2$). Let and . In [8], it is shown that, for every , there is a constant $k$ (depending on $H$ and $\beta$ only) such that:
where $Z$ is a standard normal random variable independent of $V$. Furthermore, the rate is quite close to being optimal; see condition (2) of [8].

In Example 1, problem (*) admits a reasonable solution. In fact, in a sense, Example 1 is our motivating example.
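The phenomenon behind Example 1 can be seen numerically. The sketch below is ours and is not taken from [8]: it simulates an fBm path exactly (Cholesky factorization of the increment covariance) and checks that the normalized, unweighted quadratic variation concentrates around 1; the function name and parameters are illustrative only.

```python
import numpy as np

def fbm_increments(n, H, rng):
    """Exact simulation of fractional Gaussian noise via Cholesky factorization.

    Returns the n increments B_{j/n} - B_{(j-1)/n} of an fBm with Hurst index H
    on the grid {1/n, ..., 1}; each increment has variance n**(-2*H).
    """
    k = np.arange(n)
    # Autocovariance of fractional Gaussian noise with unit step, rescaled to step 1/n.
    gamma = 0.5 * (np.abs(k - 1) ** (2 * H) + (k + 1) ** (2 * H) - 2 * k ** (2 * H))
    cov = gamma[np.abs(k[:, None] - k[None, :])] * n ** (-2.0 * H)
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))
    return L @ rng.standard_normal(n)

rng = np.random.default_rng(0)
n, H = 512, 0.3
incr = fbm_increments(n, H, rng)
# Normalized quadratic variation: should be close to 1 for large n.
print(n ** (2 * H - 1) * np.sum(incr ** 2))
```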
This paper includes two main results.
The first (Theorem 1) is of the general type. Suppose $\int \lvert\phi_n(t)\rvert\,dt < \infty$, where $\phi_n$ is the characteristic function of $X_n$. (In particular, $X_n$ has an absolutely continuous distribution.) Then, an upper bound for $d_{TV}(X_n, VZ)$ is provided in terms of $\phi_n$ and $d_W(X_n, VZ)$. In some cases, this bound makes it possible to prove $d_{TV}(X_n, VZ) \rightarrow 0$ and to estimate the convergence rate. In Example 5, for instance, such a bound improves on the existing ones; see Theorem 3.1 of [6] and Remark 3.5 of [7]. However, for the upper bound to work, one needs information on $\phi_n$ and $d_W(X_n, VZ)$, which is not always available. Thus, it is convenient to have some further tools.
In the second result (Theorem 2), the ideas underlying Example 1 are adapted to weighted quadratic variations; see [5,8,9]. Let $B$ and $\widetilde{B}$ be independent standard Brownian motions and
where $f$ is a suitable function, and . Under some assumptions on $f$ (weaker than those usually required in similar problems), it is shown that O , where . Furthermore, O if one also assumes . (We recall that, if $a_n$ and $b_n$ are non-negative numbers, the notation $a_n = O(b_n)$ means that there is a constant $c$ such that $a_n \le c\, b_n$ for all $n$.)
2. Preliminaries
2.1. Distances between Probability Measures
In this subsection, we recall a few known facts on distances between probability measures. We denote by $(S, \mathcal{E})$ a measurable space and by $\mu$ and $\nu$ two probability measures on $\mathcal{E}$.
The total variation distance between $\mu$ and $\nu$ is:
$$d_{TV}(\mu, \nu) = \sup_{A \in \mathcal{E}}\,\lvert \mu(A) - \nu(A)\rvert.$$
If $X$ and $Y$ are $S$-valued random variables, we also write $d_{TV}(X, Y)$ to denote the total variation distance between the probability distributions of $X$ and $Y$.
Next, suppose $S$ is a separable metric space, $\mathcal{E}$ the Borel $\sigma$-field and
$$\int d(x, x_0)\,\mu(dx) + \int d(x, x_0)\,\nu(dx) < \infty \quad\text{for some } x_0 \in S,$$
where $d$ is the distance on $S$. The Wasserstein distance between $\mu$ and $\nu$ is:
$$d_W(\mu, \nu) = \inf\, E\,d(X, Y),$$
where inf is over the pairs $(X, Y)$ of $S$-valued random variables such that $X \sim \mu$ and $Y \sim \nu$. By a duality theorem, $d_W$ admits the representation:
$$d_W(\mu, \nu) = \sup_f\,\Bigl\lvert \int f\,d\mu - \int f\,d\nu \Bigr\rvert,$$
where sup is over those functions $f: S \to \mathbb{R}$ such that $\lvert f(x) - f(y)\rvert \le d(x, y)$ for all $x, y \in S$; see, e.g., Section 11.8 of [11]. Again, if $X$ and $Y$ are $S$-valued random variables, we write $d_W(X, Y)$ to mean the Wasserstein distance between the probability distributions of $X$ and $Y$.
Finally, we make precise the connections between convergence in distribution and convergence according to Wasserstein distance in the case $S = \mathbb{R}$. Let $(X_n)$ and $X$ be real random variables such that $E\lvert X_n\rvert < \infty$ for each $n$. Then, the following statements are equivalent:
- $d_W(X_n, X) \rightarrow 0$;
- $X_n \rightarrow X$ in distribution and $E\lvert X_n\rvert \rightarrow E\lvert X\rvert$;
- $X_n \rightarrow X$ in distribution and the sequence $(X_n)$ is uniformly integrable.
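As a concrete illustration of the two distances just recalled, in the real case $S = \mathbb{R}$, here is a small numerical sketch; the helper names are ours, and the formulas used are the standard ones ($d_{TV}$ as half the $L^1$ distance between densities, $d_W$ as the $L^1$ distance between distribution functions).

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def tv_distance(p, q):
    """Total variation distance between two densities on the real line:
    d_TV = sup_A |P(A) - Q(A)| = (1/2) * integral of |p - q|."""
    val, _ = quad(lambda x: abs(p(x) - q(x)), -np.inf, np.inf)
    return 0.5 * val

def wasserstein_distance_1d(F, G):
    """Wasserstein distance between two laws on the real line:
    d_W = integral of |F - G|, the difference of the distribution functions."""
    val, _ = quad(lambda x: abs(F(x) - G(x)), -np.inf, np.inf)
    return val

# N(0, 1) versus N(0, 1.2): both distances are small but positive.
p, q = norm(0, 1), norm(0, np.sqrt(1.2))
print(tv_distance(p.pdf, q.pdf))
print(wasserstein_distance_1d(p.cdf, q.cdf))
```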
2.2. Two Technical Lemmas
The following simple lemma is fundamental for our purposes.
Lemma 1. If , and , then:

Lemma 1 is well known; see, e.g., Proposition 3.6.1 of [12] and Lemma 3 of [8].
Note also that, if
, Lemma 1 yields:
The next result, needed in Section 4, is just a consequence of Lemma 1. In such a result, $S$ and $T$ are separable metric spaces, $f: S \times T \to \mathbb{R}$ and $g: S \times T \to \mathbb{R}$ are Borel functions, and $X$ and $Y$ are random variables with values in $S$ and $T$, respectively.
Lemma 2. Let $\nu$ be the probability distribution of $Y$. If $X$ is independent of $Y$ and
for $\nu$-almost all $y$, then:

Proof. Since $X$ is independent of $Y$,
Thus, since $f(X, y)$ and $g(X, y)$ have centered Gaussian laws and $g(X, y)$ has strictly positive variance, for $\nu$-almost all $y$, Lemma 1 yields:
☐
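For completeness, the disintegration step at the start of the proof can be spelled out as follows (the display is ours): for every Borel set $A \subset \mathbb{R}$, independence of $X$ and $Y$ gives
$$\bigl|P\bigl(f(X,Y) \in A\bigr) - P\bigl(g(X,Y) \in A\bigr)\bigr| = \Bigl|\int \bigl[P\bigl(f(X,y) \in A\bigr) - P\bigl(g(X,y) \in A\bigr)\bigr]\,\nu(dy)\Bigr| \le \int d_{TV}\bigl(f(X,y),\, g(X,y)\bigr)\,\nu(dy),$$
and taking the supremum over $A$ yields $d_{TV}\bigl(f(X,Y), g(X,Y)\bigr) \le \int d_{TV}\bigl(f(X,y), g(X,y)\bigr)\,\nu(dy)$.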
3. A General Result
As in Section 1, let $(X_n)$, $V$ and $Z$ be real random variables, with $Z \sim \mathcal{N}(0, 1)$ and $V$ independent of $Z$. Since $VZ$ and $\lvert V\rvert Z$ have the same probability distribution, it can be assumed $V \ge 0$
. We also assume
, so that we can define:
In addition, we let:
where
U is a standard normal random variable independent of
.
We aim to estimate $d_{TV}(X_n, VZ)$. Under some conditions, however, the latter quantity can be replaced by .
Lemma 3. In addition, if , then:

Proof. The Lemma is trivially true if
. Hence, it can be assumed
. Define
and note that:
On the other hand, the probability distribution of
can also be written as:
Arguing as above, Lemma 1 implies again:
for each
. Letting
with
, it follows that:
Inequality (
3) holds true for
every joint distribution for the pair
. In particular, inequality (
3) holds if such a joint distribution is taken to be one that realizes the Wasserstein distance, namely, one such that
. In this case, one obtains:
Finally, if
, it suffices to note that:
☐
For Lemma 3 to be useful, should be kept under control. This can be achieved under various assumptions. One is to ask to admit a Lipschitz density with respect to Lebesgue measure.
Theorem 1. Let $\phi_n$ be the characteristic function of $X_n$ and
Given , suppose and . Then, there is a constant $k$, independent of $n$, such that:
In particular,
for each , and

It is worth noting that, if
, the condition
follows from
. On the other hand,
can be weakened into
whenever
for some
; see
Section 2.1.
Proof of Theorem 1. If , the theorem is trivially true. Thus, it can be assumed .
Since $\phi_n$ is integrable, the probability distribution of $X_n$ admits a density $f_n$ with respect to Lebesgue measure. In addition,
Given
, it follows that:
Since
and
, one obtains:
for some constant
. Hence,
Minimizing over
t, one finally obtains:
where
is a constant that depends on
only. This concludes the proof. ☐
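The density step at the beginning of the proof rests on the standard Fourier-inversion bound, which, with the notation above, can be spelled out as follows (the display is ours):
$$f_n(x) = \frac{1}{2\pi}\int e^{-itx}\,\phi_n(t)\,dt, \qquad \lvert f_n(x) - f_n(y)\rvert \le \frac{\lvert x - y\rvert}{2\pi}\int \lvert t\rvert\,\lvert\phi_n(t)\rvert\,dt,$$
so that $f_n$ is Lipschitz as soon as $\int \lvert t\rvert\,\lvert\phi_n(t)\rvert\,dt < \infty$; the second inequality uses $\lvert e^{-itx} - e^{-ity}\rvert \le \lvert t\rvert\,\lvert x - y\rvert$.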
Theorem 1 provides upper bounds for $d_{TV}(X_n, VZ)$ in terms of $\phi_n$ and $d_W(X_n, VZ)$. It is connected to Proposition 4.1 of [4], where the total variation distance is replaced by the Kolmogorov distance.
In particular, Theorem 1 implies that $d_{TV}(X_n, VZ) \rightarrow 0$ provided a.s. and

In addition, Theorem 1 allows one to estimate the convergence rate. As an extreme example, if , and for all , then:
We next turn to examples. In each such example, Z is a standard normal random variable independent of all other random elements.
Example 2. (Classical CLT). Let $X_n = n^{-1/2}\sum_{i=1}^{n}\xi_i$ and $V = 1$, where $(\xi_n)$ is an i.i.d. sequence of real random variables such that $E(\xi_1) = 0$ and $E(\xi_1^2) = 1$. In this case, ; see Theorem 2.1 of [13]. Suppose now that for all and $\xi_1$ has a density $f$ (with respect to Lebesgue measure) such that . Then, for all , and Theorem 1 yields:
This rate, however, is quite far from optimal. Under the present assumptions on $\xi_1$, in fact, ; see Theorem 1 of [14]. We finally prove . It is well known that for all $\beta$ implies for all $\beta$. Hence, it suffices to prove . Let $\phi$ be the characteristic function of $\xi_1$ and . An integration by parts yields for each . By Lemma 1.4 of [15], one also obtains for (just let and in Lemma 1.4 of [15]). Since for each ,
Using these inequalities, follows from a direct calculation.
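As a quick numerical aside on the Wasserstein part of this picture (see Section 2.1), the following sketch is ours and uses centered uniform summands with unit variance: it estimates $d_W(X_n, Z)$ from samples, with `scipy.stats.wasserstein_distance` applied to the two empirical distributions.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def sample_X_n(n, size):
    """Draws of X_n = (xi_1 + ... + xi_n) / sqrt(n) for xi_i uniform on
    [-sqrt(3), sqrt(3)], which are centered with unit variance."""
    xi = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(size, n))
    return xi.sum(axis=1) / np.sqrt(n)

z = rng.standard_normal(100_000)  # reference sample from N(0, 1)
for n in (1, 5, 50, 500):
    x = sample_X_n(n, 100_000)
    print(n, wasserstein_distance(x, z))  # decreases towards 0 as n grows
```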
As noted above, the rate provided by Theorem 1 in the classical CLT is not optimal. While not exciting, this fact could be expected. Indeed, Theorem 1 is a general result, applying to arbitrary $X_n$, and should not be expected to give optimal bounds in a very special case (such as the classical CLT).
Example 3. (Exchangeable CLT). Suppose now that $(\xi_n)$ is an exchangeable sequence of real random variables with . Define
where is the tail $\sigma$-field of $(\xi_n)$. By de Finetti's theorem,
Hence, provided . As to Theorem 1, note that (see, e.g., Theorem 3.1 of [16]) and
Furthermore, . Thus, by Theorem 1, whenever

Example 4. (Martingale CLT). Let
where is an array of real square-integrable random variables and . For each , let:
be sub-$\sigma$-fields of with . A well-known version of the CLT (see, e.g., Theorem 3.2 of [17]) states that provided:
- (i)
is -measurable and a.s.;
- (ii)
, , ;
- (iii)
.
Condition (iii) can be replaced by:
- (iv)
V is measurable with respect to the σ-field generated by where .
Note also that, under (i), one obtains .
Now, in addition to (i)–(ii)–(iii) or (i)–(ii)–(iv), suppose . Then, Theorem 1 (applied with ) implies whenever a.s. and . Moreover,

Our last example is connected to the second-order Wiener chaos. We first note a simple fact as a lemma.
Lemma 4. Let $\xi$ be a centered Gaussian random vector. Define:
where and is an independent copy of $\xi$. Then, the characteristic function $\psi$ of $Y$ can be written as:

Proof. Let
,
and
. Then,
Example 5. (Squares of Gaussian random variables). For each , let be a centered Gaussian random vector and
Take an independent copy of and define:
Note that is a (random) quadratic form of the covariance matrix . Therefore, .
Since agrees with the characteristic function of , Lemma 4 yields:
Being , it follows that:
Hence,
so that whenever for some . To summarize, applying Theorem 1 with , one obtains:
provided , for some $V$ independent of $Z$, and

The bound (4) requires strong conditions, which may not be easily verifiable in real problems. However, the above result is sometimes helpful, possibly in connection with the martingale CLT of Example 4. As an example, the conditions for (4) are not hard to check when are independent for fixed $n$. We also note that, to our knowledge, the bound (4) improves on the existing ones. In fact, letting in Theorem 3.1 of [6] (see also Remark 3.5 of [7]), one only obtains .

4. Weighted Quadratic Variations
Theorem 1 works nicely if one is able to estimate $\phi_n$ and $d_W(X_n, VZ)$, which is usually quite hard. Thus, it is convenient to have some further tools. In this section, $d_{TV}(X_n, VZ)$ is upper bounded via Lemma 2. We focus on a special case, but the underlying ideas are easily adapted to more general situations. The results in [8], for instance, arise from a version of such ideas.
For any function , denote:
Let $q$ be an integer, $f: \mathbb{R} \to \mathbb{R}$ a Borel function, and $J$ a real process. The weighted $q$-variation of $J$ on $[0, 1]$ is:
As noted in [5], fixing the asymptotic behavior of the weighted $q$-variations is useful for determining the rate of convergence of some approximation schemes for stochastic differential equations driven by $J$. Moreover, the study of the weighted $q$-variations is also motivated by parameter estimation and by the analysis of the single-path behavior of $J$. See [5,9,18,19,20,21] and references therein.
More generally, given an $\mathbb{R}^2$-valued process:
one could define:
The weight now depends on $I$. Thus, in a sense, the latter quantity can be regarded as the weighted $q$-variation of $J$ relative to $I$.
Here, we focus on:
where $B$ and $\widetilde{B}$ are independent standard Brownian motions. Note that, letting and , one obtains:
Thus, $X_n$ can be seen as the difference between the quadratic variations of $B$ and $\widetilde{B}$ relative to .
We aim to show that, under mild assumptions on $f$, the probability distributions of $X_n$ converge in total variation to a certain mixture of Gaussian laws. We also estimate the rate of convergence. The smoothness assumptions on $f$ are weaker than those usually required in similar problems; see, e.g., [5].
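Before stating the theorem, here is a small simulation sketch. It is ours, and it implements one plausible statistic of this flavor, namely a weighted difference of the quadratic variations of two independent Brownian motions; the exact definition in Equation (5) (scaling, weights) may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

def weighted_qv_difference(n, f, rng):
    """One draw of a statistic of the same flavor as Equation (5):
    sqrt(n) * sum_j f(B_{(j-1)/n}) * ((dB_j)^2 - (dB~_j)^2), where B and B~
    are independent standard Brownian motions on the grid j/n, j = 0, ..., n."""
    dB = rng.standard_normal(n) / np.sqrt(n)       # increments of B
    dBt = rng.standard_normal(n) / np.sqrt(n)      # increments of B~ (independent)
    B_left = np.concatenate(([0.0], np.cumsum(dB)[:-1]))  # B at left grid points
    return np.sqrt(n) * np.sum(f(B_left) * (dB ** 2 - dBt ** 2))

# With f = cos, the sample should look approximately like a centered
# mixture of Gaussian laws for large n.
samples = np.array([weighted_qv_difference(2000, np.cos, rng) for _ in range(500)])
print(samples.mean(), samples.std())
```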
Theorem 2. Let $B$ and $\widetilde{B}$ be independent standard Brownian motions and $Z$ a standard normal random variable independent of $(B, \widetilde{B})$. Define $X_n$ by Equation (5) and
Suppose and
for some constant $c$ and all . Then, there is a constant $k$ independent of $n$ satisfying:
Moreover, if , one also obtains .
To better understand the spirit of Theorem 2, think of the trivial case $f \equiv 1$. Then, the asymptotic behavior of $X_n$ can be deduced from classical results. In fact,
and this rate is optimal; see Theorem 1 of [14]. On the other hand, since , the same conclusion can be drawn from Theorem 2.
We finally prove Theorem 2.
Proof of Theorem 2. First note that $(B + \widetilde{B})/\sqrt{2}$ and $(B - \widetilde{B})/\sqrt{2}$ are independent standard Brownian motions and
Thus, in order to apply Lemma 2, it suffices to let
,
, and
For fixed
,
and
are centered Gaussian random variables. Since
,
On noting that
, one also obtains:
By Lemma 2 and the Cauchy–Schwarz inequality,
If
, since
, Lemma 2 implies again:
Thus, to conclude the proof, it suffices to show that O.
Define
and note that:
Since
, it suffices to see that
O
for each
i. Since
Y has independent increments and
,
Therefore, O for each i, and this concludes the proof. ☐