Bootstrapping Average Value at Risk of Single and Collective Risks

Beutner, Eric; Zähle, Henryk

doi:10.3390/risks6030096


by Eric Beutner ¹ and Henryk Zähle ²,*

¹ Department of Quantitative Economics, Maastricht University, 6200 MD Maastricht, The Netherlands

² Department of Mathematics, Saarland University, 66123 Saarbrücken, Germany

* Author to whom correspondence should be addressed.

Risks 2018, 6(3), 96; https://doi.org/10.3390/risks6030096

Submission received: 1 August 2018 / Revised: 27 August 2018 / Accepted: 7 September 2018 / Published: 12 September 2018

(This article belongs to the Special Issue Estimation of Risk Measures from Data -- Estimators, Computation, Robustness and Elicitability)


Abstract: Almost sure bootstrap consistency of the blockwise bootstrap for the Average Value at Risk of single risks is established for strictly stationary β-mixing observations. Moreover, almost sure bootstrap consistency of a multiplier bootstrap for the Average Value at Risk of collective risks is established for independent observations. The main results rely on a new functional delta-method for the almost sure bootstrap of uniformly quasi-Hadamard differentiable statistical functionals, to be presented here. The latter seems to be interesting in its own right.

Keywords:

Average Value at Risk; compound distribution; nonparametric estimation; multiplier bootstrap; blockwise bootstrap; functional delta-method; uniform quasi-Hadamard differentiability; chain rule

1. Introduction

One of the most popular risk measures in practice is the so-called Average Value at Risk which is also referred to as Expected Shortfall (see Acerbi and Szekely (2014); Acerbi and Tasche (2002a, 2002b); Emmer et al. (2015) and references therein). For a fixed level

α \in (0, 1)

, the corresponding Average Value at Risk is the map

{AV @ R}_{α} : L^{1} \to R

defined by

{AV @ R}_{α} (X) : = R_{α} (F_{X})

, where

F_{X}

refers to the distribution function of X,

L^{1}

is the usual

L^{1}

-space associated with some atomless probability space, and

R_{α} (F) : = \int_{0}^{1} F^{\leftarrow} (s) d g_{α} (s) = - \int_{- \infty}^{0} g_{α} (F (x)) d x + \int_{0}^{\infty} (1 - g_{α} (F (x))) d x

(1)

for any

F \in F_{1}

with

F_{1}

the set of the distribution functions

F_{X}

of all

X \in L^{1}

. Here,

g_{α} (t) : = \frac{1}{1 - α} \max \{t - α; 0\}

and

F^{\leftarrow} (s) : = \inf \{x \in R : F (x) \geq s\}

denotes the left-continuous inverse of F. The statistical functional

R_{α} : F_{1} \to R

is sometimes referred to as risk functional associated with

{AV @ R}_{α}

. Note that

{AV @ R}_{α} (X) = E [X | X \geq F_{X}^{\leftarrow} (α)]

when

F_{X}

is continuous at

F_{X}^{\leftarrow} (α)

.

In this article, we mainly focus on bootstrap methods for the Average Value at Risk. Before doing so, we briefly review nonparametric estimation techniques and asymptotic results for the Average Value at Risk. Given identically distributed observations

X_{1}, \dots, X_{n}

(

, X_{n + 1}, \dots

) on some probability space

(Ω, F, P)

with unknown marginal distribution

F \in F_{1}

, a natural estimator for

R_{α} (F)

is the empirical plug-in estimator

R_{α} ({\hat{F}}_{n}) = \int_{0}^{1} {\hat{F}}_{n}^{\leftarrow} (s) d g_{α} (s) = \sum_{i = 1}^{n} \{g_{α} (\frac{i}{n}) - g_{α} (\frac{i - 1}{n})\} X_{i : n},

(2)

where

{\hat{F}}_{n} : = \frac{1}{n} \sum_{i = 1}^{n} 𝟙_{[X_{i}, \infty)}

is the empirical distribution function of

X_{1}, \dots, X_{n}

and

X_{1 : n}, \dots, X_{n : n}

refer to the order statistics of

X_{1}, \dots, X_{n}

. The second representation in Equation (2) shows that

R_{α} ({\hat{F}}_{n})

is a specific L-statistic which was already mentioned in Acerbi (2002); Acerbi and Tasche (2002a); Jones and Zitikis (2003).
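The L-statistic representation in Equation (2) can be evaluated directly from the order statistics. The following is a minimal sketch using only the Python standard library; the function names `g_alpha` and `avar_plugin` are illustrative choices and do not come from the article.

```python
def g_alpha(t, alpha):
    """Distortion function g_alpha(t) = max(t - alpha, 0) / (1 - alpha)."""
    return max(t - alpha, 0.0) / (1.0 - alpha)


def avar_plugin(sample, alpha):
    """Empirical plug-in estimator R_alpha(F_hat_n), second form in Equation (2).

    The i-th order statistic X_{i:n} receives the weight
    g_alpha(i/n) - g_alpha((i-1)/n).
    """
    xs = sorted(sample)
    n = len(xs)
    return sum(
        (g_alpha(i / n, alpha) - g_alpha((i - 1) / n, alpha)) * x
        for i, x in enumerate(xs, start=1)
    )


# Toy data 1, ..., 10 at level alpha = 0.8: only the two largest
# observations carry weight (about 1/2 each), so the value is close to 9.5.
print(avar_plugin(range(1, 11), 0.8))
```

For α = 0 the weights all equal 1/n and the estimator reduces to the sample mean, which is a quick sanity check on the weights.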

In particular, if the underlying sequence

{(X_{i})}_{i \in N}

is strictly stationary and ergodic, classical results of van Zwet (1980) and Gilat and Helmers (1997) show that

R_{α} ({\hat{F}}_{n})

converges

P

-almost surely to

R_{α} (F)

as

n \to \infty

, i.e., that strong consistency holds. If

X_{1}, X_{2}, \dots

are i.i.d. and F has a finite second moment and takes the value

α

only once, then a result of Stigler (1974, Theorems 1–2) yields the asymptotic distribution of the estimation error:

\sqrt{n} (R_{α} ({\hat{F}}_{n}) - R_{α} (F)) ⇝ Z \sim N_{0, σ_{F}^{2}},

(3)

where

σ_{F}^{2} : = \int \int g_{α}^{'} (F (x_{0})) Γ (x_{0}, x_{1}) g_{α}^{'} (F (x_{1})) d x_{0} d x_{1}

with

Γ (x_{0}, x_{1}) : = F (x_{0} \land x_{1}) (1 - F (x_{0} \lor x_{1})) + \sum_{i = 0}^{1} \sum_{k = 2}^{\infty} Cov (𝟙_{\{X_{1} \leq x_{i}\}}, 𝟙_{\{X_{k} \leq x_{1 - i}\}})

,

g_{α}^{'} : = \frac{1}{1 - α} 𝟙_{(α, 1]}

, and ⇝ refers to convergence in distribution (see also Shorack 1972; Shorack and Wellner 1986). In fact, for independent

X_{1}, X_{2}, \dots

the second summand in the definition of

Γ (x_{0}, x_{1})

vanishes. Results of Beutner and Zähle (2010) show that Equation (3) still holds if

{(X_{i})}_{i \in N}

is strictly stationary and

α

-mixing with mixing coefficients

α (i) = O (i^{- θ})

and

{lim}_{x \to \infty} (1 - F (x)) x^{2 θ / (θ - 1)} < \infty

for some

θ > 1 + \sqrt{2}

. Tsukahara (2013) obtained the same result. A similar result can also be derived from an earlier work by Mehra and Rao (1975), but under a faster decay of the mixing coefficients and under an additional assumption on the dependence structure. We emphasize that the method of proof proposed by Beutner and Zähle is rather flexible, because it easily extends to other weak and strong dependence concepts and other risk measures (see Beutner et al. 2012; Beutner and Zähle 2010, 2016; Krätschmer et al. 2013; Krätschmer and Zähle 2017).

Even in the i.i.d. case the asymptotic variance

σ_{F}^{2}

depends on F in a fairly complex way. For the approximation of the distribution of

\sqrt{n} (R_{α} ({\hat{F}}_{n}) - R_{α} (F))

, bootstrap methods should thus be superior to the method of estimating

σ_{F}^{2}

. However, to the best of our knowledge, theoretical investigations of the bootstrap for the Average Value at Risk seem to be rare. According to Gribkova (2016), a result of Gribkova (2002) yields bootstrap consistency for Efron’s bootstrap when

X_{1}, X_{2}, \dots

are i.i.d., while Theorem 3 of Helmers et al. (1990) does not seem to cover the Average Value at Risk, because there the function J (which plays the role of

g_{α}^{'}

) is assumed to be Lipschitz continuous. In these articles, bootstrap consistency is typically proved by first proving consistency of the bootstrap variance and then using this result by showing that upper bounds for the difference between the sampling distribution and the bootstrap distribution converge to zero. Employing different techniques, Beutner and Zähle (2016) established bootstrap consistency in probability for the multiplier bootstrap when

X_{1}, X_{2}, \dots

are i.i.d. as well as bootstrap consistency in probability for the circular bootstrap when

X_{1}, X_{2}, \dots

are strictly stationary and

β

-mixing with mixing coefficients

β (i) = O (i^{- b})

and

\int {| x |}^{p} d F (x) < \infty

for some

p > 2

and

b > p / (p - 2)

. Recently, Sun and Cheng (2018) established bootstrap consistency in probability for the moving blocks bootstrap when

X_{1}, X_{2}, \dots

are strictly stationary and

α

-mixing with mixing coefficients

α (i) \leq c δ^{i}

and

\int {| x |}^{p} d F (x) < \infty

for some

p > 4

,

c > 0

and

δ \in (0, 1)

. Strictly speaking, Sun and Cheng did not consider the Average Value at Risk (Expected Shortfall) but the Tail Conditional Expectation in the sense of Acerbi and Tasche (2002a, 2002b).

The contribution of the article at hand is twofold. First, we extend the results of Beutner and Zähle (2016) on the Average Value at Risk from bootstrap consistency in probability to bootstrap consistency almost surely. Second, we establish bootstrap consistency for the Average Value at Risk of collective risks, i.e., for

R_{α} (F^{* m})

and more general expressions.

The rest of the article is organized as follows. In Section 2, we present and illustrate our main results which are proved in Section 3. Section 3 is followed by the conclusions. The proofs of Section 3 rely on a new functional delta-method for the almost sure bootstrap which seems to be interesting in its own right and which is presented in Appendix B. Roughly speaking, the (functional) delta method studies properties of particular estimators for quantities of the form

H (θ)

. Here, H is a known functional, such as the Average Value at Risk functional, and

θ

is a possibly infinite dimensional parameter, such as an unknown distribution function. The particular estimators covered by the (functional) delta method are of the form

H ({\hat{T}}_{n})

where

{\hat{T}}_{n}

is an estimator for

θ

. In general and in the particular application considered here, the appeal of the (functional) delta method lies in the fact that, once “differentiability” of H (here, the Average Value at Risk functional) is established, the asymptotic error distribution of

H ({\hat{T}}_{n})

can immediately be derived from the asymptotic error distribution of

{\hat{T}}_{n}

(here

{\hat{F}}_{n}

). This also applies to the (functional) delta method for the bootstrap where bootstrap consistency of the bootstrapped version of

H ({\hat{T}}_{n})

will follow from the respective property of the bootstrapped version of

{\hat{T}}_{n}

(here

{\hat{F}}_{n}

). Thus, if in financial or actuarial applications the data show dependencies for which the asymptotic error distribution and/or bootstrap consistency of plug-in estimators for the Average Value at Risk have not been established yet, it would be enough to check if for these dependencies the asymptotic error distribution and/or bootstrap consistency of

{\hat{F}}_{n}

is known; thanks to the (functional) delta method the Average Value at Risk functional would inherit these properties. In Appendix A.1, we give results on convergence in distribution for the open-ball

σ

-algebra which are needed for the main results, and in Appendix A.2 we prove a delta-method for uniformly quasi-Hadamard differentiable maps that is the basis for the method of Appendix B. Readers interested in these methods used to prove the main results might wish to first work through Appendix A and Appendix B before reading Section 2 and Section 3.

2. Main Results

2.1. The Case of i.i.d. Observations

Keep the notation of Section 1. Assume that

{(X_{i})}_{i \in N}

is a sequence of i.i.d. real-valued random variables on some probability space

(Ω, F, P)

with distribution function F. Let

{\hat{F}}_{n} : = \frac{1}{n} \sum_{i = 1}^{n} 𝟙_{[X_{i}, \infty)}

and

(W_{n i})

be a triangular array of nonnegative real-valued random variables on another probability space

(Ω^{'}, F^{'}, P^{'})

such that one of the following two settings is met.

S1.: The random vector $(W_{n 1}, \dots, W_{n n})$ is multinomially distributed according to the parameters n and $p_{1} = \dots = p_{n} : = 1 / n$ for every $n \in N$.
S2.: $W_{n i} = Y_{i} / {\bar{Y}}_{n}$ for every $i = 1, \dots, n$ and $n \in N$, where ${\bar{Y}}_{n} : = \frac{1}{n} \sum_{j = 1}^{n} Y_{j}$ and $(Y_{j})$ is any sequence of nonnegative i.i.d. random variables on $(Ω^{'}, F^{'}, P^{'})$ with $\int_{0}^{\infty} P^{'} {[Y_{1} > t]}^{1 / 2} d t < \infty$ and ${Var}^{'} {[Y_{1}]}^{1 / 2} = E^{'} [Y_{1}] > 0$.

Let

(\bar{Ω}, \bar{F}, \bar{P}) : = (Ω \times Ω^{'}, F \otimes F^{'}, P \otimes P^{'})

and

{\hat{F}}_{n}^{*} (ω, ω^{'}) : = \frac{1}{n} \sum_{i = 1}^{n} W_{n i} (ω^{'}) 𝟙_{[X_{i} (ω), \infty)}

. Setting S1. is nothing but Efron’s bootstrap (Efron 1979). If in Setting S2. the distribution of

Y_{1}

is the exponential distribution with parameter 1, then the resulting scheme is in line with the Bayesian bootstrap of Rubin (1981). Let

σ_{F}^{2} : = \int \int g_{α}^{'} (F (x_{0})) Γ (x_{0}, x_{1}) g_{α}^{'} (F (x_{1})) d x_{0} d x_{1}

with

Γ (x_{0}, x_{1}) : = F (x_{0} \land x_{1}) (1 - F (x_{0} \lor x_{1}))

.

Theorem 1.

In the setting above assume that

\int ϕ^{2} d F < \infty

for some continuous function

ϕ : R \to [1, \infty)

with

\int 1 / ϕ (x) d x < \infty

(in particular

F \in F_{1}

), and that F takes the value α only once. Then

\sqrt{n} (R_{α} ({\hat{F}}_{n}) - R_{α} (F)) ⇝ Z \sim N_{0, σ_{F}^{2}}

(4)

and

\sqrt{n} (R_{α} ({\hat{F}}_{n}^{*} (ω, \cdot)) - R_{α} ({\hat{F}}_{n} (ω))) ⇝ Z \sim N_{0, σ_{F}^{2}}, \quad P\text{-a.e. } ω .

(5)

Theorem 1 is a special case of Corollary 1 below. For the bootstrap Scheme S1. the result of Theorem 1 can also be deduced from Theorem 7 in Gribkova (2002). According to Gribkova (2016), Condition (1) of this theorem is satisfied if there are

0 = a_{0} < a_{1} < \dots < a_{k} = 1

for some

k \in N

such that J is Hölder continuous on each interval

(a_{i - 1}, a_{i})

,

1 \leq i \leq k

, and the measure

d F^{- 1}

has no mass at the points

a_{1}, \dots, a_{k - 1}

. For the bootstrap Scheme S2. the result seems to be new.
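The two weighting schemes S1. and S2. and the bootstrapped error on the left-hand side of Equation (5) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, only the standard library is used, and for S2. the multipliers are taken to be Exp(1), for which the standard deviation equals the mean as S2. requires.

```python
import random


def g_alpha(t, alpha):
    """Distortion function g_alpha(t) = max(t - alpha, 0) / (1 - alpha)."""
    return max(t - alpha, 0.0) / (1.0 - alpha)


def avar_weighted(atoms, masses, alpha):
    """R_alpha of the discrete distribution with the given atoms and masses.

    The masses are assumed to be nonnegative and to sum to 1.
    """
    total, cum = 0.0, 0.0
    for x, m in sorted(zip(atoms, masses)):
        new_cum = min(cum + m, 1.0)  # guard against rounding above 1
        total += (g_alpha(new_cum, alpha) - g_alpha(cum, alpha)) * x
        cum = new_cum
    return total


def weights_s1(n, rng):
    """S1.: multinomial counts with n trials and cell probabilities 1/n."""
    w = [0] * n
    for _ in range(n):
        w[rng.randrange(n)] += 1
    return w


def weights_s2(n, rng):
    """S2. with Y_j ~ Exp(1) (an assumption): W_ni = Y_i / mean(Y)."""
    y = [rng.expovariate(1.0) for _ in range(n)]
    y_bar = sum(y) / n
    return [yi / y_bar for yi in y]


def bootstrap_errors(sample, alpha, draw_weights, reps, rng):
    """Monte Carlo draws of sqrt(n) * (R(F*_n) - R(F_hat_n)) for a fixed sample."""
    n = len(sample)
    base = avar_weighted(sample, [1.0 / n] * n, alpha)
    errors = []
    for _ in range(reps):
        w = draw_weights(n, rng)
        boot = avar_weighted(sample, [wi / n for wi in w], alpha)
        errors.append(n ** 0.5 * (boot - base))
    return errors


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100)]  # hypothetical i.i.d. sample
errs_s1 = bootstrap_errors(data, 0.9, weights_s1, 200, rng)
errs_s2 = bootstrap_errors(data, 0.9, weights_s2, 200, rng)
```

In both schemes the weights sum to n, so the bootstrapped empirical distribution function is again a proper distribution function and the same weighted routine serves for both.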

We now consider the collective risk model. Let

{(X_{i})}_{i \in N}

and

{\hat{F}}_{n}

be as above, and let

p = {(p_{k})}_{k \in N_{0}}

be the counting density of a distribution on

N_{0}

. Let

F

denote the set of all distribution functions on

R

, and consider the functional

C_{p} : F \to F

defined by

C_{p} (F) : = \sum_{k = 0}^{\infty} p_{k} F^{* k}

, where

F^{* k}

refers to the k-fold convolution of F, i.e.,

F^{* 0} : = 𝟙_{[0, \infty)}

and

F^{* k} (x) : = \int F (x - x_{k - 1}) d F^{* (k - 1)} (x_{k - 1}) = \int \dots \int F (x - x_{k - 1} - \dots - x_{1}) d F (x_{1}) \dots d F (x_{k - 1})

for

k \in N

. If

p_{m} = 1

for some

m \in N_{0}

, then

C_{p} (F) = F^{* m}

. Let

σ_{p, F}^{2} : = \int \int \int \int g_{α}^{'} (F (x_{0})) Γ (x_{0} - y_{0}, x_{1} - y_{1}) g_{α}^{'} (F (x_{1})) d H_{p, F} (y_{0}) d H_{p, F} (y_{1}) d x_{0} d x_{1}

with

Γ (z_{0}, z_{1}) : = F (z_{0} \land z_{1}) (1 - F (z_{0} \lor z_{1}))

and

H_{p, F} : = \sum_{k = 1}^{\infty} k p_{k} F^{* (k - 1)}

.

Theorem 2.

In the setting above assume that

\int {| x |}^{2 λ} d F (x) < \infty

for some

λ > 1

(in particular

F \in F_{1}

) and

\sum_{k = 1}^{\infty} p_{k} k^{1 + λ} < \infty

, and that

C_{p} (F)

takes the value α only once. Then,

\sqrt{n} (R_{α} (C_{p} ({\hat{F}}_{n})) - R_{α} (C_{p} (F))) ⇝ Z \sim N_{0, σ_{p, F}^{2}}

(6)

and

\sqrt{n} (R_{α} (C_{p} ({\hat{F}}_{n}^{*} (ω, \cdot))) - R_{α} (C_{p} ({\hat{F}}_{n} (ω)))) ⇝ Z \sim N_{0, σ_{p, F}^{2}}, \quad P\text{-a.e. } ω .

(7)

Theorem 2 is a special case of Corollary 4 below. Lauer and Zähle (2015, 2017) derive the asymptotic distribution as well as almost sure bootstrap consistency for the Average Value at Risk (and more general risk measures) of

F^{* m_{n}}

when

m_{n} / n

is asymptotically constant, but we do not know any result in the existing literature which is comparable to that of Theorem 2.
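For a finitely supported counting density p and a small sample, the plug-in quantity R_α(C_p(F̂_n)) appearing in Equation (6) can be computed exactly by convolving discrete distributions. The sketch below is only meant to make the definition of C_p concrete; the dictionary representation and all function names are assumptions of this example, and the number of atoms grows quickly with the convolution order, so it is not a practical algorithm for large n. A counting density with infinite support (e.g., Poisson) would need truncation, which is not shown.

```python
from collections import defaultdict


def g_alpha(t, alpha):
    """Distortion function g_alpha(t) = max(t - alpha, 0) / (1 - alpha)."""
    return max(t - alpha, 0.0) / (1.0 - alpha)


def convolve(d1, d2):
    """Convolution of two discrete distributions given as {atom: mass} dicts."""
    out = defaultdict(float)
    for x, px in d1.items():
        for y, py in d2.items():
            out[x + y] += px * py
    return dict(out)


def compound(dist, p):
    """C_p(F) = sum_k p_k F^{*k} for a finite counting density p = [p_0, p_1, ...]."""
    result = defaultdict(float)
    power = {0.0: 1.0}  # F^{*0} is the Dirac mass at 0
    for k, pk in enumerate(p):
        if k > 0:
            power = convolve(power, dist)  # F^{*k} from F^{*(k-1)}
        if pk > 0:
            for x, m in power.items():
                result[x] += pk * m
    return dict(result)


def avar_discrete(dist, alpha):
    """R_alpha of a discrete distribution given as an {atom: mass} dict."""
    total, cum = 0.0, 0.0
    for x in sorted(dist):
        new_cum = min(cum + dist[x], 1.0)
        total += (g_alpha(new_cum, alpha) - g_alpha(cum, alpha)) * x
        cum = new_cum
    return total


# Toy claims sample {1, 2} and p_2 = 1, i.e. C_p(F) = F^{*2}.
claims = [1.0, 2.0]
empirical = defaultdict(float)
for c in claims:
    empirical[c] += 1.0 / len(claims)
two_fold = compound(dict(empirical), [0.0, 0.0, 1.0])
premium = avar_discrete(two_fold, 0.5)
```

Here `two_fold` puts masses 1/4, 1/2, 1/4 on the atoms 2, 3, 4, and the Average Value at Risk at level 0.5 averages the upper half of that distribution.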

2.2. The Case of β-Mixing Observations

Keep the notation of Section 1. Assume that

{(X_{i})}_{i \in N}

is a strictly stationary sequence of

β

-mixing random variables on

(Ω, F, P)

with distribution function F. As before let

{\hat{F}}_{n} : = \frac{1}{n} \sum_{i = 1}^{n} 𝟙_{[X_{i}, \infty)}

. Let

(ℓ_{n})

be a sequence of integers such that

ℓ_{n} ↗ \infty

as

n \to \infty

, and

ℓ_{n} < n

for all

n \in N

. Set

k_{n} : = ⌈ n / ℓ_{n} ⌉

for all

n \in N

. Let

{(I_{n j})}_{n \in N, 1 \leq j \leq k_{n}}

be a triangular array of random variables on

(Ω^{'}, F^{'}, P^{'})

such that

I_{n 1}, \dots, I_{n k_{n}}

are i.i.d. according to the uniform distribution on

\{1, \dots, n - ℓ_{n} + 1\}

for every

n \in N

. Let

(\bar{Ω}, \bar{F}, \bar{P}) : = (Ω \times Ω^{'}, F \otimes F^{'}, P \otimes P^{'})

and

{\hat{F}}_{n}^{*} (ω, ω^{'}) : = \frac{1}{n} \sum_{i = 1}^{n} W_{n i} (ω^{'}) 𝟙_{[X_{i} (ω), \infty)}

with

W_{n i} (ω^{'}) : = \sum_{j = 1}^{k_{n} - 1} 𝟙_{\{I_{n j} \leq i \leq I_{n j} + ℓ_{n} - 1\}} (ω^{'}) + 𝟙_{\{I_{n k_{n}} \leq i \leq I_{n k_{n}} + (n - (k_{n} - 1) ℓ_{n}) - 1\}} (ω^{'}) .

(8)

Note that the sequence

(X_{i})

and the triangular array

(W_{n i})

regarded as families of random variables on the product space

(\bar{Ω}, \bar{F}, \bar{P}) : = (Ω \times Ω^{'}, F \otimes F^{'}, P \otimes P^{'})

are independent. At an informal level, this means that, given a sample

X_{1}, \dots, X_{n}

, we pick

k_{n} - 1

blocks of length

ℓ_{n}

and one block of length

n - (k_{n} - 1) ℓ_{n}

in the sample

X_{1}, \dots, X_{n}

, where the start indices

I_{n 1}, I_{n 2}, \dots, I_{n k_{n}}

are chosen independently and uniformly in the set of indices

\{1, \dots, n - ℓ_{n} + 1\}

:

block 1:	$X_{I_{n 1}}, X_{I_{n 1} + 1}, \dots, X_{I_{n 1} + ℓ_{n} - 1}$
block 2:	$X_{I_{n 2}}, X_{I_{n 2} + 1}, \dots, X_{I_{n 2} + ℓ_{n} - 1}$
	⋮
block $k_{n} - 1$ :	$X_{I_{n (k_{n} - 1)}}, X_{I_{n (k_{n} - 1)} + 1}, \dots, X_{I_{n (k_{n} - 1)} + ℓ_{n} - 1}$
block $k_{n}$ :	$X_{I_{n k_{n}}}, X_{I_{n k_{n}} + 1}, \dots, X_{I_{n k_{n}} + (n - (k_{n} - 1) ℓ_{n}) - 1}$ .

The bootstrapped empirical distribution function

{\hat{F}}_{n}^{*}

is then defined to be the distribution function of the discrete probability measure with atoms

X_{1}, \dots, X_{n}

carrying masses

W_{n 1}, \dots, W_{n n}

, respectively, where

W_{n i}

specifies the number of blocks which contain

X_{i}

. This is known as the blockwise bootstrap (see, e.g., Bühlmann (1994, 1995) and references therein). Assume that the following assertions hold:

A1.: $\int ϕ^{p} d F < \infty$ for some $p > 4$ (in particular $F \in F_{1}$).
A2.: The sequence of random variables $(X_{i})$ is strictly stationary and $β$-mixing with mixing coefficients $(β_{i})$ satisfying $β_{i} \leq c δ^{i}$ for some constants $c > 0$ and $δ \in (0, 1)$.
A3.: The block length $ℓ_{n}$ satisfies $ℓ_{n} = O (n^{γ})$ for some $γ \in (0, 1 / 2)$.

Let

{\hat{C}}_{n} : = E^{'} [{\hat{F}}_{n}^{*}] = \frac{1}{n} \sum_{i = 1}^{n} w_{n i} 𝟙_{[X_{i}, \infty)}

with

w_{n i} : = E^{'} [W_{n i}]

, and note that

w_{n i} = \begin{cases} k_{n} \frac{i}{n - ℓ_{n} + 1}, & i = 1, \dots, n - (k_{n} - 1) ℓ_{n} \\ (k_{n} - 1) \frac{i}{n - ℓ_{n} + 1} + \frac{n - (k_{n} - 1) ℓ_{n}}{n - ℓ_{n} + 1}, & i = n - (k_{n} - 1) ℓ_{n} + 1, \dots, ℓ_{n} \\ (k_{n} - 1) \frac{ℓ_{n}}{n - ℓ_{n} + 1} + \frac{n - (k_{n} - 1) ℓ_{n}}{n - ℓ_{n} + 1} = \frac{n}{n - ℓ_{n} + 1}, & i = ℓ_{n} + 1, \dots, n - ℓ_{n} \\ (k_{n} - 1) \frac{n - i + 1}{n - ℓ_{n} + 1} + \frac{2 n - k_{n} ℓ_{n} - i + 1}{n - ℓ_{n} + 1}, & i = n - ℓ_{n} + 1, \dots, n - (k_{n} ℓ_{n} - n) \\ (k_{n} - 1) \frac{n - i + 1}{n - ℓ_{n} + 1}, & i = n - (k_{n} ℓ_{n} - n) + 1, \dots, n \end{cases}

(9)

which can be verified easily. Let

σ_{F}^{2} : = \int \int g_{α}^{'} (F (x_{0})) Γ (x_{0}, x_{1}) g_{α}^{'} (F (x_{1})) d x_{0} d x_{1}

with

Γ (x_{0}, x_{1}) : = F (x_{0} \land x_{1}) (1 - F (x_{0} \lor x_{1})) + \sum_{i = 0}^{1} \sum_{k = 2}^{\infty} Cov (𝟙_{\{X_{1} \leq x_{i}\}}, 𝟙_{\{X_{k} \leq x_{1 - i}\}})

.
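The blockwise weights of Equation (8) and their expectations w_{ni} can be sketched as follows. The code draws the start indices I_{nj} uniformly from {1, ..., n − ℓ_n + 1} and counts how many blocks cover each index; the expectations are obtained by counting admissible start indices rather than by transcribing Equation (9), so they serve as an independent check of that formula. Function names are hypothetical and ℓ_n < n is assumed, as in the text.

```python
import math
import random


def block_weights(n, ell, rng):
    """Draw (W_n1, ..., W_nn) as in Equation (8).

    k_n - 1 blocks have length ell; the last block has length
    n - (k_n - 1) * ell, so the weights always sum to n.
    """
    k = math.ceil(n / ell)
    last_len = n - (k - 1) * ell
    w = [0] * n
    for j in range(k):
        start = rng.randint(1, n - ell + 1)  # I_nj, uniform start index
        length = ell if j < k - 1 else last_len
        for i in range(start, start + length):
            w[i - 1] += 1
    return w


def expected_weights(n, ell):
    """w_ni = E'[W_ni], by counting start indices whose block covers i."""
    k = math.ceil(n / ell)
    last_len = n - (k - 1) * ell
    m = n - ell + 1  # number of possible start indices

    def hits(i, length):
        lo, hi = max(1, i - length + 1), min(i, m)
        return max(0, hi - lo + 1)

    return [
        ((k - 1) * hits(i, ell) + hits(i, last_len)) / m
        for i in range(1, n + 1)
    ]


rng = random.Random(1)
w = block_weights(20, 6, rng)  # n = 20, block length 6, hence k_n = 4
w_bar = expected_weights(20, 6)
```

For n = 20 and ℓ_n = 6, the interior indices i = 7, ..., 14 all have expected weight n/(n − ℓ_n + 1) = 20/15, and the first index has expected weight k_n/(n − ℓ_n + 1) = 4/15, in agreement with the third and first cases of Equation (9).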

Theorem 3.

In the setting above (in particular under A1.–A3.) assume that F takes the value α only once. Then, we have

\sqrt{n} (R_{α} ({\hat{F}}_{n}) - R_{α} (F)) ⇝ Z \sim N_{0, σ_{F}^{2}}

(10)

and

\sqrt{n} (R_{α} ({\hat{F}}_{n}^{*} (ω, \cdot)) - R_{α} ({\hat{C}}_{n} (ω))) ⇝ Z \sim N_{0, σ_{F}^{2}}, \quad P\text{-a.e. } ω .

(11)

Theorem 3 is a special case of Corollary 1 below. To the best of our knowledge, there does not yet exist any result on almost sure bootstrap consistency for the Average Value at Risk when the underlying data are dependent.

2.3. Applications

2.3.1. Bootstrapping the Down Side Risk of an Asset Price

Let

{(A_{i})}_{i \in N_{0}}

be the price process of an asset. Let us assume that it is induced by an initial state

A_{0} \in R_{+}

and a sequence of

R_{+}

-valued i.i.d. random variables

{(R_{i})}_{i \in N}

via

A_{i} : = R_{i} A_{i - 1}

,

i \in N

. Here,

R_{i}

is the return of the asset in between time

i - 1

and time i. For instance, if

A_{0}, A_{1}, A_{2}, \dots

are the observations of a time-continuous Black–Scholes–Merton model with drift

μ

and volatility

σ

at the points of the time grid

\{0, h, 2 h, \dots\}

, then the distribution of

R_{1}

is the log-normal distribution with parameters

(μ - σ^{2} / 2) h

and

σ^{2} h

. However, the adequacy of a specific parametric model is usually hard to verify. For this reason, we do not restrict ourselves to any particular parametric structure for the dynamics of

{(R_{i})}_{i \in N}

.

Let us assume that we can observe the asset prices

A_{0}, \dots, A_{n}

up to time n, and that we are interested in the Average Value at Risk at level

α

of the negative price change

A_{n} - A_{n + 1}

(which specifies the down side risk of the asset) in between time n and

n + 1

. That is, since for any

a_{0}, \dots, a_{n} \in R_{+}

the unconditional distribution of

(1 - R_{n + 1}) a_{n}

coincides with the factorized conditional distribution of

A_{n} - A_{n + 1} = (1 - R_{n + 1}) A_{n}

given

(A_{0}, \dots, A_{n}) = (a_{0}, \dots, a_{n})

, we are in fact interested in

R_{α} (F) = {AV @ R}_{α} (X)

for the distribution function F of

X : = (1 - R_{n + 1}) a

for any fixed

a \in R_{+}

. As the random variables

X_{1} : = (1 - R_{1}) a, \dots, X_{n} : = (1 - R_{n}) a

are i.i.d. copies of X, we can use

R_{α} ({\hat{F}}_{n})

as an estimator for

R_{α} (F)

and derive from Equation (4) an asymptotic confidence interval at a given level

τ \in (0, 1)

for

R_{α} (F)

where one has to estimate

σ_{F}^{2}

by

\int \int g_{α}^{'} ({\hat{F}}_{n} (x_{0})) {\hat{F}}_{n} (x_{0} \land x_{1}) (1 - {\hat{F}}_{n} (x_{0} \lor x_{1})) g_{α}^{'} ({\hat{F}}_{n} (x_{1})) d x_{0} d x_{1}

. As the estimator for

σ_{F}^{2}

depends on

{\hat{F}}_{n}

in a somewhat complex way, the bootstrap confidence interval

[R_{α} ({\hat{F}}_{n} (ω)) - \frac{1}{\sqrt{n}} {\hat{q}}_{1 - τ / 2}^{*} (ω), R_{α} ({\hat{F}}_{n} (ω)) - \frac{1}{\sqrt{n}} {\hat{q}}_{τ / 2}^{*} (ω)]

(12)

at level

τ

derived from Equations (4) and (5) is supposed to have a slightly better performance. Here,

{\hat{q}}_{t}^{*} (ω)

denotes a t-quantile of (a Monte Carlo approximation of) the distribution of the left-hand side in Equation (5) for fixed

ω

. For Equations (4) and (5) it suffices to assume that

E [| R_{1} |^{2 + ε}] < \infty

for some arbitrarily small

ε > 0

.
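A Monte Carlo version of the bootstrap confidence interval in Equation (12) can be sketched as follows, here with Efron's bootstrap (S1.). Everything specific in the example is an assumption made for illustration: the returns are simulated from a log-normal distribution, a = 100 is an arbitrary price level, and the quantile is the usual empirical quantile of the bootstrap draws. We read Equation (12) such that τ = 0.05 targets a nominal coverage of 0.95.

```python
import math
import random


def g_alpha(t, alpha):
    """Distortion function g_alpha(t) = max(t - alpha, 0) / (1 - alpha)."""
    return max(t - alpha, 0.0) / (1.0 - alpha)


def avar_weighted(atoms, masses, alpha):
    """R_alpha of the discrete distribution with the given atoms and masses."""
    total, cum = 0.0, 0.0
    for x, m in sorted(zip(atoms, masses)):
        new_cum = min(cum + m, 1.0)
        total += (g_alpha(new_cum, alpha) - g_alpha(cum, alpha)) * x
        cum = new_cum
    return total


def bootstrap_ci(sample, alpha, tau, reps, rng):
    """Interval of Equation (12) with Monte Carlo quantiles under scheme S1."""
    n = len(sample)
    root_n = math.sqrt(n)
    estimate = avar_weighted(sample, [1.0 / n] * n, alpha)
    errors = []
    for _ in range(reps):
        counts = [0] * n
        for _ in range(n):
            counts[rng.randrange(n)] += 1  # multinomial weights W_n1..W_nn
        boot = avar_weighted(sample, [c / n for c in counts], alpha)
        errors.append(root_n * (boot - estimate))
    errors.sort()

    def quantile(t):
        # empirical t-quantile of the bootstrap draws
        return errors[min(reps - 1, max(0, math.ceil(t * reps) - 1))]

    lower = estimate - quantile(1.0 - tau / 2.0) / root_n
    upper = estimate - quantile(tau / 2.0) / root_n
    return lower, estimate, upper


rng = random.Random(2018)
a = 100.0  # hypothetical current asset price
returns = [rng.lognormvariate(0.0, 0.2) for _ in range(200)]  # simulated R_i
losses = [(1.0 - r) * a for r in returns]  # X_i = (1 - R_i) a
lower, estimate, upper = bootstrap_ci(losses, alpha=0.95, tau=0.05, reps=300, rng=rng)
```

Note that the upper quantile enters the lower endpoint and vice versa, exactly as in Equation (12); this is the "basic" bootstrap interval rather than the percentile interval.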

2.3.2. Bootstrapping the Total Risk Premium in Insurance Models

In actuarial mathematics, the collective risk model is frequently used for modeling the total claim distribution of an insurance collective. If the counting density

p = {(p_{k})}_{k \in N_{0}}

corresponds to the distribution of the random number N of claims caused by the whole collective within one insurance period, and if

X_{1}, \dots, X_{N}

(

, X_{N + 1}, \dots

) denote the i.i.d. sizes of the corresponding claims with marginal distribution F, then

C_{p} (F)

is the distribution of the total claim

\sum_{i = 1}^{N} X_{i}

(the latter sum is set to 0 if

N = 0

). Now,

R_{α} (C_{p} (F))

is a suitable insurance premium for the whole collective when the Average Value at Risk at level

α

is considered to be a suitable premium principle.

Assume that p is known, for instance

p_{m} = 1

for some fixed

m \in N

, and let

X_{1}, \dots, X_{n}

be observed historical (i.i.d.) claims with n large. On the one hand, the construction of an exact confidence interval for

R_{α} (C_{p} (F))

at level

τ \in (0, 1)

based on

X_{1}, \dots, X_{n}

is hardly possible. Likewise, the performance of an asymptotic confidence interval at level

τ

derived from Equation (6) with (nonparametrically) estimated

σ_{p, F}^{2}

is typically only moderate. Take into account that

σ_{p, F}^{2}

depends on the unknown F in a fairly complex way. On the other hand, the bootstrap confidence interval

[R_{α} (C_{p} ({\hat{F}}_{n} (ω))) - \frac{1}{\sqrt{n}} {\hat{q}}_{1 - τ / 2}^{*} (ω), R_{α} (C_{p} ({\hat{F}}_{n} (ω))) - \frac{1}{\sqrt{n}} {\hat{q}}_{τ / 2}^{*} (ω)]

at level

τ

derived from Equation (7) should have a better performance. Here,

{\hat{q}}_{t}^{*} (ω)

denotes a t-quantile of (a Monte Carlo approximation of) the distribution of the left-hand side in Equation (7) for fixed

ω

.

Note that Theorem 2 ensures that Equations (6) and (7) hold true when the marginal distribution F of the

X_{i}

is any log-normal distribution, any Gamma distribution, any Pareto distribution with tail index greater than 2, or any convex combination of one of these distributions with the Dirac measure

δ_{0}

, and the counting density p corresponds to any Dirac measure with atom in

N

, any binomial distribution, any Poisson distribution, or any geometric distribution. The former distributions are classical examples for the single claim distribution and the latter distributions are classical examples for the claim number distribution.

3. Proofs of Main Results

Here, we prove the results of Section 2. In fact, Theorems 1–3 are special cases of Corollaries 1 and 4. The latter corollaries are proved with the help of the technique introduced in Appendix B.2, which in turn makes use of the concept of uniform quasi-Hadamard differentiability (see Definition A1 in Appendix B.1).

Keep the notation introduced in Section 1. Let

D

be the space of all càdlàg functions v on

R

with finite sup-norm

{∥ v ∥}_{\infty} : = {sup}_{t \in R} | v (t) |

, and

\mathcal{D}

be the

σ

-algebra on

D

generated by the one-dimensional coordinate projections

π_{t}

,

t \in R

, given by

π_{t} (v) : = v (t)

. Let

ϕ : R \to [1, \infty)

be a weight function, i.e., a continuous function that is non-increasing on

(- \infty, 0]

and non-decreasing on

[0, \infty)

. Let

D_{ϕ}

be the subspace of

D

consisting of all

v \in D

satisfying

{∥ v ∥}_{ϕ} : = {∥ v ϕ ∥}_{\infty} < \infty

and

{lim}_{| t | \to \infty} | v (t) | = 0

. The latter condition automatically holds when

{lim}_{| t | \to \infty} ϕ (t) = \infty

. We equip

D_{ϕ}

with the trace

σ

-algebra of

\mathcal{D}

, and note that this

σ

-algebra coincides with the

σ

-algebra

B_{ϕ}^{\circ}

on

D_{ϕ}

generated by the

{∥ \cdot ∥}_{ϕ}

-open balls (see Lemma 4.1 in Beutner and Zähle (2016)).

3.1. Average Value at Risk Functional

Using the terminology of Part (i) of Definition A1, we obtain the following result.

Proposition 1.

Let

F \in F_{1}

and assume that F takes the value α only once. Let

S

be the set of all sequences

(G_{n}) \subseteq F_{1}

with

G_{n} \to F

pointwise. Moreover, assume that

\int 1 / ϕ (x) d x < \infty

. Then, the map

R_{α} : F_{1} (\subseteq D) \to R

is uniformly quasi-Hadamard differentiable with respect to

S

tangentially to

D_{ϕ} 〈 D_{ϕ} 〉

, and the uniform quasi-Hadamard derivative

{\dot{R}}_{α; F} : D_{ϕ} \to R

is given by

{\dot{R}}_{α; F} (v) : = - \int g_{α}^{'} (F (x)) v (x) d x,

(13)

where as before

g_{α}^{'} : = \frac{1}{1 - α} 𝟙_{(α, 1]}

.

Proposition 1 shows in particular that for any

F \in F_{1}

which takes the value

α

only once, the map

R_{α} : F_{1} (\subseteq D) \to R

is uniformly quasi-Hadamard differentiable at F tangentially to

D_{ϕ} 〈 D_{ϕ} 〉

(in the sense of Part (ii) of Definition A1) with uniform quasi-Hadamard derivative given by Equation (13).
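The derivative formula in Equation (13) can be checked numerically in a simple special case. The sketch below takes F to be the uniform distribution function on [0, 1], perturbs it in the direction v = G − F with G(x) = x² on [0, 1] (so that F + εv is again a distribution function), and compares the difference quotient with the right-hand side of Equation (13). It only covers the fixed-F case (F_n = F for all n), the midpoint rule and all names are assumptions of this example, and it is an illustration, not a proof.

```python
def g_alpha(t, alpha):
    """Distortion function g_alpha(t) = max(t - alpha, 0) / (1 - alpha)."""
    return max(t - alpha, 0.0) / (1.0 - alpha)


def risk_on_unit_interval(dist_fn, alpha, grid=20000):
    """R_alpha via Equation (1) for a distribution function supported on [0, 1].

    Only the integral of 1 - g_alpha(F(x)) over [0, 1] contributes;
    it is approximated with the midpoint rule.
    """
    h = 1.0 / grid
    return h * sum(
        1.0 - g_alpha(dist_fn((j + 0.5) * h), alpha) for j in range(grid)
    )


def derivative_13(dist_fn, v, alpha, grid=20000):
    """Right-hand side of Equation (13): minus the integral of g'(F(x)) v(x)."""
    h = 1.0 / grid
    total = 0.0
    for j in range(grid):
        x = (j + 0.5) * h
        if dist_fn(x) > alpha:  # g'_alpha = 1/(1 - alpha) on (alpha, 1]
            total += v(x) / (1.0 - alpha)
    return -h * total


alpha, eps = 0.5, 1e-3


def F(x):
    return x  # uniform distribution function on [0, 1]


def v(x):
    return x * x - x  # direction G - F with G(x) = x^2


def F_eps(x):
    return F(x) + eps * v(x)


quotient = (risk_on_unit_interval(F_eps, alpha) - risk_on_unit_interval(F, alpha)) / eps
derivative = derivative_13(F, v, alpha)
```

In this example the integral in Equation (13) can be evaluated by hand as 2 ∫_{1/2}^{1} (x − x²) dx = 1/6, and both `quotient` and `derivative` should be close to that value for small ε.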

Proof. (of Proposition 1)

First, note that the map

{\dot{R}}_{α; F}

defined in Equation (13) is continuous with respect to

{∥ \cdot ∥}_{ϕ}

, because

| {\dot{R}}_{α; F} (v_{1}) - {\dot{R}}_{α; F} (v_{2}) | \leq \int \frac{1}{1 - α} | v_{1} (x) - v_{2} (x) | d x \leq (\frac{1}{1 - α} \int 1 / ϕ (x) d x) {∥ v_{1} - v_{2} ∥}_{ϕ}

holds for every

v_{1}, v_{2} \in D_{ϕ}

.

Now, let

((F_{n}), v, (v_{n}), (ε_{n}))

be a quadruple with

(F_{n}) \subseteq F_{1}

satisfying

F_{n} \to F

pointwise,

v \in D_{ϕ}

,

(v_{n}) \subseteq D_{ϕ}

satisfying

{∥ v_{n} - v ∥}_{ϕ} \to 0

and

(F_{n} + ε_{n} v_{n}) \subseteq F_{1}

, and

(ε_{n}) \subseteq (0, \infty)

satisfying

ε_{n} \to 0

. It remains to show that

lim_{n \to \infty} | \frac{R_{α} (F_{n} + ε_{n} v_{n}) - R_{α} (F_{n})}{ε_{n}} - {\dot{R}}_{α; F} (v) | = 0,

that is,

lim_{n \to \infty} | \int (\frac{g_{α} (F_{n} (x)) - g_{α} ((F_{n} + ε_{n} v_{n}) (x))}{ε_{n}} - (- g_{α}^{'} (F (x)) v (x))) d x | = 0 .

(14)

Let us denote the integrand of the integral in Equation (14) by

I_{n} (x)

. In virtue of

F_{n} \to F

pointwise,

{∥ v_{n} - v ∥}_{ϕ} \to 0

,

ε_{n} \to 0

, and

| (F_{n} + ε_{n} v_{n}) (x) - F (x) | \leq | F_{n} (x) - F (x) | + ε_{n} | v_{n} (x) - v (x) | + ε_{n} | v (x) |,

we have

{lim}_{n \to \infty} F_{n} (x) = F (x)

and

{lim}_{n \to \infty} (F_{n} (x) + ε_{n} v_{n} (x)) = F (x)

for every

x \in R

. Thus, for every

x \in R

with

F (x) < α

, we obtain

g_{α}^{'} (F (x)) v (x) = 0

and

\frac{g_{α} (F_{n} (x)) - g_{α} ((F_{n} + ε_{n} v_{n}) (x))}{ε_{n}} = 0 for sufficiently large n,

i.e.,

{lim}_{n \to \infty} I_{n} (x) = 0

. Moreover, for every

x \in R

with

F (x) > α

, we obtain

g_{α}^{'} (F (x)) v (x) = \frac{1}{1 - α} v (x)

and

\frac{g_{α} (F_{n} (x)) - g_{α} ((F_{n} + ε_{n} v_{n}) (x))}{ε_{n}} = - \frac{v_{n} (x)}{1 - α} for sufficiently large n,

i.e.,

{lim}_{n \to \infty} I_{n} (x) = 0

. Since we assumed that F takes the value

α

only once, we can conclude that

{lim}_{n \to \infty} I_{n} (x) = 0

for Lebesgue-a.e.

x \in R

. Moreover, by the Lipschitz continuity of

g_{α}

with Lipschitz constant

\frac{1}{1 - α}

we have

\begin{aligned} | I_{n} (x) | & = | I_{n} (x) | ϕ (x) ϕ {(x)}^{- 1} \\ & = | \frac{g_{α} (F_{n} (x)) - g_{α} ((F_{n} + ε_{n} v_{n}) (x))}{ε_{n}} + g_{α}^{'} (F (x)) v (x) | ϕ (x) ϕ {(x)}^{- 1} \\ & \leq \frac{1}{1 - α} ({∥ v_{n} ∥}_{ϕ} + {∥ v ∥}_{ϕ}) ϕ {(x)}^{- 1} \\ & \leq \frac{1}{1 - α} ({sup}_{n \in N} {∥ v_{n} ∥}_{ϕ} + {∥ v ∥}_{ϕ}) ϕ {(x)}^{- 1} . \end{aligned}

Since

{sup}_{n \in N} {∥ v_{n} ∥}_{ϕ} < \infty

(recall

{∥ v_{n} - v ∥}_{ϕ} \to 0

), the assumption

\int 1 / ϕ (x) d x < \infty

ensures that the latter expression provides a Borel measurable majorant of

I_{n}

. Now, the Dominated Convergence theorem implies Equation (14). ☐

As an immediate consequence of Corollary A4, Examples A1 and A2, and Proposition 1, we obtain the following corollary.

Corollary 1.

Let F,

{\hat{F}}_{n}

,

{\hat{F}}_{n}^{*}

,

{\hat{C}}_{n}

, and

B_{F}

be as in Example A1 (S1. or S2.) or as in Example A2 respectively, and assume that the assumptions discussed in Example A1 or in Example A2 respectively are fulfilled for some weight function ϕ with

\int 1 / ϕ (x) d x < \infty

(in particular

F \in F_{1}

). Moreover, assume that F takes the value α only once. Then,

\sqrt{n} (R_{α} ({\hat{F}}_{n}) - R_{α} (F)) ⇝ {\dot{R}}_{α; F} (B_{F}) \quad \text{in } (R, B (R))

and

\sqrt{n} (R_{α} ({\hat{F}}_{n}^{*} (ω, \cdot)) - R_{α} ({\hat{C}}_{n} (ω))) ⇝ {\dot{R}}_{α; F} (B_{F}) \quad \text{in } (R, B (R)), \quad P\text{-a.e. } ω .

3.2. Compound Distribution Functional

Let

C_{p} : F \to F

be the compound distribution functional introduced in Section 2.1. For any

λ \geq 0

, let the function

ϕ_{λ} : R \to [1, \infty)

be defined by

ϕ_{λ} (x) : = {(1 + | x |)}^{λ}

and denote by

F_{ϕ_{λ}}

the set of all distribution functions F that satisfy

\int ϕ_{λ} (x) d F (x) < \infty

. Using the terminology of Part (ii) of Definition A1, we obtain the following Proposition 2. In the proposition, the functional

C_{p}

is restricted to the domain

F_{ϕ_{λ}}

in order to obtain

D_{ϕ_{λ^{'}}}

as the corresponding trace. The latter will be important for Corollary 3.

Proposition 2.

Let

λ > λ^{'} \geq 0

and

F \in F_{ϕ_{λ}}

. Assume that

\sum_{k = 1}^{\infty} p_{k} k^{(1 + λ) \lor 2} < \infty

. Then, the map

C_{p} : F_{ϕ_{λ}} (\subseteq D) \to F (\subseteq D)

is uniformly quasi-Hadamard differentiable at F tangentially to

D_{ϕ_{λ}} 〈 D_{ϕ_{λ}} 〉

with trace

D_{ϕ_{λ^{'}}}

. Moreover, the uniform quasi-Hadamard derivative

{\dot{C}}_{p; F} : D_{ϕ_{λ}} \to D_{ϕ_{λ^{'}}}

is given by

{\dot{C}}_{p; F} (v) (\cdot) : = v * H_{p, F} (\cdot) : = \int v (\cdot - x) d H_{p, F} (x),

(15)

where as before

H_{p, F} : = \sum_{k = 1}^{\infty} k p_{k} F^{* (k - 1)}

. In particular, if

p_{m} = 1

for some

m \in N

, then

{\dot{C}}_{p; F} (v) (\cdot) = m \int v (\cdot - x) d F^{* (m - 1)} (x) .
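The derivative formula (15) can be checked numerically in a simple setting. The sketch below assumes claim sizes on the lattice {0, 1, 2, ...}, a finitely supported claim-number distribution p, and the representation C_p(F) = sum over k ≥ 0 of p_k F^{*k}; distribution functions are represented by their point masses, and the comparison is made at the level of these masses. All function names are hypothetical.

```python
def convolve(a, b):
    """Convolution of two (possibly signed) mass vectors on {0, 1, 2, ...}."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def conv_power(f, k):
    """Masses of the k-fold convolution; k = 0 gives the unit mass at 0."""
    out = [1.0]
    for _ in range(k):
        out = convolve(out, f)
    return out

def mixture(weights, vectors):
    """Weighted sum of mass vectors of possibly different lengths."""
    out = [0.0] * max(len(v) for v in vectors)
    for w, v in zip(weights, vectors):
        for i, x in enumerate(v):
            out[i] += w * x
    return out

def compound_masses(p, f):
    """Masses of C_p(F) = sum_k p[k] * F^{*k} for a finitely supported p."""
    return mixture(p, [conv_power(f, k) for k in range(len(p))])

def derivative_masses(p, f, dv):
    """Masses of v * H_{p,F}, where H_{p,F} = sum_{k>=1} k * p[k] * F^{*(k-1)}."""
    ks = range(1, len(p))
    h = mixture([k * p[k] for k in ks], [conv_power(f, k - 1) for k in ks])
    return convolve(dv, h)

p = [0.2, 0.5, 0.3]                      # P[N = 0], P[N = 1], P[N = 2]
f = [0.5, 0.5]                           # claim-size masses at 0 and 1
g = [0.1, 0.9]                           # a second claim-size distribution
dv = [gi - fi for gi, fi in zip(g, f)]   # direction v = G - F
eps = 1e-6
f_eps = [fi + eps * di for fi, di in zip(f, dv)]
quotient = [(a - b) / eps
            for a, b in zip(compound_masses(p, f_eps), compound_masses(p, f))]
derivative = derivative_masses(p, f, dv)
max_gap = max(abs(q - d) for q, d in zip(quotient, derivative))
```

The difference quotient and the candidate derivative agree up to a term of order ε, as the linearization suggests. This is only a sanity check in a finite setting and says nothing about the uniformity or the weighted norms treated in Proposition 2.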

Proposition 2 extends Proposition 4.1 of Pitts (1994). Before we prove the proposition, we note that the proposition together with Corollary A4 and Examples A1 and A2 yields the following corollary.

Corollary 2.

Let F,

{\hat{F}}_{n}

,

{\hat{F}}_{n}^{*}

,

{\hat{C}}_{n}

, and

B_{F}

be as in Example A1 (S1. or S2.) or as in Example A2 respectively, and assume that the assumptions discussed in Example A1 or in Example A2 respectively are fulfilled for

ϕ = ϕ_{λ}

for some

λ > 0

. Then, for

λ^{'} \in [0, λ)

\sqrt{n} (C_{p} ({\hat{F}}_{n}) - C_{p} (F)) ⇝^{\circ} {\dot{C}}_{p; F} (B_{F}) in (D_{ϕ_{λ^{'}}}, B_{ϕ_{λ^{'}}}^{\circ}, ∥ \cdot ∥_{ϕ_{λ^{'}}})

and

\sqrt{n} (C_{p} ({\hat{F}}_{n}^{*} (ω, \cdot)) - C_{p} ({\hat{C}}_{n} (ω))) ⇝^{\circ} {\dot{C}}_{p; F} (B_{F}) in (D_{ϕ_{λ^{'}}}, B_{ϕ_{λ^{'}}}^{\circ}, ∥ \cdot ∥_{ϕ_{λ^{'}}}), P-a.e. ω .

To ease the exposition of the proof of Proposition 2, we first state a lemma that follows from results given in Pitts (1994). In the sequel we use

f * H

to denote the function defined by

f * H (\cdot) : = \int f (\cdot - x) d H (x)

for any measurable function f and any distribution function H of a finite (not necessarily probability) Borel measure on

R

for which

f * H (\cdot)

is well defined on

R

.
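For a discrete measure, the convolution f * H just defined reduces to a finite sum. The sketch below assumes that H is given by finitely many atoms (x_j, h_j); the helper names are hypothetical. It also shows that convolving the indicator of [0, ∞) with H returns the distribution function of H itself.

```python
def convolve_with_measure(f, atoms):
    """Return t -> sum_j f(t - x_j) * h_j for a discrete measure with atoms (x_j, h_j)."""
    def result(t):
        return sum(f(t - x) * h for x, h in atoms)
    return result

def heaviside(t):
    """Indicator function of [0, infinity)."""
    return 1.0 if t >= 0 else 0.0

atoms = [(0.0, 0.5), (1.0, 0.5)]   # mass 1/2 at 0 and at 1
H = convolve_with_measure(heaviside, atoms)
values = [H(-1.0), H(0.5), H(1.0)]
```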

Lemma 1.

Let

λ > λ^{'} \geq 0

, and

(F_{n}) \subseteq F_{ϕ_{λ}}

and

(G_{n}) \subseteq F_{ϕ_{λ}}

be any sequences such that

∥ F_{n} - F ∥_{ϕ_{λ}} \to 0

and

∥ G_{n} - G ∥_{ϕ_{λ}} \to 0

for some

F, G \in F_{ϕ_{λ}}

. Then, the following two assertions hold.

(i): There exists a constant $C_{1} > 0$ such that for every $k, n \in N$

$∥ 𝟙_{[0, \infty)} - F_{n}^{* k} ∥_{ϕ_{λ^{'}}} \leq (2^{λ^{'} - 1} \lor 1) (1 + k^{λ^{'} \lor 1} C_{1}) .$
(ii): For every $v \in D_{ϕ_{λ^{'}}}$ there exists a constant $C_{2} > 0$ such that for every $k, ℓ, n \in N$

$∥ v * (F_{n}^{* k} * G_{n}^{* ℓ}) ∥_{ϕ_{λ^{'}}} \leq 2^{λ^{'}} (1 + 2^{λ^{'}} (2^{λ^{'} - 1} \lor 1) (2 + {(k + ℓ)}^{λ^{'} \lor 1} C_{2})) {∥ v ∥}_{ϕ_{λ^{'}}} .$

Proof.

(i): From Equation (2.4) in Pitts (1994) we have

\begin{matrix} ∥ 𝟙_{[0, \infty)} - F_{n}^{* k} ∥_{ϕ_{λ^{'}}} \leq (2^{λ^{'} - 1} \lor 1) (1 + k^{λ^{'} \lor 1} \int {| x |}^{λ^{'}} d F_{n} (x)), \end{matrix}

so that it remains to show that

\int {| x |}^{λ^{'}} d F_{n} (x)

is bounded above uniformly in

n \in N

. The functions

𝟙_{[0, \infty)} - F_{n}

and

𝟙_{[0, \infty)} - F

both lie in

D_{ϕ_{λ}}

, because

F_{n}, F \in F_{ϕ_{λ}}

. Along with

∥ F_{n} {- F ∥}_{ϕ_{λ}} \to 0

, this implies

\int {| x |}^{λ^{'}} d F_{n} (x) \to \int {| x |}^{λ^{'}} d F (x)

(see Lemma 2.1 in Pitts (1994)). Therefore,

\int {| x |}^{λ^{'}} d F_{n} (x) \leq C_{1}

for some suitable finite constant

C_{1} > 0

and all

n \in N

.

(ii): With the help of Lemma 2.3 of Pitts (1994) (along with

∥ F_{n}^{* k} * G_{n}^{* ℓ} ∥_{\infty} = 1

), Lemma 2.4 of Pitts (1994), and Equation (2.4) in Pitts (1994), we obtain

\begin{matrix} & ∥ v * (F_{n}^{* k} * G_{n}^{* ℓ}) ∥_{ϕ_{λ^{'}}} \\ \leq & 2^{λ^{'}} ∥ v ∥_{ϕ_{λ^{'}}} (1 + ∥ 𝟙_{[0, \infty)} - F_{n}^{* k} * G_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}}) \\ \leq & 2^{λ^{'}} ∥ v ∥_{ϕ_{λ^{'}}} (1 + 2^{λ^{'}} (∥ 𝟙_{[0, \infty)} - F_{n}^{* k} ∥_{ϕ_{λ^{'}}} + ∥ 𝟙_{[0, \infty)} - G_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}})) \\ \leq & 2^{λ^{'}} ∥ v ∥_{ϕ_{λ^{'}}} (1 + 2^{λ^{'}} (2^{λ^{'} - 1} \lor 1) (1 + k^{λ^{'} \lor 1} \int {| x |}^{λ^{'}} d F_{n} (x) + 1 + ℓ^{λ^{'} \lor 1} \int {| x |}^{λ^{'}} d G_{n} (x))) . \end{matrix}

It hence remains to show that

\int {| x |}^{λ^{'}} d F_{n} (x)

and

\int {| x |}^{λ^{'}} d G_{n} (x)

are bounded above uniformly in

n \in N

. However, this was already done in the proof of Part (i). ☐

Proof of Proposition 2.

First, note that for

G_{1}, G_{2} \in F_{ϕ_{λ}}

, we have

\begin{matrix} ∥ C_{p} (G_{1}) - C_{p} (G_{2}) ∥_{ϕ_{λ^{'}}} & \leq & ∥ C_{p} (G_{1}) - 𝟙_{[0, \infty)} ∥_{ϕ_{λ^{'}}} + {∥ 𝟙_{[0, \infty)} - C_{p} (G_{2}) ∥}_{ϕ_{λ^{'}}} \\ \leq & \int {(1 + | x |)}^{λ^{'}} d C_{p} (G_{1}) (x) + \int {(1 + | x |)}^{λ^{'}} d C_{p} (G_{2}) (x) \end{matrix}

by Equation (2.1) in Pitts (1994). Moreover, according to Lemma 2.2 in Pitts (1994), we have that the integrals

\int {| x |}^{λ^{'}} d C_{p} (G_{1}) (x)

and

\int {| x |}^{λ^{'}} d C_{p} (G_{2}) (x)

are finite under the assumptions of the proposition. Hence,

D_{ϕ_{λ^{'}}}

can indeed be seen as the trace.

Second, we show

(∥ \cdot ∥_{ϕ_{λ}}, ∥ \cdot ∥_{ϕ_{λ^{'}}})

-continuity of the map

{\dot{C}}_{p; F} : D_{ϕ_{λ}} \to D_{ϕ_{λ^{'}}}

. To this end let

v \in D_{ϕ_{λ}}

and

(v_{n}) \subseteq D_{ϕ_{λ}}

such that

∥ v_{n} {- v ∥}_{ϕ_{λ}} \to 0

. For every

k \in N

, we have

\begin{matrix} ∥ p_{k} k (v_{n} - v) * F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ \leq 2^{λ^{'}} {∥ v_{n} - v ∥}_{ϕ_{λ^{'}}} p_{k} k (∥ 𝟙_{[0, \infty)} ∥ F^{* (k - 1)} ∥_{\infty} - F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} + {∥ F^{* (k - 1)} ∥}_{\infty}) \\ = 2^{λ^{'}} {∥ v_{n} - v ∥}_{ϕ_{λ^{'}}} p_{k} k (∥ 𝟙_{[0, \infty)} - F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} + 1) \\ \leq 2^{λ^{'}} {∥ v_{n} - v ∥}_{ϕ_{λ^{'}}} p_{k} k ((2^{λ^{'} - 1} \lor 1) (1 + {(k - 1)}^{λ^{'} \lor 1} \int {| x |}^{λ^{'}} d F (x)) + 1), \end{matrix}

where the first and the second inequality follow from Lemma 2.3 and Equation (2.4) in Pitts (1994) respectively. Hence,

\begin{matrix} ∥ {\dot{C}}_{p; F} (v_{n}) - {\dot{C}}_{p; F} (v) ∥_{ϕ_{λ^{'}}} = {∥ v_{n} * H_{p, F} - v * H_{p, F} ∥}_{ϕ_{λ^{'}}} \\ \leq 2^{λ^{'}} {∥ v_{n} - v ∥}_{ϕ_{λ^{'}}} \sum_{k = 1}^{\infty} p_{k} k ((2^{λ^{'} - 1} \lor 1) (1 + {(k - 1)}^{λ^{'} \lor 1} \int {| x |}^{λ^{'}} d F (x)) + 1) . \end{matrix}

Now, the series converges due to the assumptions, and

∥ v_{n} {- v ∥}_{ϕ_{λ}} \to 0

implies

∥ v_{n} {- v ∥}_{ϕ_{λ^{'}}} \to 0

. Thus,

∥ {\dot{C}}_{p; F} (v_{n}) - {\dot{C}}_{p; F} (v) ∥_{ϕ_{λ^{'}}} \to 0

, which proves continuity.

Third, let

((F_{n}), v, (v_{n}), (ε_{n}))

be a quadruple with

(F_{n}) \subseteq F_{ϕ_{λ}}

satisfying

∥ F_{n} {- F ∥}_{ϕ_{λ}} \to 0

,

v \in D_{ϕ_{λ}}

,

(v_{n}) \subseteq D_{ϕ_{λ}}

satisfying

∥ v_{n} {- v ∥}_{ϕ_{λ}} \to 0

and

(F_{n} + ε_{n} v_{n}) \subseteq F_{ϕ_{λ}}

, and

(ε_{n}) \subseteq (0, \infty)

satisfying

ε_{n} \to 0

. It remains to show that

lim_{n \to \infty} ∥ \frac{C_{p} (F_{n} + ε_{n} v_{n}) - C_{p} (F_{n})}{ε_{n}} - {\dot{C}}_{p; F} (v) ∥_{ϕ_{λ^{'}}} = 0 .

To do so, define for

k \in N_{0}

a map

H_{k} : F \times F \to F

by

H_{k} (G_{1}, G_{2}) : = \sum_{j = 0}^{k - 1} G_{1}^{* (k - 1 - j)} * G_{2}^{* j}

with the usual convention that the sum over the empty set equals zero. We find that for every

M \in N

\begin{matrix} ∥ \frac{C_{p} (F_{n} + ε_{n} v_{n}) - C_{p} (F_{n})}{ε_{n}} - {\dot{C}}_{p; F} (v) ∥_{ϕ_{λ^{'}}} \\ = ∥ \frac{1}{ε_{n}} (\sum_{k = 0}^{\infty} p_{k} {(F_{n} + ε_{n} v_{n})}^{* k} - \sum_{k = 0}^{\infty} p_{k} F_{n}^{* k}) - {\dot{C}}_{p; F} (v) ∥_{ϕ_{λ^{'}}} \\ = ∥ \frac{1}{ε_{n}} (\sum_{k = 1}^{\infty} (p_{k} {(F_{n} + ε_{n} v_{n})}^{* k} - p_{k} F_{n}^{* k})) - {\dot{C}}_{p; F} (v) ∥_{ϕ_{λ^{'}}} \\ = ∥ \sum_{k = 1}^{\infty} p_{k} v_{n} * H_{k} (F_{n} + ε_{n} v_{n}, F_{n}) - {\dot{C}}_{p; F} (v) ∥_{ϕ_{λ^{'}}} \\ \leq ∥ \sum_{k = M + 1}^{\infty} p_{k} v_{n} * H_{k} (F_{n} + ε_{n} v_{n}, F_{n}) ∥_{ϕ_{λ^{'}}} + ∥ \sum_{k = 1}^{M} p_{k} (v_{n} - v) * H_{k} (F_{n} + ε_{n} v_{n}, F_{n}) ∥_{ϕ_{λ^{'}}} \\ + ∥ v * \sum_{k = M + 1}^{\infty} k p_{k} F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} + ∥ \sum_{k = 1}^{M} p_{k} v * H_{k} (F_{n} + ε_{n} v_{n}, F_{n}) - k p_{k} v * F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ = : S_{1} (n, M) + S_{2} (n, M) + S_{3} (M) + S_{4} (n, M), \end{matrix}

where for the third “=” we use the fact that for

G_{1}, G_{2} \in F

(G_{1} - G_{2}) * H_{k} (G_{1}, G_{2}) = G_{1}^{* k} - G_{2}^{* k} .

(16)
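Identity (16) is a telescoping sum and can be verified numerically for lattice distributions. The sketch below represents distribution functions by their point masses on {0, 1, 2, ...}; the helper names are hypothetical and the chosen masses are purely illustrative.

```python
def convolve(a, b):
    """Convolution of two (possibly signed) mass vectors on {0, 1, 2, ...}."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def conv_power(f, k):
    """Masses of the k-fold convolution; k = 0 gives the unit mass at 0."""
    out = [1.0]
    for _ in range(k):
        out = convolve(out, f)
    return out

def add(a, b, sign=1.0):
    """Entrywise a + sign * b, padding the shorter vector with zeros."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + sign * (b[i] if i < len(b) else 0.0)
            for i in range(n)]

def h_k(g1, g2, k):
    """H_k(G1, G2) = sum_{j=0}^{k-1} G1^{*(k-1-j)} * G2^{*j}; the empty sum is zero."""
    total = [0.0]
    for j in range(k):
        total = add(total, convolve(conv_power(g1, k - 1 - j), conv_power(g2, j)))
    return total

g1 = [0.3, 0.7]
g2 = [0.6, 0.3, 0.1]
k = 4
lhs = convolve(add(g1, g2, -1.0), h_k(g1, g2, k))          # (G1 - G2) * H_k(G1, G2)
rhs = add(conv_power(g1, k), conv_power(g2, k), -1.0)      # G1^{*k} - G2^{*k}
gap = max(abs(x - y) for x, y in zip(lhs, rhs))
```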

By Part (ii) of Lemma 1 (this lemma can be applied since

∥ F_{n} + ε_{n} v_{n} {- F ∥}_{ϕ_{λ}} \to 0

) there exists a constant

C_{2} > 0

such that for all

n \in N

\begin{matrix} S_{1} (n, M) & = ∥ \sum_{k = M + 1}^{\infty} p_{k} v_{n} * H_{k} (F_{n} + ε_{n} v_{n}, F_{n}) ∥_{ϕ_{λ^{'}}} \\ \leq 2^{λ^{'}} {∥ v_{n} ∥}_{ϕ_{λ^{'}}} \sum_{k = M + 1}^{\infty} p_{k} k (1 + 2^{λ^{'}} (2^{λ^{'} - 1} \lor 1) (2 + {(k - 1)}^{λ^{'} \lor 1} C_{2})) . \end{matrix}

(17)

Since

λ^{'} < λ

and

∥ v_{n} {- v ∥}_{ϕ_{λ}} \to 0

, we have

∥ v_{n} ∥_{ϕ_{λ^{'}}} \leq K_{1}

for some finite constant

K_{1} > 0

and all

n \in N

. Hence, the right-hand side of Equation (17) can be made arbitrarily small by choosing M large enough. That is,

S_{1} (n, M)

can be made arbitrarily small uniformly in

n \in N

by choosing M large enough.
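Up to constants, the series on the right-hand side of Equation (17) is dominated by the tail sum of p_k k^{1 + (λ' ∨ 1)} over k > M. The sketch below evaluates such tails for a Poisson claim-number distribution, which is an illustrative assumption and not a choice made in the text; the function names are hypothetical.

```python
import math

def poisson_pmf(mu, k_max):
    """Poisson(mu) probabilities p_0, ..., p_{k_max}, computed recursively."""
    probs = [math.exp(-mu)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * mu / k)
    return probs

def weighted_tail(probs, exponent, m):
    """Tail sum of p_k * k**exponent over all available k > m."""
    return sum(pk * k ** exponent for k, pk in enumerate(probs) if k > m)

probs = poisson_pmf(2.0, 200)   # truncation at 200 is harmless for mu = 2
tails = [weighted_tail(probs, 3.0, m) for m in (0, 5, 10, 20, 30)]
```

With exponent 3 (that is, λ' = 2), the tail for M = 0 equals the third moment of the Poisson distribution with mean 2, which is 22, and the tails decay rapidly in M. For heavier-tailed claim-number distributions the decay can be much slower, which is why the moment condition on p is needed.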

Furthermore, it is demonstrated in the proof of Proposition 4.1 of Pitts (1994) that

S_{3} (M)

can be made arbitrarily small by choosing M large enough.

Next, applying again Part (ii) of Lemma 1, we obtain

\begin{matrix} S_{2} (n, M) & = & ∥ \sum_{k = 1}^{M} p_{k} (v_{n} - v) * H_{k} (F_{n} + ε_{n} v_{n}, F_{n}) ∥_{ϕ_{λ^{'}}} \\ \leq & 2^{λ^{'}} \sum_{k = 1}^{M} p_{k} k {∥ v_{n} - v ∥}_{ϕ_{λ^{'}}} (1 + 2^{λ^{'}} (2^{λ^{'} - 1} \lor 1) (2 + {(k - 1)}^{λ^{'} \lor 1} C_{2})) . \end{matrix}

Using

∥ v_{n} {- v ∥}_{ϕ_{λ^{'}}} \leq {∥ v_{n} - v ∥}_{ϕ_{λ}} \to 0

, this term tends to zero as

n \to \infty

for a given M.

It remains to consider the summand

\begin{matrix} S_{4} (n, M) & = & ∥ \sum_{k = 1}^{M} p_{k} v * H_{k} (F_{n} + ε_{n} v_{n}, F_{n}) - k p_{k} v * F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ = & ∥ \sum_{k = 1}^{M} p_{k} \sum_{ℓ = 0}^{k - 1} (v * {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - v * F^{* (k - 1)}) ∥_{ϕ_{λ^{'}}} . \end{matrix}

We show that for M fixed this term can be made arbitrarily small by letting

n \to \infty

. This would follow if for every given

k \in {1, \dots, M}

and

ℓ \in {0, \dots, k - 1}

the expression

∥ v * {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - v * F^{* (k - 1)} ∥_{ϕ_{λ^{'}}}

could be made arbitrarily small by letting

n \to \infty

. For every such k and ℓ we can find a linear combination of indicator functions of the form

𝟙_{[a, b)}

,

- \infty < a < b < \infty

, which we denote by

\tilde{v}

, such that

\begin{matrix} ∥ v * {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - v * F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ \leq ∥ v * {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - \tilde{v} * {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} \\ + ∥ \tilde{v} * {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - \tilde{v} * F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ + ∥ \tilde{v} * F^{* (k - 1)} - v * F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ \leq 2^{λ^{'}} {∥ \tilde{v} - v ∥}_{ϕ_{λ^{'}}} (∥ 𝟙_{[0, \infty)} - {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + 1) \\ + c (λ^{'}, \tilde{v}) {∥ {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - F^{* (k - 1)} ∥}_{ϕ_{λ^{'}}} \\ + 2^{λ^{'}} {∥ \tilde{v} - v ∥}_{ϕ_{λ^{'}}} (∥ 𝟙_{[0, \infty)} - F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} + 1) \end{matrix}

(18)

for some suitable finite constant

c (λ^{'}, \tilde{v}) > 0

depending only on

λ^{'}

and

\tilde{v}

. The first inequality in Equation (18) is obvious (and holds for any

\tilde{v} \in D_{ϕ_{λ^{'}}}

). The second inequality in Equation (18) is obtained by applying Lemma 2.3 of Pitts (1994) to the first summand (noting that

∥ {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{\infty} = 1

; recall

F_{n} + ε_{n} v_{n} \in F

), by applying Lemma 4.3 of Pitts (1994) to the second summand (which requires that

\tilde{v}

is as described above), and by applying Lemma 2.3 of Pitts (1994) to the third summand.

We now consider the three summands on the right-hand side of Equation (18) separately. We start with the third term. Since

v \in D_{ϕ_{λ}}

, Lemma 4.2 of Pitts (1994) ensures that we may assume that

\tilde{v}

is chosen such that

∥ \tilde{v} {- v ∥}_{ϕ_{λ^{'}}}

is arbitrarily small. Hence, for fixed M the third summand in Equation (18) can be made arbitrarily small.

We next consider the second summand in Equation (18). Obviously,

\begin{matrix} ∥ {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ = ∥ {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} - F_{n}^{* (k - 1)} + F_{n}^{* (k - 1)} - F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ \leq ∥ ({(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} - F_{n}^{* (k - 1 - ℓ)}) * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + {∥ F_{n}^{* (k - 1)} - F^{* (k - 1)} ∥}_{ϕ_{λ^{'}}} . \end{matrix}

(19)

We start by considering the first summand in Equation (19). In view of Equation (16), it can be written as

\begin{matrix} ∥ ({(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} - F_{n}^{* (k - 1 - ℓ)}) * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} \\ = ∥ ((F_{n} + ε_{n} v_{n} - F_{n}) * H_{k - 1 - ℓ} (F_{n} + ε_{n} v_{n}, F_{n})) * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} \\ = ∥ (ε_{n} v_{n} * H_{k - 1 - ℓ} (F_{n} + ε_{n} v_{n}, F_{n})) * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} . \end{matrix}

Applying Lemma 2.3 of Pitts (1994) with

f = ε_{n} v_{n} * H_{k - ℓ - 1} (F_{n} + ε_{n} v_{n}, F_{n})

and

H = F_{n}^{* ℓ}

we obtain

\begin{matrix} ∥ (ε_{n} v_{n} * H_{k - 1 - ℓ} (F_{n} + ε_{n} v_{n}, F_{n})) * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} \\ \leq 2^{λ^{'}} ∥ (ε_{n} v_{n} * H_{k - ℓ - 1} (F_{n} + ε_{n} v_{n}, F_{n})) ∥_{ϕ_{λ^{'}}} (∥ 𝟙_{[0, \infty)} ∥ F_{n}^{* ℓ} ∥_{\infty} - F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + {∥ F_{n}^{* ℓ} ∥}_{\infty}) \\ = 2^{λ^{'}} ∥ (ε_{n} v_{n} * H_{k - ℓ - 1} (F_{n} + ε_{n} v_{n}, F_{n})) ∥_{ϕ_{λ^{'}}} (∥ 𝟙_{[0, \infty)} - F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + 1) \\ \leq 2^{λ^{'}} ∥ (ε_{n} v_{n} * H_{k - ℓ - 1} (F_{n} + ε_{n} v_{n}, F_{n})) ∥_{ϕ_{λ^{'}}} \{(2^{λ^{'} - 1} \lor 1) (1 + ℓ^{λ^{'} \lor 1} C_{1}) + 1\}, \end{matrix}

(20)

where we applied Part (i) of Lemma 1 to

∥ 𝟙_{[0, \infty)} - F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}}

to obtain the last inequality. Hence, for the left-hand side of Equation (20) to go to zero as

n \to \infty

it suffices to show that

∥ (ε_{n} v_{n} * H_{k - ℓ - 1} (F_{n} + ε_{n} v_{n}, F_{n})) ∥_{ϕ_{λ^{'}}} \to 0

as

n \to \infty

. The latter follows from

\begin{matrix} ∥ (ε_{n} v_{n} * H_{k - ℓ - 1} (F_{n} + ε_{n} v_{n}, F_{n})) ∥_{ϕ_{λ^{'}}} \\ \leq 2^{λ^{'}} (k - ℓ - 1) ε_{n} {∥ v_{n} ∥}_{ϕ_{λ^{'}}} (1 + 2^{λ^{'}} (2^{λ^{'} - 1} \lor 1) (2 + {((k - ℓ - 2))}^{λ^{'} \lor 1} C_{2})), \end{matrix}

(21)

where we applied Part (ii) of Lemma 1 with

v = ε_{n} v_{n}

to all summands in

H_{k - ℓ - 1} (F_{n} + ε_{n} v_{n}, F_{n})

. For every k and

ℓ \in {0, \dots, k - 1}

this expression goes indeed to zero as

n \to \infty

, because, as mentioned before,

∥ v_{n} ∥_{ϕ_{λ^{'}}}

is uniformly bounded in

n \in N

, and we have

ε_{n} \to 0

. Next, we consider the second summand in Equation (19). Applying Equation (16) to

F_{n}^{* (k - 1)}

and

F^{* (k - 1)}

and subsequently Part (ii) of Lemma 1 to the summands in

H_{k - 1} (F_{n}, F)

, we have

∥ F_{n}^{* (k - 1)} - F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \leq 2^{λ^{'}} (k - 1) {∥ F_{n} - F ∥}_{ϕ_{λ^{'}}} (1 + 2^{λ^{'}} (2^{λ^{'} - 1} \lor 1) (2 + {((k - 2))}^{λ^{'} \lor 1} C_{2})) .

Clearly, for every k, this term goes to zero as

n \to \infty

, because

∥ F_{n} {- F ∥}_{ϕ_{λ^{'}}} \leq {∥ F_{n} - F ∥}_{ϕ_{λ}} \to 0

as

n \to \infty

by assumption. This, together with the fact that the left-hand side of Equation (20) goes to zero as

n \to \infty

shows that Equation (19) goes to zero in

{∥ \cdot ∥}_{ϕ_{λ^{'}}}

as

n \to \infty

. Therefore, the second summand in Equation (18) goes to zero as

n \to \infty

.

It remains to consider the first term in Equation (18). We find

\begin{matrix} 2^{λ^{'}} {∥ \tilde{v} - v ∥}_{ϕ_{λ^{'}}} (∥ 𝟙_{[0, \infty)} - {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + 1) \\ = 2^{λ^{'}} {∥ \tilde{v} - v ∥}_{ϕ_{λ^{'}}} (∥ 𝟙_{[0, \infty)} - F^{* (k - 1)} + F^{* (k - 1)} - {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + 1) \\ \leq 2^{λ^{'}} {∥ \tilde{v} - v ∥}_{ϕ_{λ^{'}}} (∥ 𝟙_{[0, \infty)} - F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} + ∥ F^{* (k - 1)} - {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + 1) \\ \leq 2^{λ^{'}} {∥ \tilde{v} - v ∥}_{ϕ_{λ^{'}}} (2^{λ^{'} - 1} \lor 1) (1 + {(k - 1)}^{λ^{'} \lor 1} \int {| x |}^{λ^{'}} d F (x)) \\ + 2^{λ^{'}} {∥ \tilde{v} - v ∥}_{ϕ_{λ^{'}}} (∥ F^{* (k - 1)} - {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}} + 1), \end{matrix}

(22)

where for the last inequality we used Formula (2.4) of Pitts (1994). In the lines following Equation (19), we showed that

∥ F^{* (k - 1)} - {(F_{n} + ε_{n} v_{n})}^{* (k - 1 - ℓ)} * F_{n}^{* ℓ} ∥_{ϕ_{λ^{'}}}

goes to zero as

n \to \infty

for every k and

ℓ \in {0, \dots, k - 1}

. Hence, for every such k and ℓ, it is uniformly bounded in

n \in N

. Therefore, we can make Equation (22) arbitrarily small by making

∥ \tilde{v} {- v ∥}_{ϕ_{λ^{'}}}

small which, as mentioned above, is possible according to Lemma 4.2 of Pitts (1994). This finishes the proof. ☐

3.3. Composition of Average Value at Risk Functional and Compound Distribution Functional

Here, we consider the composition of the Average Value at Risk functional

R_{α}

defined in Equation (1) and the compound distribution functional

C_{p}

introduced in Section 2.1. As a consequence of Propositions 1 and 2, we obtain the following Corollary 3. Note that, for any

λ > 1

, Lemma 2.2 in Pitts (1994) yields

C_{p} (F_{ϕ_{λ}}) \subseteq F_{1}

so that the composition

R_{α} \circ C_{p}

is well defined on

F_{ϕ_{λ}}

.

Corollary 3.

Let

λ > 1

and assume

\sum_{k = 1}^{\infty} p_{k} k^{1 + λ} < \infty

. Let

F \in F_{ϕ_{λ}}

, and assume that

C_{p} (F)

takes the value α only once. Then, the map

T_{α, p} : = R_{α} \circ C_{p} : F_{ϕ_{λ}} (\subseteq D) \to R

is uniformly quasi-Hadamard differentiable at F tangentially to

D_{ϕ_{λ}} 〈 D_{ϕ_{λ}} 〉

, and the uniform quasi-Hadamard derivative

{\dot{T}}_{α, p; F} : D_{ϕ_{λ}} \to R

is given by

{\dot{T}}_{α, p; F} = {\dot{R}}_{α; C_{p} (F)} \circ {\dot{C}}_{p; F}

, i.e.,

{\dot{T}}_{α, p; F} (v) = \int g_{α}^{'} (C_{p} (F) (x)) (v * H_{p, F}) (x) d x for all v \in D_{ϕ_{λ}}

with

g_{α}^{'}

and

v * H_{p, F}

as in Propositions 1 and 2, respectively.

Proof.

We intend to apply Lemma A1 to

H = C_{p} : F_{ϕ_{λ}} \to F_{1}

and

\tilde{H} = R_{α} : F_{1} \to R

. To verify that the assumptions of the lemma are fulfilled, we first recall from the comment directly before Corollary 3 that

C_{p} (F_{ϕ_{λ}}) \subseteq F_{1}

. It remains to show that the Assumptions (a)–(c) of Lemma A1 are fulfilled. According to Proposition 2 we have that for every

λ^{'} \in (1, λ)

the functional

C_{p}

is uniformly quasi-Hadamard differentiable at F tangentially to

D_{ϕ_{λ}} 〈 D_{ϕ_{λ}} 〉

with trace

D_{ϕ_{λ^{'}}}

, which is the first part of Assumption (b). The second part of Assumption (b) means

{\dot{C}}_{p; F} (D_{ϕ_{λ}}) \subseteq D_{ϕ_{λ^{'}}}

and follows from

\begin{matrix} ∥ {\dot{C}}_{p; F} {(v) ∥}_{ϕ_{λ^{'}}} & = & ∥ v * \sum_{k = 1}^{\infty} p_{k} k F^{* (k - 1)} ∥_{ϕ_{λ^{'}}} \\ \leq & 2^{λ^{'}} {∥ v ∥}_{ϕ_{λ^{'}}} \sum_{k = 1}^{\infty} p_{k} k (1 + (2^{λ^{'} - 1} \lor 1) (1 + k^{λ^{'} \lor 1} \int {| x |}^{λ^{'}} d F (x))) \end{matrix}

(for which we applied Lemma 2.3 and Inequality (2.4) in Pitts (1994)), the convergence of the latter series (which holds by assumption), and

{∥ v ∥}_{ϕ_{λ^{'}}} \leq {∥ v ∥}_{ϕ_{λ}} < \infty

. Further, it follows from Proposition 1 that the map

R_{α}

is uniformly quasi-Hadamard differentiable tangentially to

D_{ϕ_{λ^{'}}} 〈 D_{ϕ_{λ^{'}}} 〉

at every distribution function of

F_{ϕ_{λ^{'}}}

that takes the value

α

only once. This is Assumption (c) of Lemma A1.

It remains to show that Assumption (a) of Lemma A1 also holds true. In the present setting, Assumption (a) means that for every sequence

(F_{n}) \subseteq F_{ϕ_{λ}}

with

∥ F_{n} {- F ∥}_{ϕ_{λ}} \to 0

we have

C_{p} (F_{n}) \to C_{p} (F)

pointwise. We show that we even have

∥ C_{p} (F_{n}) - C_{p} (F) ∥_{ϕ_{λ^{'}}} \to 0

. Thus, let

(F_{n}) \subseteq F_{ϕ_{λ}}

. Then,

\begin{matrix} ∥ C_{p} (F_{n}) - C_{p} (F) ∥_{ϕ_{λ^{'}}} & = & ∥ \sum_{k = 1}^{\infty} p_{k} (F_{n}^{* k} - F^{* k}) ∥_{ϕ_{λ^{'}}} \\ = & ∥ (F_{n} - F) * \sum_{k = 1}^{\infty} p_{k} H_{k} (F_{n}, F) ∥_{ϕ_{λ^{'}}} \\ \leq & 2^{λ^{'}} {∥ F_{n} - F ∥}_{ϕ_{λ^{'}}} \sum_{k = 1}^{\infty} p_{k} k (1 + 2^{λ^{'}} (2^{λ^{'} - 1} \lor 1) (2 + {(k - 1)}^{λ^{'} \lor 1} C_{2})), \end{matrix}

where we used Equation (16) for the second “=” and applied Part (ii) of Lemma 1 to the summands of

H_{k}

to obtain the latter inequality. Since the series converges, we obtain

∥ C_{p} (F_{n}) - C_{p} (F) ∥_{ϕ_{λ^{'}}} \to 0

when assuming

∥ F_{n} {- F ∥}_{ϕ_{λ}} \to 0

. ☐

As an immediate consequence of Corollary A4, Examples A1 and A2, and Corollary 3, we obtain the following corollary.

Corollary 4.

Let F,

{\hat{F}}_{n}

,

{\hat{F}}_{n}^{*}

,

{\hat{C}}_{n}

, and

B_{F}

be as in Example A1 (S1. or S2.) or as in Example A2, respectively, and assume that the assumptions discussed in Example A1 or in Example A2 respectively are fulfilled for

ϕ = ϕ_{λ}

for some

λ > 1

(in particular

F \in F_{1}

). Moreover, assume

\sum_{k = 1}^{\infty} p_{k} k^{1 + λ} < \infty

and that

C_{p} (F)

takes the value α only once. Then,

\sqrt{n} (T_{α, p} ({\hat{F}}_{n}) - T_{α, p} (F)) ⇝ {\dot{T}}_{α, p; F} (B_{F}) in (R, B (R))

and

\sqrt{n} (T_{α, p} ({\hat{F}}_{n}^{*} (ω, \cdot)) - T_{α, p} ({\hat{C}}_{n} (ω))) ⇝ {\dot{T}}_{α, p; F} (B_{F}) in (R, B (R)), P-a.e. ω .
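The plug-in estimator appearing in Corollary 4 can be computed explicitly when claims take values in {0, 1, 2, ...}. The sketch below is a minimal illustration under assumptions not taken from the text: integer-valued claims, a finitely supported claim-number distribution p, and the AV@R convention (1/(1-α)) times the integral of the quantile function over [α, 1]. The function names and the data are hypothetical.

```python
def convolve(a, b):
    """Convolution of two mass vectors on {0, 1, 2, ...}."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def compound_masses(p, f):
    """Masses of C_p(F) = sum_k p[k] * F^{*k} for a finitely supported p."""
    out = [0.0] * ((len(p) - 1) * (len(f) - 1) + 1)
    power = [1.0]                      # F^{*0}: unit mass at 0
    for k, pk in enumerate(p):
        if k > 0:
            power = convolve(power, f)
        for i, x in enumerate(power):
            out[i] += pk * x
    return out

def avar_from_masses(masses, alpha):
    """AV@R at level alpha of a distribution on {0, 1, 2, ...} given by its point masses."""
    total, cum = 0.0, 0.0
    for value, mass in enumerate(masses):
        lo = max(cum, alpha)           # part of [alpha, 1] on which the quantile equals value
        cum += mass
        if cum > lo:
            total += value * (cum - lo)
    return total / (1.0 - alpha)

def empirical_masses(sample):
    """Empirical point masses of a sample of nonnegative integers."""
    masses = [0.0] * (max(sample) + 1)
    for x in sample:
        masses[x] += 1.0 / len(sample)
    return masses

def plug_in_estimate(sample, p, alpha):
    """T_{alpha,p} evaluated at the empirical distribution of the claims."""
    return avar_from_masses(compound_masses(p, empirical_masses(sample)), alpha)

claims = [0, 1, 1, 2, 0, 3, 1, 0]      # illustrative claim sizes
p = [0.5, 0.3, 0.2]                    # P[N = 0], P[N = 1], P[N = 2]
estimate = plug_in_estimate(claims, p, 0.9)
```

A bootstrap version would recompute `plug_in_estimate` on resampled or reweighted claims; which resampling scheme is admissible is exactly what Examples A1 and A2 specify, and is not reproduced here.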

4. Conclusions

In this paper, we considered the sub-additive risk measure Average Value at Risk and presented, in Section 2.1 and Section 2.2, results on almost sure bootstrap consistency for the corresponding empirical plug-in estimator based on i.i.d. or strictly stationary, geometrically

β

-mixing observations. Our results supplement those by Beutner and Zähle (2016) on bootstrap consistency in probability and those by Sun and Cheng (2018) on bootstrap consistency in probability for the Tail Conditional Expectation (which is not sub-additive). In Section 2.1, we also looked at the case where one is interested in the Average Value at Risk in the collective risk model. Note that one might interpret the collective risk model as a pooling of independent risks. In the context of Solvency II, pooling of risks has received increased attention (see, for example, Bølviken and Guillen 2017). However, one should keep in mind that our results of Section 2.1 can typically not be applied in the Solvency II context: in Solvency II applications risks are usually dependent, whereas in the collective risk model the different risks (claims) are assumed to be independent.

Author Contributions

Both authors contributed equally to all sections of the article.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Convergence in Distribution °

Let

(E, d)

be a metric space and

B^{\circ}

be the

σ

-algebra on

E

generated by the open balls

B_{r} (x) : = {y \in E : d (x, y) < r}

,

x \in E

,

r > 0

. We refer to

B^{\circ}

as open-ball σ-algebra. If

(E, d)

is separable, then

B^{\circ}

coincides with the Borel

σ

-algebra

B

. If

(E, d)

is not separable, then

B^{\circ}

might be strictly smaller than

B

and thus a continuous real-valued function on

E

is not necessarily

(B^{\circ}, B (R))

-measurable. Let

C_{b}^{\circ}

be the set of all bounded, continuous and

(B^{\circ}, B (R))

-measurable real-valued functions on

E

, and

M_{1}^{\circ}

be the set of all probability measures on

(E, B^{\circ})

.

Let

X_{n}

be an

(E, B^{\circ})

-valued random variable on some probability space

(Ω_{n}, F_{n}, P_{n})

for every

n \in N_{0}

. Then, referring to Billingsley (1999, sct. 1.6), the sequence

(X_{n}) = {(X_{n})}_{n \in N}

is said to converge in distribution

^{\circ}

to

X_{0}

if

\int f d P_{n} \circ X_{n}^{- 1} ⟶ \int f d P_{0} \circ X_{0}^{- 1} for all f \in C_{b}^{\circ} .

In this case, we write

X_{n} ⇝^{\circ} X_{0}

. This is the same as saying that the sequence

(P_{n} \circ X_{n}^{- 1})

converges to

P_{0} \circ X_{0}^{- 1}

in the weak

^{\circ}

topology on

M_{1}^{\circ}

; for details see Appendix A of Beutner and Zähle (2016). It is worth mentioning that two probability measures

μ, ν \in M_{1}^{\circ}

coincide if

μ [E_{0}] = ν [E_{0}] = 1

for some separable

E_{0} \in B^{\circ}

and

\int f d μ = \int f d ν

for all uniformly continuous

f \in C_{b}^{\circ}

(see, for instance, (Billingsley 1999, Theorem 6.2)).

In Appendices A–C in Beutner and Zähle (2016), several properties of convergence in distribution

^{\circ}

(and weak

^{\circ}

convergence) have been discussed. The following two subsections complement this discussion.

Appendix A.1. Slutsky-Type Results for the Open-Ball σ-Algebra

For a sequence

(X_{n})

of

(E, B^{\circ})

-valued random variables that are all defined on the same probability space

(Ω, F, P)

, the sequence

(X_{n})

is said to converge in probability

^{\circ}

to

X_{0}

if the mappings

ω \mapsto d (X_{n} (ω), X_{0} (ω))

,

n \in N

, are

(F, B (R_{+}))

-measurable and satisfy

lim_{n \to \infty} P [d (X_{n}, X_{0}) \geq ε] = 0 for all ε > 0 .

(A1)

In this case, we write

X_{n} \to^{p, \circ} X_{0}

. The superscript

^{\circ}

points to the fact that measurability of the mapping

ω \mapsto d (X_{n} (ω), X_{0} (ω))

is a requirement of the definition (and not automatically valid). Note, however, that in the specific situation where

X_{0} \equiv x_{0}

for some

x_{0} \in E

, measurability of the mapping

ω \mapsto d (X_{n} (ω), X_{0} (ω))

does hold (see Lemma B.3 in Beutner and Zähle (2016)). In addition, note that the measurability always holds when

(E, d)

is separable; in this case, we also write

\to^{p}

instead of

\to^{p, \circ}

.

Theorem A1.

Let

(X_{n})

and

(Y_{n})

be two sequences of

(E, B^{\circ})

-valued random variables on a common probability space

(Ω, F, P)

, and assume that the mapping

ω \mapsto d (X_{n} (ω), Y_{n} (ω))

is

(F, B (R_{+}))

-measurable for every

n \in N

. Let

X_{0}

be an

(E, B^{\circ})

-valued random variable on some probability space

(Ω_{0}, F_{0}, P_{0})

with

P_{0} [X_{0} \in E_{0}] = 1

for some separable

E_{0} \in B^{\circ}

. Then,

X_{n} ⇝^{\circ} X_{0}

and

d (X_{n}, Y_{n}) \to^{p} 0

together imply

Y_{n} ⇝^{\circ} X_{0}

.

Proof.

In view of

X_{n} ⇝^{\circ} X_{0}

, we obtain for every fixed

f \in {BL}_{1}^{\circ}

\begin{matrix} \underset{n \to \infty}{lim sup} | \int f d P_{Y_{n}} - \int f d P_{X_{0}} | \\ \leq \underset{n \to \infty}{lim sup} | \int f d P_{Y_{n}} - \int f d P_{X_{n}} | + \underset{n \to \infty}{lim sup} | \int f d P_{X_{n}} - \int f d P_{X_{0}} | \\ \leq \underset{n \to \infty}{lim sup} \int | f (Y_{n}) - f (X_{n}) | d P . \end{matrix}

Since f lies in

{BL}_{1}^{\circ}

and we assumed

d (X_{n}, Y_{n}) \to^{p} 0

, we also have

\begin{matrix} \underset{n \to \infty}{lim sup} \int | f (Y_{n}) - f (X_{n}) | d P & \leq & \underset{n \to \infty}{lim sup} \int | f (Y_{n}) - f (X_{n}) | 𝟙_{{d (X_{n}, Y_{n}) \geq ε}} d P + 2 ε \\ \leq & 2 \underset{n \to \infty}{lim sup} P [d (X_{n}, Y_{n}) \geq ε] + 2 ε \end{matrix}

for every

ε > 0

. Thus,

{lim sup}_{n \to \infty} \int | f (Y_{n}) - f (X_{n}) | d P = 0

which together with the Portmanteau theorem (in the form of (Beutner and Zähle 2016, Theorem A.4)) implies the claim. ☐

Set

\bar{E} : = E \times E

and let

{\bar{B}}^{\circ}

be the

σ

-algebra on

\bar{E}

generated by the open balls with respect to the metric

\bar{d} ((x_{1}, x_{2}), (y_{1}, y_{2})) : = max {d (x_{1}, y_{1}); d (x_{2}, y_{2})} .

Recall that

{\bar{B}}^{\circ} \subseteq B^{\circ} \otimes B^{\circ}

, where the inclusion may be strict.

Corollary A1.

Let

(X_{n})

and

(Y_{n})

be two sequences of

(E, B^{\circ})

-valued random variables on a common probability space

(Ω, F, P)

. Let

X_{0}

be an

(E, B^{\circ})

-valued random variable on some probability space

(Ω_{0}, F_{0}, P_{0})

with

P_{0} [X_{0} \in E_{0}] = 1

for some separable

E_{0} \in B^{\circ}

. Let

y_{0} \in E_{0}

. Let

(\tilde{E}, \tilde{d})

be a metric space equipped with the corresponding open-ball σ-algebra

{\tilde{B}}^{\circ}

. Then,

X_{n} ⇝^{\circ} X_{0}

and

Y_{n} \to^{p, \circ} y_{0}

together imply:

(i): $(X_{n}, Y_{n}) ⇝^{\circ} (X_{0}, y_{0})$ .
(ii): $h (X_{n}, Y_{n}) ⇝^{\circ} h (X_{0}, y_{0})$ for every continuous and $({\bar{B}}^{\circ}, {\tilde{B}}^{\circ})$ -measurable $h : \bar{E} \to \tilde{E}$ .

Proof.

Assertion (ii) is an immediate consequence of Assertion (i) and the Continuous Mapping theorem in the form of (Billingsley 1999, Theorem 6.4); take into account that

(X_{0}, y_{0})

takes values only in

{\bar{E}}_{0} : = E_{0} \times E_{0}

and that

E_{0} \times E_{0}

is separable with respect to

\bar{d}

. Thus, it suffices to show Assertion (i). First note that we have

(X_{n}, y_{0}) ⇝^{\circ} (X_{0}, y_{0}) .

(A2)

Indeed, for every

f \in {\bar{C}}_{b}^{\circ}

(with

{\bar{C}}_{b}^{\circ}

the set of all bounded, continuous and

({\bar{B}}^{\circ}, B (R))

-measurable real-valued functions on

\bar{E}

) we have

{lim}_{n \to \infty} \int f (X_{n}, y_{0}) d P = \int f (X_{0}, y_{0}) d P_{0}

by the assumption

X_{n} ⇝^{\circ} X_{0}

and the fact that the mapping

x \mapsto f (x, y_{0})

lies in

C_{b}^{\circ}

(the latter was shown in the proof of Theorem 3.1 in Beutner and Zähle (2016)).

Second, the distance

\bar{d} ((X_{n}, Y_{n}), (X_{n}, y_{0})) = d (Y_{n}, y_{0})

is

(F, B (R_{+}))

-measurable for every

n \in N

, because

Y_{n}

is

(F, B^{\circ})

-measurable and

x \mapsto d (x, y_{0})

is

(B^{\circ}, B (R))

-measurable (due to Lemma B.3 in Beutner and Zähle (2016)). Along with

Y_{n} \to^{p, \circ} y_{0}

, we obtain in particular that

\bar{d} ((X_{n}, Y_{n}), (X_{n}, y_{0})) \to^{p} 0

. Together with Equation (A2) and Theorem A1 (applied to

X_{n}^{'} : = (X_{n}, y_{0})

,

X_{0}^{'} : = (X_{0}, y_{0})

,

Y_{n}^{'} : = (X_{n}, Y_{n})

), this implies

(X_{n}, Y_{n}) ⇝^{\circ} (X_{0}, y_{0})

; take into account again that

(X_{0}, y_{0})

takes values only in

{\bar{E}}_{0} : = E_{0} \times E_{0}

and that

E_{0} \times E_{0}

is separable with respect to

\bar{d}

. ☐

Corollary A2.

Let

(E, ∥ \cdot ∥_{E})

be a normed vector space and d be the induced metric defined by

d (x_{1}, x_{2}) : = {∥ x_{1} - x_{2} ∥}_{E}

. Let

(X_{n})

and

(Y_{n})

be two sequences of

(E, B^{\circ})

-valued random variables on a common probability space

(Ω, F, P)

. Let

X_{0}

be an

(E, B^{\circ})

-valued random variable on some probability space

(Ω_{0}, F_{0}, P_{0})

with

P_{0} [X_{0} \in E_{0}] = 1

for some separable

E_{0} \in B^{\circ}

. Let

y_{0} \in E_{0}

. Assume that the map

h : \bar{E} \to E

defined by

h (x_{1}, x_{2}) : = x_{1} + x_{2}

is

({\bar{B}}^{\circ}, B^{\circ})

-measurable. Then,

X_{n} ⇝^{\circ} X_{0}

and

Y_{n} \to^{p, \circ} y_{0}

together imply

X_{n} + Y_{n} ⇝^{\circ} X_{0} + y_{0}

.

Proof.

The assertion is an immediate consequence of Corollary A1 and the fact that h is clearly continuous. ☐

Appendix A.2. Delta-Method and Chain Rule for Uniformly Quasi-Hadamard Differentiable Maps

Now, assume that

E

is a subspace of a vector space

V

. Let

{∥ \cdot ∥}_{E}

be a norm on

E

and assume that the metric d is induced by

{∥ \cdot ∥}_{E}

. Let

\tilde{V}

be another vector space and

\tilde{E} \subseteq \tilde{V}

be any subspace. Let

{∥ \cdot ∥}_{\tilde{E}}

be a norm on

\tilde{E}

and

{\tilde{B}}^{\circ}

be the corresponding open-ball

σ

-algebra on

\tilde{E}

. Let

0_{\tilde{E}}

denote the null in

\tilde{E}

. Moreover, let

\bar{\tilde{E}} : = \tilde{E} \times \tilde{E}

and

\bar{{\tilde{B}}^{\circ}}

be the

σ

-algebra on

\bar{\tilde{E}}

generated by the open balls with respect to the metric

\bar{\tilde{d}} (({\tilde{x}}_{1}, {\tilde{x}}_{2}), ({\tilde{y}}_{1}, {\tilde{y}}_{2})) : = max {∥ {\tilde{x}}_{1} - {\tilde{y}}_{1} ∥_{\tilde{E}}; ∥ {\tilde{x}}_{2} - {\tilde{y}}_{2} ∥_{\tilde{E}}}

.

Let

(Ω_{n}, F_{n}, P_{n})

be a probability space and

{\hat{T}}_{n} : Ω_{n} \to V

be any map for every

n \in N

. Recall that

⇝^{\circ}

and

\to^{p, \circ}

refer to convergence in distribution

^{\circ}

and convergence in probability

^{\circ}

, respectively. Moreover, recall Definition A1 of quasi-Hadamard differentiability.

Theorem A2.

Let

H : V_{H} \to \tilde{E}

be a map defined on some

V_{H} \subseteq V

. Let

E_{0} \in B^{\circ}

be some

{∥ \cdot ∥}_{E}

-separable subset of

E

. Let

(θ_{n}) \subseteq V_{H}

and define the singleton set

S : = {(θ_{n})}

. Let

(a_{n})

be a sequence of positive real numbers tending to ∞, and consider the following conditions:

(a): ${\hat{T}}_{n}$ takes values only in $V_{H}$ .
(b): $a_{n} ({\hat{T}}_{n} - θ_{n})$ takes values only in $E$ , is $(F_{n}, B^{\circ})$ -measurable and satisfies

$\begin{matrix} a_{n} ({\hat{T}}_{n} - θ_{n}) ⇝^{\circ} ξ \text{ in } (E, B^{\circ}, ∥ \cdot ∥_{E}) \end{matrix}$

(A3)

for some $(E, B^{\circ})$ -valued random variable ξ on some probability space $(Ω_{0}, F_{0}, P_{0})$ with $ξ (Ω_{0}) \subseteq E_{0}$ .
(c): $a_{n} (H ({\hat{T}}_{n}) - H (θ_{n}))$ takes values only in $\tilde{E}$ and is $(F_{n}, {\tilde{B}}^{\circ})$ -measurable.
(d): The map H is uniformly quasi-Hadamard differentiable with respect to $S$ tangentially to $E_{0} 〈 E 〉$ with trace $\tilde{E}$ and uniform quasi-Hadamard derivative ${\dot{H}}_{S} : E_{0} \to \tilde{E}$ .
(e): $(Ω_{n}, F_{n}, P_{n}) = (Ω, F, P)$ for all $n \in N$ .
(f): The uniform quasi-Hadamard derivative ${\dot{H}}_{S}$ can be extended to $E$ such that the extension ${\dot{H}}_{S} : E \to \tilde{E}$ is continuous at every point of $E_{0}$ and $(B^{\circ}, {\tilde{B}}^{\circ})$ -measurable.
(g): The map $h : \bar{\tilde{E}} \to \tilde{E}$ defined by $h ({\tilde{x}}_{1}, {\tilde{x}}_{2}) : = {\tilde{x}}_{1} - {\tilde{x}}_{2}$ is $(\bar{{\tilde{B}}^{\circ}}, {\tilde{B}}^{\circ})$ -measurable.

Then, the following two assertions hold:

(i): If Conditions (a)–(d) hold true, then ${\dot{H}}_{S} (ξ)$ is $(F_{0}, {\tilde{B}}^{\circ})$ -measurable and

$a_{n} (H ({\hat{T}}_{n}) - H (θ_{n})) ⇝^{\circ} {\dot{H}}_{S} (ξ) \text{ in } (\tilde{E}, {\tilde{B}}^{\circ}, ∥ \cdot ∥_{\tilde{E}}) .$
(ii): If Conditions (a)–(g) hold true, then

$a_{n} (H ({\hat{T}}_{n}) - H (θ_{n})) - {\dot{H}}_{S} (a_{n} ({\hat{T}}_{n} - θ_{n})) \to^{p, \circ} 0_{\tilde{E}} \text{ in } (\tilde{E}, ∥ \cdot ∥_{\tilde{E}}) .$

(A4)

Proof.

The proof is very similar to the proof of Theorem C.4 in Beutner and Zähle (2016).

(i): For every

n \in N

, let

E_{n} : = {x_{n} \in E : θ_{n} + a_{n}^{- 1} x_{n} \in V_{H}}

and define the map

h_{n} : E_{n} \to \tilde{E}

by

h_{n} (x_{n}) : = \frac{H (θ_{n} + a_{n}^{- 1} x_{n}) - H (θ_{n})}{a_{n}^{- 1}} .

Moreover, define the map

h_{0} : E_{0} \to \tilde{E}

by

h_{0} (x) : = {\dot{H}}_{S} (x) .

Now, the claim would follow by the extended Continuous Mapping theorem in the form of Theorem C.1 in Beutner and Zähle (2016) applied to the functions

h_{n}

,

n \in N_{0}

, and the random variables

ξ_{n} : = a_{n} ({\hat{T}}_{n} - θ_{n})

,

n \in N

, and

ξ_{0} : = ξ

if we can show that the assumptions of Theorem C.1 in Beutner and Zähle (2016) are satisfied. First, by Assumption (a) and the last part of Assumption (b), we have

ξ_{n} (Ω_{n}) \subseteq E_{n}

and

ξ_{0} (Ω_{0}) \subseteq E_{0}

. Second, by Assumption (c), we have that

h_{n} (ξ_{n}) = a_{n} (H ({\hat{T}}_{n}) - H (θ_{n}))

is

(F_{n}, {\tilde{B}}^{\circ})

-measurable. Third, the map

h_{0}

is continuous by the definition of the quasi-Hadamard derivative. Thus,

h_{0}

is

(B_{0}^{\circ}, {\tilde{B}}^{\circ})

-measurable, because the trace

σ

-algebra

B_{0}^{\circ} : = B^{\circ} \cap E_{0}

coincides with the Borel

σ

-algebra on

E_{0}

(recall that

E_{0}

is separable). In particular,

{\dot{H}}_{S} (ξ)

is

(F_{0}, {\tilde{B}}^{\circ})

-measurable. Fourth, Condition (a) of Theorem C.1 in Beutner and Zähle (2016) holds by Assumption (b). Fifth, Condition (b) of Theorem C.1 in Beutner and Zähle (2016) is ensured by Assumption (d).

(ii): For every

n \in N

, let

E_{n}

and

h_{n}

be as above and define the map

{\bar{h}}_{n} : E_{n} \to \bar{\tilde{E}}

by

{\bar{h}}_{n} (x_{n}) : = (h_{n} (x_{n}), {\dot{H}}_{S} (x_{n})) .

Moreover, define the map

{\bar{h}}_{0} : E_{0} \to \bar{\tilde{E}}

by

{\bar{h}}_{0} (x) : = (h_{0} (x), {\dot{H}}_{S} (x)) = ({\dot{H}}_{S} (x), {\dot{H}}_{S} (x)) .

We first show that

{\bar{h}}_{n} (a_{n} ({\hat{T}}_{n} - θ_{n})) ⇝^{\circ} {\bar{h}}_{0} (ξ) \text{ in } (\bar{\tilde{E}}, \bar{{\tilde{B}}^{\circ}}, \bar{\tilde{d}}) .

(A5)

For Equation (A5), it suffices to show that the assumptions of the extended Continuous Mapping theorem in the form of Theorem C.1 in Beutner and Zähle (2016) applied to the functions

{\bar{h}}_{n}

and

ξ_{n}

(as defined above) are satisfied. The claim then follows by Theorem C.1 in Beutner and Zähle (2016). First, we have already observed that

ξ_{n} (Ω_{n}) \subseteq E_{n}

and

ξ_{0} (Ω_{0}) \subseteq E_{0}

. Second, we have seen in the proof of Part (i) that

h_{n} (ξ_{n})

is

(F_{n}, {\tilde{B}}^{\circ})

-measurable,

n \in N

. By Assumption (f), the extended map

{\dot{H}}_{S} : E \to \tilde{E}

is

(B^{\circ}, {\tilde{B}}^{\circ})

-measurable, which implies that

{\dot{H}}_{S} (ξ_{n})

is

(F_{n}, {\tilde{B}}^{\circ})

-measurable. Thus,

{\bar{h}}_{n} (ξ_{n}) = (h_{n} (ξ_{n}), {\dot{H}}_{S} (ξ_{n}))

is

(F_{n}, {\tilde{B}}^{\circ} \otimes {\tilde{B}}^{\circ})

-measurable (to see this note that, in view of

{\tilde{B}}^{\circ} \otimes {\tilde{B}}^{\circ} = σ (π_{1}, π_{2})

for the coordinate projections

π_{1}, π_{2}

on

\bar{\tilde{E}} = \tilde{E} \times \tilde{E}

, Theorem 7.4 of Bauer (2001) shows that the map

(h_{n} (ξ_{n}), {\dot{H}}_{S} (ξ_{n}))

is

(F_{n}, {\tilde{B}}^{\circ} \otimes {\tilde{B}}^{\circ})

-measurable if and only if the maps

h_{n} (ξ_{n}) = π_{1} \circ (h_{n} (ξ_{n}), {\dot{H}}_{S} (ξ_{n}))

and

{\dot{H}}_{S} (ξ_{n}) = π_{2} \circ (h_{n} (ξ_{n}), {\dot{H}}_{S} (ξ_{n}))

are

(F_{n}, {\tilde{B}}^{\circ})

-measurable). In particular, the map

{\bar{h}}_{n} (ξ_{n}) = (h_{n} (ξ_{n}), {\dot{H}}_{S} (ξ_{n}))

is

(F_{n}, \bar{{\tilde{B}}^{\circ}})

-measurable,

n \in N

. Third, we have seen in the proof of Part (i) that the map

h_{0} = {\dot{H}}_{S}

is

(B_{0}^{\circ}, {\tilde{B}}^{\circ})

-measurable. Thus, the map

{\bar{h}}_{0}

is

(B_{0}^{\circ}, {\tilde{B}}^{\circ} \otimes {\tilde{B}}^{\circ})

-measurable (one can argue as above) and in particular

(B_{0}^{\circ}, \bar{{\tilde{B}}^{\circ}})

-measurable. Fourth, Condition (a) of Theorem C.1 in Beutner and Zähle (2016) holds by Assumption (b). Fifth, Condition (b) of Theorem C.1 in Beutner and Zähle (2016) is ensured by Assumption (d) and the continuity of the extended map

{\dot{H}}_{S}

at every point of

E_{0}

(recall Assumption (f)). Hence, Equation (A5) holds.

By Assumption (g) and the ordinary Continuous Mapping theorem (see (Billingsley 1999, Theorem 6.4)) applied to Equation (A5) and the map

h : \bar{\tilde{E}} \to \tilde{E}

,

({\tilde{x}}_{1}, {\tilde{x}}_{2}) \mapsto {\tilde{x}}_{1} - {\tilde{x}}_{2}

, we now have

h_{n} (a_{n} ({\hat{T}}_{n} - θ_{n})) - {\dot{H}}_{S} (a_{n} ({\hat{T}}_{n} - θ_{n})) ⇝^{\circ} {\dot{H}}_{S} (ξ) - {\dot{H}}_{S} (ξ),

i.e.,

a_{n} (H ({\hat{T}}_{n}) - H (θ_{n})) - {\dot{H}}_{S} (a_{n} ({\hat{T}}_{n} - θ_{n})) ⇝^{\circ} 0_{\tilde{E}} .

By Proposition B.4 in Beutner and Zähle (2016), we can conclude Equation (A4). ☐

The following lemma provides a chain rule for uniformly quasi-Hadamard differentiable maps (a similar chain rule with different

S

was found in Varron (2015)). To formulate the chain rule, let

\tilde{\tilde{V}}

be a further vector space and

\tilde{\tilde{E}} \subseteq \tilde{\tilde{V}}

be a subspace equipped with a norm

{∥ \cdot ∥}_{\tilde{\tilde{E}}}

.

Lemma A1.

Let

H : V_{H} \to {\tilde{V}}_{\tilde{H}}

and

\tilde{H} : {\tilde{V}}_{\tilde{H}} \to \tilde{\tilde{V}}

be maps defined on subsets

V_{H} \subseteq V

and

{\tilde{V}}_{\tilde{H}} \subseteq \tilde{V}

such that

H (V_{H}) \subseteq {\tilde{V}}_{\tilde{H}}

. Let

E_{0}

and

{\tilde{E}}_{0}

be subsets of

E

and

\tilde{E}

, respectively. Let

S

and

\tilde{S}

be sets of sequences in

V_{H}

and

{\tilde{V}}_{\tilde{H}}

, respectively, and assume that the following three assertions hold.

(a): For every $(θ_{n}) \in S$ , we have $(H (θ_{n})) \in \tilde{S}$ .
(b): H is uniformly quasi-Hadamard differentiable with respect to $S$ tangentially to $E_{0} 〈 E 〉$ with trace $\tilde{E}$ and uniform quasi-Hadamard derivative ${\dot{H}}_{S} : E_{0} \to \tilde{E}$ , and we have ${\dot{H}}_{S} (E_{0}) \subseteq {\tilde{E}}_{0}$ .
(c): $\tilde{H}$ is uniformly quasi-Hadamard differentiable with respect to $\tilde{S}$ tangentially to ${\tilde{E}}_{0} 〈 \tilde{E} 〉$ with trace $\tilde{\tilde{E}}$ and uniform quasi-Hadamard derivative ${\dot{\tilde{H}}}_{\tilde{S}} : {\tilde{E}}_{0} \to \tilde{\tilde{E}}$ .

Then, the map

T : = \tilde{H} \circ H : V_{H} \to \tilde{\tilde{V}}

is uniformly quasi-Hadamard differentiable with respect to

S

tangentially to

E_{0} 〈 E 〉

with trace

\tilde{\tilde{E}}

, and the uniform quasi-Hadamard derivative

{\dot{T}}_{S}

is given by

{\dot{T}}_{S} : = {\dot{\tilde{H}}}_{\tilde{S}} \circ {\dot{H}}_{S}

.

Proof.

Obviously, since

H (V_{H}) \subseteq {\tilde{V}}_{\tilde{H}}

and

\tilde{H}

is associated with trace

\tilde{\tilde{E}}

, the map

\tilde{H} \circ H

can also be associated with trace

\tilde{\tilde{E}}

.

Now, let

((θ_{n}), x, (x_{n}), (ε_{n}))

be a quadruple with

(θ_{n}) \in S

,

x \in E_{0}

,

(x_{n}) \subseteq E

satisfying

{∥ x_{n} - x ∥}_{E} \to 0

as well as

(θ_{n} + ε_{n} x_{n}) \subseteq V_{H}

, and

(ε_{n}) \subseteq (0, \infty)

satisfying

ε_{n} \to 0

. Then,

\begin{matrix} ∥ {\dot{\tilde{H}}}_{\tilde{S}} ({\dot{H}}_{S} (x)) - \frac{\tilde{H} (H (θ_{n} + ε_{n} x_{n})) - \tilde{H} (H (θ_{n}))}{ε_{n}} ∥_{\tilde{\tilde{E}}} \\ = ∥ {\dot{\tilde{H}}}_{\tilde{S}} ({\dot{H}}_{S} (x)) - \frac{\tilde{H} (H (θ_{n}) + ε_{n} \frac{H (θ_{n} + ε_{n} x_{n}) - H (θ_{n})}{ε_{n}}) - \tilde{H} (H (θ_{n}))}{ε_{n}} ∥_{\tilde{\tilde{E}}} . \end{matrix}

Note that by assumption,

H (θ_{n}) \in {\tilde{V}}_{\tilde{H}}

and in particular

(H (θ_{n})) \in \tilde{S}

. By the uniform quasi-Hadamard differentiability of H with respect to

S

tangentially to

E_{0} 〈 E 〉

with trace

\tilde{E}

,

lim_{n \to \infty} ∥ \frac{H (θ_{n} + ε_{n} x_{n}) - H (θ_{n})}{ε_{n}} - {\dot{H}}_{S} (x) ∥_{\tilde{E}} = 0 .

Moreover,

(H (θ_{n} + ε_{n} x_{n}) - H (θ_{n})) / ε_{n} \in \tilde{E}

and

{\dot{H}}_{S} (x) \in {\tilde{E}}_{0}

, because H is associated with trace

\tilde{E}

and

{\dot{H}}_{S} (E_{0}) \subseteq {\tilde{E}}_{0}

. Hence, by the uniform quasi-Hadamard differentiability of

\tilde{H}

with respect to

\tilde{S}

tangentially to

{\tilde{E}}_{0} 〈 \tilde{E} 〉

, we obtain

lim_{n \to \infty} ∥ {\dot{\tilde{H}}}_{\tilde{S}} ({\dot{H}}_{S} (x)) - \frac{\tilde{H} (H (θ_{n}) + ε_{n} \frac{H (θ_{n} + ε_{n} x_{n}) - H (θ_{n})}{ε_{n}}) - \tilde{H} (H (θ_{n}))}{ε_{n}} ∥_{\tilde{\tilde{E}}} = 0 .

This completes the proof. ☐

Appendix B. Delta-Method for the Bootstrap

The functional delta-method is a widely used technique to derive bootstrap consistency for a sequence of plug-in estimators with respect to a map H from bootstrap consistency of the underlying sequence of estimators. An essential limitation of the classical functional delta-method for proving bootstrap consistency in probability (or outer probability) is the condition of Hadamard differentiability on H (see Theorem 3.9.11 of van der Vaart and Wellner (1996)). It is commonly acknowledged that Hadamard differentiability fails for many relevant maps H. Recently, it was demonstrated in Beutner and Zähle (2016) that a functional delta-method for the bootstrap in probability can also be proved for quasi-Hadamard differentiable maps H. Quasi-Hadamard differentiability is a weaker notion of “differentiability” than Hadamard differentiability and can be obtained for many relevant statistical functionals H (see, e.g., Beutner et al. 2012; Beutner and Zähle 2010, 2012; Krätschmer et al. 2013; Krätschmer and Zähle 2017). Using the classical functional delta-method to prove almost sure (or outer almost sure) bootstrap consistency for a sequence of plug-in estimators with respect to a map H from almost sure (or outer almost sure) bootstrap consistency of the underlying sequence of estimators requires uniform Hadamard differentiability on H (see Theorem 3.9.11 of van der Vaart and Wellner (1996)). In this section, we introduce the notion of uniform quasi-Hadamard differentiability and demonstrate that one can even obtain a functional delta-method for the almost sure bootstrap and uniformly quasi-Hadamard differentiable maps H.

To explain the background and the contribution of this section more precisely, assume that we are given an estimator

{\hat{T}}_{n}

for a parameter

θ

in a vector space, with n denoting the sample size, and that we are actually interested in the aspect

H (θ)

of

θ

. Here, H is any map taking values in a vector space. Then,

H ({\hat{T}}_{n})

is often a reasonable estimator for

H (θ)

. One of the main objects in statistical inference is the distribution of the error

H ({\hat{T}}_{n}) - H (θ)

, because the error distribution can theoretically be used to derive confidence regions for

H (θ)

. However, in applications, specifying the error distribution exactly is often difficult or even impossible. A widely used way out is to derive the asymptotic error distribution, i.e., the weak limit

μ

of

law {a_{n} (H ({\hat{T}}_{n}) - H (θ))}

for suitable normalizing constants

a_{n}

tending to infinity, and to use

μ

as an approximation for

μ_{n} : = law {a_{n} (H ({\hat{T}}_{n}) - H (θ))}

for large n. Since

μ

usually still depends on the unknown parameter

θ

, one should use the notation

μ_{θ}

instead of

μ

. In particular, one actually uses

μ_{{\hat{T}}_{n}} : = μ_{θ} |_{θ = {\hat{T}}_{n}}

as an approximation for

μ_{n}

for large n.

Not least because of the estimation of the parameter

θ

of

μ_{θ}

, the approximation of

μ_{n}

by

μ_{{\hat{T}}_{n}}

is typically only moderate. An often more efficient alternative technique to approximate

μ_{n}

is the bootstrap. The bootstrap was introduced by Efron (1979), and many variants of his method have been proposed since then. One may refer to Davison and Hinkley (1997); Efron (1994); Lahiri (2003); Shao and Tu (1995) for general accounts of this topic. The basic idea of the bootstrap is the following. By re-sampling the original sample according to a certain re-sampling mechanism (depending on the particular bootstrap method), one can sometimes construct a so-called bootstrap version

{\hat{T}}_{n}^{*}

of

{\hat{T}}_{n}

for which the conditional law of

a_{n} (H ({\hat{T}}_{n}^{*}) - H ({\hat{T}}_{n}))

“given the sample” has the same weak limit

μ_{θ}

as the law of

a_{n} (H ({\hat{T}}_{n}) - H (θ))

has. The latter is referred to as bootstrap consistency. Since

{\hat{T}}_{n}^{*}

depends only on the sample and the re-sampling mechanism, one can at least numerically determine the conditional law of

a_{n} (H ({\hat{T}}_{n}^{*}) - H ({\hat{T}}_{n}))

“given the sample” by means of a Monte Carlo simulation based on

L ≫ n

repetitions. The resulting law

μ_{L}^{*}

can then be used as an approximation of

μ_{n}

, at least for large n.
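The Monte Carlo step just described can be sketched as follows. This is only an illustration under stated assumptions, not the procedure analyzed in this paper: it uses Efron's i.i.d. re-sampling (whereas the main results concern the blockwise and multiplier bootstrap), takes H to be the empirical Average Value at Risk at level α, and uses the normalizing constants a_n = √n. The function names `empirical_avar` and `bootstrap_errors`, the exponential toy data, and the number of repetitions are hypothetical choices.

```python
import math
import random


def empirical_avar(sample, alpha):
    """Plug-in AV@R at level alpha of the empirical distribution of `sample`.

    Uses the representation q + E[(X - q)^+] / (1 - alpha), where q is the
    lower alpha-quantile of the empirical distribution.
    """
    xs = sorted(sample)
    n = len(xs)
    q = xs[max(math.ceil(n * alpha) - 1, 0)]  # lower empirical alpha-quantile
    mean_excess = sum(max(x - q, 0.0) for x in xs) / n
    return q + mean_excess / (1.0 - alpha)


def bootstrap_errors(sample, alpha, n_rep, rng):
    """Draw n_rep realizations of a_n * (H(T_n^*) - H(T_n)) given the sample.

    Assumption: Efron's bootstrap (re-sampling with replacement) and a_n = sqrt(n).
    """
    n = len(sample)
    a_n = math.sqrt(n)
    base = empirical_avar(sample, alpha)
    draws = []
    for _ in range(n_rep):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        draws.append(a_n * (empirical_avar(resample, alpha) - base))
    return draws


rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(200)]  # toy i.i.d. losses
draws = sorted(bootstrap_errors(sample, alpha=0.9, n_rep=500, rng=rng))

# Empirical 2.5% and 97.5% quantiles of the simulated conditional law.
lo = draws[int(0.025 * len(draws))]
hi = draws[int(0.975 * len(draws)) - 1]
print(f"approximate 95% range of the bootstrapped error: [{lo:.3f}, {hi:.3f}]")
```

The sorted list `draws` plays the role of the simulated law obtained from the repetitions; its empirical quantiles could then be used, for instance, to build a confidence interval for the Average Value at Risk.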

In applications, the roles of

θ

and

{\hat{T}}_{n}

are often played by a distribution function F and the empirical distribution function

{\hat{F}}_{n}

of n random variables that are identically distributed according to F, respectively. Not least for this particular setting, several results on bootstrap consistency for

{\hat{T}}_{n}

are known (see also Appendix B.2). The functional delta-method then ensures that bootstrap consistency also holds for

H ({\hat{T}}_{n})

when H is suitably differentiable at

θ

. Technically speaking, as indicated above, one has to distinguish between two types of bootstrap consistency. First, bootstrap consistency in probability for

H ({\hat{T}}_{n})

can be associated with

lim_{n \to \infty} P^{out} [\{ω \in Ω : d_{BL}^{\circ} (P_{n} (ω, \cdot), μ_{θ}) \geq δ\}] = 0 for all δ > 0,

(A6)

where

ω

represents the sample,

P_{n} (ω, \cdot)

denotes the conditional law of

a_{n} (H ({\hat{T}}_{n}^{*}) - H ({\hat{T}}_{n}))

given the sample

ω

,

d_{BL}^{\circ}

is the bounded Lipschitz distance, and the superscript

^{out}

refers to outer probability. At this point, it is worth pointing out that we consider weak convergence (respectively, convergence in distribution) with respect to the open-ball

σ

-algebra, in symbols

\Rightarrow^{\circ}

(respectively,

⇝^{\circ}

), as defined in (Billingsley 1999, sct. 6) (see also Dudley 1966, 1967; Pollard 1984; Shorack and Wellner 1986) and that by the Portmanteau theorem A.3 in Beutner and Zähle (2016) weak convergence

μ_{n} \Rightarrow^{\circ} μ

holds if and only if

d_{BL}^{\circ} (μ_{n}, μ) \to 0

. Second, bootstrap consistency almost surely for

H ({\hat{T}}_{n})

means that

law \{a_{n} (H ({\hat{T}}_{n}^{*} (ω, \cdot)) - H ({\hat{T}}_{n} (ω)))\} \Rightarrow^{\circ} μ_{θ} \quad P\text{-a.e. } ω .

(A7)

In Beutner and Zähle (2016), it has been shown that Equation (A6) follows from the respective analogue for

{\hat{T}}_{n}

when H is suitably quasi-Hadamard differentiable at

θ

. This extends Theorem 3.9.11 of van der Vaart and Wellner (1996) which covers only Hadamard differentiable maps. In this section, we show that Equation (A7) follows from the respective analogue for

{\hat{T}}_{n}

when H is suitably uniformly quasi-Hadamard differentiable at

θ

; the notion of uniform quasi-Hadamard differentiability is introduced in Definition A1 below. This extends Theorem 3.9.13 of van der Vaart and Wellner (1996) which covers only uniformly Hadamard differentiable maps.

Appendix B.1. Abstract Delta-Method for the Bootstrap

Theorem A4 provides an abstract delta-method for the almost sure bootstrap. It is based on the notion of uniform quasi-Hadamard differentiability which we introduce first. This sort of differentiability extends the notion of quasi-Hadamard differentiability as introduced in Beutner and Zähle (2010, 2016). The latter corresponds to the differentiability concept in (i) of Definition A1 ahead with

S

and

\tilde{E}

as in (iii) and (v) of this definition. Let

V

and

\tilde{V}

be vector spaces. Let

E \subseteq V

and

\tilde{E} \subseteq \tilde{V}

be subspaces equipped with norms

{∥ \cdot ∥}_{E}

and

{∥ \cdot ∥}_{\tilde{E}}

, respectively. Let

H : V_{H} ⟶ \tilde{V}

be any map defined on some subset

V_{H} \subseteq V

.

Definition A1.

Let

E_{0}

be a subset of

E

, and

S

be a set of sequences in

V_{H}

.

(i) The map H is said to be uniformly quasi-Hadamard differentiable with respect to

S

tangentially to

E_{0} 〈 E 〉

with trace

\tilde{E}

if

H (y_{1}) - H (y_{2}) \in \tilde{E}

for all

y_{1}, y_{2} \in V_{H}

and there is some continuous map

{\dot{H}}_{S} : E_{0} \to \tilde{E}

such that

\begin{matrix} lim_{n \to \infty} ∥ {\dot{H}}_{S} (x) - \frac{H (θ_{n} + ε_{n} x_{n}) - H (θ_{n})}{ε_{n}} ∥_{\tilde{E}} = 0 \end{matrix}

(A8)

holds for each quadruple

((θ_{n}), x, (x_{n}), (ε_{n}))

, with

(θ_{n}) \in S

,

x \in E_{0}

,

(x_{n}) \subseteq E

satisfying

{∥ x_{n} - x ∥}_{E} \to 0

as well as

(θ_{n} + ε_{n} x_{n}) \subseteq V_{H}

, and

(ε_{n}) \subseteq (0, \infty)

satisfying

ε_{n} \to 0

. In this case, the map

{\dot{H}}_{S}

is called uniform quasi-Hadamard derivative of H with respect to

S

tangentially to

E_{0} 〈 E 〉

.

(ii) If

S

consists of all sequences

(θ_{n}) \subseteq V_{H}

with

θ_{n} - θ \in E

,

n \in N

, and

{∥ θ_{n} - θ ∥}_{E} \to 0

for some fixed

θ \in V_{H}

, then we replace the phrase “with respect to

S

” by “at θ” and “

{\dot{H}}_{S}

” by “

{\dot{H}}_{θ}

”.

(iii) If

S

consists only of the constant sequence

θ_{n} = θ

,

n \in N

, then we skip the phrase “uniformly” and replace the phrase “with respect to

S

” by “at θ” and “

{\dot{H}}_{S}

” by “

{\dot{H}}_{θ}

”. In this case, we may also replace “

H (y_{1}) - H (y_{2}) \in \tilde{E}

for all

y_{1}, y_{2} \in V_{H}

” by “

H (y) - H (θ) \in \tilde{E}

for all

y \in V_{H}

”.

(iv) If

E = V

, then we skip the phrase “quasi-”.

(v) If

\tilde{E} = \tilde{V}

, then we skip the phrase “with trace

\tilde{E}

”.
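To make the difference quotient in Equation (A8) concrete, the following minimal sketch evaluates it numerically for a toy map. Everything in it is an assumption made for illustration and does not appear in the paper: the map H(θ) = θ² on V = E = R (so this is the “non-quasi” case of Part (iv)), the particular sequences θ_n → θ, x_n → x, ε_n → 0, and the helper name `difference_quotient`. For this map the quotient equals 2 θ_n x_n + ε_n x_n², which tends to 2 θ x.

```python
def difference_quotient(H, theta_n, eps_n, x_n):
    """The quotient (H(theta_n + eps_n * x_n) - H(theta_n)) / eps_n from (A8)."""
    return (H(theta_n + eps_n * x_n) - H(theta_n)) / eps_n


def H(t):
    # Toy map on the real line (hypothetical example, not from the paper).
    return t * t


theta, x = 1.5, -0.7
derivative = 2.0 * theta * x  # candidate derivative of H at theta in direction x

errors = []
for n in (10, 100, 1000, 10000):
    theta_n = theta + 1.0 / n   # a sequence in S converging to theta
    x_n = x + 1.0 / n           # directions converging to x
    eps_n = n ** -0.5           # positive step sizes tending to 0
    errors.append(abs(difference_quotient(H, theta_n, eps_n, x_n) - derivative))

print(errors)  # the absolute errors shrink as n grows
```

The point of the uniform notion is visible here: the base point θ_n is allowed to move with n, rather than being fixed at θ as in Part (iii).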

The conventional notion of uniform Hadamard differentiability as used in Theorem 3.9.11 of van der Vaart and Wellner (1996) corresponds to the differentiability concept in (i) with

S

as in (ii),

E

as in (iv), and

\tilde{E}

as in (v). Proposition 1 shows that it is beneficial to refrain from insisting on

E = V

as in (iv). It was recently discussed in Belloni et al. (2017) that it can be also beneficial to refrain from insisting on the assumption of (ii). For

E = V

(“non-quasi” case), uniform Hadamard differentiability in the sense of Definition B.1 in Belloni et al. (2017) corresponds to uniform Hadamard differentiability in the sense of our Definition A1 (Parts (i) and (iv)) when

S

is chosen as the set of all sequences

(θ_{n})

in a compact metric space

(K_{θ}, d_{K})

with

θ \in K_{θ} \subseteq V_{H}

for which

d_{K} (θ_{n}, θ) \to 0

. In Comment B.3 of Belloni et al. (2017), it is illustrated by means of the quantile functional that this notion of differentiability (subject to a suitable choice of

(K_{θ}, d_{K})

) is strictly weaker than the notion of uniform Hadamard differentiability that was used in the classical delta-method for the almost sure bootstrap, Theorem 3.9.11 in van der Vaart and Wellner (1996). Although this shows that the flexibility with respect to

S

in our Definition A1 can be beneficial, it is arguably even more important that we allow for the “quasi” case.

Of course, the smaller the family

S

the weaker the condition of uniform quasi-Hadamard differentiability with respect to

S

. On the other hand, if the set

S

is too small, then Condition (e) in Theorem A4 ahead may fail. That is, for an application of the functional delta-method in the form of Theorem A4 the set

S

should be large enough for Condition (e) to be fulfilled and small enough for being able to establish uniform quasi-Hadamard differentiability with respect to

S

of the map H.

We now turn to the abstract delta-method. As mentioned in Section 1, convergence in distribution will always be considered for the open-ball

σ

-algebra. We use the terminology convergence in distribution

^{\circ}

(symbolically

⇝^{\circ}

) for this sort of convergence; for details see Appendix A and Appendices A–C of Beutner and Zähle (2016). In a separable metric space the notion of convergence in distribution

^{\circ}

boils down to the conventional notion of convergence in distribution for the Borel

σ

-algebra. In this case, we use the symbol ⇝ instead of

⇝^{\circ}

.

Let

(Ω, F, P)

be a probability space, and

({\hat{T}}_{n})

be a sequence of maps

{\hat{T}}_{n} : Ω ⟶ V .

Regard

ω \in Ω

as a sample drawn from

P

, and

{\hat{T}}_{n} (ω)

as a statistic derived from

ω

. Somewhat unconventionally, we do not (need to) require at this point that

{\hat{T}}_{n}

is measurable with respect to any

σ

-algebra on

V

. Let

(Ω^{'}, F^{'}, P^{'})

be another probability space and set

(\bar{Ω}, \bar{F}, \bar{P}) : = (Ω \times Ω^{'}, F \otimes F^{'}, P \otimes P^{'}) .

The probability measure

P^{'}

represents a random experiment that is run independently of the random sample mechanism

P

. In the sequel,

{\hat{T}}_{n}

will frequently be regarded as a map defined on the extension

\bar{Ω}

of

Ω

. Let

{\hat{T}}_{n}^{*} : \bar{Ω} ⟶ V

be any map. Since

{\hat{T}}_{n}^{*} (ω, ω^{'})

depends on both the original sample

ω

and the outcome

ω^{'}

of the additional independent random experiment, we may regard

{\hat{T}}_{n}^{*}

as a bootstrapped version of

{\hat{T}}_{n}

. Moreover, let

{\hat{C}}_{n} : Ω ⟶ V

be any map. As with

{\hat{T}}_{n}

, we often regard

{\hat{C}}_{n}

as a map defined on the extension

\bar{Ω}

of

Ω

. We use

{\hat{C}}_{n}

together with a scaling sequence to get weak convergence results for

{\hat{T}}_{n}^{*}

. The role of

{\hat{C}}_{n}

is often played by

{\hat{T}}_{n}

itself (see Example A1), but sometimes also by a different map (see Example A2). Assume that

{\hat{T}}_{n}

,

{\hat{T}}_{n}^{*}

, and

{\hat{C}}_{n}

take values only in

V_{H}

.

Let

B^{\circ}

and

{\tilde{B}}^{\circ}

be the open-ball

σ

-algebras on

E

and

\tilde{E}

with respect to the norms

{∥ \cdot ∥}_{E}

and

{∥ \cdot ∥}_{\tilde{E}}

, respectively. Note that

B^{\circ}

coincides with the Borel

σ

-algebra on

E

when

(E, ∥ \cdot ∥_{E})

is separable. The same is true for

{\tilde{B}}^{\circ}

. Set

\bar{\tilde{E}} : = \tilde{E} \times \tilde{E}

and let

\bar{{\tilde{B}}^{\circ}}

be the

σ

-algebra on

\bar{\tilde{E}}

generated by the open balls with respect to the metric

\bar{\tilde{d}} (({\tilde{x}}_{1}, {\tilde{x}}_{2}), ({\tilde{y}}_{1}, {\tilde{y}}_{2})) : = max {∥ {\tilde{x}}_{1} - {\tilde{y}}_{1} ∥_{\tilde{E}}; ∥ {\tilde{x}}_{2} - {\tilde{y}}_{2} ∥_{\tilde{E}}}

. Recall that

\bar{{\tilde{B}}^{\circ}} \subseteq {\tilde{B}}^{\circ} \otimes {\tilde{B}}^{\circ}

, because any

\bar{\tilde{d}}

-open ball in

\bar{\tilde{E}}

is the product of two

{∥ \cdot ∥}_{\tilde{E}}

-open balls in

\tilde{E}

.

Theorem A3 is a consequence of Theorem A2 in Appendix A.2 as we assume that

{\hat{T}}_{n}

takes values only in

V_{H}

. The proof of the measurability statement of Theorem A3 is given in the proof of Theorem A4. Theorem A3 is stated here because, together with Theorem A4, it implies almost sure bootstrap consistency whenever the limit

ξ

is the same in Theorem A3 and Theorem A4.

Theorem A3.

Let

(θ_{n})

be a sequence in

V_{H}

and

S : = {(θ_{n})}

. Let

E_{0} \subseteq E

be a separable subspace and assume that

E_{0} \in B^{\circ}

. Let

(a_{n})

be a sequence of positive real numbers with

a_{n} \to \infty

, and assume that the following assertions hold:

(a): $a_{n} ({\hat{T}}_{n} - θ_{n})$ takes values only in $E$ , is $(F, B^{\circ})$ -measurable, and satisfies

$a_{n} ({\hat{T}}_{n} - θ_{n}) ⇝^{\circ} ξ \text{ in } (E, B^{\circ}, ∥ \cdot ∥_{E})$

(A9)

for some $(E, B^{\circ})$ -valued random variable ξ on some probability space $(Ω_{0}, F_{0}, P_{0})$ with $ξ (Ω_{0}) \subseteq E_{0}$ .
(b): $a_{n} (H ({\hat{T}}_{n}) - H (θ_{n}))$ takes values only in $\tilde{E}$ and is $(F, {\tilde{B}}^{\circ})$ -measurable.
(c): H is uniformly quasi-Hadamard differentiable with respect to $S$ tangentially to $E_{0} 〈 E 〉$ with trace $\tilde{E}$ and uniform quasi-Hadamard derivative ${\dot{H}}_{S}$ .

Then,

{\dot{H}}_{S} (ξ)

is

(F_{0}, {\tilde{B}}^{\circ})

-measurable and

a_{n} (H ({\hat{T}}_{n}) - H (θ_{n})) ⇝^{\circ} {\dot{H}}_{S} (ξ) \text{ in } (\tilde{E}, {\tilde{B}}^{\circ}, ∥ \cdot ∥_{\tilde{E}}) .

Theorem A4.

Let

S

be any set of sequences in

V_{H}

. Let

E_{0} \subseteq E

be a separable subspace and assume that

E_{0} \in B^{\circ}

. Let

(a_{n})

be a sequence of positive real numbers with

a_{n} \to \infty

, and assume that the following assertions hold:

(a): $a_{n} ({\hat{T}}_{n}^{*} - {\hat{C}}_{n})$ takes values only in $E$ , is $(\bar{F}, B^{\circ})$ -measurable, and satisfies

$a_{n} ({\hat{T}}_{n}^{*} (ω, \cdot) - {\hat{C}}_{n} (ω)) ⇝^{\circ} ξ \text{ in } (E, B^{\circ}, ∥ \cdot ∥_{E}), \quad P\text{-a.e. } ω$

(A10)

for some $(E, B^{\circ})$ -valued random variable ξ on some probability space $(Ω_{0}, F_{0}, P_{0})$ with $ξ (Ω_{0}) \subseteq E_{0}$ .
(b): $a_{n} (H ({\hat{T}}_{n}^{*}) - H ({\hat{C}}_{n}))$ takes values only in $\tilde{E}$ and is $(\bar{F}, {\tilde{B}}^{\circ})$ -measurable.
(c): H is uniformly quasi-Hadamard differentiable with respect to $S$ tangentially to $E_{0} 〈 E 〉$ with trace $\tilde{E}$ and uniform quasi-Hadamard derivative ${\dot{H}}_{S}$ .
(d): The uniform quasi-Hadamard derivative ${\dot{H}}_{S}$ can be extended from $E_{0}$ to $E$ such that the extension ${\dot{H}}_{S} : E \to \tilde{E}$ is $(B^{\circ}, {\tilde{B}}^{\circ})$ -measurable and continuous at every point of $E_{0}$ .
(e): $({\hat{C}}_{n} (ω)) \in S$ for $P$ -a.e. ω.
(f): The map $h : \bar{\tilde{E}} \to \tilde{E}$ defined by $h ({\tilde{x}}_{1}, {\tilde{x}}_{2}) : = {\tilde{x}}_{1} - {\tilde{x}}_{2}$ is $(\bar{{\tilde{B}}^{\circ}}, {\tilde{B}}^{\circ})$ -measurable.

Then,

{\dot{H}}_{S} (ξ)

is

(F_{0}, {\tilde{B}}^{\circ})

-measurable and

a_{n} (H ({\hat{T}}_{n}^{*} (ω, \cdot)) - H ({\hat{C}}_{n} (ω))) ⇝^{\circ} {\dot{H}}_{S} (ξ) \text{ in } (\tilde{E}, {\tilde{B}}^{\circ}, ∥ \cdot ∥_{\tilde{E}}), \quad P\text{-a.e. } ω .

(A11)

Remark A1.

In Condition (a) of Theorem A4, it is assumed that

a_{n} ({\hat{T}}_{n}^{*} - {\hat{C}}_{n})

is

(\bar{F}, B^{\circ})

-measurable for

\bar{F} : = F \otimes F^{'}

. Thus, the mapping

ω^{'} \mapsto a_{n} ({\hat{T}}_{n}^{*} (ω, ω^{'}) - {\hat{C}}_{n} (ω))

is

(F^{'}, B^{\circ})

-measurable for every fixed

ω \in Ω

. That is,

a_{n} ({\hat{T}}_{n}^{*} (ω, \cdot) - {\hat{C}}_{n} (ω))

can be seen as an

(E, B^{\circ})

-valued random variable on

(Ω^{'}, F^{'}, P^{'})

for every fixed

ω \in Ω

, so that assertion (A10) makes sense. By the same line of reasoning one can regard

a_{n} (H ({\hat{T}}_{n}^{*} (ω, \cdot)) - H ({\hat{C}}_{n} (ω)))

as an

(\tilde{E}, {\tilde{B}}^{\circ})

-valued random variable on

(Ω^{'}, F^{'}, P^{'})

for every fixed

ω \in Ω

, so that also assertion (A11) makes sense.

Remark A2.

Condition (c) in Theorem A3 (respectively, Theorem A4) assumes that the trace is given by

\tilde{E}

, which implies that the first part of Condition (b) in Theorem A3 (respectively, Theorem A4) is automatically satisfied.

Remark A3.

Condition (f) of Theorem A4 is automatically fulfilled when

(\tilde{E}, ∥ \cdot ∥_{\tilde{E}})

is separable. Indeed, in this case we have

\bar{{\tilde{B}}^{\circ}} = {\tilde{B}}^{\circ} \otimes {\tilde{B}}^{\circ}

so that every continuous map

h : \bar{\tilde{E}} \to \tilde{E}

(such as

h ({\tilde{x}}_{1}, {\tilde{x}}_{2}) : = {\tilde{x}}_{1} - {\tilde{x}}_{2}

) is

(\bar{{\tilde{B}}^{\circ}}, {\tilde{B}}^{\circ})

-measurable.

Proof of Theorem A4.

First note that by the assumption imposed on

ξ

(see Assumption (a)) and Assumption (c) the map

{\dot{H}}_{S} (ξ)

is

(F_{0}, {\tilde{B}}^{\circ})

-measurable. Next, note that

\begin{matrix} a_{n} (H ({\hat{T}}_{n}^{*} (ω, ω^{'})) - H ({\hat{C}}_{n} (ω))) \\ = \{a_{n} (H ({\hat{T}}_{n}^{*} (ω, ω^{'})) - H ({\hat{C}}_{n} (ω))) - {\dot{H}}_{S} (a_{n} ({\hat{T}}_{n}^{*} (ω, ω^{'}) - {\hat{C}}_{n} (ω)))\} \\ + {\dot{H}}_{S} (a_{n} ({\hat{T}}_{n}^{*} (ω, ω^{'}) - {\hat{C}}_{n} (ω))) \\ = : S_{1} (n, ω, ω^{'}) + S_{2} (n, ω, ω^{'}) . \end{matrix}

By Equation (A10) in Assumption (a) and the Continuous Mapping theorem in the form of (Billingsley 1999, Theorem 6.4) (along with

P_{0} \circ ξ^{- 1} [E_{0}] = 1

and the continuity of

{\dot{H}}_{S}

), we have that

S_{2} (n, ω, \cdot) ⇝^{\circ} {\dot{H}}_{S} (ξ)

for

P

-a.e.

ω

. Moreover, for every fixed

ω

we have that

ω^{'} \mapsto S_{1} (n, ω, ω^{'})

is

(F^{'}, {\tilde{B}}^{\circ})

-measurable by Assumption (f), and for

P

-a.e.

ω

we have

a_{n} (H ({\hat{T}}_{n}^{*} (ω, \cdot)) - H ({\hat{C}}_{n} (ω))) - {\dot{H}}_{S} (a_{n} ({\hat{T}}_{n}^{*} (ω, \cdot) - {\hat{C}}_{n} (ω))) \to^{p, \circ} 0_{\tilde{E}}

by Part (ii) of Theorem A2 (recall that

{\hat{T}}_{n}^{*}

was assumed to take values only in

V_{H}

), where

\to^{p, \circ}

refers to convergence in probability

^{\circ}

(see Appendix A.1) and

{\hat{T}}_{n}^{*} (ω, \cdot)

,

{\hat{C}}_{n} (ω)

,

\{({\hat{C}}_{n} (ω))\}

play the roles of

{\hat{T}}_{n} (\cdot)

,

θ_{n}

,

S

, respectively. Hence, from Corollary A2, we get that Equation (A11) holds. ☐

Appendix B.2. Application to Plug-In Estimators of Statistical Functionals

Let

D

,

D_{ϕ}

,

B_{ϕ}^{\circ}

be as introduced at the beginning of Section 3. Let

C_{ϕ} \subseteq D_{ϕ}

be a

{∥ \cdot ∥}_{ϕ}

-separable subspace and assume

C_{ϕ} \in B_{ϕ}^{\circ}

. Moreover, let

H : D (H) \to \tilde{V}

be a map defined on a set

D (H)

of distribution functions of finite (not necessarily probability) Borel measures on

R

, where

\tilde{V}

is any vector space. In particular,

D (H) \subseteq D

. In the following,

D

,

(D_{ϕ}, B_{ϕ}^{\circ}, ∥ \cdot ∥_{ϕ})

,

C_{ϕ}

, and

D (H)

play the roles of

V

,

(E, B^{\circ}, ∥ \cdot ∥_{E})

,

E_{0}

, and

V_{H}

, respectively. As before, we let

(\tilde{E}, ∥ \cdot ∥_{\tilde{E}})

be a normed subspace of

\tilde{V}

equipped with the corresponding open-ball

σ

-algebra

{\tilde{B}}^{\circ}

.

Let

(Ω, F, P)

be a probability space. Let

(F_{n}) \subseteq D (H)

be any sequence and

(X_{i})

be a sequence of real-valued random variables on

(Ω, F, P)

. Moreover, let

{\hat{F}}_{n} : Ω \to D

be the empirical distribution function of

X_{1}, \dots, X_{n}

, which will play the role of

{\hat{T}}_{n}

. It is defined by

{\hat{F}}_{n} : = \frac{1}{n} \sum_{i = 1}^{n} 𝟙_{[X_{i}, \infty)} .

(A12)

Assume that

{\hat{F}}_{n}

takes values only in

D (H)

. Let

(Ω^{'}, F^{'}, P^{'})

be another probability space and set

(\bar{Ω}, \bar{F}, \bar{P}) : = (Ω \times Ω^{'}, F \otimes F^{'}, P \otimes P^{'})

. Moreover, let

{\hat{F}}_{n}^{*} : \bar{Ω} \to D

be any map. Assume that

{\hat{F}}_{n}^{*}

takes values only in

D (H)

. Furthermore, let

{\hat{C}}_{n} : \bar{Ω} \to D

be any map that takes values only in

D (H)

. In the present setting Theorems A3 and A4 can be reformulated as follows, where we recall from Remark A3 that Condition (f) of Theorem A4 is automatically fulfilled when

(\tilde{E}, ∥ \cdot ∥_{\tilde{E}})

is separable.
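As a concrete illustration of Equation (A12), the following minimal Python sketch evaluates the empirical distribution function of a small sample. The function name `empirical_cdf` and the toy data are illustrative assumptions and not part of the paper.

```python
import numpy as np

def empirical_cdf(sample):
    """Return x -> F_hat_n(x) = (1/n) * #{i : X_i <= x}, as in Equation (A12)."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size

    def F_hat(x):
        # side="right" counts the observations X_i with X_i <= x, which is
        # the sum of the indicators of [X_i, infinity) evaluated at x.
        return np.searchsorted(xs, x, side="right") / n

    return F_hat

# Toy data (hypothetical): four observations, one of them repeated.
F_hat = empirical_cdf([0.5, -1.0, 2.0, 0.5])
values = [float(F_hat(x)) for x in (-2.0, -1.0, 0.5, 3.0)]
```

The resulting function is a right-continuous step function with jumps at the observations, so it is a distribution function in the sense used above.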

Corollary A3.

Let

(F_{n})

be a sequence in

D (H)

and

S : = \{(F_{n})\}

. Let

(a_{n})

be a sequence of positive real numbers with

a_{n} \to \infty

, and assume that the following assertions hold:

(a): $a_{n} ({\hat{F}}_{n} - F_{n})$ takes values only in $D_{ϕ}$ and satisfies

$a_{n} ({\hat{F}}_{n} - F_{n}) ⇝^{\circ} B \quad \text{in } (D_{ϕ}, B_{ϕ}^{\circ}, ∥ \cdot ∥_{ϕ})$

(A13)

for some $(D_{ϕ}, B_{ϕ}^{\circ})$ -valued random variable B on some probability space $(Ω_{0}, F_{0}, P_{0})$ with $B (Ω_{0}) \subseteq C_{ϕ}$ .
(b): $a_{n} (H ({\hat{F}}_{n}) - H (F_{n}))$ takes values only in $\tilde{E}$ and is $(F, {\tilde{B}}^{\circ})$ -measurable.
(c): H is uniformly quasi-Hadamard differentiable with respect to $S$ tangentially to $C_{ϕ} 〈 D_{ϕ} 〉$ with trace $\tilde{E}$ and uniform quasi-Hadamard derivative ${\dot{H}}_{S}$ .

Then,

{\dot{H}}_{S} (B)

is

(F_{0}, {\tilde{B}}^{\circ})

-measurable and

a_{n} (H ({\hat{F}}_{n}) - H (F_{n})) ⇝^{\circ} {\dot{H}}_{S} (B) \quad \text{in } (\tilde{E}, {\tilde{B}}^{\circ}, ∥ \cdot ∥_{\tilde{E}}) .

Note that the measurability assumption in Condition (a) of Theorem A3 is automatically satisfied in the present setting (and is therefore omitted in Condition (a) of Corollary A3). Indeed,

a_{n} ({\hat{F}}_{n} - F)

is easily seen to be

(F, B_{ϕ}^{\circ})

-measurable, because

B_{ϕ}^{\circ}

coincides with the trace

σ

-algebra of

D

.

Corollary A4.

Let

S

be any set of sequences in

D (H)

. Let

(a_{n})

be a sequence of positive real numbers with

a_{n} \to \infty

, and assume that the following assertions hold:

(a): $a_{n} ({\hat{F}}_{n}^{*} - {\hat{C}}_{n})$ takes values only in $D_{ϕ}$ , is $(\bar{F}, B_{ϕ}^{\circ})$ -measurable, and

$a_{n} ({\hat{F}}_{n}^{*} (ω, \cdot) - {\hat{C}}_{n} (ω)) ⇝^{\circ} B \quad \text{in } (D_{ϕ}, B_{ϕ}^{\circ}, ∥ \cdot ∥_{ϕ}), \quad P\text{-a.e. } ω$

(A14)

for some $(D_{ϕ}, B_{ϕ}^{\circ})$ -valued random variable B on some probability space $(Ω_{0}, F_{0}, P_{0})$ with $B (Ω_{0}) \subseteq C_{ϕ}$ .
(b): $a_{n} (H ({\hat{F}}_{n}^{*}) - H ({\hat{C}}_{n}))$ takes values only in $\tilde{E}$ and is $(\bar{F}, {\tilde{B}}^{\circ})$ -measurable.
(c): H is uniformly quasi-Hadamard differentiable with respect to $S$ tangentially to $C_{ϕ} 〈 D_{ϕ} 〉$ with trace $\tilde{E}$ and uniform quasi-Hadamard derivative ${\dot{H}}_{S}$ .
(d): The uniform quasi-Hadamard derivative ${\dot{H}}_{S}$ can be extended from $C_{ϕ}$ to $D_{ϕ}$ such that the extension ${\dot{H}}_{S} : D_{ϕ} \to \tilde{E}$ is $(B_{ϕ}^{\circ}, {\tilde{B}}^{\circ})$ -measurable, and continuous at every point of $C_{ϕ}$ .
(e): $({\hat{C}}_{n} (ω)) \in S$ for $P$ -a.e. ω.
(f): The map $h : \bar{\tilde{E}} \to \tilde{E}$ defined by $h ({\tilde{x}}_{1}, {\tilde{x}}_{2}) : = {\tilde{x}}_{1} - {\tilde{x}}_{2}$ is $(\bar{{\tilde{B}}^{\circ}}, {\tilde{B}}^{\circ})$ -measurable.

Then,

{\dot{H}}_{S} (B)

is

(F_{0}, {\tilde{B}}^{\circ})

-measurable and

a_{n} (H ({\hat{F}}_{n}^{*} (ω, \cdot)) - H ({\hat{C}}_{n} (ω))) ⇝^{\circ} {\dot{H}}_{S} (B) \quad \text{in } (\tilde{E}, {\tilde{B}}^{\circ}, ∥ \cdot ∥_{\tilde{E}}), \quad P\text{-a.e. } ω .

The following examples illustrate

{\hat{F}}_{n}^{*}

and

{\hat{C}}_{n}

. In Example A1, we have

{\hat{C}}_{n} = {\hat{F}}_{n}

, and in Example A2

{\hat{C}}_{n}

may differ from

{\hat{F}}_{n}

. Examples for uniformly quasi-Hadamard differentiable functionals H can be found in Section 3. In the examples in Section 3.1 and Section 3.3 we have

\tilde{V} = \tilde{E} = R

, and in the Example in Section 3.2 we have

\tilde{V} = D

and

\tilde{E} = D_{ϕ}

for some

ϕ

.

Example A1.

Let

(X_{i})

be a sequence of i.i.d. real-valued random variables on

(Ω, F, P)

with distribution function F satisfying

\int ϕ^{2} d F < \infty

, and

{\hat{F}}_{n}

be given by Equation (A12). Let

(W_{n i})

be a triangular array of nonnegative real-valued random variables on

(Ω^{'}, F^{'}, P^{'})

such that Setting S1. or Setting S2. of Section 2.1 is met. Define the map

{\hat{F}}_{n}^{*} : \bar{Ω} \to D

by

{\hat{F}}_{n}^{*} (ω, ω^{'}) : = \frac{1}{n} \sum_{i = 1}^{n} W_{n i} (ω^{'}) 𝟙_{[X_{i} (ω), \infty)}

. Recall that Setting S1. is nothing but Efron’s bootstrap (Efron (1979)), and that Setting S2. is in line with the Bayesian bootstrap of Rubin (1981) if

Y_{1}

is exponentially distributed with parameter 1.

In Section 5.1 of Beutner and Zähle (2016), it was proved, with the help of results of Shorack and Wellner (1986) and van der Vaart and Wellner (1996), respectively, that Condition (a) of Corollary A3 (with

F_{n} : = F

) and Condition (a) of Corollary A4 (with

{\hat{C}}_{n} : = {\hat{F}}_{n}

) hold for

a_{n} : = \sqrt{n}

and

B : = B_{F}

, where

B_{F}

is an F-Brownian bridge. Here,

C_{ϕ}

can be chosen to be the set

C_{ϕ, F}

of all

v \in D_{ϕ}

whose discontinuities are also discontinuities of F. In addition, note that, in view of

{\hat{C}}_{n} = {\hat{F}}_{n}

, Condition (e) holds if

S

is (any subset of) the set of all sequences

(G_{n})

of distribution functions on

R

satisfying

G_{n} - F \in D_{ϕ}

,

n \in N

, and

∥ G_{n} - F ∥_{ϕ} \to 0

(see, for instance, Theorem 2.1 in Zähle (2014)).
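The weighted bootstrap distribution function of Example A1 can be sketched numerically as follows. Settings S1. and S2. are specified in Section 2.1 and are not reproduced here, so the two weight schemes below are assumptions: multinomial counts for Efron's bootstrap, and normalized i.i.d. Exp(1) variables Y_i divided by their sample mean for the Bayesian-type bootstrap. All function names are hypothetical.

```python
import numpy as np

def weighted_ecdf(sample, weights):
    """Return x -> (1/n) * sum_i W_i * 1{X_i <= x}."""
    xs = np.asarray(sample, dtype=float)
    ws = np.asarray(weights, dtype=float)
    n = xs.size

    def F_star(x):
        return float(np.sum(ws[xs <= x])) / n

    return F_star

def efron_weights(n, rng):
    # Multinomial(n; 1/n, ..., 1/n) counts: how often each X_i is redrawn.
    return rng.multinomial(n, np.full(n, 1.0 / n))

def bayesian_type_weights(n, rng):
    # Assumed form: W_ni = Y_i / mean(Y) with Y_1, ..., Y_n i.i.d. Exp(1).
    y = rng.exponential(scale=1.0, size=n)
    return y / y.mean()

rng = np.random.default_rng(0)
sample = rng.standard_normal(200)            # placeholder i.i.d. data
w1 = efron_weights(sample.size, rng)
w2 = bayesian_type_weights(sample.size, rng)
F1 = weighted_ecdf(sample, w1)
F2 = weighted_ecdf(sample, w2)
```

In both schemes the weights are nonnegative and sum to n, so each bootstrap version is again a distribution function with total mass one.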

Example A2.

Let

(X_{i})

be a strictly stationary sequence of β-mixing random variables on

(Ω, F, P)

with distribution function F, and

{\hat{F}}_{n}

be given by Equation (A12). Let

(ℓ_{n})

be a sequence of integers such that

ℓ_{n} ↗ \infty

as

n \to \infty

, and

ℓ_{n} < n

for all

n \in N

. Set

k_{n} : = ⌈ n / ℓ_{n} ⌉

for all

n \in N

. Let

{(I_{n j})}_{n \in N, 1 \leq j \leq k_{n}}

be a triangular array of random variables on

(Ω^{'}, F^{'}, P^{'})

such that

I_{n 1}, \dots, I_{n k_{n}}

are i.i.d. according to the uniform distribution on

\{1, \dots, n - ℓ_{n} + 1\}

for every

n \in N

. Define the map

{\hat{F}}_{n}^{*} : \bar{Ω} \to D

by

{\hat{F}}_{n}^{*} (ω, ω^{'}) : = \frac{1}{n} \sum_{i = 1}^{n} W_{n i} (ω^{'}) 𝟙_{[X_{i} (ω), \infty)}

with

W_{n i}

given by Equation (8), and recall from Section 2.2 that this is the blockwise bootstrap. Similarly to Lemma 5.3 in Beutner and Zähle (2016), it follows that

a_{n} ({\hat{F}}_{n}^{*} - {\hat{C}}_{n})

, with

{\hat{C}}_{n} : = E^{'} [{\hat{F}}_{n}^{*}]

, takes values only in

D_{ϕ}

and is

(\bar{F}, B_{ϕ}^{\circ})

-measurable. That is, the first part of Condition (a) of Corollary A4 holds true for

{\hat{C}}_{n} : = E^{'} [{\hat{F}}_{n}^{*}]

. Now, assume that Assumptions A1.–A3. of Section 2.2 hold true. Then, as discussed in Example 4.4 and Section 5.2 of Beutner and Zähle (2016), it can be derived from a result in Arcones and Yu (1994) that under Assumptions A1. and A2. we have that Condition (a) of Corollary A3 holds for

a_{n} : = \sqrt{n}

,

B : = B_{F}

, and

F_{n} : = F

, where

B_{F}

is a centered Gaussian process with covariance function

Γ (t_{0}, t_{1}) = F (t_{0} \land t_{1}) (1 - F (t_{0} \lor t_{1})) + \sum_{i = 0}^{1} \sum_{k = 2}^{\infty} \mathrm{Cov} (𝟙_{\{X_{1} \leq t_{i}\}}, 𝟙_{\{X_{k} \leq t_{1 - i}\}})

. Here,

C_{ϕ}

can be chosen to be the set

C_{ϕ, F}

of all

v \in D_{ϕ}

whose discontinuities are also discontinuities of F. Moreover, Theorem A5 below shows that under the assumptions A1.–A3. the second part of Condition (a) (i.e., Equation (A14)) and Condition (e) of Corollary A4 hold for

{\hat{C}}_{n} : = E^{'} [{\hat{F}}_{n}^{*}] = \frac{1}{n} \sum_{i = 1}^{n} w_{n i} 𝟙_{[X_{i}, \infty)}

with

w_{n i} : = E^{'} [W_{n i}]

(see also Equation (9)) and the same choice of

a_{n}

, B, and

F_{n}

, when

S

is the set of all sequences

(G_{n}) \subseteq D (H)

with

G_{n} - F \in D_{ϕ}

,

n \in N

, and

∥ G_{n} - F ∥_{ϕ} \to 0

.
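The blockwise resampling scheme of Example A2 can be sketched as follows. Equation (8) is not reproduced in this appendix, so the weight construction below is an assumption: W_{ni} counts how many of the k_n blocks cover index i, where each block has length ℓ_n and starts at I_{nj}, and the last block is truncated so that the weights sum to n. The function name `block_weights` and the placeholder data are hypothetical.

```python
import math
import numpy as np

def block_weights(n, block_len, rng):
    """Sample hypothetical moving-block weights (W_n1, ..., W_nn)."""
    k = math.ceil(n / block_len)                    # k_n = ceil(n / l_n)
    # I_n1, ..., I_nk i.i.d. uniform on {1, ..., n - l_n + 1}
    starts = rng.integers(1, n - block_len + 2, size=k)
    weights = np.zeros(n, dtype=int)
    for j, start in enumerate(starts):
        # Assumption: the last block is truncated so that the weights sum to n.
        length = block_len if j < k - 1 else n - (k - 1) * block_len
        # Convert the 1-based block start to a 0-based slice.
        weights[start - 1 : start - 1 + length] += 1
    return weights

rng = np.random.default_rng(1)
n, block_len = 103, 10
x = rng.standard_normal(n)      # placeholder data; Example A2 assumes beta-mixing X_i
w = block_weights(n, block_len, rng)
F_star_at_0 = np.sum(w[x <= 0.0]) / n   # bootstrap distribution function at x = 0
```

Because whole blocks of consecutive observations are reused, the resampled distribution function retains the short-range dependence within each block.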

Further examples for Condition (a) in Corollary A4 for dependent observations can, for example, be found in Bühlmann (1994); Naik-Nimbalkar and Rajarshi (1994); Peligrad (1998).

Theorem A5.

In the setting of Example A2 assume that assertions A1.–A3. of Section 2.2 hold, and let

S

be the set of all sequences

(G_{n}) \subseteq D (H)

with

G_{n} - F \in D_{ϕ}

,

n \in N

, and

∥ G_{n} - F ∥_{ϕ} \to 0

. Then, the second part of assertion (a) (i.e., Equation (A14)) and assertion (e) in Corollary A4 hold.

Proof.

Proof of second part of (a): It is enough to show that under assumptions A1.–A3. the Assumptions (A1)–(A4) of Theorem 1 in Bühlmann (1995) hold when the class of functions is

F_{ϕ} : = F_{ϕ}^{-} \cup F_{ϕ}^{+}

. Here,

F_{ϕ}^{-} : = \{f_{x} : x \leq 0\}

and

F_{ϕ}^{+} : = \{f_{x} : x > 0\}

with

f_{x} (\cdot) : = ϕ (x) 𝟙_{(- \infty, x]} (\cdot)

for

x \leq 0

and

f_{x} (\cdot) : = - ϕ (x) 𝟙_{(x, \infty)} (\cdot)

for

x > 0

. Due to A2. and A3. we only have to verify Assumptions (A3) and (A4) of Theorem 1 in Bühlmann (1995). That is, we show that the following two assertions hold.

(1): There exist constants $b, c > 0$ such that $N_{[]} (ε, F_{ϕ}, ∥ \cdot ∥_{p}) \leq c ε^{- b}$ for all $ε > 0$ .
(2): $\int {\bar{f}}^{p} d F < \infty$ for the envelope function $\bar{f} (z) : = \sup_{x \in R} | f_{x} (z) |$ .

Here, the bracketing number

N_{[]} (ε, F_{ϕ}, ∥ \cdot ∥_{p})

is the minimal number of

ε

-brackets with respect to

{∥ \cdot ∥}_{p}

(

L^{p}

-norm with respect to

d F

) to cover

F_{ϕ}

, where an

ε

-bracket with respect to

{∥ \cdot ∥}_{p}

is the set,

[ℓ, u]

, of all functions f with

ℓ \leq f \leq u

for some Borel measurable functions

ℓ, u : R \to R_{+}

with

ℓ \leq u

pointwise and

{∥ u - ℓ ∥}_{p} \leq ε

.

(1) We only show that (1) with

F_{ϕ}

replaced by

F_{ϕ}^{-}

holds true. Analogously, one can show that the same holds true for

F_{ϕ}^{+}

(and therefore for

F_{ϕ}

). On the one hand, since

I_{p}^{-} : = \int_{(- \infty, 0]} ϕ^{p} d F < \infty

by Assumption (a), we can find for every

ε > 0

a finite partition

- \infty = y_{0}^{ε} < y_{1}^{ε} < \dots < y_{k_{ε}}^{ε} = 0

such that

max_{i = 1, \dots, k_{ε}} \int_{(y_{i - 1}^{ε}, y_{i}^{ε}]} ϕ^{p} d F \leq {(ε / 2)}^{p}

(A15)

and

k_{ε} \leq ⌈ I_{p}^{-} / {(ε / 2)}^{p} ⌉

. On the other hand, using integration by parts we obtain

\int_{(- \infty, 0]} F d (- ϕ^{p}) = ϕ {(0)}^{p} F (0) - \int_{(- \infty, 0]} (- ϕ^{p}) d F = ϕ {(0)}^{p} F (0) + I_{p}^{-},

so that we can find a finite partition

- \infty = z_{0}^{ε} < z_{1}^{ε} < \dots < z_{m_{ε}}^{ε} = 0

such that

max_{i = 1, \dots, m_{ε}} \int_{(z_{i - 1}^{ε}, z_{i}^{ε}]} F d (- ϕ^{p}) \leq {(ε / 2)}^{p}

(A16)

and

m_{ε} \leq ⌈ (ϕ {(0)}^{p} F (0) + I_{p}^{-}) / {(ε / 2)}^{p} ⌉

.

Now, let

- \infty = x_{0}^{ε} < x_{1}^{ε} < \dots < x_{k_{ε} + m_{ε}}^{ε} = 0

be the partition consisting of all points

y_{i}^{ε}

and

z_{i}^{ε}

, and set

\begin{matrix} ℓ_{i}^{ε} (\cdot) & : = ϕ (x_{i}^{ε}) 𝟙_{(- \infty, x_{i - 1}^{ε}]} (\cdot), \\ u_{i}^{ε} (\cdot) & : = ϕ (x_{i - 1}^{ε}) 𝟙_{(- \infty, x_{i - 1}^{ε}]} (\cdot) + ϕ (\cdot) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (\cdot) . \end{matrix}

(A17)

Then,

ℓ_{i}^{ε} \leq u_{i}^{ε}

. Moreover,

\begin{matrix} ∥ u_{i}^{ε} - ℓ_{i}^{ε} ∥_{p} & = & {(\int {(u_{i}^{ε} - ℓ_{i}^{ε})}^{p} d F)}^{1 / p} \\ & \leq & {(\int_{(- \infty, x_{i - 1}^{ε}]} {(ϕ (x_{i - 1}^{ε}) - ϕ (x_{i}^{ε}))}^{p} d F)}^{1 / p} + {(\int_{(x_{i - 1}^{ε}, x_{i}^{ε}]} ϕ^{p} d F)}^{1 / p} \\ & \leq & {(\int_{(- \infty, x_{i - 1}^{ε}]} (ϕ {(x_{i - 1}^{ε})}^{p} - ϕ {(x_{i}^{ε})}^{p}) d F)}^{1 / p} + ε / 2 \\ & \leq & {((ϕ {(x_{i - 1}^{ε})}^{p} - ϕ {(x_{i}^{ε})}^{p}) F (x_{i - 1}^{ε}))}^{1 / p} + ε / 2 \end{matrix}

where we used Minkowski’s inequality and Equation (A15), and that

ϕ

is non-increasing on

(- \infty, 0]

and

x_{i - 1}^{ε} \leq x_{i}^{ε}

. Since F is at least

F (x_{i - 1}^{ε})

on

(x_{i - 1}^{ε}, x_{i}^{ε}]

, we have

(ϕ {(x_{i - 1}^{ε})}^{p} - ϕ {(x_{i}^{ε})}^{p}) F (x_{i - 1}^{ε}) \leq \int_{(x_{i - 1}^{ε}, x_{i}^{ε}]} F d (- ϕ^{p}) \leq {(ε / 2)}^{p}

due to Equation (A16). Thus,

∥ u_{i}^{ε} - ℓ_{i}^{ε} ∥_{p} \leq ε

, so that

[ℓ_{i}^{ε}, u_{i}^{ε}]

provides an

ε

-bracket with respect to

{∥ \cdot ∥}_{p}

. It is moreover obvious that the

ε

-brackets

[ℓ_{i}^{ε}, u_{i}^{ε}]

,

i = 1, \dots, k_{ε} + m_{ε}

, cover

F_{ϕ}^{-}

. Thus,

N_{[]} (ε, F_{ϕ}^{-}, ∥ \cdot ∥_{p}) \leq c ε^{- p}

for a suitable constant

c > 0

and all

ε > 0

.
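The constant c in the last bound can be made explicit. The following is a sketch of the counting step under the additional assumption that 0 < ε ≤ 1, so that 1 ≤ ε^{-p}. This restriction is ours and is not stated in the text. It uses only the bounds on k_ε and m_ε given above and the elementary inequality ⌈a⌉ ≤ a + 1.

```latex
\begin{aligned}
N_{[\,]}(\varepsilon, F_{\phi}^{-}, \|\cdot\|_{p})
  &\le k_{\varepsilon} + m_{\varepsilon} \\
  &\le \Big\lceil \frac{I_{p}^{-}}{(\varepsilon/2)^{p}} \Big\rceil
     + \Big\lceil \frac{\phi(0)^{p} F(0) + I_{p}^{-}}{(\varepsilon/2)^{p}} \Big\rceil \\
  &\le 2 + 2^{p} \big( 2 I_{p}^{-} + \phi(0)^{p} F(0) \big)\, \varepsilon^{-p} \\
  &\le \Big( 2 + 2^{p} \big( 2 I_{p}^{-} + \phi(0)^{p} F(0) \big) \Big)\, \varepsilon^{-p}.
\end{aligned}
```

Under this assumption, assertion (1) holds for the class F_ϕ^- with b = p and c = 2 + 2^p (2 I_p^- + ϕ(0)^p F(0)).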

(2) The envelope function

\bar{f}

is given by

\bar{f} (y) = ϕ (y)

for

y \leq 0

and by

\bar{f} (y) = ϕ (y -) = ϕ (y)

(recall that

ϕ

is continuous) for

y > 0

. Then, under Assumption (a) the integrability condition (2) holds.

Proof of (e): We have to show that

∥ {\hat{C}}_{n} - F ∥_{ϕ} = \sup_{x \in R} | {\hat{C}}_{n} (x) - F (x) | ϕ (x) \to 0

P

-a.s. We only show that

\sup_{x \in (- \infty, 0]} | {\hat{C}}_{n} (x) - F (x) | ϕ (x) ⟶ 0 \quad P\text{-a.s.},

(A18)

because the analogue for the positive real line can be shown in the same way. Let

ℓ_{i}^{ε}

and

u_{i}^{ε}

be as defined in Equation (A17). By assumption A1. we have

\int ϕ d F < \infty

, so that, similarly to the above, we can find a finite partition

- \infty = x_{0}^{ε} < x_{1}^{ε} < \dots < x_{k_{ε} + m_{ε}}^{ε} = 0

such that

[ℓ_{i}^{ε}, u_{i}^{ε}]

,

i = 1, \dots, k_{ε} + m_{ε}

, are

ε

-brackets with respect to

{∥ \cdot ∥}_{1}

(

L^{1}

-norm with respect to F) covering the class

F_{ϕ} : = \{f_{x} : x \in R\}

introduced above. We proceed in two steps.

Step 1. First we show that

sup_{x \leq 0} | {\hat{C}}_{n} (x) - F (x) | ϕ (x) \leq max_{i = 1, \dots, k_{ε} + m_{ε}} max \{\int u_{i}^{ε} d ({\hat{C}}_{n} - F); \int ℓ_{i}^{ε} d (F - {\hat{C}}_{n})\} + ε

(A19)

holds true for every

ε > 0

. Since

({\hat{C}}_{n} (x) - F (x)) ϕ (x) = \int f_{x} d {\hat{C}}_{n} - \int f_{x} d F

, for Equation (A19) it suffices to show

\begin{matrix} sup_{x \leq 0} | \int f_{x} d {\hat{C}}_{n} - \int f_{x} d F | \\ \leq max_{i = 1, \dots, k_{ε} + m_{ε}} max \{\int u_{i}^{ε} d ({\hat{C}}_{n} - F); \int ℓ_{i}^{ε} d (F - {\hat{C}}_{n})\} + ε . \end{matrix}

(A20)

To prove Equation (A20), we note that for every

x \in (- \infty, 0]

there is some

i_{x} \in \{1, \dots, k_{ε} + m_{ε}\}

such that

f_{x} \in [ℓ_{i_{x}}^{ε}, u_{i_{x}}^{ε}]

(see part (1) of the proof above). Therefore, since

[ℓ_{i_{x}}^{ε}, u_{i_{x}}^{ε}]

is an

ε

-bracket with respect to

{∥ \cdot ∥}_{1}

,

\begin{matrix} \int f_{x} d {\hat{C}}_{n} - \int f_{x} d F & \leq & \int u_{i_{x}}^{ε} d {\hat{C}}_{n} - \int f_{x} d F \\ & = & \int u_{i_{x}}^{ε} d ({\hat{C}}_{n} - F) + \int (u_{i_{x}}^{ε} - f_{x}) d F \\ & \leq & \int u_{i_{x}}^{ε} d ({\hat{C}}_{n} - F) + \int (u_{i_{x}}^{ε} - ℓ_{i_{x}}^{ε}) d F \\ & \leq & max_{i = 1, \dots, k_{ε} + m_{ε}} \int u_{i}^{ε} d ({\hat{C}}_{n} - F) + ε . \end{matrix}

Analogously, we obtain

\begin{matrix} \int f_{x} d {\hat{C}}_{n} - \int f_{x} d F & \geq & - (max_{i = 1, \dots, k_{ε} + m_{ε}} \int ℓ_{i}^{ε} d (F - {\hat{C}}_{n}) + ε) . \end{matrix}

That is, Equation (A19) holds true.

Step 2. Because of Equation (A19), for Equation (A18) to be true, it suffices to show that

\int ℓ_{i}^{ε} d (F - {\hat{C}}_{n}) ⟶ 0 \quad \text{and} \quad \int u_{i}^{ε} d ({\hat{C}}_{n} - F) ⟶ 0 \quad P\text{-a.s.}

(A21)

for every

i = 1, \dots, k_{ε} + m_{ε}

. We only show the second convergence in Equation (A21); the first can be shown even more easily. We have

\begin{matrix} \int u_{i}^{ε} d ({\hat{C}}_{n} - F) & = & \frac{1}{n} \sum_{j = 1}^{n} (w_{n j} ϕ (x_{i - 1}^{ε}) 𝟙_{(- \infty, x_{i - 1}^{ε}]} (X_{j}) - E [ϕ (x_{i - 1}^{ε}) 𝟙_{(- \infty, x_{i - 1}^{ε}]} (X_{1})]) \\ & & + \frac{1}{n} \sum_{j = 1}^{n} (w_{n j} ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j}) - E [ϕ (X_{1}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{1})]) \\ & = : & S_{1} (n) + S_{2} (n) . \end{matrix}

The first summand on the right-hand side of

\begin{matrix} S_{2} (n) & = & \frac{1}{n} \sum_{j = 1}^{n} (ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j}) - E [ϕ (X_{1}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{1})]) \\ & & + \frac{1}{n} \sum_{j = 1}^{n} (w_{n j} - 1) ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j}) \end{matrix}

converges

P

-a.s. to 0 by Theorem 1 (ii) (and Application 5, p. 924) in Rio (1995) and our assumption A1. The second summand converges

P

-a.s. to 0 too, which can be seen as follows. From Equation (9), we obtain for n sufficiently large

| w_{n j} - 1 | \leq \begin{cases} 2, & j = 1, \dots, ℓ_{n}, \\ \frac{ℓ_{n} - 1}{n - ℓ_{n} + 1}, & j = ℓ_{n} + 1, \dots, n - ℓ_{n}, \\ 2, & j = n - ℓ_{n} + 1, \dots, n, \end{cases}

so that for n sufficiently large

\begin{matrix} | \frac{1}{n} \sum_{j = 1}^{n} (w_{n j} - 1) ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j}) | \\ \leq \frac{ℓ_{n} - 1}{n - ℓ_{n} + 1} \frac{1}{n} \sum_{j = ℓ_{n} + 1}^{n - ℓ_{n}} ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j}) \\ + 2 \frac{2 ℓ_{n}}{n} \frac{1}{2 ℓ_{n}} (\sum_{j = 1}^{ℓ_{n}} ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j}) + \sum_{j = n - ℓ_{n} + 1}^{n} ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j})) \\ = : S_{2, 1} (n) + S_{2, 2} (n) . \end{matrix}

We have seen above that

\frac{1}{n} \sum_{j = 1}^{n} ϕ (X_{j}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{j})

converges

P

-a.s. to the constant

E [ϕ (X_{1}) 𝟙_{(x_{i - 1}^{ε}, x_{i}^{ε}]} (X_{1})]

. Since

ℓ_{n}

converges to ∞ at a slower rate than n (by assumption A3.), it follows that

S_{2, 1} (n)

converges

P

-a.s. to 0. Using the same arguments we obtain that

S_{2, 2} (n)

converges

P

-a.s. to 0. Hence,

S_{2} (n)

converges

P

-a.s. to 0. Analogously, one can show that

S_{1} (n)

converges

P

-a.s. to 0. ☐
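The bound on the deviation of the expected block weights from 1 that is used in Step 2 can be checked numerically. Equations (8) and (9) are not reproduced in this appendix, so the sketch below assumes the moving-block construction in which each of the k_n blocks has length ℓ_n and a start point drawn uniformly from {1, ..., n - ℓ_n + 1}, with the last block truncated so that the weights sum to n. The function name is hypothetical, and n is chosen as a multiple of ℓ_n to keep the check simple.

```python
import math
import numpy as np

def expected_block_weights(n, block_len):
    """Exact expected weights E'[W_ni], i = 1..n, under the assumed construction."""
    k = math.ceil(n / block_len)
    n_starts = n - block_len + 1          # start points are uniform on {1, ..., n_starts}
    i = np.arange(1, n + 1)

    def cover_prob(length):
        # Number of start points s in {1, ..., n_starts} with s <= i <= s + length - 1,
        # divided by the number of admissible start points.
        lo = np.maximum(1, i - length + 1)
        hi = np.minimum(i, n_starts)
        return np.maximum(hi - lo + 1, 0) / n_starts

    last_len = n - (k - 1) * block_len    # truncated last block (assumption)
    return (k - 1) * cover_prob(block_len) + cover_prob(last_len)

n, block_len = 1000, 20
w = expected_block_weights(n, block_len)
interior = w[block_len : n - block_len]   # indices l_n + 1, ..., n - l_n
max_interior_dev = np.max(np.abs(interior - 1.0))
bound = (block_len - 1) / (n - block_len + 1)
max_edge_dev = np.max(np.abs(w - 1.0))
```

In this configuration the interior deviation equals (ℓ_n - 1)/(n - ℓ_n + 1), which tends to 0 when ℓ_n grows more slowly than n, while the deviation at the 2ℓ_n boundary indices stays bounded.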

References

Acerbi, Carlo. 2002. Spectral measures of risk: A coherent representation of subjective risk aversion. Journal of Banking & Finance 26: 1505–18.
Acerbi, Carlo, and Balazs Szekely. 2014. Backtesting Expected Shortfall. New York: Morgan Stanley Capital International.
Acerbi, Carlo, and Dirk Tasche. 2002a. On the coherence of expected shortfall. Journal of Banking & Finance 26: 1487–503.
Acerbi, Carlo, and Dirk Tasche. 2002b. Expected Shortfall: A natural coherent alternative to Value at Risk. Economic Notes 31: 379–88.
Arcones, Miguel Angel, and Bin Yu. 1994. Central limit theorems for empirical and U-processes of stationary mixing sequences. Journal of Theoretical Probability 7: 47–71.
Bauer, Heinz. 2001. Measure and Integration Theory. Berlin: De Gruyter.
Belloni, Alexandre, Victor Chernozhukov, Ivan Fernández-Val, and Christian B. Hansen. 2017. Program evaluation and causal inference with high-dimensional data. Econometrica 85: 233–98.
Beutner, Eric, Wei Biao Wu, and Henryk Zähle. 2012. Asymptotics for statistical functionals of long-memory sequences. Stochastic Processes and their Applications 122: 910–29.
Beutner, Eric, and Henryk Zähle. 2010. A modified functional delta method and its application to the estimation of risk functionals. Journal of Multivariate Analysis 101: 2452–63.
Beutner, Eric, and Henryk Zähle. 2012. Deriving the asymptotic distribution of U- and V-statistics of dependent data using weighted empirical processes. Bernoulli 18: 803–22.
Beutner, Eric, and Henryk Zähle. 2016. Functional delta-method for the bootstrap of quasi-Hadamard differentiable functionals. Electronic Journal of Statistics 10: 1181–222.
Billingsley, Patrick. 1999. Convergence of Probability Measures. New York: Wiley.
Bølviken, Eric, and Montserrat Guillen. 2017. Risk aggregation in Solvency II through recursive log-normals. Insurance: Mathematics and Economics 73: 20–26.
Bühlmann, Peter. 1994. Blockwise bootstrapped empirical process for stationary sequences. Annals of Statistics 22: 995–1012.
Bühlmann, Peter. 1995. The blockwise bootstrap for general empirical processes of stationary sequences. Stochastic Processes and their Applications 58: 247–65.
Davison, Anthony C., and David Victor Hinkley. 1997. Bootstrap Methods and Their Application. Cambridge: Cambridge University Press.
Dudley, Richard Mansfield. 1966. Weak convergence of probabilities on nonseparable metric spaces and empirical measures on Euclidean spaces. Illinois Journal of Mathematics 10: 109–26.
Dudley, Richard Mansfield. 1967. Measures on non-separable metric spaces. Illinois Journal of Mathematics 11: 449–53.
Efron, Bradley. 1979. Bootstrap methods: Another look at the jackknife. Annals of Statistics 7: 1–26.
Efron, Bradley, and Robert Tibshirani. 1994. An Introduction to the Bootstrap. New York: Chapman & Hall.
Emmer, Susanne, Marie Kratz, and Dirk Tasche. 2015. What is the best risk measure in practice? A comparison of standard measures. Journal of Risk 18: 31–60.
Gilat, David, and Roelof Helmers. 1997. On strong laws for generalized L-statistics with dependent data. Commentationes Mathematicae Universitatis Carolinae 38: 187–92.
Gribkova, Nadezhda. 2002. Bootstrap approximation of distributions of the L-statistics. Journal of Mathematical Sciences 109: 2088–102.
Gribkova, Nadezhda. 2016. Department of Theory of Probability and Mathematical Statistics, Saint Petersburg State University, Saint Petersburg, Russia. Personal communication.
Helmers, Roelof, Paul Janssen, and Robert Serfling. 1990. Berry-Esséen and bootstrap results for generalized L-statistics. Scandinavian Journal of Statistics 17: 65–77.
Jones, Bruce L., and Ričardas Zitikis. 2003. Empirical estimation of risk measures and related quantities. North American Actuarial Journal 7: 44–54.
Krätschmer, Volker, Alexander Schied, and Henryk Zähle. 2013. Quasi-Hadamard differentiability of general risk functionals and its application. Statistics and Risk Modeling 32: 25–47.
Krätschmer, Volker, and Henryk Zähle. 2017. Statistical inference for expectile-based risk measures. Scandinavian Journal of Statistics 44: 425–54.
Lahiri, Soumendra Nath. 2003. Resampling Methods for Dependent Data. New York: Springer.
Lauer, Alexandra, and Henryk Zähle. 2015. Nonparametric estimation of risk measures of collective risks. Statistics and Risk Modeling 32: 89–102.
Lauer, Alexandra, and Henryk Zähle. 2017. Bootstrap consistency and bias correction in the nonparametric estimation of risk measures of collective risks. Insurance: Mathematics and Economics 74: 99–108.
Mehra, K. L., and Sudhakara Rao. 1975. On functions of order statistics for mixing processes. Annals of Statistics 3: 874–83.
Naik-Nimbalkar, Uttara V., and M. B. Rajarshi. 1994. Validity of blockwise bootstrap for empirical processes with stationary observations. Annals of Statistics 22: 980–94.
Peligrad, Magda. 1998. On the blockwise bootstrap for empirical processes for stationary sequences. Annals of Probability 26: 877–901.
Pitts, Susan M. 1994. Nonparametric estimation of compound distributions with applications in insurance. Annals of the Institute of Mathematical Statistics 46: 537–55.
Pollard, David. 1984. Convergence of Stochastic Processes. New York: Springer.
Rio, Emmanuel. 1995. A maximal inequality and dependent Marcinkiewicz-Zygmund strong laws. Annals of Probability 23: 918–37.
Rubin, Donald. 1981. The Bayesian bootstrap. Annals of Statistics 9: 130–34.
Shao, Jun, and Dongsheng Tu. 1995. The Jackknife and Bootstrap. New York: Springer.
Shorack, Galen R. 1972. Linear functions of order statistics. Annals of Mathematical Statistics 43: 412–27.
Shorack, Galen R., and Jon A. Wellner. 1986. Empirical Processes with Applications to Statistics. New York: Wiley.
Stigler, Stephen M. 1974. Linear functions of order statistics with smooth weight functions. Annals of Statistics 2: 676–93.
Sun, Shuxia, and Fuxia Cheng. 2018. Bootstrapping the Expected Shortfall. Theoretical Economics Letters 8: 685–98.
Tsukahara, Hideatsu. 2013. Estimation of distortion risk measures. Journal of Financial Econometrics 12: 213–35.
Van der Vaart, Aad W., and Jon A. Wellner. 1996. Weak Convergence and Empirical Processes. New York: Springer.
Van Zwet, Willem R. 1980. A strong law for linear functionals of order statistics. Annals of Probability 8: 986–90.
Varron, Davit. 2015. Laboratoire de Mathématiques de Besançon, University of Franche-Comté, Besançon, France. Personal communication.
Zähle, Henryk. 2014. Marcinkiewicz–Zygmund and ordinary strong laws for empirical distribution functions and plug-in estimators. Statistics 48: 951–64.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Beutner, E.; Zähle, H. Bootstrapping Average Value at Risk of Single and Collective Risks. Risks 2018, 6, 96. https://doi.org/10.3390/risks6030096

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
