1. Introduction
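One of the most popular risk measures in practice is the so-called Average Value at Risk, which is also referred to as Expected Shortfall (see Acerbi and Szekely (2014); Acerbi and Tasche (2002a, 2002b); Emmer et al. (2015) and references therein). For a fixed level $\alpha\in(0,1)$, the corresponding Average Value at Risk is the map $\mathrm{AV@R}_\alpha: L^1\to\mathbb{R}$ defined by $\mathrm{AV@R}_\alpha(X) := \mathcal{R}_\alpha(F_X)$, where $F_X$ refers to the distribution function of $X$, $L^1$ is the usual $L^1$-space associated with some atomless probability space, and
$$\mathcal{R}_\alpha(F) := \int_0^1 F^{\leftarrow}(s)\,dg_\alpha(s) = \frac{1}{1-\alpha}\int_\alpha^1 F^{\leftarrow}(s)\,ds \tag{1}$$
for any $F\in\mathbf{F}_1$ with $\mathbf{F}_1$ the set of the distribution functions $F_X$ of all $X\in L^1$. Here, $g_\alpha(t) := \frac{1}{1-\alpha}\max\{t-\alpha,\,0\}$ and $F^{\leftarrow}(s) := \inf\{x\in\mathbb{R}: F(x)\ge s\}$ denotes the left-continuous inverse of $F$. The statistical functional $\mathcal{R}_\alpha:\mathbf{F}_1\to\mathbb{R}$ is sometimes referred to as risk functional associated with $\mathrm{AV@R}_\alpha$. Note that $\mathrm{AV@R}_\alpha(X) = \mathbb{E}[X \mid X\ge F_X^{\leftarrow}(\alpha)]$ when $F_X$ is continuous at $F_X^{\leftarrow}(\alpha)$.
In this article, we mainly focus on bootstrap methods for the Average Value at Risk. Before doing so, we briefly review nonparametric estimation techniques and asymptotic results for the Average Value at Risk. Given identically distributed observations $X_1,\dots,X_n$ ($n\in\mathbb{N}$) on some probability space $(\Omega,\mathcal{F},\mathbb{P})$ with unknown marginal distribution function $F\in\mathbf{F}_1$, a natural estimator for $\mathcal{R}_\alpha(F)$ is the empirical plug-in estimator
$$\mathcal{R}_\alpha(\hat F_n) = \int_0^1 \hat F_n^{\leftarrow}(s)\,dg_\alpha(s) = \sum_{i=1}^n \Bigl(g_\alpha\bigl(\tfrac{i}{n}\bigr) - g_\alpha\bigl(\tfrac{i-1}{n}\bigr)\Bigr) X_{i:n}, \tag{2}$$
where $\hat F_n := \frac{1}{n}\sum_{i=1}^n \mathbb{1}_{[X_i,\infty)}$ is the empirical distribution function of $X_1,\dots,X_n$ and $X_{1:n}\le\dots\le X_{n:n}$ refer to the order statistics of $X_1,\dots,X_n$. The second representation in Equation (2) shows that $\mathcal{R}_\alpha(\hat F_n)$ is a specific L-statistic, which was already mentioned in Acerbi (2002); Acerbi and Tasche (2002a); Jones and Zitikis (2003).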
In particular, if the underlying sequence of observations is strictly stationary and ergodic, classical results of van Zwet (1980) and Gilat and Helmers (1997) show that the plug-in estimator converges $\mathbb{P}$-almost surely to the Average Value at Risk functional evaluated at $F$ as $n\to\infty$, i.e., that strong consistency holds. If the observations are i.i.d. and $F$ has a finite second moment and takes the value $\alpha$ only once, then a result of Stigler (1974, Theorems 1–2) yields the asymptotic distribution of the estimation error:
where
with
,
, and ⇝ refers to convergence in distribution (see also
Shorack 1972;
Shorack and Wellner 1986). In fact, for independent
the second summand in the definition of
vanishes. Results of
Beutner and Zähle (
2010) show that Equation (
3) still holds if
is strictly stationary and
-mixing with mixing coefficients
and
for some
.
Tsukahara (
2013) obtained the same result. A similar result can also be derived from an earlier work by
Mehra and Rao (
1975), but under a faster decay of the mixing coefficients and under an additional assumption on the dependence structure. We emphasize that the method of proof proposed by Beutner and Zähle is rather flexible, because it easily extends to other weak and strong dependence concepts and other risk measures (see
Beutner et al. 2012;
Beutner and Zähle 2010,
2016;
Krätschmer et al. 2013;
Krätschmer and Zähle 2017).
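As a minimal sketch of the empirical plug-in estimator reviewed above, the following Python snippet evaluates it as an L-statistic. It assumes the standard distortion representation of the Average Value at Risk, i.e., $g_\alpha(t) = \max\{t-\alpha, 0\}/(1-\alpha)$ with weights $g_\alpha(i/n) - g_\alpha((i-1)/n)$ on the order statistics; the function names `g_alpha` and `empirical_avar` are illustrative and not taken from the article.

```python
import numpy as np


def g_alpha(t, alpha):
    """Assumed distortion function g_alpha(t) = max(t - alpha, 0) / (1 - alpha)."""
    return np.maximum(t - alpha, 0.0) / (1.0 - alpha)


def empirical_avar(x, alpha):
    """Plug-in estimate of the Average Value at Risk at level alpha.

    Written as an L-statistic: sum_i (g(i/n) - g((i-1)/n)) * X_{i:n},
    where X_{1:n} <= ... <= X_{n:n} are the order statistics.
    """
    x_sorted = np.sort(np.asarray(x, dtype=float))  # order statistics
    n = x_sorted.size
    grid = np.arange(n + 1) / n                     # 0, 1/n, ..., 1
    weights = np.diff(g_alpha(grid, alpha))         # L-statistic weights
    return float(np.dot(weights, x_sorted))
```

For the sample `[4, 1, 3, 2]` and `alpha = 0.5`, the weights are `(0, 0, 0.5, 0.5)`, so the estimate is the average of the two largest observations, 3.5. For `alpha = 0` the estimator reduces to the sample mean.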
Even in the i.i.d. case the asymptotic variance
depends on
F in a fairly complex way. For the approximation of the distribution of
, bootstrap methods should thus be superior to the method of estimating
. However, to the best of our knowledge, theoretical investigations of the bootstrap for the Average Value at Risk seem to be rare. According to
Gribkova (
2016), a result of
Gribkova (
2002) yields bootstrap consistency for Efron’s bootstrap when
are i.i.d., while Theorem 3 of
Helmers et al. (
1990) seems not to cover the Average Value at Risk, because there the function
J (which plays the role of
) is assumed to be Lipschitz continuous. In these articles, bootstrap consistency is typically established in two steps: first, the bootstrap variance is shown to be consistent; then, this result is used to show that upper bounds for the difference between the sampling distribution and the bootstrap distribution converge to zero. Employing different techniques,
Beutner and Zähle (
2016) established bootstrap consistency in probability for the multiplier bootstrap when
are i.i.d. as well as bootstrap consistency in probability for the circular bootstrap when
are strictly stationary and
-mixing with mixing coefficients
and
for some
and
. Recently,
Sun and Cheng (
2018) established bootstrap consistency in probability for the moving blocks bootstrap when
are strictly stationary and
-mixing with mixing coefficients
and
for some
,
and
. Strictly speaking, Sun and Cheng did not consider the Average Value at Risk (Expected Shortfall) but the Tail Conditional Expectation in the sense of
Acerbi and Tasche (
2002a,
2002b).
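To illustrate the resampling schemes mentioned above, here is a hedged Python sketch of Efron's bootstrap and a circular block bootstrap for the plug-in estimator. It is not the multiplier bootstrap and not the authors' implementation; the helper names, the default number of replicates, and the truncation of the last block to obtain a resample of length $n$ are assumptions made for illustration. The estimator again assumes $g_\alpha(t) = \max\{t-\alpha, 0\}/(1-\alpha)$.

```python
import numpy as np


def empirical_avar(x, alpha):
    """Plug-in AVaR estimate in L-statistic form (assumed standard definition)."""
    x_sorted = np.sort(np.asarray(x, dtype=float))
    n = x_sorted.size
    grid = np.arange(n + 1) / n
    g = np.maximum(grid - alpha, 0.0) / (1.0 - alpha)
    return float(np.dot(np.diff(g), x_sorted))


def circular_block_indices(n, block_length, rng):
    """Indices of one circular block bootstrap resample of length n.

    Blocks start at uniformly drawn positions and wrap around the end of
    the sample; the concatenated blocks are truncated to length n.
    """
    n_blocks = -(-n // block_length)  # ceil(n / block_length)
    starts = rng.integers(0, n, size=n_blocks)
    idx = (starts[:, None] + np.arange(block_length)[None, :]) % n
    return idx.ravel()[:n]


def bootstrap_errors(x, alpha, n_boot=500, block_length=1, seed=0):
    """Replicates of sqrt(n) * (AVaR(resample) - AVaR(sample)).

    block_length=1 gives Efron's i.i.d. bootstrap; larger values give a
    circular block bootstrap intended for dependent observations.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    estimate = empirical_avar(x, alpha)
    errors = np.empty(n_boot)
    for b in range(n_boot):
        idx = circular_block_indices(n, block_length, rng)
        errors[b] = np.sqrt(n) * (empirical_avar(x[idx], alpha) - estimate)
    return errors
```

Empirical quantiles of the returned replicates (for example via `np.quantile`) can then serve as a bootstrap approximation to the quantiles of the estimation error, provided that bootstrap consistency holds in the setting at hand.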
The contribution of the article at hand is twofold. First, we extend the results of
Beutner and Zähle (
2016) on the Average Value at Risk from bootstrap consistency
in probability to bootstrap consistency
almost surely. Second, we establish bootstrap consistency for the Average Value at Risk of
collective risks, i.e., for
and more general expressions.
The rest of the article is organized as follows. In
Section 2, we present and illustrate our main results which are proved in
Section 3.
Section 3 is followed by the conclusions. The proofs of
Section 3 rely on a new functional delta-method for the almost sure bootstrap which seems to be interesting in its own right and which is presented in
Appendix B. Roughly speaking, the (functional) delta method studies properties of particular estimators for quantities of the form $H(\theta)$. Here, $H$ is a known functional, such as the Average Value at Risk functional, and $\theta$ is a possibly infinite-dimensional parameter, such as an unknown distribution function. The particular estimators covered by the (functional) delta method are of the form $H(\hat\theta_n)$, where $\hat\theta_n$ is an estimator for $\theta$. In general and in the particular application considered here, the appeal of the (functional) delta method lies in the fact that, once “differentiability” of $H$ (here, the Average Value at Risk functional) is established, the asymptotic error distribution of $H(\hat\theta_n)$ can immediately be derived from the asymptotic error distribution of $\hat\theta_n$ (here, the empirical distribution function). This also applies to the (functional) delta method for the bootstrap, where bootstrap consistency of the bootstrapped version of $H(\hat\theta_n)$ will follow from the respective property of the bootstrapped version of $\hat\theta_n$ (here, the bootstrapped empirical distribution function). Thus, if in financial or actuarial applications the data show dependencies for which the asymptotic error distribution and/or bootstrap consistency of plug-in estimators for the Average Value at Risk have not been established yet, it would be enough to check whether, for these dependencies, the asymptotic error distribution and/or bootstrap consistency of the empirical distribution function is known; thanks to the (functional) delta method, the Average Value at Risk functional would inherit these properties. In Appendix A.1, we give results on convergence in distribution for the open-ball $\sigma$-algebra which are needed for the main results, and in Appendix A.2 we prove a delta-method for uniformly quasi-Hadamard differentiable maps that is the basis for the method of Appendix B. Readers interested in the methods used to prove the main results might wish to first work through Appendix A and Appendix B before reading Section 2 and Section 3.
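The delta-method heuristic described in the introduction can be summarized schematically as follows. This is only a sketch: writing $\theta$ for the parameter and $\hat\theta_n$ for its estimator, it assumes that $H$ admits a suitable derivative $\dot H_\theta$ at $\theta$ and that $a_n(\hat\theta_n-\theta)\leadsto B$ for some normalizing sequence $(a_n)$ and limit $B$. The symbols $\dot H_\theta$, $a_n$, $B$, and the bootstrap version $\hat\theta_n^{*}$ are notation introduced here for illustration, and the precise modes of convergence are those specified in the appendices.

```latex
\begin{align*}
a_n\bigl(H(\hat\theta_n) - H(\theta)\bigr)
  &\approx \dot H_\theta\bigl(a_n(\hat\theta_n - \theta)\bigr)
  \;\leadsto\; \dot H_\theta(B), \\
a_n\bigl(H(\hat\theta_n^{*}) - H(\hat\theta_n)\bigr)
  &\approx \dot H_\theta\bigl(a_n(\hat\theta_n^{*} - \hat\theta_n)\bigr).
\end{align*}
```

The first line transfers the limit of the estimation error of $\hat\theta_n$ to that of $H(\hat\theta_n)$; the second line indicates why bootstrap consistency for $\hat\theta_n^{*}$ carries over to $H(\hat\theta_n^{*})$ once the approximation holds along sequences converging to $\theta$, which is where a uniform notion of differentiability enters.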
3. Proofs of Main Results
Here, we prove the results of
Section 2. In fact, Theorems 1–3 are special cases of Corollaries 1 and 4. The latter corollaries are proved with the help of the technique introduced in
Appendix B.2, which in turn relies on the concept of uniform quasi-Hadamard differentiability (see Definition A1 in
Appendix B.1).
Keep the notation introduced in
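Section 1. Let $\mathbf{D}$ be the space of all càdlàg functions $v$ on $\mathbb{R}$ with finite sup-norm $\|v\|_\infty := \sup_{t\in\mathbb{R}}|v(t)|$, and $\mathcal{D}$ be the $\sigma$-algebra on $\mathbf{D}$ generated by the one-dimensional coordinate projections $\pi_t$, $t\in\mathbb{R}$, given by $\pi_t(v) := v(t)$. Let $\phi:\mathbb{R}\to[1,\infty)$ be a weight function, i.e., a continuous function being non-increasing on $(-\infty,0]$ and non-decreasing on $[0,\infty)$. Let $\mathbf{D}_\phi$ be the subspace of $\mathbf{D}$ consisting of all $v\in\mathbf{D}$ satisfying $\|v\|_\phi := \|v\phi\|_\infty < \infty$ and $\lim_{|t|\to\infty}|v(t)| = 0$. The latter condition automatically holds when $\lim_{|t|\to\infty}\phi(t) = \infty$. We equip $\mathbf{D}_\phi$ with the trace $\sigma$-algebra of $\mathcal{D}$, and note that this $\sigma$-algebra coincides with the $\sigma$-algebra $\mathcal{B}_\phi^{\circ}$ on $\mathbf{D}_\phi$ generated by the $\|\cdot\|_\phi$-open balls (see Lemma 4.1 in Beutner and Zähle (2016)).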
3.1. Average Value at Risk Functional
Using the terminology of Part (i) of Definition A1, we obtain the following result.
Proposition 1. Let and assume that F takes the value α only once. Let be the set of all sequences with pointwise. Moreover, assume that . Then, the map is uniformly quasi-Hadamard differentiable with respect to tangentially to , and the uniform quasi-Hadamard derivative is given by
where, as before, .
Proposition 1 shows in particular that for any
which takes the value
only once, the map
is uniformly quasi-Hadamard differentiable at
F tangentially to
(in the sense of Part (ii) of Definition A1) with uniform quasi-Hadamard derivative given by Equation (
13).
Proof of Proposition 1.
First, note that the map
defined in Equation (
13) is continuous with respect to
, because
holds for every
.
Now, let
be a quadruple with
satisfying
pointwise,
,
satisfying
and
, and
satisfying
. It remains to show that
that is,
Let us denote the integrand of the integral in Equation (
14) by
. In virtue of
pointwise,
,
, and
we have
and
for every
. Thus, for every
with
, we obtain
and
i.e.,
. Moreover, for every
with
, we obtain
and
i.e.,
. Since we assumed that
F takes the value
only once, we can conclude that
for Lebesgue-a.e.
. Moreover, by the Lipschitz continuity of
with Lipschitz constant
we have
Since
(recall
), the assumption
ensures that the latter expression provides a Borel measurable majorant of
. Now, the Dominated Convergence theorem implies Equation (
14). ☐
As an immediate consequence of Corollary A4, Examples A1 and A2, and Proposition 1, we obtain the following corollary.
Corollary 1. Let F, , , , and be as in Example A1 (S1. or S2.) or as in Example A2, respectively, and assume that the assumptions discussed in Example A1 or in Example A2, respectively, are fulfilled for some weight function ϕ with (in particular ). Moreover, assume that F takes the value α only once. Then,
and
3.2. Compound Distribution Functional
Let
be the compound distribution functional introduced in
Section 2.1. For any
, let the function
be defined by
and denote by
the set of all distribution functions
F that satisfy
. Using the terminology of Part (ii) of Definition A1, we obtain the following Proposition 2. In the proposition, the functional
is restricted to the domain
in order to obtain
as the corresponding trace. The latter will be important for Corollary 3.
Proposition 2. Let and . Assume that . Then, the map is uniformly quasi-Hadamard differentiable at F tangentially to with trace . Moreover, the uniform quasi-Hadamard derivative is given by
where, as before, . In particular, if for some , then
Proposition 2 extends Proposition 4.1 of
Pitts (
1994). Before we prove the proposition, we note that the proposition together with Corollary A4 and Examples A1 and A2 yields the following corollary.
Corollary 2. Let F, , , , and be as in Example A1 (S1. or S2.) or as in Example A2, respectively, and assume that the assumptions discussed in Example A1 or in Example A2, respectively, are fulfilled for for some . Then, for and
To ease the exposition of the proof of Proposition 2, we first state a lemma that follows from results given in
Pitts (
1994). In the sequel we use
to denote the function defined by
for any measurable function
f and any distribution function
H of a finite (not necessarily probability) Borel measure on
for which
is well defined on
.
Lemma 1. Let , and and be any sequences such that and for some . Then, the following two assertions hold.
- (i) There exists a constant such that for every
- (ii) For every there exists a constant such that for every
Proof. (i): From Equation (2.4) in
Pitts (
1994) we have
so that it remains to show that
is bounded above uniformly in
. The functions
and
both lie in
, because
. Along with
, this implies
(see Lemma 2.1 in
Pitts (
1994)). Therefore,
for some suitable finite constant
and all
.
(ii): With the help of Lemma 2.3 of
Pitts (
1994) (along with
), Lemma 2.4 of
Pitts (
1994), and Equation (2.4) in
Pitts (
1994), we obtain
It hence remains to show that and are bounded above uniformly in . However, this was already done in the proof of Part (i). ☐
Proof of Proposition 2. First, note that for
, we have
by Equation (2.1) in
Pitts (
1994). Moreover, according to Lemma 2.2 in
Pitts (
1994), we have that the integrals
and
are finite under the assumptions of the proposition. Hence,
can indeed be seen as the trace.
Second, we show
-continuity of the map
. To this end let
and
such that
. For every
, we have
where the first and the second inequality follow from Lemma 2.3 and Equation (2.4) in
Pitts (
1994) respectively. Hence,
Now, the series converges due to the assumptions, and implies . Thus, , which proves continuity.
Third, let
be a quadruple with
satisfying
,
,
satisfying
and
, and
satisfying
. It remains to show that
To do so, define for
a map
by
with the usual convention that the sum over the empty set equals zero. We find that for every
where for the third “=” we use the fact that for
By Part (ii) of Lemma 1 (this lemma can be applied since
) there exists a constant
such that for all
Since
and
, we have
for some finite constant
and all
. Hence, the right-hand side of Equation (
17) can be made arbitrarily small by choosing
M large enough. That is,
can be made arbitrarily small uniformly in
by choosing
M large enough.
Furthermore, it is demonstrated in the proof of Proposition 4.1 of
Pitts (
1994) that
can be made arbitrarily small by choosing
M large enough.
Next, applying again Part (ii) of Lemma 1, we obtain
Using , this term tends to zero as for a given M.
It remains to consider the summand
We show that for
M fixed this term can be made arbitrarily small by letting
. This would follow if for every given
and
the expression
could be made arbitrarily small by letting
. For every such
k and
ℓ we can find a linear combination of indicator functions of the form
,
, which we denote by
, such that
for some suitable finite constant
depending only on
and
. The first inequality in Equation (
18) is obvious (and holds for any
). The second inequality in Equation (
18) is obtained by applying Lemma 2.3 of
Pitts (
1994) to the first summand (noting that
; recall
), by applying Lemma 4.3 of
Pitts (
1994) to the second summand (which requires that
is as described above), and by applying Lemma 2.3 of
Pitts (
1994) to the third summand.
We now consider the three summands on the right-hand side of Equation (
18) separately. We start with the third term. Since
, Lemma 4.2 of
Pitts (
1994) ensures that we may assume that
is chosen such that
is arbitrarily small. Hence, for fixed
M the third summand in Equation (
18) can be made arbitrarily small.
We next consider the second summand in Equation (
18). Obviously,
We start by considering the first summand in Equation (
19). In view of Equation (
16), it can be written as
Applying Lemma 2.3 of
Pitts (
1994) with
and
we obtain
where we applied Part (i) of Lemma 1 to
to obtain the last inequality. Hence, for the left-hand side of Equation (
20) to go to zero as
it suffices to show that
as
. The latter follows from
where we applied Part (ii) of Lemma 1 with
to all summands in
. For every
k and
this expression goes indeed to zero as
, because, as mentioned before,
is uniformly bounded in
, and we have
. Next, we consider the second summand in Equation (
19). Applying Equation (
16) to
and
and subsequently Part (ii) of Lemma 1 to the summands in
, we have
Clearly for every
k this term goes to zero as
, because
as
by assumption. This together with the fact that Equation (
20) goes to zero as
shows that Equation (
19) goes to zero in
as
. Therefore, the second summand in Equation (
18) goes to zero as
.
It remains to consider the first term in Equation (
18). We find
where for the last inequality we used Formula (2.4) of
Pitts (
1994). In the discussion following Equation (19), we showed that
goes to zero as
for every
k and
. Hence, for every such
k and
ℓ, it is uniformly bounded in
. Therefore, we can make Equation (
22) arbitrarily small by making
small which, as mentioned above, is possible according to Lemma 4.2 of
Pitts (
1994). This finishes the proof. ☐
3.3. Composition of Average Value at Risk Functional and Compound Distribution Functional
Here, we consider the composition of the Average Value at Risk functional
defined in Equation (
1) and the compound distribution functional
introduced in
Section 2.1. As a consequence of Propositions 1 and 2, we obtain the following Corollary 3. Note that, for any
, Lemma 2.2 in
Pitts (
1994) yields
so that the composition
is well defined on
.
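As a rough numerical illustration of the composition considered here, the following Python sketch computes a plug-in estimate of the Average Value at Risk of a collective risk. It rests on several assumptions that are not taken from the article: the compound distribution is taken to be the usual mixture $\sum_k p_k F^{*k}$ of convolution powers, claim sizes are non-negative integers, the claim-number distribution `p` has finite support, and all function names are hypothetical.

```python
import numpy as np


def compound_pmf(p, f):
    """pmf of S = Y_1 + ... + Y_N on {0, 1, ...}, i.e. sum_k p[k] * f^{*k}.

    p: pmf of the claim number N on {0, ..., K} (finite support assumed)
    f: pmf of a single claim size on {0, ..., m}
    """
    p = np.asarray(p, dtype=float)
    f = np.asarray(f, dtype=float)
    out = np.zeros((p.size - 1) * (f.size - 1) + 1)
    conv = np.array([1.0])  # f^{*0}: point mass at zero
    for k in range(p.size):
        out[:conv.size] += p[k] * conv
        conv = np.convolve(conv, f)  # next convolution power of f
    return out


def avar_from_pmf(values, pmf, alpha):
    """AVaR at level alpha of a discrete distribution,
    computed as (1 / (1 - alpha)) * int_alpha^1 F^{<-}(s) ds."""
    order = np.argsort(values)
    v = np.asarray(values, dtype=float)[order]
    w = np.asarray(pmf, dtype=float)[order]
    cdf = np.cumsum(w)
    cdf_prev = cdf - w
    # Probability mass of each atom that lies above the level alpha.
    tail = np.clip(cdf, alpha, 1.0) - np.clip(cdf_prev, alpha, 1.0)
    return float(np.dot(tail, v) / (1.0 - alpha))


def plugin_avar_collective(claims, p, alpha):
    """Plug-in estimate: AVaR of the compound distribution built from
    the empirical pmf of observed integer-valued claim sizes."""
    claims = np.asarray(claims, dtype=int)
    f_hat = np.bincount(claims) / claims.size  # empirical claim-size pmf
    s_pmf = compound_pmf(p, f_hat)
    return avar_from_pmf(np.arange(s_pmf.size), s_pmf, alpha)
```

The finite-support restriction on `p` keeps the convolution series exact; for claim-number distributions with infinite support (for example Poisson), the series would have to be truncated, which introduces an additional approximation error not addressed in this sketch.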
Corollary 3. Let and assume . Let , and assume that takes the value α only once. Then, the map is uniformly quasi-Hadamard differentiable at F tangentially to , and the uniform quasi-Hadamard derivative is given by , i.e.,
with and as in Propositions 1 and 2, respectively.
Proof. We intend to apply Lemma A1 to
and
. To verify that the assumptions of the lemma are fulfilled, we first recall from the comment directly before Corollary 3 that
. It remains to show that the Assumptions (a)–(c) of Lemma A1 are fulfilled. According to Proposition 2 we have that for every
the functional
is uniformly quasi-Hadamard differentiable at
F tangentially to
with trace
, which is the first part of Assumption (b). The second part of Assumption (b) means
and follows from
(for which we applied Lemma 2.3 and Inequality (2.4) in
Pitts (
1994)), the convergence of the latter series (which holds by assumption), and
. Further, it follows from Proposition 1 that the map
is uniformly quasi-Hadamard differentiable tangentially to
at every distribution function of
that takes the value
only once. This is Assumption (c) of Lemma A1.
It remains to show that Assumption (a) of Lemma A1 also holds true. In the present setting, Assumption (a) means that for every sequence
with
we have
pointwise. We show that we even have
. Thus, let
. Then,
where we used Equation (
16) for the second “=” and applied Part (ii) of Lemma 1 to the summands of
to obtain the latter inequality. Since the series converges, we obtain
when assuming
. ☐
As an immediate consequence of Corollary A4, Examples A1 and A2, and Corollary 3, we obtain the following corollary.
Corollary 4. Let F, , , , and be as in Example A1 (S1. or S2.) or as in Example A2, respectively, and assume that the assumptions discussed in Example A1 or in Example A2, respectively, are fulfilled for for some (in particular ). Moreover, assume and that takes the value α only once. Then,
and