Bootstrapping Average Value at Risk of Single and Collective Risks

Almost sure bootstrap consistency of the blockwise bootstrap for the Average Value at Risk of single risks is established for strictly stationary β-mixing observations. Moreover, almost sure bootstrap consistency of a multiplier bootstrap for the Average Value at Risk of collective risks is established for independent observations. The main results rely on a new functional delta-method for the almost sure bootstrap of uniformly quasi-Hadamard differentiable statistical functionals, to be presented here. The latter seems to be interesting in its own right.


Introduction
One of the most popular risk measures in practice is the so-called Average Value at Risk, which is also referred to as Expected Shortfall (see Acerbi and Szekely (2014); Acerbi and Tasche (2002a, 2002b); Emmer et al. (2015) and references therein). For a fixed level α ∈ (0, 1), the corresponding Average Value at Risk is the map AV@R_α : L^1 → R defined by AV@R_α(X) := R_α(F_X), where F_X refers to the distribution function of X, L^1 is the usual L^1-space associated with some atomless probability space, and

    R_α(F) := ∫_0^1 F^←(s) dg_α(s)    (1)

for any F ∈ F_1, with F_1 the set of the distribution functions F_X of all X ∈ L^1. Here, g_α(t) := (1/(1−α)) max{t − α; 0}, and F^←(s) := inf{x ∈ R : F(x) ≥ s} denotes the left-continuous inverse of F. The statistical functional R_α : F_1 → R is sometimes referred to as the risk functional associated with AV@R_α. Note that AV@R_α(X) = E[X | X ≥ F^←_X(α)] when F_X is continuous at F^←_X(α). In this article, we mainly focus on bootstrap methods for the Average Value at Risk. Before doing so, we briefly review nonparametric estimation techniques and asymptotic results for the Average Value at Risk. Given identically distributed observations X_1, ..., X_n (, X_{n+1}, ...) on some probability space (Ω, F, P) with unknown marginal distribution F ∈ F_1, a natural estimator for R_α(F) is the empirical plug-in estimator

    R_α(F̂_n) = ∫_0^1 F̂_n^←(s) dg_α(s) = ∑_{i=1}^n X_{i:n} (g_α(i/n) − g_α((i−1)/n)),    (2)

where F̂_n := (1/n) ∑_{i=1}^n 1_{[X_i, ∞)} is the empirical distribution function of X_1, ..., X_n, and X_{1:n}, ..., X_{n:n} refer to the order statistics of X_1, ..., X_n. The second representation in Equation (2) shows that R_α(F̂_n) is a specific L-statistic, which was already mentioned in Acerbi (2002); Acerbi and Tasche (2002a); Jones and Zitikis (2003).
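The L-statistic representation in Equation (2) translates directly into code. The following minimal sketch (our own illustration, not part of the original paper; the function name and the use of NumPy are our choices) computes the empirical plug-in estimator:

```python
import numpy as np

def avar_plugin(x, alpha):
    """Empirical plug-in estimator R_alpha(F_n) of the Average Value at Risk.

    Uses the L-statistic representation
        R_alpha(F_n) = sum_i X_{i:n} * (g_alpha(i/n) - g_alpha((i-1)/n))
    with g_alpha(t) = max(t - alpha, 0) / (1 - alpha).
    """
    x = np.sort(np.asarray(x, dtype=float))   # order statistics X_{1:n} <= ... <= X_{n:n}
    n = len(x)
    grid = np.arange(n + 1) / n               # 0, 1/n, ..., 1
    g = np.maximum(grid - alpha, 0.0) / (1.0 - alpha)
    weights = np.diff(g)                      # g_alpha(i/n) - g_alpha((i-1)/n)
    return float(np.dot(weights, x))
```

For instance, for α = 0.5 and observations 1, 2, 3, 4, this returns 3.5, the average of the upper half of the sample.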
In particular, if the underlying sequence (X_i)_{i∈N} is strictly stationary and ergodic, classical results of van Zwet (1980) and Gilat and Helmers (1997) show that R_α(F̂_n) converges P-almost surely to R_α(F) as n → ∞, i.e., that strong consistency holds. If X_1, X_2, ... are i.i.d. and F has a finite second moment and takes the value α only once, then a result of Stigler (1974, Theorems 1 and 2) yields the asymptotic distribution of the estimation error:

    √n (R_α(F̂_n) − R_α(F)) ⇝ N(0, σ²_F),    (3)

where σ²_F := ∫∫ g′_α(F(x_0)) Γ(x_0, x_1) g′_α(F(x_1)) dx_0 dx_1 with Γ(x_0, x_1) := Cov(1_{(−∞,x_0]}(X_1), 1_{(−∞,x_1]}(X_1)) + ∑_{k=2}^∞ (Cov(1_{(−∞,x_0]}(X_1), 1_{(−∞,x_1]}(X_k)) + Cov(1_{(−∞,x_1]}(X_1), 1_{(−∞,x_0]}(X_k))), g′_α the right-sided derivative of g_α, and ⇝ refers to convergence in distribution (see also Shorack 1972; Shorack and Wellner 1986). In fact, for independent X_1, X_2, ..., the second summand in the definition of Γ(x_0, x_1) vanishes. Results of Beutner and Zähle (2010) show that Equation (3) still holds if (X_i)_{i∈N} is strictly stationary and α-mixing with mixing coefficients α(i) = O(i^{−θ}) and lim_{x→∞} (1 − F(x)) x^{2θ/(θ−1)} < ∞ for some θ > 1 + √2. Tsukahara (2013) obtained the same result. A similar result can also be derived from an earlier work by Mehra and Rao (1975), but under a faster decay of the mixing coefficients and under an additional assumption on the dependence structure. We emphasize that the method of proof proposed by Beutner and Zähle is rather flexible, because it easily extends to other weak and strong dependence concepts and to other risk measures (see Beutner et al. 2012; Beutner and Zähle 2010, 2016; Krätschmer et al. 2013; Krätschmer and Zähle 2017).
Even in the i.i.d. case, the asymptotic variance σ²_F depends on F in a fairly complex way. For the approximation of the distribution of √n(R_α(F̂_n) − R_α(F)), bootstrap methods should thus be superior to the method of estimating σ²_F. However, to the best of our knowledge, theoretical investigations of the bootstrap for the Average Value at Risk seem to be rare. According to Gribkova (2016), a result of Gribkova (2002) yields bootstrap consistency for Efron's bootstrap when X_1, X_2, ... are i.i.d., while Theorem 3 of Helmers et al. (1990) seems not to cover the Average Value at Risk, because there the function J (which plays the role of g′_α) is assumed to be Lipschitz continuous. In these articles, bootstrap consistency is typically proved by first proving consistency of the bootstrap variance and then using this result by showing that upper bounds for the difference between the sampling distribution and the bootstrap distribution converge to zero. Employing different techniques, Beutner and Zähle (2016) established bootstrap consistency in probability for the multiplier bootstrap when X_1, X_2, ... are i.i.d., as well as bootstrap consistency in probability for the circular bootstrap when X_1, X_2, ... are strictly stationary and β-mixing with mixing coefficients β(i) = O(i^{−b}) and ∫|x|^p dF(x) < ∞ for some p > 2 and b > p/(p − 2). Recently, Sun and Cheng (2018) established bootstrap consistency in probability for the moving blocks bootstrap when X_1, X_2, ... are strictly stationary and α-mixing with mixing coefficients α(i) ≤ cδ^i and ∫|x|^p dF(x) < ∞ for some p > 4, c > 0 and δ ∈ (0, 1). Strictly speaking, Sun and Cheng did not consider the Average Value at Risk (Expected Shortfall) but the Tail Conditional Expectation in the sense of Acerbi and Tasche (2002a, 2002b).
The contribution of the article at hand is twofold. First, we extend the results of Beutner and Zähle (2016) on the Average Value at Risk from bootstrap consistency in probability to bootstrap consistency almost surely. Second, we establish bootstrap consistency for the Average Value at Risk of collective risks, i.e., for R_α(F^{*m}) and more general expressions.
The rest of the article is organized as follows. In Section 2, we present and illustrate our main results, which are proved in Section 3. Section 3 is followed by the conclusions. The proofs of Section 3 rely on a new functional delta-method for the almost sure bootstrap which seems to be interesting in its own right and which is presented in Appendix B. Roughly speaking, the (functional) delta-method studies properties of particular estimators for quantities of the form H(θ). Here, H is a known functional, such as the Average Value at Risk functional, and θ is a possibly infinite-dimensional parameter, such as an unknown distribution function. The particular estimators covered by the (functional) delta-method are of the form H(T̂_n), where T̂_n is an estimator for θ. In general, and in the particular application considered here, the appeal of the (functional) delta-method lies in the fact that, once "differentiability" of H (here, the Average Value at Risk functional) is established, the asymptotic error distribution of H(T̂_n) can immediately be derived from the asymptotic error distribution of T̂_n (here, F̂_n). This also applies to the (functional) delta-method for the bootstrap, where bootstrap consistency of the bootstrapped version of H(T̂_n) will follow from the respective property of the bootstrapped version of T̂_n (here, F̂_n). Thus, if in financial or actuarial applications the data show dependencies for which the asymptotic error distribution and/or bootstrap consistency of plug-in estimators for the Average Value at Risk have not been established yet, it would be enough to check whether for these dependencies the asymptotic error distribution and/or bootstrap consistency of F̂_n is known; thanks to the (functional) delta-method, the Average Value at Risk functional would inherit these properties. In Appendix A.1, we give results on convergence in distribution for the open-ball σ-algebra which are needed for the main results, and in Appendix A.2 we prove a delta-method for uniformly quasi-Hadamard differentiable maps that is the basis for the method of Appendix B. Readers interested in these methods used to prove the main results might wish to first work through Appendices A and B before reading Sections 2 and 3.

The Case of i.i.d. Observations
Keep the notation of Section 1. Assume that (X_i)_{i∈N} is a sequence of i.i.d. real-valued random variables on some probability space (Ω, F, P) with distribution function F, and let (W_ni) be a triangular array of nonnegative real-valued random variables on another probability space (Ω′, F′, P′) such that one of the following two settings is met.

S1. The random vector (W_n1, ..., W_nn) is multinomially distributed according to the parameters n and p_1 = ··· = p_n := 1/n for every n ∈ N.

S2. W_ni = Y_i/Ȳ_n for every i = 1, ..., n and n ∈ N, where Ȳ_n := (1/n) ∑_{j=1}^n Y_j and (Y_j) is any sequence of nonnegative i.i.d. random variables on (Ω′, F′, P′) satisfying suitable moment conditions.

The bootstrap scheme of Setting S1. is nothing but Efron's bootstrap (Efron 1979). If in Setting S2. the distribution of Y_1 is the exponential distribution with parameter 1, then the resulting scheme is in line with the Bayesian bootstrap of Rubin (1981). Let F̂*_n := (1/n) ∑_{i=1}^n W_ni 1_{[X_i, ∞)} denote the bootstrapped empirical distribution function.

Theorem 1. In the setting above, assume that ∫ φ² dF < ∞ for some continuous function φ : R → [1, ∞) with ∫ 1/φ(x) dx < ∞ (in particular F ∈ F_1), and that F takes the value α only once. Then,

    √n (R_α(F̂_n) − R_α(F)) ⇝ N(0, σ²_F)    (4)

and, for P-almost every ω ∈ Ω,

    √n (R_α(F̂*_n(ω, ·)) − R_α(F̂_n(ω))) ⇝ N(0, σ²_F),    (5)

where σ²_F is as in Equation (3) (there the second summand in the definition of Γ(x_0, x_1) vanishes, because the X_i are independent).

Theorem 1 is a special case of Corollary 1 below. For the bootstrap Scheme S1., the result of Theorem 1 can also be deduced from Theorem 7 in Gribkova (2002). According to Gribkova (2016), Condition (1) of this theorem is satisfied if there are 0 = a_0 < a_1 < ··· < a_k = 1 for some k ∈ N such that J is Hölder continuous on each interval (a_{i−1}, a_i), 1 ≤ i ≤ k, and the measure dF^{−1} has no mass at the points a_1, ..., a_{k−1}. For the bootstrap Scheme S2., the result seems to be new.
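In code, the two weighting schemes can be sketched as follows (our own illustration, not from the paper; function names are ours, and NumPy is assumed). The helper `bootstrap_avar` evaluates R_α at the bootstrapped empirical distribution function, which puts mass W_ni/n on X_i:

```python
import numpy as np

def efron_weights(n, rng):
    """Scheme S1: (W_n1, ..., W_nn) ~ multinomial(n; 1/n, ..., 1/n), Efron's bootstrap."""
    return rng.multinomial(n, np.full(n, 1.0 / n))

def bayesian_weights(n, rng):
    """Scheme S2 with Y_1 ~ Exp(1): W_ni = Y_i / Ybar_n, Rubin's Bayesian bootstrap."""
    y = rng.exponential(1.0, size=n)
    return y / y.mean()

def bootstrap_avar(x, alpha, w):
    """AV@R at level alpha of the bootstrapped empirical df F*_n with weights W_ni."""
    order = np.argsort(x)
    xs = np.asarray(x, dtype=float)[order]
    p = np.asarray(w, dtype=float)[order]
    p = p / p.sum()                                    # masses W_ni / n
    cum = np.concatenate([[0.0], np.cumsum(p)])
    g = np.maximum(cum - alpha, 0.0) / (1.0 - alpha)   # g_alpha along the jumps of F*_n
    return float(np.dot(np.diff(g), xs))
```

With the uniform weights (1, ..., 1), `bootstrap_avar` reduces to the plug-in estimator R_α(F̂_n).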
We now consider the collective risk model. Let (X_i)_{i∈N} and F̂_n be as above, and let p = (p_k)_{k∈N_0} be the counting density of a distribution on N_0. Let F denote the set of all distribution functions on R, and consider the compound distribution functional

    C_p : F → F,  C_p(F) := ∑_{k=0}^∞ p_k F^{*k},

where F^{*k} denotes the k-fold convolution of F (with F^{*0} := 1_{[0,∞)}).

Theorem 2. In the setting above, assume that ∫|x|^{2λ} dF(x) < ∞ for some λ > 1 (in particular F ∈ F_1) and ∑_{k=1}^∞ p_k k^{1+λ} < ∞, and that C_p(F) takes the value α only once. Then,

    √n (R_α(C_p(F̂_n)) − R_α(C_p(F))) ⇝ N(0, σ²_{p,F})    (6)

and, for P-almost every ω ∈ Ω,

    √n (R_α(C_p(F̂*_n(ω, ·))) − R_α(C_p(F̂_n(ω)))) ⇝ N(0, σ²_{p,F}),    (7)

for some σ²_{p,F} ∈ [0, ∞) depending on p and F. Theorem 2 is a special case of Corollary 4 below. Lauer and Zähle (2015, 2017) derive the asymptotic distribution as well as almost sure bootstrap consistency for the Average Value at Risk (and more general risk measures) of F̂_n^{*m_n} when m_n/n is asymptotically constant, but we do not know any result in the existing literature which is comparable to that of Theorem 2.
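The compound distribution C_p(F) of the total claim ∑_{i=1}^N X_i can be approximated by plain Monte Carlo. The following sketch is our own illustration (not from the paper); the function name and the example samplers are assumptions:

```python
import numpy as np

def compound_sample(n_sims, claim_sampler, count_sampler):
    """Monte Carlo draws from the compound distribution C_p(F): each draw is
    S = X_1 + ... + X_N with N ~ p and X_1, X_2, ... i.i.d. ~ F (S := 0 if N = 0)."""
    counts = count_sampler(n_sims)
    totals = np.zeros(n_sims)
    for j, k in enumerate(counts):
        if k > 0:
            totals[j] = claim_sampler(int(k)).sum()
    return totals

# Example: Poisson(5) claim numbers with log-normal claim sizes.
rng = np.random.default_rng(0)
totals = compound_sample(10_000,
                         lambda k: rng.lognormal(0.0, 1.0, size=k),
                         lambda m: rng.poisson(5.0, size=m))
```

The empirical distribution function of `totals` then approximates C_p(F), and any plug-in functional (such as R_α) can be applied to it.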

The Case of β-Mixing Observations
Keep the notation of Section 1. Assume that (X_i)_{i∈N} is a strictly stationary sequence of β-mixing random variables on (Ω, F, P) with distribution function F. As before, let F̂_n denote the empirical distribution function of X_1, ..., X_n. Let (ℓ_n) be a sequence of integers such that ℓ_n ↗ ∞ as n → ∞ and ℓ_n < n for all n ∈ N. Set k_n := ⌈n/ℓ_n⌉ for all n ∈ N. Let (I_nj)_{n∈N, 1≤j≤k_n} be a triangular array of random variables on (Ω′, F′, P′) such that I_n1, ..., I_nk_n are i.i.d. according to the uniform distribution on {1, ..., n − ℓ_n + 1} for every n ∈ N. Let (Ω̄, F̄, P̄) := (Ω × Ω′, F ⊗ F′, P ⊗ P′), and note that the sequence (X_i) and the triangular array (W_ni) introduced below, regarded as families of random variables on the product space (Ω̄, F̄, P̄), are independent. At an informal level, this means that, given a sample X_1, ..., X_n, we pick k_n − 1 blocks of length ℓ_n and one block of length n − (k_n − 1)ℓ_n in the sample X_1, ..., X_n, where the start indices I_n1, I_n2, ..., I_nk_n are chosen independently and uniformly in the set of indices {1, ..., n − ℓ_n + 1}. The bootstrapped empirical distribution function F̂*_n is then defined to be the distribution function of the discrete probability measure with atoms X_1, ..., X_n carrying masses W_n1/n, ..., W_nn/n, respectively, where W_ni specifies the number of blocks which contain X_i. This is known as the blockwise bootstrap (see, e.g., Bühlmann (1994, 1995) and references therein). Assume that the following assertions hold:

A1. ∫ φ^p dF < ∞ for some p > 4 (in particular F ∈ F_1).

A2. The sequence of random variables (X_i) is strictly stationary and β-mixing with mixing coefficients (β_i) satisfying β_i ≤ cδ^i for some constants c > 0 and δ ∈ (0, 1).

A3. The block length ℓ_n satisfies ℓ_n = O(n^γ) for some γ ∈ (0, 1/2).
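The block-counting construction above can be sketched in code as follows (our own illustration, not from the paper; the function name is ours, and start indices are 0-based here, whereas the text uses 1-based indices):

```python
import numpy as np

def block_bootstrap_weights(n, block_len, rng):
    """Blockwise bootstrap weights W_ni: draw k_n = ceil(n / l_n) uniform start
    indices; W_ni counts how many sampled blocks cover index i.  The last block
    is truncated to length n - (k_n - 1) * l_n so that the weights sum to n."""
    k = -(-n // block_len)                        # ceil(n / block_len)
    starts = rng.integers(0, n - block_len + 1, size=k)
    w = np.zeros(n, dtype=int)
    remaining = n
    for s in starts:
        length = min(block_len, remaining)        # only the last block is shorter
        w[s:s + length] += 1
        remaining -= length
    return w
```

The bootstrapped empirical distribution function then puts mass W_ni/n on X_i, so the total mass is one.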
Let σ²_F := ∫∫ g′_α(F(x_0)) Γ(x_0, x_1) g′_α(F(x_1)) dx_0 dx_1 with Γ(x_0, x_1) as defined in Section 1.

Theorem 3. In the setting above (in particular under A1.–A3.), assume that F takes the value α only once. Then, for P-almost every ω ∈ Ω, we have

    √n (R_α(F̂*_n(ω, ·)) − R_α(F̂_n(ω))) ⇝ N(0, σ²_F).

Theorem 3 is a special case of Corollary 1 below. To the best of our knowledge, there does not yet exist any result on almost sure bootstrap consistency for the Average Value at Risk when the underlying data are dependent.

Bootstrapping the Down Side Risk of an Asset Price
Let (A_i)_{i∈N_0} be the price process of an asset. Let us assume that it is induced by an initial state A_0 ∈ R_+ and a sequence of R_+-valued i.i.d. random variables (R_i)_{i∈N} via A_i := R_i A_{i−1}, i ∈ N. Here, R_i is the return of the asset between time i − 1 and time i. For instance, if A_0, A_1, A_2, ... are the observations of a time-continuous Black-Scholes-Merton model with drift µ and volatility σ at the points of the time grid {0, h, 2h, ...}, then the distribution of R_1 is the log-normal distribution with parameters (µ − σ²/2)h and σ²h. However, the adequacy of a specific parametric model is usually hard to verify. For this reason, we do not restrict ourselves to any particular parametric structure for the dynamics of (R_i)_{i∈N}.
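As a quick sanity check of the stated log-normal form of R_1 (our own illustration; the parameter values are arbitrary), one can simulate grid returns of the Black-Scholes-Merton model:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, h = 0.05, 0.2, 1.0 / 252.0    # arbitrary drift, volatility, grid width

# Under the BSM model, log R_1 = (mu - sigma^2/2) h + sigma sqrt(h) Z with Z ~ N(0, 1),
# i.e., R_1 is log-normal with parameters (mu - sigma^2/2) h and sigma^2 h.
z = rng.standard_normal(100_000)
log_returns = (mu - 0.5 * sigma**2) * h + sigma * np.sqrt(h) * z
returns = np.exp(log_returns)            # simulated grid returns R_i
```

The sample mean and variance of `log_returns` match the two log-normal parameters up to Monte Carlo error.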
Let us assume that we can observe the asset prices A_0, ..., A_n up to time n, and that we are interested in the Average Value at Risk at level α of the negative price change A_n − A_{n+1} (which specifies the down side risk of the asset) between time n and n + 1. That is, we are interested in R_α(F) with F the distribution function of X := A_n − A_{n+1} = A_n(1 − R_{n+1}) given the observed prices. Since, for any observed a_0, ..., a_n, the variables X_i := a_n(1 − R_i), i = 1, ..., n, are i.i.d. copies of X, we can use R_α(F̂_n) as an estimator for R_α(F) and derive from Equation (4) an asymptotic confidence interval at a given level τ ∈ (0, 1) for R_α(F), where one has to estimate σ²_F by ∫∫ g′_α(F̂_n(x_0)) Γ̂_n(x_0, x_1) g′_α(F̂_n(x_1)) dx_0 dx_1 with Γ̂_n(x_0, x_1) := F̂_n(x_0 ∧ x_1) − F̂_n(x_0)F̂_n(x_1). As the estimator for σ²_F depends on F̂_n in a somewhat complex way, the bootstrap confidence interval at level τ derived from Equations (4) and (5) is supposed to have a slightly better performance. Here, q*_t(ω) denotes a t-quantile of (a Monte Carlo approximation of) the distribution of the left-hand side in Equation (5) for fixed ω. For Equations (4) and (5), it suffices to assume that E[|R_1|^{2+ε}] < ∞ for some arbitrarily small ε > 0.
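The bootstrap confidence interval can be sketched as follows (our own illustration, not from the paper; we use Efron resampling, i.e., Scheme S1., and the function names are ours). Bootstrap replicates of √n(R_α(F̂*_n) − R_α(F̂_n)) serve as Monte Carlo proxies for the quantiles q*_t of the left-hand side in Equation (5):

```python
import numpy as np

def avar_weighted(x, alpha, w):
    """AV@R at level alpha of the distribution putting mass w_i / sum(w) on x_i."""
    order = np.argsort(x)
    xs = np.asarray(x, dtype=float)[order]
    p = np.asarray(w, dtype=float)[order]
    p = p / p.sum()
    cum = np.concatenate([[0.0], np.cumsum(p)])
    g = np.maximum(cum - alpha, 0.0) / (1.0 - alpha)
    return float(np.dot(np.diff(g), xs))

def bootstrap_ci(x, alpha, tau, n_boot, rng):
    """Bootstrap confidence interval at level tau for R_alpha(F):
    [theta - q*_{1-tau/2}/sqrt(n), theta - q*_{tau/2}/sqrt(n)], where q*_t is a
    t-quantile of the Monte Carlo distribution of sqrt(n)(R_alpha(F*_n) - R_alpha(F_n))."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta = avar_weighted(x, alpha, np.ones(n))
    reps = np.empty(n_boot)
    for b in range(n_boot):
        w = rng.multinomial(n, np.full(n, 1.0 / n))    # Efron weights (Scheme S1)
        reps[b] = np.sqrt(n) * (avar_weighted(x, alpha, w) - theta)
    q_lo, q_hi = np.quantile(reps, [tau / 2.0, 1.0 - tau / 2.0])
    return theta - q_hi / np.sqrt(n), theta - q_lo / np.sqrt(n)
```

This avoids the plug-in estimation of σ²_F entirely; only evaluations of the AV@R functional at reweighted empirical distributions are needed.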

Bootstrapping the Total Risk Premium in Insurance Models
In actuarial mathematics, the collective risk model is frequently used for modeling the total claim distribution of an insurance collective. If the counting density p = (p_k)_{k∈N_0} corresponds to the distribution of the random number N of claims caused by the whole collective within one insurance period, and if X_1, ..., X_N (, X_{N+1}, ...) denote the i.i.d. sizes of the corresponding claims with marginal distribution F, then C_p(F) is the distribution of the total claim ∑_{i=1}^N X_i (the latter sum is set to 0 if N = 0). Now, R_α(C_p(F)) is a suitable insurance premium for the whole collective when the Average Value at Risk at level α is considered to be a suitable premium principle.
Assume that p is known, for instance p_m = 1 for some fixed m ∈ N, and let X_1, ..., X_n be observed historical (i.i.d.) claims with n large. On the one hand, the construction of an exact confidence interval for R_α(C_p(F)) at level τ ∈ (0, 1) based on X_1, ..., X_n is hardly possible. Likewise, the performance of an asymptotic confidence interval at level τ derived from Equation (6) with (nonparametrically) estimated σ²_{p,F} is typically only moderate. Take into account that σ²_{p,F} depends on the unknown F in a fairly complex way. On the other hand, the bootstrap confidence interval at level τ derived from Equation (7) should have a better performance. Here, q*_t(ω) denotes a t-quantile of (a Monte Carlo approximation of) the distribution of the left-hand side in Equation (7) for fixed ω.
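One concrete way to evaluate the plug-in quantity R_α(C_p(F̂_n)) is by simulation from the compound distribution of the empirical claim distribution. The following sketch is our own illustration (helper name and samplers are assumptions, NumPy assumed):

```python
import numpy as np

def total_claim_avar(claims, count_sampler, alpha, n_sims, rng):
    """Monte Carlo plug-in estimate of R_alpha(C_p(F_n)): simulate totals
    S = X*_1 + ... + X*_N with N ~ p and the X*_i drawn uniformly from the
    observed claims, then take the empirical AV@R of the simulated totals."""
    claims = np.asarray(claims, dtype=float)
    counts = count_sampler(n_sims)
    totals = np.array([rng.choice(claims, size=int(k)).sum() if k > 0 else 0.0
                       for k in counts])
    totals.sort()
    n = len(totals)
    grid = np.arange(n + 1) / n
    g = np.maximum(grid - alpha, 0.0) / (1.0 - alpha)
    return float(np.dot(np.diff(g), totals))
```

Repeating this with bootstrap-reweighted claims yields the Monte Carlo quantiles q*_t used in the bootstrap confidence interval.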
Note that Theorem 2 ensures that Equations (6) and (7) hold true when the marginal distribution F of the X_i is any log-normal distribution, any Gamma distribution, any Pareto distribution with tail index greater than 2, or any convex combination of one of these distributions with the Dirac measure δ_0, and the counting density p corresponds to any Dirac measure with atom in N, any binomial distribution, any Poisson distribution, or any geometric distribution. The former distributions are classical examples for the single claim distribution, and the latter distributions are classical examples for the claim number distribution.

Proofs of Main Results
Here, we prove the results of Section 2. In fact, Theorems 1–3 are special cases of Corollaries 1 and 4. The latter corollaries are proved with the help of the technique introduced in Appendix B.2, which in turn relies on the concept of uniform quasi-Hadamard differentiability (see Definition A1 in Appendix B.1).
Keep the notation introduced in Section 1. Let D be the space of all càdlàg functions v on R with finite sup-norm ‖v‖_∞ := sup_{t∈R} |v(t)|, and D be the σ-algebra on D generated by the one-dimensional coordinate projections π_t, t ∈ R, given by π_t(v) := v(t). Let φ : R → [1, ∞) be a weight function, i.e., a continuous function being non-increasing on (−∞, 0] and non-decreasing on [0, ∞). Let D_φ be the subspace of D consisting of all v ∈ D satisfying ‖v‖_φ := ‖vφ‖_∞ < ∞ and lim_{|t|→∞} |v(t)| = 0. The latter condition automatically holds when lim_{|t|→∞} φ(t) = ∞. We equip D_φ with the trace σ-algebra of D, and note that this σ-algebra coincides with the σ-algebra B•_φ on D_φ generated by the ‖·‖_φ-open balls (see Lemma 4.1 in Beutner and Zähle (2016)).

Average Value at Risk Functional
Using the terminology of Part (i) of Definition A1, we obtain the following result.
Proposition 1. Let F ∈ F_1 and assume that F takes the value α only once. Let S be the set of all sequences (G_n) ⊆ F_1 with G_n → F pointwise. Moreover, assume that ∫ 1/φ(x) dx < ∞. Then, the map R_α : F_1(⊆ D) → R is uniformly quasi-Hadamard differentiable with respect to S tangentially to D_φ⟨D_φ⟩, and the uniform quasi-Hadamard derivative Ṙ_{α;F} : D_φ → R is given by

    Ṙ_{α;F}(v) := −∫ g′_α(F(x)) v(x) dx,    (13)

where as before g′_α denotes the right-sided derivative of g_α. Proposition 1 shows in particular that for any F ∈ F_1 which takes the value α only once, the map R_α : F_1(⊆ D) → R is uniformly quasi-Hadamard differentiable at F tangentially to D_φ⟨D_φ⟩ (in the sense of Part (ii) of Definition A1) with uniform quasi-Hadamard derivative given by Equation (13).
Proof. (of Proposition 1) First, note that the map Ṙ_{α;F} defined in Equation (13) is continuous with respect to ‖·‖_φ, because |Ṙ_{α;F}(v) − Ṙ_{α;F}(w)| ≤ (1/(1−α)) ‖v − w‖_φ ∫ 1/φ(x) dx. Let us denote the integrand of the integral in Equation (14) by I_n(x). For (F_n) ∈ S and ε_n ↓ 0, we have lim_{n→∞} F_n(x) = F(x) and lim_{n→∞} (F_n(x) + ε_n v_n(x)) = F(x) for every x ∈ R. Thus, for every x ∈ R with F(x) < α, we obtain g′_α(F(x))v(x) = 0 and lim_{n→∞} (g_α(F_n(x) + ε_n v_n(x)) − g_α(F_n(x)))/ε_n = 0, i.e., lim_{n→∞} I_n(x) = 0; the analogous argument applies for every x ∈ R with F(x) > α. Since we assumed that F takes the value α only once, we can conclude that lim_{n→∞} I_n(x) = 0 for Lebesgue-a.e. x ∈ R. Moreover, by the Lipschitz continuity of g_α with Lipschitz constant 1/(1−α), together with sup_n ‖v_n‖_φ < ∞, the assumption ∫ 1/φ(x) dx < ∞ ensures that the latter expression provides a Borel measurable majorant of I_n. Now, the Dominated Convergence theorem implies Equation (14).
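The derivative in Equation (13) can be checked numerically. The following sketch (our own illustration, with an arbitrary choice of F and direction v) compares a finite-difference quotient of R_α with the derivative value −(1/(1−α)) ∫_{{F ≥ α}} v(x) dx for the uniform distribution function on [0, 1] and v(x) = x(1 − x):

```python
import numpy as np

alpha = 0.9
xs = np.linspace(0.0, 1.0, 200_001)          # grid on the support [0, 1]
s_grid = np.linspace(alpha, 1.0, 200_001)    # uniform grid on (alpha, 1)

def avar_of_df(G_vals):
    """R_alpha(G) = (1/(1-alpha)) * int_alpha^1 G^{-1}(s) ds, with the inverse
    computed by linear interpolation of the (increasing) graph of G."""
    return np.interp(s_grid, G_vals, xs).mean()

F = xs                      # distribution function of the uniform law on [0, 1]
v = xs * (1.0 - xs)         # perturbation direction, vanishing at 0 and 1
eps = 1e-4                  # F + eps*v is again a distribution function

fd = (avar_of_df(F + eps * v) - avar_of_df(F)) / eps
# Predicted derivative: -(1/(1-alpha)) * int_{F(x) >= alpha} v(x) dx, which for
# F(x) = x and v(x) = x(1-x) has the closed form below.
analytic = -((1.0 / 2 - 1.0 / 3) - (alpha**2 / 2 - alpha**3 / 3)) / (1.0 - alpha)
```

Up to discretization and finite-difference error, `fd` and `analytic` agree, illustrating the Hadamard-type differentiability of R_α at smooth perturbations.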
As an immediate consequence of Corollary A4, Examples A1 and A2, and Proposition 1, we obtain the following corollary.
Corollary 1. Let F, F̂_n, F̂*_n, C_n, and B_F be as in Example A1 (S1. or S2.) or as in Example A2, respectively, and assume that the assumptions discussed in Example A1 or in Example A2, respectively, are fulfilled for some weight function φ with ∫ 1/φ(x) dx < ∞ (in particular F ∈ F_1). Moreover, assume that F takes the value α only once. Then, for P-almost every ω ∈ Ω,

    C_n (R_α(F̂*_n(ω, ·)) − R_α(F̂_n(ω))) ⇝ Ṙ_{α;F}(B_F).

Compound Distribution Functional
Let C_p : F → F be the compound distribution functional introduced in Section 2.1. For any λ ≥ 0, let the function φ_λ : R → [1, ∞) be defined by φ_λ(x) := (1 + |x|)^λ, and denote by F_{φ_λ} the set of all distribution functions F that satisfy ∫ φ_λ(x) dF(x) < ∞. Using the terminology of Part (ii) of Definition A1, we obtain the following Proposition 2. In the proposition, the functional C_p is restricted to the domain F_{φ_λ} in order to obtain D_{φ_λ} as the corresponding trace. The latter will be important for Corollary 3.
Proposition 2. Let λ > λ′ ≥ 0, let F ∈ F_{φ_λ}, and assume that ∑_{k=1}^∞ p_k k^{1+λ} < ∞. Then, the map C_p : F_{φ_λ}(⊆ D) → F is uniformly quasi-Hadamard differentiable at F tangentially to D_{φ_λ}⟨D_{φ_λ}⟩ with trace D_{φ_{λ′}}, and the uniform quasi-Hadamard derivative is given by

    Ċ_{p;F}(v) := v * H_{p,F},

where H_{p,F} := ∑_{k=1}^∞ k p_k F^{*(k−1)}. In particular, if p_m = 1 for some m ∈ N, then Proposition 2 extends Proposition 4.1 of Pitts (1994). Before we prove the proposition, we note that the proposition together with Corollary A4 and Examples A1 and A2 yields the following corollary.
Corollary 2. Let F, F̂_n, F̂*_n, C_n, and B_F be as in Example A1 (S1. or S2.) or as in Example A2, respectively, and assume that the assumptions discussed in Example A1 or in Example A2, respectively, are fulfilled for φ = φ_λ for some λ > 0. Then, for λ′ ∈ [0, λ) and P-almost every ω ∈ Ω,

    C_n (C_p(F̂*_n(ω, ·)) − C_p(F̂_n(ω))) ⇝ B_F * H_{p,F}  in (D_{φ_{λ′}}, B•_{φ_{λ′}}).

To ease the exposition of the proof of Proposition 2, we first state a lemma that follows from results given in Pitts (1994). In the sequel, we use f * H to denote the function defined by f * H(·) := ∫ f(· − x) dH(x) for any measurable function f and any distribution function H of a finite (not necessarily probability) Borel measure on R for which f * H(·) is well defined on R.
Then, the following two assertions hold.
(i) There exists a constant C_1 > 0 such that, for every k, n ∈ N, the corresponding bound holds.

Proof. (i): From Equation (2.4) in Pitts (1994), it remains to show that ∫ |x|^λ dF_n(x) is bounded above uniformly in n ∈ N (see Pitts (1994)). Therefore, ∫ |x|^λ dF_n(x) ≤ C_1 for some suitable finite constant C_1 > 0 and all n ∈ N.
(ii): With the help of Lemma 2.3 of Pitts (1994), along with Equation (2.4) in Pitts (1994), we obtain the asserted bound up to the terms ∫ |x|^λ dF_n(x) and ∫ |x|^λ dG_n(x); it hence remains to show that these are bounded above uniformly in n ∈ N. However, this was already done in the proof of Part (i).
Proof. (of Proposition 2) First, note that for G_1, G_2 ∈ F_{φ_λ} we have, by Equation (2.1) in Pitts (1994), the corresponding Lipschitz-type bound. Moreover, according to Lemma 2.2 in Pitts (1994), the integrals ∫|x|^λ dC_p(F)(x) and ∫|x|^λ dC_p(G)(x) are finite under the assumptions of the proposition. Hence, D_{φ_λ} can indeed be seen as the trace. Second, we establish the differentiability, where the first and the second inequality follow from Lemma 2.3 and Equation (2.4) in Pitts (1994), respectively. Now, the series converges due to the assumptions, and, with the usual convention that the sum over the empty set equals zero, we find for every M ∈ N a decomposition into summands S_1(n, M), S_2(n, M) and S_3(M). By Part (ii) of Lemma 1 (this lemma can be applied since λ′ < λ), and since ‖v_n − v‖_{φ_λ} → 0, we have ‖v_n‖_{φ_λ} ≤ K_1 for some finite constant K_1 > 0 and all n ∈ N. Hence, the right-hand side of Equation (17) can be made arbitrarily small by choosing M large enough. That is, S_1(n, M) can be made arbitrarily small uniformly in n ∈ N by choosing M large enough.
Furthermore, it is demonstrated in the proof of Proposition 4.1 of Pitts (1994) that S_3(M) can be made arbitrarily small by choosing M large enough.
Next, applying again Part (ii) of Lemma 1, we obtain the corresponding bound. It remains to consider the summand S_2(n, M). We show that, for M fixed, this term can be made arbitrarily small by letting n → ∞. This would follow if, for every given k ∈ {1, ..., M} and ℓ ∈ {0, ..., k − 1}, the expression displayed in Equation (18) converges to zero; here, c(λ′, ṽ) > 0 is a suitable finite constant depending only on λ′ and ṽ. The first inequality in Equation (18) is obvious (and holds for any v ∈ D_{φ_λ}). The second inequality in Equation (18) is obtained by applying Lemma 2.3 of Pitts (1994) to the first summand, a further result of Pitts (1994) to the second summand (which requires that ṽ is as described above), and Lemma 2.3 of Pitts (1994) to the third summand.
We now consider the three summands on the right-hand side of Equation (18) separately. We start with the third term. Since v ∈ D_{φ_λ}, Lemma 4.2 of Pitts (1994) ensures that we may assume that ṽ is chosen such that ‖v − ṽ‖_{φ_λ} is arbitrarily small. Hence, for fixed M, the third summand in Equation (18) can be made arbitrarily small.
We next consider the second summand in Equation (18). Obviously, it splits into the two summands of Equation (19). We start by considering the first summand in Equation (19). In view of Equation (16), it can be written in convolution form. Applying Lemma 2.3 of Pitts (1994) and, to obtain the last inequality, Part (i) of Lemma 1 to ‖1_{[0,∞)} − F_n^{*k}‖_{φ_λ}, we see that, for the left-hand side of Equation (20) to go to zero as n → ∞, it suffices to show that the remaining factor vanishes, where we applied Part (ii) of Lemma 1 with v = ε_n v_n to all summands. For every k and ℓ ∈ {0, ..., k − 1}, this expression indeed goes to zero as n → ∞, because, as mentioned before, ‖v_n‖_{φ_λ} is uniformly bounded in n ∈ N, and we have ε_n → 0. Next, we consider the second summand in Equation (19). Applying Equation (16) to F_n^{*(k−1)} and F^{*(k−1)}, and subsequently Part (ii) of Lemma 1 to the summands in H_{k−1}(F_n, F), we obtain that, for every k, this term goes to zero as n → ∞ by assumption. This, together with the fact that Equation (20) goes to zero as n → ∞, shows that Equation (19) goes to zero in ‖·‖_{φ_{λ′}} as n → ∞. Therefore, the second summand in Equation (18) goes to zero as n → ∞.
It remains to consider the first term in Equation (18). We find the bound in Equation (22), where for the last inequality we used Formula (2.4) of Pitts (1994). In the course of treating Equation (19), we showed that the relevant norm goes to zero as n → ∞ for every k and ℓ ∈ {0, ..., k − 1}. Hence, for every such k and ℓ, it is uniformly bounded in n ∈ N. Therefore, we can make Equation (22) arbitrarily small by making ‖v − ṽ‖_{φ_λ} small which, as mentioned above, is possible according to Lemma 4.2 of Pitts (1994). This finishes the proof.

Composition of Average Value at Risk Functional and Compound Distribution Functional
Here, we consider the composition of the Average Value at Risk functional R_α defined in Equation (1) and the compound distribution functional C_p introduced in Section 2.1. As a consequence of Propositions 1 and 2, we obtain the following Corollary 3. Note that, for any λ > 1, Lemma 2.2 in Pitts (1994) ensures that C_p maps F_{φ_λ} into F_1.

Corollary 3. Let F ∈ F_{φ_λ} for some λ > 1, assume that ∑_{k=1}^∞ p_k k^{1+λ} < ∞, and assume that C_p(F) takes the value α only once. Then, the map T_{α,p} := R_α ∘ C_p : F_{φ_λ}(⊆ D) → R is uniformly quasi-Hadamard differentiable at F tangentially to D_{φ_λ}⟨D_{φ_λ}⟩, and the uniform quasi-Hadamard derivative Ṫ_{α,p;F} : D_{φ_λ} → R is given by

    Ṫ_{α,p;F}(v) := Ṙ_{α;C_p(F)}(v * H_{p,F}),

with g′_α and v * H_{p,F} as in Propositions 1 and 2, respectively.
Proof. We intend to apply Lemma A1 to H = C_p and H̃ = R_α (for the corresponding bounds we applied Lemma 2.3 and Inequality (2.4) in Pitts (1994)), the convergence of the latter series (which holds by assumption), and ‖v‖_{φ_{λ′}} ≤ ‖v‖_{φ_λ} < ∞. Further, it follows from Proposition 1 that the map R_α is uniformly quasi-Hadamard differentiable tangentially to D_{φ_λ}⟨D_{φ_λ}⟩ at every distribution function of F_{φ_λ} that takes the value α only once. This is Assumption (c) of Lemma A1. It remains to show that Assumption (a) of Lemma A1 also holds true. In the present setting, Assumption (a) means convergence of C_p(G_n) to C_p(F) for every sequence (G_n) as in the definition of uniform quasi-Hadamard differentiability, where we used Equation (16) for the second "=" and applied Part (ii) of Lemma 1 to the summands of H_k to obtain the latter inequality. Since the series converges, we obtain the claim. As an immediate consequence of Corollary A4, Examples A1 and A2, and Corollary 3, we obtain the following corollary.
Corollary 4. Let F, F̂_n, F̂*_n, C_n, and B_F be as in Example A1 (S1. or S2.) or as in Example A2, respectively, and assume that the assumptions discussed in Example A1 or in Example A2, respectively, are fulfilled for φ = φ_λ for some λ > 1 (in particular F ∈ F_1). Moreover, assume ∑_{k=1}^∞ p_k k^{1+λ} < ∞ and that C_p(F) takes the value α only once. Then, for P-almost every ω ∈ Ω,

    C_n (R_α(C_p(F̂*_n(ω, ·))) − R_α(C_p(F̂_n(ω)))) ⇝ Ṫ_{α,p;F}(B_F).

Conclusions
In this paper, we considered the sub-additive risk measure Average Value at Risk and presented, in Sections 2.1 and 2.2, results on almost sure bootstrap consistency for the corresponding empirical plug-in estimator based on i.i.d. or strictly stationary, geometrically β-mixing observations. Our results supplement those by Beutner and Zähle (2016) on bootstrap consistency in probability and those by Sun and Cheng (2018) on bootstrap consistency in probability for the Tail Conditional Expectation (which is not sub-additive). In Section 2.1, we also looked at the case where one is interested in the Average Value at Risk in the collective risk model. Note that one might interpret the collective risk model as a pooling of independent risks. In the context of Solvency II, pooling of risks has received increased attention (see, for example, Bølviken and Guillen 2017). However, one should keep in mind that our results of Section 2.1 can typically not be applied in the Solvency II context. In Solvency II applications, risks are usually dependent, whereas in the collective risk model the different risks (claims) are assumed to be independent.

Appendix A. Convergence in Distribution•
Let (E, d) be a metric space and B• be the σ-algebra on E generated by the open balls B_r(x) := {y ∈ E : d(x, y) < r}, x ∈ E, r > 0. We refer to B• as the open-ball σ-algebra. If (E, d) is separable, then B• coincides with the Borel σ-algebra B. If (E, d) is not separable, then B• might be strictly smaller than B, and thus a continuous real-valued function on E is not necessarily (B•, B(R))-measurable. Let C•_b be the set of all bounded, continuous and (B•, B(R))-measurable real-valued functions on E, and M•_1 be the set of all probability measures on (E, B•).
Let X_n be an (E, B•)-valued random variable on some probability space (Ω_n, F_n, P_n) for every n ∈ N_0. Then, referring to Billingsley (1999, sct. 1.6), X_n is said to converge in distribution• to X_0 if lim_{n→∞} E_n[f(X_n)] = E_0[f(X_0)] for every f ∈ C•_b. In this case, we write X_n ⇝• X_0. This is the same as saying that the sequence (P_n ∘ X_n^{−1}) converges weakly• to P_0 ∘ X_0^{−1} in the sense of Beutner and Zähle (2016). It is worth mentioning that two probability measures µ, ν ∈ M•_1 coincide if ∫ f dµ = ∫ f dν for all f ∈ C•_b (see (Billingsley 1999, Theorem 6.2)).
In Appendices A-C in Beutner and Zähle (2016), several properties of convergence in distribution • (and weak • convergence) have been discussed.The following two subsections complement this discussion.
Appendix A.1. Slutsky-Type Results for the Open-Ball σ-Algebra

For a sequence (X_n) of (E, B•)-valued random variables that are all defined on the same probability space (Ω, F, P), the sequence (X_n) is said to converge in probability• to X_0 if the mappings ω → d(X_n(ω), X_0(ω)), n ∈ N, are (F, B(R_+))-measurable and satisfy lim_{n→∞} P[d(X_n, X_0) ≥ ε] = 0 for every ε > 0. In this case, we write X_n →^{p,•} X_0. The superscript • points to the fact that measurability of the mapping ω → d(X_n(ω), X_0(ω)) is a requirement of the definition (and not automatically valid). Note, however, that in the specific situation where X_0 ≡ x_0 for some x_0 ∈ E, measurability of the mapping ω → d(X_n(ω), X_0(ω)) does hold (see Lemma B.3 in Beutner and Zähle (2016)). In addition, note that the measurability always holds when (E, d) is separable; in this case, we also write →^p instead of →^{p,•}.

Theorem A1. Let (X_n) and (Y_n) be two sequences of (E, B•)-valued random variables on a common probability space (Ω, F, P), and assume that the mapping ω → d(X_n(ω), Y_n(ω)) is (F, B(R_+))-measurable for every n ∈ N. Let X_0 be an (E, B•)-valued random variable on some probability space (Ω_0, F_0, P_0) with P_0[X_0 ∈ E_0] = 1 for some separable E_0 ∈ B•. If X_n ⇝• X_0 and d(X_n, Y_n) →^p 0, then Y_n ⇝• X_0.

Proof. In view of X_n ⇝• X_0, we obtain for every fixed f ∈ BL•_1 that lim_{n→∞} E[f(X_n)] = E_0[f(X_0)].

Since f lies in BL•_1 and we assumed d(X_n, Y_n) →^p 0, we also have lim_{n→∞} |E[f(X_n)] − E[f(Y_n)]| = 0.
Corollary A1. Let (X_n) and (Y_n) be two sequences of (E, B•)-valued random variables on a common probability space (Ω, F, P). Let X_0 be an (E, B•)-valued random variable on some probability space (Ω_0, F_0, P_0). Let (Ẽ, d̃) be a metric space equipped with the corresponding open-ball σ-algebra B̃•. Then, X_n ⇝• X_0 and Y_n →^{p,•} y_0 together imply:

Proof. Assertion (ii) is an immediate consequence of Assertion (i) and the Continuous Mapping theorem in the form of (Billingsley 1999, Theorem 6.4); take into account that (X_0, y_0) takes values only in a separable subset. Then, the following two assertions hold:

Proof. The proof is very similar to the proof of Theorem C.4 in Beutner and Zähle (2016).
Moreover, define the map h_0 : E_0 → Ẽ by h_0 := Ḣ_S. Now, the claim would follow by the extended Continuous Mapping theorem in the form of Theorem C.1 in Beutner and Zähle (2016) applied to the functions h_n, n ∈ ℕ_0, and the random variables ξ_n, n ∈ ℕ_0. Third, the map h_0 is continuous by the definition of the quasi-Hadamard derivative. Thus, h_0 is (B°_0, B̃°)-measurable, because the trace σ-algebra B°_0 := B° ∩ E_0 coincides with the Borel σ-algebra on E_0 (recall that E_0 is separable). In particular, Ḣ_S(ξ) is (F_0, B̃°)-measurable.

(ii): For every n ∈ ℕ, let E_n and h_n be as above, and define the map h̃_n : E_n → Ẽ by

Moreover, define the map h̃_0 analogously. For Equation (A5), it suffices to show that the assumptions of the extended Continuous Mapping theorem in the form of Theorem C.1 in Beutner and Zähle (2016), applied to the functions h̃_n and ξ_n (as defined above), are satisfied. The claim then follows by Theorem C.1 in Beutner and Zähle (2016). First, we have already observed that ξ_n(Ω_n) ⊆ E_n and ξ_0(Ω_0) ⊆ E_0. Second, we have seen in the proof of Part (i), with the help of Bauer (2001), that the map h̃_n is measurable. Condition (b) of Theorem C.1 in Beutner and Zähle (2016) is ensured by Assumption (d) and the continuity of the extended map Ḣ_S at every point of E_0 (recall Assumption (f)). Hence, Equation (A5) holds.

By Assumption (g) and the ordinary Continuous Mapping theorem (see (Billingsley 1999, Theorem 6.4)) applied to Equation (A5), and by Proposition B.4 in Beutner and Zähle (2016), we can conclude Equation (A4).
The following lemma provides a chain rule for uniformly quasi-Hadamard differentiable maps (a similar chain rule, with a different S, can be found in Varron (2015)). To formulate the chain rule, let V̄ be a further vector space and Ē ⊆ V̄ be a subspace equipped with a norm ‖·‖_Ē. Let E_0 and Ē_0 be subsets of E and Ē, respectively. Let S and S̄ be sets of sequences in V_H and V̄_H̄, respectively, and assume that the following three assertions hold.
(b) H is uniformly quasi-Hadamard differentiable with respect to S tangentially to E_0⟨E⟩ with trace Ē and uniform quasi-Hadamard derivative Ḣ_S : E_0 → Ē, and we have Ḣ_S(E_0) ⊆ Ē_0.
(c) H̄ is uniformly quasi-Hadamard differentiable with respect to S̄ tangentially to Ē_0⟨Ē⟩ with trace Ẽ and uniform quasi-Hadamard derivative H̄̇_S̄ : Ē_0 → Ẽ.
Then, the map T := H̄ ∘ H : V_H → Ṽ is uniformly quasi-Hadamard differentiable with respect to S tangentially to E_0⟨E⟩ with trace Ẽ, and the uniform quasi-Hadamard derivative Ṫ_S is given by Ṫ_S := H̄̇_S̄ ∘ Ḣ_S.
Proof. Obviously, since H(V_H) ⊆ V̄_H̄ and H̄ is associated with trace Ẽ, the map H̄ ∘ H can also be associated with trace Ẽ.

Note that, by assumption, H(θ_n) ∈ V̄_H̄ and in particular (H(θ_n)) ∈ S̄. By the uniform quasi-Hadamard differentiability of H with respect to S tangentially to E_0⟨E⟩ with trace Ē, because H is associated with trace Ē and Ḣ_S(E_0) ⊆ Ē_0. Hence, by the uniform quasi-Hadamard differentiability of H̄ with respect to S̄ tangentially to Ē_0⟨Ē⟩, we obtain the claim. This completes the proof.

Appendix B. Delta-Method for the Bootstrap
The functional delta-method is a widely used technique to derive bootstrap consistency for a sequence of plug-in estimators with respect to a map H from bootstrap consistency of the underlying sequence of estimators. An essential limitation of the classical functional delta-method for proving bootstrap consistency in probability (or outer probability) is the condition of Hadamard differentiability on H (see Theorem 3.9.11 of van der Vaart and Wellner (1996)). It is commonly acknowledged that Hadamard differentiability fails for many relevant maps H. Recently, it was demonstrated in Beutner and Zähle (2016) that a functional delta-method for the bootstrap in probability can also be proved for quasi-Hadamard differentiable maps H. Quasi-Hadamard differentiability is a weaker notion of "differentiability" than Hadamard differentiability and can be obtained for many relevant statistical functionals H (see, e.g., Beutner et al. 2012; Beutner and Zähle 2010, 2012; Krätschmer et al. 2013; Krätschmer and Zähle 2017). Using the classical functional delta-method to prove almost sure (or outer almost sure) bootstrap consistency for a sequence of plug-in estimators with respect to a map H from almost sure (or outer almost sure) bootstrap consistency of the underlying sequence of estimators requires uniform Hadamard differentiability of H (see Theorem 3.9.11 of van der Vaart and Wellner (1996)). In this section, we introduce the notion of uniform quasi-Hadamard differentiability and demonstrate that one can even obtain a functional delta-method for the almost sure bootstrap and uniformly quasi-Hadamard differentiable maps H.
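In schematic form, and suppressing all measurability, trace, and tangential conditions made precise in Theorems A3 and A4 below, both delta-methods transfer a limit theorem for the estimator to one for the plug-in; the following is a sketch only, not a substitute for the precise statements:

```latex
% Plain delta-method (schematic form of Theorem A3):
a_n\bigl(\widehat{T}_n - \theta_n\bigr) \rightsquigarrow^{\circ} \xi
\quad\Longrightarrow\quad
a_n\bigl(H(\widehat{T}_n) - H(\theta_n)\bigr) \rightsquigarrow^{\circ} \dot{H}_{\mathcal{S}}(\xi).

% Bootstrap delta-method (schematic form of Theorem A4),
% conditionally on the sample, almost surely:
a_n\bigl(\widehat{T}^{*}_n - \widehat{C}_n\bigr) \rightsquigarrow^{\circ} \xi
\quad\Longrightarrow\quad
a_n\bigl(H(\widehat{T}^{*}_n) - H(\widehat{C}_n)\bigr) \rightsquigarrow^{\circ} \dot{H}_{\mathcal{S}}(\xi).
```

Since the limit Ḣ_S(ξ) is the same in both implications, the two statements together yield almost sure bootstrap consistency.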
To explain the background and the contribution of this section more precisely, assume that we are given an estimator T̂_n for a parameter θ in a vector space, with n denoting the sample size, and that we are actually interested in the aspect H(θ) of θ. Here, H is any map taking values in a vector space. Then, H(T̂_n) is often a reasonable estimator for H(θ). One of the main objects in statistical inference is the distribution of the error H(T̂_n) − H(θ), because the error distribution can theoretically be used to derive confidence regions for H(θ). However, in applications, the exact specification of the error distribution is often hardly possible or even impossible. A widely used way out is to derive the asymptotic error distribution, i.e., the weak limit μ of law{a_n(H(T̂_n) − H(θ))} for suitable normalizing constants a_n tending to infinity, and to use μ as an approximation for μ_n := law{a_n(H(T̂_n) − H(θ))} for large n. Since μ usually still depends on the unknown parameter θ, one should use the notation μ_θ instead of μ. In particular, one actually uses μ_{T̂_n} := μ_θ|_{θ=T̂_n} as an approximation for μ_n for large n. Not least because of the estimation of the parameter θ of μ_θ, the approximation of μ_n by μ_{T̂_n} is typically only moderate. An often more efficient alternative technique to approximate μ_n is the bootstrap. The bootstrap was introduced by Efron (1979), and many variants of his method have been introduced since then. One may refer to Davison and Hinkley (1997); Efron (1994); Lahiri (2003); Shao and Tu (1995) for general accounts on this topic. The basic idea of the bootstrap is the following. By re-sampling the original sample according to a certain re-sampling mechanism (depending on the particular bootstrap method), one can sometimes construct a so-called bootstrap version T̂*_n of T̂_n for which the conditional law of a_n(H(T̂*_n) − H(T̂_n)) "given the sample" has the same weak limit μ_θ as the law of a_n(H(T̂_n) − H(θ)). The latter is referred to as bootstrap consistency. Since T̂*_n depends only on the sample and the re-sampling mechanism, one can at least numerically determine the conditional law of a_n(H(T̂*_n) − H(T̂_n)) "given the sample" by means of a Monte Carlo simulation based on L_n repetitions. The resulting law μ*_{L_n} can then be used as an approximation of μ_n, at least for large n.
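The Monte Carlo scheme just described can be sketched in code. The following is our own minimal illustration, not from the paper: H is the identity, T̂_n is the sample mean, a_n := √n, and the re-sampling mechanism is Efron's i.i.d. resampling; the L_n repetitions produce an empirical version of μ*_{L_n}.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_error_distribution(sample, statistic, L_n=2000):
    """Approximate the conditional law of a_n*(H(T*_n) - H(T_n)) 'given the
    sample' by L_n Monte Carlo repetitions of Efron's resampling, a_n = sqrt(n)."""
    n = len(sample)
    t_hat = statistic(sample)
    reps = np.empty(L_n)
    for k in range(L_n):
        # re-sampling mechanism: draw n points with replacement from the sample
        resample = rng.choice(sample, size=n, replace=True)
        reps[k] = np.sqrt(n) * (statistic(resample) - t_hat)
    return reps  # empirical version of mu*_{L_n}

sample = rng.normal(loc=1.0, scale=2.0, size=500)
mu_star = bootstrap_error_distribution(sample, np.mean)
```

For the sample mean of i.i.d. data with standard deviation σ, the histogram of mu_star approximates a centered normal law with standard deviation close to σ, in line with bootstrap consistency.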
(ii) If S consists of all sequences (θ_n) ⊆ V_H with θ_n − θ ∈ E, n ∈ ℕ, and ‖θ_n − θ‖_E → 0 for some fixed θ ∈ V_H, then we replace the phrase "with respect to S" by "at θ" and "Ḣ_S" by "Ḣ_θ".
(iii) If S consists only of the constant sequence θ_n = θ, n ∈ ℕ, then we skip the phrase "uniformly" and replace the phrase "with respect to S" by "at θ" and "Ḣ_S" by "Ḣ_θ". In this case, we may also replace "H(y_1) − H(y_2) ∈ Ẽ for all y_1, y_2 ∈ V_H" by "H(y) − H(θ) ∈ Ẽ for all y ∈ V_H".
(iv) If E = V, then we skip the phrase "quasi-".
(v) If Ẽ = Ṽ, then we skip the phrase "with trace Ẽ".
The conventional notion of uniform Hadamard differentiability as used in Theorem 3.9.11 of van der Vaart and Wellner (1996) corresponds to the differentiability concept in (i) with S as in (ii), E as in (iv), and Ẽ as in (v). Proposition 1 shows that it is beneficial to refrain from insisting on E = V as in (iv). It was recently discussed in Belloni et al. (2017) that it can also be beneficial to refrain from insisting on the assumption of (ii). For E = V (the "non-quasi" case), uniform Hadamard differentiability in the sense of Definition B.1 in Belloni et al. (2017) corresponds to uniform Hadamard differentiability in the sense of our Definition A1 (Parts (i) and (iv)) when S is chosen as the set of all sequences (θ_n) ⊆ K_θ with d_K(θ_n, θ) → 0. In Belloni et al. (2017), it is illustrated by means of the quantile functional that this notion of differentiability (subject to a suitable choice of (K_θ, d_K)) is strictly weaker than the notion of uniform Hadamard differentiability that was used in the classical delta-method for the almost sure bootstrap, Theorem 3.9.11 in van der Vaart and Wellner (1996). Although this shows that the flexibility with respect to S in our Definition A1 can be beneficial, it is somehow even more important that we allow for the "quasi" case.
Of course, the smaller the family S the weaker the condition of uniform quasi-Hadamard differentiability with respect to S. On the other hand, if the set S is too small, then Condition (e) in Theorem A4 ahead may fail.That is, for an application of the functional delta-method in the form of Theorem A4 the set S should be large enough for Condition (e) to be fulfilled and small enough for being able to establish uniform quasi-Hadamard differentiability with respect to S of the map H.
We now turn to the abstract delta-method. As mentioned in Section 1, convergence in distribution will always be considered for the open-ball σ-algebra. We use the terminology convergence in distribution° (symbolically ⇝°) for this sort of convergence; for details, see Appendix A and Appendices A–C of Beutner and Zähle (2016). In a separable metric space, the notion of convergence in distribution° boils down to the conventional notion of convergence in distribution for the Borel σ-algebra. In this case, we use the symbol ⇝ instead of ⇝°.
Let (Ω, F, P) be a probability space, and (T̂_n) be a sequence of maps T̂_n : Ω → V. Regard ω ∈ Ω as a sample drawn from P, and T̂_n(ω) as a statistic derived from ω. Somewhat unconventionally, we do not (need to) require at this point that T̂_n is measurable with respect to any σ-algebra on V. Let (Ω′, F′, P′) be another probability space and set (Ω̄, F̄, P̄) := (Ω × Ω′, F ⊗ F′, P ⊗ P′). The probability measure P′ represents a random experiment that is run independently of the random sample mechanism P. In the sequel, T̂_n will frequently be regarded as a map defined on the extension Ω̄ of Ω.
Theorem A3 is a consequence of Theorem A2 in Appendix A.2, as we assume that T̂_n takes values only in V_H. The proof of the measurability statement of Theorem A3 is given in the proof of Theorem A4. Theorem A3 is stated here because, together with Theorem A4, it implies almost sure bootstrap consistency whenever the limit ξ is the same in Theorem A3 and Theorem A4.
Theorem A3. Let (θ_n) be a sequence in V_H and S := {(θ_n)}. Let E_0 ⊆ E be a separable subspace and assume that E_0 ∈ B°. Let (a_n) be a sequence of positive real numbers with a_n → ∞, and assume that the following assertions hold:
(a) a_n(T̂_n − θ_n) ⇝° ξ for some (E, B°)-valued random variable ξ on some probability space (Ω_0, F_0, P_0) with ξ(Ω_0) ⊆ E_0.
(b) a_n(H(T̂_n) − H(θ_n)) takes values only in Ẽ and is (F, B̃°)-measurable.
(c) H is uniformly quasi-Hadamard differentiable with respect to S tangentially to E_0⟨E⟩ with trace Ẽ and uniform quasi-Hadamard derivative Ḣ_S.
Then, Ḣ_S(ξ) is (F_0, B̃°)-measurable and a_n(H(T̂_n) − H(θ_n)) ⇝° Ḣ_S(ξ).

Theorem A4. Let S be any set of sequences in V_H. Let E_0 ⊆ E be a separable subspace and assume that E_0 ∈ B°. Let (a_n) be a sequence of positive real numbers with a_n → ∞, and assume that the following assertions hold: for some (E, B°)-valued random variable ξ on some probability space (Ω_0, F_0, P_0).

Example A1. Let (W_ni) be a triangular array of nonnegative real-valued random variables on (Ω′, F′, P′) such that Setting S1. or Setting S2. of Section 2.1 is met, and define the map F̂*_n accordingly. Recall that Setting S1. is nothing but Efron's bootstrap (Efron (1979)), and that Setting S2. is in line with the Bayesian bootstrap of Rubin (1981) if Y_1 is exponentially distributed with parameter 1. In Section 5.1 in Beutner and Zähle (2016), it was proved with the help of results of Shorack and Wellner (1986) and van der Vaart and Wellner (1996) that Condition (a) of Corollary A3 (with F_n := F) and Condition (a) of Corollary A4 (with Ĉ_n := F̂_n), respectively, hold for a_n := √n and B := B_F, where B_F is an F-Brownian bridge. Here, C_φ can be chosen to be the set C_{φ,F} of all v ∈ D_φ whose discontinuities are also discontinuities of F. In addition, note that, in view of Ĉ_n = F̂_n, Condition (e) holds if S is (any subset of) the set of all sequences (G_n) of distribution functions on ℝ satisfying G_n − F ∈ D_φ, n ∈ ℕ, and ‖G_n − F‖_φ → 0 (see, for instance, Theorem 2.1 in Zähle (2014)).
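The two weighting schemes of Example A1 can be contrasted in code. The sketch below is our own illustration, not from the paper; the function names are ours, and the AV@R functional is evaluated exactly for a weighted discrete distribution via its quantile-integral representation (1/(1−α)) ∫_α^1 F^←(s) ds. Under Setting S1, the weights are multinomial resampling counts (Efron's bootstrap); under Setting S2 with Y_1 ~ Exp(1), they are i.i.d. standard exponentials (Bayesian bootstrap). Weights are normalized inside avar, so only relative weights matter.

```python
import numpy as np

rng = np.random.default_rng(1)

def avar(x, w, alpha):
    """AV@R_alpha of the discrete distribution putting weight w_i on x_i:
    (1/(1-alpha)) * integral_alpha^1 F^{<-}(s) ds, computed exactly."""
    order = np.argsort(x)
    x, w = x[order], w[order] / w.sum()
    c = np.cumsum(w)  # c_i = F(x_i)
    lo = np.maximum(np.concatenate(([0.0], c[:-1])), alpha)
    hi = np.maximum(c, alpha)
    # x_i contributes over the quantile levels s in (max(c_{i-1}, alpha), max(c_i, alpha)]
    return float(np.sum(x * (hi - lo)) / (1.0 - alpha))

def bootstrap_weights(n, scheme):
    """One row (W_n1, ..., W_nn) of the weight triangular array."""
    if scheme == "efron":  # Setting S1: multinomial counts, Efron (1979)
        return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)
    if scheme == "bayes":  # Setting S2 with Y_1 ~ Exp(1): Rubin (1981)
        return rng.exponential(1.0, size=n)
    raise ValueError(scheme)

x = rng.normal(size=1000)
alpha = 0.95
point = avar(x, np.ones_like(x), alpha)  # plug-in estimate R_alpha(F_n)
boot = [avar(x, bootstrap_weights(len(x), "bayes"), alpha) for _ in range(500)]
```

The empirical spread of boot around point is the bootstrap approximation to the sampling variability of the AV@R plug-in estimator.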
Example A2. Let (X_i) be a strictly stationary sequence of β-mixing random variables on (Ω, F, P) with distribution function F, and F̂_n be given by Equation (A12). Let (ℓ_n) be a sequence of integers such that ℓ_n ↗ ∞ as n → ∞ and ℓ_n < n for all n ∈ ℕ. Set k_n := ⌈n/ℓ_n⌉ for all n ∈ ℕ. Let (I_nj)_{n∈ℕ, 1≤j≤k_n} be a triangular array of random variables on (Ω′, F′, P′) such that I_n1, ..., I_nk_n are i.i.d. according to the uniform distribution on {1, ..., n − ℓ_n + 1} for every n ∈ ℕ. Define the map F̂*_n with W_ni given by Equation (8). Now, assume that Assumptions A1.–A3. of Section 2.2 hold true. Then, as discussed in Example 4.4 and Section 5.2 of Beutner and Zähle (2016), it can be derived from a result in Arcones and Yu (1994) that under Assumptions A1. and A2. Condition (a) of Corollary A3 holds for a_n := √n, B := B_F, and F_n := F, where B_F is a centered Gaussian process with covariance function as specified there. Further examples for Condition (a) in Corollary A4 for dependent observations can, for example, be found in Bühlmann (1994); Naik-Nimbalkar and Rajarshi (1994); Peligrad (1998).
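The blockwise resampling scheme of Example A2 can be sketched as follows. This is our own illustrative code, not from the paper; the AR(1) data-generating process and the block length ℓ_n ≈ n^{1/3} are our choices for the demonstration, made only because an AR(1) process is strictly stationary and β-mixing.

```python
import numpy as np

rng = np.random.default_rng(2)

def moving_block_bootstrap(x, ell):
    """One blockwise-bootstrap pseudo-sample: draw k = ceil(n/ell) i.i.d.
    start indices I_nj uniform on {1, ..., n - ell + 1} (0-based here),
    concatenate the blocks (X_I, ..., X_{I+ell-1}), truncate to length n."""
    n = len(x)
    k = -(-n // ell)  # ceil(n / ell) = k_n
    starts = rng.integers(0, n - ell + 1, size=k)  # I_n1, ..., I_nk
    blocks = [x[s:s + ell] for s in starts]
    return np.concatenate(blocks)[:n]

# AR(1) data: strictly stationary and beta-mixing
n, rho = 1000, 0.6
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0] / np.sqrt(1 - rho**2)  # stationary initial distribution
for t in range(1, n):
    x[t] = rho * x[t - 1] + eps[t]

ell = int(n ** (1 / 3))  # illustrative block length ell_n
pseudo = moving_block_bootstrap(x, ell)
```

Resampling whole blocks, rather than single observations, preserves the short-range dependence structure within each block, which is why this scheme is appropriate for β-mixing data whereas Efron's i.i.d. resampling is not.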
Theorem A5. In the setting of Example A2, assume that Assumptions A1.–A3. of Section 2.2 hold, and let S be the set of all sequences (G_n) ⊆ D(H) with G_n − F ∈ D_φ, n ∈ ℕ, and ‖G_n − F‖_φ → 0. Then, the second part of Assertion (a) (i.e., Equation (A14)) and Assertion (e) in Corollary A4 hold.
Here, the bracketing number N_{[]}(ε, F_φ, ‖·‖_p) is the minimal number of ε-brackets with respect to ‖·‖_p (the L_p-norm with respect to dF) needed to cover F_φ, where an ε-bracket with respect to ‖·‖_p is the set [ℓ, u] of all functions f with ℓ ≤ f ≤ u, for some Borel measurable functions ℓ, u : ℝ → ℝ_+ with ℓ ≤ u pointwise and ‖u − ℓ‖_p ≤ ε.
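As a standard textbook illustration of this notion (not specific to F_φ): for the class of indicators {1_{(−∞,x]} : x ∈ ℝ} and a continuous distribution function F, the points t_i := F^←(iε), i = 0, ..., ⌈1/ε⌉, yield the bracketing cover

```latex
\bigl[\,\mathbf{1}_{(-\infty,\,t_{i-1}]},\; \mathbf{1}_{(-\infty,\,t_i]}\,\bigr],
\qquad
\bigl\|\mathbf{1}_{(-\infty,\,t_i]} - \mathbf{1}_{(-\infty,\,t_{i-1}]}\bigr\|_{1}
= F(t_i) - F(t_{i-1}) \le \varepsilon,
```

so that N_{[]}(ε, {1_{(−∞,x]} : x ∈ ℝ}, ‖·‖_1) ≤ ⌈1/ε⌉. The weighted brackets [ℓ_{ε,i}, u_{ε,i}] used in the proof below play the analogous role for the weighted indicator class F_φ.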
Proof. We only treat the negative real line, because the analogue for the positive real line can be shown in the same way. Let ℓ_{ε,i} and u_{ε,i} be as defined in Equation (A17). By Assumption A1., we have ∫ φ dF < ∞, so that, similarly as above, we can find a finite partition of the negative real line and corresponding ε-brackets with respect to ‖·‖_1 (the L_1-norm with respect to F) covering the class F_φ := {f_x : x ∈ ℝ} introduced above. We proceed in two steps.
Step 2. Because of Equation (A19), for Equation (A18) to be true, it suffices to show that Equation (A21) holds for every i = 1, ..., k_ε + m_ε. We only show the second convergence in Equation (A21); the first convergence can be shown even more easily. The first summand on the right-hand side can be made arbitrarily small by letting n → ∞. For every such k, we can find a linear combination of indicator functions of the form 1_{[a,b)}, −∞ < a < b < ∞, which we denote by v, such that the approximation holds.

To verify that the assumptions of the lemma are fulfilled, we first recall from the comment directly before Corollary 3 that C_p(F_{φ_λ}) ⊆ F_1. It remains to show that Assumptions (a)–(c) of Lemma A1 are fulfilled. According to Proposition 2, for every λ′ ∈ (1, λ) the functional C_p is uniformly quasi-Hadamard differentiable at F tangentially to D_{φ_{λ′}}⟨D_{φ_{λ′}}⟩ with trace D_{φ_{λ′}}, which is the first part of Assumption (b). The second part of Assumption (b) means Ċ_{p,F}(D_{φ_{λ′}}) ⊆ D_{φ_{λ′}} and follows from the representation of Ċ_{p,F}(v), which together with the Portmanteau theorem (in the form of (Beutner and Zähle 2016, Theorem A.4)) implies the claim.

Set Ē := E × Ẽ and let B̄° be the σ-algebra on Ē generated by the open balls with respect to the metric d̄, and set ξ̄_0 := ξ. The claim would follow if we can show that the assumptions of Theorem C.1 in Beutner and Zähle (2016) are satisfied. First, by Assumption (a) and the last part of Assumption (b), the required convergence holds. Fourth, Condition (a) of Theorem C.1 in Beutner and Zähle (2016) holds by Assumption (b). Fifth, Condition (b) of Theorem C.1 in Beutner and Zähle (2016) is ensured by Assumption (d).
(for h̃_0, one can argue as above), and is in particular (B°_0, B̃°)-measurable. Moreover, Theorem A5 below shows that, under Assumptions A1.–A3., the second part of Condition (a) (i.e., Equation (A14)) and Condition (e) of Corollary A4 hold for Ĉ_n := E′[F̂*_n] = (1/n) ∑_{i=1}^n w_ni 1_{[X_i,∞)} with w_ni := E′[W_ni] (see also Equation (9)) and the same choice of a_n, B, and F_n, when S is the set of all sequences (G_n) ⊆ D(H) with G_n − F ∈ D_φ, n ∈ ℕ, and ‖G_n − F‖_φ → 0.

∫ ℓ_{ε,i} d(F − Ĉ_n) → 0 and ∫ u_{ε,i} d(Ĉ_n − F) → 0 P-a.s. (A21)

Let T̂*_n : Ω̄ → V be any map. Since T̂*_n(ω, ω′) depends on both the original sample ω and the outcome ω′ of the additional independent random experiment, we may regard T̂*_n as a bootstrapped version of T̂_n. Moreover, let Ĉ_n : Ω̄ → V be any map. As with T̂_n, we often regard Ĉ_n as a map defined on the extension Ω̄ of Ω. We use Ĉ_n together with a scaling sequence to get weak convergence results for T̂*_n. The role of Ĉ_n is often played by T̂_n itself (see Example A1), but sometimes also by a different map (see Example A2). Assume that T̂_n, T̂*_n, and Ĉ_n take values only in V_H. Let B° and B̃° be the open-ball σ-algebras on E and Ẽ with respect to the norms ‖·‖_E and ‖·‖_Ẽ, respectively. Note that B° coincides with the Borel σ-algebra on E when (E, ‖·‖_E) is separable; the same is true for B̃°. Set Ē := E × Ẽ and let B̄° be the σ-algebra on Ē generated by the open balls with respect to the metric d̄((x_1, x̃_2), (y_1, ỹ_2)).
(b) a_n(H(T̂*_n) − H(Ĉ_n)) takes values only in Ẽ and is (F̄, B̃°)-measurable.
(c) H is uniformly quasi-Hadamard differentiable with respect to S tangentially to E_0⟨E⟩ with trace Ẽ and uniform quasi-Hadamard derivative Ḣ_S.
(d) The uniform quasi-Hadamard derivative Ḣ_S can be extended from E_0 to E such that the extension Ḣ_S : E → Ẽ is (B°, B̃°)-measurable and continuous at every point of E_0.
(e) (Ĉ_n(ω)) ∈ S for P-a.e. ω.
(f) The map h : Ē → Ẽ defined by h(x_1, x̃_2), and recall from Section 2.2 that this is the blockwise bootstrap. Similarly as in Lemma 5.3 in Beutner and Zähle (2016), it follows that a_n(F̂*_n − Ĉ_n), with Ĉ_n := E′[F̂*_n], takes values only in D_φ and is (F̄, B°_φ)-measurable. That is, the first part of Condition (a) of Corollary A4 holds true for this Ĉ_n.