1. Introduction
In May 2001, the first author contributed to the influential and highly visible “An academic response to Basel II”, Daníelsson et al. [1]. In their response to what was then Basel II, the authors were spot on concerning the weaknesses of the prevailing international regulatory framework, as well as the way in which larger international banks were managing market, credit and operational risk. We cite from their Executive Summary:
The proposed regulations fail to consider the fact that risk is endogenous. Value-at-Risk can destabilize an economy and induce crashes [⋯]
Statistical models used for forecasting risk have been proven to give inconsistent and biased forecasts, notably underestimating the joint downside risk of different assets. The Basel Committee has chosen poor quality measures of risk when better risk measures are available.
Heavy reliance on credit rating agencies for the standard approach to credit risk is misguided [⋯]
Operational risk modeling is not possible given current databases [⋯]
Financial regulation is inherently procyclical [⋯] the purpose of financial regulation is to reduce the likelihood of systemic crises, these proposals will actually tend to negate, not promote this useful purpose.
And in summary:
Perhaps our most serious concern is that these proposals, taken altogether, will enhance both the procyclicality of regulation and the susceptibility of the financial system to systemic crises, thus negating the central purpose of the whole exercise. Reconsider before it is too late.
Unfortunately, five years later it was too late!
The above quotes serve two purposes: first, academia has a crucial role to play in commenting officially on proposed changes in the regulatory landscape; second, when well-documented, properly researched and effectively communicated, we may have an influence on regulatory and industry practice. We refer to the full document Daníelsson et al. [1] for further details and Shin [2] for more background.
For the purpose of this paper, we refer to the regulatory document BCBS [3] as Basel 3.5 for the trading book; Basel 4 is already on the regulatory horizon, even if the implementation of Basel 3 is only planned for 2019. In particular, through its consultative document BCBS [4], the Basel Committee already went a step beyond BCBS [3]; indeed, “the Committee has confirmed its intention to pursue two key reforms outlined in the first consultative paper BCBS [3]: stressed calibration [⋯] move from Value-at-Risk (VaR) to Expected Shortfall (ES)”. It is proposed in the same document that VaR at a confidence level of 99% should be replaced by ES at a confidence level of 97.5% for the internal models-based approach. Our comments are also relevant for insurance regulation’s Solvency 2, now planned in the EU for January 1, 2016. The Basel 3.5 document arose out of consultations between regulators, industry and academia in the wake of the subprime crisis. It also paid attention to, and remedied, some of the criticisms raised in Daníelsson et al. [1]; we shall exemplify this below. Among the various issues raised, for our purpose, the following question (no. 8, p. 41) in BCBS [3] is relevant:
“What are the likely constraints with moving from Value-at-Risk (VaR) to Expected Shortfall (ES), including any challenges in delivering robust backtesting and how might these be best overcome?”
Since its introduction around 1994, VaR has been criticized by numerous academics, as well as practitioners, for its weaknesses as the benchmark (see Jorion [5]) for the calculation of regulatory capital in banking and insurance:
W1 VaR says nothing concerning the what-if question: “Given we encounter a high loss, what can be said about its magnitude?”;
W2 For high confidence levels, e.g., 95% and beyond, the statistical quantity, VaR, can only be estimated with considerable statistical, as well as model, uncertainty; and
W3 VaR may add up the wrong way, i.e., for certain (one-period) risks $X_1, \dots, X_d$, it is possible that:

$$\mathrm{VaR}_\alpha(X_1 + \cdots + X_d) > \mathrm{VaR}_\alpha(X_1) + \cdots + \mathrm{VaR}_\alpha(X_d); \qquad (1)$$

the latter defies the (better said, some of the) notion of diversification.
The worries, W1–W3, were early on brushed aside as being less relevant for practice. By now, practice has caught up, and W1–W3 have become highly relevant, hence parts of Basel 3.5.
The fact that the above concerns about VaR and, more importantly, about model uncertainty are well founded can be learned from some of the recent political discussions concerning banking regulation and the financial crisis. Proof of this is, for instance, to be found in (USS [6], p. 13) and (UKHLHC [7], p. 119); we quote explicitly from these documents, as they nicely summarize some of the key practical issues facing more quantitative regulation of modern financial markets. Before doing so, we recall the terminology of RWAs (Risk-Weighted Assets). In general terms, banking solvency is based on a quotient of capital (specifically defined through levels of liquidity) to RWAs. The latter are the risk numbers associated with trading or credit positions, mainly based on mark-to-market or mark-to-model values. Also included is risk capital for operational risk, which can easily reach the 20%–30% range of total RWAs. In these numbers, risk measures, like VaR, appear prominently. In general, financial engineers (including mathematicians) and their products/models play a crucial role in determining these RWAs. Accountants are typically more involved with the numerator, capital. Below, we list some quotes related to the concerns about VaR and model uncertainty. The highlighting is ours.
Quote 1 (from USS [6]): “Near the end of January, the bank approved use of a new CIO Value-at-Risk (VaR) model that cut in half the SCP’s [London Whale’s structured credit portfolio’s] purported risk profile [...] The change in VaR methodology effectively masked the significant changes in the portfolio.” The quote refers to the JPMorgan Chase Whale Trades.
Quote 2 (from UKHLHC [7]): “From a former employee of HBOS: We actually got an external advisor [to assess how frequently a particular event might happen] and they came out with one in 100,000 years and we said ‘no’, and I think we submitted one in 10,000 years. But that was a year and a half before it happened. It does not mean to say it was wrong: it was just unfortunate that the 10,000th year was so near.”
Quote 3 (from BCBS [3], p. 20): “However, a number of weaknesses have been identified with VaR, including its inability to capture ‘tail risk’.” The quote clearly refers to the what-if question W1.
Quote 4 The RWA uncertainty issue is very well addressed in BCBS [8] (in particular p. 6); indeed: “There is considerable variation across banks in average RWAs for credit risk. In broad terms, the variation is similar to that found for market risk in the trading book. Much of the variation (up to three quarters) is explained by the underlying differences in the risk composition of the banks’ assets, reflecting differences in risk preferences as intended under the risk-based capital framework. The remaining variation is driven by diversity in both bank and supervisory practices.” The supervision of the Euro area’s biggest banks by the European Central Bank will very much concentrate on RWAs in its asset-quality reviews; see The Economist [9].
Though ES also suffers from W2, it partly corrects W1 and always adds up correctly (≤), i.e., ES is subadditive (correcting W3). Subadditivity has been widely accepted as a desirable property of risk measures since its introduction in Artzner et al. [10], although several authors (see, for instance, Dhaene et al. [11] and Kou et al. [12]) consider it debatable. Of course, the ‘one number cannot suffice’ paradigm also holds for ES; see Rootzén and Klüppelberg [13]. Concerning W2, classical Extreme Value Theory (EVT), as, for instance, explained in (McNeil et al. [14], Chapter 7), yields sufficient warnings concerning the near-impossibility of the accurate estimation of single risk measures, like VaR and ES, at high confidence levels; see in particular (McNeil et al. [14], Figure 7.6). For the purpose of this paper, we shall mainly concentrate on W3, compare VaR and ES estimates and discuss Question 8, p. 41, of BCBS [3] from the point of view of risk aggregation and model uncertainty.
2. How Superadditive Can VaR Be?
In Embrechts et al. [15], the question is addressed of how large the gap between the left- and right-hand side of Equation (1) can be. The answer is very much related to the issue of model uncertainty (MU), especially at the level of inter-dependence, i.e., dependence uncertainty.
Let us first recall the standard definitions of VaR and ES. Suppose $X$ is a random variable (rv) with distribution function (df) $F_X(x) = P(X \le x)$, $x \in \mathbb{R}$. For $\alpha \in (0,1)$, we then define:

$$\mathrm{VaR}_\alpha(X) = F_X^{-1}(\alpha) = \inf\{x \in \mathbb{R} : F_X(x) \ge \alpha\} \qquad (2)$$

and:

$$\mathrm{ES}_\alpha(X) = \frac{1}{1-\alpha} \int_\alpha^1 \mathrm{VaR}_u(X)\,du.$$

Whenever $F_X$ is continuous, it follows that:

$$\mathrm{ES}_\alpha(X) = E[X \mid X > \mathrm{VaR}_\alpha(X)],$$

leading to the standard interpretation of ES as a conditional expected loss. ES and its equivalents in the continuous setting are known under different names and abbreviations, such as TVaR, CVaR, CTE, TCE and AVaR. When discrete distributions are involved, the above-cited notions are no longer equivalent; see Acerbi and Tasche [16].
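The following short sketch (ours, not part of the original paper) shows how these two definitions translate into empirical estimators on a loss sample; the lognormal example and all parameter choices are illustrative assumptions only. Large positive values are interpreted as losses.

```python
import numpy as np

def var_es(sample: np.ndarray, alpha: float) -> tuple[float, float]:
    """Empirical VaR_alpha (lower alpha-quantile) and ES_alpha (tail average)."""
    x = np.sort(sample)
    n = len(x)
    k = int(np.ceil(alpha * n))            # smallest index with F_n(x) >= alpha
    var = x[k - 1]                         # empirical quantile, cf. Equation (2)
    es = x[k:].mean() if k < n else var    # average loss beyond VaR, cf. ES formula
    return var, es

rng = np.random.default_rng(1)
losses = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)
print(var_es(losses, 0.99))  # roughly (10.2, 15.2) for LN(0,1) at alpha = 0.99
```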
Our set-up is as follows.
Suppose $X_1, \dots, X_d$ are one-period risk positions with dfs $F_i(x) = P(X_i \le x)$, $i = 1, \dots, d$, also denoted as $X_i \sim F_i$. We assume $F_1, \dots, F_d$ to be known for the purpose of our discussion. In practice, this may correspond to models fitted to historical data or models chosen in a stress-testing environment. One could also envisage empirical dfs when sufficient data are available. It is important that, in the analyses and examples below, we disregard statistical uncertainty; this can and should be added in a full-scale discussion. As a consequence, for the purpose of this paper, MU should henceforth be interpreted as functional-MU at the level of inter-dependence rather than statistical-MU. Full MU would combine (at least) both.
Consider the portfolio position $X_d^+ = X_1 + \cdots + X_d$. The techniques discussed below would also allow for the analysis of other portfolio structures, like, for instance, $\Psi(X_1, \dots, X_d)$ for some measurable function $\Psi : \mathbb{R}^d \to \mathbb{R}$, $d$ typically large. MU results for such more general examples, however, need further detailed study; see Embrechts et al. [15] for some remarks on this.
Denote by $\mathrm{VaR}_\alpha(X_i)$, $i = 1, \dots, d$, the marginal VaRs at the common confidence level, $\alpha \in (0,1)$, typically close to one. For the moment, we concentrate on VaR as a risk measure, as it still is the regulatory benchmark. Other risk measures will appear later in the paper. A natural regulatory task would be to calculate $\mathrm{VaR}_\alpha(X_d^+)$ on the basis of this marginal information alone.
As stated, this task cannot be performed, since for the calculation of $\mathrm{VaR}_\alpha(X_d^+)$, we need a joint model for the random vector $X = (X_1, \dots, X_d)$. Under a specific joint model, the calculation of $\mathrm{VaR}_\alpha(X_d^+)$ amounts to a $d$-dimensional integral (or sum in the discrete case). Only in very few cases can this be done analytically. As a consequence, numerical integration and/or Monte Carlo methodology, including the use of quasi-random (low-discrepancy) techniques, may enter. For $\alpha$ close to one, tools from rare event simulation become important; see, for instance, Asmussen and Glynn [17], Chapter VI. For a more geometric approach, useful in lower dimensions, say $d = 2, 3$, see [18,19].
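To make the Monte Carlo route concrete, here is a hypothetical sketch (ours): we fix one particular joint model, a Gaussian copula with equicorrelation and lognormal margins (both our own illustrative assumptions), and estimate $\mathrm{VaR}_\alpha(X_d^+)$ from simulated aggregates.

```python
import numpy as np
from scipy.stats import norm

def mc_var_sum(d=5, rho=0.5, alpha=0.99, n=200_000, seed=1):
    rng = np.random.default_rng(seed)
    cov = rho * np.ones((d, d)) + (1 - rho) * np.eye(d)  # equicorrelation matrix
    z = rng.multivariate_normal(np.zeros(d), cov, size=n)
    u = norm.cdf(z)              # Gaussian copula sample in [0,1]^d
    x = np.exp(norm.ppf(u))      # LN(0,1) margins via the quantile transform
    # (here the two steps reduce to exp(z); the split shows the general
    #  copula-plus-margins recipe for an arbitrary joint model)
    s = x.sum(axis=1)            # aggregate loss X_1 + ... + X_d
    return np.quantile(s, alpha)

print(mc_var_sum())  # VaR_0.99 of the aggregate under this one joint model
```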
When we relax from a full joint distributional assumption (a single model) to a specific subclass of models, it may be possible to obtain some inequalities or asymptotics (in $d$ or $\alpha$, say) for $\mathrm{VaR}_\alpha(X_d^+)$. For instance, if $X$ is elliptically distributed, then, for $\alpha \in [0.5, 1)$:

$$\mathrm{VaR}_\alpha(X_1 + \cdots + X_d) \le \mathrm{VaR}_\alpha(X_1) + \cdots + \mathrm{VaR}_\alpha(X_d);$$
see (McNeil et al. [14], Theorem 6.8). An important subclass of the elliptical distributions is formed by the so-called multivariate normal variance mixture models, i.e., $X = \mu + \sqrt{W} A Z$, where:
- (i) $Z \sim N_k(0, I_k)$, where $I_k$ stands for the $k$-dimensional identity matrix;
- (ii) $W$ is a non-negative, scalar-valued rv, which is independent of $Z$; and
- (iii) $A \in \mathbb{R}^{d \times k}$ and $\mu \in \mathbb{R}^d$.
See (McNeil et al. [14], Section 3.2) for this definition. For the most general definition, based on affine transformations of spherical random vectors, see (McNeil et al. [14], Section 3.3.2). In many ways, elliptical models are like “heaven” for finance and risk management; see (McNeil et al. [14], Theorem 6.8 and Proposition 6.13). Unfortunately, in particular in moments of stress, the world of finance may be highly non-elliptical.
A further interesting class of models results for $X$ being comonotonic, i.e., there exist increasing functions $\psi_i : \mathbb{R} \to \mathbb{R}$, $i = 1, \dots, d$, and an rv, $Z$, so that:

$$(X_1, \dots, X_d) = (\psi_1(Z), \dots, \psi_d(Z)),$$

and in that case:

$$\mathrm{VaR}_\alpha(X_1 + \cdots + X_d) = \mathrm{VaR}_\alpha(X_1) + \cdots + \mathrm{VaR}_\alpha(X_d), \qquad (3)$$
i.e., VaR is comonotone additive. For a proof of Equation (3), see (McNeil et al. [14], Theorem 6.15). Recall that two risks (two rvs) with finite second moments are comonotone exactly when the joint model achieves maximal correlation, typically less than one; see (McNeil et al. [14], Theorem 5.25). Consequently, strictly superadditive risks, as in Equation (1), correspond to dependence structures with less than maximal correlation. This often leads to confusion amongst practitioners; it was also one of the reasons (a correlation pitfall) for writing Embrechts et al. [20]. Without extra model knowledge, we are hence led to calculating best-worst VaR bounds for $\mathrm{VaR}_\alpha(X_d^+)$ in the presence of dependence uncertainty:

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) = \sup\left\{\mathrm{VaR}_\alpha(X_1^F + \cdots + X_d^F) : F \text{ a df on } \mathbb{R}^d \text{ with marginal dfs } F_1, \dots, F_d\right\} \qquad (4)$$

and:

$$\underline{\mathrm{VaR}}_\alpha(X_d^+) = \inf\left\{\mathrm{VaR}_\alpha(X_1^F + \cdots + X_d^F) : F \text{ a df on } \mathbb{R}^d \text{ with marginal dfs } F_1, \dots, F_d\right\}. \qquad (5)$$
The terminology, best versus worst, of course, very much depends on the situation at hand: whether one is long or short in a trading environment or whether the bounds are interpreted by a regulator or a bank, say.
We further comment on notation: recall that the only available information so far is the marginal distributions of the risks, i.e., $X_i \sim F_i$, $i = 1, \dots, d$. Whenever we use a joint distribution function, $F$, with those given marginals for the vector, $X$, we write $X^F = (X_1^F, \dots, X_d^F)$ in order to highlight this choice; see Equations (4) and (5) above. We hope that the reader is fine with this slight abuse of notation.
Using the notion of copula, we may rephrase Equations (4) and (5) by applying Sklar’s Theorem; see (McNeil et al. [14], Theorem 5.3). Denote by $\mathcal{C}_d$ the set of all $d$-dimensional copulas; then Equations (4) and (5) are equivalent to:

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) = \sup_{C \in \mathcal{C}_d} \mathrm{VaR}_\alpha(X_1^C + \cdots + X_d^C) \qquad (6)$$

and:

$$\underline{\mathrm{VaR}}_\alpha(X_d^+) = \inf_{C \in \mathcal{C}_d} \mathrm{VaR}_\alpha(X_1^C + \cdots + X_d^C). \qquad (7)$$

As with $F$ above, the upper $C$-index highlights the fact that the joint df of $(X_1^C, \dots, X_d^C)$ is $C(F_1, \dots, F_d)$.
Rewriting the optimization problem (Equations (4) and (5)) in its equivalent copula form (Equations (6) and (7)) stresses the fact that, once we are given the marginal dfs, $F_i$, $i = 1, \dots, d$, solving for $\overline{\mathrm{VaR}}_\alpha(X_d^+)$ and $\underline{\mathrm{VaR}}_\alpha(X_d^+)$ amounts to finding the copulas which, together with the $F_i$’s, achieve these bounds. Hence, solving for Equations (4) and (5), or equivalently for Equations (6) and (7) (the set-up we will usually consider), one obtains the MU-interval for fixed marginals:

$$\underline{\mathrm{VaR}}_\alpha(X_d^+) \;\le\; \mathrm{VaR}_\alpha(X_1^C + \cdots + X_d^C) \;\le\; \overline{\mathrm{VaR}}_\alpha(X_d^+), \quad C \in \mathcal{C}_d. \qquad (8)$$

If an inequality in Equation (8) becomes an equality for a given copula, $C$, the corresponding copula, $C$, is referred to as an optimal coupling. A currently important area of research is finding the bounds in Equation (8), analytically and/or numerically, proving their sharpness under specific conditions and finding the corresponding optimal couplings.
The interval $[\underline{\mathrm{VaR}}_\alpha(X_d^+), \overline{\mathrm{VaR}}_\alpha(X_d^+)]$ yields a measure of MU across all possible joint models as a function of the inter-dependencies between the marginal factors (recall that we assume the $F_i$’s to be known!). So far, we assume no prior knowledge about the inter-dependence among the marginal risks, $X_1, \dots, X_d$. If extra, though still incomplete, information, like, for instance, “all $X_i$’s are positively correlated”, is available, then the above MU interval narrows. An important question becomes: can one quantify such MU? This is precisely the topic treated in Embrechts et al. [15], Barrieu and Scandolo [21], Bignozzi and Tsanakas [22]. There is a multitude of both analytic, as well as numeric (algorithmic) results. We consider three measures relevant for the MU discussion; the confidence level, $\alpha \in (0,1)$, is fixed.
Measure 1 The model-specific superadditivity ratio for the aggregate loss, $X_d^+$:

$$\delta_\alpha(X_d^+) = \frac{\mathrm{VaR}_\alpha(X_d^+)}{\mathrm{VaR}_\alpha^+(X_d^+)}, \qquad (9)$$

where we define $\mathrm{VaR}_\alpha^+(X_d^+) = \sum_{i=1}^d \mathrm{VaR}_\alpha(X_i)$, the comonotonic VaR of Equation (3). The superadditivity ratio measures the non-coherence, equivalently, the superadditivity gap, of VaR for a given joint model for $X$. As such, it yields an indication of how far VaR can be from being able to properly describe diversification.
Measure 2 The worst superadditivity ratio:

$$\Delta_\alpha(d) = \frac{\overline{\mathrm{VaR}}_\alpha(X_d^+)}{\mathrm{VaR}_\alpha^+(X_d^+)}, \qquad (10)$$

between the worst-possible VaR and the comonotonic VaR. It measures the superadditivity gap across all joint models with given marginals.
Measure 3 The ratio between the worst-possible ES and the worst-possible VaR:

$$\frac{\overline{\mathrm{ES}}_\alpha(X_d^+)}{\overline{\mathrm{VaR}}_\alpha(X_d^+)}. \qquad (11)$$

This relates to the question in Basel 3.5 cited in the Introduction.
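As a numerical illustration of Measure 1 (our example, not the paper’s), the following sketch estimates the superadditivity ratio of Equation (9) by Monte Carlo for independent Pareto risks: for very heavy tails, VaR is superadditive, as in Equation (1), even under independence.

```python
import numpy as np

def superadditivity_ratio(theta: float, d: int = 2, alpha: float = 0.99,
                          n: int = 1_000_000, seed: int = 3) -> float:
    """delta_alpha for d independent Pareto(theta) risks, F(x) = 1-(1+x)^(-theta)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=(n, d))
    x = (1 - u) ** (-1 / theta) - 1                        # Pareto sample
    var_sum = np.quantile(x.sum(axis=1), alpha)            # VaR of the aggregate
    var_plus = d * ((1 - alpha) ** (-1 / theta) - 1)       # exact comonotonic VaR
    return var_sum / var_plus

print(superadditivity_ratio(theta=0.8))  # > 1: superadditive despite independence
print(superadditivity_ratio(theta=3.0))  # < 1: here, diversification works
```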
In the next section, we discuss some of the methodological results leading to estimates for Equations (4)–(11); these are based on some very recent mathematical developments on dependence uncertainty. Section 4 contains several numerical examples. Section 5 addresses the robust backtesting question for VaR and comments on the possible change from VaR to ES for regulatory purposes. We draw conclusions in Section 6. As it stands, the paper has a dual goal: first, it provides a broadly accessible critical assessment of the VaR versus ES debate triggered by Basel 3.5; at the same time, we list several areas of ongoing and possible future research that may come out of these discussions.
3. Mathematical Developments on Dependence Uncertainty
Questions of the type of Equations (4)–(7) go back a long way in probability theory: an early solution for $d = 2$ was given independently by Makarov [23], a student of A. N. Kolmogorov from whom Makarov obtained the problem, and by Rüschendorf [24] with a different approach. This type of question belongs to a rather specialized area of multivariate probability theory and is mathematically non-trivial to answer. Although at the moment of writing this paper we still do not have complete answers, significant progress has recently been made, providing insight not only into the mathematical theory in this area, but also yielding answers to practically relevant questions.
To investigate problems with dependence uncertainty, like Equations (4)–(7), it is useful to define the set of all possible aggregations:

$$\mathcal{S}_d = \mathcal{S}_d(F_1, \dots, F_d) = \{X_1 + \cdots + X_d : X_i \sim F_i,\ i = 1, \dots, d\}.$$

Such problems lead to research on the probabilistic properties of, and statistical inference in, this set, $\mathcal{S}_d$ ($\mathcal{S}_d$ was formally introduced in Bernard et al. [25], but all prior research in this area dealt in some form or another with the same framework). For example, the questions (4) and (5) can be rephrased as:

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) = \sup\{\mathrm{VaR}_\alpha(S) : S \in \mathcal{S}_d\} \quad \text{and} \quad \underline{\mathrm{VaR}}_\alpha(X_d^+) = \inf\{\mathrm{VaR}_\alpha(S) : S \in \mathcal{S}_d\}. \qquad (12)$$
A full characterization of $\mathcal{S}_d$ is still out of reach; recently, however, significant progress has been made, especially in the so-called homogeneous case. We refer to the recent book Rüschendorf [26] for an overview of research on extremal problems with marginal constraints and dependence uncertainty. In particular, the book provides links between Equations (4)–(7) and copula theory, mass-transportation and financial risk analysis.
3.1. The Homogeneous Case
Let us first look at the case $F_1 = \cdots = F_d = F$, which we call the homogeneous model. For this model, analytical results are available. Analytical values for $\overline{\mathrm{VaR}}_\alpha(X_d^+)$ have been obtained in Wang et al. [27] and Puccetti and Rüschendorf [28] for the homogeneous model when the marginal distribution has a tail-decreasing density (such as the Pareto, Gamma or log-normal distributions). Wang et al. [27] also provide analytical expressions for $\underline{\mathrm{VaR}}_\alpha(X_d^+)$ for marginal distributions with a decreasing density. These results are summarized below.
Proposition 1 (Corollary 3.7 of Wang et al. [27], in a slightly different form). Suppose that the density function of $F$ is decreasing on $[b, \infty)$ for some $b \in \mathbb{R}$. Then, for $\alpha \in [F(b), 1)$ and $d \ge 3$,

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) = d\,E\!\left[X_1 \mid X_1 \in \left[F^{-1}(\alpha + (d-1)c_d),\; F^{-1}(1 - c_d)\right]\right], \qquad (13)$$

where $X_1 \sim F$ and $c_d$ is the smallest number in $[0, (1-\alpha)/d]$, such that:

$$\int_{\alpha+(d-1)c}^{1-c} F^{-1}(t)\,dt \;\ge\; \frac{1-\alpha-dc}{d}\left[(d-1)F^{-1}(\alpha+(d-1)c) + F^{-1}(1-c)\right], \quad c \in [c_d, (1-\alpha)/d]. \qquad (14)$$

Moreover, suppose that the density function of $F$ is decreasing on its support. Then, for $\alpha \in (0,1)$ and $d \ge 2$,

$$\underline{\mathrm{VaR}}_\alpha(X_d^+) = \max\left\{(d-1)F^{-1}(0) + F^{-1}(\alpha),\; d\,E[X_1 \mid X_1 \le F^{-1}(\alpha)]\right\}. \qquad (15)$$

Although the expressions (13)–(15) look somewhat complicated, they can be reformulated using the notion of duality, which dates back to Rüschendorf [24]; the resulting dual representation originated in the theory of mass-transportation. The following proposition provides an equivalent representation of Equation (13). It is stated in Puccetti and Rüschendorf [28], in a slightly modified form and under a more general condition of complete mixability.
Proposition 2 Under the same assumptions as in Proposition 1, suppose that, for any sufficiently large threshold, $s$, it is possible to find $a = a(s) < s/d$, such that:

$$D(s) = \frac{d}{s - da} \int_a^{s-(d-1)a} \overline{F}(t)\,dt, \qquad (16)$$

where $\overline{F} = 1 - F$, with $D(s) = \inf_{r < s/d} \frac{d}{s - dr} \int_r^{s-(d-1)r} \overline{F}(t)\,dt$. Then, for $\alpha$ sufficiently close to one, we have that:

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) = D^{-1}(1-\alpha). \qquad (17)$$

The proofs of Propositions 1 and 2 are based on the recently introduced and further developed mathematical concept of complete mixability.
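Before turning to complete mixability, we add a numerical sketch of the dual bound (our implementation, based on our reconstruction of Equations (16) and (17); the Pareto(2) margin and the parameters $d = 8$, $\alpha = 0.99$ are illustrative choices of ours). The tail integral is available in closed form, so $\overline{\mathrm{VaR}}_\alpha(X_d^+) = D^{-1}(1-\alpha)$ reduces to a one-dimensional optimization plus root-finding.

```python
import numpy as np
from scipy.optimize import minimize_scalar, brentq

theta, d = 2.0, 8   # homogeneous Pareto(theta) model, F(t) = 1 - (1+t)^(-theta)

def tail_integral(a, b):
    # closed form of int_a^b (1+t)^(-theta) dt, i.e. the integral of 1 - F
    return ((1 + a) ** (1 - theta) - (1 + b) ** (1 - theta)) / (theta - 1)

def D(s):
    obj = lambda a: d * tail_integral(a, s - (d - 1) * a) / (s - d * a)
    res = minimize_scalar(obj, bounds=(0.0, s / d * (1 - 1e-9)), method="bounded")
    return res.fun

alpha = 0.99
worst_var = brentq(lambda s: D(s) - (1 - alpha), 1.0, 1e6)   # D^{-1}(1 - alpha)
comonotonic_var = d * ((1 - alpha) ** (-1 / theta) - 1)       # VaR^+_alpha
print(worst_var, comonotonic_var, worst_var / comonotonic_var)  # last value ~ Delta_alpha(d)
```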
Definition 3.1 (Wang and Wang [29]). The marginal distribution, $F$, is said to be $d$-completely mixable if there exist rvs $X_1, \dots, X_d$ with df $F$, such that $X_1 + \cdots + X_d$ is almost surely constant.

Recent results on complete mixability are summarized in Wang and Wang [29] and Puccetti et al. [30]; an earlier study on the problem of constant sums can be found in Rüschendorf and Uckelmann [31], with ideas that originated in the early 1980s (see Rüschendorf [26]). A necessary and sufficient condition for distributions with monotone densities to be completely mixable is given in Wang and Wang [29]; it is used in the proof of the bounds in Propositions 1 and 2.
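A tiny worked example of Definition 3.1 (ours, for illustration): the standard normal df is $d$-completely mixable, since a jointly normal vector with pairwise correlation $-1/(d-1)$ has standard normal margins and $\mathrm{Var}(X_1 + \cdots + X_d) = d + d(d-1)\cdot(-1/(d-1)) = 0$.

```python
import numpy as np

rng = np.random.default_rng(11)
d = 3
# unit variances, pairwise correlation -1/(d-1) = -1/2: the sum is a.s. zero
cov = -0.5 * np.ones((d, d)) + 1.5 * np.eye(d)
x = rng.multivariate_normal(np.zeros(d), cov, size=5, check_valid="ignore")
print(x.sum(axis=1))   # ~ [0, 0, 0, 0, 0] up to floating-point noise
```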
Complete mixability, as opposed to comonotonicity, corresponds for this problem to extreme negative dependence. In other words, $\overline{\mathrm{VaR}}_\alpha(X_d^+)$ is obtained through a concept of extreme negative correlation (given that the correlation exists) between conditional distributions, instead of the maximal correlation discussed in Section 2. This rather counter-intuitive mathematical observation partially explains why VaR is non-subadditive and warns that regulatory or pricing criteria based on comonotonicity are not as conservative as one may think.
So far, conditional complete mixability and, hence, the sharpness of the dual bound for $\overline{\mathrm{VaR}}_\alpha(X_d^+)$ have only been shown for dfs satisfying the tail-decreasing density condition of Proposition 1. Of course, this condition is satisfied by most distributional models used in risk management practice. For such models, we are hence able to calculate the bounds in Equation (12), as well as $\Delta_\alpha(d)$ and $\overline{\mathrm{ES}}_\alpha(X_d^+)/\overline{\mathrm{VaR}}_\alpha(X_d^+)$ as defined in Equations (10) and (11). Examples will be given later.
3.2. Towards the Inhomogeneous Case
When the assumption of homogeneity ($F_1 = \cdots = F_d$) is removed, analytical results become much more challenging to obtain. The connection between Equation (12) and the concept of convex order turns out to be relevant. Relations between the two concepts were described in (Bernard et al. [25], Theorem 4.6) and (Bernard et al. [32], Theorem 2.4 and Appendix (A1)–(A4)). Let $U$ be a uniform rv on $[0,1]$, $F_i^{[\alpha]}$ be the distribution of $F_i^{-1}(\alpha + (1-\alpha)U)$ (upper $\alpha$-tail distribution) and $F_{i,[\alpha]}$ be the distribution of $F_i^{-1}(\alpha U)$ (lower $\alpha$-tail distribution), for $\alpha \in (0,1)$ and $i = 1, \dots, d$.
Proposition 3 Suppose the dfs $F_1, \dots, F_d$ have positive densities on their supports; then, for $\alpha \in (0,1)$,

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) = \sup\left\{\operatorname{ess\,inf} S : S \in \mathcal{S}_d\big(F_1^{[\alpha]}, \dots, F_d^{[\alpha]}\big)\right\} \qquad (18)$$

and:

$$\underline{\mathrm{VaR}}_\alpha(X_d^+) = \inf\left\{\operatorname{ess\,sup} S : S \in \mathcal{S}_d\big(F_{1,[\alpha]}, \dots, F_{d,[\alpha]}\big)\right\}, \qquad (19)$$

where the essential infimum of a random variable, $S$, is defined as $\operatorname{ess\,inf} S = \sup\{t \in \mathbb{R} : P(S < t) = 0\}$, and the essential supremum of a random variable, $S$, is defined as $\operatorname{ess\,sup} S = \inf\{t \in \mathbb{R} : P(S > t) = 0\}$.

As a consequence, it is shown that the worst VaR in Equation (18) is attained by a minimal element with respect to convex order in $\mathcal{S}_d(F_1^{[\alpha]}, \dots, F_d^{[\alpha]})$. A similar statement holds for $\underline{\mathrm{VaR}}_\alpha(X_d^+)$. In some cases, for instance, under assumptions of complete mixability, even the smallest elements with respect to convex order can be given, but in general, there may not be such a smallest element. Recent attempts to find analytical solutions for minimal convex ordering elements are summarized in Bernard et al. [25]. Based on current knowledge, we are only able to calculate Equation (12), $\Delta_\alpha(d)$ and $\overline{\mathrm{ES}}_\alpha(X_d^+)/\overline{\mathrm{VaR}}_\alpha(X_d^+)$ in the inhomogeneous case under fairly strong assumptions on the marginal dfs, for which the “sup-inf” and “inf-sup” problems are solvable. It is of interest that an algorithm, called the Rearrangement Algorithm (RA), has been introduced to approximate these worst (best)-case VaRs (see Numerical Optimization below).
Another important issue is the optimal coupling structure for the worst VaR. From Proposition 3, we can see that the interdependence (copula) between the random variables can be set arbitrarily in the lower region of the marginal supports, and only the tail dependence (in a region of probability $1-\alpha$ in each margin) matters for the worst VaR value. In the tail region, a smallest element in the convex order sense solves these “sup-inf” and “inf-sup” problems, (18) and (19). To be more precise, the individual risks are coupled in such a way that, conditional on their all being large, their sum is concentrated around a constant (ideally, the sum is a constant, but this is not realistic in many cases). That is why conditional complete mixability plays an important role in the optimal coupling for the worst VaR (the optimal coupling for the best VaR is similar, except that the conditioning region is now a (typically large) interval of probability $\alpha$). This also leads to the fact that information on overall correlation, such as the linear correlation coefficient or Spearman’s rho, may not directly affect the value of the worst VaR. Even with a constraint that the risks in a portfolio are uncorrelated or mildly correlated, the worst VaR may still be attained. This, to some extent, warns about the danger of using a single number as the dependence indicator in a quantitative risk management model. In the recent paper Bernard et al. [32], it has been shown that additional variance constraints may lead to essentially improved VaR bounds.
3.3. Numerical Optimization
Numerical methods are very useful for these optimization problems. One such method is the Rearrangement Algorithm (RA) introduced in Puccetti and Rüschendorf [33], which was modified, extended and further discussed, with applications to quantitative risk management, in Embrechts et al. [15]. The RA is a simple, but fast, algorithm designed to approximate convex minimal elements in a set, $\mathcal{S}_d$, through discretization. The RA allows for a fast and accurate computation of the bounds in Equation (12) for arbitrary marginal dfs, in the homogeneous, as well as the inhomogeneous, case. The method uses a discretization of the relevant quantile region ($[\alpha, 1]$, say), resulting in an $N \times d$ matrix on which, through successive operations, a matrix with minimal variance of the row-sums (think of complete mixability) is obtained. The RA can easily handle problems of large dimensionality, $d = 1,000$, say. For details and examples, see Embrechts et al. [15]. As argued in Bernard et al. [25], the numerical approximations obtained through the RA suggest that the bound (13) in Proposition 1 is sharp for all commonly used marginal distributions, and this without the requirement of a tail-decreasing density. Up to now, we do not have a formal proof of this. In Bernard et al. [32], an extension of the RA (called the ERA) is introduced and shown to give reliable bounds for the variance-constrained VaR.
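The following is a compact sketch of the RA (our simplified, single-matrix variant; Embrechts et al. [15] run lower and upper discretizations to bracket the bound, which a production implementation should also do). With the same Pareto(2) margins, $d = 8$ and $\alpha = 0.99$ as in the dual-bound sketch after Proposition 2, the two estimates should approximately agree.

```python
import numpy as np

def ra_worst_var(qfuns, alpha=0.99, N=2_000, tol=1e-9, max_iter=100):
    """qfuns: list of d marginal quantile functions F_i^{-1}.
    Discretize each upper alpha-tail into N midpoints and repeatedly
    rearrange each column to be oppositely ordered to the sum of the
    others; the minimal row sum then approximates VaR-bar_alpha(X_d^+)."""
    p = alpha + (1 - alpha) * (np.arange(N) + 0.5) / N   # midpoints of [alpha, 1]
    X = np.column_stack([q(p) for q in qfuns])
    rng = np.random.default_rng(0)
    for j in range(X.shape[1]):
        rng.shuffle(X[:, j])                             # random start helps
    best = -np.inf
    for _ in range(max_iter):
        for j in range(X.shape[1]):
            others = X.sum(axis=1) - X[:, j]
            # pair the largest values of column j with the smallest other-sums
            X[np.argsort(others), j] = np.sort(X[:, j])[::-1]
        m = X.sum(axis=1).min()
        if m - best <= tol:
            break
        best = m
    return best

theta, d, alpha = 2.0, 8, 0.99
q = lambda u: (1 - u) ** (-1 / theta) - 1                # Pareto(2) quantile function
print(ra_worst_var([q] * d, alpha))                      # ~ the dual-bound value above
```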
3.4. Asymptotic Equivalence of Worst VaR and Worst ES
For any random variable, $Y$, $\mathrm{VaR}_\alpha(Y)$ is bounded above by $\mathrm{ES}_\alpha(Y)$. As a consequence, the worst-case VaR is bounded above by the worst-case ES, i.e.,

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) \le \overline{\mathrm{ES}}_\alpha(X_d^+). \qquad (20)$$

Since $\overline{\mathrm{ES}}_\alpha(X_d^+) = \sum_{i=1}^d \mathrm{ES}_\alpha(X_i)$ (ES is subadditive and comonotone additive), Equation (20) gives a simple way to calculate upper bounds for the worst VaR. This bound implies that:

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) \le \sum_{i=1}^d \mathrm{ES}_\alpha(X_i). \qquad (21)$$
It was an observation made in Puccetti and Rüschendorf [34] that the bound in Equation (20) is asymptotically sharp under general conditions, i.e., an asymptotic equivalence of the worst VaR and the worst ES holds. The exact identity of worst-possible VaR and ES holds for bounded homogeneous portfolios when the common marginal distribution is completely mixable in its tail, as indicated in the following remark.
Remark 1 Assume that $F$ is a continuous distribution on a bounded interval, $[a, b]$. Assume also that $F$ is $d$-completely mixable in its upper $\alpha$-tail, i.e., there exists a random vector $(X_1, \dots, X_d)$, with $X_i \sim F^{[\alpha]}$, $i = 1, \dots, d$, and a constant, $k$, such that:

$$X_1 + \cdots + X_d = k \quad \text{almost surely}.$$

By the definition of VaR in Equation (2), we then have that:

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) \ge k \qquad (22)$$

and:

$$\overline{\mathrm{ES}}_\alpha(X_d^+) = d\,\mathrm{ES}_\alpha(X) = E[X_1 + \cdots + X_d] = k, \quad X \sim F. \qquad (23)$$

Because of Equation (20), Equations (22) and (23) imply:

$$\overline{\mathrm{VaR}}_\alpha(X_d^+) = \overline{\mathrm{ES}}_\alpha(X_d^+) = k.$$

The above example suggests a strong connection between $\overline{\mathrm{VaR}}_\alpha(X_d^+)$ and $\overline{\mathrm{ES}}_\alpha(X_d^+)$. Indeed, consider Equation (18), and note that, by the definition of $F_i^{[\alpha]}$, $E[S] = \sum_{i=1}^d \mathrm{ES}_\alpha(X_i) = \overline{\mathrm{ES}}_\alpha(X_d^+)$ for any $S \in \mathcal{S}_d(F_1^{[\alpha]}, \dots, F_d^{[\alpha]})$. Therefore, mathematically, the link between $\overline{\mathrm{VaR}}_\alpha(X_d^+)$ and $\overline{\mathrm{ES}}_\alpha(X_d^+)$ really concerns the difference between $\operatorname{ess\,inf} S$ and $E[S]$ for some $S$ in $\mathcal{S}_d(F_1^{[\alpha]}, \dots, F_d^{[\alpha]})$. Intuitively, such an $S$, which solves the “sup-inf” and “inf-sup” problems, should have a rather small value of $E[S] - \operatorname{ess\,inf} S$, leading to a small value of $\overline{\mathrm{ES}}_\alpha(X_d^+) - \overline{\mathrm{VaR}}_\alpha(X_d^+)$. Furthermore, the $c_d = 0$ case in Proposition 1 points in the same direction.
The asymptotic equivalence of the worst VaR and the worst ES was established in Puccetti and Rüschendorf [34] in the homogeneous case, based on the dual bounds in Embrechts and Puccetti [35] and an assumption of conditional complete mixability. The assumption was later weakened by Puccetti et al. [36] and Wang [37] and removed in Wang and Wang [38]. The following extension to inhomogeneous models in the general case was given in (Embrechts et al. [39], Theorem 3.3).
Proposition 4 (Asymptotic equivalence of worst VaR and worst ES). Suppose that the continuous distributions, $F_1, F_2, \dots$, satisfy, for some $\alpha \in (0,1)$ and $k > 1$:
- (a) $E[|X_i|^k]$, $i \in \mathbb{N}$, is uniformly bounded, and
- (b) $\liminf_{d \to \infty} \frac{1}{d} \sum_{i=1}^d \mathrm{ES}_\alpha(X_i) > 0$;

then:

$$\lim_{d \to \infty} \frac{\overline{\mathrm{VaR}}_\alpha(X_d^+)}{\overline{\mathrm{ES}}_\alpha(X_d^+)} = 1. \qquad (24)$$

In the homogeneous model, no assumption other than a finite and non-zero $\mathrm{ES}_\alpha(X_1)$ is required for Equation (24) to hold; see (Wang and Wang [38], Corollary 3.8). A notion of asymptotic mixability (asymptotically constant sum) leads to the asymptotic equivalence of the worst VaR and the worst ES (see Bernard et al. [32], Puccetti and Rüschendorf [40]), indicating that this equivalence is connected with the law of large numbers and, therefore, holds under general conditions. The equivalence (24) is also suggested by several numerical examples (see Examples 4.1 and 4.2 in Section 4). For research on the asymptotic equivalence (24) for general risk measures, we refer to Wang et al. [41].
Remark 2 An immediate consequence of Equation (24) is that, in the finite-mean, homogeneous case, with $F_1 = F_2 = \cdots = F$, as $d \to \infty$, we have that:

$$\Delta_\alpha(d) = \frac{\overline{\mathrm{VaR}}_\alpha(X_d^+)}{\mathrm{VaR}_\alpha^+(X_d^+)} \;\longrightarrow\; \frac{\mathrm{ES}_\alpha(X_1)}{\mathrm{VaR}_\alpha(X_1)}. \qquad (25)$$

Generally speaking, the worst superadditivity ratio, $\Delta_\alpha(d)$, is asymptotically $\overline{\mathrm{ES}}_\alpha(X_d^+)/\mathrm{VaR}_\alpha^+(X_d^+)$ in all homogeneous and inhomogeneous models of practical interest. In other words, we can say that the worst VaR is almost as extreme as the worst ES at the same confidence level for $d$ large. According to BCBS [4], $\mathrm{ES}_{0.975}$ is to be compared with $\mathrm{VaR}_{0.99}$; by Equation (24), the worst $\mathrm{VaR}_{0.99}$ is generally (much) larger than the worst $\mathrm{ES}_{0.975}$.
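For the homogeneous limit in Equation (25), the ratio $\mathrm{ES}_\alpha(X_1)/\mathrm{VaR}_\alpha(X_1)$ is available in closed form for standard margins; a quick check (our examples, using the usual Pareto and lognormal formulas):

```python
import numpy as np
from scipy.stats import norm

alpha = 0.99

def ratio_pareto(theta):
    # Pareto df F(x) = 1 - (1+x)^(-theta), x >= 0, theta > 1 (finite mean)
    var = (1 - alpha) ** (-1 / theta) - 1
    es = theta / (theta - 1) * (1 - alpha) ** (-1 / theta) - 1
    return es / var

def ratio_lognormal(mu=0.0, sigma=1.0):
    z = norm.ppf(alpha)
    var = np.exp(mu + sigma * z)
    es = np.exp(mu + sigma**2 / 2) * norm.cdf(sigma - z) / (1 - alpha)
    return es / var

print(ratio_pareto(2.0))    # ~2.11: limiting Delta_alpha(d) for Pareto(2) margins
print(ratio_lognormal())    # ~1.49: limiting Delta_alpha(d) for LN(0,1) margins
```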
It is also worth pointing out that the worst superadditivity ratio, $\Delta_\alpha(d)$, approaches infinity as $d \to \infty$ when the mean of the marginal df is infinite; this is consistent with Equation (25) and was shown in Puccetti and Rüschendorf [34]. Models leading to (estimated) infinite-mean risks have attracted considerable attention in the literature; see, for instance, the early contribution Nešlehová et al. [42] within the realm of operational risk and Delbaen [43] for a more methodological point of view. Clearly, ES is not defined in this case, nor does there exist any non-trivial coherent risk measure on the space of infinite-mean rvs; see Delbaen [43]. As a consequence, as VaR is always well defined, it may become a risk measure of “last resort”. We shall not enter into a more applied “pro versus contra” discussion of the topic here, but just stress the extra insight our results give within the context of Basel 3.5. In particular, note that for infinite-mean risks, the worst VaR grows much faster than the comonotonic one.
Remark 3 So far, we have mainly looked at the asymptotic properties when the portfolio dimension, $d$, becomes large, i.e., $d \to \infty$, as in Proposition 4. Alternatively, one could consider $d$ fixed and $\alpha \to 1$. The latter then quickly becomes a question in (Multivariate) Extreme Value Theory (EVT); indeed, for $\alpha$ close to one, one is concerned about the extreme tail-behavior of the underlying dfs. The reader interested in results of this type can, for instance, consult Mainik and Rüschendorf [44] and Mainik and Embrechts [45] and the references therein. Finally, one could also consider the joint asymptotic behavior where both $d \to \infty$ and $\alpha \to 1$ together in a coupled way; this would correspond to so-called Large Deviations Theory; see (Asmussen and Glynn [17], Section VI) for an introduction in the context of rare event simulation.
5. Robust Backtesting of Risk Measures
Finally, we want to comment further on Question 8 from BCBS [3]. It is fully clear that, if one wants to regulate a financial institution relying on a single number, ES is better than VaR concerning W1 and W3; however, both suffer from W2. In Section 3 and the examples in Section 4, we have seen that under a worst scenario of interdependence, both VaR and ES yield similar values. Backtesting VaR is fairly straightforward (hit-and-miss tests), whereas for ES, one has to assume an underlying model; for an EVT-based approach, see, for instance, (McNeil et al. [14], p. 168). Below, we will make the latter statement scientifically more precise.
If one, for instance, needs to compare different backtesting procedures for one measure, the situation for ES as compared to VaR is less favorable. An important notion here is elicitability; see Gneiting [52]. Point forecasts, in our case risk measures like VaR and ES, are functionals of the underlying data: they map a data vector to a real number, in some cases, an interval. A (statistical) functional is called elicitable if it can be defined as the minimizer of a suitable scoring function. Scoring functions are then used to compare competing forecasts through their average scores, calculated from the point forecasts and realized observations. In Gneiting [52], it is shown that, in general, VaR is elicitable, whereas ES is not. To this observation, the author adds the following statement: “The negative result [for ES] may challenge the use of the CVaR [ES] functional as a predictive measure of risk, and may provide a partial explanation for the lack of literature on the evaluation of CVaR forecasts, as opposed to quantile or VaR forecasts.” Recently, considerable progress has been made concerning the embedding of the statistical theory of elicitability within the mathematical theory of risk measures; see, for instance, Ziegel [53]. An interesting question that emerged early on from this research was: “Do there exist (non-trivial) coherent (i.e., subadditive) risk measures that are elicitable?” A positive answer is to be found in Ziegel [53]: the $\tau$-expectiles. The $\tau$-expectile, $e_\tau(X)$, for an rv, $X$, with $E[X^2] < \infty$, is defined as:

$$e_\tau(X) = \arg\min_{x \in \mathbb{R}} E\left[\tau (X - x)_+^2 + (1 - \tau)(x - X)_+^2\right].$$
For an rv, $X$, with $E|X| < \infty$, the $\tau$-expectile is the unique solution, $x$, of the equation:

$$\tau\,E[(X - x)_+] = (1 - \tau)\,E[(x - X)_+].$$

In particular, $e_{1/2}(X) = E[X]$. One can show that, for $\tau \in (0,1)$, $e_\tau$ is elicitable; for $\tau \ge 1/2$, $e_\tau$ is subadditive, whereas for $\tau \le 1/2$, $e_\tau$ is superadditive. Moreover, $e_\tau$ is not comonotone additive. In Bellini and Bignozzi [54], it is shown that, under a slight modification of elicitability, the only elicitable and coherent risk measures are the expectiles. We are not advocating $e_\tau$ as the risk measure to use, but mainly want to show the kind of research that is triggered by parts of Basel 3.5. For more information, see, for instance, Bellini et al. [55] and Delbaen [56]. Early contributions are to be found in Rémillard [57] (Section 4.4.4.1). We mention the above publications because, on p. 60 of BCBS [3], it is stated that “Spectral risk measures are a promising generalization of ES that is cited in the literature.” As mentioned above, it is shown that non-trivial law-invariant spectral risk measures, such as ES, are not elicitable. As a consequence, and this by the definition of elicitability, “objective comparison and backtesting of competing estimation procedures for spectral risk measures is difficult, if not impossible, in a decision theoretically sound manner”; see the Discussion section of Ziegel [53]. The latter paper also describes a possible approach to ES-prediction. Another recent paper, Emmer et al. [58], discusses the backtesting issues of popular risk measures and presents a discretized backtesting procedure for ES. Clearly, proper backtesting of spectral risk measures needs more research.
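A short sketch (ours) of how the $\tau$-expectile can be computed on a sample directly from the first-order condition above, by root-finding; the lognormal sample is an illustrative assumption.

```python
import numpy as np
from scipy.optimize import brentq

def expectile(sample: np.ndarray, tau: float) -> float:
    # first-order condition: tau*E[(X-x)_+] - (1-tau)*E[(x-X)_+] = 0,
    # which is strictly decreasing in x, so a sign change brackets the root
    f = lambda x: tau * np.mean(np.maximum(sample - x, 0)) \
        - (1 - tau) * np.mean(np.maximum(x - sample, 0))
    return brentq(f, sample.min(), sample.max())

rng = np.random.default_rng(5)
x = rng.lognormal(size=100_000)
print(expectile(x, 0.5), x.mean())  # e_{1/2} recovers the mean
print(expectile(x, 0.99))           # elicitable and subadditive for tau >= 1/2
```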
A different way of looking at the relative merits of VaR and ES as measures of financial risk is presented in Davis [59]. In the latter paper, the author uses the notion of prequential statistics and concludes: “The point [⋯] is that significant conditions must be imposed to secure the consistency of mean-type estimates [ES], in contrast to the situation for quantile estimates [VaR] (Theorem 5.2), where almost no conditions are imposed. [⋯] This seems to indicate—in line with the elicitability conclusions—that verifying the validity of mean-based estimates is essentially more problematic than the same problem for quantile-based statistics.” To what extent these conclusions tip the decision from an ES-based capital charge back to a VaR-based one is, at the moment, not yet clear.
Whereas the picture concerning backtesting across risk measures needs further discussion, the situation becomes even more blurred when a notion of robustness is added. In its broadest sense, robustness has to do with (in)sensitivity to underlying model deviations and/or data changes. Here, too, a whole new field of research is opening up; at the moment, it is difficult to point to the right approach. The quotes and the references below give the interested reader some insight into the underlying issues and different approaches. The spectrum goes from a purely statistical one, as in Huber and Ronchetti [60], to a more economic decision-making one, as in Hansen and Sargent [61]. In the former text, robustness mainly concerns so-called distributional robustness: what are the consequences when the shape of the actual underlying distribution deviates slightly from the assumed model? In the latter text, the emphasis lies more on robust control, in particular, how agents should cope with fear of model misspecification, and goes back to earlier work in statistics, mainly Whittle [62] (the first edition appeared already in 1963). The authors of Hansen and Sargent [61] provide the following advice: “If Peter Whittle wrote it, read it.” Finally, an area of research that also uses the term robustness and is highly relevant in the context of Section 3 is the field of Robust Optimization as, for instance, summarized in Ben-Tal et al. [63].
The main point of the comments above is that “there is more to robustness than meets the eye”. In many ways, in some form or another, robustness lies at the core of financial and insurance risk management. Below, we gather some quotes on the topic, which readers may find interesting for follow-up; we briefly add a comment when relevant in light of our paper as presented so far.
Quote 1 (from Stahl [64]): “Use stress testing based on mixture models [⋯] contamination.” In practice, one often uses so-called contamination; this amounts to model constructions of the type $(1-\epsilon)F + \epsilon G$ with $\epsilon \in (0,1)$ and $\epsilon$ typically small. In this case, the df, $F$, corresponds to “normal” behavior, whereas $G$ corresponds to a stress component. In Stahl [64], this approach is also championed and embedded in a broader Bayesian ansatz.
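A small simulation sketch of this contamination construction (all distributional choices and numbers are ours): even a 5% stress component moves VaR and, especially, ES noticeably.

```python
import numpy as np

def sample_mixture(n, eps, rng):
    """Sample from (1 - eps)*F + eps*G: F a 'normal' lognormal baseline,
    G a heavier, stressed lognormal component."""
    stressed = rng.uniform(size=n) < eps
    x = rng.lognormal(mean=0.0, sigma=1.0, size=n)                         # F
    x[stressed] = rng.lognormal(mean=1.0, sigma=2.0, size=stressed.sum())  # G
    return x

rng = np.random.default_rng(9)
for eps in (0.0, 0.05):
    x = sample_mixture(1_000_000, eps, rng)
    var = np.quantile(x, 0.99)
    es = x[x > var].mean()
    print(f"eps={eps}: VaR_0.99={var:.1f}, ES_0.99={es:.1f}")
```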
Quote 2 (from Cont et al. [65]): “Our results illustrate, in particular, that using recently proposed risk measures, such as CVaR/Expected Shortfall, leads to a less robust risk measurement procedure than Value-at-Risk.” The authors showed that, in general, quantile estimators are robust with respect to the weak topology, whereas coherent distortion estimators are not robust in the same sense; this is consistent with Hampel’s notion of robustness for L-statistics, as discussed in Huber and Ronchetti [60].
Quote 3 (from Kou et al. [12]): “Coherent risk measures are not robust.” The authors showed that VaR is more robust than ES with respect to a small change in the data, using tools such as influence functions and breakdown points; see, also, Kou and Peng [51] for similar results with respect to Hampel’s notion of robustness. The authors of [51] champion the Median Shortfall, which is defined as the median of the $\alpha$-tail distribution and is equal to a VaR at a higher confidence level.
Quote 4 (from Cambou and Filipović [66]): “ES is robust, and VaR is non-robust based on the notion of $\varphi$-divergence.”
Quote 5 (from Krätschmer et al. [67]): “We argue here that Hampel’s classical notion of quantitative robustness is not suitable for risk measurement, and we propose and analyse a refined notion of robustness that applies to tail-dependent law-invariant convex risk measures on Orlicz spaces.” These authors introduce an index of quantitative robustness. As a consequence, and this is somewhat in contrast to Quotes 2 and 3: “This new look at robustness will then help us to bring the argument against coherent risk measures back into perspective: robustness is not lost entirely, but only to some degree when VaR is replaced by a coherent risk measure, such as ES.”
Quote 6 (from Emmer et al. [58]): “With respect to the weak topology, most of the common risk measures are discontinuous. Therefore [⋯] in risk management, one usually considers robustness as continuity with respect to the Wasserstein distance [⋯]” “[⋯] mean, VaR, and Expected Shortfalls are continuous with respect to the Wasserstein distance.”
Quote 7 (from BCBS [4]): “This confidence level [97.5th percentile ES] will provide a broadly similar level of risk capture as the existing 99th percentile VaR threshold, while providing a number of benefits, including generally more stable model output and often less sensitivity to extreme outlier observations.”
Quote 8 (from Embrechts et al. [39]): “With respect to dependence uncertainty in aggregation, VaR is less robust compared to ES.” The authors introduce a notion of aggregation-robustness, under which ES and other spectral risk measures are robust. They also show that VaR generally exhibits a larger dependence-uncertainty spread than ES.
The above quotes hopefully bring the robustness in “robust backtesting” somewhat into perspective. More discussions with regulators are needed in order to understand what precisely is intended by this aspect of future regulation. As we have already stressed, the multi-faceted notion of robustness must be key to any financial business and, consequently, to regulation. In its broadest interpretation as “resilience against, or awareness of, model and data sensitivity”, this ought to be clear to all involved. How to make this awareness more tangible is a key task going forward.
6. Conclusions
The recent financial crises have shown how unreliably some quantitative tools perform in stormy markets. Through its Basel 3.5 documents, BCBS [3] and BCBS [8], the Basel Committee has opened up the discussion in order to make the international banking world a safer place for all involved. Admittedly, our contribution to the choice of risk measure for the potential supervision of market risk is a minor one and only touches upon a small aspect of the above regulatory documents. We do, however, hope that some of the methodology, examples and research reviews presented will contribute to a better understanding of the issues at hand.
On some of the issues, our views are clear, like “In the finite-mean case, ES is a superior risk measure to VaR in the sense of aggregation and answering the crucial what-if question”. The debate on the lack of proper aggregation has been ongoing within academia since VaR was introduced around 1994: in several practically relevant cases, VaR adds up wrongly, whereas ES always adds up correctly (subadditivity). More importantly, thinking in ES-terms makes risk managers concentrate more on the “what-if” question, whereas VaR-thinking is only concerned with the “if” question. In the infinite-mean case, ES cannot be used, whereas VaR remains well defined. Within (mainly environmental) economics, an interesting debate on risk management in the presence of infinite-mean risks is taking place; the key terminology here is “the Dismal Theorem”; see, for instance, Weitzman [68]. Moreover, in the finite-mean case, our results show that, quite generally, the conservative estimates provided by ES are not overly pessimistic when compared to the corresponding VaR estimates under worst-case dependence scenarios.
Both, however, remain statistical quantities, the estimation of which is marred by model risk and data scarcity. Thinking in terms of frequency rather than severity is also very much embedded in other fields of finance; think, for instance, of the calibrations used by rating agencies for securitization products (recall the (in)famous CDO senior equity tranches) or companies (transition and default probabilities). Backtesting models on data remains a crucial aspect throughout finance; elicitability and prequential forecasting add new aspects to this discussion. Robustness remains, for the moment at least, somewhat elusive. Our brief review of some of the recent work, motivated by Basel 3.5, will hopefully entice more academic, as well as practical, research and discussion of these very important themes.
The interested reader is advised to consult several of the references mentioned in the paper; the questions underlying Basel 3.5 have already led to interesting discussions (not to say controversies) among academics and practitioners. Especially on the importance of the subadditivity axiom, as well as on the interpretation of robustness, diverging views exist.