Article

Dependence Uncertainty Bounds for the Expectile of a Portfolio

1
RiskLab, Department of Mathematics, ETH Zurich, 8092 Zürich, Switzerland
2
Faculty of Economics, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Bruxelles, Belgium
*
Author to whom correspondence should be addressed.
Risks 2015, 3(4), 599-623; https://doi.org/10.3390/risks3040599
Submission received: 4 August 2015 / Accepted: 4 December 2015 / Published: 10 December 2015

Abstract
We study upper and lower bounds on the expectile risk measure of risky portfolios when the joint distribution of the risky components is not fully specified. First, we summarize methods for obtaining bounds when only the marginal distributions of the components are known, but not their interdependence (unconstrained bounds). In particular, we provide the best-possible upper bound and the best-possible lower bound (under some conditions), as well as numerical procedures to compute them. We also derive simple analytic bounds that appear adequate in various situations of interest. Second, we study bounds when some information on interdependence is available (constrained bounds). When the variance of the portfolio is known, a simple-to-compute upper bound is provided, and we illustrate that it may significantly improve the unconstrained upper bound. We also show that the unconstrained lower bound cannot be readily improved using variance information. Next, we derive improved bounds when the bivariate distributions of each of the risky components and a risk factor are known. When the factor induces a positive dependence among the components, it is typically possible to improve the unconstrained lower bound. Finally, the unconstrained dependence uncertainty spreads of expected shortfall, value-at-risk and the expectile are compared.

1. Introduction and Preliminaries

This paper aims to contribute to the broader academic discussion on the properties of risk measures relevant to risk management and regulation in the banking and insurance industry; see [1] and [2] for an overview. The two most well-known risk measures are the value-at-risk (VaR) and expected shortfall (ES),
$$\mathrm{VaR}_\alpha(X) = \inf\{x \in \mathbb{R} : P(X \le x) \ge \alpha\}, \qquad \mathrm{ES}_\beta(X) = \frac{1}{1-\beta} \int_\beta^1 \mathrm{VaR}_q(X)\,\mathrm{d}q,$$
where the latter is only defined for random variables (rvs) X with a finite expectation. While VaR is dominantly used in industry, it lacks the property of subadditivity and is thus not coherent in the sense of [3]. By contrast, ES, which is merely the average of all upper VaRs, is coherent. In fact, it is the smallest coherent risk measure that is more conservative than VaR (see [3]). Recently, [4] brought the issue of elicitability to the foreground. A risk measure is said to be elicitable if it is a minimizer of the expectation of some scoring function, which depends on the point forecast and the true observed loss. The work in [4] showed that VaR is elicitable (if the corresponding quantile is unique), but ES is not. While some authors interpret this to mean that ES cannot be back-tested (e.g., [5,6]), [7] argue that elicitability is relevant for relative comparisons between estimators, but not for absolute significance testing. Moreover, [8] show that the pair (VaR, ES) is jointly elicitable. Nevertheless, the question arises whether there are non-trivial coherent risk measures that are elicitable. In [9,10,11], it is shown that the only risk measure that is both elicitable and coherent is the expectile. The expectile is introduced in [12] as the minimizer of the expectation of an asymmetric quadratic scoring function,
$$e_\tau(X) = \arg\min_{e \in \mathbb{R}} E\big[\big(\tau\, 1_{\{X > e\}} + (1-\tau)\, 1_{\{X < e\}}\big)(X - e)^2\big].$$
It follows that $e_\tau(X)$ is the unique solution of the equation implied by the first order conditions (however, [13] points out that no differentiability or continuity of the distribution function is required):
$$(1-\tau)\, E\big[(e_\tau - X)\, 1_{\{X < e_\tau\}}\big] = \tau\, E\big[(X - e_\tau)\, 1_{\{X > e_\tau\}}\big]. \tag{1}$$
Expectiles are well known in regression analysis [14,15,16]; they are used for forecasting financial time series [17] and estimating VaR and ES [18]. A penalized least squares approach in portfolio optimization was suggested by [19]; the expectile is a special case when a quadratic downside penalty is used. The expectile is also closely related to the Omega performance measure [20]; see [21], p. 128. The expectile was first explicitly considered as a risk measure in [22], and the authors coined the acronym EVaR. This name was later adopted in other articles ([23,24]). However, even the original authors admit that this acronym was already used for economic-VaR [25] and recently also for entropic-VaR [26]. To avoid confusion, we shall use the notation e τ , as in [2,27] and [28], p. 290.
Throughout, we assume that the random variables represent losses. The expectile-based risk measure $e_\tau(X)$ is subadditive and thus coherent for $\tau \in [1/2, 1)$; for this property, as well as other features and representations, see [13,27]. A discussion on risk management with expectiles can be found in [24]. In the present paper, we further contribute to this discussion by examining their properties when aggregating risks.
Note that by rearranging Equation (1),
$$e_\tau = E[X] + \theta\, E\big[(X - e_\tau)\, 1_{\{X > e_\tau\}}\big], \quad \text{where } \theta := \frac{2\tau - 1}{1 - \tau} \ge 0 \text{ for } \tau \in [1/2, 1). \tag{2}$$
Thus, the expectile can also be interpreted as the insurance premium using the Dutch premium principle (see [29]), where the insurer buys an excess-of-loss reinsurance contract for any claim above the premium $e$, with loading factor $\theta$ (and applies zero loading for the retained part). From Equation (2), we observe that expectiles are only implicitly defined, and their computation appears cumbersome. However, if the loss distribution $X \sim F$ is known, the following approach can be used to compute the corresponding expectile. First, define the tail integral,¹ a function which will be useful for shorter notation,
$$\mathrm{TI}_X(x) := \int_x^\infty u\, \mathrm{d}F(u) = E\big[X\, 1_{\{X > x\}}\big].$$
Analytic expressions for $\mathrm{TI}_X$ are available for many commonly-used distributions, such as Pareto, log-normal, normal, Student t, exponential, gamma, and others. Next, applying Newton’s method to Equation (1) yields a practical iterative procedure for computing $e_\tau(X) = \lim_{k \to \infty} x_k$, given by:
$$x_{k+1} = \frac{(1-\tau)\, E[X] + (2\tau - 1)\, \mathrm{TI}_X(x_k)}{(1-\tau)\, p_k + \tau\, (1 - p_k)}, \tag{3}$$
where $p_k = F(x_k)$. An analogue for an empirical distribution using iterative reweighting is mentioned in [12] and stated explicitly in [14]. The convergence of this procedure is very fast. It is shown in [24] that for most common distributions, the expectiles are smaller than quantiles at level $\tau$ (for $\tau$ high enough), while for heavy-tailed (infinite variance) distributions, the opposite holds true. They coincide exactly (for all $\tau$) for a Student t distribution with $\nu = 2$ and asymptotically (as $\tau \to 1$) for the Pareto(2) distribution. Therefore, initializing the procedure at $x_0 = \mathrm{VaR}_\tau(X)$ appears reasonable.
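The iteration in Equation (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expectile_newton` and the choice of an Exp(1) loss (with $F(x) = 1 - e^{-x}$ and $\mathrm{TI}_X(x) = (x+1)e^{-x}$ for $x > 0$) are not from the text and serve purely as a test case.

```python
import math

def expectile_newton(tau, mean, cdf, tail_integral, x0, tol=1e-12, max_iter=100):
    """Iterate Equation (3) until successive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        p = cdf(x)
        x_new = ((1 - tau) * mean + (2 * tau - 1) * tail_integral(x)) / (
            (1 - tau) * p + tau * (1 - p)
        )
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Illustrative loss distribution (an assumption, not from the text): X ~ Exp(1).
def exp_cdf(x):
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def exp_ti(x):
    # E[X 1{X > x}] for X ~ Exp(1); equals E[X] = 1 when x <= 0.
    return (x + 1.0) * math.exp(-x) if x > 0 else 1.0

tau = 0.95
x0 = -math.log(1 - tau)  # VaR_tau(X), the suggested starting point
e_tau = expectile_newton(tau, 1.0, exp_cdf, exp_ti, x0)
```

For $\tau = 1/2$ the update returns $E[X]$ after one step, in line with $e_{1/2}(X) = E[X]$.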
While the expectile is a familiar object in regression analysis, its properties relevant to risk management are less studied. The focus of this paper is on risk aggregation and measurement under model uncertainty. Often the total (aggregate) loss that a company faces can be expressed as a sum $S = X_1 + \dots + X_d$, where the $X_i$ represent, e.g., the losses of different business lines or risk types. The risks $X_i$ are typically modeled separately, and little might be known about their interdependence. We will be interested in finding the range of values a risk measure $\rho(S)$ can take for different aggregate losses $S \in \mathcal{S}$, where $\mathcal{S}$ is the so-called admissible class containing all of the aggregate loss distributions that are consistent with the available marginal and dependence information. In particular, define the best-possible lower bound and the best-possible upper bound as:
$$\underline{\rho} = \inf\{\rho(S) : S \in \mathcal{S}\} \quad \text{and} \quad \overline{\rho} = \sup\{\rho(S) : S \in \mathcal{S}\},$$
where the risk measure ρ will be either VaR, ES or the expectile. The idea to assess the impact of (partial) dependence information on risk bounds has been explored in a series of recent papers; see [30,31,32,33,34,35,36]. In these papers, the risk measure used was the VaR. In this paper, we will mainly focus on the expectile as a challenger for VaR.
The paper is structured as follows: Section 2 considers the case when only marginal distributions are known. We provide the best-possible upper bound, as well as the best-possible lower bound (under some conditions) and provide numerical procedures to practically compute these bounds. We also provide weaker bounds and show they are close to the best-possible ones in various situations of interest. We study the location-scale family and provide analytical expressions for the best possible bounds in this context. Section 3 gives bounds when the mean and variance of the aggregate loss are known. In Section 4, we consider the availability of dependence information through factor models. We provide various bounds in this context, and the results of this and previous sections are applied in an example using the skew-t distribution. In Section 5, the width of the dependence uncertainty interval for the expectile is compared to that of VaR and ES. Finally, Section 6 summarizes the observations.

2. Bounds when Only the Marginal Distributions Are Known

Due to the curse of dimensionality, it is typically easier to statistically fit a one-dimensional distribution function (df) to each $X_i$ than to fit a multivariate distribution to $\mathbf{X} = (X_1, \dots, X_d)$. Under an idealized version of dependence uncertainty (DU), only the marginal distributions $X_i \sim F_i$, $i = 1, \dots, d$ are known, while the dependence structure (copula) is completely unknown. Hence, the aggregate loss $S$ can be any of the elements in the (Fréchet) admissible class $\mathcal{S}$,
$$\mathcal{S}(F_1, \dots, F_d) = \{X_1 + \dots + X_d : X_i \sim F_i,\ i = 1, \dots, d\}.$$
The (best-possible) bounds on the expectile are denoted by $\overline{e}_\tau$ and $\underline{e}_\tau$. To determine the bounds, it turns out to be sufficient to find elements in $\mathcal{S}(F_1, \dots, F_d)$ that are maximal, respectively, minimal in the sense of convex order.² We first recall the definition of this ordering concept and then connect it with $\overline{e}_\tau$ and $\underline{e}_\tau$.
Definition 1 (Convex order) Let $X$ and $Y$ be random variables, such that:
$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all convex functions } \phi : \mathbb{R} \to \mathbb{R},$$
provided the expectations exist. Then, $X$ is said to be smaller than $Y$ in the convex order ($X \le_{cx} Y$).
Consider the convex functions $\phi_e(x) = (x - e)\, 1_{\{x > e\}}$ indexed by $e \in \mathbb{R}$. We find that:³
$$X \le_{cx} Y \;\Longrightarrow\; E\big[(X - e)\, 1_{\{X > e\}}\big] \le E\big[(Y - e)\, 1_{\{Y > e\}}\big]\ \ \forall e \in \mathbb{R} \;\Longrightarrow\; e_\tau(X) \le e_\tau(Y). \tag{4}$$
In particular, this shows that the upper bound $\overline{e}_\tau$, resp., lower bound $\underline{e}_\tau$, is obtained if one can find the maximum, resp., minimum element, in the convex order sense in the admissible class $\mathcal{S}$. The last implication in Equation (4) comes from the following lemma by taking $G(e) = E[(Y - e)\, 1_{\{Y > e\}}]$ and noting that $E[X] = E[Y]$. Specifically, the following lemma connects bounds on the stop-loss premium $E[(S - e)\, 1_{\{S > e\}}]$ with bounds on $e_\tau(S)$.
Lemma 2. Suppose $G : \mathbb{R} \to \mathbb{R}$ is a non-increasing function, such that:
$$E\big[(S - e)\, 1_{\{S > e\}}\big] \le G(e) \quad \forall e \in \mathbb{R}. \tag{5}$$
Then, $e_\tau(S) \le e^*$, where:
$$e^* = \inf\Big\{e \in \mathbb{R} : e \ge E[S] + \frac{2\tau - 1}{1 - \tau}\, G(e)\Big\}.$$
Analogously, a lower bound on the stop-loss premium yields a lower bound on $e_\tau(S)$.
Proof. From Equation (5), it immediately follows that:
$$e \ge E[S] + \frac{2\tau - 1}{1 - \tau}\, G(e) \;\Longrightarrow\; e \ge E[S] + \frac{2\tau - 1}{1 - \tau}\, E\big[(S - e)\, 1_{\{S > e\}}\big].$$
Since for both inequalities, the right-hand side is non-increasing in $e$, the solution of Equation (2) (i.e., $e_\tau(S)$) must be less than or equal to $e^*$.  ☐
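Lemma 2 translates directly into a one-dimensional root search. The sketch below is a hedged illustration: the function name `expectile_upper_bound`, the bisection bracket, and the Exp(1) stop-loss premium used as $G$ are assumptions made for the example, not part of the paper. With the exact stop-loss premium the bound coincides with the expectile itself; a looser $G$ gives a larger bound.

```python
import math

def expectile_upper_bound(tau, mean, G, lo, hi, tol=1e-12):
    """Smallest e in [lo, hi] with e >= mean + theta * G(e), by bisection.

    Assumes G is non-increasing (so h below is increasing) and that the
    bracket satisfies h(lo) < 0 <= h(hi).
    """
    theta = (2 * tau - 1) / (1 - tau)

    def h(e):
        return e - mean - theta * G(e)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) >= 0:
            hi = mid
        else:
            lo = mid
    return hi

def stop_loss_exp(e):
    # E[(X - e) 1{X > e}] for X ~ Exp(1) (illustrative choice of G).
    return math.exp(-e) if e > 0 else 1.0 - e

tau = 0.95
e_exact = expectile_upper_bound(tau, 1.0, stop_loss_exp, 0.0, 50.0)
e_loose = expectile_upper_bound(tau, 1.0, lambda e: 2 * stop_loss_exp(e), 0.0, 50.0)
```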

2.1. Upper Bound with Marginal Information

It is shown in [40] that the comonotonic dependence structure leads to the maximal element (denoted $S^c$) with respect to the convex order in the admissible class $\mathcal{S}$.
$$\forall S \in \mathcal{S} : \quad S \le_{cx} S^c := \sum_{i=1}^d F_i^{-1}(U), \quad \text{where } U \sim U(0,1). \tag{7}$$
Hence, we find that $\overline{e}_\tau = e_\tau(S^c)$. In the case of identical margins $F_i = F_1$, $i = 2, \dots, d$, using positive homogeneity, this simplifies to:
$$\overline{e}_\tau = e_\tau(d X_1) = d\, e_\tau(X_1) = \sum_{i=1}^d e_\tau(X_i).$$
In general, however, the expectile is not comonotone additive, and hence, the upper bound $\overline{e}_\tau$ often needs to be computed numerically. Unfortunately, the df of $S^c$ is typically not available in an analytical form, so the iterative procedure Equation (3) is more difficult to apply, since it would involve a nested root search. In particular, to compute $\mathrm{TI}_{S^c}(x_k)$ at each step, one would first need to find a $p_k$, such that $F_{S^c}^{-1}(p_k) = x_k$, and then sum up the tail integrals $\mathrm{TI}_i(F_i^{-1}(p_k))$ for the margins.
Since $S^c$ is defined in Equation (7) in terms of its quantile function $F_{S^c}^{-1}$, it is easier to work in terms of the probability level $p$ corresponding to the expectile. For continuous marginal distributions with densities $f_i$, we can again apply Newton’s method by differentiating Equation (1) with respect to $p$ using the chain rule. This yields an iterative procedure for computing $e_\tau(S^c) = F_{S^c}^{-1}(p)$ in terms of $p = \lim_{k \to \infty} p_k$, given by:
$$p_{k+1} = p_k - \frac{e_k}{B(p_k)} + \frac{\sum_{i=1}^d \big[(1-\tau)\, E[X_i] + (2\tau - 1)\, \mathrm{TI}_i\big(F_i^{-1}(p_k)\big)\big]}{B(p_k)\, \big(p_k (1-\tau) + (1 - p_k)\, \tau\big)}, \tag{8}$$
where $e_k = F_{S^c}^{-1}(p_k)$ and:
$$B(u) = \big(F_{S^c}^{-1}\big)'(u) = \sum_{i=1}^d \big(F_i^{-1}\big)'(u) = \sum_{i=1}^d \frac{1}{f_i\big(F_i^{-1}(u)\big)}, \quad u \in (0,1).$$
Again, since analytic expressions for the mean, the tail integral and inverse df of parametric marginal distributions are often available, this is a very fast and accurate method. It is possible that Equation (8) yields $p_{k+1} \ge 1$, in which case we can take $p_{k+1} = (p_k + 1)/2$ instead; similarly, if $p_{k+1} \le 0$, set $p_{k+1} = p_k / 2$. Analogously to Equation (3), it is reasonable to initiate the procedure at $p_0 = \tau$.
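A minimal sketch of the iteration in Equation (8), including the safeguards for $p_{k+1}$ leaving $(0,1)$, might look as follows. The helper names (`comonotonic_expectile`, `exp_margin`, `lomax_margin`) and the two example margins are hypothetical; in particular, the Pareto margin is taken here in the Lomax form $F(x) = 1 - (1+x)^{-\theta}$, which is an assumption about the parametrization.

```python
import math

def exp_margin(rate):
    """(mean, quantile, density, tail integral) of an exponential margin."""
    return (
        1.0 / rate,
        lambda p: -math.log(1.0 - p) / rate,
        lambda x: rate * math.exp(-rate * x),
        lambda x: (x + 1.0 / rate) * math.exp(-rate * x),
    )

def lomax_margin(theta):
    """Same tuple for F(x) = 1 - (1 + x)^(-theta), theta > 1 (assumed form)."""
    return (
        1.0 / (theta - 1.0),
        lambda p: (1.0 - p) ** (-1.0 / theta) - 1.0,
        lambda x: theta * (1.0 + x) ** (-theta - 1.0),
        lambda x: x * (1.0 + x) ** (-theta) + (1.0 + x) ** (1.0 - theta) / (theta - 1.0),
    )

def comonotonic_expectile(tau, margins, tol=1e-13, max_iter=200):
    """Newton iteration of Equation (8) on the probability level p."""
    p = tau  # suggested starting value p_0 = tau
    for _ in range(max_iter):
        q = [m[1](p) for m in margins]                      # F_i^{-1}(p_k)
        e_k = sum(q)                                        # F_{S^c}^{-1}(p_k)
        B = sum(1.0 / m[2](x) for m, x in zip(margins, q))  # derivative of the quantile of S^c
        num = sum((1 - tau) * m[0] + (2 * tau - 1) * m[3](x)
                  for m, x in zip(margins, q))
        p_new = p - e_k / B + num / (B * (p * (1 - tau) + (1 - p) * tau))
        if p_new >= 1:        # safeguard described in the text
            p_new = (p + 1) / 2
        elif p_new <= 0:
            p_new = p / 2
        converged = abs(p_new - p) < tol
        p = p_new
        if converged:
            break
    return sum(m[1](p) for m in margins), p

tau = 0.95
e_sc, p_star = comonotonic_expectile(tau, [exp_margin(1.0), lomax_margin(3.0)])
```

Calling the function with a single margin returns that margin's own expectile, which gives a quick way to compare the result with the sum of the marginal expectiles.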
In general, by subadditivity (recall that we use $\tau \in [1/2, 1)$),
$$e_\tau(S) \le \sum_{i=1}^d e_\tau(X_i) =: e_\tau^+, \tag{9}$$
so $e_\tau^+$ is a valid upper bound, too, but it is typically not the best possible. In Section 4.3, $e_\tau^+$ is computed, as well as the best-possible upper bound in an example with skew-t distributions; we can observe that they are very close in all cases.

2.2. Lower Bound with Marginal Information

The analysis of the lower bound is more involved. We first observe that:
$$\underline{e}_\tau \ge E[S] = \sum_{i=1}^d E[X_i]. \tag{10}$$
This can be seen either by applying Jensen’s inequality to the degenerate random variable $m = E[S]$ to show $m \le_{cx} S$, $\forall S \in \mathcal{S}$, or by noting that $e_{1/2}(S) = E[S]$ and that $e_\tau(S)$ is increasing in $\tau$ (see, e.g., [24] for these and other properties). If the admissible class $\mathcal{S}$ contains the constant $E[S]$, then this is the smallest element in convex order, and the lower bound Equation (10) is attained (sharp). This situation is achieved when the components $X_i$ are “compensating” for each other and corresponds to the notion of joint mixability, which was formally introduced in [41] and extends the concept of complete mixability ([42]) to the inhomogeneous case; see also [43] for an overview of these and related concepts. Precisely, a distribution $F$ is called $d$-completely mixable if there exist rvs $X_i \sim F$, $i = 1, \dots, d$, such that $\sum_{i=1}^d X_i = dc$ a.s., $c \in \mathbb{R}$. Analogously, a $d$-tuple of dfs $(F_1, \dots, F_d)$ is called jointly mixable if there exist rvs $X_i \sim F_i$, such that $\sum_{i=1}^d X_i = k$ a.s., $k \in \mathbb{R}$. Another concept that leads to an explicit smallest element in the Fréchet admissible class is that of mutual exclusivity, which requires that the margins have a large probability mass at an endpoint of the support; see [44].
In general, however, the dependence structure that leads to the smallest element in the convex order is known only for d = 2 (countermonotonicity). If d > 2 , for distributions that are bounded below (and satisfy some further conditions), in the case of identical margins, one can use the method from [45] (involves solving an integral equation), or for different margins, the method from [46] (requires solving a functional equation). A general, but approximate method is the rearrangement algorithm (RA) (see [47,48,49]), which is based on a discretization of the margins. In the following, we describe a simple, yet necessary modification of the RA that provides improved approximations for the best-possible lower bounds in the case of expectation-based risk measures, such as ES and the expectile.

2.2.1. Rearrangement Algorithm

The most general method currently available for computing lower bounds for the common risk measures is the RA; see [47,48,49]. While the quantile-based RA performs well when computing bounds on the (quantile-based) VaR, [50] indicates that for heavy-tailed margins, the RA lower bound for ES ([48]) is not sharp, because the tail expectation is underestimated due to discretization. Since expectiles are also defined in terms of the tail expectation (see Equation (2)), it is important to address this issue of the RA. In the following, we first recall the standard discretization of the RA for each margin $i = 1, \dots, d$ and next provide two modifications that we further investigate.
Standard RA: $x_{k,i} = F_i^{-1}\big(\tfrac{k-1}{N}\big)$, $k = 1, \dots, N$.
Midpoint RA: $x_{k,i} = F_i^{-1}\big(\tfrac{k-1/2}{N}\big)$, $k = 1, \dots, N$.
Expectation RA: $x_{k,i} = F_i^{-1}\big(\tfrac{k-1/2}{N}\big)$, $k = 2, \dots, N-1$,
$x_{N,i} = \mathrm{ES}_{1-1/N}(X_i)$, $x_{1,i} = \mathrm{LES}_{1/N}(X_i) := N \int_0^{1/N} F_i^{-1}(q)\, \mathrm{d}q$.
While the standard discretization may seem conservative, it is nonetheless an approximate lower bound, since the RA may stop at a suboptimal rearrangement (see [48]). Moreover, if the distribution is unbounded below, then $F_i^{-1}(0) = -\infty$. For high $N$ (such that $pN > d$), this would still give a finite lower bound for ES; however, it would be undefined for the expectile, since it depends on both the upper and the lower tail of the aggregate distribution. The midpoint RA avoids this problem, but still underestimates tail expectations; the expectation RA should solve both issues. To evaluate and compare the sharpness of the bounds obtained using the different discretizations, we consider the homogeneous case with Pareto marginals. In this case, the exact lower bound on ES can be obtained using the method in [45]. In Table 1, the resulting underestimation errors are listed. Observe that the midpoint RA improves the results considerably, but the errors are still noticeable. The expectation RA, however, gives results that are within 0.1% of the true lower bound in almost all cases considered (at most 0.2%). In light of this, we will use the “expectation” version of RA for computing the unconstrained lower bounds on the expectile (i.e., $\underline{e}_\tau$) in Section 4.3, since it provides more accurate estimates of the tail expectations. One final adjustment concerns the stopping condition for RA. In [48], the RA stops when an iteration reduces the ES by less than a pre-defined $\varepsilon$. To match the stopping condition with the objective, we stop the RA when the reduction in $e_\tau$ becomes smaller than $\varepsilon$. This means that the expectile of the current rearrangement needs to be computed at each iteration. However, the expectile of the previous iteration of the RA makes a very good initial guess for Equation (3), so this is not time-consuming. To summarize, the expectation RA only changes the way the margins are discretized and the stopping condition; the rest of the RA remains as in [48].
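The three discretizations can be compared on a single margin with a short sketch. The function `discretize` and the Lomax form of the Pareto margin ($F(x) = 1 - (1+x)^{-\theta}$, with closed-form ES and LES) are assumptions for illustration only; the rearrangement step itself is not shown here.

```python
def discretize(quantile, es_upper, les_lower, N, method):
    """Return the N support points x_{1,i}, ..., x_{N,i} for one margin."""
    if method == "standard":
        return [quantile((k - 1) / N) for k in range(1, N + 1)]
    xs = [quantile((k - 0.5) / N) for k in range(1, N + 1)]  # midpoints
    if method == "midpoint":
        return xs
    if method == "expectation":
        xs[0] = les_lower(1 / N)      # LES_{1/N}: mean of the lowest cell
        xs[-1] = es_upper(1 - 1 / N)  # ES_{1-1/N}: mean of the highest cell
        return xs
    raise ValueError(f"unknown method: {method}")

# Illustrative margin (assumed parametrization): F(x) = 1 - (1 + x)^(-theta).
theta = 2.0
c = theta / (theta - 1.0)

def quantile(q):
    return (1.0 - q) ** (-1.0 / theta) - 1.0

def es_upper(p):
    # (1 / (1 - p)) * integral of the quantile function over (p, 1)
    return c * (1.0 - p) ** (-1.0 / theta) - 1.0

def les_lower(p):
    # (1 / p) * integral of the quantile function over (0, p)
    return c * (1.0 - (1.0 - p) ** (1.0 - 1.0 / theta)) / p - 1.0

N = 1000
grids = {name: discretize(quantile, es_upper, les_lower, N, name)
         for name in ("standard", "midpoint", "expectation")}
means = {name: sum(xs) / N for name, xs in grids.items()}
```

For this margin the true mean is $1/(\theta - 1) = 1$; all three discretized means fall below it, with the expectation version closest, which mirrors the ordering of the errors reported in Table 1.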
For other recent developments on the RA, we refer to https://sites.google.com/site/RearrangementAlgorithm/ and [51].
Table 1. Relative underestimation as a percent of the exact $\underline{\mathrm{ES}}_p$ for $d$ Pareto($\theta$) distributions, using the rearrangement algorithm (RA) with different discretizations of size $N = 10^5$ and stopping condition $\varepsilon = 10^{-4}$.

| Standard RA | p = 0.95 | | | | | | p = 0.99 | | | | | |
| d | 1 | 2 | 3 | 4 | 5 | 8 | 1 | 2 | 3 | 4 | 5 | 8 |
| θ = 5 | 0.1 | 0.2 | 0.2 | 0.2 | 0.3 | 0.4 | 0.3 | 0.5 | 0.6 | 0.7 | 0.8 | 1.2 |
| θ = 3 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 1.0 | 0.7 | 1.1 | 1.4 | 1.7 | 1.9 | 2.6 |
| θ = 2.5 | 0.5 | 0.7 | 0.9 | 1.0 | 1.2 | 1.5 | 1.2 | 1.7 | 2.2 | 2.6 | 2.9 | 3.8 |
| θ = 2 | 1.2 | 1.6 | 1.9 | 2.2 | 2.5 | 3.1 | 2.5 | 3.4 | 4.2 | 4.9 | 5.4 | 6.9 |
| θ = 1.5 | 5.3 | 6.6 | 7.6 | 8.3 | 9.0 | 10.6 | 9.0 | 11.5 | 13.4 | 14.9 | 16.2 | 19.4 |

| Midpoint RA | p = 0.95 | | | | | | p = 0.99 | | | | | |
| d | 1 | 2 | 3 | 4 | 5 | 8 | 1 | 2 | 3 | 4 | 5 | 8 |
| θ = 5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.2 | 0.2 |
| θ = 3 | 0.1 | 0.1 | 0.1 | 0.2 | 0.2 | 0.3 | 0.2 | 0.3 | 0.4 | 0.5 | 0.5 | 0.7 |
| θ = 2.5 | 0.2 | 0.2 | 0.3 | 0.3 | 0.4 | 0.5 | 0.4 | 0.6 | 0.7 | 0.8 | 1.0 | 1.3 |
| θ = 2 | 0.5 | 0.7 | 0.8 | 0.9 | 1.0 | 1.3 | 1.0 | 1.4 | 1.7 | 2.0 | 2.2 | 2.8 |
| θ = 1.5 | 3.0 | 3.8 | 4.3 | 4.7 | 5.1 | 6.0 | 5.1 | 6.5 | 7.5 | 8.3 | 9.0 | 10.6 |

| Expectation RA | p = 0.95 | | | | | | p = 0.99 | | | | | |
| d | 1 | 2 | 3 | 4 | 5 | 8 | 1 | 2 | 3 | 4 | 5 | 8 |
| θ = 5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| θ = 3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| θ = 2.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 |
| θ = 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.1 | 0.1 | 0.1 |
| θ = 1.5 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.2 |

2.3. Example: Location-Scale Family

We assume that the $X_i$ belong to the same location-scale family of dfs, i.e., $F_i(\cdot) = F((\cdot - \mu_i)/\sigma_i)$, $i = 1, \dots, d$ for some df $F$. Denote also $\mu = \sum_{i=1}^d \mu_i$ and $\sigma = \sum_{i=1}^d \sigma_i$.

2.3.1. Upper Bound

By Equation (7), the convex order-maximal element in $\mathcal{S}(F_1, \dots, F_d)$ is given by:
$$S^c = \sum_{i=1}^d F_i^{-1}(U) = \mu + \sigma F^{-1}(U), \quad \text{where } U \sim U(0,1).$$
Hence,
$$\overline{e}_\tau = \mu + \sigma\, e_\tau\big(F^{-1}(U)\big) = e_\tau^+,$$
which can be computed using the procedure described in Equation (3). This also means that when the margins have the same shape, the bound based on subadditivity Equation (9) is the best possible.

2.3.2. Lower Bound

As mentioned before, obtaining an element in the admissible class that is minimum in the sense of convex order is often difficult or not even possible to achieve. However, in case the marginal dfs are from the same location-scale family that is symmetric, we can express the minimal element in the admissible class explicitly and, thus, also obtain the best-possible lower bound on the expectile.
Theorem 3. Let $X_i \sim F_i(\cdot) = F((\cdot - \mu_i)/\sigma_i)$, $i = 1, \dots, d$ belong to the location-scale family of a symmetric df $F$. Suppose without loss of generality that $\sigma_1 \ge \sigma_i$, $i = 2, \dots, d$.
(i) If $\sigma_1 \ge \sum_{i=2}^d \sigma_i$, then a minimal element in $\mathcal{S}(F_1, \dots, F_d)$ in convex order is:
$$S^{-} = F_1^{-1}(U) + \sum_{i=2}^d F_i^{-1}(1 - U) = \sum_{i=1}^d \mu_i + \Big(\sigma_1 - \sum_{i=2}^d \sigma_i\Big) F^{-1}(U), \quad \text{where } U \sim U(0,1).$$
Correspondingly,
$$\underline{e}_\tau = \sum_{i=1}^d \mu_i + \Big(\sigma_1 - \sum_{i=2}^d \sigma_i\Big)\, e_\tau\big(F^{-1}(U)\big).$$
(ii) Otherwise, if $F$ furthermore admits a unimodal density, then the minimal element in the admissible class is the constant $\mu = \sum_{i=1}^d \mu_i$, and thus, $\underline{e}_\tau = \mu$.
Proof. (i) The case $\sigma_1 = \sum_{i=2}^d \sigma_i$ is trivial. If $\sigma_1 > \sum_{i=2}^d \sigma_i$, we use the well-known fact that the convex order is consistent with the ordering of expected shortfall (note that $\mathrm{VaR}_u(X) = F_X^{-1}(u)$). In particular, Theorem 3.A.5 in [52] states that:
$$X \le_{cx} Y \;\Longleftrightarrow\; \int_p^1 F_X^{-1}(u)\, \mathrm{d}u \le \int_p^1 F_Y^{-1}(u)\, \mathrm{d}u \ \ \forall p \in (0,1) \ \text{ and } \ E[X] = E[Y].$$
Clearly, $X_1^{-} = F_1^{-1}(U)$ and $X_i^{-} = F_i^{-1}(1 - U)$, $i = 2, \dots, d$ have the required dfs, so $S^{-} = \sum_{i=1}^d X_i^{-} \in \mathcal{S}$. If $F$ is continuous, then for any $S = \sum_{i=1}^d X_i \in \mathcal{S}$ and any $p \in (0,1)$, we have that:
$$\int_p^1 F_{S^{-}}^{-1}(u)\, \mathrm{d}u = E\big[S^{-}\, 1_{\{X_1^{-} > F_1^{-1}(p)\}}\big] \le E\big[S\, 1_{\{X_1 > F_1^{-1}(p)\}}\big] \le \int_p^1 F_S^{-1}(u)\, \mathrm{d}u. \tag{11}$$
The first inequality follows from the fact that $\{X_1^{-} > F_1^{-1}(p)\} = \{X_i^{-} < F_i^{-1}(1 - p)\}$, $i = 2, \dots, d$ and $A = \{X_i < F_i^{-1}(1 - p)\}$ minimizes $E[X_i 1_A]$ over events $A$ of probability $1 - p$. Similarly, the second inequality follows because $E[S\, 1_A]$ is maximal when $A = \{S > F_S^{-1}(p)\}$.
If $F$ is not continuous, then the indicators in Equation (11) need to be augmented by adding sets, such as:
$$\Big\{X_1 = F_1^{-1}(p),\ V \le \frac{P\big(X_1 \le F_1^{-1}(p)\big) - p}{P\big(X_1 = F_1^{-1}(p)\big)}\Big\}, \quad \text{where } V \sim U(0,1) \text{ (independent)}$$
to the first one (and similarly for the others). Since the dfs are symmetric and belong to a location-scale family, the atoms at $F^{-1}(p)$, $F_1^{-1}(p)$ and $F_i^{-1}(1 - p)$, $i = 2, \dots, d$ are of the same size.
In Case (ii), the rvs are jointly mixable by Corollary 3.6 in [41], and hence, the result follows.  ☐
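For a symmetric, unimodal location-scale family, the best-possible bounds of Section 2.3 reduce to a few arithmetic operations once the expectile of the standardized df is available. The sketch below takes $F$ to be standard normal (an illustrative assumption, using $E[Z\, 1_{\{Z > x\}}] = \varphi(x)$); the function names are hypothetical.

```python
import math

def std_normal_expectile(tau, tol=1e-12, max_iter=100):
    """Expectile of N(0, 1) via the iteration of Equation (3), started at 0."""
    x = 0.0
    for _ in range(max_iter):
        p = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))           # Phi(x)
        ti = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)  # E[Z 1{Z > x}] = phi(x)
        x_new = (2 * tau - 1) * ti / ((1 - tau) * p + tau * (1 - p))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def location_scale_bounds(mus, sigmas, e_std):
    """Lower and upper expectile bounds for a symmetric unimodal family.

    e_std is the expectile of the standardized df F at the chosen level.
    """
    mu, sigma = sum(mus), sum(sigmas)
    # Case (i): largest scale minus the others; case (ii): zero (jointly mixable).
    sigma_min = max(0.0, 2.0 * max(sigmas) - sigma)
    return mu + sigma_min * e_std, mu + sigma * e_std

tau = 0.95
e_std = std_normal_expectile(tau)
lower, upper = location_scale_bounds([0.0, 1.0, 2.0], [3.0, 1.0, 1.0], e_std)
lower_mix, upper_mix = location_scale_bounds([0.0, 1.0, 2.0], [1.0, 1.0, 1.0], e_std)
```

In the second call no single scale dominates the sum of the others, so the lower bound collapses to the mean $\mu$.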
Note that Theorem 3 is of interest beyond the context of expectiles, as the convex order least element also yields lower bounds on, e.g., variance and ES in this admissible class. An early result in this direction was [53], where identical symmetric unimodal distributions with a differentiable density are considered. It is shown that for such a df $F$, there exist $X_i \sim F$, $i = 1, \dots, d$, in the form $X_i = R\, U_i$, where $U_i \sim U(-1, 1)$ and $R$ is a continuous rv. Then, using that the uniform distribution is completely mixable (an explicit construction is given), it follows that $F$ is also completely mixable. This model can be considered as a scale mixture of uniform distributions, where $R$ is the scale factor, common to all margins. In Section 4, more general factor models are considered, and Theorem 3 is applied to find the minimal element in the convex order and to compute exact lower bounds in an example.

3. Bounds when the Mean and Variance of the Sum Are Known

We consider the case in which, in addition to the marginal information, the variance of $S$ is also known, i.e., we consider the admissible class $\mathcal{S}(F_1, \dots, F_d, s^2)$,
$$\mathcal{S}(F_1, \dots, F_d, s^2) = \{X_1 + \dots + X_d : X_i \sim F_i,\ i = 1, \dots, d \ \text{ and } \ \mathrm{Var}(X_1 + \dots + X_d) = s^2\},$$
where $s^2 > 0$ is a compatible variance constraint. In this setting, the bounds on $e_\tau$ will be denoted $\overline{e}_\tau^{s^2}$ and $\underline{e}_\tau^{s^2}$. It is not so clear how to determine these best-possible bounds. Instead, we proceed by considering a larger admissible class that is easier to deal with,⁴ but gives weaker bounds. Note that in the case of VaR, in [30], it is shown that the weaker bounds are typically close to the best-possible ones. In the following sections, let $m = \sum_{i=1}^d E[X_i]$, and denote:
$$\mathcal{M}_2(m, s^2) := \{S \in L^2 : E[S] = m,\ \mathrm{Var}(S) = s^2\}.$$

3.1. Upper Bound with Variance Constraint

Since $\mathcal{S}(F_1, \dots, F_d, s^2)$ is a subset of $\mathcal{M}_2(m, s^2)$, we find that $\overline{e}_\tau^{s^2} \le B$, where:
$$B := \sup\{e_\tau(S) : S \in \mathcal{M}_2(m, s^2)\}. \tag{13}$$
It will become apparent that variables supported on two points play an important role in the class $\mathcal{M}_2$.
Definition 4. A random variable $X$ is called diatomic if $P(X = a) = p$ and $P(X = b) = 1 - p$ for some $a < b$ and $p \in (0,1)$.
The expectile of such a diatomic random variable has a simple expression,
$$e_\tau(X) = \frac{(1-\tau)\, p\, a + \tau\, (1-p)\, b}{(1-\tau)\, p + \tau\, (1-p)}. \tag{14}$$
Theorem 5. The maximum expectile $B$ defined in Equation (13) is given by:
$$B = m + s \sqrt{\frac{q}{1-q}},$$
where $q = (2\tau - 1)^2$. It is attained by a diatomic rv with support $\{a, b\}$ and mass $\tau$ at $a$, where:
$$a = m - s \sqrt{\frac{1-\tau}{\tau}}, \qquad b = m + s \sqrt{\frac{\tau}{1-\tau}}.$$
Proof. Denote by $\mathcal{A} \subset \mathcal{M}_2(m, s^2)$ the subset of diatomic variables. The proof consists of two steps. First, we construct a variable $X_\tau \in \mathcal{A}$ that maximizes $e_\tau$ on $\mathcal{A}$. Next, we show that $X_\tau$ also provides the solution to Equation (13). Any $X_p \in \mathcal{A}$ has two support points $a_p$, $b_p$ with mass $p$ at $a_p$,
$$a_p = m - s \sqrt{\frac{1-p}{p}}, \qquad b_p = m + s \sqrt{\frac{p}{1-p}},$$
where $0 < p < 1$. Substituting $a_p$ and $b_p$ into Equation (14) yields:
$$e_\tau(X_p) = m + s\, \frac{(2\tau - 1) \sqrt{(1-p)\, p}}{(1-p)\, \tau + (1-\tau)\, p}. \tag{15}$$
Figure 1. Expectile $e_\tau(X_p)$ for a diatomic rv $X_p$ (standardized to $m = 0$, $s = 1$), as a function of $p \in (0,1)$. The maximum is attained at $p = \tau$; the minimum is approached as $p \to 0$ or $p \to 1$.
Using differentiation with respect to $p$, we find that $e_\tau(X_p)$ attains its maximum on $(0,1)$ when $p = \tau$ (see also Figure 1). Hence, the variable that maximizes $e_\tau$ on $\mathcal{A}$ is given by:
$$X_\tau = \begin{cases} m - s \sqrt{\dfrac{1-\tau}{\tau}} & \text{with probability } \tau, \\[2mm] m + s \sqrt{\dfrac{\tau}{1-\tau}} & \text{with probability } 1 - \tau. \end{cases}$$
For this rv, defining $q = (2\tau - 1)^2$, Equation (15) simplifies to:
$$e_\tau(X_\tau) = m + s\, \frac{\tau - 1/2}{\sqrt{(1-\tau)\, \tau}} = m + s \sqrt{\frac{q}{1-q}}.$$
Now, consider any $S \in \mathcal{M}_2(m, s^2)$. Without loss of generality, we can express $S = F_S^{-1}(U)$ for some standard uniformly distributed rv $U$. Letting $p = F_S(e_\tau(S))$, the variable $S$ can also be written as:
$$S = F_S^{-1}(U)\, 1_{\{U \le p\}} + F_S^{-1}(U)\, 1_{\{U > p\}}.$$
Define a diatomic variable $Y$ such that $E[Y] = m$ and $e_\tau(Y) = e_\tau(S)$ by
$$Y = \frac{m - (1-p)\, \mathrm{ES}_p(S)}{p}\, 1_{\{U \le p\}} + \mathrm{ES}_p(S)\, 1_{\{U > p\}}.$$
From Jensen’s inequality, it follows that $\mathrm{Var}(Y) \le \mathrm{Var}(S) = s^2$. Since $Y$ is diatomic and the right-hand side of Equation (15) is increasing in $s$ (recall that $\tau \in [1/2, 1)$),
$$e_\tau(S) = e_\tau(Y) \le e_\tau(X_p) \le e_\tau(X_\tau),$$
which completes the proof.  ☐
The bound $B$ does not make use of the specific information on the marginal distributions. When the variance is “too high”, the unconstrained bound $\overline{e}_\tau$ will be stronger. In the opposite case, $B$ will dominate $\overline{e}_\tau$. We formulate the following corollary.
Corollary 6.
$$\overline{e}_\tau^{s^2} \le \min\{B, \overline{e}_\tau\}.$$
Remarks.
(i) A procedure called the extended rearrangement algorithm (ERA) was introduced in [30] and makes it possible to compute an approximation of $\overline{e}_\tau^{s^2}$ from below, using both the marginal, as well as the variance information. This algorithm will be applied in an example in Section 4.3.
(ii) Denote by $C = \sup\{\mathrm{VaR}_\tau(S) : S \in \mathcal{M}_2(m, s^2)\}$ and by $D = \sup\{\mathrm{ES}_\tau(S) : S \in \mathcal{M}_2(m, s^2)\}$. A similar proof as in Theorem 5 shows that $C$ and $D$ are attained by the same diatomic variable $X_\tau$ that attains the bound $B$; see also [30]. We find that:
$$C = D = m + s \sqrt{\frac{\tau}{1-\tau}},$$
and, thus, that $B < C = D$. On the other hand, the numerical value of these upper bounds would coincide for $e_\tau$, $\mathrm{VaR}_\alpha$ and $\mathrm{ES}_\beta$, if we set $\alpha = \beta = (2\tau - 1)^2$.
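Remark (ii) is easy to verify numerically. The sketch below, with hypothetical function names and arbitrary example levels, evaluates both closed forms and confirms that the expectile bound at level $\tau$ equals the VaR/ES bound at level $(2\tau - 1)^2$.

```python
import math

def bound_expectile(tau, m, s):
    """B: moment bound on the expectile at level tau."""
    q = (2 * tau - 1) ** 2
    return m + s * math.sqrt(q / (1 - q))

def bound_var_es(level, m, s):
    """C = D: moment bound on VaR and ES at the given level."""
    return m + s * math.sqrt(level / (1 - level))

m, s = 0.0, 1.0
comparison = {
    tau: (bound_expectile(tau, m, s), bound_var_es(tau, m, s))
    for tau in (0.6, 0.9, 0.99)
}
```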

3.2. Lower Bound with Variance Constraint

From the proof of Theorem 5, it follows that:
$$A := \inf\{e_\tau(S) : S \in \mathcal{M}_2(m, s^2)\}$$
is given by $A = m\ (= E[S])$. Indeed, $e_\tau(X_p) \to m$ as $p \to 0$, respectively $p \to 1$; see also Figure 1. Moreover, this bound cannot be improved by assuming either an upper bounded support or a lower bounded support.⁵ Hence, we conclude that working in the moment space $\mathcal{M}_2$ does not make it readily possible to improve on $\underline{e}_\tau$.

4. Bounds for Factor Models

A factor model is introduced in [32] as a way to include additional information on the dependence structure and, hence, reduce the DU. This model considers rvs $X_i$ and a factor $W$ for which the bivariate distributions $H_i$ of $(X_i, W)$ are known. The aggregate risk $S = X_1 + \dots + X_d$ then belongs to the factor-constrained admissible class:
$$\mathcal{S}^f(H_1, \dots, H_d) = \{X_1 + \dots + X_d : (X_i, W) \sim H_i,\ i = 1, \dots, d\}.$$
In the following, denote by $F_i$ the marginal distribution of $X_i$ and by $F_{i|w}$ the conditional distribution of $(X_i \mid W = w)$, $i = 1, \dots, d$ (if defined). The additional information of this factor structure leads to narrower factor-constrained DU bounds,
$$\underline{e}_\tau^f = \inf\{e_\tau(S) : S \in \mathcal{S}^f(H_1, \dots, H_d)\}, \qquad \overline{e}_\tau^f = \sup\{e_\tau(S) : S \in \mathcal{S}^f(H_1, \dots, H_d)\}.$$
In the rest of this section, we consider a model where, conditional on a non-negative factor $W$ with distribution $G$, the rvs $X_i$ belong to the location-scale family of distribution $F_0$.
$$X_i = \mu_i + \gamma_i W + \sigma_i \sqrt{W}\, Z_i, \quad Z_i \sim F_0,\ Z_i \perp W,\ i = 1, \dots, d, \ \text{ and } \ W \sim G. \tag{16}$$
Models of this type are called location-scale mixture models and have a broad range of applications, going back to [55,56], where a particular location-scale mixture family (generalized hyperbolic) is introduced. In the area of financial modeling, [57] show that this family allows a good fit of asset returns; [58,59] apply it for pricing; and [60] apply it in the context of GARCH models. Specific consideration has been given in the literature to sub-families of this class; see, for example, [61,62] for the case of the multivariate variance gamma distribution, as well as [63,64] for the case of multivariate skew-t distributions.

4.1. Upper Bound

According to Theorem 4.1 in [32], the largest element in the convex order is achieved in the case when, conditional on the factor, the margins are comonotonic. Thus, computing the upper bound on risk in such a model is as easy as computing $e_\tau(X_i)$, since the conditionally comonotonic sum belongs to the same class of location-scale mixtures:
$$S_W^c = \mu + \gamma W + \sigma \sqrt{W}\, F_0^{-1}(U), \quad \text{where } \mu = \sum_{i=1}^d \mu_i, \quad \gamma = \sum_{i=1}^d \gamma_i, \quad \sigma = \sum_{i=1}^d \sigma_i,$$
and $W \perp U \sim U(0, 1)$. In particular, we find that $\overline{e}_\tau^f = e_\tau(S_W^c)$.
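As a rough numerical illustration, the sketch below simulates the conditionally comonotonic sum and estimates its expectile from the sample. Several things are assumptions of this sketch and not the paper's procedure: the normal $F_0$ and inverse-gamma $G$, the hypothetical parameter values, and the Monte Carlo estimator, which solves the first-order condition $\tau E[(X-e)_+] = (1-\tau) E[(e-X)_+]$ on the sample by iterated weighted means in place of the iterative algorithm of Equation (3).

```python
import math
import random

def sample_expectile(xs, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile: iterate e <- weighted mean with weight tau above e and 1 - tau below."""
    e = sum(xs) / len(xs)
    for _ in range(max_iter):
        num = den = 0.0
        for x in xs:
            wgt = tau if x > e else 1.0 - tau
            num += wgt * x
            den += wgt
        e_new = num / den
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e

def sample_comonotonic_sum(mu, gamma, sigma, nu, rng):
    """One draw of S_W^c = mu + gamma * W + sigma * sqrt(W) * Z with summed parameters."""
    w = 1.0 / rng.gammavariate(nu / 2.0, 2.0 / nu)  # W ~ IGamma(nu/2, nu/2)
    z = rng.gauss(0.0, 1.0)                         # same law as F_0^{-1}(U) for normal F_0
    return sum(mu) + sum(gamma) * w + sum(sigma) * math.sqrt(w) * z

# Hypothetical parameters, for illustration only.
rng = random.Random(7)
mu, gamma, sigma, nu = [0.0, 0.1], [0.2, 0.3], [1.0, 2.0], 5.0
draws = [sample_comonotonic_sum(mu, gamma, sigma, nu, rng) for _ in range(20000)]
for tau in (0.8, 0.9, 0.95):
    print(tau, sample_expectile(draws, tau))
```

With heavy-tailed mixing distributions, such Monte Carlo estimates are noisy at high levels $\tau$, so they are better suited to sanity checks than to reproducing tabulated values.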

4.2. Lower Bound

The minimal element in the convex order sense is given by the following result (counterpart to Theorem 4.1 in [32]).
Theorem 7. If $S_w$ is a convex order-minimal element in $\mathcal{S}_w = \mathcal{S}(F_{1|w}, \dots, F_{d|w})$ for each $w$, then $S_W$ is a minimal element in $\mathcal{S}^f(H_1, \dots, H_d)$ and $\underline{e}_\tau^f = e_\tau(S_W)$.
Proof. Since $S_w \in \mathcal{S}_w$, it can be written as $\sum_{i=1}^d X_{i,w}$ for some $X_{i,w} \sim F_{i|w}$, $i = 1, \dots, d$. Thus, $(X_{i,W}, W)$ have the required bivariate distributions, and $S_W \in \mathcal{S}^f$. To show it is minimal, consider any $T \in \mathcal{S}^f(H_1, \dots, H_d)$, and denote $T_w = T \mid (W = w) \in \mathcal{S}_w$. By the definition of the convex order, $E[\phi(S_w)] \le E[\phi(T_w)]$ for any convex function $\phi$. Using monotonicity and the tower property, we obtain:
$$E[\phi(S_W)] \le E[\phi(T)],$$
which completes the proof.  ☐
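For readers who want the last step of the proof spelled out, one way to write it is the following, where $G$ denotes the df of $W$ and the expectations are assumed to exist; this is a sketch of the argument, not additional material from [32].

```latex
\begin{align*}
E[\phi(S_W)]
  &= E\big[\, E[\phi(S_W) \mid W] \,\big]
   = \int E[\phi(S_w)] \, \mathrm{d}G(w) \\
  &\le \int E[\phi(T_w)] \, \mathrm{d}G(w)
   = E\big[\, E[\phi(T) \mid W] \,\big]
   = E[\phi(T)].
\end{align*}
```

The inequality uses the conditional convex-order minimality of $S_w$ for each $w$ together with the monotonicity of the integral.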
Let $\sigma^{*} = \max \{ 0,\ 2 \max_{i=1,\dots,d} \sigma_i - \sum_{j=1}^d \sigma_j \}$, which is to be distinguished from the sum $\sigma = \sum_{i=1}^d \sigma_i$ of Section 4.1. By Theorem 3, if the df $F_0$ in model Equation (16) is symmetric and unimodal, then a minimal element in $\mathcal{S}_w$ is:
$$S_w = \mu + \gamma w + \sigma^{*} \sqrt{w}\, F_0^{-1}(U), \quad U \sim U(0, 1).$$
Thus, by Theorem 7, $S_W = \mu + \gamma W + \sigma^{*} \sqrt{W}\, F_0^{-1}(U)$ is a convex order minimal element in the factor-constrained admissible class. Moreover, since $U$ is independent of $W$, $S_W$ also belongs to the same mixture family as the margins, so the corresponding lower bound $\underline{e}_\tau^f = e_\tau(S_W)$ can be computed as easily as $e_\tau(X_i)$. Note that the assumption that $F_0$ is symmetric is natural, since the location-mixing term $\gamma_i W$ can be used to add asymmetry to the df of $X_i$.

4.3. Example: Skewed Student t Distribution

The results in Section 4.1 and Section 4.2 apply for general choices of dfs $F_0$ and $G$ in Equation (16). The most well-known location-scale mixture class is that of normal mean-variance mixtures, i.e., the case when $F_0 = \Phi$. If, in addition, $W$ follows the generalized inverse Gaussian (GIG) distribution, then the family of generalized hyperbolic (GH) distributions is obtained; see Section 6.2.3 in [28]. This is a flexible class of distributions that exhibits skewness and heavy tails and is therefore useful for modeling financial data. Moreover, it can also be extended to the multivariate GH distribution,
$$\mathbf{X} = \boldsymbol{\mu} + \boldsymbol{\gamma} W + \sqrt{W}\, A \mathbf{Z}, \quad \mathbf{Z} \sim N(\mathbf{0}, I_d), \tag{17}$$
where vectors in $\mathbb{R}^d$ are written in bold, $I_d$ is the identity matrix and $\Sigma = A A^{\top}$ is the Cholesky decomposition of the scale matrix. The multivariate GH class is closed under linear operations, so it has the portfolio property, which is useful for applications. In this section, a particular subclass of GH is considered: the skew-t distributions.
The hyperbolic skewed Student t distribution is a special case of normal mean-variance mixtures, where the mixing distribution is the inverse-gamma df; see [64]. The inverse-gamma distribution $I\Gamma(\alpha, \beta)$ has density:
$$f(x; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{-\alpha-1} \exp\left(-\frac{\beta}{x}\right), \quad x > 0.$$
Setting $F_0 = \Phi$ and $G = I\Gamma(\nu/2, \nu/2)$ in Equation (16) results in $X_i \sim \text{Skew-}t_\nu(\mu_i, \gamma_i, \sigma_i^2)$. The multivariate skew-t subclass of the GH distribution Equation (17) is also closed under linear transformations. Using Theorem 4.1 in [32], the factor-constrained worst-case dependence structure is achieved using $A$ with $\boldsymbol{\sigma} = (\sigma_1, \dots, \sigma_d)$ as the first column and zeros in the others (conditional comonotonicity), resulting in a degenerate matrix $\Sigma$. The corresponding aggregate risk is:
$$S_W^c \sim \text{Skew-}t_\nu(\mu, \gamma, \sigma^2), \quad \text{where } \mu = \sum_{i=1}^d \mu_i, \quad \gamma = \sum_{i=1}^d \gamma_i, \quad \sigma = \sum_{i=1}^d \sigma_i.$$
Applying Theorems 3 and 7, the factor-constrained lower bounds in the “dominated” case $\sigma^{*} = 2 \max_{i=1,\dots,d} \sigma_i - \sum_{j=1}^d \sigma_j \ge 0$ (see Case (i) of Theorem 3) can be attained using $A$ with $\boldsymbol{\sigma}$ as the first column, except $-\sigma_i$ in row $i$ corresponding to the largest $\sigma_i$, and zeros in other columns (conditional countermonotonicity with respect to the $i$-th margin). The corresponding aggregate risk is then:
$$S_W \sim \text{Skew-}t_\nu(\mu, \gamma, (\sigma^{*})^2).$$
The inverse df and tail integral (as well as the df and density) of a skew-t distribution can be computed using the methods in [65], which rely on the use of a Bessel function, numerical integration and root search (and are computationally intensive). Hence, we can apply the iterative algorithm Equation (3) (using the df and tail integral) to compute its expectile. Thus, we have a method to obtain the upper, respectively lower, factor-constrained DU bound. The unconstrained upper bound on the expectile can be computed using the iterative procedure Equation (8) (based on the tail integral, inverse df and density) and the unconstrained lower bound using the “expectation” version of RA introduced in Section 2.2.1.
Note that the conditionally jointly mixable case cannot be attained using a multivariate GH dependence structure. In this case, $S_W = \mu + \gamma W$ follows a scaled and translated inverse-gamma distribution. In order to apply the iterative procedure Equation (3), we need the df and tail integral (TI) for inverse-gamma. For a general $W \sim I\Gamma(\alpha, \beta)$, we calculate, using the substitution $u = \beta / t$,
$$F_W(x) = \int_0^x f_W(t)\, dt = \frac{1}{\Gamma(\alpha)} \int_{\beta/x}^{\infty} u^{\alpha-1} e^{-u}\, du,$$
which is the (normalized) incomplete gamma function. In MATLAB, this can be computed using the function gammainc(b./x,a,'upper'). Similarly, we have:
$$\mathrm{TI}_W(x) = \frac{\beta}{\alpha-1}\, \frac{1}{\Gamma(\alpha-1)} \int_0^{\beta/x} u^{\alpha-2} e^{-u}\, du,$$
which is given by b/(a-1)*gammainc(b./x,a-1,'lower') in MATLAB.
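The two formulas can also be evaluated outside MATLAB. The sketch below uses only the Python standard library and computes the regularized lower incomplete gamma function by its power series; the function names are chosen here for illustration, the series is an assumption of this sketch (adequate for moderate arguments only), and a library routine such as SciPy's `scipy.special.gammainc` and `gammaincc` would normally be preferable. Note that MATLAB's `gammainc(x, a)` takes its arguments in the opposite order to the helper below.

```python
import math

def regularized_lower_gamma(a, x, max_terms=10_000):
    """P(a, x) = (1 / Gamma(a)) * int_0^x u^(a-1) e^(-u) du, via its power series (a > 0)."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-17 * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a + 1.0))

def inverse_gamma_cdf(x, alpha, beta):
    """F_W(x) for W ~ IGamma(alpha, beta): the regularized upper incomplete gamma at beta / x."""
    # 1 - P loses relative accuracy when F_W(x) is very close to zero.
    return 1.0 - regularized_lower_gamma(alpha, beta / x)

def inverse_gamma_tail_integral(x, alpha, beta):
    """TI_W(x) = beta / (alpha - 1) * P(alpha - 1, beta / x), which requires alpha > 1."""
    return beta / (alpha - 1.0) * regularized_lower_gamma(alpha - 1.0, beta / x)

# Example with nu = 4.5, i.e., alpha = beta = 2.25.
print(inverse_gamma_cdf(1.0, 2.25, 2.25))
print(inverse_gamma_tail_integral(1.0, 2.25, 2.25))
```

As $x \to 0$, the tail integral tends to the mean $\beta / (\alpha - 1)$ of the inverse-gamma distribution, which gives a quick consistency check.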
In Table 2, the expectile bounds for two examples of a skew-t distribution are listed. The parameters were selected to be in the range observed when fitting skew-t to daily stock returns (scaled by a factor of 250) of companies in the S&P100 index. Model A is a conditionally jointly-mixable case, and Model B is a “dominated” case. First, notice that the approximate upper bound $e_\tau^{+}$ is very close to the best-possible bound $\overline{e}_\tau$ in all cases. Next, observe that due to the positive dependence the factor model induces, the value of the factor-constrained upper bound is similar to the unconstrained one, whereas the factor structure noticeably improves the lower bound; this is in agreement with the observations in [32]. In Model A, the unconstrained lower bound is close to the mean, so the margins are “almost” jointly mixable.
Table 2. Upper and lower dependence uncertainty (DU) bounds for $e_\tau(S)$ for two skew-t examples with $d = 8$. Column $e_\tau$ lists values for a multivariate skew-t distributed $\mathbf{X}$ with a diagonal $\Sigma$ matrix.

Model A. $E[S] = 1.24$, $\nu = 4.5$, $\boldsymbol{\mu} = (-0.2, -0.15, \dots, 0.15)$, $\boldsymbol{\gamma} = (-0.25, -0.15, \dots, 0.45)$, $\boldsymbol{\sigma} = (4.5, 5, \dots, 8)$

| $\tau$ | $\underline{e}_\tau$ | $\underline{e}_\tau^f$ | $e_\tau$ | $\overline{e}_\tau^f$ | $\overline{e}_\tau$ | $e_\tau^{+}$ |
|---|---|---|---|---|---|---|
| 0.8 | 1.24 | 2.16 | 13.70 | 35.58 | 35.62 | 35.63 |
| 0.9 | 1.24 | 3.02 | 21.63 | 57.14 | 57.21 | 57.22 |
| 0.95 | 1.24 | 4.14 | 29.65 | 78.73 | 78.85 | 78.87 |
| 0.99 | 1.25 | 8.44 | 51.18 | 135.63 | 135.98 | 136.02 |
| 0.999 | 1.30 | 23.30 | 96.78 | 251.11 | 252.65 | 252.84 |

Model B. $E[S] = 1.13$, $\nu = 5$, $\boldsymbol{\mu} = (-0.2, -0.15, \dots, 0.15)$, $\boldsymbol{\gamma} = (-0.25, -0.15, \dots, 0.45)$, $\boldsymbol{\sigma} = (3.5, 3.5, \dots, 3.5, 25.5)$

| $\tau$ | $\underline{e}_\tau$ | $\underline{e}_\tau^f$ | $e_\tau$ | $\overline{e}_\tau^f$ | $\overline{e}_\tau$ | $e_\tau^{+}$ |
|---|---|---|---|---|---|---|
| 0.8 | 1.91 | 2.18 | 19.34 | 34.58 | 34.61 | 34.62 |
| 0.9 | 2.50 | 3.01 | 30.68 | 55.29 | 55.36 | 55.37 |
| 0.95 | 3.15 | 3.99 | 41.90 | 75.74 | 75.84 | 75.86 |
| 0.99 | 5.15 | 7.34 | 70.80 | 128.00 | 128.28 | 128.31 |
| 0.999 | 10.05 | 17.51 | 126.92 | 228.06 | 229.15 | 229.29 |
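The model classification and the reported means can be checked with a few lines of arithmetic. The sketch below assumes that the elided entries of $\boldsymbol{\mu}$, $\boldsymbol{\gamma}$ and $\boldsymbol{\sigma}$ form arithmetic progressions; this assumption is made here, is consistent with $d = 8$, and reproduces the reported values of $E[S]$. The function names are illustrative.

```python
def arithmetic_vector(first, step, d):
    """Vector (first, first + step, ...) of length d; an assumed reading of the elided entries."""
    return [first + step * k for k in range(d)]

def mean_of_sum(mu, gamma, nu):
    """E[S] = sum(mu) + sum(gamma) * E[W], with E[W] = nu / (nu - 2) for W ~ IGamma(nu/2, nu/2)."""
    return sum(mu) + sum(gamma) * nu / (nu - 2.0)

def lower_bound_scale(sigma):
    """Scale max{0, 2 * max(sigma) - sum(sigma)} of the conditionally convex-order minimal sum."""
    return max(0.0, 2.0 * max(sigma) - sum(sigma))

d = 8
mu = arithmetic_vector(-0.20, 0.05, d)
gamma = arithmetic_vector(-0.25, 0.10, d)
sigma_a = arithmetic_vector(4.5, 0.5, d)   # Model A
sigma_b = [3.5] * (d - 1) + [25.5]         # Model B

print(mean_of_sum(mu, gamma, 4.5), lower_bound_scale(sigma_a))  # Model A
print(mean_of_sum(mu, gamma, 5.0), lower_bound_scale(sigma_b))  # Model B
```

Under this reading, the scale is zero for Model A (the conditionally jointly mixable case) and positive for Model B (the “dominated” case), in line with the description of the two models.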
In order to illustrate the influence of variance information on the bounds, we first need to find the feasible range for $\mathrm{Var}(S) = s^2$. The law of total variance yields:
$$\mathrm{Var}(S) = E[\mathrm{Var}(S \mid W)] + \mathrm{Var}(E[S \mid W]) = E\Big[ W\, \mathrm{Var}\Big( \sum_{i} \sigma_i Z_i \Big) \Big] + \gamma^2\, \frac{2 \nu^2}{(\nu-4)(\nu-2)^2},$$
as long as $\nu > 4$. The first term lies in the range $[0, \sigma^2 \nu / (\nu-2)]$, corresponding to conditional joint mixability up to conditional comonotonicity. In Figure 2, we plot for Model B the variance-constrained bound $B$ from Theorem 5 (based on the sole knowledge of the first two moments of the sum $S$), and we compare it with the unconstrained upper bound $\overline{e}_\tau$. We also plot the variance-constrained bound obtained by means of the extended rearrangement algorithm (ERA), which in addition to the first two moments of $S$, also takes into account the marginal distributions of the components $X_i$ (see [30] for a description of this algorithm). Variance constraints $s^2$ are taken in the range corresponding to standard deviation $s \in [1.9, 64.6]$. We observe that variance information yields a considerably reduced upper bound $B$, as long as $s^2$ is small enough. As the parameter $\tau$ increases, the bound $B$ becomes weaker and is relevant on a smaller range of $s^2$. The approximate bound computed using ERA is very close to the bound $B$, indicating that $B$ can nearly be attained by constructing the appropriate dependence among the random variables (with the given marginal distributions). This dependence yields a sum $S$ that has a (nearly) diatomic structure, i.e., $S$ becomes distributed as the random variable in Theorem 5. However, the highest variance that $S$ can possibly attain, under the constraint that it is diatomic and consistent with the marginal distributions, occurs when its upper atom is given by $b = \sum_{i=1}^d \mathrm{ES}_\tau(X_i)$. Therefore, when the variance constraint is too high, we cannot expect ERA to return a diatomic distribution for $S$; see also Figure 3, where the distribution function of $S$ obtained using ERA is plotted for different variance constraints.
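The quoted range of standard deviations can be reproduced from the law of total variance. The sketch below is an illustration under the assumption that, for Model B, the aggregated parameters are $\gamma = 0.8$ and $\sigma = 50$ (obtained by reading the elided vector entries as arithmetic progressions); the function name is hypothetical.

```python
import math

def variance_range(gamma_sum, sigma_sum, nu):
    """Smallest and largest Var(S) implied by the decomposition, for W ~ IGamma(nu/2, nu/2)."""
    if nu <= 4.0:
        raise ValueError("Var(W) is finite only for nu > 4")
    mean_w = nu / (nu - 2.0)
    var_w = 2.0 * nu ** 2 / ((nu - 4.0) * (nu - 2.0) ** 2)
    between = gamma_sum ** 2 * var_w        # Var(E[S | W])
    within_max = sigma_sum ** 2 * mean_w    # E[Var(S | W)] under conditional comonotonicity
    return between, between + within_max

lo, hi = variance_range(gamma_sum=0.8, sigma_sum=50.0, nu=5.0)
print(math.sqrt(lo), math.sqrt(hi))
```

The two square roots round to 1.9 and 64.6, matching the range of standard deviation constraints stated above.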
Figure 2. Moment space upper bound $B$ on the expectile, and an approximation of $\overline{e}_\tau^{s^2}$ computed using the extended rearrangement algorithm (ERA), as a function of the standard deviation constraint $s$ on the horizontal axis. The unconstrained expectile bounds are also plotted for the sake of comparison. The dotted vertical line is the maximum standard deviation of a diatomic random variable, which is consistent with the marginal upper- and lower-tail expectations.
Figure 3. Distribution function of $S$ computed using ERA for Model B, $\tau = 0.8$. In the left panel, a standard deviation constraint $s_{\mathrm{constr}} = 25$ is applied and attained by ERA. In the right panel, a constraint $s_{\mathrm{constr}} = 58$ is attempted, but cannot be attained, resulting in a lower actual standard deviation $s_{\mathrm{ERA}} = 43.2$; moreover, the distribution is not diatomic. The dotted lines are the optimal locations of the atoms from Theorem 5.
Remarks.
(i) More generally, the GIG and, hence, (non-skew) hyperbolic distributions are infinitely divisible (see [66]), so methods from [67] can be applied to compute the inverse df and the tail integral. In turn, the iterative procedure Equation (3) can be applied to compute the expectile.
(ii) The most time-consuming quantity to compute was the unconstrained lower bound on the expectile, because the RA requires a discretization of the margins, i.e., calculating the skew-t inverse df $d \cdot N$ times (each margin took about 10 min on an Intel i5 2.5 GHz desktop with $N = 10^4$). A similar calculation with Pareto dfs as in Table 1 was done for this discretization size. The maximum error using the expectation RA was 0.4% for $p = 0.99$ and 1.5% for $p = 0.999$; hence, this discretization size was deemed sufficient for our purposes.
(iii) Due to the mixture form of GH distributions, a faster method for discretizing the margins could be using a Monte Carlo sample. Since GH dfs can have heavy tails, a similar approach to the “expectation” discretization for RA was considered, specifically, rejecting any sample points that lie below $F_i^{-1}(1/N)$ or above $F_i^{-1}(1 - 1/N)$ and adding two points equal to the expectations over the corresponding intervals. However, this method resulted in a large variance over repeated trials, so the obtained bounds were not used.

4.4. Adding Variance Information

In this section, we consider factor models with additional variance information. We define the admissible class:
$$\mathcal{S}^f(H_1, \dots, H_d, s_W^2) = \{ S \in \mathcal{S}^f(H_1, \dots, H_d) : \mathrm{Var}(S \mid W) = s_W^2 \},$$
where the conditional variance is known for each outcome of $W$. Consider the problem:
$$\overline{e}_\tau^{f, s^2} = \sup \{ e_\tau(S) : S \in \mathcal{S}^f(H_1, \dots, H_d, s_W^2) \}.$$
Theorem 8. Let $e$ be given by:
$$e = m + (2\tau - 1)\, E\Big[ \sqrt{ s_W^2 + (m_W - e)^2 } \Big],$$
where $m := E[S]$ and $m_W := E[S \mid W]$. Then:
$$\overline{e}_\tau^{f, s^2} \le e.$$
Proof. Let $S_W \in \mathcal{S}^f(H_1, \dots, H_d, s_W^2)$. By [54] (Case $C_{13}$), the upper bound$^{6}$ on the stop-loss premium over $S_w \in \mathcal{M}_2(m_w, s_w^2)$ (recall the definition of $\mathcal{M}_2$ in Equation (12)) is given by:
$$E[(S_w - e) \mathbf{1}_{\{S_w > e\}}] \le \frac{1}{2} \Big( m_w - e + \sqrt{ s_w^2 + (m_w - e)^2 } \Big),$$
which holds for any $e \in \mathbb{R}$. Using monotonicity and the tower property, we obtain an upper bound for the unconditional stop-loss premium $E[(S_W - e) \mathbf{1}_{\{S_W > e\}}]$. Writing $\theta = (2\tau - 1)/(1 - \tau)$ and invoking Lemma 2, we find that $e_\tau(S_W) \le e$, where $e$ satisfies:
$$e = m + \frac{\theta}{2}\, E\Big[ m_W - e + \sqrt{ s_W^2 + (m_W - e)^2 } \Big] = m + \frac{\theta}{2} (m - e) + \frac{\theta}{2}\, E\Big[ \sqrt{ s_W^2 + (m_W - e)^2 } \Big].$$
The stated equation for e follows by rearranging.  ☐
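The rearrangement uses $\frac{\theta/2}{1 + \theta/2} = \frac{\theta}{2 + \theta} = 2\tau - 1$. Since the right-hand side of the resulting equation has slope at most $|2\tau - 1| < 1$ in $e$, the equation can be solved by plain fixed-point iteration. The sketch below does this for a factor with finitely many outcomes; the discrete factor, the function name and the scenario values are assumptions made for illustration.

```python
import math

def theorem8_upper_bound(tau, scenarios, tol=1e-12, max_iter=10_000):
    """Solve e = m + (2*tau - 1) * E[sqrt(s_W^2 + (m_W - e)^2)] by fixed-point iteration.

    scenarios: list of (probability, conditional mean m_w, conditional variance s_w^2)
    for a discrete factor W; the probabilities are assumed to sum to one.
    """
    m = sum(p * m_w for p, m_w, _ in scenarios)
    c = 2.0 * tau - 1.0
    e = m
    for _ in range(max_iter):
        e_new = m + c * sum(p * math.sqrt(s2 + (m_w - e) ** 2)
                            for p, m_w, s2 in scenarios)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e

# Hypothetical two-outcome factor.
scenarios = [(0.5, 0.0, 1.0), (0.5, 2.0, 9.0)]
print(theorem8_upper_bound(0.8, scenarios))

# Degenerate factor: closed form m + s * (2*tau - 1) / (2 * sqrt(tau * (1 - tau))).
print(theorem8_upper_bound(0.8, [(1.0, 1.0, 4.0)]))
```

In the degenerate case with a single outcome, the fixed point has the closed form shown in the comment, which is convenient for validating an implementation.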
Remark. If the conditional variances $s_w^2$ are not known, but the total variance $s^2 := \mathrm{Var}(S)$ is available, then we still have that $\overline{e}_\tau^{f, s^2} \le \min \{ \overline{e}_\tau^f, B \}$.

5. Dependence Uncertainty Spread Comparison

In this section, the dependence uncertainty (DU) spreads of VaR, ES and the expectile are compared, where the DU spread for a risk measure ρ is defined as:
$$\overline{\rho}(S) - \underline{\rho}(S).$$
Here, we focus on the Fréchet admissible class (only marginal dfs known); see Section 2. The behavior of DU spreads of VaR and ES for large-dimensional portfolios is discussed in [68]. In order to make the resulting capital requirements similar under the different risk measures, one could, for example, use the same level α = β = τ for all three, but multiply by different scaling factors.
The approach taken by the Basel Committee on Banking Supervision [69], when moving from $\mathrm{VaR}_{0.99}$ as the risk measure for the trading book capital requirements to ES, consists of adjusting the confidence level, apparently so that the numerical value of $\mathrm{ES}_\beta(X)$ for a normally-distributed rv $X$ matches $\mathrm{VaR}_{0.99}(X) = 2.3263$. Doing so yields $\beta \approx 0.97423$, which gets rounded to $\beta = 0.975$. Similarly, [24] suggest using a parameter $\tau$, such that $e_\tau(X) = \mathrm{VaR}_{0.99}(X)$ for $X \sim \Phi$; this yields $\tau \approx 0.99855$. Note that rounding to $0.999$ would give $e_{0.999} = 2.4358$; therefore, five significant digits $\tau = 0.99855$ will be used in this section when comparing the expectile to the other risk measures. ES is not as sensitive to the level $\beta$, and $\mathrm{ES}_{0.975} = 2.3378$ is close enough to $\mathrm{VaR}_{0.99}$.
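These calibrated levels can be recomputed for the standard normal distribution. The sketch below relies on two standard facts: $\mathrm{ES}_\beta = \varphi(\Phi^{-1}(\beta)) / (1 - \beta)$, and the expectile first-order condition $\tau E[(X - e)_+] = (1 - \tau) E[(e - X)_+]$, which for mean zero gives $\tau = (L + e)/(2L + e)$ with $L = E[(X - e)_+] = \varphi(e) - e\,(1 - \Phi(e))$. The function names and the bisection solver are choices made for this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def es_normal(beta):
    """Expected shortfall of N(0, 1) at level beta."""
    z = STD_NORMAL.inv_cdf(beta)
    return STD_NORMAL.pdf(z) / (1.0 - beta)

def expectile_level_normal(e):
    """Level tau at which the expectile of N(0, 1) equals e."""
    stop_loss = STD_NORMAL.pdf(e) - e * (1.0 - STD_NORMAL.cdf(e))  # E[(X - e)_+]
    return (stop_loss + e) / (2.0 * stop_loss + e)

def solve_es_level(target, lo=0.5, hi=0.999999, iters=200):
    """Bisection for beta with ES_beta = target, using that ES is increasing in beta."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if es_normal(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

var_99 = STD_NORMAL.inv_cdf(0.99)
beta = solve_es_level(var_99)
tau = expectile_level_normal(var_99)
print(var_99, beta, tau, es_normal(0.975))
```

Evaluating `expectile_level_normal(2.4358)` returns a level close to 0.999, consistent with the remark on rounding.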
In Figure 4 and Figure 5, the DU spreads are plotted in the homogeneous case for different Pareto and Student t distributions, respectively, as functions of the dimension $d$. For the Pareto example, $\underline{\mathrm{VaR}}$, $\overline{\mathrm{VaR}}$, $\underline{\mathrm{ES}}$ and $\underline{e}$ are computed from the minimal elements in the convex order, obtained using the methods from [45]. $\overline{\mathrm{ES}}$ and $\overline{e}$ are obtained using the comonotonic dependence structure.
Since the Student t distribution is symmetric and unimodal, it is completely mixable [53], so the lower bounds on ES and the expectile are equal to the mean. To compute the lower bound on VaR, we apply RA. As the Student t density is decreasing from the median, [45] can again be applied for the upper bound on VaR. While VaR and ES focus only on the losses, the expectile also takes the gains into account. Student t has two infinite tails, which leads to a larger DU spread for the expectile, especially in the most heavy-tailed case. Overall, the results indicate that for the chosen adjusted significance levels, the DU spread is typically the smallest for ES.
Figure 4. DU spread for VaR, expected shortfall (ES) and the expectile of the sum of $d$ Pareto($\theta$) distributed margins, with $d$ on the horizontal axis. The DU spread is given relative to $\mathrm{ES}_\beta^{+}(S)$.
Figure 5. The DU spread for VaR, ES and the expectile of the sum of $d$ Gaussian or Student $t(\nu)$ distributed margins, with $d$ on the horizontal axis. The DU spread is given relative to $\mathrm{ES}_\beta^{+}(S)$.

6. Final Remarks

In the statistics literature, the expectile functional and its properties related to regressions are well known. Recently, in the context of risk measurement, the expectile risk measure has also been shown to have appealing theoretical properties. We contribute to the analysis of this new risk measure by focusing on its properties under dependence uncertainty. We first summarize and provide improved methods for computing bounds on the expectile of a portfolio in the case of no information on dependence (unconstrained bounds) and prove analytic bounds for a location-scale family. Next, we discuss the influence of dependence information on these unconstrained bounds.
In this regard, we provide simple-to-compute bounds under an additional constraint on the portfolio variance and show that the upper bound can be considerably improved. By contrast, the unconstrained lower bound cannot be improved by only using the information on the first two moments.
Furthermore, we provide bounds in the factor-constrained case. A family of commonly-used distributions, the normal mean-variance mixtures, is considered as a special case. These models are particularly tractable, and we state the conditional best- and worst-case dependence structures explicitly. We note that due to the restriction on dependence that such a factor model induces, the lower bounds on $e_\tau$ were significantly improved (for high values of $\tau$). The upper bounds are only slightly reduced, and the simple (unconstrained) upper bound based on subadditivity remains adequate for practical purposes.
We compare the dependence uncertainty spread of the expectile (i.e., the difference between the maximum and minimum possible value of the risk measure when only marginal information is used) with that of VaR and ES. We observe that the results are not favorable to the expectile. While the expectile has been proposed as the elicitable counterpart to ES, it is not clear that this property is indeed crucial for back-testing, and evidence exists to the contrary (e.g., [70]). Hence, gaining elicitability may not justify the increase in the dependence uncertainty spread. However, alternative mathematical approaches exist to “provide a broadly similar level of risk capture” ([71], p. 18) when moving to another risk measure (or even sticking with VaR), such as scaling. Although this makes the interpretation less clear (it was not clear for the expectile to begin with), it would allow reducing the confidence level, hence making statistical analysis more feasible and also reducing model uncertainty (see Table 2).

Acknowledgments

The authors thank Paul Embrechts and three anonymous referees for helpful comments and suggestions. E. Jakobsons thanks the Swiss Finance Institute for financial support. S. Vanduffel acknowledges the financial support from the Flemish Science Foundation (FWO).

Author Contributions

E.J. contributed mainly to the bounds with marginal information, examples and computations. S.V. contributed mainly to the bounds with variance information. The article was written in close collaboration.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. P. Embrechts, G. Puccetti, L. Rüschendorf, R. Wang, and A. Beleraj. “An academic response to Basel 3.5.” Risks 2 (2014): 25–48.
  2. S. Emmer, M. Kratz, and D. Tasche. “What is the best risk measure in practice? A comparison of standard measures.” J. Risk 18 (2015): 31–60.
  3. P. Artzner, F. Delbaen, J.M. Eber, and D. Heath. “Coherent measures of risk.” Math. Financ. 9 (1999): 203–228.
  4. T. Gneiting. “Making and evaluating point forecasts.” J. Am. Stat. Assoc. 106 (2011): 746–762.
  5. S. Kou, and X. Peng. “Expected shortfall or median shortfall.” J. Financ. Eng. 1 (2014): 1450007.
  6. J.M. Chen. “Measuring Market Risk Under the Basel Accords: VaR, Stressed VaR, and Expected Shortfall.” Aestimatio IEB Int. J. Financ. 8 (2014): 184–201.
  7. C. Acerbi, and B. Szekely. “Back-testing expected shortfall.” Risk 27 (2014): 76–81.
  8. T. Fissler, and J.F. Ziegel. “Higher order elicitability and Osband’s principle.” Available online: arxiv.org/abs/1503.08123 (accessed on 30 September 2015).
  9. J.F. Ziegel. “Coherence and elicitability.” Math. Financ., 2014.
  10. F. Bellini, and V. Bignozzi. “On elicitable risk measures.” Quant. Financ. 15 (2015): 725–733.
  11. F. Delbaen, F. Bellini, V. Bignozzi, and J.F. Ziegel. “Risk measures with the CxLS property.” Financ. Stoch., 2015.
  12. W.K. Newey, and J.L. Powell. “Asymmetric least squares estimation and testing.” Econom.: J. Econom. Soc. 55 (1987): 819–847.
  13. F. Delbaen. “A remark on the structure of expectiles.” Available online: arxiv.org/abs/1307.5881 (accessed on 22 July 2013).
  14. B. Efron. “Regression percentiles using asymmetric squared error loss.” Stat. Sin. 1 (1991): 93–125.
  15. Q. Yao, and H. Tong. “Asymmetric least squares regression estimation: A nonparametric approach.” J. Nonparametr. Stat. 6 (1996): 273–292.
  16. G. De Rossi, and A. Harvey. “Quantiles, expectiles and splines.” J. Econom. 152 (2009): 179–185.
  17. C.W. Granger, and C.Y. Sin. “Modelling the absolute returns of different stock indices: exploring the forecastability of an alternative measure of risk.” J. Forecast. 19 (2000): 277–298.
  18. J.W. Taylor. “Estimating value at risk and expected shortfall using expectiles.” J. Financ. Econom. 6 (2008): 231–252.
  19. S. Manganelli. “Asset allocation by penalized least squares.” Available online: www.ecb.europa.eu/pub/pdf/scpwps/ecbwp723.pdf (accessed on 6 February 2007).
  20. C. Keating, and W.F. Shadwick. “A universal performance measure.” J. Perform. Meas. 6 (2002): 59–84.
  21. B. Rémillard. Statistical Methods for Financial Engineering. Boca Raton, FL, USA: CRC Press, 2013.
  22. C.M. Kuan, J.H. Yeh, and Y.C. Hsu. “Assessing value at risk with CARE, the conditional autoregressive expectile models.” J. Econom. 150 (2009): 261–270.
  23. G. De Rossi. “Staying ahead on downside risk.” In Optimizing Optimization: The Next Generation of Optimization Applications and Theory. Edited by S. Satchell. Waltham, MA, USA: Academic Press, 2009, pp. 143–160.
  24. F. Bellini, and E. di Bernardino. “Risk management with expectiles.” Eur. J. Financ., 2015.
  25. Y. Aït-Sahalia, and A.W. Lo. “Nonparametric risk management and implied risk aversion.” J. Econom. 94 (2000): 9–51.
  26. A. Ahmadi-Javid. “Entropic value-at-risk: A new coherent risk measure.” J. Optim. Theory Appl. 155 (2012): 1105–1123.
  27. F. Bellini, B. Klar, A. Müller, and E.R. Gianin. “Generalized quantiles as risk measures.” Insur.: Math. Econ. 54 (2014): 41–48.
  28. A.J. McNeil, R. Frey, and P. Embrechts. Quantitative Risk Management: Concepts, Techniques and Tools, 2nd ed. Princeton, NJ, USA: Princeton University Press, 2015.
  29. A. Van Heerwaarden, and R. Kaas. “The Dutch premium principle.” Insur.: Math. Econ. 11 (1992): 129–133.
  30. C. Bernard, L. Rüschendorf, and S. Vanduffel. “Value-at-Risk bounds with variance constraints.” J. Risk Insur., 2015. forthcoming.
  31. V. Bignozzi, G. Puccetti, and L. Rüschendorf. “Reducing model risk via positive and negative dependence assumptions.” Insur.: Math. Econ. 61 (2015): 17–26.
  32. C. Bernard, L. Rüschendorf, S. Vanduffel, and R. Wang. “Risk bounds for factor models.” Available online: papers.ssrn.com/sol3/papers.cfm?abstract_id=2572508 (accessed on 26 February 2015).
  33. G. Puccetti, L. Rüschendorf, D. Small, and S. Vanduffel. “Reduction of Value-at-Risk bounds via independence and variance information.” Scand. Actuar. J., 2015. forthcoming.
  34. C. Bernard, and S. Vanduffel. “Quantile of a mixture with application to model risk assessment.” Depend. Model. 3 (2015): 172–181.
  35. C. Bernard, L. Rüschendorf, S. Vanduffel, and J. Yao. “How robust is the value-at-risk of credit risk portfolios?” Eur. J. Financ., 2015.
  36. C. Bernard, and S. Vanduffel. “A new approach to assessing model risk in high dimensions.” J. Bank. Financ. 58 (2015): 166–178.
  37. N. Bäuerle, and A. Müller. “Stochastic orders and risk measures: consistency and bounds.” Insur.: Math. Econ. 38 (2006): 132–148.
  38. E. Jouini, W. Schachermayer, and N. Touzi. “Law invariant risk measures have the Fatou property.” In Advances in Mathematical Economics. Edited by S. Kusuoka and A. Yamazaki. Berlin, Germany: Springer, 2006, pp. 49–71.
  39. F. Bellini. “Isotonicity properties of generalized quantiles.” Stat. Probab. Lett. 82 (2012): 2017–2024.
  40. A.H. Tchen. “Inequalities for distributions with given marginals.” Ann. Probab. 8 (1980): 814–827.
  41. B. Wang, and R. Wang. “Joint mixability.” Math. Oper. Res., 2015. forthcoming.
  42. B. Wang, and R. Wang. “The complete mixability and convex minimization problems with monotone marginal densities.” J. Multivar. Anal. 102 (2011): 1344–1360.
  43. G. Puccetti, and R. Wang. “Extremal dependence concepts.” Stat. Sci., 2015. forthcoming.
  44. J. Dhaene, and M. Denuit. “The safest dependence structure among risks.” Insur.: Math. Econ. 25 (1999): 11–21.
  45. C. Bernard, X. Jiang, and R. Wang. “Risk aggregation with dependence uncertainty.” Insur.: Math. Econ. 54 (2014): 93–108.
  46. E. Jakobsons, X. Han, and R. Wang. “General convex order on risk aggregation.” Scand. Actuar. J., 2015.
  47. P. Embrechts, G. Puccetti, and L. Rüschendorf. “Model uncertainty and VaR aggregation.” J. Bank. Financ. 37 (2013): 2750–2764.
  48. G. Puccetti. “Sharp bounds on the expected shortfall for a sum of dependent random variables.” Stat. Probab. Lett. 83 (2013): 1227–1232.
  49. G. Puccetti, and L. Rüschendorf. “Computation of sharp bounds on the expected value of a supermodular function of risks with given marginals.” Commun. Stat. Simul. Comput. 44 (2015): 705–718.
  50. P. Embrechts, and E. Jakobsons. “Dependence uncertainty for aggregate risk: Examples and simple bounds.” In The Fascination of Probability, Statistics and their Applications: In Honour of Ole E. Barndorff-Nielsen. Edited by M. Podolskij, R. Stelzer, S. Thorbjørnsen and A. Veraart. Berlin, Germany: Springer, 2016.
  51. M. Hofert, A. Memartoluie, D. Sunders, and T. Wirjanto. “Improved algorithms for computing worst Value-at-Risk: Numerical challenges and the Adaptive Rearrangement Algorithm.” Available online: arxiv.org/abs/1505.02281 (accessed on 9 May 2015).
  52. M. Shaked, and J.G. Shanthikumar. Stochastic Orders. Berlin, Germany: Springer, 2007. [Google Scholar]
  53. L. Rüschendorf, and L. Uckelmann. “Variance minimization and random variables with constant sum.” In Distributions with Given Marginals and Statistical Modelling. Berlin, Germany: Springer, 2002, pp. 211–222. [Google Scholar]
  54. F. De Vylder, and M.J. Goovaerts. “Analytical best upper bounds on stop-loss premiums.” Insur.: Math. Econ. 1 (1982): 163–175. [Google Scholar] [CrossRef]
  55. O.E. Barndorff-Nielsen. “Exponentially decreasing distributions for the logarithm of particle size.” In Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences; 1977, Volume 353, pp. 401–419.
  56. O.E. Barndorff-Nielsen. “Hyperbolic distributions and distributions on hyperbolae.” Scand. J. Stat. 5 (1978): 151–157.
  57. O.E. Barndorff-Nielsen. “Normal inverse Gaussian distributions and stochastic volatility modeling.” Scand. J. Stat. 24 (1997): 1–13.
  58. E. Eberlein, and U. Keller. “Hyperbolic distributions in finance.” Bernoulli 1 (1995): 281–299.
  59. E. Eberlein, U. Keller, and K. Prause. “New insights into smile, mispricing, and value at risk: The hyperbolic model.” J. Bus. 71 (1998): 371–405.
  60. K. Aas, I. Hobæk Haff, and X.K. Dimakos. “Risk estimation using the multivariate normal inverse Gaussian distribution.” J. Risk 8 (2005): 39–60.
  61. D.B. Madan, and E. Seneta. “The variance gamma (VG) model for share market returns.” J. Bus. 63 (1990): 511–524.
  62. D.B. Madan, P.P. Carr, and E.C. Chang. “The variance gamma process and option pricing.” Eur. Financ. Rev. 2 (1998): 79–105.
  63. S. Demarta, and A.J. McNeil. “The t copula and related copulas.” Int. Stat. Rev. 73 (2005): 111–129.
  64. K. Aas, and I. Hobæk Haff. “The generalized hyperbolic skew Student’s t-distribution.” J. Financ. Econom. 4 (2006): 275–309.
  65. S. Dokov, S.V. Stoyanov, and S. Rachev. “Computing VaR and AVaR of skewed-t distribution.” J. Appl. Funct. Anal. 3 (2008): 189–209.
  66. O.E. Barndorff-Nielsen, and C. Halgreen. “Infinite divisibility of the hyperbolic and generalized inverse Gaussian distributions.” Probab. Theory Relat. Fields 38 (1977): 309–311.
  67. Y.S. Kim, S.T. Rachev, M.L. Bianchi, and F.J. Fabozzi. “Computing VaR and AVaR in infinitely divisible distributions.” Probab. Math. Stat. 30 (2010): 223–245.
  68. P. Embrechts, B. Wang, and R. Wang. “Aggregation-robustness and model uncertainty of regulatory risk measures.” Financ. Stoch. 19 (2015): 763–790.
  69. Basel Committee on Banking Supervision. Fundamental Review of the Trading Book. Basel, Switzerland: Bank for International Settlements, 2012.
  70. N. Costanzino, and M. Curran. “Backtesting general spectral risk measures with application to Expected Shortfall.” Available online: papers.ssrn.com/sol3/papers.cfm?abstract_id=2514403 (accessed on 21 February 2015).
  71. Basel Committee on Banking Supervision. Fundamental Review of the Trading Book: A Revised Market Risk Framework. Basel, Switzerland: Bank for International Settlements, 2013.
  • 1. Note that for a continuous rv $X$, $\mathrm{ES}_\beta(X) = \frac{1}{1-\beta}\,\mathrm{TI}_X(\mathrm{VaR}_\beta(X))$.
  • 2. Likewise, the study of VaR bounds is connected to identifying (in an appropriate admissible class) the elements that are minimum in the sense of convex order, a feature that points to a similarity between the study of bounds on the expectile and the study of bounds on VaR; see Section 2.3 in [30] for these results.
  • 3. Note that Equation (4) also follows from the more general results in [37,38]. Indeed, [37] has shown that any convex risk measure $\rho$ with the Fatou property is consistent with the convex order, meaning that $X \le_{cx} Y$ implies $\rho(X) \le \rho(Y)$. Furthermore, [38] shows that law-invariant risk measures have the Fatou property. Since the expectile is convex and law invariant, it is consistent with the convex order. See [39] for further results on the properties of the expectile and other generalized quantiles with respect to various stochastic orders.
  • 4. Note indeed that the admissible class $S(F_1, \dots, F_d, s^2)$ reflects $d + 1$ constraints, rendering optimization difficult. By relaxing the $d$ (infinite-dimensional) constraints on the marginal distributions and substituting them by the portfolio mean constraint, we enlarge the class (as there are many marginal distributions that yield the same portfolio mean) and effectively obtain two constraints only, which greatly facilitates the optimization.
  • 5. Assuming a compact support would improve the lower bound, but we do not elaborate on this case here and refer to [54].
  • 6. This upper bound can also be derived using the reasoning in the proof of Theorem 5. Indeed, one shows that the upper bound is attained by a diatomic variable $X_p$ (with mean $m_w$ and variance $s_w^2$). Next, one optimizes over $p \in (0, 1)$ to obtain Equation (18).
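
The argument in footnote 6 can be illustrated numerically. The following is only a sketch under stated assumptions, not the authors' procedure: it assumes the usual definition of the $\alpha$-expectile $e$ of $X$ through $\alpha\,E[(X-e)^+] = (1-\alpha)\,E[(e-X)^+]$ with $\alpha > 1/2$, and it parametrizes the diatomic variable $X_p$ as taking the value $m + s\sqrt{(1-p)/p}$ with probability $p$ and $m - s\sqrt{p/(1-p)}$ otherwise, which gives mean $m$ and variance $s^2$. The function names `diatomic_expectile`, `max_over_p` and `closed_form` are hypothetical, and the closed form used for comparison is obtained here from the first-order condition in $p$; it has not been checked against Equation (18).

```python
import math


def diatomic_expectile(p, alpha, m=0.0, s=1.0):
    """alpha-expectile of a two-point variable with mean m and variance s**2.

    Assumed parametrization (not taken from the paper): the upper atom has
    probability p, the lower atom has probability 1 - p.
    """
    hi = m + s * math.sqrt((1.0 - p) / p)
    lo = m - s * math.sqrt(p / (1.0 - p))
    # For lo <= e <= hi, the defining equation
    #   alpha * p * (hi - e) = (1 - alpha) * (1 - p) * (e - lo)
    # is linear in e, so e is a weighted average of the two atoms.
    w_hi = alpha * p
    w_lo = (1.0 - alpha) * (1.0 - p)
    return (w_hi * hi + w_lo * lo) / (w_hi + w_lo)


def max_over_p(alpha, m=0.0, s=1.0, n=100_000):
    """Grid search over p in (0, 1); returns (largest expectile, maximizing p)."""
    grid = (i / n for i in range(1, n))
    return max((diatomic_expectile(p, alpha, m, s), p) for p in grid)


def closed_form(alpha, m=0.0, s=1.0):
    """Candidate closed form derived in this sketch (maximizer p = 1 - alpha)."""
    return m + s * (2.0 * alpha - 1.0) / (2.0 * math.sqrt(alpha * (1.0 - alpha)))


if __name__ == "__main__":
    best, p_star = max_over_p(0.95, m=1.0, s=2.0)
    print(best, p_star, closed_form(0.95, m=1.0, s=2.0))
```

A grid search is used instead of a numerical optimizer so that the sketch depends only on the standard library; the objective is smooth in $p$, so a one-dimensional optimizer would serve equally well.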

Jakobsons, E.; Vanduffel, S. Dependence Uncertainty Bounds for the Expectile of a Portfolio. Risks 2015, 3, 599-623. https://doi.org/10.3390/risks3040599