The New International Regulation of Market Risk: Roles of VaR and CVaR in Model Validation

We model the new quantitative aspects of market risk management for banks that Basel established in 2016 and that came into effect in January 2019. Market risk is measured by Conditional Value at Risk (CVaR) or Expected Shortfall at a confidence level of 97.5%. The regulatory backtest remains largely based on 99% VaR. As additional statistical procedures, in line with the Basel recommendations, supplementary VaR and CVaR backtests must be performed at different confidence levels. We apply these tests to various parametric distributions and use non-parametric measures of CVaR, including CVaR- and CVaR+, to supplement the model validation. Our data relate to a period of extreme market turbulence. After testing eight parametric distributions with these data, we find that the information obtained on their empirical performance is closely tied to the backtesting conclusions regarding the competing models.


Introduction
In 2016, the Basel Committee decided that the market risk capital of banks should be calculated with CVaR or Expected Shortfall¹ at the 97.5% confidence level, while maintaining the backtesting of the models, as before, at 99% VaR (BCBS, 2016, 2019). This shift toward CVaR is motivated by issues of consistency and by the inadequacy of the risk coverage provided by VaR, which has been noted over time. Market risk is now jointly managed by CVaR and VaR, at two different probabilities: p = 2.5% and p = 1% respectively.² Further, Basel suggests adding statistical procedures to ensure the ex-post suitability of models (BCBS, 2016, page 82; BCBS, 2019, paragraph 32.13). We therefore perform four backtests in addition to the 1% VaR backtest: two on VaR at p = 2.5% and p = 5%, and two on CVaR at p = 2.5% and p = 5%. We use non-parametric measures, including CVaR- and CVaR+, to supplement the validation of the distributions used. The aim of this paper is to orchestrate all these aspects in a validation process compatible with the regulations in force.
We work with data on three risky stocks (IBM, General Electric and Walmart) whose price fluctuations refute the usual assumption of normally distributed returns during the period examined. The study period encompasses the extreme price fluctuations during the last economic recession in the United States (NBER,³ December 2007 to June 2009) and the financial crisis of 2007-2009. We evaluate the behavior of VaR and CVaR using several parametric distributions to model returns: the normal distribution, Student's t, the EGB2 (Exponential GB2), SN2 (Skewed Normal Type 2), and SEP3 (Skewed Exponential Power Type 3).⁴ We also construct homogeneous and heterogeneous mixtures of parametric densities. Eight models are analyzed in order to identify the distributions that best represent the data and allow the market risk contained therein to be managed.
1 CVaR is also called Expected Shortfall in the literature. Both measures are equivalent with continuous distributions without jumps (Rockafellar and Uryasev, 2002). See also Dionne (2019).
2 In this paper, we use the letter p to refer to the probability that the VaR is exceeded and 1-p for the corresponding confidence level. The p-value notation is for statistical tests.
3 https://www.nber.org/cycles.html
4 For a description of SN2 and SEP3, see Fernandez et al. (1995) and Rigby et al. (2014).
The analysis comprises three steps. First, the estimation of the models' parameters is validated by standard measures such as the AIC, the BIC and the Kolmogorov-Smirnov goodness-of-fit test. The second validation consists in comparing the kurtosis and asymmetry obtained from the parametric models with the same moments estimated non-parametrically from the data.
The most important point in this step is to evaluate each model by comparing the value of its parametric CVaR against the non-parametric interval [CVaR-np, CVaR+np] which is computed from our sample of returns following Rockafellar and Uryasev (2002). Given that the three nonparametric measures of the sample obey the fundamental inequalities CVaR-np ≤ CVaRnp ≤ CVaR+np, we consider that a good model should also produce a CVaR that obeys the same framing: CVaR-np ≤ CVaRModel ≤ CVaR+np. The third validation is the backtesting of the risk measures, which we carry out in compliance with the Basel regulations in force for market risk. We find that the results of the last two steps are strongly linked; failing to validate that a distribution properly fits the data would significantly affect the backtesting results.
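The first two validation steps can be illustrated with a minimal sketch. The AIC and BIC formulas are the standard ones; the function names are hypothetical, and the numerical values are the CVaR figures (in percent, at p = 5%) quoted later in the text, read here as CVaR-np = 2.96269 and CVaR+np = 2.97795.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion, 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion, k ln(n) - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def cvar_is_framed(cvar_model: float, cvar_minus_np: float, cvar_plus_np: float) -> bool:
    """True when CVaR-np <= CVaR_model <= CVaR+np (the framing criterion)."""
    return cvar_minus_np <= cvar_model <= cvar_plus_np


# Bounds and model CVaRs in percent, as quoted in the text for p = 5%.
# Reading 2.96269 as CVaR-np and 2.97795 as CVaR+np is an assumption.
CVAR_MINUS_NP, CVAR_PLUS_NP = 2.96269, 2.97795
MODEL_CVAR = {"2:NO": 3.11363, "2:T": 3.04198, "3:NO": 3.00452}

FRAMED = {
    name: cvar_is_framed(value, CVAR_MINUS_NP, CVAR_PLUS_NP)
    for name, value in MODEL_CVAR.items()
}
for name, ok in FRAMED.items():
    print(f"{name}: framed = {ok}")
```

Because the non-parametric interval is very narrow, the criterion is demanding: none of the three mixtures listed here falls inside it, which agrees with the discussion of models M4 to M6.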
Given that the calculations of parametric CVaR are much more complex than those of VaR, we define in detail each of the distributions or mixtures of distributions that we use in the eight models, together with the mathematical derivations of the corresponding CVaR. The mathematical developments are presented in the appendices. Appendix A1 describes the symbols of the different models. Appendix A2 shows the general expression of CVaR, and Appendix A3 outlines the general expression of CVaR from a mixture of distributions. Details of the statistical models and the backtesting procedure are also provided in the appendices.
The following section presents the data used. Section 2 provides a preliminary analysis of the data. Section 3 is devoted to estimating the parameters of the eight competing models and empirically verifying their respective performances. Section 4 conducts backtesting of the models and the final section concludes the paper.

Preliminary data analysis
We begin by calculating the optimal weights of a portfolio that minimizes the relative VaR at the 95% confidence level (p = 5%) under the constraint that the weights sum to 1. We assume the returns to be normally distributed for the moment, so minimizing the relative VaR is equivalent to minimizing the CVaR plus the mean of the portfolio returns. Moreover, the optimal weights are independent of the chosen p, as we will see later. We also assume that these weights remain optimal for all distributions studied in the next sections, in order to obtain comparable backtesting results between the different models. In other words, we suppose that portfolio optimization and model backtesting are separate decisions, as is often observed in financial institutions.
Under the normal assumption, the relative VaR of the portfolio is written as:

$$\mathrm{VaR}_r = -\Phi_0^{-1}(p)\,\sqrt{\beta^T \Sigma \beta} = -q_0\,\sqrt{\beta^T \Sigma \beta} \qquad (1)$$

where $\Phi_0^{-1}(\cdot)$ is the inverse of the cumulative function of $N(0,1)$ evaluated at p, $\beta$ is the vector of security weights, $\beta^T$ is the transpose of $\beta$, $\Sigma$ is the variance-covariance matrix of security returns and $q_0$ is the quantile of $N(0,1)$ relative to p. The VaR expression is positive since $q_0 < 0$ in the left tail. On the other hand, since $q_0$ depends only on p, equation (1) is minimised on the term $\beta^T \Sigma \beta$ alone. Thus, the optimal weights are independent of the chosen p. The Excel file⁶ gives the results of Table 2 in percentages for p = 5%. The table shows that the VaR of the optimal portfolio is lower than those of the weighted assets, which is consistent with the diversification principle.

We now compute the CVaR of the optimal portfolio. With the normal assumption, CVaR is written according to equation (2) (based on equation A6 in the appendix):

$$\mathrm{CVaR} = -\beta^T \mu + \frac{\phi_0(q_0)}{p}\,\sqrt{\beta^T \Sigma \beta} \qquad (2)$$

where $\phi_0(\cdot)$ is the density function of $N(0,1)$ and $\mu$ is the vector of mean security returns. Table 3 shows the results for the optimal portfolio. From equation (2), we can see that $\phi_0(q_0)/p$ is constant for a given p. Thus, minimizing the CVaR plus the mean of the portfolio returns yields the same optimal weights as minimizing the relative VaR.
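As an illustration of the two normal-case expressions, the following minimal sketch computes the relative VaR and the CVaR of a portfolio from its weights, mean vector and covariance matrix. The two-asset inputs and all function names are hypothetical and are not the paper's data; only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # reduced normal N(0, 1)


def portfolio_moments(weights, means, cov):
    """Mean and standard deviation of the portfolio return."""
    mean_p = sum(w * m for w, m in zip(weights, means))
    var_p = sum(
        wi * cov[i][j] * wj
        for i, wi in enumerate(weights)
        for j, wj in enumerate(weights)
    )
    return mean_p, sqrt(var_p)


def normal_relative_var(sigma_p, p):
    """Relative VaR: -q0 * sigma_p, with q0 the N(0,1) quantile at p."""
    q0 = STD_NORMAL.inv_cdf(p)
    return -q0 * sigma_p


def normal_cvar(mean_p, sigma_p, p):
    """CVaR under normality: -mean_p + sigma_p * phi0(q0) / p."""
    q0 = STD_NORMAL.inv_cdf(p)
    return -mean_p + sigma_p * STD_NORMAL.pdf(q0) / p


# Hypothetical two-asset inputs (daily returns), for illustration only.
MEANS = [0.0004, 0.0002]
COV = [[0.0004, 0.0001], [0.0001, 0.0002]]
WEIGHTS_A, WEIGHTS_B = [0.3, 0.7], [0.6, 0.4]

MEAN_A, SIGMA_A = portfolio_moments(WEIGHTS_A, MEANS, COV)
MEAN_B, SIGMA_B = portfolio_moments(WEIGHTS_B, MEANS, COV)

for p in (0.05, 0.025, 0.01):
    ratio = normal_relative_var(SIGMA_A, p) / normal_relative_var(SIGMA_B, p)
    print(f"p = {p}: relative VaR ratio A/B = {ratio:.6f}")
```

The ratio printed for the two weight vectors does not change with p, which is the numerical counterpart of the statement that the optimal weights are independent of the chosen p.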
The portfolio's CVaR is also naturally lower than that of the weighted assets, which means that the basic principle of diversification is followed. In the next section we continue to work with p = 5%. At the end of the section we will discuss the effect of this choice with respect to p = 2.5% and p = 1%.
We want to calculate the VaR, CVaR, CVaR- and CVaR+ measures for the sample data at p = 5% following Rockafellar and Uryasev (2002). With our notation, we can write:

$$\mathrm{CVaR}^- = -E\left[X \mid X \le -\mathrm{VaR}\right] \quad\text{and}\quad \mathrm{CVaR}^+ = -E\left[X \mid X < -\mathrm{VaR}\right] \qquad (3)$$

where the vector $\{X_t\}_{t=1}^{1200}$ denotes the portfolio returns. Note the minus sign in front of VaR since $\mathrm{VaR} > 0$ and $X_t < 0$ in the left tail of the distribution. The data $\{X_t\}$ can be seen as a discrete finite sample drawn from an unknown distribution. Therefore, a non-parametric estimate of equations in (3) can be obtained from the historical simulation method by writing:

$$\mathrm{CVaR}^-_{np} = -\frac{\sum_{t=1}^{1200} X_t\,\mathbf{1}\{X_t \le -\mathrm{VaR}_{np}\}}{\sum_{t=1}^{1200} \mathbf{1}\{X_t \le -\mathrm{VaR}_{np}\}} \quad\text{and}\quad \mathrm{CVaR}^+_{np} = -\frac{\sum_{t=1}^{1200} X_t\,\mathbf{1}\{X_t < -\mathrm{VaR}_{np}\}}{\sum_{t=1}^{1200} \mathbf{1}\{X_t < -\mathrm{VaR}_{np}\}}$$

where $\mathbf{1}\{\cdot\}$ is the indicator function. The results are presented in Table 4. CVaRnp is not shown in the table because CVaRnp is equal to CVaR+np. This last fact is also verified at p = 2.5% and p = 1%.⁷ The relative difference between CVaR-np and CVaR+np, (2.97795% − 2.96269%) / 2.96269% = 0.51%, is extremely small. This adds a significant selective requirement in identifying a distribution or mixture of continuous parametric distributions whose CVaR must be framed by CVaR- and CVaR+.⁸

Figure 2 clearly shows that a normal density would not be appropriate to represent the optimal portfolio data. On the other hand, Student's t would not be sharp enough and does not keep enough mass around the mode, as would be required by the kernel density plot of the data. These remarks would rather suggest a Laplace distribution.
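The non-parametric measures defined at the start of this section can be sketched as follows, assuming the discrete-sample definitions of Rockafellar and Uryasev (2002): VaR is the smallest loss whose empirical cumulative frequency reaches 1 − p, and CVaR is the weighted average λ·VaR + (1 − λ)·CVaR+ with λ = (Ψ(VaR) − (1 − p)) / p. The function name, the dataclass and the small sample are hypothetical.

```python
from dataclasses import dataclass
from math import ceil
from statistics import fmean


@dataclass
class TailMeasures:
    var: float
    cvar_minus: float
    cvar: float
    cvar_plus: float


def nonparametric_tail_measures(returns, p):
    """Historical-simulation VaR, CVaR-, CVaR and CVaR+ (positive numbers)."""
    losses = sorted(-x for x in returns)
    n = len(losses)
    alpha = 1.0 - p
    # Smallest k such that k / n >= alpha (tolerance guards against rounding).
    k = max(1, ceil(alpha * n - 1e-9))
    var = losses[k - 1]
    at_or_beyond = [loss for loss in losses if loss >= var]
    strictly_beyond = [loss for loss in losses if loss > var]
    cvar_minus = fmean(at_or_beyond)
    # If no loss exceeds VaR, CVaR+ is set to VaR by convention in this sketch.
    cvar_plus = fmean(strictly_beyond) if strictly_beyond else var
    # Empirical cumulative frequency of losses at VaR.
    psi = sum(1 for loss in losses if loss <= var) / n
    lam = min(1.0, max(0.0, (psi - alpha) / (1.0 - alpha)))
    cvar = lam * var + (1.0 - lam) * cvar_plus
    return TailMeasures(var, cvar_minus, cvar, cvar_plus)


# Hypothetical sample of 20 returns, from -10% to +9% in steps of 1%.
RETURNS = [0.01 * i for i in range(-10, 10)]
MEASURES = nonparametric_tail_measures(RETURNS, p=0.10)
print(MEASURES)
```

In this small example (1 − p) times the sample size is an integer and there are no ties, so λ = 0 and CVaR coincides with CVaR+. The same mechanism would explain why CVaRnp equals CVaR+np for 1,200 observations at p = 5%, 2.5% and 1%.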

Estimation of the parametric distributions
From now on, we use the weights of the optimal portfolio of Section 2 as the reference portfolio. The VaR calculated from the parametric models will be the absolute VaR relative to 0, like the CVaR. The models are denoted as M1 to M8. Complete estimations of model coefficients are presented in the following tables and in Tables A2 and A3 in the appendix.
For comparison, we start with the M1 model, assuming that the data follow a normal distribution (see definitions and expressions in Appendix A4). Model M1 is denoted 1:NO, in that it consists of a single normal distribution. The results are presented in Table 5. VaR of this model is higher than VaRnp (non-parametric VaR), whereas CVaR is much lower than the two nonparametric CVaR. Unsurprisingly, the Kolmogorov-Smirnov (KS) test rejects this model (p-value = 0.0015 < 10%). The reported asymmetry and kurtosis values, 0.3589 and 9.8158, correspond to the empirical moments of portfolio returns (Table 2). They are very different compared to those of the normal distribution.
is problematic. The reason for this is probably related to the fact that Student's t accounts for tail thickness, but its kurtosis is undefined. In addition, Student's t does not capture the asymmetry of the data.
We now move on to the M3 model, using the EGB2 distribution (CVaR calculations are made by numerical integration because the analytical expression is not available; see definitions and expressions in Appendix A6). This model provides one more parameter. Indeed, the parameters ν and τ characterize both the tail thickness and the asymmetry of the distribution. The distribution is skewed negatively or positively, or is symmetric, when ν < τ, ν > τ, or ν = τ respectively. As for the thickness of the tail, the smaller the ν, the thicker the left tail (all other parameters kept equal). Estimation of the M3 model gives τ = 0.1652 and ν = 0.1587. Given that ν is very close to τ, we have a slight negative skewness of -0.081 (Table 7), which is not compatible with the nonparametric skewness coefficient of the data, 0.359.
The kurtosis of 5.8 is still insufficient compared with 9.8. Despite the great flexibility mentioned in the literature regarding the four-parameter EGB2, these results seem to indicate that a single parametric distribution would not be sufficient to properly identify the risks inherent in our data, despite the fact that the KS test does not reject this model (p-value = 0.3490).
A last word concerning the values ν = 0.1587 and τ = 0.1652. Given that the parameters ν and τ are both very small and near zero, Lemma 2 of Caivano and Harvey (2014) applies: the EGB2 tends toward a Laplace density when ν ≈ τ ≈ 0. This directly corroborates the observation of the sharp mode of the kernel density plotted in Figure 2. We will observe this convergence toward a Laplace distribution below. We now move on to mixed distributions.
We estimate the VaR and CVaR of the M4 model, constructed with a mixture of two normal distributions (2:NO), using the expressions given in appendices A2, A3 and A4. The quantile $q_m$ at the degree of confidence $(1 - p)$ of a mixture of densities is obtained by a numerical method. VaR is equal to $-q_m$. CVaR is calculated using the value of $q_m$. The results of the 2:NO model, presented in Table 8, show that we may be on the right track with a mixture of densities.
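The numerical step described above can be sketched for a mixture of normal densities. The sketch assumes bisection as the numerical method (the text does not say which one is used) and the standard component-wise tail expression for the normal; the mixture weights, locations and scales below are hypothetical and are not the estimates of Table 8.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # reduced normal N(0, 1)


def mixture_cdf(x, weights, mus, sigmas):
    """Cumulative function of a normal mixture evaluated at x."""
    return sum(
        w * STD_NORMAL.cdf((x - m) / s) for w, m, s in zip(weights, mus, sigmas)
    )


def mixture_quantile(p, weights, mus, sigmas, n_iter=200):
    """Solve mixture_cdf(q) = p by bisection (the cdf is increasing in q)."""
    lo = min(m - 40.0 * s for m, s in zip(mus, sigmas))
    hi = max(m + 40.0 * s for m, s in zip(mus, sigmas))
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, mus, sigmas) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mixture_var_cvar(p, weights, mus, sigmas):
    """VaR = -q_m and CVaR = -(1/p) * sum_j w_j [mu_j Phi0(c_j) - sigma_j phi0(c_j)]."""
    q_m = mixture_quantile(p, weights, mus, sigmas)
    tail_sum = 0.0
    for w, m, s in zip(weights, mus, sigmas):
        c = (q_m - m) / s  # reduced quantile of component j
        tail_sum += w * (m * STD_NORMAL.cdf(c) - s * STD_NORMAL.pdf(c))
    return -q_m, -tail_sum / p


# Hypothetical 2:NO parameters (a calm and a turbulent regime).
W, MU, SIGMA = [0.8, 0.2], [0.0005, -0.001], [0.008, 0.025]
VAR_2NO, CVAR_2NO = mixture_var_cvar(0.05, W, MU, SIGMA)
print(f"VaR = {VAR_2NO:.5%}, CVaR = {CVAR_2NO:.5%}")
```

With a single component, the function reduces to the closed-form normal VaR and CVaR, which gives a simple sanity check of the mixture expression.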
VaR = 1.95397% and CVaR = 3.11363% clearly approach the non-parametric measurements compared with those obtained for the 1:NO distribution. Kurtosis = 6.7 also improves. The KS test gives a p-value = 0.2181, which is comfortably above 10%. However, the CVaR of the 2:NO model is still far from the range [CVaR-np, CVaR+np].
The estimation of the mixture of two Student's t distributions (2:T, see definitions and expressions in appendices A2, A3 and A5) presented in Table 9 yields a very large degrees-of-freedom parameter for the first Student's t, $\nu_1 = 23{,}642.3 > 30$, clearly indicating that the first Student's t is practically a normal distribution. The second distribution, with $\nu_2 = 6.4162 > 4$ degrees of freedom, allows the mixture to now have a well-defined kurtosis of 8.4, close to the kurtosis of the data of 9.8. We have a p-value equal to 0.1100, which is at the limit of rejection at 10%. The BIC = -7,207.65 is worse than that of 1:T and 1:EGB2. VaR of 2.02945% is very close to the non-parametric value, but CVaR = 3.04198% lies above the range [CVaR-np, CVaR+np].
Note also that the asymmetry coefficient of 2:T, -0.15, is negative while the non-parametric coefficient, 0.36, is positive. This suggests that the asymmetry in the data should be better integrated into the modeling. This 2:T mixture appears to be an improvement, but remains insufficient because it does not seem to allow the asymmetry to be modeled directly. Before exploring the addition of a parameter capturing asymmetry, we want to examine what happens for a mixture of three normal densities. Model M6 is constructed with a 3:NO mixture. In Table 10, the p-value of the KS test is 0.2280 > 10%. Moreover, M6 is the first model whose kurtosis of 9.4 is almost identical to the non-parametric value. This time, the asymmetry coefficient is positive, as is the non-parametric one.
VaR (3:NO) = 2.03847% is almost identical to the non-parametric value. As for the CVaR (3:NO) = 3.00452%, it is the closest to the interval [CVaR-np, CVaR+np] thus far. This model appears to be better suited to the data. We will come back to this finding when we perform the backtests of the models. To advance in the modeling, we now explore the effect of adding an asymmetry parameter as an enhancement to the previous 3:NO model. The SN2 density (Skewed Normal type 2, Fernandez et al., 1995, appendix A7) allows this. We inject an asymmetry parameter in two normal densities and keep the third one as is. The mixture of this model M7 becomes 2:SN2 + 1:NO.
The 2:SN2+1:NO model includes 10 parameters. The effect of capturing asymmetry is clear in all the results presented in Table 11. The asymmetry coefficient is closest to the non-parametric one, and the kurtosis of M7 is even slightly higher than the non-parametric kurtosis.
In addition to the direct parameterization of the asymmetry, we also want to capture the tail thickness. Fernandez et al. (1995) propose the SEP3 density (Skewed Exponential Power type 3; see also Rigby et al., 2014). We wish to reduce the number of parameters at the same time. The M8 model is constructed with the 2:SEP3 mixture (mixture of two SEP3 distributions, see definitions and calculations in appendices A3, A8 and A9). Table 12 is the largest. The asymmetry coefficient is very small and positive. The kurtosis is large, but smaller than in the previous model. Given the values of AIC and BIC, the model fits the data better than the previous model. The VaR(2:SEP3) = 1.99295% is a little far from the non-parametric value, but most importantly, the VaR = degenerates into an asymmetric (ν ≠ 1) normal (τ = 2), which is finally an SN2. In this case, a Laplace mixture added to an SN2 would probably have suited the data. This result directly corroborates similar findings in the recent market risk literature highlighting the mixture qualities of a Laplace and a Gaussian distribution (see Haas et al., 2006; Haas, 2009; Broda and Paolella, 2011; Miao et al., 2016; Taylor, 2019).
Table A4 in Appendix A11). This percentile is actually too far down the tail of losses for CVaR to be accurate. This should not pose a problem under current regulatory requirements, given that Basel requires a backtest on VaR rather than CVaR at this 1% percentile and CVaR at 2.5% is the measure of market risk.

First, the p-value of the KS test in
To conclude this section, Figure 3
As for model CVaR, we apply the $Z_{ES}$ backtest of Acerbi and Szekely (2017), and the backtest of Righi and Ceretta (2015), which will now be called RC. We also show the results of the backtests $Z_1$ and $Z_2$ for information purposes only.⁹ Here we deploy five backtests to validate the VaR and CVaR of competing models in order to: (i) satisfy the regulatory requirement to perform the 1% VaR backtest (BCBS, 2016, page 77; BCBS, 2019, paragraph 32.5); (ii) as a complement, validate the 2.5% CVaR and the 2.5% VaR; (iii) as another complement, validate the 5% CVaR and the 5% VaR.
Currently, none of the four backtests in points (ii) and (iii) is explicitly required to validate the banks' overall market risk coverage. However, we propose them as part of the Basel recommendation to foresee additional statistical tests with varying degrees of confidence to support model accuracy (BCBS, 2016, page 82; BCBS, 2019, paragraph 32.13).
It is thus natural to consider adding validation of the risk measures of (ii) at 2.5% given that 2.5% CVaR determines the coverage. The 5% backtests of (iii) would be of less importance, but should help confirm the robustness of the models. Note that the five backtests are carried out as out-of-sample tests, the approach of which is set out in Appendix A10.
9 Although they are currently quite popular in the literature, these backtests have some problems, as reported in the literature, that prevent us from drawing conclusions based on their results.

Backtest results of the VaR and CVaR models
Backtest results for the eight models are presented in Table 13. Unsurprisingly, the normal model 1:NO is rejected because of its 1% VaR, 2.5% VaR and 2.5% CVaR. However, we did not expect that the Student 1:T model would be rejected for similar reasons. The p-values are larger, but remain <10%, which is the critical rejection threshold.
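To make the logic of a VaR backtest concrete, the following minimal sketch counts the exceedances of a VaR forecast and computes a one-sided binomial p-value for the observed count. This is a generic unconditional coverage check given only as an illustration; it is not the procedure of Appendix A10, and the function names and the toy inputs are hypothetical.

```python
from math import exp, lgamma, log


def count_exceedances(returns, var_forecasts):
    """Number of days on which the return falls below -VaR (VaR > 0)."""
    return sum(1 for x, v in zip(returns, var_forecasts) if x < -v)


def binomial_upper_tail(n_exceed, n_obs, p):
    """P(N >= n_exceed) for N ~ Binomial(n_obs, p), computed in log space."""

    def log_pmf(k):
        return (
            lgamma(n_obs + 1) - lgamma(k + 1) - lgamma(n_obs - k + 1)
            + k * log(p) + (n_obs - k) * log(1.0 - p)
        )

    total = sum(exp(log_pmf(k)) for k in range(n_exceed, n_obs + 1))
    return min(1.0, total)


# Toy example: three returns against a constant 2% VaR forecast.
N_EXCEED = count_exceedances([-0.03, 0.01, -0.01], [0.02, 0.02, 0.02])

# Hypothetical backtest: 7 exceedances of the 1% VaR over 250 days.
P_VALUE = binomial_upper_tail(7, 250, 0.01)
print(f"exceedances in toy sample = {N_EXCEED}, p-value = {P_VALUE:.4f}")
```

A model would be rejected at the 10% threshold used in the text when such a p-value falls below 0.10, that is, when the number of exceedances is too large to be compatible with the nominal probability p.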

Conclusion
This paper presented a framework for validating market risk models. The approach jointly deploys CVaR and VaR backtests, in compliance with international regulations in force (coverage with 2.5% CVaR and required backtest on 1% VaR). Further, given the use of actual data that cover a period of extreme market turbulence, the assumption of normality of returns is definitively outdated. Identifying a parametric model entails comparing the magnitudes resulting from the calculations using the model parameters with the equivalent magnitudes estimated in a nonparametric distribution. The keystone of this article is the specification of the framework of CVaR of the model to be evaluated by the interval [CVaR-np, CVaR+np], which appears to be an important criterion for evaluating models and is very closely linked to the conclusions of the backtests of the models. As seen in the different estimates, nonparametric kurtosis and asymmetry also help guide the research approach to determine the direction in which to move forward.

Further, this research is an exercise in the actual implementation of VaR and CVaR backtesting when choosing a parametric model that can manage the market risk embedded in the data. The identification of the 2:SEP3 mixture, which seems to work well with our data, is not a coincidence. In fact, the mixing of a normal distribution with a Laplace distribution directly corroborates the conclusions of the recent literature, which positions this mixture as the natural replacement for the normal distribution for market risk (see Haas et al., 2006; Haas, 2009; Broda and Paolella, 2011; Miao et al., 2016; Taylor, 2019).

Table 13 Out-of-sample backtests of VaR and CVaR

Table A1 presents the symbols of the estimated models. We start by deriving the general formulas of CVaR for a statistical distribution (Appendix A2) and for a mixture of distributions (Appendix A3).

Expression of the density and cumulative function of a reduced distribution
We are interested in the family $\mathcal{F}$ of location-scale parametric distributions having a location parameter $\mu$ and a scale parameter $\sigma$. If $F \in \mathcal{F}$ and $y \sim F$, then the reduced variable $z = (y - \mu)/\sigma$ follows the distribution $F_0$ defined with equality: $F(y) = F_0(z)$. $F_0$ is said to be a reduced cumulative function of $F$. The reduced density $f_0(\cdot)$ is related to the density $f(\cdot)$ by writing:

$$f(y) = \frac{1}{\sigma}\, f_0\!\left(\frac{y - \mu}{\sigma}\right)$$

All densities in this document belong to $\mathcal{F}$, including the normal distribution and Student's t. The location and scale parameters coincide with the mean and standard deviation of the normal distribution. This is not always the case for the other distributions in $\mathcal{F}$.

General expression of CVaR
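We denote by $q < 0$ the quantile of VaR corresponding to the degree of confidence $(1 - p)$. As in the study by Broda and Paolella (2011), the tail quantity of a density $f_0(\cdot)$ is defined as:

$$\mathrm{Tail}_{f_0}(c) = \int_{-\infty}^{c} z\, f_0(z)\, dz$$

We develop the expression of CVaR using its definition:

$$\mathrm{CVaR} = -E\left[y \mid y \le q\right] = -\frac{1}{p}\int_{-\infty}^{q} y\, f(y)\, dy = -\frac{1}{p}\left[\mu\, F_0\!\left(\frac{q-\mu}{\sigma}\right) + \sigma\, \mathrm{Tail}_{f_0}\!\left(\frac{q-\mu}{\sigma}\right)\right] \qquad (A4)$$

Note that $F_0\!\left((q - \mu)/\sigma\right) = p$, which would simplify the expression (A4). Even so, we will leave the expression as it is so that it will be of the same form as for mixtures of distributions, where there will indeed be several cumulatives $F_0(\cdot)$, one per component.

In the general case, $q_m$ is found numerically as a solution to the equation $F_m(q_m) = p$, where $F_m$ denotes the cumulative function of the mixture.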

A4. Expression of CVaR of a normal distribution
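The density of the normal distribution with parameters $\mu$ (location) and $\sigma$ (scale) is written as:

$$f(y) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)$$

We denote by $\phi_0(\cdot)$ and $\Phi_0(\cdot)$ the density and the cumulative function of $N(0,1)$. Since $\phi_0'(z) = -z\,\phi_0(z)$, the tail integral of the reduced normal density is $\int_{-\infty}^{c} z\,\phi_0(z)\,dz = -\phi_0(c)$. We apply (A4) and (A5) to obtain, with $c = (q - \mu)/\sigma$:

$$\mathrm{CVaR} = -\frac{1}{p}\left[\mu\,\Phi_0(c) - \sigma\,\phi_0(c)\right] = -\mu + \frac{\sigma}{p}\,\phi_0(c) \qquad (A6)$$

where the second equality uses $\Phi_0(c) = p$.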

A5. Expression of the CVaR of the Student's t distribution
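The density $f_{T,\mu,\sigma,\nu}(\cdot)$ of the Student's t of parameters $\mu$ (location), $\sigma$ (scale) and $\nu$ (degrees of freedom) is written as:

$$f_{T,\mu,\sigma,\nu}(y) = \frac{A}{\sigma}\left(1 + \frac{z^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \qquad z = \frac{y-\mu}{\sigma}, \qquad A = \frac{1}{\sqrt{\nu}\,B\!\left(\tfrac{1}{2}, \tfrac{\nu}{2}\right)}$$

where $B(\cdot,\cdot)$ is the beta function.¹⁰ The tail of the reduced density at a point $c < 0$ is:

$$\mathrm{Tail}_{f_0}(c) = A\int_{-\infty}^{c} z\left(1 + \frac{z^2}{\nu}\right)^{-\frac{\nu+1}{2}} dz \qquad (A7)$$

We change the variable $u = z^2/\nu$, hence $z\,dz = \nu\,du/2$. The integral of equation (A7) becomes:

$$\mathrm{Tail}_{f_0}(c) = \frac{A\,\nu}{2}\int_{+\infty}^{c^2/\nu} (1+u)^{-\frac{\nu+1}{2}}\, du = -\frac{A\,\nu}{\nu - 1}\left(1 + \frac{c^2}{\nu}\right)^{-\frac{\nu-1}{2}} \qquad (A8)$$

10 There is another way to write the constant A with the gamma function $\Gamma(\cdot)$ instead of the beta function.
In equation (A8), we recognize the reduced density $f_0(c) = A\left(1 + c^2/\nu\right)^{-(\nu+1)/2}$. The final expression of the tail is simplified in (A9):

$$\mathrm{Tail}_{f_0}(c) = -\frac{\nu + c^2}{\nu - 1}\, f_0(c) \qquad (A9)$$

In order to be valid we need to have $\nu > 1$. We now apply (A9) in (A4) to find, with $c = (q - \mu)/\sigma$:

$$\mathrm{CVaR} = -\frac{1}{p}\left[\mu\, F_0(c) - \sigma\,\frac{\nu + c^2}{\nu - 1}\, f_0(c)\right]$$

In Excel, the functions related to Student's t distribution consider the degree of freedom $\nu$ to be an integer. Therefore, calculations cannot be made in standard form, and an additional module is required. The XRealStats.xlam module is used. It must be downloaded from its website¹¹, placed in the C:/TP5 directory and activated to use the functions that allow calculations with $\nu \in \mathbb{R}$. The cumulative and density functions are called by T_DIST. The inverse of the cumulative function is T_INV.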
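Outside Excel, the tail of the reduced Student's t density can be checked numerically for a non-integer ν. The following minimal sketch assumes the standard closed form Tail(c) = −f0(c)(ν + c²)/(ν − 1), valid for ν > 1, and compares it with a Simpson-rule integration of z·f0(z). The function names, the truncation point and the step count are hypothetical choices; ν = 6.4162 is the value quoted in the text for the second component of the 2:T mixture.

```python
from math import exp, lgamma, log, pi


def t_reduced_density(z, nu):
    """Density of the reduced Student's t (location 0, scale 1), real nu > 0."""
    log_a = lgamma((nu + 1.0) / 2.0) - lgamma(nu / 2.0) - 0.5 * log(nu * pi)
    return exp(log_a - 0.5 * (nu + 1.0) * log(1.0 + z * z / nu))


def t_tail_closed_form(c, nu):
    """Integral of z * f0(z) from -infinity to c, valid only for nu > 1."""
    if nu <= 1.0:
        raise ValueError("the tail integral requires nu > 1")
    return -t_reduced_density(c, nu) * (nu + c * c) / (nu - 1.0)


def t_tail_numeric(c, nu, lower=-200.0, n_steps=40_000):
    """Composite Simpson approximation of the same integral on [lower, c]."""

    def g(z):
        return z * t_reduced_density(z, nu)

    h = (c - lower) / n_steps  # n_steps must be even for Simpson's rule
    total = g(lower) + g(c)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * g(lower + i * h)
    return total * h / 3.0


NU = 6.4162  # degrees of freedom quoted in the text for the 2:T mixture
C = -2.0     # hypothetical reduced quantile
print(t_tail_closed_form(C, NU), t_tail_numeric(C, NU))
```

The two values should agree to many decimal places. Truncating the integral at −200 is harmless for this ν because the integrand decays like |z|^(−ν), but it would need to be revisited for ν close to 1.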

A6. The EGB2 distribution: Exponential GB2
The EGB2 (Exponential Generalized Beta type 2) density has four parameters and is written, according to Kerman and McDonald (2015), for $y \in \mathbb{R}$:

$$f_{EGB2}(y) = \frac{e^{\nu z}}{\sigma\, B(\nu, \tau)\left(1 + e^{z}\right)^{\nu + \tau}}, \qquad z = \frac{y - \mu}{\sigma}$$

where $B(\cdot,\cdot)$ is the beta function. The parameters ν and τ characterize both tail thickness and the asymmetry of the distribution. The distribution has a negative or positive asymmetry, or is symmetrical, when ν < τ, ν > τ or ν = τ respectively. As for the tail thickness, the smaller the ν, the thicker the left tail (all other parameters being equal).
(2015), McDonald (2008) and McDonald and Xu (1995). Cummins, Dionne, McDonald, and Pritchett (1990) apply the GB2 to compute reinsurance premiums and quantiles for the distribution of total insurance losses. EGB2 is increasingly used in finance, as the studies by Caivano and Harvey (2014)

A7. The Skewed Normal Type 2 distribution: SN2
The definition of the density of Skewed Normal Type 2 (SN2) by Fernandez et al. (1995) for $y \in \mathbb{R}$ can be written as:

$$f_{SN2}(y) = \frac{2\nu}{\sigma\,(1 + \nu^2)}\left[\phi_0(\nu z)\,\mathbf{1}\{y < \mu\} + \phi_0\!\left(\frac{z}{\nu}\right)\mathbf{1}\{y \ge \mu\}\right], \qquad z = \frac{y - \mu}{\sigma}$$

The expression (A11) is valid only if $p\,(1 + \nu^2)/2 \le 1$, so that it is defined. This requires that $\nu \le \sqrt{2/p - 1}$.
The expression of the tail is developed as follows: Equation (A12) is obtained by changing the variable $u = z \times \nu$ and using equation (A5a).
Equations (A11) and (A12) in (A4) give the expression of CVaR: Again, when $\nu = 1$ we find the CVaR of the normal distribution.
Parameter a is for the shape of these functions. It is easy to see that $\Gamma(a) = \gamma(a, +\infty)$. We also have a distribution that bears the same name, i.e. gamma,¹² whose cumulative parameter shape = a (and scale = 1 because it is standardized) evaluated at the point $x > 0$. It is written as
(2015), the p-value is obtained by bootstrapping according to Efron and Tibshirani (1994). Here, we will obtain it instead by following exactly the same construction as for $Z_{ES}$.
Finally, and for information purposes only, the Z1 and Z2 statistics are defined by:
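The two statistics can be sketched in code under the assumption that they follow the definitions of Acerbi and Szekely's earlier backtesting paper (2014): with I_t = 1 when X_t + VaR_t < 0, Z1 = (Σ X_t I_t / ES_t) / N + 1, where N is the number of exceedances, and Z2 = Σ X_t I_t / (T p ES_t) + 1. The function name and the toy inputs are hypothetical; VaR and ES forecasts are taken as positive numbers.

```python
def acerbi_szekely_z1_z2(returns, var_forecasts, es_forecasts, p):
    """Return (Z1, Z2); Z1 is None when there is no VaR exceedance."""
    n_obs = len(returns)
    # Ratios X_t / ES_t restricted to days on which the VaR is exceeded.
    ratios = [
        x / es
        for x, v, es in zip(returns, var_forecasts, es_forecasts)
        if x + v < 0
    ]
    n_exceed = len(ratios)
    z2 = sum(ratios) / (n_obs * p) + 1.0
    z1 = sum(ratios) / n_exceed + 1.0 if n_exceed > 0 else None
    return z1, z2


# Toy inputs: four returns, constant 2% VaR and 4% ES forecasts, p = 25%.
Z1, Z2 = acerbi_szekely_z1_z2(
    returns=[-0.05, 0.01, -0.01, 0.02],
    var_forecasts=[0.02] * 4,
    es_forecasts=[0.04] * 4,
    p=0.25,
)
print(Z1, Z2)
```

Under a correctly specified model both statistics have expectation zero, and negative values signal that realized tail losses exceed the forecast ES. In the toy example the single exceedance of -5% against an ES of 4% gives negative values for both.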

A11. Model estimation and parametric and non-parametric VaR and CVaR calculations
The estimated parameters of the distributions are given in the following tables.