Article

Estimating Value-at-Risk and Expected Shortfall: Do Polynomial Expansions Outperform Parametric Densities?

Departamento de Fundamentos del Análisis Económico, Campus San Vicente del Raspeig, Universidad de Alicante, 03690 Alicante, Spain
*
Author to whom correspondence should be addressed.
Mathematics 2022, 10(22), 4329; https://doi.org/10.3390/math10224329
Submission received: 17 October 2022 / Revised: 7 November 2022 / Accepted: 10 November 2022 / Published: 18 November 2022

Abstract

We assess Value-at-Risk (VaR) and Expected Shortfall (ES) estimates assuming different models for the standardized returns: distributions based on polynomial expansions such as Cornish-Fisher and Gram-Charlier, and well-known parametric densities such as normal, skewed-t and Johnson. This paper aims to analyze whether models based on polynomial expansions outperform the parametric ones. We carry out the model performance comparison in two stages: first, with a backtesting analysis of VaR and ES; and second, using loss functions. Our backtesting results show that all distributions, except for normal ones, perform quite well in VaR and ES estimations. Regarding the loss function analysis, we conclude that polynomial expansions (specifically, the Cornish-Fisher one) usually outperform parametric densities in VaR estimation, but the latter (specifically, the Johnson density) slightly outperform the former in ES estimation; however, the gains of using one approach or the other are modest.
MSC:
91G70
JEL Classification:
C22; C58; G1

1. Introduction

In recent years, major events such as the great plunge in oil prices in 2015, the economic instability in Greece during 2009–2017, the China stock market turbulence in 2015–2016, Brexit in 2016 and the COVID-19 pandemic in 2020–2021 have caused important losses for investors; consequently, measuring risk properly has become a crucial task for researchers and practitioners in finance. Nowadays, two of the most widely used downside risk measures are the Value-at-Risk (VaR) and the Expected Shortfall (ES). Indeed, the Basel Committee on Banking Supervision (Basel III) requires financial institutions to meet capital requirements based on VaR and ES estimates. If these are not properly estimated, the result may be a sub-optimal capital allocation, affecting the profitability or stability of financial institutions. The VaR is a quantile of the returns distribution, and the ES is the expected return conditional on returns being lower than that quantile. These two risk measures have received some criticism: desirable properties of a risk measure (among them, the coherence property) are formalized by [1] in a set of axioms, and VaR has been found not to be coherent because it does not satisfy the subadditivity property and, as a consequence, it might fail to appropriately account for risk concentrations, and diversification strategies might be negatively affected; VaR has also been criticized for not providing information beyond the quantile; regarding ES, it does not verify the elicitability condition, which means that there could be an issue with direct backtesting of ES estimates (see [2]), although some feasible approaches to backtesting ES have recently been proposed (see, among others, [3], using a linear approximation of ES based on VaRs at different coverage levels, and [4], using cumulative violations). In spite of this criticism, VaR and ES are very informative about the tail shape of the asset returns distribution, and regulatory requirements have focused on them.
Both VaR and ES estimates are usually obtained by first estimating, with maximum likelihood (ML) techniques, the parameters of a returns model with an ARMA structure for the conditional mean of returns and a GARCH-type model for their conditional variance, assuming a particular distribution for the standardized errors, henceforth $Z_t$. The standard normal distribution has been the most frequent assumption for $Z_t$, but it is often rejected in the empirical finance literature when modeling stock returns and other financial assets, even after controlling for volatility clustering effects. As a consequence, recent research has focused on models that use generalizations of the normal distribution that are flexible enough to handle asymmetry and heavy tails, but that are also analytically tractable. Here we focus on two approaches that have been widely used in recent years to obtain generalizations of the normal distribution that might lead to more accurate daily estimates of VaR and ES: (i) polynomial expansions, which make it easy to obtain more flexible models than the normal; (ii) parametric densities that encompass the normal. Our aim in this paper is to compare the performance of these two approaches through the implementation of backtesting methods and loss functions. In principle, one could think that the flexibility inherent to polynomial expansions should lead to better estimates of VaR and ES, if the sample size is large enough. However, the parametric densities have been introduced specifically to address the departures from normality that are typical in financial data and, thus, they might yield better results than the polynomial expansions with the sample sizes that are usual in econometric practice with financial series.
This is precisely the kind of comparison that we are interested in; and, in order to perform a fair comparison between the two approaches, the order of the polynomial expansions that we will consider will lead to distributions with two parameters, and the parametric distributions that we will consider will have two shape parameters.
The polynomial expansions that we use here are the Cornish-Fisher (CF) and the Gram-Charlier (GC) expansions. The two expansions differ in their nature: CF is a transformation of the normal quantile function (see [5]), whereas GC is a transformation of the normal probability density function. The CF distribution that we consider here is obtained using a fourth-order expansion. In this way, a two-parameter distribution is obtained, and there is a simple relation between the two parameters and the skewness and kurtosis; this tractability comes from the bijective property of the CF polynomial expansion, once the domain of variation is appropriately restricted (monotonicity condition), see [6]; if the monotonicity condition does not hold, then the ‘increasing rearrangement’ procedure of [7] can be used to restore monotonicity (an empirical application of this methodology can be seen in [8]). The GC expansions (see, e.g., [9]) are very flexible and analytically tractable, since many densities can be expressed as the product of the standard normal probability density function and an infinite series of Hermite polynomials. In practice, truncated GC expansions (also known as Edgeworth-Sargan densities) have to be used (see [10] for the analysis of these densities with financial data), and this often implies negative densities over some interval of their domain. Several solutions have been proposed to handle this problem: (i) restricting the parameter space and then estimating the parameters by constrained ML, see [11]; (ii) using certain transformations in order to guarantee positivity of the expansion, see [12,13]; (iii) considering semi-nonparametric densities, which are always positive by construction, so that the parameters can be estimated by ML instead of constrained ML, see [14,15,16]. In this work we use the third approach, and consider the standardized semi-nonparametric distribution of fourth order, which is a two-parameter distribution.
The parametric densities that we use here are two well-known two-parameter densities: the skewed-t density, introduced by [17] in financial econometrics; and one of the densities proposed in Johnson [18], which was analyzed by [19,20] in the context of modeling financial returns. The skewed-t density exploits the leptokurtosis property of the Student-t distribution, but also allows for asymmetries thanks to the introduction of an additional parameter. The idea behind Johnson’s approach to derive densities that encompass the normal is to consider random variables obtained by transforming a normal random variable with the inverse of a non-decreasing continuous function $g$; specifically, three different $g$ functions have been proposed, leading to lognormal, unbounded and bounded distributions, although here we will consider exclusively the unbounded one, that is, $g(x) = \sinh^{-1}(x)$.
The accuracy of VaR and ES estimates under the two approaches (polynomial expansions or parametric densities) is measured using backtesting techniques with rolling-window data. For backtesting VaR, we implement the traditional tests, i.e., the unconditional coverage test of [21] and the independence test of [22]. For backtesting ES, we use the robust unconditional and conditional coverage tests of [4], which focus on cumulative violations. Furthermore, different loss functions (see, e.g., [23]) are used in order to rank models that are not rejected with backtesting procedures.
The remainder of the paper is structured as follows. Section 2 describes the statistical specifications that are used for modeling returns, and how inference on VaR and ES is made with them. Section 3 describes the procedures that are used to assess the performance of the statistical models, namely, backtesting procedures for VaR and ES and loss functions. Section 4 describes the data and applies the different methods to study the performance through backtesting and loss functions. Section 5 concludes the paper with a summary of the main findings and contributions.

2. Statistical Models and Inference

2.1. General Framework

We assume that daily stock returns, $R_t$, satisfy
$$R_t = \mu + \sigma_t Z_t, \tag{1}$$
where $\{Z_t\}$ are independent and identically distributed (iid) zero-mean unit-variance random variables, $\mu$ is a parameter and $\{\sigma_t\}$ follows a non-linear asymmetric GARCH (NGARCH) model, see e.g., [24], given by
$$\sigma_t^2 = b_0 + b_1 \sigma_{t-1}^2 + b_2 \left( R_{t-1} - \mu - c\,\sigma_{t-1} \right)^2, \tag{2}$$
where $b_0$, $b_1$ and $b_2$ are positive parameters, and $c$ is an additional parameter that captures the leverage effect. Note that $c > 0$ implies that returns exhibit a leverage effect, i.e., volatility increases more after negative shocks than after positive shocks of the same magnitude, whereas $c < 0$ means that there is an inverse leverage effect, i.e., volatility increases more after positive shocks than after negative shocks of the same magnitude. The NGARCH reduces to the GARCH when $c = 0$. In this paper we assume that $Z_t = v_Y^{-1/2} (Y_t - m_Y)$, where $Y_t$ is a random variable that is defined by means of a polynomial expansion or a parametric model, and we denote $m_Y \equiv E(Y_t)$ and $v_Y \equiv \mathrm{Var}(Y_t)$. The aim of our research is to compare the performance of densities based on polynomial expansions and densities based on parametric models; in order to make a fair comparison, both the polynomial expansions and the parametric models that we consider will be characterized by two parameters $\psi_1, \psi_2$ that determine the skewness and the kurtosis of $Y_t$ in each case. Thus, hereafter the probability density function (pdf) and the cumulative distribution function (cdf) of $Y_t$ are denoted as $f_Y(\cdot \mid \psi_1, \psi_2)$ and $F_Y(\cdot \mid \psi_1, \psi_2)$, respectively.
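Our analysis will focus on the performance of the models when estimating the value at risk (VaR) and the expected shortfall (ES) of returns. Given $\alpha \in (0, 1)$, the $\alpha$-VaR and the $\alpha$-ES of $Y_t$ will be denoted as $\mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2)$ and $\mathrm{ES}_Y(\alpha \mid \psi_1, \psi_2)$, respectively, that is, $\mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) = F_Y^{-1}(\alpha \mid \psi_1, \psi_2)$ and $\mathrm{ES}_Y(\alpha \mid \psi_1, \psi_2) = E\left( Y_t \mid Y_t \le \mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) \right)$. Therefore, as $Y_t = m_Y + v_Y^{1/2} \sigma_t^{-1} (R_t - \mu)$, it follows that the cdf, the pdf, the $\alpha$-VaR and the $\alpha$-ES of $R_t$ conditional on the information up to period $t-1$, denoted $\mathcal{F}_{t-1}$, are:
$$F_t(r \mid \mathcal{F}_{t-1}) = F_Y\left( m_Y + v_Y^{1/2} \sigma_t^{-1} (r - \mu) \mid \psi_1, \psi_2 \right), \tag{3}$$
$$f_t(r \mid \mathcal{F}_{t-1}) = v_Y^{1/2} \sigma_t^{-1} f_Y\left( m_Y + v_Y^{1/2} \sigma_t^{-1} (r - \mu) \mid \psi_1, \psi_2 \right), \tag{4}$$
$$\mathrm{VaR}_t(\alpha \mid \mathcal{F}_{t-1}) = \mu + \sigma_t v_Y^{-1/2} \left( \mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) - m_Y \right), \tag{5}$$
$$\mathrm{ES}_t(\alpha \mid \mathcal{F}_{t-1}) = \mu + \sigma_t v_Y^{-1/2} \left( \mathrm{ES}_Y(\alpha \mid \psi_1, \psi_2) - m_Y \right). \tag{6}$$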
Note that these expressions depend on seven parameters: the mean parameter $\mu$, the four parameters of the NGARCH model $b_0, b_1, b_2, c$, and the two shape parameters $\psi_1, \psi_2$. These seven parameters can be estimated by Maximum Likelihood (ML), once the pdf $f_Y(\cdot \mid \psi_1, \psi_2)$ is specified, and the corresponding mean and variance ($m_Y$, $v_Y$) are found. Specifically, the estimates are found by maximizing
$$\ln L(\mu, b_0, b_1, b_2, c, \psi_1, \psi_2) = \sum_{t=1}^{T} \left\{ \frac{1}{2} \ln v_Y - \frac{1}{2} \ln \sigma_t^2 + \ln f_Y\left( m_Y + v_Y^{1/2} \sigma_t^{-1} (R_t - \mu) \mid \psi_1, \psi_2 \right) \right\},$$
where $\sigma_t$ is given in (2). The optimization problem incorporates non-negativity restrictions on $b_0$, $b_1$ and $b_2$, the second-order stationarity restriction $b_2 (1 + c^2) + b_1 < 1$, and the restrictions on $\psi_1$ and $\psi_2$ that are specified below in each case. Once the parameters of the model are estimated, $\mathrm{VaR}_t(\alpha \mid \mathcal{F}_{t-1})$ and $\mathrm{ES}_t(\alpha \mid \mathcal{F}_{t-1})$ are estimated using the expressions given in (5) and (6), but replacing the unknown parameters by their ML estimates. In the next two subsections we specify the distributions for $Y_t$ that we consider.
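The variance recursion (2) and the rescaling in (5) and (6) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `ngarch_variances` and `scale_risk_measure` are hypothetical, and starting the recursion at the unconditional variance $b_0 / (1 - b_1 - b_2(1 + c^2))$ is an assumption, since the text does not state how $\sigma_1^2$ is initialized.

```python
import math


def ngarch_variances(returns, mu, b0, b1, b2, c, sigma2_init=None):
    """Return [sigma_1^2, ..., sigma_{T+1}^2] from the NGARCH recursion (2)."""
    persistence = b1 + b2 * (1.0 + c * c)
    if not persistence < 1.0:
        raise ValueError("stationarity requires b2*(1+c^2) + b1 < 1")
    # Assumption: initialize at the unconditional variance of the process.
    sigma2 = b0 / (1.0 - persistence) if sigma2_init is None else sigma2_init
    path = [sigma2]
    for r in returns:
        sigma = math.sqrt(sigma2)
        # sigma_t^2 = b0 + b1*sigma_{t-1}^2 + b2*(R_{t-1} - mu - c*sigma_{t-1})^2
        sigma2 = b0 + b1 * sigma2 + b2 * (r - mu - c * sigma) ** 2
        path.append(sigma2)
    return path


def scale_risk_measure(risk_y, mu, sigma_t, m_y, v_y):
    """Map VaR_Y (or ES_Y) of Y into VaR_t (or ES_t) of R_t, as in (5)-(6)."""
    return mu + sigma_t * (risk_y - m_y) / math.sqrt(v_y)
```

With $c > 0$, a negative return of a given size raises next-period variance more than a positive return of the same size, which is the leverage effect described above.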

2.2. Distributions Based on Polynomial Expansions

2.2.1. Cornish-Fisher Expansion

We follow here the specification of the Cornish-Fisher expansion described in [25]. A random variable $Y$ follows a two-parameter Cornish-Fisher distribution (a CF distribution, for short) with parameters $\psi_1$ and $\psi_2$ if
$$Y = a_0(\psi_1, \psi_2) + a_1(\psi_1, \psi_2) W + a_2(\psi_1, \psi_2) W^2 + a_3(\psi_1, \psi_2) W^3, \tag{7}$$
where $W$ is a standard normal random variable, $\psi_1, \psi_2$ are parameters such that
$$|\psi_1| < 6(\sqrt{2} - 1) \approx 2.4853$$
and
$$\psi_2 \in \left[ \frac{36 + 11\psi_1^2 - \sqrt{\psi_1^4 - 216\psi_1^2 + 1296}}{9}, \; \frac{36 + 11\psi_1^2 + \sqrt{\psi_1^4 - 216\psi_1^2 + 1296}}{9} \right],$$
and we denote $a_0(\psi_1, \psi_2) = -\psi_1/6$, $a_1(\psi_1, \psi_2) = 1 - \psi_2/8 + 5\psi_1^2/36$, $a_2(\psi_1, \psi_2) = \psi_1/6$ and $a_3(\psi_1, \psi_2) = \psi_2/24 - \psi_1^2/18$. For simplicity, hereafter we write $a_j$ instead of $a_j(\psi_1, \psi_2)$. Note that the conditions on $\psi_1$ and $\psi_2$ are introduced to guarantee that the transformation from $W$ to $Y$ is one-to-one. As a consequence, if we denote $h(W) = a_0 + a_1 W + a_2 W^2 + a_3 W^3$, the relationship (7) implies that the cdf of $Y$ is $F_Y(y \mid \psi_1, \psi_2) = \Phi(h^{-1}(y))$. The corresponding pdf is
$$f_Y(y \mid \psi_1, \psi_2) = \frac{\phi\left( d(y, \psi_1, \psi_2) \right)}{\left( \frac{\psi_2}{8} - \frac{\psi_1^2}{6} \right) d(y, \psi_1, \psi_2)^2 + \frac{\psi_1}{3} d(y, \psi_1, \psi_2) + 1 - \frac{\psi_2}{8} + \frac{5\psi_1^2}{36}},$$
where $\phi(\cdot)$ denotes the standard normal pdf,
$$d(y, \psi_1, \psi_2) \equiv -\frac{a_2}{3 a_3} + \sqrt[3]{-\frac{q(y, \psi_1, \psi_2)}{2} + \sqrt{\Delta(y, \psi_1, \psi_2)}} + \sqrt[3]{-\frac{q(y, \psi_1, \psi_2)}{2} - \sqrt{\Delta(y, \psi_1, \psi_2)}},$$
$$q(y, \psi_1, \psi_2) \equiv \frac{2 a_2^3 - 9 a_1 a_2 a_3 + 27 a_3^2 (a_0 - y)}{27 a_3^3}, \quad \text{and}$$
$$\Delta(y, \psi_1, \psi_2) \equiv \frac{q(y, \psi_1, \psi_2)^2}{4} + \frac{(3 a_1 a_3 - a_2^2)^3}{729\, a_3^6}.$$
It also follows from (7) that $m_Y = 0$ and
$$v_Y = 1 + \frac{1}{96} \psi_2^2 + \frac{25}{1296} \psi_1^4 - \frac{1}{36} \psi_2 \psi_1^2.$$
Additionally, the $\alpha$-VaR and the $\alpha$-ES of $Y$ are:
$$\mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) = a_0 + a_1 \Phi^{-1}(\alpha) + a_2 \Phi^{-1}(\alpha)^2 + a_3 \Phi^{-1}(\alpha)^3,$$
$$\mathrm{ES}_Y(\alpha \mid \psi_1, \psi_2) = a_0 - \frac{a_1}{\alpha} \phi(\Phi^{-1}(\alpha)) + \frac{a_2}{\alpha} \left( \alpha - \Phi^{-1}(\alpha) \phi(\Phi^{-1}(\alpha)) \right) - \frac{a_3}{\alpha} \left( 2 + \Phi^{-1}(\alpha)^2 \right) \phi(\Phi^{-1}(\alpha)).$$
The latter expression follows because $E[Y \mid Y \le \mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2)] = E[h(W) \mid h(W) \le h(\Phi^{-1}(\alpha))] = E[a_0 + a_1 W + a_2 W^2 + a_3 W^3 \mid W \le \Phi^{-1}(\alpha)]$, and the truncated moments of a standard normal distribution are $E[W \mid W \le a] = -\phi(a)/\Phi(a)$, $E[W^2 \mid W \le a] = 1 - a\,\phi(a)/\Phi(a)$, and $E[W^3 \mid W \le a] = -(2 + a^2)\,\phi(a)/\Phi(a)$ (see e.g., [26]).
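Because both risk measures are available in closed form for the CF distribution, they are straightforward to compute. The following is a minimal sketch using only Python's standard library; the function names (`cf_coefficients`, `cf_variance`, `cf_var`, `cf_es`) are hypothetical, and the code assumes that $\psi_1, \psi_2$ already satisfy the monotonicity restrictions stated above (it does not check them).

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal: pdf, cdf and inverse cdf


def cf_coefficients(psi1, psi2):
    """Coefficients a_0, ..., a_3 of the cubic transformation in (7)."""
    a0 = -psi1 / 6.0
    a1 = 1.0 - psi2 / 8.0 + 5.0 * psi1 ** 2 / 36.0
    a2 = psi1 / 6.0
    a3 = psi2 / 24.0 - psi1 ** 2 / 18.0
    return a0, a1, a2, a3


def cf_variance(psi1, psi2):
    """Variance v_Y of the CF distribution (its mean m_Y is zero)."""
    return (1.0 + psi2 ** 2 / 96.0 + 25.0 * psi1 ** 4 / 1296.0
            - psi2 * psi1 ** 2 / 36.0)


def cf_var(alpha, psi1, psi2):
    """alpha-VaR of Y: the cubic evaluated at the normal quantile."""
    a0, a1, a2, a3 = cf_coefficients(psi1, psi2)
    z = _N.inv_cdf(alpha)
    return a0 + a1 * z + a2 * z ** 2 + a3 * z ** 3


def cf_es(alpha, psi1, psi2):
    """alpha-ES of Y from the truncated normal moments."""
    a0, a1, a2, a3 = cf_coefficients(psi1, psi2)
    z = _N.inv_cdf(alpha)
    phi = _N.pdf(z)
    return (a0 - a1 * phi / alpha
            + a2 * (alpha - z * phi) / alpha
            - a3 * (2.0 + z ** 2) * phi / alpha)
```

Setting $\psi_1 = \psi_2 = 0$ recovers the standard normal case, which gives a convenient sanity check.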

2.2.2. Gram-Charlier Expansion

We follow here the specification of the Gram-Charlier expansion described in [27]. A random variable $Y$ follows a two-parameter Gram-Charlier distribution (a GC distribution, for short) with parameters $\psi_1$ and $\psi_2$ if its pdf is
$$f_Y(y \mid \psi_1, \psi_2) = \frac{\phi(y)}{v'v} \left( 1 + \psi_1 H_1(y) + \psi_2 H_2(y) \right)^2, \tag{8}$$
where we denote $v = (1, \psi_1, \psi_2)'$, and $\{H_k(\cdot)\}_{k \ge 0}$ are the normalized Hermite polynomials, defined recursively as follows: $H_0(y) = 1$, $H_1(y) = y$ and, for $k \ge 2$,
$$H_k(y) = \frac{y H_{k-1}(y) - \sqrt{k-1}\, H_{k-2}(y)}{\sqrt{k}}.$$
The corresponding cdf of $Y$ is given by
$$F_Y(y \mid \psi_1, \psi_2) = \Phi(y) - \phi(y) \sum_{k=1}^{4} \frac{\gamma_k(\psi_1, \psi_2)}{\sqrt{k}} H_{k-1}(y), \tag{9}$$
where we denote $\gamma_1(\psi_1, \psi_2) = 2\psi_1 (1 + \sqrt{2}\psi_2) / v'v$, $\gamma_2(\psi_1, \psi_2) = \sqrt{2} (\psi_1^2 + 2\psi_2^2 + \sqrt{2}\psi_2) / v'v$, $\gamma_3(\psi_1, \psi_2) = 2\sqrt{3}\,\psi_1 \psi_2 / v'v$ and $\gamma_4(\psi_1, \psi_2) = \sqrt{6}\,\psi_2^2 / v'v$. For simplicity, hereafter we write $\gamma_j$ instead of $\gamma_j(\psi_1, \psi_2)$. It also follows from (8) that $m_Y = \gamma_1$ and $v_Y = \sqrt{2}\gamma_2 + 1 - \gamma_1^2$. The $\alpha$-VaR of $Y$ is
$$\mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) = F_Y^{-1}(\alpha \mid \psi_1, \psi_2),$$
where $F_Y(\cdot)$ is given in (9); there is no closed-form expression for $F_Y^{-1}(\cdot)$, but it can be computed numerically using the procedure described in [28]. Finally, the $\alpha$-ES of $Y$ is
$$\mathrm{ES}_Y(\alpha \mid \psi_1, \psi_2) = \frac{1}{\alpha} \Phi\left( \mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) \right) \sum_{k=1}^{5} \eta_k(\psi_1, \psi_2)\, p_k\left( \mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) \right),$$
where $\eta_1(\psi_1, \psi_2) = 1 - \gamma_2/\sqrt{2} + 3\gamma_4/\sqrt{24}$, $\eta_2(\psi_1, \psi_2) = \gamma_1 - 3\gamma_3/\sqrt{6}$, $\eta_3(\psi_1, \psi_2) = \gamma_2/\sqrt{2} - 6\gamma_4/\sqrt{24}$, $\eta_4(\psi_1, \psi_2) = \gamma_3/\sqrt{6}$, $\eta_5(\psi_1, \psi_2) = \gamma_4/\sqrt{24}$, $p_1(y) = -\phi(y)/\Phi(y)$, $p_2(y) = 1 - y\,\phi(y)/\Phi(y)$, and $p_k(y) = (k-1)\, p_{k-2}(y) - y^{k-1} \phi(y)/\Phi(y)$ for $k \ge 3$.
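Since the GC quantile has no closed form, a numerical inversion of the cdf (9) is needed before the ES can be evaluated. The sketch below is illustrative only: it inverts the cdf by plain bisection, which is an assumption made here for simplicity and is not the procedure of [28]; all function names are hypothetical.

```python
import math
from statistics import NormalDist

_N = NormalDist()


def gc_gammas(psi1, psi2):
    """Coefficients gamma_1, ..., gamma_4 of the GC distribution."""
    vv = 1.0 + psi1 ** 2 + psi2 ** 2  # v'v
    r2 = math.sqrt(2.0)
    g1 = 2.0 * psi1 * (1.0 + r2 * psi2) / vv
    g2 = r2 * (psi1 ** 2 + 2.0 * psi2 ** 2 + r2 * psi2) / vv
    g3 = 2.0 * math.sqrt(3.0) * psi1 * psi2 / vv
    g4 = math.sqrt(6.0) * psi2 ** 2 / vv
    return g1, g2, g3, g4


def hermite(k, y):
    """Normalized Hermite polynomial H_k(y) via the recursion in the text."""
    if k == 0:
        return 1.0
    h_prev, h = 1.0, y
    for j in range(2, k + 1):
        h_prev, h = h, (y * h - math.sqrt(j - 1) * h_prev) / math.sqrt(j)
    return h


def gc_cdf(y, psi1, psi2):
    """cdf (9) of the GC distribution."""
    g = gc_gammas(psi1, psi2)
    s = sum(g[k - 1] / math.sqrt(k) * hermite(k - 1, y) for k in range(1, 5))
    return _N.cdf(y) - _N.pdf(y) * s


def gc_var(alpha, psi1, psi2, lo=-40.0, hi=40.0):
    """alpha-VaR by bisection on the (nondecreasing) cdf; an assumed method."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gc_cdf(mid, psi1, psi2) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gc_es(alpha, psi1, psi2):
    """alpha-ES from the eta_k weights and truncated normal moments p_k."""
    g1, g2, g3, g4 = gc_gammas(psi1, psi2)
    eta = [1.0 - g2 / math.sqrt(2.0) + 3.0 * g4 / math.sqrt(24.0),
           g1 - 3.0 * g3 / math.sqrt(6.0),
           g2 / math.sqrt(2.0) - 6.0 * g4 / math.sqrt(24.0),
           g3 / math.sqrt(6.0),
           g4 / math.sqrt(24.0)]
    q = gc_var(alpha, psi1, psi2)
    ratio = _N.pdf(q) / _N.cdf(q)
    p = [-ratio, 1.0 - q * ratio]  # p_1 and p_2
    for k in range(3, 6):          # p_3, p_4, p_5 by the recursion
        p.append((k - 1) * p[k - 3] - q ** (k - 1) * ratio)
    return _N.cdf(q) / alpha * sum(e * pk for e, pk in zip(eta, p))
```

As with the CF case, $\psi_1 = \psi_2 = 0$ reduces everything to the standard normal, so the usual normal VaR and ES should be recovered.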

2.3. Parametric Distributions

2.3.1. Skewed-t Distribution

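There are several equivalent specifications of a skewed-t distribution; here we use the parameterization of [29]. A random variable $Y$ follows a skewed-t distribution with parameters $\psi_1$ (skewness parameter) and $\psi_2$ (degrees of freedom), where $\psi_1 \in (0, 1)$ and $\psi_2 > 0$, if its pdf is
$$f_Y(y \mid \psi_1, \psi_2) = \begin{cases} f_{\psi_2}\left( \dfrac{y}{2\psi_1} \right) & \text{if } y \le 0, \\ f_{\psi_2}\left( \dfrac{y}{2(1-\psi_1)} \right) & \text{if } y > 0, \end{cases}$$
where $f_{\psi_2}(\cdot)$ denotes the pdf of the Student-t distribution with $\psi_2$ degrees of freedom. The corresponding cdf is
$$F_Y(y \mid \psi_1, \psi_2) = \begin{cases} 2\psi_1 F_{\psi_2}\left( \dfrac{y}{2\psi_1} \right) & \text{if } y \le 0, \\ 2(1-\psi_1) F_{\psi_2}\left( \dfrac{y}{2(1-\psi_1)} \right) + (2\psi_1 - 1) & \text{if } y > 0, \end{cases}$$
where $F_{\psi_2}(\cdot)$ denotes the cdf of the Student-t distribution with $\psi_2$ degrees of freedom. When $\psi_2 > 1$ the mean of $Y$ is
$$m_Y = \frac{2\sqrt{\psi_2}\,(1 - 2\psi_1)\,\Gamma\left( \frac{\psi_2 - 1}{2} \right)}{\sqrt{\pi}\,\Gamma\left( \frac{\psi_2}{2} \right)}.$$
Additionally, if $\psi_2 > 2$ the variance of $Y$ is
$$v_Y = \frac{4\psi_2 (1 - 3\psi_1 + 3\psi_1^2)}{\psi_2 - 2} - m_Y^2.$$
Note that the Student-t distribution is nested in this distribution when $\psi_1 = 1/2$. Also note that the pdf of $(Y - m_Y)/v_Y^{1/2}$ is the same as the pdf in Equation (10) of [17], after reparameterization. The $\alpha$-VaR of $Y$ is
$$\mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) = \begin{cases} 2\psi_1 F_{\psi_2}^{-1}\left( \dfrac{\alpha}{2\psi_1} \right) & \text{if } \alpha \le \psi_1, \\ -2(1-\psi_1) F_{\psi_2}^{-1}\left( \dfrac{1-\alpha}{2(1-\psi_1)} \right) & \text{if } \alpha > \psi_1. \end{cases}$$
Note that, by the symmetry of $f_{\psi_2}(\cdot)$, it follows that $F_{\psi_2}^{-1}(\alpha) = -F_{\psi_2}^{-1}(1-\alpha)$ for $\alpha \in (0, 1)$. Also note that there is no closed-form expression for $F_{\psi_2}^{-1}(\cdot)$, but most programming languages include this function. Finally, the $\alpha$-ES of $Y$ is
$$\mathrm{ES}_Y(\alpha \mid \psi_1, \psi_2) = \begin{cases} -\dfrac{4\psi_2}{\alpha(\psi_2 - 1)}\, \psi_1^2\, f_{\psi_2}\left( F_{\psi_2}^{-1}\left( \dfrac{\alpha}{2\psi_1} \right) \right) \left( 1 + \dfrac{F_{\psi_2}^{-1}\left( \frac{\alpha}{2\psi_1} \right)^2}{\psi_2} \right), & \text{if } \alpha \le \psi_1, \\ -\dfrac{4\psi_2 (1-\psi_1)^2}{\alpha(\psi_2 - 1)} \left\{ f_{\psi_2}\left( F_{\psi_2}^{-1}\left( \dfrac{1-\alpha}{2(1-\psi_1)} \right) \right) \left( 1 + \dfrac{F_{\psi_2}^{-1}\left( \frac{1-\alpha}{2(1-\psi_1)} \right)^2}{\psi_2} \right) - \dfrac{1 - 2\psi_1}{(1-\psi_1)^2} f_{\psi_2}(0) \right\}, & \text{if } \alpha > \psi_1. \end{cases}$$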

2.3.2. Johnson Distribution

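We follow here the parameterization of the Johnson distribution described in [30]. A random variable $Y$ follows a Johnson distribution with shape parameters $\psi_1$ and $\psi_2$, where $-\infty < \psi_1 < \infty$ and $\psi_2 > 0$, if
$$Y = \sinh\left( \frac{W - \psi_1}{\psi_2} \right),$$
where $W$ is a standard normal random variable, and $\sinh(x) = (\exp(x) - \exp(-x))/2$. The cdf of a random variable with Johnson distribution is
$$F_Y(y \mid \psi_1, \psi_2) = \Phi\left( \psi_1 + \psi_2 \sinh^{-1}(y) \right),$$
where $\sinh^{-1}(y) = \ln\left( y + \sqrt{1 + y^2} \right)$. The corresponding pdf is
$$f_Y(y \mid \psi_1, \psi_2) = \frac{\psi_2}{\sqrt{1 + y^2}}\, \phi\left( \psi_1 + \psi_2 \sinh^{-1}(y) \right).$$
The mean of $Y$ is $m_Y = -\exp(\psi_2^{-2}/2) \sinh(\psi_1/\psi_2)$, and its variance is
$$v_Y = \frac{1}{2} \left( \exp(\psi_2^{-2}) - 1 \right) \left( \exp(\psi_2^{-2}) \cosh\left( \frac{2\psi_1}{\psi_2} \right) + 1 \right),$$
where $\cosh(x) = (\exp(x) + \exp(-x))/2$. Finally, the $\alpha$-VaR and the $\alpha$-ES of $Y$ are
$$\mathrm{VaR}_Y(\alpha \mid \psi_1, \psi_2) = \sinh\left( \frac{\Phi^{-1}(\alpha) - \psi_1}{\psi_2} \right),$$
$$\mathrm{ES}_Y(\alpha \mid \psi_1, \psi_2) = \frac{1}{2\alpha} \exp\left( \frac{\psi_2^{-2}}{2} \right) \left[ \exp\left( -\frac{\psi_1}{\psi_2} \right) \Phi\left( \Phi^{-1}(\alpha) - \frac{1}{\psi_2} \right) - \exp\left( \frac{\psi_1}{\psi_2} \right) \Phi\left( \Phi^{-1}(\alpha) + \frac{1}{\psi_2} \right) \right].$$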
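The Johnson formulas above are all explicit, so they translate directly into code. The following sketch uses only the Python standard library; the function names are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist

_N = NormalDist()


def johnson_mean_var(psi1, psi2):
    """Mean m_Y and variance v_Y of the Johnson (unbounded) distribution."""
    w = math.exp(psi2 ** -2)
    m_y = -math.sqrt(w) * math.sinh(psi1 / psi2)
    v_y = 0.5 * (w - 1.0) * (w * math.cosh(2.0 * psi1 / psi2) + 1.0)
    return m_y, v_y


def johnson_var(alpha, psi1, psi2):
    """alpha-VaR of Y: sinh transform of the shifted normal quantile."""
    return math.sinh((_N.inv_cdf(alpha) - psi1) / psi2)


def johnson_es(alpha, psi1, psi2):
    """alpha-ES of Y in closed form."""
    z = _N.inv_cdf(alpha)
    lead = math.exp(0.5 * psi2 ** -2) / (2.0 * alpha)
    return lead * (math.exp(-psi1 / psi2) * _N.cdf(z - 1.0 / psi2)
                   - math.exp(psi1 / psi2) * _N.cdf(z + 1.0 / psi2))
```

One simple consistency check is that the ES approaches the mean $m_Y$ as $\alpha$ approaches one, because the conditioning event then covers almost the whole distribution.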

3. Assessing the Performance of VaR and ES Estimates

3.1. Backtesting VaR and ES

To evaluate one-day-ahead VaR/ES forecasts, first we use backtesting, which is a formal statistical framework that consists of checking whether actual losses are in line with predicted losses (see [31]). We perform this analysis using a one-day rolling-window methodology. Specifically, given a sample of $T$ daily returns, we divide the sample into two parts: the in-sample period with the first $T_0$ observations, and the out-of-sample period with the remaining $T - T_0$ observations. In our first estimation we use the data from day 1 to day $T_0$ to estimate the model by ML, and then derive the estimates of $\mathrm{VaR}_{T_0+1}(\alpha \mid \mathcal{F}_{T_0})$ and $\mathrm{ES}_{T_0+1}(\alpha \mid \mathcal{F}_{T_0})$ as described in the previous section. In our second estimation we exclude observation 1 from the estimation sample, and incorporate the observation of day $T_0 + 1$; thus, we proceed to estimate the model by ML using the data from day 2 to day $T_0 + 1$, and then derive the estimates of $\mathrm{VaR}_{T_0+2}(\alpha \mid \mathcal{F}_{T_0+1})$ and $\mathrm{ES}_{T_0+2}(\alpha \mid \mathcal{F}_{T_0+1})$ as above. We continue in this way until we obtain VaR and ES estimates for the whole out-of-sample period, i.e., $\{\widehat{\mathrm{VaR}}_t(\alpha \mid \mathcal{F}_{t-1})\}_{t=T_0+1}^{T}$ and $\{\widehat{\mathrm{ES}}_t(\alpha \mid \mathcal{F}_{t-1})\}_{t=T_0+1}^{T}$. The accuracy of these sequences of estimates is then analyzed by comparing them with the sequence of actual out-of-sample returns $\{R_t\}_{t=T_0+1}^{T}$, using the procedures that are described below.
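The rolling-window scheme amounts to a simple index pattern, sketched below. The helper name `rolling_windows` is hypothetical, and the indices are 0-based (so day 1 of the text is index 0); the model estimation itself is left out.

```python
def rolling_windows(T, T0):
    """Yield (start, stop, target) index triples for a one-day rolling window.

    returns[start:stop] is the estimation sample of length T0, and
    returns[target] is the out-of-sample return that the one-day-ahead
    VaR and ES forecasts are compared with.
    """
    for k in range(T - T0):
        yield k, k + T0, k + T0
```

For example, with `T = 5` and `T0 = 3` there are two forecasts: the first uses observations 0 to 2 to forecast observation 3, and the second uses observations 1 to 3 to forecast observation 4.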
We use two procedures for backtesting VaR. First, we apply the unconditional coverage test of [21]. This test is based on the fact that, if the model is correctly specified, then the so-called hit variables $h_t \equiv I(R_t < \mathrm{VaR}_t(\alpha \mid \mathcal{F}_{t-1}))$, where $I(\cdot)$ is the indicator function, are independent Bernoulli random variables, and hence $\sum_{t=T_0+1}^{T} h_t$ follows a Binomial distribution $\mathrm{Bi}(T - T_0, \alpha)$. The test statistic is
$$LR_{UC} = -2 \ln \left( \frac{\alpha^{N} (1-\alpha)^{T - T_0 - N}}{\hat{\alpha}^{N} (1-\hat{\alpha})^{T - T_0 - N}} \right),$$
where $N \equiv \sum_{t=T_0+1}^{T} \hat{h}_t$, $\hat{\alpha} \equiv N/(T - T_0)$ and $\hat{h}_t \equiv I(R_t < \widehat{\mathrm{VaR}}_t(\alpha \mid \mathcal{F}_{t-1}))$. Under the null hypothesis that the model is correctly specified, the asymptotic distribution of $LR_{UC}$ is $\chi_1^2$; thus, the null hypothesis of correct specification is rejected at the 5% significance level if the sample value of $LR_{UC}$ is above $\chi_{1,0.05}^2 = 3.8415$. Second, we apply the independence test of [22]. This test is based on the fact that, if the model is correctly specified, then hits are independent over time; hence, if we denote $\alpha_{ij} \equiv P(h_{t+1} = j \mid h_t = i)$ for $i, j = 0, 1$, then $\alpha_{i1}$ must be equal to $\alpha$, both when $i = 0$ and $i = 1$, and $\alpha_{i0}$ must be equal to $1 - \alpha$, both when $i = 0$ and $i = 1$. The test statistic is
$$LR_{IND} = -2 \ln \left( \frac{\hat{\alpha}^{N_{01} + N_{11}} (1-\hat{\alpha})^{N_{00} + N_{10}}}{\hat{\alpha}_{01}^{N_{01}} (1-\hat{\alpha}_{01})^{N_{00}}\, \hat{\alpha}_{11}^{N_{11}} (1-\hat{\alpha}_{11})^{N_{10}}} \right),$$
where $N_{ij} \equiv \sum_{t=T_0+1}^{T-1} I(\hat{h}_{t+1} = j)\, I(\hat{h}_t = i)$, for $i, j = 0, 1$, $\hat{\alpha}_{01} \equiv N_{01}/(N_{00} + N_{01})$ and $\hat{\alpha}_{11} \equiv N_{11}/(N_{10} + N_{11})$. Under the null hypothesis that the model is correctly specified, the asymptotic distribution of $LR_{IND}$ is also $\chi_1^2$; thus, the null hypothesis of correct specification is rejected at the 5% significance level if the sample value of $LR_{IND}$ is above $\chi_{1,0.05}^2 = 3.8415$.
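Both likelihood-ratio statistics depend only on the sequence of estimated hits, so they can be computed with a few counters. The sketch below is an illustration under stated assumptions: the function names are hypothetical, `hits` is assumed to be a list of 0/1 values for the out-of-sample period, terms of the form $0 \cdot \ln 0$ are set to zero by convention, and the $\chi_1^2$ p-value uses the identity $P(\chi_1^2 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that the term vanishes when x == 0."""
    return 0.0 if x == 0 else x * math.log(y)


def chi2_1_pvalue(stat):
    """Upper-tail p-value of a chi-square(1) statistic."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))


def lr_uc(hits, alpha):
    """Unconditional coverage statistic LR_UC."""
    n, n_hits = len(hits), sum(hits)
    a_hat = n_hits / n
    ll_null = _xlogy(n_hits, alpha) + _xlogy(n - n_hits, 1.0 - alpha)
    ll_alt = _xlogy(n_hits, a_hat) + _xlogy(n - n_hits, 1.0 - a_hat)
    return -2.0 * (ll_null - ll_alt)


def lr_ind(hits):
    """Independence statistic LR_IND based on first-order transitions."""
    a_hat = sum(hits) / len(hits)
    counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for prev, nxt in zip(hits[:-1], hits[1:]):
        counts[(prev, nxt)] += 1
    n00, n01 = counts[(0, 0)], counts[(0, 1)]
    n10, n11 = counts[(1, 0)], counts[(1, 1)]
    a01 = n01 / (n00 + n01) if n00 + n01 else 0.0
    a11 = n11 / (n10 + n11) if n10 + n11 else 0.0
    ll_null = _xlogy(n01 + n11, a_hat) + _xlogy(n00 + n10, 1.0 - a_hat)
    ll_alt = (_xlogy(n01, a01) + _xlogy(n00, 1.0 - a01)
              + _xlogy(n11, a11) + _xlogy(n10, 1.0 - a11))
    return -2.0 * (ll_null - ll_alt)
```

A model would then count as successfully backtested for VaR when both `chi2_1_pvalue(lr_uc(hits, alpha))` and `chi2_1_pvalue(lr_ind(hits))` exceed 0.05.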
We also use two procedures for backtesting ES. First, we apply the robust unconditional test of [4]. The test statistic is
$$MU_{ES} = \frac{\sqrt{T - T_0}\, \left( \bar{H}(\alpha) - \alpha/2 \right)}{\sqrt{\alpha \left( \frac{1}{3} - \frac{\alpha}{4} \right) + \frac{T - T_0}{T_0} \hat{R}_{ES}' \hat{\Lambda} \hat{R}_{ES}}},$$
where $\bar{H}(\alpha) \equiv (T - T_0)^{-1} \sum_{t=T_0+1}^{T} \hat{H}_t(\alpha)$,
$$\hat{H}_t(\alpha) \equiv \frac{\hat{h}_t}{\alpha} \left\{ \alpha - F_Y\left( \hat{m}_Y + \hat{v}_Y^{1/2} \hat{\sigma}_t^{-1} (R_t - \hat{\mu}) \mid \hat{\psi}_1, \hat{\psi}_2 \right) \right\},$$
$\hat{\Lambda}$ is the $7 \times 7$ variance-covariance matrix of the estimates, and $\hat{R}_{ES}$ is a $7 \times 1$ matrix whose precise definition is given on page 955 of [4]. Under the null hypothesis that the model is correctly specified, the asymptotic distribution of $MU_{ES}$ is standard normal; thus, the null hypothesis of correct specification is rejected at the 5% significance level if the absolute value of the sample value of $MU_{ES}$ is above $z_{0.025} = 1.96$. Second, we apply the robust conditional test of [4]. For a given $m \in \mathbb{N}$, the test statistic is
$$MC_{ES}(m) = (T - T_0)\, \hat{\rho}^{(m)\prime}\, \hat{\Omega}^{-1} \hat{\rho}^{(m)},$$
where $\hat{\rho}^{(m)} = (\hat{\rho}_1, \ldots, \hat{\rho}_m)'$, $\hat{\rho}_j = \hat{\gamma}_j / \hat{\gamma}_0$ and, for $j = 0, 1, \ldots, m$,
$$\hat{\gamma}_j = \frac{1}{T - T_0 - j} \sum_{t=T_0+1}^{T-j} \left( \hat{H}_{t+j}(\alpha) - \alpha/2 \right) \left( \hat{H}_t(\alpha) - \alpha/2 \right),$$
$\hat{\Omega}$ is an $m \times m$ matrix whose $ij$th element is $\hat{\Omega}_{ij} = I(i = j) + ((T - T_0)/T_0)\, \hat{R}_i' \hat{\Lambda} \hat{R}_j$, and $\hat{R}_j$ is a $7 \times 1$ matrix whose precise definition is given on page 955 of [4]. Under the null hypothesis that the model is correctly specified, the asymptotic distribution of $MC_{ES}(m)$ is $\chi_m^2$; thus, the null hypothesis of correct specification is rejected at the 5% significance level if the sample value of $MC_{ES}(m)$ is above $\chi_{m,0.05}^2$.
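To make the structure of these statistics concrete, the sketch below computes simplified versions that omit the estimation-risk correction terms (those involving $\hat{R}_{ES}$, $\hat{R}_j$ and $\hat{\Lambda}$), whose definitions are given in [4] and not reproduced here. It is therefore not the robust test used in the paper: it corresponds to setting those terms to zero, so that $\hat{\Omega}$ is the identity matrix. The function names are hypothetical, and `pits` is assumed to hold the out-of-sample values $F_Y(\hat{m}_Y + \hat{v}_Y^{1/2} \hat{\sigma}_t^{-1} (R_t - \hat{\mu}) \mid \hat{\psi}_1, \hat{\psi}_2)$.

```python
import math


def cumulative_violations(pits, alpha):
    """H_t(alpha) = (alpha - u_t) / alpha when u_t < alpha, and 0 otherwise."""
    return [(alpha - u) / alpha if u < alpha else 0.0 for u in pits]


def unconditional_es_stat(pits, alpha):
    """Simplified MU_ES: the estimation-risk term is omitted (assumption)."""
    h = cumulative_violations(pits, alpha)
    n = len(h)
    h_bar = sum(h) / n
    scale = math.sqrt(alpha * (1.0 / 3.0 - alpha / 4.0))
    return math.sqrt(n) * (h_bar - alpha / 2.0) / scale


def conditional_es_stat(pits, alpha, m=1):
    """Simplified MC_ES(m): Omega is taken as the identity (assumption)."""
    h = [x - alpha / 2.0 for x in cumulative_violations(pits, alpha)]
    n = len(h)

    def gamma(j):
        # Sample autocovariance of the centered cumulative violations at lag j
        return sum(h[t + j] * h[t] for t in range(n - j)) / (n - j)

    g0 = gamma(0)
    return n * sum((gamma(j) / g0) ** 2 for j in range(1, m + 1))
```

Under correct specification the values in `pits` behave like independent uniform draws, in which case the mean of the cumulative violations is close to $\alpha/2$ and their autocorrelations are close to zero.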

3.2. Assessment of Performance with Loss Functions

The main goal of the backtesting procedures described in the previous subsection is to determine if a model provides reliable VaR and ES estimates. However, it is often the case that the null hypothesis of correct specification is not rejected for several models; hence, a criterion to select one model among those that have not been rejected with backtesting techniques is required. A frequently used criterion is based on the so-called loss functions (LF). This approach was introduced in [32], and its aim is to calculate the size of noncovered losses by measuring the distance between a risk measure for period $t$, say $\mathrm{RiskM}_t$, computed with the information available up to period $t-1$, and the observed return for that period, $R_t$. Following [23], the loss functions considered in this paper have the form
$$L(R_t, \mathrm{RiskM}_t) = \begin{cases} f(R_t, \mathrm{RiskM}_t) & \text{if } R_t < \mathrm{RiskM}_t, \\ 0 & \text{if } R_t \ge \mathrm{RiskM}_t. \end{cases} \tag{10}$$
Note that this type of loss function only pays attention to the magnitude of the hit when it actually occurs. In our empirical analysis we analyze four risk measures: $\mathrm{VaR}_t(0.05 \mid \mathcal{F}_{t-1})$, $\mathrm{VaR}_t(0.025 \mid \mathcal{F}_{t-1})$, $\mathrm{ES}_t(0.05 \mid \mathcal{F}_{t-1})$ and $\mathrm{ES}_t(0.025 \mid \mathcal{F}_{t-1})$. As regards the function $f$, we will use the following alternatives:
$$\begin{aligned} f_1(R_t, \mathrm{RiskM}_t) &= 1 + (R_t - \mathrm{RiskM}_t)^2, \\ f_2(R_t, \mathrm{RiskM}_t) &= (R_t - \mathrm{RiskM}_t)^2, \\ f_3(R_t, \mathrm{RiskM}_t) &= \left| 1 - \left| \frac{R_t}{\mathrm{RiskM}_t} \right| \right|, \\ f_4(R_t, \mathrm{RiskM}_t) &= \frac{\left( |R_t| - |\mathrm{RiskM}_t| \right)^2}{|\mathrm{RiskM}_t|}, \\ f_5(R_t, \mathrm{RiskM}_t) &= |R_t - \mathrm{RiskM}_t|. \end{aligned} \tag{11}$$
Function $f_1$ is used in [32], function $f_2$ is used in [33], and functions $f_3$, $f_4$, $f_5$ are used in [34]. Note that both $f_1$ and $f_2$ penalize large hits more than small ones, whereas the remaining functions stress the importance of uncovered losses while also taking into account their relative sizes.
For a given risk measure and a given loss function $f_j$, the overall performance of an estimate of the risk measure, say $\widehat{\mathrm{RiskM}}$, can be found by computing the average loss across the out-of-sample period, that is,
$$\bar{L}_j(\widehat{\mathrm{RiskM}}) = \frac{\sum_{t=T_0+1}^{T} L_j(R_t, \widehat{\mathrm{RiskM}}_t)}{T - T_0}, \tag{12}$$
where $L_j(R_t, \widehat{\mathrm{RiskM}}_t)$ denotes the loss that is found according to (10), when function $f_j$ is used as $f$. When comparing the performance of different estimates of the risk measure RiskM, the one that minimizes the average loss $\bar{L}_j(\widehat{\mathrm{RiskM}})$ should be preferred.
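The five loss functions and the average loss can be sketched as follows. The function names `loss` and `average_loss` are hypothetical; the forms of $f_3$, $f_4$ and $f_5$ use absolute values as in the loss functions of [34], which is an assumption about the intended formulas, and the risk measure is assumed to be nonzero so that the ratios are defined.

```python
def loss(r, risk, j):
    """Loss L_j for one period: zero unless the return falls below the risk measure."""
    if r >= risk:
        return 0.0
    if j == 1:
        return 1.0 + (r - risk) ** 2
    if j == 2:
        return (r - risk) ** 2
    if j == 3:
        return abs(1.0 - abs(r / risk))
    if j == 4:
        return (abs(r) - abs(risk)) ** 2 / abs(risk)
    if j == 5:
        return abs(r - risk)
    raise ValueError("j must be in 1..5")


def average_loss(returns, risks, j):
    """Average loss over the out-of-sample period for loss function j."""
    if len(returns) != len(risks):
        raise ValueError("returns and risks must have the same length")
    return sum(loss(r, m, j) for r, m in zip(returns, risks)) / len(returns)
```

Two competing estimates of the same risk measure can then be ranked by comparing `average_loss(returns, risks_a, j)` with `average_loss(returns, risks_b, j)` for each `j`.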

4. Empirical Results

We collect daily log-returns of several indexes, stocks and other assets, extracted from Datastream and Yahoo Finance. Log-returns are obtained by taking the logarithmic differences of closing daily prices, i.e., $R_t = (\ln(P_t) - \ln(P_{t-1})) \times 100$, where $P_t$ denotes the price of the asset at time $t$. The selection of series is made with the aim of encompassing several markets across the world and stocks of different countries and sectors. As described in Section 3.1, the full data period of size $T$ is divided into the in-sample period (first $T_0$ observations) and the out-of-sample period (remaining $T - T_0$ observations). The period and sample size vary across series; the number of observations in the in-sample period is $T_0 = 2000$ in all cases; thus, the out-of-sample period is approximately 60% of the total sample in most cases. Information about the data, including descriptive statistics, period and in-sample period, is reported in Table 1. Note that the end of the sample period is always December 2019, in order to exclude the COVID-19 period.
In order to examine how sensitive parameter estimates are to the distribution of the errors, first we report the estimates that are obtained with each distribution using the in-sample period of one of the series, namely the Euro Stoxx 50. As a reference, we also report the estimates that are obtained when we assume that the distribution of the errors $Z_t$ is standard normal. These results can be found in Table 2.
The estimates of the unconditional mean parameter $\mu$ and the conditional variance parameters, $b_0$, $b_1$, $b_2$, $c$, are similar across distributions. The shape parameters $\psi_1$ and $\psi_2$ are not directly comparable across distributions. These parameters are related to the skewness and kurtosis of the distribution; thus, we compare the estimates of skewness and kurtosis that are found with each distribution using two procedures: first, by computing the sample skewness and sample kurtosis of the estimated errors $\hat{Z}_t$; second, by computing the ML estimates of the skewness and kurtosis coefficients (the formulas that relate these coefficients with $\psi_1$ and $\psi_2$ in each distribution can be found in the references that are included in Section 2). The estimates of skewness and kurtosis across distributions are similar; in all cases left skewness and leptokurtosis are observed. In order to visually check what type of distribution could be more appropriate, in Figure 1 and Figure 2 we show kernel density estimates obtained with the estimated errors $\hat{Z}_t$ of the normal model, together with the standard normal pdf. In all cases the kernel density estimates exhibit a much higher kurtosis than the standard normal pdf; additionally, a slight left skewness is also observed in most cases. These plots suggest that the normal model is not appropriate.
For each returns series described in Table 1, the estimates of VaR and ES for all distributions (including the standard normal) are backtested using the procedures described in Section 3.1. Specifically, given a coverage level $\alpha$, we compute the p-values that are obtained when backtesting VaR($\alpha$) with the test statistics $LR_{UC}$ and $LR_{IND}$, and we consider that the model is “successfully backtested for VaR($\alpha$)” if both p-values are greater than 0.05; the results that are obtained when performing this exercise for coverage levels $\alpha = 0.05$ and $\alpha = 0.025$ are reported in Table 3. Similarly, we compute the p-values that are obtained when backtesting ES($\alpha$) with the statistics $MU_{ES}$ and $MC_{ES}(1)$, and we consider that the model is “successfully backtested for ES($\alpha$)” if both p-values are greater than 0.05; the results that are obtained when performing this exercise for coverage levels $\alpha = 0.05$ and $\alpha = 0.025$ are reported in Table 4.
When backtesting VaR, the normal distribution is successfully backtested only in 18 cases out of the 32 analyses that are carried out with it. The other four distributions perform much better, as they are only rejected in two cases (CF distribution), three cases (skewed-t and Johnson distributions) or five cases (GC distribution). Also note that, when comparing the results of VaR(0.05) and VaR(0.025), there are no large differences between them, except for the Johnson distribution (which fails in three cases when $\alpha = 0.05$, but is always successfully backtested when $\alpha = 0.025$). When backtesting ES, the normal distribution performs even worse, as it is successfully backtested only in 13 cases. The other four distributions perform much better, but some differences are observed among them: the CF and the Johnson distributions are always successfully backtested and the skewed-t distribution is only rejected in two cases; however, the GC distribution is rejected in six cases. Also note that, in terms of performance of the different distributions, the results for ES(0.05) and ES(0.025) are very similar. It is also worth emphasizing that there are two cases (FTSE100 and Heineken with the GC model) that are successfully backtested for VaR but not for ES when the coverage level is 0.05, and two other cases (Hang Seng with the GC model and FTSE100 with the skewed-t model) in which this also happens when the coverage level is 0.025; note that this may happen when the model does not appropriately capture the behavior at the extreme left tail. Similar cases have also been reported, e.g., in Table 7 of [27], and in Table 2 of [35] (when analyzing FTSE100 with their statistic $T_n$).
To sum up, the backtesting results suggest that the standard normal distribution is clearly outperformed by all four two-parameter distributions, but these results shed no clear light on whether the distributions based on polynomial expansions outperform the distributions based on parametric densities.
We then compare the performance of the four two-parameter distributions using loss functions. When several loss functions are used, the best model according to one of them need not be the best model according to another. A possible way to reconcile all the results is to use the sum of all loss functions and then consider that the best model is the one with the lowest sum; this is, for example, the approach followed in [23]. Instead, here we only perform pairwise comparisons (first between the two polynomial expansions, second between the two parametric models, and third between the best ones in the first and second comparisons); for this reason, in each case we consider that the best model is the one that yields lower values for most of the five loss functions described in (11), so that a bad performance with a single loss function is not decisive in the comparison. More specifically, for each of our 16 return series and for each of our 4 risk measures (that is, 64 cases), we perform the comparison of models in this way: (i) we define the “best polynomial expansion” (BPE) as follows: if neither the GC nor the CF distribution has been successfully backtested for that risk measure, we consider that there is no BPE; if one of these two distributions is successfully backtested but the other is not, then the one that is successfully backtested is considered as the BPE; and when both the GC and the CF distributions are successfully backtested, we compute the average loss functions $\bar{L}_j(\widehat{\text{RiskM}})$ defined in (12), for $j = 1, \ldots, 5$, with each distribution, and the one that yields lower values for most of these five loss measures is considered as the BPE; (ii) we define the “best parametric density” (BPD) by comparing the skewed-t distribution and the Johnson distribution in the same way as the GC and CF distributions are compared; (iii) finally, we compare the BPE and the BPD using the median of the five relative differences (in percentage) between one and the other, that is,

$$\text{Median of Relat. Diff.} = \operatorname{Median}\left\{ 100 \times \frac{\bar{L}_j(\widehat{\text{RiskM}})_{\text{BPD}} - \bar{L}_j(\widehat{\text{RiskM}})_{\text{BPE}}}{\bar{L}_j(\widehat{\text{RiskM}})_{\text{BPD}}}, \ \text{for } j = 1, \ldots, 5 \right\}.$$

Note that this value gives us a measure of the relative gain obtained by using a polynomial expansion instead of a parametric density: a positive (negative) value tells us how much better (worse) the BPE is with respect to the BPD, in percentage. To clarify matters, in the first analysis that we perform, i.e., when analyzing the estimation of VaR(0.05) for S&P500, all four distributions are successfully backtested. The average loss functions when using the GC distribution are $\bar{L}_1 = 0.1023$, $\bar{L}_2 = 0.0543$, $\bar{L}_3 = 0.02102$, $\bar{L}_4 = 0.02704$ and $\bar{L}_5 = 0.03483$, and four of these values (all but $\bar{L}_2$) are lower than the corresponding values obtained with the CF distribution; hence, the GC distribution is the BPE. The average loss functions when using the skewed-t distribution are $\bar{L}_1 = 0.10161$, $\bar{L}_2 = 0.05027$, $\bar{L}_3 = 0.02181$, $\bar{L}_4 = 0.02654$ and $\bar{L}_5 = 0.0343$, and all of them are lower than the corresponding values obtained with the Johnson distribution; hence the skewed-t is the BPD. The median of the five relative differences between the GC distribution and the skewed-t distribution is $100 \times (\bar{L}_{5,\text{Skewed-t}} - \bar{L}_{5,\text{GC}}) / \bar{L}_{5,\text{Skewed-t}} = -1.55\%$, that is, in this case using a polynomial expansion yields slightly worse results than using a parametric density. Table 5 reports the results that are found when performing this analysis for VaR(0.05) and VaR(0.025), and Table 6 reports similar results when performing this analysis for ES(0.05) and ES(0.025).
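As a check on the S&P500 example, the median of the relative differences can be recomputed from the ten average losses quoted in the text. This is only an illustrative sketch; the function name is hypothetical, and the inputs are the GC (BPE) and skewed-t (BPD) values reported for VaR(0.05).

```python
from statistics import median


def median_relative_difference(losses_bpe, losses_bpd):
    """Median over j of 100 * (L_j[BPD] - L_j[BPE]) / L_j[BPD]."""
    if len(losses_bpe) != len(losses_bpd):
        raise ValueError("both models need the same number of loss functions")
    return median(100.0 * (d - e) / d for e, d in zip(losses_bpe, losses_bpd))


# Average losses for VaR(0.05) on S&P500, as quoted in the text.
gc_losses = [0.1023, 0.0543, 0.02102, 0.02704, 0.03483]
skewed_t_losses = [0.10161, 0.05027, 0.02181, 0.02654, 0.0343]

result = median_relative_difference(gc_losses, skewed_t_losses)
print(round(result, 2))  # -1.55
```

The five relative differences are approximately −0.68%, −8.02%, 3.62%, −1.88% and −1.55%, so the median is attained at the fifth loss function, and its negative sign indicates that the BPE performs slightly worse than the BPD in this case.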
The results in Table 5 show that the polynomial expansions typically outperform the parametric densities when estimating VaR: the polynomial expansions are preferred in 25 out of the 32 analyses that are performed. More specifically, the CF distribution is clearly the dominant one: when estimating VaR(0.025), it is the preferred one in 12 cases (in the other 4 cases the preferred model is twice the GC model, and once each one of the parametric densities); when estimating VaR(0.05), the CF distribution is still the preferred one in 9 cases (in the other 7 cases the preferred model is four times the skewed-t distribution, twice the GC distribution and once the Johnson distribution). However, the results in Table 5 also show that the gain from using a polynomial expansion instead of a parametric density is modest: the values of our measure of the relative difference between the BPE and the BPD are within the interval [−4%, 6%] in all but one of the 29 comparisons that are made, and most of the time (in 19 cases) the absolute value of the relative difference is below 3%.
The results in Table 6 show a different picture when estimating ES. Parametric densities outperform polynomial expansions in 21 out of the 32 analyses that are performed. More specifically, the Johnson distribution is now the dominant one, but the results are much less clear-cut than those of Table 5. In fact, when estimating ES(0.05), in 8 cases the BPE outperforms the BPD (and in those 8 cases the CF distribution is the BPE), and in 8 cases the BPD outperforms the BPE (and in 6 of these cases the Johnson distribution is the BPD). When estimating ES(0.025) there is more evidence of a better performance of the BPD, as it outperforms the BPE in 13 cases (and in 7 of these cases the Johnson distribution is the BPD). However, it is possibly more important to emphasize that now the differences in performance are even smaller than those observed in Table 5: the values of our measure of the relative difference between the BPE and the BPD are within the interval [−4%, 3%] in all but two of the 32 comparisons that are made, and most of the time (in 23 cases) the absolute value of the relative difference is below 2%.
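The statements about the size of the relative differences can be verified from the percentages listed in Tables 5 and 6. The sketch below uses only their absolute values, transcribed row by row (α = 0.05 entry first, then α = 0.025, skipping the cells with no comparison), so it checks magnitudes and not signs; the variable names are illustrative.

```python
# Absolute values (in %) of the median relative differences in Table 5 (VaR).
TABLE5_ABS = [
    1.55, 3.11, 0.28, 0.63, 0.50, 0.63, 0.61, 5.51, 5.37, 2.43,
    0.73, 0.95, 0.49, 0.85, 3.40, 3.31, 14.25, 3.18, 2.97, 5.58,
    3.81, 1.18, 1.25, 1.31, 1.49, 3.07, 1.68, 1.85, 2.50,
]

# Absolute values (in %) of the median relative differences in Table 6 (ES).
TABLE6_ABS = [
    3.42, 4.00, 0.20, 1.18, 0.03, 0.45, 0.74, 1.77, 1.19, 0.58,
    0.10, 0.34, 0.09, 0.50, 1.13, 9.38, 2.32, 2.39, 1.69, 0.86,
    0.65, 7.03, 0.42, 1.71, 0.25, 2.94, 0.02, 2.41, 0.18, 1.21,
    0.23, 3.27,
]


def count_below(values, threshold):
    # Number of comparisons whose absolute relative difference is below threshold.
    return sum(v < threshold for v in values)


print(len(TABLE5_ABS), count_below(TABLE5_ABS, 3.0))  # 29 19
print(len(TABLE6_ABS), count_below(TABLE6_ABS, 2.0))  # 32 23
```

The same lists show a single VaR comparison with magnitude above 6% (China Merch. Bank at α = 0.05) and two ES comparisons with magnitude above 4%, matching the exceptions mentioned in the text.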
To sum up, the distributions based on polynomial expansions (specifically, the CF distribution) outperform the parametric distributions that we consider when the risk measure of interest is VaR, but if our interest lies in the estimation of ES, the parametric distributions (specifically, the Johnson distribution) provide somewhat better results than the polynomial expansions that we consider. In any case, the relative gains of using one approach instead of the other are modest, especially when estimating ES.

5. Concluding Remarks

We study the performance of alternative models for estimating the VaR and the ES of daily returns, using both backtesting tests and loss functions to rank the candidate models. All the models we consider specify an NGARCH structure for the conditional variance with constant conditional mean, but differ in the distribution assumed for the standardized errors, $Z_t$; more specifically, we examine (i) distributions based on polynomial expansions (Cornish-Fisher and Gram-Charlier), and (ii) well-known parametric distributions (skewed-t and Johnson distributions). We employ traditional tests for backtesting VaR, see [21,22], and the approach of cumulative violations proposed in [4] for backtesting ES.
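To make the common volatility structure concrete, the following sketch filters a return series through one widely used NGARCH(1,1) recursion, $\sigma_t^2 = b_0 + b_1 \sigma_{t-1}^2 + b_2 \sigma_{t-1}^2 (z_{t-1} - c)^2$ with $z_t = (r_t - \mu)/\sigma_t$. That this is the exact specification estimated here, and that $b_0$, $b_1$, $b_2$, $c$ and $\mu$ in Table 2 play these roles, are assumptions; the function names and the short return list are hypothetical.

```python
import math


def ngarch_persistence(b1, b2, c):
    # Under the assumed recursion, variance persistence is b1 + b2 * (1 + c^2).
    return b1 + b2 * (1.0 + c * c)


def ngarch_variance_path(returns, mu, b0, b1, b2, c):
    """Return (fitted conditional variances, one-step-ahead variance forecast)."""
    persistence = ngarch_persistence(b1, b2, c)
    if persistence >= 1.0:
        raise ValueError("non-stationary parameters: persistence >= 1")
    # Start the recursion at the unconditional variance.
    sigma2 = b0 / (1.0 - persistence)
    path = []
    for r in returns:
        path.append(sigma2)
        z = (r - mu) / math.sqrt(sigma2)  # standardized return
        sigma2 = b0 + b1 * sigma2 + b2 * sigma2 * (z - c) ** 2
    return path, sigma2


# CF column of Table 2 (Euro Stoxx 50, in-sample estimates).
params = dict(mu=-0.0184, b0=0.0240, b1=0.7575, b2=0.0432, c=2.0919)

# Hypothetical daily returns (in %), for illustration only.
returns = [0.5, -1.2, 0.3, -2.0, 0.8]
path, forecast = ngarch_variance_path(returns, **params)
print(ngarch_persistence(params["b1"], params["b2"], params["c"]))
print(path[0], forecast)
```

Under these assumptions the Table 2 estimates imply a persistence of about 0.99 and an unconditional variance of about 2.34, i.e., a daily standard deviation near 1.5, which is of the same order as the 1.42 reported for Euro Stoxx 50 in Table 1.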
The datasets we use include stock returns, indexes and exchange rates, and cover markets from around the world. The coverage levels considered here are 5% and 2.5%, which are typical values employed by regulators and risk managers in practice. The first stage of our analysis, based on the backtesting approach, provides evidence that polynomial expansions are as good as parametric densities at generating statistically correct VaR/ES estimates at both coverage levels; it also shows that the standard normal density is not appropriate in most cases. A second-stage analysis, based on the use of loss functions, is implemented for the successfully backtested VaR/ES models, in order to rank the models and examine whether or not polynomial expansions yield better estimates than parametric densities. The conclusion of this analysis is that polynomial expansions (specifically, the Cornish-Fisher one) usually outperform parametric densities in VaR estimation, but parametric densities (specifically, the Johnson density) slightly outperform polynomial expansions in ES estimation. However, the gains of using one approach or the other are modest, especially in the case of ES estimation.
Several interesting avenues for further research are as follows. First, examining whether an increase in the number of parameters would favor one of the two approaches over the other; this type of analysis should consider, for example, the three-parameter generalization of the Student-t density proposed in [36], or the higher-order GC expansion considered in [37]. Second, including in the analysis polynomial expansions based on alternative densities to the normal that yield closed-form expressions for ES, see, e.g., [38] for the case of the Student-t, or [39] for the case of hyperbolic secant and logistic densities. Third, extending our analysis by considering the use of joint scoring functions for VaR and ES and intraday data, in the spirit of [40].

Author Contributions

Methodology, B.C.-B., Á.L. and J.M.; empirical analysis, B.C.-B.; writing—original draft, B.C.-B., Á.L. and J.M.; writing—review and editing, Á.L. and J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This paper has been supported by Spanish Government under project PID2021-124860NB-I00 and Generalitat Valenciana under project CIPROM/2021/060.

Data Availability Statement

Not applicable.

Acknowledgments

The authors express their gratitude to the reviewers for their valuable suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Artzner, P.; Delbaen, F.; Eber, J.M.; Heath, D. Coherent measures of risk. Math. Financ. 1999, 9, 203–228.
  2. Gneiting, T. Making and evaluating point forecasts. J. Am. Stat. Assoc. 2011, 106, 746–762.
  3. Kratz, M.; Lok, Y.H.; McNeil, A.J. Multinomial VaR backtests: A simple implicit approach to backtesting expected shortfall. J. Bank. Financ. 2018, 88, 393–407.
  4. Du, Z.; Escanciano, J.C. Backtesting expected shortfall: Accounting for tail risk. Manag. Sci. 2017, 63, 940–958.
  5. Cornish, E.A.; Fisher, R.A. Moments and cumulants in the specification of distributions. Rev. De L’Institut Int. De Stat. 1938, 5, 307–322.
  6. Maillard, D. A user’s guide to the Cornish-Fisher expansion. SSRN 2018.
  7. Chernozhukov, V.; Fernández-Val, I.; Galichon, A. Rearranging Edgeworth–Cornish–Fisher expansions. Econ. Theory 2010, 42, 419–435.
  8. Amédée-Manesme, C.O.; Barthélémy, F.; Keenan, D. Cornish-Fisher expansion for commercial real estate value at risk. J. Real Estate Financ. Econ. 2015, 50, 439–464.
  9. Stuart, A.; Ord, K. Kendall’s Advanced Theory of Statistics, 4th ed.; Griffin: London, UK, 1977; Volume 1, pp. 166–171.
  10. Mauleón, I.; Perote, J. Testing densities with financial data: An empirical comparison of the Edgeworth-Sargan density to the Student’s t. Eur. J. Financ. 2000, 6, 225–239.
  11. Jondeau, E.; Rockinger, M. Gram–Charlier densities. J. Econ. Dyn. Control 2001, 25, 1457–1483.
  12. Ñíguez, T.M.; Perote, J. Forecasting heavy-tailed densities with positive Edgeworth and Gram-Charlier expansions. Oxf. Bull. Econ. Stat. 2012, 74, 600–627.
  13. León, Á.; Ñíguez, T.M. The transformed Gram Charlier distribution: Parametric properties and financial risk applications. J. Empir. Financ. 2021, 63, 323–349.
  14. Gallant, A.R.; Nychka, D.W. Semi-nonparametric maximum likelihood estimation. Econometrica 1987, 55, 363–390.
  15. Gallant, A.R.; Tauchen, G. Seminonparametric estimation of conditionally constrained heterogeneous processes: Asset pricing applications. Econometrica 1989, 57, 1091–1120.
  16. León, Á.; Mencía, J.; Sentana, E. Parametric properties of semi-nonparametric distributions, with applications to option valuation. J. Bus. Econ. Stat. 2009, 27, 176–192.
  17. Hansen, B.E. Autoregressive conditional density estimation. Int. Econ. Rev. 1994, 35, 705–730.
  18. Johnson, N.L. Systems of frequency curves generated by methods of translation. Biometrika 1949, 36, 149–176.
  19. Simonato, J.G. The performance of Johnson distributions for computing value at risk and expected shortfall. J. Deriv. 2011, 19, 7–24.
  20. Simonato, J.G. GARCH processes with skewed and leptokurtic innovations: Revisiting the Johnson-Su case. Financ. Lett. 2012, 9, 213–219.
  21. Kupiec, P. Techniques for verifying the accuracy of risk measurement models. J. Deriv. 1995, 2, 73–84.
  22. Christoffersen, P.F. Evaluating interval forecasts. Int. Econ. Rev. 1998, 39, 841–862.
  23. Abad, P.; Muela, S.B.; López, C. The role of the loss function in value-at-risk comparisons. J. Risk Model Valid. 2015, 9, 1–19.
  24. Christoffersen, P.F. Elements of Financial Risk Management, 2nd ed.; Elsevier/Academic Press: Amsterdam, The Netherlands, 2012; pp. 76–77.
  25. Aboura, S.; Maillard, D. Option pricing under skewness and kurtosis using a Cornish–Fisher expansion. J. Futur. Mark. 2016, 36, 1194–1209.
  26. Liquet, B.; Nazarathy, Y. A dynamic view to moment matching of truncated distributions. Stat. Probab. Lett. 2015, 104, 87–93.
  27. León, Á.; Ñíguez, T.M. Modeling asset returns under time-varying seminonparametric distributions. J. Bank. Financ. 2020, 118, 105870.
  28. Skoulakis, G. Simulating from polynomial-normal distributions. Commun. Stat.-Simul. Comput. 2009, 48, 472–477.
  29. Zhu, D.; Galbraith, J.W. Modeling and forecasting expected shortfall with the generalized asymmetric Student-t and asymmetric exponential power distributions. J. Empir. Financ. 2011, 18, 765–778.
  30. Choi, P.; Nam, K. Asymmetric and leptokurtic distribution for heteroscedastic asset returns: The SU-normal distribution. J. Empir. Financ. 2008, 15, 41–63.
  31. Jorion, P. Value at Risk: The New Benchmark for Managing Financial Risk; McGraw-Hill: New York, NY, USA, 2007; pp. 139–157.
  32. Lopez, J.A. Methods for evaluating value-at-risk estimates. Econ. Rev. 1999, 2, 3–17.
  33. Sarma, M.; Thomas, S.; Shah, A. Selection of value-at-risk models. J. Forecast. 2003, 22, 337–358.
  34. Caporin, M. Evaluating value-at-risk measures in the presence of long memory conditional volatility. J. Risk 2008, 10, 79–110.
  35. Hoga, Y.; Demetrescu, M. Monitoring value-at-risk and expected shortfall forecasts. Manag. Sci. 2022.
  36. Zhu, D.; Galbraith, J.W. A generalized asymmetric Student-t distribution with application to financial econometrics. J. Econom. 2010, 157, 297–305.
  37. Del Brio, E.; Mora-Valencia, A.; Perote, J. Risk quantification for commodity ETFs: Backtesting value-at-risk and expected shortfall. Int. Rev. Financ. Anal. 2020, 70, 101163.
  38. León, Á.; Ñíguez, T.M. Polynomial adjusted Student-t densities for modeling asset returns. Eur. J. Financ. 2022, 28, 907–929.
  39. Bagnato, L.; Potì, V.; Zoia, M.G. The role of orthogonal polynomials in adjusting hyperbolic secant and logistic distributions to analyse financial asset returns. Stat. Pap. 2015, 56, 1205–1234.
  40. Meng, X.; Taylor, J.W. Estimating value-at-risk and expected shortfall using the intraday low and range data. Eur. J. Oper. Res. 2020, 280, 191–202.
Figure 1. Kernel estimates of the probability density functions of the residuals $\hat{Z}_t$ obtained with the normal model (part 1).
Figure 2. Kernel estimates of the probability density functions of the residuals $\hat{Z}_t$ obtained with the normal model (part 2).
Table 1. Descriptive Statistics for Daily Log-Returns.

| Series | Mean | Std. Dev. | Max | Min | Skew. | Kurt. | Period | T |
|---|---|---|---|---|---|---|---|---|
| S&P500 | 0.02 | 1.16 | 10.96 | −9.47 | −0.23 | 9.48 | 11/10/2000–10/12/2019 | 5000 |
| Euro Stoxx 50 | −0.01 | 1.42 | 10.44 | −9.01 | −0.05 | 5.24 | 11/10/2000–10/12/2019 | 5000 |
| BRBOVES | 0.04 | 1.69 | 13.68 | −12.10 | −0.12 | 4.62 | 11/10/2000–10/12/2019 | 5000 |
| NASCOMP | 0.02 | 1.46 | 13.25 | −9.59 | 0.08 | 6.83 | 11/10/2000–10/12/2019 | 5000 |
| Hang Seng | 0.01 | 1.39 | 13.40 | −13.59 | −0.04 | 9.21 | 11/10/2000–10/12/2019 | 5000 |
| DAX | 0.02 | 1.56 | 12.37 | −9.60 | −0.11 | 5.39 | 11/10/2000–10/12/2019 | 5000 |
| FTSE100 | 0.00 | 1.33 | 12.22 | −11.51 | −0.25 | 9.95 | 11/10/2000–10/12/2019 | 5000 |
| Euro to US $ | 0.00 | 0.59 | 3.84 | −4.62 | −0.10 | 2.78 | 11/10/2000–10/12/2019 | 5000 |
| China Merch. Bank | 0.05 | 2.15 | 9.60 | −14.27 | 0.17 | 4.07 | 10/04/2002–10/12/2019 | 4353 |
| Kweichow Moutai | 0.12 | 2.05 | 9.56 | −15.21 | 0.34 | 3.60 | 02/01/2002–10/12/2019 | 4421 |
| Amazon.com | 0.08 | 2.28 | 29.62 | −28.46 | 0.54 | 14.83 | 11/10/2000–10/12/2019 | 5000 |
| Apple | 0.10 | 1.17 | 13.02 | −19.75 | −0.23 | 6.36 | 11/10/2000–10/12/2019 | 5000 |
| Coca-Cola | 0.01 | 1.79 | 13.00 | −10.60 | −0.14 | 11.02 | 11/10/2000–10/12/2019 | 5000 |
| Walt Disney | 0.03 | 1.79 | 14.82 | −20.29 | −0.21 | 11.00 | 11/10/2000–10/12/2019 | 5000 |
| e-bay | 0.03 | 2.50 | 26.53 | −23.04 | 0.10 | 12.74 | 11/10/2000–10/12/2019 | 5000 |
| Heineken | 0.01 | 1.44 | 9.57 | −21.53 | −0.74 | 14.39 | 29/06/2000–10/12/2019 | 5000 |
Table 2. Parameter Estimates of Euro Stoxx 50 (In-Sample Period *).

| | CF | GC | Skewed-t | Johnson | Normal |
|---|---|---|---|---|---|
| $b_0$ | 0.0240 (0.0049) | 0.0239 (0.0064) | 0.0245 (0.0044) | 0.0239 (0.0086) | 0.0245 (0.0127) |
| $b_1$ | 0.7575 (0.0433) | 0.7548 (0.0444) | 0.7541 (0.0296) | 0.7577 (0.0473) | 0.7574 (0.1120) |
| $b_2$ | 0.0432 (0.0031) | 0.0427 (0.0031) | 0.0443 (0.0074) | 0.0432 (0.0031) | 0.0421 (0.0413) |
| $c$ | 2.0919 (0.2144) | 2.1235 (0.2214) | 2.0882 (0.2612) | 2.0926 (0.2588) | 2.1294 (1.1149) |
| $\mu$ | −0.0184 (0.0248) | −0.0175 (0.0421) | −0.0244 (0.0219) | −0.0182 (0.0790) | −0.0234 (0.0350) |
| $\psi_1$ | −0.2613 (0.0593) | 0.3873 (0.1484) | 0.5695 (0.0158) | 1.6762 (3.2153) | |
| $\psi_2$ | 0.3093 (0.1664) | 0.1628 (0.0438) | 25.7810 (14.2877) | 4.2613 (4.0214) | |
| Sample skewness of $\hat{Z}_t$ | −0.2933 | −0.2933 | −0.2941 | −0.2933 | −0.2923 |
| Estimated skewness based on $\hat{\psi}_1, \hat{\psi}_2$ | −0.2754 | −0.2423 | −0.2548 | −0.2776 | |
| Sample kurtosis of $\hat{Z}_t$ | 3.4112 | 3.4108 | 3.4152 | 3.4114 | 3.4070 |
| Estimated kurtosis based on $\hat{\psi}_1, \hat{\psi}_2$ | 3.3374 | 3.3301 | 3.3275 | 3.3434 | |

Note: Standard errors in parentheses. * In-sample period is from 11 October 2000 to 10 June 2008 (2000 observations).
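Table 3. Backtesting VaR at the 5% Significance Level.

| Series | α = 0.05: CF | GC | Skewed-t | Johnson | Normal | α = 0.025: CF | GC | Skewed-t | Johnson | Normal |
|---|---|---|---|---|---|---|---|---|---|---|
| S&P500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| Euro Stoxx 50 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| BRBOVES | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| NASCOMP | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| Hang Seng | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| DAX | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| FTSE100 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| Euro to US $ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| China Merch. Bank | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Kweichow Moutai | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Amazon.com | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| Apple | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Coca-Cola | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
| Walt Disney | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| e-bay | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Heineken | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |

Note: A value of “1” means that the corresponding model is successfully backtested; otherwise, the value is “0”.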
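Table 4. Backtesting ES at the 5% Significance Level.

| Series | α = 0.05: CF | GC | Skewed-t | Johnson | Normal | α = 0.025: CF | GC | Skewed-t | Johnson | Normal |
|---|---|---|---|---|---|---|---|---|---|---|
| S&P500 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
| Euro Stoxx 50 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
| BRBOVES | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| NASCOMP | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
| Hang Seng | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| DAX | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
| FTSE100 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 |
| Euro to US $ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| China Merch. Bank | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Kweichow Moutai | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
| Amazon.com | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Apple | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Coca-Cola | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| Walt Disney | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| e-bay | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Heineken | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 |

Note: A value of “1” means that the corresponding model is successfully backtested; otherwise, the value is “0”.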
Table 5. Comparison of Distributions with Loss Functions: VaR.

| Series | VaR(0.05): Best Polynomial Expan. (1) | Best Parametric Density (2) | Median of Relat. Diff. * | VaR(0.025): Best Polynomial Expan. (1) | Best Parametric Density (2) | Median of Relat. Diff. * |
|---|---|---|---|---|---|---|
| S&P500 | GC | Skewed-t | 1.55% | GC | Skewed-t | 3.11% |
| Euro Stoxx 50 | CF | Skewed-t | 0.28% | CF | Johnson | 0.63% |
| BRBOVES | CF | Johnson | 0.50% | CF | Johnson | 0.63% |
| NASCOMP | GC | Skewed-t | 0.61% | GC | Johnson | 5.51% |
| Hang Seng | GC | Johnson | 5.37% | CF | Johnson | 2.43% |
| DAX | CF | Johnson | 0.73% | CF | Johnson | 0.95% |
| FTSE100 | CF | Skewed-t | 0.49% | CF | Johnson | 0.85% |
| Euro to US $ | GC | Skewed-t | 3.40% | CF | Skewed-t | 3.31% |
| China Merch. Bank | GC | Johnson | 14.25% | CF | Johnson | 3.18% |
| Kweichow Moutai | CF | Johnson | 2.97% | CF | Johnson | 5.58% |
| Amazon.com | CF | — ** | | CF | Johnson | 3.81% |
| Apple | CF | Johnson | 1.18% | CF | Johnson | 1.25% |
| Coca-Cola | CF | — ** | | — ** | Johnson | |
| Walt Disney | CF | Johnson | 1.31% | CF | Johnson | 1.49% |
| e-bay | GC | Johnson | 3.07% | CF | Johnson | 1.68% |
| Heineken | CF | Johnson | 1.85% | CF | Johnson | 2.50% |

Note: Only distributions that are “successfully backtested” are considered (excluding the normal). * A positive sign means that the best polynomial expansion performs better (has lower median loss) than the best parametric density. ** In this case no distribution of this type is successfully backtested.
Table 6. Comparison of Distributions with Loss Functions: ES.

| Series | ES(0.05): Best Polynomial Expan. (1) | Best Parametric Density (2) | Median of Relat. Diff. * | ES(0.025): Best Polynomial Expan. (1) | Best Parametric Density (2) | Median of Relat. Diff. * |
|---|---|---|---|---|---|---|
| S&P500 | GC | Skewed-t | 3.42% | CF | Skewed-t | 4.00% |
| Euro Stoxx 50 | CF | Johnson | 0.20% | CF | Johnson | 1.18% |
| BRBOVES | CF | Johnson | 0.03% | CF | Johnson | 0.45% |
| NASCOMP | CF | Johnson | 0.74% | CF | Johnson | 1.77% |
| Hang Seng | CF | Johnson | 1.19% | CF | Johnson | 0.58% |
| DAX | CF | Johnson | 0.10% | CF | Johnson | 0.34% |
| FTSE100 | CF | Johnson | 0.09% | CF | Johnson | 0.50% |
| Euro to US $ | CF | Johnson | 1.13% | CF | Skewed-t | 9.38% |
| China Merch. Bank | CF | Skewed-t | 2.32% | GC | Skewed-t | 2.39% |
| Kweichow Moutai | CF | Skewed-t | 1.69% | CF | Skewed-t | 0.86% |
| Amazon.com | CF | Johnson | 0.65% | CF | Skewed-t | 7.03% |
| Apple | CF | Johnson | 0.42% | CF | Johnson | 1.71% |
| Coca-Cola | CF | Johnson | 0.25% | CF | Skewed-t | 2.94% |
| Walt Disney | CF | Johnson | 0.02% | GC | Johnson | 2.41% |
| e-bay | CF | Johnson | 0.18% | CF | Johnson | 1.21% |
| Heineken | CF | Skewed-t | 0.23% | CF | Skewed-t | 3.27% |

Note: Only distributions that are “successfully backtested” are considered (excluding the normal). * A positive sign means that the best polynomial expansion performs better (has lower median loss) than the best parametric density.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Castillo-Brais, B.; León, Á.; Mora, J. Estimating Value-at-Risk and Expected Shortfall: Do Polynomial Expansions Outperform Parametric Densities? Mathematics 2022, 10, 4329. https://doi.org/10.3390/math10224329
