Value at Risk Estimation Using the GARCH-EVT Approach with Optimal Tail Selection

A conditional Extreme Value Theory (GARCH-EVT) approach is a two-stage hybrid method that combines a Generalized Autoregressive Conditional Heteroskedasticity (GARCH) filter with Extreme Value Theory (EVT). The approach requires pre-specification of a threshold separating the tails of the distribution from its middle part. The appropriate choice of a threshold level is a demanding task. In this paper we use four different optimal tail selection algorithms, i.e., the path stability method, the automated Eye-Ball method, the minimization of asymptotic mean squared error method and the distance metric method with a mean absolute penalty function, to estimate out-of-sample Value at Risk (VaR) forecasts and compare them to the fixed threshold approach. Unlike other studies, we update the optimal fraction of the tail for each rolling window of returns. The research objective is to verify to what extent the optimization procedures can improve VaR estimates compared to the fixed threshold approach. Results are presented for a long and a short position for 10 world stock indices over the period from 2000 to June 2019. Although each approach generates different threshold levels, the GARCH-EVT model produces similar Value at Risk estimates. Therefore, no improvement in VaR accuracy is observed relative to the conservative approach taking the 95th quantile of returns as a threshold.


Introduction
Value at Risk (VaR) is the most widely known measure of market risk. VaR indicates the maximum loss of an asset or a portfolio of assets over the target horizon such that there is only a low, pre-specified probability q that the actual loss will be greater. VaR can also be considered in the context of returns. Formally, we denote by r_t the return on assets at time t. The one-day-ahead Value at Risk for a long trading position at a significance level q, denoted VaR_q(r_t), anticipated conditionally on an information set F_t available at time t, is defined by the formula:

P(r_{t+1} ≤ VaR_q(r_t) | F_t) = q. (1)

This definition shows that VaR is the q-th conditional quantile of the returns distribution. For a short trading position VaR is the (1−q)-th conditional quantile of the returns distribution:

P(r_{t+1} ≥ VaR_{1−q}(r_t) | F_t) = q. (2)

The main practical problem consists in the selection of an appropriate method to measure Value at Risk. The most popular and simplest methods include historical simulation, Monte Carlo simulation and the variance-covariance approach. Pérignon and Smith (2010) [1] reported that 73% of U.S. international banks use historical simulation, while only a minority of financial institutions use more complex parametric models. When financial markets become volatile and extreme returns appear, none of these methods is capable of measuring the risk appropriately.
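The quantile definition in Equations (1) and (2) can be illustrated directly on data. A minimal sketch in Python, using simulated fat-tailed returns (purely hypothetical data, not any series from this study):

```python
import numpy as np

# Illustration only: VaR as an empirical quantile of returns (Equations (1)-(2)).
# The data are simulated Student's-t returns, not any index from the study.
rng = np.random.default_rng(42)
returns = 0.01 * rng.standard_t(df=4, size=5000)  # fat-tailed daily returns

q = 0.01  # significance level
var_long = np.quantile(returns, q)        # left-tail quantile: P(r <= VaR_q) = q
var_short = np.quantile(returns, 1 - q)   # right-tail quantile for a short position

print(f"1% VaR (long):  {var_long:.4f}")
print(f"1% VaR (short): {var_short:.4f}")
```

Historical simulation, mentioned above, is essentially this computation applied to a rolling window of past returns.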
The Extreme Value Theory (EVT) provides a theoretical and practical foundation for statistical models describing extreme events. There is an extensive body of literature that focuses on EVT and discusses the tail behavior of assets [2][3][4][5][6][7][8][9][10]. EVT models assume i.i.d. variables; to address this, McNeil and Frey (2000) [11] constructed a combination of the Extreme Value Theory and the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, referred to as conditional EVT or GARCH-EVT. The model captures the most important stylized facts about financial time series, such as volatility clustering and leptokurtosis, and quickly adapts to recent market movements. The two-stage GARCH-EVT model is widely used to accurately estimate financial risk measured by Value at Risk or Expected Shortfall (ES). McNeil and Frey (2000) [11] showed that the combined application of GARCH and EVT leads to more accurate VaR estimation compared with unconditional EVT methods. Fernandez (2003) [12] showed that EVT clearly outperforms a GARCH model with normal innovations and gives results similar to a GARCH model with t innovations, as long as the innovations come from a symmetric and fat-tailed distribution. In turn, Jadhav and Ramanathan (2009) [13] estimated VaR using 14 (parametric and non-parametric) estimation procedures at a 99% confidence level; the EVT method performed better than the other methods. Gençay and Selçuk (2004) [14] reported that at the 99th and higher quantiles the Generalized Pareto Distribution (GPD) model is clearly superior, in terms of VaR forecasting, to the five other methods used in their study. Moreover, Z. Zhang and H. K. Zhang (2016) [15] showed that the Exponential GARCH (EGARCH) model with a Generalized Error Distribution (GED) combined with the EVT approach does very well in predicting critical loss in precious metal markets.
Just (2014) [16] verified unconditional and conditional VaR estimation models in the agricultural commodity market and found that the GARCH-EVT model enables correct VaR estimation also in this market. Tabasi et al. (2019) [17] showed that the GARCH-EVT model outperforms the simple GARCH model with Student's t and normal distributions for the residuals.
Application of GARCH-EVT in empirical research requires pre-specification of a threshold which separates the tails of the distribution from its middle part. The appropriate choice of the threshold level u is ambiguous, but crucial for the estimation of the Generalized Pareto Distribution parameters and the corresponding accuracy of Value at Risk. The standard practice is to adopt as low a threshold as possible, but there is a trade-off between variance and bias. If the threshold is too low, the asymptotic basis of the model is violated, leading to high bias. If it is too high, there are too few excesses with which to estimate the model, leading to high variance [18]. Most authors prefer to select the threshold as a fixed quantile of the data set, instead of determining a threshold value at each step, especially when they use a moving window of observations to obtain out-of-sample VaR estimates. McNeil and Frey (2000) [11], Karmakar and Shukla (2015) [19], Bee et al. (2016) [20], Totić and Božović (2016) [21], Li (2017) [22], Fernandez (2003) [12], Jadhav and Ramanathan (2009) [13] and Huang et al. (2017) [23] chose the 90th quantile of the loss distribution as a threshold. In contrast, a less conservative, but still fixed, threshold was used by Gençay and Selçuk (2004) [14], Cifter (2011) [24] and Soltane et al. (2012) [25].
Traditional approaches to threshold selection are based on graphical representations. A frequently used procedure consists in the analysis of a mean residual life plot, which represents the mean of the excesses over threshold u. This method was applied by Aboura (2014) [26] and Omari et al. (2017) [27] to estimate VaR based on the GARCH-EVT approach. Another very popular procedure for threshold selection involves a graphical representation of the Hill [28], Pickands [29] or Dekkers-Einmahl-de Haan [30] estimators. Graphical threshold choice procedures require identification of stable regions in the plots and are thus highly subjective. There is extensive literature proposing an optimal, automated choice of the threshold. The most common methods for the adaptive choice of the threshold are based on minimization of an asymptotic mean squared error (AMSE) estimate (e.g., [31][32][33][34]). For example, Hall (1990) [35] and Danielsson et al. (2001) [33] use a bootstrap procedure to minimize the AMSE. Heuristic rules for selecting the tail fraction, e.g., Csörgő and Viharos (1998) [36], the Eye-Ball method [37] or the path stability (PS) method [38], are easy to implement, but they are arbitrary. The distance metric methods were proposed in [37]: Danielsson et al. (2016) [37] used three types of distances between empirical and theoretical quantiles, i.e., the Kolmogorov-Smirnov, average squared and mean absolute distances. A recent overview of the topic can be found in [37,38]. However, none of these approaches outperforms the others in all situations, so researchers are still searching for new methods to find the optimal tail (e.g., [39]).
This paper provides an empirical study of conditional EVT. The research objective of this paper is to compare the predictive ability of VaR estimates when each estimate is made with an optimal choice of the distribution tail. Five methods are applied to describe the tail, i.e., the minimization of the AMSE estimate introduced in [38], the path stability algorithm [38], the distance metric method with the mean absolute penalty function [37], the automated Eye-Ball method [37] and the fixed quantile procedure. Unlike other studies, we update the optimal fraction of the tail for each moving window of observations. This means that each VaR forecast is calculated on the basis of a new estimate of the threshold, so the risk is always estimated with the most recent information. We hypothesize that the optimal choice of the tail fraction makes it possible to improve the accuracy of VaR prediction. This study contributes to the literature in the following ways. Firstly, we conduct an empirical study of one-day out-of-sample Value at Risk forecasts using 10 stock indices over the period from 2000 to June 2019. We propose to use the GARCH-EVT model with optimal tail selection. For all considered models, we allow the tail fraction and the model parameters to change over time. Secondly, we compare the VaR forecast accuracy and the tail fraction between the investigated models. The results indicate which model may be used in practice by investors and by financial and regulatory institutions. This study is an extended version of conference proceedings [40], where only the path stability algorithm was analyzed.
The rest of the paper is organized as follows. Section 2 provides methodological details. It introduces the Peaks Over Threshold (POT) model and describes the tail selection problem in this model. Then the conditional GARCH-EVT model and backtesting procedures are presented. Section 3 presents empirical results for VaR forecasting, while Section 4 concludes the study.

Modeling Tail Using Extreme Value Theory
The Extreme Value Theory focuses primarily on the asymptotic behavior of extreme values of a random variable. The theory provides robust tools for estimating the distribution of extreme values only, instead of the whole distribution. The Peaks Over Threshold model, next to the Block Maxima Model (BMM), is one of the two key models of EVT. It allows us to model the tail regions of the distribution instead of the entire sample. Using the POT method one can easily derive a closed-form expression for the Value at Risk or Expected Shortfall. In this study we focus on the Value at Risk. Although VaR is a primary measure of market risk for financial institutions, it is not a subadditive measure. ES is free from this drawback and theoretically captures the information contained in the tail in a better fashion. However, as an expected value, ES requires the distribution to have a finite first moment. Guégan and Hassani (2018) [41] evidenced infinite ES in financial data when alpha-stable and Generalized Extreme Value distributions were considered. The same problem may appear in the case of the GPD when the shape parameter is greater than one. We avoid this difficulty in our study since only VaR is taken into consideration. The distribution of excesses over a high threshold u is defined as:

F_u(y) = P(r − u ≤ y | r > u) = (F(u + y) − F(u)) / (1 − F(u)), 0 ≤ y ≤ x_0 − u, (3)

where x_0 ≤ ∞ is the right endpoint of F. The Pickands-Balkema-de Haan Theorem [42] is a basic result in POT and states that for a large class of underlying distributions F there exists a function β(u) such that:

lim_{u→x_0} sup_{0 ≤ y ≤ x_0−u} |F_u(y) − G_{ξ,β(u)}(y)| = 0, (4)

where G_{ξ,β(u)}(y) is the Generalized Pareto Distribution given by:

G_{ξ,β}(y) = 1 − (1 + ξy/β)^{−1/ξ} for ξ ≠ 0, and G_{ξ,β}(y) = 1 − exp(−y/β) for ξ = 0, (5)

where β > 0, y ≥ 0 for ξ ≥ 0 and 0 ≤ y ≤ −β/ξ for ξ < 0. The distribution has only two parameters: β is a scale parameter and ξ is a shape parameter. Heavy tail distributions (i.e., stable, Cauchy, Student's t) have ξ > 0 (Fréchet domain of attraction), whereas thin tail distributions like the normal and log-normal have ξ = 0 (Gumbel domain of attraction).
Distributions with a finite right endpoint have ξ < 0 (Weibull domain of attraction). Rearranging (3) and (5) we obtain a cumulative distribution function of returns:

F(x) = (1 − F(u)) G_{ξ,β}(x − u) + F(u), x > u. (6)

To obtain a useful closed form of distribution (6) it is convenient to replace F(u) with the empirical estimator of exceedance over the threshold. The estimator is of the form F̂(u) = 1 − N_u/n, where N_u is the number of returns that exceed threshold u and n is the total number of returns. The estimator of the cumulative distribution F is then as follows:

F̂(x) = 1 − (N_u/n) (1 + ξ̂ (x − u)/β̂)^{−1/ξ̂}, x > u. (7)

VaR for a short position (right tail of the distribution) is given by inverting Equation (7) for the x value corresponding to probability 1 − q, and we get:

VaR_{1−q} = u + (β̂/ξ̂) [ ((n/N_u) q)^{−ξ̂} − 1 ], (8)

where 1 − q is the confidence level of VaR. In order to calculate VaR for a long position (left tail of the distribution) it is necessary to carry out the calculations for minus returns.
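A sketch of the tail estimator (7) and the VaR formula (8) in Python, with the GPD fitted by maximum likelihood via scipy; the simulated Student's-t losses and the 95th-quantile threshold are illustrative assumptions:

```python
import numpy as np
from scipy.stats import genpareto

# Sketch of the POT tail estimator and the closed-form VaR formula (8).
# Simulated t-distributed "losses" stand in for minus returns (long position).
rng = np.random.default_rng(0)
losses = 0.01 * rng.standard_t(df=4, size=5000)

n = losses.size
u = np.quantile(losses, 0.95)          # fixed 95th-quantile threshold
excesses = losses[losses > u] - u
N_u = excesses.size

# Fit the GPD to the excesses by maximum likelihood (location fixed at 0);
# scipy's shape parameter c corresponds to xi.
xi, _, beta = genpareto.fit(excesses, floc=0)

# Formula (8): VaR at tail probability q (confidence level 1 - q).
q = 0.01
var_evt = u + (beta / xi) * ((n / N_u * q) ** (-xi) - 1)

print(f"threshold u = {u:.4f}, xi = {xi:.3f}, beta = {beta:.4f}")
print(f"99% VaR (EVT): {var_evt:.4f}")
```

Since n/N_u · q < 1 for the usual q, the bracketed term is positive for either sign of ξ, so the EVT VaR always lies beyond the threshold u.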

Optimal Tail Selection
As noted in the Introduction, there are a large number of methods dealing with the optimal choice of the distribution tail fraction. However, not all methods perform well in finite samples, and therefore some may not be usable in the GARCH-EVT model with moving windows. Some methods that perform well in simulation studies, based on theoretical distributions, may not perform well in financial applications. For instance, methods based on minimizing the asymptotic MSE, especially bootstrap-based methods, do not perform very well in empirical studies [37]. Efficiency assessment is usually conducted using simulation studies, but that is not the case in this work. We handle the problem of the choice of the optimal tail fraction from the investors' point of view, without entering into theoretical considerations of estimator properties. This is empirical research and we focus on computational methods instead of simulation-based ones. In our study we test several methods implemented in the R tea package [43]. Since we measure the risk in finite samples, we chose only the four methods that were able to converge in the optimization procedure.

Mean Absolute Deviation Distance Metric Method
The mean absolute deviation distance metric (MINDIST) procedure proposed in [37] minimizes the distance between the largest upper order statistics of the dataset, i.e., the empirical tail, and the theoretical tail of a Pareto distribution. The parameter α of this distribution is estimated by the Hill estimator for a given number k of upper order statistics, and the distance is then minimized with respect to k. The optimal number, denoted k_0 here, is equivalent to the number of extreme values. We use the mean absolute deviation (MAD) penalty function in this study:

Q(k) = (1/k) Σ_{j=1}^{k} |r_{n−j,n} − q(j, k)|,

where r_{n−j,n} are the empirical quantiles and q(j, k) = r_{n−k+1,n} (k/j)^{1/α̂_k} is the estimated Pareto quantile, with α̂_k the Hill estimator. The Hill estimator [28] is defined as follows:

ξ̂_k = 1/α̂_k = (1/k) Σ_{i=1}^{k} (log r_{n−i+1,n} − log r_{n−k,n}),

where k is the number of upper order statistics used in the estimation of the tail index.
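The Hill estimator and the MAD penalty can be sketched as follows; the quantile convention in `mad_distance` follows the formulas above only loosely, so this is an illustration rather than a reimplementation of the tea package:

```python
import numpy as np

def hill(x, k):
    """Hill estimate of the tail index xi = 1/alpha from the k largest order statistics."""
    xs = np.sort(x)                          # ascending: xs[-1] is the maximum
    logs = np.log(xs[-k:]) - np.log(xs[-k - 1])
    return logs.mean()                       # xi_hat; alpha_hat = 1 / xi_hat

def mad_distance(x, k):
    """MAD distance between the k largest empirical quantiles and the Pareto
    quantiles implied by the Hill fit (a sketch of the MINDIST criterion)."""
    xs = np.sort(x)
    alpha = 1.0 / hill(x, k)
    j = np.arange(1, k + 1)
    q_theo = xs[-k] * (k / j) ** (1.0 / alpha)   # Pareto quantile estimates
    q_emp = xs[-j]                                # k largest observations, descending
    return np.abs(q_emp - q_theo).mean()

# Pick k minimizing the MAD distance over a grid (simulated Pareto data, alpha = 3).
rng = np.random.default_rng(1)
x = rng.pareto(3.0, size=2000) + 1.0
ks = np.arange(10, 400)
k_opt = ks[np.argmin([mad_distance(x, k) for k in ks])]
print("k_opt =", k_opt, "xi_hat =", round(hill(x, k_opt), 3))
```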

Eye-Ball Method
The automated Eye-Ball method is a heuristic procedure proposed in [37] which searches for a stable region in the Hill plot by defining a moving window. Inside this window, the Hill estimates with respect to k have to lie in a predefined range around the first estimate within the window; it is sufficient that only a percentage h of the estimates within the window lie within this range. The smallest k to accomplish this is the optimal number of upper order statistics, i.e., returns in the tail. The estimator is as follows:

k̂_eye = min{ k ∈ {2, …, n⁺ − w} : (1/w) Σ_{i=1}^{w} 1{ |ξ̂(k + i) − ξ̂(k)| < ε } > h },

where w is the size of the moving window, typically 1% of the full sample, h is typically around 90%, ε is 0.3, and n⁺ is the number of positive returns.
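A sketch of this rule in Python; the vectorized Hill path and the fallback for the case when no stable region is found are our assumptions:

```python
import numpy as np

def hill_path(x):
    """Hill estimates xi_hat(k) for k = 1 .. n-1 (x must be positive)."""
    xs = np.sort(x)[::-1]                    # descending order statistics
    logs = np.log(xs)
    k = np.arange(1, xs.size)
    return np.cumsum(logs[:-1]) / k - logs[1:]   # mean of top-k logs minus log r_{n-k,n}

def eyeball_k(x, w_frac=0.01, h=0.9, eps=0.3):
    """Sketch of the automated Eye-Ball rule of [37]: the smallest k such that more
    than a fraction h of the next w Hill estimates stay within eps of xi_hat(k)."""
    xi = hill_path(x)
    w = max(2, int(w_frac * x.size))
    for k in range(1, xi.size - w):
        window = xi[k:k + w]                 # estimates at k+1 .. k+w
        if np.mean(np.abs(window - xi[k - 1]) < eps) > h:
            return k
    return xi.size - 1                       # fallback: no stable region found

rng = np.random.default_rng(2)
x = rng.pareto(3.0, size=2000) + 1.0         # simulated Pareto-tailed data
k0 = eyeball_k(x)
print("Eye-Ball k =", k0)
```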

Path Stability Method
The path stability (PS) method is an algorithm introduced in [38]. The algorithm searches for a stable region of the sample path of the Hill estimator of the tail index with respect to k. This is done in the following steps:
Step 1. Given the observed returns (r_1, …, r_n), compute T(k) := ξ̂_{k,n} using the Hill estimator for k = 1, …, n − 1.
Step 2. Define a_k^{(T)}(j) = round(T(k), j), k = 1, …, n − 1, the values of T(k) rounded to j decimal places. Obtain j_0 as the minimum value of j, a non-negative integer, such that the rounded values a_k^{(T)}(j) of the estimates T(k) are distinct (j = 1 here).
Step 3. Consider the sets of k values associated with equal consecutive values of a_k^{(T)}(j_0 − 1). Among these sets, take the one with the largest range and denote its smallest and largest elements k_min and k_max.
Step 4. Consider all those estimates T(k), k_min ≤ k ≤ k_max, now with two additional decimal places, i.e., compute a_k^{(T)}(j_0 + 2). Obtain the mode of these values and denote by K_T the set of k-values associated with this mode.
Step 5. Take k̂_T as the maximum value of K_T.
Step 6. Compute ξ̂_PS = ξ̂_{k̂_T,n}.
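The steps above can be sketched in Python; the rounding and run-detection conventions are our reading of [38] and may differ in detail from the tea-package implementation:

```python
import numpy as np
from collections import Counter

def hill_path(x):
    """Hill estimates xi_hat(k), k = 1 .. n-1, for positive data x."""
    xs = np.sort(x)[::-1]
    logs = np.log(xs)
    k = np.arange(1, xs.size)
    return np.cumsum(logs[:-1]) / k - logs[1:]

def path_stability_k(x):
    """Sketch of the path stability algorithm of [38]."""
    T = hill_path(x)

    # Step 2: minimum j such that the values rounded to j decimals are all distinct.
    j0 = 0
    while np.unique(np.round(T, j0)).size < T.size and j0 < 12:
        j0 += 1

    # Step 3: largest run of equal consecutive values of round(T, j0 - 1).
    r = np.round(T, max(j0 - 1, 0))
    best_lo, best_hi, lo = 0, 1, 0
    for i in range(1, r.size + 1):
        if i == r.size or r[i] != r[lo]:
            if i - lo > best_hi - best_lo:
                best_lo, best_hi = lo, i
            lo = i

    # Steps 4-6: mode of round(T, j0 + 2) inside the run; k_hat = largest such k.
    fine = np.round(T[best_lo:best_hi], j0 + 2)
    mode = Counter(fine.tolist()).most_common(1)[0][0]
    k_hat = best_lo + int(np.flatnonzero(fine == mode).max()) + 1  # k is 1-based
    return k_hat, T[k_hat - 1]

rng = np.random.default_rng(3)
x = rng.pareto(3.0, size=2000) + 1.0         # simulated Pareto data, xi = 1/3
k_ps, xi_ps = path_stability_k(x)
print("k_PS =", k_ps, "xi_PS =", round(xi_ps, 3))
```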

Minimization of Asymptotic Mean Squared Error Method
The minimization of asymptotic mean squared error (dAMSE) method is an algorithm introduced in [38]. The tail fraction in the dAMSE method is identified by minimizing the AMSE criterion of the Hill estimator with respect to k. The optimal number of upper order statistics, denoted k_0, may be associated with the unknown threshold u. The procedure starts as follows:
Step 1. Given the observed returns (r_1, …, r_n), compute, for the tuning parameters τ = 0 and τ = 1, the second-order parameter estimates

ρ̂_τ(k) = −| 3(W_{k,n}^{(τ)} − 1) / (W_{k,n}^{(τ)} − 3) |,

where the statistics W_{k,n}^{(τ)} depend on the moments of the scaled log-spacings

M_n^{(j)}(k) = (1/k) Σ_{i=1}^{k} (log r_{n−i+1:n} − log r_{n−k:n})^j, j = 1, 2, 3.

The subsequent steps of the algorithm, detailed in [38], select the tuning parameter τ, estimate the second-order parameters (ρ, β) and plug them into the closed-form expression for the value of k that minimizes the AMSE of the Hill estimator.

Conditional Extreme Value Theory Model
The GARCH-EVT model was introduced to VaR modeling by McNeil and Frey [11], who extended the EVT framework to dependent data. The model uses EVT to describe the tails of the standardized residuals e_t obtained from a GARCH model. When the estimated GARCH model is correct, the residuals of the model should be realizations of the unobserved i.i.d. noise variables. It was assumed in this study that returns are modeled using the most popular GARCH(1,1) model [44]:

r_t = σ_t ε_t, σ_t² = ω + α r_{t−1}² + β σ_{t−1}²,

where ω, α, β > 0, α + β < 1 and ε_t ∼ i.i.d.(0, 1). VaR for a short position (right tail) is calculated using the formula:

VaR_{1−q}(r_{t+1}) = σ_t(1) · VaR_{1−q}(e_t),

where σ_t(1) is the one-step-ahead forecast of the conditional volatility from the GARCH(1,1) model and VaR_{1−q}(e_t) is obtained from Formula (8) applied to the standardized residuals e_t of the GARCH(1,1) model.
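The two-stage procedure can be sketched end-to-end in Python on simulated data; the zero-mean GARCH(1,1) specification, the quasi-ML fit via scipy and the 95th-quantile threshold are illustrative assumptions, not a reimplementation of the packages used in the study:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genpareto

# Two-stage GARCH-EVT sketch: (1) quasi-ML fit of a zero-mean GARCH(1,1) with
# normal innovations, (2) POT/GPD fit to the standardized residuals, combined
# via VaR(r) = sigma_t(1) * VaR(e).

def garch_filter(params, r):
    omega, alpha, beta = params
    sigma2 = np.empty_like(r)
    sigma2[0] = r.var()
    for t in range(1, r.size):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

def neg_loglik(params, r):
    sigma2 = garch_filter(params, r)
    return 0.5 * np.sum(np.log(sigma2) + r ** 2 / sigma2)

# Simulate a GARCH(1,1) series (true omega = 0.05, alpha = 0.1, beta = 0.85).
rng = np.random.default_rng(4)
n = 3000
r = np.empty(n)
s2 = 1.0
for t in range(n):
    s2 = 0.05 + 0.1 * (r[t - 1] ** 2 if t else 1.0) + 0.85 * s2
    r[t] = np.sqrt(s2) * rng.standard_normal()

res = minimize(neg_loglik, x0=[0.1, 0.1, 0.8], args=(r,),
               bounds=[(1e-6, None), (1e-6, 1), (1e-6, 1)])
omega, alpha, beta = res.x
sigma2 = garch_filter(res.x, r)
e = r / np.sqrt(sigma2)                      # standardized residuals

# POT on minus residuals (left tail / long position), 95th-quantile threshold.
losses = -e
u = np.quantile(losses, 0.95)
exc = losses[losses > u] - u
xi, _, b = genpareto.fit(exc, floc=0)
q = 0.01
var_e = u + (b / xi) * ((losses.size / exc.size * q) ** (-xi) - 1)

sigma_next = np.sqrt(omega + alpha * r[-1] ** 2 + beta * sigma2[-1])
print("one-day 99% VaR (long) =", round(sigma_next * var_e, 4))
```

In the study the GARCH step is refitted for every moving window; the sketch shows a single window only.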

Backtesting
The model used for VaR estimation should be statistically verified using a backtesting procedure. VaR verification tests are based on the hit function (failure process) {I_t(q)}_{t=1}^{T}, defined for a long trading position by the formula:

I_t(q) = 1 if r_t ≤ VaR_q(r_{t−1}), and I_t(q) = 0 otherwise.

In practice the most commonly used test is Kupiec's proportion of failures test (also known as the unconditional coverage test) [45]. According to Kupiec (1995) [45], the number of VaR violations by actual returns has a binomial distribution, T_1 ∼ B(T, q), and the hypotheses are defined as:

H_0: E[I_t(q)] = q, H_1: E[I_t(q)] ≠ q.

The null hypothesis assumes that the share of VaR violations by actual returns is consistent with the assumed q. The test statistic is defined as:

LR_UC = −2 ln[ (1 − q)^{T_0} q^{T_1} / ((1 − q̂)^{T_0} q̂^{T_1}) ],

where q̂ = T_1/(T_0 + T_1), T_1 is the number of VaR exceedances and T_0 is the number of non-exceedances. Under the true null hypothesis the LR_UC statistic has an asymptotic chi-square distribution with one degree of freedom.
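The LR_UC statistic is straightforward to implement; a sketch on a simulated hit sequence (the data are hypothetical):

```python
import numpy as np
from scipy.stats import chi2

def kupiec_pof(violations, q):
    """Kupiec proportion-of-failures (unconditional coverage) LR test.
    `violations` is the 0/1 hit sequence I_t(q); `q` the VaR significance level."""
    I = np.asarray(violations)
    T1 = I.sum()                       # number of VaR exceedances
    T0 = I.size - T1                   # number of non-exceedances
    q_hat = T1 / (T0 + T1)
    # LR_UC = -2 log[ (1-q)^T0 q^T1 / ((1-q_hat)^T0 q_hat^T1) ]
    log_null = T0 * np.log(1 - q) + T1 * np.log(q)
    log_alt = T0 * np.log(1 - q_hat) + T1 * np.log(q_hat)
    lr_uc = -2 * (log_null - log_alt)
    return lr_uc, chi2.sf(lr_uc, df=1)

# A well-calibrated 1% VaR should produce violations at close to a 1% rate.
rng = np.random.default_rng(5)
hits = rng.random(2500) < 0.01
lr, pval = kupiec_pof(hits, q=0.01)
print(f"LR_UC = {lr:.3f}, p-value = {pval:.3f}")
```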
The advantage of Kupiec's test is that it assesses the model taking into account both too large and too small a number of exceedances. A good model used for VaR estimation should also be characterized by independence of exceedances. The test proposed by Christoffersen (1998) [46] additionally checks the independence of exceedances. When the model accurately estimates VaR, then an exception today should not depend on whether or not an exception occurred on the previous day. In order to test the hypothesis that the exceedances are independent (at first order), an alternative is defined in which the sequence of exceedances is modeled with a first-order Markov chain with the matrix of transition probabilities:

Π = [ π_00 π_01 ; π_10 π_11 ],

where π_ij = P(I_t(q) = j | I_{t−1}(q) = i). The Markov chain reflects the existence of an order-one memory in the sequence of exceedances. The hypotheses in the independence test are defined as: H_0: π_01 = π_11, H_1: π_01 ≠ π_11.
Finally, Christoffersen (1998) [46] combines the two tests, i.e., the unconditional coverage test and the independence test, into the conditional coverage test. In this test, the null hypothesis takes the following form:

H_0: π_01 = π_11 = q.

The test statistic is of the form:

LR_CC = −2 ln[ (1 − q)^{T_0} q^{T_1} / ((1 − π̂_01)^{T_00} π̂_01^{T_01} (1 − π̂_11)^{T_10} π̂_11^{T_11}) ],

where π̂_ij = T_ij/(T_i0 + T_i1) and T_ij is the number of days on which condition j occurred given that condition i occurred on the previous day (1 if an exceedance occurs, 0 if no exceedance appears). Under the true null hypothesis the LR_CC statistic has an asymptotic chi-square distribution with two degrees of freedom.
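A sketch of the conditional coverage test; note that, as is common in implementations, the counts here are taken over the T − 1 transitions of the hit sequence:

```python
import numpy as np
from scipy.special import xlogy
from scipy.stats import chi2

def christoffersen_cc(hits, q):
    """Conditional coverage LR test: correct violation rate plus first-order
    independence of the 0/1 hit sequence (a sketch)."""
    I = np.asarray(hits, dtype=int)
    T = np.zeros((2, 2))                      # T[i, j]: state i yesterday, j today
    for prev, cur in zip(I[:-1], I[1:]):
        T[prev, cur] += 1
    pi01 = T[0, 1] / max(T[0, 0] + T[0, 1], 1)
    pi11 = T[1, 1] / max(T[1, 0] + T[1, 1], 1)
    n0, n1 = T[:, 0].sum(), T[:, 1].sum()
    # xlogy handles the 0 * log(0) = 0 convention for empty cells.
    log_null = xlogy(n0, 1 - q) + xlogy(n1, q)
    log_markov = (xlogy(T[0, 0], 1 - pi01) + xlogy(T[0, 1], pi01)
                  + xlogy(T[1, 0], 1 - pi11) + xlogy(T[1, 1], pi11))
    lr_cc = -2 * (log_null - log_markov)
    return lr_cc, chi2.sf(lr_cc, df=2)

rng = np.random.default_rng(6)
hits = rng.random(2500) < 0.01                # an artificially well-behaved 1% VaR
lr, pval = christoffersen_cc(hits, q=0.01)
print(f"LR_CC = {lr:.3f}, p-value = {pval:.3f}")
```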
The disadvantage of Christoffersen's test is that it examines only first-order dependence of exceedances. Therefore, it should be supplemented by other tests, e.g., a test which analyzes whether the number of periods (days) between the violations of VaR by actual returns is independent over time [47]. Under the null hypothesis of this test, the duration of time between VaR violations should have no memory and a mean duration of 1/q days. Since the only continuous memory-free distribution is the exponential distribution, the test can be conducted on any distribution which embeds the exponential as a restricted case, with a likelihood ratio test then conducted to see whether the restriction holds. Here, the Weibull distribution is used, with parameter b = 1 representing the case of the exponential. The Weibull density is defined as:

f_W(d; a, b) = a^b b d^{b−1} exp{−(ad)^b},

where d is the duration of time (in days) between two VaR exceedances. Under the null hypothesis of independence (b = 1) we have the likelihood:

L(q) = Π_{i=1}^{T_1} q exp(−q d_i),

where T_1 is the number of periods in which an exceedance occurred. The likelihood ratio test statistic is defined as:

LR_D = −2 [ ln L(q) − ln L(â, b̂) ],

and is asymptotically distributed as chi-square with one degree of freedom. We refer to [47] for details of the test.
The basic test using both VaR values and the series of exceedances is the Dynamic Quantile test of Engle and Manganelli [48]. The idea of the test is that exceedances at time t should not depend on exceedances at earlier times, nor on VaR and any processed information (ω_{t−1,j}) available at time t − 1 (e.g., past returns, squares of past returns). Engle and Manganelli define Hit_t(q) = I_t(q) − q. In the test the regression equation is estimated:

Hit_t(q) = β_0 + Σ_{i=1}^{p} β_i Hit_{t−i}(q) + β_{p+1} VaR_t + Σ_{j=1}^{n} β_{p+1+j} ω_{t−1,j} + e_t.

VaR is well estimated if there is no reason to reject the null hypothesis:

H_0: β_0 = β_1 = … = β_{p+n+1} = 0.

The current VaR exceedances are uncorrelated with past exceedances when β_i = 0 for i = 1, …, p, whereas the unconditional coverage hypothesis is verified for β_0 = 0.
In matrix notation, we have Hit = Xβ + e. The Wald test statistic is defined as:

DQ = β̂ᵀ Xᵀ X β̂ / (q(1 − q)).

Under the true null hypothesis the DQ statistic has an asymptotic chi-square distribution with p + n + 2 degrees of freedom.
This test identifies an incorrect VaR measurement, which is not rejected by the classic tests of the number and independence of exceedances. We refer to [48,49] for details of the test.
The loss function is another goodness-of-fit measure for VaR calculation. The loss function for a given q may be determined as in [50]:

Q = (1/T) Σ_{t=1}^{T} (q − I_t(q)) (r_t − VaR_q(r_{t−1})), (33)

i.e., the quantile loss averaged over the evaluation window. A lower Q value indicates a better goodness of fit.
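A sketch of a quantile ("tick") loss of this type; whether it matches the exact form of Q in [50] is an assumption, so treat it as an illustration:

```python
import numpy as np

def tick_loss(returns, var, q):
    """Quantile ('tick') loss averaged over the evaluation window, a common
    VaR loss function; its exact correspondence to Q of [50] is an assumption."""
    r, v = np.asarray(returns), np.asarray(var)
    hit = (r < v).astype(float)          # 1 if the return violates VaR (long position)
    return np.mean((q - hit) * (r - v))

# Among candidate forecasts, the true 1% quantile should score (near-)lowest.
rng = np.random.default_rng(7)
r = rng.standard_normal(100000)
q = 0.01
true_var = np.full_like(r, -2.326)       # approximately the 1% quantile of N(0, 1)
too_deep = np.full_like(r, -3.0)         # an overly conservative forecast
losses = tick_loss(r, true_var, q), tick_loss(r, too_deep, q)
print(losses)
```

The loss is non-negative by construction and is minimized in expectation when the forecast equals the true conditional quantile, which is what makes it a natural ranking criterion for VaR models.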

Results of Empirical Study
To test the forecasting performance of the examined GARCH-EVT model with different thresholds we chose 10 stock indices as the basis for the analysis: the S&P 500 (SPX), FTSE 100 (UKX), CAC 40 (CAC), DAX, OMX Stockholm 30 (OMXS), KOSPI, NIKKEI 225 (NKX), Hang Seng (HSI), Bovespa (BVP) and All Ordinaries (AOR). The data consist of daily prices of the selected assets from the beginning of 2000 up to the end of June 2019. This gives from 4781 (NKX) to 4982 (CAC) percentage log-returns, which are used in our calculations. The data set was obtained from the financial stock news website stooq.pl [51]. Using rolling windows of 2000 returns, we updated the parameter estimates for each moving window. Most of the optimization methods used require a relatively long window of observations for the algorithms to converge, which is why we selected a moving window of 2000 observations; a window of 1000 observations generates considerable computational problems. The least demanding optimization algorithm is PS, which may be used for relatively small samples; Echaust (2018) [40] used only 750 returns in the moving window.
According to the GARCH-EVT approach, the tail fraction is estimated for the standardized residuals of the GARCH model, not for the returns. We used a GARCH(1,1) model with normal innovations. Then we calculated the tail fraction in five ways, i.e., using the four optimal tail selection methods and, additionally, the 95th quantile of each moving window. Table 1 shows the mean, minimum, maximum and standard deviation of the threshold level for the indices, while Figure 1 presents the estimated threshold level for the S&P 500 index. The PS method estimates the threshold in the most conservative way, much lower than the 95th quantile. Such a choice of threshold guarantees sufficient data in the tail to calculate VaR estimates at standard confidence levels. On the other hand, the threshold calculated with the PS algorithm has a very high range and standard deviation, meaning that the threshold is highly volatile over time. It is of interest that PS is the only method that indicates a higher right threshold than left one. The Eye-Ball algorithm is the most restrictive and sets the threshold at a high level. In turn, the MINDIST method produces the most volatile threshold estimates compared to the other methods. These two remarks are consistent with Danielsson et al. (2016) [37], who argued that the automated Eye-Ball method and the MINDIST method with the Kolmogorov-Smirnov metric tend to pick a small number of observations in the tail, and therefore a threshold close to the maximum. We use a different penalty function in the MINDIST method, but the conclusion is similar. The high threshold level may result in an insufficient number of returns lying in the tails to estimate the GPD parameters in the next step (see Table 2). The dAMSE methodology uses a number of order statistics closest to the 95th quantile and, among the optimization methods used in the study, has a standard deviation closest to that of the 95th-quantile method.
Note: PS-path stability method, dAMSE-minimization of asymptotic mean squared error method, 5%-fixed threshold method, MINDIST-mean absolute deviation distance metric method. The calculations were performed with the fGarch [52] and tea [43] packages in R. The following assumptions were made: GARCH-maximum log-likelihood estimation; PS-j = 1; Eye-Ball-w = 0.02 (with the exception of AOR, for which w = 0.021 in the right tail), h = 0.9, ε = 0.3; MINDIST-size of the upper tail ts = 0.15.
Having specified the tails of the distribution, the next step in the calculations is to estimate the GPD parameters and the Value at Risk forecasts. Guégan and Hassani (2011) [54] underline the fact that the method chosen to estimate the GPD parameters affects the VaR tremendously. We use the maximum log-likelihood method described in Coles (2001) [18], which is the most popular estimation method for this model, and Formula (8) for VaR calculation. We examined the out-of-sample 99% and 99.5% VaR estimates for the left and right tails of the returns distribution. Since thresholds based on the optimal choice differ substantially, it might be expected that the GPD parameters, and thus the VaR estimates, would behave differently for these tail choices. Tables 3 and 4 present the results of VaR estimation and Figure 2 presents the 99% VaR estimates for the S&P 500 index. We can see that the VaR estimates across methods differ very slightly despite the differences in the thresholds.
We can expect that the risk is estimated correctly regardless of the method used to describe the tails. In all cases the VaR estimates in the left tail are greater than those in the right tail. Most investors believe that the left tail risk is higher than the right one. It is a natural consequence of crashes being perceived as much more turbulent than booms: crashes develop over shorter time intervals than booms and the price changes are significantly bigger. The results in Tables 3 and 4 support this perception. Additionally, the PS algorithm highlights the risk asymmetry more strongly than the other methods, and the asymmetry tends to be greater for a more extreme risk, i.e., a higher VaR confidence level. Note: the calculations were performed with the evir package [53] in R. GPD parameters were estimated by the maximum log-likelihood method.
Note: PS-path stability method, dAMSE-minimization of asymptotic mean squared error method, 5%-fixed threshold method, MINDIST-mean absolute deviation distance metric method, x-VaR was not calculated. The calculations were performed with the evir package [53] in R. Generalized Pareto Distribution parameters were estimated by the maximum log-likelihood method.
Backtesting results for VaR for all the indices are summarized in Tables 5-14. Assessing the quality of the estimated VaR based on Kupiec's test, it may be concluded that all the procedures work impeccably for the considered assets and at both confidence levels. The only exception is the automated Eye-Ball method, which fails for the S&P 500 for 99% VaR in the left tail (at the 5% significance level of the test); the risk is underestimated in this case and too many exceedances of VaR appear. The results of Christoffersen's test coincide with those of Kupiec's test; the rejection occurs for the same index and the same confidence level, indicating first-order dependence of exceedances in that case. A more restrictive backtest is the duration independence test. In the left tail all models (except the MINDIST method) pass the VaR duration-based procedure. In the right tail, for 99.5% VaR, the test rejects the null hypothesis only once (in the case of the Hang Seng) for the PS, automated Eye-Ball and 5%-quantile methods; the dAMSE method produces accurate VaR forecasts in all cases. The most rigorous validation procedure is the Dynamic Quantile test, which checks whether the sequence of violations is time-independent up to five-day lags. According to the test, violations are dependent in about 20% of the cases (in nine cases for the PS, automated Eye-Ball and 5%-quantile methods and in eight cases for the dAMSE method). The test rejects the null hypothesis in the same cases regardless of which method is used. Thus, if the VaR estimate is incorrect for a particular asset, the same result can be expected for all methods. VaR estimates obtained using all tail selection methods have a similar goodness of fit in terms of the values of the loss function and the values of the mean absolute deviation between the returns and the quantiles when exceedances occur for individual indices. In addition, for all indices, VaR estimates for the right tail fit the empirical quantiles better than VaR estimates for the left tail.
[55], Loss-loss function Q described by Formula (33), bold-rejection of the null hypothesis at the significance level of 0.05. The calculations were performed with the rugarch [56] and GAS [57] packages in R.

Conclusions
The GARCH-EVT approach allows us to model the tails of the time-varying conditional return distribution. The main problem in using the method in practice relates to the selection of the appropriate tail fraction of the distribution above which the asymptotic properties hold. In this paper we extend the state-of-the-art by conducting a comparative study of the accuracy of VaR forecasts, where each VaR estimate is calculated with an optimal choice of the distribution tail fraction. We used four different optimization procedures and compared the results to an approach based on a fixed threshold, the 95th quantile of the distribution. Such a choice of threshold is seen as standard in the conditional EVT approach. The GARCH-EVT model performs relatively well in estimating the risk for all choices of threshold. Backtesting procedures indicate that, regardless of the choice of the tail, approximately the same accuracy of VaR prediction is provided. This conclusion calls into question the importance of an elaborate threshold selection in financial applications. We found that the optimal tail selection methods do not improve the accuracy of VaR prediction relative to the standard method, and hence our research hypothesis is not confirmed. Investors may then use the conditional EVT approach, taking the 95th percentile of the sample as a threshold, to obtain accurate estimates of tail risk.

Conflicts of Interest:
The authors declare no conflict of interest.