Does the Assumption on Innovation Process Play an Important Role for Filtered Historical Simulation Model?

Most financial institutions compute the Value-at-Risk (VaR) of their trading portfolios using historical simulation-based methods. In this paper, we examine the Filtered Historical Simulation (FHS) model introduced by Barone-Adesi et al. (1999) theoretically and empirically. The main goal of this study is to answer the following question: "Does the assumption on innovation process play an important role for the Filtered Historical Simulation model?". To this end, we investigate the performance of the FHS model under six innovation distributions, including skewed and fat-tailed ones: normal, skew-normal, Student's-t, skew-T, generalized error, and skewed generalized error distributions. The performance of the FHS models is evaluated by means of unconditional and conditional likelihood ratio tests and loss functions. Based on the empirical results, we conclude that the FHS models with generalized error and skew-T distributions produce more accurate VaR forecasts.


Introduction
The best-known risk measure, Value-at-Risk (VaR), quantifies the level of financial risk within a firm or investment portfolio over a specific holding period. VaR measures the potential loss of a risky asset or portfolio over a defined period for a given confidence level. The VaR is defined as
$$\mathrm{VaR}_p = F^{-1}(1 - p),$$
where $F$ is the cumulative distribution function (cdf) of financial losses, $F^{-1}$ denotes the inverse of $F$ and $p$ is the quantile at which VaR is calculated. The approaches to VaR fall into three categories: (i) fully parametric approaches based on volatility models; (ii) non-parametric approaches based on Historical Simulation (HS) methods; and (iii) Extreme Value Theory approaches based on modeling the tails of the return distribution.
In this paper, we focus on the non-parametric HS models. The HS model is based on the assumption that the historical distribution of returns will remain the same over the next periods, that is, that price change behaviour repeats itself over time. Thus, the future distribution of asset returns can be described by the empirical one. The one-day-ahead VaR forecast of the HS model is given by
$$\mathrm{VaR}_{t+1} = \mathrm{Quantile}\left\{ \{ r_s \}_{s=1}^{t},\, p \right\},$$
where $\{r_s\}_{s=1}^{t}$ are the historical returns and $p$ is the quantile at which VaR is calculated. Mögel and Auer (2017) compared the performance of the HS model with several competing VaR models and stated that the HS model produces VaR forecasts similar to those of the unconditional generalized Pareto distribution.
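As a minimal sketch of the HS forecast described above, the VaR can be read off as an order statistic of the past returns. The function name `hs_var`, the order-statistic convention (no interpolation) and the toy sample are assumptions made for illustration; they are not taken from the paper.

```python
import math

def hs_var(returns, p):
    """One-day-ahead HS VaR: the empirical p-quantile of the past returns."""
    if not returns or not 0.0 < p < 1.0:
        raise ValueError("need a non-empty sample and 0 < p < 1")
    ordered = sorted(returns)
    # Assumed convention: the ceil(n * p)-th smallest return (no interpolation).
    # The small tolerance guards against floating-point noise in n * p.
    k = max(1, math.ceil(len(ordered) * p - 1e-12))
    return ordered[k - 1]

# Toy sample of 20 daily log-returns (illustrative values only).
sample = [0.012, -0.034, 0.005, -0.011, 0.020, -0.027, 0.003, 0.008,
          -0.015, 0.001, -0.042, 0.017, -0.006, 0.010, -0.019, 0.004,
          0.022, -0.009, 0.006, -0.002]

var_05 = hs_var(sample, 0.05)  # smallest of the 20 returns
var_10 = hs_var(sample, 0.10)  # second smallest of the 20 returns
print(var_05, var_10)
```

Because every past return receives the same weight, the forecast reacts slowly to changes in volatility, which is the shortcoming the FHS model addresses.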
The HS model has several advantages. For instance, it is easy to understand and implement. It is a non-parametric model and does not require any distributional assumption. However, the HS model also has several shortcomings. Most importantly, it ignores time-varying volatility dynamics. To overcome this shortcoming, Hull and White (1998) and Barone-Adesi et al. (1999) introduced the FHS model. This approach can be viewed as a mixture of the HS and the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models. Specifically, it does not make any distributional assumption about the standardized returns, while it forecasts the variance through a volatility model. Hence, it is a mixture of parametric and non-parametric statistical procedures.
Barone-Adesi and Giannopoulos (2001) demonstrated the usefulness of the FHS model over the historical one. Kuester et al. (2006) compared the forecasting performance of several advanced VaR models and concluded that the GARCH-Skew-T model, the Extreme Value Theory (EVT) approach with normal and Skew-T innovations and the FHS model with normal and Skew-T innovations perform best. Angelidis et al. (2007) compared the FHS model with GARCH models specified under different innovation distributions such as normal, Student's-t and Skewed-T. Roy (2011) estimated the VaR of the daily returns of the Indian capital market using the FHS model. Omari (2017) compared the FHS, Exponentially Weighted Moving Average (EWMA), GARCH-normal, GARCH-Student's-t, GJR-GARCH-normal and GJR-GARCH-Student's-t models in terms of the accuracy of VaR forecasts. Omari (2017) demonstrated that the GJR-GARCH-Student's-t approach and the FHS method with a GARCH volatility specification are competitively accurate in estimating VaR for both standard and more extreme quantiles, thereby generally outperforming all the other models under consideration.
The goal of this paper is to investigate the VaR forecasting performance of the FHS model specified under skewed and fat-tailed innovation distributions. To this end, a comprehensive introduction to the FHS and GARCH models is given, and FHS models under six innovation distributions are introduced. A rolling window estimation procedure is used to obtain both the unknown parameters of the GARCH models and the VaR forecasts. The performance of the FHS models, in terms of the accuracy of VaR forecasts, is evaluated by means of backtesting methods and loss functions.
The rest of the paper is organized as follows: Section 2 is devoted to the theoretical properties of the FHS and GARCH models under normal, Student's-t, skew-normal, skew-T, generalized error and skewed generalized error innovation distributions. The backtesting methodology is given in Section 3. Empirical findings and model comparisons are presented in Section 4. Concluding remarks are given in Section 5.

Filtered Historical Simulation Models
In this section, the FHS model is defined. Then, the log-likelihood functions of the GARCH model specified under normal, skew-normal, Student's-t, skew-T, generalized error and skewed generalized error innovation distributions are presented.
The FHS model can be summarized as follows.
Let $r_t$ denote the daily log-returns. The benchmark GARCH(1,1) model, introduced by Bollerslev (1986), is defined by
$$r_t = \mu_t + e_t, \qquad e_t = h_t \varepsilon_t, \qquad h_t^2 = \omega + \gamma_1 e_{t-1}^2 + \gamma_2 h_{t-1}^2,$$
where $\omega > 0$, $\gamma_1 > 0$, $\gamma_2 > 0$, $\mu_t$ and $h_t^2$ are the conditional mean and variance, respectively, and $\varepsilon_t$ is the innovation process with zero mean and unit variance. The Maximum Likelihood Estimation (MLE) method is widely used to estimate the parameters of GARCH models. Under the assumption of independently and identically distributed (iid) innovations with density function $f(\varepsilon_t; \tau)$, the log-likelihood function of $r_t$ for a sample of $T$ observations is given by
$$\ell(\psi) = \sum_{t=1}^{T} \left[ \ln f(\varepsilon_t; \tau) - \frac{1}{2} \ln h_t^2 \right],$$
where $\psi = (\mu, \omega, \gamma_1, \gamma_2, \tau)$ is the parameter vector of the GARCH model, $\tau$ denotes the shape parameter(s) of $f(\varepsilon_t; \tau)$ and $\varepsilon_t = e_t / h_t$. The standardized residuals of the estimated GARCH(1,1) model are extracted as follows:
$$\hat{\varepsilon}_t = \frac{\hat{e}_t}{\hat{h}_t},$$
where $\hat{e}_t$ is the estimated residual and $\hat{h}_t$ is the corresponding daily estimated volatility. The first simulated residual is generated by drawing a standardized residual $\varepsilon^{*}$ at random (with replacement) from the dataset and multiplying it by the one-day-ahead volatility forecast:
$$z^{*}_{t+1} = \varepsilon^{*} \hat{h}_{t+1}.$$
The first simulated return for period $t + 1$ can be obtained as follows:
$$r^{*}_{t+1} = \hat{\mu}_{t+1} + z^{*}_{t+1},$$
where $z^{*}_{t+1}$ is the first simulated residual for period $t + 1$.
This procedure is repeated to produce $B$ bootstrapped samples, each of length $T$. Here, $B$ represents the number of bootstrapped samples and $T$ represents the size of each bootstrapped sample. Then, the VaR for period $t + 1$ can be forecasted as follows:
$$\mathrm{VaR}_{t+1} = \mathrm{Quantile}\left\{ \{ r^{*}_{t+1} \},\, p \right\},$$
where the quantile is taken over all simulated returns.
The rest of this section presents the log-likelihood functions of the GARCH model under normal, skew-normal, Student's-t, skew-T, generalized error and skewed generalized error distributions.
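The FHS steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the GARCH(1,1) parameters are taken as already estimated (the tuple `params` holds hypothetical values), the conditional mean is constant, the recursion starts from the sample variance, and a single pool of `n_boot` simulated returns stands in for the $B$ bootstrapped samples of length $T$.

```python
import math
import random

def garch_filter(returns, mu, omega, g1, g2):
    """Run the GARCH(1,1) recursion.

    Returns the standardized residuals e_t / h_t and the variance forecast
    for the day after the last observation.
    """
    e = [r - mu for r in returns]
    h2 = sum(x * x for x in e) / len(e)  # assumed start: sample variance
    std_resid = []
    for x in e:
        std_resid.append(x / math.sqrt(h2))
        h2 = omega + g1 * x * x + g2 * h2  # variance for the following day
    return std_resid, h2

def fhs_var(returns, params, p, n_boot=5000, seed=0):
    """FHS VaR: bootstrap standardized residuals, rescale, take the p-quantile."""
    mu, omega, g1, g2 = params
    std_resid, h2_next = garch_filter(returns, mu, omega, g1, g2)
    h_next = math.sqrt(h2_next)
    rng = random.Random(seed)
    # Simulated return = mean forecast + (resampled residual * volatility forecast).
    sims = sorted(mu + h_next * rng.choice(std_resid) for _ in range(n_boot))
    k = max(1, math.ceil(n_boot * p - 1e-12))
    return sims[k - 1]

# Toy data: 500 Gaussian returns with 1% daily volatility (illustrative only).
_rng = random.Random(42)
toy_returns = [0.01 * _rng.gauss(0.0, 1.0) for _ in range(500)]
params = (0.0, 2e-6, 0.08, 0.90)  # hypothetical (mu, omega, gamma_1, gamma_2)

var_05 = fhs_var(toy_returns, params, p=0.05)
var_01 = fhs_var(toy_returns, params, p=0.01)
print(var_05, var_01)
```

In this sketch the innovation distribution never enters the simulation step directly; it matters only through the parameter estimates that determine the filtered residuals and the volatility forecast.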

Normal Distribution
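The log-likelihood function of the GARCH model specified under normal innovations is given by
$$\ell(\psi) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \ln(2\pi) + \ln h_t^2 + \varepsilon_t^2 \right],$$
where $\psi = (\mu, \omega, \gamma_1, \gamma_2)$ denotes the parameter vector of the GARCH-normal (GARCH-N) model and $\varepsilon_t = e_t / h_t$.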
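A short sketch of how the Gaussian log-likelihood can be evaluated for a given parameter vector follows. It assumes a constant conditional mean and starts the variance recursion from the sample variance; both choices, as well as the function name `garch_normal_loglik`, are assumptions for illustration. In practice this function would be passed to a numerical optimizer.

```python
import math

def garch_normal_loglik(returns, mu, omega, g1, g2):
    """Gaussian log-likelihood of a GARCH(1,1) model with constant mean mu."""
    e = [r - mu for r in returns]
    h2 = sum(x * x for x in e) / len(e)  # assumed start: sample variance
    loglik = 0.0
    for x in e:
        # Contribution of one observation: -0.5 * [ln(2*pi) + ln(h_t^2) + eps_t^2]
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(h2) + x * x / h2)
        h2 = omega + g1 * x * x + g2 * h2
    return loglik

# Sanity check on toy data: with gamma_1 = gamma_2 = 0 and omega equal to the
# sample variance, the model is homoskedastic and the sum has a closed form.
toy = [0.01, -0.02, 0.015, -0.005]
s2 = sum(x * x for x in toy) / len(toy)
ll_const = garch_normal_loglik(toy, 0.0, s2, 0.0, 0.0)
print(ll_const)
```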

Skew-Normal Distribution
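The first skew extension of the normal distribution was proposed by Azzalini (1985). The probability density function (pdf) of the skew-normal (SN) distribution is given by
$$f(z; \lambda) = 2 \phi(z) \Phi(\lambda z),$$
where $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of the standard normal distribution and $\lambda$ is an additional parameter that controls the skewness. When $\lambda < 0$, the SN distribution is left skewed; when $\lambda > 0$, it is right skewed. If $\lambda = 0$, the SN distribution reduces to the standard normal distribution. The $k$th moment of the SN distribution is expressed in terms of $k = 0, 1, 2, \ldots, n$ and $\delta = \lambda / \sqrt{1 + \lambda^2}$. Note that the even moments of the SN distribution are equal to those of the standard normal distribution. The mean and variance of the SN distribution are, respectively, given by
$$\mathrm{E}(z) = b\delta, \qquad \mathrm{var}(z) = 1 - (b\delta)^2,$$
where $b = \sqrt{2/\pi}$. The standardized SN distribution is obtained using the transformed random variable $\varepsilon = (z - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of the SN distribution, so that $\mathrm{E}(\varepsilon) = 0$ and $\mathrm{var}(\varepsilon) = 1$. The random variable $z$ can be expressed as $z = \varepsilon\sigma + \mu$ with $\partial z / \partial \varepsilon = \sigma$. Thus, the pdf of the standardized SN distribution is given by
$$f(\varepsilon; \lambda) = 2 \sigma \phi(\sigma\varepsilon + \mu) \Phi\left(\lambda(\sigma\varepsilon + \mu)\right).$$
Using the standardized SN distribution, the log-likelihood function of the GARCH model with SN innovations is given by
$$\ell(\psi) = \sum_{t=1}^{T} \left[ \ln 2 + \ln \sigma + \ln \phi(\sigma\varepsilon_t + \mu) + \ln \Phi\left(\lambda(\sigma\varepsilon_t + \mu)\right) - \frac{1}{2} \ln h_t^2 \right],$$
where $\psi = (\mu, \omega, \gamma_1, \gamma_2, \lambda)$ is the parameter vector.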

Student's-t Distribution
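Since financial return series have fatter tails than the normal distribution, Bollerslev (1986, 1987) proposed the GARCH model with Student's-t innovations. The GARCH model with Student's-t innovations makes it possible to model both the fat tails and the excess kurtosis observed in financial return series. The log-likelihood function of the GARCH-Student's-t (GARCH-T) model is given by
$$\ell(\psi) = T \left[ \ln \Gamma\left(\frac{\upsilon + 1}{2}\right) - \ln \Gamma\left(\frac{\upsilon}{2}\right) - \frac{1}{2} \ln\left(\pi(\upsilon - 2)\right) \right] - \frac{1}{2} \sum_{t=1}^{T} \left[ \ln h_t^2 + (1 + \upsilon) \ln\left(1 + \frac{\varepsilon_t^2}{\upsilon - 2}\right) \right],$$
where $\psi = (\mu, \omega, \gamma_1, \gamma_2, \upsilon)$ is the parameter vector, $\Gamma(\cdot)$ is the gamma function and the parameter $\upsilon$ controls the tails of the distribution.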

Skew-T Distribution
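The pdf of the skew-T (ST) distribution obtained by Azzalini and Capitanio (2003) is given by
$$f(z; \lambda, \upsilon) = 2\, t(z; \upsilon)\, T\left( \lambda z \sqrt{\frac{\upsilon + 1}{\upsilon + z^2}};\, \upsilon + 1 \right),$$
where $t(\cdot)$ and $T(\cdot)$ are the pdf and cdf of the Student's-t distribution, respectively, and $\lambda$ controls the skewness. When $\lambda = 0$, the ST distribution reduces to the Student's-t distribution in Equation (16). The moments of the ST distribution are derived in Azzalini and Capitanio (2003). The mean and variance of the ST distribution are, respectively, given by
$$\mathrm{E}(z) = \delta \sqrt{\frac{\upsilon}{\pi}}\, \frac{\Gamma\left((\upsilon - 1)/2\right)}{\Gamma(\upsilon/2)}, \quad \upsilon > 1, \qquad \mathrm{var}(z) = \frac{\upsilon}{\upsilon - 2} - \left[\mathrm{E}(z)\right]^2, \quad \upsilon > 2,$$
where $\delta = \lambda / \sqrt{1 + \lambda^2}$. The standardized ST distribution is obtained using the transformed random variable $\varepsilon = (z - \mu)/\sigma$, where $\mathrm{E}(\varepsilon) = 0$ and $\mathrm{var}(\varepsilon) = 1$. The random variable $z$ can be expressed as $z = \varepsilon\sigma + \mu$ with $\partial z / \partial \varepsilon = \sigma$. Thus, the pdf of the standardized ST distribution is given by
$$f(\varepsilon; \lambda, \upsilon) = 2 \sigma\, t(\sigma\varepsilon + \mu; \upsilon)\, T\left( \lambda(\sigma\varepsilon + \mu) \sqrt{\frac{\upsilon + 1}{\upsilon + (\sigma\varepsilon + \mu)^2}};\, \upsilon + 1 \right),$$
where $\mu$ and $\sigma$ are the mean and standard deviation of the ST distribution, respectively. Writing $z_t = \sigma\varepsilon_t + \mu$, the log-likelihood function of the GARCH model with ST innovations is given by
$$\ell(\psi) = \sum_{t=1}^{T} \left[ \ln 2 + \ln \sigma + \ln t(z_t; \upsilon) + \ln T\left( \lambda z_t \sqrt{\frac{\upsilon + 1}{\upsilon + z_t^2}};\, \upsilon + 1 \right) - \frac{1}{2} \ln h_t^2 \right],$$
where $\psi = (\mu, \omega, \gamma_1, \gamma_2, \lambda, \upsilon)$ is the parameter vector.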

Generalized Error Distribution
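Nelson (1991) introduced the GARCH volatility model with generalized error distribution (GED) innovations. The log-likelihood function of the GARCH-GED model is given by
$$\ell(\psi) = \sum_{t=1}^{T} \left[ \ln\left(\frac{\upsilon}{\lambda}\right) - \frac{1}{2} \left| \frac{\varepsilon_t}{\lambda} \right|^{\upsilon} - \left(1 + \upsilon^{-1}\right) \ln 2 - \ln \Gamma\left(\frac{1}{\upsilon}\right) - \frac{1}{2} \ln h_t^2 \right],$$
where $\psi = (\mu, \omega, \gamma_1, \gamma_2, \upsilon)$ is the parameter vector, $\upsilon$ is the tail-thickness parameter and
$$\lambda = \left[ \frac{2^{-2/\upsilon}\, \Gamma(1/\upsilon)}{\Gamma(3/\upsilon)} \right]^{1/2}.$$
Note that the normal distribution is a special case of the GED when $\upsilon = 2$. If $\upsilon < 2$, the GED has heavier tails than the Gaussian distribution.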
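The two closing remarks about the GED can be checked numerically. The sketch below evaluates the standardized GED density in Nelson's (1991) parameterization; the function name `ged_pdf` and the chosen evaluation points are assumptions for illustration.

```python
import math

def ged_pdf(eps, nu):
    """Density of the standardized GED (zero mean, unit variance)."""
    # Scale constant that makes the variance equal to one.
    lam = math.sqrt(2.0 ** (-2.0 / nu) * math.gamma(1.0 / nu) / math.gamma(3.0 / nu))
    norm_const = lam * 2.0 ** (1.0 + 1.0 / nu) * math.gamma(1.0 / nu)
    return nu * math.exp(-0.5 * abs(eps / lam) ** nu) / norm_const

def normal_pdf(eps):
    """Standard normal density, used as the reference case."""
    return math.exp(-0.5 * eps * eps) / math.sqrt(2.0 * math.pi)

# With nu = 2 the GED coincides with the standard normal density.
gap_at_nu2 = max(abs(ged_pdf(x, 2.0) - normal_pdf(x)) for x in (-3.0, -1.0, 0.0, 0.5, 2.0))

# With nu < 2 the density far in the tail exceeds the Gaussian one.
tail_ratio = ged_pdf(4.0, 1.2) / normal_pdf(4.0)
print(gap_at_nu2, tail_ratio)
```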

Skewed Generalized Error Distribution
The Skewed Generalized Error Distribution (SGED) makes it possible to model both the skewness and the excess kurtosis observed in financial return series. Lee et al. (2008) introduced the GARCH-SGED model and concluded that the GARCH model with an SGED innovation process outperformed the GARCH-N model for all confidence levels.

Evaluation of VaR Forecasts
We now introduce the backtesting methodology used to compare the VaR forecast accuracy of the models. The statistical accuracy of the models is evaluated by the backtests of Kupiec (1995), Christoffersen (1998), Engle and Manganelli (2004) and Sarma et al. (2003). Recently, some alternative backtesting methods for VaR forecasts were proposed by Ziggel et al. (2014) and Dumitrescu et al. (2012).
Kupiec (1995) proposed a likelihood ratio (LR) test of unconditional coverage ($LR_{uc}$) to evaluate model accuracy. The test examines whether the failure rate is equal to the expected value. The LR test statistic is given by
$$LR_{uc} = -2 \ln \left[ \frac{p^{n_1} (1 - p)^{n_0}}{\hat{\pi}^{n_1} (1 - \hat{\pi})^{n_0}} \right],$$
where $\hat{\pi} = n_1 / (n_0 + n_1)$ is the MLE of $p$, $n_1$ represents the total number of violations and $n_0$ represents the total number of non-violations. A violation occurs if $r_t < \mathrm{VaR}_t$; the opposite case is a non-violation. Under the null hypothesis ($H_0: p = \pi$), the LR statistic follows a chi-square distribution with one degree of freedom.
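A minimal sketch of the unconditional coverage test follows. The function name `kupiec_lr_uc` and the violation counts in the example are hypothetical and are not results from this paper; the p-value uses the closed-form survival function of the chi-square distribution with one degree of freedom.

```python
import math

def kupiec_lr_uc(n_violations, n_obs, p):
    """Kupiec (1995) unconditional coverage statistic and its p-value."""
    n1 = n_violations
    n0 = n_obs - n_violations
    pi_hat = n1 / n_obs

    def binom_loglik(q):
        # Terms with a zero count are skipped, so 0 * ln(0) is treated as 0.
        out = 0.0
        if n1 > 0:
            out += n1 * math.log(q)
        if n0 > 0:
            out += n0 * math.log(1.0 - q)
        return out

    lr = max(0.0, -2.0 * (binom_loglik(p) - binom_loglik(pi_hat)))
    # Survival function of chi-square(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

# Hypothetical counts: 700 forecasts at p = 0.05, so 35 violations are expected.
lr_ok, pval_ok = kupiec_lr_uc(35, 700, 0.05)    # observed rate equals p
lr_bad, pval_bad = kupiec_lr_uc(50, 700, 0.05)  # too many violations
print(lr_ok, pval_ok, lr_bad, pval_bad)
```

The statistic depends only on the number of violations, not on when they occur, which is why a test of independence is needed as a complement.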
The $LR_{uc}$ test cannot detect whether the violations are randomly distributed over time. Christoffersen (1998) proposed an LR test of conditional coverage ($LR_{cc}$) to overcome this shortcoming of the Kupiec (1995) test. The $LR_{cc}$ test jointly examines whether the failure rate equals the expected one and whether the violations are independently distributed. Under the null hypothesis, the failures are independent and the failure rate is equal to the expected one. The test statistic is given by
$$LR_{cc} = -2 \ln \left[ (1 - p)^{n_0} p^{n_1} \right] + 2 \ln \left[ (1 - \pi_{01})^{n_{00}} \pi_{01}^{n_{01}} (1 - \pi_{11})^{n_{10}} \pi_{11}^{n_{11}} \right],$$
where $n_{ij}$ is the number of observations with value $i$ followed by $j$ for $i, j = 0, 1$ and $\pi_{ij} = n_{ij} / \sum_{j} n_{ij}$ are the corresponding probabilities. The value $i, j = 1$ denotes that a violation occurred, and $i, j = 0$ indicates the opposite case.
The $LR_{cc}$ statistic follows a chi-square distribution with two degrees of freedom.
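The conditional coverage test can be sketched from the transition counts of the violation sequence. The function name `christoffersen_lr_cc` and the two toy sequences are assumptions for illustration; as a further assumption, the counts $n_0$ and $n_1$ are taken from the transitions, so the first observation is dropped.

```python
import math

def _xlogy(x, y):
    """x * ln(y), with the convention that the term vanishes when x == 0."""
    return 0.0 if x == 0 else x * math.log(y)

def christoffersen_lr_cc(hits, p):
    """Christoffersen (1998) conditional coverage statistic and its p-value.

    `hits` is a sequence of 0/1 flags, where 1 marks a VaR violation.
    """
    n = [[0, 0], [0, 0]]
    for prev, cur in zip(hits, hits[1:]):
        n[prev][cur] += 1  # n[i][j]: value i followed by value j
    n0 = n[0][0] + n[1][0]
    n1 = n[0][1] + n[1][1]
    row0 = n[0][0] + n[0][1]
    row1 = n[1][0] + n[1][1]
    pi01 = n[0][1] / row0 if row0 else 0.0
    pi11 = n[1][1] / row1 if row1 else 0.0
    ll_null = _xlogy(n0, 1.0 - p) + _xlogy(n1, p)
    ll_alt = (_xlogy(n[0][0], 1.0 - pi01) + _xlogy(n[0][1], pi01)
              + _xlogy(n[1][0], 1.0 - pi11) + _xlogy(n[1][1], pi11))
    lr = max(0.0, -2.0 * (ll_null - ll_alt))
    # Survival function of chi-square(2): P(X > x) = exp(-x / 2).
    return lr, math.exp(-lr / 2.0)

# Two toy sequences with five violations in 100 days each.
clustered = [0] * 95 + [1] * 5        # all violations on consecutive days
spread = ([0] * 19 + [1]) * 5         # one violation every 20 days
lr_clustered, pval_clustered = christoffersen_lr_cc(clustered, 0.05)
lr_spread, pval_spread = christoffersen_lr_cc(spread, 0.05)
print(lr_clustered, pval_clustered, lr_spread, pval_spread)
```

Both sequences have the same failure rate, so an unconditional test treats them alike, while the conditional test rejects only the clustered one.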
The Dynamic Quantile (DQ) test, proposed by Engle and Manganelli (2004), examines whether the violations are uncorrelated with any variable that belongs to the information set $\Omega_{t+1}$ available when the VaR is calculated. The main idea of the DQ test is to regress the current violations on past violations in order to test different restrictions on the parameters of the model. The estimated linear regression model tests whether the probability of a violation depends on the level of the VaR. Here, $p$ and $q$ are set to 5 and 1, respectively, for illustrative purposes.
In most instances, evaluating the performance of VaR models by means of the $LR_{uc}$, $LR_{cc}$ and DQ tests may not be sufficient to identify the most adequate model. For instance, some models may have the same number of violations but different forecast errors. Sarma et al. (2003) defined a test on the basis of the regulator's loss function (RLF) to take into account the differences between realized returns and VaR forecasts. The RLF is given by
$$RLF_{t+1} = \begin{cases} (r_{t+1} - \mathrm{VaR}_{t+1})^2, & \text{if } r_{t+1} < \mathrm{VaR}_{t+1}, \\ 0, & \text{if } r_{t+1} \geq \mathrm{VaR}_{t+1}, \end{cases}$$
where $\mathrm{VaR}_{t+1}$ represents the one-day-ahead VaR forecast for a long position.
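The unexpected loss (UL) is equal to the average of the differences between the realized returns and the VaR forecasts on violation days. The one-day-ahead magnitude of the violation for a long position is given by
$$UL_{t+1} = \begin{cases} \left| r_{t+1} - \mathrm{VaR}_{t+1} \right|, & \text{if } r_{t+1} < \mathrm{VaR}_{t+1}, \\ 0, & \text{if } r_{t+1} \geq \mathrm{VaR}_{t+1}. \end{cases}$$
The RLF and UL loss functions do not consider the case in which the realized returns exceed the VaR forecast. An appropriate loss function should also take into consideration the cost of excess capital, because overestimated VaR forecasts lead firms to hold much more capital than required, while the main objective of any firm is to maximize its profits. For this reason, Sarma et al. (2003) proposed a new loss function, called the Firm's Loss Function (FLF). The FLF is given by
$$FLF_{t+1} = \begin{cases} (r_{t+1} - \mathrm{VaR}_{t+1})^2, & \text{if } r_{t+1} < \mathrm{VaR}_{t+1}, \\ -\beta\, \mathrm{VaR}_{t+1}, & \text{if } r_{t+1} \geq \mathrm{VaR}_{t+1}, \end{cases}$$
where $\beta$ is the cost of excess capital.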
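The three loss functions can be computed side by side, as in the sketch below. The functional forms are assumptions based on the verbal descriptions above and on the commonly cited definitions of Sarma et al. (2003): squared exceedance for the RLF, absolute exceedance for the UL, and a capital charge of `-beta * VaR` on non-violation days for the FLF. The function name, the toy data and the value of `beta` are hypothetical.

```python
def average_losses(returns, var_forecasts, beta):
    """Average RLF, UL and FLF over a backtesting sample (long position)."""
    rlf, ul, flf = [], [], []
    for r, v in zip(returns, var_forecasts):
        if r < v:  # violation: the return falls below the VaR forecast
            rlf.append((r - v) ** 2)
            ul.append(abs(r - v))
            flf.append((r - v) ** 2)
        else:      # no violation: only the FLF charges for the capital held
            rlf.append(0.0)
            ul.append(0.0)
            flf.append(-beta * v)
    n = len(rlf)
    return sum(rlf) / n, sum(ul) / n, sum(flf) / n

# Toy example: four days, constant VaR forecast of -2%, one violation.
toy_returns = [-0.03, 0.01, -0.01, 0.02]
toy_var = [-0.02, -0.02, -0.02, -0.02]
arlf, aul, aflf = average_losses(toy_returns, toy_var, beta=0.1)
print(arlf, aul, aflf)
```

Under these assumed forms, a more conservative (more negative) VaR forecast lowers the RLF and UL but raises the FLF, which is the trade-off the FLF is meant to capture.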

Data Description
To evaluate the performance of the FHS models in terms of the accuracy of VaR forecasts, the ISE-100 index of Turkey is used. The data contain 1092 daily log-returns from 3 January 2013 to 4 May 2017. The descriptive statistics of the log-returns of the ISE-100 index are given in Table 1. Table 1 shows that the mean return is close to 0. The Jarque-Bera test rejects the null hypothesis of normality at any conventional level of significance. The descriptive statistics show strong evidence of high excess kurtosis and negative skewness. Thus, the log-returns of the ISE-100 index have non-normal characteristics, excess kurtosis, and fat tails.
Figure 2 displays the time-varying skewness and kurtosis of the ISE-100 index. For Figure 2, a rolling window procedure with a window length of 392 is used. Based on Figure 2, it is clear that the skewness and kurtosis of the ISE-100 index exhibit great variability over time.
The benchmark model, GARCH(1,1), is estimated with six different innovation distributions: normal, SN, Student's-t, ST, GED and SGED. Table 2 shows the estimated parameters of the GARCH models. The rugarch package in R is used to obtain the parameter estimates of the normal, Student's-t, GED and SGED models. The constrOptim function in R is used to minimize the negative log-likelihood functions of the GARCH-ST and GARCH-SN models.
Based on Table 2, we conclude that the GARCH-T and GARCH-SGED models have lower negative log-likelihood values than the other models. Since the GARCH-T model has the lowest negative log-likelihood value, it could be chosen as the best model for the in-sample period. Table 2 also shows that the conditional variance parameters $\gamma_2$ are highly significant for all GARCH models.

Backtesting Results
In this subsection, a rolling window estimation procedure is used to estimate the parameters of the GARCH models. Then, the VaR forecasts of the FHS models are obtained by using the estimated parameters of the GARCH models, the one-day-ahead forecasts of the conditional mean and conditional variance, and the standardized residuals extracted from the estimated GARCH models. The rolling window estimation procedure allows us to capture the time-varying characteristics of the time series in different time periods. The window length is set to 392, and the next 700 daily returns are used to evaluate the out-of-sample performance of the VaR models.
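The bookkeeping of the rolling window scheme can be sketched as follows. The helper `rolling_windows` is a hypothetical name, and the sketch assumes that the window moves forward by one day per forecast; with 1092 returns and a window of 392, this yields exactly the 700 out-of-sample forecasts mentioned above.

```python
def rolling_windows(n_obs, window):
    """Yield (start, end, target) index triples.

    The model is fitted on returns[start:end] and used to forecast the
    return at position `target`, the first day after the window.
    """
    for start in range(n_obs - window):
        end = start + window
        yield start, end, end

# Sample sizes reported in the text: 1092 returns, window length 392.
windows = list(rolling_windows(1092, 392))
print(len(windows), windows[0], windows[-1])
```

Each triple would feed one re-estimation of the GARCH model followed by one FHS forecast, so the computational cost grows with the number of out-of-sample days.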
Table 3 shows the backtesting results for the FHS-N, FHS-T, FHS-ST, FHS-SN, FHS-GED and FHS-SGED models. A two-step decision procedure is applied to select the best VaR model. In the first step, the performance of the VaR models is evaluated according to the results of the $LR_{uc}$, $LR_{cc}$ and DQ tests. In the second step, the models that pass these three backtests are considered accurate, and the loss functions of these models are computed. Finally, the lowest values of the loss functions indicate the best VaR models.
Table 4 shows that all FHS models perform well according to the $LR_{uc}$, $LR_{cc}$ and DQ tests at the p = 0.05 and p = 0.025 levels. However, the FHS models with Student's-t and ST innovation distributions provide better VaR forecasts than the competing models at the p = 0.01 level based on the result of the DQ test. Therefore, it can be concluded that the FHS model specified under skewed and fat-tailed innovation distributions provides more accurate VaR forecasts, especially for high quantiles.
Even though the FHS models yield similar $LR_{uc}$, $LR_{cc}$ and DQ results, they have different failure rates and forecast errors. Loss functions are useful for comparing VaR models by their forecast errors. Based on the ARLF, UL and FLF results, we conclude the following: (i) FHS-SN is the best-performing model at the p = 0.05 and p = 0.025 levels according to the ARLF and UL criteria. Based on the FLF results, the FHS-GED model has the lowest excess capital value among the models at the p = 0.05 and p = 0.025 levels. Therefore, the FHS-GED model could be chosen as the best model for the p = 0.05 and p = 0.025 levels; (ii) Based on the three backtesting results, the FHS-T and FHS-ST models provide the most accurate VaR forecasts at the p = 0.01 level. According to the loss function results, the FHS-ST model has lower ARLF, UL and FLF values than the FHS-T model. Therefore, the FHS-ST model could be chosen as the best model for the p = 0.01 level.
Figure 3 displays the VaR forecasts of the FHS models specified under six innovation distributions. As seen in Figure 3, the assumption on the innovation process does not affect the VaR forecasts of the FHS model substantially. However, the GED and ST distributions could be preferable to reduce the forecast error of the FHS model.

Table 1. Summary statistics for the ISE-100 index.

Backtesting table, panel p = 0.05. Columns: Models; Mean VaR (%); Number of violations; Failure rate; LR-uc; LR-cc; DQ.
Daily VaR forecast of GARCH models with different innovation distributions for 97.5% and 99% confidence levels.