Article

A Sequential Importance Sampling for Estimating Multi-Period Tail Risk

Department of Statistics, University of Seoul, 163 Seoulsiripdaero, Dongdaemun-gu, Seoul 02504, Republic of Korea
*
Author to whom correspondence should be addressed.
Risks 2024, 12(12), 201; https://doi.org/10.3390/risks12120201
Submission received: 1 November 2024 / Revised: 7 December 2024 / Accepted: 10 December 2024 / Published: 13 December 2024
(This article belongs to the Special Issue Financial Derivatives: Market Risk, Pricing, and Hedging)

Abstract

Plain or crude Monte Carlo simulation (CMC) is commonly applied to estimate multi-period tail risk measures such as value-at-risk (VaR) and expected shortfall (ES). After fitting a volatility model to the past history of returns and estimating the conditional distribution of innovations, one can simulate the return process following the fitted volatility model with the estimated conditional distribution of innovations. Repeatedly generating return processes of the desired length gives a sufficient number of simulated multi-period returns, and the multi-period VaR and ES are then estimated directly from their empirical distribution. CMC is easily applicable, but it needs a huge number of simulated multi-period returns for the accurate estimation of a tail risk measure, especially when the confidence level of the measure is close to 1. To overcome this shortcoming, we propose a sequential importance sampling, which is a modification of CMC. In the proposed method, the sampling distribution of innovations is chosen differently from the estimated conditional distribution of innovations so that the simulated multi-period losses are more severe than in the case of CMC. In other words, simulated losses beyond the VaR to be estimated occur frequently in the proposed method, which greatly reduces the estimation error of ES and requires fewer simulated samples. We propose how to find a near-optimal sampling distribution. The multi-period VaR and ES are estimated from the weighted empirical distribution of the simulated multi-period returns, and we propose how to compute the weight of a simulated multi-period return. An empirical study is given to backtest the VaRs and ESs estimated by the proposed method, and to compare the performance of the proposed sequential importance sampling with that of CMC.

1. Introduction

Value at Risk (VaR) and Expected Shortfall (ES) are two risk measures widely used in financial institutions to assess the potential loss in value of an asset or a portfolio over a given time period. VaR represents the maximum potential loss over a specified time period at a given confidence level; it tells the worst-case loss that will not be exceeded at that level of confidence. Given the VaR of an asset or a portfolio over a specified time period, ES measures the expected loss in the case that the loss exceeds the VaR. ES thus provides additional information on the tail risk.
There is a vast literature on VaR and ES estimation (Hong et al. 2014; Nadarajah et al. 2014; Nieto and Ruiz 2016). In earlier studies, VaR and ES are estimated by forecasting the quantile of the distribution of returns directly. Such studies include historical simulation, the kernel smoothing method (Chen and Tang 2005), the conditional autoregressive value at risk (CAViaR) (Engle and Manganelli 2004), and the extreme value approach (Longin 2000). Monte Carlo simulation is also useful when a closed-form expression of the risk measures cannot be obtained analytically. However, a very large number of simulated samples are required to achieve the desired level of estimation error in this method.
In recent studies, a volatility model is assumed that adequately describes the dynamic behavior of the conditional variance of the return; the GARCH-type models are representative. The studies on VaR and ES estimation can be classified into three types, based on how the conditional distribution of the innovations in the volatility model is estimated from previous returns. Hull and White (1998) and Barone-Adesi et al. (1999) developed non-parametric methods for the estimation of the conditional distribution of innovations; the method is called filtered historical simulation (FHS). Gao and Song (2008) derived the limiting distribution of VaR and ES estimated by FHS. Although the FHS method offers the advantage of not relying on distributional assumptions, it needs a long time series of returns in order to reflect extreme losses or gains in the estimation of the distribution of innovations. In parametric approaches, the standard normal or the standardized Student's t-distribution is commonly assumed as the distribution of innovations. However, some more general distributions have been shown to give more accurate estimates of the risk measures (Bernardi et al. 2012; Broda and Paolella 2009; Simonato 2011). Applying a parametric method enables us to estimate the innovation distribution easily and efficiently; however, if the adopted parametric model differs significantly from the actual distribution of innovations, it cannot estimate VaR and ES accurately. McNeil and Frey (2000) proposed a semi-parametric method, in which the tail of the innovation distribution is fitted by the extreme value distribution. Jalal and Rockinger (2008) showed that the method of McNeil and Frey (2000) provides very good and robust VaR and ES forecasts when the threshold for determining which innovations are extreme is well chosen. However, choosing an appropriate threshold is not an easy task, and if the number of innovations determined to be extreme is too small, the parameter estimates of the extreme value distribution become inaccurate. Through an empirical study, Kuester et al. (2006) compared existing methods for VaR estimation, and showed that the hybrid method combining a GARCH filter with an extreme value theory-based approach performs best, closely followed by the one with FHS. Righi and Ceretta (2015) compared the performance of several ES estimation methods through an empirical study and a Monte Carlo experiment.
The estimation of VaR and ES over multiple days, such as two weeks (10 business days) or more, is essential according to the Basel Committee on Banking Supervision (2013). The simplest and most widely used method to estimate the multiday VaR is the square-root-of-time method, in which the k-day VaR is estimated by scaling up the one-day VaR estimate by the square root of k. However, it is well known that this method has serious flaws (Brummelhuis and Kaufmann 2007; Lönnbark 2016). Quantile regressions have been developed for the estimation of multiday risk measures by Ghysels et al. (2016), Le (2020), and Chen et al. (2021); in this approach, a linear function of the short-horizon returns is fitted to the quantile of the multiday returns. The direct and iterated estimations are the two main methods studied in the literature. In the direct approach, the k-day risk measures are estimated by the same method as the one-day risk measures, except that k-day returns instead of daily returns are used for fitting the volatility model and the distribution of innovations. Since the k-day returns are usually required to be non-overlapping in the estimation, the direct approach may suffer from a lack of data. In the iterated approach, the conditional variance of a k-day return is estimated by iterating the volatility model specified by the one-day returns. The moments of the standardized k-day returns are also estimated from the volatility model specified by the one-day returns. Applying the Cornish–Fisher expansion or the Gram–Charlier expansion to the estimated moments, the distribution of the standardized k-day returns is estimated, which gives the estimates of the k-day risk measures (Lönnbark 2016; Zhou et al. 2016). For a more detailed discussion of the direct and iterated approaches, we refer to Ghysels et al. (2019) and Ruiz and Nieto (2023).
In the estimation of the multiday risk measures, crude Monte Carlo simulation (CMC) is also useful (Christoffersen 2011). In this method, the parameters of the volatility model of daily returns and the distribution of innovations are estimated from past returns, and the daily returns over the next k days are simulated from the fitted volatility model and the estimated conditional distribution of innovations. By generating a large number of return processes, one can obtain a sufficient number of simulated k-day returns, and the k-day VaR and ES are then estimated directly from their empirical distribution. CMC is easily applicable. However, in order to estimate the risk measures accurately, the volatility model of daily returns should be correctly specified, as should the distribution of innovations. CMC also needs to generate a huge number of multi-period returns in order to obtain an accurate tail risk estimate, especially when the confidence level is very high. Since ES measures the expected loss beyond the VaR, even severe losses that occur with very low probability must be reproduced for the accurate estimation of ES. Instead of directly estimating the risk measures from the simulated k-day returns, the Cornish–Fisher expansion can be applied to approximate the distribution of the k-day returns (Zhang et al. 2023): the mean, variance, and some higher moments of the k-day returns are estimated from the simulated k-day returns, and then the expansion is applied. Glasserman et al. (2000) proposed an importance sampling to reduce the estimation error of VaR in a static model. Hoogerheide and van Dijk (2010) considered GARCH-type volatility models and proposed an adaptive importance sampling for multiday VaR and ES estimation in a Bayesian framework; they showed how to find an approximately optimal joint importance sampling density of the parameters of the volatility model and the returns over the multiple days.
To overcome this shortcoming of CMC, we propose a sequential importance sampling (SIS) for estimating the multi-period risk measures in GARCH-type volatility models. The proposed method is a modification of the crude Monte Carlo simulation. In the proposed method, we choose the sampling distribution of innovations differently from the distribution estimated from the past standardized returns; the latter is used as the sampling distribution of innovations in CMC. We call the sampling distribution the importance sampling distribution. In our proposal, the importance sampling distribution is chosen so that the simulated losses over k days are more severe than in the case of CMC, in which the daily log return processes are simulated from the fitted volatility model and the estimated distribution of innovations. Compared to CMC, SIS generates many more samples of k-day losses beyond the VaR, which reduces the estimation error of the risk measures (Rubinstein and Kroese 2016) and requires fewer simulated samples.
In the proposed method, we first fit a GARCH-type volatility model to the past daily returns and estimate the distribution of innovations. We choose the exponential twisting of the estimated distribution of innovations as the importance sampling distribution of innovations. The daily return process over k days is simulated from the fitted volatility model and the importance sampling distribution of innovations, and a large number of simulated k-day returns are obtained by generating return processes repeatedly. Since the sampling distribution of the innovations is chosen differently from the estimated distribution of innovations, the empirical distribution of the simulated k-day returns differs from the target distribution of the k-day return. After assigning a weight to each simulated k-day return, the k-day VaR and ES are estimated from the weighted empirical distribution of the simulated k-day returns, and we propose how to compute the weight of a simulated k-day return. We also prove that the optimal twisting parameter is unique, and show that an approximate value of it can be found by applying stochastic approximation.
We have performed an empirical study to compare the performance of the proposed sequential importance sampling with two CMCs, in which the distribution of innovations is assumed to follow the standard normal distribution and the standardized Student's t-distribution, respectively. Empirical results show that the proposed method does not reduce the estimation error in VaR estimation, but it does reduce the estimation error in ES estimation, even when the time required to obtain the estimate is taken into account. We have performed backtests to determine whether the proposed method and the CMCs give accurate VaR and ES estimates.
The outline of the paper is as follows: In Section 2, we briefly review the crude Monte Carlo simulation for multi-period VaR and ES estimation. We propose a sequential importance sampling for multi-period VaR and ES estimation in Section 3. In Section 4, empirical results are given. Finally, we conclude the paper in Section 5.

2. Crude Monte Carlo Simulation for Multi-Period Risk Estimation

Let $P_t$, $t = 1, 2, \ldots$, be the price of an asset or a portfolio at the end of the $t$-th time period, and let $R_t = \log P_t - \log P_{t-1}$. Then, $R_t$ is the log return during the $t$-th time period. We assume that $R_t$ is a continuous random variable. In this paper, we consider the case that the length of a time period is a day. We denote by $H_t = \{ R_t, R_{t-1}, \ldots, R_1 \}$ the history of log returns up to time $t$. Then, the conditional mean of $R_t$ given $H_{t-1}$ is defined as
$$ \mu_t = E[ R_t \mid H_{t-1} ], $$
and the conditional variance of $R_t$ given $H_{t-1}$ is defined as
$$ \sigma_t^2 = E[ ( R_t - \mu_t )^2 \mid H_{t-1} ]. $$
Since the mean of the daily log return is very small compared to $\sigma_t$, $\mu_t$ is usually assumed to be 0 (Christoffersen 2011). Then, the above equation reduces to
$$ \sigma_t^2 = E[ R_t^2 \mid H_{t-1} ]. $$
The generic model of the daily log return is as follows: given $H_{t-1}$,
$$ R_t = \sigma_t Z_t, \qquad Z_t \overset{\mathrm{i.i.d.}}{\sim} f(z), \tag{2} $$
where $f(z)$ is a probability density function (pdf) with mean 0 and variance 1. In this model, the innovations $Z_1, Z_2, \ldots$ are the conditionally standardized log returns. The usual choices of $f(z)$ are the standard normal and the standardized Student's t-distribution (Christoffersen 2011). However, we do not need to restrict $f(z)$ to these distributions.

2.1. Volatility Model

There is a large literature on models explaining the dynamics of $\{ \sigma_t^2, t \ge 1 \}$. Among them, GARCH-type models are representative (Francq and Zakoian 2019; Teräsvirta 2009). In this paper, we consider the GJR-GARCH model (Glosten et al. 1993) for $\{ \sigma_t^2, t \ge 1 \}$ for convenience. The GJR-GARCH model is capable of capturing most of the stylized facts about return time series: the non-normality of the marginal return distribution, very low autocorrelations, volatility clustering, and the leverage effect. However, other GARCH-type models such as EGARCH, threshold GARCH, and FIGARCH can also be used with our proposed method. In the GJR-GARCH$(1,1)$ model, $\sigma_t^2$ in Equation (2) evolves as
$$ \sigma_t^2 = \omega + \big( \alpha_1 + \gamma_1 I( R_{t-1} < 0 ) \big) R_{t-1}^2 + \beta_1 \sigma_{t-1}^2, \tag{3} $$
where $\omega > 0$, $\alpha_1 > 0$, $\beta_1 > 0$, $\gamma_1 > 0$, and $I(A)$ is the indicator of event $A$, i.e.,
$$ I(A) = \begin{cases} 1, & \text{if } A \text{ occurs}, \\ 0, & \text{otherwise}. \end{cases} $$
If $\alpha_1 + \gamma_1/2 + \beta_1 < 1$ (assuming a symmetric innovation density), then $\{ \sigma_t^2, t \ge 1 \}$ is a weakly stationary process.
The GJR-GARCH model successfully explains the volatility clustering phenomenon. By introducing the term $\gamma_1 I( R_{t-1} < 0 )$ into the model equation, it can also capture the asymmetric response of volatility to positive and negative returns. The value of $\gamma_1$ indicates the additional response of $\sigma_t^2$ per unit of the squared log return of the previous time period when that return is negative.
The parameters of a volatility model can be easily estimated by maximum likelihood when $\{ Z_t, t \ge 1 \}$ follows the normal distribution. However, as many empirical analyses have shown, the conditional distribution of returns observed in financial markets does not follow a normal distribution. Even in this case, the parameters can be estimated as if the conditional returns were normally distributed. This method is called quasi-maximum likelihood estimation (QMLE). According to Bollerslev and Wooldridge (1992), the estimates obtained in this way are consistent.
In applying Model (2) with a GARCH-type volatility model, we assume that the log return process is stationary in the sense that the parameters of the volatility model, as well as those of $f(z)$, are constant. However, for various reasons, the assumption of a stationary log return process is not valid over a long time period, but only over a short one, such as two or three years (Akgiray 1989). The rolling window method is therefore generally applied when fitting a volatility model to an observed log return process. In this method, we assume that the parameters of the volatility model do not change during a time period of fixed length, i.e., for $m > 0$, $\{ R_{t-1}, \ldots, R_{t-m} \}$ follows a volatility model with fixed parameters. The parameters are estimated using $\{ R_{t-1}, \ldots, R_{t-m} \}$ instead of the full history of log returns.
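To make the rolling-window fit concrete, the following base-R sketch estimates the GJR-GARCH(1,1) parameters in (3) by QMLE on a window of $m$ log returns. The function names, the starting values, and the log-parameterization enforcing positivity are our own illustrative choices, not the authors' implementation.

```r
## A minimal QMLE sketch for GJR-GARCH(1,1); names and starting values are
## illustrative assumptions, not the authors' code.
gjr_filter <- function(r, theta) {
  ## theta = c(omega, alpha1, gamma1, beta1); returns the sigma_t^2 path
  m <- length(r)
  sig2 <- numeric(m)
  sig2[1] <- var(r)                       # initialize at the sample variance
  for (s in 2:m) {
    sig2[s] <- theta[1] + (theta[2] + theta[3] * (r[s - 1] < 0)) * r[s - 1]^2 +
      theta[4] * sig2[s - 1]
  }
  sig2
}

gjr_qmle <- function(r) {
  nll <- function(par) {                  # Gaussian negative log-likelihood
    sig2 <- gjr_filter(r, exp(par))       # exp() keeps all parameters positive
    0.5 * sum(log(2 * pi) + log(sig2) + r^2 / sig2)
  }
  fit <- optim(log(c(1e-6, 0.05, 0.05, 0.90)), nll, method = "BFGS")
  exp(fit$par)                            # c(omega, alpha1, gamma1, beta1)
}
```

Given a window `r` of m past returns, `theta_hat <- gjr_qmle(r)` gives the parameter estimates, and `r / sqrt(gjr_filter(r, theta_hat))` gives the residuals used in Section 2.2.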

2.2. Estimation of the Innovation Distribution

The distribution of the innovations must have mean zero and variance one. The typical choices for the distribution are the standard normal and the standardized Student's t-distribution. Filtered historical simulation (FHS) is a non-parametric approach to estimating the distribution of the innovations (Barone-Adesi et al. 1999, 2002). Suppose that we have fitted a volatility model to the log returns $\{ R_{t-1}, \ldots, R_{t-m} \}$ and obtained the estimate $\hat\sigma_s$ of $\sigma_s$, $s = t-1, \ldots, t-m$. Then, the latent variable $Z_s$ is estimated as
$$ \hat Z_s = \frac{ R_s }{ \hat\sigma_s }, \quad s = t-1, \ldots, t-m. \tag{4} $$
We call $\{ \hat Z_{t-1}, \ldots, \hat Z_{t-m} \}$ the residuals of the fitted model. The residuals are approximately independent and identically distributed (i.i.d.) due to Equation (2). We denote by $f_{t-1}(z)$ the pdf of the innovations estimated from $\{ \hat Z_{t-1}, \ldots, \hat Z_{t-m} \}$.

In FHS, the empirical probability mass function (pmf) of $\{ \hat Z_{t-1}, \ldots, \hat Z_{t-m} \}$ is used as the approximate pmf of the innovations, i.e.,
$$ f_{t-1}(z) = \frac{1}{m} \sum_{j=1}^m \delta( z - \hat Z_{t-j} ), \tag{5} $$
where $\delta(z)$ is the Dirac measure having probability mass 1 at the origin. The $p$ quantile of $f(z)$ at time $t$ is estimated as the $p$ quantile of the empirical pmf (5).
Butler and Schachter (1997) proposed carrying out Gaussian kernel smoothing of the past returns in the historical simulation. By exploiting their idea, we obtain a continuous approximation of the empirical pmf (5) as follows:
$$ f_{t-1}(z) = \frac{1}{m} \sum_{j=1}^m \phi_\delta( z - \hat Z_{t-j} ), \tag{6} $$
where $\phi_\delta(z)$ is the pdf of the normal distribution with mean 0 and variance $\delta^2$. Then, $f_{t-1}(z)$ in the above equation is also an approximate pdf of the innovations.
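Sampling from the kernel-smoothed density (6) is straightforward because it is a uniform mixture of normals centered at the residuals: pick a residual at random and add $N(0, \delta^2)$ noise. A short sketch with illustrative names:

```r
## Draw n innovations from the kernel-smoothed density (6); `zhat` is the
## vector of residuals and `delta` the smoothing standard deviation.
rinnov_fhs <- function(n, zhat, delta) {
  zhat[sample(length(zhat), n, replace = TRUE)] + rnorm(n, 0, delta)
}
```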

2.3. Multi-Period Risk Measures

Let $R_t(k) = \log( P_{t+k-1} / P_{t-1} )$. Then, it is the log return of the portfolio over $k$ time periods, and it is also represented as $R_t(k) = \sum_{i=0}^{k-1} R_{t+i}$. The value-at-risk of the portfolio over the $k$ time periods with confidence level $q$ is defined as
$$ \mathrm{VaR}_t^q(k) = -\sup\{ r \in \mathbb{R} \mid \Pr\{ R_t(k) \le r \mid H_{t-1} \} \le 1 - q \}. \tag{7} $$
The expected shortfall of the portfolio over $k$ time periods with confidence level $q$ is defined as
$$ \mathrm{ES}_t^q(k) = -E[ R_t(k) \mid R_t(k) \le -\mathrm{VaR}_t^q(k) ]. \tag{8} $$
Since the log returns are assumed to be continuous, $R_t(k)$ is less than $-\mathrm{VaR}_t^q(k)$ with probability $1 - q$. Let $p = 1 - q$. Then, we have that
$$ \mathrm{ES}_t^q(k) = -\frac{ E\big[ R_t(k)\, I( R_t(k) \le -\mathrm{VaR}_t^q(k) ) \big] }{ p }. \tag{9} $$
For the case of $k = 1$, we obtain from Equation (2) that
$$ \mathrm{VaR}_t^q(1) = -\sigma_t z_p \tag{10} $$
and
$$ \mathrm{ES}_t^q(1) = -\sigma_t E[ Z_t \mid Z_t < z_p ], \tag{11} $$
where $z_p$ is the $p$ quantile of $f(z)$. If we have obtained $\hat z_{p,t-1}$, the $p$ quantile of $f_{t-1}(z)$, then we have from Equation (10) that
$$ \widehat{\mathrm{VaR}}_t^q(1) = -\hat\sigma_t\, \hat z_{p,t-1}. \tag{12} $$
Suppose that $\widehat{\mathrm{ES}}_{f,t-1}^q$ is the conditional mean of $Z$ given $Z < \hat z_{p,t-1}$ when $Z$ follows the pdf $f_{t-1}(z)$. Then, the plug-in estimator of $\mathrm{ES}_t^q(1)$ from Equation (11) is as follows:
$$ \widehat{\mathrm{ES}}_t^q(1) = -\hat\sigma_t\, \widehat{\mathrm{ES}}_{f,t-1}^q. \tag{13} $$
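Concretely, under the FHS approximation (5), the one-day plug-in estimates can be computed in a few lines; `zhat`, `sigma_t`, and `p` are illustrative names for the residuals, the one-day volatility forecast, and $1 - q$.

```r
## One-day plug-in estimates (12)-(13) under the FHS approximation (5);
## `zhat` holds the residuals and `sigma_t` the volatility forecast.
zp   <- quantile(zhat, p, names = FALSE)   # p quantile of f_{t-1}(z)
var1 <- -sigma_t * zp                      # Equation (12)
es1  <- -sigma_t * mean(zhat[zhat < zp])   # Equation (13)
```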
For $k > 1$, we could also estimate $\mathrm{VaR}_t^q(k)$ and $\mathrm{ES}_t^q(k)$ from Equations (12) and (13) by treating the time period consisting of $k$ consecutive days as a single period. As mentioned above, the log return process can be assumed to be stationary for two or three years. For a time period longer than this, it might be difficult to assume that the process is stationary. If a single period consists of $k$ consecutive days, then there are $500/k$ periods in two years and $750/k$ periods in three years. If $k = 10$, then the numbers of periods are 50 and 75, respectively, which are too few to obtain reliable estimates of the parameters of the volatility model as well as those of the innovation distribution. Thus, instead of applying Equations (12) and (13), we rely on Monte Carlo simulation (Christoffersen 2011). In the simulation, a day constitutes a single period, and the stochastic process of daily log returns over the next $k$ periods from period $t$ is simulated using the fitted volatility model of daily log returns and $f_{t-1}(z)$.

2.4. Crude Monte-Carlo Simulation

A GARCH-type volatility model reflects the impact of past returns on current volatility, and it is a recursive function of volatilities. We can see from Equation (3) that, given the volatility at $t_0 - 1$, the volatility at time $t \ge t_0$ is determined by the returns $H_{(t_0-1):(t-1)} = \{ R_{t_0-1}, \ldots, R_{t-1} \}$. Let $\theta = ( \omega, \alpha_1, \beta_1, \gamma_1 )$ be the vector of the GJR-GARCH(1,1) parameters. We denote by $\psi( H_{t-1}; \theta )$ the volatility at time $t$ given $H_{t-1}$, i.e.,
$$ \sigma_t^2 = \psi( H_{t-1}; \theta ). \tag{14} $$
Suppose that the log return process is stationary in a time interval of length larger than $m + k$. Then, the parameter $\theta$ estimated from the past $m$ log returns at a time $t$ can be assumed to govern the volatility process during the next $k$ periods. We denote by $\hat\theta_{t-1}$ the estimate of $\theta$ obtained by fitting the models (2) and (3) to $\{ R_{t-1}, \ldots, R_{t-m} \}$. The residuals during the past $m$ periods are obtained from Equation (4). Let $f_{t-1}(z)$ be the approximation of $f(z)$ obtained by one of the methods described in Section 2.2.
Suppose that we have generated $k$ random samples independently from $f_{t-1}(z)$. We denote them by $\{ \tilde Z_{t+i}, i = 0, 1, \ldots, k-1 \}$. Then, the log returns over the next $k$ time periods are simulated as follows: for $i = 0, 1, \ldots, k-1$,
$$ \tilde\sigma_{t+i}^2 = \psi( \tilde H_{t+i-1}; \hat\theta_{t-1} ), \qquad \tilde R_{t+i} = \tilde\sigma_{t+i} \tilde Z_{t+i}, \tag{15} $$
where $\tilde H_{t+i-1} = \{ \tilde R_{t+i-1}, \ldots, \tilde R_t \} \cup H_{t-1}$ for $i > 0$, and $\tilde H_{t-1} = H_{t-1}$ for $i = 0$. Note that $\tilde\sigma_t^2$ is equal to $\hat\sigma_t^2$, which is computed by substituting $\hat\theta_{t-1}$ for $\theta$ in Equation (14). Then, $\tilde\sigma_{t+1}^2, \ldots, \tilde\sigma_{t+k-1}^2$ are the simulated volatilities during the period from $t+1$ to $t+k-1$, and $\tilde R_t, \ldots, \tilde R_{t+k-1}$ are the simulated log returns during the period from $t$ to $t+k-1$. In what follows, we refer to the above procedure of simulating daily log returns with $f_{t-1}(z)$ given in Equation (5) as the filtered historical simulation (FHS). If $f_{t-1}(z)$ given in Equation (6) is applied in the procedure instead of Equation (5), we call the procedure the FHS with Gaussian kernel smoothing.
The iterative generation of daily log returns gives a simulated log return over $k$ periods from period $t$ as follows:
$$ \tilde R_t(k) = \sum_{i=0}^{k-1} \tilde R_{t+i}. \tag{16} $$
Repeating the above procedure $N$ times independently, we obtain $N$ independent samples of $\tilde R_t(k)$. Let them be $\tilde R_t^{(1)}(k), \ldots, \tilde R_t^{(N)}(k)$. Then, their empirical cdf is an approximate cdf of $R_t(k)$, and $\mathrm{VaR}_t^q(k)$ is estimated as the $100q$-th percentile of the negated log returns over $k$ days, i.e.,
$$ \widehat{\mathrm{VaR}}_t^q(k) = \mathrm{Percentile}\big( \{ -\tilde R_t^{(1)}(k), \ldots, -\tilde R_t^{(N)}(k) \}, 100q \big), \tag{17} $$
and
$$ \widehat{\mathrm{ES}}_t^q(k) = -\frac{ \sum_{j=1}^N \tilde R_t^{(j)}(k)\, I\big( \tilde R_t^{(j)}(k) \le -\widehat{\mathrm{VaR}}_t^q(k) \big) }{ \sum_{j=1}^N I\big( \tilde R_t^{(j)}(k) \le -\widehat{\mathrm{VaR}}_t^q(k) \big) }. \tag{18} $$
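The following sketch puts the pieces together for one estimation date: it simulates $N$ paths of $k$ daily returns from the fitted GJR-GARCH(1,1) model and applies (17) and (18). The inputs `theta`, `sigma2_t` (the one-day-ahead variance $\psi( H_{t-1}; \hat\theta_{t-1} )$), `r_prev` ($R_{t-1}$), and the innovation sampler `rinnov` are assumed to come from the fitting step; all names are illustrative.

```r
## Crude Monte Carlo estimation of the k-day VaR and ES, Equations (17)-(18).
cmc_risk <- function(N, k, q, theta, sigma2_t, r_prev, rinnov) {
  Rk <- numeric(N)
  for (n in 1:N) {
    s2 <- sigma2_t; r_last <- r_prev; total <- 0
    for (i in 1:k) {
      if (i > 1)   # update the volatility from the simulated path, Equation (3)
        s2 <- theta[1] + (theta[2] + theta[3] * (r_last < 0)) * r_last^2 +
          theta[4] * s2
      r_last <- sqrt(s2) * rinnov(1)      # R_{t+i} = sigma_{t+i} Z_{t+i}
      total <- total + r_last
    }
    Rk[n] <- total                        # simulated k-day log return (16)
  }
  VaR <- -quantile(Rk, 1 - q, names = FALSE)   # Equation (17)
  ES  <- -mean(Rk[Rk <= -VaR])                 # Equation (18)
  c(VaR = VaR, ES = ES)
}

## e.g., with normal innovations (CMC-N):
## cmc_risk(1e4, 10, 0.975, theta_hat, sigma2_t, r_prev, function(n) rnorm(n))
```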

2.5. Back-Testing

In this subsection, we summarize the backtests of VaR proposed by Kupiec (1995) and Christoffersen (1998), and the backtest of ES proposed by Acerbi and Szekely (2014). We define $I_t$ as follows: for $t \in T$,
$$ I_t = \begin{cases} 1, & \text{if } R_t(k) < -\widehat{\mathrm{VaR}}_t^q(k), \\ 0, & \text{otherwise}. \end{cases} \tag{19} $$
If $R_t(k) < -\widehat{\mathrm{VaR}}_t^q(k)$ at time $t$, then we say that a violation occurs at time $t$. Thus, $I_t$ is the indicator of the occurrence of a violation at time $t$. We call $\{ I_t, t \in T \}$ the violation process. In what follows, we consider the violation process as a sequence of 0s and 1s listed according to the order of the day on which the estimation was performed. The exact value of $t$ is ignored in the process.
In the estimation of the $k$-period VaR at time $t$, only $H_{t-1}$ is used. Suppose that another piece of information, $A_{t-1}$, is available in the estimation at time $t$. If $\Pr\{ I_t = 1 \mid H_{t-1}, A_{t-1} \}$ is equal to $\Pr\{ I_t = 1 \mid H_{t-1} \}$ for any available information $A_{t-1}$ up to time $t$, then $H_{t-1}$ is sufficient for the current estimation of the $k$-day VaR. If $\Pr\{ I_t = 1 \mid H_{t-1}, A_{t-1} \}$ is not equal to $\Pr\{ I_t = 1 \mid H_{t-1} \}$ for some information $A_{t-1}$, then $A_{t-1}$ should be incorporated to construct a better VaR estimation (Berkowitz et al. 2011).

Suppose that $H_{t-1}$, $t \in T$, is a series of sufficient information. Then, the past history of violations $\{ I_s, s < t, s \in T \}$ is also not helpful for the current estimation of the $k$-day VaR, which implies that it is independent of the current violation $I_t$. Under the hypothesis
$$ H_0: \{ I_t, t \in T \} \text{ is a Bernoulli process with success probability } p, \tag{20} $$
the number of violations follows the binomial distribution with size $n$ (the size of $T$) and success probability $p$, which enables us to obtain a confidence interval for the number of excess losses (Kupiec 1995). Let $n_1$ be the number of violations, and let $n_0 = n - n_1$. Under $H_0$ in (20), the likelihood function of $p$ is given by
$$ L( p; I_t, t \in T ) = q^{n_0} p^{n_1}. \tag{21} $$
Kupiec (1995) considered the following alternative hypothesis:
$$ H_1: \{ I_t, t \in T \} \text{ is a Bernoulli process with success probability } \pi\ (\ne p). $$
Let $\hat\pi$ be the maximum likelihood estimator (m.l.e.) of $\pi$; it is computed to be $n_1 / ( n_0 + n_1 )$. The likelihood ratio test of $H_0$ vs. $H_1$ is performed with the test statistic
$$ LR_{uc} = -2 \log \frac{ L( p; I_t, t \in T ) }{ L( \hat\pi; I_t, t \in T ) }. \tag{22} $$
$LR_{uc}$ asymptotically follows the $\chi^2(1)$ distribution. We call the above test the unconditional coverage test.
The unconditional coverage test does not capture temporal dependence of the violation process, which lowers the power of the test (Berkowitz et al. 2011). To address this problem, Christoffersen (1998) considered a test of temporal independence. He assumed that $\{ I_t, t \in T \}$ is a Markov chain with transition probability matrix
$$ \Pi_1 = \begin{pmatrix} \pi_{00} & \pi_{01} \\ \pi_{10} & \pi_{11} \end{pmatrix}. $$
Temporal independence of $\{ I_t, t \in T \}$ implies that $\pi_{00} = \pi_{10}$ and $\pi_{01} = \pi_{11}$. The former and the latter equations are the same since $\pi_{00} = 1 - \pi_{01}$ and $\pi_{10} = 1 - \pi_{11}$. Christoffersen (1998) considered the following hypothesis test:
$$ H_0: \pi_{01} = \pi_{11} \quad \text{vs.} \quad H_1: \pi_{01} \ne \pi_{11}, \tag{23} $$
and proposed a likelihood-ratio test. We call the above hypothesis test the independence test. Let $n_{ij}$ be the number of transitions from $i$ to $j$ in $\{ I_t, t \in T \}$. Then, the likelihood function under $H_1$ is as follows:
$$ L( \Pi_1; I_t, t \in T ) = ( 1 - \pi_{01} )^{n_{00}} \pi_{01}^{n_{01}} ( 1 - \pi_{11} )^{n_{10}} \pi_{11}^{n_{11}}. $$
Let $\pi = \pi_{01}$ under $H_0$, and let $\Pi_0$ be the $2 \times 2$ matrix obtained by letting $\pi_{01} = \pi_{11} = \pi$ and $\pi_{00} = \pi_{10} = 1 - \pi$ in $\Pi_1$. Then, the likelihood function under $H_0$ is $L( \Pi_0; I_t, t \in T )$, i.e.,
$$ L( \Pi_0; I_t, t \in T ) = ( 1 - \pi )^{n_{00} + n_{10}}\, \pi^{n_{01} + n_{11}}. $$
The m.l.e. of $\pi$ is computed to be $\hat\pi = ( n_{01} + n_{11} ) / n$, while the m.l.e.s of $\pi_{01}$ and $\pi_{11}$ under $H_1$ are computed to be $\hat\pi_{01} = n_{01} / ( n_{00} + n_{01} )$ and $\hat\pi_{11} = n_{11} / ( n_{10} + n_{11} )$, respectively. By substituting $\hat\pi$ into $L( \Pi_0; I_t, t \in T )$, and substituting $\hat\pi_{01}$ and $\hat\pi_{11}$ into $L( \Pi_1; I_t, t \in T )$, we obtain the following log likelihood-ratio statistic:
$$ LR_{ind} = -2 \log \frac{ L( \hat\Pi_0; I_t, t \in T ) }{ L( \hat\Pi_1; I_t, t \in T ) }. \tag{24} $$
$LR_{ind}$ asymptotically follows the $\chi^2(1)$ distribution.
In order to test the unconditional coverage of the estimated VaRs and the temporal independence of $\{ I_t, t \in T \}$ simultaneously, Christoffersen (1998) defined the following statistic on $\{ I_t, t \in T \}$:
$$ LR_{cc} = LR_{uc} + LR_{ind}. \tag{25} $$
Under the assumption that $\{ I_t, t \in T \}$ is a Markov chain, $H_0$ in (20) is equivalent to $\pi_{01} = \pi_{11} = p$. The definition of $LR_{cc}$ then implies that $LR_{cc}$ is the log likelihood-ratio statistic corresponding to the following hypothesis test:
$$ H_0: \pi_{01} = \pi_{11} = p \quad \text{vs.} \quad H_1: \pi_{01} \ne \pi_{11}. \tag{26} $$
$LR_{cc}$ asymptotically follows the $\chi^2(2)$ distribution. We call the above hypothesis test the conditional coverage test.
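A compact R sketch of the three coverage statistics (22), (24), and (25) on a 0/1 violation series may be helpful; the function name is illustrative, and for brevity it does not guard against zero transition counts, which would make some log terms undefined.

```r
## Unconditional coverage, independence, and conditional coverage tests on a
## violation series `I` (0/1), with target violation rate p = 1 - q.
coverage_tests <- function(I, p) {
  n1 <- sum(I); n0 <- length(I) - n1
  pihat <- n1 / (n0 + n1)
  LRuc <- -2 * (n0 * log(1 - p) + n1 * log(p) -
                n0 * log(1 - pihat) - n1 * log(pihat))      # Equation (22)
  from <- I[-length(I)]; to <- I[-1]                        # transition counts
  n00 <- sum(from == 0 & to == 0); n01 <- sum(from == 0 & to == 1)
  n10 <- sum(from == 1 & to == 0); n11 <- sum(from == 1 & to == 1)
  p01 <- n01 / (n00 + n01); p11 <- n11 / (n10 + n11)
  pi2 <- (n01 + n11) / (n00 + n01 + n10 + n11)
  LRind <- -2 * ((n00 + n10) * log(1 - pi2) + (n01 + n11) * log(pi2) -
                 n00 * log(1 - p01) - n01 * log(p01) -
                 n10 * log(1 - p11) - n11 * log(p11))       # Equation (24)
  LRcc <- LRuc + LRind                                      # Equation (25)
  c(LRuc = LRuc, p_uc = pchisq(LRuc, 1, lower.tail = FALSE),
    LRind = LRind, p_ind = pchisq(LRind, 1, lower.tail = FALSE),
    LRcc = LRcc, p_cc = pchisq(LRcc, 2, lower.tail = FALSE))
}
```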
Acerbi and Szekely (2014) proposed three backtests for ES: testing ES after VaR estimation, testing ES directly, and estimating ES from realized ranks. Among them, we apply the first, called the $Z_1$ test, in this paper. The test considers the following hypotheses: for a positive integer $k$,
$$ H_0: \mathrm{ES}_t^q(k) = \widehat{\mathrm{ES}}_t^q(k),\ \mathrm{VaR}_t^q(k) = \widehat{\mathrm{VaR}}_t^q(k),\ t \in T; \qquad H_1: \mathrm{ES}_t^q(k) > \widehat{\mathrm{ES}}_t^q(k),\ \mathrm{VaR}_t^q(k) = \widehat{\mathrm{VaR}}_t^q(k),\ t \in T. \tag{27} $$
For the above hypothesis test, Acerbi and Szekely (2014) proposed the following test statistic of the observed $k$-day log returns $\{ R_t(k), t \in T \}$:
$$ Z_1 = \frac{1}{n_1} \sum_{t \in T} \frac{ R_t(k)\, I_t }{ \widehat{\mathrm{ES}}_t^q(k) } + 1. \tag{28} $$
In computing the above statistic, it is assumed that one or more violations have occurred, i.e., $n_1 > 0$.

Contrary to the assumption on $\{ R_t(k), t \in T \}$ in Acerbi and Szekely (2014), the $\{ R_t(k), t \in T \}$ are not independent. However, the distribution of $Z_1$ under $H_0$ can be estimated by bootstrapping as in Acerbi and Szekely (2014). In this method, we generate random copies of $Z_1$ under $H_0$, and their empirical distribution is used as the approximate distribution of $Z_1$ under $H_0$. Let $Z_1^{(1)}, \ldots, Z_1^{(m)}$ be the random copies. The $p$-value of $Z_1$ given in Equation (28) is computed as follows:
$$ p = \frac{1}{m} \sum_{i=1}^m I\big( Z_1^{(i)} < Z_1 \big). \tag{29} $$
If the $k$-period ESs are underestimated many times, then the value of $Z_1$ is smaller than in the case where the $k$-period ESs are mostly estimated accurately. This results in low values of $p$. If $p$ is less than the prespecified significance level, then we reject $H_0$.
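A short sketch of the $Z_1$ statistic (28) and its bootstrap $p$-value (29): `Rk_obs`, `VaR`, and `ES` are vectors over $t \in T$, and `Rk_sim` is a matrix whose columns are simulated return processes under $H_0$ (all names illustrative).

```r
## Z1 statistic (28); assumes at least one violation (n1 > 0).
z1_stat <- function(Rk, VaR, ES) {
  I <- as.numeric(Rk < -VaR)              # violations, Equation (19)
  sum(Rk * I / ES) / sum(I) + 1
}

## Bootstrap p-value (29) from simulated processes under H0.
z1_pvalue <- function(Rk_obs, VaR, ES, Rk_sim) {
  z1 <- z1_stat(Rk_obs, VaR, ES)
  z1_sim <- apply(Rk_sim, 2, z1_stat, VaR = VaR, ES = ES)
  mean(z1_sim < z1)
}
```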

3. The Proposed Method

3.1. Sequential Importance Sampling of Log Return Process

When $q$ is close to 1, most of the samples $\tilde R_t^{(1)}(k), \ldots, \tilde R_t^{(N)}(k)$ generated by the crude Monte Carlo simulation are larger than $-\mathrm{VaR}_t^q(k)$. In this case, $\widehat{\mathrm{VaR}}_t^q(k)$ is determined by a few large losses, and the estimation error of $\mathrm{VaR}_t^q(k)$ increases. For the same reason, the estimation error of $\mathrm{ES}_t^q(k)$ also increases. To address this problem, we propose a sequential importance sampling for the estimation of $\mathrm{VaR}_t^q(k)$ and $\mathrm{ES}_t^q(k)$. In our proposal, we choose the exponentially twisted distribution of $f_{t-1}(z)$ given in (6) as the sampling distribution of the innovations when we apply the Monte Carlo simulation (15). Suppose that the residuals $\{ \hat Z_{t-1}, \ldots, \hat Z_{t-m} \}$ are obtained at the end of period $t-1$ for a sufficiently large $m$. We define the importance sampling pdf of $\tilde Z_{t+i}$, $i = 0, \ldots, k-1$, as follows: for $\lambda \in ( -\infty, \infty )$,
$$ g_{t-1}( z; \lambda ) = \frac{ \exp\{ \lambda z \} }{ m\, c(\lambda) } \sum_{j=1}^m \phi_\delta( z - \hat Z_{t-j} ), \quad -\infty < z < \infty, \tag{30} $$
where
$$ c(\lambda) = \frac{1}{m} \sum_{j=1}^m \exp\Big\{ \lambda \hat Z_{t-j} + \frac{ \delta^2 \lambda^2 }{2} \Big\}. \tag{31} $$
Note that $c(\lambda)$ is the moment generating function of $f_{t-1}(z)$, i.e., $c(\lambda) = E_{f_{t-1}}[ e^{\lambda Z} ]$. Equation (30) can be rewritten as
$$ g_{t-1}( z; \lambda ) = \sum_{j=1}^m c_j\, \phi_\delta\big( z - ( \hat Z_{t-j} + \lambda \delta^2 ) \big), \quad -\infty < z < \infty, \tag{32} $$
where
$$ c_j = \frac{ \exp\{ \lambda \hat Z_{t-j} \} }{ \sum_{l=1}^m \exp\{ \lambda \hat Z_{t-l} \} }, \quad j = 1, \ldots, m. $$
Equation (32) says that the importance sampling distribution of $\tilde Z_{t+i}$, $i = 0, \ldots, k-1$, is a mixture of the normal distributions $N( \hat Z_{t-1} + \lambda \delta^2, \delta^2 ), \ldots, N( \hat Z_{t-m} + \lambda \delta^2, \delta^2 )$ with weights $c_1, \ldots, c_m$. Suppose that $J$ is a discrete random variable with pmf $\Pr\{ J = j \} = c_j$, $j = 1, \ldots, m$. Then, the random variable $Z$ generated from $N( \hat Z_{t-J} + \lambda \delta^2, \delta^2 )$ follows the pdf $g_{t-1}( z; \lambda )$.
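The mixture representation (32) gives a direct sampling recipe: draw a component index $J$ with weights $c_j$, then draw from $N( \hat Z_{t-J} + \lambda \delta^2, \delta^2 )$. A sketch with illustrative names:

```r
## Draw n innovations from the exponentially twisted density (32).
rinnov_twisted <- function(n, zhat, lambda, delta) {
  cj <- exp(lambda * zhat); cj <- cj / sum(cj)    # mixture weights c_j
  J <- sample(length(zhat), n, replace = TRUE, prob = cj)
  rnorm(n, mean = zhat[J] + lambda * delta^2, sd = delta)
}
```

For $\lambda < 0$, the weights favor residuals in the left tail and each component mean is shifted down by $|\lambda|\delta^2$, so severe simulated losses become frequent.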
In the proposed scheme, we generate $\tilde Z_t, \ldots, \tilde Z_{t+k-1}$ independently from $g_{t-1}( z; \lambda )$. For $\lambda = 0$, $g_{t-1}( z; \lambda )$ is the same as $f_{t-1}(z)$ given in Equation (6). We have that
$$ \frac{ f_{t-1}(z) }{ g_{t-1}( z; \lambda ) } \propto \exp\{ -\lambda z \}. \tag{33} $$
For negative values of $\lambda$, the above equation implies that negative values of the innovations occur more frequently in the proposed method than in the FHS with Gaussian kernel smoothing. This tendency becomes more pronounced at smaller values of $\lambda$. In order to obtain a sample process of log returns during the period from $t$ to $t+k-1$, we apply the iteration (15). We can see that the probability of the occurrence of a large loss over $k$ periods increases with small values of $\tilde Z_t, \ldots, \tilde Z_{t+k-1}$ during the periods. The smaller the value of $\lambda$, the higher the probability of the occurrence of large losses.
We denote by $\tilde Z_{t:(t+k-1)} = \{ \tilde Z_t, \ldots, \tilde Z_{t+k-1} \}$ the generated sequence of innovations, and by $\tilde R_{t:(t+k-1)} = \{ \tilde R_t, \ldots, \tilde R_{t+k-1} \}$ the simulated process of log returns. By abuse of notation, we denote by $g_{t-1}( \mathbf r )$, $\mathbf r \in \mathbb{R}^k$, the pdf of $\tilde R_{t:(t+k-1)}$, and by $f_{t-1}( \mathbf r )$, $\mathbf r \in \mathbb{R}^k$, the pdf that $\tilde R_{t:(t+k-1)}$ would have if the innovations were generated from $f_{t-1}(z)$. We assume that $f_{t-1}(z)$ is very close to $f(z)$, so that $f_{t-1}( \mathbf r )$ is a good approximation of the true pdf of $R_{t:(t+k-1)}$.

The likelihood ratio of a sample process $\tilde R_{t:(t+k-1)}$ following $f_{t-1}( \mathbf r )$ with respect to the importance sampling pdf $g_{t-1}( \mathbf r )$ is as follows:
$$ W( \tilde R_{t:(t+k-1)} ) = \frac{ f_{t-1}( \tilde R_{t:(t+k-1)} ) }{ g_{t-1}( \tilde R_{t:(t+k-1)} ) }. $$
Given $H_{t-1}$, a sample process of innovations from $t$ to $t+k-1$ uniquely determines the log return process during the time interval. It can be easily shown that the above likelihood ratio of $\tilde R_{t:(t+k-1)}$ is equal to that of $\tilde Z_{t:(t+k-1)}$. This gives
$$ W( \tilde R_{t:(t+k-1)} ) = \prod_{i=0}^{k-1} \frac{ f_{t-1}( \tilde Z_{t+i} ) }{ g_{t-1}( \tilde Z_{t+i}; \lambda ) }. \tag{34} $$
Since $\Pr_{f_{t-1}}\{ \tilde R_t(k) \le x \}$ is represented as $E_{f_{t-1}}[ I( \tilde R_t(k) \le x ) ]$, it follows from Rubinstein and Kroese (2016) that
$$ \Pr_{f_{t-1}}\{ \tilde R_t(k) \le x \} = E_{g_{t-1}}\big[ I( \tilde R_t(k) \le x )\, W( \tilde R_{t:(t+k-1)} ) \big], \tag{35} $$
where $\Pr_g\{ \cdot \}$ and $E_g[ \cdot ]$ are the probability and the expectation with respect to a pdf $g( \cdot )$, respectively.
Suppose that we have simulated $N$ processes of innovations $\tilde Z_{t:(t+k-1)}^{(j)} = \{ \tilde Z_t^{(j)}, \ldots, \tilde Z_{t+k-1}^{(j)} \}$ from $g_{t-1}( z; \lambda )$, $j = 1, \ldots, N$. Let $\tilde R_{t:(t+k-1)}^{(j)}$ be the log return process corresponding to $\tilde Z_{t:(t+k-1)}^{(j)}$. We obtain from Equations (33) and (34) that the unnormalized likelihood ratio of $\tilde R_{t:(t+k-1)}^{(j)}$ is given by
$$ w^{(j)} = \exp\Big\{ -\lambda \sum_{i=0}^{k-1} \tilde Z_{t+i}^{(j)} \Big\}, \quad j = 1, \ldots, N. $$
Then, the sample mean of $\{ w^{(1)}, \ldots, w^{(N)} \}$ gives a strongly consistent estimator of the normalizing constant of the likelihood ratio. Let $\tilde R_t^{(j)}(k)$ be the $k$-period log return corresponding to $\tilde R_{t:(t+k-1)}^{(j)}$. It follows from Equation (35) that a strongly consistent estimator of $\Pr_{f_{t-1}}\{ \tilde R_t(k) \le x \}$ is given by
$$ \widehat{\Pr}_{f_{t-1}}\{ \tilde R_t(k) \le x \} = \frac{ \sum_{j=1}^N I( \tilde R_t^{(j)}(k) \le x )\, w^{(j)} }{ \sum_{j=1}^N w^{(j)} }. $$
If we let $W^{(j)} = w^{(j)} / \sum_{j'=1}^N w^{(j')}$, then we have that
$$ \widehat{\Pr}_{f_{t-1}}\{ \tilde R_t(k) \le x \} = \sum_{j=1}^N I( \tilde R_t^{(j)}(k) \le x )\, W^{(j)}. \tag{36} $$
Let $\{ r_1, \ldots, r_N \}$ be the order statistics of $\{ \tilde R_t^{(1)}(k), \ldots, \tilde R_t^{(N)}(k) \}$, and let $W_j$ be the normalized likelihood ratio corresponding to $r_j$, $j = 1, \ldots, N$. Then, Equation (36) is rewritten as
$$ \widehat{\Pr}_{f_{t-1}}\{ \tilde R_t(k) \le x \} = \sum_{j=1}^N I( r_j \le x )\, W_j. \tag{37} $$
We define $j^*$ as follows:
$$ j^* = \max\Big\{ l : \sum_{j=1}^{l} W_j \le 1 - q \Big\}. $$
Since $\{ r_1, \ldots, r_N \}$ are in ascending order, it follows from Equation (37) that
$$ \widehat{\mathrm{VaR}}_t^q(k) = -\frac{ r_{j^*} + r_{j^*+1} }{2}, \tag{38} $$
and that
$$ \widehat{\mathrm{ES}}_t^q(k) = -\frac{ \sum_{j=1}^{j^*} r_j W_j }{ \sum_{j=1}^{j^*} W_j }. \tag{39} $$
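The weighted estimators (38) and (39) reduce to a sort and a cumulative sum of normalized weights. A sketch, assuming the simulated $k$-day returns `Rk`, the corresponding sums of innovations `Zsum`, and the twisting parameter `lambda` come from the sampling step above:

```r
## Weighted VaR and ES estimates (38)-(39) from importance samples.
sis_risk <- function(Rk, Zsum, lambda, q) {
  w <- exp(-lambda * Zsum)                # unnormalized likelihood ratios
  ord <- order(Rk)                        # ascending order statistics r_1..r_N
  r <- Rk[ord]; W <- (w / sum(w))[ord]
  jstar <- max(which(cumsum(W) <= 1 - q))
  VaR <- -(r[jstar] + r[jstar + 1]) / 2                     # Equation (38)
  ES  <- -sum(r[1:jstar] * W[1:jstar]) / sum(W[1:jstar])    # Equation (39)
  c(VaR = VaR, ES = ES)
}
```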
Suppose that we have observed $\{ R_1, \ldots, R_n \}$. Algorithm 1 shows the procedure for finding $\widehat{\mathrm{VaR}}_t^q(k)$ and $\widehat{\mathrm{ES}}_t^q(k)$, $t = m+1, \ldots, n$, by the proposed method.
Algorithm 1 Sequential importance sampling for multi-period VaR and ES estimation

Require: a volatility model $\psi$; $\{ R_i, 1 \le i \le n \}$ (log returns); $m$ (the size of the rolling window); $k$ (the number of periods over which the risk measures are estimated); $q$ (the confidence level); $\lambda$ (the twisting parameter); $\delta^2$ (the variance of the smoothing distribution); $N$ (the number of simulated processes)
Ensure: $\widehat{\mathrm{VaR}}_t^q(k)$, $\widehat{\mathrm{ES}}_t^q(k)$, $t = m+1, \ldots, n$
1: for $t \in \{ m+1, m+2, \ldots, n \}$ do
2:   Fit the volatility model $\psi$ to $\{ R_{t-j}, j = 1, \ldots, m \}$ and obtain $\hat\theta$
3:   Estimate the volatilities $\{ \hat\sigma_{t-j}^2, j = 1, \ldots, m \}$
4:   $\hat Z_{t-j} \leftarrow R_{t-j} / \hat\sigma_{t-j}$ for $j = 1, \ldots, m$
5:   $c_j \leftarrow \exp\{ \lambda \hat Z_{t-j} \} / \sum_{l=1}^m \exp\{ \lambda \hat Z_{t-l} \}$, $j = 1, \ldots, m$
6:   $g_{t-1}(z) \leftarrow \sum_{j=1}^m c_j\, \phi_\delta\big( z - ( \hat Z_{t-j} + \lambda \delta^2 ) \big)$
7:   for $j \in \{ 1, 2, \ldots, N \}$ do
8:     Generate random samples $\{ \tilde Z_t, \tilde Z_{t+1}, \ldots, \tilde Z_{t+k-1} \}$ independently from $g_{t-1}(z)$
9:     Generate a process of log returns using the iteration: for $i = 0, 1, \ldots, k-1$, $\tilde\sigma_{t+i}^2 \leftarrow \psi( \tilde H_{t+i-1}; \hat\theta )$ and $\tilde R_{t+i} \leftarrow \tilde\sigma_{t+i} \tilde Z_{t+i}$
10:    Compute the unnormalized likelihood ratio of the log return process $\tilde R_{t:(t+k-1)}$: $w^{(j)} \leftarrow \exp\{ -\lambda \sum_{i=0}^{k-1} \tilde Z_{t+i} \}$
11:    $\tilde R_t^{(j)}(k) \leftarrow \sum_{i=0}^{k-1} \tilde R_{t+i}$
12:  end for
13:  Normalize the likelihood ratios: $W^{(j)} \leftarrow w^{(j)} / \sum_{j'=1}^N w^{(j')}$, $j = 1, 2, \ldots, N$
14:  Find $\{ r_1, \ldots, r_N \}$, the order statistics of $\{ \tilde R_t^{(1)}(k), \ldots, \tilde R_t^{(N)}(k) \}$
15:  $W_j \leftarrow$ the normalized likelihood ratio corresponding to $r_j$, $j = 1, \ldots, N$
16:  $j^* \leftarrow \max\{ l : \sum_{j=1}^{l} W_j \le 1 - q \}$
17:  $\widehat{\mathrm{VaR}}_t^q(k) \leftarrow -( r_{j^*} + r_{j^*+1} ) / 2$
18:  $\widehat{\mathrm{ES}}_t^q(k) \leftarrow -\sum_{j=1}^{j^*} r_j W_j \big/ \sum_{j=1}^{j^*} W_j$
19: end for

3.2. Finding the Optimal Twisting Parameter

Given $H_{t-1}$ and $\theta$, $R_{t:(t+k-1)}$ is uniquely determined by $Z_{t:(t+k-1)}$. Thus, we can see that $\tilde R_t(k)$ in Equation (16) is a function of $\tilde Z_{t:(t+k-1)}$ given $H_{t-1}$ and $\hat\theta_{t-1}$. We denote it by $r( \mathbf z; H_{t-1}, \hat\theta_{t-1} )$, $\mathbf z \in \mathbb{R}^k$, i.e.,
$$ \tilde R_t(k) = r( \tilde Z_{t:(t+k-1)}; H_{t-1}, \hat\theta_{t-1} ). $$
By abuse of notation, we denote by $f_{t-1}( \mathbf z ) = \prod_{i=1}^k f_{t-1}( z_i )$ the joint pdf of $\tilde Z_{t:(t+k-1)}$. In what follows, we call $f_{t-1}( \mathbf z )$ the nominal pdf of $\tilde Z_{t:(t+k-1)}$. We have from Equation (9) that an approximate form of $\mathrm{ES}_t^q(k)$ is obtained as
$$ \mathrm{ES}_t^q(k) \approx E_{f_{t-1}}\big[ \tau_t( \tilde Z_{t:(t+k-1)} ) \big], $$
where
$$ \tau_t( \mathbf z ) = -\frac{ r( \mathbf z; H_{t-1}, \hat\theta_{t-1} )\, I\big( r( \mathbf z; H_{t-1}, \hat\theta_{t-1} ) \le -\mathrm{VaR}_t^q(k) \big) }{ p }, \quad \mathbf z \in \mathbb{R}^k. $$
Since $\tau_t( \tilde Z_{t:(t+k-1)} )$ is nonnegative, the optimal importance sampling pdf of $\tilde Z_{t:(t+k-1)}$ for the estimation of $E_{f_{t-1}}[ \tau_t( \tilde Z_{t:(t+k-1)} ) ]$ is as follows (Rubinstein and Kroese 2016):
$$ g_{t-1}^*( \mathbf z ) = \frac{ \tau_t( \mathbf z )\, f_{t-1}( \mathbf z ) }{ E_{f_{t-1}}[ \tau_t( \tilde Z_{t:(t+k-1)} ) ] }, \quad \mathbf z \in \mathbb{R}^k. $$
In the proposed sequential importance sampling, the pdf of an importance sample $\tilde Z_{t:(t+k-1)}$ has the following form:
$$ g_{t-1}( \mathbf z; \lambda ) = \prod_{i=1}^k g_{t-1}( z_i; \lambda ), \quad \mathbf z \in \mathbb{R}^k. $$
We want to determine the value of the twisting parameter $\lambda$ so that the cross entropy of $g_{t-1}( \mathbf z; \lambda )$ relative to $g_{t-1}^*( \mathbf z )$ is minimized. The desired value of $\lambda$ is then given by
$$ \lambda_{t-1} = \operatorname*{argmin}_{\lambda}\, E_{g_{t-1}^*}\bigg[ \log \frac{ g_{t-1}^*( \mathbf Z ) }{ g_{t-1}( \mathbf Z; \lambda ) } \bigg]. \tag{41} $$
It can be easily shown that
$$ \lambda_{t-1} = \operatorname*{argmin}_{\lambda} \big( -E_{f_{t-1}}[ \tau_t( \mathbf Z ) \log g_{t-1}( \mathbf Z; \lambda ) ] \big). $$
We can see from Equation (30) that
$$ \log g_{t-1}( \mathbf Z; \lambda ) = \lambda \sum_{i=1}^k Z_i + \sum_{i=1}^k \log \sum_{j=1}^m \phi_\delta( Z_i - \hat Z_{t-j} ) - k \log c(\lambda) - k \log m. \tag{42} $$
In the above equation, $\log c(\lambda)$ is the cumulant generating function of $f_{t-1}(z)$, which is infinitely differentiable and convex. Thus, for $\mathbf z \in \mathbb{R}^k$, $\log g_{t-1}( \mathbf z; \lambda )$ is concave with respect to $\lambda$, which implies that the term $-E_{f_{t-1}}[ \tau_t( \mathbf Z ) \log g_{t-1}( \mathbf Z; \lambda ) ]$ in Equation (41) is convex with respect to $\lambda$. Thus, $\lambda_{t-1}$ is unique if it exists.
Since $\log g_{t-1}( \mathbf z; \lambda )$ is differentiable with respect to $\lambda$, we can apply the stochastic gradient descent method to find the solution of the optimization problem (41). In applying the method, we generate samples of $\mathbf Z$ from $f_{t-1}( \mathbf z )$ in order to estimate the derivative of $-E_{f_{t-1}}[ \tau_t( \mathbf Z ) \log g_{t-1}( \mathbf Z; \lambda ) ]$ with respect to $\lambda$. However, only about a proportion $p$ of the simulated $\tilde R_t(k)$s are less than $-\mathrm{VaR}_t^q(k)$. In other words, most of the generated samples make $\tau_t( \mathbf Z )$ zero and do not contribute to estimating the derivative for a given $\lambda$, which results in a large estimation error. Thus, we need to resort to another sampling pdf of $\mathbf Z$.
Equation (41) is rewritten as follows: for a $\tilde\lambda \in \mathbb{R}$,
$$ \lambda_{t-1} = \operatorname*{argmin}_{\lambda} \Big( -E_{\tilde\lambda}\big[ \tau_t( \mathbf Z ) \log g_{t-1}( \mathbf Z; \lambda )\, W_{t-1}( \mathbf Z; \tilde\lambda ) \big] \Big), \tag{43} $$
where $E_{\tilde\lambda}[ \cdot ]$ denotes the expectation with respect to the pdf $g_{t-1}( \mathbf z; \tilde\lambda )$, and $W_{t-1}( \mathbf Z; \tilde\lambda )$ denotes the likelihood ratio $f_{t-1}( \mathbf Z ) / g_{t-1}( \mathbf Z; \tilde\lambda )$. We need to choose a $\tilde\lambda$ much less than 0 so that a fairly large number of the samples of $\tilde R_t(k)$ are smaller than $-\mathrm{VaR}_t^q(k)$. It follows from Equation (42) that
$$ \frac{d}{d\lambda}\, E_{\tilde\lambda}\big[ \tau_t( \mathbf Z ) \log g_{t-1}( \mathbf Z; \lambda )\, W_{t-1}( \mathbf Z; \tilde\lambda ) \big] = E_{\tilde\lambda}\bigg[ \tau_t( \mathbf Z )\, W_{t-1}( \mathbf Z; \tilde\lambda ) \bigg( \sum_{i=1}^k Z_i - k \frac{ c'(\lambda) }{ c(\lambda) } \bigg) \bigg]. \tag{44} $$
Now, we can apply the stochastic gradient descent method to find $\lambda_{t-1}$ in an iterative manner. Suppose that we have obtained an estimate of $\lambda_{t-1}$ at the $j$-th iteration, denoted by $\lambda_j$. We generate $\mathbf Z^{(1)}, \ldots, \mathbf Z^{(L)}$ from $g_{t-1}( \mathbf z; \lambda_j )$ at the $(j+1)$-st iteration. We define $\tilde R^{(l)}(k) = r( \mathbf Z^{(l)}; H_{t-1}, \hat\theta_{t-1} )$ and $S = \{ l : \tilde R^{(l)}(k) \le -\mathrm{VaR}_t^q(k) \}$. Then, for $l \in S$, $\tau_t( \mathbf Z^{(l)} )$ is equal to $-\tilde R^{(l)}(k) / p$, and for $l \notin S$, it is equal to 0. It follows from Equation (44) that
$$ \frac{d}{d\lambda}\, E_{\lambda_j}\big[ \tau_t( \mathbf Z ) \log g_{t-1}( \mathbf Z; \lambda )\, W_{t-1}( \mathbf Z; \lambda_j ) \big] \approx -\frac{1}{pL} \sum_{l \in S} \tilde R^{(l)}(k)\, W_{t-1}( \mathbf Z^{(l)}; \lambda_j ) \bigg( \sum_{i=1}^k Z_i^{(l)} - k \frac{ c'(\lambda) }{ c(\lambda) } \bigg). \tag{45} $$
Note that $W_{t-1}( \mathbf Z; \lambda_j )$ is proportional to $\exp\{ -\lambda_j \sum_{i=1}^k Z_i \}$. Let $w^{(l)} = \exp\{ -\lambda_j \sum_{i=1}^k Z_i^{(l)} \}$, $l = 1, \ldots, L$. Then, the next estimate of $\lambda_{t-1}$ is set to be
$$ \lambda_{j+1} = \lambda_j - \frac{ a(j) }{ pL } \sum_{l \in S} \tilde R^{(l)}(k)\, w^{(l)} \bigg( \sum_{i=1}^k Z_i^{(l)} - k \frac{ c'(\lambda_j) }{ c(\lambda_j) } \bigg), \tag{46} $$
where $a(j)$ is the learning rate. By the above iteration with an appropriate initial value $\lambda_1$, we obtain a sequence $\{ \lambda_1, \lambda_2, \ldots \}$ converging to $\lambda_{t-1}$. We terminate the iteration when the difference between $\lambda_j$ and $\lambda_{j+1}$ is sufficiently small. The final estimate of $\lambda_{t-1}$ may serve as the initial value $\lambda_1$ when finding $\lambda_t$.
We adopt a learning rate of the form
$$ a(j) = \frac{a}{j + b}, \quad j = 1, 2, \ldots, $$
for appropriate constants $a$ and $b$. In the method described above, the value of $\mathrm{VaR}_t^q(k)$ needs to be known in advance. Suppose that we estimate the $k$-period VaR and ES each day. Then, we may use $\widehat{\mathrm{VaR}}_{t-1}^q(k)$ in place of $\mathrm{VaR}_t^q(k)$.
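The following sketch implements the stochastic approximation iteration (46). The helper `sim_Rk`, which maps a vector of $k$ innovations to a $k$-day log return through the fitted volatility model, and `rinnov_twisted` from Section 3.1 are assumed; the constants are illustrative defaults, not the values used in the paper.

```r
## Stochastic gradient iteration (46) for the twisting parameter lambda.
find_lambda <- function(zhat, delta, k, p, VaR, sim_Rk, lambda0 = -1,
                        L = 5000, a = 0.5, b = 10, tol = 1e-4, maxit = 100) {
  lam <- lambda0
  for (j in 1:maxit) {
    Z <- matrix(rinnov_twisted(L * k, zhat, lam, delta), nrow = L)
    Rk <- apply(Z, 1, sim_Rk)             # k-day returns of the L samples
    w <- exp(-lam * rowSums(Z))           # unnormalized likelihood ratios
    ## c'(lambda)/c(lambda), the mean of the twisted innovation density
    cl <- exp(lam * zhat + delta^2 * lam^2 / 2)
    dc <- sum((zhat + delta^2 * lam) * cl) / sum(cl)
    S <- Rk <= -VaR                       # paths with losses beyond the VaR
    grad <- sum(Rk[S] * w[S] * (rowSums(Z)[S] - k * dc)) / (p * L)
    lam_new <- lam - a / (j + b) * grad   # Equation (46)
    if (abs(lam_new - lam) < tol) { lam <- lam_new; break }
    lam <- lam_new
  }
  lam
}
```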

4. Illustration

We have estimated the 10-day VaR and the 10-day ES of the S&P 500 Index every 10 days from 21 December 1973 to 29 December 2023 by the proposed method, the crude Monte Carlo simulation with normal innovations, and the crude Monte Carlo simulation with standardized Student's t innovations. We call these Monte Carlo methods SIS, CMC-N, and CMC-t, respectively. We denote by $T$ the set of every 10th day from 21 December 1973 to 29 December 2023; the size of $T$ is 1261. In order to test statistically whether the estimates are correct, we have performed backtests: the conditional coverage test (Christoffersen 1998) for the 10-day VaR estimates and the $Z_1$ test (Acerbi and Szekely 2014) for the 10-day ES estimates.
Table 1 shows summary statistics of $\{ R_t(10), t \in T \}$, the 10-day log returns of the S&P 500 Index observed every 10 days from 21 December 1973 to 29 December 2023. The histogram of the 10-day log returns is shown in Figure 1. We can see from Table 1 and Figure 1 that the 10-day log returns of the S&P 500 Index have a sample mean close to 0 and are skewed to the left. Large losses are more common than comparably large profits.

4.1. Estimation of Risk Measures

By applying CMC-N, CMC-t, and SIS, we obtained $\{ \widehat{\mathrm{VaR}}_t^q(10), t \in T \}$, the 10-day VaR estimates every 10 days from 21 December 1973 to 29 December 2023, and $\{ \widehat{\mathrm{ES}}_t^q(10), t \in T \}$, the 10-day ES estimates every 10 days during the same period. The confidence levels of the estimates were set to 0.95, 0.975, and 0.99. In each method, the rolling window method with a window size of 750 was applied. At the beginning of each day $t$, $t \in T$, we fitted the GJR-GARCH(1,1) model to the daily log returns of the previous 750 days, and obtained the estimated daily volatilities and the residuals for those days. When we fitted the GJR-GARCH(1,1) model in CMC-N and SIS, we assumed that the innovations follow the standard normal distribution. In CMC-t, the innovations were assumed to follow the standardized Student's t-distribution. In CMC-N and CMC-t, we applied the crude Monte Carlo simulation described in Section 2.4 to generate $10^4$ sample processes of $\tilde R_{t:t+9}$, the daily log returns over the next 10 days, where the approximate distribution of the innovations is the standard normal distribution in CMC-N and the standardized Student's t-distribution in CMC-t. In SIS, we also generated $10^4$ importance sample processes of $\tilde R_{t:t+9}$ as described in Section 3. The value of $\delta$ in Equation (30) was set to 0.25. We implemented CMC-N, CMC-t, and SIS in R (R Core Team 2024). The estimation of the risk measures described above was performed on a desktop computer with six cores and 8 GB RAM.
For t, t T , VaR ^ t q ( 10 ) and ES ^ t q ( 10 ) were estimated simultaneously from the simulated 10 4 sample processes of R ˜ t : t + 9 . A total of 1261 number of 10-day VaR and 10-day ES estimations were performed for each method and confidence level. Table 2 shows the elapsed time to obtain the estimates for each estimation method and confidence level.
For all confidence levels, CMC-t took more time to obtain the estimates than CMC-N and SIS. This is because fitting the GJR-GARCH model with standardized Student's t innovations to the observed log returns takes more time than fitting it with normal innovations. In SIS, we find the optimal value of the twisting parameter $\lambda$ and compute the unnormalized likelihood ratio for each of the importance samples of $\tilde R_{t:t+9}$, $t \in T$, which accounts for the time difference between CMC-N and SIS in Table 2. However, the difference is modest: SIS took approximately 20% more time than CMC-N. We can see that the computational burden incurred by using SIS instead of CMC-N is not that large, and that SIS is more efficient than CMC-t in terms of the computational burden.
In estimating $\mathrm{VaR}_t^q(10)$, $t \in T$, we obtained the simulated values of the 10-day log return from the simulated $\tilde R_{t:t+9}$. We denote them by $\tilde R_t^{(1)}(10), \ldots, \tilde R_t^{(N)}(10)$, where $N$ is equal to $10^4$. We also computed the unnormalized likelihood ratio for each $\tilde R_{t:t+9}$ in SIS. We divided the set $\{ \tilde R_t^{(1)}(10), \ldots, \tilde R_t^{(N)}(10) \}$ into 10 subsets of equal size. For each subset, we computed the 10-day VaR estimate from Equation (17) in CMC-N and CMC-t, and from Equation (38) in SIS. The sample mean of these 10-day VaR estimates becomes $\widehat{\mathrm{VaR}}_t^q(10)$ in each method, and the standard error of the sample mean becomes the standard error (S.E.) of $\widehat{\mathrm{VaR}}_t^q(10)$. We also computed the relative error (R.E.) of $\widehat{\mathrm{VaR}}_t^q(10)$, which is the S.E. of $\widehat{\mathrm{VaR}}_t^q(10)$ divided by $\widehat{\mathrm{VaR}}_t^q(10)$. Figure 2 shows the negated 10-day VaR estimates obtained by SIS at confidence level 0.975, together with the 10-day log returns computed at day $t$, $t \in T$. The behavior of the VaR estimates by CMC-N (and CMC-t) looks similar.
If $R_t(10) < -\widehat{\mathrm{VaR}}_t^q(10)$ at day $t$, then we say that a violation occurs at day $t$, and we call such a loss an excessive loss, i.e., a 10-day excessive loss means a negated 10-day log return larger than the corresponding 10-day VaR estimate. We computed $\{ I_t, t \in T \}$, the violation process defined in Equation (19) with $k$ equal to 10, and counted the number of violations. If a method estimates $\widehat{\mathrm{VaR}}_t^q(10)$ correctly for $t \in T$, then a violation occurs at day $t$ with probability $p$. Since a total of 1261 10-day VaR estimations were performed, the expected number of violations is $1261 \times p$. Table 3 shows the expected number of violations and the number of observed violations for each method and confidence level. The table also shows the 95% confidence interval of the number of violations under the null hypothesis (20). In the table, the numbers of observed violations are within their confidence intervals for all confidence levels and methods. It seems that all methods estimated the 10-day VaR appropriately. We discuss this point in more detail in the next subsection.
Table 4 shows the average S.E. and R.E. of $\{ \widehat{\mathrm{VaR}}_t^q(10), t \in T \}$ for each method when the confidence levels are 0.95, 0.975, and 0.99. We can see that CMC-N gives the lowest average S.E. and average R.E. for all confidence levels. However, when the confidence level is 0.99, the average S.E. (and also the average R.E.) of CMC-N and SIS are similar. Thus, SIS was not effective in reducing the average S.E. of the 10-day VaR estimates. This is because the twisting parameter of SIS is chosen to minimize the estimation error of the 10-day ES, not that of the 10-day VaR.
When we estimate a value by Monte Carlo simulation, the efficiency of the simulation is inversely proportional to the product of the variance of the estimate and the simulation time taken to obtain it (Glynn and Whitt 1992; Sak and Hörmann 2012). If an estimate obtained by one Monte Carlo simulation has the same variance as an estimate obtained by another, then the ratio of these products represents the ratio of the simulation times the two simulations need to reach the same accuracy. Recalling that the variance of an estimate is inversely proportional to the sample size, and that the simulation time increases almost linearly with the sample size, the ratio indicates how much simulation time one Monte Carlo simulation needs to match the accuracy of the other.
We have obtained a series of estimates of the 10-day VaRs, not a single estimate. Thus, the efficiency of a method is inversely proportional to the product of the square of the average S.E. and the simulation time taken to obtain the estimates. We call this product the time-variance; the lower the time-variance of a method, the more efficient the method. Table 4 shows the time-variance of the 10-day VaR estimation for each method. It can be seen from the table that CMC-N is the most efficient, regardless of the confidence level. Both CMC-t and SIS took more time to obtain the estimates and have larger S.E.s than CMC-N, which results in their higher time-variances compared to CMC-N. When we computed the time-variance for each method and confidence level in Table 4, we used the elapsed time in Table 2 as the simulation time taken to obtain $\{ \widehat{\mathrm{VaR}}_t^q(10), t \in T \}$. Computing $\widehat{\mathrm{ES}}_t^q(10)$ requires only one more step than computing $\widehat{\mathrm{VaR}}_t^q(10)$ in all methods, and this additional step takes very little time compared to the overall process. Thus, the elapsed time shown in Table 2 can also be used as the approximate simulation time for the ES estimates.
After obtaining $\widehat{\mathrm{VaR}}_t^q(10)$, $t \in T$, we also computed the 10-day ES estimate for each of the 10 subsets of $\tilde R_t(10)$ obtained from $\{ \tilde R_t^{(1)}(10), \ldots, \tilde R_t^{(N)}(10) \}$. In the computation, we applied Equation (18) in both CMC-N and CMC-t, and Equation (39) in SIS. The sample mean of these 10-day ES estimates becomes $\widehat{\mathrm{ES}}_t^q(10)$, and the standard error of the sample mean becomes the standard error of $\widehat{\mathrm{ES}}_t^q(10)$. The relative error of $\widehat{\mathrm{ES}}_t^q(10)$ is the S.E. of $\widehat{\mathrm{ES}}_t^q(10)$ divided by $\widehat{\mathrm{ES}}_t^q(10)$.
Table 5 shows the average S.E. and the average R.E. of $\{ \widehat{\mathrm{ES}}_t^q(10), t \in T \}$ for each method and confidence level. We can see that SIS gives the lowest average S.E. and R.E. for all confidence levels. Using SIS, the average S.E. (and also the average R.E.) of the 10-day ES estimates is reduced by a factor of about 3 to 5 compared to CMC-N, and by a factor of about 3 to 7 compared to CMC-t. We can see that our proposed scheme for finding the optimal twisting parameter works well in reducing the estimation error of the 10-day ES.
Table 5 also shows the time-variance of the 10-day ES estimation for each method. We can see from the table that SIS is the most efficient for all confidence levels. Using SIS, the time-variance of the estimates is reduced by a factor of about 5 to 20 compared to CMC-N, which means that SIS will take about 5 to 20 times less time than CMC-N to obtain a 10-day ES estimate of the same accuracy. Compared to CMC-t, the time-variance is reduced by a factor of about 12 to 60. Thus, SIS will take significantly less time than CMC-t to obtain a 10-day ES estimate of the same accuracy.

4.2. Backtesting on 10-Day VaR Estimates

In this subsection, we apply the backtests described in Section 2.5 to statistically test whether our proposed method, as well as CMC-N and CMC-t, estimates the 10-day VaRs appropriately. For each method and confidence level, we obtained the violation process $\{ I_t, t \in T \}$ from $\{ \widehat{\mathrm{VaR}}_t^q(10), t \in T \}$ and the observed 10-day log returns $\{ R_t(10), t \in T \}$. In order to test whether $\Pr\{ R_t(10) < -\widehat{\mathrm{VaR}}_t^q(10) \} = p$, $t \in T$, i.e., whether the violation rate is $p$, we computed $LR_{uc}$ in Equation (22). Table 6 shows the value of $LR_{uc}$ and its $p$-value. We can see that for all methods and confidence levels, the estimates of the 10-day VaRs are appropriate in the sense that we cannot reject the null hypothesis (20) at the 5% significance level. In other words, if the violations are independent, then there is no reason to deny that the violation rate is $p$.
We have also tested the temporal independence of the violation process $\{ I_t, t \in T \}$. Table 6 shows the value of $LR_{ind}$ in Equation (24) and its $p$-value for each method and confidence level. The table says that the null hypothesis $H_0$ in (23) cannot be rejected at the 5% significance level for any method or confidence level, and that the violation process follows a Bernoulli process rather than a Markov chain, i.e., there is no temporal dependence in the violation process.
In order to test whether $\{ I_t, t \in T \}$ is temporally independent with the desired violation rate $p$, i.e., whether the violation process follows a Bernoulli process with success probability $p$, we performed the conditional coverage test on $\{ I_t, t \in T \}$. Table 6 shows the value of $LR_{cc}$ and its $p$-value for each method and confidence level. We can see from the table that for all methods and confidence levels, the null hypothesis $H_0$ in (26) cannot be rejected at the 5% significance level. Thus, we accept that, for all estimation methods and confidence levels, the violation process follows a Bernoulli process with success probability $p$, or equivalently, $\Pr\{ R_t(10) < -\widehat{\mathrm{VaR}}_t^q(10) \mid H_{t-1} \} = p$, $t \in T$. The past information of violations is not helpful in the current estimation of the 10-day VaR. We conclude that SIS, as well as CMC-N and CMC-t, estimated the 10-day VaR accurately in this sense.

4.3. Backtesting on 10-Day ES Estimates

From the conclusion on the 10-day VaR estimates, we can assume that $\mathrm{VaR}_t^q(10) = \widehat{\mathrm{VaR}}_t^q(10)$, $t \in T$, for all methods and confidence levels. In order to test statistically whether CMC-N, CMC-t, and SIS estimate the 10-day ESs accurately, we applied the $Z_1$ test described in Section 2.5. When $\{ \widehat{\mathrm{VaR}}_t^q(10), t \in T \}$ and $\{ \widehat{\mathrm{ES}}_t^q(10), t \in T \}$ were obtained by CMC-N, we first computed the $Z_1$ statistic of the observed 10-day log returns $\{ R_t(10), t \in T \}$ from Equation (28). To obtain the $p$-value of $Z_1$ under $H_0$, we generated $10^4$ sample processes of $\{ \tilde R_t(10), t \in T \}$ in the same manner as in CMC-N. To obtain a sample process of $\{ \tilde R_t(10), t \in T \}$, we applied the rolling window method with window size 750 to estimate the parameters of the GJR-GARCH(1,1) model with normal innovations for every $t \in T$, and generated the daily log returns $R_{t:t+9}$ by applying the crude Monte Carlo method described in Section 2.4. In the generation of the daily log returns, the innovations were assumed to follow the standard normal distribution. We repeated the procedure $10^4$ times, and denote by $\{ R_t^{(i)}(10), t \in T \}$ the $i$-th sample process of the 10-day log returns. By substituting $R_t^{(i)}(10)$ for $R_t(k)$ in Equation (28), we obtained the simulated $Z_1$ statistic of the sample process $\{ R_t^{(i)}(10), t \in T \}$. We call it $Z_1^{(i)}$, $i = 1, \ldots, 10^4$. Then, the $p$-value of $Z_1$ under $H_0$ is obtained from Equation (29).
When $\{\widehat{\mathrm{VaR}}_t^q(10),\, t \in T\}$ and $\{\widehat{\mathrm{ES}}_t^q(10),\, t \in T\}$ were obtained by CMC-t, we performed the hypothesis test (27) in the same manner as described in the previous paragraph, except that, in applying the rolling window method, we fitted the GJR-GARCH(1,1) model with standardized Student's t innovations to the observed log return process, and that the innovations were assumed to follow the standardized Student's t-distribution when applying the crude Monte Carlo method to generate the daily log returns $R_{t:t+9}$.
When $\{\widehat{\mathrm{VaR}}_t^q(10),\, t \in T\}$ and $\{\widehat{\mathrm{ES}}_t^q(10),\, t \in T\}$ were obtained by SIS, they are estimates of the risk measures of the 10-day log returns under the assumption that the log return process follows the GJR-GARCH(1,1) model with innovations having time-varying distributions, and that the distribution at each time is well approximated by the Gaussian kernel smoothing (6) of previous innovations. Thus, to compute the p-value of $Z_1$ in this case, we need sample processes of $\{\tilde{R}_t(10),\, t \in T\}$ generated under the same assumption. In generating a sample process, we fitted the GJR-GARCH(1,1) model with standard normal innovations to the observed log return process when applying the rolling window method, and applied the crude Monte Carlo method with innovations drawn from the pdf (6), as described in Section 2.4.
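Drawing innovations from a Gaussian kernel density such as (6) is straightforward: a draw from the kernel estimate is a uniformly chosen past innovation perturbed by Gaussian noise whose standard deviation is the kernel bandwidth. The following is a minimal sketch under that interpretation; the bandwidth choice and any rescaling used in (6) are not reproduced here, and the function name is illustrative.

import numpy as np

def sample_kde_innovations(past_innovations, size, bandwidth, rng=None):
    """Sample from the Gaussian kernel density estimate built on past
    standardized residuals: each draw is a uniformly selected past
    innovation plus N(0, bandwidth^2) noise."""
    rng = np.random.default_rng() if rng is None else rng
    z = np.asarray(past_innovations, dtype=float)
    centers = rng.choice(z, size=size, replace=True)  # pick kernel centers uniformly
    return centers + bandwidth * rng.standard_normal(size)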
Table 7 shows the value of $Z_1$ and its p-value for each method at the confidence levels 0.95, 0.975, and 0.99. At the 5% significance level, $\{\widehat{\mathrm{ES}}_t^{0.95}(10),\, t \in T\}$ obtained by CMC-N appears to be underestimated. In all other cases, $H_0$ could not be rejected, and we conclude that, with this single exception, all methods estimated the 10-day ES adequately.

5. Conclusions

In this paper, a sequential importance sampling (SIS) scheme for the estimation of multi-period VaR and ES has been proposed to overcome the shortcomings of crude Monte Carlo simulation. By choosing the sampling distribution of innovations differently from that of crude Monte Carlo simulation, we can substantially reduce the estimation error of the multi-period tail risk measures, as well as the number of simulations required to estimate them accurately. In our proposal, we adopt the exponential twisting of the Gaussian kernel smoothing of past innovations as the importance sampling distribution. We have shown that the optimal twisting parameter is unique and that an approximate value of it can be found by stochastic approximation. We have also proposed how to compute the weight of each simulated multi-period return, which enables the estimation of the tail risk measures from the weighted empirical distribution of the simulated returns. Empirical results showed that the proposed method outperforms crude Monte Carlo simulation in terms of variance reduction for multi-period ES estimation. Through backtesting, we have seen that the proposed method gives accurate VaR and ES estimates.
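Once the simulated multi-period returns and their importance weights are in hand, the risk measures follow from the weighted empirical distribution. The sketch below shows one standard way to carry out this final step, assuming likelihood-ratio weights and VaR/ES reported as positive losses; it is an illustration under those assumptions, not the exact estimators derived in the paper.

import numpy as np

def weighted_var_es(returns, weights, q):
    """Estimate the q-level VaR and ES (as positive losses) from
    importance-sampled multi-period returns and their weights."""
    r = np.asarray(returns, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                         # normalize the weights to sum to 1
    order = np.argsort(r)                   # sort returns from worst to best
    r, w = r[order], w[order]
    cum = np.cumsum(w)
    k = int(np.searchsorted(cum, 1.0 - q))  # first index with cumulative weight >= 1 - q
    var = -r[k]                             # the (1 - q)-quantile of returns, negated
    tail_w = w[: k + 1]
    es = -(r[: k + 1] * tail_w).sum() / tail_w.sum()  # weighted tail average, negated
    return var, es

With equal weights, this reduces to the usual empirical estimators used by crude Monte Carlo; under SIS, the weights are the likelihood ratios accumulated over the simulated 10-day paths.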
The proposed method is a Monte Carlo simulation, and the computing power of modern computers makes the estimation of a multi-period VaR and ES by the proposed method possible within seconds. By applying the proposed SIS method, practitioners can estimate the multi-period VaR and ES more accurately, and thus develop more efficient risk management strategies. Moreover, the proposed method enables extreme losses to be managed more effectively when designing long-term investments. It could therefore contribute significantly to improving risk management strategies and enhancing the stability of investment portfolios.
Our proposed method is limited to the case of a univariate return process. To construct a portfolio optimally over a time period based on risk measures such as VaR and ES, we need to estimate the multi-period VaR or ES under varying allocations of the assets making up the portfolio. In this case, it is efficient to consider the multivariate return process of those assets. An interesting direction for future research is to extend the proposed sequential importance sampling to the case where the return process is modeled as a multivariate GARCH process, such as the constant conditional correlation model or the dynamic conditional correlation model.

Author Contributions

Conceptualization, S.K.; methodology, Y.-J.S. and S.K.; software, Y.-J.S.; validation, Y.-J.S.; formal analysis, Y.-J.S. and S.K.; investigation, Y.-J.S.; resources, Y.-J.S.; data curation, Y.-J.S.; writing—original draft preparation, Y.-J.S. and S.K.; writing—review and editing, S.K.; visualization, Y.-J.S.; supervision, S.K.; project administration, S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original data presented in the study are openly available in Yahoo Finance.

Acknowledgments

The authors would like to thank the anonymous reviewers for their comments and suggestions on the first draft of this paper. Their suggestions have greatly improved the quality of the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Acerbi, Carlo, and Balazs Szekely. 2014. Back-testing expected shortfall. Risk 27: 76–81. [Google Scholar]
  2. Akgiray, Vedat. 1989. Conditional heteroscedasticity in time series of stock returns: Evidence and forecasts. Journal of Business 62: 55–80. [Google Scholar] [CrossRef]
3. Barone-Adesi, Giovanni, Kostas Giannopoulos, and Les Vosper. 1999. VaR without correlations for portfolios of derivative securities. Journal of Futures Markets 19: 583–602. [Google Scholar] [CrossRef]
4. Barone-Adesi, Giovanni, Kostas Giannopoulos, and Les Vosper. 2002. Backtesting derivative portfolios with filtered historical simulation (FHS). European Financial Management 8: 31–58. [Google Scholar] [CrossRef]
  5. Basel Committee on Banking Supervision. 2013. Fundamental Review of the Trading Book: A Revised Market Risk Framework. Consultative Document. Available online: https://www.bis.org/publ/bcbs265.pdf (accessed on 6 December 2024).
  6. Berkowitz, Jeremy, Peter Christoffersen, and Denis Pelletier. 2011. Evaluating value-at-risk models with desk-level data. Management Science 57: 2213–27. [Google Scholar] [CrossRef]
7. Bernardi, Mauro, Antonello Maruotti, and Lea Petrella. 2012. Skew mixture models for loss distributions: A Bayesian approach. Insurance: Mathematics and Economics 51: 617–23. [Google Scholar] [CrossRef]
8. Bollerslev, Tim, and Jeffrey M. Wooldridge. 1992. Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances. Econometric Reviews 11: 143–72. [Google Scholar] [CrossRef]
9. Broda, Simon A., and Marc S. Paolella. 2009. CHICAGO: A fast and accurate method for portfolio risk calculation. Journal of Financial Econometrics 7: 412–36. [Google Scholar] [CrossRef]
10. Brummelhuis, Raymond, and Roger Kaufmann. 2007. Time-scaling of value-at-risk in GARCH(1,1) and AR(1)-GARCH(1,1) processes. The Journal of Risk 9: 39. [Google Scholar] [CrossRef]
  11. Butler, J. S., and Barry Schachter. 1997. Estimating value-at-risk with a precision measure by combining kernel estimation with historical simulation. Review of Derivatives Research 1: 371–90. [Google Scholar]
12. Chen, Qian, Xiang Gao, Xiaoxuan Huang, and Xi Li. 2021. Multiple-step value-at-risk forecasts based on volatility-filtered MIDAS quantile regression: Evidence from major investment assets. Investment Management and Financial Innovations 18: 372–84. [Google Scholar] [CrossRef]
  13. Chen, Song Xi, and Cheng Yong Tang. 2005. Nonparametric inference of value-at-risk for dependent financial returns. Journal of Financial Econometrics 3: 227–55. [Google Scholar] [CrossRef]
  14. Christoffersen, Peter. 2011. Elements of Financial Risk Management. Cambridge: Academic Press. [Google Scholar]
  15. Christoffersen, Peter F. 1998. Evaluating interval forecasts. International Economic Review 39: 841–62. [Google Scholar] [CrossRef]
16. Engle, Robert F., and Simone Manganelli. 2004. CAViaR: Conditional autoregressive value at risk by regression quantiles. Journal of Business & Economic Statistics 22: 367–81. [Google Scholar]
  17. Francq, Christian, and Jean-Michel Zakoian. 2019. GARCH Models: Structure, Statistical Inference and Financial Applications. Hoboken: John Wiley & Sons. [Google Scholar]
18. Gao, Feng, and Fengming Song. 2008. Estimation risk in GARCH VaR and ES estimates. Econometric Theory 24: 1404–24. [Google Scholar] [CrossRef]
19. Ghysels, Eric, Alberto Plazzi, and Rossen Valkanov. 2016. Why invest in emerging markets? The role of conditional return asymmetry. The Journal of Finance 71: 2145–92. [Google Scholar] [CrossRef]
  20. Ghysels, Eric, Alberto Plazzi, Rossen Valkanov, Antonio Rubia, and Asad Dossani. 2019. Direct versus iterated multiperiod volatility forecasts. Annual Review of Financial Economics 11: 173–95. [Google Scholar] [CrossRef]
  21. Glasserman, Paul, Philip Heidelberger, and Perwez Shahabuddin. 2000. Variance reduction techniques for estimating value-at-risk. Management Science 46: 1349–64. [Google Scholar] [CrossRef]
  22. Glosten, Lawrence R., Ravi Jagannathan, and David E. Runkle. 1993. On the relation between the expected value and the volatility of the nominal excess return on stocks. The Journal of Finance 48: 1779–801. [Google Scholar] [CrossRef]
  23. Glynn, Peter W., and Ward Whitt. 1992. The asymptotic efficiency of simulation estimators. Operations Research 40: 505–20. [Google Scholar] [CrossRef]
24. Hong, Jeff L., Zhaolin Hu, and Guangwu Liu. 2014. Monte Carlo methods for value-at-risk and conditional value-at-risk: A review. ACM Transactions on Modeling and Computer Simulation (TOMACS) 24: 1–37. [Google Scholar] [CrossRef]
  25. Hoogerheide, Lennart, and Herman K. van Dijk. 2010. Bayesian forecasting of value at risk and expected shortfall using adaptive importance sampling. International Journal of Forecasting 26: 231–47. [Google Scholar] [CrossRef]
  26. Hull, John, and Alan White. 1998. Incorporating volatility updating into the historical simulation method for value-at-risk. Journal of Risk 1: 5–19. [Google Scholar] [CrossRef]
27. Jalal, Amine, and Michael Rockinger. 2008. Predicting tail-related risk measures: The consequences of using GARCH filters for non-GARCH data. Journal of Empirical Finance 15: 868–77. [Google Scholar] [CrossRef]
  28. Kuester, Keith, Stefan Mittnik, and Marc S. Paolella. 2006. Value-at-risk prediction: A comparison of alternative strategies. Journal of Financial Econometrics 4: 53–89. [Google Scholar] [CrossRef]
  29. Kupiec, Paul H. 1995. Techniques for verifying the accuracy of risk measurement models. Journal of Derivatives 3: 73–84. [Google Scholar] [CrossRef]
  30. Le, Trung H. 2020. Forecasting value at risk and expected shortfall with mixed data sampling. International Journal of Forecasting 36: 1362–79. [Google Scholar] [CrossRef]
  31. Longin, Francois M. 2000. From value at risk to stress testing: The extreme value approach. Journal of Banking & Finance 24: 1097–130. [Google Scholar]
  32. Lönnbark, Carl. 2016. Approximation methods for multiple period value at risk and expected shortfall prediction. Quantitative Finance 16: 947–68. [Google Scholar] [CrossRef]
  33. McNeil, Alexander J., and Rüdiger Frey. 2000. Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. Journal of Empirical Finance 7: 271–300. [Google Scholar] [CrossRef]
  34. Nadarajah, Saralees, Bo Zhang, and Stephen Chan. 2014. Estimation methods for expected shortfall. Quantitative Finance 14: 271–91. [Google Scholar] [CrossRef]
35. Nieto, Maria Rosa, and Esther Ruiz. 2016. Frontiers in VaR forecasting and backtesting. International Journal of Forecasting 32: 475–501. [Google Scholar] [CrossRef]
  36. R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. [Google Scholar]
  37. Righi, Marcelo Brutti, and Paulo Sergio Ceretta. 2015. A comparison of expected shortfall estimation models. Journal of Economics and Business 78: 14–47. [Google Scholar] [CrossRef]
  38. Rubinstein, Reuven Y., and Dirk P. Kroese. 2016. Simulation and the Monte Carlo Method. Hoboken: John Wiley & Sons. [Google Scholar]
  39. Ruiz, Esther, and María Rosa Nieto. 2023. Direct versus iterated multiperiod value-at-risk forecasts. Journal of Economic Surveys 37: 915–49. [Google Scholar] [CrossRef]
  40. Sak, Halis, and Wolfgang Hörmann. 2012. Fast simulations in credit risk. Quantitative Finance 12: 1557–69. [Google Scholar] [CrossRef]
41. Simonato, Jean-Guy. 2011. The performance of Johnson distributions for computing value at risk and expected shortfall. Journal of Derivatives 19: 7. [Google Scholar] [CrossRef]
42. Teräsvirta, Timo. 2009. An introduction to univariate GARCH models. In Handbook of Financial Time Series. Berlin/Heidelberg: Springer, pp. 17–42. [Google Scholar]
  43. Zhang, Ning, Xiaoman Su, and Shuyuan Qi. 2023. An empirical investigation of multiperiod tail risk forecasting models. International Review of Financial Analysis 86: 102498. [Google Scholar] [CrossRef]
  44. Zhou, Chunyang, Xiao Qin, Xundi Diao, and Yingchen He. 2016. Estimating multi-period value at risk of oil futures prices. Applied Economics 48: 2994–3004. [Google Scholar] [CrossRef]
Figure 1. Histogram of the 10-day log returns of the S&P 500 Index observed every 10 days from 21 December 1973 to 29 December 2023.
Figure 2. The 10-day log returns of the S&P 500 Index observed every 10 days from 21 December 1973 to 29 December 2023, and the corresponding negated 10-day VaR estimates obtained by the SIS method at the confidence level of 0.975.
Table 1. Summary statistics of the 10-day log returns of the S&P 500 Index observed every 10 days from 21 December 1973 to 29 December 2023.

Min.      1st Qu.   Median   Mean     3rd Qu.   Max.
−0.2688   −0.0137   0.0053   0.0031   0.0216    0.1360
Table 2. Elapsed time (in seconds) to estimate the 10-day VaR and ES every 10 days from 21 December 1973 to 29 December 2023.

           Confidence Level
Method     0.95      0.975     0.99
CMC-N      419.7     399.9     401.0
CMC-t      647.5     661.4     683.0
SIS        497.3     486.6     497.2
Table 3. The number of violations, its expected value, and the 95% confidence interval of the number of violations under the assumption that the violations are independent with rate 1 − q.

                         Confidence Level (q)
Method                   0.95            0.975           0.99
Expected number (C.I.)   63.1 (48, 79)   31.5 (21, 43)   12.6 (6, 20)
CMC-N                    54              35              15
CMC-t                    56              37              16
SIS                      57              32              13
Table 4. Average S.E., average R.E., and the time-variance of the 10-day VaR estimates.

Confidence Level   Method   Average S.E.   Average R.E.   Time-Variance
0.95               CMC-N    8.436 × 10⁻⁴   1.533 × 10⁻²   2.987 × 10⁻⁴
                   CMC-t    8.737 × 10⁻⁴   1.618 × 10⁻²   4.943 × 10⁻⁴
                   SIS      2.178 × 10⁻³   4.101 × 10⁻²   2.359 × 10⁻³
0.975              CMC-N    1.197 × 10⁻³   1.724 × 10⁻²   5.730 × 10⁻⁴
                   CMC-t    1.253 × 10⁻³   1.821 × 10⁻²   1.038 × 10⁻³
                   SIS      2.060 × 10⁻³   2.996 × 10⁻²   2.065 × 10⁻³
0.99               CMC-N    1.918 × 10⁻³   2.153 × 10⁻²   1.475 × 10⁻³
                   CMC-t    2.129 × 10⁻³   2.351 × 10⁻²   3.096 × 10⁻³
                   SIS      2.122 × 10⁻³   2.317 × 10⁻²   2.239 × 10⁻³
Table 5. Average S.E., average R.E., and the time-variance of the 10-day ES estimates.

Confidence Level   Method   Average S.E.   Average R.E.   Time-Variance
0.95               CMC-N    9.064 × 10⁻⁴   1.172 × 10⁻²   3.448 × 10⁻⁴
                   CMC-t    1.104 × 10⁻³   1.378 × 10⁻²   7.892 × 10⁻⁴
                   SIS      3.622 × 10⁻⁴   4.401 × 10⁻³   6.525 × 10⁻⁵
0.975              CMC-N    1.360 × 10⁻³   1.444 × 10⁻²   7.400 × 10⁻⁴
                   CMC-t    1.656 × 10⁻³   1.714 × 10⁻²   1.814 × 10⁻³
                   SIS      3.858 × 10⁻⁴   3.807 × 10⁻³   7.243 × 10⁻⁵
0.99               CMC-N    2.235 × 10⁻³   1.930 × 10⁻²   2.004 × 10⁻³
                   CMC-t    2.986 × 10⁻³   2.404 × 10⁻²   6.090 × 10⁻³
                   SIS      4.532 × 10⁻⁴   3.392 × 10⁻³   1.021 × 10⁻⁴
Table 6. Test statistics and significance probabilities (in parentheses) for the unconditional coverage, the independence, and the conditional coverage tests.

                      Confidence Level
Test      Method   0.95            0.975            0.99
LR_uc     CMC-N    1.434 (0.231)   0.3795 (0.538)   0.431 (0.511)
          CMC-t    0.861 (0.353)   0.925 (0.336)    0.848 (0.357)
          SIS      0.631 (0.427)   0.007 (0.932)    0.012 (0.913)
LR_ind    CMC-N    0.204 (0.651)   0.0008 (0.977)   0.361 (0.548)
          CMC-t    0.108 (0.742)   0.008 (0.931)    0.412 (0.521)
          SIS      0.072 (0.788)   1.668 (0.197)    0.271 (0.603)
LR_cc     CMC-N    1.639 (0.441)   0.3804 (0.827)   0.793 (0.673)
          CMC-t    0.969 (0.616)   0.932 (0.627)    1.260 (0.533)
          SIS      0.703 (0.704)   1.675 (0.433)    0.283 (0.868)
Table 7. The value of $Z_1$ and its significance probability (in parentheses).

          Confidence Level
Method    0.95               0.975              0.99
CMC-N     −0.0821 (0.0189)   −0.0392 (0.1741)   −0.0540 (0.1689)
CMC-t     −0.0508 (0.1131)   −0.0157 (0.3473)   −0.0353 (0.2736)
SIS       −0.0389 (0.1804)   −0.0077 (0.3955)   −0.0505 (0.2089)