Transformational Approach to Analytical Value-at-Risk for near Normal Distributions

Abstract: In this paper, we extend the parametric approach to VaR estimation that is based upon the application of two transforms, one for handling skewness and the other for kurtosis. These transformations restore normality to data when applied in succession. The transforms are well defined and offer an alternative to VaR models based on the variance-covariance approach. We demonstrate the application of the technique using three pairs of uncorrelated but negatively skewed and fat-tailed stock return distributions, one pair each from recent periods in the US and international markets, and one from a stressed period of US economic history. Furthermore, we extend the analysis to the economic domain by calculating expected shortfalls and risk capital under different estimation methods. For the sake of completeness, we compare the estimation results of the normal and transformation methods to non-parametric historical simulation.


Introduction
Value at risk (VaR) is an accepted risk measure in financial risk management and, in particular, in the banking industry. Basel II institutionalized its application in banking by setting capital reserves based on VaR computations. Merton and Perold (1993) laid the basis for it in the form of risk capital for financial firms. Thus, we find its usage in hedge funds too, where it has been found to be a better measure of fund risk than the standard deviation (Gupta and Liang 2005; Liang and Park 2010; El Kalak et al. 2016). The measure is easy to compute and simple to understand, and has wide-ranging applications not only in financial institutions and portfolio management, but also in non-financial firms (Bodnar et al. 1998).
Implementations of VaR commonly assume normality of asset returns (see Huisman et al. 1998), relying on the first two moments. Hence, the approach is referred to as the variance-covariance approach. Models based on this approach are labeled Delta-Normal models after they were popularized by RiskMetrics. They are also referred to as analytical VaR models (see Jorion 2001). The assumption of normality simplifies the computation of VaR considerably. However, realized returns are negatively skewed and fat-tailed. The assumption of normality in such circumstances is not a realistic one.
The most common approach to tackle non-normality is to assume that conditional returns are normal, even if unconditional returns are not. GARCH and stochastic volatility models, jump diffusion models, and Markov switching models all rely upon this feature. For example, GARCH and stochastic volatility models assume that returns are normal conditional upon knowing the current variance. Duffie and Pan (1997) review the application of these models to the computation of VaR.
Other approaches to tackle non-normality take one of two forms. One approach is to model the tails of the distribution separately using extreme value theory. Bali (2003), Daníelsson et al. (2013), and Rocco (2014) advocate this approach in order to account for the extremal events observed in practice. The other approach is to transform the realized distribution of returns so that the transformed data is normal. This approach has been suggested in Hull and White (1998).
The Hull and White approach depends upon transforming the non-normal data through a single function G to make it normal. Even though the authors suggest G could be any function, they acknowledge that estimation of the parameters of G could be a challenging task. For example, if G is a function given by a mixture of normals, maximum likelihood estimation of its parameters suffers from instability, local solutions, and convergence problems (see Hamilton 1991).
To overcome the problems associated with the generic one-function approach of Hull and White, we propose a specific two-step transformation approach. These transformations offer a distinct advantage over Hull and White's single transformation, given that the log-likelihood functions are well defined and do not suffer from problems of convergence, local solutions, or instability. Our approach is similar to theirs only to the extent that we also transform the data to achieve normality. The purpose of the paper is to extend the model-building approach of VaR so that the percentiles of risk of the actual data are reflected more accurately in the model.
We draw upon the literature in statistics that identifies suitable transforms to make data normal. The transforms we employ are those of Manly (1976) and John and Draper (1980). The first works to reduce asymmetry, and the second mitigates excess kurtosis so that the final data after these transformations are normally distributed. Both transforms are well defined, continuous, and invertible. We apply both of them in succession, since each transformation mitigates a different kind of departure from normality. 1
To demonstrate the efficacy of the procedure, we conduct our analysis over three pairs of stock data. Two pairs are chosen over the same period but from two different markets (US and India), while the third is chosen over the 2007-2008 period in the US market. The cross-section analysis captures the efficacy of the methodology in highly liquid and less liquid markets, while the period analysis captures the efficacy in times of economic stress. The stocks in each pair are uncorrelated with each other so that the VaR estimate turns out to be a coherent measure of risk. 2
As stated in Huisman et al. (1998), asset returns are often assumed to be normal for convenience and simplicity. This is the simplest parametric approach to VaR estimation, as opposed to historical simulation, which is non-parametric. In this paper, we carry out the analysis of VaR on the chosen stocks using all three methods: the parametric normal method, the transformational method, and the historical simulation method. Backtesting is done using Kupiec's unconditional test and Christoffersen's interval tests. While Christoffersen's tests fail to reject the consecutive-day independence of violations of VaR for all three models, Kupiec's test reveals that the proportion of exceptions is significantly reduced after the transformations in the liquid US market, but not in the less liquid Indian market.
The challenge in VaR estimation using normality lies in accounting for fat tails in actual data. The most popular technique to capture the extra probability mass in the tails employed today is to forecast volatility using the GARCH (1, 1) model. Therefore, we also compare our VaR estimates using raw return volatility to those obtained using GARCH (1, 1) long run volatility estimates. 3 Again, Kupiec's test reveals that GARCH estimates yield fewer exceptions for the transformational method than normality based VaR for the US market.
One criticism of choosing risk capital based on VaR estimates and the number of exceptions observed in backtesting is that high limits on VaR result in fewer exceptions but simultaneously lead to idle investible surplus. Therefore, we extend our economic analysis to compute expected shortfalls (ES) for our chosen stocks, and the risk capital set aside to absorb losses. 4 We find that the transformational method leads to VaR and ES values that lie between the non-parametric historical simulation and normality estimates across markets and time, suggesting that the transformational method of VaR estimation may be a good compromise between holding too much capital to insure against exceptional losses and holding too little. However, applying the GARCH correction for fat tails simultaneously with the transformation seems to impact risk capital and VaR estimates in unexpected ways, especially when illiquid assets are involved, as in our sample. Thus, from a risk capital perspective, we conclude that mixing the transformational method with GARCH is to be used with caution, even though from a purely statistical perspective, the mixed methods may lead to fewer breaches of VaR limits.
1 We compare advantages of our sequence of transforms with other popular ones in the next section.
2 The analysis here can be applied to separate asset classes in a portfolio designed to achieve diversification.
3 In GARCH frameworks, a wrong distributional assumption will generate inefficient parameter estimates. See Asem (2007).
Our contribution to the literature on VaR is primarily two-fold. First and foremost, we propose a simple methodology rooted in statistical transformations, which can overcome problems in estimating VaR when faced with real data that are non-normal. 5 Second, since VaR is used to set aside risk capital to absorb losses, the transformational method provides a good middle estimate between holding too much and too little risk capital. Thus, our methodology does offer a simple alternative that has significant potential for economic value analysis for estimating risk capital for near normal distributions. For tackling extremes, however, other measures like distribution fitting, conditional analyses, or the extreme value theory, or a combination of all, must be adopted.
The rest of the paper is organized as follows. In Section 2, the VaR paradigm as it exists today is presented. Section 3 discusses those departures from normality that this paper addresses. Then, in Section 4, the set of transformations proposed to be applied is reviewed. In Section 5, an application is provided where normality breaks down in the raw data, but is restored via the set of transformations. The VaR is then estimated for the three pairs in three different ways. Section 6 discusses the backtesting procedure and provides results of backtesting the models of Section 5. Section 7 repeats the exercise carried out in Section 6, with the volatility being modeled using a GARCH (1, 1) process. Section 8 applies the different methods to establish the risk capital requirement using the off-the-shelf guidance formula under Basel II. Section 9 concludes.

Value at Risk
Embrechts (2001) defines VaR as follows: "VaR is a percentile (or quantile) of the profit and loss (P&L) distribution with the property that, with a small given probability, we stand to incur that loss or more over the fixed time horizon." Key points that need to be noted about this definition are: (1) VaR is a point estimate, not a maximum loss; (2) the probability is exogenously provided; and (3) the time horizon is fixed and given. Thus, VaR has been designed to answer questions like "what is the amount of loss that can be equaled or exceeded on the asset portfolio over the next trading day with 1% chance?" Since it is an estimate, the VaR depends on the method of estimation and is not a unique number. Different estimates may be obtained using different methods.
Again, over the fixed time horizon, the asset mix is assumed to be held constant and the portfolio weights do not change. While this can certainly be true over short horizons, it may not reflect reality over the long-term. Consequently, the VaR estimate is most frequently computed on a daily basis and extrapolated to longer periods, so that n-day VaR equals the 1-day VaR multiplied by the square root of the number of days n. Jorion (2001) provides three methods for the computation of VaR using asset returns data. The first is the Delta-Normal method; the second is the historical simulation method, which is independent of any distributional assumption, while the last is the Monte-Carlo simulation method, which makes distributional assumptions for the returns. 6 In this paper, the emphasis is on the Delta-Normal method when the assumption of normality is violated.
The Delta-Normal method is an analytical parametric technique 7 developed by RiskMetrics. It assumes that the individual asset returns in the portfolio are multivariate normally distributed with zero means. Thus, the problem of non-coherence of VaR is circumvented, and the mean, standard deviation, and asset return correlations are sufficient statistics to compute the VaR. The change in the asset values (deltas) is dependent only upon the variance-covariance matrix of asset returns. The advantage of this method, in particular, is its speed and simplicity, and the fact that the distribution of returns need not be assumed to be stationary through time, since volatility updating is incorporated into the parameter estimation.
To exemplify the Delta-Normal method, let us consider a single asset having a daily returns distribution $x \sim N(\mu, \sigma^2)$. If the value of the current investment is $S$ dollars, the 1-day VaR at the $100(1-\alpha)\%$ level of confidence is given by
$$\mathrm{VaR}_{1,\alpha} = -\left(\mu + Z_\alpha \sigma\right) S, \qquad (1)$$
where $Z_\alpha$ is the lower $\alpha$ percentile of the standard normal distribution. The n-day $\mathrm{VaR}_{n,\alpha}$ is given by $\sqrt{n}\,\mathrm{VaR}_{1,\alpha}$. The negative sign ensures that the VaR number is positive. Assuming normality when the returns are non-normal can lead to errors in the estimate of VaR. In general, if the distribution is heavy tailed, then it will lead to underestimation of VaR at relatively high confidence levels and overestimation at relatively low confidence levels. Jorion (1996) suggests scaling the volatility/VaR estimate. In fact, the Basel II committee recommends scaling the VaR estimate by a multiple, which in turn depends upon the exceptions observed from the internal VaR estimates. For example, Gupta and Liang (2005) and Ou and Zhao (2020) apply similar multiples while estimating required capital for hedge funds and exchange traded funds, respectively. Thus, any improvement in VaR estimation technique also leads to improvements in capital estimates.
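As an illustration of the Delta-Normal computation for a single asset, the following minimal Python sketch evaluates the 1-day VaR as the negative of (mean plus normal quantile times standard deviation) times the investment, and applies the square-root-of-time rule for longer horizons. The function name `delta_normal_var`, the use of the sample standard deviation, and the sample inputs are illustrative assumptions, not part of the original analysis.

```python
import math
from statistics import NormalDist, mean, stdev

def delta_normal_var(returns, investment, alpha=0.01, horizon_days=1):
    """Delta-Normal VaR for one asset (hypothetical helper, for illustration).

    1-day VaR = -(mu + Z_alpha * sigma) * S, scaled by sqrt(n) for n days.
    """
    mu = mean(returns)
    sigma = stdev(returns)                      # sample standard deviation (assumed convention)
    z_alpha = NormalDist().inv_cdf(alpha)       # lower alpha quantile, about -2.33 at 1%
    one_day = -(mu + z_alpha * sigma) * investment
    return math.sqrt(horizon_days) * one_day

# Toy inputs: three made-up daily returns and a 50,000 investment.
sample = [-0.02, 0.0, 0.02]
var_1d = delta_normal_var(sample, 50_000)
var_10d = delta_normal_var(sample, 50_000, horizon_days=10)
```

With a negative quantile and a near-zero mean, the returned number is positive, which matches the sign convention described above.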
In the following section, we review the specific departures from normality that can be addressed using the transforms discussed in this paper.

Departures from Normality
This paper focuses on departures from normality arising out of skewness and kurtosis in the return data series. Returns on small stocks are known to be highly skewed (Amaya et al. 2015) and so are returns on venture capital investments (Cochrane 2005). Hedge fund returns exhibit excess kurtosis (Gupta and Liang 2005; Liang and Park 2010). It must be noted that normality may also be violated in higher moments, if not in the third and fourth. However, we restrict ourselves to non-normality arising out of asymmetry (skewness) and thick/thinness (excess kurtosis) in the tails. We use the Jarque-Bera statistic (JB stat henceforth) to check for normality, since it is explicitly based on the third and fourth moments of the return distribution.
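For readers who wish to reproduce the normality check, the sketch below computes the Jarque-Bera statistic in its standard textbook form, JB = n/6 · (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis. The function name and the toy data are assumptions made for illustration; the p-value uses the asymptotic chi-square distribution with two degrees of freedom, which may be inaccurate in small samples.

```python
import math

def jarque_bera(x):
    """Return (JB statistic, skewness, kurtosis, asymptotic p-value)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n       # second central moment
    m3 = sum((v - m) ** 3 for v in x) / n       # third central moment
    m4 = sum((v - m) ** 4 for v in x) / n       # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)               # chi-square(2) survival function
    return jb, skew, kurt, p_value

# Toy symmetric sample (made up): skewness is zero, kurtosis is below 3.
jb, skew, kurt, p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```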
In case normality of returns is rejected, one cannot apply Equation (1) to estimate VaR. One approach is to try other distributions that fit the given data and then compute the quantile estimate as suggested in the definition in Section 2. A second approach is to update volatility estimates using time series models such as GARCH to account for kurtosis. The third approach is to restore normality using a set of transformations and then provide an estimate of VaR using the transformed data so that Equation (1) can be applied. This is the method that we apply to our pairs of stocks whose returns are uncorrelated.
The advantages of doing so to our portfolio are two-fold: (i) the task of computing VaR for the portfolio is simplified, and (ii) since normality is restored, the VaR estimate is not vulnerable to the non-coherence critique.
In the following section, we elaborate on the transformations that may be applied to non-normal raw return series to make the data normal.

Transformations Applied to Achieve Normality
We assume in this section that the formula for VaR in Equation (1) cannot be applied, because there is skewness and excess kurtosis in the data (hence the JB stat rejects the normality assumption). Therefore, we attempt to restore normality in stages via a set of transformations.
Among different transformations used, the power transformation models are known to outperform the untransformed models in forecasting VaR (Tsiotas 2020). Among these, the Box-Cox transformation (1964) is widely used, even in different fields, for example in recent studies by Berger and Schmid (2020) in medical research, Cai et al. (2020) in structural engineering, Li (2018) in health studies, and Cunha et al. (2020) and Mögel and Auer (2018) in volatility modeling in finance. Other popular power transforms include Manly (1976), John and Draper (1980), Bickel and Doksum (1981), and Yeo and Johnson (2000) transformation.
The major limitation of the Box-Cox transformation is its poor performance in handling long-tailed or skewed data (Zhu and Melnykov 2018; Dhanoa et al. 2020). Another limitation of the Box-Cox transformation is its incompatibility with a normal error distribution (Eriksson et al. 2019). To circumvent the limitation of the Box-Cox transformation for negative values, researchers also use the signed power transformation suggested by Bickel and Doksum (1981). The Bickel and Doksum (1981) transformation performs well in handling kurtosis; however, it performs poorly in handling a skewed distribution (Yeo and Johnson 2000). Tsiotas (2020) shows that Box-Cox Manly transformation outperforms the Yeo and Johnson (2000) transformation.
The Manly transform is suggested as an alternative to the Box and Cox (1964) transform on the grounds that negative x values are allowed (Zhu and Melnykov 2018). Moreover, in finite samples, the Box-Cox transform fails to eliminate skewness (Gonçalves and Meddahi 2011; Zhu and Melnykov 2018), while Spitzer (1978) shows that in finite samples, there is an increase in the variance of the transformed variable. The latter study also points out that the Box-Cox transformation may fail normality tests because of excess kurtosis. In contrast, the Manly transform is quite effective at turning skewed distributions into symmetric normal-like distributions (Zhu and Melnykov 2018). We therefore employ the Manly transform to reduce skewness in the raw return series. Bhattacharyya and Madhav (2012) conduct a statistical analysis combining a dynamic GARCH with various transformations over market indices that are highly liquid assets. They report that the Manly and John and Draper transformations perform as well as others for these liquid indices.
Our study differs from Bhattacharyya and Madhav (2012), as our analysis incorporates economic uses of VaR as opposed to the purely statistical analyses conducted by those authors. We deliberately choose relatively illiquid assets compared to indices, and our findings suggest that the combination of GARCH and transformations may lead to unexpectedly high risk capital requirements, thus questioning the efficacy of risk capital held from a risk-return perspective, because idle capital incurs an opportunity cost.
Given the discussion above, we propose an alternative two-step transformation combining the Manly and John and Draper transforms. The first step reduces the skewness, and the second step reduces the excess kurtosis, so that the two steps together address both departures from normality. We discuss these transforms below.

Manly Transformation
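In 1976, Manly proposed the following transformation to map skewed data ($x$) to make it more symmetric (like normal):
$$y = \begin{cases} \dfrac{e^{\gamma x} - 1}{\gamma}, & \gamma \neq 0, \\ x, & \gamma = 0, \end{cases} \qquad (2)$$
where $x$ is the original non-normal variable, $y$ is the transformed variable, and $\gamma$ is a parameter to be estimated. When $x$ takes $n$ values $x_1, \ldots, x_n$, the value of parameter $\gamma$ is calculated by maximizing a log likelihood function of the following form:
$$\ell(\gamma) = -\frac{n}{2}\ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - \mu_i\right)^2 + \gamma\sum_{i=1}^{n} x_i, \qquad (3)$$
where $\mu_i$ is the mean of the transformed variable $y_i$, and the $y_i$'s have variance $\sigma^2$. An estimate of the variance is provided by:
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{\mu}_i\right)^2. \qquad (4)$$
Under the assumption that the $\hat{\mu}_i$ are all equal, $\hat{\mu}_i$ is replaced by the mean of the transformed variable $y_i$. Equation (3) is then solved to yield the parameter $\gamma$.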
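The estimation of the Manly parameter can be sketched in a few lines of Python. The sketch assumes the exponential form of the transform, with the identity at γ = 0, and maximizes the concentrated Gaussian log-likelihood (the variance replaced by its estimate, constants dropped) over a coarse grid. The grid bounds, the function names, and the toy data are illustrative assumptions; a numerical optimizer could be substituted for the grid search.

```python
import math

def manly(x, gamma):
    """Manly transform: (exp(gamma*x) - 1)/gamma, identity when gamma is 0."""
    if abs(gamma) < 1e-12:
        return x
    return math.expm1(gamma * x) / gamma

def manly_inverse(y, gamma):
    """Inverse Manly transform; requires 1 + gamma*y > 0."""
    if abs(gamma) < 1e-12:
        return y
    return math.log1p(gamma * y) / gamma

def manly_loglik(xs, gamma):
    """Concentrated log-likelihood: -(n/2) ln(var_hat) + gamma * sum(x)."""
    n = len(xs)
    ys = [manly(x, gamma) for x in xs]
    ybar = sum(ys) / n
    var_hat = sum((y - ybar) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var_hat) + gamma * sum(xs)

def fit_manly(xs, lo=-2.0, hi=2.0, steps=401):
    """Grid search for gamma (assumed bounds; the grid includes gamma = 0)."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda g: manly_loglik(xs, g))

# Toy left-skewed sample (made up): logarithms of 1..49.
data = [math.log(k) for k in range(1, 50)]
gamma_hat = fit_manly(data)
```

For a negatively skewed sample such as this one, the fitted γ is positive: the transform stretches the right tail and compresses the left one.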
The Manly transform works well to improve symmetry over $(-\infty, \infty)$, but it does not work well with data that exhibit non-normal kurtosis. In order to address the problem of kurtosis, John and Draper (1980) propose the following set of transforms on near normal (almost symmetric) data:

John and Draper Transform
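The following set of transforms corrects for fairly symmetric but non-normal distributions due to excess kurtosis:
$$y = \begin{cases} \operatorname{sign}(x)\,\dfrac{(|x|+1)^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\ \operatorname{sign}(x)\,\ln(|x|+1), & \lambda = 0. \end{cases} \qquad (5)$$
This is a modulus transformation applied to each (non-normal) tail separately. The likelihood function for estimating $\lambda$ is given by
$$\ell(\lambda) = -\frac{n}{2}\ln\hat{\sigma}^2(\lambda) + n(\lambda - 1)\ln\dot{x}, \qquad (6)$$
where $\hat{\sigma}^2(\lambda)$ is the estimated variance of the transformed values and $\dot{x}$ is the geometric mean of the raw $(|x| + 1)$s. The Manly transform works well to improve symmetry, while the JD transform works better for decreasing the magnitude of excess kurtosis. Consequently, in this paper we apply both transforms in succession, while checking for normality at each stage.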
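A hedged Python sketch of the modulus transform and its inverse follows. It assumes the usual John and Draper form, sign(x)·((|x| + 1)^λ − 1)/λ with a logarithmic limit at λ = 0, and a concentrated log-likelihood whose Jacobian term is (λ − 1)·Σ ln(|x| + 1). Function names, grid bounds, and the sample are illustrative assumptions.

```python
import math

def jd(x, lam):
    """John-Draper modulus transform applied to one value."""
    a = abs(x) + 1.0
    mag = math.log(a) if abs(lam) < 1e-12 else (a ** lam - 1.0) / lam
    return math.copysign(mag, x)

def jd_inverse(y, lam):
    """Inverse modulus transform; requires lam*|y| + 1 > 0."""
    if abs(lam) < 1e-12:
        mag = math.expm1(abs(y))
    else:
        base = lam * abs(y) + 1.0
        if base <= 0.0:
            raise ValueError("value lies outside the range of the transform")
        mag = base ** (1.0 / lam) - 1.0
    return math.copysign(mag, y)

def jd_loglik(xs, lam):
    """Concentrated log-likelihood: -(n/2) ln(var_hat) + (lam-1) * sum ln(|x|+1)."""
    n = len(xs)
    ys = [jd(x, lam) for x in xs]
    ybar = sum(ys) / n
    var_hat = sum((y - ybar) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var_hat) + (lam - 1.0) * sum(math.log(abs(x) + 1.0) for x in xs)

def fit_jd(xs, lo=-1.0, hi=2.0, steps=301):
    """Grid search for lambda (assumed bounds; the grid includes lambda = 1)."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda l: jd_loglik(xs, l))

# Toy sample with a few large values in both tails (made up).
data = [-3.0, -1.0, -0.5, -0.2, 0.0, 0.1, 0.4, 0.9, 3.5]
lam_hat = fit_jd(data)
```

Values of λ below one pull large observations toward the center in both tails, which is how the transform reduces excess kurtosis; λ = 1 leaves the data unchanged.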
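The following three steps succinctly describe the methodology that we follow to compute the VaR when the original daily returns distribution $r$ is not normal.

1. Standardize the raw returns: $r_s = (r - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of $r$.

2. Check if the transformed data given by $y = \mathrm{JD} \circ \mathrm{MANLY}(r_s)$ is normally distributed using the JB stat. Let $y \sim N(\mu_{tr}, \sigma_{tr}^2)$.

3. The one-day VaR estimate for actual returns $r$ at the $100(1-\alpha)\%$ level is given by
$$\mathrm{VaR}_{1,\alpha} = -\left[\mathrm{MANLY}^{-1} \circ \mathrm{JD}^{-1}\left(\mu_{tr} + Z_\alpha \sigma_{tr}\right)\sigma + \mu\right] S,$$
where $S$ is the size of the investment.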
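The steps above can be sketched end to end as follows. The sketch takes γ and λ as given (their likelihood-based estimation is omitted), standardizes the returns, applies Manly then JD, takes the normal quantile in the transformed space, and maps it back through the inverses in reverse order. All function names and the sample returns are hypothetical; the sample standard deviation is an assumed convention.

```python
import math
from statistics import NormalDist, mean, stdev

def manly(x, gamma):
    return x if gamma == 0 else math.expm1(gamma * x) / gamma

def manly_inv(y, gamma):
    if gamma == 0:
        return y
    arg = 1.0 + gamma * y
    if arg <= 0.0:
        raise ValueError("quantile lies outside the range of the Manly transform")
    return math.log(arg) / gamma

def jd(x, lam):
    a = abs(x) + 1.0
    mag = math.log(a) if lam == 0 else (a ** lam - 1.0) / lam
    return math.copysign(mag, x)

def jd_inv(y, lam):
    if lam == 0:
        return math.copysign(math.expm1(abs(y)), y)
    base = lam * abs(y) + 1.0
    if base <= 0.0:
        raise ValueError("quantile lies outside the range of the JD transform")
    return math.copysign(base ** (1.0 / lam) - 1.0, y)

def transformational_var(returns, investment, gamma, lam, alpha=0.01):
    mu, sigma = mean(returns), stdev(returns)
    r_s = [(r - mu) / sigma for r in returns]             # step 1: standardize
    y = [jd(manly(v, gamma), lam) for v in r_s]           # step 2: JD after Manly
    mu_tr, sigma_tr = mean(y), stdev(y)
    pct = mu_tr + NormalDist().inv_cdf(alpha) * sigma_tr  # quantile in transformed space
    back = manly_inv(jd_inv(pct, lam), gamma)             # step 3: invert in reverse order
    return -(back * sigma + mu) * investment

# Made-up returns; gamma = 0 and lam = 1 make both transforms the identity.
sample = [-0.03, -0.01, 0.0, 0.01, 0.02]
var_identity = transformational_var(sample, 50_000, gamma=0.0, lam=1.0)
var_normal = -(mean(sample) + NormalDist().inv_cdf(0.01) * stdev(sample)) * 50_000
```

With identity parameters the result reduces to the Delta-Normal figure, which is a convenient sanity check. The explicit range checks matter in practice: a normal quantile in the transformed space can fall outside the image of a transform, in which case the inverse is undefined.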

Computation of VaR
In this section, we demonstrate the application of the transformations discussed in the prior section in calculating VaR for portfolios formed from three pairs of uncorrelated securities, chosen from different markets at different times. From the US market, the stocks are Pan American Silver Corp (ticker symbol: PAAS) and Portland General Electric Company (POR) for the period May 24, 2006 to October 10, 2008 (period of economic stress), and Kinross Gold Corporation (KGC) and Duke Energy (DUK) for the period July 25, 2017 to December 10, 2019 (business-as-usual period); within each pair, the stocks come from the commodity and utility sectors, respectively. The two uncorrelated stocks from India are Punjab Alkalies and Chemicals Limited (PACL) and Bharat Petroleum Corporation Limited (BPCL), considered over the business-as-usual period and belonging to the chemical and petroleum industries, respectively. Each period provides 600 days of daily returns. Given that the assets are uncorrelated and the transformed data are normal, the VaR of the portfolio equals the sum of individual asset VaRs.
The summary statistics of the three pairs of stocks are presented in three panels in Table 1. The daily returns of all stocks across the three panels exhibit negative skewness and are leptokurtic. The JB statistic shows that the returns are not normally distributed for any one of them, and correlations are low in magnitude and statistically insignificant. 8 In Figure 1, Panels A-C plot the distributions of realized returns over the data period for the stocks. For purpose of comparison, the normal density is also plotted as if the data was drawn from a normal distribution having same mean and variance. The graphs demonstrate negative skewness and positive excess kurtosis (fat tails) for all stocks.
We use three methods to compute the VaR based on the past 100 days of returns: the non-parametric historical simulation, the parametric normal method, and the transformational method, in which the Manly and JD transforms are applied in succession to standardized raw data. While the focus is on parametric estimation, the historical simulation is included for the sake of completeness.
The historical VaR measure has some serious disadvantages to both companies employing it and financial sector regulators. In order to obtain accurate estimates, a large data sample of the empirical distribution is required. The VaR estimate is therefore subject to the frequency and length of the data sample. A further drawback is the inability to allow for conditionality of the parameters over time.
To overcome these flaws, a parametric approach, such as the normal approach, is often adopted. Since the distribution is approximated by a parametric distribution, parameters can be allowed to change over time. Estimation risk on the VaR estimate itself is also reduced, particularly for higher quantiles. Furthermore, the parametric approach has the advantage of not being dependent on the chosen quantile, facilitating the ease with which comparisons between the VaR estimates across various institutions can be made. Parametric conversion, however, will only hold in practice if the parametric approach accurately reflects the distribution at all quantiles in the tail. Indeed, it has been the case that institutions have notoriously chosen confidence levels and time horizons to suit them. Huisman et al. (1998) provide an excellent discussion on these methods.
Consequently, any VaR model we undertake is back-tested to verify its efficacy. The most popular test is an exceptions test based on Kupiec (1995), followed by the test of independence by Christoffersen. The exceptions test says that if the daily 1% VaR is based on a normal distribution assumption and this is indeed the true underlying distribution, then realized daily returns should not fall below the $\mathrm{VaR}_{1,0.01}$ estimate more than 1% of the time. Similarly, the null hypothesis for Christoffersen's test is that the probability of a change of state from loss not in excess of VaR to a state of exceeding VaR is the same as that of continuing in the same state of loss not exceeding VaR. We backtest the VaR estimates in our examples using both tests. The estimation and backtesting windows are shown in Figure 2.
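The exceptions test can be illustrated with the standard likelihood-ratio form of Kupiec's proportion-of-failures statistic, which compares the observed exception rate with the nominal rate p and is asymptotically chi-square with one degree of freedom. The sketch below is a textbook version under that assumption and is not taken from the computations reported here; the function name and the counts in the example are hypothetical. Christoffersen's independence test is not shown.

```python
import math

def kupiec_pof(exceptions, n_obs, p=0.01):
    """Kupiec proportion-of-failures LR statistic and asymptotic p-value."""
    x, n = exceptions, n_obs
    p_hat = x / n

    def loglik(q):
        # Binomial log-likelihood, skipping terms with a zero count.
        ll = 0.0
        if n - x > 0:
            ll += (n - x) * math.log(1.0 - q)
        if x > 0:
            ll += x * math.log(q)
        return ll

    lr = max(0.0, -2.0 * (loglik(p) - loglik(p_hat)))
    p_value = math.erfc(math.sqrt(lr / 2.0))      # chi-square(1) survival function
    return lr, p_value

# Hypothetical counts: 5 exceptions in a 125-day window at a 1% VaR level.
lr, p_value = kupiec_pof(5, 125)
```

An LR value above 3.84 rejects correct coverage at the 5% level, as in the hypothetical example.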
Figure 2. Estimation and backtesting windows. This figure shows the estimation window followed by the backtesting window. VaR on day t is estimated using the past 100 days of data for Model 1 (using historical simulation), Model 2 (assuming normality of raw returns), and Model 3 (after transforming raw data). Then the exceptions test is carried out for a window of 125 days following the VaR estimate at date t. Both windows are rolled forward by one day and the procedure repeated for all three models until the sample period is exhausted. This leads to testing of exceedances of VaR limits over 47,000 data points.
Table 2 shows the results of the transformations. Each panel shows the parameters of the transformations and their effects on the JB stat when Manly and JD are applied in succession. It is seen that the JB stat values, after the transformations have been applied, fail to reject normality for the transformed data in all three panels.
J. Risk Financial Manag. 2021, 14, x FOR PEER REVIEW 11 of 19
[...] and BPCL are corr = 0.0088, t-stat = 0.1971. Thus, the return series remain uncorrelated after the transforms, preserving additivity of VaR values for the portfolios of two stocks in each of the three cases.
We plot the transformed data densities for KGC and DUK in panel A, for PACL and BPCL in panel B, and for PAAS and POR in panel C of Figure 3, respectively. For comparison purposes, we superimpose normal densities on the plots of the stocks. The graphs show interesting patterns when compared to the corresponding raw data plots of Figure 1. We notice that the mode for the transformed data has decreased for all the stocks studied. The reduction is less noticeable in the case of PACL. The effect of the transforms is to redistribute probability mass from the center towards the left tail of the raw distribution. Simultaneously, mass in the tails is also reduced and redistributed over the rest of the return values.

Backtesting
We estimate the three models of VaR for all the firms in our portfolio. Model 1 employs the historical simulation, Model 2 uses the normal distribution, while Model 3 uses the transformed data. Each model is estimated using the past 100 days of return data. The backtesting of the model is based on the proportion of exceptions test popularized by Kupiec (1995). 9 The backtesting philosophy for the three models is predicated on the following rationale: if $\mathrm{VaR}^{t}_{1,\alpha}$ is established on day t using a given model, then in the following days, the number of realized daily losses that equal or exceed $\mathrm{VaR}^{t}_{1,\alpha}$ (called exceptions) will be fewer for the model with a better fit. 10 Both estimation and backtesting windows are then rolled forward by one day, and the process is repeated until the entire sample period is exhausted. The estimation and backtesting periods equal 100 and 125 days, respectively (see Figure 2), at $\alpha = 1\%$, so that $Z_\alpha = -2.33$.
In Model 3 (M3), the raw data over the estimation window are standardized and transformed so that the transformed data have mean $\mu_{tr}$ and standard deviation $\sigma_{tr}$. The first percentile ($\mathrm{pct}_{1\%}$) of the transformed data is given by $-2.33\,\sigma_{tr} + \mu_{tr}$, which is mapped back to the standardized raw distribution by inverting the transforms in reverse order, i.e., $\mathrm{MANLY}^{-1} \circ \mathrm{JD}^{-1}(\mathrm{pct}_{1\%})$. Since this point lies on the standardized distribution, the value-at-risk at 1% under Model 3 on day t is given by $-\left(\mathrm{MANLY}^{-1} \circ \mathrm{JD}^{-1}(\mathrm{pct}_{1\%})\,\sigma + \mu\right) \times 50{,}000$ (see Step 3 of the methodology described towards the close of Section 4).
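The inversion step for Model 3 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the commonly cited forms of the Manly (1976) exponential transform and the John and Draper (1980) modulus transform, and the parameter names `gamma` and `delta`, their values, and the inputs in the usage lines are all hypothetical.

```python
import math

POSITION = 50_000  # dollar position size used in the text


def manly(x, gamma):
    """Manly (1976) exponential transform, commonly cited form (assumed)."""
    return x if gamma == 0 else (math.exp(gamma * x) - 1.0) / gamma


def manly_inverse(y, gamma):
    """Inverse of manly(); requires 1 + gamma * y > 0."""
    return y if gamma == 0 else math.log(1.0 + gamma * y) / gamma


def jd(x, delta):
    """John and Draper (1980) modulus transform, commonly cited form (assumed)."""
    sign = math.copysign(1.0, x)
    if delta == 0:
        return sign * math.log(abs(x) + 1.0)
    return sign * ((abs(x) + 1.0) ** delta - 1.0) / delta


def jd_inverse(y, delta):
    """Inverse of jd(); requires 1 + delta * |y| > 0."""
    sign = math.copysign(1.0, y)
    if delta == 0:
        return sign * (math.exp(abs(y)) - 1.0)
    return sign * ((1.0 + delta * abs(y)) ** (1.0 / delta) - 1.0)


def model3_var(mu, sigma, mu_tr, sigma_tr, gamma, delta, z_alpha=-2.33):
    """Take the normal quantile of the transformed data, undo JD, then undo
    Manly, and rescale by the raw mean and standard deviation."""
    pct = z_alpha * sigma_tr + mu_tr
    z_raw = manly_inverse(jd_inverse(pct, delta), gamma)
    return -(z_raw * sigma + mu) * POSITION


# Hypothetical inputs: zero mean, 2% daily volatility, illustrative gamma/delta.
var_m3 = model3_var(mu=0.0, sigma=0.02, mu_tr=0.0, sigma_tr=1.0, gamma=0.1, delta=0.8)
# With gamma = 0 and delta = 1 both transforms reduce to the identity,
# so the same function returns the plain normal VaR.
var_plain = model3_var(mu=0.0, sigma=0.02, mu_tr=0.0, sigma_tr=1.0, gamma=0.0, delta=1.0)
print(round(var_m3, 2), round(var_plain, 2))
```

With these illustrative parameters the back-transformed quantile lies further in the left tail than −2.33 standard deviations, so the transformed-data VaR exceeds the normal one. The parameter estimation itself is not shown here.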
With the 1% VaR value set on day t for all three models (M1, M2, and M3), denoted $\mathrm{VaR}_t^{M1}$, $\mathrm{VaR}_t^{M2}$, and $\mathrm{VaR}_t^{M3}$, we proceed as follows. For every day from t + 1 through t + 125, we check whether the realized daily loss equals or exceeds the VaR value for M1 and M2. If the loss equals or exceeds the VaR value, it is counted as an exception. The sum of all exceptions divided by the total number of days (125) gives the proportion of exceptions observed for the model over the backtesting window. Following Kupiec (1995), we assume these exceptions follow a binomial distribution.
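The rolling procedure described above can be sketched as follows for Models 1 and 2. This is a minimal illustration on simulated data, not the authors' code: the function names, the simulated Student-t returns, and the use of the sample standard deviation (`ddof=1`) are assumptions made for the example.

```python
import numpy as np

POSITION = 50_000  # dollar position size used in the text


def var_historical(window, alpha=0.01):
    """Model 1: dollar VaR from the empirical alpha-quantile of returns."""
    return -np.quantile(window, alpha) * POSITION


def var_normal(window, z_alpha=-2.33):
    """Model 2: dollar VaR from the sample mean and standard deviation."""
    return -(z_alpha * np.std(window, ddof=1) + np.mean(window)) * POSITION


def rolling_exceptions(returns, var_fn, est_len=100, test_len=125):
    """Count exceptions in each backtesting window, rolling forward one day."""
    returns = np.asarray(returns, dtype=float)
    counts = []
    n_windows = len(returns) - est_len - test_len + 1
    for start in range(max(n_windows, 0)):
        est = returns[start:start + est_len]
        test = returns[start + est_len:start + est_len + test_len]
        var_t = var_fn(est)            # VaR set on day t
        losses = -test * POSITION      # realized dollar losses on t+1 .. t+125
        counts.append(int(np.sum(losses >= var_t)))
    return counts


# Simulated fat-tailed returns (hypothetical data, fixed seed).
rng = np.random.default_rng(0)
sample = rng.standard_t(df=4, size=600) * 0.01
for name, fn in [("M1", var_historical), ("M2", var_normal)]:
    counts = rolling_exceptions(sample, fn)
    print(name, len(counts), sum(counts) / (len(counts) * 125))
```

A series of 600 returns yields 376 rolling windows with these window lengths, and the printed ratio is the pooled proportion of exceptions across all window-days.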
Since the three models use the same population of returns, we compare them using two-sample proportions tests, which employ the Z-statistic and have the following form:
$$Z = \frac{\hat{p}_i - \hat{p}_j}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{N_i} + \frac{1}{N_j}\right)}},$$
where $\hat{p}_i$ and $\hat{p}_j$ are the proportions of exceptions observed for models $M_i$ and $M_j$, and
$$\hat{p} = \frac{X_i + X_j}{N_i + N_j}$$
is the pooled probability of an exception. $X_i$ and $N_i$ ($i = 1, 2, 3$) are the number of exceptions observed and the corresponding sample size in the different models. The standard error of the pooled probability is:
$$SE = \sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{N_i} + \frac{1}{N_j}\right)}.$$
The results of the backtest are presented in Table 3. Since each backtest window has a length of 125 days, and there are 376 such windows on a rolling basis, the total number of data points for each model is 125 × 376 = 47,000.
Table 3. This exhibit shows the difference in proportions of exceptions observed when VaR values are set after (i) historical simulation (Model 1), (ii) assuming normality (Model 2), and (iii) transforming data using the Manly (1976) and John and Draper (1980) transforms (Model 3). Three pairs of stocks are taken: one pair of highly liquid stocks from the recent normal period (KGC and DUK, in Panel A), one pair of less liquid stocks from the recent normal period (PACL and BPCL, in Panel B), and one pair of highly liquid stocks from a stressed period (POR and PAAS, in Panel C). Exceptions are assumed to follow a binomial distribution following Kupiec (1995). The estimation window has a length of 100 days and the backtesting window a length of 125 days. The superscript a denotes significance at the 1% level.
Table 3 shows that exceptions from M2 are far more numerous than the corresponding exceptions from M3, and in turn, the exceptions from M3 are more numerous than those from M1 for the highly liquid stocks of the US market. For the less liquid stocks, the exceptions from M3 are higher than those from M2, but lower than those from M1 for PACL. For BPCL, the exceptions from M2 are higher than those from M1, but lower than those from M3.
The expected shortfall for the stressed period (Panel 3.C) is the lowest for M3.
As expected for liquid stocks in Panels 3.A and 3.C for the US market, we find that high magnitudes for VaR result in lower ES, but the same relation fails to hold for illiquid stocks in Panel 3.B owing to the nature of lumpiness in the VaR estimates. The results show that liquidity risk for estimation of VaR for individual assets is a concern that needs to be explicitly accounted for in the model (see Soprano 2015 for liquidity adjustments to VaR estimation).
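The two-sample proportions test used in this section to compare exception rates across models can be sketched as below. The exception counts in the usage lines are hypothetical and are not taken from Table 3. The two-sided 1% critical value of 2.576 is the standard normal one and is used here only for illustration.

```python
import math


def two_proportion_z(x_i, n_i, x_j, n_j):
    """Z-statistic for H0: models i and j have the same exception probability."""
    p_i, p_j = x_i / n_i, x_j / n_j
    p_pool = (x_i + x_j) / (n_i + n_j)
    se = math.sqrt(p_pool * (1.0 - p_pool) * (1.0 / n_i + 1.0 / n_j))
    if se == 0.0:
        raise ValueError("pooled proportion is 0 or 1; the statistic is undefined")
    return (p_i - p_j) / se


# Hypothetical exception counts over 47,000 model-days each (not the paper's data).
z = two_proportion_z(x_i=700, n_i=47_000, x_j=500, n_j=47_000)
print(f"Z = {z:.2f}; significant at 1% (two-sided): {abs(z) > 2.576}")
```

Because the statistic is antisymmetric in the two models, swapping the arguments only flips its sign.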

Comparison to GARCH (1, 1) Volatility Estimation Model
As stated earlier, the most common approach to tackling non-normality is to use a more sophisticated model for volatility. GARCH (1, 1) is most commonly employed to estimate the unconditional variance of returns (that is, the long-run volatility). For purposes of comparison to other analytical models of VaR estimation, we model the volatility of the returns process for all the stocks as GARCH (1, 1).¹¹ For GARCH (1, 1), we use an estimation window of 100 days and a backtesting window of 125 days, with information being updated daily. This implies that while the first estimate is based on 100 days of data, the next one is based on 101 days, and so on, until the last estimate is based on 376 days of data.
A visual analysis reveals that GARCH (1, 1) estimates of long-run volatility are more stable than the rolling window estimates, as expected.¹² As more information is incorporated into the GARCH model, the estimates become more precise. Comparison of rolling window and GARCH estimates shows that the numbers follow a similar trend. When rolling window estimates fall (rise), so do the GARCH ones, but at a slower rate, indicating greater stability.
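To make the notion of long-run volatility concrete, the sketch below shows the GARCH (1, 1) variance recursion and its unconditional variance for given parameters. It does not estimate the parameters: the values of `omega`, `arch_coef`, and `garch_coef` are hypothetical, and returns are assumed to have zero mean. In practice the parameters would be fitted to each expanding window by maximum likelihood.

```python
import math


def garch11_long_run_variance(omega, arch_coef, garch_coef):
    """Unconditional variance omega / (1 - arch_coef - garch_coef)."""
    persistence = arch_coef + garch_coef
    if omega <= 0 or not 0 <= persistence < 1:
        raise ValueError("need omega > 0 and 0 <= arch_coef + garch_coef < 1")
    return omega / (1.0 - persistence)


def garch11_next_variance(returns, omega, arch_coef, garch_coef):
    """Filter zero-mean returns through the GARCH(1,1) recursion and return
    the one-step-ahead conditional variance."""
    variance = garch11_long_run_variance(omega, arch_coef, garch_coef)  # start value
    for r in returns:
        variance = omega + arch_coef * r * r + garch_coef * variance
    return variance


# Hypothetical parameters (not estimated from the paper's data).
omega, arch_coef, garch_coef = 1e-6, 0.10, 0.80
long_run_vol = math.sqrt(garch11_long_run_variance(omega, arch_coef, garch_coef))
var_1pct = 2.33 * long_run_vol * 50_000  # zero-mean 1% VaR on a $50,000 position
print(long_run_vol, var_1pct)
```

The recursion pulls the conditional variance back towards the long-run level at a rate set by `arch_coef + garch_coef`, which is one way to see why the long-run estimate moves more slowly than a rolling-window estimate.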
Results of backtesting using conditional volatility estimates are presented in Table 4 (similar to those reported in Table 3). It shows that the violations for highly liquid stocks in the US market are lower when volatility is modeled using GARCH applied to transformed data (JDManly-GARCH) than when GARCH is applied to untransformed data (Normal-GARCH). The expected shortfalls are higher for the JDManly-GARCH model than for Normal-GARCH. For the less liquid stocks, the Normal-GARCH model yields a lower number of VaR violations than the JDManly-GARCH model. The expected shortfall is approximately the same for both models for PACL, but is higher for the Normal-GARCH model for BPCL.
Table 4. This exhibit shows the difference in proportions of exceptions observed, average daily value-at-risk, and average daily expected shortfall when VaR values are set after (i) volatility is modeled using GARCH (1, 1) on the normal return series; and (ii) volatility is modeled using GARCH (1, 1) on the data transformed using the Manly (1976) and John and Draper (1980) transforms. Three pairs of stocks are taken: one pair of highly liquid stocks from the recent normal period (KGC and DUK, in Panel A), one pair of less liquid stocks from the recent normal period (PACL and BPCL, in Panel B), and one pair of highly liquid stocks from a stressed period (POR and PAAS, in Panel C). Exceptions are assumed to follow a binomial distribution following Kupiec (1995). The estimation window has a length of 100 days and the backtesting window a length of 125 days. The superscript a denotes significance at the 1% level.

Economic Impact
There is a trade-off between exceptions and estimates of VaR when it is used as risk capital (reserves) to absorb losses. As is well known, the easiest way to avoid debilitating loss exceedances is to hold large amounts of capital. Yet idle capital carries an opportunity cost. Consequently, firms that employ VaR to set risk capital reserves try to balance the exceedances against the capital limits.
We demonstrate the economic impact of our methodology using the regulatory minimum capital requirement (MCR) guidance provided under Basel II for the portfolio of our stocks. Basel II requires minimum risk capital reserves to meet the market risk of a bank's asset holdings. Each bank must meet, on a daily basis, a capital requirement expressed as the higher of (1) its previous day's value-at-risk ($\mathrm{VaR}_{t-1}$); and (2) an average of the daily value-at-risk measures on each of the preceding sixty business days ($\mathrm{VaR}_{avg}$), multiplied by a multiplication factor ($m_b$). That is, at the beginning of day t,
$$\mathrm{MCR}_t = \max\left(\mathrm{VaR}_{t-1},\; m_b \times \mathrm{VaR}_{avg}\right), \qquad \mathrm{VaR}_{avg} = \frac{1}{60}\sum_{i=1}^{60} \mathrm{VaR}_{t-i}, \qquad (9)$$
where the multiplication factor varies between 3 and 4. It is clear from Equation (9) that the MCR is dependent upon the VaR estimates. Holding $m_b$ constant (= 4), we compute the average MCR under the three models over the sample period. The average MCR over the sample period (T days) is defined as
$$\overline{\mathrm{MCR}} = \frac{1}{T}\sum_{t=1}^{T} \mathrm{MCR}_t.$$
Since the first VaR estimate is available after 100 days of return data and another 60 days of VaR estimates are needed to calculate the MCR, a total of T = 440 trading days of data over the sample period is used for all models (excluding the last day of the sample period, since it is assumed that the MCR is set at the beginning of day t).
Table 5 reports the result of the computations. The average MCR is highest under M1, followed by M3 and M2, for both pairs of highly liquid stocks in the US market (i.e., in the normal period as well as in the stressed period, Panels 5.A and 5.C, respectively). The panels show that the transformation method yields significantly different MCR estimates that lie between the historical simulation estimates (high) and the normal distribution estimates (low). The same is not true for less liquid stocks. The MCR is lowest for M3 for both PACL and BPCL. For PACL, the MCR is highest for M2, and for BPCL, the MCR is highest for M1.
The differences in MCRs between the models for all the stocks are significant, except for the difference between M1 and M3 for PACL, and between M2 and M3 for BPCL, in Panel B.
Table 5. This exhibit shows the average MCRs observed when VaR values are set after (i) historical simulation (Model 1); (ii) assuming normality (Model 2); and (iii) transforming data using the Manly (1976) and John and Draper (1980) transforms (Model 3). Three pairs of stocks are taken: one pair of highly liquid stocks from the recent normal period (KGC and DUK, in Panel A), one pair of less liquid stocks from the recent normal period (PACL and BPCL, in Panel B), and one pair of highly liquid stocks from a stressed period (POR and PAAS, in Panel C). The average MCR is calculated as the average of daily MCR over 440 days in the sample. The superscript a denotes significance at the 1% level.
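The capital rule described in this section, the higher of the previous day's VaR and $m_b$ times the 60-day average VaR, can be sketched as follows. The function names and the VaR series in the usage lines are hypothetical and serve only to show the mechanics.

```python
def daily_mcr(var_series, m_b=4.0, lookback=60):
    """MCR at the start of each day t = lookback, ..., len(var_series) - 1,
    using only VaR values from days before t."""
    mcr = []
    for t in range(lookback, len(var_series)):
        var_prev = var_series[t - 1]
        var_avg = sum(var_series[t - lookback:t]) / lookback
        mcr.append(max(var_prev, m_b * var_avg))
    return mcr


def average_mcr(var_series, m_b=4.0, lookback=60):
    """Average of the daily MCR values over the days for which it is defined."""
    values = daily_mcr(var_series, m_b, lookback)
    if not values:
        raise ValueError("need more than `lookback` VaR values")
    return sum(values) / len(values)


# Hypothetical VaR path: flat at $2,000, then a jump to $3,000 for ten days.
var_path = [2000.0] * 80 + [3000.0] * 10
print(len(daily_mcr(var_path)), round(average_mcr(var_path), 2))
```

With `m_b` between 3 and 4, the averaged term dominates unless the previous day's VaR is several times its recent average. Under this indexing, a series of 500 daily VaR values yields 440 MCR values.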

Conclusions
This paper extends the model-building approach of VaR and suggests the application of two transforms, one for handling skewness and the other for kurtosis, which tend to restore normality to data when applied in succession. Inversion of these transforms enables one to compute analytical VaR, which is true to the normality assumption. The transforms are well defined and can be used in GARCH and other conditional volatility VaR models. We apply the technique to portfolios of uncorrelated stocks from two different markets that differ in liquidity, and two different periods, one of which is the stressed period of the 2008 crisis. We verify that VaR estimates based on the transformation approach are less likely to be exceeded by realized loss data for liquid stocks during stressed as well as normal periods, thus decreasing the undercapitalization of the portfolio in our example. The results continue to hold when volatility is modeled using GARCH.
The empirically demonstrated advantage of the transformation technique for liquid markets is that it leads to fewer exceedances than the normal distribution assumption, and yields a minimum capital requirement that lies between those estimated using historical simulation and the normality method. However, even with the application of the transformation technique, liquidity risk still needs to be accounted for separately, as is done in other methods of determining VaR.
Further research may explore combining liquidity correction methods with the transformational method and applying more sophisticated conditional volatility models, such as E-GARCH and M-GARCH, in conjunction. From a practical perspective, however, the method yields capital requirements that are neither too low nor too high, and the estimates may be utilized to shore up risk capital to absorb losses and lower exceedances, without a concomitant increase in idle capital that would impose an opportunity cost on investors.
Author Contributions: Conceptualization: P.P. Data curation, methodology implementation, and analysis: V.S. and K.S. Writing-review and editing: all three authors. All authors have read and agreed to the published version of the manuscript.