Do We Need Stochastic Volatility and Generalised Autoregressive Conditional Heteroscedasticity? Comparing Squared End-Of-Day Returns on FTSE

The paper examines the relative performance of Stochastic Volatility (SV) and Generalised Autoregressive Conditional Heteroscedasticity (GARCH) (1,1) models fitted to ten years of daily data for FTSE. As a benchmark, we used the realized volatility (RV) of FTSE sampled at 5 min intervals taken from the Oxford Man Realised Library. Both models demonstrated comparable performance and were correlated to a similar extent with RV estimates when measured by ordinary least squares (OLS). However, a crude variant of Corsi’s (2009) Heterogeneous Autoregressive (HAR) model, applied to squared demeaned daily returns on FTSE, appeared to predict the daily RV of FTSE better than either of the two models. Quantile regressions suggest that all three methods capture tail behaviour similarly and adequately. This leads to the question of whether we need either of the two standard volatility models if the simple expedient of using lagged squared demeaned daily returns provides a better RV predictor, at least in the context of the sample.


Introduction
The paper explores the performance of Stochastic Volatility (SV) and Generalised Autoregressive Conditional Heteroscedasticity (GARCH) (1,1) models as estimators of the volatility of the FTSE Index. The volatilities estimated by these models are compared with realised volatility estimates for FTSE, obtained from the Oxford Man Realised Library and sampled at 5 min intervals, as described in Heber et al. (2009). The models' volatility forecasts were further compared with those derived from a simple historical-volatility model.
We used the stochvol R package, which uses Markov chain Monte Carlo (MCMC) samplers, to conduct inference by obtaining draws from the posterior distribution of parameters and latent variables, which could then be used for predicting future volatilities. This is done within the context of a fully Bayesian implementation of heteroscedasticity modelling within the framework of stochastic volatility. For more information, see the discussion of the method by Kastner and Frühwirth-Schnatter (2014), and of the stochvol package by Kastner (2016). Taylor (1982) suggested probabilistically modelling volatility, in effect through a state-space model where the logarithm of the squared volatilities, the latent states, follows an autoregressive process of order one. Over time, this specification became known as the SV model. A series of early papers by Jacquier et al. (1994), Ghysels et al. (1996), and Kim et al. (1998) provided evidence in support of the application of stochastic-volatility models, but their practical use has been infrequent. The reasons for this were considered by Bos (2012), who suggested that the empirical application of SV models was limited by two major factors: the variety (and potential incompatibility) of estimation methods for SV models and the lack of standard software packages for implementing these methods. The situation for multivariate SV is even more problematic. Taylor (1994) provides a review of both stochastic volatility and the ARCH/GARCH literature. More recent reviews of the use of inference in the context of this literature were provided by McAleer (2005), and a review of multivariate stochastic-volatility models was provided by Asai et al. (2006). Granger and Poon (2003, p. 485), in their review of volatility-forecasting methods, noted the difficulties in the application of the SV model: "the SV model has no closed form, and hence cannot be estimated directly by maximum likelihood". 
They also noted that the quasi-maximum-likelihood method may be inefficient if the volatility proxies are non-Gaussian. The advantage of the stochvol R package is that it incorporates an efficient MCMC estimation scheme for SV models, as discussed by Kastner and Frühwirth-Schnatter (2014). This facilitated the analysis in this paper, which features a direct comparison of the volatility predictions of an SV model, a GARCH (1,1) model, and a simple application of a historical-volatility-based estimation method, as applied to the FTSE index.
The paper is divided into four sections: Section 2 reviews the literature and the econometric methods employed. Section 3 presents the results, and Section 4 concludes the paper.

Stochastic Volatility
There have been numerous empirical studies of changes in volatility in various stock, currency, and commodity markets. Findings in volatility research have implications for option pricing, volatility estimation, and the degree to which volatility shocks persist. These research questions have been approached by means of different models and methodologies. Taylor (1982) suggested a novel SV approach, and Taylor (1994) implemented the SV model as follows: if P_t denotes the price of an asset at time t, and assuming no dividend payments, returns on the asset can be defined, in the context of discrete time periods, as

X_t = ln(P_t) − ln(P_{t−1}).

Volatility is customarily indicated by σ, and prices are described by a stochastic differential equation

dP = µP dt + σP dW,

with W being a standard Wiener process. If µ and σ are constants, X_t has a normal distribution, and

X_t = µ + σU_t,

with µ here denoting the mean return and U_t being independent and identically distributed (i.i.d.) N(0, 1).
The returns equation can be generalised by replacing the constant σ with a positive random variable σ_t to give

X_t = µ + σ_t U_t,

where U_t ∼ N(0, 1). In circumstances where the returns process {X_t} can be represented by this equation, Taylor (1994) called σ_t the stochastic volatility for period t. His definition assumed that (X_t − µ)/σ_t follows a normal distribution.
The stochastic process {σ_t} generates realised volatilities {σ*_t} that, in general, are not observable. For any realisation σ*_t, the distribution of X_t conditional on σ*_t is N(µ, (σ*_t)^2). The mixture of these conditional normal distributions defines the unconditional distribution of X_t, which has excess kurtosis whenever σ_t has positive variance and is independent of U_t.
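The excess-kurtosis property of the mixture can be verified with a short simulation. The sketch below is in Python (the paper's own analysis used R), and the lognormal choice for σ_t is purely an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Constant-volatility benchmark: X_t = mu + sigma * U_t, U_t ~ i.i.d. N(0, 1)
mu, sigma = 0.0, 1.0
x_const = mu + sigma * rng.standard_normal(n)

# Stochastic volatility: replace sigma with a positive random variable sigma_t
# (here lognormal, drawn independently of U_t), giving X_t = mu + sigma_t * U_t
sigma_t = np.exp(0.5 * rng.standard_normal(n))
x_sv = mu + sigma_t * rng.standard_normal(n)

def excess_kurtosis(x):
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

print(excess_kurtosis(x_const))  # close to zero for the Gaussian benchmark
print(excess_kurtosis(x_sv))     # clearly positive: the mixture has fat tails
```

Each conditional distribution of x_sv is normal, yet the unconditional distribution is a scale mixture of normals and therefore has fatter tails than the Gaussian benchmark.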
In the empirical section that follows, we used the RV of the FTSE sampled at 5 min intervals, provided by Oxford Man, as a proxy for true realised volatility. We then compared the estimates of volatility obtained from the SV and GARCH (1,1) models by using the RV estimates as a benchmark. Taylor (1994, p. 3) suggested using "capital letters to represent random variables and lowercase letters to represent outcomes", and we follow that convention. Given the observed returns I_{t−1} = {x_1, x_2, ..., x_{t−1}}, the conditional variance for period t is

h_t = var(X_t | I_{t−1}).

Taylor (1994) notes that the random variable H_t, which generates the observed conditional variance h_t, is not, in general, equal to σ_t^2. A convenient way to use economic theory to motivate changes in volatility is to assume that returns are generated by a number of intraperiod price revisions, in the manner of Clark (1973) and Tauchen and Pitts (1983), to mention two of many studies.
It is assumed that there are N_t price revisions during trading day t, each caused by unpredictable information. Let event i on day t change the logarithmic price by ω_it, with

X_t − µ = Σ_{i=1..N_t} ω_it.

If we assume that the ω_it are i.i.d., independent of the random variable N_t, with ω_it ∼ N(0, σ_ω^2), then

X_t | N_t ∼ N(µ, N_t σ_ω^2), so that σ_t^2 = N_t σ_ω^2.

The above model suggests that squared volatility is proportional to the amount of price information. The lack of standard software to estimate such a model is addressed by Kastner and Frühwirth-Schnatter (2014), who proposed an efficient MCMC estimation scheme that is implemented in the R stochvol package (Kastner (2016)). Kastner (2016, p. 2) proceeded by "letting y = (y_1, y_2, ..., y_n)⊤ be a vector of returns with mean zero. An intrinsic feature of the SV model is that each observation y_t is assumed to have its 'own' contemporaneous variance, e^{h_t}, which relaxes the usual assumption of homoscedasticity". It was assumed that the logarithm of this variance follows an autoregressive process of order one. This assumption is fundamentally different from GARCH models, where time-varying conditional volatility is assumed to follow a deterministic rather than stochastic evolution.
The centered parameterization of the SV model can be given as

y_t | h_t ∼ N(0, e^{h_t}),
h_t | h_{t−1}, µ, φ, σ_η ∼ N(µ + φ(h_{t−1} − µ), σ_η^2),
h_0 | µ, φ, σ_η ∼ N(µ, σ_η^2/(1 − φ^2)),

where N(µ, σ_η^2) denotes a normal distribution with mean µ and variance σ_η^2. θ = (µ, φ, σ_η)⊤ is the vector of parameters, consisting of the level of log variance µ, the persistence of log variance φ, and the volatility of log variance σ_η. The process h = (h_0, h_1, ..., h_n)⊤ is the unobserved, or latent, time-varying volatility process. Kastner (2016, p. 4) remarked that: "A novel and crucial feature of the algorithm implemented in stochvol is the usage of a variant of the "ancillarity-sufficiency interweaving strategy" (ASIS) which has been brought forward in the general context of state-space models by Yu and Meng (2011). ASIS exploits the fact that, for certain parameter constellations, sampling efficiency improves substantially when considering a non-centered version of a state-space model".
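The centered parameterization can be simulated directly, which is useful for building intuition about the latent process. The Python sketch below uses purely illustrative parameter values (µ = −9, φ = 0.95, σ_η = 0.2 are assumptions, not estimates from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters for the centered SV parameterization
mu, phi, sigma_eta = -9.0, 0.95, 0.2
n = 100_000

# h_0 ~ N(mu, sigma_eta^2 / (1 - phi^2)), the stationary distribution
eta = rng.standard_normal(n)
h = np.empty(n)
h[0] = mu + sigma_eta / np.sqrt(1 - phi ** 2) * eta[0]
for t in range(1, n):
    # h_t | h_{t-1} ~ N(mu + phi * (h_{t-1} - mu), sigma_eta^2)
    h[t] = mu + phi * (h[t - 1] - mu) + sigma_eta * eta[t]

# y_t | h_t ~ N(0, exp(h_t)): each observation has its 'own' variance
y = np.exp(h / 2) * rng.standard_normal(n)

print(h.mean())  # close to mu
print(h.var())   # close to sigma_eta^2 / (1 - phi^2)
```

Because the log variance is itself random, the conditional variance of y_t evolves stochastically, in contrast to the deterministic GARCH recursion.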
Another key feature of the algorithm used in stochvol is the joint sampling of all instantaneous volatilities "all without a loop" (AWOL), a technique with links to Rue (2001) that is discussed in McCausland et al. (2011). The combination of these features enables the stochvol R package to efficiently estimate SV models even when large datasets are involved. Engle (1982) developed the Autoregressive Conditional Heteroscedasticity (ARCH) model, which incorporates past squared error terms. It was generalised to GARCH by Bollerslev (1986) to include lagged conditional-volatility terms. In other words, GARCH predicts that the best indicator of future variance is a weighted average of the long-run variance, the predicted variance for the current period, and any new information in this period, as captured by the squared residuals.

ARCH and GARCH
The framework was developed as follows: consider a time series y_t = E_{t−1}(y_t) + ε_t, where E_{t−1}(y_t) is the conditional expectation of y_t at time t − 1, and ε_t is the error term. The basic GARCH(p, q) model has the following specification:

h_t = ω + Σ_{j=1..p} α_j ε_{t−j}^2 + Σ_{j=1..q} β_j h_{t−j},

in which ω > 0, α_j ≥ 0, and β_j ≥ 0 (usually positive fractions), to ensure a positive conditional variance, h_t > 0 (see Tsay (1987)). The ARCH effect is captured by the parameters α_j, which represent the short-run persistence of shocks to returns; the β_j capture the GARCH effect, which contributes to long-run persistence; and α_j + β_j measures the overall persistence of shocks to returns. A GARCH (1,1) process is weakly stationary if α_1 + β_1 < 1 (see the discussion in Allen et al. (2013)). We contrasted the estimates of volatility from the SV model with those from a GARCH (1,1) model, and assessed which better explained the behaviour of the RV of FTSE sampled at 5 min intervals.
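The GARCH(1,1) recursion and its long-run variance ω/(1 − α − β) can be illustrated with a short simulation (a Python sketch with assumed parameter values, not the model fitted in the paper):

```python
import numpy as np

rng = np.random.default_rng(7)

# GARCH(1,1) sketch with assumed (illustrative) parameters:
#   h_t = omega + alpha * eps_{t-1}^2 + beta * h_{t-1}
omega, alpha, beta = 0.25, 0.10, 0.80   # alpha + beta < 1 => weakly stationary
n = 200_000

z = rng.standard_normal(n)
h = np.empty(n)
eps = np.empty(n)
h[0] = omega / (1 - alpha - beta)       # start at the unconditional variance
eps[0] = np.sqrt(h[0]) * z[0]
for t in range(1, n):
    h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
    eps[t] = np.sqrt(h[t]) * z[t]

# Long-run (unconditional) variance implied by the recursion
long_run = omega / (1 - alpha - beta)
print(long_run)   # 2.5 with these parameters
print(eps.var())  # the sample variance should be close to long_run
```

Unlike the SV model, h_t here is a deterministic function of past observations; given yesterday's residual and variance, today's conditional variance is known exactly.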

Realised Volatility
We used the 5 min RV estimates from Oxford Man for FTSE as the RV benchmark (see: https://realized.oxford-man.ox.ac.uk/data). Their database contains "daily (close-to-close) financial returns, and a corresponding sequence of daily realised measures rm_1, rm_2, ..., rm_T. Realised measures are theoretically sound, high-frequency, nonparametric-based estimators of the variation of the price path of an asset during the times in which the asset trades frequently on an exchange. Realised measures ignore the overnight variation of prices, and sometimes the variation in the first few minutes of the trading day, when recorded prices may contain large errors". The metrics were developed by Andersen et al. (2001), Andersen et al. (2003), and Barndorff-Nielsen and Shephard (2002). Shephard and Sheppard (2010) provide an account of RV measures used in the Oxford Man Realised Library.
The simplest realised metric is realised variance (RV):

RV_t = Σ_j x_{j,t}^2, where x_{j,t} = X_{t_{j,t}} − X_{t_{j−1,t}},

and the t_{j,t} are the times of trades or quotes on the t-th day. The theoretical justification of this measure is that, if prices are observed without noise, then, as min_j |t_{j,t} − t_{j−1,t}| ↓ 0, it consistently estimates the quadratic variation of the price process on the t-th day. If sampling is carried out over very small time intervals, market-microstructure noise may become a contaminant. To avoid this issue, we used RV estimates from Oxford Man sampled at 5 min intervals, hereafter RV5.
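A minimal simulation illustrates the consistency argument: when noise-free log prices are sampled ever more finely, the sum of squared intraday returns approaches the day's integrated variance. All values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate one trading day of noise-free log prices (one observation per
# second over 6.5 hours) and compare realised variance at two sampling
# frequencies with the true integrated variance. All values are illustrative.
true_daily_var = 1e-4
n_fine = 23_400
increments = rng.standard_normal(n_fine) * np.sqrt(true_daily_var / n_fine)
log_price = np.cumsum(increments)

def realised_variance(x, step):
    """Sum of squared intraday returns sampled every `step` observations."""
    sampled = x[::step]
    return float(np.sum(np.diff(sampled) ** 2))

rv_5min = realised_variance(log_price, 300)  # 5 min = 300 one-second steps
rv_fine = realised_variance(log_price, 1)

print(rv_5min, rv_fine)  # both should be near true_daily_var
```

In practice prices are not observed without noise, which is precisely why the 5 min sampling frequency, rather than the finest available grid, is used for RV5.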

Historical-Volatility Model
Poon and Granger (2005) discussed various practical issues encountered in attempts to forecast volatility. They suggested that the HISVOL model has the following form:

σ̂_t = φ_1 σ_{t−1} + φ_2 σ_{t−2} + ... + φ_n σ_{t−n},

where σ̂_t is the expected standard deviation at time t, the φ_i are weight parameters, and the σ_{t−i} are the historical standard deviations for the periods indicated by the subscripts. Poon and Granger (2005) suggested that this group of models includes the random walk, historical averages, autoregressive (fractionally integrated) moving average, and various forms of exponential smoothing that depend on the weight parameters.
We used a simple form of this model in which the estimate of σ is the previous day's demeaned squared return. Poon and Granger (2005) noted that, in a review of 66 previous studies, implied standard deviations appeared to perform best, followed by historical volatility and GARCH, which had roughly equal performance. They also noted that, at the time of writing, there were insufficient studies of SV models to come to any conclusions about this class of models. This motivates the current study, which assesses the performance of all three classes of models. Barndorff-Nielsen and Shephard (2003) pointed out that taking the sums of squares of increments of log prices has a long tradition in the financial economics literature. Early examples are Poterba and Summers (1986), Schwert (1989), Taylor and Xu (1997), Christensen and Prabhala (1998), Dacorogna et al. (1998), and Andersen et al. (2001). Shephard and Sheppard (2010, p. 200, footnote 4) noted that: "Of course, the most basic realised measure is the squared daily return". We utilised this approach as the basis of our historical-volatility model.
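The idea behind our historical-volatility proxy, yesterday's squared demeaned return standing in for today's variance, can be sketched as follows (Python, with a simulated slowly varying volatility path; not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(5)

# Sketch of the historical-volatility proxy: yesterday's squared demeaned
# return as today's variance estimate. The sinusoidal volatility path is an
# illustrative assumption, not a model of FTSE.
n = 5000
sigma = 1.0 + 0.5 * np.sin(np.linspace(0.0, 20.0 * np.pi, n))
returns = sigma * rng.standard_normal(n)

demeaned_sq = (returns - returns.mean()) ** 2
proxy = demeaned_sq[:-1]     # lag 1: yesterday's squared demeaned return
target = sigma[1:] ** 2      # today's true variance

corr = np.corrcoef(proxy, target)[0, 1]
print(corr)  # positive: the crude proxy tracks the true variance, noisily
```

The proxy is unbiased for the conditional variance but very noisy day by day, which is why averaging over several lags, as in the HAR-style specification below, tends to help.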

Heterogeneous Autoregressive Model (HAR)
Corsi (2009, p. 174) suggested "an additive cascade model of volatility components defined over different time periods", in which the volatility cascade leads to a simple AR-type model in realized volatility that considers volatility components realized over different time horizons; he termed it the Heterogeneous Autoregressive model of Realized Volatility (HAR-RV). Corsi (2009) suggested that the model reproduces the main empirical features of financial returns (long memory, fat tails, and self-similarity) in a parsimonious way. He wrote his model as

σ_{t+1d}^{(d)} = c + β^{(d)} RV_t^{(d)} + β^{(w)} RV_t^{(w)} + β^{(m)} RV_t^{(m)} + ω_{t+1d},

where σ_{t+1d}^{(d)} is the daily integrated volatility, and RV_t^{(d)}, RV_t^{(w)}, and RV_t^{(m)} are the daily, weekly, and monthly (ex post) observed realized volatilities, respectively.
Corsi's (2009) model inspired our HISVOL specification, which follows its structure of lagged historical volatility estimates but uses lags of squared demeaned daily returns in place of lagged RV estimates.
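A HAR-style regression of the kind that inspired our specification can be sketched as follows. As in our variant, the daily, weekly (5-day), and monthly (22-day) components are built from squared demeaned daily returns rather than intraday RV; the data and volatility path are simulated assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(11)

# HAR-style regression in the spirit of Corsi (2009), built here from squared
# demeaned daily returns. Sinusoidal volatility path is illustrative only.
n = 3000
sigma = 1.0 + 0.5 * np.sin(np.linspace(0.0, 12.0 * np.pi, n))
r = sigma * rng.standard_normal(n)
sq = (r - r.mean()) ** 2                   # squared demeaned daily returns

def trailing_mean(x, k):
    # element j is the mean of x[j : j + k]
    return np.convolve(x, np.ones(k) / k, mode="valid")

d = sq[21:-1]                              # daily component (lag 1)
w = trailing_mean(sq, 5)[17:-1]            # weekly component (5-day mean)
m = trailing_mean(sq, 22)[:-1]             # monthly component (22-day mean)
y = sq[22:]                                # next day's squared return

X = np.column_stack([np.ones_like(d), d, w, m])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # HAR coefficients: intercept, daily, weekly, monthly
```

The three trailing means are aligned so that each regressor uses only information available before day t, mirroring the cascade of horizons in Corsi's model.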

Quantile Regression
We used the RV5 estimates as a benchmark to assess how the SV and GARCH(1,1) models performed. We were interested in behaviour in the extreme tails as well as in the centre of the distribution, and therefore used quantile regression to assess behaviour in the tails. Koenker and Hallock (2001, p. 145) provided an introduction to quantile regression, and noted that "quantiles seem inseparably linked to the operations of ordering and sorting the sample observations that are usually used to define them". They added: "The symmetry of the piecewise linear absolute value function implies that the minimization of the sum of absolute residuals must equate the number of positive and negative residuals, thus assuring that there are the same number of observations above and below the median".
They asked: what about the other quantiles? As the symmetry of the absolute value yields the median, it follows that minimising a sum of asymmetrically weighted absolute residuals, obtained by giving differing weights to positive and negative residuals, yields the other quantiles. The solution to

min_{ξ∈R} Σ_i ρ_τ(y_i − ξ),

where ρ_τ(·) is the tilted absolute value function shown in Figure 1, gives the τth sample quantile. An estimate of the conditional median function can be obtained by replacing the scalar ξ with a parametric function ξ(x_i, β) and setting τ to 1/2. Estimates of the other conditional quantile functions can be obtained by solving, for each τ, the linear-programming problem

min_{β∈R^p} Σ_i ρ_τ(y_i − ξ(x_i, β)).

We applied quantile regression to investigate the relationship between the RV5 estimates for the FTSE Index and the risk values predicted by the SV and GARCH(1,1) models.

Preliminary Analysis
The sample dataset consisted of approximately 10 years of daily adjusted continuously compounded returns for FTSE, from 24 April 2009 through to 16 April 2019, sourced from Yahoo Finance via the R quantmod library. We retained the last 20 observations of the series for forecasting purposes, leaving a total of 2454 observations. A matching set of daily RV5 estimates for FTSE was obtained from the Oxford Man Realised Library. Summary statistics for the two series are provided in Table 1, and plots of the two series in Figure 2.
FTSE has a mean daily return of 0.02% and a standard deviation of 0.97%. It has positive excess kurtosis and does not conform to a Gaussian distribution, as can be seen from the QQ plot in Figure 3. RV5 has a mean of 8.6261 × 10⁻⁵ and a standard deviation of 0.00016097. However, RV5 is measured as a variance; taking its square root and multiplying by 100 puts it on a common scale with the FTSE returns, a transformation we apply in some of the comparison plots in subsequent figures. RV5 also has very high skewness and kurtosis, which is evident in the QQ plots in Figure 3.
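The rescaling can be made explicit. This short sketch uses the summary values quoted above, and notes why the mean of the transformed series reported later (in Table 3) is lower than the square root of the mean RV5:

```python
import numpy as np

# RV5 is a daily variance, while returns are in percent; the square root of
# RV5 times 100 puts the two on a common scale. Using the mean from Table 1:
rv5_mean = 8.6261e-5
sqrt_of_mean = 100 * np.sqrt(rv5_mean)
print(sqrt_of_mean)  # about 0.93, the same order as the 0.97% return s.d.

# Note: the mean of the transformed series sqrt(RV5) * 100 (SQRV5L, 0.81989
# in Table 3) is lower than the square root of the mean, since the square
# root is concave: E[sqrt(X)] <= sqrt(E[X]) by Jensen's inequality.
```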

SV and GARCH Estimates
We used the R library stochvol to fit a stochastic-volatility model to the FTSE return series, assuming Gaussian distributions. Some of the initial parameters for the SV model estimation are shown in Table 2. The SV model applied to FTSE produced the volatility estimates shown in Figure 4, while Figure 5 displays kernel-density estimates of the posterior densities of the parameters in θ. Figure 4 plots the estimated volatilities produced by the SV model at the 5%, 50%, and 95% posterior quantiles.
We also estimated a GARCH (1,1) model to obtain conditional volatilities; the coefficients are reported in Table 4. The diagnostic tests suggested that there were no ARCH effects in the residuals, but rejected the null hypothesis that the residuals conform to a normal distribution. Plots of the volatilities obtained from both the SV and GARCH (1,1) estimates are shown in Figure 6.

Table 2 summarises 1000 Markov chain Monte Carlo (MCMC) draws after a burn-in of 1000; the prior on µ was normal with mean 0 and standard deviation 100.
Table 3 confirms that the SV estimates were slightly lower than the GARCH(1,1) estimates. The mean SV estimate was 0.87152, while the mean GARCH(1,1) volatility estimate was 0.92405. This difference is significant, as revealed by a nonparametric sign test, the results of which are available from the authors on request. The interquartile range for the SV model was 0.38324, while that for the GARCH(1,1) was slightly smaller, at 0.32905. This is reflected in the standard deviations of the estimates: 0.32600 for the SV model and 0.30490 for the GARCH(1,1). (Details of the GARCH model fitted are provided in Table 4.)
The third panel in Table 3 provides summary statistics for SQRV5L, the square root of the realised volatility series RV5 scaled by 100, which transforms it into the same dimension as the two conditional-volatility series. Its mean, 0.81989, was somewhat lower than those of the two model-based series, but its standard deviation was larger, at 0.43644, and its interquartile range was 0.44676. Excess kurtosis was much greater, with a value of 29.966, compared with excess kurtosis of 1.9519 and 3.3669 for the SV and GARCH(1,1) models, respectively. These statistics suggest that the model-free estimate of volatility provided by RV5 was more volatile and had larger spikes than the conditional volatilities produced by the two models.
We also estimated some regressions to explore the linear correlations between the estimates of the two volatility models and our base RV5 volatility estimates. To keep the values in the same dimension, we used the SQRV5L values. Results are shown in Table 5. Using the RV5 values, in the form of SQRV5L, as the benchmark dependent variable, the conditional volatilities from the SV model had an adjusted R-squared of 0.193842 and a highly significant slope coefficient of 0.589930, with a t-statistic of 24.31. The GARCH(1,1) model performed slightly better, with an adjusted R-squared of 0.202789 and a slope coefficient of 0.645116, with a t-statistic of 25. The Durbin-Watson statistic for both models was around 1, suggesting a borderline problem of serial correlation in the residuals.
As a further cross-check of the effectiveness of the two models, we used a further crude estimate of volatility, the demeaned squared daily returns on FTSE, in the context of a HISVOL model motivated by Corsi's (2009) Heterogeneous Autoregressive model of Realized Volatility (HAR-RV). In the third panel of Table 5, we exploited the fact that the squared demeaned returns on the FTSE are highly persistent, or have long memory, and regressed the square root of RV5 scaled by 100 on 28 lags of the squared demeaned FTSE return, also scaled by 100.
Table 5. Regression analysis of three volatility models: SV, GARCH(1,1), and lagged demeaned squared returns.
Plots of the volatility series are shown in Figure 7. In the upper panel of Figure 7, SQRV5 is plotted on a different scale from the two conditional volatilities so that the series are not superimposed on each other. In the lower panel, the correspondence between RV5 and the crude measure of demeaned squared daily returns on the FTSE is evident. Figure 8 provides a plot of actual versus fitted values when SQRV5L was regressed on 28 lags of LSQDMFTSE; again, the effectiveness of the model is apparent. Figure 9 shows the predicted values of the conditional volatilities from the SV and GARCH models, indexed on the left-hand side of the figure, and the lagged values of DMSQFTSERET plus the actual RV5 value for the day, indexed on the right-hand side. If the latter two variables are rescaled, it is apparent that they track each other more closely than do the predictions of the SV and GARCH models, even for the first observation in the graph. A potential issue was whether the relationship was significant in the quantiles, as this matters for risk measurement. In Table 6, we report the results of estimating quantile regressions for the same series.

Quantile-Regression Results
The results of the quantile regression of SQRV5L on the lags of the three explanatory series are shown in Table 6. All three quantile regressions used SQRV5L as the dependent variable, adopting the model-free RV5 estimate as the benchmark. We adopted tau values of 0.05, 0.25, 0.50, 0.75, and 0.95. Our concern was whether there was evidence of significant relationships across quantiles. All t-statistics were significant at the 1% level across all quantiles. In the case of the third panel in Table 6, we used only one lag of the squared demeaned FTSE return, rather than the full model, to avoid an excessive number of entries in the tables. Even so, the coefficients in all quantiles were significant at the 1% level.
Plots of these relationships are provided in Figure 10. The horizontal blue lines in the panels of Figure 10 depict the least-squares regression coefficients, and the dotted lines above and below them depict the error bands. The black line links the coefficients estimated by quantile regression, and the grey area around the line shows the error bands. An interesting feature of the quantile-regression coefficients is that all three black lines in Figure 10 have positive slopes. This is consistent with more elastic responses in the higher quantiles of RV5 volatility for all three predictors: SV, GARCH(1,1), and squared demeaned returns on FTSE. The results suggest that none of these models fully captured the volatility peaks of RV5 in the higher quantiles.

Rolling-Regression Analysis
As a further check, we ran rolling regressions of each of the three daily volatility estimates, from the STOCHVOL, GARCH (1,1), and HISVOL models, as the explanatory variable for the realised RV5 estimates, using window sizes of 60 and 500 days. These window sizes were chosen to approximate three months and two years of daily returns. Graphs of the coefficient estimates from these three regressions, with error bands of plus and minus two standard deviations, are shown in Appendix A.
These plots suggest that, when we used the longer windows of 500 observations, the three models were significant around 50% of the time, in that the error bands did not span zero. None of the three models performed well in the 60-observation windows, where the error bands spanned zero for the majority of the period considered. The HISVOL model performed particularly poorly in these short windows.
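The rolling-regression logic can be sketched as follows; the window sizes mirror the paper's 60- and 500-day choices, while the data and the true slope are simulated assumptions chosen so that the contrast between window lengths is visible:

```python
import numpy as np

rng = np.random.default_rng(21)

# Rolling-regression sketch: regress a noisy target on a predictor in fixed
# windows and record the slope and its standard error. The assumed true
# slope of 0.25 and the simulated data are illustrative only.
n = 1500
x = rng.standard_normal(n)
y = 0.25 * x + rng.standard_normal(n)

def rolling_slopes(y, x, window):
    out = []
    for start in range(len(y) - window + 1):
        xs, ys = x[start:start + window], y[start:start + window]
        xc, yc = xs - xs.mean(), ys - ys.mean()
        slope = (xc @ yc) / (xc @ xc)
        resid = yc - slope * xc
        se = np.sqrt((resid @ resid) / (window - 2) / (xc @ xc))
        out.append((slope, se))
    return np.array(out)

for window in (60, 500):
    s = rolling_slopes(y, x, window)
    frac = np.mean(np.abs(s[:, 0]) > 2 * s[:, 1])
    print(window, frac)  # longer windows: bands span zero far less often
```

The pattern matches the qualitative finding above: with short windows the slope estimate is too noisy for its two-standard-error band to exclude zero, even when a genuine relationship is present.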

Conclusions
The paper examined the effectiveness of SV and GARCH(1,1) models in explaining model-free estimates of FTSE volatility, using RV sampled at 5 min intervals, as provided by the Oxford Man Institute's Realised Library, as a benchmark. To provide a further contrast, we also used lags of squared demeaned daily returns on FTSE as a simple alternative estimate of daily volatility. The effectiveness of these three methods was explored via ordinary-least-squares (OLS) and quantile-regression analysis. Poon and Granger (2005) provided motivation in their analysis of 66 studies of this topic, in which they noted that, at that time, there were too few SV studies to permit a comparison with the GARCH and HISVOL models. Our intention was to address this sparsity in the literature.
We used vanilla estimates of the SV and GARCH(1,1) models, adopting Gaussian distributions. Measured by adjusted R-squared, GARCH(1,1) appeared to produce slightly better OLS predictions than the SV model. However, both performed relatively poorly compared with the simple expedient of using squared demeaned daily returns on FTSE to predict RV5 volatility. Our results support Poon and Granger (2005), in that neither the GARCH nor the SV model outperformed a simple form of HISVOL model on this FTSE sample when RV sampled at 5 min intervals was used as a benchmark. However, all three models performed poorly within the short 60-observation windows in the rolling regressions. This suggests that the HISVOL model performs better over a longer observation window and is likely to be of more practical application over longer holding periods.

Figure A2. Rolling-regression window size 60.