Multiscale Stochastic Volatility Model with Heavy Tails and Leverage Effects

This paper studies multiscale stochastic volatility models of financial asset returns. It specifies two components in the log-volatility process and allows for leverage/asymmetric effects from both components while return innovation terms follow a heavy/fat tailed Student t distribution. The two components are shown to be important in capturing persistent dependence in return volatility, which is often absent in applications of stochastic volatility models which incorporate leverage/asymmetric effects. The models are applied to asset returns from a foreign currency market and an equity market. The model fits are assessed, and the proposed models are shown to compare favorably to the one-component asymmetric stochastic volatility models with Gaussian and Student t distributed innovation terms.


Introduction
There is a large volume of studies on the volatility of financial asset returns in which the volatility of the returns is assumed to be governed by a stochastic process. Since the initial work of Taylor (1986), stochastic volatility (SV) models have been subjected to much research in financial econometrics. The main feature of a canonical SV model is that the logarithm of the conditional volatility of asset returns is generated by a latent/unobserved autoregressive (AR) process. Its noise/innovation terms are drawn from a univariate Gaussian distribution, and the innovation terms of the asset return process themselves are also drawn from a standard Gaussian distribution. To incorporate a heavy/fat tail property of the marginal distribution of the asset returns in the model, a Student t distribution is often assumed for the innovation terms of the asset return equation.
As the SV models have a hierarchical structure, parameter estimation of the models has been found to be challenging. The generalized method of moments (GMM), the simulated method of moments (SMM), the efficient method of moments (EMM), the empirical characteristic function (ECF) method, and importance sampling methods, among others, have been introduced in the literature to circumvent this difficulty. Bayesian Monte Carlo, in particular Markov chain Monte Carlo (MCMC), has also been proposed in the literature as an estimation approach for the SV models. In addition to offering computational flexibility, the MCMC method also allows investigators to incorporate prior information about the parameters of a model formally. This additional component of the MCMC method has proven to be quite attractive to investigators working with more complex SV models. For a further review of this approach, see, for instance, Chib et al. (2009) and Lopes and Polson (2010). Since then, various univariate extensions of the SV models within the MCMC framework have been explored, for instance, more recently in Men and Wirjanto (2018).
In the meantime, extensions of the SV models also took shape on another front with the introduction of multivariate SV (MSV) models, starting with Harvey et al. (1994) and subsequently followed by a number of studies, which include Lopes and Polson (2010); Aguilar and West (2000); Chib et al. (2002).
Still another direction of the extension of the SV models emerged with the introduction of multifactor models which incorporate multiscaling features. The principal idea of this approach is that univariate series are driven by several factors that vary at different time scales, as in the studies by LeBaron (2001); Alizadeh et al. (2002); Chernov (2003). Within this stream of the literature, Molina et al. (2010) proposed a multiscale SV (MSSV) model to capture different scales of the logarithm of the conditional volatilities of asset returns. In this model, the conditional volatility is driven by equally weighted factors where each factor is driven by a first-order autoregressive (AR(1)) process. The innovation of the return process and the component latent AR(1) processes are assumed to be uncorrelated and follow a univariate Gaussian distribution.
Interestingly, LeBaron (2001) argued that two-factor stochastic volatility models exhibit heavy-/fat-tailed return distributions, which are an empirical feature of many asset returns. Given this observation and given that the marginal distributions of asset returns often appear to have heavy/fat tails, we extend the MSSV model by assuming a Student t distribution for the innovation of the mean equation, from which the heavy/fat tail of the asset returns can be adequately captured. We refer to this extended multiscale volatility model as an asymmetric MSSV (MSASV) model. This represents the first contribution of the paper to the literature.
Our second contribution to the literature is to assume that the innovation terms of the mean equation of the model are correlated nontrivially with the innovation terms of the latent/unobserved volatility factor process. In this paper, the correlation structure is introduced specifically to accommodate the asymmetric/leverage effect that has been observed to be present between asset returns and future volatilities. The third contribution of this paper to the literature consists of developing suitable MCMC algorithms for the inference of the MSASV models. It is also worth pointing out that the MCMC method developed in this paper is different from that used in Molina et al. (2010), where the authors utilized the method advocated in Harvey and Shephard (1996) by taking the logarithm of the squared measurement equation. In this paper, we specify a posterior distribution of the latent states directly, and the states are then simulated via a Metropolis-Hastings (MH) method where the proposal distribution is sampled by a method known as the slice sampler.
Lastly, our fourth contribution to the literature lies in the use of an auxiliary particle filter (APF) for the purpose of carrying out both a one-step-ahead in-sample (or training-sample-based) volatility prediction and an out-of-sample (or test-sample-based) volatility prediction for the fitted MSASV models.
The remaining parts of the paper are organized as follows. Section 2 reviews the SV model and presents the MSSV and MSASV models. The MSASV model is extended to incorporate the heavy/fat tails of the marginal distribution of the returns. Specifically, we assume that the innovations of the return time series have a Student t distribution. In addition, we also introduce a nontrivial correlation structure between the innovation terms of the mean equation and the innovation terms of the latent/unobserved factor (i.e., volatility) process in the model. Section 3 presents novel MCMC algorithms for model inference. Simulation studies are conducted in Section 4 to show the ability of the proposed MCMC algorithms to recover the true parameters of the model. Empirical applications are then provided in Section 5 to illustrate the performance of our model and algorithms based on asset return data sets from both the foreign currency market and the equity market, and Section 6 concludes the paper.

The SV Model
A canonical SV model studied in the literature is a one-component (or factor) SV model, where the conditional volatility of the asset returns is assumed to have been generated by a latent/unobserved AR(1) process. The multiscale SV (MSSV) model proposed by Molina et al. (2010) is a direct extension of this one-component SV model. For this reason we first review the one-component SV model briefly.
As mentioned earlier, the SV model was proposed by Taylor (1986) to incorporate time-varying volatility of the returns. Define by $y_t$ the asset return at time $t$. Then the dynamics of $y_t$ is given by:

$$y_t = \exp(h_t/2)\,\epsilon_t, \quad t = 1, \ldots, T, \qquad (1)$$
$$h_{t+1} = \phi h_t + \sigma \eta_{t+1}, \qquad (2)$$
$$h_0 \sim N\left(0, \sigma^2/(1-\phi^2)\right), \qquad (3)$$

where the $\eta_t$ are statistically independent random noise terms, such that $\eta_t \sim N(0, 1)$. It is also assumed that the $\epsilon_t$ are statistically independent with a common univariate Gaussian distribution $N(0, \delta^2)$, and the innovation terms, $\epsilon_t$ and $\eta_t$, are statistically independent of each other. In addition we also impose the condition that $|\phi| < 1$ in order to ensure that the latent/unobserved AR(1) process is second-order stationary. As the SV model is hierarchical and the mean equation defined in (1) is highly nonlinear, its likelihood function does not possess a closed-form representation, and it is intractable to integrate out the $T$ latent/unobserved volatility states from this likelihood function. Faced with this difficulty, MCMC methods have been proposed to estimate the parameters of the SV models.

The MSSV and MSASV Models
The MSSV model, proposed by Molina et al. (2010), is a direct extension of the one-component SV model. In this model, the $y_t$ process is determined by multiple additive latent/unobserved volatilities as factors. The model is defined as:

$$y_t = \exp(\mathbf{1}' h_t/2)\,\epsilon_t, \qquad (4)$$
$$h_{t+1} = \Phi h_t + \Sigma^{1/2} \eta_{t+1}, \qquad (5)$$
$$h_0 \sim N(\mathbf{0}, \Omega), \qquad (6)$$

where the innovation terms, $\epsilon_t$ and $\eta_{t+1}$, $t = 1, 2, \ldots$, are assumed to be statistically independent of each other, $\eta_t = (\eta_{1,t}, \ldots, \eta_{K,t})'$ is a vector of multivariate Gaussian variates such that $\eta_{t+1} \sim N(\mathbf{0}, I_K)$, where $\mathbf{0}$ is a $K$-dimensional vector of zeros, $I_K$ is a $K \times K$ identity matrix, and the $\epsilon_t$'s are statistically independent of each other with a common univariate Gaussian distribution, denoted as $N(0, \delta^2)$. In (4)-(6), $h_t = (h_{1,t}, \ldots, h_{K,t})'$ is a vector of $K$ latent/unobserved volatility states at time $t$, and $\mathbf{1}$ denotes a $K$-dimensional vector of ones. The innovation terms of the latent/unobserved volatility process $h_t$, $t = 1, 2, \ldots$, are also statistically independent of each other; that is, $\Sigma$ is a $K \times K$ diagonal matrix with the $k$-th diagonal element being given by $\sigma_k^2$, with $\sigma_k^2 > 0$, and $\Phi$ is a $K \times K$ diagonal matrix containing the mean reversion parameters $\phi_k$, such that $|\phi_k| < 1$, for $k = 1, \ldots, K$. The covariance matrix of the initial latent/unobserved volatility vector $h_0$ is given by the implied second-order stationary, marginal covariance matrix $\Omega$ of the latent volatility process, which, in turn, satisfies the condition that $\Omega = \Phi \Omega \Phi + \Sigma$. Note that in (4)-(6), if we set $\phi_i = 0$ for some $i$, the implied model reduces to a model which contains a permanent (log-normal) source of independent jumps in the volatility series, as $h_{i,t}$ would then be a (temporally statistically independent) Gaussian process that is added to the (log-)variance process driving the return series.
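The stationarity condition on $\Omega$ can be solved explicitly when $\Phi$ and $\Sigma$ are diagonal. The short derivation below is a worked illustration based only on the definitions in this paragraph; the element notation $\Omega_{jk}$ is introduced here for convenience.

```latex
% Element-wise form of \Omega = \Phi \Omega \Phi + \Sigma with
% \Phi = diag(\phi_1, ..., \phi_K) and \Sigma = diag(\sigma_1^2, ..., \sigma_K^2):
\begin{align*}
\Omega_{jk} &= \phi_j \phi_k \,\Omega_{jk} + \Sigma_{jk}, \\
j \neq k:\quad (1 - \phi_j \phi_k)\,\Omega_{jk} &= 0
  \;\Longrightarrow\; \Omega_{jk} = 0 \quad (\text{since } |\phi_j \phi_k| < 1), \\
j = k:\quad \Omega_{kk} &= \frac{\sigma_k^2}{1 - \phi_k^2}.
\end{align*}
```

Under these assumptions each component starts from its own stationary distribution, $h_{k,0} \sim N\left(0, \sigma_k^2/(1-\phi_k^2)\right)$, which is the univariate AR(1) stationary variance applied component by component.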
As pointed out by Molina et al. (2010), the model in (4)-(6) can be motivated as a discrete-time approximation to the underlying continuous-time SV models, where the volatility is an exponential function of a sum of multiple Ornstein-Uhlenbeck processes with the mean reverting processes varying on well-separated time scales. Its model representation and the ensuing discussion of the model are relegated to Appendix A. Alternatively the model stated in (4)-(6) can also be viewed as arising from the fact that the SV models allow for superposition of latent volatilities, where the total volatility is the sum of individual component volatilities. See, inter alia, Roberts et al. (2004) and Griffin and Steel (2010) for this particular set up. For this reason the model in (4)-(6) is sometimes also referred to as a multi-component (or multi-factor) SV model.
Following Molina et al. (2010) we impose the condition that $\phi_1 > \ldots > \phi_K$ in order to ensure that the MSSV model is identifiable. Under this restriction, all of the components of the latent/unobserved process in (5) are ensured to evolve on different time scales. Note also that we exclude a location parameter from this process as the innovation terms, $\epsilon_t$, in the model possess a non-unit variance.
The original MSSV model studied in Molina et al. (2010) does not allow for correlation between the innovation terms, $\epsilon_t$ and $\eta_{t+1}$. In the equity markets asset returns have been shown to have a negative correlation with the logarithms of their conditional volatilities. In this paper we incorporate a nontrivial correlation structure between the innovation terms of the mean equation and the innovation terms of the latent volatility component processes. In principle we can also allow for a correlation structure among the innovation terms of the latent/unobserved AR(1) processes. However, in order to keep the development of the MCMC algorithm reasonably simple, and also to ensure identifiability of the model, we do not entertain this possibility in this paper. Another important observation pertaining to the asset returns is the heavy/fat tail property of the marginal distribution of the returns, which is often captured by assuming that the innovation terms of the mean equation follow a Student t distribution. Accordingly we assume that $\epsilon_t \sim t(v)$ with $v$ degrees of freedom. The MSASV model with the Student t distributional assumption for the innovation terms of the mean equation is called an MSASV-t model. To simplify the derivation for the proposed MCMC algorithm we reparametrize the latent/unobserved AR(1) process of the MSASV model as in (7), where $e_{k,t}$, $k = 1, \ldots, K$, are independent univariate standard normal noises, $\psi_k = \sigma_k \rho_k$ and $\tau_k = \sigma_k \sqrt{1 - \rho_k^2}$, $k = 1, \ldots, K$. This reparametrized form highlights the nontrivial correlation structure we have introduced in the model between the innovation terms of the mean equation and the innovation terms of the latent factor processes, which is defined as is conventional in the one-component SV literature and interpreted as a leverage/asymmetric effect.
However, as mentioned earlier, in this paper we do not allow for a non-trivial correlation structure among the latent innovation terms for reason of computational tractability and to ensure model identifiability. Given (7), instead of sampling ρ k and σ k , k = 1, ..., K, we sample ψ k and τ k , k = 1, ..., K, and then proceed backwards to obtain samples of ρ k and σ k .
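To make the link between $(\rho_k, \sigma_k)$ and $(\psi_k, \tau_k)$ concrete, the following is a sketch of one decomposition that is consistent with the definitions of $\psi_k$ and $\tau_k$ given above. It assumes that $\rho_k$ is the correlation between the standardized return innovation $\epsilon_t/\delta$ and $\eta_{k,t+1}$; the scaling by $\delta$ and the time index on $e_{k,t}$ are assumptions made for illustration and may differ from the form used in (7).

```latex
% Assumed decomposition of the volatility innovation (illustrative only):
\begin{align*}
\eta_{k,t+1} &= \rho_k \,\frac{\epsilon_t}{\delta} + \sqrt{1-\rho_k^2}\; e_{k,t},
  \qquad e_{k,t} \sim N(0,1) \text{ independent of } \epsilon_t, \\
\sigma_k \eta_{k,t+1} &= \psi_k \,\frac{\epsilon_t}{\delta} + \tau_k\, e_{k,t},
  \qquad \psi_k = \sigma_k \rho_k, \quad \tau_k = \sigma_k \sqrt{1-\rho_k^2}. \\
\intertext{The back-transformation follows directly from the definitions of $\psi_k$ and $\tau_k$:}
\sigma_k^2 &= \psi_k^2 + \tau_k^2, \qquad
\rho_k = \frac{\psi_k}{\sqrt{\psi_k^2 + \tau_k^2}}.
\end{align*}
```

The back-transformation in the last line does not depend on the assumed scaling, so each MCMC draw of $(\psi_k, \tau_k)$ maps to a unique pair $(\rho_k, \sigma_k)$ with $\sigma_k > 0$ and $|\rho_k| < 1$ whenever $\tau_k > 0$.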
We complete the specification of the MSASV and the MSASV-t models by incorporating explicit prior distributions for the models' parameters. For simplicity we assume that all prior distributions of the parameters of both multiscale SV models are statistically independent of each other. To impose a second-order stationarity condition on the latent/unobserved volatility processes we specify the prior distributions for $\phi_1$ and $\phi_2$ to be $N(0, 10)$, truncated to the interval $(-1, 1)$. These prior distributions give rise to relatively flat densities over their support regions. In the MCMC algorithm we sample $\sigma_i^2$ instead of $\sigma_i$, $i = 1, 2$, by using an inverse Gamma distribution $IG(5, 0.05)$. As to the prior distribution of $v$ we adopt a half-Cauchy prior. As part of the implementation of the MCMC algorithm, we augment the vector of parameters with the latent/unobserved volatility states $h$ and estimate them as a by-product of the process.

Estimation of the MSASV Model
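We first present an outline of the MCMC algorithm in Table 1.

Table 1. Outline of the MCMC algorithm.
Step 0. Initialize h, φ k , σ k , k = 1, 2, and δ.
Step 1. Sample h.
Step 2. Sample φ k , k = 1, 2.
Steps 3, 4, 5. Sample ψ k and τ k , k = 1, 2, and δ.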
Then we provide an additional explanation for this algorithm as follows.
Step 0. Initialize φ k , ψ k , σ k , k = 1, 2, and δ by using the relevant prior distributions. To determine the initial value of the vector h we set the parameters of the latent volatility process as φ k = 0.5, σ k = 0.12, k = 1, 2, v = 0.5 and γ = 0.5. Then we generate the initial value of h by using the definition (5)-(6) of the process.
Step 1. Sample h. We carry out the simulation by adopting a single-move acceptance-rejection algorithm.
We only state the full conditionals of h t , t = 2, ..., T − 1. The full conditionals of h 1 and h T are relatively straightforward to derive and therefore they are not presented here.
The full conditional of h 1,t is given in (9) and is bounded by the dominating distribution in (10), where c 1t and c 2t represent two normalizing constants. The inequality in (10) holds because the last two parts of the right-hand side of Equation (9) are constrained to be less than unity. It is also worth pointing out that both the full conditional distribution (9) and the dominating distribution in (10) are non-standard; as a result we are unable to simulate from them directly. Instead we use the MH method to sample the full conditional distribution (9). We note that the proposal distribution of the MH algorithm is critically important for the performance of the simulation. Notably Chib and Greenberg (1995) lament that choosing a good proposal density is like searching for the proverbial needle in a haystack. In general a proposal density can be obtained by means of an approximation of the underlying full conditional (see Jacquier et al. (1994, 2004)) or by selecting a standard Gaussian density (see Kim et al. (1998) and Zhang and King (2008)).
As is well known in the literature, the critical aspect of MCMC in fitting an SV model is the sampling quality of the full conditionals of the augmented parameters, which are the log volatilities, h. The contribution of this paper to the literature lies in the development of the MH method to sample the full conditional distribution (9), where the proposal distribution is the dominating distribution in (10), which can be sampled by the slice sampler proposed by Neal (2003). The efficiency of the slice sampler has been studied by authors such as Roberts and Rosenthal (1999) and Mira and Tierney (2002). In particular Roberts and Rosenthal (1999) show that, under certain sufficient conditions, the slice algorithm is quite robust and has geometric ergodicity properties. Mira and Tierney (2002) point out that the slice sampler has a smaller second-largest eigenvalue, which allows for a faster convergence to the underlying distribution.

Algorithm of the slice sampler for h 1,t
It is straightforward to show that the right-hand side of (10) can be expressed as: If y t = 0, then we have:
2. Draw u 3 uniformly from the (0, 1) interval. Let Then we have:
3. If y t = 0, draw h 1,t uniformly from the interval, which is determined by the inequalities stated in (11) and (12) as:

We note that the method of single-move simulation is widely used in the SV literature; notable examples include Jacquier et al. (1994, 2004); Yu et al. (2006); Zhang and King (2008). One important advantage of the slice sampler is that each iteration gives us a point from the underlying distribution; in contrast, in the MH algorithm, many generated points have to be discarded.
Step 2. Sampling φ k , k = 1, 2. Given the conjugate prior distribution, the full conditional is proportional to the product of a univariate Gaussian distribution and a positive function. As a result we can sample this full conditional by the slice sampler.
Steps 3, 4, 5. Sampling parameters ψ k and τ k , k = 1, 2, and δ. As the priors for these parameters are conjugate, the full conditionals are Gaussian and inverse Gamma distributions, respectively. We can easily simulate from these full conditionals. Therefore, we omit the presentation of these formulas from the text and, instead, refer readers to Kim et al. (1998) for a full description of them.
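The slice sampler invoked in Steps 1 and 2 can be illustrated with a generic univariate version. The sketch below follows the stepping-out and shrinkage procedure of Neal (2003) for an arbitrary unnormalized log density; it is not the specialized interval construction based on (11) and (12). The function names, the `width` parameter, and the example target `log_sv_target` (a Gaussian-return likelihood kernel times a Gaussian kernel) are hypothetical and chosen only for illustration.

```python
import math
import random


def slice_sample(log_f, x0, n_draws, width=1.0, rng=None):
    """Univariate slice sampler with stepping-out and shrinkage (Neal, 2003).

    log_f : log of an unnormalized density whose slices are bounded
    x0    : starting point with finite log_f(x0)
    """
    rng = rng or random.Random()
    x = x0
    draws = []
    for _ in range(n_draws):
        # Auxiliary level: log(u * f(x)) with u uniform on (0, 1].
        log_level = log_f(x) + math.log(1.0 - rng.random())
        # Stepping out: place an interval of size `width` at random around x
        # and expand each end until it lies outside the slice.
        left = x - width * rng.random()
        right = left + width
        while log_f(left) > log_level:
            left -= width
        while log_f(right) > log_level:
            right += width
        # Shrinkage: draw uniformly on the interval; on rejection, shrink
        # the interval towards the current point and try again.
        while True:
            cand = left + (right - left) * rng.random()
            if log_f(cand) >= log_level:
                x = cand
                break
            if cand < x:
                left = cand
            else:
                right = cand
        draws.append(x)
    return draws


def log_sv_target(h, y=0.8, delta=1.0, mean=0.0, sd=1.0):
    """Hypothetical SV-type log target for one log-volatility state.

    Gaussian return kernel for y = exp(h/2) * eps, eps ~ N(0, delta^2),
    times a N(mean, sd^2) kernel for h. NOT the paper's (9) or (10).
    """
    return (-0.5 * h
            - 0.5 * (y / delta) ** 2 * math.exp(-h)
            - 0.5 * ((h - mean) / sd) ** 2)


if __name__ == "__main__":
    sample = slice_sample(log_sv_target, 0.0, 2000, rng=random.Random(0))
    print(sum(sample) / len(sample))
```

Every iteration of this sampler returns a new point without an accept/reject step on the chain itself, which is the property highlighted in the discussion of Step 1.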

Estimation of the MSASV-t Model
Sampling the latent/unobserved states h k,t , t = 1, ..., T − 1. The simulation of h k,1 and h k,T follows similar steps. The full conditional of h 1,t , t = 2, ..., T − 1, and that of the degrees-of-freedom parameter v are stated in the equations ending with (18), where f (v) is a prior density of v. In the literature there are a number of ways in which to specify this prior distribution. Jacquier et al. (2004) propose a discrete prior distribution U [3, 40] from which the full conditional is sampled directly from a multinomial distribution. Geweke (1993) suggests α exp(−αv) with α = 0.2 as an alternative, while Zhang and King (2008) choose a Gaussian distribution v ∼ N (20, 25). Bauwens and Lubrano (1998) use a Cauchy prior proportional to 1/(1 + v 2 ). In this paper we adopt a Gaussian prior. Since this full conditional is a non-standard distribution, we rely on a random-walk MH algorithm, in which the proposal density is a standard Gaussian density and the acceptance probability is computed by using Equation (18).
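The random-walk MH update for v described above can be sketched as follows. This is a minimal illustration under stated assumptions: the residuals passed in are taken to be i.i.d. Student t(v) with unit scale (the paper's Equation (18) may include further terms), the prior enters through a user-supplied `log_prior` function, and the names `student_t_loglik`, `rw_mh_v`, and `step` are hypothetical.

```python
import math
import random


def student_t_loglik(v, resid):
    """Log-likelihood of i.i.d. Student t(v) residuals with unit scale."""
    n = len(resid)
    const = (math.lgamma((v + 1.0) / 2.0) - math.lgamma(v / 2.0)
             - 0.5 * math.log(v * math.pi))
    return n * const - (v + 1.0) / 2.0 * sum(
        math.log1p(e * e / v) for e in resid)


def rw_mh_v(resid, log_prior, v0, n_iter, step=1.0, rng=None):
    """Random-walk MH for v with a Gaussian proposal v' = v + step * z."""
    rng = rng or random.Random()
    v = v0
    log_post = student_t_loglik(v, resid) + log_prior(v)
    draws, accepted = [], 0
    for _ in range(n_iter):
        v_prop = v + step * rng.gauss(0.0, 1.0)
        if v_prop > 0.0:  # proposals outside the support are rejected
            log_post_prop = student_t_loglik(v_prop, resid) + log_prior(v_prop)
            # Symmetric proposal: accept with probability
            # min(1, posterior ratio).
            if math.log(1.0 - rng.random()) < log_post_prop - log_post:
                v, log_post = v_prop, log_post_prop
                accepted += 1
        draws.append(v)
    return draws, accepted / n_iter


if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated t(5) residuals: normal divided by sqrt(chi-square / df).
    resid = [rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(2.5, 2.0) / 5.0)
             for _ in range(300)]
    # Example prior kernel proportional to 1 / (1 + v^2) on v > 0.
    draws, rate = rw_mh_v(resid, lambda v: -math.log1p(v * v), 10.0, 2000,
                          rng=rng)
    print(sum(draws[1000:]) / 1000, rate)
```

Passing the prior as a function makes it straightforward to compare the alternatives listed above, for example the exponential, Gaussian, or Cauchy-type specifications, without changing the sampler.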

Auxiliary Particle Filter
Model comparison of fitted SV models can be carried out by evaluating the model's likelihood. However, for the MSASV and the MSASV-t models proposed in this paper, the likelihood is intractable to derive in an analytical form because of its highly non-linear structure. As a result, to carry out this task, we resort to the auxiliary particle filter (APF) method introduced by Pitt and Shephard (1999). This is an efficient recursive algorithm which approximates the filtering and the one-step-ahead predictive distributions of the latent/unobserved states of the model. By successive conditioning steps, we can express the sample likelihood of the multiscale SV model as:

$$f(y_1, \ldots, y_T \mid \theta) = f(y_1 \mid \theta) \prod_{t=1}^{T-1} f(y_{t+1} \mid \theta, I_t), \qquad (19)$$

where $I_t = \{y_t, \ldots, y_1\}$ represents the information known at time $t$. The conditional density of $y_{t+1}$ given $\theta$ and $I_t$ has the following representation:

$$f(y_{t+1} \mid \theta, I_t) = \int f(y_{t+1} \mid h_{t+1}, \theta)\, f(h_{t+1} \mid I_t, \theta)\, dh_{t+1}. \qquad (20)$$

Consider a particle sample $\{h_t^{(i)}\}_{i=1}^{N}$ from the filtering distribution of $h_t$ given $I_t$ and $\theta$. Given this particle sample, we can express the one-step-ahead approximation of the predictive density of $h_{t+1}$ as:

$$f(h_{t+1} \mid I_t, \theta) \approx \frac{1}{N} \sum_{i=1}^{N} f(h_{t+1} \mid h_t^{(i)}, \theta). \qquad (21)$$

If we denote the sample drawn from this distribution by $h_{t+1}^{(i)}$, $i = 1, \ldots, N$, then the conditional density function (20) can be approximated as:

$$f(y_{t+1} \mid \theta, I_t) \approx \frac{1}{N} \sum_{i=1}^{N} f(y_{t+1} \mid h_{t+1}^{(i)}, \theta). \qquad (22)$$

However, for the approximation (21) to be feasible the predictive density function of $h_{t+1}$ must be known to the investigator. Fortunately this condition is met in the context of the models studied in this paper, as the assumed form of the latent/unobserved volatility process implies that $h_{t+1}$ conditional on $h_t$ has a bivariate Gaussian distribution given by $N(\Phi h_t, \Sigma)$ with $\Phi = \mathrm{diag}(\phi_1, \phi_2)$ and $\Sigma = \mathrm{diag}(\sigma_1^2, \sigma_2^2)$. This fact is also used when we perform the one-step-ahead predictions of the volatility. We omit the presentation of the APF procedure in calculating (20) and (22), and, instead, we refer readers to Chib et al. (2002, 2006) for this.
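The particle approximation of the likelihood can be illustrated with a plain bootstrap particle filter. This is deliberately a simpler technique than the APF of Pitt and Shephard (1999) and is swapped in here only to show the structure of the recursion: propagate particles through the Gaussian transition $N(\Phi h_t, \Sigma)$, average the return density over the particles, and resample. The sketch assumes Gaussian return innovations with standard deviation `delta` and no leverage term in the transition; the function names are hypothetical.

```python
import math
import random


def log_norm_pdf(x, sd):
    """Log density of N(0, sd^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * (x / sd) ** 2


def bootstrap_loglik(y, phi, sigma, delta, n_particles=500, rng=None):
    """Bootstrap particle filter estimate of log f(y_1, ..., y_T | theta).

    phi, sigma : lists with one entry per volatility component
    delta      : standard deviation of the return innovations
    """
    rng = rng or random.Random()
    K = len(phi)
    # Initial particles from the stationary distribution of each component
    # (h_1 is stationary whenever h_0 is).
    parts = [[rng.gauss(0.0, sigma[k] / math.sqrt(1.0 - phi[k] ** 2))
              for k in range(K)] for _ in range(n_particles)]
    loglik = 0.0
    for t, y_t in enumerate(y):
        if t > 0:
            # Propagate each particle through h_{t+1} ~ N(Phi h_t, Sigma).
            parts = [[phi[k] * h[k] + sigma[k] * rng.gauss(0.0, 1.0)
                      for k in range(K)] for h in parts]
        # Weight by the return density f(y_t | h_t), in logs for stability.
        log_w = [log_norm_pdf(y_t, delta * math.exp(0.5 * sum(h)))
                 for h in parts]
        max_lw = max(log_w)
        w = [math.exp(lw - max_lw) for lw in log_w]
        # Average of the particle densities approximates f(y_t | I_{t-1}).
        loglik += max_lw + math.log(sum(w) / n_particles)
        # Multinomial resampling gives an equally weighted filtering sample.
        parts = rng.choices(parts, weights=w, k=n_particles)
    return loglik


if __name__ == "__main__":
    y = [0.5, -1.0, 0.3, 0.8, -0.2]
    print(bootstrap_loglik(y, [0.98, 0.15], [0.1, 0.5], 0.6,
                           rng=random.Random(0)))
```

The bootstrap filter proposes particles without looking at the next observation; the APF improves on this by pre-selecting particles that are likely to be compatible with $y_{t+1}$, which typically reduces the variance of the likelihood estimate.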

Diagnostics
There are a number of diagnostic tools in statistics that can be utilized to assess the goodness-of-fit of the MSASV and MSASV-t models. One of them is the Kolmogorov-Smirnov (KS) test. The KS test is designed to assess whether the realized observation errors of a model actually originate from the assumed distribution. Another approach is to use the method of probability integral transforms (PITs) introduced by Diebold et al. (1998).
To discuss the PITs, suppose that $\{f(y_t \mid I_{t-1})\}_{t=1}^{T}$ is a sequence of conditional densities of $y_t$ given the information $I_{t-1}$ available to the investigator at time $t-1$, and $\{p(y_t \mid I_{t-1})\}_{t=1}^{T}$ is the corresponding sequence of one-step-ahead density forecasts. The PIT corresponding to an observed value of $y_t$ is given by:

$$u(t) = \int_{-\infty}^{y_t} p(z \mid I_{t-1})\, dz.$$

Under the null hypothesis that the sequence $\{p(y_t \mid I_{t-1})\}_{t=1}^{T}$ coincides with $\{f(y_t \mid I_{t-1})\}_{t=1}^{T}$, the sequence $\{u(t)\}_{t=1}^{T}$ corresponds to independent and identically distributed (i.i.d.) observations from the uniform distribution on the (0, 1) interval.
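The PIT-based check can be sketched in a few lines. The example below assumes, purely for illustration, that each one-step-ahead density forecast is Gaussian with mean zero and a known standard deviation, so that the PIT is a normal CDF evaluation; in the models of this paper the forecast density would instead be a particle average. The KS p-value uses the asymptotic Kolmogorov series, which is an approximation. All function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pits_gaussian(y, forecast_sd):
    """PITs u(t) when the forecast density of y_t is N(0, forecast_sd[t]^2)."""
    return [norm_cdf(y_t / sd_t) for y_t, sd_t in zip(y, forecast_sd)]


def ks_uniform(u):
    """KS statistic of u against U(0, 1) and its asymptotic p-value."""
    n = len(u)
    ordered = sorted(u)
    # Largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(ordered))
    lam = math.sqrt(n) * d
    # Asymptotic Kolmogorov series: 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lam^2).
    p, j = 0.0, 1
    while j <= 10000:
        term = math.exp(-2.0 * j * j * lam * lam)
        p += 2.0 * (-1) ** (j - 1) * term
        if term < 1e-12:
            break
        j += 1
    return d, min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    sds = [math.exp(0.5 * rng.gauss(0.0, 0.5)) for _ in range(1000)]
    ys = [rng.gauss(0.0, sd) for sd in sds]
    d_stat, p_value = ks_uniform(pits_gaussian(ys, sds))
    print(d_stat, p_value)
```

A large p-value means that uniformity of the PITs is not rejected; as noted above, uniformity and independence together are what the null hypothesis implies, so the KS test addresses only the first of the two.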

Model Selection
There are also a number of ways in which to carry out selection of fitted models. The Akaike information criterion (AIC) introduced by Akaike (1987) and the Bayesian information criterion (BIC) introduced by Schwarz (1978) are the two criteria most commonly used in the literature to discriminate between different versions of the fitted SV models. However, it is important to point out that both the AIC and BIC require knowledge of the exact number of independent parameters in the fitted model. This requirement is not satisfied in the estimation approach adopted in this paper, since the latent/unobserved volatility states are augmented as parameters in the Bayesian framework. Because these states are usually found to be highly correlated with each other, it is not appropriate to treat them as independent parameters. This represents a serious impediment to using either the AIC or the BIC for model selection in the context of the fitted MSASV and MSASV-t models. Motivated by this concern, we consider an alternative criterion for model comparison, called the deviance information criterion (DIC). The DIC was proposed by Spiegelhalter et al. (2002) and has proven to be particularly useful for hierarchical models such as the SV models considered in this paper. Notably Berg et al. (2004) have used this criterion for model comparison of a number of fitted one-component SV models.
The DIC is defined as:

$$\mathrm{DIC} = \bar{D} + P_D.$$

The first term, $\bar{D}$, is a Bayesian measure of model fit. It is defined as the posterior mean of the deviance:

$$\bar{D} = E_{\theta \mid y}[D(\theta)],$$

where $D(\theta) = -2 \log f(y \mid \theta)$. Larger values of $\bar{D}$ signify a deterioration in the fit of the model. The second term, $P_D$, is defined as

$$P_D = \bar{D} - D(\bar{\theta}),$$

where $D(\bar{\theta})$ is the deviance evaluated at the posterior mean $\bar{\theta}$ of $\theta$. It captures the complexity of the model. In other words, $P_D$ is the difference between the posterior mean of the deviance and the deviance under the posterior mean of $\theta$. The larger the value of $P_D$, the easier it is for the model to fit the data. The term $P_D$ is called the effective number of parameters. Since the likelihood is analytically intractable in the case of the MSASV models, to compute the DIC we resort to numerical methods to evaluate $\bar{D}$ and $D(\bar{\theta})$ instead. Li et al. (2014) have raised concerns about the use of the DIC in discriminating between one-factor SV models. For this reason, in this paper, we utilize the MCMC outputs in our calculation of $\bar{D}$ and $D(\bar{\theta})$. As the true value of $\theta$ is unknown, we instead use the Bayesian estimate $\bar{\theta}$ of $\theta$ in our calculation of the DIC.
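Given MCMC output, the DIC arithmetic is short. The sketch below assumes the deviance has already been evaluated at each posterior draw and at the posterior mean (for the MSASV models this would require a numerical likelihood approximation such as a particle filter). The helper `gaussian_deviance` is a hypothetical toy likelihood used only to make the example self-contained.

```python
import math


def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, D_bar, P_D) from deviances evaluated on MCMC output."""
    d_bar = sum(deviance_draws) / len(deviance_draws)  # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean           # effective no. of parameters
    return d_bar + p_d, d_bar, p_d


def gaussian_deviance(y, mu, sd):
    """Toy deviance D(theta) = -2 log f(y | mu, sd) for i.i.d. N(mu, sd^2) data."""
    loglik = sum(-0.5 * math.log(2.0 * math.pi) - math.log(sd)
                 - 0.5 * ((v - mu) / sd) ** 2 for v in y)
    return -2.0 * loglik


if __name__ == "__main__":
    y = [0.2, -0.4, 1.1, 0.3]
    mu_draws = [0.1, 0.3, 0.5, 0.2, 0.4]  # hypothetical posterior draws of mu
    dev_draws = [gaussian_deviance(y, m, 1.0) for m in mu_draws]
    mu_bar = sum(mu_draws) / len(mu_draws)
    print(dic(dev_draws, gaussian_deviance(y, mu_bar, 1.0)))
```

When two fitted models are compared on the same data, the one with the smaller DIC is preferred.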

Simulation Studies
In this section we present and discuss the simulation results for the MSASV model where the innovation terms of the asset-return equation are assumed to follow a univariate Gaussian distribution. As the simulations for the MSASV-t models produce qualitatively very similar results, we do not include them in this section for the sake of brevity. Once the model has been fitted, we can use the KS test to assess whether the fitted model agrees well with the simulated asset return series. Specifically, for a given $\theta$, the volatility states $h$ are generated from the latent AR(1) processes of the model and the asset-return time series $y$ from the mean equation, where $h_{k,0} \sim N\left(0, \sigma_k^2/(1-\phi_k^2)\right)$, $y_t = e^{(h_{1,t}+h_{2,t})/2}\,\epsilon_t$ and $\epsilon_t \sim N(0, \delta^2)$. The parameter values used to generate asset returns are presented in the second column of Table 2. For the results in this table we generate 12,000 observations from the MSASV model, in which the first 10,000 observations are fitted by the MSASV model and the remaining 2000 observations are used for comparison with the one-step-ahead out-of-sample predicted asset volatilities. We iterate the estimation algorithm 200,000 times and discard the initial 100,000 sampled points as the burn-in before we draw inference from the results. In Table 2 we present the estimated parameters together with the Bayesian highest probability density (HPD) intervals and standard deviations. We observe from the table that the estimated parameter values are fairly close to the corresponding true values for the model. Next we assess the overall fit of the model by analyzing the PITs from the fitted MSASV model. The uniform distribution of $u(t)$ on the (0, 1) interval is on display in Figure 1 by means of both the scatter plot and the histogram. The two horizontal lines in the histogram plot represent the 95% Bayesian confidence bands, the details of whose calculation can be found in Diebold et al. (1998). The KS test statistic is calculated at 0.0091 with a corresponding p-value of 0.3805.
Thus we cannot reject the null hypothesis that the PITs are uniformly distributed over the (0, 1) interval at any conventional significance level. In Figure 2 the empirical cumulative distribution function (CDF) of the PITs is plotted together with the theoretical CDF of the uniform distribution U(0, 1). The graph reaffirms our earlier claim that the fitted MSASV model agrees well with the simulated return series. From the above comparisons and the result of the KS test we can also draw the conclusion that the proposed MCMC method for the MSASV model fits the simulated return data remarkably well. Once the MSASV model has been estimated, we can use the fitted model to perform both the in-sample and out-of-sample one-step-ahead volatility predictions. In Figure 3 we compare the absolute value of the simulated returns with the estimated and one-step-ahead in-sample and out-of-sample predicted volatilities, where the latter are separated by a vertical dotted line at t = 10,000. We note that the forecasted volatilities resemble very closely the true time series of the absolute value of the simulated returns. Moreover the time series of the estimated two components also compares extremely favorably with the absolute value of the simulated returns on display in Figure 4.

Figure 4. Posterior mean of (h 1t + h 2t )/2 (second panel), posterior mean of the slow mean-reverting component (h 1t )/2 (third panel), and posterior mean of the fast mean-reverting component (h 2t )/2 (fourth panel), based on the simulated return data.
Overall the simulation studies in this section show that the proposed MSASV model and its MCMC algorithm work very well in terms of parameter estimation of the model and are able to capture the two components that shape the dynamics of the volatility of the simulated returns adequately.
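The data-generating step of the simulation study can be sketched as follows. The code simulates a two-component model with Gaussian return innovations and introduces leverage by letting each volatility innovation load on the standardized return innovation $\epsilon_t/\delta$ with weight $\rho_k$; this timing and scaling are assumptions made for illustration. The parameter values in the usage example are hypothetical and are not those of Table 2.

```python
import math
import random


def simulate_msasv(T, phi, sigma, rho, delta, rng=None):
    """Simulate returns y_1..y_T and log-volatility states h_1..h_T.

    phi, sigma, rho : lists with one entry per volatility component
    delta           : standard deviation of the return innovations
    """
    rng = rng or random.Random()
    K = len(phi)
    # h_0 from the stationary distribution N(0, sigma_k^2 / (1 - phi_k^2)).
    h = [rng.gauss(0.0, sigma[k] / math.sqrt(1.0 - phi[k] ** 2))
         for k in range(K)]
    # h_1: no earlier return innovation exists, so eta_1 is drawn independently.
    h = [phi[k] * h[k] + sigma[k] * rng.gauss(0.0, 1.0) for k in range(K)]
    ys, hs = [], []
    for _ in range(T):
        eps = rng.gauss(0.0, delta)
        ys.append(math.exp(0.5 * sum(h)) * eps)  # y_t = exp(sum_k h_{k,t}/2) eps_t
        hs.append(h)
        z = eps / delta  # standardized return innovation
        # Assumed leverage form: eta_{k,t+1} = rho_k z + sqrt(1 - rho_k^2) e_k.
        h = [phi[k] * h[k]
             + sigma[k] * (rho[k] * z
                           + math.sqrt(1.0 - rho[k] ** 2) * rng.gauss(0.0, 1.0))
             for k in range(K)]
    return ys, hs


if __name__ == "__main__":
    # Hypothetical parameter values: one slow and one fast component.
    ys, hs = simulate_msasv(12000, [0.98, 0.15], [0.1, 0.5], [-0.3, -0.1], 0.6,
                            rng=random.Random(42))
    print(len(ys), max(abs(v) for v in ys))
```

Splitting the simulated series into an estimation window and a hold-out window, as done above with 10,000 and 2000 observations, then only requires slicing `ys`.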

The MSASV Model
In this section we apply the proposed MSASV model and its MCMC algorithm to two classic data sets of asset returns, one originating from the foreign exchange market and the other from the equity market. The first data set consists of 945 observations on the daily pound/dollar exchange rate from 1 October 1981 to 28 June 1985, called EXC hereafter. The use of this data set allows us to compare our results with those presented in Molina et al. (2010), who use returns from the foreign currency markets. This particular data set from the exchange market has been analyzed in well-known studies such as Harvey et al. (1994); Shephard and Pitt (1997); Meyer and Yu (2000); Skaug and Yu (2008); Yu (2011). Since there are not many observations contained in this data set, we fit all of the available observations by the proposed MSASV model and compare only in-sample predicted volatilities with the estimated volatilities and the absolute observed returns. The second data set includes the daily returns of the Australian All Ordinaries stock index, called AUX. The data set contains 1508 observations from 2 January 2000 to 30 December 2005, excluding weekends and holidays. For comparison purposes the first 1400 observations are fitted by the proposed MSASV model, and the remaining 108 observations are used for comparison with the estimated and predicted volatilities. Table 3 lists the estimated parameters of the MSASV model fitted to the EXC data. Bayesian HPD intervals with standard deviations are also presented in this table. With relatively small standard errors, the HPD intervals contain the parameter estimates of the model. It is worth noting that the leverage/asymmetric effect for the second component is estimated with a sign opposite to the expected one. Moreover the leverage/asymmetric effect in both components is estimated very imprecisely.
This is broadly consistent with the previous findings in the literature on the one-component SV models, which show that the leverage/asymmetric effect is not a prominent feature of the returns in the foreign currency markets. Our estimates of φ 1 and φ 2 are close to those for the MSSV model reported by Mira and Tierney (2002), which are 0.988 and 0.149 respectively. Importantly the estimation result for φ 1 and φ 2 in Table 3 points to a distinct advantage of the MSASV model vis-a-vis the one-component ASV model for analyzing return data even when the estimate for the second component (0.1477) is relatively small in magnitude. In particular the MSASV model allows for a better identification of the slower mean-reverting volatility component. We can see this in the results, which show a highly persistent slow mean-reverting component (the estimate of φ 1 is 0.9766, which is very close to 1). This is likely to result in a nontrivial impact on the volatility predictions.
The overall model fit can again be assessed by analyzing the PITs from the fitted MSASV model. The uniformity of u(t) on the (0, 1) interval can be visualized in Figure 5 through both the scatter plot and the histogram. As the sample of PITs is relatively small, the Bayesian confidence bands of the PITs are, as expected, much wider. The KS test statistic is recorded at 0.0165 with a p-value of 0.9557. Based on these values we cannot reject the null hypothesis that the PITs are uniformly distributed over the (0, 1) interval at any conventional significance level. In Figure 6 the empirical CDF of the PITs is displayed together with the theoretical CDF of the U(0, 1) distribution. The plot is broadly consistent with our earlier finding that the fitted MSASV model agrees very well with the returns from the foreign currency market. Thus, from the above comparisons and the result of the KS test, we can conclude that the MSASV model estimated by the proposed MCMC method fits the return series remarkably well. In Figure 7 we compare the absolute value of the observed returns with the estimated volatilities and the one-step-ahead in-sample predicted volatilities. The fitted and predicted volatilities appear to track the observed series of absolute asset returns very closely. Once again, the time series of the two estimated components, shown in Figure 8, compare very favorably with the absolute value of the observed returns.

Figure 8 (EXC data; caption of the first panel not recovered): posterior mean of $(h_{1t} + h_{2t})/2$ (second panel); posterior mean of the slow mean-reverting component $h_{1t}/2$ (third panel); posterior mean of the fast mean-reverting component $h_{2t}/2$ (fourth panel).
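The KS check of PIT uniformity described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the p-value uses the asymptotic Kolmogorov distribution rather than an exact finite-sample formula, and the PITs are assumed to be available as a plain list of numbers in (0, 1).

```python
import math


def ks_statistic_uniform(u):
    """Two-sided KS distance between the empirical CDF of u and the U(0, 1) CDF."""
    x = sorted(u)
    n = len(x)
    # The U(0, 1) CDF is F(x) = x, so compare it with both sides of each ECDF step.
    d_plus = max((i + 1) / n - xi for i, xi in enumerate(x))
    d_minus = max(xi - i / n for i, xi in enumerate(x))
    return max(d_plus, d_minus)


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic p-value P(sqrt(n) * D > z) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 z^2)."""
    z = math.sqrt(n) * d
    if z < 0.2:
        # The series converges slowly here, and the p-value is 1 to many decimals.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


# Plugging in the statistic reported for the EXC data (sample size taken as 945).
print(round(kolmogorov_pvalue(0.0165, 945), 3))
```

With the reported statistic of 0.0165 and a sample size of about 945, the asymptotic formula gives a p-value of roughly 0.96, in line with the reported 0.9557. The small difference is expected if an exact or corrected p-value was used.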
Next we proceed to carry out the same analysis on the AUX returns data. Table 4 lists the estimated parameters of the MSASV model fitted to the AUX data. Bayesian HPD intervals and standard deviations are also provided in this table.
Again with relatively small standard errors, the parameter estimates of the model lie within the constructed HPD intervals. This time the leverage/asymmetric effect in both components is estimated with the expected sign. Moreover, its estimate for the first component is quantitatively large and statistically highly significant, while that for the second component is quantitatively small and statistically insignificant. As with the EXC data set, even though the persistence estimate of the second component (0.1785) is much smaller than that of the first component (0.9659), the use of the MSASV model allows us to better identify the slower mean-reverting volatility component. In particular, the first component is shown to be much more persistent than the second (with the estimate of $\phi_1$ being closer to unity), and this, in turn, will have a large impact on the overall volatility predictions. As before, the overall model fit can be assessed through the analysis of the PITs from the fitted MSASV model. The uniformity of u(t) on the (0, 1) interval is displayed in Figure 9 via both the scatter plot and the histogram. The KS test statistic is calculated at 0.0259 with a p-value of 0.2979. Based on these values we cannot reject the null hypothesis that the PITs are uniformly distributed over the (0, 1) interval even at the 10% significance level. In Figure 10 the empirical CDF of the PITs is shown together with the theoretical CDF of the Uniform (0, 1) distribution. The graph supports our earlier claim that the fitted MSASV model agrees very well with the AUX returns data. Once the MSASV model has been estimated, we can, as before, use the fitted model to perform one-step-ahead predictions. In Figure 11 we compare the absolute observed returns with the estimated volatilities and the one-step-ahead in-sample and out-of-sample predicted volatilities, where the in-sample and out-of-sample periods are separated by a vertical dotted line at t = 1400.
The forecasted volatilities appear to follow the observed series of absolute returns closely. Once again, the time series of the two estimated components, shown in Figure 12, compare very favorably with the absolute value of the observed returns.

Figure 12 (AUX data; caption of the first panel not recovered): posterior mean of $(h_{1t} + h_{2t})/2$ (second panel); posterior mean of the slow mean-reverting component $h_{1t}/2$ (third panel); posterior mean of the fast mean-reverting component $h_{2t}/2$ (fourth panel).
Next we compare the proposed MSASV model with the one-component asymmetric SV (ASV) model, in which correlation is permitted between the innovation terms of the asset returns and the innovation terms of the latent/unobserved volatility process. The two data sets are also fitted by the one-component ASV model. Table 5 lists the values of $\bar{D}$, $p_D$ and DIC calculated from the fitted MSASV and one-component ASV models. Based on the calculated DIC values, we conclude that the MSASV model fits the two data sets better, which provides evidence of at least two latent/unobserved volatility components in the dynamics of the asset return data studied in this paper. 7
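The three quantities reported in the DIC comparison can be computed from MCMC output as sketched below. The sketch assumes the standard definitions (deviance $D(\theta) = -2 \log L(\theta)$, $p_D = \bar{D} - D(\bar{\theta})$ and DIC $= \bar{D} + p_D$); the paper's own computation may differ in detail. The function name `dic` and its inputs are hypothetical, and for an SV model the log-likelihood values would themselves have to be approximated, for example by a particle filter.

```python
def dic(loglik_draws, loglik_at_posterior_mean):
    """Return (D_bar, p_D, DIC) from log-likelihood values.

    loglik_draws: log-likelihood evaluated at each posterior draw of the parameters.
    loglik_at_posterior_mean: log-likelihood evaluated at the posterior mean.
    """
    deviances = [-2.0 * ll for ll in loglik_draws]
    d_bar = sum(deviances) / len(deviances)      # posterior mean deviance
    d_hat = -2.0 * loglik_at_posterior_mean      # deviance at the posterior mean
    p_d = d_bar - d_hat                          # effective number of parameters
    return d_bar, p_d, d_bar + p_d


# Toy numbers, purely illustrative.
print(dic([-100.0, -102.0, -101.0], -99.5))
```

A smaller DIC indicates a better trade-off between fit ($\bar{D}$) and complexity ($p_D$), which is how the comparison between the two-component and one-component models is read.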

The MSASV-t Model
In this subsection we fit the heavy/fat-tailed MSASV-t model to the two data sets of asset returns investigated in Section 5.1. Table 6 reports the estimated parameters of the MSASV-t model for the EXC data set with the standard deviations and the 95% Bayesian HPD intervals. Estimates of the leverage/asymmetric effect in both components are quantitatively small and statistically insignificant. This again reinforces the previous findings in the literature on one-component SV models that the leverage/asymmetric effect is not an important feature of asset returns in the foreign exchange markets.

As in the previous case, the fit of the model to the data can be assessed through the PITs from the fitted MSASV-t model. The uniformity of u(t) on the (0, 1) interval is again visualized in Figure 13 by means of both the scatter plot and the histogram. The KS test statistic is recorded at 0.0283 with a p-value of 0.4294. Based on these values, we cannot reject the null hypothesis that the PITs are uniformly distributed over the (0, 1) interval at any conventional significance level. In Figure 14 the empirical CDF of the PITs is plotted together with the theoretical CDF of the Uniform (0, 1) distribution. The graph is again consistent with our earlier assessment that the fitted MSASV-t model agrees very well with the EXC return data. In Figure 15 we compare the absolute value of the observed returns with the estimated and predicted volatilities. The fitted and predicted volatilities appear to track the absolute values of the observed asset returns very closely. The time series of the two estimated components, presented in Figure 16, also compare quite favorably with the absolute value of the observed returns.

For the AUX return data, the estimated parameters, their HPD intervals and the related standard deviations are presented in Table 7.
The leverage/asymmetric effect in both components of the MSASV-t model is now estimated with the expected sign, and is quantitatively large as well as statistically highly significant. This suggests that the leverage/asymmetric effect is a distinctly prominent feature of returns in the equity markets, much in keeping with the findings in the literature on one-component SV models. As in the previous cases, the estimates for the first and second components of the latent volatility process in this model are statistically significant, at 0.9914 and 0.3320 respectively. As before, the fact that the estimate for the first component is so close to unity gives rise to a better identification of the slow mean-reverting volatility component.

The overall model fit is again assessed by testing the PITs calculated from the fitted model. The uniformity of u(t) on the (0, 1) interval is displayed in Figure 17 through both the scatter plot and the histogram. The KS test statistic is calculated at 0.0362 with a p-value of 0.4867. Based on these values we cannot reject the null hypothesis that the PITs are uniformly distributed over the interval (0, 1) at any conventional significance level. In Figure 18 the empirical CDF of the PITs is plotted together with the theoretical CDF of the Uniform (0, 1) distribution. The graph reinforces our earlier conclusion that the fitted MSASV-t model agrees very well with the asset return data. Figure 19 compares the absolute value of the observed returns with the estimated volatilities and the one-step-ahead in-sample and out-of-sample predicted volatilities. The predicted volatilities again appear to track the observed series of absolute returns very closely. In addition, the time series of the two estimated components, illustrated in Figure 20, also compare very favorably with the absolute value of the observed returns.
As in Section 5.1, we compare the proposed MSASV-t model with the heavy-tailed one-component asymmetric SV (ASV-t) model, in which the innovation terms of the asset returns have a Student t distribution and are correlated with the innovation terms of the latent volatility process. The two data sets are also fitted by the one-component ASV-t model, which serves as a benchmark. Table 8 reports the values of $\bar{D}$, $p_D$ and DIC calculated from the fitted MSASV-t and one-component ASV-t models. The MSASV-t model fits the EXC data slightly better than the one-component ASV-t model, while for the AUX data it fits considerably better.

Conclusions
In this paper we have systematically studied several extended versions of the multiscale SV model introduced by Molina et al. (2010) for modeling the dynamics of the volatility of financial asset returns. The logarithm of the conditional volatility of the asset returns was described by latent/unobserved AR(1) processes with different time scales. In order for the proposed model to capture the heavy/fat tails in the marginal distribution of the asset returns, the innovation terms of the asset returns followed a Student t distribution. Novel MCMC algorithms were developed for the purpose of conducting Bayesian inference on the models. An auxiliary particle filter was also employed to approximate the filtering and prediction distributions of the latent/unobserved states of the models when we calculated the models' likelihoods and performed volatility predictions. We also allowed for a nontrivial correlation structure between the innovations of the mean equation and those of the latent factor processes, which can be interpreted as the leverage/asymmetric effect, much in keeping with the literature on one-component SV models. However, we did not allow for a correlation structure among the innovation terms of the latent/unobserved AR(1) processes. This was done for reasons of computational tractability and to ensure model identifiability. It is a limitation of the present study and represents an important issue for future research. We briefly discuss this issue in Appendix A and show how it may be resolved if we are willing to impose additional restrictions on the model, in particular on the noise/innovation terms driving the various volatility components in the model.

[...] assumed to be statistically independent of the $\eta_j(t_k)$'s, but the $\eta_j(t_k)$'s are allowed to be pairwise correlated.
However, in the equity markets, where the leverage effect is pronounced, market participants operate at different time scales (and on the basis of different information sets), but not independently.
The above discretely approximated model can be rewritten as:

$$\log \sigma^2(t_k) = F_1(t_k) + \cdots + F_K(t_k).$$

Next we define the asset returns in the usual way as [...] and write a discrete-time driving vector of volatilities as $h(t_k) = F(t_k) - \boldsymbol{\mu}$, where $F(t_k) = (F_1(t_k), \ldots, F_K(t_k))'$ and $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_K)'$. The autoregressive parameter is denoted as $\phi_j = 1 - \alpha_j \Delta$, $j = 1, \ldots, K$. Furthermore, we define the standard deviation parameter as $\sigma_{j,\eta} = \beta_j \sqrt{\Delta}$, $j = 1, \ldots, K$. Since the observations are equally spaced, with some abuse of notation, we can use $t$ instead of $t_k/\Delta$ for discrete time indices, such that $t \in \{1, 2, \ldots, T\}$. This allows us to express a K-dimensional AR(1) process for log-volatilities in a state-space form as:

$$y_t = \exp\{(\mathbf{1}' h_t + \mathbf{1}' \boldsymbol{\mu})/2\}\, \epsilon_t, \quad t = 1, \ldots, T,$$
$$h_{t+1} = \Phi h_t + \Sigma^{1/2} \boldsymbol{\eta}_{t+1}, \quad t = 1, \ldots, T,$$
$$h_0 \sim N(0, \Omega),$$

where $\boldsymbol{\eta}_t = (\eta_{1,t}, \ldots, \eta_{K,t})'$ is the vector of standard Gaussian random variates, $\mathbf{1}$ is a K-dimensional vector of ones, $\Phi$ is a $K \times K$ diagonal autoregressive matrix with typical elements $\phi_j$, and $\Sigma$ is the covariance matrix, assumed to be diagonal given that the correlation parameter between the factors is not identifiable.

The original version of the model assumes that the Brownian motions driving S(t) and F(t) are independent and that the components of F(t) are driven by independent Brownian motions. Our discretized model is based on an extension of the original model in which the components of F(t) are assumed to be correlated with the Brownian motion driving S(t). That is, the original model assumes no correlation between the asset return and its volatility, whereas our discretized model allows for the leverage/asymmetric effect observable in the equity markets, much in keeping with the assertion made by Black (1976) and Christie (1982) that a decrease in the stock price implies an increase in the associated volatility.
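The state-space form above can be simulated directly, which is a convenient way to check the parameter mappings $\phi_j = 1 - \alpha_j \Delta$ and $\sigma_{j,\eta} = \beta_j \sqrt{\Delta}$. The following is a minimal sketch under stated assumptions, not the authors' simulation code: it covers only the original specification with Gaussian return innovations and no leverage, it takes $\Sigma$ to be diagonal with entries $\sigma_{j,\eta}^2$, and it starts each component from its stationary distribution as one possible choice for $\Omega$. The function name `simulate_mssv` and all parameter values are hypothetical.

```python
import math
import random


def simulate_mssv(T, alpha, beta, mu, delta=1.0, seed=0):
    """Simulate returns y_t and K log-volatility components h_t (no leverage)."""
    rng = random.Random(seed)
    phi = [1.0 - a * delta for a in alpha]        # phi_j = 1 - alpha_j * Delta
    sig = [b * math.sqrt(delta) for b in beta]    # sigma_{j,eta} = beta_j * sqrt(Delta)
    if any(abs(p) >= 1.0 for p in phi):
        raise ValueError("each phi_j must satisfy |phi_j| < 1 for stationarity")
    # Assumed initial state: stationary N(0, sigma^2 / (1 - phi^2)) per component.
    h = [rng.gauss(0.0, s / math.sqrt(1.0 - p * p)) for p, s in zip(phi, sig)]
    y, paths = [], []
    for _ in range(T):
        log_var = sum(h) + sum(mu)                # 1'h_t + 1'mu
        y.append(math.exp(0.5 * log_var) * rng.gauss(0.0, 1.0))
        paths.append(list(h))
        # h_{t+1} = Phi h_t + Sigma^{1/2} eta_{t+1}, with diagonal Phi and Sigma.
        h = [p * hj + s * rng.gauss(0.0, 1.0) for p, hj, s in zip(phi, h, sig)]
    return y, paths


# Hypothetical values: one slow (phi = 0.98) and one fast (phi = 0.15) component.
y, paths = simulate_mssv(T=500, alpha=[0.02, 0.85], beta=[0.15, 0.60], mu=[-0.5, 0.0])
print(len(y), len(paths[0]))
```

Adding leverage would require drawing the return innovation and each volatility innovation jointly with a nonzero correlation; the exact timing convention is specified in the main text and is deliberately left out of this sketch.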
In Section 2.2 of the main text we make precise and explicit the parameterization of the discrete-time MSSV models used in both the simulation and the estimation process, including the initial state and the random variable notation.
Notes
1 Chib et al. (2002) considered Student t innovation terms, as well as jumps, in their SV models, while Jensen and Maheu (2010) estimated a semi-parametric SV model and found tails thicker than those of a Student t distribution. A non-parametric SV model with leverage effects was also estimated in Jensen and Maheu (2014). However, none of these studies considered how leverage occurs in a multi-component (or multi-factor) SV model.
2 As alluded to earlier, there are obvious links between our proposed models and multifactor models in the literature, such as Alizadeh et al. (2002) and Chernov (2003). This type of model has also been discussed by Kalli and Chib (2015), who develop methods for processes with an arbitrary number of component processes in the log-volatility process.
3 As mentioned earlier, these two-component SV models specify the latent/unobserved volatilities as a sum of two AR(1) processes.
4 Kim et al. (1998) point out that a one-at-a-time updating procedure can lead to poor mixing in the one-component SV models. Furthermore, the introduction of slice sampling can also lead to problems of over-conditioning and further affect the mixing of the chain.
5 It is worth noting that the HPD (highest posterior density) intervals are credible intervals. Specifically, an HPD interval is a Bayesian analog of a classical confidence interval and represents the shortest possible interval enclosing 100(1 − α)% of the posterior mass, where α determines the tightness of the interval and expresses the amount of probability mass excluded from the interval.
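The definition in note 5 suggests a simple way to estimate an HPD interval from MCMC draws: among all intervals spanned by consecutive order statistics that contain the required share of draws, pick the narrowest. The sketch below is an illustration under the assumption of a unimodal posterior, not the procedure used in the paper; the function name `hpd_interval` is hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the posterior draws."""
    x = sorted(samples)
    n = len(x)
    m = max(1, math.ceil(mass * n))   # number of draws the interval must cover
    # Slide a window of m consecutive order statistics and keep the narrowest one.
    best = min(range(n - m + 1), key=lambda i: x[i + m - 1] - x[i])
    return x[best], x[best + m - 1]


# Toy draws: the tightest cluster of three points lies between 1.0 and 1.2.
print(hpd_interval([0.0, 1.0, 1.1, 1.2, 5.0, 9.0], mass=0.5))
```

For a symmetric unimodal posterior this coincides approximately with the equal-tailed interval; for skewed posteriors, such as those typical of degrees-of-freedom or variance parameters, the HPD interval is shifted toward the mode.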