Stochastic Volatility Models with Skewness Selection

This paper expands traditional stochastic volatility models by allowing for time-varying skewness without imposing it. While dynamic asymmetry may capture the likely direction of future asset returns, it comes at the risk of leading to overparameterization. Our proposed approach mitigates this concern by leveraging sparsity-inducing priors to automatically select the skewness parameter as dynamic, static or zero in a data-driven framework. We consider two empirical applications. First, in a bond yield application, dynamic skewness captures interest rate cycles of monetary easing and tightening and is partially explained by central banks’ mandates. In a currency modeling framework, our model indicates no skewness in the carry factor after accounting for stochastic volatility. This supports the idea of carry crashes resulting from volatility surges instead of dynamic skewness.


Introduction
Accurate representation of asset returns is one of the key topics in finance.Based on the theoretical results from Markowitz [1952] and Hansen and Richard [1987], standard approaches to asset pricing have largely focused on the first and second moments.Stochastic volatility (SV) models discussed, for example, on Jacquier et al. [2002], are one of the cornerstone models in modern financial econometrics.In its simplest form SV models asset returns via normal distribution with persistent volatility and a mean that is either constant or a linear function of explanatory variables.Such models capture the first two moments of asset returns in a simple and elegant matter being supported empirically and theoretically as discussed in Shephard [2005].
While we acknowledge the importance of the first two moments in asset pricing, we also notice the potential benefits of including skewness when modeling returns.Due to its ability of capturing the likely direction of returns, models with time-varying skewness may be more suitable to forecast periods with a higher concentration of same signal returns leading to better detection of periods of both overperformance and underperformance.Bianchi et al. [2022] is one key example of the empirical benefits of adding such feature for modeling cross-sectional stock momentum.While the momentum factor is known for delivering good mean-variance compensation, it is also subject to long period of negative performance.By capturing such prolonged periods of likely negative returns via dynamic skewness, Bianchi et al. [2022] improves the performance of the stock momentum factor upon traditional approaches which neglect skewness.
However, including dynamic skewness in traditional financial econometric models is not a free-lunch.While allowing for asymmetry may lead to a better representation of some financial time series, it may not be a vital feature and its inclusion risks overparameterization. Therefore, we wish to include dynamic skewness only when required by the data and remove such feature if is not necessary.This paper expand stochastic volatility models by allowing dynamic skewness without having to impose it.We replace the traditional hypothesis of gaussian errors in favor of a skew-normal distribution.Such change preserves the usual features for the first two moments of SV models but allows for dynamic skewness.Since the inclusion of time-varying asymmetry may not always be necessary, we consider a sparsity -inducing scheme for its parameters.
In particular, we consider a random-walk evolution for the asymmetry.When the standard deviation of the dynamic process is shrunk to zero, our model results in a SV with constant skewness.If, additionally, the level of the skewness is shrunk to zero, we recover a traditional SV model.By combining prior information with the likelihood of the model, our proposed approach automatically chooses between dynamic, static or no skewness.
We consider two empirical applications.We obtain three main results on a bond yield application for Brazil and the US.First, our proposed model indicates that bond yield changes for both countries are better represented by including time-varying skewness with out-of-sample improvements for forecasting the direction of future yields.
Second, the recovered skewness is associated with interest rate cycles of monetary easing and tightening.Third, inflation and unemployment partially explain the recovered skewness linking it to central banks' mandates.We also model the carry factor for currency returns.Our model indicates not only the lack of dynamics but also no skewness at all for it after accounting for volatility.Therefore, similar to the crash mitigation via volatility scaling proposed by Barroso and Santa-Clara [2015], the carry factor also have its crashes mitigated once dynamic volatility is taken into account without requiring the inclusion skewness.
This paper intersects and contributes to multiple areas.First, it contributes to the stochastic volatility literature by expanding the static skewness model of Nakajima and Omori [2012] to the dynamic case while also expanding the sparsity -inducing scheme of Nakajima [2020] by allowing dynamic skewness to be shrunk towards the static case.Second, it contributes to the toolbox of methods for recovering dynamic skewness.Trolle and Schwartz [2014] rely on option data while Rafferty [2012] uses rolling windows.Both approaches have limitations.While theoretically sound, option based approaches require tradeable options which large collection of strikes with high liquidity and continuous expiration dates.Such requirements are hard to meet in several applications, specially when for emerging markets such as in our Brazilian bond applications.Rolling-window based approaches artificially input dynamics into skewness by changing the sample period by period.Such change comes at the cost of outliers playing a large role in the estimation in addition to a trade-off based on window-size, which will affect the precision of the estimate, and the speed in which the dynamics change.Third, it contributes to the interest rate literature by showing that inflation and unemployment partially explain skewness and linking it to central banks' mandates expanding the traditional mean and variance analysis of papers such as Litterman and Scheinkman [1991] and Joslin [2018].Fourthly, it contributes to the debate of whether the carry factor presents dynamic skewness after accounting for heteroskedasticity, discussed Burnside et al. [2011] and Jurek [2014], by claiming that skewness is unlikely to be dynamic and, in fact, is more likely to be zero after accounting for SV effects.
The paper is organized as follows.It starts by describing traditional SV models and moves on to our proposal with time-varying skewness on Section 2. Section3 discusses the sparsity-inducing framework.Section 4 shows our HMC approach to simulate from the joint posterior.Section 5 presents the bond yield forecasting application while Section 6 shows the carry factor application.Section 7 concludes.

SV model with time-varying skewness
A vanilla stochastic volatility model is given by Equations ( 1) -(5).Equation (1) represent asset returns with mean 0 and dynamic volatility exp(h t /2).Equation (2) indicates persistent log-volatility with level µ and persistence ϕ with initial values given by the stationary distribution shown in Equation (3).In its simplest version both the measure and state equations have Gaussian errors as represented in Equations ( 4) and (5).
In order to introduce asymmetry into the model, we replace the normal variable ε t in Equation ( 4) by a skewnormal random variable z t .We consider the skew-normal representation of Azzalini [1985] and Azzalini [2013] shown in Equation ( 6) where ϕ(•) and Φ(•) are the standard normal density and distribution function, respectively.
λ controls the degree of asymmetry in the distribution as show in Figure (1).In particular, for λ = 0, the skewnormal reduces to a standard normal distribution.Also, we may add location ξ and scale ω parameters to the skew-normal denoting X = ξ + ωz by X ∼ SN (ξ, ω, λ) which has its distribution presented in Equation ( 7).
Appendix A describes how ξ , λ and λ relate to mean, variance and skewness.Therefore, our model can be viewed as an expansion of the stochastic volatility models with static skewness proposed by Nakajima and Omori [2012], Nakajima [2020] and Li and Scharth [2020].
x Density Figure 1: Skew-normal densities with ξ = 0 and ω = 1 and λ equal to -2, 0, and 1 for the red, black and blue lines, respectively.If λ < 0, the distribution is skewed to the left.If λ = 0, SN(0) reduces to the standard Gaussian distribution.Finally, if λ > 0 the distribution is skewed to the right.
As shown in Bianchi et al. [2022] and Carr and Wu [2007], its plausible that asset returns have time-varying skewness.Thus, we allow for this possibility by replacing the static λ for λ t which evolves according to a randomwalk starting at λ 0 as represented by Equations ( 8) to ( 9).Therefore, we change our observation equation by allowing both volatility and skewness to be time-varying changing the observation equation to Equation (11).
λ 0 = α 0 (10) 3 Sparsity-inducing approach The inclusion of asymmetry may lead to a better representation of time series specially by capturing periods with a higher mass of returns with the same signal, as shown, for example, in Bianchi et al. [2022], Azzalini [2013], and Rachev et al. [2005].However, it may not always be necessary since it requires the estimation of additional parameters.To mitigate this overparametrization risk, we aim to include time-varying skewness only when required by the data.While one can estimate multiple models varying the skewness specification and then performing selection via Bayes factor, we propose a sparse-inducing method which conveniently performs model selection without requiring estimating multiples models while also avoiding Bayes factor completely.
Equation ( 9) drives the dynamics of the asymmetry parameter.If σ λ is close enough to zero, then λ t is static and assume the value of its initial point λ 0 .Additionally, if the initial point is also zero, then the model reduces to the vanilla SV model.Therefore, if we induce sparsity for σ λ and α 0 we can achieve all 3 cases of interest.
In the non-Bayesian literature, shrinkage is based on maximizing the likelihood of a model subject to a penalty function with the LASSO of Tibshirani [1996] being the most commonly employed approach.From a Bayesian point of view, shrinkage problems can be represented as a penalization to the log-likelihood via log-prior.In fact, the posterior mode of a linear model with Double Exponential prior with location 0 and scale 2/ψ is equal to the point estimate of the LASSO with penalty ψ as shown in Park and Casella [2008].
Thus, in order to shrink σ λ and λ 0 towards zero, we consider both having Double Exponential priors shown in Equation ( 12) and ( 13) similarly to the approach of Belmonte et al. [2014] in the context of dynamic regression models.Priors such as the ones discussed in Lopes et al. [2022] and Bitto and Frühwirth-Schnatter [2019] also are reasonable approaches to induce sparsity on time-varying parameter models.
While, to the best of our knowledge, we are the first to induce sparsity on the dynamic skewness framework, Nakajima [2020] have used the spike and slab prior of George and McCulloch [1993] to produce a data-driven framework which obtains the posterior probability of the presence of static skewness.
4 Remaining priors and posterior inference As discussed in Gamerman and Lopes [2006], the RWMH algorithm is a type o MCMC algorithm which generates a sequence of values Θ which approximates p(Θ|y).Each iteration i generates Θ (i) which is defined in part by q(Θ prop |Θ i−1 ) where Θ prop is a proposal for the next value in the chain and Θ i−1 represents the value of Θ on the previous iteration.The name RWMH is due to the proposal being generate as a random -walk from the previously sampled Θ.Each proposed value for Θ i may be accepted or rejected based on Equation ( 14) where acc represents the acceptance ratio.While there are little restrictions for the use of RWMH algorithms, its limitations come from the computational side since the acceptance ratio is usually low requiring a large amount of iterations.
One of the main advantages of HMC comes exactly due to its higher acceptance rate than RWMH.HMC improves upon RWMH by employing guided proposals based on the gradient of the log posterior to direct the Markov chain towards regions of higher posterior density while also sampling the tail areas properly as discussed in Betancourt [2017].Thomas and Tu [2021] and Betancourt [2017] are good introductions to HMC.
Over time, HMC travels on trajectories that are governed by the Hamiltonian equations: where H(Θ, p) represents the Hamilton function.In the context of this paper H(Θ, p) = −logf (Θ|y)+ 1 2 p T M −1 p.Additionally, ∇ Θ logf (Θ|y) is the gradient of the log posterior density which is the main responsible for the improvement of HMC over RWMH in acceptance ratio.
As discussed in Betancourt [2017], proposals generated from the exact solution of the Hamilton equations would be accepted with probability 1.However, they usually are not analytically tractable and solutions comes from numerical methods, typically, via the leapfrog method.Since the leapfrog is an approximation, an acceptance step is added to ensure that proposals does not deviate far from the specified Hamiltonian H(θ, p).Therefore, on practice, the acceptance rate of HMC is less than 100% but higher than RWMH.Thus, we employ HMC to sample from p(Θ|y).
An additional benefit of HMC apporaches are their easy implementation via Stan introduce by Carpenter et al. [2017].In this paper, we combine Stan with R via rStan introduced by Guo et al. [2020].In every application, we run our HMC scheme for 30000 iterations with the first 15000 being used as burn-in draws.

Empirical application: Bond Yields
The bond market is one of the largest in the world being key for investors and policy makers.Most papers focus on the first two moments e.g.Litterman and Scheinkman [1991], Collin-Dufresne and Goldstein [2002], Cochrane and Piazzesi [2005] and Joslin [2018].Our paper focus on the much less explored third moment.As discussed previously, skewness captures the likely direction of returns.Therefore, it allows an interest rate investor to improve their forecast about sign of future yields allowing larger investments when positive returns are likely and buying protection or going short otherwise.
In our first application, we model monthly yield changes in fixed 1-year maturity for both American and Brazilian bonds.The sample of US bonds comes from the updated dataset of Gürkaynak et al. [2007] made available by the Federal Reserve Board starting at 07/1981 and going until 08/2023.The sample of Brazilian bonds is based on the DI interest rate contracts available on the Brazilian stock and futures exchange B3.The DI contracts are interpolated to form 1-year fixed maturity bonds using the same Nelson-Siegel procedure described in Gürkaynak et al. [2007] resulting in a sample starting in 02/2004 and ending in 08/2023.Figure (2) plots both time series of monthly yield changes.Notably, both series fluctuate around zero with volatility clusters.Both series hint at the possibility of skewness.For example, the US series presents a persistent and negative yield change period in the early 90 and early 2000.In the Brazilian series, the negative and persistent yield changes are even clearer with the period from 08/2016 to 03/2018 being one example.
We apply our proposed model to both US and Brazilian bonds with the priors presented in Appendix B. To account for the Brazilian time series being shorter than the American, the estimation sample comes closer to the present than the American.Precisely, we consider an estimation period which goes up to 12/2018 for the US   If the asymmetry of yield changes is connected to interest rate cycles, then drivers of the monetary policy should help explain skewness.We test this hypothesis by regressing the posterior mean of {λ t }, denoted λ, into inflation and unemployment as represented by Equation ( 15).Both variables are reasonable ex-ante since central banks have Figure 4: {λ t } recovered for the US and Brazilian bonds during the estimation period which goes up to 12/2018 for the US and up to 12/2019 for the Brazilian case.While λ is not the skewness itself, it can be transformed by using the formula in Appendix A. Green lines represent quantiles .25 and .75 while solid red lines represent quantiles .05 and .95.Time-varying skewness is likely in both cases.We obtain stronger evidence o dynamic skewness for Brazil in the bottom panel than for the US in the top one.
mandates of price stability and full employment.
Table (2) indicates that unemployment levels partially explains the asymmetry in yield changes for both countries.The negative sign of β unemployment is reasonable since central banks usually fights high unemployment levels with interest rate cuts to promote consumption and, as show in Figure ( 5), easing cycles are associated with negative skewness on yield changes.Additionally, for the Brazilian case, inflation is also likely to play a role.The sign of β inf lation is also reasonable since central banks combat surges in inflation with interest rake hikes which, as show in Figure ( 5), is associated with high values of skewness.Finally, while inflation was a problem in the US in the late 70's and early 80's, for the majority of our estimation sample the American economy didn't face inflationary pressures.Since such pressures didn't exist, the yields won't react to them.Therefore, the near 0 effect of β λ is justifiable.
Intercept 6 Empirical application: Carry factor In addition to the Bond yield application described in Section 5, we also consider an application for the currency market.Lustig et al. [2011] indicates that two factors, dollar factor and carry, explain the cross-section of currency returns.This application focus on the latter which captures interest rate differentials between countries by going long on countries with high interest rate differentials with respect to the US and, conversely, goes short countries with the smallest interest rate differentials with respect to the US.While we focus on FX markets, Koijen et al.We are not the first to study the skewness of carry returns.For example, Burnside et al. [2011] and Rafferty [2012] identify a time-varying crash-risk on the carry factor.This risk would materialize in some occasions leading to large negative returns to the carry factor and skewing its distribution to the left.Conversely, Jurek [2014] uses out-of-the-money currency options hedging against large crashes and show that carry remains profitable indicating a small role for tail risk on currency returns.Additionally, Jurek [2014] presents evidence in favor of time varying volatility for carry returns and that the largest negative return of its sample, 10/2008, occurs in a period of high volatility.Our model is well suited to evaluate such claims.Jurek [2014] is not an isolated case.Barroso and Santa-Clara [2015] provides another example of tail risk mitigation in tradeable factors after accounting for volatility without skewness playing a major role.
We consider the prior specification shown in Appendix C and sample from the posterior using the HMC scheme describe in Section 4. (red) for both {exp ht 2 } and {λ t } which are shown at the top and bottom panel, respectively.The top panel corroborates with the evidence of time-varying volatility with a surge in volatility around 10/2008 similarly to the description of Jurek [2014].The bottom panel indicate that is likely that there is no skewness at all after accounting for stochastic volatility.
Additionally, we verify how the carry factor performs in period with high volatility when compared to low volatility.We say that volatility is high if is above the average volatility recovered for the full sample and we say its low otherwise.Table (4) reports the results of our analysis.While the average return for the carry factor in both environments is the same, crash indicators such as the return on the 5th percentile and minimum return are severely improved.Therefore, in combination with the results shown in Figure (7b), our results indicate that after accounting for volatility, its unlikely that skewness play a large role in affecting the returns of the carry factor.

Conclusion
This paper expands stochastic volatility models by allowing for time-varying skewness without having to imposing it.By considering a LASSO-type regularization for both the standard deviation and starting level of the skewness dynamics, our model chooses between dynamic, static and no skewness in a data-driven approach.On our bond yield application, we highlight the benefits of dynamic skewness by showing its connection to monetary easing/tightening cycles and with central banks' mandates.Additionally, we show that asymmetry is informative about the likely direction of future bond yield changes.On the second application, we shed light into the debate of carry average returns reflecting time-varying skewness versus no skewness but time-varying volatility.Our model indicates no skewness after accounting for stochastic volatility.

Figure 2 :
Figure 2: Change in yields from fixed 1-year maturity bonds from the US and Brazil.The top panel presents monthly changes in yields from 1-year bonds from the US government starting in 07/1981 and going up to 08/2023.Similarly, the bottom panel shows monthly changes in 1-year Brazilian bonds from 02/2004 up to 08/2023.

Figure 3 :
Figure 3: Scale recovered for the US and Brazilian bonds during the estimation period which goes up to 12/2018 for the US and up to 12/2019 for the Brazilian case.Green lines represent quantiles .25 and .75 while solid red lines represent quantiles .05 and .95.For both series, time-varying volatility is plausible.Top panel indicates rises in yield change volatility during the early 00's and around the global financial crisis for the US.While the bottom panel indicate peaks of volatility for Brazilian yield changes between 2009 to 2010 with a second spike in late 2018 and early 2019.

Figure 5 :
Figure 5: {λ t } recovered for the US, top panel, and Brazilian bonds, bottom panel, during the estimation period which goes up to 12/2018 for the US and up to 12/2019 for the Brazilian case.Green shaded area indicate monetary easing periods while red ones indicate monetary tightening.In both cases, λ t seems able to capture the monetary cycles and reflect future direction of yield change.US monetary police cycles are based on the FED effective fund rate while for the Brazilian case they are recovered from the target rate policy rate decision of each meeting.

[ 2018 ]
argues in favor of carry-based factors being a suitable to explain a the cross-section of a large number of asset classes such as commodities and equities.We consider and updated sample of the carry factor describe in Lustig et al. [2011] 2 , presented in Figure (6), which starts on 11/1983 and goes up to 05/2021.

Figure 6 :
Figure 6: Updated version of the time series for the Carry factor introduced by Lustig et al. [2011].Our sample goes from 11/1983 to 05/2021.

Figure 7 :
Figure 7: Skewness recovered from the carry factor.Green lines represent quantiles .25 and .75 while solid red lines represent quantiles .05 and .95.After controlling for volatility, we find no evidence of skewness in the carry factor.
Our modeling approach leads to the following unknown quantities Θ = {µ h , ϕ h , σ h , α 0 , σ λ , {h t }, {λ t }}.We aim to recover the joint posterior distribution p(Θ|y) which by Bayes' rule can be computed as follows is the likelihood component being characterized by our proposed sampling model.We consider independent components for each member of p(Θ).p(µh ) and p(ϕ h ) follow Gaussian distributions, p(σ 2 h ) has inverse gamma distribution while p(α 0 ) and p(σ λ ) have double exponential distribution as discussed previously.The exact values for the prior parameters are described in Appendix B and Appendix C.While we can provide a full description of p(y|Θ) and p(Θ), the problem of recovering p(Θ|y) is not analytically tractable.Thus, we resort to simulation methods to sample from p(Θ|y) and then compute summary posterior statistics to characterize our results.This paper uses a particular Markov Chain Monte Carlo (MCMC) method known as Hamiltonian Monte Carlo (HMC) instead of the more traditional Random -Walk Metropolis-Hastings (RWMH) algorithm to sample from p(Θ|y).

Table 1 :
Table (1) presents the posterior summaries for the parameters of both time-series.The values of ϕ h indicate high persistence in the log-scale which is reflected in Figure (3) which plots posterior summaries of {h t }.In both cases, our approach captures the surges in volatility indicated in Figure (2).For example, the US series captures spikes of volatility in the early 2000's and on the financial crisis of 2008.The Brazilian time series also presents a spike in volatility in 2008 which is also captured by our model.Additionally, both series are likely to have a dynamic skewness as show by the posterior summaries of σ λ with the Brazilian bonds having a bigger range of variation for λ t than bonds from the US.Such, evidence is supported by the posterior summaries of {λ t } in Figure (4) as well.Posterior summaries of the parameters from our proposed model.µ h , ϕ h and σ h represent the log-scale level, persistence and standard deviation, respectively.σλ captures the dynamic of the skewness component and α 0 is the initial level of the skewness.Both series are likely to have a dynamic skewness as show by the posterior summaries of σ λ with the Brazilian bonds having a bigger range of skewness variation than bonds from the US t } alongside with monetary easing-tightening cycles.Green shaded areas represent monetary easing periods characterized by interest rates cuts by the countries' central bank.Conversely, red shaded areas represent monetary tightening periods characterized by interest rates hikes.Our approach identifies negative skewness in the yield changes during easing and large positive skewness when a central bank is hiking

Table 2 :
β Inf lation β U nemployment Linear regression of the posterior mean of {λ t } into inflation and unemployment during the estimation sample.For both countries unemployment partially explains the skewness in yield changes while the asymmetry in the Brazilian case is also partially explained by inflation.We also evaluate the out of sample performance of our proposed model.If skewness is informative about the likely direction of changes in bond yields, then sign(E t λt+1 ) should agree with sign(y t+1 ).We verify this claim within a increasing window framework where the first window corresponds to the estimation sample.For each window, we run our HMC procedure and obtain sign(E t λt+1 ).As shown in Table(3), our procedure indicates For US bonds, sign(y t+1 ) is equal to sign(E t λ t+1 ) 66.1% of cases with average value of change being 20.6% when correctly forecasted and only 8.7% when wrong.Similarly, for the Brazilian case, sign(y t+1 ) is equal to sign(E t [λ t+1 ]) in 72.7% of cases.Additionally, the average magnitude of correctly forecasted yield changes is 8.2% and only 2.4% when wrongly predicted.Thus, the out of sample analysis support our claim of dynamic skewness in bond yield

Table 3 :
If bond yield skewness captures the likely direction of yield changes, then we should expected yield increases when E t [λ t+1] > 0 and, otherwise, decreases.Our model provides evidence supporting such claims.In both cases, our model correctly predicts the correct yield change direction in at least 66.1% of times.

Table 4 :
Summary statistics of the carry factor in high and low volatility environments for the sample starting in 11/1983 and going up to 12/2021.While the average return for the carry factor in both environments is the same, crash indicators such as the return on the 5-th percentile and minimum return are severely improved.