Article

A Locally Both Leptokurtic and Fat-Tailed Distribution with Application in a Bayesian Stochastic Volatility Model

1
Department of Mathematics, Cracow University of Economics, ul. Rakowicka 27, 31-510 Kraków, Poland
2
Department of Financial Mathematics, Jagiellonian University in Kraków, ul. Prof. Stanisława Łojasiewicza 6, 30-348 Kraków, Poland
3
Department of Econometrics and Operations Research, Cracow University of Economics, ul. Rakowicka 27, 31-510 Kraków, Poland
*
Author to whom correspondence should be addressed.
Entropy 2021, 23(6), 689; https://doi.org/10.3390/e23060689
Submission received: 30 April 2021 / Revised: 22 May 2021 / Accepted: 26 May 2021 / Published: 30 May 2021
(This article belongs to the Special Issue Bayesian Inference and Computation)

Abstract:
In the paper, we begin by introducing a novel scale mixture of normals such that its leptokurticity and fat-tailedness are only local, with this “locality” being separately controlled by two censoring parameters. This new, locally leptokurtic and fat-tailed (LLFT) distribution provides a viable alternative to other, globally leptokurtic, fat-tailed and symmetric distributions typically entertained in financial volatility modelling. Then, we incorporate the LLFT distribution into a basic stochastic volatility (SV) model to yield a flexible alternative to common heavy-tailed SV models. For the resulting LLFT-SV model, we develop a Bayesian statistical framework and effective MCMC methods to enable posterior sampling of the parameters and latent variables. Empirical results indicate the validity of the LLFT-SV specification for modelling both “non-standard” financial time series with repeating zero returns and more “typical” data on the S&P 500 and DAX indices. For the former, the LLFT-SV model is also shown to markedly outperform a common, globally heavy-tailed, t-SV alternative in terms of density forecasting. Applications of the proposed distribution in more advanced SV models seem to be easily attainable.

1. Introduction

Most leptokurtic and heavy-tailed distributions commonly entertained in financial volatility modelling may be derived as scale mixtures of normal (SMN) distributions, with the concept dating back to [1]. SMN is a very wide and useful class of distributions, which belongs to an even wider class of elliptical distributions (see [2]). We say that a random variable ϵ follows a scale mixture of normals if it has the following stochastic representation:
$$\epsilon = V^{-1/2}\,Z,\qquad(1)$$
where Z has a standard normal distribution, independent of a positive random variable V. If $V \sim Gamma(\nu/2, \nu/2)$, then ϵ follows Student's t-distribution with ν degrees of freedom, while for $V \sim IGamma(\alpha, \beta)$ we obtain a variance gamma distribution (see [3]). For $V \sim Pareto(1, \nu/2)$ we obtain a modulated normal type I (MN type I) distribution, while $V \sim Beta(\nu/2, 1)$ yields a modulated normal type II distribution (see [4]), which is also known in the literature as a slash distribution (see [5]). Extensions of the latter were introduced by [6,7], who developed the modified slash (MS) and generalized modified slash (GMS) distributions, respectively.
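The special cases above can be illustrated with a quick numpy sketch (not from the paper; parameter values are illustrative). Each distribution is obtained by drawing the mixing variable V by its inverse cdf and scaling a common standard normal sample:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
z = rng.standard_normal(n)

# Student's t_nu: V ~ Gamma(nu/2, rate nu/2), eps = V^{-1/2} Z
nu = 5.0
v_t = rng.gamma(shape=nu / 2, scale=2 / nu, size=n)
eps_t = z / np.sqrt(v_t)

# Slash: V ~ Beta(nu/2, 1), so V^{-1/2} > 1 stretches the tails
v_s = rng.beta(nu / 2, 1.0, size=n)
eps_s = z / np.sqrt(v_s)

# MN type I: V ~ Pareto(1, nu/2) via inverse cdf, so V^{-1/2} lies in (0, 1)
v_m = (1 - rng.random(n)) ** (-2 / nu)
eps_m = z / np.sqrt(v_m)

# t and slash share the variance nu/(nu-2); MN type I is thin-tailed
print(eps_t.var(), eps_s.var(), eps_m.var())
```

For ν = 5 the first two sample variances should sit near 5/3, while the MN type I case (with Beta-distributed scale) stays below 1.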
As mentioned above, the SMN class contains many different subclasses of distributions with different tail behaviour and modal concentration in comparison with the Gaussian bell curve. In this paper, we focus particularly on two of them, namely the slash distribution, as a choice of a heavy-tailed distribution, and the MN type I distribution, featuring leptokurticity. The author of [8] shows that these two distributions are dual, meaning that the shape of the density near the mode in the MN type I distribution is related to the shape of the slash distribution's tails and vice versa.
The slash distribution seems to be one of the most common heavy-tailed distributions, with the list obviously topped by the t-distribution. It should be noted that while the slash distribution has many advantages, particularly in modelling financial data, it also suffers from some limitations, which we discuss in the following paragraph, along with some drawbacks of the MN type I distribution stemming from its duality with the slash distribution.
Similarly to Student's t-distribution, the tails of which get thicker as the degrees of freedom approach zero, $\nu \to 0$ in the slash distribution yields the same effect (note, however, that the parameter ν in the slash distribution is not termed degrees of freedom). This property is much appreciated in modelling heavy-tailed data, including financial returns. However, an unwanted consequence of using most heavy-tailed distributions (including Student's t, slash, MS and GMS distributions) is that a relatively too large probability mass may be put far in the tails. Such a shifting of the probability mass from around the mode to the tails may be somewhat unwanted (i.e., not necessarily supported by the modelled data), since, informally speaking, in many (even financial) applications there can be almost no chance of observing data points that far from the mode. To cope with such a possible discrepancy between the theoretical distribution and the data, we make some amendments to the slash distribution so that the resulting, new distribution (hereafter termed the local slash distribution, LS) allows the tails to be freely thick, but only locally, up to some distance from the mode. The effect is achieved through introducing only one additional parameter, $d \in (1, \infty)$, to censor the distribution of $V^{-1/2}$ in (1) by inserting $\min\{V^{-1/2}, d\}$ instead. Consequently, the tails can be thick, but only locally, while thinning out at infinity. We show that all moments exist for this new distribution.
Apart from heavy-tailed distributions, leptokurtic ones have also found applications in many fields, including finance, with both features typically being concomitant. One of the possible choices to capture data leptokurticity is the MN type I distribution. However, it has not gained much popularity in applications, particularly financial ones, mainly due to its thin tails. Moreover, while the MN type I distribution is able to capture a strong modal concentration of data, its duality to the slash distribution causes a possibly excessive fat-tailedness of the latter to translate into a possibly excessively high concentration of the probability mass around the mode in the former. To control for the pointedness of the MN type I distribution, in this paper we modify it by introducing an additional, censoring parameter $c \in (0, 1)$ and supplanting the random variable $V^{-1/2}$ in (1) with $\max\{V^{-1/2}, c\}$. We term the resulting construct the local MN (LMN) type I distribution and provide a detailed description of its properties later on.
In many applications, particularly financial ones, both features of a probability distribution are desired simultaneously: heavy tails and leptokurticity. In distributions commonly employed in practice, such as Student's t, higher probability mass accumulation in the tails immediately induces higher modal concentration, with both areas of the distribution controlled by only a single parameter. In this paper, with prior modifications of the slash and MN type I distributions, we combine the two resulting (LS and LMN type I) distributions within the SMN setting to obtain a completely new, symmetric distribution, termed the locally both leptokurtic and fat-tailed (LLFT) distribution, which is far more flexible in modelling the outlying as well as “modal” observations. The LLFT distribution features five parameters (in its standardized form), with all moments existing but allowed to be arbitrarily large.
In this paper, we apply the proposed LLFT distribution for the error term in the basic, otherwise conditionally normal SV model that for a univariate $y_t$ can be written as $y_t = \sqrt{h_t}\,\varepsilon_t$, $\ln h_t = \gamma + \phi(\ln h_{t-1} - \gamma) + \eta_t$, where $\varepsilon_t \sim N(0, 1)$ and $\eta_t \sim N(0, \sigma^2)$ are i.i.d. and mutually independent. Although it is well-known that such an SV model has the ability to induce leptokurtosis and heavy tails, typically observed in financial time series, its underlying conditional Gaussianity is still quite a limitation. Introducing the LLFT distribution for $\varepsilon_t$ aims at a more adequate capturing of the heavy tails and leptokurticity of financial assets' returns, as compared not only to the basic but also conditionally heavy-tailed SV models typically entertained in the literature, including t-SV and slash-SV models.
Modifications of the distribution for $\varepsilon_t$ have been one of the most prolific strands of the SV literature (see, e.g., [9,10,11,12,13,14,15,16]). Other typical directions of generalizing the basic SV structure focus on capturing the leverage effect and asymmetry (see, e.g., [12,17,18,19]) as well as on refining the volatility process by, for example, accommodating realized volatility and long memory (see, e.g., [18,20]), or allowing for discrete, Markov switches of the parameters (see, e.g., [21,22,23]). However, we do not follow these lines of research in our current paper, focusing rather on the construction of a new distribution “from scratch” and its introduction into the basic SV model, thereby contributing to the research area of improving the conditional distribution in SV models. Extending our framework to other, more elaborate SV specifications is left for future work.
For statistical inference in the resulting LLFT-SV model, we resort to the Bayesian framework, which is typically considered for SV models (see [14,24,25,26]). We show that under a certain prior structure, the marginal data density in the LLFT-SV model (and thus, the posterior distribution) is bounded even when the same values of observations repeat in the sample, which is not necessarily the case if these were modelled with the slash or MN type I distributions, as duly noted by [27]. The result seems essential in view of repeating (or at least strongly concentrated around some constant, e.g., the mode) values of returns that occur quite often for some individual financial assets (such as a company's stocks).
Building on the hierarchical representation of SMN distributions, we develop an effective Markov chain Monte Carlo (MCMC) method for posterior sampling of the LLFT-SV model’s parameters and latent variables. The procedure suitably adapts standard techniques of the Metropolis–Hastings algorithm and Gibbs sampler.
To sum up, the contribution of our work is several-fold. Methodologically, a new, specific scale mixture of normals is constructed, featuring five free parameters, with two of them censoring separately the leptokurticity and fat tails to be only local, while the other three control the relative weights and magnitudes of these two features in a disentangled manner. Then, we introduce this new LLFT distribution into a stochastic volatility model and develop a Bayesian framework for the resulting LLFT-SV specification, resolving some additional theoretical considerations about the existence of the posterior distribution. Next, we adapt relevant MCMC numerical methods for posterior simulations of the parameters and latent variables. Finally, we conduct an extensive empirical study of the LLFT-SV model's workings and density forecasting performance for a selected financial asset and additionally provide a brief illustration for two common stock market indices.
The article is organized as follows. In Section 2, we define the locally both leptokurtic and fat-tailed (LLFT) scale mixture of normal distributions. In particular, we show in detail the properties of the introduced distributions, together with their relations to other, well-known distribution families. In Section 3, we introduce the stochastic volatility (SV) model incorporating the proposed distribution for the error term and develop a Bayesian framework for the resulting LLFT-SV model. Section 4 presents a real-world data analysis illustrating predictive advantages of the model. Finally, Section 5 concludes and discusses possible avenues for further research.

2. Introducing the Locally Both Leptokurtic and Fat-Tailed Scale Mixture of Normal Distributions

2.1. The Slash Distribution—A Short Review

We begin with a detailed presentation of the slash distribution's properties, with the distribution itself parametrized according to [5], where it was introduced. Following the cited paper, we assume that
$$X\,|\,W \sim N(0, W^2),\qquad(2)$$
where W has a $Pareto(1, \nu)$ distribution and $N(\mu, \sigma^2)$ stands for the Gaussian distribution with mean μ and variance $\sigma^2$. Note that recently, some other parametrization has been used (see for example [14,15]), in which it is assumed that $X\,|\,\tilde{W} \sim N(\mu, 1/\tilde{W})$, where $\tilde{W}$ has a $Beta(\tilde{\nu}, 1)$ distribution. However, both parametrizations are equivalent under $\nu = 2\tilde{\nu}$, since $Pareto(1, \nu)$ is the inverse distribution of $Beta(\nu, 1)$. In this paper, we choose to stick to the original parametrization by [5], as it enables a direct comparison of the slash distribution's properties with those of Student's t. We will discuss this later in this section. The probability density function (pdf) of the random variable X given by (2) is
$$
\mathrm{pdf}(x) = \int_1^{\infty} \frac{\nu W^{-\nu-1}\, e^{-\frac{x^2}{2W^2}}}{\sqrt{2\pi}\,W}\, dW = \begin{cases} \dfrac{2^{\frac{\nu}{2}-1}\,\nu\,|x|^{-(\nu+1)}\left[\Gamma\!\left(\frac{\nu+1}{2}\right)-\Gamma\!\left(\frac{\nu+1}{2},\frac{x^2}{2}\right)\right]}{\sqrt{\pi}}, & x \neq 0,\\[6pt] \dfrac{\nu}{\sqrt{2\pi}\,(\nu+1)}, & x = 0, \end{cases}\qquad(3)
$$
which is easy to obtain by noticing that the function under the integral above is the kernel of an inverse gamma distribution (for $x \neq 0$) in $W^2$, and $\Gamma(a, x)$ denotes the upper incomplete gamma function: $\Gamma(a, x) = \int_x^{\infty} t^{a-1} e^{-t}\, dt$. The exact form of the pdf (3) of the slash distribution also admits a recurrent form, which was also shown in [5]. The density is a continuous function on the real line, symmetric and unimodal. The characteristic function of X given by (2) has the form $E(e^{itX}) = \frac{\nu}{2} E_{\frac{\nu}{2}+1}\!\left(\frac{t^2}{2}\right)$, where $E_n(z) = \int_1^{\infty} e^{-tz}\, t^{-n}\, dt$ is the exponential integral function. It is well known that the slash distribution reduces to the standard Gaussian case as $\nu \to \infty$. Note that, similarly to Student's t-distribution, the moment $E(X^{2r})$ exists only under the assumption $\nu > 2r$, $r \in \mathbb{N}$, and is given as
$$
E(X^{2r}) = E\!\left(W^{2r}\right)\frac{2^r\,\Gamma\!\left(r+\frac{1}{2}\right)}{\sqrt{\pi}} = \frac{\nu\, 2^r\, \Gamma\!\left(r+\frac{1}{2}\right)}{\sqrt{\pi}\,(\nu-2r)}.
$$
Hence, the variance exists only under $\nu > 2$ and equals $E(X^2) = \nu/(\nu-2)$, which is the same as for Student's t-distribution with ν degrees of freedom. The kurtosis of the slash distribution exists only for $\nu > 4$ and equals $Ku(X) = 3(\nu-2)^2/[(\nu-4)\nu]$. For a given value of ν, the slash distribution's kurtosis is lower than in Student's t-distribution with ν degrees of freedom, but is likewise a decreasing function of ν, tending to 3 as $\nu \to \infty$ and approaching ∞ as $\nu \to 4^+$. Interestingly, the slash distribution features the same tail thickness as the t-distribution with ν degrees of freedom, i.e.,
$$
\lim_{x\to\infty} \frac{\mathrm{pdf}(x)\ \text{given by (3)}}{x^{-(\nu+1)}} = \frac{2^{\frac{\nu}{2}-1}\,\nu\,\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi}} \neq 0.
$$
Thereby, the probability of extreme values increases as $\nu \to 0^+$, while for $\nu \to \infty$, the slash distribution reduces to the standard Gaussian case.
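As a quick sanity check on the moment formula above, slash draws can be simulated through the Pareto inverse cdf (a numpy sketch with an illustrative ν = 10, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(9)
nu, n = 10.0, 400_000
w = (1 - rng.random(n)) ** (-1 / nu)    # W ~ Pareto(1, nu) via inverse cdf
x = w * rng.standard_normal(n)          # slash draw: X | W ~ N(0, W^2)

kurt = (x ** 4).mean() / x.var() ** 2
# theory: variance nu/(nu-2) = 1.25, kurtosis 3(nu-2)^2/((nu-4)nu) = 3.2
print(x.var(), kurt)
```

Both sample statistics should land close to the closed-form values, since ν = 10 comfortably satisfies the existence conditions ν > 2 and ν > 4.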
In view of the above, the properties of the slash distribution with the parameter ν are comparable with those of Student's t-distribution, with both sharing and ‘suffering’ from the same restriction ensuring the existence of their k-th moments: $\nu > k$. Otherwise, the relevant integrals do not converge, since too much probability is located in the tails.
In the following subsection, we propose a modification of the slash distribution to ‘curb’ the heaviness of the tails but only far enough from the mode. In consequence, all moments of such a modified distribution exist.

2.2. The Concept of the Local Slash (LS) Distribution

Here, we modify the slash distribution reviewed in the previous section by introducing a censored variable in place of W in (2):
$$X\,|\,W \sim N\!\left(0, \min\{W^2, d^2\}\right),\qquad(5)$$
where, similarly as before, W follows a $Pareto(1, \nu)$ distribution with $\nu > 0$, while $d \in (1, \infty)$ is a censoring parameter. The marginal distribution of X given by (5) depends strongly on the parameter d, which is related to the tail thickness. The introduction of the parameter d ensures that the tails' thickness of X is upheld only up to some distance from the mode (with the distance being controlled by d), while thinning out thereafter. Since the heavy-tailedness is only “local” now, we name the new distribution the local slash (LS) distribution.
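The censoring mechanism is easy to see in simulation: the Pareto scale is unbounded, but its censored version never exceeds d, and the share of censored draws equals $P(W \geq d) = d^{-\nu}$. A numpy sketch (illustrative parameter values, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
nu, d, n = 0.5, 4.0, 100_000
w = (1 - rng.random(n)) ** (-1 / nu)   # W ~ Pareto(1, nu): unbounded scale
scale = np.minimum(w, d)               # censoring: tail thickness only up to d
x = scale * rng.standard_normal(n)     # local slash draw

# the Gaussian component N(0, d^2) receives weight P(W >= d) = d^{-nu} = 0.5 here
print(scale.max(), (scale == d).mean())
```

With ν = 0.5 and d = 4, half of the draws hit the censoring bound, which anticipates the discrete-mixture representation discussed below.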
The probability density function of the LS distribution can be easily derived and admits the form
$$
\mathrm{pdf}(x) = \begin{cases} \dfrac{\sqrt{2}\,d^{-\nu-1}\, e^{-\frac{x^2}{2d^2}} + 2^{\frac{\nu}{2}}\,\nu\,(x^2)^{-\frac{\nu+1}{2}}\left[\Gamma\!\left(\frac{\nu+1}{2},\frac{x^2}{2d^2}\right)-\Gamma\!\left(\frac{\nu+1}{2},\frac{x^2}{2}\right)\right]}{2\sqrt{\pi}}, & x \neq 0,\\[8pt] \dfrac{d^{-\nu-1}\left(\nu\, d^{\nu+1}+1\right)}{\sqrt{2\pi}\,(\nu+1)}, & x = 0. \end{cases}
$$
Note that the LS distribution is symmetric and is a discrete mixture of the zero-mean Gaussian distribution with variance $d^2$ and some other distribution with its pdf given by
$$
\begin{cases} \dfrac{2^{\frac{\nu}{2}}\,\nu\,(x^2)^{-\frac{\nu+1}{2}}\left[\Gamma\!\left(\frac{\nu+1}{2},\frac{x^2}{2d^2}\right)-\Gamma\!\left(\frac{\nu+1}{2},\frac{x^2}{2}\right)\right]}{2\sqrt{\pi}\left(1-d^{-\nu}\right)}, & x \neq 0,\\[8pt] \dfrac{\nu\left(d^{\nu+1}-1\right)}{\sqrt{2\pi}\,d\,(\nu+1)\left(d^{\nu}-1\right)}, & x = 0. \end{cases}
$$
The weights of this mixture are $d^{-\nu}$ and $1-d^{-\nu}$, respectively. Hence, if $\nu \to 0^+$, then the LS distribution collapses to $N(0, d^2)$.
The LS distribution’s cdf is given as
$$
\mathrm{cdf}(y) = \begin{cases} \dfrac{1}{2}\left[\operatorname{erf}\!\left(\dfrac{y}{\sqrt{2}}\right) + \dfrac{2^{\frac{\nu}{2}}\, y^{-\nu}\left[\Gamma\!\left(\frac{\nu+1}{2},\frac{y^2}{2}\right)-\Gamma\!\left(\frac{\nu+1}{2},\frac{y^2}{2d^2}\right)\right]}{\sqrt{\pi}} + 1\right], & y \neq 0,\\[8pt] \dfrac{1}{2}, & y = 0, \end{cases}
$$
while the characteristic function is
$$
E\!\left(e^{itX}\right) = \frac{1}{2}\,d^{-\nu}\left[2\,e^{-\frac{1}{2}d^2t^2} - \nu\, E_{\frac{\nu}{2}+1}\!\left(\frac{d^2t^2}{2}\right)\right] + \frac{\nu}{2}\, E_{\frac{\nu}{2}+1}\!\left(\frac{t^2}{2}\right).
$$
One of the most important properties of the LS distribution is that the moments $E(X^{2r})$, $r \in \mathbb{N}$, exist for any $\nu \in \mathbb{R}_+$, and
$$
E(X^{2r}) = E\!\left(E(X^{2r}\,|\,W)\right) = \begin{cases} \dfrac{2^r\, d^{-\nu}\,\Gamma\!\left(r+\frac{1}{2}\right)\left(2r\, d^{2r}-\nu\, d^{\nu}\right)}{\sqrt{\pi}\,(2r-\nu)}, & 2r \neq \nu,\\[8pt] \dfrac{2^r\,\Gamma\!\left(r+\frac{1}{2}\right)\left(2r\log(d)+1\right)}{\sqrt{\pi}}, & 2r = \nu, \end{cases}
$$
which is a continuous function of ν and d for any fixed $r \in \mathbb{N}$. Thus, the variance exists for any $\nu \in \mathbb{R}_+$ and equals
$$
E(X^2) = \begin{cases} \dfrac{2\, d^{2-\nu}-\nu}{2-\nu}, & \nu \neq 2,\\[6pt] 2\log(d)+1, & \nu = 2. \end{cases}
$$
Note that for any fixed ν > 0 we have
$$
\lim_{d\to\infty} E(X^2) = \begin{cases} \dfrac{\nu}{\nu-2}, & \nu > 2,\\[4pt] \infty, & \nu \leq 2, \end{cases}
$$
which means that in the limiting case of d , the variance stabilizes for ν > 2 and tends to the variance of the slash distribution (coinciding with the one of Student’s t-distribution). The above result also indicates that for any ν 2 , the variance exists and can be arbitrarily high (for sufficiently large d).
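The variance formula above can be checked by Monte Carlo. The sketch below (numpy, illustrative values ν = 1, d = 3) is telling precisely because the plain slash distribution with ν = 1 has no finite variance at all, while its local counterpart does:

```python
import numpy as np

rng = np.random.default_rng(1)
nu, d, n = 1.0, 3.0, 500_000
w = (1 - rng.random(n)) ** (-1 / nu)            # W ~ Pareto(1, nu)
x = np.minimum(w, d) * rng.standard_normal(n)   # LS draw with nu = 1, d = 3

var_formula = (2 * d ** (2 - nu) - nu) / (2 - nu)   # the nu != 2 branch
print(x.var(), var_formula)   # theory gives (6 - 1)/1 = 5
```

The agreement between the simulated and closed-form variance illustrates how the censoring restores all moments for ν ≤ 2.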
The kurtosis exists for any d and ν and equals
$$
Ku(X) = \begin{cases} \dfrac{3(\nu-2)^2\, d^{\nu}\left(\nu\, d^{\nu}-4d^4\right)}{(\nu-4)\left(\nu\, d^{\nu}-2d^2\right)^2}, & \nu \neq 2, 4,\\[8pt] \dfrac{6d^2-3}{\left(2\log(d)+1\right)^2}, & \nu = 2,\\[8pt] \dfrac{3\, d^4\left(4\log(d)+1\right)}{\left(1-2d^2\right)^2}, & \nu = 4. \end{cases}
$$
Hence,
$$
\lim_{d\to\infty} Ku(X) = \begin{cases} \dfrac{3(\nu-2)^2}{(\nu-4)\,\nu}, & \nu > 4,\\[4pt] \infty, & \nu \leq 4. \end{cases}
$$
Figure 1 presents the shape of the variance and kurtosis of the LS distribution as a function of both ν and d. Two white lines represent the cases where $\nu = 2$ or $\nu = 4$. Note that for any fixed $d > 1$, the variance increases as $\nu \to 0^+$, while the kurtosis tends to 3 as either $\nu \to 0^+$ or $\nu \to \infty$, achieving its maximum at $\nu = 2$ (see Figure 1).
Figure 2 presents the density functions of the LS distribution for different values of ν and d. Note that the higher the value of d, the more mass is spread out in the tails (for a fixed ν).

2.3. Local Modulated Normal Type I Distribution

We begin this section with a presentation of the modulated normal (MN) type I distribution introduced by [4], which we later modify into the local modulated normal (LMN) type I distribution. Firstly, let us recall that a random variable X follows an MN type I distribution if
$$X\,|\,R \sim N(0, R^2),$$
where R is a continuous random variable on (0, 1) with a $Beta(\rho, 1)$ distribution, $\rho > 0$. The pdf and cdf of the random variable X admit the forms
$$
\mathrm{pdf}(x) = \int_0^1 \frac{\rho R^{\rho-1}\, e^{-\frac{x^2}{2R^2}}}{\sqrt{2\pi}\,R}\, dR = \begin{cases} \dfrac{2^{-\frac{\rho}{2}-1}\,\rho\,(x^2)^{-\frac{1-\rho}{2}}\,\Gamma\!\left(\frac{1-\rho}{2},\frac{x^2}{2}\right)}{\sqrt{\pi}}, & x \neq 0,\\[8pt] \dfrac{\rho}{\sqrt{2\pi}\,(\rho-1)}, & x = 0,\ \rho > 1,\\[6pt] \infty,\ \text{i.e.,}\ \lim\limits_{x\to 0} \dfrac{2^{-\frac{\rho}{2}-1}\,\rho\,(x^2)^{-\frac{1-\rho}{2}}\,\Gamma\!\left(\frac{1-\rho}{2},\frac{x^2}{2}\right)}{\sqrt{\pi}} = \infty, & x = 0,\ 0 < \rho \leq 1, \end{cases}
$$
$$
\mathrm{cdf}(y) = \begin{cases} 1 - \dfrac{1}{2}\operatorname{erfc}\!\left(\dfrac{y}{\sqrt{2}}\right) + \dfrac{y\, E_{\frac{\rho+1}{2}}\!\left(\frac{y^2}{2}\right)}{2\sqrt{2\pi}}, & y \neq 0,\\[8pt] \dfrac{1}{2}, & y = 0. \end{cases}
$$
The characteristic function equals $2^{\frac{\rho}{2}-1}\,\rho\,(t^2)^{-\frac{\rho}{2}}\left[\Gamma\!\left(\frac{\rho}{2}\right)-\Gamma\!\left(\frac{\rho}{2},\frac{t^2}{2}\right)\right]$ and tends to the characteristic function of the standard normal distribution as $\rho \to \infty$, which indicates the equivalence of the two distributions in the limiting case. The moments exist for any $\rho > 0$ and equal $E(X^{2r}) = \frac{\rho\, 2^r\,\Gamma\left(r+\frac{1}{2}\right)}{\sqrt{\pi}\,(\rho+2r)}$, $r \in \mathbb{N}$. Hence, the variance is $E(X^2) = \frac{\rho}{\rho+2}$, while the kurtosis equals $Ku(X) = \frac{3(\rho+2)^2}{\rho(\rho+4)}$, which is a decreasing function of ρ (yet always taking on values higher than 3) and tends to ∞ as $\rho \to 0^+$.
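The variance and kurtosis formulas above are straightforward to verify by simulation, drawing $R \sim Beta(\rho, 1)$ through its inverse cdf $u^{1/\rho}$ (a numpy sketch with an illustrative ρ = 2, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
rho, n = 2.0, 400_000
r = rng.random(n) ** (1 / rho)       # R ~ Beta(rho, 1) via inverse cdf
x = r * rng.standard_normal(n)       # MN type I draw: X | R ~ N(0, R^2)

kurt = (x ** 4).mean() / x.var() ** 2
# theory: variance rho/(rho+2) = 0.5, kurtosis 3(rho+2)^2/(rho(rho+4)) = 4
print(x.var(), kurt)
```

Despite the bounded scale (hence thin tails), the sample kurtosis exceeds 3, which is exactly the leptokurticity the MN type I distribution is meant to capture.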
Figure 3 presents the pdf of the MN type I distribution for different values of ρ, compared with the Gaussian curve. The presented cases exhibit the ability of this distribution to capture leptokurtic data. It is straightforward to show that, for $\rho < 2$, the pdf is a convex function on the sets $(0, \infty)$ and $(-\infty, 0)$. If $1 < \rho < 2$ (see Figure 3, green line), the density is pointed at zero, thus not differentiable at this point. If $\rho \leq 1$, the pdf tends to ∞ as $x \to 0$, and the distribution degenerates to a single-point distribution as $\rho \to 0$. Hence, for $\rho \leq 1$, the normalized probability $\epsilon^{-1} P(|X| < \epsilon)$ tends to ∞ as $\epsilon \to 0$. For $\rho > 1$, this probability approaches $\sqrt{2/\pi}\,\rho/(\rho-1)$, which diverges as $\rho \to 1^+$.
Note that the accumulation of mass near zero is typical for the returns of many financial assets, corresponding to only small or even no price changes at all. However, such behaviour of the data can be observed only locally, in the sense that for a finite sample size it seems a little far-fetched to assume that the normalized probability $\epsilon^{-1} P(|X| < \epsilon)$ is unbounded. Note that for a finite sample size, such a property for $\rho \leq 1$ becomes a (somewhat incidental) assumption that could not be validated. To weaken this assumption and control for the pointedness of the distribution, we introduce a locally leptokurtic distribution with a pdf continuous at zero, for which $\epsilon^{-1} P(|X| < \epsilon)$ is bounded on $\epsilon > 0$ for any fixed $\rho > 0$, including $\rho \leq 1$. Moreover, the pdf of such a distribution is differentiable on the entire real line.
The idea for constructing a locally leptokurtic distribution is to use the censored random variable R as a standard deviation of X, which (conditionally on R) is normally distributed:
$$X\,|\,R \sim N\!\left(0, \max\{R^2, c^2\}\right),\qquad(16)$$
where R is a continuous random variable with a $Beta(\rho, 1)$ distribution and $c \in (0, 1)$ is a censoring parameter. We refer to the marginal distribution of X as the local modulated normal (LMN) type I distribution. Its pdf and cdf are given by the following formulas:
$$
\mathrm{pdf}(x) = \int_0^c \frac{\rho R^{\rho-1}\, e^{-\frac{x^2}{2c^2}}}{\sqrt{2\pi}\,c}\, dR + \int_c^1 \frac{\rho R^{\rho-1}\, e^{-\frac{x^2}{2R^2}}}{\sqrt{2\pi}\,R}\, dR = \begin{cases} \dfrac{2^{-\frac{\rho}{2}}\,\rho\,(x^2)^{-\frac{1-\rho}{2}}\left[\Gamma\!\left(\frac{1-\rho}{2},\frac{x^2}{2}\right)-\Gamma\!\left(\frac{1-\rho}{2},\frac{x^2}{2c^2}\right)\right] + \sqrt{2}\, c^{\rho-1}\, e^{-\frac{x^2}{2c^2}}}{2\sqrt{\pi}}, & x \neq 0,\\[8pt] \dfrac{c\rho - c^{\rho}}{\sqrt{2\pi}\, c\,(\rho-1)}, & x = 0, \end{cases}
$$
$$
\mathrm{cdf}(y) = \begin{cases} \dfrac{y\left[c\, E_{\frac{\rho+1}{2}}\!\left(\frac{y^2}{2}\right) - c^{\rho}\, E_{\frac{\rho+1}{2}}\!\left(\frac{y^2}{2c^2}\right)\right]}{2\sqrt{2\pi}\, c} - \dfrac{1}{2}\operatorname{erfc}\!\left(\dfrac{y}{\sqrt{2}}\right) + 1, & y \neq 0,\\[8pt] \dfrac{1}{2}, & y = 0. \end{cases}
$$
Note that the pdf is continuous on the entire real line. Since the first term of the above cdf tends to 0 as $\rho \to \infty$, the LMN type I distribution reduces to the standard normal distribution in this limiting case, for any fixed $c \in (0, 1)$. Moreover, it reduces to a zero-mean Gaussian case with variance $c^2$ when $\rho \to 0^+$.
The characteristic function of the LMN type I distribution admits the form $c^{\rho}\, e^{-\frac{1}{2}c^2t^2} - \frac{\rho}{2}\left[E_{1-\frac{\rho}{2}}\!\left(\frac{t^2}{2}\right) - c^{\rho}\, E_{1-\frac{\rho}{2}}\!\left(\frac{c^2t^2}{2}\right)\right]$. The moments $E(X^{2r})$ exist for any $\rho > 0$, $c \in (0, 1)$, and are given as $E(X^{2r}) = \frac{2^r\,\Gamma\left(r+\frac{1}{2}\right)\left(2r\, c^{\rho+2r}+\rho\right)}{\sqrt{\pi}\,(\rho+2r)}$, $r \in \mathbb{N}$. Hence, the variance equals $E(X^2) = \frac{2c^{\rho+2}+\rho}{\rho+2}$, while the kurtosis $Ku(X) = \frac{3(\rho+2)^2\left(4c^{\rho+4}+\rho\right)}{(\rho+4)\left(2c^{\rho+2}+\rho\right)^2}$ is higher than 3, approaching this value as either $\rho \to 0^+$ or $\rho \to \infty$.
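The LMN type I variance formula can be checked the same way as before, now flooring the Beta-distributed scale at c (a numpy sketch with illustrative ρ = 1, c = 0.5):

```python
import numpy as np

rng = np.random.default_rng(4)
rho, c, n = 1.0, 0.5, 400_000
r = rng.random(n) ** (1 / rho)                  # R ~ Beta(rho, 1)
x = np.maximum(r, c) * rng.standard_normal(n)   # LMN type I draw

var_formula = (2 * c ** (rho + 2) + rho) / (rho + 2)
print(x.var(), var_formula)   # theory: (2*0.125 + 1)/3 = 1.25/3
```

Flooring the scale at c removes the extreme peak at zero while leaving the overall spread close to that of the uncensored MN type I case.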
Figure 4 presents the densities of the LMN type I distribution for different values of c under $\rho = 1$, while Figure 5—under $\rho = 5/4$, with $c \in \{0.02, 0.05, 0.1, 0.2, 0.5, 0.8\}$ in both cases. The presented curves exhibit visible flexibility in capturing leptokurtic data. In particular, it can be noticed that a combination of a small value of ρ with a small value of c can produce an “extremely” leptokurtic distribution (see Figure 5, with $c = 0.02$).

2.4. Locally Both Leptokurtic and Fat-Tailed Distribution

To capture both effects in financial modelling, that is, leptokurticity and fat tails, we construct a new distribution, belonging to the SMN family, that combines the advantages of both the LS and LMN type I distributions introduced above. Let the conditional distribution of some random variable X be defined as:
$$X\,|\,R \sim N\!\left(0, \min\{\max\{R^2, c^2\}, d^2\}\right),\qquad(19)$$
where R is a mixture of $Beta(\rho, 1)$ and $Pareto(1, \nu)$ distributions with weights p and $1-p$, respectively, i.e.,
$$
R \sim \begin{cases} Beta(\rho, 1), & \text{with probability } p,\\ Pareto(1, \nu), & \text{with probability } 1-p. \end{cases}\qquad(20)
$$
We name the marginal distribution of X the locally both leptokurtic and fat-tailed distribution (LLFT, in short), since it has the ability to be (locally) leptokurtic and (locally) heavy-tailed. Note that the LLFT distribution can be written as a mixture of the LS distribution (given by (5)) and the LMN type I distribution (given by (16)) with weights $1-p$ and p, respectively. The pdf, cdf, characteristic function and moments follow immediately from the previous two sections and the well-known properties of mixtures of probability distributions; therefore, we omit the details. Figure 6 presents the limiting cases of the LLFT distribution, the set of which comprises the slash, LS, MN type I, LMN type I and Gaussian distributions.
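Sampling from the LLFT distribution is direct from the doubly censored scale-mixture representation (19)-(20). The following numpy sketch (illustrative parameter values, not estimates from the paper) also checks the sample variance against the mixture of the LS and LMN type I variances:

```python
import numpy as np

def rllft(n, nu, rho, p, c, d, rng):
    """Draws from the LLFT distribution via its scale-mixture representation."""
    take_beta = rng.random(n) < p
    r = np.where(take_beta,
                 rng.random(n) ** (1 / rho),          # Beta(rho, 1) branch
                 (1 - rng.random(n)) ** (-1 / nu))    # Pareto(1, nu) branch
    scale = np.minimum(np.maximum(r, c), d)           # double censoring
    return scale * rng.standard_normal(n)

rng = np.random.default_rng(5)
x = rllft(400_000, nu=1.0, rho=1.0, p=0.5, c=0.1, d=3.0, rng=rng)

# mixture of the LMN variance (weight p) and the LS variance (weight 1-p)
var_theory = 0.5 * (2 * 0.1 ** 3 + 1) / 3 + 0.5 * (2 * 3 - 1) / 1
print(x.var(), var_theory)
```

Note that the LS branch alone (ν = 1) would have no finite variance without the censoring at d.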
Figure 7 compares the LLFT distribution's pdf for different values of the censoring parameters $c \in \{0^+, 0.02, 0.05, 0.1, 0.2\}$ and $d \in \{3, 4, 5, 6, \infty\}$, under fixed $\nu = 1$, $\rho = 1$ and $p = 1/2$. The parameter c clearly affects the pointedness of the pdf curve: the lower the value of c, the more peaked the pdf (see Figure 7b). Moving away from the mode, the curves behave similarly (see Figure 7c), with their tails' local heaviness driven by the parameter d: the higher its value, the longer the tails keep their local heaviness (see Figure 7d) before they thin out (see Figure 7e). This, however, does not pertain to the case of $d \to \infty$ (red line), which represents a simple mixture of the slash and MN type I distributions (without their “local” modifications).
As regards the moments of the LLFT distribution, they can easily be derived as the moments of a mixture, and thus we skip their presentation. However, for a reason clarified below, it merits a mention that the expectation $E(e^{rX})$ exists for any $r \in \mathbb{N}$ and equals
$$
E\!\left(e^{rX}\right) = 1 + \sum_{k=1}^{\infty}\left(p\cdot g_k + (1-p)\cdot f_k\right) < \infty,\qquad(21)
$$
where
$$
g_k = \frac{2^{-k}\, r^{2k}\left(2k\, c^{2k+\rho} + \rho\right)}{(2k+\rho)\,\Gamma(k+1)}
$$
and
$$
f_k = \begin{cases} \dfrac{2^{-k}\, d^{-\nu}\, r^{2k}\left(2k\, d^{2k} - \nu\, d^{\nu}\right)}{(2k-\nu)\,\Gamma(k+1)}, & 2k \neq \nu,\\[8pt] \dfrac{2^{-k}\, r^{2k}\left(2k\log(d)+1\right)}{\Gamma(k+1)}, & 2k = \nu. \end{cases}
$$
It can be shown that the finiteness of $E(e^{rX})$ stems from the tails of $X \sim LLFT$ thinning out beyond the censoring point and eventually converging to the tails of a zero-mean normal distribution with variance $d^2$.
The above feature of the LLFT distribution could prove essential in empirical finance, particularly option pricing, where predicting the price of a derivative hinges upon models built for logarithmic rates of return, say $y_t$, the conditional distribution of which (given the past of the process) belongs to the same family of distributions as the one assumed for the standardized errors of the observations. Transformation of the return $y_t$ into the price requires applying the exponential function to $y_t$. Depending on the distribution family of the latter, the conditional (on the past) expectation of the price may simply not exist (as happens in the case of $y_t$ following a Student's t or slash distribution). However, from (21) it follows that if the conditional distribution of $y_t$ is LLFT, then the expectation of the induced price distribution is finite (conditionally on the past price).

3. The SV Model with Locally Leptokurtic and Fat-Tailed Innovations

In this section, we use the newly proposed LLFT distribution (given by (19) and (20)) in the context of modelling financial time series. In volatility modelling of this type of data, two major classes of models are used: the generalized autoregressive conditionally heteroscedastic (GARCH) and stochastic volatility (or stochastic variance, SV) models. Here, we focus on the latter class. The stochastic variance models were introduced to describe time-varying volatility ([17,24,28,29,30,31]), and they seem to be more flexible than GARCH models. In the basic SV model, a log-normal autoregressive process is specified for the conditional variance, with the conditional mean equation's innovations following the Gaussian distribution. We generalize such a model by assuming the LLFT distribution instead, which yields the LLFT-SV specification. Due to the LLFT distribution's properties (see Section 2), such a model is a natural and promising alternative to the heavy-tailed SV (including t-SV) models proposed in, e.g., [14,17].
The basic stochastic volatility model is defined as follows:
$$
y_t = x_t\beta + \varepsilon_t\sqrt{h_t},
$$
$$
\ln h_t = \gamma + \phi\left(\ln h_{t-1} - \gamma\right) + \eta_t,
$$
where $y_t$ denotes an observable at time t, $x_t$ is a 1 × r vector of exogenous variables or lagged observations with parameters comprised in an r × 1 vector β, $\varepsilon_t \sim iiN(0, 1)$ is a Gaussian white noise sequence with mean zero and unit variance, $\eta_t \sim iiN(0, \sigma^2)$, $\eta_t$ and $\varepsilon_s$ are mutually independent (denoted as $\eta_t \perp \varepsilon_s$) for all $t, s \in \{1, 2, \ldots, T\}$, T is the length of the modelled time series, and finally, $0 < \phi < 1$.
We extend the above SV specification by waiving the normality of $\varepsilon_t$ and replacing it with the LLFT distribution given by (19):
$$
\varepsilon_t = \lambda_t \min\{\max\{\omega_t, c\}, d\},
$$
where $\lambda_t \sim iiN(0, 1)$, $\omega_t \sim Beta(\rho, 1)$ with probability p and $\omega_t \sim Pareto(1, \nu)$ with probability $1-p$, and $\lambda_s \perp \omega_t$ for $s, t \in \{1, 2, \ldots, T\}$, with censoring parameters c and d such that $0 < c < 1 < d$.
The conditional distribution of $y_t$ in the LLFT-SV model (given the past of $y_t$ and the current latent variables $h_t$ and $\omega_t$) is determined by the distribution of $\lambda_t$, so $y_t$ follows the normal distribution with mean $\mu_t = x_t\beta$ and standard deviation $\sigma_t = \sqrt{h_t}\,\min\{\max\{\omega_t, c\}, d\}$. For $\mu_t$ we assume an autoregressive structure of order m with a constant: $\mu_t = \beta_0 + \beta_1 y_{t-1} + \beta_2 y_{t-2} + \cdots + \beta_m y_{t-m}$, where the polynomial $1 - \sum_{j=1}^m \beta_j B^j$ has roots outside the unit circle.
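A path from the LLFT-SV model can be simulated directly from the two equations above. The numpy sketch below takes $\mu_t = 0$ and illustrative parameter values (not estimates from the paper):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 2000
gamma, phi, sigma = 0.0, 0.95, 0.2       # log-volatility process parameters
nu, rho, p, c, d = 1.0, 1.0, 0.5, 0.1, 3.0   # LLFT error parameters

y = np.empty(T)
lnh = gamma                               # start log-volatility at its mean
for t in range(T):
    lnh = gamma + phi * (lnh - gamma) + sigma * rng.standard_normal()
    if rng.random() < p:
        w = rng.random() ** (1 / rho)         # Beta(rho, 1) draw
    else:
        w = (1 - rng.random()) ** (-1 / nu)   # Pareto(1, nu) draw
    eps = min(max(w, c), d) * rng.standard_normal()
    y[t] = np.exp(lnh / 2) * eps              # mu_t = 0 for simplicity
print(y.shape, np.isfinite(y).all())
```

The resulting series shows the usual SV-style volatility clustering, with the LLFT error adding both near-zero returns and locally heavy-tailed outliers.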

3.1. Bayesian Setup

The Bayesian statistical model amounts to specifying the joint distribution of all observations, latent variables and parameters. The assumptions presented so far determine the conditional distribution of the observations and latent variables given the parameters, thus necessitating the marginal distribution of the parameters (the prior or a priori distribution) to be formulated. In the prior structure, we assume mutual independence between the parameters $\beta, \phi, \gamma, \sigma^2, p, \nu, \rho, c, d$ and use standard prior distributions. The vector β, as well as the scalar parameter ϕ, are assumed to have truncated normal distributions. The parameter γ has a normal prior, whereas for $\sigma^2$ we set an inverse gamma prior distribution $IG(\alpha_\sigma, \beta_\sigma)$ with mean $\beta_\sigma/(\alpha_\sigma - 1)$ under $\alpha_\sigma > 1$. For ν we assume a gamma distribution $G(\alpha_\nu, \beta_\nu)$ with mean $\alpha_\nu/\beta_\nu$. For ρ, we also assume a gamma distribution, while for the parameter p, a beta distribution. Finally, for the parameters c and d, we assume inverse Nakagami distributions (see [32]) truncated to the intervals (0, 1) and (1, ∞), respectively. A random variable x follows an inverse Nakagami distribution with parameters $\alpha, \beta > 0$, which we write as $x \sim INK(\alpha, \beta)$, if $x^2$ is inverse-gamma distributed, with mean $\frac{\beta}{\alpha-1}$ (under $\alpha > 1$) and variance $\frac{\beta^2}{(\alpha-1)^2(\alpha-2)}$ (under $\alpha > 2$); owing to this, sampling from the $INK(\alpha, \beta)$ distribution boils down to sampling from the corresponding $IG(\alpha, \beta)$ distribution and taking the square root of the draw. The probability density function is given as $f_{INK}(x\,|\,\alpha, \beta) = \frac{2\beta^{\alpha}}{\Gamma(\alpha)}\left(\frac{1}{x^2}\right)^{\alpha+\frac{1}{2}} e^{-\beta\frac{1}{x^2}}$. The inverse Nakagami distribution has been presented by [32], although in a different parameterization. Note that under c and d following truncated INK distributions, $c^2$ and $d^2$ are then inverse-gamma distributed (with corresponding truncations). Notice that for $c \in (0, 1)$, a beta prior could be considered instead.
However, the resulting full conditional posterior would be far less tractable for MCMC sampling. The exact specifications of these distributions are presented later in this section.
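Sampling from the truncated $INK$ priors exploits exactly this square-root link to the inverse gamma. The sketch below is in Python (the paper's actual computations were done in GAUSS); the rejection-based handling of the truncation and the hyperparameter values used for $d$ are our illustrative assumptions, not taken from the paper:

```python
import math
import random

def rinvnakagami(alpha, beta, lo=0.0, hi=float("inf"), rng=random):
    # x ~ INK(alpha, beta)  <=>  x^2 ~ IG(alpha, beta): draw g ~ Gamma(alpha, rate=beta),
    # invert it and take the square root; truncation to (lo, hi) is handled by simple
    # rejection, which is fine whenever the interval carries non-negligible prior mass
    while True:
        g = rng.gammavariate(alpha, 1.0 / beta)   # Gamma(alpha, scale = 1/beta)
        x = math.sqrt(1.0 / g)                    # an INK(alpha, beta) draw
        if lo < x < hi:
            return x

random.seed(1)
# prior for c used later in the empirical part: INK(2, 0.1) truncated to (0, 1)
draws_c = [rinvnakagami(2.0, 0.1, hi=1.0) for _ in range(1000)]
# a prior for d truncated to (1, inf); these hyperparameter values are illustrative
draws_d = [rinvnakagami(2.0, 5.0, lo=1.0) for _ in range(200)]
```

Note that plain rejection would be inefficient if the truncation interval carried almost no prior mass; in that case an inverse-cdf approach on the $IG$ scale would be preferable.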
Under a sample $y_1, y_2, \ldots, y_T$, where $T$ is the sample's length, we introduce the following matrix notation: $y = (y_1, y_2, \ldots, y_T)'$, $h = (h_1, h_2, \ldots, h_T)'$, $\omega = (\omega_1, \omega_2, \ldots, \omega_T)'$, $y^{(0)} = (h_0, y_{-m+1}, y_{-m+2}, \ldots, y_0)'$ and
$X = [x_1 \; \ldots \; x_T]'$, where $x_t = [1, y_{t-1}, y_{t-2}, \ldots, y_{t-m}]'$, for $t = 1, 2, \ldots, T$.
As regards the initial conditions $y^{(0)}$, for $h_t$ we fix $h_0 = 1$, while for $y_t$ the first $m$ pre-sample observations are used. The following symbols are used:
$f_{N_n}(x \,|\, u, V)$ —the probability density function of the $n$-variate normal distribution with mean vector $u$ and positive definite covariance matrix $V$,
$f_N(x \,|\, a, b)$ —the probability density function of the normal distribution with mean $a$ and variance $b$,
$f_{LN}(x \,|\, a, b)$ —the probability density function of the log-normal distribution with mean $a$ and variance $b$,
$f_G(x \,|\, a, b)$ —the probability density function of the gamma distribution with mean $a/b$,
$f_{IG}(x \,|\, a, b)$ —the probability density function of the inverse gamma distribution with mean $b/(a-1)$ for $a > 1$,
$f_{Beta}(x \,|\, a, b)$ —the probability density function of the beta distribution with parameters $a$ and $b$,
$f_{Pareto}(x \,|\, a, b)$ —the probability density function of the Pareto distribution with parameters $a$ and $b$,
$I_{(a,b)}(x)$ —the indicator function of the interval $(a, b)$.
Now the Bayesian model can be fully written as:
$$p(y, h, \omega, \beta, \phi, \gamma, \sigma^2, p, \nu, \rho, c, d \,|\, y^{(0)}) = \prod_{t=1}^{T} f_N\!\left(y_t \,\big|\, \mu_t,\; h_t \{\min\{\max\{\omega_t, c\}, d\}\}^2\right) \times \prod_{t=1}^{T} f_{LN}\!\left(h_t \,\big|\, e^{\gamma + \phi(\ln h_{t-1} - \gamma) + \frac{1}{2}\sigma^2},\; (e^{\sigma^2} - 1)\, e^{2[\gamma + \phi(\ln h_{t-1} - \gamma)] + \sigma^2}\right) \times p(\omega)\, p(\beta)\, p(\phi)\, p(\sigma^2)\, p(\gamma)\, p(\nu)\, p(\rho)\, p(c)\, p(d)\, p(p), \tag{25}$$
where
$$p(\omega) = \prod_{t=1}^{T} \left[\, p\, f_{Beta}(\omega_t \,|\, \rho, 1) + (1-p)\, f_{Pareto}(\omega_t \,|\, 1, \nu) \,\right],$$
and the prior distributions are specified as follows:
  • $p(\beta) \propto f_{N_{k+1}}(\beta \,|\, \mu_\beta, \Sigma_\beta)\, I_{(0,1)}(\lambda_R^{max}) \propto e^{-\frac{1}{2}(\beta - \mu_\beta)' \Sigma_\beta^{-1} (\beta - \mu_\beta)}\, I_{(0,1)}(\lambda_R^{max})$, where $\lambda_R^{max}$ denotes the maximum modulus of the eigenvalues $\lambda_R$ of the companion matrix, related to the AR form (if it exists) with characteristic polynomial $1 - \sum_{j=1}^{k} \beta_j B^j$;
  • $p(\gamma) = f_N(\gamma \,|\, \mu_\gamma, \sigma_\gamma^2) \propto e^{-\frac{1}{2\sigma_\gamma^2}(\gamma - \mu_\gamma)^2}$;
  • $p(\phi) \propto f_N(\phi \,|\, \mu_\phi, \sigma_\phi^2)\, I_{(-1,1)}(\phi) \propto e^{-\frac{1}{2\sigma_\phi^2}(\phi - \mu_\phi)^2}\, I_{(-1,1)}(\phi)$;
  • $p(\sigma^2) = f_{IG}(\sigma^2 \,|\, \alpha_\sigma, \beta_\sigma) = \frac{\beta_\sigma^{\alpha_\sigma}}{\Gamma(\alpha_\sigma)} \left(\frac{1}{\sigma^2}\right)^{\alpha_\sigma + 1} e^{-\beta_\sigma \frac{1}{\sigma^2}}$;
  • $p(\nu) = f_G(\nu \,|\, \alpha_\nu, \beta_\nu) = \frac{\beta_\nu^{\alpha_\nu}}{\Gamma(\alpha_\nu)}\, \nu^{\alpha_\nu - 1} e^{-\beta_\nu \nu}$;
  • $p(\rho) = f_G(\rho \,|\, \alpha_\rho, \beta_\rho) = \frac{\beta_\rho^{\alpha_\rho}}{\Gamma(\alpha_\rho)}\, \rho^{\alpha_\rho - 1} e^{-\beta_\rho \rho}$;
  • $p(c) \propto f_{INK}(c \,|\, \alpha_c, \beta_c)\, I_{(0,1)}(c) \propto \left(\frac{1}{c^2}\right)^{\alpha_c + \frac{1}{2}} e^{-\beta_c \frac{1}{c^2}}\, I_{(0,1)}(c)$;
  • $p(d) \propto f_{INK}(d \,|\, \alpha_d, \beta_d)\, I_{(1,+\infty)}(d) \propto \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{1}{2}} e^{-\beta_d \frac{1}{d^2}}\, I_{(1,+\infty)}(d)$;
  • $p(p) = f_{Beta}(p \,|\, \alpha_p, \beta_p) = \frac{\Gamma(\alpha_p + \beta_p)}{\Gamma(\alpha_p)\Gamma(\beta_p)}\, p^{\alpha_p - 1} (1-p)^{\beta_p - 1}\, I_{(0,1)}(p)$.

3.2. MCMC Method for the Bayesian LLFT-SV Model

The posterior density function, $p(\theta, \omega, h \,|\, y, y^{(0)})$, where $\theta$ comprises all the model's parameters, is proportional to (25) and is thus high-dimensional and non-standard. To make inference about the parameters and latent variables, relevant numerical methods are needed. In this paper, we resort to a common Markov chain Monte Carlo (MCMC) method, namely the Gibbs algorithm, consisting of sequential sampling from the full conditional posteriors derived from (25), which we present below.

3.2.1. The Full Conditional Posterior Distributions of Parameters

The conditional posterior densities of the LLFT-SV model’s parameters are the following (see (25)):
  • $p(\beta \,|\, y, h, \omega, \gamma, \phi, \sigma^2, \nu, \rho, p, c, d, y^{(0)}) \propto f_{N_{k+1}}(\beta \,|\, \tilde\mu_\beta, \tilde\Sigma_\beta)\, I_{(0,1)}(\lambda_R^{max})$, where $\tilde\mu_\beta = \tilde\Sigma_\beta (\Sigma_\beta^{-1}\mu_\beta + X'\Sigma_y^{-1} y)$, $\tilde\Sigma_\beta = (X'\Sigma_y^{-1}X + \Sigma_\beta^{-1})^{-1}$, $\Sigma_y = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_T^2)$, and $\sigma_t^2 = h_t \{\min\{\max\{\omega_t, c\}, d\}\}^2$, for $t = 1, 2, \ldots, T$;
  • $p(\gamma \,|\, y, h, \omega, \beta, \phi, \sigma^2, \nu, \rho, p, c, d, y^{(0)}) = f_N(\gamma \,|\, \tilde\mu_\gamma, \tilde\sigma_\gamma^2)$, where $\tilde\sigma_\gamma^2 = \left[\frac{1}{\sigma_\gamma^2} + \frac{T(\phi - 1)^2}{\sigma^2}\right]^{-1}$ and $\tilde\mu_\gamma = \left[\frac{\mu_\gamma}{\sigma_\gamma^2} + \frac{1-\phi}{\sigma^2}\sum_{t=1}^{T}(\ln h_t - \phi \ln h_{t-1})\right] \tilde\sigma_\gamma^2$;
  • $p(\phi \,|\, y, h, \omega, \beta, \gamma, \sigma^2, \nu, \rho, p, c, d, y^{(0)}) \propto f_N(\phi \,|\, \tilde\mu_\phi, \tilde\sigma_\phi^2)\, I_{(-1,1)}(\phi)$, where $\tilde\mu_\phi = \left[\frac{\mu_\phi}{\sigma_\phi^2} + \frac{1}{\sigma^2}\sum_{t=1}^{T}(\ln h_t - \gamma)(\ln h_{t-1} - \gamma)\right] \tilde\sigma_\phi^2$ and $\tilde\sigma_\phi^2 = \left[\frac{1}{\sigma_\phi^2} + \frac{\sum_{t=1}^{T}(\ln h_{t-1} - \gamma)^2}{\sigma^2}\right]^{-1}$;
  • $p(\sigma^2 \,|\, y, h, \omega, \beta, \gamma, \phi, \nu, \rho, p, c, d, y^{(0)}) = f_{IG}(\sigma^2 \,|\, \tilde\alpha_\sigma, \tilde\beta_\sigma)$, where $\tilde\alpha_\sigma = \alpha_\sigma + \frac{T}{2}$ and $\tilde\beta_\sigma = \beta_\sigma + \frac{1}{2}\sum_{t=1}^{T}\left[\ln h_t - \gamma - \phi(\ln h_{t-1} - \gamma)\right]^2$;
  • $p(\rho \,|\, y, h, \omega, \beta, \gamma, \phi, \sigma^2, \nu, p, c, d, y^{(0)}) = f_G(\rho \,|\, \tilde\alpha_\rho, \tilde\beta_\rho)$, where $\tilde\alpha_\rho = \alpha_\rho + \sum_{t=1}^{T} I_{(0,1)}(\omega_t)$ and $\tilde\beta_\rho = \beta_\rho - \sum_{t=1}^{T} \ln\!\left[\omega_t I_{(0,1)}(\omega_t) + I_{(1,+\infty)}(\omega_t)\right]$;
  • $p(\nu \,|\, y, h, \omega, \beta, \gamma, \phi, \sigma^2, \rho, p, c, d, y^{(0)}) = f_G(\nu \,|\, \tilde\alpha_\nu, \tilde\beta_\nu)$, where $\tilde\alpha_\nu = \alpha_\nu + \sum_{t=1}^{T} I_{(1,+\infty)}(\omega_t)$ and $\tilde\beta_\nu = \beta_\nu + \sum_{t=1}^{T} \ln\!\left[\omega_t I_{(1,+\infty)}(\omega_t) + I_{(0,1)}(\omega_t)\right]$;
  • $p(p \,|\, y, h, \omega, \beta, \gamma, \phi, \sigma^2, \nu, \rho, c, d, y^{(0)}) = f_{Beta}(p \,|\, \tilde\alpha_p, \tilde\beta_p)$, where $\tilde\alpha_p = \alpha_p + \sum_{t=1}^{T} I_{(0,1)}(\omega_t)$ and $\tilde\beta_p = \beta_p + \sum_{t=1}^{T} I_{(1,+\infty)}(\omega_t)$.
Since the above densities of the full conditional distributions have known closed forms, we can directly simulate from these distributions using standard pseudo-random numbers generators.
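As an illustration of one of these standard Gibbs steps, the update of $\sigma^2$ can be sketched as follows (Python rather than the paper's GAUSS code; the numeric inputs are purely illustrative, and the inverse-gamma draw is obtained by inverting a gamma variate):

```python
import random

def draw_sigma2(log_h, gamma, phi, alpha_sig, beta_sig, rng=random):
    # full conditional: IG(alpha_sig + T/2, beta_sig + SSR/2), where SSR sums the
    # squared AR(1) residuals of ln h_t, using the initial condition ln h_0 = 0
    prev = 0.0                                    # ln h_0 = 0
    ssr = 0.0
    for lh in log_h:
        ssr += (lh - gamma - phi * (prev - gamma)) ** 2
        prev = lh
    shape = alpha_sig + len(log_h) / 2.0
    rate = beta_sig + 0.5 * ssr
    return rate / rng.gammavariate(shape, 1.0)    # IG(shape, rate) draw

random.seed(2)
sig2 = draw_sigma2([0.1, -0.2, 0.05, 0.3], gamma=0.0, phi=0.95,
                   alpha_sig=2.5, beta_sig=0.16)
```

The remaining standard conditionals (for $\gamma$, $\phi$, $\rho$, $\nu$ and $p$) follow the same pattern of updating the prior hyperparameters with the corresponding sufficient statistics.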
As regards the parameters c and d, their full conditional posteriors are far more complicated than the remaining ones.
  • Conditional distribution of $c$. If $\sum_{t=1}^{T} I_{(0,1)}(\omega_t) = 0$, then the conditional distribution is straightforward, since $p(c \,|\, y, h, \omega, \beta, \gamma, \phi, \sigma^2, \nu, \rho, p, d, y^{(0)}) \propto p(c)$. If $\sum_{t=1}^{T} I_{(0,1)}(\omega_t) = n > 0$, then the full conditional posterior density of $c$ is as follows:
    $$p(c \,|\, y, h, \omega, \beta, \gamma, \phi, \sigma^2, \nu, \rho, p, d, y^{(0)}) \propto \prod_{t=1}^{T} \frac{1}{\min\{\max\{\omega_t, c\}, d\}}\; e^{-\frac{1}{2}\sum_{t=1}^{T} \frac{(y_t - x_t'\beta)^2}{h_t \{\min\{\max\{\omega_t, c\}, d\}\}^2}} \left(\frac{1}{c^2}\right)^{\alpha_c + \frac{1}{2}} e^{-\beta_c \frac{1}{c^2}}\, I_{(0,1)}(c),$$
    which can be expressed as a mixture of $n + 1$ different distributions:
    $$p(c \,|\, \cdot) \propto \sum_{k=1}^{n} \underbrace{\prod_{t=1}^{n} \frac{1}{\min\{\max\{\tilde\omega_t, c\}, d\}}\; e^{-\frac{1}{2}\sum_{t=1}^{n} \frac{\tilde a_{y,t}}{\{\min\{\max\{\tilde\omega_t, c\}, d\}\}^2}} \left(\frac{1}{c^2}\right)^{\alpha_c + \frac{1}{2}} e^{-\beta_c \frac{1}{c^2}}\, I_{(\tilde\omega_{k-1}, \tilde\omega_k)}(c)}_{\propto\; g_k^c(c)} + \underbrace{\frac{1}{c^n}\; e^{-\frac{1}{2}\sum_{t=1}^{n} \frac{\tilde a_{y,t}}{c^2}} \left(\frac{1}{c^2}\right)^{\alpha_c + \frac{1}{2}} e^{-\beta_c \frac{1}{c^2}}\, I_{(\tilde\omega_n, 1)}(c)}_{\propto\; g_{n+1}^c(c)},$$
    where $\tilde\omega_1, \tilde\omega_2, \ldots, \tilde\omega_T$ are the order statistics obtained by sorting the values $\omega_1, \omega_2, \ldots, \omega_T$ in ascending order, i.e.,
    $$0 = \tilde\omega_0 < \tilde\omega_1 < \ldots < \tilde\omega_n < 1 < \tilde\omega_{n+1} < \ldots < \tilde\omega_T,$$
    with correspondingly reordered $\tilde a_{y,t}$ for $t = 1, \ldots, T$, where the $a_{y,t} = \frac{(y_t - x_t'\beta)^2}{h_t}$ have previously been assigned to the unordered $\omega_t$s.
    To simulate from this mixture, the mixing weights and the probability density functions with their normalizing constants are needed. For $k = 1$, we have a truncated inverse Nakagami distribution for $c$ with parameters $\alpha_c$ and $\beta_c$:
    $$g_1^c(c) = \frac{1}{w_{c,1}} \left(\frac{1}{c^2}\right)^{\alpha_c + \frac{1}{2}} e^{-\beta_c \frac{1}{c^2}}\, I_{(0, \tilde\omega_1)}(c),$$
    where $w_{c,1} = \frac{1}{2}\, \beta_c^{-\alpha_c}\, \Gamma\!\left(\alpha_c, \frac{\beta_c}{\tilde\omega_1^2}\right)$ is the normalizing constant of the truncated inverse Nakagami distribution.
    For $k = 2, \ldots, n$, we also have a truncated inverse Nakagami distribution for $c$, with parameters $\alpha_c + \frac{k-1}{2}$ and $\beta_c + \frac{1}{2}\sum_{t=1}^{k-1} \tilde a_{y,t}$:
    $$g_k^c(c) = \frac{1}{w_{c,k}} \left(\frac{1}{c^2}\right)^{\alpha_c + \frac{k}{2}} e^{-\left(\beta_c + \frac{1}{2}\sum_{t=1}^{k-1} \tilde a_{y,t}\right)\frac{1}{c^2}}\, I_{(\tilde\omega_{k-1}, \tilde\omega_k)}(c),$$
    where
    $$w_{c,k} = \frac{1}{2} \left(\beta_c + \frac{1}{2}\sum_{t=1}^{k-1} \tilde a_{y,t}\right)^{-\left(\alpha_c + \frac{k-1}{2}\right)} \left[\Gamma\!\left(\alpha_c + \frac{k-1}{2},\; \frac{\beta_c + \frac{1}{2}\sum_{t=1}^{k-1} \tilde a_{y,t}}{\tilde\omega_k^2}\right) - \Gamma\!\left(\alpha_c + \frac{k-1}{2},\; \frac{\beta_c + \frac{1}{2}\sum_{t=1}^{k-1} \tilde a_{y,t}}{\tilde\omega_{k-1}^2}\right)\right].$$
    Finally, for $k = n + 1$:
    $$g_{n+1}^c(c) = \frac{1}{w_{c,n+1}} \left(\frac{1}{c^2}\right)^{\alpha_c + \frac{n+1}{2}} e^{-\left(\beta_c + \frac{1}{2}\sum_{t=1}^{n} \tilde a_{y,t}\right)\frac{1}{c^2}}\, I_{(\tilde\omega_n, 1)}(c),$$
    where
    $$w_{c,n+1} = \frac{1}{2} \left(\beta_c + \frac{1}{2}\sum_{t=1}^{n} \tilde a_{y,t}\right)^{-\left(\alpha_c + \frac{n}{2}\right)} \left[\Gamma\!\left(\alpha_c + \frac{n}{2},\; \beta_c + \frac{1}{2}\sum_{t=1}^{n} \tilde a_{y,t}\right) - \Gamma\!\left(\alpha_c + \frac{n}{2},\; \frac{\beta_c + \frac{1}{2}\sum_{t=1}^{n} \tilde a_{y,t}}{\tilde\omega_n^2}\right)\right].$$
    The mixing weights are proportional to: $w_{c,1}^0 = w_{c,1} \prod_{t=1}^{n} \frac{1}{\tilde\omega_t}\, e^{-\frac{1}{2}\sum_{t=1}^{n} \frac{\tilde a_{y,t}}{\tilde\omega_t^2}}$, $w_{c,k}^0 = w_{c,k} \prod_{t=k}^{n} \frac{1}{\tilde\omega_t}\, e^{-\frac{1}{2}\sum_{t=k}^{n} \frac{\tilde a_{y,t}}{\tilde\omega_t^2}}$ for $k = 2, \ldots, n$, and $w_{c,n+1}^0 = w_{c,n+1}$.
    To simulate from the mixture components, we apply the acceptance-rejection method by using the uniform distribution on the interval ω ˜ k 1 , ω ˜ k for k = 1 , ... , n , and on the interval ω ˜ n , 1 for k = n + 1 .
  • Conditional distribution of $d$. If $\sum_{t=1}^{T} I_{(1,+\infty)}(\omega_t) = 0$, then the conditional distribution is straightforward, since $p(d \,|\, y, h, \omega, \beta, \gamma, \phi, \sigma^2, \nu, \rho, p, c, y^{(0)}) \propto p(d)$. If $\sum_{t=1}^{T} I_{(1,+\infty)}(\omega_t) = T - n > 1$, then the full conditional posterior distribution of $d$ can be expressed in a very similar manner:
    $$p(d \,|\, y, h, \omega, \beta, \gamma, \phi, \sigma^2, \nu, \rho, p, c, y^{(0)}) \propto \prod_{t=1}^{T} \frac{1}{\min\{\max\{\omega_t, c\}, d\}}\; e^{-\frac{1}{2}\sum_{t=1}^{T} \frac{(y_t - x_t'\beta)^2}{h_t \{\min\{\max\{\omega_t, c\}, d\}\}^2}} \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{1}{2}} e^{-\beta_d \frac{1}{d^2}}\, I_{(1,+\infty)}(d),$$
    which can be expressed as a mixture of $T - n + 1$ different distributions:
    $$p(d \,|\, \cdot) \propto \frac{1}{d^{T-n}}\; e^{-\frac{1}{2}\sum_{t=n+1}^{T} \frac{\tilde a_{y,t}}{d^2}} \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{1}{2}} e^{-\beta_d \frac{1}{d^2}}\, I_{(1, \tilde\omega_{n+1})}(d) + \sum_{k=2}^{T-n} \prod_{t=n+1}^{T} \frac{1}{\min\{\max\{\tilde\omega_t, c\}, d\}}\; e^{-\frac{1}{2}\sum_{t=n+1}^{T} \frac{\tilde a_{y,t}}{\{\min\{\max\{\tilde\omega_t, c\}, d\}\}^2}} \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{1}{2}} e^{-\beta_d \frac{1}{d^2}}\, I_{(\tilde\omega_{n+k-1}, \tilde\omega_{n+k})}(d) + \prod_{t=n+1}^{T} \frac{1}{\tilde\omega_t}\; e^{-\frac{1}{2}\sum_{t=n+1}^{T} \frac{\tilde a_{y,t}}{\tilde\omega_t^2}} \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{1}{2}} e^{-\beta_d \frac{1}{d^2}}\, I_{(\tilde\omega_T, +\infty)}(d).$$
    In this case, the first component is proportional to the truncated inverse Nakagami distribution for $d$ with parameters $\alpha_d + \frac{T-n}{2}$ and $\beta_d + \frac{1}{2}\sum_{t=n+1}^{T} \tilde a_{y,t}$:
    $$g_1^d(d) = \frac{1}{w_{d,1}} \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{T-n+1}{2}} e^{-\left(\beta_d + \frac{1}{2}\sum_{t=n+1}^{T} \tilde a_{y,t}\right)\frac{1}{d^2}}\, I_{(1, \tilde\omega_{n+1})}(d),$$
    where
    $$w_{d,1} = \frac{1}{2}\left(\beta_d + \frac{1}{2}\sum_{t=n+1}^{T} \tilde a_{y,t}\right)^{-\left(\alpha_d + \frac{T-n}{2}\right)} \left[\Gamma\!\left(\alpha_d + \frac{T-n}{2},\; \frac{\beta_d + \frac{1}{2}\sum_{t=n+1}^{T} \tilde a_{y,t}}{\tilde\omega_{n+1}^2}\right) - \Gamma\!\left(\alpha_d + \frac{T-n}{2},\; \beta_d + \frac{1}{2}\sum_{t=n+1}^{T} \tilde a_{y,t}\right)\right]$$
    is the normalizing constant of the truncated inverse Nakagami distribution.
    For $k = 2, \ldots, T-n$, we also have a truncated inverse Nakagami distribution for $d$, with parameters $\alpha_d + \frac{T-n-k+1}{2}$ and $\beta_d + \frac{1}{2}\sum_{t=n+k}^{T} \tilde a_{y,t}$:
    $$g_k^d(d) = \frac{1}{w_{d,k}} \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{T-n-k+2}{2}} e^{-\left(\beta_d + \frac{1}{2}\sum_{t=n+k}^{T} \tilde a_{y,t}\right)\frac{1}{d^2}}\, I_{(\tilde\omega_{n+k-1}, \tilde\omega_{n+k})}(d),$$
    where
    $$w_{d,k} = \frac{1}{2}\left(\beta_d + \frac{1}{2}\sum_{t=n+k}^{T} \tilde a_{y,t}\right)^{-\left(\alpha_d + \frac{T-n-k+1}{2}\right)} \left[\Gamma\!\left(\alpha_d + \frac{T-n-k+1}{2},\; \frac{\beta_d + \frac{1}{2}\sum_{t=n+k}^{T} \tilde a_{y,t}}{\tilde\omega_{n+k}^2}\right) - \Gamma\!\left(\alpha_d + \frac{T-n-k+1}{2},\; \frac{\beta_d + \frac{1}{2}\sum_{t=n+k}^{T} \tilde a_{y,t}}{\tilde\omega_{n+k-1}^2}\right)\right].$$
    The last component is:
    $$g_{T-n+1}^d(d) = \frac{1}{w_{d,T-n+1}} \left(\frac{1}{d^2}\right)^{\alpha_d + \frac{1}{2}} e^{-\beta_d \frac{1}{d^2}}\, I_{(\tilde\omega_T, +\infty)}(d),$$
    where $w_{d,T-n+1} = \frac{1}{2}\, \beta_d^{-\alpha_d} \left[\Gamma(\alpha_d) - \Gamma\!\left(\alpha_d, \frac{\beta_d}{\tilde\omega_T^2}\right)\right]$. In this case, the mixing weights are proportional to: $w_{d,1}^0 = w_{d,1}$, $w_{d,k}^0 = w_{d,k} \prod_{t=n+1}^{n+k-1} \frac{1}{\tilde\omega_t}\, e^{-\frac{1}{2}\sum_{t=n+1}^{n+k-1} \frac{\tilde a_{y,t}}{\tilde\omega_t^2}}$ for $k = 2, \ldots, T-n$, and $w_{d,T-n+1}^0 = w_{d,T-n+1} \prod_{t=n+1}^{T} \frac{1}{\tilde\omega_t}\, e^{-\frac{1}{2}\sum_{t=n+1}^{T} \frac{\tilde a_{y,t}}{\tilde\omega_t^2}}$.
    Similarly to the algorithm for the parameter $c$, to simulate from the mixture components here, the acceptance–rejection method is applied by using the uniform distribution on the interval $(\tilde\omega_{n+k-1}, \tilde\omega_{n+k})$ for $k = 2, \ldots, T-n$. In turn, for $k = T-n+1$, we simulate from a truncated gamma distribution on the interval $(\tilde\omega_T, +\infty)$ using the algorithm proposed by [33].
    If $\sum_{t=1}^{T} I_{(1,+\infty)}(\omega_t) = T - n = 1$, then we have only two components in the mixture (32): the first and the last one.
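The component-selection and acceptance–rejection steps used above for $c$ and $d$ can be sketched as follows (Python; `draw_trunc_ink` handles one interior mixture component with a uniform proposal, exploiting the unimodality of the inverse Nakagami kernel; all numeric inputs are illustrative):

```python
import math
import random

def ink_kernel(x, a, b):
    # unnormalized inverse Nakagami kernel: (1/x^2)^(a + 1/2) * exp(-b / x^2)
    return x ** (-(2.0 * a + 1.0)) * math.exp(-b / (x * x))

def draw_trunc_ink(a, b, lo, hi, rng=random):
    # acceptance-rejection with a Uniform(lo, hi) proposal; the kernel is unimodal
    # with mode sqrt(2b / (2a + 1)), so its supremum over (lo, hi) is attained at
    # the mode clamped to the interval
    mode = math.sqrt(2.0 * b / (2.0 * a + 1.0))
    bound = ink_kernel(min(max(mode, lo), hi), a, b)
    while True:
        x = rng.uniform(lo, hi)
        if rng.random() * bound <= ink_kernel(x, a, b):
            return x

def draw_mixture(weights, samplers, rng=random):
    # pick a component proportionally to the unnormalized mixing weights, then sample it
    u = rng.random() * sum(weights)
    acc = 0.0
    for w, s in zip(weights, samplers):
        acc += w
        if u <= acc:
            return s()
    return samplers[-1]()

random.seed(3)
x = draw_trunc_ink(2.5, 0.3, 0.2, 0.5)   # one interior component (illustrative numbers)
```

In the actual sampler the unnormalized weights $w^0_{c,k}$ (or $w^0_{d,k}$) would be computed with the incomplete-gamma expressions given above and passed to `draw_mixture`, with `draw_trunc_ink` serving as the per-component sampler.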

3.2.2. The Conditional Posterior of h t s

We sample each element of the vector $h$ using the independence Metropolis–Hastings algorithm (similarly to [24]) within a separate Gibbs step, initializing at $h_t = 0.1 y_t^2 + 1$ for $t = 1, \ldots, T$. The conditional posterior density of $h_t$, $t \in \{1, \ldots, T\}$, is given by
$$p(h_t \,|\, y, h_1, \ldots, h_{t-1}, h_{t+1}, \ldots, h_T, \omega, \beta, \gamma, \phi, \sigma^2, \nu, \rho, p, c, d, y^{(0)}) \propto h_t^{-1}\, e^{-\frac{1}{2\tilde\sigma_h^2}(\ln h_t - \tilde s_h)^2}\; h_t^{-\frac{1}{2}}\, e^{-\frac{1}{2}\frac{(y_t - x_t'\beta)^2}{h_t \{\min\{\max\{\omega_t, c\}, d\}\}^2}},$$
where $\tilde\sigma_h^2 = \frac{\sigma^2}{1 + \phi^2}$ and $\tilde s_h = \gamma + \frac{\phi(\ln h_{t-1} - \gamma) + \phi(\ln h_{t+1} - \gamma)}{1 + \phi^2}$ for $t = 1, \ldots, T-1$, while $\tilde\sigma_h^2 = \sigma^2$ and $\tilde s_h = \gamma + \phi(\ln h_{t-1} - \gamma)$ for $t = T$.
The proposal distribution used to draw $h_t$ is the inverted gamma:
$$IG\left(\varphi + \frac{1}{2},\; (\varphi - 1)\, e^{\tilde s_h + \frac{1}{2}\tilde\sigma_h^2} + \frac{1}{2}\frac{(y_t - x_t'\beta)^2}{\{\min\{\max\{\omega_t, c\}, d\}\}^2}\right),$$
where $\varphi = \frac{2 e^{\tilde\sigma_h^2} - 1}{e^{\tilde\sigma_h^2} - 1}$. As regards the initial condition for $\ln h_t$, it is assumed that $\ln h_0 = 0$.
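A single independence Metropolis–Hastings update for one $h_t$ under the inverse-gamma proposal above can be sketched as follows (Python; the inputs `s_h`, `v_h` and `a` are assumed to be precomputed from the current values of the conditioning quantities, and the numeric values below are illustrative):

```python
import math
import random

def log_target_h(h, s_h, v_h, a):
    # log kernel of p(h_t | ...): h^(-3/2) exp(-(ln h - s_h)^2 / (2 v_h)) exp(-a / (2h)),
    # with a = (y_t - x_t'beta)^2 / {min{max{omega_t, c}, d}}^2
    return -1.5 * math.log(h) - (math.log(h) - s_h) ** 2 / (2.0 * v_h) - a / (2.0 * h)

def log_ig(h, shape, rate):
    # log inverse-gamma kernel
    return -(shape + 1.0) * math.log(h) - rate / h

def mh_step_h(h_cur, s_h, v_h, a, rng=random):
    # one independence M-H step with the moment-matched inverse-gamma proposal
    phi = (2.0 * math.exp(v_h) - 1.0) / (math.exp(v_h) - 1.0)
    shape = phi + 0.5
    rate = (phi - 1.0) * math.exp(s_h + v_h / 2.0) + a / 2.0
    h_prop = rate / rng.gammavariate(shape, 1.0)          # IG(shape, rate) draw
    log_acc = (log_target_h(h_prop, s_h, v_h, a) - log_target_h(h_cur, s_h, v_h, a)
               + log_ig(h_cur, shape, rate) - log_ig(h_prop, shape, rate))
    return h_prop if math.log(rng.random()) < log_acc else h_cur

random.seed(4)
h = 1.0
for _ in range(200):                       # illustrative inputs s_h, v_h, a
    h = mh_step_h(h, s_h=0.0, v_h=0.02, a=1.3)
```

The proposal matches the first two moments of the log-normal piece of the conditional, so acceptance rates stay high and the chain mixes well in this step.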

3.2.3. The Conditional Posterior of ω t s

The conditional posterior density of $\omega_t$, $t \in \{1, \ldots, T\}$, is given by:
$$p(\omega_t \,|\, y, h, \omega_1, \ldots, \omega_{t-1}, \omega_{t+1}, \ldots, \omega_T, \beta, \gamma, \phi, \sigma^2, \nu, \rho, p, c, d, y^{(0)}) \propto \frac{1}{\min\{\max\{\omega_t, c\}, d\}}\; e^{-\frac{1}{2}\frac{(y_t - x_t'\beta)^2}{h_t \{\min\{\max\{\omega_t, c\}, d\}\}^2}} \times \left[\, p\, \rho\, \omega_t^{\rho - 1}\, I_{(0,1)}(\omega_t) + (1-p)\, \nu\, \omega_t^{-(\nu+1)}\, I_{(1,+\infty)}(\omega_t) \,\right],$$
which can be written as a mixture of four different distributions:
$$p(\omega_t \,|\, \cdot) \propto w_{\omega,1} \underbrace{\frac{\rho}{c^\rho}\, \omega_t^{\rho - 1}\, I_{(0,c)}(\omega_t)}_{f_1(\omega_t)} + p\,\rho\, w_{\omega,2} \underbrace{\frac{1}{w_{\omega,2}}\, \omega_t^{\rho - 2}\, e^{-\frac{a_{y,t}}{2\omega_t^2}}\, I_{(c,1)}(\omega_t)}_{f_2(\omega_t)} + (1-p)\,\nu\, w_{\omega,3} \underbrace{\frac{1}{w_{\omega,3}} \left(\frac{1}{\omega_t^2}\right)^{\frac{\nu}{2}+1} e^{-\frac{a_{y,t}}{2\omega_t^2}}\, I_{(1,d)}(\omega_t)}_{f_3(\omega_t)} + w_{\omega,4} \underbrace{\nu\, d^\nu\, \frac{1}{\omega_t^{\nu+1}}\, I_{(d,+\infty)}(\omega_t)}_{f_4(\omega_t)} \propto \sum_{i=1}^{4} \tilde w_{\omega,i}\, f_i(\omega_t),$$
where $a_{y,t} = \frac{(y_t - x_t'\beta)^2}{h_t}$, $w_{\omega,1} = c^{\rho - 1}\, p\, e^{-\frac{a_{y,t}}{2c^2}}$,
$$w_{\omega,2} = \frac{1}{2} \left(\frac{a_{y,t}}{2}\right)^{\frac{\rho - 1}{2}} \left[\Gamma\!\left(\frac{1-\rho}{2}, \frac{a_{y,t}}{2}\right) - \Gamma\!\left(\frac{1-\rho}{2}, \frac{a_{y,t}}{2c^2}\right)\right],$$
$$w_{\omega,3} = \frac{1}{2} \left(\frac{a_{y,t}}{2}\right)^{-\frac{\nu + 1}{2}} \left[\Gamma\!\left(\frac{\nu+1}{2}, \frac{a_{y,t}}{2d^2}\right) - \Gamma\!\left(\frac{\nu+1}{2}, \frac{a_{y,t}}{2}\right)\right],$$
$$w_{\omega,4} = (1-p)\, d^{-\nu - 1}\, e^{-\frac{a_{y,t}}{2d^2}}.$$
The weights $\tilde w_{\omega,i}$, $i = 1, 2, 3, 4$, of the mixture are as follows:
$$\tilde w_{\omega,i} = \frac{w_{\omega,i}}{w_{\omega,1} + p\rho\, w_{\omega,2} + (1-p)\nu\, w_{\omega,3} + w_{\omega,4}} \quad \text{for } i = 1, 4,$$
$$\tilde w_{\omega,2} = \frac{p\rho\, w_{\omega,2}}{w_{\omega,1} + p\rho\, w_{\omega,2} + (1-p)\nu\, w_{\omega,3} + w_{\omega,4}},$$
and $\tilde w_{\omega,3} = \frac{(1-p)\nu\, w_{\omega,3}}{w_{\omega,1} + p\rho\, w_{\omega,2} + (1-p)\nu\, w_{\omega,3} + w_{\omega,4}}$.
It is easy to build an algorithm generating from this mixture. First, we sample from the uniform distribution to randomly select the component distribution, and then we generate a pseudo-random value from it. The first term of the mixture is given by the density:
$$f_1(\omega_t) = \frac{\rho}{c^\rho}\, \omega_t^{\rho - 1}\, I_{(0,c)}(\omega_t),$$
which can be sampled using the inverse cdf technique. The second density is:
$$f_2(\omega_t) = \frac{1}{w_{\omega,2}}\, \omega_t^{\rho - 2}\, e^{-\frac{a_{y,t}}{2\omega_t^2}}\, I_{(c,1)}(\omega_t),$$
which for $\rho > 1$ is the truncated beta density with parameters $\rho - 1$ and 1. However, for $0 < \rho < 1$, we obtain a non-standard distribution, but an independence Metropolis–Hastings algorithm can be applied with the proposal distribution being the beta distribution with parameters $\rho + 1$ and 1. The third term of the mixture is:
$$f_3(\omega_t) = \frac{1}{w_{\omega,3}} \left(\frac{1}{\omega_t^2}\right)^{\frac{\nu}{2}+1} e^{-\frac{a_{y,t}}{2\omega_t^2}}\, I_{(1,d)}(\omega_t),$$
which is the density of $\omega_t$ such that $\omega_t^2$ follows an inverse gamma distribution truncated to the interval $(1, d^2)$. Thus, we can sample $\omega_t^2$ from the truncated inverted gamma distribution with shape parameter $\frac{\nu + 1}{2}$ and scale parameter $\frac{1}{2} a_{y,t}$, and then take the square root. We simulate from the truncated gamma distribution using the algorithm proposed by [33]. The last term of the mixture is given by the following probability density function:
$$f_4(\omega_t) = \nu\, d^\nu\, \frac{1}{\omega_t^{\nu + 1}}\, I_{(d,+\infty)}(\omega_t),$$
which can be easily sampled from via the inverse cdf technique. We initialize the sampler at the starting values $\omega_t = 1.1$ for $t = 1, \ldots, T$.
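The two inverse-cdf steps (for $f_1$ and $f_4$) are elementary, since both cdfs invert in closed form; a sketch (Python, with purely illustrative parameter values):

```python
import random

def draw_f1(rho, c, rng=random):
    # f1(w) = (rho / c^rho) w^(rho - 1) on (0, c): F(w) = (w / c)^rho, so w = c * u^(1/rho)
    return c * rng.random() ** (1.0 / rho)

def draw_f4(nu, d, rng=random):
    # f4(w) = nu d^nu w^-(nu + 1) on (d, inf) is a Pareto tail:
    # F(w) = 1 - (d / w)^nu, so w = d * (1 - u)^(-1/nu)
    return d * (1.0 - rng.random()) ** (-1.0 / nu)

random.seed(5)
w1 = [draw_f1(rho=1.4, c=0.12) for _ in range(1000)]
w4 = [draw_f4(nu=6.0, d=2.8) for _ in range(1000)]
```

Only $f_2$ and $f_3$ require the more involved Metropolis–Hastings and truncated-gamma machinery described above.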

4. Empirical Illustrations

In this section, we apply the LLFT-SV model to describe the volatility of three time series of daily stock market prices. We begin with the analysis of a quite particular asset, namely MS Industries AG (MSAG.DE), a Germany-based industrial technology company. The MSAG.DE data set is specific in the sense of featuring many (repeating) zero returns, which is typical for many individual companies' stocks. Section 4.1 and Section 4.2 cover the results for the original and perturbed price series, respectively, with a very detailed analysis providing in-depth insights into the LLFT-SV model's performance. Then, in Section 4.3, we shift our attention to two common stock market indices, the S&P 500 and DAX, to examine the LLFT-SV model's validity for more 'typical' financial data as well.
The MSAG.DE stock prices were downloaded from http://finance.yahoo.com (accessed on 31 March 2021) and cover the period from 28 December 2016 through 25 February 2021. The series is transformed into logarithmic rates of return, expressed in percentage points, forming a series of 1053 observations. The first two available observations are reserved for the initial conditions in the AR(2) structure underlying the mean equation (two lags are chosen in view of the Lindley-type test for the restriction that the autocorrelation parameters for higher lags are equal to zero). Therefore, $x_t = (1, y_{t-1}, y_{t-2})'$, and the final number of modelled observations is $T = 1051$.
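The return transformation itself can be sketched as follows (Python; the prices are illustrative), with a repeated price yielding an exact zero return:

```python
import math

def log_returns_pct(prices):
    # percentage logarithmic rates of return: y_t = 100 * ln(P_t / P_{t-1})
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

r = log_returns_pct([100.0, 100.0, 102.0])   # an unchanged price gives a zero return
```

This is exactly how repeated prices produce the repeated zero returns that motivate the LLFT specification.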
Basic descriptive characteristics of the modelled data are presented in Table 1, with the prices and returns plotted in Figure 8. It can be seen from the graphs that the return rates are centred around zero, featuring some outliers as well. The data distribution is highly non-normal, as confirmed by the kurtosis, which exceeds three by far. From Table 1, it can also be noted that the returns' range is fairly wide, with their minimum and maximum at $-22.399$ and $30.390$, respectively. These extreme observations occurred in March 2020 and can be explained by the turbulence in financial markets caused by the COVID-19 pandemic. The data are also characterized by the presence of zero returns, with zero being exactly the median (while the sample mean is close to zero). The relative frequency of zero returns is about 0.11. The exact positions of the zero returns are indicated in the figures by blue vertical lines. This relatively high concentration of zero returns manifests in the histogram through a prominent peak at the mode (see panel (c) in Figure 8). It is worth noting that the zero returns are not always related to zero volumes. In many cases, the volume is positive, but the price stays unchanged.
In view of the above, it could be argued that the data at hand, with some concentration at zero, should be modelled by means of mixed, continuous-discrete distributions. These, however, remain highly unpopular in practice, usually giving way to models based on families of continuous distributions. As pointed out by [27], Bayesian inference on the basis of a sample containing repeated observations (not necessarily zero-valued) is not always possible with the use of a continuous sampling distribution, even under a proper prior distribution. Admittedly, the LLFT-SV model introduced in this paper, being based on a mixture of continuous probability distributions, fails to explicitly model the incidence of zero returns. However, owing to the properties of the LLFT distribution proposed in this paper, the zero returns or any repeated observations do not pose such a problem in the LLFT-SV model, since (under the assumed prior structure) it can be proved that the marginal likelihood (i.e., the marginal data density value) is finite (see Theorem A1 in Appendix A). The empirical study is divided into two subsections. In the first one, we analyze the original series, featuring repeated zero returns. However, the validity of the LLFT-SV model can also be illustrated in the case where repeated observations do not occur per se, but the series still contains values that concentrate tightly around the median. To this end, we perturb the MSAG.DE prices and present the results for such a series in Section 4.2.
For the sake of comparison, we also consider the stochastic volatility model with the conditional Student's t-distribution (t-SV). The t-distribution's density function can be expressed as a scale mixture of normals. That is, the t-distributed (with $\nu$ degrees of freedom) random variable $\varepsilon_t$ in the t-SV model can be written as $\varepsilon_t = \lambda_t \sqrt{\omega_t}$, where $\lambda_t \stackrel{iid}{\sim} N(0, 1)$ and $\omega_t \stackrel{iid}{\sim} IG\left(\frac{\nu}{2}, \frac{\nu}{2}\right)$. The t-SV model is usually used to better explain the heavy tails of the empirical distributions of returns. However, as exemplified below, in contrast to LLFT-SV, the t-SV model may still not be flexible enough to simultaneously allow for fat tails and a very high peak close to the mean (or median) of the returns.
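The scale-mixture representation can be checked directly by simulation; the sketch below (Python, illustrative degrees of freedom) draws $\varepsilon_t = \lambda_t \sqrt{\omega_t}$ and compares the sample variance with the theoretical value $\nu/(\nu - 2)$ of the Student's t-distribution:

```python
import math
import random

def draw_t_scale_mixture(nu, rng=random):
    # eps = lambda * sqrt(omega), lambda ~ N(0, 1), omega ~ IG(nu/2, nu/2)
    omega = (nu / 2.0) / rng.gammavariate(nu / 2.0, 1.0)   # IG(nu/2, nu/2) draw
    return rng.gauss(0.0, 1.0) * math.sqrt(omega)

random.seed(7)
nu = 10.0
draws = [draw_t_scale_mixture(nu) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)     # should approach nu/(nu-2) = 1.25
```

Conditioning on $\omega_t$ is what makes the Gibbs treatment of the t-SV model analogous to that of LLFT-SV, only with a different mixing distribution for $\omega_t$.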
We set the following values of the hyperparameters of the model under consideration: μ β = 0 , Σ β = I , μ γ = 0 , σ γ = 10 , μ ϕ = 0 , σ ϕ = 10 , α σ = 2.5 , β σ = 0.16 (see [34]), α ρ = 10 , β ρ = 10 , α p = 1 , β p = 1 .
Such prior distributions reflect our rather little prior knowledge about the model’s parameters. In order to examine the sensitivity of the posterior distributions to the priors, we try different values of the remaining hyperparameters, which we present in subsections below.
To obtain a pseudo-random sample from the posterior distribution, we use the hybrid sampler presented in the previous section—the Metropolis–Hastings algorithm within the Gibbs procedure, generating 50,000 of burn-in and 200,000 posterior drawings. The algorithm has been implemented in the authors’ own computer code written in GAUSS. Computations were carried out on a personal computer with an Intel Core i7-9850H processor and 16 GB RAM and took about 5–6 h for a single model. A detailed examination of the MCMC convergence is provided in Appendix B.

4.1. Results for the Original MSAG.DE Data

In this subsection, we present the results obtained for the original MSAG.DE data set and assume a priori that values of $c$ very close to 0 are unlikely. To represent such a prior belief about $c$, it is assumed that $c \sim INK(2; 0.1)$, truncated to the interval $(0, 1)$.
Table 2 reports the posterior medians, 90% credible intervals, and interquartile ranges. The 90% credible intervals are calculated using the 5th and 95th percentiles of the posterior samples.
Figure 9, Figure 10, Figure 11, Figure 12, Figure 13 and Figure 14 depict the posterior densities of the model parameters obtained under several different prior distributions. Figure 9, Figure 10, Figure 11 and Figure 12 show univariate marginal posteriors and priors. Clear differences between the priors and posteriors indicate that the data contribute significant information about the parameters, changing our prior assumptions. The dependencies between ν and d, as well as between ρ and p, visible in Figure 13 and Figure 14, are striking. The plots reveal nonlinear relations and multimodality. Moreover, the posterior distributions of d and ν are very sensitive to the prior distributions (compare the first three columns by pairs from Table 2). The sensitivity of the posterior distribution of ν with respect to the prior can also be seen in the t-SV model (see the last two columns in Table 2). However, in both types of models considered here, the percentiles of ν indicate that the local slash and Student’s t-distributions are more appropriate than the normal distribution for the mean equation innovations. In turn, the posterior distribution of parameter p indicates that the locally leptokurtic and heavy-tailed mixture normal distribution is more appropriate than the local slash distribution. The LLFT distribution makes it possible to explain both the outlying returns and the returns concentrating close to their median. As regards the persistence in volatility ( ϕ ) and the variance of the volatility process ( σ 2 ), all models give similar, though not identical, results.
However visible, the posterior sensitivity of the latent processes' parameters does not transfer into a similar sensitivity of the posterior means of the processes themselves, especially when it comes to the conditional standard deviations of the returns, $\sigma_t$ (see Table 3). The average of the posterior means of $\sigma_t = \sqrt{h_t}\min\{\max\{\omega_t, c\}, d\}$ is equal to 2.119, with standard deviation 1.051. In the t-SV model, the average of the posterior means of $\sigma_t = \sqrt{h_t \omega_t}$ is equal to 2.173, with standard deviation 1.210. The series of the posterior means of $\sigma_t$ obtained in the two models are highly correlated, with the correlation coefficient at 0.964. In all Bayesian models considered in this subsection, the corresponding latent processes are highly correlated, indicating very similar dynamics. This can be seen in Figure 15, Figure 16, Figure 17, Figure 18, Figure 19, Figure 20 and Figure 21, where we graph the posterior means and conditional standard deviations of the $h_t$s (in both models the $h_t$s are directly comparable) and of $\omega_t$ in t-SV, which corresponds to $\min\{\max\{\omega_t, c\}, d\}$ in LLFT-SV. Notice that due to the non-negativity of these quantities, the two-standard-deviation bands presented in the figures are truncated to positive values.
Finally, we compare the predictive capabilities of the Bayesian LLFT-SV and t-SV models. For this purpose, the data set is split into two subsets: a training set and an ex post prediction evaluation set. As the training set, we take the first 800 observations, with the resulting posterior actually viewed as the prior distribution before each consecutive observation is included into the recursive (not rolling) estimation window. The forecast evaluation period encompasses the most recent 251 trading days. We perform one- to ten-step-ahead predictions over the period 2 March 2020 through 25 February 2021, which gives 242 predictive distributions for each of the ten forecast horizons under consideration, and thereby 2420 predictive distributions in total. The predictive distributions are calculated based on the whole data set available at time $800 + t$ for each $t = 1, \ldots, 242$ (recursive forecasting scheme). The models are re-estimated each time, i.e., upon arrival of each new observation. Each of the predictive densities is based upon 50,000 MCMC posterior draws, preceded by either 250,000 burn-in passes for $t = 1$ or 10,000 cycles for $t = 2, 3, \ldots, 242$, with the sampler each time initiated at the final draw of the previous run. The sequence of one-step-ahead predictive distributions covers the period from 2 March 2020 to 12 February 2021, while the sequence of ten-step-ahead predictive distributions covers the period 2 March 2020 through 25 February 2021.
As the basis of the predictive model comparison, we choose the sum of the so-called log predictive likelihoods, which for some horizon h in a model M can be written as:
$$SLP(h, M) = \sum_{t=\tilde T + 1}^{\tilde T + n_h} \log p(y_{t+h-1}^o \,|\, y_{1:t-1}^o, y^{(0)}, M),$$
where $p(y_{t+h-1}^o \,|\, y_{1:t-1}^o, y^{(0)}, M)$ is the predictive density of $y_{t+h-1}$ evaluated at the observed value $y_{t+h-1}^o$, $y_{1:t}^o = (y_1^o, \ldots, y_t^o)$ denotes the observations up to time $t$, $n_h$ is the number of forecasts, and finally, $\tilde T = T - n_h$. Note that for $h = 1$, the difference $SLP(1, M_1) - SLP(1, M_2)$ amounts to the cumulative log predictive Bayes factor in favour of model $M_1$ against $M_2$, which informs us how the posterior odds of $M_1$ versus $M_2$ (based on the observations up to time $\tilde T$) change upon observing the predicted data, $y_{\tilde T + 1 : \tilde T + n_h} = (y_{\tilde T + 1}, \ldots, y_{\tilde T + n_h})$.
The use of predictive likelihoods (the predictive density values at the realized “future” observations) is motivated and described in, e.g., [35]. Note that, in this study, we do not consider the models' in-sample “fit”, not only due to our intent to avoid the burden of efficiently approximating marginal data density values (and the issue of sensitivity with respect to the priors), but mostly because it would be hardly related to the present context of forecasting.
The predictive density values (predictive likelihoods) in Equation (39) need to be calculated numerically, using draws from the posterior distribution of the parameters and latent variables:
$$\hat p(y_{t+h-1}^o \,|\, y_{1:t-1}^o, y^{(0)}, M) = \frac{1}{N} \sum_{j=1}^{N} p(y_{t+h-1}^o \,|\, y_{1:t-1}^o, \theta^{(j)}, h_{1:t+h-1}^{(j)}, \omega_{1:t+h-1}^{(j)}, y^{(0)}, M),$$
where $\theta^{(j)}$, $h_{1:t+h-1}^{(j)} = (h_1^{(j)}, \ldots, h_{t+h-1}^{(j)})$ and $\omega_{1:t+h-1}^{(j)} = (\omega_1^{(j)}, \ldots, \omega_{t+h-1}^{(j)})$ are draws from the posterior distribution ($j = 1, 2, \ldots, N$).
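Given the per-draw conditional log densities, the Monte Carlo average above and the SLP sum can be computed stably in log space; a sketch (Python, using the log-sum-exp trick to avoid underflow for very small density values):

```python
import math

def log_pred_density(log_dens_draws):
    # log of the Monte Carlo average (1/N) sum_j exp(l_j), computed via log-sum-exp
    m = max(log_dens_draws)
    s = sum(math.exp(l - m) for l in log_dens_draws)
    return m + math.log(s) - math.log(len(log_dens_draws))

def slp(log_pred_by_obs):
    # sum of log predictive likelihoods over the forecast evaluation window
    return sum(log_pred_by_obs)
```

Working in log space matters here, since individual predictive densities evaluated in the tails can be small enough to underflow when averaged naively.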
Table 4 presents the results of the predictive power comparison of the two alternative models, LLFT-SV and t-SV, obtained for the original (i.e., not perturbed) data set, assuming $\nu \sim G(8; 0.8)$ and fixing the parameters $c$ and $d$ at the medians of their marginal posteriors obtained under two different priors for $c$ (we provide further details below). Thus, two cases are considered further. In the first one, $c = 0.118$ and $d = 7.553$, while in the second, $c = 0.014$ and $d = 2.808$. Note that we decided to fix the values of these two parameters here (rather than sample them from their posterior), which is intended to spare much of the otherwise largely time-consuming calculations. Even now, the entire forecasting exercise for a single LLFT-SV model, with fixed values of $c$ and $d$, takes roughly two days of calculations (on a personal computer with an Intel Core i7-9850H processor and 16 GB RAM). Otherwise, allowing for posterior sampling of these two parameters as well would extend this time to about eight days (for a single model).
The results in Table 4 show clear evidence that the LLFT-SV models (particularly the one with $c = 0.014$) outperform t-SV for all forecast horizons. Due to the apparently small differences in the log predictive likelihoods of the t-SV and LLFT-SV models with $c = 0.118$ and $d = 7.553$, one could get the impression that both are quite comparable when it comes to their predictive power. However, the cumulative decimal logarithm of the predictive Bayes factor in favour of the LLFT-SV model with $c = 0.118$ versus t-SV, defined as the difference of the corresponding sums of the log predictive likelihoods for $h = 1$ (see the column for $h = 1$ in Table 4), is equal to about 2.175, which indicates that the posterior odds ratio based on the first 800 observations and calculated in favour of the LLFT-SV model increased about 150-fold upon observing the predicted data. The LLFT-SV model with $c = 0.014$ provides a further, even more considerable improvement, since the cumulative logarithm of the predictive Bayes factor in favour of this model against the one with $c = 0.118$ equals about 22. Thus, the posterior odds of these two models increase by as much as about 22 orders of magnitude upon observing the data over the forecast period (i.e., 2 March 2020 through 25 February 2021). Note that the low value of $c = 0.014$ allows $\sigma_t$ to reach very small values (close to zero), thus enabling the distribution of the conditional mean equation's innovations to feature noticeable peakedness. Therefore, it may be concluded that enabling a sharp peak at the mode of the conditional distribution and modelling returns hovering near zero seem to be more important for the predictive performance than fat tails (at least for the data at hand).

4.2. Results for the Perturbed MSAG.DE Data

Now, we illustrate the validity of the LLFT-SV model in the case where repeated observations do not occur per se, but the series still contains values that concentrate tightly around (but not at) the mode. For this purpose, we use slightly perturbed observations, applying two kinds of perturbation to the price series, both in the spirit of [27]. In the first case, the data are perturbed by adding a uniformly distributed random number from the interval $(-5 \times 10^{-5}, 5 \times 10^{-5})$, while we widen the interval to $(-5 \times 10^{-4}, 5 \times 10^{-4})$ in the second. As a consequence of the price perturbation, the perturbed returns become different from each other and different from zero. The perturbed returns corresponding to the zero returns in the original series fall within the intervals $(-0.0072, 0.0049)$ and $(-0.056, 0.058)$ in the first and second case, respectively. Basic descriptive characteristics (not presented in the paper for the sake of brevity) of the perturbed daily returns are very similar to those of the original data, presented in Table 1. Moreover, the histograms of the original and both perturbed data sets practically coincide (thus, we do not present them, either). Naturally, the medians are now different from zero, equaling 0.0005 in the first case and 0.0034 in the second. In the first case, the concentration of the returns around zero is higher than in the second, with about 11% of the returns lying within the interval $(-0.0072, 0.0049)$, and only 3.2% in the second case (in the same interval).
It can be seen from Table 5 that upon specifying I N K ( 2 ; 0.1 ) as the prior for c in the LLFT-SV model, thus preferring a priori values of the parameter relatively distant from zero, the results of Bayesian inference about the parameters and latent variables turn out to be quite robust to the considered perturbations. It seems that under such a prior, the values of c are too high, dominating (i.e., censoring) ω t , to “adequately” explain the presence of numerous observations very close to the mode. Thereby, the returns that are close but not equal to zero still appear to be treated here in the same way as the zero returns in the model for the original series.
Turning back to Table 5, we notice that very similar results are obtained for the t-SV model. It seems that although the model “adequately” captures the heavy tails of the empirical distribution of the returns, it fails to explain the observations that are compressed around the mode. In both types of models, the inference about latent processes, h t and σ t , remains unchanged (again, relevant graphs are not presented in the paper, for the sake of brevity).
Since both fat tails and a high concentration of financial returns around the mode need to be accounted for simultaneously, we also consider LLFT-SV models that allow c to be closer to zero (as compared with the prior assumptions in the previous analysis). Recall that smaller values of c lead to a more pointed distribution of the conditional mean equation’s innovations. To investigate this, we assume that c follows a priori the I N K ( 0.4 ; 0.009 ) distribution, so that its mode is now equal to about 0.1 instead of the 0.2 considered in the previous analysis. Simultaneously, we impose the restriction ν ≥ 1, which is necessary from a numerical point of view. This restriction is not needed when c is bounded away from zero (then the posterior probability of ν < 1 is equal to zero).
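The role of c can be illustrated by direct simulation from the SMN representation: ω t is drawn from the two-component mixture p·Beta(ρ, 1) + (1 − p)·Pareto(1, ν) and censored to min { max { ω t , c } , d }, which then scales a standard normal draw. A minimal sketch (with illustrative parameter values of our choosing, not the posterior estimates) showing that lowering c concentrates more probability mass near the mode:

```python
import numpy as np

rng = np.random.default_rng(1)

def llft_innovations(n, p, rho, nu, c, d):
    """Draw n unit-volatility innovations from the SMN representation:
    omega ~ p * Beta(rho, 1) + (1 - p) * Pareto(1, nu), censored to
    min{max{omega, c}, d}, which scales a standard normal draw.
    Illustrative sketch only, not the paper's posterior sampler."""
    use_beta = rng.random(n) < p
    omega = np.where(use_beta,
                     rng.beta(rho, 1.0, size=n),      # support (0, 1): the peaked part
                     1.0 + rng.pareto(nu, size=n))    # Pareto with minimum 1: the fat tail
    scale = np.clip(omega, c, d)                      # censoring: min{max{omega, c}, d}
    return scale * rng.standard_normal(n)

# Lowering c lets the scale approach zero, sharpening the peak at the mode:
eps_low_c = llft_innovations(100_000, p=0.5, rho=0.5, nu=3.0, c=0.014, d=8.0)
eps_high_c = llft_innovations(100_000, p=0.5, rho=0.5, nu=3.0, c=0.118, d=8.0)
share_low = np.mean(np.abs(eps_low_c) < 0.01)
share_high = np.mean(np.abs(eps_high_c) < 0.01)
```

Under these illustrative settings, the share of draws falling within (−0.01, 0.01) is markedly larger for c = 0.014 than for c = 0.118, mirroring the mechanism behind the predictive gains reported above.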
When the prior distribution of c admits relatively more mass close to zero, only the larger (second) perturbation turns out to change the location and dispersion of the posterior distributions of ν and ρ , which, in consequence, also affects the inference about all the remaining parameters (apparently with the exception of the volatility persistence, ϕ ); see Table 6. For the original data and the first perturbation, the two parameters, ν and ρ , have their posterior probability mass at much lower values than in the case of c ∼ I N K ( 2 ; 0.1 ) . Moreover, the posterior marginals of these two parameters exhibit a very high concentration around their modes. Additionally, d gathers almost all of its posterior mass near 1 (see Figure 22), indicating that the presence of the latent process h t in the LLFT-SV structure is enough to describe the heavy tails of the empirical distribution of the modelled returns.
Although the posterior results for the parameters driving the term min { max { ω t , c } , d } (i.e., ρ , ν , p, and obviously, c and d ) are very sensitive to the prior assumptions about, among others, the parameter c, the posterior inference about the conditional standard deviations (of the original and perturbed data), σ t , is strikingly similar in terms of their dynamics (the posterior mean series are highly correlated); see Figure 23. However, in the LLFT-SV model with c ∼ I N K ( 0.4 ; 0.009 ) , considerably smaller values of the posterior means of σ t are noted for the zero returns, although this can be noticed in the figure only at high magnification. The average value of the posterior means of σ t corresponding to the zero returns is equal to: 0.15 (with standard deviation 0.05 ) in the case of c ∼ I N K ( 0.4 ; 0.009 ) , 1.00 ( 0.44 ) in the case of c ∼ I N K ( 2 ; 0.1 ) , and 1.67 ( 0.69 ) in the t-SV model with ν ∼ G ( 8 ; 0.8 ) . The zero returns in the original data and the observations concentrated around the mode in the perturbed data (marked by the blue vertical lines in Figure 23, Figure 24 and Figure 25) tend to be treated in a special way, by being assigned relatively small values of ω t ; see Figure 24 and Figure 25. Thus, in this sense, the LLFT-SV model is able to detect “inliers”, defined here as observations that lie in the central part of the empirical distribution and whose occurrence is “atypical” in the sense of repeatedness or an “atypically” high concentration around the mode.
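The summary values quoted above (e.g., 0.15 with standard deviation 0.05) are simple conditional averages of the posterior means of σ t over the zero-return dates; a sketch of this computation (with hypothetical toy inputs, not the MSAG.DE results):

```python
import numpy as np

def zero_return_vol_summary(sigma_post_means, returns):
    """Average and standard deviation of the posterior means of sigma_t
    taken over the zero-return dates (the summary quoted in the text).
    Illustrative helper; the inputs below are hypothetical."""
    returns = np.asarray(returns, dtype=float)
    vals = np.asarray(sigma_post_means, dtype=float)[returns == 0.0]
    return vals.mean(), vals.std()

# Two zero-return dates in a toy series of four observations:
m, s = zero_return_vol_summary([0.2, 0.1, 0.3, 0.15], [0.0, 1.2, 0.0, -0.4])
```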
Note that the inference about the mixing variables, ω t , depends on the size of the perturbation. It seems that the closer an observation y t is to zero (or to the mode), the smaller the value of the corresponding min { max { ω t , c } , d } in the LLFT-SV model with c ∼ I N K ( 0.4 ; 0.009 ) . Thanks to this prior assumption, min { max { ω t , c } , d } can take on values close to zero and, in consequence, atypical “inlying” observations can be better explained. This result is in accordance with the predictive power comparison of the models presented in the previous subsection. Allowing c to take on low values considerably improves the predictive performance of the LLFT-SV model, due to its far more appropriate handling of the very center of the data distribution.

4.3. Results for Stock Market Indices

In this section, we present only a brief analysis of the LLFT-SV model estimation results for two common stock market indices: Standard and Poor’s 500 (S&P 500) and Deutscher Aktienindex (DAX), both representing “typical” financial series, with their dynamics reflecting movements of the entire market rather than only of a single company’s stock prices (possibly of limited liquidity). Both data sets were downloaded from http://stooq.pl (an open access web page) and cover the same period as the one considered previously for MSAG.DE: 28 December 2016 through 25 February 2021. As seen in Figure 26, the logarithmic rates of return reveal strong leptokurticity and fat tails, with the latter built up particularly by the market volatility outburst triggered by the COVID-19 pandemic. However, contrary to the MSAG.DE data set, the series currently at hand feature neither zero returns nor any repeating values, therefore representing quite distinct characteristics.
Similarly to the analysis for MSAG.DE, the first two available observations are reserved for the initial condition in the AR(2) structure, yielding a final number of modelled observations of T = 1047 for DAX and T = 1044 for S&P 500.
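This data preparation step amounts to splitting off the first two returns; a trivial sketch (the function name is ours):

```python
import numpy as np

def prepare_ar2_sample(returns):
    """Reserve the first two returns as initial conditions for the AR(2)
    conditional-mean structure; the remaining T observations are modelled."""
    r = np.asarray(returns, dtype=float)
    return r[:2], r[2:]

# e.g., 1049 available DAX returns leave T = 1047 modelled observations:
y0, y = prepare_ar2_sample(np.zeros(1049))
```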
Posterior characteristics of the key parameters of the LLFT-SV model ( ν , ρ , p) indicate the validity of modelling the error term through the LLFT distribution, pointing to a non-negligible contribution of both mixture components: beta and Pareto (in the SMN representation); see Table 7. As implied by the posterior expectations of p, for DAX, both components are almost of the same weight, while in the case of S&P 500, the beta component constitutes roughly 30% of the mixture. Higher posterior medians of c for DAX, as compared to S&P 500, indicate a lower local conditional peakedness, which is suitably accompanied by lower posterior medians of d, the values of which, being close to 1, bring the conditional distribution of ε t (given d) closer to a normal distribution. However, it seems that in the case of S&P 500 the posterior distributions of c and d are visibly affected by the priors (see relevant panels in Figure 27).
In general, we could conclude that incorporating the LLFT distribution into an SV model may be an empirically valid extension of the basic stochastic volatility structure also for “more typical” financial time series, such as market indices, rather than only for some specific company’s (“less typical”) stock returns.

5. Discussion

In the paper, we proposed a new scale mixture of normal distributions that features local leptokurticity and local fat tails. This new LLFT distribution was further incorporated into a basic stochastic volatility model (yielding an LLFT-SV specification), so as to enhance the model’s capability of capturing the corresponding empirical properties of financial time series. For the new model, we developed a Bayesian framework along with valid MCMC methods of posterior sampling. Empirical results indicate the validity of the LLFT-SV specification for modelling both “non-standard” financial time series with repeating zero returns (resulting in a single pronounced histogram bar), as well as more “typical” data on the S&P 500 and DAX indices. For the former, the LLFT-SV model was also shown to markedly outperform a common, globally heavy-tailed, t-SV alternative in terms of density forecasting (as measured by the sum of predictive likelihoods). Apparently, the noticeable predictive superiority of the LLFT-SV model draws on its more adequate handling of the peak in the central part of the returns’ distribution, though not at the expense of a valid modelling of the outliers.
In general, it can be concluded that it is not only fat tails but also close-to-mode returns that are vital to financial data modelling. Thus, distributions more flexible than the common heavy-tailed ones (such as Student’s t or slash) may be empirically required, so that both features can be “freely” controlled by separate parameters (rather than a single one). Nonetheless, the LLFT-SV specification proposed in this study leaves some important room for further improvement, with the model’s most obvious limitations being the symmetry of the conditional distribution and the preclusion of the leverage effect. The symmetry of the LLFT distribution suggests that it lends itself mainly to modelling either time series with a quite symmetric data distribution, or at least those where an adequate capturing of the tails and peakedness would empirically prove more important than allowing for skewness (although this could not be settled without comparing alternative models). Obviously, extending the LLFT-SV model towards asymmetry could be directly related to its gaining the ability to account for the leverage effect. Moreover, skewing the LLFT distribution itself seems a valid and important extension. We leave such generalizations of our current framework for future work. Finally, and on a different note, an evaluation of the LLFT-SV model in terms of risk modelling and management would also be most desirable.

Author Contributions

All authors have contributed equally to the paper. This includes conceptualization, methodology, formal analysis, investigation, data curation, software, visualization, and writing. In addition, all authors have read and agreed to the published version of the manuscript.

Funding

Łukasz Lenart acknowledges support from a subsidy granted to Cracow University of Economics. Łukasz Kwiatkowski acknowledges financial support through a grant from the National Science Center (NCN, Poland) under decision no. UMO-2018/31/B/HS4/00730.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are publicly available.

Acknowledgments

The authors express their sincere gratitude to the anonymous referees for their valuable comments that allowed us to greatly improve the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In this appendix, we provide a theorem showing that the marginal data density (i.e., the marginal likelihood) is finite in the Bayesian LLFT-SV model considered in Section 3.
Theorem A1.
For the stochastic volatility model presented in Section 3, and under the additional assumption that the inverse gamma prior distribution of σ 2 is truncated to the interval ( 0 , κ σ 2 ) , for any fixed κ σ 2 < ∞ , we have that p ( y ) < ∞ .
Proof of Theorem A1.
Denote $p(\omega \mid p, \rho, \nu) = \prod_{t=1}^{T}\left[\, p\, f_{\mathrm{Beta}}(\omega_t \mid \rho, 1) + (1-p)\, f_{\mathrm{Pareto}}(\omega_t \mid 1, \nu) \right]$ and $\tilde{h} = [\ln h_1, \ln h_2, \ldots, \ln h_T]'$. Note that
$$
\begin{aligned}
p(y) &= \int_{\omega}\int_{\tilde{h}}\int_{\beta}\int_{\sigma}\int_{\gamma}\int_{\phi}\int_{p}\int_{\nu}\int_{\rho}\int_{c}\int_{d}
p\bigl(y, \omega, \tilde{h}, \beta, \sigma, \gamma, \phi, p, \nu, \rho, c, d \mid y_0\bigr)\,
\mathrm{d}d\,\mathrm{d}c\,\mathrm{d}\rho\,\mathrm{d}\nu\,\mathrm{d}p\,\mathrm{d}\phi\,\mathrm{d}\gamma\,\mathrm{d}\sigma\,\mathrm{d}\beta\,\mathrm{d}\tilde{h}\,\mathrm{d}\omega \\
&\propto \int_{\omega}\int_{\tilde{h}}\int_{\beta}\int_{\sigma}\int_{\gamma}\int_{\phi}\int_{p}\int_{\nu}\int_{\rho}\int_{c}\int_{d}
\underbrace{\exp\!\left(-\frac{1}{2}\sum_{t=1}^{T}\left(\frac{y_t - x_t \beta}{e^{\tilde{h}_t/2}\,\min\{\max\{\omega_t, c\}, d\}}\right)^{2}\right)
\prod_{t=1}^{T}\frac{1}{\min\{\max\{\omega_t, c\}, d\}}}_{f_1} \\
&\qquad \times
\underbrace{\exp\!\left(-\frac{1}{2}\sum_{t=1}^{T}\left[\frac{\bigl(\tilde{h}_t - \gamma - \phi(\tilde{h}_{t-1} - \gamma)\bigr)^{2}}{\sigma^{2}} + \tilde{h}_t\right]\right)\sigma^{-T}}_{f_2} \\
&\qquad \times
p(\omega \mid p, \rho, \nu)\, p(\sigma)\, p(\beta)\, p(\phi)\, p(\gamma)\, p(p)\, p(\nu)\, p(\rho)\, p(d)\, p(c)\;
\mathrm{d}d\,\mathrm{d}c\,\mathrm{d}\rho\,\mathrm{d}\nu\,\mathrm{d}p\,\mathrm{d}\phi\,\mathrm{d}\gamma\,\mathrm{d}\sigma\,\mathrm{d}\beta\,\mathrm{d}\tilde{h}\,\mathrm{d}\omega. 
\end{aligned} \tag{A1}
$$
For the term $f_1$, since $c \le \min\{\max\{\omega_t, c\}, d\} \le d$ for every $t$ (so that replacing the censored term by $d$ in the exponent and by $c$ in the product can only increase $f_1$), we have the following inequality:
$$
f_1 \le \exp\!\left(-\frac{1}{2 d^{2}} \sum_{t=1}^{T} \left(\frac{y_t - x_t \beta}{e^{\tilde{h}_t/2}}\right)^{2}\right) c^{-T}.
$$
Using this inequality, we get that there exist $C_1, C_2 < \infty$ such that
$$
\int_{c}\int_{d} f_1\, p(d)\, p(c)\, \mathrm{d}d\, \mathrm{d}c \;\le\; C_1 \left(\frac{1}{2}\sum_{t=1}^{T} e^{-\tilde{h}_t} \left(y_t - x_t \beta\right)^{2} + \beta_d\right)^{\alpha_d} \frac{\Gamma\!\left(\alpha_d,\; \beta_d + \frac{1}{2}\sum_{t=1}^{T} e^{-\tilde{h}_t} \left(y_t - x_t \beta\right)^{2}\right)}{\Gamma(\alpha_d)} \;\le\; C_2. \tag{A2}
$$
The term $f_2$ is the kernel of a multivariate Gaussian distribution for $\tilde{h}$, which can be written as
$$
f_2 = |\Sigma_h|^{\frac{1}{2}}\, e^{-\frac{1}{2}(\tilde{h} - r)'\,\Sigma_h\,(\tilde{h} - r)}\; e^{\frac{\gamma}{2}\left(T - \sum_{t=1}^{T}\phi^{t}\right)}\; e^{h_0\,\phi\,\frac{\phi^{T} - 1}{\phi - 1}}\; e^{\sigma^{2} g(\phi)}, \tag{A3}
$$
where $\Sigma_h = [a_{ij}]_{T \times T}$ is a tridiagonal matrix such that $|\Sigma_h|^{\frac{1}{2}} = \sigma^{-T}$, with the elements on the main diagonal $a_{ii} = \frac{\phi^{2} + 1}{\sigma^{2}}$ for $i = 1, 2, \ldots, T-1$ and $a_{TT} = \frac{1}{\sigma^{2}}$, with $a_{ij} = -\frac{\phi}{\sigma^{2}}$ for $|i - j| = 1$, and $r$ is the mean vector. The term $g(\phi)$ is a polynomial of order $T - 2$ with positive values for $\phi \in (-1, 1)$ and with known coefficients that depend only on the sample size, $T$. Hence, there exists $C_3 < \infty$, which depends only on $T$, such that $g(\phi) \le C_3$.
Using (A2), (A3) and the inequalities $e^{h_0\,\phi\,\frac{\phi^{T} - 1}{\phi - 1}} \le e^{|h_0| T}$, $e^{\frac{\gamma}{2}\left(T - \sum_{t=1}^{T}\phi^{t}\right)} \le e^{|\gamma| T}$, $e^{\sigma^{2} g(\phi)} \le e^{\kappa_{\sigma^2} C_3}$, and integrating over $\tilde{h}$, we get that there exists $C_4 < \infty$ such that
$$
p(y) \le C_4 \int_{\omega}\int_{\beta}\int_{\sigma}\int_{\gamma}\int_{\phi}\int_{p}\int_{\nu}\int_{\rho}
e^{|\gamma| T}\; p(\omega \mid p, \rho, \nu)\, p(\sigma)\, p(\beta)\, p(\phi)\, p(\gamma)\, p(p)\, p(\nu)\, p(\rho)\;
\mathrm{d}\rho\,\mathrm{d}\nu\,\mathrm{d}p\,\mathrm{d}\phi\,\mathrm{d}\gamma\,\mathrm{d}\sigma\,\mathrm{d}\beta\,\mathrm{d}\omega
= C_4 \int_{\gamma} e^{|\gamma| T}\, p(\gamma)\, \mathrm{d}\gamma < \infty, \tag{A4}
$$
which ends the proof. □

Appendix B

We examine the convergence of the MCMC sampler through a visual inspection of ACF, standardized CUSUM ([36,37,38]) and trace plots; see Figure A1 and Figure A2. Furthermore, in Table A1, inefficiency factors are reported (calculated with the batch-means method, using the ACF lags from 1 through 500; see e.g., [39]).
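The inefficiency factor can be approximated from the chain's autocorrelations; the sketch below uses the standard ACF-based formula IF = 1 + 2 Σ ρ̂_l over lags 1 through 500 (a simplified stand-in for the batch-means variant used in the paper, which may differ in detail):

```python
import numpy as np

def inefficiency_factor(draws, max_lag=500):
    """ACF-based inefficiency factor: IF = 1 + 2 * sum of sample
    autocorrelations at lags 1..max_lag.  A value of k means the chain
    carries roughly n/k independent-equivalent draws.  Simple sketch;
    the batch-means variant used in the paper may differ in detail."""
    x = np.asarray(draws, dtype=float)
    x = x - x.mean()
    n = x.size
    acov = np.correlate(x, x, mode='full')[n - 1:] / n   # autocovariances, lags 0..n-1
    rho = acov[1:max_lag + 1] / acov[0]                  # autocorrelations, lags 1..max_lag
    return 1.0 + 2.0 * rho.sum()

# For i.i.d. draws the factor should be close to 1:
rng = np.random.default_rng(2)
iid = rng.standard_normal(5_000)
```

Values well above 1, like those reported in Table A1, indicate the autocorrelation visible in the ACF plots, with the effective sample size reduced roughly by the corresponding factor.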
The results presented in Figure A1 and Figure A2 do not display any anomalies that would invalidate the MCMC convergence for the MSAG.DE data (the plots obtained for the S&P 500 and DAX indices are similar and, therefore, to save space, not reported here, although they are available upon request). The ACF plots reveal, however, a typical autocorrelation, most persistent for the parameters γ , ν and p, which is further reflected in the inefficiency factors (see Table A1).
Figure A1. Standardized CUSUM for the parameters.
Table A1. Inefficiency factors obtained in the LLFT-SV model, under ν I G ( 8 ; 0.8 ) , d I N K ( 2 ; 100 ) , c I N K ( 2 ; 0.1 ) .
Parameter    MSAG.DE    DAX       S&P 500
γ            16.743     6.197     2.553
ϕ            11.507     13.184    7.219
σ 2          14.419     17.611    12.23
ν            16.275     11.676    10.114
ρ            11.997     8.651     4.837
p            18.022     21.078    13.499
c            10.688     12.954    15.081
d            7.696      15.471    8.127
β 0          7.654      4.012     5.363
β 1          9.752      3.137     2.985
β 2          6.468      5.554     2.858
Figure A2. Trace plots and ACF for MCMC samples.

References

1. Andrews, D.F.; Mallows, C.L. Scale mixtures of normal distributions. J. R. Stat. Soc. Ser. B 1974, 36, 99–102.
2. Fang, K.T.; Kotz, S.; Ng, K.W. Symmetric Multivariate and Related Distributions; Chapman and Hall: New York, NY, USA, 1990.
3. Madan, D.B.; Seneta, E. The Variance Gamma (V.G.) Model for Share Market Returns. J. Bus. 1990, 63, 511–524.
4. Romanowski, M. Random Errors in Observation and the Influence of Modulation on Their Distribution; Verlag Konrad Wittwer: Stuttgart, Germany, 1979.
5. Rogers, W.H.; Tukey, J.W. Understanding some long-tailed symmetrical distributions. Stat. Neerl. 1972, 26, 211–226.
6. Reyes, J.; Gómez, H.W.; Bolfarine, H. Modified slash distribution. Stat. J. Theor. App. Stat. 2013, 47, 929–941.
7. Reyes, J.; Barranco-Chamorro, I.; Gómez, H.W. Generalized modified slash distribution with applications. Commun. Stat. Theory 2020, 49, 2025–2048.
8. Gneiting, T. Normal scale mixtures and dual probability densities. J. Stat. Comput. Sim. 1997, 59, 375–384.
9. Bai, X.; Russell, J.R.; Tiao, G.C. Kurtosis of GARCH and Stochastic Volatility Models with Non-Normal Innovations. J. Econom. 2003, 114, 349–360.
10. Rachev, S.T.; Mittnik, S.; Fabozzi, F.J.; Focardi, S.M.; Jašić, T. Financial Econometrics: From Basics to Advanced Modeling Techniques; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2007.
11. Tsiotas, G. On generalised asymmetric stochastic volatility models. Comput. Stat. Data Anal. 2012, 56, 151–172.
12. Nakajima, J.; Omori, Y. Stochastic volatility model with leverage and asymmetrically heavy-tailed error using GH skew Student’s t-distribution. Comput. Stat. Data Anal. 2012, 56, 3690–3704.
13. Abanto-Valle, C.A.; Wang, C.; Wang, X.; Wang, F.-X.; Chen, M.-H. Bayesian inference for stochastic volatility models using the generalized skew-t distribution with applications to the Shenzhen Stock Exchange returns. Stat. Interface 2014, 7, 487–502.
14. Abanto-Valle, C.A.; Bandyopadhyay, D.; Lachos, V.H.; Enriquez, I. Robust Bayesian Analysis of Heavy-tailed Stochastic Volatility Models using Scale Mixtures of Normal Distributions. Comput. Stat. Data Anal. 2010, 54, 2883–2898.
15. Yanhui, X.; Hui, P. Modelling financial time series based on heavy-tailed market microstructure models with scale mixtures of normal distributions. Int. J. Syst. Sci. 2018, 49, 1615–1626.
16. Chib, S.; Nardari, F.; Shephard, N. Markov chain Monte Carlo methods for stochastic volatility models. J. Econom. 2002, 108, 281–316.
17. Jacquier, E.; Polson, N.G.; Rossi, P.E. Bayesian analysis of stochastic volatility models with fat-tails and correlated errors. J. Econom. 2004, 122, 185–212.
18. Shephard, N. Stochastic Volatility: Selected Readings; Oxford University Press: Oxford, UK, 2005.
19. Tomohiro, A. Bayesian Inference for Nonlinear and Non-Gaussian Stochastic Volatility Model with Leverage Effect. J. Jpn. Stat. Soc. 2006, 36, 173–197.
20. Shirota, S.; Hizu, T.; Omori, Y. Realized stochastic volatility with leverage and long memory. Comput. Stat. Data Anal. 2014, 76, 618–641.
21. Virbickaitė, A.; Lopes, H.F. Bayesian semiparametric Markov switching stochastic volatility model. Appl. Stoch. Model Bus. Ind. 2019, 35, 978–997.
22. Kwiatkowski, Ł. Markov Switching In-Mean Effect. Bayesian Analysis in Stochastic Volatility Framework. Cent. Eur. J. Econ. Model. Econom. 2010, 2, 59–94.
23. Kwiatkowski, Ł. Bayesian Analysis of a Regime Switching In-Mean Effect for the Polish Stock Market. Cent. Eur. J. Econ. Model. Econom. 2012, 3, 187–219.
24. Jacquier, E.; Polson, N.; Rossi, P. Bayesian analysis of stochastic volatility models (with discussion). J. Bus. Econ. Stat. 1994, 12, 371–417.
25. Abanto-Valle, C.A.; Lachos, V.H.; Dey, D.K. Bayesian estimation of a skew-Student-t stochastic volatility model. Methodol. Comput. Appl. 2015, 17, 721–738.
26. Leão, W.L.; Abanto-Valle, C.A.; Chen, M.H. Bayesian analysis of stochastic volatility-in-mean model with leverage and asymmetrically heavy-tailed error using generalized hyperbolic skew Student’s t-distribution. Stat. Interface 2017, 10, 529–541.
27. Fernández, C.; Steel, M.F.J. On the Dangers of Modelling Through Continuous Distributions: A Bayesian Perspective. In Bayesian Statistics 6; Bernardo, J.M., Berger, J., Dawid, A., Smith, A., Eds.; Oxford University Press: Oxford, UK, 1998; pp. 213–238.
28. Taylor, S. Modelling Financial Time Series; John Wiley and Sons Ltd.: Chichester, UK, 1986.
29. Ruiz, E. Quasi-maximum likelihood estimation of stochastic volatility models. J. Econom. 1994, 63, 289–306.
30. Clark, P.K. A subordinated stochastic process model with finite variance for speculative prices. Econometrica 1973, 41, 135–155.
31. Taylor, S. Financial returns modelled by the product of two stochastic processes-a study of the daily sugar prices 1961–75. In Time Series Analysis: Theory and Practice 1; Anderson, O., Ed.; North-Holland Publishing Company: Amsterdam, The Netherlands, 1982; pp. 203–226.
32. Louzada, F.; Ramos, P.; Nascimento, D. The Inverse Nakagami-m Distribution: A Novel Approach in Reliability. IEEE Trans. Reliab. 2018, 67, 1030–1042.
33. Damien, P.; Walker, S.G. Sampling Truncated Normal, Beta, and Gamma Densities. J. Comput. Graph. Stat. 2001, 10, 206–215.
34. Osiewalski, J.; Pajor, A. On Sensitivity of Inference in Bayesian MSF-MGARCH Models. Cent. Eur. J. Econ. Model. Econom. 2019, 11, 173–197.
35. Geweke, J.; Amisano, G. Comparing and evaluating Bayesian predictive distributions of asset returns. Int. J. Forecast. 2010, 26, 216–230.
36. Yu, B.; Mykland, P. Looking at Markov samplers through cusum path plots: A simple diagnostic idea. Stat. Comput. 1998, 8, 275–286.
37. Bauwens, L.; Lubrano, M. Bayesian inference on GARCH models using the Gibbs sampler. Econom. J. 1998, 1, C23–C46.
38. Bauwens, L.; Lubrano, M.; Richard, J.-F. Bayesian Inference in Dynamic Econometric Models; Oxford University Press: Oxford, UK, 1999.
39. Greenberg, E. Introduction to Bayesian Econometrics; Cambridge University Press: Cambridge, UK, 2012.
Figure 1. Variance (on the left) and kurtosis (on the right) for LS distribution (given by (5)) as a function of ν and d.
Figure 2. Probability density functions of the LS distribution for different values of ν and d.
Figure 3. Probability density functions of the MN type I distribution for different values of ρ .
Figure 4. Probability density functions of the LMN type I distribution for different values of ρ 1 , c { 0.02 , 0.05 , 0.1 , 0.2 , 0.5 , 0.8 } .
Figure 5. Probability density functions of the LMN type I distribution for different values of ρ 5 / 4 , c { 0.02 , 0.05 , 0.1 , 0.2 , 0.5 , 0.8 } .
Figure 6. The tree with limiting cases for the LLFT distribution.
Figure 7. The pdf of the LLFT distribution for different values of parameters c and d and fixed ν , ρ and p = 1 / 2 .
Figure 8. Time plots of (a) daily prices, (b) log returns in percentages for MSAG.DE, (c) histogram of log returns (in percentages) for MSAG.DE with a Gaussian curve.
Figure 9. Histograms of the marginal posteriors (bars) and priors (red line) of mixture parameters ν , ρ and p, under ν G ( 8 ; 0.8 ) , c I N K ( 2 ; 0.1 ) and d I N K ( 2 ; 100 ) .
Figure 10. Histograms of the marginal posteriors (bars) and priors (red line) of SV parameters γ , ϕ and σ 2 , under ν G ( 8 ; 0.8 ) , c I N K ( 2 ; 0.1 ) and d I N K ( 2 ; 100 ) .
Figure 11. Histograms of the marginal posteriors (bars) and priors (red line) of c and d, under ν G ( 8 ; 0.8 ) , c I N K ( 2 ; 0.1 ) and d I N K ( 2 ; 100 ) .
Figure 12. Histograms of the marginal posteriors (bars) and priors (red line) of c and d, under ν G ( 0.2 ; 0.05 ) , c I N K ( 2 ; 0.1 ) and d I N K ( 9 ; 100 ) .
Figure 13. Bivariate marginal posteriors: (a) for pairs ( ν , ρ ) and ( p , ρ ) , (b) for pairs ( γ , ν ) and ( ν , σ 2 ) ; obtained in the LLFT-SV model with ν G ( 8 ; 0.8 ) , c I N K ( 2 ; 0.1 ) and d I N K ( 2 ; 100 ) .
Figure 14. Bivariate marginal posteriors for pairs ( c , ρ ) and ( ν , d ) , obtained in the LLFT-SV model with (a) ν G ( 8 ; 0.8 ) , c I N K ( 2 , 0.1 ) , d I N K ( 2 ; 100 ) , (b) ν G ( 8 ; 0.8 ) , c I N K ( 2 , 0.1 ) , d I N K ( 9 ; 100 ) , (c) ν G ( 0.2 ; 0.05 ) , c I N K ( 2 , 0.1 ) , d I N K ( 9 ; 100 ) .
Figure 15. Posterior means (black line) with two standard-deviation bands (truncated only to positive values; red) of h t s obtained in the LLFT-SV model with: (a) ν G ( 0.2 ; 0.05 ) , c I N K ( 2 , 0.1 ) , d I N K ( 9 ; 100 ) , (b) ν G ( 8 ; 0.8 ) , c I N K ( 2 , 0.1 ) , d I N K ( 9 ; 100 ) , (c) ν G ( 8 ; 0.8 ) , c I N K ( 2 , 0.1 ) , d I N K ( 2 ; 100 ) .
Figure 16. Posterior means (black line) with two standard-deviation bands (truncated only to positive values; red) of h t s obtained in the t-SV model with: (a) ν G ( 0.2 ; 0.05 ) , (b) ν G ( 8 ; 0.8 ) .
Figure 17. Posterior means (black line) with two-standard-deviation bands (truncated only to positive values; red) obtained in the LLFT-SV model with ν G ( 0.2 ; 0.05 ) , c I N K ( 2 ; 0.1 ) , d I N K ( 9 ; 100 ) for: (a) ω t , (b) min { max { ω t , c } , d } .
Figure 18. Posterior means (black lines) with two-standard-deviation bands (truncated only to positive values; red) of min { max { ω t , c } , d } , obtained in the LLFT-SV model with ν G ( 8 ; 0.8 ) , c I N K ( 2 , 0.1 ) , and: (a) d I N K ( 2 ; 100 ) , (b) d I N K ( 9 ; 100 ) .
Figure 19. Posterior means (black line) with two-standard-deviation bands (truncated only to positive values; red) of ω t obtained in the t-SV model with: (a) ν G ( 0.2 , 0.05 ) , (b) ν G ( 8 , 0.8 ) .
Figure 20. Posterior means (black line) with two-standard-deviation bands (truncated only to positive values; red) of σ t obtained in the LLFT-SV model with: (a) ν G ( 0.2 , 0.05 ) , c I N K ( 2 , 0.1 ) , d I N K ( 9 ; 100 ) , (b) ν G ( 8 , 0.8 ) , c I N K ( 2 , 0.1 ) , d I N K ( 9 ; 100 ) , (c) ν G ( 8 , 0.8 ) , c I N K ( 2 , 0.1 ) , d I N K ( 2 ; 100 ) . (d) The series of modelled returns.
Figure 21. Posterior means (black line) with two-standard-deviation bands (truncated only to positive values; red) of σ t obtained in the t-SV model with: (a) ν G ( 0.2 ; 0.05 ) , (b) ν G ( 8 ; 0.8 ) . (c) The series of modelled returns.
Figure 22. Bivariate marginal posteriors for pairs ( c , ρ ) and ( ν , d ) , obtained in the LLFT-SV model, under ν G ( 8 ; 0.8 ) , c I N K ( 0.4 ; 0.009 ) and d I N K ( 2 ; 100 ) ; (a) for the original MSAG.DE data set, (b) for the perturbed data, U ( − 5 × 10^−5 , 5 × 10^−5 ) , (c) for the perturbed data, U ( − 5 × 10^−4 , 5 × 10^−4 ) .
Figure 23. Posterior means (black line) with two-standard-deviation bands (truncated only to positive values; red) of σ t obtained for the original MSAG.DE data in the LLFT-SV model with ν G ( 8 ; 0.8 ) , and: (a) c I N K ( 2 ; 0.1 ) , (b) c I N K ( 0.4 ; 0.009 ) .
Figure 24. Posterior means (black line) with two-standard-deviation bands (truncated to positive values only; red) of min{max{ω_t, c}, d}, obtained for the original MSAG.DE data in the LLFT-SV model with ν ~ G(8; 0.8) and: (a) c ~ INK(2; 0.1), (b) c ~ INK(0.4; 0.009).
Figure 25. Posterior means (black line) with two-standard-deviation bands (truncated to positive values only; red) of min{max{ω_t, c}, d}, obtained in the LLFT-SV model for the perturbed MSAG.DE data. The prior distribution: ν ~ G(8; 0.8), c ~ INK(0.4; 0.009). The result for: (a) the first perturbation, (b) the second perturbation.
Figure 26. The series (left) and histograms (with fitted Gaussian curves; right) of the S&P 500 and DAX returns.
Figure 27. Marginal posteriors (blue line) and priors (red line) of the LLFT-SV model parameters under ν ~ G(8; 0.8), c ~ INK(0.4; 0.009) and d ~ INK(2; 100). The top two rows show plots for the S&P 500 and the bottom two rows for the DAX.
Table 1. Sample characteristics for the MSAG.DE data set.
Mean     St. Dev.   Median   Min       Max      Skewness   Kurtosis   Percent of Zero Returns
0.035    2.851      0.000    −22.399   30.390   0.635      24.780     11%
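The characteristics in Table 1 can be reproduced directly from the raw return series. A minimal sketch in plain NumPy (the function is illustrative, not the authors' code; we assume the reported kurtosis is the ordinary, non-excess moment ratio, which equals 3 for a Gaussian and matches the order of magnitude 24.780 above):

```python
import numpy as np

def sample_characteristics(r):
    """Summary statistics of a return series r, as reported in Table 1."""
    r = np.asarray(r, dtype=float)
    m, s = r.mean(), r.std()          # population moments for skewness/kurtosis
    z = (r - m) / s
    return {
        "mean": m,
        "st_dev": r.std(ddof=1),      # sample standard deviation
        "median": np.median(r),
        "min": r.min(),
        "max": r.max(),
        "skewness": np.mean(z**3),
        "kurtosis": np.mean(z**4),    # non-excess kurtosis: 3 for a Gaussian
        "pct_zero": 100.0 * np.mean(r == 0.0),
    }

# toy usage on an artificial series with repeating zero returns
r = np.array([0.0, 0.0, 1.0, -1.0, 2.0, -2.0, 0.0, 3.0])
stats = sample_characteristics(r)
```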
Table 2. Posterior characteristics of model parameters for the original MSAG.DE data set. The first row of each entry: posterior median. The second row: 90% posterior confidence interval (in parentheses). The third row: interquartile range.
Parameter   LLFT-SV            LLFT-SV            LLFT-SV            t-SV               t-SV
            ν ~ G(0.2; 0.05)   ν ~ G(8; 0.8)      ν ~ G(8; 0.8)      ν ~ G(0.2; 0.05)   ν ~ G(8; 0.8)
            c ~ INK(2; 0.1)    c ~ INK(2; 0.1)    c ~ INK(2; 0.1)
            d ~ INK(9; 100)    d ~ INK(9; 100)    d ~ INK(2; 100)
γ           0.947              1.290              1.206              0.982              1.045
            (0.492, 1.500)     (0.772, 1.682)     (0.758, 1.661)     (0.709, 1.255)     (0.779, 1.304)
            0.406              0.413              0.392              0.224              0.210
ϕ           0.914              0.905              0.911              0.905              0.899
            (0.860, 0.950)     (0.854, 0.944)     (0.860, 0.947)     (0.851, 0.947)     (0.839, 0.941)
            0.036              0.036              0.036              0.036              0.042
σ²          0.150              0.171              0.156              0.156              0.186
            (0.078, 0.261)     (0.099, 0.282)     (0.087, 0.270)     (0.087, 0.282)     (0.102, 0.324)
            0.072              0.072              0.069              0.078              0.087
ν           3.99               6.30               5.60               5.11               6.44
            (2.73, 7.35)       (3.64, 8.82)       (3.57, 8.96)       (3.64, 8.54)       (4.34, 10.85)
            1.82               2.38               2.24               1.68               2.52
ρ           0.64               0.67               0.66
            (0.22, 0.98)       (0.24, 0.97)       (0.23, 0.96)
            0.33               0.32               0.35
p           0.148              0.240              0.214
            (0.046, 0.302)     (0.106, 0.363)     (0.107, 0.382)
            0.087              0.125              0.131
c           0.134              0.120              0.118
            (0.061, 0.261)     (0.058, 0.231)     (0.057, 0.220)
            0.077              0.067              0.067
d           5.161              3.354              7.553
            (1.144, 10.673)    (1.105, 6.396)     (1.105, 16.632)
            2.951              3.146              4.524
β_0         −0.043             −0.040             −0.037             −0.076             −0.082
            (−0.130, 0.008)    (−0.121, 0.008)    (−0.118, 0.008)    (−0.163, 0.008)    (−0.166, 0.005)
            0.066              0.063              0.060              0.069              0.069
β_1         −0.046             −0.037             −0.034             −0.115             −0.118
            (−0.118, 0.005)    (−0.100, 0.005)    (−0.097, 0.005)    (−0.163, −0.067)   (−0.166, −0.067)
            0.060              0.048              0.051              0.042              0.039
β_2         0.023              0.023              0.023              0.035              0.035
            (−0.004, 0.065)    (−0.001, 0.065)    (−0.001, 0.062)    (−0.010, 0.080)    (−0.010, 0.083)
            0.030              0.030              0.027              0.036              0.039
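Each three-line entry in Table 2 (and in Tables 5–7 below) — posterior median, 90% posterior interval, interquartile range — can be read off the empirical quantiles of the MCMC draws. A minimal sketch, assuming `draws` is a 1-D array of posterior draws for one parameter (the helper name and toy input are ours):

```python
import numpy as np

def posterior_summary(draws):
    """Median, 90% posterior interval and interquartile range of a 1-D
    array of MCMC draws -- the three rows of each table entry."""
    q05, q25, med, q75, q95 = np.quantile(draws, [0.05, 0.25, 0.5, 0.75, 0.95])
    return med, (q05, q95), q75 - q25

# toy usage on artificial "draws" 1, 2, ..., 100
med, ci90, iqr = posterior_summary(np.arange(1.0, 101.0))
```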
Table 3. Basic characteristics (averages, standard deviations, correlation coefficients) of the posterior means of the latent processes, in models with ν ~ G(8; 0.8), c ~ INK(2; 0.1), and d ~ INK(2; 100).
Latent Process              Model Type   Average   Standard Deviation   Correlation Coefficient
h_t                         LLFT-SV      5.865     9.126                0.994
h_t                         t-SV         4.854     6.710
min{max{ω_t, c}, d}         LLFT-SV      1.071     0.234                0.759
ω_t                         t-SV         1.148     0.154
h_t·min{max{ω_t, c}, d}     LLFT-SV      2.119     1.051                0.964
h_t·ω_t                     t-SV         2.173     1.210
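The LLFT-SV rows of Table 3 involve the doubly censored mixing variable min{max{ω_t, c}, d}, which simply clamps ω_t to the interval [c, d]. A one-line sketch (the illustrative values of c and d are the posterior medians from the third LLFT-SV column of Table 2; the ω_t values are made up):

```python
import numpy as np

def censor(omega, c, d):
    """Doubly censored mixing variable min{max{omega_t, c}, d}:
    values below c are raised to c, values above d are lowered to d."""
    return np.minimum(np.maximum(omega, c), d)

c, d = 0.118, 7.553                       # posterior medians from Table 2
omega = np.array([0.01, 0.5, 2.0, 25.0])  # hypothetical draws of omega_t
clamped = censor(omega, c, d)             # -> [0.118, 0.5, 2.0, 7.553]
```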
Table 4. Sums of the log predictive likelihoods.
Forecast    LLFT-SV                   LLFT-SV                   t-SV
Horizon     (c = 0.118, d = 7.553)    (c = 0.014, d = 2.808)
h = 1       −261.66                   −239.82                   −263.84
h = 2       −264.43                   −241.64                   −266.59
h = 3       −264.48                   −241.30                   −266.15
h = 4       −266.56                   −245.16                   −267.87
h = 5       −267.93                   −246.56                   −269.02
h = 6       −264.99                   −242.72                   −266.61
h = 7       −265.45                   −243.37                   −267.13
h = 8       −265.57                   −242.87                   −267.14
h = 9       −265.01                   −241.98                   −266.73
h = 10      −267.47                   −244.50                   −269.49
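Each entry of Table 4 is a sum of h-step-ahead log predictive likelihoods over the forecast sample; the larger (less negative) the sum, the better the density forecasts, which is how the LLFT-SV specification with small c dominates the t-SV alternative here. A minimal sketch of the comparison with made-up per-period values (names and numbers are ours):

```python
import numpy as np

def log_predictive_sum(log_dens):
    """Sum of log predictive likelihoods over the forecast sample."""
    return float(np.sum(log_dens))

# hypothetical per-period log predictive densities of two competing models
m1 = np.array([-2.1, -1.8, -2.4])
m2 = np.array([-2.5, -2.2, -2.9])

# a positive difference favours model 1 (log-Bayes-factor-style comparison)
diff = log_predictive_sum(m1) - log_predictive_sum(m2)
```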
Table 5. Posterior characteristics of model parameters for the perturbed and original MSAG.DE data, under c ~ INK(2; 0.1). The first row of each entry: posterior median. The second row: 90% posterior confidence interval (in parentheses). The third row: interquartile range.
Parameter   Original Data        Perturbed Data       Original Data        Perturbed Data
            LLFT-SV              LLFT-SV              t-SV                 t-SV
                                 U(−5×10⁻⁴, 5×10⁻⁴)                        U(−5×10⁻⁴, 5×10⁻⁴)
            ν ~ G(8; 0.8)        ν ~ G(8; 0.8)        ν ~ G(8; 0.8)        ν ~ G(8; 0.8)
            c ~ INK(2; 0.1)      c ~ INK(2; 0.1)
            d ~ INK(2; 100)      d ~ INK(2; 100)
γ           1.206                1.227                1.045                1.045
            (0.758, 1.661)       (0.800, 1.654)       (0.779, 1.304)       (0.786, 1.304)
            0.392                0.371                0.210                0.210
ϕ           0.911                0.914                0.899                0.902
            (0.860, 0.947)       (0.863, 0.950)       (0.839, 0.941)       (0.842, 0.944)
            0.036                0.033                0.042                0.039
σ²          0.156                0.150                0.186                0.177
            (0.087, 0.270)       (0.084, 0.255)       (0.102, 0.324)       (0.096, 0.309)
            0.069                0.066                0.087                0.087
ν           5.60                 5.67                 6.44                 6.44
            (3.57, 8.96)         (3.71, 8.89)         (4.34, 10.85)        (4.34, 10.71)
            2.24                 2.17                 2.52                 2.45
ρ           0.66                 0.70
            (0.23, 0.96)         (0.35, 0.97)
            0.35                 0.26
p           0.214                0.226
            (0.107, 0.382)       (0.109, 0.381)
            0.131                0.126
c           0.118                0.127
            (0.057, 0.220)       (0.072, 0.229)
            0.067                0.057
d           7.553                7.462
            (1.105, 16.632)      (1.118, 16.376)
            4.524                4.472
β_0         −0.037               −0.049               −0.082               −0.079
            (−0.118, 0.008)      (−0.124, 0.005)      (−0.166, 0.005)      (−0.166, 0.005)
            0.060                0.054                0.069                0.069
β_1         −0.034               −0.043               −0.118               −0.118
            (−0.097, 0.005)      (−0.100, −0.001)     (−0.166, −0.067)     (−0.166, −0.067)
            0.051                0.042                0.039                0.039
β_2         0.023                0.029                0.035                0.038
            (−0.001, 0.062)      (−0.001, 0.065)      (−0.010, 0.083)      (−0.010, 0.086)
            0.027                0.027                0.039                0.036
Table 6. Posterior characteristics of the LLFT-SV model parameters for the perturbed and original MSAG.DE data, under c ~ INK(0.4; 0.009). The first row of each entry: posterior median. The second row: 90% posterior confidence interval (in parentheses). The third row: interquartile range.
Parameter   Perturbed Data         Perturbed Data         Original Data
            U(−5×10⁻⁴, 5×10⁻⁴)     U(−5×10⁻⁵, 5×10⁻⁵)
            c ~ INK(0.4; 0.009)    c ~ INK(0.4; 0.009)    c ~ INK(0.4; 0.009)
            d ~ INK(2; 100)        d ~ INK(2; 100)        d ~ INK(2; 100)
γ           0.800                  −0.005                 −0.047
            (0.408, 1.129)         (−0.278, 0.387)        (−0.306, 0.219)
            0.273                  0.231                  0.211
ϕ           0.929                  0.917                  0.911
            (0.881, 0.962)         (0.866, 0.950)         (0.860, 0.947)
            0.033                  0.033                  0.033
σ²          0.102                  0.135                  0.153
            (0.048, 0.183)         (0.081, 0.225)         (0.090, 0.255)
            0.054                  0.057                  0.066
ν           3.36                   1.040                  1.032
            (1.47, 4.83)           (1.003, 2.394)         (1.002, 1.133)
            0.98                   0.07                   0.05
ρ           0.11                   0.047                  0.046
            (0.06, 0.17)           (0.027, 0.079)         (0.029, 0.078)
            0.04                   0.02                   0.020
p           0.119                  0.117                  0.117
            (0.097, 0.144)         (0.100, 0.136)         (0.100, 0.136)
            0.019                  0.015                  0.015
c           0.021                  0.014                  0.014
            (0.018, 0.025)         (0.012, 0.015)         (0.012, 0.016)
            0.002                  0.001                  0.001
d           9.503                  2.795                  2.806
            (2.316, 18.985)        (2.496, 10.764)        (2.514, 3.496)
            4.706                  0.260                  0.230
β_0         −0.002                 −0.0001                −0.0001
            (−0.007, 0.003)        (−0.0017, 0.001)       (−0.001, 0.002)
            0.003                  0.001                  0.001
β_1         −0.0003                −0.0002                0.0000
            (−0.003, 0.003)        (−0.0013, 0.0008)      (−0.0004, 0.001)
            0.0025                 0.0005                 0.001
β_2         0.0015                 0.0000                 0.0000
            (−0.001, 0.0044)       (−0.0009, 0.0011)      (−0.0004, 0.001)
            0.002                  0.0008                 0.0007
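The sensitivity checks in Tables 5 and 6 rest on perturbing the MSAG.DE returns with small uniform noise on the scales U(−5×10⁻⁴, 5×10⁻⁴) and U(−5×10⁻⁵, 5×10⁻⁵). A sketch of the perturbation step (the function name and toy series are ours; whether the noise is applied to all observations or only to the exact-zero returns is an implementation detail this excerpt does not fix):

```python
import numpy as np

def perturb(returns, half_width, seed=0):
    """Add i.i.d. U(-half_width, half_width) noise to a return series,
    breaking the exact zeros in the data."""
    rng = np.random.default_rng(seed)
    return returns + rng.uniform(-half_width, half_width, size=len(returns))

# toy series with repeating zero returns, perturbed on the 5e-4 scale
r = np.array([0.0, 1.2, 0.0, -0.7])
r_pert = perturb(r, 5e-4)
```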
Table 7. Posterior characteristics of the LLFT-SV model parameters for the DAX and S&P 500, under d ~ INK(2; 100). The first row of each entry: posterior median. The second row: 90% posterior confidence interval (in parentheses). The third row: interquartile range.
Parameter   DAX                  DAX                    S&P 500              S&P 500
            ν ~ G(8; 0.8)        ν ~ G(8; 0.8)          ν ~ G(8; 0.8)        ν ~ G(8; 0.8)
            c ~ INK(2; 0.1)      c ~ INK(0.4; 0.009)    c ~ INK(2; 0.1)      c ~ INK(0.4; 0.009)
γ           0.191                0.296                  −0.558               −0.572
            (−0.649, 1.080)      (−0.565, 1.269)        (−1.377, 0.289)      (−1.419, 0.303)
            0.560                0.588                  0.581                0.588
ϕ           0.983                0.983                  0.980                0.980
            (0.962, 0.998)       (0.962, 0.998)         (0.962, 0.995)       (0.962, 0.995)
            0.012                0.015                  0.012                0.012
σ²          0.033                0.033                  0.081                0.078
            (0.018, 0.066)       (0.015, 0.063)         (0.054, 0.120)       (0.051, 0.123)
            0.021                0.018                  0.027                0.027
ν           6.72                 7.00                   8.33                 8.75
            (3.99, 10.15)        (4.13, 10.85)          (5.18, 11.48)        (5.53, 11.90)
            2.52                 2.66                   2.52                 2.59
ρ           0.83                 0.84                   0.86                 0.85
            (0.57, 1.02)         (0.59, 1.02)           (0.61, 1.05)         (0.60, 1.04)
            0.19                 0.18                   0.18                 0.17
p           0.506                0.573                  0.309                0.294
            (0.275, 0.718)       (0.332, 0.789)         (0.175, 0.476)       (0.147, 0.475)
            0.197                0.184                  0.114                0.121
c           0.408                0.421                  0.198                0.103
            (0.253, 0.556)       (0.294, 0.568)         (0.122, 0.586)       (0.051, 0.591)
            0.115                0.104                  0.134                0.101
d           1.079                1.066                  5.643                6.369
            (1.027, 6.851)       (1.027, 5.486)         (1.089, 14.52)       (1.089, 15.147)
            0.065                0.052                  7.326                7.821
β_0         0.071                0.071                  0.095                0.083
            (0.032, 0.110)       (0.032, 0.113)         (0.068, 0.122)       (0.062, 0.119)
            0.030                0.033                  0.021                0.024
β_1         −0.043               −0.040                 −0.097               −0.094
            (−0.091, 0.005)      (−0.088, 0.008)        (−0.142, −0.049)     (−0.139, −0.052)
            0.039                0.039                  0.036                0.033
β_2         0.008                0.005                  −0.004               −0.001
            (−0.040, 0.059)      (−0.043, 0.056)        (−0.052, 0.044)      (−0.046, 0.050)
            0.042                0.042                  0.036                0.042
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Lenart, Ł.; Pajor, A.; Kwiatkowski, Ł. A Locally Both Leptokurtic and Fat-Tailed Distribution with Application in a Bayesian Stochastic Volatility Model. Entropy 2021, 23, 689. https://doi.org/10.3390/e23060689
