1. Introduction
A well-known upward trend in stock prices is illustrated in
Figure 1 for S&P500. Here $S_t$
is the price on day
t. The best linear fit
of
corresponds to roughly
annual growth. De-trended plot of returns
is shown in
Figure 2 and the fluctuations of
are attributed to market volatility. The simplest of models attempting to describe these fluctuations implies that de-trended returns are governed by a stochastic differential equation (SDE):
where
is the stochastic volatility and
is the normally distributed Wiener process,
,
.
Stochastic volatility, in turn, is believed to be described by the mean-reverting SDE for stochastic variance
implying that stochastic variance—and hence volatility—tends to revert to its mean value,
. One of the important implications of the latter is that for returns accumulated over
days,
, average variance of returns grows linearly with
,
. Since we are not concerned here with quantities such as leverage (
Dashti Moghaddam et al., 2021;
Perello & Masoliver, 2003) and study the distribution of returns, in what follows we will neglect correlations between
and
(
Drăgulescu & Yakovenko, 2002;
Z. Liu et al., 2019) and largely concentrate on daily returns
.
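As a rough illustration of de-trending and of the linear growth of variance with the accumulation period, the sketch below builds a synthetic log-price series, removes its best linear fit, and compares variances of returns accumulated over 1, 5, and 20 days. It assumes log-returns and independent Student-t daily increments in place of the stochastic-volatility dynamics. The drift, scale, and degrees of freedom are hypothetical values chosen only for the example, not fitted to S&P500.

```python
import numpy as np

rng = np.random.default_rng(1)

n_days = 50_000
drift = 3e-4                      # hypothetical daily growth rate of log-price
daily = 0.01 * rng.standard_t(df=5, size=n_days)   # synthetic daily fluctuations

# Synthetic log-price: linear trend plus accumulated fluctuations.
t = np.arange(n_days + 1)
log_price = drift * t + np.concatenate(([0.0], np.cumsum(daily)))

# De-trend by subtracting the best linear fit of the log-price.
slope, intercept = np.polyfit(t, log_price, 1)
detrended = log_price - (slope * t + intercept)

def accumulated_returns(series, tau):
    """Returns accumulated over tau days (overlapping windows)."""
    return series[tau:] - series[:-tau]

# Variance of accumulated returns for several accumulation periods.
variances = {tau: accumulated_returns(detrended, tau).var() for tau in (1, 5, 20)}
for tau, var in variances.items():
    print(tau, var / variances[1])
```

For independent increments the printed ratios should be close to the accumulation period itself, which is the linear scaling referred to above.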
Numerous models exist for
, such as Cox–Ingersoll–Ross (
Cox et al., 1985;
Drăgulescu & Yakovenko, 2002;
Heston, 1993;
Z. Liu et al., 2019), multiplicative (
Fuentes et al., 2009;
Z. Liu et al., 2019;
Nelson, 1990;
Praetz, 1972), and the combination of the two (
Dashti Moghaddam & Serota, 2021). Here we will concentrate on the multiplicative model since it is the simplest model that predicts power-law tails of the distribution of returns and is the easiest to handle analytically. While power-law tails in returns are not universally agreed upon, there is a strong case for them at least for daily returns, while for accumulated returns a power law may persist for a large portion of the tail (see, e.g.,
Farahani & Serota, 2025 and below).
In the multiplicative model
, which yields the following probability density functions (PDFs) for the steady-state distributions of stochastic variance and volatility (
Z. Liu et al., 2019):
where IGa is the inverse gamma distribution (
Wolfram, n.d.-a). From Equations (
1) and (
3) the distribution of stock returns can be found as a product distribution (
Ma & Serota, 2014) of inverse Gamma and normal distribution and the result is a Student t-distribution (
Fuentes et al., 2009;
Z. Liu et al., 2019;
Praetz, 1972):
Clearly this distribution is even
and thus establishes symmetry between gains and losses. This, of course, also applies to the power-law tails, whose exponent is
. However, this symmetry is clearly broken for actual data. To wit, the distribution of S&P500 returns has (
Farahani & Serota, 2025):
While various models exist that aim to explain such features as skewness (
Gupta et al., 2024), the multiplicative model is the simplest one that yields power-law tails of returns—matching our empirical observation. Such scale-free tails can explain “Black Swans” of theoretically arbitrarily large drops associated with market crashes, which, in turn, may be caused by various systemic risks—see, for instance, (
Wasi et al., 2023). Another possibility is that market crashes are associated with the Dragon Kings phenomena, which are even more dramatic than Black Swans (
Johansen & Sornette, 2001;
Sornette & Ouillon, 2012).
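The product-distribution statement made earlier in this section (normal returns whose variance follows an inverse gamma distribution are Student-t distributed) can be checked numerically. The sketch below is only an illustration: the shape `a` and scale `b` of the inverse gamma distribution are hypothetical, and the mapping to the Student t-distribution (degrees of freedom `2a`, scale `sqrt(b/a)`) uses the standard textbook parametrization rather than the parametrization of the fitted model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

a, b = 3.0, 2.0          # hypothetical inverse-gamma shape and scale
n = 200_000

# If G ~ Gamma(a, 1), then b / G follows an inverse gamma law with shape a, scale b.
variance = b / rng.gamma(a, size=n)

# Returns are normal with the sampled (stochastic) variance.
x = np.sqrt(variance) * rng.standard_normal(n)

# Compare with a Student t-distribution: df = 2a, scale = sqrt(b / a).
student = stats.t(df=2.0 * a, scale=np.sqrt(b / a))
ks = stats.kstest(x, student.cdf)
print(ks.statistic, x.mean())
```

The resulting density is even in `x`, and its tails fall off as a power law with PDF exponent `2a + 1`, so gains and losses are treated identically in this construction.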
The motivation for this work was therefore to model this symmetry breaking while, ideally, still remaining within the SDE framework based on the multiplicative model of stochastic volatility. The first and fairly obvious idea would be that the stochastic variance Equation (
2) is governed by a different set of parameters for gains and losses, that is to say that their parameters
and
are different. In other words, gains and losses should be fitted separately by weighted Student t-distributions (
4) such that the weights add up to unity and their ratio equals the ratio of the numbers of data points under the respective distributions. For brevity, we call the resulting distribution “half Student-t”.
While it still has an SDE underpinning, “half Student-t” is obviously not an organic distribution. Additionally, it predicts a negative mean, contrary to the empirical evidence above. Consequently, we adopted yet another approach based on a skew extension of the Student t-distribution by Jones and Faddy (
Jones, 2001;
Jones & Faddy, 2003). Unfortunately, the modified Jones–Faddy (mJF) distributions that we use are not cleanly derived from the SDE formalism, but they are close in spirit and yield good fits to the full distribution of returns.
This paper is organized as follows. In
Section 2 we provide expressions for the PDF and cumulative distribution function (CDF) of mJF distributions as well as their statistical parameters, such as mean, mode, variance and skewness. In
Section 3 we present results of fitting the full distribution of returns. Finally, we summarize and discuss our results in
Section 4.
2. Analytical Framework
This section lists analytical expressions for the PDF, CDF, mean $\mu$, variance $\sigma^2$ ($\sigma$ being the standard deviation), and mode
for the four distributions described in
Section 2.1,
Section 2.2,
Section 2.3 and
Section 2.4 below. We use first and second Pearson coefficients of skewness
to characterize the skew of the distribution, where
is the median, which is evaluated numerically.
We will consider the CDFs appropriate for gains and losses separately; they are given, respectively, by
where
is the PDF of returns. Complementary CDF, CCDF, appropriate for gains and losses are, respectively,
and
.
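The Pearson coefficients used in this section follow the standard definitions: the first is (mean − mode)/standard deviation and the second is 3(mean − median)/standard deviation. A minimal sketch of their numerical evaluation from a tabulated PDF is given below. The helper name `pearson_skewness` is hypothetical, and a gamma distribution is used purely as a stand-in with known mean, mode, and variance; it is not one of the distributions considered in this paper.

```python
import numpy as np
from scipy import stats

def pearson_skewness(pdf, grid):
    """First (mode) and second (median) Pearson skewness coefficients
    of a PDF tabulated on a uniform grid."""
    dx = grid[1] - grid[0]
    p = pdf(grid)
    p = p / (p.sum() * dx)                     # renormalize on the truncated grid
    mean = (grid * p).sum() * dx
    std = np.sqrt(((grid - mean) ** 2 * p).sum() * dx)
    mode = grid[np.argmax(p)]                  # location of the PDF maximum
    cdf = np.cumsum(p) * dx
    median = grid[np.searchsorted(cdf, 0.5)]   # numerical inversion of the CDF
    return (mean - mode) / std, 3.0 * (mean - median) / std

# Stand-in example: gamma distribution with shape 4 (mean 4, mode 3, std 2).
grid = np.linspace(0.0, 60.0, 600_001)
first, second = pearson_skewness(stats.gamma(a=4.0).pdf, grid)
print(first, second)
```

As in the text, the median has no closed form here and is obtained numerically, while the mean, mode, and variance could equally be taken from analytical expressions.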
2.1. Student t-Distribution
The PDF of Student t-distribution is given by (
4) implying that the tails of the PDF scale as
and of the CDF as
. Due to symmetry, the CDF for both gains and losses is given by
where
is a regularized incomplete beta function (
NIST, n.d.) and, obviously,
and
2.2. Half Student t-Distribution
In effect, half Student-t is a mixture distribution whose PDFs for gains and losses are given, respectively, by
and
so that the full distribution is a mixture of two halves of Student t-distribution (hence the name “half Student-t”) and is given by
where
and
is the ratio of points under gains and losses (
Farahani & Serota, 2025). Generally speaking, a mixture distribution is not a preferable avenue from a physicist’s point of view since it does not follow from a first-principles model. However
and
marginally do.
The CDF of gains and losses are given, respectively, by
and
Using (
9)–(
11), we find the following expressions for the mean and variance respectively
and
where
is a Gamma function (
NIST, n.d.). Clearly,
in this model so the sign of the skew
will be that of
. Another observation is that the number of parameters in half Student-t, aside from
, is double that of Student t-distribution. One possible simplification of this model is to assume that the mean stochastic volatility governing gains and losses is the same,
so that the difference between gains and losses, including power-law exponents, reduces solely to difference between
and
.
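The mixture construction of this subsection can be sketched as follows. This is an illustration under stated assumptions, not the fitted model: each half is written with the standard Student-t parametrization of `scipy.stats.t` (degrees of freedom and scale) instead of the stochastic-volatility parameters, and all numerical values, as well as the function names, are hypothetical. The weights add up to unity and their ratio equals the ratio of the numbers of gains and losses, as described above.

```python
import numpy as np
from scipy import integrate, special, stats

def half_student_pdf(x, n_ratio, df_g, scale_g, df_l, scale_l):
    """Mixture of two halves of Student t; n_ratio = (# gains) / (# losses)."""
    x = np.asarray(x, dtype=float)
    w_g = n_ratio / (1.0 + n_ratio)            # weight of gains
    w_l = 1.0 - w_g                            # weight of losses
    gains = 2.0 * w_g * stats.t.pdf(x, df_g, scale=scale_g)
    losses = 2.0 * w_l * stats.t.pdf(x, df_l, scale=scale_l)
    return np.where(x >= 0, gains, losses)

def mean_abs_student(df, scale):
    """E|X| for a Student t-distribution with df > 1."""
    return (2.0 * scale * np.sqrt(df) * special.gamma((df + 1.0) / 2.0)
            / (np.sqrt(np.pi) * special.gamma(df / 2.0) * (df - 1.0)))

def half_student_mean(n_ratio, df_g, scale_g, df_l, scale_l):
    """Closed-form mean of the mixture: weighted gains minus weighted losses."""
    w_g = n_ratio / (1.0 + n_ratio)
    return w_g * mean_abs_student(df_g, scale_g) - (1.0 - w_g) * mean_abs_student(df_l, scale_l)

params = (1.15, 3.5, 1.0, 3.0, 1.1)            # hypothetical values, arbitrary units

def f(x):
    return float(half_student_pdf(x, *params))

# Check normalization and the closed-form mean by numerical integration.
norm = integrate.quad(f, -np.inf, 0.0)[0] + integrate.quad(f, 0.0, np.inf)[0]
mean_numeric = (integrate.quad(lambda x: x * f(x), -np.inf, 0.0)[0]
                + integrate.quad(lambda x: x * f(x), 0.0, np.inf)[0])
print(norm, mean_numeric, half_student_mean(*params))
```

Note that the two halves generally do not match at the origin, which is one way to see why the mixture is not an organic distribution.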
2.3. Modified Jones–Faddy Distribution mJF1
The PDF of the first of the modified Jones–Faddy distributions (mJF1), introduced here to characterize the distribution of stock returns, is given by
where the normalization factor
C is given by
The CDF for gains and losses are given, respectively, by
and
mJF1 is a direct descendant of the distribution (
4) with one minor and one significant variation. First, just as in the case of standard Student distribution (
Wolfram, n.d.-b), a location parameter
can be introduced here as well. Obviously it does not affect (
1) since the variable can always be shifted by a constant. The second variation introduces a skew (skew t distribution (
Jones, 2001;
Jones & Faddy, 2003)), via
and
here. In particular, power-law tails scale as
at
and
at
. This breaks a construct based on (
1) and (
2) which treats volatility of gains and losses uniformly: substitution
in (
16) leads back to (
4) (with non-zero location parameter
). At this point we are unaware of an SDE-based formulation that would result in a distribution (
16).
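The skew mechanism can be illustrated with the original Jones–Faddy skew t density (Jones & Faddy, 2003), used here as a stand-in for the modified form. In the sketch below the shape parameters `a` and `b` follow the original Jones–Faddy convention, the `loc` and `scale` arguments are an assumed location-scale extension, and the numerical values are hypothetical. The code checks normalization, the reduction to the Student t-distribution for `a = b`, and the different power-law exponents of the two tails.

```python
import numpy as np
from scipy import integrate, special, stats

def jones_faddy_pdf(x, a, b, loc=0.0, scale=1.0):
    """Original Jones-Faddy skew t density with optional location and scale."""
    t = (np.asarray(x, dtype=float) - loc) / scale
    r = t / np.sqrt(a + b + t ** 2)
    # log of the normalization 2^(a+b-1) * B(a, b) * sqrt(a+b)
    log_c = (a + b - 1.0) * np.log(2.0) + special.betaln(a, b) + 0.5 * np.log(a + b)
    log_pdf = (a + 0.5) * np.log1p(r) + (b + 0.5) * np.log1p(-r) - log_c
    return np.exp(log_pdf) / scale

a, b = 1.5, 2.0                                 # hypothetical: a < b gives a heavier left tail

# Normalization check.
total = integrate.quad(jones_faddy_pdf, -np.inf, np.inf, args=(a, b))[0]

# For a = b the density reduces to Student t with 2a degrees of freedom.
xs = np.linspace(-5.0, 5.0, 11)
symmetric_ok = np.allclose(jones_faddy_pdf(xs, 2.0, 2.0), stats.t.pdf(xs, 4.0))

def log_slope(x1, x2):
    """Local log-log slope of the PDF between two points on the same side."""
    f1, f2 = jones_faddy_pdf(x1, a, b), jones_faddy_pdf(x2, a, b)
    return (np.log(f2) - np.log(f1)) / (np.log(abs(x2)) - np.log(abs(x1)))

right_slope = log_slope(1e3, 1e4)               # approaches -(2b + 1)
left_slope = log_slope(-1e3, -1e4)              # approaches -(2a + 1)
print(total, symmetric_ok, right_slope, left_slope)
```

In this parametrization the right tail is controlled by `b` and the left tail by `a`, so setting the two equal restores the symmetric Student-t case.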
Turning now to mean, variance and mode of mJF1 we find, respectively,
2.4. Modified Jones–Faddy Distribution mJF2
The second modified Jones–Faddy distribution mJF2 is a simple generalization of mJF1 in that instead of a single
we now have
and
, just as for half Student-t in
Section 2.2. At this point we believe that the assumption of the same mean stochastic volatility for gains and losses, as is the case for mJF1, makes more sense. Additionally, the extra fitting parameter in mJF2 improves the fit only minimally. Therefore, we present mJF2 results largely for completeness.
The PDF of mJF2 is given by
where normalization factor is
The CDF for gains and losses are given, respectively, by
and
Mean, variance and mode of mJF2 are given, respectively, by
3. Numerical Results
Table 1 shows parameters of distributions in
Section 2 obtained by Bayesian fitting of 1980–2025 S&P500 returns.
Table 2 gives the values of the mean, variance, and mode from the equations obtained in that section for each of the distributions. The median is evaluated numerically. The first and second Pearson coefficients of skewness are then computed using (
5). Exponents of power-law tails of the distributions’ CCDF are computed as
, where
and
. The tail exponents of S&P500 returns are obtained by linear fitting of the tails.
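A minimal sketch of estimating tail exponents by linear fitting of the empirical CCDF on a log-log scale, separately for gains and losses, is shown below. The sample is synthetic (Student t with 3 degrees of freedom, whose asymptotic CCDF exponent is 3), and the tail fraction of 1% and the helper name `ccdf_tail_slope` are assumptions made for the example rather than the procedure used for the fits reported here.

```python
import numpy as np

def ccdf_tail_slope(values, tail_fraction=0.01):
    """Slope of log(CCDF) versus log(x) over the largest tail_fraction of points."""
    s = np.sort(np.asarray(values, dtype=float))
    n = s.size
    # Empirical CCDF at the sorted points; the (n + 1) avoids log(0) at the maximum.
    ccdf = 1.0 - np.arange(1, n + 1) / (n + 1.0)
    k = max(int(tail_fraction * n), 10)
    slope, _ = np.polyfit(np.log(s[-k:]), np.log(ccdf[-k:]), 1)
    return slope

rng = np.random.default_rng(2)
x = rng.standard_t(df=3, size=400_000)          # synthetic returns

slope_gains = ccdf_tail_slope(x[x > 0])         # gains
slope_losses = ccdf_tail_slope(-x[x < 0])       # losses, as positive magnitudes
print(slope_gains, slope_losses)
```

Because the Student t-distribution reaches its power law only asymptotically, the fitted slopes are typically somewhat shallower in magnitude than the asymptotic value of 3 at a finite tail fraction.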
Clearly, half Student-t, which does not allow for a location parameter, fails to capture the positive sign of the mean
. We point out that positive values of
in
Table 2 are roughly an order of magnitude smaller than
of the linear fit in
Figure 1 but are still non-zero, as illustrated in
Figure 3. Also, the second Pearson coefficients of skewness
of the fitted distributions are much closer to that of S&P500 returns than the first Pearson coefficients
. For mJF1 and mJF2 specifically, this is due to the large discrepancy in the value of the mode
. The reason is the difficulty of identifying the mode in empirical data, as is evident from
Figure 4 below (see next paragraph for explanation). In this particular case we used a smoothing procedure to obtain the value of
for S&P500.
Figure 4 shows fits of the PDF of S&P500 returns using the PDFs of the four distributions described in
Section 2. These fits yield the parameters listed in
Table 1.
Figure 5,
Figure 6,
Figure 7 and
Figure 8 show CCDF of S&P500 returns and CCDF of the four fitting distributions derived in
Section 2 with parameters from
Table 1. Linear fits of tail areas are also shown in
Figure 6 and
Figure 8. Visually, all four distributions exhibit very similar tail behavior, which also differs only slightly from the S&P500 tail and its linear fit. This is confirmed by the closeness of the power-law exponents in
Table 2. We point out, however, that the distributions of
Section 2 approach power-law behavior only asymptotically and their own linear fits in
Figure 6 and
Figure 8 would not, generally speaking, produce the exponents listed in
Table 2.
Figure 9,
Figure 10,
Figure 11,
Figure 12,
Figure 13,
Figure 14,
Figure 15 and
Figure 16 demonstrate the results of statistical tests meant to probe goodness of fit.
Figure 9 and
Figure 11 show confidence intervals of linear fits and
Figure 13 and
Figure 15 show confidence intervals of mJF1 fits. Confidence intervals are obtained using inversion of the binomial distribution (
Janczura & Weron, 2012) and we specifically focused on mJF1 as the most transparent and minimal generalization of Student t-distribution.
Figure 10 and
Figure 12 show
p-values of order-statistics-based U-test (
Pisarenko & Sornette, 2012)
for linear fits and
Figure 14 and
Figure 16 show
p-values for mJF1 fits. It should be noted that both approaches were developed for the detection of outliers, such as Dragon Kings and negative Dragon Kings, in the tails of the distributions. For instance, values
and
would signal a 95% probability of having, respectively, a Dragon King and a negative Dragon King. In simpler terms, if a data point falls outside the confidence interval and/or if its
p-value or (1 −
p)-value is very small, then it most likely does not belong to the fitting distribution (linear fit being the tail of Pareto distribution).
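The confidence intervals obtained by inversion of the binomial distribution can be sketched as follows. The sketch assumes that, for n observations and a fitted CCDF value p at a given point, the number of observations exceeding that point is binomially distributed with parameters n and p. The function name, the Pareto tail used as the fitting distribution, and all numerical values are hypothetical.

```python
import numpy as np
from scipy import stats

def ccdf_confidence_band(ccdf_fit, n, level=0.95):
    """Pointwise band for the empirical CCDF given fitted CCDF values and sample size n."""
    p = np.asarray(ccdf_fit, dtype=float)
    tail = (1.0 - level) / 2.0
    lower = stats.binom.ppf(tail, n, p) / n          # lower quantile of the exceedance count
    upper = stats.binom.ppf(1.0 - tail, n, p) / n    # upper quantile of the exceedance count
    return lower, upper

n = 1000                                  # hypothetical number of tail observations
x = np.array([2.0, 5.0, 50.0])
ccdf_fit = x ** -3.0                      # hypothetical Pareto tail, CCDF = x^-3 for x >= 1

lower, upper = ccdf_confidence_band(ccdf_fit, n)

# The largest of n observations has an empirical CCDF of about 1/n.
# If it is observed at x = 50, it lies above the band predicted by the fit.
empirical_at_max = 1.0 / n
is_outlier = bool(empirical_at_max > upper[-1])
print(lower, upper, is_outlier)
```

A point flagged in this way is a candidate outlier relative to the fitting distribution, in the sense described in the preceding paragraph.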
4. Conclusions and Discussion
The purpose of this work was to glean insight into and to try to analytically describe key empirical findings about S&P500 1980–2025 returns: heavier tails of losses, leading to the negative skew of the distribution, and positive mean of the distribution, which cannot be entirely attributed to the larger numbers of gains than losses. Our main conclusion is that a modified Jones–Faddy skew t-distribution, (
16)–(
19) is most likely the best candidate for the stated purpose, even though it is currently unknown how to derive it from first-principles stochastic differential equations.
The main idea behind symmetry breaking of Student t-distribution (
4), which is based on (
1)–(
3), is that the latter equation for stochastic volatility is governed by a different set of parameters for gains and losses. In this particular case, we operated on the basis of multiplicative model of stochastic volatility (
3); thus, the heavier power-law dependence of losses is explained by assuming that the resulting parameter for losses,
, is smaller than that for gains,
, hence the modified Jones–Faddy skew t-distribution mJF1, (
16). An additional innocuous introduction of the location parameter helps to explain the positive mean of the distribution. The location parameter seems to account not only for larger number of gains than losses but also for larger values of gains in the bulk of the distribution, which offsets the heavier negative tails. We hope to address the latter point quantitatively in future work.
mJF1 still implies that the mean stochastic volatility
is the same for gains and losses. To allow for the opposite possibility, we introduced two other distributions with different mean volatilities for gains and losses,
and
: a mixture half Student t-distribution, (
11)–(
13) and its simplified form with
, and the second modified Jones–Faddy distribution mJF2, (
23)–(
26). The advantage of the former is that it is still rooted in the stochastic differential equation framework. However, due to its structure, it fails to account for the positive mean of actual returns—daily S&P500 returns in this case. mJF2, despite an extra parameter, showed virtually no difference relative to mJF1 in fitting the empirical data, both visually and based on statistical tests described in
Section 3. Consequently we believe that mJF1 is the cleanest and most transparent generalization of Student-t for describing daily S&P500 returns. In particular, as is seen from
Table 2, power-law exponents defined by
ratios are very close for mJF1 and mJF2.
There are a number of possible future directions of this work. The most obvious is to consider other market indices. From our previous experience, we expect that the DOW will not differ significantly from S&P500, both in overall behavior and in the values of parameters. However, other long-running US indices, such as Russell and NASDAQ, are worth looking into, as well as European and Asian ones. Also, how overall market returns, reflected by key indices, compare with those of individual companies is a rather challenging question (
Albuquerque, 2012). Yet another important avenue is the study of accumulated returns versus dailies. Most important here are realized volatility, which shows linear behavior as a function of the number of days of accumulation, and the rather abrupt drop-off in the tails for longer accumulations. Of course, the most challenging task is finding a first-principles explanation of the symmetry breaking described in this work.