An Empirical Investigation to the “Skew” Phenomenon in Stock Index Markets: Evidence from the Nikkei 225 and Others

Abstract: Skew processes have recently received much attention, owing to their capacity to describe controlled dynamics. In this paper, we employ the skew geometric Brownian motion (SGBM) to model nine major stock index markets. The skew process shows not only where the “support” and “resistance” levels are, but also how strong the force at those levels is. However, the densities of skew processes make it difficult to estimate the parameters conveniently. To overcome this challenge, we adopt a Bayesian approach, which allows us to estimate the parameters from conditional probability densities without having to evaluate complex integrals. Furthermore, we propose likelihood ratio tests and significance tests for the skew probability. In the empirical study, our findings reveal that the skew phenomenon exists in the global stock markets and that the SGBM model works better than the traditional GBM model, while performing competitively with the GBM-jump model (GBM-J) and the Markov regime switching GBM model (GBM-MRS). In addition, we explore the possible reasons behind the skew phenomenon in stock markets: the price clustering phenomenon and herd behavior can help to explain it.


Introduction
For decades, a classical problem has been how to describe the dynamics of stock prices. The most basic model used to describe stock prices is geometric Brownian motion (GBM), which was first used by Black and Scholes [1]. However, many empirical features of market data cannot be captured by GBM, and researchers have proposed different approaches to modeling market phenomena. For example, Merton [2] proposed a jump-diffusion process to capture changes of extraordinary magnitude in the asset price, and Hamilton [3] introduced regime switching to capture structural changes in time series data. In this paper, we focus on another market phenomenon, known as long-duration asset prices or price clustering, which has been observed by several researchers. Sonnemans [4] found that stock prices on the Dutch stock market tended to cluster at round numbers (e.g., 10, 20, 30, etc. or 5, 15, 25, etc.) and were affected by round number price barriers (prices passed round numbers less frequently than other numbers). A similar phenomenon was also observed in China (Brown and Mitchell [5]; Hu et al. [6]) and Japan (Aşçioǧlu et al. [7]). The phenomenon can be described by the skew diffusion process. Itô and McKean [8] first introduced the skew Brownian motion (SBM), a process that behaves like the conventional Brownian motion until it arrives at the skew level. Once it hits a skew level a, its excursion from a goes upward with probability p and downward with probability 1 − p. This property makes the skew process very suitable for describing the price clustering phenomenon. Moreover, the skew process also measures the strength of the phenomenon through the skew probability p, which gives a more precise characterization than price clustering alone. We therefore refer to the phenomenon described by the skew process as the skew phenomenon.
In the existing literature on SBM, much effort has gone into exploring the mathematical properties of SBM or applying it to model underlying assets and price derivatives. However, empirical testing of SBM has received much less attention. Is the skew phenomenon indeed present in the market? Can skew geometric Brownian motion (SGBM) outperform other models of asset price dynamics? We attempt to answer these questions in this paper. The contributions of this paper are four-fold. First, we apply a Bayesian approach to estimate the parameters; to do this, the local time in the model is removed by a piecewise transform. Second, we test whether the skew phenomenon is present in the global stock markets by checking whether the skew probability p is significantly different from 0.5. Third, we compare the performance of SGBM with three other models: the basic GBM model, the GBM-jump model (GBM-J), and the Markov regime switching GBM model (GBM-MRS). Fourth, we show that the skew phenomenon and the price clustering phenomenon corroborate each other and that the skew phenomenon is also related to herd behavior; in this way, we try to explain why the skew phenomenon exists.
The remaining part of this paper is organized as follows. Section 2 presents the SGBM model and its piecewise transform. Section 3 specifies the methods for evaluating the model's performance. Section 4 describes the data and empirical results. In Section 5, we try to find some theories to explain the existence of skew phenomenon. Finally, the conclusion is presented in Section 6.

The Model
Let $S_t$ be the stock price at time $t$. Assume that $S_t$ follows the SGBM in Equation (1), where $\hat{L}^S_t(a)$ is the symmetric local time of the process $S_t$ at the "skew level" $a$, and the skew probability $p$ is the probability of moving upward when $S_t$ hits the level $a$. If we set $\tilde{\mu} = \mu - \frac{1}{2}\sigma^2$ and $X_t = \ln S_t$, then $X_t$ satisfies Equation (2). It is not difficult to see that $\hat{L}^X_t(\ln a)$ is the symmetric local time of the process $X_t$ and that the "skew level" of $X_t$ is $\ln a$.
To remove the local time, similar to Harrison and Shepp [28], we define a function $G(x)$ (Equation (3)) and obtain its inverse function $H(x)$ (Equation (4)). Consider $Y_t = G(X_t)$ (Equation (5)); the inverse transform $X_t = H(Y_t)$ is given by Equation (6). As the function $G(x)$ is the difference of two convex functions, we apply the generalized Itô formula to $Y_t = G(X_t)$ to obtain Equation (7). The discretized version of Equation (7) is Equation (8), in which the innovations are independent and standard normally distributed.
Denote by $N_1$ (respectively, $N_2$) the set of indices at which the sample value is at or above (respectively, below) the skew level $\ln a$, i.e., $N_1 := \{i : i = 1, \cdots, N,\ x_{t_i} \ge \ln a\}$ and $N_2 := \{i : i = 1, \cdots, N,\ x_{t_i} < \ln a\}$. Let $n_1$ and $n_2$ be the numbers of indices in $N_1$ and $N_2$, respectively, so that $n_1 + n_2 = N$. Define $y_{t_i} := \Delta Y_{t_i}$. Then, the likelihood function of Equation (8) can be written as Equation (9), where $\Theta$ is the set of all parameters $\mu, \sigma^2, a, p$ in Equation (8) and $X$ represents the sample data.
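The transform and the piecewise Gaussian structure of the likelihood can be illustrated with a minimal sketch. It assumes one common Harrison-Shepp-type scaling, $G(x) = (1-p)(x - \ln a)$ for $x \ge \ln a$ and $G(x) = p(x - \ln a)$ otherwise, under which an Euler step of $Y_t$ has mean $c\,\tilde{\mu}\,\Delta t$ and variance $c^2\sigma^2\Delta t$ with $c = 1-p$ at or above the skew level and $c = p$ below it. The scaling used in the paper's equations may differ by a constant factor, and the function names below are hypothetical.

```python
import math

def G(x, log_a, p):
    """Assumed piecewise-linear transform: slope (1 - p) at or above ln(a), slope p below."""
    d = x - log_a
    return (1.0 - p) * d if d >= 0 else p * d

def H(y, log_a, p):
    """Inverse of G under the same assumed scaling."""
    return log_a + (y / (1.0 - p) if y >= 0 else y / p)

def transformed_loglik(prices, dt, mu, sigma, a, p):
    """Gaussian log-likelihood of the increments of Y = G(ln S) under an Euler step.

    Requires 0 < p < 1, sigma > 0, a > 0 and strictly positive prices.
    """
    log_a = math.log(a)
    x = [math.log(s) for s in prices]
    y = [G(xi, log_a, p) for xi in x]
    mu_tilde = mu - 0.5 * sigma ** 2
    total = 0.0
    for i in range(len(x) - 1):
        # The scale factor depends on which side of ln(a) the current observation lies.
        c = (1.0 - p) if x[i] >= log_a else p
        mean = c * mu_tilde * dt
        var = (c * sigma) ** 2 * dt
        dy = y[i + 1] - y[i]
        total += -0.5 * math.log(2.0 * math.pi * var) - (dy - mean) ** 2 / (2.0 * var)
    return total
```

The observations with $x_{t_i} \ge \ln a$ and those with $x_{t_i} < \ln a$ contribute Gaussian terms with different scales, which is the role played by the index sets $N_1$ and $N_2$. The sketch evaluates the density of the transformed increments only; because $G$ depends on $(a, p)$, a comparison on the scale of the observed prices would also need the change-of-variables factor $G'(x)$, which is omitted here.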

Significance Test of Skew Probability p
As SBM reduces to standard Brownian motion when $p = 0.5$, the skew model is not meaningful if the estimate $\hat{p}$ is not significantly different from 0.5. Thus, we need to test the significance of the estimated skew probability $\hat{p}$. The null hypothesis $H_0$ is $p = 0.5$, and the alternative hypothesis $H_1$ is $p \neq 0.5$. The other parameters are assumed to be the same under the two hypotheses. Under the null hypothesis, the likelihood function is given by Equation (10). The likelihood ratio $\lambda(X)$ is defined in Equation (11), where $X$ is the sample data, $l(X|\Theta_1)$ is the likelihood function under $H_1$, and $l(X|\Theta_0)$ is the likelihood function under $H_0$. When $\lambda$ is large enough, we reject $H_0$ in favor of $H_1$ and consider the alternative model to be better than the null model. To decide whether to reject the null hypothesis at a specified significance level, we need the distribution of $\lambda$ under $H_0$; however, an analytical expression for this distribution is difficult to obtain. Thus, we apply the bootstrap method to approximate it. First, we estimate the parameters of the standard BM model by maximum likelihood. Then, we simulate a sample $X^*$ from the estimated model and calculate the value of $\lambda$ based on $X^*$. Repeating this process 10,000 times yields 10,000 values of $\lambda(X^*)$. Finally, the p-value of $\lambda(X)$ is the proportion of the simulated $\lambda(X^*)$ that are bigger than $\lambda(X)$.
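The bootstrap steps above (fit the null model, simulate from it, recompute the statistic, count exceedances) can be sketched as follows. To keep the example self-contained, a toy pair of models stands in for BM versus SBM: the null is a Gaussian with mean zero and the alternative a Gaussian with free mean. The statistic is the log of the likelihood ratio, a monotone transformation that leaves the p-value unchanged. All function names are hypothetical, and fitting the skew model itself would replace `log_lr`.

```python
import math
import random

def gauss_loglik(xs, mean, var):
    """Log-likelihood of i.i.d. N(mean, var) observations."""
    n = len(xs)
    return -0.5 * n * math.log(2.0 * math.pi * var) - sum((x - mean) ** 2 for x in xs) / (2.0 * var)

def log_lr(xs):
    """Toy statistic: maximized log-likelihood under H1 (free mean) minus that under H0 (mean 0)."""
    n = len(xs)
    m = sum(xs) / n
    v0 = sum(x * x for x in xs) / n          # variance MLE under H0
    v1 = sum((x - m) ** 2 for x in xs) / n   # variance MLE under H1
    return gauss_loglik(xs, m, v1) - gauss_loglik(xs, 0.0, v0)

def bootstrap_p_value(xs, n_boot=10_000, seed=0):
    """Share of statistics simulated under the fitted null that exceed the observed one."""
    rng = random.Random(seed)
    n = len(xs)
    sd0 = math.sqrt(sum(x * x for x in xs) / n)  # null model fitted by maximum likelihood
    lam_obs = log_lr(xs)
    exceed = 0
    for _ in range(n_boot):
        x_star = [rng.gauss(0.0, sd0) for _ in range(n)]
        if log_lr(x_star) > lam_obs:
            exceed += 1
    return exceed / n_boot
```

A small p-value means that few samples simulated under the null produce a ratio as large as the observed one, which is the rejection rule described above.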

Comparison of Model Performance
After testing the significance of the skew probability, we compare the SGBM defined by Equation (1) with commonly used models to see which fits the data best. We focus on the following three models.
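• GBM: the basic geometric Brownian motion model of Black and Scholes [1].
• GBM-J: the jump model, in which $Z_v \sim N(\mu_J, \sigma_J^2)$, $N(t)$ is a Poisson process with intensity $\lambda$, and $W_t$, $Z_v$ and $N(t)$ are independent of each other.
• GBM-MRS: the regime switching model, in which $s_t$ is a two-state continuous-time Markov process that is independent of $W_t$.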
The null hypothesis is that the GBM (respectively, GBM-J or GBM-MRS) is suitable for fitting the data, and the alternative hypothesis is that the SGBM fits the data better. The test statistic $\lambda(X)$ defined by Equation (11) is again adopted, and we obtain the p-value from 10,000 bootstrap samples. If the p-value is less than 10%, we have evidence that the SGBM fits the data better. If the p-value is less than 5%, we have very strong evidence against the commonly used model.

Data
The closing prices of the AEX (Netherlands, AEX), BEL 20 (Belgium, BFX), DAX (Germany, GDAXI), CAC 40 (France, FCHI), FTSE 100 (UK, FTSE), Shanghai Shenzhen CSI 300 (China, CSI 300), Nikkei 225 (Japan, N225), SMI (Switzerland, SSMI), and S&P 500 (U.S., SPX) indices are used in the empirical study. These nine stock markets are among the most important internationally, and all of them except the Chinese market are mature capital markets. As a representative emerging market, the Chinese market has been developing rapidly and has an important influence on the world economy; for these reasons, we also include it. The data are taken from the Wind Database. The sample period for the eight mature markets is from 1 January 2000 to 31 December 2017, while the sample period for the CSI 300 starts on 1 January 2002, the first day on which data are available. First, we use weekly data over the whole sample period to check for the skew phenomenon in the long run. Then, we use daily data in one-year windows to check for the skew phenomenon over a year. By rolling the window forward quarterly, we obtain 61 groups of data for the Chinese market and 69 groups for the other markets. Figure 1 and Table 1 illustrate the price series under consideration and show their descriptive statistics, respectively. In Figure 1, we can observe that all stock indices display very similar patterns. For example, during 2004, 2007, and 2016, the indices all clustered at a certain level. It is likely that the skew phenomenon exists over these periods.
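The group counts follow from the rolling scheme: 18 years contain 72 quarters, and a four-quarter window started at each admissible quarter gives 72 − 3 = 69 windows, while 16 years give 64 − 3 = 61. A minimal sketch, assuming that the windows are aligned to calendar quarters (the function name is hypothetical):

```python
def rolling_year_windows(first_year, last_year):
    """One-year (four-quarter) windows rolled forward one quarter at a time.

    Each window is returned as ((start_year, start_quarter), (end_year, end_quarter)).
    """
    quarters = [(y, q) for y in range(first_year, last_year + 1) for q in (1, 2, 3, 4)]
    return [(quarters[i], quarters[i + 3]) for i in range(len(quarters) - 3)]

mature = rolling_year_windows(2000, 2017)   # eight mature markets
csi300 = rolling_year_windows(2002, 2017)   # CSI 300
print(len(mature), len(csi300))
```

Under this alignment, a window such as the fourth quarter of 2016 to the third quarter of 2017 appears as `((2016, 4), (2017, 3))`.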
In Table 1, according to the coefficient of variation (C.V.), the Shanghai Shenzhen CSI 300 index is the most volatile of the indices, while the FTSE 100 is the most stable. Table 2 shows that the nine stock markets are highly correlated, which agrees with the patterns in Figure 1. The correlation coefficient between the S&P 500 and DAX is the highest, while that between the AEX and CSI 300 is the lowest.
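The C.V. is the standard deviation divided by the mean, so it compares dispersion across indices quoted at very different levels. A minimal sketch, assuming the population standard deviation (the source does not state which variant is used) and using made-up numbers rather than the index data:

```python
import statistics

def coefficient_of_variation(values):
    """C.V. = standard deviation / mean (population standard deviation assumed)."""
    return statistics.pstdev(values) / statistics.mean(values)

# Illustrative values only: the same relative spread at two different price levels.
low_level = [2, 4, 4, 4, 5, 5, 7, 9]
high_level = [1000 * v for v in low_level]
print(coefficient_of_variation(low_level), coefficient_of_variation(high_level))
```

Both series give the same C.V., since rescaling the level multiplies the standard deviation and the mean by the same factor.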

Weekly Data of the Whole Sample Period
We estimate the parameters of the SGBM model and test for the skew phenomenon using the weekly closing prices of the nine stock indices over the whole sample period. For details of the parameter estimation approach, see Appendix A. Table 3 gives the parameter estimates for the nine indices. We focus on the estimated skew probability $\hat{p}$. For the nine stock indices, the estimated skew probabilities $\hat{p}$ are 0.3343, … According to the significance test of the skew probability p, all these estimates are significantly different from 0.5 at the 5% level, except the $\hat{p}$ of the CSI 300, indicating that there are significant skew phenomena in the other eight stock markets. According to Table 1, the CSI 300 has the largest C.V., which measures the dispersion of the data, so it is not surprising that the skew phenomenon is not found in the Chinese market over the whole sample period. Consistent with the correlations shown in Table 2, all the $\hat{p}$ values are smaller than 0.5, which means that the indices are more likely to move downward than upward after hitting the skew level a. However, the likelihood ratio test gives a more mixed picture. There are only four markets in which the SGBM model significantly outperforms the other three models at the 10% level, while there are four markets in which SGBM does not outperform them. In the Japanese market, the p-values of the likelihood ratio tests are 0.100 against GBM and 0.074 against GBM-J, which is on the margin of significance. It is hard to say whether SGBM outperforms these two models with p-values at such a marginal level. Therefore, we look for further clues in the graph of the index. Figure 2 shows the historical prices and the skew level of the Nikkei 225 index, which is 10,543.1285, as shown in Table 3. In Figure 2, we can see two major time intervals during which the skew phenomenon is very pronounced.
The first interval is from 2001 to 2003, during which the process passes through the skew level several times but stays mostly below it. The second interval is from 2009 to 2013. The vertical lines indicate the dates on which extreme price movements occur. Extreme price movements clearly occur frequently when the index is near the skew level. As many researchers have devoted considerable effort to the correlation between extreme price movements and herding, herding may help to explain the skew phenomenon.
As the observations of the index are discrete, the series rarely hits a specific level exactly. Thus, we use a small interval around the level to stand for it. Sonnemans [4] was interested in the number of times a level is crossed. In addition, we consider the number of times the series is reflected from a level. The difference between "crossing" and "reflecting" is whether the previous price and the following price are on the same side of the level. In Figure 3, we notice that the data exhibit price clustering on some scales. After marking the skew level in Figure 3 with a straight line, it is clear that the skew level is one of the points at which the index clusters. The total number of times the index hits the skew level is 53. Besides three crossings of the skew level from above and three from below, the index is reflected upward by the skew level 17 times and downward 30 times. For convenience, we assume that whether the index goes up or down after hitting the skew level can be described by a Bernoulli random variable with probability p. The hypothesis test rejects the null hypothesis that p is greater than or equal to 0.5. We can conclude that the Nikkei 225 index is likely to move down when the price series hits 10,543.1285; that is, the process goes down in most cases once it touches the skew level.
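The counting rule and the one-sided test can be sketched as follows. The sketch treats an observation within a small band of the level as a hit and classifies it by the sides of the previous and following prices; the band width, the handling of consecutive hits, and the function names are assumptions. For the test, it is assumed that only the 47 reflections (17 upward, 30 downward) are used, since the source does not state whether the six crossings are included.

```python
from math import comb

def classify_hits(prices, level, band):
    """Count crossings and reflections at `level`, treating |price - level| <= band as a hit."""
    counts = {"cross_from_below": 0, "cross_from_above": 0, "reflect_up": 0, "reflect_down": 0}
    for prev, cur, nxt in zip(prices, prices[1:], prices[2:]):
        if abs(cur - level) > band:
            continue  # the current observation is not a hit
        if abs(prev - level) <= band or abs(nxt - level) <= band:
            continue  # neighbours inside the band are ambiguous and skipped in this sketch
        from_below, goes_up = prev < level, nxt > level
        if from_below and goes_up:
            counts["cross_from_below"] += 1
        elif not from_below and not goes_up:
            counts["cross_from_above"] += 1
        elif goes_up:
            counts["reflect_up"] += 1    # arrives from above and returns upward
        else:
            counts["reflect_down"] += 1  # arrives from below and returns downward
    return counts

def binom_lower_tail(k, n, p0=0.5):
    """P(X <= k) for X ~ Binomial(n, p0): the p-value for H0: p >= p0 against H1: p < p0."""
    return sum(comb(n, j) * p0 ** j * (1.0 - p0) ** (n - j) for j in range(k + 1))

# Reported reflection counts for the Nikkei 225: 17 upward out of 17 + 30 = 47.
p_value = binom_lower_tail(17, 47)
print(p_value)
```

A p-value below the chosen significance level leads to rejecting the hypothesis that the upward probability is at least 0.5.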
Note that the skew level does not need to be very high for the skew probability p to be smaller than 0.5, or very low for p to be bigger than 0.5. In fact, the skew levels we estimate in the empirical study are all lower than both the means and the medians in the eight mature stock markets. However, a skew level needs to be a level that the index crosses several times and at which it has a preference for moving upward or downward. Overall, the skew probabilities p are significantly different from 0.5 in the eight markets, and the SGBM model does not underperform the standard GBM, GBM-J, and GBM-MRS models, which supports the view that the skew phenomenon exists broadly in stock markets around the world.

Daily Data over One Year Period
After investigating the skew phenomenon over the whole sample period, we study it over one-year periods. We obtain 61 subsamples of the CSI 300 index and 69 subsamples of each of the other eight stock indices. We estimate the parameters and check for the skew phenomenon by testing the significance of the skew probability p. Table 4 and Figure 4 show the results for the Nikkei 225 index. The detailed results for the other indices are available upon request. For simplicity, we only show the results of subsamples whose skew probability p is significantly different from 0.5 at the 10% level, and we refer to the subsamples in which the skew phenomenon exists as skew subsamples. There are 28 skew subsamples, accounting for about 40% of the total. Almost all the estimated skew probabilities $\hat{p}$ are smaller than 0.5, indicating that these skew levels are resistance levels. The only subsample in which $\hat{p}$ is bigger than 0.5 runs from the fourth quarter of 2016 to the third quarter of 2017, with a corresponding skew level $\hat{a}$ of 16,336.7533. As with the weekly data, the skew level of each subsample can be seen as the scale at which the index clusters. Taking the subsample from the second quarter of 2003 to the first quarter of 2004 as an example, Figure 5 shows this clearly. The skew level we estimate is 10,742.4626, and price clustering is also most pronounced at this level. Therefore, the skew phenomenon and the price clustering phenomenon corroborate each other.
Table 5 compares the p-values and MSEs for each model. The third column of Table 5 shows that the SGBM model works better than the GBM model in most cases: the SGBM model wins in 19 (67.86%) of the subsamples. The SGBM model also beats the GBM-J model, as shown in the fifth column of Table 5, winning in 18 (64.29%) of the subsamples.
However, the results indicate that there may not be a noticeable difference between the SGBM model and the GBM-MRS model in fitting the data. In fact, the two models capture different characteristics of the dynamics. A combination of them may outperform all the models mentioned here; we leave this for future work.
Table 6 summarizes the results for all nine indices. It shows that the SMI has the most skew subsamples among all the indices, followed by the Nikkei 225, while the FTSE 100 has the fewest. Since market behavior varies across indices, it is reasonable that there are more skew subsamples in some markets and fewer in others. Although the number of skew subsamples in every market is less than half of the total, the percentages are still noticeable. Collectively, the numbers in Table 6 confirm that the skew phenomenon exists in subsamples of daily data over one-year periods. The skew probability p is smaller than 0.5 in most skew subsamples, which means that there are more resistance levels than support levels. This is consistent with investor psychology: as investors are more sensitive to declines in stock prices, the stock price will sometimes move downward with a larger probability when hitting a specific level. Although there are some markets in which the long-run skew phenomenon does not exist and the percentage of short-run skew subsamples is smaller than 50%, we should not ignore the skew phenomenon when modeling asset prices. As a skew model can be reduced to a conventional model, it can be used in both kinds of markets: those in which the skew phenomenon exists and those in which it does not. For markets in which the skew phenomenon does not exist, the skew probability will simply be 0.5. However, the conventional model cannot capture skew features, and there will be bias if we use it to describe markets in which the skew phenomenon is present. Thus, it is necessary to introduce the skew model and consider the skew phenomenon when modeling asset dynamics.

Why the Skew Phenomenon Exists
As the market behavior is uncertain and confusing, many researchers try to find some reasons to explain different market phenomena. In this section, we try to explain why the skew phenomenon exists.
As shown in Figures 3 and 5, the skew levels are consistent with the levels at which the indices cluster. Thus, the explanations for the price clustering phenomenon can help us to understand why the skew phenomenon exists. Donaldson and Kim [29] pointed out that the DJIA's rise and fall was indeed restrained by "support" and "resistance" levels; these levels are known as psychological barriers. The existence of such psychological barriers in different markets has been documented in many empirical studies. Sonnemans [4] found that round numbers could act as price barriers for individual stocks. Westerhoff [30] claimed that psychological barriers existed in the foreign exchange market. Dowling et al. [31] tested for the presence of psychological barriers in WTI and Brent oil futures and found them in the Brent prices but not in the WTI prices. A skew level is similar to, but not identical to, a psychological barrier. Psychological barriers exist in markets due to investors' perceptions that the fundamental asset value is anchored to the nearest round number. The skew level can be viewed as a psychological barrier, although it is not a round number. Both ideas describe the unusual behavior of asset dynamics when hitting a special level.
According to the empirical results in Tables 1 and 3, the estimated skew probability $\hat{p}$ is smaller than 0.5 and the skew level is lower than the mean in each market. This may not be consistent with what we would expect: usually, we expect a high skew level a when p is small. However, the "running after rising and falling" phenomenon can often be seen in stock markets, and it can be connected with herding behavior and positive feedback trading strategies. Figure 2 illustrates that the skew phenomenon is closely related to extreme price movements. During periods of extreme market movements, herd behavior is universal in the markets. Devenow and Welch [32] conducted a literature review on the economics of rational herding in financial markets, demonstrating that irrational investors usually disregard their prior beliefs and follow other investors blindly. Chang et al. [33] examined investor behavior in different international markets (i.e., US, Hong Kong, Japan, South Korea, and Taiwan) and found significant evidence of herding in the two emerging markets (Korea and Taiwan). Perhaps the existence of a low skew level with a small skew probability can be explained by positive feedback trading. When there is a stock market downturn, many investors expect the market to keep moving downward with a high probability. Therefore, a low skew level sometimes corresponds to a small skew probability.
Additionally, the skew phenomenon may be the result of government regulation. There is less government regulation in stock markets than in interest rate markets or foreign exchange markets, but this does not mean that the government will leave the stock market alone. Stock prices are usually considered to react to external forces. Chen et al. [34] showed that stock returns were exposed to systematic economic news, where many macroeconomic variables would systematically affect stock market returns. To stabilize the stock markets, a government may take some measures and give guidance to markets. Chang et al. [33] showed that, in emerging markets, herd behavior could result from a relatively high degree of government intervention. In June 2015, the Chinese stock market lost over $3.2 trillion in value, and the Chinese government took unprecedented steps to prevent stocks from falling further. Authorities suspended initial public offerings (IPOs), limited bearish bets through CSI 300 Index Futures, and encouraged financial firms to buy more shares. The empirical results show that there was indeed a skew phenomenon in 2015. When the stock index hits a specific level, the government will try to guide the trend of the market, which gives rise to a skew level.
Finally, it is noteworthy that the Chinese market is the only one in which p is not significantly different from 0.5. As an emerging market, the Chinese stock market started relatively late compared with those of developed countries and is still immature. Therefore, it is not surprising that the long-run skew phenomenon found in the mature capital markets cannot be found in the Chinese market. Another reason may be that the interest rate in China is relatively high compared with those of developed countries. The U.S. held a zero interest rate for seven years, and Europe and Japan have been holding zero or negative interest rates since 2016. However, the interest rate in China was between 4% and 6% during 2000-2018, which was much higher than the interest rates in other areas. Thus, it is inappropriate to ignore the time value of money in the Chinese market. The trend of the index is influenced by the interest rate more significantly in the Chinese stock market than in other stock markets. Therefore, the long-run skew phenomenon is also affected by the interest rate: as the sample period spanned 17 years, the skew level should be an oblique line rather than a horizontal line. When we move to the one-year sample period, there is still a short-run skew phenomenon in the Chinese market.

Conclusions
This study tests for the skew phenomenon in nine international stock markets, based on the SGBM model, and finds that the skew phenomenon is common worldwide. For the weekly data over the whole sample period, the skew probabilities p are significantly different from 0.5 at the 5% level in eight markets. Furthermore, we test the goodness-of-fit of the SGBM model and three commonly used models using the likelihood ratio test. SGBM significantly outperforms the other models in four markets: the Dutch, British, Swiss, and American markets. In the Japanese market, the p-value of the likelihood ratio test is 0.1, which is on the margin of significance. The graph of historical prices shows that the Nikkei 225 index goes downward most of the time when it hits the skew level. Thus, we can consider the Japanese market to be a skew market as well. Overall, we can say that the skew phenomenon exists broadly in the global stock markets.
For the daily data over the one-year period, there are 61 subsamples for the CSI 300 and 69 subsamples for each of the other indices. There are 15, 19, 14, 20, 13, 23, 27, 29, and 20 skew subsamples in the nine markets, respectively. The Swiss market has the most skew subsamples, while the British market has the fewest. The proportion of skew subsamples, out of the total subsamples, is noticeable. The skew probability is smaller than 0.5 in most skew subsamples, indicating that there are more resistance levels than support levels.
In addition, we attempt to explain why the skew phenomenon exists in stock markets. As the skew levels can be viewed as the barriers at which the indices cluster, psychological barriers of stock price may be one of the reasons. Herding behavior and positive feedback trading strategy may provide another reason, as a skew probability smaller than 0.5 corresponds to a skew level lower than the sample mean in the empirical test. Government regulation can cause the occurrence of skew phenomenon as well.
Given the above explanations of the skew phenomenon, an important investment implication is that, besides characteristics such as jumps and regime switching, the skew phenomenon is also noteworthy. In financial markets with the skew phenomenon, the values of the skew levels and skew probabilities are of great assistance to investors in judging index trends. For the government, testing for the skew phenomenon is a way to examine whether an intervention is effective: the values of the skew levels and skew probabilities provide evidence of the effect of the intervention.

Acknowledgments:
The authors are indebted to the participants in the seminar on Stochastic Processes and Financial Engineering at Nankai University for their valuable comments and discussions.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Bayesian Estimation of Skew Brownian Motion Model
We introduce the parameter estimation method in this appendix. If we assume $\Theta$ to be the set of all parameters $(\theta_1, \theta_2, \cdots, \theta_n)$ and the joint prior density function of $\Theta$ to be $p(\Theta)$, the posterior density function of the model is:
$$p(\Theta|X) = \frac{l(X|\Theta)\,p(\Theta)}{\int l(X|\Theta)\,p(\Theta)\,d\Theta} \propto l(X|\Theta)\,p(\Theta). \tag{A1}$$
In Bayesian inference, the quantity of interest can be evaluated as the posterior expectation of a certain function $g(\Theta)$:
$$E[g(\Theta)|X] = \int g(\Theta)\,p(\Theta|X)\,d\Theta. \tag{A2}$$
To avoid the complexity of multiple integrals, the Monte Carlo method is adopted in this paper.
Let $\Theta^{(1)}, \cdots, \Theta^{(m)}$ be samples generated from the posterior density function $p(\Theta|X)$; then, Equation (A2) is approximated by:
$$E[g(\Theta)|X] \approx \frac{1}{m}\sum_{i=1}^{m} g\left(\Theta^{(i)}\right). \tag{A3}$$
However, as it is difficult to generate the samples $\Theta^{(1)}, \cdots, \Theta^{(m)}$ directly, we employ the Gibbs sampler, which provides a practical way to generate them. The Gibbs sampler uses the full set of univariate conditionals to define the iteration. In our case, instead of … $p(a|X, \mu, \sigma^2, p) \propto 1$ … We get the cumulative distribution function $F_a$ through the $n$ grid points $\{a_1, \cdots, a_n\}$ and the right-hand side of Equation (A7). Then, together with a uniform random number $U_a$ and preselected values $0 < \xi_1 < \cdots < \xi_n < 1$, $a^{(i+1)}$ and the new grid points (i.e., those used in the next iteration) are obtained as $F_a^{-1}(U_a)$ and $F_a^{-1}(\xi_i)$.
For the simulated sample, the parameters are estimated to be $\mu = 0.0593$, $\sigma = 0.2109$, $a = 10{,}367.3596$, and $p = 0.3034$ under the SGBM assumption, and $\mu = 0.0227$ and $\sigma = 0.2152$ under the GBM assumption. The absolute relative error (ARE) between the true and estimated values of $\mu$ reaches 66.22% under the GBM model, while it is much smaller under the SGBM assumption. As for $\sigma$, the ARE is small for both models. This confirms that the skew phenomenon, if ignored, can yield a substantial bias in the parameter estimates.
Finally, we test for the presence of the skew phenomenon using the likelihood ratio test and the significance test of the skew probability p. As shown in Table A1, the skew probability is significantly different from 0.5; thus, the skew phenomenon is present in the simulated sample. The p-value of the likelihood ratio test shows that the SGBM model outperforms the GBM model. We can see that the skew phenomenon plays an important role in the dynamics of stock prices, so it is essential to take it into consideration when modeling stock prices.
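The grid-based draw of the skew level described earlier in this appendix, in which $F_a$ is built on grid points and then inverted at a uniform random number, can be sketched as follows. The sketch assumes that the unnormalized conditional density has already been evaluated at the grid points, accumulates it with the trapezoidal rule, and inverts the resulting distribution function by linear interpolation. The function name and the example density are hypothetical and are not taken from the paper.

```python
import bisect
import math
import random

def griddy_draw(grid, dens, u):
    """Invert a grid approximation of a univariate CDF at probability u in [0, 1].

    grid: increasing grid points; dens: unnormalized density values at those points.
    """
    cdf = [0.0]
    for x0, x1, d0, d1 in zip(grid, grid[1:], dens, dens[1:]):
        cdf.append(cdf[-1] + 0.5 * (d0 + d1) * (x1 - x0))  # trapezoidal rule
    total = cdf[-1]
    cdf = [c / total for c in cdf]
    j = min(max(bisect.bisect_left(cdf, u), 1), len(grid) - 1)
    c0, c1 = cdf[j - 1], cdf[j]
    if c1 == c0:
        return grid[j]
    w = (u - c0) / (c1 - c0)  # linear interpolation within the grid cell
    return grid[j - 1] + w * (grid[j] - grid[j - 1])

# Illustrative conditional: a bell-shaped bump centred at 10,000 (made-up numbers).
grid = [9000.0 + 100.0 * k for k in range(21)]
dens = [math.exp(-0.5 * ((g - 10000.0) / 300.0) ** 2) for g in grid]
rng = random.Random(1)
a_next = griddy_draw(grid, dens, rng.random())                       # F_a^{-1}(U_a)
new_grid = [griddy_draw(grid, dens, xi) for xi in (0.1, 0.5, 0.9)]   # F_a^{-1}(xi_i)
```

Placing the new grid points at fixed quantiles concentrates them where the conditional density has most of its mass, which refines the approximation of $F_a$ in the next iteration.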