Article

Efficiency of the Moscow Stock Exchange before 2022

Quantitative Finance Research Group, Scuola Normale Superiore, Piazza dei Cavalieri 7, 56126 Pisa, Italy
*
Author to whom correspondence should be addressed.
Entropy 2022, 24(9), 1184; https://doi.org/10.3390/e24091184
Submission received: 21 July 2022 / Revised: 16 August 2022 / Accepted: 18 August 2022 / Published: 25 August 2022
(This article belongs to the Special Issue Applications of Statistical Physics in Finance and Economics)

Abstract

This paper investigates the degree of efficiency for the Moscow Stock Exchange. A market is called efficient if prices of its assets fully reflect all available information. We show that the degree of market efficiency is significantly low for most of the months from 2012 to 2021. We calculate the degree of market efficiency by (i) filtering out regularities in financial data and (ii) computing the Shannon entropy of the filtered return time series. We developed a simple method for estimating volatility and price staleness in empirical data in order to filter out such regularity patterns from return time series. The resulting financial time series of stock returns are then clustered into different groups according to some entropy measures. In particular, we use the Kullback–Leibler distance and a novel entropy metric capturing the co-movements between pairs of stocks. By using Monte Carlo simulations, we are then able to identify the time periods of market inefficiency for a group of 18 stocks. The inefficiency of the Moscow Stock Exchange that we have detected is a signal of the possibility of devising profitable strategies, net of transaction costs. The deviation from the efficient behavior for a stock strongly depends on the industrial sector that it belongs to.

1. Introduction

When prices reflect all available information, the market is called efficient [1]. One way to claim the efficiency of a market is by testing the Efficient Market Hypothesis (EMH). In its weak form, the EMH considers that the last price incorporates all the past information about market prices [2]. If the weak form of EMH is rejected, previous prices help to predict future prices. For traders, market efficiency means that analyzing the history of previous prices does not help to design a strategy that produces abnormal profits. For a company issuing shares, market efficiency means that the cost of its share already reflects all information about the valuation and decisions of the company. The EMH is of great interest also in research. Mathematical models of an asset price are usually based on the assumption that the price follows a martingale: the expected value of a future price is the current value of the price. If the EMH is rejected, there should be an estimation of the future price that is better than its current value. In such a case, new models should be created.
A review of studies confirming the EMH was presented by Fama in 1970 [2] and then in 1991 [3]. The martingale hypothesis was also tested later. It was shown that the efficiency of a market depends on the development of the country [4]. Moreover, the martingale hypothesis was confirmed on short time intervals, but it may be violated on longer intervals [5]. In addition, there is a range of strategies designed to increase an expected profit. High-frequency and algorithmic trading strategies are discussed in [6]. Statistical and machine learning methods for high frequency trading are reviewed in [7]. The existence of such profitable strategies contradicts the Efficient Market Hypothesis. According to Grossman and Stiglitz [8], the degree of market inefficiency determines the effort investors are willing to expend to gather and trade on information.
The goal of this paper is to investigate the degree of efficiency of the Moscow Stock Exchange using the Shannon entropy. We quantify the degree of market inefficiency and the degree of price randomness. We aim to distinguish between price predictability due to stylized facts of financial time series [9] and price predictability due to market inefficiency. In particular, we consider volatility clustering and price staleness as data regularities that need to be filtered out. Based on the behavior of stock prices, we group them into clusters using several measures. Combining stocks into one cluster means a common price behavior that moves prices away from complete randomness.
A range of methods is used to measure a degree of market efficiency. In particular, Cajueiro and Tabak used the Hurst exponent and R/S statistics to rank efficiency of markets [10,11]. The Hurst exponent was measured on Bitcoin data to compare it with mature markets [12]. A generalized version of the use of the Hurst exponent, multifractal detrended fluctuation analysis, was applied to investigate the efficiency of stock and credit markets [13]. The algorithmic complexity of return time series was applied to measure the efficiency of financial markets [14] and to check the Efficient Market Hypothesis [15]. Finally, the Shannon entropy as a measure of randomness is used in a range of articles; see [16,17,18]. The general idea of these methods is to compare the characteristic of the time series with the value corresponding to a completely random process. In our study, we use Monte Carlo simulations to determine which deviations from a completely random process are statistically significant.
Before estimating the degree of market efficiency, we need to dispose of regularities that make prices more predictable but that do not imply any profitable strategies. A method for filtering regularities was introduced in [19]. However, such a process of filtering has not usually been applied in other research studies. In fact, deviations of price behavior from perfect randomness may be the result of some known regularity pattern, such as volatility clustering or daily seasonality, but not a signal of market inefficiency. One of the innovations of this article is a new method for filtering data regularities, allowing the estimation of volatility and a degree of price staleness minute by minute.
We process data by filtering regularities of financial time series, including volatility clustering and price staleness. Price staleness is defined as a lack of price adjustments yielding 0-returns. Traders may trade less because of high transaction costs, and so the price does not update. See [20] for more details. Price staleness produces an extra amount of 0-returns called excess 0-returns. The other source of 0-returns in the time series is price rounding. Estimations of volatility and of the degree of price staleness are mutually connected: excess 0-returns that appear due to price staleness lead to an underestimation of volatility. At the same time, a volatility estimate is needed to calculate the expected amount of 0-returns due to rounding.
One method for estimating volatility in the presence of excess 0-returns was presented in [21]. It uses the expectation-maximization algorithm [22] to estimate returns in the places of all 0-returns and uses the GARCH(1,1) model to estimate volatility [23]. The maximization of the likelihood function appearing at each step of the considered algorithm requires several parameters for numerical optimization. If the estimation of volatility is sensitive to these user-defined parameters, then they may affect the entropy of returns standardized by volatility and the amount of 0-returns in the time series. In this article, we suggest a modification of moving average volatility estimation that requires tuning only one parameter, which can be set using out-of-sample testing. The idea is to adopt a simple method for volatility estimation such that price staleness is taken into consideration. Moreover, while estimating volatility, we filter out excess 0-returns.
The degree of market efficiency has been measured for many countries. Stock indices for 20 countries were considered in [24]. The efficiency of 11 emerging markets and the US and Japan markets was measured in [10]. US stock markets were considered in a recent paper [25]. A review of articles about Baltic countries was presented in [26]. A degree of uncertainty of Chinese [27], Tunisian [28], Mexican [29], and Portuguese [30] stock markets was also considered by using entropy measures. However, the efficiency of the Russian stock market has not yet been analyzed. In this paper, we present an analysis of market efficiency based on the estimation of Shannon entropy for a group of 18 stocks of Russian companies from five industries.
Our paper introduces four original contributions in the field. First, we construct a method for filtering out heteroskedasticity and price staleness. This filtering process helps identify a true degree of market inefficiency. Second, we calculate the degree of market inefficiency for the previous decade using monthly intervals. We conclude that the degree of market inefficiency for the Moscow Stock Exchange was greater than 80%. Third, we determine which pair of stocks exhibits the largest amount of inefficiency, as measured by estimating Shannon’s entropy on their high-frequency price time series. We show that months where the predictability of stock prices attains its maximum cluster together. We identify the price pattern that is repeated most often for stocks during inefficient time periods. Finally, we estimate the closeness of price movements using two measures of entropy. Based on these results, we cluster together groups of stocks for which the efficient market hypothesis is rejected, thus pointing out how market inefficiency depends on the sector to which the stocks belong.
The article is organized as follows. Section 2 describes the dataset and the method for filtering data regularities and calculating the Shannon entropy. Section 3 presents the results on simulated and real data. Section 4 concludes the paper.

2. Materials and Methods

Our main goal is to measure a degree of efficiency of the Moscow Stock Exchange. The data taken for the study are reviewed in the next section. All data processing and computing can be divided into three stages. First, we filter data regularities from financial time series. Then, we calculate the degree of efficiency of the market using the Shannon entropy. Finally, we use the resulting time series to cluster stocks using Kullback–Leibler distance discussed in Section 2.4.

2.1. Dataset

We study the Moscow Stock Exchange. We consider close prices aggregated at a one-minute time scale. In particular, we select only minutes of the main trading session from 10:00 to 18:40. The time interval covers ten years from 2012 to 2021. The time period is divided into monthly time intervals. We take 18 companies, 16 of which are from five sectors: oil industry, metallurgy, banks, telecommunications, and electricity. All stocks are listed in Table 1. There are 2520 trading days; assuming that there are 520 min in each trading day, there are 1,310,400 trading minutes in total. We use the outlier detection algorithm of Brownlees and Gallo [31]; see details in Appendix A.1. All data are provided by Finam Holdings (https://www.finam.ru/, accessed on 17 August 2022).

2.2. Apparent Inefficiencies

To estimate a degree of market efficiency, we first should eliminate the known patterns of predictability, such as a daily seasonality. Financial agents operating in the market tend to trade less in the middle of a day. It is reflected in prices, but again, this pattern in trading volume should be filtered out to detect genuine patterns of inefficiency. Other known regularities include volatility clustering, price staleness, and microstructure noise. See Appendix A for a guide on filtering out apparent inefficiencies. The contribution of this article is that it devises a simple method for filtering volatility clustering and price staleness. One of the methods used to estimate volatility is the exponentially weighted moving average (EWMA). It is described in the next section.

2.2.1. EWMA

We define price returns as $r_t = \ln \frac{P_t}{P_{t-1}}$, where $P_t$ is the last price available at time $t$, and $\ln(\cdot)$ is the natural logarithm. In order to estimate volatility $\sigma_n$, we apply the exponentially weighted moving average [32] of values $\mu_1^{-1} |r_i|$, $i < n$, where $\mu_1 = \sqrt{2/\pi}$.
$$\bar{\sigma}_n = Sig_1(\alpha, r_{n-1}, \bar{\sigma}_{n-1}) = \alpha \mu_1^{-1} |r_{n-1}| + (1 - \alpha) \bar{\sigma}_{n-1} \tag{1}$$
This form of exponential moving average was used in [19]. Here, $E[|r_n|] = \mu_1 \sigma_n$ is used assuming that returns are normally distributed, $r_n \sim N(0, \sigma_n)$. More weight is given to the more recent data. An alternative formula based on the expectation $E[r_n^2] = \sigma_n^2$ is described as follows.
$$\bar{\sigma}_n^2 = Sig_2(\alpha, r_{n-1}, \bar{\sigma}_{n-1}) = \alpha r_{n-1}^2 + (1 - \alpha) \bar{\sigma}_{n-1}^2 \tag{2}$$
A large value of return increases the value of volatility. The current value of volatility reflects all available values of returns and changes slowly if the value of α is small.
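The two updates in Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function names and the initial value `sigma0` are assumptions, since the initialization is not specified in this section.

```python
import math

MU1 = math.sqrt(2.0 / math.pi)  # E|Z| for a standard normal Z

def sig1(alpha, r_prev, sigma_prev):
    """Equation (1): EWMA update based on the absolute return."""
    return alpha * abs(r_prev) / MU1 + (1.0 - alpha) * sigma_prev

def sig2(alpha, r_prev, sigma_prev):
    """Equation (2): EWMA update based on the squared return (returns sigma, not sigma^2)."""
    return math.sqrt(alpha * r_prev ** 2 + (1.0 - alpha) * sigma_prev ** 2)

def ewma_path(returns, alpha, sigma0, update=sig1):
    """Volatility path; the n-th estimate only uses returns up to n-1.

    sigma0 is a hypothetical starting value chosen by the user.
    """
    sigmas = [sigma0]
    for r in returns[:-1]:
        sigmas.append(update(alpha, r, sigmas[-1]))
    return sigmas
```

A small `alpha` makes the path react slowly to new returns, which matches the remark above that volatility changes slowly when $\alpha$ is small.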
We follow the approach suggested by [33] (p. 97) to find optimal values of $\alpha$ in Equations (1) and (2). The value of $\alpha$ is selected so that it minimizes the error $Er_\sigma(\alpha) = \sum_i (\bar{\sigma}_i^2 - r_i^2)^2$. In order to minimize $Er_\sigma(\alpha)$ as a function of the single parameter $0 < \alpha < 1$, we apply Brent’s algorithm [34] (the method is available in Python through the function scipy.optimize.minimize_scalar; alternatively, we could use the golden-section search [35], which requires the boundaries of the search interval and a single parameter for the stopping criterion). We modify the exponential moving average method in Section 2.2.3 so that it removes a bias due to the effect of price staleness discussed in the next section.
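A minimal sketch of this one-parameter search is given below, using the squared-return update of Equation (2). It assumes the bounded variant of Brent's method offered by `scipy.optimize.minimize_scalar`; the initial variance `var0` and the search bounds are illustrative choices, not values taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ewma_variance(returns, alpha, var0):
    """Equation (2) applied recursively; var[n] depends on returns up to n-1."""
    var = np.empty(len(returns))
    var[0] = var0
    for n in range(1, len(returns)):
        var[n] = alpha * returns[n - 1] ** 2 + (1.0 - alpha) * var[n - 1]
    return var

def error(alpha, returns, var0):
    """Er_sigma(alpha) = sum_i (sigma_bar_i^2 - r_i^2)^2."""
    return float(np.sum((ewma_variance(returns, alpha, var0) - returns ** 2) ** 2))

def optimal_alpha(returns, var0=None):
    # Assumption: start the recursion from the sample variance of the training data.
    if var0 is None:
        var0 = float(np.var(returns))
    res = minimize_scalar(error, bounds=(1e-4, 1.0 - 1e-4),
                          args=(returns, var0), method="bounded")
    return res.x
```

In practice `returns` would be the training half of the sample, with the selected value then evaluated out of sample.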

2.2.2. Estimation of Price Staleness

Let us define an efficient price, $P^e$, as a continuous process following a Geometric Brownian Motion.
$$P_t^e = P_0^e + \int_0^t \sigma_s P_s^e \, dW_s$$
An observed price moves along a discrete grid. Possible price values are multiples of the tick size, $d$.
$$P_t = d \cdot \operatorname{round}\left(\frac{P_t^e}{d}\right)$$
If the efficient price changes insignificantly, the return of the rounded price will be equal to 0. Conversely, if the return of the rounded price is 0, the return of the efficient price has a value close to 0. We use Equation (3) to estimate the probability that a return of the rounded price has zero value:
$$p_i = \operatorname{erf}(R_{i-1}) + \frac{1}{R_{i-1} \sqrt{\pi}} \left( \exp(-R_{i-1}^2) - 1 \right), \tag{3}$$
where $R_i = \frac{d}{\bar{P}_i \bar{\sigma}_i \sqrt{2 \Delta}}$ and $\operatorname{erf}(x)$ is the Gaussian error function; $d$ is the tick size (we estimate the tick size using a two-step procedure for each month: first, we find the number of significant digits in the price; then, we determine the most frequent increment in ordered prices); $\Delta$ is the time step (the time step between the end and the start of the main trading session is set to 1 min; moreover, we consider any time gap without trading longer than 2 h as a closure of the market and set the time step equal to 1 min for these gaps); $\bar{P}$ is the rounded price; $\bar{\sigma}$ is the estimate of volatility [36]. Equation (3) is obtained by considering the probability that a price following a Geometric Brownian Motion moves less than one tick size, assuming that price increments are normally distributed.
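As an illustration, Equation (3) can be evaluated directly with the standard library. The function and argument names below are placeholders; the inputs are the tick size, the last rounded price, the current volatility estimate, and the time step.

```python
import math

def zero_return_probability(tick, price, sigma, dt=1.0):
    """Equation (3): probability that the rounded price does not move in one step."""
    R = tick / (price * sigma * math.sqrt(2.0 * dt))
    return math.erf(R) + (math.exp(-R * R) - 1.0) / (R * math.sqrt(math.pi))
```

The probability grows towards 1 as the volatility shrinks relative to the tick size, which is the intuition behind attributing some 0-returns to rounding alone.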
There is another source for obtaining 0-returns, namely price staleness. Price staleness represents a regularity pattern of the dynamics; namely, the fundamental (efficient) price of an asset is not updated because of a number of reasons, such as no transactions because of high cost, which makes trading unprofitable for agents. See [20] for more details. This results in a persistence pattern of (“excess”) 0-returns. Such a pattern, for example, tends to reduce any estimation of the volatility. Therefore, we need to filter out 0-returns due to price staleness while retaining 0-returns due to rounding for a genuine estimation of volatility.
We retain a number of 0-returns equal to the sum of past values of the probability in Equation (3) [36] and set the other 0-returns as missing values. We adopt this method to estimate the degree of price staleness together with volatility in the next section.

2.2.3. Modification of EWMA

In this section, we present a modification of the EWMA that takes into consideration the effect of price staleness. Our modification of the EWMA is based on estimating volatility $\sigma_n$ as $\bar{\sigma}_{n-1}$ (i.e., by setting $\alpha = 0$) if the value of $r_{n-1}$ is missing because of price staleness. That is, there is no new information from returns to update the value of volatility.
Initially, the expected amount of 0-returns due to rounding is $N_{save} = 0$. Thus, each appearance of 0-returns does not affect the value of volatility. A 0-return is defined as a value due to rounding and is saved in the sequence if the sum of all $p_i$ (Equation (3)) moves to a new integer value. Other details and the algorithm of volatility estimation can be found in Appendix B.
We update the estimation of volatility and price staleness minute-by-minute. This method has the clear advantage of making the online inference possible by processing data in real time.
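The minute-by-minute logic described in this section might be sketched as follows. This is one possible reading of the text, not the algorithm of Appendix B: the bookkeeping with `n_save`, the initial value `sigma0`, and the use of `None` for missing returns are all assumptions made for illustration.

```python
import math

MU1 = math.sqrt(2.0 / math.pi)

def p_zero(tick, price, sigma, dt=1.0):
    """Equation (3): probability of a 0-return caused by rounding alone."""
    R = tick / (price * sigma * math.sqrt(2.0 * dt))
    return math.erf(R) + (math.exp(-R * R) - 1.0) / (R * math.sqrt(math.pi))

def filter_staleness(prices, tick, sigma0, alpha=0.05):
    """Return (sigmas, returns); excess 0-returns are replaced by None."""
    sigma = sigma0
    p_sum = 0.0   # running sum of p_i
    n_save = 0    # number of 0-returns kept so far as rounding effects
    sigmas, filtered = [], []
    for prev, cur in zip(prices[:-1], prices[1:]):
        sigmas.append(sigma)
        p_sum += p_zero(tick, prev, sigma)
        r = math.log(cur / prev)
        if r == 0.0 and math.floor(p_sum) <= n_save:
            # Excess 0-return: mark as missing and keep sigma unchanged (alpha = 0).
            filtered.append(None)
            continue
        if r == 0.0:
            n_save += 1  # the sum of p_i reached a new integer: keep this 0-return
        filtered.append(r)
        sigma = alpha * abs(r) / MU1 + (1.0 - alpha) * sigma  # Equation (1)
    return sigmas, filtered
```

Because each step only needs the previous state, the same loop could in principle run on streaming data, in line with the online use mentioned above.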

2.3. Calculating a Degree of Market Inefficiency

2.3.1. The Shannon Entropy

A degree of randomness of price returns is assessed by Shannon entropy. The entropy of a source is an average measure of the randomness of its outputs [37].
Definition 1.
Let $X = \{X_1, X_2, \ldots\}$ be a stationary random process with a finite alphabet $A$ and a measure $\mu$. An $n$-th order entropy of $X$ is
$$H_n(\mu) = -\sum_{x_1^n \in A^n} \mu(x_1^n) \log \mu(x_1^n)$$
with the convention $0 \log 0 = 0$. The process entropy (entropy rate) of $X$ is
$$h(\mu) = \lim_{n \to \infty} \frac{H_n(\mu)}{n}.$$
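As a quick check of Definition 1 (an illustration added here, under the assumption of an i.i.d. process that is uniform on $A$), every block $x_1^n$ has probability $|A|^{-n}$, so

$$H_n(\mu) = -\sum_{x_1^n \in A^n} |A|^{-n} \log |A|^{-n} = n \log |A|, \qquad h(\mu) = \lim_{n \to \infty} \frac{n \log |A|}{n} = \log |A|.$$

If the base of the logarithm is the alphabet size $|A|$, this gives $h(\mu) = 1$, the value of a completely random source; any dependence between symbols can only lower the entropy rate below this benchmark.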

2.3.2. Discretization

The Shannon entropy is computed over a finite alphabet. To measure Shannon’s entropy, we need to keep the length of blocks of symbols, k, sufficiently large. The predictable behavior of returns can be seen on blocks of greater length and may not be noticeable on blocks of smaller length. For this reason, we consider 3-symbol and 4-symbol discretizations using empirical quantiles:
$$s_t^{(3)} = \begin{cases} 1, & r_t \le \theta_1, \\ 0, & \theta_1 < r_t \le \theta_2, \\ 2, & \theta_2 < r_t, \end{cases} \qquad s_t^{(4)} = \begin{cases} 0, & r_t \le Q_1, \\ 1, & Q_1 < r_t \le Q_2, \\ 2, & Q_2 < r_t \le Q_3, \\ 3, & Q_3 < r_t, \end{cases}$$
where $\theta_1$ and $\theta_2$ are tertiles and $Q_1$, $Q_2$, and $Q_3$ are quartiles. The tertiles divide data into three equal parts. The quartiles divide data into four equal parts. $Q_2$ is also the median of the empirical distribution of returns. For the later analysis, we will need a discretization describing the behavior of a pair of stocks:
$$s_t^{(p)} = \begin{cases} 0, & r_t^{(1)} \le m_1 \text{ and } r_t^{(2)} \le m_2, \\ 1, & r_t^{(1)} \le m_1 \text{ and } r_t^{(2)} > m_2, \\ 2, & r_t^{(1)} > m_1 \text{ and } r_t^{(2)} \le m_2, \\ 3, & r_t^{(1)} > m_1 \text{ and } r_t^{(2)} > m_2, \end{cases}$$
where $r_t^{(1)}$ and $r_t^{(2)}$ are two time series of price returns, and $m_1$ and $m_2$ are their medians.
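The three discretizations can be written compactly with NumPy. The sketch below assumes the returns are stored in arrays without missing values; handling of missing entries is left out for brevity, and the function names are illustrative.

```python
import numpy as np

def discretize3(r):
    """3-symbol coding: 1 = lowest tertile, 0 = middle, 2 = highest."""
    t1, t2 = np.quantile(r, [1 / 3, 2 / 3])
    return np.where(r <= t1, 1, np.where(r <= t2, 0, 2))

def discretize4(r):
    """4-symbol coding by quartiles: 0, 1, 2, 3 from lowest to highest."""
    edges = np.quantile(r, [0.25, 0.5, 0.75])
    # side="left" maps r <= Q1 to 0, Q1 < r <= Q2 to 1, and so on.
    return np.searchsorted(edges, r, side="left")

def discretize_pair(r1, r2):
    """Joint coding of two return series relative to their medians."""
    m1, m2 = np.median(r1), np.median(r2)
    return 2 * (r1 > m1).astype(int) + (r2 > m2).astype(int)
```

For the pair coding, symbols 0 and 3 correspond to both stocks moving on the same side of their medians, while 1 and 2 correspond to opposite moves.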

2.3.3. The Estimation of Entropy

Let $x_1^n \in A^n$ be the sequence of length $n$ generated by an ergodic source $\mu$ from the finite alphabet $A$, where $x_i^{i+k-1} = x_i \ldots x_{i+k-1}$. There are possible missing values in the sequence generated independently from $x_1^n$. We consider all blocks of length $k$ that do not contain missing values. We take the following:
$$k = \max \left( K : K < \log(n_b(K)) \right), \tag{5}$$
where $n_b(k)$ is the number of blocks of length $k$. The restriction on the value of $k$ ensures that there are enough blocks to estimate the probabilities appearing in the $k$-th order entropy [38]. The base of the logarithm is the size of alphabet $A$ (3 or 4).
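A small sketch of the block-length rule in Equation (5) follows. It assumes missing values are encoded as `None`; the helper names are illustrative.

```python
import math

def count_blocks(seq, k):
    """Number of length-k windows that contain no missing value (None)."""
    n_b, run = 0, 0
    for s in seq:
        run = run + 1 if s is not None else 0  # length of the current gap-free run
        if run >= k:
            n_b += 1
    return n_b

def choose_k(seq, alphabet_size):
    """Largest K with K < log_{|A|}(n_b(K)), as in Equation (5)."""
    k = 0
    for K in range(1, len(seq) + 1):
        n_b = count_blocks(seq, K)
        # n_b(K) does not increase with K, so the first failure is final.
        if n_b > 0 and K < math.log(n_b, alphabet_size):
            k = K
        else:
            break
    return k
```

For example, a gap-free sequence of 100 symbols gives `k = 4` with three symbols and `k = 3` with four symbols.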
For each $a_1^k \in A^k$, empirical frequencies are defined as follows.
$$f(a_1^k | x_1^n) = \# \{ i \in [1, n - k + 1] : x_i^{i+k-1} = a_1^k \}.$$
Empirical frequencies are the actual counts of each block from $A^k$ in the data. By considering an empirical $k$-block distribution as
$$\hat{\mu}_k(a_1^k | x_1^n) = \frac{f(a_1^k | x_1^n)}{n_b}, \tag{6}$$
an empirical $k$-entropy is defined by the following.
$$\hat{H}_k(x_1^n) = -\sum_{a_1^k} \hat{\mu}_k(a_1^k | x_1^n) \log \left( \hat{\mu}_k(a_1^k | x_1^n) \right) = \log(n_b) - \frac{1}{n_b} \sum_{i=1}^{M} f_i \log f_i.$$
The estimation of the process entropy is described as follows.
$$\hat{h}_k = \frac{\hat{H}_k}{k}.$$
See [38] for the proof of the consistency of this estimator and [36] for the case of missing values. Since the sequence is finite, the entropy is underestimated. To remove this bias, we use the correction for the entropy estimation introduced in [39,40]:
$$\hat{H}_k^G = \log(n_b) - \frac{1}{n_b} \sum_{i=1}^{M} f_i \log \left( \exp \left( G(f_i) \right) \right), \tag{7}$$
where the sequence $G(i)$ is defined recursively as
$$G(1) = -\gamma - \ln(2), \quad G(2) = 2 - \gamma - \ln(2), \quad G(2n+1) = G(2n), \quad G(2n+2) = G(2n) + \frac{2}{2n+1}, \quad n \ge 1,$$
with Euler’s constant $\gamma = 0.577215\ldots$
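The estimator above can be sketched as follows, assuming symbols are stored in a list with `None` for missing values and that logarithms are taken in the base of the alphabet size. The function names are illustrative, and the sketch returns the entropy rate, i.e., the $k$-entropy divided by $k$.

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649015329

def G(n):
    """Correction sequence G(n) from the recursion above."""
    g = -EULER_GAMMA - math.log(2.0)          # G(1)
    if n == 1:
        return g
    g += 2.0                                  # G(2)
    for m in range(1, n // 2):                # G(2m+2) = G(2m) + 2/(2m+1)
        g += 2.0 / (2 * m + 1)
    return g                                  # odd n reuse G(n-1)

def block_counts(seq, k):
    """Frequencies of all length-k blocks without missing values."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        block = tuple(seq[i:i + k])
        if None not in block:
            counts[block] += 1
    return counts

def entropy_rate(seq, k, alphabet_size, corrected=True):
    """Empirical k-entropy divided by k, with or without the bias correction."""
    counts = block_counts(seq, k)
    n_b = sum(counts.values())
    if corrected:
        s = sum(f * G(f) for f in counts.values())        # f * ln(exp(G(f)))
    else:
        s = sum(f * math.log(f) for f in counts.values())
    H = (math.log(n_b) - s / n_b) / math.log(alphabet_size)
    return H / k
```

For large counts, $G(f)$ is close to $\ln f$, so the correction matters mostly when many blocks are observed only a few times.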

2.3.4. Detection of Inefficiency

We need to perform three steps to determine if the time interval is efficient or not. First, we filter out apparent inefficiencies (see Appendix A). Then, we estimate the entropy of the filtered return time series using Equation (7). Finally, we determine if the value of entropy is significantly lower than in the case of perfect randomness. We detect inefficiency in the time interval using Monte Carlo simulations. We regard a Brownian motion as absolutely unpredictable. First, we define the length of sequences as $l = n_b(k) + k - 1$. Then, we simulate $10^4$ realizations of Brownian motions with Gaussian increments and length $l$. For each realization, we calculate entropy using 3- and 4-symbol discretizations. Then, we find the first percentile of the obtained entropies for each discretization. These percentiles are the bounds of the 99% confidence interval (CI) for testing market efficiency. Finally, we define an efficiency rate as the ratio of the entropy of the time interval to the bound of the CI. If the efficiency rate is less than 1 for at least one type of discretization, we define the time interval as inefficient. We test for inefficiency twice using different discretizations because a single test may not be robust. See an example in Appendix C.
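The Monte Carlo bound and the efficiency rate can be sketched as below. To keep the example short, it uses the plain empirical entropy without the bias correction and labels symbols 0, 1, 2, ... in increasing order of returns (entropy does not depend on the labels); both simplifications, as well as the function names, are assumptions of this sketch.

```python
import math
from collections import Counter
import numpy as np

def entropy_rate(symbols, k, alphabet_size):
    """Plain empirical k-block entropy rate in base alphabet_size."""
    blocks = Counter(tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1))
    n_b = sum(blocks.values())
    H = math.log(n_b) - sum(f * math.log(f) for f in blocks.values()) / n_b
    return H / (k * math.log(alphabet_size))

def discretize(r, n_symbols):
    """Quantile coding into n_symbols equally likely symbols."""
    edges = np.quantile(r, np.arange(1, n_symbols) / n_symbols)
    return np.searchsorted(edges, r, side="left").tolist()

def ci_bound(length, k, n_symbols, n_sim=10_000, seed=0):
    """First percentile of entropy rates over simulated Gaussian increments."""
    rng = np.random.default_rng(seed)
    sims = [entropy_rate(discretize(rng.standard_normal(length), n_symbols), k, n_symbols)
            for _ in range(n_sim)]
    return float(np.percentile(sims, 1.0))

def efficiency_rate(symbols, k, n_symbols, bound):
    """Ratio of the observed entropy rate to the Monte Carlo bound."""
    return entropy_rate(symbols, k, n_symbols) / bound
```

An interval would then be flagged as inefficient when `efficiency_rate` is below 1 for the 3-symbol or the 4-symbol coding.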

2.4. Kullback–Leibler Distance

In addition to estimating the entropy of one time series, we can also consider the difference between two time series. The Kullback–Leibler divergence [41] is used to measure the dissimilarity between two discrete probability distributions $P$ and $Q$.
$$KL(P|Q) = \sum_i p_i \log \frac{p_i}{q_i}$$
We use $p_i$ and $q_i$ as empirical probabilities obtained in Equation (6). Since the Kullback–Leibler divergence is asymmetric, we consider the distance between two time series proposed in [42].
$$D(P, Q) = \frac{KL(P|Q)}{H^G(P)} + \frac{KL(Q|P)}{H^G(Q)} \tag{8}$$
The greater the distance $D(P, Q)$, the more the probability distributions $P$ and $Q$ differ.
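A minimal sketch of Equation (8) is shown below. It assumes that the block probabilities are given as dictionaries and that every block with positive probability in one series also has positive probability in the other; how zero frequencies are treated is not stated in this section, so that case is deliberately left unhandled.

```python
import math

def kl(p, q):
    """KL(P|Q) = sum_i p_i log(p_i / q_i) over blocks with p_i > 0."""
    return sum(pi * math.log(pi / q[b]) for b, pi in p.items() if pi > 0)

def kl_distance(p, q, h_p, h_q):
    """Equation (8): symmetrized KL divergence, each term scaled by an entropy.

    h_p and h_q are the entropy estimates of the two series, supplied by the caller.
    """
    return kl(p, q) / h_p + kl(q, p) / h_q
```

Dividing by the entropies makes the two directed divergences comparable before they are summed, and the sum no longer depends on the order of the arguments.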

3. Results

3.1. Simulations

The aim of this section is to assess the accuracy of the estimation of volatility and the degree of price staleness. For the further analysis of real data, we will choose the method that produces the smallest estimation error. We take the following model of an observed price $\tilde{P}_t$, $t = 1, \ldots, 2N$:
$$\begin{aligned} P_t &= P_0 + \int_0^t \sigma_s P_s \, dW_s^1, \\ \tilde{P}_i &= P_i (1 - B_i) + \tilde{P}_{i-1} B_i, \\ q_t &= q_0 + \int_0^t \mu_s \, ds + \int_0^t \nu \, dW_s^2, \\ B_i &= \begin{cases} 1 & \text{with probability } q_i, \\ 0 & \text{with probability } 1 - q_i, \end{cases} \end{aligned}$$
where $W^1$ and $W^2$ are two independent Brownian motions with a length of $2N$, $N = 10^5$; the initial price is $P_0 = 100$; and $\nu = 10^{-4}$. $B = 1$ stands for the case when the price is not updated due to price staleness (see [20,43]). Prices are rounded to two digits; thus, the tick size is $d = 0.01$. We consider four choices for $q_t$ and $\sigma_t$ listed below.
$$\begin{aligned} q_t^1 &= 0, & \sigma_t^1 &= 5 \times 10^{-4}, \\ q_t^2 &= 0.1 + \int_0^t \nu \, dW_s^2, & \sigma_t^2 &\sim ARCH(1.75 \times 10^{-7}, 0.2, 0.1), \\ q_t^3 &= 0.2 + \int_0^t \nu \, dW_s^2, & \sigma_t^3 &\sim GARCH(1.25 \times 10^{-8}, 0.1, 0.85), \\ q_t^4 &= 0.2 + \int_0^t \mu_s^4 \, ds + \int_0^t \nu \, dW_s^2, \quad \mu_t^4 = \frac{0.8 \pi}{N} \cos \left( \frac{8 t \pi}{N} \right), & \sigma_t^4 &\sim GARCH(1.25 \times 10^{-8}, 0.15, 0.8). \end{aligned}$$
For price staleness, we consider four cases: the absence of price staleness, two stochastic probabilities with different constant means, and a stochastic probability with a periodic mean. For all four cases of volatility, the unconditional expected value of $\sigma_t$ is $5 \times 10^{-4}$. The first choice of volatility is a constant. Then, we consider the ARCH model [44] with two lagged values, where 0.2 and 0.1 correspond to the first and the second lags, respectively. Volatility values directly depend only on the previous return values. The dependency on the previous return should be reflected in the value of the smoothing parameter. The third and fourth choices are GARCH(1,1) models [23], where the last parameter (0.85 or 0.8) is the coefficient of the lagged variance. We consider two sets of parameters for the GARCH model, giving less persistence to the fourth model.
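One configuration of this model (GARCH(1,1) volatility with the parameters of the third choice and a stochastic staleness probability with a constant mean) might be simulated as below. The Euler step on the price, the clipping of the probability to [0, 1], and all names are assumptions of this sketch rather than details given in the text.

```python
import numpy as np

def simulate_observed_price(n, q0=0.1, nu=1e-4, omega=1.25e-8, a=0.1, b=0.85,
                            p0=100.0, tick=0.01, seed=0):
    """Return (efficient, observed) price paths of length n."""
    rng = np.random.default_rng(seed)
    var = omega / (1.0 - a - b)            # start at the unconditional variance
    q, price = q0, p0
    last_obs = np.round(p0 / tick) * tick
    efficient, observed = np.empty(n), np.empty(n)
    for i in range(n):
        r = np.sqrt(var) * rng.standard_normal()
        var = omega + a * r ** 2 + b * var          # GARCH(1,1) recursion
        price *= 1.0 + r                            # Euler step of dP = sigma P dW
        # Random-walk staleness probability; clipping to [0, 1] is an assumption.
        q = min(max(q + nu * rng.standard_normal(), 0.0), 1.0)
        if rng.random() >= q:                       # B_i = 0: the price is updated
            last_obs = np.round(price / tick) * tick
        efficient[i], observed[i] = price, last_obs
    return efficient, observed
```

With these parameters the unconditional variance is $1.25 \times 10^{-8} / (1 - 0.1 - 0.85) = 2.5 \times 10^{-7}$, i.e., a volatility of $5 \times 10^{-4}$, consistent with the value stated above.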
We divide the data into two equal parts of size $N$. The first part is a training set for finding optimal values of $\alpha$ from Equations (1) and (2). The second part is a testing set for calculating the errors reported in Table 2 and Table 3. We compare two methods that use $Sig_1$ and $Sig_2$ for volatility estimation. For each method, we find the optimal value of $\alpha$. In addition, we set a fixed value, $\alpha = 0.05$, as a benchmark for the comparison. We also apply the non-modified EWMA estimation from Section 2.2.1 with the selected optimal value of $\alpha$ to show the contribution of 0-filtering to the accuracy of volatility estimation. We simulate $10^3$ prices for each model.
Table 2 reports the mean absolute percentage error (MAPE), that is, $\frac{1}{N} \sum_i \left| \frac{\bar{\sigma}_i - \sigma_i}{\sigma_i} \right|$, for six different approaches. These approaches differ in the choice of the function for volatility, the value of $\alpha$, and the presence of missing values. Table 3 reports three values for each of the two methods using $Sig_1$ and $Sig_2$ for volatility estimation. The first value is the optimal value of $\alpha$. The second is $Er_N = \left| \frac{N_{round} N_A}{N_0 N} - 1 \right|$, where $N_{round}$ is the amount of 0-returns that would appear due to rounding (before adding the effect of staleness in the simulated data), $N_A$ is the amount of remaining non-missing returns, and $N_0$ is the amount of 0-returns. $Er_N$ represents the absolute error of the proportion of 0-returns that remain in the data and are defined as 0-returns due to rounding. The third value is the proportion of data set as missing values (that is, $1 - \frac{N_A}{N}$).
It can be seen from Table 2 that the method that most often produces the lowest value of MAPE uses the fixed $\alpha = 0.05$ and $Sig_1$ for volatility estimation. Moreover, for almost all cases, 0-filtering makes the volatility estimate more accurate. The error of the amount of 0-returns due to rounding is smaller for the function $Sig_1$ than for the function $Sig_2$ for all 16 cases.
After the comparison of the two functions of volatility estimation, we decide to use $Sig_1$, which uses absolute values of returns, in the next sections. For the rest of the paper, we fix the value of $\alpha$ at 0.05 for the simplicity of further analysis.

3.2. Moscow Stock Exchange

We calculate $18 \cdot 120 = 2160$ efficiency rates for each type of discretization, where 18 is the number of stocks and 120 is the number of months in 10 years. We define the degree of inefficiency as the fraction of the 2160 stock-months that are classified as inefficient according to Section 2.3.4. The degree of inefficiency for the chosen group of stocks traded at the Moscow Exchange is 0.823. In our previous study [36], we found that the degree of inefficiency for the U.S. ETF market is about 0.11 for monthly time intervals and the 3-symbol discretization only. This difference in the degrees of inefficiency can be explained by the hypothesis that developed markets have a high level of efficiency, a conclusion reached by W. A. Risso in [24]. The degree of inefficiency for each stock and discretization is presented in Table 4. We notice that the 4-symbol discretization yields a larger number of inefficient months than the 3-symbol discretization. That is, the 4-symbol discretization appears to have a more predictable structure than the 3-symbol discretization.
Figure 1 shows the minimum value of efficiency rates among all months for each stock.
There are two most notable deviations from one for MLTR stocks (Mechel, mining and metals company) and RSTI (Rosseti, power company). We investigate them in the next section. For the other 16 stocks, the minimum value of efficiency rate is attained for the AFLT stock, and it is equal to 0.933 (0.964) for three (four) symbols.

Analysis of MLTR and RSTI

We plot the values of efficiency rates for monthly intervals for the MLTR and RSTI stocks. See Figure 2 and Figure 3.
Both types of discretization show coherent results. For MLTR, there are two notable decreases in the efficiency rates at the beginning of 2014 and in the middle of 2016. For both types of discretizations, the eight months with the lowest efficiency rate (in the ascending order of time) include January–February and May–October of 2014. For each month, we write down the most frequent block of symbols in Table 5. Note that block 1111 for the 4-symbol discretization appears as the most frequent for 6 months out of 8 for MLTR. The block denotes a slight decrease in price for 4 min in a row. The meaning of the last two columns is discussed later.
For RSTI, there are two sharp decreases in 2014 and 2015. There are 11 months that have the lowest efficiency rates that are in common for both discretizations. These months are April–September of 2014 and June–October of 2015. Note that these inefficient months cluster together and are not distributed uniformly among the entire time period of 10 years. This is the signal of a market condition that affects the inefficiency of the stocks for more than one month.
We construct a simple trading strategy on discretized returns to test the predictability of future returns. We consider blocks of length 4 obtained by the 4-symbol discretization. For each month, we divide blocks into two halves. The discretization is made using only the first half of a month. We consider the sequences of the first three symbols of each block. If the empirical probability of obtaining 0 or 1 after the sequence of three symbols in the first half is greater than 0.5, this sequence is from group D (decreasing). If the empirical probability of obtaining 2 or 3 after the sequence of three symbols is greater than 0.5, this sequence is from group I (increasing). Then, for the second half of the month, we determine a success if symbols 0 or 1 follow a sequence from group D or if symbols 2 or 3 follow a sequence from group I. Then, we calculate the fraction of successes. Thus, it is the probability of making a profit: sell after group D or buy after group I. In the case of market efficiency, this probability is equal to 0.5. For example, we expect that after 111, the next symbol would be 1 according to Table 5. That is, after this block, a trader can sell a stock. The fourth column of the Table 5 shows the results for a filtered return time series. The fifth column stands for the original return time series.
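The rule described above can be sketched as follows for a sequence that has already been discretized into the symbols 0 to 3. The sketch takes the symbols as given (in the text, the quantiles are computed on the first half of the month only), skips prefixes that were unseen or tied in the first half, and ignores blocks with missing values (`None`); these choices and the function name are assumptions.

```python
from collections import defaultdict

def success_fraction(symbols, k=4):
    """Fraction of correct up/down calls in the second half of the blocks."""
    blocks = [tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1)
              if None not in symbols[i:i + k]]
    half = len(blocks) // 2
    train, test = blocks[:half], blocks[half:]
    up = defaultdict(int)      # times symbol 2 or 3 followed the prefix
    total = defaultdict(int)   # times the prefix was seen
    for b in train:
        total[b[:-1]] += 1
        up[b[:-1]] += b[-1] >= 2
    wins = trades = 0
    for b in test:
        prefix = b[:-1]
        if total[prefix] == 0 or 2 * up[prefix] == total[prefix]:
            continue                                  # unseen prefix or a tie: no trade
        predict_up = 2 * up[prefix] > total[prefix]   # group I, otherwise group D
        trades += 1
        wins += predict_up == (b[-1] >= 2)
    return wins / trades if trades else float("nan")
```

A value close to 0.5 is what one would expect from an unpredictable sequence, while values well above 0.5 indicate patterns that persist from the first to the second half of the month.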
For all cases, the probability is greater than 0.5. Obviously, the probabilities for the original return time series are greater than those for the filtered return time series. The reason is that predictability for the original return time series follows from the sources of apparent inefficiencies.
The same analysis is performed for the RSTI stock. Eleven months with the lowest efficiency rates are presented in Table 6. For the RSTI stock, the simple trading strategy provides the fraction of successes (of predicting increases and decreases in price) greater than 0.5 for all 11 months. The frequent behavior of the price of RSTI during the chosen months is a slight increase in price for several minutes in a row denoted by symbol 2.
The simple trading strategy is an illustrative example of market inefficiency. In fact, such a strategy could result in no profit when used in practice because it does not take into account transaction costs and other trading frictions. Moreover, the daily seasonality pattern is filtered out by using the entire period of analysis. That is, this method cannot be applied in real time. Finally, we consider blocks containing only observed returns and neglect the missing values in the analysis. Thus, a practical application of such a strategy should be extended to the case in which a missing value follows a sequence of three symbols.
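The steps of the simple trading strategy can be sketched as follows. This is a minimal illustration rather than the code used for Table 5: it assumes that the two halves of a month are already discretized into symbols 0 to 3, that blocks are read with a sliding window, and that missing values are encoded as `None`. The function name `success_fraction` is hypothetical.

```python
from collections import Counter, defaultdict


def blocks(symbols):
    """Yield (three-symbol context, next symbol), skipping blocks with missing values."""
    for i in range(len(symbols) - 3):
        block = symbols[i:i + 4]
        if None not in block:
            yield tuple(block[:3]), block[3]


def success_fraction(first_half, second_half, down=(0, 1), up=(2, 3)):
    """Learn groups D and I on the first half, then score them on the second half."""
    # Count which symbol follows each three-symbol context in the first half.
    followers = defaultdict(Counter)
    for context, nxt in blocks(first_half):
        followers[context][nxt] += 1

    # Group D: a decrease is more likely than not; group I: an increase is.
    group = {}
    for context, counts in followers.items():
        total = sum(counts.values())
        if sum(counts[s] for s in down) / total > 0.5:
            group[context] = "D"
        elif sum(counts[s] for s in up) / total > 0.5:
            group[context] = "I"

    # A success is a decrease after a D context or an increase after an I context.
    hits = trials = 0
    for context, nxt in blocks(second_half):
        label = group.get(context)
        if label is None:
            continue
        trials += 1
        hits += (label == "D" and nxt in down) or (label == "I" and nxt in up)
    return hits / trials if trials else float("nan")
```

Contexts whose empirical probability is exactly 0.5, or that never occur in the first half, are left unclassified and do not enter the fraction.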

3.3. Stock Market Clustering

Most of the month-long time intervals are identified as inefficient. However, is there some dependence between two stocks that are inefficient at the same time?

3.3.1. Kullback–Leibler Distance

We measure the similarity of discretized filtered returns by using the Kullback–Leibler (KL) distance (Equation (8)). We use k, the length of blocks, as the maximum value suitable for both sequences according to Equation (5). The 4-symbol discretization is used. The Kullback–Leibler divergence $D_{KL}(P \| Q)$ is calculated using empirical frequencies. The entropy rates are calculated using Equation (7). Using the Kullback–Leibler distance for all pairs of stocks, we cluster them into three groups using hierarchical clustering with the UPGMA algorithm [45]. This algorithm is implemented by using the Python function cluster.hierarchy.dendrogram with the argument distance=average. The result is in Figure 4. Combining companies into one cluster means that their stocks have a common behavior that is not related to the value of volatility, the degree of price staleness, and the structure of microstructure noise.
It can be seen that banks and oil companies are clustered together (right). There is a group of four stocks (RTKM, HYDR, AFLT, and MGNT) that have nothing in common at first glance. The remaining group (left) mainly consists of metallurgy companies. However, there is no visible distinction between the stocks of banks and oil companies. According to the clustering tree, the two telecommunications companies differ significantly, as do the two electricity companies.
Finally, the two stocks with the lowest efficiency rates, RSTI and MTLR, are the furthest (in the sense of KL distance) from any other stock. That is, there are no stocks that behave similarly to these two stocks.
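The clustering step can be sketched as follows. The sketch does not reproduce Equation (8): as a stand-in it uses a plain symmetrized Kullback–Leibler divergence between smoothed block frequencies, with a fixed block length k = 2. In SciPy, UPGMA corresponds to `linkage(..., method="average")`. The helper names `block_probs`, `symmetric_kl`, and `upgma_labels` are hypothetical.

```python
from collections import Counter
from itertools import product

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def block_probs(symbols, k=2, n_symbols=4, eps=1e-6):
    """Smoothed empirical probabilities of all overlapping blocks of length k."""
    counts = Counter(tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1))
    freq = np.array([counts[b] for b in product(range(n_symbols), repeat=k)], dtype=float)
    freq += eps  # assumption of this sketch: avoids log(0) for blocks that never occur
    return freq / freq.sum()


def symmetric_kl(x, y, k=2):
    """Symmetrized divergence D(P||Q) + D(Q||P) between two block distributions."""
    p, q = block_probs(x, k), block_probs(y, k)
    return float(np.sum((p - q) * np.log(p / q)))


def upgma_labels(series, n_clusters=3, k=2):
    """Cluster symbol sequences with average linkage (UPGMA)."""
    n = len(series)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(series[i], series[j], k)
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Cutting the tree at a fixed height, as done with the threshold in Figure 4, corresponds to calling `fcluster` with `criterion="distance"` instead of `"maxclust"`.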

3.3.2. Entropy of Co-Movement

Now, we consider another measure of difference between two stocks: the entropy of co-movement. We calculate the Shannon entropy of the discretization describing the movement of a pair of prices presented in Equation (4). We consider only minutes that are in common for both stocks. For these minutes, we consider values of residuals obtained after ARMA fitting. The result is in Figure 5.
The two companies related to telecommunications form a separate cluster. Three metallurgy companies (MAGN, CHMF, and NLMK) also cluster together. Stocks of oil companies and banks form another cluster. The same cluster, with the exception of TATN (oil industry), was also formed in the previous section. The “closeness” of the stocks GAZP and SBER is detected both in this and in the previous section. The three stocks on the left that join other stock clusters last are the stocks with the lowest efficiency rates.
Some clusters may form on the basis that companies belong to the same industry. The division of companies into industries is noticeable from the dendrogram in Figure 5. However, this criterion does not explain all clusters. For instance, GMKN from metallurgy is in the cluster of oil companies and banks.
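To make the idea of a co-movement entropy concrete, the sketch below computes a normalized Shannon entropy of joint up/down moves of two aligned residual series. The encoding is an assumption made for illustration and is not the discretization of Equation (4): each common minute is mapped to one of four symbols according to the signs of the two residuals, with zero treated as a downward move. The function name `comovement_entropy` is hypothetical.

```python
import math
from collections import Counter


def comovement_entropy(x, y):
    """Entropy (base 4, so in [0, 1]) of the joint sign pattern of two aligned series."""
    # Keep only the minutes observed for both stocks (missing values are None).
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    # Symbols: 0 = both down, 1 = only y up, 2 = only x up, 3 = both up.
    counts = Counter(2 * (a > 0) + (b > 0) for a, b in pairs)
    n = len(pairs)
    return -sum(c / n * math.log(c / n, 4) for c in counts.values())
```

Under this encoding, two series that always move in the same direction use only two of the four symbols and have low entropy, while independent series approach the maximum of 1.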

4. Discussion and Conclusions

We have investigated the predictability of the Moscow Stock Exchange. We are interested in a measure of market inefficiency that is not related to known sources of regularity in financial time series. Usually, these sources are not filtered out, and accordingly, their impact is taken into account in the degree of price predictability (see, e.g., [16,17,18]).
We have focused on two sources of regularity, namely volatility clustering and price staleness [20]. The process of filtering volatility clustering was performed in [19] by estimating volatility using the exponentially weighted moving average. We have developed a modification of the volatility estimation by taking into consideration the effect of price staleness. Price staleness produces excess 0-returns that affect the estimation of volatility. Another approach to estimating volatility in the presence of 0-returns was proposed in [21], where all 0-returns are re-evaluated by an expectation-maximization algorithm. In our approach, we separate 0-returns that may have resulted from rounding from those that are due to price staleness. Thus, we also filter out apparent inefficiency due to price staleness. Our approach, which combines the estimates of volatility and of the degree of staleness, can be used for a real-time causal analysis, since only past observations are used.
One clear advantage of the proposed approach lies in its simplicity: there is only one smoothing parameter in the method, and it can be optimized using historical data. We fix the value of the smoothing parameter at 0.05. In the literature, the smoothing parameter α is usually taken close to 0. Using the principle of the best one-step forecasting, the smoothing parameter is set to 0.06 for daily data and to 0.03 for monthly data [33]. The value of the parameter α is set to 0.12 for in-sample testing and to 0.22 for out-of-sample testing in [46]. Hunter [32] suggests using $\alpha = 0.2 \pm 0.1$.
We used the Shannon entropy as a measure of randomness to infer the degree of (in)efficiency of the Moscow Stock Exchange. We used two types of discretization of the return time series to test efficiency more reliably for each month. The 4-symbol discretization helps find more price movements that lead to market inefficiency than the 3-symbol discretization. Eighty percent of the months over the period from 2012 to 2021 are classified as inefficient. According to Risso [24], a higher level of efficiency corresponds to more developed markets. Deviation from efficiency is a frequent phenomenon in various markets. For example, the authors of [14] conclude that the Colombo Stock Exchange is only 10.5% efficient, while the Pakistan Stock Exchange is 23.7% efficient. Cajueiro and Tabak [10] have shown that Asian markets are less efficient than Latin American markets. The authors of [28] estimate the efficiency of the Tunisian stock market as 97%. There are periods of inefficiency for some stocks traded at the Tel-Aviv stock exchange, as stated in [15]. Short periods of inefficiency were also detected for US stock markets in [25].
By investigating the discretized values of filtered price returns, we came to the following conclusions:
  • Even after filtering out all known sources of regularity, most months contain signals of market inefficiency.
  • The most inefficient months are grouped together for two stocks exhibiting the lowest efficiency rates.
  • For such months, discretized price returns before and after filtering out apparent inefficiencies are predictable.
  • We introduced the entropy of co-movement. Stock prices display common patterns that have an interpretation in terms of the sector that the stock belongs to.
  • The stocks of banks and oil companies cluster together in terms of co-inefficiency for the case of the Moscow stock exchange.
One possible improvement to stock clustering is to modify the entropy of co-movement such that it is possible to define a proper distance function. This is left for future research. The proposed method for measuring market efficiency using the Shannon entropy can be applied in other markets of different countries. In this study, we used monthly time intervals for entropy calculation. Our future work will be related to the optimization of the length of return time series. One problem is to find a significant decrease in entropy without using Monte Carlo simulations. We also plan to switch to a higher frequency (less than one minute) to analyze the predictability of financial time series.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/e24091184/s1.

Author Contributions

Conceptualization, A.S., S.M. and P.M.; data curation, A.S. and S.M.; methodology, A.S.; software, A.S.; formal analysis, A.S.; supervision, S.M. and P.M.; writing—original draft, A.S.; writing—review and editing, P.M. and S.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the research project “Dynamics and Information Research Institute—Quantum Information, Quantum Technologies” within the agreement between UniCredit Bank and Scuola Normale Superiore. We are grateful to UniCredit Bank R&D group for financial support through the “Dynamics and Information Theory Research Institute” at the Scuola Normale Superiore.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Python codes (v. 3.9.5) used in this study are available in Supplementary materials. The dataset analyzed in this study can be found here: https://www.finam.ru/profile/moex-akcii/gazprom/export/ (accessed on 17 August 2022).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Data Cleaning and Whitening

Appendix A.1. Outliers

We use the outlier detection method introduced in [31]. The algorithm finds price values that are too far from the mean in relation to the standard deviation. The algorithm deletes a price $P_i$ if
$$|P_i - \bar{P}_i(k)| \geq c\, s_i(k) + \gamma,$$
where $\bar{P}_i(k)$ and $s_i(k)$ are, respectively, the $\delta$-trimmed sample mean and the standard deviation of the $k$ prices recorded closest to time $i$. The $\delta$% lowest and the $\delta$% highest observations are discarded when the mean and the standard deviation are calculated from the sample. The parameters are $k = 20$, $\delta = 5$, $c = 5$, and $\gamma = 0.05$.
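The outlier rule can be sketched in a few lines. This is an illustrative sketch, not the implementation of [31]: it assumes that the k closest prices are the k nearest observations by position (excluding the price under test), with the window shifted at the edges of the series. The function name `flag_outliers` is hypothetical.

```python
import statistics


def flag_outliers(prices, k=20, delta=5, c=5.0, gamma=0.05):
    """Return one boolean per price: True if the price would be deleted."""
    n = len(prices)
    flags = []
    for i in range(n):
        # Window of k neighbours around i, shifted at the edges, without prices[i].
        lo = max(0, min(i - k // 2, n - k - 1))
        window = sorted(prices[lo:i] + prices[i + 1:lo + k + 1])
        # Discard the delta% lowest and the delta% highest observations.
        cut = int(len(window) * delta / 100)
        trimmed = window[cut:len(window) - cut]
        mean = statistics.fmean(trimmed)
        sd = statistics.stdev(trimmed) if len(trimmed) > 1 else 0.0
        flags.append(abs(prices[i] - mean) >= c * sd + gamma)
    return flags
```

The constant γ keeps the rule from firing when the prices in the window are almost constant and the trimmed standard deviation is close to zero.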

Appendix A.2. Stock Splits

We check the condition $|r| > 0.2$ in the return series to detect unadjusted splits. A split is a change in the number of a company’s shares and in the price of a single share such that the market capitalization does not change. No unadjusted splits were found.

Appendix A.3. Intraday Volatility Pattern

The volatility of intraday returns has periodic behavior. The volatility is higher near the opening and the closing of the market, so it shows a U-shaped profile every day. The intraday volatility pattern is filtered out from the return series by using the following model. We define deseasonalized returns as follows:
$$\tilde{R}_{d,t} = \frac{R_{d,t}}{\xi_t},$$
where
$$\xi_t = \frac{1}{N_{days}} \sum_{d} \frac{|R_{d,t}|}{s_d},$$
$R_{d,t}$ is the raw return of day $d$ and intraday time $t$, $s_d$ is the standard deviation of absolute returns of day $d$, and $N_{days}$ is the number of days in the sample.
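A minimal sketch of this deseasonalization is given below. It assumes that returns are stored as one list per day with the same intraday grid and that missing minutes are encoded as `None`; with missing values, the average for each intraday time is taken over the days on which that minute is observed, which is an assumption of the sketch. The population standard deviation is used for $s_d$, and the function names are hypothetical.

```python
import statistics


def intraday_pattern(returns_by_day):
    """xi_t: average over days of |R_{d,t}| / s_d for each intraday time t."""
    n_slots = len(returns_by_day[0])
    sums = [0.0] * n_slots
    counts = [0] * n_slots
    for day in returns_by_day:
        abs_returns = [abs(r) for r in day if r is not None]
        if len(abs_returns) < 2:
            continue
        s_d = statistics.pstdev(abs_returns)  # std of absolute returns of the day
        if s_d == 0:
            continue
        for t, r in enumerate(day):
            if r is not None:
                sums[t] += abs(r) / s_d
                counts[t] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


def deseasonalize(returns_by_day):
    """Divide each return by the pattern value of its intraday time."""
    xi = intraday_pattern(returns_by_day)
    return [
        [None if r is None else (r / xi[t] if xi[t] > 0 else r) for t, r in enumerate(day)]
        for day in returns_by_day
    ]
```

Because the whole sample is used to estimate $\xi_t$, this step is not causal, which is consistent with the remark that the seasonality filter cannot be applied in real time.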

Appendix A.4. Heteroskedasticity

Different days have different levels of deviation of the deseasonalized returns $\tilde{R}$. In order to remove this heteroskedasticity, we estimate the volatility $\bar{\sigma}_t$ as described in Appendix B. We define the standardized returns as follows:
$$r_t = \frac{\tilde{R}_t}{\bar{\sigma}_t}.$$
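The standardization can be illustrated with a plain exponentially weighted moving average. This is a sketch under stated assumptions and not the estimator of Appendix B: the update rule on squared returns stands in for $Sig_1$, which is defined in the main text, there is no treatment of 0-returns, and $\mu_1$ is taken to be $E|Z| = \sqrt{2/\pi}$ for a standard normal $Z$. The function names are hypothetical.

```python
import math


def ewma_volatility(returns, alpha=0.05):
    """Volatility path in which sigma[t] for t >= 1 depends only on earlier returns."""
    mu1 = math.sqrt(2.0 / math.pi)  # assumed value of mu_1: E|Z| for standard normal Z
    sigma = [abs(returns[0]) / mu1]  # initial value, as in Step 0 of Appendix B
    for t in range(1, len(returns)):
        previous = sigma[-1]
        sigma.append(math.sqrt(alpha * returns[t - 1] ** 2 + (1 - alpha) * previous ** 2))
    return sigma


def standardize(returns, alpha=0.05):
    """Divide each return by its volatility estimate; None where the estimate is zero."""
    sigma = ewma_volatility(returns, alpha)
    return [r / s if s > 0 else None for r, s in zip(returns, sigma)]
```

A small α gives a slowly varying volatility estimate, while a large α makes the estimate follow the most recent return closely.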

Appendix A.5. Price Staleness

If transaction costs are high, the price is updated less frequently, even if the trading volume is not zero. This effect is called price staleness and is discussed in Section 2.2.2. We identify 0-returns appearing due to rounding (and not due to price staleness) using Equation (3). Other 0-returns are set as missing values, as shown in Appendix B.

Appendix A.6. Microstructure Noise

The last step in filtering apparent inefficiencies is filtering out microstructure noise. The microstructure effects are caused by transaction costs and price rounding. We consider the residuals of an ARMA(P,Q) model of the standardized returns after filtering out 0-returns. We apply the method introduced in [47] to find the residuals of an ARMA(P,Q) model by using the Kalman filter. We select the values of P and Q that minimize the value of BIC [48] such that $P + Q < 6$. The values of P and Q are chosen for each calendar year and are used for the next year. For the year 2012, we select $P = 0$ and $Q = 1$, corresponding to an MA(1) model.

Appendix B. Algorithm

The aim of the algorithm is to estimate volatility and filter out excess 0-returns due to price staleness. Some 0-returns appear due to price rounding. These 0-returns are kept in the data. First, we set the number of 0-returns “to save” to $N_{save} = 0$ and the first value of a cumulative function to $Z_1 = 0$. The cumulative function is updated as $Z_t = Z_{t-1} + p_t$ if $r_{t-1}$ is not defined as missing due to staleness. Each time $\lfloor Z_t \rfloor - \lfloor Z_{t-1} \rfloor = 1$, $N_{save}$ is increased by 1.
We notice that the first non-zero return after a row of 0-returns due to staleness is the sum of all missing returns generated by a hidden efficient price. This return is also set as missing. However, the value of the return used for estimating volatility is calculated as its expected value, $\hat{r}_{n-1} = \frac{r_{n-1}}{N_0 + 1}$, where $N_0$ is the number of missing values strictly before the non-zero return $r_{n-1}$. The same applies to initially missing values, e.g., due to no trading or errors in collecting the data.
Another assumption is that a 0-return appears due to staleness if the previous return was also 0 and was defined to appear due to staleness. We include this rule, since we assume that it is more likely that two consecutive 0-returns appear due to high transaction costs than due to rounding (that is, simply speaking, that two outcomes of generating Gaussian random variables are less than a tick size).
Generally, for the estimation of volatility at time $t$, we should consider three cases: $P_{t-1}$ was missing (or minute $t-1$ is non-trading), $r_{t-1} = 0$, and $r_{t-1} \neq 0$. Thus, the algorithm is the following. We provide the algorithm for the case of $Sig_1$, which is used in the application to real data. We remove all 0-returns that start the sequence.

Pseudocode

Step 0: $\bar{\sigma}_1 = |r_1| / \mu_1$; $Z_1 = 0$, $N_{save} = 0$; $N_0 = 0$.
For $t$ from 2 to $N$, where $N$ is the length of the time series:
Step 1:
  • If $r_{t-1}$ is missing: $\bar{\sigma}_t = \bar{\sigma}_{t-1}$; increase $N_0$ by the number of consecutive missing prices
  • Else if $r_{t-1} = 0$:
    If $N_{save} > 0$ and $N_0 = 0$: $N_{save} = N_{save} - 1$, $\bar{\sigma}_t = Sig_1(\alpha, 0, \bar{\sigma}_{t-1})$
    Else: $\bar{\sigma}_t = \bar{\sigma}_{t-1}$, $N_0 = N_0 + 1$, $r_{t-1}$ = missing
  • Else: $\bar{\sigma}_t = Sig_1\left(\alpha, \frac{r_{n-1}}{N_0 + 1}, \bar{\sigma}_{t-1}\right)$, $N_0 = 0$
Step 2:
  • Calculate $p_t$ (Equation (3))
  • If $r_{t-1}$ is not missing, $Z_t = Z_{t-1} + p_t$
  • If $\lfloor Z_t \rfloor - \lfloor Z_{t-1} \rfloor = 1$, $N_{save} = N_{save} + 1$
Finally, we check whether the effect of staleness really exists in the price time series. We compute
$$\hat{p} = \frac{\sum_i p_i}{N}, \qquad \hat{q} = 1 - \hat{p}, \qquad Var = \hat{p}\,\hat{q}\,N.$$
If $N_{real} \leq \sum_i p_i + 1.96 \sqrt{Var}$, we leave the time series without placing any missing values, where $N_{real}$ is the initial number of 0-returns. The value of $\alpha$ can be selected using a training set. The optimal value of $\alpha$ minimizes the mean of $(\bar{\sigma}_t^2 - r_t^2)^2$.
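The final check can be written as a short function. The sketch assumes that the probabilities $p_i$ of a 0-return due to rounding (Equation (3)) have already been computed and are passed in as a list; the function name `staleness_detected` is hypothetical.

```python
import math


def staleness_detected(p, n_real_zeros):
    """True if there are significantly more 0-returns than rounding alone would explain."""
    n = len(p)
    expected = sum(p)                # expected number of 0-returns due to rounding
    p_hat = expected / n
    var = p_hat * (1.0 - p_hat) * n  # binomial-type variance of the count
    return n_real_zeros > expected + 1.96 * math.sqrt(var)
```

When the function returns False, the observed number of 0-returns is compatible with rounding, and the series is left without additional missing values.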

Appendix C. A Predictable Time Series with Entropy at Maximum

The goal of this section is to construct a price model where entropy is high because of discretization. This model shows that a high entropy value may be caused by discretization, but not because of the randomness of a return time series.
There are equal probabilities of having symbols 0, 1, and 2. Symbol 1 corresponds to log-returns, $r$, equal to $-0.4$, and symbol 2 corresponds to log-returns equal to $0.4$. The structure of symbol 0 is more complicated. It covers three other symbols: 3, 4, and 5. They correspond to log-returns $-0.3$, $0.1$, and $0.2$, respectively. One of the symbols 3, 4, or 5 appears with a probability depending on the previous value of these symbols. The probabilities are presented in Table A1. Given the symbol in a row, the table lists the probabilities of obtaining next the symbol in each column.
Table A1. Transition probabilities.

| First Symbol | 3 | 4 | 5 |
|---|---|---|---|
| 3 | 1/6 | 1/3 | 1/2 |
| 4 | 1/2 | 1/6 | 1/3 |
| 5 | 1/3 | 1/2 | 1/6 |

Rows stand for the first symbol of a block, and columns stand for the second symbol.
The model implies an average return of zero. However, a profitable trading strategy exists: after symbol 3, a trader should buy, and after symbols 4 and 5, the trader should sell. Nevertheless, the entropy of the 3-symbol series is at its maximum, which should imply an absence of profitable strategies.
Considering the same example with the 4-symbol discretization, we obtain $Q_1 = -0.4$, $Q_2 = 0.1$, and $Q_3 = 0.4$. Therefore, we have the following discretization of returns:
$$s = \begin{cases} 0, & r = -0.4, \\ 1, & r = -0.3 \text{ or } r = 0.1, \\ 2, & r = 0.2, \\ 3, & r = 0.4. \end{cases}$$
Thus, we can distinguish returns $r = 0.2$ from the others using the 4-symbol discretization. Table A1 provides the following probabilities for the blocks of two symbols from the 4-symbol discretization: $p(11) = \frac{7}{162}$, $p(12) = p(21) = \frac{5}{162}$, $p(22) = \frac{1}{162}$. Noting that $p(0) = p(3) = \frac{1}{3}$, $p(1) = \frac{2}{9}$, and $p(2) = \frac{1}{9}$, we calculate, with logarithms taken to base 4, that
$$H_1 = -\frac{2}{3} \log \frac{1}{3} - \frac{2}{9} \log \frac{2}{9} - \frac{1}{9} \log \frac{1}{9} \approx 0.946 < 1$$
and
$$H_2 = -\frac{1}{2} \left( \frac{7}{162} \log \frac{7}{162} + \frac{5}{81} \log \frac{5}{162} + \frac{1}{162} \log \frac{1}{162} + \frac{4}{9} \log \frac{1}{9} + \frac{8}{27} \log \frac{2}{27} + \frac{4}{27} \log \frac{1}{27} \right) \approx 0.944 < H_1.$$
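The numbers in this appendix can be checked with a short computation. The sketch below encodes the five return values with their stationary probabilities (1/3 for each of the returns ±0.4 and 1/9 for each of the symbols 3, 4, and 5), uses Table A1 for consecutive symbols from the set {3, 4, 5}, and treats all other pairs as independent. The labels `"lo"` and `"hi"` for the returns −0.4 and 0.4, and all function names, are introduced here for illustration only.

```python
import math
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

RETURNS = {3: -0.3, 4: 0.1, 5: 0.2}
T = {3: {3: F(1, 6), 4: F(1, 3), 5: F(1, 2)},   # Table A1: row = first symbol
     4: {3: F(1, 2), 4: F(1, 6), 5: F(1, 3)},
     5: {3: F(1, 3), 4: F(1, 2), 5: F(1, 6)}}
MARGINAL = {"lo": F(1, 3), 3: F(1, 9), 4: F(1, 9), 5: F(1, 9), "hi": F(1, 3)}
THREE = {"lo": 1, "hi": 2, 3: 0, 4: 0, 5: 0}    # 3-symbol discretization
FOUR = {"lo": 0, 3: 1, 4: 1, 5: 2, "hi": 3}     # 4-symbol discretization


def pair_prob(x, y):
    """Probability of observing value x followed by value y."""
    if x in T and y in T:
        return MARGINAL[x] * F(1, 3) * T[x][y]
    return MARGINAL[x] * MARGINAL[y]


def entropies(symbol_of, base):
    """Return (H_1, H_2): block entropies for block lengths 1 and 2, divided by length."""
    p1, p2 = defaultdict(F), defaultdict(F)
    for x in MARGINAL:
        p1[symbol_of[x]] += MARGINAL[x]
    for x, y in product(MARGINAL, repeat=2):
        p2[(symbol_of[x], symbol_of[y])] += pair_prob(x, y)

    def shannon(probs):
        return -sum(float(p) * math.log(float(p), base) for p in probs.values())

    return shannon(p1), shannon(p2) / 2


def expected_next(x):
    """Expected next return, given that the next symbol is again 3, 4, or 5."""
    return sum(float(T[x][y]) * RETURNS[y] for y in T[x])
```

With these definitions, `entropies(THREE, 3)` gives the maximum value of 1 for both block lengths, while `entropies(FOUR, 4)` reproduces $H_1 \approx 0.946$ and $H_2 \approx 0.944$. The sign of `expected_next` matches the trading rule: positive after symbol 3 and negative after symbols 4 and 5.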

References

  1. Samuelson, P.A. Proof that properly anticipated prices fluctuate randomly. Ind. Manag. Rev. 1965, 6, 41–49.
  2. Fama, E.F. Efficient Capital Markets: A Review of Theory and Empirical Work. J. Financ. 1970, 25, 383–417.
  3. Fama, E.F. Efficient Capital Markets: II. J. Financ. 1991, 46, 1575.
  4. Kim, J.H.; Shamsuddin, A. Are Asian stock markets efficient? Evidence from new multiple variance ratio tests. J. Empir. Financ. 2008, 15, 518–532.
  5. Linton, O.; Smetanina, E. Testing the martingale hypothesis for gross returns. J. Empir. Financ. 2016, 38, 664–689.
  6. Mandes, A. Algorithmic and High-Frequency Trading Strategies: A Literature Review; MAGKS Papers on Economics 201625; Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung): Marburg, Germany, 2016.
  7. Huang, B.; Huan, Y.; Xu, L.D.; Zheng, L.; Zou, Z. Automated trading systems statistical and machine learning methods and hardware implementation: A survey. Enterp. Inf. Syst. 2019, 13, 132–144.
  8. Grossman, S.J.; Stiglitz, J.E. On the Impossibility of Informationally Efficient Markets. Am. Econ. Rev. 1980, 70, 393–408.
  9. Cont, R. Empirical properties of asset returns: Stylized facts and statistical issues. Quant. Financ. 2001, 1, 223–236.
  10. Cajueiro, D.O.; Tabak, B.M. Ranking efficiency for emerging markets. Chaos Solitons Fractals 2004, 22, 349–352.
  11. Cajueiro, D.; Tabak, B. Ranking efficiency for emerging markets II. Chaos Solitons Fractals 2005, 23, 671–675.
  12. Drożdż, S.; Gȩbarowski, R.; Minati, L.; Oświȩcimka, P.; Watorek, M. Bitcoin market route to maturity? Evidence from return fluctuations, temporal correlations and multiscaling effects. Chaos 2018, 28, 071101.
  13. Shahzad, S.J.H.; Nor, S.M.; Mensi, W.; Kumar, R.R. Examining the efficiency and interdependence of US credit and stock markets through MF-DFA and MF-DXA approaches. Phys. A 2017, 471, 351–363.
  14. Giglio, R.; Matsushita, R.; Figueiredo, A.; Gleria, I.; Silva, S.D. Algorithmic complexity theory and the relative efficiency of financial markets. EPL 2008, 84, 48005.
  15. Shmilovici, A.; Alon-Brimer, Y.; Hauser, S. Using a Stochastic Complexity Measure to Check the Efficient Market Hypothesis. Comput. Econ. 2003, 22, 273–284.
  16. Molgedey, L.; Ebeling, W. Local order, entropy and predictability of financial time series. Eur. Phys. J. B 2000, 15, 733–737.
  17. Risso, W.A. The informational efficiency and the financial crashes. J. Int. Bus. Stud. 2008, 22, 396–408.
  18. Mensi, W.; Aloui, C.; Hamdi, M.; Nguyen, D.K. Crude oil market efficiency: An empirical investigation via the Shannon entropy. Écon. Intern. 2012, 129, 119–137.
  19. Calcagnile, L.M.; Corsi, F.; Marmi, S. Entropy and Efficiency of the ETF Market. Comput. Econ. 2020, 55, 143–184.
  20. Bandi, F.M.; Kolokolov, A.; Pirino, D.; Renò, R. Zeros. Manag. Sci. 2020, 66, 3466–3479.
  21. Sucarrat, G.; Escribano, A. Estimation of log-GARCH models in the presence of zero returns. Eur. J. Financ. 2018, 24, 809–827.
  22. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data Via the EM Algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol. 1977, 39, 1–22.
  23. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econom. 1986, 31, 307–327.
  24. Risso, W.A. The informational efficiency: The emerging markets versus the developed markets. Appl. Econ. Lett. 2009, 16, 485–487.
  25. Alvarez-Ramirez, J.; Rodriguez, E. A singular value decomposition entropy approach for testing stock market efficiency. Phys. A 2021, 583, 126337.
  26. Degutis, A.; Novickytė, L. The efficient market hypothesis: A critical review of literature and methodology. Ekonomika 2014, 93, 7–23.
  27. Ahn, K.; Lee, D.; Sohn, S.; Yang, B. Stock market uncertainty and economic fundamentals: An entropy-based approach. Quant. Financ. 2019, 19, 1151–1163.
  28. Mahmoud, I.; Sebai, S.; Naoui, K.; Jemmali, H. Market Informational Efficiency of Tunisian Stock Market: The Contribution of Shannon Entropy. J. Econ. Financ. Adm. Sci. 2014, 6, 6–17.
  29. Coronel-Brizio, H.; Hernández-Montoya, A.; Huerta-Quintanilla, R.; Rodríguez-Achach, M. Evidence of increment of efficiency of the Mexican Stock Market through the analysis of its variations. Phys. A 2007, 380, 391–398.
  30. Dionisio, A.; Menezes, R.; Mendes, D.A. An econophysics approach to analyse uncertainty in financial markets: An application to the Portuguese stock market. Eur. Phys. J. B 2006, 50, 161–164.
  31. Brownlees, C.; Gallo, G. Financial econometric analysis at ultra-high frequency: Data handling concerns. Comput. Stat. Data Anal. 2006, 51, 2232–2245.
  32. Hunter, J.S. The Exponentially Weighted Moving Average. J. Qual. Technol. 1986, 18, 203–210.
  33. Morgan, J.; Longerstaey, J.; Spencer, M. RiskMetrics: Technical Document; J. P. Morgan: New York, NY, USA, 1996; Available online: https://www.msci.com/documents/10199/5915b101-4206-4ba0-aee2-3449d5c7e95a (accessed on 17 August 2022).
  34. Brent, R.P. An Algorithm with Guaranteed Convergence for Finding a Zero of a Function. Comput. J. 1971, 14, 422–425.
  35. Kiefer, J. Sequential Minimax Search for a Maximum. Proc. Am. Math. Soc. 1953, 4, 502.
  36. Shternshis, A.; Mazzarisi, P.; Marmi, S. Measuring market efficiency: The Shannon entropy of high-frequency financial time series. Chaos Solitons Fractals 2022, 162, 112403.
  37. Shannon, C.E. A Mathematical Theory of Communication. Bell. Syst. Tech. J. 1948, 27, 379–423.
  38. Marton, K.; Shields, P.C. Entropy and the Consistent Estimation of Joint Distributions. Ann. Probab. 1994, 22, 960–977.
  39. Grassberger, P. Entropy Estimates from Insufficient Samplings. arXiv 2003, arXiv:physics/0307138.
  40. Grassberger, P. On Generalized Schürmann Entropy Estimators. Entropy 2022, 24, 680.
  41. Kullback, S.; Leibler, R.A. On Information and Sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
  42. Benedetto, D.; Caglioti, E.; Loreto, V. Language Trees and Zipping. Phys. Rev. Lett. 2002, 88, 048702.
  43. Kolokolov, A.; Livieri, G.; Pirino, D. Statistical inferences for price staleness. J. Econom. 2020, 218, 32–81.
  44. Engle, R.F. Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica 1982, 50, 987–1007.
  45. Sokal, R.R.; Michener, C.D. A statistical method for evaluating systematic relationships. Univ. Kansas Sci. Bull. 1958, 38, 1409–1438.
  46. Bollen, B. What should the value of lambda be in the exponentially weighted moving average volatility model? Appl. Econ. 2015, 47, 853–860.
  47. Jones, R.H. Maximum Likelihood Fitting of ARMA Models to Time Series with Missing Observations. Technometrics 1980, 22, 389–395.
  48. Schwarz, G. Estimating the Dimension of a Model. Ann. Stat. 1978, 6, 461–464.
Figure 1. Minimum of efficiency rates for 18 stocks using 3- and 4-symbol discretizations.
Figure 2. Efficiency rate for the MTLR stock using 3- and 4-symbol discretizations.
Figure 3. Efficiency rate for the RSTI stock using 3- and 4-symbol discretizations.
Figure 4. Hierarchical clustering tree using KL distance. The threshold for clustering into groups is 0.035.
Figure 5. Hierarchical clustering tree using the entropy of co-movement. The threshold for clustering into groups is 0.989.
Table 1. Stocks of Russian companies traded at Moscow Exchange.

| Ticker | Company | Sector | Size | Outliers |
|---|---|---|---|---|
| GAZP | Gazprom | Oil | 1,307,427 | 50 |
| LKOH | Lukoil | Oil | 1,287,582 | 192 |
| ROSN | Rosneft | Oil | 1,270,592 | 130 |
| SNGS | Surgutneftegaz | Oil | 1,211,809 | 11 |
| TATN | Tatneft | Oil | 1,191,390 | 174 |
| SBER | Sberbank | Bank | 1,309,402 | 37 |
| VTBR | VTB Bank | Bank | 1,287,330 | 0 |
| CHMF | Severstal | Metal | 1,214,735 | 157 |
| NLMK | Novolipetsk Steel | Metal | 1,194,324 | 58 |
| GMKN | Nornikel | Metal | 1,272,769 | 197 |
| MTLR | Mechel | Metal | 1,084,990 | 161 |
| MAGN | Magnitogorsk Iron and Steel Works | Metal | 1,106,771 | 13 |
| MTSS | Mobile TeleSystems | Telecommunications | 1,153,527 | 260 |
| RTKM | Rostelecom | Telecommunications | 1,140,798 | 134 |
| HYDR | RusHydro | Electric utility | 1,252,584 | 0 |
| RSTI | Rosseti | Electricity | 1,094,244 | 0 |
| AFLT | Aeroflot | Airline | 1,083,552 | 123 |
| MGNT | Magnit | Food retailer | 1,184,223 | 544 |

For each company, we specify the ticker of the stock, its sector, the size of the data, and the number of outliers removed. The size is given as the number of minutes with trading activity.
Table 2. Results on volatility estimation.

| Model | MAPE, $v_1$ | MAPE, $v_2$ | MAPE with $\alpha = 0.05$, $v_1$ | MAPE with $\alpha = 0.05$, $v_2$ | MAPE w/o 0-Filtering, $v_1$ | MAPE w/o 0-Filtering, $v_2$ |
|---|---|---|---|---|---|---|
| $\sigma_1, q_1$ | 0.0193 (0.0007, 0.0507) | **0.017** (0.0014, 0.0406) | 0.0975 (0.0955, 0.0995) | 0.0897 (0.0878, 0.0915) | 0.0193 (0.0007, 0.0507) | **0.017** (0.0014, 0.0406) |
| $\sigma_1, q_2$ | **0.0607** (0.0245, 0.1017) | 0.0629 (0.0293, 0.1057) | 0.095 (0.093, 0.0972) | 0.0914 (0.0893, 0.0936) | 0.0862 (0.0459, 0.131) | 0.0674 (0.0294, 0.1154) |
| $\sigma_1, q_3$ | **0.0737** (0.0333, 0.1278) | 0.0756 (0.033, 0.1338) | 0.0948 (0.0928, 0.0971) | 0.0915 (0.0894, 0.094) | 0.138 (0.0917, 0.1863) | 0.0888 (0.0368, 0.1592) |
| $\sigma_1, q_4$ | **0.0716** (0.0323, 0.1213) | 0.0739 (0.0354, 0.1268) | 0.0949 (0.0926, 0.0973) | 0.0913 (0.089, 0.0937) | 0.1404 (0.1022, 0.1875) | 0.0873 (0.0405, 0.1516) |
| $\sigma_2, q_1$ | 0.1121 (0.1082, 0.121) | 0.1183 (0.1146, 0.1244) | 0.1459 (0.1438, 0.1481) | 0.1446 (0.1422, 0.147) | **0.1118** (0.108, 0.1207) | 0.1179 (0.1144, 0.1243) |
| $\sigma_2, q_2$ | 0.1359 (0.1163, 0.1715) | 0.1411 (0.1237, 0.1765) | 0.1462 (0.1439, 0.1487) | 0.1489 (0.1457, 0.1526) | **0.1341** (0.1043, 0.1819) | 0.1407 (0.1193, 0.1832) |
| $\sigma_2, q_3$ | **0.146** (0.1198, 0.1958) | 0.1519 (0.1266, 0.1981) | 0.1473 (0.1449, 0.1499) | 0.1496 (0.1464, 0.1534) | 0.1649 (0.1196, 0.2271) | 0.1589 (0.123, 0.222) |
| $\sigma_2, q_4$ | **0.146** (0.1205, 0.1912) | 0.15 (0.1256, 0.1986) | 0.1472 (0.1447, 0.1498) | 0.1494 (0.1463, 0.1532) | 0.1696 (0.1274, 0.2261) | 0.1571 (0.1223, 0.2239) |
| $\sigma_3, q_1$ | 0.1479 (0.1446, 0.1513) | **0.1473** (0.1442, 0.1505) | 0.1495 (0.1467, 0.1522) | **0.1473** (0.1442, 0.1502) | 0.1479 (0.1446, 0.1513) | **0.1472** (0.1441, 0.1503) |
| $\sigma_3, q_2$ | 0.1592 (0.1485, 0.1891) | 0.1613 (0.1508, 0.1857) | **0.1529** (0.149, 0.1574) | 0.1546 (0.1497, 0.1598) | 0.1622 (0.144, 0.2033) | 0.1628 (0.1491, 0.1978) |
| $\sigma_3, q_3$ | 0.1681 (0.1528, 0.2048) | 0.171 (0.1556, 0.2178) | **0.1567** (0.1525, 0.1616) | 0.1584 (0.1536, 0.1639) | 0.1904 (0.154, 0.2477) | 0.1815 (0.1546, 0.2464) |
| $\sigma_3, q_4$ | 0.1668 (0.1527, 0.1997) | 0.1701 (0.1555, 0.2178) | **0.1568** (0.1525, 0.1613) | 0.1583 (0.1537, 0.1633) | 0.192 (0.1591, 0.246) | 0.181 (0.1556, 0.2455) |
| $\sigma_4, q_1$ | 0.1897 (0.1856, 0.1952) | **0.1873** (0.1838, 0.1911) | 0.1881 (0.1844, 0.1918) | 0.1924 (0.1879, 0.1968) | 0.1897 (0.1856, 0.1952) | **0.1873** (0.1837, 0.1911) |
| $\sigma_4, q_2$ | 0.2035 (0.1906, 0.2454) | 0.2057 (0.1921, 0.2474) | **0.1954** (0.1891, 0.2022) | 0.2037 (0.1961, 0.2119) | 0.2049 (0.1836, 0.2617) | 0.2079 (0.1902, 0.2642) |
| $\sigma_4, q_3$ | 0.2146 (0.1965, 0.2623) | 0.2166 (0.1996, 0.2757) | **0.2015** (0.1951, 0.2077) | 0.2101 (0.2026, 0.2177) | 0.2318 (0.1912, 0.307) | 0.2294 (0.1988, 0.3082) |
| $\sigma_4, q_4$ | 0.214 (0.1967, 0.2591) | 0.2155 (0.1986, 0.2689) | **0.2013** (0.1951, 0.2088) | 0.2097 (0.2023, 0.2185) | 0.2338 (0.1976, 0.3064) | 0.2286 (0.1988, 0.306) |

The first column indicates the model. Columns 2 and 3 represent the results for the two methods described in Section 2.2.3. Columns 4 and 5 are for the same methods but with the fixed value of $\alpha$. Columns 6 and 7 show the error of the standard EWMA approach with the optimally selected value of $\alpha$. Values highlighted in bold are the smallest errors in each row. The 95% CI is given in parentheses after each averaged statistic. $v_1$ stands for using $Sig_1$; $v_2$ stands for using $Sig_2$.
Table 3. Results upon filtering out 0-returns.
Table 3. Results upon filtering out 0-returns.
| Model | $\alpha$ for $v_1$ | $\alpha$ for $v_2$ | $Er_N$, $v_1$ | $Er_N$, $v_2$ | Fraction of Data Deleted, $v_1$ | Fraction of Data Deleted, $v_2$ |
|---|---|---|---|---|---|---|
| $\sigma_1, q_1$ | 0.0027 (0.0, 0.0137) | 0.0022 (0.0, 0.0103) | 0.0006 (0.0, 0.0) | 0.0015 (0.0, 0.0259) | 0.0001 (0.0, 0.0) | 0.0003 (0.0, 0.0052) |
| $\sigma_1, q_2$ | 0.0228 (0.0033, 0.0569) | 0.0259 (0.0044, 0.067) | 0.0094 (0.0004, 0.026) | 0.011 (0.0005, 0.0295) | 0.2005 (0.0814, 0.3244) | 0.2008 (0.0818, 0.3247) |
| $\sigma_1, q_3$ | 0.0335 (0.006, 0.0902) | 0.0379 (0.0058, 0.1063) | 0.0106 (0.0005, 0.0288) | 0.0121 (0.0005, 0.0336) | 0.3661 (0.2474, 0.481) | 0.3659 (0.246, 0.4797) |
| $\sigma_1, q_4$ | 0.0314 (0.0056, 0.0824) | 0.036 (0.007, 0.0966) | 0.0104 (0.0004, 0.0283) | 0.0122 (0.0007, 0.0355) | 0.3628 (0.2521, 0.4717) | 0.3626 (0.2515, 0.4713) |
| $\sigma_2, q_1$ | 0.0039 (0.0, 0.0161) | 0.0037 (0.0006, 0.0146) | 0.0149 (0.0, 0.0438) | 0.035 (0.0, 0.0586) | 0.0029 (0.0, 0.0092) | 0.0067 (0.0, 0.0134) |
| $\sigma_2, q_2$ | 0.035 (0.0059, 0.0903) | 0.0367 (0.0054, 0.1021) | 0.0209 (0.0012, 0.0448) | 0.0319 (0.0039, 0.0601) | 0.2016 (0.0856, 0.3249) | 0.2032 (0.0878, 0.3263) |
| $\sigma_2, q_3$ | 0.0489 (0.0079, 0.1326) | 0.0556 (0.0091, 0.1476) | 0.0217 (0.001, 0.0473) | 0.0275 (0.0022, 0.0603) | 0.3706 (0.25, 0.4835) | 0.371 (0.2518, 0.4836) |
| $\sigma_2, q_4$ | 0.049 (0.0093, 0.1268) | 0.0525 (0.0082, 0.1484) | 0.0206 (0.0012, 0.0443) | 0.0274 (0.0013, 0.0571) | 0.3645 (0.2495, 0.4695) | 0.3651 (0.2491, 0.4692) |
| $\sigma_3, q_1$ | 0.0424 (0.0349, 0.0527) | 0.048 (0.0392, 0.0603) | 0.0034 (0.0, 0.0337) | 0.0089 (0.0, 0.0402) | 0.0007 (0.0, 0.0067) | 0.0018 (0.0, 0.0085) |
| $\sigma_3, q_2$ | 0.0672 (0.0267, 0.1456) | 0.0767 (0.0249, 0.1592) | 0.0155 (0.0006, 0.0404) | 0.02 (0.0009, 0.0551) | 0.1995 (0.0826, 0.3248) | 0.1997 (0.0826, 0.3251) |
| $\sigma_3, q_3$ | 0.0825 (0.0264, 0.1734) | 0.0985 (0.0301, 0.2371) | 0.018 (0.0011, 0.0477) | 0.0243 (0.0008, 0.0702) | 0.3678 (0.2444, 0.4746) | 0.3671 (0.2421, 0.4751) |
| $\sigma_3, q_4$ | 0.0788 (0.0266, 0.163) | 0.0969 (0.0325, 0.2362) | 0.0178 (0.001, 0.0463) | 0.0222 (0.0007, 0.067) | 0.3623 (0.2466, 0.476) | 0.3615 (0.2486, 0.4739) |
| $\sigma_4, q_1$ | 0.0819 (0.0696, 0.1037) | 0.0904 (0.0757, 0.119) | 0.0013 (0.0, 0.0248) | 0.0047 (0.0, 0.0329) | 0.0003 (0.0, 0.0052) | 0.0011 (0.0, 0.0075) |
| $\sigma_4, q_2$ | 0.1132 (0.0534, 0.2359) | 0.1339 (0.0576, 0.2925) | 0.0185 (0.0007, 0.0564) | 0.0265 (0.0008, 0.087) | 0.1993 (0.0765, 0.3287) | 0.1982 (0.077, 0.3257) |
| $\sigma_4, q_3$ | 0.1338 (0.0557, 0.2678) | 0.1596 (0.0597, 0.3734) | 0.0214 (0.0009, 0.0621) | 0.0321 (0.001, 0.1119) | 0.3687 (0.2419, 0.4823) | 0.3669 (0.2378, 0.4817) |
| $\sigma_4, q_4$ | 0.1317 (0.0571, 0.263) | 0.1556 (0.0613, 0.3541) | 0.0211 (0.0008, 0.0667) | 0.0315 (0.0011, 0.108) | 0.3641 (0.2599, 0.4823) | 0.3625 (0.2564, 0.4822) |
Values of $\alpha$, errors in the number of 0-returns due to rounding, and the fraction of data marked as missing values. The first column indicates the model. The 95% CI is given in parentheses after each averaged statistic. $v_1$ stands for using $Sig_1$; $v_2$ stands for using $Sig_2$.
Table 4. The degree of inefficiency for each stock.
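| Ticker | Degree of Inefficiency | For 3 Symbols Only | For 4 Symbols Only |
|---|---|---|---|
| GAZP | 0.725 | 0.392 | 0.675 |
| LKOH | 0.65 | 0.342 | 0.542 |
| ROSN | 0.742 | 0.392 | 0.708 |
| SNGS | 0.725 | 0.4 | 0.625 |
| TATN | 0.617 | 0.392 | 0.525 |
| SBER | 0.725 | 0.433 | 0.658 |
| VTBR | 0.842 | 0.592 | 0.792 |
| CHMF | 0.858 | 0.55 | 0.692 |
| NLMK | 0.8 | 0.467 | 0.692 |
| GMKN | 0.733 | 0.475 | 0.608 |
| MTLR | 0.992 | 0.783 | 0.975 |
| MAGN | 0.833 | 0.65 | 0.758 |
| MTSS | 0.967 | 0.7 | 0.942 |
| RTKM | 0.942 | 0.683 | 0.908 |
| HYDR | 0.892 | 0.75 | 0.8 |
| RSTI | 0.917 | 0.742 | 0.875 |
| AFLT | 0.983 | 0.775 | 0.95 |
| MGNT | 0.842 | 0.667 | 0.742 |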
Fraction of inefficient months using 3-symbol and 4-symbol discretizations. Each value in the last two columns is calculated using 120 efficiency rates.
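As a reading aid for Table 4, the sketch below converts a reported fraction back into the number of months it implies, given the 120 monthly efficiency rates mentioned in the note. The function names and the boolean flag list are hypothetical, introduced only for illustration; the statistical test that decides whether a month is inefficient is not reproduced here.

```python
def inefficiency_fraction(inefficient_flags):
    """Fraction of months flagged as inefficient.

    `inefficient_flags` is a hypothetical list of booleans, one per month.
    """
    flags = list(inefficient_flags)
    if not flags:
        raise ValueError("need at least one month")
    return sum(bool(f) for f in flags) / len(flags)


def implied_month_count(fraction, n_months=120):
    """Number of months implied by a rounded fraction out of n_months."""
    return round(fraction * n_months)


# GAZP, 3-symbol column of Table 4: 0.392 out of 120 monthly efficiency rates.
gazp_3s = implied_month_count(0.392)

# Rebuild a hypothetical flag list with that many inefficient months and
# check that it rounds back to the tabulated value.
flags = [True] * gazp_3s + [False] * (120 - gazp_3s)
print(gazp_3s, round(inefficiency_fraction(flags), 3))
```

Under this reading, each entry in the last two columns is a multiple of 1/120 rounded to three decimals.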
Table 5. The most frequent blocks appearing for the MTLR stock and the probabilities of success.
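| Month of 2014 | The Most Frequent Block, 3 Symbols | The Most Frequent Block, 4 Symbols | Prob. of Success, Filtered | Prob. of Success, Original |
|---|---|---|---|---|
| January | 00000 | 1111 | 0.64 | 0.75 |
| February | 00000 | 2222 | 0.64 | 0.74 |
| May | 00000 | 1111 | 0.61 | 0.73 |
| June | 22222 | 1111 | 0.60 | 0.73 |
| July | 11111 | 1111 | 0.62 | 0.74 |
| August | 00000 | 1111 | 0.61 | 0.76 |
| September | 00000 | 1111 | 0.63 | 0.74 |
| October | 120120 | 0303 | 0.55 | 0.6 |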
The first column lists the months with the lowest efficiency rates. Columns 2 and 3 give the most frequent blocks in the 3- and 4-symbol discretizations. Columns 4 and 5 give the probability of success of the simple trading strategy applied to filtered and original price returns.
Table 6. The most frequent blocks appearing for the RSTI stock and probabilities of success.
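| Month | The Most Frequent Block, 3 Symbols | The Most Frequent Block, 4 Symbols | Prob. of Success, Filtered | Prob. of Success, Original |
|---|---|---|---|---|
| April 2014 | 212121 | 0111 | 0.63 | 0.77 |
| May 2014 | 00000 | 1111 | 0.61 | 0.73 |
| June 2014 | 00000 | 1111 | 0.6 | 0.73 |
| July 2014 | 00000 | 2222 | 0.62 | 0.74 |
| August 2014 | 00000 | 2222 | 0.61 | 0.76 |
| September 2014 | 000000 | 22222 | 0.63 | 0.74 |
| June 2015 | 00000 | 2222 | 0.54 | 0.61 |
| July 2015 | 00000 | 1111 | 0.55 | 0.6 |
| August 2015 | 00000 | 2222 | 0.54 | 0.6 |
| September 2015 | 00000 | 2222 | 0.55 | 0.61 |
| October 2015 | 11111 | 0111 | 0.56 | 0.62 |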
The first column lists the months with the lowest efficiency rates. Columns 2 and 3 give the most frequent blocks in the 3- and 4-symbol discretizations. The length of a block is defined using Equation (5). Columns 4 and 5 give the probability of success of the simple trading strategy for filtered and original price returns.
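To make the "most frequent block" columns of Tables 5 and 6 concrete, here is a minimal sketch of how such a block could be found in an already discretized symbol sequence. It is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, the block length `k` is passed in by hand rather than derived from Equation (5), and blocks are counted over overlapping windows, which is an assumption of this sketch.

```python
from collections import Counter


def most_frequent_block(symbols, k):
    """Return (block, count) for the most common block of length k.

    `symbols` is a sequence of small integers (e.g. 0, 1, 2 for a 3-symbol
    discretization). Overlapping windows are an assumption of this sketch.
    """
    if k <= 0 or k > len(symbols):
        raise ValueError("k must satisfy 1 <= k <= len(symbols)")
    counts = Counter(
        "".join(str(s) for s in symbols[i:i + k])
        for i in range(len(symbols) - k + 1)
    )
    return counts.most_common(1)[0]


# Toy 3-symbol sequence (made up for illustration, not data from the paper).
toy_sequence = [0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0]
block, count = most_frequent_block(toy_sequence, 5)
print(block, count)
```

A long run of a single symbol, as in most rows of the tables, shows up as a constant block such as `00000`.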
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Shternshis, A.; Mazzarisi, P.; Marmi, S. Efficiency of the Moscow Stock Exchange before 2022. Entropy 2022, 24, 1184. https://doi.org/10.3390/e24091184
