Entropy and Recurrence Measures of a Financial Dynamic System by an Interacting Voter System

A financial time series agent-based model is reproduced and investigated by the statistical physics system, the finite-range interacting voter system. The voter system originally describes the collective behavior of voters who constantly update their positions on a particular topic, which is a continuous-time Markov process. In the proposed model, the fluctuations of stock price changes are attributed to the market information interaction amongst the traders and certain similarities of investors’ behaviors. Further, the complexity of return series of the financial model is studied in comparison with two real stock indexes, the Shanghai Stock Exchange Composite Index and the Hang Seng Index, by composite multiscale entropy analysis and recurrence analysis. The empirical research shows that the simulation data for the proposed model could grasp some natural features of actual markets to some extent.


Introduction
In recent decades, modeling the dynamics of price fluctuations in financial markets has sparked considerable interest in both the finance and physics communities, and it is becoming a key problem in risk management, physical asset valuation and derivatives pricing. It is also increasingly important for understanding the mechanisms of financial price variations, which exhibit some interesting statistical properties, such as the fat tails phenomenon, power laws of logarithmic returns and volume, volatility clustering, multifractality of volatility, etc. [1][2][3][4][5][6][7][8][9]. Various agent-based models have been introduced to study the fluctuations in the market empirically; for instance, see [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. Some of these models are created by applying interacting particle system theories and methods, such as the percolation network [11,17,20,23,24,28], the Ising dynamic system [12,13], the contact model [10,25,26] and the voter system [19]. For example, Lux and Marchesi [14] introduced an agent-based model in which chartist agents compete with fundamentalist agents, leading to power law distributed returns as observed in real markets, which contradicts the popular efficient market hypothesis. Fang and Wang [12] developed an interacting-agent model of speculative activity, based on the stochastic Ising dynamic system, to explain price formation in financial markets. Through computer simulation and empirical study, they show that the established financial model can reproduce the main factors and reveal certain statistical characteristics of asset returns. It is widely accepted that the financial market is an evolving dynamic system that reacts to external investment information to determine the best price for a given asset. It consists of a great number of agents interacting with one another in complicated ways.
In modeling financial fluctuations, one of the most important things is to find or define a proper mechanism for the interaction of the information that market investors hold. This is also what many economists are dedicated to. The establishment of a modeling process enriches the theoretical study of financial stock pricing. The practical relevance of a thorough understanding of the mechanism governing market fluctuations lies in the benefits that it brings to asset allocation and risk management. The dynamic voter interacting system is a well-known statistical physics model, which can also be viewed as a model for non-equilibrium statistical mechanics [29][30][31][32]. In the voter process, the voters constantly update their attitudes at independent, exponentially distributed random times. At times of reconsideration, a voter chooses one neighbor uniformly from amongst all neighbors and takes that neighbor's opinion. In this sense, the voter theory can be taken to describe the decision making mechanism amongst economic agents in the market. In this paper, a financial price model is introduced by the finite-range voter system, in which we assume that the investors' attitudes towards markets lead to the fluctuations of stock prices and suppose that the interacting particles in the voter system represent the investment opinions. Applying the voter system to financial modeling may provide a good potential link between economics and physics and give a beneficial way to depict the mechanism of market agents.
Afterwards, we make an empirical study of the statistical behaviors of logarithmic returns for the proposed price model in comparison with two important Chinese stock indexes, the Shanghai Stock Exchange (SSE) Composite Index and the Hang Seng Index (HSI). We mainly focus on the exploration of the complexity of the financial time series. The multiscale entropy (MSE) method was one such method developed to quantify the relative complexity of normalized time series across multiple scales [33], which consists of two steps: (1) a coarse-graining procedure is used to derive the representations of a system's dynamics at different time scales; and (2) the sample entropy (SampEn) algorithm is used to quantify the regularity of a coarse-grained time series at each time scale factor. Various literature has seen the applications of this method to many research areas [33][34][35][36]. However, the reliability of SampEn is reduced as a time scale factor is increased. More specifically, the variance of the entropy estimator grows very quickly as the number of data points is reduced, and in the real application, the data length is not often long enough, which will lead to less reliability in distinguishing time series. A modified algorithm in [37], called composite multiscale entropy (CMSE), was introduced to overcome this weakness, in which the experimental results showed that the CMSE presented better performance on short time series than the MSE and could provide a more reliable entropy estimation by the analysis of both white and 1/f noise series. In the present paper, the CMSE analysis is adopted as a measure of the complexity of simulated data and real ones. Furthermore, nonlinear determinism can potentially explain large movements in financial data that linear stochastic models cannot account for [38]. 
Thus, we here apply the recurrence plots (which provide visual insight into the complex nonlinear deterministic patterns in time series data [39,40]) and the recurrence quantification analysis (RQA) method (that is able to quantify structure in the recurrence plots through different recurrence measures [41,42]) to the nonlinear analysis of the real market data and the simulated data derived from the financial price model.
In sum, there are two main contributions in this paper. One is to provide a financial price modeling process, in which the information interaction mechanism of agents in the stock market is described by the finite-range voter system. The other is to study the complexity behaviors of the simulated time series by the CMSE method (a recently proposed method in engineering) and the well-known RQA technique, respectively. We hope that our study could enrich the modeling and statistical analysis of the financial market.

Price Process Modeling by a Finite-Range Voter System
In this section, we construct a price simulation process according to the mechanisms of a finite-range biased voter interacting system. First, we give a mathematical description of the finite-range biased voter model. The voter model is an interacting particle system [29][30][31][32]. Let $\eta_s$ denote the state of the process at time $s$, with $\eta_s(x) \in \{0, 1\}$ for each site $x \in \mathbb{Z}^d$, and let $\eta_s^{\{0\}}$ denote the process started from a single occupied site, $\eta_0 = \{0\}$. More generally, we consider the initial distribution $\upsilon_\theta$, the product measure with density $\theta$ (each site is independently occupied with probability $\theta$), and let $\eta_s^\theta$ denote the voter model with initial distribution $\upsilon_\theta$. More formally, the stochastic dynamics of the voter model $\eta_s$ is a Markov process on the configuration space $\{0, 1\}^{\mathbb{Z}^d}$, whose generator has the form:
$$\Omega g(\eta) = \sum_{x \in \mathbb{Z}^d} c(x, \eta)\,[g(\eta^x) - g(\eta)],$$
where $g$ is a function on $\{0, 1\}^{\mathbb{Z}^d}$ that depends on finitely many coordinates, $\eta^x$ is the configuration $\eta$ with the state at $x$ flipped, and $c(x, \eta)$ is the transition rate function for the process, which is given by the following (see [30][31][32]). For any $x \in \mathbb{Z}^d$, the state of $x$ flips according to the transition rates:
$$c(x, \eta) = \begin{cases} \sum_{y \in \mathbb{Z}^d} p(x, y)\, I_{\{\eta(y) = 0\}}, & \eta(x) = 1, \\ \lambda \sum_{y \in \mathbb{Z}^d} p(x, y)\, I_{\{\eta(y) = 1\}}, & \eta(x) = 0, \end{cases}$$
where $I$ is the indicator function, $p(x, y) \geq 0$ for $x, y \in \mathbb{Z}^d$ and $\sum_{y \in \mathbb{Z}^d} p(x, y) = 1$ for all $x \in \mathbb{Z}^d$. Here, we suppose that the transition probability $p(x, y)$ is translation invariant and symmetric and such that the Markov chain with those transition probabilities is irreducible [15,43]. If a site $x \in \mathbb{Z}^d$ is occupied by a one (resp. zero), then at rate one (resp. $\lambda$), it picks a site $y \in \mathbb{Z}^d$ with probability $p(x, y)$ and adopts the state of the individual at $y$. For the biased voter model ($\lambda > 1$), there exists a "critical value" for the process, which is defined as $\lambda_c = \inf\{\lambda : P(\eta_s^{\{0\}} \neq \emptyset \text{ for all } s \geq 0) > 0\}$. The results for this critical value imply that, on a $d$-dimensional lattice, the process becomes vacant exponentially fast for $\lambda < \lambda_c$, while it survives with positive probability for $\lambda > \lambda_c$.
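To make the flip mechanism concrete, the following Python sketch evaluates the flip rate of a single site in a one-dimensional version of the model. It is only an illustration under stated assumptions: the function name `flip_rate` is hypothetical, the lattice is a finite ring with periodic boundaries, and $p(x, y)$ is taken to be uniform over the $2R$ sites within distance $R$ of $x$, which is one translation-invariant and symmetric choice among many.

```python
def flip_rate(eta, x, lam, R):
    """Flip rate of site x for a 1-D biased voter model on a ring.

    Assumptions (not from the text): periodic boundaries, len(eta) > 2 * R,
    and p(x, y) uniform over the 2R sites within distance R of x.
    """
    n = len(eta)
    neighbours = [(x + d) % n for d in range(-R, R + 1) if d != 0]
    p = 1.0 / len(neighbours)
    if eta[x] == 1:
        # A one adopts the state of a zero neighbour at rate 1 * p(x, y).
        return sum(p for y in neighbours if eta[y] == 0)
    # A zero adopts the state of a one neighbour at rate lam * p(x, y).
    return lam * sum(p for y in neighbours if eta[y] == 1)
```

With $\lambda > 1$, a zero surrounded by ones flips faster than a one surrounded by zeros, which is the bias in favour of ones described above.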
We introduce the graphical representation of a one-dimensional biased voter model on the configuration space $\{0, 1\}^{\mathbb{Z}}$ [31,32], since the graphical representation is very useful to illustrate and simulate the model. We start by constructing the process $\eta_s$ from a collection of Poisson processes in the case $\lambda \geq 1$. For each pair $x, y \in \mathbb{Z}$ with $|x - y| \leq R$ ($R$ is the finite range), let $\{T_n^{(x,y)} : n \geq 1\}$ and $\{U_n^{(x,y)} : n \geq 1\}$ be independent Poisson processes with rate one and $\lambda - 1$, respectively. At times $T_n^{(x,y)}$, we draw an arrow from $y$ to $x$ and put a $\delta$ at $x$. At times $U_n^{(x,y)}$, we just draw an arrow from $y$ to $x$. Then, the process is obtained from the graphical representation as follows: At time $T_n^{(x,y)}$, the state of site $x$ imitates the state of site $y$, i.e., $x$ becomes occupied by a one (resp. zero) if site $y$ is occupied by a one (resp. zero). At time $U_n^{(x,y)}$, the site $x$ becomes occupied by a one if $y$ is occupied by a one, and the state of site $x$ is not affected if $y$ is occupied by a zero. An illustration of the construction of the graphical representation for a one-dimensional biased voter model with neighbor range $R = 3$ is presented in Figure 1. We imagine fluid entering the bottom and flowing up the structure. The $\delta$'s are like dams, and the arrows are like pipes, which allow the fluid to flow in the indicated direction.
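The graphical construction translates fairly directly into an event-driven simulation. The Python sketch below is a minimal illustration, not the simulation code used for the results in this paper: the function name and arguments are hypothetical, the lattice is the finite segment of $2M + 1$ sites with no periodic wrapping, and the superposition of all Poisson processes is sampled by drawing exponential waiting times at the total rate.

```python
import random


def simulate_biased_voter(M, R, lam, theta, t_end, seed=0):
    """Run a 1-D finite-range biased voter model up to time t_end.

    Assumptions (illustrative): lam >= 1, sites 0..2M stand for -M..M,
    and arrows only connect sites that both lie inside the segment.
    """
    rng = random.Random(seed)
    n = 2 * M + 1
    # Initial product measure: each site is a one with probability theta.
    eta = [1 if rng.random() < theta else 0 for _ in range(n)]
    # Ordered pairs (x, y) with 0 < |x - y| <= R: an arrow from y to x.
    pairs = [(x, y) for x in range(n)
             for y in range(max(0, x - R), min(n, x + R + 1)) if y != x]
    # Each pair carries a rate-1 process (T) and a rate-(lam - 1) process (U).
    total_rate = lam * len(pairs)
    t = 0.0
    while True:
        t += rng.expovariate(total_rate)
        if t > t_end:
            break
        x, y = pairs[rng.randrange(len(pairs))]
        if rng.random() < 1.0 / lam:
            eta[x] = eta[y]   # T event: delta at x, then arrow from y to x
        elif eta[y] == 1:
            eta[x] = 1        # U event: arrow only, so only ones spread
    return eta
```

The all-zero and all-one configurations are absorbing in this sketch, which gives a quick sanity check on the update rules.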
In the following, a financial price simulation model is developed by applying the finite-range biased voter dynamic system with neighbor range $R$. In this model, we assume that the investors' investment attitudes towards the financial market lead to fluctuations of stock prices and suppose that the investment attitudes are represented by the interacting particles in the biased voter model. We classify the investment attitudes into buying, selling and neutral ones, which correspondingly sort the investors into three groups according to the attitudes that they hold. We assume that each trader can trade the stock several times on each day $t \in \{1, 2, \cdots, N\}$, but at most one unit of the stock each time. Let $l$ be the time length of a trading day; we denote the stock price at time $s$ on the $t$-th trading day by $P_t(s)$, where $s \in [0, l]$. Suppose that the stock market consists of $2M + 1$ ($M$ is large enough) traders, who are located on a line $\{-M, \cdots, -1, 0, 1, \cdots, M\} \subset \mathbb{Z}$ (similarly for a $d$-dimensional lattice $\mathbb{Z}^d$). At the very beginning of each trading day, we select a certain proportion of traders (with the initial distribution $\upsilon_\theta$) randomly in the system and consider them as those who receive some market news. We define a random variable $\zeta_t$ with the values $+1$, $-1$, $0$ to represent that these investors hold a buying opinion, selling opinion or neutral opinion with probabilities $p_{+1}$, $p_{-1}$ or $1 - (p_{+1} + p_{-1})$, respectively. Then, these investors send a bullish, bearish or neutral signal to their finite-range neighbors. According to the $d$-dimensional voter process system, investors can affect each other or the news can be spread, which is supposed to be the main factor of price fluctuations for the market. The aggregate excess demand for the asset at time t is defined by: , and $M$ may depend on the number of trading days $N$.
From the above description and [15,43,44], we define the simulation formula of a discrete time stock price as follows: where $\beta$ ($> 0$) represents the depth parameter of the market, which measures the sensitivity of price fluctuation in response to the change in excess demand, and $P_0$ is the stock price at time 0. The corresponding stock logarithmic return and absolute return from $t - 1$ to $t$ are defined by:
$$r(t) = \ln P_t - \ln P_{t-1}, \qquad |r(t)| = |\ln P_t - \ln P_{t-1}|.$$
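As an illustration only, the following Python sketch maps a sequence of daily excess demands to prices and logarithmic returns. It assumes the multiplicative update $P_t = P_{t-1} \exp(\beta B_t)$, where $B_t$ is a hypothetical name for the excess demand on day $t$; this form is an assumption chosen to match the roles of $\beta$ and $P_0$ described above, not a formula quoted from the text. The function names are likewise illustrative.

```python
import math


def prices_from_demand(demand, beta, p0):
    """Build a price path from daily excess demands.

    Assumed update (not quoted from the text): P_t = P_{t-1} * exp(beta * B_t).
    """
    prices = [p0]
    for b in demand:
        prices.append(prices[-1] * math.exp(beta * b))
    return prices


def log_returns(prices):
    """Logarithmic returns r(t) = ln P_t - ln P_{t-1}."""
    return [math.log(prices[t]) - math.log(prices[t - 1])
            for t in range(1, len(prices))]


def absolute_returns(prices, q=1.0):
    """Absolute returns raised to the power q, |r(t)|^q."""
    return [abs(r) ** q for r in log_returns(prices)]
```

Under this assumed update, each return equals $\beta$ times the excess demand of that day, so the statistical properties of the returns are inherited directly from the demand process.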

Composite Multiscale Entropy Analysis
In this section, we adopt a modified multiscale entropy analysis algorithm, the composite multiscale entropy (CMSE) method, for the complexity analysis of the above-mentioned three simulated data sets and two actual financial market indexes. The CMSE was proposed to overcome the problem that the reliability of the sample entropy of a coarse-grained series is reduced as the time scale factor is increased [37]. The difference between the CMSE and MSE is in the coarse-graining procedure, and the CMSE can be carried out on a time series in the following two steps:
(1) For a one-dimensional time series $x = \{x_1, x_2, \cdots, x_N\}$, consecutive coarse-grained time series are constructed by averaging a successively increasing number of points within non-overlapping windows. In the MSE algorithm, the coarse-grained time series $\{y^{(\tau)}\}$ is computed as
$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \qquad 1 \leq j \leq N/\tau,$$
whereas the $k$-th coarse-grained time series in the CMSE method for a scale factor $\tau$, $y_k^{(\tau)} = \{y_{k,1}^{(\tau)}, y_{k,2}^{(\tau)}, \cdots\}$, is defined as
$$y_{k,j}^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+k}^{j\tau+k-1} x_i, \qquad 1 \leq j \leq N/\tau, \quad 1 \leq k \leq \tau.$$
Note that for $\tau = 1$, the coarse-grained time series is simply the original time series. Figure 3 shows a schematic illustration of the coarse-graining procedure for both MSE (a) and CMSE (b) with $\tau = 2$ and $\tau = 3$, respectively, from which a clear difference between these two methods can be seen.
(2) The entropy measure, the sample entropy (SampEn), is calculated for each coarse-grained time series and then plotted as a function of the scale factor. SampEn quantifies the regularity or predictability of a time series and is defined as the negative logarithm of the conditional probability that a point that repeats itself within a tolerance of $\epsilon$ in an $m$-dimensional phase space will repeat itself in an $(m + 1)$-dimensional phase space:
$$\mathrm{SampEn}(x, m, \epsilon) = -\ln \frac{C(m + 1, \epsilon)}{C(m, \epsilon)},$$
where $C(m, \epsilon)$ is the number of repeating points in the $m$-dimensional phase space, and repeating points are those closer than $\epsilon$ in a Euclidean sense to the examined points; for details, see [35][36][37]47].
Finally, the CMSE value is defined as the mean of the sample entropies of all coarse-grained time series, that is:
$$\mathrm{CMSE}(x, \tau, m, \epsilon) = \frac{1}{\tau} \sum_{k=1}^{\tau} \mathrm{SampEn}(y_k^{(\tau)}, m, \epsilon),$$
while the MSE is computed by only using the first coarse-grained time series $y_1^{(\tau)}$, i.e., $\mathrm{MSE}(x, \tau, m, \epsilon) = \mathrm{SampEn}(y_1^{(\tau)}, m, \epsilon)$.
In Figure 4, we calculate the CMSE values of returns and absolute returns with different power exponents, labeled as $|r(t)|^q$, for simulation data and actual data from Scale 1 to Scale 30 ($\tau = 1$ to 30), where $q$ is taken as 0.25, 0.5, 0.75, 1, 1.5 and 2, respectively. It is known that the absolute return is a proxy of the volatility of time series, and $|r|^q$ exhibits obviously different volatility behaviors for different $q$ [19]. The entropy value of each coarse-grained time series is calculated with phase space embedding dimension $m = 6$ for minimizing the fraction of false neighbors [38] and $\epsilon = 0.15\sigma$, where $\sigma$ denotes the standard deviation of the original time series. From the figure, we find that the values of the composite multiscale entropy of return series monotonically decrease as the scale factor increases for either actual indexes or simulation data from the financial price model. For the absolute return with different power exponents $|r(t)|^q$, the entropy values decrease gradually and, finally, remain almost constant as the scale factor becomes larger, which indicates that each of these time series, unlike the return series, contains correlations and complex structures across multiple time scales. The behavior for each of these volatility time series is similar to that of 1/f series, which have correlated fluctuations (but the degrees of correlation are different). Meanwhile, it is observed that the entropy values for all given scales become smaller as the power exponent $q$ becomes larger, suggesting decreasing complex structures. The simulated data for the financial price model show similar fluctuation behaviors to the actual SSE and HSI data.
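The two-step CMSE procedure described above can be sketched in a few lines of Python. This is a plain, unoptimized illustration rather than the implementation used for Figure 4: the function names are hypothetical, the Euclidean distance follows the description in this section (many SampEn implementations use the maximum norm instead), and the last incomplete window of each shifted series is simply dropped.

```python
import math


def sample_entropy(x, m, eps):
    """SampEn = -ln(C(m+1, eps) / C(m, eps)) with Euclidean distance."""
    n = len(x)

    def count(dim):
        # Use the same n - m template vectors for both dimensions.
        c = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                d2 = sum((x[i + k] - x[j + k]) ** 2 for k in range(dim))
                if d2 < eps * eps:
                    c += 1
        return c

    b, a = count(m), count(m + 1)
    if a == 0 or b == 0:
        return float("nan")  # undefined when no matches are found
    return -math.log(a / b)


def coarse_grain(x, tau, k=1):
    """k-th coarse-grained series at scale tau (k = 1 is the MSE series)."""
    n_windows = (len(x) - k + 1) // tau
    start = k - 1  # shift of the first window, 0-based
    return [sum(x[start + j * tau:start + (j + 1) * tau]) / tau
            for j in range(n_windows)]


def cmse(x, tau, m, eps):
    """Mean SampEn over the tau shifted coarse-grained series."""
    return sum(sample_entropy(coarse_grain(x, tau, k), m, eps)
               for k in range(1, tau + 1)) / tau
```

For a return series `r`, a call such as `cmse(r, tau, 6, 0.15 * sigma)` with `sigma` the standard deviation of `r` would mirror the parameter choice stated above; the quadratic pair count makes this sketch slow for long series.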
To find some relationships between the correlations and the complexity of time series, we adopt the well-known detrended fluctuation analysis (DFA) method, which was proposed to explore the long-range correlations of nonstationary time series [19], to calculate the Hurst exponents of return series and $|r(t)|^q$ series for both the financial price model and actual market indexes. The results can be found in Table 1. It is seen that the Hurst exponents of return series for both the simulation data and the actual ones are around 0.5, indicating weak correlations. The Hurst exponents for $|r(t)|^q$ are all much larger than 0.5, which means that these time series are long-range auto-correlated. However, for each time series, the Hurst exponents do not show an obvious monotonic relationship with the power exponent $q$. Therefore, we cannot conclude that stronger correlations of time series correspond to more complex structures or higher entropy values.
In Figure 5, we present the CMSE results for return series, absolute returns and their corresponding shuffled time series. From Figure 5a, it is seen that the CMSE values of shuffled returns for the price model and the actual indexes decrease as the time scale increases, similar to the trends of the original returns. In Figure 5b, it is shown that the CMSE values of shuffled absolute return series also decrease as the time scale increases, but the trends are quite different from those of the original absolute returns. These statistical behaviors may be understood as follows. In the shuffling procedure, the values of the time series are put into random order, and thus, all correlations are destroyed; therefore, the composite multiscale sample entropy of the shuffled time series behaves like that of uncorrelated or weakly correlated series.

Recurrence Plot and Recurrence Quantification Analysis
In this section, we utilize the recurrence plot and recurrence quantification analysis to explore the complex determinism of return time series for the price model, SSE and HSI. The recurrence plot (RP) is a qualitative tool for visualizing nonlinear dynamics in time series data. An RP shows when a point in the phase space is near (at a distance lower than a certain threshold) to another point [39,40]. To perform the recurrence plot analysis, a phase space is constructed from a single observed series. The state of the system can be represented by the discrete time delay vector $\mathbf{x}_t = \{x_t, x_{t-\Delta t}, x_{t-2\Delta t}, \cdots, x_{t-(m-1)\Delta t}\}$, where $m$ denotes the embedding dimension and $\Delta t$ is the time delay. Then, the thresholded Euclidean distance matrix $R$ using the independent time-delayed coordinates is calculated:
$$R_{i,j} = \Theta(h - \|\mathbf{x}_i - \mathbf{x}_j\|),$$
where $\|\cdot\|$ is a norm, $\Theta(\cdot)$ is the Heaviside step function and $h$ is a threshold value, which has the meaning of the tolerance of recurrence. The matrix $R$ consists of zeros and ones and corresponds to the state of the system (one: recurrence; zero: no recurrence). In this paper, the embedding dimension is selected through the false nearest neighbors method [38], and a value $m = 6$ seems appropriate for the time series under study. The time delay is fixed to $\Delta t = 1$ by the average mutual information method [48]. Concerning the recurrence threshold $h$, we adopt 5% ($h \approx 0.02$) and 10% ($h \approx 0.04$) of the maximal phase space diameter of the discussed time series. In Figure 6 and Figure 7, the recurrence plots of returns for the simulated data and two actual market data are depicted in the case of $h = 0.02$ and $h = 0.04$, respectively. In Figure 6, more of the recurrence points are observed along the main diagonal lines for these time series, while recurrence structures consisting of roughly vertical (horizontal) patterns can be found in the plots of Figure 7.
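The construction of the recurrence matrix from a scalar series can be sketched as below. The function names are hypothetical, and the sketch assumes the convention $\Theta(0) = 1$, so that a distance exactly equal to $h$ counts as a recurrence.

```python
import math


def delay_embed(x, m, delay=1):
    """Delay vectors (x_t, x_{t-delay}, ..., x_{t-(m-1)*delay})."""
    start = (m - 1) * delay
    return [[x[t - k * delay] for k in range(m)]
            for t in range(start, len(x))]


def recurrence_matrix(x, m, delay, h):
    """Binary matrix with R[i][j] = 1 when the Euclidean distance between
    delay vectors i and j is at most h (assumes Theta(0) = 1)."""
    vecs = delay_embed(x, m, delay)
    n = len(vecs)
    return [[1 if math.dist(vecs[i], vecs[j]) <= h else 0
             for j in range(n)]
            for i in range(n)]
```

With the settings reported in this section, a return series `r` would be passed as `recurrence_matrix(r, 6, 1, 0.02)` or with `0.04` as the last argument; plotting the ones of the resulting matrix gives the recurrence plot.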
Since the graphical representation may be difficult to evaluate, recurrence quantification analysis (RQA) [41,42,49] was developed to provide numerical measures that quantify the structure and complexity of RPs. These quantities are based on the recurrence point density and on the diagonal and vertical line structures. For an $N_R \times N_R$ recurrence matrix, the recurrence rate (RR) is defined by:
$$RR = \frac{1}{N_R^2} \sum_{i,j=1}^{N_R} R_{i,j},$$
which measures the density of recurrence points in an RP and can be interpreted as the probability that any state may recur. One of the RQA measures based on the lines parallel to the main diagonal is determinism (DET), which is defined by:
$$DET = \frac{\sum_{l=l_{\min}}^{N_R} l\,P(l)}{\sum_{l=1}^{N_R} l\,P(l)},$$
where $P(l)$ is the histogram of diagonal lines of length $l$, and $l_{\min}$ is the minimal length of a diagonal line, here set to $l_{\min} = 2$. DET provides an indication of determinism and predictability in the system; thus, the larger the value of DET, the more predictable the system with diagonal lines in the RP.
Another measure on the diagonal lines is the Shannon information entropy ($L_{ENT}$) defined for diagonal line collections:
$$L_{ENT} = -\sum_{l=l_{\min}}^{N_R} p(l) \ln p(l),$$
where the probability of the line distribution is $p(l) = P(l) / \sum_{l \geq l_{\min}} P(l)$. A rise in $L_{ENT}$ indicates an increase in the complexity of the time series. Moreover, the mean length of the diagonal lines, $L_{mean} = \sum_{l=l_{\min}}^{N_R} l\,p(l)$, is a parameter indicating the system stability. Besides diagonal lines, we also measure vertical recurrence lines. In analogy to determinism, the laminarity (LAM) is defined for vertical line patterns:
$$LAM = \frac{\sum_{v=v_{\min}}^{N_R} v\,P(v)}{\sum_{v=1}^{N_R} v\,P(v)},$$
where $P(v)$ denotes the histogram of vertical lines of length $v$ with the minimum line length $v_{\min} = 2$. The larger the laminarity parameter, the more stable the behavior of the system. Finally, the average vertical line length (TT) is given as $TT = \sum_{v=v_{\min}}^{N_R} v\,p(v)$, with $p(v) = P(v) / \sum_{v \geq v_{\min}} P(v)$ defined analogously to $p(l)$, and it estimates the mean time that the system remains at a specific state.
The estimated results of the RQA measures are presented in Table 2. It is found that, for each of the three simulated data sets and two actual stock indexes, the recurrence rate (RR) for recurrence threshold $h = 0.04$ is larger than that for $h = 0.02$; in particular, the value for HSI shows a larger increase than those for the other series. The DET and $L_{mean}$ values decrease when $h$ changes from 0.02 to 0.04, except for the DET value of HSI, which rises from 0.7685 for $h = 0.02$ to 0.7793 for $h = 0.04$. Since higher DET and $L_{mean}$ correspond to a more predictable and stable system, the results imply a reduction of determinism of these return series in the case of $h = 0.04$. It is also noticed that the $L_{mean}$ for HSI under $h = 0.02$ is much smaller than those for the other series, indicating the weakest predictability among them. Furthermore, all of the values of $L_{ENT}$ increase for $h = 0.04$, and the value for the return series with $R = 1$ rises considerably, which indicates an increase in the complexity of these time series. In terms of the measures of vertical recurrence lines, the values of LAM and TT become larger when $h$ becomes 0.04, which means a rise in the fraction of recurrence points forming vertical lines; this can also be seen in Figure 7.

Conclusions
In the present paper, we have developed a financial price model by using the mechanisms of the interacting dynamic finite-range voter system, which is a well-known statistical physics model. A comparative study of the statistical properties of two Chinese stock indexes and the simulated data with different parameters (intensity and neighbor range parameters) is performed. The composite multiscale entropy analysis is applied to investigate the complexity of returns and absolute returns with different power exponents. The findings show that the CMSE for return series decreases as the time scale becomes larger, while the CMSE values for volatility series first decrease gradually, but finally remain almost stable, which is consistent with the auto-correlations of the series. Clearly different degrees of complexity of $|r(t)|^q$ for different values of $q$ are also observed. Finally, the recurrence plots and recurrence quantification analysis are utilized to further explore the complexity of the return series. For recurrence thresholds $h = 0.02$ and $h = 0.04$, the RQA measures take clearly different values, indicating different complex determinism behaviors. Based on this statistical study, there is some evidence that the returns of the financial model derived from the finite-range voter system show complex behaviors similar to those of the real stock markets; this suggests that the proposed model can capture some features of the real stock market in certain respects and is reasonable for real stock price modeling.