A Forecasting Model Based on High-Order Fluctuation Trends and Information Entropy
Entropy 2018, 20, 669

Most existing high-order prediction models abstract logical rules that are based on historical discrete states without considering historical inconsistency and fluctuation trends. In fact, these two characteristics are important for describing historical fluctuations. This paper proposes a model based on logical rules abstracted from historical dynamic fluctuation trends and the corresponding inconsistencies. In the logical rule training stage, the dynamic trend states of up and down are mapped to the two dimensions of truth-membership and falsity-membership of neutrosophic sets, respectively. Meanwhile, information entropy is employed to quantify the inconsistency of a period of history, which is mapped to the indeterminacy-membership of the neutrosophic sets. In the forecasting stage, the similarities among the neutrosophic sets are employed to locate the most similar left side of the logical relationship. Therefore, the two characteristics of fluctuation trends and inconsistency assist with future forecasting. The proposed model extends existing high-order fuzzy logical relationships (FLRs) to neutrosophic logical relationships (NLRs). Compared with traditional discrete high-order FLRs, the proposed NLRs have higher generality and handle the problem caused by the lack of rules. The proposed method is then implemented to forecast the Taiwan Stock Exchange Capitalization Weighted Stock Index and the Hang Seng Index. The experimental results indicate that the model has stable prediction ability for different data sets. In addition, a comparison of the prediction error with that of other approaches shows that the model has outstanding prediction accuracy and universality.


Introduction
For stock market forecasts, summarizing the rules that can be used for future predictions from historical data is crucial. At the same time, due to the noise that is contained in the actual data, Song and Chissom [1][2][3] proposed a fuzzy time series method for general rule extraction. On this basis, some scholars proposed first-order models [4][5][6], whereas others further studied high-order models [7,8]. These studies not only imply that high-order models can reflect historical trends, but also emphasize the importance of the relationship between historical trends and current state.
However, most of the above models summarize the logical relationships between current and historical trends based only on discrete states. In fact, in addition to the simple state-to-state relationship, the relationships between the historical states and the current states are also related to other features. For example, Lee et al. [9] suggested that the fluctuation trends in related stock markets are linked with each other. From this point of view, Lee et al. proposed a prediction model based on two related stock markets, which extended such models to two-factor high-order prediction. Compared with traditional discrete high-order FLRs, neutrosophic logical relationships (NLRs) have higher generality and can address the problem that is caused by the lack of rules in the forecasting stage.
Inspired by the above research, we propose a prediction model based on high-order fluctuation trends and information entropy (IE). First, the original time series of the stock market is converted into a fluctuation time series, and then the fluctuation time series is fuzzified into a fuzzy fluctuation time series according to predefined labels. Second, the IE of the historical fluctuations is calculated based on the probability of the different states preceding each current value. Third, a neutrosophic set (NS) is used to represent the current state and to establish neutrosophic logical relationships. Fourth, the Jaccard similarity measure is used to find the most similar logical relationship groups and to calculate their expected NSs. Finally, once the expected NSs are obtained, the final predicted value is calculated through the process of deneutrosophication. For verification, the proposed model is implemented to forecast the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) and the Hang Seng Index (HIS). The experimental results indicate that the model has stable prediction ability for different data sets. In addition, a comparison of the prediction error with that of other approaches shows that the model has outstanding prediction accuracy and universality.

Fuzzy Set (FS)
Fuzzy set theory was proposed by Zadeh [32], and it has been widely applied in several fields. A brief introduction to the basic concepts of fuzzy sets follows.
Definition 1. Denote the universe of discourse as $U = \{u_1, u_2, \ldots, u_n\}$. A fuzzy set $L = \{L_1, L_2, \ldots, L_g\}$ in $U$ can be defined by its membership function:
$$L_j = f_{L_j}(u_1)/u_1 + f_{L_j}(u_2)/u_2 + \cdots + f_{L_j}(u_n)/u_n \quad (j = 1, 2, \ldots, g) \tag{1}$$
where $f_{L_j}: U \to [0, 1]$ is the membership function of the fuzzy set $L_j$, and $f_{L_j}(u_i) \in [0, 1]$ is the membership degree of $u_i$ belonging to $L_j$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, g)$. The symbol $+$ denotes union rather than the conventional operation of addition.
Let the fuzzy set $L = \{L_1, L_2, \ldots, L_g\}$ be a finite and fully ordered discrete term set, where $g$ is an odd number in real situations: the first $(g - 1)/2$ elements describe the degree of a property, and the last $(g - 1)/2$ elements describe the opposite. For example, when $g = 7$, it might represent a set of linguistic variants such as $L = \{L_1, L_2, L_3, L_4, L_5, L_6, L_7\}$ = {very bad, bad, below fair, fair, above fair, good, very good}. The relationship between the element $L_i$ $(i = 1, 2, \ldots, g)$ and its subscript $i$ is strictly monotonically increasing [33], so a function $f$ can be defined as $L_i = f(i)$, where $f(i)$ is strictly monotonically increasing in the subscript $i$.

Fuzzy Time Series (FTS)
Fuzzy time series were proposed by Song and Chissom in 1993 [2] and have been successfully used to solve many practical problems. A brief introduction to the basic concepts of fuzzy time series follows.

Definition 2. Let the time series $\{Y(t) \mid t = 1, 2, \ldots, T\}$, a subset of the real numbers, denote the universe of discourse. According to Definition 1, each element in the time series can be fuzzified to a fuzzy set element $F(t)$ $(t = 1, 2, \ldots, T)$; then $\{F(t) \mid t = 1, 2, \ldots, T\}$ is called a fuzzy time series defined on the time series $\{Y(t) \mid t = 1, 2, \ldots, T\}$. If there exists a fuzzy relation $R(t - p, t)$ such that
$$F(t) = F(t - p) \circ R(t - p, t)$$
where $\circ$ is the max-min composition operator, then $F(t)$ is said to be derived from $F(t - p)$, denoted by the fuzzy logical relationship (FLR) $F(t - p) \to F(t)$. $F(t - p)$ and $F(t)$ are called the left-hand side (LHS) and the right-hand side (RHS) of the FLR, respectively. FLRs with the same LHS can be categorized into an ordered fuzzy logical group (FLG).

Definition 3. If $F(t)$ is caused by $F(t - 1), F(t - 2), \ldots, F(t - p)$, then the p-order FLR can be represented by:
$$F(t - 1), F(t - 2), \ldots, F(t - p) \to F(t)$$

Definition 4. For a fuzzy-fluctuation time series $\{Q(t)\}$, the relationship
$$Q(t - 1), Q(t - 2), \ldots, Q(t - m) \to Q(t)$$
is called the m-order fuzzy-fluctuation logical relationship (FFLR) of the fuzzy-fluctuation time series, where $Q(t - 1), Q(t - 2), \ldots, Q(t - m)$ is called the left-hand side (LHS) and $Q(t)$ is called the right-hand side (RHS) of the FFLR, and $Q(k) \in L$ $(k = t, t - 1, t - 2, \ldots, t - m)$.

Information Entropy
Information entropy (IE) [12] was first defined by Shannon in 1948 as a measure of event uncertainty. Shannon suggested that the smaller the probability of an event, the greater the amount of information that it contains. Conversely, the greater the likelihood of an event, the smaller the amount of information. The amount of information can therefore be expressed as a function of the probability of occurrence of an event.

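Definition 5. Let $x_1, x_2, \ldots, x_N$ be the possible events of a random variable. The information entropy is defined as:
$$E = -\sum_{i=1}^{N} p(x_i) \log p(x_i) \tag{4}$$
where $p(x_i)$ represents the probability of occurrence of the ith event. In addition, the information entropy must satisfy the following conditions: $\sum_{i=1}^{N} p(x_i) = 1$, $0 < p(x_i) < 1$, and non-negativity: $E \ge 0$.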
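
As a minimal sketch of Definition 5, the following Python function estimates the information entropy of a sequence of fuzzy-fluctuation states from their empirical frequencies. The function name `information_entropy`, the encoding of the state $L_i$ as the integer $i$, and the default natural logarithm are assumptions made for illustration only; the logarithm base merely rescales the result.

```python
import math
from collections import Counter


def information_entropy(labels, base=math.e):
    """Shannon entropy of the empirical distribution of `labels`.

    `labels` is a non-empty sequence of hashable states, e.g. the
    integers 1..5 standing for L1..L5 (an assumed encoding).
    The logarithm base is an assumption; it only rescales the result.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    # Only states that actually occur contribute to the sum,
    # which is the usual convention 0 * log(0) = 0.
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())
```

A history in which every state is identical gives an entropy of zero, while a history spread evenly over several states gives the largest value, which matches the role of entropy as a measure of inconsistency.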

Neutrosophic Set (NS)
The neutrosophic set (NS) was proposed by Smarandache [20]. It has been widely used to describe complex phenomena.
Definition 6. Let $X$ be a space of points (objects), with a generic element in $X$ denoted by $x$. A neutrosophic set $A$ in $X$ is characterized by a truth-membership function $T_A(x)$, an indeterminacy-membership function $I_A(x)$, and a falsity-membership function $F_A(x)$. The functions $T_A(x)$, $I_A(x)$, and $F_A(x)$ are real standard or nonstandard subsets of $]0^-, 1^+[$. There is no restriction on the sum of $T_A(x)$, $I_A(x)$, and $F_A(x)$.

Neutrosophic Logical Relationship (NLR)
Definition 7. Let $i$ be the subscript of a fuzzy set element $L_i$ $(i = 1, 2, \ldots, g)$, let $A(t)$ be the LHS of an m-order FFLR $Q(t - 1), Q(t - 2), \ldots, Q(t - m) \to Q(t)$, and let $P^i_{A(t)}$ be the probability of the corresponding $L_i$ $(i = 1, 2, \ldots, g)$ in $A(t)$. $P^i_{A(t)}$ can be generated by:
$$P^i_{A(t)} = \frac{\sum_{j=1}^{m} w_{i,j}}{m}, \quad i = 1, 2, \ldots, g \tag{5}$$
where $w_{i,j} = 1$ if $Q(t - j) = L_i$ and 0 otherwise.
$A(t)$ is then converted into a neutrosophic set $N_{A(t)}$ according to Equation (6), where $\alpha_i$ and $\beta_i$ are the weights of $L_i$ $(i = 1, 2, \ldots, g)$ in terms of their contribution to representing the corresponding characteristics.
As in the definition of the FLG, $N_{A(t)}$ is called the left-hand side (LHS) of a neutrosophic logical relationship (NLR). NLRs with similar LHSs can be categorized, and the RHSs of their corresponding FFLRs grouped into an ordered fuzzy logical group (FLG) $B_{A(t)}$, which can also be represented by a neutrosophic set according to Equation (7), in the same way as above, where $P^i$ represents the probability of the corresponding $L_i$ in the group. Thus, a neutrosophic logical relationship (NLR) can be defined as the relationship from the neutrosophic set of the LHS to the neutrosophic set of its grouped RHSs.
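
The state probabilities of Definition 7 can be illustrated with a short Python sketch: for each label $L_i$, count how often it appears among the $m$ states of the LHS and divide by $m$. The function name `state_probabilities` and the encoding of $L_i$ as the integer $i$ are assumptions for illustration; the weights $\alpha_i$, $\beta_i$ and the conversion to a neutrosophic set are not covered here.

```python
def state_probabilities(lhs, g=5):
    """Return [P_1, ..., P_g], the share of each label L_i in an LHS.

    `lhs` holds the m previous states Q(t-1), ..., Q(t-m) as integers
    1..g, where i stands for L_i (an assumed encoding).
    """
    m = len(lhs)
    if m == 0:
        raise ValueError("lhs must contain at least one state")
    # w_{i,j} = 1 when the j-th state equals L_i, so the sum is a count.
    return [sum(1 for q in lhs if q == i) / m for i in range(1, g + 1)]
```

For instance, the fourth-order LHS `[3, 3, 4, 2]` gives `[0.0, 0.25, 0.5, 0.25, 0.0]`, and the probabilities always sum to one.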

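Definition 8. Let $N_{A(t_1)}$ and $N_{A(t_2)}$ be two NSs. The Jaccard similarity [34] between $N_{A(t_1)}$ and $N_{A(t_2)}$ in vector space can be expressed as follows:
$$J\left(N_{A(t_1)}, N_{A(t_2)}\right) = \frac{T_1 T_2 + I_1 I_2 + F_1 F_2}{\left(T_1^2 + I_1^2 + F_1^2\right) + \left(T_2^2 + I_2^2 + F_2^2\right) - \left(T_1 T_2 + I_1 I_2 + F_1 F_2\right)}$$
where $(T_k, I_k, F_k)$ are the truth-, indeterminacy-, and falsity-memberships of $N_{A(t_k)}$ $(k = 1, 2)$.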
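
The following Python sketch illustrates the similarity of Definition 8, assuming the usual vector form of the Jaccard measure (the inner product divided by the sum of the squared norms minus the inner product). The function name `jaccard_similarity`, the representation of a neutrosophic set as a `(T, I, F)` tuple, and the handling of two all-zero vectors are assumptions for illustration.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity between two neutrosophic sets given as (T, I, F)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a)
    norm_b = sum(y * y for y in b)
    denom = norm_a + norm_b - dot
    if denom == 0:
        # Both vectors are all zeros; treating them as identical is an
        # assumption, since the measure is undefined in this case.
        return 1.0
    return dot / denom
```

The measure is symmetric, equals one for identical sets, and equals zero when the two sets share no non-zero component.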

Deneutrosophication of a Neutrosophic Set
Definition 9. Deneutrosophication of a neutrosophic fluctuation set refers to converting a neutrosophic set x into a fuzzy set y by the following function according to Ali et al. [35]: Specifically, in this paper, our equation for deneutrosophication is as follows:

Proposed Model Based on High-Order Fluctuation Trends and Information Entropy
This paper presents a prediction model that is based on high-order fluctuation trends and information entropy. Unlike existing high-order fuzzy time series forecasting models, the proposed model summarizes the discrete high-order fluctuation states into the truth-membership of the upper trend, the falsity-membership of the upper trend, and the chaos of trends. This coincides with the definition of the neutrosophic set, which employs the three dimensions of truth-membership, falsity-membership, and indeterminacy-membership to describe a characteristic. Based on the NS expression of high-order fluctuation states, NS theory can be used to deal with related issues, such as the comparison of high-order states when locating rules. The most significant contribution of this model is that it extends existing high-order fuzzy logical relationships to neutrosophic logical relationships and employs information entropy to represent the indeterminacy of fluctuation trends. Compared with traditional discrete high-order FLRs, the proposed NLRs have higher generality and can handle the problem caused by the lack of rules in the forecasting stage. The detailed steps are described below and shown in Figure 1.
Step 1 Construct FFTS for the historical training data
Step 2 Establish m-order FFLRs for the training data set
According to Definition 4, each $Q(t)$ $(t > m)$ in the training set can be represented by its previous m days' fuzzy-fluctuation numbers. Then, the m-order FFLRs of the prediction model can be established.
Step 3 Calculate fluctuation information entropy
According to Definition 5, the information entropy of each m-order fluctuation can be calculated separately. Here, $p(x_1)$, $p(x_2)$, $p(x_3)$, $p(x_4)$, and $p(x_5)$ indicate the probabilities of occurrence of the linguistic variants $L_1$, $L_2$, $L_3$, $L_4$, and $L_5$ in the LHS, respectively. The information entropy of the m-order FFLRs is then obtained according to Equation (4) and serves as the indeterminacy-membership of the NSs.
Step 4 Convert the LHSs of FFLRs to NSs
Calculate the probability of the corresponding $L_i$ $(i = 1, 2, \ldots, 5)$ in the LHSs of the training data set according to Equation (5). Combined with the information entropy of the m-order FFLRs obtained in the previous step, convert the LHSs of the FFLRs into neutrosophic sets according to Equation (6).
Step 5 Grouping and optimization
Applying the Jaccard similarity measure, categorize and group the converted LHSs of the FFLRs according to their similarities. Convert the RHSs of the corresponding FFLRs into neutrosophic sets according to Equation (7). In this way, NLRs are obtained from the training data set for future forecasting.
Step 6 Forecast test time series
The Jaccard similarity measure can be used to find the most appropriate NLR for each test data point. According to Definition 9, the RHS of the NLR can be converted to an expected fuzzy value $Y'(i + 1)$. Then, calculate the real number of the fluctuation by $F'(i + 1) = Y'(i + 1) \times len$. Finally, the predicted value can be obtained from the actual value of the previous day $X(i)$ and the predicted fluctuation value $F'(i + 1)$: $X'(i + 1) = X(i) + F'(i + 1)$.
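
The forecasting step can be sketched in Python as follows, under stated assumptions. The helper names `jaccard`, `most_similar_nlr`, and `forecast_next` are hypothetical, each NLR is represented as a pair of `(T, I, F)` tuples, and the Jaccard measure is taken in its usual vector form. The deneutrosophication function of Definition 9 is not reproduced here, so the expected fuzzy value $Y'(i + 1)$ is passed in as a given number.

```python
def jaccard(a, b):
    """Vector-form Jaccard similarity between two (T, I, F) tuples."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 1.0


def most_similar_nlr(query, nlrs):
    """Return the (lhs, rhs) rule whose LHS is most similar to `query`.

    `nlrs` is a non-empty list of (lhs, rhs) pairs learned in training.
    """
    return max(nlrs, key=lambda rule: jaccard(query, rule[0]))


def forecast_next(x_prev, y_expected, length):
    """X'(i+1) = X(i) + F'(i+1), with F'(i+1) = Y'(i+1) * len.

    `y_expected` is the deneutrosophied RHS, assumed to be given.
    """
    return x_prev + y_expected * length
```

Because the most similar rule is always returned, a forecast can be produced even when no rule matches the test history exactly, which is how the NLRs avoid the lack-of-rules problem of discrete high-order FLRs.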

Forecasting Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX)
The Taiwan stock market has always been a focus of research in the field of stock market forecasting. The TAIEX is an indicator of the change in the value of Taiwan's overall market stocks and is seen as a window to the Taiwanese economy. In addition, a large number of studies have used TAIEX data as an example to illustrate their proposed prediction methods [36][37][38][39][40][41]. To facilitate a comparison with the accuracy of these models, we also used the TAIEX data set to illustrate our proposed method. Specifically, following the general practice of the above studies, the forecasting process in this section is based on the TAIEX data for 1999: the data from January 1999 to October 1999 were selected as the training set, and the data from November to December 1999 were selected as the test set.
Step 1 Construct FFTS for the historical training data
The fluctuation trend is constructed from the elements of the historical training data set. Then, using the overall mean of the fluctuations in the training data set as the interval length len, the fluctuation trend is fuzzified into an FFTS. For the TAIEX training data set, len is 85. For example, X(1) = 6152.43 and X(2) = 6199.91, so Y(2) = X(2) − X(1) = 47.48, which is fuzzified to F(2) = $L_4$, meaning slightly up. In this way, the training set can be converted to a fuzzy fluctuation set.
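
A minimal Python sketch of this fuzzification step is given below. The cut points at ±0.5 len and ±1.5 len, the ordering of the five labels from $L_1$ (down) to $L_5$ (up), and the function names are assumptions for illustration and are not stated in the text; they are at least consistent with the worked example, in which a fluctuation of 47.48 with len = 85 is labelled $L_4$.

```python
def fuzzify(fluctuation, length):
    """Map a fluctuation to a label index 1..5 (i stands for L_i).

    Assumed cut points: -1.5*len, -0.5*len, 0.5*len, 1.5*len.
    """
    ratio = fluctuation / length
    if ratio < -1.5:
        return 1
    if ratio < -0.5:
        return 2
    if ratio < 0.5:
        return 3
    if ratio < 1.5:
        return 4
    return 5


def fuzzy_fluctuation_series(prices, length):
    """Fuzzify the day-to-day differences of a price series."""
    return [fuzzify(b - a, length) for a, b in zip(prices, prices[1:])]
```

Calling `fuzzy_fluctuation_series([6152.43, 6199.91], 85)` reproduces the label of the worked example above.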
Step 2 Establish m-order FFLRs for the training data set
The order directly affects the quality of the prediction. Since only the process needs to be explained here, the 9th order is taken as an example. According to Definition 4, the 9th-order FFLRs of the prediction model were established.
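
Establishing the m-order FFLRs amounts to sliding a window of length m over the fuzzy-fluctuation series, as the following Python sketch shows. The function name `build_fflrs` and the integer encoding of the labels are assumptions for illustration.

```python
def build_fflrs(states, m):
    """Return the m-order FFLRs of a fuzzy-fluctuation series.

    Each rule is a pair (lhs, rhs), where lhs is the tuple
    (Q(t-1), Q(t-2), ..., Q(t-m)) and rhs is Q(t).
    """
    if m < 1:
        raise ValueError("order m must be at least 1")
    rules = []
    for t in range(m, len(states)):
        # Most recent state first, matching the order used in Definition 4.
        lhs = tuple(states[t - j] for j in range(1, m + 1))
        rules.append((lhs, states[t]))
    return rules
```

A series of n states yields n − m rules, so with m = 9 the first nine states of the training set serve only as history.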

Step 4 Convert the LHSs of FFLRs to NSs
Calculate the probability of the corresponding $L_i$ $(i = 1, 2, \ldots, 5)$ in the LHSs of the training data set, and then convert the LHSs of the FFLRs into neutrosophic sets according to Equation (6).
Step 5 Grouping and optimization
First, apply the Jaccard similarity measure to categorize and group the converted LHSs of the FFLRs. In this example, the similarity threshold is set to 0.94. Then, convert the RHSs of the corresponding FFLRs into neutrosophic sets according to Equation (7). The conversion and grouping process is shown in Figure 2.
Step 6 Forecast test time series
For each test data point, calculate its 9th-order historical fluctuation states and convert them to a neutrosophic set. Locate the NLR with the highest similarity. Then, the corresponding RHS of the NLR is the forecasting neutrosophic value. For example, take 1 November 1999 from the test data. The LHS of its fluctuation states is ($L_3$, $L_3$, $L_4$, $L_3$, $L_2$, $L_3$, $L_5$, $L_4$, $L_4$, $L_3$, $L_4$, $L_4$, $L_3$, $L_4$, $L_3$, $L_2$, $L_2$, $L_3$, $L_4$, $L_5$, $L_4$, $L_4$, $L_3$, $L_3$, $L_3$), which can be represented by a neutrosophic set (0.0480, 0.6085, 0.1920). According to the similarity comparison, the most appropriate NLR for forecasting is (0.0444, 0.5306, 0.1111) → (0.0480, 0.6085, 0.1920).
Then, the predicted fuzzy fluctuation is calculated, the real number of the fluctuation is obtained from it, and finally the predicted value is obtained from the actual value of the previous day and the predicted fluctuation value. Following the above steps, the other prediction results can be obtained; they are shown in Table 1 and Figure 3.
In the experimental analysis, several measures commonly used in the prediction field were considered to quantify the prediction accuracy of the model, including the mean percentage error (MPE), the mean square error (MSE), and the root mean squared error (RMSE). Since there is no significant difference in the effect of the different error measures, RMSE was chosen as the main measure for error calculation.
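
For reference, the RMSE over n test points is $\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\text{forecast}(t) - \text{actual}(t)\right)^2}$. A minimal Python sketch follows; the function name `rmse` and the list-based inputs are assumptions for illustration.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the errors are squared before averaging, a few large misses weigh more heavily than many small ones, and the result is expressed in index points.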
Since different orders affect the prediction effect, it was necessary to select the optimal order to reduce the experimental error and improve the accuracy of the prediction. Experimental analysis showed that the model predicted better when the order was nine. Table 2 shows the experimental errors for different years under different orders.
It is important and necessary to accurately predict the trend of fluctuations, and many excellent prediction methods have been proposed in recent articles. By comparing our method with these previous methods, the advantages and disadvantages of our model can be verified. Table 3 shows the prediction errors for different methods from 1997 to 2005. Compared with the latest research, the overall prediction effect of the proposed method was excellent. For example, the average error of Guan et al.'s model [40] is 94.5, the average error of Cheng et al.'s model [41] is 102.4, and the average error of the proposed model is 92.16. Furthermore, the annual prediction errors show that the method can effectively predict the trend in each year from 1997 to 2005, which indicates that it is more universal.

Forecasting Hong Kong Hang Seng Index (HIS)
HIS is one of the representative indices in Asia. It is not only an important indicator of Hong Kong stock market prices, but also a stock price index that reflects the most influential trends in those prices. By comparing the proposed model with several authoritative prediction methods, we attempted to verify the universality of the model in other stock markets. Table 4 and Figure 4 compare the different prediction methods from 1998 to 2012. As shown in Table 4, the average prediction error of the Yu [42] method is 359.66, that of the Wan [43] method is 395.26, and that of the Ren [44] method is 526.46. The average prediction error of the proposed method is 248.23, which is the smallest. In addition, it can be seen from Figure 4 that the prediction results of the proposed method are more stable and its prediction effect is more prominent.

Conclusions
This paper proposed a prediction model that is based on logical rules abstracted from historical dynamic fluctuation trends and the corresponding inconsistencies. During the logical rule training stage, the dynamic trend states of up and down were mapped to the two dimensions of truth-membership and falsity-membership of neutrosophic sets, respectively. Information entropy was employed to quantify the inconsistency of a period of history and was mapped to the indeterminacy-membership of the neutrosophic sets. In the forecasting stage, the similarities among neutrosophic sets were employed to locate the most similar left side of a logical relationship. Therefore, the two characteristics of fluctuation trends and inconsistency assisted with future forecasting. The proposed model was implemented to forecast TAIEX and HIS. The experimental results showed that the model has stable prediction ability for different data sets. In addition, a comparison of the prediction error with that of other approaches showed that our model has outstanding prediction accuracy and universality.
This study used only two data sets, TAIEX and HIS; data from other stock markets are therefore required in the next study to further validate the model. This study used high-order FFTS to characterize the historical fluctuation trends and high-order fluctuation information entropy to measure the inconsistency of historical fluctuations. Considering the relationships among high-order history, the current state, and the future, network theory could be used to handle such forecasting problems. In future research, flexible network models should be constructed and more network approaches should be introduced to this area.