Forecasting Model Based on Neutrosophic Logical Relationship and Jaccard Similarity

The daily fluctuation trends of a stock market are illustrated by three statuses: up, equal, and down. These can be represented by a neutrosophic set which consists of three functions: truth-membership, indeterminacy-membership, and falsity-membership. In this paper, we propose a novel forecasting model based on neutrosophic set theory and the fuzzy logical relationships between the status of historical and current values. Firstly, the original time series of the stock market is converted to a fluctuation time series by comparing each piece of data with that of the previous day. The fluctuation time series is then fuzzified into a fuzzy-fluctuation time series in terms of the pre-defined up, equal, and down intervals. Next, the fuzzy logical relationships can be expressed by two neutrosophic sets according to the probabilities of different statuses for each current value and a certain range of corresponding histories. Finally, based on the neutrosophic logical relationships and the status of history, a Jaccard similarity measure is employed to find the most proper logical rule to forecast its future. The authentic Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) time series datasets are used as an example to illustrate the forecasting procedure and performance comparisons. The experimental results show that the proposed method can successfully forecast the stock market and other similar kinds of time series. We also apply the proposed method to forecast the Shanghai Stock Exchange Composite Index (SHSECI) to verify its effectiveness and universality.


Introduction
It is well known that there is a statistical long-range dependency between current values and historical values at different times in certain time series [1]. Therefore, many researchers have developed various models to predict the future of such time series based on historical data sets, for example the regression analysis model [2], the autoregressive integrated moving average (ARIMA) model [3], the autoregressive conditional heteroscedasticity (ARCH) model [4], the generalized ARCH (GARCH) model [5], and so on. However, crisp data used in those models are sometimes unavailable as such time series contain many uncertainties. In fact, models that satisfy the constraints precisely can miss the true optimal design within the confines of practical and realistic approximations. Therefore, Song and Chissom proposed the fuzzy time series (FTS) forecasting model [6][7][8] to predict the future of such nonlinear and complicated problems. In a financial context, FTS approaches have been widely applied to stock index forecasting [9][10][11][12][13]. In order to improve the accuracy of forecasts for stock market indices, some researchers combine fuzzy and non-fuzzy time series with heuristic optimization methods in their forecasting strategies [14]. Other approaches even introduce neural networks and machine learning procedures in order to find forecasting rules from historical time series [15][16][17].
The major points in FTS models are related to the fuzzifying of original time series, the establishment of fuzzy logical relationships from historical training datasets, and the forecasting and defuzzification of the outputs. Various proposals have been considered to determine the basic steps of the fuzzifying method, such as the effective length of intervals, e.g., determining the optimal interval length based on averages and distribution methods [18], using statistical theory [18][19][20][21][22][23], the unequal interval length method based on ratios of data [24], or the length determination method based on particle swarm optimization (PSO) techniques [10], etc. To state appropriate fuzzy logical relationships, Yu [25] proposed a weight assignation model, based on the recurrent fuzzy relationships, for each individual relationship. Aladag et al. [26] considered artificial neural networks to be a basic high-order method for the establishment of logical relationships. Fuzzy auto regressive (AR) models and fuzzy auto regressive and moving average (ARMA) models are also widely used to reflect the recurrence and weights of different fuzzy logical relationships [9,10,27,28,29,30,31,32,33,34,35]. These obtained logical relationships will be used as rules during the forecasting process. However, the proportions of the lagged variables in AR or ARMA models only represent the general best fitness for certain training datasets, without taking into account the differences between individual relationships. Although the weight assignation model considers the differences between individual relationships, it has to deal with special relationships that appear in the testing dataset but never happen in the training dataset. These FTS methods look for point forecasts without taking into account the implicit uncertainty in the ex post forecasts.
For a financial system, if anything, future fluctuation is more important than the indicated number itself. Therefore, the crucial ingredients for financial forecasting are the fluctuation orientations (including up, equal, and down) and to what extent the trends would be realized. Inspired by this, we first changed the original time series into a fluctuation time series for further rule generation. Meanwhile, comparing the three statuses with the concept of the neutrosophic set, the trends and weights of the relationships between historical and current statuses can be represented by the different dimensions of the neutrosophic sets, respectively. The concept of the neutrosophic set was originally proposed from a philosophical point of view by Smarandache [36]. A neutrosophic set is characterized independently by a truth-membership function, an indeterminacy-membership function and a falsity-membership function. Its similarity measure plays a key role in decision-making in uncertain environments. Researchers have proposed various similarity measures and mainly applied them to decision-making, e.g., Jaccard, Dice and Cosine similarity measures [37], distance-based similarity measures [38], entropy measures [39], etc. Although neutrosophic sets have been successfully applied to decision-making [37][38][39][40][41][42], they have rarely been applied to forecasting problems.
In this paper, we introduce neutrosophic sets to stock market forecasting. We propose a novel forecasting model based on neutrosophic set theory and the fuzzy logical relationships between current and historical statuses. Firstly, the original time series of the stock market is converted to a fluctuation time series by comparing each piece of data with that of the previous day. The fluctuation time series is then fuzzified into a fuzzy-fluctuation time series in terms of the pre-defined up, equal, and down intervals. Next, the fuzzy logical relationships can be expressed by two neutrosophic sets according to the probabilities for different statuses of each current value and a certain range of corresponding histories. Finally, based on the neutrosophic logical relationships and statuses of recent history, the Jaccard similarity measure is employed to find the most proper logical rule with which to forecast its future.
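The remaining content of this paper is organized as follows: Section 2 introduces some preliminaries of fuzzy-fluctuation time series and concepts, and the similarity measures of neutrosophic sets. Section 3 describes a novel approach for forecasting based on fuzzy-fluctuation trends and logical relationships. In Section 4, the proposed model is used to forecast the stock market using Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) datasets from 1997 to 2005 and Shanghai Stock Exchange Composite Index (SHSECI) from 2007 to 2015. Conclusions and potential issues for future research are summarized in Section 5.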

Definition of Fuzzy-Fluctuation Time Series (FFTS)
Song and Chissom [6][7][8] combined fuzzy set theory with time series and defined fuzzy time series. In this section, we extend fuzzy time series to fuzzy-fluctuation time series (FFTS) and propose the related concepts.
Definition 1. Let $L = \{l_1, l_2, \ldots, l_g\}$ be a fuzzy set in the universe of discourse $U$; it can be defined by its membership function, $\mu_L : U \to [0, 1]$, where $\mu_L(u_i)$ denotes the grade of membership of $u_i$, $U = \{u_1, u_2, \ldots, u_i, \ldots, u_l\}$.
The fluctuation trends of a stock market can be expressed by a linguistic set $L = \{l_1, l_2, l_3\} = \{\text{down}, \text{equal}, \text{up}\}$. The element $l_i$ and its subscript $i$ are strictly monotonically increasing [43], so the function can be defined as follows: $f$ :
Definition 2. Let $F(t)$ $(t = 1, 2, \ldots, T)$ be a time series of real numbers, where $T$ is the length of the time series. $G(t)$ is defined as a fluctuation time series, where $G(t) = F(t) - F(t-1)$ $(t = 2, 3, \ldots, T)$. Each element of $G(t)$ can be represented by a fuzzy set $S(t)$ $(t = 2, 3, \ldots, T)$ as defined in Definition 1. The time series $G(t)$ is thus fuzzified into a fuzzy-fluctuation time series (FFTS), $S(t)$.
Definition 3. Let $S(t)$ $(t = n+1, n+2, \ldots, T,\ n \ge 1)$ be an FFTS. If $S(t)$ is determined by $S(t-1), S(t-2), \ldots, S(t-n)$, then the fuzzy-fluctuation logical relationship is represented by:
$$S(t-1), S(t-2), \ldots, S(t-n) \to S(t)$$
and it is called the nth-order fuzzy-fluctuation logical relationship (FFLR) of the fuzzy-fluctuation time series, where $S(t-n), \ldots, S(t-2), S(t-1)$ is called the left-hand side (LHS) and $S(t)$ is called the right-hand side (RHS) of the FFLR, and $S(k) \in L$ $(k = t, t-1, t-2, \ldots, t-n)$.

Basic Concept of Neutrosophic Logical Relationship (NLR)
Smarandache [36] originally presented the neutrosophic set theory. Based on neutrosophic set theory, we propose the concept of the fuzzy-neutrosophic logical relationship, which employs the three terms of a neutrosophic set to reflect the fuzzy-fluctuation trends and weights of an nth-order FFLR.

Definition 4. Let $P^i_{A(t)}$ be the probability of each element $l_i$ $(l_i \in L)$ in the LHS of an nth-order FFLR $S(t-1), S(t-2), \ldots, S(t-n) \to S(t)$; it can be generated by:
$$P^i_{A(t)} = \frac{1}{n} \sum_{j=1}^{n} w_{i,j}, \quad i = 1, 2, 3,$$
where $w_{i,j} = 1$ if $S(t-j) = i$ and 0 otherwise. Let $X$ be a universal set, and the left-hand side of a neutrosophic logical relationship is defined by: $A(t) = \langle x, P^1_{A(t)}, P^2_{A(t)}, P^3_{A(t)} \rangle$ ... $B_{A(t)}$ is called the right-hand side of a neutrosophic logical relationship.
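The left-hand-side probabilities of Definition 4 can be illustrated with a short sketch. It assumes that each probability is the relative frequency of level i among the n lagged values, which agrees with the worked examples in Section 4; the function name `lhs_neutrosophic_set` is hypothetical and not taken from the paper.

```python
from collections import Counter


def lhs_neutrosophic_set(lhs, levels=(1, 2, 3)):
    """Return (P^1, P^2, P^3) for the LHS of an FFLR.

    Assumption: P^i is the share of lagged values equal to level i
    (1 = down, 2 = equal, 3 = up).
    """
    n = len(lhs)
    if n == 0:
        raise ValueError("the LHS must contain at least one value")
    counts = Counter(lhs)
    return tuple(counts[level] / n for level in levels)


# Ninth-order LHS used in the worked example of Section 4.
a = lhs_neutrosophic_set([3, 2, 2, 2, 2, 3, 1, 2, 2])
print(tuple(round(p, 2) for p in a))  # (0.11, 0.67, 0.22)
```

Rounded to two decimals, the result matches the neutrosophic set quoted for 1 November 1999 in Section 4.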
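Definition 6 [37]. Let $A(t_1)$ and $A(t_2)$ be two neutrosophic sets. The Jaccard similarity measure between $A(t_1)$ and $A(t_2)$ in vector space is defined as follows:
$$J(A(t_1), A(t_2)) = \frac{\sum_{i=1}^{3} P^i_{A(t_1)} P^i_{A(t_2)}}{\sum_{i=1}^{3} \left(P^i_{A(t_1)}\right)^2 + \sum_{i=1}^{3} \left(P^i_{A(t_2)}\right)^2 - \sum_{i=1}^{3} P^i_{A(t_1)} P^i_{A(t_2)}} \quad (5)$$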
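As an illustration of Definition 6, the sketch below computes the Jaccard similarity of two neutrosophic sets treated as three-component vectors. It assumes the usual vector-space form, in which the inner product is divided by the sum of the squared norms minus the inner product; the function name is hypothetical.

```python
def jaccard_similarity(a, b):
    """Vector-space Jaccard similarity between two neutrosophic sets.

    a and b are (P^1, P^2, P^3) tuples. Assumed form:
    J = a.b / (|a|^2 + |b|^2 - a.b)
    """
    if len(a) != len(b):
        raise ValueError("both sets must have the same number of components")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    if denom == 0:
        raise ValueError("similarity is undefined for two zero vectors")
    return dot / denom


same = jaccard_similarity((0.11, 0.67, 0.22), (0.11, 0.67, 0.22))
other = jaccard_similarity((0.11, 0.67, 0.22), (0.33, 0.33, 0.33))
print(same > other)  # True
```

Identical sets give a similarity of 1 (up to floating-point rounding), and sets with no overlapping component give 0, so a larger value indicates a closer rule.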

A Novel Forecasting Model Based on Neutrosophic Logical Relationships
In this paper, we propose a novel forecasting model based on high-order neutrosophic logical relationships and Jaccard similarity measures. In order to compare the forecasting results with other researchers' work [9,17,23,25,44,45,46,47,48], the authentic TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) is employed to illustrate the forecasting process. The data from January 1999 to October 1999 are used as the training time series and the data from November 1999 to December 1999 are used as the testing dataset. The basic steps of the proposed model are shown in Figure 1.
Symmetry 2017, 9, 191
Step 2. Establish nth-order FFLRs for the training data set
According to Definition 3, each $S(t)$ $(t > n)$ in the historical training data set can be represented by its previous n days' fuzzy-fluctuation numbers to establish the training FFLRs.
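Step 2 can be sketched as a sliding window over the fuzzified series. The helper name `build_fflrs` is hypothetical, and the LHS is stored in chronological order, S(t − n), ..., S(t − 1), which is a presentation choice rather than something the paper prescribes.

```python
def build_fflrs(s, n):
    """Build nth-order FFLRs from a fuzzy-fluctuation series.

    s is a list of fuzzy levels (1 = down, 2 = equal, 3 = up).
    Each FFLR is (lhs, rhs), with lhs = (S(t-n), ..., S(t-1)) and rhs = S(t).
    """
    if n < 1:
        raise ValueError("the order n must be at least 1")
    return [(tuple(s[t - n:t]), s[t]) for t in range(n, len(s))]


print(build_fflrs([3, 2, 2, 1, 3], 2))
# [((3, 2), 2), ((2, 2), 1), ((2, 1), 3)]
```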
Step 3. Convert the FFLRs to NLRs
According to Definition 4, the LHS of each FFLR can be expressed by a neutrosophic set $A(t)$. Then, we can generate the RHSs $B_{A(t)}$ for different LHSs respectively, as described in Definition 5. Thus, the FFLRs for the historical training dataset are converted into NLRs.
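A minimal sketch of Step 3 follows. It assumes that the right-hand side of an NLR is the relative frequency of each RHS level among the FFLRs that share the same LHS neutrosophic set; that reading is consistent with the example NLR in Section 4 but is an assumption here. The name `build_nlrs` is hypothetical.

```python
from collections import Counter, defaultdict


def build_nlrs(fflrs, levels=(1, 2, 3)):
    """Group FFLRs by LHS neutrosophic set and summarize their RHSs.

    fflrs is a list of (lhs, rhs) pairs of fuzzy levels.
    Returns a dict mapping A(t) to B_A(t), both as 3-tuples of probabilities.
    Assumption: B_A(t) holds the relative frequencies of the RHS levels.
    """
    groups = defaultdict(Counter)
    for lhs, rhs in fflrs:
        counts = Counter(lhs)
        # Integer counts make a stable grouping key for equal neutrosophic sets.
        key = tuple(counts[level] for level in levels)
        groups[key][rhs] += 1

    nlrs = {}
    for key, rhs_counts in groups.items():
        n = sum(key)
        total = sum(rhs_counts.values())
        a = tuple(c / n for c in key)
        b = tuple(rhs_counts[level] / total for level in levels)
        nlrs[a] = b
    return nlrs


rules = build_nlrs([((1, 2, 3), 1), ((3, 2, 1), 1), ((2, 1, 3), 3), ((1, 1, 2), 2)])
print(len(rules))  # 2
```

The first three FFLRs contain one value of each level, so they fall into a single group even though their lag order differs.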

Step 4. Forecast test time series
For each observed point $F(i)$ in the test time series, we can use a neutrosophic set $A(i)$ to represent its nth-order fuzzy-fluctuation trends. Then, for each $A(t)$ obtained in Step 3, compare $A(i)$ with $A(t)$ respectively, and find the most similar one based on the Jaccard similarity measure described in Definition 6. Next, use the corresponding $B_{A(t)}$ as the forecasting rule to predict the fluctuation value $G'(i+1)$ of the next point. Finally, obtain the forecasting value by
$$F'(i+1) = F(i) + G'(i+1).$$
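The forecasting step can be sketched as follows. Two points are assumptions inferred from the worked example in Section 4 rather than stated rules: the fuzzy fluctuation is taken as the "up" probability minus the "down" probability of the selected rule, and it is defuzzified by multiplying by the interval length `length`. All function names are hypothetical.

```python
def jaccard(a, b):
    """Vector-space Jaccard similarity of two probability triples."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sum(x * x for x in a) + sum(y * y for y in b) - dot)


def forecast_next(current_value, a_now, nlrs, length):
    """Forecast F'(i+1) from F(i), the current LHS set A(i), and trained NLRs.

    nlrs maps A(t) to B_A(t); length is the fuzzification interval length.
    Assumption: fluctuation = (P_up - P_down) * length.
    """
    best_lhs = max(nlrs, key=lambda lhs: jaccard(lhs, a_now))
    p_down, _p_equal, p_up = nlrs[best_lhs]
    return current_value + (p_up - p_down) * length


# Two rules quoted in Section 4; the second matches the current state exactly.
rules = {
    (0.33, 0.33, 0.33): (0.4, 0.3, 0.3),
    (0.11, 0.67, 0.22): (0.17, 0.33, 0.5),
}
print(round(forecast_next(7854.85, (0.11, 0.67, 0.22), rules, 85), 2))
```

With these inputs the sketch reproduces the Section 4 calculation, 7854.85 + 0.33 × 85.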

Forecasting Taiwan Stock Exchange Capitalization Weighted Stock Index
Many studies use TAIEX1999 as an example to illustrate their proposed forecasting methods [9,17,25,34,44,45,46,47,48]. In order to compare the accuracy with their models, we also use TAIEX1999 to illustrate the proposed method.
Step 1: Calculate the fluctuation trend for each element in the historical training dataset of TAIEX1999. Then, we use the overall mean of the fluctuation numbers of the training dataset to fuzzify the fluctuation trends into FFTS. For example, the overall mean of the historical dataset of TAIEX1999 from January to October is 85; that is, len = 85. For F(1) = 6152.43 and F(2) = 6199.91, G(2) = 47.48 and S(2) = 3. In this way, the historical training dataset can be represented by a fuzzified fluctuation dataset as shown in Table A1.
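Step 1 can be sketched as below. The cut points are an assumption: a fluctuation is labelled down (1) below −len/2, up (3) at or above len/2, and equal (2) otherwise. This choice is consistent with G(2) = 47.48 mapping to S(2) = 3 when len = 85, but the interval definitions themselves are not given in this section. The function names are hypothetical.

```python
def fluctuations(series):
    """Return G(t) = F(t) - F(t-1) for t = 2, ..., T."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def fuzzify(g, length):
    """Map one fluctuation to a fuzzy level (1 = down, 2 = equal, 3 = up).

    Assumed intervals: (-inf, -length/2), [-length/2, length/2), [length/2, +inf).
    """
    half = length / 2
    if g < -half:
        return 1
    if g >= half:
        return 3
    return 2


g2 = fluctuations([6152.43, 6199.91])[0]
print(fuzzify(g2, 85))  # 3
```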
Step 2: Based on the FFTS from 5 January to 30 October 1999, shown in Table A1, the nth-order FFLRs for the forecasting model are established as shown in Table A2. The subscript i is used to represent element $l_i$ in the FFLRs for convenience.
Step 3: In order to convert the FFLRs to NLRs, first of all the LHSs of the FFLRs in Table A2 are each represented by a neutrosophic set (shown in Table A2). Then, the RHSs of the FFLRs with the same LHS neutrosophic set value are grouped into one RHS group, and a neutrosophic set is used to represent that group. For example, the LHS of FFLR 2,3,1,1,1,2,2,3,3→1 can be represented by the neutrosophic set (0.33,0.33,0.33). The detailed grouping and converting processes are shown in Figure 2. In this way, the FFLR 2,3,1,1,1,2,2,3,3→1 and other members of the same group are converted into an NLR (0.33,0.33,0.33)→(0.4,0.3,0.3). Therefore, the FFLRs in Table A2 can be converted into NLRs as shown in Table 1.
Step 4: Use the NLRs obtained from historical training data to forecast the test dataset from 1 November to 30 December 1999. For example, the forecasting value of the TAIEX on 1 November 1999 is calculated as follows:
First, the ninth-order historical fuzzy-fluctuation trends 3,2,2,2,2,3,1,2,2 on 1 November 1999 can be represented by a neutrosophic set (0.11,0.67,0.22). Then, we use the Jaccard similarity measure described in Definition 6 to choose the best-matching NLR from the NLRs listed in Table 1. The NLR (0.11,0.67,0.22)→(0.17,0.33,0.5) is evidently the best rule for further forecasting. Therefore, the forecasted fuzzy-fluctuation number is:
$$S'(i+1) = (-0.17) + 0.5 = 0.33$$
The forecasted fluctuation from the current value to the next value can be obtained by defuzzifying the fluctuation fuzzy number:
$$G'(i+1) = S'(i+1) \times len = 0.33 \times 85 = 28.05$$
Finally, the forecasted value can be obtained from the current value and the fluctuation value:
$$F'(i+1) = F(i) + G'(i+1) = 7854.85 + 28.05 = 7882.9$$
The other forecasting results are shown in Table 2 and Figure 3.
The forecasting performance can be assessed by comparing the difference between the forecasted values and the actual values. The widely used indicators in time series model comparisons are the mean squared error (MSE), the root mean squared error (RMSE), the mean absolute error (MAE), and the mean percentage error (MPE), etc. To compare the performance of different forecasting methods, the Diebold-Mariano test statistic (S) is also widely used [49]. These indicators are defined by Equations (6)-(10), where n denotes the number of values forecasted, and forecast(t) and actual(t) denote the predicted value and actual value at time t, respectively. S is the test statistic of the Diebold-Mariano test, which is used to compare the predictive accuracy of two forecasts obtained by different methods. Forecast1 represents the dataset obtained by method 1, and Forecast2 represents another dataset from method 2. If S > 0 and |S| > Z = 1.64 at the 0.05 significance level, then Forecast2 has better predictive accuracy than Forecast1. With respect to the proposed method for the ninth order, the MSE, RMSE, MAE, and MPE are 9753.63, 98.76, 76.32, and 0.01, respectively.
Let the order number n vary from two to 10; the RMSEs for different nth-order forecasting models are listed in Table 3. The item "Average" refers to the RMSE for the average forecasting results of these different nth-order (n = 2, 3, ..., 10) models. In practical forecasting, the average of the results of different nth-order (n = 2, 3, ..., 9) forecasting models is adopted to avoid the uncertainty. The proposed method is employed to forecast the TAIEX from 1997 to 2005. The forecasting results and errors are shown in Figure 4 and Table 4.
Table 5 shows a comparison between the RMSEs of different methods for forecasting the TAIEX1999. From this table, we can see that the performance of the proposed method is acceptable. The greatest advantage of the proposed method is that it does not need to determine the boundary of discourse or the intervals for number fuzzifying. Meanwhile, the introduction of neutrosophic sets into the expression of logical relationships makes it possible to employ a similarity comparison method to locate the most appropriate rules for further forecasting. Therefore, the proposed method, to some extent, is more rigorous than other methods that just use meaningless values in the case of missing rules in the training data. Though the RMSEs of some of the other methods outperform the proposed method, they often need to determine complex discretization partitioning rules or use adaptive expectation models to justify the final forecasting results. The method proposed in this paper is simpler and more easily realized by a computer program.
[46] 102.11 0.39
Chen and Chen's Method (2015) [9] 103.9 0.29
Chen and Chen's Method (2015) [44] 92 −0.51
Zhao et al.'s Method (2016) [23] 110
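Three of the error indicators used in this section (MSE, RMSE, and MAE) can be sketched with their standard definitions, as below. The MPE and the Diebold-Mariano statistic are omitted because their exact forms are not recoverable from the text here; the function names are hypothetical.

```python
import math


def mse(forecast, actual):
    """Mean squared error over paired forecast and actual values."""
    if len(forecast) != len(actual) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(forecast)


def rmse(forecast, actual):
    """Root mean squared error."""
    return math.sqrt(mse(forecast, actual))


def mae(forecast, actual):
    """Mean absolute error."""
    if len(forecast) != len(actual) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(forecast)


# The reported ninth-order MSE and RMSE are consistent with each other.
print(round(math.sqrt(9753.63), 2))  # 98.76
```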

Forecasting Shanghai Stock Exchange Composite Index
The SHSECI is the most famous stock market index in China. In the following, we apply the proposed method to forecast the SHSECI from 2007 to 2015. For each year, the authentic datasets of the historical daily SHSECI closing prices between January and October are used as the training data, and the datasets from November to December are used as the testing data. The RMSEs of forecast errors are shown in Table 6.
From Table 6, we can see that the proposed method can successfully predict the SHSECI stock market.

Conclusions
In this paper, a novel forecasting model is proposed based on neutrosophic logical relationships, the Jaccard similarity measure, and on fluctuations of the time series. The high-order fuzzy-fluctuation logical relationships are represented by neutrosophic logical relationships. Therefore, we can use the Jaccard similarity measure method to find the optimal forecasting rules. The biggest advantage of this method is that it can deal with the problem of lack of rules. Considering the fact that future fluctuation is more important than the indicated number itself, this method focuses on the forecasting of fluctuation orientations in terms of the extent of the fluctuation rather than on the real numbers. Meanwhile, utilizing NLRs instead of FLRs makes it possible to select the most appropriate rules for further forecasting. Therefore, the proposed method is more rigorous and interpretable. Experiments show that the parameters generated by the training dataset can be successfully used for future datasets as well. In order to compare the performance with that of other methods, we took the TAIEX 1999 as an example. We also forecasted TAIEX 1997-2005 and SHSECI 2007-2015 to verify its effectiveness and universality. In the future, we will consider other factors that might affect the fluctuation of the stock market, such as the trade volume, the beginning value, the end value, etc. We will also consider the influence of other stock markets, such as the Dow Jones, the National Association of Securities Dealers Automated Quotations (NASDAQ), the M1b, and so on.

Figure 1. Flowchart of our proposed forecasting model.
Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) datasets from 1997 to 2005 and Shanghai Stock Exchange Composite Index (SHSECI) from 2007 to 2015. Conclusions and potential issues for future research are summarized in Section 5.

Table 1. Neutrosophic logical relationships (NLRs) for the historical training data of TAIEX1999.

Table 3. Comparison of forecasting errors for different nth orders.

Table 4. RMSEs of forecast errors for TAIEX 1997 to 2005.

Table 5. A comparison of RMSEs for different methods for forecasting the TAIEX1999.
* The proposed method has better predictive accuracy than the method at the 5% significance level.

Table 6. RMSEs of forecast errors for SHSECI from 2007 to 2015.

Table A1. Historical training data and fuzzified fluctuation data of TAIEX 1999.

Table A2. The FFLRs and the converted left-hand side of NLRs for historical training data of TAIEX1999.