A Two-Factor Autoregressive Moving Average Model Based on Fuzzy Fluctuation Logical Relationships

Abstract: Many of the existing autoregressive moving average (ARMA) forecast models are based on one main factor. In this paper, we propose a new two-factor first-order ARMA forecast model based on the fuzzy fluctuation logical relationships of both a main factor and a secondary factor of a historical training time series. First, we generated a fluctuation time series (FTS) for each of the two factors by calculating the difference between each data point and that of the previous day, and then found the absolute means of the two FTSs. We then constructed a fuzzy fluctuation time series (FFTS) according to the defined linguistic sets. The next step was establishing fuzzy fluctuation logical relation groups (FFLRGs) for a two-factor first-order autoregressive (AR(1)) model and forecasting the training data with the AR(1) model. We then built FFLRGs for a two-factor first-order autoregressive moving average (ARMA(1,m)) model. Lastly, we forecasted the test data with the ARMA(1,m) model. To illustrate the performance of our model, we used real Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) data as the main factor and Dow Jones data as the secondary factor to forecast the TAIEX. The experiment results indicate that the proposed two-factor fluctuation ARMA method outperformed the one-factor method based on real historic data. The secondary factor may have some effect on the main factor and thereby influence the forecasting results. Using fuzzified fluctuations rather than fuzzified real data could avoid the influence of extreme values in the historic data, which negatively affect forecasting. To verify the accuracy and effectiveness of the model, we also employed our method to forecast the Shanghai Stock Exchange Composite Index (SHSECI) from 2001 to 2015 and the international gold price from 2000 to 2010.


Introduction
A historic time series can show the rules and patterns of some phenomena and can be applied to forecast the same event in the future [1]. Many researchers have described time series models to predict the future of a given system, including regression analysis [2], artificial neural networks (ANN) [3], evolutionary computation [4], support vector machines (SVM) [5], and immune systems [6]. However, although these models satisfy the constraints, they might overemphasize the randomness of the dataset and distort the internal evolutionary rules, and may not perform optimally. To solve this problem, Song and Chissom proposed the fuzzy time series forecasting model [7], which introduced the fuzzy set theory of Zadeh [8] into time series. Chen [9] developed a first-order fuzzy time series to simplify the fuzzy relationships in Song and Chissom's model [7,10,11], which were described by complex matrix operations. Chen's method [9] has been the basis for subsequent research on fuzzy logic groups because of its universality and level of performance. For the selection of the length of the intervals, Huarng [12] proposed two methods, one based on averages and one based on distribution. Since then, the fuzzy time series model has been widely used in many nonlinear and complex forecasting problems. In order to forecast the fluctuation of the stock market, Chen [13] proposed a hybrid first-order fuzzy time series model using granular computing as the partitioning method. Many studies [14][15][16] used a second-order fuzzy time series model to create the rules for forecasting future trends. The biggest differences between these fuzzy time series models are the detailed partitioning method and the trend rules. Efendi et al. [17] used a fuzzy time series model to forecast daily electricity load demand. Sadaei et al. [18] proposed a short-term load forecasting model based on the seasonality memory process and a fuzzy time series model. These fuzzy time series models are all autoregressive (AR) models. With fuzzy lagged variables of a time series, these models can be represented as AR(n). Such models are also used for project cost forecasting [19] and enrollment forecasting at the University of Alabama [20,21].
In order to improve the accuracy of fuzzy time series models, many researchers have proposed other models on the basis of Chen's model. For example, an unequal interval length method was proposed by Huarng and Yu [22] based on the ratios of data, in which the length of the interval was exponentially variable. In addition to determining the intervals, the definition of the universe of discourse also plays an effective role in the forecasting accuracy. To establish a suitable universe of discourse, in addition to the maximum and minimum values of the historical data of the main factor, the models need two proper real numbers to cover the noise.
Another essential step when creating fuzzy time series models is the establishment of fuzzy logical relationships (FLRs). In this realm, the research by Egrioglu et al. [23] is regarded as a basic high-order method for forecasting based on artificial neural networks. Moreover, Egrioglu [24] employed generic algorithms to establish fuzzy relations. Other soft computing techniques have also been used for forecasting in many studies [25][26][27]. In fact, fuzzy time series forecasting studies are frequently based on fuzzy autoregressive (AR) structures [28][29][30][31][32]. To further improve the performance of fuzzy AR models, the adaptive neuro-fuzzy inference system (ANFIS) [33] has been used in time series prediction [34][35][36][37]. However, using only an AR structure for some time series may lead to unsatisfactory and flawed results. To address this, we added moving average (MA) structures and produced an ARMA-type fuzzy time series forecasting model that includes both AR and MA structures. Because of the excellent performance of the ARMA model, it has been widely mentioned in the literature. For example, Kocak [38] and Kocak et al. [39] researched first-order ARMA fuzzy time series models based on fuzzy logical relation tables and an artificial neural network, respectively. Kocak [40] also studied a high-order ARMA fuzzy time series model.
Most of the existing fuzzy time series models first fuzzify the exact values of the time series, then use AR models of the dataset itself to forecast its future. Such methods usually improve the performance by using extra solution steps, such as artificial neural networks. In this paper, we propose a new first-order ARMA model based on two-factor fuzzy logical relationships. The advantages of this model are that it uses the fluctuation values rather than the exact values of the time series, and that a secondary factor is used to help forecast the main factor with ARMA fuzzy time series models. The fluctuation orientations (up, equal, and down) and the extent to which the trends are realized are the crucial ingredients of financial forecasting, so using a fluctuation time series for rule generation is more reasonable. Although internal rules determine future changes, we cannot ignore the effects of related external changes. Therefore, we chose an external element as the secondary factor to generate the logical rules. The experiment results indicate that the proposed two-factor fluctuation method outperforms the one-factor method on real historic data, because the secondary factor may have some effect on the main factor and thereby influence the forecasting results. Using fuzzified fluctuations, rather than fuzzified real data, could avoid the influence of extreme values in the historic data, which negatively affect forecasting.
The remainder of the paper is organized as follows. The next section presents the basic preliminaries of fuzzy-fluctuation time series. The third section introduces the procedure used to build the ARMA(1,m) model. Next, the proposed model is used to forecast the stock market using TAIEX datasets from 1997 to 2005, SHSECI datasets from 2001 to 2015, and international gold prices from 2000 to 2010. Finally, we discuss the conclusions and potential future research.

New Forecasting Model Based on Two-Factor ARMA(1,m) FFLRs
In this paper, we propose a new forecasting model based on a two-factor first-order ARMA model with fuzzy fluctuation logical relationships. To allow a comparison with the forecasting results of other researchers [29,30,41,42], we used the real Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) to show the forecasting procedure. We used the data from January to October of a given year as the training time series and the data from November to December of the same year as the testing dataset. The basic steps of the proposed model are shown in Figure 1.
Step 1: The first step was to construct the FFTSs for the historical training data of the main and secondary factors: generate the fluctuation time series for the two factors, calculate the absolute means of the two FTSs, and define the intervals for the corresponding fuzzy sets.
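The fluctuation and fuzzification step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the interval layout (an odd number g of sets, a middle interval of half-width equal to half the absolute mean, inner intervals one absolute mean wide, and unbounded outer intervals) is assumed here and is not taken from the text, and the prices are made up.

```python
import math

def fluctuations(series):
    """Day-over-day differences X(t) - X(t-1)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

def absolute_mean(values):
    """Mean of the absolute values of a fluctuation series."""
    return sum(abs(v) for v in values) / len(values)

def fuzzify(value, scale, g=3):
    """Map a fluctuation to a linguistic index 1..g.

    Assumed layout (not from the paper): g is odd, the middle interval is
    [-scale/2, scale/2), inner intervals are `scale` wide, and the two
    outermost intervals are unbounded.
    """
    half = g // 2
    k = math.floor(value / scale + 0.5)      # index of the nearest interval
    return max(-half, min(half, k)) + half + 1

closes = [100.0, 104.0, 103.0, 103.5, 97.5]   # hypothetical closing prices
fts = fluctuations(closes)                     # fluctuation time series
scale = absolute_mean(fts)                     # absolute mean of the FTS
ffts = [fuzzify(x, scale, g=3) for x in fts]   # 1 = down, 2 = equal, 3 = up
```

The same two functions would be applied to the secondary factor with its own absolute mean, so that both factors are expressed on comparable linguistic scales.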
Step 2: The second step was to determine the two-factor fuzzy-fluctuation logical relationships (FFLRs) for the AR(1) model, as outlined in Definition 3. Let the lagged fuzzified values of the main and secondary factors be $L_i$ and $K_s$, respectively, and let the next value of the main factor be $L_j$; the FFLR is then $L_i, K_s \to L_j$. The FFLRs with the same LHS were then grouped into a fuzzy-fluctuation logical relationship group (FFLRG) by putting all their RHSs together as the RHS of the FFLRG.
Step 3: The next step was to obtain the fuzzy fluctuation forecast result from the AR(1) model. We assumed the lagged variables to be $L_i$ and $K_s$ and defined the following conditions:
RHS Condition: If $L_i, K_s \to L_j, \ldots, L_j, L_h, \ldots, L_h, L_l, \ldots, L_l$ exists, and $L_j$, $L_h$, and $L_l$ appear $a$, $b$, and $c$ times, respectively, then the fuzzy fluctuation forecast result is $L_j, \ldots, L_j, L_h, \ldots, L_h, L_l, \ldots, L_l$.
Null RHS Condition: If $L_i, K_s \to \emptyset$ exists in the FFLRG, then the fuzzy forecast is $L_i, K_s$.
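A minimal sketch of Steps 2 and 3 follows, assuming both factors have already been fuzzified into integer labels. The function names and the toy label sequences are hypothetical, and the null-RHS rule is read here as falling back to the main-factor label of the current state, which is an interpretation rather than something the text states outright.

```python
from collections import defaultdict

def build_fflrgs(main, secondary):
    """Group first-order relations (L_i, K_s) -> L_j by their left-hand side."""
    groups = defaultdict(list)
    for t in range(1, len(main)):
        lhs = (main[t - 1], secondary[t - 1])
        groups[lhs].append(main[t])          # repeated RHS labels are kept
    return dict(groups)

def fuzzy_forecast(groups, l_i, k_s):
    """Return the RHS labels for the current state.

    Null RHS condition (assumed reading): an unseen state forecasts its
    own main-factor label.
    """
    return groups.get((l_i, k_s), [l_i])

main = [3, 2, 2, 1, 2, 2, 3]         # hypothetical fuzzified main factor
secondary = [3, 3, 2, 1, 1, 3, 3]    # hypothetical fuzzified secondary factor
ar_groups = build_fflrgs(main, secondary)
seen = fuzzy_forecast(ar_groups, 2, 3)     # state observed twice in training
unseen = fuzzy_forecast(ar_groups, 1, 3)   # state never observed
```

Keeping duplicate labels on the right-hand side preserves the counts a, b, and c that the defuzzification step needs.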
Step 4: Next, we defuzzified the fluctuation forecast result for the AR(1) model using the centralization method. For example, assuming $m_j$, $m_h$, and $m_l$ are the middle points of the sub-intervals corresponding to $L_j$, $L_h$, and $L_l$, respectively, the defuzzified fluctuation forecast result is given by Equation (3).
Step 5: Next, we calculated the fluctuation error series $E(t)$ as the difference between $X_1(t)$, the fluctuation time series of the main factor, and the corresponding result calculated in Step 4.
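Because the equations for Steps 4 and 5 did not survive in this copy, the following is only one plausible reading: the centralization method is taken to be the count-weighted average of the interval midpoints, (a·m_j + b·m_h + c·m_l) / (a + b + c), and the error is taken as actual minus forecast. Both choices, the midpoint values, and the function names are assumptions.

```python
from collections import Counter

def defuzzify(rhs_labels, midpoints):
    """Count-weighted average of the midpoints of the forecast labels."""
    counts = Counter(rhs_labels)
    total = sum(counts.values())
    return sum(n * midpoints[label] for label, n in counts.items()) / total

def error_series(actual_fluct, forecast_fluct):
    """E(t), with the sign convention actual minus forecast (assumed)."""
    return [a - f for a, f in zip(actual_fluct, forecast_fluct)]

midpoints = {1: -2.0, 2: 0.0, 3: 2.0}      # hypothetical midpoints for g = 3
crisp = defuzzify([2, 3, 3], midpoints)    # one "equal" and two "up" labels
errors = error_series([1.0, -3.0], [1.5, -1.0])
```

A single-label forecast reduces to that label's midpoint, so the null-RHS case needs no special handling at this stage.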
Step 6: The next step was to construct the fuzzy fluctuation time series for the error series $E(t)$. In the same manner as described in Step 1, we fuzzified $E(t)$ into the FFTS $R(t)$. We assumed that $h$ is the absolute mean of all elements of the time series $E(t)$ $(t = 2, 3, 4, \ldots, T)$, $g$ is the number of intervals of the fuzzy sets, $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_g$ are the corresponding intervals, and $W_1, W_2, \ldots, W_g$ are the corresponding fuzzy sets; $R(t) = W_i$ when $E(t)$ has its highest membership value in the interval $\varepsilon_i$ $(i = 1, 2, \ldots, g)$.
Step 7: Next, we determined the two-factor fuzzy logical relationships for the ARMA(1,m) model, as outlined in Definition 4. With the lagged variables of the two factors fuzzified as $L_i$ and $K_s$ and the lagged error terms fuzzified as $W_{i2}, W_{m2}, \ldots, W_{s2}$, the FFLR of this two-factor ARMA(1,m) model is $L_i, K_s, W_{i2}, W_{m2}, \ldots, W_{s2} \to L_j$. Then, as described in Step 2, the FFLRs with the same LHS were grouped into an FFLRG for the ARMA(1,m) model.
Step 8: Next, we obtained the fuzzy fluctuation forecast result from the ARMA(1,m) model. In the same manner as described in Step 3, we forecasted the future based on the two-factor FFLRG and the lagged variables. Assuming the lagged variables of the two factors and of the error series, the last of which is $R(t-1) = W_{s2}$, we defined the following conditions:
RHS Condition: If an FFLRG with this LHS and the RHS $L_j, \ldots, L_j, L_h, \ldots, L_h, L_l, \ldots, L_l$ exists, and $L_j$, $L_h$, and $L_l$ appear $a$, $b$, and $c$ times, respectively, then the fuzzy fluctuation forecast result is $L_j, \ldots, L_j, L_h, \ldots, L_h, L_l, \ldots, L_l$.
Null RHS Condition: If $L_i, K_s, W_{i3}, W_{m3}, \ldots, W_{s3} \to \emptyset$ exists in the FFLRG, then it is replaced with the FFLRG of the corresponding AR(1) model for $L_i, K_s$.
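Steps 7 and 8 can be sketched as a lookup keyed on the two lagged factor labels plus the last m fuzzified error labels, with the AR(1) groups as the fallback described in the null RHS condition. The sketch assumes the three label sequences are aligned in time and equally long; the names and the toy data are hypothetical.

```python
from collections import defaultdict

def build_arma_fflrgs(main, secondary, errors, m):
    """Group (L_i, K_s, last m error labels) -> L_j by left-hand side."""
    groups = defaultdict(list)
    for t in range(m, len(main)):
        lhs = (main[t - 1], secondary[t - 1], tuple(errors[t - m:t]))
        groups[lhs].append(main[t])
    return dict(groups)

def arma_forecast(arma_groups, ar_groups, l_i, k_s, recent_errors):
    """Use the ARMA(1,m) group if it exists, else fall back to AR(1)."""
    lhs = (l_i, k_s, tuple(recent_errors))
    if lhs in arma_groups:
        return arma_groups[lhs]
    # Null RHS: reuse the AR(1) group; an unseen AR(1) state falls back
    # to its own main-factor label (assumed reading of the AR(1) rule).
    return ar_groups.get((l_i, k_s), [l_i])

main = [2, 3, 2, 3, 2, 3]          # hypothetical fuzzified main factor
secondary = [1, 1, 1, 1, 1, 1]     # hypothetical fuzzified secondary factor
errors = [2, 2, 2, 2, 2, 2]        # hypothetical fuzzified error series R(t)
arma_groups = build_arma_fflrgs(main, secondary, errors, m=2)
ar_groups = {(2, 1): [3]}          # hypothetical AR(1) groups from Step 2
hit = arma_forecast(arma_groups, ar_groups, 2, 1, [2, 2])
miss = arma_forecast(arma_groups, ar_groups, 2, 1, [1, 3])
```

Because the left-hand side grows with m, larger m values make exact matches rarer in the test data, and the AR(1) fallback is used more often.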
Step 9: In the final step, we defuzzified the forecast fluctuation and obtained the forecast results. As described in Step 4, we defuzzified the new forecast fluctuation, and we then obtained the forecast value by adding the forecast fluctuation to the current value of the main factor.
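The last step amounts to adding each defuzzified fluctuation to the most recent observed value. A minimal sketch, assuming one-step-ahead forecasting in which the actual value of each day is available before the next day is forecast (the function name and the numbers are hypothetical):

```python
def one_step_forecasts(actual, fluct_forecasts):
    """Forecast(t+1) = actual(t) + forecast fluctuation for t+1."""
    return [a + d for a, d in zip(actual, fluct_forecasts)]

actual = [100.0, 102.0, 101.0]     # hypothetical observed values
fluct = [1.5, -0.5]                # defuzzified fluctuations for days 2 and 3
forecasts = one_step_forecasts(actual, fluct)
```

Here `zip` stops at the shorter list, so the final observed value is not used, which matches a setting where no forecast fluctuation exists yet for the following day.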

Forecasting TAIEX 2004
We used the 2004 TAIEX data as an example to illustrate our method. The 2004 Dow Jones data was used as the secondary factor.
Step 1: Construct the FFTSs for the historical training data of the main and secondary factors.
Step 2: Determine the fuzzy fluctuation logical relationships (FFLRs) for the two-factor AR(1) model.
Step 3: Obtain the fuzzy fluctuation forecast result for the time series. Based on the results obtained in Step 2, the two-factor AR(1) FFLRs are shown in Table 1.
Step 4: Defuzzify the fluctuation forecast result. The fluctuation forecast result was defuzzified according to Equation (3); the results are shown in Table 1.
Step 5: Calculate the fluctuation error series E(t) of the historic training data. We first added the forecast fluctuation to the value of the previous day to obtain our forecast results. We then calculated the difference between our forecast values and the actual values.
Step 6: Construct the FFTS for the error series E(t). Based on the results of Step 5, we fuzzified the fluctuation error series E(t) as we did in Step 1. The absolute mean of the fluctuation error series was 64.32. We then divided the fluctuation error series E(t) into intervals according to this absolute mean. The results are shown in Appendix B.
Step 7: Determine the fuzzy logical relationships for the ARMA(1,m) model.
In this case, to obtain optimal results, we used m = 3 to build our model. We then used the fuzzy two-factor ARMA(1,3) solution to forecast the test dataset, which is the TAIEX 2004 from November to December. The forecast values were obtained by adding the fluctuation values to the current values. The forecast results are shown in Table 2. We assessed the forecast performance by comparing the difference between the forecast values and the actual values. The indicators widely used in time series model comparisons are the mean squared error (MSE), root of the mean squared error (RMSE), mean absolute error (MAE), and mean percentage error (MPE). To compare the performance of different forecasting methods, the Diebold-Mariano test statistic (S) is also used. These formulas are defined by Equations (9)-(13), where n denotes the number of values forecasted, and forecast(t) and actual(t) denote the predicted value and actual value at time t, respectively. S is the test statistic of the Diebold-Mariano method, which is used to compare the predictive accuracy of two forecasts obtained by different methods. To compare the forecasting results obtained with different parameters, such as the number m of the two-factor ARMA(1,m) model and the number of elements g of the linguistic sets used in the fluctuation fuzzifying process, we completed different experiments and calculated the results. The average forecasting errors of the experiments are shown in Tables 3 and 4. In Table 4, g = 3 means the linguistic set is {down, equal, up}, g = 5 means {greatly down, slightly down, equal, slightly up, greatly up}, g = 7 means {very greatly down, greatly down, slightly down, equal, slightly up, greatly up, very greatly up}, etc. "None" means that the model only used the AR(1) method to forecast.
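Equations (9)-(12) are not reproduced in this copy, so the sketch below uses the textbook definitions of MSE, RMSE, and MAE. The MPE line is an assumption: it computes the mean of |forecast − actual| / actual as a fraction, which is the reading consistent in magnitude with the reported MPE of 0.0071 next to an MAE of 42.09. The function name and sample values are hypothetical.

```python
import math

def forecast_errors(forecast, actual):
    """Return MSE, RMSE, MAE, and MPE for paired forecast/actual values."""
    n = len(actual)
    diffs = [f - a for f, a in zip(forecast, actual)]
    mse = sum(d * d for d in diffs) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(d) for d in diffs) / n,
        # Assumed definition: mean absolute relative error, as a fraction.
        "MPE": sum(abs(d) / a for d, a in zip(diffs, actual)) / n,
    }

metrics = forecast_errors([102.0, 98.0], [100.0, 100.0])   # made-up values
```

RMSE is the indicator used in the cross-method and cross-dataset comparisons, so in practice the other three serve mainly as supporting detail for the single-year example.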
We employed the proposed method to forecast the TAIEX from 1997 to 2005. The forecast results and errors are shown in Figure 2 and Table 5. Table 6 shows a comparison of the RMSEs of different methods for forecasting the TAIEX 2004. The table indicates that the proposed method performs well. Although some of the other methods have better RMSEs, they often need to build complex discretization partitioning rules or employ adaptive expectation models to modify the final forecast results, whereas the method proposed in this paper is easily implemented in a computer program.

Forecasting SHSECI
The SHSECI (Shanghai Stock Exchange Composite Index) is the most influential stock market index in China. We chose the Dow Jones as the secondary factor to build our model. For each year, the authentic datasets of historical daily SHSECI closing prices from January to October were used as the training data, and the datasets from November to December were employed as the testing data. The RMSEs of the forecast errors are shown in Table 7. The proposed model accurately forecasted the SHSECI stock market.

Forecasting Gold Price
We also applied the proposed method to forecast the international gold price in USD from 2000 to 2010. We chose the COMEX gold price as the secondary factor. For each year, the authentic datasets of the historical daily closing prices from January to October were used as the training data, and the datasets from November to December were used as the testing data. The RMSEs of the forecast errors are shown in Table 8. Taking the 2010 gold price as an example, the forecast results are shown in Figure 3. We can see that the proposed model can accurately forecast the international gold price.

Conclusions
In this paper, a new forecasting model is proposed based on a first-order two-factor ARMA(1,m) model. The proposed method is based on the fluctuations of two time series. The secondary factor was used to modify the forecast performance of the main factor. The experiments showed that the fuzzy logic relations of the main and secondary factors obtained from the two training datasets can successfully predict the testing dataset of the main factor. To compare the performance with other methods, we employed TAIEX 2004 as an example to illustrate our process. We also forecasted TAIEX 1997-2005, SHSECI 2001-2015, and the international gold price 2000-2010 to show the accuracy and versatility of the model. For future research, we may consider additional aspects of the stock markets, such as volumes, closing prices, and opening prices. A third factor, or more, could be used to modify the forecasting process.


Appendix B
The fluctuation error series of training data is shown in Table A3.

Figure 1. Flowchart of the proposed forecasting model.

Step 8: Obtain the fuzzy fluctuation forecast result for the time series based on the FFLRGs of the ARMA(1,m) model. Based on the results obtained in Step 2, the two-factor ARMA(1,3) fuzzy logic relationships are shown in Appendix C.
Step 9: Defuzzify the fluctuation forecast result. We defuzzified the fluctuation forecast result according to Equation (3). The results are shown in Appendix C.

Figure 3. Comparison of actual and forecast results for gold prices in 2010.
Forecast1 represents the dataset obtained by Method 1, and Forecast2 represents the dataset obtained by Method 2. If S > 0 and |S| > Z = 1.64, then Forecast2 has better predictive accuracy than Forecast1 at the 0.05 significance level. For the proposed two-factor ARMA(1,3) method, the MSE, RMSE, MAE, and MPE were 2814.65, 53.05, 42.09, and 0.0071, respectively.
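The formula for S is not reproduced in this copy. As a rough illustration only, the sketch below computes a simplified Diebold-Mariano statistic under assumptions that are not taken from the text: squared-error loss, a one-step horizon, and no autocovariance correction. The sign convention is chosen so that S > 0 favors Forecast2, matching the rule above; the function name and data are hypothetical.

```python
import math

def diebold_mariano(actual, forecast1, forecast2):
    """Simplified DM statistic on squared-error loss differentials.

    Assumes one-step forecasts and ignores autocovariance terms. The loss
    differentials must not all be identical (zero variance).
    """
    d = [(a - f1) ** 2 - (a - f2) ** 2
         for a, f1, f2 in zip(actual, forecast1, forecast2)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    return mean_d / math.sqrt(var_d / n)

actual = [10.0, 10.0, 10.0, 10.0]        # made-up actual values
forecast1 = [12.0, 13.0, 12.0, 13.0]     # made-up forecasts from Method 1
forecast2 = [11.0, 10.0, 11.0, 10.0]     # made-up forecasts from Method 2
s = diebold_mariano(actual, forecast1, forecast2)
forecast2_better = s > 1.64              # one-sided test at the 0.05 level
```

With only a few dozen test days per year, the normal approximation behind the 1.64 threshold is itself rough, so the statistic is best read as supporting evidence alongside the RMSE comparison.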

Table 4. Comparison of forecasting errors for different linguistic sets (m = 3).

Table 6. A comparison of RMSEs for different methods for forecasting the TAIEX 2004.
** According to the Diebold-Mariano test statistic (S), the proposed method has better accuracy than the other methods at the 5% significance level at least.

Table 7. RMSEs of forecast errors for the Shanghai Stock Exchange Composite Index (SHSECI) from 2007 to 2015.

Table 8. RMSEs of forecast errors for gold price from 2000 to 2010.

Table A1. Historical training data and fuzzified fluctuation data of TAIEX 2004.

Table A2. Historical training data and fuzzified fluctuation data of Dow Jones 2004.

Table A3. The fluctuation error series.