A Novel Grey Prediction Model Combining Markov Chain with Functional-Link Net and Its Application to Foreign Tourist Forecasting

Abstract: Grey prediction models for time series have been widely applied to demand forecasting because they require only limited data to build a time series model and make no statistical assumptions. Previous studies have demonstrated that combining grey prediction with neural networks improves prediction performance. Some methods improve the prediction accuracy of the popular GM(1,1) model by using a Markov chain to estimate the residual needed to modify a predicted value. In contrast to previous Grey-Markov models, this study applies the functional-link net to estimate the degree to which a predicted value obtained from the GM(1,1) model can be adjusted. Furthermore, the number of states and their bounds, which are not easily specified for the Markov chain, are determined by a genetic algorithm. To verify prediction performance, the proposed grey prediction model was applied to an important grey system problem: foreign tourist forecasting. Experimental results show that the proposed model provides satisfactory results compared to the other Grey-Markov models considered.


Introduction
Both time series and econometric methods have been commonly used for demand forecasting. However, the prediction performance of econometric methods can be significantly influenced by incomplete information associated with explanatory factors, and time series models, such as ARIMA [1] and Box-Jenkins models, usually require large samples to obtain reasonable prediction accuracy [2][3][4][5]. Neural networks, such as the multilayer perceptron and support vector regression, have also been applied to demand forecasting [6,7]. Although neural networks have proven to be an efficient computational intelligence technique for representing complex nonlinear mappings, the multilayer perceptron and support vector regression, like econometric methods, suffer from incomplete information associated with input variables.
Grey prediction models [8] can characterize an unknown system with small data sets [9], without requiring conformance to statistical assumptions such as normality. For time series prediction, GM(1,1) is among the most frequently used grey prediction models [10]. It requires only four recent samples to derive reliable and acceptable prediction accuracy [5], and has been widely applied to various decision problems involving management, economics, and engineering [2][3][4][11][12][13][14][15][16]. To improve the prediction performance of the original GM(1,1) model, several versions combined with computational intelligence have been proposed, such as models with self-adaptive intelligence [17], neural-network-based grey prediction for electricity consumption prediction [18,19], PGM(1,1) using particle swarm optimization to determine the development coefficient [20], GM(1,1) models with online sequential extreme learning machine [21], an optimized nonlinear grey Bernoulli model [22], an adaptive GM(1,1) for electricity consumption [3], and grey wave forecasting through qualified contour sequences [23]. According to the literature, the combination of grey prediction and neural networks can better represent system dynamics with uncertainty and nonlinearity [21].
The GM(1,1) model with residual modification can be established to improve the prediction accuracy of the original GM(1,1) model [7,9]. To modify the predicted values from the original model, a residual modification model is commonly set up by building the original GM(1,1) model and then constructing a residual GM(1,1) model from the residual series [4,19]. As a matter of fact, grey prediction models with residual modification all stem from the GM(1,1) residual model. Interestingly, the prediction accuracy obtained by the original GM(1,1) model can be effectively improved by using a Markov chain to realize the residual model [24,25]. The Grey-Markov model, MCGM(1,1), uses the GM(1,1) model to capture the basic trend of the original data, and then uses the Markov chain to correct the residual errors generated by the GM(1,1) model. It has shown advantages over the GM(1,1) model when the time series data fluctuate significantly [26,27]. In other related MCGM(1,1) studies, Hsu and Wen [28] and Hsu [29] used Markov chain sign estimation to modify residuals for the trans-Pacific air passenger market and the global integrated circuit industry. Hsu et al. [30] combined a Fourier grey model with the Markov chain to predict the turning time of the stock market. Kumar and Jain [31] applied MCGM(1,1) to predict conventional energy consumption. Li et al. [32] combined RGM(1,1) with the Markov chain for thermal electric power generation. Mao and Sun [33] applied MCGM(1,1) to fire accident prediction. Sun et al. [25] proposed an MCGM(1,1) variant using the Cuckoo search algorithm for foreign tourist arrivals prediction. Wang [34] showed the effectiveness of MCGM(1,1) for tourism demand prediction. Xie et al. [35] proposed a QP-Markov model to estimate the probability that one energy component can transit to another energy component.
However, for an MCGM(1,1)-based model, it is not easy to determine the number of states and their bounds for the Markov chain. These parameters are usually specified in advance through experience, and the modification range for a predicted value derived by the original GM(1,1) model is identical to its corresponding predicted residual from the Markov chain. Both factors can affect prediction performance. Because of the advantage of combining grey prediction with neural networks, we propose a residual modification model based on neural networks, NN-Grey-Markov, which incorporates a functional-link net (FLN) with effective function approximation capability [36][37][38][39] to estimate the modification range with respect to a predicted residual obtained from the Markov chain. The genetic algorithm (GA) is employed to determine the connection weights of the FLN, the number of states, and the bounds of each state, so as to construct the proposed grey prediction model with high prediction accuracy.
Foreign tourist forecasting can be regarded as a grey system problem since several factors influence tourism demand in uncertain ways. That is, factors such as exchange rates, security, and disease cause fluctuations in tourism demand, but the precise manner of their effect is not clear. The variability of the international tourism market has made foreign tourist prediction a challenging task for tourism administrators [25,40,41]. The global tourism industry has a significant impact on a nation's economic development, and foreign tourist forecasting plays a very important role in devising tourism development plans for cities or countries. This motivates us to examine the prediction performance of the proposed residual modification model on foreign tourist forecasting.
The remainder of the paper is organized as follows. Section 2 introduces the MCGM(1,1) model, and Section 3 presents the proposed NN-Grey-Markov model. Section 4 validates the prediction accuracy of the proposed grey prediction model for foreign tourist forecasting using two real cases. Section 5 concludes the paper.

Original GM(1,1) Model
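By the one-time accumulated generating operation (1-AGO) [9], a new sequence, $x^{(1)} = (x^{(1)}_1, x^{(1)}_2, \ldots, x^{(1)}_n)$, can be generated from an original data sequence $x^{(0)} = (x^{(0)}_1, x^{(0)}_2, \ldots, x^{(0)}_n)$ as follows:

$$x^{(1)}_k = \sum_{j=1}^{k} x^{(0)}_j, \quad k = 1, 2, \ldots, n,$$

and $x^{(1)}_1, x^{(1)}_2, \ldots, x^{(1)}_n$ can be approximated by a first-order whitenization differential equation,

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b,$$

where $a$ is the developing coefficient and $b$ is the control variable. Using 1-AGO helps to identify regularity hidden in data sequences, even if the original data are finite, insufficient, and chaotic.
The predicted value, $\hat{x}^{(1)}_k$, for $x^{(1)}_k$ can be obtained by solving the differential equation with initial condition $x^{(1)}_1 = x^{(0)}_1$:

$$\hat{x}^{(1)}_k = \left(x^{(0)}_1 - \frac{b}{a}\right) e^{-a(k-1)} + \frac{b}{a},$$

so $\hat{x}^{(1)}_1 = x^{(0)}_1$ holds, and $a$ and $b$ can be estimated from the grey difference equation

$$x^{(0)}_k + a z^{(1)}_k = b,$$

where $z^{(1)}_k$ is the background value,

$$z^{(1)}_k = \alpha x^{(1)}_k + (1 - \alpha) x^{(1)}_{k-1},$$

where $\alpha = 0.5$ usually, for convenience. Using $n - 1$ grey difference equations ($k = 2, 3, \ldots, n$), $a$ and $b$ can be derived using the ordinary least squares approach,

$$[a, b]^T = (B^T B)^{-1} B^T y,$$

where

$$B = \begin{bmatrix} -z^{(1)}_2 & 1 \\ -z^{(1)}_3 & 1 \\ \vdots & \vdots \\ -z^{(1)}_n & 1 \end{bmatrix}$$

and

$$y = [x^{(0)}_2, x^{(0)}_3, \ldots, x^{(0)}_n]^T.$$

Using the inverse AGO, the predicted value of $x^{(0)}_k$ is

$$\hat{x}^{(0)}_k = \hat{x}^{(1)}_k - \hat{x}^{(1)}_{k-1}, \quad k = 2, 3, \ldots, n.$$

Therefore,

$$\hat{x}^{(0)}_k = (1 - e^{a}) \left(x^{(0)}_1 - \frac{b}{a}\right) e^{-a(k-1)}, \quad k = 2, 3, \ldots, n,$$

and $\hat{x}^{(1)}_1 = x^{(0)}_1$ holds.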
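The GM(1,1) steps described in this section can be illustrated with a minimal pure-Python sketch. The function names `gm11_fit` and `gm11_predict` and the sample series are hypothetical and chosen only for illustration; the two-parameter least squares problem is solved in closed form as a simple linear regression of $x^{(0)}_k$ on the background values, which is equivalent to the matrix expression for $a$ and $b$.

```python
import math

def gm11_fit(x0, alpha=0.5):
    """Estimate (a, b) by least squares on x0[k] + a*z[k] = b, k = 2..n."""
    # 1-AGO: cumulative sums of the original series
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values z_k and targets y_k for k = 2..n
    z = [alpha * x1[k] + (1 - alpha) * x1[k - 1] for k in range(1, len(x0))]
    y = x0[1:]
    # Simple linear regression y = slope*z + b, where slope = -a
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(p * q for p, q in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b

def gm11_predict(x0, a, b, horizon=0):
    """Return fitted values for k = 1..n followed by `horizon` forecasts."""
    c = x0[0] - b / a
    preds = [x0[0]]  # the first fitted value equals the first observation
    for k in range(2, len(x0) + horizon + 1):
        preds.append((1 - math.exp(a)) * c * math.exp(-a * (k - 1)))
    return preds

# Illustrative (made-up) series growing by about 10% per period
series = [100.0, 110.0, 121.0, 133.1]
a, b = gm11_fit(series)
fitted = gm11_predict(series, a, b, horizon=1)
```

For a growing series the estimated developing coefficient is negative, and the last element of `fitted` is the one-step-ahead forecast.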

Residual Modification by Markov Chain
Let $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)$ denote the sequence of residual values, where

$$\varepsilon_k = x^{(0)}_k - \hat{x}^{(0)}_k.$$

Let $[\varepsilon_{\min}, \varepsilon_{\max}]$ denote the range of residuals, where $\varepsilon_{\min}$ and $\varepsilon_{\max}$ are the minimum and maximum values among $\varepsilon_k$, respectively. Then $[\varepsilon_{\min}, \varepsilon_{\max}]$ can be divided into $r$ intervals ($r \ge 2$), with each interval treated as a state. The state with lower bound $\varepsilon_{\min}$ is state 1, and state $r$ is the state with upper bound $\varepsilon_{\max}$. Therefore, the state of $\varepsilon_k$ is determined by the interval in which it falls. The intervals are not required to have equal length.
Subsequently, an $m$-step transition probability matrix $P^{(m)}$ can be generated as follows:

$$P^{(m)} = \begin{bmatrix} p^{(m)}_{11} & \cdots & p^{(m)}_{1r} \\ \vdots & \ddots & \vdots \\ p^{(m)}_{r1} & \cdots & p^{(m)}_{rr} \end{bmatrix}, \quad p^{(m)}_{ij} = \frac{t^{(m)}_{ij}}{t_i},$$

where $p^{(m)}_{ij}$ denotes the transition probability from state $i$ to $j$ ($1 \le i, j \le r$) by $m$ steps, $t^{(m)}_{ij}$ denotes the number of transitions from state $i$ to $j$ by $m$ steps, and $t_i$ denotes the number of state $i$ among the sequence of residual values. For each row in $P^{(m)}$, the sum of elements equals one. However, $p^{(m)}_{ii}$ can be specified directly as one when the sum of elements in row $i$ equals zero. In other words, such a state is treated as an absorbing state.
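The predicted residual value, $\hat{\varepsilon}^{(m)}_k$, can be obtained as

$$\hat{\varepsilon}^{(m)}_k = \sum_{w=1}^{r} p^{(m)}_{iw} c_w,$$

where $i$ is the state in which $\varepsilon_{k-m}$ falls and $c_w$ ($1 \le w \le r$) is the center of state $w$, whose lower and upper bounds are $l_w$ and $u_w$, respectively. Alternatively, $c_w$ can be expressed as [25,32]

$$c_w = \alpha_w l_w + (1 - \alpha_w) u_w,$$

where $\alpha_w$ is the relative weight in state $w$. Then $\hat{x}^{(0)}_k$ can be revised as a new predicted value $\tilde{x}^{(0)}_k$ by adding the predicted residual $\hat{\varepsilon}^{(m)}_k$:

$$\tilde{x}^{(0)}_k = \hat{x}^{(0)}_k + \hat{\varepsilon}^{(m)}_k.$$

The Markov chain is thus used to modify the residuals generated by the GM(1,1) model. Sun et al. [25] and Mao and Sun [33] used the sequence of relative errors rather than the sequence of residual values.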
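The residual modification by Markov chain can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names, the toy residuals, and the state bounds are hypothetical, and two details are assumptions rather than statements from the text, namely that each row is normalized by the number of outgoing transitions (so that rows sum to one) and that the predicted residual is the probability-weighted sum of state centers taken from the row of the most recent state.

```python
def assign_state(eps, bounds):
    """0-based index of the interval [bounds[w], bounds[w+1]] containing eps."""
    r = len(bounds) - 1
    for w in range(r):
        if eps <= bounds[w + 1]:
            return w
    return r - 1  # values above the last bound fall in the top state

def transition_matrix(states, r, m=1):
    """m-step transition probabilities estimated from a sequence of states."""
    counts = [[0] * r for _ in range(r)]
    for i, j in zip(states[:-m], states[m:]):
        counts[i][j] += 1
    matrix = []
    for i in range(r):
        total = sum(counts[i])
        if total == 0:
            # No outgoing transition observed: treat state i as absorbing
            row = [1.0 if j == i else 0.0 for j in range(r)]
        else:
            row = [c / total for c in counts[i]]
        matrix.append(row)
    return matrix

def predict_residual(current_state, matrix, centers):
    """Probability-weighted sum of state centers (assumed prediction rule)."""
    return sum(p * c for p, c in zip(matrix[current_state], centers))

# Toy residuals split into r = 2 states: [-2, 0] and (0, 2]
residuals = [-2.0, 1.0, -1.0, 2.0, -2.0, 1.0]
bounds = [-2.0, 0.0, 2.0]
centers = [-1.0, 1.0]
states = [assign_state(e, bounds) for e in residuals]
P = transition_matrix(states, r=2)
next_residual = predict_residual(states[-1], P, centers)
```

In this toy sequence the states alternate, so the chain predicts a negative residual after the final positive one.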

The Proposed NN-Grey-Markov Model
Two issues must be addressed for the original Grey-Markov model. First, the modification range of $\hat{x}^{(0)}_k$ in the original Grey-Markov model is restricted to $\hat{\varepsilon}^{(m)}_k$ with a positive sign, which may affect the prediction accuracy of residual modification models. To increase flexibility, the restriction may be relaxed by deriving the sign and modification range with respect to $\hat{\varepsilon}^{(m)}_k$. Second, the number of intervals, $r$, is fixed and usually specified in advance. To improve prediction accuracy, it is reasonable to apply a GA, which is a powerful search and optimization method [42][43][44], to automatically determine $r$ and the lower and upper bounds of each interval. FLN is an appropriate tool for estimating the sign and modification range, due to its effective function approximation capability. Section 3.1 describes how to apply FLN to estimate the sign and modification range for each predicted residual, and Section 3.2 describes the construction of the proposed NN-Grey-Markov model using GA to determine the required parameters, including the FLN connection weights, $r$, and the lower and upper bounds of each interval.

Incorporating Functional-Link Net into the Proposed NN-Grey-Markov Model
For flexibility, it is reasonable to modify $\hat{x}^{(0)}_k$ as $\tilde{x}^{(0)}_k$ by adding or subtracting $\hat{\varepsilon}^{(m)}_k$,

$$\tilde{x}^{(0)}_k = \hat{x}^{(0)}_k + y_k \hat{\varepsilon}^{(m)}_k,$$

where $y_k$ ranges from −1 to 1 and can be interpreted as the degree to which $\hat{x}^{(0)}_k$ can be adjusted. That is, if $y_k$ is positive, the greater $y_k$, the more likely $\hat{x}^{(0)}_k$ is to be adjusted toward $\hat{x}^{(0)}_k + \hat{\varepsilon}^{(m)}_k$. On the contrary, if $y_k$ is negative, the smaller $y_k$, the more likely $\hat{x}^{(0)}_k$ is to be adjusted toward $\hat{x}^{(0)}_k - \hat{\varepsilon}^{(m)}_k$. We estimated $y_k$ with FLN using the hyperbolic tangent function, which has range (−1, 1), as the activation function. An enhanced pattern with respect to a single input $t_k$ can be generated as $(t_k, \sin(\pi t_k), \cos(\pi t_k), \sin(2\pi t_k), \cos(2\pi t_k), \sin(4\pi t_k))$ through a functional link, where $t_k$ denotes the time period $k$ with respect to $\hat{x}^{(0)}_k$. Let $\theta$ be the bias to the output node. Then the actual output value, $y_k$, corresponding to this enhanced pattern is

$$y_k = \tanh\left(w_1 t_k + w_2 \sin(\pi t_k) + w_3 \cos(\pi t_k) + w_4 \sin(2\pi t_k) + w_5 \cos(2\pi t_k) + w_6 \sin(4\pi t_k) + \theta\right).$$

Although the components in the functional expansion representation can be extended without limit for $t_k$, this is not practical in real applications. The six-component pattern $(t_k, \sin(\pi t_k), \cos(\pi t_k), \sin(2\pi t_k), \cos(2\pi t_k), \sin(4\pi t_k))$ with respect to $t_k$ is acceptable [37]. Hu [45] also demonstrated the superiority of applying residual modification using FLN to predict energy demand.
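The functional-link computation can be sketched as follows. The function names are hypothetical, and the adjustment rule (predicted value plus $y_k$ times the predicted residual) is an assumption inferred from the description of $y_k$ as a signed degree of adjustment in (−1, 1), not a confirmed formula.

```python
import math

def enhanced_pattern(t):
    """Six-component functional expansion of a single input t."""
    return [t,
            math.sin(math.pi * t), math.cos(math.pi * t),
            math.sin(2 * math.pi * t), math.cos(2 * math.pi * t),
            math.sin(4 * math.pi * t)]

def fln_output(t, weights, theta):
    """tanh of the weighted sum of the enhanced pattern plus the bias."""
    net = sum(w * z for w, z in zip(weights, enhanced_pattern(t))) + theta
    return math.tanh(net)

def adjust_prediction(x_hat, eps_hat, y):
    """Assumed rule: shift the GM(1,1) prediction by y * predicted residual."""
    return x_hat + y * eps_hat
```

With all weights and the bias at zero the output is zero and the GM(1,1) prediction is left unchanged, which makes the unmodified model a special case of the adjusted one.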

Constructing the Proposed NN-Grey-Markov Model
To construct the proposed grey prediction model with high prediction accuracy, we consider the mean absolute percentage error (MAPE), which is commonly recommended for modelling [46,47]. MAPE with respect to $x^{(0)}$ is formulated as follows:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{k=1}^{n} \frac{\left| x^{(0)}_k - \tilde{x}^{(0)}_k \right|}{x^{(0)}_k} \times 100\%,$$

where $\tilde{x}^{(0)}_k$ denotes the revised predicted value of $x^{(0)}_k$. The aim is to set up a prediction model with high prediction accuracy, so the problem can be formulated as maximizing the reciprocal of MAPE. Using this fitness function, a real-valued GA was developed to automatically determine $7 + 2r$ parameters, including the connection weights ($w_1, w_2, w_3, w_4, w_5, w_6$), bias ($\theta$), the number of intervals ($r$), partition points ($p_1, p_2, \ldots, p_{r-1}$), and relative weights in respective intervals ($\alpha_1, \alpha_2, \ldots, \alpha_r$) for the proposed grey prediction model, where $w_1, w_2, w_3, w_4, w_5, w_6$, and $\theta$ range from −1 to 1, $p_1, p_2, \ldots, p_{r-1}$ range from $\varepsilon_{\min}$ to $\varepsilon_{\max}$, and $r$ ranges from 2 to 10. It is noted that $u_{r-1} = l_r = p_{r-1}$ holds.
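The fitness evaluation can be sketched as below. The function names are hypothetical, and returning infinity for a perfect fit is an assumption added to avoid division by zero; the parameter count simply tallies the components listed in the text.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def fitness(actual, predicted):
    """Reciprocal of MAPE; a perfect fit is mapped to infinity (assumption)."""
    error = mape(actual, predicted)
    return float("inf") if error == 0.0 else 1.0 / error

def chromosome_length(r):
    """6 weights + 1 bias + 1 for r + (r - 1) partition points + r relative weights."""
    return 6 + 1 + 1 + (r - 1) + r
```

For example, predictions of 110 and 180 against actual values of 100 and 200 are each off by 10%, so the MAPE is 10 and the fitness is 0.1.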
Let $n_{size}$ and $n_{max}$ denote the population size and maximum number of generations, respectively, and let $P_m$ denote the population in generation $m$ ($1 \le m \le n_{max}$). After evaluating the fitness value of each chromosome in $P_m$, $n_{size}$ new chromosomes were generated for $P_{m+1}$ by means of selection, crossover, and mutation. The GA was performed for $n_{max}$ generations. When this stopping condition was satisfied, the algorithm was terminated, and the best chromosome, with maximum fitness value among the successive generations, was used to examine the generalization ability of the NN-Grey-Markov model. The genetic operations are briefly described below. Two chromosomes from $P_m$ were randomly selected by binary tournament selection, and the one with higher fitness was put into a mating pool. This process was repeated until $n_{size}$ chromosomes were placed in the mating pool. Then $n_{size}/2$ pairs of chromosomes from the pool were randomly selected, and offspring of the selected parents were reproduced by crossover and mutation.
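Binary tournament selection into a mating pool can be sketched as follows. The function names are hypothetical, and drawing two distinct competitors per tournament is an assumption, since the text does not say whether the same chromosome may be drawn twice.

```python
import random

def binary_tournament(population, fitnesses, rng):
    """Draw two distinct chromosomes at random and return the fitter one."""
    i, j = rng.sample(range(len(population)), 2)
    return population[i] if fitnesses[i] >= fitnesses[j] else population[j]

def build_mating_pool(population, fitnesses, rng):
    """Repeat the tournament until the pool has as many members as the population."""
    return [binary_tournament(population, fitnesses, rng) for _ in population]

rng = random.Random(0)
pool = build_mating_pool(["a", "b"], [0.1, 0.9], rng)
```

Under this assumption the least fit chromosome can never enter the pool, because it loses every tournament against a distinct competitor.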

Crossover
Crossover was applied to reproduce children by altering the parent chromosomal makeup. For two selected chromosomes, $u = (u_1, u_2, \ldots, u_{7+2r})$ and $v = (v_1, v_2, \ldots, v_{7+2r})$, each pair of real-valued genes $(u_i, v_i)$ can be used to generate two new genes with crossover probability $Pr_c$:

$$u'_i = h_i u_i + (1 - h_i) v_i, \qquad v'_i = (1 - h_i) u_i + h_i v_i,$$

where $h_1, h_2, \ldots, h_{7+2r}$ are all random numbers in the interval [0, 1]. It is noted that $Pr_c$ should be specified as a large value because it controls the exploratory range in the solution space.
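A gene-wise real-valued crossover can be sketched as below, assuming the arithmetic (convex-combination) form in which each pair of parent genes is blended with a random number between 0 and 1. The function name and this specific blending form are assumptions made for illustration.

```python
import random

def arithmetic_crossover(u, v, pr_c, rng):
    """Blend each gene pair with probability pr_c using a random h in [0, 1)."""
    child_u, child_v = list(u), list(v)
    for i in range(len(u)):
        if rng.random() < pr_c:
            h = rng.random()
            child_u[i] = h * u[i] + (1 - h) * v[i]
            child_v[i] = (1 - h) * u[i] + h * v[i]
    return child_u, child_v
```

Each child gene lies between the two parent genes and the pairwise sum is preserved, so this operator cannot move a gene outside the range spanned by its parents.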

Mutation
Mutation was performed with probability $Pr_m$ for each real-valued parameter in a new chromosome generated by crossover. To avoid excessive perturbation, a low mutation rate was used. When a real-valued gene was mutated, it was changed by adding a number randomly selected from a specified interval. After crossover and mutation, $n_{del}$ ($0 \le n_{del} \le n_{size}$) chromosomes in $P_{m+1}$ were removed randomly from the set of new chromosomes to create space for the chromosome with maximum fitness value in $P_m$.
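Mutation and the elitist replacement step can be sketched as follows. The function names, the symmetric perturbation interval `[-step, step]`, and the clamping of mutated genes to `[lower, upper]` are assumptions made for illustration; the text only states that a random number from a specified interval is added. The elite chromosomes are passed in by the caller, so the sketch does not decide how they are chosen.

```python
import random

def mutate(chromosome, pr_m, step, rng, lower=-1.0, upper=1.0):
    """Perturb each gene with probability pr_m, then clamp to its range."""
    mutated = []
    for gene in chromosome:
        if rng.random() < pr_m:
            gene = gene + rng.uniform(-step, step)
            gene = min(max(gene, lower), upper)  # clamping is an assumption
        mutated.append(gene)
    return mutated

def apply_elitism(offspring, elites, rng):
    """Randomly drop len(elites) offspring and append the elites instead."""
    survivors = list(offspring)
    for _ in elites:
        survivors.pop(rng.randrange(len(survivors)))
    return survivors + list(elites)
```

The population size is unchanged by `apply_elitism`, and the best solutions found so far are guaranteed to survive into the next generation.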

Background
The global tourism industry plays a significant role in the economic development of a country. To boost the tourism industry, devising tourism development and marketing strategies by estimating the number of foreign tourists has become increasingly important for governments and for private-sector industries such as airlines, hospitality services, and travel agencies. Effective tourism demand forecasting can significantly affect the amount of resources that governments and private sectors invest [6]. In Taiwan, tourism statistics show that foreign tourists mainly came from Japan, Hong Kong, Macao, Korea, China, and USA for 2014-2016. It is noteworthy that the number of tourist arrivals from Southeast Asia increased by 15% in May 2016 compared to May 2015. In view of this growth rate, authorities have actively investigated how to continue expanding the tourism market in Southeast Asia through new policies. Therefore, foreign tourist forecasting will have a great impact on the outcomes of programs related to these policies.
Real datasets are used to conduct experiments comparing foreign tourist forecasting from the proposed NN-Grey-Markov model against the original GM(1,1), MCGM(1,1), and several models proposed by Sun et al. [25], including segmented GM(1,1) (SGM(1,1)), SGM(1,1) using Markov chain (MCSGM(1,1)), and MCSGM(1,1) using a Cuckoo search algorithm (CMCSGM(1,1)). In contrast to the original GM(1,1) and MCGM(1,1), which use all observed data, the SGM model first uses a rolling mechanism to determine the set of newly observed data, and then constructs the GM(1,1) model. The rolling mechanism selects only a few recent data by capturing the developing trend from all observed data, reflecting the premise that as the system develops, the significance of older data reduces [9,48].
Therefore, the training data retained after rolling were applied to the SGM(1,1), MCSGM(1,1), and CMCSGM(1,1) models. For $x^{(0)} = (x^{(0)}_1, x^{(0)}_2, \ldots, x^{(0)}_n)$, the $l$-point rolling ($4 \le l \le n - 1$) can be exercised on $x^{(0)}$ to construct a GM(1,1) model, and $\mathrm{MAPE}_l$, the MAPE corresponding to the $l$-point rolling, can be computed. Finally, the best number of points, say $v$, that can be used to construct a GM(1,1) model, called SGM(1,1), is determined as

$$v = \arg\min_{4 \le l \le n-1} \mathrm{MAPE}_l.$$

For fair comparisons, the proposed NN-Grey-Markov model used the same training data as the SGM(1,1), MCSGM(1,1), and CMCSGM(1,1) models. The rest of this section is organized as follows. Section 4.1 presents the parameter specifications for the GA, and Section 4.2 presents the prediction accuracy of the different grey prediction models on real data.
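The window-selection idea behind the rolling mechanism can be sketched generically. This is a sketch under stated assumptions: each window of `l` consecutive points is used to forecast the next point, the errors are averaged into a MAPE for that `l`, and the `l` with the smallest MAPE is chosen. The function names are hypothetical, and `forecast_next` stands in for a GM(1,1) model fitted on the window; a naive last-value forecaster is used in the example only to keep the block self-contained.

```python
def rolling_mape(series, l, forecast_next):
    """MAPE of one-step-ahead forecasts made from every window of l points."""
    errors = []
    for start in range(len(series) - l):
        window = series[start:start + l]
        target = series[start + l]
        errors.append(abs(target - forecast_next(window)) / abs(target))
    return 100.0 * sum(errors) / len(errors)

def best_window(series, forecast_next, l_min=4):
    """Window length in [l_min, n - 1] with the smallest rolling MAPE."""
    candidates = range(l_min, len(series))
    return min(candidates, key=lambda l: rolling_mape(series, l, forecast_next))

# Placeholder forecaster (NOT GM(1,1)): repeat the last observed value
toy_series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
v = best_window(toy_series, lambda window: window[-1])
```

Swapping the placeholder for a function that fits GM(1,1) on the window and returns its one-step forecast would give a segmented model in the spirit of SGM(1,1).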

GA Parameters
It is known that population size and the crossover and mutation probabilities can affect GA performance, and there are no universally optimal parameter settings. Therefore, following [42,44], the experiment parameters were chosen to be $n_{size} = 200$, $n_{max} = 1000$, $n_{del} = 2$, $Pr_c = 0.8$, and $Pr_m = 0.01$. The same GA parameters were used for every data set to examine the prediction accuracy of the proposed NN-Grey-Markov model.

Case I
The first experiment was conducted on the yearly statistics reported by the Taiwan Tourism Bureau [49]. Table 1 shows historical annual foreign tourists to Taiwan from six countries (Japan, Hong Kong/Macao, Korea, China, USA, and Southeast Asia), collected from 2001 to 2016. Year 2016 was used for testing with a one-step transition probability matrix, i.e., m = 1. After performing the rolling mechanism, 2011-2015 data from China and 2012-2015 data from the other countries were used for model-fitting for the SGM(1,1), MCSGM(1,1), CMCSGM(1,1), and proposed grey prediction models. For every country, the original GM(1,1) and MCGM(1,1) models were constructed using data from 2001 to 2015. Figures 1 and 2 show prediction results with respect to model-fitting and testing for the different models, respectively. Figure 1 shows that the proposed NN-Grey-Markov model outperforms the other prediction models considered for model-fitting. For testing, the proposed NN-Grey-Markov model outperforms the SGM(1,1), MCSGM(1,1), and CMCSGM(1,1) models, and it is slightly inferior to the original GM(1,1) and MCGM(1,1) models for Hong Kong/Macao and Southeast Asia.

Case II
Historical annual data from 1997 to 2013 published by the China National Tourism Administration were used to conduct the second experiment. The data were summarized in Reference [25]. The collected data cover foreign tourists from eight main countries: Japan, Korea, Malaysia, Mongolia, Philippines, Russia, Singapore, and USA. Year 2013 was used for testing with a one-step transition probability matrix. After performing the rolling mechanism, 2005-2012 data from Korea, Japan, USA, and Malaysia; 2006-2012 data from Russia; 2003-2012 data from Mongolia and Philippines; and 2004-2012 data from Singapore were used to construct the SGM(1,1), MCSGM(1,1), CMCSGM(1,1), and proposed NN-Grey-Markov models. The original GM(1,1) and MCGM(1,1) models were constructed using data from 1997 to 2012 for each country. Using the same data, Sun et al. [25] demonstrated the effectiveness of the CMCSGM(1,1) model.
Figures 3 and 4 show prediction results with respect to model-fitting and testing for the different prediction models, respectively. For model-fitting, Figure 3 shows that the proposed NN-Grey-Markov model is comparable or superior to the compared models. It is slightly inferior to the CMCSGM(1,1) model for Malaysia. As for testing results, the proposed NN-Grey-Markov model outperforms the CMCSGM(1,1) model, except for Malaysia and Singapore. The proposed grey prediction model is superior to the original GM(1,1), SGM(1,1), and MCSGM(1,1) models. Overall, the proposed NN-Grey-Markov model provides comparable and satisfactory results compared to the other prediction models considered.

Discussion and Conclusions
Fluctuations in data such as tourism time series often arise from random factors, whose influence can be effectively reduced by the Grey-Markov model. Based on the Grey-Markov model, this study proposed a novel grey residual modification model, NN-Grey-Markov, for tourism demand prediction. Compared to previous studies based on MCGM(1,1), the proposed NN-Grey-Markov model has two distinctive features. First, the proposed grey prediction model incorporates FLN to estimate the sign of each revised residual and the degree to which a predicted value from the GM(1,1) model can be adjusted. Second, the number of states and their bounds for the Markov chain need not be defined in advance, since these can be fully determined by GA. It should be noted that the FLNGM(1,1) model [45] integrated the original GM(1,1) with the residual GM(1,1) model and then used the functional-link net to estimate the sign and the modification range with respect to a predicted residual obtained from the residual GM(1,1) model. The implementation of the proposed NN-Grey-Markov model is therefore different from that of the FLNGM(1,1) model.
Development of the tourism industry has contributed substantially to economic prosperity. In the variable global tourism market, accurate prediction of tourism demand is crucial for governments and private sectors to set up strategies (such as investment and construction) to promote the tourism industry. It is challenging to predict the trend of tourism demand precisely. From the perspective of the grey system, it is reasonable to apply the GM(1,1) model to foreign tourist prediction. Historical annual data for foreign tourists, collected from official institutions in Taiwan and China, were used to evaluate the prediction performance of the proposed NN-Grey-Markov model. The proposed model performs well with pre-specified GA parameters, including population size, number of generations, and probabilities for crossover and mutation. This means that fine parameter tuning is not required for the proposed prediction model, and that the parameter specifications introduced in the previous section were acceptable. Real case experiments reveal that the proposed NN-Grey-Markov model outperformed the other grey prediction models considered for the majority of data sets. This supports the potential usefulness of the proposed NN-Grey-Markov model for tourism demand prediction.
For future studies, there are two issues that require addressing. First, this study used a one-step transition probability matrix to predict the residual for testing on a predicted time period. An alternative is to sum the rows of the transition probability matrices corresponding to several time periods immediately prior to the predicted one to estimate the residual for the predicted time period [32,33]. It would be interesting to examine the influence of this alternative on foreign tourist prediction using the proposed NN-Grey-Markov model. Second, FLN used the hyperbolic tangent function as the output neuron's activation function, computing a weighted sum of a vector of connection weights with an enhanced pattern. This assumes additivity among individual variables in the enhanced pattern [50].

GA parameters used in the experiments:
(i) $n_{size} = 200$: it is reasonable to specify a population size ranging from 50 to 500 individuals.
(ii) $n_{max} = 1000$: $n_{max}$ serves as the stopping condition, and it should take the available computing time into account.
(iii) $n_{del} = 2$: a small number of elite chromosomes is considered.
(iv) $Pr_c = 0.8$, $Pr_m = 0.01$.

Figure 1. Model-fitting results for Case I.

Figure 2. Testing results for Case I.
Information 2017, 8, 126

Figure 3. Model-fitting results for Case II.

Figure 4. Testing results for Case II.

Table 1. Historical annual foreign tourists from six countries to Taiwan.