Short-Term Traffic Flow Prediction of Urban Road Using Time Varying Filtering Based Empirical Mode Decomposition

Abstract: Short-term traffic flow prediction is important for real-time traffic guidance. However, because short-term traffic volume data are strongly nonlinear and non-stationary, it is hard to obtain a satisfactory result with traditional methods. To this end, this paper develops a hybrid method based on time varying filtering based empirical mode decomposition (TVF-EMD) and the least square support vector machine (LSSVM). Specifically, TVF-EMD is first used to deal with the non-stationarity in the original data by decomposing them into several different subseries. Then, an LSSVM model is established for each subseries to capture the linear and nonlinear characteristics embedded in the original data, and the corresponding prediction results are superimposed to obtain the final one. Finally, case studies based on two groups of data measured at an arterial road intersection are used to evaluate the performance of the proposed method. The experimental results indicate that it outperforms the other models considered. For example, compared with the LSSVM model, the average improvements of the proposed method in terms of mean absolute error, mean relative percentage error, root mean square error, and root mean square relative error are 7.397, 15.832%, 10.707, and 24.471%, respectively.


Introduction
As one of the key technologies behind real-time traffic signal control, traffic assignment, route guidance, and other functions of the intelligent transportation system, short-term traffic flow prediction has long been a research focus. Its forecasting accuracy plays a decisive role in improving the performance of the intelligent transportation system [1]. In pursuit of higher accuracy, a variety of spatio-temporal forecasting methods have been developed [2,3]. Among them, temporal forecasting methods are widely used and have attracted increasing attention in recent decades. These methods are roughly divided into three categories: statistical models, intelligent models, and hybrid models.
Statistical methods mainly include time-series models (e.g., autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average (SARIMA)) [4-6], the Kalman filtering model [7,8], and the history average model [9]. Among them, time-series models have been widely applied to the prediction of traffic volume data. For example, Kumar et al. [4] developed a SARIMA model to predict traffic flow, in which the model order was determined by the autocorrelation function and the partial autocorrelation function. The results showed that the model had satisfactory forecasting accuracy. Zhao et al. [5] proposed a short-term traffic flow forecasting model that combines the ARIMA model with the space-time characteristics of the expressway network to improve forecasting accuracy. Wang et al. [6] adopted an ARIMA model to forecast traffic time-series data and obtained satisfactory results. Generally, statistical models are simple, convenient, and easy to apply. However, they usually overlook the interference of random factors and the strong non-stationarity and nonlinearity hidden in traffic data.
Unlike the statistical methods, intelligent models usually perform better in explaining the nonlinear relationship between input and output. These models include the artificial neural network (ANN) [10-14], the support vector machine (SVM) [15,16], and the least square SVM (LSSVM) [17]. Wang et al. [16] proposed a model integrating the wavelet function and the SVM model to forecast the target data, which improved the forecasting results. Luo et al. [18] presented a hybrid optimization algorithm combining particle swarm optimization (PSO) and the genetic algorithm to find the optimal parameters of LSSVM, which effectively improved the model's accuracy and convergence speed. Shang et al. [19] introduced a proportion coefficient to combine the advantages of the Gaussian kernel function and the polynomial kernel function; the forecasting results showed that the resulting model was effective and practicable. These intelligent models do not require a particular model structure and are highly adaptable, especially for nonlinear data. However, they may suffer from slow convergence and over-fitting.
To obtain more accurate and stable predictions, many scholars have introduced hybrid models that combine the advantages of different models. Hybrid models are commonly divided into four types: decomposition-based methods, weighting-based methods, parameter optimization-based methods, and error correction-based methods [20]. In recent years, decomposition-based methods have become a research focus [21]. This kind of hybrid model uses data processing techniques to address the nonlinear and non-stationary features in the data, and thus the forecasting accuracy can be enhanced.
Widely used decomposition algorithms include wavelet decomposition, empirical mode decomposition (EMD), and ensemble empirical mode decomposition (EEMD). Among them, wavelet decomposition [22] is a multi-scale signal analysis method for non-stationary signals. However, its performance usually relies on the selection of the wavelet basis function. EMD, on the other hand, can filter the signal adaptively [22]. With this method, different features in the original sequence are filtered out step by step, and the corresponding subseries can be regarded as intrinsic mode functions (IMFs). Unfortunately, its decomposition process may suffer from mode mixing and the end effect. EEMD was developed by adding many Gaussian white noise samples to EMD [23]. However, it is difficult to determine the noise amplitude and the ensemble number. Nevertheless, these decomposition algorithms have been successfully applied to traffic flow prediction. For example, Duo et al. [24] proposed a hybrid forecasting method for short-term traffic volume based on EMD and an improved SVM; the results verified that EMD could improve accuracy significantly. Tang et al. [25] adopted a hybrid model for traffic volume prediction that combines EEMD and SVM; the results showed that this model outperformed the single SVM. Tian et al. [26] presented a hybrid prediction model based on the improved complete EEMD (ICEEMDAN) algorithm, the kernel online sequential extreme learning machine (KOSELM), and the ARIMA model, which improved the forecasting accuracy significantly. Despite these applications, decomposition-based methods still face various challenges.
To further enhance the accuracy of traffic volume prediction, it is necessary to find new methods to deal with short-term traffic volume data. This paper adopts the time varying filtering based empirical mode decomposition (TVF-EMD) algorithm [27], which captures the time-varying characteristics of the data and suppresses mode mixing. Specifically, TVF-EMD is first used to decompose the short-term traffic volume data into multiple subsequences. Second, an LSSVM model is built for each subsequence, and the subsequence forecasts are combined into the final prediction. On this basis, five evaluation indexes, namely the mean absolute error, mean relative percentage error, root mean square error, root mean square relative error, and equal coefficient, are used to systematically evaluate the forecasting results. The proposed method is also compared with other forecasting models, including EMD-LSSVM, LSSVM, and ARIMA. Finally, some conclusions are provided.
The rest of this paper is organized as follows. In Section 2, TVF-EMD and LSSVM are briefly discussed, and the structure and procedure of the proposed method are described in detail. In Section 3, two case studies are performed and the effectiveness of the proposed method is analyzed and discussed. In Section 4, some conclusions are summarized.

Methods
TVF-EMD is a data decomposition algorithm that can be used to reduce the nonlinear and non-stationary components in short-term traffic volume data. LSSVM, in turn, performs well in describing short-term traffic volume data with nonlinear and non-stationary characteristics. This paper combines the advantages of these two models to build a new hybrid forecasting model, TVF-EMD-LSSVM. To aid understanding of the method, its notation is summarized in Appendix A.

Time Varying Filtering Based Empirical Mode Decomposition
EMD is an adaptive signal processing method that decomposes a signal $x(t)$ into a series of IMFs and a non-zero mean residual $r(t)$ [28], as shown in Equation (1):

$$x(t) = \sum_{i=1}^{N} \mathrm{imf}_i(t) + r(t) \qquad (1)$$

where $\mathrm{imf}_i(t)$ is the ith IMF, i = 1, 2, ..., N. The EMD sifting process can be divided into five steps, as shown in Appendix B. An IMF should satisfy the following conditions: (i) the number of extrema and the number of zero crossings must either be equal or differ at most by one; (ii) the local mean value of the upper and lower envelopes is zero. However, these requirements have two limitations: (i) the stopping criterion is too rigid in the actual sifting process; (ii) the second requirement may not be valid at a low sampling rate [27]. As a result, mode mixing can occur during decomposition. To overcome this weakness of EMD, Li et al. [27] proposed the TVF-EMD sifting method, which solves the above problems by introducing the local narrow-band signal. The local narrow-band signal is not only similar to the IMF but also provides a Hilbert spectrum with physical significance. The sifting process of this method is completed by time-varying filtering, which is divided into three steps: (i) estimation of the local cut-off frequency; (ii) calculation of the local mean function; (iii) judgement of the residual signal.
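The first IMF condition can be illustrated with a small counting routine. This is only a sketch on synthetic signals: the function names and the test signals are hypothetical and are not taken from the paper, and the check covers condition (i) only, not the envelope-mean condition (ii).

```python
import numpy as np

def count_extrema_and_zero_crossings(x):
    """Return (number of local extrema, number of zero crossings) of a 1-D series."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    # A local extremum is a sample where the slope changes sign.
    extrema = int(np.sum(dx[:-1] * dx[1:] < 0))
    # A zero crossing is a pair of consecutive samples with opposite signs.
    crossings = int(np.sum(x[:-1] * x[1:] < 0))
    return extrema, crossings

def satisfies_count_condition(x):
    """IMF condition (i): the two counts are equal or differ by at most one."""
    extrema, crossings = count_extrema_and_zero_crossings(x)
    return abs(extrema - crossings) <= 1

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
# A single oscillatory mode: extrema and zero crossings alternate.
tone = np.sin(2 * np.pi * 5 * t + 0.3)
# An offset plus a second, faster mode: many extrema but no zero crossings.
mixed = tone + 2.0 + 0.5 * np.sin(2 * np.pi * 40 * t)

print(satisfies_count_condition(tone), satisfies_count_condition(mixed))
```

The second signal fails the check because the constant offset lifts it away from zero while the fast component still creates many extrema, which is the kind of series that has to be sifted further before it can be treated as a single mode.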

Estimation of the Local Cut-Off Frequency
In the TVF-EMD method, a B-spline approximation filter is chosen as the time-varying filter. It uses polynomial splines to approximate the signal and is represented by Equation (2), where $[\cdot]_{\downarrow m}$ is the down-sampling operation and $p_m^n$ is a pre-filter. According to Equation (2), the node m determines the local cut-off frequency of the B-spline time-varying filter. In practice, the nodes are not known in advance. It is therefore necessary to estimate the local cut-off frequency from the input signal and then construct the B-spline time-varying filter. The specific process is provided in Appendix C.

Calculation of Local Mean Function
After obtaining the local cut-off frequency $\phi_{bis}(t)$, the signal h(t) can be obtained by Equation (3). Taking the extreme time points ($\{t_{\min}\}$, $\{t_{\max}\}$) of h(t) as the nodes m, the time-varying filter can be constructed by B-spline approximation, and the cut-off frequency of the filter is consistent with $\phi_{bis}(t)$. Subsequently, the B-spline approximation filter is applied to the input signal and the result is recorded as m(t).

Judgement of the Residual Signal
Since the definition of the local narrow-band signal is closely related to the instantaneous bandwidth, TVF-EMD uses a relative criterion θ(t), given in Equation (4), to check whether a signal is instantaneously narrow-band. For a given bandwidth threshold ξ, if θ(t) ≤ ξ, the signal can be viewed as a narrow-band signal. Here, the weighted average instantaneous frequency $\phi_{avg}(t)$ and the Loughlin instantaneous bandwidth $B_L$ are calculated by Equations (5) and (6), respectively.

Least Square Support Vector Machine
After decomposition by TVF-EMD, an LSSVM model is built for each subseries. LSSVM improves on the standard SVM by replacing the inequality constraints with equality constraints, so that the quadratic programming problem is transformed into the problem of solving a set of linear equations [29].
Consider a set of data $D = \{(x_i, y_i)\}$, $i = 1, \dots, k$, where $x_i \in R^g$ is the input and g is the dimension of $x_i$, which can be determined by minimizing the root mean square error of the values output by the training part [20]. The regression function can be written as follows:

$$y(x) = \omega^{T}\psi(x) + d \qquad (7)$$

where ψ(·) denotes a non-linear function, ω represents a weight vector, and d is an offset. The parameters ω and d can be obtained by optimizing the objective function in Equation (8), where $q_i$ denotes the error variable, µ and ς denote variable parameters, and $E_W = \frac{1}{2}\omega^{T}\omega$. To solve this optimization problem, the Lagrange function is constructed as shown in Equation (9), where $\alpha_i$ is the Lagrange multiplier. According to the Karush-Kuhn-Tucker (KKT) conditions, the optimal solution can be calculated by Equation (10), where γ = ς/µ denotes the penalty coefficient. After eliminating $q_i$ and ω, the original optimization problem becomes the set of linear equations in Equation (11). Once α and d have been found through Equation (11), the LSSVM regression model becomes:

$$y(x) = \sum_{i=1}^{k} \alpha_i K(x, x_i) + d \qquad (12)$$

where $K(x, x_i)$ is the kernel function, which needs to meet Mercer's conditions. Common kernel functions include the RBF kernel, the sigmoid kernel, and the polynomial kernel. The RBF kernel, also called the Gaussian kernel, has strong nonlinear learning ability with few parameters and is therefore selected in this paper. It is expressed by Equation (13), where σ denotes the kernel function parameter. When the LSSVM model is applied with the RBF kernel, the choice of the parameter σ and the penalty coefficient γ determines the model's learning and generalization capabilities. Thus, it is vital to search for the most suitable parameters.

Appl. Sci. 2020, 10, 2038
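The step from a quadratic program to a set of linear equations can be sketched in a few lines of NumPy. The block below assumes the commonly used LSSVM linear system, in which the unknowns (d, α) solve a bordered kernel matrix with 1/γ added to the diagonal, and a Gaussian kernel of the form exp(-||x - x_i||² / (2σ²)). Both are assumptions about the exact form of Equations (11) and (13), and the function names and toy data are illustrative only.

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    """Gaussian kernel matrix with entries exp(-||a_i - b_j||^2 / (2 sigma^2)) (assumed form)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(x, y, gamma, sigma):
    """Solve the (k+1) x (k+1) linear system for the offset d and the multipliers alpha."""
    k = x.shape[0]
    a = np.zeros((k + 1, k + 1))
    a[0, 1:] = 1.0                     # first row enforces sum(alpha) = 0
    a[1:, 0] = 1.0                     # column multiplying the offset d
    a[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(k) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(a, rhs)
    return sol[0], sol[1:]

def lssvm_predict(x_new, x_train, alpha, d, sigma):
    """Evaluate y(x) = sum_i alpha_i K(x, x_i) + d."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + d

# Toy regression problem (not traffic data): learn a sine curve.
x_train = np.linspace(0.0, 2.0 * np.pi, 60)[:, None]
y_train = np.sin(x_train[:, 0])
d, alpha = lssvm_fit(x_train, y_train, gamma=100.0, sigma=0.5)

x_test = np.linspace(1.0, 5.0, 9)[:, None]
y_hat = lssvm_predict(x_test, x_train, alpha, d, sigma=0.5)
print(np.max(np.abs(y_hat - np.sin(x_test[:, 0]))))
```

Because training reduces to one call to a linear solver, the cost of trying many (γ, σ) pairs is modest, which is what makes a parameter search over these two values practical.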

The Proposed Method
Based on the above discussions, a novel hybrid model that combines TVF-EMD and LSSVM can be developed to improve the forecasting accuracy. First, the TVF-EMD method is applied to the non-stationary and nonlinear traffic volume series, which yields multiple subsequences called narrow-band signals. Then, an LSSVM model is established for each subsequence. Finally, the prediction results of the subsequences are summed to generate the final forecast. The specific process of the TVF-EMD-LSSVM model is shown in Figure 1, and the steps are as follows:
Step 1: Preprocess the original traffic volume data, which contain erroneous and missing values, to obtain the experimental data;
Step 2: Decompose the data into several subsequences {c_j(1), ..., c_j(k)}, j = 1, ..., M + 1, by the TVF-EMD algorithm;
Step 3: Divide each subseries into two parts, the training part {x(1), ..., x(k)} and the test part {x(k + 1), ..., x(k + N)};
Step 4: Establish the LSSVM model to predict the (k + 1)th value ĉ_j(k + 1) of each subsequence, and sum these predictions to obtain the forecast x̂(k + 1);
Step 5: After updating the training set to {x(2), ..., x(k + 1)}, repeat Step 2 to Step 4 to obtain the next prediction. Continue to predict one step ahead until the prediction task is completed.
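The rolling decompose, predict, and sum loop of Steps 2 to 5 can be sketched as follows. Two stand-ins are used and should not be read as the paper's components: `toy_decompose` is a hypothetical moving-average split used in place of TVF-EMD, and `ar_one_step` is a least-squares autoregression used in place of LSSVM. The series is synthetic and noise-free, so the forecast is nearly exact here; measured traffic data would not behave this way.

```python
import numpy as np

def toy_decompose(x, window=12):
    """Hypothetical stand-in for TVF-EMD: moving-average trend plus remainder.

    Both parts are aligned with x[window - 1:], and they sum back to it.
    """
    kernel = np.ones(window) / window
    trend = np.convolve(x, kernel, mode="valid")
    detail = x[window - 1:] - trend
    return [detail, trend]

def ar_one_step(series, order=4):
    """Hypothetical stand-in for LSSVM: least-squares AR model, one step ahead."""
    rows = np.array([series[i:i + order] for i in range(len(series) - order)])
    targets = series[order:]
    design = np.column_stack([rows, np.ones(len(rows))])   # lags plus intercept
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return float(np.append(series[-order:], 1.0) @ coef)

def rolling_forecast(x, k, n_steps):
    """Steps 2 to 5: decompose the latest k samples, predict each part, sum, slide."""
    preds = []
    for s in range(n_steps):
        window = x[s:s + k]                     # training set x(s+1), ..., x(s+k)
        parts = toy_decompose(window)           # Step 2
        preds.append(sum(ar_one_step(p) for p in parts))   # Steps 3 and 4
    return np.array(preds)

# Synthetic series: a constant level plus a slow and a fast oscillation.
t = np.arange(400)
flow = 50 + 20 * np.sin(2 * np.pi * t / 48) + 5 * np.sin(2 * np.pi * t / 8)

preds = rolling_forecast(flow, k=300, n_steps=50)
actual = flow[300:350]
print(np.max(np.abs(preds - actual)))
```

Re-running the decomposition inside the loop, rather than once on the whole series, mirrors Step 5 and keeps future samples out of each decomposition.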

Data Description
Data collection A (2016 samples) was measured at entrance A of an intersection on an arterial road in the main urban area of Chongqing; the location is shown in Figure 2. The statistical interval was 5 min, and the series is shown in Figure 3. Two-thirds of the data were used to train the model, and the rest were used to test its performance. Table 1 summarizes the characteristics of data collection A. It can be observed that this dataset is strongly volatile.
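The sizes implied by this description are easy to check. The snippet assumes the split is chronological (the first two-thirds for training), which the text does not state explicitly; `split_series` is an illustrative helper.

```python
N_SAMPLES = 2016      # samples in data collection A
INTERVAL_MIN = 5      # statistical interval in minutes

# Time span covered by the series, in days.
days_covered = N_SAMPLES * INTERVAL_MIN / (24 * 60)

def split_series(series):
    """Assumed chronological split: first two-thirds for training, the rest for testing."""
    cut = len(series) * 2 // 3
    return series[:cut], series[cut:]

train, test = split_series(list(range(N_SAMPLES)))
print(days_covered, len(train), len(test))
```

Under these assumptions the series spans one week, with 1344 training samples and 672 test samples.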

Data Processing
Many factors affect prediction accuracy, such as data quality, data characteristics, and model selection. Among them, the quality of the traffic volume data is one of the main factors [26]. Therefore, the processing of abnormal data, including missing data and erroneous data, is crucial in traffic volume prediction [30]. To repair abnormal data, the adjacent completion method is adopted; its function is shown in Equation (14), where w denotes the number of data points to be repaired.
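As a rough illustration of repairing a run of w abnormal values from their valid neighbours, the sketch below fills gaps by linear interpolation between the nearest valid samples. This is an assumed stand-in and may differ from the adjacent completion formula in Equation (14); `fill_gaps_linear` and the sample values are hypothetical.

```python
import numpy as np

def fill_gaps_linear(x):
    """Fill NaN entries by linear interpolation between neighbouring valid samples.

    Assumed stand-in for the adjacent completion method, not the paper's formula.
    """
    x = np.asarray(x, dtype=float)
    idx = np.arange(len(x))
    valid = ~np.isnan(x)
    return np.interp(idx, idx[valid], x[valid])

# Two consecutive missing counts (w = 2) between valid observations.
raw = [10.0, np.nan, np.nan, 16.0, 18.0]
repaired = fill_gaps_linear(raw)
print(repaired)
```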

Evaluation Criteria
To analyze and evaluate the forecasting performance of the proposed model, five commonly used evaluation indexes were used in the study: mean absolute error (MAE), mean relative percentage error (MRPE), root mean square error (RMSE), root mean square relative error (RMSRE), and equal coefficient (EC) [26,29]. Smaller values of MAE, MRPE, RMSE, and RMSRE indicate higher accuracy. The closer the EC value is to one, the more accurate the prediction.
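A minimal sketch of the five indexes is given below. The formulas are the definitions commonly used for these names and are assumed here, since the paper's own equations are not reproduced in this text; in particular, the EC expression is the usual form with the root sum of squared errors divided by the sum of the root sums of squares of the observed and predicted series. The function name and sample values are illustrative.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MAE, MRPE (%), RMSE, RMSRE (%) and EC under the assumed definitions.

    Relative errors divide by the observed value, so y_true must be non-zero.
    """
    y = np.asarray(y_true, dtype=float)
    f = np.asarray(y_pred, dtype=float)
    err = y - f
    return {
        "MAE": np.mean(np.abs(err)),
        "MRPE": 100.0 * np.mean(np.abs(err / y)),
        "RMSE": np.sqrt(np.mean(err**2)),
        "RMSRE": 100.0 * np.sqrt(np.mean((err / y) ** 2)),
        "EC": 1.0 - np.sqrt(np.sum(err**2))
              / (np.sqrt(np.sum(y**2)) + np.sqrt(np.sum(f**2))),
    }

observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = evaluate(observed, predicted)
perfect = evaluate(observed, observed)
print(scores)
```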

TVF-EMD-LSSVM Model Prediction
According to the forecasting process of the proposed model in Section 2.3, TVF-EMD is used to decompose the experimental data A into 10 subsequences, as shown in Figure 4.

Figure 4. The pictures of TVF-EMD decomposition results (A).
After training and test sets were constructed for each subsequence, an LSSVM model was built to predict it. The dimension parameter was determined by minimizing the root mean square error of the output value in the training part [20]. Moreover, the optimal penalty coefficient and kernel function parameter of each subsequence were determined by the optimization function. Finally, the traffic volume prediction was obtained by summing the forecasting results of the subsequences.

Comparison and Analysis of Forecasting Results
To illustrate the performance of the proposed method, three additional forecasting models, namely the ARIMA model, the LSSVM model, and the EMD-LSSVM model, were used for comparison. The processes of the LSSVM, ARIMA, and EMD-LSSVM models were similar to the forecasting process in Section 3.4.1. The evaluation indexes of the four models are shown in Table 2 and the corresponding prediction results are shown in Figure 5. From these comparisons, the main observations are as follows:

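• Compared with the other three models, the proposed model had better forecasting performance: its MAE, MRPE, RMSE, RMSRE, and EC were 1.721, 3.969%, 2.974, 6.797%, and 0.9956, respectively. In Figure 5, the red line represents the prediction of the proposed model and the blue line represents the true value. Their comparison indicates that the proposed method captures the time-varying characteristics of the actual traffic well. From Table 2, the proposed model was more accurate than the EMD-LSSVM model: its MAE, MRPE, RMSE, and RMSRE were lower by 2.654, 5.991%, 2.831, and 8.464%, respectively, and its EC was higher by 0.0174. A likely reason is that the TVF-EMD algorithm uses time-varying filtering, which describes the time-varying characteristics of the data and mitigates the mode mixing of the EMD algorithm.
• Compared with the single models, the decomposition-based forecasting methods had higher forecasting accuracy. For example, the MAE, MRPE, RMSE, RMSRE, and EC of the LSSVM model were 8.131, 17.871%, 10.801, 27.674%, and 0.9336, respectively, which is clearly less accurate than the proposed method. The EMD-LSSVM model reduced the MAE, MRPE, RMSE, and RMSRE of the LSSVM model by 3.756, 7.911%, 4.996, and 12.413%, respectively, and raised its EC by 0.0306. These results can be attributed to the high non-stationarity and nonlinearity embedded in the original data, which the decomposition methods address effectively.
• The MAE, MRPE, RMSE, RMSRE, and EC of the ARIMA model were 8.284, 17.977%, 11.01, 27.25%, and 0.9322, respectively. Compared with LSSVM, its MAE, MRPE, and RMSE were higher by 0.153, 0.106%, and 0.209, and its EC was lower by 0.0014, although its RMSRE was lower by 0.424%. The reason could be that the nonlinear features hidden in the original data were more significant than the linear ones, so the linear ARIMA model could not capture the characteristics well. It therefore had the lowest forecasting accuracy overall.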
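The gaps between models can be recomputed directly from the index values quoted in this section for data collection A. The dictionary below only restates those quoted values; the helper name `difference` is illustrative.

```python
# Index values quoted in the text for data collection A (MRPE and RMSRE in percent).
metrics = {
    "TVF-EMD-LSSVM": {"MAE": 1.721, "MRPE": 3.969, "RMSE": 2.974, "RMSRE": 6.797, "EC": 0.9956},
    "LSSVM": {"MAE": 8.131, "MRPE": 17.871, "RMSE": 10.801, "RMSRE": 27.674, "EC": 0.9336},
    "ARIMA": {"MAE": 8.284, "MRPE": 17.977, "RMSE": 11.01, "RMSRE": 27.25, "EC": 0.9322},
}

def difference(model_a, model_b):
    """Return index-by-index values of model_a minus model_b, rounded to 4 decimals."""
    return {name: round(metrics[model_a][name] - metrics[model_b][name], 4)
            for name in metrics[model_a]}

arima_vs_lssvm = difference("ARIMA", "LSSVM")
lssvm_vs_proposed = difference("LSSVM", "TVF-EMD-LSSVM")
print(arima_vs_lssvm)
print(lssvm_vs_proposed)
```

A positive difference in MAE, MRPE, RMSE, or RMSRE means the first model has the larger error, while a negative difference in EC means the first model fits less closely.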

Additional Case
To further test the stability of the proposed model, another group of data (data collection B) was used. These data (2016 samples) were measured at entrance B of an intersection on an arterial road in Chongqing, as shown in Figures 2 and 6. Table 3 provides the relevant information. For simplicity, only the error indexes are given in Table 4, and the corresponding predictions are plotted in Figure 7. From Table 4 and Figure 7, the main results are as follows:
• TVF-EMD was better than EMD in dealing with the nonlinearity and non-stationarity of the data. The forecasting results show that the accuracy of the TVF-EMD based method was higher than that of the EMD based method.
• The hybrid models could take advantage of the strengths of each component model. The results show that the forecasting accuracy of the hybrid models was higher than that of the single models.
• The ARIMA model usually performs well on data with significant linear features. However, for short-term traffic volume data with highly nonlinear characteristics, the LSSVM model may have better forecasting performance.

Conclusions
In practice, short-term traffic volume data are commonly strongly nonlinear and non-stationary, so it is hard to obtain a satisfactory forecast with traditional methods. To improve the forecasting performance, a novel hybrid model that combines the TVF-EMD algorithm with LSSVM is developed in this study. Two case studies based on data measured at an intersection are provided to evaluate the performance of the proposed method. The main conclusions are summarized as follows:
• TVF-EMD has a more positive impact than EMD on forecasting accuracy. As an improved decomposition method, TVF-EMD can describe the time-varying characteristics (e.g., non-stationarity and nonlinearity) hidden in the data through time-varying filtering, which helps to address the problems of end effect and mode mixing.
• The forecasting accuracy of the hybrid models is higher than that of the single models. Generally, a hybrid model can combine the advantages of its component models. In this paper, the ability of TVF-EMD to handle non-stationary and nonlinear data is combined with the strength of LSSVM in addressing nonlinear problems.
The innovation of this paper is the introduction of a new data processing method, the TVF-EMD algorithm, which alleviates the mode mixing problem of the original EMD algorithm. To further improve the forecasting performance, some future tasks should be carried out. For example, the combination of the proposed method with probabilistic prediction models should be studied; multi-step ahead prediction will be developed; and the application of the proposed method in other fields, such as wind speed prediction and solar radiation prediction, should also be explored.
Step 5: Calculate the local cut-off frequency $\phi_{bis}(t)$.
Step 6: Rearrange $\phi_{bis}(t)$ to solve the problem of signal intermittence.

Figure 1. The procedure of time varying filtering based empirical mode decomposition and least square support vector machine (TVF-EMD-LSSVM).

Figure 2. Location of the intersection in Chongqing.

Figure 5. The predictions of different methods (A).

Figure 7. The predictions of different methods (B).
In Equation (2), $\beta^{n}(t)$ denotes the B-spline function; n stands for the B-spline order; m represents the node; t is time; and * represents the convolution operation.

Table 2. The comparison results of different methods (A).

Table 4. The comparison results of different methods (B).