A Hybrid Model Based on a Two-Layer Decomposition Approach and an Optimized Neural Network for Chaotic Time Series Prediction

The prediction of chaotic time series has been a popular research field in recent years. Due to the strong non-stationary and high complexity of the chaotic time series, it is difficult to directly analyze and predict depending on a single model, so the hybrid prediction model has become a promising and favorable alternative. In this paper, we put forward a novel hybrid model based on a two-layer decomposition approach and an optimized back propagation neural network (BPNN). The two-layer decomposition approach is proposed to obtain comprehensive information of the chaotic time series, which is composed of complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and variational mode decomposition (VMD). The VMD algorithm is used for further decomposition of the high frequency subsequences obtained by CEEMDAN, after which the prediction performance is significantly improved. We then use the BPNN optimized by a firefly algorithm (FA) for prediction. The experimental results indicate that the two-layer decomposition approach is superior to other competing approaches in terms of four evaluation indexes in one-step and multi-step ahead predictions. The proposed hybrid model has a good prospect in the prediction of chaotic time series.


Introduction
Chaotic time series exist in a wide range of areas, such as nature, the economy, society, and industry. They contain much important and valuable information, useful for complex system modeling and prediction. Time series data mining is an important means of control and decision-making for practical problems in various fields [1,2]. In recent years, scholars have carried out a great amount of research on chaotic time series analysis and prediction. Han et al. [3] proposed an improved extreme learning machine combined with a hybrid variable selection algorithm for the prediction of multivariate chaotic time series, which can achieve high predictive accuracy and reliable performance. Chandra [4] put forward a competitive cooperative coevolution algorithm to train recurrent neural networks (RNNs) for chaotic time series prediction. Yaslan et al. [5] presented a hybrid model based on empirical mode decomposition (EMD) and support vector regression (SVR) for electricity load forecasting. Chen [6] proposed a prediction model of a radial basis function (RBF) neural network optimized by an artificial bee colony algorithm for the prediction of traffic flow time series. A multilayered echo state machine with multiple layers of reservoirs was introduced in [7]; it can be more robust than an echo state network with a conventional reservoir in chaotic time series prediction.

The main contributions of this paper are summarized as follows:
1. A hybrid model based on a two-layer decomposition technique is proposed. To address the problem that prediction models based on a single decomposition technique cannot fully handle the nonlinearity and non-stationarity of chaotic time series, this paper puts forward a two-layer decomposition technique based on CEEMDAN and VMD, which can fully extract the complex characteristics of a time series and improve prediction accuracy.

2. A firefly algorithm (FA) is applied to optimize the weights between the input and hidden layers, the weights between the hidden and output layers, and the thresholds of the neuron nodes, which reduces human interference in parameter settings and improves the function approximation ability of the neural network. A BPNN optimized by the FA is applied to predict the subsequences obtained by the two-layer decomposition.

3. A real-world chaotic time series, the daily maximum temperature time series in Melbourne, is used to assess the validity of the proposed hybrid model. The experimental results indicate that our hybrid model significantly improves prediction accuracy compared to existing single-model approaches and hybrid models based on single-layer decomposition techniques.
The remainder of the paper is organized as follows. Preliminaries and related works are introduced in Section 2. In Section 3, we introduce the principles and implementation steps of the proposed method. Section 4 presents experiments illustrating the effectiveness of the proposed model. Finally, conclusions and future directions are given in Section 5. For the convenience of reading, the notations used in this paper are shown in Table 1.

Preliminaries and Related Works
In this section, we first introduce the basic methods, including CEEMDAN, VMD, and the FA. Then, related works on hybrid prediction models are described.

Complete Ensemble Empirical Mode Decomposition with Adaptive Noise
Ensemble EMD (EEMD) adds white noise to the original signal to alleviate the mode mixing problem of EMD. However, the white noise cannot be completely canceled by a finite ensemble average, and the magnitude of the reconstruction error depends on the number of ensemble trials. In addition, increasing the number of trials leads to an increased computational burden. To solve these problems, CEEMDAN was proposed as an improved version of EEMD [23]. By adding a finite number of adaptive white noise realizations at each stage, CEEMDAN can obtain a reconstruction error close to zero with a small number of ensemble trials. Thus, CEEMDAN avoids the mode mixing problem of EMD while reducing computational complexity compared to EEMD.
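The benefit of noise-assisted ensemble averaging can be illustrated with a small numpy sketch (illustrative only, not the full CEEMDAN procedure): averaging I noise-added copies of a signal shrinks the residual noise roughly by a factor of 1/√I, which is why EEMD needs many trials while CEEMDAN's adaptive noise keeps the reconstruction error near zero with fewer.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1000)
x = np.sin(2 * np.pi * 5 * t)          # clean signal
w0 = 0.2                               # noise coefficient
I = 100                                # number of ensemble trials

# Noise-added copies: x_i(t) = x(t) + w0 * eps_i(t)
copies = x + w0 * rng.standard_normal((I, len(t)))

# Ensemble average: residual noise std shrinks roughly as w0 / sqrt(I)
avg = copies.mean(axis=0)
residual = avg - x
print(np.std(residual))  # close to w0 / sqrt(I) = 0.02
```

For a finite ensemble the residual never vanishes exactly, which is the reconstruction-error problem of EEMD that CEEMDAN addresses.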

Variational Mode Decomposition
VMD is a novel non-recursive signal processing approach [24] that decomposes the original signal into a group of subsequences. Each subsequence is called a mode and is concentrated around a specific center frequency. To assess the bandwidth of each mode, three main steps are taken: First, the Hilbert transform is applied to each mode separately to compute the associated analytic signal and obtain a unilateral frequency spectrum. Then, each mode is shifted to its respective baseband by multiplying it with an exponential term tuned to the estimated center frequency. Finally, the bandwidth of each mode is estimated through the Gaussian smoothness of the demodulated signal, i.e., the squared L2-norm of its gradient.
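The three-step bandwidth estimate can be sketched as follows (a minimal illustration, not the full VMD optimization): for a pure tone, demodulating the analytic signal at the true center frequency makes the gradient of the baseband signal, and hence the bandwidth measure, nearly zero, while a wrong center frequency inflates it.

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000
t = np.arange(fs) / fs
u = np.cos(2 * np.pi * 50 * t)            # a single mode at 50 Hz

def bandwidth_measure(u, f_center, fs):
    """Squared L2-norm of the gradient of the demodulated analytic signal."""
    analytic = hilbert(u)                  # unilateral spectrum
    t = np.arange(len(u)) / fs
    baseband = analytic * np.exp(-2j * np.pi * f_center * t)  # shift to baseband
    grad = np.gradient(baseband, 1 / fs)
    return np.sum(np.abs(grad) ** 2) / fs

# Demodulating at the true center frequency minimizes the measure
print(bandwidth_measure(u, 50, fs) < bandwidth_measure(u, 60, fs))  # True
```

VMD searches for the center frequencies that jointly minimize these bandwidth measures subject to the modes summing to the signal.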

Firefly Algorithm
The firefly algorithm (FA) [25] is a heuristic optimization algorithm inspired by firefly behavior, whereby flash signals are used to attract potential mates and drive positional movement. It is a swarm intelligence optimization algorithm and therefore inherits the advantages of such algorithms.
In addition, compared with other similar algorithms, it has the following two advantages. First, it can realize automatic subdivision of the population, which is well suited to highly nonlinear optimization problems. Second, the FA has multi-modal characteristics and can deal with multi-modal problems quickly and efficiently. The basic idea of the FA is to treat every point in the search space as a firefly, which is attracted to brighter fireflies and moves toward them. During the movement of the weakly glowing fireflies, positions are updated until the optimal position is found. Since it was proposed, the FA has been widely used to solve various practical problems.

Related Works
Aiming at the analysis of nonlinear and non-stationary time series, many hybrid models have been proposed. In [26], an EMD-BPNN model is presented and applied to forecast tourism demand, namely the number of tourists; the hybrid model showed better performance than a single BPNN or ARIMA model. Zhou et al. [27] put forward a hybrid EEMD-GRNN (general regression neural network) model for PM2.5 forecasting. Simulation results indicated that the EEMD-GRNN model outperformed the GRNN, multiple linear regression, and the ARIMA, which is significant for the development of air quality warning systems. Moreover, a comparison of hybrid prediction models based on wavelet decomposition, wavelet packet decomposition, EMD, and fast EEMD was carried out in [28], investigating the decomposition and prediction performance of multiple hybrid models. In [29], VMD was adopted to decompose wind power time series into multiple modes, and Gram-Schmidt orthogonalization was used to eliminate redundant attributes; a hybrid model based on VMD and extreme learning machines was then proposed for short-term wind power forecasting. Similarly, Lahmiri [30] proposed a hybrid VMD-GRNN model for economic and financial time series forecasting. Jianwei E et al. [31] proposed a hybrid VMD-ICA-ARIMA model for crude oil price forecasting. Wang et al. [32] suggested a hybrid model based on VMD, phase space reconstruction, and a wavelet neural network, which is reliable for multi-step prediction of wind speed time series.
As mentioned before, hybrid models based on decomposition methods, such as EMD, EEMD, and VMD, have been extensively used for time series modeling and have exhibited satisfactory performance.

The Structure of CEEMDAN-VMD-FABP Model
Subsequences obtained by an effective decomposition technique are much easier to analyze than the original time series, and the single-layer decomposition pattern is one of the most commonly adopted in existing hybrid models. Models based on a single-layer decomposition technique can increase the predictive performance for various chaotic time series to some extent, but they have difficulty fully reflecting the non-stationarity and irregularity of the original signal. In view of this, we propose a novel two-layer decomposition approach using CEEMDAN and VMD for chaotic time series forecasting, which is shown in Figure 1. In addition, we adopt a BPNN optimized by an FA (FABP) to predict the subsequences obtained by the two-layer decomposition. FABP can automatically optimize weights and thresholds, which reduces the randomness of parameter selection and ultimately strengthens the function approximation ability of the neural network. Next, we introduce the structure of the proposed model. The CEEMDAN method, with its noise robustness, is first used to decompose the original signal. Because CEEMDAN performs well against noise, the original signal is not denoised in this paper. At present, many scholars are committed to improving the robustness and adaptive anti-noise performance of the CEEMDAN algorithm so that the denoising step can be omitted. The original signal is thus automatically decomposed into a group of intrinsic mode functions (IMFs), namely IMF1, IMF2, …, IMFn, and a residue.
IMF1 has the highest frequency and is the most difficult to track and predict. To improve the overall prediction accuracy of the original signal, the VMD algorithm is introduced for a secondary decomposition of IMF1. IMF1 is decomposed by VMD into VMF1, VMF2, …, VMFm, each of which is predicted by FABP. The prediction results of these subsequences are then combined into the final result for IMF1. The subsequences IMF2, …, IMFn and the residue are likewise each predicted using FABP. The sum of the prediction results of IMF2, …, IMFn and the residue is superimposed on the prediction result of IMF1, yielding the final forecast of the original chaotic signal. At this point, the entire forecast process is complete.
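The overall forecasting pipeline described above can be sketched as follows. The decomposers and the predictor here are stand-ins (a trivial persistence "predictor" and dummy decompositions that simply split the signal), so the sketch only shows how the component forecasts are recombined, not the real CEEMDAN/VMD/FABP computations.

```python
import numpy as np

def dummy_ceemdan(x, n_imfs=3):
    # Placeholder: split x into n_imfs equal parts plus a residue (not real CEEMDAN)
    parts = [x / (n_imfs + 1)] * (n_imfs + 1)
    return parts[:-1], parts[-1]

def dummy_vmd(imf1, n_modes=2):
    # Placeholder: split IMF1 into n_modes equal parts (not real VMD)
    return [imf1 / n_modes] * n_modes

def fabp_predict(series):
    # Placeholder one-step "forecast": persistence (last observed value)
    return series[-1]

def two_layer_forecast(x):
    imfs, residue = dummy_ceemdan(x)
    vmfs = dummy_vmd(imfs[0])                     # second-layer decomposition of IMF1
    imf1_pred = sum(fabp_predict(v) for v in vmfs)
    rest_pred = sum(fabp_predict(c) for c in imfs[1:]) + fabp_predict(residue)
    return imf1_pred + rest_pred                  # superimpose component forecasts

x = np.sin(np.linspace(0, 10, 200))
print(two_layer_forecast(x))
```

Because the dummy components sum to the original signal and the stand-in predictor is linear, the recombined forecast equals the direct persistence forecast here; in the real model, each component is far easier for FABP to predict than the raw series.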

CEEMDAN for Original Time Series
First, the CEEMDAN decomposition algorithm [23] is adopted to obtain a group of IMFs adaptively.The computational process of CEEMDAN is as follows.
Step 1. A collection of noise-added copies of the original time series is created: x_i(t) = x(t) + ω_0 ε_i(t), i = 1, 2, …, I, where ε_i(t) is independent Gaussian white noise with unit variance, and ω_0 is a noise coefficient.
Step 2. For each x_i(t), EMD is used to obtain the first IMF, and the average is taken: IMF1(t) = (1/I) Σ_{i=1}^{I} E_1(x_i(t)), where E_1(·) denotes the first IMF extracted by EMD. The first residue is then r_1(t) = x(t) − IMF1(t). Essentially, the procedure of the EMD algorithm is a sifting process from which IMFs are obtained [33]. The specific computation for a signal x_i(t) is as follows. First, all local maxima and local minima of x_i(t) are located. Next, the upper envelope u(t) and the lower envelope l(t) are constructed by cubic spline interpolation. Finally, the average of u(t) and l(t) is computed and recorded as m(t): m(t) = (u(t) + l(t))/2. We subtract m(t) from the original signal x_i(t) and record the result as c_1(t).
It is then determined whether c_1(t) is an IMF by checking two conditions: (1) over the whole data series, the number of extrema and the number of zero crossings must differ by at most one; (2) at every point, the mean of the upper and lower envelopes must be zero. If c_1(t) meets both conditions, it is recognized as IMF1. Otherwise, c_1(t) is treated as the new signal and the above process is repeated until IMF1 is obtained.
Step 3. The second IMF is obtained by decomposing the noise-added first residue: IMF2(t) = (1/I) Σ_{i=1}^{I} E_1(r_1(t) + ω_1 E_1(ε_i(t))), where E_j(·) represents the j-th IMF obtained by EMD.
Step 4. Step 3 is repeated for the remaining IMFs until the residual signal has no more than two extrema.
So far, the original signal has been decomposed into a series of IMF components, namely IMF1, IMF2, …, IMFn. Due to the high frequency and unpredictability of IMF1, this research introduces a second decomposition of IMF1 based on VMD.
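The single sifting step of Step 2 (envelopes from cubic splines through the extrema, then subtraction of their mean) can be sketched with scipy; the boundary handling and full stopping criteria of production EMD implementations are omitted, and pinning the envelopes to the signal endpoints is a simplifying assumption.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(x):
    """One EMD sifting step: c(t) = x(t) - mean of upper/lower envelopes."""
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    # Pin envelopes to the endpoints (a simple boundary treatment)
    up = CubicSpline(np.r_[0, maxima, len(x) - 1], np.r_[x[0], x[maxima], x[-1]])
    lo = CubicSpline(np.r_[0, minima, len(x) - 1], np.r_[x[0], x[minima], x[-1]])
    m = (up(t) + lo(t)) / 2          # envelope mean m(t)
    return x - m, m

t = np.linspace(0, 1, 500)
x = np.sin(2 * np.pi * 20 * t) + 0.5 * t   # fast oscillation plus slow trend
c, m = sift_once(x)
```

For this test signal the envelope mean m(t) approximates the slow trend, so the sifted candidate c(t) is close to the zero-mean oscillation, which is exactly what repeated sifting isolates as an IMF.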

VMD for IMF1
The main task of the VMD algorithm [34] is to solve the following constrained optimization problem:

min_{{u_k},{ω_k}} Σ_k || ∂_t[ (δ(t) + j/(πt)) * u_k(t) ] e^{−jω_k t} ||_2^2,  subject to  Σ_k u_k(t) = f(t),

where u_k(t) denotes the k-th mode of the decomposition, ω_k is the center frequency of mode k, ∂_t(·) denotes the partial derivative with respect to t, δ(t) indicates the Dirac distribution, * denotes convolution, and f is the original signal to be decomposed.
In order to solve the above constrained optimization problem, a quadratic penalty term and a Lagrangian multiplier are introduced, as shown in Equation (6). The former enforces the constraint, and the latter improves convergence. The resulting unconstrained (augmented Lagrangian) form is

L({u_k},{ω_k},λ) = α Σ_k || ∂_t[ (δ(t) + j/(πt)) * u_k(t) ] e^{−jω_k t} ||_2^2 + || f(t) − Σ_k u_k(t) ||_2^2 + ⟨ λ(t), f(t) − Σ_k u_k(t) ⟩,

where α represents the balancing (penalty) parameter, and λ is the Lagrangian multiplier.
The alternate direction method of multipliers (ADMM) [35] is applied to solve the optimization problem in (6). The saddle point of the augmented Lagrangian L is found by ADMM in an iterative optimization process, with the subproblems solved in the Fourier domain; the detailed derivation can be found in [24]. To implement the VMD algorithm, the modes u_k and center frequencies ω_k are updated alternately according to

û_k^{n+1}(ω) = ( f̂(ω) − Σ_{i≠k} û_i(ω) + λ̂(ω)/2 ) / ( 1 + 2α(ω − ω_k)^2 ),

ω_k^{n+1} = ∫_0^∞ ω |û_k(ω)|^2 dω / ∫_0^∞ |û_k(ω)|^2 dω,

where f̂(ω), û_k(ω), and λ̂(ω) represent the Fourier transforms of f(t), u_k(t), and λ(t), and n denotes the iteration number.
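The two update equations can be turned into a minimal frequency-domain VMD loop. This is a simplified sketch: the Lagrangian multiplier is omitted (λ = 0), only the nonnegative half of the spectrum is used, and real implementations add signal mirroring and convergence checks.

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs
f = np.cos(2 * np.pi * 5 * t) + np.cos(2 * np.pi * 25 * t)  # two-tone test signal

K, alpha = 2, 2000.0
f_hat = np.fft.rfft(f)
freqs = np.fft.rfftfreq(len(f), 1 / fs)
u_hat = np.zeros((K, len(freqs)), dtype=complex)
w = np.array([2.0, 40.0])                      # initial center frequencies (Hz)

for _ in range(50):
    for k in range(K):
        others = u_hat.sum(axis=0) - u_hat[k]
        # Wiener-filter mode update (lambda = 0)
        u_hat[k] = (f_hat - others) / (1 + 2 * alpha * (freqs - w[k]) ** 2)
        # Center-frequency update: power-weighted mean frequency
        power = np.abs(u_hat[k]) ** 2
        w[k] = (freqs * power).sum() / power.sum()

print(np.sort(w))  # converges close to the true tones at 5 Hz and 25 Hz
```

Each mode update is a narrowband Wiener filter around the current center frequency, and each frequency update recenters the mode on its own spectral center of gravity, which is why the loop separates the two tones.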

BPNN Optimized by a Firefly Algorithm
After the two-layer decomposition, a BPNN is used to predict each subsequence obtained by decomposition. The FA is adopted to optimize the parameters of the BPNN and is capable of improving its function approximation ability.
The search process depends on two important properties of the fireflies: brightness and mutual attraction. Bright fireflies attract weak fireflies toward them. The brighter the light, the better the position, and the brightest firefly represents the optimal solution of the function. The higher a firefly's brightness, the more attractive it is to other fireflies. If two fireflies have the same luminance, they move randomly. Both properties are inversely related to distance: the greater the distance, the smaller the attraction. The relative fluorescent brightness of a firefly is defined as

I = I_0 e^{−γ r^2},

where I_0 denotes the intensity at the light source; the better the objective function value, the higher the firefly's brightness. γ represents the light absorption coefficient: as the distance increases and the transmission medium attenuates the light, the perceived brightness gradually decreases, and γ is set to reflect this characteristic. The parameter r_ij denotes the distance between fireflies i and j, calculated as

r_ij = ||s_i − s_j|| = sqrt( Σ_{k=1}^{n} (s_{i,k} − s_{j,k})^2 ),

where n represents the dimension of the problem. Once a firefly is attracted by the flash of another firefly, the attractiveness β is updated based on

β = β_0 e^{−γ r^2},

where β_0 denotes the attractiveness at the light source (r = 0). The formula for updating the location of fireflies is

s_i = s_i + β (s_j − s_i) + α ε,

where s_i and s_j denote the positions of fireflies i and j, respectively, α denotes the step factor, and ε represents a random vector.
The main algorithm process of the FA is described in Algorithm 2. We then obtain the prediction results of all subsequences, and the final prediction result is the superposition of the predictions of all sub-signals obtained by decomposition. The advantages of this approach are verified in the next section.
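The brightness, attraction, and movement rules above can be assembled into a compact FA. The sphere objective and all parameter values here (β_0, γ, the step factor, and its decay) are illustrative choices for the sketch, not the settings used in the paper.

```python
import numpy as np

def firefly_minimize(obj, dim=2, n_fireflies=20, n_gens=100,
                     beta0=1.0, gamma=0.01, alpha=0.2, seed=0):
    rng = np.random.default_rng(seed)
    s = rng.uniform(-5, 5, (n_fireflies, dim))      # firefly positions
    light = np.array([obj(p) for p in s])           # lower objective = brighter
    for _ in range(n_gens):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] < light[i]:             # j is brighter: i moves toward j
                    r2 = np.sum((s[i] - s[j]) ** 2)         # r_ij^2
                    beta = beta0 * np.exp(-gamma * r2)      # attractiveness
                    s[i] = (s[i] + beta * (s[j] - s[i])
                            + alpha * (rng.random(dim) - 0.5))
                    light[i] = obj(s[i])
        alpha *= 0.97                               # shrink the random step over time
    best = np.argmin(light)
    return s[best], light[best]

sphere = lambda p: float(np.sum(p ** 2))
best_pos, best_val = firefly_minimize(sphere)
print(best_val)
```

In the proposed model, the firefly positions would encode the BPNN weights and thresholds and the objective would be the training error, so the same loop serves as the FABP parameter optimizer.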

Experimental Results
In this section, the daily maximum temperature time series in Melbourne is used to analyze and verify the effectiveness of the presented method. The experimental data are collected from the real world. The evaluation criteria are the root-mean-squared error (RMSE), normalized root-mean-squared error (NRMSE), mean absolute percentage error (MAPE), and symmetric mean absolute percentage error (SMAPE). All prediction errors shown below are the mean values of 50 experimental runs. The expressions are as follows:

RMSE = sqrt( (1/n) Σ_{k=1}^{n} (y(k) − ŷ(k))^2 ),
NRMSE = sqrt( Σ_{k=1}^{n} (y(k) − ŷ(k))^2 / Σ_{k=1}^{n} (y(k) − ȳ)^2 ),
MAPE = (100/n) Σ_{k=1}^{n} |(y(k) − ŷ(k))/y(k)|,
SMAPE = (100/n) Σ_{k=1}^{n} |y(k) − ŷ(k)| / ((|y(k)| + |ŷ(k)|)/2),

where y(k) denotes the real value, ŷ(k) denotes the predicted value, ȳ denotes the mean of the real values, and n denotes the number of samples. The dataset contains the daily maximum temperature from 3 January 1981 to 31 December 1990, a total of 3650 samples, which is shown in Figure 2. The training set is made up of the first 3000 samples, and the testing set is composed of the remaining 650. To compare the effectiveness of the model, six other methods were used in the comparative experiments: an RBF neural network [36], an ANFIS [37], the original BP model, FABP, FABP with CEEMDAN decomposition (CEEMDAN-FABP), and FABP with VMD decomposition (VMD-FABP). The experimental environment was the Windows 7 operating system, and all experiments were carried out in MATLAB R2016a on a 3.50 GHz Intel(R) Core i3-4150M CPU with 6 GB RAM.
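The four error measures can be computed directly from the definitions above. Note that NRMSE conventions vary in the literature; the normalization by the spread of the real values used here is an assumption, matching the reconstructed formula above.

```python
import numpy as np

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def nrmse(y, yhat):
    # Assumed convention: squared error normalized by the spread of the real values
    return float(np.sqrt(np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)))

def mape(y, yhat):
    return float(100 * np.mean(np.abs((y - yhat) / y)))

def smape(y, yhat):
    return float(100 * np.mean(np.abs(y - yhat) / ((np.abs(y) + np.abs(yhat)) / 2)))

y = np.array([2.0, 4.0])
yhat = np.array([1.0, 5.0])
print(rmse(y, yhat))   # 1.0
print(mape(y, yhat))   # 37.5
```

SMAPE is bounded and symmetric in y and ŷ, which makes it a useful complement to MAPE when the real values approach zero.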
Figure 3 shows the decomposition results of the original time series based on CEEMDAN, which adaptively yields 13 IMFs. CEEMDAN has good anti-noise performance, so this paper omits the denoising of the original time series.

In this paper, the sample entropy is used as an indicator to decide how each IMF is predicted. Figure 4 shows the sample entropies of the IMFs. The entropy of the original time series is 0.81, which is indicated by the dotted line. The IMF components whose entropy is greater than that of the original time series are predicted separately, because these IMFs are highly complex. The IMFs whose entropies are smaller than that of the original time series are combined and superimposed. After this combination, the model complexity and computation time are significantly reduced while prediction accuracy is maintained. Based on the results in Figure 4, the first four IMF components are predicted separately, while the IMF components from 5 to 13 are combined for prediction. We obtained the prediction results of the five resulting components, as shown in Figure 5. It can be observed that the prediction curve of each IMF tracks the actual values, and the predicted trend is basically consistent. Due to the high-frequency characteristics of IMF1, accurate prediction and tracking are more difficult, resulting in larger prediction errors for IMF1.
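The sample entropy used above can be computed as follows (standard SampEn with embedding dimension m = 2 and tolerance r = 0.2·std, a common default; the paper's exact parameters are not stated). More regular series yield lower values, which is why the low-entropy IMFs can be safely merged.

```python
import numpy as np

def sample_entropy(x, m=2, r_frac=0.2):
    """SampEn(m, r): -log of the ratio of (m+1)- to m-length template matches."""
    x = np.asarray(x, dtype=float)
    r = r_frac * np.std(x)

    def count_matches(mm):
        # All overlapping templates of length mm
        templ = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        count = 0
        for i in range(len(templ) - 1):
            # Chebyshev distance between template i and all later templates
            d = np.max(np.abs(templ[i + 1:] - templ[i]), axis=1)
            count += int(np.sum(d <= r))
        return count

    B, A = count_matches(m), count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 300)
regular = np.sin(2 * np.pi * t)           # highly regular: low entropy
noise = rng.standard_normal(300)          # highly irregular: high entropy
print(sample_entropy(regular) < sample_entropy(noise))  # True
```

The O(N²) pairwise comparison is acceptable for subsequences of a few thousand points like those used here.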
Figure 6 shows the final prediction results of the hybrid model of CEEMDAN and FABP. It can be seen that the prediction curve tracks the real curve well and that the prediction error is small; this result is obtained without denoising the original time series. The prediction errors corresponding to Figures 5 and 6 are shown in Table 2. As can be seen from Table 2, the overall prediction error of the time series is 0.7763, while the prediction error of IMF1 is 0.6361. Therefore, if the prediction accuracy of IMF1 can be improved, the overall prediction accuracy will be greatly improved. Based on the above analysis, we apply the two-layer decomposition strategy to further decompose the IMF1 component. IMF1 is decomposed into six subsequences by the VMD algorithm; the decomposition results are shown in Figure 7. We use FABP to predict VMF1, …, VMF6, and we combine the prediction results to obtain the prediction of IMF1. Next, we combine the prediction of IMF1 with those of the IMF2, …, IMF5' components, thereby obtaining the prediction of the daily maximum temperature time series in Melbourne, which is shown in Figure 8. The error indexes of the different prediction models are shown in Table 3. It can be seen that the proposed two-layer decomposition algorithm significantly reduces the prediction error and improves prediction accuracy compared with direct prediction. This is mainly because the decomposition algorithm converts the complex original signal into several simple, easy-to-analyze sub-signals, which is conducive to analysis and prediction. With the two-layer decomposition model, the prediction error is smaller than with the single-layer approach, and the prediction performance is improved. We conducted one-step-, two-step-, three-step-, and five-step-ahead prediction experiments on the time series. The corresponding prediction errors, including RMSE, NRMSE, MAPE, and SMAPE, are shown in Tables 3-6. According to the results presented in the tables, the proposed model has the minimum prediction error in all prediction experiments, which indicates that the hybrid CEEMDAN-VMD-FABP model has the best prediction performance; the hybrid model based on the two-layer decomposition approach is better than the hybrid models based on a single decomposition approach. Interestingly, although the ANFIS performed poorly in one-step prediction, its one-step and multi-step predictions have similar errors, especially in the five-step-ahead experiment, where its performance is better than that of the RBF. The ANFIS thus shows stable prediction behavior, although its overall accuracy is not satisfactory. The prediction results of the other models, except for the ANFIS, basically conform to the expected pattern: the further ahead the prediction step, the more difficult the chaotic time series is to predict. To show the above experimental results more intuitively, we plot the error values, i.e., the RMSE, NRMSE, MAPE, and SMAPE shown in Tables 3-6, as a column chart, presented in Figure 9. It can be seen that the proposed hybrid CEEMDAN-VMD-FABP model has the best performance, with the minimum prediction error in one-step-, two-step-, three-step-, and five-step-ahead prediction. Moreover, we compared the training time and testing time in one-step-ahead prediction. The running time differs little across prediction steps, so we only list the running time of one-step prediction in Table 3. As can be seen from the table, the testing times of the methods are similar. Although the method proposed in this paper has the longest training time, the experimental results show that the proposed two-layer decomposition model obtains the best prediction accuracy, which demonstrates its effectiveness; even so, the overall running time is only a few minutes, completely within an acceptable range.
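Multi-step-ahead results like those in Tables 4-6 are commonly produced by feeding one-step forecasts back as inputs. A sketch with a generic one-step model is shown below; the moving-average stand-in predictor and the window size are illustrative assumptions, not the paper's FABP.

```python
import numpy as np

def one_step_model(window):
    # Stand-in one-step predictor: mean of the input window (not the paper's FABP)
    return float(np.mean(window))

def multi_step_forecast(history, steps, window_size=5):
    """Recursive multi-step forecasting: each prediction is fed back as an input."""
    buf = list(history[-window_size:])
    preds = []
    for _ in range(steps):
        p = one_step_model(np.array(buf))
        preds.append(p)
        buf = buf[1:] + [p]        # slide the window over the new prediction
    return preds

history = [1.0, 2.0, 3.0, 4.0, 5.0]
print(multi_step_forecast(history, steps=3))  # [3.0, 3.4, 3.68]
```

Because each step reuses earlier predictions, errors compound, which is consistent with the growing errors observed at larger prediction horizons above.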

Conclusions
In this paper, we propose a two-layer decomposition technique consisting of CEEMDAN and VMD. A group of subsequences is obtained by the two-layer decomposition, each subsequence is predicted separately, and the prediction results are combined to obtain the final result. For the prediction, we use a BPNN optimized by a firefly algorithm. From this work, the following can be concluded: 1.
Real-world time series are usually non-stationary and noisy, and it is generally difficult to analyze the original time series directly. CEEMDAN is a noise-robust decomposition method, and VMD can handle non-stationary signals very well. Therefore, the subsequences decomposed by CEEMDAN and VMD are easy to analyze and predict. 2.
After the decomposition of the original signal, the BPNN was used for prediction. At this stage, the parameters of the BPNN greatly influence prediction accuracy. Therefore, to select the model parameters reasonably, the FA was introduced to optimize the parameters of the BPNN.
In general, to improve prediction accuracy, two aspects were considered and deserve further study: first, analyzing the original input variables to eliminate factors that are not conducive to analysis and prediction; second, optimizing the prediction model at the prediction stage. We studied both aspects simultaneously, and the experimental results demonstrate the effectiveness of the proposed method.

Figure 1 .
Figure 1.The structure of the two-layer decomposition prediction model.


Figure 2 .
Figure 2. The daily maximum temperature time series in Melbourne.


Figure 3 .
Figure 3. Decomposition results of the daily maximum temperature time series in Melbourne based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN).

Figure 6 .
Figure 6.Prediction results and prediction errors of the daily maximum temperature in Melbourne based on the hybrid model of CEEMDAN and FABP.


Figure 7 .
Figure 7. Decomposition results of IMF1 based on the variational mode decomposition (VMD) algorithm.

Figure 8 .
Figure 8. Prediction results and prediction errors of the daily maximum temperature in Melbourne based on the two-layer decomposition algorithm and a BPNN optimized by a firefly algorithm (FABP).


Figure 9 .
Figure 9. Performance comparison of different models with different prediction steps.


Table 1 .
The notations used in this paper.

Table 2 .
Overall prediction error and prediction error of IMF1.


Table 3 .
Prediction errors of daily maximum temperature time series based on different algorithms (one step ahead).


Table 4 .
Prediction errors of daily maximum temperature time series based on different algorithms (two steps ahead).

Table 5 .
Prediction errors of daily maximum temperature time series based on different algorithms (three steps ahead).

Table 6 .
Prediction errors of daily maximum temperature time series based on different algorithms (five steps ahead).