Analysis and Forecasting of the Carbon Price in China’s Regional Carbon Markets Based on Fast Ensemble Empirical Mode Decomposition, Phase Space Reconstruction, and an Improved Extreme Learning Machine

Abstract: With the development of the carbon market in China, research on the carbon price has received increasing attention in related fields. However, due to its nonlinearity and instability, the carbon price is difficult to predict using a single model. This paper proposes a new hybrid model for carbon price forecasting that combines fast ensemble empirical mode decomposition, sample entropy, phase space reconstruction, a partial autocorrelation function, and an extreme learning machine that has been improved by particle swarm optimization. The original carbon price series is decomposed using the fast ensemble empirical mode decomposition and sample entropy methods, which eliminate noise interference. Then, the phase space reconstruction and partial autocorrelation function methods are combined to determine the input and output variables in the forecasting models. An extreme learning machine optimized by particle swarm optimization is employed to forecast carbon prices. An empirical study based on carbon prices in three typical regional carbon markets in China found that this new hybrid model performed better than other comparable models.


Introduction
Nowadays, climate change seriously threatens sustainable human development. China, as the world's biggest emitter of CO2, is of particular concern in this regard [1]. In order to actively implement the Paris Agreement and to contribute to the fight against climate change, China has committed to reducing its carbon intensity by 40-45% per unit of GDP whilst increasing the share of non-fossil energy consumption to 15% by 2020. Since the introduction of the Emissions Trading System (ETS) by the European Union (E.U.) in 2005, carbon emissions trading has become an important market tool for responding to climate change as well as a long-term mechanism to address pollution problems. Advancing with the times, China has successfully established regional pilots for carbon emissions trading and has currently formed eight regional carbon markets, consisting of three provinces and five cities. Moreover, in the carbon trading market, an important corollary factor is carbon price prediction, which helps to reflect the carbon reduction performance and market value [2]. An accurate and rational prediction of carbon prices would allow us to understand the pattern of carbon price variations and avoid risks in investments [3]. Therefore, it is meaningful to study scientific methods for predicting carbon prices in China. In addition, carbon prices in China's regional markets are time series that depend on their own historical data, which also makes it possible to deliver an accurate prediction by using the model that is presented in this paper.
While early-stage research has been conducted on a qualitative analysis of China's carbon market [4,5], more and more carbon price prediction methods have emerged in recent studies. They can be classified into two types that are focused on modeling and forecasting the carbon price volatility: mathematical statistical models and artificial neural networks. The conventional mathematical statistical models include difference-in-difference (DID), the vector auto-regression model (VAR), the autoregressive integrated moving average (ARIMA), and generalized autoregressive conditional heteroscedasticity (GARCH) models. Huang [6] showed that carbon emission trading has a significant and sustained promotion effect on carbon emission reductions by using the difference-in-difference method. Zeng et al. [7] employed a structural vector autoregressive (SVAR) model for exploring the dynamic relationships among the carbon emission allowance price, the regional economy, and energy prices in Beijing. Their empirical research results showed that, instead of the energy price, the historical carbon allowance price series was the major influencing factor on the carbon price. Zhu and Wei [8] examined the forecasting ability of three hybrid ARIMA models under the E.U. ETS. However, the ARIMA model requires a stationary time series, which renders it unsuitable for the direct prediction of a carbon price time series. Notably, the different GARCH-type models are popular in this field. Xia [9] studied the carbon price volatility of five pilot cities in China with the AR-GARCH (1,1) model. The results of the experiments showed high consistency. Byun and Cho [10] compared three methods to predict the related volatility, and concluded that the GARCH-type models were the most suitable method. However, Zhang et al. [11] found that GARCH-type models are only satisfactory for in-sample forecasting and have limited significance for out-of-sample results. As described in the abovementioned studies, a single statistical model may not be flexible enough for an appropriate simulation due to the dynamic characteristics of carbon price volatility.
Today, with the growing data volume and algorithms' ability to learn, machine learning algorithms are prevailing. The most essential feature of machine learning is to learn from the data, which means to build a system that parses data so as to uncover the laws that hold within the data. The main advantage of a machine learning algorithm is that it can consider multiple attributes or features at one time and capture the hidden relationship between them that is difficult for a statistical model to reveal [12][13][14][15][16][17]. Compared with a traditional statistical model, machine learning algorithms have a stronger self-learning ability, a generalization ability, fast calculation speed, an associative memory ability that can fit a nonlinear relationship, and more flexible applicability to the amount of sample data. Based on these advantages, machine learning algorithms are applied in many fields [18][19][20][21][22][23][24]. For example, many algorithms have been developed to predict carbon prices, such as the back propagation neural network (BPNN), the support vector machine (SVM), and the radial basis function neural network (RBF). Liu and Sun [25] applied a BPNN to forecast the carbon price and the carbon trading volume in Shanghai. Zhang et al. [26] proposed a grey neural network improved by the ant colony algorithm (GNN-ACA) for carbon spot price forecasting. The results showed that the selected model performed significantly better than single ARIMA and least squares support vector machine (LSSVM) models by using data collected from the E.U. ETS. Tsai [27] proposed a carbon price forecasting system using the radial basis function neural network (RBF), which can supply precise and real-time predictions of carbon prices. Moreover, the SVM and LSSVM, individually or in combination with other algorithms, have been widely adopted to predict the carbon trading price. Gao and Li [28] compared several prediction models using the daily EU emission allowance (EUA) futures prices from March 2008 to September 2013 (DEC12). The results indicated that the proposed EMD-PSO-SVM model performed better than other artificial neural networks (ANNs) in carbon price forecasting. Razak et al. [29] and Zhu et al. [30] used LSSVM as their main forecasting model. Their conclusions indicated that the LSSVM model seems to be a superior method for forecasting highly nonlinear and nonstationary carbon prices.
Huang et al. [31] put forward the extreme learning machine (ELM) model in 2004, which has better generalization performance as well as a faster learning speed than the abovementioned models. Moreover, it can avoid many of the problems that may arise in gradient-based learning methods; for instance, stopping criteria and learning epochs. As a consequence, it has been widely employed to make predictions in a variety of fields since its introduction. Shrivastava and Panigrahi [32] investigated the performance of a combination of ELM and the wavelet technique (WELM) in price forecasting in electricity markets. The empirical research demonstrated that this model is appropriate for price forecasting. Liu et al. [33] applied the ELM model in wind speed forecasting. In this study, the ARIMA model and the SVM model were involved in a comparison of the prediction performance. The experimental results showed that the proposed WPD-EMD-ELM model performed the best among the compared models. Furthermore, an ELM's input weight matrix and hidden layer biases are key parameters in the ELM's generalization capability. Based on this, it is essential to utilize an optimization algorithm to obtain the optimal parameters. Rocha et al. [34] implemented parameter selection for an ELM improved by particle swarm optimization (PSO-ELM) in the forecasting of a distributed electrical generation system's capacity. Fan et al. [35] proposed a PSO-ELM model for short-term power load forecasting. The results showed that the improved model had a higher learning rate and prediction accuracy compared with the traditional ELM model. From the above, it can be found that the ELM-type models have been well-employed in a variety of forecasting scenarios. Therefore, one of the purposes of this paper is to verify the feasibility of the PSO-ELM model for carbon price prediction.
Given the chaotic property and intrinsic complexity of carbon prices, it may not be appropriate to directly forecast carbon prices before data preprocessing. Presently, empirical mode decomposition (EMD) and the wavelet transform (WT) are considered to be the common data preprocessing approaches for decomposing the initial series and eliminating the random volatility. WT was applied to signal processing in electricity market price forecasting by Saber et al. [36]. In the process of analyzing the unified interval price of China's carbon trading market, Li and Lu [37] applied the GARCH-EMD model to predict carbon prices. The results demonstrate that EMD is an effective method to decompose unstable carbon prices. Zhu et al. [38] built a multiscale model that combined EMD and evolutionary least squares support vector regression (LSSVR) for carbon price forecasting with a high accuracy. Based on the data from the E.U. ETS, the empirical results showed that the EMD-LSSVR model performed the best in comparison with other prediction models according to the values of statistical indicators. It is worth noting that EMD may have a mode mixing problem that causes the decomposed intrinsic mode functions (IMFs) to lose their meaning. To tackle the problem, Huang and Wu proposed ensemble empirical mode decomposition (EEMD), which introduces white noise into the original series [39]. In 2014, fast ensemble empirical mode decomposition (FEEMD) was proposed to improve EEMD's computing capacity for a large amount of sample data [40]. They were all used successfully in wind speed forecasting. Heng et al. [41] reconstructed the initial data for wind speed forecasting by FEEMD. Sun and Liu used EMD and FEEMD for processing the original wind speed data, and then combined these methods with different intelligent algorithms for the prediction of wind speed [42,43]. Because carbon prices have dynamic and nonlinear properties that are similar to those of wind speed, this paper uses EMD and FEEMD to decompose a carbon price series and introduces both phase space reconstruction (PSR) theory and a partial autocorrelation function (PACF) for the analysis of the decomposed subsequences.
Currently, China has constructed eight regional carbon markets, and, on 19 December 2017, started the construction of the national carbon market. As the earliest pilot markets, the Beijing, Shenzhen, and Hubei carbon markets have been in smooth operation and have gradually formed their own features in the process of promoting emission reductions.
Having summarized the research of our predecessors, this paper selects the carbon prices of Beijing, Shenzhen, and Hubei as examples. We focus on Beijing's carbon price, and analyze its features through a comparison with the other two typical markets. After being decomposed by the FEEMD, the IMFs are analyzed by phase space reconstruction and a partial autocorrelation function to determine the input of the forecasting models in the next step. Additionally, this paper adopts the PSO-ELM model to forecast carbon prices.
The main contribution of this paper is this new hybrid combination model for carbon price prediction, which is expressed as FEEMD-PSR-PACF-PSO-ELM. Firstly, this paper comprehensively considers the chaotic property and the partial autocorrelation of decomposed carbon price subsequences to reconstruct the input and output variables. Secondly, the research idea, which combines FEEMD decomposition of carbon prices with the PSO-ELM forecasting model, represents a new attempt to predict carbon prices.
The rest of this paper is divided into four sections. Section 2 presents the methods and models that are applied in this paper, including the decomposition methods, the chaotic series reconstruction, and the hybrid prediction model. An exhaustive explanation of the hybrid forecasting models that are proposed in this paper is given in Section 3. The data processing and the analysis of carbon price forecasting based on actual data from different regions under China's ETS are presented in Section 4. Finally, Section 5 provides conclusions according to the results of the empirical analysis.

The Particle Swarm Optimization Algorithm
Proposed by Kennedy and Eberhart [44] in 1995, the particle swarm optimization (PSO) algorithm simulates bird predation behavior and treats each bird as a particle. As a well-recognized optimization algorithm, its rationale is to continuously move each particle toward Pbest (the best location found by the particle itself) and Gbest (the current global best position). Suppose that, in a D-dimensional search space, the t-th particle is represented by $X_t = (x_{t1}, x_{t2}, \ldots, x_{tD})^T$, and its speed and Pbest are expressed as $V_t = (v_{t1}, v_{t2}, \ldots, v_{tD})^T$ and $P_t = (p_{t1}, p_{t2}, \ldots, p_{tD})^T$, respectively. In addition, Gbest is stated as $G_t = (G_{t1}, G_{t2}, \ldots, G_{tD})$. The kernel of PSO can be expressed as:
$$v_{td}^{s+1} = w\, v_{td}^{s} + c_1 r_1 \left(p_{td}^{s} - x_{td}^{s}\right) + c_2 r_2 \left(G_{td}^{s} - x_{td}^{s}\right)$$
$$x_{td}^{s+1} = x_{td}^{s} + v_{td}^{s+1}$$
where $s$ is the iteration number, $d = 1, 2, \ldots, D$, w is the inertia weight that adjusts the search range, and c1 and c2 are acceleration factors set to 1.4945. In addition, r1 and r2 are random numbers uniformly distributed on the interval [0, 1]. To avoid a blind search, the position and speed values are limited to [−Xmax, Xmax] and [−Vmax, Vmax], respectively.
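The update rule described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the inertia weight `w = 0.7`, the swarm size, the iteration count, the bounds, and the sphere test function are all assumptions chosen for illustration; only c1 = c2 = 1.4945 and the clipping of position and speed come from the text.

```python
import random

def pso_minimize(objective, dim, n_particles=20, n_iter=100,
                 w=0.7, c1=1.4945, c2=1.4945, x_max=5.0, v_max=1.0, seed=0):
    """Minimise `objective` over [-x_max, x_max]^dim with a basic PSO."""
    rng = random.Random(seed)

    def clip(value, bound):
        return max(-bound, min(bound, value))

    # Random initial positions and speeds inside the allowed ranges.
    xs = [[rng.uniform(-x_max, x_max) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_val = [objective(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()  # uniform on [0, 1)
                # Speed update: inertia + pull toward Pbest + pull toward Gbest.
                new_v = (w * vs[i][d]
                         + c1 * r1 * (pbest[i][d] - xs[i][d])
                         + c2 * r2 * (gbest[d] - xs[i][d]))
                vs[i][d] = clip(new_v, v_max)
                xs[i][d] = clip(xs[i][d] + vs[i][d], x_max)
            val = objective(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_val = pso_minimize(sphere, dim=3)
```

In a PSO-ELM setting, `objective` would instead return the forecasting error of an ELM whose input weights and biases are encoded in the particle position.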

Extreme Learning Machine
The extreme learning machine is an innovative algorithm based on a single-hidden-layer feedforward neural network, as shown in Figure 1. Unlike gradient-based training, the input weights and hidden-layer biases are assigned randomly rather than tuned iteratively. The Moore-Penrose generalized inverse of the hidden-layer output matrix is then used to compute the output weights, which transforms the training process into the solution of a least-squares problem. ELM learns faster than other neural network models, and it can be used for classification, regression, clustering, and sparse approximation while maintaining learning accuracy. Of note is that, because they are generated randomly, the weight values and thresholds may be non-optimal, which leads to poor performance and an unstable output. In addition, ELM needs a large number of hidden layer nodes to reach an expected result, which may make it prone to overfitting. To resolve these problems, this paper uses PSO to optimize the hidden-layer biases and the input-layer weights of ELM; the resulting model is referred to as PSO-ELM and works as shown in Figure 2. In Figure 2, Part 1 is the forecasting process of the extreme learning machine, and Part 2 is the principle of the particle swarm optimization algorithm. The proposed model not only takes full advantage of PSO's global search capability and ELM's rapid convergence rate, but also overcomes the inherent problems of ELM.
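To make the least-squares view of ELM training concrete, the following is a minimal sketch that assumes NumPy is available. The sigmoid activation, the 20 hidden nodes, the uniform range of the random weights, and the toy sine data are illustrative assumptions, not settings reported in this paper.

```python
import numpy as np

def elm_fit(X, y, n_hidden=20, seed=0):
    """Train a basic ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights, not trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases, not trained
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y                             # Moore-Penrose solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed hidden layer and the fitted output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy regression problem (hypothetical data, for illustration only).
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
W, b, beta = elm_fit(X, y)
train_rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - y) ** 2)))
```

Because `W` and `b` are drawn once and never updated, the quality of the fit depends on that random draw, which is the weakness that PSO is used to address.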

Fast Ensemble Empirical Mode Decomposition and Sample Entropy
As described above, fast ensemble empirical mode decomposition (FEEMD) is a fast implementation of EEMD. It is often used for signal decomposition, in which a nonstationary time-series signal X(i) (i = 1, 2, ..., n) is decomposed into a finite number of IMFs and one residual component R. Moreover, by introducing white noise and the idea of an ensemble average, it can effectively solve the mode mixing problem of EMD. Given the features of a dataset, the amplitude k of white noise is set as 0.05-0.5 times and the number of iterations M is set to 100.
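As a rough formalization of the ensemble idea described above, EEMD adds a different white-noise realization to the signal in each of the M trials, decomposes each noisy copy by EMD, and averages the results. The notation $w_l$, $c_{l,j}$, $r_l$, and $J$ is introduced here for illustration only and is not taken from the original text.

$$X_l(i) = X(i) + w_l(i), \qquad l = 1, 2, \ldots, M$$

$$X_l(i) = \sum_{j=1}^{J} c_{l,j}(i) + r_l(i)$$

$$\mathrm{IMF}_j(i) = \frac{1}{M}\sum_{l=1}^{M} c_{l,j}(i), \qquad R(i) = \frac{1}{M}\sum_{l=1}^{M} r_l(i)$$

Here the amplitude of each noise series $w_l$ is governed by k, and the sketch assumes that every trial yields the same number J of IMFs. Because the added noise is independent across trials, it tends to cancel in the average.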
Sample entropy (SE) is a modification of approximate entropy (AE). It was proposed in 2000 by Richman et al. [45] and is used to assess the complexity of physiological time-series signals, diagnose disease states, and so on. The larger the SE value, the higher the sample complexity. Furthermore, SE has two advantages over AE: data length independence and better consistency; that is, the influence of the parameters on the sample entropy is the same. The SE value has three important parameters, denoted by N, m, and r, where N expresses the length of the subsequence, m represents the embedding dimension, and r is the similarity tolerance. The formula for calculating SE is shown below.
$$SE(m, r) = \lim_{N \to \infty} \left[ -\ln \frac{B^{m+1}(r)}{B^{m}(r)} \right]$$
where $B^{m}(r)$ denotes the probability that two subsequences of length m match within the tolerance r.
Since N cannot be infinite in an actual calculation, N takes a finite value, and the sample entropy is calculated as:
$$SE(m, r, N) = -\ln \frac{B^{m+1}(r)}{B^{m}(r)}$$
Generally, in practical applications, m is 1 or 2, and r is set from 0.1×std to 0.25×std (where std represents the standard deviation of the original sequence). Therefore, this paper sets m to 2 and r to 0.2×std. Based on the characteristics of the SE value, this paper judges the complexity of each decomposed sequence by calculating the SE value, and then combines the sequences with similar SE values. In other words, sequences with similar complexity are combined into new sequences to prepare for the subsequent steps.
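The finite-N calculation can be sketched in a few lines of standard-library Python. This is a direct, unoptimized illustration under stated assumptions: the Chebyshev (maximum) distance between templates, a match counted when the distance is at most r, and the population standard deviation for std are common conventions rather than choices confirmed by the text.

```python
import math
import random
import statistics

def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy with tolerance r = r_factor * std of the series."""
    n = len(series)
    r = r_factor * statistics.pstdev(series)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches are excluded
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return -math.log(a / b)

# A regular signal should come out as less complex than white noise.
regular = [math.sin(2.0 * math.pi * i / 20.0) for i in range(200)]
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(200)]
se_regular = sample_entropy(regular)
se_noisy = sample_entropy(noisy)
```

Applied to each IMF, values computed this way could then be compared so that components with similar complexity are merged.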

Phase Space Reconstruction and the Maximal Lyapunov Exponent
Phase space reconstruction was put forward to deal with the complexity and nonlinearity in time series based on chaos theory. Considering the nonlinear and chaotic characteristics of carbon price time series, this paper applies phase space reconstruction (PSR) to reconstruct the phase space of each subsequence so as to accurately determine the input of the carbon price prediction. In general, for a time series {xi} = {x1, x2, …, xN} with delay time τ and embedding dimension m, which are the two key parameters of PSR, the reconstructed matrix is calculated by:
$$X = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(m-1)\tau} \\ x_2 & x_{2+\tau} & \cdots & x_{2+(m-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_K & x_{K+\tau} & \cdots & x_{K+(m-1)\tau} \end{bmatrix}, \qquad K = N - (m-1)\tau$$
Then, the original time series and the corresponding output can be reconstructed as input-output pairs, in which rows of the reconstructed matrix serve as input vectors and a subsequent value of the series serves as the output.
In mathematics, the Lyapunov exponent of a dynamic system describes the separation rate of infinitesimally close trajectories [46]. Whether the system exhibits chaos can be judged intuitively from whether the maximal Lyapunov exponent is greater than zero: a positive Lyapunov exponent means that, no matter how small the spacing between two initial trajectories is, the difference will increase exponentially with time in the system's phase space, so that the system can be called chaotic. There are many ways to calculate the maximal Lyapunov exponent, such as the Wolf method and Gram-Schmidt renormalization. This paper uses the widely used Wolf method [47] to calculate the maximal Lyapunov exponent under PSR.
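The construction of model inputs from delay vectors can be sketched as follows. The function name `delay_embed` is hypothetical, and pairing each delay vector with the value one step after its last element is an assumption about the forecasting setup (one-step-ahead prediction), not something stated in the text.

```python
def delay_embed(series, m, tau):
    """Build (inputs, targets) from delay vectors of dimension m and delay tau.

    Each input is (x_i, x_{i+tau}, ..., x_{i+(m-1)tau}); the target is the
    value that follows the last element of that vector.
    """
    span = (m - 1) * tau
    inputs, targets = [], []
    for i in range(len(series) - span - 1):
        inputs.append([series[i + j * tau] for j in range(m)])
        targets.append(series[i + span + 1])
    return inputs, targets

# Toy series 0..9 with m = 3 and tau = 2.
inputs, targets = delay_embed(list(range(10)), m=3, tau=2)
# inputs[0] is [0, 2, 4] and its target is 5.
```

With the τ and m estimated for a subsequence, the resulting rows would form the input matrix of the forecasting model and `targets` its output vector.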

The Framework of the Proposed Model
The focus of the overall hybrid model in this paper can be divided into two parts, which are presented in Figure 3. One part is to decompose the initial carbon price time series to determine the input and output of the prediction model. The other part is to forecast the carbon price and verify the accuracy of the proposed prediction model. In particular, this paper uses PACF to analyze those subsequences whose chaotic characteristics are not significant. In this way, after comprehensively analyzing the characteristics of the sequences, the input and output of the PSO-ELM model can be determined more reasonably. Similarly, in part two, after WT decomposition, the initial data are transformed into an approximate sequence and a detailed sequence. The detailed sequence is discarded. Then, the same steps as in part one are followed to perform phase space reconstruction and a partial autocorrelation analysis to determine the input and output of PSO-ELM. For forecasting, the data are divided into training sets and test sets. Finally, the forecast is compared with the actual carbon price.
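For subsequences that are not chaotic, lag selection by PACF can be sketched with the Durbin-Levinson recursion. This is an illustrative sketch only: the significance bound 1.96/sqrt(N) is the usual large-sample approximation of the 95% limit, and the simulated AR(1) series with coefficient 0.8 is hypothetical test data rather than carbon price data.

```python
import math
import random

def autocorr(series, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    rho = autocorr(series, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        out.append(phi_kk)
        phi_prev = phi
    return out

def significant_lags(series, max_lag=10):
    """Lags whose partial autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(series))
    return [k for k, v in enumerate(pacf(series, max_lag), start=1) if abs(v) > bound]

# Hypothetical AR(1) series: x_t = 0.8 * x_{t-1} + noise.
rng = random.Random(1)
series = [0.0]
for _ in range(499):
    series.append(0.8 * series[-1] + rng.gauss(0.0, 1.0))
lags = significant_lags(series, max_lag=10)
```

The selected lags would then define which past values of the subsequence are used as model inputs.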

Data
As the capital of China, Beijing's carbon market development is related to the sustainable development of the capital in the future. The Beijing carbon market mechanism provides basic support to the strategic positioning of the capital's "four centers". In addition, we select the carbon price data from the Shenzhen and Hubei carbon markets to perform a comparative analysis. The Shenzhen and Hubei carbon markets are typical carbon markets in China with a longer transaction time and a higher transaction volume. We chose the daily transaction price of these three carbon markets for empirical studies. The training set and testing set were divided according to a ratio of 7:3.
We selected the carbon price of the Beijing carbon market from 28 November 2013 to 29 December 2017 in the first case study to verify the hybrid model's applicability. The carbon price data from the Shenzhen carbon market from 1 November 2013 to 29 December 2017 and the data from the Hubei carbon market from 2 April 2014 to 20 June 2017 were used to further test the model's validity. To further validate the effectiveness of the model, we updated the latest data on the Beijing carbon market and conducted an empirical analysis as shown in Section 4.4. Figure 4 shows the original price data from these three regional carbon markets, which were obtained from the literature and an official website with the address: http://www.tanjiaoyi.com/.
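A 7:3 division of a price series can be sketched as below. Splitting chronologically (earlier observations for training, later ones for testing) is an assumption that is typical for time series but is not spelled out in the text; the function name and the toy prices are hypothetical.

```python
def chronological_split(series, train_ratio=0.7):
    """Split a series into leading training data and trailing test data."""
    cut = round(len(series) * train_ratio)
    return series[:cut], series[cut:]

# Hypothetical daily prices, for illustration only.
prices = [50.0, 51.2, 49.8, 52.1, 53.0, 52.4, 51.9, 54.3, 55.0, 54.1]
train, test = chronological_split(prices)
```

Keeping the order intact avoids training on observations that occur after the test period.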

Figure 4. The carbon price graph of the three regional carbon markets.

Carbon Price Decomposition
It can be seen from Figure 4 that the original carbon price series of the regional carbon markets all fluctuate strongly. In order to reduce noise interference, this paper uses the FEEMD method to decompose carbon prices. At the same time, EMD and WT were also employed to decompose the same Beijing carbon price series so as to test the superiority of FEEMD. The results are shown in Figures 5 and 6, respectively. Figure 5a shows that FEEMD decomposed the Beijing carbon price series into seven IMFs and one remainder; Figure 5b shows that EMD decomposed the series into six IMFs and one remainder. In Figure 6, the original data were decomposed into an approximation A1 and a detail D1 by WT. A1 was expected to present the main fluctuation, while D1 depicted the highest-frequency component. Therefore, A1 was used for prediction in this paper.

The Calculation of Sample Entropy
As described above, SE is used to measure the complexity of a series. Since decomposition by FEEMD and EMD results in a large number of IMFs, this paper calculates the SE value of each IMF, as shown in Table 1 and Figure 7, so as to understand their complexity and merge them into new sequences, which improves the computational efficiency. The recombination results are exhibited in Tables 2 and 3. The new subsequences will be used in the carbon price predictions. In the process of predicting each subsequence, the key part is the determination of the input and output. This paper introduces phase space reconstruction and the PACF method to reconstruct the subsequences.
Firstly, after calculation of the τ and m for each subsequence, the phase space is rebuilt based on Formulas (6) and (7). At the same time, the maximal Lyapunov exponents are calculated to examine the chaotic properties. The τ and m obtained for the five series after decomposition by FEEMD are shown in Table 4.
Secondly, there may be non-chaotic subsequences. This paper introduces a PACF analysis of these subsequences to make the determination of the input and output more complete and reasonable. In a word, the main idea of PACF is to find the lags whose partial autocorrelation coefficients fall outside the 95% confidence interval. In Table 4, we found that the maximal Lyapunov exponents of sub4 and sub5 are negative. Theoretically speaking, these subsequences do not have chaotic characteristics; so, phase space reconstruction methods are not suitable for them. Therefore, we used the PACF method to analyze sub4 and sub5. The results are shown in Figure 8, and the training and test sets are shown in Table 4. For instance, sub5 was reconstructed using PACF and two lags were obtained, and the size of the data in the training and test sets was set according to the ratio of 7 to 3. Similarly, Table 5 and Figure 9a show the results of the PSR and PACF analysis of the subsequences decomposed by EMD, respectively. In addition, we found that, according to its maximal Lyapunov exponent, which is shown in Table 6, the approximation (A1) component decomposed by WT did not satisfy the chaotic characteristic. Figure 9b shows the PACF analysis result of A1. In order to clearly understand the meaning of Figures 8 and 9, Table 7 lists the autocorrelation coefficients of each subsequence after PACF. With the data in Table 7, it can be seen that the coefficients of the lags that exceed the limit range line in Figures 8 and 9 are more than two times the standard error; so, we extract them as significant lags for prediction.
In order to effectively evaluate the performance of the prediction models, this paper selected the root mean square error (RMSE) and the mean absolute percentage error (MAPE) to test the accuracy of the proposed models. For both error indicators, the larger the value, the worse the performance, and vice versa. The formulas are listed as follows.
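$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}$$
$$MAPE = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \times 100\%$$
where n represents the amount of data in the test set, $y_t$ is the t-th actual value, and $\hat{y}_t$ is the matching prediction output.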
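The two indicators are simple to compute; the sketch below uses the standard definitions of RMSE and MAPE, with MAPE expressed in percent. The three price values are made up for illustration and are not results from this paper.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical actual and predicted prices.
actual = [50.0, 52.0, 54.0]
predicted = [49.0, 53.0, 54.0]
example_rmse = rmse(actual, predicted)  # sqrt(2/3), about 0.816
example_mape = mape(actual, predicted)  # about 1.308 (percent)
```

RMSE is expressed in the units of the price, whereas MAPE is scale-free, which is why the two are reported side by side.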

Beijing Carbon Price Forecasting
First of all, to show the forecasting performance of this hybrid model, the models for the comparison are established as shown in Figure 10. Figure 10 is divided into two main parts. The first part compares the influence of different decomposition methods, which are shown in the blue box. The second part emphasizes the forecasting accuracy of the prediction models under comparison, which are displayed in the pink box. The operations and graphics in this paper were all done in MATLAB R2015b and Excel. Figure 11 shows the carbon price forecasting fitting curves that were calculated by all of the models in this paper. The MAPE and RMSE values are listed in Table 8 and Figure 12. Based on the results, we can draw the following conclusions:
1. The proposed FEEMD-PSR-PACF-PSO-ELM model had the lowest MAPE and RMSE values (2.4604% and 1.853, respectively) among the models under comparison in this paper, which demonstrates its performance.
2. In Figure 11, the forecasting curve of the FEEMD-PSR-PACF-PSO-ELM model was the nearest to the actual carbon price, and that of FEEMD-PSR-PACF-BP was the least close to the actual carbon price.
3. Compared with the FEEMD-PSR-PACF-ELM and FEEMD-PSR-PACF-BP models, the proposed model had the best performance, which supports the superiority of the PSO-ELM predictive model.
4. Compared with EMD-PSR-PACF-PSO-ELM and WT-PACF-PSO-ELM, we can infer that the FEEMD decomposition method has the best effect. It should be noted that the approximate sequence after WT decomposition was also analyzed by phase space reconstruction. If, based on the value of the maximal Lyapunov exponent, it was found not to be chaotic, a PACF analysis was performed. Therefore, the models in this group differ only in the method that was used to decompose the initial data.

Case Studies of Other Typical Pilot Carbon Prices
In order to test the applicability of the proposed model and compare it with other regional market situations, Shenzhen and Hubei's carbon prices were analyzed for forecasting. In order to avoid redundancy, this section presents only the results and not the details of the analysis process. Appendix A contains the tables and figures that show repetitive work, including Figures A1-A3 and Tables A1-A12.
Firstly, the carbon price series of Shenzhen and Hubei were decomposed by FEEMD, EMD, and WT separately, and the results are presented in Figures A1 and A2. In addition, we calculated the SE values of each IMF, as shown in Tables A1 and A2, to understand their complexity and merge them into new subsequences, which are shown in Tables A3 and A4.
Secondly, Tables A5-A8 were prepared to determine the input and output of the predictive models. Based on the results, it can be concluded that some subsequences are not chaotic. Therefore, we applied the PACF method to them.
Finally, to demonstrate the performance and general applicability of the proposed model, the carbon price series from different markets were employed for supplemental verification. The results on the Shenzhen and Hubei carbon prices are shown in Figures 13 and 14, respectively. The MAPE and RMSE values of all the models are shown in Table 9.
From the forecasting results on the Shenzhen and Hubei carbon markets, some conclusions can be drawn.
(a) Similar to the forecasting results on the Beijing carbon market, the proposed model (the FEEMD-PSR-PACF-PSO-ELM model) performs the best among the models under comparison, and the FEEMD-PSR-PACF-BP model has the worst fitting effect.
(b) The difference is that, under the same model, the accuracy of the carbon price prediction differs across these regions. For instance, the results on the Shenzhen carbon market show that the best MAPE value is 8.39%, which is worse than that of Beijing (2.46%) and Hubei (1.645%). This may be due to the different actual regional situations, but it does not prevent us from establishing the validity of the proposed hybrid model.

Additional Case Study of the Beijing Carbon Market
In order to test the applicability and superiority of the proposed model, we used the latest official data on the Beijing carbon market, that is, data from 28 November 2013 to 5 December 2018. On this basis, according to the above-described analysis process, the updated Beijing carbon price series was analyzed and predicted.
Similar to the above illustration, the samples were divided into two subsets for prediction: a training set (approximately 70%) and a testing set (approximately 30%). For the sake of simplicity, we omit the details of the process and directly explain the results of the analysis for each step. Figure A3 displays the decomposed results of FEEMD, EMD, and WT. After the calculation of the SE value (shown in Table A9), the new subsequences were divided as illustrated in Table A10. To determine the input and output of the model, we performed a phase space reconstruction and a PACF analysis of each sequence. After calculating the main parameters (as shown in Table A11), the input and output of these subsequences were obtained, and are shown in Table A12. Finally, the PSO-ELM model, the ELM model, and the back propagation neural network (BP) model were used for forecasting, and the results are shown in Figure 15 and Table 10.
As shown in Table 10, the proposed model has the best MAPE and RMSE values among the models under comparison. It is worth noting that the prediction accuracy with the updated data on the Beijing carbon price is higher, which also means better prediction performance. In Figure 15, the curve of FEEMD-PSR-PACF-PSO-ELM best fits the actual data. In a word, we can conclude that the model proposed in this paper still has the best applicability to the prediction of carbon prices in Beijing's carbon market after updating the data in the models under comparison.

Conclusions
The promotion of the carbon market is a requirement for the high-quality development of China's economy.An accurate carbon price forecasting method is helpful for the stability of the carbon market.This paper proposed a new hybrid model for carbon price prediction based on fast ensemble empirical mode decomposition, sample entropy, phase space reconstruction, and a partial autocorrelation function that utilizes an extreme learning machine improved by particle swarm optimization.Due to the nonlinearity and volatility of carbon price time series, this paper combined a decomposition method and a phase space reconstruction theory for data analysis and processing.FEEMD was introduced to decompose the original carbon price to reduce the noise signal.The sample entropy was calculated to merge the series decomposed by FEEMD to form new subsequences, which reduced the overall computational workload.Based on Chaos theory, a phase space reconstruction and the maximum Lyapunov exponent were employed to determine the input and output variables of the prediction models.In particular, this paper tested the chaotic property of each subsequence by calculating the maximum Lyapunov exponent, and performed a PACF analysis of the subsequences that were not suitable for PSR.This paper forecast the carbon price using the PSO-ELM model.The decomposition methods and prediction models, including WT, EMD, single ELM, and BP, were compared.To verify the performance and validity of FEEMD-PSR-PACF-PSO-ELM, case studies of three different carbon markets were used.Moreover, we focused on the carbon price forecast under Beijing's ETS.The conclusions that can be drawn according to the empirical results are summarized below.
(a) From the performance of the forecasting models in the case studies, we can infer that the decomposition methods (FEEMD, EMD, WT) improve the forecasting accuracy by reducing the noise interference in the initial carbon price data. Among the decomposition methods, FEEMD has the best applicability for forecasting the carbon price when the same prediction models are applied.
(b) Integrating the chaos and PACF methods leads to an effective method for processing nonstationary and nonlinear carbon prices that takes full account of the characteristics of the carbon price subsequences.
(c) The PSO-ELM model has the best performance in forecasting the carbon price compared with the other models that were considered in this paper. Taking only historical data into account to determine the input and output of the forecasting models, and following the above-described technical route, we can obtain future carbon price changes in regional carbon markets in China through the proposed model, which contributes to policy development and investment. Moreover, it may be useful for the analysis of the national carbon market.
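The idea of determining the model input and output from historical data alone can be illustrated with a time-delay embedding, the basic construction behind phase space reconstruction. The sketch below is only an illustration: the function name `delay_embed`, the one-step-ahead target, and the example values of the embedding dimension `dim` and delay `tau` are assumptions, not the settings used in the paper.

```python
def delay_embed(series, dim, tau):
    """Build (input vector, next value) pairs by time-delay embedding.

    Each input is [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]] and the
    target is the value one step after the last coordinate (an assumption
    made here for a one-step-ahead forecast).
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    span = (dim - 1) * tau  # distance between first and last coordinate
    inputs, targets = [], []
    for t in range(len(series) - span - 1):
        inputs.append([series[t + k * tau] for k in range(dim)])
        targets.append(series[t + span + 1])
    return inputs, targets


# Hypothetical series for illustration only.
series = [1, 2, 3, 4, 5, 6, 7, 8]
inputs, targets = delay_embed(series, dim=3, tau=2)
for x, y in zip(inputs, targets):
    print(x, "->", y)
```

Each row of `inputs` and its matching entry in `targets` would form one training sample for a forecasting model; for subsequences handled by PACF instead, the significant lags would play the role of the input coordinates.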
This paper focuses on the historical time series of carbon prices, and fully considers the instability and nonlinear properties of carbon price series. The applicability of the proposed hybrid model was also verified by case studies. However, we did not analyze possible influencing factors in this paper. Therefore, subsequent research may focus on the external factors that influence the carbon price in China's carbon market.

Figure 1. The topology of the extreme learning machine (ELM).

Figure 2. The flow chart of the particle swarm optimization (PSO)-ELM model.

Figure 2 displays the flowchart of the proposed PSO-ELM model for carbon price prediction. Part 1 is the calculation process of the PSO algorithm and Part 2 is the forecasting procedure of ELM. In short, this paper utilizes PSO to optimize the key parameters of ELM. The overall hybrid model can be divided into two parts, which are presented in Figure 3. The first part decomposes the initial carbon price time series in order to determine the input and output of the prediction model. The second part forecasts the carbon price and verifies the accuracy of the proposed prediction model.
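The PSO search itself can be sketched in a few lines. The code below is a generic, minimal particle swarm optimizer and not the paper's implementation: the function name `pso_minimize`, the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size, and the toy objective are all assumptions. In the PSO-ELM setting, the objective would presumably be the forecasting error of an ELM built from the candidate parameters, which this section does not specify.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def shifted_sphere(x):
    """Toy objective with its minimum at (1, 1, ..., 1); stands in for a model error."""
    return sum((xi - 1.0) ** 2 for xi in x)


best, best_val = pso_minimize(shifted_sphere, dim=3, bounds=(-5.0, 5.0))
print(best, best_val)
```

Replacing `shifted_sphere` with a function that trains an ELM from a candidate parameter vector and returns its validation error would give the overall shape of Part 1 feeding Part 2 in Figure 2.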

Figure 3. The framework of the proposed forecasting models. IMF, intrinsic mode function. PACF, partial autocorrelation function.

Figure 5. (a) The fast ensemble empirical mode decomposition (FEEMD) outcome of the Beijing carbon price series; (b) The outcome by the empirical mode decomposition (EMD) method.

Figure 6. The wavelet transform (WT) results on the Beijing carbon price series.

Figure 7. (a) The SE value of subsequences processed by FEEMD; (b) The SE value of subsequences processed by EMD.

Figure 8. (a) The result of the partial autocorrelation function (PACF) analysis of sub4 after processing by FEEMD of the Beijing carbon price; (b) The result of the PACF analysis of sub5 after processing by FEEMD of the Beijing carbon price.

The comparison of the proposed FEEMD-PSR-PACF-PSO-ELM model and the single PSO-ELM model shows the rationality of the decomposition and the data reconstruction, in terms of both the fit of the prediction curves and the MAPE and RMSE values.

Figure 10. The framework of the forecasting models under comparison. BPNN, back propagation neural network.

Figure 11. The forecasting curves of the Beijing carbon price.

Figure 13. (a) The forecasting curves of the Shenzhen carbon price; (b) The MAPE values; (c) The RMSE values.

Figure 15. (a) The carbon price forecasting curves; (b) The comparison between the proposed model and the actual carbon price.

Figure A1. (a) The FEEMD decomposition result of the Hubei carbon price; (b) The EMD decomposition result of the Hubei carbon price; (c) The FEEMD decomposition result of the Shenzhen carbon price; (d) The EMD decomposition result of the Shenzhen carbon price.

Table 1. The results on the sample entropy (SE) values.

Table 2. The new subsequence after being processed by fast ensemble empirical mode decomposition (FEEMD).

Table 3. The new subsequence after being processed by empirical mode decomposition (EMD).

Table 4. The parameters of each subsequence after processing by FEEMD-SE.

Table 5. The parameters of each subsequence after processing by EMD-SE.

Table 7. The coefficients of different subsequences processed by partial autocorrelation function (PACF).

Table 8. The performance values of the prediction models under Beijing's emissions trading scheme (ETS).

Table 9. A comparison of the different forecasting models under the Shenzhen ETS and the Hubei ETS.

Table 10. A comparison of the different forecasting models under Beijing's ETS.