A Carbon Price Prediction Model Based on the Secondary Decomposition Algorithm and Influencing Factors

Carbon emission reduction is now a global issue, and the prediction of carbon trading market prices is an important means of reducing emissions. This paper innovatively proposes a secondary decomposition carbon price prediction model based on the kernel extreme learning machine optimized by the sparrow search algorithm, and considers both structural and nonstructural influencing factors in the model. Firstly, empirical mode decomposition (EMD) is used to decompose the carbon price data, and variational mode decomposition (VMD) is used to decompose Intrinsic Mode Function 1 (IMF1); the decomposition of carbon prices is used as part of the input of the prediction model. Then, the maximum relevance minimum redundancy algorithm (mRMR) is used to preprocess the structural and nonstructural factors as the other part of the input of the prediction model. After the sparrow search algorithm (SSA) optimizes the relevant parameters of the Extreme Learning Machine with Kernel (KELM), the model is used for prediction. Finally, in the empirical study, this paper selects two typical carbon trading markets in China for analysis. In both the Guangdong and Hubei markets, the EMD-VMD-SSA-KELM model is superior to other models, which shows that this model has good robustness and validity.


Introduction
Agriculture, fisheries, and animal husbandry are major contributors to the development of the global economy, and the temperature increase caused by carbon dioxide emissions has a huge impact on them. Global warming has reduced fishery production and threatened food security, which affects the development of the global economy and the environment for human survival [1]. It has also aggravated the frequency of river droughts and caused serious damage to ecosystems [2]. In summary, carbon dioxide emissions have an important impact on the human living environment, natural ecosystems, and the development of the global economy. Reducing carbon emissions is therefore an urgent problem.
To reduce carbon emissions worldwide, the international community has adopted carbon dioxide emissions trading as an important economic measure to deal with global warming, which is very important for the global promotion of carbon emission reduction. The European Union Emissions Trading Scheme (EU ETS) is an important mechanism to deal with carbon emissions. The EU ETS is the first, largest, and most prominent carbon emission regulatory system in European countries. The EU ETS established European Union Allowances (EUAs); emitters hold a certain number of allowances and can freely trade them. In this way, emission reduction targets can be achieved at the lowest cost, and the scheme is especially effective in reducing industrial carbon emissions [3]. The EU ETS also has an important impact on the performance of enterprises. The contributions of this paper are threefold. First, prediction models based on the secondary decomposition algorithm are still limited, and this paper enriches the models in this area. Second, there are still gaps in the literature on carbon price prediction models that combine the secondary decomposition algorithm with multiple influencing factors; this model fills the gap in this area. Third, the literature on KELM-based carbon price prediction models is relatively small. This paper proposes the SSA-KELM model to predict carbon prices, which enriches the models in this area.
The rest of this article is organized as follows. The second part presents the methods and models, including EMD, VMD, KELM, SSA, and the EMD-VMD-SSA-KELM model framework proposed in this paper. The third part describes the data collection, including the carbon price, structural influencing factors, and nonstructural influencing factors, as well as the primary and secondary decomposition of carbon prices. The fourth part covers the model input and parameter settings. The fifth part presents the prediction results and error analysis. The sixth part reports additional forecasts, and the seventh part concludes.

Empirical Mode Decomposition
EMD is a signal decomposition algorithm [31]. EMD decomposes a signal f(t) into a set of Intrinsic Mode Functions (IMFs) and a residual. Every IMF must satisfy two prerequisites: over the whole data, the number of local extrema and the number of zero crossings must be equal or differ by at most one; and at any point, the mean of the upper envelope and the lower envelope must be zero.
The decomposition principle of EMD is as follows:
Step 1: Find all the local maxima and minima in the signal, and fit curves through the extrema to construct the upper and lower envelopes, so that the original signal is enveloped between them.
Step 2: Construct the mean curve m(t) from the upper and lower envelopes, and subtract it from the original signal f(t); the result H(t) is a candidate IMF.
Step 3: Since the H(t) obtained in Steps 1 and 2 usually does not meet the two IMF conditions, Steps 1 and 2 must be repeated until the screening criterion SD falls below a threshold (generally 0.2~0.3). The first H(t) that meets the condition is the first IMF. SD is computed as
SD = Σ_t |H_{k−1}(t) − H_k(t)|² / H_{k−1}(t)²
Step 4: Subtract the IMF from the signal to obtain the residual r(t), and repeat Steps 1-3 until r(t) meets the preset stopping condition.
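The sifting procedure in Steps 1-3 can be sketched as follows. This is a minimal illustration only (cubic-spline envelopes with endpoint knots, and an energy-normalized variant of the SD stopping rule), not a production EMD implementation:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def sift(x, sd_threshold=0.3, max_iter=50):
    """Extract one IMF from x via EMD sifting (Steps 1-3)."""
    h = np.asarray(x, float).copy()
    t = np.arange(len(h))
    for _ in range(max_iter):
        # Step 1: locate local extrema and fit the upper/lower envelopes
        maxima = [i for i in range(1, len(h) - 1) if h[i] >= h[i-1] and h[i] > h[i+1]]
        minima = [i for i in range(1, len(h) - 1) if h[i] <= h[i-1] and h[i] < h[i+1]]
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema: h is (close to) a residual
        up_knots = [0] + maxima + [len(h) - 1]
        lo_knots = [0] + minima + [len(h) - 1]
        upper = CubicSpline(up_knots, h[up_knots])(t)
        lower = CubicSpline(lo_knots, h[lo_knots])(t)
        # Step 2: subtract the envelope mean m(t)
        m = (upper + lower) / 2.0
        h_new = h - m
        # Step 3: stop once the (energy-normalized) SD criterion is small
        sd = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
        h = h_new
        if sd < sd_threshold:
            break
    return h
```

Applied to a sum of a fast and a slow sine, the first sift isolates the fast oscillation, which is exactly how EMD peels off IMF1 before the residual is sifted again.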

Variational Mode Decomposition
VMD is an adaptive, completely non-recursive method of modal decomposition and signal processing [32]. This technique has the advantage of being able to determine the number of modal decompositions. It can obtain the optimal center frequency and limited bandwidth of each mode, achieve effective separation of the IMFs and frequency-domain division of the signal, and thus obtain the effective decomposition components of a given signal. First, the variational problem is constructed. Assuming that the original signal f is decomposed into K components, each modal component must have a center frequency and a limited bandwidth, the sum of the estimated bandwidths of all modes is minimized, and the sum of all modal components must equal the original signal. In Formula (3): K is the number of decomposed modes, {µ_k} and {ω_k} correspond to the k-th component and its center frequency, δ(t) is the Dirac delta function, and * is the convolution operator.
Then, by solving Equation (3) and introducing Lagrange multiplication operator λ, the constrained variational problem is transformed into an unconstrained variational problem, and the augmented Lagrange expression is obtained.
In Formula (4): α is the quadratic penalty factor, whose function is to reduce the interference of Gaussian noise. Using the Alternating Direction Method of Multipliers (ADMM) iterative algorithm combined with Parseval's theorem and the Fourier equidistant transformation, the modal components and center frequencies are optimized and the saddle point of the augmented Lagrange function is searched, alternately updating u_k, ω_k, and λ. The update formulas are as follows:

û_k^{n+1}(ω) = [ f̂(ω) − Σ_{i≠k} û_i(ω) + λ̂(ω)/2 ] / [ 1 + 2α(ω − ω_k)² ]
ω_k^{n+1} = ∫₀^∞ ω |û_k^{n+1}(ω)|² dω / ∫₀^∞ |û_k^{n+1}(ω)|² dω
λ̂^{n+1}(ω) = λ̂^n(ω) + γ [ f̂(ω) − Σ_k û_k^{n+1}(ω) ]

In these formulas: γ is the noise tolerance, which satisfies the fidelity requirement of signal decomposition, and û_k^{n+1}(ω), û_i(ω), f̂(ω), and λ̂(ω) correspond to the Fourier transforms of u_k^{n+1}(t), u_i(t), f(t), and λ(t), respectively.
The main iteration steps of VMD are as follows:
1: Initialize û_k¹, ω_k¹, λ¹ and the maximum number of iterations N; set n ← 0.
2: Update û_k and ω_k for each mode according to the update formulas above.
3: Update the Lagrange multiplier λ̂.
4: Check convergence with accuracy ε > 0: if Σ_k ‖û_k^{n+1} − û_k^n‖₂² < ε is not satisfied and n < N, return to Step 2; otherwise complete the iteration and output the final û_k and ω_k.
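A compact frequency-domain sketch of this update loop is shown below, assuming a fixed mode count and user-supplied initial center frequencies. It omits refinements of published VMD implementations (mirror extension, λ update with nonzero tolerance), and the α value and test frequencies are illustrative:

```python
import numpy as np

def vmd(f, omega_init, alpha=2000.0, n_iter=100, tol=1e-7):
    """Decompose f into len(omega_init) modes via Wiener-filter updates."""
    T = len(f)
    freqs = np.abs(np.fft.fftfreq(T))   # symmetric grid keeps the modes real
    f_hat = np.fft.fft(f)
    K = len(omega_init)
    omega = np.array(omega_init, float)
    u_hat = np.zeros((K, T), dtype=complex)
    half = slice(1, T // 2)             # positive frequencies (excluding DC)
    for _ in range(n_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            residual = f_hat - (u_hat.sum(axis=0) - u_hat[k])
            # mode update: narrow-band Wiener filter around omega_k
            u_hat[k] = residual / (1.0 + 2.0 * alpha * (freqs - omega[k]) ** 2)
            # center-frequency update: power-weighted mean frequency
            p = np.abs(u_hat[k][half]) ** 2
            omega[k] = np.sum(freqs[half] * p) / (np.sum(p) + 1e-12)
        change = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-12)
        if change < tol:
            break
    modes = np.real(np.fft.ifft(u_hat, axis=1))
    return modes, omega
```

On a two-tone test signal, each center frequency converges to one of the tones, illustrating how VMD separates IMF1 into narrow-band sub-sequences.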

Sparrow Search Algorithm
SSA is a new swarm intelligence optimization algorithm [33]. Its bionic principles are as follows. The sparrow foraging process can be abstracted as a discoverer-joiner model, with a reconnaissance early-warning mechanism added. The discoverer is highly adaptable and has a wide search range, guiding the population to search and forage. To obtain better fitness, the joiners follow the discoverers for food. At the same time, to increase their predation rate, some joiners will monitor the discoverers to compete for food or forage around them. When the entire population faces the threat of predators or becomes aware of danger, it immediately carries out anti-predation behavior. In SSA, the solution to the optimization problem is obtained by simulating the foraging process of sparrows. Assuming that there are N sparrows in a D-dimensional search space, the position of the i-th sparrow is X_i = [x_{i1}, ..., x_{id}, ..., x_{iD}], where i = 1, 2, ..., N and x_{id} represents the position of the i-th sparrow in the d-th dimension.
Discoverers generally account for 10% to 20% of the population, and their positions are updated according to Formula (8), in which: t represents the current number of iterations; T represents the maximum number of iterations; α is a uniform random number in [0, 1]; Q is a random number that obeys the standard normal distribution; L is a 1×d matrix in which all elements are 1; R₂ ∈ [0, 1] is the warning value; and ST ∈ [0.5, 1] is the safety value. When R₂ < ST, the population has not found predators or other dangers, the search environment is safe, and the discoverers can search extensively to guide the population to higher fitness. When R₂ ≥ ST, sentry sparrows have detected predators; the danger signal is immediately released, and the population immediately performs anti-predation behavior, adjusts its search strategy, and quickly moves toward the safe area.
Except for the discoverers, the remaining sparrows are all joiners and update their positions according to Formula (9), in which: xw_d^t is the worst position in the d-th dimension at the t-th iteration of the population, and xb_d^{t+1} is the optimal position in the d-th dimension at the (t+1)-th iteration. When i > n/2, the i-th joiner has no food, is hungry, and has low fitness; to obtain more energy, it needs to fly elsewhere for food. When i ≤ n/2, the i-th joiner randomly forages near the current optimal position xb.
Sparrows responsible for reconnaissance and early warning generally account for 10% to 20% of the population, and their positions are updated according to Formula (10), in which: β is a step control parameter, a random number obeying N(0, 1); K is a random number in [−1, 1] indicating the direction of the sparrow's movement, which also serves as a step control parameter; ε is a small constant that avoids a zero denominator; f_i is the fitness value of the i-th sparrow; and f_g and f_w are the optimal and worst fitness values of the current sparrow population, respectively. When f_i > f_g, the sparrow is at the margin of the population and is easily attacked by predators. When f_i = f_g, the sparrow is in the center of the population; aware of the threat of predators, it moves closer to other sparrows in time and adjusts its search strategy to avoid being attacked.
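The discoverer/joiner/scout rules above can be sketched as a short optimizer. This is a simplified sketch of the published SSA (the joiner's A⁺L term is approximated by a random ±1 sign vector, and the population size, proportions, and sphere test function are illustrative choices, not from this paper):

```python
import numpy as np

def ssa(obj, dim, n=30, T=100, lb=-5.0, ub=5.0, PD=0.2, SDp=0.1, ST=0.8, seed=0):
    """Minimize obj over [lb, ub]^dim with a simplified sparrow search."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n, dim))
    fit = np.apply_along_axis(obj, 1, X)
    n_pd, n_sd = max(1, int(n * PD)), max(1, int(n * SDp))
    best_x, best_f = X[np.argmin(fit)].copy(), float(fit.min())
    for t in range(1, T + 1):
        order = np.argsort(fit)          # best first; X[0] plays the role of xb
        X, fit = X[order], fit[order]
        worst_x = X[-1].copy()
        R2 = rng.random()                # warning value
        for i in range(n_pd):            # discoverers, Formula (8)
            if R2 < ST:
                X[i] = X[i] * np.exp(-(i + 1) / (rng.random() * T + 1e-12))
            else:
                X[i] = X[i] + rng.standard_normal() * np.ones(dim)
        for i in range(n_pd, n):         # joiners, Formula (9)
            if i > n / 2:
                X[i] = rng.standard_normal() * np.exp((worst_x - X[i]) / (i + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], dim)
                X[i] = X[0] + np.abs(X[i] - X[0]) * A / dim
        for i in rng.choice(n, n_sd, replace=False):  # scouts, Formula (10)
            if fit[i] > best_f:
                X[i] = best_x + rng.standard_normal(dim) * np.abs(X[i] - best_x)
            else:
                K = rng.uniform(-1, 1)
                X[i] = X[i] + K * np.abs(X[i] - worst_x) / (fit[i] - fit[-1] + 1e-12)
        X = np.clip(X, lb, ub)
        fit = np.apply_along_axis(obj, 1, X)
        if fit.min() < best_f:
            best_f, best_x = float(fit.min()), X[np.argmin(fit)].copy()
    return best_x, best_f
```

Run on the sphere function, the best fitness drops rapidly toward zero, which is the behavior the paper relies on when SSA tunes C and σ² of KELM.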

Partial Autocorrelation Function
The PACF describes the relationship between a time series and its lags; based on the significant lag orders, the input variables of the neural network are determined. Given a time series x_t, let φ_kj denote the j-th regression coefficient of a k-order autoregressive model. The k-order autoregressive model is expressed as

x_t = φ_k1 x_{t−1} + φ_k2 x_{t−2} + ... + φ_kk x_{t−k} + u_t

where φ_kk is the partial autocorrelation coefficient at lag k.
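The lag-k partial autocorrelation φ_kk can be computed from the sample autocorrelations with the Durbin-Levinson recursion. A minimal sketch (the AR(1) test series is illustrative):

```python
import numpy as np

def pacf(x, nlags):
    """Sample PACF up to nlags via the Durbin-Levinson recursion."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    # sample autocorrelations rho_0..rho_nlags
    acf = np.array([np.dot(x[: n - k], x[k:]) / np.dot(x, x) for k in range(nlags + 1)])
    phi = np.zeros((nlags + 1, nlags + 1))
    pac = np.zeros(nlags + 1)
    pac[0] = 1.0
    for k in range(1, nlags + 1):
        if k == 1:
            phi[1, 1] = acf[1]
        else:
            num = acf[k] - np.dot(phi[k - 1, 1:k], acf[1:k][::-1])
            den = 1.0 - np.dot(phi[k - 1, 1:k], acf[1:k])
            phi[k, k] = num / den
            for j in range(1, k):
                phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
        pac[k] = phi[k, k]  # phi_kk is the partial autocorrelation at lag k
    return pac
```

For an AR(1) process with coefficient 0.8, the PACF is large at lag 1 and near zero afterward, which is exactly the cutoff pattern used to pick the lag orders fed into the prediction model.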

Maximum Correlation Minimum Redundancy Algorithm
mRMR finds the features in the original feature set that are most relevant to the target but least correlated with each other, using mutual information to express correlation [34]. The mutual information between two variables X and Y is

I(X; Y) = Σ Σ p(X, Y) log[ p(X, Y) / (p(X) p(Y)) ]

where p(X) and p(Y) are the marginal probability functions and p(X, Y) is the joint probability function.
Based on mutual information, the core expressions of the algorithm are Formulas (13) and (14): Formula (13) represents the maximum relevance, and Formula (14) represents the minimum redundancy. S is the feature subset, n is the number of features, I(x_i, p) is the mutual information between feature x_i and the target feature p, and I(x_i, x_j) is the mutual information between features.
Generally, by integrating Formulas (13) and (14), the final maximum-relevance minimum-redundancy criterion is obtained, max Φ(D, R) with Φ = D − R.
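The greedy selection implied by Φ = D − R can be sketched as follows, using a simple histogram estimate of mutual information (the bin count and the synthetic test data are illustrative assumptions):

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram-based mutual information estimate between two 1-D variables."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px[:, None] * py[None, :])[mask])))

def mrmr(X, y, k):
    """Greedy mRMR: at each step maximize relevance minus mean redundancy."""
    n_feat = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_feat)]
    selected = [int(np.argmax(relevance))]          # most relevant feature first
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy       # Phi = D - R
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Given a feature that is a near-copy of an already selected one, the redundancy term pushes mRMR to prefer an independent feature instead, which is why the paper uses it to rank the influencing factors.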

Extreme Learning Machine with Kernel
KELM is an extension of ELM proposed by Huang et al. [35]. Kernel function mapping replaces the random mapping of the hidden layer, which avoids the poor stability caused by the randomly assigned hidden layer parameters of ELM and improves the robustness of the model, while retaining fast calculation speed and strong generalization ability. The basic principles of KELM are as follows. Assuming that the number of hidden layer nodes is L and the hidden layer output function is h(x) = [h_1(x), ..., h_L(x)], the ELM model can be written as f(x) = h(x)β. The goal of ELM is to minimize the training error and the norm of the hidden layer output weight β. Based on the principle of structural risk minimization, a quadratic programming problem is constructed, in which C is the penalty factor and ξ_i is the i-th error variable. Introducing the Lagrange multiplier α_i, the quadratic programming problem of Equation (17) is transformed; according to the KKT conditions, the derivatives with respect to β, ξ_i, and α_i are set to zero. Finally, the output weight of the ELM model is obtained:

β = Hᵀ(I/C + HHᵀ)⁻¹T

In Formula (19): H is the hidden layer output matrix, T is the target value matrix, and I is the identity matrix.
To improve the prediction accuracy and stability of the model, a kernel matrix is introduced to replace the hidden layer matrix H of ELM, and the training samples are mapped to a high-dimensional space through the kernel function. The kernel matrix is defined as Ω_ELM, with elements Ω_ELM(i, j) = h(x_i)·h(x_j) = K(x_i, x_j), and the KELM model is constructed as in Formula (20). K(x_i, x_j) is usually chosen as the radial basis kernel function or the linear kernel function, whose expressions are shown in Formulas (22) and (23):

K(x_i, x_j) = exp(−‖x_i − x_j‖² / σ²)
K(x_i, x_j) = x_i · x_j

In Formula (22), σ² is the width parameter of the kernel function.
Although the introduction of the kernel function increases the stability of the prediction model, C and σ² are two important parameters that affect the prediction accuracy of KELM during training. If C is too small, a large training error occurs; if C is too large, overfitting occurs. Moreover, σ² affects the generalization performance of the model.
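With the RBF kernel of Formula (22), training KELM reduces to a single linear solve for β = (I/C + Ω_ELM)⁻¹T. A minimal sketch (the toy sine data and the C, σ² values are illustrative, not the paper's tuned settings):

```python
import numpy as np

class KELM:
    """Kernel ELM: beta = (I/C + Omega)^-1 T with an RBF kernel."""
    def __init__(self, C=100.0, sigma2=1.0):
        self.C, self.sigma2 = C, sigma2

    def _kernel(self, A, B):
        # K(x_i, x_j) = exp(-||x_i - x_j||^2 / sigma^2), Formula (22)
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / self.sigma2)

    def fit(self, X, y):
        self.X = np.asarray(X, float)
        omega = self._kernel(self.X, self.X)           # kernel matrix Omega_ELM
        n = len(self.X)
        # output weights from Formula (20): one regularized linear solve
        self.beta = np.linalg.solve(np.eye(n) / self.C + omega, np.asarray(y, float))
        return self

    def predict(self, Xq):
        return self._kernel(np.asarray(Xq, float), self.X) @ self.beta
```

Because the only free parameters are C and σ², this formulation is exactly the kind of two-parameter search space that SSA optimizes in the proposed model.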

The Proposed Model
This paper proposes a new carbon price prediction model based on data preprocessing technology, structural influencing factors, nonstructural influencing factors, feature selection technology, the sparrow search algorithm, and the secondary decomposition algorithm. Figure 1 shows the flow chart of the EMD-VMD-SSA-KELM model. (1) EMD is used to decompose the initial carbon price to obtain the decomposed IMFs. Then, VMD is used to decompose IMF1 to obtain the VIMFs of the secondary decomposition, where a VIMF is an intrinsic mode function generated by VMD decomposition of IMF1. These sub-sequences form part of the input of the model. (2) mRMR preprocesses the structural and nonstructural influencing factors as the other part of the model input. (3) Cross-validation is generally used for parameter confirmation. To avoid the influence of parameter selection, the search ability of the sparrow search algorithm is combined with the fast learning ability of KELM, and the γ and C of the model are optimized to obtain the optimal SSA-KELM prediction model. (4) Undecomposed models, EMD models, EMD-VMD models, and other models are established, as shown in Figure 2, to verify the superiority of the EMD-VMD-SSA-KELM model.

Data Collection
China is one of the largest carbon emitters in the world and is facing increasing pressure to reduce emissions. Carbon price forecasting is of great significance for grasping the dynamic price changes in China's carbon trading market. Therefore, this paper studies daily carbon price data in China to prove the robustness and accuracy of the proposed prediction model framework. According to the carbon market investment index recommendation of the China Carbon Emissions Trading Network, we select the two most typical carbon trading markets, Guangdong and Hubei, and use their daily carbon prices as the main research data of this article. These data come from the China Carbon Emissions Trading Network. Besides, carbon prices may be affected by a variety of factors and have complex features such as uncertainty, so the various influencing factors we consider have an important impact on carbon price forecasts. These factors include structural influencing factors on the supply side and the demand side, and nonstructural influencing factors based on the Baidu index. The carbon price selection takes into account differences in public holidays and trading hours at home and abroad, as well as the impact of missing values, so this paper selects common trading days. The Guangdong dataset covers the carbon price from 31 October 2017 to 4 November 2019, and the Hubei dataset covers the carbon price from 31 October 2017 to 7 November 2019. Each dataset contains 493 observations in total. Generally, the ratio of the training set to the testing set is about 8:2, as shown in Table 1.

Structural Influence Factors
Domestic carbon prices are affected by supply and demand factors. First, carbon emission allowances are the largest supply-side influencing factor of the carbon market transaction price. At the same time, the EU carbon emission allowance (EUA) price is the benchmark of the global carbon trading market, which has an important impact on carbon emission allowances. Taking into account the market linkage, this paper selects the EUA Futures and Certified Emission Reduction (CER) Futures carbon prices as the international carbon prices. Then, the use of fossil energy is the main reason for carbon emissions. The price of coal is the settlement price of Rotterdam coal futures, the price of crude oil is the settlement price of Brent crude oil, and the price of natural gas comes from the New York Mercantile Exchange. Besides, carbon prices are also vulnerable to other factors in the market. This article also considers the impact of the RMB exchange rate against the US dollar on the domestic carbon market price. The data comes from the Wind database.

Nonstructural Influence Factors
With the development of the Internet, search indexes provide useful data for carbon price prediction. Google and Baidu are currently the most used search engines; the Baidu index is used more in mainland China, while the Google index is used more abroad. Therefore, this paper uses the Baidu index. Specifically, this article selects 13 Baidu index keywords: Paris Agreement, Low Carbon, Kyoto Agreement, Energy, Clean Energy, Global Warming, Carbon Sink, Carbon Trading, Carbon Emission, Carbon Neutrality, Carbon Footprint, Greenhouse Gas, and Greenhouse Effect, and obtains the search index data by Formula (24), where SI is the unstructured search index data and BI is the normalized index of each search keyword.

Primary Decomposition
EMD decomposes the Guangdong and Hubei carbon prices. As shown in Figures 3 and 4, each carbon price series is decomposed into 5 IMFs and 1 residual R, and the corresponding PACF results are obtained. The carbon price becomes more regular after EMD decomposition.

Secondary Decomposition
IMF1 is decomposed by VMD. Figure 5 shows the decomposition results and PACF results of IMF1 in the Guangdong market. The sub-sequence after VMD decomposition is more regular.

mRMR Algorithm
The mRMR algorithm is used to reduce the dimensionality of the structured and unstructured data; Table 2 ranks the influencing factors of carbon prices in Guangdong and Hubei in order.

Input
PACF determines the lag order of each sequence, and Table 3 shows the lag order of each sequence. The lag orders form part of the model input. For example, the lag orders of the raw data in Guangdong are 1, 2, 4, and 5, so part of the raw-data prediction model input is the raw data at lags 1, 2, 4, and 5. Table 2 shows the ranking of influencing factors according to mRMR. The more variables input to the prediction model, the lower the accuracy of the prediction. Therefore, this paper selects the first two influencing factors with the greatest correlation with Guangdong's carbon price and the least redundancy: the other part of the input to the Guangdong carbon price prediction model is the prices of coal and natural gas. Similarly, this article chooses coal prices and CER as the other part of the input to the Hubei carbon price prediction model. Different settings of the prediction model parameters may produce different prediction results; the parameter settings are listed in Table 4.

Evaluation Index
This article uses three commonly used indicators, as shown in Table 5. The smaller the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE), the better the predictive performance of the model.

Table 5. The evaluation indexes.

Metric | Definition | Equation
MAE | Mean absolute error | MAE = (1/n) Σ |ŷ_i − y_i|
MAPE | Mean absolute percentage error | MAPE = (1/n) Σ |(ŷ_i − y_i)/y_i| × 100%
RMSE | Root mean square error | RMSE = √[(1/n) Σ (ŷ_i − y_i)²]
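The three indexes in Table 5 can be computed directly; a small helper is sketched below (the sample arrays in the usage are illustrative):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return the MAE, MAPE (in percent), and RMSE of a forecast."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    err = y_pred - y_true
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),  # percent
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
    }
```

For example, `evaluate([10, 20, 40], [11, 18, 44])` gives MAE = 7/3, MAPE = 10%, and RMSE = √7, matching the definitions in Table 5.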

Simulation Experiment One
Simulation experiment one uses the carbon price data of Guangdong. The results of the undecomposed models, EMD models, and EMD-VMD models are shown in Figure 6. Table 6 gives the evaluation of their predictive results, Table 7 gives the evaluation and comparison results of their predictions, and Table 8 shows the improvement effect of the SSA-optimized KELM model in the Guangdong market.

Simulation Experiment Two
Simulation experiment two uses the carbon price data of Hubei. The results of the undecomposed models, EMD models, and EMD-VMD models are shown in Figure 7. Table 9 gives the evaluation of their predictive results, Table 10 gives the evaluation and comparison results of their predictions, and Table 11 shows the improvement effect of the SSA-optimized KELM model in the Hubei market. The result analysis is similar to simulation experiment one. Through the simulation experiments of the above two markets, several results can be obtained.
(A) In the simulation experiments of the two typical Chinese markets, EMD-VMD-SSA-KELM performs best according to the evaluation criteria, which shows that the EMD-VMD-SSA-KELM model is optimal.
(B) In the two market cases, KELM is superior to LSSVM and ELM in most results: EMD-KELM outperforms EMD-LSSVM and EMD-ELM, and EMD-VMD-KELM outperforms EMD-VMD-LSSVM and EMD-VMD-ELM. However, in the Hubei market, EMD-LSSVM is superior to EMD-KELM, possibly because EMD-KELM is affected by its kernel parameter settings. Finally, EMD-SSA-KELM is superior to EMD-LSSVM, which still indicates that KELM is a good model. KELM models optimized by SSA have better predictive performance than plain KELM models and other comparable models, possibly because SSA optimizes the C and γ of the KELM model and improves the global search capability. Therefore, the KELM models need to be optimized by SSA.
(C) In the two market cases, comparing the undecomposed models with the EMD models shows that predicting the carbon price after EMD decomposition obviously improves the predictive performance of the models. The most likely reason is that the carbon price is highly nonlinear and complex, and EMD decomposes it into multiple relatively regular components, so it is necessary to perform EMD decomposition of the carbon price.
(D) In the two market cases, comparing the undecomposed models, the EMD models, and the EMD-VMD models shows that further decomposing the IMF1 generated by EMD with VMD obviously improves the predictive performance of the models. The main reason is that IMF1 is irregular; further VMD decomposition generates more regular sub-sequences and overcomes this defect, so the predictive performance of the EMD-VMD models is better.

Additional Forecasting Cases
To further prove the superiority of the model proposed in this paper, the EMD-VMD-SSA-KELM model combined with influencing factors is compared with the EMD-VMD-SSA-KELM model without influencing factors. Table 12 shows their performance comparison results. In the Guangdong market, the EMD-VMD-SSA-KELM with influencing factors has a MAPE of 0.3381%, MAE of 0.0822, and RMSE of 0.1031, while the EMD-VMD-SSA-KELM without influencing factors has a MAPE of 0.4251%, MAE of 0.1025, and RMSE of 0.1238.

Conclusions
This paper proposes an EMD-VMD-SSA-KELM model combined with influencing factors. Through the experimental studies of the Guangdong and Hubei markets, we draw the following conclusions.
(1) The predictive results of EMD-VMD-SSA-KELM combined with influencing factors are the best, which shows that influencing factors can improve the predictive ability of the EMD-VMD model.
(2) Combining influencing factors with the EMD-VMD-SSA-KELM model opens up a new carbon price prediction model.
(3) KELM models optimized by SSA have better predictive performance than plain KELM models and other comparable models. SSA optimizes the C and γ of the KELM model to improve the global search capability, so the predictive effect of the model is the best.
(4) Among the undecomposed models, the EMD models, and the EMD-VMD models, the predictive results of the EMD-VMD models are the best. EMD-VMD's processing of the carbon price helps improve the predictive performance of the models.
According to our forecast results, this work has important practical significance: (1) it provides investment advice for investors to reference; (2) it provides policymakers with more considerations to formulate reasonable policies and reduce carbon emissions; and (3) it provides researchers with new ideas for predicting carbon prices.