Multi-Step-Ahead Carbon Price Forecasting Based on Variational Mode Decomposition and Fast Multi-Output Relevance Vector Regression Optimized by the Multi-Objective Whale Optimization Algorithm

The accurate and stable forecasting of carbon prices is vital for governors to make policies and essential for market participants to make investment decisions, which is important for promoting the development of carbon markets and reducing carbon emissions in China. However, it is challenging to improve the carbon price forecasting accuracy due to its non-linearity and non-stationary characteristics, especially in multi-step-ahead forecasting. In this paper, a hybrid multi-step-ahead forecasting model based on variational mode decomposition (VMD), fast multi-output relevance vector regression (FMRVR), and the multi-objective whale optimization algorithm (MOWOA) is proposed. VMD is employed to extract the primary mode for the carbon price. Then, FMRVR, which is used as the forecasting module, is built on the preprocessed data. To achieve high accuracy and stability, the MOWOA is utilized to optimize the kernel parameter and input the lag of the FMRVR. The proposed hybrid forecasting model is applied to carbon price series from three major regional carbon emission exchanges in China. Results show that the proposed VMD-FMRVR-MOWOA model achieves better performance compared to several other multi-output models in terms of forecasting accuracy and stability. The proposed model can be a potential and effective technique for multi-step-ahead carbon price forecasting in China’s three major regional emission exchanges.


Introduction
Global climate change induced by greenhouse emissions has become a serious challenge to the sustainable development of society, which has attracted worldwide attention recently.An emission trading scheme as an effective market mechanism to reduce carbon emission is widely applied worldwide [1].The additional cost of excess emissions mitigates companies' motivation for excessive production and stimulates them to seek a more green production approach, which results in overall carbon emission mitigation.China is the largest carbon emitter [2][3][4], accounting for 28% of the world emissions [5] in 2015, and reaching 10,433 Mton CO 2 emission eq/yr (equivalent per year) in 2016 [6].In response to the challenge of global warming, China has established eight pilot regional emission exchanges in Shenzhen, Guangdong, Hubei, Tianjin, Shanghai, Chongqing, Beijing, and Fujian since 2013 [2,7].Accurate carbon price forecasting is not only critical for governors to make policies [3] but also important for the reasonable production arrangement of carbon emission companies to reduce costs and increase profit.Additionally, it is vital for investors to mitigate price risk and maximize investment revenue [8].However, carbon prices are impacted by both the external environmental variation [9] and investors' heterogeneous beliefs, which makes carbon prices non-stationary and non-linear.Therefore, promoting the accuracy of carbon price forecasting is a challenge for academics and an important topic in the field of energy policy and energy consumption.
Accurate carbon price forecasting has attracted extensive attention worldwide.Recent studies on carbon price series forecasting can be divided into three categories: econometrics models, artificial intelligence models, and hybrid (ensemble) models.The first category includes multiple linear regression [10], generalized autoregressive conditional heteroskedasticity (GARCH) [11], and heterogeneous autoregressive with realized volatility (HAR-RV) models [12].The limitation in econometric models is the difficulty in dealing with the non-linearity of a carbon price series.To overcome such limitations, artificial intelligence models are applied widely in carbon price series forecasting, such as the least square support vector machine (LSSVM) method [13], multi-layered perceptron (MLP)-artificial neural network (ANN) model [14].The hybrid model shows better performance compared to a single model in time series forecasting [15][16][17][18].To further improve the forecasting performance in carbon price forecasting, the hybrid (ensemble) models are proposed.Sun et al. [19] proposed a combined variational mode decomposition (VMD)-spiking neural network (SNN) model to forecast carbon price from Intercontinental Exchange (ICE).Zhou et al. [20] proposed a multiscale ensemble model based on ensemble empirical mode decomposition (EEMD), extreme learning machine (ELM), and support vector machine (SVM) models to forecast carbon price in the Shenzhen Emissions Exchange.Zhu et al. [1] proposed a novel ensemble model based on empirical mode decomposition (EMD) and LSSVM with a kernel function prototype for ICE carbon price forecasting.Zhu et al. [3] proposed a multiscale ensemble model based on EMD and the evolutionary least squares support vector regression for carbon price forecasting in the European Climate Exchange.Zhu and Wei [21] proposed a novel hybrid model based on autoregressive integrated moving average model (ARIMA) and LSSVM to forecast two main carbon future prices from the European Climate Exchange.Atsalakis [22] proposed a computational intelligence technique called PATSOS (a forecasting model consisted of two ANFIS models) based on two adaptive neuro-fuzzy inference system (ANFIS) sub-systems to manage the risk in carbon price trading in the European Energy Exchange.
In general, the hybrid method achieves better performance in carbon price forecasting [1,3,[19][20][21][22].However, there are three deficiencies which exist in recent studies: (1) Existing studies focus on the one-step forecasting of carbon price series, and few studies focus on multi-step forecasting which is important to allow market participants to make more long-term decisions.(2) Most existing studies focus on the carbon price in the European market.Few studies have paid attention to carbon price series in carbon emission exchanges in China.However, as the largest carbon emitter worldwide, studies on Chinese carbon emission exchanges are of significance.(3) The existing ANN and SVR methods can't provide probabilistic forecasting (i.e., variance in the forecasted value) without additional computation.To compensate for the above gaps and promote the accuracy in multi-step-ahead carbon price forecasting, we propose a hybrid multi-step-ahead forecasting model based on VMD, fast multi-output relevance vector regression (FMRVR), and the multi-objective whale optimization algorithm (MOWOA) for carbon price series from three major carbon emission exchanges in China.
The proposed hybrid multi-step-ahead forecasting model VMD-FMRVR-MOWOA contains three modules.First, the VMD is used as the preprocessing module to eliminate persistent noise in the price series.Then, the FMRVR, which is utilized as the forecasting module is estimated based on the preprocessed data.Finally, the MOWOA is applied to optimize the kernel parameter and input lag for the FMRVR to obtain better performance.The proposed model is applied to three carbon prices from three major pilot regional emission exchanges in China: Shenzhen emission allowance 2016 (SZA2016), Guangdong emission allowance (GDEA), and Hubei emission allowance (HBEA).The results show that the proposed method achieves better performance compared to other multi-output models.The main contributions of our study are as follows: (a) FMRVR is used to forecast carbon price for the first time.(b) MOWOA is used to optimize the FMRVR for the first time.(c) As far as we know, the proposed VMD-FMRVR-MOWOA is the first multi-step-ahead forecasting model applied to carbon price series.(d) The carbon price series from three major regional emission exchanges in China are analyzed.
The remainder of this paper is organized as follows: Section 2 describes the framework of the proposed model and presents a brief description of VMD, FMRVR, and MOWOA, respectively.Section 3 introduces a case study of three major regional emissions exchanges in China and analyses the results.Section 4 concludes our study.

The Proposed Method
The proposed model VMD-FMRVR-MOWOA contains three modules.(1) Preprocessing module: For non-linear and non-stationary characteristics [9], the carbon price is often preprocessed to extract the main mode hidden in the series.EMD and VMD are two commonly used methods [1,[23][24][25].VMD is used to denoise the original series in our study since VMD demonstrates better noise robustness than EMD [26].(2) Forecasting module: FMRVR is applied as the multi-step forecasting module of the proposed model.The RVR, which is a Bayesian learning method using the kernel-trick, is applied widely in time series analysis [27][28][29][30][31][32] because it has significant advantages of sparsity formulation and probabilistic output, the ability to use non-Mercer kernels but only one output.Thayananthan, et al. [33] extended RVM to multi-output relevance vector regression (MRVR); however, it still has the limitation of low computational efficiency.The FMRVR used in this paper is a fast version of the MRVR proposed by Ha [34].The kernel parameter and input lag are crucial in the use of FMRVR.
(3) Optimization module: MOWOA, which is an efficient meta-heuristic multi-objective optimization algorithm, is applied to optimize these two parameters.Multi-step-ahead forecasting needs to compare multi-objective performance simultaneously; thus, the multi-objective optimization method is essential [35][36][37][38].As MOWOA achieves better performance than two recently developed algorithms (i.e., multi-objective ant lion optimization algorithm (MOALO) and multi-objective dragonfly algorithm (MODA)) [39], it is chosen as the optimization method in this study.
The structure of the hybrid model is shown in Figure 1.The whole data set is split into two parts: training set and test set.The proposed method consists of five main steps.
Step 1: setting a moving window (window length equal to M) to select the input data for the VMD-FMRVR models.
Step 2: denoising the data series in the fixed number window by VMD to obtain the denoised data series.
Step 3: estimating the parameters in FMRVR based on the denoised data series in the given window and then forecasting to obtain the values of the three steps beyond the given window.The input lags and width of the kernel function in FMRVR module are randomly given.
Step 4: repeating VMD-FMRVR modeling with the movement of the window.
Step 5: finding the best input lags and width of the kernel function in the FMRVR by the MOWOA.

Variational Mode Decomposition (VMD)
The VMD algorithm is a signal decomposition technique proposed by Dragomiretskiy and Zosso [40], which decomposes the original signal into K intrinsic mode functions (IMFs).The key idea of VMD in obtaining the final K IMFs is to minimize the bandwidth sum of the IMFs.To obtain the mode bandwidth, three steps are applied: (1) Obtaining the unilateral frequency spectrum of each mode k u by a Hilbert transformation.
(2) Shifting the mode's frequency spectrum to "baseband" (3) Estimating the bandwidth by the H1 Gaussian smoothness.The decomposition process is realized by solving the following optimization problem [22]: The alternate direction method of multipliers (ADMM) is used to solve the above optimization problem.The complete algorithm of ADMM for obtaining the final parameters and modes in VMD is shown in Figure 2 (modified based on Dragomiretskiy's pseudo-code [40]).

Variational Mode Decomposition (VMD)
The VMD algorithm is a signal decomposition technique proposed by Dragomiretskiy and Zosso [40], which decomposes the original signal into K intrinsic mode functions (IMFs).The key idea of VMD in obtaining the final K IMFs is to minimize the bandwidth sum of the IMFs.To obtain the mode bandwidth, three steps are applied: (1) Obtaining the unilateral frequency spectrum of each mode u k by a Hilbert transformation.
(2) Shifting the mode's frequency spectrum to "baseband" (3) Estimating the bandwidth by the H1 Gaussian smoothness.The decomposition process is realized by solving the following optimization problem [22]: where • • • , ω K } are the center frequencies of the modes.δ(t) is the Dirac function, and * is the convolution operation.s is the original signal.The alternate direction method of multipliers (ADMM) is used to solve the above optimization problem.The complete algorithm of ADMM for obtaining the final parameters and modes in VMD is shown in Figure 2 (modified based on Dragomiretskiy's pseudo-code [40]).

Fast Multi-Output Relevance Vector Regression (FMRVR)
The RVM, which was invented by Tipping [41], is used for multi-input but single-output regression.Thayananthan [42] and Thayananthan et al. [33] extended the RVM to the MRVR, which has the limitation of low computational efficiency.To overcome this limitation, Ha [34] proposed the FMRVR to decrease the time complexity of the MRVR.

Model Specification of the FMRVR
Given a data set of input-output pairs   dimensional input regression can be expressed as follows: where is the weight matrix, and is a kernel function.The U-input V-output model can be also expressed as follows: where , and is the noise matrix.To clarify the relation between inputs and outputs, the U-input V-output model can be expressed in elements form as follows: The pseudo-code of the alternate direction method of multipliers (ADMM) algorithm for VMD.

Fast Multi-Output Relevance Vector Regression (FMRVR)
The RVM, which was invented by Tipping [41], is used for multi-input but single-output regression.Thayananthan [42] and Thayananthan et al. [33] extended the RVM to the MRVR, which has the limitation of low computational efficiency.To overcome this limitation, Ha [34] proposed the FMRVR to decrease the time complexity of the MRVR.

Given a data set of input-output pairs x
i=1 , a V-dimensional multi-output U-dimensional input regression can be expressed as follows: where W ∈ R (N+1)×V is the weight matrix, and ) is a kernel function.The U-input V-output model can be also expressed as follows: where ) , and To clarify the relation between inputs and outputs, the U-input V-output model can be expressed in elements form as follows: The FMRVR is tuned within a general Bayesian learning framework.Under the assumption of noise independence, the likelihood of the data set is as follows: where Ω is the noise covariance matrix.To avoid over-fitting, W is assumed to obey the following distribution:

Inference
The posterior distribution of all the unknowns can be decomposed as following: The p( W| T, A, Ω) can be given out directly as following: where is the posterior covariance and M =ΣΦ T T is the posterior mean.To maximize the posterior distribution p( α, W, Ω|T) is equal to maximize p( α, Ω|T).Since p( α, Ω|T) ∝ p( T| α, Ω)p(α)p(Ω), the maximizing of p( α, Ω|T) is equal to maximize p( T| α, Ω). p( T| α, Ω) can be given as follows: To estimate the final α MP and Ω MP by maximizing the above marginal likelihood, the expectationmaximization (EM) algorithm is applied.

Making Predictions
Given a new input x * ∈ R U×1 , we can get the predicted mean and covariance by: 12) where y * is the predicted mean, and Ω * is the predicted covariance.
M =ΣΦ T T, α MP and Ω MP are obtained from the EM algorithm.

The Multi-Objective Whale Optimization Algorithm (MOWOA)
The whale optimization algorithm proposed by Mirjalili and Lewis [43] is a nature-inspired meta-heuristic single-objective optimization algorithm.Wang et al. [39] further proposed the MOWOA for multi-objective optimization.The MOWOA introduces the concept of Pareto domination into the optimization process.

Multi-Objective Optimization Problems
In multi-objective optimization problems, the concept of "dominates" proposed by Edgeworth [44] and extended by Pareto [45] is widely used in solution comparisons to find the global optimum.The relative definitions of the multi-objective optimization are given as follows: x Subject to: inequality, equality, and boundary constraints.

WOA
In the whale optimization algorithm (WOA), whales (i.e., search agents) have three modes to update their positions: random search mode, shrinking encircling mode, and spiral feeding mode.

Random search mode
In this mechanism, search agents update their positions randomly, which makes it possible for the WOA to perform a global search.The updating process is as follows: where → X is the position of the search agent, t is the iteration number.
where → a decreases from 2 to 0 over the iterations and → r is a random number in [0,1].

Shrinking encircling mode
In this mechanism, search agents update their positions in a shrinking movement towards current best solution as follows: where → X * (t) is the position of current best search agent.

Spiral feeding mode
In this mechanism, search agents update their positions in a spiral movement towards the current best solution as follows: where b is a constant and l is a random number in [−1].Search agents adopt a circling mechanism to update their positions in both random search and shrinking encircling modes and perform a spiral mechanism in spiral feeding mode.The probabilities to choose a circling mechanism or spiral mechanism are both 50%.In the circling mechanisms, if |A| < 1, search agents update their positions in the shrinking encircling mode.If |A| ≥ 1, the positions are updated in random search mode.
MOWOA introduces the Pareto-dominate concept into the WOA to find the non-dominated solution, as shown in line 28 of Figure 3. Furthermore, the archive, roulette wheel selection mechanisms [46] are also adopted in the MOWOA.The pseudo-code of the MOWOA algorithm (modified based on Wang's pseudo-code [39]) is shown in Figure 3.

Data Descriptions
Considering the data availability, continuity, and comparability, three carbon price series used in this paper: SZA2016 from Shenzhen emission exchange, GDEA from Guangzhou emissions exchange, and HBEA from Hubei emissions exchange.SZA2016, HBEA, and GDEA are three actively traded assets that help to provide continuous carbon prices for study.The samples are obtained from 1 September 2016 to 11 September 2018, excluding non-trading days, with 496 data points included per series [47].Three carbon prices are plotted in Figure 4.The basic information of the three regional emission exchanges is shown in Figure 5.The total carbon trading volume of these emission exchanges account for 77.1% of China's total carbon trading volume, and the total carbon trading turnover of these emissions exchanges account for 71.4% of China's total carbon trading turnover.These emission exchanges are three major pilot regional carbon emission exchanges in China.

Data Descriptions
Considering the data availability, continuity, and comparability, three carbon price series used in this paper: SZA2016 from Shenzhen emission exchange, GDEA from Guangzhou emissions exchange, and HBEA from Hubei emissions exchange.SZA2016, HBEA, and GDEA are three actively traded assets that help to provide continuous carbon prices for study.The samples are obtained from 1 September 2016 to 11 September 2018, excluding non-trading days, with 496 data points included per series [47].Three carbon prices are plotted in Figure 4.The basic information of the three regional emission exchanges is shown in Figure 5.The total carbon trading volume of these emission exchanges account for 77.1% of China's total carbon trading volume, and the total carbon trading turnover of these emissions exchanges account for 71.4% of China's total carbon trading turnover.These emission exchanges are three major pilot regional carbon emission exchanges in China.The statistical descriptions of the three carbon price series are shown in Table 1.The average carbon price and price volatility of SZA2016 are higher than those of the HBEA and GDEA, which is also evident in Figure 4.The statistical descriptions of the three carbon price series are shown in Table 1.The average carbon price and price volatility of SZA2016 are higher than those of the HBEA and GDEA, which is also evident in Figure 4. Notes: Count denotes the number of samples, std is short for standard deviation, min denotes the minimum, max denotes the maximum, skew is short for skewness, and kurt is short for kurtosis.SZA2016 is short for Shenzhen emission allowance 2016, GDEA is short for Guangdong emission allowance, and HBEA is short for Hubei emission allowance.

Evaluation Criteria
To measure the forecasting performance, three criteria (the mean absolute percentage error (MAPE) as calculated in Equation ( 23), the mean absolute error (MAE) as calculated in Equation ( 24), and the root mean square error (RMSE) as calculated in Equation ( 25)) are used.The process of the evaluation criteria is detailed in Table 2.
where y actual is the actual value and ŷ f orecasted is the forecasted value, and n is the number of forecasted points.The smaller the above three criteria are, the higher forecasting accuracy the model achieves.

Data Preprocessing
As shown in Figure 4 and Table 1, carbon price series are heavily random and highly volatile.To eliminate noise, VMD is applied to preprocess the raw carbon price series.After VMD, the raw carbon price series are decomposed into some IMFs.Furthermore, the IMFs are reconstructed to obtain the preprocessed data.VMD remains the major mode in the price series.The preprocessed carbon price series are smoother than the raw series which can significantly improve the effectiveness and accuracy of time series prediction [46].

Parameter Setting and Input Selection
The general expression for the input and output of the model can be described as follows: where the input lag is m + 1, x t is the observation value at time t, x t+1 , x t+2 , x t+3 are one-step, two-step, and three-step forecasted values, respectively.In the FMRVR, the radial basis function (RBF) kernel is chosen as the kernel function which is expressed in Equation ( 27), with kernel width (i.e., a free parameter) υ: The input lag and kernel width are two parameters that are optimized in the MOWOA.The parameters setting used to train the VMD-FMRVR-MOWOA is shown in Table 3.The objective function used in MOWOA is {RMSE 1 , RMSE 2 , RMSE 3 }, where the RMSE is achieved in one-step, two-step, and three-step ahead forecasting, respectively.
The parameter setting in FMRVR-MOWOA is similar to that in Table 3.The parameter settings (input lags, initial covariance, and scale in the kernel function) in MOTP-MOWOA and MOGP-MOWOA are optimized by the MOWOA.The parameters in MSVR, the ε loss function parameter and the error trade-off parameter are optimized by a grid optimization algorithm.BPNN is a 6-13-3 structure network, where the maximum number of iterations is 1000 and the training precision is 0.00004.

Results and Discussion
Table 4 presents the forecasting results of SZA2016 from the different models.For clarity, the values of MAPE, MAE, and RMSE are plotted in Figure 6.As shown in Table 4 and Figure 6, VMD-FMRVR -MOWOA outperforms the other five models in all three steps ahead forecasts.The results show the following: These results show that the proposed method has a high accuracy in forecasting SZA2016 price series.(b) The proposed model outperforms the FMRVR-MOWOA in all of the one-step, two-step, and three-step ahead forecasting, which can be mainly attributed to the fact that VMD denoises the carbon price series.This shows that the VMD algorithm helps to improve the forecasting performance in carbon price series.(c) The proposed model outperforms the MOTP-MOWOA and MOGP-MOWOA in one-step, two-step, and three-step ahead forecasting.These results show that the VMD-FMRVR has a stronger performance in SZA2016 price forecasting than the MOTP and MOGP.(d) Forecasting accuracy decreases as the forecasting step size increases.All MAPE, MAE, and RMSE values increase by the increase in one-step, two-step, and three-step ahead forecasting for all six models.These results indicate that the difficulty in forecasting increases with the forecasting step.show that the proposed method has a high accuracy in forecasting SZA2016 price series.(b) The proposed model outperforms the FMRVR-MOWOA in all of the one-step, two-step, and three-step ahead forecasting, which can be mainly attributed to the fact that VMD denoises the carbon price series.This shows that the VMD algorithm helps to improve the forecasting performance in carbon price series.(c) The proposed model outperforms the MOTP-MOWOA and MOGP-MOWOA in one-step, twostep, and three-step ahead forecasting.These results show that the VMD-FMRVR has a stronger performance in SZA2016 price forecasting than the MOTP and MOGP.(d) Forecasting accuracy decreases as the forecasting step size increases.All MAPE, MAE, and RMSE values increase by the increase in one-step, two-step, and three-step ahead forecasting for all six models.These results indicate that the difficulty in forecasting increases with the forecasting step.Notes: MOTP=multi-output student-t process regression, MOGP=multi-output Gaussian process regression, MSVR=multi-output support vector regression and BPNN=back propagation neural network.The forecasting performances for GDEA and HBEA are presented in Tables 5 and 6, respectively.For clarity, the MAPE, MAE, and RMSE values are presented in Figures 7 and 8. Similar to the results for SZA2016 forecasting shown in Table 4, the proposed model outperforms the other five models in multi-step-ahead forecasting given its stability in GDEA and HBEA price series forecasting.The forecasting performances for GDEA and HBEA are presented in Tables 5 and 6, respectively.For clarity, the MAPE, MAE, and RMSE values are presented in Figures 7 and 8. Similar to the results for SZA2016 forecasting shown in Table 4, the proposed model outperforms the other five models in multi-step-ahead forecasting given its stability in GDEA and HBEA price series forecasting.As shown in Table 5 and Figure 7, there is a small difference among the six models in the forecasting performance of the GDEA price series.For one-step, two-step, and three-step ahead forecasting, the smallest MAPE values are 2.944%, 3.422%, and 3.620%, respectively; the smallest MAE values are 0.411, 0.478, and 0.504, respectively; and the lowest RMSE are 0.570, 0.662, and 0.694, respectively.These smallest MAPE, MAE, and RMSE values are all achieved by the proposed model, which implies that the proposed model outperforms the other five models in the multi-step-ahead forecasting of GDEA price series.As shown in Table 5 and Figure 7, there is a small difference among the six models in the forecasting performance of the GDEA price series.For one-step, two-step, and three-step ahead forecasting, the smallest MAPE values are 2.944%, 3.422%, and 3.620%, respectively; the smallest MAE values are 0.411, 0.478, and 0.504, respectively; and the lowest RMSE are 0.570, 0.662, and 0.694, respectively.These smallest MAPE, MAE, and RMSE values are all achieved by the proposed model, which implies that the proposed model outperforms the other five models in the multi-step-ahead forecasting of GDEA price series.As shown in Table 6 and Figure 8, the forecasting performance for HBEA price series with the six models is described.For one-step, two-step, and three-step ahead forecasting, the MAPE values achieved by the proposed model are 2.018%, 2.668%, and 3.149%; the MAE values achieved by the proposed model are 0.390, 0.549, and 0.677; and the RMSE values achieved by the proposed model are 0.565, 0.85, and 1.125, respectively.All of these criteria rank first in the six models.These results imply that the proposed model outperforms the other five models in the multi-step-ahead forecasting of the HBEA price series.

Improvement of the Proposed Hybrid Model
Three criteria are applied to measure the improvement ratio of the proposed model compared to the other five models in terms of forecasting performance, which helps make an intuitionistic analysis.The criteria are defined in Table 7.As shown in Table 6 and Figure the forecasting performance for HBEA price series with the six models is described.For one-step, two-step, and three-step ahead forecasting, the MAPE values achieved by the proposed model are 2.018%, 2.668%, and 3.149%; the MAE values achieved by the proposed model are 0.390, 0.549, and 0.677; and the RMSE values achieved by the proposed model are 0.565, 0.85, and 1.125, respectively.All of these criteria rank first in the six models.These results imply that the proposed model outperforms the other five models in the multi-step-ahead forecasting of the HBEA price series.

Improvement of the Proposed Hybrid Model
Three criteria are applied to measure the improvement ratio of the proposed model compared to the other five models in terms of forecasting performance, which helps make an intuitionistic analysis.The criteria are defined in Table 7.The improvement ratio of the proposed model compared to the other five models is calculated for the three forecasted carbon price series.The improvements in one-step, two-step, and three-step ahead forecasting with respect to the MAPE, MAE, and RMSE are shown in Tables 8-10, respectively.Some discussions are as follows: (a) For the improvement ratio in the MAPE shown in Table 8, the largest improvement ratios for one-step, two-step, and three-step ahead forecasting of SZA2016 are 26.886%,23.566%, and 20.153%, and the smallest improvement ratios are 10.614%, 11.082%, and 7.755%, respectively.These results show that the proposed model outperforms the other five models and shows a steady performance in forecasting of SZA2016.Similar results are also shown in the MAPE improvement ratio for GDEA and HBEA.(b) For the improvement ratio of the MAE and RMSE, consistent with that of MAPE shown in Table 8, the results presented in Tables 9 and 10 suggest that the proposed model outperforms the other five models in all SZA2016, GDEA, and HBEA forecasts.To check the overall performance of the proposed model, the average improvement ratios of the proposed model over the three carbon price series forecasting are presented in Table 11.As shown in Table 11, the smallest average improvements in the MAPE for one-step, two-step, and three-step ahead forecasting are 19.535%,16.335%, and 14.532%, and the largest average improvements in the MAPE are 32.799%,30.498%, and 27.869%, respectively.These results indicate that the proposed model outperforms the other five models in terms of MAPE performance.
Similarly, the smallest average improvements in the MAE for one-step, two-step, and three-step ahead forecasting are 22.286%, 18.188%, and 15.701%, and the largest average improvement in the MAE are 35.366%,32.403%, and 29.336%, respectively.These results show that the proposed model achieves better performance in the MAE than the other five models.
The smallest average improvements in the RMSE for one-step, two-step, and three-step ahead forecasting are 21.615%, 14.495%, and 8.474%, and the largest average improvements in RMSE are 36.601%,33.566%, and 30.139%, respectively.These results suggest that the proposed model exceeds the other five models with regards to the RMSE performance.
The above results show that the proposed model achieved the best forecasting performance among the six models with respect to the three performance criteria.The proposed model performs steadily in the multi-step-ahead forecasting of the three carbon price series.

Conclusions
The accurate and stable forecasting of carbon prices is important to promote the development of carbon markets and vital to reduce carbon emissions in China.However, improving forecasting accuracy is hard work due to the non-stationary characteristics of carbon prices, especially for multi-step-ahead forecasting.In this paper, we propose a hybrid multi-step-ahead forecasting model, VMD-FMRVR-MOWOA, for carbon price series.The proposed hybrid forecasting model is applied to three carbon price series from three major pilot regional carbon emission exchanges in China: SZA2016 from Shenzhen emission exchange, GDEA from Guangzhou emissions exchange, and HBEA from Hubei emissions exchange.The sample period of the three carbon price series is from 1 September 2016 to 11 September 2018.Compared with the other five multi-output models (FMRVR-MOWOA, MOTP-MOWOA, MOGP-MOWOA, MSVR, and BPNN), the proposed VMD-FMRVR-MOWOA model obtained the smallest MAPE, MAE, and RMSE in all of the one-step, two-step, and three-step ahead forecasting of the three carbon prices from the major regional emission exchange in China.The smallest average improvement ratio in MAPE, MSE, and RMSE of the proposed hybrid model compared with the other five models are 14.532%, 15.701%, and 8.474%, respectively.These results show that the proposed VMD-FMRVR-MOWOA model outperforms the other five multi-output models in terms of forecasting accuracy and stability, and it can be a potential and effective technique for multi-step-ahead carbon price forecasting in China's major regional emission exchanges.The proposed model can help governors to deeply understand the characteristics of the carbon price in the three major regional emission exchanges in order to make a more reasonable carbon pricing mechanism, help companies make reasonable production arrangement to reduce cost and help market investors to control price

Figure 2 .
Figure 2. The pseudo-code of the alternate direction method of multipliers (ADMM) algorithm for VMD.

Definition 3 :
Pareto Optimal: A solution → x * is called Pareto optimal if and only if :

→
X rand is a random search agent, → A and → C are the coefficient vectors calculated by:

Figure 3 .
Figure 3.The pseudo-code of the MOWOA algorithm.

Figure 3 .
Figure 3.The pseudo-code of the MOWOA algorithm.

Figure 4 .
Figure 4. Plots of the three studied carbon price series.

Figure 5 .
Figure 5. Basic information of the three studied carbon price series.

Figure 4 .
Figure 4. Plots of the three studied carbon price series.

Figure 5 .
Figure 5. Basic information of the three studied carbon price series.

Table 1 .
Data descriptions for the three carbon price series.

Table 2 .
The evaluation metrics of forecasting performance.

Table 3 .
Parameter setting used in the VMD-FMRVR-MOWOA to train the model.
The proposed model has the smallest MAPE, MAE, and RMSE in one-step, two-step, and three-step ahead forecasting performance.The MAPE values achieved by the proposed model are 6.738%, 8.278%, and 9.646% in one-step, two-step, and three-step ahead forecasting, respectively.The smallest MAPE values from the remaining five models are 7.538%, 9.310%, and 10.457% BPNN, respectively.The RMSE values achieved by the proposed model are 2.655, 3.373, and 4.009 in one-step, two-step, and three-step ahead forecasting, respectively.The smallest RMSE values from the remaining five models are 2.960, 3.757, and 4.390 in one-step, two-step, and three-step ahead forecasting achieved by FMRVR-MOWOA, FMRVR-MOWOA, and BPNN, respectively.The MAPE, MAE, and RMSE obtained by the proposed model are smaller than the smallest MAPE, MAE, and RMSE values achieved by the other five models, respectively.
in the one-step, two-step, and three-step ahead forecasts achieved by FMRVR-MOWOA, FMRVR-MOWOA, and BPNN, respectively.The MAE values achieved using the proposed model are 2.164, 2.677, and 3.122 in one-step, two-step, and three-step ahead forecasting, respectively.The smallest MAE values from the remaining five models are 2.405, 3.001, and 3.367 in one-step, two-step, and three-step ahead forecasting achieved by the FMRVR-MOWOA, FMRVR-MOWOA, and

Table 4 .
The one-step, two-step, and three-step ahead forecasting performance comparisons of SZA2016 for different forecasting models.

Table 4 .
The one-step, two-step, and three-step ahead forecasting performance comparisons of SZA2016 for different forecasting models.

Table 5 .
The one-step, two-step, and three-step ahead forecasting performance comparisons of GDEA with different forecasting models.

Table 5 .
The one-step, two-step, and three-step ahead forecasting performance comparisons of GDEA with different forecasting models.

Table 6 .
The one-step, two-step, and three-step ahead forecasting performance comparisons of HBEA for different forecasting models.

Table 6 .
The one-step, two-step, and three-step ahead forecasting performance comparisons of HBEA for different forecasting models.

Table 7 .
Improvement ratio of MAPE, MAE, and RMSE.MAPE proposed , MAE proposed , and RMSE proposed are the forecasting performance achieved by the proposed model.MAPE compared , MAE compared , and RMSE compared are the forecasting performance achieved by the other model.P MAPE , P MAE , and P RMSE are the improvement ratio in MAPE, MAE, and RMSE, respectively. where

Table 8 .
Improvement ratio of MAPE with the proposed hybrid model compared to other models in one-step, two-step, and three-step ahead forecasting.

Table 9 .
Improvement ratio of MAE with the proposed hybrid model compared to other models in one-step, two-step, and three-step ahead forecasting.

Table 10 .
Improvement ratio of RMSE with the proposed hybrid model compared to other models in one-step, two-step, and three-step ahead forecasting.

Table 11 .
Average improvement ratio of the proposed hybrid model with respect to MAPE, MAE, and RMSE for the three carbon prices series in China's emissions exchanges.