1. Introduction
The concentration of carbon dioxide in the atmosphere has risen rapidly as a result of industrialisation and the growth of all types of waste incineration. Emissions of carbon dioxide and other greenhouse gases are the main cause of the greenhouse effect [1]. The intensifying greenhouse effect has led to global warming, with serious negative impacts on the balance of ecosystems. In response to the challenge of climate change, countries have introduced carbon emissions trading markets.
The carbon market, a key instrument used by governments to drive the energy transition and low-carbon development, has performed increasingly well over the past 20 years [2]. In 2005, Europe established the EU Emissions Trading System (EU ETS), the first greenhouse gas emissions trading system in the world. To meet its carbon peaking and carbon neutrality targets, China has selected eight regions, including Shenzhen, Hubei, Beijing, Guangdong, and Tianjin, as pilot regions for the establishment of a carbon emissions trading market. Furthermore, in 2017, the National Development and Reform Commission (NDRC) formally announced that China would launch a pilot carbon trading market and gave the project a prominent place in the 13th Five-Year Plan, demonstrating China's firm confidence in the development of a green economy [3]. Through the marketisation of carbon allowances, governments incentivise companies to switch to cleaner energy or less fossil fuel-intensive production in order to reduce carbon emissions [4]. The carbon price, as the core indicator of the carbon market, is one of the most effective levers for encouraging reductions in carbon emissions and limiting the increase in the global average temperature [5]. However, as an emerging market-based instrument, the carbon price is determined by a combination of internal market mechanisms and external influences. Its volatility challenges the stability of the market and, in turn, affects the efficiency of emission reductions [6]. The core issue of the carbon market is therefore the formation and prediction of the carbon price. Accurate carbon price prediction helps to establish a carbon pricing mechanism, which facilitates the pricing of other carbon financial products, such as carbon futures and carbon options, and provides practical guidance for production, operation, and investment decisions, ultimately supporting green, low-carbon, and high-quality development [7,8].
Due to the complexity of its influencing factors, the carbon price tends to be non-linear, non-stationary, and noisy, which poses major challenges for forecasting. Carbon price forecasting is inherently a time-series modelling task [9]. Existing prediction models can be divided into three main categories: statistical models, artificial intelligence (AI) models, and hybrid models. Statistical models mainly include the autoregressive integrated moving average (ARIMA) model [10,11] and the generalised autoregressive conditional heteroskedasticity (GARCH) model [12,13,14]. Statistical models are grounded in economic theory and combine mathematical and statistical techniques to capture the information embedded in the data. However, they require complex feature engineering and are limited in dealing with non-linear, non-stationary, and non-Gaussian time series. In addition, they do not adequately capture the complex dynamic features in the data [15]. Therefore, more flexible and accurate forecasting methods are needed. Among AI models, machine learning models predominantly consist of the extreme learning machine (ELM) [16,17,18,19], random forest (RF) [3,20,21], and support vector machine (SVM) [22,23,24]. Machine learning models have the advantage of being interpretable and transparent, but their ability to deal with non-linear and non-stationary time series is still inadequate [13]. With the development of artificial intelligence technology and big data, the technical basis for predicting carbon prices with deep learning models is maturing. Deep learning models primarily include artificial neural networks (ANNs) [25,26], convolutional neural networks (CNNs) [20,27,28], long short-term memory (LSTM) networks [28,29,30,31], and gated recurrent unit (GRU) networks [9,28,32]. This body of research has explored the application of AI models to carbon price series forecasting, broadening the field of AI modelling research and achieving significant advances.
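To make the time-series framing concrete, the machine learning and deep learning models surveyed above are typically trained on lagged windows of the price series. The following Python sketch shows one common way to build such a supervised dataset. The function name `make_supervised`, the lag length, and the price values are illustrative assumptions and are not taken from the cited studies.

```python
import numpy as np

def make_supervised(series, n_lags):
    """Turn a 1-D series into (X, y): each row of X holds n_lags past values, y the next value."""
    series = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    # Row i of X is series[i : i + n_lags]; its target is series[i + n_lags].
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Made-up values for illustration only, not real carbon prices.
prices = [50.0, 51.0, 49.5, 52.0, 53.5, 53.0]
X, y = make_supervised(prices, n_lags=3)
```

Any of the regressors mentioned above could then, in principle, be fitted on `X` and `y`; the choice of `n_lags` is a modelling decision that this sketch does not address.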
However, given the high degree of uncertainty and non-linearity of carbon price series, a single model is no longer sufficient for accurate forecasting. In response, scholars have studied hybrid models to explore the deeper relationships underlying irregular carbon price volatility. More specifically, hybrid models typically combine a signal decomposition strategy with the prediction algorithms described above. One of the most effective ways to reduce the complexity of a carbon price series is the decomposition-integration method. The first step is to decompose the original non-stationary time series into a number of relatively regular sub-sequences. Then, prediction models, including statistical and AI models, are applied to each sub-sequence separately so that feature information at different scales can be extracted individually. Finally, the predictions for the sub-sequences are recombined to obtain the final prediction [33]. Currently, the major decomposition methods include empirical mode decomposition (EMD) and its variants [17,29,31,34,35,36,37], the wavelet transform (WT) [38,39,40], and variational mode decomposition (VMD) and its variants [33,41,42,43]. Although these decomposition methods have improved prediction results, they also have limitations. For example, EMD suffers from mode mixing and endpoint distortion; WT faces difficulties in choosing wavelet basis functions, the high computational complexity of the discrete wavelet transform, and boundary effects; and VMD is difficult to parameterise and sensitive to noise. Despite these limitations, hybrid models built on the decomposition-integration method have generally been reported to outperform single statistical or AI models. Therefore, an in-depth study of decomposition methods in carbon price forecasting is needed. Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is an efficient signal processing technique that reduces the reconstruction error by adding adaptive white noise at each stage of the decomposition, so that each mode is extracted from noise-assisted realisations of the signal. Its advantage is that it effectively prevents mode mixing and reduces the residual white noise in the modes, thus improving the accuracy and stability of the decomposition. For this reason, CEEMDAN is adopted as the sequence decomposition method in this paper.
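The three steps of the decomposition-integration method (decompose, predict each component, recombine) can be illustrated with a short Python sketch. To keep it self-contained, the decomposition here is a simple moving-average split into trend and residual, which is a deliberate stand-in and not CEEMDAN, and each component is predicted with a least-squares autoregressive model rather than the models used in this paper. All function names, parameters, and the synthetic series are assumptions made for illustration.

```python
import numpy as np

def decompose(series, window=5):
    """Stand-in decomposition (NOT CEEMDAN): moving-average trend plus residual."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    series = np.asarray(series, dtype=float)
    padded = np.pad(series, window // 2, mode="edge")
    trend = np.convolve(padded, np.ones(window) / window, mode="valid")
    return [trend, series - trend]  # the components sum back to the series

def fit_ar(component, p=3):
    """Least-squares fit of an AR(p) model with intercept (stand-in predictor)."""
    component = np.asarray(component, dtype=float)
    lags = np.array([component[i:i + p] for i in range(len(component) - p)])
    design = np.column_stack([np.ones(len(lags)), lags])
    coef, *_ = np.linalg.lstsq(design, component[p:], rcond=None)
    return coef

def forecast_next(component, coef):
    """One-step-ahead forecast from the last p values (oldest first)."""
    component = np.asarray(component, dtype=float)
    p = len(coef) - 1
    return float(coef[0] + component[-p:] @ coef[1:])

def decompose_and_forecast(series, window=5, p=3):
    """Decompose, forecast every component separately, then sum the forecasts."""
    components = decompose(series, window)
    return sum(forecast_next(c, fit_ar(c, p)) for c in components)

# Synthetic series (made-up numbers, not real carbon prices).
rng = np.random.default_rng(0)
t = np.arange(200)
series = 50 + 0.05 * t + 2 * np.sin(2 * np.pi * t / 20) + rng.normal(0, 0.3, t.size)
next_value = decompose_and_forecast(series)
```

The final summation is the simplest linear form of integration. Replacing `decompose` with a CEEMDAN implementation and `fit_ar` with component-specific models would leave the overall structure unchanged.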
It has been shown that AI models combined with feature extraction not only pre-process data effectively but are also computationally efficient, enabling the construction of suitable prediction models for time series. However, a number of significant challenges remain. Firstly, after decomposition, each sub-sequence is usually fed into a forecasting model directly, without taking into account the different complexities of the sub-sequences or the correlations between them, which reduces forecasting efficiency and accuracy. Secondly, the same prediction model is typically used for every sub-sequence, even though sub-sequences differ in their characteristics and frequencies; models with appropriate parameters should instead be built for each separately. Thirdly, existing integration methods fail to consider the intrinsic relationship between the original sequence and the reconstructed sequences and are mainly limited to linear patterns, e.g., simply adding up the predicted values of all reconstructed sequences to obtain the final prediction. Linear integration can reduce prediction accuracy because linear patterns do not hold in all cases. Finally, error correction, which can significantly improve model accuracy, is rarely considered in carbon price forecasting models.
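The distinction between linear and non-linear integration can be written compactly. The notation below is introduced here for illustration only and is an assumption: $\hat{c}_{k,t}$ denotes the predicted value of the $k$-th reconstructed sequence at time $t$, $K$ is the number of reconstructed sequences, $\beta_k$ are regression coefficients, and $g$ is a learned non-linear mapping.

```latex
\hat{y}_t^{\mathrm{sum}} = \sum_{k=1}^{K} \hat{c}_{k,t}, \qquad
\hat{y}_t^{\mathrm{lin}} = \beta_0 + \sum_{k=1}^{K} \beta_k \, \hat{c}_{k,t}, \qquad
\hat{y}_t^{\mathrm{nl}} = g\left(\hat{c}_{1,t}, \dots, \hat{c}_{K,t}\right)
```

The first two forms are linear in the component predictions, with direct summation being the special case $\beta_0 = 0$ and $\beta_k = 1$. The third form places no linearity restriction on how the components are combined.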
To address these issues, a carbon price prediction model based on improved feature extraction and a non-linear integration method, named CEEMDAN-FuzzyEn-PSORF-ELM-LSTM-MLRRF-ARIMA, is proposed. Methodologically, it improves feature extraction and deep learning algorithms, develops a new form of non-linear integration based on MLRRF, and improves the accuracy of forecasting the non-stationary and non-linear carbon price. First, the original carbon price series is decomposed into a number of simpler, smoother modes using CEEMDAN. Next, modes with similar complexity, as measured by fuzzy entropy (FuzzyEn), are regrouped; combining CEEMDAN with FuzzyEn in this way constitutes the feature extraction step. This reduces the complexity of the sequence, boosts computational efficiency, and improves prediction accuracy. Then, because different modes have distinct frequencies and characteristics, PSORF, ELM, and LSTM are applied as prediction models for components of different complexity to better capture their fluctuations. After that, the high-, medium-, and low-complexity sub-sequences are initially integrated using multiple linear regression (MLR) to explore the relationship between the original sequence and each component. A non-linear integration learner is then applied on top of the MLR aggregation to further refine this relationship and improve the accuracy of aggregation, as non-linear integration learning adapts better to non-linear data. RF is a typical non-linear bootstrap aggregating (bagging) integration learning method. It makes predictions by combining multiple decision trees, each trained on a bootstrap sample of the training data, and aggregating them mitigates over-fitting. Therefore, in this paper, a non-linear integration method based on multiple linear regression and random forest (MLRRF) is adopted to combine the carbon price forecasting results. Finally, ARIMA is applied to correct the error, further boosting forecast accuracy.
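The idea behind the final error-correction step can be sketched as follows: a time-series model is fitted to the past forecast errors of the integrated model, and its prediction of the next error is added to the next forecast. The Python sketch below uses a least-squares AR(1) model, i.e., the ARIMA(1, 0, 0) special case, purely to stay short and self-contained; the ARIMA order used in this paper is not stated in this section, and the function name and example numbers are assumptions.

```python
import numpy as np

def ar1_error_correction(actual, predicted, next_prediction):
    """Correct a new forecast using an AR(1) model fitted to past forecast errors.

    AR(1) is the ARIMA(1, 0, 0) special case, chosen here only to keep the sketch short.
    """
    errors = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    if len(errors) < 3:
        raise ValueError("need at least three past errors")
    # Least-squares fit of e_t = c + phi * e_{t-1}.
    design = np.column_stack([np.ones(len(errors) - 1), errors[:-1]])
    (c, phi), *_ = np.linalg.lstsq(design, errors[1:], rcond=None)
    predicted_error = c + phi * errors[-1]
    return float(next_prediction + predicted_error)

# Made-up example: a base model whose forecasts are consistently 1.0 too low.
predicted = np.array([50.0, 51.0, 52.0, 53.0, 54.0])
actual = predicted + 1.0
corrected = ar1_error_correction(actual, predicted, next_prediction=55.0)
```

In this toy case the error series is constant, so the correction simply shifts the next forecast by the persistent bias. With real forecast errors, a full ARIMA fit would also handle differencing and moving-average terms.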
The innovations and contributions of this paper are as follows:
- (1) A novel prediction method that combines improved feature extraction, hybrid modelling, non-linear integrated learning, and error correction is introduced to provide highly accurate carbon price forecasts. The results demonstrate that the proposed method markedly improves carbon price prediction accuracy and offers stronger robustness to interference and broader applicability.
- (2) By considering the different complexities of, and correlations among, the decomposition modes, an improved feature extraction method that combines CEEMDAN and FuzzyEn is implemented to efficiently separate different features of the original carbon price sequence, which increases extraction efficiency and precision.
- (3) As components of different complexity have their own characteristics and frequencies, PSORF, ELM, and LSTM are applied as prediction models for high-, medium-, and low-complexity sequences, respectively, to better capture the characteristics of each component.
- (4) Because non-linear integration has a smaller error and a wider range of applications than linear integration, RF is introduced as a non-linear learner on top of MLR, establishing the MLRRF non-linear integration framework.
- (5) Error correction is performed on the results of MLRRF integration to further explore the application of error correction in carbon price forecasting.
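To illustrate how an entropy measure can separate components by complexity, as in contribution (2), the Python sketch below computes fuzzy entropy in one common formulation (mean-removed embedding vectors, Chebyshev distance, exponential membership function) and assigns components to complexity groups. The parameter defaults (`m`, `r_factor`, `n`) and the grouping thresholds are hypothetical and are not the settings used in this paper.

```python
import numpy as np

def fuzzy_entropy(series, m=2, r_factor=0.2, n=2):
    """Fuzzy entropy of a 1-D series (one common formulation).

    m: embedding dimension; r_factor: tolerance as a fraction of the standard deviation;
    n: exponent of the exponential membership function. All defaults are assumptions.
    """
    x = np.asarray(series, dtype=float)
    r = r_factor * np.std(x)
    if r == 0:
        raise ValueError("series must not be constant")
    count = len(x) - m  # same number of vectors for dimensions m and m + 1

    def phi(dim):
        # Embedding vectors with their own mean removed (baseline removal).
        vectors = np.array([x[i:i + dim] for i in range(count)])
        vectors = vectors - vectors.mean(axis=1, keepdims=True)
        # Chebyshev distance between every pair of vectors.
        dist = np.max(np.abs(vectors[:, None, :] - vectors[None, :, :]), axis=2)
        similarity = np.exp(-(dist / r) ** n)
        np.fill_diagonal(similarity, 0.0)  # exclude self-matches
        return similarity.sum() / (count * (count - 1))

    return float(np.log(phi(m)) - np.log(phi(m + 1)))

def group_by_complexity(components, low=0.1, high=0.5):
    """Assign each component index to 'low', 'medium' or 'high' (hypothetical thresholds)."""
    groups = {"low": [], "medium": [], "high": []}
    for index, component in enumerate(components):
        value = fuzzy_entropy(component)
        label = "low" if value < low else "high" if value > high else "medium"
        groups[label].append(index)
    return groups

# Synthetic components: a regular sine wave and white noise.
rng = np.random.default_rng(0)
t = np.arange(300)
smooth = np.sin(2 * np.pi * t / 50)
noisy = rng.normal(size=t.size)
groups = group_by_complexity([smooth, noisy])
```

A regular component such as the sine wave yields a lower fuzzy entropy than white noise, which is the property that makes the measure usable for routing components to different predictors.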
The remaining sections of this paper are structured as follows. Section 2 outlines the theoretical basis of relevant methods. Section 3 presents the decomposition and integration hybrid forecasting model. Section 4 applies the proposed model to Shenzhen and Hubei and discusses the calculation results. Section 5 presents the conclusions and discussion.