1. Introduction
Global climate change has emerged as one of the most pressing challenges facing humanity in the 21st century, with extreme weather events occurring more frequently and carbon emissions drawing significant attention [1]. Carbon markets, as market-based mechanisms for emissions reduction, incentivize firms to lower greenhouse gas emissions through emissions trading, establishing themselves as a critical global strategy for addressing climate change [2]. Since its inception in 2005, the European Union Emissions Trading System (EU ETS) has become the world’s largest carbon market, offering valuable insights into price volatility and market mechanisms for other regions [3]. As the world’s largest carbon emitter, China initiated regional carbon trading pilots in 2011 and officially launched its national carbon market in 2021, marking a new phase in the development of its carbon market [4]. By December 2024, the Shanghai Environment and Energy Exchange recorded a cumulative trading volume of 419 million tons, making it the largest pilot market in China by transaction scale [5].
Carbon prices serve as the cornerstone signal of carbon markets, reflecting supply–demand dynamics while providing critical guidance for governments in designing emissions reduction policies and for firms in making investment decisions [6]. Accurate carbon price forecasting enables governments to formulate evidence-based carbon pricing policies and mitigate market volatility risks, and it supports firms in optimizing investment strategies and asset allocation [7]. However, carbon prices are influenced by a multitude of factors, including energy prices, financial market fluctuations, policy shifts, climate variables, and market sentiment, exhibiting non-linear, non-stationary, and highly volatile characteristics [8]. Furthermore, carbon prices are shaped by macroeconomic cycles, international trade patterns, and geopolitical events, adding further complexity to forecasting efforts [9]. For instance, Creti et al. found that EU ETS carbon prices experienced significant shocks during the financial crisis, demonstrating a strong correlation with energy markets [10].
Research on carbon price forecasting has an extensive history, primarily grounded in traditional time series methodologies. Models such as the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model and the Autoregressive Integrated Moving Average (ARIMA) model are limited by their reliance on manual feature engineering and their difficulty in capturing non-linear patterns, which constrains their forecasting accuracy [11]. More recently, machine learning and deep learning methods have been widely adopted in this field. Support Vector Machines (SVMs) and Random Forests (RFs) have been shown to handle multivariate inputs reliably [12], while Long Short-Term Memory (LSTM) networks are particularly effective at capturing long-term dependencies in time series data [13]. In addition, some researchers have improved predictive accuracy by incorporating exogenous factors, such as energy prices, financial market indicators, and unstructured data like online news sentiment [14]. For example, Byun and Cho [14] constructed an indicator based on news sentiment and obtained substantially better forecasting results for EU ETS carbon prices [15].
Even so, there are still a few shortcomings in the current body of research that need addressing. Many studies overlook the multi-scale features and structural breaks in carbon price series, hindering models’ ability to capture the complex patterns of price volatility [16]. The majority of research has focused on the EU ETS [17], with relatively limited attention to forecasting China’s carbon market, particularly the heterogeneity across its national and regional pilot markets [18]. China’s carbon market exhibits pronounced regional disparities; for instance, the Beijing market experiences significant price volatility driven by policy interventions, whereas the Tianjin market remains relatively stable. Furthermore, existing studies often focus solely on point forecasts, neglecting the quantification of price volatility ranges and uncertainty, which limits their practical utility in decision making [19]. Additionally, some studies fail to adequately account for the uncertainty of policy scenarios, such as the implementation of carbon peaking and neutrality targets, which could profoundly influence carbon prices [20].
Building on the aforementioned research gaps, this study proposes a novel carbon-price forecasting framework that integrates an overall market model and a market-specific model to predict price trends in China’s carbon market from 2025 to 2030. The overall market model focuses on the national carbon market’s general trajectory, while the market-specific model targets price forecasts for eight pilot markets (Beijing, Chongqing, Fujian, Guangzhou, Hubei, Shanghai, Shenzhen, and Tianjin), accounting for inter-market heterogeneity. The primary innovations of this study are threefold, as follows: (1) it constructs a forecasting model aligned with historical data dynamics by incorporating seasonal fluctuations and stochastic disturbances; (2) it sets differentiated target prices based on historical price levels of individual markets, analyzing the impact of market heterogeneity on price trends; and (3) it combines point forecasts with volatility analysis to provide more comprehensive price forecasting insights. This research aims to offer a scientific foundation for carbon market participants and policymakers, contributing to the development of China’s carbon market system and the realization of its dual-carbon goals.
2. Data and Methods
2.1. Data Sources
This study utilizes daily historical trading data from China’s eight major carbon markets, spanning the period from 25 June 2021 to 31 December 2024. The data were sourced from the publicly accessible Greenhouse Gas Voluntary Emission Reduction Trading Platform (source of original transaction data: https://ets.sceex.com.cn/internal.htm?orderby=tradeTime%20desc&pageSize=14&k=guo_nei_xing_qing&url=mrhq_gn&pageIndex=1, accessed on 30 January 2025), encompassing records from the Shanghai Environment and Energy Exchange, Beijing Green Exchange, Tianjin Emission Rights Exchange, Guangzhou Carbon Emission Rights Exchange, Hubei Carbon Emission Trading Center, Fujian Haixin Trading Center, Shenzhen Green Exchange, and Chongqing Carbon Emission Rights Trading Center. The raw dataset includes fields such as trading date, trading institution, trading product, opening price, highest price, lowest price, average transaction price, closing price, trading volume, and transaction value, comprising a total of 8309 records.
2.2. Data Cleaning and Preprocessing
To ensure data accuracy and consistency, this study applied the following cleaning and preprocessing steps to the raw dataset:
- (1)
Handling Missing Values
Missing values, originally denoted by “-” in the raw data, were replaced with NaN. For price-related fields (opening price, highest price, lowest price, average transaction price, and closing price), a forward fill method was employed, grouped by market, to impute missing values. This method fills missing entries by propagating the most recent non-missing value forward in chronological order (from earlier to later dates) within each market’s time series, ensuring that all subsequent missing values are replaced with the same value until a new non-missing value is encountered. This approach preserves the temporal continuity of the price data, which is essential for maintaining the integrity of time series analysis in carbon-emission trading markets. Grouping by market ensures that the imputation is performed independently within each market, preserving market-specific price trends. For trading volume and transaction value fields, missing values were filled with 0, indicating no trading activity.
- (2)
Standardization of Institution Names
Due to variations in market names across different time periods (e.g., “Beijing Environment Exchange” versus “Beijing Green Exchange”), a mapping table was used to standardize these names (e.g., unified as “Beijing Green Exchange”). Data from the eight major pilot markets were retained, while records from irrelevant exchanges (e.g., “European Energy Exchange”) were excluded.
- (3)
Merging Market Data
The Shenzhen market includes multiple trading products (e.g., SZA2013 to SZA2020). To facilitate a unified analysis, Shenzhen market data within the same trading dates were consolidated into a single product (SZEA) using a volume-weighted average. The formula for calculating the weighted average price is as follows:
$$P_w = \frac{\sum_{i=1}^{n} P_i V_i}{\sum_{i=1}^{n} V_i}$$
where $P_w$ represents the weighted average price after consolidation; $P_i$ denotes the average transaction price of the $i$-th trading product; $V_i$ indicates the trading volume of the $i$-th trading product; and $n$ is the total number of trading products. The trading volume and transaction value were calculated as the cumulative sums of the respective values across all products.
- (4)
Outlier Removal
To eliminate outliers in the dataset, the following two methods were applied:
Threshold Method: To accurately reflect the data distribution, a detailed statistical analysis of the raw data was conducted, and the price threshold was set at 148.3 CNY/ton.
Figure 1 presents a boxplot of the transaction prices, where the Interquartile Range (IQR) is calculated as the difference between the third quartile (Q3 = 77.95 CNY/ton) and the first quartile (Q1 = 31.04 CNY/ton), resulting in an IQR of 46.91 CNY/ton. Using the standard outlier detection method (Q3 + 1.5 × IQR), the upper threshold was determined to be 148.3 CNY/ton.
Three Standard Deviation Method: For each market, the mean ($\mu$) and standard deviation ($\sigma$) of each price field were calculated. Data points falling outside the range ($[\mu - 3\sigma,\ \mu + 3\sigma]$) were identified as outliers, replaced with NaN, and subsequently imputed using the forward fill method to address missing values. The calculation formula is as follows:
$$|x - \mu| > 3\sigma$$
where $x$ is an observed value of the price field.
- (5)
Data Validation
Theoretical transaction values were calculated using the average transaction price and trading volume (theoretical value = P·V) and compared with the actual transaction values. If the actual value was missing or the discrepancy exceeded 1 CNY, the theoretical value was used as a replacement to ensure data consistency. The cleaned dataset was saved in CSV format, encompassing trading records from the eight major markets, with fields including trading date, market name, trading product, opening price, highest price, lowest price, average transaction price, closing price, trading volume, and transaction value.
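The cleaning steps above can be sketched in pandas roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the column names (`market`, `date`, `avg_price`, `volume`, `value`, and the other price fields) and the helper names are hypothetical, and treating threshold outliers the same way as three-standard-deviation outliers (blank, then forward fill) is an assumption, since the text does not say how values above the threshold were handled.

```python
import pandas as pd

# Hypothetical column names; the raw export uses its own field labels.
PRICE_COLS = ["open", "high", "low", "avg_price", "close"]


def iqr_upper_fence(q1: float, q3: float) -> float:
    """Upper outlier fence Q3 + 1.5 * IQR (threshold method)."""
    return q3 + 1.5 * (q3 - q1)


def merge_products(df: pd.DataFrame) -> pd.DataFrame:
    """Step (3): collapse same-day products into one volume-weighted record."""
    weighted = df.assign(pv=df["avg_price"] * df["volume"])
    out = weighted.groupby(["market", "date"], as_index=False).agg(
        volume=("volume", "sum"), value=("value", "sum"), pv=("pv", "sum")
    )
    out["avg_price"] = out["pv"] / out["volume"]  # NaN when no volume traded
    return out.drop(columns="pv")


def clean_prices(df: pd.DataFrame, upper_fence: float) -> pd.DataFrame:
    """Steps (1), (4), and (5) on one record per market and date."""
    df = df.sort_values(["market", "date"]).copy()
    num_cols = PRICE_COLS + ["volume", "value"]
    # Step (1): "-" and other non-numeric entries become NaN.
    df[num_cols] = df[num_cols].apply(pd.to_numeric, errors="coerce")
    df[["volume", "value"]] = df[["volume", "value"]].fillna(0)
    df[PRICE_COLS] = df.groupby("market")[PRICE_COLS].ffill()

    # Step (4): flag prices above the fence or outside mean +/- 3 std per market.
    grouped = df.groupby("market")[PRICE_COLS]
    mu, sigma = grouped.transform("mean"), grouped.transform("std")
    outlier = (df[PRICE_COLS] > upper_fence) | ((df[PRICE_COLS] - mu).abs() > 3 * sigma)
    # Assumption: both kinds of outlier are blanked and then forward-filled.
    df[PRICE_COLS] = df[PRICE_COLS].mask(outlier)
    df[PRICE_COLS] = df.groupby("market")[PRICE_COLS].ffill()

    # Step (5): replace transaction values that deviate by more than 1 CNY.
    theoretical = df["avg_price"] * df["volume"]
    mismatch = (df["value"] - theoretical).abs() > 1
    df["value"] = df["value"].where(~mismatch, theoretical)
    return df
```

With the quartiles reported for Figure 1, `iqr_upper_fence(31.04, 77.95)` evaluates to roughly 148.3 CNY/ton, which matches the stated threshold. An outlier on the first record of a market has no earlier value to propagate and would remain NaN in this sketch.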
2.3. Feature Analysis Methods
To comprehensively examine the trading characteristics of China’s eight major carbon markets, this study conducts Exploratory Data Analysis (EDA) from the following four perspectives: price trends, trading volume distribution, price volatility, and market correlations. The specific methods are as follows:
- (1)
Price Trend Analysis
The average transaction price trends over time were plotted for each market, reflecting price levels and volatility patterns. Price trend graphs were constructed as line charts, with the x-axis representing the trading date and the y-axis representing the average transaction price (in CNY/ton). The y-axis was scaled from 0 to 160 CNY/ton to cover the full range of price variation across all markets.
- (2)
Trading Volume Distribution Analysis
To assess the activity and liquidity of each market, the total trading volume was calculated by summing the daily trading volumes, as follows:
$$V_m = \sum_{t=1}^{T} V_{t,m}$$
where $V_m$ is the total trading volume in market $m$; $V_{t,m}$ is the volume traded in market $m$ on the $t$-th day; and $T$ is the total number of trading days.
- (3)
Price Volatility Analysis
Price volatility was measured by the standard deviation of the average transaction price in each market, as follows:
$$\sigma_m = \sqrt{\frac{1}{T_m}\sum_{t=1}^{T_m}\left(P_{t,m} - \bar{P}_m\right)^2}$$
where $\sigma_m$ is the price volatility of market $m$; $P_{t,m}$ is the average transaction price in market $m$ on the $t$-th day; $\bar{P}_m$ is the overall mean price in market $m$; and $T_m$ is the total number of trading days in market $m$.
- (4)
Inter-Market Price Correlation Analysis
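To measure how closely prices move together across markets, the Pearson correlation coefficient was calculated for the average transaction prices of each pair of markets, as follows:
$$\rho_{m,n} = \frac{\sum_{t=1}^{T}\left(P_{t,m} - \bar{P}_m\right)\left(P_{t,n} - \bar{P}_n\right)}{\sqrt{\sum_{t=1}^{T}\left(P_{t,m} - \bar{P}_m\right)^2}\,\sqrt{\sum_{t=1}^{T}\left(P_{t,n} - \bar{P}_n\right)^2}}$$
where $\rho_{m,n}$ represents the correlation coefficient between markets $m$ and $n$; $P_{t,m}$ and $P_{t,n}$ denote the average transaction prices of markets $m$ and $n$ on day $t$, respectively; $\bar{P}_m$ and $\bar{P}_n$ indicate the mean prices of markets $m$ and $n$, respectively; and $T$ is the number of common trading days. The correlation coefficient ranges from −1 to 1.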
Data cleaning and feature analysis were implemented using the Python programming language, primarily relying on the following libraries: pandas for data processing and cleaning and matplotlib and seaborn for data visualization. The data cleaning and analysis code was executed in a Python 3.11 environment, ensuring the reproducibility and reliability of the results.
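As an illustration of the three summary statistics defined in this subsection, the following pandas sketch computes total volume, price standard deviation, and the inter-market Pearson correlation matrix. The column names (`date`, `market`, `avg_price`, `volume`) and function names are hypothetical, and the use of the population standard deviation (`ddof=0`) is an assumption; pandas defaults to the sample form (`ddof=1`), and the text does not state which was used.

```python
import pandas as pd


def market_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Total trading volume and price standard deviation per market."""
    return df.groupby("market").agg(
        total_volume=("volume", "sum"),
        # Assumption: population standard deviation (divide by T_m).
        price_std=("avg_price", lambda s: s.std(ddof=0)),
    )


def price_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation of average prices between every pair of markets."""
    # One column per market, one row per trading date.
    wide = df.pivot(index="date", columns="market", values="avg_price")
    # DataFrame.corr drops dates missing in either market of a pair,
    # so each coefficient uses only the common trading days.
    return wide.corr(method="pearson")
```

The correlation matrix returned by `price_correlation` is the kind of table that a seaborn heatmap would take as input.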
2.4. Feature Engineering
To enhance the performance of subsequent forecasting models, this study conducted feature engineering on the cleaned dataset, extracting the following two categories of features:
- (1)
Internal Features
Lagged Features: The average transaction prices from the previous 1 day and 7 days (denoted as $P_{t-1,m}$ and $P_{t-7,m}$, respectively) were extracted to capture the temporal dependency of prices. Lagged features were computed by applying a time shift to the price series after grouping by market, with the formula as follows:
$$\text{Lag}_k\left(P_{t,m}\right) = P_{t-k,m}, \quad k \in \{1, 7\}$$
where $P_{t-k,m}$ represents the average transaction price of market $m$ on day $t-k$.
Moving Average Features: The 7-day moving average price (denoted as $MA_{7,t,m}$) was calculated to smooth price fluctuations and capture short-term trends. The formula for the moving average price is as follows:
$$MA_{7,t,m} = \frac{1}{7}\sum_{i=0}^{6} P_{t-i,m}$$
where $P_{t-i,m}$ represents the average transaction price of market $m$ on day $t-i$. If fewer than 7 days of data were available, the calculation was performed using the available data.
Temporal Features: The month of the trading date (denoted as Mt, ranging from 1 to 12) was extracted to capture the seasonal variations in prices. Additionally, the compliance cycle was identified (denoted as Ct), with June and July defined as the compliance period (Ct = 1), and other months as the non-compliance period (Ct = 0), to reflect the potential impact of compliance cycles on prices.
Trading Volume Features: The daily trading volume (denoted as Vt,m) was directly used as an indicator of market activity.
- (2)
Market Features
Market names (Market) were transformed into dummy variables through one-hot encoding, generating eight binary features (denoted as $D_{m,i}$, $i \in \{1, \dots, 8\}$), indicating whether a trading record pertains to a specific market. For instance, if a record belongs to the Shanghai market, then $D_{\text{Shanghai}} = 1$, while the features for other markets are set to 0. The formula for generating dummy variables is as follows:
$$D_{m,i} = \begin{cases} 1, & \text{if the record belongs to market } i \\ 0, & \text{otherwise} \end{cases}$$
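The internal and market features described in this subsection could be generated along the following lines. This is a sketch, not the study's implementation: the input column names (`date`, `market`, `avg_price`) and the function name are hypothetical, while the output names follow the feature names mentioned later in the paper (`lag1_price`, `lag7_price`, `ma7_price`, `month`, `is_compliance_period`, `market_Beijing`). Two further assumptions are labeled in the comments: lags count trading records rather than calendar days, and the moving-average window includes the current day.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Append lag, moving-average, temporal, and market dummy features."""
    df = df.sort_values(["market", "date"]).copy()
    by_market = df.groupby("market")["avg_price"]

    # Lagged prices; assumption: a lag of k means k trading records back.
    df["lag1_price"] = by_market.shift(1)
    df["lag7_price"] = by_market.shift(7)

    # 7-day moving average; min_periods=1 uses whatever history is available.
    # Assumption: the window includes the current day.
    df["ma7_price"] = by_market.transform(
        lambda s: s.rolling(7, min_periods=1).mean()
    )

    # Month (1 to 12) and compliance flag (June and July).
    df["month"] = df["date"].dt.month
    df["is_compliance_period"] = df["month"].isin([6, 7]).astype(int)

    # One binary column per market, e.g. market_Beijing.
    dummies = pd.get_dummies(df["market"], prefix="market", dtype=int)
    return pd.concat([df, dummies], axis=1)
```

Grouping by market before shifting and rolling keeps one market's prices from leaking into another market's lagged or smoothed values.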
2.5. XGBoost Model
To forecast carbon prices, this study employs XGBoost (Extreme Gradient Boosting) as the primary machine learning model. XGBoost, a gradient boosting algorithm based on decision trees, is widely adopted for regression tasks due to its efficiency, flexibility, and robust capacity to model multi-feature data. By integrating multiple decision trees, XGBoost effectively captures non-linear relationships and interaction effects among features, making it well-suited for handling the complex time series data inherent in carbon price forecasting.
3. Results Analysis
3.1. Carbon Market Price Trend Analysis
Figure 2 shows the average transaction prices (in CNY/ton) of China’s eight major carbon markets between June 2021 and December 2024, highlighting marked differences in price levels, volatility, and potential drivers. The Beijing market recorded the highest price levels among the eight markets, with prices ranging from 80 to 120 CNY/ton and occasionally approaching 140 CNY/ton. These elevated prices are likely attributable to Beijing’s role as China’s political and economic center, reinforced by stringent carbon emission regulations. However, the pronounced price fluctuations reflect a heightened sensitivity to external factors, such as policy interventions or compliance deadlines. In contrast, the Shanghai market was considerably steadier, with prices staying within a narrower range of 60 to 80 CNY/ton and following a relatively smooth path, as would be expected of a well-established carbon market. Shanghai’s mature trading systems and high liquidity dampened price swings, likely owing to its role as a financial hub and its early start in carbon trading. The Guangzhou market showed noticeable fluctuations, with prices ranging from 30 to 70 CNY/ton. After outliers (prices above 200 CNY/ton) were removed, a clear downward trend emerged in 2023, with prices falling from 50 CNY/ton to around 35 CNY/ton. These fluctuations may stem from the market’s sensitivity to local economic conditions and policy shifts, as well as its smaller overall size. The Hubei market demonstrated low volatility, consistently ranging between 35 and 45 CNY/ton, a stability consistent with its high trading volume, indicating robust liquidity and broad participant engagement that effectively buffered price fluctuations. The Tianjin, Fujian, Shenzhen, and Chongqing markets recorded lower price levels, with fluctuation ranges between 20 and 60 CNY/ton. Among these, Tianjin exhibited the greatest stability (around 30 CNY/ton), possibly due to lenient emission targets or limited market activity, while Shenzhen showed moderate volatility (potentially due to diverse trading products like SZEA). Fujian and Chongqing displayed a gradual upward trend over time, reflecting increasing market maturity. These regional disparities in price dynamics may present opportunities for cross-market arbitrage, underscoring the need for coordinated national policies to promote price convergence and enhance market efficiency. Additionally, the removal of outliers ensured the reliability of the analysis, mitigating distortions from data entry errors or non-representative transactions.
3.2. Trading-Volume Distribution Analysis
Figure 3 shows the total trading volume (in tons) of China’s eight major carbon markets from 2021 to 2024, calculated by aggregating the daily trading volumes and reflecting the trading activity and liquidity levels of each market. Specific values annotated on the bar chart enhance the visual clarity of the data. The Shanghai market led with a total trading volume of 419 million tons, far surpassing other markets and accounting for a substantial share of the overall trading activity. This dominant position aligns with Shanghai’s role as a financial hub and a pioneering carbon trading pilot, where robust trading infrastructure and supportive policies have attracted significant participation, markedly enhancing market liquidity. The Guangzhou and Hubei markets followed, with total trading volumes of 41.64 million tons and 36.54 million tons, respectively, ranking second and third. This indicates strong trading activity in these regional markets, likely driven by local industrial activity and policy incentives. Hubei’s high trading volume complements its low price volatility, reflecting a market characterized by high liquidity and stability. The Fujian, Chongqing, Tianjin, and Shenzhen markets had intermediate trading volumes, ranging from 15.34 million tons in Shenzhen to 25.82 million tons in Fujian. Fujian’s higher volume may be related to its specialized segments, such as forestry carbon sink trading, while the smaller volumes in Chongqing and Tianjin point to lower market activity, probably because these markets are smaller in scale or have fewer participants. Conversely, the Beijing market registered the lowest total trading volume at 11.60 million tons, despite its elevated prices. This constrained trading volume likely contributes to its pronounced price fluctuations, as reduced liquidity heightens vulnerability to supply–demand imbalances or external disruptions. These differences in trading volumes across regions underline the importance of liquidity in carbon trading. Increasing trading volumes in less active markets could improve market functioning and price stability, and the experience of Shanghai and Hubei may offer useful lessons for other markets.
3.3. Price Volatility Characteristics Analysis
Figure 4 shows the price volatility of China’s eight major carbon markets between 2021 and 2024, measured by the standard deviation of the average transaction prices (in CNY/ton), with the exact values annotated on the bar chart for readability. The Beijing market displayed the highest price volatility, with a standard deviation of 26.93 CNY/ton, consistent with its limited trading volume. This suggests that markets with less trading activity are more exposed to supply–demand imbalances or external shocks, such as policy changes, which can lead to large price movements. The Shenzhen market ranked second in volatility, with a standard deviation of 19.57 CNY/ton, likely because of its mix of trading products (e.g., SZEA) and the differing dynamics at play in that market. The Guangzhou market had a standard deviation of 15.85 CNY/ton, which matches the noticeable swings in its price trends and probably reflects its exposure to local economic conditions, policy shifts, and its smaller overall size. In contrast, the Shanghai market was steadier, with a standard deviation of 11.85 CNY/ton, matching its stable price patterns and high trading volume. Its strong liquidity helped contain price swings, making the market more predictable. The Fujian, Chongqing, Hubei, and Tianjin markets showed little price movement, with standard deviations ranging from 3.70 CNY/ton in Tianjin to 7.02 CNY/ton in Fujian. Hubei’s low value (4.86 CNY/ton) is consistent with its high trading volume, indicating that strong liquidity and a large number of participants smooth price changes, while Tianjin’s minimal fluctuations may result from its low trading activity. The link between price volatility and trading volume suggests that increasing activity and liquidity in quieter markets could dampen price fluctuations, improving market efficiency and predictability. The price stability achieved in Hubei and Shanghai offers useful lessons for other regional markets.
3.4. Inter-Market Price Correlation Analysis
Figure 5 presents the correlation matrix of the average transaction prices across China’s eight major carbon markets from 2021 to 2024, with correlation coefficients ranging from −1 to 1. The heatmap uses a color scale from cool to warm tones (red for positive correlations and blue for negative ones) to show how varied the price linkages among markets are. The Shenzhen market exhibits strong positive correlations with Beijing (0.64), Fujian (0.68), and Shanghai (0.61), suggesting highly synchronized price movements, potentially driven by shared economic factors, similar policy frameworks, or inter-market arbitrage activities. Similarly, the Fujian market shows notable correlations with Tianjin (0.60) and Beijing (0.55), indicating a degree of price linkage among these markets. The Shanghai market has moderate correlations with Beijing (0.55), Fujian (0.46), and Tianjin (0.46), which points to some degree of price linkage among them, likely shaped by national policies or efforts toward closer market integration. The Guangzhou and Hubei markets show a correlation of 0.30, suggesting that they may be influenced by similar local factors, such as industrial activity or emission reduction goals. By contrast, the Hubei market moves largely independently of most other markets, with correlation coefficients ranging from 0.07 (Tianjin) to 0.30 (Shenzhen). This independence is probably related to its high trading volume and stable pricing, which make it less affected by price movements in other markets. The Chongqing market demonstrates negative correlations with Hubei (−0.30) and Guangzhou (−0.30), indicating that its price movements often diverge from these markets, possibly reflecting differences in regional economic conditions or policy priorities. The diverse patterns of inter-market price correlations suggest a degree of market integration between Shenzhen, Beijing, Fujian, and Shanghai. However, the independence of the Hubei market and the negative correlations of the Chongqing market highlight regional disparities in market dynamics, underscoring the need for coordinated policies to promote price convergence, reduce regional disparities, and enhance the overall efficiency of China’s carbon trading system.
3.5. Feature Engineering Results Analysis
Feature engineering was performed on the cleaned dataset, extracting internal and market features to provide multidimensional input data for subsequent carbon price forecasting models. Internal features include lagged features (average transaction prices from the previous 1 and 7 days), 7-day moving average prices, temporal features (month and compliance cycle), and daily trading volume. Market features were generated by transforming market names into dummy variables, covering the eight major markets. Following feature generation, the dataset was expanded with 13 additional feature fields, resulting in a total of 23 fields. Lagged features effectively capture the temporal dependency of prices, temporal features reflect the potential influence of compliance cycles on prices, and market features highlight inter-market disparities, providing robust data support for modeling price dynamics and market heterogeneity in subsequent analyses.
Exploratory data analysis of China’s eight major carbon markets from 2021 to 2024 reveals significant heterogeneity in price dynamics, trading activity, and market integration. The Beijing market exhibits the highest price levels and volatility but the lowest trading volume at 11.60 million tons, indicating limited market activity and high sensitivity to external shocks. In contrast, the Shanghai and Hubei markets demonstrate advantages in liquidity and market maturity, with high trading volumes (419 million tons and 36.54 million tons, respectively) and low volatility (standard deviations of 11.85 CNY/ton and 4.86 CNY/ton, respectively). Price correlation analysis indicates strong correlations between the Shenzhen market and Beijing, Fujian, and Shanghai (correlation coefficients of 0.64, 0.68, and 0.61, respectively), while the Hubei market remains relatively independent, and the Chongqing market shows negative correlations (coefficients of −0.30 with both Hubei and Guangzhou). These findings suggest that increasing participation and liquidity in less active markets, such as Beijing and Tianjin, could enhance price stability and market efficiency. Moreover, coordinated policies are needed to reduce regional disparities, promote price convergence across markets, and support the development of a unified national carbon market in China.
4. Modeling and Simulation Validation
4.1. Modeling Approach
In this study, the hyperparameters of the XGBoost model were set as follows: the number of trees (n_estimators) at 100, the learning rate (learning_rate) at 0.1, the maximum tree depth (max_depth) at 5, and the random seed (random_state) at 42. These parameters were determined through preliminary experiments to balance predictive accuracy and computational efficiency.
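The reported settings map directly onto the constructor arguments of `XGBRegressor` in the xgboost scikit-learn interface, as in the sketch below. The names `XGB_PARAMS` and `build_model` are hypothetical, and the choice of the scikit-learn wrapper (rather than the native `xgboost.train` API) is an assumption, since the paper does not say which interface was used.

```python
# Hyperparameter values as reported in Section 4.1.
XGB_PARAMS = {
    "n_estimators": 100,   # number of trees
    "learning_rate": 0.1,
    "max_depth": 5,        # maximum tree depth
    "random_state": 42,    # random seed
}


def build_model(params=None):
    """Create an XGBoost regressor with the reported settings."""
    # Imported lazily so the settings can be inspected without xgboost installed.
    from xgboost import XGBRegressor

    return XGBRegressor(**(params or XGB_PARAMS))
```

Calling `build_model()` once on the pooled data and once per market would correspond to the overall and market-specific models, respectively; all other XGBoost settings stay at the library defaults in this sketch.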
4.2. Data Partitioning and Feature Selection
This study utilized historical data from China’s eight major carbon exchanges (Beijing, Chongqing, Fujian, Guangzhou, Hubei, Shanghai, Shenzhen, and Tianjin) spanning 25 June 2021 to 31 December 2024. The dataset includes the carbon price (Average Price) alongside a range of feature variables.
To evaluate the model’s predictive performance, the data were chronologically split into a training set and a test set, with the training set comprising 80% of the data (25 June 2021 to 31 March 2024) and the test set comprising 20% (1 April 2024 to 31 December 2024). The carbon price range in the training set was 3.45 to 149.64 CNY/ton, while the test set ranged from 25.5 to 67.06 CNY/ton. This partitioning ensures that the model is trained on earlier data and tested on later data, aligning with the practical requirements of time series forecasting.
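A chronological split of this kind can be expressed as a simple date cutoff. In the sketch below, the `date` column name and the function name are hypothetical; the cutoff of 31 March 2024 is taken from the text.

```python
import pandas as pd


def chronological_split(df: pd.DataFrame, cutoff: str = "2024-03-31"):
    """Split records into training (on or before cutoff) and test (after)."""
    cutoff_ts = pd.Timestamp(cutoff)
    train = df[df["date"] <= cutoff_ts]
    test = df[df["date"] > cutoff_ts]
    return train, test
```

Unlike a random split, this keeps every test observation later than every training observation, so the model is never evaluated on dates that precede its training data.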
For feature selection, a total of 18 features were employed, including lagged and moving average features (lag1_price, lag7_price, and ma7_price), temporal features (month and is_compliance_period), market features (e.g., market_Beijing), and external variables (industrial growth rate and coal price). These features were generated through feature engineering to capture short-term price trends, seasonal effects, market disparities, and the influence of external economic factors.
4.3. Model Training and Performance Evaluation
Figure 6 presents the overall carbon price forecasting performance of China’s carbon market from 2021 to 2024, encompassing both the training and testing phases. The training set spans 25 June 2021 to 31 March 2024, while the testing set covers 1 April 2024 to 31 December 2024. Each market is represented by the following two lines: a solid line indicating the actual carbon price (Train/Test Actual) and a dashed line representing the predicted carbon price. To facilitate differentiation, each market is assigned a distinct color. During the training phase, the model demonstrates strong fitting capability, achieving an overall R² value of 0.83, indicating its ability to effectively capture historical price patterns. In the testing phase, the model exhibits robust predictive performance, with an overall R² value of 0.89, reflecting strong generalization ability. This indicates that the model achieves high accuracy and stability in predicting carbon prices.
Figure 7a–h shows the forecasting performance of the market-specific models for China’s eight carbon markets (Beijing, Chongqing, Fujian, Guangzhou, Hubei, Shanghai, Shenzhen, and Tianjin) from 2021 to 2024. The figure comprises eight panels, labeled (a) to (h), each corresponding to one market and showing the actual and predicted prices for both the training and testing phases. In these panels, blue and green solid lines mark the actual prices for the training and testing sets (Train Actual and Test Actual), while red and orange solid lines show the model predictions (Train Predicted and Test Predicted). During the training phase, the predictions closely match the actual prices across all markets, which suggests that the models capture historical price patterns well. In the testing phase, however, the results differ considerably across markets. For example, in the Chongqing market (panel b) and the Fujian market (panel c), the predicted prices deviate substantially from the actual ones, while in the Shenzhen market (panel g) and the Tianjin market (panel h), the predictions remain much closer to the actual prices. This indicates that the market-specific models have difficulty with markets that exhibit large price swings, leaving room for further refinement.
This study first trained an overall model by aggregating historical data from the eight major markets to construct a single XGBoost model, capturing the common patterns across all markets. To further examine inter-market differences, separate XGBoost models (market-specific models) were trained for each market, enabling a comparison of the predictive performances between the overall and market-specific models. The model performance was evaluated using the following three metrics: the Root Mean Square Error (RMSE), which measures the average error between predicted and actual values; the Mean Absolute Error (MAE), which quantifies the absolute deviation between predicted and actual values; and the coefficient of determination (R²), which assesses the model’s explanatory power, with values closer to 1 indicating a better model fit. The performance results of the overall model are presented in Table 1.
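For concreteness, the three metrics can be written out as follows. This is a minimal sketch using their standard definitions; the function names and the toy price values are illustrative assumptions and do not come from the study's data.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy prices in CNY/ton (illustrative only).
actual = [50.0, 52.0, 55.0, 53.0]
predicted = [51.0, 52.0, 54.0, 55.0]
print(f"RMSE={rmse(actual, predicted):.3f}  "
      f"MAE={mae(actual, predicted):.3f}  "
      f"R2={r2(actual, predicted):.3f}")
```

RMSE and MAE are expressed in the units of the price series (CNY/ton), whereas R² is unitless, so the error metrics are not directly comparable across markets with different price levels.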
As shown in Table 1, the overall model performed robustly on both the training and testing sets, achieving an R² of 0.89 on the testing set, which indicates strong generalization capability. The testing-set errors (RMSE of 5.07 and MAE of 4.06) are no higher than the training-set errors (RMSE of 7.14 and MAE of 5.44), suggesting that the model exhibits no significant overfitting issues.
As shown in Table 2, the performance of the market-specific models varies across markets. The Shanghai market demonstrates the highest forecasting accuracy, with a testing set R² of 0.92, an RMSE of 3.12 CNY/ton, and an MAE of 2.41 CNY/ton, indicating excellent predictive capability. The Guangzhou and Tianjin markets also exhibit strong performance, with testing set R² values of 0.89 (RMSE: 5.19 CNY/ton, MAE: 4.17 CNY/ton) and 0.86 (RMSE: 1.46 CNY/ton, MAE: 1.17 CNY/ton), respectively, with Tianjin showing the smallest prediction error among all markets. In contrast, the Chongqing market yields the lowest forecasting performance, with a testing set R² of 0.71, an RMSE of 3.57 CNY/ton, and an MAE of 2.54 CNY/ton, suggesting that the model struggles to capture price patterns effectively in this market. Additionally, while the Beijing and Shenzhen markets achieve reasonable testing set R² values of 0.83 and 0.80, their higher RMSE values of 11.02 CNY/ton (MAE: 8.45 CNY/ton) and 8.33 CNY/ton (MAE: 5.93 CNY/ton), respectively, indicate substantial prediction errors, particularly in Beijing, where the error is the largest across all markets. These performance differences likely arise from variations in market characteristics, such as greater price volatility or smaller data volumes in Beijing and Shenzhen, and potentially more complex price dynamics in Chongqing, which may challenge the model’s ability to generalize effectively.
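One way to reconcile the differing RMSE and R² rankings above is to note that, if R² is computed as 1 − SS_res/SS_tot on the testing set, then R² = 1 − RMSE²/Var(actual), so each reported pair implies a spread of actual testing-set prices. The sketch below applies that identity to the figures quoted in the text. It is a back-of-the-envelope check under that assumption, not a result from the study; the inputs are rounded to two decimals, so the outputs are approximate, and the function name `implied_price_std` is hypothetical.

```python
import math

# Testing-set (R2, RMSE in CNY/ton) pairs as quoted in the text above.
reported = {
    "Shanghai": (0.92, 3.12),
    "Guangzhou": (0.89, 5.19),
    "Tianjin": (0.86, 1.46),
    "Beijing": (0.83, 11.02),
    "Shenzhen": (0.80, 8.33),
    "Chongqing": (0.71, 3.57),
}


def implied_price_std(r2, rmse):
    """Invert R2 = 1 - RMSE^2 / Var(actual) for the (population) std of actual prices."""
    return rmse / math.sqrt(1.0 - r2)


implied = {m: implied_price_std(r2, rmse) for m, (r2, rmse) in reported.items()}
for market, sd in sorted(implied.items(), key=lambda kv: kv[1]):
    print(f"{market}: implied testing-set price std = {sd:.2f} CNY/ton")
```

Under this assumption, Tianjin's small RMSE coincides with a narrow implied price spread (about 3.9 CNY/ton), while Beijing's large RMSE coincides with a wide one (about 27 CNY/ton). This is why a market can show the smallest absolute error without having the highest R².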
To visualize the predictive performance of the models, scatter plots were produced for both the overall model and the market-specific models, as shown in Figure 8a–i. Subplots (a) to (h) depict the forecasting performance of the market-specific models for the eight carbon exchanges, while subplot (i) illustrates the performance of the overall model. In the plots, blue dots represent the training set and red dots represent the testing set. Subplot (a) shows the results for the Beijing market, for which the model performed relatively poorly, probably because the limited data volume and high price volatility in this market led to overfitting. Subplot (b) presents the results for the Chongqing market, for which the model performed moderately, likely because trading activity in Chongqing is relatively low and the data patterns are difficult to capture. Subplot (c) shows the results for the Fujian market, for which the model also performed poorly, with the predicted curve failing to match the actual prices in the testing set, likely because of the limited data available for this market. Subplot (d) illustrates the results for the Guangzhou market, for which the model performed well, with the predicted curve closely aligning with part of the price trend in the testing set, though some deviations occur during abrupt price changes, likely influenced by the market’s high sensitivity to policy factors. Subplot (e) shows the forecasting performance for the Hubei market, for which the model performed well, capturing the main price trends in the testing set, but with some errors during large fluctuations, likely due to the complex data patterns in this market. Subplot (f) presents the results for the Shanghai market, for which the model performed well, with the predicted curve fitting most price trends in the testing set, though deviations occur at certain abrupt change points, potentially due to the limited data volume in this market.
Subplot (g) displays the forecasting performance for the Shenzhen market, for which the model performed well, effectively capturing the main price trends in the testing set, particularly during stable periods, where the model demonstrated consistent performance. Subplot (h) illustrates the results for the Tianjin market, for which the model exhibited the best forecasting performance, with the predicted curve in the testing set closely matching the actual values, indicating strong predictive capability in this market, likely due to the relatively stable data patterns in Tianjin. Subplot (i) presents the forecasting performance of the overall model, demonstrating that the model effectively fits the actual prices in both the training and testing sets, with the predicted values closely aligning with actual values. Notably, in the testing set, the predicted curve accurately captures the price fluctuation trends, validating the model’s generalization capability.
4.4. Comparative Analysis of Predictive Models
The performance of the XGBoost model in forecasting carbon prices across the overall carbon market was validated through comparative experiments to ensure the scientific rigor and reliability of the model selection. The comparison included a baseline naive model, as well as Random Forest and ARIMA models, representing typical applications of simple forecasting methods, machine learning models, and traditional time series approaches, respectively.
The comparative experiment was based on the overall carbon market price series described earlier, with the training and testing sets split in an 80:20 ratio. The feature set includes lagged price features (1-day, 7-day, and 14-day lagged prices and 7-day moving average), external factors (industrial growth rates and coal prices along with their lagged values and moving averages), and additional indicators such as time trends and volatility metrics. To ensure fairness, all models were trained and tested on the same data, and their performance was evaluated using the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²).
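The lagged price features listed above can be illustrated with a small sketch. Several details are assumptions rather than the study's specification: the names `lag7_price` and `lag14_price` are hypothetical (only `lag1_price` and `ma7_price` are named in the text), and the moving average is computed here over the seven observations strictly before the target day to avoid look-ahead, since the text does not state whether the current day is included.

```python
def lag(series, k):
    """Value k steps earlier at each position; None where no history exists."""
    n = len(series)
    return [None] * min(k, n) + list(series[: max(n - k, 0)])


def trailing_mean(series, window):
    """Mean of the `window` values strictly before each position (no look-ahead)."""
    out = []
    for i in range(len(series)):
        if i < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(series[i - window:i]) / window)
    return out


def build_features(prices):
    """Return (features, target) rows, dropping rows with incomplete history."""
    lag1, lag7, lag14 = lag(prices, 1), lag(prices, 7), lag(prices, 14)
    ma7 = trailing_mean(prices, 7)
    rows = []
    for i, target in enumerate(prices):
        feats = {
            "lag1_price": lag1[i],
            "lag7_price": lag7[i],    # hypothetical name
            "lag14_price": lag14[i],  # hypothetical name
            "ma7_price": ma7[i],
        }
        if None not in feats.values():
            rows.append((feats, target))
    return rows


# Toy series of 20 daily prices: 40.0, 41.0, ..., 59.0 (illustrative only).
prices = [float(p) for p in range(40, 60)]
rows = build_features(prices)
print(len(rows), "usable rows; first:", rows[0])
```

The 14-day lag determines how many initial observations are lost: the first usable row is the fifteenth day of the series.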
Table 3 presents the performance comparison of each model on the training and testing sets. The XGBoost model demonstrated the best performance on the testing set, achieving an R² value of 0.89, with RMSE and MAE values of 5.07 and 4.06, respectively, substantially outperforming the other models. In contrast, the naive model yielded a testing set R² of only 0.17, while the Random Forest and ARIMA models achieved testing set R² values of 0.55 and 0.33, respectively. These results highlight XGBoost’s superior predictive accuracy and stability in capturing the dynamic changes in carbon prices.
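The text does not specify how the naive baseline is constructed. One common choice, assumed here purely for illustration, is a persistence forecast that predicts each value as the previous observation. The sketch below scores such a baseline with R² on toy numbers; the function names and data are hypothetical and do not reproduce the figures in Table 3.

```python
def persistence_forecast(last_train_value, test_actual):
    """Assumed naive baseline: predict each test value as the previous observation."""
    return [last_train_value] + list(test_actual[:-1])


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy price series in CNY/ton (illustrative only).
train = [50.0, 51.0, 53.0]
test = [54.0, 52.0, 55.0, 57.0]

naive_pred = persistence_forecast(train[-1], test)
naive_r2 = r2(test, naive_pred)
print("naive predictions:", naive_pred)
print(f"naive R2 = {naive_r2:.3f}")
```

On this toy series the baseline's R² is negative, which shows that R² is not bounded below by zero on held-out data: a low or negative baseline score simply means the baseline does worse than predicting the testing-set mean.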
4.5. Feature Importance Analysis
To identify the key drivers influencing carbon prices, this study leveraged the feature importance functionality of the XGBoost model to analyze the contribution of features in both the overall model and the eight market-specific models. The feature importance results for the overall model are presented in Figure 9.
As shown in Figure 9, the feature ma7_price (7-day moving average price) emerged as the most significant predictor in the overall carbon price forecasting model, with an importance score of 0.48155, indicating that short-term price trends exert a substantial influence on carbon prices. The feature lag1_price (previous day’s price) follows with an importance score of 0.44677, suggesting a strong short-term autocorrelation in carbon prices. The external variables cumulative industrial growth and industrial growth rate have importance scores of 0.01470 and 0.00613, respectively, indicating a relatively minor direct impact of industrial value-added on carbon prices. The feature coal price exhibits an importance score of 0.00520, suggesting a limited role of coal prices in driving carbon price dynamics.
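The relative weight of price-based and external features can be made explicit by summing the scores quoted above. The sketch below uses only the five scores reported in the text. The remaining features of the overall model are not listed here, so the totals cover the quoted features only, and the grouping into "price" and "external" sets is an illustrative choice.

```python
# Importance scores for the overall model, as quoted in the text.
importance = {
    "ma7_price": 0.48155,
    "lag1_price": 0.44677,
    "cumulative industrial growth": 0.01470,
    "industrial growth rate": 0.00613,
    "coal price": 0.00520,
}

# Illustrative grouping of the quoted features.
price_features = {"ma7_price", "lag1_price"}

price_share = sum(v for k, v in importance.items() if k in price_features)
external_share = sum(v for k, v in importance.items() if k not in price_features)
ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)

print(f"price-based features:     {price_share:.5f}")
print(f"quoted external features: {external_share:.5f}")
for name, score in ranked:
    print(f"  {name}: {score:.5f}")
```

The two price-based features together account for roughly 0.93 of the importance, against roughly 0.03 for the three quoted external variables.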
The feature importance analysis for the market-specific models of China’s eight carbon markets (Beijing, Chongqing, Fujian, Guangzhou, Hubei, Shanghai, Shenzhen, and Tianjin) from 2021 to 2024 is presented in Figure 10a–h. The eight subplots (a–h) in Figure 10 illustrate the feature importance for each market, revealing significant variations in the models’ reliance on different features across markets. The Shanghai market model (subplot f) exhibits the strongest dependence on the ma7_price (7-day moving average price) feature, with an importance score of 0.89327, while the Beijing market model (subplot a) relies most heavily on the lag1_price (previous day’s price) feature, with an importance score of 0.58769. In contrast, external factors such as industrial growth rate and coal price generally exhibit low importance across all market models; for instance, in the Tianjin market (subplot h), the importance of the industrial growth rate is only 0.02689. These findings indicate that the models predominantly rely on price-related features, underscoring the dominant role of lagged price features in carbon price forecasting. This also suggests that future research could further explore whether external economic factors can enhance the models’ predictive performance.