3.1. Carbon Factor Forecasting Performance Analysis
The varying volatility of renewable energy integration introduces significant non-stationarity to the carbon factor sequence. To decouple these complexities, the proposed VMD algorithm decomposes the original time series into modes with distinct central frequencies, as illustrated in Figure 2.
The decomposition results reveal clear physical distinctions among the components. The Fluctuation component, located in the top panel, displays high-frequency oscillations with substantial amplitude, reflecting the rapid intraday dynamics caused by the intermittency of wind and solar power generation. In contrast, the Trend component, shown in the middle panel, exhibits a smooth, low-frequency waveform that captures long-term baseline trends driven by regular generation scheduling and seasonal variations. Finally, the Residual component in the bottom panel isolates high-frequency stochastic noise and random perturbations. This “Divide and Conquer” strategy effectively separates deterministic patterns from stochastic noise.
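The “Divide and Conquer” idea can be illustrated with a simplified sketch. VMD itself solves a variational optimization that adaptively locates mode centre frequencies; the hypothetical helper below substitutes a fixed FFT band split (the cutoffs and the synthetic series are assumptions, not the paper’s data or method) purely to show how a series separates into trend, fluctuation, and residual bands:

```python
import numpy as np

def band_split(signal, fs, trend_cutoff, noise_cutoff):
    """Split a series into trend / fluctuation / residual frequency bands.

    A crude FFT stand-in for VMD: VMD adaptively locates mode centre
    frequencies, whereas the cutoffs here are fixed by hand."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)  # cycles per day

    def take(mask):
        masked = np.zeros_like(spec)
        masked[mask] = spec[mask]
        return np.fft.irfft(masked, n=len(signal))

    trend = take(freqs < trend_cutoff)
    fluct = take((freqs >= trend_cutoff) & (freqs < noise_cutoff))
    resid = take(freqs >= noise_cutoff)
    return trend, fluct, resid

# Synthetic carbon-factor-like series: daily cycle + intraday ripple + noise
fs = 48                                 # 30-min resolution, samples per day
t = np.arange(7 * fs) / fs              # one week, in days
x = (300.0 + 40.0 * np.sin(2 * np.pi * t)
     + 10.0 * np.sin(2 * np.pi * 8 * t)
     + np.random.default_rng(0).normal(0.0, 2.0, t.size))
trend, fluct, resid = band_split(x, fs, trend_cutoff=2.0, noise_cutoff=12.0)
assert np.allclose(trend + fluct + resid, x)  # the bands partition the signal
```

Because the three masks partition the spectrum, the components reconstruct the original series exactly, mirroring the additive decomposition property that VMD provides.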
Based on this decomposition, the BSLO algorithm adaptively tailored the CTL model architecture for each specific component to achieve optimal feature extraction. As detailed in Table 4, for the highly volatile Fluctuation component, BSLO assigned a large memory capacity (256 LSTM units) and a higher learning rate to track rapid state changes. Conversely, for the smooth Trend component, the optimizer converged to a configuration with a high number of attention heads (32) but minimal LSTM units (8), suggesting a focus on capturing global dependencies via the attention mechanism. Crucially, for the noise-dominated Residual component, the algorithm adopted a minimalist structure (1 head, 4 LSTM units), applying the “Occam’s Razor” principle to avoid overfitting stochastic noise. This adaptive configuration contributes substantially to the overall prediction accuracy.
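The per-component architectures can be collected into a small configuration map. The sketch below is illustrative: the LSTM-unit and attention-head counts follow the text, while the learning-rate values and the Fluctuation component’s head count are placeholders, since the exact figures appear only in Table 4:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CTLConfig:
    lstm_units: int
    attention_heads: int
    learning_rate: float

# LSTM-unit and head counts follow the text; the learning rates and the
# fluctuation component's head count are placeholders (see Table 4).
BSLO_CONFIGS = {
    "fluctuation": CTLConfig(lstm_units=256, attention_heads=8, learning_rate=1e-3),
    "trend":       CTLConfig(lstm_units=8,   attention_heads=32, learning_rate=1e-4),
    "residual":    CTLConfig(lstm_units=4,   attention_heads=1,  learning_rate=1e-4),
}
```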
- 2. Comparative Justification of Optimization Strategy
To evaluate the effectiveness of the proposed BSLO algorithm in hyperparameter optimization, we conducted a comparative experiment against three standard methods: Particle Swarm Optimization (PSO), Bayesian Optimization (TPE), and Random Search.
A subset of the operating data from 1 May 2025 to 31 May 2025 (approximately 1500 samples) was selected for this analysis. The data were decomposed into three distinct sub-signals via VMD: IMF 1 (Daily Periodic Component), IMF 2 (Trend Component), and Residual (High-Frequency Component). To ensure a consistent comparison under limited computational resources, the population-based algorithms (BSLO and PSO) were configured with the same population size and the same maximum number of iterations. Similarly, Bayesian Optimization was limited to 50 function evaluations. The objective was to minimize the Mean Squared Error (MSE) on the validation set.
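Under these budget constraints, a minimal Random Search baseline might look as follows. The objective here is a synthetic stand-in for the CTL model’s validation MSE, and the search ranges are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def validation_mse(params):
    """Stand-in objective: in the experiment this would be the CTL model's
    validation MSE; a synthetic bowl-shaped surface keeps the sketch
    self-contained."""
    learning_rate, lstm_units = params
    return (np.log10(learning_rate) + 3.0) ** 2 + ((lstm_units - 64) / 64.0) ** 2

def random_search(n_evals=50):
    """Budget-matched baseline: 50 evaluations, mirroring the population-based
    optimizers' total budget.  Search ranges are assumptions."""
    best, history = np.inf, []
    for _ in range(n_evals):
        params = (10.0 ** rng.uniform(-5, -1), int(rng.integers(4, 257)))
        best = min(best, validation_mse(params))
        history.append(best)
    return history

curve = random_search()
assert curve == sorted(curve, reverse=True)  # running best never increases
```

Plotting `curve` against the iteration index yields the kind of convergence trace compared in Figure 3.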
Figure 3 presents the convergence curves of the different algorithms.
Daily Periodic Component (IMF 1): IMF 1 represents the dominant 24 h cyclic pattern of the carbon factor. As shown in Figure 3a, BSLO (solid red line) demonstrates efficient search capability, achieving a final MSE of 0.0380, which is lower than both the Random Search baseline (0.0415) and PSO (0.0386). This indicates BSLO’s superior ability in capturing the primary periodicity of the data.
Trend Component (IMF 2): IMF 2 reflects the long-term evolutionary trend. In Figure 3b, the PSO algorithm (dashed blue line) tends to flatten around iteration 15 (MSE ≈ 0.0395), suggesting entrapment in a local optimum. In contrast, BSLO continues to optimize throughout the process, reaching a lower error of 0.0350, demonstrating its robustness in tracking evolutionary trends.
High-Frequency Component (Residual): The Residual contains stochastic fluctuations and noise. For this complex component, shown in Figure 3c, BSLO achieved the best performance with a final MSE of 0.7028, significantly outperforming Bayesian Optimization (0.7811) and Random Search (0.7593).
The results indicate that under the same number of iterations, BSLO exhibits stable convergence and superior optimization ability across periodic, trend, and stochastic components compared to the benchmarked methods. This justifies the utilization of BSLO for the model’s hyperparameter tuning.
- 3. Ablation Study of Key Components
To justify the complexity of the proposed pipeline and quantify the contribution of each key component, we conducted an ablation study. We designed three variants of the model for comparison:
CTL: The standalone deep learning model (CNN-Transformer-LSTM) without decomposition or optimization, serving as the base predictor.
CTL-BSLO: The base predictor optimized by BSLO, but without signal decomposition (raw data input).
VMD-CTL: The predictor with VMD, but using fixed hyperparameters (without BSLO optimization).
The results across 1-step, 4-step, and 8-step horizons are presented in Table 5. As shown in Table 5, both VMD and BSLO contribute significantly to the prediction accuracy:
Impact of Decomposition (VMD): Comparing the CTL and VMD-CTL variants, the introduction of VMD results in a drastic reduction in error. For the 1-step horizon, the MAPE decreases from 21.59% (CTL) to 12.16% (VMD-CTL). This confirms that decomposing the non-stationary carbon intensity signal into stable sub-modes is the primary factor in improving model performance.
Impact of Optimization (BSLO): Comparing VMD-CTL with the Proposed (VMD-BSLO-CTL) model (MAPE 9.15%), the BSLO optimization further reduces the MAPE by approximately 3.01 percentage points. This validates that manual or fixed parameter settings are insufficient for the diverse sub-signals generated by VMD, and that adaptive optimization is necessary to fully unlock the model’s potential.
Synergistic Effect: The proposed framework, which combines both components, achieves the lowest RMSE and MAPE across all horizons. This demonstrates that the complexity introduced by the VMD-BSLO pipeline translates directly into tangible performance gains.
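The RMSE and MAPE figures used throughout the ablation comparison follow the standard definitions, which can be sketched as below (the sample values are illustrative, not the paper’s data):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Illustrative values only, not the paper's data
y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 180.0, 400.0])
print(rmse(y_true, y_pred))  # ~12.91
print(mape(y_true, y_pred))  # ~6.67
```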
- 4. Comprehensive Evaluation of Prediction Performance
The final prediction results on the test set were obtained by aggregating the forecasts of the three decomposed components.
Figure 4 illustrates the single-step prediction performance. Visually, the proposed model exhibits a strong capability to track the ground truth curve, accurately capturing both the smooth transitions during valley periods and the sharp inflection points during ramping events. While minor deviations persist at extreme peaks due to the inherent stochasticity of high-frequency noise, the overall fit remains very close.
To further evaluate the consistency and error distribution of the model, Figure 5 presents the density scatter plot of the predicted values against the ground truth. The color gradient indicates data point density: dark purple marks high-density regions and dark orange marks low-density regions. The vast majority of data points are tightly clustered along the ideal diagonal, yielding a coefficient of determination (R²) of 0.991. This high goodness of fit confirms that the model has captured the underlying physical patterns without significant systematic bias. Although a few scattered outliers exist in the low-density orange regions, the overall distribution demonstrates the robustness of the model across different carbon intensity levels. Quantitatively, the model achieves an RMSE of 10.15 and a Mean Absolute Percentage Error of 9.15% for the single-step horizon.
To explicitly quantify the model’s prediction reliability, we conducted probabilistic forecasting using the Bootstrap residual method. As shown in Figure 6, the 95% confidence interval (blue shaded area) tightly envelops the ground truth curve. The narrow bandwidth of the confidence interval indicates that the model maintains high certainty and low variance even during peak fluctuations, further validating the stability of the BSLO-optimized parameters.
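A minimal sketch of the Bootstrap residual idea, assuming residuals are resampled i.i.d. from a validation set and added to the point forecast before taking empirical quantiles (the noise level and forecast values below are synthetic):

```python
import numpy as np

def bootstrap_interval(point_forecast, residuals, level=0.95, n_boot=2000, seed=0):
    """Bootstrap-residual interval: resample validation residuals, add them
    to the point forecast, and take empirical quantiles per time step."""
    rng = np.random.default_rng(seed)
    sims = point_forecast[None, :] + rng.choice(residuals, size=(n_boot, point_forecast.size))
    lo = np.quantile(sims, (1.0 - level) / 2.0, axis=0)
    hi = np.quantile(sims, 1.0 - (1.0 - level) / 2.0, axis=0)
    return lo, hi

# Synthetic residuals and forecast (24 h at 30-min resolution), illustrative only
residuals = np.random.default_rng(1).normal(0.0, 5.0, 500)
forecast = np.linspace(200.0, 300.0, 48)
lo, hi = bootstrap_interval(forecast, residuals)
assert np.all(hi > lo)
```

The interval width is driven entirely by the spread of the historical residuals, which is why a well-fitted model yields the narrow band seen in Figure 6.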
To validate the model’s reliability for the MPC rolling horizon, the prediction performance was evaluated across three different time steps: 1 step (30 min), 4 steps (2 h), and 8 steps (4 h).
Table 6 benchmarks the proposed VMD-BSLO-CTL model against two distinct categories: (1) Naïve Baselines, including Seasonal Naïve and Persistence models, introduced to rigorously test against the data’s inherent predictability; and (2) Advanced Forecasting Models, including LSTM, TCN-LSTM-SVM, and DLinear.
The quantitative results indicate the performance stability of the proposed framework, particularly in multi-step forecasting:
Short-term Inertia vs. True Learning: At the 1-step horizon, the Persistence model achieves a remarkably low MAPE of 12.71%. This confirms the high inertial autocorrelation of the grid carbon intensity. However, the proposed model effectively breaks through this “inertia barrier,” further reducing the MAPE to 9.15%, demonstrating its ability to capture high-frequency fluctuations that simple autoregression misses.
Long-term Robustness: The contrast becomes stark as the horizon extends. The Seasonal Naïve model shows a consistently high error (>53%), indicating the lack of simple 24 h periodicity. More critically, the performance of the Persistence model collapses at the 8-step horizon, with MAPE skyrocketing to 102.14% and RMSE to 108.87. In comparison, the proposed model demonstrates exceptional robustness, maintaining a MAPE of 18.39% even at the 4 h horizon. While other deep learning models (e.g., TCN-LSTM-SVM at 32.28%) also degrade, the proposed decomposition-ensemble strategy effectively mitigates error accumulation, ensuring reliable forward-looking signals for scheduling optimization.
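For reference, the two naïve baselines admit one-line implementations. The sketch below assumes 30-min resolution (48 samples per day) for the Seasonal Naïve season length:

```python
import numpy as np

def persistence(history, horizon):
    """Persistence: repeat the last observed value across the horizon."""
    return np.full(horizon, history[-1])

def seasonal_naive(history, horizon, season=48):
    """Seasonal Naive: repeat the value observed one season (24 h at 30-min
    resolution) before each forecast step."""
    return np.array([history[-season + (h % season)] for h in range(horizon)])

history = np.arange(96, dtype=float)  # two synthetic days of 30-min samples
print(persistence(history, 4))        # [95. 95. 95. 95.]
print(seasonal_naive(history, 4))     # [48. 49. 50. 51.]
```

The Persistence forecast is constant over the horizon, which explains its collapse at long horizons: any ramp in the true signal accumulates directly into error.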
To further substantiate the generalization capability of the proposed framework, we conducted an additional evaluation using the North Scotland dataset, a region characterized by high wind penetration and extreme load volatility. Table 7 presents the performance comparison across varying prediction horizons (1-step, 4-step, and 8-step), using RMSE, MAE, and R² as key metrics.
The results reveal a critical divergence in model performance as the prediction horizon extends:
Short-term Competitiveness: In the single-step (1-Step) scenario, linear-based models such as DLinear and the Persistence baseline demonstrate competitive performance, particularly in MAE (2.57 gCO2/kWh and 3.43 gCO2/kWh, respectively). This is attributed to the strong autocorrelation inherent in the high-resolution sampling, which simple autoregressive mechanisms can exploit.
Long-term Degradation of Baselines: However, as the horizon increases to 4 and 8 steps, these benchmark models suffer from catastrophic degradation due to the accumulation of recursive errors. For instance, the R² of DLinear plummets from 0.832 (Step 1) to a mere 0.054 at Step 8, indicating an inability to capture long-term dependencies in such a volatile environment. Similarly, standard deep learning models like LSTM and TCN-LSTM-SVM show significant performance decay, with RMSEs rising above 23 gCO2/kWh.
Superior Robustness of Proposed Model: In sharp contrast, the proposed VMD-BSLO-CTL strategy demonstrates exceptional long-term stability. Even at the 8-step horizon, the model maintains an RMSE of 16.61 gCO2/kWh and an R² of 0.828, significantly outperforming the second-best model. This resilience confirms that the VMD effectively isolates high-frequency noise, while the BSLO-optimized architecture accurately models the complex non-linear features that simpler linear models fail to capture. This verifies that the proposed method is not only accurate for immediate dispatch but also highly reliable for longer-term look-ahead scheduling in complex energy systems.
3.2. MPC-Based Scheduling Optimization Analysis
The performance of the proposed scheduling strategy is evaluated through a comprehensive multi-dimensional assessment, as summarized in Figure 7. This comparative analysis reveals the complex trade-offs inherent in high-renewable and time-of-use environments. As shown by the blue dotted line in Figure 7, the Single-Objective MPC strategy, which strictly prioritizes grid stability (minimizing load variance and peak-valley difference), yields the smoothest load curve. However, this comes at a severe penalty: to flatten the grid curve, the algorithm forces EV charging into expensive or high-carbon time windows that do not align with the optimal midday solar generation. Consequently, its economic cost and carbon emissions actually exceed those of the uncoordinated baseline. In contrast, the proposed Multi-Objective MPC achieves an optimal balance. By dynamically weighting user incentives against grid needs, it successfully captures the “green and cheap” charging windows. Compared to the unoptimized baseline, the proposed framework reduces economic costs by 4.17% and carbon emissions by 8.82%, while simultaneously reducing the peak-valley difference by 6.46% and load variance by 11.34%, demonstrating that appropriate scheduling can unlock economic value while supporting the grid.
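A weighted-sum scalarization of the four objectives, in the spirit of the multi-objective MPC described above, might be sketched as follows; the weight names, normalization, and toy profiles are assumptions, not the paper’s exact formulation:

```python
import numpy as np

def multi_objective_cost(net_load, price, carbon, weights):
    """Weighted-sum scalarization of the four scheduling objectives.
    Weight keys and normalization are illustrative assumptions."""
    econ = float(np.sum(net_load * price))        # economic cost
    co2 = float(np.sum(net_load * carbon))        # carbon emissions
    pv = float(net_load.max() - net_load.min())   # peak-valley difference
    var = float(np.var(net_load))                 # load variance
    return (weights["econ"] * econ + weights["carbon"] * co2
            + weights["peak_valley"] * pv + weights["variance"] * var)

# Grid-stability-only weights prefer the flat profile over the peaky one,
# even though both deliver the same energy at the same prices.
price = np.ones(4)
carbon = np.ones(4)
flat = np.array([1.0, 1.0, 1.0, 1.0])
peaky = np.array([0.0, 0.0, 0.0, 4.0])
w = {"econ": 0.0, "carbon": 0.0, "peak_valley": 1.0, "variance": 1.0}
assert multi_objective_cost(flat, price, carbon, w) < multi_objective_cost(peaky, price, carbon, w)
```

With nonzero cost and carbon weights, the same scalarization instead rewards shifting load into cheap, low-carbon windows, which is the trade-off Figure 7 quantifies.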
To analyze the performance enhancement mechanism in depth under the multi-objective strategy, Figure 8 illustrates the macroscopic effect of load component redistribution over time, while Figure 9 reveals the micro-dispatch mechanism behind the fleet’s collective benefits. Unlike the baseline scenario, where charging load merely overlaps with existing peaks, the dispatchable load under the multi-objective strategy demonstrates highly intelligent time-shifting behavior and dynamic objective prioritization:
Grid Support during Morning Peak: During the 06:00–07:00 base-load morning peak, the optimizer prioritizes the essential peak-shaving objective over instantaneous cost. This compels V2G vehicles to execute net discharge (red blocks in Figure 8). Figure 9 provides the micro-validation of this response: core V2G vehicles (e.g., dark traces) actively initiate SOC decline during this early critical grid period.
Coordinated Green Absorption and Conflict Navigation (11:00–14:00): Subsequently, the charging load is clustered within this optimal window. This period offers low carbon intensity and economic favorability, but the concentrated charging also presents a potential high-stress zone for the grid. The coordinated SOC trajectories (Figure 9) confirm that vehicles utilize this cheap, low-carbon midday window for rapid replenishment after their morning discharge, while also actively avoiding the charging peak. The successful dispatch proves that the strategy actively managed this stress, ensuring the cumulative net load curve (dark blue line in Figure 8) remains stable while maximizing green energy absorption.
This precise coordination, adhering strictly to safety constraints and final energy requirements, validates the strategy’s ability to achieve simultaneous optimization across economic, grid, and environmental objectives.
Furthermore, to address the challenge of uncertainty in real-world operations—where perfect forecasting is unattainable—we evaluated the robustness of the proposed framework against prediction errors. A robustness test was conducted by introducing Gaussian noise with a margin of ±15% to the input carbon factor sequence during the optimization phase. Crucially, while the MPC decisions were made based on these noisy forecasts, the final performance metrics were calculated using the actual (ground truth) carbon intensity and electricity prices. As presented in Table 8, the system demonstrates strong resilience to input uncertainty. Despite the significant noise, the performance degradation is minimal: the deviation in economic cost is only +0.19%, and the increase in carbon emissions is limited to +1.01%. These results indicate that the proposed multi-objective optimization framework maintains high performance stability and engineering reliability even when operating with imperfect information.
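The perturbation protocol can be sketched as below. Interpreting the ±15% margin as roughly a 3-sigma bound of zero-mean multiplicative Gaussian noise is an assumption, and the greedy one-slot “scheduler” merely illustrates deciding on noisy data while scoring against the truth:

```python
import numpy as np

def noisy_forecast(true_carbon, margin=0.15, seed=0):
    # Assumption: treat the +/-15% margin as ~3 sigma of multiplicative
    # zero-mean Gaussian noise on the carbon factor sequence.
    rng = np.random.default_rng(seed)
    sigma = margin / 3.0
    return true_carbon * (1.0 + rng.normal(0.0, sigma, true_carbon.shape))

# Decide on the noisy forecast, score on the ground truth.
true_carbon = np.array([300.0, 250.0, 100.0, 280.0])  # gCO2/kWh, illustrative
noisy = noisy_forecast(true_carbon)
slot = int(np.argmin(noisy))           # scheduler picks the "cleanest" slot
emissions = true_carbon[slot] * 10.0   # score a 10 kWh charge with true data
```

Because the ranking of clean versus dirty hours usually survives moderate noise, decisions made on perturbed forecasts remain near-optimal, which is consistent with the small deviations reported in Table 8.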
Having established the system’s resilience to external data uncertainty, it is equally critical to examine the impact of internal parameter selection, specifically the objective weights. To evaluate the robustness of the proposed strategy against weight variations and understand system behavior under varying preferences, a sensitivity analysis was conducted based on the Pareto Frontier. We categorized the optimization objectives into two conflicting groups: “Operational Costs” (Economic Cost and Carbon Emissions) and “Grid Stability” (Peak-Valley Difference and Load Variance). Within these groups, fixed internal ratios were maintained to reflect specific scenario characteristics: a 5:2 ratio was applied between economic cost and carbon emissions to represent their relative market importance, while a 1:1 ratio was assigned to peak-valley difference and load variance as they equally describe grid smoothness.
A preference parameter was then introduced as a weighting lever to dynamically adjust the trade-off between the two groups: as the parameter increases, the weight of the “Operational Costs” group is scaled up while the weight of the “Grid Stability” group is scaled down correspondingly, so the optimizer progressively prioritizes cost and emission reductions over grid flatness.
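The weighting scheme can be sketched as follows; the parameter name `alpha` and the complementary (1 − alpha) scaling are assumptions standing in for the paper’s symbol, while the internal 5:2 and 1:1 ratios follow the text:

```python
import numpy as np

def preference_weights(alpha):
    """Map the preference parameter (named `alpha` here as an assumption)
    to the four objective weights.  Internal ratios follow the text:
    5:2 for economic cost vs carbon, 1:1 for peak-valley vs variance."""
    return {
        "econ": alpha * 5.0 / 7.0,
        "carbon": alpha * 2.0 / 7.0,
        "peak_valley": (1.0 - alpha) * 0.5,
        "variance": (1.0 - alpha) * 0.5,
    }

weights = [preference_weights(a) for a in np.linspace(0.1, 0.9, 5)]
# The scalarization stays normalized for every preference setting
assert all(abs(sum(w.values()) - 1.0) < 1e-9 for w in weights)
```

Sweeping the parameter and re-solving the schedule at each setting traces out the Pareto frontier analyzed below.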
Figure 10 illustrates the resulting trends, plotting Economic Cost (a) and Carbon Emissions (b) against Grid Stress.
The results in Figure 10 reveal clear patterns:
Clear Trade-off: A natural conflict exists between minimizing operational costs and maintaining a perfectly flat grid load. Reducing costs and emissions drives the system to cluster EV charging during low-price, low-carbon midday windows. While operationally efficient, this concentration reduces the degree of load leveling compared to a strict grid-prioritized strategy. Conversely, forcing a perfectly flat load curve requires shifting demand to expensive or high-carbon periods, leading to a sharp increase in costs.
Optimal Balance: The strategy selected in this paper (marked by the red star) is located at the “knee point” of the Pareto curve. This point represents the system’s “sweet spot,” achieving the majority of potential cost and carbon savings while maintaining grid stress at an acceptable level. Deviating from this point would either result in diminishing returns in savings or cause a disproportionate penalty in grid stability, confirming that the chosen parameters capture the most efficient operating point.
Finally, to complete the robustness verification, we extend the evaluation from “data and parameters” to the critical dimension of “user behavior uncertainty,” specifically the risk of job interruption (e.g., unexpected early departure). While the previous tests validated resilience against signal noise and weight variations, real-world operations must also account for users disconnecting earlier than scheduled. To address this, an additional comparative test was conducted using the proposed Risk-Averse Robust MPC, which incorporates an explicit Gaussian hazard penalty to model departure anxiety.
Table 9 presents the performance deviation of this robust strategy compared to the baseline. The results reveal a distinct “Front-Loading” mechanism: the system actively builds a “Safety Energy Buffer” by accelerating charging 2–3 h before the scheduled departure. As shown in the table, this active defense capability incurs a marginal “Price of Robustness”: the economic cost increases by only 0.81% (1591.92 to 1604.87 CNY), and carbon emissions rise by 1.57%. Notably, this safety enhancement does not compromise grid stability; the peak-valley difference remains unchanged, while load variance actually improves slightly (−2.48%) due to the smoother distribution of charging power across the safety window. This confirms that the proposed framework can secure high service reliability against behavioral uncertainty with negligible economic trade-offs.
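The Gaussian hazard penalty driving this front-loading behaviour might be sketched as below; the functional form, sigma, and scale are illustrative assumptions, not the paper’s calibrated values:

```python
import numpy as np

def departure_hazard_penalty(soc, target_soc, t, t_departure, sigma=1.0, scale=100.0):
    """Risk-averse penalty on the SOC shortfall, weighted by a Gaussian
    hazard centred on the scheduled departure time (hours).  The form and
    parameters are illustrative, not the paper's calibrated values."""
    hazard = np.exp(-0.5 * ((t - t_departure) / sigma) ** 2)
    shortfall = max(0.0, target_soc - soc)
    return float(scale * hazard * shortfall)

# The same 20% shortfall costs far more one hour before departure than five
# hours before, which pushes the optimizer to front-load charging.
early = departure_hazard_penalty(soc=0.6, target_soc=0.8, t=13.0, t_departure=18.0)
late = departure_hazard_penalty(soc=0.6, target_soc=0.8, t=17.0, t_departure=18.0)
assert late > early
```

Because the hazard weight ramps up as departure approaches, the cheapest way to avoid the penalty is to close the SOC gap early, producing exactly the “Safety Energy Buffer” behaviour observed in Table 9.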