An Intelligent Two-Stage Dispatch Framework for Cost and Carbon Reduction in Multi-Energy Virtual Power Plants
Abstract
1. Introduction
2. Review of Related Work
2.1. Stochastic Optimization and Game-Theoretic Approaches
2.2. Deep Reinforcement Learning Breakthroughs
2.3. Research Gaps and Our Contributions
2.4. Distinction from Existing DRL-MPC Frameworks
3. Modeling the VPP System
3.1. Physical Model of Energy Resources
3.2. System Architecture and Parameters
3.3. Operational Constraints
4. DRL Optimization Scheduling Algorithm
4.1. Overview of the Two-Stage Scheduling Framework
- H = 4 is the prediction and control horizon.
- C_op(t + k) is the operating cost at time t + k, comprising gas turbine fuel costs, equipment maintenance costs, and the costs or revenues of grid interaction.
- E_carbon(t + k) denotes the carbon emissions at time t + k.
- λ is a weighting coefficient for carbon emissions, linking environmental cost to the economic objective.
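The definitions above suggest a rolling-horizon objective that sums the operating cost and the λ-weighted carbon emissions over the H look-ahead steps. The following is a minimal sketch under that reading, not the authors' implementation: the function name `mpc_objective`, the index range k = 0 to H − 1, and all numeric forecasts are illustrative assumptions.

```python
from typing import Sequence

H = 4  # prediction and control horizon (value given in the text)


def mpc_objective(op_cost: Sequence[float],
                  carbon: Sequence[float],
                  lam: float,
                  horizon: int = H) -> float:
    """Sum op_cost[k] + lam * carbon[k] over the horizon.

    Assumption: k runs from 0 to horizon - 1; the source does not
    state the index range explicitly.
    """
    if len(op_cost) < horizon or len(carbon) < horizon:
        raise ValueError("need at least `horizon` forecast steps")
    return sum(op_cost[k] + lam * carbon[k] for k in range(horizon))


# Made-up forecasts for the next four steps (illustration only).
costs = [100.0, 120.0, 110.0, 90.0]    # operating cost per step
emissions = [30.0, 35.0, 32.0, 28.0]   # carbon emissions per step
print(mpc_objective(costs, emissions, lam=0.02))
```

A larger λ makes the same emission forecast weigh more heavily against cost, which is how the coefficient ties the environmental term to the economic objective.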
4.2. Algorithm Improvements and Optimizations
4.3. Algorithm Framework Design
5. Case Analysis and Result Discussion
5.1. Simulation Environment and Parameter Settings
5.2. Convergence Performance Analysis
5.3. Multi-Objective Optimization Effects
5.4. Scheduling Strategy Analysis
5.5. Sensitivity Analysis
5.6. Reward Weight Sensitivity Analysis
5.7. Online Deployment and Edge Computing Feasibility
6. Conclusions and Future Prospects
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- International Renewable Energy Agency (IRENA). Renewable Capacity Statistics 2022; International Renewable Energy Agency (IRENA): Abu Dhabi, United Arab Emirates, 2022.
- International Renewable Energy Agency (IRENA). Renewable Energy Prospects for the European Union, January 2018. Available online: https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2018/Feb/IRENA_REmap_EU_2018.pdf (accessed on 8 February 2026).
- International Energy Agency (IEA). Net Zero by 2050: A Roadmap for the Global Energy Sector; International Energy Agency (IEA): Paris, France, 2021.
- Mohammadi, H.; Karimi, H.; Liu, L. A review on virtual power plant for energy management. Sustain. Energy Technol. Assess. 2021, 47, 101370.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
- Stefan, R.; Johannes, M.; Joachim, G.; Oliver, B. Computational intelligence based on optimization of hierarchical virtual power plants. Energy Syst. 2021, 12, 517–544.
- Jadidoleslam, M. Risk-constrained participation of virtual power plants in day-ahead energy and reserve markets based on multi-objective operation of active distribution network. Sci. Rep. 2025, 15, 9145.
- International Energy Agency (IEA). Global EV Outlook 2023; International Energy Agency (IEA): Paris, France, 2023.
- Shui, J.; Peng, D.; Zeng, H.; Song, Y.; Yu, Z.; Yuan, X.; Shen, C. Optimal scheduling of multiple entities in virtual power plant based on the master-slave game. Appl. Energy 2024, 376, 124286.
- Xue, L.; Zhang, Y.; Wang, J.; Li, H.; Li, F. Privacy-preserving multi-level co-regulation of VPPs via hierarchical safe deep reinforcement learning. Appl. Energy 2024, 371, 123654.
- Naughton, J.; Wang, H.; Riaz, S.; Cantoni, M.; Mancarella, P. Optimization of multi-energy virtual power plants for providing multiple market and local network services. Electr. Power Syst. Res. 2020, 189, 106775.
- Zhao, H.; Zhang, C.; Zhao, Y.; Wang, X. Low-Carbon Economic Dispatching of Multi-Energy Virtual Power Plant with Carbon Capture Unit Considering Uncertainty and Carbon Market. Energies 2022, 15, 7225.
- Han, Z.; Zhang, Y.; Li, B. Two-stage Optimization Scheduling of Cold, Heat and Power Virtual Power Plant Based on Multi-scenario Technology. Electr. Meas. Instrum. 2022, 59, 174–180.
- Lin, L.; Guan, X.; Peng, Y.; Wang, N.; Maharjan, S.; Ohtsuki, T. Deep Reinforcement Learning for Economic Dispatch of Virtual Power Plant in Internet of Energy. IEEE Internet Things J. 2020, 7, 3288–3301.
- Wei, X.; Chan, K.W.; Wang, G.; Hu, Z.; Zhu, Z.; Zhang, X. Robust preventive and corrective security-constrained OPF for worst contingencies with the adoption of VPP: A safe reinforcement learning approach. Appl. Energy 2025, 380, 124970.
- Yang, J.; Yang, X.; Yu, T. Multi-Unmanned Aerial Vehicle Confrontation in Intelligent Air Combat: A Multi-Agent Deep Reinforcement Learning Approach. Drones 2024, 8, 382.
- Li, Y.; Chang, W.; Yang, Q. Deep reinforcement learning based hierarchical energy management for virtual power plant with aggregated multiple heterogeneous microgrids. Appl. Energy 2025, 382, 125333.
- Wang, J.; Guo, C.; Yu, C.; Liang, Y. Virtual power plant containing electric vehicles scheduling strategies based on deep reinforcement Learning. Electr. Power Syst. Res. 2022, 205, 107714.
- Guo, G.; Gong, Y. Multi-Microgrid Energy Management Strategy Based on Multi-Agent Deep Reinforcement Learning with Prioritized Experience Replay. Appl. Sci. 2023, 13, 2865.
- Domínguez-Barbero, D.; García-González, J.; Sanz-Bobi, M.Á.; García-Cerrada, A. Energy management of a microgrid considering nonlinear losses in batteries through Deep Reinforcement Learning. Appl. Energy 2024, 368, 123435.
- Pei, Y.; Ye, K.; Zhao, J.; Yao, Y.; Su, T.; Ding, F. Visibility-enhanced model-free deep reinforcement learning algorithm for voltage control in realistic distribution systems using smart inverters. Appl. Energy 2024, 372, 123758.
- Tang, X.; Wang, J. Deep Reinforcement Learning-Based Multi-Objective Optimization for Virtual Power Plants and Smart Grids: Maximizing Renewable Energy Integration and the Grid Efficiency. Processes 2025, 13, 1809.
- Yan, Q.; Zhang, M.; Lin, H.; Li, W. Two-stage adjustable robust optimal dispatching model for multi-energy virtual power plant considering multiple uncertainties and carbon trading. J. Clean. Prod. 2022, 336, 130400.
- Axehill, D.; Besselmann, T.; Raimondo, D.M.; Morari, M. A parametric branch and bound approach to suboptimal explicit hybrid MPC. Automatica 2014, 50, 240–246.
- Manwell, J.F.; McGowan, J.G.; Rogers, A.L. Wind Energy Explained: Theory, Design and Application, 2nd ed.; Wiley: Hoboken, NJ, USA, 2010.
- Skoplaki, E.; Palyvos, J.A. On the temperature dependence of photovoltaic module electrical performance: A review of efficiency/power correlations. Sol. Energy 2009, 83, 614–624.
- Chicco, G.; Mancarella, P. Distributed multi-generation: A comprehensive view. Renew. Sustain. Energy Rev. 2009, 13, 535–551.
- Lokupitiya, E.; Paustian, K. Agricultural soil greenhouse gas emissions: A review of National Inventory Methods. J. Environ. Qual. 2006, 35, 1413–1427.
- Wächter, A.; Biegler, L.T. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 2006, 106, 5–57.
- National Solar Radiation Database (NSRDB). Data for North China Region. 2022. Available online: https://nsrdb.nrel.gov/ (accessed on 10 February 2025).
- Fan, J.; Zhang, J.; Yuan, L.; Yan, R.; He, Y.; Zhao, W.; Nin, N. Deep Low-Carbon Economic Optimization Using CCUS and Two-Stage P2G with Multiple Hydrogen Utilizations for an Integrated Energy System with a High Penetration Level of Renewables. Sustainability 2024, 16, 5722.
- Samende, C.; Fan, Z.; Cao, J.; Fabián, R.; Baltas, G.N.; Rodríguez, P. Battery and Hydrogen Energy Storage Control in a Smart Energy Network with Flexible Energy Demand Using Deep Reinforcement Learning. Energies 2023, 16, 6770.





| Study | Method | Strengths | Weaknesses |
|---|---|---|---|
| Lin et al. [14] | DDPG + Edge | Low computation latency | Simplified PV/Wind models |
| Wei et al. [15] | L-SAC | Robust to contingencies | Slow convergence (>350 episodes) |
| Li et al. [17] | MADDPG | Multi-microgrid coordination | No carbon emission optimization |
| Wang et al. [18] | SAC-TD3 hybrid | EV strategy optimization | Ignores thermal-electrical coupling |
| Domínguez-Barbero et al. [20] | TD3 | Nonlinear battery modeling | Single-objective (cost minimization) |

| Aspect | Typical SAC/TD3-MPC (e.g., Axehill et al. [24]) | Proposed MADDPG–MPC Framework |
|---|---|---|
| DRL Core | Single-agent (SAC, TD3) | Multi-agent (MADDPG) for distributed entity coordination |
| Model Integration | Often uses standard, linearized equipment models | Integrates high-fidelity, nonlinear physical models (wind turbulence, PV thermal coupling, variable storage efficiency) |
| Feedback Mechanism | MPC corrects actions; DRL policy is static post-training | Bidirectional feedback: MPC deviations are fed into MADDPG’s experience replay for online policy refinement |
| Objective Handling | Often single-objective or fixed weighted sum | Explicit multi-objective reward with AHP-derived weights, synergizing cost, carbon, and renewable utilization |
| Real-time Adaptation | MPC handles short-term deviations | Combines multi-agent long-term learning with MPC short-term robustness, enhanced by adaptive exploration |

| Device Type | Rated Power/Capacity | Efficiency/Characteristics | Carbon Emission Coefficient (kg/MWh) | Response Time (s) |
|---|---|---|---|---|
| Wind turbine | 1.5 MW | N/A (determined by wind speed) | 0 | 10–30 |
| Photovoltaic power station | 2.0 MWp | Efficiency ≈18% (temperature-dependent) | 0 | 5–15 |
| Gas turbine | 5.0 MW | Electrical efficiency 35%, electric-to-heat ratio 0.6 | 0.35 | 5–60 |
| Battery energy storage | 2.0 MW/2 MWh | Charge 90%, discharge 85% | 0 | <1 |
| Thermal energy storage | 1.5 MWth/3 MWh | Charge 95%, discharge 95% | 0.35 | <1 |
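
The charge and discharge efficiencies in the table imply a round-trip efficiency for each storage unit. The sketch below encodes the two storage rows in a small data class and computes that product. It assumes constant efficiencies purely for illustration; the class and field names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Storage:
    """Storage parameters copied from the device table (constant efficiencies assumed)."""
    name: str
    power_mw: float
    capacity_mwh: float
    eta_charge: float
    eta_discharge: float

    @property
    def round_trip(self) -> float:
        # Fraction of drawn energy that can be returned after one full cycle.
        return self.eta_charge * self.eta_discharge

    def delivered_mwh(self, drawn_mwh: float) -> float:
        """Energy returned after drawing `drawn_mwh` to charge, then discharging."""
        return drawn_mwh * self.round_trip


battery = Storage("battery", 2.0, 2.0, 0.90, 0.85)
thermal_store = Storage("1.5 MWth storage unit", 1.5, 3.0, 0.95, 0.95)

for unit in (battery, thermal_store):
    print(f"{unit.name}: round-trip efficiency {unit.round_trip:.4f}")
```

Under these assumptions the battery returns roughly three quarters of the energy drawn to charge it, while the 1.5 MWth unit returns about nine tenths, which is one reason a dispatcher may cycle the two devices differently.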

| Algorithm | Episodes Required for Convergence (Mean ± SD) | Final Reward (Mean ± SD) |
|---|---|---|
| Improved DDPG | 1350 ± 42 | −1250 ± 18 |
| Traditional DDPG | 1600 ± 68 | −1190 ± 25 |
| SAC | 1450 ± 55 | −1220 ± 22 |

| Model | Episodes Required for Convergence | Convergence Speed Improvement |
|---|---|---|
| Basic DDPG (A) | 1600 | - |
| A + PER (B) | 1500 | 6.25% |
| A + Adaptive noise (C) | 1530 | 4.38% |
| Complete model (D) | 1350 | 15.63% |
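
The improvement percentages in the ablation table are consistent with the relative reduction in episodes with respect to the basic DDPG baseline, (1600 − N) / 1600. The following sketch reproduces them; the function name is hypothetical, and `Decimal` with half-up rounding is used only so that the two-decimal values match the table exactly.

```python
from decimal import Decimal, ROUND_HALF_UP

BASELINE_EPISODES = 1600  # Basic DDPG (A), from the table


def speedup_percent(episodes: int, baseline: int = BASELINE_EPISODES) -> Decimal:
    """Relative reduction in episodes to convergence, in percent (2 decimals, half-up)."""
    raw = Decimal(baseline - episodes) / Decimal(baseline) * 100
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


variants = {
    "A + PER (B)": 1500,
    "A + Adaptive noise (C)": 1530,
    "Complete model (D)": 1350,
}
for name, episodes in variants.items():
    print(f"{name}: {speedup_percent(episodes)}%")
```

The individual gains of B and C (6.25% and 4.38%) add up to less than the 15.63% of the complete model, so the table suggests that the two modifications reinforce each other rather than merely stacking.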

| Algorithm | Operating Costs (USD) | Carbon Emissions (kg) | Absorption Rate (%) | Computation Time (s) | Cost MAPE (%) |
|---|---|---|---|---|---|
| PSO | 12,568 | 876 | 78.5 | 45.6 | 7.8 |
| DE | 11,842 | 901 | 81.2 | 38.9 | 6.5 |
| Traditional DDPG | 11,256 | 765 | 85.3 | 22.4 | 4.5 |
| Improved DDPG | 10,892 | 721 | 88.6 | 25.7 | 3.2 |

| Carbon Price (USD/t CO2) | Gas Turbine Operation (h) | Carbon Emissions (kg) | Energy Storage Cycle Capacity (MWh) | Operating Costs (USD) |
|---|---|---|---|---|
| 20 (Benchmark) | 8 | 721 | 6.0 | 10,892 |
| 50 | 5 | 557 | 1.153 | 11,410 |
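
The trade-off in the carbon price table can be made explicit with a little arithmetic. The sketch below computes the relative change in emissions and operating cost between the two rows, along with the carbon charge implied by price × emissions. Whether the reported operating costs already include such a charge is not stated in the table, so that last figure is an assumption-laden illustration; all function names are hypothetical.

```python
def pct_change(new: float, old: float) -> float:
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


def carbon_charge_usd(price_usd_per_t: float, emissions_kg: float) -> float:
    """Carbon price times emissions, converting kg to tonnes."""
    return price_usd_per_t * emissions_kg / 1000.0


# Values copied from the table (benchmark row vs. 50 USD/t row).
benchmark = {"price": 20.0, "emissions_kg": 721.0, "cost_usd": 10892.0}
high_price = {"price": 50.0, "emissions_kg": 557.0, "cost_usd": 11410.0}

d_emis = pct_change(high_price["emissions_kg"], benchmark["emissions_kg"])
d_cost = pct_change(high_price["cost_usd"], benchmark["cost_usd"])
print(f"emissions change: {d_emis:.2f}%")
print(f"operating cost change: {d_cost:.2f}%")

for row in (benchmark, high_price):
    charge = carbon_charge_usd(row["price"], row["emissions_kg"])
    print(f"carbon charge at {row['price']:.0f} USD/t: {charge:.2f} USD")
```

Read this way, raising the carbon price from 20 to 50 USD/t cuts emissions by a little under a quarter while operating costs rise by just under 5%.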
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Ni, H.; Wang, Y.; Tang, X.; Wang, J. An Intelligent Two-Stage Dispatch Framework for Cost and Carbon Reduction in Multi-Energy Virtual Power Plants. Processes 2026, 14, 743. https://doi.org/10.3390/pr14050743

