Optimizing Virtual Power Plants Cooperation via Evolutionary Game Theory: The Role of Reward–Punishment Mechanisms
Abstract
1. Introduction
- (1) To determine the critical thresholds and parameter configurations that transform non-cooperative VPP markets into stable cooperative systems, identifying the precise conditions under which reward–punishment mechanisms overcome free-riding behaviors and market failures.
- (2) To quantify the efficiency gains and renewable energy integration improvements achievable through evolutionary stable strategies, establishing measurable benchmarks for cooperative versus non-cooperative market outcomes in decentralized energy systems.
- (3) To establish the mathematical relationship between static incentive structures and long-term market stability, providing theoretical proofs for the emergence of cooperation as an evolutionarily dominant strategy under specific regulatory frameworks.
- (4) To demonstrate the practical applicability of EGT in resolving documented VPP market failures, offering quantitative evidence that static reward–punishment mechanisms can achieve cooperation rates exceeding 85% while maintaining system stability across diverse market conditions.
2. Theoretical Foundations of EGT
2.1. Overview of EGT
2.2. ESS and RD
2.3. The Role of Reward–Punishment Mechanisms in Evolutionary Games
2.4. The Integration of VPPs and EGT
3. Model Assumptions and Analysis
3.1. Participants and Assumptions
- Power Generation VPPs and Load-Type VPPs are shown as the core entities in the VPP system, participating in peer-to-peer (P2P) trading with each other. They collaborate within the system to manage energy generation and consumption, optimizing market strategies and utilizing flexible grid capacity. The diagram indicates their involvement in reporting transactions to a central trading center and submitting transaction proofs after verification.
- Government plays a regulatory and supportive role in the VPP ecosystem by offering subsidies and ensuring that participants submit the required proofs of transaction to verify their compliance and engagement in market activities. This regulatory framework promotes stability and encourages sustainable practices within the VPP system.
- Power Exchange serves as the intermediary between the different market participants, facilitating the P2P trading and ensuring that the transaction flows align with the market’s broader objectives. The Power Exchange is integral to maintaining liquidity, efficiency, and transparency in the VPP ecosystem.
- Traditional Power Plants are positioned within the ecosystem to engage in a similar P2P trading relationship with the Load and Generation VPPs, interacting with the Power Exchange and providing their verification and transaction reports as required.
- (1) Load-side VPPs: Aggregators or consumers who can adjust their demand patterns to support grid stability and cooperate in demand-side response.
- (2) Generation-side VPPs: Producers of distributed energy, such as solar, wind, or small-scale generation units, who provide power to the grid or market.
- (3) Government/Regulator: The policymaker, which sets fixed reward and punishment parameters to incentivize or penalize certain behaviors.
3.2. Payoff Functions and Static Reward–Punishment Mechanisms
3.3. RD-Based Evolutionary Game Model
- (1)
- Strategy Space
- (2)
- Payoff Matrix
- •
- Both cooperate (C, C): When both participants cooperate, they receive a base payoff that reflects mutual benefit from collaboration, which can include shared energy resources, trading profits, and system stability.
- •
- One cooperates, the other defects ((C, D) or (D, C)): When one participant cooperates while the other defects, the cooperating participant incurs a loss (e.g., from the defector's market manipulation or free-riding), while the defector may gain a short-term advantage but risks long-term inefficiency and instability.
- •
- Both defect (D, D): When both participants defect, they both receive a lower payoff due to the lack of collaboration and the inefficiencies that arise from such behavior. This scenario could represent an unstable, non-cooperative equilibrium in the market.
- (3)
- RD (Replicator dynamics)
- •
- x: The proportion of participants using the cooperation strategy at any given point in time.
- •
- U(C, x): The expected payoff for cooperation, which depends on the current distribution of strategies (i.e., the proportion of cooperators in the population).
- •
- Ū(x): The average payoff across all participants in the population (both cooperators and defectors), weighted by the current strategy proportions.
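
The three ingredients above (strategy space, payoff matrix, and replicator dynamics) can be tied together in a small numerical sketch. Everything numeric here is an illustrative assumption rather than the paper's calibration: the base payoffs are hypothetical, and the reward γ is simply added to a cooperator's payoff while the penalty δ is subtracted from a defector's. The function names are likewise hypothetical.

```python
import numpy as np

def payoff_matrix(base=3.0, sucker=1.5, temptation=3.5, mutual_defect=2.0,
                  gamma=0.4, delta=0.3):
    """Row player's payoffs, rows and columns ordered (C, D).

    Assumption: a cooperator receives a fixed reward `gamma` and a defector
    pays a fixed penalty `delta`; the base payoffs are made-up numbers.
    """
    return np.array([
        [base + gamma,       sucker + gamma],         # C vs C, C vs D
        [temptation - delta, mutual_defect - delta],  # D vs C, D vs D
    ])

def replicator_rhs(x, A):
    """dx/dt = x * (U(C, x) - average payoff), with x the share of cooperators."""
    u_c = A[0, 0] * x + A[0, 1] * (1 - x)   # expected payoff of cooperation
    u_d = A[1, 0] * x + A[1, 1] * (1 - x)   # expected payoff of defection
    u_bar = x * u_c + (1 - x) * u_d         # population-average payoff
    return x * (u_c - u_bar)

def simulate(x0, A, dt=0.01, steps=5000):
    """Forward-Euler integration of the replicator equation, clipped to [0, 1]."""
    x = x0
    for _ in range(steps):
        x = min(1.0, max(0.0, x + dt * replicator_rhs(x, A)))
    return x

if __name__ == "__main__":
    print(simulate(0.2, payoff_matrix()))                      # with incentives
    print(simulate(0.8, payoff_matrix(gamma=0.0, delta=0.0)))  # without incentives
```

With these made-up payoffs the game is a prisoner's dilemma when γ = δ = 0, so cooperation dies out; once γ + δ exceeds the temptation gap, the cooperator share grows toward 1. The specific gap is a property of the toy numbers, not a result from the paper.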
3.4. Game Process and Equilibrium Analysis
- (1)
- Static Reward–Punishment Mechanism
- (2)
- Evolutionary Process
- (3)
- Equilibrium Analysis
4. Equilibrium Analysis Under Static Reward–Punishment Mechanisms
4.1. Equilibrium Analysis Under Reward–Punishment Framework
- (1) If G1(y) < 0, then x = 0 is the stable equilibrium point.
- (2) If G1(y) > 0, then x = 1 is the stable equilibrium point.
- (1) If G2(x) < 0, then y = 0 is the stable equilibrium point.
- (2) If G2(x) > 0, then y = 1 is the stable equilibrium point.
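
These sign rules follow from a short local-stability argument. The sketch below assumes the replicator equations factor in the usual two-population form, with x the load-side and y the generation-side cooperation share; the labels F_x and F_y are introduced here for convenience.

```latex
\begin{aligned}
F_x(x, y) &= \frac{dx}{dt} = x(1-x)\,G_1(y), &
\frac{\partial F_x}{\partial x} &= (1-2x)\,G_1(y),\\[4pt]
\left.\frac{\partial F_x}{\partial x}\right|_{x=0} &= G_1(y), &
\left.\frac{\partial F_x}{\partial x}\right|_{x=1} &= -G_1(y),\\[4pt]
F_y(x, y) &= \frac{dy}{dt} = y(1-y)\,G_2(x), &
\left.\frac{\partial F_y}{\partial y}\right|_{y=0} &= G_2(x), \quad
\left.\frac{\partial F_y}{\partial y}\right|_{y=1} = -G_2(x).
\end{aligned}
```

A boundary point is locally stable when the corresponding derivative is negative. Under this assumed form, x = 0 is stable when G1(y) < 0 and x = 1 when G1(y) > 0; the same reasoning applies to y with G2(x).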
- (1) Equilibrium (0, 0): Complete Defection
  - Stability Condition: This point is stable when both cooperation strategies yield lower payoffs than defection.
  - Parameter Requirement: γ < γcritical and δ > δcritical.
  - Practical Implication: Insufficient rewards and excessive punishment lead to market failure.
- (2) Equilibrium (0, 1): Asymmetric Cooperation
  - Stability Condition: Stable when only generation-side cooperation is profitable.
  - Parameter Requirement: Complex interaction between γ, δ, and market price differentials.
  - Practical Implication: Rarely stable in real VPP markets due to interdependence.
- (3) Equilibrium (1, 0): Asymmetric Cooperation
  - Stability Condition: Stable when only load-side cooperation is profitable.
  - Parameter Requirement: Specific γ/δ ratio favoring demand-side participation.
  - Practical Implication: Limited practical relevance due to market coupling.
- (4) Equilibrium (1, 1): Full Cooperation
  - Stability Condition: Det(J) > 0 and Tr(J) < 0.
  - Parameter Requirement: γ > 0.34δ + 0.15 (critical threshold identified).
  - Optimal Range: γ ∈ [0.3, 0.6] and δ ∈ [0.2, 0.4].
  - Practical Implication: This represents the desired cooperative equilibrium for efficient VPP operation.
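
The Det(J) > 0 and Tr(J) < 0 test can be applied mechanically to all four corner equilibria. The sketch below assumes the two-population replicator form dx/dt = x(1 − x)G1(y), dy/dt = y(1 − y)G2(x); the linear G1 and G2 at the bottom are hypothetical stand-ins, not the paper's payoff differences. The threshold check simply encodes the reported condition γ > 0.34δ + 0.15.

```python
from itertools import product

def corner_stability(G1, G2):
    """Classify the four pure-strategy equilibria of
    dx/dt = x(1-x) G1(y), dy/dt = y(1-y) G2(x).

    Returns {(x, y): True if Det(J) > 0 and Tr(J) < 0}.
    """
    result = {}
    for x, y in product((0, 1), repeat=2):
        # Off-diagonal Jacobian entries carry a factor x(1-x) or y(1-y),
        # so they vanish at every corner and J is diagonal there.
        j11 = (1 - 2 * x) * G1(y)
        j22 = (1 - 2 * y) * G2(x)
        det, tr = j11 * j22, j11 + j22
        result[(x, y)] = det > 0 and tr < 0
    return result

def meets_reported_threshold(gamma, delta):
    """Reported full-cooperation condition: gamma > 0.34 * delta + 0.15."""
    return gamma > 0.34 * delta + 0.15

# Hypothetical payoff differences for illustration only.
gamma, delta = 0.4, 0.3
G1 = lambda y: gamma + delta - 0.5 + 0.2 * y   # load-side gain from cooperating
G2 = lambda x: gamma + delta - 0.5 + 0.1 * x   # generation-side gain from cooperating

if __name__ == "__main__":
    print(corner_stability(G1, G2))
    print(meets_reported_threshold(gamma, delta))
```

Because the Jacobian is diagonal at the corners, the test reduces to checking that both diagonal entries are negative, which is why only one corner can be flagged stable when G1 and G2 keep a constant sign.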
- (1)
- Research Motivation and Theoretical Foundation
- (2)
- Simulation Framework and Parameter Configuration
- (3)
- Individual Subplot Analysis and Theoretical Validation
- (4)
- Academic Significance and Summarization
4.2. Regulatory Framework Design for Cooperative Equilibrium Achievement
- (1)
- Designing Incentives for Cooperation
- (2)
- Balancing Reward and Penalty Levels
- (3)
- Promoting Long-Term Cooperation and Sustainability
5. Simulation Results and Validation
5.1. Baseline Scenario Analysis
- Data Source Documentation:
  - (i) Government incentive parameter (γ = 0.4): California Public Utilities Commission Decision 19-09-027 establishing 40% premium payments for demand response cooperation; validated through Germany's FNA renewable energy incentive schedules averaging 39.7% premium rates across 15 VPP programs.
  - (ii) Energy pricing parameters (E1 − E6): PJM Interconnection market-clearing prices during peak demand periods (June–August 2023), adjusted for Chinese market conditions using World Bank purchasing power parity conversion factors. Price differentials validated through European Energy Exchange spot market analysis.
  - (iii) Transaction volumes (D1, D2): Median P2P energy trading volumes from Brooklyn Microgrid operational data (2019–2023) and Vandebron Dutch platform transaction records, representing typical distributed energy trading patterns.
  - (iv) Cost and revenue parameters (B1, N, M): International Renewable Energy Agency global cost database for distributed generation; State Grid Corporation demand response pilot program compensation rates; European Network of Transmission System Operators balancing service revenue benchmarks.
  - (v) Risk parameter (δ): Composite analysis of non-compliance penalties across FERC jurisdictions, European Union energy market regulations, and Australian Energy Market Operator penalty structures, normalized to the Chinese regulatory framework through comparative institutional analysis.
5.2. Proposed Model-Based Numerical Simulation Study
- (1)
- Research Motivation and Theoretical Foundation
- (2)
- Simulation Scenario and Parameter Configuration
5.3. Simulation Results Analysis
- (1)
- Systematic Convergence Analysis (Figure 4a)
- (2)
- ESS Validation with Empirical Data (Figure 4b)
- (3)
- Critical Parameter Boundaries (Figure 4c)
- (4)
- Reward Effectiveness and Diminishing Returns (Figure 4d)
- (5)
- Path-Dependent Outcomes Analysis (Figure 4e)
- (6)
- Evolutionary Trajectories Across Multiple Iterations (Figure 4f)
- (7)
- Convergence versus Iteration Analysis (Figure 4g)
- (8)
- Theoretical versus Empirical Correlation (Figure 4h)
- (9)
- Regulatory Calibration Targets (Figure 4i)
- (10)
- Market Evolution Phases Analysis (Figure 4j)
- (11)
- Stability Basins and Critical Regions (Figure 4k)
- (12)
- Multi-Objective Performance Analysis (Figure 4l)
- (13)
- Strategic Payoff Analysis (Figure 4m)
- (14)
- Future Technology Integration Projections (Figure 4n)
- (15)
- Implementation Roadmap and Risk Assessment (Figure 4o)
- (1)
- Figure 5a: Multi-iteration Evolutionary Trajectory Convergence Analysis
- (2)
- Figure 5b: Parameter Sensitivity and Empirical Validation Analysis
- (3)
- Figure 5c: 3D Strategic Phase Portrait with Temporal Evolution
- (4)
- Figure 5d: Vector Field Dynamics with Cooperation Thresholds
- (5)
- Figure 5e: Empirical Validation Heatmap with Real-World Benchmarks
- (6)
- Figure 5f: Stability Analysis with Convergence Timeline Validation
- (7)
- Figure 5g: Multi-Dimensional Performance Analysis
- (8)
- Figure 5h: Strategic Payoff Evolution with Empirical Timeline Validation
- (9)
- Theoretical Implications and Conclusions
5.4. Improved RD Model-Based Simulation Study
- (1)
- Rcom (Cooperation Revenue)
- (2)
- Rrisk (Risk Penalty/Reward)
- •
- Market Fluctuations: The introduction of Rrisk ensures that the model accounts for uncertainties and market volatility, which are important in real-world energy markets where prices, demand, and supply can fluctuate unpredictably.
- •
- Cooperative Behavior Incentives: Rcom explicitly introduces the financial benefits of cooperation, reflecting the real-world advantages of collaboration between VPPs, such as energy sharing, grid balancing, and efficiency improvements.
- •
- Real-World Complexity: The payoff functions are now more complex and realistic, incorporating the impact of risk and cooperation revenues, as well as potential market disturbances, giving a more accurate representation of how participants’ decisions evolve over time.
5.5. Comprehensive Quantitative Validation and Large-Scale Implementation Analysis
5.5.1. Research Motivation and Theoretical Foundation
5.5.2. Simulation Framework and Core Parameter Configuration
5.5.3. Individual Subplot Analysis and Theoretical Validation
5.5.4. Theoretical Contributions and Research Impact
5.6. Advanced Simulation Validation and Quantitative Performance Assessment of Static Reward–Punishment Mechanisms in Large-Scale VPP Implementations
5.6.1. Research Motivation and Theoretical Foundation
5.6.2. Enhanced Theoretical Modeling Framework
- (1)
- Network-Enhanced Payoff Structure with Dynamic Externalities
- (2)
- Adaptive Evolutionary Dynamics with Performance Feedback Integration
- (3)
- Cooperation Rate Optimization and Critical Threshold Identification
- (4)
- Integrated Performance Assessment and Validation Methodology
- (5)
- Large-Scale Implementation Validation and Statistical Framework
- (6)
- Theoretical Integration and Validation Methodology
5.6.3. Simulation Scenario Description and Core Parameter Configuration
5.6.4. Individual Subplot Analysis and Theoretical Validation
5.6.5. Research Conclusions and Theoretical Validation
6. Discussion and Policy Implications
6.1. Significance of Sensitivity Analysis
6.2. Single-Parameter Sensitivity Analysis and Policy Implications
- (1)
- (2)
- (3)
- Comprehensive Analysis
- (4)
- Policy Implications
6.3. Multi-Parameter Sensitivity Analysis and Policy Implications
- (1)
- Load Type (x(t)) Evolution Over Time for Different γ Values
- (2)
- Power Generation (y(t)) Evolution Over Time for Different γ Values
- (3)
- Comprehensive Analysis
- (4)
- Policy Implications
6.4. Impact of Key Parameter Changes on System Evolution
6.4.1. Low Reward or Penalty Values Result in Persistent Non-Cooperative Behavior
6.4.2. Increasing γ or δ Beyond Certain Thresholds Fosters Cooperation
6.4.3. Critical Region with Multiple Equilibria
6.4.4. Extreme Reward or Penalty Leads to System Instability
6.4.5. Balancing Reward and Punishment for Stable Cooperation
6.5. System Robustness Simulation Verification and Policy Implications
6.5.1. Stability Analysis Under Varying Initial Conditions and External Perturbations: Implications for Convergence and Policy Design
- (1)
- Impact of Initial Conditions on Strategic Convergence
- (2)
- Detailed Analysis of Strategic Variables
- (3)
- General Insights from the Stability Analysis
- (4)
- Policy Implications and System Resilience
6.5.2. Stability Under Perturbations: Evolutionary Convergence and System Resilience in Noisy Environments
- (1)
- Overview of Perturbation Effects on System Dynamics
- (2)
- Stability of x(t) under Initial Conditions with Noise
- (3)
- Stability of y(t) under Initial Conditions with Noise
- (4)
- Integrated Insights from Noise-Driven Stability Dynamics
- Convergence is preserved even under moderate levels of noise, demonstrating the intrinsic stability of the system structure.
- Initial conditions critically influence the speed of convergence, with values closer to equilibrium reducing the system’s adjustment time.
- Noise introduces short-term volatility, particularly when systems are initialized from extreme or imbalanced conditions, but this does not derail long-term stability.
- Strategic proximity to equilibrium enhances adaptive capacity, enabling agents to respond more effectively to environmental uncertainty.
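
The qualitative claims above can be explored with a small stochastic simulation. This is a minimal sketch under stated assumptions: the noise is multiplicative and scaled by x(1 − x) and y(1 − y) so that the boundaries stay absorbing, the integration is a plain Euler–Maruyama scheme, and G1 and G2 are supplied by the caller. The paper's actual perturbation model is not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def noisy_replicator(x0, y0, G1, G2, sigma=0.05, dt=0.01, steps=20000, seed=0):
    """Euler-Maruyama integration of a two-population replicator system
    with multiplicative noise (an assumed noise model):

        dx = x(1-x) G1(y) dt + sigma x(1-x) dW_x
        dy = y(1-y) G2(x) dt + sigma y(1-y) dW_y
    """
    rng = np.random.default_rng(seed)
    x, y = x0, y0
    for _ in range(steps):
        # Independent Brownian increments with variance dt.
        dwx, dwy = rng.normal(0.0, np.sqrt(dt), size=2)
        x_new = x + x * (1 - x) * (G1(y) * dt + sigma * dwx)
        y_new = y + y * (1 - y) * (G2(x) * dt + sigma * dwy)
        # Keep the strategy shares inside the unit interval.
        x, y = np.clip(x_new, 0.0, 1.0), np.clip(y_new, 0.0, 1.0)
    return float(x), float(y)

if __name__ == "__main__":
    # Hypothetical payoff differences that favor cooperation everywhere.
    G1 = lambda y: 0.2 + 0.2 * y
    G2 = lambda x: 0.2 + 0.1 * x
    for start in [(0.1, 0.1), (0.5, 0.2), (0.9, 0.9)]:
        print(start, noisy_replicator(*start, G1, G2))
```

Varying `sigma` and the starting point gives a quick way to see how far the noise level can be pushed before the trajectories stop settling near the same corner, which is the kind of scenario testing the policy discussion calls for.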
- (5)
- Policy Implications and Design Considerations in Noisy Systems
- Early-stage interventions that guide the system closer to equilibrium, by incentivizing cooperative behavior or adjusting strategy distributions, can substantially reduce stabilization time.
- Robust policy mechanisms must account for environmental noise and strategic volatility. While fixed reward–punishment frameworks may be sufficient under normal conditions, adaptive policies that respond dynamically to deviations may be required in more volatile settings.
- System resilience should be a design priority, as it ensures long-term functionality and cooperation despite inevitable external shocks. This includes not only the robustness of strategy evolution but also the structural flexibility of the incentive and regulatory mechanisms.
- Sensitivity analysis and scenario testing, like those shown in this study, should be embedded in system planning processes to ensure that policies remain effective under a range of plausible real-world disturbances.
7. Conclusions and Prospects
7.1. Summary of Key Contributions and Findings
- (1)
- Theoretical Framework Development and Mathematical Innovation
- (2)
- Critical Parameter Threshold Identification and Optimization Boundaries
- (3)
- Quantitative Performance Validation and Empirical Evidence
- (4)
- Regulatory Framework Design and Implementation Pathways
- (5)
- Market Dynamics Understanding and Equilibrium Stability Analysis
- (6)
- Methodological Contributions and Analytical Innovation
7.2. Applicability and Limitations of the Model
- (1)
- Static parameter assumptions fundamentally limit real-world applicability
- (2)
- Deterministic modeling overlooks inherent system uncertainties
- (3)
- Exclusion of emerging technologies undermines implementation feasibility
- (4)
- Simplified participant modeling reduces behavioral complexity
- (5)
- Validation constraints limit generalizability
7.3. Theoretical and Practical Implications
7.4. Future Research Directions
- (1)
- Dynamic parameter optimization represents the most immediate research priority for enhancing practical applicability. The static nature of current reward–punishment mechanisms, while theoretically tractable, constrains real-world implementation where market conditions exhibit continuous evolution. Developing adaptive algorithms that calibrate incentive parameters in real-time through machine learning approaches could significantly improve system responsiveness to market volatility and participant strategy evolution [38]. Reinforcement learning techniques combined with Markov decision processes offer particularly promising pathways for creating regulatory mechanisms that automatically adjust to external economic shocks, seasonal demand patterns, and technological disruptions [39,40]. Initial investigations suggest such adaptive frameworks might improve system stability by 30–40% relative to static approaches, though empirical validation across diverse market conditions remains essential.
- (2)
- Stochastic modeling integration constitutes a fundamental methodological advancement necessary for capturing renewable energy system uncertainties. The deterministic framework developed here overlooks inherent intermittency patterns in photovoltaic and wind generation that critically influence VPP operational strategies. Incorporating probabilistic modeling for demand-side volatility, market price fluctuations, and grid stability variations would enable more robust analysis of cooperation dynamics under realistic operating conditions. Monte Carlo simulation techniques combined with stochastic differential equations could model the complex interactions between renewable generation variability and strategic decision-making processes [41,42]. Particular attention should focus on developing methods that maintain computational tractability while capturing essential uncertainty characteristics observed in operational VPP deployments.
- (3)
- Blockchain technology integration addresses critical implementation barriers related to trust, transparency, and automated execution of incentive mechanisms [43,44]. The exclusion of distributed ledger infrastructure from the current framework represents a significant constraint on practical deployment, particularly in decentralized market environments where participant coordination requires trustless validation systems. Developing comprehensive smart contract architectures that automate reward–punishment execution while ensuring cryptographic security and transparent transaction validation could eliminate centralized regulatory dependencies that currently limit VPP scalability. Ethereum-based implementations offer immediate deployment pathways, though research should also explore alternative blockchain platforms optimized for energy market applications with lower transaction costs and enhanced throughput capabilities.
- (4)
- Multi-scale and multi-agent system analysis would extend the framework’s applicability to complex operational scenarios involving diverse stakeholder interactions. Current binary cooperation–defection modeling oversimplifies the nuanced strategic behaviors observed in real VPP markets where participants exhibit varying degrees of cooperation, partial compliance, and adaptive learning processes. Expanding the analysis to accommodate multiple player types with heterogeneous objectives, capabilities, and constraints would enhance predictive accuracy for diverse market environments. Agent-based modeling approaches could simulate emergent behaviors arising from interactions among load aggregators, generation operators, storage providers, and regulatory entities, enabling investigation of cooperation dynamics across different organizational scales and temporal horizons [45,46].
- (5)
- Regional market integration and multi-zone pricing complexity present increasingly urgent research challenges as electricity markets evolve toward greater decentralization. The establishment of multiple pricing zones within regional markets creates arbitrage opportunities and coordination challenges that existing frameworks cannot adequately address. Developing VPP participation models that account for price signal variations across different zones while maintaining cooperation incentives requires sophisticated optimization approaches that balance local efficiency with system-wide stability. Research should investigate how price differentials influence strategic decision-making processes and explore mechanism design approaches that preserve cooperation incentives despite geographic and temporal price variations.
- (6)
- High-frequency trading environments and increased transaction volumes demand new analytical frameworks that maintain market efficiency while ensuring system stability. As VPP participation in electricity markets expands, transaction processing capabilities and settlement mechanisms face scalability challenges that could undermine cooperation maintenance. Research should explore market microstructure effects on cooperation dynamics, investigating how latency, order processing delays, and settlement timing influence strategic behavior patterns. Developing real-time cooperation assessment mechanisms that function effectively under high transaction volumes represents a critical technical challenge requiring advances in both theoretical modeling and computational implementation.
- (7)
- Empirical validation through longitudinal analysis of operational VPP systems would strengthen theoretical predictions and guide practical implementation strategies. The current research relies primarily on simulation studies and cross-sectional analysis, limiting generalizability to diverse operational contexts. Establishing partnerships with VPP operators and regional transmission organizations would enable the collection of high-resolution operational data necessary for validating evolutionary game predictions under realistic market conditions. Longitudinal studies tracking cooperation evolution over multiple seasonal cycles and market stress periods would provide essential insights into theoretical model accuracy and identify refinements necessary for practical deployment.
- (8)
- Machine learning enhancement of strategic behavior prediction offers opportunities for developing more sophisticated models of participant adaptation and learning processes. Current RD assumes simplified strategy updating mechanisms that may not accurately reflect how VPP operators actually modify their approaches based on market experience and peer interactions. Integrating neural network architectures with evolutionary game frameworks could capture complex learning patterns and enable the prediction of cooperation emergence under novel market conditions [32,47,48,49]. Deep reinforcement learning approaches might particularly benefit the analysis of multi-objective optimization scenarios where VPP operators balance profitability, reliability, and environmental objectives simultaneously.
- (9)
- Cross-jurisdictional comparative analysis would enhance understanding of regulatory and cultural factors that influence cooperation emergence patterns. The research demonstrates significant performance variations across different regulatory environments, yet limited investigation of underlying causal mechanisms constrains policy transfer potential. Systematic comparison of cooperation outcomes across jurisdictions with varying regulatory frameworks, market structures, and cultural contexts would identify key institutional factors that facilitate or inhibit cooperation emergence. Such analysis would provide valuable guidance for policymakers seeking to adapt successful approaches to their specific regulatory environments while avoiding implementation pitfalls observed in other contexts.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Glossary
| Term/Acronym | Definition/Description |
| --- | --- |
| Cooperation Optimization | The process of enhancing mutual cooperation between agents in a system through strategic adjustments and incentive mechanisms, with the goal of achieving the most beneficial outcomes for all participants in terms of payoff and system efficiency. |
| Evolutionary Game Theory (EGT) | A framework for modeling strategic interactions where players’ strategies evolve over time based on previous outcomes. |
| Evolutionarily Stable Strategy (ESS) | A strategy that, if adopted by most members of a population, cannot be invaded by any alternative strategy that is initially rare, ensuring the persistence of the strategy over time in a competitive environment. |
| Evolutionarily Stable Equilibrium (ESE) | A strategic state in an evolutionary game where the population’s strategy distribution remains stable over time, and no single individual or group can improve its payoff by deviating unilaterally from the equilibrium strategy, thereby maintaining overall system stability. |
| Market Efficiency | The condition where market outcomes (e.g., prices, production, and consumption) are optimized such that resources are allocated in a manner that maximizes total welfare, minimizes inefficiencies, and ensures that no participant can achieve a higher payoff through unilateral actions or alternative strategies. |
| Punishment Intensity (δ) | A parameter representing the severity of penalties for non-cooperative behavior in evolutionary game models of VPPs. |
| Renewable Energy Integration | The incorporation of energy generated from renewable sources (e.g., solar, wind) into existing power systems or virtual power plants in a way that optimizes grid stability, enhances sustainability, and reduces reliance on non-renewable energy sources, while addressing challenges like intermittency and variability in generation. |
| Reward–Punishment Mechanisms | A system of incentives (rewards) for cooperation and penalties (punishments) for non-cooperation to guide participants toward desired behavior. |
| Replicator Dynamics (RD) | A model used to describe how the proportion of individuals using a particular strategy changes over time based on its relative success compared to others. |
| Reward Coefficient (γ) | A parameter representing the strength of rewards for cooperative behavior in VPPs, used in evolutionary game models. |
| Virtual Power Plant (VPP) | A system that aggregates decentralized energy resources, such as renewable energy, for optimization, control, and market trading. |
References
- Zhang, Y.; Pan, W.; Lou, X.; Yu, J.; Wang, J. Operation characteristics of virtual power plant and function design of operation management platform under emerging power system. In Proceedings of the 2021 International Conference on Power System Technology (POWERCON), Haikou, China, 8–9 December 2021; pp. 194–196.
- Meng, X.; Gao, F.; Xu, T.; Zhou, K.; Li, W.; Wu, Q. Inverter-data-driven second-level power forecasting for photovoltaic power plant. IEEE Trans. Ind. Electron. 2021, 68, 7034–7044.
- Meng, Y.; Qiu, J.; Zhang, C.; Lei, G.; Zhu, J. A Holistic P2P market for active and reactive energy trading in VPPs considering both financial benefits and network constraints. Appl. Energy 2024, 356, 122396.
- Alam, K.S.; Kaif, A.M.A.D.; Das, S.K. A blockchain-based optimal peer-to-peer energy trading framework for decentralized energy management within a virtual power plant: Lab scale studies and large scale proposal. Appl. Energy 2024, 365, 123243.
- Bao, P.; Zhang, W.; Zhang, Y. Secondary frequency control considering optimized power support from virtual power plant containing aluminum smelter loads through VSC-HVDC link. J. Mod. Power Syst. Clean Energy 2023, 11, 355–367.
- Yazdaninejad, M.; Amjady, N.; Dehghan, S. VPP self-scheduling strategy using multi-horizon IGDT, enhanced normalized normal constraint, and bi-directional decision-making approach. IEEE Trans. Smart Grid 2020, 11, 3632–3645.
- Naughton, J.; Wang, H.; Cantoni, M.; Mancarella, P. Co-optimizing virtual power plant services under uncertainty: A robust scheduling and receding horizon dispatch approach. IEEE Trans. Power Syst. 2021, 36, 3960–3972.
- Lin, C.; Hu, B.; Shao, C.; Xie, K.; Peng, J. Computation offloading for cloud-edge collaborative virtual power plant frequency regulation service. IEEE Trans. Smart Grid 2024, 15, 5232–5244.
- Park, H.; Ko, W. A bi-level scheduling model of the distribution system with a distribution company and virtual power plants considering grid flexibility. IEEE Access 2022, 10, 36711–36724.
- Majumder, S.; Khaparde, S.A.; Agalgaonkar, A.P.; Kulkarni, S.; Srivastava, A.; Perera, S. Chance-constrained pre-contingency joint self-scheduling of energy and reserve in a VPP. In Proceedings of the 2024 IEEE Power & Energy Society General Meeting (PESGM), Seattle, WA, USA, 21–25 July 2024; p. 1.
- Li, L.; Fan, S.; Xiao, J.; Zhang, Y.; Huang, R.; He, G. Energy management strategy for community prosumers aggregated VPP participation in the ancillary services market based on P2P trading. Appl. Energy 2025, 384, 125472.
- Cheng, L.; Huang, P.; Zou, T.; Zhang, M.; Peng, P.; Lu, W. Evolutionary game-theoretical approaches for long-term strategic bidding among diverse stakeholders in large-scale and local power markets: Basic concept, modelling review, and future vision. Int. J. Electr. Power Energy Syst. 2025, 166, 110589.
- Zhang, T.; Qiu, W.; Zhang, Z.; Lin, Z.; Ding, Y.; Wang, Y.; Wang, L.; Yang, L. Optimal bidding strategy and profit allocation method for shared energy storage-assisted VPP in joint energy and regulation markets. Appl. Energy 2023, 329, 120158.
- Ghodusinejad, M.H.; Yousefi, H.; Mohammadi-Ivatloo, B. An internal pricing method for a local energy market with P2P energy trading. Energy Strategy Rev. 2025, 58, 101673.
- Wang, X.; Zhao, H.; Lu, H.; Wang, Y.; Wang, J. Decentralized coordinated operation model of VPP and P2H systems based on stochastic-bargaining game considering multiple uncertainties and carbon cost. Appl. Energy 2022, 312, 118750.
- Aguilar, J.; Bordons, C.; Arce, A.; Galán, R. Intent profile strategy for virtual power plant participation in simultaneous energy markets with dynamic storage management. IEEE Access 2022, 10, 22599–22609.
- Prikaziuk, E.; Silva, C.F.; Koren, G.; Cai, Z.; Berger, K.; Belda, S.; Graf, L.; Tomelleri, E.; Verrelst, J.; Segarra, J.; et al. Evaluation and improvement of Copernicus HR-VPP product for crop phenology monitoring. Comput. Electron. Agric. 2025, 233, 110136.
- Chen, G.; Yu, Y. Convergence analysis and strategy control of evolutionary games with imitation rule on toroidal grid. IEEE Trans. Autom. Control 2023, 68, 8185–8192.
- Cheng, L.; Chen, Y.; Liu, G. 2PnS-EG: A general two-population n-strategy evolutionary game for strategic long-term bidding in a deregulated market under different market clearing mechanisms. Int. J. Electr. Power Energy Syst. 2022, 142, 108182.
- Xu, Y.; Zhang, X.; Wang, K. Stakeholder interaction in the digital transformation of China’s electric power sector: An evolutionary game model. Util. Policy 2025, 94, 101902.
- Cheng, L.; Peng, P.; Lu, W.; Sun, J.; Wu, F.; Shi, M.; Yuan, X.; Chen, Y. The evolutionary game equilibrium theory on power market bidding involving renewable energy companies. Int. J. Electr. Power Energy Syst. 2025, 167, 110588.
- Lim, I.S.; Masuda, N. To trust or not to trust: Evolutionary dynamics of an asymmetric n-player trust game. IEEE Trans. Evol. Comput. 2024, 28, 117–131.
- Ye, M.; Tianqing, C.; Wenhui, F. A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning. J. Syst. Eng. Electron. 2021, 32, 642–657.
- Ming, Z.; Jianjun, Z.; Hehua, W. Evolutionary game analysis of problem processing mechanism in new collaboration. J. Syst. Eng. Electron. 2021, 32, 136–150.
- Lin, J.; Long, P.; Liang, J.; Dai, Q.; Li, H.; Yang, J. The coevolution of cooperation: Integrating Q-learning and occasional social interactions in evolutionary games. Chaos Solitons Fractals 2025, 194, 116165.
- Lv, Y.; Yang, J.; Sun, X.; Wu, H. Evolutionary game analysis of stakeholder privacy management in the AIGC model. Oper. Res. Perspect. 2025, 14, 100327.
- Yin, H.; Sun, J.; Cai, W. Honest or dishonest? Promoting integrity in loot box games through evolutionary game theory. IEEE Trans. Comput. Soc. Syst. 2024, 11, 5961–5972.
- Peng, X.; Ding, Y.; Liu, J.; Li, Y.; Yuan, K. Multi-perspective collaborative planning of DN and distribution energy stations with stepped carbon trading and adaptive evolutionary game. Int. J. Electr. Power Energy Syst. 2025, 166, 110522.
- Ding, Y.; Chen, W.; Pan, X.; Liu, K.; Wei, S.; How, L.; Li, J. An evolutionary game model considering response priority for flexible resource scheduling in buildings. J. Build. Eng. 2025, 105, 112154.
- Hong, L.; Wang, R.; Chen, H.; Cui, W.; Tsoulakos, N.; Yan, R. Evolutionary game-based ship inspection planning considering ship competitive interactions. Transp. Res. Part E Logist. Transp. Rev. 2025, 196, 103994.
- Bai, H.; Shen, R.; Lin, Y.; Xu, B.; Cheng, R. Lamarckian platform: Pushing the boundaries of evolutionary reinforcement learning toward asynchronous commercial games. IEEE Trans. Games 2024, 16, 51–63. [Google Scholar] [CrossRef]
- Shi, Y.; Rong, Z. Analysis of Q-learning like algorithms through evolutionary game dynamics. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 2463–2467. [Google Scholar] [CrossRef]
- Zhang, H.; Tan, J.; Liu, X.; Huang, S.; Hu, H.; Zhang, Y. Cybersecurity threat assessment integrating qualitative differential and evolutionary games. IEEE Trans. Netw. Serv. Manag. 2022, 19, 3425–3437. [Google Scholar] [CrossRef]
- Cheng, L.; Huang, P.; Zhang, M.; Yang, R.; Wang, Y. Optimizing electricity markets through game-theoretical methods: Strategic and policy implications for power purchasing and generation enterprises. Mathematics 2025, 13, 373. [Google Scholar] [CrossRef]
- Cheng, L.; Wei, X.; Li, M.; Tan, C.; Yin, M.; Shen, T.; Zou, T. Integrating evolutionary game-theoretical methods and deep reinforcement learning for adaptive strategy optimization in user-side electricity markets: A comprehensive review. Mathematics 2024, 12, 3241. [Google Scholar] [CrossRef]
- Lim, I.S.; Capraro, V. A synergy of institutional incentives and networked structures in evolutionary game dynamics of multi-agent systems. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 2777–2781. [Google Scholar] [CrossRef]
- Zhang, Z.X.; Chen, W.N.; Shi, W.; Jeon, S.; Zhang, J. An individual evolutionary game model guided by global evolutionary optimization for vehicle energy station distribution. IEEE Trans. Comput. Soc. Syst. 2024, 11, 1289–1301. [Google Scholar] [CrossRef]
- Avila, P.; Mullon, C. Evolutionary game theory and the adaptive dynamics approach: Adaptation where individuals interact. Philos. Trans. R. Soc. B 2023, 378, 20210502. [Google Scholar] [CrossRef]
- Zhang, S.P.; Zhang, J.Q.; Chen, L.; Liu, X.D. Oscillatory evolution of collective behavior in evolutionary games played with reinforcement learning. Nonlinear Dyn. 2020, 99, 3301–3312. [Google Scholar] [CrossRef]
- Wu, Y.; Pan, L. LSTEG: An evolutionary game model leveraging deep reinforcement learning for privacy behavior analysis on social networks. Inf. Sci. 2024, 676, 120842. [Google Scholar] [CrossRef]
- Elmusrati, M. Modelling Stochastic Uncertainties: From Monte Carlo Simulations to Game Theory; Walter de Gruyter GmbH & Co. KG: Berlin, Germany, 2024. [Google Scholar]
- Jeong, Y. Probabilistic game theory and stochastic model predictive control-based decision making and motion planning in uncontrolled intersections for autonomous driving. IEEE Trans. Veh. Technol. 2023, 72, 15254–15267. [Google Scholar] [CrossRef]
- Zhou, F.; Zhang, C.; Chen, T.; Lim, M.K. An evolutionary game analysis on blockchain technology adoption in cross-border e-commerce. Oper. Manag. Res. 2023, 16, 1766–1780. [Google Scholar] [CrossRef]
- Li, J.; Li, S.; Zhang, Y.; Tang, X. Evolutionary game analysis of rent seeking in inventory financing based on blockchain technology. Manag. Decis. Econ. 2023, 44, 4278–4294. [Google Scholar] [CrossRef]
- Hong, H.; Yu, X. Multi-agent Cooperative Optimization Strategy of a Virtual Power Plant Based on Game Theory. J. Phys. Conf. Ser. 2023, 2656, 012005. [Google Scholar] [CrossRef]
- Liu, X.; Li, S.; Zhu, J. Optimal coordination for multiple network-constrained VPPs via multi-agent deep reinforcement learning. IEEE Trans. Smart Grid 2022, 14, 3016–3031. [Google Scholar] [CrossRef]
- Traulsen, A.; Glynatsi, N.E. The future of theoretical evolutionary game theory. Philos. Trans. R. Soc. B 2023, 378, 20210508. [Google Scholar] [CrossRef] [PubMed]
- Chen, Y.; Zhang, Y.; Wang, S.; Wang, F.; Li, Y.; Jiang, Y.; Chen, L.; Guo, B. DIM-DS: Dynamic incentive model for data sharing in federated learning based on smart contracts and evolutionary game theory. IEEE Internet Things J. 2022, 9, 24572–24584. [Google Scholar] [CrossRef]
- Talajić, M.; Vrankić, I.; Pejić Bach, M. Strategic management of workforce diversity: An evolutionary game theory approach as a foundation for AI-driven systems. Information 2024, 15, 366. [Google Scholar] [CrossRef]
Parameter | Definition |
---|---|
γ | Government incentives |
D1 | Electricity traded P2P by load-type VPPs
D2 | Electricity traded by load-type VPPs in the market
E1 | P2P pricing
E2 | User electricity bill
E3 | Market price
E4 | Price paid by load-type VPPs for P2P purchases from other VPPs
B1 | Power generation cost |
N | Income obtained from demand response |
δ | P2P credit risk |
M | Revenue from peak shaving and frequency regulation |
E5 | Electricity prices in a tight market |
E6 | Market electricity prices for excess electricity |
Load Type VPP (Group A)\Power Generation VPP (Group B) | Participate in Diversified Transactions | Participate in Market Transactions |
---|---|---|
Participate in diversified transactions | Ua11 = (1 + γ)D1(E2 − E1) + D2(E2 − E3) + N − δ; Ub11 = (1 + γ)D1(E1 − C1) + D2(E3 − B1) + M − δ | Ua12 = (1 + γ)D1(E2 − E4) + D2(E2 − E3) + N − δ; Ub12 = D(E6 − B1) + M |
Participate in market transactions | Ua21 = D(E2 − E5) + N; Ub21 = D(E3 − B1) + M | Ua22 = D(E2 − E3) + N; Ub22 = D(E3 − B1) + M |
(x, y) | Det(J) | Tr(J) | Eigenvalues of J | Local Stability |
---|---|---|---|---|
(0, 0) | | | | Unstable |
(0, 1) | | | | Unstable |
(1, 0) | | | | Unstable |
(1, 1) | | | | Stable |
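
The local-stability column can be checked with a short calculation. The sketch below assumes the standard two-population replicator dynamics, which may differ in detail from the paper's own equations: x is the share of load-type VPPs choosing diversified transactions and y is the share of power generation VPPs doing the same. At the four pure-strategy corners the Jacobian J is diagonal, so its eigenvalues are the diagonal entries, and a corner is an evolutionarily stable strategy when Det(J) > 0 and Tr(J) < 0. The function name `corner_stability` and the example payoffs are hypothetical and are not the paper's values.

```python
def corner_stability(U):
    """Classify the four corners of a 2x2 two-population replicator system.

    U maps (i, j) -> (Ua_ij, Ub_ij), where i is the load-type VPP strategy and
    j is the power generation VPP strategy (1 = diversified, 2 = market).
    Assumed dynamics (standard replicator form, not quoted from the paper):
        dx/dt = x(1 - x)[y*a1 + (1 - y)*a2]
        dy/dt = y(1 - y)[x*b1 + (1 - x)*b2]
    """
    a1 = U[(1, 1)][0] - U[(2, 1)][0]  # load-type gain when generation side is diversified
    a2 = U[(1, 2)][0] - U[(2, 2)][0]  # load-type gain when generation side uses the market
    b1 = U[(1, 1)][1] - U[(1, 2)][1]  # generation gain when load side is diversified
    b2 = U[(2, 1)][1] - U[(2, 2)][1]  # generation gain when load side uses the market

    results = {}
    for x in (0, 1):
        for y in (0, 1):
            # Off-diagonal Jacobian terms carry x(1 - x) or y(1 - y), so they
            # vanish at the corners and only the diagonal entries remain.
            j11 = (1 - 2 * x) * (y * a1 + (1 - y) * a2)
            j22 = (1 - 2 * y) * (x * b1 + (1 - x) * b2)
            det, tr = j11 * j22, j11 + j22
            if det > 0 and tr < 0:
                label = "Stable (ESS)"
            elif det > 0 and tr > 0:
                label = "Unstable"
            elif det < 0:
                label = "Unstable (saddle)"
            else:
                label = "Indeterminate"
            results[(x, y)] = (det, tr, (j11, j22), label)
    return results


# Hypothetical payoffs in which diversified trading pays more for both groups.
example = {
    (1, 1): (5.0, 5.0),
    (1, 2): (4.0, 2.0),
    (2, 1): (2.0, 4.0),
    (2, 2): (1.0, 1.0),
}

if __name__ == "__main__":
    for corner, (det, tr, eig, label) in corner_stability(example).items():
        print(corner, det, tr, eig, label)
```

With these illustrative payoffs, only (1, 1) satisfies Det(J) > 0 and Tr(J) < 0, which matches the pattern in the table: three unstable corners and one stable corner.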
Parameter | Numerical Value |
---|---|
γ | 0.4 |
D1 | 50 kWh |
D2 | 50 kWh |
E1 | CNY 0.41 |
E2 | CNY 0.48 |
E3 | CNY 0.46 |
E4 | CNY 0.43 |
E5 | CNY 0.44 |
E6 | CNY 0.47 |
B1 | CNY 0.35 |
N | CNY 100,000 |
δ | CNY 25,000 |
M | CNY 100,000 |
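
To make the payoff expressions concrete, the following sketch evaluates the eight entries of the payoff matrix with the numerical values listed above. Two symbols in the payoff matrix, D and C1, do not appear in the parameter tables, so the code relies on labeled assumptions: D = D1 + D2 and C1 = B1. Both can be overridden. The names `TABLE_VALUES` and `payoff_matrix` are hypothetical. Taking the units at face value (kWh and CNY as listed), the lump-sum terms N, M, and δ are several orders of magnitude larger than the energy-trade terms; the source may intend a different unit scale, so the resulting numbers should be read as illustrative only.

```python
# Values transcribed from the numerical-value table (prices in CNY, energy in kWh).
TABLE_VALUES = {
    "gamma": 0.4, "D1": 50.0, "D2": 50.0,
    "E1": 0.41, "E2": 0.48, "E3": 0.46, "E4": 0.43, "E5": 0.44, "E6": 0.47,
    "B1": 0.35, "N": 100_000.0, "delta": 25_000.0, "M": 100_000.0,
}


def payoff_matrix(p, C1=None, D=None):
    """Return {(i, j): (Ua_ij, Ub_ij)} following the payoff-matrix expressions.

    i is the load-type VPP strategy, j the power generation VPP strategy
    (1 = diversified transactions, 2 = market transactions).
    """
    C1 = p["B1"] if C1 is None else C1          # ASSUMPTION: C1 is not tabulated; B1 used
    D = p["D1"] + p["D2"] if D is None else D   # ASSUMPTION: D is not tabulated; D1 + D2 used
    g = 1.0 + p["gamma"]

    ua11 = g * p["D1"] * (p["E2"] - p["E1"]) + p["D2"] * (p["E2"] - p["E3"]) + p["N"] - p["delta"]
    ub11 = g * p["D1"] * (p["E1"] - C1) + p["D2"] * (p["E3"] - p["B1"]) + p["M"] - p["delta"]
    ua12 = g * p["D1"] * (p["E2"] - p["E4"]) + p["D2"] * (p["E2"] - p["E3"]) + p["N"] - p["delta"]
    ub12 = D * (p["E6"] - p["B1"]) + p["M"]
    ua21 = D * (p["E2"] - p["E5"]) + p["N"]
    ub21 = D * (p["E3"] - p["B1"]) + p["M"]
    ua22 = D * (p["E2"] - p["E3"]) + p["N"]
    ub22 = D * (p["E3"] - p["B1"]) + p["M"]

    return {
        (1, 1): (ua11, ub11),
        (1, 2): (ua12, ub12),
        (2, 1): (ua21, ub21),
        (2, 2): (ua22, ub22),
    }


if __name__ == "__main__":
    for cell, (ua, ub) in payoff_matrix(TABLE_VALUES).items():
        print(cell, round(ua, 2), round(ub, 2))
```

Passing explicit `C1` or `D` arguments, or a rescaled copy of `TABLE_VALUES`, allows the same function to be reused once the intended definitions and units are confirmed against the full text.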
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cheng, L.; Huang, P.; Zhang, M.; Wang, K.; Zhang, K.; Zou, T.; Lu, W. Optimizing Virtual Power Plants Cooperation via Evolutionary Game Theory: The Role of Reward–Punishment Mechanisms. Mathematics 2025, 13, 2428. https://doi.org/10.3390/math13152428