1. Introduction
According to the 2022 energy consumption statistics released by the National Bureau of Statistics of China, there was a 9.2% increase in energy production compared to the previous year. Coal consumption accounted for 56.2% of total energy consumption and was primarily utilized for industrial and power generation purposes, leading to a rise in carbon emissions. This has resulted in adverse effects such as global warming, extreme weather events, and biodiversity loss. The recent turbulence in the global energy market has heightened the urgency for decarbonization due to escalating prices of traditional energy sources like natural gas, oil, and coal, necessitating an urgent transformation within the energy industry [1].
The energy composition of China has undergone significant changes in recent decades, as depicted in Figure 1. The proportion of coal and oil has steadily decreased, while the share of natural gas, primary electricity, and other sources of energy has progressively risen. In response to the demands of global climate change and environmental conservation, China is committed to advancing the optimization of its energy structure and the expansion of renewable energy sources in the future. Under the guidance of China’s dual carbon goals (carbon peak and carbon neutrality), renewable energy and incentives for carbon reduction have gained unprecedented market space and increased competitiveness [2,3]. However, given that the power sector is responsible for approximately half of total carbon emissions, achieving carbon peaking and neutrality presents both a significant opportunity and a formidable challenge for this industry. For instance, advanced economies like the United States are projected to undergo a transition period of 50 to 70 years towards achieving carbon neutrality after reaching their emission peak. China faces a more challenging task in comparison.
The State Council has issued a notice on the issuance of the Action Plan for Carbon Peaking before 2030, emphasizing the need to expedite the development of a new type of power system by significantly enhancing its comprehensive regulation capacity and accelerating the construction of flexible power supply regulation. Furthermore, it aims to encourage participation in system regulation from self-provided power plants, traditional high-load industrial loads, industrial and commercial interruptible loads, and electric vehicle charging networks [4,5]. China’s goal is to peak its carbon emissions by 2030 and achieve carbon neutrality by 2060, which means it has only three decades between the two milestones [6]. In this context, distributed generation (DG) based on renewable energy sources has been widely deployed across the globe. As the penetration rate of distributed generation and renewable energy in the power system continues to rise, their inherent characteristics of limited capacity, large volume, and low density result in increased volatility and intermittency, posing significant challenges to the stability of the power grid.
In recent years, the concept of virtual power plants (VPPs) has emerged as a promising solution to address the challenges in power management. VPPs are designed to aggregate and manage various flexible resources, including controlled loads, storage devices, and distributed generation, in response to customers’ power demand, energy prices, generator locations, and other relevant factors. This approach incentivizes owners of distributed energy resources (DERs) to actively participate in the power market and offer ancillary services. Moreover, VPPs are capable of dynamically dispatching the power system through advanced control methods and optimization algorithms to ensure efficient allocation of power consumption for users [7,8].
Figure 2 demonstrates the capability of VPPs to seamlessly integrate renewable energy sources such as solar, wind, and geothermal energy while being uniformly optimized for management using smart grid technology. Furthermore, the VPP system leverages cloud computing platforms for data collection and analysis to achieve optimal energy utilization across household and commercial electricity consumption as well as electric vehicle charging management. These advancements not only enhance energy efficiency but also contribute significantly towards achieving a dynamic balance between energy supply and demand and promoting environmental sustainability, both of which are critical for the future development of power systems. However, the number of small-scale distributed energy sources continues to increase, and numerous studies project that renewables will surpass 50% of the electricity market by 2050 [9]. This widespread adoption of distributed power presents increasingly complex challenges for VPPs, particularly in maintaining system stability. Consequently, there is an urgent need to develop new strategies and technologies for optimizing the structure and scheduling process of VPPs. The strategic and efficient utilization of VPPs for optimizing dispatch in the power market will therefore emerge as a central focus and challenge for forthcoming power systems, prompting extensive scholarly inquiry in this domain.
Currently, the majority of research is focused on optimizing system scheduling for various VPPs. Michael et al. [10] took into account market price uncertainty and distributed energy, aiming to maximize the economic benefits of VPPs through the optimization of the energy management system (EMS). Chen et al. [11] proposed a two-stage robust optimization model for VPPs in order to establish a calculation model for electric vehicles’ available capacity, ultimately achieving optimal operational benefits. Simultaneously, the exploration of carbon trading and blockchain integration in VPP research is a prominent area of interest. Zhang et al. [12] introduced a self-concluding variational particle swarm optimization algorithm (SCV-PSO) to validate the scheduling model by considering the involvement of carbon trading mechanism and green certificate trading mechanism. Blockchain technology offers a means to establish accessible and sustainable energy networks through real-time information sharing methods, enabling consumers to make informed decisions about energy usage while addressing energy distribution disparities. Alam et al. [13] proposed a decentralized architecture for blockchain-based peer-to-peer (P2P) multi-layer energy trading to facilitate the coordinated optimization of VPPs. In light of potential privacy concerns related to VPP blockchain, Yu et al. [14] suggested a double auction mechanism based on VPP blockchain that not only safeguarded privacy but also significantly enhanced computational efficiency.
While these studies offer valuable insights into the optimization of VPPs, the majority of them still rely on static models and do not completely account for the dynamic characteristics of the system. Addressing these issues, Marinescu et al. [15] proposed the concept of a dynamic virtual power plant (DVPP), which included a combination of dispatchable and non-dispatchable renewable energy sources, as well as common control and operation procedures. Their model integrated dynamic aspects at various levels including local (for each RES generator), global (for grid auxiliary services and interaction with neighboring elements), and economic (for internal optimization and participation in the electricity market). Building upon this foundation, Wei et al. [16] introduced a sine–cosine multi-objective particle swarm optimization algorithm to optimize the environment–economy hybrid dynamic scheduling model of a DVPP, aiming to reduce operating costs. Furthermore, Haberle et al. [17] implemented the hybrid DVPP model using adaptive dynamic participation factors, extending its application to dispersed DER locations within the power grid in order to verify its practicality.
Game theory provides a mathematical framework for analyzing phenomena characterized by competition or conflict. It investigates the expected and actual behaviors of individuals or teams in a game, allowing them to adjust strategies based on opponents’ actions to identify Nash equilibrium points and maintain consistent strategies. Policy changes can lead to reduced revenue for users. Within VPPs, members engage in varying levels of competition and adapt their strategies accordingly. Game theory utilizes visual mathematical methods to calculate and quantify uncertainties in the operation processes of VPPs, thereby enabling participants to achieve their objectives more effectively [18,19]. Currently, numerous studies have applied game theory to investigate VPPs, broadly categorizing them as cooperative or non-cooperative games based on the nature of the interactions involved.
Participants in cooperative games can establish binding agreements through collaboration and negotiation, with a focus on enhancing overall returns. In these games, the total payoff is distributed among participants, emphasizing the importance of cooperation. Liu et al. proposed an optimal scheduling strategy for VPPs based on potential game (PG) theory to achieve higher economic gains while maintaining multi-agent income equilibrium [20]. Non-cooperative games differ from cooperative games by the lack of binding agreements, enabling each player to independently make decisions to maximize their personal interests. These games focus on the selection of individual strategies. The Stackelberg game, a specific type of non-cooperative game, is frequently utilized in the context of VPPs. Da et al. proposed a non-cooperative game coordination control (NCGCC) method for VPPs [21]. Zhao et al. [22] combined the Stackelberg game with cooperative game theory to develop a hybrid game model. This model aimed to enhance the coordination of unit operation, demand response, and P2P energy transactions among community energy provider (CEP) alliances, mediated by distribution system operators (DSOs) through strategic electricity pricing.
Existing studies make significant contributions to optimizing the strategies of market participants and governments. However, it is essential to acknowledge several limitations, including reliance on static models, the complexity and computational demands of the models, insufficient exploration of dynamic services and spatial factors, and inadequate attention to carbon emission and trading mechanisms. Despite these challenges, game theory continues to offer substantial advantages in studying VPPs and remains a powerful tool for optimizing their efficiency and economic performance.
In addressing the challenges of carbon trading markets, it is imperative for market participants and governments to adopt distinct strategic behaviors. As regulators and policy makers, governments can facilitate the advancement and integration of renewable energy through a range of incentive policies and regulatory tools, while ensuring grid stability. Market participants, such as energy companies, power generation companies, and electricity sales companies should tailor their strategies based on the government’s array of policy tools and market opportunities in order to optimize returns and drive sustainable energy adoption. Consequently, this paper seeks to analyze the long-term behavior of market participants and governments in procuring renewable energy within VPPs and carbon trading markets. It presents an evolutionary game model that mirrors real-world dynamics, offering guidance for formulating future policies for both governments and market participants.
Evolutionary game theory (EGT), derived from imitation dynamics, applies the principles of game theory to evolutionary problems in biology and offers a robust framework for analyzing the decision-making process among boundedly rational agents within a population [23]. EGT is rooted in traditional game models, such as the prisoner’s dilemma, the hawk–dove game, and the public goods game, but distinguishes itself by acknowledging that individual behavior may be influenced by factors such as bounded rationality, incomplete information, or behavioral patterns, and does not necessitate perfect information conditions [24,25,26]. Furthermore, as a form of dynamic game theory, evolutionary game theory underscores the dynamic adaptation and evolution of strategies. Individuals are capable of making dynamic decisions and adjusting their strategies through learning from their opponents’ behaviors until their strategies converge to an evolutionarily stable strategy (ESS), which is a refinement of the Nash equilibrium.
In this study, we utilize evolutionary game theory to extend the analysis of strategic behavior exhibited by market participants and governments within the framework of carbon trading. Specifically, we propose a multi-level market participant strategy: in addition to adopting a virtual power plant (R strategy) or maintaining existing emission levels (NR strategy), participants can also opt for a hybrid strategy (HR strategy) that partially adopts a virtual power plant. On the government side, we consider a more intricate mix of policy tools, including various policies (I strategies) such as carbon taxes, renewable energy subsidies, and technology research and development incentives. We examine the long-term evolution of these policy combinations and their impact on markets. Based on these considerations, the primary objectives of this paper can be summarized as follows:
- (1) To what extent does the adoption of VPPs depend on different policy configurations (i.e., policies R, NR, and HR)?
- (2) What proportion of market participants choose strategies R, NR, and HR based on long-term evolutionary behavior?
- (3) How do government subsidies affect the strategies of market regulators?
Our responses to these inquiries will provide guidance to market participants regarding their future VPP utilization strategies and allocation, as well as underscore the necessity of government subsidies.
The manuscript is organized as follows. In Section 2, we delineate the evolutionary game model advanced in this study, detailing the problem statement, model development, the iterative dynamics of the game, and a comprehensive sensitivity analysis. Section 3 utilizes the VPP of State Grid Jibei Electric Power Co., Ltd., Beijing, China, as a case study, conducting a comparative analysis with Germany to provide empirical evidence supporting the effectiveness of the model. Finally, Section 4 concludes the paper with a summary of our findings and their implications for future research directions.
2. Methods
Given the bounded rationality of both government and market participants, the strategy selection process for the two players in the game is not static. Instead, players continuously adapt their strategies dynamically by observing and matching their revenues. Consequently, our evolutionary game model is employed to describe the system’s evolution over time. In this section, we elaborate on the model in detail, grounded on the fundamental assumptions utilized in this study.
2.1. Problem Definition
In addressing the greenhouse effect, climate change, and environmental degradation, we consider the integration of carbon markets. Within this framework, Player A, representing a market participant, can choose to completely adopt VPPs to reduce carbon emissions (R strategy), continue to maintain current emission levels by using conventional power plants (NR strategy), or partially adopt VPPs, implementing a hybrid energy strategy to reduce emissions to varying degrees (HR strategy). As a government entity, Player B can implement a carbon tax policy to incentivize carbon emission reductions, provide renewable energy subsidies to encourage market participants to adopt VPPs, or offer technology research and development incentives to promote innovation and technological advancement. Alternatively, the government can choose to implement a combination of carbon tax, subsidies, and research and development incentives (I strategy) or opt for no policy intervention (NI strategy).
According to the principles of evolutionary games, we initially posit the following fundamental hypotheses:
- (1) When market participants employ VPPs (R strategy) for power regulation, they generate revenue and derive further benefits: the implementation of VPPs reduces carbon emissions and ensures compliance with environmental regulations, thereby promoting sustainability. Market participants also bear the costs associated with VPP operations, which include the deviation output penalty and the costs of energy storage systems (ESS), energy management systems (EMS), distributed energy resources (DERs), interruptible load (IL), and shifting station load (SSL), as shown in Equation (1).
- (2) When market participants utilize conventional power plants (NR strategy), they earn the corresponding revenue and bear the maintenance and operational costs of conventional power plants.
- (3) If market participants opt to implement a hybrid strategy (HR strategy), they earn the associated revenue and bear the associated costs.
- (4) The government earns a net income from the implementation of policies (including environmental benefits). It also incurs a total cost for implementing a comprehensive strategy (including subsidy costs and policy implementation costs), plus an additional cost for environmental governance. The government provides subsidies to market participants and imposes carbon taxes on them. If market participants opt for the NR strategy, the government can generate supplementary income from fines F.
- (5) The strategic decisions of market participants are affected by dynamic fluctuations in energy market prices, the uncertainty surrounding technological advancements, and the variability and unpredictability of policy implementation.
In this model, we assume that Player A, representing market participants, aims to maximize profit, while Player B, representing the government, aims to maximize social welfare, encompassing both environmental and economic benefits. The trading process of the carbon emission market is treated as an iterated game played over an infinite time horizon in multiple rounds, where each round’s outcome influences the decision-making in subsequent rounds.
2.2. Model Construction
The payoff matrices of market participant A and government B under different strategies are shown in Equations (2) and (3).
In the mixed strategy, two coefficients denote the respective proportions of government subsidies and carbon taxes. Under policy I, market participants employing strategy R or HR are eligible for the subsidy but may also be subject to the carbon tax. Those opting for strategy NR may face a penalty F.
Table 1 presents the payoff matrix for the two players in the game, with each player having two strategic options, and lists the respective payoffs under each strategy combination.
2.3. Replication Dynamic Analysis
Replicator dynamics are differential equations that describe how the proportion of a population using a given strategy changes over time. The equations for each player are given below.
2.3.1. Government
Given two distinct government strategies, denoted as I and NI, each with a corresponding expected payoff, let the proportion of governments adopting strategy I be represented by y, while the proportion adopting strategy NI is denoted as 1 − y. The average payoff for the government can be expressed as Equation (4).
The replicator dynamic equation governing the government’s strategy I is shown in Equation (5).
Similarly, the replicator dynamic equation for strategy NI is given in Equation (6).
Following the Malthusian equation, the replicator dynamic equation for the probability y that the government selects strategy I is given by Equation (7).
Setting dy/dt = 0 yields the equilibrium points of y, as given in Equation (8).
- (1) When y = 0, the government never implements the policy. If the payoff of implementing policy I is lower than that of not implementing it (NI), then y = 0 is a stable equilibrium point.
- (2) When y = 1, the government always implements policy I. If the payoff from implementing policy I exceeds that of not implementing it (NI), then y = 1 is a stable equilibrium point.
- (3) When the payoffs of the two policies are equal, any y is an equilibrium point. In this scenario, the government can implement either policy in any proportion without affecting the stability of the system.
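For reference, the two-strategy replicator dynamic described above has the following standard form. The symbols U_I, U_NI, and \bar{U}_G (expected payoff of implementing the policy, expected payoff of not implementing it, and the government's average payoff) are placeholder names assumed here for illustration and are not necessarily the notation of Equations (4)–(8).

```latex
\begin{align}
\bar{U}_G &= y\,U_I + (1-y)\,U_{NI}, \\
\frac{dy}{dt} &= y\,\bigl(U_I - \bar{U}_G\bigr) = y\,(1-y)\,\bigl(U_I - U_{NI}\bigr), \\
\frac{d}{dy}\!\left(\frac{dy}{dt}\right) &= (1-2y)\,\bigl(U_I - U_{NI}\bigr)
\quad \text{(treating } U_I - U_{NI} \text{ as independent of } y\text{)}.
\end{align}
```

Setting dy/dt = 0 gives y = 0, y = 1, or U_I = U_NI. A boundary point is stable when the derivative in the last line is negative there, which reproduces the three cases listed above: y = 0 is stable if U_I < U_NI, and y = 1 is stable if U_I > U_NI.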
2.3.2. Market Participant
Market participants employ two strategies, denoted as strategy R and strategy NR, each with a corresponding expected payoff. The proportion of participants choosing strategy R is represented by the variable x, while the proportion choosing strategy NR is 1 − x. The average payoff for market participants is expressed in Equation (9).
Following the Malthusian equation, the replicator dynamic equation governing the selection of strategy R by market participants is given by Equation (10).
Setting dx/dt = 0, the equilibrium points are determined by Equation (11).
- (1) When x = 0, all market participants opt for strategy NR, which then dominates the entire market. If the payoff of strategy NR is higher than that of strategy R, then x = 0 is a stable equilibrium point, as no participants are inclined to switch to the less profitable strategy R.
- (2) When x = 1, all market participants opt for strategy R. If the payoff of strategy R exceeds that of strategy NR, then x = 1 is a stable equilibrium point, as no participants are inclined to switch to the less profitable strategy NR.
- (3) When the payoffs of the two strategies are equal, any x is an equilibrium point, leading to a stable proportion of the two strategies within the system.
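The three cases above can be checked numerically with a minimal sketch. The payoff values in `PAYOFF` are hypothetical placeholders chosen for illustration, not the entries of Table 1, and the function names are likewise assumptions.

```python
# Hypothetical payoffs of the market participant (not the values of Table 1).
# Keys are (participant strategy, government strategy).
PAYOFF = {
    ("R", "I"): 5.0,
    ("R", "NI"): 3.0,
    ("NR", "I"): 3.0,
    ("NR", "NI"): 5.0,
}


def expected_payoffs(y):
    """Expected payoffs of R and NR when a share y of governments plays I."""
    u_r = y * PAYOFF[("R", "I")] + (1 - y) * PAYOFF[("R", "NI")]
    u_nr = y * PAYOFF[("NR", "I")] + (1 - y) * PAYOFF[("NR", "NI")]
    return u_r, u_nr


def replicator_rate(x, y):
    """dx/dt = x * (U_R - average payoff), with x the share playing R."""
    u_r, u_nr = expected_payoffs(y)
    average = x * u_r + (1 - x) * u_nr
    return x * (u_r - average)


def stable_boundary(y):
    """Stable pure-strategy share of R (0.0 or 1.0); None if the payoffs tie."""
    u_r, u_nr = expected_payoffs(y)
    if u_r > u_nr:
        return 1.0
    if u_r < u_nr:
        return 0.0
    return None
```

With these placeholder payoffs, `stable_boundary(1.0)` returns `1.0` (everyone adopts R when the policy is always implemented), `stable_boundary(0.0)` returns `0.0`, and `stable_boundary(0.5)` returns `None` because the two payoffs coincide.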
2.4. Stability Analysis of Equilibrium Points
By solving the system of differential equations dx/dt = 0 and dy/dt = 0, we can determine five equilibrium points: (0, 0), (0, 1), (1, 0), (1, 1), and an interior point (x*, y*).
The four corner points constitute the boundary of this evolutionary game. The Jacobian matrix of the system is shown in Equation (12).
By utilizing the chain rule and simplification techniques, the components of the Jacobian matrix can be individually derived as demonstrated in Equations (13)–(16).
The detailed elements of the Jacobian matrix are expressed in Equation (17).
Concurrently, the conclusions presented in Table 2 can be deduced from the matrix. According to the stability criteria outlined in Table 2, the points (0, 1) and (1, 1) are stable when the corresponding condition in Table 2 holds. This means that under these conditions, the government’s policy I is implemented and market participants may choose either NR (x = 0) or R (x = 1). This suggests that the government’s policy can effectively incentivize market participants to adopt certain strategies, but this effectiveness depends on the parameter F and how well the policy is implemented. If the government’s incentive is sufficiently strong, market participants are more likely to choose the R strategy at the point (1, 1), thereby achieving the goal of policy I. The stability at the point (0, 1) suggests that despite the government’s policy implementation, market participants may still opt for the NR strategy. This could be attributed to ineffective policy implementation or inadequate incentives provided to market participants.
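The determinant and trace test behind this kind of stability table can be sketched as follows. The sketch assumes, purely for illustration, that each player's payoff gap is linear in the other player's proportion: a + b·y for R versus NR, and c + d·x for I versus NI. The coefficients a, b, c, and d are hypothetical and do not come from the payoff matrix of the paper.

```python
def jacobian(x, y, a, b, c, d):
    """Jacobian of dx/dt = x(1-x)(a + b*y) and dy/dt = y(1-y)(c + d*x)."""
    return (
        ((1 - 2 * x) * (a + b * y), x * (1 - x) * b),
        (y * (1 - y) * d, (1 - 2 * y) * (c + d * x)),
    )


def classify(x, y, a, b, c, d):
    """Classify an equilibrium by the determinant and trace of the Jacobian."""
    (j11, j12), (j21, j22) = jacobian(x, y, a, b, c, d)
    det = j11 * j22 - j12 * j21
    trace = j11 + j22
    if det > 0 and trace < 0:
        return "stable (ESS)"
    if det > 0 and trace > 0:
        return "unstable"
    if det < 0:
        return "saddle"
    return "undetermined"


# Hypothetical coefficients (assumed for illustration only).
A, B, C, D = -1.0, 2.0, -1.0, 2.0
CORNERS = [(0, 0), (0, 1), (1, 0), (1, 1)]
INTERIOR = (-C / D, -A / B)  # point where both payoff gaps vanish

for point in CORNERS + [INTERIOR]:
    print(point, classify(*point, A, B, C, D))
```

Under these placeholder coefficients, (0, 0) and (1, 1) satisfy det > 0 and trace < 0, (0, 1) and (1, 0) are unstable, and the interior point is a saddle. Other coefficient choices change which corners are stable, which is why the conditions in the stability table matter.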
When the system is close to the states of and , it deviates from them due to dynamic changes. In the case of , market participants may entirely shift toward choosing the R strategy over the strategy. This illustrates that in the absence of policy implementation I, market participants will ultimately prefer to choose the strategy regardless of penalty F.
To further analyze and predict the long-term evolution trend of the system and formulate corresponding strategies, we conducted a region of attraction analysis on the equilibrium points (0, 0) and (1, 1), as depicted in Figure 3.
In Figure 3, the domain of attraction of (0, 0) encompasses the vicinity near the bottom left corner, while the domain of attraction of (1, 1) includes the region near the top right corner. The equilibrium points (1, 0) and (0, 1) are unstable, with system states exhibiting transient behavior in their proximity. This indicates that in this model, if initial conditions are close to (0, 0), the system will ultimately converge to the equilibrium point (0, 0), where market participants opt for strategy NR and the government refrains from implementing policy I. Similarly, if initial conditions are close to (1, 1), the system will tend toward the equilibrium point (1, 1), where market participants choose strategy R and the government implements policy I.
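A region of attraction of this kind can be explored numerically by integrating the two replicator equations from different initial states. The sketch below uses forward Euler steps and the same illustrative assumption of linear payoff gaps (a + b·y and c + d·x); the coefficient values, step size, and convergence thresholds are hypothetical choices, not parameters of the paper.

```python
def simulate(x0, y0, a=-1.0, b=2.0, c=-1.0, d=2.0, dt=0.01, steps=5000):
    """Integrate the two replicator equations with forward Euler steps."""
    x, y = x0, y0
    for _ in range(steps):
        dx = x * (1 - x) * (a + b * y)  # market participants: share playing R
        dy = y * (1 - y) * (c + d * x)  # government: share playing I
        # Clip to [0, 1] so numerical error cannot push a share out of range.
        x = min(1.0, max(0.0, x + dt * dx))
        y = min(1.0, max(0.0, y + dt * dy))
    return x, y


def basin(x0, y0):
    """Name the corner that the trajectory from (x0, y0) ends up next to."""
    x, y = simulate(x0, y0)
    if x < 0.01 and y < 0.01:
        return "(0, 0)"
    if x > 0.99 and y > 0.99:
        return "(1, 1)"
    return "undecided"
```

For example, `basin(0.3, 0.3)` ends at (0, 0) and `basin(0.7, 0.7)` ends at (1, 1) under these placeholder coefficients. Sweeping `basin` over a grid of initial states gives a rough picture of the two domains of attraction.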
Therefore, for policymakers aiming to achieve the (1, 1) state, it is essential for the government to develop policies that ensure the initial state falls within the upper-right region. This will effectively align the strategies and policies of both market participants and the government. Market participants should likewise select strategies within this region, specifically choosing the R strategy to comply with government policy I in order to avoid being fined F.
Furthermore, a sensitivity analysis was conducted on the policy incentive and the penalty F. As illustrated in Figure 4, the dynamic behavior of market participant and government strategy selection was determined under varying penalty F and policy incentive conditions. The analysis demonstrated that robust policy incentives could effectively steer both the market and the government towards selecting environmental protection strategies, leading to a movement towards the environmental protection equilibrium point (1, 1), even in the absence of fines. While fines exerted a significant binding effect on market participants, they alone were insufficient to incentivize governments to implement environmental policies. The optimal outcome occurred when strong fines and robust policy incentives were combined, enabling elevated coordination between the market and the government to maximize benefits.
2.5. Simulation Analysis
In Figure 5, we examined the temporal dynamics of the proportions of market participants (x) and governments (y) across various initial conditions. When the initial environmental bias of market participants and governments was low, there was little incentive for the system to adjust, leading to a rapid decline in environmental status. Similarly, even under high initial conditions, without sustained incentives or policy support within the system, market participants and governments eventually abandoned environmental protection strategies and policies, resulting in a low level of environmental protection. This indicates that even with a strong inclination towards environmental protection at the outset, the system may struggle to maintain elevated levels of environmental protection without adequate policy incentives or penalties. Therefore, continuous incentives and rigorous policy enforcement are necessary to effectively uphold a high level of environmental strategy and policy implementation.
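The qualitative behavior described here can be illustrated with a short time-series sketch. It again assumes linear payoff gaps (a + b·y for R versus NR and c + d·x for I versus NI). The `WEAK` and `STRONG` coefficient sets are hypothetical stand-ins for insufficient and sufficient incentives and were not calibrated to the simulation in the paper.

```python
def trajectory(x0, y0, a, b, c, d, dt=0.01, t_end=50.0, sample_every=500):
    """Return sampled (t, x, y) points of the two-population replicator system."""
    x, y = x0, y0
    steps = int(round(t_end / dt))
    samples = [(0.0, x, y)]
    for k in range(1, steps + 1):
        dx = x * (1 - x) * (a + b * y)
        dy = y * (1 - y) * (c + d * x)
        # Forward Euler update, clipped so shares stay within [0, 1].
        x = min(1.0, max(0.0, x + dt * dx))
        y = min(1.0, max(0.0, y + dt * dy))
        if k % sample_every == 0:
            samples.append((k * dt, x, y))
    return samples


# Hypothetical coefficient sets (assumptions for illustration only).
# WEAK: R and I never pay off, even when the other side fully cooperates.
WEAK = dict(a=-1.0, b=0.5, c=-1.0, d=0.5)
# STRONG: R and I pay off once enough of the other population cooperates.
STRONG = dict(a=-1.0, b=2.0, c=-1.0, d=2.0)

if __name__ == "__main__":
    for label, params in (("weak", WEAK), ("strong", STRONG)):
        t, x, y = trajectory(0.9, 0.9, **params)[-1]
        print(f"{label}: t={t:.0f}, x={x:.3f}, y={y:.3f}")
```

Starting from the same high initial state (0.9, 0.9), the `WEAK` setting decays towards (0, 0) while the `STRONG` setting converges to (1, 1), which mirrors the observation that a favorable starting point alone does not sustain the environmental strategy.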
4. Conclusions
The continuous advancement of technology and the enhancement of market mechanisms position VPPs as playing an increasingly pivotal role in the global energy transition. However, current research exhibits deficiencies in several areas. Most studies primarily focus on optimizing individual VPP systems while neglecting interactions among multiple market participants and the long-term effects of governmental policy changes and market conditions on VPP expansion and implementation. Furthermore, although some research has explored integrating carbon trading mechanisms with VPPs, there is a lack of dynamic analysis regarding strategy evolution within complex market environments. These studies commonly rely on static models that inadequately simulate real-market interactions among policy, technology, and participant behavior. Additionally, existing models lack comprehensive capabilities for addressing uncertainties, market volatility, and technological advancements’ impact on VPP expansion.
We introduced a sophisticated dynamic evolutionary game model to address these deficiencies, combined with the practical application of a carbon trading market and VPPs. This model simulated the strategic choice between market participants and the government, as well as its long-term evolution process, and profoundly analyzed their impact on revenue. We not only analyzed the impact of policy support and VPP technology integration on earnings but also evaluated the dynamic interaction of government policy adjustments and changes in market participants’ behavior through quantitative simulations. Different from previous static models, this model can dynamically capture the trend of market participants’ strategy evolution and predict the long-term impact of government policies in an uncertain environment. The results showed that a VPP strategy had wide applicability and scalability in significantly improving energy management efficiency and reducing carbon emissions. Whether in regions with strong government policy support or in market-driven environments, the integration of VPPs can effectively optimize the operation of energy systems and significantly improve benefits for both market participants and governments.
Our quantitative simulation analysis demonstrated that the implementation of VPP technology, in conjunction with government policies, resulted in an average 90% increase in market participants’ revenue and a 35% increase in government revenue. The simulation analysis from the case study also validated the efficacy of the VPP strategy. In the case of State Grid Jibei Electric Power Co., Ltd., substantial economic benefits were realized by market participants through the proactive adoption of VPP technology and its integration with government policies. These findings have broader implications for other regions, such as Germany, whose Energiewende policy effectively facilitated widespread adoption of VPPs and renewable energy development through strategic policy incentives and active market participation. This underscores that irrespective of the market and policy landscape, appropriate incentives and support mechanisms can yield successful outcomes for implementing the VPP strategy across diverse regions and market conditions.
The research in this paper demonstrates that as technology advances and the market expands, there is potential for wider promotion and application of the VPP strategy. The government should continue to refine subsidy strategies, establish a dynamic policy framework adaptable to diverse markets, and ensure widespread adoption and long-term operation of VPP technology. To sustain the growth momentum of photovoltaic power generation, this necessitates regular assessment and adjustment of subsidy amounts, as well as the establishment of dynamic penalty mechanisms for non-compliant market participants. Key market players, such as power distribution companies and power generation companies, should invest in smart grid technologies, energy storage solutions, and real-time data analytics to drive innovation that enhances VPP efficiency and reliability and facilitates the effective expansion of VPP technology across various markets. In order to drive a more sustainable and resilient energy future, our research will continue to investigate the long-term impact of electric vehicles on energy markets and the environment, including their socio-economic implications, potential risks, and benefits associated with widespread adoption.
In conclusion, VPPs hold significant potential for application in the electricity market and offer promising opportunities for enhancing energy efficiency and reducing carbon emissions. Through the utilization of evolutionary game theory and strategic decision-making, governments and market participants can effectively optimize the integration and management of VPPs, thereby developing more robust strategies. The introduction of a dynamic evolutionary game model not only establishes a theoretical foundation for implementing VPP in the electricity market but also provides practical guidance for policy effectiveness and market behavior within complex market conditions. This study offers valuable insights for policymakers and market participants seeking to formulate effective strategies, serving as a reference point for future energy transition and sustainable development pathways.