Integrating Evolutionary Game-Theoretical Methods and Deep Reinforcement Learning for Adaptive Strategy Optimization in User-Side Electricity Markets: A Comprehensive Review

Lefeng Cheng; Xin Wei; Manling Li; Can Tan; Meng Yin; Teng Shen; Tao Zou

doi:10.3390/math12203241

School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou 510006, China

* Authors to whom correspondence should be addressed.

† These authors contributed equally to this work.

Mathematics 2024, 12(20), 3241; https://doi.org/10.3390/math12203241

This article belongs to the Section E2: Control Theory and Mechanics

Abstract

With the rapid development of smart grids, the strategic behavior evolution in user-side electricity market transactions has become increasingly complex. To explore the dynamic evolution mechanisms in this area, this paper systematically reviews the application of evolutionary game theory in user-side electricity markets, focusing on its unique advantages in modeling multi-agent interactions and dynamic strategy optimization. While evolutionary game theory excels in explaining the formation of long-term stable strategies, it faces limitations when dealing with real-time dynamic changes and high-dimensional state spaces. Thus, this paper further investigates the integration of deep reinforcement learning, particularly the deep Q-learning network (DQN), with evolutionary game theory, aiming to enhance its adaptability in electricity market applications. The introduction of the DQN enables market participants to perform adaptive strategy optimization in rapidly changing environments, thereby more effectively responding to supply–demand fluctuations in electricity markets. Through simulations based on a multi-agent model, this study reveals the dynamic characteristics of strategy evolution under different market conditions, highlighting the changing interaction patterns among participants in complex market environments. In summary, this comprehensive review not only demonstrates the broad applicability of evolutionary game theory in user-side electricity markets but also extends its potential in real-time decision making through the integration of modern algorithms, providing new theoretical foundations and practical insights for future market optimization and policy formulation.

Keywords: smart grids; user-side electricity market; evolutionary game theory; deep Q-learning network (DQN); multi-agent systems; strategy optimization; real-time decision making; supply–demand fluctuations

MSC: 65M12

1. Introduction

With the continuous growth of global energy demand and the increasing urgency of sustainable development, the smart grid, as an important part of the modern power system, has attracted wide attention. As an emerging technology, the smart grid combines the power system with advanced information and communication technology [], which greatly improves the efficiency of energy utilization. However, Mollah et al. [] find that the smart grid makes it difficult to integrate other approaches that would further improve energy utilization efficiency and reliability. In this context, researchers are increasingly applying game theory to smart grids, especially in the field of energy management, to address these complex challenges.

Among the many game-theoretic models, evolutionary game theory (EGT) has become an effective tool for studying smart grid problems because it can capture the dynamic evolution of participants’ strategies, and their impact on real-time price formation, when participants are not fully rational. Unlike traditional game theory, EGT does not require participants to be completely rational []. Moreover, EGT describes more effectively how participants gradually optimize and evolve their strategies over the course of the game. It therefore has a unique advantage in modeling the multi-agent interactions of smart grids.

The rapid development of smart grids and the increasing complexity of user-side electricity markets necessitate advanced models to optimize strategy formulation among market participants. Traditional game theory, while useful for static systems, struggles to capture the dynamic and evolving nature of electricity market transactions where participants exhibit bounded rationality and adapt strategies over time. Traditional game theory typically assumes that agents have full information, as well as perfect rationality, and that a static Nash equilibrium is achievable. However, these assumptions are often unrealistic in decentralized and highly volatile energy markets where agents have limited information and must make decisions under conditions of uncertainty. Moreover, real-time changes in supply and demand require more flexible and adaptive approaches to reflect the continuous shifts in market conditions.

EGT addresses some of the limitations of classical game theory by introducing the concept of bounded rationality and enabling agents to adapt their strategies over time. EGT models the evolution of strategies within a population, where individuals continuously adjust their actions based on observed payoffs and their interactions with others. This makes EGT suitable for modeling the adaptive nature of agents in electricity markets. However, EGT has its own limitations. It primarily focuses on population-level dynamics, meaning that the rate of adaptation can be slow, especially in environments with rapid fluctuations like electricity markets. Furthermore, EGT often relies on simplified replicator dynamics, which may not fully capture the nuances of decision making in environments that are highly non-stationary and stochastic.

To address these challenges, integrating deep reinforcement learning (DRL) with EGT offers a promising solution by combining the strengths of both approaches. DRL provides a powerful tool for real-time strategy optimization, enabling market participants to adapt to fluctuating conditions effectively. DRL allows agents to interact continuously with their environment, learn from these interactions, and update their strategies in a manner that maximizes cumulative rewards. Unlike traditional methods, DRL uses neural networks to approximate value functions or policies, which makes it particularly effective in dealing with the high-dimensional state and action spaces characteristic of modern energy markets. This allows DRL to efficiently handle the complexity that comes with multiple market participants, dynamic pricing, and the integration of renewable energy sources.

This integration of both EGT and DRL bridges the gap between long-term evolutionary stability and real-time decision making. Specifically, EGT focuses on the evolution and stability of strategies within a population over longer timescales, which provides insights into the emergent behaviors and potential equilibria that can arise from repeated interactions. However, it lacks the granularity needed for individual-level adaptation in highly dynamic settings. On the other hand, DRL excels at enabling individual agents to learn optimal strategies quickly by adapting to changing market signals and optimizing for immediate and long-term rewards. Integrating DRL with EGT allows agents to not only adapt their behavior based on immediate market conditions but also contribute to a broader evolutionary process that ensures population-level stability and robustness in strategy formulation. This combination leads to the emergence of evolutionarily stable strategies (ESSs) that are both adaptive in real time and resilient over longer periods, making them highly suitable for environments like electricity markets where conditions change rapidly and unpredictably.

Furthermore, the combination of EGT and DRL is particularly advantageous in handling the integration of renewable energy sources, which are characterized by high volatility and uncertainty. Renewable energy sources, such as wind and solar power, introduce significant unpredictability into the grid due to their dependence on weather conditions. DRL’s reinforcement learning capabilities enable agents to adapt their strategies dynamically in response to changes in renewable generation, thereby improving the overall reliability of the grid. For example, prosumers (producers and consumers) can leverage DRL to make decisions about energy storage, consumption, or even trading based on real-time data, while EGT ensures that these individual decisions collectively lead to stable and efficient outcomes for the entire market.

Moreover, DRL enhances the scalability of EGT-based models. In traditional EGT models, the computational complexity can increase significantly as the number of participants grows, making it challenging to simulate large-scale markets. DRL, with its neural network-based function approximation, is able to generalize learning across similar states and actions, reducing the computational burden and making it feasible to apply evolutionary models to larger, more complex systems. This scalability is crucial for electricity markets, which are expected to expand significantly with the increasing adoption of distributed energy resources (DERs) and the rise of peer-to-peer (P2P) energy trading platforms. In summary, integrating DRL into EGT provides a synergistic framework that addresses the shortcomings of each individual approach. DRL brings the ability to optimize agent-level strategies in real time, effectively learning from high-dimensional and stochastic environments, while EGT provides the population-level perspective that ensures long-term stability and adaptability. This combination is particularly relevant in user-side electricity markets, where market participants must not only adapt quickly to immediate changes in supply and demand but also contribute to the long-term stability of the energy system. Therefore, the integration of DRL with EGT offers a comprehensive solution to optimize strategic decision making, enhance energy efficiency, and stabilize electricity markets in the face of rapid and unpredictable changes.

The purpose of this paper is to systematically review the application of EGT in smart grids, especially in power market energy trading. Through an in-depth analysis of the existing literature, this paper not only points out the advantages of EGT over classical game theory but also discusses its high applicability in smart grids. It also analyzes the impact of the evolution of participants’ strategies on real-time prices and provides new perspectives and suggestions for future research.

This review is significant, as it synthesizes key advancements in the application of EGT to smart grids, particularly in user-side electricity market transactions. By systematically analyzing the limitations of traditional game theory and the advantages of EGT, the paper offers valuable insights into how multi-agent systems in electricity markets evolve and optimize their strategies. The study further extends the applicability of EGT by exploring its integration with DRL, which provides the adaptability required in rapidly changing market environments. This combination offers new theoretical and practical approaches for improving decision making, enhancing market efficiency, and fostering stability in electricity markets, especially as the sector increasingly incorporates renewable energy sources.

The paper systematically reviews the use of EGT in user-side electricity markets, highlighting its relevance and advantages over classical game theory. It also explores how DRL can be integrated with EGT to optimize strategy formation in dynamic environments. The key innovations of the study are as follows:

(i): Integration of EGT with deep Q-learning network (DQN): The paper introduces a novel integration of EGT with the DQN, a DRL technique, to enhance the adaptability of market participants in real time. This combination allows for continuous learning and strategy optimization, enabling agents to adjust their behavior dynamically in response to fluctuating market conditions, which is particularly beneficial in fast-changing electricity markets.
(ii): Application to multi-agent systems: The study applies EGT to simulate interactions among multiple agents, such as consumers, producers, and regulators, in electricity markets. The multi-agent model used in the review provides a robust framework for understanding how these participants evolve their strategies over time, particularly under different pricing and demand-response scenarios.
(iii): Real-time decision making in energy trading: The integration of EGT with reinforcement learning (RL) addresses a critical limitation of traditional EGT, namely its inability to deal with real-time decision making in high-dimensional state spaces. The use of the DQN allows market participants to perform adaptive strategy optimization in real time, making it possible to better manage supply–demand fluctuations and price volatility in electricity markets.
(iv): Enhanced analysis of demand response: The paper extends the application of EGT to demand-response mechanisms, offering insights into how consumer behavior can be modeled and optimized using evolutionary strategies. This contributes to more effective load management and peak shaving, improving grid stability and reducing operational costs.
(v): Practical applications in renewable energy markets: The study explores the application of EGT in renewable energy integration, including the P2P trading of surplus energy. It demonstrates how EGT can model both cooperative and competitive dynamics in decentralized energy markets, providing a theoretical basis for optimizing energy storage and trading strategies.

In summary, this paper makes significant contributions by reviewing the current state of EGT in electricity markets and proposing its integration with DRL to enhance strategy optimization. The study not only advances the theoretical understanding but also provides practical insights for policymakers and market operators seeking to improve efficiency and adaptability in modern electricity markets.

This paper is organized as follows: Section 1 introduces the background and motivation for the study, highlighting the increasing complexity of user-side electricity markets and the potential of integrating EGT with DRL to optimize strategy dynamics. Section 2 reviews the fundamentals of EGT, including its key principles and advantages over traditional game theory, and explores its applicability in various fields such as biology, economics, and electricity markets. Section 3 focuses on the application of EGT in smart grids, discussing the multi-agent structure and interaction in smart grids and the advantages of using EGT to model dynamic strategic behavior. Section 4 discusses the role of EGT in energy trading, particularly in optimizing real-time pricing, demand response, and renewable energy integration, providing case studies and theoretical insights. Section 5 introduces the DQN and explores its synergy with EGT, demonstrating how the DQN enhances real-time decision making and strategy optimization in dynamic electricity markets. Section 6 delves into the application of EGT to demand response, focusing on user behavior modeling and strategy evolution, with empirical examples of how EGT can enhance energy efficiency and grid stability. Section 7 presents an empirical analysis of EGT’s effectiveness in electricity market transactions, comparing traditional approaches with EGT-based strategies. Finally, Section 8 concludes with future prospects, emphasizing potential advancements in EGT modeling, algorithmic innovations, and interdisciplinary research in user-side electricity markets.

To compensate for the lack of a graphical summary, the following detailed outline helps illustrate the logical flow and relationships between the sections:

(i): Section 2 lays the theoretical groundwork by contrasting EGT with traditional game theory and its relevance in different domains.
(ii): Section 3 and Section 4 build on this foundation, focusing on specific applications within smart grids and energy trading, thereby highlighting the suitability of EGT for modeling multi-agent systems.
(iii): Section 5 introduces DRL, specifically the DQN, as a key tool for enhancing real-time strategic optimization capabilities.
(iv): Section 6 and Section 7 take a deeper empirical approach by exploring demand response and providing comparative analyses to validate the practical effectiveness of integrating EGT with DRL.
(v): Section 8 connects these findings back to broader research opportunities, thereby providing a holistic view of the potential benefits and future applications of combining EGT with modern machine learning approaches.

2. Fundamentals of EGT

2.1. An Overview of EGT

Evolutionary game theory (EGT), pioneered by J. Maynard Smith and G. R. Price, studies how animal behaviors, or phenotypes, evolve as a result of competition between different strategies []. Drawing on the ideas of biological and dynamic evolution, EGT examines how the composition of a population or group changes over time. These basic concepts can be explained using a simple model, the hawk–dove game [], which is often used to illustrate ESSs in EGT. The authors of Ref. [] define an ESS as a strategy such that, if it is adopted by all members of a population, no mutant strategy can invade the population under the influence of natural selection. The hawk–dove game assumes two strategies: the “hawk” strategy represents the aggressive type, which goes all out to compete for resources, and the “dove” strategy stands for passive defense. When most participants in a group adopt the “hawk” strategy, fierce internal competition means that “hawk” individuals often pay a high price, such as injury or even death. “Dove” individuals are then more likely to survive and reproduce because they avoid fierce conflicts, and the proportion of the “dove” strategy in the group gradually increases. Conversely, when the majority of the group are “dove” individuals, who compete only weakly, “hawk” individuals can easily obtain resources and thus have higher fitness, so the proportion of “hawk” individuals increases. Eventually, after a series of dynamic adjustments, the proportions of hawk and dove individuals in the group reach a stable state in which no other “disruptive” strategy gives participants a greater advantage []. This stable proportion is the embodiment of the ESS. Expressed as a formula, it is

$$u = \frac{v}{c}, \qquad c > v$$

[]. In this context, u denotes the stable proportion of individuals adopting the “hawk” strategy within the population. As one subset of the group opts for the “hawk” strategy while the other adopts the “dove” strategy, the system evolves until the hawk share stabilizes at u. The variable v represents the resource payoff, that is, the benefit an individual receives upon successfully securing resources in the game, and c stands for the cost of conflict: the “hawk” strategy incurs higher losses due to intense competition, while the “dove” strategy minimizes conflict-related costs by avoiding direct confrontation. The ratio lies between 0 and 1 only when the cost of conflict exceeds the resource payoff (c > v); otherwise the “hawk” strategy takes over the whole population. The steady ratio of these two strategies is illustrated in Figure 1 [,,,].

Figure 1. Dynamic evolution of hawk–dove strategies [,,,].

The “hawk” strategy typically involves obtaining resources through aggressive competition, while the “dove” strategy secures resources through passive means. As the relative proportions of these strategies evolve, the system approaches a state of regional equilibrium and stability.
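The convergence described above can be reproduced numerically. The following is a minimal sketch, assuming the standard hawk–dove payoffs (hawk against hawk earns (v − c)/2, hawk against dove earns v, dove against hawk earns 0, dove against dove earns v/2) and simple replicator dynamics integrated with Euler steps; the payoff table, the function name, and the step parameters are illustrative assumptions rather than details taken from the reviewed works.

```python
def hawk_share_trajectory(v, c, x0, dt=0.01, steps=20000):
    """Euler-integrate the replicator equation for the hawk share x.

    Assumed payoffs (standard hawk-dove parameterization, not from the source):
    hawk vs hawk = (v - c) / 2, hawk vs dove = v,
    dove vs hawk = 0,           dove vs dove = v / 2.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        fitness_hawk = x * (v - c) / 2 + (1 - x) * v
        fitness_dove = (1 - x) * v / 2
        # Replicator dynamics: a strategy grows when it beats the alternative.
        x += dt * x * (1 - x) * (fitness_hawk - fitness_dove)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    v, c = 2.0, 4.0  # hypothetical resource value and conflict cost, c > v
    for start in (0.1, 0.9):
        final = hawk_share_trajectory(v, c, start)[-1]
        print(f"start={start:.1f} -> final hawk share={final:.4f}, v/c={v / c:.4f}")
```

Under these assumptions the fitness difference between hawks and doves is (v − xc)/2, which vanishes at a hawk share of v/c, so trajectories starting from either a dove-dominated or a hawk-dominated population approach the same mixed state.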

2.2. Advantages of EGT

EGT builds on traditional game theory, but the two differ in several respects. Traditional game theory assumes complete rationality: participants are taken to be capable of accurately calculating and selecting the optimal strategy. It employs static analysis to solve for Nash equilibria, focusing primarily on the existence of the Nash equilibrium and on participants’ selection of optimal strategies. In contrast, EGT assumes bounded rationality: participants arrive at the optimal strategy through continuous trial and error. It uses dynamic evolutionary methods to analyze how participants’ strategies evolve, emphasizing the evolutionary process and the stability of strategies, particularly how factors such as environmental changes affect stable strategies over the long-term evolutionary process. Table 1 shows a simple comparison.

Table 1. Differences between traditional game theory and EGT.

EGT offers advantages over classical game theory in two key areas. First, EGT abandons the assumption that game players must be perfectly rational []. In EGT, participants often cannot calculate the optimal solution rationally but instead find the best strategy through continuous trial and error. Game theory can also be thought of as a theory of strategic interaction, in which each stakeholder in a decision must consider how other decision-makers will act and how those actions will affect their own interests []. Decision-makers must therefore consider not only their own interests but also the strategies and actions of others, and they continuously adjust and optimize their strategies to maximize their benefits. For example, in an open competition, no participant knows the best way to play at the outset; participants move forward via trial and error, trying different methods several times and drawing on the experience of other participants before finally settling on the optimal strategy. Classical game theory is different. Ref. [] indicates that the classical game theory represented by Nash rests on assumptions including the following: (1) the players are completely rational, and (2) the participants have common knowledge. In real-world scenarios, however, players cannot be perfectly rational, and their knowledge is shaped by inherent factors and environmental conditions, so common knowledge is lacking. Classical game theory therefore often does not align with real-world scenarios, which limits its practical applicability. Second, EGT focuses on ESSs, which prevent any mutant strategy from successfully invading the population under natural selection. ESSs thus provide stability within a dynamically evolving system. Classical game theory, by contrast, relies on mathematical analysis and logical reasoning to determine the Nash equilibrium through modeling. However, the Nash equilibrium requires each player to choose strategies and actions rationally [], and the analysis yields only the equilibrium itself, so it is difficult to observe the dynamic process by which the game approaches equilibrium. In summary, EGT differs from classical game theory by accounting for participants’ limited rationality and information, and it describes the dynamic evolution of strategies rather than focusing solely on static equilibrium. This makes it better suited to real-world applications.
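The difference between a Nash equilibrium and an ESS can be made concrete with a small check. The sketch below assumes a symmetric two-player game given as a payoff matrix, where `payoffs[a][b]` is the payoff to a player using strategy `a` against strategy `b`, and it tests pure strategies against pure mutants only. The function names and the example matrix are hypothetical and chosen purely for illustration.

```python
def is_symmetric_nash(payoffs, i):
    """True if pure strategy i is a best reply to itself."""
    return all(payoffs[i][i] >= payoffs[j][i] for j in range(len(payoffs)))


def is_pure_ess(payoffs, i):
    """True if pure strategy i resists invasion by every other pure strategy.

    Uses the two Maynard Smith conditions for each mutant j != i:
      (1) payoffs[i][i] > payoffs[j][i], or
      (2) payoffs[i][i] == payoffs[j][i] and payoffs[i][j] > payoffs[j][j].
    """
    for j in range(len(payoffs)):
        if j == i:
            continue
        strictly_better = payoffs[i][i] > payoffs[j][i]
        tie_but_beats_mutant = (
            payoffs[i][i] == payoffs[j][i] and payoffs[i][j] > payoffs[j][j]
        )
        if not (strictly_better or tie_but_beats_mutant):
            return False
    return True


if __name__ == "__main__":
    # Hypothetical game: strategy 0 is a (weak) Nash equilibrium but can be invaded.
    game = [[1, 1],
            [1, 2]]
    for s in range(2):
        print(s, is_symmetric_nash(game, s), is_pure_ess(game, s))
```

In this example strategy 0 is a best reply to itself, yet a mutant playing strategy 1 does equally well against residents and better against other mutants, so strategy 0 fails the stability test. This illustrates why an ESS is a stricter requirement than a Nash equilibrium.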

2.3. Bridging EGT and RL for Enhanced Strategy Formulation in Electricity Markets

EGT and RL are instrumental in understanding adaptive behaviors and strategic decision making in complex systems, such as electricity markets. While EGT models strategic interactions over extended periods within a population, RL facilitates individual learning from environmental interactions to optimize decision making in real time. This section elucidates the intrinsic connection between these theories and their combined potential to innovate electricity market strategies.

Based on this, the theoretical synergy is elaborated as follows.

(i): Interdisciplinary integration: Integrating EGT with RL provides a comprehensive framework that combines the temporal dynamics of strategy evolution with the adaptability of individual learning mechanisms. This integration is particularly potent in environments like electricity markets, where agents must adapt to fluctuating conditions.
(ii): From population to individual learning: EGT focuses on population-level dynamics in which the success of strategies is influenced by their interaction with other strategies over time, tending towards an evolutionarily stable strategy (ESS). RL, conversely, emphasizes learning from individual experiences to maximize a reward function, adjusted continuously as the agent interacts with the environment. This shift from macro-level stability to micro-level adaptability is crucial for real-time responsiveness in electricity markets.
(iii): Dynamic adaptation: Both theories emphasize adaptation, but their approaches offer complementary strengths. EGT provides insights into the strategic stability and drift within populations, which is crucial for predicting long-term market trends. RL contributes by enabling agents to adjust strategies swiftly in response to immediate market changes, enhancing operational flexibility and efficiency.

Furthermore, the application advantages in electricity markets for EGT and RL can be summarized as follows.

(i): Handling high-dimensional data: Electricity markets are characterized by high-dimensional and dynamic state spaces due to variable demand, supply conditions, and numerous participant interactions. RL’s ability to handle large state spaces effectively complements EGT’s approach to population dynamics, making the integrated approach well suited for such environments.
(ii): Real-time optimization: The rapid decision-making capability of RL, combined with the strategic depth provided by EGT, allows market participants to optimize their strategies not just for immediate gains but also for long-term stability. This is crucial for managing real-time bidding, pricing strategies, and load balancing.
(iii): Enhanced strategic forecasting: By applying EGT, market analysts can predict the evolution of competing strategies in the market, which, when combined with the tactical agility of RL, provides a robust model for anticipating and reacting to market shifts. This dual approach aids in developing strategies that are both competitively robust and highly adaptive.
(iv): Policy development and simulation: The hybrid model facilitates the simulation and analysis of potential policy impacts on market behavior over time, aiding policymakers and companies in crafting rules and strategies that promote efficient and stable market operations.

In summary, the confluence of EGT and RL bridges the gap between strategic stability and operational adaptability, offering a powerful tool for electricity market participants to enhance both their long-term and real-time decision-making processes. This integrated approach not only enriches theoretical models but also delivers practical tools for real-world application, promising to revolutionize strategy formulation in dynamic and complex market environments like those of the electricity markets. This exploration paves the way for future research and development of more sophisticated models that combine these two potent theoretical frameworks.
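To illustrate how individual learning and population-level observation fit together, the following is a minimal sketch in which each agent learns independently while the population share of a strategy is recorded as the evolutionary observable. It uses stateless tabular Q-learning as a lightweight stand-in for a DQN (no neural network is involved), and it reuses assumed hawk–dove payoffs as a toy interaction; all names, payoffs, and parameter values are hypothetical and are not taken from the reviewed studies.

```python
import random

HAWK, DOVE = 0, 1


def payoff(a, b, v=2.0, c=4.0):
    """Payoff to a player choosing `a` against `b` (assumed hawk-dove payoffs)."""
    if a == HAWK and b == HAWK:
        return (v - c) / 2
    if a == HAWK and b == DOVE:
        return v
    if a == DOVE and b == HAWK:
        return 0.0
    return v / 2


def simulate_q_population(n_agents=100, rounds=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Return the per-round share of agents that played hawk."""
    rng = random.Random(seed)
    q_values = [[0.0, 0.0] for _ in range(n_agents)]  # one Q-value per action
    hawk_share = []
    for _ in range(rounds):
        # Individual level: epsilon-greedy action selection per agent.
        actions = []
        for q in q_values:
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))
            else:
                actions.append(HAWK if q[HAWK] >= q[DOVE] else DOVE)
        # Random pairwise matching; with an odd population one agent sits out.
        order = list(range(n_agents))
        rng.shuffle(order)
        for k in range(0, n_agents - 1, 2):
            i, j = order[k], order[k + 1]
            for me, other in ((i, j), (j, i)):
                a = actions[me]
                reward = payoff(a, actions[other])
                # Stateless Q-update: move the estimate toward the observed reward.
                q_values[me][a] += alpha * (reward - q_values[me][a])
        # Population level: record the strategy share, as EGT would track it.
        hawk_share.append(actions.count(HAWK) / n_agents)
    return hawk_share


if __name__ == "__main__":
    shares = simulate_q_population()
    print("mean hawk share over last 200 rounds:", sum(shares[-200:]) / 200)
```

The design choice here is to keep learning entirely at the agent level and to treat the strategy share only as a measured quantity. Whether the measured share settles near an evolutionarily stable mix depends on the learning parameters, so the sketch should be read as a way to observe that interplay, not as evidence of convergence.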

2.4. Exploration of Multi-Field Application of EGT

Evolutionary game theory (EGT), as a robust analytical tool, has found extensive and profound applications across various fields. This section explores its most widely used real-world applications, including cybersecurity, healthcare, industrial process optimization, and green electricity trading. Each of these fields leverages the principles of EGT to address complex strategic interactions, highlighting the versatility and continued relevance of EGT in both traditional and emerging domains. In biology, Ref. [] indicates that applying EGT to other biological problems was remarkably fruitful: in competitive environments, species with higher fitness are more likely to survive, and EGT offers a crucial theoretical framework for understanding biological evolutionary strategies and population dynamics.

First, in cybersecurity, EGT is used to model the dynamic interactions between attackers and defenders, where both parties adapt their strategies over time based on observed outcomes [,,,,,,]. This makes EGT particularly effective for understanding adversarial scenarios and designing robust security protocols. For example, Su et al. [] address mutual safety risk prevention and control strategies among industrial park enterprises using blockchain technology. The study integrates EGT with blockchain to enhance transparency, traceability, and collaboration between enterprises, mitigating shared risks. It also provides a comprehensive framework for improving decision-making processes in risk management, making it a valuable resource for researchers and practitioners in industrial safety and blockchain applications.
In this work, EGT was used to build a bounded-rationality model of the behavior of key stakeholders (core enterprises, supporting enterprises, and government regulatory departments) in mutual aid for safety risk prevention and control [].

Second, in healthcare, EGT has been employed to optimize resource allocation and model competitive behaviors among healthcare providers, improving both efficiency and patient outcomes [,,,,,]. For example, Xie et al. [] present an adherence strategy framework based on EGT to model and analyze behavioral dynamics during epidemic spreading. Its primary contribution is that it incorporates both individual and collective behavioral factors, offering a more comprehensive understanding of adherence dynamics in response to public health interventions. Its use of EGT to explore optimal strategies for enhancing adherence also provides valuable insights for policymakers aiming to improve compliance during epidemic outbreaks.

Third, industrial process optimization benefits from EGT, which helps firms determine evolutionarily stable strategies for resource utilization, fostering sustainable competition [,,,,,,]. For example, Xu et al. [] present a multidimensional evolutionary game model that addresses both energy efficiency and safety in the transportation of automated guided vehicles (AGVs) at automated container terminals. A key strength of this study is its integration of energy and safety concerns into the evolutionary game framework, which allows for a comprehensive analysis of AGV behavior under dynamic conditions. The proposed model offers a practical way to optimize transportation strategies, improving operational efficiency while minimizing risks in industrial environments.
Lastly, green electricity trading uses EGT to understand market dynamics involving renewable energy producers and consumers, promoting a stable and sustainable energy system [,,,,,,,]. For example, Zhou et al. [] analyze China’s green certificate trading system, highlighting the critical role of participant heterogeneity in shaping market dynamics. The study’s key contribution lies in applying a collective action framework to examine how varying interests and behaviors among market participants influence the effectiveness of green certificate trading. This approach offers valuable insights for policymakers aiming to enhance the efficiency and fairness of green markets by addressing participant diversity and coordination challenges.

In summary, the applications of EGT span traditional fields such as biology, economics, social sciences, and electricity markets, as well as newer areas such as cybersecurity, healthcare, and industrial process optimization. They underscore the adaptability of EGT in handling diverse challenges, demonstrating its effectiveness in providing strategic insights, optimizing resource allocation, and predicting outcomes in highly dynamic and uncertain environments. This versatility reinforces the significance of EGT as an analytical tool for modeling strategic behavior in complex, multi-agent systems, enabling stakeholders to make informed decisions and achieve stable, evolutionarily sound outcomes.

Cybersecurity applications: EGT has been applied to model defense strategies in response to cyber-attacks. For instance, a 2024 study used EGT to enhance security measures in decentralized systems by modeling attacker–defender dynamics in a stochastic environment []. This application shows how game theory can adapt to evolving threats in networked environments by helping stakeholders develop robust defense strategies under uncertainty. The 2024 study in [] further explored how EGT can predict and counter adaptive attack strategies: by describing how the strategies of both attackers and defenders evolve over time, it allows system administrators to implement more resilient security protocols. Furthermore, Su et al. [] examined the game strategy of mutual safety risk prevention and control among industrial park enterprises under blockchain technology, highlighting how EGT can be integrated with emerging technologies like blockchain to improve security collaboration among enterprises. This research shows how EGT can facilitate a cooperative defense strategy while maintaining competitive advantages among participants, thus improving the overall cybersecurity resilience of interconnected systems. Another notable contribution is the study by Liu et al. [], which presented a generic approach to generating network defense strategies based on EGT. Their model supports continuous adaptation, so that defenders can respond effectively to changing attack patterns with security measures tailored to evolving threats.
Similarly, Wang et al. [] employed a differential game approach to explore antagonistic dynamics in cybersecurity, where attackers and defenders engage in continuous strategic adjustment, providing insights into real-time security measures and optimized defense tactics. Additionally, Wu and Pan [] combined EGT with deep reinforcement learning to create an evolutionary game model for privacy behavior analysis on social networks. This integration of machine learning techniques with EGT allows the model to adjust dynamically to privacy threats, thereby optimizing privacy protection strategies within social network environments. These studies collectively demonstrate the adaptability of EGT in dynamic and adversarial environments like cybersecurity, where interactions between entities evolve continuously. By integrating EGT with other technologies, such as blockchain and deep reinforcement learning, researchers can enhance the strategic adaptability of both defensive and cooperative measures and thereby improve the resilience of cybersecurity systems.

Healthcare applications: In the healthcare sector, EGT has been applied to model competitive behaviors among healthcare providers. This includes strategic decision-making processes regarding resource allocation, such as hospital beds, medical personnel, and specialized equipment. By simulating the evolutionary dynamics between different healthcare providers, EGT helps to optimize resource distribution and improve public health outcomes []. For instance, healthcare facilities can be considered as agents competing for patients, where their strategies adapt based on patient satisfaction, treatment quality, and operational efficiency. The dynamic nature of EGT makes it suitable for exploring long-term trends in healthcare competition and for providing insights into how policy interventions (such as subsidies or penalties) can influence the evolution of competitive strategies among providers. Recent studies have provided further depth to the application of EGT in healthcare. Zou et al. [] investigated the evolutionary game and risk decision making of four core participants involved in land finance in China, demonstrating how EGT can be used to analyze complex stakeholder interactions involving public resources. Although focused on land finance, the methodologies discussed can be effectively adapted to the healthcare sector, where various stakeholders, including hospitals, insurers, and government bodies, must balance competing interests and resources to achieve public health goals. Similarly, Xie et al. [] applied EGT to develop adherence strategies during epidemic spreading, which could be highly relevant for healthcare providers managing resources during pandemics. Their study showed how EGT can help in determining optimal strategies for encouraging compliance with health guidelines among the public and healthcare professionals. 
By modeling how different participants (e.g., hospitals, patients, and public health agencies) adapt their behaviors in response to evolving health risks, EGT provides a valuable framework for managing healthcare crises. Chen et al. [] also utilized a three-player evolutionary game perspective to examine green credit and transformation in the plastic supply chain in China. The insights gained from this study could be applied to healthcare systems, where green practices are increasingly important. For instance, hospitals could be modeled as players in an evolutionary game involving the adoption of environmentally sustainable practices, balancing the costs and benefits of green technologies and policies. Nan et al. [] explored the use of EGT in managing major public health emergencies. Their work underscores the importance of modeling the interactions between multiple subjects, such as healthcare providers, government agencies, and the public, during crises. By simulating the strategic behaviors of these entities, EGT can help in devising policies that minimize the negative impacts of public health emergencies and promote cooperative behavior among stakeholders. These applications highlight the versatility of EGT in addressing diverse challenges within the healthcare sector. By incorporating insights from various fields, including land finance, epidemic management, environmental sustainability, and public health emergencies, EGT provides a comprehensive framework for modeling strategic interactions, optimizing resource allocation, and improving overall healthcare outcomes. This interdisciplinary approach ensures that healthcare providers can adapt effectively to changing circumstances, ultimately leading to more resilient and efficient healthcare systems.

Industrial process optimization: EGT has also been utilized for optimizing industrial processes, particularly in shared resource environments such as manufacturing and logistics. In these settings, firms often compete over limited resources, such as production capacity or raw materials. By modeling these interactions using EGT, companies can determine evolutionarily stable strategies (ESSs) that ensure sustainable competition while minimizing resource wastage. For example, firms can continuously adapt their production schedules based on the observed strategies of their competitors, ultimately leading to a stable equilibrium in which resource utilization is optimized. This approach enables firms to remain competitive while maintaining operational efficiency, which is crucial in today’s highly competitive industrial landscape []. Recent research by Xu et al. [] has expanded on the application of EGT in optimizing industrial processes, specifically focusing on energy efficiency and safety-driven multidimensional evolutionary games for automated guided vehicles’ (AGVs) transportation at automated container terminals. In this study, EGT was used to model the interactions between AGVs in terms of energy consumption and collision risk, allowing terminal operators to optimize AGV routing and scheduling while balancing trade-offs between efficiency and safety. This study highlights the adaptability of EGT in managing complex logistical systems where multiple dimensions, such as safety and energy, need to be considered simultaneously. Song et al. [] provided another significant contribution by applying EGT to model the population dynamics of crowdsourcing in smart manufacturing services. Their work focused on balancing and optimizing fulfillment capacity through cooperation and competition dynamics among participants in crowdsourcing platforms. 
By using EGT, the study demonstrated how participants in smart manufacturing can dynamically adjust their strategies to balance workloads, minimize delays, and maximize service quality. This approach emphasizes the importance of cooperation in enhancing overall system performance, even in inherently competitive environments. Furthermore, Yuan et al. [] explored the use of tripartite evolutionary games to promote the interconnection of charging infrastructure, involving stakeholders such as charging service providers, government authorities, and electric vehicle users. The findings from this study can be adapted to industrial processes, where the collaboration between multiple stakeholders—such as suppliers, manufacturers, and regulators—plays a crucial role in optimizing resource allocation and improving operational efficiency. The application of tripartite evolutionary games enables the identification of optimal strategies that promote infrastructure development while balancing the interests of different stakeholders. Wang et al. [] conducted a study on the optimization of evolutionary dynamic honeypots, which are designed to deceive attackers and gather intelligence on their tactics. Although focused on cybersecurity, the insights gained from the study are highly relevant for industrial optimization, where deception and adaptive strategies can be used to manage competition and protect critical resources. The application of dynamic honeypot strategies in industrial settings can lead to the enhanced protection of sensitive information and more effective responses to competitive threats. These expanded applications of EGT in industrial processes highlight its capacity to address complex, multidimensional optimization problems. 
By incorporating factors such as energy efficiency, safety, cooperation, and strategic deception, EGT provides a robust framework for companies to adaptively manage their resources and maintain competitiveness in increasingly dynamic environments. This interdisciplinary approach ensures that industrial stakeholders can navigate the complexities of modern supply chains, logistics, and manufacturing processes, ultimately leading to more resilient and efficient industrial systems.

In economics, evolutionary game theory (EGT) serves as a valuable theoretical framework for understanding the dynamics of cooperation between banks and enterprises. By incorporating the concepts of adaptation and strategy evolution, EGT provides insights into how these entities can adjust their behaviors over time to optimize outcomes in fluctuating market conditions. Specifically, in the context of cooperation between banks and technological small and medium-sized enterprises (SMEs), EGT models the interactions as a dynamic game in which both parties—banks and SMEs—continuously revise their strategies based on the flow of information and the perceived benefits of cooperation []. This ongoing exchange of information allows participants to learn from each other and adapt to changing market environments, which is essential for fostering innovation and securing funding in competitive markets. Over time, this process of strategic adaptation leads to the establishment of a stable equilibrium, where both banks and SMEs find mutually beneficial strategies that balance risk, reward, and resource allocation. Ultimately, the evolutionary game model demonstrates how sustained collaboration can evolve naturally and efficiently, helping banks and SMEs overcome uncertainty and achieve long-term success. The model also highlights the importance of information symmetry and trust in facilitating optimal cooperation between these key economic players.

For example, in economics, EGT provides a theoretical reference for the cooperation between banks and enterprises from a unique perspective. The cooperation process between banks and technological SMEs is also a process in which banks and enterprises improve their strategy sets by continuously obtaining information from each other to achieve game equilibrium, as pointed out in Ref. []. Cooperation between banks and technological small and medium-sized enterprises (SMEs) is briefly illustrated in Figure 2. In Figure 2, banks and enterprises are both boundedly rational groups. Both seek to maximize benefits and minimize costs through trial and error and adjustment. The bank maintains financial stability by continuously adjusting its credit strategy to achieve optimal returns. The interaction between banks and enterprises resembles biological evolution: effective strategies are adapted and ineffective ones are eliminated until a final stable state is reached. In the cooperation between banks and SMEs, EGT helps explain the evolution of strategies and guides credit decisions so as to achieve a stable equilibrium.

Figure 2. EGT: cooperation between banks and technological SMEs [,].

Additionally, evolutionary game theory (EGT) has found extensive application in the social sciences, where it provides a powerful analytical framework for examining the dynamic evolution of social behaviors, norms, and institutions over time. One notable example is the work of Basu [], who explored the intricate relationship between urban social norms and the evolution of human capital agglomeration rules. Basu [] argued that the processes through which individuals and communities form and adhere to social norms—particularly those related to the clustering of talent and knowledge in urban environments—are inherently evolutionary. In this context, EGT is employed to model how individuals and groups interact within urban settings, make decisions about where to locate, and adapt their behaviors based on the collective actions of others. By viewing these interactions as a series of evolving strategies shaped by mutual influence and adaptation, EGT allows for a deeper understanding of how social norms, such as cooperation or competition, emerge and stabilize over time []. For example, the decision to invest in education or professional skills can be influenced by the behaviors of others in the same region, creating a feedback loop that either strengthens or weakens the agglomeration of human capital. Over time, these individual decisions accumulate to form stable patterns of human capital distribution and urban development. From a broader perspective, EGT offers a valuable lens for understanding not only individual decision making but also how collective behaviors evolve in response to changes in the social and economic environment. This perspective is crucial for regional development strategies, as it helps policymakers identify the underlying social and behavioral factors that drive human capital concentration in certain areas, allowing them to implement targeted interventions that foster sustainable growth. 
Moreover, by highlighting the adaptive nature of social norms and behaviors, EGT provides insights into how policy measures can shape long-term outcomes, such as promoting cooperation, reducing inequality, and enhancing regional competitiveness in a globalized economy.

More recently, EGT has been applied to green electricity trading within the electricity market, highlighting interactions between consumers, suppliers, and regulators. These agents adjust their strategies based on policy changes and market signals to achieve dynamic equilibria that benefit both economic and environmental outcomes []. Such applications underscore the importance of EGT in designing adaptive strategies for energy markets, promoting sustainability and efficiency in the face of renewable energy’s variability.

Green electricity trading and environmental protection: Green electricity trading is another area where EGT has shown significant applicability. In decentralized energy markets, participants—such as renewable energy producers, consumers, and regulatory bodies—interact in a dynamic environment where strategies evolve based on market conditions and policy changes []. By applying EGT, researchers can model both competitive and cooperative dynamics in green electricity trading, such as the interactions between prosumers (producers–consumers) who generate surplus renewable energy and trade it in peer-to-peer markets. EGT also aids in understanding how regulatory policies, like incentives for renewable energy adoption, influence market participants’ strategies, ultimately fostering a stable and sustainable energy market. This application is particularly relevant in addressing the variability and uncertainty associated with renewable energy sources like solar and wind power, where EGT helps optimize decision making in response to fluctuating supply conditions.

Recent research has provided further insights into green electricity trading using EGT. Hu et al. [] examined how dynamic renewable portfolio standards (RPSs) affect the trading behavior of power generators, considering mechanisms such as green certificates and reward/penalty systems. Their study found that EGT could effectively model the adaptive behavior of power generators in response to evolving RPS, helping to balance renewable and conventional energy production while maximizing compliance and economic returns. This work demonstrates how EGT can be used to analyze the interplay between policy-driven incentives and market participants’ strategic behavior. Zhou et al. [] explored the impact of heterogeneity among market participants on China’s green certificate trading from a collective action perspective. Their findings showed that different types of participants (e.g., large-scale power producers, small renewable energy developers, and regulatory bodies) have distinct incentives and behaviors in green certificate trading, which can be modeled using EGT to predict collective outcomes and optimize policy frameworks. This research highlights the importance of understanding the diversity of market participants to foster efficient green electricity markets. Cheng et al. [] conducted an evolutionary game analysis of multi-stakeholders involved in the development of blue and green hydrogen. The study modeled interactions between governments, hydrogen producers, and consumers, identifying the conditions under which cooperation could be maximized to support hydrogen infrastructure development. This study provides valuable insights into how EGT can help promote sustainable energy initiatives beyond traditional green electricity trading, highlighting its applicability in broader renewable energy markets. Teng et al. 
[] focused on the trading behavior strategies of power plants and the grid under renewable portfolio standards in China, using a tripartite evolutionary game model. By modeling the strategic interactions between power plants, grid operators, and regulatory authorities, the study provided a framework for optimizing renewable energy integration into the power grid. It also demonstrated how EGT could help design effective reward and penalty mechanisms that align the interests of different stakeholders, thus promoting a more stable and cooperative green electricity market. Yue et al. [] used an evolutionary game-based system dynamics approach to optimize strategies for green power and certificate trading in China, considering seasonal variations in renewable energy supply. Their study showed how EGT could help model the influence of seasonal factors on trading strategies and how stakeholders could adapt their behaviors to minimize risks and maximize economic benefits. This research underscores the importance of accounting for temporal variations in green electricity trading to ensure market stability and efficiency. Fan et al. [] analyzed the optimal equilibrium in the carbon trading market using a tripartite evolutionary game perspective. Their study provided insights into the strategic interactions between government, enterprises, and the public in carbon trading, demonstrating how EGT could facilitate the identification of stable equilibria that align economic and environmental objectives. These findings can be extended to green electricity trading, where similar stakeholders are involved in balancing economic incentives with sustainability goals. Wang et al. [] investigated how environmental perceptions, renewable portfolio standards, and subsidies influence trading in China’s green electricity market. 
By applying EGT, the study modeled the adaptive behaviors of participants in response to changing regulatory frameworks and market conditions, highlighting the effectiveness of subsidies in promoting renewable energy adoption and stable trading environments.

These expanded applications of EGT in green electricity trading demonstrate its capacity to address complex interactions among diverse stakeholders, considering factors such as regulatory incentives, participant heterogeneity, and seasonal variations. By providing a robust framework for modeling strategic behavior in dynamic and uncertain environments, EGT contributes significantly to the optimization of renewable energy trading systems and supports the transition towards a more sustainable and resilient energy future.

By integrating examples from a wider range of disciplines and providing references to the latest research, we aim to illustrate that EGT is not only widely used but also continues to evolve as a crucial method in analyzing and optimizing strategic behavior in dynamic, multi-agent environments.

As a dynamic evolutionary method for studying boundedly rational agents, EGT has practical significance in exploring the multi-agent dynamic evolutionary game of green electricity trading in the electricity sale market []. In the electricity market, there are three main bodies: consumers, suppliers, and regulators. Under the premise of EGT, all three have a certain learning ability and can adjust their strategies to maximize their own interests in response to changes in the strategies of the other two parties. In electricity market trading, regulatory authorities aim to build a green trading environment, so they continue to regulate the behavior of suppliers and consumers through policy adjustments. Suppliers, on the other hand, adjust the price of electricity in real time for the purpose of profit, adopting the strategy that yields the maximum profit. Consumers’ decisions, in turn, often change in response to adjustments by regulators and suppliers. Most consumers tend to use more electricity when the price is low and less when the price is high. The adjustment of these three strategies is therefore a continuous dynamic evolutionary process. To gain a more granular understanding of how strategies evolve in user-side electricity market transactions, we apply a multi-agent simulation model []. This approach allows us to capture the dynamic interactions between market participants over time. The agents in the model, representing stakeholders such as consumers, producers, and regulatory bodies, adjust their strategies based on market signals and EGT principles []. The multi-agent framework provides a mechanism to simulate real-world behavior in electricity markets where participants may exhibit imperfect rationality and adapt their strategies dynamically []. Multi-agent models are commonly used to simulate the interaction of multiple decision-makers in the electricity market [].
Suppose there are n agents, and each agent i selects a policy, s_{i}(t), at each time step, t. Each policy corresponds to a payment function, which depends on the policy selected by the agent and the policies of other agents. The payment function can be expressed as

U_{i} (s_{i}, s_{- i}) = \sum_{j \neq i} a_{i j} u (s_{i}, s_{j})

(1)

where s_−i represents the strategy combination of all agents except Agent i, a_ij is the interaction strength between Agents i and j, and u(s_i, s_j) is the payment of a single game when Agent i uses strategy s_i and Agent j uses strategy s_j.
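As a minimal sketch of Equation (1), the following Python snippet evaluates the payment of each agent in a small population. It assumes that the single-game payment is evaluated on the pair (s_i, s_j). The two-strategy payment table, the strategy labels, and the interaction strengths are hypothetical values chosen only for illustration; they are not taken from the reviewed studies.

```python
from typing import Callable, List, Sequence

# Hypothetical single-game payments u(s_i, s_j) for two illustrative strategies:
# 0 = "shift load to off-peak hours", 1 = "consume at peak hours".
PAYOFF_TABLE = [
    [3.0, 1.0],
    [4.0, 0.0],
]


def single_game_payment(s_i: int, s_j: int) -> float:
    """Payment to an agent playing s_i against an agent playing s_j."""
    return PAYOFF_TABLE[s_i][s_j]


def total_payment(
    i: int,
    strategies: Sequence[int],
    strength: Sequence[Sequence[float]],
    u: Callable[[int, int], float] = single_game_payment,
) -> float:
    """Equation (1): U_i = sum over j != i of a_ij * u(s_i, s_j)."""
    return sum(
        strength[i][j] * u(strategies[i], strategies[j])
        for j in range(len(strategies))
        if j != i
    )


# Three agents; strength[i][j] plays the role of a_ij (assumed symmetric here).
strategies = [0, 1, 0]
strength = [
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 1.0],
    [0.5, 1.0, 0.0],
]
payments: List[float] = [
    total_payment(i, strategies, strength) for i in range(len(strategies))
]
print(payments)  # [2.5, 8.0, 2.5]
```

With these illustrative values, the agent consuming at peak receives the largest payment (8.0 versus 2.5), because both of its opponents shift their load.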

The multi-agent model follows a structured approach to simulating the interactions between market participants []. The process involves several key stages, starting from defining the agents, setting up the market environment, and establishing interaction rules based on EGT. The following flowchart (Figure 3) outlines the step-by-step process of constructing the multi-agent model used in this study [].

Figure 3. The process of building a multi-agent model [].

Figure 3 provides a visual representation of the model-building process []. It starts with the definition of the problem and objectives, followed by agent selection, behavior specification, and interaction rules. The simulation is then run iteratively, with results analyzed in the final step.

As shown in Figure 3, the process begins by defining the core objective of the model, which is to simulate the strategic behavior of market participants []. Next, agents are selected based on their roles in the electricity market, including suppliers, consumers, and regulators. The environment is then established, taking into account key factors such as demand variability and market regulations []. Each agent’s behavior is governed by predefined strategies, which evolve over time through their interactions.

During the simulation phase, the model runs over multiple time steps, allowing us to observe how the strategies adapt as market conditions fluctuate. The final step involves analyzing the results and determining how different strategies impact market stability and equilibrium [].
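The workflow described above (define agents, set up the environment, specify interaction rules, iterate, analyze) can be sketched as a small simulation loop. This is an illustrative skeleton only, not the model used in the reviewed work: the class names, the price-adjustment rule, the price cap, the sinusoidal demand profile, and all numerical parameters are assumptions introduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class Regulator:
    price_cap: float  # hypothetical regulatory instrument

    def apply(self, price: float) -> float:
        return min(price, self.price_cap)


@dataclass
class Supplier:
    price: float
    capacity: float
    step: float = 0.05

    def propose_price(self, demand: float) -> float:
        # Raise the price when demand exceeds capacity, lower it otherwise.
        gap = (demand - self.capacity) / self.capacity
        self.price = max(0.0, self.price + self.step * gap)
        return self.price


@dataclass
class Consumers:
    flexible_share: float  # share of consumers using a "shift load" strategy

    def demand(self, base: float, price: float, ref_price: float) -> float:
        # Flexible consumers use less above the reference price, more below it.
        response = max(0.0, 1.0 - 0.5 * (price - ref_price) / ref_price)
        x = self.flexible_share
        return base * (x * response + (1.0 - x))

    def adapt(self, price: float, ref_price: float, rate: float = 0.1) -> None:
        # Replicator-style update: flexibility spreads when prices are high.
        advantage = (price - ref_price) / ref_price
        x = self.flexible_share
        self.flexible_share = min(1.0, max(0.0, x + rate * x * (1.0 - x) * advantage))


def run(steps: int = 48, ref_price: float = 1.0):
    regulator = Regulator(price_cap=1.5)
    supplier = Supplier(price=ref_price, capacity=100.0)
    consumers = Consumers(flexible_share=0.2)
    history = []
    for t in range(steps):
        # Environment: assumed daily demand variability.
        base = 100.0 * (1.0 + 0.2 * math.sin(2.0 * math.pi * t / 24.0))
        demand = consumers.demand(base, supplier.price, ref_price)
        supplier.price = regulator.apply(supplier.propose_price(demand))
        consumers.adapt(supplier.price, ref_price)
        history.append((supplier.price, consumers.flexible_share, demand))
    return history


history = run()
max_price = max(price for price, _, _ in history)
print(f"steps={len(history)}, max price={max_price:.3f}")
```

The recorded history (price, share of flexible consumers, demand) is the object that the final analysis step would inspect, for example to check whether prices and strategy shares settle down over time.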

In this section, we explore the dynamics of market pricing influenced by agent interactions using a multi-agent simulation model []. This model, comprising both buyers and sellers, aims to reflect the complexity of real-world market behaviors under varying demand and supply conditions, as demonstrated in Figure 4. Each agent adapts their strategies based on market trends, which are simulated over a defined time period.

Figure 4. The complexity of real-world market behaviors under varying demand and supply conditions [].

The graphical results from our simulation reveal significant fluctuations in pricing strategies, which correspond to the theoretical underpinnings of EGT applied to electricity markets []. Initially, agents tend to adopt aggressive pricing strategies in order to maximize individual payoffs, leading to a volatile market environment. Over time, as agents learn from interactions and adapt their strategies, a pattern of strategy stabilization emerges, indicating the evolution towards an ESS.

This trend is particularly pronounced in scenarios simulating high demand volatility, where adaptive strategies are crucial for survival and profitability in the market []. The simulation underscores the critical role of dynamic strategy adaptation in achieving market equilibrium, reflecting real-world observations of electricity markets often experiencing rapid changes in pricing due to fluctuating demand and supply conditions.

However, the limitations of our model, including the simplification of market dynamics and the assumption of rational behavior, may affect the generalizability of the results []. Future studies could enhance model realism by incorporating more complex behavioral models and exploring the impact of external market shocks.

Further research is also warranted to explore how policy interventions, such as price caps or subsidies, might influence strategic behavior in user-side electricity markets. Such studies could help policymakers design more effective regulations that encourage stable and efficient market outcomes. In the subsequent sections of this paper, we continue to explore the applicability of EGT to electricity market trading in smart grids.

Additionally, EGT plays a crucial role in various domains by revealing underlying rules and trends. As research progresses and expands, EGT is likely to reveal new insights across additional, previously unexplored fields.

2.5. Modeling Dynamics Using EGT

EGT is utilized to model the strategic interactions between multiple agents in dynamic and competitive environments. In our proposed framework, the payment function (Equation (1)) plays a central role in defining the fitness levels for each agent’s strategy, which is then used in the dynamic specifications typical of EGT, such as replicator dynamics. The replicator dynamic equation, commonly used in EGT to represent the evolution of strategy proportions over time, is given by

{\dot{x}}_{i} = x_{i} (f_{i} - \bar{f})

where x_i represents the proportion of agents adopting strategy i, f_i represents the fitness (payoff) of strategy i, and \bar{f} is the average fitness of the population. In this formulation, the payment function (1) is directly related to the fitness f_i, and it thus influences how strategies evolve within the agent population over time.
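The replicator equation can be integrated numerically. The sketch below uses a forward-Euler step and assumes that fitness is linear in the strategy shares, f_i = \sum_j A_ij x_j, which is one common way of obtaining f_i from pairwise payments. The payoff matrix A, the step size, and the initial shares are hypothetical.

```python
from typing import List, Sequence

# Hypothetical payoff matrix A[i][j]: payment to strategy i against strategy j.
A = [
    [3.0, 1.0],
    [4.0, 0.0],
]


def fitness(x: Sequence[float], payoff: Sequence[Sequence[float]]) -> List[float]:
    """Assumed linear fitness: f_i = sum_j payoff[i][j] * x_j."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in payoff]


def replicator_step(
    x: Sequence[float], payoff: Sequence[Sequence[float]], dt: float = 0.01
) -> List[float]:
    """One forward-Euler step of dx_i/dt = x_i * (f_i - f_bar)."""
    f = fitness(x, payoff)
    f_bar = sum(x_i * f_i for x_i, f_i in zip(x, f))
    x_new = [x_i + dt * x_i * (f_i - f_bar) for x_i, f_i in zip(x, f)]
    total = sum(x_new)  # renormalize to absorb floating-point drift
    return [v / total for v in x_new]


def simulate(
    x0: Sequence[float],
    payoff: Sequence[Sequence[float]],
    steps: int = 5000,
    dt: float = 0.01,
) -> List[List[float]]:
    """Iterate the Euler step and return the full trajectory of shares."""
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
        trajectory.append(x)
    return trajectory


trajectory = simulate([0.9, 0.1], A)
final_shares = trajectory[-1]
print([round(v, 3) for v in final_shares])  # [0.5, 0.5]
```

For this illustrative matrix the shares approach the interior rest point at which both strategies earn the same fitness (f_0 = f_1 at x_0 = 0.5); other matrices can instead drive one strategy to extinction.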

The payment function is integral to both individual learning through RL and population-level dynamics through EGT. Specifically, it serves two main purposes in the context of dynamic systems:

(i): Defining fitness for population-level dynamics: In typical EGT formulations, the fitness of a strategy determines whether it becomes more or less prevalent in the population. In our model, the payment function (Equation (1)) serves as the basis for calculating f_i, which is used in the replicator dynamic equation. By doing so, it directly affects the evolution of strategy proportions in the population, providing the necessary link between individual agent actions and population-level outcomes.
(ii): Incentivizing individual agent behavior: The payment function also serves as a reward mechanism within the RL framework. This reward directly influences the learning process of individual agents, guiding their decisions to either exploit known strategies or explore new ones. The connection between the payment function and agent learning ensures that the rewards an agent receives align with the broader evolutionary dynamics modeled by EGT, fostering both individual adaptation and collective stability.
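One way to make the link between Equation (1) and the fitness used in the replicator equation explicit is the following derivation. It is a sketch under assumptions that are not stated in the source: uniform interaction strength, a_ij = 1/(n − 1); a finite strategy set indexed by k and l (to avoid a clash with the agent index i of Equation (1)); and a large population in which x_l denotes the share of agents playing strategy l and n_l the number of opponents playing l.

```latex
% Assumptions: a_{ij} = 1/(n-1); agent i plays strategy k; n_l opponents play l.
U_{i}(k, s_{-i})
  = \frac{1}{n-1} \sum_{j \neq i} u(k, s_{j})
  = \sum_{l} \frac{n_{l}}{n-1} \, u(k, l)
  \approx \sum_{l} x_{l} \, u(k, l) =: f_{k},
\qquad
\bar{f} = \sum_{k} x_{k} f_{k},
\qquad
\dot{x}_{k} = x_{k} \left( f_{k} - \bar{f} \right).
```

Under these assumptions, the payment of Equation (1) reduces to the expected single-game payment against a randomly drawn opponent, which is the fitness that enters the replicator equation.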

The integration of the payment function within the dynamic framework can be summarized as follows.

(i): Replicator dynamics: The replicator dynamic equation is used to describe how the proportion of agents using a particular strategy changes over time based on the relative fitness of that strategy. In this study, the fitness, f_i, is derived from payment function (1), which takes into account factors such as energy costs, consumption patterns, and market interactions. This fitness value feeds directly into the replicator dynamics to determine the growth rate of each strategy.
(ii): Role in dynamic specifications: The payment function plays a dual role by influencing both the replicator dynamics and the agent-level learning process. This dual functionality ensures that the strategies not only evolve according to individual incentives but also conform to population-level evolutionary stability. The replicator dynamics equation means that the payment function directly influences how strategies evolve at the population level by determining which strategies are more successful and should be propagated over time. Higher fitness, as determined by the payment function, means a greater likelihood that a given strategy will increase in prevalence, thereby influencing the evolutionary trajectory of the entire system. This ensures that the resulting equilibrium is an ESS, which is robust against invasion by alternative strategies. At the agent level, the payment function serves as a reward mechanism within the RL process. This reward is based on the immediate and cumulative payoffs that agents receive for choosing certain strategies, which subsequently influence their learning pathways. By using the payment function to guide individual agent decisions, we ensure that agents are continually learning to improve their own payoffs while also contributing to broader, collective dynamics. The reward structure based on the payment function provides critical feedback to the agents, informing them of the success or failure of their strategies in real time. This feedback loop is essential for ensuring that individual learning aligns with the desired long-term outcomes of the population. The dual functionality of the payment function—acting both as the determinant of population-level fitness in replicator dynamics and as a reward mechanism in agent-level RL—ensures that the evolution of strategies is both adaptive and stable. 
At the individual level, agents learn from experience, optimizing their actions according to the rewards derived from the payment function. At the population level, the replicator dynamics use these same fitness values to adjust the distribution of strategies among the agents, so that successful strategies spread while less successful ones diminish. Because both the micro-level decisions of agents and the macro-level evolution of strategies are guided by the same underlying metric, the payment function links the micro and macro aspects of strategy evolution and aligns individual incentives with population-level evolutionary stability. The payment function is therefore not merely an arbitrary payoff measure but is foundational to defining evolutionary stability through replicator dynamics. It helps ensure that the strategies emerging from individual-level learning are sustainable and evolutionarily stable, contributing to a more resilient and optimized system in the context of electricity markets. As agents adapt their strategies to maximize individual payoffs, the population-level replicator dynamics close the feedback loop between individual adaptation and collective outcomes, which makes the payment function a fundamental element of the proposed dynamic specifications framework.
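
To illustrate how a payoff-derived fitness drives the replicator dynamic described above, the following Python sketch applies the Euler-discretized update x_i ← x_i + Δt · x_i (f_i − f_avg). The payoff matrix, the two strategy labels, and the step size are hypothetical choices made for illustration; they are not taken from payment function (1).

```python
def replicator_step(shares, payoff, dt=0.1):
    """One Euler step of the replicator dynamic dx_i/dt = x_i * (f_i - f_avg)."""
    n = len(shares)
    # Fitness of strategy i: expected payoff against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    avg = sum(x * f for x, f in zip(shares, fitness))
    new = [x + dt * x * (f - avg) for x, f in zip(shares, fitness)]
    total = sum(new)  # renormalize to guard against floating-point drift
    return [x / total for x in new]


# Hypothetical payoffs: rows = own strategy, columns = opponent strategy.
# Strategy 0 = "shift load off-peak", strategy 1 = "consume as usual".
PAYOFF = [[3.0, 2.0],
          [2.5, 1.0]]


def simulate(shares, steps=500):
    """Iterate the replicator step from an initial strategy distribution."""
    for _ in range(steps):
        shares = replicator_step(shares, PAYOFF)
    return shares


final = simulate([0.1, 0.9])
print(final)  # the share of strategy 0 approaches 1 for these payoffs
```

With these assumed payoffs, strategy 0 earns more than strategy 1 against every population mix, so its share grows until it takes over the population. In an integrated EGT and RL setting, the same payoff values would also be passed to each agent as its reward signal.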

3. Integrating EGT for Strategic Optimization and Stability in Smart Grids

3.1. Multi-Agent Characteristics in Smart Grids

Smart grids exhibit a multi-agent architecture that reflects both structural and operational complexity. This structure is characterized by the involvement of multiple independent yet interconnected agents, each contributing to the system’s overall performance.

One key characteristic of smart grids is the distributed nature of energy resources. With the increasing adoption of distributed energy systems, traditional centralized dispatch methods are becoming less effective. Instead, smart grids employ distributed algorithms to manage and schedule resources more efficiently [,]. These algorithms allow the integration of diverse agents, including power producers (e.g., power plants and distributed energy systems), electricity consumers (e.g., residential, commercial, and industrial entities), network operators, and market intermediaries. This distributed architecture provides greater flexibility, enabling the system to adapt to dynamic changes in both energy production and consumption.

Another significant feature is the autonomy of smart grid agents. These agents operate independently, making decisions based on their specific objectives and constraints []. A prominent example is real-time pricing, a core mechanism within smart grids, where electricity prices fluctuate based on supply and demand conditions [,]. Consider the application of evolutionary game coordination in islanded microgrids []. In this setting, integrating game theory with distributed generation plays a crucial role in designing pricing models that incentivize consumers to reduce their electricity usage during peak periods []. By simulating various consumer responses, this approach allows power producers to assess price sensitivity and optimize their generation strategies. As shown in Figure 5, over time, more consumers adopt energy-saving strategies, reducing overall electricity demand and smoothing the peak load, which enhances the system’s economic and operational stability [,].

Figure 5. Impact of EGT on consumer behavior and market demand under real-time pricing [,].

In addition to autonomy, smart grids are highly interactive. Agents engage in continuous exchanges of information and participate in market mechanisms, considering not only their own conditions but also the behavior of other agents. This interactivity drives the evolution of energy systems toward greater efficiency and sustainability, promoting the integration of distributed generation and energy storage systems []. For instance, based on user load characteristics, smart grid systems can predict peak loads and proactively interact with users to adjust energy usage before peak demand is reached. This facilitates efficient load management and contributes to peak shaving [].

The balance of collaboration and competition among smart grid agents is another critical aspect. Collaboration enables the sharing and optimization of resources, such as through co-generation or demand-response programs, which improves overall system coordination and efficiency. At the same time, competition, such as bidding among suppliers, encourages market dynamism. This interaction between collaborative and competitive forces fosters innovation and efficiency within the system.

Communication is integral to the functioning of multi-agent systems in smart grids. Effective communication between agents, such as negotiating prices or sharing data on energy supply and demand, ensures the stability and reliability of the grid. Through advanced data analytics and decision support systems, agents can process large data sets and make real-time decisions to optimize grid performance.

The intelligent decision-making capabilities of smart grid agents are supported by machine learning and game theory. In the framework of multi-agent reinforcement learning (MARL), agents make decisions based on environmental states, and the system provides feedback, allowing the agents to continuously adapt and improve their strategies over time (Figure 6) [,].

Figure 6. A basic framework for MARL or game theory [,].

Each agent makes decisions (selects actions) based on state information from the environment, and the environment then updates the state and provides reward feedback based on the combination of these actions. This process forms a cycle in which each agent continually learns or adjusts its strategy through interaction to optimize the outcome.
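
The interaction cycle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: two hypothetical retailers repeatedly pick a "high" or "low" price, the environment has a single state (so no state transition is modeled), the reward table is invented for the example, and each agent uses a simple epsilon-greedy value update in place of a full MARL algorithm.

```python
import random

# Hypothetical rewards for two retailers choosing a price level each round.
# Key: (action of agent 0, action of agent 1) -> (reward 0, reward 1).
ACTIONS = ["high", "low"]
REWARDS = {
    ("high", "high"): (3.0, 3.0),
    ("high", "low"): (0.0, 4.0),
    ("low", "high"): (4.0, 0.0),
    ("low", "low"): (1.0, 1.0),
}


def choose(q, epsilon, rng):
    """Epsilon-greedy action selection over one agent's value estimates."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[a])


def run(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q_tables = [{a: 0.0 for a in ACTIONS} for _ in range(2)]
    for _ in range(episodes):
        # 1. Each agent selects an action from its own estimates.
        joint = tuple(choose(q, epsilon, rng) for q in q_tables)
        # 2. The environment maps the joint action to per-agent rewards.
        rewards = REWARDS[joint]
        # 3. Each agent updates the value of the action it just took.
        for q, action, reward in zip(q_tables, joint, rewards):
            q[action] += alpha * (reward - q[action])
    return q_tables


q_tables = run()
print(q_tables)
```

Each agent's reward depends on the joint action, not only on its own choice, which is what distinguishes the multi-agent setting from single-agent RL.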

Finally, smart grids are known for their resilience. The decentralized nature of multi-agent systems enhances the grid’s ability to withstand disruptions. If one part of the system fails, other agents can continue operating independently, reducing the impact on the grid as a whole. This resilience is particularly important in addressing challenges such as grid faults [,,] and extreme weather events [,,,,]. Furthermore, the system’s scalability allows for the easy integration of additional DERs without requiring significant changes to the existing infrastructure. This flexibility is crucial as the number of renewable energy sources, such as solar panels and wind turbines, continues to grow.

3.2. Fit Analysis of EGT and Smart Grid

Smart grids are intricate systems composed of multiple independent entities, such as power generators, consumers, and dispatch centers, which interact and influence one another. In this context, EGT serves as a powerful analytical tool for understanding and predicting the dynamic behaviors of these agents over time. To fully capture these interactions, it is imperative that a game is well defined among agents, allowing for the analysis of competitive and cooperative dynamics. By defining a game among agents, we can leverage EGT to evaluate how strategies evolve based on observed outcomes and the influence of competing strategies. EGT is particularly suited to modeling the smart grid’s strategic interactions due to its capacity to capture the evolving nature of decision-making processes.

Figure 7 illustrates the evolution of energy strategy efficiency and frequency over time []. It highlights that, as market competition intensifies, renewable energy strategies may eventually surpass traditional ones in terms of usage frequency. This suggests that competitive market forces can significantly drive the evolution of energy strategies within the smart grid, accelerating the shift toward renewable energy sources.

Figure 7. The evolution of energy strategy efficiency and frequency [].

In the smart grid, agents such as electricity consumers and generators adjust their strategies based on market feedback and environmental changes. This interaction and the autonomy of agents align well with the dynamic game models of EGT. By evaluating the success and persistence of various strategies in competitive environments, EGT enables a deeper understanding of the long-term behavior and trends among smart grid participants. These insights can help predict how strategies evolve and which may dominate over time, especially as external conditions like market prices and renewable energy availability fluctuate.

The smart grid is characterized by a delicate balance between cooperation and competition. Participants engage in both competitive activities, such as price setting and market share battles, and cooperative initiatives, such as demand-response programs and resource sharing for optimization. EGT can identify stable strategy combinations under different conditions, guiding participants in balancing competition with cooperation to maximize system efficiency []. This understanding can be used to design more effective incentive mechanisms that promote cooperative behavior, leading to mutual benefits for all grid participants.

Adaptability is another key strength of EGT in modeling smart grids, which operate in environments that are often unpredictable due to the reliance on intermittent renewable energy sources like solar and wind. EGT offers a framework for analyzing how demand-response mechanisms can influence consumer behavior in these fluctuating conditions. One of the critical challenges in smart grid management is encouraging consumers to reduce electricity usage during peak periods through price signals and incentives, helping balance the grid’s load. For example, energy storage operators might develop strategies to store or release energy based on future price forecasts and the expected availability of renewable energy, ensuring more efficient grid operation.

As renewable energy becomes increasingly prominent, EGT provides valuable insights into the sustainability of energy systems. It can be used to analyze the interactions between different power generation actors, focusing on both competitive and cooperative strategies. EGT models also capture the co-evolution of behaviors between renewable energy producers and consumers. By understanding these dynamics, EGT supports the seamless integration of renewable energy into the grid and promotes the sustainability of the overall power system’s development.

3.3. Challenges and Limitations

Despite the many advantages of EGT for modeling smart grids, several significant challenges and limitations must be addressed to fully leverage its potential:

(1): Complexity in real-world systems: As EGT models scale up to represent complex, real-world smart grids involving numerous agents, diverse objectives, and intricate interactions, the overall system can become highly complex. This complexity makes it difficult to derive straightforward analytical solutions, and computational demands increase substantially as the number of agents grows. In real-world applications, the heterogeneity of agents—ranging from large energy producers to small prosumers—adds further layers of intricacy, requiring sophisticated modeling approaches to capture the nuances of each participant’s strategic behaviors. Moreover, interactions between renewable energy sources, market dynamics, and distributed energy resources (DERs) further increase the dimensionality and computational complexity of EGT-based models.
(2): Convergence to suboptimal solutions: One of the inherent challenges of EGT in smart grid modeling is the risk of convergence to suboptimal solutions. The evolutionary nature of these models means that strategies evolve based on fitness, which may not always guarantee a globally optimal solution. In practice, agents can converge on locally optimal strategies that are beneficial within their immediate context but fail to provide the best outcome for the overall grid or the broader energy market. This challenge is especially pronounced when multiple Nash equilibria exist, where EGT might settle at a suboptimal equilibrium that does not maximize system efficiency or benefit all stakeholders equitably. Additionally, the presence of non-cooperative behaviors and competitive dynamics between agents can exacerbate the likelihood of such suboptimal convergence, leading to inefficiencies in the allocation of resources or market imbalances.
(3): Data requirements and real-time implementation: Successfully implementing EGT models for real-time smart grid optimization requires extensive and reliable data collection, processing, and analysis. These models depend heavily on accurate, high-frequency data, such as real-time energy consumption, production metrics, pricing information, and weather conditions that influence renewable energy generation. The need for such large-scale data introduces several challenges, including the infrastructure requirements for data acquisition, the computational power needed to process this data in real time, and technical challenges related to data integration from different sources. Data privacy concerns also add an additional layer of complexity, as ensuring compliance with data protection regulations is critical when gathering information from individual users or households. Moreover, delays in data collection or inaccuracies in data processing can lead to suboptimal or outdated decisions, thus reducing the efficacy of the EGT model in dynamically adapting to real-time grid conditions.
(4): Scalability and computational challenges: The scalability of EGT models is another significant limitation. As the number of participating agents and the complexity of their interactions increase, the computational resources required for running simulations grow rapidly; with n agents that each choose among k strategies, for example, the joint strategy space contains k^n profiles. In large-scale smart grids, the time required to simulate interactions and derive evolutionarily stable strategies can become impractical, particularly when real-time decision making is needed. This issue necessitates the development of advanced algorithms or approximation techniques that can efficiently manage the increased computational load while still delivering timely and effective solutions.
(5): Integration with other models and techniques: Integrating EGT with other modeling techniques, such as reinforcement learning or traditional optimization algorithms, also presents challenges. While the hybridization of these approaches can offer significant benefits—such as combining the adaptability of reinforcement learning with the strategic insights of EGT—ensuring a seamless integration that maintains the robustness and efficiency of each method is a non-trivial task. Hybrid models need careful calibration and tuning to ensure that the combined methodologies work harmoniously without introducing inconsistencies or convergence issues.
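
The risk described in item (2) can be made concrete with a two-strategy coordination game. In the hypothetical payoffs below (chosen for illustration, not drawn from any cited study), joining a demand-response program pays 4 when the opponent also joins and 0 otherwise, while staying out pays 3 against a participant and 2 against a non-participant. Both "all join" and "all stay out" are stable, and the replicator dynamic selects between them purely on the basis of the initial share.

```python
def share_after(x0, steps=2000, dt=0.05):
    """Share of 'join' players after Euler-integrating the replicator dynamic."""
    x = x0
    for _ in range(steps):
        f_join = 4.0 * x                       # 4 vs. join, 0 vs. stay out
        f_stay = 3.0 * x + 2.0 * (1.0 - x)     # 3 vs. join, 2 vs. stay out
        # Two-strategy replicator dynamic: dx/dt = x * (1 - x) * (f_join - f_stay)
        x += dt * x * (1.0 - x) * (f_join - f_stay)
    return x


low_start = share_after(0.6)    # starts below the threshold x = 2/3
high_start = share_after(0.7)   # starts above the threshold
print(low_start, high_start)
```

Starting at 0.6, the population converges to the equilibrium in which nobody joins (payoff 2 each), even though universal participation would pay 4 each; starting at 0.7, it reaches the better equilibrium. The threshold of 2/3 follows from setting f_join equal to f_stay.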

Overall, while EGT provides a valuable framework for understanding strategic interactions and evolving behaviors in smart grids, addressing these challenges is essential in fully realizing its potential. Future research needs to focus on developing methods to reduce computational complexity, mitigate convergence to suboptimal equilibria, and enhance data integration capabilities in order to make EGT models more feasible and effective for real-world smart grid applications.

4. Applications and Fits of EGT in Energy Trading

4.1. Overview of Energy Trading

Energy trading within smart grids involves the purchase, sale, and exchange of electricity between various stakeholders, including electricity producers, consumers, prosumers (those who both produce and consume electricity), and intermediaries such as power exchanges and energy aggregators. This complex system aims to optimize not only the economic benefits for participants but also the operational efficiency of the energy network. By leveraging a combination of modern information and communication technologies (ICT) and advanced energy management systems, energy trading within smart grids is increasingly moving towards an environment that emphasizes efficiency, flexibility, and sustainability.

Producers within this system include large-scale power generation facilities, such as thermal power plants, wind farms, solar parks, and hydroelectric stations. Consumers encompass both residential and industrial users who require electricity to meet their daily energy demands, while prosumers represent an increasingly important group that contributes to the electricity supply by generating power, often through renewable energy sources, such as rooftop solar panels, and simultaneously consuming it. Intermediaries, such as power exchanges and brokers, facilitate the matching of supply and demand, ensuring that electricity is routed to where it is needed in an efficient manner.

The evolution of energy trading within smart grids has brought significant changes compared to traditional centralized electricity markets. Traditionally, energy transactions were dominated by a few central power plants supplying energy to a wide area, with very little flexibility or adaptability in terms of how power was generated, distributed, and consumed. The traditional market had a top-down approach, characterized by a one-way flow of electricity from power producers to end users. This legacy system lacked both real-time adaptability and the ability to accommodate DERs, which has now changed dramatically with the advent of smart grid technology.

In modern energy trading, the smart grid facilitates a two-way flow of both electricity and information, enabling the real-time monitoring and control of energy production, distribution, and consumption. The integration of advanced metering infrastructure (AMI), IoT devices, and data analytics allows for enhanced visibility across the energy value chain. Real-time data from smart meters and connected devices enable participants to make informed decisions regarding their energy usage or generation, while dynamic pricing mechanisms reflect the actual conditions of supply and demand, providing incentives for users to adjust their behavior in favor of grid stability and cost optimization.

Energy trading also encompasses diverse energy sources, including both renewable and non-renewable energy resources. By incorporating renewable sources like solar, wind, and biomass, the energy trading mechanism within smart grids helps reduce carbon emissions and supports global sustainability initiatives. The interplay between renewable energy sources and traditional power generation introduces the concept of energy balancing, wherein energy produced from intermittent renewable sources can be complemented by more stable, dispatchable generation sources, such as gas or hydro, to maintain reliability in the electricity supply.

In addition to the integration of diverse energy sources, energy storage systems such as battery storage play a crucial role in modern energy trading (the acronym ESS is reserved in this paper for evolutionarily stable strategies). Energy storage allows for the temporal decoupling of electricity generation and consumption, thereby enhancing grid flexibility. For instance, batteries can store excess energy produced during periods of high renewable output and release it during times of peak demand. This aspect of energy trading provides greater resilience and reliability to the smart grid, ensuring that fluctuations in supply do not lead to instability.
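
As a rough sketch of this store-then-release behavior, the Python function below applies a simple price-threshold rule to a battery over hourly steps. The thresholds, capacity, power limit, and price series are hypothetical, and conversion losses and degradation are ignored for brevity.

```python
def dispatch(prices, capacity=10.0, power=5.0, low=30.0, high=60.0):
    """Threshold rule: charge when price <= low, discharge when price >= high.

    Prices are per kWh for one-hour steps; positive schedule entries mean
    charging (kWh bought), negative entries mean discharging (kWh sold).
    """
    soc = 0.0      # state of charge in kWh
    profit = 0.0
    schedule = []
    for price in prices:
        if price <= low:
            energy = min(power, capacity - soc)   # limited by remaining headroom
            soc += energy
            profit -= energy * price
            schedule.append(energy)
        elif price >= high:
            energy = min(power, soc)              # limited by stored energy
            soc -= energy
            profit += energy * price
            schedule.append(-energy)
        else:
            schedule.append(0.0)                  # hold between the thresholds
    return schedule, profit


schedule, profit = dispatch([20.0, 25.0, 40.0, 70.0, 80.0, 35.0])
print(schedule)  # [5.0, 5.0, 0.0, -5.0, -5.0, 0.0]
print(profit)    # 525.0
```

The battery buys 10 kWh in the two cheap hours and sells it in the two expensive hours, which is the temporal decoupling of generation and consumption in its simplest form.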

The flexibility provided by smart grids also allows for the introduction of demand-side management (DSM) and demand response (DR) programs, which are essential elements of modern energy trading. Demand-response mechanisms encourage consumers to alter their consumption patterns during peak times, often in exchange for financial incentives. This contributes to the overall efficiency of the grid by reducing the need for costly peak generation and minimizing the risk of grid overload.
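
The effect of such a program on the load profile can be sketched as follows. The function moves a fixed fraction of each peak-hour load onto the off-peak hours; the four-period load profile, the choice of peak hours, and the 25% shift are illustrative assumptions, not measured data.

```python
def shift_load(load, peak_hours, fraction):
    """Move `fraction` of each peak-hour load evenly onto the off-peak hours."""
    off_peak = [h for h in range(len(load)) if h not in peak_hours]
    shifted = list(load)
    moved = 0.0
    for h in peak_hours:
        delta = load[h] * fraction
        shifted[h] -= delta
        moved += delta
    for h in off_peak:
        shifted[h] += moved / len(off_peak)
    return shifted


load = [2.0, 2.0, 6.0, 8.0]                  # kWh per period (hypothetical)
shifted = shift_load(load, peak_hours={2, 3}, fraction=0.25)
print(shifted)                                # [3.75, 3.75, 4.5, 6.0]
print(max(load), max(shifted))                # 8.0 6.0
```

Total consumption is unchanged at 16 kWh, but the peak falls from 8 kWh to 6 kWh, which is the reduction in peak generation need that the financial incentive pays for.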

Energy trading within smart grids is also heavily influenced by regulatory frameworks and market structures. These frameworks ensure fair trading practices and help in maintaining grid stability. Market operators and regulators play a significant role in establishing rules for energy trading, such as setting price caps, defining grid codes, and regulating the integration of renewable energy certificates (RECs). The evolution of P2P energy trading platforms has also introduced new dynamics in energy trading, enabling consumers to trade surplus energy directly with each other using blockchain or other distributed ledger technologies for transparent and secure transactions.

Furthermore, artificial intelligence (AI) and machine learning (ML) algorithms are increasingly being used to predict market trends, optimize bidding strategies, and manage the operation of DERs. These technologies help market participants optimize their trades by anticipating changes in demand, generation capacity, and market prices, thereby maximizing both economic gains and grid stability.

In summary, energy trading within smart grids represents a sophisticated and dynamic ecosystem that leverages cutting-edge technologies to enhance efficiency, flexibility, and sustainability. The system integrates traditional and renewable energy sources, supports real-time data-driven decision making, and involves a broad range of stakeholders from producers to consumers and intermediaries. This evolution from a centralized, rigid market to a decentralized, flexible smart grid infrastructure has transformed the way energy is produced, traded, and consumed, contributing significantly to the goals of energy efficiency, cost reduction, and environmental sustainability.

4.2. Integrated Application of EGT and RL in Energy Trading

Energy trading in smart grids is characterized by several key features, which play a critical role in shaping modern energy markets. One of the primary aspects is the interaction of multiple agents within the grid. These stakeholders include power generators, consumers, distributed energy producers (such as wind and solar), intermediaries, and grid operators. Through market mechanisms, these participants influence supply and demand dynamics, resulting in a more integrated and adaptive system [,].

Another significant feature of smart grid energy trading is the ability to conduct transactions in real time. Leveraging real-time data monitoring, participants can make quick adjustments to their trading decisions based on evolving supply, demand, and pricing conditions. This responsiveness leads to a more efficient allocation of energy resources and enhanced market fluidity.

Demand response is also a key component of energy trading in smart grids. This approach incentivizes consumers to modify their energy usage patterns, particularly by reducing consumption during peak times or increasing it during off-peak periods. Such strategies contribute to better resource distribution and reduced operational costs for the energy system.

The rise of renewable energy trading has further expanded the scope of smart grids. Platforms now enable small-scale producers, such as households with solar panels, to sell surplus energy. This exchange of renewable energy promotes greater sustainability and diversification within energy markets.

In some smart grids, blockchain and smart contract technologies have been incorporated to enhance transaction transparency and security. These technologies help protect user privacy, minimize intermediary involvement, and reduce overall transaction costs [,].

Finally, the energy trading process in smart grids often employs dynamic pricing mechanisms. These pricing models reflect real-time market conditions and allow participants to shape their bidding strategies based on supply, demand, and competitor behavior. This dynamic pricing fosters a competitive and efficient energy trading environment. The interaction between various actors and mechanisms in the smart grid system showcases the complexity and efficiency of energy management in real-time operations, as illustrated in Figure 8 [].

Figure 8. Interaction of different actors and mechanisms in a smart grid to achieve efficient trading and management of energy. The sub-themes revolve around a core theme that illustrates the complexity and diversity of smart grids in real-time operations [].
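
A dynamic pricing mechanism of the kind described before Figure 8 can be sketched as a rule that moves the price in proportion to the relative supply and demand imbalance. The proportional update, the sensitivity parameter, and the price floor and cap below are hypothetical modeling choices for illustration, not a rule used by any specific market.

```python
def update_price(price, demand, supply, sensitivity=0.5, floor=10.0, cap=200.0):
    """Raise the price when demand exceeds supply and lower it otherwise."""
    imbalance = (demand - supply) / supply
    proposed = price * (1.0 + sensitivity * imbalance)
    # Keep the price inside the regulatory floor and cap.
    return min(cap, max(floor, proposed))


scarce = update_price(50.0, demand=120.0, supply=100.0)   # about 55
surplus = update_price(50.0, demand=80.0, supply=100.0)   # about 45
capped = update_price(190.0, demand=300.0, supply=100.0)  # limited by the cap
print(scarce, surplus, capped)
```

Participants who observe such a signal can then adjust their bids, which is the point at which the pricing mechanism couples to the strategy adaptation analyzed with EGT.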

4.3. Applications of EGT in Energy Trading

The application of EGT in energy trading provides a powerful framework for understanding the interactions between power generators, consumers, and intermediaries.

(i): In real-time pricing, EGT optimizes competitive pricing strategies to enhance market efficiency.

In real-time pricing, energy generators and retailers face the challenge of developing competitive pricing strategies that respond to market demand fluctuations. EGT aids in optimizing these strategies, improving market efficiency by modeling how participants adapt to changing conditions. For example, households may initially reduce energy consumption during peak pricing periods, but over time, dynamic pricing can incentivize investment in energy-efficient technologies or storage systems. As more consumers adopt such behavior, the demand becomes smoother and more predictable, contributing to overall system stability.

(ii): EGT helps model cooperative and competitive dynamics among renewable energy producers.

In the development of renewable energy, EGT can model the competitive and cooperative dynamics between renewable energy producers, such as those using wind and solar power. As the market becomes more integrated, the evolving strategies of energy storage operators and renewable producers can be simulated using EGT to determine optimal energy storage and release times based on generation patterns and market prices. This approach also assists in evaluating long-term strategies where renewable energy producers may either cooperate to stabilize their output or compete for a larger market share [,,].

(iii): EGT provides valuable insights into the strategic behaviors of firms in carbon emissions trading markets.

Carbon emissions trading is another domain where EGT proves valuable. In carbon markets, companies compete to minimize emission costs while trading carbon allowances. EGT helps model these competitive behaviors, allowing for a better understanding of how firms adjust their strategies in response to environmental policies and trading mechanisms. This approach provides insights into the effectiveness of different policy interventions in reducing emissions [,].

(iv): EGT plays a key role in modeling strategy adaptation in dynamic energy markets.

The dynamic nature of energy markets is another area where EGT plays a crucial role. The theory can simulate various market conditions, illustrating how participants adjust their strategies in response to fluctuating policies and market environments. For instance, EGT can model how consumers and energy generators adapt during periods of price volatility, particularly in real-time energy markets. In these markets, participants engage in rapid trading in response to sudden shifts in supply and demand, often driven by variable renewable energy generation. By analyzing how market participants react to price signals and supply–demand imbalances, EGT offers valuable insights into strategy adaptation in volatile market conditions.

(v): P2P energy trading is well suited to EGT analysis due to its decentralized and dynamic nature.

P2P energy trading, where individual prosumers (both consumers and producers of energy) exchange surplus electricity, represents a decentralized and dynamic environment that is well suited for EGT analysis. In P2P trading, prosumers must decide when to sell or store surplus energy, set prices, and determine purchasing behavior during periods of low generation. EGT models these evolving strategies, reflecting how prosumers optimize their energy usage and achieve favorable prices over time [,,,]. The decentralized nature of these interactions introduces complexity that can be effectively analyzed using game-theoretic models to explore how strategies evolve in response to market conditions [,,,,].
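
To make the prosumer decisions above concrete, the following sketch clears a small P2P market by matching the highest bids with the lowest asks. The midpoint pricing rule and the order data are assumptions for illustration; real P2P platforms may use different clearing rules.

```python
def match_orders(bids, asks):
    """Match highest bids with lowest asks.

    Each order is (price_in_cents_per_kwh, quantity_kwh). Returns a list of
    (trade_price, quantity) tuples; the trade price is the bid/ask midpoint.
    """
    buy = sorted(([p, q] for p, q in bids), key=lambda o: -o[0])
    sell = sorted(([p, q] for p, q in asks), key=lambda o: o[0])
    trades = []
    i = j = 0
    # Trade while the best remaining bid is at least the best remaining ask.
    while i < len(buy) and j < len(sell) and buy[i][0] >= sell[j][0]:
        qty = min(buy[i][1], sell[j][1])
        price = (buy[i][0] + sell[j][0]) / 2.0   # split the surplus evenly
        trades.append((price, qty))
        buy[i][1] -= qty
        sell[j][1] -= qty
        if buy[i][1] == 0:
            i += 1   # this buyer is fully served
        if sell[j][1] == 0:
            j += 1   # this seller has sold out
    return trades


bids = [(20, 3.0), (15, 2.0)]   # buyers: willing to pay up to this price
asks = [(10, 2.0), (18, 4.0)]   # prosumers: willing to sell at this price
trades = match_orders(bids, asks)
print(trades)  # [(15.0, 2.0), (19.0, 1.0)]
```

In an evolutionary model, each prosumer's bid or ask price would be the strategy that adapts over repeated rounds, with the realized trade surplus serving as the payoff.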

Table 2 summarizes the categories and applications of EGT. By applying EGT across various aspects of energy trading, from pricing strategies and renewable integration to carbon emissions and P2P trading, researchers can gain a comprehensive understanding of how strategies evolve in complex, real-time energy markets.

Table 2. Categories and applications of EGT.

5. DQN for Strategy Optimization in Electricity Trading

5.1. DQN Algorithm Applied to Strategy Optimization in Electricity Markets

In dynamic and fast-changing electricity markets, the strategies of market participants need to evolve continuously to adapt to new conditions. While EGT provides a robust framework for understanding long-term strategy evolution, real-time decision making requires additional tools [,]. The primary focus of this study is to integrate RL with EGT, with RL serving as the foundational framework that enables individual agents to adaptively learn strategies in response to changing environments. EGT, on the other hand, provides a supportive role by offering a population-level analysis that ensures the stability of strategies over time. Thus, the relationship can best be described as RL with EGT. The DQN, a form of RL [,], complements EGT by enabling market agents to optimize their strategies in real time based on immediate feedback from the market environment.

The combination of EGT and DRL addresses key limitations of both methods: EGT’s inability to adapt quickly to real-time changes and DRL’s lack of a population-level strategic perspective. This integrated approach enhances market participants’ ability to handle high-dimensional state spaces and respond swiftly to supply–demand fluctuations, making it particularly beneficial for dynamic electricity markets. Moreover, DRL’s reinforcement learning mechanism allows individual agents to maximize their utility by adapting their strategies to changing market conditions, while EGT provides insight into the overall stability of these strategies within a population, leading to a more holistic optimization framework. In summary, EGT primarily provides insights into long-term population-level dynamics, ensuring evolutionarily stable strategies (ESSs), while DRL offers the adaptability needed for individual agents to respond to real-time changes and fluctuations in market conditions. This integration is necessary for creating a framework that can effectively manage both immediate and long-term market dynamics. The game definition among agents forms a critical component of this framework, as it captures both competitive and cooperative interactions that are necessary for the accurate application of EGT in a multi-agent system.

In EGT, the strategies of participants evolve over time as the participants interact with each other, often leading to equilibrium or ESSs. However, in highly dynamic environments such as electricity markets, where conditions change rapidly and participants must make real-time decisions, the static nature of traditional EGT can limit its applicability. In this work, we emphasize that the main focus is on RL, with EGT acting as a supporting tool that enhances the understanding of long-term strategy stability. The game definition among agents ensures that interactions are modeled accurately, which is crucial for integrating EGT with RL effectively. To address these challenges, the DQN, an RL algorithm, provides a powerful extension by enabling agents to dynamically adjust their strategies based on ongoing interactions with the environment [,,]. Unlike EGT, which primarily focuses on long-term stable strategies, the DQN allows for real-time strategy optimization, making it well suited for environments where conditions fluctuate frequently [,,].

The Q-value update rule underlying the DQN is

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right] \qquad (2)$$

where the following applies:

$Q(s, a)$: the expected reward of selecting action $a$ in state $s$;

$\alpha$: the learning rate, which controls how strongly new information updates the old Q-value;

$r$: the immediate reward for the current round;

$\gamma$: the discount factor, indicating the weight of future rewards;

$s'$: the next state;

$\max_{a'} Q(s', a')$: the maximum expected reward obtainable in the next state $s'$, taken over the available actions $a'$.
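To make the update in Equation (2) concrete, the following minimal sketch applies it once to a small lookup table. The three price states, the two actions, the prior Q-values, and the function name `q_update` are illustrative assumptions rather than part of any reviewed model, and a table is used here in place of a neural network.

```python
import numpy as np


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular update of Equation (2) in place; return the TD error."""
    td_target = r + gamma * np.max(Q[s_next])   # r + gamma * max_a' Q(s', a')
    td_error = td_target - Q[s, a]              # bracketed term in Equation (2)
    Q[s, a] += alpha * td_error
    return td_error


# Hypothetical toy market: 3 price states (low, mid, high) x 2 actions (buy, sell).
Q = np.zeros((3, 2))
Q[2] = [1.0, 2.0]   # assumed prior estimates for the high-price state

td = q_update(Q, s=0, a=1, r=0.5, s_next=2)
print(f"TD error: {td:.2f}, updated Q[0, 1]: {Q[0, 1]:.2f}")
```

With these assumed numbers, the target is 0.5 + 0.9 × 2.0 = 2.3, so the entry for (low price, sell) moves from 0 to about 0.23, one tenth of the way toward the target.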

The DQN extends traditional Q-learning by using a deep neural network to approximate the Q-values, making it suitable for environments with large state-action spaces, such as electricity markets. The Q-value update rule, shown above, guides the agent to learn optimal strategies by balancing exploration (trying new actions) and exploitation (leveraging known strategies) []. In this way, the DQN allows agents to dynamically adapt their strategies based on real-time market conditions, which aligns with the goals of EGT by seeking an optimal, adaptive strategy, as demonstrated in Figure 9. This figure illustrates the workflow of the DQN algorithm applied to strategy optimization in electricity markets. The process involves continuous interaction between market participants and the environment, where agents update their strategies based on feedback from real-time market conditions.

Figure 9. Framework for integrating EGT and the DQN for strategy optimization in the electricity markets [].
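The function-approximation idea described above can be sketched as follows, assuming a small two-layer network written directly in NumPy. The state features (price, demand, own generation), the layer sizes, and all function names are hypothetical, and the training step (gradient descent on the squared TD error) is deliberately omitted, so this only illustrates how Q-values are read from a network and how exploration and exploitation enter through an ε-greedy choice.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_q_network(state_dim, n_actions, hidden=16):
    """Randomly initialise a two-layer network: state vector -> one Q-value per action."""
    return {
        "W1": rng.normal(0.0, 0.1, (state_dim, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, n_actions)), "b2": np.zeros(n_actions),
    }


def q_values(params, state):
    hidden = np.maximum(0.0, state @ params["W1"] + params["b1"])  # ReLU layer
    return hidden @ params["W2"] + params["b2"]


def epsilon_greedy(params, state, eps):
    """Explore with probability eps, otherwise exploit the current Q estimates."""
    if rng.random() < eps:
        return int(rng.integers(len(params["b2"])))
    return int(np.argmax(q_values(params, state)))


def td_error(params, s, a, r, s_next, gamma=0.95):
    """Bracketed term of Equation (2), with Q read from the network."""
    target = r + gamma * np.max(q_values(params, s_next))
    return target - q_values(params, s)[a]


# Hypothetical normalised state: [price, demand, own generation].
params = init_q_network(state_dim=3, n_actions=4)
s = np.array([0.6, 0.8, 0.1])
s_next = np.array([0.7, 0.7, 0.2])
a = epsilon_greedy(params, s, eps=0.1)
delta = td_error(params, s, a, r=1.0, s_next=s_next)
print("chosen action:", a, "TD error:", float(delta))
```

In a full DQN, the network weights would be adjusted to shrink this TD error over many sampled transitions; the sketch stops before that step.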

As elaborated previously, the replicator dynamics equation, commonly used in EGT to represent the evolution of strategy proportions over time, is given by $\dot{x}_i = x_i (f_i - \bar{f})$, where $x_i$ represents the proportion of agents adopting strategy $i$, $f_i$ represents the fitness (or payoff) of strategy $i$, and $\bar{f} = \sum_j x_j f_j$ is the average fitness of the population. As shown in Figure 9, the replicator dynamics equation describes how successful strategies proliferate within the population according to their relative fitness. This expression is fundamental to EGT in modeling the evolution of strategies over time and ensuring that adaptive, stable behaviors emerge within the system. Moreover, in the integration framework shown in Figure 9, the DQN plays a critical role in real-time strategy optimization for individual agents through the Q-value update rule $Q(s, a) \leftarrow Q(s, a) + \alpha [r + \gamma \max_{a'} Q(s', a') - Q(s, a)]$. This update mechanism guides agents to learn optimal strategies while balancing exploration and exploitation, and it allows for dynamic adaptation to changing market conditions, which is especially critical in highly volatile energy markets. As Figure 9 also shows, EGT and the DQN operate together in the conceptual framework as follows.

(i): EGT for population-level dynamics: The replicator dynamics model how strategy proportions evolve within the population: strategies with above-average fitness gain share, which drives the population toward stable outcomes. Strategy success is calculated from the fitness values derived from the payoff functions, so the population adapts over time.
(ii): DQN for individual adaptation: The Q-value update rule helps individual agents learn effective strategies in real time. By employing a deep neural network, DQN agents are capable of approximating optimal actions even in environments characterized by high-dimensional state spaces.
(iii): Synergistic framework: The conceptual flowchart integrates the replicator dynamics and Q-value update rules, illustrating their combined impact. EGT ensures that, at the population level, strategies converge towards evolutionarily stable outcomes, while the DQN ensures individual adaptability to ongoing changes.
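The population-level part of this framework can be illustrated with a short numerical sketch of the replicator dynamics. The payoff matrix, the two strategy labels, the step size, and the explicit Euler discretization are all assumptions chosen for illustration; fitness is taken to be $f = A x$ for a symmetric two-strategy game.

```python
import numpy as np


def replicator_step(x, payoff, dt=0.01):
    """One explicit-Euler step of dx_i/dt = x_i (f_i - f_bar), with f = payoff @ x."""
    f = payoff @ x          # fitness of each strategy against the current population
    f_bar = x @ f           # population-average fitness
    x_new = x + dt * x * (f - f_bar)
    return x_new / x_new.sum()   # renormalise to guard against round-off drift


# Hypothetical payoffs A[own, other]: 0 = shift load off-peak, 1 = consume at peak.
A = np.array([[3.0, 1.0],
              [2.0, 0.5]])       # strategy 0 earns more against either opponent
x = np.array([0.2, 0.8])         # assumed initial strategy shares

for _ in range(5000):
    x = replicator_step(x, A)

print("final strategy shares:", x)
```

Because strategy 0 earns a higher payoff against every opponent in this assumed matrix, its share grows from 20% toward 100%, which is the proliferation of above-average strategies that the replicator equation describes.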

Based on these explanations, Figure 9 depicts the integration of EGT and the DQN in a conceptual framework for electricity market strategy optimization. The figure includes the replicator dynamics and Q-value update equations, providing a more detailed representation of how EGT and the DQN operate in tandem to achieve both population-level stability and individual adaptability. In this integrated framework, the replicator dynamics equation guides the evolutionary process by modeling the population’s strategic adjustments based on the relative success of different strategies. This ensures that, over time, strategies that yield higher payoffs are more likely to be adopted by the population, promoting population-level stability.

Simultaneously, the Q-value update equation, a core component of the DQN, enables individual agents to learn and adapt their strategies through trial and error. By updating their Q-values based on observed rewards and state transitions, agents can improve their decision-making processes, enhancing individual adaptability.

The combination of these two mechanisms within the framework allows for a dynamic balance between population-level stability and individual-level flexibility. While the replicator dynamic ensures that successful strategies are widely adopted, the Q-value update allows for innovation and adaptation at the individual level, enabling the population as a whole to respond effectively to changing market conditions.
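One possible way to see the two mechanisms side by side is sketched below; it is not the implementation of any reviewed study. Individual agents learn by a stateless tabular version of the Q-value update (so the discount term vanishes, $\gamma = 0$, and a table replaces the neural network), while the replicator dynamics are run on the same assumed payoff matrix as a population-level check. The payoffs, population size, and learning parameters are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[3.0, 1.0],
              [2.0, 0.5]])        # hypothetical payoffs A[own, other]; strategy 0 dominates
n_agents, n_rounds = 200, 2000
alpha, eps = 0.2, 0.1

# Micro level: each agent learns one Q-value per strategy (stateless, so gamma = 0).
Q = rng.normal(0.0, 0.01, size=(n_agents, 2))
rows = np.arange(n_agents)
for _ in range(n_rounds):
    greedy = Q.argmax(axis=1)
    explore = rng.random(n_agents) < eps
    actions = np.where(explore, rng.integers(0, 2, size=n_agents), greedy)
    partners = actions[rng.permutation(n_agents)]   # strategy of a randomly assigned agent
    rewards = A[actions, partners]
    Q[rows, actions] += alpha * (rewards - Q[rows, actions])

learned_share = np.bincount(Q.argmax(axis=1), minlength=2) / n_agents

# Macro level: replicator dynamics on the same payoffs, started from an even split.
x = np.array([0.5, 0.5])
for _ in range(5000):
    f = A @ x
    x = x + 0.01 * x * (f - x @ f)
    x = x / x.sum()

print("share of learners preferring each strategy:", learned_share)
print("replicator strategy shares:", x)
```

In this toy setting both views agree that strategy 0 takes over the population. In less trivial games the comparison could flag cases where individually learned behavior drifts away from the evolutionarily stable outcome.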

This integrated approach leveraging EGT and the DQN thus offers a powerful tool for strategy optimization in complex and evolving electricity markets, where the need for both stability and adaptability is paramount. Concretely, the use of EGT ensures that, at a population level, the system can reach and maintain an evolutionarily stable state, which is crucial for ensuring the long-term viability and efficiency of market operations. By capturing the collective dynamics of multiple agents and assessing the evolutionary fitness of different strategies, EGT provides a foundation upon which overall market stability is built. This stability is particularly important in electricity markets, where fluctuations in supply and demand, driven by factors such as renewable energy variability, can lead to significant instabilities if not properly managed.

On the other hand, the DQN plays a critical role in achieving real-time adaptability for individual market participants. Unlike traditional game-theoretic approaches that often assume static or equilibrium conditions, the DQN allows agents to learn and adapt dynamically based on ongoing interactions with the environment. By employing RL through a neural network, the DQN equips agents with the ability to continuously improve their decision-making strategies as they gather more information about market conditions. This adaptability is key in electricity markets where market participants, including energy producers, consumers, and prosumers, must respond promptly to rapidly changing variables such as real-time pricing signals, shifts in consumer demand, and variations in renewable energy generation.

The synergy between EGT and the DQN ensures that both macro-level and micro-level requirements are addressed in the market. At the macro level, EGT helps maintain systemic balance and prevents the emergence of unstable market behaviors that could lead to inefficiencies or economic losses. It does so by modeling and influencing the evolution of strategies across the population of market participants, ensuring that only those strategies that are evolutionarily robust thrive over time. At the micro-level, the DQN’s ability to enable individual agents to adapt quickly ensures that these participants can maximize their payoffs by learning optimal actions in a dynamic and often uncertain environment. This learning process allows agents to explore different strategies, evaluate their effectiveness, and exploit the best strategies for given market conditions.

Furthermore, this integrated framework is particularly suited to addressing the challenges introduced by DERs and the increasing penetration of renewable energy sources in modern electricity markets. The intermittent nature of renewable energy sources like wind and solar creates a high degree of uncertainty, which requires flexible and responsive mechanisms to manage effectively. The DQN component allows agents to learn optimal responses to these fluctuations, while the EGT component provides an overarching analysis of how such adaptations contribute to or detract from overall market stability. Together, they form a robust framework capable of optimizing decision making under uncertainty, which is crucial for managing the integration of renewables into the grid without compromising reliability.

Additionally, this integrated EGT-DQN approach promotes cooperative behaviors and mitigates the risks of purely competitive dynamics that could destabilize the market. By evaluating the outcomes of different strategies at both individual and collective levels, the framework encourages participants to adopt behaviors that not only maximize their individual utility but also contribute positively to market efficiency and stability. For example, prosumers who participate in P2P trading can use the DQN to determine when to sell surplus energy and at what price, while EGT provides insight into how these transactions affect the larger market dynamics, promoting cooperation when beneficial.

Overall, the integrated approach of combining EGT and the DQN provides a comprehensive solution for optimizing strategies in complex, evolving electricity markets. It ensures that agents are capable of both individual adaptability and collective stability, which are both essential in the face of market volatility, renewable energy integration, and the need for efficient energy distribution. This dual focus makes the EGT-DQN framework an ideal tool for modern electricity markets, where achieving a balance between rapid responsiveness and long-term stability is a key determinant of success.

5.2. Application Scenarios of DQN in the Electricity Market

In the context of user-side electricity markets, where supply and demand fluctuate rapidly, agents such as energy suppliers and consumers must continuously optimize their strategies to maximize their payoffs. Traditional EGT, while effective at modeling long-term strategy evolution, lacks the real-time decision-making capability needed in such dynamic environments. By incorporating the DQN, agents can adjust their strategies dynamically based on real-time feedback from the market, learning optimal bidding and pricing strategies over time. This continuous adaptation mirrors the evolutionary process but with the added benefit of real-time responsiveness. The DQN offers significant benefits across various scenarios within electricity markets. Table 3 summarizes some of the key applications of the DQN in electricity trading, highlighting its role in optimizing pricing strategies, demand response, and load management under dynamic market conditions.

Table 3. Application scenarios of DQN in the electricity market.

While EGT offers a strong framework for understanding long-term strategy evolution, its static nature and reliance on simplified assumptions about the environment limit its effectiveness in highly dynamic markets like electricity trading. The DQN addresses these limitations by introducing real-time adaptability and the ability to operate in high-dimensional state spaces. By continuously updating strategies based on real-time feedback, the DQN enables agents to handle market fluctuations and uncertainties more effectively than traditional evolutionary models.

To further illustrate the advantages of the DQN in electricity market applications, Table 4 compares the DQN with other popular RL algorithms []. This comparison highlights the DQN’s suitability for dynamic, high-dimensional environments, making it an ideal choice for real-time strategy optimization in electricity trading.

Table 4. Performance comparison of RL algorithms.

In summary, the DQN provides a valuable extension to EGT, particularly in dynamic and uncertain environments like user-side electricity markets. By enabling real-time strategy optimization and handling high-dimensional state-action spaces, the DQN enhances the ability of market participants to adapt to changing conditions and achieve optimal outcomes. Future work may explore further integration of other RL methods or hybrid approaches to tackle increasingly complex challenges in electricity market transactions.

5.3. The Role of EGT in the RL Framework

The integration of EGT in the context of RL serves a specific and critical purpose beyond what conventional RL can achieve. The presence of EGT in this proposed framework is aimed at guiding the long-term evolution of strategies among agents, ensuring that individual learning achieved through RL does not lead to unstable or suboptimal collective outcomes. EGT is used not to directly determine the reward function of RL agents but, rather, to evaluate and adjust the strategic behavior of agents based on population-level stability criteria.

(i): Guiding long-term strategy evolution: The EGT component is used to analyze how individual agent strategies, which are learned through RL, contribute to the overall stability of the system. This is particularly important in electricity markets where individual optimization may not align with collective market stability. EGT helps ensure that strategies that emerge as a result of RL also lead to sustainable and evolutionarily stable outcomes when observed from a population-level perspective.
(ii): Providing population-level feedback: Unlike conventional RL, which relies solely on the reward function for individual agent adaptation, EGT provides population-level feedback that informs how strategies should evolve in relation to other agents. This feedback mechanism is crucial for mitigating issues such as the “tragedy of the commons,” where individual reward-driven actions might lead to collectively detrimental outcomes.
(iii): Informing strategy adoption and adaptation: In the RL framework, EGT is also used to determine the relative success of different strategies within the agent population, allowing less successful strategies to be gradually phased out while promoting strategies that show better adaptability and stability. This contributes to a more dynamic exploration–exploitation balance; the influence of EGT helps RL agents converge towards strategies that are not only optimal individually but also evolutionarily stable within the broader agent population.
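The population-level stability criterion referred to in points (i) to (iii) can be made concrete with a small check of the standard ESS conditions for a symmetric matrix game. The sketch below is an assumption-laden illustration: the function name `is_pure_ess` and the "load shifting" payoff matrix are hypothetical, and the test considers pure-strategy mutants only, which is sufficient for two-strategy games and for strict equilibria but is only a necessary check when payoff ties occur in larger games.

```python
import numpy as np


def is_pure_ess(A, i, tol=1e-12):
    """Test pure strategy i against every other pure strategy j (the mutant)."""
    for j in range(A.shape[0]):
        if j == i:
            continue
        # Condition 1: i does strictly better against itself than the mutant does.
        strictly_better = A[i, i] > A[j, i] + tol
        # Condition 2: on a tie, i must do better against the mutant than the mutant itself.
        tie_but_stable = abs(A[i, i] - A[j, i]) <= tol and A[i, j] > A[j, j] + tol
        if not (strictly_better or tie_but_stable):
            return False
    return True


# Hypothetical payoffs A[own, other]: 0 = shift load off-peak, 1 = consume at peak.
dominant = np.array([[3.0, 1.0],
                     [2.0, 0.5]])
# Hawk-Dove payoffs with resource value 2 and conflict cost 4 (textbook example).
hawk_dove = np.array([[-1.0, 2.0],
                      [0.0, 1.0]])

print([is_pure_ess(dominant, i) for i in range(2)])   # [True, False]
print([is_pure_ess(hawk_dove, i) for i in range(2)])  # [False, False]
```

The second game has no pure ESS, which illustrates why a strategy that an individual learner finds rewarding at a given moment need not be stable once the whole population adopts it.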

This integration highlights the unique role of EGT in ensuring that the strategies learned by individual agents under RL are not only effective in real time but also contribute to the long-term equilibrium and robustness of the electricity market. In the proposed framework, EGT therefore serves primarily as a guiding and evaluative tool, rather than as a mechanism for determining RL rewards.

6. The Application of EGT in Demand Response for Smart Grids and Electricity Market Transactions

6.1. User Behavior Modeling and Strategy Evolution

Electricity user behavior refers to a series of electricity usage activities and decisions made by users under specific conditions to achieve certain objectives []. In the power system, users base their electricity consumption on their own characteristics and external factors (such as the power grid and pricing mechanisms). They utilize electrical equipment and manage resources like distributed energy storage and electric vehicles, aiming to optimize their economic benefits. The basic components of the electricity user behavior model are shown in Figure 10 [,,].

Figure 10. The basic components of the electricity user behavior model [,,].

A user behavior model is a tool used to analyze and predict user behavior, providing more personalized and accurate services across various fields. In the context of smart grids and the energy internet, understanding electricity user behavior is critical for a comprehensive modeling of the power system. Focusing solely on the physical characteristics of the power system is insufficient for complete modeling, making it essential to fully integrate user behavior modeling into the system [,,,,,,,].

Therefore, in smart grids and electricity market transactions, user behavior modeling is essential for understanding electricity consumption patterns in order to effectively guide users through demand-response initiatives. Table 5 provides an overview of relevant research and demonstrates how user behavior models can be utilized to address specific challenges.

Table 5. A summary of the application of user behavior models in power systems.

Strategy evolution is a concept that spans fields such as economics, management, sociology, and artificial intelligence. It describes how individuals or groups adjust their strategies in a dynamic environment to adapt to changes, respond to competitors’ actions, and pursue their own interests. Figure 11 illustrates the evolution of user electricity strategies under varying pricing conditions []. In the context of a smart grid, participants often need to adjust their strategies based on environmental changes and the outcomes of interactions to maximize their benefits.

Figure 11. The evolution of consumers adjusting their electricity usage strategies under different electricity prices [].

6.2. Demand Response Based on EGT

Demand response (DR) refers to the market-based participation behavior of electricity users who respond to price signals or incentives issued by DR agencies by altering their electricity consumption patterns. DR has developed under the framework of power market liberalization, and its primary function is to enhance the operational efficiency of the power system through interaction between the grid and users, providing benefits to all DR participants [,,]. With the continuous advancement of smart grid technologies, demand response has evolved into an innovative demand-side management strategy. It enables electricity users to transition from passive consumers to active market participants. By responding to the prices or incentives offered by power suppliers, electricity consumers can obtain economic benefits from market transactions and adjust their electricity usage behavior to reduce grid load fluctuations, thereby ensuring the security, stability, and economic efficiency of the power system.

In terms of power market operations, Wang et al. [] used an evolutionary game model to analyze the market mechanisms for the operation of demand-response resources. Specifically, they proposed innovative market designs and operational strategies to facilitate the effective participation of demand-response resources and optimize the functioning of the electricity market.

To address the issue of defense resource allocation in smart grids facing potential cyber-attacks, Ge et al. [] designed a bi-level game model to optimize resource allocation. After determining the stable strategy in the evolutionary game, a genetic algorithm was used to solve the upper-level game, aiming to maximize the benefits of all related nodes [,,]. The framework of the model is roughly illustrated in Figure 12.

Figure 12. An optimal defense resource allocation method based on game theory [,,,].

In the field of integrated energy systems, there has been significant research both domestically and internationally. Zhang et al. [] discussed the optimization of distributed integrated multi-energy systems (DIMSs), particularly considering industrial production processes (IPPs). Their proposed model accounts for the bounded rationality of attackers and introduces the quantal response equilibrium to quantify player payoffs. The feasibility and effectiveness of the approach were demonstrated through specific algorithms. Yu et al. [] designed and studied a bi-level optimization model in which the upper level optimizes the regionally integrated energy system, and the lower level focuses on user response based on EGT. Dou et al. [] developed a demand-response model based on EGT, using nodal energy prices and quantifying user energy consumption via market consumer surplus. They also considered the substitutability of multi-energy user loads, modeling and optimizing user responses to different energy price signals through mathematical models and algorithms.

Research has shown that demand-response strategies based on EGT can effectively reduce the operating costs of power systems. These strategies provide new perspectives for the analysis and optimization of power systems, enhancing energy efficiency, increasing user participation, and boosting overall system benefits.

7. Empirical Analysis

7.1. Application of EGT in Electricity Market Transactions

The application of EGT in electricity markets helps analyze and predict the long-term strategy evolution of market participants, assess the stability of market mechanisms, and provide theoretical support to policymakers []. Table 6 offers an overview of relevant research and demonstrates how EGT addresses specific challenges in electricity markets.

Table 6. A summary of the application of EGT in electricity market transactions.

Research shows that, by analyzing how participants gradually adjust their behavior in different market environments, EGT offers a powerful tool for understanding the dynamic changes and development trends in electricity markets [].

7.2. Comparative Analysis: Traditional Methods and EGT

EGT offers a distinctive approach to smart grid management compared to traditional game theory. Traditional game theory typically assumes fully rational players with complete information, leading to static equilibrium outcomes. However, EGT introduces bounded rationality and limited information, allowing players to adapt their strategies dynamically based on their environment and the strategies of other participants []. This process better reflects the decentralized and uncertain nature of modern smart grids, particularly with the integration of renewable energy sources []. As smart grids evolve into systems of interconnected microgrids and DERs, the ability to model the adaptive behaviors of various agents becomes crucial []. For instance, EGT has been effectively used to optimize decision making in energy trading, where participants adjust their strategies in real time in response to fluctuating energy prices and demand patterns [].

Moreover, EGT’s adaptive framework supports energy trading models in which consumers can act as prosumers, both producing and consuming energy, and respond to real-time pricing in P2P networks []. This flexibility is especially important in addressing the challenges posed by renewable energy’s variability [,]. For example, EGT-based models have been shown to enhance the stability of smart grids by fostering cooperation among grid participants through dynamic pricing and energy storage optimization. Additionally, EGT applications can reduce the peak-to-average ratio (PAR) in energy consumption, leading to more efficient grid operation [,,,]. In contrast, traditional game theory may fall short in handling these dynamic and decentralized interactions, particularly in environments where information is imperfect or incomplete [,,,,].
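The peak-to-average ratio mentioned above is simply the maximum load divided by the mean load over the horizon considered. The short sketch below computes it for two hypothetical six-period load profiles with the same total energy; the numbers are invented for illustration and do not come from any reviewed study.

```python
import numpy as np


def peak_to_average_ratio(load):
    """PAR = peak load / average load over the given periods."""
    load = np.asarray(load, dtype=float)
    return load.max() / load.mean()


# Hypothetical loads in kW over six periods; both profiles sum to 24 kW-periods.
baseline = [2.0, 2.0, 3.0, 6.0, 7.0, 4.0]   # pronounced evening peak
shifted = [3.0, 3.0, 4.0, 5.0, 5.0, 4.0]    # part of the peak moved to earlier periods

print("baseline PAR:", peak_to_average_ratio(baseline))  # 1.75
print("shifted PAR:", peak_to_average_ratio(shifted))    # 1.25
```

Shifting load without changing total consumption lowers the PAR from 1.75 to 1.25 in this example, which is the kind of flattening that the cited EGT-based demand-side strategies aim for.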

The combination of EGT and DRL presents a promising direction for future research, with significant applications in diverse fields such as smart grid management. Unlike traditional game theory, which assumes fully rational players and complete information leading to static equilibrium outcomes, EGT embraces bounded rationality and dynamic adaptation [,], making it better suited for modeling the complex, decentralized nature of systems like smart grids. As modern power systems increasingly involve interconnected microgrids and renewable energy resources, EGT’s ability to model adaptive and evolving strategies of various agents becomes crucial. Moreover, the adaptive framework of EGT enhances decision making in dynamic environments, including energy trading and P2P networks, by allowing participants to adjust their strategies based on real-time feedback. When integrated with DRL, which excels at optimizing decision making in complex, uncertain scenarios, EGT can lead to robust solutions for stability, efficiency, and optimization in these evolving systems. The potential for EGT, especially when combined with advanced learning methods, provides a broad and compelling landscape for future research, contributing to the development of adaptive and intelligent management systems for complex, decentralized networks.

8. Conclusions and Future Prospects

The study of user-side electricity market trading strategies, particularly through the lens of EGT, showcases significant promise. However, as market conditions evolve and technological advancements proceed rapidly, future research needs to delve deeper into areas such as theoretical frameworks, algorithmic development, and application contexts. This paper offers a comprehensive outlook on potential future research directions in this field.

8.1. Conclusions

The conclusions of this review paper are as follows.

(i): Broad applicability of EGT in user-side electricity markets: The paper demonstrates that EGT is an effective tool for modeling the dynamic behavior of multiple agents in the user-side electricity market. EGT provides a unique advantage by accounting for bounded rationality and the gradual evolution of strategies, making it particularly useful in capturing long-term stable strategies in smart grid environments. EGT helps analyze how consumers, producers, and regulators interact in the market, adjusting their strategies over time based on changes in market conditions, pricing, and regulations.
(ii): Limitations of classical game theory in real-time decision making: While classical game theory is useful in static environments and assumes perfect rationality, it is not well suited for the dynamic and real-time changes observed in electricity markets. The review highlights the limitations of classical models in handling high-dimensional state spaces and real-time optimization, which are essential for smart grid management.
(iii): Integration of DRL with EGT for adaptive strategy optimization: To overcome the limitations of traditional game theory, the paper suggests the integration of DRL, specifically the DQN, with EGT. This combination allows for continuous strategy adaptation based on real-time feedback from the market. The use of the DQN enhances the ability of market participants to optimize their strategies dynamically, responding more effectively to rapid changes in supply, demand, and pricing conditions.
(iv): Application of multi-agent simulation models: The study emphasizes the importance of multi-agent simulation models for capturing the complex interactions among various stakeholders in the electricity market. By modeling the strategic behavior of multiple agents, including consumers, producers, and regulatory bodies, the study reveals how strategies evolve over time and under different market conditions. This provides valuable insights into the dynamic nature of user-side electricity markets and the factors that influence market stability and efficiency.
(v): Significance in energy market optimization and policy formulation: The integration of EGT and DRL not only enhances market participants’ ability to optimize their strategies in real time but also has broader implications for market optimization and policy development. The findings of this paper can guide policymakers in designing more effective regulations and incentives that promote market efficiency, stability, and the adoption of renewable energy sources.
(vi): Future research directions: The paper concludes by suggesting several avenues for future research. These include further exploration of complex evolutionary game models in multi-agent systems, the development of hybrid models that integrate multiple types of game theory, and the use of advanced algorithms such as parallel evolutionary algorithms and machine learning techniques to enhance model accuracy and efficiency. Additionally, the paper calls for more empirical studies to validate the theoretical models in real-world market scenarios, as well as the development of practical tools and platforms for implementing these models in actual market operations.

In summary, this review paper provides a comprehensive overview of the application of EGT and DRL in optimizing strategies for user-side electricity markets. It highlights the potential of these advanced methods in addressing the challenges of dynamic market environments, improving the efficiency of electricity markets, and offering new insights for policy formulation. The integration of EGT and DRL presents a promising approach to real-time decision making in smart grids, particularly as the energy landscape continues to evolve with the increasing adoption of renewable energy sources. Based on the conclusions, future prospects are summarized in five aspects, as follows.

8.2. Limitations of the Study and Future Prospects

8.2.1. Limitations of the Study

While this paper presents a comprehensive review of the application of EGT and DRL in user-side electricity markets, several limitations remain that require further consideration. These limitations are critical to the implementation and real-world application of the proposed models, and they should guide future research.

(i): Simplified assumptions in modeling: Many of the models reviewed in this paper rely on simplified assumptions regarding participant behavior, information availability, and market conditions. For instance, the application of EGT often assumes that participants in the market act with bounded rationality and have limited but consistent access to market information. However, in real-world electricity markets, agents such as consumers, prosumers, and energy suppliers may exhibit more complex behaviors influenced by multi-objective considerations (e.g., cost minimization, environmental impact reduction, and comfort maximization). Recent studies have highlighted that prosumers in distributed energy systems are not only motivated by economic returns but also by environmental and social factors, which adds to the complexity of decision making [,]. Future research should focus on extending these models to capture these more nuanced and multi-faceted decision-making processes, incorporating more realistic agent preferences and behavioral dynamics [,,,]. Additionally, the models tend to treat market conditions as relatively stable within certain timeframes, but in reality, market dynamics are often subject to abrupt and unpredictable changes due to external factors like policy shifts, renewable energy availability, or unforeseen demand spikes. It has been found that renewable energy integration introduces high levels of volatility and unpredictability, which cannot be fully addressed by static or semi-static models, emphasizing the need for dynamic and adaptive modeling approaches.
(ii): Computational complexity and scalability issues: The integration of EGT with DRL presents considerable computational challenges, particularly when applied to large-scale electricity markets with multiple interacting agents. Traditional EGT approaches, while effective in modeling long-term strategy evolution, become computationally prohibitive when scaled to large markets with thousands of participants. Similarly, DRL, while useful for real-time decision making, can suffer from inefficiencies when dealing with high-dimensional state and action spaces. The combination of these two methods further amplifies the computational burden, especially when applied in high-frequency trading environments or systems requiring real-time responses. Therefore, one of the key limitations of the current state of research is the lack of efficient, scalable algorithms that can handle the increasing complexity and size of modern decentralized energy systems. Future research should focus on developing more sophisticated, parallelized algorithms that can distribute computational loads and manage the interactions between a growing number of market participants. Such advances will be critical for the practical implementation of EGT-DRL frameworks in real-world smart grid applications [,].
(iii): Limited empirical validation: Another limitation of the reviewed studies is the reliance on simulation-based models, rather than real-world empirical data. While simulations can provide valuable insights into the theoretical effectiveness of EGT-DRL integrations, they may not fully capture the complexities and uncertainties present in actual market environments. For instance, real-world electricity markets are influenced by external variables such as regulatory policies, economic conditions, and technological disruptions, which may not be accurately reflected in controlled simulation settings. Additionally, the variability introduced by renewable energy sources, such as solar and wind, presents further unpredictability that needs to be tested with empirical data. Future work should prioritize the collection of real-time data from smart grids, DERs, and P2P energy trading platforms to validate these models. Such empirical studies would help refine the theoretical models, making them more robust and applicable to real-world scenarios.

In light of the limitations identified, there are several key avenues for future research that will enhance the applicability and robustness of EGT-DRL models in user-side electricity markets:

(i): Development of hybrid models to address complex market dynamics: Future research should focus on creating more advanced hybrid models that integrate EGT with machine learning methods, such as supervised learning, unsupervised learning, and RL. These models should be designed to handle the inherent complexities and dynamic nature of electricity markets, where demand and supply are highly variable, and participant behavior is influenced by both economic incentives and policy regulations. Hybrid models can offer a more holistic approach, combining the long-term stability of EGT with the real-time adaptability provided by machine learning techniques. For example, supervised learning could be used to predict short-term market conditions, while DRL enables agents to adjust their strategies in response to these predictions. By integrating these methods, future models will be better equipped to optimize strategy evolution in both stable and rapidly fluctuating environments.
(ii): Real-world application and empirical studies: A key research direction is the application of EGT-DRL models in real-world settings, particularly within the evolving infrastructure of smart grids and decentralized energy systems [,,]. Empirical studies should be conducted using data from actual market operations, such as load profiles, generation schedules, and real-time pricing data from P2P trading platforms. By analyzing the performance of these models in real-world conditions, researchers can assess their validity, scalability, and adaptability to unforeseen market disruptions []. For instance, future studies could explore how EGT-DRL models perform during periods of high renewable energy penetration, where grid stability is challenged by the intermittent nature of solar and wind energy. Additionally, partnerships with energy companies or utility operators could provide access to valuable data sets, enabling the testing of these models under different regulatory regimes and market conditions.
(iii): Scalability and efficient algorithms for multi-agent interactions: As the number of participants in electricity markets continues to grow, particularly with the rise of prosumers and distributed energy resources, future research should prioritize the development of algorithms that can efficiently handle large-scale, multi-agent interactions. This involves creating computational frameworks that can manage the complex interactions between hundreds or even thousands of market participants, all of whom may have different objectives, constraints, and levels of market power. Techniques such as parallel computing, distributed algorithms, and agent-based modeling will be critical to ensuring that these interactions are accurately captured without overwhelming computational resources. Additionally, the use of advanced optimization methods, such as multi-objective evolutionary algorithms (MOEAs), could help agents optimize their strategies across multiple criteria, such as cost minimization, energy efficiency, and carbon emission reduction.
(iv): Incorporation of regulatory and policy considerations: Future research should also investigate how EGT-DRL models can be adapted to reflect the changing regulatory landscape of electricity markets. As governments and international organizations increasingly promote renewable energy adoption, carbon pricing, and other environmental initiatives, market participants will need to adjust their strategies accordingly. Investigating how policy interventions, such as subsidies for renewable energy, carbon trading schemes, and demand-side management programs, impact the strategic behavior of electricity market participants will be critical for ensuring the robustness of these models. Additionally, researchers should explore how regulatory frameworks can be designed to incentivize cooperation among market participants, particularly in decentralized and P2P trading environments, where individual incentives may not always align with system-wide goals of stability and sustainability.
(v): Handling renewable energy variability and market volatility: The increasing integration of renewable energy sources into electricity markets presents both opportunities and challenges. Renewable energy introduces significant variability into the grid due to its dependence on external factors like weather conditions. Future research should focus on how EGT-DRL models can be optimized to handle this variability, ensuring that market participants can respond effectively to sudden changes in the supply while maintaining grid stability. This could involve developing advanced forecasting methods for renewable generation and incorporating these predictions into strategy optimization models. Additionally, DRL techniques could be used to enable market participants to dynamically adjust their energy trading strategies in response to real-time data on renewable energy availability, demand fluctuations, and price signals.

8.2.2. Future Prospects

The study of user-side electricity market trading strategies, particularly through the lens of evolutionary game theory (EGT), showcases significant promise. However, as market conditions evolve and technological advancements proceed rapidly, there are several critical areas where future research should intensify. Specifically, future research needs to focus on the development of advanced hybrid models that integrate EGT with deep reinforcement learning (DRL). These models should be capable of handling the growing complexity of decentralized energy markets, where real-time fluctuations in supply and demand, the integration of renewable energy sources, and the increasing participation of diverse agents (e.g., prosumers, utilities, and storage operators) require more sophisticated strategic optimization techniques. Techniques such as multi-agent reinforcement learning, cooperative game theory, and adaptive heuristic algorithms are particularly promising for managing these complexities. The integration of EGT and DRL offers a synergistic approach, combining the long-term evolutionary stability provided by EGT with the real-time adaptive decision making enabled by DRL. Such hybrid models will need to address high-dimensional state spaces, allowing agents to continuously update their strategies based on immediate market signals while ensuring that a long-term equilibrium is achieved within the overall market environment. Additionally, these hybrid models must incorporate sophisticated mechanisms for strategy evolution, enabling agents to adapt to changing market conditions more effectively. This could involve leveraging transfer learning to adapt strategies from one market scenario to another, thereby enhancing the generalizability of the models. Moreover, future research could explore the incorporation of deep neural networks for feature extraction, which would allow the models to automatically capture underlying patterns in complex market data, further improving decision-making efficiency.
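To make the division of labor between the two methods concrete, the following minimal sketch couples a learning step with an evolutionary step in a two-strategy market game. It is an illustration under stated assumptions rather than a model from the reviewed literature: the payoff matrix, the strategy labels, and the parameters `alpha`, `dt`, and `steps` are hypothetical, and a stateless exponential-smoothing update stands in for the DQN, which in practice would use a neural network over a high-dimensional market state.

```python
# Toy hybrid: learned payoff estimates (RL-style) drive a replicator update (EGT).
# All numbers are hypothetical and chosen only for illustration.

# PAYOFF[i][j] = payoff to strategy i when it meets strategy j.
# Strategy 0 = "bid aggressively", strategy 1 = "bid conservatively".
PAYOFF = [[-1.0, 4.0],
          [0.0, 2.0]]


def expected_payoffs(x):
    """Expected payoff of each strategy when a share x of agents plays strategy 0."""
    return [row[0] * x + row[1] * (1.0 - x) for row in PAYOFF]


def simulate(x0=0.1, alpha=0.2, dt=0.05, steps=4000):
    """Return the final share of strategy 0 and the learned payoff estimates."""
    x = x0            # population share using strategy 0
    q = [0.0, 0.0]    # learned payoff estimates (stateless Q-values)
    for _ in range(steps):
        reward = expected_payoffs(x)
        # Learning step: move each estimate toward the observed reward.
        q = [qi + alpha * (ri - qi) for qi, ri in zip(q, reward)]
        # Evolution step: discrete-time replicator dynamics with the
        # learned estimates used as fitness values.
        x += dt * x * (1.0 - x) * (q[0] - q[1])
        x = min(max(x, 0.0), 1.0)
    return x, q


share, estimates = simulate()
print(f"share of strategy 0: {share:.3f}, payoff estimates: {estimates}")
```

With these assumed payoffs, the two strategies earn equal payoffs when two thirds of the population bids aggressively, so the simulated share should settle near that value. The learning step supplies the short-term signal, while the replicator step governs the long-run population mix.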

Another significant area of future research lies in improving the scalability and computational efficiency of these hybrid EGT-DRL models. As the scale of electricity markets expands with the adoption of distributed energy resources (DERs), P2P trading platforms, and microgrids, current computational methods may struggle to handle the increasing number of agents and interactions. Future work should focus on developing parallel computing techniques, such as MapReduce, GPU-based processing, and Apache Spark, as well as distributed algorithms like consensus-based optimization, swarm intelligence, and federated learning, to efficiently simulate large-scale electricity markets with thousands of agents interacting dynamically. Additionally, the role of artificial intelligence (AI) and machine learning (ML) should be explored further to enhance the learning and decision-making capabilities of market participants, particularly in the face of stochastic and uncertain environments. Techniques such as reinforcement learning with experience replay, meta-learning, and online learning could be instrumental in improving the adaptability and responsiveness of agents in these complex market environments. Moreover, leveraging cloud-based and edge computing infrastructures could provide the necessary computational power and data processing capabilities required for real-time strategy optimization in large-scale electricity markets. Future research could also explore the use of quantum computing to solve high-dimensional optimization problems, as quantum algorithms have the potential to significantly accelerate computation, making it feasible to simulate very large and complex systems.
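Of the techniques listed above, experience replay is the simplest to illustrate. The sketch below shows a fixed-capacity replay buffer of the kind commonly paired with DQN-style agents. The class name `ReplayBuffer` and the layout of a transition tuple are assumptions made for this example, not an interface taken from any specific library.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        """Append one transition observed by the agent."""
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a uniform random minibatch without replacement."""
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: store five hypothetical transitions in a buffer that holds three.
buf = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buf.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
print(len(buf))        # the two oldest transitions have been dropped
print(buf.sample(2))   # a random minibatch of the remaining transitions
```

Sampling past transitions at random breaks the temporal correlation between consecutive market observations, which is one reason replay is often used to stabilize value-based learning in non-stationary environments.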

Moreover, future research should delve deeper into understanding how external factors such as government regulations, policy interventions, and market incentives influence the strategic behaviors of market participants in decentralized electricity markets. For instance, the implementation of carbon pricing mechanisms could encourage utility companies to reduce emissions by making carbon-intensive practices more costly, while renewable energy credits provide financial incentives for prosumers to adopt renewable energy technologies. Demand-response programs incentivize consumers to adjust their electricity usage during peak times, helping to balance supply and demand and reduce grid stress. These mechanisms significantly impact the decision-making processes of prosumers and utility companies, shaping their strategies towards more sustainable practices. Investigating how EGT-DRL models can incorporate these external variables, through approaches such as scenario analysis or sensitivity analysis, will provide more accurate and practical tools for market optimization. Researchers should also explore the design of policy interventions that promote cooperative behavior among market participants while maintaining competition. Such interventions could be crucial for achieving both economic efficiency and environmental sustainability in future electricity markets. Furthermore, it is essential to evaluate the long-term effects of these policies on market stability and participant behavior, incorporating feedback loops to understand how regulatory changes can lead to shifts in strategic behaviors over time. Future work should likewise consider modeling different types of policy instruments, such as subsidies, taxes, and market-based mechanisms, and analyzing their relative effectiveness in achieving sustainable energy transitions. The integration of adaptive policy frameworks could further enhance the ability of market participants to respond dynamically to regulatory changes.
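The sensitivity analysis mentioned above can be sketched with a deliberately simple two-strategy model in which participants choose between a clean and a carbon-intensive practice. Every payoff term and parameter value here is a hypothetical assumption introduced for illustration: clean adopters earn less as adoption rises (for example, because credit prices fall), and carbon-intensive participants pay the carbon price on their emissions.

```python
def clean_share_at_rest(carbon_price, base_clean=1.0, base_dirty=2.0,
                        emissions=0.5, crowding=3.0):
    """Stable share of clean adopters in a toy two-strategy model.

    Assumed payoffs, with x the share of clean adopters:
        clean:            base_clean - crowding * x
        carbon-intensive: base_dirty - carbon_price * emissions
    Under replicator dynamics the interior rest point equalizes the two
    payoffs; the result is clipped to the feasible range [0, 1].
    """
    x = (base_clean - base_dirty + carbon_price * emissions) / crowding
    return min(max(x, 0.0), 1.0)


def sensitivity(prices):
    """Sweep the carbon price and report the resulting stable clean share."""
    return [(p, clean_share_at_rest(p)) for p in prices]


for price, share in sensitivity([0, 2, 4, 6, 8]):
    print(f"carbon price {price}: stable clean share {share:.2f}")
```

In this sketch the clean share stays at zero until the carbon price offsets the payoff gap, then rises linearly until the whole population has switched. A richer EGT-DRL model would replace the closed-form rest point with simulated learning dynamics, but the same sweep structure applies.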

Another vital area of investigation is the evolution of renewable energy sources and their impact on market stability. As renewable energy sources such as wind and solar introduce higher variability and uncertainty into the grid, future research should focus on developing new optimization techniques that enhance the resilience of electricity markets under volatile conditions. Techniques such as robust optimization, stochastic programming, and chance-constrained optimization are currently being explored to address the challenges posed by renewable energy variability. These techniques should allow for the adaptive adjustment of strategies in response to the intermittent nature of renewables, leveraging advanced forecasting methods such as the autoregressive integrated moving average (ARIMA), machine learning-based predictive models, and real-time data analytics using big data platforms. Additionally, the integration of energy storage systems (ESSs) with renewable energy sources will need to be carefully examined, as the interaction between storage operators and other market participants could significantly alter market dynamics. Future research should also consider the role of virtual power plants (VPPs) in aggregating distributed energy resources to provide grid stability and how EGT-DRL models can optimize the operation of VPPs under different market conditions. Furthermore, integrating probabilistic forecasting with decision-making models could help agents make more informed choices in the face of renewable energy uncertainties, thus improving the robustness of electricity markets. Another promising direction is the application of blockchain technology to enhance transparency and trust among market participants, particularly in peer-to-peer (P2P) energy trading settings, where trust issues can otherwise hinder market efficiency. Blockchain-based smart contracts could automate and enforce trading agreements, providing additional stability to markets dominated by fluctuating renewable energy sources.
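As a small illustration of the chance-constrained idea mentioned above, the sketch below sizes a reserve so that historical renewable shortfalls are covered with a chosen probability. It is a sample-based approximation of a single chance constraint using an empirical quantile, not a full chance-constrained program. The function name and the forecast-error data are hypothetical.

```python
import math


def reserve_for_chance_constraint(forecast_errors, epsilon):
    """Smallest reserve r with empirical P(shortfall <= r) >= 1 - epsilon.

    forecast_errors holds historical (forecast - actual) renewable output.
    A positive value is a shortfall that the reserve must cover; a negative
    value is a surplus and is treated as a zero shortfall.
    """
    shortfalls = sorted(max(e, 0.0) for e in forecast_errors)
    n = len(shortfalls)
    # Position of the empirical (1 - epsilon) quantile among the order statistics.
    k = math.ceil((1.0 - epsilon) * n) - 1
    return shortfalls[max(k, 0)]


# Hypothetical forecast errors in MW from ten past intervals.
errors = [-2, -1, 0, 1, 2, 3, 4, 5, 6, 10]
reserve = reserve_for_chance_constraint(errors, epsilon=0.15)
covered = sum(1 for e in errors if max(e, 0.0) <= reserve) / len(errors)
print(f"reserve = {reserve} MW, empirical coverage = {covered:.0%}")
```

Tightening `epsilon` raises the reserve toward the worst observed shortfall, which makes the trade-off between reliability and cost explicit. A probabilistic forecast could supply the error samples in place of historical data.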

Future research should also explore interdisciplinary approaches to expanding the applicability of EGT-DRL models. For example, combining insights from behavioral economics, psychology, and social sciences could provide a more comprehensive understanding of how human behavior and decision-making processes affect market outcomes. Concepts such as prospect theory from behavioral economics could help explain why participants may overvalue potential losses compared to gains, influencing their strategic behavior. Behavioral game theory, in particular, could offer new perspectives on how bounded rationality, risk aversion, and social preferences influence strategic behavior in electricity markets. Furthermore, integrating data from real-world market operations into the development and testing of these models will be essential for validating their predictive accuracy and ensuring their practical relevance. Data-driven approaches, such as reinforcement learning combined with real-time market data, could provide valuable insights into the effectiveness of different strategies in practice. Moreover, collaborations with experts from diverse fields, including energy policy, environmental science, and computer science, could lead to more holistic models that capture the multifaceted nature of electricity markets. Such interdisciplinary collaborations would not only improve model accuracy but also enhance their applicability across different regulatory and cultural contexts. Future research should also consider using hybrid modeling approaches that integrate agent-based modeling (ABM) with EGT-DRL to better capture the heterogeneous behaviors of market participants. Additionally, the use of socio-technical frameworks could provide a deeper understanding of how technological developments, user behavior, and regulatory environments interact and co-evolve in electricity markets.
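The asymmetry between losses and gains described by prospect theory can be written down directly. The sketch below implements the standard power-form value function. The default parameters are the commonly cited estimates from Tversky and Kahneman's 1992 study and serve here only as illustrative assumptions; they are not values calibrated to electricity market participants.

```python
def prospect_value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Subjective value of a gain or loss x measured from a reference point.

    Gains are valued as x**alpha; losses are valued as
    -loss_aversion * (-x)**beta, so losses loom larger than equal gains.
    """
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** beta)


# Hypothetical example: a 100-unit saving versus a 100-unit extra charge on a bill.
gain = prospect_value(100.0)
loss = prospect_value(-100.0)
print(f"value of saving: {gain:.2f}, value of extra charge: {loss:.2f}")
```

Replacing raw monetary payoffs with such a value function is one way a behavioral game model could represent participants who respond more strongly to price penalties than to equivalent rebates.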

In summary, the future research agenda in this field should prioritize the following: (1) the development of advanced hybrid EGT-DRL models that address the complexity and dynamism of modern electricity markets; (2) the exploration of scalable and computationally efficient algorithms that can support large-scale market simulations; (3) the investigation of the impact of external regulatory and policy interventions on market participant strategies; (4) the creation of optimization techniques that enhance market resilience in the face of renewable energy variability; and (5) the incorporation of interdisciplinary approaches to improving the realism and applicability of these models in real-world market contexts. These future directions will not only advance theoretical understanding but also provide valuable insights for policymakers and market operators seeking to improve the efficiency, stability, and sustainability of modern electricity markets. By addressing these research areas, future studies will contribute to building smarter, more resilient, and sustainable energy systems capable of meeting the demands of a rapidly changing global energy landscape. Researchers must also consider the social and economic implications of their models, ensuring that the solutions proposed are not only technically feasible but also socially equitable and economically viable, thereby supporting a just transition to a sustainable energy future. The adoption of user-centered design approaches could further help align technological advancements with the needs and preferences of end users, ensuring that future energy systems are not only technologically advanced but also socially inclusive and beneficial for all stakeholders.

Next, we provide a more detailed analysis of the hybrid EGT-DRL models and identify multiple research opportunities.

Expansion and utilization of evolutionary game models
(i)
Complex evolutionary models in multi-agent systems: With the rise of smart grids and distributed energy resources (DERs), the user-side electricity market is becoming more complex, with more participants and evolving behaviors. Future research should focus on developing sophisticated evolutionary game models within a multi-agent system (MAS) framework to investigate strategic interactions and evolutionary processes among these agents [,,,,]. Such models will better capture collaboration, competition, and adaptive behaviors among users, thereby providing a more accurate representation of dynamic market processes. Moreover, integrating machine learning and reinforcement learning (RL) techniques could optimize these models by improving decision making and adaptability, allowing agents to learn and adjust strategies effectively in uncertain environments. For example, Shi et al. (2024) [] demonstrated how evolutionary reinforcement learning can enhance multi-agent pathfinding efficiency, which is relevant for optimizing strategy adaptation in dynamic market settings.
(ii)
Introduction of dynamic games and time-variable strategies: Traditional evolutionary game models often assume a fixed strategy set and time-invariant payoffs, whereas real-world market strategies and the conditions that shape them can change dynamically. Future research could explore dynamic game models that account for strategy evolution over time, prioritizing aspects such as adaptability to market changes, response to competitor strategies, and the impact of external factors on strategic adjustments. Liu et al. (2024) [] highlighted the use of multi-agent deep reinforcement learning to handle dynamic and multi-modal challenges, providing insights into how time-variable strategies can be optimized in changing environments. This approach would not only simulate the market’s dynamic nature more accurately but also offer theoretical support for developing long-term strategies to manage market fluctuations.
(iii)
Hybridization and integration of multiple models: Current research tends to focus on a single type of evolutionary game model. Future investigations could examine the hybridization and integration of different models, such as cooperative games, competitive games, and evolutionarily stable strategies (ESSs). For example, combining cooperative and competitive models could help market participants identify when to collaborate for mutual benefits and when to compete, leading to more effective strategy selection in fluctuating market conditions. Karaki and Al-Fagih (2024) [] discussed how integrating different game-theoretic approaches can facilitate more robust and adaptable strategies within smart grids, highlighting the practical benefits of hybrid models in complex environments. This approach could expand the applicability of these models across a broader range of market conditions, providing more robust solutions for strategy selection in diverse scenarios.
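The notion of an evolutionarily stable strategy referred to in the list above can be checked mechanically for small games. The following sketch tests which pure strategies of a symmetric two-strategy game are evolutionarily stable, using the standard two-part condition. The function name and the example payoff matrices are hypothetical, and the check is restricted to 2 x 2 games, where comparing against the single alternative strategy is sufficient.

```python
def pure_ess_2x2(payoff):
    """Return the pure strategies (0 and/or 1) that are evolutionarily stable.

    payoff[i][j] is the payoff to strategy i against strategy j in a
    symmetric two-strategy game. Strategy i is an ESS against the other
    strategy j if either
      (1) payoff[i][i] > payoff[j][i], or
      (2) payoff[i][i] == payoff[j][i] and payoff[i][j] > payoff[j][j].
    """
    if len(payoff) != 2 or any(len(row) != 2 for row in payoff):
        raise ValueError("this sketch only handles 2 x 2 games")
    stable = []
    for i in (0, 1):
        j = 1 - i
        strict = payoff[i][i] > payoff[j][i]
        tie_break = payoff[i][i] == payoff[j][i] and payoff[i][j] > payoff[j][j]
        if strict or tie_break:
            stable.append(i)
    return stable


# Hypothetical games: strategy 0 = cooperate or share, strategy 1 = compete.
coordination = [[3, 0], [0, 2]]   # both conventions are stable
dilemma = [[3, 0], [5, 1]]        # only competing is stable
anti_coordination = [[-1, 4], [0, 2]]  # no pure ESS; the stable state is mixed
for name, game in [("coordination", coordination), ("dilemma", dilemma),
                   ("anti-coordination", anti_coordination)]:
    print(name, pure_ess_2x2(game))
```

A check of this kind indicates, for a given payoff structure, whether cooperation, competition, or neither can persist as a pure convention, which is the question a hybrid cooperative and competitive model would need to answer as market conditions shift the payoffs.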
Algorithmic enhancement and innovation
(i)
Optimization of evolutionary algorithms [,,,,]: Evolutionary algorithms are crucial for studying the evolution of strategic behavior, but existing algorithms can be inefficient when dealing with large-scale, multi-dimensional problems. Future research should aim to enhance efficiency through various approaches, such as the development of parallel evolutionary algorithms based on distributed computing, hybrid evolutionary algorithms, and multi-fidelity evolutionary algorithms. Zhao et al. (2024) introduced a morphological transfer-based multi-fidelity evolutionary algorithm, which significantly improves efficiency in complex design problems by leveraging different fidelity levels []. Additionally, Stranieri et al. (2024) proposed a forest-based evolutionary algorithm for reconstructing gene regulatory networks, which demonstrates how specialized evolutionary algorithms can handle complex, high-dimensional data effectively []. Moreover, Cubillos-Chaparro et al. (2024) utilized a multi-objective evolutionary algorithm for biomarker identification, illustrating the potential for handling diverse objectives in complex environments []. Such improvements could significantly boost computational efficiency while maintaining accuracy, particularly in complex market environments.
(ii)
Fusion of artificial intelligence with evolutionary algorithms [,,,,]: As AI technology advances, integrating AI techniques like deep learning and reinforcement learning (RL) with traditional evolutionary algorithms could enhance the adaptability and predictive capabilities of these models in uncertain environments. For instance, deep learning could be used to optimize strategy selection, while RL could enable models to self-improve and evolve. Imam et al. (2023) highlighted that hybridizing artificial immune system algorithms with other AI techniques can lead to enhanced performance in solving complex problems, which suggests that similar hybridization approaches could be beneficial in evolutionary algorithms []. Liang et al. (2021) also introduced an evolutionary deep fusion method for chemical structure recognition, demonstrating how deep learning can be effectively combined with evolutionary strategies for improved recognition accuracy []. Furthermore, Lima and Ludermir (2013) optimized dynamic ensemble selection procedures using evolutionary learning machines, indicating that ensemble methods could further enhance the robustness of evolutionary algorithms in dynamic settings []. Such integrations can provide a more comprehensive framework for addressing complex, uncertain market conditions.
(iii)
Multi-objective optimization based on meta-heuristic methods [,,]: Future electricity markets are expected to become increasingly complex, with participants having diverse objectives, such as minimizing costs, maximizing benefits, and reducing carbon emissions. Research on multi-objective optimization algorithms based on meta-heuristic methods (e.g., genetic algorithms, ant colony optimization, and particle swarm optimization) will be valuable for identifying optimal strategy combinations in these complex, dynamic environments, as these methods are particularly suited for efficiently exploring large solution spaces and finding near-optimal solutions when multiple conflicting objectives are involved. Wu et al. (2024) presented a hybrid meta-heuristic approach for emergency logistics distribution under uncertain demand, which demonstrated how combining multiple heuristic techniques can improve performance in highly uncertain and dynamic settings []. Similarly, Srivatsan and Venkatesan (2024) proposed an improved meta-heuristic technique for FIR filter design, showcasing the applicability of meta-heuristics in optimizing complex engineering problems []. Additionally, Ibnoulouafi et al. (2024) introduced an efficient meta-heuristic approach to solving the multi-objective green p-hub center routing problem, highlighting how such techniques can effectively handle multi-objective optimization in scenarios involving environmental sustainability []. These studies illustrate the potential of advanced meta-heuristic methods to address diverse and conflicting objectives in future electricity markets.
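Multi-objective methods of the kind discussed in the list above share a common building block: identifying the non-dominated (Pareto-optimal) candidates. The sketch below shows only that dominance filter, not a complete meta-heuristic such as a genetic algorithm or particle swarm optimizer. The candidate strategies and their (cost, emissions) scores are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated points, preserving their input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical strategies scored as (cost, emissions); lower is better for both.
candidates = [(10, 5), (8, 7), (12, 4), (9, 9), (8, 6)]
print(pareto_front(candidates))
```

Here (8, 7) and (9, 9) are dropped because another candidate is at least as good on both objectives. The remaining points represent genuine trade-offs between cost and emissions, among which a participant or regulator would still have to choose.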
Detailed study of application scenarios
(i)
Applications in the context of smart grids: With the development of smart grid technology, user behavior in electricity markets will become more intelligent and varied. For example, future home energy management systems (HEMSs) may not only rely on static electricity usage strategies but also dynamically adjust based on real-time electricity prices and weather forecasts. These real-time adjustments could help reduce overall energy consumption during peak times, lower costs for users, and maintain user comfort by optimizing energy usage in response to external factors. Future research should focus on refining the models used for these real-time adjustments, such as incorporating more sophisticated algorithms for predicting electricity demand based on machine learning, and exploring how these dynamic systems can adapt to extreme weather events or sudden market price fluctuations. Additionally, research on applying evolutionary game models to these intelligent systems, helping users optimize electricity costs while maintaining comfort, will be an important direction.
(ii)
Management of DERs and microgrids: With the widespread adoption of distributed energy resources (DERs), such as solar and wind energy, and microgrids, the structure and operation modes of user-side electricity markets will undergo significant changes. Future research could explore how to use evolutionary game models to optimize the integration, scheduling, and management of DERs, considering key factors such as energy availability, weather conditions, and market prices, to achieve optimal overall system operation. Additionally, studies could examine how game models can facilitate cooperation and resource sharing among multiple users in a microgrid environment, thereby improving overall system efficiency and stability. Future research should also investigate the potential for integrating advanced energy storage solutions with DERs to enhance reliability and efficiency, as well as evaluate the economic incentives needed to encourage active participation from all stakeholders.
(iii)
Integration of electric vehicles and charging infrastructure: The large-scale adoption of electric vehicles (EVs) introduces new challenges and opportunities for the user-side electricity market. Future research could explore how evolutionary game models can optimize EV charging strategies, charging station layouts, and vehicle-to-grid (V2G) interactions. This includes addressing potential challenges such as balancing charging speed with grid stability, ensuring user convenience, and managing peak load demands effectively. In this process, emphasis should be placed on balancing the charging needs of electric vehicles with the load pressures on the electricity system and coordinating the interests of different stakeholders through game models. Additionally, future research could focus on developing predictive models to anticipate charging demand patterns, optimizing the placement of charging stations to minimize congestion, and investigating the use of renewable energy sources for EV charging to reduce the overall carbon footprint.
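For the charging-strategy question raised in the list above, a useful starting point is the best response of a single price-taking EV: given hourly prices and a charging-rate limit, it fills the cheapest hours first. The sketch below is a minimal illustration with hypothetical prices and parameters. In a game model, many such best responses would feed back into the prices, which this single-agent sketch does not capture.

```python
def schedule_charging(prices, energy_needed, max_rate):
    """Charge in the cheapest hours first, up to max_rate kWh in any hour.

    prices: price per kWh for each hour of the charging window.
    Returns the energy (kWh) charged in each hour.
    Raises ValueError if the window cannot deliver energy_needed.
    """
    if energy_needed > max_rate * len(prices):
        raise ValueError("charging window too short for the requested energy")
    plan = [0.0] * len(prices)
    remaining = energy_needed
    # Visit hours from cheapest to most expensive.
    for hour in sorted(range(len(prices)), key=lambda h: prices[h]):
        if remaining <= 0:
            break
        plan[hour] = min(max_rate, remaining)
        remaining -= plan[hour]
    return plan


# Hypothetical four-hour window: 10 kWh needed, at most 7 kWh per hour.
prices = [0.30, 0.10, 0.12, 0.25]
plan = schedule_charging(prices, energy_needed=10.0, max_rate=7.0)
cost = sum(p * e for p, e in zip(prices, plan))
print(plan, f"total cost = {cost:.2f}")
```

Because the cost is linear in energy and each hour has the same rate limit, this greedy rule minimizes the bill for a single vehicle. If many vehicles follow it simultaneously, they all pile into the same cheap hours, which is precisely the coordination problem that evolutionary game models of charging aim to address.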
Interdisciplinary research and multi-field synthesis
(i)
Combining economics with behavioral sciences: Electricity market trading is not merely a technical problem but also a system involving complex economic behavior and decision making. Thus, future research should emphasize interdisciplinary synthesis, particularly by integrating theories from economics and behavioral sciences into the study of strategic behaviors in electricity markets. For example, prospect theory from behavioral economics or game theory from economics could be used to better understand and predict user decision-making processes; studying user preferences under uncertainty and risk with such behavioral models could yield more accurate predictions of how users' strategies evolve. Future studies could also incorporate social learning theory to examine how users adapt their strategies based on the observed behavior of others, which would help in modeling collective behavior in electricity markets. Additionally, integrating the concept of bounded rationality could further improve the understanding of decision making under constraints, as it reflects the limitations of real-world agents who may lack perfect information or unlimited cognitive resources. Future research should also explore how nudging techniques, derived from behavioral economics, can influence consumer behavior in electricity markets to promote energy conservation and sustainable practices.
(ii)
Comprehensive research from an ecosystem perspective: As a complex ecosystem, the operation of the electricity market is influenced by multiple fields, such as energy economics, environmental science, and social policy. Future research could adopt an ecosystem perspective, comprehensively considering the interactions among various subsystems, and explore how evolutionary game models can achieve sustainable development of the entire electricity ecosystem. This research not only helps resolve conflicts between energy and the environment but also provides new perspectives for global energy transitions. Future studies should explore the interdependencies between different energy systems, such as the relationship between renewable energy production and water usage, and how these interdependencies can be managed to promote sustainability.
(iii)
Coordinated research on policy and legal frameworks: The operation of electricity markets relies on the support of policy and legal frameworks. Future research should examine in depth how policy formulation and adjustment can guide the evolution of user-side strategic behavior. For example, policies such as feed-in tariffs have successfully incentivized renewable energy adoption by guiding user investments and strategic behavior towards cleaner energy options. Future studies could also investigate how regulatory sandboxes could be used to test new market mechanisms and policies in a controlled environment before broader implementation. Additionally, studies could analyze the effectiveness of policy tools, such as carbon emissions trading mechanisms and electricity market access standards, and how legal measures can regulate unfair competition and monopoly behavior in the market. These studies will provide scientific evidence for policymakers, promoting fairness and efficiency in the electricity market.
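To make the prospect-theory suggestion in item (i) above more concrete, the following sketch evaluates a risky electricity tariff with a prospect-theory value function and an inverse-S probability weighting function. The parameter values (0.88, 2.25, 0.61, 0.69) are commonly quoted estimates from the cumulative prospect theory literature and are used here purely as assumptions; the tariff example and all function names are hypothetical.

```python
# Prospect-theory evaluation of a hypothetical tariff choice.
# Parameter defaults are assumed values, not results from this review.

def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Subjective value of a gain or loss x relative to the reference point."""
    if x >= 0:
        return x ** alpha
    # Losses are scaled by the loss-aversion coefficient lam.
    return -lam * ((-x) ** beta)


def weight(p, gamma):
    """Inverse-S weighting: overweights small and underweights large p."""
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)


def prospect_value(outcomes, gamma_gain=0.61, gamma_loss=0.69):
    """Value of a prospect given as (probability, outcome) pairs.

    Simplification: each outcome is weighted separately, which matches
    cumulative prospect theory only for prospects with at most one gain
    and at most one loss.
    """
    total = 0.0
    for p, x in outcomes:
        gamma = gamma_gain if x >= 0 else gamma_loss
        total += weight(p, gamma) * value(x)
    return total


if __name__ == "__main__":
    # Hypothetical real-time tariff: equally likely to save or cost
    # 10 currency units per month relative to a fixed tariff (outcome 0).
    real_time_tariff = [(0.5, 10.0), (0.5, -10.0)]
    expected = sum(p * x for p, x in real_time_tariff)
    print("expected monetary value:", expected)
    print("prospect value:", prospect_value(real_time_tariff))
```

Under these assumptions the risky tariff has zero expected monetary value but a negative prospect value, so a loss-averse user would prefer the fixed tariff. This is the kind of deviation from expected-payoff maximization that a behaviorally informed evolutionary game model would need to capture.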
Practical applications and empirical studies
(i)
Deepening empirical research: Although evolutionary game models have achieved significant theoretical progress, their application in real markets still requires more empirical research. Future research should analyze real market data, such as electricity pricing data, consumption patterns, and grid stability metrics, to validate the practical effectiveness of the models and to adjust and optimize them based on empirical findings. In particular, comparative analyses of model performance across different industries and market environments should be conducted to assess the models' generality and robustness. Additionally, researchers should focus on longitudinal studies that track the performance of these models over extended periods to understand their effectiveness under varying market conditions and evolving technologies. Conducting case studies in different geographic regions can also help highlight the contextual challenges and opportunities specific to local market dynamics.
(ii)
Development of application tools and platforms: To promote the application of evolutionary game models in real electricity markets, future efforts could focus on developing more practical application tools and platforms. These tools could provide strategy optimization suggestions for market participants, such as load balancing strategies or price adjustment recommendations, and offer market regulation references for policymakers. Additionally, by developing open research platforms, collaboration between academia and industry can be enhanced, jointly advancing research and application progress in this field. Future research should also prioritize user-friendly interfaces for these tools to ensure accessibility for a broader range of stakeholders, including smaller energy providers and consumers. Furthermore, incorporating real-time data feeds and adaptive learning capabilities will allow these tools to evolve alongside changing market dynamics.
(iii)
Building intelligent decision support systems: With the increasing complexity and uncertainty of electricity markets, constructing intelligent decision support systems (DSSs) based on evolutionary game models will become a crucial direction for future research. These systems could integrate functions such as data analysis, strategy optimization, and real-time monitoring, assisting market participants in making more rational decisions in complex and changing environments. The development of intelligent DSSs will significantly enhance the operational efficiency and stability of electricity markets, providing robust support for realizing the vision of smart grids by enabling improved real-time demand response, optimizing load management, and enhancing the integration of renewable energy sources. Future research should also explore incorporating predictive analytics and scenario-based simulations into DSSs, enabling market participants to anticipate potential market shifts and make proactive decisions. Additionally, integrating machine learning techniques with evolutionary game models could further improve the adaptability and decision-making capabilities of DSSs in dynamic market environments.

In summary, future research on user-side electricity market trading strategies will unfold across multiple dimensions, including theoretical deepening, algorithmic innovation, interdisciplinary synthesis, and practical application. These studies will not only advance the application of evolutionary game theory (EGT) in electricity markets but also provide a solid theoretical foundation for market optimization and policy formulation, such as designing incentives for renewable energy adoption, regulating market competition, and implementing dynamic pricing mechanisms to balance supply and demand. Integrating evolutionary game-theoretical methods with deep reinforcement learning (DRL) offers a significant opportunity for adaptive strategy optimization. By leveraging the strengths of both EGT and DRL, researchers can develop adaptive models capable of real-time learning, decision making, and adjustment to shifting market dynamics, which is essential for addressing the increasing complexity of user-side electricity markets. As technology progresses and market environments change, research in this field will face new challenges, such as the scalability of evolutionary models in large-scale electricity systems, but also substantial opportunities. Through continuous exploration and innovation, researchers can identify effective strategies for navigating complex market environments in the new era of smart grids and energy transition. Examples include strategies for managing demand-side flexibility, enhancing resilience against market disruptions, improving the integration of renewable energy sources, and advancing peer-to-peer energy trading models, all of which contribute to the sustainable development of global electricity markets.
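The combination of EGT and reinforcement learning summarized above can be sketched in a few lines. In the toy example below, a population of users repeatedly plays a two-strategy load-shifting game with randomly matched partners. Each user updates action values with tabular, stateless Q-learning and epsilon-greedy exploration, which stands in for the DQN discussed in this review (a neural network is unnecessary for a single-state game). The payoff table, population size, and learning parameters are all hypothetical assumptions, and no empirical result is claimed.

```python
import random

# Hypothetical payoff table: PAYOFF[a][b] is the reward to a user playing
# action a against a randomly matched user playing action b.
# Action 0 = keep load at peak hours, action 1 = shift load off-peak.
PAYOFF = [[0.0, 2.0],
          [1.0, 0.5]]


def replicator_rest_point():
    """Share of action 1 at which both actions earn equal expected payoff."""
    (a, b), (c, d) = PAYOFF
    return (c - a) / ((c - a) + (b - d))


def simulate(n_agents=200, rounds=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Return the share of agents playing action 1 in every round."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_agents)]  # one Q-row per agent
    shares = []
    for _ in range(rounds):
        # Epsilon-greedy action selection for every agent.
        actions = []
        for q_row in q:
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))
            else:
                actions.append(0 if q_row[0] > q_row[1] else 1)
        # Random pairwise matching (n_agents is assumed to be even).
        order = list(range(n_agents))
        rng.shuffle(order)
        for i, j in zip(order[0::2], order[1::2]):
            for me, other in ((i, j), (j, i)):
                a = actions[me]
                reward = PAYOFF[a][actions[other]]
                # Stateless Q-learning update (no next state, discount 0).
                q[me][a] += alpha * (reward - q[me][a])
        shares.append(sum(actions) / n_agents)
    return shares


if __name__ == "__main__":
    shares = simulate()
    tail = shares[-200:]
    print("mean share of load shifters, last 200 rounds:",
          sum(tail) / len(tail))
    print("replicator rest point:", replicator_rest_point())
```

Comparing the learned share with the replicator rest point is one simple way to examine how far individual learning and population-level evolutionary predictions agree; how close the two are depends on the exploration rate, the learning rate, and the payoff structure.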

Author Contributions

Conceptualization, L.C., T.S. and T.Z.; methodology, L.C., T.S. and T.Z.; formal analysis, L.C., X.W., M.L., C.T. and M.Y.; investigation, L.C., X.W., M.L., C.T., M.Y. and T.S.; writing—original draft preparation, L.C., X.W., M.L., C.T., M.Y., T.S. and T.Z.; writing—review and editing, L.C., X.W., M.L., C.T., M.Y., T.S. and T.Z.; funding acquisition, L.C., T.S. and T.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515010699 and No. 2023A151501131), in part by the Guangzhou Education Bureau University Research Project—Graduate Research Project (No. 2024312278), in part by the Natural Science Foundation of Guangdong Province (No. 2023A1515011791), and in part by the National Natural Science Foundation of China (No. 52171331).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We sincerely thank the associate editor and the invited anonymous reviewers for their kind and helpful comments on our paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature

Symbol/Term	Description
Carbon emissions trading	A market-based approach to reducing greenhouse gas emissions by trading carbon emission allowances.
Classical game theory	Classical game theory is a mathematical framework used to analyze strategic interactions between rational decision-makers; each participant aims to maximize their own payoff given the actions of others.
DERs	Distributed energy resources are small-scale power generation or storage units, such as solar panels, wind turbines, or battery systems, that are located close to where electricity is used, allowing energy to be generated, stored, or managed locally within a distribution network.
Demand response	A mechanism that incentivizes consumers to adjust their energy usage in response to price signals or other incentives.
DQN	Deep Q-learning network, a reinforcement learning algorithm that enables agents to optimize strategies based on real-time feedback.
DRL	Deep reinforcement learning, a subset of machine learning that combines reinforcement learning with deep learning techniques.
EGT	Evolutionary game theory, a mathematical framework used to model strategic interactions in dynamic systems.
ESS	Evolutionarily stable strategy, a strategy that, once adopted by most of a population, cannot be invaded by any initially rare alternative strategy.
Multi-agent system	A system involving multiple independent agents interacting and making decisions autonomously.
Market volatility	Fluctuations in market conditions, such as supply and demand, that affect pricing and strategic behavior.
P2P	Peer-to-peer refers to a decentralized network model in which participants, called peers, directly exchange resources or data without relying on a centralized server or authority.
RL	Reinforcement learning, a type of machine learning in which an agent learns to make decisions by interacting with an environment in order to maximize cumulative rewards based on feedback on its actions.
Real-time pricing	A pricing strategy in electricity markets where prices fluctuate based on supply and demand conditions in real time.
Renewable energy	Energy generated from renewable sources such as solar and wind power, which are naturally replenished.
SMEs	Small and medium-sized enterprises (SMEs) are businesses that maintain revenues, assets, or a number of employees below a certain threshold, typically characterized by their smaller scale of operations compared to larger corporations.
Smart grid	An advanced power grid system that integrates information and communication technologies for efficient energy management.
Strategy evolution	The process through which strategies change and adapt over time in response to interactions and environmental changes.

References

Li, Y.; Zhang, P.; Huang, R. Lightweight quantum encryption for secure transmission of power data in smart grid. IEEE Access 2019, 7, 36285–36293. [Google Scholar] [CrossRef]
Mollah, M.B.; Zhao, J.; Niyato, D.; Lam, K.; Zhang, X.; Ghias, A.M.; Koh, L.H.; Yang, L. Blockchain for Future Smart Grid: A Comprehensive Survey. IEEE Internet Things J. 2021, 8, 18–43. [Google Scholar] [CrossRef]
Bi, W.; Lin, H.; Zhang, L.; Huan, W.; Liu, K. Selection of Optimal Defense Strategy Based on Dynamic Evolutionary Game of Incomplete Information. In Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 12–14 March 2021; pp. 2762–2767. [Google Scholar] [CrossRef]
Liu, C. What’s game got to do with it? Rethinking evolutionary game theory. Nat. Dialectics Commun. 2021, 43, 22–32. [Google Scholar] [CrossRef]
Smith, J.M. Game Theory and the Evolution of Behaviour. Proc. R. Soc. Lond. B Biol. Sci. 1979, 205, 475–488. [Google Scholar] [CrossRef]
Bastos, J.; Buzzi, C.; Santana, P. On Structural Stability of Evolutionary Stable Strategies. J. Differ. Equ. 2024, 389, 190–227. [Google Scholar] [CrossRef]
Guo, D.; Yu, M.; Jia, C. Smart-Strategy’s Invasion of Traditional Evolutionarily Stable Strategy Based on Hawk and Dove Game. In Proceedings of the 2011 Third International Conference on Intelligent Human-Machine Systems and Cybernetics, Hangzhou, China, 26–27 August 2011. [Google Scholar] [CrossRef]
Eldakar, O.T. The Hawk-Dove Model. In Encyclopedia of Personality and Individual Differences; Springer: Berlin/Heidelberg, Germany, 2018; pp. 1–6. [Google Scholar] [CrossRef]
Phelps, S.; Wooldridge, M. Game Theory and Evolution. IEEE Intell. Syst. 2013, 28, 76–81. [Google Scholar] [CrossRef]
Touri, B.; Jaleel, H.; Shamma, J.S. Stochastic Evolutionary Dynamics: A Graphical Reformulation of Evolutionarily Stable Strategy (ESS) Analysis. IEEE Control Syst. Lett. 2019, 3, 55–60. [Google Scholar] [CrossRef]
Liu, X.; Gao, B.; Li, Y. Review on application of game theory in power demand side. Power Syst Technol. 2018, 42, 2704–2711. [Google Scholar]
Cheng, L.; Yu, T. Game-theoretic approaches applied to transactions in the open and ever-growing electricity markets from the perspective of power demand response: An overview. IEEE Access 2019, 7, 25727–25762. [Google Scholar] [CrossRef]
Cheng, L.; Yang, R.; Liu, G.; Wang, J.; Chen, Y.; Wang, X.; Zhang, J.; Yu, T. Multi-population asymmetric evolutionary game dynamics and its applications in power demand-side response in smart grid. Proc. Chin. Soc. Elect. Eng. 2020, 40, 20–36. [Google Scholar]
Jia, Y.; Feng, J.-E.; Zhang, Y. Game-Based Applications and Control towards Future Smart Grids. In Proceedings of the 2022 4th International Conference on Smart Power & Internet Energy Systems (SPIES), Beijing, China, 27–30 October 2022; pp. 2307–2312. [Google Scholar] [CrossRef]
Grodwohl, J.B.; Parker, G.A. The Early Rise and Spread of Evolutionary Game Theory: Perspectives Based on Recollections of Early Workers. Philos. Trans. R. Soc. B Biol. Sci. 2023, 378, 20210493. [Google Scholar] [CrossRef] [PubMed]
Su, C.; Deng, J.; Li, X.; Cheng, F.; Huang, W.; Wang, C.; He, W.; Wang, X. Research on the Game Strategy of Mutual Safety Risk Prevention and Control of Industrial Park Enterprises under Blockchain Technology. Systems 2024, 12, 351. [Google Scholar] [CrossRef]
Wu, Y.; Pan, L. LSTEG: An Evolutionary Game Model Leveraging Deep Reinforcement Learning for Privacy Behavior Analysis on Social Networks. Inf. Sci. 2024, 676, 120842. [Google Scholar] [CrossRef]
Han, Z.; Wu, W.; Song, Q.; Zhu, P. Analysis of Payoff Expectation in Evolutionary Game Based on Bush–Mosteller Model. Chaos Solitons Fractals 2024, 185, 115161. [Google Scholar] [CrossRef]
Liu, L.; Tang, C.; Zhang, L.; Liao, S. A Generic Approach for Network Defense Strategies Generation Based on Evolutionary Game Theory. Inf. Sci. 2024, 677, 120875. [Google Scholar] [CrossRef]
Wang, S.; Pu, Y.; Shi, H.; Huang, J.; Xiao, Y. A Differential Game View of Antagonistic Dynamics for Cybersecurity. Comput. Netw. 2021, 200, 108494. [Google Scholar] [CrossRef]
Deng, S.; Yuan, Y. A Data Intrusion Tolerance Model Based on an Improved Evolutionary Game Theory for the Energy Internet. Comput. Mater. Contin. 2024, 79, 3679–3697. [Google Scholar] [CrossRef]
Zou, B.; Wang, Y.; Liu, C.; Dai, M.; Du, Q.; Zhu, X. Generation of Security System Defense Strategies Based on Evolutionary Game Theory. Nucl. Eng. Technol. 2024, 56, 3463–3471. [Google Scholar] [CrossRef]
Zou, X.; Gu, J.; Zheng, Z.; Zhang, Y. Evolutionary Game and Risk Decision-Making of Four Core Participants of Land Finance in China. Cities 2024, 154, 105359. [Google Scholar] [CrossRef]
Xiao, L.; Li, H.; Yu, F.; Wang, Y. A Tripartite Evolutionary Game Study on the Governance of Online Catering Riders’ Traffic Violations from the Perspective of Collaborative Regulation. Transp. Lett. 2024, 1–15. [Google Scholar] [CrossRef]
Chen, G.; Lim, M.K.; Tseng, M.-L. Green Credit and Transformation to Enhance the Plastic Supply Chain in China: A Three-Player Evolutionary Game Perspective Approach under Dynamic Green Awareness. J. Clean. Prod. 2024, 448, 141416. [Google Scholar] [CrossRef]
Xie, M.; Zeng, Z.; Li, Y.; Feng, M. Adherence Strategy Based on Evolutionary Games in Epidemic Spreading. Chaos Solitons Fractals 2024, 186, 115289. [Google Scholar] [CrossRef]
Nan, R.; Chen, J.; Zhu, W. Evolutionary Game Analysis of Multiple Subjects in the Management of Major Public Health Emergencies. Heliyon 2024, 10, e29823. [Google Scholar] [CrossRef] [PubMed]
Wu, H.; Huang, Y.; Liu, G.; Luo, L.; Luo, Y. Synergizing Stakeholder Collaboration for Value Co-Creation in China’s Prefabricated Decoration Diffusion: A Tripartite Evolutionary Game Perspective. Heliyon 2024, 10, e34775. [Google Scholar] [CrossRef] [PubMed]
Xu, B.; Fan, H.; Li, J. Energy-Efficiency and Safety-Driven Multidimensional Evolutionary Game for AGVs Transportation at Automated Container Terminals. Comput. Ind. Eng. 2024, 192, 110192. [Google Scholar] [CrossRef]
Song, M.; Gong, X.; Du, J.; Lu, T.; Jiao, R.J. Population Dynamics Modeling of Crowdsourcing as an Evolutionary Cooperation-Competition Game for Fulfillment Capacity Balancing and Optimization of Smart Manufacturing Services. Comput. Ind. Eng. 2024, 197, 110572. [Google Scholar] [CrossRef]
Mohamed, A.F.; Saba, A.; Hassan, M.K.; Youssef, H.M.; Dahou, A.; Elsheikh, A.H.; El-Bary, A.A.; Abd Elaziz, M.; Ibrahim, R.A. Boosted Nutcracker Optimizer and Chaos Game Optimization with Cross Vision Transformer for Medical Image Classification. Egypt. Inform. J. 2024, 26, 100457. [Google Scholar] [CrossRef]
Wang, X.; Shi, L.; Cao, C.; Wu, W.; Zhao, Z.; Wang, Y.; Wang, K. Game Analysis and Decision Making Optimization of Evolutionary Dynamic Honeypot. Comput. Electr. Eng. 2024, 119 Part B, 109534. [Google Scholar] [CrossRef]
Yuan, B.; Zhu, J.; Chen, Z.; Xu, C. How Can Stakeholders Collaborate to Promote the Interconnection of Charging Infrastructure? A Tripartite Evolutionary Game Analysis. Expert Syst. Appl. 2024, 255 Part D, 124798. [Google Scholar] [CrossRef]
Oueslati, R.; Manita, G.; Chhabra, A.; Korbaa, O. Chaos Game Optimization: A Comprehensive Study of Its Variants, Applications, and Future Directions. Comput. Sci. Rev. 2024, 53, 100647. [Google Scholar] [CrossRef]
Zhen, J.; Ouyang, J.; Wang, L. An Evolutionary Game Analysis of Incentive of Industrial Parks, Government Support and Enterprise Innovation Willingness in China. Heliyon 2024, 10, e36618. [Google Scholar] [CrossRef] [PubMed]
Hu, F.; Zhou, D.; Zhu, Q.; Wang, Q. How Dynamic Renewable Portfolio Standards Affect Trading Behavior of Power Generators? Considering Green Certificate and Reward/Penalty Mechanism. Appl. Energy 2024, 375, 124114. [Google Scholar] [CrossRef]
Zhou, Y.; Jia, X.; Zhao, X.; Wang, H.; Huang, J. The Impact of the Heterogeneity of Market Participants on China’s Green Certificate Trading: A Collective Action Perspective. J. Environ. Manag. 2024, 370, 122878. [Google Scholar] [CrossRef] [PubMed]
Cheng, C.; An, R.; Dong, K.; Wang, K. Can the Future Be Bright? Evolutionary Game Analysis of Multi-Stakeholders in the Blue and Green Hydrogen Development. Int. J. Hydrogen Energy 2024, 67, 294–311. [Google Scholar] [CrossRef]
Teng, M.; Lv, K.; Han, C.; Liu, P. Trading Behavior Strategy of Power Plants and the Grid under Renewable Portfolio Standards in China: A Tripartite Evolutionary Game Analysis. Energy 2023, 284, 128398. [Google Scholar] [CrossRef]
Liu, P.; Wu, J. Study on the Diffusion of CCUS Technology under Carbon Trading Mechanism: Based on the Perspective of Tripartite Evolutionary Game among Thermal Power Enterprises, Government and Public. J. Clean. Prod. 2024, 438, 140730. [Google Scholar] [CrossRef]
Yue, T.; Wang, H.; Li, C.; Hu, Y. Optimization Strategies for Green Power and Certificate Trading in China Considering Seasonality: An Evolutionary Game-Based System Dynamics. Energy 2024, 311, 133355. [Google Scholar] [CrossRef]
Fan, W.-J.; Fang, Y.; Jiang, R.-B. An Analysis of Optimal Equilibrium in the Carbon Trading Market—From a Tripartite Evolutionary Game Perspective. Int. Rev. Financ. Anal. 2024, 96, 103629. [Google Scholar] [CrossRef]
Wang, X.; Long, R.; Chen, H.; Wang, Y.; Shi, Y.; Yang, S.; Wu, M. How to Promote the Trading in China’s Green Electricity Market? Based on Environmental Perceptions, Renewable Portfolio Standard and Subsidies. Renew. Energy 2024, 222, 119784. [Google Scholar] [CrossRef]
Kumar, S.; Brown, J.; Lee, M. Modeling Strategic Behavior of Healthcare Providers with Evolutionary Game Theory. Health Policy J. 2023, 15, 201–220. [Google Scholar]
Zhang, X.; Yang, P.; Wu, S. Application of Game Theory in Industrial Process Optimization: Modeling Shared Resource Management. Ind. Eng. Manag. 2023, 29, 345–361. [Google Scholar]
Li, Y.; Zhang, M.; Wang, Q. Evolutionary Game Analysis on Strategic Cooperation between Banks and Technological Small and Medium Enterprises under Information Asymmetry. Econ. Financ. Rev. 2023, 11, 412–429. [Google Scholar]
Wang, W.; Zhang, H.; Zhang, M. Research on the Credit Risk of Technological SMEs Based on the Evolutionary Game Model. Procedia Comput. Sci. 2022, 214, 999–1006. [Google Scholar] [CrossRef]
Basu, K. Social norms and the law. In The New Palgrave Dictionary of Economics and the Law; Newman, P., Ed.; Macmillan: New York, NY, USA, 1998. [Google Scholar]
Wang, M. Research Progress and Review of Evolutionary Game Theory in Sociology. Mod. Bus. 2015, 3, 276–277. [Google Scholar] [CrossRef]
Zuo, Y.; Zhao, X.; Zhang, Y.; Zhou, Y. From feed-in tariff to renewable portfolio standards: An evolutionary game theory perspective. J. Clean. Prod. 2019, 213, 1274–1289. [Google Scholar] [CrossRef]
Sun, T.; Jin, M.; Liu, X. Study on Multi-Agent Behavior Strategy of Green Power Transactions in Electricity Market Based on Evolutionary Game Theory. In Proceedings of the 2023 International Conference on Power System Technology (PowerCon), Jinan, China, 21–22 September 2023. [Google Scholar]
Yan, Y.; Xie, S.; Tang, J. Transaction Strategy of Virtual Power Plants and Multi-Energy Systems with Multi-Agent Stackelberg Game Based on Integrated Energy-Carbon Pricing. Front. Energy Res. 2024, 121, 1459667. [Google Scholar] [CrossRef]
Zhang, M.; Nie, J.; Su, B. An Option Game Model Applicable to Multi-Agent Cooperation Investment in Energy Storage Projects. Energy Econ. 2024, 131, 107397. [Google Scholar] [CrossRef]
Li, X.; Luo, F.; Li, C. Multi-Agent Deep Reinforcement Learning-Based Autonomous Decision-Making Framework for Community Virtual Power Plants. Appl. Energy 2024, 360, 122813. [Google Scholar] [CrossRef]
Liu, W.; Zhou, B.; Ou, M. Electricity-Gas Multi-Agent Planning Method Considering Users’ Comprehensive Energy Consumption Behavior. Front. Energy Res. 2024, 11, 1341400. [Google Scholar] [CrossRef]
Peishuai, L.; Jiawei, S.; Zaijun, W. Optimal Real-Time Voltage/Var Control for Distribution Network: Droop-Control Based Multi-Agent Deep Reinforcement Learning. Int. J. Electr. Power 2023, 153, 109370. [Google Scholar] [CrossRef]
Yaru, G.; Xueliang, H. A Reactive Power Optimization Partially Observable Markov Decision Process with Data Uncertainty Using Multi-Agent Actor-Attention-Critic Algorithm. Int. J. Electr. Power 2023, 147, 108848. [Google Scholar] [CrossRef]
Chamba, A.; Singaña, B.C.; Arcos, H. Optimal Reactive Power Dispatch in Electric Transmission Systems Using the Multi-Agent Model with Volt-VAR Control. Energies 2023, 16, 13. [Google Scholar] [CrossRef]
Jochen, M.; Sebastian, H.; Martin, W. A Multi-Agent Model of Urban Microgrids: Assessing the Effects of Energy-Market Shocks Using Real-World Data. Appl. Energy 2023, 343, 121180. [Google Scholar] [CrossRef]
Wang, L.; An, X.; Xu, H. Multi-Agent-Based Collaborative Regulation Optimization for Microgrid Economic Dispatch under a Time-Based Price Mechanism. Electr. Power Syst. Res. 2022, 213, 108760. [Google Scholar] [CrossRef]
Li, J.; Li, T.; Dong, D. Demand Response Management of Smart Grid Based on Stackelberg-Evolutionary Joint Game. Sci. China Inf. Sci. 2023, 66, 8. [Google Scholar] [CrossRef]
Alishavandi, M.A.; Moghaddas-Tafreshi, M.S. Interactive Decentralized Operation with Effective Presence of Renewable Energies Using Multi-Agent Systems. Int. J. Electr. Power 2019, 112, 36–48. [Google Scholar] [CrossRef]
Liu, J.; Xu, F.; Lin, S.; Cai, H.; Yan, S. A Multi-Agent-Based Optimization Model for Microgrid Operation Using Dynamic Guiding Chaotic Search Particle Swarm Optimization. Energies 2018, 11, 3286. [Google Scholar] [CrossRef]
Deqiang, Q.; Junxiang, L.; Xiaojia, M. Distributed Real-Time Pricing of Smart Grid Considering Individual Differences. Omega 2024, 103, 109. [Google Scholar] [CrossRef]
Wang, Y.; Mao, S.; Nelms, R.M. Distributed Online Algorithm for Optimal Real-Time Energy Distribution in the Smart Grid. IEEE Internet Things 2014, 1, 70–80. [Google Scholar] [CrossRef]
Nweye, K.; Sankaranarayanan, S.; Nagy, Z. Merlin: Multi-Agent Offline and Transfer Learning for Occupant-Centric Operation of Grid-Interactive Communities. Appl. Energy 2023, 346, 121323. [Google Scholar] [CrossRef]
Dai, Y.; Gao, Y. Real-Time Pricing Decision Making for Retailer-Wholesaler in Smart Grid Based on Game Theory. In Abstract and Applied Analysis; Hindawi Publishing Corporation: London, UK, 2014; p. 708584. [Google Scholar] [CrossRef]
Maddouri, M.; Elkhorchani, H.; Grayaa, K. Game Theory and Hybrid Genetic Algorithm for Energy Management and Real-Time Pricing in Smart Grid: The Tunisian Case. Int. J. Green Energy 2020, 17, 816–826. [Google Scholar] [CrossRef]
Karimi, A.; Nayeripour, M.; Abbasi, A.R. Coordination in Islanded Microgrids: Integration of Distributed Generation, Energy Storage System, and Load Shedding Using a New Decentralized Control Architecture. J. Energy Storage 2024, 113, 199. [Google Scholar] [CrossRef]
Mun, H.; Kim, Y.; Park, J.; Lee, I. Power Generation System Utilizing Cold Energy from Liquid Hydrogen: Integration with a Liquid Air Storage System for Peak Load Shaving. Energy 2024, 132, 351. [Google Scholar] [CrossRef]
Ding, L. Research on Distributed Energy Management in Smart Grid Based on Multi-Agent Reinforcement Learning. Ph.D. Thesis, Zhejiang University, Hangzhou, China, 2023. [Google Scholar] [CrossRef]
Bilkisu, J.O.; Jiashen, T. Impact of the Integration of Information and Communication Technology on Power System Reliability: A Review. IEEE Access 2020, 8, 24600–24615. [Google Scholar] [CrossRef]
Yang, F.; Koukoula, M.; Stergios, E.; Cerrai, D.; Anagnostou, E.N. Assessing the Power Grid Vulnerability to Extreme Weather Events Based on Long-Term Atmospheric Reanalysis. Stoch. Environ. Res. Risk Assess. 2023, 37, 4291–4306. [Google Scholar] [CrossRef]
Ciapessoni, E.; Pitto, A.; Cirio, D. An Application of a Risk-Based Methodology to Anticipate Critical Situations Due to Extreme Weather Events in Transmission and Distribution Grids. Energy 2021, 14, 4742. [Google Scholar] [CrossRef]
Jufri, F.H.; Widiputra, V.; Jung, J. State-of-the-Art Review on Power Grid Resilience to Extreme Weather Events: Definitions, Frameworks, Quantitative Assessment Methodologies, and Enhancement Strategies. Appl. Energy 2019, 239, 1049–1065. [Google Scholar] [CrossRef]
Panteli, M.; Trakas, D.N.; Mancarella, P.; Hatziargyriou, N.D. Boosting the Power Grid Resilience to Extreme Weather Events Using Defensive Islanding. IEEE Trans. Smart Grid 2016, 7, 2913–2922. [Google Scholar] [CrossRef]
Zhang, L.; Lu, Q.; Huang, R.; Chen, S.; Yang, Q.; Gu, J. A Dynamic Incentive Mechanism for Smart Grid Data Sharing Based on Evolutionary Game Theory. Energy 2023, 16, 8125. [Google Scholar] [CrossRef]
Yin, W.; Liang, W.; Ji, J. Study on Charge and Discharge Control Strategy of Improved PSO for EV. Energy 2024, 132, 132061. [Google Scholar] [CrossRef]
Triviño, A.; López, A.; Yuste, A.J.; Cuevas, J.C. Decentralized EV Charging and Discharging Scheduling Algorithm Based on Type-II Fuzzy-Logic Controllers. J. Energy Storage 2024, 112, 054. [Google Scholar] [CrossRef]
Subramani, J.; Maria, A.; Audithan, S.; Vijayakumar, P.; Alqahtani, F.; Tolba, A. An Efficient Anonymous Authentication Scheme for Blockchain Assisted and Fog-Enabled Smart Grid. Comput. Electr. Eng. 2024, 109, 508. [Google Scholar] [CrossRef]
Saxena, S.; Farag, H.E.Z.; Turesson, H.; Kim, H. Blockchain Based Transactive Energy Systems for Voltage Regulation in Active Distribution Networks. IET Smart Grid 2020, 3, 646–656. [Google Scholar] [CrossRef]
Wu, Y.; Tian, X.; Gai, L.; Lim, B.; Wu, T.; Xu, D.; Zhang, D. Energy management for PV prosumers inside microgrids based on Stackelberg–Nash game considering demand response. Sustain. Energy Technol. Assess. 2024, 68, 103856. [Google Scholar] [CrossRef]
Huang, F.; Fan, H.; Shang, Y.; Wei, Y.; Almutairi, S.Z.; Alharbi, A.M.; Wang, H. Research on Renewable Energy Trading Strategies Based on Evolutionary Game Theory. Sustainability 2024, 16, 2671. [Google Scholar] [CrossRef]
Ma, X.; Pan, Y.; Zhang, M.; Ma, J.; Yang, W. Impact of Carbon Emission Trading and Renewable Energy Development Policy on the Sustainability of Electricity Market: A Stackelberg Game Analysis. Energy Econ. 2024, 129, 107199. [Google Scholar] [CrossRef]
Zhang, X.; Guo, X.; Zhang, X. Bidding Modes for Renewable Energy Considering Electricity-Carbon Integrated Market Mechanism Based on Multi-Agent Hybrid Game. Energy 2023, 263, 125616. [Google Scholar] [CrossRef]
Wu, X.; Liu, P.; Yang, L.; Shi, Z.; Lao, Y. Impact of Three Carbon Emission Reduction Policies on Carbon Verification Behavior: An Analysis Based on Evolutionary Game Theory. Energy 2024, 130, 926. [Google Scholar] [CrossRef]
Wu, Z.; Yang, C.; Zheng, R. An Analytical Model for Enterprise Energy Behaviors Considering Carbon Trading Based on Evolutionary Game. J. Clean. Prod. 2024, 139, 840. [Google Scholar] [CrossRef]
Alam, K.S.; Daiyan, A.M.A.; Das, S.K. A Blockchain-Based Optimal Peer-to-Peer Energy Trading Framework for Decentralized Energy Management within a Virtual Power Plant: Lab-Scale Studies and Large-Scale Proposal. Appl. Energy 2024, 123, 243. [Google Scholar] [CrossRef]
Yan, X.; Gao, C.; Mou, Y.; Abbes, D. Consensus Alternating Direction Multiplier Method Based Fully Distributed Peer-to-Peer Energy Transactions Considering the Network Transmission Distance. Sustain. Energy Grids 2024, 101, 340. [Google Scholar] [CrossRef]
Hou, H.; Wang, Z.; Zhao, B.; Zhang, L.; Shi, Y.; Xie, C.; Yu, K. Distributed Optimization for Joint Peer-to-Peer Electricity and Carbon Trading among Multi-Energy Microgrids Considering Renewable Generation Uncertainty. Energy Convers. Econ. 2024, 5, 116–131. [Google Scholar] [CrossRef]
Wang, X.; Jia, H.; Wang, Z.; Jin, X.; Deng, Y.; Mu, Y.; Yu, X. A Real-Time Peer-to-Peer Energy Trading for Prosumers Utilizing Time-Varying Building Virtual Energy Storage. Int. J. Electr. Power 2024, 109, 547. [Google Scholar] [CrossRef]
Shan, S.; Yang, S.; Becerra, V.; Deng, J.; Li, H. A Case Study of Existing Peer-to-Peer Energy Trading Platforms: Calling for Integrated Platform Features. Sustainability 2023, 15, 316284. [Google Scholar] [CrossRef]
Clift, D.; Holland, H.; Hasan, K.N.; Rosengarten, G. Peer-to-Peer Energy Trading for Demand Response of Residential Smart Electric Storage Water Heaters. Appl. Energy 2024, 353, 122182. [Google Scholar] [CrossRef]
Ahmed, S.A.; Huang, Q.; Amin, W.; Afzal, M.; Hussain, F.; Haider, M.H. A Fair and Effective Approach to Managing Distributed Energy Resources through Peer-to-Peer Energy Trading with Load Prioritization among Smart Homes. Energy Rep. 2023, 10, 4402–4419. [Google Scholar] [CrossRef]
Yaldız, A.; Gökçek, T.; Şengör, İ.; Erdinç, O. Optimal Sizing and Economic Analysis of Photovoltaic Distributed Generation with Battery Energy Storage System Considering Peer-to-Peer Energy Trading. Sustain. Energy Grids 2021, 28, 100540. [Google Scholar] [CrossRef]
Nguyen, S.; Peng, W.; Sokolowski, P.; Alahakoon, D.; Yu, X. Optimizing Rooftop Photovoltaic Distributed Generation with Battery Storage for Peer-to-Peer Energy Trading. Appl. Energy 2018, 228, 2567–2580. [Google Scholar] [CrossRef]
Wei, S.; Wang, H.; Fu, Y.; Li, F.; Huang, L. Electrical System Planning of Large-Scale Offshore Wind Farm Based on N+ Design Considering Optimization of Upper Power Limits of Wind Turbines. J. Mod. Power Syst. Clean Energy 2023, 11, 1784–1794. [Google Scholar] [CrossRef]
Xu, Z.; Cao, J.; Xu, Y.; Sun, Y.; Zhang, X.; Pan, W.-T. Decision-Making Mechanism of Cooperative Innovation between Clients and Service Providers Based on Evolutionary Game Theory. Discrete Dyn. Nat. Soc. 2022, 2022, 8774462. [Google Scholar] [CrossRef]
Yin, L.; He, X. Artificial Emotional Deep Q-Learning for Real-Time Smart Voltage Control of Cyber-Physical Social Power Systems. Energy 2023, 273, 127232. [Google Scholar] [CrossRef]
Yin, L.; Yu, T.; Zhou, L. Design of a Novel Smart Generation Controller Based on Deep Q Learning for Large-Scale Interconnected Power System. J. Mod. Power Syst. Clean Energy 2018, 144, 3. [Google Scholar] [CrossRef]
Ying, Y.; Tian, Z.; Wu, M.; Liu, Q.; Tricoli, P. A Real-Time Energy Management Strategy of Flexible Smart Traction Power Supply System Based on Deep Q-Learning. IEEE Trans. Intell. Transp. Syst. 2024, 25, 8938–8948. [Google Scholar] [CrossRef]
Mbuwir, B.; Ruelens, F.; Spiessens, F.; Deconinck, G. Battery Energy Management in a Microgrid Using Batch Reinforcement Learning. Energies 2017, 10, 1846. [Google Scholar] [CrossRef]
Bazmohammadi, N.; Anvari-Moghaddam, A.; Tahsiri, A.; Madary, A.; Vasquez, J.; Guerrero, J. Stochastic Predictive Energy Management of Multi-Microgrid Systems. Appl. Sci. 2020, 10, 4833. [Google Scholar] [CrossRef]
Alabdullah, M.H.; Abido, M.A. Microgrid Energy Management Using Deep Q-Network Reinforcement Learning. Alex. Eng. J. 2022, 61, 9069–9078. [Google Scholar] [CrossRef]
Ji, Y.; Wang, J.; Xu, J.; Fang, X.; Zhang, H. Real-Time Energy Management of a Microgrid Using Deep Reinforcement Learning. Energies 2019, 12, 2291. [Google Scholar] [CrossRef]
Li, T.; Yang, J.; Ioannou, A. Data-Driven Control of Wind Turbine under Online Power Strategy via Deep Learning and Reinforcement Learning. Renew. Energy 2024, 121, 265. [Google Scholar] [CrossRef]
Li, Y.; Zhang, Y.; Li, X.; Sun, C. Regional Multi-Agent Cooperative Reinforcement Learning for City-Level Traffic Grid Signal Control. IEEE/CAA J. Autom. Sin. 2024, 9, 1987–1998. [Google Scholar] [CrossRef]
Zhang, Q.; Zhu, L.; Chen, Y. Energy-Efficient Traffic Offloading for RSMA-Based Hybrid Satellite Terrestrial Networks with Deep Reinforcement Learning. China Commun. 2024, 21, 49–58. [Google Scholar] [CrossRef]
Wang, Y.; Zhang, N.; Kang, C.; Xi, W.; Huo, M. Electrical Consumer Behavior Model: Basic Concept and Research Framework. Trans. China Electrotech. Soc. 2019, 34, 2056–2068. [Google Scholar] [CrossRef]
Yang, X.; Zhou, M.; Li, G. Survey on demand response mechanism and modeling in smart grid. Power Syst. Technol. 2016, 40, 220–226. [Google Scholar] [CrossRef]
Chen, Q.; Lv, R.; Guo, H.; Jia, H.; Ding, Y.; Wang, Y.; Kang, C. Electricity user behavior modeling for demand response: Research status quo and applications. Electr. Autom. Equip. 2023, 43, 23–37. [Google Scholar] [CrossRef]
Yu, J.; Cao, Y.; Yan, P.; Li, Y. Integrated regional energy operation strategy considering user demand response game deduction. Power Syst. Technol. 2024, 48, 3745–3757. [Google Scholar] [CrossRef]
Dou, X.; Wang, J.; Wang, X.; Wu, L. Analysis of user demand side response behavior of regional integrated power and gas energy systems based on evolutionary game. Proc. Chin. Soc. Elect. Eng. 2020, 40, 3775–3786. [Google Scholar] [CrossRef]
Wang, X.; Yang, J.; Zhang, K. Game-theoretic analysis of market-based operation mechanism for demand response resources. Int. J. Electr. Power 2022, 134, 107456. [Google Scholar] [CrossRef]
Zhang, Y.; Wang, X.; He, J. Optimization of distributed integrated multi-energy system considering industrial process based on energy hub. J. Mod. Power Syst. Clean Energy 2020, 8, 863–873. [Google Scholar] [CrossRef]
Ge, H.; Zhao, L.; Yue, D.; Gorbachev, S.; Korovin, I.; Ge, Y. A Game Theory-Based Optimal Allocation Strategy for Defense Resources of Smart Grid under Cyber-Attack. Inf. Sci. 2024, 652, 119759. [Google Scholar] [CrossRef]
Shin, G.S.; Kim, H.Y.; Mahseredjian, J. Smart Vehicle-to-Grid Operation of Power System Based on EV User Behavior. J. Electr. Eng. Technol. 2024, 19, 2941–2952. [Google Scholar] [CrossRef]
Zhong, C.; Shao, J.; Zheng, F. Research on Electricity Consumption Behavior of Electric Power Users Based on Tag Technology and Clustering Algorithm. In Proceedings of the 2018 5th International Conference on Information Science and Control Engineering (ICISCE), Zhengzhou, China, 20–22 July 2018; pp. 459–462. [Google Scholar] [CrossRef]
Ji, H.; Wang, H.; Yang, J. Optimal Schedule of Solid Electric Thermal Storage Considering Consumer Behavior Characteristics in Combined Electricity and Heat Networks. Energy 2021, 234, 121237. [Google Scholar] [CrossRef]
Xie, G.; Chen, X.; Weng, Y. Enhance Load Forecastability: Optimize Data Sampling Policy by Reinforcing User Behaviors. Eur. J. Oper. Res. 2021, 295, 924–934. [Google Scholar] [CrossRef]
Hu, H.; Wang, Y.; Han, J. Analysis of User Power Consumption Characteristics and Behavior Portrait Based on KS-RF Algorithm. In Proceedings of the 2021 IEEE/IAS Industrial and Commercial Power System Asia (I&CPS Asia), Chengdu, China, 18–21 July 2021; pp. 1586–1590. [Google Scholar] [CrossRef]
Yu, H.; He, J.; Wang, Y. Spike Load Smoothing Strategy Considering Behavioral Decision under Vehicle-Road-Network Convergence. Electr. Power Syst. Res. 2024, 237, 110898. [Google Scholar] [CrossRef]
Tan, B.; Lin, Z.; Zheng, X. Distributionally Robust Energy Management for Multi-Microgrids with Grid-Interactive EVs Considering the Multi-Period Coupling Effect of User Behaviors. Appl. Energy 2023, 350, 21770. [Google Scholar] [CrossRef]
He, Y.; Wu, H.; Wu, A.Y. Optimized Shared Energy Storage in a Peer-to-Peer Energy Trading Market: Two-Stage Strategic Model Regards Bargaining and Evolutionary Game Theory. Renew. Energy 2024, 224, 120190. [Google Scholar] [CrossRef]
Cheng, L.; Chen, Y.; Liu, G. 2PnS-EG: A General Two-Population n-Strategy Evolutionary Game for Strategic Long-Term Bidding in a Deregulated Market under Different Market Clearing Mechanisms. Int. J. Electr. Power 2022, 142, 108182. [Google Scholar] [CrossRef]
Yan, S.; Wang, W.; Li, X. Cross-Regional Green Certificate Transaction Strategies Based on a Double-Layer Game Model. Appl. Energy 2024, 356, 122223. [Google Scholar] [CrossRef]
Lee, W.-P.; Han, D.; Won, D. Grid-Oriented Coordination Strategy of Prosumers Using Game-Theoretic Peer-to-Peer Trading Framework in Energy Community. Appl. Energy 2022, 326, 119980. [Google Scholar] [CrossRef]
Wang, R.; Li, Y.; Gao, B. Evolutionary Game-Based Optimization of Green Certificate, Carbon Emission Right, Electricity Joint Market for Thermal-Wind-Photovoltaic Power System. Glob. Energy Interconnect. 2023, 6, 92–102. [Google Scholar] [CrossRef]
Cheng, L.F.; Yin, L.F.; Wang, J.H.; Shen, T.; Che, Y.; Liu, G.Y.; Yu, T. Behavioral decision-making in power demand-side response management: A multi-population evolutionary game dynamics perspective. Int. J. Electr. Power Energy Syst. 2021, 129, 106743. [Google Scholar] [CrossRef]
Wang, B.; Evergreen, S.; Forest, J. Game Theory in Smart Grids: Strategic Decision-Making for Renewable Energy Integration. Sustain. Cities Soc. 2024, 108, 105480. [Google Scholar] [CrossRef]
Stai, E.; Kokolaki, E.; Mitridati, L. Game-Theoretic Energy Source Allocation Mechanism in Smart Grids. In Proceedings of the IEEE International Energy Conference (ENERGYCON), Riga, Latvia, 9–12 May 2022. [Google Scholar] [CrossRef]
Gaspari, F. Energy Trading in Smart Grids Using Game Theoretic Approach; Springer International Publishing: Berlin/Heidelberg, Germany, 2023. [Google Scholar] [CrossRef]
Naz, A.; Javaid, N.; Rasheed, M.B.; Haseeb, A.; Alhussein, M.; Aurangzeb, K. Game Theoretical Energy Management with Storage Capacity Optimization and Photo-Voltaic Cell Generated Power Forecasting in Micro Grid. Sustainability 2019, 11, 2763. [Google Scholar] [CrossRef]
Zhang, Z.; Liu, H. Non-Cooperative Energy Consumption Scheduling for Smart Grid: An Evolutionary Game Approach; Springer International Publishing: Berlin/Heidelberg, Germany, 2020. [Google Scholar] [CrossRef]
Cheng, L.; Liu, G.; Huang, H.; Wang, X.; Chen, Y.; Zhang, J.; Meng, A.; Yang, R.; Yu, T. Equilibrium analysis of general N-population multi-strategy games for generation-side long-term bidding: An evolutionary game perspective. J. Clean. Prod. 2020, 276, 124123. [Google Scholar] [CrossRef]
Cheng, L.; Yu, T. Nash equilibrium-based asymptotic stability analysis of multi-group asymmetric evolutionary games in typical scenario of electricity market. IEEE Access 2018, 6, 32064–32086. [Google Scholar] [CrossRef]
Abbass, H.; Greenwood, G.; Petraki, E. The N-player trust game and its replicator dynamics. IEEE Trans. Evol. Comput. 2016, 20, 470–474. [Google Scholar] [CrossRef]
Barari, S.; Agarwal, A.; Zhang, W.J.; et al. A decision framework for the analysis of green supply chain contracts: An evolutionary game approach. Expert Syst. Appl. 2012, 39, 2965–2976. [Google Scholar] [CrossRef]
Cheng, L.; Zhang, J.; Yin, L.; Chen, Y.; Wang, J.; Liu, G.; Wang, X.; Zhang, D. General three-population multi-strategy evolutionary games for long-term on-grid bidding of generation-side electricity market. IEEE Access 2021, 9, 5177–5198. [Google Scholar] [CrossRef]
Abapour, S.; Nazari-Heris, M.; Mohammadi-Ivatloo, B.; et al. Game theory approaches for the solution of power system problems: A comprehensive review. Arch. Comput. Methods Eng. 2020, 27, 81–103. [Google Scholar] [CrossRef]
Cheng, L.; Yu, T. Smart dispatching for energy internet with complex cyber-physical-social systems: A parallel dispatch perspective. Int. J. Energy Res. 2019, 43, 3080–3133. [Google Scholar] [CrossRef]
Cheng, L.; Yu, T.; Zhang, X. The weakly-centralized Web-of-Cells based on cyber-physical-social systems integration and group machine learning: Theoretical investigations and key scientific issues analysis. Sci. Sin. Technol. 2019, 49, 1541–1569. [Google Scholar]
Blinovas, A.; Urazaki, K.J.; Badia, L.; Gindullina, E. A Game Theoretic Approach for Cost-Effective Management of Energy Harvesting Smart Grids. In Proceedings of the IEEE IWCMC, Dubrovnik, Croatia, 30 May–3 June 2022. [Google Scholar] [CrossRef]
Zhu, Z.; Cheng, L.; Shen, T. Spontaneous formation of evolutionary game strategies for long-term carbon emission reduction based on low-carbon trading mechanism. Mathematics 2024, 12, 3109. [Google Scholar] [CrossRef]
Cheng, L.; Peng, P.; Lu, W.; Huang, P.; Chen, Y. Study of flexibility transformation in thermal power enterprises under multi-factor drivers: Application of complex-network evolutionary game theory. Mathematics 2024, 12, 2537. [Google Scholar] [CrossRef]
Kappner, K.; Venghaus, S.; Letmathe, P. Economic, Environmental, Societal and Technological Impact Factors in the Transition of the German Energy System: Investigations from a Prosumer’s Perspective. Ph.D. Thesis, RWTH Aachen University, Aachen, Germany, 2023. Available online: https://publications.rwth-aachen.de/record/961609/files/961609.pdf (accessed on 16 July 2024).
Żuk, P.; Żuk, P. Prosumers in Action: The Analysis of Social Determinants of Photovoltaic Development and Prosumer Strategies in Poland. Int. J. Energy Econ. Policy 2022, 12, 294–306. Available online: https://www.zbw.eu/econis-archiv/bitstream/11159/12290/1/1816938696_0.pdf (accessed on 18 July 2024). [CrossRef]
Cai, D.; Zuo, J.; Hao, X. Dynamic adaptation in power transmission: Integrating robust optimization with online learning for renewable uncertainties. Front. Energy Res. 2024, 12, 1483170. [Google Scholar] [CrossRef]
National Renewable Energy Laboratory (NREL). Integrating Variable Renewable Energy: Challenges and Solutions; National Renewable Energy Laboratory: Golden, CO, USA, 2023. Available online: www.nrel.gov/publications (accessed on 24 July 2024).
International Energy Agency (IEA). Introduction to System Integration of Renewables; IEA: Paris, France, 2024. Available online: www.iea.org/reports/introduction-to-system-integration-of-renewables (accessed on 10 August 2024).
Timilsina, A.; Silvestri, S. How do climate policy uncertainty and renewable energy and clean technology stock prices co-move? Evidence from Canada. In Empirical Economics; Springer: Berlin/Heidelberg, Germany, 2023. [Google Scholar]
Bui, V.; Nguyen, T.; Vu, T.; Pham, N.; Le, A. A Critical Review of Safe Reinforcement Learning Techniques in Smart Grid Applications. arXiv 2024, arXiv:2409.16256. [Google Scholar] [CrossRef]
Li, Y.; Yu, C.; Shahidehpour, M.; Yang, T.; Zeng, Z.; Chai, T. Deep Reinforcement Learning for Smart Grid Operations: Algorithms, Applications, and Prospects. Proc. IEEE 2023, 111, 1055–1096. [Google Scholar] [CrossRef]
Pu, X.; Xiao, H.; Pei, W.; Yang, Y.; Ma, L.; Ma, T.; Zhang, S. Optimal energy management of networked multi-energy microgrids based on improved multi-agent federated reinforcement learning. CSEE J. Power Energy Syst. 2024; early access. [Google Scholar]
Meydani, A.; Shahinzadeh, H.; Nafisi, H.; Gharehpetian, G.B. Optimizing Microgrid Energy Management: Intelligent Techniques. In Proceedings of the 2024 28th International Electrical Power Distribution Conference (EPDC), Zanjan, Iran, 23–25 April 2024; pp. 1–18. [Google Scholar] [CrossRef]
Li, S.; Cao, D.; Hu, W.; Huang, Q.; Chen, Z.; Blaabjerg, F. Multi-energy Management of Interconnected Multi-microgrid System Using Multi-agent Deep Reinforcement Learning. J. Mod. Power Syst. Clean Energy 2023, 11, 1606–1617. [Google Scholar] [CrossRef]
Shi, Q.; Liu, M.; Zhang, S.; Zheng, R.; Lan, X. Multi-Agent Path Finding Method Based on Evolutionary Reinforcement Learning. In Proceedings of the 2024 43rd Chinese Control Conference (CCC), Kunming, China, 28–31 July 2024; pp. 5728–5733. [Google Scholar] [CrossRef]
Liu, W.; Li, K.; Li, W.; Wang, R.; Zhang, T. Multi-Agent Deep Reinforcement Learning for Multi-Modal Orienteering Problem. In Proceedings of the 2024 IEEE 18th International Symposium on Applied Computational Intelligence and Informatics (SACI), Timisoara, Romania, 21–25 May 2024; pp. 000169–000174. [Google Scholar] [CrossRef]
Karaki, A.; Al-Fagih, L. Evolutionary Game Theory as a Catalyst in Smart Grids: From Theoretical Insights to Practical Strategies. IEEE Access 2024; early access. [Google Scholar] [CrossRef]
Zhao, J.; Peng, W.; Wang, H.; Yao, W.; Zhou, W. A Morphological Transfer-Based Multi-Fidelity Evolutionary Algorithm for Soft Robot Design. IEEE Comput. Intell. Mag. 2024, 19, 16–30. [Google Scholar] [CrossRef]
Stranieri, N.; Buffa, F.M.; Tangherloni, A. Forest-based Evolutionary Algorithm for Reconstructing Boolean Gene Regulatory Networks. In Proceedings of the 2024 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Natal, Brazil, 27–29 August 2024; pp. 1–8. [Google Scholar] [CrossRef]
Cubillos-Chaparro, J.; Dorn, M.; Villalobos-Cid, M.; Inostroza-Ponta, M. A Multiobjective Evolutionary Algorithm for Colon Cancer Biomarkers Identification on Gene Expression Data. In Proceedings of the 2024 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Natal, Brazil, 27–29 August 2024; pp. 1–8. [Google Scholar] [CrossRef]
Qi, H.; Gong, S.; Gani, A.; Gong, C. High-Order Quantum Genetic Algorithm Based on Quantum Entanglement. In Proceedings of the 2024 6th International Conference on Electronic Engineering and Informatics (EEI), Chongqing, China, 28–30 June 2024; pp. 983–988. [Google Scholar] [CrossRef]
Jin, C.; Liu, G. On the Impact of the Large Population on Evolutionary Algorithm. In Proceedings of the 2024 20th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Guangzhou, China, 27–29 July 2024; pp. 1–10. [Google Scholar] [CrossRef]
Imam, F.A.; Osanaiye, O.; Imam, N.A.; Obadiah, A.N.; Rafindadi, M.A.; Thomas, S. Hybridization of Artificial Immune System Algorithms with Other AI Algorithms: A Review. In Proceedings of the 2023 2nd International Conference on Multidisciplinary Engineering and Applied Science (ICMEAS), Abuja, Nigeria, 1–3 November 2023; pp. 1–6. [Google Scholar] [CrossRef]
Liang, X.; Guo, Q.; Qian, Y.; Ding, W.; Zhang, Q. Evolutionary Deep Fusion Method and its Application in Chemical Structure Recognition. IEEE Trans. Evol. Comput. 2021, 25, 883–893. [Google Scholar] [CrossRef]
Lima, T.P.F.D.; Ludermir, T.B. Optimizing Dynamic Ensemble Selection Procedure by Evolutionary Extreme Learning Machines and a Noise Reduction Filter. In Proceedings of the 2013 IEEE 25th International Conference on Tools with Artificial Intelligence, Herndon, VA, USA, 4–6 November 2013; pp. 546–552. [Google Scholar] [CrossRef]
Huda, S.; Yearwood, J.; Togneri, R. A Constraint-Based Evolutionary Learning Approach to the Expectation Maximization for Optimal Estimation of the Hidden Markov Model for Speech Signal Modeling. IEEE Trans. Syst. Man Cybern. Part B 2009, 39, 182–197. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Alekhya, V.; Reddy, N.V.U.; Singh, J.; Boddu, B.; Sobti, R.; Hameed, A.A. High-Dimensional Data Processing Using Quantum-Inspired Evolutionary Algorithms for Homeland Security Imaging Systems. In Proceedings of the 2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE), Gautam Buddha Nagar, India, 9–11 May 2024; pp. 1184–1189. [Google Scholar] [CrossRef]
Wu, J.; Wang, X.-Y.; Tian, A.-Q.; Du, Z.-G.; Yang, Z.-J. A Hybrid Meta-Heuristic Approach for Emergency Logistics Distribution Under Uncertain Demand. IEEE Access 2024, 12, 135701–135729. [Google Scholar] [CrossRef]
Srivatsan, K.; Venkatesan, N. Improved Meta-Heuristic Technique to FIR Filter Design and Application. IEEE Access 2024, 12, 108097–108107. [Google Scholar] [CrossRef]
Ibnoulouafi, E.M.; Aouam, T.; Oudani, M.; Ghogho, M. Efficient Meta-Heuristic Approach for the Multi-Objective Green p-Hub Centre Routing Problem. IEEE Trans. Evol. Comput. 2024; early access. [Google Scholar] [CrossRef]

Figure 1. Dynamic evolution of hawk–dove strategies.

Figure 2. EGT: cooperation between banks and technological SEMs.

Figure 3. The process of building a multi-agent model.

Figure 4. The complexity of real-world market behaviors under varying demand and supply conditions.

Figure 5. Impact of EGT on consumer behavior and market demand under real-time pricing.

Figure 6. A basic framework for MARL or game theory.

Figure 7. The evolution of energy strategy efficiency and frequency.

Figure 8. Interaction of different actors and mechanisms in a smart grid to achieve efficient trading and management of energy. The sub-themes revolve around a core theme that illustrates the complexity and diversity of smart grids in real-time operations.

Figure 9. Framework for integrating EGT and the DQN for strategy optimization in electricity markets.

Figure 10. The basic components of the electricity user behavior model.

Figure 11. The evolution of consumers adjusting their electricity usage strategies under different electricity prices.

Figure 12. An optimal defense resource allocation method based on game theory.

Table 1. Differences between traditional game theory and EGT.

Theory Type	Theoretical Basis	Analytical Method	Research Emphasis
Traditional game theory	Assumption of complete rationality: participants can accurately calculate the optimal strategy	Static analysis, primarily addressing Nash equilibrium	Focus on the existence of Nash equilibrium and the choice of optimal strategies by participants
Evolutionary game theory	Assumption of bounded rationality: participants determine the optimal strategy through continuous trial and adjustment	Dynamic evolutionary method, describing the evolutionary process of participants’ strategies	Focus on the evolutionary process and evolutionary stability of participant strategies
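
As a hedged illustration of the dynamic evolutionary method listed in Table 1, the following Python sketch integrates the standard two-strategy replicator equation for a hawk–dove game with a simple Euler scheme. The payoff values `v` and `c`, the step size `dt`, and all function names are illustrative assumptions and are not taken from the reviewed studies.

```python
def hawk_dove_payoffs(x, v, c):
    """Expected payoffs of hawk and dove when a share x of the population plays hawk.

    v is the value of the contested resource and c the cost of an escalated fight
    (both are hypothetical parameters chosen for illustration).
    """
    hawk = x * (v - c) / 2 + (1 - x) * v
    dove = (1 - x) * v / 2
    return hawk, dove


def replicator_step(x, v, c, dt):
    """One Euler step of the replicator equation dx/dt = x (1 - x) (f_hawk - f_dove)."""
    hawk, dove = hawk_dove_payoffs(x, v, c)
    return x + dt * x * (1 - x) * (hawk - dove)


def simulate(x0, v=2.0, c=4.0, dt=0.01, steps=5000):
    """Return the trajectory of the hawk share starting from x0."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = replicator_step(x, v, c, dt)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    for x0 in (0.1, 0.9):
        print(f"start={x0:.1f} -> final hawk share={simulate(x0)[-1]:.4f}")
```

With `v = 2` and `c = 4`, the hawk share moves toward the interior rest point `v / c = 0.5` from either starting value. A static analysis identifies this mixed equilibrium, while the replicator trajectory also describes how a boundedly rational population approaches it.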

Table 2. Categories and applications of EGT.

EGT Application Category	Application Scenario and Description	Detailed Explanation
Real-time pricing	Power generators and retailers develop competitive pricing strategies in response to market demand, influencing consumer behavior.	EGT helps evaluate and optimize pricing strategies, leading to smoother demand patterns as consumers invest in energy-efficient technologies or storage systems.
Renewable energy development	Modeling competitive and cooperative dynamics among renewable energy producers in a more integrated market.	EGT simulates energy storage operators’ and renewable producers’ evolving strategies to determine optimal storage and release times based on generation trends and market prices.
Carbon emissions	Modeling company competition in carbon allowance trading to optimize strategies for minimizing emission costs.	EGT helps assess the effectiveness of environmental policies and trading mechanisms by analyzing competitive behaviors in carbon markets.
Market dynamics	Simulating market conditions to study how participants adjust strategies during price volatility and fluctuating supply–demand conditions.	EGT models how electricity consumers and generators adapt decision-making processes, such as trading in real-time energy markets in response to supply–demand imbalances.

Table 3. Application scenarios of the DQN in the electricity market.

Electricity Market Scenario	Role of DQN	Expected Outcome
Real-time pricing strategy	Optimizes supplier pricing strategies based on market price fluctuations	Increased profitability, balanced supply and demand
Demand-response management	Dynamically adjusts power supply plans based on changing user demand	Increased energy efficiency, reduced peak period costs
Load optimization	Adjusts generation and transmission strategies based on real-time load demand	Reduced waste, optimized resource allocation
Market participant bidding strategy	Optimizes participant bidding strategies by learning from market bidding history	Increased bidding success rate, reduced price volatility

Table 4. Performance comparison of RL algorithms.

Algorithm	Strategy Selection Method	Adaptability	Convergence Speed
Q-Learning	ε-greedy strategy	Moderate	Slow
SARSA	On-policy update from the action actually taken (commonly ε-greedy)	Moderate	Moderate
DQN	Approximates Q-values using neural networks	High	Fast
Double DQN	Addresses overestimation in the DQN	High	Faster
Actor-Critic	Separates policy and value functions	High	Fast
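
To make the distinctions in Table 4 more concrete, the following Python sketch writes out an ε-greedy action rule and the textbook bootstrapping targets for Q-learning, SARSA, and Double DQN. It is a minimal sketch under stated assumptions: Q-values are plain lists instead of neural-network outputs, and all function names, rewards, and discount factors are hypothetical and not drawn from the reviewed studies.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def q_learning_target(reward, next_q, gamma):
    """Off-policy target: bootstrap from the best next action.

    The basic DQN target has the same form, with next_q taken from a
    separate target network.
    """
    return reward + gamma * max(next_q)


def sarsa_target(reward, next_q, next_action, gamma):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * next_q[next_action]


def double_dqn_target(reward, next_q_online, next_q_target, gamma):
    """Double DQN target: the online estimate selects the action and the
    target estimate evaluates it, which reduces overestimation."""
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


if __name__ == "__main__":
    online, target = [0.5, 2.0], [3.0, 1.0]  # illustrative Q-value estimates
    print("DQN target:       ", q_learning_target(1.0, target, 0.9))
    print("Double DQN target:", double_dqn_target(1.0, online, target, 0.9))
    print("SARSA target:     ", sarsa_target(1.0, online, 0, 0.9))
```

In this toy case the Double DQN target is lower than the plain DQN target because the action chosen by the online estimate is not the one the target estimate rates highest. This is the overestimation correction that Table 4 attributes to Double DQN.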

Table 5. A summary of the application of user behavior models in power systems.

Reference	Approaches Used in the Study	Results	Limitations
Shin et al. (2024)	A V2G discharge optimization model was developed based on an EV user behavior analysis of V2G electricity performance	Successfully predicted and optimized the usage patterns of electric vehicles in the V2G system	Qualitative factors still need to be considered in the user behavior analysis system
Zhong et al. (2018)	Based on massive user profile data and electrical characteristics, a user behavior tag library was built, and the tags were analyzed through k-means clustering	Provided strong data support for power companies to understand users’ EPS usage habits, mine users’ electricity demands, and improve service levels	k-means can only process numerical data, while some user behavior characteristics may be non-numerical
Ji et al. (2021)	Based on consumer behavior characteristics in electric–thermal grids, a method for the optimal scheduling of distributed solid thermal storage was proposed	Reduced wind curtailment and improved the consistency between planned outcomes and expectations, thereby avoiding waste of electricity and thermal energy	The classification of consumer behavior may be overly simplified
Xie et al. (2021)	An adaptive data sampling strategy based on user behavior was proposed	The proposed algorithm can be implemented offline and online, with the latter capable of real-time data interaction with smart grids	The model may not fully adapt to all types of users, especially when load behavior exhibits unconventional patterns, which could affect the predictive outcomes
Hu et al. (2021)	A clustering model of user electricity consumption behavior was constructed based on the k-shape clustering algorithm	Accurately clustered users, extracted features, and performed behavior profiling	The clustering method primarily considers the shape similarity of user electricity consumption patterns, which may overlook some important features of electricity usage behavior
Yu et al. (2024)	A “vehicle-road-network” integration strategy considering user behavior decision making was proposed	Coordinated the distribution of charging loads, alleviated traffic pressure during peak load periods, and reduced travel and charging costs for EV users	The model has limited capability in processing and responding to real-time data, which may result in delayed reactions when dealing with sudden traffic or grid load changes
Tan et al. (2023)	A new distributionally robust energy management model for multi-microgrids (MMGs) was proposed based on the multi-period coupling effect of user behavior	Achieved excellent cost efficiency, convergence performance, and robustness	When dealing with complex real-world scenarios, the performance of the optimization model may be limited by data scale and computational complexity

Table 6. A summary of the application of EGT in electricity market transactions.

Reference	Approaches Used in the Study	Advantages	Limitations
He et al. (2024)	Based on the multi-strategy evolutionary game model, considering the bounded rational decision making of SES operators and the community, a unique SES leasing fee pricing mechanism was designed	It can be effectively applied in dynamic environments, especially when market conditions and participant behavior are constantly changing	The simulation and computation are highly intensive, which presents significant technical challenges for implementing this theory in the actual electricity market
Cheng et al. (2022)	Focused on a general two-population n-strategy evolutionary game (2PnS-EG), especially the general two-population three-strategy evolutionary game (2P3S-EG)	Complete RNP parameters are defined for 2P3S-SEG-based homogeneous PGM and 2P3S-AEG-based heterogeneous PGM	It ignores the diversity and dynamics in the game process and has not yet incorporated stochastic disturbance factors of group evolution under uncertainty into the scope of the study
Yan et al. (2024)	Using stochastic evolutionary game theory, the cooperative strategies of participants under bounded rationality were dynamically deduced	It can dynamically deduce cooperative strategies under bounded rationality and consider the impact of random disturbances, making the model more aligned with the complexity and dynamic changes in real markets	It may not be directly applicable to other types of electricity markets or trading mechanisms, which limits its scope of application
Lee et al. (2022)	The evolutionary game was combined with a non-cooperative game and a Stackelberg game, forming a multi-level game framework	The game theory-based P2P electricity trading system takes grid conditions into account	It responds slowly to sudden changes and rapid fluctuations in the market, which may cause the model to lag when dealing with large fluctuations in renewable energy output
Wang et al. (2023)	Considering the irrational bidding behavior of energy suppliers in the actual electricity market, an evolutionary game-based multi-market bidding optimization model was presented	The bidding strategies of participants are dynamically optimized through evolutionary game theory, enabling more efficient market operations	The model relies on government regulation and the enforcement of green certificates and carbon emission rights

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
