Review

A Review of Agent-Based Models for Energy Commodity Markets and Their Natural Integration with RL Models

by Silvia Trimarchi 1,*, Fabio Casamatta 2, Laura Gamba 2, Francesco Grimaccia 1,*, Marco Lorenzo 2 and Alessandro Niccolai 1

1 Department of Energy, Politecnico di Milano, Via Lambruschini, 4, 20156 Milan, Italy
2 BU Trading and Execution, A2A S.p.A., Corso di Porta Vittoria 4, 20122 Milan, Italy
* Authors to whom correspondence should be addressed.
Energies 2025, 18(12), 3171; https://doi.org/10.3390/en18123171
Submission received: 19 May 2025 / Revised: 11 June 2025 / Accepted: 15 June 2025 / Published: 17 June 2025

Abstract: Agent-based models are a flexible and scalable modeling approach employed to study and describe the evolution of complex systems in different fields, such as social sciences, engineering, and economics. In the latter, they have been largely employed to model financial markets with a bottom-up approach, with the aim of understanding the price formation mechanism and of generating market scenarios. In the last few years, they have found application in the analysis of energy markets, which have experienced profound transformations driven by the introduction of energy policies to ease the penetration of renewable energy sources and the integration of electric vehicles and by the current unstable geopolitical situation. This review provides a comprehensive overview of the application of agent-based models in energy commodity markets by defining their characteristics and highlighting the different possible applications and the open-source tools available. In addition, it explores the possible integration of agent-based models with machine learning techniques, which makes them adaptable and flexible to current market conditions, enabling the development of dynamic simulations without fixed rules and policies. The main findings reveal that while agent-based models significantly enhance the understanding of energy market mechanisms, enabling better profit optimization and technical constraint coherence for traders, scaling these models to highly complex systems with a large number of agents remains a key limitation.

1. Introduction

Agent-based models are commonly used to simulate evolving systems of autonomous interacting agents. Although it is not possible to identify a single definitive originator of agent-based models, they have been used for simulation purposes since the end of the 20th century [1]. The success of this methodology lies in its multidisciplinary nature, which has been highlighted by different milestone works [2,3]. Nowadays, they are still used in many different fields, such as biology, social science, engineering, and economics, to model complex and dynamic systems [4]. In agent-based modeling, each agent is designed independently and the system evolves as a result of the different agents’ interactions [5,6].
In the economic field, they have been employed to study the evolution of markets, by modeling their processes from the bottom up. This enables analysis of the interactions between the various market participants and provides insights into the price formation process. In addition, the computational model developed can be employed to test economic theories by means of controlled and replicable experiments [5].
More recently, they have been applied to energy markets, which have shown growing complexity in recent years. These are harder to model than stock markets due to the nature of the traded products. They deal with commodities like electricity, natural gas, and carbon emission credits, which are influenced by current weather, users’ behavior, and supply chain conditions, but also by long-term factors such as technological advancement and public energy policies [6,7]. The latter also drive the emergence of new market segments and designs in developing countries [8,9]. In addition, energy markets have to take into account delivery times and the technical requirements of the electrical grid and of the infrastructure. This aspect is even more critical considering that some traded products, such as electricity, cannot be stored easily, and therefore the supply and demand balance must be satisfied in real time. However, the computational resources and power available nowadays allow models to include and cope with this complexity [6].
Agent-based models (ABMs) are particularly suited to studying and modeling energy markets. Indeed, they allow the incorporation of heterogeneous agents and hence replication of the diversity of the stakeholders that characterizes the energy markets. The participants belong to different categories, such as producers, consumers, speculators, regulators, or even intermediaries, which have diverse goals, decision-making processes, and trading schemes. In addition, energy markets are a decentralized system where each agent acts independently. These frameworks are highly suited to be modeled through ABMs, given their bottom-up approach [10].
Furthermore, energy markets are intrinsically dynamic, since agents continuously adjust their position and strategies based both on the market and external conditions. Through ABMs, starting from the single individual agents, it is possible to observe their actions and reactions, and to simulate a dynamic large-scale system. In this way, thanks to the flexibility of this modeling technique, it is possible to analyze the effect that innovative paradigms have on the markets, such as distributed generation [10,11]. In addition, another key element that makes this modeling technique suitable for energy markets is its scalability. Indeed, these markets exist at various scales, starting from microgrids or local markets to international power markets. ABMs therefore allow analysis of both small-scale events and global phenomena, depending on the adopted scale [10,11].
Finally, the integration of artificial intelligence (AI) techniques, such as reinforcement learning (RL), in the agents’ definition allows for the inclusion of adaptive behaviors based on past experience. This enables more realistic simulated scenarios to be achieved, since the real-world participants in energy markets adapt their trading strategies based on experience and market conditions [12]. In addition, this aspect has become more crucial in recent years, since energy markets are experiencing exceptional volatility and unpredictability of prices caused by multiple factors, such as the integration of renewable sources in the grid or world-affecting geopolitical events. Therefore, the generation of scenarios of energy price evolution is attracting ever-more interest, since they can be used both to test innovative trading strategies and to explore new trading schemes [13].
The scope of this study is to analyze the different applications of agent-based models in energy markets, exploring the new techniques and advances that are being integrated into this framework. In particular, the focus lies on the simulation environments that determine the generation of scenarios, detailing the different frameworks that have been employed. While existing reviews on agent-based modeling concentrate on stock market or energy spot market simulations, this study aims to establish a reference point for the agent-based modeling of commodity markets by collecting and analyzing the different studies available in the literature. In addition, the state of the art, current potentialities, limitations, and possible future research areas are highlighted.
In the context of this study, the term energy commodities refers to goods or raw materials that can be used to produce energy and that can be traded, such as natural gas, coal, electricity, and carbon allowances. These commodities can be traded, hedged, and delivered on energy commodity markets, which can be mainly divided into spot markets, where energy commodities are traded with immediate delivery, and futures markets, where these products are purchased or sold as contracts with a set date of delivery [13,14]. Therefore, this research focuses on the application of agent-based models in the framework of energy commodity markets.
The rest of the article is structured as follows. Firstly, in Section 2 the core concepts of agent-based modeling are defined, by detailing the different agent types that act on energy markets. Secondly, the different applications of ABMs in energy commodity markets are discussed in Section 3. Then, in Section 4 the various tools available in the literature for ABM simulations are presented, while in Section 5 the possible integration of machine learning (ML) techniques with ABMs is displayed and its limitations are discussed. Finally, in Section 6 the actual limitations and challenges of ABMs in energy markets are highlighted, and in Section 7 the conclusions are drawn and possible future developments of this modeling approach are discussed.

2. Foundations of Agent-Based Models in Energy Markets

An agent-based model is a computational model where autonomous agents interact with each other and with exogenous variables in order to simulate complex systems. These models are composed of three main components: the agents, the environment, and its topology. The agents operate independently, with unique characteristics and properties, and can represent single individuals or groups of them. The environment represents the framework within which the agents interact with one another and with the environment itself. External variables and effects are also included here. Finally, the topology of an ABM defines how agents interact or connect with one another, detailing the different interaction mechanisms. The system’s behavior emerges from the collective actions of its agents [15,16]. A schematic representation of an agent-based model is displayed in Figure 1, where the autonomous agents, the environment, and the respective interactions are depicted.
In energy market modeling, agents typically represent diverse entities that act on the markets such as producers, consumers, storage units, distributors, speculators, or market regulators [17]. The environment is the market itself, where agents place bids or asks, and it can integrate external factors such as infrastructure issues or news [18]. Finally, the topology varies depending on the application. In peer-to-peer markets, the agents can interact directly, while in larger markets, such as the commodity ones, they interact only with the environment (the market), which represents the central point where all traders meet [19,20].
Agent modeling within ABMs is highly flexible, allowing for varying degrees of complexity and diverse roles depending on the study’s specific scope. In simpler simulation environments, agents can be categorized as producers, consumers, and speculators, each with predefined actions such as producers selling and consumers buying, while speculators introduce market volatility [21]. Conversely, to simulate broader or global trading scenarios, agents can represent a multitude of individuals or larger entities. For example, each agent can model an entire country, acting as a buyer, a consumer, or both in worldwide commodity trading [22].
ABMs’ scalability allows for simulating energy trades at various levels, including lower-level systems like emerging energy communities and peer-to-peer markets. These models are crucial for developing low-carbon energy systems by representing new agent types, such as prosumers, which are households or industries with both generation and consumption profiles. Such simulations help us understand how markets respond to the growing integration of renewable energy sources (RESs) [20,23].
Once agents’ roles are defined, their intrinsic behavior must be detailed. In commodity or futures markets, agents can mainly fall into four categories: fundamentalist, expecting prices to revert to intrinsic value; chartist, following short-term trends; mimetic, imitating others; and adaptive, combining elements of the prior three based on weighted decisions [24].
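To make these behavioral categories concrete, the following minimal Python sketch illustrates how such expectation rules could be encoded. The class, parameters, and coefficients are illustrative assumptions and do not correspond to any specific model from the literature.

```python
class TraderAgent:
    """Minimal sketch of the four behavioral archetypes described above.
    Class, method names, and coefficients are illustrative only."""

    def __init__(self, style, fundamental_value=50.0):
        self.style = style  # "fundamentalist", "chartist", "mimetic" or "adaptive"
        self.fundamental_value = fundamental_value
        # Adaptive agents weight the other three rules
        self.weights = {"fundamentalist": 1 / 3, "chartist": 1 / 3, "mimetic": 1 / 3}

    def expected_price(self, price_history, last_market_order):
        p = price_history[-1]
        # Fundamentalist: expect reversion towards the perceived intrinsic value
        fundamentalist = p + 0.1 * (self.fundamental_value - p)
        # Chartist: extrapolate the most recent short-term trend
        chartist = p + (p - price_history[-2]) if len(price_history) > 1 else p
        # Mimetic: imitate the direction of the last observed order (+1 buy, -1 sell)
        mimetic = p * (1 + 0.01 * last_market_order)
        if self.style == "fundamentalist":
            return fundamentalist
        if self.style == "chartist":
            return chartist
        if self.style == "mimetic":
            return mimetic
        # Adaptive: weighted combination of the three rules
        rules = {"fundamentalist": fundamentalist, "chartist": chartist, "mimetic": mimetic}
        return sum(self.weights[k] * v for k, v in rules.items())

# Example: an adaptive agent forming a price expectation
agent = TraderAgent("adaptive")
print(agent.expected_price(price_history=[48.0, 49.0], last_market_order=+1))
```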
A significant strength of ABMs is their ability to integrate techniques that make agents adaptive and goal-oriented. This means agents can learn and adjust their trading strategies based on current market conditions to maximize an objective [6]. The techniques and algorithms that enable this learning are discussed further in Section 5.
Finally, ABMs show various strengths when compared to traditional models. Indeed, they provide a more realistic description both of microeconomic behavior and of the direct interactions among the various agents, allowing the effects of both market and non-market dynamics to be included [25].
In the next section, the applications of ABMs in the specific field of energy commodity markets are detailed and the different frameworks employed are explained in terms of both strengths and weaknesses.

3. Applications of ABMs in Energy Commodity Markets

Agent-based models are a useful tool for analyzing, modeling, and understanding economic and social systems, such as the markets, by taking into account their complexity and dynamics [26]. Their key advantage lies in their ability to model decentralized decision making, heterogeneous agent behaviors, and the interactions between these agents over time. Their applications to energy markets are multiple and various, depending on the scope of the study, the commodity market analyzed, and the methods adopted. However, all the models found in the literature address one commodity market at a time. Indeed, no model provides concurrent simulation of different markets or takes into account the connections or cross-relationships that intrinsically exist between diverse energy commodities [14].
In terms of the commodity analyzed, there is not an even distribution of articles focusing on the various energy commodities. In Figure 2, the results of a bibliometric analysis that quantifies the distribution of ABM studies across different energy commodities are displayed. The majority of the studies focus on the electricity markets, reflecting both data availability and research priorities, given the large penetration of RESs and the paradigm shift that they are experiencing. This is followed by natural gas and carbon allowance studies, while less employed commodities such as coal and hydrogen are rarely explored. This uneven distribution is also reflected in analysis of the various real-world applications.
Three main categories of modeling approaches can be distinguished: optimization, equilibrium, and simulation. Optimization models pursue a single objective or multiple objectives for the whole system, and are therefore not suitable when a single agent’s behavior is investigated. On the other hand, equilibrium models are formulated as bi-level optimization problems, where each agent is optimized individually at the upper level, while the lower-level problem optimizes the total system [27]. Finally, simulation models are more flexible, and any kind of trading strategy can be implemented in these systems. Therefore, each agent is modeled individually as autonomous and its action policy can be either a fixed set of rules or an adaptive behavior based on past experience [28].
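For exposition, the bi-level structure underlying the equilibrium category can be sketched as follows; the notation is generic and not tied to any specific cited formulation:

```latex
\begin{align}
\text{Upper level (one problem per agent } i\text{):}\quad
  & \max_{x_i \in X_i} \; \pi_i\!\left(x_i, \lambda\right), \\
\text{Lower level (market/system clearing):}\quad
  & \min_{d \in D} \; C\!\left(d;\, x_1, \dots, x_N\right)
    \quad \text{s.t. supply--demand balance} \; (\rightarrow \lambda),
\end{align}
```

where x_i denotes agent i’s bidding decision, π_i its profit, d the system dispatch, C the total system cost, and λ the market-clearing prices obtained as the dual variables of the balance constraint in the lower-level problem.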
The different applications of ABMs in the field of energy commodity markets can be mainly divided into three different categories:
  • Definition of trading strategies;
  • Generation of market scenarios;
  • Evaluation of market designs.

3.1. Definition of Trading Strategies

The effectiveness of trading strategies can be evaluated through ABMs; such strategies can also be explored and tested on novel market designs before their implementation. Equilibrium models usually belong to this category. They are formulated as bi-level optimization problems, and particularly interesting for the generation of new trading strategies is the simulation of the Nash equilibrium. This is a condition where no market player benefits from a unilateral deviation from its current strategy, since each player’s strategy is the best response to the strategies of the others [29].
For example, Dehghanpour et al. used ABMs combined with dynamic Bayesian networks to help generation companies determine short-term strategic bidding schemes in the power balancing market. The agents, which are generation companies, iteratively updated their bidding behavior based on observed market prices and rival strategies. The learning dynamics allowed the system to converge to a Nash equilibrium, where no agent could profit from unilateral deviation [30]. A similar approach has also been tested in the definition of trading strategies in the day-ahead electricity market. Gao et al. implemented agents equipped with anticipatory capabilities, allowing them to forecast other agents’ actions and adapt their own accordingly, thus incorporating multi-agent learning and predictive opponent modeling [31]. These traditional ABM approaches typically rely on predefined rules or heuristic search algorithms for agents to find optimal responses within the simulated market.
More recently, similar studies have been conducted by including adaptive bidding strategies. This has been possible thanks to the integration of machine learning techniques. One example is the study conducted by Liang et al., where a reinforcement learning model was employed to define the bidding strategies of generation companies. Each agent was modeled as a learning agent, receiving rewards based on profits, and learning over time the optimal responses to changing price signals and rival behaviors. Also in this case the simulation converges to the Nash equilibrium [32].
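As an illustration of this kind of learning bidder, the following sketch implements a tabular Q-learning agent whose reward is its realized profit. The state discretization, markup actions, and parameters are assumptions made here for clarity and are not those of the cited studies.

```python
import random
from collections import defaultdict

class QBiddingAgent:
    """Sketch of a learning bidder: tabular Q-learning over discretized market
    states and markup actions, with profit as the reward signal."""

    def __init__(self, markups=(0.0, 0.05, 0.10, 0.20), alpha=0.1, gamma=0.95, epsilon=0.1):
        self.markups = markups              # possible bid markups over marginal cost
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)         # Q[(state, action)] -> estimated value

    def act(self, state):
        # Epsilon-greedy choice between exploring and exploiting
        if random.random() < self.epsilon:
            return random.choice(self.markups)
        return max(self.markups, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # Standard Q-learning update towards reward + discounted best next value
        best_next = max(self.q[(next_state, a)] for a in self.markups)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

# One interaction step: the agent bids, the (simulated) market clears, profit is the reward
agent = QBiddingAgent()
state = "high_demand"                      # discretized market state
markup = agent.act(state)
cleared, price, cost = True, 60.0, 45.0    # outcome returned by the market environment
profit = (price - cost) if cleared else 0.0
agent.learn(state, markup, reward=profit, next_state="low_demand")
```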
Further studies include Ye et al., who optimized the bidding strategy of a single producer in the electricity commodity market using ML and ABMs, outperforming state-of-the-art methods [33]. Jain et al. also equipped autonomous agents with optimization-refined bidding strategies for the electricity spot market, some of which used artificial neural networks (ANNs) for real-time decision making based on past experience [34,35].

3.2. Scenario Generation

The generation of market scenarios represents one of the main applications of ABMs in energy markets. These allow for analysis of the possible market evolutions, to explore the agents’ bidding behavior, and also, more recently, to generate data to train intensive machine learning algorithms. Different modeling techniques can be employed for this purpose.
One of the first examples found in the literature regards the generation of scenarios with zero-intelligence agents, which randomly submit bid and ask offers. It emerges that when they are implemented with budget constraints, the market prices reach an equilibrium and the market efficiency is close to that achieved with real traders [36].
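A minimal sketch of such budget-constrained zero-intelligence trading is shown below; the matching rule and parameters are illustrative assumptions, written here for clarity rather than reproducing the setup of [36].

```python
import random

def zi_constrained_session(buyer_values, seller_costs, rounds=200, price_cap=200):
    """Budget-constrained zero-intelligence trading sketch: buyers never bid above
    their valuation, sellers never ask below their cost, and crossing bid/ask pairs
    trade at the midpoint. Purely illustrative."""
    trades = []
    for _ in range(rounds):
        value = random.choice(buyer_values)
        cost = random.choice(seller_costs)
        bid = random.uniform(0, value)          # budget constraint on the buyer side
        ask = random.uniform(cost, price_cap)   # budget constraint on the seller side
        if bid >= ask:                          # crossing orders clear
            trades.append((bid + ask) / 2)
    return trades

prices = zi_constrained_session(buyer_values=[80, 90, 100], seller_costs=[40, 50, 60])
print(f"{len(prices)} trades, mean price {sum(prices) / max(len(prices), 1):.2f}")
```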
A more recent example is the study conducted by Xu et al., which developed a comprehensive ABM of the day-ahead electricity market [37]. When coping with energy spot markets, it is important to also take into account state regulators, which play a big role in setting price ceilings and penalties when market constraints are not respected. These can be explicitly modeled as agents in ABMs in order to enable other participants to learn optimal behavior under regulatory constraints. In the study by Xu et al., various types of realistic agents were implemented, each with its own bidding logic, namely, the prosumers, generation companies, retailers, and the independent system operator (ISO), which must ensure reliable power delivery. This research tested how demand-side flexibility, through proactive prosumer participation, impacted market efficiency and volatility [37]. Similarly, Fatras et al. included a market regulator agent in the electricity spot market, modeled after the Danish utility regulator. This sets regulatory constraints and penalizes non-compliant agents, demonstrating how regulators can act as agents modifying the other actors’ dynamics [38]. Differently, Bose et al. integrated upper and lower trading price limits directly into the local energy market clearing process. In this way, RL agents learn strategies that comply with these regulatory constraints while optimizing their tasks [39]. These ABM scenarios provide insights into emergent macro-level phenomena that arise from micro-level agent interactions, such as price volatility, market power, and the impact of new market entrants.
ABMs can also employ an event-based approach, where agents adjust their market positions in response to news events affecting the commodity chain, allowing for the generation of scenarios that reveal trader reactions to unforeseen conditions [17]. Similarly, frameworks for integrating exogenous political and economic variables into ABM models for commodity markets have been developed. For example, Osoba et al. used RL-driven agents that dynamically respond to policy variations to show the market’s ability to adapt to legislative or regulatory changes [40]. On the other hand, Zheng et al. modeled foreign policy as agents in the simulation environment. These represent the government institution which sets regulatory rules, while the other RL-based agents must adapt to the political and economic interventions [41].
By integrating artificial neural networks and reinforcement learning, it is possible to further enhance the simulation potentialities of ABMs. Harder et al. implemented a model composed of agents with adaptive bidding strategies that can maximize their profit in various market designs. Before taking any decision, the agents evaluate past outcomes, forecasts of loads and market prices, and their current marginal cost. By receiving a reward when they achieve a positive profit and a penalization when there is a lost opportunity due to unsold capacity, the agents interact to produce a simulated scenario that is able to reflect market liquidity and identify cases of market manipulation in the electricity futures market [42]. Similarly, Miskiw et al. leveraged RL to analyze agent behavior in the German electricity market. Their ABM provided transparency into why agents made specific decisions, offering regulators and researchers insights into the trade-offs influencing strategic bidding behavior [43].

3.3. Market Design Evaluation

Agent-based models have also been developed to test market designs before significant changes in the system or even before their implementation. One example of market design evaluation regards the study carried out by Wu et al., which aims to examine the design of the new Chinese electricity balancing market. The ABM framework is combined with a multi-criteria decision analysis (MCDA) and different balancing schemes are evaluated in the current conditions of China’s electricity market. The model is able to capture the strategic behavior of generators, and the simulations help assess which balancing rules would lead to higher price efficiency and reduce imbalance penalties. The study shows that performance depends heavily on the mix of renewable generation, agent incentives, and coordination mechanisms [44].
For this purpose and to improve electricity market design under the effect of high penetration of renewable energy sources, where new figures such as the prosumers are emerging, more complex ABM architectures can also be employed. For example, Shafie et al. developed a multi-layer ABM to study electricity market design. This model is composed of a wholesale market layer, where renewable power producers optimize bidding strategies under uncertainty, and a second layer of customers, such as plug-in electric vehicle owners taking part in demand response programs. The interactions between these layers are modeled using incomplete information, capturing the high uncertainty of resource variability and customer behavior [45]. In addition, Dehghanpour et al. developed a hierarchical multi-agent model for designing electricity markets with price-based demand response. The model is composed of a first level of a retail agent purchasing energy from the wholesale market and selling it to consumers, and a lower level where agents optimize their consumption based on the retail prices. These employ ML methods to develop a model of the aggregate load, reducing the uncertainty in the agents’ decision-making process [46]. These studies underline the potentialities of ABMs when coping with multi-level interactions and diverse sources of uncertainty for improving the design of realistic electricity markets.
Beyond electricity, ABMs are extensively used for carbon emission allowance trading, particularly for designing suitable schemes in new markets like China and India, where cap-and-trade systems are still developing [47,48]. European carbon markets, established in 2004, offer a mature example of such cap-and-trade systems for reducing CO2 emissions [49,50].
These ABMs model energy-intensive industries, government, and consumers, simulating various trading schemes with different allowance allocation rules, penalties, and subsidies [47,48]. The various agents, interacting based on the specific trading scheme, aim to maximize profit and adjust carbon reduction technology investments and production plans accordingly. This provides a bottom-up view of Emission Trading System dynamics and shows how allowance allocations, trading constraints, and penalty levels influence firm investment in low-carbon technologies [51]. Similar studies for sulphur dioxide (SO2) emissions in China aim to achieve an even emission reduction across regions, with energy-intensive firms as key agents [52].
Differently, other studies focus on the simulation of the possible evolution of the emission trading market, when other emission reduction policies are also considered. Such studies are based on the hypothesis that firms have three different options to reduce their emissions, namely, output regulation, the adoption of low-carbon technologies, and the trading of emission allowances. The model shows how industries balance these options depending on their state, market conditions, and expectations for allowance prices. The final aim is to simulate how the agents’ actions impact the market of carbon allowances and the diffusion of low-carbon technologies. These models help policymakers simulate how firms adapt over time and how policy mixes influence long-term market evolution and technology diffusion [53,54].

4. Platforms and Tools for ABMs in Energy Markets

For each different application and scope, a tailored agent-based model can be developed and implemented by detailing the various agents that interact with a custom environment [55]. However, since the fundamental framework of ABMs is similar in all the different applications, a wide range of open-source agent-based modeling and simulation tools can be found in the literature. These have been developed over the years and can be adapted based on the specific study that is being carried out. They differ in terms of usability, scalability, performance, and programming language in which they are implemented [11,56]. In particular, they vary in the following:
  • Code language and software architecture (Java vs. Python; monolithic vs. modular);
  • Agent design, including behavioral templates, decision trees, rule-based, and learning-enabled logic;
  • Support for grid topology, load flow modeling, and stochastic demand/supply profiles;
  • Coupling capabilities with external optimization solvers, ML libraries, or data streams;
  • Flexibility in configuring market mechanisms, e.g., pay-as-clear vs. pay-as-bid auctions, and nodal vs. zonal pricing.
These tools are particularly useful since they represent a benchmark in this modeling framework and, in addition, they allow even people without programming skills, or without the time to build an ABM from scratch, to conduct studies on energy markets.
Focusing on the simulation frameworks that find application in the modeling and simulation of energy commodity markets, unlike those applied to purely financial markets, these are designed to also take into account the technical constraints of the electrical grid and the production and consumption profiles. The main tools commonly employed are summarized in Table 1, where they are reported in alphabetical order. They are characterized in terms of the programming language in which they have been implemented, the properties of the agents that interact, the original scope of the tool, and the year in which they were issued.
The simulation tools reported were developed in different years, and it is not possible to identify a particular timeframe in which the majority of these models were issued. The first, AMES, was implemented in its first version in the early 2000s, while the most recent, ASSUME, has been developed only recently. Given the continuous growth that computational methods and resources have experienced in the past 20 years and that is still ongoing, new versions of these tools are constantly issued. These allow simulation issues to be fixed, and also enhance and speed up simulations by leveraging new technologies [67].
In addition, the tools do not differ only in the time in which they were developed but also in the place. They are open-source codes, available on GitHub, developed and updated by broad research groups all around the world [68]. For example, AMES has been developed in the USA at the Iowa State University, AMIRIS has been implemented at the German Aerospace Center, while EMLab belongs to a research group of the Delft University of Technology [67,69,70].
The majority of the tools were originally implemented in Java, since this programming language is open-source and intrinsically object-oriented. When developing agent-based models, it is crucial to adopt object-oriented coding, since it allows for the definition of modular and independent components, such as the agents [71]. Nevertheless, it is worth noting that two of the most recently developed ABM tools are implemented in Python. This programming language, as explained by Collier et al. in their study, is becoming ever-more popular in scientific modeling, leading to a natural transition from Java coding. This is mainly due to the ease of readability of code written in Python, the variety of libraries developed for data analysis, its natural integration with ML methods, and its intrinsic interoperability with other faster-compiled programming languages, such as C or C++ [72].
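The modular, object-oriented structure typical of these tools can be illustrated with the following plain-Python skeleton; all class names and values are illustrative and do not correspond to any of the tools listed in Table 1.

```python
class Agent:
    """Base class: each market participant is an independent, modular component."""
    def __init__(self, name):
        self.name = name

    def step(self, market):
        raise NotImplementedError   # each subclass defines its own decision rule

class Producer(Agent):
    def step(self, market):
        market.submit(self.name, side="sell", price=42.0, volume=10.0)

class Consumer(Agent):
    def step(self, market):
        market.submit(self.name, side="buy", price=45.0, volume=8.0)

class Market:
    """Environment: collects orders and clears them once per simulation step."""
    def __init__(self, agents):
        self.agents, self.orders = agents, []

    def submit(self, name, side, price, volume):
        self.orders.append((name, side, price, volume))

    def run(self, steps):
        for _ in range(steps):
            self.orders.clear()
            for agent in self.agents:   # topology: all agents interact via the market
                agent.step(self)
            # ... clearing logic would match buy and sell orders here ...

Market([Producer("gen1"), Consumer("load1")]).run(steps=3)
```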
The main difference between the various tools reported lies in their scope and their features, which vary depending on how agents and the environment are implemented. They can be mainly divided into three categories, for which some real-world applications are also reported:
  • Policy and regulation analysis: Some tools, such as AMES and ERCOT, are designed specifically to analyze policies, regulatory compliancies, and market structures. For example, AMES provides a flexible setting for assessing alternative market designs and bidding rules in wholesale electricity markets, allowing users to define complex market structures and observe emergent behavior. This tool has been employed to model the day-ahead and real-time US electricity markets, helping researchers and regulators to study how the locational marginal pricing is formed, how power grid congestion affects prices, and how electricity producers might behave strategically when placing their bids [57]. ERCOT focuses on dynamic modeling of wholesale power markets, with detailed representations of dispatchable and non-dispatchable generators and load-serving entities to evaluate specific market rules. It has been applied to the Texas wholesale energy market and employed in academic and transmission system operator (TSO) studies to test market rule changes [61].
  • Trading strategies and price formation: Other tools are used to analyze various trading strategies, trader behaviors, and price formation mechanisms. MASCEM, for example, offers agents with adaptive learning capabilities to determine optimal bidding strategies based on market context, helping in the study of complex price formation dynamics. It has been applied in Iberian bidding studies, with agents using optimization, ML, and game theory, as a support tool in regulator decisions [64,73]. On the other hand, MATREM simulates day-ahead and futures electricity markets, enabling detailed analysis of generator and retailer interactions. This tool is mainly employed in research studies, focusing on feasibility assessments in energy markets [65].
  • Renewable energy integration: The more recently developed tools, such as ASSUME, AMIRIS, and EMLab, focus on the effect that the integration of renewable energy sources is having and can have on the energy markets, enabling researchers and experts to cope with the energy transition, with its investment risks, and to ensure the necessary capacity within the power markets. For example, ASSUME is designed to analyze new electricity market designs under high-RES-penetration scenarios [59], while AMIRIS supports the modeling of dispatching and simulation of market prices with different energy actors, including policy providers. The latter has also been employed in real-world applications in Germany and Austria to simulate annual hourly dispatch under policy scenarios, and to analyze battery storage bidding, renewable subsidy impacts, and market price formation [58]. In addition, EMLab, used in multiple research studies, explores the long-term effects of interacting energy and climate policies, explicitly modeling power companies making investment decisions under imperfect information [60].
In addition, by exploiting the scalability that ABMs offer, the tools are tailored to different segments of the energy markets. For example, among the models that focus on small-scale markets and localized systems, it is possible to find the Grid Singularity tool, which is designed to simulate microgrids, energy communities, and peer-to-peer trading [62]. The MATREM simulator, which focuses on the demand-side response at the consumer level [65], also belongs to this category. On the other hand, tools like PowerACE and EMLab have been implemented to simulate large-scale (such as national) and interconnected markets with a multitude of agents [60,66]. The scalability of ABMs is notable not only in terms of the dimension of the market analyzed, but also in terms of the time horizon. Indeed, tools like AMES, ERCOT, and MATREM are employed to simulate short- to mid-term market dynamics, ranging from real time to some months [57]. On the other hand, the EMLab and PowerACE simulators are specifically developed to provide long-term analysis, with multi-decade planning scenarios with annual resolution and embedded stochastic investment models [60,66].
The different tools also show various implementations in terms of agents’ behavior. Firstly, they differ in terms of the entities that the different agents model. These can be power plants, traders, consumers, or even nations, as detailed in Table 1, depending on the market of focus and on the simulated horizon. Furthermore, there is also a difference in the strategies adopted by the agents. These have traditionally been implemented as deterministic trading strategies, such as predefined bidding or investment rules, which depend on different parameters that can be tuned by the user or through a calibration procedure. Some agents may also exploit heuristic learning-based strategies, which are tuned on simple feedback [64,66]. On the other hand, through the integration of machine learning and reinforcement learning into ABMs, in tools such as ASSUME, EMLab, and Grid Singularity, it is possible to define agents with adaptive trading strategies, based on price forecasts, competitor modeling, or behavior replication [59,60,62]. The potentialities of this innovative and promising development are detailed in the following section.

5. Machine Learning Integration with ABMs

The potentialities of ABMs can be further enhanced by integrating different machine learning techniques into the simulation framework. This represents a recent and promising area of research. Indeed, while ABMs have been employed since the end of the 20th century, ML algorithms have seen substantial diffusion only recently, made possible by the rising computational resources available [74].
Among the numerous ML techniques, reinforcement learning (RL) appears to be the most suitable for integration with ABMs. This method derives from a Markov decision process (MDP), where an agent learns a policy π(a|s) that maximizes the cumulative expected reward in an environment. This is modeled through the definition of states s, actions a, transition probabilities between the states P(s′|s, a), and a reward function R(s, a) [75]. In this article, RL is considered broadly, including not only classical RL algorithms but also deep reinforcement learning (DRL) methods, which blend deep learning’s function approximation with RL’s decision making.
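For reference, the objective maximized by the agent and the canonical value-based update rule can be written in standard textbook notation (not tied to any of the cited studies) as

```latex
\begin{align}
J(\pi) &= \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, R(s_t, a_t)\right], \\
Q(s_t, a_t) &\leftarrow Q(s_t, a_t)
  + \alpha\!\left[R(s_t, a_t) + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t)\right],
\end{align}
```

where γ ∈ [0, 1) is the discount factor and α the learning rate; DRL methods replace the tabular Q(s, a) with a neural network approximation.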
There are numerous analogies between ABMs and RL that make the technique highly suitable for integration. Firstly, in both cases the simulations involve agents interacting with an environment, making RL compatible with the modularity of autonomous ABM agents. Secondly, the scope of RL is to determine an optimal decision-making sequence in order to maximize a reward, similarly to the sequential decisions agents make in ABMs. Finally, RL algorithms can adapt to new information in real time, allowing the agents to adapt their strategies based on the evolution of the environment [76,77].
Numerous different RL techniques and algorithms can be found in the literature, and a taxonomy of the methods that have been developed is shown in Figure 3, where the various categories are depicted in orange and the different algorithms in blue. The RL world can be mainly divided into two branches: model-free and model-based methods. Model-based methods assume that the agent knows or can learn the model of the environment [12,78]. These can be split into two categories, depending on whether the model of the environment is known a priori or the agent has to learn it solely from experience. The majority of the algorithms belong to the latter branch, such as Imagination-Augmented Agents (I2A), Model-Based Model-Free (MBMF), and Model-Based Value Expansion (MBVE). On the other hand, model-free methods do not try to access the model of the environment, making them easier to implement and more commonly employed [77,78]. These can be mostly divided again into two branches, namely, policy optimization and Q-learning methods. In policy optimization methods, the algorithms try to find the best policy by redefining the policy at each step while maximizing the expected total reward. To this area belong, for example, the Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), and Advantage Actor–Critic (A2C) algorithms, where the optimization is performed on-policy. On the other hand, in Q-learning the agent learns to perform the best action by improving the value function at each step. Some algorithms employed in these cases are the Deep Q-Network (DQN), Categorical DQN—51 atom (C51), and Hierarchical Experience Management (HEM). In addition, Q-learning has been combined with policy optimization in a class of algorithms called actor–critic methods, to which belong the Deep Deterministic Policy Gradient (DDPG), Twin Delayed Deep Deterministic Policy Gradient (TD3), and Soft Actor–Critic (SAC) algorithms [77,78].
The majority of the studies that integrate ABMs with RL exploit model-free methods, since in these applications the environment is commonly the market itself, for which it is not possible to develop a model. In addition, they belong to the area of deep reinforcement learning, which allows one of the biggest limitations of RL in ABMs, called the curse of dimensionality, to be overcome. This arises when coping with high-dimensional problems and with continuous state and action spaces, due to the exponential increase in computational complexity [79,80]. The introduction of a deep neural network (DNN) into the RL architecture avoids the need for a custom state representation by approximating the state–action function and generalizing these problems. Nevertheless, the combination of RL methods with ABMs introduces various mathematical challenges. The two most significant are partial observability and the convergence issues of the RL algorithm. Firstly, energy markets often do not exhibit full observability but only partial observability. Therefore, the agents lack full information about the environment state, such as other agents’ strategies or plans, requiring more complex models like partially observable MDPs (POMDPs). Secondly, DNNs require independent and identically distributed data to be trained, while RL, dealing with sequential decision-making problems, usually employs highly correlated and non-stationary input data, such as those of the markets. This may cause issues in the convergence of the training process [81]. To cope with this problem, different techniques have been adopted, such as an experience replay buffer or target networks. In the first of these, the buffers are used to store past experiences and then sample from them, prioritizing the most important ones [82]. The second consists of employing a separate target network, where the state–action values are updated at slower intervals to reduce oscillations [77,83].
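Both stabilization techniques can be sketched in a few lines; the snippet below is a minimal illustration (uniform rather than prioritized sampling, and a soft target update), not the exact mechanisms used in the cited studies.

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience replay sketch: past transitions are stored and sampled in random
    mini-batches to break the temporal correlation of sequential market data."""
    def __init__(self, capacity=10000):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size):
        return random.sample(self.memory, min(batch_size, len(self.memory)))

def soft_update(target_params, online_params, tau=0.005):
    """Target-network sketch: target parameters track the online network slowly,
    reducing oscillations in the learning targets."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target_params, online_params)]

buffer = ReplayBuffer()
buffer.push(state=[0.2, 0.8], action=1, reward=5.0, next_state=[0.3, 0.7])
batch = buffer.sample(batch_size=32)
target = soft_update(target_params=[0.0, 0.0], online_params=[1.0, -1.0])
```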
The applications of DRL in the field of ABMs for energy commodity markets are numerous and various. In addition to the applications already explored in Section 3, they can be employed to improve the understanding of energy markets by making better forecasts, integrating adaptive bidding strategies, and analyzing cooperative and competitive emerging behaviors [75,84]. Indeed, numerous studies show that integrating RL into ABMs significantly enhances agent adaptability, forecasting accuracy, and market efficiency modeling.
In addition, in order to model the different entities that act on a market, studies in this field focus on multi-agent deep reinforcement learning (MADRL). This is an extension of RL, where multiple agents can learn, cooperate, communicate, and take decisions in a shared environment [77,85]. One of the first examples that can be found in the literature is the study conducted by Lincoln et al., where MADRL is introduced to model the competition between different participants [86]. Another example is the study of Lussange et al., where each agent represents a corporate entity or an individual trader and can learn both to forecast the price and to trade autonomously. These two tasks are fulfilled through two different model-free RL algorithms that interact with the central market, modeled as an order book. The agents using DRL achieve a higher cumulative profit compared to heuristic-based traders, also showing better market responsiveness [87]. MADRL techniques are also employed in the research carried out by Harder et al. Here, the aim is to test new wholesale electricity market designs before implementation. Each agent represents a market participant and the TD3 algorithm is used. This algorithm, as shown in Figure 3, belongs to the family of model-free RL. This implementation allows adaptive bidding strategies to be developed that maximize profit and outperform non-learning agents, with a 15% improvement in profit optimization [42].
To assess the potentialities of MADRL compared to traditional rule-based and single-agent models, an interesting study was carried out by Harrold et al., where DRL was employed to control a microgrid. In particular, a rule-based model, and single-agent and multi-agent settings were developed by exploiting the DDPG algorithm in order to highlight the improvement that RL achieves and to compare the use of a single global controller versus a distributed multiple-agent controller. Here, the aim was to reduce the energy costs by trading with other external microgrids and an aggregator. It emerged that a multi-agent setting allowed for better control of the different microgrid components, and that an RL-based model enabled the microgrid energy costs to be reduced by up to 13% compared to a rule-based baseline, while also demonstrating better load balancing across the various grid components [88].
MADRL has also been employed to analyze the effect of algorithmic trading on energy markets. By employing a Deep Q-Network (DQN) model on a simplified day-ahead power market, it emerges that power generation companies are able to determine the bidding strategy that maximizes long-term profit. They are able to exploit the available information more effectively, which also unintentionally leads to a collusive strategy. Hence, the role of policymakers and market designers becomes more important and fundamental to ensure fully competitive markets [89].
Another important aspect of MADRL is how communication between the agents is pursued. One of the most common and most successfully employed techniques is called centralized training and decentralized execution. In this approach, the communication between agents is not restricted during the training phase, and hence any available information (such as other agents’ policies) can be used. On the other hand, when executing the learned policies, the agents can only use the information they have available. This communication protocol has the great advantage of ensuring a stationary environment in the training phase [90]. This represents a crucial point for multi-agent systems, since the interactions between multiple agents keep modifying the environment, meaning that initially good policies do not perform equally well in the future [91].
In summary, the integration of RL with ABMs determines numerous improvements. Firstly, it allows for modeling of adaptive and strategic agents, leading to more realistic emergent behaviors. Secondly, it generates more accurate predictions of market dynamics under uncertain conditions. Finally, it enables rigorous testing of novel market designs and policies before real-world implementation.
Nevertheless, even if MADRL has great potential and has already been implemented in various studies, many challenges and limitations arise from this methodology. Firstly, the majority of the studies are limited to a few market participants, since the scalability of these methods is a significant limitation [77]. Secondly, in model-free algorithms the agents’ policies change with the training process, leading to a non-stationary environment from the single-agent perspective. This causes learning stability challenges and does not allow for direct use of past experience. Therefore, researchers are focusing on the development of techniques such as centralized training–decentralized execution, which allows the training process to be stabilized [92].

6. Challenges and Limitations of ABMs for Energy Markets

ABMs show a great variety of possible purposes and the combination with other techniques, such as reinforcement learning, allows their potentialities and fields of application to be further increased. Nevertheless, they have different limitations and challenges that still need to be tackled and overcome [73]. These can be mainly divided into four categories:
  • Cross-relationships between energy commodities;
  • Calibration and validation approaches;
  • Scaling-up issue;
  • Exploration–exploitation trade-off dilemma.

6.1. Cross-Relationships Between Energy Commodities

The various ABM models found in the literature cope with one energy commodity market at a time and do not provide concurrent simulation of the different markets, as noted in Section 3. This represents one of the biggest limitations of ABMs for energy markets, since no model takes into account the cross-relationships that intrinsically exist between the various energy commodities, which inevitably influence the prices and evolution of the markets.
There are multiple reasons why researchers have never developed a multi-market ABM that accounts for the various energy commodities’ connections. Firstly, this greatly increases the model complexity and raises multiple design challenges, since energy markets are complex and full of non-linear feedback loops. Therefore, it is extremely complex to model and predict how the changes in one market can cascade through others. Also, it is problematic to define how the actions of one agent in one market can influence the attitude of other participants in another market [93,94].
This limitation also derives from a data availability and integration issue. The various markets, even if deeply interconnected, have different liquidity and volatility, meaning the data are available with different time resolutions. In addition, the data often come from various sources, with different quality and standardization protocols. Therefore, this makes it difficult to create one single simulation of the different commodities, with the various markets being difficult to directly compare and to calibrate simultaneously [95].
The potential solutions to this gap come from different methodologies employed by researchers to simulate complex systems. One of the most interesting future research directions regards an inter-market information exchange. In this case, for each commodity market an ABM is developed but the various models are built to exchange information, such as policy changes or market imbalances, which may influence other agents’ actions [96,97]. In addition, the recent advancements in RL can help overcome this limitation. Indeed, even by keeping the different market simulations separate, the agents can learn the complex cross-relationships that exist between the various commodities and take them into account in their actions [98].

6.2. Calibration and Validation

There is no standard or recognized way to validate ABMs, especially when dealing with models aiming to analyze new market designs or generate market scenarios. Therefore, different researchers have tried to shed light on the topic and define some guidelines to properly validate such models. In particular, when dealing with ABM validation, it is important to underline that these techniques cannot accurately tell whether the developed model is a correct description of the complex and unknown real-world data generation process, but allow understanding of whether a model is a poor description of it [99].
The validation procedure can be divided into two different steps. The first regards the calibration and estimation of the parameters, which is performed in-sample, so that the model replicates the real-world data. The second deals with the actual validation out-of-sample, and evaluates whether the simulated data resemble some statistical properties of the real world [100,101].
Starting from the calibration to real data, according to Fagiolo et al. this procedure can be pursued with two methodologies, namely, the indirect inference method and the Bayesian approach. These methodologies estimate the models’ parameters by minimizing the distance between some statistical properties of real and simulated data. In addition, the Bayesian approach, which also integrates prior knowledge to the problem, provides a probabilistic interpretation of the estimated parameters [99].
On the other hand, in order to validate the out-of-sample data, different techniques are available [102]. Firstly, it is possible to pursue a pattern-based validation, where it is checked whether a simulated time series resembles the temporal patterns of the actual one. This can be achieved for example by exploiting the Generalized Subtracted L-divergence method, which determines whether the model reproduces the distribution of time changes of real-world data [99,103]. For example, in the study conducted by Williams et al. an agent-based model of residential electricity demand was validated against real electricity consumption data and it was checked whether demand patterns and seasonal variations were replicated [104]. In addition, it is possible to compare the causal structures found in the real-world data with those derived from the ABM [99,105]. Finally, various statistical features can be evaluated, both for real and simulated prices, and compared. These include, for example, the distribution of the number of consecutive days of increasing prices and decreasing prices, the distribution of price volatilities at different timescales, and various autocorrelation metrics, such as the distribution of autocorrelation of the trading volumes or autocorrelation of the logarithmic returns of prices [87,106]. This kind of validation was employed in a study by Shinde et al., where the model was validated with the aim of demonstrating that the macro-level properties emerging from the micro-level agent interactions statistically resembled the real-world observations. This was achieved by comparing the simulated market data in terms of, e.g., price volatility, trading volumes, bid–ask spread, and number of trades with the actual data features of the intra-day electricity markets [107].
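As a concrete illustration of this statistical-feature comparison, the following sketch computes a few of the metrics mentioned above for an observed and a simulated price series. The metric set and the synthetic placeholder data are illustrative only; studies such as [87,106] employ richer feature collections and formal distributional tests.

```python
import numpy as np

def stylized_facts(prices, lag=1):
    """Compute a small set of validation features: volatility of log returns, their
    lag-1 autocorrelation, and the mean length of streaks of consecutive price
    increases or decreases."""
    prices = np.asarray(prices, dtype=float)
    log_ret = np.diff(np.log(prices))
    autocorr = np.corrcoef(log_ret[:-lag], log_ret[lag:])[0, 1]
    up_days = (np.diff(prices) > 0).astype(int)
    change_points = np.flatnonzero(np.concatenate(([1], np.diff(up_days), [1])))
    streak_lengths = np.diff(change_points)
    return {"volatility": float(np.std(log_ret)),
            "autocorr_log_returns": float(autocorr),
            "mean_streak_length": float(np.mean(streak_lengths))}

# Illustrative comparison between an observed series and an ABM-generated one
rng = np.random.default_rng(0)
real_prices = 50 * np.exp(np.cumsum(rng.normal(0, 0.02, 500)))    # placeholder for market data
abm_prices = 50 * np.exp(np.cumsum(rng.normal(0, 0.025, 500)))    # placeholder for simulated data
print(stylized_facts(real_prices))
print(stylized_facts(abm_prices))
# Validation then checks whether the simulated features fall within an acceptable
# distance of the real ones, e.g., through distributional distance measures.
```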

6.3. Scaling Up

The majority of the studies found in the literature that apply ABMs to energy markets focus on single-agent setups or on markets with few participating agents. Indeed, even when considering large-scale markets, the different participants are classified and clustered into a few aggregated agents in the model [73,77]. This phenomenon is caused by the limited scalability of ABMs, which represents one of their major limitations.
The challenges that prevent the scaling up of ABMs to more complex systems are numerous and are due to both technical and methodological reasons. Firstly, as the number of interacting agents or the complexity of their interactions increases, the memory required grows exponentially. This means that the processing and computational power may not be sufficient [108]. This aspect recalls the curse of dimensionality of reinforcement learning introduced in Section 5, which arises when dealing with high-dimensional problems, where the state and action spaces expand or are continuous. In these cases, the memory required to store transitions grows exponentially, and so does the computational complexity of the model. This issue has been partially overcome with the introduction of deep reinforcement learning [79,80].
Nevertheless, advancements in computational resources are reducing the level of criticality of these scalability issues. For example, high-performance computing (HPC) and graphics processing units (GPUs) enable significant acceleration in the computational time of agent-based simulations. GPUs are tailored for tasks involving many independent computations, such as updating agent states or running multiple simulation iterations at the same time [109]. This allows the simulated models to significantly increase the number of agents and also to increase the complexity of individual behaviors.
Moreover, thanks to cloud computing platforms, researchers can access scalable and vast computational power without a significant hardware investment upfront. These platforms allow the deployment of distributed ABMs, where the simulation workload can be divided across multiple virtual machines. This enables both the execution of computationally intensive models and the running of extensive parameter sensitivity analyses, crucial for understanding complex emergent behaviors [110].
In addition, advancements in asynchronous communication protocols and optimized data structures are allowing for more efficient distributed computations, even in cases where agents communicate in real time and their actions depend on those of the others [108].
Finally, from a methodological point of view, scaling up the model raises the challenge of maintaining model accuracy. Indeed, adding more agents to the model and including more complex behavior can amplify errors and inaccuracies, especially when the agents’ interactions are dynamic or context-dependent [100,111]. To tackle this issue and prove the robustness of the ABM developed, two different approaches have been suggested, namely, ensemble modeling and data assimilation. The first consists of running multiple versions of the same model with slightly different initial conditions or parameters (e.g., the random seed). The second integrates real-time or periodic data into the model during its runtime to adjust its state and improve its accuracy [108].
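A minimal sketch of the ensemble approach, with an illustrative placeholder model, could look as follows; only the random seed is varied here, but in practice initial conditions and parameters would also be perturbed.

```python
import random
import statistics

def run_abm(seed, steps=100):
    """Placeholder for a full ABM run; returns a toy emergent metric (average price)."""
    rng = random.Random(seed)
    return sum(50 + rng.gauss(0, 5) for _ in range(steps)) / steps

# Ensemble modeling: rerun the same model with different seeds and inspect the spread
results = [run_abm(seed) for seed in range(20)]
print(f"mean {statistics.mean(results):.2f}, stdev {statistics.stdev(results):.2f}")
```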

6.4. Exploration–Exploitation Trade-Off Dilemma

Although well known in classical optimization problems, the exploration–exploitation dilemma has arisen more recently in ABMs, where it represents a significant challenge, especially when agents learn from the environment or have adaptive behaviors [112]. The agents need to find a trade-off between exploring, and hence learning more about the environment, and exploiting, namely, following the most promising strategy according to the experience gained. If agents explore too much, the model may take a long time to converge to realistic behavior; on the other hand, if they exploit too much, they may miss potentially better strategies or solutions [83,113].
Different approaches have been proposed to address this dilemma, which can be grouped mainly into two methodologies applicable both to classical ABMs and to those that integrate reinforcement learning [113]. The first is the adoption of an adaptive exploration rate. This rate is high at the beginning, allowing the agents to explore, and decreases later, so that known strategies are exploited as knowledge of the environment increases [114]. In energy commodity market simulations, this means that energy trading agents try a wider range of strategies at the beginning, in the exploration phase, when market conditions are uncertain. Then, as the agents gather more data on price fluctuations and competitor behaviors, the exploration rate decreases, leading them to converge on historically optimal strategies, as in the sketch below.
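The sketch below shows one common way to implement such an adaptive rate: an epsilon-greedy agent chooses among a fixed set of bidding strategies, with epsilon decaying from 1 towards a small floor as experience accumulates. The decay schedule, the number of strategies, and the reward model are illustrative assumptions rather than a prescription from the cited works.

```python
import numpy as np

rng = np.random.default_rng(1)

n_strategies = 5
q_values = np.zeros(n_strategies)          # estimated average profit of each strategy
counts = np.zeros(n_strategies)            # number of times each strategy was tried

epsilon, epsilon_min, decay = 1.0, 0.05, 0.995

for episode in range(2_000):
    # Explore with probability epsilon, otherwise exploit the best-known strategy.
    if rng.random() < epsilon:
        action = int(rng.integers(n_strategies))
    else:
        action = int(np.argmax(q_values))

    # Hypothetical reward: each strategy has a different (unknown) mean profit.
    reward = rng.normal(loc=0.5 * action, scale=1.0)

    # Incremental average update of the chosen strategy's value estimate.
    counts[action] += 1
    q_values[action] += (reward - q_values[action]) / counts[action]

    # Exploration rate starts high and decays as knowledge of the market accumulates.
    epsilon = max(epsilon_min, epsilon * decay)
```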
The second approach concerns the implementation of heuristic rules, which balance the exploration–exploitation trade-off based on changes in the environment and the rewards achieved [115]. For energy market ABMs, such rules may translate into agents increasing their exploration rate if they experience significant losses or if market volatility exceeds a certain threshold; in this way, they search for new profitable strategies better suited to the current market conditions. Conversely, if an agent keeps achieving high profits, a heuristic rule might suggest exploiting the current successful strategy. These rules provide a useful means for agents to deal with the complexity and unpredictability of energy commodity markets, where no universal optimal strategy can be defined.
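A possible form of such a heuristic is sketched below: exploration is boosted after significant losses or when volatility spikes and is otherwise allowed to decay while the current strategy keeps performing. The thresholds, the boost size, and the decay factor are hypothetical values chosen for illustration.

```python
def adjust_exploration(epsilon: float, recent_profits: list, market_volatility: float,
                       loss_threshold: float = -100.0, vol_threshold: float = 15.0,
                       boost: float = 0.3, floor: float = 0.05) -> float:
    """Heuristic exploration adjustment with illustrative thresholds.

    Exploration is increased after significant losses or volatility spikes, so the
    agent searches for strategies better suited to the new market regime; otherwise
    it slowly decays towards a small floor while the current strategy performs well.
    """
    if sum(recent_profits) < loss_threshold or market_volatility > vol_threshold:
        return min(1.0, epsilon + boost)
    return max(floor, epsilon * 0.99)

# Example: a losing streak in a volatile market triggers renewed exploration.
new_epsilon = adjust_exploration(epsilon=0.05,
                                 recent_profits=[-40.0, -55.0, -30.0],
                                 market_volatility=22.0)
```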

7. Conclusions

Agent-based models represent a valuable and flexible methodology for modeling and analyzing energy markets. They provide a bottom-up approach in which heterogeneous agents, developed as independent entities with custom strategies, interact with the environment and with one another. These features make it possible to replicate both the multitude of stakeholders that characterize energy markets and the dynamics of these systems.
ABMs have been successfully applied to energy commodity markets in a variety of applications. In particular, they have been employed both to test game theory schemes, such as the Nash equilibrium, and to define new trading strategies which can, on the one hand, increase traders' or companies' profits and, on the other, improve the electric system's reliability. Finally, they can also be effectively exploited to evaluate new market designs and to generate possible market scenarios. This is a crucial aspect for today's energy markets, which have seen the establishment of new trading schemes and paradigms driven by the introduction of new policies for the integration of renewable energy sources and by recent disruptive geopolitical events. Therefore, the need for tools that allow market scenarios to be simulated and trading strategies to be tested is attracting increasing attention from both energy companies and market regulators.
The potential of ABMs is further enhanced by the integration of reinforcement learning, and particularly deep reinforcement learning, into their modeling framework. The inclusion of adaptive and evolving behaviors in the agents allows a realistic evolution of the system to be achieved and enables analysis of the different communication, cooperation, and competition schemes established between them. Particularly interesting is the use of multi-agent models, where the agents are placed in a non-stationary environment that evolves based on the actions of multiple market participants. This, however, remains an area requiring further research, since the agents' learning phase can become unstable, leading to convergence issues.
However, the learning stability of MADRL models is not the only challenge to deal with. One of the most discussed areas that still requires standardization is the validation of these models. In addition, the scaling up of ABMs and the exploration–exploitation dilemma, arising from agents with adaptive behavior, have been presented as limitations of this modeling approach. These challenges certainly represent possible future developments and interesting research areas in the field of ABMs, but addressing them will require innovative methodologies and schemes, as well as increased computational resources to deal with ever-larger, more complex, and more realistic agent-based models.

Author Contributions

Conceptualization, A.N. and S.T.; methodology, S.T. and L.G.; formal analysis, A.N.; investigation, S.T. and M.L.; resources, F.C. and A.N.; writing, S.T. and F.G.; visualization, L.G. and M.L.; supervision, F.G.; project administration, F.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

Authors Fabio Casamatta, Laura Gamba and Marco Lorenzo were employed by the company A2A S.p.A. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
A2C: Advantage Actor–Critic
ABM: Agent-based model
AI: Artificial intelligence
ANN: Artificial neural network
C51: Categorical DQN—51 atom
DDPG: Deep Deterministic Policy Gradient
DL: Deep learning
DNN: Deep neural network
DQN: Deep Q-Network
DRL: Deep reinforcement learning
GPU: Graphics processing unit
HEM: Hierarchical Experience Management
HPC: High-performance computing
I2A: Imagination-Augmented Agents
ISO: Independent system operator
MADRL: Multi-agent deep reinforcement learning
MBMF: Model-Based Model-Free
MBVE: Model-based value expansion
MCDA: Multi-criteria decision analysis
MDP: Markov decision process
ML: Machine learning
POMDP: Partially observable Markov decision process
PPO: Proximal Policy Optimization
RES: Renewable energy source
RL: Reinforcement learning
SAC: Soft Actor–Critic
TD3: Twin Delayed Deep Deterministic Policy Gradient
TRPO: Trust Region Policy Optimization
TSO: Transmission system operator

References

  1. Axelrod, R. The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration; Princeton University Press: Princeton, NJ, USA, 1997. [Google Scholar]
  2. Epstein, J.M. Generative Social Science: Studies in Agent-Based Computational Modeling; Princeton University Press: Princeton, NJ, USA, 2012. [Google Scholar]
  3. Axtell, R.L.; Farmer, J.D. Agent-based modeling in economics and finance: Past, present, and future. J. Econ. Lit. 2022, 63, 197–287. [Google Scholar] [CrossRef]
  4. Kapeller, M.L.; Jäger, G. Threat and anxiety in the climate debate—An agent-based model to investigate climate scepticism and pro-environmental behaviour. Sustainability 2020, 12, 1823. [Google Scholar] [CrossRef]
  5. Tesfatsion, L. Agent-based computational economics: Growing economies from the bottom up. Artif. Life 2002, 8, 55–82. [Google Scholar] [CrossRef]
  6. Weidlich, A.; Veit, D. A critical survey of agent-based wholesale electricity market models. Energy Econ. 2008, 30, 1728–1759. [Google Scholar] [CrossRef]
  7. Bjarghov, S.; Löschenbrand, M.; Saif, A.I.; Pedrero, R.A.; Pfeiffer, C.; Khadem, S.K.; Rabelhofer, M.; Revheim, F.; Farahmand, H. Developments and challenges in local electricity markets: A comprehensive review. IEEE Access 2021, 9, 58910–58943. [Google Scholar] [CrossRef]
  8. Wu, J.; Mohamed, R.; Wang, Z. An agent-based model to project China’s energy consumption and carbon emission peaks at multiple levels. Sustainability 2017, 9, 893. [Google Scholar] [CrossRef]
  9. Chen, P.; Wu, Y.; Zou, L. Distributive PV trading market in China: A design of multi-agent-based model and its forecast analysis. Energy 2019, 185, 423–436. [Google Scholar] [CrossRef]
  10. Ringler, P.; Keles, D.; Fichtner, W. Agent-based modelling and simulation of smart electricity grids and markets–a literature review. Renew. Sustain. Energy Rev. 2016, 57, 205–215. [Google Scholar] [CrossRef]
  11. Abar, S.; Theodoropoulos, G.K.; Lemarinier, P.; O’Hare, G.M. Agent Based Modelling and Simulation tools: A review of the state-of-art software. Comput. Sci. Rev. 2017, 24, 13–33. [Google Scholar] [CrossRef]
  12. Perera, A.; Kamalaruban, P. Applications of reinforcement learning in energy systems. Renew. Sustain. Energy Rev. 2021, 137, 110618. [Google Scholar] [CrossRef]
  13. Bellomo, M.; Trimarchi, S.; Niccolai, A.; Lorenzo, M.; Casamatta, F.; Grimaccia, F. A GAN Data Augmentation approach for trading applications in European Carbon Emission Allowances. In Proceedings of the 2023 IEEE International Conference on Environment and Electrical Engineering and 2023 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe), Madrid, Spain, 6–9 June 2023; pp. 1–5. [Google Scholar]
  14. Lin, B.; Su, T. Does COVID-19 open a Pandora’s box of changing the connectedness in energy commodities? Res. Int. Bus. Financ. 2021, 56, 101360. [Google Scholar] [CrossRef] [PubMed]
  15. Doumpos, M.; Zopounidis, C.; Gounopoulos, D.; Platanakis, E.; Zhang, W. Operational research and artificial intelligence methods in banking. Eur. J. Oper. Res. 2023, 306, 1–16. [Google Scholar] [CrossRef]
  16. Railsback, S.F.; Grimm, V. Agent-Based and Individual-Based Modeling: A Practical Introduction; Princeton University Press: Princeton, NJ, USA, 2019. [Google Scholar]
  17. Cheng, S.F.; Lim, Y.P. Designing and Validating an Agent-Based Commodity Trading Simulation; Technical Report; Singapore Management University: Singapore, 2009. [Google Scholar]
  18. Vanfossan, S.; Dagli, C.H.; Kwasa, B. An agent-based approach to artificial stock market modeling. Procedia Comput. Sci. 2020, 168, 161–169. [Google Scholar] [CrossRef]
  19. Atkins, K.; Marathe, A.; Barrett, C. A computational approach to modeling commodity markets. Comput. Econ. 2007, 30, 125–142. [Google Scholar] [CrossRef]
  20. Reis, I.F.; Lopes, M.A.; Antunes, C.H. Energy transactions between energy community members: An agent-based modeling approach. In Proceedings of the 2018 International Conference on Smart Energy Systems and Technologies (SEST), Sevilla, Spain, 10–12 September 2018; pp. 1–6. [Google Scholar]
  21. Cheng, S.F.; Lim, Y.P. An agent-based commodity trading simulation. In Proceedings of the Twenty-First IAAI Conference, Pasadena, CA, USA, 14–16 July 2009. [Google Scholar]
  22. Giulioni, G. An agent-based modeling and simulation approach to commodity markets. Soc. Sci. Comput. Rev. 2019, 37, 355–370. [Google Scholar] [CrossRef]
  23. Zhou, Y.; Lund, P.D. Peer-to-peer energy sharing and trading of renewable energy in smart communities: Trading pricing models, decision-making and agent-based collaboration. Renew. Energy 2023, 207, 177–193. [Google Scholar] [CrossRef]
  24. Kanzari, D.; Said, Y.R.B. A complex adaptive agent modeling to predict the stock market prices. Expert Syst. Appl. 2023, 222, 119783. [Google Scholar] [CrossRef]
  25. Castro, J.; Drews, S.; Exadaktylos, F.; Foramitti, J.; Klein, F.; Konc, T.; Savin, I.; van Den Bergh, J. A review of agent-based modeling of climate-energy policy. Wiley Interdiscip. Rev. Clim. Chang. 2020, 11, e647. [Google Scholar] [CrossRef]
  26. Mignot, S.; Vignes, A. The many faces of agent-based computational economics: Ecology of agents, bottom-up approaches and paradigm shift. OEconomia 2020, 10, 189–229. [Google Scholar] [CrossRef]
  27. Reis, I.F.; Gonçalves, I.; Lopes, M.A.; Antunes, C.H. A multi-agent system approach to exploit demand-side flexibility in an energy community. Util. Policy 2020, 67, 101114. [Google Scholar] [CrossRef]
  28. Ventosa, M.; Baıllo, A.; Ramos, A.; Rivier, M. Electricity market modeling trends. Energy Policy 2005, 33, 897–913. [Google Scholar] [CrossRef]
  29. Dai, T.; Qiao, W. Finding equilibria in the pool-based electricity market with strategic wind power producers and network constraints. IEEE Trans. Power Syst. 2016, 32, 389–399. [Google Scholar] [CrossRef]
  30. Dehghanpour, K.; Nehrir, M.H.; Sheppard, J.W.; Kelly, N.C. Agent-based modeling in electrical energy markets using dynamic Bayesian networks. IEEE Trans. Power Syst. 2016, 31, 4744–4754. [Google Scholar] [CrossRef]
  31. Gao, X.; Chan, K.W.; Xia, S.; Zhang, X.; Zhang, K.; Zhou, J. A multiagent competitive bidding strategy in a pool-based electricity market with price-maker participants of WPPs and EV aggregators. IEEE Trans. Ind. Inform. 2021, 17, 7256–7268. [Google Scholar] [CrossRef]
  32. Liang, Y.; Guo, C.; Ding, Z.; Hua, H. Agent-based modeling in electricity market using deep deterministic policy gradient algorithm. IEEE Trans. Power Syst. 2020, 35, 4180–4192. [Google Scholar] [CrossRef]
  33. Ye, Y.; Qiu, D.; Sun, M.; Papadaskalopoulos, D.; Strbac, G. Deep reinforcement learning for strategic bidding in electricity markets. IEEE Trans. Smart Grid 2019, 11, 1343–1355. [Google Scholar] [CrossRef]
  34. Jain, P.; Saxena, A. A Multi-Agent based simulator for strategic bidding in day-ahead energy market. Sustain. Energy Grids Netw. 2023, 33, 100979. [Google Scholar] [CrossRef]
  35. Nicolini, C.; Gopalan, M.; Lepri, B.; Staiano, J. Hopfield networks for asset allocation. In Proceedings of the 5th ACM International Conference on AI in Finance, Brooklyn, NY, USA, 14–17 November 2024; pp. 19–26. [Google Scholar]
  36. Gode, D.K.; Sunder, S. Allocative efficiency of markets with zero-intelligence traders: Market as a partial substitute for individual rationality. J. Political Econ. 1993, 101, 119–137. [Google Scholar] [CrossRef]
  37. Xu, S.; Chen, X.; Xie, J.; Rahman, S.; Wang, J.; Hui, H.; Chen, T. Agent-based modeling and simulation for the electricity market with residential demand response. CSEE J. Power Energy Syst. 2020, 7, 368–380. [Google Scholar]
  38. Fatras, N.; Ma, Z.; Jørgensen, B.N. An agent-based modelling framework for the simulation of large-scale consumer participation in electricity market ecosystems. Energy Inform. 2022, 5, 47. [Google Scholar] [CrossRef]
  39. Bose, S.; Kremers, E.; Mengelkamp, E.M.; Eberbach, J.; Weinhardt, C. Reinforcement learning in local energy markets. Energy Inform. 2021, 4, 7. [Google Scholar] [CrossRef]
  40. Osoba, O.A.; Vardavas, R.; Grana, J.; Zutshi, R.; Jaycocks, A. Modeling agent behaviors for policy analysis via reinforcement learning. In Proceedings of the 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 14–17 December 2020; pp. 213–219. [Google Scholar]
  41. Zheng, S.; Trott, A.; Srinivasa, S.; Parkes, D.C.; Socher, R. The AI Economist: Taxation policy design via two-level deep multiagent reinforcement learning. Sci. Adv. 2022, 8, eabk2607. [Google Scholar] [CrossRef] [PubMed]
  42. Harder, N.; Qussous, R.; Weidlich, A. Fit for purpose: Modeling wholesale electricity markets realistically with multi-agent deep reinforcement learning. Energy AI 2023, 14, 100295. [Google Scholar] [CrossRef]
  43. Miskiw, K.K.; Staudt, P. Explainable Deep Reinforcement Learning for Multi-Agent Electricity Market Simulations. In Proceedings of the 2024 20th International Conference on the European Energy Market (EEM), Istanbul, Turkey, 10–12 June 2024; pp. 1–9. [Google Scholar]
  44. Wu, Z.; Zhou, M.; Zhang, T.; Li, G.; Zhang, Y.; Liu, X. Imbalance settlement evaluation for China’s balancing market design via an agent-based model with a multiple criteria decision analysis method. Energy Policy 2020, 139, 111297. [Google Scholar] [CrossRef]
  45. Shafie-Khah, M.; Catalão, J.P. A stochastic multi-layer agent-based model to study electricity market participants behavior. IEEE Trans. Power Syst. 2014, 30, 867–881. [Google Scholar] [CrossRef]
  46. Dehghanpour, K.; Nehrir, M.H.; Sheppard, J.W.; Kelly, N.C. Agent-based modeling of retail electrical energy markets with demand response. IEEE Trans. Smart Grid 2016, 9, 3465–3475. [Google Scholar] [CrossRef]
  47. Tang, L.; Wu, J.; Yu, L.; Bao, Q. Carbon emissions trading scheme exploration in China: A multi-agent-based model. Energy Policy 2015, 81, 152–169. [Google Scholar] [CrossRef]
  48. Tang, L.; Wu, J.; Yu, L.; Bao, Q. Carbon allowance auction design of China’s emissions trading scheme: A multi-agent-based approach. Energy Policy 2017, 102, 30–40. [Google Scholar] [CrossRef]
  49. Allen, P.; Varga, L. Modelling sustainable energy futures for the UK. Futures 2014, 57, 28–40. [Google Scholar] [CrossRef]
  50. D’Adamo, I.; Gastaldi, M.; Hachem-Vermette, C.; Olivieri, R. Sustainability, emission trading system and carbon leakage: An approach based on neural networks and multicriteria analysis. Sustain. Oper. Comput. 2023, 4, 147–157. [Google Scholar] [CrossRef]
  51. Wei, Y.; Liang, X.; Xu, L.; Kou, G.; Chevallier, J. Trading, storage, or penalty? Uncovering firms’ decision-making behavior in the Shanghai emissions trading scheme: Insights from agent-based modeling. Energy Econ. 2023, 117, 106463. [Google Scholar] [CrossRef]
  52. Peng, Z.S.; Zhang, Y.L.; Shi, G.M.; Chen, X.H. Cost and effectiveness of emissions trading considering exchange rates based on an agent-based model analysis. J. Clean. Prod. 2019, 219, 75–85. [Google Scholar] [CrossRef]
  53. Yu, S.m.; Fan, Y.; Zhu, L.; Eichhammer, W. Modeling the emission trading scheme from an agent-based perspective: System dynamics emerging from firms’ coordination among abatement options. Eur. J. Oper. Res. 2020, 286, 1113–1128. [Google Scholar] [CrossRef]
  54. Zhang, J.; Ge, J.; Wen, J. Agent-Based Modeling of Carbon Emission Trading Market With Heterogeneous Agents. SSRN Electron. J. 2021. [Google Scholar] [CrossRef]
  55. Shinde, P.; Amelin, M. Agent-based models in electricity markets: A literature review. In Proceedings of the 2019 IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), Chengdu, China, 21–24 May 2019; pp. 3026–3031. [Google Scholar]
  56. Tesfatsion, L. Software for Agent Based Computational Economics and Complex Adaptive Systems. 2024. Available online: https://faculty.sites.iastate.edu/tesfatsi/archive/tesfatsi/acecode.htm (accessed on 17 October 2024).
  57. El-adaway, I.H.; Sims, C.; Eid, M.S.; Liu, Y.; Ali, G.G. Preliminary attempt toward better understanding the impact of distributed energy generation: An agent-based computational economics approach. J. Infrastruct. Syst. 2020, 26, 04020002. [Google Scholar] [CrossRef]
  58. Schimeczek, C.; Nienhaus, K.; Frey, U.; Sperber, E.; Sarfarazi, S.; Nitsch, F.; Kochems, J.; El Ghazi, A.A. AMIRIS: Agent-based Market model for the Investigation of Renewable and Integrated energy Systems. J. Open Source Softw. 2023, 8, 5041. [Google Scholar] [CrossRef]
  59. Harder, N.; Miskiw, K.K.; Maurer, F.; Khanra, M.; Parag, P. ASSUME: Agent-Based Electricity Markets Simulation Toolbox. 2023. Available online: https://assume-project.de/ (accessed on 18 October 2024).
  60. Chappin, E.J.; de Vries, L.J.; Richstein, J.C.; Bhagwat, P.; Iychettira, K.; Khan, S. Simulating climate and energy policy with agent-based modelling: The Energy Modelling Laboratory (EMLab). Environ. Model. Softw. 2017, 96, 421–431. [Google Scholar] [CrossRef]
  61. Battula, S.; Tesfatsion, L.; McDermott, T.E. An ERCOT test system for market design studies. Appl. Energy 2020, 275, 115182. [Google Scholar] [CrossRef]
  62. Grid Singularity. 2024. Available online: https://gridsingularity.com/ (accessed on 17 October 2024).
  63. Pinto, T.; Vale, Z.; Sousa, T.M.; Praça, I.; Santos, G.; Morais, H. Adaptive learning in agents behaviour: A framework for electricity markets simulation. Integr. Comput.-Aided Eng. 2014, 21, 399–415. [Google Scholar] [CrossRef]
  64. Santos, G.; Pinto, T.; Praça, I.; Vale, Z. MASCEM: Optimizing the performance of a multi-agent system. Energy 2016, 111, 513–524. [Google Scholar] [CrossRef]
  65. Lopes, F. MATREM: An agent-based simulation tool for electricity markets. In Electricity Markets with Increasing Levels of Renewable Generation: Structure, Operation, Agent-Based Simulation, and Emerging Designs; Springer: Cham, Switzerland, 2018; pp. 189–225. [Google Scholar]
  66. Fraunholz, C.; Kraft, E.; Keles, D.; Fichtner, W. Advanced price forecasting in agent-based electricity market simulation. Appl. Energy 2021, 290, 116688. [Google Scholar] [CrossRef]
  67. Tesfatsion, L. The AMES Wholesale Power Market Test Bed. 2024. Available online: https://faculty.sites.iastate.edu/tesfatsi/archive/tesfatsi/AMESMarketHome.htm (accessed on 29 October 2024).
  68. GitHub. 2025. Available online: https://github.com/ (accessed on 8 January 2025).
  69. AMIRIS—The Open Agent-Based Electricity Market Model. 2024. Available online: https://www.dlr.de/en/ve/research-and-transfer/research-infrastructure/modelling-tools/amiris (accessed on 29 October 2024).
  70. Chappin, E. EMLab—Energy Modelling Laboratory. 2024. Available online: https://emlab.tudelft.nl/ (accessed on 29 October 2024).
  71. Mattsson, S.E.; Andersson, M.; Åström, K.J. Object-oriented modeling and simulation. In CAD for Control Systems; CRC Press: Boca Raton, FL, USA, 2020; pp. 31–69. [Google Scholar]
  72. Collier, N.T.; Ozik, J.; Tatara, E.R. Experiences in developing a distributed agent-based modeling toolkit with Python. In Proceedings of the 2020 IEEE/ACM 9th Workshop on Python for High-Performance and Scientific Computing (PyHPC), Atlanta, GA, USA, 13 November 2020; pp. 1–12. [Google Scholar]
  73. González-Briones, A.; De La Prieta, F.; Mohamad, M.S.; Omatu, S.; Corchado, J.M. Multi-agent systems applications in energy optimization problems: A state-of-the-art review. Energies 2018, 11, 1928. [Google Scholar] [CrossRef]
  74. Sutton, R.S. Reinforcement Learning: An Introduction; A Bradford Book; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
  75. Kell, A.J.; McGough, S.; Forshaw, M. Machine learning applications for electricity market agent-based models: A systematic literature review. arXiv 2022, arXiv:2206.02196. [Google Scholar]
  76. Ye, Y.; Papadaskalopoulos, D.; Yuan, Q.; Tang, Y.; Strbac, G. Multi-agent deep reinforcement learning for coordinated energy trading and flexibility services provision in local electricity markets. IEEE Trans. Smart Grid 2022, 14, 1541–1554. [Google Scholar] [CrossRef]
  77. Zhang, Z.; Zhang, D.; Qiu, R.C. Deep reinforcement learning for power system applications: An overview. CSEE J. Power Energy Syst. 2019, 6, 213–225. [Google Scholar]
  78. Achiam, J. Spinning Up in Deep Reinforcement Learning. 2018. Available online: https://github.com/openai/spinningup (accessed on 5 November 2024).
  79. Klein, T. Autonomous algorithmic collusion: Q-learning under sequential pricing. RAND J. Econ. 2021, 52, 538–558. [Google Scholar] [CrossRef]
  80. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef]
  81. Hernandez-Leal, P.; Kartal, B.; Taylor, M.E. A survey and critique of multiagent deep reinforcement learning. Auton. Agents Multi-Agent Syst. 2019, 33, 750–797. [Google Scholar] [CrossRef]
  82. Liu, X.; Zhu, T.; Jiang, C.; Ye, D.; Zhao, F. Prioritized experience replay based on multi-armed bandit. Expert Syst. Appl. 2022, 189, 116023. [Google Scholar] [CrossRef]
  83. Agostinelli, F.; Hocquet, G.; Singh, S.; Baldi, P. From reinforcement learning to deep reinforcement learning: An overview. In Proceedings of the Braverman Readings in Machine Learning, Key Ideas from Inception to Current State: International Conference Commemorating the 40th Anniversary of Emmanuil Braverman’s Decease, Boston, MA, USA, 28–30 April 2017; Invited Talks. Springer: Cham, Switzerland, 2018; pp. 298–328. [Google Scholar]
  84. Tampuu, A.; Matiisen, T.; Kodelja, D.; Kuzovkin, I.; Korjus, K.; Aru, J.; Aru, J.; Vicente, R. Multiagent cooperation and competition with deep reinforcement learning. PLoS ONE 2017, 12, e0172395. [Google Scholar] [CrossRef]
  85. Fang, Y.; Tang, Z.; Ren, K.; Liu, W.; Zhao, L.; Bian, J.; Li, D.; Zhang, W.; Yu, Y.; Liu, T.Y. Learning multi-agent intention-aware communication for optimal multi-order execution in finance. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 4003–4012. [Google Scholar]
  86. Lincoln, R.W.; Galloway, S.; Burt, G. Open source, agent-based energy market simulation with python. In Proceedings of the 2009 6th International Conference on the European Energy Market, Leuven, Belgium, 27–29 May 2009; pp. 1–5. [Google Scholar]
  87. Lussange, J.; Lazarevich, I.; Bourgeois-Gironde, S.; Palminteri, S.; Gutkin, B. Modelling stock markets by multi-agent reinforcement learning. Comput. Econ. 2021, 57, 113–147. [Google Scholar] [CrossRef]
  88. Harrold, D.J.; Cao, J.; Fan, Z. Renewable energy integration and microgrid energy trading using multi-agent deep reinforcement learning. Appl. Energy 2022, 318, 119151. [Google Scholar] [CrossRef]
  89. Aliabadi, D.E.; Chan, K. The emerging threat of artificial intelligence on competition in liberalized electricity markets: A deep Q-network approach. Appl. Energy 2022, 325, 119813. [Google Scholar] [CrossRef]
  90. Cao, D.; Hu, W.; Zhao, J.; Zhang, G.; Zhang, B.; Liu, Z.; Chen, Z.; Blaabjerg, F. Reinforcement learning and its applications in modern power and energy systems: A review. J. Mod. Power Syst. Clean Energy 2020, 8, 1029–1042. [Google Scholar] [CrossRef]
  91. Nguyen, T.T.; Nguyen, N.D.; Nahavandi, S. Deep reinforcement learning for multiagent systems: A review of challenges, solutions, and applications. IEEE Trans. Cybern. 2020, 50, 3826–3839. [Google Scholar] [CrossRef]
  92. Lowe, R.; Wu, Y.I.; Tamar, A.; Harb, J.; Pieter Abbeel, O.; Mordatch, I. Multi-agent actor-critic for mixed cooperative-competitive environments. Adv. Neural Inf. Process. Syst. 2017, 30, 6382–6393. [Google Scholar]
  93. Naeem, M.A.; Peng, Z.; Suleman, M.T.; Nepal, R.; Shahzad, S.J.H. Time and frequency connectedness among oil shocks, electricity and clean energy markets. Energy Econ. 2020, 91, 104914. [Google Scholar] [CrossRef]
  94. Rehman, M.U.; Naeem, M.A.; Ahmad, N.; Vo, X.V. Global energy markets connectedness: Evidence from time–frequency domain. Environ. Sci. Pollut. Res. 2023, 30, 34319–34337. [Google Scholar] [CrossRef]
  95. Avalos, F.; Huang, W.; Tracol, K. Margins and Liquidity in European Energy Markets in 2022; Technical Report; Bank for International Settlements: Basel, Switzerland, 2023. [Google Scholar]
  96. Bottecchia, L.; Lubello, P.; Zambelli, P.; Carcasci, C.; Kranzl, L. The potential of simulating energy systems: The multi energy systems simulator model. Energies 2021, 14, 5724. [Google Scholar] [CrossRef]
  97. Gao, X.; Knueven, B.; Siirola, J.D.; Miller, D.C.; Dowling, A.W. Multiscale simulation of integrated energy system and electricity market interactions. Appl. Energy 2022, 316, 119017. [Google Scholar] [CrossRef]
  98. Giorgi, F.; Herzel, S.; Pigato, P. A reinforcement learning algorithm for trading commodities. Appl. Stoch. Model. Bus. Ind. 2024, 40, 373–388. [Google Scholar] [CrossRef]
  99. Fagiolo, G.; Guerini, M.; Lamperti, F.; Moneta, A.; Roventini, A. Validation of agent-based models in economics and finance. In Computer Simulation Validation: Fundamental Concepts, Methodological Frameworks, and Philosophical Perspectives; Springer: Cham, Switzerland, 2019; pp. 763–787. [Google Scholar]
  100. An, L.; Grimm, V.; Turner II, B.L. Meeting grand challenges in agent-based models. J. Artif. Soc. Soc. Simul. 2020, 23. [Google Scholar] [CrossRef]
  101. Collins, A.; Koehler, M.; Lynch, C. Methods that support the validation of agent-based models: An overview and discussion. J. Artif. Soc. Soc. Simul. 2024, 27, 11. [Google Scholar] [CrossRef]
  102. Barde, S.; van Der Hoog, S. An Empirical Validation Protocol for Large-Scale Agent-Based Models. In Bielefeld Working Papers in Economics and Management No. 04-2017; SSRN (Elsevier): Amsterdam, The Netherlands, 2017; Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2992473 (accessed on 25 November 2024).
  103. Lamperti, F. An information theoretic criterion for empirical validation of simulation models. Econom. Stat. 2018, 5, 83–106. [Google Scholar] [CrossRef]
  104. Williams, B.L.; Hooper, R.; Gnoth, D.; Chase, J. Residential Electricity Demand Modelling: Validation of a Behavioural Agent-Based Approach. Energies 2025, 18, 1314. [Google Scholar] [CrossRef]
  105. Guerini, M.; Moneta, A. A method for agent-based models validation. J. Econ. Dyn. Control 2017, 82, 125–141. [Google Scholar] [CrossRef]
  106. Takahashi, S.; Chen, Y.; Tanaka-Ishii, K. Modeling financial time-series with generative adversarial networks. Phys. A Stat. Mech. Appl. 2019, 527, 121261. [Google Scholar] [CrossRef]
  107. Shinde, P.; Boukas, I.; Radu, D.; Manuel de Villena, M.; Amelin, M. Analyzing trade in continuous intra-day electricity market: An agent-based modeling approach. Energies 2021, 14, 3860. [Google Scholar] [CrossRef]
  108. An, L.; Grimm, V.; Sullivan, A.; Turner Ii, B.; Malleson, N.; Heppenstall, A.; Vincenot, C.; Robinson, D.; Ye, X.; Liu, J.; et al. Challenges, tasks, and opportunities in modeling agent-based complex systems. Ecol. Model. 2021, 457, 109685. [Google Scholar] [CrossRef]
  109. Richmond, P.; Chisholm, R.; Heywood, P.; Chimeh, M.K.; Leach, M. FLAME GPU 2: A framework for flexible and performant agent based simulation on GPUs. Softw. Pract. Exp. 2023, 53, 1659–1680. [Google Scholar] [CrossRef]
  110. Dong, D. Agent-based cloud simulation model for resource management. J. Cloud Comput. 2023, 12, 156. [Google Scholar] [CrossRef]
  111. Abbott, R.; Hadžikadić, M. Complex adaptive systems, systems thinking, and agent-based modeling. In Advanced Technologies, Systems, and Applications; Springer: Cham, Switzerland, 2017; pp. 1–8. [Google Scholar]
  112. Fruit, R. Exploration-Exploitation Dilemma in Reinforcement Learning Under Various Form of Prior Knowledge. Sciences et Technologies; CRIStAL UMR 9189. Ph.D. Thesis, Université de Lille 1, Villeneuve-d’Ascq, France, 2019. [Google Scholar]
  113. Guida, V.; Mittone, L.; Morreale, A. Innovative search and imitation heuristics: An agent-based simulation study. J. Econ. Interact. Coord. 2024, 19, 231–282. [Google Scholar] [CrossRef]
  114. Wei, P. Exploration-exploitation strategies in deep q-networks applied to route-finding problems. J. Phys. Conf. Ser. 2020, 1684, 012073. [Google Scholar] [CrossRef]
  115. Billinger, S.; Srikanth, K.; Stieglitz, N.; Schumacher, T.R. Exploration and exploitation in complex search tasks: How feedback influences whether and where human agents search. Strateg. Manag. J. 2021, 42, 361–385. [Google Scholar] [CrossRef]
Figure 1. Schematic representation of an agent-based model. The agents interact both between themselves and with the environment, which in this case is represented by the energy production and consumption entities and by the energy markets.
Figure 2. Bibliometric analysis of the existing literature about the application of ABMs to energy markets that quantifies the distribution of ABM studies across the different energy commodities.
Figure 3. Taxonomy of reinforcement learning methods and algorithms [78].
Table 1. Comparison of agent-based tools for modeling and simulation of energy commodity markets.

| Tool Name | Code Language | Agents' Properties | Scope | Year |
|---|---|---|---|---|
| AMES [57] | Java | Energy traders, transmission companies, and electrical grid | Analyze bidding of generating companies in wholesale markets | 2007–active |
| AMIRIS [58] | Java | Power plants, storage, traders, marketplaces, forecasters, and policy providers | Modeling of dispatching and simulation of market prices | 2017–active |
| ASSUME [59] | Python | Generation and demand-side agents | Analyze new market designs and dynamics in electricity markets | 2022–active |
| EMLab [60] | Java | Power companies with limited information that make imperfect investment decisions | Explore long-term effects of interacting energy and climate policies | 2010–active |
| ERCOT [61] | Java | Dispatchable and non-dispatchable generators, load-serving entities | Dynamic modeling of wholesale power markets | 2016–active |
| Grid Singularity [62] | Python | Scalable from individuals to nations | Simulate and optimize grid-aware energy markets | 2016–active |
| MASCEM [63,64] | Java | Emulate traders' activity | Determine the best strategy depending on the context | 2012–active |
| MATREM [65] | Java | Generators, retailers, aggregators, consumers, market, and system operators | Simulate day-ahead and futures electricity markets | 2012–active |
| PowerACE [66] | Java | Utility companies, regulators, storage units, and consumers | Long-term analysis of EU day-ahead energy markets | 2013–active |
