Article

Two-Stage Game-Based Charging Optimization for a Competitive EV Charging Station Considering Uncertain Distributed Generation and Charging Behavior

1 Suqian Power Supply Company, State Grid Jiangsu Electric Power Co., Ltd., Suqian 223800, China
2 School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China
* Author to whom correspondence should be addressed.
Batteries 2026, 12(1), 16; https://doi.org/10.3390/batteries12010016
Submission received: 4 November 2025 / Revised: 14 December 2025 / Accepted: 28 December 2025 / Published: 1 January 2026

Abstract

The widespread adoption of electric vehicles (EVs) has turned charging demand into a substantial load on the power grid. To satisfy the rapidly growing demand of EVs, the construction of charging infrastructure has received sustained attention in recent years. As charging stations proliferate, attracting EV users in a competitive charging market while optimizing the internal charging process is key to determining a charging station’s operational efficiency. This paper tackles this issue with the following contributions. Firstly, a simulation method based on prospect theory is proposed to capture EV users’ preferences in selecting charging stations. The selection behavior of EV users is simulated by establishing the coupling relationships among the transportation network, power grid, and charging network, together with a model of users’ preferences. Secondly, a two-stage joint stochastic optimization model for a charging station is developed that considers both charging pricing and energy control. At the first stage, a Stackelberg game is employed to determine the day-ahead optimal charging price in a competitive market. At the second stage, real-time stochastic charging control is applied to maximize the operational profit of the charging station considering renewable energy integration. Finally, a scenario-based Alternating Direction Method of Multipliers (ADMM) approach is introduced in the first stage for optimal pricing learning, while a simulation-based Rollout method is applied in the second stage to update the real-time energy control strategy based on the latest pricing. Numerical results demonstrate that the proposed method achieves a profit improvement of up to 33% over competing charging stations in a scenario with 1000 integrated EVs.

Graphical Abstract

1. Introduction

EVs are becoming increasingly popular worldwide, and the charging demand has emerged as a significant load in power grids. To satisfy the rapid growth of charging demand, the construction of charging stations has also received sustained attention in recent years. For example, by the end of August 2025, the total number of charging piles in China reached 17.348 million, marking a year-on-year increase of 53.5% [1]. This large number of charging piles further stimulates the development of charging stations [2]. Moreover, with the widespread adoption of wind and photovoltaic power generation, the energy acquisition modes of charging stations have become more diversified, and the overall charging scale continues to expand, thereby better meeting the charging demands of EVs.
However, the rapid proliferation of charging stations also intensifies market competition. To maximize the profit of a charging station, electricity pricing and energy control are two crucial factors [3,4]. Given that EV drivers are cost-sensitive, electricity pricing determines the scale of the potential controllable EV charging load, which in turn establishes the upper bound on the profit attainable by charging stations. Furthermore, with the continuous advancement of wind and photovoltaic power generation, for a given charging scale, aligning energy control as closely as possible with wind and solar output can maximize the actual obtainable profit. Therefore, it is of great importance to consider both aspects jointly.
This problem is challenging primarily for the following reasons. First, the deep coupling between electricity price and energy: the charging price influences EV users’ participation behavior and the overall demand response intensity, thereby affecting the energy control potential of charging stations; in turn, charging stations adjust their pricing strategies based on user demand and energy control capability to attract users and maximize profit. This mutual interaction introduces significant coupling into the system optimization process. Second, multi-stage stochastic decision-making: in addition to uncertain distributed generation, EV users exhibit randomness in their responses to charging prices, travel behaviors, and charging willingness, all of which influence the charging decision-making process. Meanwhile, decision-making must also account for the impact of current electricity prices on future system states, as well as the influence of present energy control actions on future control potential, which increases the complexity of charging optimization decisions. Finally, the impact of non-linear constraints: EV users’ responses to charging prices are influenced by multiple factors that usually exhibit non-linear characteristics, and the charging demand should align as closely as possible with the total output of distributed generation to reduce power purchases, while numerous discrete variables represent whether each individual EV user chooses to charge. This adds significant non-linearity and combinatorial complexity to the proposed problem.
In recent years, the operation optimization of charging stations has become an important research focus. The primary objectives are to enhance the operational profitability of charging stations, reduce energy consumption, and satisfy charging demands. The body of literature on charging station control can be broadly categorized into three groups. In the first group are models that only consider pricing. These models primarily maximize charging station profit and regulate wind and solar output through dynamic price adjustments. In [5], a day-ahead operational pricing scheme for charging stations is proposed that incorporates both centralized and decentralized approaches. This study assumes perfect knowledge of EV charging demand and neglects users’ price preferences. In [6], an optimal pricing method to guide and coordinate EV charging at stations is proposed to minimize the number of vehicles that remain uncharged. The model does not consider energy control of the charging station. In [7], a reinforcement-learning-based dynamic pricing system is proposed, which can adapt to user behavior and demand elasticity dynamically without requiring any prior knowledge of user behavior or price elasticity. In [8], a pricing strategy based on non-cooperative games and a dual-business model is proposed to mitigate direct price competition and enhance profitability. Unlike the above works, the present work not only considers pricing optimization but also joint energy control optimization. The second group comprises models that only consider energy control in the charging stations. These models primarily control energy usage to enhance profitability, satisfy users’ demand, and reduce electricity purchases from the main grid. In [9], a four-stage intelligent optimization and control algorithm for a charging station is proposed. 
In [10], considering distributed energy generation uncertainty, a chance-constrained optimization method is applied for charging and discharging control to improve cost performance. In [11], a consensus-based control strategy is proposed for frequency regulation of electric vehicle charging stations with limited communication between neighbors. As with the first group, the present work considers not only energy control but also pricing optimization. The third group combines the aforementioned two types of model. In [12], a charging navigation policy that considers the driving rate and electricity pricing is proposed to reduce traffic congestion and enhance the operational revenue of EVs. Ref. [13] jointly considers pricing of charging services, scheduling of EVs, and management of energy storage. In [14], a novel profit distribution adjustment model is proposed via a triangular fuzzy comprehensive evaluation; based on this model, collaborative and independent charging modes are established to enhance the fairness of profit distribution. However, the aforementioned models focus on the control and optimization of homogeneous charging stations, without considering the impact of competition among multiple charging stations on pricing and energy management. Moreover, the coupling relationship between pricing and energy control has usually been neglected. The present work considers not only the competition among charging stations but also the coupling between pricing and energy control. To solve the above models, researchers commonly utilize a range of approaches, such as mixed-integer linear programming (MILP) [15], dynamic programming (DP) [16], genetic algorithms (GA) [17], and particle swarm optimization (PSO) [18], which can provide stable optimization strategies and perform efficiently for well-defined scenarios.
However, these algorithms cannot address the inherent randomness in EV users’ willingness, travel behavior, or the management of uncertain distributed renewable energy. Furthermore, the deep coupling between pricing and energy control greatly increases the model solution complexity, which can reduce optimization efficiency or even prevent the derivation of feasible strategies. The present work introduces an efficient solution algorithm to overcome these difficulties with fast convergence speed.
In summary, joint pricing and energy control optimization for EV charging stations, considering their coupling relationship, has barely been discussed in the existing literature. Furthermore, an efficient algorithm for solving this joint optimization problem under uncertain distributed generation and charging behavior is lacking. Based on the above discussion, it is essential to study joint pricing and energy control for competitive charging stations considering uncertain distributed renewable energy and charging behavior. To tackle this problem, this work contributes in the following ways:
  • A simulation method for EV users’ charging station selection preferences is proposed, which considers the coupling relationships among the transportation network, power grid, and charging network. By incorporating the relationship between traffic speed and flow and utilizing origin–destination (OD) matrix analysis, EV travel trajectories are further simulated. Furthermore, based on prospect theory, an EV charging station selection preference model is established by characterizing the relationships among station pricing, distance, and charging cost, thereby enabling the simulation and analysis of EV users’ charging station selection behaviors.
  • Based on the simulation results of EV users’ charging station selection behavior, a two-stage joint stochastic optimization model for a charging station is established, considering both charging pricing and energy control. At the first stage, the charging price competition between a charging station and its competitors is modeled, and the optimal day-ahead pricing strategy is obtained through a Stackelberg game framework. At the second stage, real-time stochastic charging control is implemented to maximize the operational benefits of the charging station considering renewable energy integration.
  • A scenario-based ADMM method is developed for optimal pricing at the first stage. In this stage, energy control is performed using the latest strategy from the second stage to evaluate the potential operational benefit of the charging station under different pricing schemes, thereby determining the optimal charging price. At the second stage, various possible future charging demands of EVs under the optimal pricing strategy are fully considered, and a simulation-based Rollout method is employed to optimize the energy control strategy online. This approach enables stochastic matching with renewable energy outputs and reduces operational cost. Meanwhile, the updated control strategy is fed back to the first stage for continuous online price adjustment.
The rest of the paper is structured as follows. Section 2 provides an overview of the system. Section 3 introduces the simulation for EV users’ charging station selection preferences. Section 4 and Section 5 introduce the model and solution, respectively. Section 6 discusses the numerical results, and the paper concludes with a brief summary in Section 7.

2. System Overview

This paper considers the competitive scenario of charging stations under a typical transportation network scenario as shown in Figure 1. In this scenario, the urban space is modeled as a multi-layer coupled network composed of three key elements: the transportation network, the power network, and the charging network. Specifically, electric vehicles depart from different origin points, select their travel routes along the transportation network, and dynamically decide which charging stations to visit for recharging. Each charging station is located at a transportation hub node and is connected to the local power grid, while also being equipped with distributed renewable energy such as wind and solar power. The central challenge lies in achieving optimal coordination between operational profitability and on-site energy management under the conditions of intense market competition among charging stations, complex user decision-making behaviors, and the uncertain integration of renewable energy sources.
Based on the above scenario, this paper proposes a methodological architecture that is illustrated in Figure 2. The entire system consists of three functional modules: the charging selection simulation module, the pricing game module, and the energy control module, which together form a dynamic and closed-loop online optimization process. Firstly, the charging selection simulation module is built upon the coupled network and employs prospect theory to model the station selection and travel behaviors of EV users. It outputs real-time data on travel flows and charging demands, providing decision support for subsequent pricing and energy management. Secondly, the pricing game module employs a Stackelberg game combined with a scenario-based ADMM optimization algorithm to determine optimal charging prices online in a competitive charging market. Finally, the energy control module integrates on-site loads, renewable energy outputs, and electricity costs, and uses a simulation-based Rollout method to perform real-time charging energy management.
In summary, the EV user charging behavior data generated by the charging selection simulation module are fed into both the pricing and energy control modules. A bidirectional online interaction is established between the pricing and energy control modules: the energy control module schedules charging based on the latest pricing results, while the energy management performance (e.g., operational revenue and load status) actively feeds back to the pricing module, enabling adaptive real-time optimization of electricity prices and energy control strategy.

3. EV User Charging Selection Simulation

In this section, prospect theory is employed to simulate EV users’ charging station selection willingness and thereby obtain the future charging demand of each charging station over different time periods.

3.1. EV Travel Simulation

The dynamic traffic network is modeled using a graph-theoretic description of the road network topology [19]:
$$
\begin{aligned}
G &= (V, E, W) \\
V &= \{ v_x \mid x = 1, 2, \dots, n \} \\
E &= \{ e_{xy} \mid v_x \in V,\; v_y \in V,\; x \neq y \} \\
W &= \{ w_{xy}(t) \mid e_{xy} \in E \}
\end{aligned}
$$
where $G = (V, E, W)$ is the traffic road network, $V$ is the set of all nodes in $G$ with the total number of nodes denoted by $n$, $E$ is the set of all road segments in the road network, $v_x$ is a node of the road network, $e_{xy}$ is the road segment connecting nodes $x$ and $y$, and $w_{xy}(t) \in W$ is the road resistance at time $t$, which is calculated as follows:
$$
w_{xy}(t) =
\begin{cases}
t_0 \left( 1 + a \left( S_{xy}^t \right)^b \right), & 0 \le S_{xy}^t \le 1.0 \\
t_0 \left( 1 + a \left( 2 - S_{xy}^t \right)^b \right), & 1.0 < S_{xy}^t \le 2.0
\end{cases}
$$
where $S_{xy}^t = Q_{xy}^t / C$ is the road saturation, $Q_{xy}^t$ denotes the traffic flow between nodes $x$ and $y$ during period $t$, $C$ is the traffic capacity, $t_0$ is the free-flow travel time, and $a$ and $b$ are impedance influence factors.
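As an illustration, the piecewise impedance above can be evaluated directly. A minimal Python sketch follows; the parameter values used in the usage note (e.g., $a = 0.15$, $b = 4$) are common BPR-style defaults and are not taken from the paper.

```python
def road_impedance(t0, a, b, q, capacity):
    """Road resistance w_xy(t) of Eq. (2) as a function of saturation S = q / C.

    Classic BPR-style term below saturation, mirrored (2 - S) term for the
    oversaturated regime 1 < S <= 2.
    """
    s = q / capacity
    if 0.0 <= s <= 1.0:
        return t0 * (1.0 + a * s ** b)
    if 1.0 < s <= 2.0:
        return t0 * (1.0 + a * (2.0 - s) ** b)
    raise ValueError("saturation S must lie in [0, 2]")
```

For example, `road_impedance(10.0, 0.15, 4, 50.0, 100.0)` scales a free-flow time of 10 by the congestion term at $S = 0.5$.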
To analyze the temporal distribution characteristics of EV travel behavior throughout the day, the origin–destination (OD) matrix approach is employed. The OD matrix describes travel demand within a road network in transportation modeling: each element represents the number of trips from an origin node to a destination node within a given time interval. In practice, the OD matrix is typically derived from household travel surveys; this paper uses the data in [20] to generate it. Based on this, Monte Carlo sampling is used to assign each EV a corresponding origin–destination pair as well as its departure and return times. The resulting OD probability matrix is expressed as follows:
$$
P_{xy}^t = \frac{Q_{xy}^t}{\sum_{y=1}^{n} Q_{xy}^t}, \quad (1 \le x \le n,\; 1 \le y \le n,\; x \neq y)
$$
where $P_{xy}^t$ denotes the probability of an EV traveling from node $x$ to node $y$. The OD probability matrix of the traffic network, $P_T = [P_{xy}^t]_{n \times n}$, can then be derived [21]. After obtaining the OD probability matrix, the travel route between each origin–destination pair can be determined by sampling based on $P_T$.
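A minimal sketch of how the row-normalization of Eq. (3) and the Monte Carlo destination draw can be implemented; the trip-count matrix in the test below is a toy example, not survey data.

```python
import numpy as np

def od_probability_matrix(trip_counts):
    """Row-normalize the trip-count matrix Q into the OD probability matrix
    of Eq. (3): P[x, y] = Q[x, y] / sum_y Q[x, y], with a zero diagonal
    enforcing x != y."""
    q = np.asarray(trip_counts, dtype=float)
    np.fill_diagonal(q, 0.0)
    row_sums = q.sum(axis=1, keepdims=True)
    return np.divide(q, row_sums, out=np.zeros_like(q), where=row_sums > 0)

def sample_destination(p_matrix, origin, rng):
    """Monte Carlo draw of a destination node for a trip leaving `origin`."""
    return int(rng.choice(p_matrix.shape[1], p=p_matrix[origin]))
```

Repeatedly calling `sample_destination` after each arrival resamples a new OD pair, reproducing the full-day travel-chain generation described below.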
When generating EV travel chains, each EV $j$ is assigned an origin node $O_j$ and a destination node $D_j$, together with an initial departure time $t_o^j$. Based on the above simulation method, the driving trajectory of each EV $j$ can be obtained. Once an EV reaches its destination, a new OD pair is resampled from the OD matrix to characterize its full-day travel behavior. For EVs traveling on the road, the remaining energy of the $j$-th EV can be described as follows:
$$
SOC_{j,t} = SOC_{j,t-1} - \frac{d_{j,t} E_c}{e_j^{cap}}
$$
where $SOC_{j,t}$ is the remaining energy at time $t$, $E_c$ is the energy consumption per kilometre of the EV (unit: kWh/km), $d_{j,t}$ is the distance travelled by the $j$-th EV from $t-1$ to $t$ (unit: km), and $e_j^{cap}$ denotes the battery capacity of the $j$-th EV (unit: kWh). Therefore, $d_{j,t} E_c / e_j^{cap}$ is the SOC consumed during driving.
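The SOC update of Eq. (4) in code form; the default consumption (0.15 kWh/km) and battery capacity (60 kWh) are illustrative assumptions only, not parameters from the paper.

```python
def update_soc(soc_prev, distance_km, e_c=0.15, e_cap=60.0):
    """SOC update of Eq. (4): SOC_{j,t} = SOC_{j,t-1} - d_{j,t} * E_c / e_cap.

    e_c: consumption per km (kWh/km), e_cap: battery capacity (kWh);
    both defaults are illustrative placeholders.
    """
    return soc_prev - distance_km * e_c / e_cap
```

For example, driving 20 km with these defaults drains 20 × 0.15 / 60 = 0.05 of SOC.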

3.2. EV Users’ Charging Station Selection Behavior Based on Prospect Theory

The charging behavior of EV users typically occurs either before departure or after the completion of a trip when the remaining state-of-charge (SOC) of the EV is low. Under this condition, the user initiates the charging decision process and proceeds to the charging station selection stage.
Given the limited driving range associated with the current SOC, the EV first identifies the set of reachable charging stations within its feasible travel distance. Among these candidate stations, the user selects the optimal one by comprehensively evaluating multiple factors, including the charging price, travel distance to the station, and the queuing condition upon arrival. These factors jointly determine the user’s final charging choice and reflect the trade-off between economic cost, travel convenience, and service accessibility.
To precisely capture the “Bounded Rationality” of EV users in charging decisions, this study employs prospect theory (PT) to construct the station choice preference model [22]. Unlike traditional expected utility theory (EUT), the core concept of PT is that decision-makers evaluate choices based on the relative perception of gains and losses against a reference point, rather than absolute outcomes.
Prospect theory is a behavioral theory that explains decision-making under risk and uncertainty, making it well suited to describing choice behavior under uncertain conditions. In the framework of prospect theory, the comprehensive prospect value of user $j$ at charging station $i$ can be expressed as
$$
U_i^j = \omega_1 V(\Delta \lambda_i) + \omega_2 V(\Delta T_{i,j}^{travel}) + \omega_3 V(\Delta T_{i,j}^{wait})
$$
where $\omega_1$, $\omega_2$, and $\omega_3$ ($\omega_1 + \omega_2 + \omega_3 = 1$) are the attribute preference weights, representing the relative importance that each user assigns to charging price, travel distance, and waiting time, respectively. To obtain appropriate weight values, it is suggested to collect preference data from a large number of EV users regarding their choice of charging stations, including charging prices, travel times to the stations, and queuing times for charging. Statistical analysis of these data can reveal EV users’ preference ratios, i.e., the weight values for the three factors. The fundamental principle of prospect theory is its reference-point dependence: a user evaluates each attribute not by its absolute magnitude but by its deviation from the associated psychological reference point. In (5), the price deviation is $\Delta \lambda_i = \lambda_{ref} - \lambda_i$, where $\lambda_{ref}$ is the user’s expected price and $\lambda_i$ is the charging price offered by the $i$-th charging station. Similarly, the travel-time deviation is $\Delta T_{i,j}^{travel} = T_{ref}^{travel} - T_{i,j}^{travel}$ and the queuing-time deviation is $\Delta T_{i,j}^{wait} = T_{ref}^{wait} - T_{i,j}^{wait}$, where $T_{ref}^{travel}$ and $T_{ref}^{wait}$ denote the user’s expected travel and waiting times, and $T_{i,j}^{travel}$ and $T_{i,j}^{wait}$ denote the actual travel and waiting times if the $j$-th EV chooses the $i$-th charging station. Note that the reference values $\lambda_{ref}$, $T_{ref}^{travel}$, and $T_{ref}^{wait}$ can be obtained by collecting EV users’ preference data on charging prices, travel times, and queuing times; the averages of these data can serve as the reference values. The details of this comprehensive prospect value are introduced below.
PT utilizes an S-shaped value function $V(\Delta y)$ to map the objective deviation $\Delta y$ into a subjective psychological value. The function is
$$
V(\Delta y) =
\begin{cases}
(\Delta y)^{\alpha}, & \Delta y \ge 0 \\
-\kappa \left( -\Delta y \right)^{\beta}, & \Delta y < 0
\end{cases}
$$
where $\Delta y$ is the attribute deviation ($\Delta \lambda_i$, $\Delta T_{i,j}^{travel}$, or $\Delta T_{i,j}^{wait}$), $\alpha$ and $\beta$ ($0 < \alpha, \beta \le 1$) are the diminishing-sensitivity coefficients for gains and losses, respectively, and $\kappa$ ($\kappa > 1$) is the loss aversion coefficient, which indicates the sensitivity of EV users to losses [23].
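A sketch of the value function of Eq. (6); the default parameters $\alpha = \beta = 0.88$ and $\kappa = 2.25$ are the classic Tversky–Kahneman estimates, used here only as placeholders rather than values from the paper.

```python
def prospect_value(dy, alpha=0.88, beta=0.88, kappa=2.25):
    """S-shaped value function of Eq. (6): concave for gains (dy >= 0),
    convex and steeper for losses (loss aversion, kappa > 1)."""
    if dy >= 0:
        return dy ** alpha
    return -kappa * (-dy) ** beta
```

Note that `abs(prospect_value(-x)) > prospect_value(x)` for any `x > 0`, which is exactly the loss-aversion asymmetry the model relies on.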
If an EV travels from the origin node $O(j)$ to charging station $i$ along the path $P = \{x_0 = O(j), x_1, x_2, \dots, x_K = i\}$, the travel time for EV $j$ to reach charging station $i$ is
$$
T_{i,j}^{travel} = \frac{d_{i,j}}{v_j}
$$

$$
d_{i,j} = \sum_{k=0}^{K-1} l_{x_k, x_{k+1}}
$$
where $v_j$ denotes the average speed of the $j$-th EV and $l_{x_k, x_{k+1}}$ denotes the distance between path nodes $x_k$ and $x_{k+1}$.
The expected waiting time can be obtained from queuing theory. As the M/M/c queuing model is a common model for the charging arrival process and service times under the assumption of independence among EV users [24,25], this paper adopts it; the method also holds under a different queuing model. In M/M/c, the first M denotes the arrival process of EV users, which follows a Poisson distribution; the second M denotes the service time, which follows an exponential distribution; and c denotes the number of service counters, i.e., the number of charging piles [26]. Suppose charging station $i$ is equipped with $c_i$ charging piles, the average arrival rate of EV users is $\varrho_i$, and the average service rate of a single charging pile is $\mu_i$. The waiting time can then be expressed as
$$
T_{i,j}^{wait} = \frac{(c_i \rho_i)^{c_i} \rho_i}{c_i! \, (1 - \rho_i)^2 \, \varrho_i} P_i^0
$$

$$
P_i^0 = \left[ \sum_{k=0}^{c_i - 1} \frac{1}{k!} \left( \frac{\varrho_i}{\mu_i} \right)^{k} + \frac{1}{c_i! \, (1 - \rho_i)} \left( \frac{\varrho_i}{\mu_i} \right)^{c_i} \right]^{-1}
$$
where $\rho_i = \varrho_i / (c_i \mu_i)$ is the utilization factor of charging station $i$, $k$ indexes the number of busy charging piles, and $P_i^0$ represents the probability that all charging piles in charging station $i$ are idle.
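A compact implementation of the waiting time of Eqs. (9) and (10); this is the standard Erlang-C algebra, not code from the paper.

```python
from math import factorial

def mmc_wait_time(arrival_rate, service_rate, c):
    """Expected waiting time T^wait in an M/M/c queue per Eqs. (9)-(10):
    W_q = (c*rho)^c * rho / (c! * (1 - rho)^2 * arrival_rate) * P0."""
    rho = arrival_rate / (c * service_rate)   # utilization factor
    if rho >= 1.0:
        raise ValueError("unstable queue: require rho < 1")
    r = arrival_rate / service_rate           # offered load, equals c * rho
    p0 = 1.0 / (sum(r ** k / factorial(k) for k in range(c))
                + r ** c / (factorial(c) * (1.0 - rho)))
    return (c * rho) ** c * rho / (factorial(c) * (1.0 - rho) ** 2 * arrival_rate) * p0
```

For a single pile ($c = 1$) with arrival rate 1 and service rate 2, this reduces to the familiar M/M/1 delay $\varrho / (\mu(\mu - \varrho)) = 0.5$.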
Finally, the station choice behavior of user j is simulated through a Multinomial Logit (MNL) model. The probability P j ( i ) that user j chooses charging station i is
$$
P_j(i) = \frac{e^{U_i^j}}{\sum_{n \in N} e^{U_n^j}}
$$
where $N$ denotes the set of all charging stations. A higher $U_i^j$ indicates a stronger subjective attraction to charging station $i$, thus increasing its probability of being chosen. Based on this probability distribution, the user’s final charging station choice $i^*$ is determined. The required charging energy $q_{i,j}^*$ for the $j$-th EV at charging station $i$ is then computed as
$$
q_{i,j}^* = \left( 1 - SOC_{j,t} \right) e_j^{cap} + d_{i,j} E_c
$$
The charging time can be calculated as follows:
$$
T_{i,j}^{charge} = q_{i,j}^* / p^{ch}
$$
where $p^{ch}$ is the charging power of the EV. Based on the above, the EV travel chain and its charging selection can be simulated; these will be used in the pricing game and energy control of the charging station.
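Putting Eqs. (11)–(13) together, a hedged Python sketch of the MNL choice draw and the resulting charging requirement; the station utilities and EV parameters below are made-up illustrative values.

```python
import numpy as np

def choice_probabilities(utilities):
    """MNL probabilities of Eq. (11): P_j(i) = exp(U_i^j) / sum_n exp(U_n^j),
    computed with the usual max-shift for numerical stability."""
    u = np.asarray(utilities, dtype=float)
    e = np.exp(u - u.max())
    return e / e.sum()

def charging_demand(soc, e_cap, d_km, e_c, p_ch):
    """Charging energy and duration of Eqs. (12)-(13):
    q* = (1 - SOC) * e_cap + d * E_c,  T_charge = q* / p_ch."""
    q = (1.0 - soc) * e_cap + d_km * e_c
    return q, q / p_ch

# Example draw of the chosen station i* (prospect values are illustrative)
rng = np.random.default_rng(0)
probs = choice_probabilities([1.2, 0.8, 1.0])
i_star = int(rng.choice(len(probs), p=probs))
```

A half-empty 60 kWh EV that must drive 10 km to the station at 0.15 kWh/km needs $q^* = 31.5$ kWh, or 1.05 h at 30 kW.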

4. Two-Stage Game-Based Charging Optimization Model

To achieve optimal operation of charging stations in a competitive market, a two-stage game-based joint optimization framework is developed. At the first stage, each charging station acts as a strategic player in a Stackelberg pricing game, determining its optimal day-ahead charging price considering uncertainties in future EV arrivals and renewable generation. At the second stage, given the determined prices, each charging station optimizes its energy management to maximize operational profit under stochastic renewable generation and charging demand. The two stages are interconnected, i.e., the real-time learned energy control policy is iteratively fed back to the first stage, enabling continuous online learning of optimal pricing under dynamic operating conditions. The details of this two-stage charging optimization model are introduced below.

4.1. Distributed Renewable Generation

For distributed wind power generation at the $i$-th charging station, the output is given by
$$
p_{i,t}^w =
\begin{cases}
p^{w,c}, & v_r \le v_{i,t} \le v_{out} \\
p^{w,c} \left( \dfrac{v_{i,t}}{v_r} \right)^{3}, & v_{in} \le v_{i,t} < v_r \\
0, & \text{otherwise}
\end{cases}
$$
where $p_{i,t}^w$ denotes the generated distributed wind power, $p^{w,c}$ denotes the wind generation capacity, $v_{i,t}$ denotes the wind speed at the $i$-th charging station at time $t$, and $v_{in}$, $v_{out}$, and $v_r$ are the cut-in, cut-out, and rated wind speeds, respectively.
The output power of distributed photovoltaic generation is mainly affected by solar irradiance, which depends on the position of the sun, the geographical location of the photovoltaic equipment, and the weather conditions. There exists the following relationship:
$$
p_{i,t}^s = p_{i,t}^{cap,s} \, \eta_{pv} \, I_{i,t}^s
$$
where $p_{i,t}^s$ denotes the generated distributed solar power, $p_{i,t}^{cap,s}$ denotes the capacity of the distributed photovoltaic generation, $\eta_{pv}$ denotes the photovoltaic conversion efficiency, and $I_{i,t}^s$ denotes the current solar radiation intensity.
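The two renewable output models of Eqs. (14) and (15) in code; the default cut-in/rated/cut-out speeds and the PV efficiency are illustrative assumptions, not parameters from the paper.

```python
def wind_power(v, p_cap, v_in=3.0, v_r=12.0, v_out=25.0):
    """Piecewise wind output of Eq. (14): zero below cut-in and above
    cut-out, cubic ramp between cut-in and rated speed, full capacity
    between rated and cut-out. Default speeds (m/s) are placeholders."""
    if v_r <= v <= v_out:
        return p_cap
    if v_in <= v < v_r:
        return p_cap * (v / v_r) ** 3
    return 0.0

def pv_power(p_cap, irradiance, eta_pv=0.18):
    """PV output of Eq. (15): p^s = capacity * efficiency * irradiance."""
    return p_cap * eta_pv * irradiance
```

For example, at half the rated wind speed the turbine delivers $(1/2)^3 = 12.5\%$ of capacity, while above the cut-out speed it is shut down entirely.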

4.2. Day-Ahead Game-Based Pricing Model

In the day-ahead stage, multiple charging stations act as players in a non-cooperative pricing game. Charging station $i \in \mathcal{N} = \{1, 2, \dots, N\}$ decides its day-ahead charging price $\lambda_{i,t}$ to attract EV users, where $N$ denotes the total number of charging stations. The EV charging demand $D_{i,t}(\lambda_{i,t}, \lambda_{-i,t})$ of the $i$-th charging station at time $t$ is determined by the prospect-theory-based user selection model introduced in Section 3, where $\lambda_{-i,t}$ denotes the price vector of the competing charging stations.
In order to determine the optimal day-ahead charging price strategy for the target charging station and thereby maximize its expected revenue, a Stackelberg game model is formulated in which the target station acts as the leader while the other competing stations serve as the followers [27]. The day-ahead pricing problem is formulated as a game $\mathcal{G} = \{ \mathcal{N}, \{\lambda_i\}_{i \in \mathcal{N}}, \{R_i\}_{i \in \mathcal{N}} \}$, where $\mathcal{N}$ denotes the player set, and $\lambda_i$ and $R_i$ denote the price vector and the payoff function of the $i$-th charging station, respectively. The details are introduced below.
(1) Players: The players are the $N$ competing charging stations. The leader is charging station $i$, while the followers comprise all the remaining competing stations, denoted by $-i$.
(2) Decision Variables: The decision variable of the $i$-th charging station is its day-ahead time-series charging price vector $\lambda_i$:
$$
\lambda_i = [\lambda_{i,1}, \lambda_{i,2}, \dots, \lambda_{i,T}]
$$
where $T$ denotes the number of scheduling periods. The decision variable of the followers is the price vector $\lambda_{-i}$, representing the charging prices set by the competing stations:
$$
\lambda_{-i} = [\lambda_{-i,1}, \lambda_{-i,2}, \dots, \lambda_{-i,T}]
$$
The charging prices are subject to upper and lower limits:
$$
\lambda_i^{min} \le \lambda_{i,t} \le \lambda_i^{max}, \quad \forall t \in \{1, \dots, T\}
$$
where $\lambda_i^{min}$ and $\lambda_i^{max}$ are the lower and upper bounds of the charging price, respectively.
(3) Payoff Function: In the Stackelberg game framework, the leader charging station $i$ first announces its pricing strategy $\lambda_i$. After observing the leader’s price, the competing stations $-i$ simultaneously determine their optimal prices $\lambda_{-i}^*$ to maximize their individual profits. Subsequently, the EV users observe the announced prices of all the charging stations and decide where to charge according to prospect theory, thereby forming the charging demand allocation across stations. Note that multiple public charging service tools currently allow people to look up the charging prices of all stations, such as the various charging service mobile applications in China. Therefore, the charging prices of all stations in the competitive charging market are known to every charging station.
The payoff function of each follower $-i$ is defined as the expected profit of the charging station, whose objective is to maximize its own expected revenue:

$$
\max_{\lambda_{-i}} R_{-i}(\lambda_i, \lambda_{-i}) = \mathbb{E}_{\omega} \left[ \sum_{t=1}^{T} \left( \lambda_{-i,t} D_{-i,t}(\lambda_i, \lambda_{-i}, \omega) - C_{-i,t}^{op} \right) \right]
$$

$$
C_{-i,t}^{op} = \lambda_{-i,t}^{g} p_{-i,t}^{g} + \lambda^{w} p_{-i,t}^{w} + \lambda^{s} p_{-i,t}^{s}
$$

$$
p_{-i,t}^{g} = \max \left( n_{-i,t} P^{ch} - p_{-i,t}^{w} - p_{-i,t}^{s},\; 0 \right)
$$

where $\omega$ represents the uncertainty factors, mainly the stochastic arrival of EV users and the output fluctuations of renewable energy sources; $C_{-i,t}^{op}$ denotes the operating cost of the competing charging station; $p_{-i,t}^{g}$ is the amount of electricity purchased from the grid; $\lambda_{-i,t}^{g}$, $\lambda^{w}$, and $\lambda^{s}$ denote the unit cost of purchasing electricity and the potential losses of wind and solar power generation, respectively; $n_{-i,t}$ denotes the number of EVs that need to charge at the station at time $t$; and $P^{ch}$ denotes the charging power of an EV. Note that the charging price for each specific EV remains unchanged during the entire charging period, i.e., the price is fixed at the station’s posted price at the moment the EV initiates its station selection. Therefore, the total charging profit of the station at time $t$ can be denoted as $\lambda_{-i,t} D_{-i,t}$. Equation (21) indicates that the charging demand is first supplied by the distributed renewable energy, with any shortfall covered by purchased power.
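A minimal sketch of the per-period operating cost and grid purchase of Eqs. (20) and (21); all numeric arguments in the usage note are invented for illustration.

```python
def operating_cost(n_ev, p_ch, p_wind, p_solar,
                   price_grid, cost_wind, cost_solar):
    """Per-period operating cost of Eq. (20) with the grid purchase of
    Eq. (21): renewables serve the charging load n * P_ch first, and the
    grid covers any shortfall (p_grid is clipped at zero)."""
    p_grid = max(n_ev * p_ch - p_wind - p_solar, 0.0)
    return price_grid * p_grid + cost_wind * p_wind + cost_solar * p_solar
```

With a 70 kW charging load against 50 kW of renewable output and a grid price of 0.5, the station buys the remaining 20 kW from the grid; when the load falls below the renewable output, the grid purchase term drops to zero.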
The objective of the leader charging station $i$ is to determine its optimal pricing strategy $\lambda_i$ that maximizes its total expected profit, while anticipating the optimal responses $\lambda_{-i}^*(\lambda_i)$ of both the follower stations and the EV users:
$$\max_{\lambda_i} R_i(\lambda_i, \lambda_{-i}^*(\lambda_i)) = \mathbb{E}_{\omega}\left[\sum_{t=1}^{T}\left(\lambda_{i,t}\, D_{i,t}(\lambda_{i,t}, \lambda_{-i}^*(\lambda_i), \omega) - C_{i,t}^{op}(\pi^*)\right)\right] \tag{22}$$
$$D_{i,t}(\lambda_{i,t}, \lambda_{-i}^*(\lambda_i), \omega) = \sum_{j=1}^{J} P_{j,t}(i) \cdot q_{i,j,t}^* \tag{23}$$
where $\lambda_{i,t}$ is the charging price of the $i$-th charging station at time $t$; $D_{i,t}(\lambda_{i,t}, \lambda_{-i}^*(\lambda_i), \omega)$ is the EV charging demand of the $i$-th charging station at time $t$, which results from EV users' response to the prices $\lambda = (\lambda_i, \lambda_{-i})$ based on the prospect theory choice model introduced in Section 3 under the realized uncertainty $\omega$; and $J$ is the number of EVs. Note that the optimal operational cost $C_{i,t}^{op}(\pi^*)$ represents the minimum operational cost that the $i$-th charging station incurs to serve the price-induced demand $D_{i,t}(\lambda_{i,t}, \lambda_{-i}^*(\lambda_i), \omega)$ by executing its optimal real-time energy control policy $\pi^*$, which is derived at the second stage. This requires the efficient utilization of distributed renewable generation and grid power purchases to satisfy the controllable demand $D_{i,t}$ at minimum cost, as introduced below. In Equation (23), $P_{j,t}(i)$ denotes the probability that the $j$-th EV user selects the $i$-th charging station, and $q_{i,j,t}^*$ denotes the optimal charging energy for the $j$-th EV at time $t$ under the optimal energy control policy $\pi^*$ of the second stage.
Note that the above model adopts a Stackelberg game framework to capture the hierarchical decision-making structure commonly observed in the evolving EV charging market, where a dominant player (e.g., a first-mover or a large-scale charging station operator) often sets prices first and smaller competitors follow. However, a broader Nash equilibrium analysis, in which all charging stations optimize their charging prices simultaneously, could provide a more comprehensive theoretical perspective on the market dynamics. This will be part of our future work.

4.3. Real-Time Charging Control Model

At the real-time charging control stage, the number of parked EVs in the $i$-th charging station is determined based on its posted charging prices. Although the parking duration of each EV is often uncontrollable, the charging process during its parking time is controllable. Therefore, the second stage focuses on the energy management of EVs at the $i$-th charging station, with the objective of maximizing the station's profit. The energy control model at the real-time stage can be formulated as follows:
$$\max_{\pi = \{\pi_t, \pi_{t+1}, \ldots, \pi_T\}} R_i(\pi, t) = \left(\lambda_{i,t} \sum_{j=1}^{J_i^t} z_j^t P^{ch} - C_{i,t}^{op}(\pi_t)\right) + \mathbb{E}_{\omega}\left\{\sum_{\tau=t+1}^{T}\left(\lambda_{i,\tau} \sum_{j=1}^{J_i^\tau} z_j^\tau P^{ch} - C_{i,\tau}^{op}(\pi_\tau)\right)\right\} \tag{24}$$
where $\pi$ denotes the energy control policy of the $i$-th charging station and $J_i^t$ denotes the number of parked EVs in the $i$-th charging station at the current decision time $t$. $\pi_t = \{z_1^t, z_2^t, \ldots, z_{J_i^t}^t\}$ denotes the detailed control variables at the current decision time $t$. The primary decision variable in $\pi_t$ is $z_j^t$, the charging control of the $j$-th EV: a binary variable where 0 indicates no charging and 1 indicates charging. The operational cost $C_{i,t}^{op}(\pi_t)$ is given by
$$C_{i,t}^{op}(\pi_t) = \lambda_{i,t}^{g} \max\!\left(\sum_{j=1}^{J_i^t} z_j^t P^{ch} - p_{i,t}^{w} - p_{i,t}^{s},\; 0\right) + \lambda^{w} p_{i,t}^{w} + \lambda^{s} p_{i,t}^{s} \tag{25}$$
In (24), the objective function $R_i(\pi, t)$ denotes the accumulated operation profit from the current decision time $t$ to $T$ obtained by implementing the energy control policy $\pi$. As indicated in (24), the current charging decisions affect the future operation profit. Meanwhile, due to the uncertainties in EV arrivals and distributed renewable energy generation, an expectation operator appears in the future-profit term of (24).
Furthermore, each EV in the charging station should satisfy the following constraints:
$$\sum_{\tau=t_j^a}^{t_j^d} z_j^\tau P^{ch} = \left(1 - SOC_{j,t^a}\right) e_j^{cap}, \quad \tau = 1, 2, \ldots, T,\; j = 1, 2, \ldots, J_i^\tau \tag{26}$$
$$\left(1 - SOC_{j,t^a}\right) e_j^{cap} - \sum_{\tau=t_j^a}^{t} z_j^\tau P^{ch} \le \left(t_j^d - t\right) P^{ch} \tag{27}$$
where $t_j^a$ and $t_j^d$ denote the arrival time and departure time of the $j$-th EV, respectively. Constraint (26) requires that the total required charging energy $(1 - SOC_{j,t^a}) e_j^{cap}$ be delivered by the time the EV leaves the $i$-th charging station. In this equation, the charging efficiency is assumed to equal 1 for simplicity; however, the proposed method also applies when the charging efficiency is smaller than 1. In constraint (27), the left side denotes the remaining energy demand after implementing the current decision $z_j^t$, and the right side denotes the maximum energy that can still be accumulated in the future. Constraint (27) thus requires that the remaining energy demand not exceed the maximum accumulable energy, which avoids charging service failure when making the current decision.
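As an illustration of how constraints (26) and (27) prune the feasible action space, the following sketch (a hypothetical helper assuming one decision slot per time step and unit charging efficiency, not the paper's code) checks whether a candidate decision for one EV keeps its demand servable:

```python
# Feasibility check for one EV mirroring constraints (26)-(27): after fixing the
# current decision z_now, the remaining energy demand must not exceed the energy
# that can still be delivered before departure.
def charging_feasible(z_now, delivered, soc_arrival, e_cap, t_now, t_depart, p_ch):
    """Return True if choosing z_now (0 or 1) at slot t_now keeps the EV servable.

    delivered: energy already charged since arrival (sum of past z * p_ch).
    """
    required = (1.0 - soc_arrival) * e_cap            # total demand, as in Eq. (26)
    remaining = required - (delivered + z_now * p_ch) # left side of Eq. (27)
    max_future = (t_depart - t_now) * p_ch            # right side of Eq. (27)
    return remaining <= max_future + 1e-9
```

For example, an EV arriving at 50% SOC with a 60 kWh battery needs 30 kWh; with a 10 kW charger and three slots left before departure, idling now is still feasible, but with only two slots left the EV must be charged immediately.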
Note that in the above two-stage charging optimization model, an expectation operator appears in both the pricing optimization model (19) and the charging energy control model (24) in order to evaluate the impact of the uncertain wind and solar power on the operation optimization of the charging station. These uncertainties influence the procurement cost from the power grid, as indicated in (21) and (25). Therefore, multiple simulations of wind and solar power generation are generated to find the charging pricing and energy control policy with the best average performance over these simulations. The details are introduced in Section 5.
Based on the above introduction, the two-stage game-based charging optimization model is formulated. The first stage focuses on the day-ahead charging price determination, while the second stage focuses on the real-time charging energy control. These two stages are not independent. In fact, during charging price determination at the first stage, the optimized energy control strategy of the second stage is needed to evaluate the operation profit for a given charging price. In turn, the charging price determined at the first stage influences the potential EV charging demand, which then affects the charging control optimization at the second stage. Therefore, the two sub-models interact. In the following, this paper develops a two-stage optimization approach to address the aforementioned stochastic optimization model.

5. Solution Methodology

There exist several challenges in solving the above model. Firstly, the leader's optimization problem embeds the followers' best-response functions, and each follower's optimal response depends not only on the leader's decision but also on the coupled pricing interactions among all the stations. This results in a bi-level optimization problem with equilibrium constraints, which is non-convex and computationally intractable under uncertainty. Furthermore, the charging demand functions $D_{i,t}$ and $D_{-i,t}$ are modeled based on prospect theory, which captures users' behavioral biases and risk perceptions under uncertainty. As a result, the demand-price relationship $D(\lambda)$ becomes highly nonlinear and non-convex. Once the payoff functions are non-convex, conventional optimization methods often converge to local optima without guaranteeing global optimality. Secondly, a key challenge arises from the interdependence between the two stages: the leader's first-stage charging price $\lambda_{i,t}$ directly depends on the optimal operational cost $C_{i,t}^{op}(\pi^*)$ obtained from the second-stage real-time control problem, and vice versa. This coupling further increases the computational complexity of the model. Thirdly, the objective functions in both stages are defined as expected profits, where the uncertainty arises from the stochastic arrival of EV users and fluctuations in renewable energy sources. Meanwhile, there exists a large number of discrete variables $z_j^t$. These factors further increase the solution difficulty.
To solve this highly coupled and non-convex two-stage model, an iterative two-stage optimization framework is adopted. In this framework, the algorithm alternately optimizes between the leader's pricing problem at the first stage and the real-time operational problem at the second stage until convergence. The stopping criterion is that the profit differential between the first stage and the second stage converges to a small and steady value. The details of the proposed two-stage optimization method are introduced below.
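The alternating interaction between the two stages can be sketched as follows, where `solve_pricing_stage` and `solve_control_stage` are hypothetical stand-ins for the ADMM and Rollout procedures (an illustrative control loop, not the paper's implementation):

```python
# Conceptual sketch of the iterative two-stage framework: alternate between
# day-ahead pricing (stage 1) and real-time energy control (stage 2) until the
# profit differential settles below a tolerance.
def two_stage_optimize(solve_pricing_stage, solve_control_stage,
                       policy0, tol=1e-3, max_iter=50):
    policy, prev_profit = policy0, None
    for _ in range(max_iter):
        prices = solve_pricing_stage(policy)          # stage 1: pricing game
        policy, profit = solve_control_stage(prices)  # stage 2: energy control
        if prev_profit is not None and abs(profit - prev_profit) < tol:
            break                                     # profit differential converged
        prev_profit = profit
    return prices, policy, profit

# Toy usage with dummy solvers whose profit converges geometrically.
state = {"k": 0}
def sp(policy):                  # dummy pricing stage
    return 0.1
def sc(prices):                  # dummy control stage
    state["k"] += 1
    return "policy", 100 - 2.0 ** (-state["k"])
prices, policy, profit = two_stage_optimize(sp, sc, "initial")
```

The loop stops once two successive second-stage profit evaluations differ by less than `tol`, mirroring the stopping criterion described above.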

5.1. Scenario-Based ADMM for Charging Price Learning at the First Stage

This paper employs a scenario-based ADMM (Alternating Direction Method of Multipliers) approach to solve the pricing game for the optimal prices. ADMM is well suited for distributed optimization problems, integrating the strengths of dual decomposition and the augmented Lagrangian method [28]. It operates on the principle of decomposition, splitting the main problem into subproblems that are then coordinated to find a global solution. The subproblems can be optimized alternately, and often in parallel, thereby iteratively approaching the global solution of the original problem.
The standard ADMM algorithm exhibits several limitations. First, it requires deterministic model parameters and thus cannot directly handle the uncertainties of stochastic wind and photovoltaic generation. Second, it often suffers from slow convergence, particularly when applied to complex or non-convex optimization problems. Third, its performance is highly sensitive to the selection of the penalty parameter, which significantly influences both convergence stability and computational speed.
To address the uncertainties of wind and photovoltaic generation, multiple simulations of wind/solar power generation are generated by Sample Average Approximation (SAA) [29]. The core idea of SAA is to discretize the continuous uncertainty factor ω into a finite set of S representative scenarios, forming a scenario set Ω = ω 1 , ω 2 , , ω S . In this way, the original stochastic optimization objective is transformed into a deterministic optimization problem that seeks to maximize the average profit across all scenarios. Accordingly, the local objective function of the i-th charging station can be expressed as
$$\max_{\lambda_i} \frac{1}{S} \sum_{s=1}^{S} R_i^{(s)}(\lambda_i, \lambda_{-i}) \tag{28}$$
$$R_i^{(s)}(\lambda_i, \lambda_{-i}) = \sum_{t=1}^{T}\left[\lambda_{i,t} \cdot D_{i,t}^{(s)}(\lambda_{i,t}, \lambda_{-i}^*(\lambda_i)) - C_{i,t}^{op,(s)}(\pi^I)\right] \tag{29}$$
where $R_i^{(s)}(\lambda_i, \lambda_{-i})$ denotes the profit of the $i$-th charging station in scenario $\omega_s$, given its own pricing strategy $\lambda_i$, the competitors' pricing strategies $\lambda_{-i}$, and the energy control policy $\pi^I$ at the current iteration. Note that in each scenario, the uncertain arrival of EVs can be simulated based on the method introduced in Section 3, and the uncertain distributed renewable generation can be simulated based on its probability distributions.
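The SAA idea of replacing the expectation with a scenario average can be sketched as follows (the scenario model and per-scenario profit function are illustrative assumptions, not the paper's simulators):

```python
# Minimal Sample Average Approximation (SAA) sketch: the expected profit is
# replaced by an average over S sampled scenarios.
import random

def saa_objective(price, scenarios, profit_fn):
    """Average profit of one station over a finite scenario set."""
    return sum(profit_fn(price, w) for w in scenarios) / len(scenarios)

random.seed(0)
# Each hypothetical scenario: (renewable output in kW, number of arriving EVs).
scenarios = [(random.gauss(50, 10), random.randint(5, 15)) for _ in range(100)]

def profit_fn(price, w):
    renew, n_ev = w
    demand = n_ev * 7.0                    # assumed 7 kW charging power per EV
    grid = max(demand - renew, 0.0)        # renewables used first, as in Eq. (21)
    return price * demand - 0.08 * grid    # revenue minus assumed grid cost

avg_profit = saa_objective(0.12, scenarios, profit_fn)
```

Maximizing `saa_objective` over the price is then a deterministic problem, which is what the ADMM iterations below operate on.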
After the SAA reformulation, the problem remains a game among $N$ players. The ADMM framework is introduced to efficiently compute its Stackelberg equilibrium. The core idea is to decompose the multi-station game into $N$ subproblems that can be solved in parallel, each corresponding to an individual charging station. ADMM achieves this decomposition by introducing an auxiliary variable $z_i$ ($i = 1, 2, \ldots, N$), which represents the global consensus variable for the pricing decision of station $i$. The key mechanism of ADMM is to enforce a consistency constraint $\lambda_i = z_i$, where $\lambda_i$ denotes the local decision variable of charging station $i$.
Based on the above objective and the consistency constraint, the augmented Lagrangian function for charging station i can be defined as follows in order to handle the uncertainties in wind and solar power generation:
$$L_\rho(\lambda_i, z_i, u_i) = \frac{1}{S} \sum_{s=1}^{S} R_i^{(s)}(\lambda_i, \lambda_{-i}) - u_i^{T}(\lambda_i - z_i) - \frac{\rho}{2}\left\|\lambda_i - z_i\right\|^2 \tag{30}$$
where $u_i$ is the dual variable associated with the constraint $\lambda_i = z_i$, $\rho$ denotes the penalty parameter, $u_i^{T}(\lambda_i - z_i)$ is the Lagrangian term, and the quadratic penalty term $\frac{\rho}{2}\|\lambda_i - z_i\|^2$ is introduced to enhance the convergence stability of the algorithm.
ADMM searches for the equilibrium solution by alternately updating λ i , z i , and  u i at iteration ( k + 1 ) . The update process is as follows:
(1) λ i update (Local price optimization, parallel computation):
$$\lambda_i^{k+1} = \arg\max_{\lambda_i} L_\rho(\lambda_i, z_i^k, u_i^k) \tag{31}$$
This step is executed in parallel across all charging stations i . Each station i , given the previous iteration’s global consensus price z i k and dual variable u i k , determines its locally optimal price λ i k + 1 by maximizing its own augmented Lagrangian function L ρ .
(2) z i update (Global price consensus):
$$z_i^{k+1} = \frac{1}{N} \sum_{i=1}^{N}\left(\lambda_i^{k+1} + u_i^k / \rho^k\right) \tag{32}$$
This step serves as a coordination phase, in which all charging stations' local prices $\lambda_i^{k+1}$ and dual variables $u_i^k$ are collected to form the new global consensus price $z_i^{k+1}$ through averaging.
(3) u i update (Dual variable update):
$$u_i^{k+1} = u_i^k + \rho^k\left(\lambda_i^{k+1} - z_i^{k+1}\right) \tag{33}$$
The dual variable is updated to capture the accumulated inconsistency between the local decisions and the global consensus. This inconsistency is then used in the subsequent iteration ( k + 2 ) to impose a stronger penalty on λ i , thereby driving the algorithm toward convergence.
(4) Primal residual r i and dual residual s i update:
$$r_i^{k+1} = \lambda_i^{k+1} - z_i^{k+1} \tag{34}$$
$$s_i^{k+1} = \rho^k\left(z_i^{k+1} - z_i^{k}\right) \tag{35}$$
The primal and dual residuals are key indicators for determining the convergence of the ADMM algorithm. The primal residual r i measures the discrepancy between the local decision variable λ i and the global consensus variable z i . The dual residual s i quantifies the change in the global variable induced by the updates of the local variables, reflecting the algorithm’s progress toward the optimal solution.
In the classical ADMM framework, a fixed penalty parameter $\rho$ often fails to balance computational tractability and convergence speed [30]. To accelerate convergence, an adaptive penalty parameter can be introduced, which dynamically adjusts $\rho$ according to the norms of the primal and dual residual vectors, following the adaptive update strategy:
$$\rho^{k+1} = \begin{cases} \tau^{inc}\, \rho^k, & \text{if } \left\|r^{k+1}\right\|_2 > \mu \left\|s^{k+1}\right\|_2 \\ \rho^k / \tau^{dec}, & \text{if } \left\|s^{k+1}\right\|_2 > \mu \left\|r^{k+1}\right\|_2 \\ \rho^k, & \text{otherwise} \end{cases} \tag{36}$$
where $\mu > 1$, $\tau^{inc} > 1$, and $\tau^{dec} > 1$ are parameters. In this study, these parameters are set as $\mu = 10$ and $\tau^{inc} = \tau^{dec} = 2$.
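The $\lambda$, $z$, and $u$ updates together with the adaptive penalty rule (36) can be illustrated on a toy scalar consensus problem, where each station's scenario-averaged profit is replaced by a concave quadratic surrogate $f_i(x) = -a_i(x-b_i)^2$ (an assumption made purely so the local update has a closed form; this is a sketch, not the paper's solver):

```python
# Toy scalar consensus ADMM: local update, consensus averaging, dual update,
# residuals, and adaptive penalty, with f_i(x) = -a_i*(x - b_i)^2 as surrogate profit.
def admm_consensus(a, b, rho=1.0, mu=10.0, tau=2.0, iters=200, tol=1e-8):
    n = len(a)
    lam = [0.0] * n
    z, u = 0.0, [0.0] * n
    for _ in range(iters):
        # Local price update: closed-form maximizer of f_i(x) - u_i*(x-z) - rho/2*(x-z)^2
        lam = [(2 * a[i] * b[i] - u[i] + rho * z) / (2 * a[i] + rho)
               for i in range(n)]
        z_old = z
        # Global consensus update (averaging step)
        z = sum(lam[i] + u[i] / rho for i in range(n)) / n
        # Dual variable update
        u = [u[i] + rho * (lam[i] - z) for i in range(n)]
        # Primal and dual residuals
        r = max(abs(lam[i] - z) for i in range(n))
        s = abs(rho * (z - z_old))
        if r < tol and s < tol:
            break
        # Adaptive penalty rule in the spirit of Eq. (36)
        if r > mu * s:
            rho *= tau
        elif s > mu * r:
            rho /= tau
    return z, lam

z_star, lam_star = admm_consensus(a=[1.0, 2.0, 3.0], b=[0.10, 0.12, 0.14])
```

For this toy problem, the consensus price converges to the weighted average $\sum_i a_i b_i / \sum_i a_i$, so the correctness of the updates can be checked against a known fixed point.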

5.2. Simulation-Based Rollout for Charging Control at the Second Stage

To address the real-time energy control problem at the second stage, simulation is commonly employed to generate possible realizations of $p_{i,t}^{w}$ and $p_{i,t}^{s}$ in the proposed model [31], together with the EV user charging selection simulation introduced in Section 3. However, exhaustively searching for the best solution under such a heavy simulation burden is not feasible for the second-stage problem. As a result, the online Rollout method is adopted, which assesses a base control policy and improves upon it [32]. By iteratively applying the Rollout procedure, the optimal control policy can be obtained [33]. The following presents a detailed description of the simulation-based Rollout method.
Based on [34], the Q-factor can measure the policy performance, i.e.,
$$Q_t(s_t, a_t) = r_t(s_t, a_t) + \mathbb{E}\left[V_{t+1}^*(s_{t+1}) \mid s_t, a_t\right] \tag{37}$$
where $s_t = (p_{i,t}^{w}, p_{i,t}^{s}, \lambda_{i,t}, e_{j,t}^{ch}, l_{j,t}^{ch})$ denotes the state of the $i$-th charging station, and $e_{j,t}^{ch}$ and $l_{j,t}^{ch}$ denote the SOC and remaining parking time, respectively, of the $j$-th EV at time $t$. $\pi_t(s_t) = a_t = \{z_1^t, z_2^t, \ldots, z_{J_i^t}^t\}$ is the action. $Q_t(s_t, a_t)$ is the Q-factor, and $r_t(s_t, a_t)$ is the one-step reward function, i.e.,
$$r_t(s_t, a_t) = \lambda_{i,t} \sum_{j=1}^{J_i^t} z_j^t P^{ch} - C_{i,t}^{op}(a_t) \tag{38}$$
$\mathbb{E}[V_{t+1}^*(s_{t+1}) \mid s_t, a_t]$ denotes the expected optimal value function from the next epoch to the last epoch, i.e.,
$$V_{t+1}^*(s_{t+1}) = \mathbb{E}\left[\sum_{\tau=t+1}^{T} r_\tau(s_\tau, a_\tau^*)\right] \tag{39}$$
where $a_\tau^*$ denotes the optimal action, i.e., $\pi^*(s_\tau) = a_\tau^*$.
To overcome the heavy computational burden of searching for $\pi^*$ caused by the simulations and the large state space, the Rollout method is adopted, which improves upon an initial policy, referred to as the base policy $\pi^b$. The base policy tries to meet the charging demand as much as possible while maximizing the revenue from charging. The improved energy control policy $\pi^I$ can then be obtained according to the following equation:
$$\pi^I(s_t) = a_t^I = \arg\max_{a_t \in A_t}\left\{ r_t(s_t, a_t) + \mathbb{E}\left[\sum_{\tau=t+1}^{T} r_\tau\!\left(s_\tau, \pi^b(s_\tau)\right) \,\Big|\, s_t, a_t\right]\right\} \tag{40}$$
where $a_t^I$ is the improved action in the feasible action space $A_t$.
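For concreteness, one possible base policy $\pi^b$ of the kind described above can be sketched as follows (the urgency ordering and charger limit are illustrative assumptions, not the paper's exact rule):

```python
# A greedy base policy sketch: charge every EV that still has unmet demand,
# most urgent first, up to an assumed number of available chargers.
def base_policy(evs, max_chargers, p_ch):
    """evs: list of (remaining_kwh, slots_until_departure). Returns a 0/1 list z."""
    # Urgency = slack between deliverable energy before departure and remaining need;
    # smaller slack means the EV must be served sooner.
    order = sorted(range(len(evs)),
                   key=lambda j: evs[j][1] * p_ch - evs[j][0])
    z = [0] * len(evs)
    used = 0
    for j in order:
        if used >= max_chargers:
            break
        if evs[j][0] > 1e-9:       # EV still has unmet charging demand
            z[j] = 1
            used += 1
    return z
```

With one charger and EVs needing (20 kWh, 2 slots), (5 kWh, 5 slots), and (0 kWh, 3 slots), the policy charges only the first EV, whose deadline is tightest.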
By using a simulation approximation technique, Equation (40) can be approximated as follows:
$$\pi^I(s_t) = a_t^I = \arg\max_{a_t \in A_t}\left\{ r_t(s_t, a_t) + \frac{1}{M} \sum_{m=1}^{M} \sum_{\tau=t+1}^{T} r_\tau\!\left(s_\tau, \pi^b(s_\tau)\right) \,\Big|\, \xi_m, s_t, a_t \right\} \tag{41}$$
where $M$ denotes the total number of simulations, set as 60 in the experiments, and $\xi_m$ represents the $m$-th simulation.
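The simulation-based approximation above can be sketched as a one-step lookahead: score each feasible action by its one-step reward plus the Monte Carlo average of the base policy's future reward (the reward function and `simulate_future` below are synthetic stand-ins for the Section 3 demand and renewable simulators, not the paper's code):

```python
# Rollout one-step lookahead sketch: pick the action with the best sampled
# Q-factor estimate, averaging the base policy's future return over m_sims runs.
import random

def rollout_action(state, actions, reward_fn, simulate_future, m_sims=60):
    best_a, best_q = None, float("-inf")
    for a in actions:
        # Monte Carlo estimate of the base policy's cumulative future reward
        future = sum(simulate_future(state, a) for _ in range(m_sims)) / m_sims
        q = reward_fn(state, a) + future   # sampled Q-factor, as in the lookahead
        if q > best_q:
            best_a, best_q = a, q
    return best_a

# Tiny synthetic example: immediate profit grows with the action, but charging
# more now slightly reduces the simulated future profit.
random.seed(1)
reward_fn = lambda s, a: a * 1.0
sim = lambda s, a: random.gauss(5.0 - 0.6 * a, 0.1)
act = rollout_action(0, [0, 1, 2], reward_fn, sim, m_sims=200)
```

In this synthetic setting the net value of an action is roughly $5 + 0.4a$, so the lookahead selects the largest feasible action.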

5.3. Algorithm Summary

The algorithms are summarized as follows. Algorithm 1 presents the scenario-based ADMM method for the first-stage pricing game optimization. Algorithm 2 presents the simulation-based Rollout method for the second-stage energy control. Based on the pricing generated by the first-stage model, the charging behavior of EVs is optimized, through which the energy control strategy required by the first-stage model can be determined. Note that this method operates online and thus generates an improved energy control policy $\pi_t^I$ according to the currently observed state $s_t$. The policy $\pi_t^I$ is therefore continuously updated, until convergence, through iterative interaction with the scenario-based ADMM method of the first stage. It is also notable that the simulations can be implemented offline, which saves computation time during online pricing and energy control optimization. Meanwhile, the settings of the input data (such as the convergence thresholds, maximum number of iterations, and initial base policy) are crucial to the accuracy of the results: a smaller convergence threshold, a larger maximum iteration count, and a better initial base policy all yield a better optimization result. It is therefore suggested to choose these values according to the available computation budget, applying stricter settings when the budget allows.
Algorithm 1 Scenario-based ADMM for First Stage: Charging Price Learning
1: Input: convergence thresholds $\epsilon^{pri}$, $\epsilon^{dual}$, max iterations $K_{max}$.
2: while receiving an updated improved energy control policy $\pi^I$ from the second stage do
3:   Initialization:
4:   Initialize iteration $k = 0$, prices $\lambda_i^0$, global prices $z_i^0$, dual variables $u_i^0$, penalty parameter $\rho^0$, and adaptive parameters $\mu$, $\tau^{inc}$, $\tau^{dec}$.
5:   Generate scenario set $\Omega = \{\omega_1, \ldots, \omega_S\}$ based on the EV travel simulation in Section 3 and the probability distributions of distributed renewable energy.
6:   Iterative Process:
7:   for $k = 0$ to $K_{max}$ do
8:     for each charging station $i \in N$ (in parallel) do
9:       Update $\lambda_i^{k+1}$ by solving $\lambda_i^{k+1} = \arg\max_{\lambda_i} \frac{1}{S}\sum_{s=1}^{S} R_i^{(s)}(\lambda_i, \lambda_{-i}^k) - (u_i^k)^T(\lambda_i - z_i^k) - \frac{\rho^k}{2}\|\lambda_i - z_i^k\|^2$, where the profit $R_i^{(s)}$ is computed based on the EV user charging selection simulation in Section 3 and $\pi^I$.
10:    end for
11:    for each charging station $i \in N$ (in parallel) do
12:      Global price consensus: $z_i^{k+1} = \frac{1}{N}\sum_{i=1}^{N}(\lambda_i^{k+1} + u_i^k/\rho^k)$
13:      Dual variable update based on (33)
14:      Primal residual and dual residual calculation based on (34) and (35)
15:    end for
16:    Check convergence:
17:    if $\|r^{k+1}\|_2 < \epsilon^{pri}$ and $\|s^{k+1}\|_2 < \epsilon^{dual}$ then
18:      break
19:    end if
20:    Update $\rho^{k+1}$ based on (36).
21:  end for
22: end while
23: Output: Optimal day-ahead charging price $\lambda_i^*$.
Algorithm 2 Simulation-based Rollout for Second Stage: Charging Energy Control
1: Input: optimized charging price $\lambda_i$, current decision time $t$, total number of simulations $M$, initial base policy $\pi^b$.
2: Offline: generate $M$ simulations $\xi_m$, where the charging demand is simulated using the charging selection model in Section 3 based on $\lambda_i$ and $\lambda_{-i}$, and the distributed energy generation is simulated based on its probability distributions.
3: Observe $s_t$ and determine the feasible set $A_t$ based on (26) and (27).
4: Find $a_t^I$ by maximizing the Q-factor: $a_t^I = \arg\max_{a_t \in A_t}\left\{ r_t(s_t, a_t) + \frac{1}{M}\sum_{m=1}^{M}\sum_{\tau=t+1}^{T} r_\tau(s_\tau, \pi^b(s_\tau)) \mid \xi_m, s_t, a_t \right\}$
5: Obtain the improved energy control policy $\pi_t^I$.
6: if all $\pi_\tau^I$ ($\tau = 1, 2, \ldots, T$) have been obtained then
7:   Send $\pi^I = (\pi_1^I, \pi_2^I, \ldots, \pi_T^I)$ to the first-stage model for the next pricing optimization from time $T+1$ to $2T$.
8:   Set $\pi^b = \pi^I$.
9: end if

6. Numerical Results

6.1. Experiment Settings

A 32-node road network topology is established as shown in Figure 3. The charging station to be optimized is located at Node 20, while its two competing charging stations are located at Nodes 6 and 17, respectively. In this road network, it is assumed that 1000 EVs travel according to a certain pattern over a 24 h period. The simulation parameters used in this study are shown in Table 1. The statistical distribution of wind generation is assumed to follow a Weibull distribution, while that of solar generation is assumed to follow a Beta distribution. The parameters of both distributions are fitted based on actual data provided by NASA [35]. Based on Equations (14) and (15), the simulated wind and solar power data are generated for the three charging stations. Figure 4, Figure 5 and Figure 6 show the expected wind power output, solar power output, and total output of the distributed renewable energy, respectively. The proposed method is implemented in MATLAB, and the running environment is a 13th Gen Intel(R) Core(TM) i5-13400 with 16 GB RAM.

6.2. Performance Analysis

In this paper, a prospect theory-based model is used to simulate the hourly variations in the number of EV arrivals at each charging station. As shown in Figure 7, the comparison among the three charging stations clearly indicates that all of them experience a noticeable peak in EV arrivals during the early-morning to mid-morning period (approximately 5:00–10:00), which closely aligns with typical urban commuting hours. Although the number of EV arrivals fluctuates throughout the day, the overall trend shows higher demand in the morning, followed by a gradual decline. It can be seen that charging station #1 (red) reaches its maximum between 7:00 and 10:00, with hourly arrivals exceeding 30 vehicles, making it the station with the highest peak among the three. While charging stations #2 and #3 (blue and green) have slightly lower peaks, they also exhibit considerable arrival volumes during the morning rush hour. In particular, charging station #2 shows a noticeable increase around 9:00, while charging station #3 maintains a relatively high arrival rate between 7:00 and 11:00. In summary, among the three charging stations, station #1 receives the highest total number of EV arrivals throughout the day, which is significantly more than the others. This is because, compared with the other two stations, station #1’s pricing mechanism is designed to be more attractive to EV users by anticipating and responding to the pricing strategies of other stations. In addition, its optimized waiting time is shorter than that of the other two, further enhancing its appeal to users.
Figure 8 shows the newly generated charging demand at charging station #1. As shown in the figure, during the early morning hours (1:00–3:00), the newly generated charging demand remains relatively low. From 4:00 to 7:00, it increases significantly due to users' tendency to charge before the morning rush hour. During the period from 11:00 to 15:00, when PV generation reaches its peak, the station experiences the highest newly generated charging demand, reaching approximately 1800 kW, making this the busiest period of the day. After 15:00, the newly generated charging demand gradually decreases. Overall, the newly generated charging demand peaks between 7:00 and 14:00, corresponding to the concentrated charging period of urban residents and commuters. In contrast, during nighttime and late-night hours, the newly generated charging demand drops sharply, as most EVs are already parked and have no willingness to select a charging station.
Figure 9 shows the game-based pricing result of the three charging stations obtained by the scenario-based ADMM approach. As shown in the figure, all three charging stations exhibit a clear time-of-use pricing pattern. During nighttime and early-morning off-peak periods (1:00–7:00 and 20:00–24:00), the prices at all three stations remain relatively low. In the morning (8:00–10:00) and evening (17:00–19:00) peak hours, station #1 takes the lead in raising its price to around 0.13 USD/kWh, while stations #2 and #3 increase their prices to approximately 0.12 USD/kWh and 0.11 USD/kWh, respectively, showing smaller increments than station #1. During the midday period (11:00–14:00), when renewable energy generation is at its peak, charging station #1 lowers its price to attract more EVs for charging, and subsequently raises it again to limit additional entries. Overall, station #1 adopts the most aggressive pricing strategy during peak hours, with the largest price adjustments among the three stations. After optimization, its pricing strategy enables it to achieve higher overall revenue.
The optimization results of the total profits for each charging station are shown in Figure 10. As the number of iterations increases, the profits of all three charging stations rise significantly, indicating gradual improvement in their operational performance. Around the 120th iteration, the profits of all the stations begin to stabilize. Ultimately, charging station #1 achieves the highest profit (approximately USD 492.9), followed by station #2 (around USD 442.8) and station #3 (around USD 421.4), showing a noticeable gap among them. In summary, the iterative optimization of pricing and energy management among multiple charging stations effectively enhances their operational profits. Through the iterative process, each station not only improves its overall economic performance but also reaches a relatively stable and mutually non-interfering profit distribution pattern.
Figure 11 illustrates the variation of the primal residual with the number of iterations, reflecting the convergence performance of the scenario-based ADMM algorithm for charging optimization. The initial residual values are relatively large but decrease rapidly, indicating a fast convergence rate in the early iterations. The continuous decline of the residuals and the overall convergence demonstrate that the algorithm can stably approach the optimal solution. This confirms the convergence property of the proposed ADMM-based optimization in this study. The residuals eventually converge to a sufficiently low level, indicating that the strategies have reached a stable state.
In this paper, the energy control strategy is optimized by the simulation-based Rollout algorithm, resulting in the matching profiles of charging demand and wind-solar generation shown in Figure 12. Because of the relatively large number of charging requests, the load generally exceeds the output of wind and solar generation. Nevertheless, the charging load can be partially met by the distributed renewable energy, which reduces the total electricity purchased from the grid, especially during the period from 7:00 to 19:00 when the charging load is relatively high. This demonstrates that the optimized strategy not only lowers costs but also maximizes the utilization of renewable energy from wind and solar sources.
Figure 13 presents the SOC profiles of individual EVs arriving at the target charging station after optimization. It can be observed that the station effectively manages the charging of each EV. As time progresses, the number of EVs selected for charging gradually increases, and all the EVs are able to meet their charging requirements before departure. In general, the SOC of the EVs rises continuously before 14:00 and levels off after 16:00. This trend aligns with the periods of higher wind and solar generation shown in Figure 12, indicating that EV charging is regulated according to the output of distributed renewable energy, with charging preferentially scheduled during periods of abundant wind and solar energy.
Comparisons between the proposed method and GA+Rollout, PSO+Rollout, and the classical ADMM + Heuristic Control are also conducted in terms of the optimized operation profits and convergence efficiency, as shown in Table 2. Note that GA+Rollout means the pricing is optimized by GA and the energy control is achieved using Rollout; the other methods are named similarly. Under the classical ADMM + Heuristic Control policy, each EV is charged to the full state as soon as possible and the distributed renewable energy is used first. Table 2 shows that the proposed method achieves the highest total profit with fewer iterations. Specifically, it attains a total profit of approximately USD 492.9, which is 8.5% higher than PSO+Rollout (USD 454.3) and 9.5% higher than GA+Rollout (USD 450), while also outperforming the classical ADMM (USD 470) by 4.7%. In terms of convergence performance, due to the adaptive update strategy in the scenario-based ADMM, the proposed method converges within about 120 iterations, representing a 52% improvement compared with the classical ADMM (250 iterations) and a 76% faster convergence compared with GA and PSO (500 iterations). These results demonstrate that the proposed approach exhibits better global optimization capability and faster convergence when addressing the two-stage charging optimization problem with uncertainty and non-convexity, compared with traditional intelligent algorithms and the standard ADMM framework.
To further explore the robustness and scalability of the proposed approach, a sensitivity analysis regarding the EV population scale is conducted. Three scenarios are designed to represent different charging demand densities: Low Density (EV Number = 500), Medium Density (EV Number = 1000), and High Density (EV Number = 1500). Table 3 illustrates the total profits of the three competitive charging stations under different EV populations. In the low-density scenario, the charging demand is relatively sparse, and the competition among stations is weak. Consequently, the profit gap between the target station #1 and its competitors is relatively small. In the high-density scenario, the charging resources and renewable energy capacity become scarce resources relative to the high demand. The intense competition highlights the advantages of the proposed game-theoretic pricing and optimized energy control. Station #1 achieves a profit of around USD 876.4, significantly outperforming station #2 (around USD 780.2) and station #3 (around USD 654.7). This indicates that as market saturation increases, the proposed method effectively captures more profit through better pricing and energy scheduling.
Figure 14 presents comparative experiments conducted under different charging efficiency scenarios. As shown in the figure, the total charging profit exhibits a downward trend as the charging efficiency decreases. This is because the actual charged energy decreases with the charging efficiency. Since EV users only pay for the actual charging energy that meets their demand, the energy-loss cost is borne by the charging station, which reduces the total charging profit as the charging efficiency decreases.

7. Discussion

This paper considers the joint pricing and energy control problem for EV charging stations. With the proposed method, the targeted charging station achieves the largest operation profit in the competitive charging market, and the optimal policy is obtained with a fast convergence speed. Because the coupling between pricing and energy control is rarely considered in the existing literature, the proposed method (Scenario ADMM+Rollout) is compared with four benchmarks under 1000 EV integration: GA+Rollout, PSO+Rollout, Classical ADMM + Heuristic Control, and Scenario ADMM+MILP with perfect future information. Note that GA+Rollout means the charging price is optimized by GA while the energy control is achieved by Rollout; the others are named analogously. The optimized operation profit and computation time are compared in Table 4. The first-stage pricing problem is a non-linear optimization problem with a complicated model of EV users' selection behavior, to which MILP cannot be applied; therefore, GA, PSO, and classical ADMM are selected for charging price optimization. For the second-stage energy control problem, heuristic control and MILP with perfect future information of distributed generation and EV charging demand are selected for comparison.
It can be seen that the proposed method (Scenario ADMM+Rollout) achieves the largest profit and faster computation than GA+Rollout, PSO+Rollout, and Classical ADMM + Heuristic Control. This is because GA and PSO tend to converge to local optima, while heuristic control is not the optimal energy control policy. Although Scenario ADMM+MILP with perfect future information yields the largest profit, it requires precise knowledge of future distributed generation and EV charging demand, which is an idealized assumption. Moreover, the profit of the proposed method is close to this ideal operation profit at a much faster computation speed: the large number of EVs introduces many binary variables, which greatly increases the MILP solution time, whereas the proposed method iteratively updates only the current action, which saves computation time.
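The Rollout principle discussed above can be sketched minimally. In this illustration the tariffs, candidate actions, and the "charge at full power" base policy are assumptions, not the paper's exact model: each candidate action is scored by its immediate cost plus the simulated cost-to-go of the base policy over sampled price scenarios, and the cheapest candidate is executed.

```python
import random

# One-step rollout sketch (illustrative assumptions, not the paper's model):
# score each candidate charging rate by simulating a greedy base policy over
# Monte Carlo price scenarios, then pick the cheapest candidate.
PRICES_MEAN = [0.12, 0.10, 0.08, 0.11]   # assumed hourly tariffs (USD/kWh)
P_MAX, TARGET = 6.6, 13.2                # charger power (kW), energy need (kWh)

def base_policy_cost(t, remaining, prices):
    """Cost-to-go of the greedy 'charge as fast as possible' base policy."""
    cost = 0.0
    for h in range(t, len(prices)):
        e = min(P_MAX, max(remaining, 0.0))
        cost += e * prices[h]
        remaining -= e
    return cost + 1e3 * max(remaining, 0.0)  # penalty for unmet demand

def rollout_action(t, remaining, n_scen=200, seed=0):
    rng = random.Random(seed)
    best_a, best_cost = None, float("inf")
    for a in (0.0, 3.3, 6.6):                # candidate charging rates (kW)
        total = 0.0
        for _ in range(n_scen):              # sampled price scenarios
            prices = [p * rng.uniform(0.9, 1.1) for p in PRICES_MEAN]
            total += a * prices[t] + base_policy_cost(t + 1, remaining - a, prices)
        if total / n_scen < best_cost:
            best_cost, best_a = total / n_scen, a
    return best_a

print(rollout_action(0, TARGET))  # defers charging into the cheaper mid hours
```

Because only the current action is optimized and the future is closed out by a cheap base policy, the per-step cost stays flat as the horizon or fleet grows, which is the source of the speedup over MILP reported in Table 4.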
These advantages are attributed to the two-stage game-based model and the proposed solution method. On the one hand, the model captures the coupling between charging pricing and energy control, as well as the uncertainty of EV charging behavior and distributed generation. On the other hand, the solution method effectively handles this coupling, the large number of EVs, and the uncertainty of charging behavior and distributed generation, which yields the key findings above.
The numerical experiments demonstrate that the proposed method jointly optimizes the charging price and energy control of the charging station in a competitive charging market. The main limitation of this work is that it neglects the Vehicle-to-Grid (V2G) capability of EVs. EV discharging could further improve the operation profit of charging stations, since a large fleet of EVs can be regarded as a high-capacity energy storage resource. In future work, the V2G capability of EVs and the corresponding incentive policies will be explored to further enhance charging station operation.

8. Conclusions

This paper proposes a two-stage game-based charging optimization method to improve the operation profit of a charging station in a competitive charging market, accounting for uncertain distributed generation and charging behavior. A scenario-based ADMM and a simulation-based Rollout method are designed to solve the first- and second-stage problems, respectively. Numerical results show that the proposed method can earn up to USD 221.7 more than a competing charging station in the competitive charging market. Compared with the traditional GA, PSO, and classical ADMM with heuristic control methods, the proposed method achieves up to a 9.5% profit improvement and a 50% computation speedup. Compared with MILP with perfect future information, the proposed method deviates by only 6% from the ideal optimal profit while achieving a 72% computation speedup.

Author Contributions

Conceptualization, S.H.; methodology, H.Z.; project administration, S.H.; resources, J.P. and X.G.; software, M.W.; writing—original draft, M.W.; writing—review and editing, J.P. and F.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Suqian Sci & Tech Program (Grant No. Z2024009) and the Young Elite Scientists Sponsorship Program of Jiangsu Province (No. JSTJ-2025-1003).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

Authors Shaohua Han, Hongji Zhu, Jinian Pang, Xuan Ge and Fuju Zhou are employed by the Suqian Power Supply Company, State Grid Jiangsu Electric Power Co., Ltd. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Schematic diagram of the competitive charging station scenario within a transportation network.
Figure 2. Overall architecture of pricing and energy control for charging station.
Figure 3. Topology of 32-node road network.
Figure 4. The expected wind power output of the three charging stations.
Figure 5. The expected PV power output of the three charging stations.
Figure 6. The expected total renewable energy output of the three charging stations.
Figure 7. Total number of EVs attracted to each charging station.
Figure 8. Newly generated charging demand per hour at charging station #1.
Figure 9. Pricing strategies of the three charging stations (USD/kWh).
Figure 10. Convergence analysis of the total profit for three charging stations.
Figure 11. The primal residual of scenario-based ADMM for charging optimization.
Figure 12. Matching profile of charging load and wind–solar generation.
Figure 13. The SOC profiles of individual EVs at charging station #1.
Figure 14. The operation profit variation with different charge efficiency.
Table 1. Parameter settings.

Parameter | Setting | Parameter | Setting
a | 1.3 | b | 1.2
α | 0.43 | β | 0.45
e_j^cap | 66 kWh | p^ch | 6.6 kW
v_j | 40 kW/h | E_c | 0.164 kWh/km
λ_i^min | 0.07 USD | λ_i^max | 0.14 USD
Table 2. Performance comparisons of different algorithms.

Policy | Total Profit (USD) | Iterations | Computation Time (s)
GA+Rollout | 450 | 500 | 180
PSO+Rollout | 454.3 | 500 | 160
Classical ADMM + Heuristic Control | 470 | 250 | 110
Scenario ADMM+Rollout | 492.9 | 120 | 90
Table 3. Impact of EV population scale on the operational profits of competitive charging stations.

EV Population | Station #1 Profit (USD) | Station #2 Profit (USD) | Station #3 Profit (USD)
500 | 237.9 | 204.7 | 202.6
1000 | 492.9 | 442.8 | 421.4
1500 | 876.4 | 780.2 | 654.7
Table 4. Comparisons with benchmarks corresponding to operation profit and computation time.

Policy | GA+Rollout | PSO+Rollout | Classical ADMM + Heuristic Control | Scenario ADMM+Rollout | Scenario ADMM+MILP with Perfect Future Info.
Profit (USD) | 450 | 454.3 | 470 | 492.9 | 524.3
Comp. Time (s) | 180 | 160 | 110 | 90 | 560