Bilevel Real-Time Pricing for Tripartite Welfare Equilibrium in Smart Grids: Balancing Fairness and Efficiency

Jia, Jinze; Zhang, Sen; Song, Linsen

doi:10.3390/math14122040


by Jinze Jia ¹, Sen Zhang ² and Linsen Song ²,*

¹ College of Management, Henan Institute of Technology, Xinxiang 453003, China

² School of Mathematical Sciences, Henan Institute of Science and Technology, Xinxiang 453003, China

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(12), 2040; https://doi.org/10.3390/math14122040

Submission received: 9 May 2026 / Revised: 31 May 2026 / Accepted: 5 June 2026 / Published: 8 June 2026


Abstract

Demand-side management plays a critical role in the secure and efficient operation of smart grids. Traditional real-time pricing generally takes social welfare maximization as its only objective and ignores the balance of benefits among electricity suppliers, the grid company, and users. This can lead to uneven benefit distribution among stakeholders and impair the long-term stable operation of power systems. To address this problem, this paper proposes a bilevel real-time pricing strategy based on tripartite welfare equilibrium. The upper-level model minimizes the welfare differences among electricity suppliers, the grid company, and users to ensure fair benefit allocation, while the lower-level model maximizes total social welfare to guarantee the economic efficiency of the system. The model adopts different utility functions for residential and industrial users to describe user heterogeneity. Using the Karush–Kuhn–Tucker conditions, the original bilevel model is transformed into a single-level optimization problem with complementarity constraints. The CHKS smoothing function and the pseudo-Huber function are introduced to handle the complementarity constraints and the absolute-value objective function, respectively. Combined with the central difference method, a modified rolling penalty function algorithm is developed for numerical solution. The 24 h simulation results show that the prices of four time periods converge steadily to equilibrium values as iterations proceed. Compared with the total social welfare maximization model, the proposed bilevel model effectively reduces the peak-to-average load ratio and narrows the welfare disparities among the three stakeholders while maintaining the total social welfare at a stable level. Furthermore, it remains applicable and robust when the user scale is expanded.

Keywords:

smart grid; real-time pricing; bilevel programming; tripartite welfare equilibrium; penalty function

MSC:

90C90

1. Introduction

With the substantial advancements in the global energy transition and the explosive growth of power-consuming devices, the relationship between the supply and demand sides of electric power is undergoing profound restructuring. Meanwhile, the increasing integration of renewable energy sources, such as wind and solar power, has significantly changed the operational characteristics of power systems. Due to the intermittent and uncertain nature of renewable generation, higher renewable energy penetration increases operational uncertainty and the pressure of maintaining supply–demand balance, and may even lead to renewable energy curtailment [1,2]. Moreover, the large-scale integration of renewable energy poses greater challenges to electricity market operation and pricing mechanisms [3,4]. Therefore, more flexible pricing mechanisms and active demand-side participation are required to effectively accommodate the fluctuations of renewable energy resources. In the traditional model, fixed electricity prices are adopted, which gives consumers neither an economic incentive nor price-based guidance to shift their usage away from peak hours, resulting in a sharp rise in peak loads and idle resources during off-peak periods. In contrast, smart grids, supported by advanced digital information networks, enable two-way real-time interaction between power suppliers and consumers; i.e., consumers can adjust their electricity usage in response to dynamic prices, while suppliers can optimize generation based on demand signals, thereby forming a flexible, price-driven regulatory closed loop [5,6,7,8].

Power demand response management is central to the operational efficiency of smart grids. Real-time pricing (RTP) is widely recognized as the most economically efficient mechanism, as it can accurately reflect the instantaneous marginal cost of supply and demand in forward-looking transactions [9,10,11]. Existing studies on RTP in smart grids mainly fall into three categories.

The first is the social welfare maximization model (SWMM), which achieves social optimality by maximizing the difference between total consumer utility and total power supply cost, with the shadow price serving as the real-time price [12,13,14,15]. Li et al. [14] formulated a constrained optimization model for commercial users, used a smoothing cosh function to replace the KKT complementarity conditions, and proposed a smoothing Newton algorithm. Gao [15] determined the electricity price as the Lagrange multiplier of the SWMM and examined the role of the lower bound of power supply, verifying that the simplified model remains applicable to the online dual method.

The second category adopts a game-theoretic perspective. Since electricity markets involve multiple self-interested participants, including power generation companies, grid operators, and end-users, whose decisions are mutually dependent and strategically interconnected, game theory provides a natural framework for characterizing interactions among market participants. By modeling competitive and cooperative behaviors under dynamic pricing mechanisms, game-theoretic approaches can effectively describe market equilibrium and participants’ responses to price signals [16]. In this framework, the relationship between suppliers and consumers is reconstructed as one between equal decision-makers under a “game equilibrium” framework, thereby aligning distributed individual rationality with centralized social welfare through price signals [17,18,19,20]. Zhang et al. [17] considered a supplier and multiple residential consumers, establishing a nested game model with a Stackelberg game between the supplier and the user group and an evolutionary game among users. Dai et al. [19] constructed a multi-leader–follower Stackelberg game model based on RTP, where retailers maximize profits and users compete for optimal consumption, and solved for the Stackelberg equilibrium via Lagrange multipliers.

The third category leverages artificial intelligence, employing reinforcement learning and deep learning to handle high-dimensional uncertainties and complex user behaviors, representing the most cutting-edge direction [21,22,23,24]. Wang et al. [21] proposed a bilevel RTP model where the upper level maximizes supplier profit using Q-learning and the lower level optimizes each user’s consumption via a Markov decision process. Liu et al. [24] preprocessed data with weighted grey relational projection, improved a bidirectional LSTM model with an attention mechanism, and combined it with XGBoost using reciprocal error weighting.

However, to the best of our knowledge, most existing studies focus on improving overall operational efficiency and tend to overlook the balance of demands among multiple stakeholders. A sustained decline in any party’s welfare may affect its willingness to participate, thereby posing potential challenges to the long-term robustness and sustainable development of the system. Moreover, the convergence of some heuristic algorithms, such as particle swarm optimization and genetic algorithms, often depends on parameter settings and problem structures and often lacks unified theoretical guarantees. To address these issues, a real-time pricing model for multi-type users in smart grids based on the load–utility equilibrium is proposed. The upper-level model is oriented toward social equity, aiming to narrow the welfare gap among power generation enterprises, grid companies, and consumers, thereby promoting fairness in interest distribution. The lower-level model is designed to improve the economic efficiency of the system, with the objective of maximizing total social welfare, which includes power generation costs, grid operation costs, and users’ utility from electricity consumption. The two levels of the model are coupled through the electricity price variables, together forming a preliminary framework for collaborative interest optimization. To solve this constrained bilevel programming problem, the KKT conditions are adopted, and the model is transformed into an equivalent single-level optimization model with smooth equality constraints. Furthermore, the CHKS smoothing function and the pseudo-Huber function are introduced to handle the complementarity conditions and the absolute-value objective, respectively, and a modified rolling penalty function algorithm is developed for numerical solution. Finally, simulation results provide support for the feasibility of the proposed model and algorithm.
Table 1 summarizes the differences between the existing literature and this study in terms of research scope and methodology.

The contributions of this paper are as follows:

(1): A tripartite welfare equilibrium model is proposed, involving power plants, the grid company, and end-users. In this model, the upper-level problem minimizes the absolute welfare gaps among the three parties to ensure fairness, while the lower-level problem maximizes the total social welfare.
(2): Differentiated utility functions for residential and industrial users are incorporated into the proposed bilevel tripartite framework to characterize user heterogeneity, thereby enhancing the practicality and adaptability of the pricing strategy.
(3): The bilevel programming problem is transformed into a single-level optimization problem with complementarity constraints using the Karush–Kuhn–Tucker (KKT) conditions. Smoothing functions and the pseudo-Huber function are employed to handle the complementarity conditions and the absolute-value objective, respectively. Furthermore, a modified rolling penalty function algorithm incorporating a central-difference scheme is developed to solve the resulting single-level problem, and the existence and uniqueness of the solution are proved.
(4): Simulation results demonstrate that the proposed model significantly reduces the welfare gaps among the three parties while maintaining the total social welfare almost unchanged. It effectively lowers the peak-to-average load ratio, thereby achieving peak shaving and valley filling. Compared with the traditional social welfare maximization model (SWMM), the proposed approach exhibits superior performance in terms of fairness of benefit distribution and long-term system stability.

The remainder of the paper is organized as follows. In Section 2, an equilibrium model is constructed based on the bilevel programming framework. In Section 3, the bilevel program is transformed into an equivalent single-level optimization problem via the KKT conditions; to approximate the complementarity conditions arising from the KKT conditions, a smoothing function is constructed, which further transforms the problem into a single-level optimization problem with smooth equality constraints. In Section 4, a modified rolling penalty function algorithm is proposed to solve the resulting single-level optimization problem. In Section 5, simulations demonstrate the effectiveness of the proposed algorithm and model.

2. System Model

Consider a smart power system comprising an electricity supplier, a grid company, and several users. Suppose the power supplier and all users are connected to each other through an information and communication infrastructure. Divide the whole cycle into $T$ periods and let $N = N_{1} \cup N_{2}$ be the set of users requiring electricity, where $N_{1} = \{1, 2, \dots, N_{1}\}$ is the set of residential users and $N_{2} = \{1, 2, \dots, N_{2}\}$ is the set of industrial users (with a slight abuse of notation, $N_{1}$ and $N_{2}$ also denote the number of users in each set). Let $x_{i}^{t}$ and $y_{i}^{t}$ denote the power consumption demand of the $i$-th residential user and the $i$-th industrial user, respectively, at time slot $t \in T$. Denote by $L_{x}^{t}$ the generation capacity for residential users, by $L_{y}^{t}$ the generation capacity for industrial users, and by $L_{t} = L_{x}^{t} + L_{y}^{t}$ the total generation capacity at time slot $t$.

2.1. User Model

In smart grid systems, electricity consumers constitute the core entities on the demand side. According to microeconomics, consumer electricity usage behavior can be characterized by a corresponding utility function. Denoting this utility function as $U (\cdot)$, it possesses the following properties:

(i): The utility function is nondecreasing, i.e., $\frac{\partial U (\cdot)}{\partial x} \geq 0$.
(ii): The marginal utility is nonincreasing, i.e., $\frac{\partial^{2} U (\cdot)}{\partial x^{2}} \leq 0$.

In economics, utility functions are typically employed to measure the degree of satisfaction consumers derive from the consumption of goods or activities. The utility function $U (x)$ examined herein takes electricity demand $x$ as its independent variable, serving to characterise the quantitative relationship between electricity consumption and user utility. Specifically, this paper considers two types of utility functions. For residential users, we adopt the quadratic utility function [22]

U_{x} (x_{i}^{t}, ω_{x}^{t}) = ω_{x}^{t} x_{i}^{t} - \frac{α {(x_{i}^{t})}^{2}}{2},

and for commercial users, we adopt the logarithmic utility function [25]

U_{y} (y_{i}^{t}, ω_{y}^{t}) = β \log (ω_{y}^{t} y_{i}^{t} + 5),

where $ω_{x}^{t}$ and $ω_{y}^{t}$ are non-negative parameters describing the distinct electricity preference types of different users, and $α$ and $β$ are pre-determined constants characterizing the saturation properties of the utility functions.
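As a minimal sketch, the two utility forms can be evaluated numerically to check properties (i) and (ii). The function names and the value of the preference parameter are illustrative assumptions, not part of the original model; the defaults α = 0.5 and β = 5 are the values reported in the numerical section. Note that the quadratic form is nondecreasing only for demands up to ω/α, which the sample points respect.

```python
import math

def residential_utility(x, omega, alpha=0.5):
    """Quadratic utility U_x = omega * x - alpha * x^2 / 2."""
    return omega * x - 0.5 * alpha * x * x

def commercial_utility(y, omega, beta=5.0):
    """Logarithmic utility U_y = beta * log(omega * y + 5)."""
    return beta * math.log(omega * y + 5.0)

def marginal(utility, x, omega, h=1e-5):
    """Central-difference estimate of the marginal utility dU/dx."""
    return (utility(x + h, omega) - utility(x - h, omega)) / (2.0 * h)

omega = 4.0                      # hypothetical preference parameter
points = [1.0, 3.0, 5.0, 7.0]    # all below the saturation point omega / alpha = 8
res_marginals = [marginal(residential_utility, x, omega) for x in points]
com_marginals = [marginal(commercial_utility, y, omega) for y in points]
```

Both lists of marginal utilities are positive and decrease as demand grows, which is the behavior required by properties (i) and (ii) on this range.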

Based on the user utility function for time slot t, the user’s benefit function at the time slot t is obtained and expressed as follows:

\begin{matrix} R_{u}^{t} = \sum_{i = 1}^{N_{1}} (U_{x} (x_{i}^{t}, ω_{x}^{t}) - p_{x_{2}}^{t} x_{i}^{t}) + \sum_{i = 1}^{N_{2}} (U_{y} (y_{i}^{t}, ω_{y}^{t}) - p_{y_{2}}^{t} y_{i}^{t}), \end{matrix}

(1)

where $p_{x_{2}}^{t}$ and $p_{y_{2}}^{t}$ denote the purchase prices for residential and commercial users, respectively.

2.2. Grid Company Model

In smart grid systems, the grid company serves as the intermediary entity connecting the electricity supply side with the demand side. It formulates real-time electricity tariffs based on the consumption patterns of the demand side, which constitutes both a key mechanism for maintaining the supply–demand equilibrium of the power system and the primary source of revenue for the grid company. Consequently, the revenue function of the grid company during time period t can be calculated based on the electricity consumption of various user categories and the corresponding tariffs during that period, as expressed in the following equation:

\begin{matrix} R_{c}^{t} = (p_{x_{2}}^{t} - p_{x_{1}}^{t}) \sum_{i = 1}^{N_{1}} x_{i}^{t} + (p_{y_{2}}^{t} - p_{y_{1}}^{t}) \sum_{i = 1}^{N_{2}} y_{i}^{t}, \end{matrix}

(2)

where $p_{x_{1}}^{t}$ and $p_{y_{1}}^{t}$ denote the prices at which the grid company procures electricity from the electricity supplier.

2.3. Electricity Supplier Model

Define $C (\cdot)$ as the electricity supplier’s generation cost, representing the cost of supplying $L_{t}$ units of electricity in the $t$-th time period, subject to two assumptions:

(i): The cost function $C (.)$ is strictly increasing;
(ii): The cost function $C (.)$ is strictly convex.

Commonly used cost functions of this type include piecewise linear functions and quadratic functions. To simplify the solution of the objective function, a quadratic function is adopted here as the electricity supplier’s cost function, namely

C_{t} (L_{t}) = a_{t} {(L_{t})}^{2} + b_{t} L_{t} + c_{t},

where $a_{t}$, $b_{t}$, and $c_{t}$ are the generation cost parameters, with $a_{t} > 0$ and $b_{t}, c_{t} \geq 0$.

Electricity suppliers generate revenue by setting corresponding purchase prices based on the grid company’s procured electricity volume. Therefore, the revenue function for electricity suppliers at time period t is as follows:

\begin{matrix} R_{s}^{t} = \sum_{i = 1}^{N_{1}} p_{x_{1}}^{t} x_{i}^{t} + \sum_{i = 1}^{N_{2}} p_{y_{1}}^{t} y_{i}^{t} - C_{t} (L_{t}) . \end{matrix}

(3)
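The three welfare terms can be sketched for a single time slot as follows. This is an illustration under stated assumptions: the function name, the sample demands, and the preference parameters are hypothetical; supply is assumed to equal total demand, so that $L_{t}$ is the sum of all consumption; and $b_{t} = c_{t} = 0$. The price values are the initial prices reported in the numerical section.

```python
import math

def welfare_terms(x, y, omega_x, omega_y, prices, alpha=0.5, beta=5.0, a=0.01):
    """Return (R_u, R_c, R_s, total welfare) for one time slot."""
    p_x1, p_y1, p_x2, p_y2 = prices
    sum_x, sum_y = sum(x), sum(y)
    utility = sum(omega_x * xi - 0.5 * alpha * xi ** 2 for xi in x)
    utility += sum(beta * math.log(omega_y * yi + 5.0) for yi in y)
    # Assumption: supply matches demand, so L_t = sum(x) + sum(y); b_t = c_t = 0.
    cost = a * (sum_x + sum_y) ** 2
    r_u = utility - p_x2 * sum_x - p_y2 * sum_y           # users, Eq. (1)
    r_c = (p_x2 - p_x1) * sum_x + (p_y2 - p_y1) * sum_y   # grid company, Eq. (2)
    r_s = p_x1 * sum_x + p_y1 * sum_y - cost              # supplier, Eq. (3)
    return r_u, r_c, r_s, utility - cost

x = [2.0, 3.0, 4.0, 5.0, 6.0, 2.5]   # hypothetical residential demands
y = [6.0, 9.0]                       # hypothetical commercial demands
r_u, r_c, r_s, total = welfare_terms(x, y, 4.0, 2.0, (0.5, 0.9, 0.8, 1.2))
gap = abs(r_u - r_c) + abs(r_u - r_s) + abs(r_c - r_s)
```

The payments cancel when the three terms are added, so `r_u + r_c + r_s` equals total utility minus generation cost. Prices therefore redistribute welfare among the parties without changing its total, which is why a fairness objective on the welfare gaps can be layered on top of welfare maximization.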

2.4. Welfare Equilibrium Model of Real-Time Electricity Price

Reference [27] established a social welfare maximization model (SWMM) based on the utility benefits of all entities within the power grid system. Solving this model yields the optimal electricity consumption for users across all time intervals. Furthermore, by designing and solving the corresponding Lagrange multiplier, the corresponding real-time electricity prices can be obtained. Consequently, it has been widely applied in research on real-time electricity pricing for smart grids. The model is as follows:

\begin{matrix} max \sum_{t = 1}^{T} (\sum_{i = 1}^{N_{1}} U_{x} (x_{i}^{t}, ω_{x}^{t}) + \sum_{i = 1}^{N_{2}} U_{y} (y_{i}^{t}, ω_{y}^{t}) - C (L_{t})) \\ s . t . \sum_{i = 1}^{N_{1}} x_{i}^{t} \leq L_{x}^{t}, t \in T, \\ \sum_{i = 1}^{N_{2}} y_{i}^{t} \leq L_{y}^{t}, t \in T . \end{matrix}

(4)

As shown in Equation (4), while the traditional social welfare maximization model achieves overall maximization of user utility and minimization of production costs, it fails to address the issue of revenue equilibrium among users, grid companies, and electricity suppliers.

Figure 1 summarizes the overall research workflow and technical route of this paper, including bilevel pricing model formulation, equivalent transformation, algorithm design and simulation verification.

3. RTP Formulation and the Equivalent Single-Level Optimization

The above analysis demonstrates that while real-time electricity pricing mechanisms based on social welfare maximization models can achieve optimal overall social welfare, they struggle to ensure balanced welfare distribution among the three parties within the power system: consumers, grid operators, and electricity suppliers. From an operational perspective, simultaneously minimizing welfare disparities among these three parties and maximizing overall social welfare in real-time pricing optimization is crucial for maintaining the long-term stability and coordinated development of the power system. Therefore, this section proposes a bi-level equilibrium model based on the social welfare maximization model.

In the proposed bi-level equilibrium model, the upper-level structure minimizes welfare differences among the three parties, while the lower-level structure maximizes social welfare. The formula is as follows:

\begin{matrix} min \sum_{t = 1}^{T} (| R_{u}^{t} - R_{c}^{t} | + | R_{u}^{t} - R_{s}^{t} | + | R_{c}^{t} - R_{s}^{t} |), \\ max \sum_{t = 1}^{T} (R_{u}^{t} + R_{c}^{t} + R_{s}^{t}) \\ = \sum_{t = 1}^{T} (\sum_{i = 1}^{N_{1}} U_{x} (x_{i}^{t}, ω_{x}^{t}) + \sum_{i = 1}^{N_{2}} U_{y} (y_{i}^{t}, ω_{y}^{t}) - C (L_{t})) \\ s . t . \sum_{i = 1}^{N_{1}} x_{i}^{t} \leq L_{x}^{t}, t \in T, \\ \sum_{i = 1}^{N_{2}} y_{i}^{t} \leq L_{y}^{t}, t \in T . \end{matrix}

(5)

For each period $t \in T$, model (5) can be expressed as the following optimization problem:

\begin{matrix} min | R_{u}^{t} - R_{c}^{t} | + | R_{u}^{t} - R_{s}^{t} | + | R_{c}^{t} - R_{s}^{t} |, \\ max R_{u}^{t} + R_{c}^{t} + R_{s}^{t} \\ = (\sum_{i = 1}^{N_{1}} U_{x} (x_{i}^{t}, ω_{x}^{t}) + \sum_{i = 1}^{N_{2}} U_{y} (y_{i}^{t}, ω_{y}^{t}) - C (L_{t})) \\ s . t . \sum_{i = 1}^{N_{1}} x_{i}^{t} \leq L_{x}^{t}, \\ \sum_{i = 1}^{N_{2}} y_{i}^{t} \leq L_{y}^{t} . \end{matrix}

(6)

The lower-level problem maximizes a concave objective function subject to linear constraints. According to convex optimization theory, this constitutes a convex programming problem. Therefore, we may utilize the Karush–Kuhn–Tucker (KKT) conditions to transform the original bilevel problem into a single-level optimization problem [28]. The KKT conditions for the lower-level problem are as follows:

\begin{matrix} \{\begin{matrix} \frac{\partial U_{x} (x_{i}^{t}, ω_{x}^{t})}{\partial x} - λ_{x} = 0, \\ \frac{\partial U_{y} (y_{i}^{t}, ω_{y}^{t})}{\partial y} - λ_{y} = 0, \\ λ_{x} (L_{x} - \sum_{i = 1}^{N_{1}} x_{i}^{t}) = 0, λ_{x} \geq 0, L_{x} - \sum_{i = 1}^{N_{1}} x_{i}^{t} \geq 0, \\ λ_{y} (L_{y} - \sum_{i = 1}^{N_{2}} y_{i}^{t}) = 0, λ_{y} \geq 0, L_{y} - \sum_{i = 1}^{N_{2}} y_{i}^{t} \geq 0 . \end{matrix} \end{matrix}

(7)
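To make the residential part of these conditions concrete, consider the special case of identical residential users with the quadratic utility, so that the stationarity condition as written in (7) reads $ω - α x = λ_{x}$. The sketch below solves this case in closed form; the function name, the identical-user simplification, and the sample numbers are assumptions made only for illustration.

```python
def residential_kkt(omega, alpha, capacity, n_users):
    """Closed-form KKT point for n identical residential users.

    Returns (x, lam): the common consumption and the multiplier lambda_x.
    """
    unconstrained = omega / alpha              # solves omega - alpha * x = 0
    if n_users * unconstrained <= capacity:
        return unconstrained, 0.0              # capacity is slack, so lambda_x = 0
    x = capacity / n_users                     # capacity binds: equal split
    return x, omega - alpha * x                # lambda_x from stationarity

x_bind, lam_bind = residential_kkt(omega=4.0, alpha=0.5, capacity=30.0, n_users=6)
x_free, lam_free = residential_kkt(omega=4.0, alpha=0.5, capacity=60.0, n_users=6)
```

With a capacity of 30 the constraint binds and the call returns (5.0, 1.5); with a capacity of 60 the constraint is slack and the call returns (8.0, 0.0). In both cases the product of the multiplier and the unused capacity is zero, which is the complementarity condition.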

Replacing the lower-level problem with its KKT conditions (7), the original bilevel programming problem is transformed into the following single-level optimization problem:

\begin{matrix} min | R_{u}^{t} - R_{c}^{t} | + | R_{u}^{t} - R_{s}^{t} | + | R_{c}^{t} - R_{s}^{t} |, \\ s . t . (7) . \end{matrix}

(8)

The above steps transform the original bilevel program into the single-level optimization problem (8). However, the constraints (7) contain complementarity conditions. According to nonlinear complementarity theory [29], these conditions are equivalent to non-smooth equations, which makes problem (8) difficult to solve. To reduce this difficulty, smooth functions are constructed to approximate the complementarity conditions in Equation (7), and problem (8) is then transformed into a single-level optimization problem with only smooth equality constraints.

First, note that

a \geq 0, b \geq 0, a \cdot b = 0 \Leftrightarrow \min \{a, b\} = 0 \Leftrightarrow a - {(a - b)}_{+} = 0,

where ${(\cdot)}_{+} = \max \{\cdot, 0\}$.

Next, smooth functions are used to approximate this equation. Consider the CHKS smoothing function [30] of the “min” function, defined as

ϕ_{μ} (a, b) = \frac{a + b - \sqrt{μ^{2} + {(a - b)}^{2}}}{2},

where $μ > 0$ is a very small positive number and $a, b \in R$. This function has the strong Jacobian consistency property, as demonstrated in Ref. [30].

The smooth approximation $ϕ_{μ} (a, b)$ for different values of the parameter $μ$ is shown in Figure 2.

Here, $ϕ_{μ} (a, b)$ denotes the CHKS smoothing function, and we set $μ = 0.1$, $0.01$, and $0.001$, respectively. It can be observed from Figure 2 that the smoothing function $ϕ_{μ} (a, b)$ approximates $\min \{a, b\}$ increasingly closely as the parameter $μ$ decreases.
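The same behavior can be checked numerically. The sketch below evaluates the smoothing function in the form given above and compares it with the exact minimum; the helper name and the sample pairs are illustrative assumptions. For this form, the gap to $\min \{a, b\}$ is at most $μ / 2$ and is largest when $a = b$.

```python
import math

def chks(a, b, mu):
    """Smoothing of min(a, b): (a + b - sqrt(mu^2 + (a - b)^2)) / 2."""
    return 0.5 * (a + b - math.sqrt(mu * mu + (a - b) ** 2))

pairs = [(0.0, 2.0), (1.5, 0.0), (0.0, 0.0), (-1.0, 3.0)]   # hypothetical test points
worst_gap = {}
for mu in (0.1, 0.01, 0.001):
    # Largest deviation from the exact minimum over the sample pairs.
    worst_gap[mu] = max(abs(chks(a, b, mu) - min(a, b)) for a, b in pairs)
```

The recorded worst-case gaps shrink in proportion to μ, consistent with the convergence visible in the figure.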

Let

g_{x} = L_{x} - \sum_{i = 1}^{N_{1}} x_{i}^{t}, g_{y} = L_{y} - \sum_{i = 1}^{N_{2}} y_{i}^{t} .

Replacing the complementarity conditions in Equation (7) with $ϕ_{μ} (λ_{x}, g_{x}) = 0$ and $ϕ_{μ} (λ_{y}, g_{y}) = 0$, Equation (7) is approximated by the following smooth equations:

\begin{matrix} \{\begin{matrix} \frac{\partial U_{x} (x_{i}^{t}, ω_{x}^{t})}{\partial x} - λ_{x} = 0, \\ \frac{\partial U_{y} (y_{i}^{t}, ω_{y}^{t})}{\partial y} - λ_{y} = 0, \\ ϕ_{μ} (λ_{x}, g_{x}) = 0, \\ ϕ_{μ} (λ_{y}, g_{y}) = 0 . \end{matrix} \end{matrix}

(9)

Substituting Equation (9) for Equation (7), problem (8) is transformed into the following optimization with only smooth equality constraints:

\begin{matrix} min | R_{u}^{t} - R_{c}^{t} | + | R_{u}^{t} - R_{s}^{t} | + | R_{c}^{t} - R_{s}^{t} |, \\ s . t . (9) . \end{matrix}

(10)

The objective function in problem (10) contains absolute values and is therefore non-smooth. Here, we adopt the pseudo-Huber function as a smooth approximation of the absolute value function [31,32]. It is defined as

\begin{matrix} P_{η} (x) = \sqrt{η^{2} + x^{2}} - η, η > 0, \end{matrix}

(11)

where $η$ is the positive smoothing parameter. For all $x \in R$, it has the following properties.

(i): $P_{η} (x)$ is continuously differentiable everywhere;
(ii): $lim_{η \to 0} P_{η} (x) = | x |$ ;
(iii): The approximation error is bounded, i.e., $| P_{η} (x) - | x | | \leq η$ .

The smooth approximation $P_{η} (x)$ for different values of the parameter $η$ is shown in Figure 3.

Here, $P_{η} (x)$ denotes the pseudo-Huber smoothing function, and we set $η = 0.1$, $0.01$, and $0.001$, respectively. It can be observed from Figure 3 that the smoothing function $P_{η} (x)$ approximates $| x |$ increasingly closely as the parameter $η$ decreases.
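A short numerical check of properties (i) to (iii) is sketched below. The helper names and sample points are illustrative assumptions; the derivative formula follows directly from differentiating Equation (11).

```python
import math

def pseudo_huber(x, eta):
    """P_eta(x) = sqrt(eta^2 + x^2) - eta, a smooth surrogate for |x|."""
    return math.sqrt(eta * eta + x * x) - eta

def pseudo_huber_grad(x, eta):
    """Derivative x / sqrt(eta^2 + x^2), defined for every x, including x = 0."""
    return x / math.sqrt(eta * eta + x * x)

samples = [-3.0, -0.5, 0.0, 0.2, 4.0]   # hypothetical evaluation points
errors = {
    eta: max(abs(pseudo_huber(x, eta) - abs(x)) for x in samples)
    for eta in (0.1, 0.01, 0.001)
}
```

Each recorded error stays below the corresponding η, and the derivative at zero is exactly zero, so the kink of the absolute value is removed.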

Substituting the pseudo-Huber function into problem (10), we obtain the transformed formulation as follows.

\begin{matrix} min \sqrt{η^{2} + {(R_{u}^{t} - R_{c}^{t})}^{2}} + \sqrt{η^{2} + {(R_{u}^{t} - R_{s}^{t})}^{2}} + \sqrt{η^{2} + {(R_{c}^{t} - R_{s}^{t})}^{2}} - 3 η, \\ s . t . (9) . \end{matrix}

(12)

4. Rolling Penalty Function Algorithm

In this section, we give a modified rolling penalty function algorithm for problem (12).

Denote

z = {(x_{1}^{t}, x_{2}^{t}, \dots, x_{N_{1}}^{t}, y_{1}^{t}, \dots, y_{N_{2}}^{t}, p_{x_{1}}^{t}, p_{y_{1}}^{t}, p_{x_{2}}^{t}, p_{y_{2}}^{t}, L_{x}^{t}, L_{y}^{t}, η)}^{T},

f (z) = \sqrt{η^{2} + {(R_{u}^{t} - R_{c}^{t})}^{2}} + \sqrt{η^{2} + {(R_{u}^{t} - R_{s}^{t})}^{2}} + \sqrt{η^{2} + {(R_{c}^{t} - R_{s}^{t})}^{2}} - 3 η,

and let $h_{i_{1}}$ denote the constraint functions of problem (12).

Then, we construct the penalty function corresponding to problem (12) as follows:

F_{μ, η} (z, σ) = f (z) + σ \sum_{i_{1} = 1}^{N} h_{i_{1}}^{2} (z, μ) .
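As a rough sketch, the smoothed objective and the quadratic penalty can be assembled as below. The function names are hypothetical, and the constraint residuals are passed in as plain numbers rather than being computed from the smoothed KKT system, which keeps the example self-contained.

```python
import math

def smoothed_gap_objective(r_u, r_c, r_s, eta):
    """Objective of problem (12): pseudo-Huber smoothing of the three welfare gaps."""
    gaps = (r_u - r_c, r_u - r_s, r_c - r_s)
    return sum(math.sqrt(eta * eta + g * g) for g in gaps) - 3.0 * eta

def penalty_value(objective, residuals, sigma):
    """F = f + sigma * sum(h_i^2) for given constraint residuals h_i."""
    return objective + sigma * sum(h * h for h in residuals)

f_equal = smoothed_gap_objective(1.0, 1.0, 1.0, eta=0.01)      # no welfare gap
f_unequal = smoothed_gap_objective(3.0, 1.0, 0.0, eta=0.01)    # gaps 2, 3 and 1
F_example = penalty_value(6.0, [0.1, -0.2], sigma=10.0)
```

With equal welfare the smoothed objective vanishes, while in the second case it lies within 3η of the exact gap sum of 6. The penalty term adds σ times the squared residuals, so any violation of the smoothed constraints is charged more heavily as σ grows.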

We next present the following theorem, which characterizes the relationship between the smoothed and original problems under the above assumptions.

Theorem 1.

Let $F_{μ, η} (z, σ)$ be the smoothing penalty function of problem (12) with the CHKS ($μ > 0$) and pseudo-Huber ($η > 0$) smoothing functions, and let $F (z, σ)$ be the penalty function of the non-smooth single-level problem (8). Suppose that both functions are bounded on $R^{n}$, and let $v_{μ, η} = min_{z} F_{μ, η} (z, σ)$ and $v = min_{z} F (z, σ)$. Then, it holds that

| v_{μ, η} - v | \leq K (μ + η),

where $K > 0$ is a constant independent of $μ$ and $η$. Further, if $\bar{z}$ is an $ε$-optimal solution of $min_{z} F_{μ, η} (z, σ)$, then it is a $(K (μ + η) + ε)$-optimal solution of $min_{z} F (z, σ)$.

Proof.

According to the above description, the penalty functions of the non-smooth problem (8) and the smooth problem (12) are, respectively,

F (z, σ) = f (z) + σ \sum_{i_{1} = 1}^{N} h_{i_{1}}^{2} (z),

F_{μ, η} (z, σ) = f_{μ, η} (z) + σ \sum_{i_{1} = 1}^{N} h_{i_{1}}^{2} (z, μ),

where $f (z)$ denotes the original objective function containing absolute values, $f_{μ, η} (z)$ represents the objective function smoothed by the pseudo-Huber function, $h_{i_{1}} (z)$ refers to the original complementarity constraint, and $h_{i_{1}} (z, μ)$ denotes the equality constraint smoothed via the CHKS smoothing method.

By virtue of the triangle inequality,

| F_{μ, η} (z, σ) - F (z, σ) | \leq | f_{μ, η} (z) - f (z) | + σ \sum_{i_{1}} | h_{i_{1}}^{2} (z, μ) - h_{i_{1}}^{2} (z) | .

By the smooth approximation property of the pseudo-Huber function, there exists a constant $K_{1} > 0$ such that

| f_{μ, η} (z) - f (z) | \leq K_{1} η .

By virtue of the complementarity constraint approximation property of the CHKS smoothing function, the constraint function is uniformly bounded over the feasible region. Hence, there exists a constant $K_{2} > 0$ such that

\sum_{i_{1}} | h_{i_{1}}^{2} (z, μ) - h_{i_{1}}^{2} (z) | \leq K_{2} μ .

Let $K = \max \{K_{1}, σ K_{2}\}$; then, for any $z \in R^{n}$, we have

| F_{μ, η} (z, σ) - F (z, σ) | \leq K_{1} η + σ K_{2} μ \leq K (μ + η),

and, taking the minimum with respect to $z$, we obtain

| v_{μ, η} - v | \leq K (μ + η) .

Suppose that $\bar{z}$ satisfies $F_{μ, η} (\bar{z}, σ) - v_{μ, η} \leq ε$; then

F (\bar{z}, σ) - v \leq F_{μ, η} (\bar{z}, σ) + K (μ + η) - (v_{μ, η} - K (μ + η)) \leq 2 K (μ + η) + ε,

and thus, after absorbing the factor 2 into the constant $K$, $\bar{z}$ is a $(K (μ + η) + ε)$-optimal solution to the original problem. □

Remark 1.

Theorem 1 above proves that the error between the optimal value of the penalty function after dual smoothing with CHKS and pseudo-Huber functions and that of the original non-smooth penalty function can be uniformly controlled by the smoothing parameters μ and η. Meanwhile, it indicates that the approximate optimal solution of the smoothed problem can serve as the approximate optimal solution to the original non-smooth problem. This provides a theoretical convergence guarantee for transforming the non-smooth bilevel optimization problem into a numerically solvable smooth single-level optimization problem.

We now present the modified rolling penalty function method, stated as Algorithm 1.

Remark 2.

Algorithm 1, developed here, is similar to the rolling penalty function algorithm reported in [33]. Nevertheless, for gradient computation, we adopt the central difference approximation, as elaborated in Step 2 of Algorithm 1. Such a modification endows the algorithm with higher computational accuracy, smaller numerical errors, and stronger numerical stability throughout the iterative solution process. Accordingly, the convergence property and robustness of the overall optimization algorithm are effectively improved.
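The accuracy gain of the central difference over a one-sided difference can be illustrated on a function with a known derivative. The sketch below is only a toy comparison; the test function, evaluation point, and step size are arbitrary choices made for illustration.

```python
import math

def forward_diff(f, x, h):
    """One-sided difference, truncation error of order h."""
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    """Central difference, truncation error of order h^2."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0, h = 1.0, 1e-3
exact = math.exp(x0)                       # derivative of exp is exp itself
err_forward = abs(forward_diff(math.exp, x0, h) - exact)
err_central = abs(central_diff(math.exp, x0, h) - exact)
```

For this step size the central-difference error is several orders of magnitude smaller than the forward-difference error, at the cost of one extra function evaluation per coordinate.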

Algorithm 1: Rolling Penalty Function Method

Step 0 Initialization. Choose the tolerance $ε > 0$, the penalty factor multiplier $c > 1$, the smoothing coefficient $μ = 0.001$, the initial penalty factor $σ_{1} > 0$, and the initial point $z^{(0)}$. Set $k_{1} := 1$, $t = 1$, $μ_{1} = 0.001$, $c_{1} > 0$, and the inner tolerance $e^{'} > 0$.

Step 1 Stopping criterion. If $σ_{k_{1}} \sum_{i_{1} = 1}^{N} h_{i_{1}}^{2} (z^{(k_{1})}) < ε$, then stop; otherwise, go to Step 2.

Step 2 Gradient update. Compute

\nabla F_{i} (z^{(t)}) = \frac{F (z^{(t)} + h e_{i}) - F (z^{(t)} - h e_{i})}{2 h},

where $e_{i}$ is the $i$-th unit vector and $h$ is the step size of the central difference.

Step 3 Set the step size. The step size $μ_{1}$ is determined by a line search rule based on the following inequality:

F (z^{(t)} - μ_{1} \nabla F (z^{(t)})) > F (z^{(t)}) - c_{1} μ_{1} {∥ \nabla F (z^{(t)}) ∥}^{2} .

Step 4 Convergence check. If $∥ z^{(k_{1}), t + 1} - z^{(k_{1}), t} ∥ < e^{'}$, set $z^{k_{1}} = z^{(k_{1}), t + 1}$ and go to Step 5; otherwise, set $t = t + 1$ and go to Step 2.

Step 5 Update penalty factors. Set $σ_{k_{1} + 1} = c \cdot σ_{k_{1}}$ and $k_{1} = k_{1} + 1$, then return to Step 1.
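The steps above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' MATLAB implementation: the function names, the default parameter values, and the toy problem are assumptions. In particular, Step 3 is read here as a backtracking rule that halves the step while the stated inequality holds, and the iterate is then updated by a gradient step; neither detail is spelled out in the algorithm statement.

```python
import math

def central_gradient(func, z, h=1e-6):
    """Step 2: central-difference estimate of the gradient of func at z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad

def rolling_penalty(f, constraints, z0, sigma=1.0, growth=4.0, eps=1e-4,
                    inner_tol=1e-9, c1=1e-4, max_outer=30, max_inner=5000):
    z = list(z0)
    for _ in range(max_outer):
        # Step 1: stop once the weighted constraint violation is small.
        if sigma * sum(h(z) ** 2 for h in constraints) < eps:
            break

        def penalized(v, s=sigma):
            return f(v) + s * sum(h(v) ** 2 for h in constraints)

        for _ in range(max_inner):
            g = central_gradient(penalized, z)
            g_sq = sum(gi * gi for gi in g)
            step, base = 1.0, penalized(z)
            # Step 3 (assumed reading): halve the step until sufficient decrease.
            while step > 1e-14:
                trial = [zi - step * gi for zi, gi in zip(z, g)]
                if penalized(trial) <= base - c1 * step * g_sq:
                    break
                step *= 0.5
            moved = math.sqrt(sum((p - q) ** 2 for p, q in zip(trial, z)))
            z = trial
            # Step 4: inner convergence check on successive iterates.
            if moved < inner_tol:
                break
        sigma *= growth  # Step 5: enlarge the penalty factor.
    return z, sigma

# Hypothetical toy problem: minimize (z0-2)^2 + (z1-1)^2 subject to z0 + z1 = 2.
solution, final_sigma = rolling_penalty(
    lambda z: (z[0] - 2.0) ** 2 + (z[1] - 1.0) ** 2,
    [lambda z: z[0] + z[1] - 2.0],
    [0.0, 0.0],
)
```

On the toy problem the iterates approach the constrained minimizer (1.5, 0.5) as the penalty factor grows. Each inner solve is warm-started from the previous one, which is what keeps the increasingly ill-conditioned penalized problems tractable for plain gradient descent.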

The convergence of the algorithm follows from Ref. [34]; we present the global and local convergence theorems below.

Theorem 2 (Global convergence). Let $z^{k + 1}$ be a global minimizer of $F (z, σ_{k})$, where the penalty parameters $\{σ_{k}\}$ increase monotonically to infinity. Then, every limit point $z^{*}$ of the sequence $\{z^{k}\}$ is a global minimizer of the original problem.

Theorem 3 (Local convergence). Let the objective function $f (z)$ and the constraint functions $h_{i_{1}} (z, μ)$ ($i_{1} = 1, 2, \dots, N$) be continuously differentiable. Let $\{ε_{k}\}$ be a positive sequence such that $ε_{k} \to 0$, and let $σ_{k} \to + \infty$. Suppose that, in Algorithm 1, the solution $z^{k + 1}$ of the unconstrained optimization problem satisfies

∥ \nabla F_{μ, η} (z^{k + 1}, σ_{k}) ∥ \leq ε_{k},

and that, for any limit point $z^{*}$ of $\{z^{k}\}$, the set $\{\nabla_{z} h_{i_{1}} (z^{*}, μ) ∣ i_{1} = 1, 2, \dots, N\}$ is linearly independent. Then $z^{*}$ is a KKT point of the equality-constrained optimization problem

\begin{matrix} min_{z} f (z), \\ s . t . h_{i_{1}} (z, μ) = 0, i_{1} = 1, 2, \dots, N, \end{matrix}

and

lim_{k \to \infty} (- 2 σ_{k} h_{i_{1}} (z^{k + 1}, μ)) = λ_{i_{1}}^{*}, \forall i_{1} = 1, 2, \dots, N,

where $λ_{i_{1}}^{*}$ is the Lagrange multiplier corresponding to the constraint $h_{i_{1}} (z^{*}, μ) = 0$. Here, $F_{μ, η} (z, σ)$ denotes the quadratic penalty function:

F_{μ, η} (z, σ) = f (z) + σ \sum_{i_{1} = 1}^{N} h_{i_{1}}^{2} (z, μ) .
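The multiplier limit in Theorem 3 can be seen on a one-dimensional toy problem, chosen here purely for illustration: minimize $z^{2}$ subject to $h (z) = z - 1 = 0$. The penalized minimizer is available in closed form, so no solver is needed. Under the sign convention implied by the limit formula, namely $\nabla f (z^{*}) = λ^{*} \nabla h (z^{*})$, the multiplier of this toy problem is $λ^{*} = 2$.

```python
def penalty_minimizer(sigma):
    """Minimizer of z^2 + sigma * (z - 1)^2, from 2z + 2*sigma*(z - 1) = 0."""
    return sigma / (1.0 + sigma)

def multiplier_estimate(sigma):
    """Estimate -2 * sigma * h(z) evaluated at the penalized minimizer."""
    z = penalty_minimizer(sigma)
    return -2.0 * sigma * (z - 1.0)

sigmas = [1.0, 10.0, 100.0, 1e4, 1e6]          # hypothetical penalty schedule
estimates = [multiplier_estimate(s) for s in sigmas]
```

The estimates equal 2σ/(1 + σ), so they increase toward 2 as σ grows, while the minimizers approach the feasible point z = 1 from below.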

5. Numerical Evaluation

In this section, simulations are conducted to demonstrate the effectiveness of the proposed RTP scheme. The numerical experiments are conducted in MATLAB R2023a on an AMD Ryzen 7 5700U with Radeon Graphics and 16 GB RAM.

Considering a smart grid system in a small area, we present numerical simulation results over a 24 h horizon to evaluate daily operation. The system comprises an electricity supplier, a grid company, six residential users, and two commercial users. The residential users’ electricity demand x_{i} (0) is randomly selected from the interval [2, 6], where i = 1, 2, \dots, N_{1}. The commercial users’ electricity demand y_{j} (0) is randomly selected from the interval [5, 10], where j = 1, 2, \dots, N_{2}. In the utility function, α = 0.5 and β = 5; the cost function uses parameters a = 0.01 and b = c = 0 [13,26]. The initial prices at which residential and commercial users purchase electricity from the grid company are set as p_{x_{2}} (0) = 0.8 and p_{y_{2}} (0) = 1.2, respectively, whilst the grid company’s purchase prices from the supplier are set as p_{x_{1}} (0) = 0.5 and p_{y_{1}} (0) = 0.9. Additionally, the initial electricity supply quantities are L_{x} (0) = \sum_{i = 1}^{N_{1}} x_{i} (0) and L_{y} (0) = \sum_{j = 1}^{N_{2}} y_{j} (0). To reduce the effect of the randomly selected demands, all numerical results in this section are averaged over ten independent runs.
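The setup above can be sketched as follows. This is an illustrative Python sketch rather than the authors' MATLAB code. It assumes that "randomly selected" means a uniform draw, and the seed, the function names, and the dictionary keys are hypothetical. The parameter values are those stated in the text.

```python
import random

N1, N2 = 6, 2  # six residential users and two commercial users

PARAMS = {
    "alpha": 0.5, "beta": 5.0,        # utility function parameters
    "a": 0.01, "b": 0.0, "c": 0.0,    # cost function parameters
    "p_x2": 0.8, "p_y2": 1.2,         # initial prices paid by users to the grid company
    "p_x1": 0.5, "p_y1": 0.9,         # initial prices paid by the grid company to the supplier
}

def draw_initial_demands(rng):
    # Assumption: demands are drawn uniformly from the stated intervals.
    x0 = [rng.uniform(2.0, 6.0) for _ in range(N1)]
    y0 = [rng.uniform(5.0, 10.0) for _ in range(N2)]
    return x0, y0

def initial_supply(x0, y0):
    # L_x(0) and L_y(0) are the sums of the initial demands.
    return sum(x0), sum(y0)

def averaged_over_runs(metric, runs=10, seed=0):
    # Average a scalar metric over independent random draws.
    rng = random.Random(seed)
    values = []
    for _ in range(runs):
        x0, y0 = draw_initial_demands(rng)
        values.append(metric(x0, y0))
    return sum(values) / runs

mean_total_supply = averaged_over_runs(lambda x0, y0: sum(initial_supply(x0, y0)))
```

In a full experiment the `metric` argument would wrap one complete solve of the pricing model; here it only returns the total initial supply so that the averaging logic can be run on its own.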

First, we verify the iterative convergence performance of the proposed algorithm for the pricing model, as shown in Figure 4. In the early stage of iteration, the real-time electricity price of each time slot fluctuates sharply. As the iteration continues, the price fluctuation gradually weakens and the range narrows, and the price eventually converges steadily to a unique equilibrium price.

Figure 5 illustrates the procurement and retail electricity prices for residential and commercial users over a 24 h period. The left-hand panel shows the prices at which residential and commercial users purchase electricity from the grid company, while the right-hand panel shows the prices at which the grid company procures electricity from the electricity supplier.

Figure 6 presents the peak-to-average ratio (PAR) for both models. Labels 1, 2, and 3 denote the PAR values for residential users, commercial users, and all users combined under the bilevel model, and labels 4, 5, and 6 denote the corresponding values under SWMM. The PAR values under the bilevel model are lower than those under SWMM, indicating that the bilevel model reduces peak-period loads relative to the average load.
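For reference, the PAR of a load profile can be computed as in the sketch below. It assumes the usual definition, peak load divided by mean load over the horizon. The two 24-slot profiles are hypothetical and are not the simulated loads behind Figure 6.

```python
def peak_to_average_ratio(load):
    # PAR = max load / mean load over the scheduling horizon.
    if not load:
        raise ValueError("load profile must be non-empty")
    average = sum(load) / len(load)
    if average <= 0:
        raise ValueError("average load must be positive")
    return max(load) / average

# Hypothetical 24 h profiles with the same total energy (240 units).
flat_profile = [10.0] * 24
peaky_profile = [8.0] * 20 + [20.0] * 4

par_flat = peak_to_average_ratio(flat_profile)
par_peaky = peak_to_average_ratio(peaky_profile)
```

A perfectly flat profile gives a PAR of 1, the lower bound. Shifting consumption out of the peak slots moves the PAR toward that bound without changing the total energy.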

Figure 7 and Figure 8 illustrate the welfare of the electricity supplier, the grid company, and users, together with the total social welfare, under both the bilevel model and SWMM. The numerical results show that the differences in welfare among the electricity supplier, the grid company, and users are smaller under the bilevel model than under SWMM, while the gap in total social welfare between the two models is negligible. The bilevel model therefore achieves a more balanced distribution of benefits among the three stakeholders while largely preserving total social welfare, which supports a fairer allocation of interests and the long-term stable operation of the power system.
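One simple way to quantify the welfare disparity discussed above is the sum of pairwise absolute welfare differences among the three stakeholders. The sketch below assumes this measure, which may differ in detail from the paper's upper-level objective. The smoothing sqrt(x^2 + η^2) - η is one common pseudo-Huber-type approximation of |x| and may not match the exact form of P_{η} (x) used earlier. The welfare triples are hypothetical numbers, not results from Figures 7 and 8.

```python
import math
from itertools import combinations

def pairwise_gap(welfares):
    # Sum of |W_a - W_b| over all stakeholder pairs.
    return sum(abs(a - b) for a, b in combinations(welfares, 2))

def pseudo_huber_abs(x, eta):
    # Smooth approximation of |x|; the error lies between 0 and eta.
    return math.sqrt(x * x + eta * eta) - eta

def smoothed_pairwise_gap(welfares, eta=1e-3):
    # Differentiable surrogate of pairwise_gap.
    return sum(pseudo_huber_abs(a - b, eta) for a, b in combinations(welfares, 2))

# Hypothetical (supplier, grid company, users) welfare triples.
unbalanced = (30.0, 10.0, 50.0)   # total welfare 90
balanced = (28.0, 24.0, 36.0)     # total welfare 88

gap_unbalanced = pairwise_gap(unbalanced)
gap_balanced = pairwise_gap(balanced)
smooth_balanced = smoothed_pairwise_gap(balanced)
```

In this illustration the second allocation gives up a small amount of total welfare in exchange for a much smaller pairwise gap, which is the kind of trade-off the comparison between the bilevel model and SWMM is meant to expose.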

Figure 9, Figure 10 and Figure 11 compare electricity supplier welfare, grid company welfare, user welfare and total social welfare between the bilevel model and SWMM in a scenario expanded to 30 residential users and three commercial users. The results show that, at the enlarged user scale, the welfare of the electricity supplier and of the grid company both increase under the bilevel model compared with SWMM, while user welfare decreases slightly. This narrows the welfare gap among the three stakeholders. In addition, the total social welfare of the two models is close. This demonstrates that the bilevel model achieves welfare equilibrium among the participants of the power system while maintaining a stable level of total social welfare.

6. Conclusions

In this paper, we consider the inherent trade-off between equitable benefit allocation and operational efficiency in real-time electricity pricing for power systems involving electricity suppliers, a grid company, and end-users. To address the drawbacks of traditional pricing mechanisms, in particular the skewed distribution of benefits across stakeholders, we develop a bilevel optimization framework for real-time pricing that centers on tripartite welfare equilibrium. The upper-level model minimizes the welfare differences among stakeholders to uphold distributional fairness, whereas the lower-level model maximizes total social welfare to preserve the economic efficiency of the power system. Through the KKT optimality conditions, smoothing approximation techniques, and a penalty function-based solution algorithm, the bilevel optimization problem is reformulated into a numerically solvable single-level program. Numerical simulations confirm that the proposed pricing framework reduces benefit disparities among the three stakeholders while preserving the overall level of social welfare, thus enhancing the fairness and stability of the electricity market. However, the present study is limited to a single-source power supply configuration and does not capture the complexities of multi-source power systems, such as the integration of distributed energy resources and multi-agent competitive dynamics. Practical implementation also faces further challenges as new-type power systems evolve toward higher renewable energy penetration and more complex grid operations. The uncertainty introduced by wind and solar power, along with physical constraints such as transmission losses, voltage drops, and equipment wear in real-world grids, may affect the performance of the proposed equilibrium model. Therefore, in future work we will extend the framework by incorporating renewable energy uncertainty and physical grid constraints to enhance its applicability and robustness in practical power systems.

Author Contributions

Conceptualization, L.S.; Methodology, L.S.; Software, S.Z.; Writing—original draft, J.J.; Writing—review and editing, S.Z.; Project administration, L.S.; Funding acquisition, L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Science Foundation of China (no. 12101198), the Henan Province science and technology research project (nos. 262102211010 and 262102210015), and the Xinxiang soft science research program project (no. RKX2020008).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yang, Y.; Yan, G.; Mu, G.; Chen, Z. Hierarchical bidding strategy for heterogeneous P2H loads and wind power: State-driven aggregation and switching time scheduling. Energy 2025, 334, 137766.
2. Zhang, R.; Jiang, T.; Li, F.; Li, G.; Chen, H.; Li, X. Bi-level strategic bidding model for P2G facilities considering a carbon emission trading scheme-embedded LMP and wind power uncertainty. Int. J. Electr. Power Energy Syst. 2021, 128, 106740.
3. Lu, T.; Guo, Y.; Chen, X.; Nielsen, C.P.; McElroy, M.B. Economic behaviors of generation companies at higher penetration of renewables under different market concentrations. J. Clean. Prod. 2026, 556, 148086.
4. Ping, J.; Kong, S.; Yan, Z.; Xu, X.; Chen, S. Blockchain-based network-constrained peer-to-peer energy trading in a reconfigurable distribution network. Appl. Energy 2026, 405, 127195.
5. Jackson, J. Smart Grids. In Future Energy, 2nd ed.; Elsevier: Amsterdam, The Netherlands, 2014; pp. 633–651.
6. Palensky, P.; Kupzog, F. Smart Grids. Annu. Rev. Environ. Resour. 2013, 38, 201–226.
7. Fang, X.; Misra, S.; Xue, G.; Yang, D. Smart Grid—The New and Improved Power Grid: A Survey. IEEE Commun. Surv. Tutor. 2012, 14, 944–980.
8. Tuballa, M.L.; Abundo, M.L. A review of the development of Smart Grid technologies. Renew. Sustain. Energy Rev. 2016, 59, 710–725.
9. Roozbehani, M.; Dahleh, M.A.; Mitter, S.K. Volatility of Power Grids Under Real-Time Pricing. IEEE Trans. Power Syst. 2012, 27, 1926–1940.
10. Tan, R.; Krishna, V.B.; Yau, D.K.Y.; Kalbarczyk, Z. Impact of integrity attacks on real-time pricing in smart grids. In Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications security; Association for Computing Machinery: New York, NY, USA, 2013; pp. 439–450.
11. Widergren, S.; Marinovici, C.; Berliner, T.; Graves, A. Real-time pricing demand response in operations. In 2012 IEEE Power and Energy Society General Meeting; IEEE: Piscataway, NJ, USA, 2012; pp. 1–5.
12. Samadi, P.; Mohsenian-Rad, A.H.; Schober, R.; Wong, V.W.; Jatskevich, J. Optimal real-time pricing algorithm based on utility maximization for smart grid. In 2010 First IEEE International Conference on Smart Grid Communications; IEEE: Piscataway, NJ, USA, 2010; pp. 415–420.
13. Qu, D.; Li, J.; Shang, Y.; Li, Y.; Luo, S. Real-Time Pricing of Smart Grid Based on Smooth Approximation. J. Syst. Sci. Math. Sci. 2025, 45, 145–156.
14. Li, Y.; Li, J.; Dang, Y.; Gao, Y. Smoothing Newton Algorithm for Real-Time Pricing of Smart Grid Based on KKT Conditions. J. Syst. Sci. Math. Sci. 2020, 40, 646–656.
15. Gao, Y. The Social Welfare Maximization Model of Real-Time Pricing for Smart Grid. Chin. J. Manag. Sci. 2020, 28, 201–209.
16. Tian, H.; Wang, Z.; Zhang, P.; Li, J.; Wang, Q.; Zhang, S. Complex dynamics of a symmetric quantum Stackelberg duopoly game model with heterogeneous expectations. Phys. A Stat. Mech. Its Appl. 2026, 681, 131073.
17. Zhang, Q.; Suo, Q. A Nested Game Model for Real-Time Pricing of Smart Grid under Dual-Carbon Target. Econ. Comput. Econ. Cybern. Stud. Res. 2025, 59, 254–270.
18. Dai, Y.; Liu, Y. A real-time pricing scheme in power market with renewable energy considering different advertising attractiveness. Electr. Power Syst. Res. 2025, 249, 112058.
19. Dai, Y.; Gao, Y.; Gao, H.; Zhu, H.; Li, L. A real-time pricing scheme considering load uncertainty and price competition in smart grid market. J. Ind. Manag. Optim. 2020, 16, 777–793.
20. Oggioni, G.; Schwartz, A.; Wiertz, A.K.; Zöttl, G. Dynamic pricing and strategic retailers in the energy sector: A multi-leader-follower approach. Eur. J. Oper. Res. 2024, 312, 255–272.
21. Wang, J.; Gao, Y.; Li, R. Reinforcement learning based bilevel real-time pricing strategy for a smart grid with distributed energy resources. Appl. Soft Comput. 2024, 155, 111474.
22. Song, H.; Wang, Z.; Gao, Y. Bi-level real-time pricing model in multitype electricity users for welfare equilibrium: A reinforcement learning approach. J. Renew. Sustain. Energy 2025, 17, 015501.
23. Dai, Y.; Zhou, Q. Power load combination forecasting method based on improved Bi-LSTM and XGBoost. J. Univ. Shanghai Sci. Technol. 2022, 44, 138–147.
24. Liu, J.; Ma, Z.; Zhou, B.; Wu, H. TCN-BiGRU power load prediction based on improved gray wolf optimization algorithm. J. Univ. Electron. Sci. Technol. China 2025, 54, 916–923.
25. Song, L.; Sheng, G. A nonsmooth Levenberg-Marquardt method based on KKT conditions for real-time pricing in smart grid. Int. J. Electr. Power Energy Syst. 2024, 162, 110235.
26. Wang, H.; Gao, Y. Research on the real-time pricing of smart grid based on nonsmooth equations. J. Syst. Eng. 2018, 33, 320–327.
27. Song, L.; Sheng, G. A Smoothing Newton Method for Real-Time Pricing in Smart Grids Based on User Risk Classification. Mathematics 2025, 13, 822.
28. Wang, Y.; Liang, Z. Basic Theory and Method of Optimization; Fudan University Press: Shanghai, China, 2011.
29. Gao, Y. Nonsmooth Optimization; Science Press: Beijing, China, 2018.
30. Xiang, S.; Chen, X. Computation of generalized differentials in nonlinear complementarity problems. Comput. Optim. Appl. 2011, 50, 403–423.
31. Castro, J. A CTA Model Based on the Huber Function; Springer International Publishing: Cham, Switzerland, 2014.
32. Filipović, V. System identification using newton-raphson method based on synergy of huber and pseudo-huber functions. Facta Univ. Ser. Autom. Control Robot. 2021, 20, 87–98.
33. Tao, L.; Gao, Y.; Liu, Y.; Zhu, H. A Rolling Penalty Function Algorithm of Real-Time Pricing for Smart Microgrids Based on Bilevel Programming. Eng. Optim. 2020, 52, 1295–1312.
34. Wen, Z.; Yuan, Y. Methods and Theory of Optimisation; Higher Education Press: Beijing, China, 2024.

Figure 1. Flowchart of model construction, transformation, algorithm solution and simulation.

Figure 2. Approximation performance of nonsmooth function ϕ_{μ} (a, b) under different parameters.

Figure 3. Approximation performance of nonsmooth function P_{η} (x) under different parameters.

Figure 4. Iterative changes in prices.

Figure 5. Bilevel model of prices.

Figure 6. PAR in different pricing schemes.

Figure 7. Comparison of electricity supplier welfare and grid company welfare between bilevel model and SWMM.

Figure 8. Comparison of user welfare and total welfare between bilevel model and SWMM.

Figure 9. Comparison of electricity supplier welfare and grid company welfare between bilevel model and SWMM (expanded user scale).

Figure 10. Comparison of user welfare between bilevel model and SWMM (expanded user scale).

Figure 11. Comparison of total welfare between bilevel model and SWMM (expanded user scale).

Table 1. Existing research comparison.

References	Residential Users	Commercial Users	SWMM	Equilibrium	Traditional Algorithm
[13]	×	✓	✓	×	✓
[15]	✓	✓	✓	✓	×
[25]	✓	✓	✓	×	✓
[26]	✓	×	✓	×	✓
This paper	✓	✓	✓	✓	✓

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jia, J.; Zhang, S.; Song, L. Bilevel Real-Time Pricing for Tripartite Welfare Equilibrium in Smart Grids: Balancing Fairness and Efficiency. Mathematics 2026, 14, 2040. https://doi.org/10.3390/math14122040

