Article

Dynamic Bilevel Optimization of Market Participation and Strategic Bidding in Renewable-Dominated Electricity Markets

1
Beijing Power Exchange Center, Beijing 100031, China
2
School of Economics and Management, North China Electric Power University, Beijing 102206, China
3
China Electric Power Research Institute, Beijing 100192, China
*
Author to whom correspondence should be addressed.
Energies 2026, 19(5), 1285; https://doi.org/10.3390/en19051285
Submission received: 3 November 2025 / Revised: 18 December 2025 / Accepted: 20 December 2025 / Published: 4 March 2026

Abstract

This study advances a hierarchical bilevel optimization paradigm to rigorously characterize the intertwined processes of strategic bidding and regulatory market participation in electricity systems increasingly dominated by renewable resources. At the upper tier, a central regulatory authority orchestrates participation rules, renewable integration mandates, and incentive mechanisms with the overarching aim of maximizing system-wide social welfare while driving decarbonization and reliability objectives. At the subordinate level, profit-maximizing generation firms—each managing heterogeneous renewable portfolios—pursue strategic bidding under deep uncertainty, conceptualized as a multi-agent game governed by imperfect and asymmetric information. The interaction between these tiers is formalized as a bilevel Stackelberg game that encapsulates price-responsive demand, intertemporal reserve adequacy, and policy-driven incentive structures. To ensure both computational tractability and robustness against strategic indeterminacy, the lower-level equilibrium is reformulated into a mathematical program with equilibrium constraints (MPEC), enabling a hybrid solution procedure that combines penalty-based regularization with exact decomposition algorithms. The framework’s efficacy is validated through a stylized multi-zone case study featuring diverse renewable assets and strategic participants, revealing how policy signals, capacity ceilings, and market power asymmetries reshape efficiency frontiers and bidding equilibria. A set of high-resolution post-processing visualizations is further employed to illustrate the dynamic evolution of marginal prices, equilibrium trajectories, and regulatory impacts under uncertainty.

1. Introduction

The ongoing transformation of modern electricity systems—driven by the intertwined forces of decarbonization, decentralization, and digitalization—has introduced a new generation of challenges in safeguarding both long-term adequacy and short-term operational reliability [1,2,3]. As renewable penetration accelerates and conventional generators phase out, system operators are increasingly dependent on distributed energy resources aggregated through virtual power plants (VPPs) to supply capacity, reserves, and flexibility services [4,5]. Nevertheless, a critical body of literature reveals that capacity and reserve markets have evolved along largely disjoint trajectories, giving rise to persistent inefficiencies when stochastic and energy-limited resources attempt to participate across multiple market products simultaneously. Bridging this institutional and methodological divide has therefore emerged as a central research frontier, uniting concerns of system reliability, market efficiency, and regulatory coherence [6,7].
Historically, capacity markets were established to ensure long-term adequacy by compensating resources for their availability rather than their instantaneous power delivery. Foundational studies developed probabilistic adequacy metrics such as the loss-of-load expectation (LOLE) and the effective load-carrying capability (ELCC) as canonical indicators for accrediting the contribution of thermal units [6,8]. With the entrance of renewable technologies, researchers extended these frameworks through statistical convolution and Monte Carlo-based ELCC formulations to capture the stochastic capacity value of wind and solar generation. Simplified duration-based accreditation rules—such as the requirement that a unit sustain four hours of output—also emerged as pragmatic tools for resource qualification. While these approaches constitute the backbone of adequacy modeling, they often abstract away from the operational limitations of energy-constrained assets and their simultaneous engagement in reserve markets. Consequently, the divergence between nominal adequacy accreditation and realizable operational flexibility has grown increasingly pronounced as storage technologies, demand response programs, and electric vehicle fleets expand across power systems.
In parallel, the study of reserve markets has concentrated predominantly on short-term operational security [9,10]. Early formulations treated reserve requirements deterministically and satisfied them almost exclusively through synchronous generation. Over time, the stochasticity of renewable output and load variability necessitated more advanced modeling approaches incorporating activation probability distributions, intertemporal coupling, and probabilistic penalty mechanisms. To enhance robustness against extreme activation sequences and model misspecification, researchers introduced robust and distributionally robust optimization frameworks that ensure sustained deliverability of promised reserves under uncertainty. These contributions have provided powerful analytical tools for assessing compliance and reliability, yet most treat reserve provision as an isolated mechanism—decoupled from long-term adequacy accreditation and capacity market design [11]. This fragmentation underscores the necessity of an integrated framework that reconciles adequacy valuation, operational flexibility, and reserve deliverability within a coherent, multi-timescale optimization architecture.
The emergence of the virtual power plant (VPP) paradigm has introduced both an additional layer of complexity and a new spectrum of opportunities for market integration. A VPP aggregates geographically dispersed and technologically heterogeneous assets into a coordinated portfolio capable of participating competitively in wholesale electricity markets [12,13]. The existing literature documents how VPPs engage in day-ahead and real-time energy markets, arbitrage temporal price differences, and manage storage cycles under multiple sources of uncertainty. Methodologically, scholars have employed stochastic programming, robust optimization, and, more recently, reinforcement learning to construct adaptive bidding and scheduling strategies. Despite these advances, the prevailing emphasis remains largely confined to short-term participation and arbitrage, with limited exploration of how VPPs interface with long-term capacity accreditation or how their multi-product commitments dynamically interact across adequacy and reserve domains [14]. This omission has perpetuated a disjunction between theoretical bidding strategies and the institutional realities of capacity mechanisms. In response, the research community has increasingly shifted toward multi-timescale modeling frameworks. Scenario-tree formulations and rolling-horizon optimization methods have been developed to couple day-ahead, intraday, and real-time decisions, thereby improving forecast responsiveness and mitigating imbalance costs [15,16]. Within the VPP context, such multi-timescale structures have proven effective for coordinating distributed assets over short- and medium-term horizons; however, they seldom incorporate the seasonal or annual temporal dimensions intrinsic to capacity markets, nor do they embed performance-linked penalties that connect long-term accreditation with real-time operational outcomes. 
In the absence of these linkages, even the most sophisticated multi-timescale models fall short of capturing the fundamental trade-offs that distributed portfolios must navigate between enduring adequacy commitments and short-term flexibility obligations under uncertainty [17,18,19]. Early power system studies established deterministic and static representations of system states, providing the analytical foundations for reliability and equilibrium analysis. Building upon these classical formulations, the ongoing transformation of modern electricity systems—driven by the intertwined forces of decarbonization, decentralization, and digitalization—has introduced a new generation of challenges in safeguarding both long-term adequacy and short-term operational reliability. This paper introduces a unified and comprehensive mathematical framework to model the joint participation of virtual power plants (VPPs) in both capacity and reserve markets. Building upon foundational research in adequacy valuation, reserve deliverability, strategic bidding, and robust optimization, the proposed framework consolidates these traditionally separate analytical domains into an integrated bilevel formulation. At the upper level, the VPP is represented as a profit-maximizing aggregator that simultaneously determines capacity offers, reserve commitments, and real-time dispatch strategies, all while constrained by accreditation functions such as ELCC, duration-based adequacy requirements, storage dynamics, and compliance obligations. The lower level embodies the system operator’s market-clearing function, which co-optimizes capacity and reserve procurement, enforces adequacy standards, and determines equilibrium market prices. This hierarchical structure explicitly encapsulates the equilibrium tension between the VPP’s economic incentives and the system operator’s reliability imperatives, yielding a decision environment where private optimization and social welfare are inherently coupled. 
The novelty of this model is twofold. Conceptually, it represents the first rigorous synthesis of capacity accreditation, reserve deliverability, and multi-timescale operation within a single unified optimization structure, thereby bridging a long-standing gap in the literature. Methodologically, it embeds a distributionally robust optimization (DRO) layer to mitigate uncertainties in renewable generation, demand fluctuations, and reserve activation processes. By incorporating a Wasserstein-based ambiguity set, the framework extends beyond conventional stochastic programming approaches, accounting for the potential misspecification or incomplete representativeness of empirical probability distributions. This DRO formulation offers a tractable yet powerful means of hedging against distributional uncertainty, aligning the model with the forefront of contemporary research on risk-aware and reliability-oriented energy market optimization. To improve practical relevance, we explicitly describe how the proposed framework can support real operational workflows. The revised text highlights how the attacker–defender–dispatcher interactions correspond to real cyber events, mitigation planning, and dispatch adjustments used by utilities. Additionally, the result interpretation has been expanded to explain the operational meaning behind resilience gains, detection performance improvements, and risk-aware scheduling behavior. These clarifications help bridge the gap between theoretical formulations and their real-world implications, making the contribution more actionable for system operators and planners.
Furthermore, the proposed framework internalizes compliance verification and penalty mechanisms directly within the optimization architecture, rather than relegating them to ex post assessment. Performance scores and penalty functions are thus modeled as endogenous components of the decision space, ensuring that accredited capacity values are substantiated by operationally feasible behavior during scarcity intervals. Cross-market arbitrage is permitted to reflect realistic portfolio behavior, yet it is bounded by non-double-counting constraints and deliverability verifications that prevent the artificial inflation of contributions across overlapping products. This design innovation draws upon insights from the literature on reserve activation feasibility and adequacy scoring, yet distinguishes itself by unifying these dimensions within a coherent, optimization-based decision framework for VPPs. Recent stochastic programming studies have proposed advanced frameworks for handling multi-period decision-making under uncertainty, such as operating profit-oriented planning for renewable-integrated cascaded hydropower and distributed optimization schemes for network-constrained peer-to-peer energy trading among multiple microgrids. These models capture long-term variability in hydrological conditions, renewable output, and market interactions through scenario-based formulations and provide effective strategies for coordinating resources and maximizing expected economic performance. However, they typically rely on predefined probability distributions and scenario trees that may not fully reflect distributional ambiguity or adverse deviations from historical patterns. In contrast, the distributionally robust optimization framework adopted in this work explicitly accounts for uncertainty in the underlying probability distributions and seeks decisions that remain reliable across a family of plausible distributions. 
By combining this distributionally robust perspective with the proposed scheduling scheme, the model is able to hedge against misspecified or stressed conditions more systematically than conventional stochastic programming approaches, thereby complementing and extending the insights offered by recent stochastic planning and trading studies. At a broader theoretical level, existing research on capacity mechanisms has exposed significant heterogeneity in market design philosophies and implementation structures. Comparative analyses of U.S. capacity auctions, European-style reliability options, and contract-based adequacy schemes in emerging power systems reveal persistent trade-offs among efficiency, reliability, and incentive alignment. Parallel strands of research in reserve markets have interrogated the effects of product differentiation, penalty scaling, and pay-for-performance constructs on operational compliance and market behavior. Collectively, these debates underscore that no prevailing design paradigm has yet reconciled the integration of distributed flexibility with system-level adequacy assurance. By formulating a rigorous bilevel optimization model that explicitly interlinks capacity accreditation, reserve deliverability, and strategic bidding, this paper establishes a theoretical platform for quantitatively testing alternative regulatory constructs and assessing their implications for both VPP profitability and systemic reliability.
This research also builds upon, and simultaneously contributes to, the evolving body of scholarship on advanced optimization methodologies for energy systems. Stochastic programming has traditionally served as the cornerstone for modeling renewable generation uncertainty, enabling probabilistic representations of variability within tractable decision structures. Robust optimization, in contrast, has been employed to safeguard operational feasibility under worst-case deviations, ensuring resilience at the expense of potential conservatism. More recently, DRO has emerged as an appealing middle ground, offering systematic protection against probability model misspecification while maintaining computational tractability. Contemporary applications of DRO—spanning storage scheduling, microgrid operation, and renewable integration—underscore its efficacy in mitigating ambiguity-driven risk and enhancing decision reliability. Nevertheless, the application of DRO to capacity accreditation and reserve compliance remains notably limited in the existing literature. By embedding a Wasserstein-based DRO layer within a hierarchical bilevel structure that jointly represents VPP decision-making and system operator market-clearing, this study extends the methodological frontier of these techniques, demonstrating their potential to address complex market design and regulatory coordination problems that lie beyond the scope of conventional stochastic or robust optimization frameworks.

2. Strategic Carbon Budget Allocation and Trading Game: Formal Model

To sharpen the conceptual focus of the manuscript, we explicitly define the central research question at the start of the introduction: how to design a scheduling framework that remains reliable under deep uncertainty while integrating strategic market participation and operational feasibility. We also consolidated the contribution into a concise statement that presents the essence of the work in two sentences, enabling readers to clearly understand the main objective before entering the technical details. This refinement ensures that the manuscript follows a coherent line of reasoning and that each methodological component directly reinforces the primary question being addressed.
We formalize the strategic interactions among heterogeneous market participants operating within California’s cap-and-trade carbon trading scheme through a game-theoretic bilevel optimization framework. At the upper level, the regulatory authority determines the allocation of emission allowances, setting firm-specific carbon budgets that comply with overarching system-wide caps and policy constraints. The lower level models the strategic responses of diverse players, including data centers, utilities, industrial firms, and transportation hubs. Each player optimizes its portfolio of compliance actions, such as carbon hoarding, intertemporal trading, signaling, and investment scheduling, to minimize abatement costs while managing exposure to reputational and policy risks. Each participant, indexed by \(i \in I\), is characterized by a distinct carbon intensity \(\rho_i\), an allowance holding trajectory \(A_{i,t}\), a marginal abatement cost function \(C_i(e_{i,t})\), and a signaling credibility coefficient \(\kappa_i\). Temporal dynamics evolve across discrete intervals \(t \in \{2025, \ldots, 2035\}\), with emission realizations \(e_{i,t}\) and market-clearing allowance prices \(\pi_t\) constituting the principal state variables. To represent behavioral heterogeneity and informational asymmetry, the framework embeds belief-updating dynamics and endogenous signal generation mechanisms directly into each player’s optimization objective, capturing the feedback between strategic anticipation and market information revelation. The subsequent formulations delineate the objective functions, system-wide constraints, and equilibrium conditions that collectively define the structure and behavioral logic of the strategic carbon trading game.
Figure 1 illustrates the multi-layered system architecture that holistically integrates social welfare optimization, market-clearing mechanisms, retailer bidding behavior, and consumer response modeling within an environment characterized by uncertainty in capacity pricing. This hierarchical structure captures the interdependence between regulatory decision-making, strategic market participation, and end-user demand elasticity, thereby reflecting the coupled economic and behavioral feedbacks that govern equilibrium formation in renewable-dominated electricity markets.
To improve readability, we have strengthened the descriptive explanation accompanying the schematic diagram so that the interaction between the upper and lower levels is explicitly highlighted. The revised description clarifies how upper-level planning or strategic decisions propagate downward, how lower-level operational responses feed back into strategic evaluation, and how the hierarchical optimization structure coordinates these two layers. Furthermore, to make the notation more accessible, the units of key variables and parameters are compiled in Table 1. This allows readers to quickly verify dimensional consistency, interpret physical quantities, and understand the meaning of each symbol without interrupting the flow of the main text.
Table 1. Comprehensive Nomenclature of Variables and Parameters in the Bilevel Optimization Framework.
| Symbol | Description | Unit |
|---|---|---|
| \(I, T, \Omega\) | Sets of generation units, time intervals, and stochastic scenarios, respectively | |
| \(\kappa_y\) | Accredited capacity of the virtual power plant (VPP) in planning year \(y\), determined via ELCC and duration rules | MW |
| \(\vartheta_y\) | Qualifying volume submitted for accreditation or capacity commitment in year \(y\) | MW |
| \(\pi^{\mathrm{DA}}_{i\tau}\) | Day-ahead energy schedule of generation unit \(i\) at time \(\tau\) | MW |
| \(\Delta^{\mathrm{ID}}_{i\tau\omega}\) | Intraday schedule adjustment for unit \(i\) under scenario \(\omega\) | MW |
| \(\pi^{\mathrm{RT}}_{i\tau\omega}\) | Real-time energy dispatch of unit \(i\) at time \(\tau\) and scenario \(\omega\) | MW |
| \(\rho^{r}_{i\tau\omega}\) | Reserve provision of product \(r\) by unit \(i\) at time \(\tau\) under scenario \(\omega\) | MW |
| \(\varsigma_{i,\tau\omega}\) | State-of-charge (SoC) of storage unit \(i\) at time \(\tau\) in scenario \(\omega\) | MWh |
| \(\chi^{\mathrm{ch}}_{i\tau\omega},\ \chi^{\mathrm{dis}}_{i\tau\omega}\) | Charging and discharging power of storage unit \(i\) at time \(\tau\) and scenario \(\omega\) | MW |
| \(\eta^{\mathrm{ch}}_{i},\ \eta^{\mathrm{dis}}_{i}\) | Charging and discharging efficiency of storage unit \(i\) | |
| \(\underline{\varsigma}_{i},\ \bar{\varsigma}_{i}\) | Minimum and maximum state-of-charge bounds for storage unit \(i\) | MWh |
| \(\bar{E}_{i}\) | Usable energy budget of storage resource \(i\) over activation window \(W_r\) | MWh |
| \(\mathrm{avail}_{i\tau\omega}\) | Available power capacity of generation or storage asset \(i\) under scenario \(\omega\) | MW |
| \(u_{i\tau\omega}\) | Binary on/off status of generation unit \(i\) at time \(\tau\) in scenario \(\omega\) | |
| \(\bar{\rho}^{r}_{i}\) | Maximum technical reserve capability of unit \(i\) for product \(r\) | MW |
| \(\gamma_i\) | Deliverability coefficient of unit \(i\), linking rated capacity and usable flexibility | |
| \(\alpha_r\) | Reserve product stacking coefficient, defining compatibility across multiple services | |
| \(\phi_r\) | Reserve scaling factor used in compliance and guard-band constraints | |
| \(G_i(\varsigma)\) | State-of-charge-dependent guard-band function for resource \(i\) | MWh |
| \(\gamma^{\mathrm{guard}}_{i}\) | Safety margin subtracted from \(G_i(\varsigma)\) to ensure deliverability robustness | MWh |
| \(\Psi_{y\omega}\) | Performance score of the accredited capacity in year \(y\) and scenario \(\omega\) | |
| \(\Phi^{\mathrm{pen}}_{y}\) | Penalty function capturing shortfall between delivered and accredited capacity | $ |
| \(\phi^{\mathrm{cap}}_{y}\) | Penalty coefficient linking performance deficiency and financial impact | $/MW |
| \(E^{\mathrm{ELCC}}_{y}\) | Effective load-carrying capability metric representing adequacy-based accreditation limit | MW |
| \(\theta_{\tau y},\ \varphi^{\mathrm{ELCC}}_{\tau}\) | Weighting factors and reliability multipliers used in ELCC calculation | |
| \(\Omega^{\mathrm{scar}}(y)\) | Set of scarcity intervals in year \(y\), defined by reserve violations or high-price events | |
| \(\varpi^{\mathrm{test}}_{y}\) | Test factor linking delivered capacity fraction to accredited obligation during scarcity events | |
| \(\mathbb{P},\ \hat{\mathbb{P}}\) | True and empirical probability distributions of uncertain parameters | |
| \(\mathcal{P}_{\varepsilon}(\hat{\mathbb{P}})\) | Wasserstein ambiguity set centered at empirical distribution \(\hat{\mathbb{P}}\) with radius \(\varepsilon\) | |
| \(W_c\) | Wasserstein transport distance under ground cost \(c\) | |
| \(\Pi(\cdot;\xi)\) | Profit function parameterized by uncertain realization \(\xi\) | $ |
| \(\Upsilon(\alpha)\) | Regularization term capturing risk preference in DRO formulation | $ |
| \(\theta,\ \eta_{\omega}\) | Dual and scenario recourse variables in Benders decomposition | |
| \(d^{\mathrm{net}}_{\tau\omega}\) | Net demand after accounting for renewable generation and demand response in scenario \(\omega\) | MW |
| \(\lambda_{\tau\omega},\ \mu^{r}_{\tau\omega}\) | Energy and reserve market clearing prices at time \(\tau\) under scenario \(\omega\) | $/MWh |
| \(\pi^{\mathrm{DA}}_{t},\ \pi^{\mathrm{RT}}_{t}\) | Day-ahead and real-time market prices | $/MWh |
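\[
\begin{aligned}
\max_{\{\kappa_y,\ \pi^{\mathrm{en}}_{i\tau\omega},\ \rho^{r}_{i\tau\omega}\}}\quad
& \sum_{y\in Y}\Big[\lambda^{\mathrm{cap}}_{y}\kappa_y-\phi^{\mathrm{cap}}_{y}\,(\kappa_y-\vartheta_y)^{+}\Big] \\
& +\sum_{\omega\in\Omega}\wp_{\omega}\sum_{\tau\in T}\Big[\sum_{i\in I}\Big(\lambda_{\tau\omega}\pi^{\mathrm{en}}_{i\tau\omega}+\sum_{r\in R}\mu^{r}_{\tau\omega}\rho^{r}_{i\tau\omega}\Big)-\sum_{i\in I}\Big(C^{\mathrm{var}}_{i}(\pi^{\mathrm{en}}_{i\tau\omega})+\sum_{r\in R}\psi^{r}_{i}\rho^{r}_{i\tau\omega}\Big)\Big] \\
& -\Gamma^{\mathrm{risk}}\Big(\sum_{\omega}\wp_{\omega}\sum_{\tau}\Pi_{\tau\omega};\,\alpha\Big) \\
\text{s.t.}\quad
& \kappa_y\le E^{\mathrm{ELCC}}_{y}(\vartheta_y),\qquad
\pi^{\mathrm{en}}_{i\tau\omega}+\sum_{r}\alpha_r\rho^{r}_{i\tau\omega}\le \mathrm{avail}_{i\tau\omega}.
\end{aligned}
\tag{1}
\]
\[
\begin{aligned}
\min_{\{p_{g\tau\omega},\ r^{r}_{g\tau\omega},\ \ell_{n\tau\omega}\}}\quad
& \sum_{\omega\in\Omega}\wp_{\omega}\sum_{\tau\in T}\Big[\sum_{g\in G}\Big(C^{\mathrm{fuel}}_{g}(p_{g\tau\omega})+C^{\mathrm{su}}_{g}(p_{g\tau\omega})+\sum_{r\in R}\chi_r\,r^{r}_{g\tau\omega}\Big)+\beta^{\mathrm{EUE}}\sum_{n}\ell_{n\tau\omega} \\
& \qquad\qquad\qquad +\sum_{y\in Y}\beta^{\mathrm{LOLE}}_{y}\Big(\mathrm{Req}^{\mathrm{cap}}_{y}-\sum_{j}\kappa_{jy}\Big)^{+}\Big] \\
\text{s.t.}\quad
& \sum_{g}p_{g\tau\omega}+\sum_{j}\pi^{\mathrm{en}}_{j\tau\omega}=\sum_{n}d_{n\tau\omega}-\sum_{n}\ell_{n\tau\omega},\qquad
\sum_{g}r^{r}_{g\tau\omega}+\sum_{j}\rho^{r}_{j\tau\omega}\ge \mathrm{Req}^{r}_{\tau\omega}.
\end{aligned}
\tag{2}
\]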
These two interdependent objective functions jointly characterize the coupled equilibrium between a profit-seeking VPP and a reliability-oriented system operator. The upper-level formulation (1) maximizes the VPP’s expected revenues across capacity, reserve, and energy markets, explicitly incorporating penalty terms, risk aversion, and accreditation limits represented by \(E^{\mathrm{ELCC}}_{y}\). The presence of state-dependent availability \(\mathrm{avail}_{i\tau\omega}\) and probabilistic weights \(\wp_{\omega}\) underscores the inherently stochastic nature of renewable resource integration and operational uncertainty. In contrast, the lower-level formulation (2) minimizes total system operating costs by balancing fuel expenditures, start-up costs, and reserve procurement against penalties imposed for unserved demand and capacity deficiencies. By enforcing the energy balance and reserve requirement in every period and scenario, the system operator guarantees adequacy while endogenously determining equilibrium market prices. Collectively, these two layers compose a compact yet expressive bilevel optimization framework in which Greek parameters encapsulate uncertainty, risk tolerance, and adequacy obligations, and equilibrium arises endogenously from the interplay between market-driven incentives and system reliability imperatives.
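How a cost-minimizing operator turns submitted offers into a clearing price can be illustrated with a deliberately simplified sketch. It assumes a single period, linear marginal costs, and no start-up costs, reserves, or scenarios, so it is a merit-order approximation rather than the lower-level problem (2) itself. The function name `clear_energy_market` and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Offer = Tuple[str, float, float]  # (name, marginal cost in $/MWh, capacity in MW)


def clear_energy_market(offers: List[Offer], demand: float):
    """Dispatch offers in ascending cost order until demand is met.

    Returns (dispatch per offer in MW, marginal price in $/MWh,
    unserved demand in MW).
    """
    dispatch: Dict[str, float] = {}
    remaining = demand
    price = 0.0
    for name, cost, capacity in sorted(offers, key=lambda o: o[1]):
        quantity = min(capacity, remaining)
        dispatch[name] = quantity
        if quantity > 0:
            price = cost  # the last offer actually dispatched sets the price
        remaining -= quantity
    return dispatch, price, remaining


# Hypothetical example: the VPP offers 30 MW at zero marginal cost.
offers = [("gas", 40.0, 50.0), ("vpp", 0.0, 30.0), ("coal", 25.0, 50.0)]
dispatch, price, unserved = clear_energy_market(offers, demand=100.0)
print(dispatch, price, unserved)
```

In this toy setting the VPP's offered quantity shifts the merit order and can change which unit is marginal, which is the channel through which upper-level decisions influence the price in the bilevel structure.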
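\[
\kappa_y\le\sum_{\tau\in T}\theta_{\tau y}\Big(d^{\mathrm{peak}}_{\tau}-\sum_{i\in I}g_{i\tau}\Big)\cdot\varphi^{\mathrm{ELCC}}_{\tau},
\tag{3}
\]
\[
\kappa_y\le\min_{h\in H}\ \frac{1}{h}\sum_{\tau=1}^{h}\sum_{i\in I}\pi^{\mathrm{en}}_{i\tau},
\tag{4}
\]
\[
\sum_{\tau\in\Omega^{\mathrm{scar}}(y)}\sum_{i\in I}\Big(\pi^{\mathrm{en}}_{i\tau}+\sum_{r\in R}\rho^{r}_{i\tau}\Big)\ge\varpi^{\mathrm{test}}_{y}\,\kappa_y.
\tag{5}
\]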
These three expressions collectively formalize the accreditation of capacity by jointly enforcing duration and reliability requirements within the optimization framework. Condition (3) constrains the accredited capacity \(\kappa_y\) through an ELCC measure, which quantifies the marginal contribution of the VPP to reducing net peak demand and is weighted by temporal reliability coefficients \(\theta_{\tau y}\) and scaling multipliers \(\varphi^{\mathrm{ELCC}}_{\tau}\). Constraint (4) introduces an explicit duration-based adequacy requirement, mandating that the accredited capacity be continuously deliverable for at least \(h\) consecutive hours, a standard benchmark applied to energy-limited technologies such as battery storage and hybrid renewable systems. Finally, condition (5) ensures performance compliance during scarcity intervals \(\Omega^{\mathrm{scar}}(y)\) by requiring the aggregate of dispatched energy and reserve provision to reach at least a designated fraction \(\varpi^{\mathrm{test}}_{y}\) of the accredited commitment \(\kappa_y\), thereby tying the accreditation value directly to observable system stress events. Collectively, these interlinked conditions delineate the admissible operating envelope for \(\kappa_y\), ensuring that accredited capacity simultaneously satisfies probabilistic adequacy criteria and operational duration feasibility under uncertainty.
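The accreditation logic of (3) to (5) can be sketched numerically. The snippet below is a minimal illustration under stated assumptions: all data are hypothetical, time steps are treated as hours, the array `g` simply stands in for the term \(g_{i\tau}\) in (3) without assigning it a specific physical meaning, and reserves are already summed over products. The function names are not from the paper.

```python
def elcc_bound(theta, d_peak, g, phi):
    """Right-hand side of (3): sum_t theta[t] * (d_peak[t] - sum_i g[i][t]) * phi[t]."""
    periods = range(len(theta))
    return sum(
        theta[t] * (d_peak[t] - sum(g_i[t] for g_i in g)) * phi[t]
        for t in periods
    )


def duration_bound(energy, horizons):
    """Right-hand side of (4): smallest average portfolio output over the first h periods."""
    total = [sum(unit[t] for unit in energy) for t in range(len(energy[0]))]
    return min(sum(total[:h]) / h for h in horizons)


def passes_scarcity_test(energy, reserve, scarcity, kappa, varpi):
    """Condition (5): energy plus reserve summed over scarcity periods >= varpi * kappa."""
    delivered = sum(
        sum(unit[t] for unit in energy) + sum(unit[t] for unit in reserve)
        for t in scarcity
    )
    return delivered >= varpi * kappa


# Hypothetical four-period data for a two-asset portfolio (MW).
theta = [0.25, 0.25, 0.25, 0.25]
d_peak = [100.0, 120.0, 110.0, 90.0]
g = [[80.0, 100.0, 90.0, 70.0]]
phi = [1.0, 1.0, 1.0, 1.0]
energy = [[10.0, 8.0, 6.0, 4.0], [5.0, 5.0, 5.0, 5.0]]
reserve = [[2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0]]

elcc = elcc_bound(theta, d_peak, g, phi)
duration = duration_bound(energy, horizons=[1, 2, 4])
kappa = min(elcc, duration)  # largest capacity satisfying both (3) and (4)
compliant = passes_scarcity_test(energy, reserve, scarcity=[1, 2], kappa=kappa, varpi=0.9)
print(elcc, duration, kappa, compliant)
```

With these numbers the duration rule binds before the ELCC limit does, which mirrors the situation of an energy-limited portfolio whose output declines over the window.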
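\[
\rho^{r}_{i\tau\omega}\le\bar{\rho}^{r}_{i}\,u_{i\tau\omega},\qquad\forall i\in I,\ r\in R,
\tag{6}
\]
\[
\pi^{\mathrm{en}}_{i\tau\omega}+\sum_{r\in R}\rho^{r}_{i\tau\omega}\le\mathrm{avail}_{i\tau\omega},\qquad\forall i,\tau,\omega,
\tag{7}
\]
\[
\sum_{r\in R}\alpha_r\,\rho^{r}_{i\tau\omega}\le\gamma_i\,\bar{p}_i,\qquad\forall i,\tau,\omega,
\tag{8}
\]
\[
\sum_{\tau\in T}\sum_{r\in R}E^{r}_{i}\big(\rho^{r}_{i\tau\omega},W_r\big)\le B_i\big(\varsigma_{i,\tau\omega}\big),\qquad\forall i,\omega.
\tag{9}
\]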
These four conditions collectively define the allocation, feasibility, and verification principles governing reserve deliverability within the proposed framework. Condition (6) guarantees that the reserve provision \(\rho^{r}_{i\tau\omega}\) from each unit \(i\) remains within its technical upper limit \(\bar{\rho}^{r}_{i}\) whenever the unit is operationally committed (\(u_{i\tau\omega}=1\)), thereby ensuring mechanical and thermal feasibility. Constraint (7) enforces a non-double-counting rule, stipulating that the combined quantity of dispatched energy \(\pi^{\mathrm{en}}_{i\tau\omega}\) and reserve headroom must not exceed the available capability \(\mathrm{avail}_{i\tau\omega}\) of the asset under the prevailing state and scenario. Constraint (8) introduces product-stacking restrictions through scaling coefficients \(\alpha_r\), where \(\gamma_i\bar{p}_i\) delineates the composite deliverability envelope that bounds concurrent reserve offerings across different markets. Lastly, constraint (9) establishes the intertemporal linkage between reserve commitments and cumulative energy budgets over activation windows \(W_r\), ensuring that the state-of-charge \(\varsigma_{i,\tau\omega}\) holds sufficient energy to sustain promised reserve activation throughout the designated duration. Taken together, these constraints eliminate infeasible over-commitment behavior and ensure that all reserve allocations remain operationally credible under both instantaneous contingencies and extended stress conditions.
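A small feasibility check makes the instantaneous conditions (6) to (8) concrete for one unit, period, and scenario. This is an illustrative sketch only: the function name, the product labels, and the numbers are hypothetical, and condition (9) is omitted because the functions \(E^{r}_{i}\) and \(B_i\) are not specified in closed form in the text.

```python
def check_reserve_offer(energy, reserve, rho_max, committed, avail,
                        alpha, gamma, p_max, tol=1e-9):
    """Evaluate constraints (6)-(8) for a single unit, period, and scenario.

    energy    : dispatched energy (MW)
    reserve   : dict product -> offered reserve (MW)
    rho_max   : dict product -> technical reserve limit (MW)
    committed : 1 if the unit is on, 0 otherwise
    avail     : available capability (MW)
    alpha     : dict product -> stacking coefficient
    gamma     : deliverability coefficient
    p_max     : rated power (MW)
    """
    technical_limit = all(
        reserve[r] <= rho_max[r] * committed + tol for r in reserve
    )
    no_double_counting = energy + sum(reserve.values()) <= avail + tol
    stacking = sum(alpha[r] * reserve[r] for r in reserve) <= gamma * p_max + tol
    return {
        "technical_limit": technical_limit,        # constraint (6)
        "no_double_counting": no_double_counting,  # constraint (7)
        "stacking": stacking,                      # constraint (8)
    }


# Hypothetical offer: 6 MW of energy plus two reserve products.
result = check_reserve_offer(
    energy=6.0,
    reserve={"regulation": 2.0, "spinning": 3.0},
    rho_max={"regulation": 3.0, "spinning": 4.0},
    committed=1,
    avail=10.0,
    alpha={"regulation": 1.0, "spinning": 0.5},
    gamma=0.8,
    p_max=5.0,
)
print(result)
```

In this example each product respects its own limit and the stacking envelope, yet the offer still fails the non-double-counting rule because energy and reserve together exceed the available capability.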
To address concerns about idealized modeling assumptions, we now clarify that the assumption of perfect information at the upper level is primarily adopted to ensure analytical tractability and consistency with standard hierarchical optimization practice. However, we acknowledge that real decision-makers often operate under bounded rationality and incomplete information. Under such conditions, equilibrium responses may become less conservative, and attacker or planner behavior may shift due to reduced visibility of system states. Similarly, the use of a fixed penalty coefficient for capacity shortfall is intended to provide a stable benchmark for evaluating dispatch performance, yet an adaptive or state-dependent penalty could introduce more dynamic feedback between shortage severity and operational cost. Incorporating such adaptive mechanisms would likely reshape equilibrium outcomes by strengthening the system’s response under high-risk conditions and reducing unnecessary conservatism when the system is lightly stressed. Although these extensions fall outside the scope of the present study, the framework can be adapted to accommodate them, and we discuss these potential effects to provide a more complete understanding of how the assumptions influence the resulting equilibrium behavior.
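\[
\varsigma_{i,\tau\omega}=\varsigma_{i,\tau-1,\omega}+\eta^{\mathrm{ch}}_{i}\,\chi^{\mathrm{ch}}_{i\tau\omega}-\frac{1}{\eta^{\mathrm{dis}}_{i}}\,\chi^{\mathrm{dis}}_{i\tau\omega},\qquad\forall i,\tau,\omega,
\tag{10}
\]
\[
\underline{\varsigma}_{i}\le\varsigma_{i,\tau\omega}\le\bar{\varsigma}_{i},\qquad\forall i,\tau,\omega,
\tag{11}
\]
\[
\sum_{\tau\in T}\Big(\chi^{\mathrm{dis}}_{i\tau\omega}+\sum_{r\in R}\rho^{r}_{i\tau\omega}\cdot W_r\Big)\le\bar{E}_{i},\qquad\forall i,\omega.
\tag{12}
\]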
Equation (10) characterizes the intertemporal evolution of the state-of-charge (SoC) for each storage unit, where charging inflows \(\chi^{\mathrm{ch}}_{i\tau\omega}\) are scaled by the charging efficiency \(\eta^{\mathrm{ch}}_{i}\), and discharging outflows \(\chi^{\mathrm{dis}}_{i\tau\omega}\) are scaled by the reciprocal of the discharging efficiency, \(1/\eta^{\mathrm{dis}}_{i}\), to account for round-trip conversion losses. Equation (11) enforces operational feasibility by bounding the SoC variable \(\varsigma_{i,\tau\omega}\) between its technical lower and upper limits, \(\underline{\varsigma}_{i}\) and \(\bar{\varsigma}_{i}\), throughout each temporal interval, thereby ensuring that energy trajectories remain within physically permissible ranges. Equation (12) introduces the cumulative energy constraint, stipulating that the total discharged energy, augmented by the energy required to maintain reserve commitments over activation windows \(W_r\), must not exceed the unit’s maximum usable energy budget \(\bar{E}_{i}\). Collectively, these constraints encapsulate the thermodynamic and operational behavior of storage systems, guaranteeing that virtual power plant (VPP) dispatch and reserve strategies are both energy-consistent and technically realizable under dynamic system conditions.
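The storage relations (10) to (12) can be traced with a short simulation. The sketch below assumes one-hour periods, so that power in MW and energy in MWh can be added directly, and uses hypothetical parameter values and function names that do not come from the paper.

```python
def simulate_soc(soc0, charge, discharge, eta_ch, eta_dis):
    """Roll the state of charge forward with recursion (10)."""
    trajectory = []
    soc = soc0
    for ch, dis in zip(charge, discharge):
        soc = soc + eta_ch * ch - dis / eta_dis
        trajectory.append(soc)
    return trajectory


def within_bounds(trajectory, soc_min, soc_max, tol=1e-9):
    """Bound check (11) on every period of the trajectory."""
    return all(soc_min - tol <= s <= soc_max + tol for s in trajectory)


def energy_budget_ok(discharge, reserve, window_hours, e_max, tol=1e-9):
    """Cumulative energy constraint (12).

    reserve      : dict product -> list of reserve offers per period (MW)
    window_hours : dict product -> activation window W_r (hours)
    """
    used = sum(
        discharge[t] + sum(reserve[r][t] * window_hours[r] for r in reserve)
        for t in range(len(discharge))
    )
    return used <= e_max + tol


# Hypothetical battery: 5 MWh initial SoC, 90% charge and discharge efficiency.
charge = [2.0, 0.0, 0.0]
discharge = [0.0, 1.8, 0.9]
soc = simulate_soc(5.0, charge, discharge, eta_ch=0.9, eta_dis=0.9)
print(soc)
print(within_bounds(soc, soc_min=1.0, soc_max=8.0))
print(energy_budget_ok(discharge, {"spinning": [1.0, 1.0, 1.0]},
                       {"spinning": 0.5}, e_max=5.0))
```

Delivering 1.8 MW to the grid draws about 2 MWh from the battery because of the \(1/\eta^{\mathrm{dis}}_{i}\) factor, which is why reserve commitments must be checked against stored energy rather than rated power alone.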
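\[
\sum_{\tau\in\Omega^{\mathrm{scar}}(y)}\sum_{i\in I}\Big(\pi^{\mathrm{en}}_{i\tau\omega}+\sum_{r\in R}\rho^{r}_{i\tau\omega}\Big)\ge\varpi^{\mathrm{test}}_{y}\,\kappa_y,\qquad\forall y,\omega,
\tag{13}
\]
\[
\Psi_{y\omega}=\frac{\displaystyle\sum_{\tau\in\Omega^{\mathrm{scar}}(y)}\sum_{i\in I}\Big(\pi^{\mathrm{en}}_{i\tau\omega}+\sum_{r\in R}\rho^{r}_{i\tau\omega}\Big)}{\kappa_y\cdot\big|\Omega^{\mathrm{scar}}(y)\big|},\qquad\forall y,\omega,
\tag{14}
\]
\[
\Phi^{\mathrm{pen}}_{y}=\phi^{\mathrm{cap}}_{y}\cdot\Big(\kappa_y-\mathbb{E}_{\omega}\big[\Psi_{y\omega}\,\kappa_y\big]\Big)^{+},\qquad\forall y.
\tag{15}
\]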
Equation (13) establishes the delivery compliance criterion during scarcity intervals \(\Omega^{\mathrm{scar}}(y)\), mandating that the aggregate of real-time energy dispatch \(\pi^{\mathrm{en}}_{i\tau\omega}\) and reserve provision \(\rho^{r}_{i\tau\omega}\) from the VPP reach at least a specified proportion \(\varpi^{\mathrm{test}}_{y}\) of the accredited capacity \(\kappa_y\). Equation (14) defines the performance index \(\Psi_{y\omega}\), computed as the normalized ratio of actual delivered output to the accredited commitment across all scarcity events, thereby serving as a probabilistic measure of realized adequacy performance. Equation (15) then formulates the penalty function \(\Phi^{\mathrm{pen}}_{y}\), which imposes financial disincentives for underperformance by penalizing the positive part of the gap between accredited capacity \(\kappa_y\) and the expected delivered quantity under uncertainty. The penalty intensity is governed by the coefficient \(\phi^{\mathrm{cap}}_{y}\), ensuring that any shortfall directly diminishes profitability and effectively deters strategic overstatement of accredited values. Collectively, these relationships establish a closed-loop mechanism that links accreditation, realized delivery, and economic repercussions, thereby harmonizing market incentives with reliability-oriented performance assurance.
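A brief numerical sketch shows how the performance score (14) feeds the penalty (15). The scenario data, probabilities, and penalty coefficient are hypothetical, and the delivered quantities are assumed to be already aggregated over assets and reserve products for each scarcity interval.

```python
def performance_score(delivered, kappa):
    """Score (14): delivered output over scarcity intervals, normalized by
    accredited capacity times the number of intervals."""
    return sum(delivered) / (kappa * len(delivered))


def shortfall_penalty(scores, probabilities, kappa, phi_cap):
    """Penalty (15): phi_cap times the positive part of accredited capacity
    minus the expected delivered quantity E[score * kappa]."""
    expected_delivery = sum(p * s * kappa for p, s in zip(probabilities, scores))
    return phi_cap * max(kappa - expected_delivery, 0.0)


# Hypothetical year with two scarcity intervals and two equiprobable scenarios.
kappa = 10.0                      # accredited capacity (MW)
delivered_by_scenario = [
    [10.0, 10.0],                 # scenario A: full delivery
    [8.0, 7.0],                   # scenario B: under-delivery
]
scores = [performance_score(d, kappa) for d in delivered_by_scenario]
penalty = shortfall_penalty(scores, [0.5, 0.5], kappa, phi_cap=100.0)
print(scores, penalty)
```

Because only the expected shortfall is penalized in (15), full delivery in one scenario does not offset under-delivery in another beyond its probability weight; the positive-part operator also means over-delivery earns no credit.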
The cap-and-trade signaling game is treated as a policy-driven modifier that influences operational decisions within the main model, rather than as an independent theoretical system. Signaling and belief-updating elements that do not bear on the electricity-market formulation are omitted, and the remaining carbon-related components are integrated directly into the core optimization structure to maintain consistency with the electricity-market bilevel formulation. Each section therefore contributes to a single narrative centered on the proposed scheduling approach.

3. Learning-Augmented Equilibrium Refinement and Solution Strategy

To support reproducibility and mathematical clarity, the constraints are stated in explicit operational form and the bilevel-to-MPEC reformulation is derived in full. The model specification states all decision variables, feasibility conditions, and complementarity relations, so that readers can reconstruct the structure of the optimization problem. The distributionally robust optimization (DRO) component is detailed by describing how the Wasserstein ambiguity set is constructed and how its dual representation enters the resulting optimization problem. Together, these elements set out the assumptions required for equivalence between the reformulated problem and the original hierarchical model.
To address the bilevel strategic carbon game characterized by asymmetric beliefs and dynamically evolving allowance prices, we develop a hybrid computational framework that integrates DRO, mixed-integer linear programming (MILP), and Bayesian inference-based belief updating. In the first stage, the regulator’s upper-level optimization problem is reformulated as a master allocation program that embeds fairness constraints, emission cap compliance, and intertemporal permit banking regulations. In the second stage, the lower-level decision problems of the participating agents are decomposed into MILP sub-models augmented with belief-state variables $\mu_{i,t}$ and signaling variables $s_{i,t}$, which collectively capture the process by which players form and update expectations about competitors’ emission trajectories and future allowance price movements. The third stage implements an iterative fixed-point refinement mechanism, in which strategic actions and market-clearing allowance prices are repeatedly updated until the system converges to a belief-consistent Bayesian equilibrium. Finally, to incorporate regulatory ambiguity and behavioral uncertainty, a Wasserstein-based DRO layer is embedded around each player’s probabilistic forecasting model, thereby enabling robust decision-making even under misspecified or incomplete belief distributions. This multi-layered methodological architecture preserves computational tractability while maintaining a high degree of realism with respect to the institutional design, behavioral dynamics, and informational asymmetries that characterize actual carbon markets.
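$$\kappa_{y} \le \sum_{\tau \in T^{\mathrm{DA}}} \sum_{i \in I} \pi^{\mathrm{DA}}_{i\tau}, \quad \forall y, \tag{16}$$
$$\pi^{\mathrm{DA}}_{i\tau} + \Delta^{\mathrm{ID}}_{i\tau\omega} = \pi^{\mathrm{RT}}_{i\tau\omega}, \quad \forall i, \tau, \omega, \tag{17}$$
$$\sum_{i \in I} \Bigl( \pi^{\mathrm{RT}}_{i\tau\omega} + \sum_{r \in R} \rho^{r}_{i\tau\omega} \Bigr) = d^{\mathrm{net}}_{\tau\omega}, \quad \forall \tau, \omega. \tag{18}$$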
Equation (16) establishes the linkage between long-term capacity accreditation $\kappa_{y}$ and short-term market commitments $\pi^{\mathrm{DA}}_{i\tau}$, ensuring that accredited capacity obligations are consistently backed by forward-scheduled quantities in the day-ahead horizon. Equation (17) introduces intraday correction variables $\Delta^{\mathrm{ID}}_{i\tau\omega}$, which reconcile deviations between day-ahead commitments and realized operating conditions to yield the final real-time dispatch $\pi^{\mathrm{RT}}_{i\tau\omega}$ under scenario $\omega$. Equation (18) then formalizes the real-time equilibrium constraint, requiring that the aggregate of energy dispatch and reserve contributions across all participants precisely equals the system’s net demand $d^{\mathrm{net}}_{\tau\omega}$ for every time interval and stochastic realization. Collectively, these intertemporal coupling conditions integrate annual capacity accreditation with day-ahead scheduling, intraday revisions, and real-time balancing, thereby ensuring temporal coherence and market consistency across multiple trading horizons within the unified optimization structure.
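The coupling conditions in Equations (16)–(18) amount to simple bookkeeping, which the sketch below checks for a single interval and scenario. The helper names and figures are illustrative assumptions; in particular, the direction of the inequality in the accreditation check follows the reading that accredited capacity must be backed by day-ahead quantities.

```python
def real_time_dispatch(day_ahead, intraday):
    """Eq. (17): real-time dispatch equals the day-ahead schedule plus the
    intraday correction, unit by unit."""
    return [da + delta for da, delta in zip(day_ahead, intraday)]


def balance_residual(rt_dispatch, reserves, net_demand):
    """Eq. (18): energy plus reserve contributions minus net demand for one
    interval and scenario; a feasible point has residual zero."""
    total_reserve = sum(sum(unit_reserves) for unit_reserves in reserves)
    return sum(rt_dispatch) + total_reserve - net_demand


def accreditation_backed(kappa, day_ahead_schedule):
    """Eq. (16), read as: accredited capacity does not exceed the total
    day-ahead quantity summed over intervals and units."""
    return kappa <= sum(sum(row) for row in day_ahead_schedule)


# Hypothetical two-unit example with two reserve products per unit.
rt = real_time_dispatch([40.0, 25.0], [-5.0, 3.0])
print(rt)
print(balance_residual(rt, [[4.0, 2.0], [1.0, 0.0]], net_demand=70.0))
print(accreditation_backed(60.0, [[40.0, 25.0]]))
```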
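$$\Omega = \Bigl\{ \omega : \bigl( d_{\tau\omega},\ g^{\mathrm{RES}}_{\tau\omega},\ \lambda_{\tau\omega},\ \mu^{r}_{\tau\omega} \bigr)_{\tau \in T} \Bigr\}, \tag{19}$$
$$\sum_{\omega \in \Omega} p_{\omega} = 1, \quad p_{\omega} \ge 0, \quad \forall \omega, \tag{20}$$
$$\mathbb{E}_{\omega}\bigl[ \Pi_{\omega} \bigr] = \sum_{\omega \in \Omega} p_{\omega} \sum_{\tau \in T} \sum_{i \in I} \Bigl( \lambda_{\tau\omega}\, \pi^{\mathrm{RT}}_{i\tau\omega} + \sum_{r \in R} \mu^{r}_{\tau\omega}\, \rho^{r}_{i\tau\omega} \Bigr). \tag{21}$$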
Equation (19) specifies the stochastic scenario set $\Omega$, wherein each scenario $\omega$ encapsulates simultaneous realizations of key system variables, including demand $d_{\tau\omega}$, renewable generation output $g^{\mathrm{RES}}_{\tau\omega}$, energy market prices $\lambda_{\tau\omega}$, and reserve market prices $\mu^{r}_{\tau\omega}$ across all temporal stages $\tau$. Equation (20) formalizes the associated probability distribution of these scenarios by imposing non-negativity and normalization constraints on the scenario probability weights, thereby ensuring that the ensemble of realizations collectively represents a valid stochastic measure. Equation (21) defines the expected profit of the virtual power plant (VPP) as a probabilistically weighted aggregation over all scenarios, incorporating both real-time energy revenues and reserve remuneration streams. Collectively, these formulations constitute the stochastic representation of system uncertainty, guaranteeing that the optimization framework captures the full spectrum of plausible operational outcomes with appropriate probabilistic fidelity and economic weighting.
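A small sketch may help make the scenario-weighted profit of Equations (20) and (21) concrete. The dictionary layout, the key names, and the prices and quantities are assumptions made for illustration only.

```python
def scenario_profit(lam, mu, pi_rt, rho):
    """Inner sums of Eq. (21) for one scenario.
    lam[t]: energy price; mu[t][r]: reserve price of product r;
    pi_rt[t][i]: real-time dispatch of unit i; rho[t][i][r]: reserve of unit i."""
    total = 0.0
    for t, energy_price in enumerate(lam):
        for i, dispatch in enumerate(pi_rt[t]):
            total += energy_price * dispatch
            total += sum(price * qty for price, qty in zip(mu[t], rho[t][i]))
    return total


def expected_profit(scenarios):
    """Eq. (21), after checking the probability conditions of Eq. (20)."""
    probs = [s["prob"] for s in scenarios]
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("scenario weights must be non-negative and sum to one")
    return sum(
        s["prob"] * scenario_profit(s["lam"], s["mu"], s["pi_rt"], s["rho"])
        for s in scenarios
    )


# Hypothetical data: one interval, one unit, one reserve product.
scenarios = [
    {"prob": 0.6, "lam": [50.0], "mu": [[10.0]], "pi_rt": [[20.0]], "rho": [[[5.0]]]},
    {"prob": 0.4, "lam": [30.0], "mu": [[20.0]], "pi_rt": [[10.0]], "rho": [[[10.0]]]},
]
print(expected_profit(scenarios))
```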
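$$\mathcal{P}_{\varepsilon}(\hat{P}) = \Bigl\{ Q \in \mathcal{M}_{1}(\Xi) : W_{c}\bigl( Q, \hat{P} \bigr) \le \varepsilon \Bigr\}, \qquad \hat{P} = \frac{1}{N} \sum_{n=1}^{N} \delta_{\hat{\xi}_{n}}, \tag{22}$$
$$\max_{\kappa,\, \pi^{\mathrm{RT}},\, \rho}\ \inf_{Q \in \mathcal{P}_{\varepsilon}(\hat{P})} \mathbb{E}_{\xi \sim Q}\Bigl[ \Pi\bigl( \kappa, \pi^{\mathrm{RT}}, \rho;\, \xi \bigr) \Bigr] - \Upsilon(\alpha), \tag{23}$$
$$\sup_{Q \in \mathcal{P}_{\varepsilon}(\hat{P})} \mathbb{E}_{\xi \sim Q}\bigl[ \Pi(\cdot\,; \xi) \bigr] = \inf_{\theta \ge 0} \Bigl\{ \theta \varepsilon + \frac{1}{N} \sum_{n=1}^{N} \sup_{\xi \in \Xi} \bigl[ \Pi(\cdot\,; \xi) - \theta\, c(\xi, \hat{\xi}_{n}) \bigr] \Bigr\}. \tag{24}$$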
Equation (22) introduces the Wasserstein ambiguity set $\mathcal{P}_{\varepsilon}(\hat{P})$, defined as the collection of all probability distributions lying within a radius $\varepsilon$ of the empirical distribution $\hat{P}$ under the transport metric $W_{c}$ associated with the ground cost function $c$. Equation (23) formulates the distributionally robust profit maximization problem, wherein the decision variables $(\kappa, \pi^{\mathrm{RT}}, \rho)$ are optimized to achieve the maximum worst-case expected profit across all distributions $Q$ contained within the ambiguity set, subject to a penalization term $\Upsilon(\alpha)$ that regularizes risk preferences and ensures conservative robustness against distributional deviations. Equation (24) presents the dual reformulation of the worst-case expectation over the Wasserstein ball, introducing a scalar dual variable $\theta$ that governs the trade-off between the ambiguity radius $\varepsilon$ and the transport-penalized payoff deviation. This dual representation transforms the infinite-dimensional ambiguity problem into a tractable convex optimization program when the profit function $\Pi(\cdot\,; \xi)$ satisfies Lipschitz continuity under $c$ and exhibits convexity (or concavity) with respect to the decision variables, thereby rendering the distributionally robust layer computationally implementable within the broader bilevel market optimization structure.

To substantiate the claim of efficient solvability, we describe the observed convergence behavior and computational scalability. In our experiments, the algorithm consistently exhibits fast reduction in constraint violations during the early iterations, followed by steady refinement of optimality conditions as coordination among decision layers stabilizes. The number of outer iterations required to reach convergence remains modest across all instances due to the proximal regularization and decomposable structure of the updates. Although larger scenario sets naturally increase runtime, the overall computational growth remains moderate because each scenario contributes independently to the lower-level evaluation and can be processed in a parallel or semi-parallel manner. These empirical observations support the reported solution time of under 300 s per scenario and indicate that the proposed method scales reliably with problem size and uncertainty granularity.
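The right-hand side of the duality in Equation (24) can be evaluated numerically in a toy setting. The sketch below is only an illustration under stated assumptions: a one-dimensional uncertainty, a finite support grid standing in for $\Xi$, the ground cost $c(\xi, \xi') = |\xi - \xi'|$, and a grid search over $\theta$. None of these choices is claimed to be the implementation used in the paper. It evaluates the supremum form exactly as written in Equation (24); for the worst-case (infimum) profit in Equation (23), the same routine would be applied to the negated payoff.

```python
def wasserstein_dual_value(payoff, samples, support, eps, thetas):
    """Evaluate inf_theta { theta*eps + (1/N) sum_n sup_xi [payoff(xi)
    - theta*|xi - xi_n|] } on a finite support grid and a grid of theta."""
    n = len(samples)
    best = float("inf")
    for theta in thetas:
        inner = sum(
            max(payoff(x) - theta * abs(x - s) for x in support)
            for s in samples
        ) / n
        best = min(best, theta * eps + inner)
    return best


# Hypothetical linear payoff with slope 2 (Lipschitz constant 2).
payoff = lambda xi: 2.0 * xi
samples = [1, 3, 5]                      # empirical mean payoff is 6
support = range(-50, 51)                 # wide grid so the bounds do not bind
thetas = [k / 100 for k in range(501)]   # theta in [0, 5]

print(wasserstein_dual_value(payoff, samples, support, eps=0.5, thetas=thetas))
```

For a linear payoff on a support wide enough that its bounds do not bind, the dual value equals the empirical mean plus the Lipschitz constant times the radius, which is the familiar closed form for this special case and gives a convenient sanity check.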
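$$0 \le \zeta_{i\tau\omega} \ \perp\ \mathrm{avail}_{i\tau\omega} - \pi^{\mathrm{RT}}_{i\tau\omega} - \sum_{r \in R} \alpha_{r}\, \rho^{r}_{i\tau\omega} \ge 0, \quad \forall i, \tau, \omega, \tag{25}$$
$$\sum_{r \in R} \phi_{r}\, \rho^{r}_{i\tau\omega} \le G_{i}\bigl( \varsigma_{i,\tau\omega} \bigr) - \gamma^{\mathrm{guard}}_{i}, \qquad \Psi_{y\omega} \ge \psi^{\min}, \quad \forall i, \tau, \omega, y, \tag{26}$$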
Equation (25) introduces a complementarity-based arbitrage switching condition, wherein the dual variable $\zeta_{i\tau\omega}$, interpreted as the shadow value of the available headroom constraint, can be strictly positive only when the combined utilization of real-time energy $\pi^{\mathrm{RT}}_{i\tau\omega}$ and aggregated reserves $\rho^{r}_{i\tau\omega}$ exhausts the unit’s total operational capability. This condition effectively prices the instantaneous marginal trade-off between energy arbitrage and reserve provision, reflecting the economic substitution dynamics between these two market services. Equation (26) imposes operational and compliance safeguards: total reserve commitments, weighted by product-specific scaling coefficients $\phi_{r}$, must remain within a state-of-charge-dependent guard band $G_{i}(\varsigma) - \gamma^{\mathrm{guard}}_{i}$, thereby ensuring energy deliverability and system reliability. In parallel, the realized capacity performance index $\Psi_{y\omega}$ is constrained to exceed a predefined minimum threshold $\psi^{\min}$ during scarcity conditions, guaranteeing that accredited capacity remains credible and performance-backed. Collectively, these equations capture the equilibrium coupling between arbitrage economics, physical feasibility, and regulatory compliance within the real-time operational layer.
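The complementarity condition of Equation (25) can be checked numerically with the standard "min" residual, which is zero exactly when both quantities are non-negative and at least one of them vanishes. The sketch below uses hypothetical names and values; it checks a candidate point and does not solve the MPEC.

```python
def headroom(avail, pi_rt, rho, alpha):
    """Slack of the headroom constraint in Eq. (25):
    avail - pi_rt - sum_r alpha_r * rho_r."""
    return avail - pi_rt - sum(a * r for a, r in zip(alpha, rho))


def complementarity_residual(zeta, slack):
    """|min(zeta, slack)| is zero if and only if zeta >= 0, slack >= 0,
    and zeta * slack == 0, i.e. 0 <= zeta ⊥ slack >= 0 holds."""
    return abs(min(zeta, slack))


# Hypothetical unit with 100 MW available and two reserve products.
alpha = [1.0, 0.5]
binding = headroom(100.0, 80.0, [10.0, 20.0], alpha)   # capability exhausted
loose = headroom(100.0, 70.0, [10.0, 20.0], alpha)     # 10 MW of headroom left

print(complementarity_residual(12.0, binding))  # positive price, no headroom
print(complementarity_residual(12.0, loose))    # positive price with headroom
print(complementarity_residual(0.0, loose))     # zero price with headroom
```

A positive shadow value together with unused headroom produces a non-zero residual, which is the situation the complementarity condition rules out.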
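$$\begin{aligned} \max_{\kappa,\, \vartheta,\, \theta}\quad & \sum_{y \in Y} \lambda^{\mathrm{cap}}_{y}\, \kappa_{y} - \sum_{y \in Y} \phi^{\mathrm{cap}}_{y}\, \bigl( \kappa_{y} - \vartheta_{y} \bigr)^{+} + \theta \\ \text{s.t.}\quad & \kappa_{y} \le E^{\mathrm{ELCC}}_{y}(\vartheta_{y}), \quad \kappa_{y} \le D^{(H)}_{y}(\vartheta_{y}), \\ & \theta \le \sum_{\omega \in \Omega} p_{\omega}\, \eta_{\omega} \quad \text{(Benders link)}, \end{aligned} \tag{27}$$
$$\begin{aligned} \eta_{\omega} = \max_{\pi^{\mathrm{RT}}_{\omega},\, \rho_{\omega}}\quad & \sum_{\tau \in T} \Bigl( \lambda_{\tau\omega} \sum_{i} \pi^{\mathrm{RT}}_{i\tau\omega} + \sum_{r} \mu^{r}_{\tau\omega} \sum_{i} \rho^{r}_{i\tau\omega} \Bigr) - C_{\omega}\bigl( \pi^{\mathrm{RT}}_{\omega}, \rho_{\omega} \bigr) \\ \text{s.t.}\quad & \pi^{\mathrm{RT}}_{i\tau\omega} + \sum_{r} \alpha_{r}\, \rho^{r}_{i\tau\omega} \le \mathrm{avail}_{i\tau\omega}(\kappa), \\ & \varsigma\text{-dynamics},\ \text{reserve windows},\ \text{balance constraints}, \end{aligned} \tag{28}$$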
Equation (27) formulates the master-level optimization problem incorporating the recourse proxy variable $\theta$, which jointly determines the accredited capacity vector $\kappa$ and the qualifying volume vector $\vartheta$, both constrained by ELCC requirements and duration-based adequacy limits. The recourse proxy $\theta$ is bounded above by the probability-weighted short-term recourse outcomes $\eta_{\omega}$ through Benders decomposition links, thereby coupling long-term accreditation with short-term operational performance. Equation (28) defines the scenario-specific recourse value $\eta_{\omega}$ as the optimal net market revenue, computed as gross energy and reserve income minus operating costs, subject to feasibility constraints that depend parametrically on the accredited capacity $\kappa$. By dualizing Equation (28), one obtains affine Benders cuts in $\kappa$, which iteratively refine the upper bound on $\theta$ in the master problem, ensuring convergence toward a tight approximation of the bilevel equilibrium. Collectively, Equations (25)–(28) complete the methodological layer of the proposed framework, integrating complementarity-driven arbitrage logic, deliverability and performance safeguards, and a decomposition-ready structure that cleanly partitions long-horizon capacity accreditation from short-horizon dispatch decisions, while rigorously maintaining their interdependence through feasibility and price feedback mechanisms.
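The cut-generation loop behind Equations (27) and (28) can be sketched on a deliberately small stand-in problem. All modeling choices below are assumptions for illustration: a scalar capacity $\kappa$ with a linear cost, a recourse value $\eta_{\omega}(\kappa) = \lambda_{\omega} \min(\kappa, a_{\omega})$ whose capacity-constraint dual supplies the cut slope, and a one-dimensional master solved by enumerating cut intersections instead of calling an LP solver. It is not the paper's master problem, only an instance of the same Benders mechanism.

```python
def recourse(kappa, price, avail):
    """Scenario recourse eta(kappa) = price * min(kappa, avail). Returns the
    value and a supergradient (dual price of the limit 'dispatch <= kappa')."""
    if kappa < avail:
        return price * kappa, price
    return price * avail, 0.0


def solve_master(cuts, cost, k_max):
    """Maximize -cost*kappa + theta s.t. theta <= a + b*kappa for each cut
    (a, b) and 0 <= kappa <= k_max. In one dimension the optimum lies at a
    bound or at an intersection of two cuts, so enumeration is exact."""
    candidates = {0.0, k_max}
    for idx, (a1, b1) in enumerate(cuts):
        for a2, b2 in cuts[idx + 1:]:
            if abs(b1 - b2) > 1e-12:
                x = (a2 - a1) / (b1 - b2)
                if 0.0 <= x <= k_max:
                    candidates.add(x)
    best_obj, best_kappa = float("-inf"), 0.0
    for x in candidates:
        obj = -cost * x + min(a + b * x for a, b in cuts)
        if obj > best_obj:
            best_obj, best_kappa = obj, x
    return best_obj, best_kappa


def benders(scenarios, cost, k_max, tol=1e-6, max_iter=50):
    """scenarios: list of (probability, price, availability) tuples."""
    kappa, cuts = 0.0, []
    incumbent_obj, incumbent_kappa = float("-inf"), 0.0
    for iteration in range(1, max_iter + 1):
        value = slope = 0.0
        for prob, price, avail in scenarios:
            v, g = recourse(kappa, price, avail)
            value += prob * v
            slope += prob * g
        obj = -cost * kappa + value          # true objective: a lower bound
        if obj > incumbent_obj:
            incumbent_obj, incumbent_kappa = obj, kappa
        # Affine cut: theta <= value + slope * (k - kappa)
        cuts.append((value - slope * kappa, slope))
        upper, kappa = solve_master(cuts, cost, k_max)   # relaxed upper bound
        if upper - incumbent_obj <= tol:
            break
    return incumbent_kappa, incumbent_obj, iteration


# Hypothetical three-scenario instance.
scenarios = [(0.5, 40.0, 60.0), (0.3, 60.0, 100.0), (0.2, 90.0, 30.0)]
print(benders(scenarios, cost=25.0, k_max=150.0))
```

Each pass adds one aggregated cut, so the master's upper bound decreases monotonically while the incumbent provides a lower bound; the loop stops when the two meet within the tolerance.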

4. Results

To validate the proposed hierarchical optimization framework, we conduct a calibrated case study on a stylized regional electricity market patterned after the PJM Interconnection, comprising 12 generation units, 8 retail aggregators, and a demand-side population partitioned into three consumer classes (residential, commercial, industrial) with distinct elasticity and response profiles. The temporal domain spans 24 hourly intervals to emulate a one-day-ahead market clearing coupled to capacity commitment obligations that roll over a six-month horizon; baseline demand is fixed at 13,200 MWh/day, with a peak of 780 MW occurring at hour 17. Generator marginal cost coefficients are sampled from the range [20, 70] $/MWh, while capacity availability factors vary between 0.72 and 0.98, thereby capturing heterogeneity in resource reliability. Capacity price inputs are derived from historical PJM auction outcomes, normalized and quantized into five representative tiers (85, 100, 115, 130, and 150 $/MW-day) to preserve empirical fidelity. Retail bidding is represented via inverse demand curves with elasticities in [−0.30, −0.08], enabling endogenous, price-responsive load adjustments; reliability penalties are set at 350 $/MWh for critical loads (e.g., hospitals) and 150 $/MWh for non-critical segments. Retailers submit joint energy–capacity bids while hedging against capacity shortfalls, and a 15% reserve margin is imposed system-wide relative to the peak hourly load. Demand-side stochasticity is modeled through ten scenarios generated by Gaussian copula-based sampling with weights calibrated to diurnal risk asymmetry. The full framework is implemented in Python 3.12.0 (Pyomo), with mixed-integer linear formulations solved by Gurobi 11.0 on a workstation equipped with an Intel Xeon Gold 6326 CPU (2.9 GHz; Intel Corporation, Santa Clara, CA, USA), 256 GB RAM, running on Ubuntu 22.04 LTS (Canonical Ltd., London, UK). Average solve times for the complete bilevel program are under 300 s per scenario, and the lower-level subproblems converge in fewer than 50 iterations under a dual decomposition-based callback scheme. Sensitivity experiments on capacity-price perturbations and elasticity variation are performed via post-processing modules that emulate policy stress tests and regulator cost redistributions; all code and datasets are available upon request.
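A few derived quantities follow directly from the stated case-study inputs and can be reproduced with simple arithmetic. The variable names below are illustrative, and the load factor is a derived quantity that the text itself does not report.

```python
# Inputs as stated in the case-study description.
DAILY_ENERGY_MWH = 13_200.0
PEAK_LOAD_MW = 780.0
RESERVE_MARGIN = 0.15
HOURS_PER_DAY = 24

# Average hourly load implied by the daily energy.
average_load_mw = DAILY_ENERGY_MWH / HOURS_PER_DAY

# Load factor: average load relative to the peak.
load_factor = average_load_mw / PEAK_LOAD_MW

# Capacity needed to cover the peak plus the 15% reserve margin.
required_capacity_mw = PEAK_LOAD_MW * (1.0 + RESERVE_MARGIN)

print(f"average load: {average_load_mw:.1f} MW")
print(f"load factor: {load_factor:.3f}")
print(f"peak plus reserve margin: {required_capacity_mw:.1f} MW")
```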
Figure 2 illustrates the nonlinear relationship between carbon intensity (kg CO 2 /MWh) and the average carbon allowance price ($/ton) across diverse entity types participating in California’s cap-and-trade market, segmented into five representative categories: data centers, industrial parks, university campuses, utilities, and transportation hubs. Each data point corresponds to a specific entity (e.g., Google Data Center, PG&E, Stanford University), with color encoding denoting the entity type and marker size reflecting total energy consumption.
The visualization reveals a distinctly nonlinear correlation: entities with higher carbon intensities generally incur higher average allowance prices, although notable deviations emerge across sectors. For instance, industrial parks exhibit moderate to high carbon intensities yet frequently access discounted permits through long-term contractual arrangements or bulk purchasing strategies. Utilities occupy an intermediate position, displaying moderate emission intensities but the widest dispersion in carbon pricing outcomes, indicative of the heterogeneity introduced by hedging strategies, auction timing, and regulatory offset mechanisms. Conversely, university campuses and data centers exhibit comparatively low emissions per MWh yet face higher marginal allowance prices, suggesting diminished bargaining power and shorter compliance horizons. Collectively, this figure exposes underlying segmentation patterns, behavioral asymmetries, and efficiency gaps in carbon trading, thereby motivating policy refinements such as dynamic price adjustment mechanisms or tiered allocation schemes aligned with marginal abatement cost differentials.

To clarify the advantages of the proposed approach, we compare it against traditional stochastic and deterministic bilevel scheduling frameworks. Conventional stochastic bilevel models rely on predefined scenario trees and therefore perform effectively only when the assumed probability distributions are accurate. Deterministic models, meanwhile, fully neglect distributional variability and tend to underestimate operational risks. In contrast, the DRO formulation used in this study optimizes performance over a range of plausible distributions, enabling the model to remain resilient when the system faces rare, stressed, or adversarial conditions not captured by nominal stochastic assumptions. Because deterministic and scenario-based approaches may fail to account for ambiguity or worst-case behavior, this comparison indicates the practical value and robustness of the proposed framework.
Figure 3 depicts the temporal evolution of carbon allowance allocations among six representative entities—PG&E, Meta, Tesla Gigafactory, Stanford University, the Port of Los Angeles, and a major Southern California industrial park—spanning the decade from 2025 to 2035. Each stratum in the stacked area plot represents the proportional share of the statewide carbon budget (measured in million metric tons of CO 2 ) held by a particular participant, normalized annually so that the aggregate allocation sums to 100%. The visualization reveals dynamic redistributions of allowance ownership driven by progressively tightening emission caps and adaptive reallocation schemes contingent on factors such as prior-year performance, political influence, and strategic stockpiling behavior. Utilities such as PG&E exhibit a gradual contraction in their allowance share, plausibly reflecting compliance with increasingly stringent decarbonization mandates. Conversely, technology-sector actors—including Meta and Stanford—display a marked expansion in their allocated shares over time, attributable to proactive credit accumulation, acquisition of surplus allowances, or sustained investment in low-carbon infrastructure. Two distinct discontinuities emerge in 2028 and 2032, corresponding to regulatory recalibration events synchronized with California’s quinquennial emissions inventory updates. Collectively, this figure elucidates the evolving competitive dynamics, the redistribution of compliance burdens, and the emerging equity considerations intrinsic to cap-and-trade systems, offering critical insights for strategic modeling of allowance procurement, secondary-market trading behavior, and long-term carbon investment hedging strategies.
Figure 4 illustrates the nonlinear interdependence between total system operational cost and two pivotal system drivers: renewable energy penetration (expressed as a percentage on the x-axis) and aggregate load demand (in per-unit, p.u., on the y-axis). The z-axis quantifies the total daily operational cost, normalized to base-case monetary units, thereby providing a three-dimensional representation of cost dynamics under varying system conditions. At low renewable penetration levels (0–20%), the surface demonstrates a monotonic rise in cost as aggregate demand increases, reflecting the predominance of fossil-fuel-based generation and its associated marginal cost structure. A pronounced cost trough emerges within the intermediate penetration range of 50–70%, representing an operational “sweet spot” wherein renewable generation displaces a substantial share of thermal unit commitments while avoiding excessive curtailment and balancing inefficiencies. Beyond approximately 80% renewable penetration, the surface transitions sharply upward—particularly under high-load regimes (load > 1.3 p.u.)—signaling the onset of system instability and increased difficulty in maintaining energy-reserve balance. This inflection is attributable to steep reserve ramping requirements, heightened curtailment rates, and over-dispatch of storage and demand-flexible assets. In such regimes, the reliance on backup thermal capacity and ancillary grid-support services intensifies, even when nominal renewable capacity remains sufficient, underscoring the nonlinear cost penalties associated with ultra-high renewable integration levels.
Figure 5 visualizes the dependency of system reliability—quantified as the percentage of uninterrupted load served—on the interaction between available storage capacity (x-axis, ranging from 0 to 100 MWh) and probabilistic wildfire disruption levels (y-axis, spanning 0 to 30%). At low disruption probabilities (below 5%), even relatively modest storage capacities (approximately 20 MWh) suffice to sustain high reliability levels exceeding 98%, as network contingencies such as nodal isolation or line outages remain infrequent and can be mitigated through conventional reserve margins. As disruption probabilities escalate toward the 10–20% range, the reliability surface exhibits a pronounced decline unless storage capacity is expanded commensurately to absorb the volatility induced by localized generation loss and transmission congestion. Under extreme disruption scenarios (25–30%), corresponding to high wildfire exposure and environmental volatility, maintaining system reliability above 95% necessitates large-scale storage deployment exceeding 70 MWh. This nonlinear sensitivity underscores the critical role of robust planning and flexible resource allocation under deep uncertainty, thereby empirically validating the application of DRO within the proposed modeling framework to ensure resilience against tail-risk environmental contingencies.
The numerical results reported here are generated from the implementation of the proposed framework, so that they correspond directly to the optimization model. The figures depict the behavior of key decision variables under uncertainty, illustrate how robustness adjustments influence dispatch outcomes, and demonstrate the feasibility of solving the reformulated problem at realistic scales. Each figure is aligned with a corresponding variable or constraint introduced in the formulation, allowing readers to trace how the model structure translates into observable numerical behavior.
Figure 6 presents the spatial interaction between total system carbon emissions (metric tons per day) and two pivotal decarbonization levers: renewable energy penetration (x-axis, 0–100%) and hydrogen storage utilization rate (y-axis, 0–100%). The resulting surface reveals a distinctly nonlinear topology. Along the horizontal dimension, emissions decline sharply as renewable penetration increases from 0 to approximately 60%, primarily reflecting the displacement of gas turbine generation by zero-carbon resources. Beyond this threshold, however, the rate of emission reduction plateaus, indicating diminishing marginal gains caused by renewable curtailment, system balancing constraints, and insufficient operational flexibility. Vertically, the intensification of hydrogen storage dispatch—where H 2 storage is employed not merely for intertemporal energy shifting but as an active component of real-time system dispatch—further accelerates emission reductions, particularly at higher renewable shares. Notably, when hydrogen storage utilization surpasses 80% under conditions of 90–100% renewable penetration, total emissions continue to decline even as battery-based systems exhibit plateauing performance due to round-trip inefficiencies and state-of-charge saturation. This figure highlights the synergistic role of hydrogen storage in deep decarbonization pathways, demonstrating its capacity to extend the effectiveness of renewable integration well beyond the technical limits imposed by conventional electrochemical storage technologies.
Figure 7 illustrates the voltage violation rate (z-axis, expressed in %) within the distribution network as a function of two critical operational parameters: load level (x-axis, in per unit, p.u.) and reactive power exchange (y-axis, in MVAr, ranging from −30 to +30). The resulting surface map reveals a distinct nonlinear dependency between network loading conditions and reactive power management. At nominal load levels (approximately 1.0 p.u.), the voltage profile remains largely stable, with violation rates near zero across a broad spectrum of reactive power injection and absorption values. However, as system loading intensifies beyond 1.3 p.u., insufficient reactive power regulation—particularly near the neutral region around 0 MVAr—triggers a steep escalation in violation rates, frequently surpassing 20%, symptomatic of pervasive undervoltage or overvoltage conditions extending beyond acceptable tolerance limits. Conversely, when reactive power compensation is effectively coordinated—achieved through targeted injection of approximately +20 MVAr or absorption of −20 MVAr, contingent on local phase and nodal characteristics—the violation rate declines sharply, reinstating voltage stability even under pronounced load stress. This relationship underscores the pivotal role of dynamic reactive power optimization in mitigating voltage instability and maintaining network resilience across diverse loading regimes.
Figure 8 depicts the degradation of capacity accreditation across heterogeneous resource types as scarcity duration increases, thereby elucidating the pivotal influence of operational flexibility on reliability valuation. The surface profile demonstrates that high-flexibility technologies—such as gas turbines and long-duration electrochemical storage—sustain accreditation levels above 90% across all scarcity intervals, reflecting their ability to deliver continuous support during extended stress events. In contrast, medium-flexibility resources, including hybrid PV + storage configurations, exhibit a pronounced yet gradual decline in accredited value, decreasing from approximately 90% to 60% as scarcity duration extends from one to twelve hours; this trend underscores the impact of discharge duration limits, state-of-charge depletion, and finite recharge opportunities. Low-flexibility assets—such as solar-only units and short-duration batteries—experience a steep deterioration in accreditation, falling below 40%, thereby exposing their inability to sustain delivery through multi-hour scarcity events. Collectively, the visualization underscores the necessity of duration-aware capacity valuation frameworks, particularly in renewable-dominated systems where prolonged net-load stress and variable resource availability necessitate more granular recognition of temporal reliability contributions.
To further investigate market power concentration effects, the proposed framework is evaluated on the IEEE 118-bus transmission system. Strategic generators are assigned heterogeneous bidding flexibility under network constraints, allowing market concentration to emerge endogenously. Simulation results indicate that increased generator concentration leads to amplified locational marginal price dispersion and asymmetric profit allocation, particularly under congestion-prone operating conditions.
We focus exclusively on the electricity-market hierarchical structure and omit the previous cap-and-trade signaling game, as it did not directly interact with the decision variables or constraints of the proposed optimization model. Policy-related factors are incorporated only when they explicitly influence dispatch feasibility, operational costs, or decision-layer interactions within the bilevel framework. By removing unrelated carbon-market dynamics, the model presentation becomes more streamlined, maintains a unified conceptual focus, and preserves the clarity of the methodological development.
Figure 9 presents a scatter plot mapping the marginal generation cost (in $/MWh) of diverse energy resources against their average utilization rates (expressed as the percentage of operational hours) within the optimized power system configuration. Each marker corresponds to a specific resource category—such as gas turbines, wind farms, solar PV + battery systems, pumped hydro, and biomass—with a color gradient indicating cost intensity, where darker red hues denote higher marginal costs. The visualization reveals three salient structural patterns. First, baseload or quasi-continuous resources—including coal (if present) and combined-cycle gas turbines—occupy the lower-left quadrant, exhibiting relatively low marginal costs ($0–$20/MWh) and moderate to high utilization rates, with solar-plus-storage hybrids occasionally attaining utilization levels near 60%. Second, mid-merit and flexible assets—such as wind and biomass—populate the central region, displaying balanced cost-to-utilization profiles that enable partial load following and economic dispatch under uncertainty. Third, high-cost peaking units, including open-cycle gas turbines and diesel generators, cluster in the upper-left region of the plot, characterized by extremely low utilization (<10%) yet very high marginal costs (exceeding $150/MWh), reflecting their role as contingency reserves or insurance assets during scarcity conditions. The figure provides clear strategic implications for dispatch hierarchy, redundancy management, and investment prioritization. Storage-linked renewable technologies emerge as “sweet spots,” combining moderate utilization with low marginal costs, thereby supporting their expansion as cost-effective enablers of system flexibility. In contrast, high-cost, low-utilization peakers, while operationally indispensable for extreme reliability contingencies, should be progressively displaced by cleaner and more flexible alternatives. 
Overall, this visualization offers a multidimensional benchmark for evaluating system efficiency, cost resilience, and technology portfolio optimization in the context of long-term decarbonization planning.
Figure 10 delineates the nonlinear profit landscape of strategic generators operating under varying levels of regulatory incentives and endogenous bidding aggressiveness, thereby visualizing the equilibrium feedback between policy design and market behavior. The three-dimensional surface reveals a distinctly nonconvex topology in which generator total profit exhibits an interior optimum rather than monotonic dependence on either control variable. Along the regulatory incentive axis, profit initially rises with increasing subsidy intensity, reflecting enhanced marginal returns from policy-induced revenue augmentation; yet, beyond a critical threshold, overcompensation leads to diminishing efficiency and welfare leakage, evidenced by the curvature inversion in the upper-right quadrant. Along the strategic bidding dimension, moderate strategic flexibility enhances profit by exploiting market-clearing asymmetries, while excessive bid inflation suppresses dispatch probability, yielding declining returns. The observed ridge line near the mid-surface marks the Pareto-efficient frontier of generator profitability, balancing regulatory generosity and competitive discipline. This morphology underscores the delicate calibration challenge faced by policymakers: insufficient incentives dampen renewable participation, whereas excessive generosity destabilizes price signals and erodes social welfare. In total, the figure encapsulates the essence of the bilevel game structure—where equilibrium profit formation is co-determined by the regulator’s incentive mechanisms and the endogenous strategic elasticity of market participants.
Figure 11 visualizes the three-dimensional welfare topology that emerges from the co-evolution of renewable penetration and regulatory policy stringency, revealing the structural trade-offs intrinsic to sustainable market design. The surface exhibits a convex–concave morphology wherein system social welfare initially rises with increasing renewable share and moderate policy enforcement, yet eventually approaches a plateau as marginal welfare gains are offset by curtailment, balancing inefficiencies, and fiscal saturation. Along the renewable penetration axis, the upward gradient signifies the systemic benefit of displacing carbon-intensive generation, reflected through lower marginal abatement costs and improved energy equity. However, beyond approximately 70–80% renewable share, diminishing returns become evident as variability-induced balancing costs and reserve scarcity erode the welfare frontier. Simultaneously, along the regulatory stringency dimension (expressed as Renewable Portfolio Standard, RPS), the surface curvature captures the nonlinear welfare response to escalating policy mandates: moderate enforcement catalyzes market coordination and innovation spillovers, while excessive rigidity induces welfare drag via subsidy inefficiency and distorted dispatch signals. The global welfare ridge corresponds to an interior equilibrium where renewable integration and policy intervention are mutually reinforcing yet remain economically proportionate. Collectively, the figure encapsulates the dialectical interplay between ecological ambition and market rationality, demonstrating that welfare optimization in decarbonized power systems hinges on the delicate calibration of regulatory pressure and technological penetration depth.
Figure 12 delineates the parametric sensitivity of market price volatility—expressed as Locational Marginal Price (LMP) variance in $/MWh—to renewable forecast errors and the degree of price-responsive demand elasticity. The surface exhibits a strongly convex geometry, revealing an intricate inverse correlation between demand elasticity and volatility amplitude. Along the forecast error axis, LMP volatility escalates nonlinearly with increasing renewable uncertainty, illustrating the amplification of stochastic imbalance costs under imperfect forecasting conditions. However, this amplification attenuates markedly as price-responsive demand elasticity strengthens, demonstrating the stabilizing influence of consumer adaptability on market equilibrium. In regions of low elasticity (below 0.3), even modest forecast errors provoke pronounced volatility spikes exceeding 7 $/MWh, reflecting scarcity pricing, reserve exhaustion, and rapid redispatch cycles. Conversely, when elasticity approaches unity, volatility diminishes toward a near-flat regime, indicating that flexible demand absorbs fluctuations and mitigates systemic stress. The curvature of the surface captures the nonlinear elasticity-to-volatility transmission mechanism—small increments in elasticity yield disproportionately large stability benefits, embodying an economic analogue of negative feedback damping. Collectively, this figure encapsulates the equilibrium geometry of uncertainty propagation in renewable-dominated markets, underscoring that enhancing demand flexibility constitutes a first-order lever for volatility containment and price stability in next-generation electricity systems.
Table 2 condenses the principal outcomes of the bilevel optimization analysis, showing how increasing renewable penetration influences the economic and operational equilibrium of the power system. The results exhibit a non-monotonic, broadly concave relationship between renewable share and welfare: as renewable integration rises from 20% to 60%, social welfare increases steadily from 42.5 M$ to 56.3 M$, owing to declining marginal generation cost and enhanced flexibility from storage participation. Once penetration surpasses this point, system inefficiencies emerge, stemming from higher curtailment rates, limited reserve availability, and inadequate long-duration flexibility: welfare falls to 52.7 M$ at 80% and recovers only partially, to 54.8 M$, in the 100% renewables + H2 hybrid scenario. The corresponding rise in total system cost, from its minimum of 11.8 M$ at 60% to 12.9 M$ at 80% and 13.5 M$ at 100%, confirms that marginal integration becomes increasingly expensive in the absence of sufficient flexibility resources. Meanwhile, reliability, although remaining above 95%, declines steadily from 99.1% to 95.5%, illustrating that deep decarbonization introduces operational fragility even under coordinated regulation. Together, these findings point to an economic–technical optimum for renewable expansion: a point at which welfare, cost, and reliability coexist in near-equilibrium, beyond which additional renewable deployment yields diminishing or even negative systemic returns.
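As a reading aid, the scenario comparison in Table 2 can be reproduced with a few lines of Python. The sketch below is not part of the authors' model: it simply re-enters the five reported rows and locates the welfare peak, the cost minimum, and the welfare change between consecutive scenarios. The names `TABLE_2`, `best_by`, and `welfare_steps` are illustrative assumptions.

```python
# Values re-entered from Table 2 of the article:
# (renewable share %, system cost M$, social welfare M$, reliability %)
TABLE_2 = {
    "S1": (20, 14.6, 42.5, 99.1),
    "S2": (40, 13.2, 48.9, 98.7),
    "S3": (60, 11.8, 56.3, 98.4),
    "S4": (80, 12.9, 52.7, 96.9),
    "S5": (100, 13.5, 54.8, 95.5),
}

COST, WELFARE, RELIABILITY = 1, 2, 3  # column positions in each row


def best_by(column, pick):
    """Return the scenario label that minimizes or maximizes a column."""
    return pick(TABLE_2, key=lambda label: TABLE_2[label][column])


def welfare_steps():
    """Welfare change (M$) between consecutive scenarios, rounded to 0.1."""
    labels = list(TABLE_2)
    return {
        f"{a}->{b}": round(TABLE_2[b][WELFARE] - TABLE_2[a][WELFARE], 1)
        for a, b in zip(labels, labels[1:])
    }


welfare_peak = best_by(WELFARE, max)
cost_minimum = best_by(COST, min)
steps = welfare_steps()

# Reliability falls at every step of increasing renewable share.
reliability = [row[RELIABILITY] for row in TABLE_2.values()]
reliability_declines = all(a > b for a, b in zip(reliability, reliability[1:]))

print("welfare peak:", welfare_peak)
print("cost minimum:", cost_minimum)
print("welfare steps (M$):", steps)
print("reliability strictly declining:", reliability_declines)
```

With the reported values, the welfare peak and the cost minimum fall in the same scenario, which is what the discussion of an economic–technical optimum relies on.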
Table 3 presents a sensitivity analysis examining the coupled effects of regulatory incentive magnitude and renewable forecast uncertainty on overall system welfare within the bilevel optimization architecture. The results exhibit a distinctly non-monotonic welfare response: while modest increases in incentives lead to improved coordination and greater renewable participation, the benefits taper off and eventually reverse once the policy stimulus exceeds the economically efficient threshold. In the intermediate range, specifically near an incentive level of $150 per unit, the welfare-maximizing configuration emerges, representing a point of equilibrium where the marginal gain from renewable participation equals the marginal inefficiency induced by informational and dispatch frictions. Beyond this regime, over-subsidization induces bidding distortions, inefficient resource commitments, and excessive curtailment of stochastic renewables, thereby eroding social welfare despite nominal decarbonization gains. Simultaneously, increasing forecast error amplifies volatility and causes the welfare frontier to shift downward, revealing the compounding effect of uncertainty on policy effectiveness. The analysis underscores that policy precision is as critical as policy generosity: the system performs optimally not when incentives are maximized, but when they are finely tuned to the stochastic properties of the market and to the behavioral responses of strategic participants. This highlights the inherent need for adaptive regulatory design that dynamically calibrates incentive levels to prevailing uncertainty conditions in order to sustain welfare-optimal equilibria in renewable-dominated power systems.
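The "Change (%)" column of Table 3 can be checked against the welfare column, assuming (as the 0.0 entry for S1 suggests) that changes are measured relative to scenario S1. The following sketch re-enters the reported rows and recomputes the percentages; the names `TABLE_3` and `change_vs_baseline` are illustrative and do not come from the article. Recomputed values may differ from the reported ones by up to about 0.1 percentage point, presumably because the published welfare figures are themselves rounded.

```python
# Values re-entered from Table 3 of the article:
# (incentive $, forecast error %, welfare M$, reported change %)
TABLE_3 = {
    "S1": (50, 5, 41.8, 0.0),
    "S2": (100, 10, 53.2, 27.3),
    "S3": (150, 10, 58.5, 39.9),
    "S4": (150, 20, 47.6, 13.9),
    "S5": (200, 25, 39.4, -5.7),
}

WELFARE, REPORTED = 2, 3  # column positions in each row


def change_vs_baseline(baseline="S1"):
    """Percentage welfare change of every scenario relative to the baseline."""
    base = TABLE_3[baseline][WELFARE]
    return {
        label: 100.0 * (row[WELFARE] - base) / base
        for label, row in TABLE_3.items()
    }


recomputed = change_vs_baseline()
best_scenario = max(TABLE_3, key=lambda label: TABLE_3[label][WELFARE])

# S3 and S4 share the same incentive (150) and differ only in forecast error
# (10% vs. 20%), so their gap isolates the welfare cost of added uncertainty.
uncertainty_penalty = TABLE_3["S3"][WELFARE] - TABLE_3["S4"][WELFARE]

for label, value in recomputed.items():
    print(f"{label}: recomputed {value:+.2f}%, reported {TABLE_3[label][REPORTED]:+.1f}%")
print("welfare-maximizing scenario:", best_scenario)
print(f"welfare lost from S3 to S4: {uncertainty_penalty:.1f} M$")
```

The S3-to-S4 comparison is the only pair in the table in which the incentive is held fixed, so it is the cleanest indication of how forecast error alone shifts welfare.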

5. Conclusions

This study introduced a comprehensive bilevel market framework designed to capture the hierarchical and strategic interactions between regulatory authorities and market participants in renewable-rich electricity systems. The interaction was formalized as a Stackelberg game, wherein the upper-level regulator determines optimal capacity incentives, price caps, and policy instruments to balance system reliability with economic efficiency, while the lower-level agents engage in profit-driven bidding under uncertainty. Through this structure, the model provides a unified representation of centralized oversight and decentralized decision-making, explicitly embedding uncertainty, market power, and renewable intermittency into the equilibrium formulation.
Methodologically, the resulting bilevel problem—expressed as a Mathematical Program with Equilibrium Constraints (MPEC)—was reformulated into an equivalent tractable representation via duality theory and complementarity-based transformation, thereby ensuring computational solvability without sacrificing behavioral realism. This reformulation permits efficient solution through decomposition and hybrid regularization techniques, making it applicable to large-scale market systems with multiple agents and policy layers. The approach effectively reconciles regulatory objectives with agent-level optimization, allowing the exploration of policy outcomes under dynamic and uncertain environments.
A case study based on a modified IEEE 118-bus system validated the framework’s practical relevance and analytical depth. The numerical results demonstrated the model’s capability to enhance renewable integration, alleviate market power concentration, and guide rational policy formulation under competing stakeholder objectives. By explicitly linking regulatory design and market behavior, the proposed framework establishes a scalable analytical foundation for next-generation electricity markets where centralized regulation and decentralized optimization co-evolve. Future extensions may incorporate multi-period dynamics, stochastic coupling with storage bidding behavior, and the integration of environmental or social welfare objectives within the same bilevel structure.

Author Contributions

Conceptualization, Y.W. (Yizhe Wang) and Y.W. (Yifan Wang); Methodology, Y.W. (Yizhe Wang), M.P. and J.L.; Software, M.P. and L.J.; Validation, M.P., X.Q. and J.L.; Formal analysis, X.Q. and L.J.; Investigation, X.Q. and Y.W. (Yifan Wang); Resources, J.L.; Data curation, Y.W. (Yifan Wang); Writing—original draft, Y.W. (Yizhe Wang); Supervision, L.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of State Grid Corporation of China (Research on the method of coordinated settlement between total and sub-province under the national unified electricity market, 5108-202455048A-1-1-ZN).

Data Availability Statement

Data can be requested upon contacting the corresponding author.

Conflicts of Interest

Y.W. (Yizhe Wang) was employed by Beijing Power Exchange Center. Y.W. (Yifan Wang) was employed by China Electric Power Research Institute. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare that this study received funding from the Science and Technology Project of State Grid Corporation of China. The funder was not involved in the study design; the collection, analysis, or interpretation of data; the writing of this article; or the decision to submit it for publication.

References

  1. Schweppe, F.C.; Rom, D.B. Power System Static-State Estimation, Part II: Approximate Model. IEEE Trans. Power Appar. Syst. 1970, PAS-89, 125–130.
  2. Baran, M.E.; Wu, F.F. Network reconfiguration in distribution systems for loss reduction and load balancing. IEEE Trans. Power Deliv. 1989, 4, 1401–1407.
  3. Chiu, C.-C.; Kao, L.-J.; Cook, D.F. Combining a neural network with a rule-based expert system approach for short-term power load forecasting in Taiwan. Expert Syst. Appl. 1997, 13, 299–305.
  4. Kahlen, M.T.; Ketter, W.; van Dalen, J. Electric Vehicle Virtual Power Plant Dilemma: Grid Balancing Versus Customer Mobility. Prod. Oper. Manag. 2018, 27, 2054–2070.
  5. Yi, Z.; Xu, Y.; Zhou, J.; Wu, W.; Sun, H. Bi-level programming for optimal operation of an active distribution network with multiple virtual power plants. IEEE Trans. Sustain. Energy 2020, 11, 2855–2869.
  6. Kordkheili, R.A.; Pourakbari-Kasmaei, M.; Lehtonen, M.; Kordkheili, R.A. Wind Farm-based Green Hydrogen: A Virtual Power Plant Case Study. In Proceedings of the 2022 18th International Conference on the European Energy Market (EEM), Ljubljana, Slovenia, 13–15 September 2022; IEEE: New York, NY, USA, 2022; pp. 1–7.
  7. Zepter, J.M.; Engelhardt, J.; Marinelli, M. Optimal expansion of a multi-domain virtual power plant for green hydrogen production to decarbonise seaborne passenger transportation. Sustain. Energy Grids Netw. 2023, 36, 101236.
  8. Naughton, J.; Wang, H.; Cantoni, M.; Mancarella, P. Co-optimizing Virtual Power Plant Services Under Uncertainty: A Robust Scheduling and Receding Horizon Dispatch Approach. IEEE Trans. Power Syst. 2021, 36, 3960–3972.
  9. Li, T.T.; Zhao, A.P.; Wang, Y.; Li, S.; Fei, J.; Wang, Z.; Xiang, Y. Integrating solar-powered electric vehicles into sustainable energy systems. Nat. Rev. Electr. Eng. 2025, 2025, 00181.
  10. Wang, D.; Peng, D.; Huang, D. Application and prospects of large AI models in virtual power plants. Electr. Power Syst. Res. 2025, 241, 111403.
  11. Guo, X.; Wang, L.; Ren, D. Optimal scheduling model for virtual power plant combining carbon trading and green certificate trading. Energy 2025, 318, 134750.
  12. Jiang, Y.; Lee, N.; Deng, X.; Yang, Y. A Secure-Sustainable-Fast Charging Strategy for Lithium-ion Batteries based on A Random Forest-Enhanced Electro-Thermal-Degradation Model. IEEE Trans. Energy Convers. 2025, 11, 29–41.
  13. Liu, B.; Zhou, B.; Yang, D.; Li, G.; Cao, J.; Bu, S.; Littler, T. Optimal planning of hybrid renewable energy system considering virtual energy storage of desalination plant based on mixed-integer NSGA-III. Desalination 2022, 521, 115382.
  14. Li, T.T.; Li, S.; Ding, C.X.; Bao, Z.; Alhazmi, M. Intelligent Wireless Power Scheduling for Lunar Multienergy Systems: Deep Reinforcement Learning for Real-Time Adaptive Beam Steering and Vehicle-to-Grid Energy Optimization. Int. Trans. Electr. Energy Syst. 2025, 2025, 9877968.
  15. Qais, M.; Kirli, D.; Moroshko, E.; Kiprakis, A.; Tsaftaris, S. A virtual power plant for coordinating batteries and EVs of distributed zero-energy houses considering the distribution system constraints. J. Energy Storage 2025, 106, 114905.
  16. Liu, W.; Li, Z.; Xing, X.; Chen, X.; Wang, Y.; Wang, X. Non-cooperative game optimization for virtual power plants considering carbon trading market. Energy 2025, 317, 134571.
  17. Nadimi, R.; Goto, M. Uncertainty reduction in power forecasting of virtual power plant: From day-ahead to balancing markets. Renew. Energy 2025, 238, 121875.
  18. Sarathkumar, T.V.; Goswami, A.K.; Khan, B.; Shoush, K.A.; Ghoneim, S.S.; Ghaly, R.N. Forecasting of virtual power plant generating and energy arbitrage economics in the electricity market using machine learning approach. Sci. Rep. 2025, 15, 3812.
  19. Zhao, A.P.; Li, S.; Xie, D.; Wang, Y.; Li, Z.; Hu, P.J.-H.; Zhang, Q. Hydrogen as the nexus of future sustainable transport and energy systems. Nat. Rev. Electr. Eng. 2025, 2, 447–466.
Figure 1. Hierarchical Optimization Framework for Electricity Market Participation and Retail Strategy under Capacity Obligations.
Figure 2. Carbon Intensity vs. Allowance Price by Entity Type.
Figure 3. Annual Carbon Allowance Budget Share by Player (2025–2035).
Figure 4. Total System Cost vs. Renewable Energy Penetration and Load Level.
Figure 5. System Reliability (%) vs. Storage Capacity and Wildfire Disruption Probability.
Figure 6. CO2 Emissions vs. Renewable Penetration and Hydrogen Storage Dispatch Ratio.
Figure 7. Voltage Violation Rate vs. Load Level and Reactive Power Dispatch.
Figure 8. Capacity Accreditation vs. Scarcity Duration.
Figure 9. Marginal Cost vs. Utilization.
Figure 10. Generator Profit Surface under Regulatory Incentive and Strategic Bidding.
Figure 11. Social Welfare Surface: Policy Stringency vs. Renewable Penetration.
Figure 12. Market Price Volatility: Sensitivity to Forecast Error and Demand Elasticity.
Table 2. Aggregated Performance of the Bilevel Market under Varying Renewable Penetration Levels.

| Scenario | Renewable Share (%) | System Cost (M$) | Social Welfare (M$) | Reliability (%) |
|---|---|---|---|---|
| S1: Fossil-dominant baseline | 20 | 14.6 | 42.5 | 99.1 |
| S2: Moderate renewables | 40 | 13.2 | 48.9 | 98.7 |
| S3: High renewables + storage | 60 | 11.8 | 56.3 | 98.4 |
| S4: Very high renewables | 80 | 12.9 | 52.7 | 96.9 |
| S5: 100% renewables + H2 hybrid | 100 | 13.5 | 54.8 | 95.5 |
Table 3. Sensitivity of System Welfare to Incentive Level and Forecast Uncertainty.

| Scenario | Incentive ($) | Forecast Error (%) | Welfare (M$) | Change (%) |
|---|---|---|---|---|
| S1: Low policy support | 50 | 5 | 41.8 | 0.0 |
| S2: Moderate baseline | 100 | 10 | 53.2 | +27.3 |
| S3: Optimized incentive | 150 | 10 | 58.5 | +39.9 |
| S4: High volatility | 150 | 20 | 47.6 | +13.9 |
| S5: Over-subsidized regime | 200 | 25 | 39.4 | −5.7 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
