1. Introduction
The global shift toward low-carbon energy systems is accelerating the deployment of renewable energy sources (RESs), such as wind and solar, fundamentally altering how power systems are designed, operated, and regulated [1,2]. Unlike conventional generation assets, RESs cannot be dispatched upward on demand but can be curtailed downward by operators. They are weather-dependent and exhibit strong spatiotemporal correlations, creating new layers of volatility and uncertainty in both short-term operations and long-term infrastructure planning. Nowhere is this transformation more consequential than in transmission expansion planning (TEP), where decisions must anticipate and accommodate a wide range of uncertain futures while maintaining economic efficiency, reliability, and resilience [3,4,5]. Traditional TEP models, rooted in deterministic optimization paradigms, have historically relied on fixed demand forecasts and assumed generation capacities. These models prioritize cost minimization under known conditions and offer tractable solutions for network reinforcement [6,7]. However, the assumptions underpinning these models become increasingly fragile as renewable penetration deepens. The inherent intermittency of RESs introduces high-frequency volatility, diurnal variability, seasonal asymmetries, and locational imbalance, all of which challenge the adequacy and robustness of fixed infrastructure plans. Consequently, incorporating uncertainty into transmission planning has emerged as both a modeling and an operational imperative.
In response, stochastic programming approaches have been widely adopted to account for renewable and load variability. These methods introduce multiple scenarios representing distinct realizations of uncertain parameters, typically organized in a two-stage decision structure: the first stage encodes investment decisions, while the second stage simulates the operational outcomes conditional on each scenario [8,9]. Scenario-based models allow planners to assess performance across a range of potential futures and capture the recourse flexibility of operational control. Yet, their effectiveness is inherently tied to the quality and comprehensiveness of the scenario set. Poorly chosen or narrowly distributed scenarios can mask system vulnerabilities, particularly under tail-risk events, such as prolonged renewable droughts or coincident high-load, low-generation periods. To hedge against the extreme realizations not captured in fixed scenario sets, robust optimization was introduced into the TEP literature. Here, uncertainty is modeled through bounded sets (box, polyhedral, or ellipsoidal), and the objective is to minimize worst-case costs over all admissible realizations [10,11,12]. This approach offers formal performance guarantees under structured uncertainty but suffers from potential over-conservatism. By preparing for the worst possible outcome within a potentially over-specified set, robust models may lead to excessive infrastructure buildout and inefficient capital allocation. Additionally, traditional robust formulations do not encode statistical information or exploit data-driven insights about actual uncertainty distributions.
This limitation led to the development of distributionally robust optimization (DRO) models, which represent a middle ground between stochastic and robust approaches. DRO frameworks define an ambiguity set, a family of probability distributions that are deemed plausible given available data, and optimize against the worst-case distribution within this set [13,14,15]. Common constructions include moment-based sets, phi-divergence balls, and Wasserstein metric balls. In power systems, DRO has been applied to unit commitment, economic dispatch, reserve allocation, and, more recently, transmission planning [16,17]. Wasserstein-based DRO is particularly attractive as it offers strong out-of-sample performance guarantees and admits tractable dual reformulations under convexity assumptions. However, existing DRO-based TEP models typically rely on historical data or synthetic generation to form empirical distributions. The uncertainty representation remains static throughout the optimization, and the generated samples may fail to explore those states that are most consequential for network design.
Simultaneously, advances in deep learning, particularly generative modeling, offer new possibilities for modeling complex, high-dimensional uncertainty. Generative adversarial networks (GANs), variational autoencoders (VAEs), and diffusion models can learn the latent structure of RES behavior across space and time [18,19,20]. GANs are especially powerful for generating realistic, high-fidelity synthetic data from low-dimensional latent codes. In the context of energy systems, GANs have been used to generate synthetic wind and solar traces, simulate net load variability, and enrich datasets for forecasting. However, most of these applications are standalone data modeling tasks that are only loosely integrated with optimization workflows and typically lack direct feedback mechanisms from decision layers. As a result, the generated scenarios may be statistically plausible, but their operational relevance (whether they stress the grid, violate constraints, or induce high costs) remains unmeasured.
More recently, researchers have begun integrating optimization objectives into the learning process of generative models. This class of work, known broadly as optimization-in-the-loop learning, uses outcomes from decision models (e.g., cost, regret, or constraint violations) as feedback signals to shape the generator’s behavior. In power systems, some early applications have used bilevel frameworks to guide GANs toward generating realistic yet economically disruptive load curves or RES trajectories. These methods co-train a generator and a decision model, encouraging the generated scenarios to expose structural weaknesses in the system [21,22,23]. This direction opens new frontiers for TEP, where stress-testing infrastructure against worst-case but still realistic scenarios is critical [24]. Compared with classical scenario generation methods relying on statistical fitting or heuristic perturbations, GAN-based approaches offer improved realism and diversity [25,26]. Existing studies have applied GANs to forecast load, synthesize renewable power scenarios, or simulate attack events, demonstrating their potential for realistic uncertainty modeling. However, most of these applications remain decoupled from optimization tasks. Our work builds upon these foundations by directly embedding GAN-generated scenarios into a bi-level TEP model, enabling the system to learn and harden against its own adversarial weaknesses in an end-to-end manner.
In this paper, we develop a novel framework, tailored to transmission expansion under high renewable penetration, that integrates a bilevel risk-cost equilibrium model with an adversarial learning-based scenario generator. At the upper level, a system planner selects transmission investments under a constrained budget, aiming to minimize infrastructure cost plus operational risk exposure. At the lower level, system operators respond to scenario-specific net load profiles by dispatching resources, curtailing renewables, and shedding load, all while respecting physical and market constraints. Unlike traditional models, where uncertainty is fixed ex ante, our framework dynamically generates new high-impact scenarios through a physics-informed spatiotemporal GAN (PI-ST-GAN). This GAN synthesizes extreme but feasible renewable and load patterns, and it is guided by a dual-loss discriminator: one head enforces realism through adversarial classification, while the other predicts a composite stress index that combines expected energy not served, loss-of-load probability, and marginal congestion cost. The entire framework forms a closed-loop co-optimization system, where the GAN learns to generate scenarios that degrade the planner’s objective, while the planner adapts its decisions to resist these adversarial samples. The result is a self-refining transmission plan that is optimized not only for average-case performance, but also for resilience under intelligent stress testing. This architecture avoids the brittleness of fixed-scenario stochastic programming, the rigidity of robust optimization, and the data limitations of traditional DRO by enabling an endogenous generation of operationally meaningful worst-case scenarios.
From a methodological standpoint, this work contributes a fundamentally new decision architecture that fuses adversarial learning with bi-level optimization. The PI-ST-GAN is designed to preserve physical plausibility, temporal coherence, and spatial structure, ensuring that generated samples are both challenging and credible. By embedding stress-based feedback directly into the generator’s loss function, we achieve a new class of optimization-aware generative modeling that aligns scenario learning with decision impact. The transmission planner, in turn, is exposed to a continuously evolving distribution of risk, improving the robustness and adaptability of investment decisions. Empirical validation is conducted on a modified IEEE 118-bus system, where high wind and solar penetration levels induce considerable uncertainty in system balancing and congestion. Our results demonstrate that the adversarially-stressed bilevel planner identifies transmission strategies that are materially different from those produced by deterministic or stochastic baselines. Specifically, we observed that the model tends to prioritize operational flexibility and congestion mitigation, often resulting in comparatively less emphasis on redundant capacity investments. This observation suggests that resilient planning under deep renewable integration may benefit from infrastructure strategies that are more responsive and stress-aware, rather than solely relying on traditional overbuild principles. Furthermore, our results show significant gains in risk exposure metrics under adversarial testing, including lower loss-of-load probability and reduced curtailment, highlighting the operational value of scenario-aware planning.
2. Mathematical Modeling
To formally represent the proposed planning paradigm, we developed a bi-level mathematical optimization model that captures the hierarchical structure of long-term infrastructure investment and short-term operational response under adversarial uncertainty. The upper level corresponds to the decision maker’s objective of selecting a subset of candidate transmission lines that minimize total investment cost and worst-case operational risk, subject to budget constraints. The lower level reflects the behavior of the system operator, who optimizes dispatch, curtailment, and load shedding actions under specific realizations of renewable and load scenarios synthesized by the generative adversarial network. This section introduces the full model, including objective functions, physical constraints, and adversarial scenario selection criteria. All notation is consistent with the spatiotemporal resolution of the case study and was designed to ensure computational tractability and reformulation compatibility.
To ensure clarity in the mathematical formulation, all the variables and parameters used in this study are defined in the following nomenclature table.
Figure 1 shows a conceptual diagram of the bi-level optimization framework designed to support grid reinforcement planning under high renewable energy penetration. At the top of the diagram, various renewable energy sources (e.g., wind and solar) and demand patterns are modeled as uncertain inputs. These inputs are processed by a GAN-based scenario learning module, which synthesizes a wide range of realistic and extreme operating conditions. The generated scenarios are then passed into a bi-level optimization structure, where the upper level focuses on long-term investment decisions in transmission infrastructure, while the lower level models the operational response of the power system. The robust investment model ensures cost-effective grid reinforcement while accounting for risk and operational constraints derived from the scenario outputs. Arrows between modules indicate the direction of information flow and feedback mechanisms, highlighting the interaction between scenario learning, operational constraints, and planning decisions.
Equation (1) minimizes the sum of infrastructure outlay and worst-case operational risk. The first term expresses total capital expenditure: for each line ℓ in the candidate-line set, the unit cost of the line is multiplied by a binary variable that indicates its selection. The second term evaluates the maximum expected stress over all distributions lying in the Wasserstein ball of a given radius around the empirical measure. For each hour and bus n, one component penalizes unserved demand, whereas another accounts for renewable curtailment. To capture early-stage service deterioration, a power-quality component is also introduced: it collects the voltage-, frequency-, or harmonic-related disturbances at node n, weights each disturbance type, and measures its magnitude or duration. Congestion risk is tallied separately over the line set by a term that, for each line k, considers the buses incident to line k, the overload size, and its penalty. Using the inner maximum prevents double counting of correlated overloads, while the added power-quality term enables the planner to hedge against sub-nominal disturbances that often precede load shedding or capacity violations. Consequently, the formulation yields a cost-optimal expansion plan that remains resilient to the extreme yet physically plausible stress trajectories extending from quality degradation through congestion to outright service loss.
Equation (2) captures the lower-level operational response under a given scenario, where the system operator minimizes a composite cost. This cost includes generation costs, renewable curtailment penalties, load shedding losses, and congestion-related expenditures based on actual line flows and congestion prices. This level encapsulates the decentralized but coordinated dispatch logic that adapts to uncertain net loads and renewable injections, reflecting real-time resilience under adversarial stress.
To guide the adversarial generator in identifying high-impact scenarios, Equation (3) introduces a unified composite stress index. This index integrates three core resilience metrics: expected energy not served (EENS), loss-of-load probability (LOLP), and marginal congestion cost (MCC). Each component is weighted by a corresponding stress multiplier to reflect the relative importance of each risk dimension. This risk index serves as a differentiable surrogate objective for the scenario generator, enabling it to steer generation toward operationally critical stress trajectories that challenge the planned infrastructure. The three weighting coefficients are assigned based on a combination of domain knowledge and empirical calibration. Initial values are selected according to the relative severity and economic impact of each stress component, with higher emphasis placed on expected energy not served due to its critical influence on system reliability. Subsequently, a sensitivity analysis is conducted to ensure that the planner’s decisions remain stable across varying weight configurations. This procedure results in a set of interpretable and operationally robust coefficients for stress evaluation.
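As a minimal sketch of how such an index could be evaluated, the snippet below assumes the composite stress index is a plain weighted sum of the three metrics. The class name, function name, and default weight values are hypothetical and chosen only for illustration; the source does not report the calibrated weights, and the three metrics are assumed to be pre-normalized to comparable scales.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StressWeights:
    # Hypothetical weights; EENS is given the largest share, as the text suggests.
    eens: float = 0.5
    lolp: float = 0.3
    mcc: float = 0.2


def composite_stress(eens: float, lolp: float, mcc: float,
                     w: StressWeights = StressWeights()) -> float:
    """Weighted sum of the three resilience metrics (assumed form of the index)."""
    return w.eens * eens + w.lolp * lolp + w.mcc * mcc
```

Re-evaluating the same scenarios with several `StressWeights` instances is one simple way to carry out the kind of weight-sensitivity check described above.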
As shown in Equation (4), the transmission expansion planning problem is reformulated as a single-level optimization under adversarial uncertainty. The first term remains the deterministic infrastructure cost based on the investment decisions, while the second term incorporates an inner maximization over adversarially generated scenarios. Each scenario is constructed to maximize operational cost through the PI-ST-GAN module, ensuring stress relevance. The inner expression mirrors the operational objective defined in Equation (2), making this formulation tractable for MILP solvers via dualization or column-and-constraint generation approaches. This structure captures the essence of a robust investment strategy: one that is optimal under the most threatening credible futures. To ensure practical applicability, particular attention is given to the tractability of the reformulated problem in Equation (4). The bi-level structure is transformed into a single-level equivalent form, which enables compatibility with standard mixed-integer optimization solvers. Moreover, the scenario-based stress evaluation structure preserves convexity in critical subcomponents, particularly in constraints involving power flow and reserve margins. This design supports stable solver behavior, even under scaling to larger networks. Preliminary performance profiling indicates that the approach remains computationally feasible in systems with over 300 buses and multiple concurrent constraints, provided that appropriate solver warm-starting and decomposition techniques are employed. These considerations enhance the robustness and scalability of the proposed framework in real-world applications.
Equation (5) ensures nodal power balance under each scenario: the sum of generation, renewable output, and incoming flows equals the total demand adjusted by load shedding and curtailment. The left-hand side models DC-based nodal injection using the susceptance matrix and the voltage phase angles, capturing dispatch feasibility across the network.
Equation (6) defines the DC flow equation, mapping voltage angle differences into power flows along branch k. The binary variable activates the line if it is built, enforcing Kirchhoff’s laws and identifying congestion under adversarial stress.
It is important to clarify that the voltage phase angles used in Equations (5) and (6) are computed under the DC power flow approximation, which assumes constant voltage magnitudes and neglects reactive power flows. While this simplification enables fast evaluation during large-scale scenario generation and training of neural networks, it may introduce model errors in scenarios with significant nonlinearities. In this work, phase angles are used as surrogate indicators of spatial stress patterns rather than precise physical quantities. Their inclusion allows the GAN to learn topologically informed congestion features. Nonetheless, we acknowledge that the DC model may not fully capture voltage-related instabilities or reactive congestion. To mitigate this, future extensions will explore the incorporation of AC power flow solvers or hybrid surrogates to enhance the physical fidelity of the training data and to improve the robustness of learned stress patterns.
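To make the DC approximation behind Equations (5) and (6) concrete, the following sketch solves a small DC power flow with NumPy. It is an illustration under stated assumptions, not the paper's implementation: the function name, the `(from_bus, to_bus, susceptance, built)` line format, and the choice of slack bus are all hypothetical, and the `built` flag simply stands in for the binary investment variable.

```python
import numpy as np


def dc_power_flow(n_bus, lines, injections, slack=0):
    """Solve B * theta = P for phase angles, then map angle differences to flows.

    lines: iterable of (from_bus, to_bus, susceptance, built) tuples (assumed format).
    injections: net nodal injections (generation minus demand), summing to zero.
    """
    B = np.zeros((n_bus, n_bus))
    for i, j, b, built in lines:
        if not built:  # an unbuilt candidate line contributes nothing
            continue
        B[i, i] += b
        B[j, j] += b
        B[i, j] -= b
        B[j, i] -= b

    # Fix the slack-bus angle at zero and solve the reduced system.
    # np.linalg.solve raises LinAlgError if the built network is disconnected.
    keep = [n for n in range(n_bus) if n != slack]
    theta = np.zeros(n_bus)
    p = np.asarray(injections, dtype=float)
    theta[keep] = np.linalg.solve(B[np.ix_(keep, keep)], p[keep])

    # DC flow on each line: susceptance times angle difference (zero if unbuilt).
    flows = np.array([b * (theta[i] - theta[j]) if built else 0.0
                      for i, j, b, built in lines])
    return theta, flows
```

For a three-bus triangle with equal susceptances, injecting one unit at bus 0 and withdrawing it at bus 2 splits the flow between the direct line and the two-line path; marking the direct line as unbuilt forces the full unit through the remaining path, which is how a missing candidate line shows up as congestion.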
Equation (7) constrains generation, renewable output, and curtailment within feasible bounds. RES output is capped by scenario-specific availability, and curtailment cannot exceed realized generation, ensuring energy balance.
Equation (8) enforces load shedding constraints. It ensures that the amount of unsupplied load cannot exceed actual demand, preserving feasibility under stress-induced conditions.
Equation (9) enforces transmission line flow limits using binary decision variables. If a line is not constructed (its binary variable equals zero), its capacity collapses to zero, blocking illegal flows.
Equation (10) imposes a budget constraint over candidate lines. The investment decisions must collectively respect the total planning budget, introducing tradeoffs between coverage and cost.
Equation (11) defines the binary nature of each candidate line variable, enabling discrete infrastructure investment modeling and exact MIP-based optimization strategies.
Equation (12) enforces a relaxed complementarity condition between power flow and congestion price, allowing shadow prices to arise only when line capacity is potentially exceeded, while improving numerical tractability and learning robustness.
Equation (13) enforces soft complementary slackness conditions between operational constraints and their associated economic penalties.
Equation (14) defines the Wasserstein ambiguity set by restricting the distance between any candidate distribution and the empirical one to a given tolerance, shaping the robustness level of planning.
Equation (15) reformulates the inner DRO problem into a tractable convex form using duality. It balances a robustness term against a penalty cost for stress-inducing scenarios. The loss function in this equation denotes the operational loss under a given decision vector and uncertainty realization; its symbol is distinct from the one used earlier in Equation (1) to represent the set of candidate transmission lines. In this dual formulation, the multiplier term penalizes the Wasserstein distance from the empirical distribution, while the inner supremum evaluates worst-case deviations. This transformation allows the outer optimization to account for adversarial scenarios while maintaining computational tractability.
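For orientation, the well-known dual form of a type-1 Wasserstein worst-case expectation matches the structure described here. The symbols below are assumptions made for this sketch and are not taken from the source: $h$ for the operational loss, $x$ for the decision vector, $\xi$ for the uncertainty, $\hat{\xi}_i$ for the $N$ empirical samples, $\varepsilon$ for the ball radius, $\lambda$ for the dual multiplier, and $d$ for the ground metric. The paper's Equation (15) may differ in notation or in additional terms, and the equality relies on the usual regularity conditions for strong duality.

```latex
\sup_{\mathbb{Q} \in \mathcal{B}_{\varepsilon}(\hat{\mathbb{P}}_N)}
  \mathbb{E}_{\mathbb{Q}}\big[ h(x, \xi) \big]
\;=\;
\inf_{\lambda \ge 0}
\left\{
  \lambda \varepsilon
  + \frac{1}{N} \sum_{i=1}^{N}
    \sup_{\xi \in \Xi}
    \Big[ h(x, \xi) - \lambda \, d\big(\xi, \hat{\xi}_i\big) \Big]
\right\}
```

In this form, $\lambda \varepsilon$ plays the role of the robustness term, and $\lambda \, d(\xi, \hat{\xi}_i)$ is the penalty on moving a scenario away from its empirical anchor inside the inner supremum.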
Equation (16) identifies the worst-case GAN-generated scenario that maximizes stress under feasibility and plausibility filters, driving robust planning against high-impact conditions.
It is important to note that the proposed formulation is temporally decoupled, meaning that the constraints and optimization decisions are defined independently for each time period. This simplification allows tractable resolution of the distributionally robust problem under high-dimensional uncertainty, but it does not capture intertemporal dependencies, such as ramping constraints, energy storage dynamics, or generator state persistence. While this structure is sufficient for evaluating spatial flexibility and instantaneous system robustness, it does not reflect the full operational realism of time-coupled dispatch. The incorporation of integral constraints across time periods will be considered in future extensions of this model.
3. The Proposed Method
Solving the bi-level optimization problem under adversarial uncertainty requires a hybrid methodology that integrates optimization, scenario generation, and learning-based stress refinement. In this section, we describe the closed-loop training architecture and numerical solution process used to implement the proposed GAN-enhanced transmission expansion planning framework. We begin by outlining the structure and training procedure of the physics-informed spatiotemporal GAN (PI-ST-GAN), which is responsible for generating high-impact uncertainty scenarios. We then describe how these scenarios are embedded into the lower-level dispatch problem and how their stress feedback is used to update the GAN’s training objective. Finally, we present the reformulation of the bi-level model into a tractable mixed-integer program, achieved using dual decomposition, and describe the iterative co-optimization process that alternates between adversarial scenario generation and planning response. This methodology enables the planner to adapt its design in response to dynamically evolving structural stress.
To account for distributional shifts in renewable-related uncertainties across different penetration levels, the PI-ST-GAN incorporates an adaptive distance threshold mechanism into the spatial correlation modeling process. Specifically, the effective threshold is scaled as a function of the renewable penetration ratio, starting from a baseline spatial radius and governed by a parameter that reflects the expansion elasticity. This formulation allows the generator to produce risk scenarios with broader spatial reach under high-renewable conditions, thereby capturing the increased dispersion and structural volatility associated with deeper decarbonization.
Figure 2 provides a visual representation of the proposed ANN model employed in this study. The architecture consists of three primary layers: an input layer, one or more hidden layers, and an output layer. The input layer receives multiple features, including renewable generation patterns, weather indicators, and temporal information, which serve as the basis for learning predictive relationships. These features are passed forward through fully connected hidden layers, where each node performs a weighted sum followed by a non-linear activation function (e.g., ReLU). To enhance generalization and prevent overfitting, regularization techniques, such as dropout and batch normalization, may be applied at this stage. The ANN model outputs a set of task-specific predictions, such as forecasted net load, operational stress indicators, or scenario risk scores, depending on its integration within the GAN or the optimization pipeline. As shown in the diagram, arrows denote the flow of information between layers, and key operations, like activation and transformation, are annotated for clarity. This modular structure allows the ANN to capture complex, non-linear mappings from high-dimensional input data to decision-relevant outputs, making it a flexible and effective component within the broader planning framework.
Equation (17) defines the adversarial learning objective in the form of a two-player minimax game between the generator and the discriminator. The generator, with its own set of trainable parameters, transforms latent noise samples into synthetic scenarios, attempting to fool the discriminator. The discriminator, with a separate set of parameters, receives either real samples from empirical data or generated samples from the generator, and it learns to assign high scores to real data and low scores to synthetic data. The optimization seeks an equilibrium in which the generator creates scenarios indistinguishable from real data; formally, the generator minimizes the same value function that the discriminator maximizes. This adversarial framework enables the construction of complex, high-dimensional distributions of RES and load trajectories for realistic system stress testing.
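For reference, the textbook two-player GAN objective that this description matches can be written as follows. The symbols $G_{\theta}$, $D_{\phi}$, $p_{\text{data}}$, and $p_z$ are assumed here for illustration; the paper's own Equation (17) may use different notation or include conditioning terms.

```latex
\min_{\theta} \; \max_{\phi} \; V(\theta, \phi)
= \mathbb{E}_{x \sim p_{\text{data}}} \big[ \log D_{\phi}(x) \big]
+ \mathbb{E}_{z \sim p_z} \big[ \log \big( 1 - D_{\phi}(G_{\theta}(z)) \big) \big]
```

The discriminator increases $V$ by scoring real samples near one and generated samples near zero, while the generator decreases $V$ by pushing $D_{\phi}(G_{\theta}(z))$ toward one.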
Equation (18) defines the generator’s augmented loss function, incorporating not only the standard adversarial feedback from the discriminator, but also a risk-guided shaping term. The expectation is taken over latent variables, which the generator transforms into synthetic scenarios. The first term measures the likelihood of the discriminator successfully identifying the generated sample as fake, while the second term introduces a penalty (or reward) based on the operational stress induced by each scenario. A scalar weight balances the relative importance of realism versus criticality. This formulation encourages the generator to produce edge-case scenarios that both appear realistic and exert pressure on the power system, enriching the dataset with adverse but plausible operational conditions.
Equation (19) presents the discriminator’s classification loss function, which quantifies its ability to distinguish between real and synthetic scenarios. It combines two expectations: one over empirical samples, and one over synthetic samples generated from latent inputs. The loss penalizes the discriminator for incorrectly labeling either real or fake samples, thereby encouraging improved performance in separating genuine operating conditions from artificial ones. This classification objective is most influential in the early training phase, when realism is not yet aligned with stress signal modeling.
Equation (20) introduces a regression-based risk alignment loss to train the discriminator as a surrogate evaluator for scenario severity. Here, the discriminator’s continuous output predicting stress is directly compared to the true risk score. The loss penalizes squared deviations, effectively guiding the discriminator to learn a smooth, differentiable mapping from scenario features to risk values. This surrogate allows backpropagation of stress gradients through the GAN pipeline, enabling efficient training of the generator without requiring computationally intensive bilevel optimization at each iteration.
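The two discriminator objectives can be sketched numerically as follows. This is a hedged illustration only: the function names are hypothetical, the classification term is assumed to be the standard binary cross-entropy form, and the risk-alignment term is assumed to be a mean squared error, which is consistent with "penalizes squared deviations" but not confirmed in detail by the source.

```python
import numpy as np


def classification_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy over real and generated samples (assumed form).

    d_real, d_fake: discriminator probabilities in (0, 1) for real / synthetic inputs.
    """
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return float(-(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))))


def risk_alignment_loss(predicted_stress, true_stress):
    """Mean squared error between the stress head's output and the true score."""
    predicted = np.asarray(predicted_stress, dtype=float)
    true = np.asarray(true_stress, dtype=float)
    return float(np.mean((predicted - true) ** 2))
```

An undecided discriminator that outputs 0.5 everywhere incurs a classification loss of 2 ln 2, which is a convenient sanity check when monitoring training.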
Equation (21) softly penalizes the generator for producing RES outputs that violate physics-informed bounds (e.g., ramp rates or capacity limits). This regularization term helps ensure that the synthesized power trajectories remain within enforceable and realistic operational boundaries. By incorporating upper and lower limits on RES generation, the model enforces technical feasibility without applying hard constraints that may obstruct GAN training. The penalty strength is governed by a weight that can be adjusted based on bus-level flexibility or forecasting uncertainty.
Equation (22) imposes a spatial regularization term to preserve geographical coherence in RES generation profiles. It penalizes sharp differences between neighboring buses n and m, guided by spatial correlation weights. An indicator function suppresses penalties for non-neighboring nodes. This constraint reflects the physical reality that RES conditions (such as solar irradiance or wind speed) are often correlated among proximate areas. By reducing spatial inconsistency, the generator is more likely to produce realistic and regionally consistent RES outputs.
Equation (23) enforces temporal consistency by introducing a ramping penalty between consecutive time steps. The loss suppresses abrupt fluctuations in generated RES trajectories, which are rarely observed in real-world operations. A dedicated weight calibrates the sensitivity to these ramping effects. This formulation mimics operational forecasting logic used in system operator practices, where ramp-rate limits exist to maintain stability. The physics-informed loss terms defined in Equations (21)–(23) are later evaluated in the Case Study section via ablation tests, verifying their role in improving scenario realism and controllability.
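The three physics-informed regularizers in Equations (21)–(23) can be sketched as simple array operations. The function names, the quadratic penalty form, and the `(time, bus)` array layout are assumptions made for this illustration; the source states only what each term penalizes, not its exact functional form.

```python
import numpy as np


def bound_penalty(p, p_min, p_max, weight=1.0):
    """Soft penalty on RES output outside [p_min, p_max] (quadratic form assumed)."""
    p = np.asarray(p, dtype=float)
    below = np.maximum(p_min - p, 0.0)
    above = np.maximum(p - p_max, 0.0)
    return float(weight * np.sum(below ** 2 + above ** 2))


def spatial_penalty(p, corr, neighbors):
    """Penalize differences between neighboring buses, weighted by correlation.

    p: array of shape (T, N); corr: (N, N) weights; neighbors: (N, N) boolean mask
    playing the role of the indicator function.
    """
    p = np.asarray(p, dtype=float)
    diff = p[:, :, None] - p[:, None, :]           # shape (T, N, N)
    mask = np.asarray(corr, dtype=float) * np.asarray(neighbors, dtype=float)
    # Factor 0.5 because each unordered pair (n, m) appears twice in the sum.
    return float(0.5 * np.sum(mask * diff ** 2))


def temporal_penalty(p, weight=1.0):
    """Penalize ramps between consecutive time steps (quadratic form assumed)."""
    p = np.asarray(p, dtype=float)
    return float(weight * np.sum(np.diff(p, axis=0) ** 2))
```

Because all three terms are smooth almost everywhere and return zero for a trajectory that is within bounds, spatially uniform, and flat in time, they can be added to an adversarial loss without introducing hard constraints.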
Equation (24) describes a scenario selection mechanism based on stress maximization. From a pool of latent vectors, the algorithm identifies the one that leads to the highest stress score among the physically valid scenarios. This formulation serves as a filtering stage after generation, allowing the upper-level planner to focus on the most critical operational threats. Such targeted scenario selection improves decision robustness while reducing computation, as only stress-inducing edge cases are forwarded to optimization.
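A minimal sketch of this filter-then-argmax step is shown below. The function name and the three callables (`generator`, `stress_fn`, `is_feasible`) are hypothetical placeholders for the trained generator, the composite stress index, and the physical validity check, respectively.

```python
def select_worst_scenario(latents, generator, stress_fn, is_feasible):
    """Return (latent, scenario, stress) with the highest stress among feasible scenarios.

    Returns None if no generated scenario passes the feasibility filter.
    """
    best = None
    for z in latents:
        scenario = generator(z)
        if not is_feasible(scenario):
            continue  # discard physically invalid scenarios before scoring
        stress = stress_fn(scenario)
        if best is None or stress > best[2]:
            best = (z, scenario, stress)
    return best
```

Applying the feasibility filter before scoring avoids spending stress evaluations on scenarios that would be rejected anyway, which matters when each evaluation involves solving a dispatch problem.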
Equation (25) approximates the gradient of the stress index with respect to the generator parameters via finite differences. When stress feedback from the upper-level DRO planner is non-differentiable, this surrogate allows the generator to still receive usable gradients for parameter updates. The approximation is performed by perturbing the latent vector with a small increment and computing the resulting change in the stress index. This formulation makes bilevel GAN training tractable and compatible with non-smooth downstream evaluations.
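The perturbation step can be illustrated with a generic finite-difference routine. This sketch uses a central difference on a one-dimensional latent vector, which is a choice made here for numerical accuracy; the source does not say whether its Equation (25) uses a forward or central scheme, and the function name and `stress_of_latent` callable are hypothetical.

```python
import numpy as np


def finite_difference_gradient(stress_of_latent, z, delta=1e-4):
    """Central-difference estimate of d(stress)/dz for a 1-D latent vector z.

    stress_of_latent: callable mapping a latent vector to a scalar stress value,
    treated as a black box (it may wrap a generator plus a dispatch solve).
    """
    z = np.asarray(z, dtype=float)
    grad = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = delta
        grad[i] = (stress_of_latent(z + step) - stress_of_latent(z - step)) / (2.0 * delta)
    return grad
```

Each gradient estimate costs two stress evaluations per latent dimension, so this approach is practical mainly when the latent space is small or when evaluations are cheap.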
Equation (26) defines the complete generator loss as a weighted combination of adversarial learning, physical realism, spatial correlation, and temporal smoothness objectives. These components balance creative diversity against feasibility enforcement, allowing generation of operationally credible, high-risk scenarios. The first term reflects the core adversarial objective, while the three weighted regularization components shape the physical validity of generated outputs. Each weighting coefficient can be tuned to reflect the relative importance of engineering constraints versus adversarial exploration.
Equation (27) presents the parameter update rules for the generator and discriminator using a common learning rate. The generator updates its parameters based on the full composite loss from Equation (26), while the discriminator’s parameters are refined using both the classification objective of Equation (19) and the stress estimation objective of Equation (20). This co-training mechanism ensures that both adversarial fidelity and stress feedback are properly integrated into the GAN’s training dynamics.
Equation (28) specifies the convergence criterion for GAN-TEP co-training: the process stops once the maximum scenario stress stabilizes within a prescribed tolerance. In each iteration t, the worst-case stress score is computed over a batch of generated scenarios. If the difference between consecutive maximum stress values falls below the tolerance, training is considered to have converged. This criterion ensures that the adversarial generator is no longer discovering substantially worse scenarios that would alter the planning outcome.
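The stopping rule can be expressed in a few lines. The helper name `has_converged`, the default tolerance, and the example histories are assumptions for illustration only.

```python
from typing import Sequence


def has_converged(max_stress_history: Sequence[float], tol: float = 1e-3) -> bool:
    """True when the last two per-iteration maximum stress values differ by less than tol."""
    if len(max_stress_history) < 2:
        return False  # at least two iterations are needed for a comparison
    return abs(max_stress_history[-1] - max_stress_history[-2]) < tol


still_moving = has_converged([0.90, 0.94, 0.96])      # last change is 0.02
stabilized = has_converged([0.90, 0.94, 0.9405])      # last change is 0.0005
print(still_moving, stabilized)
```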
Equation (29), though symbolic, outlines the full iterative training loop, which alternates between adversarial scenario generation and risk-aware transmission optimization and thereby closes the feedback loop between the learning model and the optimizer. The procedure is a sequential interaction between the scenario generation module and the bi-level optimization model. It begins by initializing the generator and discriminator parameters of the GAN. At each iteration, a batch of renewable and load scenarios is synthesized from latent noise vectors, and the composite stress index is evaluated to identify the most critical scenario. This high-impact scenario is then passed to the transmission expansion planner, which solves the bi-level optimization problem under the given stress condition and produces updated investment and operational decisions. The stress outcome is subsequently used to compute the reward signals that guide the generator update via backpropagation, while the discriminator is refined both to distinguish real from generated inputs and to approximate stress scores. The loop continues until a convergence criterion is met, typically the stabilization of stress metrics or a marginal improvement in planning objectives. Stated this way, the procedure makes explicit how the generative model and the planner co-evolve through closed-loop feedback.
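The control flow of this loop can be sketched with injected components. Everything below is a simplified stand-in under stated assumptions: the four callables are hypothetical interfaces, the GAN update is a no-op, and the toy planner simply reinforces the most loaded line instead of solving a bi-level program.

```python
from typing import Callable, List, Optional, Tuple

Scenario = List[float]   # toy stand-in: per-line loading ratios
Plan = List[int]         # toy stand-in: indices of reinforced lines


def co_training_loop(
    generate_batch: Callable[[int], List[Scenario]],
    stress_index: Callable[[Scenario, Plan], float],
    solve_planner: Callable[[Scenario], Plan],
    update_models: Callable[[List[Scenario], List[float]], None],
    batch_size: int = 100,
    tol: float = 1e-3,
    max_iters: int = 50,
) -> Tuple[Plan, List[float]]:
    """Alternate scenario generation and planning until the max stress stabilizes."""
    plan: Plan = []
    prev_max: Optional[float] = None
    history: List[float] = []
    for _ in range(max_iters):
        batch = generate_batch(batch_size)
        scores = [stress_index(s, plan) for s in batch]   # stress under current plan
        worst = max(range(len(batch)), key=scores.__getitem__)
        plan = solve_planner(batch[worst])                # plan against worst case
        update_models(batch, scores)                      # generator/discriminator step
        history.append(scores[worst])
        if prev_max is not None and abs(scores[worst] - prev_max) < tol:
            break
        prev_max = scores[worst]
    return plan, history


# Toy components (all hypothetical).
fixed_batch = [[1.2, 0.8], [0.9, 1.5]]
built = set()


def toy_generate(n: int) -> List[Scenario]:
    return fixed_batch  # ignores n; a real generator would sample latent noise


def toy_stress(s: Scenario, plan: Plan) -> float:
    # Total overload above 1.0 on lines that have not been reinforced.
    return sum(max(0.0, v - 1.0) for i, v in enumerate(s) if i not in plan)


def toy_planner(s: Scenario) -> Plan:
    built.add(max(range(len(s)), key=s.__getitem__))  # reinforce most loaded line
    return sorted(built)


def toy_update(batch: List[Scenario], scores: List[float]) -> None:
    return None  # placeholder for the GAN parameter update


plan, history = co_training_loop(toy_generate, toy_stress, toy_planner, toy_update, batch_size=2)
print(plan, history)
```

In the toy run the maximum stress falls as lines are reinforced and the loop stops once two consecutive iterations report the same value, mirroring the stabilization criterion described above.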
Equation (30) defines the final transmission plan, which minimizes the investment cost plus the operational loss under the GAN-identified worst-case scenario. The investment term is defined for each line ℓ, and the operational cost is evaluated under the adversarial scenario drawn from the trained generator. This formulation captures the equilibrium point of the AI-enhanced planning paradigm, balancing resilience and economic cost in the face of strategic uncertainty.
To ensure the scalability and tractability of the proposed bi-level optimization under adversarial uncertainty, the framework was designed around a decomposition-based architecture that allows for parallel computation. In particular, each adversarial scenario generated by the PI-ST-GAN corresponds to an independent lower-level dispatch problem, which can be solved simultaneously across computing threads or distributed platforms. This parallelism significantly mitigates the computational burden, especially when dealing with large-scale scenario sets or high-dimensional network representations. Additionally, the reformulation of the bi-level problem using strong duality transforms the bilevel structure into a single-level mixed-integer program (MIP), which is compatible with commercial solvers that support branch-and-cut, decomposition, and acceleration strategies. The GAN training process itself is also modular, with separate spatial and temporal encoders that can be scaled independently to match the complexity of the target grid. These design choices collectively enable the proposed method to handle higher-dimensional planning problems without sacrificing convergence or numerical efficiency.
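Because each scenario yields an independent lower-level problem, the evaluations can be dispatched concurrently. The sketch below uses a thread pool from the Python standard library; `solve_dispatch` is a hypothetical stand-in that returns unserved load rather than calling an optimization solver, and a process pool or cluster scheduler may be more appropriate for CPU-bound solves.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Scenario = Tuple[float, float]  # toy stand-in: (demand in MW, available supply in MW)


def solve_dispatch(scenario: Scenario) -> float:
    """Hypothetical lower-level solve: unserved load in MW, floored at zero."""
    demand, supply = scenario
    return max(0.0, demand - supply)


def solve_all(scenarios: List[Scenario], max_workers: int = 4) -> List[float]:
    """Solve every scenario independently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(solve_dispatch, scenarios))


shed = solve_all([(100.0, 90.0), (80.0, 95.0), (120.0, 70.0)])
print(shed)
```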
4. Case Study
To evaluate the performance and resilience of the proposed adversarially-stressed bi-level transmission expansion framework, we conducted numerical experiments on a modified IEEE 118-bus system. The test system was chosen for its intermediate size and realistic topological complexity, providing a balanced platform for assessing both investment-level decisions and scenario-dependent operational response. The original benchmark includes 186 branches, 91 load buses, and 54 conventional generation units, which we augmented with high-penetration renewable sources. Specifically, 14 wind farms and 12 solar PV stations were added across selected buses to reflect modern decarbonization trajectories. The nameplate capacity of the newly integrated renewables totaled 2.7 GW and was distributed such that wind resources dominated the northern and western corridors (e.g., Buses 15, 26, and 42), while solar installations were concentrated in the high-irradiance zones of the central and southern regions (e.g., Buses 53, 77, and 96). For the time horizon, we considered a 24 h operational cycle discretized into hourly intervals, and, for each hour, a unique RES and demand realization was sampled from the PI-ST-GAN-generated scenario space. The demand profile reflects summer peak conditions with a total system load of 6.5 GW, fluctuating by up to 12% between the morning and late afternoon hours.
Renewable generation patterns and load time series were synthesized using a physics-informed spatiotemporal GAN trained on realistic weather-driven profiles. Historical wind speed and solar irradiance data were drawn from the National Renewable Energy Laboratory (NREL) Eastern Wind and Solar Integration datasets, downsampled to hourly resolution, and normalized for regional site characteristics. The GAN was trained on 1000 hourly sequences (each of length 24) from four meteorological years (2006–2009), and the training set was filtered to emphasize high ramping behavior and cross-site correlation. The generator produced 100 synthetic stress scenarios per iteration, from which the 10 highest-risk realizations (based on the stress index defined in Section 1) were selected for inner-loop optimization. Curtailment and load shedding costs were calibrated to USD 120/MWh and USD 6000/MWh, respectively, while conventional generation costs ranged from USD 25/MWh to USD 80/MWh depending on fuel type. The planning budget was capped at USD 300 million, allowing the planner to select approximately 10–14 candidate lines from a total of 35 feasible expansion options distributed across the IEEE 118-bus network. These candidates included both mid-voltage (138 kV) and high-voltage (230 kV) options, with estimated per-line costs ranging from USD 8.3 million to USD 42 million depending on the length and terrain-adjusted cost factors.
All simulations were performed using Python 3.11 with key packages, including PyTorch for neural network training, Gurobi 10.0 as the MIP solver backend, and NumPy/Pandas for data preprocessing and post-analysis. The GAN module was trained on an NVIDIA A100 GPU with 80 GB memory using the Adam optimizer, a learning rate of , and batch size of 64. Training convergence was achieved within 300 epochs, typically under 3 h. The bi-level optimization problem was solved on a 64-core AMD EPYC 7763 CPU node with 256 GB RAM using a decomposition-based strategy with callback-driven scenario injection. Each outer-loop iteration took approximately 18–22 min, and the entire co-optimization process converged in under 10 h of wall-clock time for a full training-evaluation cycle. Duality-based reformulations for the DRO components were validated against sample-based approximations to ensure numerical stability and tractability. All code and data pipelines were reproducible, with a total software footprint of under 2.1 GB.
To address the implementation and training aspects of the GAN module, we wish to emphasize that the generator was trained to produce spatiotemporal input scenarios that maximize system-level stress while maintaining realism. Unlike conventional GANs, our model receives feedback from the power flow solver, which computes a scalar stress score based on system constraint violations. This feedback is used to update both the generator and the discriminator through backpropagation. The discriminator is not limited to distinguishing real and fake inputs but also learns to approximate gradients of the system stress function, guiding the generator toward operationally critical regions of the scenario space. The training process is conducted iteratively with adaptive learning rates and gradient penalties to ensure convergence and diversity. The entire model is trained offline using a combination of historical demand and renewable profiles, with additional stochastic perturbations injected to encourage exploration. This hybrid design ensures that generated scenarios not only replicate historical patterns, but also reveal extreme cases relevant for robust planning.
To better illustrate the structure and functionality of the proposed PI-ST-GAN scenario generator, we provide a schematic diagram in Figure 3. Unlike traditional GANs, which focus solely on reproducing the statistical characteristics of the training data, the proposed framework integrates physics-informed constraints and system-level stress evaluation into the adversarial learning process. The generator receives latent noise vectors and produces spatiotemporal scenario representations, while the discriminator evaluates both statistical realism and physical feasibility using stress metrics provided by the transmission planner. This bidirectional loop enables continuous refinement of the generated scenarios, maximizing their impact on system robustness evaluation. The visualized architecture facilitates an intuitive understanding of how adversarial training is used not just for realism, but also to guide scenario generation toward operationally critical configurations.
Figure 4 presents a heatmap of normalized wind availability across 14 selected buses within the IEEE 118-bus test system, spanning a 24 h daily cycle. The availability values ranged from 0 to 1 and were drawn from adversarially generated scenarios calibrated using GAN training on realistic meteorological data. Each row corresponds to a unique wind-integrated bus, and each column represents one hour of the operational day. The color intensity moved from light pink to deep red as wind availability increased, with values close to 1 indicating near-maximum generation potential. Several wind sites exhibited strong diurnal variation. For instance, Bus 3 and Bus 9 showed a steady buildup from low availability (below 0.3) at Hour 3 to peak conditions (above 0.85) at Hours 9–13. In contrast, Bus 12 maintained nearly flat availability between 0.45 and 0.55 throughout the entire day, indicating a more stable but less dynamic wind regime. This spatial diversity reflects how geographically distributed wind patterns influence flexibility requirements for balancing operations.
Figure 5 displays the normalized solar availability for 12 solar PV buses across a single day, from Hour 0 to Hour 24. The solar generation follows a sinusoidal base pattern modulated with noise to simulate cloud cover and diurnal irradiance uncertainty. The buses generally show peak availability between Hours 11 and 14, with values exceeding 0.85 at up to nine locations. In contrast, from Hour 0 to Hour 6, and after Hour 18, availability falls below 0.1 across all buses, reflecting realistic night-time conditions. Bus 6 shows an early ramp-up (availability over 0.3 by Hour 7), whereas Bus 9 reaches full irradiance slightly later, peaking around Hour 13. A pronounced midday plateau is observed from Hour 10 to Hour 15, where 80 percent of buses maintain availability above 0.7. This flat peak implies a significant solar contribution during those hours, which may necessitate curtailment strategies if demand is not well aligned. Additionally, short-term fluctuations in availability are evident at Bus 4 and Bus 10, where deviations of over 0.2 occur within a 2 h window, likely simulating transient cloud cover. These fast swings reinforce the need for short-term reserves and suggest that finer temporal granularity in reserve scheduling may be beneficial.
Figure 6 illustrates how the total renewable curtailment evolves as the planning model undergoes iterative retraining using new adversarial scenarios. The x-axis represents ten sequential optimization iterations, while the y-axis shows the total energy curtailed (in MWh) under each iteration's corresponding plan. Initial curtailment was high, at approximately 980 MWh in Iteration 1, but declined steadily to under 420 MWh by Iteration 10. This decreasing trend is further smoothed with a three-iteration moving average, shown as a dashed dark green line, while a transparent band of ±50 MWh reflects the variability across scenarios. The most substantial reduction occurred between Iterations 2 and 4, where curtailment dropped from 880 MWh to 640 MWh, a 27 percent improvement. The pace of improvement slowed in later iterations, suggesting that the optimization was approaching convergence. Between Iterations 7 and 10, the reduction was more incremental, from 510 MWh to 418 MWh. This diminishing return was expected as the GAN focused on increasingly rare or subtle stress configurations. The result shows that adversarial co-training leads to meaningful curtailment minimization and supports the idea that the planner learns over time how to fortify the system against a richer uncertainty space.
Figure 7 compares hourly load shedding between two planning paradigms over a 24 h operational window: a baseline stochastic plan (gray dashed line) and the adversarially-stressed GAN-enhanced plan (green line). In the baseline plan, load shedding generally ranged between 30 and 90 MW, with pronounced spikes at Hours 8, 14, and 19 and a peak of 98 MW at Hour 14. In contrast, the GAN-enhanced plan kept load shedding mostly below 30 MW, with only one minor spike reaching 48 MW at Hour 18. The most significant gain occurred at Hour 14, where the GAN plan reduced shedding from 98 MW to just 21 MW, a 78.6 percent improvement. The adversarial plan was clearly more effective during mid-day hours, when both demand and renewable fluctuations are high. For example, between Hours 11 and 16, the average shedding under the baseline was 74 MW, while under the GAN plan it was only 24 MW. This reduction of roughly 67.5 percent during critical load and congestion periods translates directly to improved system reliability and better load-serving capability. During night hours (e.g., Hours 0–5), both plans performed similarly, with low curtailment and minimal load shedding due to reduced demand and more stable conditions.
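The percentage improvements quoted for Figures 6 and 7 follow from a simple relative-reduction formula, which can be checked directly from the rounded values reported in the text. The helper name `pct_reduction` is an assumption introduced for this check.

```python
def pct_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


hour_14 = pct_reduction(98.0, 21.0)      # load shedding at Hour 14 (MW)
midday = pct_reduction(74.0, 24.0)       # average shedding, Hours 11 to 16 (MW)
iter_2_to_4 = pct_reduction(880.0, 640.0)  # curtailment, Iterations 2 to 4 (MWh)

print(round(hour_14, 1), round(midday, 1), round(iter_2_to_4, 1))
```

With the rounded averages of 74 MW and 24 MW the midday figure evaluates to about 67.6 percent, so the slightly lower value quoted in the text presumably reflects unrounded underlying averages.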
Figure 8 plots the tradeoff between the total investment cost and the composite system-wide stress index, evaluated across ten planning configurations with increasing budget allocations. The x-axis shows the investment levels ranging from USD 100 million to USD 300 million, while the y-axis shows a normalized stress index between 0.6 and 0.95. The plotted curve exhibits a clear convex shape, indicating diminishing returns: the initial investments reduced stress substantially, but additional spending beyond USD 240 million yielded only marginal benefits. The optimal point, marked in dark green, lies near USD 180 million with a corresponding stress index of approximately 0.69, balancing financial cost and operational resilience. The shape of the frontier confirms a fundamental planning principle: early-stage investments target critical vulnerabilities and yield large reductions in operational risk, while later investments tend to overbuild capacity that sees limited utilization. For example, moving from USD 120 million to USD 160 million reduced the stress index from 0.89 to 0.73 (a 16-point improvement), whereas moving from USD 240 million to USD 300 million only reduced it from 0.65 to 0.63. Such observations validate the marginal value of stress-aware investment and support the use of optimization to find budget-efficient grid designs.
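The diminishing-returns argument can be made concrete by computing the stress reduction per additional USD million for the two budget intervals quoted above. The function name is an assumption; the inputs are the values reported in the text.

```python
def marginal_stress_reduction(
    cost_low: float, cost_high: float, stress_at_low: float, stress_at_high: float
) -> float:
    """Stress-index reduction per additional USD million invested."""
    if cost_high <= cost_low:
        raise ValueError("cost_high must exceed cost_low")
    return (stress_at_low - stress_at_high) / (cost_high - cost_low)


early = marginal_stress_reduction(120.0, 160.0, 0.89, 0.73)  # per USD million
late = marginal_stress_reduction(240.0, 300.0, 0.65, 0.63)   # per USD million
ratio = early / late

print(early, late, ratio)
```

On these figures, an extra USD million in the early interval buys roughly twelve times the stress reduction obtained in the late interval.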
Figure 9 displays two key metrics across the 15 most influential GAN-generated scenarios: their stress contribution (green bars, left axis) and their density (gray dashed line, right axis), where density is interpreted as the scenario frequency within the learned distribution. Scenario ID 1 exhibited the highest stress index at approximately 0.97, yet its GAN density was only 0.12, indicating that it is a rare but extremely damaging configuration. Conversely, Scenario ID 10 appeared frequently (density 0.56) but had a relatively mild stress impact (index 0.73). The plot reveals that certain rare scenarios tended to exhibit high stress impacts, though the overall relationship between rarity and stress was complex and not strictly monotonic. This dynamic reflects one of the main values of adversarial learning: it surfaces low-probability, high-impact operational conditions that traditional sampling may never uncover. For instance, the model identified at least three scenarios (IDs 1, 4, and 6) that appeared with a density below 0.2 but each induced a stress index above 0.9. These are likely synthetic representations of spatially correlated renewable droughts or load-generation imbalances. Without GAN-based stress probing, these cases would likely be omitted from deterministic or historical data-based planning exercises.
Figure 10 presents a comparative bar chart showing the total dispatched and curtailed renewable energy at 15 buses hosting wind or solar generation assets in the test system. Each bus is labeled RES1 through RES15 along the x-axis. For each, two bars are shown: the dark green bar indicates the amount of RES energy successfully injected into the system (dispatched), and the light green bar represents curtailed energy, which was generated but not used. Dispatch values ranged from approximately 110 MWh (RES7) to over 470 MWh (RES3), while curtailment varied from under 60 MWh (RES9) to over 290 MWh (RES14). This side-by-side comparison enables planners to assess both the effectiveness and the inefficiencies of RES utilization at a nodal level. The figure highlights several key imbalances in renewable performance. Buses RES3 and RES6 showed strong performance, each dispatching over 400 MWh and curtailing less than 100 MWh, which suggests favorable grid positioning or strong downstream flexibility. In contrast, RES14 dispatched only about 230 MWh while curtailing nearly 300 MWh, resulting in a curtailment share (curtailed energy over total available energy) above 56%. Similarly, RES12 and RES15 exhibited disproportionately high curtailment relative to dispatch, with curtailment-to-dispatch ratios exceeding 0.8. These statistics suggest that, under adversarial stress, certain RES sites suffer from chronic deliverability issues, possibly due to congestion, local ramping constraints, or network topology bottlenecks.
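Two slightly different ratios appear in this discussion, and the distinction is easy to check numerically. The sketch below uses the approximate RES14 values quoted above (230 MWh dispatched, about 300 MWh curtailed); the function names are assumptions.

```python
def curtailment_share(dispatched_mwh: float, curtailed_mwh: float) -> float:
    """Curtailed energy as a fraction of total available (dispatched + curtailed)."""
    total = dispatched_mwh + curtailed_mwh
    if total <= 0:
        raise ValueError("total available energy must be positive")
    return curtailed_mwh / total


def curtailment_to_dispatch(dispatched_mwh: float, curtailed_mwh: float) -> float:
    """Curtailed energy divided by dispatched energy."""
    if dispatched_mwh <= 0:
        raise ValueError("dispatched energy must be positive")
    return curtailed_mwh / dispatched_mwh


res14_share = curtailment_share(230.0, 300.0)
res14_ratio = curtailment_to_dispatch(230.0, 300.0)
print(round(res14_share, 3), round(res14_ratio, 3))
```

A curtailment-to-dispatch ratio of 0.8 corresponds to a curtailment share of about 44%, so the two measures should not be compared directly across sites.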
Figure 11 displays the distribution of the line flow utilization across the transmission network under adversarial stress testing. The x-axis represents the normalized line utilization ratio (actual flow divided by thermal capacity), ranging from 0 to 1.5. The y-axis shows the frequency of occurrence across all monitored lines and scenarios. A vertical dashed red line at utilization = 1.0 marks the congestion threshold, above which physical violations or binding constraints are expected. The bulk of the distribution is centered between 0.6 and 1.0, indicating that most lines operate near their rated capacity under GAN-generated worst-case conditions. Approximately 28 percent of the observed utilizations exceeded 1.0, confirming that, under stress scenarios, congestion was not isolated but systemic. The right tail of the histogram extends beyond 1.3, where about 7 percent of samples exhibited dangerously overloaded conditions. This implies that certain adversarial scenarios are capable of pushing multiple lines into critical operating regimes, which would likely trigger either emergency dispatch actions or post-contingency violations. At the lower end of the spectrum, fewer than 10 percent of lines operated below 0.4 utilization, suggesting that the adversarial scenario generator avoids overly conservative or non-binding realizations and focuses instead on operationally challenging cases.
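Exceedance shares such as those quoted for the 1.0 and 1.3 thresholds can be computed from a list of utilization samples as sketched below. The sample values are purely illustrative and are not data from the study; the function name is an assumption.

```python
from typing import Sequence


def share_above(values: Sequence[float], threshold: float) -> float:
    """Fraction of samples strictly above the threshold."""
    if not values:
        raise ValueError("values must be non-empty")
    return sum(v > threshold for v in values) / len(values)


# Illustrative utilization ratios (flow / thermal capacity); not data from the study.
utilization = [0.35, 0.62, 0.78, 0.85, 0.91, 0.97, 1.04, 1.12, 1.21, 1.36]

congested = share_above(utilization, 1.0)    # samples beyond the congestion threshold
overloaded = share_above(utilization, 1.3)   # samples in the severely overloaded tail
print(congested, overloaded)
```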
To improve the interpretability of the planning results, we provide a detailed tabular summary of the transmission lines selected by the GAN-enhanced planning model. These lines were chosen not only for their cost efficiency, but also for their role in relieving critical congestion under adversarial scenarios. Table 1 lists the selected lines along with key parameters, including voltage level, estimated investment cost, and congestion reduction score. Compared with baseline and stochastic methods, the GAN model favored reinforcements in peripheral areas that align with spatially correlated renewable stress, indicating a shift toward risk-resilient planning logic.
Table 1 provides a summary of the transmission lines selected for construction under the proposed GAN-enhanced planning framework. These candidate lines represent the investment decisions derived from the bi-level optimization process in conjunction with the scenario generation by the PI-ST-GAN model. Each entry includes the line identifier, the corresponding sending and receiving buses, the rated voltage level (kV), and the associated capital investment (in million USD). The results illustrate that the planning model prioritizes high-voltage lines (e.g., 230 kV) in several key corridors, such as Lines L101 and L114, which enhance bulk power transfer capability across critical regions. Meanwhile, mid-voltage additions (e.g., 138 kV), such as L107 and L118, serve to reinforce local reliability and redundancy. The diversity of voltage levels and the geographical dispersion of the selected links reflect a balance between system-wide efficiency and localized constraints. Furthermore, the range of investment costs, from USD 9.5 million to USD 40.1 million, indicates that the proposed solution can identify cost-effective yet system-critical infrastructure reinforcements at different scales. Overall, Table 1 substantiates the practical impact of the proposed GAN-driven approach by highlighting its capacity to inform realistic and cost-conscious transmission expansion planning.
To quantitatively evaluate the impact of physics-informed loss terms in the scenario generation process, an ablation study was conducted and is summarized in Table 2. The table compares three model variants: the baseline GAN, which includes no physical constraints; the partial PI-GAN, which only incorporates the constraint in Equation (21); and the full PI-ST-GAN, which integrates all physics-informed terms from Equation (21) to Equation (23). Key metrics reported include scenario feasibility (i.e., the percentage of generated scenarios satisfying system constraints), the mean load prediction error, and the Multi-Contingency Constraint (MCC) stress index. As shown in Table 2, the inclusion of Equation (21) significantly improved scenario feasibility from 74.2% to 86.7% and reduced the load error by nearly half. Further incorporation of Equations (22) and (23) resulted in a further increase in feasibility to 98.4% and a load error reduction to just 0.021. Additionally, the MCC stress index improved accordingly, demonstrating enhanced system robustness. These results confirm the empirical effectiveness of the proposed physics-informed constraints, validating their essential role in producing reliable, physically consistent, and operationally feasible scenarios. This ablation analysis further justifies the design of the PI-ST-GAN architecture and highlights the practical importance of embedding domain knowledge into deep generative models.
Table 3 presents a comparative analysis of three mainstream generative models—VAE, diffusion models, and GANs—in the context of power system scenario generation. The comparison focuses on three dimensions: output realism, training stability, and domain suitability. VAEs offer high training stability and computational efficiency but often generate blurry and less expressive scenarios due to their inherent variational approximation. Diffusion models produce highly realistic samples by iteratively refining noise, yet suffer from intensive computational costs and longer convergence time, making them less practical for large-scale energy systems. In contrast, the GAN framework adopted in this study achieves a balance between sample realism and training efficiency. When augmented with physics-informed loss constraints, the GAN demonstrates improved training stability and superior adaptability to domain-specific requirements, such as feasibility and stress structure preservation. As such, the GAN-based approach is better aligned with the operational characteristics and reliability demands of power grid planning under uncertainty. This evaluation supports the model selection and further emphasizes the methodological advantages of GANs over alternative generative architectures in this application domain.
Table 4 summarizes the diverse renewable energy penetration scenarios used in the IEEE 118-bus system experiments. Each scenario was designed to reflect a distinct operational condition, ranging from baseline configurations with limited wind and solar integration to high-stress conditions under extreme renewable supply. By systematically varying the proportions of wind and solar generation, the model evaluates network adaptability to different stress regimes and captures the operational risks associated with varying spatiotemporal RES variability. This structured scenario diversity ensures that the proposed method is not limited to a single renewable configuration but is stress-tested across multiple realistic regimes, providing more comprehensive validation of the planning framework. The inclusion of mixed extreme conditions, with simultaneous high wind and solar penetration, allows the model to reveal worst-case congestion and load loss patterns. The “Expected Stress Level” column offers qualitative interpretation of how each renewable mix impacts the operational challenge, thereby bridging the gap between scenario design and planning stress response. Overall, this multi-scenario setup significantly strengthens the empirical foundation of the case study, enhancing the generalizability and robustness of the results derived from the IEEE 118-bus test system.
Although the IEEE 118-bus system was used as the testbed in this study due to its wide adoption and balanced complexity, the architecture of the proposed framework is inherently extensible to larger and more realistic grid models. The modularity of the PI-ST-GAN enables its components—such as spatial convolution, temporal self-attention, and physics-informed constraints—to scale gracefully with increased node counts or additional temporal granularity. Likewise, the optimization loop leverages parallel scenario evaluation and stress feedback integration, ensuring that the planning model remains computationally viable under expanded system configurations. In future work, we plan to apply the framework to real-world grids, such as the Polish 2383-bus system or ERCOT’s zonal transmission network, which will allow further validation of scalability and robustness. Such extensions will help assess the model’s adaptability to various topologies, operational constraints, and renewable penetration profiles, while maintaining the fidelity and stress-testing capability established in the current implementation.
5. Conclusions
This paper introduced a novel transmission expansion planning (TEP) framework that couples bi-level risk–cost optimization with adversarial scenario generation based on physics-informed spatiotemporal generative adversarial networks (PI-ST-GAN). Unlike conventional stochastic or robust approaches that rely on pre-defined or uniformly sampled uncertainty sets, the proposed method actively synthesizes extreme but physically plausible scenarios that expose structural vulnerabilities in network topology and operational response. These adversarial scenarios are generated using a dual-objective GAN trained not only to mimic realistic spatiotemporal RES patterns, but also to maximize a composite operational stress index, incorporating the expected energy not served, loss-of-load probability, and marginal congestion costs.
The bi-level formulation—comprising an upper-level investment model and a lower-level scenario-dependent operational dispatch problem—was reformulated using strong duality, and it was implemented within a decomposition-based MIP framework. Numerical results on a modified IEEE 118-bus system with high wind and solar penetration validate the model’s ability to identify resilient and cost-efficient network configurations. Compared to deterministic and stochastic baselines, the proposed method reduces renewable curtailment by up to 48.7% and load shedding by 62.4% under worst-case scenarios. Moreover, spatial congestion maps, stress-density distributions, and investment–risk frontiers derived from the GAN-driven scenario space provided planners with interpretable and actionable decision insights that would not emerge from conventional models. In addition to its methodological innovation, this work demonstrates the practical value of integrating generative learning with risk-aware optimization in critical infrastructure planning. The ability of the planner to identify worst-case vulnerabilities through AI-generated disturbances improves both operational reliability and economic robustness. This dynamic feedback loop between scenario generation and expansion planning reflects a deeper integration of learning-based models with domain-specific constraints and engineering logic.
Several avenues for future research remain. One promising direction is to incorporate longer-term temporal coupling, enabling the framework to model multi-day or seasonal stress propagation. This can be achieved by embedding recurrent or attention-based architectures into the GAN generator, allowing for the synthesis of temporally consistent stress trajectories. Moreover, incorporating calendar-based features or weather regime indicators can improve the realism of generated scenarios, supporting planning tasks, such as energy storage scheduling, outage coordination, and long-term adequacy assessments. Lastly, although the proposed model was developed under the conventional AC power flow formulation, its modular architecture allows for adaptation to hybrid AC–DC transmission networks or meshed grids with flexible resources. Future extensions may also explore market co-optimization, reliability pricing mechanisms, and integration with distribution-level planning layers. As energy systems face escalating uncertainty, adversarially informed, learning-augmented planning tools, such as PI-ST-GAN, offer a resilient foundation for next-generation transmission system design.