5.6. Performance Comparison on Full Topology
Table 2, Table 3, and Table 4 comprehensively summarize model performance on the full-topology datasets (14-, 30-, and 118-bus systems) before and after AC power-flow post-processing. Overall, LG-HGNN (OurF) consistently matches or outperforms all baselines across almost all variables and grid sizes, while DC-IPOPT frequently exhibits poor AC feasibility once its DC solution is embedded into the full AC network.
Prediction Accuracy Analysis (Table 2). For bus-level variables, LG-HGNN achieves the lowest or near-lowest MSE for both voltage angles and magnitudes across all IEEE systems. On the 14-bus grid, the pre-PF voltage-angle MSE decreases markedly from DC-IPOPT to GAT/GIN, and further with LG-HGNN, a pattern consistent for the 30- and 118-bus systems. For voltage magnitudes, all neural models significantly outperform DC-IPOPT, with LG-HGNN and GAT typically yielding the smallest errors, demonstrating the stabilizing effect of physics-informed regularization.
For generator variables, LG-HGNN consistently attains the best or tied-best MSE for both active and reactive power outputs. While GAT and GIN already reduce errors by two orders of magnitude over DC-IPOPT, LG-HGNN improves them further, and its explicit modeling of generator nodes enhances the coupling between reactive power and voltage regulation.
For line and transformer flows, DC-IPOPT exhibits large post-PF errors, reflecting the DC–AC mismatch. GAT and GIN reduce these errors substantially, while LG-HGNN attains the lowest values overall, particularly excelling on transformer flows. This demonstrates that heterogeneous representations with global attention enable LG-HGNN to capture long-range power-flow dependencies beyond the capacity of DC-IPOPT or conventional GNNs.
Power-flow post-processing (shaded rows) affects each method differently: LG-HGNN’s MSE changes only slightly, showing its predictions are already close to AC-feasible solutions, whereas DC-IPOPT exhibits large deviations in flow-related variables, confirming that its computational simplicity comes at the cost of AC accuracy.
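For reference, the per-variable MSE reported in Table 2 can be reproduced with a few lines of NumPy. This is a minimal sketch; the array names and shapes are illustrative, not the authors' code:

```python
import numpy as np

def per_variable_mse(pred, true) -> float:
    """Mean squared error over all test samples and buses (or branches)."""
    return float(np.mean((np.asarray(pred) - np.asarray(true)) ** 2))

# Illustrative usage: predicted vs. ground-truth voltage angles (rad)
# for a small batch of samples on a 14-bus system.
rng = np.random.default_rng(0)
theta_true = rng.normal(0.0, 0.1, size=(8, 14))
theta_pred = theta_true + rng.normal(0.0, 0.01, size=(8, 14))
mse = per_variable_mse(theta_pred, theta_true)
```

Computing the same quantity on the PF-corrected predictions yields the shaded-row entries.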
Constraint Violation Degree Analysis (Table 3). The table reports average violations for key constraints, including branch thermal limits, angle-difference limits, and active/reactive power balance. Voltage and generator bounds are excluded since they are directly enforced by constrained decoding.
Before power-flow post-processing (non-shaded rows), all neural models show minimal violations, with LG-HGNN matching or surpassing the best baseline. For example, on the 118-bus system, LG-HGNN's pre-PF thermal-limit violation is negligible, and its active- and reactive-power imbalances remain within a narrow per-unit band, indicating that the physics-informed loss promotes Kirchhoff-consistent solutions even without PF correction.
After power flow (shaded rows), the differences from DC-IPOPT become pronounced. Because DC-IPOPT ignores reactive power and voltage constraints, its post-PF thermal violations are large, whereas LG-HGNN and the other GNNs remain orders of magnitude lower. Power-balance errors are reduced to numerical zero for all methods after PF. Overall, LG-HGNN produces solutions inherently closer to AC-feasible operating points than DC-IPOPT, demonstrating superior physical consistency.
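A plausible formalization of these violation degrees (our reading of the metric; the paper's exact definitions may differ in normalization) averages the clipped constraint residuals:

```python
import numpy as np

def violation_degrees(s_flow, s_max, dtheta, dtheta_max, dp, dq):
    """Average constraint-violation magnitudes, in the spirit of Table 3:
    branch thermal limits, angle-difference limits, and the active/reactive
    power-balance (Kirchhoff) residuals at each bus."""
    thermal = np.mean(np.maximum(0.0, np.abs(s_flow) - s_max))
    angle = np.mean(np.maximum(0.0, np.abs(dtheta) - dtheta_max))
    return thermal, angle, np.mean(np.abs(dp)), np.mean(np.abs(dq))

# Example: two branches, one 5% over its thermal limit; balances near zero.
t, a, p, q = violation_degrees(
    s_flow=np.array([1.05, 0.70]), s_max=np.array([1.0, 1.0]),
    dtheta=np.array([0.1, 0.2]), dtheta_max=0.5,
    dp=np.array([1e-4, -2e-4]), dq=np.array([5e-5, 0.0]),
)
```

Voltage and generator bounds need no such term because the constrained decoder enforces them by construction.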
Optimality Ratio Analysis (Table 4). The results reveal that the proposed LG-HGNN achieves the most economically consistent performance among all solvers. Before power-flow correction, DC-IPOPT yields markedly lower optimality ratios, approximately 96% on the 14- and 30-bus systems and lower on the 118-bus grid, indicating that the DC approximation systematically underestimates the true AC generation cost. After AC power-flow recalculation, its ratios improve slightly but remain well below 100%, confirming that DC-based dispatches are economically sub-optimal when evaluated in the nonlinear AC domain.
In contrast, all neural solvers achieve ratios very close to the AC-OPF optimum. GAT and GIN exhibit slightly super-unit ratios (around 101% or slightly above) across all networks, while LG-HGNN consistently produces the smallest deviation from unity. Specifically, OurF attains near-unity ratios on the IEEE-14, 30-, and 118-bus cases and stabilizes even closer to unity after PF. These near-unity ratios indicate that LG-HGNN accurately reconstructs generation dispatches that are economically very close to the ground-truth AC-OPF solutions, with minimal numerical bias.
As network size increases, the gap between DC-IPOPT and the neural methods widens, highlighting the scalability and robustness of the proposed architecture. Overall, the full-topology experiments confirm that LG-HGNN effectively balances physical feasibility and economic optimality, correcting the systematic cost bias of DC approximations and surpassing GAT and GIN in both accuracy and consistency across all grid scales.
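The optimality ratio can be computed as the generation cost of the predicted dispatch divided by that of the ground-truth AC-OPF dispatch. The sketch below assumes the usual quadratic cost model; the coefficient names a, b, c are illustrative:

```python
import numpy as np

def generation_cost(p_g, a, b, c):
    """Standard quadratic AC-OPF objective: sum_i a_i p_i^2 + b_i p_i + c_i."""
    p_g, a, b, c = map(np.asarray, (p_g, a, b, c))
    return float(np.sum(a * p_g**2 + b * p_g + c))

def optimality_ratio(p_pred, p_opt, a, b, c):
    """Cost of the predicted dispatch relative to the AC-OPF optimum;
    a value of 1.0 means economically identical to the ground truth."""
    return generation_cost(p_pred, a, b, c) / generation_cost(p_opt, a, b, c)
```

A ratio slightly above 1 indicates a marginally more expensive (though possibly still feasible) dispatch, while a ratio below 1 typically signals an infeasible or cost-underestimating solution, as with DC-IPOPT.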
Scalability to Large-Scale Grids
To evaluate the scalability of the proposed LG-HGNN architecture on realistic large-scale power systems, we further conduct experiments on the GOC-2000 dataset, which contains 2000 buses and exhibits significantly higher topological heterogeneity than the IEEE benchmarks.
Table 5 reports the MSE values before and after PF correction for bus voltages, generator outputs, and line flows.
Across all state variables, LG-HGNN (OurF) consistently achieves the lowest MSE among the learning-based baselines. On this 2000-bus system, the model maintains high accuracy, with only moderate growth in bus-angle MSE relative to the 118-bus case. After PF correction, all models benefit from enforcing the nonlinear AC power-flow equations, and the errors of LG-HGNN are further reduced by approximately 35–40% across most variables. This indicates that the predictions produced by LG-HGNN are already close to the AC-feasible manifold and require only small corrective adjustments.
Overall, these results demonstrate that LG-HGNN generalizes effectively to large-scale transmission networks with thousands of buses: it preserves its accuracy advantage over GAT and GIN on GOC-2000, while remaining fully compatible with standard PF-based post-processing. This validates the proposed model as a scalable and physically consistent surrogate for real-time AC-OPF applications on realistic, highly heterogeneous grids.
5.7. Performance Comparison on N-1 Topology
The N-1 contingency experiments evaluate the ability of different methods to generalize across large families of perturbed topologies in which either a branch or a generator is randomly removed from service. In this setting, we distinguish between two configurations of our model: (1) OurF, trained only on full-topology data and evaluated zero-shot on N-1 contingencies; and (2) OurN, fine-tuned directly on the N-1 training subset. Table 6, Table 7, and Table 8 report prediction accuracy, feasibility, and economic optimality across the IEEE-14, -30, and -118 systems.
Prediction Accuracy Analysis (Table 6). Compared with the full-topology case, N-1 contingencies cause notable distribution shifts, as line and generator outages alter network connectivity and dispatch patterns. Under this setting, DC-IPOPT exhibits large post-PF MSEs, especially for reactive power and branch flows, due to its inherent DC–AC mismatch, while GAT and GIN remain competitive but suffer higher errors on larger grids.
Both OurF (zero-shot) and OurN (fine-tuned) demonstrate strong topology robustness. Even without N-1 training, OurF consistently outperforms GAT and GIN across nearly all variables and systems; for example, on the 30-bus grid, OurF substantially reduces the pre-PF MSE relative to GAT/GIN. This confirms that heterogeneous modeling and effective-resistance encoding enable strong inductive generalization to unseen topologies.
Fine-tuning further refines results: OurN achieves slightly lower MSEs than OurF (typically a few percent improvement) while preserving similar scaling across variables and systems. These modest yet consistent gains show that limited N-1 data is sufficient to adapt the pre-trained model to specific contingency patterns.
As in the full-topology case, post-PF processing affects the methods differently. DC-IPOPT's N-1 predictions lead to large post-PF errors, whereas OurF and OurN exhibit only minor MSE changes, indicating that both already produce solutions close to the AC-feasible manifold even under topological perturbations.
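The effective-resistance encoding invoked here is a standard spectral quantity computable from the graph Laplacian pseudoinverse. A minimal sketch (the function name and example graph are illustrative, not the paper's implementation):

```python
import numpy as np

def effective_resistance(edges, n, weights=None):
    """Pairwise effective resistance from the graph Laplacian pseudoinverse:
    R_eff(i, j) = L+_ii + L+_jj - 2 * L+_ij."""
    w = np.ones(len(edges)) if weights is None else np.asarray(weights, float)
    L = np.zeros((n, n))
    for (i, j), wij in zip(edges, w):
        L[i, i] += wij; L[j, j] += wij
        L[i, j] -= wij; L[j, i] -= wij
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - 2.0 * Lp

# An N-1 branch outage can only increase effective resistances, so the
# encoding shifts smoothly and predictably under topology perturbations.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
R_full = effective_resistance(edges, 4)
R_n1 = effective_resistance([e for e in edges if e != (2, 0)], 4)
```

This monotone behavior under edge removal is one plausible reason the encoding supports zero-shot generalization to N-1 topologies.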
Constraint Violation Analysis (Table 7). The feasibility results on the N-1 datasets confirm and sharpen the trends observed in the full-topology experiments. Before power-flow correction (non-shaded rows), all learning-based models show small yet finite violations of the thermal-limit and power-balance constraints. Across all grids, the average active- and reactive-power imbalance magnitudes remain within a narrow per-unit band, indicating that Kirchhoff's laws are closely, though not perfectly, satisfied by purely learned predictions. Among the neural models, OurF consistently matches or slightly outperforms GAT and GIN, while the fine-tuned variant OurN yields almost identical values. This pattern demonstrates that the zero-shot model already learns a representation near the N-1-aware feasible region, with minimal need for adaptation.
After AC power-flow recalculation (shaded rows), all neural solvers drive the active- and reactive-power residuals to numerical zero, leaving the branch thermal-limit constraints as the only non-trivial source of error. Here, the advantage of LG-HGNN becomes particularly clear: on the 30- and 118-bus systems, DC-IPOPT exhibits large post-PF thermal violations, whereas LG-HGNN (both OurF and OurN) maintains violations several orders of magnitude smaller. This improvement in AC-feasibility fidelity shows that DC-OPF formulations can severely misestimate branch loading under contingencies, while the transformer-based heterogeneous model preserves physical consistency.
Notably, the nearly identical post-PF violation levels of OurF and OurN indicate that fine-tuning mainly refines numerical precision and economic optimality rather than altering feasibility behavior. In other words, LG-HGNN’s built-in inductive biases—heterogeneous node/edge typing, combined local-global propagation, and physics-aware regularization—are sufficient to produce AC-feasible operating points even in zero-shot generalization across unseen N-1 contingencies.
Optimality Ratio Analysis (Table 8). The results further emphasize the superior economic consistency of LG-HGNN under contingency conditions. Before power-flow correction, DC-IPOPT attains optimality ratios starting only around 94% on the smaller IEEE-14 and IEEE-30 systems, with a comparable shortfall on the 118-bus grid, confirming that the DC approximation substantially underestimates the true AC-OPF cost. Even after PF correction, its ratios remain below 100%, revealing persistent inefficiencies once DC dispatches are re-evaluated within the nonlinear AC model.
In contrast, the learning-based methods exhibit nearly perfect or slightly super-optimal behavior across all test cases. GAT and GIN reach ratios around 101% or slightly above, while LG-HGNN consistently achieves ratios closest to the ideal value of 1. Specifically, OurF records near-unity ratios across the 14-, 30-, and 118-bus systems and stabilizes close to unity after PF. Fine-tuning on contingency data (OurN) further aligns predictions with the AC-OPF ground truth, reducing the residual deviations and yielding the most consistent ratios across all grid sizes.
These results confirm that LG-HGNN generalizes effectively to unseen N-1 topologies while preserving cost-optimal behavior. Unlike DC-IPOPT, whose N-1 solutions remain both economically sub-optimal and physically less feasible, LG-HGNN maintains near-unity optimality ratios together with minimal constraint violations. In summary, the model functions as a topology-robust, economically consistent surrogate: even in zero-shot settings it provides AC-feasible and nearly optimal dispatches, and with minor fine-tuning, it becomes virtually indistinguishable from the full AC-OPF solver while retaining its large computational advantage.
5.8. Ablation Study
Table 9 summarizes the ablation results on the IEEE-14, IEEE-30, and IEEE-118 systems. Across all benchmarks, the full LG-HGNN consistently achieves the lowest MSE in both voltage angle and magnitude, demonstrating the necessity of combining effective-resistance priors, constrained decoding, and bus-centric global attention.
Removing the effective-resistance positional encoding (w/o ER-PE) results in a clear loss of accuracy on all three systems. The degradation is most pronounced on IEEE-118, where the MSE increases markedly, confirming that the electrical-distance bias improves long-range coupling modeling and enhances robustness on large networks.
Eliminating the constrained decoder (w/o Constrained Decoder) also leads to a substantial growth in MSE, nearly a 2.6× increase on the IEEE-118 system, highlighting that physics-aware output parameterization is essential for producing voltage predictions that remain within operationally meaningful bounds before power-flow correction.
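One common way to realize such a constrained decoder (a sketch of the general technique, not necessarily the paper's exact parameterization) is a sigmoid rescaling of the raw network outputs into the feasible box:

```python
import numpy as np

def bounded_decode(z, lo, hi):
    """Squash unconstrained network outputs z into [lo, hi] with a sigmoid,
    so box constraints (e.g., voltage-magnitude bounds) hold by construction."""
    return lo + (hi - lo) / (1.0 + np.exp(-np.asarray(z, float)))

# Example: raw outputs mapped into a typical voltage band of [0.94, 1.06] pu.
v = bounded_decode([-10.0, 0.0, 10.0], 0.94, 1.06)
```

Because the mapping is smooth and differentiable, the bound is enforced without projection steps or penalty terms, which is why voltage and generator limits can be excluded from the violation tables.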
The variant without global attention (w/o Global Attention) exhibits the most severe performance drop, especially on IEEE-118, where the MSE more than triples relative to the full model. This confirms that purely local heterogeneous message passing is inadequate for capturing global voltage-angle dependencies, and that bus-centric global aggregation plays a critical role in scaling to large transmission systems.
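The bus-centric global aggregation can be sketched as generic scaled dot-product self-attention over bus embeddings; the weight names and dimensions below are illustrative, not the paper's exact layer:

```python
import numpy as np

def global_bus_attention(H, Wq, Wk, Wv):
    """Single-head global self-attention over bus embeddings H (n_bus x d):
    every bus attends to every other bus, supplying the long-range coupling
    that purely local message passing misses."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)             # row-wise softmax
    return A @ V

rng = np.random.default_rng(0)
H = rng.normal(size=(14, 8))                      # 14 buses, 8-dim embeddings
W = [rng.normal(size=(8, 8)) for _ in range(3)]
out = global_bus_attention(H, *W)                 # shape (14, 8)
```

Unlike k-hop message passing, this layer's receptive field covers the whole grid in one step, which is consistent with the ablation's finding that its removal hurts most on the largest system.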
Overall, the ablations demonstrate that each architectural component contributes meaningfully to the accuracy and physical consistency of LG-HGNN. Removing any one of them leads to noticeable performance degradation, particularly in larger grids where long-range electrical interactions and tight operational limits make AC-OPF learning significantly more challenging.
5.10. Computational Efficiency Analysis
Figure 3 compares the per-instance solution time of DC-IPOPT and the proposed LG-HGNN across all datasets. The conventional DC-IPOPT solver exhibits a strong dependency on grid size, with average pre-power-flow (PF) runtimes rising from 28.7 ms on the IEEE-14 system to 374.7 ms on the 118-bus system, and up to nearly 1 s after PF. On the 2000-bus GOC-2000 case, DC-IPOPT further increases to 4178.3 ms before PF and 5436.5 ms after PF. This rapid growth reflects the iterative nature of nonlinear optimization and the computational cost of repeated sparse matrix factorizations.
In contrast, LG-HGNN maintains low inference latency, requiring only 5–7 ms for the 14- and 30-bus grids and below 30 ms for the 118-bus and 2000-bus cases before PF, since each prediction involves a fixed number of message-passing and attention layers. Consequently, LG-HGNN achieves substantial speed-ups over DC-IPOPT on the 14-, 30-, 118-, and GOC-2000 systems, both before and after PF. Notably, for the 2000-bus case, the post-PF runtime is dominated by the power-flow correction step, which reduces the overall speed-up compared with the pure feed-forward inference stage. These gains are achieved without sacrificing accuracy or physical feasibility, indicating that the model can serve as an efficient surrogate for computationally intensive AC-OPF solvers.
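The speed-up factors follow directly from the runtimes quoted above. The sketch below recomputes them before PF, assuming a midpoint or upper-bound inference time for LG-HGNN where the text gives only a range (those inference times are assumptions, not measurements):

```python
# DC-IPOPT pre-PF runtimes (ms) quoted above; LG-HGNN times are assumed
# from the quoted ranges (6 ms midpoint for IEEE-14, 30 ms upper bounds).
dc_ipopt_ms = {"IEEE-14": 28.7, "IEEE-118": 374.7, "GOC-2000": 4178.3}
lg_hgnn_ms = {"IEEE-14": 6.0, "IEEE-118": 30.0, "GOC-2000": 30.0}

speedups = {g: dc_ipopt_ms[g] / lg_hgnn_ms[g] for g in dc_ipopt_ms}
for grid, s in speedups.items():
    print(f"{grid}: ~{s:.1f}x before PF")
```

Under these assumptions the speed-up grows from single digits on IEEE-14 to over two orders of magnitude on GOC-2000, consistent with the widening gap described above.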
Overall, the results highlight that LG-HGNN combines high physical fidelity with excellent practical scalability, delivering AC-feasible, near-optimal solutions within milliseconds.