Dynamical Graph Neural Networks for Modern Power Grid Analysis

Huang, Shu; Li, Jining; Zeng, Ruijiang; Li, Zhiyong; Xu, Jin

doi:10.3390/electronics15030493


by Shu Huang ¹,², Jining Li ¹, Ruijiang Zeng ¹,², Zhiyong Li ¹,²,* and Jin Xu ³

¹ Electric Power Research Institute, Guangdong Power Grid Company Limited, Guangzhou 510081, China

² China Southern Power Grid Key Laboratory of Power Grid Automation Laboratory, Guangzhou 510081, China

³ Pazhou Laboratory, Guangzhou 510330, China

* Author to whom correspondence should be addressed.

Electronics 2026, 15(3), 493; https://doi.org/10.3390/electronics15030493

Submission received: 5 November 2025 / Revised: 12 December 2025 / Accepted: 18 December 2025 / Published: 23 January 2026

(This article belongs to the Special Issue AI Applications for Smart Grid)


Abstract

Modern power grids are crucial infrastructures underpinning societal stability, yet their complexity and dynamic nature pose significant challenges for traditional analytical methods. Graph Neural Networks (GNNs) have recently emerged as powerful tools for modeling complex relationships in graph-structured data, making them especially suitable for analyzing power systems. However, existing GNN methods typically focus on static or simplified network models, failing to adequately address dynamic topological changes and suffering from the over-smoothing issue. To overcome these limitations, we propose a novel GNN framework incorporating dynamic message-passing mechanisms, comprising Dynamic Topological Learning (DTL) and Adaptive Message-Passing (AMP) modules. Specifically, DTL captures dynamic changes in the power grid topology conditioned on the current state of the system, while AMP dynamically adjusts the message-passing process to effectively preserve local node information according to the updated topology. This framework is model-agnostic, allowing it to be integrated with various GNN architectures. Extensive experiments on multiple benchmark power grid datasets demonstrate that our proposed framework significantly enhances existing GNN methods in power flow and optimal power flow analysis, consistently achieving lower mean absolute error and higher R-squared scores.

Keywords: power flow analysis; machine learning; graph neural networks; power grid

1. Introduction

Modern power grids are intricate, critical infrastructures that ensure a stable and efficient energy supply across vast regions and, ultimately, for society as a whole [1,2,3]. The complexity of these systems, characterized by their dynamic nature and the interdependence of their components, poses significant challenges for traditional analysis methods [4]. As global energy demand continues to rise and the shift towards renewable and decentralized energy sources such as solar and wind power accelerates, traditional power grids face significant operational challenges, and the need for advanced analytical tools becomes increasingly urgent. Renewable sources introduce fluctuations, intermittency, and reduced inertia into the grid, complicating the task of maintaining stable power flow and frequency synchronization. These complexities are exacerbated by the integration of diverse Distributed Energy Resources (DERs) and a heightened susceptibility to cascading failures, which can trigger extensive blackouts and severely disrupt daily life and economic activities [5].

Traditional power flow analysis methods, such as numerical solutions of the power flow equations based on the Newton–Raphson or Gauss–Seidel algorithms and deterministic stability analyses, increasingly struggle to cope with the dynamic and uncertain nature of contemporary grids [6,7,8,9,10]. Traditional numerical techniques require detailed system parameters and precise modeling of operating conditions, which can become inaccurate or unavailable due to aging infrastructure, incomplete information on DERs, and rapidly changing grid configurations. Furthermore, numerical approaches, which rely heavily on iterative solutions of complex nonlinear equations, can be computationally prohibitive for real-time monitoring and assessment of large-scale grid stability. Thus, there is an urgent need for innovative, robust methods that leverage data-driven and computationally efficient approaches to ensure the reliability and resilience of modern power grids under diverse operational conditions and potential disturbances. Under these circumstances, Graph Neural Networks (GNNs) have emerged as promising tools for analyzing and modeling power grids [11,12,13,14,15]. GNNs are particularly well suited for power grid analysis because they can model complex relationships and interactions between components in a graph structure, where nodes represent power system elements (e.g., generators, transformers, loads) and edges represent the physical connections between them. GNNs follow the message-passing paradigm, where information is exchanged between neighboring nodes to update their representations iteratively [15,16,17].

For example, ref. [18] presents a GNN-based model that learns to iteratively refine power flow solutions, providing a fast and accurate alternative to traditional numerical solvers for solving the nonlinear AC power flow equations. To address the problem that the optimal power flow problem is non-convex and not scalable to large power networks, while the DC optimal power flow approximation fails for heavily loaded grids, ref. [19] uses IPOPT to compute optimal solutions for training set network states as labels, enabling GNNs to learn the mapping from network states to optimal power generation outputs for efficient inference. Ref. [20] introduces a physics-informed neural network framework to approximate optimal power flow solutions while ensuring physical feasibility, thereby reducing computational cost and improving generalization. Ref. [21] uses GNNs to learn mappings from power system states to optimal power flow solutions, aiming to provide fast, scalable approximations that can generalize across different grid topologies and loading conditions. Ref. [22] proposes a method to solve the AC power flow problem using GNNs that are trained to approximate the solution efficiently while incorporating realistic operational constraints such as generator limits and voltage bounds. Ref. [23] proposes techniques to improve GNN performance in predicting power flow by incorporating physical constraints and domain-specific augmentations, thereby increasing both prediction accuracy and generalization across different grid conditions. Ref. [24] proposes PowerFlowNet, a GNN-based architecture, which transforms power flow into a node regression task, integrating a mask encoder (to distinguish known/unknown features) and stacked PowerFlowConv layers (combining message passing and TAGConv to aggregate node and edge features), and training with MSE or mixed loss functions to achieve fast and accurate power flow approximation.

Although these studies demonstrate the potential of GNNs in power grid analysis, they often focus on static or simplified models that do not fully capture the dynamic nature of modern power systems. Specifically, the failure of critical components, such as transmission lines or transformers, can lead to cascading outages and significant disruptions in power supply, which in turn change the topology of the power grid [25,26,27]. Additionally, changes in the power grid topology can significantly affect the power flow and stability of the system, making it crucial to develop GNN models that can adapt to these changes [28,29,30,31]. Most existing GNN-based methods for power grid analysis, however, still assume a fixed network topology throughout the computation, and thus completely ignore the intrinsically dynamic behavior of real-world grids. Only a few recent studies attempt to incorporate certain forms of dynamics, for example by modeling time-varying operating conditions or pre-defined switching events, but even these approaches rely on externally specified topology changes rather than enabling the GNN itself to autonomously learn and adapt the underlying graph structure from data. For instance, ref. [32] proposes a physics-informed unsupervised power flow solver based on Typed Graph Neural Networks, which minimizes the power balance violations on different static power grid scenarios, including perturbed load/generation injections, adjusted branch parameters (e.g., resistance and reactance), and random single-branch outages designed to simulate topological variations. Yet the “topological variations” it incorporates are merely independent, pre-defined static perturbations rather than the sequential, cascading topological changes that occur in real grid failures. Furthermore, many existing GNN-based methods suffer from over-smoothing, where the node representations become indistinguishable after multiple layers of aggregation. This leads to a loss of important local information, which is particularly harmful in power grid analysis, where local interactions and dependencies are crucial for accurate predictions [33,34,35,36].

To overcome the above limitations of existing GNN-based approaches, we propose a novel, dynamical GNN framework that explicitly integrates the evolution of grid topology into the learning process. Our main ideas can be summarized as follows:

Dynamic Topological Learning (DTL). We design a DTL module that enables the network to autonomously infer and update a data-driven adjacency structure conditioned on the current system state, instead of relying on a pre-specified, static network model or manually defined switching patterns. In this way, the GNN can directly learn how grid topology should evolve from data, rather than being constrained by externally imposed topology changes.
Adaptive Message-Passing (AMP). Based on the dynamically updated graph, we introduce an AMP module that performs representation learning while adaptively adjusting the strength and range of information propagation according to the learned topology and operating conditions. This allows the model to modulate message passing under different grid configurations and operating regimes.
Model-agnostic dynamical enhancement. The proposed design is model-agnostic: DTL and AMP can be seamlessly plugged into a wide range of existing GNN architectures, turning them into dynamic variants without altering their core design. This provides a generic way to endow conventional GNNs with topology-adaptive capabilities. By continuously reshaping the effective neighborhood structure and regulating the aggregation process, our framework naturally mitigates the over-smoothing effect and preserves critical local distinctions that are essential for accurate power system prediction and control.

Extensive experiments on multiple benchmark power grid datasets demonstrate that equipping strong GNN baselines with our method enables them to consistently achieve lower errors on both power flow and optimal power flow tasks. The rest of the paper is organized as follows. Section 2 briefly introduces the background of power grid analysis and common GNNs. Section 3 presents our proposed GNN framework with dynamic message-passing mechanisms. Section 4 describes the experimental setup and datasets, and presents the experimental results and analysis. Finally, Section 5 concludes the paper.

2. Related Work

In this section, we first explain the background of power grid analysis, including the power flow and optimal power flow problems, and then introduce GNNs that follow the general message-passing paradigm.

2.1. Power Flow Analysis

Power flow analysis (also known as load flow analysis) is a fundamental computational technique for determining the steady-state voltage magnitudes, angles, and branch power flows in electrical power systems [37,38,39,40,41,42]. It involves solving a set of nonlinear algebraic equations derived from Kirchhoff’s current and voltage laws, given specified generator outputs, load demands, and the network topology and electrical parameters (e.g., line admittances and impedances). Once solved, power flow analysis provides critical insights into the distribution of electrical power within the network.

Mathematically, the steady-state power flow equations for an $N$-bus system can be expressed as:

$$S_{i} = P_{i} + j Q_{i} = V_{i} \sum_{k = 1}^{N} V_{k}^{*} Y_{i k}^{*} \tag{1}$$

where $V_{i} = | V_{i} | e^{j \theta_{i}}$ represents the complex voltage at bus $i$, and $Y_{i k} = G_{i k} + j B_{i k}$ denotes the elements of the bus admittance matrix. Solving this nonlinear equation system typically involves iterative numerical algorithms such as the Newton–Raphson or Gauss–Seidel methods. However, these conventional techniques can be computationally intensive, especially for large-scale networks, and their convergence performance heavily depends on accurate initial guesses and system conditions. Such limitations underscore the need for more computationally efficient and robust methods for modern, dynamic power grids.

More explicitly, the steady-state AC power flow of an $N$-bus system can be written in compact complex form as a nodal power balance. At each bus $i$, the net complex power injection equals the power delivered to the network through the bus admittance matrix:

$$S_{i}^{G} - S_{i}^{L} = P_{i} + j Q_{i} = V_{i} \sum_{k = 1}^{N} V_{k}^{*} Y_{i k}^{*}, \tag{2}$$

where $S_{i}^{G}$ and $S_{i}^{L}$ denote the complex generated and demanded powers at bus $i$, $V_{i} = | V_{i} | e^{j \theta_{i}}$ is the complex bus voltage, and $Y_{i k} = G_{i k} + j B_{i k}$ is the $(i, k)$-th entry of the bus admittance matrix. Equation (2) is the standard complex power flow equation obtained from $S_{i} = V_{i} I_{i}^{*}$ with $I_{i} = \sum_{k} Y_{i k} V_{k}$. For practical computation, the complex equation in (2) is usually split into its real and imaginary parts, leading to the active and reactive power balance equations:

$$P_{i}^{G} - P_{i}^{L} = | V_{i} | \sum_{k = 1}^{N} | V_{k} | \left( G_{i k} \cos \theta_{i k} + B_{i k} \sin \theta_{i k} \right), \tag{3}$$

$$Q_{i}^{G} - Q_{i}^{L} = | V_{i} | \sum_{k = 1}^{N} | V_{k} | \left( G_{i k} \sin \theta_{i k} - B_{i k} \cos \theta_{i k} \right), \tag{4}$$

where $\theta_{i k} = \theta_{i} - \theta_{k}$ is the voltage angle difference between buses $i$ and $k$. Together with the bus type specifications (slack, PV, and PQ buses), Equations (3) and (4) define the full set of nonlinear equality constraints that must be satisfied by any physically feasible power flow solution. Solving these equations typically relies on iterative numerical algorithms such as the Newton–Raphson or Gauss–Seidel methods. However, these conventional techniques can be computationally intensive for large-scale networks, and their convergence performance heavily depends on accurate initial guesses and system conditions. These limitations motivate the search for more data-driven and computationally efficient alternatives for modern, dynamic power grids.
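The relationship between the complex form in Equation (2) and the polar form in Equations (3) and (4) can be checked numerically. The following is a minimal sketch in plain Python; the two-bus system, its line impedance, and the function names are illustrative assumptions and are not taken from the datasets used in this paper.

```python
import cmath
import math

def injections_complex(V, Y):
    """Complex injection at each bus: S_i = V_i * conj(sum_k Y_ik V_k), as in Equation (2)."""
    n = len(V)
    return [V[i] * sum(Y[i][k] * V[k] for k in range(n)).conjugate() for i in range(n)]

def injections_polar(Vm, theta, G, B):
    """Active and reactive injections from the polar form of Equations (3) and (4)."""
    n = len(Vm)
    P, Q = [], []
    for i in range(n):
        p = q = 0.0
        for k in range(n):
            d = theta[i] - theta[k]  # angle difference theta_ik
            p += Vm[k] * (G[i][k] * math.cos(d) + B[i][k] * math.sin(d))
            q += Vm[k] * (G[i][k] * math.sin(d) - B[i][k] * math.cos(d))
        P.append(Vm[i] * p)
        Q.append(Vm[i] * q)
    return P, Q

# Hypothetical two-bus system joined by one line with series impedance 0.01 + 0.1j p.u.
y = 1.0 / complex(0.01, 0.1)
Y = [[y, -y], [-y, y]]
Vm = [1.0, 0.98]        # voltage magnitudes (p.u.), assumed values
theta = [0.0, -0.05]    # voltage angles (rad), assumed values
V = [cmath.rect(m, a) for m, a in zip(Vm, theta)]

S = injections_complex(V, Y)
G = [[e.real for e in row] for row in Y]
B = [[e.imag for e in row] for row in Y]
P, Q = injections_polar(Vm, theta, G, B)

for i in range(len(V)):
    print(f"bus {i}: P = {P[i]:+.4f}, Q = {Q[i]:+.4f}")
```

Both routes give the same injections up to floating-point error, and the sum of the active injections equals the resistive loss on the line. An iterative solver such as Newton–Raphson repeatedly evaluates these expressions to drive the mismatch against the specified injections to zero.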

The optimal power flow (OPF) extends traditional power flow analysis by formulating a constrained optimization problem [43,44,45]. Instead of only finding a feasible operating point, OPF seeks operating conditions that optimize an objective function, such as generation cost, losses, or emissions. A typical AC OPF formulation can be written as

$$\min_{\{ P_{g_{i}}, Q_{g_{i}}, V_{i}, \theta_{i} \}} \sum_{i \in \mathcal{G}} C_{i} (P_{g_{i}}) \tag{5}$$

subject to the power balance constraints at each bus

$$P_{i}^{G} - P_{i}^{L} = | V_{i} | \sum_{k = 1}^{N} | V_{k} | \left( G_{i k} \cos \theta_{i k} + B_{i k} \sin \theta_{i k} \right), \quad \forall i, \tag{6}$$

$$Q_{i}^{G} - Q_{i}^{L} = | V_{i} | \sum_{k = 1}^{N} | V_{k} | \left( G_{i k} \sin \theta_{i k} - B_{i k} \cos \theta_{i k} \right), \quad \forall i, \tag{7}$$

and the usual operational constraints:

$$P_{g_{i}}^{\min} \leq P_{g_{i}} \leq P_{g_{i}}^{\max}, \quad Q_{g_{i}}^{\min} \leq Q_{g_{i}} \leq Q_{g_{i}}^{\max}, \quad \forall i \in \mathcal{G}, \tag{8}$$

$$V_{i}^{\min} \leq | V_{i} | \leq V_{i}^{\max}, \ \forall i, \qquad | S_{i j} (V, \theta) | \leq S_{i j}^{\max}, \ \forall (i, j) \in \mathcal{E}. \tag{9}$$

Here, $\mathcal{G}$ denotes the set of generator buses and $\mathcal{E}$ denotes the set of transmission lines. These constraints enforce generator capability limits, bus voltage bounds, and thermal limits on line flows, respectively. Because the problem is nonlinear and non-convex, solving large-scale OPF problems in real time remains challenging, which has stimulated growing interest in fast, learning-based approximators such as GNNs.
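The operational constraints in Equations (8) and (9) are simple box and magnitude limits, so a candidate operating point (for example, one produced by a learned approximator) can be screened against them directly. The sketch below is illustrative only: the function names, the dictionary layout of the limits, and all numerical values are assumptions made for this example.

```python
def within(values, lower, upper, tol=1e-9):
    """True if lower_i <= value_i <= upper_i holds for every entry (with a small tolerance)."""
    return all(lo - tol <= v <= hi + tol for v, lo, hi in zip(values, lower, upper))

def check_operational_limits(pg, qg, vm, s_mag, lim):
    """Check the inequality constraints (8) and (9) for one candidate operating point."""
    return {
        "active_generation": within(pg, lim["pg_min"], lim["pg_max"]),
        "reactive_generation": within(qg, lim["qg_min"], lim["qg_max"]),
        "voltage_magnitude": within(vm, lim["v_min"], lim["v_max"]),
        "line_flow": all(s <= s_max + 1e-9 for s, s_max in zip(s_mag, lim["s_max"])),
    }

# Hypothetical limits for a system with two generators, three buses, and two lines (p.u.).
limits = {
    "pg_min": [0.0, 0.0], "pg_max": [1.5, 1.0],
    "qg_min": [-0.5, -0.5], "qg_max": [0.5, 0.5],
    "v_min": [0.95] * 3, "v_max": [1.05] * 3,
    "s_max": [1.2, 1.2],
}

# Candidate point in which the second generator exceeds its active power limit.
report = check_operational_limits(
    pg=[1.2, 1.1], qg=[0.1, -0.2], vm=[1.0, 0.97, 1.02], s_mag=[0.9, 1.1], lim=limits
)
print(report)
```

This check covers only the inequality constraints; the equality constraints (6) and (7) would be verified separately through the power balance mismatch.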

2.2. Graph Neural Networks

We first introduce the notations used in this paper. We denote a graph as $\mathcal{G} = (\mathcal{V}, \mathcal{E}, X)$, where $\mathcal{V}$ is the set of nodes, $\mathcal{E}$ is the set of edges, and $X$ is the node feature matrix. Each node $i \in \mathcal{V}$ has a feature vector $x_{i} \in \mathbb{R}^{d}$, where $d$ is the dimension of the feature space. The adjacency matrix of the graph is denoted as $A$, where $A_{i j} = 1$ if there is an edge between nodes $i$ and $j$, and 0 otherwise. Each edge $(i, j) \in \mathcal{E}$ may also have associated features, denoted as $f_{i j}$.

GNNs are a class of neural networks specifically designed to operate on graph-structured data, where the relationships between entities are represented as edges connecting nodes [46]. GNNs have gained significant attention in recent years due to their ability to learn complex representations of graph-structured data and their applicability in various domains, including social networks, molecular chemistry, and power systems. Most GNNs follow the message-passing paradigm, where information is exchanged between neighboring nodes to update their representations iteratively. The message-passing process typically consists of two main steps: message aggregation and node update. In the message aggregation step, each node collects messages from its neighbors, which can be formulated as:

$$m_{i}^{(t)} = \sum_{j \in \mathcal{N} (i)} f_{agg} \left( h_{j}^{(t - 1)}, h_{i}^{(t - 1)}, f_{i j} \right) \tag{10}$$

where $m_{i}^{(t)}$ is the aggregated message for node $i$ at the $t$-th iteration, $h_{j}^{(t - 1)}$ and $h_{i}^{(t - 1)}$ are the hidden representations of neighboring node $j$ and node $i$ at the previous iteration, respectively, and $f_{i j}$ represents the edge features between nodes $i$ and $j$. The aggregation function $f_{agg}$ can be a simple sum, mean, or more complex neural network-based function. In the node update step, each node updates its hidden representation based on the aggregated messages and its previous representation:

$$h_{i}^{(t)} = f_{update} \left( h_{i}^{(t - 1)}, m_{i}^{(t)} \right) \tag{11}$$

where $f_{update}$ is typically a neural network function that combines the previous representation and the aggregated message to produce the new representation. The process of message aggregation and node update is repeated for multiple iterations, allowing nodes to exchange information with their neighbors and learn rich representations of the graph structure [47,48]. Based on this message-passing framework, various GNN architectures have been proposed, such as Graph Convolutional Networks (GCN) [49], Graph Attention Networks (GAT) [50], and Graph Isomorphism Networks (GIN) [13], each with its own specific design choices for the aggregation and update functions.
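The two steps in Equations (10) and (11) can be sketched with plain Python lists. The particular message and update functions below (a neighbor state scaled by a scalar edge feature, and an equal-weight average) are toy choices assumed for illustration; real architectures such as GCN, GAT, or GIN use learned functions instead.

```python
def message_passing_step(h, neighbors, edge_feat, f_agg, f_update):
    """One iteration of Equations (10) and (11) over all nodes."""
    new_h = []
    for i, h_i in enumerate(h):
        m_i = [0.0] * len(h_i)  # aggregated message, Equation (10)
        for j in neighbors[i]:
            msg = f_agg(h[j], h_i, edge_feat[(i, j)])
            m_i = [a + b for a, b in zip(m_i, msg)]
        new_h.append(f_update(h_i, m_i))  # node update, Equation (11)
    return new_h

# Toy message: neighbor state scaled by a scalar edge feature (assumed for illustration).
def scaled_neighbor(h_j, h_i, f_ij):
    return [f_ij * x for x in h_j]

# Toy update: equal-weight average of the previous state and the message (assumed).
def average_update(h_i, m_i):
    return [0.5 * a + 0.5 * b for a, b in zip(h_i, m_i)]

# Path graph 0 - 1 - 2 with unit edge features; only node 0 starts with a signal.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
edge_feat = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 1.0, (2, 1): 1.0}
h0 = [[1.0], [0.0], [0.0]]

h1 = message_passing_step(h0, neighbors, edge_feat, scaled_neighbor, average_update)
h2 = message_passing_step(h1, neighbors, edge_feat, scaled_neighbor, average_update)
print(h1, h2)
```

In this toy run the signal placed on node 0 reaches node 2 only after the second iteration, which illustrates that each iteration extends a node's receptive field by one hop.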

3. Methodology

In this section, we present our proposed GNN framework with dynamic message-passing mechanisms for power grid analysis. The framework is designed to capture the dynamic nature of power grids and adapt to topological changes, while also addressing the over-smoothing problem commonly encountered in GNNs. Specifically, this framework consists of two main components: Dynamic Topological Learning (DTL) and Adaptive Message-Passing (AMP). The dynamic topological learning component is responsible for capturing the dynamic nature of power grids and adapting to topological changes, while the adaptive message-passing component adaptively adjusts the message-passing process based on the current state of the power grid. By modeling the dynamic characteristics of topological changes, this framework can alleviate the over-smoothing problem and enhance the performance of GNNs in power grid analysis tasks. We present the overall workflow of the proposed framework in Figure 1 for a brief overview. Next, we will describe each component in detail.

3.1. Dynamic Topological Learning

Dynamic Topological Learning (DTL) is a crucial component of our proposed GNN framework, which aims to capture the dynamic nature of power grids and adapt to topological changes. Specifically, DTL is expected to model the changeable topological structure of power grids, such as the addition or removal of transmission lines, transformers, or other components, which can significantly affect the power flow and stability of the system. To this end, DTL needs to learn a dynamic adjacency matrix $A^{(t)}$ that reflects the current state of the power grid at time step $t$:

$$A^{(t)} = \sigma \left( \mathrm{DTL} (H^{(t - 1)}, A^{(t - 1)}, F) \right) \tag{12}$$

where $H^{(t - 1)}$ is the hidden representation of the nodes at the previous time step, $A^{(t - 1)}$ is the adjacency matrix at the previous time step, $F$ represents the features of the edges, and $\sigma$ is the sigmoid function, which ensures the output is in the range [0, 1]. In this way, the new adjacency matrix $A^{(t)}$ is updated according to the current state of the power grid, which can be used to guide the message-passing process in the next step. The DTL function $\mathrm{DTL}$ can be implemented as any neural network that takes the hidden representations and edge features as input and outputs the new adjacency matrix.

In experiments, we employ the common TransformerConv [51] to instantiate the DTL function $\mathrm{DTL}$ since it takes into account both the node features and edge features, which is suitable for power grid analysis. The TransformerConv layer can be expressed as:

$$h_{i}^{'} = W_{1} h_{i} + \sum_{j \in \mathcal{N} (i)} \alpha_{i, j} \left( W_{2} h_{j} + W_{3} f_{i j} \right) \tag{13}$$

with

$$\alpha_{i, j} = \mathrm{softmax} \left( \frac{(W_{4} h_{i})^{\top} (W_{5} h_{j} + W_{6} f_{i j})}{\sqrt{d}} \right), \tag{14}$$

where $h_{i}$ is the feature vector of node $i$, $f_{i j}$ is the edge feature vector between nodes $i$ and $j$, and $W_{1}, W_{2}, W_{3}, W_{4}, W_{5}, W_{6}$ are learnable weight matrices. The softmax function ensures that the attention coefficients $\alpha_{i, j}$ sum to 1 across all neighbors of node $i$, allowing the model to focus on the most relevant neighbors when updating the node representation. Then, we can use the output of the TransformerConv layer to compute the new adjacency matrix $A^{(t)}$ as follows:

$$\hat{A}_{i j}^{(t)} = \sigma \left( h_{i}^{' \top} h_{j}^{'} \right) \tag{15}$$

where $h_{i}^{'}$ and $h_{j}^{'}$ are the updated node representations from the TransformerConv layer, and $\sigma$ is the sigmoid function. This formulation allows the DTL component to learn a dynamic adjacency matrix that reflects the current state of the power grid based on the node representations and edge features.
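To make Equations (13) to (15) concrete, the following sketch implements a single-head attention layer of this form and the subsequent edge scoring in plain Python. It is not the PyTorch Geometric implementation used in the experiments: the random weights, the tiny three-bus graph, the feature values, and all function names are assumptions chosen for illustration.

```python
import math
import random

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def attention_layer(h, f, neighbors, W):
    """Single-head layer following Equations (13) and (14)."""
    d = len(W[1])  # output dimension, used in the 1 / sqrt(d) scaling
    out = []
    for i, h_i in enumerate(h):
        h_new = matvec(W[1], h_i)
        if not neighbors[i]:  # isolated node: only the W_1 h_i term remains
            out.append(h_new)
            continue
        query = matvec(W[4], h_i)
        scores = [
            dot(query, add(matvec(W[5], h[j]), matvec(W[6], f[(i, j)]))) / math.sqrt(d)
            for j in neighbors[i]
        ]
        top = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        for j, w in zip(neighbors[i], weights):
            value = add(matvec(W[2], h[j]), matvec(W[3], f[(i, j)]))
            h_new = add(h_new, [w / total * v for v in value])
        out.append(h_new)
    return out

def edge_probabilities(h_prime):
    """Equation (15): A_hat[i][j] = sigmoid(h'_i . h'_j) for every node pair."""
    return [[sigmoid(dot(a, b)) for b in h_prime] for a in h_prime]

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

d_in, d_edge, d_out = 2, 1, 2
W = {k: rand_matrix(d_out, d_edge if k in (3, 6) else d_in) for k in range(1, 7)}

# Hypothetical three-bus graph with two node features and one edge feature per line.
h = [[1.0, 0.2], [0.9, -0.1], [1.05, 0.4]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
f = {(0, 1): [0.3], (1, 0): [0.3], (1, 2): [0.7], (2, 1): [0.7]}

h_prime = attention_layer(h, f, neighbors, W)
A_hat = edge_probabilities(h_prime)
for row in A_hat:
    print(["%.3f" % p for p in row])
```

Note that the scoring in Equation (15) is evaluated for every pair of nodes, including pairs that are not connected in the input graph, and that the inner-product form makes the resulting matrix symmetric.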

Note that the updated adjacency matrix $A^{(t)}$ should be binary, meaning that it only indicates the presence or absence of edges between nodes. However, simply discretizing the output of the DTL function is non-differentiable, which makes it unsuitable for gradient-based training. To address this, we adopt the Straight-Through Gumbel Estimator technique [52,53], which allows us to sample from a Gumbel distribution to obtain a binary adjacency matrix while maintaining differentiability. Specifically, for each element $\hat{A}_{i j}^{(t)}$ in the adjacency matrix, we sample from a Gumbel distribution with temperature $\tau$:

$$A_{i j}^{(t)} = \frac{\exp \left( (\log \hat{A}_{i j}^{(t)} + g_{i j}^{(1)}) / \tau \right)}{\exp \left( (\log \hat{A}_{i j}^{(t)} + g_{i j}^{(1)}) / \tau \right) + \exp \left( (\log (1 - \hat{A}_{i j}^{(t)}) + g_{i j}^{(2)}) / \tau \right)} \tag{16}$$

where $g_{i j}^{(1)}$ and $g_{i j}^{(2)}$ are random variables sampled from the Gumbel distribution, and $\tau$ is a temperature parameter that controls the smoothness of the sampling process. As $\tau$ approaches 0, the sampling process becomes more discrete, while larger values of $\tau$ allow for smoother gradients during training. This approach enables us to learn a dynamic adjacency matrix that reflects the current state of the power grid while maintaining differentiability for backpropagation. Notably, the parameter $\tau$ can be annealed during training, starting from a higher value to allow for smoother gradients and gradually decreasing it to encourage more discrete adjacency matrices as training progresses. In our implementation, we simply make it a learnable parameter and let the model optimize it during training. Furthermore, by modeling the dynamic characteristics of topological changes, the DTL component can alleviate the over-smoothing problem commonly encountered in GNNs. Because it allows the GNN to adjust the message-passing process based on the current state of the power grid, it helps preserve local information and prevents the loss of important features over multiple layers of aggregation.
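The effect of the temperature in Equation (16) can be illustrated with a small forward-only sketch in plain Python. The edge probability of 0.8, the two temperatures, and the function names are assumptions for illustration; the gradient path of the straight-through estimator requires an automatic differentiation framework and is only indicated in a comment.

```python
import math
import random

def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample by inverse transform sampling."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))

def stable_sigmoid(z):
    """Sigmoid that avoids overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def relaxed_edge(a_hat, tau, rng, eps=1e-12):
    """Equation (16): relaxed sample in [0, 1] for one edge probability a_hat."""
    on = (math.log(a_hat + eps) + sample_gumbel(rng)) / tau
    off = (math.log(1.0 - a_hat + eps) + sample_gumbel(rng)) / tau
    # exp(on) / (exp(on) + exp(off)) equals sigmoid(on - off).
    return stable_sigmoid(on - off)

rng = random.Random(0)
a_hat = 0.8  # assumed edge probability from Equation (15)
soft_sharp = [relaxed_edge(a_hat, 0.1, rng) for _ in range(20000)]
soft_smooth = [relaxed_edge(a_hat, 5.0, rng) for _ in range(20000)]

# Straight-through forward value: threshold the relaxed sample. In an autograd
# framework the backward pass would reuse the gradient of the relaxed sample,
# e.g. by returning soft + stop_gradient(hard - soft).
hard = [1.0 if s > 0.5 else 0.0 for s in soft_sharp]

def spread(xs):
    """Mean distance from 0.5; values near 0.5 indicate almost binary samples."""
    return sum(abs(x - 0.5) for x in xs) / len(xs)

print("fraction of edges kept:", sum(hard) / len(hard))
print("spread at tau = 0.1:", spread(soft_sharp))
print("spread at tau = 5.0:", spread(soft_smooth))
```

By the Gumbel-max property, the fraction of thresholded samples equal to 1 approaches the edge probability regardless of the temperature, while the temperature controls how close the relaxed samples are to 0 or 1.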

3.2. Adaptive Message-Passing

Adaptive Message-Passing (AMP) is another key component of our proposed GNN framework, which aims to learn the representations of nodes in the power grid given the dynamic adjacency matrix $A^{(t)}$ learned from the DTL component. AMP is not limited to a specific GNN architecture, but rather can be combined with various GNN architectures to enhance their performance in power grid analysis tasks. For simplicity, we use the Graph Convolutional Network (GCN) [49] as an example to illustrate the AMP process, which can be expressed as:

$$H^{(t)} = \sigma \left( \tilde{D}^{- 1 / 2} \tilde{A}^{(t)} \tilde{D}^{- 1 / 2} H^{(t - 1)} W \right) \tag{17}$$

where $H^{(t)}$ is the hidden representation of the nodes at time step $t$, $\tilde{A}^{(t)} = A^{(t)} + I$ is the adjacency matrix with self-loops added, $\tilde{D}$ is the degree matrix of $\tilde{A}^{(t)}$, and $W$ is a learnable weight matrix. The activation function $\sigma$ can be any nonlinear function, such as ReLU or sigmoid. Note that the adjacency matrix $A^{(t)}$ is dynamically updated at each time step based on the current state of the power grid, allowing the GNN to adapt to topological changes and capture the dynamic nature of the system.
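The propagation rule in Equation (17) can be sketched in plain Python to show how a change in the adjacency matrix alters the aggregation. The three-bus graph, the scalar features, the identity weight, and the choice of ReLU are assumptions for illustration, and the function name is hypothetical.

```python
import math

def gcn_propagate(A, H, W):
    """One step of Equation (17) with ReLU as the activation function."""
    n = len(A)
    # Add self-loops: A_tilde = A + I
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    # Symmetric normalization D^{-1/2} A_tilde D^{-1/2}
    A_norm = [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    cols_in, cols_out = len(H[0]), len(W[0])
    AH = [[sum(A_norm[i][k] * H[k][c] for k in range(n)) for c in range(cols_in)] for i in range(n)]
    return [
        [max(0.0, sum(AH[i][c] * W[c][o] for c in range(cols_in))) for o in range(cols_out)]
        for i in range(n)
    ]

H = [[1.0], [2.0], [4.0]]  # one hypothetical feature per bus
W = [[1.0]]                # identity weight to keep the numbers readable

A_full = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]    # path 0 - 1 - 2
A_outage = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # line 1 - 2 removed

out_full = gcn_propagate(A_full, H, W)
out_outage = gcn_propagate(A_outage, H, W)
print(out_full)
print(out_outage)
```

With the line between buses 1 and 2 removed, bus 2 keeps its own feature unchanged, whereas with the full graph its representation is mixed with that of bus 1. This is the mechanism by which a learned adjacency matrix can limit how much neighboring information is averaged into each node.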

Then, the learned hidden representations $H^{(t)}$ can be used for various power grid analysis tasks, such as power flow and optimal power flow analysis. In Algorithm 1, we summarize the training procedure of our proposed dynamical GNN framework with the DTL and AMP components, using an $L$-layer GCN as an example. The training procedure consists of multiple epochs, where in each epoch, we iterate over mini-batches of the training set. For each sample in the mini-batch, we perform dynamic message passing for $L$ layers, where in each layer, we first encode the node features using the GCN layer, then update the adjacency matrix using the DTL component, and finally perform adaptive message passing using the AMP component. After obtaining the final hidden representations, we compute the loss function based on the task at hand (e.g., mean squared error for regression tasks) and update the model parameters using gradient descent. Note that we can easily replace the GCN layer with other GNN architectures or specific models for power grid analysis to further enhance the performance of our proposed framework.

Algorithm 1 Training procedure of D-GCN

Require: Training set $\mathcal{D}$, loss function $\mathcal{L}$, learning rate $\eta$, number of epochs $T$, $L$ untrained GCN layers, and readout function $\mathrm{Readout} (\cdot)$

Ensure: Trained parameters $\theta$ of D-GCN

for $t = 1$ to $T$ do ▹ Epoch loop
  for mini-batch $\mathcal{B} \subset \mathcal{D}$ do
    $\mathcal{L} \leftarrow 0$
    for each sample $\mathcal{G} = (\mathcal{V}, \mathcal{E}, X, Y) \in \mathcal{B}$ do
      $A^{(0)}, H^{(0)}, F \leftarrow$ adjacency matrix $A$, node features $X$, and edge features of $\mathcal{G}$
      for $\ell = 1$ to $L$ do ▹ Dynamic message passing
        $H^{(\ell - 1)} \leftarrow \mathrm{GCN}_{\theta^{(\ell - 1)}} (A, H^{(\ell - 1)}, F)$ ▹ Encode node features via GCN
        $A^{(\ell)} \leftarrow \mathrm{DTL} (H^{(\ell - 1)}, A^{(\ell - 1)}, F)$ via Equation (12) ▹ Dynamic adjacency matrix
        $H^{(\ell)} \leftarrow \mathrm{AMP} (H^{(\ell - 1)}, A^{(\ell)})$ via Equation (17) ▹ Adaptive message-passing
      end for
      $\mathcal{L} \leftarrow \mathcal{L} + \mathrm{Loss} (\mathrm{Readout} (H^{(L)}), Y)$ ▹ Compute loss function
    end for
    $\mathcal{L} \leftarrow \mathcal{L} / | \mathcal{B} |$
    $\theta \leftarrow \theta - \eta \nabla_{\theta} \mathcal{L}$ ▹ Gradient update
  end for
end for

3.3. Limitations and Future Works

In this subsection, we critically analyze the limitations of the proposed framework and its applicability to real-world large-scale power grids. A primary challenge in applying Graph Neural Networks to power grids is handling the scale of the network. In our proposed framework, the Dynamic Topological Learning (DTL) module is designed to autonomously infer grid topology changes. As described in Equations (14) and (15), the attention mechanism and the subsequent Gumbel-Softmax sampling require computing scores for potential edges. In a fully dynamic setting where global topology inference is permitted, this operation inherently carries a computational complexity of $O (N^{2})$, where $N$ is the number of buses. While this global attention enables the model to capture long-range dependencies and latent topological evolutions effectively, it introduces a memory bottleneck for extremely large-scale systems. Consequently, our evaluation below focuses on the RTE6470 dataset, which represents a realistic European transmission grid. Although medium-scale compared with ultra-large systems, it serves as a sufficient proxy for verifying the effectiveness of dynamic topology learning under hardware constraints. Future work could alleviate this bottleneck by incorporating sparse attention mechanisms or graph partitioning techniques to scale to ultra-large grids. For example, instead of computing global interactions, we can restrict the DTL module to infer potential links only within a localized ‘candidate set’ (e.g., $k$-hop neighbors or nodes within a certain electrical distance). This constrains the search space of the Gumbel-Softmax estimator, making the computational cost linear in the number of candidate edges.
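The localized candidate set mentioned above can be sketched with a breadth-first search that collects all node pairs within $k$ hops of each other. This is one possible construction under the assumption of an unweighted hop distance; the function name and the six-bus path graph are hypothetical, and an electrical-distance criterion would need a different distance computation.

```python
from collections import deque

def k_hop_candidates(neighbors, k):
    """Return the set of unordered node pairs that lie within k hops of each other."""
    pairs = set()
    for src in range(len(neighbors)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if dist[u] == k:  # do not expand beyond k hops
                continue
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        pairs.update((min(src, v), max(src, v)) for v in dist if v != src)
    return pairs

# Hypothetical six-bus path graph 0 - 1 - 2 - 3 - 4 - 5 given as an adjacency list.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
n = len(neighbors)

candidates = k_hop_candidates(neighbors, k=2)
all_pairs = n * (n - 1) // 2
print(len(candidates), "candidate pairs instead of", all_pairs)
```

Only the pairs in this set would then be scored by Equation (15) and sampled with Equation (16), so the cost scales with the size of the candidate set rather than with the number of all node pairs.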

4. Experiments

In this section, we conduct extensive experiments to evaluate the performance of our proposed GNN framework with dynamic message-passing mechanisms and compare it with existing GNN methods. Specifically, we design two groups of experiments. In the first group of experiments, we equip several representative GNN architectures with our DTL and AMP modules, and evaluate whether their performance on power flow and optimal power flow tasks can be consistently improved. This part aims to verify the effectiveness and generality of the proposed dynamic message-passing mechanisms. In the second group of experiments, we apply our framework to larger and more realistic power grid datasets, and retrofit different state-of-the-art baselines with our dynamic modules. This part is designed to assess the scalability and practical applicability of our approach under real-world-like operating conditions. We first describe the experimental setup, including the datasets used and the evaluation metrics. Then, we present the experimental results and analysis. In addition, we conduct experiments on the 19 test cases provided in the general PGLib-OPF repository [54], which offers a curated benchmark set for the AC Optimal Power Flow problem. These cases cover networks ranging from a few dozen nodes to several thousand nodes. Our results, measured by MSE, MAE, and R², show that the proposed module improves the performance of the baseline GNN across a wide range of datasets. A more detailed set of results can be found in Figure A1, Figure A2 and Figure A3 in Appendix A.

4.1. Experimental Setup

The first group of experiments is conducted on four power grid datasets following [55]: IEEE24, IEEE39, IEEE118, and UK. These datasets are based on real-world power grids and offer a diverse array of scales, topologies, and operational characteristics. They provide comprehensive data essential for conducting cascading failure analysis. The second group of experiments is performed on a larger power grid dataset, RTE6470, which represents the French transmission system operated by RTE [56]. This dataset encompasses 6470 buses and 9005 branches, reflecting a more complex and realistic power grid scenario. It includes detailed information on network topology, line parameters, generation capacities, and load profiles, making it suitable for evaluating the performance of GNNs in large-scale power flow and optimal power flow tasks. More detailed information about the datasets can be found in [55,56], and we provide a brief summary of the datasets in Table 1.

For a comprehensive evaluation, we compare our proposed framework with existing GNNs, including GCN [49], GIN [13], GAT [50], TransformerConv [51], and GATv2 [13], in the first group of experiments. Note that these GNNs are all compatible with our proposed framework; we refer to their dynamic variants as D-GCN, D-GIN, D-GAT, D-TransformerConv, and D-GATv2, respectively. All GNNs have two layers and are implemented using the PyTorch Geometric library. We use the Adam optimizer with a learning rate of 0.005 and a weight decay of 0.00005. The batch size is set to 128, and the number of training epochs is set to 50. We split the datasets into training, validation, and test sets with a ratio of 80:10:10. The performance of the models is evaluated using the Mean Absolute Error (MAE) and R-squared (R²) regression metrics. We run the experiments five times with different random seeds and report the average results with standard deviations.
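The setup above can be summarized in a short sketch. The hyperparameter values and the 80:10:10 ratio are taken from the text, and the graph count of 34,944 from Table 1. The dictionary keys, the function name `split_indices`, and the choice of a seeded random shuffle are assumptions made for illustration, since the text does not state how the split is drawn.

```python
import random

# Hyperparameters quoted in the text; the dictionary keys are illustrative names.
TRAIN_CONFIG = {
    "num_layers": 2,
    "optimizer": "Adam",
    "learning_rate": 0.005,
    "weight_decay": 0.00005,
    "batch_size": 128,
    "epochs": 50,
}

def split_indices(num_graphs, seed, ratios=(0.8, 0.1, 0.1)):
    """Shuffle graph indices with a fixed seed and cut them into train/val/test."""
    indices = list(range(num_graphs))
    random.Random(seed).shuffle(indices)  # assumed: seeded random shuffle
    n_train = int(ratios[0] * num_graphs)
    n_val = int(ratios[1] * num_graphs)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder, so that no graph is dropped
    return train, val, test

# 34,944 graphs per dataset in the first group of experiments (Table 1).
train, val, test = split_indices(34944, seed=0)
print(len(train), len(val), len(test))
```

Repeating the call with different seeds gives one split per run, which matches the practice of reporting averages over several random seeds.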

Then, we conduct the second group of experiments on the RTE6470 dataset. We compare our proposed framework with the SOTA baseline PowerFlowNet [24], the classic GCN [49], and four traditional numerical methods: Newton–Raphson (NR), Newton–Raphson with the Iwamoto multiplier (Iwamoto) [57], DC Power Flow (DCPF) [58], and Tikhonov regularization (Tikhonov) [59]. We retrofit the two learning-based baselines, PowerFlowNet and GCN, with our DTL and AMP modules, resulting in D-PowerFlowNet and D-GCN, respectively. The experimental settings are similar to those in the first group of experiments, with the same optimizer, learning rate, weight decay, batch size, number of epochs, and data split ratio. The performance is evaluated using the Mean Squared Error (MSE), MAE, and R² metrics. We run the experiments ten times with different random seeds and report the average results with standard deviations.
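For reference, the three regression metrics and the mean ± standard deviation summary can be written down in a few lines. This is a minimal sketch using the standard definitions. The toy values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator is reported.

```python
from statistics import mean, stdev

def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def summarize(per_seed_scores):
    """Mean and sample standard deviation over independent runs (assumed estimator)."""
    return mean(per_seed_scores), stdev(per_seed_scores)

# Hypothetical toy targets and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_good = [1.5, 2.0, 2.5, 4.0]
y_bad = [4.0, 3.0, 2.0, 1.0]   # worse than always predicting the mean

print(mse(y_true, y_good), mae(y_true, y_good), r2(y_true, y_good))
print(r2(y_true, y_bad))       # negative: SS_res exceeds SS_tot
print(summarize([0.90, 0.92, 0.91]))
```

R² becomes negative whenever the residual sum of squares exceeds the total sum of squares, that is, when a model does worse than a constant prediction of the target mean.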

4.2. Effective Analysis of Dynamic Message-Passing Mechanisms

The results of the experiments are shown in Table 2, Table 3, Table 4 and Table 5. Table 2 and Table 3 report the MAE and R² metrics, respectively, for different GNNs on the power flow tasks, while Table 4 and Table 5 report the MAE and R² metrics, respectively, on the optimal power flow tasks. All MAE values in the tables are scaled by a factor of 100 for easier comparison. A lower MAE and a higher R² indicate better model performance.

For the power flow tasks, our proposed framework improves the performance of the GNNs in almost all cases, achieving lower MAE and higher R² values than the baseline GNNs. Specifically, D-TransformerConv achieves the best performance on the IEEE24, IEEE118, and UK datasets, with the lowest MAE values of 9.30 × 10⁻⁴, 2.05 × 10⁻³, and 1.128 × 10⁻³, respectively, while D-GATv2 achieves the best performance on IEEE39, with the lowest MAE of 1.268 × 10⁻³. Moreover, D-GATv2 achieves the best R² on IEEE39 (97.7062) and D-TransformerConv achieves the best R² on UK (98.4741), corresponding to relative improvements of 0.2737% and 0.0222% over their respective baselines.

In addition, we visualize the performance of different GNNs on the power flow tasks in Figure 2 and Figure 3. From these figures, we can observe that all GNNs perform worse on larger datasets, such as IEEE118, than on smaller datasets, such as IEEE24. Nevertheless, our proposed framework improves the performance of the GNNs across the different datasets, indicating its effectiveness in power grid analysis tasks.

For the optimal power flow tasks, we observe a similar trend: our proposed framework improves the performance of the GNNs in almost all cases, achieving lower MAE and higher R² values than the baseline GNNs. Specifically, D-TransformerConv achieves the best performance on IEEE24, with the lowest MAE of 1.119 × 10⁻³, while D-GATv2 achieves the best performance on IEEE39 and IEEE118, with the lowest MAE values of 1.643 × 10⁻³ and 3.896 × 10⁻³, respectively. Furthermore, D-GAT achieves the best R² on IEEE39 (87.6967), and D-GATv2 achieves the best R² on IEEE118 (71.5982). Notably, D-GIN improves the R² values on the four datasets by 1.4074%, 1.3643%, 5.2347%, and 2.5480%, respectively, and D-GAT achieves improvements of 0.7297%, 0.8699%, 22.8165%, and 1.9489%, respectively, which demonstrates the effectiveness of our proposed framework in enhancing the performance of GNNs in optimal power flow tasks.
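The reported figures can be reproduced from the tables with two small helpers. The sketch below assumes that the quoted percentages are relative changes with respect to the baseline value, which is consistent with the numbers in Table 3 and Table 5. The function names are illustrative.

```python
def unscale_mae(table_value, scale=100.0):
    """Convert a tabulated MAE (reported as MAE x 100) back to the raw MAE."""
    return table_value / scale

def relative_improvement(baseline, dynamic):
    """Relative change of `dynamic` over `baseline`, in percent."""
    return 100.0 * (dynamic - baseline) / baseline

# Values copied from Table 3, Table 4 and Table 5.
print(unscale_mae(0.1643))                     # D-GATv2, IEEE39, optimal power flow
print(relative_improvement(97.4395, 97.7062))  # GATv2 -> D-GATv2, R², IEEE39, power flow
print(relative_improvement(98.4522, 98.4741))  # TransformerConv -> D-TransformerConv, R², UK
print(relative_improvement(55.6799, 68.3841))  # GAT -> D-GAT, R², IEEE118, optimal power flow
```

Rounded to four decimals, the last three calls give 0.2737, 0.0222, and 22.8165, matching the percentages quoted in the text.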

In addition, the proposed framework often exhibits lower standard deviations than the baseline GNNs (e.g., D-GAT on IEEE24 and IEEE39 in Table 2), which suggests that it is more stable and robust across different runs. This is particularly important in real-world applications where the performance of the model should be consistent and reliable.

4.3. Scalability and Practical Applicability

The experimental results on the large-scale RTE6470 system (Table 6) further highlight the computational behavior and practical value of different solvers as the network size increases. Numerical methods based on the full AC equations, such as Newton–Raphson and the Iwamoto-accelerated variant, remain the reference for accuracy. However, their computation time grows rapidly with network dimension due to repeated factorization of large sparse Jacobian matrices. On the 6470-bus system, both solvers require more than 1000 s per evaluation (about 1153 s and 1052 s, respectively), which limits their use in real-time decision tasks, large-scale contingency analysis, or fast what-if assessment. Linearized DC power flow scales better because it reduces the problem to a single linear system. Its computation time stays below 35 s even at this scale, but its substantially larger error reflects systematic approximation errors. This gap becomes more pronounced when the network is heavily loaded or exhibits strong voltage coupling, reducing its practical reliability. Tikhonov-based reconstruction is conceptually appealing because it integrates graph smoothness into the solution process. However, in large networks its main computation involves the inversion of graph Laplacian matrices. As the matrix dimension grows, this operation becomes a major bottleneck, resulting in the highest runtime among all tested baselines. The method’s accuracy also degrades in heterogeneous loading conditions, indicating limited robustness to variations in physical operating points.
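To make the "single linear system" remark concrete, the following sketch solves a textbook DC power flow on a hypothetical 3-bus network. It is a generic formulation (reduced susceptance matrix times bus angles equals net injections, with the slack angle fixed at zero), not the DCPF implementation benchmarked in Table 6. All line data and injections are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def dc_power_flow(num_buses, lines, injections, slack=0):
    """lines: (from_bus, to_bus, reactance); injections: net active power per bus (p.u.)."""
    keep = [i for i in range(num_buses) if i != slack]
    pos = {bus: k for k, bus in enumerate(keep)}
    # Reduced susceptance matrix (slack row and column removed).
    b = [[0.0] * len(keep) for _ in keep]
    for u, v, x in lines:
        y = 1.0 / x
        for i, j in ((u, v), (v, u)):
            if i != slack:
                b[pos[i]][pos[i]] += y
                if j != slack:
                    b[pos[i]][pos[j]] -= y
    theta_reduced = solve_linear(b, [injections[i] for i in keep])
    theta = [0.0] * num_buses            # slack angle stays at zero
    for bus, k in pos.items():
        theta[bus] = theta_reduced[k]
    flows = {(u, v): (theta[u] - theta[v]) / x for u, v, x in lines}
    return theta, flows

# Hypothetical 3-bus example: bus 0 is the slack, bus 1 a load, bus 2 a generator.
lines = [(0, 1, 0.1), (0, 2, 0.2), (1, 2, 0.25)]
injections = [0.0, -1.0, 0.5]            # the slack entry is ignored by the solver
theta, flows = dc_power_flow(3, lines, injections)
print(theta, flows)
```

Because this formulation neglects voltage magnitudes, reactive power, and line losses, it is fast but only approximates the AC solution, which is the accuracy trade-off discussed above.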

The MLP shows fast inference because all computations are fully vectorized and independent of the network’s topology. However, the absence of structural information restricts its ability to generalize across diverse loading patterns. GCN improves structural modeling through graph convolutions but suffers from fixed neighbor aggregation and excessive smoothing, which reduces accuracy when the system exhibits heterogeneous impedance patterns and diverse operational regimes. PowerFlowNet and its dynamic variant show better scalability characteristics. PowerFlowNet integrates edge attributes, multi-hop propagation, and a mask-aware design, allowing the model to capture the electrical interactions of large networks without a substantial increase in computation time. Its runtime remains on the order of a few seconds even for the 6470-bus case, much faster than numerical solvers and significantly more accurate than linearized or purely data-driven baselines. The dynamic version, D-PowerFlowNet, further enhances applicability by adjusting message weights according to the current operating point. This adaptive mechanism improves the model’s ability to reflect load variations, line flow redistribution, and the heterogeneous role of branches in different scenarios. The RTE6470 results confirm that this flexibility yields consistent improvements in both MAE and MSE while keeping the computation overhead modest. These properties indicate that dynamic message passing provides a more responsive representation of large grids compared to static graph convolutions.
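The smoothing effect of fixed neighbor aggregation mentioned above can be seen in a deliberately simplified sketch. It applies repeated mean aggregation with self-loops on a hypothetical 5-bus ring, with one scalar feature per bus and no learned weights or nonlinearities, so it illustrates the tendency only and is not a GCN layer or the authors' AMP module.

```python
def mean_aggregate(features, neighbors):
    """One round of static mean aggregation over each node and its neighbors (self-loop included)."""
    out = []
    for node, nbrs in enumerate(neighbors):
        group = [node] + list(nbrs)
        out.append(sum(features[j] for j in group) / len(group))
    return out

def spread(features):
    """Gap between the largest and smallest node feature."""
    return max(features) - min(features)

# Toy 5-bus ring (hypothetical); a single distinctive bus at index 0.
neighbors = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]
x = [1.0, 0.0, 0.0, 0.0, 0.0]
history = [spread(x)]
for _ in range(20):
    x = mean_aggregate(x, neighbors)
    history.append(spread(x))
print(history[0], history[2], history[20])
```

The spread between node features shrinks with every round, so after enough rounds the buses become nearly indistinguishable. Message weights that depend on the current node states are one way to keep such local differences from being averaged away.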

To further verify how the dynamic module affects the optimization behavior of different models, we plot the full training trajectories of GCN and PowerFlowNet before and after adding the dynamic module. Specifically, we record the training loss, validation loss, and cumulative training time for 100 epochs, shown in Figure 4. The training curves further demonstrate the effect of introducing the dynamic module into both GCN and PowerFlowNet. For each architecture, the dynamic variant achieves a faster decrease in training loss during the early epochs and reaches a lower validation loss overall. This indicates that the dynamic module provides more informative and adaptive structural signals, enabling the model to optimize more efficiently and generalize more reliably. Although the dynamic module introduces a slight increase in computation per epoch, the cumulative training time grows linearly and remains well controlled. The additional cost is therefore acceptable relative to the improvement in convergence speed and final accuracy.

Overall, the results suggest that the proposed dynamic message-passing mechanisms offer a favorable balance between accuracy, scalability, and computational efficiency, making them suitable for large-scale power flow approximation and real-time operational applications. Their ability to exploit network structure while maintaining low inference cost underscores their practical value in modern power systems with increasing size and variability.

5. Conclusions

This work introduces a dynamical learning framework for GNNs that enhances power flow and optimal power flow analysis by explicitly modeling the evolving structure of modern power grids. Through Dynamic Topological Learning and Adaptive Message Passing, the proposed approach enables GNNs to adapt to state-dependent topology changes, leading to more accurate and robust representations. More importantly, the framework is model-agnostic and can be easily integrated with various GNN architectures. Extensive experiments on multiple benchmark systems, including the large-scale RTE6470 network, show that the dynamic modules consistently improve accuracy, generalization, and training behavior across different architectures, with only modest computational overhead. These results indicate that incorporating dynamic structural information is essential for scalable, data-driven grid analysis and can support real-time or near-real-time applications in increasingly complex and variable power system environments.

Author Contributions

S.H.: Conceptualization, Methodology, Software, Formal analysis, Data curation, Writing—original draft, Visualization. J.L.: Conceptualization, Investigation, Writing—review & editing, Supervision. R.Z.: Conceptualization, Investigation, Writing—review & editing, Supervision. Z.L. and J.X.: Conceptualization, Investigation, Resources, Writing—review & editing, Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of China Southern Power Grid [Project No.030000KC23070014(GDKJXM20230797)].

Data Availability Statement

The data presented in this study will be made available by the authors on request.

Conflicts of Interest

Authors Shu Huang, Jining Li, Ruijiang Zeng, and Zhiyong Li are employed by Electric Power Research Institute of Guangdong Power Grid Co., Ltd. The authors declare that this study received funding from the Science and Technology Project of China Southern Power Grid [Project No.030000KC23070014(GDKJXM20230797)], for which Jin Xu is the principal investigator. The funder was not involved in the study design, the collection, analysis, or interpretation of data, the writing of the manuscript, or the decision to submit the manuscript for publication. Apart from the affiliations and funding described above, the authors declare that they have no other commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

Figure A1. Mean Absolute Error (MAE) comparison across 19 test cases from the PGLib-OPF benchmark. The results compare baseline GNN models with their dynamic counterparts, highlighting the performance improvements achieved through the proposed dynamic framework.

To further verify the generality and robustness of our proposed framework across diverse grid topologies, we extended our evaluation to include 19 distinct test cases from the PGLib-OPF benchmark [54], ranging from small-scale to large-scale power systems. We compared five standard GNN baselines (GCN, GATv2, GAT, GIN, TransformerConv) against their dynamically enhanced counterparts (D-GNNs). The results on these standard benchmarks consistently support the effectiveness of our proposals.

As shown in Figure A1, Figure A2 and Figure A3, the dynamic framework improved model performance across the benchmark. The average performance of dynamic models (D-AVG) surpassed the static baselines in all 19 tested cases. Notably, in challenging scenarios such as Case 197 and Case 4917, where static models failed to capture the underlying physics (resulting in negative R² scores), our framework successfully restored predictive capability, yielding valid positive scores. In terms of Mean Squared Error, our method demonstrated significant robustness improvements. A striking example is observed in Case 3120, where the static GIN model suffered from severe divergence (MSE > 10.0). In contrast, the D-GIN model stabilized the learning process, reducing the MSE by over 98%. This indicates that the proposed Adaptive Message Passing (AMP) module effectively mitigates the risk of over-smoothing and training instability. While minor performance fluctuations were observed in isolated instances, the overall trend remains positive. As detailed in Figure A1, the dynamic framework reduced the Mean Absolute Error in the majority of cases, with the D-AVG showing a consistent reduction compared to the static average across the benchmark suite.

In summary, these extensive tests on PGLib-OPF confirm that equipping standard GNNs with our DTL and AMP modules consistently enhances their generalization capability and numerical stability, validating the proposed framework as a reliable solution for GNNs in power grid analysis tasks.

Figure A2. Mean Squared Error (MSE) comparison across 19 test cases from the PGLib-OPF benchmark. The results compare baseline GNN models with their dynamic counterparts, highlighting the performance improvements achieved through the proposed dynamic framework.

Figure A3. R-squared (R²) comparison across 19 test cases from the PGLib-OPF benchmark. The results compare baseline GNN models with their dynamic counterparts, highlighting the performance improvements achieved through the proposed dynamic framework.

References

1. Mohanty, A.; Ramasamy, A.; Verayiah, R.; Bastia, S.; Dash, S.S.; Cuce, E.; Khan, T.Y.; Soudagar, M.E.M. Power system resilience and strategies for a sustainable infrastructure: A review. Alex. Eng. J. 2024, 105, 261–279.
2. Chen, H.; Chen, H.; Feng, L.; Lin, B.; Lin, J.; Liu, Y. Robustness Enhancement Method for Power Dispatching Data Network Against Malicious Attacks and Cascading Failures. Guangdong Electr. Power 2025, 38, 28–37.
3. Lin, B.; Deng, H.; Guo, D.; Tang, W.; Liu, G. Thermal Network Model of Dry-type Transformer Considering Local Nusselt Number of Air Duct. Guangdong Electr. Power 2023, 36, 11–20.
4. Wang, W.X.; Lai, Y.C.; Grebogi, C. Data based identification and prediction of nonlinear and complex dynamical systems. Phys. Rep. 2016, 644, 1–76.
5. Saxena, V.; Manna, S.; Rajput, S.K.; Kumar, P.; Sharma, B.; Alsharif, M.H.; Kim, M.K. Navigating the complexities of distributed generation: Integration, challenges, and solutions. Energy Rep. 2024, 12, 3302–3322.
6. Nur, A.; Kaygusuz, A. Load flow analysis with Newton–Raphson and Gauss–Seidel methods in a hybrid AC/DC system. IEEE Can. J. Electr. Comput. Eng. 2021, 44, 529–536.
7. Montoya, O.D.; Garrido, V.M.; Gil-González, W.; Grisales-Norena, L.F. Power flow analysis in DC grids: Two alternative numerical methods. IEEE Trans. Circuits Syst. II Express Briefs 2019, 66, 1865–1869.
8. Chatterjee, S.; Mandal, S. A novel comparison of gauss-seidel and newton-raphson methods for load flow analysis. In Proceedings of the 2017 International Conference on Power and Embedded Drive Control (ICPEDC), Chennai, India, 16–18 March 2017; pp. 1–7.
9. Mohsin, M.Y.; Khan, M.A.M.; Yousif, M.; Chaudhary, S.T.; Farid, G.; Tahir, W. Comparison of Newton Raphson and Gauss Seidal Methods for Load Flow Analysis. Int. J. Electr. Eng. Emerg. Technol. 2022, 5, 1–7.
10. Wu, D.; Wang, P.; Liang, J.; Lu, J.; Xu, J.; Wang, R.; Nie, F. Adaptive Local Modularity Learning for Efficient Multilayer Graph Clustering. IEEE Trans. Signal Process. 2024, 72, 2221–2232.
11. Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4–24.
12. Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81.
13. Xu, K.; Hu, W.; Leskovec, J.; Jegelka, S. How powerful are graph neural networks? arXiv 2018, arXiv:1810.00826.
14. Corso, G.; Stark, H.; Jegelka, S.; Jaakkola, T.; Barzilay, R. Graph neural networks. Nat. Rev. Methods Prim. 2024, 4, 17.
15. Wang, R.; Wang, P.; Wu, D.; Sun, Z.; Nie, F.; Li, X. Multi-view and multi-order structured graph learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 14437–14448.
16. He, H.; Yu, X.; Zhang, J.; Song, S.; Letaief, K.B. Message passing meets graph neural networks: A new paradigm for massive MIMO systems. IEEE Trans. Wirel. Commun. 2023, 23, 4709–4723.
17. Xie, Z.; Wang, M.; Ye, Z.; Zhang, Z.; Fan, R. Graphiler: Optimizing graph neural networks with message passing data flow graph. Proc. Mach. Learn. Syst. 2022, 4, 515–528.
18. Donon, B.; Donnot, B.; Guyon, I.; Marot, A. Graph neural solver for power systems. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
19. Owerko, D.; Gama, F.; Ribeiro, A. Optimal power flow using graph neural networks. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 5930–5934.
20. Lei, X.; Yang, Z.; Yu, J.; Zhao, J.; Gao, Q.; Yu, H. Data-driven optimal power flow: A physics-informed machine learning approach. IEEE Trans. Power Syst. 2020, 36, 346–354.
21. Liu, Y.; Zhang, N.; Wu, D.; Botterud, A.; Yao, R.; Kang, C. Guiding cascading failure search with interpretable graph convolutional network. arXiv 2020, arXiv:2001.11553.
22. Böttcher, L.; Wolf, H.; Jung, B.; Lutat, P.; Trageser, M.; Pohl, O.; Tao, X.; Ulbig, A.; Grohe, M. Solving AC power flow with graph neural networks under realistic constraints. In Proceedings of the 2023 IEEE Belgrade PowerTech, Belgrade, Serbia, 25–29 June 2023; pp. 1–7.
23. Varbella, A.; Gjorgiev, B.; Sansavini, G. Geometric deep learning for online prediction of cascading failures in power grids. Reliab. Eng. Syst. Saf. 2023, 237, 109341.
24. Lin, N.; Orfanoudakis, S.; Cardenas, N.O.; Giraldo, J.S.; Vergara, P.P. PowerFlowNet: Power flow approximation using message passing Graph Neural Networks. Int. J. Electr. Power Energy Syst. 2024, 160, 110112.
25. Chang, L.; Wu, Z. Performance and reliability of electrical power grids under cascading failures. Int. J. Electr. Power Energy Syst. 2011, 33, 1410–1419.
26. Rahnamay-Naeini, M.; Wang, Z.; Ghani, N.; Mammoli, A.; Hayat, M.M. Stochastic analysis of cascading-failure dynamics in power grids. IEEE Trans. Power Syst. 2014, 29, 1767–1779.
27. Song, J.; Cotilla-Sanchez, E.; Ghanavati, G.; Hines, P.D. Dynamic modeling of cascading failure in power systems. IEEE Trans. Power Syst. 2015, 31, 2085–2095.
28. Schäfer, B.; Witthaut, D.; Timme, M.; Latora, V. Dynamically induced cascading failures in power grids. Nat. Commun. 2018, 9, 1975.
29. Younesi, A.; Shayeghi, H.; Wang, Z.; Siano, P.; Mehrizi-Sani, A.; Safari, A. Trends in modern power systems resilience: State-of-the-art review. Renew. Sustain. Energy Rev. 2022, 162, 112397.
30. Alexandridis, A.T. Modern power system dynamics, stability and control. Energies 2020, 13, 3814.
31. Wang, P.; Wu, D.; Xu, J.; Nie, F. Comprehensive Information Extraction with Separable Representation Learning for Multi-View Clustering. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 10679–10692.
32. Lopez-Garcia, T.B.; Domínguez-Navarro, J.A. Power flow analysis via typed graph neural networks. Eng. Appl. Artif. Intell. 2023, 117, 105567.
33. Chen, D.; Lin, Y.; Li, W.; Li, P.; Zhou, J.; Sun, X. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 3438–3445.
34. Qureshi, S. Limits of depth: Over-smoothing and over-squashing in GNNs. Big Data Min. Anal. 2023, 7, 205–216.
35. Li, J.; Zhang, Q.; Liu, W.; Chan, A.B.; Fu, Y.G. Another perspective of over-smoothing: Alleviating semantic over-smoothing in deep GNNs. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 6897–6910.
36. Keriven, N. Not too little, not too much: A theoretical analysis of graph (over) smoothing. Adv. Neural Inf. Process. Syst. 2022, 35, 2268–2281.
37. Dubey, A. Load flow analysis of power systems. Int. J. Sci. Eng. Res. 2016, 7, 79–84.
38. Afolabi, O.A.; Ali, W.H.; Cofie, P.; Fuller, J.; Obiomon, P.; Kolawole, E.S. Analysis of the load flow problem in power system planning studies. Energy Power Eng. 2015, 7, 509–523.
39. Ghiasi, M. A detailed study for load flow analysis in distributed power system. Int. J. Ind. Electron. Control Optim. 2018, 1, 153–160.
40. Rehman, N.; Mufti, M.u.d.; Gupta, N. Power flow analysis in a distribution system penetrated with renewable energy sources: A review. Int. J. Ambient. Energy 2024, 45, 2305701.
41. Chen, T.H.; Chen, M.S.; Hwang, K.J.; Kotas, P.; Chebli, E.A. Distribution system power flow analysis-a rigid approach. IEEE Trans. Power Deliv. 1991, 6, 1146–1152.
42. Yang, X.; Zhu, T.; Wu, D.; Wang, P.; Liu, Y.; Nie, F. Bidirectional fusion with cross-view graph filter for multi-view clustering. IEEE Trans. Knowl. Data Eng. 2024, 36, 5675–5680.
43. Phan, D.; Kalagnanam, J. Some efficient optimization methods for solving the security-constrained optimal power flow problem. IEEE Trans. Power Syst. 2013, 29, 863–872.
44. Burchett, R.; Happ, H.; Wirgau, K. Large scale optimal power flow. IEEE Trans. Power Appar. Syst. 1982, 101, 3722–3732.
45. Cain, M.B.; O’neill, R.P.; Castillo, A. History of optimal power flow and formulations. Fed. Energy Regul. Comm. 2012, 1, 1–36.
46. Khemani, B.; Patil, S.; Kotecha, K.; Tanwar, S. A review of graph neural networks: Concepts, architectures, techniques, challenges, datasets, applications, and future directions. J. Big Data 2024, 11, 18.
47. Nandan, M.; Mitra, S.; De, D. GraphXAI: A survey of graph neural networks (GNNs) for explainable AI (XAI). Neural Comput. Appl. 2025, 37, 10949–11000.
48. Wu, D.; Wang, P.; Lu, J.; Hu, Z.; Zhang, H.; Nie, F. Triangle Topology Enhancement for Multi-view Graph Clustering. IEEE Trans. Knowl. Data Eng. 2025, 37, 4338–4348.
49. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
50. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
51. Shi, Y.; Huang, Z.; Feng, S.; Zhong, H.; Wang, W.; Sun, Y. Masked label prediction: Unified message passing model for semi-supervised classification. arXiv 2020, arXiv:2009.03509.
52. Paulus, M.B.; Maddison, C.J.; Krause, A. Rao-blackwellizing the straight-through gumbel-softmax gradient estimator. arXiv 2020, arXiv:2010.04838.
53. Shekhovtsov, A.; Yanush, V. Reintroducing straight-through estimators as principled methods for stochastic binary networks. In Proceedings of the DAGM German Conference on Pattern Recognition, Bonn, Germany, 28 September–1 October 2021; Springer: Cham, Switzerland, 2021; pp. 111–126.
54. Babaeinejadsarookolaee, S.; Birchfield, A.; Christie, R.D.; Coffrin, C.; DeMarco, C.; Diao, R.; Ferris, M.; Fliscounakis, S.; Greene, S.; Huang, R.; et al. The power grid library for benchmarking ac optimal power flow algorithms. arXiv 2019, arXiv:1908.02788.
55. Varbella, A.; Amara, K.; Gjorgiev, B.; El-Assady, M.; Sansavini, G. PowerGraph: A power grid benchmark dataset for graph neural networks. Adv. Neural Inf. Process. Syst. 2024, 37, 110784–110804.
56. Josz, C.; Fliscounakis, S.; Maeght, J.; Panciatici, P. AC power flow data in MATPOWER and QCQP format: iTesla, RTE snapshots, and PEGASE. arXiv 2016, arXiv:1603.01533.
57. Kulworawanichpong, T. Simplified Newton–Raphson power-flow solution method. Int. J. Electr. Power Energy Syst. 2010, 32, 551–558.
58. Seifi, H.; Sepasian, M.S. Electric Power System Planning: Issues, Algorithms and Solutions; Power System; Springer: Berlin/Heidelberg, Germany, 2011; Volume 49.
59. Hilt, D.E.; Seegrist, D.W. Ridge, a Computer Program for Calculating Ridge Regression Estimates; Department of Agriculture, Forest Service, Northeastern Forest Experiment Station: Upper Darby, PA, USA, 1977; Volume 236.

Figure 1. Overall workflow of the proposed dynamic GNN framework. Each layer first updates node embeddings through a standard GNN layer (or specific models) using the current adjacency matrix. The updated node representations are then fed into the Dynamic Topology Learning (DTL) module to infer a refined, state-aware adjacency matrix. This dynamically updated topology is subsequently used by the Adaptive Message Passing (AMP) module to propagate information more effectively across the grid. By iterating this process over multiple layers, the model jointly evolves node features and graph structure, enabling improved expressiveness for nonlinear power flow relationships.

Figure 2. Visualization of the performance of different GNNs on power grid datasets with respect to power flow tasks. The x-axis represents different datasets, and the y-axis represents the MAE values (scaled by a factor of 100). The lower the MAE value, the better the performance of the model.

Figure 3. Visualization of the performance of different GNNs on power grid datasets with respect to power flow tasks. The x-axis represents different datasets, and the y-axis represents the R² values. The higher the R² value, the better the performance of the model.

Figure 4. Performance comparison between baseline GNNs and our proposed framework on the RTE6470 dataset. We record the MSE values on the training and validation sets and the time taken for each epoch during training. (a) PowerFlowNet vs. D-PowerFlowNet. (b) GCN vs. D-GCN.

Table 1. The statistics of power grid datasets used in our experiments.

Dataset	No. of Buses	No. of Branches	No. of Graphs
IEEE24	24	38	34,944
IEEE39	39	46	34,944
IEEE118	118	186	34,944
UK	29	99	34,944
RTE6470	6470	9005	9000

Table 2. The Mean Absolute Error (MAE) metric (mean ± std) for different GNNs on power grid datasets (the lower, the better) with respect to power flow tasks. All MAE values are scaled by a factor of 100 for easier comparison.

GNNs	IEEE24	IEEE39	IEEE118	UK
GCN	4.7128 ± 0.0100	2.3950 ± 0.0107	1.5197 ± 0.0023	1.6662 ± 0.0046
D-GCN	4.7127 ± 0.0098	2.3947 ± 0.0109	1.5197 ± 0.0022	1.6661 ± 0.0045
GAT	0.1937 ± 0.0882	0.1497 ± 0.0141	0.3024 ± 0.0473	0.1896 ± 0.0257
D-GAT	0.1583 ± 0.0291	0.1421 ± 0.0091	0.2441 ± 0.0626	0.1865 ± 0.0262
TransformerConv	0.1003 ± 0.0145	0.2044 ± 0.0633	0.2181 ± 0.0150	0.1212 ± 0.0112
D-TransformerConv	0.0930 ± 0.0162	0.1765 ± 0.0611	0.2050 ± 0.0116	0.1128 ± 0.0107
GIN	0.1661 ± 0.0317	0.2640 ± 0.0578	0.2366 ± 0.0162	0.2081 ± 0.0378
D-GIN	0.1575 ± 0.0277	0.2409 ± 0.0594	0.2347 ± 0.0209	0.1862 ± 0.0209
GATv2	0.1770 ± 0.0360	0.1493 ± 0.0728	0.2123 ± 0.0074	0.1740 ± 0.0189
D-GATv2	0.1231 ± 0.0545	0.1268 ± 0.0250	0.2092 ± 0.0483	0.1648 ± 0.0098

Table 3. The R-squared (R²) metric (mean ± std) for different GNNs on power grid datasets (the higher, the better) with respect to power flow tasks.

GNNs	IEEE24	IEEE39	IEEE118	UK
GCN	80.0160 ± 0.0906	77.2453 ± 0.0479	64.6385 ± 0.1204	78.3384 ± 0.0207
D-GCN	80.0140 ± 0.0968	77.2564 ± 0.0486	64.6446 ± 0.1112	78.3376 ± 0.0341
GAT	99.0177 ± 0.3546	97.3350 ± 0.1852	91.5309 ± 1.2245	97.5790 ± 0.3614
D-GAT	99.2041 ± 0.0931	97.4132 ± 0.0897	90.3599 ± 1.6303	97.5471 ± 0.3589
TransformerConv	99.4670 ± 0.1111	96.8806 ± 0.4231	92.2099 ± 0.3613	98.4522 ± 0.1608
D-TransformerConv	99.4458 ± 0.0789	97.0806 ± 0.5457	92.3103 ± 0.4932	98.4741 ± 0.1295
GIN	99.0336 ± 0.1458	96.5441 ± 0.3949	92.8728 ± 0.5126	96.9257 ± 0.6847
D-GIN	99.0361 ± 0.1201	96.4809 ± 0.6947	92.5917 ± 0.7841	97.3475 ± 0.4318
GATv2	99.3605 ± 0.1698	97.4395 ± 0.4657	91.9945 ± 0.1292	97.7471 ± 0.2028
D-GATv2	98.9912 ± 0.4153	97.7062 ± 0.1773	92.4052 ± 0.7097	97.8774 ± 0.1356

Table 4. The Mean Absolute Error (MAE) metric (mean ± std) for different GNNs on power grid datasets (the lower, the better) with respect to optimal power flow tasks. All MAE values are scaled by a factor of 100 for easier comparison.

GNNs	IEEE24	IEEE39	IEEE118	UK
GCN	2.1440 ± 0.0225	2.3370 ± 0.0151	1.9482 ± 0.0019	1.2346 ± 0.0034
D-GCN	2.1434 ± 0.0221	2.3323 ± 0.0144	1.9477 ± 0.0017	1.2343 ± 0.0044
GAT	0.1638 ± 0.0458	0.2225 ± 0.0423	0.8631 ± 0.4580	0.3662 ± 0.0698
D-GAT	0.1310 ± 0.0383	0.1867 ± 0.0283	0.4985 ± 0.0386	0.3323 ± 0.0720
TransformerConv	0.1138 ± 0.0100	0.2329 ± 0.0458	0.4523 ± 0.0222	0.2234 ± 0.0063
D-TransformerConv	0.1119 ± 0.0056	0.2111 ± 0.0689	0.4245 ± 0.0431	0.2312 ± 0.0127
GIN	0.2304 ± 0.0658	0.3553 ± 0.0848	0.4675 ± 0.0736	0.5190 ± 0.1906
D-GIN	0.1829 ± 0.0356	0.3108 ± 0.0799	0.4287 ± 0.0987	0.4767 ± 0.1850
GATv2	0.1374 ± 0.0405	0.1915 ± 0.0362	0.3980 ± 0.0489	0.3124 ± 0.1290
D-GATv2	0.1295 ± 0.0363	0.1643 ± 0.0118	0.3896 ± 0.0496	0.3193 ± 0.1545

Table 5. The R-squared (R²) metric (mean ± std) for different GNNs on power grid datasets (the higher, the better) with respect to optimal power flow tasks.

GNNs	IEEE24	IEEE39	IEEE118	UK
GCN	59.8860 ± 0.3496	54.4978 ± 0.2158	21.4624 ± 0.0841	38.2617 ± 0.3328
D-GCN	59.9893 ± 0.4296	54.5684 ± 0.2257	21.4303 ± 0.0643	38.1860 ± 0.1778
GAT	83.1825 ± 1.0431	86.9404 ± 1.6711	55.6799 ± 16.5466	70.9314 ± 2.5535
D-GAT	83.7895 ± 1.1358	87.6967 ± 0.8240	68.3841 ± 2.0546	72.3138 ± 1.9522
TransformerConv	83.0625 ± 0.4244	85.4854 ± 0.7632	64.9955 ± 2.7229	76.9696 ± 0.5374
D-TransformerConv	82.9272 ± 0.4150	85.3481 ± 0.8489	65.1747 ± 3.6664	76.7429 ± 0.3820
GIN	81.9655 ± 1.3397	79.7929 ± 2.5261	61.5569 ± 3.3313	65.5584 ± 6.0861
D-GIN	83.1191 ± 0.6605	80.8815 ± 4.7110	64.7792 ± 3.1787	67.2288 ± 5.4025
GATv2	83.9191 ± 1.3640	86.8021 ± 1.5034	70.6253 ± 1.9491	73.2180 ± 4.6738
D-GATv2	83.1983 ± 3.1170	87.1124 ± 1.8960	71.5982 ± 1.0241	73.2846 ± 5.8866
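
The two metrics reported in Tables 2 to 5 can be computed from raw predictions along the following lines. This is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names `mae_x100`, `r2_percent` and `summarize` are hypothetical, R² is assumed to be reported in percent (as the magnitudes in Tables 3 and 5 suggest), and the "std" is assumed to be the population standard deviation over repeated runs.

```python
import numpy as np


def mae_x100(y_true, y_pred):
    """Mean absolute error, scaled by 100 as stated in the MAE table captions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_true - y_pred))


def r2_percent(y_true, y_pred):
    """Coefficient of determination, expressed in percent (an assumption)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    return 100.0 * (1.0 - ss_res / ss_tot)


def summarize(per_run_scores):
    """Format scores from repeated runs as 'mean ± std' with four decimals."""
    scores = np.asarray(per_run_scores, dtype=float)
    return f"{scores.mean():.4f} ± {scores.std():.4f}"


if __name__ == "__main__":
    # Toy targets and predictions, purely illustrative (not from the paper).
    y_true = [1.0, 2.0, 3.0, 4.0]
    y_pred = [1.1, 1.9, 3.2, 3.8]
    print(mae_x100(y_true, y_pred))
    print(r2_percent(y_true, y_pred))
    print(summarize([1.0, 2.0, 3.0]))
```

Scaling both metrics by 100 puts them on the same footing as the table entries, so a per-run score can be compared directly with a printed cell.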

Table 6. Performance comparison (mean ± std) between different baselines and our proposed framework on the RTE6470 dataset with respect to power flow tasks.

Model	MAE	MSE	R²	Time (s)
Newton–Raphson	0.0000 ± 0.0000	0.0000 ± 0.0000	1.0000 ± 0.0000	1152.5738 ± 3.6797
Iwamoto	0.0000 ± 0.0000	0.0000 ± 0.0000	1.0000 ± 0.0000	1052.1203 ± 5.8212
DC power flow	1.7050 ± 0.0325	4.9985 ± 1.7510	0.7925 ± 0.0086	30.8115 ± 0.8122
Tikhonov	0.5401 ± 0.1132	1.3435 ± 0.3456	0.3753 ± 0.4446	1892.4063 ± 117.5652
MLP	0.1293 ± 0.0677	0.2069 ± 0.1186	0.7809 ± 0.2032	0.3113 ± 0.3529
GCN	0.3058 ± 0.0475	0.5904 ± 0.1248	0.4195 ± 0.1958	1.3745 ± 0.2080
D-GCN	0.2751 ± 0.0436	0.4783 ± 0.1054	0.5300 ± 0.1614	1.5126 ± 0.2291
PowerFlowNet	0.1210 ± 0.0643	0.0915 ± 0.1019	0.8417 ± 0.1909	4.8548 ± 2.5745
D-PowerFlowNet	0.1092 ± 0.0581	0.0746 ± 0.0831	0.8710 ± 0.1564	5.3519 ± 2.1616
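
To read Table 6 in relative terms, the sketch below computes the MAE reduction of each D-variant over its baseline and the speed-up over Newton–Raphson, using only the mean values printed in the table. The helper names `relative_reduction` and `speedup` are hypothetical, and comparing means alone ignores the reported standard deviations, which are large relative to some of the gaps.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


def speedup(reference_time, model_time):
    """How many times faster the model is than the reference solver."""
    return reference_time / model_time


# Mean MAE and mean runtime (seconds) copied from Table 6.
table6 = {
    "Newton-Raphson": {"mae": 0.0000, "time": 1152.5738},
    "GCN": {"mae": 0.3058, "time": 1.3745},
    "D-GCN": {"mae": 0.2751, "time": 1.5126},
    "PowerFlowNet": {"mae": 0.1210, "time": 4.8548},
    "D-PowerFlowNet": {"mae": 0.1092, "time": 5.3519},
}

if __name__ == "__main__":
    for base in ("GCN", "PowerFlowNet"):
        variant = f"D-{base}"
        drop = relative_reduction(table6[base]["mae"], table6[variant]["mae"])
        gain = speedup(table6["Newton-Raphson"]["time"], table6[variant]["time"])
        print(f"{variant}: MAE reduced by {drop:.2f}% vs {base}, "
              f"{gain:.1f}x faster than Newton-Raphson")
```

On these means, both D-variants lower the MAE by roughly a tenth at the cost of a slightly longer runtime than their baselines, while remaining orders of magnitude faster than the iterative solver.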

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
