5.1. Case Study Setup
To verify the prediction accuracy of the proposed PINN proxy model and its generalization ability under small samples, and to assess the operating efficiency of the fully asynchronous parallel computing architecture on a personal computer, the IEEE 39-bus standard test system shown in
Figure 3 is selected as the core verification platform. The system consists of 10 synchronous generators, 39 buses, 46 AC transmission lines, and 12 transformers, with a system base power set to 100 MVA.
The experimental sample set is constructed by Monte Carlo simulation, aiming to cover various operating boundaries of the power system. We assume that the active load and reactive load at each bus are mutually independent and follow a uniform distribution. Centered on the base load of the standard case data, the fluctuation range is set to 20%. The specific random perturbation model is expressed as:
Except for the slack bus 31, the active power output of the remaining generators is redistributed in proportion to the total system load and strictly restricted within the physical constraints of each unit. On this basis, for each generated base operating sample, all 46 transmission lines in the system are traversed to simulate line disconnection faults one by one. Consistent with the definition in Section 2.2, generator and transformer outages are excluded from this simulation to isolate the impact of topological changes on model performance. This yields a contingency set containing both base states and N-1 fault states. The Newton-Raphson method in the MATPOWER 7.1 toolbox is used to solve all of these scenarios. After eliminating non-convergent samples, 100,000 valid samples are retained as ground truth labels. The dataset is randomly divided into a training set, a validation set, and a test set in a ratio of 70%:10%:20%, and Z-score standardization is applied to all input features, namely the injection powers and topology vectors, as well as to the output labels, specifically the voltage magnitudes and phase angles.
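The sampling, splitting, and standardization steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the 20% fluctuation range means a symmetric ±20% band around the base load, and the function names `perturb_loads`, `split_indices`, `zscore_fit`, and `zscore_apply` are hypothetical.

```python
import random
import statistics

def perturb_loads(base_loads, spread=0.20, rng=None):
    """Scale each base load by an independent factor drawn from U(1 - spread, 1 + spread).

    Assumption: the 20% fluctuation range is read as a symmetric +/-20% band.
    """
    rng = rng or random.Random()
    return [p0 * (1.0 + rng.uniform(-spread, spread)) for p0 in base_loads]

def split_indices(n_samples, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle sample indices and cut them into train / validation / test parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(ratios[0] * n_samples)
    n_val = round(ratios[1] * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def zscore_fit(column):
    """Return (mean, std) of one feature column; std falls back to 1.0 for constant columns."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)
    return mu, (sigma if sigma > 0 else 1.0)

def zscore_apply(column, mu, sigma):
    """Standardize a column with previously fitted statistics."""
    return [(x - mu) / sigma for x in column]
```

A common design choice, assumed here rather than stated in the text, is to fit the mean and standard deviation on the training split only and reuse them for the validation and test splits, so that no test-set statistics leak into training.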
For the proposed lightweight PINN model, a compact fully connected neural network is constructed. The network structure and hyperparameter settings are detailed in Table 3. To meet the requirement of the physical equations for continuous high-order derivatives, the hyperbolic tangent function (Tanh) is selected as the activation function for all hidden layers. To balance data fitting and physical constraints during training, the physics-guided weight λ follows a dynamic adjustment strategy, increasing linearly from 0.0 to 1.5 over the course of training. This prompts the model to learn the data distribution quickly in the initial stage and to approximate the physical manifold closely in the later stage.
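The linear ramp of λ can be illustrated with a short Python sketch. It assumes the total loss is the data term plus λ times the physics residual term, which is a common PINN formulation but is not spelled out in this section; `physics_weight` and `total_loss` are hypothetical names, and whether the ramp is applied per epoch or per iteration is likewise an assumption.

```python
def physics_weight(step, total_steps, lam_start=0.0, lam_end=1.5):
    """Linearly interpolate the physics-loss weight from lam_start to lam_end."""
    if total_steps <= 1:
        return lam_end
    # Clamp the progress fraction to [0, 1] so out-of-range steps stay bounded.
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return lam_start + frac * (lam_end - lam_start)

def total_loss(data_loss, physics_loss, lam):
    """Assumed form of the composite objective: data term plus weighted physics term."""
    return data_loss + lam * physics_loss
```

With 101 steps, the weight is 0.0 at step 0, 0.75 at step 50, and 1.5 at step 100, so early updates are driven almost entirely by the labels and later updates increasingly by the power flow residual.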
All experiments are conducted on a single personal computer. The hardware platform is equipped with an Intel Core i7-12700K processor and 32 GB DDR4 3200 MHz memory. The software environment is based on the Windows 10 operating system and Python 3.9, while the deep learning model is built using PyTorch 1.13.1. The model offline training phase utilizes an NVIDIA RTX 3070 GPU for acceleration; however, during the N-1 online scanning and performance evaluation phases, only CPU resources are used to assess the actual throughput capability of the proposed asynchronous parallel architecture.
5.2. Verification of Model Prediction Accuracy and Security Assessment Applications
To comprehensively evaluate the performance of the proposed PINN proxy model, comparative experiments are conducted against two baseline models: a standard data-driven DNN and a graph neural network (GNN). The baseline DNN employs the same four-layer fully connected architecture as the proposed PINN but is trained solely on the data loss, without physical regularization. The performance metrics for the PINN model presented in Table 4 are obtained with the model configuration and training hyperparameters specified in Table 3. The GNN is selected for comparison because of its ability to capture non-Euclidean topological correlations in power grids [14]. By contrasting the performance of these three models on the IEEE 39-bus test set, the specific contributions of topological awareness and physical mechanism constraints to model generalization and interpretability are analyzed.
Three primary metrics are adopted to quantify the prediction performance from different perspectives. The root mean square error (RMSE) measures the global deviation between the predicted values and the ground truth labels and serves as an indicator of overall fitting accuracy. The mean absolute percentage error (MAPE) evaluates the relative prediction accuracy, which is particularly informative for variables with small magnitudes such as voltage phase angles. The physics mismatch metric quantifies the degree to which Kirchhoff’s laws are violated: the predicted state variables are substituted back into the power flow equations and the resulting power imbalance is measured, thereby assessing the physical consistency of the model. The calculation formulas for these metrics are as follows:
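As a minimal sketch, the three metrics could be computed as below. The RMSE and MAPE follow their standard definitions. The physics mismatch is an assumed form, namely the mean magnitude of the complex power imbalance per bus, obtained by inserting the predicted voltages into the nodal power balance S = V · conj(Y V); the exact normalization used in the paper may differ. All quantities are in per unit, and the `eps` guard in `mape` is an added safeguard for labels near zero.

```python
import math

def rmse(pred, true):
    """Root mean square error between predictions and labels."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def mape(pred, true, eps=1e-8):
    """Mean absolute percentage error in percent; eps guards against zero labels."""
    return 100.0 * sum(abs(p - t) / max(abs(t), eps) for p, t in zip(pred, true)) / len(true)

def physics_mismatch(v_mag, v_ang, ybus, s_inj):
    """Mean |S_inj - V * conj(Y V)| over all buses (assumed definition, per unit).

    v_mag, v_ang: predicted voltage magnitudes (p.u.) and angles (rad)
    ybus:         nodal admittance matrix as a nested list of complex numbers
    s_inj:        specified complex power injections per bus
    """
    v = [m * complex(math.cos(a), math.sin(a)) for m, a in zip(v_mag, v_ang)]
    n = len(v)
    total = 0.0
    for i in range(n):
        i_inj = sum(ybus[i][k] * v[k] for k in range(n))  # injected current at bus i
        s_calc = v[i] * i_inj.conjugate()                 # power implied by the prediction
        total += abs(s_inj[i] - s_calc)
    return total / n
```

With the 100 MVA base stated in Section 5.1, a per-unit mismatch is converted to MW by multiplying by 100.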
Table 4 presents the statistical error results of the PINN, the GNN, and the baseline DNN on the test set. Experimental data indicates that under the condition of sufficient labeled data, the GNN achieves the highest prediction accuracy, with a voltage magnitude RMSE of
p.u. This is attributed to the GNN’s superior ability to extract topological features from the graph-structured power grid data. Although the proposed PINN yields a slightly higher RMSE (
p.u.) compared to the GNN, it still significantly outperforms the baseline DNN, achieving an improvement of approximately 15.1%. More importantly, while the GNN excels in statistical fitting, the PINN shows a clear advantage in physical consistency. Its physics mismatch of 0.32 MW is far smaller than the 3.85 MW of the GNN and the 15.42 MW of the DNN, indicating that the PINN finds a solution that balances high accuracy with close adherence to physical laws. This characteristic makes the PINN more reliable in engineering scenarios where physical validity is paramount.
Beyond the performance comparison under full data conditions, a key advantage of the PINN model is its small-sample learning capability. Because high-quality labeled samples are costly to acquire in actual power system operations, this experiment keeps the test set fixed while progressively reducing the training set size. The PINN, GNN, and DNN are retrained using 10%, 20%, 40%, 60%, and 80% of the full training data, and the trends of their average RMSE on the test set are recorded.
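The data-reduction protocol can be sketched in a few lines of Python. This is an illustrative helper under the assumption that each reduced training set is a uniform random subset of the full training split; the name `subsample_training_set` and the seed handling are hypothetical.

```python
import random

# Training-set fractions named in the text, plus the full-data reference point.
FRACTIONS = (0.1, 0.2, 0.4, 0.6, 0.8, 1.0)

def subsample_training_set(train_idx, fraction, seed=0):
    """Return a random subset of the training indices; the test set is never touched."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(fraction * len(train_idx)))
    return random.Random(seed).sample(list(train_idx), k)
```

Each model would then be retrained on `subsample_training_set(train_idx, f)` for every `f` in `FRACTIONS` and scored on the same fixed test indices, so that differences in RMSE reflect only the amount of labeled data.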
As illustrated in Figure 4, the three models behave differently as the training sample size decreases. In the data-rich regime from 80% to 100%, the performance gap between the PINN and the GNN is relatively narrow, which is expected because data-driven models converge when given sufficient information. However, a significant divergence appears in the sparse-data regime between 10% and 40%. The prediction error of the purely data-driven DNN rises sharply due to severe overfitting when labeled data is scarce. The GNN also degrades markedly, with its RMSE surpassing that of the PINN when the training ratio drops below 40%. In contrast, the PINN model is highly resilient; even with only 10% of the data available, its RMSE remains at a low level of
p.u., which is approximately 74% lower than that of the DNN. This comparison indicates that the embedded physical equations act as an effective regularizer, compensating for the lack of explicit labels and supporting robust generalization in small-sample engineering scenarios.
To further quantify the physical interpretability of the model outputs, the physics mismatch metric is analyzed in detail. The statistical results in Table 4 have already shown the average superiority of the PINN. Although the GNN reduces the mismatch to 3.85 MW from the DNN’s 15.42 MW by implicitly learning topology, it still lacks explicit physical constraints. The PINN model reduces this value to 0.32 MW, a reduction of over 90% compared to the GNN.
Figure 5 displays the power mismatch distribution of 100 randomly selected test samples. The residuals of the baseline DNN are widely distributed with a high mean value, indicating that the unconstrained data-driven model fails to guarantee power balance even when the statistical fitting error is low. This is because the DNN optimizes solely for label proximity and ignores the underlying topological constraints. Conversely, the residual distribution of the PINN is tightly concentrated around zero. This demonstrates that the PINN not only fits the input-output mapping but also internalizes Kirchhoff’s laws within the network weights via the physical regularization term. This high degree of physical consistency supports the credibility of N-1 assessment results in engineering applications.
The model performance is further examined on the binary classification task of distinguishing safe operation from limit violations. The allowable range for bus voltage is set to [0.95, 1.05] p.u., and the line thermal limit is set to a loading rate of 100%. The 20,000 contingency scenarios in the test set are divided into safe and violation categories based on ground truth labels, and the evaluation focuses on the false negative rate (FNR) and false positive rate (FPR).
As shown in Table 5, the baseline DNN model exhibits high uncertainty when processing samples near operational boundaries, with an FPR of 4.2% and an FNR as high as 2.8%. In an engineering context, a false negative means that the dispatcher overlooks existing overload or voltage collapse risks, which poses a serious threat to grid security. Benefiting from its topological perception capabilities, the GNN improves classification accuracy to 99.4% and reduces the FNR to 0.5%, demonstrating excellent fault identification performance. The PINN model, however, prioritizes physical safety constraints and further limits the critical FNR to within 0.3% at the cost of a slightly higher false alarm rate. This preference for minimizing false negatives over false positives demonstrates the high reliability of the proposed PINN method in security early warning applications.
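The security classification and its two error rates can be sketched as follows. The thresholds are those given in the text; treating "violation" as the positive class is an assumption consistent with the description of a false negative as a missed overload, and the function names are hypothetical.

```python
def is_violation(v_mags, loadings_pct, v_min=0.95, v_max=1.05, load_limit=100.0):
    """True if any bus voltage leaves [v_min, v_max] p.u. or any line exceeds load_limit %."""
    return (any(v < v_min or v > v_max for v in v_mags)
            or any(l > load_limit for l in loadings_pct))

def error_rates(pred_flags, true_flags):
    """Return (FNR, FPR) with 'violation' as the positive class.

    FNR = missed violations / actual violations
    FPR = false alarms / actual safe cases
    """
    fn = sum(1 for p, t in zip(pred_flags, true_flags) if t and not p)
    fp = sum(1 for p, t in zip(pred_flags, true_flags) if p and not t)
    pos = sum(1 for t in true_flags if t)
    neg = len(true_flags) - pos
    fnr = fn / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return fnr, fpr
```

In use, `is_violation` would be applied once to the ground-truth power flow solution and once to the model prediction for each scenario, and the two resulting flag lists passed to `error_rates`.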
To further explore the adaptability of the model in complex contingency scenarios, a representative critical line disconnection case from the test set is selected for specific analysis.
As shown in Figure 6, when the critical tie line 16–17 is disconnected, the power flow is redistributed drastically, and the loading rate of the adjacent line 16–19 surges to 112.5%, indicating a severe overload. The baseline DNN model, which infers solely from the statistical patterns of historical data, incorrectly predicts the loading rate of this line as 85.4% and therefore classifies the state as safe. In contrast, both the GNN and the PINN capture the impact of the topological change. The GNN, leveraging its message-passing mechanism to aggregate neighbor information, predicts a loading rate of 111.2%. Similarly, because the PINN model explicitly encodes the topology state and is forced to satisfy Kirchhoff’s laws during training, it captures the physical path of power flow transfer and predicts the loading rate as 110.8%. This comparison confirms that while the GNN achieves high precision through structural induction, the PINN achieves comparable robustness through physical regularization, and both methods significantly outperform the plain DNN in handling such complex contingency scenarios.
To further verify the robustness of the proposed dynamic weighting strategy, a sensitivity analysis is conducted on the maximum value reached by the physics weight λ. The selection of this maximum is critical: a value that is too small may fail to impose the physical constraints effectively, while a value that is too large could produce stiff gradients that dominate the data loss and hinder convergence. The PINN model is retrained with the maximum weight varying from 0.5 to 5.0, while the network architecture and other training hyperparameters are kept unchanged.
The statistical results of the model performance under different maximum weight settings are presented in Table 6. When the maximum weight is set to a low value, the constraint strength is insufficient, resulting in a relatively high physics mismatch of 2.15 MW. Conversely, when the maximum weight is excessively large, the physics mismatch is minimized to 0.12 MW, but the voltage RMSE deteriorates slightly to
p.u., indicating that the optimization landscape becomes too difficult for the optimizer to minimize the data fitting term well. Within the intermediate range of settings, however, both the prediction accuracy and physical consistency remain stable. This suggests that the proposed method is not sensitive to the precise selection of this hyperparameter, provided it falls within a reasonable magnitude.
5.3. Verification of Efficiency for Fully Asynchronous Parallel Computing
Having verified the prediction accuracy of the PINN proxy model on the IEEE 39-bus system, this section utilizes the same test system to focus on examining the operating efficiency and resource scheduling characteristics of the proposed fully asynchronous parallel computing architecture in a personal computer environment. The experimental platform remains based on the aforementioned Intel Core i7-12700K processor. This chip adopts a heterogeneous hybrid architecture of 8 performance cores (P-Cores) + 4 efficiency cores (E-Cores). The experiment selects 20,000 N-1 contingency scenarios included in the test set as the standard workload, recording the total computation time and CPU core load status under three modes: serial calculation, barrier-based synchronous parallel (Sync-Parallel), and the proposed fully asynchronous parallel (Async-Parallel).
As shown in Table 7, in single-core Serial mode, scanning the 20,000 scenarios takes a total of 35.42 s, an average of approximately 1.77 ms per scenario. When 12 worker processes are enabled for parallel acceleration, the traditional Sync-Parallel mode shortens the time to 4.82 s, corresponding to a speedup ratio of 7.35. Although the speed is improved, this value is far below the theoretical maximum speedup and fails to fully utilize the computing power of the 12 physical cores. In comparison, the proposed Async-Parallel architecture further reduces the total time to 3.18 s, increasing the speedup ratio to 11.14 and cutting the wall-clock time by 34.0% relative to the synchronous mode. The fundamental reason for this difference lies in the heterogeneous nature of the processor: in Sync-Parallel mode, the system must set a synchronization barrier at the end of each task batch, so the faster P-Cores wait for the slower E-Cores to complete their tasks before entering the next round. This bottleneck severely reduces the overall throughput. The fully asynchronous architecture instead adopts pull-based scheduling: each P-Core fetches a new task from the shared memory queue immediately after completing its current task, without waiting for the E-Cores. Dynamic load balancing is thereby achieved, because each core receives a number of tasks commensurate with its processing capacity.
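The difference between the two scheduling policies can be reproduced with a toy simulation. The sketch below does not launch real processes and does not model the authors' implementation; it only computes the makespan of a barrier-based and a pull-based schedule on workers with different per-task times. The per-task times (1.0 for P-Cores, 2.0 for E-Cores, in arbitrary units) are hypothetical and are not measurements from the paper.

```python
import heapq

def sync_makespan(n_tasks, task_times):
    """Barrier-based: each round hands one task to every worker and waits for the slowest."""
    n_workers = len(task_times)
    elapsed, remaining = 0.0, n_tasks
    while remaining > 0:
        active = task_times[:min(n_workers, remaining)]
        elapsed += max(active)   # the barrier releases only when the slowest worker is done
        remaining -= len(active)
    return elapsed

def async_makespan(n_tasks, task_times):
    """Pull-based: whichever worker becomes free first takes the next task from the queue."""
    free_at = [(0.0, w) for w in range(len(task_times))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_tasks):
        t, w = heapq.heappop(free_at)        # earliest-available worker
        done = t + task_times[w]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, w))   # worker rejoins the pool when its task ends
    return finish

# Hypothetical heterogeneous CPU: 8 fast workers and 4 workers that are twice as slow.
TASK_TIMES = [1.0] * 8 + [2.0] * 4
```

For 1,200 tasks with these toy numbers, the barrier schedule needs 200 time units because every round lasts as long as an E-Core task, whereas the pull schedule finishes in 120 time units because the fast workers process two tasks for each one handled by a slow worker.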
To intuitively quantify the performance differences among the computing modes in a heterogeneous multi-core environment, Figure 7, Figure 8 and Figure 9 display the experimental results from three dimensions: total time consumption, multi-core speedup characteristics, and microscopic core load.
Figure 7 compares the wall-clock time of the Serial, Sync-Parallel, and Async-Parallel modes when processing 20,000 N-1 scenarios. Under the single-core serial baseline, the full scan takes 35.42 s; the traditional synchronous parallel strategy shortens the time to 4.82 s. The proposed asynchronous architecture further reduces the total time to 3.18 s. This roughly 34.0% improvement over the synchronous mode indicates that, by eliminating the global synchronization barrier, the system avoids the idle waiting that the barrier imposes on faster cores, reaching an average of about 0.16 ms per scenario.
Figure 8 plots the speedup ratio as a function of the number of worker processes. The black dashed line in the figure represents the ideal linear speedup. As the core count increases from 1 to 12, the Sync-Parallel mode gradually deviates from the ideal line, showing diminishing marginal returns and reaching only a 7.35× speedup at 12 cores. This is because the slower E-Cores of the i7-12700K processor delay the completion of each batch. In contrast, the proposed Async-Parallel mode stays close to the ideal line, achieving an 11.14× speedup at full core capacity. This indicates that the pull-based dynamic scheduling strategy can effectively cope with hardware heterogeneity and attains high parallel efficiency on shared-memory architectures.
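The speedup figures quoted above follow directly from the wall-clock times in Table 7. The short sketch below recomputes them; the parallel efficiency (speedup divided by the number of workers) is an additional standard measure that is not reported in the text, and it treats all 12 cores as equivalent, which is a simplification for a hybrid CPU.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel wall-clock time."""
    return t_serial / t_parallel

def parallel_efficiency(t_serial, t_parallel, n_workers):
    """Speedup normalized by the worker count (1.0 would be ideal linear scaling)."""
    return speedup(t_serial, t_parallel) / n_workers

# Wall-clock times in seconds as reported for 20,000 scenarios, and the worker count.
T_SERIAL, T_SYNC, T_ASYNC, N_WORKERS = 35.42, 4.82, 3.18, 12
```

With these inputs the speedups are about 7.35 and 11.14, and the corresponding efficiencies are about 0.61 for the synchronous mode and 0.93 for the asynchronous mode.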
The mechanism behind this performance difference is illustrated by the CPU microscopic load sequence in Figure 9. This figure captures a segment of the real-time occupancy rate of a P-Core during the calculation process. In Sync mode, the CPU load exhibits pronounced sawtooth fluctuations, with significant idle intervals between periods of full load. These gaps correspond to the time that high-performance cores spend waiting for low-performance cores to complete the current batch of tasks, known as the straggler effect. Conversely, in Async mode, the non-blocking task pulling mechanism means that the P-Core does not wait after completing the current inference and immediately acquires a new task from the shared memory queue. This mechanism fills the idle gaps and keeps the core load close to 100% throughout the run. This high utilization signifies that the idle wait times inherent in the barrier-based method have been eliminated, so that the physical computing power of the hardware is converted into useful throughput without causing instability. Furthermore, achieving such high throughput on a standard CPU supports the cost-effectiveness of the proposed method for widespread utility deployment.
5.4. Computational Efficiency and High-Dimensional Adaptability Analysis
To further verify the applicability of the proposed method in large-scale power grids, this section extends the test object from the IEEE 39-bus system to the topologically more complex IEEE 118-bus system. As shown in
Figure 10, this system contains 118 buses, 54 generators, and 186 transmission lines, with its state space dimension and the scale of the N-1 contingency set increasing by approximately 3 times compared to the 39-bus system. In this high-dimensional scenario, neural networks face the challenge of the Curse of Dimensionality, where the sparsity of the input feature space increases drastically, typically requiring an exponentially growing number of training samples to maintain the same prediction accuracy. For the IEEE 118 system, the hidden layer scale of the PINN model is moderately deepened to [512, 512, 256, 128], and the input layer dimension is correspondingly extended to 304, including 118 active power injections, 118 reactive power injections, and a 186-dimensional topology vector.
Experimental results indicate that despite the significant increase in system scale, the PINN proxy model maintains excellent generalization performance.
Table 8 presents the changes in various metrics when scaling from 39 buses to 118 buses. It can be observed that the input feature dimension increased by 145%, and the complexity of the physical system grew nonlinearly. Under the same training sample density, the voltage prediction RMSE of the baseline DNN model deteriorated from
p.u. in the IEEE 39 system to
p.u., with an error growth rate as high as 157.8%. This exposes the inadequacy of pure data-driven methods in handling high-dimensional nonlinear mappings, making them prone to falling into local minima. In contrast, relying on strong regularization constraints from physical equations, the PINN model recorded an RMSE of only
p.u. on the IEEE 118 system, limiting the error growth rate to 32.4%. This implies that with the system scale expanding nearly 3 times, the accuracy loss of the PINN is only around 30%, and its error growth rate is far lower than the growth rate of the feature dimension. This suggests that the introduction of physical mechanisms effectively narrows the search range of the high-dimensional solution space, allowing the model to converge along the manifold satisfying Kirchhoff’s laws, thereby demonstrating significant data efficiency advantages when dealing with large-scale power grid problems.
Regarding computational efficiency, a larger system implies a longer single power flow calculation and memory pressure from the growth of the admittance matrix dimensions. Test data show that, supported by the fully asynchronous parallel architecture, the total time for scanning approximately 20,000 N-1 scenarios for the IEEE 118 system is only 8.4 s. The average time per inference increases from 0.16 ms to 0.42 ms, an increase of 162.5%. This increase is broadly consistent with the growth of the input dimension, suggesting that the time cost of the algorithm grows roughly linearly with the system scale rather than exponentially. Here, the In-Memory and copy-on-write (COW) mechanisms described in Section 4.2 play a key role. Because the parameter files for the 118-bus system are large, a traditional multi-process mode would suffer severe throughput bottlenecks from frequent disk I/O and memory copying. However, the memory usage measured for this architecture increases by only 200 MB compared to the 39-bus system, and the CPU cores remain fully loaded throughout. This confirms that the fully asynchronous architecture not only solves the core utilization problem but also effectively mitigates the data transmission overhead caused by the expansion of system scale, so that the assessment efficiency scales nearly linearly with the number of physical cores. The proposed architecture therefore makes effective use of the throughput of shared-memory symmetric multiprocessing systems, rather than being limited by the data movement overhead associated with increased system complexity.
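The timing figures in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text (total scan times and the scenario count of roughly 20,000); the helper names are hypothetical.

```python
def growth_pct(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0

def per_scenario_ms(total_seconds, n_scenarios):
    """Average wall-clock time per scenario in milliseconds."""
    return total_seconds / n_scenarios * 1000.0
```

Dividing the 3.18 s and 8.4 s scan times by 20,000 scenarios gives roughly 0.16 ms and 0.42 ms per scenario, and `growth_pct(0.16, 0.42)` reproduces the quoted 162.5% increase.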