1. Introduction
With energy demand rising continuously, the rapid development of new energy generation has given power grids the “double-high” characteristics: a high proportion of new energy sources and a high proportion of power electronic devices. The integration of large numbers of new energy generation systems, such as photovoltaic (PV) and wind power, into the grid affects its security and stability [1]. Traditional grid stability analysis methods, such as electromechanical transient simulation, operate on a large time scale and cannot reflect the microsecond-level dynamics of power electronic equipment, which limits their accuracy and reliability when analyzing the grid performance of new energy generation systems. To ensure safe and stable grid operation, power system practitioners and new energy manufacturers are seeking more accurate and reliable ways to analyze how new energy power systems and their controllers affect grid stability. Electromagnetic transient (EMT) simulation can accurately describe the dynamics of power electronic devices at small time scales, and therefore plays an increasingly important role in the planning, design, and operation of “double-high” systems.
Widely used EMT simulators mostly employ digital signal processors (DSPs) or general-purpose central processing units (CPUs) for EMT simulation calculations [2,3]. DSPs and CPUs are based on serial computation, and parallelization introduces additional communication overhead. Their simulation step sizes typically range from 10 μs to 100 μs. However, the integration of new energy and power electronic equipment introduces the short-term dynamics of fast switching devices, and traditional DSP or CPU simulation platforms based on serial computing face bottlenecks in simulation step size, parallel processing capability, and computational efficiency. EMT simulation solvers have also been built on field-programmable gate arrays (FPGAs), which can accelerate digital simulation without sacrificing accuracy and enable true parallel processing [4,5]. Research on improving FPGA simulation efficiency and resource utilization [6,7,8] has significantly expanded the application scenarios of FPGA simulation, achieving small-step simulation at 1 μs and below, which suits the fast switching dynamics of systems with a high proportion of new energy and power electronic equipment.
However, the complex dynamics of new power systems impose higher requirements on existing EMT simulation tools in three dimensions: rapidly increasing simulation scale, simulation granularity spanning multiple time scales from microseconds to minutes, and the emergence of numerous simulation scenarios [9,10]. Such tools must simultaneously cover complex control systems and large-scale simulations while accurately simulating the dynamics of fast switching devices on a short time scale. Traditional single-architecture simulation tools adapt poorly to evolving parallel computing architectures and cannot perform differentiated calculations for the complex models in new power systems based on their computational characteristics, so they face challenges such as insufficient scalability and limited performance improvement [11,12,13]. Therefore, there is a need for EMT simulation tools with hybrid computing architectures that systematically allocate computational resources according to the computational characteristics of different parts of a complex model and the performance advantages of different architectures, to achieve optimal computational performance and improve simulation efficiency.
In this paper, we focus on a general-purpose FPGA-based accelerated solver algorithm for offline EMT simulation from three aspects: unifying models, reducing computational time overhead, and improving resource utilization. Building on this, a heterogeneous offline simulation platform based on CPU and FPGA is constructed. This platform utilizes the CPU’s suitability for complex logic and branch computation to complete simulation initialization and accelerate the computation of complex control systems. It leverages the FPGA’s advantages of high parallelism and deep pipelining to compute power electronic converter models that are computationally intensive, require small step sizes, and are time-consuming. Data interaction between the CPU and FPGA is achieved via PCIe. Through the proposed offline simulation accelerated solver and simulation process, heterogeneous parallel offline EMT simulation for general complex models can be realized. Finally, a test case of a grid-connected PV power generation system is built on this heterogeneous offline simulation platform. By comparing the results with those from a single-architecture CPU simulation, the accuracy and efficiency of the heterogeneous offline simulation platform are verified.
2. FPGA-Based General Offline EMT Simulation Solver
Traditional CPU simulation often struggles to achieve small simulation steps for systems containing many power electronic switches and high-frequency dynamics, owing to its inherently serial computing mode and high communication overhead. In contrast, an FPGA, with its hardware-level parallel computing capability and deep pipelining, is well suited to such computationally intensive and highly parallel simulation tasks. Therefore, this paper proposes a general FPGA-based offline EMT simulation solver that cooperates with a CPU to form a heterogeneous simulation architecture. To meet the heterogeneous platform’s requirement for high-speed, low-latency data exchange in each simulation step, a PCIe interface is used for communication between the CPU and the FPGA.
2.1. Solver Framework
Nodal analysis is a widely used method for EMT simulation [14]. Its basic workflow, illustrated in Figure 1, can be divided into the following steps: initialization, component state updating and admittance matrix updating, node injection current calculation, and node voltage equation solving.
In nodal analysis, components are discretized using numerical integration and represented as Norton equivalent circuits (an equivalent admittance in parallel with a historical current source), as shown in Figure 2.
Commonly used EMT simulation components may include passive elements, sources, various types of switches, transformers, and transmission lines. From the perspective of model unification, this paper categorizes different component types into five component update modules based on their historical current calculation formulas. After the components update their historical currents, node current merging is first performed within each group, followed by inter-group node current merging. Simultaneously, the system impedance matrix is updated according to the switch states. Finally, the node voltage equations are solved.
The solver framework is depicted in Figure 3. This framework covers the complete simulation process from parameter initialization to node voltage solution. The following subsections explain the functional design and implementation of each core module in turn.
2.2. Initialization Module
In this paper, the FPGA-based general EMT offline simulation solver framework serves the CPU-FPGA heterogeneous platform. To use computational resources efficiently, complex initialization processes (such as topology analysis, node numbering, component sorting, parameter mapping, and pre-solving the full-state system impedance matrix) are handled by the CPU. Before the simulation computation begins, the initialization results are transmitted to the initialization module of the general solver via PCIe. The FPGA can then read the parameters required for EMT simulation from the initialization module, including the number of nodes, topology information, simulation step size, component parameters, and full-state system impedance matrix parameters (Figure 4).
2.3. Component Update Module
The component update module provides historical current calculation functions for various electrical components, including basic RLC components, two-branch RLC components, three-branch RLC components, source-type components, and switch-type components. Transformer and transmission line models can be implemented through combinations of two-branch or three-branch RLC components [15].
To meet the high-frequency switching requirements of switch-type components, a built-in pulse width modulation (PWM) module is used to modulate the reference wave into a high-frequency PWM signal, thereby controlling the switching elements. Furthermore, after the state of a switch-type element is updated, its status is output to the impedance matrix selection module for updating the system impedance matrix. The framework of the component update module is shown in Figure 5.
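The carrier comparison that a PWM module performs can be illustrated with a minimal Python sketch. It assumes a simple two-level scheme with a symmetric triangular carrier in [-1, 1]; the function names, the carrier shape, and the sampling used to estimate the duty ratio are illustrative assumptions and not details of the built-in module, which must also serve three-level topologies.

```python
def triangle_carrier(t, carrier_period):
    """Symmetric triangular carrier in [-1, 1] (assumed shape)."""
    phase = (t / carrier_period) % 1.0
    if phase < 0.5:
        return 4.0 * phase - 1.0   # rising edge: -1 -> +1
    return 3.0 - 4.0 * phase       # falling edge: +1 -> -1


def pwm_gate(reference, t, carrier_period):
    """Return 1 (switch ON) when the reference exceeds the carrier, else 0."""
    return 1 if reference > triangle_carrier(t, carrier_period) else 0


def duty_ratio(reference, carrier_period, samples=1000):
    """Estimate the ON fraction over one carrier period by sampling."""
    on = sum(
        pwm_gate(reference, i * carrier_period / samples, carrier_period)
        for i in range(samples)
    )
    return on / samples
```

With this carrier, a constant reference r in [-1, 1] yields a duty ratio close to (1 + r)/2, so a slowly varying reference wave is encoded in the width of the gate pulses.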
According to different component types, the process of historical current calculation in the component update module can be divided into two types: historical current calculation for general components and historical current calculation for switching components.
2.3.1. General Component Historical Current Calculation
The component update module retrieves the parameters for each component type from the initialization module and obtains the node voltages from the previous time step from the node voltage equation solving module. It then outputs the calculated historical currents for each component at the current time step to the node current merging module for solving the node voltage equations at this time step.
Based on the Norton equivalent circuit in Figure 2, components are discretized using numerical integration methods (such as the trapezoidal method or the backward Euler method). The historical current calculation for general components, such as RLC branches and sources, can be summarized by Equations (1)–(3):
where k is the component index, t is the simulation time, and dT is the simulation step size. Equation (1) calculates the branch voltage: Uk(t) is the branch voltage vector, Un,Fk(t − dT) is the voltage vector at the first node, and Un,Tk(t − dT) is the voltage vector at the terminal node. Equation (2) calculates the branch current, and Equation (3) calculates the historical branch current. Gk, Pk, and Qk are initialization parameters representing the equivalent admittance matrix, the branch voltage term coefficient matrix, and the branch current term coefficient matrix, respectively.
The general solver proposed in this paper is not inherently tied to any specific numerical integration method. The parameter matrices (Gk, Pk, Qk) in Equations (1)–(3) are calculated and supplied by the CPU during the initialization process. This design renders it compatible with various integration schemes, including but not limited to the backward Euler and trapezoidal methods, depending on the parameters provided.
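How one update routine can serve several integration methods may be illustrated with a scalar Python sketch. It assumes the update has the form u = uF − uT, i = G·u + Ih(previous), Ih = P·u + Q·i, which is one reading of the roles the text assigns to Gk, Pk, and Qk and is not quoted from Equations (1)–(3). The (G, P, Q) values are the standard companion-model coefficients for a single inductor or capacitor; the function names are hypothetical.

```python
def companion_params(kind, value, dT, method):
    """Return (G, P, Q) for one inductor ("L") or capacitor ("C").

    Assumed scalar update form:
        i(t)  = G * u(t) + Ih(t - dT)
        Ih(t) = P * u(t) + Q * i(t)
    """
    if kind == "L":
        if method == "trapezoidal":
            G = dT / (2.0 * value)
            return G, G, 1.0
        if method == "backward_euler":
            return dT / value, 0.0, 1.0
    if kind == "C":
        if method == "trapezoidal":
            G = 2.0 * value / dT
            return G, -G, -1.0
        if method == "backward_euler":
            G = value / dT
            return G, -G, 0.0
    raise ValueError("unsupported component kind or integration method")


def update_component(u_from, u_to, Ih_prev, G, P, Q):
    """One time-step update of a branch; returns (branch current, new Ih)."""
    u = u_from - u_to          # branch voltage
    i = G * u + Ih_prev        # branch current from the Norton equivalent
    Ih = P * u + Q * i         # historical current for the next step
    return i, Ih
```

Under this assumed form, switching from the backward Euler method to the trapezoidal method changes only the three numbers passed to `update_component`, not the routine itself, which is the property the parameterized design relies on.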
2.3.2. Switch Component Update Module
Compared to the update modules for other components, the switch component update module includes an additional step for determining the switch state. It receives switch control signals and outputs the switch status.
Two common switch modeling methods are employed (Figure 6). The first uses a binary resistor to construct the switch model: a small resistance Ron and a large resistance Roff represent the switch’s ON and OFF states, respectively [16]. Since the characteristic equation of a resistor contains no differential term, it has no historical current. However, when the switch state changes, the system’s nodal admittance matrix needs to be updated. The second method models the switch using the associated discrete circuit (ADC) principle [17]. In this method, a small inductance, Lsw, models the ON-state switch, while a small capacitance, Csw, in series with a damping resistor, Rsw, represents the OFF state. By appropriately selecting the values of Lsw and Csw, the equivalent admittance Ysw can be kept identical for both states. This eliminates the need to update the system nodal admittance matrix following a switching event. However, the equation for calculating the historical current must be adapted to the actual state of the switch.
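The matching of the two ADC states can be sketched numerically. The example assumes backward Euler discretization, under which the ON-state inductor has the companion admittance dT/Lsw and the OFF-state series Rsw-Csw branch has 1/(Rsw + dT/Csw); the function names are hypothetical, and the source does not state how the values are actually chosen.

```python
def adc_switch_parameters(Ysw, dT, Rsw=0.0):
    """Pick Lsw and Csw so both switch states share the admittance Ysw.

    Assumes backward Euler companion models:
        ON  (inductor):       G_on  = dT / Lsw
        OFF (series R and C): G_off = 1 / (Rsw + dT / Csw)
    """
    if Ysw <= 0.0 or dT <= 0.0:
        raise ValueError("Ysw and dT must be positive")
    if Rsw < 0.0 or Rsw >= 1.0 / Ysw:
        raise ValueError("Rsw must satisfy 0 <= Rsw < 1/Ysw")
    Lsw = dT / Ysw
    Csw = dT / (1.0 / Ysw - Rsw)
    return Lsw, Csw


def equivalent_admittances(Lsw, Csw, Rsw, dT):
    """Return (G_on, G_off) under the same backward Euler assumption."""
    g_on = dT / Lsw
    g_off = 1.0 / (Rsw + dT / Csw)
    return g_on, g_off
```

Without damping (Rsw = 0) the condition reduces to Lsw·Csw = dT², so the parameter pair is tied to the simulation step size.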
The FPGA-based general offline EMT simulation solver proposed in this paper is compatible with both switch models. Within the switch component update module, Equations (1) and (2) are still used to calculate the branch voltage and branch current. The switch’s ON or OFF state is then determined based on the switching control signal, branch voltage, and branch current results, according to the conduction principle of the specific switch type. For the binary resistor equivalent switch model, the historical current output by the switch component update module will be 0, and the corresponding state value (0 or 1) for each switch is output to the impedance matrix selection module for updating the system impedance matrix. For the ADC equivalent switch model, different historical current calculation formulas are selected based on the switch state.
For example, when the switch is ON, the historical current for the ADC switch is calculated as:
When the switch is OFF, the calculation of the historical current of the ADC switch is as follows:
2.4. Node Current Merging Module
The node current merging module aggregates the historical currents calculated by the component update modules to obtain the node injection currents used for solving the node voltage equations. This process involves two steps: intra-group node current merging and inter-group node current merging, as shown in Figure 7.
The various component update modules operate independently and in parallel within the FPGA. Due to their differing computational complexities, the time required to complete the historical current calculation varies. Therefore, for component update modules that finish their calculations earlier, intra-group node current merging can be performed first. The merging order follows the component sorting results from the initialization parameters to minimize FPGA computation time overhead. The historical currents are sequentially fed into an atomic addition module. Before an addition operation is completed, operations such as reading, modifying, and writing back the node current data are locked to prevent data races during parallel computation, ensuring the final correctness of the addition result.
Based on the grouping of the component update modules, once the intra-group node current merging is complete for all modules, M groups of equal-length vectors are obtained. Inter-group node current merging is then performed, which essentially involves floating-point vector accumulation, implemented using floating-point accumulators.
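The two merging steps can be sketched in sequential Python. The sign convention (a historical current leaves the first node of its branch and enters the terminal node), the use of node 0 as ground, and the function names are assumptions made for illustration. In a sequential loop no locking is needed; the atomic addition module described above addresses the parallel hardware case.

```python
def merge_group(num_nodes, branches, hist_currents):
    """Intra-group merging: scatter-add branch historical currents to nodes.

    branches: list of (from_node, to_node) pairs; node 0 is ground.
    Assumed sign convention: the history source drives current from the
    'from' node to the 'to' node.
    """
    injection = [0.0] * (num_nodes + 1)
    for (f, t), ih in zip(branches, hist_currents):
        injection[f] -= ih
        injection[t] += ih
    return injection[1:]   # drop the ground entry


def merge_groups(group_vectors):
    """Inter-group merging: element-wise sum of M equal-length vectors."""
    return [sum(values) for values in zip(*group_vectors)]
```

For example, merging one group with branches (1, 2) and (2, 0) carrying 1.0 A and 0.5 A gives the injection vector [-1.0, 0.5] for nodes 1 and 2.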
2.5. Impedance Matrix Selection and Update Module
During the initialization process, the CPU pre-solves the full-state system impedance matrices for all possible states of the binary resistor switch models in the system. These matrices are sent and stored in the initialization module of the FPGA general solver. The number of impedance matrices is equal to the number of system states N. The impedance matrix selection and update module can be computed in parallel while the solver is performing the node current merging.
The switch states output by the component update modules form a switch state sequence composed of 0s and 1s (0 for OFF, 1 for ON). Each switch state sequence corresponds to a system state. Based on the switch state sequence at the current time step, the corresponding system impedance matrix is identified. A selector is used to update the impedance matrix, which is then output to the node voltage equation solving module for the subsequent calculation. The implementation block diagram is shown in Figure 8.
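The selection step amounts to turning the switch state sequence into an address. A minimal sketch follows; the bit ordering (first switch as the most significant bit) and the function names are assumptions, and the precomputed matrices are represented here by an ordinary list.

```python
def state_index(switch_states):
    """Pack a switch state sequence (0 = OFF, 1 = ON) into an integer.

    The first switch is treated as the most significant bit (assumed).
    """
    index = 0
    for s in switch_states:
        if s not in (0, 1):
            raise ValueError("switch states must be 0 or 1")
        index = (index << 1) | s
    return index


def select_impedance_matrix(matrices, switch_states):
    """Return the precomputed matrix for the current switch state sequence."""
    return matrices[state_index(switch_states)]
```

Because the lookup is a pure function of the switch states, it does not depend on the node currents and can proceed while they are still being merged.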
For systems containing a large number of switching elements, the number of full-state impedance matrices grows exponentially with the number of switches. To keep the method feasible, this paper adopts the following strategies. First, switch state pre-judgment techniques [18] are used to exclude physically impossible switching combinations, thereby reducing the number of matrices that need to be stored. Second, the ADC switch model is employed for non-critical paths; this model maintains constant admittance during state changes, eliminating the need to update the system matrix. Third, the matrices are stored in off-chip DDR memory, and only the matrix required for the current time step is read into the on-chip BRAM cache during each simulation step, saving FPGA logic resources.
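The storage growth can be made concrete with a short calculation. The sketch assumes dense square matrices of double-precision values (8 bytes each) and one stored matrix per retained switch-state combination; the function names and the numbers in the usage note are illustrative only.

```python
def full_state_count(num_binary_switches):
    """Number of ON/OFF combinations before any pruning."""
    return 2 ** num_binary_switches


def matrix_storage_bytes(num_states, num_nodes, bytes_per_value=8):
    """Bytes needed to store one dense num_nodes x num_nodes matrix per state."""
    return num_states * num_nodes * num_nodes * bytes_per_value
```

For instance, 12 binary-resistor switches in a 30-node system give 4096 combinations and about 29.5 MB of dense matrices, whereas retaining only 64 feasible combinations would need roughly 0.46 MB.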
2.6. Node Voltage Equation Solving Module
Solving the node voltage equations essentially involves a matrix-vector multiplication, as expressed by the following formula:
$$U_n = Y^{-1} I_n \qquad (6)$$
In Equation (6), Un is the node voltage vector, Y^−1 is the inverse of the admittance matrix (that is, the impedance matrix), and In is the node injection current vector.
In this paper, a floating-point multiply–accumulate structure is employed to implement the matrix-vector multiplication, performing multiplication first followed by accumulation. To ensure the correctness of the floating-point accumulation results, the atomic operation module from Section 2.4 is reused during the accumulation process, thereby improving FPGA resource utilization. The node voltage equation solving module is depicted in Figure 9.
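The multiply-then-accumulate structure corresponds to the following sequential sketch, in which the preselected impedance matrix is a list of rows. The function name is hypothetical, and the loop order only mirrors the arithmetic, not the pipelined hardware schedule.

```python
def solve_node_voltages(Z, In):
    """Compute Un = Z * In with an explicit multiply-accumulate loop.

    Z:  impedance matrix (inverse of the admittance matrix), list of rows.
    In: node injection current vector.
    """
    Un = []
    for row in Z:
        acc = 0.0
        for z, i in zip(row, In):
            acc += z * i      # multiply first, then accumulate
        Un.append(acc)
    return Un
```

As a check, the admittance matrix [[2, -1], [-1, 2]] has the inverse [[2/3, 1/3], [1/3, 2/3]], and an injection of [3, 0] then yields node voltages of [2, 1].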
3. Simulation Verification on CPU-FPGA Heterogeneous Platform
3.1. Simulation Environment Setup
Based on the proposed general FPGA-based offline EMT simulation solver, a CPU-FPGA heterogeneous offline simulation environment was constructed for verification. The pure CPU simulation results from CloudPSS (version 4.5), a widely recognized industry software package, were selected as the reference benchmark. Its calculation results have been validated through numerous engineering cases [19,20], demonstrating reliable accuracy and stability, which makes it suitable as a reference for performance comparison and result validation in this study.
The hardware of the heterogeneous offline simulation platform primarily consists of a CPU (AMD Ryzen 9) and an FPGA (XCKU115), as shown in Figure 10. The CPU is responsible for simulation initialization, control system computation, and simulation waveform display, leveraging its strengths in high-level control and scheduling management. The FPGA is responsible for accelerating the EMT simulation of the electrical system, exploiting its capability for parallel high-speed computation. Data interaction between them is achieved via PCIe.
The overall process of the heterogeneous offline simulation is illustrated in Figure 11. After the task starts, the CPU performs simulation initialization, analyzing basic simulation parameters and the simulation topology. Through node numbering and component sorting, it obtains a solution that offers favorable time overhead and resource utilization for FPGA computation, pre-solves the full-state system impedance matrices, and generates the initialization results. These initialization results and the simulation start signal are then transmitted to the FPGA via PCIe.
Upon receiving the simulation start signal, the FPGA executes the EMT simulation computation for the electrical system at the predefined time step dT, based on the proposed general offline EMT simulation solver algorithm. Simultaneously, the CPU performs the control system simulation computation at the same time step dT. After completing the simulation computation for one time step, the FPGA sends a synchronization signal instruction to the CPU, transmits the current time step’s electrical system simulation results to the CPU, and reads the CPU’s current time step control system simulation results for use in the next time step’s electrical system simulation computation.
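One reading of this exchange is a lock-step loop in which both sides advance the same time step using only the data exchanged at the previous synchronization point. The sketch below is purely illustrative: the two step functions are hypothetical stand-ins for the FPGA electrical solver and the CPU control computation, and the two calls are executed one after the other rather than concurrently.

```python
def run_cosimulation(electrical_step, control_step, control0, meas0, num_steps):
    """Lock-step co-simulation with a one-step exchange delay (assumed).

    electrical_step(control) -> measurements   (role of the FPGA)
    control_step(measurements) -> control      (role of the CPU)
    """
    control, meas = control0, meas0
    history = []
    for _ in range(num_steps):
        # Both sides use only data exchanged after the previous step,
        # so they could run concurrently on separate devices.
        new_meas = electrical_step(control)
        new_control = control_step(meas)
        # Synchronization point: exchange results for the next step.
        control, meas = new_control, new_meas
        history.append((meas, control))
    return history
```

The design choice illustrated is that neither side waits for the other within a step, at the cost of a one-step delay in the exchanged quantities.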
3.2. Simulation Case Study
In this paper, a case study of a grid-connected PV power generation system is constructed. The EMT simulation computation for the electrical part of the system runs on the FPGA. It includes a three-level Boost converter circuit, a three-phase three-level neutral-point-clamped (NPC) inverter bridge circuit, AC filters, a three-phase two-winding transformer, etc. The PV cell unit is modeled mathematically and represented by an equivalent controlled current source. The topology is shown in Figure 12.
The simulation of the system’s control part runs on the CPU, encompassing dual-loop control for the converters, maximum power point tracking (MPPT) control, and voltage ride-through strategies. The main control strategy block diagram is shown in Figure 13.
Figure 13a illustrates the block diagram of the dual-loop control for the converter. The grid-side three-phase voltages uabc and currents iabc are transformed via the Park transformation, yielding the voltage components ud, uq and current components id, iq in the synchronous rotating reference frame, respectively. The outer control loops generate the reference signals for the inner current loops: udc_ref is the DC voltage reference and Qref is the given reactive power reference, while udc and Q are the measured DC voltage and reactive power. These signals are processed by proportional-integral (PI) regulators, which output the d-axis reference current id_ref and the q-axis reference current iq_ref for the inner current loops. Subsequently, id_ref and iq_ref are tracked and regulated by the current PI controllers. The resulting signals are then converted back to the three-phase stationary frame via the inverse Park transformation, producing the three-phase modulation signals vabc_ref, which are finally fed into the space vector PWM unit to drive the converter.
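The building blocks of this control chain can be sketched as follows. The example assumes the amplitude-invariant form of the Park transformation with zero-sequence components neglected, and a PI regulator discretized with a rectangular integration rule; these conventions, the class name, and the function names are assumptions, since the source does not specify them.

```python
import math

SHIFT = 2.0 * math.pi / 3.0  # 120-degree phase shift


def park(a, b, c, theta):
    """abc -> dq, amplitude-invariant form (assumed convention)."""
    d = (2.0 / 3.0) * (a * math.cos(theta)
                       + b * math.cos(theta - SHIFT)
                       + c * math.cos(theta + SHIFT))
    q = -(2.0 / 3.0) * (a * math.sin(theta)
                        + b * math.sin(theta - SHIFT)
                        + c * math.sin(theta + SHIFT))
    return d, q


def inverse_park(d, q, theta):
    """dq -> abc, inverse of park() when the zero sequence is zero."""
    a = d * math.cos(theta) - q * math.sin(theta)
    b = d * math.cos(theta - SHIFT) - q * math.sin(theta - SHIFT)
    c = d * math.cos(theta + SHIFT) - q * math.sin(theta + SHIFT)
    return a, b, c


class PIRegulator:
    """Discrete PI regulator with rectangular integration (assumed)."""

    def __init__(self, kp, ki, dT):
        self.kp, self.ki, self.dT = kp, ki, dT
        self.integral = 0.0

    def step(self, error):
        self.integral += self.ki * error * self.dT
        return self.kp * error + self.integral
```

With this convention, a balanced three-phase set of unit amplitude aligned with theta maps to d = 1 and q = 0, so the PI regulators act on quantities that are constant in steady state.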
Figure 13b shows the control block diagram for MPPT of the PV generation system. Its input signals are the output power Ppv and output voltage Vpv of the PV array. These two signals first pass through a delay block to obtain the power Ppv(t − dT) and voltage Vpv(t − dT) from the previous sampling instant. A subtractor then calculates the power increment ΔP and the voltage increment ΔV. A division operation computes ΔP/ΔV, which represents the rate of change in power with respect to voltage. This rate-of-change signal is amplified by a proportional gain k and then compared with the difference between the DC-side reference voltage Vdc and the maximum power point voltage Vmppt. The final output is the duty cycle signal, which is used to control the Boost converter, thereby achieving MPPT for the PV array.
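The delay, subtraction, division, and gain stages of this chain can be sketched as a small stateful block. The class name and the guard that returns zero when ΔV is (nearly) zero are assumptions added to keep the division well defined; the subsequent comparison and duty cycle generation are not modeled.

```python
class PowerSlopeEstimator:
    """Computes k * dP/dV from successive PV power and voltage samples."""

    def __init__(self, k, min_dv=1e-9):
        self.k = k
        self.min_dv = min_dv      # assumed guard against division by zero
        self.prev_p = None
        self.prev_v = None

    def step(self, p_pv, v_pv):
        if self.prev_p is None:
            out = 0.0             # no previous sample yet
        else:
            dp = p_pv - self.prev_p
            dv = v_pv - self.prev_v
            out = 0.0 if abs(dv) < self.min_dv else self.k * dp / dv
        self.prev_p, self.prev_v = p_pv, v_pv   # delay block
        return out
```

The sign of the output indicates on which side of the maximum power point the array operates: positive to the left of the peak of the power-voltage curve and negative to the right.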
The FPGA sends the electrically simulated quantities—such as AC side voltage, AC side current, DC link voltage, and PV cell output voltage—to the CPU. The CPU receives these signals, performs the control system computations, and sends back the controlled source signals and the reference wave signals required for the three-level Boost converter and three-phase three-level inverter to the FPGA. The FPGA then uses its built-in PWM module to modulate the reference waves into high-frequency PWM signals, thereby controlling the power electronic switches in the electrical system and forming a closed loop.
Key parameters for the electrical and control systems are listed in Table 1.
3.3. Simulation Results Comparison
The constructed simulation model case is simulated on both the CPU-FPGA heterogeneous offline simulation platform and a single-architecture CPU simulation platform, using the same simulation time step (1 µs) and discretization integration method (the backward Euler method). The simulation results under different operating conditions are compared.
The primary reason for adopting the backward Euler method is that its results differ very little from those of the trapezoidal method at small time steps (1 µs), and it presents no numerical stability issues [21]. Furthermore, to limit rounding errors, the solver internally employs double-precision floating-point arithmetic for all calculations.
The accuracy of the CPU-FPGA heterogeneous offline simulation is quantified by the maximum relative error and the root mean square (RMS) error, both calculated by comparison with the results from the single-architecture CPU simulation. The maximum relative error is defined as the peak value of the ratio between the instantaneous absolute error and its corresponding CPU-simulated value, computed over a 1 s time window, as given by Equation (7).
Here, the variable X denotes an instantaneous electrical quantity, such as voltage or current.
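The two accuracy metrics can be computed from sampled waveforms as sketched below. The maximum relative error follows the verbal definition given above; the RMS error is taken as the square root of the mean squared difference, which is an assumption because the source does not spell it out. Skipping samples whose reference value is exactly zero is also an assumption, and the function names are hypothetical.

```python
import math


def max_relative_error(test, reference):
    """Peak of |x - r| / |r| over the window, skipping r == 0 (assumed)."""
    if len(test) != len(reference):
        raise ValueError("waveforms must have the same length")
    return max(
        (abs(x - r) / abs(r) for x, r in zip(test, reference) if r != 0.0),
        default=0.0,
    )


def rms_error(test, reference):
    """Root mean square of the sample-wise difference (assumed definition)."""
    if len(test) != len(reference) or not reference:
        raise ValueError("waveforms must be non-empty and of equal length")
    total = sum((x - r) ** 2 for x, r in zip(test, reference))
    return math.sqrt(total / len(reference))
```

In this setting, `reference` would hold the CPU-simulated samples of a quantity over the 1 s window and `test` the corresponding samples from the heterogeneous platform.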
3.3.1. Steady-State Condition
Under the steady-state condition, the system operates at its rated state with constant control commands and no external disturbances. The steady-state simulation waveforms are compared in Figure 14.
Figure 14a,b present the phase A voltage waveform and the three-phase RMS voltage at the AC side, respectively, while Figure 14c,d show the phase A current waveform and the three-phase RMS current, respectively. The simulation waveforms from the CPU-FPGA heterogeneous platform and the pure CPU platform coincide, verifying the accuracy of the heterogeneous platform in steady-state computations. The calculated maximum relative error for this condition is 3.54 × 10^−10, and the maximum RMS error is 2.92 × 10^−12.
3.3.2. Transient Condition
The power reference value in the control system is adjusted to change the output of the PV power generation system. At t = 2 s, the active power reference steps down from 1.0 p.u. to 0.5 p.u., followed by an increase to 0.8 p.u. at t = 2.3 s. The reactive power reference is raised from 0 to 0.2 p.u. at t = 2.5 s and reversed to −0.2 p.u. at t = 2.7 s. The simulation results are shown in Figure 15.
Figure 15a–d display the phase A current, the three-phase RMS current, the system output active power, and the reactive power, respectively. The simulation results show that after the step changes in the active power reference at t = 2 s and t = 2.3 s, the active power from both simulation platforms rapidly and accurately tracks the reference command. The AC current waveforms adjust accordingly, and a brief, theoretically expected fluctuation in reactive power occurs at the instant of each active power transition. Furthermore, after the step changes in the reactive power reference at t = 2.5 s and t = 2.7 s, the reactive power from both platforms also tracks the reference value swiftly and precisely. The results from the heterogeneous offline simulation platform are in excellent agreement with the CPU simulation results, with a maximum relative error of 8.65 × 10^−8 and a maximum RMS error of 3.58 × 10^−12, further validating the correctness of the platform during power control transients.
A three-phase short-circuit fault on the AC side is simulated at t = 5 s, causing a voltage sag at the point of common coupling (PCC). The fault duration is set to 180 ms. The simulation results are shown in Figure 16.
Figure 16a–d present the dynamic responses of the phase A voltage at the PCC, the phase A current at the AC side, the system reactive power, and the active power during the fault, respectively. The simulation demonstrates that when a three-phase short-circuit fault occurs at t = 5 s, causing the voltage to sag to approximately 0.4 p.u., the PV system enters a low-voltage ride-through (LVRT) process. During the fault, the active power decreases following the voltage sag, while the system injects additional reactive power according to the LVRT control strategy to support the grid voltage. After fault clearance, the active power output resumes smoothly and recovers to its reference value at a defined ramp rate. The dynamic responses of voltage, current, and power produced by the CPU-FPGA heterogeneous platform agree closely with the pure CPU simulation results. The calculated maximum relative error is 8.93 × 10^−8, and the maximum RMS error is 4.47 × 10^−12, verifying the effectiveness and accuracy of the proposed platform under grid fault transient conditions.
3.4. Comparison of Computational Efficiency and Computing Resources
Based on the constructed grid-connected PV system case study, the computational efficiency of the CPU-FPGA heterogeneous offline simulation platform and the pure CPU simulation platform is analyzed and compared. The simulation duration is set to 10 s. The computation time consumed by different parts is summarized in Table 2.
The heterogeneous offline simulation platform fully utilizes the FPGA’s characteristics of high parallelism and deep pipelining by placing the electrical system, which contains power electronic devices, into the FPGA for accelerated EMT simulation. The results in Table 2 show that the computation time for each segment of the electrical system on the heterogeneous platform is lower than that on the CPU platform. The computational time for the CPU-based control system accounts for a larger proportion and may fluctuate due to CPU clock jitter. Excluding the initialization process, the computational efficiency of the heterogeneous offline simulation platform is improved by 61.59%, and the total time consumed for the 10 s simulation process is reduced by 54.5%. This verifies the acceleration effect of the proposed FPGA-based general offline EMT simulation solver.
To evaluate the feasibility and scalability of the proposed FPGA solver for general systems, Table 3 presents the main resource consumption on the target FPGA (XCKU115) for systems with different numbers of nodes. Resource consumption increases approximately linearly with the number of nodes, and there is still margin available on the selected device. This indicates that the solver has the potential to handle larger-scale systems.
4. Conclusions
In response to the requirements for security and stability analysis in new power systems, characterized by rapidly increasing simulation scale, simulation granularity spanning multiple time scales, and the emergence of numerous simulation scenarios, this paper proposed a general FPGA-based accelerated solver for offline EMT simulation and constructed a CPU-FPGA heterogeneous offline simulation platform. This platform fully utilizes the CPU’s capability for handling complex control logic and the FPGA’s characteristics of high parallelism and deep pipelining, supporting heterogeneous parallelism between electrical systems and control systems, and enabling efficient simulation of systems with a high proportion of power electronic devices.
A case study of a grid-connected PV power generation system was constructed on this platform. By comparing the results with those from a single-architecture pure CPU simulation, the correctness and effectiveness of the heterogeneous offline simulation platform were verified. The results of the heterogeneous offline simulation platform under both steady-state and transient operating conditions are consistent with those of the pure CPU simulation. Under transient conditions, the maximum relative error is less than 1 × 10^−7 and the maximum RMS error is less than 5 × 10^−12, indicating good simulation accuracy. The minor discrepancies primarily originate from numerical errors introduced by data format normalization or differences in floating-point accumulation order, and are within an acceptable range. Meanwhile, the computational efficiency of the heterogeneous offline simulation platform improved by 61.59%, demonstrating a significant acceleration effect.
Future research can build on this platform to develop complex components and modules that account for multi-level structures, frequency dependency, and magnetic saturation characteristics, supporting a wider range of application scenarios. Moreover, EMT simulation methods involving multiple heterogeneous devices, such as CPU-FPGA-GPU, can be investigated. For ultra-large-scale power systems, zoning and differentiated computation can be performed according to model characteristics to maximize the utilization of simulation resources and improve simulation performance. Additionally, the FPGA’s external interaction interfaces can be used to construct application scenarios for real-time simulation and for the interconnection of multiple heterogeneous simulation devices.
Author Contributions
Conceptualization, T.L. and X.W.; methodology, T.L. and X.W.; software, L.Z. and Q.H.; validation, X.W. and L.Z.; formal analysis, Q.H.; investigation, T.L. and X.W.; resources, L.Z.; data curation, Q.H.; writing—original draft, L.Z.; writing—review and editing, X.W.; project administration, T.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Science and Technology Project of China Southern Power Grid Co., Ltd., Grant Number ZBKJXM20240003.
Data Availability Statement
The data that support the findings of this study are available within the article. Additional data are available from the corresponding author upon reasonable request.
Acknowledgments
The authors would like to thank the reviewers for their valuable comments and suggestions.
Conflicts of Interest
All authors were employed by China Southern Power Grid Company. All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| Abbreviation | Definition |
|---|---|
| PV | Photovoltaic |
| EMT | Electromagnetic transient |
| DSP | Digital signal processor |
| CPU | Central processing unit |
| FPGA | Field-programmable gate array |
| PWM | Pulse width modulation |
| ADC | Associated discrete circuit |
| DDR | Double data rate synchronous dynamic random access memory |
| BRAM | Block random access memory |
| NPC | Neutral-point-clamped |
| MPPT | Maximum power point tracking |
| PI | Proportional-integral |
| RMS | Root mean square |
| PCC | Point of common coupling |
| LVRT | Low-voltage ride-through |
| LUT | Look-up table |
| FF | Flip-flop |
References
1. Zheng, X.; Zhang, D.; Bie, Z.; Wu, X. Distributed robust planning for new power system considering uncertainty and frequency security. Alex. Eng. J. 2025, 127, 690–704.
2. Asad, K.; Muhammad, M.; Jiang, C.; Danish, K. A real–time distributed optimization control for power sharing and voltage restoration in inverter–based microgrids. Ain Shams Eng. J. 2025, 16, 103288.
3. Song, Y.; Chen, Y.; Yu, Z.; Huang, S.; Shen, C. CloudPSS: A high-performance power system simulator based on cloud computing. Energy Rep. 2020, 6, 1611–1618.
4. Abdelhafid, E.A.; Houda, E.A.; Badre, B.; Dokhyl, A.; Saad, M.; Mishari, M.A.; Thamer, A.H.A. Robust control of a wind energy conversion system: FPGA real-time implementation. Heliyon 2024, 10, e35712.
5. Wang, S.; Zhang, C.; Zhao, Q. FPGA-Based modelling and embedded real-time simulation of low-voltage DC distribution system with multiple DESs. Electr. Power Syst. Res. 2025, 245, 111621.
6. Sandra, D.; Milica, J.; Goran, L. Optimized k-Nearest neighbors search implementation on resource-constrained FPGA platforms. Microprocess. Microsyst. 2024, 109, 105089.
7. Li, Y.; Wang, Z.; Fu, X.; Li, P.; Zhao, L.; Wu, X. Fine-grained hardware resource optimization and design for FPGA-based real-time simulation of large-scale renewable energy generations. Int. J. Electr. Power Energy Syst. 2025, 169, 110754.
8. Fahimeh, H.; Alireza, M.; Tarek, O.; Jean-Pierre, D. FPGA-based simulation of grid-tied converters using frequency-dependent network equivalent. Electr. Power Syst. Res. 2025, 252, 112400.
9. Han, X.; Chen, X.; McElroy, M.B.; Liao, S.; Nielsen, C.P.; Wen, J. Modeling formulation and validation for accelerated simulation and flexibility assessment on large scale power systems under higher renewable penetrations. Appl. Energy 2019, 237, 145–154.
10. Zhu, Y.; Zou, J.; He, F.; Yu, S. Multiple-time-scales parameters stability domain construction for grid-connected direct-drive wind power generation system. Energy 2025, 324, 136014.
11. Wu, W.; Li, P.; Fu, X.; Yan, J.; Wang, C. Flexible Shifted-Frequency analysis for Multi-Timescale simulations of active distribution networks. Appl. Energy 2022, 321, 119371.
12. Janesh, R.; Shaahin, F.; Dharshana, M.; Ramin, P. A multi-solver framework for co-simulation of transients in modern power systems. Electr. Power Syst. Res. 2023, 223, 109659.
13. Javier, O.T.; Andrea, T.J.M.; José, R.M. SFA-EMT hybrid simulation of power systems: Application to HVDC systems. Electr. Power Syst. Res. 2026, 252, 112326.
14. Mahseredjian, J.; Dinavahi, V.; Martinez, J.A. Simulation Tools for Electromagnetic Transients in Power Systems: Overview and Challenges. IEEE Trans. Power Deliv. 2009, 24, 1657–1669.
15. Paweł, P. Modelling of multi-winding transformers for short-circuit calculations in the power system—Modelling accuracy and differences in equivalent circuits. Int. J. Electr. Power Energy Syst. 2023, 148, 108971.
16. Yan, X.; Li, Z.; Ding, J.; Zhang, P.; Huang, J.; Wei, Q.; Yu, Z. Computational Efficiency–Accuracy Trade-Offs in EMT Modeling of ANPC Converters: Comparative Study and Real-Time HIL Validation. Energies 2025, 18, 5173.
17. Wang, K.; Xu, J.; Li, G.; Tai, N.; Tong, A.; Hou, J. A Generalized Associated Discrete Circuit Model of Power Converters in Real-Time Simulation. IEEE Trans. Power Electron. 2019, 34, 2220–2233.
18. Gao, S.; Song, Y.; Chen, Y.; Yu, Z.; Zhang, R. Fast Simulation Model of Voltage Source Converters With Arbitrary Topology Using Switch-State Prediction. IEEE Trans. Power Electron. 2022, 37, 12167–12181.
19. Chen, Y.; Gao, S.; Song, Y.; Huang, S.; Shen, C.; Yu, Z. High-performance electromagnetic transient simulation for new-type power system based on cloud computing. Proc. CSEE 2022, 42, 2854–2864.
20. Niu, L.; Yu, Z.; Song, Y.; Tan, Z.; Chen, Y.; Shen, C. An automatic grid partitioning technique of large grids based on CloudPSS. Energy Rep. 2023, 9, 1307–1317.
21. Ferreira, L.F.R.; Bonatto, B.D.; Cogo, J.R.; de Jesus, N.C.; Dommel, H.W.; Marti, J.R. Comparative solutions of numerical oscillations in the trapezoidal method used by EMTP-based programs. In Proceedings of the International Conference on Power Systems Transients (IPST2015), Cavtat, Croatia, 15–18 June 2015; pp. 147–153.
Figure 1.
Flowchart of the classic nodal analysis method for EMT simulation. This diagram outlines the fundamental computational sequence employed in EMT simulation. This process forms the algorithmic foundation for the FPGA-based solver proposed in this paper.
Figure 2.
Norton equivalent circuit representation for discretized EMT components. Each electrical component (e.g., RLC branches, sources) is transformed into a Norton equivalent circuit for numerical integration within a simulation time step. The circuit consists of a constant equivalent admittance (Yeq) in parallel with a historical current source (Ih). The historical current source encapsulates the component’s state from the previous time step, enabling the decoupled solution of the system nodal voltages.
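The Norton equivalent described in this caption can be illustrated with a single inductor. The sketch below is not the paper's implementation: it assumes trapezoidal integration, for which an inductor L with time step Δt has Yeq = Δt/(2L) and a history current Ih = i(t−Δt) + Yeq·v(t−Δt). The circuit (a DC source behind a resistor feeding an inductor to ground), the parameter values, and the function name are hypothetical.

```python
import math


def simulate_rl(vs, r, l, dt, steps):
    """Inductor current after `steps` time steps, using a Norton companion model."""
    g = 1.0 / r              # conductance of the source resistor
    y_eq = dt / (2.0 * l)    # constant equivalent admittance (trapezoidal rule, assumed)
    i_l = 0.0                # inductor current at t = 0
    v = vs - r * i_l         # inductor (node) voltage at t = 0+
    i_h = i_l + y_eq * v     # history current built from the previous step's state
    for _ in range(steps):
        # Nodal equation for the single node: (G + Yeq) * v = G * Vs - Ih
        v = (g * vs - i_h) / (g + y_eq)
        # Branch current from the Norton equivalent: i = Yeq * v + Ih
        i_l = y_eq * v + i_h
        # History current for the next step
        i_h = i_l + y_eq * v
    return i_l


# Hypothetical values: 1 V source, 1 ohm, 1 mH, 10 us step, 5 time constants.
numeric = simulate_rl(vs=1.0, r=1.0, l=1e-3, dt=1e-5, steps=500)
analytic = 1.0 - math.exp(-5.0)
print(numeric, analytic)
```

Because Yeq depends only on L and Δt, the nodal matrix stays constant from step to step, and only the history current has to be refreshed.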
Figure 3.
Framework of the FPGA-based general EMT offline simulation solver.
Figure 4.
Structure and data flow of the initialization module. This module receives pre-processed simulation parameters from the CPU via the PCIe interface. The FPGA-based solver reads these parameters to configure all subsequent computational modules without performing complex topological analysis on-chip, thereby improving overall efficiency.
Figure 5.
Architecture of the component update module. This module calculates the historical current for different component types (e.g., basic RLC, multi-branch RLC, sources, and switches). It retrieves component parameters from the Initialization Module and the previous step’s node voltages.
Figure 6.
Equivalent principle of switch components.
Figure 7.
Block diagram of node current merging module. The merging of historical currents into nodal injection currents is performed in two phases to optimize parallelism and latency: intra-group merging and inter-group merging. Atomic addition operations are used to prevent data races during parallel writes.
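The two-phase merging in this caption can be sketched in software as follows. This is an illustrative sequential model, not the hardware design: each group stands for a set of branches processed together, the sign convention (a history current from node `a` to node `b` is subtracted at `a` and added at `b`) is an assumption, and ground is encoded as node index -1. All names are hypothetical.

```python
GROUND = -1  # assumed encoding for the reference node


def merge_group(branches, n_nodes):
    """Phase 1 (intra-group): accumulate one group's history currents per node."""
    partial = [0.0] * n_nodes
    for from_node, to_node, i_h in branches:
        if from_node != GROUND:
            partial[from_node] -= i_h  # current leaves the 'from' node
        if to_node != GROUND:
            partial[to_node] += i_h    # and enters the 'to' node
    return partial


def merge_node_currents(groups, n_nodes):
    """Phase 2 (inter-group): add the per-group partial sums node by node."""
    partials = [merge_group(branches, n_nodes) for branches in groups]
    return [sum(p[node] for p in partials) for node in range(n_nodes)]


# Hypothetical example: two groups of branches in a two-node network.
groups = [
    [(0, 1, 2.0), (1, GROUND, 0.5)],
    [(GROUND, 0, 1.0), (0, 1, 0.25)],
]
print(merge_node_currents(groups, n_nodes=2))
```

In this model the groups are independent in phase 1, which is what allows them to run in parallel. Grouping also changes the order in which floating-point values are summed, which is consistent with the accumulation-order discrepancies noted in the conclusions.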
Figure 8.
Working principle of the impedance matrix selection and update module. This module takes the binary switch state sequence from the switch component updates as an input address. This address selects the pre-computed system impedance matrix corresponding to the current system topology from a lookup table (populated during initialization). The selected matrix is then forwarded to the node voltage equation solving module for the current time step’s solution.
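The address-based selection in this caption can be sketched as a table lookup. The example below assumes a hypothetical two-node network with two switches, each modeled as a large or small conductance; the conductance values, network layout, and function names are illustrative and not taken from the paper. One impedance matrix is pre-computed per switch-state combination, and the packed switch bits serve as the table index.

```python
from itertools import product

G_ON, G_OFF = 1e3, 1e-6  # hypothetical closed/open switch conductances (S)
G1, G2 = 1.0, 1.0        # hypothetical fixed conductances from each node to ground (S)


def switch_address(states):
    """Pack the binary switch states into an integer address (bit k = switch k)."""
    address = 0
    for k, closed in enumerate(states):
        address |= int(closed) << k
    return address


def invert_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def build_table():
    """Initialization: one impedance matrix per switch-state combination."""
    table = [None] * 4
    for states in product((0, 1), repeat=2):
        gs0 = G_ON if states[0] else G_OFF  # switch 0: node 0 to node 1
        gs1 = G_ON if states[1] else G_OFF  # switch 1: node 1 to ground
        y = [[G1 + gs0, -gs0],
             [-gs0, G2 + gs0 + gs1]]
        table[switch_address(states)] = invert_2x2(y)
    return table


def solve(table, states, i_inj):
    """Per time step: select Z by address, then compute v = Z * i."""
    z = table[switch_address(states)]
    return [sum(z[row][col] * i_inj[col] for col in range(2)) for row in range(2)]


table = build_table()
print(solve(table, (1, 0), [1.0, 0.0]))  # switch 0 closed, switch 1 open
```

The table grows as 2^n with the number of switches n, so this sketch only illustrates the lookup itself, not how a large design would bound the table size.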
Figure 9.
Node voltage equation solving module. This module employs parallel floating-point multipliers to calculate the product of the impedance matrix elements and the node current vector, followed by a pipelined floating-point accumulator (reusing the atomic addition logic) to sum the partial results for each node voltage.
Figure 10.
Hardware composition of the heterogeneous offline simulation platform: (a) hardware architecture of the CPU-FPGA heterogeneous offline simulation platform; (b) photograph of the platform.
Figure 11.
The overall process of heterogeneous offline simulation.
Figure 12.
Electrical topology of PV power generation system.
Figure 13.
Main control strategies for the PV power generation system.
Figure 14.
Comparison of steady-state results.
Figure 15.
Comparison of transient results of power regulation.
Figure 16.
Comparison of transient fault results.
Table 1.
Key parameters of electrical and control systems.
| Category | Parameter Name/Unit | Value |
|---|---|---|
| Electrical System Parameters | RMS value of grid-side line voltage/kV | 10 |
| | AC-side voltage base value/kV | 0.63 |
| | DC-side voltage base value/kV | 1.08 |
| | AC-side filter inductance/H | 1.1 × 10⁻⁴ |
| | AC-side filter cutoff frequency/Hz | 900 |
| | DC bus capacitance/F | 5.5 × 10⁻³ |
| | Boost converter low-voltage side capacitance/F | 6.6 × 10⁻⁵ |
| | Boost converter high-voltage side capacitance/F | 3.36 × 10⁻⁴ |
| | Boost converter filter inductance/H | 2.6 × 10⁻⁴ |
| Control System Parameters | Converter phase-locked loop proportional gain | 50 |
| | Converter phase-locked loop integral gain | 100 |
| | Converter outer-loop PI control proportional gain | 1 |
| | Converter outer-loop PI control integral gain | 50 |
| | Converter inner-loop PI control proportional gain | 0.5 |
| | Converter inner-loop PI control integral gain | 20 |
| | PWM carrier frequency/kHz | 16 |
Table 2.
Efficiency comparison between heterogeneous offline simulation and pure CPU simulation.
Values are computation times for a 10 s simulation. Entries marked "combined" span the CPU and FPGA columns of the heterogeneous offline simulation.

| Process Module | Submodule | Heterogeneous Offline Simulation (CPU) | Heterogeneous Offline Simulation (FPGA) | Pure CPU Simulation |
|---|---|---|---|---|
| Initialization | | 2.48 | / | 0.93 |
| Control System Computation | | 5.18 | / | 5.49 |
| Electrical System Computation | Component Update | / | 1.45 | 7.74 |
| | Node Current Merging | / | 2.20 | 3.38 |
| | Node Voltage Equation Solving | / | 1.75 | 11.94 |
| Others (Measurement/Communication) | | 0.55 (combined) | | 0.43 |
| Total Computation Time (Excluding Initialization) | | 11.13 (combined) | | 28.98 |
| Total Time (Including Initialization) | | 13.61 (combined) | | 29.91 |
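The totals in Table 2 and the 61.59% figure quoted in the conclusions can be cross-checked from the table entries. The short script below is only an arithmetic check: the time values are copied from Table 2, while the dictionary keys and variable names are illustrative labels.

```python
# Per-module computation times for a 10 s simulation, copied from Table 2.
hetero = {
    "control_system": 5.18,            # runs on the CPU
    "component_update": 1.45,          # runs on the FPGA
    "node_current_merging": 2.20,      # runs on the FPGA
    "node_voltage_solving": 1.75,      # runs on the FPGA
    "others": 0.55,                    # measurement/communication
}
pure_cpu = {
    "control_system": 5.49,
    "component_update": 7.74,
    "node_current_merging": 3.38,
    "node_voltage_solving": 11.94,
    "others": 0.43,
}
hetero_init, cpu_init = 2.48, 0.93

hetero_total = sum(hetero.values())
cpu_total = sum(pure_cpu.values())

# Relative reduction in computation time, excluding initialization.
reduction_pct = (cpu_total - hetero_total) / cpu_total * 100
# The same comparison with initialization included.
reduction_incl_init_pct = (
    ((cpu_total + cpu_init) - (hetero_total + hetero_init)) / (cpu_total + cpu_init) * 100
)

print(f"heterogeneous total: {hetero_total:.2f}")                              # 11.13
print(f"pure CPU total:      {cpu_total:.2f}")                                 # 28.98
print(f"reduction excluding initialization: {reduction_pct:.2f}%")             # 61.59%
print(f"reduction including initialization: {reduction_incl_init_pct:.2f}%")   # 54.50%
```

The 61.59% figure therefore corresponds to the reduction in total computation time excluding initialization. With initialization included, the heterogeneous platform's longer setup lowers the reduction to about 54.5%.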
Table 3.
Resource utilization of general FPGA-based accelerated solver.
| Resource | Available | Utilization (%), 16 Nodes | Utilization (%), 32 Nodes | Utilization (%), 64 Nodes | Utilization (%), 128 Nodes |
|---|---|---|---|---|---|
| LUT | 663,360 | 18 | 29 | 61 | 76 |
| FF | 1,326,720 | 15 | 27 | 42 | 52 |
| BRAM | 2160 | 14 | 21 | 27 | 36 |
| DSP | 5520 | 13 | 17 | 33 | 41 |