2.1. Converter Topology and Fault Characteristic Analysis
Multi-phase interleaved parallel DC-DC converters are adopted in this study as the main research object for fault characteristic analysis and diagnostic verification. This topology is widely used in high-power DC microgrids and energy-router interfaces because multiple parallel branches can share the current stress, while carrier phase-shift modulation reduces the equivalent output ripple and improves modular power conversion capability under variable operating conditions. In the DC microgrid considered in this paper, this structure is used for photovoltaic access, energy storage interaction, and bidirectional power exchange. When an open-circuit fault occurs in one phase or one switching device, the original current-sharing relationship among interleaved branches is disturbed, leading to branch-current imbalance, transient waveform distortion, ripple redistribution, and frequency-domain harmonic variation. These characteristics provide representative multi-channel voltage/current responses for the proposed time–frequency EPFCN-based fault diagnosis framework.
Figure 1 illustrates the integrated application scheme of the DC microgrid system proposed in this paper. The system has three ports. Port 1 serves as the DC grid port for bidirectional energy transmission. Port 2 is configured as the photovoltaic port, which integrates renewable energy into the DC microgrid. Port 3 acts as a bidirectional power supply port, which can both supply power and feed surplus energy back into the DC microgrid.
This DC microgrid system enables the reliable integration of distributed photovoltaic renewable energy, energy interaction with the main DC grid, and charge–discharge management of electric vehicles (EVs), effectively improving the energy scheduling flexibility and comprehensive utilization efficiency of the microgrid system.
In this system, the DC grid interface submodule (SM1) adopts a multi-phase interleaved parallel bidirectional DC-DC converter architecture. It consists of filter inductors (L11-L1n), upper-leg IGBTs (Q11-Q1n) and lower-leg IGBTs (Q12-Q1n+1), and is used to realize bidirectional energy transmission between the microgrid and the external main DC grid.
The PV interface submodule (SM2) adopts a multi-phase interleaved parallel unidirectional Boost DC-DC circuit architecture, composed of input inductors (L21-L2n), upper-leg diodes (Q21-Q2n) and lower-leg IGBTs (Q22-Q2n+1), which realizes the unidirectional step-up convergence of photovoltaic energy.
The hardware topology of the electric vehicle interface submodule (SM3) is identical to that of SM1. It also adopts a multi-phase interleaved parallel bidirectional DC-DC converter, including inductors L31-L3n, upper-leg IGBTs Q31-Q3n and lower-leg IGBTs Q32-Q3n+1, to support bidirectional energy management for EV charging and discharging. To simplify the subsequent system-level fault characteristic analysis, this paper selects a typical interleaved parallel structure as a single research object for in-depth deduction and diagnostic verification.
The converters investigated in this study operate under a voltage–current double closed-loop PI control strategy, as shown in Figure 2. The voltage outer loop regulates the DC bus/output voltage and generates the current reference, while the current inner loop regulates the branch current and produces the modulation signal for the interleaved switching devices. The carrier phase-shift modulation is then used to generate the gate signals for different phases, enabling current sharing among parallel branches under normal operating conditions.
The closed-loop controller affects the observed fault signatures after an open-circuit fault occurs. The voltage loop tends to suppress sustained voltage deviation, while the current loop and the remaining healthy branches compensate for the power imbalance. As a result, the fault information is reflected not only in voltage variation, but also in branch-current imbalance, transient waveform distortion, ripple redistribution, and frequency-domain harmonic changes.
This paper investigates the fault diagnosis of devices in energy routers. Because the probability of simultaneous faults in multiple devices is extremely low, only single-device faults are considered.
According to the failure modes and topological locations of the devices in the system, ten operating-state labels are defined in this study: the steady state (F0) and nine abnormal states. The measurement points are chosen as a trade-off between full-dimensional state perception and hardware deployment cost. The system acquires six channels of electrical signals: the DC bus capacitor voltage, the voltage and current of the energy storage (ES) port, the voltage and current of the DC grid port, and the terminal voltage of the photovoltaic (PV) access side.
To meet the demand for large-capacity power transmission, the main circuit of the system generally adopts a multi-phase interleaved parallel architecture, with a carrier phase-shift modulation strategy adopted to share the current stress of each bridge arm. This control architecture can effectively cancel the low-frequency ripple of the bus current and multiply its characteristic frequency according to the number of parallel circuit branches, which greatly reduces the volume and design margin of passive filter components.
Under the ideal fault-free steady-state operating condition, for an interleaved DC-DC converter with n parallel branches, the analytical model of the inductor current for any single phase can be derived as follows.
In the formula, Idc represents the DC bias component of single-phase current; Vin denotes the DC bus input voltage; Ts and ωs correspond to the switching period and its angular frequency respectively; L is the matched filter inductance of each branch; D refers to the steady-state duty cycle of power switches; m stands for the harmonic order after Fourier series expansion. θk is defined as the carrier phase-shift angle between each branch, satisfying θk = 2πk/n.
Based on Kirchhoff’s Current Law, the total system current is the sum of the currents of all normal branches. When an open-circuit fault occurs in a specific branch of the topology, the remaining total current of the system becomes
If the above time-domain features are mapped to the frequency domain for observation, the spectral response of the total current under the open-circuit fault state can be expressed as
where I denotes the intrinsic amplitude of a single-phase branch at the m-th harmonic.
Further analysis of fault characteristics shows that the loss of one phase alters the original ripple state of the system, resulting in an increase in low-frequency harmonic energy. The amplitude distortion magnification ratio hratio of the dominant low-frequency harmonic (m = 1) in the faulty state relative to the equivalent switching frequency harmonic (m = n) in the normal state can be derived as
Herein, δ = 1 when k = q, indicating an open-circuit fault occurs in the corresponding branch, while δ remains 0 under normal operating conditions of other branches. This unified expression not only covers two operating conditions of normal multi-phase interleaved operation and single-phase open-circuit fault, but also establishes a direct physical correlation between the time-domain current waveform and frequency-domain characteristics. Specifically, each cosine term in the formula corresponds to a harmonic component at a specific frequency, and their linear superposition directly forms the actually observed time-domain ripple of total current. From the frequency-domain perspective, the combination of amplitude magnitude and phase offset of each harmonic fundamentally determines the spectral distribution distortion law of the system under normal and faulty operating states.
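The harmonic cancellation described above can be checked numerically. The following is a minimal sketch, assuming n identical branches whose m-th harmonic has unit amplitude and phase offset θk = 2πk/n, and assuming that an open-circuit fault simply removes branch q while the remaining branches stay unchanged (closed-loop compensation is ignored). The function name and its arguments are illustrative and do not come from the paper.

```python
import cmath
import math


def total_harmonic(n, m, faulty=None):
    """Phasor sum of the m-th harmonic over n interleaved branches.

    Assumption: every branch contributes a unit-amplitude phasor with the
    carrier phase shift theta_k = 2*pi*k/n; `faulty` is the index q of an
    open-circuit branch (delta = 1 for k == q), or None when healthy.
    """
    total = 0j
    for k in range(n):
        if k == faulty:
            continue  # the open branch carries no ripple current
        total += cmath.exp(-1j * m * 2 * math.pi * k / n)
    return total


if __name__ == "__main__":
    n = 4
    for m in range(1, 2 * n + 1):
        healthy = abs(total_harmonic(n, m))
        fault = abs(total_harmonic(n, m, faulty=1))
        print(f"m={m}: healthy={healthy:.3f}, one branch open={fault:.3f}")
```

In the healthy case only the harmonics with m equal to a multiple of n survive, which corresponds to the ripple-frequency multiplication. With one branch open, every other harmonic reappears with the amplitude of a single branch, including the m = 1 component.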
The time-domain waveform is susceptible to load fluctuations. In practical operation, to achieve accurate fault warning and diagnosis, it is necessary to define clear criteria for distinguishing fault characteristics from normal operating modes. Fault waveforms share similarities with those under normal operating conditions in terms of time-domain features. Reliance solely on time-domain characteristics will lead to low classification accuracy for certain fault waveforms. The introduction of FFT enables the extraction of frequency-domain features to assist time-domain analysis.
Taking Measurement Point 1 as an example, Figure 3 presents the frequency-domain waveforms of two typical signals. There are distinct differences in spectral characteristics before and after the fault under steady-state operating conditions. The proportion of low-frequency components increases when a fault occurs. In fault classification, time-domain analysis alone yields limited performance. The combination of frequency-domain information and time-domain features can effectively improve the distinguishability of different fault patterns.
Figure 3 illustrates the two-dimensional time–frequency evolution process of the system when switching from normal steady-state operation to different fault modes (Fault 3 and Fault 7). In the figure, the horizontal axis represents time and the vertical axis denotes frequency, while the color mapping characterizes the local amplitude of frequency-domain components, with dark blue indicating low-energy background noise and dark red representing high-energy concentration. Before the fault occurrence, both sets of time–frequency maps exhibit completely consistent steady-state characteristics. The system maintains a dominant frequency band with highly concentrated energy at a center frequency of approximately 135 Hz, and the surrounding frequency bands present a pure background, which indicates stable system operation without obvious harmonic pollution. Although both cases share a similar steady-state initial state, they follow distinct two-dimensional time–frequency evolution trajectories after the fault occurs. Mapping one-dimensional time-series signals into high-dimensional time–frequency images can effectively amplify the latent features under different fault types. It also verifies that the integration of frequency-domain information can significantly expand the feature boundaries among various fault modes, providing high-quality input data for high-precision classification by subsequent deep learning networks.
2.2. HIL Platform, Dataset Construction, and Signal Preprocessing
The hardware-in-the-loop experimental platform was established based on the RT-LAB real-time simulation system. The DC microgrid model was constructed and compiled on the host computer and then downloaded to the real-time simulator for real-time operation. The built-in I/O board of the simulator was connected to the external signal acquisition and conditioning board to collect the voltage and current signals required for fault diagnosis. During the HIL test, the simulated electrical signals were synchronously acquired and transmitted to the host computer for dataset construction, model training, and edge-side inference verification.
Figure 4 presents the functional structure of the RT-LAB-based fault diagnosis platform and edge deployment architecture. The platform consists of two parts: simulation operation and data acquisition, and edge computing deployment. In the simulation and data acquisition part, the DC microgrid and converter fault model are executed on the RT-LAB real-time simulator, and the voltage/current waveforms are observed through the SCOPE interface. The generated signals are transmitted to the signal acquisition card and the control implementation board for data collection and control verification. In the edge computing part, the acquired data are processed on the PC for model training and testing, and the trained EPFCN model is deployed on the Raspberry Pi 4B for edge-side fault diagnosis. The arrows in the figure indicate the data transmission path from RT-LAB to the acquisition module, PC, and edge computing device.
To reproduce the dynamic operating conditions of a DC microgrid, the output power of the photovoltaic port and the interactive power of the EV port were randomly varied within 0–100% of the rated capacity in each simulation case. The dataset covered one normal state and nine fault states, totaling ten operating states. For each state, 700 simulation cases were conducted under different operating parameters. Therefore, 7000 complete multi-channel diagnostic samples were generated in total.
In each simulation case, six measurement channels were collected synchronously and combined into one multi-channel diagnostic sample. The input size of each sample was 2000 × 6, where 2000 represents the number of sampling points and 6 represents the number of measurement channels. The sampling frequency was 1 kHz, so each diagnostic sample corresponded to an approximately 2 s signal window. Therefore, the dataset contained 42,000 channel-wise signal records in total, calculated as 10 states × 700 cases × 6 channels, while the model-level input was the complete six-channel signal matrix from the same simulation case.
After all complete multi-channel samples were constructed, the dataset was randomly divided into training, validation, and test sets at a ratio of 80%/10%/10%. The complete six-channel signal matrix from each simulation case was used as the minimum division unit, and all synchronized channels belonging to the same case were assigned to the same subset. For the frequency-domain branch, FFT was applied to each channel of the normalized time-domain signal to obtain the corresponding frequency-domain amplitude sequence. The original time-domain signal and the FFT-domain sequence were then used as the two inputs of the proposed EPFCN.
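The case-level split can be sketched as follows. This is a minimal illustration, assuming a plain random shuffle of case indices with the random seed of 42 reported for the experiments. Whether the original split was stratified by class is not stated, and the helper name is hypothetical.

```python
import random


def split_cases(n_cases, ratios=(0.8, 0.1, 0.1), seed=42):
    """Split case indices into train/validation/test subsets.

    Each index stands for one complete 2000 x 6 sample, so all six
    synchronized channels of a case always land in the same subset.
    """
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    n_train = int(round(ratios[0] * n_cases))
    n_val = int(round(ratios[1] * n_cases))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    n_states, cases_per_state, n_channels = 10, 700, 6
    n_cases = n_states * cases_per_state           # 7000 samples
    train, val, test = split_cases(n_cases)
    print(len(train), len(val), len(test))         # 5600 700 700
    print(n_cases * n_channels)                    # 42000 channel-wise records
```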
For the FFT branch, each measured channel was transformed from the time domain to the frequency domain using a 2000-point FFT. With a sampling frequency of fs = 1 kHz and N = 2000 sampling points per diagnostic sample, one FFT window spans 2 s and the frequency resolution is Δf = fs/N = 1000/2000 = 0.5 Hz. No overlapping sliding window was used (overlap ratio 0%), and no additional tapering window was applied, which is equivalent to using a rectangular window. For each signal x[n], the FFT was calculated as
$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1.$$
The magnitude spectrum was normalized by the FFT length:
$$\bar{X}[k] = \frac{|X[k]|}{N}.$$
The normalized magnitude spectrum was used as the frequency-domain input of the FFT branch.
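As a rough numerical check of this normalization, the sketch below evaluates the DFT sum directly with the standard library and divides the magnitude by N. It is an illustration under assumptions: a short synthetic cosine replaces the measured signals, the function name is hypothetical, and a library FFT routine would normally be used for N = 2000.

```python
import cmath
import math


def normalized_magnitude_spectrum(x):
    """Return |X[k]| / N for k = 0..N-1, with X[k] the DFT of x.

    A direct O(N^2) DFT is used for clarity; an FFT gives the same values.
    No tapering window is applied (rectangular window).
    """
    n_points = len(x)
    spectrum = []
    for k in range(n_points):
        acc = 0j
        for n, sample in enumerate(x):
            acc += sample * cmath.exp(-2j * math.pi * k * n / n_points)
        spectrum.append(abs(acc) / n_points)
    return spectrum


if __name__ == "__main__":
    fs = 1000.0          # sampling frequency in Hz (from the text)
    n_fft = 2000         # points per diagnostic sample (from the text)
    print("frequency resolution:", fs / n_fft, "Hz")

    # Synthetic test signal (assumption): amplitude 2.0 at DFT bin 5, N = 64.
    n_demo, bin_index, amplitude = 64, 5, 2.0
    x = [amplitude * math.cos(2 * math.pi * bin_index * n / n_demo)
         for n in range(n_demo)]
    mag = normalized_magnitude_spectrum(x)
    print("peak bin:", max(range(n_demo // 2 + 1), key=mag.__getitem__))
```

With this normalization, a real sinusoid of amplitude A that falls exactly on a bin appears as A/2 at that bin and at its mirror bin N − k.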
With the diagnostic data sampling frequency of 1 kHz, the corresponding Nyquist frequency is 500 Hz. The FFT-domain input was constructed within the 0–500 Hz diagnostic band. The frequency-domain comparison of representative normal and faulty samples shows that the fault-induced variations are mainly reflected in spectral energy redistribution, ripple-envelope modulation, and low- to mid-frequency harmonic changes within this band. Therefore, the selected diagnostic sampling setting provides the frequency-domain information used by the EPFCN for fault classification.
The warning interval in the implemented diagnosis setting is determined by the diagnostic window length. Since each diagnostic sample contains 2000 sampling points and the sampling frequency is 1 kHz, one diagnostic decision corresponds to an approximately 2 s signal window from fault-related signal acquisition to diagnosis output.
2.3. Proposed EPFCN Architecture and Training Strategy
To clarify the structural difference between the proposed EPFCN and existing CNN/FCN-based diagnostic architectures, a comparison is shown in Figure 5. Single-domain CNN/FCN models usually use either time-domain waveforms or frequency-domain features as the input, which may lose complementary fault information. General time–frequency CNN/FCN models introduce both time-domain and frequency-domain information, but these features are often stacked at the input side or processed through a common feature extraction path. In contrast, the proposed EPFCN adopts two independent branches. The time-domain branch extracts transient waveform distortion and dynamic response features, while the FFT-domain branch extracts spectral variation and harmonic-related features. The two types of features are fused only after branch-specific feature extraction. In addition, ECA modules are embedded in the convolutional feature extraction process to enhance channel-sensitive fault responses. Therefore, the proposed EPFCN differs from existing CNN/FCN-based structures by combining branch-separated time–frequency feature extraction, ECA-based channel enhancement, and a lightweight Conv1D-based implementation.
The proposed EPFCN is designed as a time–frequency dual-branch diagnostic network, as shown in Figure 6. The network contains a time-domain branch and an FFT-domain branch. The time-domain branch takes the normalized raw signal as input, with an input size of 2000 × 6, corresponding to 2000 sampling points and six measurement channels. This branch consists of six Conv1D layers, and the numbers of filters are set to 16, 16, 32, 32, 64, and 64, respectively. The FFT-domain branch takes the corresponding frequency-domain sequence as input, also with an input size of 2000 × 6. This branch consists of three Conv1D layers, with the numbers of filters set to 16, 32, and 64, respectively.
For all Conv1D layers in the proposed EPFCN, the convolution kernel size is set to 3, the stride is set to 1, and the padding mode is set to valid. The ReLU function is used as the nonlinear activation function. ECA modules are embedded in the convolutional feature extraction process to enhance fault-sensitive channel responses, and the ECA kernel size is set to 3. After branch-specific feature extraction, global average pooling is applied to compress the feature maps of both branches. The time-domain and FFT-domain features are then fused by concatenation. The fused feature vector is further processed by batch normalization and a dense layer with 128 neurons and ReLU activation. Finally, a dense layer with 10 neurons and Softmax activation is used to output the ten-class diagnosis result.
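The layer settings above fix the feature-map sizes of both branches. The short sketch below traces them, assuming the Conv1D layers are applied back to back with no intermediate pooling (the layer description does not list any). The function names are illustrative.

```python
def conv1d_valid_length(length, kernel_size=3, stride=1):
    """Output length of a Conv1D layer with 'valid' padding."""
    return (length - kernel_size) // stride + 1


def branch_output_shape(input_length, filters):
    """Trace (length, channels) through a stack of valid Conv1D layers."""
    length = input_length
    for _ in filters:
        length = conv1d_valid_length(length)
    return length, filters[-1]


if __name__ == "__main__":
    time_shape = branch_output_shape(2000, [16, 16, 32, 32, 64, 64])
    fft_shape = branch_output_shape(2000, [16, 32, 64])
    print("time branch before pooling:", time_shape)   # (1988, 64)
    print("FFT branch before pooling:", fft_shape)     # (1994, 64)
    # Global average pooling leaves one value per channel in each branch.
    fused_dim = time_shape[1] + fft_shape[1]
    print("fused feature vector length:", fused_dim)   # 128
```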
As shown in Figure 6, the input of the dual-branch feature fusion network consists of both time-domain and frequency-domain sequences. Let x_t denote the time-domain sequence of the original sampled signal, and x_f represent its corresponding frequency-domain amplitude sequence extracted via the FFT. The constructed dual-domain input feature set can be expressed as
where L denotes the data length of the sampled sequence.
During the feature extraction phase, the 1D convolutional kernel of the l-th layer performs a sliding cross-correlation operation on the output feature map of the (l − 1)-th layer to extract local temporal and spectral dependencies. The discrete mathematical expression for the convolutional output of the c-th channel is given by
where K represents the size of the sliding convolutional kernel, the weight matrix and bias vector of the corresponding channel in the l-th layer are trainable, and * is the convolution operator. To accelerate network convergence and mitigate the vanishing gradient problem, the rectified linear unit (ReLU) is adopted as the nonlinear activation function f(·):
$$f(x) = \max(0, x).$$
To enhance the representation capability of critical state features, an ECA mechanism is embedded after the convolution operation. The channel descriptor generation and feature reweighting process of the ECA module is shown in Figure 7.
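The output feature map of the l-th convolutional layer can be expressed as
$$\mathbf{F}^{l} \in \mathbb{R}^{T_l \times C_l},$$
where T_l denotes the temporal length of the feature map and C_l denotes the number of channels. Global average pooling is first applied along the temporal dimension to obtain the channel descriptor:
$$z_c^{l} = \frac{1}{T_l} \sum_{t=1}^{T_l} F_{t,c}^{l}, \qquad c = 1, \ldots, C_l.$$
The channel descriptor is written as
$$\mathbf{z}^{l} = \left[ z_1^{l}, z_2^{l}, \ldots, z_{C_l}^{l} \right].$$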
The adaptive kernel selection mechanism of ECA determines the local cross-channel interaction range according to the channel number C_l. The kernel size is calculated as
$$k_l = \left| \frac{\log_2 C_l}{\gamma} + \frac{b}{\gamma} \right|_{\mathrm{odd}},$$
where γ and b are mapping parameters, and |·|_odd denotes the nearest odd integer operation. In this study, γ = 2 and b = 1 are adopted. The odd kernel size enables the 1D convolution to model a symmetric local neighborhood around each channel. The implemented odd-integer adjustment is expressed as
In the proposed EPFCN, the main channel dimensions of the convolutional feature maps are 16, 32, and 64. According to the above adaptive rule, the corresponding ECA kernel size is kl = 3. Therefore, the ECA kernel size used in this study is determined by the channel-dimension mapping rather than by an arbitrary manual setting.
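The mapping from channel number to kernel size can be reproduced in a few lines. The sketch below assumes the odd-integer adjustment used in the widely circulated ECA reference code (truncate, then add one if the result is even). Whether the implementation used in this study is identical is not stated, and the function name is illustrative.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive ECA kernel size for a given number of channels.

    Assumed adjustment: truncate (log2(C) + b) / gamma to an integer and
    move to the next odd integer when the truncated value is even.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


if __name__ == "__main__":
    for channels in (16, 32, 64):
        print(channels, "->", eca_kernel_size(channels))  # 3 in each case
```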
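After the adaptive kernel size is obtained, a one-dimensional convolution is performed on the channel descriptor to capture local cross-channel interaction:
$$s_c^{l} = \sum_{j=-(k_l-1)/2}^{(k_l-1)/2} w_j^{l}\, z_{c+j}^{l},$$
where z_c^l is the c-th element of the channel descriptor and w_j^l denotes the trainable coefficient of the 1D convolution kernel. The channel attention weight is then generated by the Sigmoid activation function:
$$a_c^{l} = \sigma\left(s_c^{l}\right) = \frac{1}{1 + e^{-s_c^{l}}}.$$
Finally, the original feature map is recalibrated by channel-wise multiplication:
$$\tilde{F}_{t,c}^{l} = a_c^{l}\, F_{t,c}^{l},$$
where F_{t,c}^l is the feature-map value at time index t and channel c.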
Through this process, the ECA module adaptively selects the channel-interaction range according to the feature-channel dimension and assigns different weights to different fault-sensitive channels without dimensionality reduction. This mechanism enhances the voltage/current channels that contain stronger fault responses while maintaining the lightweight structure of the proposed EPFCN.
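The complete ECA step can be illustrated on a small feature map. The following is a minimal pure-Python sketch, assuming a bias-free channel convolution with zero padding at the two ends of the descriptor. The function names and the example weights are hypothetical, and a real model would use a framework layer instead.

```python
import math


def eca_weights(feature_map, kernel):
    """Channel attention weights for a T x C feature map (list of rows)."""
    t_len, n_ch = len(feature_map), len(feature_map[0])
    # 1) Global average pooling along time gives one descriptor per channel.
    z = [sum(row[c] for row in feature_map) / t_len for c in range(n_ch)]
    # 2) 1D convolution across neighbouring channels (zero padding assumed).
    half = len(kernel) // 2
    s = []
    for c in range(n_ch):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - half
            if 0 <= idx < n_ch:
                acc += w * z[idx]
        s.append(acc)
    # 3) Sigmoid maps each response to a weight in (0, 1).
    return [1.0 / (1.0 + math.exp(-v)) for v in s]


def eca_apply(feature_map, kernel):
    """Rescale every channel of the feature map by its attention weight."""
    a = eca_weights(feature_map, kernel)
    return [[a[c] * value for c, value in enumerate(row)] for row in feature_map]


if __name__ == "__main__":
    fmap = [[1.0, 0.0, 2.0, 4.0],
            [3.0, 0.0, 2.0, 0.0]]          # T = 2 time steps, C = 4 channels
    kernel = [0.2, 0.5, 0.2]               # example weights, kernel size 3
    print(eca_weights(fmap, kernel))
    print(eca_apply(fmap, kernel))
```

The output keeps the T × C shape of the input, and under the bias-free assumption each module adds only three weights for a kernel size of 3.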
By extracting the peak activation values within the local receptive field, this pooling mechanism further endows the network with translation invariance to minor signal shifts.
Both branches of the EPFCN conclude with a global average pooling layer. This operation averages each output feature map over its length, yielding one representative value per convolutional filter. The resulting vectors serve as the feature representations of the time domain and the frequency domain, respectively, and are subsequently used for fault classification. The EPFCN architecture thereby captures the distinct feature expressions of fault signals in both the time and frequency dimensions. The model concatenates the output features of the two branches and passes them to the fully connected layers, constructing an integrated deep learning model. This fusion enhances the overall representational capacity and improves classification performance.
Following feature fusion, the model applies batch normalization to the fused features. These normalized features are then fed into a dense layer for feature mapping. Ultimately, the Softmax activation function is employed to achieve the non-linear classification output for multi-class faults, completing the entire fault recognition task.
Based on the foregoing analysis, this paper proposes a fault warning and diagnosis method for DC microgrids based on a time–frequency dual-branch fully convolutional network. The overall procedure of the method is mainly divided into three stages: dataset construction, model training, and optimal model deployment, as illustrated in Figure 8.
In the data preparation stage, the synchronized multi-channel signals collected from the HIL platform are normalized and transformed into time-domain and FFT-domain inputs. In the model training stage, the proposed EPFCN is trained using the constructed training and validation sets, and the optimal model is selected according to the validation performance. In the deployment stage, the trained model is used for online fault warning and diagnosis on the edge computing platform.
In the model training stage, as shown in Table 1, the Adam optimizer is adopted, with the learning rate set to 0.001 and the batch size set to 128. The loss function is categorical cross-entropy, and the maximum number of training epochs is set to 50. To reduce overfitting, early stopping is used by monitoring the validation accuracy, with the patience set to 5 epochs. In addition, ReduceLROnPlateau is used to adjust the learning rate during training. The reduction factor is set to 0.1, the patience is set to 3 epochs, and the minimum change threshold is set to 1 × 10⁻⁵. The dataset is divided into training, validation, and test sets with a ratio of 80%/10%/10%, and the random seed is fixed at 42 to improve the reproducibility of the experimental results.
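How early stopping and learning-rate reduction interact within the 50-epoch budget can be illustrated with a simplified simulation. This is a sketch under assumptions, not the training code: both rules monitor the same validation-accuracy series here, an improvement must exceed the 1 × 10⁻⁵ threshold, and the function name and the example accuracy values are hypothetical.

```python
def simulate_training_schedule(val_accuracy, lr=1e-3, factor=0.1,
                               lr_patience=3, stop_patience=5,
                               min_delta=1e-5, max_epochs=50):
    """Return [(epoch, learning_rate)] until early stopping or max_epochs."""
    best = float("-inf")
    epochs_without_gain = 0      # counter for early stopping
    epochs_since_lr_change = 0   # counter for learning-rate reduction
    history = []
    for epoch, acc in enumerate(val_accuracy[:max_epochs], start=1):
        if acc > best + min_delta:
            best = acc
            epochs_without_gain = 0
            epochs_since_lr_change = 0
        else:
            epochs_without_gain += 1
            epochs_since_lr_change += 1
            if epochs_since_lr_change >= lr_patience:
                lr *= factor               # plateau: shrink the step size
                epochs_since_lr_change = 0
        history.append((epoch, lr))
        if epochs_without_gain >= stop_patience:
            break                          # early stopping
    return history


if __name__ == "__main__":
    # Hypothetical validation accuracies: three improving epochs, then flat.
    series = [0.50, 0.60, 0.70] + [0.70] * 10
    for epoch, lr in simulate_training_schedule(series):
        print(epoch, lr)
```

With the settings above, the learning rate is reduced once, three epochs into a plateau, and training stops two epochs later if no improvement follows.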
The complete EPFCN contains 50,732 parameters, including 50,476 trainable parameters and 256 non-trainable parameters. Under float32 precision, the approximate parameter memory is 0.1935 MB. The computational complexity is 129,530,876 FLOPs, corresponding to 64,765,438 MACs. In the model complexity test, the average single-sample inference time is 2.993 ms/sample, and the corresponding throughput is 334.09 samples/s.
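The reported parameter count can be cross-checked from the layer settings given earlier. The sketch below assumes standard Conv1D and dense layers with biases, batch normalization with four parameters per feature (two of them non-trainable running statistics), and six bias-free ECA convolutions of kernel size 3. The number of ECA modules is not stated in the text and is inferred here so that the total matches. Function and variable names are illustrative.

```python
def conv1d_params(in_channels, out_channels, kernel_size=3):
    """Weights plus biases of one Conv1D layer."""
    return (kernel_size * in_channels + 1) * out_channels


def stack_params(in_channels, filters):
    """Total parameters of consecutive Conv1D layers."""
    total = 0
    for out_channels in filters:
        total += conv1d_params(in_channels, out_channels)
        in_channels = out_channels
    return total


def epfcn_param_count(n_eca_modules=6, eca_kernel=3):
    """Return (total, trainable, non_trainable) under the stated assumptions."""
    time_branch = stack_params(6, [16, 16, 32, 32, 64, 64])
    fft_branch = stack_params(6, [16, 32, 64])
    fused = 64 + 64                          # concatenated pooled features
    batch_norm = 4 * fused                   # gamma, beta, running mean/var
    non_trainable = 2 * fused                # running mean and variance
    dense_hidden = fused * 128 + 128
    dense_out = 128 * 10 + 10
    eca = n_eca_modules * eca_kernel         # assumed bias-free ECA kernels
    total = (time_branch + fft_branch + batch_norm
             + dense_hidden + dense_out + eca)
    return total, total - non_trainable, non_trainable


if __name__ == "__main__":
    total, trainable, frozen = epfcn_param_count()
    print(total, trainable, frozen)                  # 50732 50476 256
    print(round(total * 4 / 1024 ** 2, 4), "MB")     # float32 storage
```

Under these assumptions the count reproduces the 50,732 parameters, the 256 non-trainable parameters, and the 0.1935 MB float32 footprint quoted above.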
The last stage is optimal model deployment. The trained optimal model is lightweighted and deployed on an edge computing platform. In practical operation, the edge device can directly receive real-time sensing data of the DC microgrid for online inference, thereby realizing rapid early warning and accurate classification of system faults.
In summary, targeting the DC microgrid system shown in Figure 1, this paper proposes a time–frequency dual-stream feature fusion fault warning and diagnosis method based on an EPFCN. Compared with existing studies, the proposed strategy exhibits prominent advantages in the following three respects:
(1) Feature extraction capability and algorithm robustness. Traditional fault identification relies heavily on manual feature engineering, and single-domain features (either time-domain or frequency-domain) tend to fail under complex operating conditions. The proposed EPFCN architecture can adaptively extract and fuse deep time–frequency dual-dimensional features of signals in an end-to-end manner, reducing the dependence on manually designed features. It effectively improves the classification accuracy and robustness of the system under nonlinear disturbances.
(2) Hardware cost and engineering implementability. At the data acquisition terminal, the proposed method uses the voltage and current sensors already present in the underlying control system of the DC microgrid, without additional dedicated high-frequency monitoring hardware. Meanwhile, the algorithm requires a sampling frequency of only 1 kHz, and the preprocessing procedure involves only a basic FFT. The low sampling rate and low computational overhead greatly reduce the communication and computing burden on the underlying hardware, which meets the lightweight deployment requirements of edge devices in practical industrial scenarios.
(3) Topology generalization and system portability. Although this paper takes a specific interleaved parallel topology as the verification object, the underlying diagnosis logic based on time–frequency feature mapping is not strongly coupled with a particular circuit configuration. When applied to other types of DC hybrid energy systems or new power electronic equipment, the method can be migrated by adjusting the corresponding measurement point mapping and state labels, demonstrating potential portability to related converter systems.