An Open-Circuit Fault Diagnosis Method for Three-Level Neutral Point Clamped Inverters Based on Multi-Scale Shuffled Convolutional Neural Network

This study constructs a power switching device open-circuit fault diagnosis model for a three-level neutral point clamped inverter based on the multi-scale shuffled convolutional neural network (MSSCNN) and extracts and classifies the fault information contained in the output current of inverters. The model employs depthwise separable convolution and channel shuffle techniques to improve diagnostic accuracy and reduce model complexity. The experimental results show that the new model has lower model complexity, better noise resistance and higher average diagnostic accuracy compared with fault diagnosis models based on CNN, ResNet, ShuffleNet V2 and Mobilenet V3 networks.


Introduction
The three-level neutral point clamped (NPC) inverter has been widely recognized and applied in the industry due to its high withstand voltage level, low output harmonic content and flexible control methods [1,2].Power switching devices are one of the most vulnerable parts of power converter systems [3].Once a fault occurs, it will cause the entire power converter system to malfunction, potentially damaging the system and threatening personal safety.Given that the three-level NPC inverter employs a greater number of power switching devices compared to the two-level inverter, the probability of failures is higher.Therefore, timely and accurate diagnosis and detection of power switching device faults in the three-level NPC inverter are important for enhancing the reliability and safety of the system operation.
Power switching device faults are primarily categorized into open-circuit and shortcircuit faults [4].In the event of a short-circuit fault, the instantaneous overcurrent can cause severe damage to the inverter.In practical applications, rapid protection is typically achieved using hardware such as fuses.On the other hand, open-circuit faults generally do not immediately lead to system breakdown.However, the output current distortion introduced by open-circuit faults, if not promptly diagnosed, can also result in damage to the inverter or the load.As a result, the focus of most research is predominantly on the diagnosis of open-circuit faults.The diagnostic methods for open-circuit faults can mainly be divided into three categories: model-based, signal-based and artificial intelligence (AI)-based analysis.
Model-based fault diagnosis methods use the mathematical model of the inverter to estimate physical quantities such as voltage or current.By comparing the residual between the estimated values and the actual signals, it can determine whether a fault has occurred and locate the faulty power devices.In [5], a novel adaptive sliding mode observer is proposed to estimate three-phase currents.To ensure the sensitivity and robustness of the fault detection method, an adaptive threshold strategy is designed.In [6], the DC-side voltage is estimated by utilizing PWM signals and current measurements.The estimated value is then compared with the measured value, and fault diagnosis is achieved through voltage residuals.In [7], a dynamic model of a single-phase NPC converter is established, and fault diagnosis is implemented by calculating the rate of change in current residuals based on this model.In [8], fault detection is achieved by using a method based on model calculations of instantaneous output voltage errors.The aforementioned methods require the establishment of accurate mathematical models, exhibit a strong dependency on physical parameters and demand high sampling frequencies.
The signal analysis-based fault diagnosis method entails the analysis of diverse inverter signals using signal processing techniques, followed by the extraction of key fault characteristics.The diagnosis is then achieved by establishing the correspondence between the extracted fault features and the types of faults.In [9], fault detection and localization are implemented by analyzing the changes in the DC bus capacitor voltage and load current after an open-circuit fault in the inverter.In [10], fault detection and the localization of the faulty bridge arm are carried out by calculating the amplitude and phase angle of the current after Park transformation.This is then combined with the average value of the normalized current to locate the faulty device.In [11], current vector radius variation and switching state injection are used to achieve precise fault device localization.Signal-based methods exhibit a strong dependence on the load connected to the inverter output, and their diagnostic effectiveness is easily affected by environmental noise and system disturbances.
AI-based fault diagnosis methods involve hierarchical processing of fault data, gradually extracting and categorizing relevant data features.Intelligent judgments and decisions are then made by establishing a mapping relationship between input data and fault categories.The work in [12] utilizes wavelet packet energy spectrum entropy to extract the characteristics of the midpoint voltage signal in the bridge arm and employs kernel principal component analysis for a dimensionality reduction in the feature vector.Subsequently, fault classification is performed using a wavelet neural network.In [13], the location of the faulty submodule in a modular multilevel converter (MMC) is determined by extracting time-domain features such as variance and mean from the submodule capacitor voltage data and constructing a random forest classifier.The methods mentioned above require complex processing of raw data, increasing the implementation difficulty.Convolutional neural networks (CNNs) have garnered attention for their powerful feature extraction capabilities [14].In [15], a diagnostic method is proposed for rotating machinery, which involves the direct classification of continuous wavelet transform scaleograms (CWTS) using CNN.In [16], the combination of a CNN with a discrete wavelet transform enabled the identification of open-circuit faults in inverters.However, this method necessitates the generation of current vector trajectory maps, thereby increasing the complexity of data processing.The work in [17] applies a CNN based on the inception structure for diagnosing open-circuit faults in PWM converters.However, the diagnostic speed is influenced significantly due to the network's extensive parallel structure.In [18], a one-dimensional CNN with an improved stochastic gradient optimization method is introduced for extracting and classifying inverter fault features.In [19], a multimodal deep residual filter network (DRFN) is proposed, achieving a 99.18% accuracy rate in identifying open-circuit faults in T-type three-level inverters.However, this approach requires the collection of voltage and current data and involves a complex model with high computational demands.
In response to the challenges posed by existing AI-based fault diagnosis methods, such as high model complexity and cumbersome data processing, this paper proposes a diagnostic approach for open-circuit faults in power switching devices of three-level NPC inverters.The method is based on a multi-scale shuffled convolutional neural network (MSSCNN).This approach efficiently extracts features using depthwise separable convolution and channel shuffle techniques.It combines convolution kernels of different sizes to enhance the algorithm's diagnostic effectiveness.Current data, preprocessed for simplicity, are compiled into a dataset for model training and testing.Subsequently, the proposed fault diagnosis method is tested on this dataset alongside the existing CNN [20], ResNet [21], Sensors 2024, 24, 1745 3 of 18 ShuffleNet V2 [22] and MobileNet V3 [23].The effectiveness and application results of the proposed diagnostic method are validated across three dimensions: model complexity, fault recognition accuracy and noise resistance.

Working Principle
The topology of the three-level NPC inverter is shown in Figure 1.Each phase has four switching devices, S x1 , S x2 , S x3 , S x4 (x = A, B, C), and two clamping diodes, D x1 and D x2 .Table 1 shows the three voltage levels that the three-level NPC inverter can output.
convolution and channel shuffle techniques.It combines convolution kernels of different sizes to enhance the algorithm's diagnostic effectiveness.Current data, preprocessed for simplicity, are compiled into a dataset for model training and testing.Subsequently, the proposed fault diagnosis method is tested on this dataset alongside the existing CNN [20], ResNet [21], ShuffleNet V2 [22] and MobileNet V3 [23].The effectiveness and application results of the proposed diagnostic method are validated across three dimensions: model complexity, fault recognition accuracy and noise resistance.

Working Principle
The topology of the three-level NPC inverter is shown in Figure 1.Each phase has four switching devices, Sx1, Sx2, Sx3, Sx4 (x = A, B, C), and two clamping diodes, Dx1 and Dx2.Table 1 shows the three voltage levels that the three-level NPC inverter can output.

Switching State
The desired output voltages of the three-level NPC inverter can be expressed as where Vm * is the desired amplitude of the three-phase sinusoidal reference voltage;  * is the angular frequency; and 0 is the initial phase of the A-phase voltage.
The modulation waveform expression under the space vector modulation strategy can be expressed as follows The desired output voltages of the three-level NPC inverter can be expressed as where V m * is the desired amplitude of the three-phase sinusoidal reference voltage; ω* is the angular frequency; and φ 0 is the initial phase of the A-phase voltage.
The modulation waveform expression under the space vector modulation strategy can be expressed as follows where m is the modulation index; v z is the zero-sequence voltage, obtained by calculating the maximum value v max and minimum value v min of the three-phase sinusoidal reference voltages V A * , V B * and V C * at any given time.By comparing the modulation wave obtained from (2) with two stacked triangular carriers in the same direction and with equal amplitude, the switching sequence of each bridge arm can be determined, as shown in Figure 2.
where m is the modulation index; vz is the zero-sequence voltage, obtained by calculating the maximum value vmax and minimum value vmin of the three-phase sinusoidal reference voltages VA * , VB * and VC * at any given time.
By comparing the modulation wave obtained from (2) with two stacked triangular carriers in the same direction and with equal amplitude, the switching sequence of each bridge arm can be determined, as shown in Figure 2.

Analysis of Fault Characteristics
When an open-circuit fault occurs in the power switching device, the current flow path will change, thus affecting the output voltage level of the inverter.The direction of current flow from the inverter to the load is defined as the positive direction of current (ix > 0).
Taking the open-circuit fault of SA1 in the P state as an example, when the current is in the positive direction, the current will flow from the clamp diode DA1 and SA2 to the load.The actual state will change from a P to O state, and the output voltage will drop from Vdc/2 to 0, as shown in Figure 3a; when the SA2 has an open-circuit fault in the P state, if the current is in the positive direction, the current will flow from the anti-parallel diodes of SA3 and SA4 to the load, the actual state will change from P to N state and the output voltage will drop from Vdc/2 to −Vdc/2, as shown in Figure 3b.If the current is negative, neither SA1 nor SA2 experiencing an open-circuit fault will impact the output of the inverter, as shown in Figure 3c,d.Table 2 summarizes the changes in the inverter's output voltage before and after a single-device open-circuit fault.Taking phase A as an example, Table 3 presents the simulation waveforms of the inverter's three-phase output current when four switching devices experience open-circuit faults successively.Throughout the simulation process, the modulation index is 0.6, the output fundamental frequency is 50 Hz and the switching frequency is 2 kHz.The load comprises a resistive-inductive configuration, with 22Ω resistance and 4 mH inductance.It can be observed that the current waveform exhibits varying degrees of distortion, and when connected to the motor load, this distortion will cause significant torque/speed fluctuations.

Analysis of Fault Characteristics
When an open-circuit fault occurs in the power switching device, the current flow path will change, thus affecting the output voltage level of the inverter.The direction of current flow from the inverter to the load is defined as the positive direction of current (i x > 0).
Taking the open-circuit fault of S A1 in the P state as an example, when the current is in the positive direction, the current will flow from the clamp diode D A1 and S A2 to the load.The actual state will change from a P to O state, and the output voltage will drop from V dc /2 to 0, as shown in Figure 3a; when the S A2 has an open-circuit fault in the P state, if the current is in the positive direction, the current will flow from the anti-parallel diodes of S A3 and S A4 to the load, the actual state will change from P to N state and the output voltage will drop from V dc /2 to −V dc /2, as shown in Figure 3b.If the current is negative, neither S A1 nor S A2 experiencing an open-circuit fault will impact the output of the inverter, as shown in Figure 3c,d.Table 2 summarizes the changes in the inverter's output voltage before and after a single-device open-circuit fault.Taking phase A as an example, Table 3 presents the simulation waveforms of the inverter's three-phase output current when four switching devices experience open-circuit faults successively.Throughout the simulation process, the modulation index is 0.6, the output fundamental frequency is 50 Hz and the switching frequency is 2 kHz.The load comprises a resistive-inductive configuration, with 22 Ω resistance and 4 mH inductance.It can be observed that the current waveform exhibits varying degrees of distortion, and when connected to the motor load, this distortion will cause significant torque/speed fluctuations.4) occurring in the inverter power switching devices.The fault diagnosis method framework is shown in Figure 4.The development process of the method includes the following:  The fault diagnosis method framework is shown in Figure 4.The development process of the method includes the following:

Data Preprocessing
To improve the accuracy of the diagnostic network, it is necessary to preprocess the three-phase current data output by the inverter.

Data Normalization
The collected current data from any phase are represented as a one-dimensional sequence using the following equation where k∈{1, 2, 3} is the channel number, with channels 1, 2 and 3 corresponding to phases A, B and C, respectively; X k i represents the i-th data in the k-th channel; L data represents the length of the sequence data.
To reduce the impact of current amplitude variations on diagnostic efficiency and accuracy, linear normalization is performed on the collected raw data in this paper, with the expression as follows where X k * i represents the normalized data; X k max and X k min represent the maximum and minimum values in the data of the k-th channel, respectively.

Dataset Production
In the normalized data sequence X k * , resampling is performed at every n interval data point, with the expression as follows where f s represents the original sampling frequency of the data sequence; n interval represents the interval count; f res is the resampling frequency, and its value should satisfy the condition that the newly generated sequence L res contains n normalized data points within one fundamental cycle, with the expression as follows where f represents the fundamental frequency; n represents the length of data within one fundamental cycle.
After completing the resampling of the normalized data, different data segments are obtained through a sliding window approach, as shown in Figure 5. Data are cut using a sliding window with a length of n, and, thus, each generated sample has a length of n.The sliding window continuously moves over the resampled data sequence to generate different samples, with each slide set at a distance of S. In the method proposed in this paper, considering the computational load of the GPU, n is set to 250, S to 10 and the value of n interval is determined based on the fundamental frequency. ( ) where k{1, 2, 3} is the channel number, with channels 1, 2 and 3 corresponding to phases A, B and C, respectively; X k i represents the i-th data in the k-th channel; Ldata represents the length of the sequence data.
To reduce the impact of current amplitude variations on diagnostic efficiency and accuracy, linear normalization is performed on the collected raw data in this paper, with the expression as follows where X k* i represents the normalized data; X k max and X k min represent the maximum and minimum values in the data of the k-th channel, respectively.

Dataset Production
In the normalized data sequence X k* , resampling is performed at every ninterval data point, with the expression as follows where fs represents the original sampling frequency of the data sequence; ninterval represents the interval count; fres is the resampling frequency, and its value should satisfy the condition that the newly generated sequence Lres contains n normalized data points within one fundamental cycle, with the expression as follows where f represents the fundamental frequency; n represents the length of data within one fundamental cycle.
After completing the resampling of the normalized data, different data segments are obtained through a sliding window approach, as shown in Figure 5. Data are cut using a sliding window with a length of n, and, thus, each generated sample has a length of n.The sliding window continuously moves over the resampled data sequence to generate different samples, with each slide set at a distance of S. In the method proposed in this paper, considering the computational load of the GPU, n is set to 250, S to 10 and the value of ninterval is determined based on the fundamental frequency.

MSSCNN Fault Diagnosis Model
As shown in Figure 6, the MSSCNN fault diagnosis model proposed in this paper initially uses 3 × 1 convolution to extract shallow features from the sample data.With a stride of 2, the data length in the convolution layer is halved to a value of 125.The output data size of this layer is 24 (number of channels) × 125 (data length) × 1 (dimension).Furthermore, batch normalization (BN) is applied to normalize these shallow features [24].The BN layer is used to unify parameter magnitudes, which accelerates convergence and prevents network overfitting, with the expression as follows Sensors 2024, 24, 1745 8 of 18 where m represents the number of samples computed at each iteration, which is 5 in this paper; µ is the sample mean; σ 2 is the sample variance; x l * i,k represents the i-th data point of the k-th channel in the l-th layer of the model, k∈{1, 2, . .., N}.The output data size of this layer remains unchanged.γ and β are, respectively, the scale and shift parameters, which can be learned through the network; ε is to prevent the denominator in (9) from being zero.

zero.
The rectified linear unit (ReLU) activation function is used for non-linearity processing [25], which helps alleviate the problem of gradient vanishing and is relatively simple to implement.ReLU can be expressed as follows The output data size of this layer is 24 × 125 × 1.Finally, the features are dimensionally reduced through a maximum pooling (MaxPool) layer [26] with a stride of 2, with the expression as follows where LMaxPool represents the window length of the MaxPool, which is 3 in this paper.The output size of this layer is 24 × 63 × 1.The above steps complete the preliminary extraction of current feature information.To better distinguish open-circuit faults caused by different switching device damages, it is necessary to extract higher-dimensional current fault feature information.CNN The rectified linear unit (ReLU) activation function is used for non-linearity processing [25], which helps alleviate the problem of gradient vanishing and is relatively simple to implement.ReLU can be expressed as follows The output data size of this layer is 24 × 125 × 1.Finally, the features are dimensionally reduced through a maximum pooling (MaxPool) layer [26] with a stride of 2, with the expression as follows where L MaxPool represents the window length of the MaxPool, which is 3 in this paper.The output size of this layer is 24 × 63 × 1.The above steps complete the preliminary extraction of current feature information.
To better distinguish open-circuit faults caused by different switching device damages, it is necessary to extract higher-dimensional current fault feature information.CNN models like ResNet and DenseNet exhibit excellent performance in fields such as image recognition and object detection, but these models often have high complexity.To achieve a lightweight network model while maintaining high accuracy, rapid diagnosis and strong noise resistance in the open-circuit fault diagnosis of three-level NPC inverters, this paper designs the MSSCNN basic module and downsampling module.These modules mainly include 1 × 1 convolution layers, 3 × 1 and 9 × 1 depthwise separable convolution layers, BN layer, ReLU activation function and channel shuffle.The overall structure is shown in Figure 7.
Table 5 presents a comparison of the MSSCNN basic module and downsampling module designed in this paper with four other common CNN models.In traditional CNN architectures, each layer extracts input feature information through ordinary convolutions, but the learning capacity of features is insufficient.In ResNet, low-level feature information is directly mapped to high-level networks through short connections, which greatly improves the convergence speed and accuracy of the network.However, the large number of addition operations in the network leads to high computational complexity.ShuffleNet V2 is a lightweight network that replaces addition operations in ResNet with concat opera-tions, thereby reducing the model's computational load.MobileNet V3 replaces ordinary convolutions with depthwise separable convolution [27] to reduce the computational load while maintaining good classification performance.The basic modules and downsampling modules designed in this paper use depthwise separable convolutions instead of standard convolutions in traditional CNNs.Compared to ShuffleNet V2 and MobileNet V3, to reduce computation, 1 × 1 convolutions are omitted before applying depthwise separable convolutions.Moreover, concat operations replace addition operations in ResNet and MobileNet V3 to further decrease the computational complexity.In the downsampling module, a combination of 3 × 1 convolutions and 9 × 1 large kernel convolutions effectively extracts current fault features at various scales, enhancing the model's robust fault feature extraction capabilities.Table 5 presents a comparison of the MSSCNN basic module and downsampling module designed in this paper with four other common CNN models.In traditional CNN architectures, each layer extracts input feature information through ordinary convolutions, but the learning capacity of features is insufficient.In ResNet, low-level feature information is directly mapped to high-level networks through short connections, which greatly improves the convergence speed and accuracy of the network.However, the large number of addition operations in the network leads to high computational complexity.ShuffleNet V2 is a lightweight network that replaces addition operations in ResNet with concat operations, thereby reducing the model's computational load.MobileNet V3 replaces ordinary convolutions with depthwise separable convolution [27] to reduce the computational load while maintaining good classification performance.The basic modules and downsampling modules designed in this paper use depthwise separable convolutions instead of standard convolutions in traditional CNNs.Compared to ShuffleNet V2 and MobileNet V3, to reduce computation, 1 × 1 convolutions are omitted before applying depthwise separable convolutions.Moreover, concat operations replace addition operations in ResNet and MobileNet V3 to further decrease the computational complexity.In the downsampling module, a combination of 3 × 1 convolutions and 9 × 1 large kernel convolutions effectively extracts current fault features at various scales, enhancing the model's robust fault feature extraction capabilities.The depthwise separable convolution adopted by the proposed model in this paper is divided into two steps: depthwise convolution and pointwise convolution.As shown in Figure 7, each input channel is convolved by only one convolution kernel, so the number of output channels is exactly equal to the number of channels in the previous layer.Since each channel is convolved separately, the features in the channel direction are independent.Therefore, the second step of depthwise separable convolution is to use pointwise convolution, specifically 1 × 1 convolution, to fuse cross-channel information.The computational multiplication of standard convolution can be expressed as follows where K and F are the sizes of the convolution kernel and the output feature map, respectively; M and N are the number of input channels and output channels, respectively.The computational multiplication of depthwise separable convolution can be expressed as follows The computational complexity of standard convolution and depthwise separable convolution can be compared as follows

MSSCNN (Model of this paper)
CNN [20] ResNet [21] ShuffleNet V2 [22] Mobilenet V3 [23] The depthwise separable convolution adopted by the proposed model in this paper is divided into two steps: depthwise convolution and pointwise convolution.As shown in Figure 7, each input channel is convolved by only one convolution kernel, so the

MSSCNN (Model of this paper)
CNN [20] ResNet [21] ShuffleNet V2 [22] Mobilenet V3 [23] The depthwise separable convolution adopted by the proposed model in this paper is divided into two steps: depthwise convolution and pointwise convolution.As shown in Figure 7, each input channel is convolved by only one convolution kernel, so the ShuffleNet V2 [22] Sensors 2024, 24, x FOR PEER REVIEW 10 of 18

MSSCNN (Model of this paper)
CNN [20] ResNet [21] ShuffleNet V2 [22] Mobilenet V3 [23] The depthwise separable convolution adopted by the proposed model in this paper is divided into two steps: depthwise convolution and pointwise convolution.As shown in Figure 7, each input channel is convolved by only one convolution kernel, so the Mobilenet V3 [23] Sensors 2024, 24, x FOR PEER REVIEW 10 of 18

MSSCNN (Model of this paper)
CNN [20] ResNet [21] ShuffleNet V2 [22] Mobilenet V3 [23] The depthwise separable convolution adopted by the proposed model in this paper is divided into two steps: depthwise convolution and pointwise convolution.As shown in Figure 7, each input channel is convolved by only one convolution kernel, so the According to (16), it is evident that the computational complexity of the depthwise separable convolution used in this paper is significantly reduced compared to standard convolution.
As shown in Figure 7, the first step in feature depth extraction is to use a downsampling module to reduce dimensionality and extract information.The current feature information with an input size of 24 × 63 × 1 is convolved by 3 × 1 depthwise separable convolution and 9 × 1 depthwise separable convolution, respectively.This enables the network to capture current information features at different scales.During the pointwise convolution process, the input and output channels remain the same because having similar input and output channel counts minimizes memory usage and speeds up computation.After the convolutions, the two branches are concatenated, doubling the output channels and significantly enhancing the feature learning capability of the network.Another downsampling module is then used to extract feature information with an output size of 96 × 16 × 1.
In the basic module, the feature information is first subjected to channel shuffle.Channel shuffle [22] involves randomly dividing the feature channels into two groups, with each group having half the number of input feature channels.The feature channels in one branch go directly to the next layer without any operation, thereby establishing connection relationships between different layers, allowing each layer to reuse half of the features from the previous layer.This characteristic, similar to DenseNet, contributes to the model's high accuracy.The other branch uses 3 × 1 depthwise separable convolution.The outputs of these two branches are concatenated while maintaining the channel count at 96.The final high-dimensional information feature extraction is accomplished using the downsampling module again.
The final step in feature depth extraction involves a simple concatenation of the two branches.Therefore, in the feature aggregation and output part, 1 × 1 convolution is first used to enhance information exchange between the two branches, as shown in Figure 8.To better distinguish the total of 13 conditions, including various types of single-switch open-circuit faults and normal operation in the three-level NPC inverter, a global average pooling (GAP) layer [26] is utilized to integrate the global information of the features, with the expression as follows where n k represents the total amount of data in the k-th channel.Finally, the 13 operating states are output through a fully connected (FC) layer.
Sensors 2024, 24, x FOR PEER REVIEW 12 of 18 where nk represents the total amount of data in the k-th channel.Finally, the 13 operating states are output through a fully connected (FC) layer.

Data Acquisition
Nine different operating conditions are chosen for validating the effectiveness of the proposed algorithm.These conditions are determined by selecting modulation indices of 0.3, 0.6 and 0.9, as well as fundamental frequencies of 20 Hz, 30 Hz and 50 Hz.Figure 10 shows the three-phase current waveforms under a no-fault condition and when each power switching device in phase A undergoes an open-circuit fault, at a 0.6 modulation index and 50 Hz fundamental frequency.As shown in Figure 10b, it can be   Figure 10 shows the three-phase current waveforms under a no-fault condition and when each power switching device in phase A undergoes an open-circuit fault, at a 0.6 modulation index and 50 Hz fundamental frequency.As shown in Figure 10b, it can be observed that when S A1 experiences an open-circuit fault, the negative half-cycle current of the faulty phase is unaffected because current can flow through the anti-parallel diode on S A1 in the P state.However, the positive half-cycle current is distorted due to the open-circuit fault of S A1 , as in the P state, the open device causes the forward current to only flow from D A1 and S A2 towards the load, changing the actual state from P to O state.As shown in Figure 10c, it can be seen that when S A2 experiences an open-circuit fault, the negative half-cycle current of the faulty phase is unaffected, as current can flow through the anti-parallel diode on S A2 in either the P or O state.However, the forward current is nearly zero because in the P or O state, current can only flow from the anti-parallel diodes on S A3 and S A4 towards the load, resulting in a change in the actual state to an N state.Some experimental data can be found in Table S1 of Supplementary Materials.
on SA1 in the P state.However, the positive half-cycle current is distorted due to the opencircuit fault of SA1, as in the P state, the open device causes the forward current to only flow from DA1 and SA2 towards the load, changing the actual state from P to O state.As shown in Figure 10c, it can be seen that when SA2 experiences an open-circuit fault, the negative half-cycle current of the faulty phase is unaffected, as current can flow through the anti-parallel diode on SA2 in either the P or O state.However, the forward current is nearly zero because in the P or O state, current can only flow from the anti-parallel diodes on SA3 and SA4 towards the load, resulting in a change the actual state to an N state.Some experimental data can be found in Table S1 of Supplementary Materials.From Figure 11, it can be observed that the MSSCNN model has a faster reduction rate of loss and a lower loss value compared to the other network models, indicating that the proposed network model in this paper converges faster and has higher accuracy.
Figure 12 shows the average accuracy of different models based on the results of 10 experiments.The MSSCNN model achieves an accuracy of 99.91%, which is the highest among the five network models.
The diagnostic effectiveness of the models in different noise environments is tested by adding noise with varying signal-to-noise ratios (SNRs) to the collected current data.The average accuracy from 10 repeated experiments is shown in Table 7 and Figure   From Figure 11, it can be observed that the MSSCNN model has a faster reduction rate of loss and a lower loss value compared to the other network models, indicating that the proposed network model in this paper converges faster and has higher accuracy.
Figure 12 shows the average accuracy of different models based on the results of 10 experiments.The MSSCNN model achieves an accuracy of 99.91%, which is the highest among the five network models.Parameters (Params), floating-point operations (FLOPs), memory required during node inference (memory) and the size of memory read and write operations (MemR + W) serve as crucial metrics for evaluating model complexity.Table 8 and Figure 14 display The diagnostic effectiveness of the models in different noise environments is tested by adding noise with varying signal-to-noise ratios (SNRs) to the collected current data.The average accuracy from 10 repeated experiments is shown in Table 7 and Figure 13.It can be observed that the diagnostic accuracy of the model gradually decreases with the increase in noise intensity.The fault diagnosis network model proposed in this paper outperforms the other four models in diagnostic effectiveness under different noise environments.Parameters (Params), floating-point operations (FLOPs), memory required during node inference (memory) and the size of memory read and write operations (MemR + W) serve as crucial metrics for evaluating model complexity.Table 8 and Figure 14 display these metrics for various network models when applied to a 3 × 250 × 1 fault sample.The MSSCNN model demonstrates lower values across all four parameters compared to the other four network models, indicating lower complexity.Parameters (Params), floating-point operations (FLOPs), memory required during node inference (memory) and the size of memory read and write operations (MemR + W) serve as crucial metrics for evaluating model complexity.Table 8 and Figure 14 display these metrics for various network models when applied to a 3 × 250 × 1 fault sample.The MSSCNN model demonstrates lower values across all four parameters compared to the other four network models, indicating lower complexity.To better demonstrate the classification effectiveness of the fault diagnosis model proposed in this paper, the t-SNE method is utilized for dimensionality reduction and visualization of the data in the test samples [28], as shown in  To better demonstrate the classification effectiveness of the fault diagnosis model proposed in this paper, the t-SNE method is utilized for dimensionality reduction and visualization of the data in the test samples [28], as shown in proposed in this paper, the t-SNE method is utilized for dimensionality reduction and visualization of the data in the test samples [28], as shown in Figure 15.It can be observed that the input data features shown in Figure 15a

Conclusions
This study proposes a fault diagnosis method for three-level NPC inverters based on the MSSCNN model and validates its effectiveness on an experimental platform.Compared to prior works like references [12,16]

Conclusions
This study proposes a fault diagnosis method for three-level NPC inverters based on the MSSCNN model and validates its effectiveness on an experimental platform.Compared to prior works like references [12,16]

Figure 1 .
Figure 1.Topology of the three-level NPC inverter.

Figure 1 .
Figure 1.Topology of the three-level NPC inverter.

Sensors 2024 , 18 Figure 3 .
Figure 3.The current path when SA1 or SA2 experiences an open-circuit fault in the P state.

Figure 3 .
Figure 3.The current path when S A1 or S A2 experiences an open-circuit fault in the P state.

Figure 3 .
Figure 3.The current path when SA1 or SA2 experiences an open-circuit fault in the P state.

FaultFigure 3 .
Figure 3.The current path when SA1 or SA2 experiences an open-circuit fault in the P state.

FaultFigure 3 .
Figure 3.The current path when SA1 or SA2 experiences an open-circuit fault in the P state.

FaultFigure 3 .
Figure 3.The current path when SA1 or SA2 experiences an open-circuit fault in the P state.

Fault 3 .
Diagnosis Method for Open-Circuit Faults in Three-Level NPC Inverter Based on MSSCNN The fault diagnosis method proposed in this paper utilizes high-precision current sensors to collect the three-phase output current data of the inverter and combines them with the proposed neural network fault diagnosis model to achieve the identification and fault localization of twelve types of open-circuit faults (as shown in Table

1 .
Collection of three-phase current data under different operating conditions, both with open-circuit faults and fault-free states.2. Normalization of the collected current data, and selection of a portion of the data from different operating conditions as the training set, while the remaining data are used as the test set.3. Building the MSSCNN model on the PyCharm platform, which consists of three main parts: initial feature extraction, deep feature extraction and feature aggregation and output.Taking into account the effectiveness of feature extraction, classification and computational complexity, the number of output channels for these three parts is set to 24, 192 and 1024, respectively.The model initially extracts features from the input current data using ordinary convolution, then further extracts high-dimensional fault information features using the MSSCNN basic module (BM) and downsampling module (DM).Finally, it aggregates the extracted information and outputs the diagnostic result.4.The training set is used to train the model, and the diagnostic effect of the model is verified through the test set.

1 .
Collection of three-phase current data under different operating conditions, both with open-circuit faults and fault-free states.2. Normalization of the collected current data, and selection of a portion of the data from different operating conditions as the training set, while the remaining data are used as the test set.3. Building the MSSCNN model on the PyCharm platform, which consists of three main parts: initial feature extraction, deep feature extraction and feature aggregation and output.Taking into account the effectiveness of feature extraction, classification and computational complexity, the number of output channels for these three parts is set to 24, 192 and 1024, respectively.The model initially extracts features from the input current data using ordinary convolution, then further extracts high-dimensional fault information features using the MSSCNN basic module (BM) and downsampling module (DM).Finally, it aggregates the extracted information and outputs the diagnostic result.4. The training set is used to train the model, and the diagnostic effect of the model is verified through the test set.

Figure 4 .
Figure 4.The development process of the open-circuit fault diagnosis method for three-level NPC inverter devices based on MSSCNN.

Figure 4 .
Figure 4.The development process of the open-circuit fault diagnosis method for three-level NPC inverter devices based on MSSCNN.

Sensors 2024 ,
24, x FOR PEER REVIEW 9 of 18 models like ResNet and DenseNet exhibit excellent performance in fields such as image recognition and object detection, but these models often have high complexity.To achieve a lightweight network model while maintaining high accuracy, rapid diagnosis and strong noise resistance in the open-circuit fault diagnosis of three-level NPC inverters, this paper designs the MSSCNN basic module and downsampling module.These modules mainly include 1 × 1 convolution layers, 3 × 1 and 9 × 1 depthwise separable convolution layers, BN layer, ReLU activation function and channel shuffle.The overall structure is shown in Figure 7.

Figure 8 .
Figure 8. Feature summary and output.4.Experimental Verification 4.1.Experimental Platform The development and training of the fault diagnosis model and the experimental platform for data detection and collection are shown in Figure 9.An OP4510 real-time simulator from OPAL-RT Technologies is used as the controller.The inverter employed in this study is FSD Company's FPS015TI072LA001.The DC power supply, RP7972A, supplies a 150 V DC side voltage.The load comprises a resistive-inductive configuration with 22Ω resistance and 4 mH inductance.The switching frequency is 2 kHz.
The development and training of the fault diagnosis model and the experimental platform for data detection and collection are shown in Figure 9.An OP4510 real-time simulator from OPAL-RT Technologies is used as the controller.The inverter employed in this study is FSD Company's FPS015TI072LA001.The DC power supply, RP7972A, supplies a 150 V DC side voltage.The load comprises a resistive-inductive configuration with 22 Ω resistance and 4 mH inductance.The switching frequency is 2 kHz.The development and training of the fault diagnosis model and the experimental platform for data detection and collection are shown in Figure 9.An OP4510 real-time simulator from OPAL-RT Technologies is used as the controller.The inverter employed in this study is FSD Company's FPS015TI072LA001.The DC power supply, RP7972A, supplies a 150 V DC side voltage.The load comprises a resistive-inductive configuration with 22Ω resistance and 4 mH inductance.The switching frequency is 2 kHz.
There are 13 kinds of single-switch open-circuit faults and normal operation experiments under each working condition.In these experiments, open-circuit fault situations are simulated by deactivating the driving signals to the power switching devices.The three-phase current output of the three-level NPC inverter is sampled at a frequency of 2.5 GHz using a YOKOGAWA oscilloscope and a YOKOGAWA 701932 current probe manufactured by the Yokogawa Electric Corporation in Tokyo, Japan, with a bandwidth of 100 MHz and a maximum input of 30 Arms.The collected current data are compiled into a dataset based on the content of the third section.The allocation of the training and test sets is shown in Table 6, comprising a total of 1170 test samples and 5460 training samples.The size of each sample is 3 × 250 × 1.

4. 2 .
Data Acquisition Nine different operating conditions are chosen for validating the effectiveness of the proposed algorithm.These conditions are determined by selecting modulation indices of 0.3, 0.6 and 0.9, as well as fundamental frequencies of 20 Hz, 30 Hz and 50 Hz.There are 13 kinds of single-switch open-circuit faults and normal operation experiments under each working condition.In these experiments, open-circuit fault situations are simulated by deactivating the driving signals to the power switching devices.The three-phase current output of the three-level NPC inverter is sampled at a frequency of 2.5 GHz using a YOKOGAWA oscilloscope and a YOKOGAWA 701932 current probe manufactured by the Yokogawa Electric Corporation in Tokyo, Japan, with a bandwidth of 100 MHz and a maximum input of 30 Arms.The collected current data are compiled into a dataset based on the content of the third section.The allocation of the training and test sets is shown in Table 6, comprising a total of 1170 test samples and 5460 training samples.The size of each sample is 3 × 250 × 1.

4. 3 .
Analysis and Comparison of Diagnostic Effect To demonstrate the superior performance of the proposed algorithm, fault diagnosis models for CNN, ResNet, ShuffleNet V2 and MobileNet V3 are established based on Table 5.The training set is subsequently input into the MSSCNN, classical CNN, ResNet, ShuffleNet V2 and MobileNet V3 models for training, and the test set is used to validate the training performance of the models.The experiment is repeated 10 times using the same training and test sets, and the loss function comparison curves from one of the experiments are shown in Figure 11.

Sensors 2024 ,
24, 1745 models for CNN, ResNet, ShuffleNet V2 and MobileNet V3 are established based on Table 5.The training set is subsequently input into the MSSCNN, classical CNN, ResNet, Shuf-fleNet V2 and MobileNet V3 models for training, and the test set is used to validate the training performance of the models.The experiment is repeated 10 times using the same training and test sets, and the loss function comparison curves from one of the experiments are shown in Figure 11.

Figure 11 .
Figure 11.Comparison of the loss function.
13.It can be observed that the diagnostic accuracy of the model gradually decreases with the increase in noise intensity.The fault diagnosis network model proposed in this paper outperforms the other four models in diagnostic effectiveness under different noise environments.

Figure 11 .
Figure 11.Comparison of the loss function.

18 Figure 12 .
Figure 12.Comparison of accuracy of different models.

Figure 12 .
Figure 12.Comparison of accuracy of different models.

Figure 13 .
Figure 13.Diagnostic accuracy of each model in different noise environments.

Figure 13 .
Figure 13.Diagnostic accuracy of each model in different noise environments.

Figure 14 .
Figure 14.Complexity comparison of different models.

Figure 15 .
It can be observed that the input data features shown in Figure 15a are chaotic in distribution.After being processed by the MSSCNN model, the 13 types of open-circuit faults in the three-level NPC inverter in Figure 15b are separated, with virtually no overlap of different category samples.This indicates that the model possesses excellent feature extraction capability.

Figure 14 .
Figure 14.Complexity comparison of different models.

Figure 15 .
It can be observed that the input data features shown in Figure 15a are chaotic in distribution.After being processed by the MSSCNN model, the 13 types of open-circuit faults in the three-level NPC inverter in Figure 15b are separated, with virtually no overlap of different category samples.This indicates that the model possesses excellent feature extraction capability.

Figure 15 .
Figure 15.Data feature distribution visualization: (a) Feature distribution of input data.(b) Feature distribution of MSSCNN model's output data.
, this method can identify power switching device open-circuit faults without the need for complex preprocessing such as wavelet packet transforms or Clark transforms on the original three-phase current data.The basic modules and downsampling modules in the MSSCNN model utilize depthwise separable convolution and channel shuffle techniques, ensuring both efficient feature extraction and a lightweight model structure.The combination of convolution kernels of different sizes enhances the model's noise resistance.The fault diagnosis method based on the MSSCNN model proposed in this paper attained an accuracy rate of 99.91% in detecting open-circuit faults in power switching devices.Furthermore, it has higher accuracy than the comparative models built on CNN, ResNet, ShuffleNet V2 and Mobilenet V3 in three noise environments of 6 dB, 8 dB and 10 dB.Meanwhile, it has lower model complexity and surpasses the four comparative models in terms of Params, FLOPs, memory and MemR + W. Through processes such as feature enhancement and effective extraction, the proposed

Figure 15 .
Figure 15.Data feature distribution visualization: (a) Feature distribution of input data.(b) Feature distribution of MSSCNN model's output data.
, this method can identify power switching device opencircuit faults without the need for complex preprocessing such as wavelet packet transforms or Clark transforms on the original three-phase current data.The basic modules and downsampling modules in the MSSCNN model utilize depthwise separable convolution and channel shuffle techniques, ensuring both efficient feature extraction and a lightweight model structure.The combination of convolution kernels of different sizes enhances the model's noise resistance.The fault diagnosis method based on the MSSCNN model proposed in this paper attained an accuracy rate of 99.91% in detecting open-circuit faults in power switching devices.Furthermore, it has higher accuracy than the comparative models built on CNN, ResNet, ShuffleNet V2 and Mobilenet V3 in three noise environments of 6 dB, 8 dB and 10 dB.Meanwhile, it has lower model complexity and surpasses the four comparative models in terms of Params, FLOPs, memory and MemR + W. Through processes such as feature enhancement and effective extraction, the proposed model can effectively improve fault determination and localization capabilities by clustering the originally mixed data feature distributions.

Table 1 .
Output voltage and switching states of three-level NPC inverter.

Table 1 .
Output voltage and switching states of three-level NPC inverter.

Table 2 .
Output voltage variation before and after open-circuit fault.

Table 2 .
Output voltage variation before and after open-circuit fault.

Table 3 .
Three-phase current waveforms of each power switching device in phase A under opencircuit fault.

Table 2 .
Output voltage variation before and after open-circuit fault.

Table 3 .
Three-phase current waveforms of each power switching device in phase A under opencircuit fault.

Table 2 .
Output voltage variation before and after open-circuit fault.

Table 3 .
Three-phase current waveforms of each power switching device in phase A under opencircuit fault.

Table 2 .
Output voltage variation before and after open-circuit fault.

Table 3 .
Three-phase current waveforms of each power switching device in phase A under opencircuit fault.

Table 2 .
Output voltage variation before and after open-circuit fault.

Table 3 .
Three-phase current waveforms of each power switching device in phase A under opencircuit fault.

Table 5 .
Comparison of structure between the basic module and downsampling module.

Table 5 .
Comparison of structure between the basic module and downsampling module.

Table 5 .
Comparison of structure between the basic module and downsampling module.

Table 5 .
Comparison of structure between the basic module and downsampling module.

Table 5 .
Comparison of structure between the basic module and downsampling module.

Table 6 .
Division of training set and test set.

Table 7 .
Diagnostic accuracy of each model in different noise environments.
Figure 13.Diagnostic accuracy of each model in different noise environments.

Table 7 .
Diagnostic accuracy of each model in different noise environments.

Table 8 .
Complexity comparison of different models.

Table 8 .
Complexity comparison of different models.