4.1. Diagnostic Model and Optimization
Traditional time-frequency methods for PMSM fault diagnosis are often limited by their sensitivity to motor geometry and operating conditions, making feature extraction unreliable in complex environments. Deep learning–based approaches, by contrast, can directly learn representative fault characteristics from signal data without relying on the specific structure or operating principles of the machine [17]. To exploit this advantage, this study adopts a hybrid diagnostic framework that integrates a Deep Belief Network (DBN) with an Extreme Learning Machine (ELM) [18].
The DBN, composed of multiple Restricted Boltzmann Machines (RBMs), extracts hierarchical features from input signals through unsupervised pretraining and refines them via supervised fine-tuning. The extracted features are then classified by the ELM, which offers fast training speed and strong generalization ability. By combining the DBN’s feature representation capability with the ELM’s classification efficiency, the proposed framework achieves improved diagnostic accuracy and robustness in identifying PMSM demagnetization faults. The structure of the RBM is shown in
Figure 23. During the training stage, the DBN extracts characteristic information from the input signal, while in the fine-tuning stage, the parameters are adjusted according to the learning error.
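The unsupervised pretraining of each RBM described above can be sketched with one step of contrastive divergence (CD-1). This is a minimal NumPy illustration of the training rule, using the learning rate (0.01) and initial momentum (0.5) quoted later in Section 4.2; the layer sizes and toy input are assumptions for demonstration only, not the actual network configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden=64, lr=0.01, momentum=0.5, epochs=20):
    """CD-1 training loop for a single RBM with binary stochastic units."""
    n_visible = V.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a = np.zeros(n_visible)   # visible bias
    b = np.zeros(n_hidden)    # hidden bias
    dW = np.zeros_like(W)
    for _ in range(epochs):
        # positive phase: hidden activations driven by the data
        h_prob = sigmoid(V @ W + b)
        h_samp = (rng.random(h_prob.shape) < h_prob).astype(float)
        # negative phase: one Gibbs step to reconstruct the visible layer
        v_recon = sigmoid(h_samp @ W.T + a)
        h_recon = sigmoid(v_recon @ W + b)
        # contrastive-divergence gradient with momentum
        grad = (V.T @ h_prob - v_recon.T @ h_recon) / len(V)
        dW = momentum * dW + lr * grad
        W += dW
        a += lr * (V - v_recon).mean(axis=0)
        b += lr * (h_prob - h_recon).mean(axis=0)
    return W, a, b

# stacking: the hidden activations of one trained RBM feed the next RBM
V = rng.random((32, 100))             # toy batch of 32 samples, 100 features
W1, _, b1 = train_rbm(V, n_hidden=64)
H1 = sigmoid(V @ W1 + b1)             # becomes the input of the next RBM
```

Stacking such layers and then back-propagating the labelled error through the stack corresponds to the pretraining and fine-tuning stages described above.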
As a classifier, the ELM has a typical three-layer structure consisting of an input layer, a hidden layer, and an output layer. The output of the ELM, oi, is shown in Equation (15):

oi = Σ_{j=1}^{ϑ} β_{ji} g(Wj X + bj), (15)

where W is the ELM hidden-layer weight matrix, b is the ELM hidden-layer bias vector, g(·) is the hidden-layer activation function, ϑ is the number of ELM hidden-layer nodes, i is the index of the label-layer node, β is the weight matrix between the hidden layer and the label layer, and X is the output of the upper layer, which serves as the ELM input. In this paper, the ELM acts as the label classifier of the fault diagnosis model and is placed in the last layer of the model.
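The ELM training procedure implied by this structure fixes the random hidden-layer parameters W and b and solves for β in closed form via the Moore–Penrose pseudoinverse, which is what gives the ELM its fast training speed. A minimal NumPy sketch (the feature dimension, hidden-layer size, and toy data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def elm_fit(X, Y, n_hidden=50):
    """Fit an ELM: random hidden weights W and biases b stay fixed;
    the output weights beta are solved by least squares."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ Y     # Moore-Penrose solution for beta
    return W, b, beta

def elm_predict(X, W, b, beta):
    # o = g(X W + b) beta, matching Equation (15)
    return np.tanh(X @ W + b) @ beta

# toy usage: 40 samples, 20 features, 8 one-hot fault labels
X = rng.random((40, 20))
Y = np.eye(8)[rng.integers(0, 8, 40)]
W, b, beta = elm_fit(X, Y)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
```

Because no iterative weight update is needed, training reduces to a single pseudoinverse computation, which is why the ELM is well suited as the final label classifier.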
Since the diagnostic performance of the DBN–ELM model is highly sensitive to hyperparameter selection, this study introduces an Enhanced Fireworks Algorithm (EnFWA) for optimization. Compared with the traditional Fireworks Algorithm, EnFWA incorporates a dynamic adjustment factor
q and a dynamic radius factor
μr, expressed as in Equations (16) and (17):
n is the current iteration number, N is the maximum number of iterations, ζ is the adjustment coefficient, f(x) is the fitness function of the algorithm, fmin and fmax are the minimum and maximum values of the fitness function, and Ri is the search radius under the current fitness value.
Additionally, EnFWA adopts a roulette-based selection strategy, which dynamically adjusts the search radius as fitness updates, accelerating convergence and reducing the risk of local optima. This optimization ensures that the DBN–ELM model achieves an efficient and robust structure for fault diagnosis.
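The roulette-based selection strategy can be sketched as follows. This is a minimal illustration, not the exact EnFWA implementation; it assumes a minimisation-type fitness, so lower fitness values receive a larger selection probability.

```python
import numpy as np

rng = np.random.default_rng(2)

def roulette_select(fitness, n_keep):
    """Fitness-proportionate (roulette-wheel) selection for a
    minimisation problem: lower fitness -> larger probability."""
    f = np.asarray(fitness, dtype=float)
    inv = (f.max() - f) + 1e-12      # invert so the best gets most weight
    p = inv / inv.sum()
    return rng.choice(len(f), size=n_keep, replace=False, p=p)

# keep 2 of 4 candidate fireworks, biased toward low fitness
idx = roulette_select([0.9, 0.4, 0.1, 0.7], n_keep=2)
```

Selecting survivors probabilistically rather than purely greedily is what preserves diversity in the population and reduces the risk of premature convergence to a local optimum.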
4.2. Process of Fault Diagnosis and Result Analysis
Before constructing the diagnostic framework, it is necessary to clarify the rationale for feature extraction. Demagnetization faults in IPMSMs manifest primarily as distortions in magnetic flux density, torque fluctuations, and variations in induced voltage. Torque provides a direct reflection of the motor’s electromechanical performance, while the three-phase induced voltage contains abundant fault-related information in both the time and frequency domains. By applying FFT, the frequency components of induced voltage can be decomposed to identify harmonics that are particularly sensitive to demagnetization. Specifically, overall demagnetization is often associated with an increase in the third harmonic due to flux asymmetry, whereas local demagnetization typically produces sideband components near the fundamental frequency. These harmonic variations provide discriminative features that strengthen the separability of different fault modes.
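The harmonic indicators described above can be extracted with a straightforward FFT of a phase voltage. The sketch below uses the sampling parameters given in this section (20 kHz, 0.75 s); the 40 Hz fundamental and the synthetic signal with a 5% third harmonic are assumptions chosen purely to make the example self-contained, not values from the simulation.

```python
import numpy as np

fs = 20_000                       # sampling frequency (from Section 4.2)
t = np.arange(0, 0.75, 1 / fs)    # 0.75 s record -> 15,000 samples
f0 = 40.0                         # assumed fundamental electrical frequency

# synthetic phase voltage: fundamental plus a small third harmonic,
# mimicking the flux asymmetry caused by overall demagnetization
u = np.sin(2 * np.pi * f0 * t) + 0.05 * np.sin(2 * np.pi * 3 * f0 * t)

spec = np.abs(np.fft.rfft(u)) / len(u)       # one-sided magnitude spectrum
freqs = np.fft.rfftfreq(len(u), 1 / fs)

def amplitude_at(f):
    """Magnitude of the spectral bin closest to frequency f."""
    return spec[np.argmin(np.abs(freqs - f))]

# ratio of the third harmonic to the fundamental: a demagnetization indicator
third_ratio = amplitude_at(3 * f0) / amplitude_at(f0)
```

For local demagnetization, the same spectrum would instead be inspected at the sideband frequencies f0 ± f0/p around the fundamental (p being the pole-pair number).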
Based on this rationale, the diagnostic dataset is constructed from both time-domain and frequency-domain indicators. The sampled signals include the motor’s output torque and the three-phase induced voltage. FFT analysis is applied to the induced voltage, and each phase is decomposed into 1025 frequency components. The sampling frequency is set to 20 kHz, with a duration of 0.75 s, yielding one torque time series, three voltage time series, and three corresponding voltage spectra under each fault mode. For accuracy and generality, 30 sets of data are recorded per fault, of which 25 are used for model training to determine optimal parameters and 5 for testing. The final data and labels are summarized in
Table 9.
On this basis, a hybrid diagnostic model is constructed using a Deep Belief Network–Extreme Learning Machine (DBN–ELM) architecture. Each Restricted Boltzmann Machine (RBM) in the DBN is trained for 20 iterations with a learning rate of 0.01 and an initial momentum coefficient of 0.5. After data preprocessing and white-noise perturbation, the processed samples are fed into the DBN to extract hierarchical features, and the resulting high-level representations are subsequently classified by the ELM.
To obtain an efficient network structure, an enhanced Fireworks Algorithm (EnFWA) is employed to optimize the key hyperparameters of the DBN–ELM model, primarily the number of neurons in each hidden layer. In this study, the main EnFWA parameters are set as follows: 10 iterations, 20 initial fireworks, an explosion amplitude range of 20–500, 5 mutation fireworks, and an initial explosion radius of 20. EnFWA evaluates candidate solutions using a fitness-based criterion and uses a roulette-based selection mechanism to dynamically adjust the explosion radius, thereby accelerating convergence while reducing the risk of falling into local optima.
During each iteration, the algorithm updates the hyperparameter candidates and computes their fitness values. The optimization process terminates when the fitness satisfies the convergence requirement or when the maximum number of iterations is reached. If the fitness remains unsatisfactory, the iteration budget is increased to further approach the optimal solution. Through this procedure, the DBN–ELM model obtains an optimized structure with improved diagnostic accuracy and stability in the presence of noise, which forms the basis for the subsequent fault diagnosis and performance evaluation.
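The optimization loop described above can be sketched as a fireworks-style search over the hidden-layer sizes, using the parameter values quoted in this section (10 iterations, 20 fireworks, amplitude range 20–500, initial radius 20). This is a simplified sketch: the fitness function here is a smooth toy surrogate so the example runs, whereas the real fitness would train a DBN–ELM candidate and return its validation error; the elitist survival step stands in for the roulette selection.

```python
import numpy as np

rng = np.random.default_rng(3)

LOW, HIGH = 20, 500   # search range for each hidden-layer size

def fitness(x):
    """Placeholder for (1 - validation accuracy) of a DBN-ELM trained
    with hidden sizes x; a toy quadratic surrogate with optimum at 120."""
    return np.sum((np.asarray(x, float) - 120.0) ** 2) / 1e5

def fwa_search(n_fireworks=20, n_iter=10, radius=20.0):
    pop = rng.uniform(LOW, HIGH, size=(n_fireworks, 2))
    for _ in range(n_iter):
        f = np.array([fitness(x) for x in pop])
        # dynamic radius: better (lower-fitness) fireworks explode
        # with a smaller radius, refining the search locally
        r = radius * (f - f.min() + 1e-12) / (f.max() - f.min() + 1e-12)
        sparks = []
        for xi, ri in zip(pop, r):
            s = xi + rng.uniform(-ri - 1, ri + 1, size=(5, 2))
            sparks.append(np.clip(s, LOW, HIGH))
        cand = np.vstack([pop] + sparks)
        fc = np.array([fitness(x) for x in cand])
        pop = cand[np.argsort(fc)[:n_fireworks]]   # keep the best survivors
    best = pop[np.argmin([fitness(x) for x in pop])]
    return np.round(best).astype(int)

best_sizes = fwa_search()   # optimized hidden-layer sizes
```

Each iteration thus updates the candidate set and re-evaluates the fitness, terminating after the iteration budget is exhausted, exactly as described for the DBN–ELM optimization.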
Data from the eight demagnetization faults form a 240 × 63,075 matrix, where each sample consists of 15,000 torque time-domain points, 45,000 three-phase voltage time-domain points, and 3075 frequency-domain components concatenated into a single feature vector. During the actual sampling process of a motor, disturbances such as mechanical and electromagnetic interference are inevitably present. To simulate real-world conditions, Gaussian white noise is introduced to corrupt the sampled data. Standardization and normalization are then carried out.
Figure 24 shows the effect of white noise, where the data represent the frequency components of the U-phase voltage.
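The noise-corruption and standardization steps can be sketched as follows. The target signal-to-noise ratio is not stated in the text, so the 20 dB value below is an assumption, and the small matrix stands in for the full 240 × 63,075 dataset.

```python
import numpy as np

rng = np.random.default_rng(4)

def add_white_noise(x, snr_db=20.0):
    """Corrupt a signal with Gaussian white noise at a target SNR in dB
    (the SNR value is an assumption; the paper does not specify it)."""
    p_signal = np.mean(x ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(p_noise), size=x.shape)

def standardize(X):
    """Column-wise zero-mean, unit-variance standardization."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0) + 1e-12
    return (X - mu) / sd

# stand-in for the 240 x 63,075 fault-data matrix
X = rng.random((240, 100))
Xn = standardize(add_white_noise(X))
```

Standardizing after the noise injection ensures that every feature column, whether a torque sample, a voltage sample, or an FFT magnitude, contributes on a comparable scale to the DBN input.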
The fault diagnosis model adopts the DBN–ELM combination, and the diagnosis flow chart is shown in
Figure 25. The details are as follows:
(1) Demagnetization fault data are obtained using the multi-physical coupling analysis method.
(2) Noise is introduced to simulate real-world operating conditions. The data are then standardized, normalized, and divided into training and testing sets, with labels assigned accordingly.
(3) An optimization algorithm is employed to tune the hyperparameters of the model, thereby obtaining the optimal structure.
(4) The training data are input into the model for learning, and the testing data are subsequently used to perform fault diagnosis.
The results of a single diagnosis are shown in
Figure 26. The accuracy of the training set reached 92%, while that of the test set reached 85%. More error points occur in the overall demagnetization faults because the differences among the six fault types are relatively small. The distribution of the other error points is irregular, which may be attributed to the presence of white noise.
To provide a more reliable estimate of the diagnostic performance beyond this single train-test split, a 10-fold cross-validation procedure is additionally performed on the same dataset. Specifically, the dataset is randomly divided into ten subsets of equal size; nine subsets are used for training and the remaining subset for testing, and the procedure is repeated until each subset has served once as the test set. Across the ten folds, the proposed DBN-ELM model achieves an average test accuracy of 89.1% ± 1.2%, indicating that the model maintains stable generalization performance under different data partitions.
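The 10-fold procedure described above can be sketched generically as follows. This is an illustrative NumPy-only version: the 1-nearest-neighbour "model" and the toy data are stand-ins for the actual DBN–ELM pipeline, which would be plugged in via the fit/predict callbacks.

```python
import numpy as np

rng = np.random.default_rng(5)

def kfold_indices(n_samples, k=10):
    """Shuffle sample indices and split them into k near-equal folds."""
    idx = rng.permutation(n_samples)
    return np.array_split(idx, k)

def cross_validate(X, y, fit, predict, k=10):
    """Each fold serves once as the test set; the rest train the model."""
    folds = kfold_indices(len(X), k)
    accs = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])
        accs.append(np.mean(predict(model, X[test_idx]) == y[test_idx]))
    return np.mean(accs), np.std(accs)

# toy stand-in for the DBN-ELM: a 1-nearest-neighbour classifier
def fit(Xtr, ytr):
    return Xtr, ytr

def predict(model, Xte):
    Xtr, ytr = model
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return ytr[d.argmin(axis=1)]

X = rng.random((240, 10))            # 240 samples, as in the fault dataset
y = rng.integers(0, 8, 240)          # 8 fault classes
mean_acc, std_acc = cross_validate(X, y, fit, predict)
```

Reporting the mean and standard deviation across folds, as done for the 89.1% ± 1.2% figure above, characterizes both the expected accuracy and its stability under different data partitions.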
To further verify the accuracy of the diagnostic model, three benchmark models are selected for comparison; Table 10 shows the performance comparison of the different fault diagnosis algorithms.
(1) SDAE + SVM: a stacked denoising autoencoder (SDAE) is used to extract features from the original data and reduce its overall dimension, and a support vector machine (SVM) classifies the data into the corresponding fault labels. The hidden-layer network structure is [150, 75, 30], the sparsity coefficient is 0.01, and the penalty term has a weight of 2.
(2) CNN + Softmax: a convolutional neural network (CNN) is used for feature extraction, and the last layer of the network is classified by a softmax classifier. The model has 5 convolutional layers; the convolution kernel size is 64 × 1 in the first layer and 3 × 1 in the remaining 4 layers. The fully connected layer has 100 neurons with the ReLU activation function, and the softmax layer has eight outputs corresponding to the eight faults.
(3) LSTM: the model adopts a two-layer Long Short-Term Memory (LSTM) structure with 20 neurons in each layer. The dropout rate is set to 0.4, and a fully connected layer at the bottom produces the classification output.
The model consisting of SDAE and SVM shows poor accuracy and a long training time. The CNN + Softmax model achieves better training-set accuracy, but its test-set accuracy drops, indicating unstable diagnosis; the low data dimension may lead to overfitting of the model. The results of the LSTM diagnostic model are stable, with almost no difference between training-set and test-set accuracy, but both are relatively low: compared with its performance on time-series data, LSTM is unsatisfactory for this classification task. Overall, the proposed DBN-ELM model achieves a favorable trade-off between diagnostic accuracy and computational efficiency. Although CNN + Softmax attains the highest test accuracy, it shows clear signs of overfitting, whereas DBN-ELM maintains more consistent performance between training and testing and demonstrates strong robustness to noise, making it more suitable for practical aviation applications.
It should also be noted that the present diagnostic evaluation is conducted on simulation data obtained under a single nominal operating condition. While the DBN–ELM model shows stable performance under k-fold validation and noise perturbation, the lack of multi-condition or experimental verification remains a limitation. These aspects will be further addressed in future work when additional data become available.