Abstract
To improve the currently low accuracy of rolling bearing fault diagnosis, which stems from the difficulty of extracting the non-stationary and nonlinear features of fault signals, a multi-condition fault diagnosis method for rolling bearings is proposed based on enhanced singular spectrum decomposition (ESSD), optimized multi-scale mean permutation entropy (MMPE), and a support vector machine (SVM). Firstly, singular spectrum decomposition (SSD) produces false components and cannot accurately decompose signals with low energy proportions when the residual energy ratio is used as the final iteration termination condition; to address this problem, an enhanced singular spectrum decomposition method is proposed. Secondly, because the quality of the fault features extracted by MMPE depends on the selection of its parameters, and these parameters interact, a method that optimizes MMPE with the particle swarm optimization (PSO) algorithm is proposed to maximize the performance of the extracted features. Finally, considering that the classification performance of SVM is affected by the penalty factor c and the kernel function parameter g, the fault features extracted by ESSD + PSO - MMPE are identified by an SVM classifier optimized by PSO, so as to realize the effective diagnosis of multi-condition faults in rolling bearings. Using rolling bearing simulation signals, the Case Western Reserve University bearing dataset, and the online monitoring signal from the front bearings of a wind farm’s 1.5 MW wind turbine, the proposed method is compared with EMD + MMPE + SVM, SSD + MMPE + PSO - SVM, ESSD + MMPE + PSO - SVM, and other methods. The results show that the proposed method can effectively identify multi-condition faults in rolling bearings.
1. Introduction
In recent years, China has proposed “Industry 4.0” and “Made in China 2025” as major strategies for the development of the manufacturing industry [1]. However, for manufacturing enterprises, effectively monitoring and diagnosing the fault conditions of core equipment’s critical components, such as rolling bearings, is crucial for enhancing the level of manufacturing. It is also a key aspect of achieving equipment intelligence and promoting the transformation of the manufacturing industry [2]. The “Mechanical Engineering Discipline Development Strategic Report 2021–2035” also lists improving the reliability and safety of major equipment operations in manufacturing as an important development direction [3].
As the complexity of large equipment continues to increase, rolling bearings, which are core components, need to operate under higher loads and higher speeds. This significantly raises the probability of bearing failures. When a failure occurs, it can lead to downtime for maintenance, affecting the progress of the manufacturing process. Therefore, the ability to quickly and accurately diagnose the operating condition of rolling bearings is of utmost importance for the manufacturing industry [4].
Rolling bearing fault signals exhibit strong nonlinearity and non-stationarity and are characterized by periodic impact features. In recent years, with the continuous development of signal processing technology, research on rolling bearing fault diagnosis has achieved certain results. Huang et al. [5] proposed an Empirical Mode Decomposition (EMD) time–frequency analysis method for adaptive signal decomposition, but this method has the following shortcomings: (1) endpoint effects and mode mixing during decomposition; (2) EMD lacks rigorous theoretical derivation [6]. To address these issues, Bonizzi et al. proposed a novel signal adaptive analysis method called Singular Spectrum Decomposition (SSD) [7]. SSD evolves from the Singular Spectrum Analysis (SSA) algorithm and allows for the adaptive reconstruction of single-component signals from high frequencies to low frequencies [8]. This provides a new approach for analyzing the time series of nonlinear and non-stationary vibration signals. Compared to EMD, Ensemble Empirical Mode Decomposition (EEMD), and Variational Mode Decomposition (VMD), SSD has a solid mathematical foundation, higher decomposition accuracy, and better suppression of spurious components and mode mixing [6]. SSD has been successfully applied to analyze vibration signals in rotating machinery and has achieved good results. However, preliminary research has found that when SSD uses the residual energy ratio of vibration signals as the iteration termination condition, it cannot accurately decompose weak components with low energy ratios. Based on this, this paper proposes an enhanced Singular Spectrum Decomposition method that uses Jensen-Shannon (JS) distance and variance contribution rate as supplementary iteration termination conditions to improve the precision of SSD in decomposing weak signals and suppressing spurious components.
The modal components produced using SSD contain fault information obtained from the original signal. Traditional vibration signal demodulation methods can cause modulation effects at both ends of the demodulated signal [8], leading to errors in extracting accurate fault frequencies. Information entropy can effectively describe the uncertainty of various possible events occurring in an information source [9]. For rolling bearing vibration signals, the uncertainty of vibration signals under different bearing conditions can also be represented by information entropy. Approximate entropy [10], sample entropy [11], and permutation entropy [12] are commonly used methods for uncertainty analysis in fault diagnosis. However, approximate entropy has selectivity towards the target object, and both approximate entropy and sample entropy use hard threshold criteria, which can affect their results [13]. Permutation entropy reflects the uncertainty and complexity of vibration signals on a single scale, whereas fault vibration signal characteristics are often contained across multiple scales. Therefore, multi-scale permutation entropy has been proposed. Zheng Jinde et al. [14] conducted fault diagnosis based on multi-scale permutation entropy for bearing faults, but multi-scale permutation entropy still has the following drawbacks: (1) the effectiveness of quantifying fault characteristics depends on the selection of parameters; (2) in multi-scale permutation entropy, the time series length shortens and original information is lost after coarse-graining, and the dynamic mutation phenomena of the original signal can be neutralized when the time series is averaged after coarse-graining [15]. To address the first issue, Rao Guoqiang et al. [16] determined the parameters for multi-scale permutation entropy using independent determination methods (mutual information method and pseudo-nearest neighbor method) through an experimental comparison of rolling bearing full-life data. 
Chen Dongning et al. [17] achieved the classification and identification of bearing faults using Fast Variational Mode Decomposition (FVMD), optimized multi-scale permutation entropy, and GK fuzzy clustering methods. For the second issue, Wang Gongxun et al. [15] proposed multi-scale mean permutation entropy and achieved good results in extracting rolling bearing fault features. However, the effectiveness of multi-scale mean permutation entropy in reflecting the uncertainty of vibration signals is also highly dependent on its parameters (delay time, embedding dimension, data length, and scale factor). Therefore, this paper uses particle swarm optimization to adaptively select parameters for MMPE to maximize its fault feature extraction performance.
Once the fault features of the vibration signal have been thoroughly extracted, utilizing this feature information to distinguish fault patterns is a crucial step. Support Vector Machines (SVM) have become a popular classification method in recent years due to their solid theoretical foundation and superior classification performance in fields such as pattern recognition and machine learning [18]. However, since the penalty factor c and the kernel function parameter g can affect the classification results of SVM [19], this paper uses particle swarm optimization to optimize and select these parameters.
In summary, to fully leverage the advantages of SSD, MMPE, and SVM in signal decomposition, feature extraction, and pattern recognition, this paper proposes improvements to combine these methods to diagnose multi-condition faults in rolling bearings. First, an enhanced Singular Spectrum Decomposition (ESSD) method is proposed based on the JS distance and variance contribution rate to improve SSD’s accuracy in decomposing weak signals and suppressing spurious components, and to select the optimal components through correlation analysis. Second, a particle swarm optimization (PSO) method is introduced to optimize multi-scale mean permutation entropy (MMPE) for the effective extraction of fault features from the optimal components. Finally, PSO is used to optimize and select SVM parameters to recognize the features identified by ESSD + PSO - MMPE, achieving an effective diagnosis of the rolling bearing faults.
The remainder of this paper is organized as follows. Section 2 provides a detailed description of the proposed fault diagnosis method. Section 3 presents the simulation analysis of the proposed method. Section 4 presents the experimental design and its comparative results. Section 5 concludes the paper and presents the limitations of the proposed study.
2. Theoretical Foundation
2.1. Singular Spectrum Decomposition
Singular Spectrum Analysis (SSA) is a method for processing nonlinear time series data. It originates from the singular value decomposition of a specific matrix derived from a vibration signal time series. By performing a decomposition and reconstruction of the trajectory matrix of the vibration signal time series, SSA extracts different component sequences from the vibration signal and analyzes the vibration signal [8].
SSD originates from SSA and is a newly proposed adaptive signal processing method. It can decompose nonlinear and non-stationary vibration signals into several Singular Spectrum Components (SSC) and a residual term, ordered from high to low frequencies [6]. The algorithmic process is as follows:
- (1) Construct a new trajectory matrix. Given a time series x(n) with a data length of N and an embedding dimension of M, construct an M × N trajectory matrix X whose i-th row is x(n) circularly shifted by i − 1 samples. To better understand the matrix construction process, consider the time series x(n) = {1,2,3,4,5} with an embedding dimension M = 3. The corresponding matrix X is

$$X = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 4 & 5 & 1 \\ 3 & 4 & 5 & 1 & 2 \end{pmatrix} \tag{1}$$
In Equation (1), the three columns on the left of matrix X constitute the trajectory matrix used in SSA. To enhance the oscillatory components in the initial time-domain signal and ensure that the energy of the residual components decreases progressively after each iteration, move the three elements from the bottom right of matrix X to the top left of the matrix. Construct a new matrix as expressed in Equation (2) to achieve component extraction.
In the equation, the left half of matrix X is the new trajectory matrix in SSA.
- (2) Adaptively select the embedding dimension M. Since SSA has the drawback of empirically choosing the embedding dimension, SSD uses an adaptive rule to select the embedding dimension M. Let rj(n) be the residual component at the j-th iteration, and its expression is as follows:

$$r_j(n) = x(n) - \sum_{k=1}^{j-1} g_k(n), \qquad r_1(n) = x(n) \tag{3}$$

where gk(n) is the component extracted in the k-th iteration.
First, calculate the Power Spectral Density (PSD) of the residual rj(n) in Equation (3) and identify the frequency fmax of its dominant peak. In the first iteration, if fmax/fs ≤ 10⁻³ (a given threshold), where fs is the sampling frequency, the residual component is considered a major trend term and M = N/3. Otherwise, and in every iteration with j > 1, M = 1.2(fs/fmax) to enhance the effectiveness of the SSA-based analysis.
- (3) Reconstruct the SSC. Reconstruct in order from high frequency to low frequency. If M = N/3, use the first left singular vector to obtain g1(n). If M = 1.2(fs/fmax), form a subset Ij = {i1, i2, …, ip} from all feature groups (eigentriples) whose left singular vectors have a prominent main frequency in the band [fmax − ∆f, fmax + ∆f], together with the feature group that contributes the most to the energy of the main peak. Finally, reconstruct the component from the subset Ij by diagonal averaging of the corresponding matrix.
- (4) Set the algorithm iteration termination condition. Subtract the component sequences estimated so far from the original signal to obtain a residual term v(j+1)(n). Calculate the normalized mean squared error σNMSE(j) between this residual term and the original time series:

$$\sigma_{NMSE}^{(j)} = \frac{\sum_{n=1}^{N} \left( v^{(j+1)}(n) \right)^2}{\sum_{n=1}^{N} x^2(n)} \tag{4}$$

If the value of Equation (4) is less than the set threshold (default value is 0.01), the decomposition process terminates. Otherwise, use the residual term as the new signal and iterate repeatedly until the termination condition is met, ultimately obtaining the singular spectrum decomposition result.
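The bookkeeping parts of the SSD procedure can be sketched in plain Python. This is a minimal illustration, not the authors' implementation. The function names are hypothetical, the dominant frequency `f_max` is assumed to come from a separate PSD estimate, and rounding M to an integer is an assumption. The singular value decomposition and the selection of singular vectors are omitted.

```python
def trajectory_matrix(x, M):
    """Build the M x N wrap-around trajectory matrix used by SSD.

    Row i is the series circularly shifted left by i samples, so the
    left N - M + 1 columns form the standard SSA trajectory matrix.
    """
    N = len(x)
    return [[x[(i + k) % N] for k in range(N)] for i in range(M)]


def embedding_dimension(f_max, fs, N, iteration, threshold=1e-3):
    """Adaptive rule for M (f_max must be positive outside the trend case).

    A first-iteration residual dominated by a trend gets M = N/3;
    otherwise M = 1.2 * fs / f_max, rounded to an integer (assumption).
    """
    if iteration == 1 and f_max / fs <= threshold:
        return N // 3
    return round(1.2 * fs / f_max)


def nmse(residual, x):
    """Energy of the residual divided by the energy of the original series."""
    return sum(r * r for r in residual) / sum(v * v for v in x)


# Reproduces the worked example x(n) = {1, 2, 3, 4, 5}, M = 3.
X = trajectory_matrix([1, 2, 3, 4, 5], 3)
```

In a full decomposition loop, `nmse` would be compared against the termination threshold (0.01 by default) after each extracted component.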
2.2. Enhanced Singular Spectrum Decomposition Algorithm
From the principles of the SSD algorithm described above, the SSCs are generated through iterative decomposition. After each iteration, the energy ratio of the residual signal to the original signal is compared with a set threshold; if the ratio is less than the threshold, the iteration stops and the components are output. The performance of SSD therefore depends on this manually set termination threshold (default value is 0.01), and an inappropriate threshold directly degrades the accuracy of the decomposition. Previous research has shown that if the signal contains relatively weak component signals (signals with low amplitude and a low energy ratio), an unsuitable energy ratio threshold prevents the SSD algorithm from decomposing the signal effectively. Therefore, this paper proposes an ESSD method that uses the JS distance and the variance contribution rate as supplementary iteration termination conditions, improving the precision of singular spectrum decomposition and suppressing the generation of false components.
2.2.1. Variance Contribution Rate
The variance contribution rate represents the proportion of the variance of the residual signal with physical significance within the total variance of the signal [20]. It quantifies the impact of different periodic components on the original signal. To reduce the generation of false components, this paper proposes using the variance contribution rate of the residual signal produced during the singular spectrum decomposition as a supplementary first iteration termination condition for SSD. When the variance contribution rate of the residual signal is less than the set value, it is considered that there is no valid information within the residual signal, and the singular spectrum decomposition terminates. The specific calculation process is as follows:
A data sequence {x1, x2, …, xN} is collected within the sampling time t. The definition of the variance contribution rate is
where σ², E, and N represent the variance, mean, and length of the data sequence, respectively.
2.2.2. JS Distance
JS distance, also known as Jensen–Shannon (JS) divergence, is derived from the Kullback–Leibler (KL) divergence. Both can be used as measures of the difference between two probability distributions, but KL divergence is asymmetric, which limits its effectiveness in measuring the difference between two distributions. The JS distance removes this asymmetry by comparing each distribution with the average of the two. With base-2 logarithms, the value of the JS distance lies in [0, 1]: it equals 0 when the two probability distributions are identical and equals 1 when the two distributions do not overlap at all [21]. The specific definition is as follows:
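Let the random discrete variable x have k possible values {X1, X2, …, Xk}. Let Y1(x) and Y2(x) be two probability distributions of the random discrete variable x. The calculation formula for the JS distance is given by Equation (7):

$$JS(Y_1 \| Y_2) = \frac{1}{2} \sum_{i=1}^{k} Y_1(X_i) \log_2 \frac{2\,Y_1(X_i)}{Y_1(X_i) + Y_2(X_i)} + \frac{1}{2} \sum_{i=1}^{k} Y_2(X_i) \log_2 \frac{2\,Y_2(X_i)}{Y_1(X_i) + Y_2(X_i)} \tag{7}$$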
Therefore, given the excellent ability of the JS distance to measure differences, this paper proposes using the JS distance as a supplementary second iteration termination condition for the singular spectrum decomposition algorithm. This enhances the algorithm’s ability to accurately decompose weak signals.
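As a small illustration, the standard Jensen–Shannon definition can be computed for two discrete distributions as follows. This is a sketch under stated assumptions: the function names are hypothetical, base-2 logarithms are used so that the result lies in [0, 1], and the inputs are assumed to be probability vectors over the same k values. How a vibration signal is turned into such a distribution (for example by a histogram) is not specified in this section.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Symmetric JS divergence built from the average distribution m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # m_i > 0 wherever p_i > 0 or q_i > 0, so the KL terms are well defined.
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


identical = js_divergence([0.5, 0.5], [0.5, 0.5])  # no difference at all
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])   # no overlap at all
```

The two calls at the end correspond to the two extreme cases described in the text: identical distributions and distributions with no overlap.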
2.3. Multi-Scale Permutation Entropy
Multi-scale permutation entropy is composed of the permutation entropy of time series at different scales s. The specific calculation process is as follows:
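- (1) Perform coarse-graining on the time series X = (Xi, i = 1, 2, …, N) to obtain the coarse-grained series yj(s), i.e.,

$$y_j^{(s)} = \frac{1}{s} \sum_{i=(j-1)s+1}^{js} X_i, \qquad j = 1, 2, \ldots, \lfloor N/s \rfloor$$

- (2) Perform m-dimensional phase space reconstruction on the coarse-grained time series, i.e.,

$$Y_l^{(s)} = \left\{ y_l^{(s)},\; y_{l+\tau}^{(s)},\; \ldots,\; y_{l+(m-1)\tau}^{(s)} \right\}$$

where l is the index of the l-th reconstructed component, l = 1, 2, …, ⌊N/s⌋ − (m − 1)τ, m is the embedding dimension, and τ is the time delay.
- (3) Sort the elements of each reconstructed component in ascending order to obtain the symbol sequence S(r) = (j1, j2, …, jm), where r = 1, 2, …, R and R ≤ m!. Calculate the probability Pr of each symbol sequence occurring, i.e., the number of reconstructed components with symbol sequence S(r) divided by the total number of reconstructed components.
- (4) Calculate the permutation entropy value for each coarse-grained time series.

$$H_p(m) = -\sum_{r=1}^{R} P_r \ln P_r$$

- (5) Normalize Hp to obtain the multi-scale permutation entropy, which is

$$H_p = \frac{H_p(m)}{\ln(m!)}$$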
The range of Hp is from 0 to 1. The magnitude of Hp indicates the randomness and complexity of the time series: a larger Hp value signifies greater randomness, while a smaller Hp value indicates a more regular time series.
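The five steps can be sketched in plain Python. This is a minimal illustration assuming the standard definition of multi-scale permutation entropy. The function names are hypothetical, ties between equal samples are broken by order of appearance, and incomplete trailing segments are dropped during coarse-graining.

```python
import math
from collections import Counter


def coarse_grain(x, s):
    """Average non-overlapping segments of length s (floor(N/s) points)."""
    return [sum(x[j * s:(j + 1) * s]) / s for j in range(len(x) // s)]


def permutation_entropy(y, m, tau):
    """Normalized permutation entropy of y, a value in [0, 1]."""
    n_vectors = len(y) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    # The ordinal pattern of a delay vector is the index order that sorts it.
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: y[l + k * tau]))
        for l in range(n_vectors)
    )
    h = -sum((c / n_vectors) * math.log(c / n_vectors)
             for c in patterns.values())
    return h / math.log(math.factorial(m))


def multiscale_permutation_entropy(x, m, tau, max_scale):
    """Permutation entropy of the coarse-grained series for s = 1..max_scale."""
    return [permutation_entropy(coarse_grain(x, s), m, tau)
            for s in range(1, max_scale + 1)]
```

A strictly increasing series produces a single ordinal pattern and therefore an entropy of 0, while an irregular series approaches 1.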
2.4. Multi-Scale Mean Permutation Entropy
Compared to multi-scale permutation entropy, multi-scale mean permutation entropy preserves more of the characteristic information of the time series and reduces sampling errors and sample expansion [1]. The specific principle is as follows:
- (1) Replace the coarse-grained time series in the multi-scale permutation entropy with a mean-processed time series to obtain the mean-processed sequence yj(s), i.e.,

$$y_j^{(s)} = \frac{1}{s} \sum_{i=j}^{j+s-1} X_i, \qquad j = 1, 2, \ldots, N - s + 1$$

When s is equal to 1, the mean-processed sequence is the original sequence. When s is greater than 1, the mean-processed sequence has length (N + 1 − s).
- (2) The MMPE of the original time series is composed of the permutation entropy of the mean-processed time series for scales from 1 to s.
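The difference from coarse-graining is that the averaging window slides one sample at a time, so a sequence of length N + 1 − s is kept at every scale. The sketch below illustrates this under the same assumptions as the standard permutation entropy definition; the function names are hypothetical and the code is not the authors' implementation.

```python
import math
from collections import Counter


def mean_process(x, s):
    """Sliding mean with window s; the result has N + 1 - s points."""
    return [sum(x[j:j + s]) / s for j in range(len(x) - s + 1)]


def permutation_entropy(y, m, tau):
    """Normalized permutation entropy of y, a value in [0, 1]."""
    n_vectors = len(y) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: y[l + k * tau]))
        for l in range(n_vectors)
    )
    h = -sum((c / n_vectors) * math.log(c / n_vectors)
             for c in patterns.values())
    return h / math.log(math.factorial(m))


def mmpe(x, m, tau, max_scale):
    """Permutation entropy of the mean-processed series for s = 1..max_scale."""
    return [permutation_entropy(mean_process(x, s), m, tau)
            for s in range(1, max_scale + 1)]
```

With `max_scale = 11`, each signal yields an 11-element feature vector, one entropy value per scale.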
2.5. Particle Swarm Optimization Algorithm
In 1995, the American scholar Kennedy and engineer Eberhart first proposed the Particle Swarm Optimization (PSO) algorithm [22]. This algorithm simulates the interaction and competition between different particles in a swarm to find the optimal position in the search space of the given problem. The velocity and position update formulas are as follows:

$$V_i \leftarrow w V_i + c_1\,\mathrm{rand}()\,(pbest_i - x_i) + c_2\,\mathrm{rand}()\,(gbest - x_i)$$

$$x_i \leftarrow x_i + V_i$$

where i is the particle index, i = 1, 2, …, N, and N is the total number of particles in the swarm. Vi is the velocity of particle i; rand() is a randomly generated number between 0 and 1; xi is the current position of the particle; c1 and c2 are acceleration constants; w is the inertia weight; pbesti is the best position found so far by particle i; and gbest is the best position found so far by the whole swarm.
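A compact sketch of the standard PSO update loop is shown below. It is illustrative only: the function name, the default values (w = 0.7, c1 = c2 = 1.5, 20 particles, 100 iterations), the clamping of positions to the search bounds, and the fixed random seed are assumptions made for this example, not settings taken from the paper.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                # Inertia + pull towards personal best + pull towards swarm best.
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (pbest[i][d] - x[i][d])
                           + c2 * rng.random() * (gbest[d] - x[i][d]))
                lo, hi = bounds[d]
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


# Toy usage: minimize a two-dimensional sphere function.
best, best_val = pso_minimize(lambda p: sum(c * c for c in p),
                              [(-5.0, 5.0), (-5.0, 5.0)])
```

The same loop can drive any scalar objective, such as a fitness computed from MMPE parameters or an SVM validation error, provided integer-valued parameters are rounded inside the objective.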
3. Simulation Analysis
3.1. Analysis of the Impact of the Iteration Termination Threshold on SSD Decomposition Performance
Due to the influence of multiple factors and the complex nature of actual fault signals, the specific components of a measured signal cannot be known in advance. Therefore, this paper illustrates the impact of the iteration termination threshold on the decomposition accuracy of the SSD algorithm by constructing a sinusoidal signal y with frequency components of 5 Hz, 15 Hz, and 30 Hz, as shown in Equation (14). The sampling frequency is set to 1000 Hz. Figure 1 shows the time-domain waveforms of the simulated and synthesized signals.
Figure 1.
Time-domain waveforms of simulated and synthesized signals.
By setting the iteration termination thresholds to 0.01, 0.001, and 0.0001, SSD was performed on the simulated signal, and the resulting SSCs are shown in Figure 2. With the default termination threshold of 0.01, SSD extracted only two SSCs and failed to separate the weak, low-energy signal 0.1sin(30πt), resulting in information loss and under-decomposition. When the iteration termination threshold was set to 0.0001, four components were extracted, exceeding the actual number of signal components; this is over-decomposition, and the second component is a false component. Only when the iteration termination threshold was set to 0.001 did SSD accurately decompose the constituent signals. This indicates that the value of the iteration termination threshold affects the ability of the singular spectrum decomposition algorithm to decompose weak signals.
Figure 2.
Different iteration termination thresholds for the decomposition results.
Therefore, this paper introduces the JS distance and the variance contribution rate as supplementary termination conditions for the SSD algorithm. These are used in addition to the energy ratio threshold for iteration termination to enhance the accuracy of the singular spectrum decomposition algorithm and suppress the generation of false components. The specific steps of ESSD are as follows:
- (1) Parameter Setting. Set the initial parameters for the ESSD algorithm: input signal V(t) and sampling frequency fs. (Because of the supplementary iteration termination conditions, the energy ratio threshold no longer needs to be tuned by experience and remains at its default value.)
- (2) Perform Iterative Calculation. Process the input signal iteratively, with each iteration producing an SSC component signal and a residual signal. The residual signal is the difference between the original signal and the SSC component signal at each iteration.
- (3) Algorithm Termination Criteria. The decomposition ends when the supplementary first and second iteration termination conditions are met. Specifically, during each iteration, compute the energy ratio between the residual signal and the original signal, as well as the JS distance between the SSC component and the original signal (JSmin) and the JS distance between the residual signal and the original signal (JSthreshold). If the energy ratio is below the threshold and JSmin is less than JSthreshold, the supplementary first iteration termination condition is considered met. Additionally, to avoid generating a large number of invalid false components, compute the variance contribution rate (Rvc) of the residual signal for each iteration. If Rvc is less than the threshold VCthreshold (as referenced in [23] and set to VCthreshold = e−6 based on extensive experimentation), the supplementary second iteration termination condition is considered met. At this point, the signal is deemed to be sufficiently decomposed, the iteration loop is terminated, and the decomposition process is concluded.
- (4) Output Components. Arrange the decomposed components in descending order of frequency, number them, and output them.
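The termination logic of step (3) can be written as a single predicate. This sketch reads the text as requiring both supplementary conditions to hold before the loop stops. The function and argument names are hypothetical, the quantities are assumed to be computed elsewhere in each iteration, and the variance contribution threshold is left as a required argument because its numerical value is not unambiguous in the text.

```python
def essd_should_stop(energy_ratio, js_component, js_residual, rvc,
                     vc_threshold, energy_threshold=0.01):
    """Return True when both supplementary termination conditions hold.

    energy_ratio : residual energy divided by original signal energy
    js_component : JS distance between the SSC and the original signal
    js_residual  : JS distance between the residual and the original signal
    rvc          : variance contribution rate of the residual
    """
    # First condition: energy ratio below threshold and the component is
    # closer to the original signal than the residual is.
    first = energy_ratio < energy_threshold and js_component < js_residual
    # Second condition: the residual no longer carries meaningful variance.
    second = rvc < vc_threshold
    return first and second


# Hypothetical values for one iteration, for illustration only.
stop = essd_should_stop(0.005, 0.1, 0.3, 1e-8, vc_threshold=1e-6)
```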
3.2. Performance Analysis of ESSD
Reference [23] constructs a four-component signal composed of two sinusoidal signals (with frequencies of 400 Hz and 100 Hz), an impulse signal (with a fault frequency set at 50 Hz), and a noise signal, representing a weak fault signal from a rolling bearing. The performance of ESSD is tested using this signal. The sampling frequency is set to 4096 Hz, with 4096 sampling points, and the natural frequency is set to 3000 Hz. The specific expression of the simulated signal is given by Equation (15).
Figure 3 shows the time-domain waveform and magnitude spectrum of the simulated signal. The harmonics appearing in the magnitude spectrum correspond to the frequencies of the sinusoidal components, while the fault impulse signal frequency of 50 Hz is hidden. As shown in Table 1, the energy of the simulated signal and its components indicates that the energy of the impulse signal is relatively low, accounting for less than 1% of the total signal energy. Therefore, the constructed simulated signal represents an early weak fault signal in a rolling bearing.
Figure 3.
Time-domain waveform and magnitude spectrum of simulated signal.
Table 1.
Signal energy.
The proposed ESSD was used to decompose the simulated signal. The resulting SSC component signals and their magnitude spectra are shown in Figure 4. After a single adaptive decomposition by ESSD, four SSCs were produced, corresponding to the number of components in the simulated signal. Analysis reveals that the magnitude spectrum of SSC1 is chaotic, matching the characteristics of the component signal y4(t). The magnitude spectrum of SSC2 forms a resonance band between approximately 500 Hz and 1300 Hz, consistent with the characteristics of the impulse signal y1(t). The main spectral lines of SSC3 and SSC4 correspond to the set frequencies of component signals y2(t) and y3(t), respectively. An envelope spectrum analysis of SSC2, as shown in Figure 4i, reveals the fault’s characteristic frequency and its harmonics. This indicates that ESSD effectively decomposed the signal.
Figure 4.
ESSD results.
For comparison, SSD was performed on the simulated signal (with iteration termination thresholds set to 0.1, 0.01, and 0.001). The decomposition results are shown in Figure 5. From the figure, it can be seen that when the threshold is set to 0.1 or 0.01, SSD decomposes the two sinusoidal component signals, but the low-amplitude impulse signal and random noise signal are not effectively decomposed, resulting in under-decomposition and missing valuable information. When the threshold is set to 0.001, SSD decomposes the components of each signal, but compared to ESSD, it produces false components, SSC2, SSC5, and SSC7, resulting in over-decomposition. The comparison shows that the ESSD algorithm proposed in this paper has better signal decomposition accuracy and a better ability to suppress the generation of false components.

Figure 5.
SSD results.
4. Experimental Analysis
4.1. Case 1: Case Western Reserve University Multiple Condition Bearing Dataset
To address the issue of faults occurring during the operation of rolling bearings, this paper proposes a fault feature extraction method based on ESSD + PSO - MMPE and uses PSO - SVM for effective fault recognition. The method’s effectiveness is validated using the Case Western Reserve University Multiple Condition Bearing Dataset.
The CWRU [24] rolling bearing fault simulation test rig consists of a 2-horsepower motor, a torque encoder, a dynamometer, and a control electronic unit. A 16-channel sampling instrument is used to collect vibration signals from the rolling bearings, with the analysis focusing on the bearings at the drive end. Vibration data are collected using an accelerometer, with the drive end motor operating at a speed of 1797 RPM and a sampling frequency of 12,000 Hz. Vibration signals are recorded using a multi-channel DAT recorder. The motor bearing condition assessment system was developed at Rockwell (Milwaukee, WI, USA). In this study, 100 data samples are extracted with a window size of 1024 and subjected to ESSD.
To verify the effectiveness of the method, two fault datasets with different fault types and different fault diameters of the same type are selected for analysis. Specific details are provided in Table 2.
Table 2.
Two kinds of fault data set information.
4.2. Fault Diagnosis Process
Using Dataset 1 as an example, fault diagnosis is performed using the method described in this paper. The diagnosis steps are as follows:
Step 1: Perform ESSD on the rolling bearing vibration signals from different fault datasets to obtain a series of physically meaningful SSCs. Compute the correlation coefficient between each SSC and the original signal, and select the component with the highest correlation coefficient as the optimal component.
Step 2: Extract fault features from the optimal component using the PSO-optimized MMPE. Create a fault feature dataset and divide the dataset into training and testing sets at a 1:1 ratio. This ratio reduces the number of model training samples while increasing the number of testing samples, demonstrating that the proposed method can effectively recognize faults even with a small number of training samples for SVM.
Step 3: Use the PSO-optimized SVM to classify the extracted features from the rolling bearings, thereby identifying their fault types.
Perform ESSD on the vibration signals of Dataset 1 under different fault conditions and select the optimal component based on the correlation coefficient. The results are shown in Table 3. The bolded parts are the optimal components selected through correlation screening. The larger the correlation coefficient, the more similar the SSC component signal is to the original signal, indicating that it contains more fault characteristic information. Additionally, compared to the original signal, it reduces the interference of background noise on the component signal.
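The correlation-based screening can be illustrated as follows. The sketch assumes the Pearson correlation coefficient between each SSC and the original signal, which this section does not state explicitly; the function names and the toy data are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_optimal_component(components, signal):
    """Return the index of the SSC most correlated with the signal, and all scores."""
    scores = [pearson(c, signal) for c in components]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores


# Toy data: the second component is a scaled copy of the signal.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
components = [[1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
              [2.0 * v for v in signal]]
best, scores = select_optimal_component(components, signal)
```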
Table 3.
Component correlation coefficient value.
4.3. Fault Feature Extraction
4.3.1. Analysis of the Impact of PSO - MMPE Parameters
For the MMPE algorithm, the data length N of the vibration signal and the value of the scale factor s do not significantly affect the MMPE’s ability to reflect the instability of the vibration signal [15]. However, the delay time τ and the embedding dimension m do have an impact on the effectiveness of MMPE in feature extraction [15].
Therefore, in this study, the scale factor s is set to 11. The optimal SSCs of four types of bearing vibration signals are selected to calculate the MMPE. Through experimental analysis, the impact of delay time τ ranging from 1 to 4 and embedding dimension m ranging from 3 to 7 on the MMPE of bearing vibration signals is examined [15].
Impact of delay time τ on MMPE: as shown in Figure 6a, when m = 3 and τ = 1, the entropy value of the normal-state bearing is the smallest. This aligns with the vibration characteristics of the normal-state bearing signal, which have a low impact and high stability [15]. The distribution of MMPE values for the other three types of faults does not provide good differentiation; thus, the impact of m on MMPE needs further investigation.
Figure 6.
The influence of τ on MMPE.
- Impact of embedding dimension m on MMPE
Analysis of Figure 6 and Figure 7 shows that for m = 4, 5, 6, 7, with τ = 1 and s being less than 2, the algorithm can effectively reflect the fault states of rolling bearings. However, when m is too small, the algorithm does not effectively detect changes in fault vibration signals. Conversely, when m is too large, the run time and computational load increase and the results become less ideal [17]. If the scale factor s is too small, fault information is inadequately extracted from the bearing vibration signal, and the differences in complexity between fault signals may be diminished [17]. Therefore, the parameters of MMPE must be selected by optimization.
Figure 7.
The influence of m on MMPE.
4.3.2. Parameter Optimization of MMPE Study
Based on the above analysis, it is evident that the reasonable selection of parameters in the MMPE algorithm is crucial for the effectiveness of feature extraction. Therefore, to achieve the best feature extraction performance with MMPE, it is necessary to select the most appropriate parameters for m, τ, and s. Additionally, when optimizing the MMPE parameters, the interactions between parameters cannot be ignored [17]. Therefore, this paper proposes using the PSO algorithm to optimize and select the MMPE method, simultaneously optimizing the four parameters.
4.3.3. Fitness Function
The overall trend of a data set can be partially reflected by the mean of the data, but it cannot be characterized solely by the mean. The skewness (Ske) of the data can effectively represent the performance of the mean. The larger the absolute value of Ske, the less the overall trend of the vibration signal time series is reflected; the smaller the absolute value, the more reliable the skewness [15]. Therefore, this paper uses the square of the MMPE skewness as the objective function to be optimized, determining its minimum value. Let the time series X = {xi, i = 1,2,…,N} form a sequence of permutation entropies Hp(x) = {Hp(1),Hp(2),…,Hp(s)} under the scale, and calculate its skewness Ske using Equation (16).
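$$\mathrm{Ske} = \frac{E\left[\left(H_p(X) - \bar{H}_p(X)\right)^{3}\right]}{\sigma_{H_p}^{3}} \tag{16}$$
In the equation, $\bar{H}_p(X)$ is the mean of the sequence Hp(X); $\sigma_{H_p}$ is the standard deviation of the sequence Hp(X); and E(·) denotes the expected value.
The fitness function is defined as the square of the skewness, whose minimum is sought by PSO:
$$F(X) = \mathrm{Ske}^{2}$$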
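As a minimal sketch of the squared-skewness objective described above, the following Python functions compute the skewness of an entropy sequence and its square. The function names are illustrative, the population (1/n) form of the moments is assumed, and returning 0 for a constant sequence is a convention chosen here rather than something stated in the text.

```python
def skewness(values):
    """Population skewness E[(h - mean)^3] / std^3 of a sequence."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        # Skewness is undefined for a constant sequence; returning 0 is an
        # assumption made here so that the objective stays finite.
        return 0.0
    third_moment = sum((v - mean) ** 3 for v in values) / n
    return third_moment / var ** 1.5


def fitness(entropies):
    """Objective to be minimised: the square of the skewness."""
    return skewness(entropies) ** 2
```

A sequence that is symmetric about its mean, such as `[1.0, 2.0, 3.0]`, gives a fitness of essentially zero, while a lopsided one such as `[0.0, 0.0, 1.0]` gives 0.5.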
The optimal component signals decomposed from the two datasets in Table 2 using ESSD are analyzed. The parameters for the PSO algorithm are set as follows: the population size N1 is 10, the maximum number of iterations is 30, the acceleration constants c1 and c2 are 1.5, and the inertial weight w is 5. The algorithm is run 10 times to obtain the average results [17].
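The PSO search described above can be sketched roughly as follows. This is a generic textbook PSO, not the authors' implementation: the population size, iteration count, and acceleration constants follow the text, whereas the inertia weight default (0.7), the search ranges, and the quadratic stand-in objective are assumptions made only so that the toy example runs and converges. In a real run the objective would round m, τ, and s to integers and return the squared MMPE skewness.

```python
import random


def pso_minimize(fitness, bounds, n_particles=10, n_iter=30,
                 c1=1.5, c2=1.5, w=0.7, seed=0):
    """Minimise `fitness` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


# Hypothetical search ranges for (m, tau, s); the text does not state them.
bounds = [(3.0, 7.0), (1.0, 5.0), (2.0, 20.0)]


def toy_objective(p):
    # Stand-in for the squared MMPE skewness: a quadratic bowl whose
    # minimum location is arbitrary and carries no physical meaning.
    return (p[0] - 4.0) ** 2 + (p[1] - 2.0) ** 2 + (p[2] - 10.0) ** 2


best, best_val, history = pso_minimize(toy_objective, bounds)
```

Because the global best is only ever replaced by a strictly better value, `history` is non-increasing; averaging over several runs, as the text describes, would amount to calling `pso_minimize` with different `seed` values.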
Table 4.
MMPE optimization parameters for Data Set 1.
Table 5.
MMPE optimization parameters for Data Set 2.
The MMPE values of the two types of rolling bearing vibration signals are calculated using the parameters shown in Table 4 and Table 5. The scale factor s is represented by its minimum value [17]. The distribution of multi-scale mean permutation entropy is shown in Figure 8.
Figure 8.
Fault features of the two datasets extracted by PSO - MMPE.
As can be seen from Figure 8, the MMPE optimized through parameter tuning distinguishes more clearly between the optimal component signals decomposed by ESSD for the four states of the rolling bearing. The entropy value for the normal-state bearing is the smallest and most regular, which corresponds to its vibration characteristics of low impact and high stability. The MMPE values for various fault signals are larger, and the MMPE values for different faults are also significantly different [15]. This indicates that PSO - MMPE can theoretically represent the state information of bearing vibration signals. Compared to unoptimized MMPE (Figure 6 and Figure 7), it can better distinguish between different fault states of the bearing. Therefore, the fault features extracted by PSO - MMPE can be used as input feature vectors for SVM for further classification and identification of the rolling bearing states.
4.4. Pattern Recognition
4.4.1. Selection of Feature Vectors
Using the PSO - MMPE parameters shown in Table 4 and Table 5, the MMPE values for 50 samples of each bearing state are calculated. The samples of each state are divided into a training set and a test set in a 1:1 ratio, giving 25 training samples and 25 test samples per state. The MMPE values of each sample are taken over 11 scales (s = 11), so that the four bearing states form a 100 × 11 training set and a 100 × 11 test set, as shown in Table 6 and Table 7.
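The bookkeeping behind the 100 × 11 training and test sets can be illustrated with a short sketch. The feature values below are random placeholders rather than real MMPE values, the function and variable names are hypothetical, and a random per-state split is assumed.

```python
import random

N_STATES, N_SAMPLES, N_SCALES = 4, 50, 11


def split_per_state(features, labels, n_train=25, seed=0):
    """Split each state's samples 1:1 into training and test rows."""
    rng = random.Random(seed)
    train, test = [], []
    for state in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == state]
        rng.shuffle(idx)
        train += [(features[i], state) for i in idx[:n_train]]
        test += [(features[i], state) for i in idx[n_train:]]
    return train, test


# Placeholder features: random numbers stand in for the MMPE value at
# each of the 11 scales; they are not data from the paper.
_rng = random.Random(1)
features = [[_rng.random() for _ in range(N_SCALES)]
            for _ in range(N_STATES * N_SAMPLES)]
labels = [k // N_SAMPLES for k in range(N_STATES * N_SAMPLES)]

train, test = split_per_state(features, labels)
```

Each of `train` and `test` then holds 100 rows of 11 features, with 25 rows per bearing state.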
Table 6.
Sample test data of different fault types.
Table 7.
Test data samples of different fault diameters.
4.4.2. State Recognition
Figure 9 shows the classification results for the rolling bearing fault features extracted by the proposed method, after the PSO-optimized SVM model was trained on the small-sample training set. The trained SVM achieved 100% classification accuracy both for the dataset of different fault types and for the dataset of the same fault with different diameters.
Figure 9.
Case 1 rolling bearing fault classification results. (a) Classification result of Dataset 1; (b) Classification result of Dataset 2.
For comparison, fault features were also extracted with the EMD + MMPE, SSD + MMPE, and ESSD + MMPE methods, and fault recognition was performed using SVM and PSO - SVM; the results are shown in Table 8. The comparison methods gave satisfactory diagnostic results for Dataset 1, but their results for Dataset 2 still leave room for improvement. In contrast, the proposed method achieved a fault diagnosis accuracy of 100% for both Dataset 1 and Dataset 2, which supports the superiority of the proposed fault feature extraction method.
Table 8.
Recognition accuracy of different methods.
4.5. Case 2: Online Monitoring Data of Wind Turbine Bearings
After proving the effectiveness of the proposed method through the fault diagnosis of laboratory rolling bearing vibration signals, the method was applied to the fault diagnosis of the front bearing vibration signals of the generator at the Zhuhai Guishan Wind Power Plant. The monitoring data were taken from the No. 30 Yongming 1.5 MW wind turbine (collected using the CS2000 wind turbine drivetrain online vibration detection and analysis system). The sampling frequency was 25.6 kHz, and the sampling time was 2.56 s. The bearing model was NSK’s 6330M/C3 deep groove ball bearing.
The monitoring data from the sensor at the generator front bearing measurement point of this unit showed abnormalities compared with other, similar units. After the unit was shut down for inspection, an inner ring wear fault was discovered. Therefore, to verify the effectiveness of the proposed method in engineering applications, this paper analyzes 51,200 data points from each of two time-domain signals of the generator’s front bearing, one in the normal state and one in the inner ring fault state as determined by the inspection personnel (Figure 10).
Figure 10.
Front bearing time-domain signal waveform.
The time-domain data of the generator’s front bearing in the two states were segmented with a window size of 1024 into 100 samples, which were then decomposed by ESSD. The optimal components were selected based on the correlation coefficient to reduce signal redundancy. Taking the first sample as an example, the correlation coefficients between its SSCs and the initial signal are shown in Table 9, where the selected optimal components are highlighted.
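The segmentation and correlation-based selection steps can be sketched as follows. The sketch assumes consecutive, non-overlapping windows (consistent with 51,200 points giving 50 windows of 1024) and assumes that the component with the largest absolute Pearson correlation with the original segment is the one kept; the exact selection rule is not spelled out in the text. The two sinusoids stand in for decomposed components and are not ESSD output.

```python
import math


def segment(signal, window=1024):
    """Cut a signal into consecutive, non-overlapping windows."""
    n = len(signal) // window
    return [signal[k * window:(k + 1) * window] for k in range(n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant sequence carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def best_component(components, original):
    """Return the index of the most correlated component and all scores."""
    scores = [abs(pearson(c, original)) for c in components]
    best = max(range(len(components)), key=lambda i: scores[i])
    return best, scores


# Toy stand-in for one segment: a strong 50 Hz tone plus a weak 300 Hz tone.
t = [k / 1024 for k in range(1024)]
low = [math.sin(2 * math.pi * 50 * x) for x in t]
high = [0.1 * math.sin(2 * math.pi * 300 * x) for x in t]
original = [a + b for a, b in zip(low, high)]

index, scores = best_component([low, high], original)
windows = segment(list(range(51200)))
```

Here the dominant tone is selected (`index` is 0) because it carries almost all of the segment's energy, which mirrors the idea of keeping the component most representative of the original signal.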
Table 9.
Correlation coefficient of each component after ESSD.
Features were extracted using PSO - MMPE (parameters shown in Table 10). As shown in Figure 11, PSO - MMPE effectively extracts features from the ESSD-decomposed engineering dataset. The 50 samples of each class were randomly divided into training and test sets in a 1:1 ratio (as in Case 1). Pattern recognition was then performed using PSO - SVM, and the results, shown in Figure 12, reach a classification accuracy of 100%. Compared with the other methods (Table 11), the proposed method shows favorable results in a practical engineering application.
Table 10.
PSO algorithm optimization parameters.
Figure 11.
Bearing PSO - MMPE characteristics.
Figure 12.
Fault diagnosis result.
Table 11.
Comparison of accuracy of different methods.
5. Conclusions
- (1) An ESSD method is proposed, which adopts the variance contribution rate and JS divergence as supplementary termination criteria for SSD. This method effectively addresses the problems of SSD being unable to decompose weak signals and tending to generate spurious components when the residual energy ratio is used as the final iterative termination criterion.
- (2) A method for optimizing MMPE using PSO is proposed. This method adaptively selects the parameters to effectively extract the features of rolling bearing vibration signals. Based on this, PSO is also used to optimize the SVM for better recognition of the features extracted by ESSD + PSO - MMPE.
- (3) Experiments were conducted using the CWRU datasets with different fault types and fault diameters, as well as the dataset for a generator’s front bearing from the Zhuhai Guishan Wind Power Plant. The results show that the proposed method, which combines ESSD with optimized MMPE and SVM for multi-condition fault diagnosis of rolling bearings, achieves the highest accuracy in identifying rolling bearing faults when compared with methods such as EMD + MMPE + SVM, SSD + MMPE + PSO - SVM, and ESSD + MMPE + PSO - SVM, which demonstrates its effectiveness.
- (4) Although the proposed ESSD algorithm improves the accuracy of SSD and enhances the ability to suppress spurious components, it does not address the end-effect and energy leakage issues inherent in SSD during signal decomposition. Further analysis of the algorithm’s underlying principles and subsequent improvements are required to gradually refine and complete the algorithm.
Author Contributions
Conceptualization, W.Z.; software, Y.C.; validation, W.Z. and X.Z.; investigation, X.Z.; resources, W.Z.; data curation, Y.C.; writing—original draft preparation, X.Z.; writing—review and editing, W.Z.; funding acquisition, W.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by Yunnan Provincial Key Laboratory of Intelligent Logistics Equipment and Systems (Grant No.: 202449CE340008, Department of Science and Technology of Yunnan Province) and Shen Weiming Academician Workstation of Yunnan Province (Grant No.: 202505AF350084, Department of Science and Technology of Yunnan Province).
Data Availability Statement
The data supporting this study can be obtained upon request from the corresponding author. However, due to privacy considerations and the presence of undisclosed intellectual property, these data are not accessible to the public.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| SSC | Singular Spectrum Component |
| JS | Jensen–Shannon |
| KL | Kullback–Leibler divergence |
| s | Scale factor |
| m | Embedding dimension |
| τ | Delay time |
| PSD | Power Spectral Density |
References
- Niu, Q.; Yang, S.; Li, X. An empirical mode decomposition-based frequency-domain approach for the fatigue analysis of nonstationary processes. Fatigue Fract. Eng. Mater. Struct. 2018, 41, 1980–1996.
- Li, W.; He, C.; Chen, Z.; Huang, R.; Jin, G. Unsupervised fault diagnosis method of gearbox based on symmetrical comparative learning. Chin. J. Sci. Instrum. 2022, 43, 121–131.
- Ye, X.; Liu, S.; Zhu, J.; Huang, Z.; Zhu, J.; Lai, Y.; Miao, H.; Wang, Q. Mechanical testing theory and technology: Present situation, trends, and prospects. Laser Optoelectron. Prog. 2023, 60, 0312002.
- Ye, Z.; Yu, J. Gearbox fault diagnosis method based on multi-channel one-dimensional convolutional neural network feature learning. J. Vib. Shock 2020, 39, 55–66.
- Kwon, W.; Lee, J.; Choi, S.; Kim, N. Empirical mode decomposition and Hilbert–Huang transform-based eccentricity fault detection and classification with demagnetization in 120 kW interior permanent magnet synchronous motors. Expert Syst. Appl. 2024, 241, 122515.
- Ren, L.; Zhen, L.; Zhao, Y.; Dong, Q.; Zhang, Y. Fault diagnosis of rolling bearings under strong background noise based on SSA-VMD-MCKD. Vib. Shock 2023, 42, 217–226.
- Postema, J.J.; Muntean, M.; Bonizzi, P.; Karel, J.; Meste, O.; De Raedt, H.; Michielsen, K. Hybrid quantum singular spectrum decomposition for time series analysis. AVS Quantum Sci. 2023, 5, 023803.
- Yan, X.; Jia, M. Morphological demodulation method based on improved singular spectrum decomposition and its application in fault diagnosis of rolling bearings. J. Mech. Eng. 2017, 53, 104–112.
- Huo, Z.; Martínez-García, M.; Zhang, Y.; Yan, R.; Shu, L. Entropy measures in machine fault diagnosis: Insights and applications. IEEE Trans. Instrum. Meas. 2020, 69, 2607–2620.
- Wang, D.; Zhong, J.; Shen, C.; Pan, E.; Peng, Z.; Li, C. Correlation dimension and approximate entropy for machine condition monitoring: Revisited. Mech. Syst. Signal Process. 2021, 152, 107497.
- Wang, Y.; Wang, D. Investigations on sample entropy and fuzzy entropy for machine condition monitoring: Revisited. Meas. Sci. Technol. 2023, 34, 125104.
- Noman, K.; Wang, D.; He, Q. Oscillation based permutation entropy calculation as a dynamic nonlinear feature for health monitoring of rolling element bearing. Measurement 2021, 172, 108891.
- Yang, W.; Zhang, P.; Wang, H.; Chen, Y.; Sun, Y. Gear fault diagnosis based on EEMD with multi-scale fuzzy entropy. J. Vib. Shock 2015, 34, 163–167.
- Zheng, J.; Yao, Y.; Pan, H.; Tong, J.; Liu, Q. Research progress on the application of multi-scale entropy method in the mechanical fault diagnosis. J. Anhui Univ. Technol. Nat. Sci. 2024, 41, 46–57+97.
- Wang, G.; Zhang, M.; Hu, Z. Bearing fault diagnosis of support vector machine based on multi-scale mean arrangement entropy and parameter optimization. J. Vib. Shock 2022, 41, 221–228.
- Wang, H.; Li, Q.; Yang, S.; Liu, Y. Fault recognition of rolling bearings based on parameter optimized multi-scale permutation entropy and Gath-Geva. Entropy 2021, 23, 1040.
- Chen, D.; Zhang, Y.; Yao, C.; Sun, F.; Zhou, N. Fault diagnosis of multi-scale permutation entropy and GK fuzzy clustering based on FVMD. J. Mech. Eng. 2018, 54, 16–27.
- Chen, J. A support vector machine acceleration method for multi-classification problems. Comput. Sci. 2022, 49, 297–300.
- Yu, L.; Chen, S.; Zhang, R.; Li, K.; Su, L. Application of deep support vector machine in gear fault diagnosis. J. Mech. Transm. 2019, 43, 150–156.
- Shang, X. Extraction and classification method of mine microseismic and blasting signal characteristics based on EMDSVD. Chin. J. Geotech. Eng. 2016, 38, 1849–1858.
- Sun, W.; Wang, H.; Gu, Q. Exact Frequency Estimation in the Noise via KL Divergence of Accumulated Power. IEEE Commun. Lett. 2021, 25, 2574–2578.
- Shami, T.M.; El-Saleh, A.A.; Alswaitti, M.; Al-Tashi, Q.; Summakieh, M.A.; Mirjalili, S. Particle swarm optimization: A comprehensive survey. IEEE Access 2022, 10, 10031–10061.
- Zhu, J.; Hu, T.; Jiang, B.; Yang, X. Intelligent bearing fault diagnosis using PCA–DBN framework. Neural Comput. Appl. 2020, 32, 10773–10781.
- Case School of Engineering. Bearing Data Center [DB/OL]. Available online: https://engineering.case.edu/bearingdatacenter/welcome (accessed on 14 December 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).