1. Introduction
As the core “joint” of industrial equipment, the bearing directly determines the reliability of rotating machinery. The timely detection of surface defects such as pitting, cracks, and wear in metallic components is critical for preventing catastrophic machinery failures, minimizing downtime, and ensuring operational safety, so fault diagnosis techniques that identify these defects early are essential. Bearings reportedly account for 84% of the faults in wind power transmission systems, with main bearings alone contributing 30% [1]. In aero engines, maintenance of the main bearing accounts for over 60% of the total cost [2]. Complex working conditions frequently lead to metal fatigue and wear [3], and traditional predictive models have struggled to provide effective warnings. Once a failure occurs, it can trigger a chain reaction of equipment downtime, financial loss, and even safety accidents. The academic community has therefore proposed various bearing fault diagnosis techniques. Acoustic emission detection captures early micro-defects by using high-frequency stress wave signals [4]. Although its sensitivity has improved significantly, it still requires a complex noise reduction process. Fault diagnosis based on oil analysis examines wear particles, pollutants, and other components in the oil to assess the degree of wear and the type of fault in the equipment [5]. However, both of these detection methods have certain limitations in bearing fault diagnosis. Mechanical equipment generates vibration signals during operation, and fault diagnosis based on vibration signals has been attracting growing attention from academia and industry.
Vibration signal-based fault feature extraction encompasses time domain, frequency domain, time–frequency analysis, and multiscale entropy methods. While time and frequency domain approaches offer intuitive primary feature extraction [6], they exhibit limitations for non-stationary signals. Time–frequency techniques (e.g., Wavelet Transform) address this constraint [7]. Multiscale entropy methods, including Multiscale Sample Entropy (MSE), Multiscale Permutation Entropy (MPE), and Multiscale Dispersion Entropy (MDE) [8], demonstrate enhanced operational adaptability and feature robustness by quantifying multi-scale complexity and uncertainty. However, three persistent limitations remain: ① Inadequate representation of nonlinear dynamic features. ② Loss of continuous state information during coarse-graining. ③ Sensitivity to noise interference and short-sequence data.
These constraints have motivated recent innovations in entropy-based feature extraction. Scholars have developed various multiscale entropy methods, and the classical approaches exhibit distinct characteristics. Multiscale Sample Entropy (MSE) [9] overcomes the single-scale constraint of Sample Entropy (SE) [10] through a multiscale framework, enhancing signal analysis accuracy and robustness. However, it suffers from (1) sensitivity to time series length, (2) limited handling of abnormal data, and (3) an inability to identify adjacent amplitude features. Multiscale Attention Entropy (MAE) evaluates multi-scale complexity but experiences entropy value fluctuations at high scale factors due to shortened coarse-grained sequences, impairing feature stability [11]. Multiscale Dispersion Entropy (MDE) [12] integrates dispersion entropy with multiscale analysis to improve fault detection and noise resilience, yet loses critical high-frequency details at large scales. Multiscale Slope Entropy (MSlopEn) [13] neglects correlations between data segments during coarse-graining, causing statistical information loss. Multiscale Permutation Entropy (MPE) [14] offers algorithmic simplicity and noise robustness, but its statistical reliability decreases as the scale factor increases. Multiscale Fuzzy Entropy (MFE) [15] captures multiscale features effectively but is sensitive to parameters and produces unstable results.
To overcome the limitations of the previous methods, researchers have proposed many improved multiscale entropy methods in recent years. The Time Shift Entropy method has been introduced into these approaches [16], leading to several important variants: Time Shift Multiscale Dispersion Entropy (TSMDE), Time Shift Multiscale Permutation Entropy (TSMPE) [17], Time Shift Multiscale Slope Entropy (TSMSlopEn) [18], Time Shift Multiscale Fuzzy Entropy (TSMFuE) [19], Time Shift Multiscale Increment Entropy (TSMIncrE) [20], and Time Shift Multiscale Range Entropy (TSMRE) [21]. These methods use time shift strategies to analyze signal complexity patterns across different scales. TSMDE performs particularly well when processing large amounts of data and non-stationary signals. Although the TSMDE of Kaixuan Shao captures signal complexity better than similar methods, it still struggles with noise interference [22].
To address the issue of poor noise resistance, recent research has focused on multimodal fusion. Mostafa Rostaghi et al. [23] proposed Fuzzy Dispersion Entropy (FuDE), which introduces a fuzzy membership function to reduce the information loss that occurs when the signal is mapped to dispersion patterns. Experimental results demonstrate that FuDE has stronger resistance to noise. Building on this, Yuxing Li et al. [24] proposed Multiscale Fuzzy Dispersion Entropy (MFuDE), which combines the multiscale entropy approach with FuDE and demonstrates stronger feature extraction capability. Hamed Azami et al. [25] proposed Ensemble Entropy (EE), which combines multiple algorithms or parameters and significantly improves the robustness and noise resistance of signal analysis. Furthermore, it may capture characteristics missed by conventional approaches. However, EE has limitations: its computational complexity is high, and its results depend on the design of the ensemble strategy.
In summary, current entropy methods still face four bottleneck issues: weak noise resistance, loss of coarse-grained information, insufficient dynamic representation, and low stability [26]. Although the latest improvement strategies partially alleviate these problems, they introduce new flaws. MFuDE enhances noise resistance through fuzzy membership functions, but its stability decreases at high-frequency scales, and its settings depend on prior experience. EE mixes multiple algorithms to improve robustness, but its computational complexity is high and its interpretability is poor. The TSM strategy improves the retention of coarse-grained information but does not significantly enhance noise resistance and is easily affected by dynamic signal interference. No single method can readily address all four bottleneck issues simultaneously, so a collaborative combination is required.
Pattern recognition plays a crucial role in fault diagnosis. Commonly used pattern recognition methods include statistical methods [27], supervised learning, and convolutional neural networks. Among supervised learning methods, the Support Vector Machine (SVM) is the most frequently used [28]. SVM enhances generalization ability by maximizing the classification margin, making it suitable for high-dimensional data and small-sample scenarios. Kernel functions allow it to address nonlinear problems and resist overfitting. However, the standard SVM is essentially a binary classification model, while fault diagnosis in practical engineering often involves multi-condition recognition. Various improvements have therefore been proposed to extend SVM to multi-class problems and further enhance its performance. One-vs-one (OVO) SVM reduces inter-class interference by training a classifier for each pair of classes, making it suitable for multi-class problems. This method offers higher classification accuracy and computational efficiency when handling complex boundaries, making it especially suitable for fault diagnosis in high-dimensional feature spaces. Yang Liu [29] applied the OVO SVM method to text sentiment classification and achieved a higher test accuracy. The one-vs-rest (OvR) SVM proposed by Xiaobin Xu et al. [30] has achieved high prediction accuracy in rotating machinery fault diagnosis. The model is simple to train, requires fewer samples, and has high computational efficiency, making it especially suitable for applications with a small number of categories and significant inter-class differences. Unlike the traditional SVM, which finds a single hyperplane that maximizes the margin between two classes of samples, the Twin Support Vector Machine (TSVM) classifies by constructing two non-parallel hyperplanes. Zhiwen Liu et al. [31] significantly reduced computational complexity by applying TSVM to mechanical fault diagnosis. M.A. Ganaie et al. [32] proposed the Least Squares Twin Support Vector Machine (LSTSVM), which replaces quadratic programming optimization with the solution of a system of linear equations. Although LSTSVM significantly improves computational efficiency compared to TSVM, both are limited to binary classification tasks and require one-vs-rest (OvR) or hierarchical extension strategies to achieve multi-fault pattern identification.
To address the above issues, a fault diagnosis method based on Time Shift Multiscale Ensemble Fuzzy Dispersion Entropy (TSMEFuDE) and BTLSTSVM is proposed. The following is a summary of the innovations and the main contributions of this work:
A Time Shift Multiscale (TSM) coarse-graining method is introduced into Ensemble Fuzzy Dispersion Entropy to enhance the coarse-grained sequences. This method can analyze the complexity and the phase distribution of signals at different time scales, solving the loss of consecutive sequence points encountered by traditional coarse-graining methods.
A new single-scale entropy, ensemble fuzzy dispersion entropy (EFuDE), is proposed to enhance the noise resistance and stability of the entropy feature. The specific innovations are as follows: ① Based on the diversity of the signal amplitude, four mapping methods (linear, NCDF, tansig, and logsig) are used to map the original signal into multiple discrete categories. This approach takes full advantage of each mapping method, effectively reducing sensitivity to noise and outliers, and is more stable, especially when handling short-duration signals. ② EFuDE excludes the lossy round classification function, instead employing trapezoidal and triangular membership functions to maximize entropy and strengthen inter-category boundaries after signal mapping. ③ Ensemble entropy is introduced to handle more complex synthetic data containing different types of noise and mixing processes. Its superior ability to distinguish between different types of noise improves signal feature extraction and noise resistance.
TSMEFuDE significantly enhances the stability of feature extraction, noise robustness, and adaptability to signal length, while reducing parameter sensitivity, by introducing a time-shifted multiscale strategy to address information loss and integrating various mapping functions and membership functions to address stability and noise issues. This provides more reliable feature inputs for subsequent fault diagnosis.
By innovatively introducing the binary tree structure (BT) into LSTSVM, the accuracy and flexibility of fault diagnosis for complex systems are significantly improved.
The organization of this paper is as follows: In Section 2, the theory of Time Shift Multiscale Ensemble Fuzzy Dispersion Entropy (TSMEFuDE) is elaborated in detail, with a quantitative analysis of the impact of parameters such as embedding dimension and time shift step size on TSMEFuDE. The stability, noise resistance, and multi-class fault feature separability of TSMEFuDE are systematically verified. In Section 3, the theory of BT-LSTSVM is fully derived. Section 4 presents the fault diagnosis method based on TSMEFuDE and BT-LSTSVM, along with an algorithm flowchart. Section 5 conducts fault diagnosis experiments on triaxial bearings and aeroengine bearings, comparing and analyzing the feature extraction performance and classifier recognition accuracy. The experimental conclusions are summarized in Section 6, which also covers the value of engineering applications.
2. Time Shift Multiscale Ensemble Fuzzy Dispersion Entropy
2.1. Time Shift Multiscale Ensemble Fuzzy Dispersion Entropy
- (1)
An innovative method has been designed to introduce the time shift multiscale analysis of the Higuchi Fractal Dimension (HFD) into EFuDE. HFD is a stable numerical analysis method that can accurately handle different kinds of time series data, such as stationary, non-stationary, deterministic, and random signals. This method effectively alleviates the loss of correlated information between consecutive samples during the coarse-graining process. The original signal is coarse-grained according to Equation (1), which yields the time-shifted sequences at the different scales s [33]. In Equation (1), one term is rounded to an integer value, s and t represent the initial time point and the time interval, respectively, and a further index denotes the starting point of the time series. The time-shifted coarse-graining process for a scale factor s = 3 is shown in Figure 1.
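Equation (1) is not reproduced in this text, so the following is only a minimal sketch of time-shifted coarse-graining, assuming the usual Higuchi-style construction in which the subsequence with starting point beta keeps every s-th sample. The function name `time_shift_sequences` and the toy signal are illustrative choices, not part of the original method description.

```python
import numpy as np


def time_shift_sequences(x, s):
    """Split x into s time-shifted subsequences for scale factor s.

    Subsequence number beta (0-based here) starts at sample beta and keeps
    every s-th sample after it, so no sample of x is averaged away.
    """
    x = np.asarray(x, dtype=float)
    if s < 1:
        raise ValueError("scale factor s must be a positive integer")
    return [x[beta::s] for beta in range(s)]


signal = np.arange(1, 11)  # toy signal 1, 2, ..., 10
subsequences = time_shift_sequences(signal, 3)
for beta, seq in enumerate(subsequences):
    print(beta, seq)
```

Under this assumption every original sample appears in exactly one subsequence, which is the property that distinguishes time-shifted coarse-graining from the averaging used in conventional multiscale coarse-graining.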
- (2)
To retain as much information as possible contained in the signal and enhance the stability of the entropy value, we apply the averaging output strategy shown in Equation (2) to the coarse-grained sequence at each scale, calculating the average complexity across all scales. The computation steps of EFuDE are as follows:
Step 1: Fully utilize the advantages of different mapping methods. Four mapping methods (linear, logsig, tansig, and NCDF), corresponding to Formulas (3)–(6) [25], are applied to each sequence. The coarse-grained sequences are then mapped into c discrete classes.
In these formulas, the input is the coarse-grained sequence. Formula (3) includes a very small constant that prevents mapped values from being exactly 0 or 1, together with the grouping length. Formulas (4)–(6) use the standard deviation and the mean of the sequence.
The recognition accuracy of the BT LSTSVM algorithm is the highest among all methods, except for MPE. It is clear from this that, in comparison to other models, the BT LSTSVM model has a definite advantage in terms of fitting ability.
Although TSMEFuDE does not obtain an absolute advantage over all prediction algorithms, it can be seen that, in most circumstances, its prediction accuracy is not very different from the optimal one.
Except for the four basic prediction models, such as BPNN, where TSMEFuDE performs poorly, the accuracy achieved by TSMEFuDE is quite remarkable, fully demonstrating that the features extracted by this method have better distinguishability.
The advantages of the four mapping methods are as follows:
- ①
Linear mapping: Easy to implement and low-cost, specializing in linear relationship modeling, dimensionality reduction and data standardization.
- ②
Logsig function mapping: Specifically suitable for probabilistic prediction and binary classification tasks, the output is limited to a range of 0 to 1, but gradient vanishing may be faced in deep networks.
- ③
Tansig function mapping: The mapping output is to the interval −1 to 1, which is suitable for processing complex signals with positive and negative values, but it is also possible to encounter gradient vanishing in deep network training.
- ④
NCDF mapping: Optimize data uniformity by converting to normal distribution, which significantly improves model stability and prediction accuracy, being especially suitable for data standardization.
The combined use of the linear, logsig, tansig, and NCDF mapping methods exploits their respective advantages in data preprocessing, such as maintaining the original scale of the data, handling nonlinear relationships, and normalizing the distribution. This multi-strategy approach not only improves the stability of model training, but also significantly improves the prediction accuracy and the adaptability of the model to complex data.
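Formulas (3)–(6) are not reproduced in this text. The sketch below therefore assumes the forms commonly used in the dispersion entropy literature: min-max scaling for the linear mapping, the normal cumulative distribution function, the logistic sigmoid, and tansig (which equals tanh) rescaled to the unit interval. The names `map_to_unit_interval`, `to_class_axis`, and `eps` are hypothetical, and the final stretch onto the interval (0.5, c + 0.5) is likewise an assumption about how the c classes are laid out.

```python
import math
import numpy as np


def map_to_unit_interval(x, method, eps=1e-6):
    """Map a sequence into the open interval (0, 1) with one of four mappings."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    if sigma == 0:
        raise ValueError("a constant sequence cannot be mapped")
    if method == "linear":
        y = (x - x.min()) / (x.max() - x.min())
    elif method == "ncdf":
        # Normal cumulative distribution function, written with math.erf.
        y = np.array([0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
                      for v in x])
    elif method == "logsig":
        y = 1.0 / (1.0 + np.exp(-(x - mu) / sigma))
    elif method == "tansig":
        # tansig equals tanh; its (-1, 1) output is rescaled to (0, 1).
        y = (np.tanh((x - mu) / sigma) + 1.0) / 2.0
    else:
        raise ValueError(f"unknown mapping method: {method}")
    # eps keeps the mapped values strictly away from 0 and 1.
    return np.clip(y, eps, 1.0 - eps)


def to_class_axis(y, c):
    """Stretch (0, 1) onto the continuous class axis (0.5, c + 0.5)."""
    return c * np.asarray(y, dtype=float) + 0.5


rng = np.random.default_rng(0)
x = rng.normal(size=1000)
mapped = {name: to_class_axis(map_to_unit_interval(x, name), c=5)
          for name in ("linear", "ncdf", "logsig", "tansig")}
```

All four mappings are monotone, so they preserve the ordering of the samples and differ only in how they spread the amplitudes over the classes, which is what an ensemble over the mappings can exploit.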
Step 2: Calculate the membership degrees. Because the mapped sequence is assigned to c discrete classes, the ambiguity of the boundaries between classes and the changes in signal amplitude caused by noise may affect the classification results. To solve this problem, this work departs from the conventional round classification function. Using two kinds of fuzzy membership functions MK, each category can be handled more accurately, which strengthens the class boundaries after the signal is mapped to the classification sequence. In the calculation, class 1 and class c apply trapezoidal membership functions (see Equations (7) and (8)), while the other categories (K ≠ 1 and K ≠ c) use triangular membership functions (see Equation (9)). The specific fuzzy membership functions are as follows:
where, in Equation (8), k = 2 to c − 1.
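The membership equations are not reproduced in this text. As a hedged illustration, the sketch below assumes unit-width trapezoidal functions for class 1 and class c and a unit-slope triangular function peaked at k for the intermediate classes, which is the shape commonly used for fuzzy dispersion entropy. The function name `membership` and the grid `z` are illustrative.

```python
import numpy as np


def membership(z, k, c):
    """Degree to which the continuous class value z belongs to class k (1..c)."""
    if c < 2:
        raise ValueError("at least two classes are required")
    z = np.asarray(z, dtype=float)
    if k == 1:
        # Trapezoid: 1 for z <= 1, falling linearly to 0 at z = 2.
        return np.clip(2.0 - z, 0.0, 1.0)
    if k == c:
        # Trapezoid: 0 for z <= c - 1, rising linearly to 1 at z = c.
        return np.clip(z - (c - 1.0), 0.0, 1.0)
    # Triangle peaked at k that reaches 0 at k - 1 and k + 1.
    return np.clip(1.0 - np.abs(z - k), 0.0, 1.0)


c = 4
z = np.linspace(0.5, c + 0.5, 41)
degrees = np.stack([membership(z, k, c) for k in range(1, c + 1)])
```

With these shapes each value belongs to at most two neighbouring classes and its membership degrees sum to one, so a sample near a class boundary is shared between classes instead of being forced into one of them by rounding.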
Step 3: Build pattern vectors. For a given embedding dimension m, delay d, and number of fuzzy sets nc, all possible pattern vectors need to be constructed; there are nc^m possible pattern vectors in total. Each embedding vector of the four mapped sequences is mapped to its corresponding dispersion pattern. The calculation method is shown in Equation (10).
Step 4: Probability calculation of the fuzzy dispersion pattern. Each pattern vector is mapped to the different dispersion patterns according to its membership degree. Each sample point of a pattern vector has a membership degree in its corresponding class. The membership degree of the whole pattern is true (that is, non-zero) only when the membership degrees of all its sample points are non-zero. A t-norm is used for this judgment, as in Equation (11), where i indexes the time delay steps counted back from the current sample point j.
Step 5: Calculate the probability of each fuzzy dispersion pattern. The probability of each pattern is determined according to Equation (12); it reflects the frequency of the pattern in the whole time series.
Step 6: Calculate the ensemble fuzzy dispersion entropy. Finally, based on information entropy theory, the fuzzy dispersion entropy results obtained under the four different mapping methods are combined.
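Steps 3 to 6 can be sketched for a single mapped sequence as follows. Equations (10)–(13) are not reproduced in this text, so several choices here are assumptions: the product is used as the t-norm, embedding vectors are built forward in time, the entropy is normalized by ln(c^m), and the membership shapes are the unit-width trapezoids and triangles commonly used for fuzzy dispersion entropy. The function names are illustrative.

```python
import itertools
import numpy as np


def membership(z, k, c):
    """Assumed trapezoidal (k = 1, k = c) and triangular membership degrees."""
    z = np.asarray(z, dtype=float)
    if k == 1:
        return np.clip(2.0 - z, 0.0, 1.0)
    if k == c:
        return np.clip(z - (c - 1.0), 0.0, 1.0)
    return np.clip(1.0 - np.abs(z - k), 0.0, 1.0)


def pattern_probabilities(z, m, d, c):
    """Probability of each of the c**m fuzzy dispersion patterns."""
    z = np.asarray(z, dtype=float)
    n_vectors = len(z) - (m - 1) * d
    if n_vectors < 1:
        raise ValueError("sequence too short for this m and d")
    # degrees[k - 1, j] is the membership of sample j in class k.
    degrees = np.stack([membership(z, k, c) for k in range(1, c + 1)])
    probs = {}
    for pattern in itertools.product(range(1, c + 1), repeat=m):
        mu = np.ones(n_vectors)
        for i, k in enumerate(pattern):
            # Product t-norm over the m samples of every embedding vector.
            mu = mu * degrees[k - 1, i * d:i * d + n_vectors]
        probs[pattern] = mu.sum() / n_vectors
    return probs


def fuzzy_dispersion_entropy(z, m, d, c):
    """Shannon entropy of the pattern probabilities, normalized to [0, 1]."""
    p = np.array(list(pattern_probabilities(z, m, d, c).values()))
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)) / np.log(c ** m))


rng = np.random.default_rng(1)
z_random = rng.uniform(0.5, 5.5, size=2000)  # irregular sequence
z_constant = np.full(50, 3.0)                # perfectly regular sequence
h_random = fuzzy_dispersion_entropy(z_random, m=2, d=1, c=5)
h_constant = fuzzy_dispersion_entropy(z_constant, m=2, d=1, c=5)
```

In an ensemble setting, this function would be evaluated once per mapped sequence and the four values combined; a plain average is one possible combination rule, but the text does not state which rule Step 6 uses.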
- (3)
Iterate the scale S, i.e., S = S + 1, and repeat steps (1) and (2) until S reaches the preset value. The entire calculation process is shown in Figure 2, where the red box highlights the main updates and contrasts them with the EDE to show the improvements made by TSMEFuDE. This completes the TSMEFuDE calculation.
2.2. The Influence of Parameters on TSMEFuDE
Sequence length, embedding dimension m, time delay t, and number of classes c are key parameters that influence the entropy-based features of TSMEFuDE. To study the specific impact of these core parameters on the TSMEFuDE characteristics of a time series, a series of numerical simulations was conducted using white Gaussian noise (WGN) and 1/f noise. Each noise dataset contains 3000 randomly selected data points, with a sampling frequency of 5000 Hz. Figure 3 displays the time domain diagram and spectrum. Notably, the amplitude of WGN shows a random distribution in both the time and frequency domains. Compared with 1/f noise, WGN demonstrates greater variability and unpredictability.
To verify the performance of TSMEFuDE under different parameter settings, we conducted a comparative analysis of Composite Multiscale Fuzzy Dispersion Entropy (CMFuDE), Refined Composite Multiscale Fuzzy Dispersion Entropy (TSMDE), Reverse-Cumulative Multiscale Fuzzy Dispersion Entropy (RCMEFuDE), and the proposed Time-Shifted Multiscale Ensemble Fuzzy Dispersion Entropy (TSMEFuDE). All experiments were performed in MATLAB 2023b on a 12th Gen Intel® Core™ i7-12700 processor at 2.30 GHz.
2.2.1. The Impact of Series Length N on TSMEFuDE
Increasing the signal sequence length N can better reveal the intrinsic complexity of the sequence and reduce noise and boundary effects. Nevertheless, this reduces computational efficiency. To explore the trade-off between computational efficiency and accuracy, we conducted experiments using WGN and 1/f noise with sequence lengths of 2000, 4000, 6000, 8000, and 10,000. The entropy or complexity values extracted by each method are shown in Figure 4. All panels use a Y-axis scale of 0.4 except Figure 4c, which uses a Y-axis scale of 2, and Figure 4g, which uses a Y-axis scale of 0.8. The following conclusions can be drawn from the analysis results:
- ①
By comparing Figure 4a (MFuDE) with Figure 4b (TSMFuDE), and Figure 4e (MEFuDE) with Figure 4h (TSMEFuDE), it is observed that the curves become smoother. This indicates that the introduction of the time-shifted analysis method significantly enhances the stability of the entropy values.
- ②
A comparison between Figure 4c,d and Figure 4g,h shows that the introduction of Ensemble Entropy makes the entropy values more stable, which yields smoother curves.
- ③
The proposed TSMEFuDE method combines the benefits of the previously mentioned approaches. As illustrated in Figure 4h, the entropy curves for the two types of noise almost overlap, indicating that TSMEFuDE exhibits the best stability and noise resistance.
To evaluate the stability of TSMEFuDE under different sequence lengths, the Coefficient of Variation (CV) was adopted as the evaluation metric. It is calculated as CV = σ/μ, where σ and μ represent the standard deviation and mean value of TSMEFuDE at the same scale, respectively. The experimental results for WGN and 1/f noise under five different sequence lengths are presented in Figure 5a,b. The figure shows that TSMEFuDE consistently yields the lowest CV values across all tested sequence lengths, which provides strong evidence of its stability under different values of N.
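As a small worked illustration of the CV metric, the sketch below computes the standard deviation divided by the mean separately for each scale, given entropy values from repeated runs. The array layout (one row per run, one column per scale) and the use of the population standard deviation are assumptions, and the numbers are made up for the example.

```python
import numpy as np


def coefficient_of_variation(entropy_runs):
    """CV per scale: standard deviation over runs divided by mean over runs."""
    runs = np.asarray(entropy_runs, dtype=float)
    return runs.std(axis=0) / runs.mean(axis=0)


# Two hypothetical runs (rows) evaluated at two scales (columns).
entropy_runs = [[1.0, 2.0],
                [3.0, 2.0]]
cv = coefficient_of_variation(entropy_runs)
```

A lower CV at a given scale means the entropy estimate varies less from run to run, which is how the stability comparison in this section is read.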
2.2.2. The Impact of Embedding Dimension m, Time Delay t and Class Number c on TSMEFuDE
The embedding dimension m, time delay t, and class number c are the most fundamental and commonly used parameters in dispersion entropy analysis, and they are retained in the proposed TSMEFuDE algorithm. Therefore, we constructed WGN and 1/f noise sequences of length 3000 to analyze the influence of m, t, and c on TSMEFuDE. Setting the embedding dimension m too low may make it impossible to capture the nonlinear features of the time series effectively [34]. Conversely, an excessively large m lowers computing efficiency. To analyze the effect of the embedding dimension m on TSMEFuDE in detail, the time delay t and class number c were fixed at 1 and 5, respectively, and the entropy values were computed for m ranging from 2 to 5. To evaluate the impact of m on computation time, error bar charts were used to illustrate the average, maximum, and minimum computation times corresponding to each value of m, as shown in Figure 6a–c. Based on these results, we can draw the following conclusions: ① As m increases, the entropy increases, but the fluctuation of the entropy curve decreases. ② The time required to calculate the entropy grows as m increases. ③ Regardless of how m varies, the TSMEFuDE entropy curve is always the smoothest.
During phase space reconstruction, the time delay t determines the separation between the neighboring data points used to create the phase space vectors. This parameter is necessary to capture the dynamic behavior of time series data. To examine the impact of the time delay t on TSMEFuDE, we fixed the embedding dimension m at 2 and the class number c at 5, and then computed TSMEFuDE values for the two sets of time series with different noise types, with t ranging from 1 to 5. As shown in Figure 6d–f, under both noise conditions the TSMEFuDE entropy curves corresponding to different t values are closely clustered, which demonstrates the high robustness of the method. We also observed that the computational efficiency is higher when the time delay t is set to 3 or 4.
The class number c determines how many discrete intervals the continuous time series data are divided into, which affects the precision of the sequence classification. Selecting an appropriate number of classes c is very important for effectively capturing the complexity of a time series, since an inappropriate number of classes may lead to the loss or oversimplification of information. In this study, we set the embedding dimension m to 2 and the time delay t to 1 to explore the effect of the class number c on entropy. As shown in Figure 6g–i, the TSMEFuDE method yields a higher entropy than the other two methods, suggesting that it may be more effective in revealing the complexity of time series. However, as c increases, the computation time also increases significantly, although the entropy increases accordingly.
2.3. Validity Verification Experiment of TSMEFuDE
2.3.1. Validity Verification Experiment of Time Shift
To validate the superiority of TSMEFuDE, we conducted a comparative analysis between the TS (Time Shift) coarse-graining strategy and the M (Multiscale) coarse-graining strategy. Five distinct entropy methods were each combined with the two coarse-graining strategies so that the relative advantages of the TS and M approaches could be compared. These entropy methods are permutation entropy (PE), slope entropy (SlopEn), dispersion entropy (DE), fuzzy dispersion entropy (FuDE), and ensemble fuzzy dispersion entropy (EFuDE). The influence of the coarse-graining method on the error bars, coefficient of variation, and computing time was analyzed using 30 groups of WGN and 1/f noise with a length of 3000. Figure 7 shows the stability and calculation time of each entropy method under different scale factors, from which the following conclusions can be drawn: ① By comparing Figure 7a and Figure 7c, it can be seen that when white noise and 1/f noise are processed by the different methods, the entropy value changes little across scale factors, showing good robustness. This shows that these methods adapt well to different types of noise. It should be noted that the curve produced by the MPE method is a straight line for both WGN and 1/f noise, so it cannot effectively express their complexity characteristics. In addition, the MFuDE curve under 1/f noise decreases as the scale factor increases, which is inconsistent with the characteristic curve of 1/f noise. Therefore, the MPE and MFuDE methods are not considered in the following discussion. ② By comparing Figure 7b with Figure 7d, it can be seen that the proposed TSMEFuDE method has the smallest CV. For both groups of noise, the CV values under different scale factors show that TSMEFuDE has the highest stability. Figure 7c,f shows the maximum and average calculation times of the various entropy methods over the 30 groups of signals. The figure shows that the M-based methods require much less computing time than the TS-based methods. However, the main disadvantage of the M-based methods is that their stability is poorer than that of the TS-based methods. In addition, the introduction of ensemble entropy also significantly increases the calculation time, making MEFuDE and TSMEFuDE the slowest among comparable methods. In summary, although TSMEFuDE takes longer to calculate, its accuracy and stability make it a reasonable compromise. The other methods are faster but offer no obvious advantage in stability, so TSMEFuDE is considered the better choice.
2.3.2. Verification of Amplitude Frequency Characterization Ability and Noise Resistance Performance
To verify the ability of the proposed TSMEFuDE to characterize amplitude and frequency, we constructed five groups of AM-FM signals with a length of 2000, as shown in Equation (14), whose amplitude and frequency change at equal intervals. The entropy curves are shown in Figure 8. The following conclusions can be drawn:
Under all scale factors, the five entropy curves obtained by (f) TSMEFuDE coincide more closely and are smoother, which shows that the noise robustness of the proposed method is the best.
To verify the anti-noise performance of TSMEFuDE, we constructed a set of AM-FM signals with a length of 2000, as shown in Equation (15). We then added white noise with signal-to-noise ratios (SNR) of 5 dB, 10 dB, 15 dB, 20 dB, and 25 dB, respectively, to obtain five groups of signals, namely ‘Pure signal’, ‘SNR = 15’, ‘SNR = 20’, ‘SNR = 25’, ‘SNR = 30’. The entropy values under these five conditions are shown in the figure. The analysis results show the following: ① The five entropy curves obtained by the TSMEFuDE method almost coincide under all scale factors, indicating that the method has excellent noise robustness. ② Figure 8a,d shows that the introduction of the time-shifted multiscale analysis method significantly improves noise robustness. ③ The comparison between Figure 8d and Figure 8e shows that the introduction of ensemble entropy improves noise robustness.
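The noisy test signals in this experiment can be produced by scaling white Gaussian noise to a target SNR. The sketch below shows one common way to do this. Equation (15) is not reproduced in this text, so the AM-FM signal used here is a hypothetical stand-in with made-up modulation parameters, and the function name `add_white_noise` is illustrative.

```python
import numpy as np


def add_white_noise(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise so that the result has the given SNR."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR in dB is 10 * log10(signal_power / noise_power).
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


rng = np.random.default_rng(42)
fs = 5000.0                                   # assumed sampling frequency in Hz
t = np.arange(20000) / fs
# Hypothetical AM-FM test signal, not the one defined by Equation (15).
clean = (1.0 + 0.5 * np.cos(2 * np.pi * 2 * t)) * np.cos(
    2 * np.pi * 50 * t + 2.0 * np.sin(2 * np.pi * 5 * t))
noisy = add_white_noise(clean, snr_db=15.0, rng=rng)

# Realized SNR of this particular noise draw, for checking.
realized_snr_db = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

Because the noise is random, the realized SNR of a finite sequence only approximates the target, and the approximation tightens as the sequence gets longer.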
6. Conclusions
This study addresses the issues of traditional information entropy methods (MDE, MFuDE) in rotating machinery fault diagnosis, such as coarse-grained information loss and insufficient feature stability, as well as the high computational complexity and parameter sensitivity of the LSTSVM model. A novel fusion diagnostic method is proposed that combines Time Shift Multiscale Ensemble Fuzzy Dispersion Entropy (TSMEFuDE) with the Binary Tree Least Squares Twin Support Vector Machine (BT-LSTSVM). The key findings are summarized as follows:
Enhanced Feature Extraction with TSMEFuDE:
① Overcoming Information Loss and Instability: By integrating the Time Shift Multiscale (TSM) strategy and combining linear and nonlinear mappings (linear, NCDF, tansig, logsig) with trapezoidal and triangular membership functions, TSMEFuDE effectively mitigates coarse-grained information loss and significantly improves feature stability and noise resistance compared to traditional MDE and MFuDE methods. ② Superior Noise Robustness and Stability: Experiments under white noise and 1/f noise demonstrated TSMEFuDE’s lower parameter sensitivity and superior stability across different signal lengths. ③ Strong Amplitude Representation and Noise Immunity: Validation using AM-FM signals confirmed TSMEFuDE’s enhanced capability for amplitude representation and better noise resistance.
Parameter Optimization Insights:
① While longer signal lengths (N) enhance feature representation at the cost of efficiency, TSMEFuDE achieves comparable stability even with short sequences (N = 2000) due to the TS and ensemble strategies. ② Key parameters (m, t, c) significantly impact TSMEFuDE’s accuracy and efficiency. An optimized combination (m = 3, t = 3, c = 2) balances high feature representation capability, computational efficiency, and stability. ③ Outperformance of TS Strategy: TSMEFuDE, utilizing the TS coarse-graining strategy, outperformed the standard Multiscale method in noise adaptability, stability (minimal Coefficient of Variation, CV), and complexity representation accuracy. Although computation time is slightly longer, its superior noise resistance and feature resolution represent an optimal compromise.
Effective Fault Classification with BT-LSTSVM:
① Reduced Model Complexity: The improved binary tree structure successfully reduces the computational complexity of the underlying model. ② High and Stable Diagnostic Accuracy: In real-world diagnostic experiments on triaxial bearings and aeroengine bearings, the combination of TSMEFuDE and BT-LSTSVM achieved a stable recognition accuracy of 97.46%. ③ Consistent Superior Performance: Feature extraction and classification experiments consistently showed that TSMEFuDE combined with BT-LSTSVM maintains optimal performance across various fault recognition scenarios, significantly outperforming comparative methods like BPNN, KNN, and OVO-TSVM.
Overall, the proposed fault diagnosis method demonstrates superior performance in all aspects. However, TSMEFuDE currently lacks feature transferability and has insufficient parameter adaptability, while the BT-LSTSVM model faces issues with error accumulation. These problems will become new bottlenecks and focal points for future work. The proposed method has already yielded promising results on the laboratory platform, and it is expected to be applied to fault diagnosis for various mechanical equipment in the future.