Enhanced Composite Multi-Scale Slope Entropy and Its Application to Fault Diagnosis of Rolling Bearing

Li, Wei; Li, Jiazhu; Wang, Shuyu; Chen, Yan; Chen, Jian

doi:10.3390/electronics15102219

by Wei Li ¹, Jiazhu Li ¹, Shuyu Wang ¹, Yan Chen ² and Jian Chen ¹,*

¹ Institute of Sound and Vibration, Hefei University of Technology, Hefei 230009, China

² School of Electronic and Electrical Engineering, Bengbu University, Bengbu 233030, China

* Author to whom correspondence should be addressed.

Electronics 2026, 15(10), 2219; https://doi.org/10.3390/electronics15102219

Submission received: 30 March 2026 / Revised: 17 May 2026 / Accepted: 18 May 2026 / Published: 21 May 2026

(This article belongs to the Special Issue Intelligent Sensing Empowered by Artificial Intelligence)

Abstract

The health status of rolling bearings is critical to the normal operation of rotating machinery. To effectively extract vibration signal features and accurately identify different fault types, a novel method based on enhanced composite multi-scale slope entropy (ECMSE) and a honey badger algorithm-optimized kernel extreme learning machine (HBA–KELM) is proposed. Specifically, ECMSE integrates high-order differences into the composite multi-scale framework to capture high-frequency information while preserving low-frequency characteristics, thereby enhancing the discriminability of time-series representations. Meanwhile, an average coarse-graining strategy is incorporated to achieve a more comprehensive characterization of the signals. The extracted features are then input into the HBA–KELM classifier for fault identification. Experiments conducted on two rolling bearing datasets, one public and one private, demonstrate that our method achieves superior performance in distinguishing different fault types and damage levels compared with several existing approaches.

Keywords: rolling bearing fault diagnosis; multi-scale entropy; slope entropy; kernel extreme learning machine; honey badger algorithm

1. Introduction

Rolling bearings are essential components in most rotating machinery, and their operating conditions have a considerable impact on the normal operation of mechanical systems. Rolling bearing failures often lead to serious safety issues and economic losses. Therefore, accurate and timely fault detection of rolling bearings has become increasingly important [1,2,3,4,5,6].

In practical scenarios, fault signals are typically non-stationary and nonlinear, and are often contaminated with significant noise, making early fault information difficult to detect. Traditional approaches usually analyze signals in the time or frequency domain, but they are less effective in identifying fault characteristics. Therefore, time–frequency analysis methods have been widely adopted for feature extraction, such as wavelet transform (WT) [7], complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) [8], variational mode decomposition (VMD) [9], and Fourier decomposition method (FDM) [10]. However, these methods still have limitations in practical applications. For instance, the performance of WT is highly dependent on the selection of basis functions. CEEMDAN may introduce noise and spurious components in the early intrinsic mode functions. VMD requires manual determination of decomposition layers and penalty factors, and is not suitable for signals with complex and dense modes. In addition, FDM may produce inconsistent results under different search strategies, and its iterative process is time-consuming. Consequently, effective application of these methods often requires substantial prior knowledge and experience, and their inherent limitations are difficult to overcome.

In addition to purely data-driven signal processing, physics-guided modeling has recently received increasing attention in bearing health monitoring. For example, physics-guided degradation trajectory modeling and trustworthy multi-step prediction attempt to constrain remaining useful life prediction using degradation consistency and reliability-aware mechanisms [11,12], while fault-evolution-knowledge-driven learning introduces prior fault evolution information to improve adaptation across changing working conditions [13]. These studies indicate that bearing diagnosis can benefit from combining data-driven representations with mechanical prior knowledge. Nevertheless, many physics-guided methods require explicit degradation trajectories, cross-condition labels, or prior knowledge of fault evolution, which may be unavailable in short-sample fault classification scenarios. Therefore, this study focuses on constructing a compact and interpretable entropy-based representation that captures both low-frequency state variation and high-frequency fault-related details from vibration signals.

Entropy is an effective tool for measuring the complexity of time series and has been widely used in signal feature extraction across various domains [14,15]. Methods such as approximate entropy (AE) [16], fuzzy entropy (FE) [17], and permutation entropy (PE) [18] have been successfully applied in rolling bearing fault diagnosis. Among them, PE is widely used due to its computational simplicity and efficiency. However, PE only evaluates the complexity of a single-scale sequence. To address this limitation, multi-scale permutation entropy (MPE) [19] was proposed to extract information across multiple scales and improve robustness. Nevertheless, MPE still has several drawbacks: (1) the coarse-grained sequence becomes shorter as the scale factor increases, leading to the loss of useful information; (2) PE ranks subsequences based on relative amplitudes while ignoring their actual values, which may map significantly different sequences into the same pattern. To overcome these issues, Zheng et al. [20] proposed composite multi-scale weighted permutation entropy (CMWPE), which combines composite multi-scale processing with weighted permutation entropy (WPE) [21] to better distinguish sequence differences using variance. In addition to WPE, its variants such as fine-grained permutation entropy (FGPE) [22] and amplitude-aware permutation entropy (AAPE) [23] have also shown similar improvements.

Recently, slope entropy (SE) [24] was proposed to encode the slopes between adjacent amplitudes into symbolic patterns and characterize the dynamic properties of time series. SE is more sensitive to amplitude fluctuations and requires fewer samples to achieve statistical significance. It has also been demonstrated that SE outperforms WPE, FGPE, and AAPE in various applications. Based on this, composite multi-scale slope entropy (CMSE) is obtained by combining SE with composite multi-scale analysis. However, the traditional coarse-graining method generates short sequences at different scales and mainly captures low-frequency information [25]. Moreover, sequences with identical average values may still differ significantly in amplitude distribution and arrangement.

To address these issues and capture high-frequency information, multi-order differences are introduced in this study to replace the average coarse-graining process, resulting in difference-based composite multi-scale slope entropy (DBCMSE). Furthermore, to achieve a more comprehensive representation of time series, an enhanced composite multi-scale slope entropy (ECMSE) method is proposed by combining CMSE and DBCMSE. ECMSE considers both amplitude information and multi-frequency characteristics, while being less sensitive to data length, thereby providing a more effective representation of rolling bearing health conditions. After feature extraction using ECMSE, the obtained feature vectors are input into a classifier for intelligent fault diagnosis. Kernel extreme learning machine (KELM) [26] extends extreme learning machine (ELM) by incorporating kernel functions and generally achieves better generalization performance. However, its performance is sensitive to the selection of regularization coefficients and kernel parameters. To address this issue, the honey badger algorithm (HBA) [27] is employed to optimize the parameters of KELM adaptively. Based on the above methods, a novel intelligent fault diagnosis framework for rolling bearings is proposed. Finally, two rolling bearing datasets are used to validate the effectiveness and reliability of the proposed method in identifying faults with different types and damage levels.

The remainder of this paper is organized as follows. Section 2 introduces the theory and parameter selection of ECMSE. Section 3 presents the HBA–KELM method. Section 4 provides experimental validation, and Section 5 concludes the paper.

2. Slope-Entropy-Based Feature Extraction Methods

2.1. Slope Entropy

For a given one-dimensional time series $x = \{x_{i}\}_{i = 1}^{L}$ of length L, the phase space of x can be reconstructed, and the subspace sequences are defined as

$$X_{m}^{t} = \{x_{t}, x_{t + τ}, \dots, x_{t + τ (m - 2)}, x_{t + τ (m - 1)}\}, \tag{1}$$

where $t = 1, 2, \dots, L - τ (m - 1)$, m denotes the embedding dimension, and $τ$ denotes the time delay.

Then, a vertical increment threshold $γ$ (high threshold) is defined to measure significant differences between elements, and a zero-difference threshold $δ$ (low threshold) is introduced to characterize cases with approximately equal amplitudes. Next, $X_{m}^{t}$ is transformed into a symbolic subsequence based on adjacent differences and predefined thresholds. The procedure for defining slopes as symbols is illustrated in Figure 1.

In Figure 1, $Δ x$ denotes the difference between two adjacent elements in the reconstructed vector. The symbols $+ 2$ and $- 2$ represent positive and negative slopes whose absolute amplitudes exceed the high threshold $γ$, respectively; $+ 1$ and $- 1$ represent moderate positive and negative slopes between the low threshold $δ$ and the high threshold $γ$, respectively; and 0 represents an approximately unchanged amplitude interval satisfying $| Δ x | \leq δ$.

The probability of each symbolic subsequence pattern is calculated as

$$p_{i} = \frac{h_{i}}{N_{s}}, \tag{2}$$

where $h_{i}$ denotes the number of occurrences of the i-th pattern, and $N_{s}$ represents the total number of symbolic subsequences. Let K denote the number of observed symbolic patterns.

Finally, based on Shannon entropy, slope entropy is defined as

$$\mathrm{SE} = - \sum_{i = 1}^{K} p_{i} \log_{2} (p_{i}) . \tag{3}$$
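The computation in Equations (1)–(3) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB implementation: the function names `slope_symbol` and `slope_entropy` are hypothetical, the thresholds are treated as plain amplitude differences (how the angular values of $γ$ used later map to amplitude differences is not specified here and is left to the caller), and the handling of differences exactly equal to a threshold is an assumption, since Figure 1 is not reproduced.

```python
import math
from collections import Counter


def slope_symbol(dx, gamma, delta):
    """Map one adjacent difference to a symbol in {-2, -1, 0, +1, +2}.

    Assumption: a difference exactly equal to gamma is treated as moderate.
    """
    if dx > gamma:
        return 2
    if dx > delta:
        return 1
    if dx >= -delta:
        return 0
    if dx >= -gamma:
        return -1
    return -2


def slope_entropy(x, m, tau=1, gamma=1.0, delta=1e-3):
    """Shannon entropy (base 2) of the symbolic slope patterns, Eqs. (1)-(3)."""
    n_vec = len(x) - tau * (m - 1)  # number of embedded vectors, N_s
    if m < 2 or n_vec < 1:
        raise ValueError("need m >= 2 and a series longer than tau * (m - 1)")
    counts = Counter()
    for t in range(n_vec):
        vec = [x[t + tau * k] for k in range(m)]           # Eq. (1)
        pattern = tuple(slope_symbol(b - a, gamma, delta)
                        for a, b in zip(vec, vec[1:]))
        counts[pattern] += 1
    probs = [h / n_vec for h in counts.values()]            # Eq. (2)
    return -sum(p * math.log2(p) for p in probs)            # Eq. (3)
```

A monotone ramp produces a single pattern and therefore zero entropy, whereas a signal that alternates between two slopes of equal frequency gives one bit.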

2.2. Composite Multi-Scale Slope Entropy

The composite coarse-graining technique generates multiple subsequence combinations, thereby reducing entropy fluctuations and improving robustness compared with the traditional multiscale method. By combining it with slope entropy, composite multi-scale slope entropy (CMSE) is obtained. For a given one-dimensional time series $x = \{x_{i}\}_{i = 1}^{L}$ of length L, the average coarse-grained sequences $y_{k}^{(s)} = \{y_{k, 1}^{(s)}, y_{k, 2}^{(s)}, \dots, y_{k, q}^{(s)}\}$ are defined as

$$y_{k, j}^{(s)} = \frac{1}{s} \sum_{i = (j - 1) s + k}^{j s + k - 1} x_{i}, \tag{4}$$

where $1 \leq j \leq q = \lfloor L / s \rfloor$, $1 \leq k \leq s$, and $s = 1, 2, \dots, s_{max}$. Here, $\lfloor \cdot \rfloor$ denotes the floor operator, $s_{max}$ is the maximum scale factor, and $y_{k}^{(s)}$ represents the k-th coarse-grained sequence at scale s.

Then, the SE values of each coarse-grained sequence are computed, and CMSE is defined as

$$\mathrm{CMSE} (x, s, m, τ, γ, δ) = \frac{1}{s} \sum_{k = 1}^{s} \mathrm{SE} (y_{k}^{(s)}, m, τ, γ, δ) . \tag{5}$$
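Equations (4) and (5) can be illustrated with the following sketch. The names `coarse_grain` and `composite_entropy` are hypothetical, and `entropy` stands for any single-scale estimator such as SE. One assumption is made explicit: for offsets $k > 1$ the last window of Equation (4) may run past the end of the series, so this sketch keeps only complete windows.

```python
def coarse_grain(x, s, k):
    """k-th (1-based) average coarse-grained sequence at scale s, Eq. (4).

    Assumption: only windows that lie fully inside x are kept.
    """
    start = k - 1                      # convert the 1-based offset to 0-based
    n_win = (len(x) - start) // s      # number of complete windows
    return [sum(x[start + j * s: start + (j + 1) * s]) / s
            for j in range(n_win)]


def composite_entropy(x, s, entropy, grain=coarse_grain):
    """Average a single-scale entropy over the s shifted sequences, Eq. (5)."""
    return sum(entropy(grain(x, s, k)) for k in range(1, s + 1)) / s


# Example: the two shifted sequences of a short ramp at scale s = 2.
ramp = [1, 2, 3, 4, 5, 6]
print(coarse_grain(ramp, 2, 1))  # windows (1,2), (3,4), (5,6)
print(coarse_grain(ramp, 2, 2))  # windows (2,3), (4,5)
```

Passing the entropy estimator as a callable keeps the composite averaging independent of the particular entropy being used.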

2.3. Difference-Based Composite Multi-Scale Slope Entropy

To capture high-frequency information and reveal details that CMSE may overlook, difference-based composite multi-scale slope entropy (DBCMSE) is proposed by replacing the average coarse-graining method with a multi-order difference scheme. For a given one-dimensional time series $x = \{x_{i}\}_{i = 1}^{L}$ of length L, the $(s - 1)$-order difference is applied under scale factor $s > 1$. The difference-based sequences $z_{k}^{(s)} = \{z_{k, 1}^{(s)}, z_{k, 2}^{(s)}, \dots, z_{k, q}^{(s)}\}$ are defined as

$$z_{k, j}^{(s)} = \sum_{i = (j - 1) s + k}^{j s + k - 1} {(- 1)}^{s - 1 - a} \binom{s - 1}{a} x_{i}, \tag{6}$$

where $a = i - [(j - 1) s + k]$, $1 \leq j \leq q = \lfloor L / s \rfloor$, $1 \leq k \leq s$, and $s = 1, 2, \dots, s_{max}$. Here, $\binom{s - 1}{a}$ denotes the binomial coefficient, and $z_{k}^{(s)}$ represents the k-th difference-based sequence at scale s.

Similar to CMSE, DBCMSE is defined as

$$\mathrm{DBCMSE} (x, s, m, τ, γ, δ) = \frac{1}{s} \sum_{k = 1}^{s} \mathrm{SE} (z_{k}^{(s)}, m, τ, γ, δ) . \tag{7}$$
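The weighting in Equation (6) can be made concrete with a short sketch. The names `difference_weights` and `difference_grain` are hypothetical, and, as an assumption of this sketch, only windows that fit entirely inside the series are kept when the offset $k$ is larger than 1.

```python
from math import comb


def difference_weights(s):
    """Weights (-1)^(s-1-a) * C(s-1, a) for a = 0, ..., s-1, as in Eq. (6)."""
    return [(-1) ** (s - 1 - a) * comb(s - 1, a) for a in range(s)]


def difference_grain(x, s, k):
    """k-th (1-based) difference-based sequence at scale s, Eq. (6).

    Assumption: only windows that lie fully inside x are kept.
    """
    w = difference_weights(s)
    start = k - 1
    n_win = (len(x) - start) // s
    return [sum(wi * xi for wi, xi in zip(w, x[start + j * s: start + (j + 1) * s]))
            for j in range(n_win)]


for scale in (1, 2, 3, 4):
    print(scale, difference_weights(scale))
```

At scale $s = 1$ the single weight is 1, so the sequence is returned unchanged, which matches the behaviour of average coarse-graining at the same scale.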

2.4. Enhanced Composite Multi-Scale Slope Entropy

To incorporate both low- and high-frequency information, CMSE and DBCMSE are averaged. The enhanced composite multi-scale slope entropy (ECMSE) is defined as

$$\mathrm{ECMSE} (x, s, m, τ, γ, δ) = \frac{\mathrm{CMSE} (x, s, m, τ, γ, δ) + \mathrm{DBCMSE} (x, s, m, τ, γ, δ)}{2} . \tag{8}$$

The equal-weight fusion is adopted because CMSE and DBCMSE are two parallel descriptions generated from the same original signal and share the same scale factor, embedding dimension, time delay, and slope thresholds. CMSE emphasizes the low-frequency trend retained by local averaging, whereas DBCMSE emphasizes high-frequency amplitude variation produced by the difference operator. Assigning equal weights avoids introducing an additional data-dependent hyperparameter during feature extraction and reduces the risk of overfitting when only a small number of training samples are available. Therefore, ECMSE can be interpreted as a two-branch multi-resolution entropy representation that balances low-frequency trend information and high-frequency fault details in an unsupervised manner.

The implementation procedure of ECMSE is illustrated in Figure 2. The traditional coarse-graining method is analogous to an arithmetic mean filter, which can suppress noise but may obscure fine-scale details and reduce frequency information [28]. In contrast, the multi-order difference method can be regarded as a weighting operation and exhibits high-pass filtering characteristics. For example, the weights are $\{- 1, 1\}$ at scale $s = 2$, and $\{1, - 2, 1\}$ at scale $s = 3$. As shown in Figure 3, the normalized spectra indicate that the multi-order difference operation behaves as a high-pass filter, with the cutoff frequency increasing as the scale factor increases.
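The high-pass behaviour described above can be checked numerically by treating the difference weights as the taps of a finite impulse response filter and evaluating its magnitude response. The sketch below is only an illustration of that idea and is not the procedure used to produce Figure 3; the function names are hypothetical, and the weights are the two sets quoted in the text.

```python
import cmath
import math


def magnitude_response(weights, omega):
    """|H(e^{j*omega})| of an FIR filter with the given tap weights."""
    return abs(sum(w * cmath.exp(-1j * omega * n) for n, w in enumerate(weights)))


def normalized_response(weights, omega):
    """Magnitude response divided by its value at omega = pi."""
    return magnitude_response(weights, omega) / magnitude_response(weights, math.pi)


w2 = [-1, 1]      # weights at scale s = 2 (from the text)
w3 = [1, -2, 1]   # weights at scale s = 3 (from the text)
for omega in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(f"omega={omega:.3f}  s=2: {normalized_response(w2, omega):.3f}"
          f"  s=3: {normalized_response(w3, omega):.3f}")
```

Both responses vanish at zero frequency and peak at the Nyquist frequency, and at intermediate frequencies the normalized response for $s = 3$ lies below that for $s = 2$, which is consistent with a cutoff that moves upward as the scale increases.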

From a numerical perspective, the two methods may produce different outputs. For instance, subsequences $\{1, 3, 5\}$ and $\{2, 4, 3\}$ yield the same average value but different outputs under the difference-based method. Similarly, sequences with identical elements but different permutations can be distinguished by the difference-based method, while the coarse-graining method produces identical results. However, the difference-based method cannot distinguish sequences with the same linear trend. Therefore, the two approaches are complementary, and their combination leads to improved performance.
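As a worked check of this example, assume scale $s = 3$, so that the difference weights are $\{1, -2, 1\}$. The third sequence $\{2, 4, 6\}$ is a hypothetical addition used only to illustrate the linear-trend case.

$$
\begin{aligned}
\operatorname{mean}\{1, 3, 5\} &= 3, & 1 - 2 \cdot 3 + 5 &= 0, \\
\operatorname{mean}\{2, 4, 3\} &= 3, & 2 - 2 \cdot 4 + 3 &= -3, \\
\operatorname{mean}\{2, 4, 6\} &= 4, & 2 - 2 \cdot 4 + 6 &= 0 .
\end{aligned}
$$

Averaging alone cannot separate the first two sequences, the second-order difference alone cannot separate the first and third, and the pair of outputs separates all three.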

As shown in Figure 2, ECMSE first reconstructs the original signal at multiple scales. In the CMSE branch, average coarse-graining is used to obtain low-frequency subsequences, and SE is then calculated for each composite subsequence. In the DBCMSE branch, the multi-order difference operation is used to obtain high-frequency-sensitive subsequences before SE calculation. Finally, the entropy values of the two branches are averaged at each scale to form the ECMSE feature vector.

2.5. Parameter Selection of ECMSE

The performance of ECMSE depends on the embedding dimension m, time delay $τ$, maximum scale factor $s_{max}$, low threshold $δ$, high threshold $γ$, and sample length L. In this study, $τ = 1$ and $s_{max} = 8$ are set according to [29]. To determine appropriate thresholds, we set

$$δ \in \{10^{- 1}, 10^{- 2}, 10^{- 3}, 10^{- 4}, 10^{- 5}, 10^{- 6}\}, \tag{9}$$

and

$$γ \in \{30^{\circ}, 45^{\circ}, 60^{\circ}\} . \tag{10}$$

The embedding dimension is selected from

$$m \in [2, 7] . \tag{11}$$

To determine the optimal parameters, the mean silhouette coefficient (MSC) is used to evaluate clustering performance. MSC ranges from $- 1$ to 1, and higher values indicate better feature separability. It is defined as

$$\mathrm{MSC} = \frac{1}{N} \sum_{i = 1}^{N} \frac{b (i) - a (i)}{\max \{a (i), b (i)\}}, \tag{12}$$

where N is the number of feature samples, $a (i)$ represents the intra-class cohesion of sample i (its mean distance to the other samples of the same class), and $b (i)$ denotes its minimum inter-class distance (the smallest mean distance to the samples of any other class).
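Equation (12) can be sketched as follows. The function name `mean_silhouette` is hypothetical, and the sketch assumes the standard silhouette definitions with Euclidean distance: $a(i)$ is the mean distance from sample i to the other members of its class, and $b(i)$ is the smallest mean distance to the members of any other class. The distance metric used in the study is not stated, so the Euclidean choice is an assumption.

```python
import math


def mean_silhouette(features, labels):
    """Mean silhouette coefficient, Eq. (12), with Euclidean distance.

    Assumes at least two classes; a sample that is alone in its class
    contributes 0, following the usual convention.
    """
    n = len(features)
    total = 0.0
    for i in range(n):
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            d = math.dist(features[i], features[j])
            sums[labels[j]] = sums.get(labels[j], 0.0) + d
            counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if own not in counts:          # singleton class
            continue
        a = sums[own] / counts[own]    # intra-class cohesion a(i)
        b = min(sums[c] / counts[c] for c in sums if c != own)  # b(i)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / n


# Two tight, well-separated groups score close to 1.
points = [[0.0], [1.0], [10.0], [11.0]]
print(mean_silhouette(points, [0, 0, 1, 1]))
```

Relabelling the same points so that each class mixes both groups drives the score below zero, which is the behaviour the parameter search relies on.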

Finally, different sample lengths are evaluated to analyze robustness. Considering both stability and computational efficiency, $L = 1024$ is selected. As illustrated in Figure 4, the overall trends of the entropy curves for different sample lengths are generally consistent. However, when $L = 512$, the entropy values of both noise signals deviate from this trend, and the curve of the $1 / f$ noise exhibits a more pronounced offset along with larger standard deviations.

3. Honey Badger Algorithm Optimized Kernel Extreme Learning Machine

3.1. Honey Badger Algorithm

HBA is a bio-inspired metaheuristic algorithm that simulates the foraging behavior of honey badgers searching for food such as beehives. In the search phase, the stronger the smell of the food, the faster the honey badger moves toward it, and vice versa. Compared with traditional metaheuristic optimizers such as particle swarm optimization (PSO) and genetic algorithm (GA), HBA combines two search behaviors: a digging phase for local exploitation around the current best solution and a honeyguide-bird phase for global exploration. The density factor decreases adaptively with iterations, which helps the search process move gradually from exploration to exploitation. In addition, HBA has relatively few control parameters, making it convenient for optimizing the regularization coefficient and kernel width of KELM. The key HBA–KELM parameter settings used in this study are listed in Table 1.

The search procedure of HBA is as follows:

(1) Initializing the honey badger colony:

$$x_{i} = l b_{i} + r_{1} \times (u b_{i} - l b_{i}) \tag{13}$$

where $x_{i}$ represents the position of the i-th honey badger, $r_{1}$ is a random number in $[0, 1]$, and $u b_{i}$ and $l b_{i}$ represent the upper and lower search bounds, respectively.

(2) Defining intensity: Smell intensity $I_{i}$ is determined by the source strength S and the distance $d_{i}$:

$$I_{i} = r_{2} \times \frac{S}{4 π d_{i}^{2}} \tag{14}$$

$$S = {(x_{i} - x_{i + 1})}^{2} \tag{15}$$

$$d_{i} = x_{prey} - x_{i} \tag{16}$$

where $r_{2}$ is a random number in $[0, 1]$, and $x_{prey}$ is the best prey location found so far.

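(3) Updating density factor: The density factor $α$ decreases with the iteration count, which keeps the transition from exploration to exploitation smooth:

$$α = C \times \exp \left(\frac{- t}{t_{max}}\right) \tag{17}$$

where t is the current iteration, C is a constant (default $= 2$), and $t_{max}$ is the maximum number of iterations.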

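(4) Digging phase: The search behavior is selected by r, a random number in $[0, 1]$. When $0 \leq r < 0.5$, the digging phase is performed and the position $x_{new}$ is updated as:

$$x_{new} = x_{prey} + F \times β \times I_{i} \times x_{prey} + F \times r_{3} \times α \times d_{i} \times | \cos (2 π r_{4}) \times [1 - \cos (2 π r_{5})] | \tag{18}$$

where $β$ is a constant describing the ability of the honey badger to obtain food, and $r_{3}$, $r_{4}$, and $r_{5}$ are random numbers in $[0, 1]$. When $r \geq 0.5$, the honeyguide bird phase of Equation (23) is performed instead. The direction factor F is defined as:

$$F = \begin{cases} 1 & \text{if } r_{6} \leq 0.5 \\ - 1 & \text{if } r_{6} > 0.5 \end{cases} \tag{19}$$

where $r_{6}$ is a random number in $[0, 1]$.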

3.2. Kernel Extreme Learning Machine

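According to Mercer’s theorem, the kernel matrix $Ω$ is defined as:

$$Ω = H H^{T}, \quad Ω (i, j) = h (x_{i}) \cdot h (x_{j}) = K (x_{i}, x_{j}) \tag{20}$$

where H is the hidden-layer output matrix and $h (x)$ is the hidden-layer feature mapping. The output of the network can be expressed as:

$$g (x) = [K (x, x_{1}), \dots, K (x, x_{N})] {\left(\frac{I}{A} + Ω\right)}^{- 1} T \tag{21}$$

where I is the identity matrix, A is the regularization coefficient, and T is the target matrix of the N training samples. The radial basis function (RBF) is selected as the kernel:

$$K (x_{i}, x_{j}) = \exp \left(- \frac{∥ x_{i} - x_{j} ∥^{2}}{2 σ^{2}}\right) \tag{22}$$

where $σ$ is the kernel width.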
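Equations (20)–(22) amount to solving one regularized linear system on the training kernel matrix and then taking a kernel-weighted sum for each new sample. The sketch below illustrates this in plain Python under stated assumptions: the function names are hypothetical, class targets are encoded as one-hot rows (the encoding used in the study is not given), and a small Gauss-Jordan solver is used in place of a numerical library.

```python
import math


def rbf_kernel(a, b, sigma):
    """RBF kernel of Eq. (22)."""
    sq = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve_linear(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kelm_train(X, labels, n_classes, A, sigma):
    """Return the output weights (I/A + Omega)^(-1) T from Eq. (21)."""
    n = len(X)
    M = [[rbf_kernel(X[i], X[j], sigma) + (1.0 / A if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Assumption: one-hot target rows, one column per class.
    T = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    return solve_linear(M, T)


def kelm_predict(x, X, weights, sigma):
    """Class index with the largest network output g(x), Eq. (21)."""
    k = [rbf_kernel(x, xi, sigma) for xi in X]
    scores = [sum(k[i] * weights[i][c] for i in range(len(X)))
              for c in range(len(weights[0]))]
    return max(range(len(scores)), key=scores.__getitem__)
```

Because the only free quantities are `A` and `sigma`, these two values are the natural search variables for an external optimizer.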

3.3. HBA–KELM Algorithm

(5) Honeyguide bird phase: Continuing the HBA search procedure of Section 3.1, when $r \geq 0.5$, the position is updated according to the following strategy:

$$x_{new} = x_{prey} + F \times r_{7} \times α \times d_{i} \tag{23}$$

where $r_{7}$ is a random number in $[0, 1]$.

Based on the HBA and KELM, the initial parameters $(A, σ)$ of the KELM are optimized using the HBA. The classification error rate of the training set is selected as the fitness function for the HBA–KELM:

$$\mathrm{fitness} (x) = \frac{N_{err}}{N_{train}} \tag{24}$$

where $N_{err}$ is the number of incorrectly classified training samples and $N_{train}$ is the total number of training samples. A smaller fitness value indicates better KELM parameters.
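The search procedure of Equations (13)–(19) and (23) can be sketched as a generic minimizer. This is an illustrative sketch, not the configuration of Table 1: the function name `hba_minimize` is hypothetical, the defaults `n_agents`, `t_max`, and `beta = 6` are assumptions, the source strength and distance in Equations (14)–(16) are evaluated as squared Euclidean norms over all dimensions, a small constant avoids division by zero, new positions are clipped to the bounds, and a new position is kept only if it improves the fitness. For HBA–KELM the objective would be the training error of Equation (24) evaluated at a candidate $(A, σ)$; a toy objective is used in the usage note instead.

```python
import math
import random


def hba_minimize(objective, lower, upper, n_agents=20, t_max=100,
                 beta=6.0, C=2.0, seed=0):
    """Minimize `objective` over a box; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    dim = len(lower)
    # Eq. (13): uniform random initialization inside the bounds.
    pop = [[lower[d] + rng.random() * (upper[d] - lower[d]) for d in range(dim)]
           for _ in range(n_agents)]
    fit = [objective(p) for p in pop]
    best = min(range(n_agents), key=fit.__getitem__)
    prey, prey_fit = pop[best][:], fit[best]
    eps = 1e-12  # assumption: guards against a zero distance in Eq. (14)
    for t in range(1, t_max + 1):
        alpha = C * math.exp(-t / t_max)                       # Eq. (17)
        for i in range(n_agents):
            nxt = pop[(i + 1) % n_agents]
            S = sum((a - b) ** 2 for a, b in zip(pop[i], nxt))       # Eq. (15)
            d2 = sum((p - a) ** 2 for p, a in zip(prey, pop[i]))     # squared distance
            I = rng.random() * S / (4 * math.pi * (d2 + eps))       # Eq. (14)
            F = 1.0 if rng.random() <= 0.5 else -1.0                # Eq. (19)
            r = rng.random()
            new = []
            for d in range(dim):
                di = prey[d] - pop[i][d]                            # Eq. (16)
                if r < 0.5:                                          # digging, Eq. (18)
                    r3, r4, r5 = rng.random(), rng.random(), rng.random()
                    step = (F * beta * I * prey[d] + F * r3 * alpha * di
                            * abs(math.cos(2 * math.pi * r4)
                                  * (1 - math.cos(2 * math.pi * r5))))
                else:                                                # honeyguide, Eq. (23)
                    step = F * rng.random() * alpha * di
                new.append(min(max(prey[d] + step, lower[d]), upper[d]))
            new_fit = objective(new)
            if new_fit < fit[i]:          # greedy replacement (assumption)
                pop[i], fit[i] = new, new_fit
                if new_fit < prey_fit:
                    prey, prey_fit = new[:], new_fit
    return prey, prey_fit
```

A call such as `hba_minimize(lambda p: sum(v * v for v in p), [-5, -5], [5, 5])` exercises the routine on a simple quadratic bowl; because replacement is greedy, the returned fitness can never be worse than the best value in the initial population.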

3.4. Implementation of the Proposed Method

To accurately identify rolling bearings under different health conditions, an intelligent fault diagnosis method based on ECMSE and HBA–KELM is proposed. To improve the recognition accuracy of KELM, the global optimization capability of HBA is utilized to achieve automatic parameter selection of KELM, thereby establishing a fault diagnosis model for rolling bearings.

The flowchart of the proposed method is shown in Figure 5, and the detailed procedure is as follows:

(1) Acceleration sensors are used to collect vibration signals of rolling bearings under different health conditions.

(2) For each fault type, M samples with length $L = 1024$ are obtained. Then, N samples are randomly selected as the training set, and the remaining $M - N$ samples are used as the test set.

(3) After the training/test split is completed in each run, the MSC values of the training-set features are calculated under different parameter combinations $(δ, γ, m)$ to determine the optimal ECMSE parameters. First, the time delay is set to $τ = 1$, and the maximum scale factor is set to $s_{max} = 8$. The low threshold $δ$ is only used to categorize approximately equal amplitudes and therefore has a limited effect on SE. Hence, to simplify parameter selection, m is first fixed at 2, and the optimal low threshold $δ$ is determined based on MSC under different high thresholds. Then, with $δ$ fixed, the optimal combination of $(γ, m)$ is selected according to the maximum MSC. No test samples are used for parameter tuning or classifier optimization.

(4) The fault features of the training set and test set are extracted using ECMSE. Subsequently, the training features are input into the HBA–KELM classifier for model training, and the test features are used for classification and recognition.
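The two-stage parameter search in the procedure above can be sketched as follows. The function `select_ecmse_params` and the callable `score` are hypothetical: `score(delta, gamma, m)` is assumed to return the MSC of the training features for one parameter combination. Two further assumptions are made because the text does not fix them: in the first stage the MSC is averaged over the high thresholds, and ties are resolved in favour of the largest $δ$.

```python
from itertools import product


def select_ecmse_params(score, deltas, gammas, ms, m_init=2):
    """Two-stage search: choose delta with m fixed, then choose (gamma, m)."""

    def stage1(delta):
        # Assumption: average the score over all high thresholds.
        return sum(score(delta, g, m_init) for g in gammas) / len(gammas)

    # max() returns the first maximum, so scanning from the largest delta
    # keeps the largest value among ties.
    best_delta = max(sorted(deltas, reverse=True), key=stage1)
    best_gamma, best_m = max(product(gammas, ms),
                             key=lambda gm: score(best_delta, gm[0], gm[1]))
    return best_delta, best_gamma, best_m


# Search ranges taken from Eqs. (9)-(11).
DELTAS = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
GAMMAS = [30, 45, 60]          # degrees
MS = [2, 3, 4, 5, 6, 7]
```

Searching $δ$ separately reduces the number of evaluations from the full product of the three ranges to the sum of the $δ$ stage and the $(γ, m)$ stage.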

4. Experimental Study

4.1. Experimental Setting

In this section, two bearing datasets under complex working conditions are selected to validate the effectiveness of the proposed method for rolling bearing fault diagnosis. In each case, the feature extraction quality and classification accuracy of the proposed ECMSE method are compared with its variants, CMSE and DBCMSE. In addition, several advanced methods, including CMWPE, RCMFE, RCMDE, HFDE, and HWPE [20,30,31,32,33], are also considered for comparison. Recent data-driven and deep-learning-based approaches, such as semi-supervised metric learning, convolutional neural networks, and cross-domain adaptation models, have also achieved promising results in rolling bearing diagnosis [2,5,6,7]. Since the main objective of this paper is to evaluate whether ECMSE provides more discriminative entropy features, the primary comparisons are conducted among entropy-based feature extraction methods under the same HBA–KELM classifier. To further position the proposed method against broader data-driven diagnosis models, representative SVM, KELM, 1D-CNN, CNN-LSTM, and 1D-ResNet baselines are additionally evaluated in Section 4.4. Comparisons with transfer-learning and domain-adaptation models under variable working conditions remain an important direction for future work.

In the experiments, all parameters of CMSE and DBCMSE are set consistently with those of ECMSE. Moreover, the same parameters (i.e., delay time $τ$ and embedding dimension m) are adopted for other comparison methods. To ensure the same feature vector length, the decomposition level k of HWPE and HFDE is set to 3, and the maximum scale factor $s_{max}$ of other methods is set to 8. In addition, the independent parameters of RCMFE are set as follows: similarity tolerance $R = 0.15 \times SD$ and gradient $n = 2$ [30]. The number of classes C for RCMDE and HFDE is set to 5 [31]. All methods are implemented in MATLAB 2018b and executed on a computer equipped with an Intel® Core™ i5-1135G7 CPU @ 2.40 GHz and 16.00 GB RAM. For reproducibility, all random training/test partitions are generated before feature extraction and are kept identical for all compared methods in the same run. The ECMSE parameter selection, HBA–KELM optimization, and classifier training are performed only on the training samples, while the test samples are used exclusively for final evaluation. Each reported maximum, minimum, mean, and standard deviation is obtained from 30 independent runs.

4.2. Test Verification Case 1

In Case 1, the rolling bearing dataset [34] from Case Western Reserve University (CWRU) is employed to evaluate the performance of the proposed method. Figure 6 shows the experimental rig, which mainly consists of five components: a fan-end bearing, an induction motor, a drive-end bearing, a torque transducer/encoder, and a dynamometer.

In Case 1, the test objects are 6205-2RS JEM deep groove ball bearings manufactured by SKF Group (Gothenburg, Sweden) and located at the drive end. The fault types include inner race fault, outer race fault, and ball fault. To simulate different damage levels of rolling bearings, electro-discharge machining (EDM) is applied to process healthy bearings. For each fault type, fault diameters of 0.1778 mm, 0.3556 mm, and 0.5334 mm are introduced.

As a result, vibration signals under ten different working conditions are obtained. In this experiment, the motor load is 1 HP, the sampling frequency is 12 kHz, and the rotational speed is approximately 1772 rpm. For each working condition, 100 non-overlapping samples with a length of 1024 are obtained, among which 20 samples are selected as the training set and the remaining 80 samples as the test set. Therefore, there are 200 training samples and 800 test samples in total. The detailed information of bearings under different working conditions is presented in Table 2, and the corresponding vibration signal waveforms are shown in Figure 7.
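The sample construction described above can be sketched as follows. The helper names `segment` and `split_per_class` are hypothetical, and the seeded shuffle is an assumption about how the random selection might be made reproducible; the actual partitioning code is not given.

```python
import random


def segment(signal, length=1024, n_samples=100):
    """Cut a signal into n_samples non-overlapping samples of equal length."""
    if len(signal) < length * n_samples:
        raise ValueError("signal too short for the requested segmentation")
    return [signal[i * length:(i + 1) * length] for i in range(n_samples)]


def split_per_class(samples_by_class, n_train=20, seed=0):
    """Randomly pick n_train samples per class for training; the rest are test."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        idx = list(range(len(samples)))
        rng.shuffle(idx)
        train += [(samples[i], label) for i in idx[:n_train]]
        test += [(samples[i], label) for i in idx[n_train:]]
    return train, test
```

With ten working conditions, 100 samples each, and `n_train=20`, this yields the 200 training and 800 test samples stated above. Generating the split once per run and reusing it for every compared method keeps the comparison paired.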

After obtaining all signal samples, the MSC index is used to determine the optimal parameter combination $(δ, γ, m)$ of ECMSE. To this end, after each random split, the 20 training samples of each working state are used for MSC-based parameter selection, and no test samples are involved in this process. The MSC values of the training samples under different parameters are calculated. To simplify parameter selection, $m = 2$ is first fixed to observe the effect of the low threshold $δ$ under different high thresholds. Figure 8a shows that as $δ$ decreases, the MSC values for the three $γ$ groups gradually increase. When $δ$ decreases to $10^{- 3}$ or lower, the MSC values in each group reach their maximum; therefore, the optimal $δ$ is set to $10^{- 3}$. Subsequently, with the optimal low threshold fixed, the MSC values under different combinations $(γ, m)$ are calculated. As shown in Figure 8b, the MSC reaches the maximum value when $γ = 30^{\circ}$ and $m = 4$. Consequently, the final parameters of ECMSE are set as $m = 4$, $τ = 1$, $δ = 10^{- 3}$, $γ = 30^{\circ}$, and $L = 1024$. Next, the ECMSE features of all samples are calculated. The feature vectors of the training set are fed into the HBA–KELM classifier for training, and then the feature vectors of the test set are fed into the optimized model for verification. As displayed in Figure 9, the proposed approach can effectively identify bearing faults with different types and damage degrees, and the maximum recognition accuracy reaches 100%.

To validate the superiority of the proposed method, feature extraction performance experiments are conducted, and the comparison methods include CMSE, DBCMSE, RCMFE, RCMDE, HFDE, HWPE, and CMWPE. t-SNE [35] is employed to reduce the multidimensional features of all methods to a two-dimensional space, and the visualization results are shown in Figure 10. It can be observed that the inter-class distances of OR2, B1, and B3 in CMSE are too close to be distinguished. In contrast, except for a few misclassified B3 samples, ECMSE and DBCMSE exhibit good separability overall, demonstrating the advantage of the proposed multi-order difference method in extracting detailed information. Additionally, the HFDE method also shows relatively good clustering performance; however, the B1 and B3 faults remain difficult to distinguish. The other methods exhibit issues such as large intra-class dispersion and severe feature overlap, making it difficult to effectively distinguish different fault types. In summary, DBCMSE and ECMSE achieve similar visualization results, with the best clustering performance among all methods. Following feature extraction, the feature vectors are fed into HBA–KELM for classification. To eliminate randomness and ensure reliability, all methods are executed 30 times, with the training and test sets randomly selected before each run. The recognition accuracy and total computation time of different methods are presented in Table 3 and Figure 11a. As shown, the proposed ECMSE method achieves the highest recognition accuracy in all experiments.

Furthermore, although DBCMSE requires the shortest computation time, the integration of CMSE and DBCMSE in ECMSE increases the computational cost slightly, while significantly improving diagnostic accuracy and robustness.

In addition, to further analyze the influence of sample set partitioning on the experimental results, six scenarios (i.e., training set/test set ratios) are considered: 10/90, 20/80, 40/60, 60/40, 80/20, and 90/10. Figure 11b shows the average recognition accuracy for each scenario over 30 runs. As observed, increasing the size of the training set contributes positively to improving recognition accuracy. However, an excessive number of training samples may reduce the efficiency of classifier parameter optimization. On the other hand, variations in the sample ratio have little effect on the proposed method, and its accuracy reaches the maximum under all conditions. Notably, although DBCMSE performs similarly to ECMSE in Case 1, its classification performance degrades when the training/test ratio is 10/90, indicating that ECMSE is less dependent on the number of training samples. Overall, the proposed method demonstrates clear advantages in both recognition accuracy and robustness.

4.3. Test Verification Case 2

In Case 2, the experimental data were obtained from the aero-engine bearing test bench in the laboratory, which mainly consists of a spindle testing machine, a refrigeration system, a hydraulic loading system, and a lubrication system. Figure 12 shows the experimental setup and the locations of the accelerometers. The test object is the NU1010EM (inner race detachable)/N1010EM (outer race detachable) single-row cylindrical roller bearing manufactured by NSK. The bearing fault types include inner race fault, outer race fault, ball fault, outer race and ball compound fault, and inner race and ball compound fault.

The main geometric parameters of the tested bearing are listed in Table 4. Two piezoelectric acceleration sensors were mounted on the bearing housing by threaded connections, as shown in Figure 12b. Accelerometer 1 was installed near the upper radial position of the bearing housing, and Accelerometer 2 was installed near the lower radial position to collect vibration responses from different measurement locations.

To simulate different degrees of bearing damage, a laser marking machine and a wire cutting machine were used to process healthy bearings to obtain single-point and multi-point fault bearings with damage dimensions of 9 mm (length) × 0.2 mm (width). For single-point faults, one rectangular defect was fabricated on the corresponding bearing component. For multi-point and compound faults, three defects with the same nominal size were fabricated on the corresponding single or combined components, as summarized in Table 5. In Case 2, vibration signals of rolling bearings under nine different working conditions were collected. During the data acquisition process, the axial load was set to 2 kN, the motor speed was set to 2000 rpm, and the sampling frequency was 20.48 kHz. For each working condition, 100 non-overlapping samples of length 1024 were obtained, of which 20 samples were selected as the training set, and the remaining 80 samples were used as the test set. The detailed information of bearings under different working conditions is presented in Table 5, and the corresponding waveforms are shown in Figure 13. In this subsection, the parameter selection procedure of ECMSE is the same as that in Case 1. After each random training/test split, the 20 training samples of each bearing type are used for MSC-based parameter selection, and the test samples are not used during parameter tuning. The MSC values of the training samples are calculated under different parameter settings.

Figure 14a shows that as the low threshold $\delta$ decreases, the MSC trends corresponding to different high thresholds $\gamma$ are generally consistent. When $\delta = 10^{-3}$, the MSC values of the three groups reach their maximum and remain unchanged thereafter. Therefore, the optimal value of $\delta$ is set to $10^{-3}$. The selection process of the low threshold in Cases 1 and 2 indicates that even if $\gamma$ varies, the MSC trends with respect to $\delta$ remain similar. Hence, it is feasible to determine $\delta$ independently. Similar to Case 1, the MSC values under different parameter combinations $(\gamma, m)$ are shown in Figure 14b. It can be observed that the MSC reaches its maximum when $\gamma = 60^{\circ}$ and $m = 3$. Accordingly, the final parameters of ECMSE are determined as $m = 3$, $\tau = 1$, $\delta = 10^{-3}$, $\gamma = 60^{\circ}$, and $L = 1024$.

The different optimal parameter combinations selected in Case 1 ($m = 4$, $\gamma = 30^{\circ}$) and Case 2 ($m = 3$, $\gamma = 60^{\circ}$) reflect the data-dependent sensitivity of ECMSE parameters. The low threshold $\delta$ mainly identifies approximately equal adjacent amplitudes, and its MSC trends are similar in the two cases; once $\delta$ decreases to $10^{-3}$, the MSC values become stable, indicating that ECMSE is relatively insensitive to smaller $\delta$ values. By contrast, the high threshold $\gamma$ and embedding dimension $m$ directly affect the symbolic slope patterns and are more sensitive to the signal structure. For the CWRU dataset in Case 1, the single-point faults are characterized by relatively regular impulse responses with subtle amplitude variations, so a lower high threshold and a slightly larger embedding dimension help capture fine local fluctuation patterns. In Case 2, the private dataset differs in bearing type, acquisition platform, and fault form, and the multi-point and compound faults produce more complex amplitude changes; therefore, a larger high threshold and a lower embedding dimension reduce excessive pattern fragmentation and provide better feature separability. This indicates that the optimal ECMSE parameters should be selected according to the target dataset, while the compact MSC-based search range enables stable parameter determination using only the training samples.

Subsequently, the ECMSE features of all samples are extracted. The feature vectors of the training set are then fed into the fault classifier for model training, and the feature vectors of the test set are input into the optimized model for fault identification.
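To make the roles of the low threshold, the high threshold, and the embedding dimension more concrete, the following is a minimal sketch of single-scale slope entropy in the spirit of the slope entropy reference cited in this paper (Cuesta-Frau, 2019). It is not the ECMSE implementation: coarse-graining, the difference-based branch, and the delay are omitted. Two points are assumptions and not taken from the text: the angle-valued high threshold is converted to a slope threshold with the tangent function, and pattern probabilities are normalized by the total number of windows. The names `slope_symbols` and `slope_entropy` are hypothetical.

```python
import math
from collections import Counter


def slope_symbols(window, delta, gamma_slope):
    """Map consecutive differences of one window to symbols in {-2, -1, 0, 1, 2}."""
    symbols = []
    for prev, cur in zip(window, window[1:]):
        d = cur - prev
        if d > gamma_slope:
            symbols.append(2)       # steep rise
        elif d > delta:
            symbols.append(1)       # moderate rise
        elif d >= -delta:
            symbols.append(0)       # approximately equal adjacent amplitudes
        elif d >= -gamma_slope:
            symbols.append(-1)      # moderate fall
        else:
            symbols.append(-2)      # steep fall
    return tuple(symbols)


def slope_entropy(x, m=3, delta=1e-3, gamma_deg=60.0):
    """Shannon entropy of the slope patterns of all length-m windows of x."""
    # ASSUMPTION: the angle threshold (degrees) is mapped to a slope via tan().
    gamma_slope = math.tan(math.radians(gamma_deg))
    n_windows = len(x) - m + 1
    if n_windows < 1:
        raise ValueError("series is shorter than the embedding dimension m")
    counts = Counter(
        slope_symbols(x[i:i + m], delta, gamma_slope) for i in range(n_windows)
    )
    # ASSUMPTION: relative frequencies are taken over all windows.
    return -sum((c / n_windows) * math.log(c / n_windows) for c in counts.values())


# Example call with the Case 2 settings on a short hypothetical series.
value = slope_entropy([0.0, 0.5, 3.0, 3.0, 2.5, 0.0, 0.1, 0.1], m=3,
                      delta=1e-3, gamma_deg=60.0)
```

In this formulation each window of length $m$ yields $m - 1$ symbols from a five-symbol alphabet, so there are at most $5^{m-1}$ distinct patterns (25 for $m = 3$, 125 for $m = 4$). This is one way to read the remark about pattern fragmentation: with a fixed sample length, a larger $m$ spreads the same number of windows over more possible patterns.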

Similar to Case 1, the features extracted by all methods are input into HBA–KELM for classification experiments. To reduce the influence of randomness, each method is executed 30 times, with the training and test sets randomly selected each time. The experimental results are presented in Table 6 and Figure 15a. The proposed method achieves the highest average recognition accuracy and the smallest standard deviation. Although DBCMSE requires the shortest computation time (16.576 s versus 42.719 s for ECMSE), its average accuracy is 99.11%, which is lower than that of ECMSE (99.82%). Furthermore, the training/test sets are divided at different ratios (10/90, 20/80, 40/60, 60/40, 80/20, and 90/10). The average recognition accuracy is calculated over 30 runs for each ratio, and the results are shown in Figure 15b. The recognition accuracy of the comparative methods remains below that of the proposed method under all training/test ratios, further demonstrating the superiority of ECMSE.

4.4. Additional Robustness and Baseline Analysis

To further examine robustness and to position the method against broader baselines, three additional analyses are conducted: (1) an SNR-based noise robustness test, (2) a comparison with representative 1D signal-based machine-learning and deep-learning baselines, and (3) a classifier optimization ablation study. For the noise robustness test, additive white Gaussian noise is injected into the Case 1 test samples at different SNR levels, while the training samples remain unchanged. Each method is repeated 30 times under the same random partitions. The results are shown in Table 7.
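The noise injection step can be sketched as follows. The sketch assumes that the SNR is defined per sample from the mean power of the clean sample, which the text does not specify; the function names `add_awgn` and `measured_snr_db` and the sinusoidal stand-in signal are hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return signal plus zero-mean white Gaussian noise at the requested SNR (dB)."""
    rng = rng or random.Random()
    # ASSUMPTION: signal power is the mean squared amplitude of this sample.
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy sample relative to its clean version."""
    p_signal = sum(v * v for v in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)


# ASSUMPTION: a 1024-point sinusoid stands in for one clean test sample.
clean = [math.sin(2.0 * math.pi * 50.0 * k / 1024.0) for k in range(1024)]
rng = random.Random(0)
for snr in (12, 8, 4, 0, -4):   # SNR levels listed in Table 7
    noisy = add_awgn(clean, snr, rng)
    print(snr, round(measured_snr_db(clean, noisy), 2))
```

Because the noise is a finite random draw, the measured SNR of each noisy sample only approximates the target value. At an SNR of −4 dB the noise power is about 2.5 times the signal power, since $10^{0.4} \approx 2.51$.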

It is worth emphasizing that ECMSE exhibits the best anti-noise performance among all compared methods, highlighting its engineering applicability in noisy bearing monitoring environments. Even under the severe $-4$ dB condition, ECMSE still achieves an average recognition accuracy of 93.68%, outperforming DBCMSE, CMSE, RCMDE, and RCMFE by 3.23, 13.35, 9.06, and 17.14 percentage points, respectively.

As shown in Table 7, the recognition accuracy of all methods decreases as the noise intensity increases, but ECMSE maintains the highest average accuracy at all SNR levels. When the SNR decreases from the clean condition to $-4$ dB, the accuracy of ECMSE decreases by 6.27 percentage points, which is smaller than the decreases of DBCMSE (9.39 percentage points), CMSE (16.17 percentage points), RCMDE (13.44 percentage points), and RCMFE (16.13 percentage points). These results indicate that the fusion of CMSE and DBCMSE improves robustness under noisy conditions: the CMSE branch retains relatively stable low-frequency trend information, while the DBCMSE branch preserves high-frequency fault-related details.
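The percentage-point figures quoted for the noise test follow directly from the mean accuracies in Table 7. The short check below recomputes them; the dictionary layout and variable names are illustrative only.

```python
# Mean accuracies (%) copied from Table 7: (clean condition, -4 dB).
table7 = {
    "ECMSE":  (99.95, 93.68),
    "DBCMSE": (99.84, 90.45),
    "CMSE":   (96.50, 80.33),
    "RCMDE":  (98.06, 84.62),
    "RCMFE":  (92.67, 76.54),
}

# Accuracy drop from the clean condition to -4 dB, in percentage points.
drop = {name: round(acc_clean - acc_noisy, 2)
        for name, (acc_clean, acc_noisy) in table7.items()}

# Margin of ECMSE over each other method at -4 dB, in percentage points.
margin = {name: round(table7["ECMSE"][1] - acc_noisy, 2)
          for name, (_, acc_noisy) in table7.items() if name != "ECMSE"}

print(drop)
print(margin)
```

The recomputed values agree with the drops and margins reported in the text for all five methods.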

To position the proposed method against broader fault diagnosis models, representative 1D signal-based baselines are considered, including SVM, KELM, 1D-CNN, CNN-LSTM, and 1D-ResNet. For a fair comparison, all additional baselines use the same training/test partitions as the entropy-based methods, and the test samples are not used for model selection. The comparison results are shown in Table 8.

The results in Table 8 indicate that the proposed ECMSE+HBA–KELM framework achieves competitive performance compared with representative deep-learning models while using compact entropy features and a shallow classifier. In Case 1, ECMSE+HBA–KELM improves the average accuracy by 0.87 percentage points over 1D-ResNet, the strongest deep-learning baseline. In Case 2, the corresponding improvement is 1.21 percentage points. This suggests that the proposed entropy representation remains effective in small-sample diagnosis, where end-to-end deep models may not fully exploit their representation capacity.

To separate the contribution of ECMSE from the classifier optimizer, a classifier ablation study is further conducted by combining ECMSE with KELM, PSO-KELM, GA-KELM, and HBA–KELM. The results are shown in Table 9.

Finally, to examine whether the improvement of ECMSE over its two branches is statistically meaningful, a paired Wilcoxon signed-rank test is conducted over the 30 repeated runs. In each run, the compared methods use the same random training/test partition, so their recognition accuracies form paired observations. The null hypothesis is that the median paired accuracy difference is zero, and a two-sided test with a significance level of 0.05 is adopted. The statistical results are listed in Table 10.
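A paired Wilcoxon signed-rank test of this kind can be sketched as follows. The sketch uses the large-sample normal approximation with a tie correction and without continuity correction; the text does not state whether an exact or an approximate p-value was used, so this choice is an assumption. The function name `wilcoxon_signed_rank` and the example data are hypothetical; the example expresses results as counts of correctly classified test samples (out of 9 × 80 = 720 in Case 2), which keeps tied differences exact.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Two-sided paired Wilcoxon signed-rank test (normal approximation).

    Returns (W, p), where W is the smaller of the positive and negative rank sums.
    Zero differences are discarded; tied |differences| receive average ranks.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within groups of ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Null mean and tie-corrected variance of the rank-sum statistic.
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (w - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value
    return w, min(p, 1.0)


# ASSUMPTION: hypothetical counts of correct test samples (out of 720) for two
# methods evaluated on the same 10 random partitions; not data from the paper.
correct_a = [719, 720, 718, 720, 719, 720, 717, 719, 720, 719]
correct_b = [716, 714, 718, 713, 715, 716, 712, 717, 713, 714]
w, p = wilcoxon_signed_rank(correct_a, correct_b)
reject_null = p < 0.05   # two-sided test at the 0.05 level used in the text
```

Pairing matters here because both methods see the same random partition in each run, so partition-to-partition variation cancels in the differences.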

The ablation results in Table 9 show that HBA–KELM provides the best classification performance among the tested KELM variants. Compared with ECMSE+KELM, ECMSE+HBA–KELM improves the mean accuracy by 1.11 percentage points in Case 1 and 1.50 percentage points in Case 2. It also outperforms ECMSE+PSO-KELM and ECMSE+GA-KELM, indicating that HBA is effective for optimizing the KELM parameters in this task. In addition, Table 10 shows that the improvement of ECMSE over CMSE and RCMDE is highly significant in both cases. The improvement over DBCMSE is smaller but still statistically significant, with $p = 0.018$ in Case 1 and $p = 0.003$ in Case 2. These results support the effectiveness of fusing the low-frequency CMSE branch and the high-frequency DBCMSE branch.

5. Conclusions

To accurately identify different fault types of rolling bearings, an intelligent fault diagnosis method based on ECMSE and HBA–KELM is proposed in this paper. In the feature extraction stage, the CMSE method is employed to capture low-frequency information from time series, while a multi-order difference strategy is introduced to construct the DBCMSE method for extracting high-frequency information. By integrating these two approaches, ECMSE is developed to extract comprehensive fault features of rolling bearings. Subsequently, the MSC criterion is used to determine the optimal parameter combination of ECMSE. In the fault identification stage, HBA is used to optimize the regularization parameter and kernel parameter of KELM, thereby establishing the fault diagnosis model. Experimental results on two datasets demonstrate that the proposed method can accurately identify different fault types and damage severities of rolling bearings and can distinguish compound faults. Furthermore, comparative experiments show that ECMSE achieves superior feature extraction capability, higher recognition accuracy, and better stability than the compared methods.

Although the two datasets used in this study include different bearing types, fault categories, and acquisition platforms, the current validation is still limited to fixed operating conditions. More challenging scenarios, such as variable speeds, variable loads, non-Gaussian noise, and cross-domain transfer between different machines, should be further investigated. Future work will focus on applying the proposed method to fault diagnosis of other rotating machinery and on investigating its performance under variable operating conditions.

Author Contributions

Methodology, W.L. and J.C.; validation, W.L. and J.L.; formal analysis, W.L. and J.C.; investigation, Y.C.; resources, S.W. and J.L.; data curation, W.L. and S.W.; writing—original draft, W.L.; writing—review and editing, J.C. and J.L.; supervision, J.C.; project administration, J.L.; funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the project “Development of an Immersive Intelligent Rhythmic Sound Field Generation System” (Grant No. W2026JSKF0308).

Data Availability Statement

This study employed both publicly available and private datasets for experimental analyses. The publicly available dataset is the Case Western Reserve University (CWRU) Bearing Dataset (https://engineering.case.edu/bearingdatacenter, accessed on 1 March 2025). In addition, a private dataset was collected using a custom experimental platform developed at Hefei University of Technology. Due to copyright and ownership restrictions, this dataset is not publicly available but can be provided by the corresponding author upon reasonable request. The MATLAB implementation of the proposed ECMSE feature extraction and HBA–KELM classification procedure can also be provided by the corresponding author upon reasonable request for academic verification.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

Cerrada, M.; Sánchez, R.V.; Li, C.; Pacheco, F.; Cabrera, D.; De Oliveira, J.V.; Vásquez, R.E. A review on data-driven fault severity assessment in rolling bearings. Mech. Syst. Signal Process. 2018, 99, 169–196.
Yu, K.; Lin, T.R.; Ma, H.; Li, X.; Li, X. A multi-stage semi-supervised learning approach for intelligent fault diagnosis of rolling bearing using data augmentation and metric learning. Mech. Syst. Signal Process. 2021, 146, 107043.
Sun, B.; Sheng, Z.; Song, P.; Sun, H.; Wang, F.; Sun, X.; Liu, J. State-of-the-art detection and diagnosis methods for rolling bearing defects: A comprehensive review. Appl. Sci. 2025, 15, 1001.
Keshun, Y.; Puzhou, W.; Peng, H.; Yingkui, G. A sound-vibration physical-information fusion constraint-guided deep learning method for rolling bearing fault diagnosis. Reliab. Eng. Syst. Saf. 2025, 253, 110556.
Li, W.; Chen, Y.; Li, J.; Wen, J.; Chen, J. Learn then adapt: A novel test-time adaptation method for cross-domain fault diagnosis of rolling bearings. Electronics 2024, 13, 3898.
Li, W.; Wang, Y.; Li, J.; Han, Z.; Chen, Y.; Chen, J. An Online Learning Framework for Fault Diagnosis of Rolling Bearings Under Distribution Shifts. Mathematics 2025, 13, 3763.
Cheng, Y.; Lin, M.; Wu, J.; Zhu, H.; Shao, X. Intelligent fault diagnosis of rotating machinery based on continuous wavelet transform-local binary convolutional neural network. Knowl.-Based Syst. 2021, 216, 106796.
Chen, W.; Li, J.; Wang, Q.; Han, K. Fault feature extraction and diagnosis of rolling bearings based on wavelet thresholding denoising with CEEMDAN energy entropy and PSO-LSSVM. Measurement 2021, 172, 108901.
Nassef, M.; Hussein, T.M.; Mokhiamar, O. An adaptive variational mode decomposition based on sailfish optimization algorithm and Gini index for fault identification in rolling bearings. Measurement 2021, 173, 108514.
Tripathi, P.M.; Kumar, A.; Komaragiri, R.; Kumar, M. Watermarking of ECG signals compressed using Fourier decomposition method. Multimed. Tools Appl. 2022, 81, 19543–19557.
Yin, C.; Li, Y.; Wang, Y.; Dong, Y. Physics-guided degradation trajectory modeling for remaining useful life prediction of rolling bearings. Mech. Syst. Signal Process. 2025, 224, 112192.
Yin, C.; Sun, T.; Wu, H.; Dong, Y. Trustworthy multistep-ahead remaining useful life prediction for rolling bearings with limited data. Reliab. Eng. Syst. Saf. 2026, 267, 111902.
Yin, C.; Dong, Y.; He, J.; Wang, Y. A fault evolution knowledge-driven adversarial meta-learning method for few-shot tool state recognition under variable working conditions. Eng. Appl. Artif. Intell. 2026, 167, 113806.
Li, Y.; Wang, X.; Liu, Z.; Liang, X.; Si, S. The entropy algorithm and its variants in the fault diagnosis of rotating machinery: A review. IEEE Access 2018, 6, 66723–66741.
Zhu, K.; Chen, L.; Hu, X. A multi-scale fuzzy measure entropy and infinite feature selection based approach for rolling bearing fault diagnosis. J. Nondestruct. Eval. 2019, 38, 90.
Gao, X.; Yan, X.; Gao, P.; Gao, X.; Zhang, S. Automatic detection of epileptic seizure based on approximate entropy, recurrence quantification analysis and convolutional neural networks. Artif. Intell. Med. 2020, 102, 101711.
Fu, W.; Wang, K.; Tan, J.; Zhang, K. A composite framework coupling multiple feature selection, compound prediction models and novel hybrid swarm optimizer-based synchronization optimization strategy for multi-step ahead short-term wind speed forecasting. Energy Convers. Manag. 2020, 205, 112461.
Ruiz-Aguilar, J.J.; Turias, I.; González-Enrique, J.; Urda, D.; Elizondo, D. A permutation entropy-based EMD–ANN forecasting ensemble approach for wind speed prediction. Neural Comput. Appl. 2021, 33, 2369–2391.
Aziz, W.; Arif, M. Multiscale permutation entropy of physiological time series. In Proceedings of the 2005 Pakistan Section Multitopic Conference; IEEE: New York, NY, USA, 2005; pp. 1–6.
Zheng, J.; Dong, Z.; Pan, H.; Ni, Q.; Liu, T.; Zhang, J. Composite multi-scale weighted permutation entropy and extreme learning machine based intelligent fault diagnosis for rolling bearing. Measurement 2019, 143, 69–80.
Deng, B.; Liang, L.; Li, S.; Wang, R.; Yu, H.; Wang, J.; Wei, X. Complexity extraction of electroencephalograms in Alzheimer’s disease with weighted-permutation entropy. Chaos Interdiscip. J. Nonlinear Sci. 2015, 25, 043105.
Xiao-Feng, L.; Yue, W. Fine-grained permutation entropy as a measure of natural complexity for time series. Chin. Phys. B 2009, 18, 2690–2695.
Azami, H.; Escudero, J. Amplitude-aware permutation entropy: Illustration in spike detection and signal segmentation. Comput. Methods Programs Biomed. 2016, 128, 40–51.
Cuesta-Frau, D. Slope entropy: A new time series complexity estimator based on both symbolic patterns and amplitude information. Entropy 2019, 21, 1167.
Jiang, Y.; Peng, C.K.; Xu, Y. Hierarchical entropy analysis for biological signals. J. Comput. Appl. Math. 2011, 236, 728–742.
Fu, W.; Zhang, K.; Wang, K.; Wen, B.; Fang, P.; Zou, F. A hybrid approach for multi-step wind speed forecasting based on two-layer decomposition, improved hybrid DE-HHO optimization and KELM. Renew. Energy 2021, 164, 211–229.
Hashim, F.A.; Houssein, E.H.; Hussain, K.; Mabrouk, M.S.; Al-Atabany, W. Honey Badger Algorithm: New metaheuristic algorithm for solving optimization problems. Math. Comput. Simul. 2022, 192, 84–110.
Li, M.; Wang, R.; Yang, J.; Duan, L. An Improved Refined Composite Multivariate Multiscale Fuzzy Entropy Method for MI-EEG Feature Extraction. Comput. Intell. Neurosci. 2019, 2019, 7529572.
Yin, Y.; Shang, P. Multivariate weighted multiscale permutation entropy for complex time series. Nonlinear Dyn. 2017, 88, 1707–1722.
Gao, S.; Wang, Q.; Zhang, Y. Rolling bearing fault diagnosis based on CEEMDAN and refined composite multiscale fuzzy entropy. IEEE Trans. Instrum. Meas. 2021, 70, 3514908.
Chakraborty, M.; Mitra, D. Automated detection of epileptic seizures using multiscale and refined composite multiscale dispersion entropy. Chaos Solitons Fractals 2021, 146, 110939.
Ke, Y.; Yao, C.; Song, E.; Dong, Q.; Yang, L. An early fault diagnosis method of common-rail injector based on improved CYCBD and hierarchical fluctuation dispersion entropy. Digit. Signal Process. 2021, 114, 103049.
Yun, K.; Chong, Y.; Enzhe, S.; Liping, Y.; Quan, D. Fault diagnosis method of diesel engine injector based on hierarchical weighted permutation entropy. In Proceedings of the 2021 IEEE International Instrumentation and Measurement Technology Conference (I2MTC); IEEE: New York, NY, USA, 2021; pp. 1–6.
Li, Y.; Wang, X.; Si, S.; Huang, S. Entropy based fault classification using the Case Western Reserve University data: A benchmark study. IEEE Trans. Reliab. 2019, 69, 754–767.
Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.

Figure 1. The symbolic definition of slope entropy.

Figure 2. The flowchart of ECMSE.

Figure 3. Normalized spectra of weights under scale factors $s = 2$–10.

Figure 4. (a) ECMSE values of WGN signals with different lengths. (b) ECMSE values of $1/f$ noise signals with different lengths.

Figure 5. Flowchart of the proposed method.

Figure 6. Experimental rig of the CWRU bearing dataset.

Figure 7. Bearing vibration signal waveforms under different working conditions in Case 1. The horizontal axis denotes sample points, and the vertical axis denotes vibration amplitude.

Figure 8. MSC in Case 1 under different parameter settings: (a) different $(\delta, \gamma)$; (b) different $(\gamma, m)$.

Figure 9. Classification performance in Case 1: (a) predicted labels; (b) confusion matrix.

Figure 10. t-SNE visualization results of different feature extraction methods: (a) ECMSE, (b) CMSE, (c) DBCMSE, (d) CMWPE, (e) RCMFE, (f) RCMDE, (g) HFDE, and (h) HWPE.

Figure 11. Recognition accuracy in Case 1: (a) different methods; (b) different training/testing set ratios.

Figure 12. Experimental setup in Case 2: (a) experimental device; (b) accelerometer measurement locations.

Figure 13. Bearing vibration signal waveforms under different working conditions. The horizontal axis denotes sample points, and the vertical axis denotes vibration amplitude.

Figure 14. MSC in Case 2 under different parameter settings: (a) different $(\delta, \gamma)$; (b) different $(\gamma, m)$.

Figure 15. Recognition accuracy in Case 2: (a) different methods; (b) different training/testing set ratios.

Table 1. Key parameter settings of HBA–KELM.

Parameter	Value
Population size	20
Maximum number of iterations	50
Density factor constant C	2
Ability coefficient $\beta$	6
Search range of KELM regularization coefficient A	$[2^{-8}, 2^{8}]$
Search range of RBF kernel width $\sigma$	$[2^{-8}, 2^{8}]$
Fitness function	Training classification error rate

Table 2. Description of different working states in Case 1.

Working State	Abbreviation	Fault Diameter (mm)	Training Samples	Test Samples	Label
Normal	NOR	\	20	80	1
Inner race fault 1	IR1	0.1778	20	80	2
Inner race fault 2	IR2	0.3556	20	80	3
Inner race fault 3	IR3	0.5334	20	80	4
Outer race fault 1	OR1	0.1778	20	80	5
Outer race fault 2	OR2	0.3556	20	80	6
Outer race fault 3	OR3	0.5334	20	80	7
Ball fault 1	B1	0.1778	20	80	8
Ball fault 2	B2	0.3556	20	80	9
Ball fault 3	B3	0.5334	20	80	10

Table 3. Recognition accuracy and computation time of different methods in Case 1.

Method	Max (%)	Min (%)	Mean (%)	SD (%)	Computing Time (s)
ECMSE + HBA–KELM	100.00	99.63	99.95	0.10	47.098
CMSE + HBA–KELM	98.38	94.75	96.50	0.95	28.336
DBCMSE + HBA–KELM	100.00	99.38	99.84	0.18	18.592
CMWPE + HBA–KELM	85.00	80.38	83.00	1.14	94.958
RCMFE + HBA–KELM	94.13	90.63	92.67	0.76	165.929
RCMDE + HBA–KELM	99.50	97.50	98.06	0.49	47.689
HFDE + HBA–KELM	91.88	86.88	89.06	1.06	29.966
HWPE + HBA–KELM	88.25	77.75	82.80	2.39	28.485

Table 4. Main parameters of the tested cylindrical roller bearing in Case 2.

Parameter	Value
Bearing model	NU1010EM/N1010EM
Manufacturer	NSK
Bearing type	Single-row cylindrical roller bearing
Bore diameter	50 mm
Outer diameter	80 mm
Width	16 mm

Table 5. Description of different working states in Case 2.

Working State	Abbreviation	Fault Size (mm)	Training Samples	Test Samples	Label
Normal	NOR	\	20	80	1
Inner race fault 1	IRF1	9 × 0.2 (1 defect)	20	80	2
Outer race fault 1	ORF1	9 × 0.2 (1 defect)	20	80	3
Ball fault 1	BF1	9 × 0.2 (1 defect)	20	80	4
Outer race and ball compound fault	OBF	9 × 0.2 (3 defects)	20	80	5
Inner race and ball compound fault	IBF	9 × 0.2 (3 defects)	20	80	6
Inner race fault 2	IRF2	9 × 0.2 (3 defects)	20	80	7
Outer race fault 2	ORF2	9 × 0.2 (3 defects)	20	80	8
Ball fault 2	BF2	9 × 0.2 (3 defects)	20	80	9

Table 6. Recognition accuracy and computation time of different methods in Case 2.

Method	Max (%)	Min (%)	Mean (%)	SD (%)	Computing Time (s)
ECMSE + HBA–KELM	100.00	99.17	99.82	0.18	42.719
CMSE + HBA–KELM	98.89	96.25	97.78	0.75	25.659
DBCMSE + HBA–KELM	99.58	98.19	99.11	0.32	16.576
CMWPE + HBA–KELM	91.53	87.36	89.40	0.99	45.626
RCMFE + HBA–KELM	98.75	96.25	97.97	0.54	169.540
RCMDE + HBA–KELM	97.22	93.89	95.90	0.73	23.486
HFDE + HBA–KELM	78.61	71.67	75.47	1.47	26.250
HWPE + HBA–KELM	54.58	46.53	50.13	1.76	22.995

Table 7. Noise robustness results for Case 1 under different SNR levels (recognition accuracy in %, mean ± SD).

Method	Clean	12 dB	8 dB	4 dB	0 dB	−4 dB
ECMSE + HBA–KELM	99.95 ± 0.10	99.70 ± 0.16	99.31 ± 0.24	98.42 ± 0.37	96.53 ± 0.62	93.68 ± 0.91
DBCMSE + HBA–KELM	99.84 ± 0.18	99.20 ± 0.22	98.76 ± 0.31	97.31 ± 0.48	94.80 ± 0.74	90.45 ± 1.10
CMSE + HBA–KELM	96.50 ± 0.95	95.82 ± 1.02	94.18 ± 1.16	91.72 ± 1.41	86.54 ± 1.86	80.33 ± 2.24
RCMDE + HBA–KELM	98.06 ± 0.49	97.38 ± 0.58	96.42 ± 0.73	93.85 ± 0.96	89.76 ± 1.33	84.62 ± 1.71
RCMFE + HBA–KELM	92.67 ± 0.76	91.85 ± 0.88	90.24 ± 1.05	87.36 ± 1.42	82.15 ± 1.93	76.54 ± 2.35

Table 8. Comparison with representative machine-learning and deep-learning baselines.

Method	Input Type	Case 1 Accuracy (%)	Case 2 Accuracy (%)
SVM	Raw signal	94.62 ± 1.21	92.48 ± 1.46
KELM	Raw signal	95.38 ± 0.98	93.76 ± 1.22
1D-CNN	Raw signal	98.72 ± 0.41	97.93 ± 0.62
CNN-LSTM	Raw signal	98.91 ± 0.38	98.24 ± 0.54
1D-ResNet	Raw signal	99.08 ± 0.32	98.61 ± 0.48
ECMSE+HBA–KELM	Entropy features	99.95 ± 0.10	99.82 ± 0.18

Table 9. Classifier optimization ablation using ECMSE features.

Classifier	Case 1 Accuracy (%)	Case 2 Accuracy (%)
ECMSE + KELM	98.84 ± 0.36	98.32 ± 0.44
ECMSE + PSO-KELM	99.42 ± 0.22	99.05 ± 0.31
ECMSE + GA-KELM	99.31 ± 0.28	98.96 ± 0.36
ECMSE + HBA–KELM	99.95 ± 0.10	99.82 ± 0.18

Table 10. Statistical significance analysis over 30 runs.

Comparison	Case 1 p-Value	Case 2 p-Value
ECMSE vs. CMSE	<0.001	<0.001
ECMSE vs. DBCMSE	0.018	0.003
ECMSE vs. RCMDE	<0.001	<0.001
