Research on Rolling Bearing Fault Diagnosis Method Based on MPE and Multi-Strategy Improved Sparrow Search Algorithm Under Local Mean Decomposition

Chi, Haodong; Chen, Huiyuan

doi:10.3390/machines13040336


by Haodong Chi and Huiyuan Chen *

College of Chemical Engineering, Qinghai University, Xining 810016, China

* Author to whom correspondence should be addressed.

Machines 2025, 13(4), 336; https://doi.org/10.3390/machines13040336

Submission received: 17 March 2025 / Revised: 10 April 2025 / Accepted: 14 April 2025 / Published: 18 April 2025

(This article belongs to the Section Machines Testing and Maintenance)


Abstract

To address the issues of non-stationarity, noise interference, and insufficient discriminative power of traditional fault feature extraction methods in rolling bearing vibration signals, this paper proposes a fault diagnosis method based on multi-scale permutation entropy (MPE) and a multi-strategy improved sparrow search algorithm (MSSA) under local mean decomposition (LMD). First, LMD is employed to adaptively decompose the original signal. Effective product functions (PFs) are then selected using the Pearson correlation coefficient, enabling signal reconstruction that suppresses noise interference while preserving fault impact components. Second, to overcome the limited capability of traditional time-frequency features in representing complex fault patterns, MPE is introduced to construct a multi-scale complexity feature vector, effectively capturing the scale-dependent differences in the dynamic behavior of signals. Furthermore, considering the instability of classification caused by the empirical setting of hidden layer nodes in the extreme learning machine (ELM), a multi-strategy improved sparrow search algorithm is proposed to optimize ELM parameters. This algorithm integrates an adaptive Levy flight mechanism and dynamic reverse learning. The long-tail jump characteristics of Levy flight enhance the global search capability, while dynamic reverse learning increases population diversity, preventing premature convergence. The experimental results demonstrate that the proposed method achieves an average diagnostic accuracy of over 96% across multiple datasets, verifying its robustness in handling non-stationary signals and fault classification.

Keywords: fault diagnosis; feature extraction; local mean decomposition; multi-scale permutation entropy; adaptive Levy flight; dynamic reverse learning; sparrow search algorithm; extreme learning machine

1. Introduction

Rolling bearings serve as indispensable elements in rotating machinery, and their condition plays a crucial role in determining the equipment’s reliability and lifespan [1]. However, bearing vibration signals collected in industrial environments typically exhibit significant non-stationarity and are influenced by background noise, mechanical resonance, and multi-source interference. These factors cause early fault characteristics to be weak and easily submerged in noise. Traditional fault diagnosis methods are often based on time-domain statistical indicators [2] or frequency-domain spectral analysis [3]. However, these methods have limited capability in handling non-stationary signals and struggle to capture transient impact features. Although time–frequency analysis methods such as empirical mode decomposition (EMD) can reveal the local time–frequency characteristics of signals to some extent [4], their performance is constrained by the subjectivity of basis function selection or inherent defects in modal decomposition, leading to insufficient robustness in practical applications.

During the signal preprocessing phase, both EMD and its variant, ensemble empirical mode decomposition (EEMD) [5], are utilized to adaptively break down signals into a series of intrinsic mode functions (IMFs), providing important tools for non-stationary signal analysis. However, EMD-based methods suffer from mode mixing, where frequency components with different physical meanings are incorrectly assigned to the same IMF. Additionally, these methods are highly sensitive to noise and may generate spurious modal components. Variational mode decomposition (VMD) [6] mitigates mode mixing by constructing a constrained variational model, predefining the number of decomposition modes, and employing a bandwidth constraint in the frequency domain [7]. While VMD theoretically alleviates mode mixing, its performance heavily depends on empirical parameter selection. If the number of decomposition modes K is too small, under-decomposition may occur, preventing sufficient separation of fault features. Conversely, if K is too large, redundant modes may be generated, increasing computational complexity. Moreover, improper selection of penalty factors under noisy conditions can exacerbate energy leakage between modes, further reducing the reliability of the decomposition process.

In contrast, local mean decomposition (LMD) iteratively extracts the local mean function and envelope estimation function of a signal, decomposing the original signal into multiple product functions (PFs), each composed of a pure frequency-modulated signal multiplied by an envelope signal [8]. LMD offers complete adaptability, requires no predefined decomposition levels, and demonstrates strong robustness against boundary effects [9]. However, existing studies often directly utilize all PF components for signal reconstruction, overlooking the fact that noisy environments may introduce spurious PF components. This study puts forward a PF component selection strategy based on the Pearson correlation coefficient. By computing the correlation coefficient between each PF and the original signal, a threshold is set to eliminate noise-dominated components, thereby improving decomposition accuracy.

For feature extraction, traditional time–frequency domain parameters such as spectral kurtosis exhibit limited effectiveness in representing complex fault patterns. These parameters may overlap in frequency bands, making it difficult to distinguish faults using a single-scale energy distribution. Multi-scale entropy (MSE) computes sample entropy at different coarse-graining scales to reflect dynamic system behavior [10]. However, MSE suffers from low computational efficiency and may lose high-frequency details during the coarse-graining process. In comparison, multi-scale permutation entropy (MPE) [11,12], which quantifies entropy based on the probability distribution of data permutation patterns in time series, provides a more effective means of capturing signal complexity while balancing computational efficiency and noise resistance.

For fault classification, the ELM is widely used due to its single-hidden-layer structure and randomly initialized weights, enabling fast training and real-time diagnosis [13]. However, the number of neurons within the hidden layer in an ELM must be manually set, and the random initialization of weights can lead to fluctuations in classification performance, severely affecting the stability of the diagnostic system. To optimize ELM parameters, swarm intelligence techniques, including particle swarm optimization (PSO) [14] and genetic algorithms (GA) [15], have found broad applications. These techniques can help fine-tune the model, improving its diagnostic accuracy. However, traditional algorithms such as PSO and GA suffer from slow convergence and susceptibility to local optima. PSO, in particular, experiences a rapid loss of population diversity when optimizing high-dimensional parameters, leading to premature convergence, while GA incurs high computational costs due to crossover and mutation operations, making it less suitable for real-time applications.

In addition to PSO and GA, other optimization methods such as Ant Colony Optimization (ACO) and Artificial Bee Colony (ABC) algorithms have also been explored for ELM parameter optimization. ACO has shown effectiveness in finding global optima by mimicking the foraging behavior of ants, but its performance can be sensitive to parameter settings. Similarly, ABC offers a balance between exploration and exploitation, but it may require numerous iterations to converge, impacting real-time implementation. Moreover, Differential Evolution (DE), which is known for its simplicity and strong global search ability, has also been applied, though it can struggle with convergence speed in high-dimensional optimization problems.

The SSA achieves a balance between convergence speed and global search capability by simulating the discoverer–follower foraging mechanism and anti-predation behavior of sparrows [16]. However, the original SSA suffers from decreased population diversity in the later iterations, causing individuals to converge around local optima and struggle to escape [17,18].

To overcome these limitations, this study proposes a rolling ball-bearing fault diagnosis framework that integrates improved LMD, optimized MPE, and a multi-strategy SSA to address the shortcomings of traditional methods in signal decomposition, feature extraction, and classification optimization. In the signal decomposition stage, the improved LMD method, combined with correlation coefficient-based PF selection, effectively removes noise interference. In the feature extraction stage, a multi-scale permutation entropy approach with adaptive scale selection is employed, overcoming the limitations of traditional MSE in capturing high-frequency details and improving fault feature identification. In the classification optimization stage, to prevent SSA from stagnating when optimizing ELM’s hidden layer nodes, an adaptive Levy flight mechanism and dynamic reverse learning strategy are introduced. These enhancements improve the algorithm’s global search and local exploitation capabilities, thereby increasing the model’s stability and accuracy. The proposed framework provides a high-precision and robust solution for bearing fault diagnosis under complex noise conditions and varying operating environments in industrial applications.

2. Signal Decomposition, Reconstruction, and Feature Extraction

2.1. Bearing Signal Decomposition and Reconstruction

LMD is an adaptive technique designed to decompose non-stationary signals. It effectively separates transient impact components and noise interference in complex vibration signals. The core concept of LMD is to iteratively extract the local mean function and envelope estimation function of a signal, decomposing the original signal into several PFs, each representing an oscillatory mode with different physical significance [19]:

(1): Extraction of Extremum Points and Interpolation: Given a bearing vibration signal $x (t)$ , the local extremum points are first identified, including the sequence of local maxima ${\max_{k}}$ and the sequence of local minima ${\min_{k}}$ . Using cubic spline interpolation, the local mean function $m_{11} (t)$ and the envelope estimation function $a_{11} (t)$ are constructed.

$m_{11} (t) = \frac{\max_{k} (t) + \min_{k} (t)}{2}$

(1)

$a_{11} (t) = \frac{\max_{k} (t) - \min_{k} (t)}{2}$

(2)

where $m_{11} (t)$ represents the local mean function derived during the first iteration of the first decomposition stage, and $a_{11} (t)$ is the envelope estimation function computed in the same iteration.

(2): Iterative Demodulation to Generate Frequency-Modulated Components: The local mean function is removed to obtain the mean-corrected signal $h_{11} (t)$ .

$h_{11} (t) = x (t) - m_{11} (t)$

(3)

where the $h_{11} (t)$ is the mean-corrected residual signal after removing $m_{11} (t)$ in the first iteration of PF1.

Further demodulation of $h_{11} (t)$ results in the extraction of the pure frequency-modulated component $s_{11} (t)$.

$s_{11} (t) = \frac{h_{11} (t)}{a_{11} (t)}$

(4)

If the envelope estimation function $a_{12} (t)$ of $s_{11} (t)$ satisfies $\max (a_{12} (t)) \leq 1$, $s_{11} (t)$ is considered to meet the pure frequency modulation condition. Otherwise, $s_{11} (t)$ is used as a new signal, and steps (1)–(2) are repeated until the iteration termination condition is met.

(3): PF Component Generation and Residual Signal Update: After $n$ iterations, the first PF component is obtained by multiplying the final frequency-modulated component with the cumulative envelope function.

${PF}_{1} (t) = a_{1} (t) \cdot s_{1 n} (t)$

(5)

$a_{1} (t) = \prod_{k = 1}^{n} a_{1 k} (t)$

(6)

The residual signal $u_{1} (t)$ is then used as a new input, and the above decomposition process is repeated until the residual signal exhibits a monotonic trend or consists mainly of noise.

$u_{1} (t) = x (t) - {PF}_{1} (t)$

(7)
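A single pass of steps (1)–(2) can be sketched in a few lines. The sketch below is illustrative only: it assumes piecewise-linear envelopes through the extrema instead of the cubic splines named above (to stay dependency-free), it holds the envelope constant beyond the first and last extremum, and the function names `local_extrema`, `envelope`, and `lmd_iteration` are hypothetical.

```python
def local_extrema(x):
    """Return index lists of strict local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation stands in for the cubic spline of the
    text; values outside the extrema range are held constant.
    """
    env = []
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(float(x[idx[0]]))
        elif t >= idx[-1]:
            env.append(float(x[idx[-1]]))
        else:
            # Locate the pair of extrema that brackets sample t.
            k = max(j for j in range(len(idx)) if idx[j] <= t)
            t0, t1 = idx[k], idx[k + 1]
            w = (t - t0) / (t1 - t0)
            env.append((1 - w) * x[t0] + w * x[t1])
    return env


def lmd_iteration(x):
    """One demodulation pass: returns m, a, h, s as in Eqs. (1)-(4)."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        raise ValueError("signal has no interior extrema")
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    m = [(u + l) / 2 for u, l in zip(upper, lower)]  # Eq. (1)
    a = [(u - l) / 2 for u, l in zip(upper, lower)]  # Eq. (2)
    h = [xi - mi for xi, mi in zip(x, m)]            # Eq. (3)
    # Eq. (4); assumes the envelope estimate a is nonzero everywhere.
    s = [hi / ai for hi, ai in zip(h, a)]
    return m, a, h, s
```

For a signal that already has unit envelope and zero local mean, such as `[0, 1, 0, -1, 0, 1, 0, -1, 0]`, this pass returns `m` equal to zero, `a` equal to one, and `s` equal to the input, so the pure frequency modulation condition would be met after the first iteration.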

The convergence of LMD can be demonstrated by the monotonically decreasing energy of the residual signal. Suppose the residual signal after the $k$-th decomposition is $u_{k} (t)$, with its energy defined as:

$E (u_{k}) = \int_{- \infty}^{\infty} {|u_{k} (t)|}^{2} d t$

(8)

Since the energy of each extracted PF component satisfies $E ({PF}_{k}) > 0$, the residual signal energy satisfies:

$E (u_{k + 1}) = E (u_{k}) - E ({PF}_{k})$

(9)

This indicates that the energy of the residual signal strictly decreases during the decomposition process, ensuring algorithm convergence.

After signal decomposition, each PF corresponds to signal characteristics in different frequency bands. However, real-world signals often contain noise and other interference, resulting in the presence of irrelevant or noise-dominated components. Directly using all components for feature extraction and diagnosis not only increases computational complexity but may also lead to extracted features deviating from the actual characteristics, thereby affecting the diagnostic outcome.

To optimize the decomposition results, modal components need to be screened and reconstructed. This study employs the Pearson correlation coefficient to filter components, which quantifies the linear correlation between each component and the original signal to determine whether a component contains essential feature information. Selected components are retained, while those related to noise are discarded. This approach suppresses noise while preserving critical feature information in the signal.

For each component ${PF}_{i} (t)$, the Pearson correlation coefficient $R_{i}$ with the original signal $x (t)$ is calculated as follows:

$R_{i} = \frac{\sum_{t = 1}^{N} ({PF}_{i} (t) - \overline{PF}_{i}) (x (t) - \bar{x})}{\sqrt{\sum_{t = 1}^{N} {({PF}_{i} (t) - \overline{PF}_{i})}^{2} \sum_{t = 1}^{N} {(x (t) - \bar{x})}^{2}}}$

(10)

where $\bar{x}$ and $\overline{PF}_{i}$ represent the mean values of the corresponding signals and $N$ is the number of samples. The correlation coefficient $R_{i}$ lies in $[-1, 1]$. When $|R_{i}|$ is close to 1, it indicates a strong linear correlation between the two variables, while a value approaching 0 suggests a weak correlation.

A correlation threshold $\tau$ is set, whereby if $|R_{i}| \geq \tau$, the component is considered to contain essential feature information [20]. Otherwise, it is regarded as noise or irrelevant information and is removed. The selected components are then used for signal reconstruction, with the reconstructed signal $x_{r} (t)$ expressed as:

$x_{r} (t) = \sum_{i \in S} {PF}_{i} (t)$

(11)

where $S$ is the set of components satisfying $|R_{i}| \geq \tau$.
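The screening and reconstruction in Eqs. (10) and (11) can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the PF components are assumed to be plain lists of equal length, the helper names `pearson` and `reconstruct` are hypothetical, and the default `tau=0.3` simply mirrors the threshold reported in Section 4.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences, Eq. (10).

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def reconstruct(pfs, x, tau=0.3):
    """Keep the PFs with |R_i| >= tau and sum them, Eq. (11).

    Returns the indices of the retained components and the reconstructed signal.
    """
    kept = [i for i, pf in enumerate(pfs) if abs(pearson(pf, x)) >= tau]
    x_r = [sum(pfs[i][t] for i in kept) for t in range(len(x))]
    return kept, x_r


# Toy example: one dominant oscillation plus a weak, uncorrelated component.
pf_a = [1, -1, 1, -1, 1, -1, 1, -1]
pf_b = [0.1, 0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1]
x = [a + b for a, b in zip(pf_a, pf_b)]
kept, x_r = reconstruct([pf_a, pf_b], x)
```

In this toy case the weak component correlates with the composite signal at roughly 0.1, so only the first component is retained and the reconstruction equals `pf_a`.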

2.2. Multi-Scale Permutation Entropy Feature Extraction

Permutation entropy, first introduced by Bandt [21], is an entropy measure used to evaluate the complexity of one-dimensional time series and is widely employed in extracting features related to mechanical faults. However, fault characteristic information in bearing operation is distributed across multiple scales. Analyzing the signal using single-scale permutation entropy may lead to the omission of fault features present at other scales. To address this limitation, Costa [22] proposed the concept of MPE. MPE is an extension of permutation entropy that applies multi-scale coarse-graining to a time series before computing its permutation entropy at different scales. The calculation process is as follows:

Given a one-dimensional time series $X = \{x_{i}, i = 1, 2, \dots, N\}$, a multi-scale coarse-graining process is performed by averaging non-overlapping segments of data points at increasing scale factors. The coarse-grained time series is expressed as:

$y_{j}^{\tau} = \frac{1}{\tau} \sum_{i = (j - 1) \tau + 1}^{j \tau} x_{i}$

(12)

where $\tau$ represents the scale factor. When $\tau = 1$, the coarse-grained series is identical to the original series. The length of each coarse-grained series decreases as the scale factor increases.

For each coarse-grained series $\{y_{j}, j = 1, 2, \dots, m\}$, phase space reconstruction is performed in an $m$-dimensional embedding space:

$Y_{j} = \{y (j), y (j + \tau), \dots, y (j + (m - 1) \tau)\}$

(13)

where $m$ is the embedding dimension and $\tau$ is the time delay. For each $j$, the $m$-dimensional real-valued sequence is arranged in ascending order:

$y (j + (j_{1} - 1) \tau) \leq y (j + (j_{2} - 1) \tau) \leq \dots \leq y (j + (j_{m} - 1) \tau)$

(14)

For any sequence $Y_{j}$, a corresponding symbolic sequence $S (r)$ is obtained, where $r = 1, 2, \dots, k$ and $k \leq m!$. The maximum number of distinct symbol sequences is $m!$ in an embedding space of dimension $m$. The probability of occurrence of a given symbolic sequence $S (r)$ is denoted as $P_{r}$, and the permutation entropy is defined in the form of Shannon entropy:

$H_{p} (m) = - \sum_{r = 1}^{k} P_{r} \ln P_{r}$

(15)

where $H_{p} (m)$ attains its maximum value $\ln (m!)$ when all symbol sequences have equal probability, $P_{r} = \frac{1}{m!}$. The normalized permutation entropy is given by:

$0 \leq H_{p} = H_{p} (m) / \ln (m!) \leq 1$

(16)

The value of $H_{p}$ reflects the degree of randomness in the time series. A larger $H_{p}$ value indicates a higher level of irregularity in the time series, whereas a smaller $H_{p}$ value suggests greater regularity.
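The calculation in Eqs. (12)–(16) can be sketched as below. The text uses the same symbol for the scale factor and the time delay; the sketch keeps them apart as `scale` and `delay`. The function names, the defaults `m=3`, `delay=1`, `max_scale=5`, and the tie-breaking rule (equal values ordered by position) are assumptions made for illustration.

```python
import math
from collections import Counter


def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale`, Eq. (12)."""
    n = len(x) // scale
    return [sum(x[j * scale:(j + 1) * scale]) / scale for j in range(n)]


def permutation_entropy(y, m=3, delay=1):
    """Normalized permutation entropy of y, Eqs. (13)-(16)."""
    n_vec = len(y) - (m - 1) * delay
    if n_vec <= 0:
        raise ValueError("series too short for this embedding")
    patterns = Counter()
    for j in range(n_vec):
        window = [y[j + k * delay] for k in range(m)]  # Eq. (13)
        # Ordinal pattern: positions sorted by value (stable sort, Eq. (14)).
        patterns[tuple(sorted(range(m), key=lambda k: window[k]))] += 1
    h = -sum((c / n_vec) * math.log(c / n_vec) for c in patterns.values())
    return h / math.log(math.factorial(m))             # Eq. (16)


def mpe(x, m=3, delay=1, max_scale=5):
    """Feature vector of normalized permutation entropies at scales 1..max_scale."""
    return [permutation_entropy(coarse_grain(x, s), m, delay)
            for s in range(1, max_scale + 1)]
```

A strictly increasing series produces a single ordinal pattern and therefore an entropy of zero, while an uncorrelated random series spreads almost evenly over all $m!$ patterns and yields a value close to one.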

3. Rolling Bearing Fault Diagnosis Method Based on ELM Optimized by the Multi-Strategy Improved Sparrow Search Algorithm

3.1. Extreme Learning Machine

The ELM is a learning algorithm designed for Single Hidden Layer Feedforward Neural Networks (SLFNs). Its key feature is the random assignment of input layer weights and hidden layer biases, while only the output layer weights need to be solved analytically. This structure allows the ELM to have a fast training speed and strong generalization ability and avoids local minima issues common in traditional neural networks [23].

In an ELM model, we assume the training samples are given by $\{(x_{i}, y_{i})\}_{i = 1}^{N}$, where $x_{i} \in \mathbb{R}^{d}$ represents the input features and $y_{i} \in \mathbb{R}^{m}$ is the corresponding output. For an SLFN with $L$ hidden nodes, the output function is expressed as:

$f (x) = \sum_{j = 1}^{L} \beta_{j} g (w_{j} \cdot x + b_{j})$

(17)

where $w_{j} \in \mathbb{R}^{d}$ and $b_{j} \in \mathbb{R}$ are the input weight and bias of the $j$-th hidden node, $g (\cdot)$ is the activation function, and $\beta_{j} \in \mathbb{R}^{m}$ represents the output layer weights. In the ELM, both $w_{j}$ and $b_{j}$ are randomly initialized, significantly reducing the complexity of parameter tuning during training.

To ensure that the network perfectly matches all training samples, for any $i = 1, 2, \dots, N$, the following equation must hold:

$\sum_{j = 1}^{L} \beta_{j} g (w_{j} \cdot x_{i} + b_{j}) = y_{i}$

(18)

We define the hidden layer output matrix $H$, where each element is given by:

$H_{i j} = g (w_{j} \cdot x_{i} + b_{j}), \quad i = 1, 2, \dots, N, \quad j = 1, 2, \dots, L$

(19)

Thus, the network output can be written in matrix form as:

$H \beta = Y$

(20)

where $\beta$ is the weight matrix to be determined and $Y$ represents the target output matrix. Since $w_{j}$ and $b_{j}$ have already been randomly assigned, solving for $\beta$ reduces to solving a linear least squares problem. Typically, the Moore–Penrose pseudo-inverse $H^{\dagger}$ is used to obtain $\beta$:

$\beta = H^{\dagger} Y$

(21)

This analytical approach eliminates the need for iterative parameter updates, significantly reducing training time. Theoretical results have demonstrated that when the hidden node count is sufficiently high, the ELM is capable of approximating any continuous function with high probability, satisfying the universal approximation theorem. This provides a solid theoretical foundation for ELM’s application in nonlinear classification and regression tasks.
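The training procedure in Eqs. (17)–(21) can be illustrated with a small dependency-free sketch. Two choices here are assumptions rather than details from the text: the activation $g$ is taken to be a sigmoid, and the pseudo-inverse is replaced by ridge-regularized normal equations, $(H^{T} H + \epsilon I) \beta = H^{T} Y$ with a small $\epsilon$, solved by Gauss–Jordan elimination. The names `solve`, `hidden_output`, `train_elm`, and `predict` are hypothetical.

```python
import math
import random


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + list(rhs) for row, rhs in zip(A, B)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def hidden_output(X, W, b):
    """Hidden layer matrix H of Eq. (19) with a sigmoid activation (assumed)."""
    return [[1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + bj)))
             for w, bj in zip(W, b)] for x in X]


def train_elm(X, Y, L, ridge=1e-6, seed=0):
    """Draw random input weights and biases, then solve for beta analytically."""
    rng = random.Random(seed)
    d, n, m = len(X[0]), len(X), len(Y[0])
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(L)]
    b = [rng.uniform(-1, 1) for _ in range(L)]
    H = hidden_output(X, W, b)
    # Ridge-regularized normal equations stand in for the pseudo-inverse of Eq. (21).
    HtH = [[sum(H[i][p] * H[i][q] for i in range(n)) + (ridge if p == q else 0.0)
            for q in range(L)] for p in range(L)]
    HtY = [[sum(H[i][p] * Y[i][k] for i in range(n)) for k in range(m)]
           for p in range(L)]
    beta = solve(HtH, HtY)
    return W, b, beta


def predict(X, model):
    """Network output f(x) of Eq. (17) for each row of X."""
    W, b, beta = model
    H = hidden_output(X, W, b)
    return [[sum(h[j] * beta[j][k] for j in range(len(beta)))
             for k in range(len(beta[0]))] for h in H]


# Toy regression: learn y = x1 + x2 on four points with L = 10 hidden nodes.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0.0], [1.0], [1.0], [2.0]]
model = train_elm(X, Y, L=10)
pred = predict(X, model)
```

Only `beta` is fitted; `W` and `b` stay at their random draws, which is why results can vary with the seed and with `L`.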

However, the performance of ELM is highly dependent on the selection of the number of hidden layer nodes. If too few nodes are chosen, the model may underfit and fail to capture complex features in the data. Conversely, an excessive number of nodes may lead to overfitting and increased computational costs. Therefore, selecting an appropriate number of hidden nodes and optimizing the model parameters are crucial challenges in practical applications. In the following sections, this study will employ a swarm intelligence optimization algorithm to fine-tune ELM parameters, thereby enhancing the overall accuracy and robustness of the fault diagnosis model.

3.2. Multi-Strategy Improved Sparrow Search Algorithm

The Sparrow Search Algorithm represents a novel approach in swarm intelligence optimization that simulates the foraging and anti-predation behaviors of a sparrow population, demonstrating a good balance between exploration and exploitation [24]. However, when addressing high-dimensional and complex optimization problems such as ELM parameter tuning, the original SSA suffers from inherent drawbacks, including insufficient population diversity, a tendency to become trapped in local optima, and reduced efficiency in the later stages of convergence. To resolve these issues, this paper proposes a multi-strategy improved sparrow algorithm that incorporates an adaptive Levy flight mechanism and a dynamic opposition-based learning strategy to enhance the algorithm’s global search capability and robustness, thereby providing an efficient and stable solution for ELM parameter optimization.

In the original SSA, individuals in the population are typically divided into “discoverers”, who are responsible for searching the global optimum region, “followers”, who perform local exploitation by closely tracking the discoverers, and a few individuals who serve as vigilantes to prevent the population from entering dangerous regions. The update formula for the positions of the discoverers is given by:

$X_{i}^{t + 1} = X_{i}^{t} \cdot \exp (- \frac{i}{\alpha \cdot T}) + Q$

(22)

where $X_{i}^{t}$ denotes the position of the $i$-th discoverer at iteration $t$, $\alpha$ is a control factor, $T$ is the maximum number of iterations, and $Q$ is a random disturbance term.

The update formula for the positions of the followers is:

$X_{i}^{t + 1} = X_{i}^{t} + |X_{i}^{t} - X_{\mathrm{best}}^{t}| \cdot L$

(23)

where $X_{\mathrm{best}}^{t}$ represents the position of the best individual in the current population and $L$ is a random step-size coefficient.

Although SSA exhibits fast convergence and strong optimization performances in global search, it still faces shortcomings such as a tendency to fall into local optima and premature convergence when handling high-dimensional or complex optimization problems. To address these issues, this paper proposes two key strategies to improve SSA and applies them to optimize ELM parameters.

The adaptive Levy flight mechanism exploits the long-tail property of the Levy distribution to generate random step lengths, enabling individuals to perform long-distance jumps with a higher probability during the search process, which helps to escape local optima [25]. Specifically, when updating the positions of the discoverers, an adaptive probability $P (t)$ is introduced to control the application of the Levy perturbation, that is:

$X_{i}^{t + 1} = X_{i}^{t} + \alpha \cdot L (\lambda)$

(24)

where $\alpha$ is the step-size adjustment factor, $L (\lambda)$ denotes the random step length sampled from the Levy distribution, and the parameter $\lambda$ controls the magnitude of the jump. The expression for $L (\lambda)$ is given by:

$L (\lambda) = \frac{\Gamma (1 + \lambda) \cdot \sin (\pi \lambda / 2)}{\Gamma (\frac{1 + \lambda}{2}) \cdot \lambda \cdot 2^{(\lambda - 1) / 2}}$

(25)
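The expression in Eq. (25) has the same form as the scale term in Mantegna's algorithm, a common way of drawing Levy-distributed steps. The sketch below assumes that sampling scheme, which the text does not spell out: the expression is raised to the power $1 / \lambda$ to give the standard deviation of a Gaussian numerator, and the step is that draw divided by $|v|^{1 / \lambda}$ for a standard normal $v$. The function names and the per-dimension application of Eq. (24) are likewise assumptions.

```python
import math
import random


def levy_scale(lam):
    """Evaluate the right-hand side of Eq. (25)."""
    num = math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
    den = math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2)
    return num / den


def levy_step(lam, rng):
    """Mantegna-style step u / |v|**(1/lam) (assumed sampling scheme)."""
    sigma_u = levy_scale(lam) ** (1 / lam)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / lam)


def levy_update(x, alpha, lam, rng):
    """Apply Eq. (24) with an independent Levy step in each dimension."""
    return [xi + alpha * levy_step(lam, rng) for xi in x]


rng = random.Random(0)
new_position = levy_update([0.0, 0.0, 0.0], alpha=0.01, lam=1.5, rng=rng)
```

Most steps drawn this way are small, but the heavy tail occasionally produces a large jump, which is the behavior the mechanism relies on to leave a local optimum.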

The dynamic opposition-based learning strategy aims to expand the search space by generating the oppositional solution of the current solution, thereby enhancing population diversity and preventing premature convergence [26]. For each individual $X_{i}$, its oppositional solution is generated according to the following formula:

$X_{i}^{\prime} = \gamma (t) \cdot (X_{\max} + X_{\min}) - X_{i}$

(26)

where $X_{\max}$ and $X_{\min}$ denote the upper and lower bounds of the search space, respectively, and the shrinking factor $\gamma (t)$ is dynamically adjusted with the iteration count $t$ so that the oppositional solution maintains high diversity in the early stages and gradually converges to a narrower search region in the later stages. It is generally set as:

$\gamma (t) = \gamma_{\max} - (\gamma_{\max} - \gamma_{\min}) \cdot \frac{t}{T}$

(27)

By combining the adaptive Levy flight mechanism with the dynamic opposition-based learning strategy, the improved SSA update formula for the discoverers can be expressed as:

$X_{i}^{t + 1} = \gamma (t) \cdot (X_{\max} + X_{\min}) - X_{i}^{t} + P (t) \cdot \alpha \cdot L (\lambda)$

(28)
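Eqs. (26)–(28) translate almost directly into code. In the sketch below the values of $\gamma_{\max}$ and $\gamma_{\min}$ (1.0 and 0.5) are placeholders, since the text does not report them, and $P (t)$ and the Levy steps are passed in as plain arguments because their exact form is not given here. The function names are hypothetical.

```python
def gamma_schedule(t, T, g_max=1.0, g_min=0.5):
    """Linearly shrinking factor of Eq. (27); g_max and g_min are placeholders."""
    return g_max - (g_max - g_min) * t / T


def opposite(x, lower, upper, gamma):
    """Oppositional solution of Eq. (26), computed per dimension."""
    return [gamma * (ub + lb) - xi for xi, lb, ub in zip(x, lower, upper)]


def improved_update(x, lower, upper, gamma, p, alpha, levy_steps):
    """Combined discoverer update of Eq. (28).

    `p` plays the role of P(t) and `levy_steps` holds one Levy step per
    dimension; both are supplied by the caller.
    """
    opp = opposite(x, lower, upper, gamma)
    return [oi + p * alpha * li for oi, li in zip(opp, levy_steps)]


# With gamma = 1 the opposite of x = 3 in [0, 10] is 7.
example = opposite([3.0], [0.0], [10.0], gamma_schedule(0, 100))
```

With $\gamma (t) = 1$ the oppositional point is the mirror image of $X_{i}$ about the centre of the search interval; as $\gamma (t)$ shrinks, the mirrored point is pulled toward smaller values, so in practice a bound check after the update may be needed.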

3.3. MSSA-ELM-Based Bearing Fault Diagnosis Model

The MSSA-ELM-based bearing fault diagnosis model first decomposes the acquired vibration signals using LMD to obtain several PFs. Then, the Pearson correlation coefficient is employed to analyze the correlation of the PF components, and only the components that are highly correlated with the original signal are selected for reconstruction to suppress noise interference. Next, the MPE of the reconstructed signal is calculated, and a coarse-graining process is used to extract the complexity features of the permutation patterns at different scales, thereby constructing a multi-dimensional fault feature vector. Based on this feature vector, the population size, maximum number of iterations, Levy flight coefficient, and contraction factor for opposition-based learning are initialized for the sparrow population, and iterative optimization is performed through a multi-strategy cooperative mechanism. In each iteration, the classification accuracy of the ELM is used as the fitness function to evaluate individual performance and update the global best position until the convergence condition is met. Once the optimization is complete, the optimal combination of the number of hidden layer nodes and weight parameters is obtained to construct a high-accuracy classification model. Finally, the optimized ELM model is applied to the classification of bearing fault features, and the fault type identification results are produced. The algorithm implementation process is illustrated in Figure 1.

4. Experimental Analysis

This experiment employs a bearing dataset from Xi’an Jiaotong University to analyze the characteristics and fault evolution of bearing vibration signals under varying radial loads and rotational speeds. The experimental setup includes an AC motor, a motor speed controller, a shaft, supporting bearings, a hydraulic loading system, and the test bearing. The hydraulic loading system provides a precise radial force so that the load is evenly applied to the bearing housing. The motor speed controller adjusts the speed to ensure stable operation of the bearing under various conditions. The experimental setup is shown in Figure 2.

The test bearing selected for this experiment is the LDK UER204 rolling bearing, which was sourced from Zhejiang Changxing Shengyang Technology Co., Ltd., Changxing, China. Its parameters are listed in Table 1. The experiment was designed with three types of operating conditions, as detailed in Table 2, with five bearings tested under each condition. This platform simulates the effects of actual operating conditions on bearing vibration signals under controlled circumstances. It provides a reliable data foundation for the subsequent study of fault diagnosis models.

Under low-speed operating conditions, bearing fault signals often exhibit weak characteristics mixed with strong noise interference, making accurate feature extraction challenging and ultimately compromising diagnostic precision. To address this, we first analyze the LMD method in detail. The original vibration signal is decomposed into several PF components via LMD. Combined with signal reconstruction techniques, diagnostically relevant features are extracted to construct feature vectors for subsequent fault diagnosis. Figure 3 illustrates the raw signal acquired from the test bench, while Figure 4 shows the PF components obtained after LMD decomposition.

As shown in Figure 4, the PF components exhibit distinct frequency characteristics. Certain components contain high-frequency information that reflects subtle changes in localized faults or abnormal wear. However, not all components are diagnostically useful—some are dominated by noise or irrelevant information, which may obscure critical fault features and reduce diagnostic accuracy. Therefore, it is necessary to filter the decomposed PF components. This study quantifies the linear correlation between each PF component and the raw signal using Pearson correlation coefficients. A threshold is set to retain only highly correlated components for signal reconstruction, effectively suppressing noise and providing reliable inputs for feature extraction. The correlation coefficients of PF components under different bearing states are listed in Table 3.

Based on Table 3, the top three PF components with Pearson correlation coefficients exceeding a threshold of τ = 0.3 were selected for signal reconstruction. The rationale for this threshold was derived from a sensitivity analysis conducted across multiple candidate values (τ = 0.2, 0.3, and 0.4) to balance noise suppression and fault feature retention. To validate the threshold selection, classification accuracy and signal-to-noise ratio (SNR) improvements were evaluated across these τ values using a validation dataset of 120 bearing vibration signals. At τ = 0.3, the reconstructed signals achieved optimal performance with 92.1% fault classification accuracy and an average SNR gain of 8.7 dB, outperforming τ = 0.2 (89.5% accuracy, SNR +6.2 dB) and τ = 0.4 (85.3% accuracy, SNR +7.1 dB). This aligns with Cohen’s convention for correlation strength, where |r| ≥ 0.3 represents a moderate association [22]. Lower thresholds (τ < 0.3) retained noise-dominated components, while higher thresholds (τ > 0.4) discarded critical weak fault signatures in later PF components. The sensitivity analysis confirms that τ = 0.3 optimally preserves diagnostically relevant information while suppressing stochastic noise, ensuring robust performance across all four fault categories. This empirically grounded threshold selection strategy enhances the generalizability of the reconstruction method compared to ad hoc parameter choices. Figure 5 compares the raw and reconstructed signals under different bearing states.

The reconstructed signals exhibit enhanced signal-to-noise ratios (SNRs) and clearer fault signatures. High-frequency components associated with localized faults and wear are retained, while redundant noise is eliminated. This optimization improves the visibility of fault-induced impulses in both time and frequency domains, providing a robust foundation for subsequent feature extraction and pattern recognition.

To validate the proposed algorithm, a multidimensional test function with random perturbations is employed for simulation:

$F (x) = \sum_{i = 1}^{\dim} x_{i}^{4} + \mathrm{rand} (0, 1)$

(29)

where $x$ is the optimization vector, $\dim$ denotes dimensionality, and $\mathrm{rand} (0, 1)$ introduces random noise.
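For reference, Eq. (29) can be written as a short function. The sketch assumes that $\mathrm{rand} (0, 1)$ is a uniform draw on $[0, 1)$ made afresh at every evaluation; the name `noisy_quartic` is hypothetical.

```python
import random


def noisy_quartic(x, rng=random):
    """Test function of Eq. (29): sum of fourth powers plus uniform noise."""
    return sum(xi ** 4 for xi in x) + rng.random()


rng = random.Random(1)
value = noisy_quartic([1.0, 2.0], rng)  # 1 + 16 plus a draw from [0, 1)
```

Because the noise term changes on every call, two evaluations of the same point generally differ, so an optimizer can only approach the noise-free minimum at the origin to within the noise range.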

This “bowl-shaped” function evaluates the algorithm’s ability to handle high-dimensional nonlinear optimization under uncertainty. Figure 6 and Figure 7 visualize the function and optimization convergence.
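As a small illustration of Equation (29), the test function can be written directly in code. The function name `noisy_quartic`, the 30-dimensional example, and the fixed seed are assumptions made for this sketch only.

```python
import random

def noisy_quartic(x, rng=random):
    """Eq. (29): sum of x_i^4 plus one uniform noise draw from [0, 1)."""
    return sum(xi ** 4 for xi in x) + rng.random()

rng = random.Random(0)  # seeded only to make the sketch repeatable

at_origin = noisy_quartic([0.0] * 30, rng)   # deterministic part is 0
off_origin = noisy_quartic([1.0, 2.0], rng)  # deterministic part is 1 + 16
```

Because the noise term is redrawn at every evaluation, the same point can return different fitness values, and the best achievable value at the origin lies in [0, 1) rather than being exactly zero.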

The traditional SSA typically improves the fitness value rapidly during the early iterations but then stagnates as the algorithm struggles to make further progress. In contrast, the MSSA, which incorporates adaptive Levy flight and dynamic opposition-based learning, converges more robustly. It reaches significantly lower fitness values in the early iterations, continues to decrease steadily, and ultimately converges to the optimal solution, showing superior performance in both global exploration and local exploitation.
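The two ingredients named above can be sketched generically. The code below is not the paper's update rule: it assumes Mantegna's algorithm for drawing Levy-distributed steps, the basic opposition formula (lower + upper − x), and population-derived bounds as one common reading of "dynamic"; the adaptive step scaling is not modeled, and all function names are hypothetical.

```python
import math
import random

def mantegna_sigma(beta=1.5):
    """Scale of the numerator Gaussian in Mantegna's Levy-step algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)

def levy_step(beta=1.5, rng=random):
    """One heavy-tailed step: mostly small moves, occasionally a long jump."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero draw
    return u / abs(v) ** (1 / beta)

def dynamic_bounds(population):
    """Per-dimension min and max of the current population (assumed form)."""
    dims = range(len(population[0]))
    lower = [min(p[d] for p in population) for d in dims]
    upper = [max(p[d] for p in population) for d in dims]
    return lower, upper

def opposite_point(x, lower, upper):
    """Opposition-based candidate: reflect x within [lower, upper]."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]

population = [[1.0, 8.0], [3.0, 2.0], [5.0, 4.0]]
lower, upper = dynamic_bounds(population)
mirrored = opposite_point(population[0], lower, upper)
```

In a scheme like this, the opposite candidate would replace the original only if it has better fitness, which is how opposition-based learning adds diversity without discarding good solutions.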

Two optimized ELM models—SSA-ELM and MSSA-ELM—are compared with the baseline ELM for bearing fault diagnosis. Fault states are labeled as follows: normal (Label 1), inner race fault (Label 2), outer race fault (Label 3), and roller fault (Label 4). Figure 8 and Table 4 summarize the diagnostic accuracies.

The experimental results demonstrate that the traditional ELM model achieves an overall accuracy of 87.5% in bearing fault diagnosis, with moderate recognition rates for the individual fault states. In contrast, the SSA-ELM model, optimized using the SSA, improves the overall accuracy to 91.875%, indicating that the global search mechanism effectively enhances parameter configuration and classification performance. Furthermore, the MSSA-ELM model, which integrates an adaptive Levy flight mechanism and a dynamic opposition-based learning strategy, achieves an overall accuracy of 96.875%, with at least 96.25% on every fault category. This finding highlights the advantages of multi-strategy optimization in balancing global exploration and local exploitation, effectively avoiding local optima.
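The overall figures can be cross-checked against the per-class accuracies in Table 4. The sketch below assumes the test set is balanced across the four labels, so that the overall accuracy is the unweighted mean of the per-class values; the dictionary and function names are illustrative.

```python
# Per-class accuracies (%) for Labels 1-4, copied from Table 4.
TABLE4 = {
    "ELM":      [87.5, 88.75, 86.25, 87.5],
    "SSA-ELM":  [91.25, 91.25, 92.5, 92.5],
    "MSSA-ELM": [96.25, 97.5, 96.25, 97.5],
}

def overall_accuracy(per_class):
    """Unweighted mean; equals overall accuracy only for balanced classes."""
    return sum(per_class) / len(per_class)

overall = {model: overall_accuracy(acc) for model, acc in TABLE4.items()}

# Gains of MSSA-ELM in percentage points.
gain_over_elm = overall["MSSA-ELM"] - overall["ELM"]
gain_over_ssa = overall["MSSA-ELM"] - overall["SSA-ELM"]
```

The unweighted means reproduce the "Overall" column exactly, which is consistent with equal class sizes, and they put the MSSA-ELM gain at 9.375 percentage points over ELM and 5.0 over SSA-ELM.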

To further evaluate the practical applicability of the proposed algorithm, a bearing signal acquisition system was set up in the laboratory, as illustrated in Figure 9. This setup is designed to collect vibration data from bearings operating under different conditions, including multiple rolling element types (ball, tapered roller, and spherical roller bearings), to comprehensively assess the method’s versatility. It consists primarily of a driving motor, bearing housing, couplings, and multiple sensors. By adjusting rotational speed and loading conditions, the system simulates diverse working states of bearings. Additionally, a data acquisition system is employed to record real-time vibration signals, rotational speed, and other key parameters, providing comprehensive and accurate experimental data for subsequent fault diagnosis analysis. By thoroughly assessing the robustness and accuracy of the algorithm in distinguishing normal and faulty states, its feasibility and effectiveness in real-world engineering applications can be further validated.

Based on the collected bearing vibration data, the feature extraction and fault labeling processes were conducted as described in the previous sections. The datasets were used to train and evaluate three models: ELM, SSA-ELM, and MSSA-ELM. The comparative results of the diagnostic accuracy across different datasets are presented in Figure 10.

The results indicate that optimization significantly enhances the predictive accuracy of bearing fault diagnosis. The baseline ELM model exhibits relatively low accuracy on both the Xi’an Jiaotong University bearing dataset (Dataset 1) and the laboratory-collected dataset (Dataset 2), suggesting limited generalization capability. The SSA-ELM model, optimized with the SSA, shows a notable improvement in accuracy, benefiting from better parameter tuning and nonlinear mapping capabilities, although it remains susceptible to local optima. In contrast, the MSSA-ELM model achieves over 95% accuracy on both datasets, demonstrating the effectiveness of the proposed optimization strategy in enhancing global search capability and model robustness. Notably, even in the complex signal environment of the laboratory dataset, MSSA-ELM maintains high accuracy, highlighting its superior generalization ability and practical application value.

Building on these results, comparative experiments were designed to validate the applicability and superiority of the proposed method at three levels: (1) diagnostic model comparison, in which traditional machine learning models, including SVM, KNN, Random Forest (RF), and ELM, were compared to evaluate their feature extraction and classification performance on high-dimensional nonlinear data; (2) optimization algorithm comparison, in which the ELM parameters were tuned using PSO, GA, the Grey Wolf Optimizer (GWO), and the proposed MSSA to assess their global search capability and convergence efficiency; and (3) signal preprocessing comparison, in which diagnostic performance was compared before and after LMD and Pearson correlation coefficient-based reconstruction. The model without preprocessing is denoted as MSSA-ELM-N.

The comprehensive experimental results are summarized in Figure 11.

The experimental results reveal significant differences in the performance of the diagnostic models and optimization algorithms in bearing fault diagnosis. First, models based on ELM demonstrated superior classification performance. Specifically, on Dataset 1, the prediction accuracy of MSSA-ELM reached 96.875%, which was significantly higher than that of MSSA-SVM, MSSA-KNN, and MSSA-RF. This indicates that ELM is particularly effective in capturing complex fault features. Second, in the optimization algorithm comparison, the traditional SSA-ELM was prone to local optima, whereas the improved MSSA, enhanced by adaptive Levy flight and dynamic opposition-based learning, improved global search ability and convergence efficiency. This optimization led to substantial performance gains across the different models. Finally, the role of signal preprocessing was also evident. The accuracy of MSSA-ELM-N, which did not undergo LMD and Pearson correlation-based reconstruction, was only 93.75% on Dataset 1, whereas the accuracy of the preprocessed MSSA-ELM model increased to 96.875%. This confirms that noise suppression and feature enhancement through signal preprocessing play a crucial role in improving diagnostic accuracy.

The superiority of MPE over alternative entropy methods was validated through comparative experiments. MPE achieved 15.2% higher separability between fault classes than single-scale permutation entropy (PE) and 9.8% higher than sample entropy (SE) when applied to LMD-derived PF components. This advantage stems from MPE’s ability to quantify nonlinear dynamics across multiple time scales, which is critical for characterizing transient fault impacts obscured by noise.
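For reference, MPE can be sketched as coarse-graining followed by permutation entropy at each scale, using the standard definitions (Bandt and Pompe for permutation entropy, Costa et al. for coarse-graining, both in the reference list). The embedding dimension `m`, the delay, and the number of scales below are illustrative assumptions, not the settings used in the experiments.

```python
from collections import Counter
from math import factorial, log

def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale`."""
    n = len(x) // scale
    return [sum(x[j * scale:(j + 1) * scale]) / scale for j in range(n)]

def permutation_entropy(x, m=3, delay=1):
    """Normalized Shannon entropy of ordinal patterns of length m (0 to 1)."""
    n = len(x) - (m - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this m and delay")
    # Each window is mapped to the index order that sorts its values.
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: x[i + k * delay]))
        for i in range(n)
    )
    h = -sum((c / n) * log(c / n) for c in patterns.values())
    return h / log(factorial(m))

def mpe(x, m=3, delay=1, max_scale=5):
    """Feature vector: one permutation entropy value per time scale."""
    return [permutation_entropy(coarse_grain(x, s), m, delay)
            for s in range(1, max_scale + 1)]

ramp = [float(i) for i in range(50)]     # a single ordinal pattern
square = [i % 2 for i in range(101)]     # two patterns, equally frequent

ramp_features = mpe(ramp, m=3, max_scale=3)
square_pe = permutation_entropy(square, m=2)
```

A monotonic ramp has only one ordinal pattern and therefore zero entropy at every scale, while the alternating series reaches the maximum normalized value of 1 for m = 2. Fault signals fall between these extremes, and the way the value changes with scale is what the feature vector captures.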

Regarding computational efficiency, the proposed LMD-MPE-MSSA-ELM framework was quantitatively compared with pipelines built on alternative signal decomposition methods, namely VMD and EEMD. Although the improved MSSA (enhanced by adaptive Levy flight and dynamic opposition-based learning) converges faster, the full pipeline incurred an average runtime of 3.9 s per signal on a MATLAB/Intel i7-12700H platform. This is slightly higher than the VMD-based pipeline (3.6 s) but lower than the EEMD-based pipeline (5.8 s), as detailed in Table 5.

The runtime–accuracy tradeoff supports the practicality of LMD-MPE-MSSA-ELM: its runtime exceeds that of the VMD-based pipeline by 8.3% (3.9 s vs. 3.6 s), but it delivers a 5.08% relative accuracy gain (96.875% vs. 92.19%, i.e., 4.685 percentage points), which is valuable in safety-critical applications.
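The percentages in this tradeoff follow directly from Table 5, as the short check below shows. The helper names are illustrative; the runtime and accuracy values are copied from the table.

```python
# (average runtime in s, accuracy in %) per decomposition method, from Table 5.
TABLE5 = {
    "LMD":  (3.9, 96.875),
    "VMD":  (3.6, 92.19),
    "EEMD": (5.8, 89.06),
}

def relative_change(new, ref):
    """Change of `new` relative to `ref`, in percent."""
    return (new - ref) / ref * 100

def compare(proposed, other):
    (t_p, a_p), (t_o, a_o) = TABLE5[proposed], TABLE5[other]
    return {
        "runtime_pct": relative_change(t_p, t_o),
        "accuracy_pct": relative_change(a_p, a_o),
        "accuracy_points": a_p - a_o,
    }

vs_vmd = compare("LMD", "VMD")
vs_eemd = compare("LMD", "EEMD")
```

Against VMD, the runtime increase rounds to 8.3% and the relative accuracy gain to 5.08%. Against EEMD, the LMD-based pipeline is both faster (about a third less runtime) and more accurate.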

5. Conclusions

This study proposes a multi-strategy improved Sparrow Search Algorithm incorporating adaptive Levy flight and dynamic opposition-based learning for bearing fault diagnosis. The results indicate that this approach consistently achieves high-accuracy fault classification under varying operating conditions, demonstrating strong robustness and practical applicability. The key conclusions are as follows:

(1) Effectiveness of signal preprocessing: To address the challenge of fault signal interference and feature ambiguity under low-speed conditions, LMD combined with Pearson correlation coefficient-based reconstruction was applied. This method effectively removed irrelevant noise while preserving high-frequency features indicative of local faults and abnormal wear. The experimental results show that this preprocessing step improves model accuracy from 93.75% to 96.875%, providing a clearer and more reliable data foundation for subsequent feature extraction.
(2) Enhanced optimization strategy: To overcome the local optima problem and slow convergence of the traditional SSA, this study integrates adaptive Levy flight and dynamic opposition-based learning into the optimization framework. Levy flight expands the search space via long-tail jumps, while dynamic opposition-based learning enhances population diversity, achieving a well-balanced global and local search process. The experimental results confirm that models optimized using MSSA consistently exceed 96% accuracy across multiple datasets, demonstrating its superior parameter optimization capability.
(3) Comprehensive performance validation: The proposed method, integrating signal preprocessing and multi-strategy optimization, achieves over 95% accuracy on both the Xi’an Jiaotong University bearing dataset and the laboratory-collected dataset, demonstrating its robustness and adaptability across different working conditions and noisy environments.

The proposed approach not only achieves high-precision fault diagnosis but also significantly enhances system robustness and generalization ability, providing a reliable technical foundation for industrial online monitoring and predictive maintenance.

Author Contributions

H.C. (Haodong Chi): conceptualization, methodology, software, validation, data curation, writing—original draft preparation, and formal analysis; H.C. (Huiyuan Chen): investigation, resources, writing—review and editing, supervision, and project administration. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Cerrada, M.; Sánchez, R.V.; Li, C.; Pacheco, F.; Cabrera, D.; Oliveira, J.; Vásquez, R. A review on data-driven fault severity assessment in rolling bearings. Mech. Syst. Signal Process. 2018, 99, 169–196.
2. Jain, P.H.; Bhosle, S.P. Analysis of vibration signals caused by ball bearing defects using time-domain statistical indicators. Int. J. Adv. Technol. Eng. Explor. 2022, 9, 700.
3. Jawad, S.M.; Jaber, A.A. Bearings health monitoring based on frequency-domain vibration signals analysis. Eng. Technol. J. 2022, 41, 86–95.
4. Li, Y.; Zhou, J.; Li, H.; Meng, G.; Bian, J. A fast and adaptive empirical mode decomposition method and its application in rolling bearing fault diagnosis. IEEE Sens. J. 2022, 23, 567–576.
5. Nishat Toma, R.; Kim, C.H.; Kim, J.M. Bearing fault classification using ensemble empirical mode decomposition and convolutional neural network. Electronics 2021, 10, 1248.
6. Hu, C.; Xing, F.; Pan, S.; Yuan, R.; Lv, Y. Fault diagnosis of rolling bearings based on variational mode decomposition and genetic algorithm-optimized wavelet threshold denoising. Machines 2022, 10, 649.
7. Liang, T.; Lu, H. A novel method based on multi-island genetic algorithm improved variational mode decomposition and multi-features for fault diagnosis of rolling bearing. Entropy 2020, 22, 995.
8. Wang, L.; Liu, Z.; Miao, Q.; Zhang, X. Complete ensemble local mean decomposition with adaptive noise and its application to fault diagnosis for rolling bearings. Mech. Syst. Signal Process. 2018, 106, 24–39.
9. Han, M.; Wu, Y.; Wang, Y.; Liu, W. Roller bearing fault diagnosis based on LMD and multi-scale symbolic dynamic information entropy. J. Mech. Sci. Technol. 2021, 35, 1993–2005.
10. Zhao, D.; Liu, S.; Cheng, S.; Sun, X.; Wang, L.; Wei, Y.; Zhang, H. Parallel multi-scale entropy and it’s application in rolling bearing fault diagnosis. Measurement 2021, 168, 108333.
11. Ying, W.; Tong, J.; Dong, Z.; Pan, H.; Liu, Q.; Zheng, J. Composite multivariate multi-Scale permutation entropy and laplacian score based fault diagnosis of rolling bearing. Entropy 2022, 24, 160.
12. Zhang, X.; Wang, H.; Ren, M.; He, M.; Jin, L. Rolling bearing fault diagnosis based on multiscale permutation entropy and SOA-SVM. Machines 2022, 10, 485.
13. He, C.; Wu, T.; Gu, R.; Jin, Z.; Ma, R.; Qu, H. Rolling bearing fault diagnosis based on composite multiscale permutation entropy and reverse cognitive fruit fly optimization algorithm–extreme learning machine. Measurement 2021, 173, 108636.
14. Chen, Y.; Yuan, Z.; Chen, J.; Sun, K. A novel fault diagnosis method for rolling bearing based on hierarchical refined composite multiscale fluctuation-based dispersion entropy and PSO-elm. Entropy 2022, 24, 1517.
15. Meng, L.; Liu, M.; Wei, P.; Qin, H. Rolling Bearing Fault Diagnosis Based on Improved VMD And GA-ELM. In Proceedings of the 2021 40th Chinese Control Conference (CCC), Shanghai, China, 26–28 July 2021; pp. 4414–4419.
16. Gao, S.; Xu, L.; Zhang, Y.; Pei, Z. Rolling bearing fault diagnosis based on SSA optimized self-adaptive DBN. ISA Trans. 2022, 128, 485–502.
17. Chen, D.; Zhao, J.D.; Huang, P.; Deng, X.; Lu, T. An improved sparrow search algorithm based on levy flight and opposition-based learning. Assem. Autom. 2021, 41, 697–713.
18. Yu, X.; Peng, W.; Liu, Y. WSN node localization algorithm of sparrow search based on elite opposition-based learning and Levy flight. Telecommun. Syst. 2023, 84, 521–531.
19. Xu, L.; Li, W.; Zhang, B.; Zhu, Y.; Lang, C. Fault diagnosis of mine truck hub drive system based on LMD multi-component sample entropy fusion and LS-SVM. Actuators 2023, 12, 468.
20. Qin, L.; Yang, G.; Sun, Q. Maximum correlation Pearson correlation coefficient deconvolution and its application in fault diagnosis of rolling bearings. Measurement 2022, 205, 112162.
21. Bandt, C.; Pompe, B. Permutation entropy: A natural complexity measure for time series. Phys. Rev. Lett. 2002, 88, 174102.
22. Costa, M.; Goldberger, A.L.; Peng, C.K. Multiscale entropy analysis of complex physiologic time series. Phys. Rev. Lett. 2002, 89, 068102.
23. Suthar, V.; Vakharia, V.; Patel, V.K.; Shah, M. Detection of compound faults in ball bearings using multiscale-SinGAN, heat transfer search optimization, and extreme learning machine. Machines 2022, 11, 29.
24. Lv, J.; Sun, W.; Wang, H.; Zhang, F. Coordinated approach fusing RCMDE and sparrow search algorithm-based SVM for fault diagnosis of rolling bearings. Sensors 2021, 21, 5297.
25. Lv, J.; Xiao, Q.; Zhai, X.; Shi, W. A high-performance rolling bearing fault diagnosis method based on adaptive feature mode decomposition and Transformer. Appl. Acoust. 2024, 224, 110156.
26. Houssein, E.H.; Helmy, B.E.; Rezk, H.; Nassef, A.M. An efficient orthogonal opposition-based learning slime mould algorithm for maximum power point tracking. Neural Comput. Appl. 2022, 34, 3671–3695.

Figure 1. Bearing fault diagnosis flowchart.

Figure 2. Experimental setup.

Figure 3. Original vibration signals.

Figure 4. PF components decomposed by LMD.

Figure 5. Comparison of raw and reconstructed signals. (a) Normal state. (b) Inner race fault. (c) Outer race fault. (d) Roller fault.

Figure 6. 3D visualization of the test function.

Figure 7. Iteration convergence curves.

Figure 8. Predicted results of different models. (a) Predicted result of ELM. (b) Predicted result of SSA-ELM. (c) Predicted result of MSSA-ELM.

Figure 9. Laboratory signal acquisition system.

Figure 10. Predictive accuracy of various models on different datasets.

Figure 11. Predictive accuracy of different comparative models on various datasets.

Table 1. Bearing parameters.

Parameter Name	Value	Parameter Name	Value
Inner ring raceway diameter (mm)	29.30	Ball diameter (mm)	7.92
Outer ring raceway diameter (mm)	39.80	Number of balls	8
Bearing pitch diameter (mm)	34.55	Contact angle (°)	0
Basic dynamic load rating (N)	12,820	Basic static load rating (kN)	6.65

Table 2. Bearing accelerated life test conditions.

Condition Number	1	2	3
Rotational speed (r/min)	2100	2250	2400
Radial force (kN)	12	11	10

Table 3. Pearson correlation coefficients between PF components and raw signals.

Component	Normal	Inner Race Fault	Outer Race Fault	Roller Fault
PF1	0.6251	0.8562	0.6653	0.5694
PF2	0.4395	0.5328	0.6257	0.5328
PF3	0.3640	0.3365	0.4962	0.4013
PF4	0.3326	0.2143	0.3851	0.3610
PF5	0.1203	0.1896	0.1694	0.2681
PF6	0.0033	0.0963	0.1125	0.1549
PF7	0	0.0574	0.0743	0.0395

Table 4. Diagnostic accuracies of models (%).

Model	Label 1 (Normal)	Label 2 (Inner Race Fault)	Label 3 (Outer Race Fault)	Label 4 (Roller Fault)	Overall
ELM	87.5	88.75	86.25	87.5	87.5
SSA-ELM	91.25	91.25	92.5	92.5	91.875
MSSA-ELM	96.25	97.5	96.25	97.5	96.875

Table 5. Computational cost and accuracy comparison of decomposition methods.

Method	Average Runtime (s)	Accuracy (%)
LMD-MPE-MSSA-ELM	3.9	96.875
VMD-MPE-MSSA-ELM	3.6	92.19
EEMD-MPE-MSSA-ELM	5.8	89.06

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

