Next Article in Journal
Verifiable Differential Privacy Partial Disclosure for IoT with Stateless k-Use Tokens
Next Article in Special Issue
Defect-Intent Ambiguity Addressing for Training-Free Deterministic PCB Defect Localization via Template Selection and Dissimilarity Mapping
Previous Article in Journal
A POA-QPSO Hybrid Algorithm for Multi-Objective Optimization of Dual-Layer Walker Constellations
Previous Article in Special Issue
Correlation Between Meso-Defect and Fatigue Life Through Representing Feature Analysis for 6061-T6 Aluminum Alloys
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Composite Fault Feature Index-Guided Variational Mode Decomposition with Dynamic Weighted Central Clustering for Bearing Fault Detection

School of Mechatronic Engineering, Changchun University of Technology, Changchun 130103, China
*
Authors to whom correspondence should be addressed.
Sensors 2026, 26(4), 1394; https://doi.org/10.3390/s26041394
Submission received: 29 January 2026 / Revised: 18 February 2026 / Accepted: 20 February 2026 / Published: 23 February 2026
(This article belongs to the Special Issue Sensing Technologies in Industrial Defect Detection)

Abstract

To address the periodic impacts and amplitude-modulated high-frequency resonance phenomena caused by bearing faults in rotating machinery, this paper proposes a detection method. The core innovation lies in: firstly, constructing a composite fault feature index (CFFI) that integrates normalized kurtosis and fuzzy entropy, which synchronously quantifies the fault impact intensity and periodic structure, and serves as an optimization objective; secondly, definining a spectral energy retention rate (SERR) that includes both the full spectrum and characteristic frequency bands to evaluate the denoising effect and fault feature retention, respectively. Based on this, the method adaptively determines the Variational Mode Decomposition (VMD) parameters through the Triangular Topology Aggregation Optimizer (TTAO), and uses Dynamic Weighted Center Clustering (DWCC) to screen key IMFs containing fault-envelope information. On the IMS bearing dataset, the SERR of the reconstructed signal is 0.21356, which is higher than the actual collected signal value of 0.22465, with a relative error of 4.9%, indicating a higher reconstruction accuracy. These quantitative results indicate that CFFI-guided optimization enhances impulsive and periodic fault components while maintaining stable feature-band retention. This approach is suitable for real-world equipment monitoring and exhibits strong engineering applicability.

1. Introduction

As a critical component of rotating machinery, rolling bearing failure often manifests as periodic impacts between rolling elements and defect sites. This process excites high-frequency resonance, generating impulses and amplitude-modulation structures in the envelope domain [1,2,3]. Under real-world conditions characterized by variable operating states, intense noise, and multi-source interference, fault features are typically weak, non-stationary, and easily masked, posing significant challenges to the sensitivity and robustness of traditional diagnostic methods that rely on fixed parameters and single features [4]. Consequently, how to adaptively select frequency bands, achieve stable signal decomposition, and quantitatively evaluate feature preservation in real-world scenarios has become a key research direction in bearing fault signal processing.
Classical diagnostic procedures typically follow the “band selection–envelope demodulation” principle. Methods such as spectral kurtosis and kurtograms are first used to select resonance bands that highlight transients, followed by zero-phase bandpass filtering and a Hilbert transform to obtain the envelope, thereby enhancing impulsive and modulated fault information [5]. While effective in many engineering applications, this fixed-parameter approach with a single band-selection criterion is prone to band-selection bias and feature degradation under low signal-to-noise ratio (SNR) and non-stationary conditions [6]. To enhance feature separation, adaptive time–frequency decomposition methods have been widely adopted. Techniques such as EMD, EEMD, and CEEMDAN can decompose signals into intrinsic modes at different scales and capture transients, but they suffer from issues such as mode mixing and end effects [7]. In contrast, Variational Mode Decomposition (VMD) is an adaptive signal decomposition method that recursively extracts intrinsic mode functions (IMFs) with specific center frequencies by constructing and solving a constrained variational problem. VMD iteratively solves for modes and their center frequencies within a variational framework that minimizes bandwidth, offering better noise robustness and decomposition stability, which aligns well with the “impact-resonance” signal mechanism [8]. However, the decomposition quality of VMD is susceptible to its penalty factor (α) and mode number (K). The penalty factor (α) controls the bandwidth of each extracted mode; a larger α yields smoother modes with better noise suppression but may obscure impulsive features, while a smaller α preserves detail at the cost of increased noise sensitivity. The mode number (K) determines how many intrinsic mode functions (IMFs) are generated; an insufficient K leads to under-decomposition (merging of distinct fault components), whereas an excessive K causes over-decomposition (splitting of fault features and introduction of spurious components). In bearing fault diagnosis, an inappropriate penalty factor or mode number (α, K) can directly obscure the impulse-induced resonance signatures targeted for extraction. Without data-driven criteria, over-decomposition or under-decomposition can easily occur.
To address VMD’s parameter sensitivity, recent research has employed metaheuristic optimization algorithms, using a health indicator as the fitness function to achieve adaptive parameter tuning. This approach demonstrates a good balance between global and local search in selecting signal-processing parameters [9,10]. Meanwhile, entropy-based features (e.g., sample entropy, permutation entropy, fuzzy entropy) and kurtosis are complementary in characterizing the complexity and impulse strength of time series. They are often used as fault-sensitive indicators or combined into composite criteria to enhance sensitivity and robustness to weak features [11,12]. In the post-decomposition stage, clustering methods can help distinguish noise-dominant from fault-dominant intrinsic mode functions (IMFs), reducing the impact of mode mixing on reconstruction quality and thus preserving fault-related components more effectively. For evaluation, publicly available datasets, such as the IMS bearing data, provide standardized benchmarks and multi-condition coverage for generalized validation [13].
In addition, in terms of constructing health indicators, although there have been studies combining multiple features [14], they are mostly used for later diagnosis and rarely used as a unified optimization criterion for guiding the entire process of signal decomposition [15]. In the stage of modal selection and reconstruction, traditional methods often rely on empirical thresholds or simple rules to screen IMFs, lacking robustness analysis of IMF cluster characteristics, and are susceptible to mode aliasing and noise IMF interference [16].
To tackle the aforementioned challenges, this paper constructs a closed-loop diagnostic framework following the sequence: “indicator, decomposition, clustering, reconstruction, evaluation.” Specifically: (1) Guided by spectral kurtosis, perform frequency band selection, followed by zero-phase filtering and envelope demodulation. (2) Use the proposed Composite Fault Feature Indicator (CFFI), which integrates impulse sensitivity with complexity robustness, as the fitness function to optimize VMD’s α and K parameters adaptively. This focuses the decomposition on components related to fault envelopes. (3) Apply Dynamic Weighted Central Clustering (DWCC) to the obtained IMFs and reconstruct the fault-dominant components. (4) Quantify noise reduction and feature preservation using a defined Spectral Energy Retention Rate (SERR). The framework’s stability and engineering applicability are validated using the IMS bearing dataset. A comparative analysis with baseline methods using various entropy-based fitness functions demonstrates the superior robustness of the proposed approach.
The main contributions of this paper are as follows:
  • CFFI integrating features sensitive to impulses and robust to signal complexity, serving as a unified criterion for both VMD parameter optimization and IMF selection.
  • Employs health-indicator-driven metaheuristic optimization to adaptively select VMD’s α and K, mitigating over-/under-decomposition and enhancing focus on fault-envelope information.
  • Adopts DWCC for robust clustering and reconstruction of IMFs, reducing the effects of mode mixing and highlighting fault-dominant components.
  • Introduces the SERR as an evaluation metric, establishing a closed-loop quantitative assessment from band selection to reconstruction, facilitating online monitoring and trend tracking.

2. Research Methodology

The objective of this study is to achieve real-time detection of bearing faults (e.g., spalling on inner/outer rings or rolling elements) by determining optimal VMD parameters. This guides the adaptive selection of the impact and modulated resonance information inherent to such faults [17,18]. This objective is achieved through four key steps: (1) A CFFI specific to bearing faults is proposed to quantitatively measure the strength of fault-related impulses and periodic structures, simultaneously serving as the fitness function for the optimization algorithm. (2) The selected fitness function is used to optimize the two key VMD parameters: the penalty factor α and the mode number K. This optimization ensures that the decomposed IMFs are concentrated within the fault-related resonance bands and envelope components, thereby adaptively amplifying bearing fault signatures. (3) An improved clustering method, DWCC, is applied to the IMFs. Based on the CFFI values, the cluster with the highest average CFFI is selected for signal reconstruction, yielding a signal enriched with bearing fault features. (4) Finally, the feature-band SERR is calculated within the known or estimated fault characteristic frequency band Ω, enabling the identification and trend tracking of bearing fault characteristics.

2.1. Health Indicators

In fault diagnosis systems, health indicators are the key to converting raw vibration signals into interpretable, state-sensitive quantitative measures. Their core function is to detect, quantify, and track early fault characteristics, especially in noisy and non-stationary operating conditions, where simple visual or statistical analysis often fails. To effectively capture the key changing characteristics of bearing fault signals, this paper proposes a novel CFFI.
Traditional health indicators, such as root mean square (RMS), KI, and various entropy indicators, are often designed for the single physical characteristics of signals. Although effective in specific scenarios, their sensitivity and robustness are difficult to balance when facing the complex characteristics of impact and periodic modulation that coexist in bearing faults. For example, kurtosis is sensitive to impact but susceptible to noise and outlier interference; The entropy index has a certain ability to characterize periodic structures, but its peak response to transient shocks is weak.
This makes CFFI not only a better-performing alternative indicator, but also a closed-loop guiding core that embeds diagnostic intelligence into the entire process of “decomposition, optimization, clustering, evaluation”. The application of traditional indicators is often open-loop and isolated, while CFFI optimizes and filters signals through the same criteria, improving the cohesion, adaptability, and target consistency of the entire feature extraction process. In addition, in Section 4, this article compares CFFI with the traditional metrics root mean square (RMS) and KI in three evaluation dimensions: trend (Tred), monotonicity (Mono), and scale-adaptive comprehensive feature variability (SAIF).

2.1.1. Construction of the Composite Fault Feature Index

The introduction of the CFFI is based on an understanding of the limitations of existing signal-feature extraction methods, and its construction is crucial to the adaptiveness of the optimization algorithm proposed in this paper. Traditional features such as RMS and KI perform well in specific scenarios. However, when dealing with complex, non-stationary signals exhibiting impact-resonance modulation characteristics typical of bearing faults, a single feature is often insufficient. Bearing faults usually manifest as high-frequency resonance excited by periodic impacts, resulting in pronounced impulses and periodic structures in the envelope domain. To simultaneously capture both impulse sensitivity and periodic structure, the CFFI integrates FE and KI through normalized weighted fusion, defined as follows:
C F F I I M F i = w k × K I n o r m ( I M F i ) + w e × [ 1 F E n o r m ( I M F i ) ]
where I M F i represents the i-th intrinsic mode function; KInorm and FEnorm denote the normalized values of KI and FE for that IMF, respectively; w k and w e are weighting coefficients (default: w k = w e = 0.5, adjustable based on operational conditions). This definition ensures that a higher CFFI value indicates stronger impulses and periodic structures in the signal related to bearing faults. The CFFI serves both for selecting and clustering individual IMFs and as the fitness function for the Triangular Topology Aggregation Optimizer (TTAO) in optimizing VMD parameters, enabling the decomposition process to focus on fault-related features adaptively.
Fuzzy entropy [19] is used to quantify the complexity and uncertainty of a time series. Its value tends to be lower when the signal exhibits stronger periodicity. The calculation formula is as follows:
FE norm I M F i = ln ϕ m + 1 I M F i ϕ m I M F i
where ϕ m and ϕ m + 1 represent the fuzzy similarity measures at embedding dimensions m and m+1, respectively.
The kurtosis index [20] quantifies impulse strength and the peakedness of a distribution. Impacts from bearing faults and the resulting pulses in the envelope domain typically cause a significant increase in kurtosis. It is calculated using the following formula:
K I n o r m I M F i = 1 N k = 1 N I M F i k μ σ 4
where N denotes the sample size, μ is the mean, and σ represents the standard deviation.
The construction described above ensures that the CFFI is simultaneously sensitive to both impulsive shocks and periodic structures. Consequently, a higher CFFI value indicates a more substantial presence of components associated with bearing faults. Therefore, the CFFI serves a dual purpose: it acts as a criterion for selecting relevant IMFs and functions as the fitness function for the TTAO optimization algorithm, guiding the adaptive search for optimal VMD parameters.

2.1.2. Assessment Metric

To quantitatively evaluate the performance of the proposed novel health indicator (CFFI), this paper employs two standard metrics (Tred and Mono) along with one comprehensive metric, SAIF. Let the total number of sample segments be K, and the health indicator value for each segment be denoted as hk. The segment index sk can represent the time series. The definitions and significance of the three types of metrics are provided below.
(1)
Tred: This metric measures the linear correlation between the health indicator and time, reflecting the overall change trend of the indicator as the fault evolves over time.
T r e d = k = 1 K h k H ¯ s k S ¯ k = 1 K ( h k H ¯ ) 2 k = 1 K ( s k S ¯ ) 2
where H ¯ is the mean value of the health indicator, and S ¯ is the average value of the time index. To enhance robustness, sk can be assigned as the rank of the segment index.
(2)
Mono: This metric measures the direction of change in the health indicator over time, which is crucial for evaluating the degradation process of equipment health. Monotonicity is defined as follows:
M o n o = N p N n K 1
where N p and N n represent the number of health indicators that are greater than 0 and less than 0 after differentiation, respectively, and K denotes the sample size.
(3)
SAIF: This is used to comprehensively measure health indicators across multiple time scales, integrating methods such as multi-scale analysis, statistical features, and weighted summation. The definition of SAIF is as follows:
S A I F = 1 N i = 1 N w i 1 L j = 1 L σ i j μ i j + ϵ 2 + α s k e w x i j + β k u r t x i j
where N is the number of time scales in the signal, L is the number of sub-windows within each time scale, w i is the weight of the ith time scale, σ i j and μ i j are the standard deviation and mean of the j-th sub window under the i-th time scale, respectively, ϵ is a small constant to prevent division by zero, α and β are weight coefficients for skewness and kurtosis, and s k e w ( x i j ) and k u r t ( x i j ) are the skewness and kurtosis of the j-th sub window under the i-th time scale, respectively.

2.2. Variational Mode Decomposition

VMD is an adaptive signal-processing method [21,22] that decomposes a complex signal into several IMFs with distinct center frequencies. It addresses issues such as mode mixing and end effects that are present in traditional EMD methods [23] and is widely applied in fields such as fault diagnosis and signal denoising. The core of VMD lies in minimizing the bandwidth of each mode through a variational approach. This process can be summarized in the following steps:
(1)
Objective of Mode Decomposition:
  • The input signal f(t) is assumed to be decomposed into K mode functions u(k), each associated with a center frequency ωk.
  • The objective is to find these mode functions uk(t) and their corresponding center frequencies ωk such that the bandwidth of each mode is minimized.
(2)
Formulating the Variational Problem:
The problem of minimizing modal bandwidth is formalized as a constrained variational problem:
m i n u k , ω k k = 1 K t δ t + j π t     u k t e j ω k t 2 2 s . t . k u k = f
where t denotes the time-derivative operator, δ t is the Dirac delta function, which ensures orthogonality of the modes in the time domain, represents the convolution operator, j is the imaginary unit, u k t is the kth mode function, and ω k is its corresponding center frequency.
(3)
Lagrangian Multiplier Method:
To handle the reconstruction constraint k = 1 K u k t = f t , the Lagrangian multiplier λ(t) is introduced:
L ( { u k } , { ω k } , λ ) = α k = 1 K t δ t + j π t     u k t e j ω k t 2 2 + f t k = 1 K u k t 2 2 + λ t , f t k = 1 K u k t
(4)
Solving the Variational Problem: The Alternating Direction Method of Multipliers (ADMM) is employed to solve the aforementioned Lagrangian function iteratively, updating the mode functions u k ( t ) and the center frequencies ω k .
u k n + 1 = arg m i n u k t δ t + j π t     u k t e j ω k t 2 2 + α 2 f t i k u i t u k t + λ 2 α 2 2
ω k n + 1 = 0 ω | u k ( ω ) | 2 d ω 0 | u k ( ω ) | 2 d ω λ n + 1 = λ n + τ f t k = 1 K u k n + 1 t
(5)
Convergence Criterion: It is determined whether the change in the mode functions u k during each iteration satisfies the convergence condition:
k = 1 K   u k n + 1 u k n 2 2 u k n 2 2 < ϵ
If satisfied, the iteration stops. The motivation for selecting VMD as the modal decomposition method lies in its high degree of alignment with the mechanism of bearing faults: (1) It effectively captures and separates complex vibration modes. (2) It suppresses noise, thereby enhancing signal clarity. (3) It adapts to non-stationary signals, enabling precise analysis of transient features. (4) It identifies and distinguishes multiple fault modes.

2.3. Triangulation Topology Aggregation Optimizer

The Triangulation TTAO is a novel metaheuristic algorithm based on mathematical principles, specifically designed to address complex continuous optimization problems in engineering applications [24]. TTAO leverages the geometric properties of similar triangles to balance exploration and exploitation, thereby improving the efficiency of finding the optimal solution.
The core concept of TTAO is to partition the search space into multiple triangular topology units. Each unit is optimized by updating its vertices and generating interior points within the triangle. These units iteratively evolve through processes of global and local aggregation, gradually converging towards the optimal solution.
The TTAO algorithm is implemented through the following steps:
(1)
Initialization: An initial population P is distributed within the search space, forming several triangular units. Each unit consists of three vertices X i 1 , X i 2 , X i 3 , and an additional point X i 4
P = X 1 , X 2 , , X N
where N represents the population size, and each X i is a d-dimensional vector.
(2)
Formation of Triangular Units: The vertices X i 1 , X i 2 , and X i 3 of each triangular unit can be randomly distributed within the search space during the initialization process. The additional point X i 4 is typically generated inside the triangle to assist in the optimization process.
(3)
Global Aggregation: During the Global Aggregation phase, new candidate solutions are generated by performing weighted combinations of the vertices from different triangular units. This step emphasizes exploring potential regions within the search space.
Assuming the vertices of a triangular unit Δ i j k are X i , X j , and X k , a new point X n e w can be generated using the following formula:
X n e w = α X i + β X j + γ X k
where α, β, γ are weighting coefficients that satisfy α + β + γ = 1, and are typically determined using random methods.
(4)
Local Aggregation: The Local Aggregation phase focuses on refining the search within each triangular unit. A local search around the best vertex is conducted by perturbing its position to find a better solution. For each vertex X i , its update can be expressed as:
X i t + 1 = X i t + λ X b e s t X i t
where λ is a step-size coefficient, X best is the best-performing vertex in the current triangular unit, and t denotes the iteration number.
(5)
Iteration and Convergence: The algorithm alternates between global and local aggregation phases, updating the individuals in the population until a convergence condition is met (e.g., a maximum number of iterations or an acceptable solution quality). The convergence criterion can typically be defined as the change in the objective function value Δ f < ϵ , or reaching a predefined maximum number of iterations t max : Δ f = f ( X ( t ) ) f ( X ( t 1 ) ) < ϵ or t t m a x .

2.4. Dynamic Weighted Centroid Clustering

In signal analysis, decomposing a complex signal into several IMFs with distinct characteristics is a crucial step. Building on this, clustering the IMFs can further reveal underlying patterns and structures within the signal. To better analyze and utilize the decomposed IMFs, this section proposes a DWCC method, which is described in detail alongside actual vibration signals. The specific steps of the DWCC algorithm are as follows:
(1)
Data Standardization: Prior to cluster analysis of the IMFs, it is necessary to standardize the CFFI values of the IMFs. This process aims to eliminate the influence of IMFs with different magnitudes and scales on the clustering results, allowing for comparison and analysis of all IMFs on a unified scale. The standardized CFFI value is defined as
C F F I n o r m a l i z e d I M F i = C F F I I M F i μ C F F I σ C F F I
where CFFI n o r m a l i z e d ( IMF i ) is the CFFI value of the i-th IMF. μ CFFI and σ CFFI are the mean and standard deviation of the CFFI values for all IMFs, respectively.
Through standardization, the impact of differences in magnitude across IMFs is eliminated, thereby improving the accuracy and effectiveness of subsequent cluster analysis.
(2)
Cluster Center Initialization: After standardization, the K-means++ algorithm is employed to select K initial cluster centers μ 1 , μ 2 , , μ K , which are determined using the following formula:
μ j 0 = arg m i n i CFFI normalized I M F i μ j 2
where μ j 0 denotes the initial center of the j-th cluster. The K-means++ algorithm ensures a reasonable distribution of the initial centers, thereby improving the convergence speed and the effectiveness of the clustering results.
(3)
Improved Distance Metric: In DWCC, an improved weighted Euclidean distance metric is used to calculate the similarity between each IMF and the cluster centers, enabling precise measurement of the similarities among IMFs:
d w I M F i , μ j = k = 1 m w k C F F I normalized , k I M F i μ j , k 2
where w k is the weight for each feature. m is the dimensionality of the features.
(4)
Adaptive IMF Assignment: An adaptive distribution strategy is employed, which dynamically assigns IMFs to the most suitable cluster center based on the local density and global similarity of each IMF. The adaptive IMF assignment strategy is defined as
c i = arg m i n j d w I M F i , μ j + λ 1 ρ I M F i
where c i is the assigned class of IMF i . λ is a balancing factor used to adjust the influence of distance and density. ρ ( IMF i ) represents the local density of IMF i .
(5)
Dynamic Cluster Center Update: Cluster centers are updated dynamically using a weighted averaging method to achieve smooth transitions and accelerate convergence. The dynamic update strategy is as follows:
μ j t + 1 = α μ j t + 1 α 1 C j i C j C F F I normalized I M F i
where α is a smoothing coefficient that controls the weight distribution between the new cluster center and the old one. C j is the set of IMFs currently belonging to the j th cluster. C j is the number of IMFs in the j th cluster.
(6)
Adaptive Convergence Criterion: The convergence criterion is dynamically adjusted based on the stability and trend of the cluster centers to ensure the algorithm achieves optimal clustering results within a reasonable timeframe:
μ j t + 1 μ j t < ϵ t j
where ϵ t is the convergence threshold at the t -th iteration, which can be dynamically adjusted according to the historical changes in the cluster centers.
(7)
Optimization Enhancement Strategy: After each iteration, a local search and optimization are performed on the assignment results to reduce intra-class variance further and enhance the robustness and accuracy of the clustering. Specific measures include: reassigning and optimizing IMFs within each cluster, removing IMFs that may belong to multiple categories, and adjusting them to more suitable classes.

2.5. Algorithm Flow

In the method proposed in this paper, the novel health indicator CFFI is used as the fitness function given in Equation (21), and TTAO is employed to optimize VMD. (* indicates specific numerical values used in practical operation.)
f i t n e s s = m i n K , α C F F I i s . t . K , α ,
C F F I i is the fitness function value obtained for each candidate. The proposed method is illustrated in Figure 1.
  • Initialize the parameters of TTAO and VMD.
  • Decompose the original signal into different modes.
  • Calculate the CFFI for each modality.
  • Calculate the fitness function, CFFI, for each modality until the maximum iteration is reached.
  • Obtain the global optimal fitness function and save the optimal values of the mode number K and the penalty factor α.
  • Perform DWCC on the decomposed IMFs.
  • Select the cluster with the highest average CFFI value as the IMFs for the reconstructed signal.
In practical decomposition, low-energy narrowband residual or broadband noise IMF usually exhibits lower CFFI values. DWCC will classify them into low CFFI clusters and remove them in the reconstruction stage, thereby improving the energy concentration and identifiability of fault-related components.

3. Simulation Analysis

To validate the applicability of the proposed method during bearing operation, a non-stationary simulated signal was constructed as a benchmark. This signal is designed to evaluate the algorithm’s capabilities for decomposition, optimization, and reconstruction under time-varying frequency and amplitude conditions, as well as in the presence of background noise. It is important to note that this simulated signal is not a physical model of a bearing fault but rather a non-stationary test sequence used to assess the algorithm’s robustness. The base signal is defined as a sinusoidal wave whose period and amplitude decrease over time via time attenuation factors. The mathematical expressions for a periodic signal with period T are given as:
f t = f 0 e k f t
A t = A 0 e k A t
x s i n t = A t sin 2 π f t t
where A is the initial amplitude of the signal, f is the signal frequency, t is time, f 0 = 2 Hz , k f = 0.1 s 1 , A 0 = 1 , and k A = 0.2 s 1 , as shown in Figure 2a.
On this basis, random white noise is generated and added to the above signal to form the final simulated signal, as shown in Figure 2b, expressed as:
x f i n a l t = x s i n t + n t
where n ( t ) = ( a m i n + ( a m a x a m i n ) r ) N ( 0,1 ) , with a m i n = 0.1 , a m a x = 3.0 , r being a random number uniformly distributed in [0, 1], and N ( 0,1 ) representing Gaussian noise with zero mean and unit variance.
First, the parameters for VMD and TTAO were initialized, with CFFI set as the fitness function for TTAO, aiming to maximize fitness. For the VMD, the number of iterations was set to 100, and the maximum number of decomposition layers K max = 25 . For the TTAO optimization, the number of iterations was set to 50. The resulting optimal fitness value found by TTAO was 0.11976 (as shown in Figure 3). Subsequently, VMD was used to decompose the original signal into several IMFs with different center frequencies. During each decomposition and parameter search, the CFFI was calculated for each IMF and aggregated (e.g., by taking the maximum value or the mean of the top p values) to form the fitness value. This process iterated until the parameter combination yielding the highest fitness was found, and the optimal parameters were saved (the optimal number of modes K = 10 , and the penalty factor α = 200 ). Under these parameters, the corresponding IMFs (see Figure 4) and their center frequencies (see Figure 5) were obtained.
As shown in Figure 4 and Figure 5, the first three IMFs clearly capture the pulse-resonance-dominated shock-attenuation characteristics; the subsequent IMF (IMF4-IMF10) mainly includes low-energy narrowband residual and broadband noise preserved after optimized parameter decomposition (K = 10). Based on the screening of CFFI and DWCC, these low-CFFI modes are classified as noise clusters and discarded in the reconstruction, further ensuring the feature set and denoising effect of the reconstructed signal. DWCC was then performed on all IMFs, followed by CFFI-based screening. The cluster with the highest average CFFI was selected for reconstruction to focus on fault-related components (four components were obtained, as shown in Figure 6).
In this example, the number of clusters was set to six. Since a higher CFFI value indicates stronger fault-related impulses and periodic structures, the cluster with the highest average CFFI (in this case, comprising four IMFs) was selected to form the reconstruction components. To evaluate denoising and feature-preservation performance, the SERR is introduced. Without requiring a clean reference signal, SERR assesses the overall quality of noise reduction and feature extraction by comparing the energy changes in the frequency domain before and after denoising. It is defined as
E o r i g i n a l = k | Y o r i g i n a l k | 2
E d e n o i s e d = k | Y d e n o i s e d k | 2
S E R R = E d e n o i s e d E o r i g i n a l
where Y original [ k ] is the spectral representation of the signal before denoising, Y denoised [ k ] is the spectral representation of the signal after denoising, E original represents the total spectral energy of the signal before denoising, and E denoised represents the total spectral energy of the signal after denoising.
Finally, the four obtained components were reconstructed (as shown in Figure 7, which compares the reconstructed signal with the pure signal, demonstrating effective denoising). The spectral energy of the reconstructed signal was calculated, resulting in an SERR value of 0.31254. A lower SERR value indicates a greater reduction in the denoised signal’s overall spectral energy, indicating good denoising performance. Figure 8 better illustrates the specific implementation of the signal processing. Figure 8a shows the time-domain plot of the generated noisy signal, and Figure 8b displays the amplitude spectrum and cluster spectrum of the noisy signal. Under the influence of noise, the signal’s spectrum appears complex and difficult to analyze directly. Figure 8c presents the amplitude spectrum and cluster spectrum of the denoised signal after processing by the proposed algorithm. Using this algorithm, noise is effectively removed, significantly enhancing signal clarity and feature extraction accuracy.
As shown in Figure 8, the processed signal exhibits more concentrated spectral energy and reduced noise, validating the algorithm’s effectiveness for signal denoising and feature extraction.
To verify the algorithm’s stability, 10 additional simulation trials were conducted. The parameters for the simulated signals are listed in Table 1. Results from multiple trials demonstrate that the proposed method maintains good consistency and robustness under non-stationary and noisy conditions.
In summary, the signal-processing framework proposed in this paper applies to the analysis of non-stationary signals with decaying periods. It demonstrates good adaptability for feature extraction in bearing vibration signals.

4. Experimental Validation

To verify the practical application advantages of CFFI in bearing fault detection, comparative experiments were conducted based on the IMS public dataset. The overall process includes data preprocessing, indicator calculation and evaluation, decomposition and optimization, IMF screening and reconstruction, as well as ablation and repeated experiment analysis.
Data and Preprocessing:
(1)
Frequency band selection and envelope extraction: The failure of the bearing body is usually manifested as periodic impact excitation of high-frequency resonance, and the formation of low-frequency amplitude modulation characteristics in the envelope domain. Combine known or estimated characteristic frequencies, or select bands based on spectral kurtosis to determine the bandpass range, using zero-phase bandpass filtering to preserve fault-related frequency bands; subsequently, Hilbert transform is applied to the bandpass signal to obtain the envelope, and slight detrending is performed as needed to eliminate DC bias. This process can suppress irrelevant noise while highlighting amplitude modulation and pulse information.
(2)
Standardization: Standardize the envelope or bandpass signal with z-scores to achieve zero mean and unit variance, improving comparability across different operating conditions/time periods.
(3)
Segmented processing: Segment the standardized signal into fixed time windows (approximately 20,480 points per window), and calculate CFFI, RMS, KI, and evaluation metrics (Trend, Monotonity, SAIF) window-by-window to maintain temporal resolution and characterize dynamic changes.
CFFI, root mean square (RMS), and kurtosis index (KI) were calculated window-by-window on IMS data, and evaluated and compared using three metrics: trend, monotonicity, and scale-adaptive integrated feature variability (SAIF). The summary results are shown in Figure 9.
The results showed that CFFI had advantages in all three metrics, especially in the SAIF dimension approaching 1, indicating its stronger significance and stability in fault-related features at multiple scales. Combining the composite characterization of pulse-like (KInorm) and periodic structure complexity (1-FEnorm) in CFFI, it can be concluded that its advantage comes from simultaneously characterizing peak and periodic structures, thereby more robustly highlighting fault-related features under different operating conditions and time scales. This further validates the sensitivity and robustness of CFFI as a health assessment indicator for bearings.
The sampling rate for IMS data recording is 20 kHz, with approximately 20,480 points per file. The preprocessing stage determines the fault-related resonance band based on spectral kurtosis and its harmonics, and performs envelope demodulation and trend removal. Subsequently, the envelope domain signal is decomposed into IMFs with different center frequencies using VMD, with a fixed number of iterations of 100 and a maximum decomposition level of Kmax = 25. To obtain the optimal decomposition configuration, TTAO is introduced to adaptively search for the penalty factor α and mode number K, and the optimization budget is uniformly set to 50 evaluations; in each candidate parameter evaluation, calculate the CFFI of each IMF and update the search parameters with the goal of maximizing CFFI. The representative results show that the optimal parameters are α = 400 and K = 10. Under this configuration, the SERR of the reconstructed signal is 0.21356, which is lower than the value of 0.22465 for the actual collected signal. This reduction in SERR indicates a decrease in extraneous spectral energy, thereby confirming the effectiveness of the denoising process and a higher quality of decomposition and reconstruction. The relative error between the two is 4.9%. Subsequently, the CFFI mean of each cluster is calculated, and the cluster with the highest mean is selected as the reconstruction set to avoid irrelevant or redundant modes from being mixed in. This process usually selects representative components such as IMF1, IMF5, IMF10 for reconstruction in experiments, which not only preserves the peak and amplitude modulation structure of fault pulses, but also effectively suppresses background noise and non-fault components.
On the premise of ensuring consistency in the data, preprocessing, VMD iteration, and optimization budget, ablation experiments were conducted to quantify the independent contributions of each module. Figure 10 compares the overall performance of five methods on SERR: (1) the complete method (TTAO+CFFI+DWCC); (2) the fixed parameters (K = 10, α = 400) are slightly higher than the complete method, indicating that empirical parameters can approach the optimum under specific operating conditions but lack adaptability; (3) under the same budget, the SERR of the grid search further increases, reflecting the influence of step size and discretization constraints; (4) replacing CFFI with minimum envelope entropy as the fitness resulted in a significant increase in SERR, indicating that entropy indices are less sensitive to spike pulses and complex changes than CFFI; (5) excluding DWCC and directly selecting the top K IMFs for reconstruction results in mixed information and energy dispersion significantly compromised the quality of reconstruction and diagnostic distinguishability.
In addition, Figure 11 compares the SERR performance of different fitness functions in 10 repetitions. When using CFFI as the fitness, the 10 results remained stable in the range of approximately 0.20–0.24 with minimal fluctuations; when using minimum envelope entropy, minimum information entropy, minimum sample entropy, and minimum permutation entropy, the overall SERR significantly increases (typical range of about 0.50–0.60), and the dispersion across experiments significantly increases. This result indicates that CFFI can more directly measure the peak characteristics and complex dynamic changes of fault pulses, providing a clear convergence direction for optimizers; entropy-based indicators are susceptible to distribution estimation and window settings in noisy backgrounds and non-stationary pulse-dominated scenes, leading to unstable parameter selection and decreased reconstruction quality. Based on the data from different comparisons and repeated experiments, it can be concluded that TTAO is more efficient in approaching optimal solutions than grid search under limited evaluation budget; CFFI, as a fitness function, is more suitable for characterizing the pulse peaks of mechanical faults than entropy-based indicators; DWCC is a key step in ensuring the quality of IMF screening, which can improve the energy concentration of reconstruction and the significance of envelope spectral fault features.

5. Conclusions

A health indicator-driven IMF clustering method for bearing fault detection was proposed. To address common issues in non-stationary and noisy environments, such as the susceptibility to mode mixing in traditional methods, difficulty in parameter self-adaptation, and lack of a closed-loop assessment, a framework for adaptive decomposition and selection guided by a CFFI was constructed. Using CFFI as both a sensitive and robust health indicator and an optimization objective, the method employs a TTAO to adaptively search for the optimal penalty factor α and mode number K for VMD. This enabled the decomposition to focus on fault-related impulse and amplitude-modulated resonance modes. Subsequently, DWCC was applied to screen the obtained IMFs, and the cluster with the highest average CFFI value—based on the semantic principle that “higher CFFI indicates stronger fault-related components”—was selected for reconstruction. Finally, a closed-loop assessment is based on the SERR, where the global full-spectrum SERR was used to measure overall noise reduction, and the feature-band SERR measures retention of fault characteristics. The experimental and simulation results based on the IMS public bearing dataset show that the proposed method achieved stable parameter estimation and clear fault modes under various operating conditions. Compared to baseline schemes using RMS, KI, or various entropy-based fitness functions, the proposed method exhibited higher Tred/Mono and SAIF. It achieved a significant reduction in the global SERR while maintaining a stable high retention in the feature-band SERR. This reflected its stronger sensitivity, robustness, and online applicability. It was noted that the current validation primarily relied on the IMS dataset; thus, the method’s generalizability across different bearing types and more-variable operational conditions is suggested to warrant further investigation. Future work will focus on automating frequency band selection and CFFI weight adaptation, validating generalization under complex variable-speed conditions, and enabling real-time deployment with computational optimization for embedded platforms.

Author Contributions

Conceptualization, B.Z., B.S., Z.G., Y.S., Z.P. and X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Jilin Provincial Scientific and Technological Development Program (No. 20250306008ZD).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SAIFScale-Adaptive Integrated Feature Variability
TTAOTriangular Topology Aggregation Optimizer
DWCCDynamic Weighted Central Clustering
CFFIComposite Fault Feature Indicator
VMDVariational Mode Decomposition
SERRSpectral Energy Retention Rate
IMFsIntrinsic Mode Functions
SNRSignal-to-Noise Ratio
RMSRoot Mean Square
MonoMonotonicity
FEFuzzy Entropy
KIKurtosis
TredTrend

References

  1. Antoni, J. The Infogram: Entropic Evidence of the Signature of Repetitive Transients. Mech. Syst. Signal Process. 2016, 74, 73–94. [Google Scholar] [CrossRef]
  2. Maláková, S.; Sivák, S.; Fedorko, G.; Molnár, V.; Tomek, M. Analysis of Damage Origin of Bevel Gear Wheels. Eng. Fail. Anal. 2023, 146, 107105. [Google Scholar] [CrossRef]
  3. Hou, Y.; Li, T.; Wang, J.; Ma, J.; Chen, Z. A Lightweight Transformer Based on Feature Fusion and Global–Local Parallel Stacked Self-Activation Unit for Bearing Fault Diagnosis. Measurement 2024, 236, 115068. [Google Scholar] [CrossRef]
  4. Sun, Z.; Wang, Y.; Zhang, L. Rotating Machinery Health State Assessment under Multi-Working Conditions Based on a Deep Fuzzy Clustering Network. Measurement 2023, 218, 113172. [Google Scholar] [CrossRef]
  5. He, G.; Ding, K.; Lin, H. Fault Feature Extraction of Rolling Element Bearings Using Sparse Representation. J. Sound Vib. 2016, 366, 514–527. [Google Scholar] [CrossRef]
  6. Sawalhi, N.; Randall, R.B.; Endo, H. The Enhancement of Fault Detection and Diagnosis in Rolling Element Bearings Using Minimum Entropy Deconvolution Combined with Spectral Kurtosis. Mech. Syst. Signal Process. 2007, 21, 2616–2633. [Google Scholar] [CrossRef]
  7. Huang, N.; Shen, Z.; Long, S.; Wu, M.L.C.; Shih, H.; Zheng, Q.; Yen, N.-C.; Tung, C.-C.; Liu, H. The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and Non-Stationary Time Series Analysis. Proc. R. Soc. London. Ser. A Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
  8. Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544. [Google Scholar] [CrossRef]
  9. Zhu, J.; Wang, C.; Hu, Z.; Kong, F.; Liu, X. Adaptive Variational Mode Decomposition Based on Artificial Fish Swarm Algorithm for Fault Diagnosis of Rolling Bearings. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2016, 231, 635–654. [Google Scholar] [CrossRef]
  10. Li, C.; Liu, Y.; Liao, Y. An Improved Parameter-Adaptive Variational Mode Decomposition Method and Its Application in Fault Diagnosis of Rolling Bearings. Shock Vib. 2021, 2021, 2968488. [Google Scholar] [CrossRef]
  11. He, C.; Wu, T.; Liu, C.; Chen, T. A Novel Method of Composite Multiscale Weighted Permutation Entropy and Machine Learning for Fault Complex System Fault Diagnosis. Measurement 2020, 158, 107748. [Google Scholar] [CrossRef]
  12. Qian, P.; Ma, E.; Wang, Y.; Li, Q.; Xu, H.; Zhang, D.; Xiang, Z.; Tian, X. A Novel Multi-Band Demodulation Method and Its Application in Bearing Fault Detection. Measurement 2025, 256, 118405. [Google Scholar] [CrossRef]
  13. Mao, W.; He, L.; Yan, Y.; Wang, J. Online Sequential Prediction of Bearings Imbalanced Fault Diagnosis by Extreme Learning Machine. Mech. Syst. Signal Process. 2017, 83, 450–473. [Google Scholar] [CrossRef]
  14. Meng, T.; Jiang, X.; Yang, S.; Liu, J.; Gao, H.; Zhu, Z. Spectral Feature-Informed Difference Multi-Modes Decomposition for Compound Bearing Fault Diagnosis. Expert Syst. Appl. 2025, 294, 128735. [Google Scholar] [CrossRef]
  15. Lei, Y.; Yang, B.; Jiang, X.; Jia, F.; Li, N.; Nandi, A.K. Applications of Machine Learning to Machine Fault Diagnosis: A Review and Roadmap. Mech. Syst. Signal Process. 2020, 138, 106587. [Google Scholar] [CrossRef]
  16. Li, X.; Ma, Z.; Kang, D.; Li, X. Fault Diagnosis for Rolling Bearing Based on VMD-FRFT. Measurement 2020, 155, 107554. [Google Scholar] [CrossRef]
  17. He, Y.; Hu, M.; Feng, K.; Jiang, Z. Bearing Condition Evaluation Based on the Shock Pulse Method and Principal Resonance Analysis. IEEE Trans. Instrum. Meas. 2021, 70, 1–12. [Google Scholar] [CrossRef]
  18. Li, H.; Wu, X.; Liu, T.; Li, S.; Zhang, B.; Zhou, G.; Huang, T. Composite Fault Diagnosis for Rolling Bearing Based on Parameter-Optimized VMD. Measurement 2022, 201, 111637. [Google Scholar] [CrossRef]
  19. Lee, H.-M.; Chen, C.-M.; Chen, J.-M.; Jou, Y.-L. An Efficient Fuzzy Classifier with Feature Selection Based on Fuzzy Entropy. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2001, 31, 426–432. [Google Scholar] [CrossRef]
  20. Dibaj, A.; Hassannejad, R.; Ettefagh, M.M.; Ehghaghi, M.B. Incipient Fault Diagnosis of Bearings Based on Parameter-Optimized VMD and Envelope Spectrum Weighted Kurtosis Index with a New Sensitivity Assessment Threshold. ISA Trans. 2021, 114, 413–433. [Google Scholar] [CrossRef]
  21. Li, H.; Li, S.; Sun, J.; Huang, B.; Zhang, J.; Gao, M. Ultrasound Signal Processing Based on Joint GWO-VMD Wavelet Threshold Functions. Measurement 2024, 226, 114143. [Google Scholar] [CrossRef]
  22. Lei, W.; Wang, G.; Wan, B.; Min, Y.; Wu, J.; Li, B. High Voltage Shunt Reactor Acoustic Signal Denoising Based on the Combination of VMD Parameters Optimized by Coati Optimization Algorithm and Wavelet Threshold. Measurement 2024, 224, 113854. [Google Scholar] [CrossRef]
  23. Wang, J.; Du, G.; Zhu, Z.; Shen, C.; He, Q. Fault Diagnosis of Rotating Machines Based on the EMD Manifold. Mech. Syst. Signal Process. 2020, 135, 106443. [Google Scholar] [CrossRef]
  24. Zhao, S.; Zhang, T.; Cai, L.; Yang, R. Triangulation Topology Aggregation Optimizer: A Novel Mathematics-Based Meta-Heuristic Algorithm for Continuous Optimization and Engineering Applications. Expert Syst. Appl. 2024, 238, 121744. [Google Scholar] [CrossRef]
Figure 1. Algorithm flowchart.
Figure 1. Algorithm flowchart.
Sensors 26 01394 g001
Figure 2. (a) Pure periodic decreasing signal; (b) periodic decreasing signal perturbed by white noise.
Figure 2. (a) Pure periodic decreasing signal; (b) periodic decreasing signal perturbed by white noise.
Sensors 26 01394 g002
Figure 3. Iterative curve.
Figure 3. Iterative curve.
Sensors 26 01394 g003
Figure 4. The IMFs after decomposition.
Figure 4. The IMFs after decomposition.
Sensors 26 01394 g004
Figure 5. Frequency corresponding to IMFs.
Figure 5. Frequency corresponding to IMFs.
Sensors 26 01394 g005
Figure 6. CFFI and DWCC select IMFs.
Figure 6. CFFI and DWCC select IMFs.
Sensors 26 01394 g006
Figure 7. Contrast diagram of pure signal and denoised signal.
Figure 7. Contrast diagram of pure signal and denoised signal.
Sensors 26 01394 g007
Figure 8. Analog signal spectrum energy change. (a) time-domain waveform containing noisy signals; (b) amplitude spectrum and clustering spectrum of the noise signal; (c) amplitude spectrum and clustering spectrum of the denoised signal.
Figure 8. Analog signal spectrum energy change. (a) time-domain waveform containing noisy signals; (b) amplitude spectrum and clustering spectrum of the noise signal; (c) amplitude spectrum and clustering spectrum of the denoised signal.
Sensors 26 01394 g008
Figure 9. Radar chart comparison of the health indicators across Tred, Mono, and SAIF.
Figure 9. Radar chart comparison of the health indicators across Tred, Mono, and SAIF.
Sensors 26 01394 g009
Figure 10. Ablation experiment analysis.
Figure 10. Ablation experiment analysis.
Sensors 26 01394 g010
Figure 11. Comparative analysis of SERR under different fitness functions.
Figure 11. Comparative analysis of SERR under different fitness functions.
Sensors 26 01394 g011
Table 1. Calculation of analog signal SERR.
Table 1. Calculation of analog signal SERR.
NumberSERR
10.31526
20.41265
30.34512
40.31569
50.36965
60.37445
70.40168
80.35487
90.30875
100.32588
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, B.; Shen, B.; Gao, Z.; Shao, Y.; Pang, Z.; Yin, X. Composite Fault Feature Index-Guided Variational Mode Decomposition with Dynamic Weighted Central Clustering for Bearing Fault Detection. Sensors 2026, 26, 1394. https://doi.org/10.3390/s26041394

AMA Style

Zhang B, Shen B, Gao Z, Shao Y, Pang Z, Yin X. Composite Fault Feature Index-Guided Variational Mode Decomposition with Dynamic Weighted Central Clustering for Bearing Fault Detection. Sensors. 2026; 26(4):1394. https://doi.org/10.3390/s26041394

Chicago/Turabian Style

Zhang, Bangcheng, Boyu Shen, Zhi Gao, Yubo Shao, Zaixiang Pang, and Xiaojing Yin. 2026. "Composite Fault Feature Index-Guided Variational Mode Decomposition with Dynamic Weighted Central Clustering for Bearing Fault Detection" Sensors 26, no. 4: 1394. https://doi.org/10.3390/s26041394

APA Style

Zhang, B., Shen, B., Gao, Z., Shao, Y., Pang, Z., & Yin, X. (2026). Composite Fault Feature Index-Guided Variational Mode Decomposition with Dynamic Weighted Central Clustering for Bearing Fault Detection. Sensors, 26(4), 1394. https://doi.org/10.3390/s26041394

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop