An Effective Approach to Rotatory Fault Diagnosis Combining CEEMDAN and Feature-Level Integration

Chauhan, Sumika; Vashishtha, Govind; Kaur, Prabhkiran

doi:10.3390/a18100644


by Sumika Chauhan ¹, Govind Vashishtha ¹,²,* and Prabhkiran Kaur ³

¹ Faculty of Geoengineering, Mining and Geology, Wroclaw University of Science and Technology, Na Grobli 15, 50-421 Wroclaw, Poland
² Department of Mechanical Engineering, Graphic Era Deemed to be University, Dehradun 248002, India
³ Department of Mechanical Engineering, IKGPTU Amritsar Campus, Amritsar 143105, India
* Author to whom correspondence should be addressed.

Algorithms 2025, 18(10), 644; https://doi.org/10.3390/a18100644

Submission received: 30 August 2025 / Revised: 2 October 2025 / Accepted: 9 October 2025 / Published: 12 October 2025

(This article belongs to the Special Issue AI-Powered Predictive Maintenance: Transforming Industrial Operations Through Intelligent Fault Diagnosis)


Abstract

This paper introduces an effective approach for rotatory fault diagnosis, specifically focusing on centrifugal pumps, by combining complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and feature-level integration. Centrifugal pumps are critical in various industries, and their condition monitoring is essential for reliability. The proposed methodology addresses the limitations of traditional single-sensor fault diagnosis by fusing information from acoustic and vibration signals. CEEMDAN was employed to decompose raw signals into intrinsic mode functions (IMFs), mitigating noise and non-stationary characteristics. Weighted kurtosis was used to select significant IMFs, and a comprehensive set of time, frequency, and time–frequency domain features was extracted. Feature-level fusion integrated these features, and a support vector machine (SVM) classifier, optimized using the crayfish optimization algorithm (COA), identified different health conditions. The methodology was validated on a centrifugal pump with various impeller defects, achieving a classification accuracy of 95.0%. The results demonstrate the efficacy of the proposed approach in accurately diagnosing the state of centrifugal pumps.

Keywords: CEEMDAN; SVM; centrifugal pump; feature-fusion

1. Introduction

Centrifugal pumps are ubiquitous fluid-handling machines that play a critical role in a vast array of industrial, domestic, and agricultural applications. Their operational versatility allows them to be deployed in diverse settings, often requiring continuous operation under demanding and extreme conditions. These conditions can include elevated temperatures, the handling of high-density fluids, and exposure to significant pressures, all of which place considerable stress on the pump’s internal components. Consequently, centrifugal pumps are susceptible to a range of potential failures affecting their mechanical integrity and performance [1].

These failures can be broadly categorized into three primary types: mechanically induced faults, which stem from wear and tear, fatigue, or material defects within the pump’s components; system faults, arising from issues within the larger system in which the pump operates such as blockages, cavitation, or improper installation; and operational faults, which result from incorrect operating procedures, exceeding design parameters, or inadequate maintenance. The occurrence of faults within the pump can initiate a cascade of problems, potentially leading to the breakdown of the entire system it serves, resulting in significant economic losses due to production downtime, repair costs, and potential damage to other equipment [2].

Therefore, proactively diagnosing the condition of the pump at an early stage is of paramount importance for ensuring operational reliability and minimizing potential disruptions. To achieve this, effective condition monitoring techniques are crucial for detecting the subtle, early signs of impending failure [3]. These techniques encompass a spectrum of approaches, ranging from relatively simple methods like periodic vibration analysis, which can identify imbalances or bearing issues, and temperature monitoring, which can indicate overheating or friction problems, to more sophisticated and advanced methodologies [4]. These advanced techniques include oil analysis, which assesses the condition of lubricating oil for contaminants and wear debris, and acoustic emission monitoring, which detects high-frequency sounds generated by developing cracks or leaks. By employing these condition monitoring techniques, proactive maintenance strategies can be implemented, allowing for timely repairs or component replacements, ultimately minimizing downtime, reducing the overall maintenance costs, and extending the operational lifespan of the centrifugal pump [5].

A common approach to fault diagnosis in rotating machinery relies on data acquisition from a single sensor type. However, centrifugal pumps present a more complex diagnostic challenge due to their intricate structure and the inherent variability of their operating conditions, which often include random or unpredictable factors. Consequently, relying solely on data from a single sensor type can be insufficient to capture the full spectrum of information needed for accurate fault diagnosis [6].

To address this limitation, fusing data from multiple sensor types offers a more robust and comprehensive approach. This strategy improves the completeness of the acquired information, leading to more accurate and reliable fault detection. Combining data acquired from multiple sensors in this way is known as multi-source information fusion (MSIF) [7]. In recent years, the field of MSIF has experienced significant advancements, driven by the recognition that integrating information from diverse sources can provide a more holistic understanding of the system’s condition.

MSIF works by integrating all available information received from multiple sensors, enabling cross-validation and mutual data compensation. This synergistic approach enhances the overall performance of the diagnostic system, allowing for the extraction of more useful and relevant information. Furthermore, MSIF strengthens the system’s resilience and stability by mitigating the impact of noise or inaccuracies from individual sensors [8].

In the proposed study, sound and vibration signals are acquired using dedicated sensors. However, raw signals are inherently susceptible to noise contamination, which can obscure the underlying patterns indicative of specific faults. Therefore, it is crucial to pre-process these signals using a suitable algorithm to eliminate unwanted noise and enhance the signal-to-noise ratio. This pre-processing step is essential for accurately detecting different types of defects and assessing their severity, ultimately enabling timely and effective maintenance interventions.

Traditional signal processing methods often rely on statistical analysis in the time domain, frequency domain, and time–frequency domain. However, these traditional approaches can struggle to effectively identify defects in complex systems like centrifugal pumps due to the inherent nonlinear and non-stationary characteristics of the raw signals acquired from these machines. The complex interplay of various factors contributing to pump operation results in signals that are difficult to interpret directly using conventional statistical methods.

To overcome these challenges, researchers have proposed numerous techniques in the literature that leverage decomposition methods. These methods aim to decompose the complex raw signals into a set of simpler, more manageable components, making it easier to extract meaningful features related to specific fault conditions. For instance, Azizi et al. [9] employed the empirical mode decomposition (EMD) technique for fault identification in centrifugal pumps. EMD is a data-driven, adaptive technique that decomposes a signal into a collection of intrinsic mode functions (IMFs), representing different scales of variability within the signal.

However, EMD suffers from a significant limitation known as mode mixing, where a single IMF may contain components of different frequencies, hindering accurate feature extraction. To address this issue, the ensemble empirical mode decomposition (EEMD) technique was developed. EEMD mitigates the mode mixing problem by adding white noise to the signal before decomposition and averaging the resulting IMFs across multiple trials. While EEMD improves upon EMD, it introduces increased computational complexity due to the ensemble averaging process [10,11].

The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) approach offers a further refinement, addressing the drawbacks of both EMD and EEMD. CEEMDAN provides a perfectly reconstructed signal, improved spectral separation of the modes, and reduced computational cost compared with EEMD [12,13,14]. This makes CEEMDAN a particularly attractive pre-processing technique for analyzing complex signals from rotating machinery.

Therefore, this paper utilized the CEEMDAN approach as a pre-processing step for the raw acoustic and vibration signals acquired from the centrifugal pump. Following signal decomposition, the next step involves selecting the dominant modes based on a chosen health indicator. A variety of health indicators are available in the literature, including statistical measures like kurtosis, which is sensitive to impulsive events, correlation coefficients, and various types of entropy that quantify the signal’s complexity and irregularity. To effectively identify defects in centrifugal pumps, fault features are extracted from the decomposed signals in the time domain, frequency domain, and time–frequency domain for both vibration and acoustic signals. This multi-faceted feature extraction approach aims to capture a comprehensive representation of the pump’s condition, enabling accurate fault diagnosis.

Techniques that utilize multiple data sources to improve diagnostic accuracy are collectively known as data fusion techniques. These techniques can be broadly categorized based on the stage at which the fusion operation takes place: signal-level fusion, feature-level fusion, and decision-level fusion. In the context of fault diagnosis for mechanical machinery, feature-level fusion and decision-level fusion techniques are the most commonly employed.

In feature-level fusion, the process involves extracting relevant features from each data source independently. These features, which represent characteristic attributes of the signals, are then combined into a single, consolidated feature vector. This integrated feature vector serves as the input for a classification algorithm. The advantage of feature-level fusion lies in its ability to capture complementary information from different sensors and create a more comprehensive representation of the system’s state.

Decision-level fusion, on the other hand, involves analyzing and processing each signal from a different sensor separately. Each signal is individually assessed, and a preliminary decision or classification is made based on the information it provides. These individual decisions are then combined using techniques such as fuzzy logic or Dempster–Shafer (DS) theory to arrive at a final, consolidated decision [15]. Decision-level fusion is particularly useful when dealing with heterogeneous data sources or when the relationships between different signals are complex and difficult to model directly.

This paper adopted a feature-level fusion technique, enabling the integration of relevant features extracted from different signals into a unified representation. This approach aims to leverage the complementary information provided by each sensor to enhance the accuracy and robustness of the fault diagnosis process.

Following the feature fusion stage, the next crucial step is to classify the prepared dataset into predefined categories corresponding to different fault conditions. This dataset is typically divided into a training dataset and a testing dataset. The training dataset is used to build a classification model, while the testing dataset is used to evaluate the performance and generalization ability of the trained model. A variety of classification algorithms are available in the literature including k-nearest neighbor (k-NN), convolutional neural networks (CNNs), deep learning architectures, extreme learning machines (ELMs), and support vector machine (SVM) classifiers. Hou et al. [16] proposed Diagnosisformer, a transformer-based model that enhances rolling bearing fault diagnosis by fusing frequency domain features. It uses a multi-feature parallel fusion encoder and cross-flipped decoder for improved accuracy, generalization, and robustness, achieving 99.84% and 99.85% accuracy on two datasets. Hou et al. [17] proposed a global local transformer, a lightweight method for bearing fault diagnosis that uses multi-channel vibration feature fusion and a global-local parallel self-activation unit. It balances diagnostic performance with resource constraints, reducing the storage and computational needs while maintaining generalization and robustness, verified on public and self-built data.

This article employed an SVM classifier due to its proven ability to achieve high accuracy in various classification tasks, particularly in scenarios with high-dimensional data. The performance of the SVM classifier is highly dependent on two key parameters: the kernel function, which determines the mapping of data points into a higher-dimensional space, and the regularization parameter, which controls the trade-off between maximizing the margin and minimizing the classification error.

To optimize the accuracy of the SVM classifier, an optimization technique is applied to determine the optimal values for both the kernel function parameters and the regularization parameter. Numerous optimization algorithms are available in the literature including genetic algorithms, DDMPEA, and the SCA algorithm. This paper utilized the crayfish optimization algorithm (COA) to optimize the parameters of the SVM classifier. COA is a metaheuristic optimization algorithm inspired by the social behavior of crayfish, offering a balance between exploration and exploitation in the search for optimal solutions.

The remainder of this paper is organized as follows. Section 2 provides the theoretical background on CEEMDAN, COA, and SVM. Section 3 details the proposed methodology. Section 4 applies this methodology to a centrifugal pump test rig. Finally, Section 5 concludes the paper, summarizing the findings and future directions.

2. Theoretical Background

2.1. Complete Ensemble Empirical Mode Decomposition with Adaptive Noise

The EMD method operates on the principle that complex, non-stationary time series can be represented as a combination of simpler, intrinsic oscillatory modes. EMD aims to identify and isolate these modes directly from the data, based on their inherent time scales, and then decompose the data accordingly. This decomposition process, known as sifting, effectively removes riding waves or oscillations that lack zero crossings between extrema. Consequently, EMD analyzes signal oscillations locally, separating the data into a collection of non-overlapping time scale components [18]. This process breaks down a signal $x[m]$ into its constituent IMFs based on two key properties. The first is that each IMF contains only one extremum (a local minimum or maximum) between successive zero crossings, so the numbers of extrema and zero crossings differ by at most one. The second is that the local mean of each IMF, defined by the envelopes of its local maxima and minima, is zero.

A limitation of the original EMD is the mode mixing problem. Ensemble EMD (EEMD) addresses this by introducing white noise to the input signal. However, EEMD can still leave residual noise in the IMFs, resulting in a mixture of noise and signal. To address this, the complete ensemble EMD with adaptive noise (CEEMDAN) algorithm was developed by Torres et al. [19]. CEEMDAN aims to eliminate noise entirely from the signal and achieve perfect reconstruction, closely matching the original signal. The first residue can be calculated using the following equation:

$$r_{1}[m] = x[m] - \overline{IMF}_{1}[m] \tag{1}$$

where $\overline{IMF}_{1}[m]$ is the first IMF and $x[m]$ is the original signal. The calculation of the second residue is carried out as follows:

$$r_{2}[m] = r_{1}[m] - \overline{IMF}_{2}[m] \tag{2}$$

These processes continue until the stopping criteria are met. The CEEMDAN algorithm can be summarized as follows:

Step 1. The first mode is obtained by applying EMD to $L$ realizations of the noise-added signal $x[m] + \varepsilon_{0} \omega^{l}[m]$, $l = 1, \dots, L$, and averaging the resulting first modes:

$$\overline{IMF}_{1}[m] = \frac{1}{L} \sum_{l=1}^{L} IMF_{1}^{l}[m] \tag{3}$$

where $\omega^{l}[m]$ is the $l$-th white-noise realization and $L$ represents the ensemble size.

Step 2. The first residue ($k = 1$) is calculated using Equation (1).

Step 3. The second IMF is obtained by Equation (4):

$$\overline{IMF}_{2}[m] = \frac{1}{L} \sum_{l=1}^{L} E_{1}\left( r_{1}[m] + \varepsilon_{1} E_{1}(\omega^{l}[m]) \right) \tag{4}$$

where $E_{k}(\cdot)$ is an operator that extracts the $k$-th IMF from a signal using EMD and $\varepsilon_{k}$ controls the SNR at each stage.

Step 4. The $k$-th residue is obtained using

$$r_{k}[m] = r_{k-1}[m] - \overline{IMF}_{k}[m], \quad \forall k = 1, 2, \dots, K \tag{5}$$

where $r_{0}[m] = x[m]$ and $K$ represents the total number of modes.

Step 5. The $(k+1)$-th mode is obtained as

$$\overline{IMF}_{k+1}[m] = \frac{1}{L} \sum_{l=1}^{L} E_{1}\left( r_{k}[m] + \varepsilon_{k} E_{k}(\omega^{l}[m]) \right) \tag{6}$$

Step 6. Go to Step 4 for the next $k$ and repeat until the stopping criterion is met.
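As a rough illustration of Steps 1 to 6, the following Python sketch mirrors the recursion of Equations (3) to (6). It is not a real CEEMDAN implementation: the EMD operator $E_{1}(\cdot)$ is replaced by a hypothetical stand-in (`first_mode`, the signal minus a moving-average local mean), and the names `ceemdan_sketch` and `kth_mode`, the constant noise amplitude `eps`, and all default values are assumptions made for the example. What it does show is the ensemble averaging, the residue update, and the fact that the modes and the final residue sum back to the input signal.

```python
import random


def moving_average(signal, window=5):
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def first_mode(signal):
    """Stand-in for E_1(.): signal minus its local mean (NOT real EMD sifting)."""
    local_mean = moving_average(signal)
    return [s - m for s, m in zip(signal, local_mean)]


def kth_mode(signal, k):
    """Stand-in for E_k(.): peel off k-1 modes, then extract the next one."""
    residue = list(signal)
    for _ in range(k - 1):
        mode = first_mode(residue)
        residue = [r - m for r, m in zip(residue, mode)]
    return first_mode(residue)


def ceemdan_sketch(x, n_modes=3, ensemble_size=10, eps=0.2, seed=0):
    """Return (modes, residue) following the structure of Eqs. (3)-(6)."""
    rng = random.Random(seed)
    # One white-noise realization per ensemble member, reused at every stage.
    noise = [[rng.gauss(0.0, 1.0) for _ in x] for _ in range(ensemble_size)]
    modes, residue = [], list(x)
    for k in range(n_modes):
        trials = []
        for w in noise:
            # Stage 0 adds raw noise (Eq. 3); later stages add E_k(noise) (Eqs. 4, 6).
            added = w if k == 0 else kth_mode(w, k)
            noisy = [r + eps * a for r, a in zip(residue, added)]
            trials.append(first_mode(noisy))
        # Ensemble average over the L trials.
        mode = [sum(col) / ensemble_size for col in zip(*trials)]
        modes.append(mode)
        # Residue update of Eq. (5).
        residue = [r - m for r, m in zip(residue, mode)]
    return modes, residue
```

In practice, a tested EMD implementation would take the place of `first_mode`; the surrounding loop would stay the same.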

2.2. Crayfish Optimization Algorithm (COA)

Jia et al. [20] proposed the COA, inspired by crayfish behavior. This behavior encompasses summer resort, competition, and foraging. The competition and foraging aspects mimic the exploitation and exploration stages of optimization, regulated by temperature. At elevated temperatures, crayfish either seek refuge in caves for summer or compete for the same cave. Conversely, at suitable temperatures, crayfish engage in exploration through foraging. This temperature-dependent behavior introduces greater randomness in the search for the global optimum. More information regarding the mathematics involved in this optimization technique can be found in Refs. [20,21]. The following subsections describe the procedure of COA.

2.2.1. Initialization

The COA begins by randomly initializing a population of candidate solutions, denoted as $X$. This population consists of $N$ individuals, each with a dimensionality of $dim$. The position $X_{i,j}$ of the $i$-th individual in the $j$-th dimension is then determined according to the model defined in Equation (7).

$$X_{i,j} = lb_{j} + (ub_{j} - lb_{j}) \times rand \tag{7}$$

where $ub_{j}$ and $lb_{j}$ are the upper and lower bounds of the $j$-th dimension, respectively, and $rand$ is a random number in $[0, 1]$.

2.2.2. Defining the Temperature and Number of Crayfish

As previously mentioned, temperature plays a crucial role in the various life stages of crayfish, and its influence is mathematically represented in Equation (8). Specifically, when the ambient temperature exceeds 30 °C, the crayfish instinctively seeks cooler environments, effectively entering a “summer vacation” phase. Conversely, within an optimal temperature range, the crayfish initiates its foraging behavior. This foraging range is defined as being between 15 °C and 30 °C. Because the food intake of the crayfish depends on temperature, it is approximated by a normal distribution, as modeled in Equation (9).

$$temp = rand \times 15 + 20 \tag{8}$$

$$p = C_{1} \times \left( \frac{1}{\sqrt{2\pi} \times \sigma} \times \exp\left( -\frac{(temp - \mu)^{2}}{2\sigma^{2}} \right) \right) \tag{9}$$

where $temp$ indicates the temperature of the crayfish’s location, $\mu$ is the temperature most suitable for the crayfish, and the parameters $\sigma$ and $C_{1}$ control the intake of the crayfish at different temperatures.
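To make Equations (8) and (9) concrete, the short Python sketch below samples a temperature and evaluates the intake $p$. With $rand$ in $[0, 1]$, Equation (8) yields temperatures between 20 °C and 35 °C. The function names and the default values of `mu`, `sigma`, and `c1` are illustrative assumptions, not settings reported in this paper.

```python
import math
import random


def sample_temperature(rng):
    """Eq. (8): temperature drawn uniformly from [20, 35] degrees Celsius."""
    return rng.random() * 15 + 20


def food_intake(temp, mu=25.0, sigma=3.0, c1=0.2):
    """Eq. (9): intake p as a scaled normal density centered at mu.

    mu, sigma and c1 are placeholder values chosen for illustration only.
    """
    density = math.exp(-((temp - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    return c1 * density
```

The intake is largest at `temp == mu` and decays symmetrically on either side, which is the behavior the normal-distribution model is meant to capture.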

2.2.3. Summer Resort Stage

As previously indicated, when the ambient temperature exceeds 30 °C, the crayfish retreats into a cave to seek refuge from the heat, effectively initiating a “summer vacation” phase. The location of this cave, denoted as $X_{shade}$, is determined according to the model presented in Equation (10).

$$X_{shade} = (X_{G} + X_{L}) / 2 \tag{10}$$

Here, $X_{G}$ signifies the best position discovered to date, that is, the current estimate of the global optimum. Conversely, $X_{L}$ denotes the best position within the current population. The competition for the cave is simulated as a random event; specifically, if a randomly generated number is less than 0.5, it implies an absence of competition, and the crayfish can directly occupy the cave, as mathematically expressed in Equation (11).

$$X_{i,j}^{t+1} = X_{i,j}^{t} + C_{2} \times rand \times (X_{shade} - X_{i,j}^{t}) \tag{11}$$

Here, $t$ is the current iteration, and $t + 1$ is the next iteration. The parameter $C_{2}$ is a decreasing curve computed by Equation (12).

$$C_{2} = 2 - \frac{t}{T} \tag{12}$$

where $T$ represents the maximum number of iterations. This behavior of crayfish approaching the cave simulates the process of approaching an optimal solution.

2.2.4. Competition Stage

If $temp > 30$ and $rand \geq 0.5$, other crayfish are also interested in the same cave. Therefore, they fight with each other to acquire the cave, which is shown by Equation (13).

$$X_{i,j}^{t+1} = X_{i,j}^{t} - X_{z,j}^{t} + X_{shade} \tag{13}$$

where $z$ is the index of a randomly selected crayfish and is computed by Equation (14).

$$z = round(rand \times (N - 1)) + 1 \tag{14}$$

2.2.5. Foraging Stage

The crayfish starts searching for food when the temperature does not exceed 30 °C. After finding food, the crayfish judges the size of the food. The location and size of the food are defined in Equations (15) and (16), respectively.

$$X_{food} = X_{G} \tag{15}$$

$$Q = C_{3} \times (fitness_{i} / fitness_{food}) \tag{16}$$

where $fitness_{i}$ is the fitness value of the $i$-th crayfish, $fitness_{food}$ is the fitness value at the food location, and $C_{3}$ is a constant food factor. The crayfish tears the food into smaller pieces and eats it with its claws. This simulates the behavior of approaching the optimal solution.
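The stages above can be tied together in a compact Python sketch. It follows Equations (7), (8), and (10) to (14) as written, takes $X_{L}$ as the best position in the current population, and uses zero-based indexing for $z$. The foraging update is not spelled out in this section, so the branch below uses a simple move toward $X_{food} = X_{G}$ as a placeholder; that branch, the clipping to the bounds, and the name `coa_sketch` are assumptions of this example rather than the published algorithm (see Refs. [20,21] for the full formulation).

```python
import random


def coa_sketch(objective, lb, ub, n=30, max_iter=100, seed=0):
    """Minimize `objective` over the box [lb, ub]; returns (best_x, best_f)."""
    rng = random.Random(seed)
    dim = len(lb)
    # Eq. (7): random initialization inside the bounds.
    X = [[lb[j] + (ub[j] - lb[j]) * rng.random() for j in range(dim)] for _ in range(n)]
    fit = [objective(x) for x in X]
    best = min(range(n), key=fit.__getitem__)
    x_g, f_g = list(X[best]), fit[best]
    for t in range(1, max_iter + 1):
        c2 = 2 - t / max_iter                       # Eq. (12)
        temp = rng.random() * 15 + 20               # Eq. (8)
        x_l = X[min(range(n), key=fit.__getitem__)]  # best of current population
        x_shade = [(g + l) / 2 for g, l in zip(x_g, x_l)]  # Eq. (10)
        for i in range(n):
            if temp > 30 and rng.random() < 0.5:
                # Summer resort stage, Eq. (11): move toward the cave.
                cand = [X[i][j] + c2 * rng.random() * (x_shade[j] - X[i][j]) for j in range(dim)]
            elif temp > 30:
                # Competition stage, Eqs. (13)-(14), with a zero-based index z.
                z = round(rng.random() * (n - 1))
                cand = [X[i][j] - X[z][j] + x_shade[j] for j in range(dim)]
            else:
                # ASSUMPTION: placeholder foraging step toward X_food = X_G (Eq. 15).
                cand = [X[i][j] + rng.random() * (x_g[j] - X[i][j]) for j in range(dim)]
            # ASSUMPTION: keep candidates inside the search bounds.
            cand = [min(max(c, lb[j]), ub[j]) for j, c in enumerate(cand)]
            X[i], fit[i] = cand, objective(cand)
            if fit[i] < f_g:
                x_g, f_g = list(cand), fit[i]
    return x_g, f_g
```

For example, `coa_sketch(lambda x: sum(v * v for v in x), [-5.0, -5.0], [5.0, 5.0])` searches for the minimum of a two-dimensional sphere function; in the SVM setting the two dimensions would be $C$ and γ and the objective a classification error.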

2.3. Support Vector Machine (SVM)

The SVM was proposed by Vapnik to classify data; its principle in a two-dimensional space is depicted in Figure 1.

Consider a dataset of $N$ points $(x_{i}, y_{i})$, where $i$ ranges from 1 to $N$, $x_{i}$ represents the input data, and $y_{i} \in \{-1, 1\}$ denotes the class label. SVM aims to find the optimal hyperplane that maximizes the margin between the classes. This optimal hyperplane maximizes the distance between itself and the closest data points of each class.

$$\text{Minimize} \quad \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{N} \xi_{i} \tag{17}$$

This is subject to $y_{i}(w^{T} x_{i} + b) \geq 1 - \xi_{i}$ and $\xi_{i} \geq 0$ for $i = 1$ to $N$, where $w$ represents the weight vector, $b$ is the bias, $T$ denotes the transpose, and $\xi_{i}$ is the slack variable. The values of $w$ and $b$ are determined during the training process. The regularization parameter $C$ controls the model’s complexity by balancing the margin size and the number of misclassified points. Lagrange multipliers $\alpha_{i}$ are used to incorporate the constraints into the objective, which turns the problem into a quadratic program in its dual form. The parameter $w$ is derived as in Ref. [22]. The dual objective function of the SVM is defined in Equation (18).

$$W(\alpha) = \sum_{i=1}^{N} \alpha_{i} - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j} k(x_{i}, x_{j}) \tag{18}$$

where $\alpha_{i}$ and $\alpha_{j}$ are the Lagrange multipliers and $k(x_{i}, x_{j})$ is the kernel function. The class of an input $x$ is then given by the decision function

$$f(x) = \operatorname{sign}\left( b + \sum_{i \in SV} \alpha_{i} y_{i} k(x_{i}, x) \right) \tag{19}$$

where the sum runs over the support vectors (SVs). A positive Lagrange multiplier indicates that the corresponding data point is an SV. The kernel function enables the computation of dot products in feature space without explicitly mapping the input data into that space. Many kernel functions are available, and Table 1 provides a list of some of the most widely used.
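The decision function of Equation (19) can be illustrated with a few lines of Python. The sketch assumes an RBF kernel of the form $k(a, b) = \exp(-\gamma \|a - b\|^{2})$, which is one common parameterization and may differ from the form listed in Table 1; the function names and the toy support vectors in the usage note are hypothetical.

```python
import math


def rbf_kernel(a, b, gamma):
    """ASSUMED kernel form: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, alphas, labels, bias, gamma):
    """Eq. (19): sign of the bias plus the kernel-weighted sum over the SVs."""
    score = bias + sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, alpha, y in zip(support_vectors, alphas, labels)
    )
    return 1 if score >= 0 else -1
```

With two toy support vectors, `[0.0, 0.0]` labeled −1 and `[2.0, 2.0]` labeled +1, equal multipliers, and zero bias, a query point is assigned to the class of the nearer support vector, since the kernel value decays with distance.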

The regularization parameter ($C$) and kernel parameter (γ) significantly influence the SVM performance. Selecting an appropriate value for $C$ minimizes misclassification by balancing margin maximization against the training error. A large $C$ value leads to low bias but can cause overfitting, while a small $C$ value may result in high bias and underfitting. Therefore, $C$ should be chosen carefully. In this study, the radial basis function (RBF) kernel was used to map the input data into a higher-dimensional space; its parameter γ controls the sharpness of peaks in the decision boundary. A smaller γ value creates sharper peaks, and a larger value produces smoother peaks. The COA was employed to optimize both $C$ and γ in this work. Conventional search methods available in MATLAB, such as grid search and Bayesian optimization, can also be used, but they are often time-consuming. The search ranges for $C$ and γ, based on Ref. [21], are presented in Table 2.

3. Proposed Work

The detailed steps for the proposed work are given below and are also represented in Figure 2.

Acquire the multi-sensor data from the centrifugal pump under different health conditions.
Process the raw signals through CEEMDAN to obtain the IMFs.
Compute the weighted kurtosis for each IMF to identify the significant IMFs based on their maximum value.
Extract the features from multi-domains (i.e., time domain, frequency domain, and time–frequency domain).
Prepare the dataset after applying the feature-level fusion technique.
Normalize the dataset and further divide it into the training and testing datasets in the ratio 70:30.
Optimize the SVM parameters through COA to improve the recognition rate.
Validate the built model.
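The 70:30 division in the steps above can be sketched as a shuffled index split. The function name `split_70_30`, the fixed seed, and the use of a plain (non-stratified) shuffle are assumptions of this example; the paper does not state how the samples were assigned to each subset.

```python
import random


def split_70_30(samples, labels, train_ratio=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_train = round(train_ratio * len(idx))
    train, test = idx[:n_train], idx[n_train:]
    return (
        [samples[i] for i in train],
        [labels[i] for i in train],
        [samples[i] for i in test],
        [labels[i] for i in test],
    )
```

For a dataset of 40 feature vectors, this yields 28 training and 12 testing samples.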

4. Application of the Proposed Methodology

For this experiment, a monoblock centrifugal pump was utilized to identify impeller defects. The specifications of the centrifugal pump are tabulated in Table 3. Figure 3 illustrates the schematic and typical image of the test rig, which incorporated a sound sensor and data acquisition equipment. The pump was operated at a constant speed of 2800 RPM, resulting in an operating frequency of 46.67 Hz. The test rig’s rotor shaft was supported by two bearings: Bearing 1 (bearing number 6203-ZZ), which was positioned closer to the impeller, and Bearing 2 (bearing number 6202-ZZ), located farther from the impeller. The impeller itself was situated on the rotor shaft and enclosed within the pump casing. It consists of rotating vanes that draw water axially through the impeller’s eye. The vanes then impart kinetic energy to the water, forcing it to flow radially outward through the casing. The interaction between the water and the rotating vanes, specifically the energy transfer through impacts, is what generates the chaotic phenomena within the water flow.

A uniaxial PCB accelerometer with a sensitivity of 100 mV/g was used to capture the vibration signals. The accelerometer was attached to the impeller casing using wax, as shown in Figure 3. A National Instruments 24-bit DAQ system (Austin, TX, USA) with 4 channels, set to a sampling frequency of 70,000 Hz, acquired the signal. Simultaneously, a microphone with a sensitivity of −60 dB recorded the sound signal, also at a sampling frequency of 70,000 Hz.

Four distinct health conditions were analyzed: healthy, clogging, blade cut, and wheel cut, as shown in Figure 4. The raw signals, displayed in Figure 5, are clearly masked by noise, necessitating preprocessing via CEEMDAN. Under the healthy condition, the vibration signal was more consistent and appeared to have a lower amplitude, suggesting stable operation, whereas the sound signal contained background noise. Under clogging, the vibration signal showed slightly increased randomness or higher-frequency components, potentially indicating turbulent flow or increased mechanical stress due to the blockage. The sound signal also exhibited higher-frequency noise, which is associated with turbulent flow caused by the blockage. Under the blade cut and wheel cut conditions, the vibration signals showed impulsive peaks or irregular patterns, reflecting impacts or imbalances caused by the damage, whereas the acoustic signals showed distortion that reflects the sound profile altered by the damage. CEEMDAN decomposed the raw signals into various IMFs. The IMFs obtained by CEEMDAN under different health conditions for both the vibration and sound signals are shown in Figures 6–13.

The IMFs containing useful information were considered significant and identified based on the maximum value of weighted kurtosis. The weighted kurtosis assesses the effectiveness of the decomposition results by combining standard kurtosis and the correlation coefficient. Standard kurtosis is highly dependent on the density distribution of impacts generated by a fault. Therefore, relying solely on maximum kurtosis can neglect impacts with large amplitudes but dispersed density distributions. While the correlation coefficient quickly determines the similarity between two signals, it is susceptible to noise from faulty components. To leverage the advantages of both indices, weighted kurtosis was developed, as shown in Equation (20).

$$KCI = KI \cdot |C| \tag{20}$$

where $KI$ represents the kurtosis index for input signal $x(n)$ and is expressed as

$$KI = \frac{\frac{1}{N} \sum_{n=0}^{N-1} x^{4}(n)}{\left( \frac{1}{N} \sum_{n=0}^{N-1} x^{2}(n) \right)^{2}} \tag{21}$$

Here, the length of the signal is given by $N$. Considering $E[\cdot]$ as the mathematical expectation, the correlation coefficient $C$ between $x$ and $y$ is expressed as:

$$C = \frac{E[(x - \bar{x})(y - \bar{y})]}{\sqrt{E[(x - \bar{x})^{2}] \, E[(y - \bar{y})^{2}]}} \tag{22}$$

The weighted kurtosis obtained from the IMFs of the vibration signals is tabulated in Table 4. Similarly, the weighted kurtosis obtained from the IMFs of the sound signals is presented in Table 5.
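A minimal Python sketch of the weighted kurtosis in Equations (20) to (22) is given below. It uses the Pearson form of the correlation coefficient, with the square root in the denominator, and assumes that $x$ is an IMF and $y$ is the raw signal it was decomposed from; the function names are hypothetical.

```python
import math


def kurtosis_index(x):
    """Eq. (21): fourth moment divided by the squared second moment."""
    n = len(x)
    m4 = sum(v ** 4 for v in x) / n
    m2 = sum(v ** 2 for v in x) / n
    return m4 / m2 ** 2


def correlation(x, y):
    """Eq. (22): Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov / math.sqrt(var_x * var_y)


def weighted_kurtosis(imf, raw):
    """Eq. (20): kurtosis index weighted by the absolute correlation.

    ASSUMPTION: the correlation is taken between the IMF and the raw signal.
    """
    return kurtosis_index(imf) * abs(correlation(imf, raw))
```

An impulsive sequence such as `[0, 0, 0, 4, 0, 0, 0, 0]` has a kurtosis index of 8, whereas an alternating `[1, -1, 1, -1]` gives 1, which shows why the index responds to fault-induced impacts.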

The analysis of the weighted kurtosis values presented in Tables 4 and 5 revealed that a substantial portion of the IMFs exhibited very small weighted kurtosis values. This observation suggests that these IMFs contribute negligibly to the overall signal, indicating a lack of meaningful information content. Consequently, these IMFs were deemed unsuitable for subsequent analysis and excluded from further consideration. To ensure the selection of the most informative components, the top five significant IMFs were chosen based on the maximum weighted kurtosis values derived from both the vibration and sound signals for each defined health condition. This selection process prioritized IMFs that captured the most prominent characteristics related to the health status of the system under investigation.
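The selection rule described above amounts to ranking the IMFs by weighted kurtosis and keeping the five largest. A small sketch, with a hypothetical function name and made-up values in the usage note, could look as follows.

```python
def select_top_imfs(weighted_kurtosis_values, top_k=5):
    """Return the indices of the top_k IMFs, restored to their original order."""
    ranked = sorted(
        range(len(weighted_kurtosis_values)),
        key=lambda i: weighted_kurtosis_values[i],
        reverse=True,
    )
    return sorted(ranked[:top_k])
```

For the purely illustrative values `[0.1, 3.2, 2.5, 0.02, 4.1, 1.7, 0.9, 0.003]`, the function keeps the IMFs at zero-based indices 1, 2, 4, 5, and 6 and discards those with near-zero weighted kurtosis.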

Following the identification of these significant IMFs, a comprehensive set of 30 features, detailed in Table 6, was extracted from the selected components. This feature set encompassed a diverse range of descriptors drawn from the time domain, frequency domain, and time–frequency domain, providing a multi-faceted representation of the underlying signal characteristics. The combination of these features aims to capture both the temporal dynamics and spectral properties of the IMFs, enabling a more thorough and robust analysis of the system’s health condition.

The 30 features extracted from the significant IMFs of the vibration and sound signals, corresponding to each health condition, underwent a normalization process to ensure consistent scaling. Specifically, each feature was scaled to fall within the range of [0, 1]. This normalization step is crucial for preventing features with larger magnitudes from dominating the analysis and ensuring that all features contribute equally to subsequent modeling stages. Following normalization, the features derived from both the vibration and sound signals were fused, creating a unified dataset that integrated information from both sensor modalities. This fusion process resulted in a final dataset with dimensions of $40 \times 30$, where the 40 rows represent the number of signal instances collected (i.e., 40 signals), and the 30 columns represent the total number of normalized features extracted from the significant IMFs. This combined dataset served as the input for further analysis and modeling.
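The scaling to [0, 1] described above corresponds to column-wise min-max normalization. A minimal sketch is given below; the function name and the small example matrix are hypothetical, and mapping a constant column to 0 is an assumed convention to avoid division by zero.

```python
def min_max_scale(rows):
    """Scale each column of a feature matrix (list of rows) to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0   # constant column -> 0 (assumption)
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled

# Hypothetical 3 x 3 feature matrix: rows are signals, columns are features.
raw = [
    [2.0, 10.0, 5.0],
    [4.0, 30.0, 5.0],
    [6.0, 20.0, 5.0],
]
normalized = min_max_scale(raw)
```

In practice, the minimum and maximum would be taken from the training portion of the data and reused for any held-out samples, so that no information leaks from validation folds into the scaling.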

To proceed with the classification of different health conditions, a support vector machine (SVM) was employed. It is well-known that the performance of an SVM, specifically its classification and prediction accuracy, is significantly influenced by the appropriate selection of its regularization parameter ($C$) and kernel function parameter (γ). An arbitrary or random selection of these parameters can lead to suboptimal results. Indeed, the results of the initial classification attempts using arbitrarily chosen values for $C$ and γ, as presented in Table 7, demonstrated a poor recognition rate and a time-consuming optimization process. Therefore, it is crucial to optimize the selection of these parameters to design an efficient and accurate SVM model. In this study, the parameters $C$ and γ were optimized using the COA as described in Section 2.2. The COA was configured with a population size of 30, and the optimization process was terminated when a maximum of 100 iterations was reached. All processing and simulations were conducted using MATLAB R2024b running on a Windows 11 operating system with an Intel Core-i7 processor @2.30 GHz and 64 GB of RAM.
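The structure of the parameter search can be sketched as follows. This is not an implementation of the COA: a plain random search is used as a stand-in so that the roles of the search bounds (Table 2), the population size, and the iteration limit are visible. The fitness function here is a hypothetical smooth surrogate; in the actual pipeline it would be the cross-validation error of an SVM trained with the candidate parameters.

```python
import random

# Search bounds taken from Table 2.
C_RANGE = (0.1, 5.0)
GAMMA_RANGE = (100.0, 1000.0)

def random_search(fitness, pop_size=30, max_iter=100, seed=0):
    """Random-search stand-in for the COA: draw pop_size candidates per iteration
    and keep the (C, gamma) pair with the lowest fitness (error)."""
    rng = random.Random(seed)
    best_params, best_error = None, float("inf")
    for _ in range(max_iter):
        for _ in range(pop_size):
            c = rng.uniform(*C_RANGE)
            gamma = rng.uniform(*GAMMA_RANGE)
            error = fitness(c, gamma)
            if error < best_error:
                best_params, best_error = (c, gamma), error
    return best_params, best_error

def toy_fitness(c, gamma):
    """Hypothetical surrogate with its minimum at C = 3, gamma = 500 (illustration only)."""
    return ((c - 3.0) / 4.9) ** 2 + ((gamma - 500.0) / 900.0) ** 2

best_params, best_error = random_search(toy_fitness)
```

A metaheuristic such as the COA replaces the independent uniform draws with position updates that depend on the current population, which typically reaches a comparable error with far fewer fitness evaluations.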

To ensure the robustness and generalizability of the SVM model and to avoid overfitting or underfitting, a $k$-fold cross-validation technique was implemented during the classification process. $k$-fold cross-validation involves partitioning the training dataset into $k$ equal subsets or “folds.” During each iteration, $k - 1$ folds are used for training the SVM model, while the remaining fold is used for validation. In this specific implementation, 10-fold cross-validation was employed. The fitness of the SVM model was evaluated by calculating the cross-validation error, as defined by Equation (23). This error metric guides the COA in its search for the optimal $C$ and γ parameters, aiming to minimize the cross-validation error and maximize the model’s predictive performance.

\mathrm{Error} = \frac{\text{Number of samples incorrectly predicted during cross-validation}}{\text{Total number of training samples used for validation of accuracy}} \times 100\%

(23)

\mathrm{Accuracy} = 1 - \mathrm{Error}

(24)
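The cross-validation error of Equation (23) and the accuracy of Equation (24) can be sketched in Python as below. The fold construction, the function names, and the synthetic 40-sample dataset are assumptions for illustration, and a nearest-centroid rule is used as a simple stand-in classifier rather than an SVM. Because the error is expressed as a percentage, the accuracy is computed as 100 minus the error.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_error(features, labels, fit_predict, k=10):
    """Eq. (23): percentage of held-out samples misclassified over all k folds."""
    n = len(labels)
    wrong = 0
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        predictions = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in fold],
        )
        wrong += sum(p != labels[i] for p, i in zip(predictions, fold))
    return 100.0 * wrong / n

def nearest_centroid(train_x, train_y, test_x):
    """Stand-in classifier (not an SVM): assign each sample to the closest class mean."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda y: dist2(x, centroids[y])) for x in test_x]

# Synthetic, well-separated data: 40 samples, 4 classes, classes interleaved.
labels = [i % 4 for i in range(40)]
features = [[(i % 4) + 0.01 * (i // 4), 2.0 * (i % 4)] for i in range(40)]

error = cv_error(features, labels, nearest_centroid, k=10)
accuracy = 100.0 - error   # Eq. (24), with both quantities in percent
```

When this error is used as the fitness value, each candidate pair of $C$ and γ costs one full 10-fold evaluation, which dominates the run time of the parameter search.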

The optimized SVM model achieved a classification accuracy of 95.0%. The computational cost associated with optimizing the SVM parameters using the proposed algorithm and training the SVM model was 22.7249 s. The optimized values for the regularization parameter C and the kernel function parameter γ, which yielded the maximum accuracy, were determined to be 3.1258 and 512.9957, respectively. Table 7 presents the performance of the SVM model using different combinations of C and γ, highlighting the impact of parameter selection on classification accuracy.

The efficacy of the proposed methodology was further assessed through the construction of a confusion matrix, as depicted in Figure 14. This matrix provides a detailed breakdown of the classification results, illustrating the number of instances correctly and incorrectly classified for each health condition. Analysis of the confusion matrix revealed that the proposed methodology effectively distinguished between different health conditions, achieving an overall classification accuracy of 95.0%. This demonstrates the ability of the system to accurately diagnose the state of the centrifugal pump.

To validate the effectiveness of the proposed classifier, a comparative analysis was conducted against several established classification algorithms, including a basic support vector machine (SVM), extreme learning machine (ELM), random forest (RF), k-nearest neighbor (k-NN), and decision tree (DT). The classification accuracies achieved by each of these algorithms are presented in the bar plot in Figure 15. Figure 15 shows that the proposed method, namely the SVM optimized with the COA, achieved higher accuracy than all of the other classifiers evaluated. This performance advantage underscores the efficacy and potential of the COA-optimized SVM for the classification task under consideration.

To further assess the effectiveness of the proposed methodology, a comparison was undertaken against a range of well-established optimization algorithms: genetic algorithm (GA), particle swarm optimization (PSO), ant lion optimization (ALO), sine-cosine algorithm (SCA), and slime mold algorithm (SMA). The classification accuracies resulting from the application of each of these optimization algorithms are presented in Figure 16. Figure 16 shows that the proposed methodology achieved the highest accuracy of all the optimization algorithms tested, which demonstrates its potential for the given task.

To ensure the robustness and generalizability of the proposed methodology, it was applied to a physical test rig of a worm gearbox. This gearbox was driven by a 50 Hz DC motor, with a flexible coupling connecting the two. The DC motor’s operation was governed by a control panel, as illustrated in Figure 17. Vibration and acoustic signals were acquired from the worm gearbox under three distinct health conditions, all tested at a rated RPM of 1500. The three health conditions considered were: healthy (representing a gearbox with no defects), pitting (representing a gearbox with surface fatigue), and missing (representing a gearbox with a missing tooth or component), as depicted in Figure 18. This approach allowed for a realistic evaluation of the methodology’s performance under varying operational scenarios.

Initially, the worm within the gearbox was free of any intentionally introduced defects, representing the “healthy” operating condition. While this condition was considered the baseline, the potential for inherent, pre-existing flaws was acknowledged. For each of the three defined health conditions (healthy, pitting, and missing), signals were acquired while the gearbox operated at its rated RPM of 1500. Vibration data were captured using a PCB® Piezotronics uniaxial accelerometer, characterized by a sensitivity of 100 mV/g. This accelerometer was directly mounted onto the gearbox housing to capture representative vibration patterns. Simultaneously, acoustic signals were recorded using an ECM 8000 microphone, exhibiting a sensitivity of −60 dB. A National Instruments DAQ system, featuring 24-bit resolution and 4-channel capability, was employed to acquire the vibration signals within the LabVIEW 2020 environment. Subsequent analysis of the acquired data was performed using MATLAB R2019a software running on a machine configured with an AMD Ryzen 5 4600H processor with Radeon graphics (3 GHz), 8 GB of RAM, and a 64-bit Windows 10 operating system. The raw vibration and sound signals acquired during testing are presented in Figure 19.

In the case of the worm gearbox, the optimized SVM model demonstrated a recognition accuracy of 97.2579%, achieved with a computation time of 20.5842 s. The optimized values for the regularization parameter C and the kernel function parameter γ, which yielded this high accuracy, were determined to be 3.5713 and 479.5685, respectively. Based on the results of both case studies, it can be inferred that the proposed method exhibits superior performance in accurately identifying different classes and possesses the potential for broad application across a variety of other domains.

5. Conclusions

This paper presents a robust and effective methodology for diagnosing faults in centrifugal pumps, leveraging the strengths of CEEMDAN and feature-level fusion. By integrating acoustic and vibration signals, the approach overcomes the limitations of traditional single-sensor methods, offering a more comprehensive and accurate assessment of pump health. The use of CEEMDAN for signal decomposition effectively addresses the challenges posed by non-stationary and noisy data, enabling the extraction of meaningful features. The selection of significant IMFs based on weighted kurtosis further enhances the diagnostic performance, while feature-level fusion ensures that complementary information from different sensors is effectively combined. The optimized SVM classifier, trained using the COA algorithm, achieved a high classification accuracy of 95.0%, demonstrating the practical applicability of the proposed methodology for real-world fault diagnosis. The SVM classifier, optimized using the COA, was evaluated against a range of alternative classifiers and optimization algorithms. Furthermore, it was also applied to a dataset acquired from a worm gearbox. The results demonstrate that the proposed method not only achieves superior accuracy, but also exhibits strong generalization capabilities, indicating its robustness and potential for broader application. This work provides a valuable contribution to the field of condition monitoring, offering a reliable tool for proactively identifying faults and preventing costly downtimes in centrifugal pump systems.

Finally, future studies could investigate the use of transfer learning techniques to adapt the trained model to different pump configurations and operating conditions, reducing the need for extensive retraining. These efforts would contribute to the development of more intelligent and efficient condition monitoring systems for rotating machinery, ultimately improving operational reliability and reducing maintenance costs.

Author Contributions

S.C.: Conceptualization, methodology, software, validation, formal analysis, investigation, data curation, writing—original draft preparation. G.V.: Validation, formal analysis, investigation, data curation, writing—original draft preparation. P.K.: Writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to University rules and regulations.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Rapur, J.S.; Tiwari, R. On-Line Time Domain Vibration and Current Signals Based Multi-Fault Diagnosis of Centrifugal Pumps Using Support Vector Machines. J. Nondestruct. Eval. 2018, 38, 6.
2. Vashishtha, G.; Kumar, R. Unsupervised Learning Model of Sparse Filtering Enhanced Using Wasserstein Distance for Intelligent Fault Diagnosis. J. Vib. Eng. Technol. 2023, 11, 2985–3002.
3. Jiang, L.; Du, H.; Bu, Y.; Zhao, C.; Lu, H.; Yan, J. Deep Learning-Based Multilabel Compound-Fault Diagnosis in Centrifugal Pumps. Ocean Eng. 2024, 314, 119697.
4. Ahmad, S.; Ahmad, Z.; Kim, J.-M. A Centrifugal Pump Fault Diagnosis Framework Based on Supervised Contrastive Learning. Sensors 2022, 22, 6448.
5. ALTobi, M.A.S.; Bevan, G.; Wallace, P.; Harrison, D.; Ramachandran, K.P. Fault Diagnosis of a Centrifugal Pump Using MLP-GABP and SVM with CWT. Eng. Sci. Technol. Int. J. 2019, 22, 854–861.
6. Ali, M.Z.; Shabbir, M.N.S.K.; Liang, X.; Zhang, Y.; Hu, T. Machine Learning-Based Fault Diagnosis for Single- and Multi-Faults in Induction Motors Using Measured Stator Currents and Vibration Signals. IEEE Trans. Ind. Appl. 2019, 55, 2378–2391.
7. Cai, B.; Liu, Y.; Fan, Q.; Zhang, Y.; Liu, Z.; Yu, S.; Ji, R. Multi-Source Information Fusion Based Fault Diagnosis of Ground-Source Heat Pump Using Bayesian Network. Appl. Energy 2014, 114, 1–9.
8. Fu, Y.; Chen, X.; Liu, Y.; Son, C.; Yang, Y. Multi-Source Information Fusion Fault Diagnosis for Gearboxes Based on SDP and VGG. Appl. Sci. 2022, 12, 6323.
9. Azizi, R.; Attaran, B.; Hajnayeb, A.; Ghanbarzadeh, A.; Changizian, M. Improving Accuracy of Cavitation Severity Detection in Centrifugal Pumps Using a Hybrid Feature Selection Technique. Measurement 2017, 108, 9–17.
10. Xu, Y.; Wang, H.; Xu, F.; Bi, S.; Ye, J. A Sensor Data-Driven Fault Diagnosis Method for Automotive Transmission Gearboxes Based on Improved EEMD and CNN-BiLSTM. Processes 2025, 13, 1200.
11. Shi, J.; Hou, Y.; Wang, Z.; Yang, Z.; Lv, Z. A Diagnostic Method for Noise and Intermittent Faults in Analog Circuits Based on the EEMD-SR Filtering Algorithm and LSTM-CM. Measurement 2025, 254, 117871.
12. Duan, N.; Zeng, Y.; Dao, F.; Xu, S.; Luo, X. Fault Diagnosis of Hydro-Turbine Based on Ceemdan-Mpe Preprocessing Combined with Cpo-Bilstm Modelling. Energies 2024, 18, 1342.
13. Li, J.; He, D.; Wei, Z.; Zhao, M.; Xiang, Z. CEEMDAN and Adaptive Distance Embedding for Fault Diagnosis of Train Bogie Bearing. Meas. Sci. Technol. 2025, 36, 036127.
14. Ouelaa, Z.; Djebala, A.; Younes, R.; Chaabi, L.; Ouelaa, N.; Bouacha, K.; Mrabti, A. Advanced Gear Fault Diagnosis in Non-Stationary Conditions with an Improved CEEMDAN-Wavelet Denoising Technique. Adv. Mech. Eng. 2025, 17, 16878132251356546.
15. Zhang, T.; Sun, H. A Method for Predicting the Remaining Life of Lithium-Ion Batteries Based on an Improved Dempster–Shafer Evidence Theory Framework. Energies 2025, 18, 3370.
16. Hou, Y.; Wang, J.; Chen, Z.; Ma, J.; Li, T. Diagnosisformer: An Efficient Rolling Bearing Fault Diagnosis Method Based on Improved Transformer. Eng. Appl. Artif. Intell. 2023, 124, 106507.
17. Hou, Y.; Li, T.; Wang, J.; Ma, J.; Chen, Z. A Lightweight Transformer Based on Feature Fusion and Global–Local Parallel Stacked Self-Activation Unit for Bearing Fault Diagnosis. Measurement 2024, 236, 115068.
18. Lv, Y.; Yuan, R.; Wang, T.; Li, H.; Song, G. Health Degradation Monitoring and Early Fault Diagnosis of a Rolling Bearing Based on CEEMDAN and Improved MMSE. Materials 2018, 11, 1009.
19. Torres, M.E.; Colominas, M.A.; Schlotthauer, G.; Flandrin, P. A Complete Ensemble Empirical Mode Decomposition with Adaptive Noise. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 4144–4147.
20. Jia, H.; Rao, H.; Wen, C.; Mirjalili, S. Crayfish Optimization Algorithm. Artif. Intell. Rev. 2023, 56, 1919–1979.
21. Chauhan, S.; Vashishtha, G.; Gupta, M.K.; Korkmaz, M.E.; Demirsöz, R.; Noman, K.; Kolesnyk, V. Parallel Structure of Crayfish Optimization with Arithmetic Optimization for Classifying the Friction Behaviour of Ti-6Al-4V Alloy for Complex Machinery Applications. Knowl. Based Syst. 2024, 286, 111389.
22. Chauhan, S.; Singh, M.; Kumar Aggarwal, A. An Effective Health Indicator for Bearing Using Corrected Conditional Entropy through Diversity-Driven Multi-Parent Evolutionary Algorithm. Struct. Health Monit. 2020, 20, 2525–2539.
23. Vashishtha, G.; Chauhan, S.; Yadav, N.; Kumar, A.; Kumar, R. A Two-Level Adaptive Chirp Mode Decomposition and Tangent Entropy in Estimation of Single-Valued Neutrosophic Cross-Entropy for Detecting Impeller Defects in Centrifugal Pump. Appl. Acoust. 2022, 197, 108905.

Figure 1. Basic architecture of SVM [21].

Figure 2. Flowchart of the proposed methodology.

Figure 3. (a) Schematic of the test rig and (b) pictorial view of the test rig [23].

Figure 4. Different operating conditions. (a) Clogging, (b) blade cut, and (c) wheel cut [23].

Figure 5. Raw signals of centrifugal pump under different health conditions. (a) Left panel showing vibration signals. (b) Right panel showing sound signals.

Figure 6. IMFs obtained by CEEMDAN of the vibration signal under the normal condition.

Figure 7. IMFs obtained by CEEMDAN of the vibration signal under the clogging condition.

Figure 8. IMFs obtained by CEEMDAN of the vibration signal under the blade cut condition.

Figure 9. IMFs obtained by CEEMDAN of the vibration signal under the wheel cut condition.

Figure 10. IMFs obtained by CEEMDAN of the sound signal under the normal condition.

Figure 11. IMFs obtained by CEEMDAN of the sound signal under the clogging condition.

Figure 12. IMFs obtained by CEEMDAN of the sound signal under the blade cut condition.

Figure 13. IMFs obtained by CEEMDAN of the sound signal under the wheel cut condition.

Figure 14. Confusion matrix.

Figure 15. Comparison of accuracy with different classifiers.

Figure 16. Comparison of the accuracy of the proposed method with different optimization algorithms.

Figure 17. Worm gearbox test rig.

Figure 18. Health states of worm gearbox. (a) Healthy, (b) pitting, and (c) missing.

Figure 19. Raw signals for worm gearbox under different health conditions. (a) Left panel showing vibration signals. (b) Right panel showing sound signals.

Table 1. Different kernel functions [21].

Kernel Function	Formula
Linear	$k(x_{i}, x_{j}) = x_{i} \cdot x_{j}$
Polynomial	$k(x_{i}, x_{j}) = (γ x_{i} \cdot x_{j} + r)^{d}, γ > 0$
Radial basis function (RBF)	$k(x_{i}, x_{j}) = \exp(-γ ‖x_{i} - x_{j}‖^{2}), γ > 0$
Sigmoid	$k(x_{i}, x_{j}) = \tanh(γ x_{i} \cdot x_{j} + r)$

Table 2. The search range of SVM parameters [21].

Parameter	Setting
C	0.1–5
γ	100–1000

Table 3. Specifications of the centrifugal pump [23].

Power supply	230/240 V
Motor power	0.5 HP
Discharge	1.61 L/s
Impeller type	Closed
Impeller diameter, impeller vanes	118.88 mm, 3

Table 4. Weighted kurtosis for different modes for vibration signals.

Condition	M1	M2	M3	M4	M5	M6	M7	M8	M9	M10	M11	M12	M13	M14	M15
Normal	11.3446	8.2567	3.1916	1.1590	1.0678	0.2911	0.4902	0.3253	0.3688	0.1097	0.0040	6.7294 × 10⁻⁴	3.4007 × 10⁻⁴	0.0015	-
Blade cut	5.7086	2.3391	1.4886	2.2290	1.4190	0.3116	0.5053	1.1249	0.7711	0.0010	0.0025	0.0112	0.0054	0.0065	0.0078
Clogging	2.9248	2.1873	2.5112	1.6482	1.2310	0.3507	0.4895	0.9998	0.2879	0.0356	0.0035	0.0071	0.0057	-	-
Wheel cut	6.6399	3.4269	1.6305	1.8405	1.5298	0.5381	0.6563	0.9274	0.6066	0.0193	0.0101	0.0278	0.0023	0.0086	0.0020

Table 5. Weighted kurtosis for different modes for acoustic signals.

Condition	M1	M2	M3	M4	M5	M6	M7	M8	M9	M10	M11	M12	M13
Normal	0.2736	0.3754	0.3100	0.3266	1.3614	2.3960	1.8352	1.6788	1.0321	0.7588	0.1297	0.1232	0.0049
Blade cut	0.2752	0.4271	0.7323	0.9714	0.5812	0.5629	1.8599	1.7116	0.0375	0.0745	0.0411	0.0141	0.0032
Clogging	0.4923	0.5470	1.1804	1.2899	0.8410	0.5112	2.3472	1.3757	0.3560	0.2534	0.1468	0.0264	0.0353
Wheel cut	0.1736	0.2454	0.2743	0.3237	0.3691	0.2017	1.8553	1.4669	0.1737	0.0203	0.0903	0.0281	0.0543

Table 6. Features from the time domain, frequency domain, and time–frequency domain.

Feature Notation and Formula	Feature Notation and Formula	Feature Notation and Formula
Mean A1 = $\frac{1}{N} \sum_{i = 1}^{N} x_{i}$	Kurtosis A11 = $\frac{1}{σ^{4}} \left(\frac{1}{N - 1} \sum_{i = 1}^{N} (x_{i} - μ)^{4}\right)$	Median absolute deviation A21 = $\mathrm{median}(\lvert x_{i} - \mathrm{median}(x_{i}) \rvert)$
Root mean square A2 = $\sqrt{\frac{1}{N} \sum_{i = 1}^{N} x_{i}^{2}}$	Crest factor A12 = $\frac{\max(\lvert x \rvert)}{\sqrt{\frac{1}{N} \sum_{i = 1}^{N} x_{i}^{2}}}$	Rate zero crossing A22 = $\frac{\text{number of zero crossings}}{\text{total number of points}}$
Root A3 = $\left(\frac{1}{N} \sum_{i = 1}^{N} \sqrt{\lvert x_{i} \rvert}\right)^{2}$	Shape factor A13 = $\frac{\sqrt{\frac{1}{N} \sum_{i = 1}^{N} x_{i}^{2}}}{\frac{1}{N} \sum_{i = 1}^{N} \lvert x_{i} \rvert}$	Entropy A23 = $- \sum_{i = 1}^{N} h(x_{i}) \log_{2} h(x_{i})$
Maximum value A4 = $\max(\lvert x \rvert)$	Impulse factor A14 = $\frac{\max(\lvert x \rvert)}{\frac{1}{N} \sum_{i = 1}^{N} \lvert x_{i} \rvert}$	Histogram upper bound A24 = $\max(x_{i}) + \frac{\max(x_{i}) - \min(\frac{x_{i}}{N - 1})}{2}$
Peak-to-peak A5 = $\max(\lvert x \rvert) - \min(\lvert x \rvert)$	Clearance factor A15 = $\frac{\max(\lvert x \rvert)}{\left(\frac{1}{N} \sum_{i = 1}^{N} \sqrt{\lvert x_{i} \rvert}\right)^{2}}$	Histogram lower bound A25 = $\max(x_{i}) - \frac{\max(x_{i}) - \min(\frac{x_{i}}{N - 1})}{2}$
Standard deviation A6 = $\sqrt{\frac{1}{N - 1} \sum_{i = 1}^{N} (x_{i} - μ)^{2}}$	Skewness factor A16 = $\frac{1}{σ^{3}} \frac{\left(\frac{1}{N - 1} \sum_{i = 1}^{N} (x_{i} - μ)^{3}\right)}{\left(\sqrt{\frac{1}{N} \sum_{i = 1}^{N} x_{i}^{2}}\right)^{2}}$	Activity A26 = $\frac{1}{N - 1} \sum_{i = 1}^{N} (x_{i} - μ)^{2}$
Median A7 = $\frac{50 (N + 1)}{100}\text{th observation}$	Kurtosis factor A17 = $\frac{1}{σ^{4}} \frac{\left(\frac{1}{N - 1} \sum_{i = 1}^{N} (x_{i} - μ)^{4}\right)}{\left(\sqrt{\frac{1}{N} \sum_{i = 1}^{N} x_{i}^{2}}\right)^{4}}$	Variance A27 = $\frac{\sum_{i = 1}^{N} (x(i) - x_{\mathrm{avg}})^{2}}{N - 1}$
25th percentile A8 = $\frac{25 (N + 1)}{100}\text{th observation}$	Geometric mean A18 = $\left(\prod_{i = 1}^{N} x_{i}\right)^{\frac{1}{N}}$	Wavelet energy decomposition A28 = $\frac{\sum_{n = 1}^{N} \lvert x_{i}(n) \rvert^{2}}{\sum_{k = 0}^{2^{j} - 1} \sum_{n = 1}^{N} \lvert x_{k}(n) \rvert^{2}}$
75th percentile A9 = $\frac{75 (N + 1)}{100}\text{th observation}$	Root sum of squares A19 = $\sqrt{\sum_{i = 1}^{N} \lvert x_{i} \rvert^{2}}$	Normal negative log likelihood for single Gaussian A29 = $- \sum_{i = 1}^{N} \log\left[\frac{1}{σ \sqrt{2 π}} \exp\left(\frac{-(x_{i} - μ)^{2}}{2 σ^{2}}\right)\right]$
Skewness A10 = $\frac{1}{σ^{3}} \left(\frac{1}{N - 1} \sum_{i = 1}^{N} (x_{i} - μ)^{3}\right)$	Mean absolute deviation A20 = $\mathrm{mean}(\lvert x_{i} - μ \rvert)$	Total harmonic distortion A30 = $\frac{\sqrt{\sum_{i = 2}^{N} Y_{i}^{2}}}{Y_{1}}$

Table 7. Effectiveness of SVM at different values of C and γ.

S. No.	C Value	γ Value	Efficiency
1	4.127	189.9128	62.67%
2	3.2598	499.8147	79.87%
3	2.4179	522.1742	80.60%
4	3.1258	512.9957	95.0%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

