1. Introduction
Centrifugal pumps are ubiquitous fluid-handling machines that play a critical role in a vast array of industrial, domestic, and agricultural applications. Their operational versatility allows them to be deployed in diverse settings, often requiring continuous operation under demanding and extreme conditions. These conditions can include elevated temperatures, the handling of high-density fluids, and exposure to significant pressures, all of which place considerable stress on the pump’s internal components. Consequently, centrifugal pumps are susceptible to a range of potential failures affecting their mechanical integrity and performance [
1].
These failures can be broadly categorized into three primary types: mechanically induced faults, which stem from wear and tear, fatigue, or material defects within the pump’s components; system faults, arising from issues within the larger system in which the pump operates such as blockages, cavitation, or improper installation; and operational faults, which result from incorrect operating procedures, exceeding design parameters, or inadequate maintenance. The occurrence of faults within the pump can initiate a cascade of problems, potentially leading to the breakdown of the entire system it serves, resulting in significant economic losses due to production downtime, repair costs, and potential damage to other equipment [
2].
Therefore, proactively diagnosing the condition of the pump at an early stage is of paramount importance for ensuring operational reliability and minimizing potential disruptions. To achieve this, effective condition monitoring techniques are crucial for detecting the subtle, early signs of impending failure [
3]. These techniques encompass a spectrum of approaches, ranging from relatively simple methods like periodic vibration analysis, which can identify imbalances or bearing issues, and temperature monitoring, which can indicate overheating or friction problems, to more sophisticated and advanced methodologies [
4]. These advanced techniques include oil analysis, which assesses the condition of lubricating oil for contaminants and wear debris, and acoustic emission monitoring, which detects high-frequency sounds generated by developing cracks or leaks. By employing these condition monitoring techniques, proactive maintenance strategies can be implemented, allowing for timely repairs or component replacements, ultimately minimizing downtime, reducing the overall maintenance costs, and extending the operational lifespan of the centrifugal pump [
5].
A common approach to fault diagnosis in rotating machinery relies on data acquisition from a single sensor type. However, centrifugal pumps present a more complex diagnostic challenge due to their intricate structure and the inherent variability of their operating conditions, which often include random or unpredictable factors. Consequently, relying solely on data from a single sensor type can be insufficient to capture the full spectrum of information needed for accurate fault diagnosis [
6].
To address this limitation, fusing data from multiple sensor types offers a more robust and comprehensive approach. This strategy improves the completeness of the acquired information, leading to more accurate and reliable fault detection. Combining the data acquired from multiple sensors is known as multi-source information fusion (MSIF) [7]. In recent years, the field of MSIF has experienced significant advancements, driven by the recognition that integrating information from diverse sources can provide a more holistic understanding of the system’s condition.
MSIF works by integrating all available information received from multiple sensors, enabling cross-validation and mutual data compensation. This synergistic approach enhances the overall performance of the diagnostic system, allowing for the extraction of more useful and relevant information. Furthermore, MSIF strengthens the system’s resilience and stability by mitigating the impact of noise or inaccuracies from individual sensors [
8].
In this study, sound and vibration signals are acquired using dedicated sensors. However, raw signals are inherently susceptible to noise contamination, which can obscure the underlying patterns indicative of specific faults. Therefore, it is crucial to pre-process these signals using a suitable algorithm to suppress unwanted noise and enhance the signal-to-noise ratio. This pre-processing step is essential for accurately detecting different types of defects and assessing their severity, ultimately enabling timely and effective maintenance interventions.
Traditional signal processing methods often rely on statistical analysis in the time domain, frequency domain, and time–frequency domain. However, these traditional approaches can struggle to effectively identify defects in complex systems like centrifugal pumps due to the inherent nonlinear and non-stationary characteristics of the raw signals acquired from these machines. The complex interplay of various factors contributing to pump operation results in signals that are difficult to interpret directly using conventional statistical methods.
To overcome these challenges, researchers have proposed numerous techniques in the literature that leverage decomposition methods. These methods aim to decompose the complex raw signals into a set of simpler, more manageable components, making it easier to extract meaningful features related to specific fault conditions. For instance, Azizi et al. [
9] employed the empirical mode decomposition (EMD) technique for fault identification in centrifugal pumps. EMD is a data-driven, adaptive technique that decomposes a signal into a collection of intrinsic mode functions (IMFs), representing different scales of variability within the signal.
However, EMD suffers from a significant limitation known as mode mixing, where a single IMF may contain components of different frequencies, hindering accurate feature extraction. To address this issue, the ensemble empirical mode decomposition (EEMD) technique was developed. EEMD mitigates the mode mixing problem by adding white noise to the signal before decomposition and averaging the resulting IMFs across multiple trials. While EEMD improves upon EMD, it introduces increased computational complexity due to the ensemble averaging process [10,11].
The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) approach offers a further refinement, addressing the drawbacks of both EMD and EEMD. CEEMDAN provides a perfectly reconstructed signal, improved spectral separation of the modes, and reduced computational cost compared with EEMD [12,13,14]. This makes CEEMDAN a particularly attractive pre-processing technique for analyzing complex signals from rotating machinery.
Therefore, this paper utilized the CEEMDAN approach as a pre-processing step for the raw acoustic and vibration signals acquired from the centrifugal pump. Following signal decomposition, the next step involves selecting the dominant modes based on a chosen health indicator. A variety of health indicators are available in the literature, including statistical measures like kurtosis, which is sensitive to impulsive events, correlation coefficients, and various types of entropy that quantify the signal’s complexity and irregularity. To effectively identify defects in centrifugal pumps, fault features are extracted from the decomposed signals in the time domain, frequency domain, and time–frequency domain for both vibration and acoustic signals. This multi-faceted feature extraction approach aims to capture a comprehensive representation of the pump’s condition, enabling accurate fault diagnosis.
Techniques that utilize multiple data sources to improve diagnostic accuracy are collectively known as data fusion techniques. These techniques can be broadly categorized based on the stage at which the fusion operation takes place: signal-level fusion, feature-level fusion, and decision-level fusion. In the context of fault diagnosis for mechanical machinery, feature-level fusion and decision-level fusion techniques are the most commonly employed.
In feature-level fusion, the process involves extracting relevant features from each data source independently. These features, which represent characteristic attributes of the signals, are then combined into a single, consolidated feature vector. This integrated feature vector serves as the input for a classification algorithm. The advantage of feature-level fusion lies in its ability to capture complementary information from different sensors and create a more comprehensive representation of the system’s state.
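The concatenation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `fuse_features` and the feature values are hypothetical, and simple end-to-end concatenation is assumed as the fusion rule.

```python
def fuse_features(vibration_features, sound_features):
    """Concatenate per-sensor feature vectors, sample by sample.

    Assumption: both inputs are lists of equal length, one feature
    vector per recorded signal, in the same sample order.
    """
    if len(vibration_features) != len(sound_features):
        raise ValueError("both sensors must provide the same number of samples")
    return [list(v) + list(s) for v, s in zip(vibration_features, sound_features)]


# Hypothetical values: two samples, two features per sensor.
vib = [[0.12, 3.4], [0.30, 5.1]]
snd = [[0.05, 2.9], [0.09, 4.2]]
fused = fuse_features(vib, snd)
print(fused)
```

Each fused row keeps the vibration features first and the sound features second, so a classifier sees one consolidated vector per sample.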
Decision-level fusion, on the other hand, involves analyzing and processing each signal from a different sensor separately. Each signal is individually assessed, and a preliminary decision or classification is made based on the information it provides. These individual decisions are then combined using techniques such as fuzzy logic or Dempster–Shafer (DS) theory to arrive at a final, consolidated decision [
15]. Decision-level fusion is particularly useful when dealing with heterogeneous data sources or when the relationships between different signals are complex and difficult to model directly.
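For readers unfamiliar with Dempster–Shafer combination, the sketch below applies Dempster's rule to two per-sensor belief assignments. It is only an illustration of decision-level fusion in general (this paper uses feature-level fusion instead); the function name, the two hypotheses, and all mass values are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Keys are frozensets of hypotheses; values are masses summing to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


HEALTHY = frozenset({"healthy"})
CLOGGING = frozenset({"clogging"})
EITHER = HEALTHY | CLOGGING  # mass left uncommitted by a sensor

# Hypothetical preliminary decisions from each sensor.
m_vibration = {HEALTHY: 0.6, CLOGGING: 0.1, EITHER: 0.3}
m_sound = {HEALTHY: 0.5, CLOGGING: 0.2, EITHER: 0.3}

fused_belief = dempster_combine(m_vibration, m_sound)
print(fused_belief)
```

When both sensors lean toward the same hypothesis, the combined mass on that hypothesis exceeds either individual mass, which is the cross-validation effect decision-level fusion relies on.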
This paper adopted a feature-level fusion technique, enabling the integration of relevant features extracted from different signals into a unified representation. This approach aims to leverage the complementary information provided by each sensor to enhance the accuracy and robustness of the fault diagnosis process.
Following the feature fusion stage, the next crucial step is to classify the prepared dataset into predefined categories corresponding to different fault conditions. This dataset is typically divided into a training dataset and a testing dataset. The training dataset is used to build a classification model, while the testing dataset is used to evaluate the performance and generalization ability of the trained model. A variety of classification algorithms are available in the literature including k-nearest neighbor (k-NN), convolutional neural networks (CNNs), deep learning architectures, extreme learning machines (ELMs), and support vector machine (SVM) classifiers. Hou et al. [
16] proposed Diagnosisformer, a transformer-based model that enhances rolling bearing fault diagnosis by fusing frequency domain features. It uses a multi-feature parallel fusion encoder and cross-flipped decoder for improved accuracy, generalization, and robustness, achieving 99.84% and 99.85% accuracy on two datasets. Hou et al. [
17] proposed a global local transformer, a lightweight method for bearing fault diagnosis that uses multi-channel vibration feature fusion and a global-local parallel self-activation unit. It balances diagnostic performance with resource constraints, reducing the storage and computational needs while maintaining generalization and robustness, verified on public and self-built data.
This article employed an SVM classifier due to its proven ability to achieve high accuracy in various classification tasks, particularly in scenarios with high-dimensional data. The performance of the SVM classifier is highly dependent on two key parameters: the kernel function, which determines the mapping of data points into a higher-dimensional space, and the regularization parameter, which controls the trade-off between maximizing the margin and minimizing the classification error.
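For reference, the roles of the two parameters can be made explicit with the standard soft-margin SVM formulation. The notation below is the textbook one and is not taken from this paper; the radial basis function (RBF) kernel is assumed here as a common choice, since the kernel type is not stated in this section.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_{i}
\qquad \text{subject to} \qquad
  y_{i}\bigl(\mathbf{w}^{\top}\phi(\mathbf{x}_{i}) + b\bigr) \ge 1 - \xi_{i},
  \quad \xi_{i} \ge 0,

K(\mathbf{x}_{i}, \mathbf{x}_{j})
  = \phi(\mathbf{x}_{i})^{\top}\phi(\mathbf{x}_{j})
  = \exp\bigl(-\gamma \lVert \mathbf{x}_{i} - \mathbf{x}_{j} \rVert^{2}\bigr)
  \quad \text{(RBF kernel, assumed)}.
```

A larger C penalizes the slack variables more heavily and so tolerates fewer training errors, while a larger γ makes the kernel more local, so the two parameters jointly control how tightly the decision boundary follows the training data.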
To optimize the accuracy of the SVM classifier, an optimization technique is applied to determine the optimal values for both the kernel function parameters and the regularization parameter. Numerous optimization algorithms are available in the literature including genetic algorithms, DDMPEA, and the SCA algorithm. This paper utilized the crayfish optimization algorithm (COA) to optimize the parameters of the SVM classifier. COA is a metaheuristic optimization algorithm inspired by the summer resort, competition, and foraging behaviors of crayfish, offering a balance between exploration and exploitation in the search for optimal solutions.
The remainder of this paper is organized as follows. Section 2 establishes the theoretical foundation, providing essential background. Section 3 details the proposed methodology, outlining its key components. Section 4 demonstrates the application of this methodology, showcasing its practical use. Finally, Section 5 concludes the paper, summarizing the findings and future directions.
4. Application of the Proposed Methodology
For this experiment, a monoblock centrifugal pump was utilized to identify impeller defects. The specifications of the centrifugal pump are tabulated in Table 3.
Figure 3 illustrates the schematic and typical image of the test rig, which incorporated a sound sensor and data acquisition equipment. The pump was operated at a constant speed of 2800 RPM, resulting in an operating frequency of 46.67 Hz. The test rig’s rotor shaft was supported by two bearings: Bearing 1 (bearing number 6203-ZZ), which was positioned closer to the impeller, and Bearing 2 (bearing number 6202-ZZ), located farther from the impeller. The impeller itself was situated on the rotor shaft and enclosed within the pump casing. It consists of rotating vanes that draw water axially through the impeller’s eye. The vanes then impart kinetic energy to the water, forcing it to flow radially outward through the casing. The interaction between the water and the rotating vanes, specifically the energy transfer through impacts, is what generates the chaotic phenomena within the water flow.
A uniaxial PCB accelerometer with a sensitivity of 100 mV/g was used to capture the vibration signals. The accelerometer was attached to the impeller casing using wax, as shown in Figure 3. A National Instruments 24-bit DAQ system (Austin, TX, USA) with 4 channels, set to a sampling frequency of 70,000 Hz, acquired the signal. Simultaneously, a microphone with a sensitivity of −60 dB recorded the sound signal, also at a sampling frequency of 70,000 Hz.
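As a quick check on the acquisition settings, the arithmetic linking the shaft speed and the sampling frequency can be worked through directly. The variable names are illustrative; only the 2800 RPM speed and the 70,000 Hz sampling frequency come from the text.

```python
speed_rpm = 2800   # pump speed stated in the text
fs_hz = 70_000     # sampling frequency stated in the text

shaft_hz = speed_rpm / 60            # rotations per second (operating frequency)
nyquist_hz = fs_hz / 2               # highest frequency representable without aliasing
samples_per_rev = fs_hz / shaft_hz   # samples captured during one shaft revolution

print(f"shaft frequency: {shaft_hz:.2f} Hz")
print(f"Nyquist frequency: {nyquist_hz:.0f} Hz")
print(f"samples per revolution: {samples_per_rev:.0f}")
```

The operating frequency of 46.67 Hz follows from dividing the speed by 60, and the sampling rate leaves a wide margin above it for resolving harmonics and impulsive, high-frequency content.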
Four distinct health conditions were analyzed: healthy, clogging, blade cut, and wheel cut, as shown in
Figure 4. The raw signals, displayed in
Figure 5, are clearly masked with noise, necessitating preprocessing via complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). In the case of the normal condition, the vibration signal was more consistent and possibly had a lower amplitude, suggesting stable operation, whereas the sound signal during normal operation had background noise. In the case of clogging, the vibration signal showed slightly increased randomness or higher frequency components, potentially indicating turbulent flow or increased mechanical stress due to blockage. The sound signals also exhibited higher-frequency noise, which is associated with turbulent flow due to the blockage. The blade cut or wheel cut conditions showed some impulsive peaks or irregular patterns, reflecting impacts or imbalances caused by the damage in the vibration signals, whereas the acoustic data showed a distortion that represents the altered sound profile because of the blade cut and wheel cut. CEEMDAN decomposed the raw signals into various intrinsic mode functions (IMFs). The IMFs obtained by CEEMDAN under different health conditions for both the vibration signals and sound signals are shown in
Figures 6–13.
The IMFs containing useful information were considered significant and identified based on the maximum value of weighted kurtosis. The weighted kurtosis assesses the effectiveness of the decomposition results by combining standard kurtosis and the correlation coefficient. Standard kurtosis is highly dependent on the density distribution of impacts generated by a fault. Therefore, relying solely on maximum kurtosis can neglect impacts with large amplitudes but dispersed density distributions. While the correlation coefficient quickly determines the similarity between two signals, it is susceptible to noise from faulty components. To leverage the advantages of both indices, weighted kurtosis was developed, as shown in Equation (20).

$$ WK = KI \cdot \lvert C \rvert \tag{20} $$

where $KI$ represents the kurtosis index for input signal $x$ and is expressed as

$$ KI = \frac{\frac{1}{N}\sum_{n=1}^{N} x^{4}(n)}{\left(\frac{1}{N}\sum_{n=1}^{N} x^{2}(n)\right)^{2}} \tag{21} $$

Here, the length of the signal is given by $N$. Considering $E[\cdot]$ as a mathematical expectation, the correlation $C$ between $x$ and $y$ is expressed as:

$$ C = \frac{E\left[(x-\bar{x})(y-\bar{y})\right]}{\sqrt{E\left[(x-\bar{x})^{2}\right]\,E\left[(y-\bar{y})^{2}\right]}} \tag{22} $$
The weighted kurtosis obtained from the IMFs of the vibration signals is tabulated in Table 4. Similarly, the weighted kurtosis obtained from the IMFs of the sound signals is presented in Table 5.
The analysis of the weighted kurtosis values, as presented in the accompanying tables, revealed that a substantial portion of the IMFs exhibited very small kurtosis values. This observation suggests that these IMFs contribute negligibly to the overall signal, indicating a lack of meaningful information content. Consequently, these IMFs were deemed unsuitable for subsequent analysis and excluded from further consideration. To ensure the selection of the most informative components, the top five significant IMFs were chosen based on the maximum weighted kurtosis values derived from both the vibration and sound signals for each defined health condition. This selection process prioritized IMFs that captured the most prominent characteristics related to the health status of the system under investigation.
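The selection rule described above can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the weighted kurtosis of an IMF is the product of its kurtosis index and the absolute correlation coefficient between the IMF and the raw signal, which is a common form of this index. The function names and the toy signals are hypothetical.

```python
import math


def kurtosis_index(x):
    """Fourth raw moment divided by the squared second raw moment."""
    n = len(x)
    m4 = sum(v ** 4 for v in x) / n
    m2 = sum(v ** 2 for v in x) / n
    return m4 / (m2 ** 2)


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def weighted_kurtosis(imf, raw):
    # Assumed form: kurtosis index weighted by |correlation with the raw signal|.
    return kurtosis_index(imf) * abs(correlation(imf, raw))


def top_imfs(imfs, raw, k=5):
    """Return the indices of the k IMFs with the largest weighted kurtosis."""
    scores = [weighted_kurtosis(imf, raw) for imf in imfs]
    order = sorted(range(len(imfs)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores


# Toy example: an impulsive component plus a sinusoid (two full periods).
impulsive = [0.0] * 16
impulsive[4], impulsive[12] = 1.0, -1.0
sine = [math.sin(2 * math.pi * i / 8) for i in range(16)]
raw = [a + b for a, b in zip(impulsive, sine)]

selected, scores = top_imfs([impulsive, sine], raw, k=1)
print(selected, scores)
```

In the toy example the impulsive component is ranked first even though the sinusoid correlates more strongly with the raw signal, which illustrates why the two indices are combined rather than used alone.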
Following the identification of these significant IMFs, a comprehensive set of 30 features, detailed in Table 6, was extracted from the selected components. This feature set encompassed a diverse range of descriptors drawn from the time domain, frequency domain, and time–frequency domain, providing a multi-faceted representation of the underlying signal characteristics. The combination of these features aims to capture both the temporal dynamics and spectral properties of the IMFs, enabling a more thorough and robust analysis of the system’s health condition.
The 30 features extracted from the significant IMFs of the vibration and sound signals, corresponding to each health condition, underwent a normalization process to ensure consistent scaling. Specifically, each feature was scaled to fall within the range of [0, 1]. This normalization step is crucial for preventing features with larger magnitudes from dominating the analysis and ensuring that all features contribute equally to subsequent modeling stages. Following normalization, the features derived from both the vibration and sound signals were fused, creating a unified dataset that integrated information from both sensor modalities. This fusion process resulted in a final dataset with dimensions of 40 × 30, where the 40 rows represent the number of signal instances collected (i.e., 40 signals), and the 30 columns represent the total number of normalized features extracted from the significant IMFs. This combined dataset served as the input for further analysis and modeling.
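The scaling to [0, 1] can be sketched as column-wise min-max normalization. This is an assumption about the exact rule, since the text states only the target range; the function name and the sample values are hypothetical.

```python
def min_max_scale(rows):
    """Scale each feature (column) of a samples-by-features table to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical table: three samples, two features on very different scales.
features = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
normalized = min_max_scale(features)
print(normalized)
```

After scaling, both columns span the same range, so a distance-based kernel no longer favors the feature that happened to have the larger raw magnitude.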
To proceed with the classification of different health conditions, a support vector machine (SVM) was employed. It is well-known that the performance of an SVM, specifically its classification and prediction accuracy, is significantly influenced by the appropriate selection of its regularization parameter C and kernel function parameter γ. An arbitrary or random selection of these parameters can lead to suboptimal results. Indeed, the results of the initial classification attempts using arbitrarily chosen values for C and γ, as presented in Table 7, demonstrated a poor recognition rate and a time-consuming optimization process. Therefore, it is crucial to optimize the selection of these parameters to design an efficient and accurate SVM model. In this study, the parameters C and γ were optimized using the COA as described in Section 2.2. The COA was configured with a population size of 30, and the optimization process was terminated when a maximum of 100 iterations was reached. All processing and simulations were conducted using MATLAB R2024b running on a Windows 11 operating system with an Intel Core-i7 processor @2.30 GHz and 64 GB of RAM.
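To show how a population size of 30 and a budget of 100 iterations translate into a search over the two SVM parameters, the sketch below uses plain random search on a logarithmic scale. It is deliberately not the crayfish optimization algorithm: the update rules from Section 2.2 are not reproduced, and the search bounds, the function names, and the toy fitness function (with a made-up optimum at C = 1, γ = 10) are all assumptions for illustration.

```python
import math
import random


def log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def random_search(fitness, bounds, pop_size=30, max_iter=100, seed=0):
    """Stand-in optimizer: evaluate pop_size random candidates per iteration."""
    rng = random.Random(seed)
    best, best_fit = None, float("inf")
    for _ in range(max_iter):
        population = [
            tuple(log_uniform(rng, lo, hi) for lo, hi in bounds)
            for _ in range(pop_size)
        ]
        for candidate in population:
            value = fitness(candidate)
            if value < best_fit:  # lower fitness (error) is better
                best, best_fit = candidate, value
    return best, best_fit


def toy_fitness(params):
    # Hypothetical error surface; a real run would return the
    # cross-validation error of an SVM trained with these parameters.
    c, gamma = params
    return math.log(c) ** 2 + (math.log(gamma) - math.log(10.0)) ** 2


BOUNDS = [(0.01, 100.0), (0.01, 1000.0)]  # assumed ranges for C and gamma
best_params, best_value = random_search(toy_fitness, BOUNDS)
print(best_params, best_value)
```

With these settings the fitness function is evaluated 3000 times, which is the main driver of computation time when each evaluation involves training and validating a classifier.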
To ensure the robustness and generalizability of the SVM model and to avoid overfitting or underfitting, a k-fold cross-validation technique was implemented during the classification process. k-fold cross-validation involves partitioning the training dataset into k equal subsets or “folds.” During each iteration, k − 1 folds are used for training the SVM model, while the remaining fold is used for validation. In this specific implementation, a k-fold cross-validation technique was employed. The fitness of the SVM model was evaluated by calculating the cross-validation error, as defined by Equation (23). This error metric guides the COA in its search for the optimal C and γ parameters, aiming to minimize the cross-validation error and maximize the model’s predictive performance.
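The cross-validation error used as the fitness value can be sketched as the mean misclassification rate over the folds. This is an assumed reading of the error metric, since Equation (23) is not reproduced in this section. The classifier is passed in as a callable, and a one-nearest-neighbor rule stands in for the SVM purely to keep the example self-contained; all names and data are hypothetical.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Assumption: the samples have already been shuffled.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validation_error(X, y, k, fit_predict):
    """Average misclassification rate over k train/validation splits."""
    errors = []
    for held_out in kfold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predictions = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in held_out]
        )
        wrong = sum(p != y[i] for p, i in zip(predictions, held_out))
        errors.append(wrong / len(held_out))
    return sum(errors) / k


def one_nn(train_X, train_y, test_X):
    # Stand-in classifier (not an SVM): label of the nearest training point.
    return [
        train_y[min(range(len(train_X)), key=lambda j: abs(train_X[j] - x))]
        for x in test_X
    ]


# Hypothetical one-dimensional data with two well-separated classes.
X = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
y = [0, 0, 0, 1, 1, 1]
cv_error = cross_validation_error(X, y, k=3, fit_predict=one_nn)
print(cv_error)
```

In an optimization loop, this function would be called once per candidate parameter pair, with `fit_predict` wrapping an SVM trained using that pair.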
The optimized SVM model achieved a classification accuracy of 95.0%. The computation time for optimizing the SVM parameters with the proposed algorithm and training the SVM model was 22.7249 s. The optimized values for the regularization parameter C and the kernel function parameter γ, which yielded the maximum accuracy, were determined to be 3.1258 and 512.9957, respectively. Table 7 presents the performance of the SVM model using different combinations of C and γ, highlighting the impact of parameter selection on classification accuracy.
The efficacy of the proposed methodology was further assessed through the construction of a confusion matrix, as depicted in
Figure 14. This matrix provides a detailed breakdown of the classification results, illustrating the number of instances correctly and incorrectly classified for each health condition. Analysis of the confusion matrix revealed that the proposed methodology effectively distinguished between different health conditions, achieving an overall classification accuracy of 95.0%. This demonstrates the ability of the system to accurately diagnose the state of the centrifugal pump.
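The relationship between a confusion matrix and the overall accuracy can be sketched as follows. The labels match the four health conditions in the text, but the true and predicted labels below are invented for illustration and do not reproduce the results in Figure 14.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows index the true class, columns index the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


LABELS = ["healthy", "clogging", "blade cut", "wheel cut"]

# Hypothetical test set: two samples per class, one misclassification.
y_true = ["healthy", "healthy", "clogging", "clogging",
          "blade cut", "blade cut", "wheel cut", "wheel cut"]
y_pred = ["healthy", "healthy", "clogging", "clogging",
          "blade cut", "blade cut", "wheel cut", "blade cut"]

cm = confusion_matrix(y_true, y_pred, LABELS)
print(cm)
print(accuracy(cm))
```

Off-diagonal entries show which conditions are confused with each other, which a single accuracy figure cannot reveal.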
To validate the effectiveness of the proposed classifier, a comparative analysis was conducted against several established classification algorithms including a basic support vector machine (SVM), extreme learning machine (ELM), random forest (RF), k-nearest neighbor (k-NN), and decision tree (DT). The classification accuracies achieved by each of these algorithms are presented in the bar plot in Figure 15. Figure 15 shows that the proposed method, SVM optimized with the COA, achieved higher accuracy than all of the other classifiers evaluated. This performance advantage underscores the efficacy and potential of the proposed SVM with COA approach for the classification task under consideration.
To further assess the effectiveness of the proposed methodology, a comprehensive comparison was undertaken against a range of well-established optimization algorithms. These included genetic algorithm (GA), particle swarm optimization (PSO), ant lion optimization (ALO), sine-cosine algorithm (SCA), and slime mold algorithm (SMA). The classification accuracies resulting from the application of each of these optimization algorithms are presented in detail in
Figure 16. Upon the careful examination of
Figure 16, it became evident that the proposed methodology achieved the highest level of accuracy in comparison to all of the other optimization algorithms tested. This clearly demonstrates the superior performance and potential of the proposed methodology for the given task.
To assess the robustness and generalizability of the proposed methodology, it was applied to a physical test rig of a worm gearbox. This gearbox was driven by a 50 Hz DC motor, with a flexible coupling connecting the two. The DC motor’s operation was governed by a control panel, as illustrated in
Figure 17. Vibration and acoustic signals were acquired from the worm gearbox under three distinct health conditions, all tested at a rated RPM of 1500. The three health conditions considered were: healthy (representing a gearbox with no defects), pitting (representing a gearbox with surface fatigue), and missing (representing a gearbox with a missing tooth or component), as depicted in
Figure 18. This approach allowed for a realistic evaluation of the methodology’s performance under varying operational scenarios.
Initially, the worm within the gearbox was free of any intentionally introduced defects, representing the “healthy” operating condition. While this condition was considered the baseline, the potential for inherent, pre-existing flaws was acknowledged. For each of the three defined health conditions (healthy, pitting, and missing), signals were acquired while the gearbox operated at its rated RPM of 1500. Vibration data were captured using a PCB® Piezotronics uniaxial accelerometer, characterized by a sensitivity of 100 mV/g. This accelerometer was directly mounted onto the gearbox housing to capture representative vibration patterns. Simultaneously, acoustic signals were recorded using an ECM 8000 microphone, exhibiting a sensitivity of −60 dB. A National Instruments DAQ system, featuring 24-bit resolution and 4-channel capability, was employed to acquire the vibration signals within the LabVIEW 2020 environment. Subsequent analysis of the acquired data was performed using MATLAB R2019a software running on a machine configured with an AMD Ryzen 5 4600H processor with Radeon graphics (3 GHz), 8 GB of RAM, and a 64-bit Windows 10 operating system. The raw vibration and sound signals acquired during testing are presented in Figure 19.
In the case of the worm gearbox, the optimized SVM model demonstrated a recognition accuracy of 97.2579%, achieved with a computation time of 20.5842 s. The optimized values for the regularization parameter C and the kernel function parameter γ, which yielded this high accuracy, were determined to be 3.5713 and 479.5685, respectively. The results of both case studies suggest that the proposed method identifies the different health classes accurately and has the potential for application to other rotating machinery.