Cluster Migration Distance for Performance Degradation Assessment of Water Pump Bearings

Because the signal of water pump bearing is seriously disturbed by noise and the fault evolution is complex, it is difficult to describe the performance degradation trend of water pump bearing in a timely and accurate manner using the traditional performance degradation index (PDI). In this paper, a new Cluster Migration Distance (CMD) algorithm is proposed. The extraction of the indicator includes the following four steps: First, the relevant blind separation is used to extract the useful signal of the monitored bearing from the mixed signal; secondly, the impact component is further enhanced by wavelet packet analysis. Then, the redundancy of the original feature vectors is eliminated using our previously proposed KJADE (Kernel Joint Approximate Diagonalization of Eigen-matrices) method. Finally, the newly proposed CMD index is computed as PDI. By calculating the offset trajectory of the feature cluster centroid in the continuous running process of the bearing, CMD can aptly deal with the complex and variable features in the fault evolution process of the water pump bearing. The whole-life monitoring data of a 220 KW water pump system are processed. The results show that the proposed CMD index has better early-warning ability and monotonicity than the traditional kurtosis index.


Introduction
Water pumps play an important role in urban drinking water treatment, agricultural irrigation, and other scenarios. As one of the key components of water pumps, bearings are of great significance to ensure their healthy and reliable operation. However, due to their high load and high-speed operation over a long duration, various kinds of faults inevitably occur [1][2][3].
Dutta N. et al. [4]. proposed an artificial neural network-based model for the diagnosis of water pump bearings with higher accuracy than other machine-learning algorithms. Tang Jing et al. [5] converted the collected vibration signals into frequency domain signals and used frequency domain features and support vector machines to diagnose the faults of water pump bearings. Li Tao et al. [6,7] proposed an adaptive algorithm based on a particle swarm optimization algorithm using the convolutional neural network fault diagnosis method. Other research [8] proposed the method of the wavelet packet to denoise the vibration signal of the centrifugal pump bearing and extracted the frequency band energy representing the corresponding bearing fault and used a BP neural network for training and fault identification. Han Hui et al. [9] proposed a water pump bearing diagnosis method based on stack noise reduction and self-encoding, which obtained a more robust feature representation and could precisely diagnose the fault state of water pump bearings. Bie Zhongzheng et al. [10] introduced the principle of peak energy analysis and used the peak energy value to judge the failure degree of the water pump bearing.
Denoising and accurate performance degradation index extraction are two major challenges in achieving effective performance degradation assessment of water pump bearings. In practical engineering applications, the water pump is usually driven to rotate by a motor through a coupling, and four bearings are usually installed coaxially on the main shaft. Therefore, the vibration signal of each bearing will receive vibration interference from other bearings, which introduces difficulties in the accurate extraction of the weak fault information in the early stages of the bearing. Therefore, it is necessary to study effective means to effectively eliminate noise. Bearing failures, on the other hand, have complex evolutionary processes, often starting with one failure type, such as localized material tribes, and then more complex composite failures may emerge as bearing components interact with localized defects. Therefore, accurately characterizing the performance degradation trend of water pump bearings is one of the main challenges.
Regarding research on noise reduction, Pinlu et al., proposed a deep learning method for efficient noise reduction and feature extraction, which is based on a combination of residual construction units, soft thresholding, and global context. However, this method has the disadvantage of a shared threshold [11]. Sandaram Buchaiah and Piyush Shakya et al., applied different signal processing techniques to extract 72 raw features from vibration data collected experimentally on bearings. They used a random forest method to select a subset of relevant features from the extracted features. The selected features were fused through a 14-dimensional dimensionality-reduction technique, from which two-dimensional relevant performance indicators are extracted and compared between techniques to determine the most effective technique. However, these algorithms are numerous and they need to be compared when they are used, and it is impossible to determine which one is the optimal algorithm [12]. Fei-Ping Du et al., proposed an adaptive regularization parameter selection strategy to denoise vibration signals using a sparse redundant representation model. However, this posterior method requires a great deal of computation of the FP metric, and although this method is very effective, it has not been applied to all scenarios [13].
Regarding research on performance degradation indicators, Yang Chuang Yan et al., established a new RUL prediction model for the problems of poor residual life prediction performance and the single performance degradation index of rolling bearings, and the method showed a good performance in prediction accuracy and reliability [14]. Yan Xiao li et al., proposed a bearing performance degradation evaluation algorithm based on CMMP and feature fusion. Mathematical morphological operations are driven by partial differential equations (PDEs) for the accurate assessment of bearing life cycle failure datasets [15]. Tao Zan et al., introduced joint approximate diagonalization of a characteristic matrix (JADE) and a particle swarm optimization support vector machine (PSO-SVM) in the prediction of the performance degradation trend of rolling bearings, and realized the performance degradation trend and residual performance of rolling bearings under small samples, with an accurate prediction of service life [16].
In the above studies, it is generally considered that only one type of failure occurs in the bearing during the degradation process, but in fact, in actual engineering cases, we often find that there are composite failures. This is because, after the early failure of a bearing, the interaction of the local failure with the bearing components can lead to more complex compound failures. At this time, some traditional performance degradation indicators, such as kurtosis, will fluctuate significantly, which affects the accurate judgment of the performance degradation trend. Therefore, in this study, we innovatively propose a novel performance degradation metric extraction method called Cluster Migration Distance (CMD). CMD does not focus on the growth of features, but rather on the change in features, becoming lower or higher. Degradation trends are assessed by tracking changes in features. At the same time, in order to solve the problem of strong noise interference, we comprehensively use the non-target vibration source interference elimination ability of blind separation and the impact component identification ability of wavelet analysis to achieve noise elimination and use our specially proposed KJADE method for feature fusion processing. In order to verify the effectiveness of the proposed method, we verify the life-cycle monitoring data of a 220 KW water pump system in practical engineering applications. The results show that the early warning ability and the monotonicity of the proposed CMD index are better than the traditional kurtosis indicators.
The details of this work are described in the following sections. Section 2 provides the detailed flow of the proposed approach. Section 3 presents the analysis results and related discussions of an engineering application case. Section 4 concludes this work and points out future research work.

Method and Process
In practical engineering applications, the early fault signal of a water pump bearing is seriously interfered with, and the fault evolution process is complex and changeable. For the purpose of solving this problem, this paper proposes a new performance degradation index extraction method called the Cluster Migration Distance (CMD). At the same time, blind separation, wavelet packet analysis, and the KJADE feature fusion method are comprehensively used to assist in the accurate extraction of bearing performance degradation indicators.
The proposed method mainly includes the following four steps: (1) In order to eliminate the interference of the coaxial vibration source in the pump system, decorrelated blind separation is used to separate the signal of the monitored bearing from the acquired signals of multiple measurement points. (2) Wavelet packet analysis is used to further enhance the shock components closely related to the diagnostic information from the signal; the comprehensive application of the blind separation and wavelet packet analysis methods can effectively eliminate noise interference. (3) Our proposed KJADE method is used to eliminate redundancy and feature fusion on the original time-frequency domain feature vectors; the KJADE method combines the nonlinear processing capability of the kernel method and the advantages of high-order cumulant calculation in the JADE method, which can effectively extract features that characterize fault information. (4) Lastly, the CMD index is calculated to be used as a final indicator to describe the degradation trend of the monitored bearing performance.
The schematic flow chart of the method and the detailed flow chart of the algorithm are shown in Figures 1 and 2, respectively, and the details of the method will be described below.

Vibration Source Interference Cancellation Based on Blind Separation
Principal Component Analysis (PCA) [13,16] is a multidimensional data analysis method commonly used in statistical analysis, and it is able to find implicit statistical features from raw data. It is very effective in data dimensionality reduction [17], information compression, and de-correlation between data. In this section, PCA is used to reduce the dimensionality of the two-dimensional signals collected by the horizontal and vertical sensors of the same measuring point before blind separation and extract the main feature information. We assume that the vibration source signal of the monitored bearing is s(t), and other coaxial vibration source signals are s i (t). After the pump mixing system, the random vector formed by the observation signals of the same measuring point is x = (x i−1 , x i−2 ) T , (i = 1, . . . , N), N is the number of measuring points, and its mean m x = E(x) = 0. We then find an orthogonal transformation matrix W and perform an orthogonal transformation on the random vector, so that the random variables in the output random vector y = Wx are not correlated with each other.
The orthogonal matrix W is obtained by decomposing the eigenvalues of the covariance matrix C x of x. Usually, C x is a real symmetric matrix, decomposed as: where U = (u 1 , u 2 ), Λ = diag(λ 1 , λ 2 ) and u 1 and u 2 are the eigenvectors of the covariance matrix C x . The eigenvectors are orthogonal to each other, that is E u i T u j = 0(i = j ∈ 1, 2), and λ i is the corresponding eigenvalue and is presented as: It can be seen that PCA establishes a set of orthogonal bases u 1 , u 2 for the 2-dimensional data space, and u 1 is orthogonal to u 2 .y i = u i T x, i = 1, 2. We then take the first y i (t) after dimensionality reduction as the principal component.

Vibration Source Interference Cancellation Based on Blind Separation
Principal Component Analysis (PCA) [13,16] is a multidimensional data analysis method commonly used in statistical analysis, and it is able to find implicit statistical features from raw data. It is very effective in data dimensionality reduction [17], information

Blind Separation of Decorrelation
In the previous link, the dimensionality reduction in the two-dimensional signals is measured by the horizontal and vertical displacement sensors of the same vibration measuring point. The signal matrix y(t) formed by each measuring point after dimension reduction is used as the observed signal matrix for de-correlated blind separation, and feature extraction is achieved using de-correlated blind separation [18][19][20]. Assuming that the signals have statistical irrelevance, non-whiteness, or non-stationarity, and there is no more than one Gaussian signal in the signal, the eigenvalue decomposition of multiple non-zero time-delay correlation matrices is used to achieve blind signal separation. We then proceed as follows: (1) The observed signal is pre-whitened. We calculate the zero-time delay correlation matrix R xx of the observed signal matrix y(t) and perform singular value decomposition, where Σ is a diagonal matrix consisting of singular values and then z(t) = By(t).
(2) We find unitary matrices V and make the set of non-zero time-delay correlation matrices {R zz (τ 1 ), R zz (τ 2 ), · · · , R zz (τ k )} joint diagonalization. The objective function is as follows: Among them, k is the number of time delay correlation matrices, o f f (M) = ∑ 1<i =j<n M ij 2 .
(3) The separated signal matrix is: The frequency domain correlation analysis is carried out between the separated signal and the monitored bearing signal, and the separated signal with a high-frequency domain correlation is taken as the monitoring bearing signal after blind separation.

Wavelet Packet Analysis to Achieve Shock Component Enhancement
After blind separation, the signal s(t) is subjected to wavelet packet decomposition, and its decomposition tree is shown in Figure 3.
The frequency domain correlation analysis is carried out between the separated signal and the monitored bearing signal, and the separated signal with a high-frequency domain correlation is taken as the monitoring bearing signal after blind separation.

Wavelet Packet Analysis to Achieve Shock Component Enhancement
After blind separation, the signal s(t) is subjected to wavelet packet decomposition, and its decomposition tree is shown in Figure 3. If the sampling frequency of the bearing vibration signal is fs, after the n-layer wavelet packet analysis, there are a total of 2 n nodes, and the frequency interval of each node segment is 2 +1 . By analyzing the spectrum of the signal s(t), the concentration area of the fault's frequency components is determined, and then the corresponding node is selected for reconstruction to obtain the signal x(t).

Multivariate Feature Extraction in Time-Frequency Domain
After the vibration sensor collects the bearing signal, extracting the features reflecting the bearing state from these data is a key step in realizing fault diagnosis. The quality of the extracted features directly affects the recognition accuracy. Due to the complexity of the bearing structure and the superposition and coupling of the vibration signals of various components, a single characteristic index, such as the root mean square value or peak value, cannot accurately reflect the current state of the bearing [21]. It can, however, comprehensively evaluate the running state of the bearing [22]. Therefore, the time-domain If the sampling frequency of the bearing vibration signal is f s , after the n-layer wavelet packet analysis, there are a total of 2 n nodes, and the frequency interval of each node segment is f s 2 n+1 . By analyzing the spectrum of the signal s(t), the concentration area of the fault's frequency components is determined, and then the corresponding node is selected for reconstruction to obtain the signal x(t).

Multivariate Feature Extraction in Time-Frequency Domain
After the vibration sensor collects the bearing signal, extracting the features reflecting the bearing state from these data is a key step in realizing fault diagnosis. The quality of the extracted features directly affects the recognition accuracy. Due to the complexity of the bearing structure and the superposition and coupling of the vibration signals of various components, a single characteristic index, such as the root mean square value or peak value, cannot accurately reflect the current state of the bearing [21]. It can, however, comprehensively evaluate the running state of the bearing [22]. Therefore, the time-domain dimensionless index, the frequency-domain index, and the energy ratio index based on the wavelet packet sub-band are extracted as the original high-dimensional feature vectors.

Extraction Method of Time-Domain Features
The signal obtained by wavelet packet decomposition and reconstruction is x(t), and n is the number of sampling points. The extracted signal feature set has dimensional indicators and non-dimensional indicators [23], as shown in Table 1.

Extraction Method of Frequency Domain Features
When the bearing is in a healthy state, its vibration and acoustic signals are small during operation. When the bearing is damaged locally, a periodic impact signal will be generated, which will lead to the high-frequency vibration of the bearing itself. Therefore, the vibration signal of the bearing is very complex, and different types of bearing faults have different fault frequencies and impact laws. Therefore, the complex time domain signal can be transformed into a single harmonic component through Fourier transform for research, so as to obtain each harmonic component, such as the amplitude and phase information of the wave. Since the signals of different fault types do not have exactly the same frequency spectra, different frequency domain characteristic parameters are required for monitoring [24]. F13-F16, as listed in Table 1, are the frequency domain features required in this chapter, where f i and s i are the frequency and amplitude corresponding to the i-th spectral line of the reconstructed signal x(t).

Extraction Method of Time-Frequency Domain Features
As a typical time-frequency domain analysis method, wavelet decomposition can perform multi-scale transformation of vibration signals. Since the fault information widely exists in different frequency components in the signal, a change in the frequency component often indicates that the state of the bearing has changed, so wavelet packet frequency band energy detection technology can be used to realize the original feature extraction of the bearing's operating state.
Third-order wavelet packet decomposition is performed on the reconstructed signal x(t), and the energy corresponding to the node x 3i is E i , while the corresponding discrete point amplitude is h ik , i = 0, 1, 2, . . . , 7; k = 1, 2, . . . , n, and n is the number of sampling points. Then, the energy of the i-th subband signal is: In order to more intuitively judge the change in energy and perform dimensionless processing on the extracted sub-band energy of wavelet packets, the energy ratio of each frequency band energy to the total energy E, as shown in Table 1, can be obtained as the timefrequency domain energy ratio original features, of which the total energy E = ∑ 7 i=0 E i .

Advanced Feature Fusion Extraction Based on KJADE
In the previous section, time-domain indicators, frequency-domain indicators, and energy ratios based on wavelet packet sub-bands were extracted from the vibration signal, and formed a multi-domain feature set, avoiding the shortcoming of the insufficient evaluation capability of a single feature. However, there are redundant and conflicting problems among some features in multi-domain features, so it is necessary to extract feature quantities sensitive to bearing fault states. The bearing vibration signal often has nonlinear characteristics, and JADE belongs to the linear processing method, so it cannot effectively extract the characteristic quantities sensitive to the bearing fault state. In order to further apply JADE to bearing vibration signals with nonlinear behavior, the idea of the kernel function method is introduced to JADE, and the joint approximate diagonalization based on the kernel function eigenmatrix is obtained. Kernel Feature Matrix Joint Approximate Diagonalization (KJADE) is a novel feature fusion method. It not only has the characteristics of JADE, but also has better nonlinear processing capabilities than JADE. The core idea is to map the data X ∈ R n×m to the high-dimensional feature space F through the nonlinear function Φ, and then use the JADE algorithm to transform the nonlinear separable problem into a linearly separable problem in F. The mapping process is shown in Figure 4. Assuming that the sample space is X = {x 1 , x 2 , . . . , x n }, where x i is the input vector of the i-th dimension of the sample space, which contains n data points, after mapping, the feature Sensors 2022, 22, x FOR PEER REVIEW 9 of 23 In order to more intuitively judge the change in energy and perform dimensionless processing on the extracted sub-band energy of wavelet packets, the energy ratio of each frequency band energy to the total energy E, as shown in Table 1, can be obtained as the time-frequency domain energy ratio original features, of which the total energy = ∑ 7 =0 .

Advanced Feature Fusion Extraction Based on KJADE
In the previous section, time-domain indicators, frequency-domain indicators, and energy ratios based on wavelet packet sub-bands were extracted from the vibration signal, and formed a multi-domain feature set, avoiding the shortcoming of the insufficient evaluation capability of a single feature. However, there are redundant and conflicting problems among some features in multi-domain features, so it is necessary to extract feature quantities sensitive to bearing fault states. The bearing vibration signal often has nonlinear characteristics, and JADE belongs to the linear processing method, so it cannot effectively extract the characteristic quantities sensitive to the bearing fault state. In order to further apply JADE to bearing vibration signals with nonlinear behavior, the idea of the kernel function method is introduced to JADE, and the joint approximate diagonalization based on the kernel function eigenmatrix is obtained. Kernel Feature Matrix Joint Approximate Diagonalization (KJADE) is a novel feature fusion method. It not only has the characteristics of JADE, but also has better nonlinear processing capabilities than JADE. The core idea is to map the data ∈ × to the high-dimensional feature space F through the nonlinear function Φ, and then use the JADE algorithm to transform the nonlinear separable problem into a linearly separable problem in F. The mapping process is shown in Figure 4. Assuming that the sample space is = { 1 , 2 , . . . , }, where xi is the input vector of the i-th dimension of the sample space, which contains n data points, after mapping, the feature space is = { ( 1 ), ( 2 ), . . . , ( )}. For the same purpose as JADE, KJADE also needs to find unmixed matrix B to obtain the optimal matrix, namely: Furthermore, the covariance matrix of the mapped F can be obtained as: RF and its eigenvalues and eigenvectors cannot be obtained due to the "curse of dimensionality" caused by the excessive dimension of the feature space. The idea of kernel function is introduced here, and the complex and time-consuming inner product calculation is converted into a kernel function, and an N × N kernel matrix K is obtained: For the same purpose as JADE, KJADE also needs to find unmixed matrix B to obtain the optimal matrix, namely: Furthermore, the covariance matrix of the mapped F can be obtained as: R F and its eigenvalues and eigenvectors cannot be obtained due to the "curse of dimensionality" caused by the excessive dimension of the feature space. The idea of kernel function is introduced here, and the complex and time-consuming inner product calculation is converted into a kernel function, and an N × N kernel matrix K is obtained: Among them, K ij must meet the Mercer condition, that is, K = FF T . The following are commonly used kernel functions: Gaussian kernel function: Polynomial Kernel Function: Sigmoid Kernel function: Since the Gaussian kernel function achieves better results in solving practical problems [25], the Gaussian kernel function is used in this study to replace the inner product operation, where σ represents the width parameter of the function. Then, the eigendecomposition of the fourth-order cumulant matrix of the extracted kernel matrix is produced, and the nonlinear feature fea_kjade(x i ,y i ,z i ) hidden in the observation signal is obtained.

Calculation of CMD Performance Degradation Index
When the bearing fails, it has good class separability from the characteristic distribution of the healthy bearing. With the continuous operation of the bearing, the failure degree is expanded, and the characteristic difference between the monitored bearing and the healthy bearing becomes larger. Therefore, two types of models can be constructed to evaluate the difference between monitoring signals and health signals. In the classification of samples, the intra-class distance and the inter-class distance have been applied in the class separability measurement between different bearing fault types.
The two types of models constructed are shown in Figure 5. In the extracted nonlinear feature fea_kjade(x i ,y i ,z i ), we extract the feature segment of the bearing in normal operation and record the feature set extracted in the healthy state of the bearing as X o , the feature set extracted by the bearing at time t is Y t , and then the two types of models formed are . Then, the inter-class scatter matrix is: The intra-class scatter matrix is: In: C is the number of classes, where C = 2, and m i and m are the feature mean of class i and the entire sample, respectively. The inter-class scatter matrices S b and S w represent the degree of aggregation between different classes and between the same class, respectively.
To describe the feature part more comprehensively, the intra-class and inter-class distance SS is used as a measure of class separability, and its expression is shown in Equation (17).
. Figure 5. Schematic diagram of the two types of models.

Selection of Distance Index Calculation
Because clustering does not know any sample labels, the samples are divided into different classes through the internal relationship and different calculation methods are used, and the clustering will also obtain different results. The following are commonly used similarity calculation methods. Therefore, when calculating small sample clustering, it is necessary to calculate the relevant distance. When calculating the distance between the feature set extracted at time t and the centroid of the feature set X0 extracted under the healthy sample, we use the following distance formulas as a reference for comparison:

Euclidean Distance
According to the calculation, this distance can be considered to be the L2 norm. Two points a(x1,x2...xn) and b(y1,y2...yn) are in the n-dimensional space:

Manhattan Distance
According to the calculation, this distance can be considered to be the L1 norm. Two points a(x1,x2...xn) and b(y1,y2...yn) are in the n-dimensional space:

Chebyshev Distance
It is simply considered the maximum value of the coordinate difference of each coordinate. Two points a(x1,x2...xn) and b(y1,y2...yn) are in the n-dimensional space: Because clustering does not know any sample labels, the samples are divided into different classes through the internal relationship and different calculation methods are used, and the clustering will also obtain different results. The following are commonly used similarity calculation methods. Therefore, when calculating small sample clustering, it is necessary to calculate the relevant distance. When calculating the distance between the feature set Y t extracted at time t and the centroid of the feature set X0 extracted under the healthy sample, we use the following distance formulas as a reference for comparison:

1.
Euclidean Distance According to the calculation, this distance can be considered to be the L2 norm. Two points a(x 1 ,x 2 . . . x n ) and b(y 1 ,y 2 . . . y n ) are in the n-dimensional space:

Manhattan Distance
According to the calculation, this distance can be considered to be the L1 norm. Two points a(x 1 ,x 2 . . . x n ) and b(y 1 ,y 2 . . . y n ) are in the n-dimensional space:

Chebyshev Distance
It is simply considered the maximum value of the coordinate difference of each coordinate. Two points a(x 1 ,x 2 . . . x n ) and b(y 1 ,y 2 . . . y n ) are in the n-dimensional space: 4.

Minkowski Distance
The Minkowski distance between two n-dimensional variables a(x 11 ,x 12 , . . . ,x 1n ) and b(x 21 ,x 22 , . . . ,x 2n ) is defined as: where p is a variable parameter. Min's distance defines a set of distance formulas, including the Euclidean distance, Manhattan distance, and Chebyshev distance.

Cosine Distance
The cosine similarity derivation formula is as follows: Two points a(x 1 ,x 2 . . . x n ) and b(y 1 ,y 2 . . . y n ) are in the n-dimensional space

Correlation Distance
The correlation coefficient is a way to assess the degree of correlation between random variables X and Y.
Correlation distance:

Dynamic Centroid
When the bearing is running, its characteristic signal data migrate and change. In order to more clearly observe the migration and change trend of the fusion features, we propose the concept of the dynamic centroid. When finding the centroid of a single data cluster, it is often best to find a fixed number of K centroid points for the entire data cluster and iteratively find a partition scheme for K clusters, minimizing the loss corresponding to the clustering result function, where the loss function can be defined as the sum of squared errors between the sample points in each cluster and their center points: where x i represents the i th sample, c i is the cluster to which x i belongs, µ ci represents the center point corresponding to the cluster, and M is the total number of samples.
As shown in Figure 6, first, we randomly select a sample point as the initial centroid, and then calculate the similarity between each sample point and the centroid. We classify each sample point into its most similar category, then recalculate the centroid point (i.e., the class center) of each class again, repeat this process until the centroid point no longer changes, and finally, the class and the class centroid can be obtained. We then divide the entire data cluster into a given number of K classes, but this class, divided according to data characteristics, may not have the trend we want, the size of each class may not be the same, and its data points may not be the same or continuous. We refer to this idea when calculating the migration trend of the feature distance, divide the data into n segments, find the centroid point for each segment, divide the entire data segment into n sample clusters, and in each sample cluster, the fluctuation distance between the data point and the centroid is calculated. The sum of the movement distance of the centroid and the fluctuation distance is the migration distance we want. clusters, and in each sample cluster, the fluctuation distance between the data point and the centroid is calculated. The sum of the movement distance of the centroid and the fluc tuation distance is the migration distance we want. Therefore, here we divide the nonlinear fusion feature fea_kjade(xi,yi,zi), extracted from the feature fusion in the previous step, into n sample clusters, find the centroid C0 o the fault-free sample cluster, find the centroid point for each sample cluster separately C find the migration distance Xi between the two centroid points, and find the fluctuation distance di between the sample cluster and the previous centroid point Ci−1. At this time the required performance degradation evaluation index ddd1 can be obtained by adding these two distances.
The basic idea is to select the early fault-free samples initially judged in the KJADE fusion feature index as the fault-free sample clusters, and then divide the remaining sam ples into sample clusters according to the predetermined number of samples, calculate th centroids of the sample clusters, and specify each sample. The distance to its correspond ing sample cluster centroid plus the migration distance from the sample cluster centroid point to the non-faulty sample cluster centroid point is the dynamic centroid distance o the sample. The pseudocode of the dynamic centroid algorithm is shown in Algorithm 1 Therefore, here we divide the nonlinear fusion feature fea_kjade(x i ,y i ,z i ), extracted from the feature fusion in the previous step, into n sample clusters, find the centroid C 0 of the fault-free sample cluster, find the centroid point for each sample cluster separately C i , find the migration distance X i between the two centroid points, and find the fluctuation distance d i between the sample cluster and the previous centroid point C i−1 . At this time, the required performance degradation evaluation index ddd 1 can be obtained by adding these two distances.
The basic idea is to select the early fault-free samples initially judged in the KJADE fusion feature index as the fault-free sample clusters, and then divide the remaining samples into sample clusters according to the predetermined number of samples, calculate the centroids of the sample clusters, and specify each sample. The distance to its corresponding sample cluster centroid plus the migration distance from the sample cluster centroid point to the non-faulty sample cluster centroid point is the dynamic centroid distance of the sample. The pseudocode of the dynamic centroid algorithm is shown in Algorithm 1.

Results and Discussion
This paper takes the vibration signals collected by each measuring point of a pump from 11 September to 24 December 2021 as the analysis object. The physical diagram of the equipment structure of the pump set and the installation of the vibration sensor is shown in Figure 7. Vibration sensors in horizontal and vertical directions are placed on the free end and the driving end of the water pump and drive motor. The measuring point data sets are 1H, 2H, 2V, 3H, 3V, 4H, and 4A. Each half of each measuring point collects group data, eliminates the useless data due to problems such as shutdown and acquisition terminal failure during this period, and obtains 3684 groups of vibration signals during this period to complete the data preprocessing. The technical parameters related to the water pump are shown in Table 2, and the overall structure and physical diagram are shown in Figure 7.

Results and Discussion
This paper takes the vibration signals collected by each measuring point of a pump from 11 September to 24 December 2021 as the analysis object. The physical diagram of the equipment structure of the pump set and the installation of the vibration sensor is shown in Figure 7. Vibration sensors in horizontal and vertical directions are placed on the free end and the driving end of the water pump and drive motor. The measuring point data sets are 1H, 2H, 2V, 3H, 3V, 4H, and 4A. Each half of each measuring point collects group data, eliminates the useless data due to problems such as shutdown and acquisition terminal failure during this period, and obtains 3684 groups of vibration signals during this period to complete the data preprocessing. The technical parameters related to the water pump are shown in Table 2, and the overall structure and physical diagram are shown in Figure 7.    Figure 8 is the original signal time-domain diagram and partially enlarged diagram. From the partial magnification of the time domain signal, we can see the minor fault segment, that is, the data segment whose amplitude has not yet increased but the periodic shock component begins to appear. The coordinates of the corresponding sample points are 5,521,700 to 6,856,630. The envelope spectrum of the minor fault signal segment is shown in Figure 9. From the figure, one can observe the obvious cage fault frequency and rotation frequency and its frequency multiplication component. Regarding the sample of the mid-term fault section, the coordinates of the corresponding sample point are 24,000,000 to 25,000,000, and the envelope spectrum is drawn as shown in Figure 10, in which the outer raceway failure frequency, inner raceway failure frequency, rolling element failure frequency, and cage failure can be observed. This shows that in the severe fault section, the bearing fault evolved from the cage fault to a composite fault composed of the outer raceway fault, the inner raceway fault, the rolling element fault, and the cage fault. Regarding the samples of the later fault section, the coordinates of the corresponding sample points are 50,000,000 to 51,000,000, the main frequency component in the envelope spectrum is the rotation frequency, and there are sidebands around the rotation frequency, indicating that modulation occurs in the severe fault section.  Figure 8 is the original signal time-domain diagram and partially enlarged diagram. From the partial magnification of the time domain signal, we can see the minor fault segment, that is, the data segment whose amplitude has not yet increased but the periodic shock component begins to appear. The coordinates of the corresponding sample points are 55,217,00 to 68,5663,0. The envelope spectrum of the minor fault signal segment is shown in Figure 9. From the figure, one can observe the obvious cage fault frequency and rotation frequency and its frequency multiplication component. Regarding the sample of the mid-term fault section, the coordinates of the corresponding sample point are 24,000,000 to 25,000,000, and the envelope spectrum is drawn as shown in Figure 10, in which the outer raceway failure frequency, inner raceway failure frequency, rolling element failure frequency, and cage failure can be observed. This shows that in the severe fault section, the bearing fault evolved from the cage fault to a composite fault composed of the outer raceway fault, the inner raceway fault, the rolling element fault, and the cage fault. Regarding the samples of the later fault section, the coordinates of the corresponding sample points are 50,000,000 to 51,000,000, the main frequency component in the envelope spectrum is the rotation frequency, and there are sidebands around the rotation frequency, indicating that modulation occurs in the severe fault section.     For the minor fault segment, according to the number of sample points n = 16,384, the kurtosis value of the segment is calculated to draw the kurtosis curve of the original signal, as shown in Figure 11. The earliest increase in kurtosis occurs at sample segment 437, and the corresponding sample point coordinate is 71,598,08, indicating that a single kurtosis index has limitations in the early fault diagnosis of bearings and cannot accurately identify minor faults of bearings.

Analysis of Blind Separation and Wavelet Packet Reconstruction Processing Results
In order to eliminate the interference of other vibration sources in the pump device to the signal acquisition under actual working conditions, the source signal is initially processed by the blind separation processing method, and wavelet packet decomposition is performed on the collected bearing vibration signal after blind separation. The decomposition structure is shown in Figure 12, where W(j, m) represents the node m of the layer j, each node represents the decomposition coefficient of the original signal s(t) on the scale j for the wavelet packet function, m represents the frequency band, and each node is associated with the corresponding frequency band match. In the figure, W(0, 0) represents the original signal, and other nodes such as W(2, 0) represent the 0th node coefficient of the second layer, and then each subband signal is extracted by reconstruction. For the minor fault segment, according to the number of sample points n = 16,384, the kurtosis value of the segment is calculated to draw the kurtosis curve of the original signal, as shown in Figure 11. The earliest increase in kurtosis occurs at sample segment 437, and the corresponding sample point coordinate is 7,159,808, indicating that a single kurtosis index has limitations in the early fault diagnosis of bearings and cannot accurately identify minor faults of bearings.  For the minor fault segment, according to the number of sample points n = 16,384, the kurtosis value of the segment is calculated to draw the kurtosis curve of the original signal, as shown in Figure 11. The earliest increase in kurtosis occurs at sample segment 437, and the corresponding sample point coordinate is 71,598,08, indicating that a single kurtosis index has limitations in the early fault diagnosis of bearings and cannot accurately identify minor faults of bearings.

Analysis of Blind Separation and Wavelet Packet Reconstruction Processing Results
In order to eliminate the interference of other vibration sources in the pump device to the signal acquisition under actual working conditions, the source signal is initially processed by the blind separation processing method, and wavelet packet decomposition is performed on the collected bearing vibration signal after blind separation. The decomposition structure is shown in Figure 12, where W(j, m) represents the node m of the layer j, each node represents the decomposition coefficient of the original signal s(t) on the scale j for the wavelet packet function, m represents the frequency band, and each node is associated with the corresponding frequency band match. In the figure, W(0, 0) represents the original signal, and other nodes such as W(2, 0) represent the 0th node coefficient of the second layer, and then each subband signal is extracted by reconstruction.

Analysis of Blind Separation and Wavelet Packet Reconstruction Processing Results
In order to eliminate the interference of other vibration sources in the pump device to the signal acquisition under actual working conditions, the source signal is initially processed by the blind separation processing method, and wavelet packet decomposition is performed on the collected bearing vibration signal after blind separation. The decomposition structure is shown in Figure 12, where W(j, m) represents the node m of the layer j, each node represents the decomposition coefficient of the original signal s(t) on the scale j for the wavelet packet function, m represents the frequency band, and each node is associated with the corresponding frequency band match. In the figure, W(0, 0) represents the original signal, and other nodes such as W(2, 0) represent the 0th node coefficient of the second layer, and then each subband signal is extracted by reconstruction. When the number of nodes m is an even number, it represents the low-frequency component signal decomposed by the low-pass filter coefficient h(k). On the contrary, when m is an odd number, it represents the high-frequency component signal decomposed by the high-pass filter coefficient g(k).
The sampling frequency of the bearing vibration signal is 12,800 Hz. The wavelet packet is decomposed into two layers, while the third layer has 2 ^ 2 = 4 frequency segments, and the frequency range of each frequency segment is 6400/4 = 1600 Hz. The spe- When the number of nodes m is an even number, it represents the low-frequency component signal decomposed by the low-pass filter coefficient h(k). On the contrary, when m is an odd number, it represents the high-frequency component signal decomposed by the high-pass filter coefficient g(k).
The sampling frequency of the bearing vibration signal is 12,800 Hz. The wavelet packet is decomposed into two layers, while the third layer has 2ˆ2 = 4 frequency segments, and the frequency range of each frequency segment is 6400/4 = 1600 Hz. The specific range is shown in Table 3. It can be seen from the table that the frequency bands corresponding to different nodes are different. Therefore, we analyze the spectrum of the signal after blind separation, create its spectrogram, observe the concentrated segment of the fault frequency, and select the wavelet node for the next step.
The bearing signal is a typical irregular signal. In order to grasp the time-frequency characteristics of each frequency band, the dbN wavelet can be used for multi-layer decomposition, and the time-frequency analysis of the signal is carried out in different frequency ranges. Because the db6 wavelet is a tightly supported orthogonal real wavelet, with good regularity and a large vanishing moment, it is used as the wavelet basis for wavelet packet decomposition. The second-order wavelet packet decomposition is performed on the signal after blind separation, and the db6 wavelet is selected as the wavelet base. Figure 13 is the spectrogram of the signal after blind separation processing. Through spectrogram analysis, the main frequency component of the signal is approximately 840 HZ. Therefore, node 4 (frequency band 1-1600 HZ) in the tree node is selected for reconstruction. When the number of nodes m is an even number, it represents the low-frequency component signal decomposed by the low-pass filter coefficient h(k). On the contrary, when m is an odd number, it represents the high-frequency component signal decomposed by the high-pass filter coefficient g(k).
The sampling frequency of the bearing vibration signal is 12,800 Hz. The wavelet packet is decomposed into two layers, while the third layer has 2 ^ 2 = 4 frequency segments, and the frequency range of each frequency segment is 6400/4 = 1600 Hz. The specific range is shown in Table 3. It can be seen from the table that the frequency bands corresponding to different nodes are different. Therefore, we analyze the spectrum of the signal after blind separation, create its spectrogram, observe the concentrated segment of the fault frequency, and select the wavelet node for the next step.
The bearing signal is a typical irregular signal. In order to grasp the time-frequency characteristics of each frequency band, the dbN wavelet can be used for multi-layer decomposition, and the time-frequency analysis of the signal is carried out in different frequency ranges. Because the db6 wavelet is a tightly supported orthogonal real wavelet, with good regularity and a large vanishing moment, it is used as the wavelet basis for wavelet packet decomposition. The second-order wavelet packet decomposition is performed on the signal after blind separation, and the db6 wavelet is selected as the wavelet base. Figure 13 is the spectrogram of the signal after blind separation processing. Through spectrogram analysis, the main frequency component of the signal is approximately 840 HZ. Therefore, node 4 (frequency band 1-1600 HZ) in the tree node is selected for reconstruction. To better verify the advantages of reconstructed signals, PCA, KPCA, JADE, and KJADE methods are used to fuse the original data and the reconstructed data, and the processed feature points are normalized and the first four are preliminarily selected. The To better verify the advantages of reconstructed signals, PCA, KPCA, JADE, and KJADE methods are used to fuse the original data and the reconstructed data, and the processed feature points are normalized and the first four are preliminarily selected. The normal working data of the bearing is used as a no-fault signal. Figure 14 shows the feature point clustering effect after feature fusion of source data and reconstructed data. Table 4 shows the intra-class distance of KJADE and JADE. By calculating the intra-class and interclass distances between different feature points, it is found that the intra-class inter-class distances of the non-faulty samples and early fault samples in the reconstructed signal are significantly lower than the original signal, indicating that after blind separation and wavelet packet reconstruction, the clustering effect of feature points obtained by feature fusion of the four methods, PCA, KPCA, JADE, and KJADE, has been significantly improved.
inter-class distances between different feature points, it is found that the intra-class interclass distances of the non-faulty samples and early fault samples in the reconstructed signal are significantly lower than the original signal, indicating that after blind separation and wavelet packet reconstruction, the clustering effect of feature points obtained by feature fusion of the four methods, PCA, KPCA, JADE, and KJADE, has been significantly improved.

Fusion Feature Distance Metrics
Considering that the three-dimensional coordinates of the signal feature points can only reflect the distribution and clustering effect of the feature points in the three-dimensional space, it cannot intuitively reflect the time node when the bearing starts to fail, and the trend of the subsequent bearing failure severity changes with time. The normal working data of the bearing in the first four days is used as the no-fault sample cluster, the centroid coordinates of the no-fault sample cluster are obtained, the distance between each subsequent feature point and the centroid of the no-fault sample cluster is calculated, and the distance index is used as the judgment bearing. A new indicator of the failure level is obtained.
To demonstrate the superiority of KJADE in the distance trend after the fusion of bearing fault features, feature distance calculation processing was performed on the data after feature fusion using PCA, KPCA, JADE, and KJADE methods. In terms of calculation, six different distance calculation methods, such as the Euclidean distance, Manhattan distance, and correlation distance, are used for verification. Figure 15 is a trend diagram of the fusion feature distance indicators of the four feature fusion indicators PCA, KPCA, JADE, and KJADE under different distance calculation methods. The PCA fusion feature distance index has a large amplitude at the non-fault sample cluster, which cannot accurately judge the fault occurrence time. The angle cosine

Fusion Feature Distance Metrics
Considering that the three-dimensional coordinates of the signal feature points can only reflect the distribution and clustering effect of the feature points in the three-dimensional space, it cannot intuitively reflect the time node when the bearing starts to fail, and the trend of the subsequent bearing failure severity changes with time. The normal working data of the bearing in the first four days is used as the no-fault sample cluster, the centroid coordinates of the no-fault sample cluster are obtained, the distance between each subsequent feature point and the centroid of the no-fault sample cluster is calculated, and the distance index is used as the judgment bearing. A new indicator of the failure level is obtained.
To demonstrate the superiority of KJADE in the distance trend after the fusion of bearing fault features, feature distance calculation processing was performed on the data after feature fusion using PCA, KPCA, JADE, and KJADE methods. In terms of calculation, six different distance calculation methods, such as the Euclidean distance, Manhattan distance, and correlation distance, are used for verification. Figure 15 is a trend diagram of the fusion feature distance indicators of the four feature fusion indicators PCA, KPCA, JADE, and KJADE under different distance calculation methods. The PCA fusion feature distance index has a large amplitude at the non-fault sample cluster, which cannot accurately judge the fault occurrence time. The angle cosine distance index and correlation distance index of the KPCA fusion feature have low amplitudes at the early non-fault sample clusters, which are in line with the actual working conditions, but the fault occurrence point is consistent with the kurtosis index, and early faults cannot be diagnosed in advance. There are also cases where the amplitudes of the remaining distance indicators are too large at the fault-free samples. The overall trend of JADE fusion features fluctuates greatly, and the amplitude is large at the fault-free sample cluster, which will interfere with the judgment of bearing faults. The Chebyshev distance index, the included angle cosine distance index, the Euclidean distance index, the Manhattan distance index, and the Minkowski distance index of the KJADE fusion feature also have the same problems as the above three other feature fusion methods. The amplitude at the cluster is abnormal, and the early warning of the early bearing failure cannot be realized.
JADE fusion features fluctuates greatly, and the amplitude is large at the fault-free sample cluster, which will interfere with the judgment of bearing faults. The Chebyshev distance index, the included angle cosine distance index, the Euclidean distance index, the Manhattan distance index, and the Minkowski distance index of the KJADE fusion feature also have the same problems as the above three other feature fusion methods. The amplitude at the cluster is abnormal, and the early warning of the early bearing failure cannot be realized. The relevant distance index of KJADE effectively realizes the early warning of the early failure of the faulty bearing. Figure 16 shows a partially enlarged view of the correlation distance index of the KJADE fusion feature. The correlation distance index of the KJADE fusion feature rises at sample segment 336, and the corresponding sample coordinate is 55,050,24, which is the same as the starting sample coordinate of the minor fault segment in Section 3.1. 55,217,00 is basically the same, indicating that the correlation distance index of the KJADE fusion feature realizes the early diagnosis of minor bearing faults. The relevant distance index of KJADE effectively realizes the early warning of the early failure of the faulty bearing. Figure 16 shows a partially enlarged view of the correlation distance index of the KJADE fusion feature. The correlation distance index of the KJADE fusion feature rises at sample segment 336, and the corresponding sample coordinate is 5,505,024, which is the same as the starting sample coordinate of the minor fault segment in Section 3.1. 5,521,700 is basically the same, indicating that the correlation distance index of the KJADE fusion feature realizes the early diagnosis of minor bearing faults.

Cluster Migration Distance
In Section 3.2, after the early failure occurred, the kurtosis index and the KJADE fusion index showed a trend of first rising and then decreasing, and this anomalous phenomenon did not match the actual situation. In order to more accurately describe the trend of bearing failure degree with time, based on KJADE feature fusion, this paper proposes a novel performance degradation metric extraction method called Cluster Migration Distance (CMD). On the premise that the sample points corresponding to minor faults are determined in Section 3.2, all non-fault sample points are selected to obtain their centroids, divide the remaining sample points into n sample clusters according to the specified sample cluster capacity N, calculate the centroid of each sample cluster, and calculate the distance ( = 1,2,3, . . . , ). In the subsequent calculation of the distance index of the sample

Cluster Migration Distance
In Section 3.2, after the early failure occurred, the kurtosis index and the KJADE fusion index showed a trend of first rising and then decreasing, and this anomalous phenomenon did not match the actual situation. In order to more accurately describe the trend of bearing failure degree with time, based on KJADE feature fusion, this paper proposes a novel performance degradation metric extraction method called Cluster Migration Distance (CMD). On the premise that the sample points corresponding to minor faults are determined in Section 3.2, all non-fault sample points are selected to obtain their centroids, divide the remaining sample points into n sample clusters according to the specified sample cluster capacity N, calculate the centroid of each sample cluster, and calculate the distance X i (i = 1, 2, 3, . . . , N). In the subsequent calculation of the distance index of the sample points, the dynamic centroid method is used, that is, the distance x j (j = 1, 2, 3 . . .) from each sample point to the centroid of the sample cluster where the sample point is located. The migration distance of the centroid point of the sample cluster is used as the dynamic centroid distance v j of the sample point. The specific calculation formula is as follows: Considering the calculation accuracy and the total number of samples, this paper takes the sample cluster capacity n = 50 and the number of sample clusters as 37, calculates the dynamic centroid distance of each sample point, and draws its change trend as shown in Figure 17. Compared with the KJADE integrated index and the kurtosis index, the dynamic centroid distance index has a larger minor fault amplitude, which allows it to make a more accurate judgment on the minor fault of the bearing, and the subsequent amplitude shows an upward trend, which is more in line with actual engineering conditions.

Cluster Migration Distance
In Section 3.2, after the early failure occurred, the kurtosis index and the KJADE fusion index showed a trend of first rising and then decreasing, and this anomalous phenomenon did not match the actual situation. In order to more accurately describe the trend of bearing failure degree with time, based on KJADE feature fusion, this paper proposes a novel performance degradation metric extraction method called Cluster Migration Distance (CMD). On the premise that the sample points corresponding to minor faults are determined in Section 3.2, all non-fault sample points are selected to obtain their centroids, divide the remaining sample points into n sample clusters according to the specified sample cluster capacity N, calculate the centroid of each sample cluster, and calculate the distance ( = 1,2,3, . . . , ). In the subsequent calculation of the distance index of the sample points, the dynamic centroid method is used, that is, the distance ( = 1,2,3. . . ) from each sample point to the centroid of the sample cluster where the sample point is located. The migration distance of the centroid point of the sample cluster is used as the dynamic centroid distance of the sample point. The specific calculation formula is as follows: Considering the calculation accuracy and the total number of samples, this paper takes the sample cluster capacity n = 50 and the number of sample clusters as 37, calculates the dynamic centroid distance of each sample point, and draws its change trend as shown in Figure 17. Compared with the KJADE integrated index and the kurtosis index, the dynamic centroid distance index has a larger minor fault amplitude, which allows it to make a more accurate judgment on the minor fault of the bearing, and the subsequent amplitude shows an upward trend, which is more in line with actual engineering conditions. In the early stages of equipment operation, the equipment operates normally, and various vibration indicators fluctuate normally. The fluctuations originate from changes in the flow and load of the pump. The bearing at the drive end of the pump has early normal wear characteristics, but there is no deterioration trend. With the operation of the In the early stages of equipment operation, the equipment operates normally, and various vibration indicators fluctuate normally. The fluctuations originate from changes in the flow and load of the pump. The bearing at the drive end of the pump has early normal wear characteristics, but there is no deterioration trend. With the operation of the pump, the corresponding fault information of the pump vibration data, such as kurtosis and other indicators, change significantly compared to the normal operation stage. At this time, the bearing will result in early failure. As the faulty bearing continues to work, various indicators fluctuate at high and low points. Intensified through the analysis of the data, the time-domain waveform can be seen in the time-domain waveform with insignificant swarm impact characteristics and the appearance of impact instability, indicating that the bearing is worn at this time, but the damaged surface of the rolling element is small, and the load area occasionally makes contact during the rotation process. When the inner and outer raceways are in contact, the vibration acceleration index rises, and when it is not in contact, the indicators decline.
As the bearing continues to work, the high-frequency vibration kurtosis density and impact energy ratio of the pump drive end change again compared to the previous ones. All kinds of indicators show a significant upward trend, and the vibration model shows new changes. The impact of the quasi-rotation frequency interval can be seen in the highfrequency vibration acceleration waveform, which appears to be stable, and the acceleration envelope spectrum is dominated by the rotation frequency and the fault frequency of the bearing inner ring, indicating that the main damage of the bearing at this stage is on the bearing inner ring. At the same time, the rising indicators show that the bearing damage surface is gradually expanding. The rise in vibration energy mainly comes from the bearing fault frequencies and sidebands in the frequency spectrum. The vibration velocity trend of the driving end and the free end of the pump shows that the effective value of the vibration velocity at the free end also shows a slow upward trend with time. This feature indicates that the vibration generated by the damage of the bearing at the driving end affects the entire shaft system of the pump.
It can be seen from Figure 17 that when the water pump is in a fault-free state, the distance between the fusion feature and the non-fault data center of mass is small. When the bearing is in a fault state, there will be a sudden change in the distance between the fusion coordinates and the non-fault center of mass, which can be determined at the moment of the sudden change, when the bearing failed. Among different fusion methods, we can see that compared with other feature fusion methods, the KJADE method is better than other feature fusion methods. When observing the trend of the feature distance, its change trend is more obvious. After the mutation, it can be clearly reflected by the relevant characteristic trend, so the state of the bearing during operation can be sensed in advance through the corresponding characteristic distance change. When the characteristic distance begins to grow larger or even begins to mutate, we can judge accordingly whether the bearing has failed.
When the kurtosis index is simply used, and when the number of sample points is 444, the kurtosis index begins to show a clear upward trend. At this time, the corresponding bearing failure stage is the mid-term failure of the bearing, and the early failure of the bearing cannot be accurately identified. At this time, the water pump bearing is running in a faulty state. The CMD index we proposed began to show an upward trend when the number of sampling points was 336, which was 20 h earlier than the simple kurtosis index. Early warning can be achieved, and economic losses caused by potential safety hazards can be prevented in time.

Conclusions
In practical engineering applications, the pump bearing signal is seriously disturbed by noise, and the fault evolution is complex and changeable. Traditional performance degradation indicators cannot describe the degradation trend in a timely and accurate manner. To solve this problem, this work proposes a novel CMD index extraction method. This method can effectively eliminate the interference of the coaxial vibration source and can accurately describe the degradation trend of the bearing. The analysis results of an actual engineering case show that its early warning ability and index monotonicity are better than the traditional kurtosis index.
However, there are still some future challenges: (1) Better denoising of the pump bearing signal and extracting effective degradation indicators to achieve earlier warning and more accurate fault evolution process characterization, which is still a future challenge; (2) improving the computational efficiency of the method to achieve real-time monitoring, which requires further research; (3) in the process of CMD calculation, some parameters need to be manually selected, and the method of realizing parameters' self-optimization deserves further study; and lastly, (4) the automatic selection and fusion of features is also worthy of further research to improve the clustering of features.