Transfer Prediction Method of Bearing Remaining Useful Life Based on Deep Feature Evaluation under Different Working Conditions

In existing deep-learning-based models for predicting bearing remaining useful life (RUL), the quality of the extracted features is judged only by prediction accuracy, so the features themselves are difficult to analyze and interpret. In addition, changes in working conditions strongly affect prediction accuracy. To overcome these limitations, a bearing RUL-prediction method based on feature evaluation and deep transfer learning is proposed. The method addresses the two problems as follows: (1) a feature evaluation and selection method for bearing life prediction based on a trend consistency index is designed; (2) a domain adversarial transfer model based on feature condition mapping is proposed to handle changes in working conditions. Experimental results show that the method outperforms existing bearing feature evaluation and prediction methods.


Introduction
Rolling bearings are key parts of mechanical equipment, and their reliability directly affects the operational safety of the equipment [1]. Thus, predicting the remaining useful life (RUL) of rolling bearings is of great significance for the health evaluation of the entire equipment [2][3][4]. Rolling bearing RUL-prediction methods can be divided into model-based and data-driven methods [5,6]. Model-based methods need to make assumptions about the bearing degradation process; these assumptions can differ considerably from the actual degradation process and rely on knowledge gained from experience, so their application is still limited. Currently, with the rapid development of intelligent technology and deep learning, data-driven methods have become a focus of research in academia and industry.
Data-driven bearing RUL prediction mainly includes two parts: feature extraction and prediction model construction. The features extracted by traditional data-driven bearing RUL methods are generally statistical indicators, such as root mean square and kurtosis [7][8][9][10]. Machine learning models such as support vector machines [11], hidden Markov models [12], and Bayesian networks [13] are widely used as prediction models. However, these methods need to extract degradation features based on expert knowledge and experience and then select an appropriate machine learning model to predict based on the changing trend of the features. Recently, owing to its strong nonlinear mapping learning ability, deep learning has been introduced into rolling bearing RUL prediction. This approach extracts features adaptively from the original signal and completes the prediction, which reduces the dependence of the intelligent prediction model on expert knowledge and experience.
Through the summary and analysis of the above literature, it can be seen that much research on deep-learning-based bearing RUL prediction has been performed at this stage. Related research on bearing RUL transfer prediction and feature evaluation has also been performed, but the following limitations remain: (1) In existing feature evaluation research for bearing life prediction, the time and frequency domain features of the vibration signal are evaluated and screened, but the features extracted by the deep learning model are not evaluated. The automatic feature extraction of the deep learning model can reduce the complexity of manual feature extraction. Using feature evaluation methods to evaluate this kind of feature can reduce the influence of human factors and improve the interpretability of deep features. Thus, it is valuable to evaluate the features extracted by the deep learning model. (2) Currently, the premise for deep learning to obtain high prediction accuracy in bearing RUL is that there are enough data and that the training and test data come from the same or similar distributions. If these conditions are not met, the performance of the deep-learning-based prediction method will decline or even fail. However, in practical applications, the distributions of the training data and the test (prediction) data often differ because of changes in working conditions. Current transfer prediction methods based on maximum mean discrepancy and other domain adaptation techniques reduce the difference between the overall distributions of the source and target domain data, which may leave the extracted features lacking in prediction resolution.
The contributions of this study are as follows. To address the first limitation, a feature evaluation and screening method for bearing life prediction based on a trend consistency index is proposed. The signal features extracted by the deep transfer learning model are evaluated and screened, and the screened features are used to predict the remaining useful life of the bearing. The effectiveness of the proposed method is verified by comparing it with the prediction results obtained using classical evaluation indexes (time correlation, monotonicity, and robustness).
To address the second limitation, we propose a bearing RUL-prediction method based on feature evaluation and deep transfer learning. The framework of the method includes feature extraction and evaluation, prediction, and domain adaptation modules. First, an unsupervised convolutional autoencoder network was used to construct the feature extraction model, which extracts the source and target domain features. Second, a domain adaptation module based on domain adversarial learning was used to reduce the difference between the features extracted from the source and target domains, and a feature condition mapping learning mechanism was added to improve the prediction resolution. Then, a trend consistency index was used to evaluate the extracted features, and the features with high index scores were selected. Finally, a fully convolutional network was constructed as the prediction model, and the screened features were input for prediction. The superiority of the proposed method was verified on bearing failure data collected under different working conditions.
The remainder of the study is structured as follows. The proposed method is presented in Section 2. Experimental details, results, and analysis are given in Section 3. Finally, conclusions are drawn in Section 4.

Proposed Method
A transfer prediction method of bearing RUL based on deep feature evaluation is proposed herein. The framework of the method includes feature extraction, domain adaptation, feature evaluation, and prediction modules, as shown in Figure 1. The feature extraction module is built from an unsupervised convolutional autoencoder network. The domain adaptation module consists of a two-layer fully connected neural network and uses the Wasserstein distance to measure the difference between distributions. In the feature evaluation module, the trend consistency index is used to select the features with high trend consistency for predicting bearing life. The prediction module is a three-layer convolutional network.
The feature extraction module adopts an unsupervised deep convolutional autoencoder network, whose structure is shown in Figure 2. The convolutional autoencoder consists of three one-dimensional convolution layers and three one-dimensional deconvolution layers. The hidden-layer features of the autoencoder are pooled to provide features for the prediction and domain adaptation modules. The parameters of the feature extraction module are shown in Table 1. The loss function of the network is the mean square error, given in Equation (1):

$$L_{mse} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i^{p}\right)^2 \quad (1)$$

where $y_i$ is the input value, $y_i^{p}$ is the output value, and $n$ is the number of values.
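As a minimal illustration of the reconstruction loss, the sketch below computes the mean square error between an input sequence and its reconstruction in plain Python. The function name `reconstruction_mse` and the toy values are assumptions made for this example; the autoencoder itself is not reproduced here.

```python
def reconstruction_mse(inputs, outputs):
    """Mean square error between an input sequence and its reconstruction."""
    if len(inputs) != len(outputs):
        raise ValueError("input and output must have the same length")
    n = len(inputs)
    return sum((y - y_p) ** 2 for y, y_p in zip(inputs, outputs)) / n


# Hypothetical toy values standing in for one input spectrum and its reconstruction.
y = [0.0, 0.5, 1.0, 0.5]
y_p = [0.1, 0.4, 0.9, 0.5]
loss = reconstruction_mse(y, y_p)
```

A perfect reconstruction gives a loss of zero, and the loss grows quadratically with the reconstruction error at each point.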

Table 1. Parameters of the feature extraction module (encoder).

Domain Adaptation Module
The domain adaptation module uses the idea of a generative adversarial network (GAN) to measure the difference between the source and target domains in domain adaptation. Source and target domain data from different working conditions are input into the generator at the same time. The generator performs feature extraction, and the discriminator is trained through continuous adversarial learning. When the discriminator can no longer determine which domain a feature extracted by the generator comes from, the features extracted from the source domain can be considered no longer different from those extracted from the target domain. The discriminator consists of a two-layer fully connected neural network, and the last layer has one neuron. To avoid the training instability of the original GAN, the discriminator module uses the Wasserstein distance instead of the Jensen-Shannon (JS) divergence to measure the difference between distributions.

1. Wasserstein distance

$$W(P_r, P_g) = \sup_{\|f_w\|_L \le 1} \mathbb{E}_{x \sim P_r}\left[f_w(x)\right] - \mathbb{E}_{x \sim P_g}\left[f_w(x)\right]$$

where $P_r$ and $P_g$ are probability distributions, and $f_w(x)$ is the fitting function of the Wasserstein distance. When $f_w$ satisfies the 1-Lipschitz constraint, the Wasserstein distance between the distributions can be approximated by adjusting the parameters of $f_w$.
The Wasserstein distance between the source domain feature distribution $P_{h^s}$ and the target domain feature distribution $P_{h^t}$ is as follows:

$$W(P_{h^s}, P_{h^t}) = \sup_{\|D\|_L \le 1} \mathbb{E}_{h \sim P_{h^s}}\left[D(h)\right] - \mathbb{E}_{h \sim P_{h^t}}\left[D(h)\right]$$

2. Conditional mapping constraints

$$f'_g(x_s) = f_g(x_s) \odot y$$

where $f'_g(x_s)$ is the mapping function of the feature extraction module in the source domain, $y$ is the label, and $\odot$ denotes the dot product operation. The empirical domain loss is then

$$L_{wd} = \frac{1}{n_s}\sum_{x_s} D\left(f'_g(x_s)\right) - \frac{1}{n_t}\sum_{x_t} D\left(f_g(x_t)\right)$$

where $x_s$ and $x_t$ are the data samples from the source domain and the target domain, respectively, and $n_s$ and $n_t$ are their numbers; $D(x)$ is the mapping function of the domain discrimination module; and $f_g(x)$ is the mapping function of the feature extraction module.
To satisfy the 1-Lipschitz constraint, the gradient penalty $L_{grad}$ is imposed on the discriminator parameters $\theta_d$:

$$L_{grad}(\hat{h}) = \left(\left\|\nabla_{\hat{h}} D(\hat{h})\right\|_2 - 1\right)^2$$

where $\hat{h} = \varepsilon h^s + (1 - \varepsilon) h^t$, and $\nabla$ is the gradient operator.
The Wasserstein distance can be calculated approximately by the following formula:

$$\max_{\theta_d}\left\{L_{wd} - \gamma L_{grad}\right\}$$

where $\gamma$ is the penalty coefficient.
Thus, the objective function of the transfer learning feature extraction model based on domain adversarial training can be written as follows:

$$\min_{\theta_g}\max_{\theta_d}\left\{L_{wd} - \gamma L_{grad}\right\}$$
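The adversarial loss can be illustrated with a small sketch. It assumes a linear critic D(h) = w · h + b in place of the two-layer discriminator, so that the gradient with respect to the input is simply w and the gradient penalty has a closed form. The names `critic`, `wasserstein_critic_loss`, `interpolate`, and `gradient_penalty`, the toy features, and the value of `gamma` are hypothetical and not taken from the study; the conditional mapping by the label is omitted.

```python
import math


def critic(h, w, b):
    """Linear critic D(h) = w . h + b (assumed stand-in for the discriminator)."""
    return sum(wi * hi for wi, hi in zip(w, h)) + b


def wasserstein_critic_loss(h_src, h_tgt, w, b):
    """Mean critic score on source features minus mean score on target features."""
    mean_src = sum(critic(h, w, b) for h in h_src) / len(h_src)
    mean_tgt = sum(critic(h, w, b) for h in h_tgt) / len(h_tgt)
    return mean_src - mean_tgt


def interpolate(h_s, h_t, eps):
    """Point h_hat = eps * h_s + (1 - eps) * h_t between a source and a target feature."""
    return [eps * s + (1.0 - eps) * t for s, t in zip(h_s, h_t)]


def gradient_penalty(h_hat, w):
    """(||grad D(h_hat)||_2 - 1)^2; for a linear critic the gradient equals w for any h_hat."""
    grad_norm = math.sqrt(sum(wi * wi for wi in w))
    return (grad_norm - 1.0) ** 2


# Hypothetical 2-D features from the source and target domains.
h_src = [[1.0, 2.0], [3.0, 2.0]]
h_tgt = [[0.0, 1.0], [2.0, 1.0]]
w, b = [0.6, 0.8], 0.0  # ||w|| = 1, so this critic is 1-Lipschitz
gamma = 10.0            # assumed penalty coefficient

l_wd = wasserstein_critic_loss(h_src, h_tgt, w, b)
h_hat = interpolate(h_src[0], h_tgt[0], eps=0.5)
l_grad = gradient_penalty(h_hat, w)
critic_objective = l_wd - gamma * l_grad  # the critic maximizes this quantity
```

With a nonlinear discriminator, the gradient at the interpolated point would have to be obtained by automatic differentiation; the linear case is used here only because it keeps the penalty term easy to check by hand.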

Prediction Module
The prediction module is built from a fully convolutional neural network, which reduces the number of training parameters through the weight sharing and local connectivity of convolution layers. Increasing the number of network layers can improve the nonlinear mapping ability; thus, a multilayer CNN is used as the prediction model for experimental verification. The structure of the model is shown in Figure 3. The model has three network layers, and the last layer is a fully connected layer that outputs the life prediction value $r_h$. Finally, a weighted smoothing method is used to smooth the prediction results.
The RUL percentage label $y_h$ is used in model training. It represents the percentage of the RUL at the current time in the total life and is calculated as follows:

$$y_h = \frac{L - h}{L}$$

where $L$ is the total number of data acquisitions for the corresponding bearing, and $h$ denotes the $h$-th data acquisition for that bearing.
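The label construction and the smoothing of predictions can be sketched as follows, assuming the label takes the form y_h = (L - h) / L for the h-th of L acquisitions. The text does not specify the weighted smoothing method, so the exponentially weighted form and the weight `alpha` below are assumptions; both function names are hypothetical.

```python
def rul_percentage_labels(total_acquisitions):
    """Label y_h = (L - h) / L for h = 1..L, the fraction of life remaining."""
    L = total_acquisitions
    return [(L - h) / L for h in range(1, L + 1)]


def exponential_smooth(values, alpha=0.3):
    """Exponentially weighted smoothing (assumed form; alpha is not from the paper)."""
    smoothed = []
    for v in values:
        if not smoothed:
            smoothed.append(v)
        else:
            smoothed.append(alpha * v + (1.0 - alpha) * smoothed[-1])
    return smoothed


labels = rul_percentage_labels(5)
# Hypothetical noisy predictions for the same five acquisitions.
predictions = [0.85, 0.55, 0.45, 0.15, 0.05]
smoothed_predictions = exponential_smooth(predictions)
```

The labels decrease linearly from just below 1 at the first acquisition to 0 at failure, so every bearing shares the same target scale regardless of its total life.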

Feature Evaluation Module
(1) Design idea of the trend consistency life-prediction evaluation index
Time correlation, monotonicity, and robustness play an active role in the quantitative evaluation and screening of bearing signal features. However, when the selected features are input into the bearing life prediction model, how consistently the same feature trends across different bearings affects the prediction, and these three classic feature evaluation indicators do not directly consider this trend consistency.
In the time direction, the same signal feature should follow a consistent trend across different bearings: the higher the degree of consistency, the better the prediction result, and vice versa. To evaluate trend consistency and to select features with higher trend consistency for predicting bearing life and improving prediction accuracy, a new calculation method was designed; that is, a correlation formula is used to calculate the trend consistency of the same feature among different bearings, and the trend consistency life-prediction evaluation index is thus proposed.
(2) Construction method of the trend consistency life-prediction evaluation index
First, the extracted features are smoothed. Second, the features are compressed by normalization and down-sampling. Third, the correlation between features is calculated using the correlation formula. Fourth, the mean of the group of calculated correlation values is taken. Finally, the score of the trend consistency index is obtained.
Suppose the $r$-th feature sequence of the $i$-th bearing is $X_i^r$, and its trend term after exponentially weighted moving smoothing is $X_{T_i}^r$. The main calculation process of the trend consistency index is as follows:
Step 1: The trend terms of the same feature sequence of different bearings are normalized to the range 0-1 and reduced in dimension, so that the same feature of different bearings has the same data length without changing the trend of the feature. After normalization and down-sampling, the trend term $X_{T_i}^r$ of the feature sequence becomes $Z_i^r = \left(Z_i^r(1), Z_i^r(2), \ldots, Z_i^r(s)\right)$. In the down-sampling calculation, $s$ is the number of equal intervals into which the time length of $X_{T_i}^r$ is divided, which is also the total number of feature values after down-sampling; $M$ is the length of each interval; $R(\cdot)$ is the round-up function; $G(f)$ is the length of the $f$-th interval; and $Z_i^r(f)$ is the value of the feature after down-sampling the $f$-th interval.
Step 2: The correlation Formula (9) is used to calculate the correlation between the same feature sequences of different bearings. The correlation values $q_{ij}^r$ of the same feature sequence between pairs of bearings are arranged as follows:

$$q^r = \left\{q_{12}^r, q_{13}^r, \ldots, q_{(n-1)n}^r\right\}$$

where $n$ is the total number of bearings, and $q_{12}^r$ is the correlation value between the $r$-th feature sequence $Z_1^r$ of the first bearing and the $r$-th feature sequence $Z_2^r$ of the second bearing.
Step 3: The results calculated in step 2 are averaged to obtain the score $q$:

$$q = \frac{1}{E}\sum_{q_{ij}^r \in q^r} q_{ij}^r$$

where $E$ is the total number of values $q_{ij}^r$ in $q^r$. The mean lies in the range from 0 to 1, so it is taken as the final score $q$ of the trend consistency index of the $r$-th feature sequence; the closer $q$ is to 1, the more consistent the trend of the same feature across different bearings. Using features with high trend consistency scores for RUL prediction helps obtain better prediction accuracy.
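The three steps can be sketched in plain Python under several stated assumptions: the trend term is an exponentially weighted moving average with an assumed weight `alpha`, down-sampling takes the mean of each interval, the correlation is the absolute Pearson coefficient (which keeps the score between 0 and 1), and only distinct bearing pairs are averaged. All function names and the toy sequences are hypothetical.

```python
import math
from itertools import combinations


def ewma(seq, alpha=0.3):
    """Exponentially weighted moving average used as the trend term (alpha is assumed)."""
    trend = [seq[0]]
    for x in seq[1:]:
        trend.append(alpha * x + (1.0 - alpha) * trend[-1])
    return trend


def min_max(seq):
    """Scale a sequence to the range 0-1."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(x - lo) / (hi - lo) for x in seq]


def downsample(seq, s):
    """Split seq into s near-equal intervals and average each (averaging is an assumption)."""
    n = len(seq)
    if n < s:
        raise ValueError("sequence is shorter than the number of intervals")
    out = []
    for f in range(s):
        chunk = seq[f * n // s:(f + 1) * n // s]
        out.append(sum(chunk) / len(chunk))
    return out


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def trend_consistency(sequences, s=10, alpha=0.3):
    """Mean absolute pairwise correlation of one feature across bearings."""
    z = [downsample(min_max(ewma(seq, alpha)), s) for seq in sequences]
    pairs = [abs(pearson(z[i], z[j])) for i, j in combinations(range(len(z)), 2)]
    return sum(pairs) / len(pairs)


# Hypothetical feature sequences of different lengths, one per bearing.
rising_long = [0.1 * t for t in range(100)]
rising_short = [0.05 * t for t in range(60)]
oscillating = [math.sin(t) for t in range(80)]

score_consistent = trend_consistency([rising_long, rising_short])
score_mixed = trend_consistency([rising_long, rising_short, oscillating])
```

Two bearings whose feature rises steadily score close to 1 even though their lives differ in length, while adding a bearing whose feature merely oscillates lowers the mean score.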

Model Training
The verification process of the bearing RUL-prediction feature selection method based on the trend consistency index mainly includes the following four steps:
Step 1: Data preprocessing. The original vibration signal at each time point is transformed into a frequency domain signal by fast Fourier transform, and the data are divided into training and test sets according to certain rules.
Step 2: Feature extraction. The unsupervised convolutional autoencoder network is used to construct the feature extraction model, and the GAN-based domain adaptation module is used to reduce the distribution difference between the source and target domain data while the feature extraction network is trained. After training, the training and test sets are both input into the network to obtain the corresponding feature sets.
Step 3: Feature evaluation and screening. Following the index evaluation steps, the trend consistency index is used to evaluate the features of the training set, and the high-quality features are selected according to their index scores.
Step 4: Bearing RUL prediction. The multilayer fully convolutional network is used as the prediction model and is trained on the high-quality feature set of the training set; after training, the high-quality feature set of the test set is input to obtain the predicted values.
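The preprocessing in step 1 can be illustrated with a short sketch that converts a time-domain signal into a one-sided magnitude spectrum. For self-containment it uses a naive discrete Fourier transform rather than a fast one; in practice a library routine such as `numpy.fft.rfft` would normally be used. The function name `magnitude_spectrum`, the scaling by the signal length, and the synthetic tone are assumptions made for this example.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """One-sided magnitude spectrum via a naive O(n^2) DFT (fine for short signals)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        spectrum.append(abs(acc) / n)
    return spectrum


# Synthetic stand-in for one vibration sample: a pure tone in frequency bin 5.
n = 64
tone_bin = 5
signal = [math.sin(2 * math.pi * tone_bin * t / n) for t in range(n)]

spectrum = magnitude_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

The resulting spectrum, one per acquisition, is the kind of frequency domain input that the feature extraction network would then receive.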

Experimental Verification
The experimental results and analysis are divided into six parts. First, the experimental data are explained. Second, the model feature extraction is tested and analyzed. Third, the influence of domain adaptation on feature distribution is analyzed. Fourth, the trend consistency index of feature evaluation is calculated and analyzed. Fifth, the influence of index scores on the prediction results is analyzed. Finally, the prediction of bearing RUL based on different evaluation methods is compared and analyzed.

Experimental Data
The experimental data are vibration acceleration data collected from an accelerated life bench test of rolling bearings, taken from the PHM data challenge [31] held by the Institute of Electrical and Electronics Engineers (IEEE) in 2012. The accelerated life test of the rolling bearing is shown in Figure 4. The data set contains the full life-cycle vibration data of 17 rolling bearings under three working conditions: 7 bearings each under the first and second working conditions and 3 bearings under the third working condition, as shown in Table 2. During acquisition, 2560 vibration acceleration samples were collected every 10 s until the vibration acceleration reached the threshold set in the data description, at which point the bearing was considered to have failed and the test was stopped. In this study, bearing dataset 1 is the source domain, and bearing datasets 2 and 3 each serve as the target domain.

Feature Extraction
In the experimental verification, one bearing from bearing dataset 2 (the target domain) is selected as the test set, and the other six bearings are used as the training set. Bearing 2_3 is taken as an example to illustrate the effectiveness of the method. The original vibration signal and the frequency domain signal of the first 0.1 s sample of bearing 2_3 are shown in Figures 5 and 6, respectively.

It can be seen from Figure 8 that the feature extraction model can extract different features from the test set, indicating that the extracted features are diverse.

Domain Adaptation
It can be seen from Figures 9 and 10 that the probability density distributions of the features extracted from the source and target domain data differ considerably before transfer learning. After transfer learning, the probability density distributions of the features become much more consistent, and the difference is reduced. This shows that the features extracted with transfer learning are insensitive to changes in working conditions, which helps the RUL-prediction model achieve higher prediction accuracy.

Without transfer learning.
With transfer learning.


Calculation and Analysis of Trend Consistency Index in Feature Evaluation
In the feature evaluation, the extracted features of the training set were evaluated. Owing to space limitations, the calculation process is illustrated graphically for bearing 1_1 and bearing 2_1 in the training set. The trend terms of the first feature sequence of bearings 1_1 and 2_2 are shown in Figures 11a and 11c, respectively. After normalization and down-sampling in step 1, the feature sequences of the two bearings are shown in Figures 11b and 11d, respectively. Comparing the results before and after normalization and down-sampling shows that this step does not alter the trend of the feature sequence and therefore achieves the purpose of the first step. After the correlation calculation in the second step, the correlation values of all bearing features are arranged as shown in Table 3.
According to the results obtained in step 2, we calculated the data in Table 3 using the calculation method in step 3. Finally, the trend consistency index score of the first feature sequence was obtained, and the score is 0.683.
According to the results obtained in step 2, the data in Table 3 were calculated using the method in step 3, giving the trend consistency index score of the first feature sequence. Each bearing feature set has 256 feature sequences, so repeating this process yields 256 score values, as shown in Figure 12.
Figure 12 shows that the 23rd feature sequence has the highest score, while the 22nd feature sequence has the lowest. To verify whether the scores are reasonable, the trend terms of the 22nd and 23rd feature sequences of bearings 1_1 and 2_1 after down-sampling are plotted in Figures 13 and 14, respectively. Comparing Figures 13 and 14 shows that the more similar the trends are, the higher the index score is, and the more the trends differ, the lower the index score is. This indicates that the calculation method of the trend consistency index is reasonable.
The 256 scores calculated from the 256 feature sequences were normalized, and the results are shown in Figure 15. The threshold was set to 0.5, and features with a normalized score above 0.5 were retained. The features corresponding to the retained sequence numbers were then extracted from the feature sets of the training and test sets; these are the high-quality features identified by the trend consistency index.
From the above calculation process, it can be seen that the trend consistency index score of a feature sequence is relative: a feature sequence with a high score is more consistent across different bearings than a feature sequence with a low score.
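The scoring and screening steps described above can be illustrated with a minimal sketch. It is not the paper's trend consistency index: Pearson correlation is used here only as a stand-in similarity measure between the down-sampled trend terms of two bearings, and min-max scaling is assumed for the normalization step. All function names (`downsample`, `pearson`, `min_max_normalize`, `select_features`) are hypothetical.

```python
import math

def downsample(seq, length):
    """Pick `length` evenly spaced points so two trends of unequal length can be compared."""
    if length == 1:
        return [seq[0]]
    step = (len(seq) - 1) / (length - 1)
    return [seq[round(i * step)] for i in range(length)]

def pearson(x, y):
    """Pearson correlation, an assumed stand-in for the trend consistency measure."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def min_max_normalize(scores):
    """Scale scores to [0, 1]; min-max scaling is an assumption of this sketch."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def select_features(scores, threshold=0.5):
    """Return the indices of features whose normalized score is above the threshold."""
    normalized = min_max_normalize(scores)
    return [i for i, s in enumerate(normalized) if s > threshold]

# Toy trend terms of one feature on two bearings with different lifetimes.
trend_a = [0, 1, 2, 3, 4, 5, 6, 7, 8]
trend_b = [0, 2, 4, 6]
similarity = pearson(downsample(trend_a, len(trend_b)), trend_b)

# Toy raw scores for three features; only index 1 is strictly above 0.5 after scaling.
kept = select_features([2.0, 4.0, 3.0])
```

In this toy case the two trends rise together, so their similarity is close to 1, which mirrors the observation that similar trends receive high scores.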

Influence of Index Score on Prediction Results
To explain the influence of feature scores on the prediction results, high-quality feature sets with scores above 0.5 and common feature sets with scores below 0.5 were used for training and prediction, respectively. Bearings 2_2 and 3_3 were randomly chosen as test sets, and the prediction results corresponding to their high-quality and common feature sets are shown in Figures 16 and 17, respectively. Figures 16 and 17 show that features with a high trend consistency index score predict better than features with a low score. In other words, the more consistent the trend of the same feature is across different bearings, the better the prediction effect.
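The split into high-quality and common feature sets can be sketched as follows. This is a minimal illustration, assuming each sample is stored as a list with one value per feature sequence; the function name `split_feature_sets` is hypothetical, and assigning a score of exactly 0.5 to the common set is an assumption, since the text only mentions scores above and below 0.5.

```python
def split_feature_sets(features, normalized_scores, threshold=0.5):
    """Split feature columns into a high-quality set and a common set by score.

    features: list of samples, each a list with one value per feature sequence.
    normalized_scores: one normalized trend consistency score per feature sequence.
    """
    high_idx = [j for j, s in enumerate(normalized_scores) if s > threshold]
    # Assumption: a score exactly equal to the threshold goes to the common set.
    common_idx = [j for j, s in enumerate(normalized_scores) if s <= threshold]
    high = [[row[j] for j in high_idx] for row in features]
    common = [[row[j] for j in common_idx] for row in features]
    return high, common

# Two toy samples with three feature sequences each.
samples = [[1, 2, 3], [4, 5, 6]]
scores = [0.9, 0.5, 0.7]
high_quality, common = split_feature_sets(samples, scores)
```

Each of the two resulting sets would then be used to train and test the same prediction model so that their results can be compared.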

Comparative Analysis of Bearing RUL Prediction Results
Three typical indicators and the trend consistency index were each used to evaluate and screen features. The high-quality feature sets obtained from each evaluation were used for training and prediction, and four groups of prediction results were obtained. To better illustrate the prediction effect of this method, all 256 feature sequences were also used for training and prediction, giving an additional group of prediction results based on the full feature sequences.
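The mean absolute error (MAE) was used as the evaluation standard of prediction error, and the calculation formula is shown in Formula (14).
$$\mathrm{MAE} = \frac{1}{H}\sum_{i=1}^{H}\left|y_i - \hat{y}_i\right| \qquad (14)$$
where H is the total number of predicted values of the corresponding bearing, and $y_i$ and $\hat{y}_i$ denote the actual and predicted RUL values, respectively.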
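As a minimal sketch of the error metric, the mean absolute error over the H predicted values of one bearing can be computed as below. The function name and the toy RUL values are illustrative assumptions, not data from the experiments.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over the H predicted values of one bearing."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy normalized RUL values (illustrative only).
actual_rul = [1.0, 0.5, 0.0]
predicted_rul = [0.9, 0.7, 0.1]
mae = mean_absolute_error(actual_rul, predicted_rul)
```

Here the absolute errors are 0.1, 0.2, and 0.1, so the result is approximately 0.133.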
Bearings 1_1~1_7 were taken as the source domain and six of the bearings 2_1~2_7 as the target domain, with one bearing selected as the test set in turn. According to this division method, the mean prediction error is shown in Table 4. The statistical results in Table 4 show that, on the test samples, the trend consistency index gives better predictions than the other three indexes. This indicates that the trend consistency index can effectively select a high-quality feature set from the features extracted by the deep learning model, which helps to reduce the prediction error. Compared with the other prediction results, the overall average error is reduced by 20.5%, 21%, 20.3%, and 22.8%, which shows that the feature evaluation method is suitable for the signal features extracted by the deep learning model and can improve the interpretability of such features to a certain extent.
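The rotation of test bearings described above can be sketched as a simple leave-one-out enumeration. This is an illustrative reading of the division method, assuming that the bearing held out for testing is removed from the target domain; the function name `leave_one_out_splits` and the dictionary layout are hypothetical.

```python
def leave_one_out_splits(source, target):
    """Yield one split per target bearing, holding that bearing out as the test set."""
    for test in target:
        yield {
            "source": list(source),
            # Assumption: the remaining six bearings form the target domain.
            "target": [b for b in target if b != test],
            "test": test,
        }

source_bearings = [f"1_{i}" for i in range(1, 8)]
target_bearings = [f"2_{i}" for i in range(1, 8)]
splits = list(leave_one_out_splits(source_bearings, target_bearings))
```

Averaging the per-split errors then gives one mean error per feature evaluation method, which is the kind of summary reported in Table 4.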
To show the prediction effect more intuitively and comprehensively, bearings 2_2 and 3_3 were taken as examples, and the prediction results obtained when each is used as the test set are shown in Table 5. It can be seen that the prediction accuracy obtained with the trend consistency index is higher than that obtained with the full feature series.

Conclusions
To address the problem of feature extraction and evaluation in bearing life prediction, a bearing RUL transfer prediction method based on deep feature evaluation is proposed, which adopts a feature evaluation and screening method based on the trend consistency index. Finally, the bearing RUL can be predicted. The conclusions are as follows:

Figure 3. Network model for life prediction.

Figure 4. Accelerated life testbed of rolling bearing.

Figure 5. Original signal of the first 0.1 s sample of bearing 2_3.

Figure 6. Frequency domain signal of the first 0.1 s sample of bearing 2_3.

After the training of the feature extraction model, 320 feature sequences of bearing 2_3 were obtained. Two feature sequences were randomly selected to show the effect of feature extraction. The first and 320th feature sequences selected are shown in Figures 7 and 8.

Figure 9. Feature probability density distribution of bearing 2_1 and bearing 1_1.

Figure 10. Feature of bearing without transfer learning and with transfer learning.

Figure 11. Trend items bearing 3_2 characteristics. (a) Trend term of bearing 1_1 feature. (b) Trend term of bearing 1_1 feature after down-sampling. (c) Trend term of bearing 2_1 feature. (d) Trend term of bearing 2_1 feature after down-sampling.

Figure 12. Trend consistency score of full feature sequence.

Figure 13. Trend term of the 22nd characteristic sequence of bearings 1_1 and 2_1 after down-sampling.

Figure 14. Trend term of the 23rd characteristic sequence of bearings 1_1 and 2_1 after down-sampling.

Figure 15. Normalized score of trend consistency of full feature sequence.

Figure 16. Prediction results of two kinds of feature sets of bearing 2_2.

Figure 17. Prediction results of two kinds of feature sets of bearing 3_3.

Table 2. Bearing data set.

Table 3. Correlation value of the same feature.

Table 4. The predicted results of the experiment.

Table 5. Prediction experimental results of bearing 2_2 and bearing 3_3.