Overview of Equipment Health State Estimation and Remaining Life Prediction Methods

Health state estimation quantitatively evaluates the current degradation state of equipment, and remaining life prediction quantitatively predicts the remaining service time of equipment. These two technologies provide a basis for condition-based maintenance and predictive maintenance of equipment, respectively. In recent years, a large amount of research has been devoted to these two technologies. However, no systematic review comprehensively covers both technologies and their engineering applications. To fill the gap, this paper makes a comparative analysis of existing health state estimation and remaining life prediction methods, and details the characteristics and limitations of the various methods. The engineering applications of these two methods are summarized, and their applicable objects are briefly given. Finally, these two methods are summarized, and their feasibility for engineering application is discussed. This work provides guidance for the selection of industrial equipment health assessment and remaining life prediction methods.


Introduction
In the past decades, with increasing equipment complexity and integration, the failure rate has gradually increased. To ensure that equipment completes its tasks smoothly and to reduce maintenance cost over the life cycle, prognostics and health management (PHM) technology was born in the 1970s [1]. PHM technology represents a change in concept: the maintenance and management of equipment has moved from post-failure treatment and passive maintenance, through regular inspection and active protection, to advance prediction and comprehensive management [2]. This technology has been intensively studied and widely used in the UK, USA and other countries, and it is an important part of equipment maintenance and management. Health state estimation and remaining life prediction are key technologies in PHM [3]. Both collect the output data of the equipment through various sensors, process and analyze the data with the help of various algorithms, comprehensively evaluate the health of the equipment and predict its remaining service time [4]. With the help of these two technologies, the degradation trend of the equipment can be identified and the future service time can be evaluated. Furthermore, maintenance management advice can be provided in time, so as to improve the reliability and supportability of the equipment.
There are three main ways to evaluate the health state of equipment, as shown in Table 1. The first two evaluate the health state level of the equipment, and the third evaluates the health value of the equipment [5]. Initially, engineers used only a binary fault/normal function to judge the health state of equipment [6]. This method is relatively mature, but a binary function alone is insufficient to define the state of equipment. Later,

The remaining life of equipment refers to the period from the time when the equipment is put into production to the time when it can no longer be repaired and reused. It is determined by the material, manufacturing quality, service conditions and maintenance conditions of the equipment. Because some of these factors are random (such as environment, climate and the technical proficiency of operators), the actual remaining life of equipment produced in the same batch will not be exactly the same. Equipment life analysis is mainly divided into two stages: the early stage is life estimation, from which remaining life prediction is then derived. Life estimation is mainly used to evaluate the remaining time of newly developed components and equipment under specific working conditions. The purpose of remaining useful life (RUL) prediction is to predict how much time is left before the equipment fails, given the current state and historical state data. The formula is as follows [9]:

$$T - t \mid T > t,\; Z(t)$$

where $T$ is the random variable of failure time, $t$ is the current running time, and $Z(t)$ is the history up to the current time.
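As a worked illustration of this definition, the expected RUL can be written in terms of the reliability function. The block below is a sketch under a simplifying assumption that is not made in the text: the history $Z(t)$ is ignored, so only the survival condition $T > t$ is used. The symbols $R(u)$ and $\lambda$ are introduced here for illustration.

```latex
% Reliability (survival) function of the failure time T
R(u) = P(T > u)

% Expected remaining useful life at time t, ignoring Z(t)
E[\,T - t \mid T > t\,] = \frac{1}{R(t)} \int_{t}^{\infty} R(u)\, du

% Example: exponential lifetime, R(u) = e^{-\lambda u}
E[\,T - t \mid T > t\,] = \frac{1}{e^{-\lambda t}} \int_{t}^{\infty} e^{-\lambda u}\, du = \frac{1}{\lambda}
```

In the exponential case the expected RUL does not depend on $t$, which is one reason why condition information such as $Z(t)$ is needed for equipment that degrades over time.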
There are great differences in life estimation methods for equipment under different conditions, i.e., newly developed equipment and equipment in working state, as shown in Table 2. The life estimation methods for newly developed components and equipment include mechanism analysis methods and environmental factor conversion methods [10]. RUL prediction for equipment in working state refers to predicting the RUL using relevant information after the equipment has been working for a period of time [11]. The relevant information is mainly degradation data, including performance degradation data recorded during equipment operation and degradation data obtained through accelerated life tests or simulation. The commonly used methods are physical-model-based methods and data-driven methods. An important premise of using performance degradation data for life prediction is an accurate definition of equipment failure. It is generally accepted that the equipment is considered to have failed when its performance degradation data reach a predetermined failure threshold. For example, failure of a power supply device can be defined as the point at which its output voltage drops to a given threshold.
Table 2 (fragment recovered from the flattened layout).
Preceding row (truncated): The component reliability synthesis method is used for theoretical calculation [11]
Remaining life prediction | Equipment (components, systems) in working state | Degradation data; accelerated life test and simulation | Based on physical model; data driven (including machine learning and statistical data driven) [12]

In recent years, a large amount of research has been devoted to health state estimation and remaining life prediction. However, no systematic review comprehensively covers these two technologies and their engineering applications. This greatly limits the application of health assessment and remaining life prediction methods in industry. Therefore, it is necessary to summarize the methods of health evaluation and remaining life prediction and their engineering applications. This paper compares and analyzes various health assessment and remaining life prediction methods. The engineering applications of the two methods are summarized, and their applicable objects are also given. Finally, the feasibility of engineering applications is discussed.

Health State Estimation Method
Because different equipment has different characteristics, the health state estimation methods usually differ. According to the driving mechanism, they are divided into three types, namely, model-driven methods, knowledge-driven methods and data-driven methods [13].
Model-driven methods include the Mahalanobis distance method [14], the fusion weight calculation method [15], the Euclidean distance method [16], the fuzzy theory method [17], etc. These methods are simple, efficient and easy to implement [18]. At present, they are widely used, but they need expert experience to determine the weights and model parameters, and they rely on idealized modeling assumptions [19]. As a result, in practice they adapt poorly to the various complex factors that affect equipment operation.
Knowledge-driven health state estimation works through knowledge acquisition and knowledge expression, but this type of method is difficult to apply in practice, and there is little research on it. This is mainly because knowledge and experience are limited, and knowledge expression also faces the problem of knowledge standardization.
The data-driven method is the most promising method at present, as it makes full use of the advantages of machine learning and deep learning. It is also widely studied for health state estimation in China and abroad [20]. These methods include linear regression [21], support vector machine [22], support vector data description [23][24][25], and neural networks and deep learning [26][27][28]. In particular, the rise and wide application of deep learning have greatly promoted the development of health state estimation research.
This paper classifies the equipment health state estimation methods according to the algorithm principle, as shown in Figure 1. The characteristics of each method are given in Table 3.

Model Calculation Method
(1) Fusion calculation method
The fusion calculation method directly calculates the overall health of the equipment according to the impact of each collected indicator on the overall health of the equipment. Its general expression is as follows:

$$H = \sum_{i=1}^{k} w_i f(d_i)$$

where $H$ represents the health result of the assessment. Suppose there are $k$ sampling data in total. For the $i$th sampling datum, $w_i$ represents the weight of the datum, and $f(d_i)$ represents the result of certain processing applied to the collected indicator datum $d_i$.
Because it is intuitive and easy to understand, fusion calculation is widely used in health state estimation. Li et al. [29] analyzed the factors affecting transformer health state estimation, and derived a health state estimation formula that reflects the transformer state.
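A minimal sketch of the weighted fusion described above follows. The processing function `normalized_margin`, the readings and the weights are hypothetical values chosen for illustration; they are not taken from the cited work.

```python
def fused_health(data, weights, transform):
    """Weighted fusion: H = sum_i w_i * f(d_i)."""
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    return sum(w * transform(d) for w, d in zip(weights, data))


def normalized_margin(d, lo=0.0, hi=100.0):
    # Hypothetical f: map a raw reading to [0, 1], where 1 is healthiest.
    return max(0.0, min(1.0, (hi - d) / (hi - lo)))


# Three hypothetical indicator readings; the weights are assumed to sum to 1.
readings = [20.0, 50.0, 80.0]
weights = [0.5, 0.3, 0.2]
H = fused_health(readings, weights, normalized_margin)
```

In practice the choice of `f` and of the weights carries most of the modeling effort, which is the subjectivity noted for this family of methods.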
(2) Information entropy method
The information entropy method gives a quantitative expression of the overall average characteristics of the collected indicators, and can be used as a complexity measure to reflect fault characteristics. The calculation formula is as follows:

$$E = -\sum_{i} P_i \log P_i$$

where $P_i$ represents the probability of occurrence of the various sampling data. According to the definition of entropy, the greater the entropy, the better the health of the equipment [30]. It is worth noting that there are few applications of health state estimation using information entropy alone, and this method is often combined with other methods. Lu et al. [31] proposed a feature extraction method based on information entropy fusion and applied it to gas path analysis of a turboshaft engine. The results showed that the feature extraction method based on information entropy fusion can effectively reduce the dimension of the input parameters and simplify the feature samples, so as to improve the ability of engine health state estimation.
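The entropy calculation can be sketched as follows, assuming the sampled data have already been discretized into state labels and that the probabilities are estimated by relative frequency. The natural logarithm is an arbitrary choice here; the base only rescales the result.

```python
import math
from collections import Counter


def shannon_entropy(samples):
    """Entropy -sum_i P_i * log(P_i), with P_i estimated by frequency."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical discretized sensor states.
balanced = ["low", "low", "high", "high"]
constant = ["low", "low", "low", "low"]
e_balanced = shannon_entropy(balanced)
e_constant = shannon_entropy(constant)
```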
Machines 2022, 10, 422
(3) Distance method
The distance method uses various vector distances or similarity measures to evaluate the health degree by comparing the distance or similarity between the vector to be evaluated and the standard health vector.
Yin et al. [32] proposed a rolling bearing health state evaluation method based on the similarity of manifold space principal curves, which realized the quantitative evaluation of rolling bearing health state. Deng et al. [33] used the Mahalanobis distance to divide the health state space in order to evaluate the health state of electromechanical equipment. In recent years, fusing multiple distances or combining distances with other methods has also become a research direction. In Zhang et al. [34], Jensen-Shannon (JS) divergence based on information entropy theory was introduced to measure the similarity between the statistical distribution of real-time state data and that of reference health state data, and the similarity was transformed into an index that can evaluate the health state of the system. The results showed that this method can accurately extract the trend in door health state and detect the abnormal maintenance state of door systems in time.
Among the health state estimation methods, there are relatively many methods based on distance or similarity, but the selection of health samples, the determination of parameter weight and the selection of distance algorithms are still worthy of further research.
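As an illustration of the distance approach, the sketch below computes the Mahalanobis distance of a two-feature sample from a set of healthy reference samples. It is restricted to two features so that the covariance inverse has a closed form; the reference data and the mapping `health_index` (with its `scale` parameter) are hypothetical.

```python
import math


def mean_cov_2d(samples):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, mean, cov):
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # d^2 = v^T * inv(cov) * v, using the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


def health_index(distance, scale=3.0):
    # Hypothetical mapping: 1 at the healthy centre, decaying with distance.
    return math.exp(-distance / scale)


healthy = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = mean_cov_2d(healthy)
d = mahalanobis_2d((3.0, 1.0), mean, cov)
```

The selection of `healthy` and of the distance-to-health mapping are exactly the open choices mentioned above.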
(4) Grey correlation degree method
The grey correlation analysis method is a kind of grey system analysis method. It states that if the change trends of two factors are consistent, the degree of correlation between them is high. Therefore, it measures the degree of correlation between factors according to the similarity or difference between their development trends. Compared with the distance method, it is more suitable for situations in which the monitoring data change greatly due to the influence of the environment. Bai et al. [35] evaluated the blade health of wind turbines according to the definition and calculation of health degree by the grey relational model.
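A minimal sketch of a grey relational degree between a reference (healthy) sequence and one monitored sequence is given below, using the commonly used grey relational coefficient with distinguishing coefficient `rho = 0.5`. It assumes a single comparison sequence, so the minimum and maximum differences are taken over that sequence only, and it assumes the data are already normalized. The sequences are hypothetical.

```python
def grey_relational_degree(reference, comparison, rho=0.5):
    """Mean grey relational coefficient between two equal-length sequences."""
    deltas = [abs(r - c) for r, c in zip(reference, comparison)]
    d_min, d_max = min(deltas), max(deltas)
    if d_max == 0:
        return 1.0  # identical sequences are fully correlated
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in deltas]
    return sum(coeffs) / len(coeffs)


reference = [1.0, 1.0, 1.0]   # hypothetical healthy trend
monitored = [1.0, 2.0, 3.0]   # hypothetical drifting trend
g = grey_relational_degree(reference, monitored)
```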

Evaluation Analysis Methods
(1) Fuzzy theory
Fuzzy theory was proposed by the American cybernetics expert Professor Zadeh in 1965. In cases where traditional exact evaluation methods are not applicable, the fuzzy evaluation method can be used. Cao et al. [36] applied the fuzzy data fusion method to the calculation of sensor health degree: they calculated the membership degree of the sensor at each time point using the membership function, calculated the fused membership degree of the sensor over multiple time points using a secondary index evaluation fusion strategy, and finally obtained the health degree of the sensor from the mapping between membership degree and health degree. Calculation with fuzzy theory alone cannot adequately reflect the impact of different collected indicators on health, so most health state estimation methods using fuzzy theory also consider weights. Qian et al. [37] proposed a variable weight fuzzy health evaluation method, introduced a variable weight formula based on an equilibrium function, and carried out fuzzy comprehensive evaluation by combining the variable weight matrix with the fuzzy relationship matrix, so as to obtain the comprehensive health.
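The two ingredients mentioned above, a membership function and a weighted fuzzy comprehensive evaluation, can be sketched as follows. The triangular membership shape, the weighted-average composition operator, the three health grades and all numbers are assumptions made for illustration and do not reproduce the cited methods.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_evaluate(weights, relation):
    """B_j = sum_i w_i * r_ij (weighted-average composition)."""
    n_grades = len(relation[0])
    return [sum(w * row[j] for w, row in zip(weights, relation))
            for j in range(n_grades)]


# Rows: two hypothetical indicators; columns: healthy, degraded, faulty.
relation = [[0.7, 0.3, 0.0],
            [0.2, 0.5, 0.3]]
weights = [0.6, 0.4]
grades = fuzzy_evaluate(weights, relation)
```

With these numbers the "healthy" grade receives the largest membership, so a maximum-membership rule would report the equipment as healthy.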
(2) Evidence theory
Evidence theory, also known as Dempster-Shafer (D-S) evidence theory, was first proposed by Dempster in 1967. Yin et al. [38] used a multi-index fusion method to evaluate radar health state, which avoids the subjectivity and limitations of allocating the basic reliability functions of the evaluation indices in traditional methods.
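For reference, Dempster's rule of combination for two bodies of evidence can be sketched as below. Focal elements are represented as `frozenset` objects, and the two mass assignments over a healthy/faulty frame are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two basic probability assignments with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + pa * pb
        else:
            conflict += pa * pb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("evidence is in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


H = frozenset({"healthy"})
F = frozenset({"faulty"})
HF = H | F  # ignorance: either state

m1 = {H: 0.6, F: 0.1, HF: 0.3}  # hypothetical evidence from indicator 1
m2 = {H: 0.5, F: 0.2, HF: 0.3}  # hypothetical evidence from indicator 2
fused = dempster_combine(m1, m2)
```

The normalization by `1 - conflict` is the step that can produce counterintuitive results when the evidence is highly conflicting.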

Machine Learning Method
(1) Traditional machine learning method
Traditionally, the estimation of equipment health mostly requires checking the operation status of equipment manually, which increases the labor intensity and reduces the accuracy of evaluation. Modern industrial applications prefer to identify the health state of machines automatically, which can be achieved by using traditional machine learning models.
Among these models, the support vector data description (SVDD) algorithm is mainly used for health state estimation. SVDD is a one-class classifier, which is trained with health data to obtain the SVDD hypersphere. Then, the distance from the sampling data vector to the center of the sphere is calculated. By comparing this distance with the radius of the sphere, the health represented by the sampling data can be obtained. Zhong et al. [39] designed a turnout fault detection algorithm and a health evaluation algorithm based on SVDD, so as to carry out the health management of turnout equipment.
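The scoring step described above (distance to the sphere centre compared with the radius) can be sketched as follows. This is not an SVDD implementation: as a deliberately crude stand-in for the SVDD optimization, the centre is taken as the mean of the healthy samples and the radius as the largest training distance. The function names and data are hypothetical.

```python
import math


def fit_sphere(healthy):
    """Stand-in for SVDD training: mean centre and max-distance radius."""
    dim = len(healthy[0])
    centre = [sum(s[j] for s in healthy) / len(healthy) for j in range(dim)]
    radius = max(math.dist(s, centre) for s in healthy)
    if radius == 0:
        raise ValueError("healthy samples are all identical")
    return centre, radius


def relative_distance(sample, centre, radius):
    """Distance to the centre in units of the radius; <= 1 is inside."""
    return math.dist(sample, centre) / radius


healthy = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
centre, radius = fit_sphere(healthy)
inside = relative_distance((1.0, 1.0), centre, radius)
outside = relative_distance((3.0, 3.0), centre, radius)
```

A real SVDD would additionally use a kernel and slack variables so that the boundary follows the data more tightly than a mean-centred sphere.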
(2) Deep learning method
In recent years, with the rapid development of deep learning technology, its powerful data processing and modeling ability has attracted the attention of many scholars in China and abroad. Some scholars have introduced deep learning into equipment health state estimation and achieved promising results.
Wagshum et al. [40] developed a bearing health monitoring system with a similarity inspection index based on deep neural networks (DNN). As a classical deep learning method, convolutional neural networks (CNN) have made many outstanding achievements in speech recognition, image recognition and target tracking [41]. Liu et al. [42] proposed a construction path for a turnout health state estimation system based on the CNN architecture, which selects the power curve of a turnout switch machine as the carrier of turnout health state estimation. The test results showed that the CNN algorithm has good adaptability in turnout health state estimation. The residual network (ResNet) was developed on the basis of the CNN architecture and has higher generalization performance [43]. Peng et al. [44] used the ResNet architecture to evaluate the health state of bearings and achieved competitive results.
The recurrent neural network (RNN) is a framework for processing time series data. Due to its unique properties, it is also often used to estimate the health of equipment. Wu et al. [45] used normal state sample data to train a long short-term memory (LSTM) encoder-decoder network and construct a feature space. The Euclidean distance between the feature vector of the measured data and the feature space was used to measure the degradation degree of the health state, so as to achieve quantitative evaluation of the health state of systems or equipment.

Remaining Life Prediction Methods
Scholars have classified and summarized the remaining life prediction methods. Heng et al. [46] divided the remaining life prediction methods of mechanical equipment into three categories (traditional reliability methods, methods based on monitoring data, and methods integrating the two) and focused on the methods based on monitoring data. In practical engineering, there is a large amount of non-mechanical equipment, such as electronic equipment [47]. Pecht et al. [48], focusing on electronic equipment, divided the remaining life prediction methods into failure mechanism analysis methods, data-driven methods and fusion methods, as shown in Figure 2. Data-driven methods include machine learning methods and statistical data-driven methods.
Methods based on a physical model analyze the physical and chemical causes of equipment failure (such as component wear), establish the relationship between equipment failure and these causes through physics-of-failure analysis and physicochemical analysis, and obtain the life evolution law, so as to predict the life of equipment [49]. Data-driven methods generally use the obtained data to predict the remaining life by fitting the evolution law of equipment performance variables and extrapolating to the failure threshold [50]. Fusion methods combine failure mechanism analysis with data-driven models. Although they can make full use of the advantages of the two methods, the process is relatively complex, so such methods are rarely reported [51]. The characteristics of these three methods are shown in Table 4.
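The data-driven idea of fitting a degradation trend and extrapolating it to the failure threshold can be sketched with a straight-line fit. The linear trend, the voltage readings and the 10.0 V threshold are assumptions chosen for illustration (echoing the power supply example in the Introduction); real degradation paths are often nonlinear and noisy.

```python
def fit_line(times, values):
    """Ordinary least-squares fit: value = slope * time + intercept."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean


def rul_by_extrapolation(times, values, threshold):
    """Time from the last observation until the fitted line hits threshold.

    Returns None when the fitted trend is flat or moving away from the
    threshold, since no crossing time can then be extrapolated.
    """
    slope, intercept = fit_line(times, values)
    gap = threshold - (intercept + slope * times[-1])
    if slope == 0 or gap / slope < 0:
        return None
    return gap / slope


hours = [0.0, 100.0, 200.0, 300.0]
voltage = [12.0, 11.8, 11.6, 11.4]   # hypothetical output voltage, V
rul = rul_by_extrapolation(hours, voltage, threshold=10.0)
```

Here the fitted slope is about -0.002 V/h, so the line reaches 10.0 V roughly 700 h after the last reading.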

Table 3. Characteristics of each health state estimation method.
Category | Method | Advantages | Limitations
Model calculation method | Fusion calculation method; information entropy; distance method; grey correlation degree | Easy to understand, simple calculation, no need for a large number of samples [18] | The choice of parameter weights or distance algorithm is subjective and generalizes poorly, so these methods are rarely used alone [19]
Evaluation analysis method | Fuzzy theory | Can fuse subjective and objective data information to resolve decision conflicts [36] | The calculation is cumbersome, and the selection of membership function and weight is subjective [37]
Evaluation analysis method | Evidence theory | Can deal with uncertain information without a priori probability [38] | For highly conflicting evidence, the results may be inaccurate, and the amount of calculation is large [38]
Machine learning method | Machine learning | Strong interpretability and relatively small computing resources [39] | Relies on manual feature extraction, generalizes insufficiently and is prone to underfitting [39]
Machine learning method | Deep learning | No manual feature extraction is required; suitable for big data scenarios [40] | Needs a lot of labeled data, training takes a lot of resources, and interpretability is not as good as that of traditional methods [41]

Table 4. Characteristics of the three types of remaining life prediction methods.
Method | Characteristics
Methods based on a physical model | Applicable to equipment with a clear degradation mechanism; weak generalization ability [49]
Data-driven methods | Strong data processing ability [50]
Fusion methods | Make full use of the advantages of the two methods, but the processes are complex [51]

Methods Based on a Physical Model
The remaining life prediction method based on a physical model is usually suitable for systems or devices with a clear degradation mechanism whose mechanism model is easy to describe; for such equipment it can predict life accurately. Tanaka et al. [52] proposed a mechanism model that can describe fatigue cracking along slip bands. Mou et al. [53] further established a three-dimensional simulation model to describe fatigue cracking and predicted life on this basis. It is worth noting that such methods must be analyzed for each specific piece of equipment, which makes them difficult to generalize. In addition, due to the increasing complexity of equipment in the engineering field, it is difficult to obtain the mechanism model of equipment, which also limits the application of this kind of method.


Data-Driven Method
1. Machine learning method
Machine learning makes a computer simulate human learning behavior and continuously trains the model with new information to improve its generalization ability [54]. Due to its powerful data processing ability, machine learning is widely used in data mining, speech recognition, computer vision, fault diagnosis and life prediction. According to the depth of learning, machine learning methods can be divided into traditional machine learning and deep learning methods, as shown in Figure 3. Traditional machine learning algorithms rely largely on expert prior knowledge and signal processing technology, which makes it difficult to process and analyze massive monitoring data automatically. Deep learning developed from traditional machine learning algorithms. With its powerful feature extraction ability, it provides a solution for training on massive data and opens up a new direction for the field of machine learning [55]. The methods based on traditional machine learning mainly include methods based on neural networks and on the support vector machine (SVM). The characteristics of each method are given in Table 5.

a Neural network
As a mathematical processing method that simulates the structure and function of biological nervous systems, a neural network has the ability to learn and generalize automatically. It mainly comprises an input layer, a hidden layer and an output layer, and is often used to solve problems such as classification and regression [56]. After years of research and exploration, it has shown strong advantages in the field of remaining life prediction. The remaining life prediction method based on a neural network takes the original measurement data, or features extracted from them, as the input of the neural network, continuously adjusts the structure and parameters of the network through a training algorithm, and uses the optimized network to predict the residual life of the equipment online. The prediction process does not need any prior information; the prediction results are obtained entirely from the monitoring data [57]. At present, the methods based on a neural network mainly include methods based on the multilayer perceptron (MLP) neural network, the radial basis function (RBF) neural network and the extreme learning machine (ELM).

• Multilayer perceptron neural network
The multilayer perceptron (MLP) neural network is a kind of feedforward neural network with a hidden layer, in which the neuron models of the hidden layer and the output layer are consistent. MLP is mostly trained by the back propagation (BP) algorithm, although other training methods are also used, such as in [58]. Bezazi et al. [59] used an MLP artificial neural network to model composite structure monitoring data, and trained the network through maximum likelihood estimation and Bayesian inference. The results showed that the network has good generalization ability. On this basis, Pierce et al. [60] further analyzed the robustness of the network based on interval uncertainty technology. This kind of research provides another approach to MLP neural network training.
Because MLP has the ability to approximate any form of nonlinear function by adding hidden layers or hidden elements, it has attracted extensive attention in the field of remaining life prediction.

• Radial basis function neural network
The radial basis function (RBF) neural network is a neural network structure proposed in the 1980s. It is a three-layer feedforward network with a single hidden layer, and can approximate any continuous nonlinear function with arbitrary accuracy [61,62]. The biggest structural difference between the RBF neural network and the MLP neural network is that the argument of the activation function is the product of the bias (deviation) and the distance between the input vector and the weight vector, rather than the weighted sum of the input vector and the weight vector. Liu et al. [63] pointed out that the key to an RBF neural network model is the correct selection of the RBF centers. The number and location of RBF centers in the hidden layer directly affect the approximation ability of the network. Li et al. [64] constructed a life prediction model for accelerated life tests using a grey RBF neural network. The test showed that the prediction result is clearly better than that of a BP neural network. When Li et al. [65] used an RBF neural network for relay life prediction, the input of the network was not the original relay overtravel time, which is non-stationary, but the random term obtained through wavelet transform. The output of the RBF neural network was then passed through wavelet packet reconstruction to obtain the relay life prediction. Chen et al. [66] proposed a multivariate grey RBF hybrid model for remaining life prediction of industrial equipment, which integrates the advantages of the grey model and the RBF neural network, effectively ensures the prediction accuracy and has practical engineering application value.
The remaining life prediction method based on the RBF neural network contains only one hidden layer, and its fitting accuracy is high. It can overcome the problems of falling into local optima and slow convergence of the learning process, and can dynamically determine the network structure and the data centers of the hidden layer units.
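The structural difference between the two neuron types can be made concrete with a small sketch. The Gaussian basis function and the tanh activation are assumed choices (the text does not fix either), and the inputs, weights and biases are hypothetical.

```python
import math


def mlp_neuron(x, weights, bias):
    """MLP hidden unit: activation of the weighted sum plus bias."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return math.tanh(net)


def rbf_neuron(x, centre, bias):
    """RBF hidden unit: activation of (distance to centre) * bias."""
    net = math.dist(x, centre) * bias
    return math.exp(-net ** 2)  # Gaussian radial basis function


x = [1.0, 2.0]
mlp_out = mlp_neuron(x, weights=[0.5, -0.25], bias=0.0)
rbf_at_centre = rbf_neuron(x, centre=[1.0, 2.0], bias=1.0)
rbf_unit_away = rbf_neuron(x, centre=[1.0, 3.0], bias=1.0)
```

The RBF unit responds most strongly when the input coincides with its centre and decays with distance, which is why the number and location of the centres matter so much.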

• Neural network based on extreme learning machine
The extreme learning machine (ELM), a learning algorithm for single-hidden-layer feedforward neural networks, was proposed by Huang [67]. The basic idea of the ELM training process is to randomly select the input weights and hidden layer biases, manually select the number of hidden layer neurons according to engineering experience, and determine the output weights by the least squares method, so as to determine the network structure and parameters rapidly.
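The training procedure above can be sketched in a few functions for a scalar input. Several details are assumptions made here rather than part of the cited algorithm description: the sigmoid activation, the uniform range of the random weights, a constant column that acts as an output bias, a small ridge term added to the normal equations for numerical stability, and the toy degradation data.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hidden(x, w, b):
    # Constant 1.0 (output bias) followed by sigmoid hidden activations.
    return [1.0] + [1.0 / (1.0 + math.exp(-(wi * x + bi)))
                    for wi, bi in zip(w, b)]


def train_elm(xs, ys, n_hidden=8, ridge=1e-3, seed=0):
    rng = random.Random(seed)
    # Step 1: random input weights and hidden biases, never retrained.
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    # Step 2: hidden-layer output matrix.
    h = [hidden(x, w, b) for x in xs]
    k = n_hidden + 1
    # Step 3: output weights from (H^T H + ridge*I) beta = H^T y.
    a = [[sum(r[i] * r[j] for r in h) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(h, ys)) for i in range(k)]
    return w, b, solve(a, rhs)


def predict(x, model):
    w, b, beta = model
    return sum(bi * hi for bi, hi in zip(beta, hidden(x, w, b)))


# Hypothetical degradation indicator sampled at 21 time points.
xs = [i / 10 for i in range(21)]
ys = [1.0 - 0.25 * x * x for x in xs]
model = train_elm(xs, ys)
mse_fit = sum((predict(x, model) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
y_mean = sum(ys) / len(ys)
mse_mean = sum((y_mean - y) ** 2 for y in ys) / len(ys)
```

Only the linear system for the output weights is solved, so there is no iterative training loop; changing `seed` changes the random hidden layer and therefore the fit.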
Li et al. [68] studied remaining life prediction of fan mechanical transmission components based on ELM, and introduced in detail the principle, parameter selection and optimization process of the ELM algorithm, so as to predict the trend and value of relevant performance parameters, evaluate those parameters and predict the remaining life. Liu et al. [69] extracted features that better reflect the bearing degradation process through joint approximate diagonalization of a two-layer feature matrix, and fed the extracted features into an ELM model to accurately predict the remaining life of the bearing. On this basis, Liu et al. [70] improved the feature extraction method, again used ELM to train on the extracted features, and applied the method to the study of the remaining life of bearings. Du et al. [71] proposed a multi-class probabilistic ELM model based on sigmoid posterior probability mapping and the Lagrange pairwise coupling method, which solved the problem of remaining life prediction of UAV transmitters. Yang et al. [72] proposed a remaining life prediction method based on ELM, compared ELM with the BP artificial neural network, and found that the internal parameters of ELM do not need iterative calculation. The test showed that the ELM-based model is slightly inferior to the model based on the BP artificial neural network in prediction accuracy and stability, but significantly reduces the training time.
The remaining life prediction method based on ELM has the following advantages: It can make a rapid remaining life prediction and effectively reduce the model training time. The activation function can be discontinuous. It avoids the sensitivity to learning parameter selection and the tendency to fall into local extrema that affect gradient descent learning algorithms.
Although the ELM-based method has many advantages, it also has some shortcomings. Since the input weights and hidden layer biases are generated randomly, the training effect of ELM cannot be guaranteed and may vary from run to run. At the same time, the number of hidden layer nodes needs to be selected by experience and experiment, which makes it difficult to ensure an optimal model. In addition, because the output weights are calculated by the least squares method, the ELM-based method tends to amplify the influence of outliers and noise.

b. Method based on support vector machine
Support vector machine (SVM) was developed based on VC dimension theory and the structural risk minimization principle. It was first proposed by Cortes and Vapnik in 1995. It is mainly used to solve classification and regression problems in ML and is suitable for analyzing small samples and multidimensional data [73,74].
The main idea of the remaining life prediction method based on SVM is to train the support vector machine model with the condition monitoring data obtained in practice, determine the model parameters (insensitivity coefficient, penalty factor, kernel function parameters, etc.), predict the future state of the system with the trained SVM model, and obtain the remaining life of the equipment by comparing the predicted state with the preset failure threshold.
Due to the multidimensional, nonlinear and uncertain characteristics of condition monitoring data in practical engineering, it is usually difficult to ensure the accuracy of SVM model parameters by simply training an SVM on condition monitoring data. SVM model parameters directly affect the remaining life results of equipment. Therefore, scholars began to pay attention to how to combine SVM with other methods to predict the remaining life of equipment.
In order to eliminate the interference information in the data, Miao et al. [75] combined wavelet analysis with SVM to predict the remaining life of a gyroscope. Nieto et al. [76] proposed an algorithm based on hybrid particle swarm optimization and SVM for spacecraft engine remaining life prediction, which solved the hyperparameter optimization problem in the training process of the support vector machine and further improved the accuracy of prediction. Aiming at the problem that SVM cannot effectively deal with non-stationary sequences with monotonic trends, Maior et al. [77] proposed a method combining empirical mode decomposition and SVM for degradation data analysis and remaining life prediction, and applied the proposed method to the analysis of a motor. The results showed that this method improves the prediction performance compared with SVM alone.
The remaining life prediction method based on SVM is more suitable for analyzing small samples and multidimensional data. However, it also has several defects. For example, as the sample set grows, overfitting and calculation time increase. It is difficult to obtain probabilistic predictions, that is, the uncertainty of the remaining life prediction cannot be evaluated. In addition, the kernel function must satisfy the Mercer condition.

| Traditional Machine Learning Remaining Life Prediction Method | Advantages | Disadvantages |
|---|---|---|
| Neural network: MLP | Has the ability to approximate any form of nonlinear function by adding hidden layers or hidden elements [60]. | The effect is good without obvious disadvantages [60]. |
| Neural network: RBF | The network contains only one hidden layer, and the fitting accuracy is high; it can overcome falling into local optima and allows the network structure and the data centers of the hidden layer units to be determined dynamically [61,62]. | The effect is good without obvious disadvantages [66]. |
| Neural network: ELM | Short training time; the activation function can be discontinuous; it avoids the sensitivity to learning parameter selection and the tendency to fall into local extrema [67]. | Since the input weights and hidden layer biases are generated randomly, the training effect of ELM cannot be guaranteed and may vary from run to run. The number of hidden layer nodes needs to be selected by experience and experiment, which makes it difficult to ensure an optimal model [72]. |
| SVM | The remaining life prediction method based on SVM is more suitable for analyzing small samples and multidimensional data [75]. | As the sample set grows, overfitting and calculation time increase. It is difficult to obtain probabilistic predictions, that is, the uncertainty of the remaining life prediction cannot be evaluated; the kernel function must satisfy the Mercer condition [77]. |

(2) Deep learning
The research on equipment remaining life prediction methods based on deep learning mainly includes methods based on deep neural network (DNN), methods based on deep belief network (DBN), methods based on convolutional neural network (CNN) and methods based on recurrent neural network (RNN). The characteristics of each method are given in Table 6.

• Deep neural network
Deep neural network (DNN) is usually a multilayer neural network formed by stacking multiple feature representation models. The common feature representation models include the autoencoder (AE) and the denoising autoencoder (DAE). The main idea of the method based on DNN is to extract the high-level features of the original data through a network of multiple stacked AEs or DAEs, and then predict the remaining life with a regression fitting method or a feedforward neural network. Zhou et al. [78] proposed an early diagnosis method for micro, slowly varying faults based on DNN. The high-dimensional fault features extracted by deep learning are transformed into one-dimensional fault features through the PCA method, and then the life prediction model is constructed by using a nonlinear fitting method. Yan et al. [79] studied a remaining life prediction method combining deep DAE and regression analysis for the big data analysis of industrial systems. Two deep DAEs were used to process the far-end signal and the near-end signal, respectively, to obtain the overall trend and the current change process. The outputs of the two deep DAEs were fused to predict the remaining life of the equipment through linear regression.
The DNN prediction method has the following characteristics: Useful features can be extracted through repeated dimensionality reduction of the input data, which facilitates model training. Because the DAE reduces and filters noise, a network formed by stacking multiple DAEs can process monitoring data containing noise, which reflects the strong robustness and universality of this method.

• Deep belief network
As a typical deep learning method, DBN is a deep network composed of multiple stacked restricted Boltzmann machines (RBM) and a classification layer or regression layer. It can not only realize the feature representation and extraction of observation data from low level to high level, but also discover the distributed features of input data [80]. Deutsch et al. [81] successfully applied DBN to predict the remaining life of bearings, but the prediction accuracy of the proposed method was much lower than that of the particle filter method. Deutsch et al. [82] also proposed a remaining life prediction method for rotating equipment integrating DBN and a feedforward neural network (FNN), which improves and extends the DBN method and effectively combines the feature extraction ability of DBN with the prediction performance of FNN.
In order to obtain the probability distribution of remaining life, DBN and particle filter have been effectively combined to further improve the prediction accuracy [83]. On this basis, Zhao et al. [84] combined the advantages of DBN and the relevance vector machine (RVM) to study a new method for predicting the remaining life of Li batteries. Because DBN has strong feature extraction ability, it effectively solves the uncertainty problem caused by manual feature extraction and selection, and realizes the goal of intelligent feature extraction. At the same time, the time-domain signal under this method does not need to be periodic, so it has a broad application space in the field of remaining life prediction.
However, DBN still has several limitations: The short-term prediction performance is good, while the long-term prediction performance is poor. It cannot reflect the uncertainty of the prediction results by itself and generally needs to be combined with other methods to do so.

• Convolutional neural network
As a classical feedforward neural network, CNN was first proposed by LeCun to solve image processing problems. It is mainly composed of several convolution layers and pooling layers. Its purpose is to extract the topological features hidden in the monitoring data step by step by constructing multiple filters, and the extracted features become more and more abstract as the network deepens [85]. In a CNN, the convolution layer convolves the original input data with multiple local filters, and the subsequent pooling layer extracts the most important features with a fixed length. The most commonly used pooling function is max pooling [86].
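The two building blocks named above, a local filter slid over the input and a max pooling step, can be sketched in a few lines of plain Python. The one-dimensional signal, the two-tap filter and the function names `conv1d` and `max_pool1d` are illustrative assumptions; as in most deep learning libraries, the "convolution" here is computed without flipping the kernel.

```python
def conv1d(signal, kernel):
    """Slide a local filter over the signal (valid positions only)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def max_pool1d(features, size):
    """Keep the largest value in each non-overlapping window."""
    return [
        max(features[i:i + size])
        for i in range(0, len(features) - size + 1, size)
    ]

signal = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
edge_filter = [-1.0, 1.0]  # responds to increases between neighbouring samples

feature_map = conv1d(signal, edge_filter)  # [1.0, 2.0, -1.0, 3.0, -1.0, 2.0]
pooled = max_pool1d(feature_map, size=2)   # [2.0, 3.0, 2.0]
print(feature_map, pooled)
```

The same two filter weights are reused at every position, which is the weight sharing that keeps the parameter count of a CNN small.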
The research on remaining life prediction based on CNN began in 2016. Babu et al. [87] applied deep CNN to the field of remaining life prediction, used two convolution layers and two pooling layers to extract the characteristics of the original signal, and combined them with an MLP to predict the remaining life. Li et al. [88] proposed a multivariable equipment remaining life estimation method based on deep CNN. In order to better extract features, the time window method is used to obtain samples. At the same time, because some effective information would be filtered out by the pooling operation, the pooling layer is omitted when building the network. Ren et al. [89] studied bearing remaining life prediction based on CNN, combined a series of extracted features, namely the spectrum main energy vectors, into a feature map, and extracted a one-dimensional vector that is helpful for predicting the remaining life through the CNN structure. The one-dimensional vector is then input into a deep neural network to predict the remaining life. The bearing test showed that the proposed method is better than the traditional ML method.
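The time window method mentioned above can be sketched as follows. This is only an illustration of the general idea, not the preprocessing of Li et al. [88]: the window length, the toy two-sensor series, the choice of labelling each window with the remaining life at its last cycle, and the name `make_windows` are assumptions.

```python
def make_windows(series, rul, window):
    """Cut a multivariate series into overlapping fixed-length samples.

    series: per-cycle sensor readings, e.g. [[s1, s2], ...]
    rul:    remaining life label for each cycle
    Each sample is labelled with the remaining life at its last cycle
    (an assumed labelling convention).
    """
    samples, labels = [], []
    for end in range(window, len(series) + 1):
        samples.append(series[end - window:end])
        labels.append(rul[end - 1])
    return samples, labels

series = [[0.1, 1.0], [0.2, 1.1], [0.4, 1.3], [0.7, 1.6], [1.1, 2.0]]
rul = [4, 3, 2, 1, 0]

samples, labels = make_windows(series, rul, window=3)
print(len(samples), labels)  # 3 windows with labels [2, 1, 0]
```

Each sample is a small window-by-sensor matrix, which is the two-dimensional input that the convolution layers then scan.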
The research on remaining life prediction based on CNN has the following characteristics: It is suitable for engineering equipment for which massive monitoring data are available. It can realize automatic feature extraction and recognition without manual participation and intervention. Weight sharing reduces the number of parameters of the CNN model and makes the optimization process more convenient. However, remaining life prediction based on CNN is still in the preliminary exploration stage, the research results have not been systematized, and the uncertainty of the remaining life cannot be given quantitatively. Therefore, the methods based on CNN still require in-depth research.

• Recurrent neural network
RNN is a kind of neural network that includes both feedforward connections and internal feedback connections, and is mainly used to process monitoring vector sequences with interdependent characteristics. Due to its special network structure, it can retain the state information of the previous moment in the hidden layer, so it has strong advantages in the field of complex dynamic system modeling [90].
The basic idea of remaining life prediction methods based on RNN is to take the monitoring data as the input of the RNN, and train the model parameters through back propagation through time (BPTT), so as to realize the remaining life prediction of equipment. It should be noted that the internal feedback connection of an RNN captures the dependence between earlier and later monitoring data. Liu et al. [91] used an adaptive RNN to predict the remaining life of Li batteries, and optimized the weights of the network online through a recurrent Levenberg-Marquardt method. The remaining life prediction method based on RNN can integrate the original learning samples with the new learning mode to realize the retraining of samples. It can not only improve the accuracy of remaining life prediction, but also has the characteristics of fast convergence and high stability. However, the traditional RNN usually has the problem of "memory decay", because there is no structure to control the memory flow in the traditional recurrent layer. When dealing with degradation data with long-term dependence, the traditional RNN methods face the problem of vanishing or exploding gradients, and the prediction accuracy is seriously affected. On the other hand, RNN cannot effectively analyze and process multidimensional data, and usually needs to be combined with other methods for these purposes.
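The vanishing gradient problem can be made concrete with a one-neuron recurrent layer. The sketch below is a toy under stated assumptions (scalar state, `tanh` activation, constant input, hand-picked weights, hypothetical name `run_scalar_rnn`); it only tracks how strongly the final state depends on the initial state, which is the quantity BPTT has to propagate.

```python
import math

def run_scalar_rnn(inputs, w_in, w_rec, h0=0.0):
    """Forward pass of h_t = tanh(w_in * x_t + w_rec * h_{t-1}).

    Also returns d h_T / d h_0, accumulated with the chain rule.
    """
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        # d h_t / d h_{t-1} = w_rec * (1 - h_t ** 2)
        grad *= w_rec * (1.0 - h * h)
    return h, grad

inputs = [0.1] * 50
_, grad_short = run_scalar_rnn(inputs[:5], w_in=1.0, w_rec=0.5)
_, grad_long = run_scalar_rnn(inputs, w_in=1.0, w_rec=0.5)

print(f"sensitivity after 5 steps:  {grad_short:.3e}")
print(f"sensitivity after 50 steps: {grad_long:.3e}")
```

Every step multiplies the sensitivity by a factor of magnitude at most 0.5 here, so after 50 steps almost nothing of the early information reaches the gradient; gated structures control this memory flow explicitly.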
Scholars have also proposed other improved deep learning methods to predict the RUL, which show better performance than the currently popular models. For instance, Zhang et al. [91] proposed a dual-task network structure based on the bidirectional gated recurrent unit (BiGRU) and multi-gate mixture-of-experts (MMoE), which simultaneously evaluates the health state and predicts the RUL of industrial equipment.

| Method | Advantages | Disadvantages |
|---|---|---|
| DNN | Model training is convenient; it can process monitoring data containing noise, which shows that the method has strong robustness and universality [78]. | The effect is good without obvious disadvantages [79]. |
| DBN | | The short-term prediction performance is good, while the long-term prediction performance is poor; it cannot reflect the uncertainty of the prediction results [84]. |
| CNN | It is applicable to engineering equipment for which massive monitoring data are available; it can realize automatic feature extraction and recognition without manual participation and intervention; weight sharing reduces the number of parameters of a CNN model and makes the optimization process more convenient [85]. | It is still in the preliminary exploration stage, and the research results have not been systematized; the uncertainty of remaining life cannot be given quantitatively [89]. |
| RNN | It can integrate the original learning samples with the new learning mode to realize the retraining of samples, improves the prediction accuracy, and has the characteristics of fast convergence and high stability [90]. | Traditional RNN usually has the problem of "memory decay". When dealing with degradation data with long-term dependence, it faces the problem of vanishing or exploding gradients, and the prediction accuracy is affected; RNN cannot effectively analyze and process multidimensional data [92]. |

Statistical data-driven approach
The statistical data-driven method is based on the theory of probability and statistics. It uses the historical degradation trajectories of similar systems or products to establish the relationship between the data and a degradation model, and estimates the parameters of the degradation model, so as to obtain the analytical probability distribution of the remaining life of the object or system and realize the prediction of the remaining life [93].
Statistical data-driven methods assume that the degradation model is known in advance, and directly use the condition monitoring data or environmental data to estimate the model parameters offline or online. However, the degradation model in practical engineering is unknown, and the degradation models of different types of equipment are different. The improper selection of a degradation model will seriously affect the prediction accuracy of remaining life. Typical methods include the Wiener process, gamma process, inverse Gaussian process, Markov model and so on (see Table 7).

• Wiener process
The method based on the Wiener process is mainly applicable when the performance degradation process of the equipment is non-monotonic. This method mainly uses the following mathematical model to describe the degradation process:

$$X(t) = x_0 + \int_0^t \lambda(s)\,\mathrm{d}s + \sigma B(t)$$

where $X(t)$ is the performance degradation value at time $t$; $x_0$ is the initial performance degradation value; $\lambda(s)$ is the drift parameter; $\sigma$ is the diffusion coefficient; $B(t)$ is the standard Brownian motion. After obtaining the equipment performance degradation process model, the remaining life distribution of the equipment can be calculated by using the relevant theory of the Wiener process, given the failure threshold. In order to realize accurate real-time prediction of the remaining life of the equipment, the real-time monitoring information of the equipment is usually used to dynamically update the remaining life prediction results. Gebraeel et al. [94] first established the degradation model of equipment based on the Wiener process with linear (or linearized) drift, and assumed that the drift coefficient obeyed the normal distribution. According to the degradation data observed in real time, the random drift coefficient was updated online by Bayesian inference. The Gebraeel method has had a great impact in the field of equipment life prediction and health management. However, the remaining life prediction results obtained by the Gebraeel method are only applicable to equipment with linear degradation or equipment whose performance degradation data can be directly linearized. Moreover, the Brownian motion term in the degradation model used in this method is only treated as the observation error, so that the remaining life distribution obtained is not the exact solution in the sense of first passage time.
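As an illustration of online updating, consider the linear special case $X(t) = x_0 + \lambda t + \sigma B(t)$ with a normal prior on the drift $\lambda$. The sketch below is not the Gebraeel formulation: it assumes that $\sigma$ is known and that the Brownian term produces genuine independent increments, in which case the posterior of $\lambda$ is again normal. The remaining life is then a plug-in point estimate, the mean first passage time $(w - x_k)/\lambda$ for threshold $w$, which ignores the remaining uncertainty in $\lambda$. The function names and all numbers are hypothetical.

```python
def update_drift(times, values, sigma, prior_mean, prior_var):
    """Normal-normal Bayesian update of the drift of a linear Wiener process.

    With independent Brownian increments, the elapsed time and the total
    change in the degradation value are sufficient for the drift.
    """
    elapsed = times[-1] - times[0]
    travelled = values[-1] - values[0]
    post_precision = 1.0 / prior_var + elapsed / sigma ** 2
    post_mean = (prior_mean / prior_var + travelled / sigma ** 2) / post_precision
    return post_mean, 1.0 / post_precision

def mean_rul(current_value, threshold, drift):
    """Mean first passage time to the threshold for a known positive drift."""
    if drift <= 0:
        raise ValueError("the mean first passage time is finite only for positive drift")
    return (threshold - current_value) / drift

times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [0.0, 0.9, 2.2, 2.8, 4.0]  # toy monitoring data

post_mean, post_var = update_drift(times, values, sigma=0.5,
                                   prior_mean=0.5, prior_var=1.0)
rul = mean_rul(values[-1], threshold=10.0, drift=post_mean)
print(f"posterior drift: {post_mean:.4f} (variance {post_var:.4f}), plug-in RUL: {rul:.3f}")
```

Each new observation only changes `elapsed` and `travelled`, so the update can be repeated cheaply as monitoring data arrive.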

• Gamma process
The gamma process is often used to model the degradation trajectory of monotonic data, such as metal wear and crack growth. Abdel-Hameed [95] first proposed it in 1975 and used the gamma process to model continuous monotonic degradation data. Bagdonavicius considered the influence of dynamic environments in the degradation model and proposed a remaining life prediction method based on the gamma process considering dynamic environments [96]. Lawless et al. [97] considered the case in which the parameters of the gamma process are random variables. In practical application, the duty cycle of the system may not be periodic, which leads to aperiodic degradation measurements. Both of these factors affect the accuracy of health assessment and RUL prediction. In order to meet these challenges, Zhao et al. [98] proposed a gamma state-space model of power equipment, which considers temporal uncertainty, measurement uncertainty and device-to-device heterogeneity. This method introduces a new idea for health assessment and RUL prediction.
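The monotonic behaviour of a gamma process is easy to see in simulation. The sketch below assumes a stationary gamma process whose increment over a step `dt` follows a gamma distribution with shape `shape_rate * dt` and a fixed scale; the parameter values, the failure threshold and the function names are hypothetical.

```python
import random

def simulate_gamma_path(shape_rate, scale, dt, n_steps, rng):
    """Degradation path whose increments over dt are Gamma(shape_rate * dt, scale)."""
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gammavariate(shape_rate * dt, scale))
    return path

def first_crossing_time(path, dt, threshold):
    """Time of the first sample at or above the threshold, or None if never reached."""
    for step, level in enumerate(path):
        if level >= threshold:
            return step * dt
    return None

rng = random.Random(42)
path = simulate_gamma_path(shape_rate=2.0, scale=0.5, dt=0.1, n_steps=500, rng=rng)
failure_time = first_crossing_time(path, dt=0.1, threshold=10.0)
print(f"final degradation: {path[-1]:.2f}, first crossing of threshold at t = {failure_time}")
```

Because every increment is positive, the path can never decrease, which is why this model suits wear and crack growth but not fluctuating indicators.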

• Inverse Gaussian process
The basic idea of the inverse Gaussian process is to assume that the degradation is strictly monotonic, and that the increment of degradation obeys an inverse Gaussian distribution. The degradation process is described by the change in increment. The inverse Gaussian process was first proposed by Wasan et al. [99] in 1968, but it was not applied to the degradation modeling of equipment until Wang et al. [100] did so in 2010. The inverse Gaussian process is used to describe monotonic degradation processes because of the connection between the inverse Gaussian distribution and the linear drift Wiener process: the first passage time of a Wiener process with linear drift to a fixed threshold follows an inverse Gaussian distribution. Compared with the gamma process, the inverse Gaussian process is easier to derive and implement mathematically, and is more flexible and more widely applicable.

• Markov model
The Markov chain method is often used in the degradation modeling of processes with continuous-time, discrete-state characteristics. This method is based on two assumptions: first, the future degradation state is only determined by the current degradation state, that is, the process is memoryless; second, the system monitoring data can reflect its working state. The remaining life prediction method based on the Markov chain defines the first arrival time as the time when the degradation process first reaches the failure state, and calculates the remaining life according to the first arrival time. Kharoufeh et al. [101] carried out a series of studies on this method and proposed a degradation model based on the Markov chain considering environmental impact. Lee et al. [102] incorporated the Markov property of the degradation process into remaining life prediction based on a regression model.
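For the simplest case, the expected first arrival time can be computed by hand. The sketch below assumes a continuous-time chain in which each degradation state can only move to the next, worse state, with the failure state absorbing; the exit rates and the name `expected_time_to_failure` are hypothetical. Under that assumption the time spent in state $i$ is exponential with mean $1/q_i$, and the expected remaining life is the sum of the mean sojourn times still ahead.

```python
def expected_time_to_failure(exit_rates):
    """Expected first arrival time at the failure state from each state.

    exit_rates[i] is the rate of leaving degradation state i for state i + 1;
    the state after the last one is the absorbing failure state.
    """
    expected = [0.0] * len(exit_rates)
    remaining = 0.0
    # Walk backwards: the mean sojourn time in state i is 1 / exit_rates[i].
    for i in reversed(range(len(exit_rates))):
        remaining += 1.0 / exit_rates[i]
        expected[i] = remaining
    return expected

rates = [0.5, 0.25, 1.0]  # transitions per unit time, states 0 -> 1 -> 2 -> failure
rul_by_state = expected_time_to_failure(rates)
print(rul_by_state)  # [7.0, 5.0, 1.0]
```

Chains that allow jumps between arbitrary states need a linear system instead of a running sum, but the definition of remaining life as time to first arrival is the same.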

| Statistical data-driven method | Characteristics |
|---|---|
| Wiener process | Applicable to non-monotonic equipment performance degradation processes [94]. |
| Gamma process | Commonly used for modeling the degradation trajectory of monotonic data [98]. |
| Inverse Gaussian | Assumes that the degradation is strictly monotonic and that the increment of degradation obeys an inverse Gaussian distribution; the degradation process is described by the change in increment [100]. |
| Markov | Degradation modeling for processes with continuous-time, discrete-state characteristics [102]. |

Regression Index
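The formula of mean absolute error (MAE) is [103]:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$

where $N$ is the number of samples, $y_i$ is the real value of the ith sample, $\hat{y}_i$ is the predicted value of the ith sample, and $\bar{y}$ is the average value of the sample data. The mean square error (MSE) formula is [104]:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$

The formula of mean absolute percentage error (MAPE) is [105]:

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

MAPE expresses the prediction effect by calculating the absolute error percentage. The smaller the value, the better. If MAPE equals 10, this indicates that the predictions deviate from the true values by 10% on average.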
Because MAPE is dimensionless, results from different problems can be compared in certain scenarios. However, the disadvantages of MAPE are also obvious: it is undefined at $y_i = 0$. In addition, it should be noted that MAPE penalizes negative errors ($y_i - \hat{y}_i < 0$) more heavily than positive errors, because for non-negative predictions the percentage error of an under-prediction cannot exceed 100%, whereas that of an over-prediction is unbounded.
The formula of root mean square error (RMSE) is [106]:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$

RMSE represents the sample standard deviation of the difference between the predicted value and the real value. Compared with MAE, RMSE penalizes samples with large errors more heavily. However, one disadvantage of RMSE is that it is sensitive to outliers, which can lead to very large RMSE results.
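Based on RMSE, there is also a commonly used variant evaluation index called root mean square logarithmic error (RMSLE), the formula of which is:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\log\left(\hat{y}_i + 1\right) - \log\left(y_i + 1\right)\right)^2}$$

RMSLE penalizes samples whose predicted value is smaller than the real value more heavily than samples whose predicted value is larger than the real value by the same amount.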
The coefficient of determination (R-square) [107] is defined as:

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}$$

R-square measures the proportion of the variation of the dependent variable that can be explained by the independent variables. The general value range is 0~1. The closer R-square is to 1, the greater the proportion of the regression sum of squares in the total sum of squares, the closer the regression line is to each observation point, the more of the variation in y is explained by the change in x, and the better the fit of the regression.
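The regression indices above can be computed together in a few lines. The sketch below uses only the Python standard library; the function name `regression_metrics` and the toy remaining life values are assumptions for illustration. MAPE is reported in percent, so a value of 10 corresponds to an average deviation of 10%.

```python
import math

def regression_metrics(y_true, y_pred):
    """MAE, MSE, MAPE (in percent), RMSE, RMSLE and R-square for paired lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE is undefined if any true value is zero (division by zero).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)) / n
    )
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    return {
        "MAE": mae,
        "MSE": mse,
        "MAPE": mape,
        "RMSE": math.sqrt(mse),
        "RMSLE": rmsle,
        "R2": 1.0 - ss_res / ss_tot,
    }

y_true = [100.0, 80.0, 60.0, 40.0]  # toy remaining life values
y_pred = [90.0, 84.0, 60.0, 50.0]

metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On this toy data MAE is 6 while RMSE is about 7.35, which shows the heavier weight that RMSE gives to the two errors of size 10.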

Classification Index
TP, TN, FP and FN are mainly used to count the results of binary classification problems [108]. Multi-class problems can also be counted category by category, with the samples divided into positive and negative samples for each category (see Table 8). The first letter in TP, TN, FP and FN indicates whether the recognition result of the classifier is correct: T stands for true and F stands for false. The second letter indicates the decision of the classifier: P indicates that the classifier judges the sample to be positive, and N indicates that the classifier judges the sample to be negative. TP: the classifier recognizes correctly, and the classifier considers the sample to be positive; TN: the classifier recognizes correctly, and the classifier considers the sample to be negative; FP: the recognition result of the classifier is wrong; the classifier considers the sample to be positive, whereas in fact it is negative; FN: the recognition result of the classifier is wrong; the classifier considers the sample to be negative, whereas in fact it is positive.
Accuracy refers to the proportion of samples that the model predicts correctly (including both correctly predicted positive samples and correctly predicted negative samples) in the total number of samples [109], i.e.,

$$\mathrm{Accuracy} = \frac{m_{\mathrm{correct}}}{m_{\mathrm{total}}} \quad (11)$$

where $m_{\mathrm{correct}}$ represents the number of samples correctly classified by the model, and $m_{\mathrm{total}}$ represents the number of all samples. Accuracy is one of the simplest and most intuitive evaluation indicators in classification problems, but it has some limitations. For example, in binary classification, when negative samples account for 99%, a model that predicts all samples as negative also obtains 99% accuracy [110]. Although the accuracy seems high, this model is actually useless because it cannot find a single positive sample.
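Precision refers to the proportion of samples that are predicted to be positive by the model and are actually positive among all samples predicted to be positive by the model [111], i.e.,

$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (12)$$

Recall refers to the proportion of samples that are predicted to be positive by the model and are actually positive among all samples that are actually positive [112], i.e.,

$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (13)$$

Generally speaking, precision and recall trade off against each other; that is to say, if precision is high, recall tends to be low, and if recall is high, precision tends to be low. Therefore, an index that considers both precision and recall is designed, namely the $F_1$ value, which is the harmonic mean of precision and recall [113], i.e.,

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (14)$$

In some scenarios, precision and recall are given different weights. In this case, the $F_\alpha$ value, which is the more general form of the $F_1$ value, can be used. The $F_\alpha$ value is defined as follows:

$$F_\alpha = \frac{\left(1 + \alpha^2\right) \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\alpha^2 \cdot \mathrm{Precision} + \mathrm{Recall}} \quad (15)$$

where $\alpha > 1$ gives more weight to recall and $\alpha < 1$ gives more weight to precision. For multi-class problems, macro-averaging first calculates each index for every category and then takes the arithmetic mean over the categories [114], for example

$$\mathrm{Precision}_{\mathrm{macro}} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{Precision}_i$$

where $i$ represents the ith category and $n$ is the number of categories. Micro-averaging builds a global statistical matrix over all instances, regardless of category, and then calculates the corresponding index from it [115].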
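The classification indices, including the macro and micro averages, can be sketched in plain Python. The three health grades and their counts of true positives, false positives and false negatives are hypothetical, as are the function names; returning 0 when a denominator is zero is a convention chosen for this sketch.

```python
def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0

def f_score(p, r, alpha=1.0):
    """F_alpha value; alpha = 1 gives the harmonic mean F1."""
    denom = alpha ** 2 * p + r
    return (1 + alpha ** 2) * p * r / denom if denom else 0.0

# Hypothetical per-category counts: (TP, FP, FN).
counts = {"healthy": (80, 20, 0), "degraded": (10, 0, 10), "failed": (5, 0, 10)}

per_class_p = [precision(tp, fp) for tp, fp, _ in counts.values()]
per_class_r = [recall(tp, fn) for tp, _, fn in counts.values()]

# Macro average: compute the index per category, then take the mean.
macro_p = sum(per_class_p) / len(counts)
macro_r = sum(per_class_r) / len(counts)

# Micro average: pool the counts of all categories, then compute the index once.
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())
micro_p = precision(total_tp, total_fp)
micro_r = recall(total_tp, total_fn)

print(f"macro precision {macro_p:.3f}, macro recall {macro_r:.3f}")
print(f"micro precision {micro_p:.3f}, micro recall {micro_r:.3f}")
print(f"F1 of 'healthy': {f_score(per_class_p[0], per_class_r[0]):.3f}")

# The imbalanced case: 99 negatives, 1 positive, everything predicted negative.
tp, fp, fn, tn = 0, 0, 1, 99
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(f"accuracy {accuracy:.2f} but recall {recall(tp, fn):.2f}")
```

The macro average treats every grade equally, so the rarely detected "failed" grade pulls macro recall down, while the micro average is dominated by the large "healthy" grade.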

Index Selection Principle
If the health state grade is used to evaluate the health degree of the equipment, the task can be regarded as a classification problem, and the above classification indicators can be used [116,117]. If a specific value is used to evaluate the health of the equipment, the task can be regarded as a regression problem, and the above regression indicators can be used. However, the regression problem can also be transformed into a classification problem, so that the classification indicators can be used.
If the regression scheme is adopted, the recommended evaluation index is mean square error or R-square. If the classification scheme is adopted, the recommended evaluation index is accuracy or the F1 value.

Application of Health State Estimation in Various Industries
Each industry has estimated the health degree of the following products, and the specific contents are shown in Table 9: The power industry mainly estimates the health of electronic products such as battery energy storage systems and electric energy meters, electrical products such as wind turbine drive chains and generator sets, and mechanical products such as power mechanical equipment. The transportation industry mainly evaluates the health of vehicle batteries and other electronic products. The Internet industry mainly estimates the health of electronic products such as server hosts. The petrochemical industry mainly estimates the health of electronic products such as coal mine underground systems and electrical products such as electric submersible pumps. The water industry mainly estimates the health of electrical products such as water supply equipment. The manufacturing industry mainly estimates the health of mechanical products such as robots and electrical products such as air conditioners. The medical industry mainly estimates the health of human muscles. It can be seen that health state estimation has been widely applied in industry. In addition, a variety of health state estimation methods are in use.

Remaining Life Prediction in Various Industries
Each industry has made RUL predictions on certain products (see Table 10), as follows: In the aviation field, RUL prediction is carried out for electronic products such as hierarchical control systems and mechanical products such as blades. In the power industry, RUL prediction is carried out for electronic products such as power batteries, converters and power modules. In the vehicle industry, RUL prediction is made for electrical products such as mechanical relays and mechanical products such as brake shoes. In the household appliance industry, RUL prediction is made for mechanical equipment such as rolling bearings. It can be seen that remaining life prediction has been applied to a certain extent in industry, but it is not as widely used as health state estimation. In addition, a variety of remaining life prediction methods are used in industry.

Applicable Objects of the Methods
The objects of health state estimation and remaining life prediction are different. Health state estimation is applicable to equipment with a high failure rate and low importance. Remaining life prediction is applicable to equipment with a low failure rate and high importance.
In the rail transit industry, the average annual failure frequency can be used as an indicator of failure rate, and the delay time can be used as an indicator of importance. Using these two indicators, the equipment of rail transit is divided into four categories. Equipment with high annual failure frequency and low delay time is suitable for health state estimation, and industrial computers are an example. Equipment with low annual failure frequency and high delay time is suitable for remaining life prediction, and the power supply is an example. Equipment with low annual failure frequency and low delay time is suitable for fault alarm. Subsystems and components with high annual failure frequency and high delay time are suitable for design improvement.
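The four categories above amount to a two-by-two decision rule, which can be sketched as a small function. The text gives no numerical boundaries, so the thresholds, the example equipment figures and the name `recommend_strategy` are hypothetical.

```python
def recommend_strategy(annual_failures, delay_minutes,
                       failure_threshold, delay_threshold):
    """Map failure frequency and delay time to the suggested approach."""
    high_failure = annual_failures >= failure_threshold
    high_delay = delay_minutes >= delay_threshold
    if high_failure and not high_delay:
        return "health state estimation"
    if not high_failure and high_delay:
        return "remaining life prediction"
    if not high_failure and not high_delay:
        return "fault alarm"
    return "design improvement"

# Hypothetical thresholds and equipment figures, for illustration only.
FAILURE_THRESHOLD = 5.0   # failures per year
DELAY_THRESHOLD = 30.0    # minutes of delay per failure

equipment = {
    "industrial computer": (12.0, 5.0),
    "power supply": (1.0, 120.0),
}
for name, (failures, delay) in equipment.items():
    strategy = recommend_strategy(failures, delay, FAILURE_THRESHOLD, DELAY_THRESHOLD)
    print(f"{name}: {strategy}")
```

In practice the two thresholds would have to be set from the operator's own failure and delay statistics.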

Methods Summary
Health state estimation is summarized as follows: (1) Various health state estimation methods have been applied in industry. These methods have been applied to a certain extent, and the application objects include electronic, electrical and mechanical equipment. Therefore, it is feasible to estimate the health degree of equipment in various industries.
(2) The health state estimation methods based on non-machine learning and machine learning are suitable for different objects. Non-machine learning methods have the advantages of strong interpretability, ease of understanding and a low demand for equipment data, but they have the disadvantages of poor generalization and subjective weight determination. Machine learning methods have the advantage of strong generalization ability, but the disadvantage of weak interpretability. Based on the above two points, when estimating the health of equipment, non-machine learning methods are recommended if the number of output parameters is small, the relationship between faults and output parameters can be listed, and the method needs some explanation. Machine learning methods are recommended if the amount of equipment data is sufficient and the method needs strong generalization ability, or if the equipment output parameters are complex and the relationship between faults and output parameters cannot be obtained.
(3) The accuracy and reliability of health state estimation are affected by many factors. In terms of accuracy, before either of the two kinds of health state estimation methods is implemented, experts need to establish the correspondence between parameters and scores, that is, expert scoring, also known as expert labeling. The quality of expert scoring basically determines the upper limit of the accuracy and reliability of machine learning methods. Moreover, if the equipment data cannot cover most of the degradation situations encountered in engineering applications, the accuracy of health state estimation will also be affected. In addition, the measured estimation effect differs depending on which assessment indicator (false reports or missed reports) is used.
Remaining life prediction is summarized as follows: (1) Various remaining life prediction methods are applied in industry. The application objects mainly involve electrical and mechanical equipment with obvious degradation laws. Therefore, it is feasible to predict the remaining life of equipment in various industries.
(2) The remaining life prediction methods based on a physical model and on data are applicable to different objects. The physical model method is suitable for equipment with a single degradation characterization parameter and a clear degradation mechanism, but its generalization ability is poor. Among the data-driven methods, the machine learning methods have the advantages of strong data-driven ability and good generalization. They are suitable for equipment with multiple degradation characterization parameters, and are especially suitable for equipment with an unclear relationship between the degradation characterization parameters and failure. The statistical data-driven methods have strong interpretability and are suitable for equipment with only a single degradation characterization parameter and significant degradation characteristics. Based on the above three points, the physical model method is not suitable for the remaining life prediction of rail transit equipment. When predicting the remaining life of equipment, the statistical data-driven methods are recommended if the equipment has only one degradation characterization parameter and the method needs some explanation. The machine learning methods are recommended if there are many equipment output parameters and the relationship between these parameters and failure is not clear.
(3) The accuracy of remaining life prediction is greatly affected by equipment degradation data and by the actual working conditions of the equipment. Remaining life prediction presupposes sufficient equipment degradation data or equipment life data. If the data cover only a single test condition and few types of working conditions, the data are poorly representative, and the remaining life prediction methods trained on them have poor accuracy and applicability. In addition, if the actual working conditions of the equipment change greatly, general life prediction methods cannot meet the accuracy requirements and need to be improved.
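The selection guidance in item (2) above can be written down as a small decision rule. The sketch below is a literal reading of the two recommendations only; the function name `recommend_rul_method` and its parameters are hypothetical, and combinations that the summary does not address are deliberately returned as undetermined rather than guessed.

```python
def recommend_rul_method(n_degradation_params, needs_explanation, relationship_is_clear):
    """Return the RUL method family recommended in item (2), if any.

    n_degradation_params: number of degradation characterization parameters.
    needs_explanation: whether the method must be interpretable.
    relationship_is_clear: whether the link between parameters and failure is known.
    """
    if n_degradation_params < 1:
        raise ValueError("at least one degradation parameter is required")
    if n_degradation_params == 1 and needs_explanation:
        return "statistical data-driven"
    if n_degradation_params > 1 and not relationship_is_clear:
        return "machine learning"
    # Not covered by the summary; needs case-by-case judgment.
    return "undetermined"


print(recommend_rul_method(1, needs_explanation=True, relationship_is_clear=True))
print(recommend_rul_method(8, needs_explanation=False, relationship_is_clear=False))
```

The physical model method is left out of the rule because the summary judges it unsuitable for rail transit equipment; for other equipment with a single parameter and a clear degradation mechanism it would be a further branch.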
The selection of accuracy evaluation methods is summarized as follows: (1) Health state estimation can be treated as either a classification problem or a regression problem, so the appropriate evaluation index needs to be selected according to the specific situation. Remaining life prediction is a regression problem, which can be evaluated by regression indices.
(2) The recommended classification evaluation index is accuracy or the F1 value, and the recommended regression evaluation index is mean square error or R-squared.
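For reference, the four indices named above can be computed with a few lines of plain Python. This is a minimal sketch using the standard textbook definitions; the function names are illustrative, the F1 value is shown for the binary case only, and the sample labels and values are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_value(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_square_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up data: health classes (1 = degraded) and RUL values in hours.
labels_true, labels_pred = [1, 0, 1, 1], [1, 0, 0, 1]
rul_true, rul_pred = [300.0, 200.0, 100.0], [280.0, 210.0, 120.0]
print(accuracy(labels_true, labels_pred), f1_value(labels_true, labels_pred))
print(mean_square_error(rul_true, rul_pred), r_squared(rul_true, rul_pred))
```

Accuracy can be misleading when degraded samples are rare, which is one reason the F1 value is listed as an alternative; likewise, mean square error is scale dependent while R-squared is not.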

Feasibility Analysis
Feasibility analysis of health state estimation methods: (1) The economic cost, time cost and labor cost of fault injection tests are within an affordable range. The training data and verification data of health state estimation methods can be derived from fault injection tests, and the samples of these tests can be recycled, so the economic cost of the test samples is within an acceptable range. Since the test can be conducted intermittently and the time spent on each group of tests can be reduced to weeks, the test time is also within the acceptable range. In addition, the test needs the cooperation of engineers who are proficient in products, data acquisition and accelerated life tests. If the unit has engineers with these abilities, the test is feasible as far as manpower is concerned.
(2) The time cost and labor cost of exploring the health state estimation methods are large. If the company has no research foundation in this field, the exploration and comparison of various non-machine learning methods will take a long time. The labor cost depends on the number of non-machine learning health state estimation methods. If more non-machine learning health state estimation methods are selected, the labor cost will be larger.
(3) The health state estimation methods can bring greater economic benefits and better social benefits. If most of the equipment in the industry is suitable for health state estimation, these methods can be applied widely in engineering practice in the future, which will bring great economic benefits. Engineering application can greatly improve the operational safety and reliability of the unit's products, so as to obtain better social benefits.
Feasibility analysis of remaining life prediction methods: (1) The economic cost, time cost and labor cost of accelerated life tests are significant. The training data and verification data of remaining life prediction methods need to be derived from accelerated life tests. These are destructive tests, and the samples cannot be recycled, so the economic cost of the test samples is high. Since the test needs to be carried out continuously until the sample fails, each group of tests takes roughly half a year, so the test time is long. In addition, during the accelerated life test, personnel are required to be on duty 24 h a day, and engineers proficient in products, data acquisition and accelerated life tests are required to work together, so the labor cost is high.
(2) The time cost and labor cost of exploring the remaining life prediction methods are large. If the unit has no previous experience in remaining life prediction methods, the exploration of various remaining life prediction methods will take a long time. The labor cost depends on the number of remaining life prediction methods utilizing machine learning. If more machine learning remaining life prediction methods are selected, the labor cost will be greater.
(3) The remaining life prediction methods bring smaller economic and social benefits. If only some equipment in the unit is suitable for remaining life prediction, the volume of engineering applications of remaining life prediction methods is small, and so are the economic benefits.

Figure 1. Classification of health state estimation methods.

Figure 2. Classification of remaining life prediction methods.

Author Contributions: J.Z. formulated the overall research objectives and carried out the research design and manuscript writing. C.G. and T.T. formulated the overall research objectives. X.X., M.L. and B.Y. made contributions to the research idea. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by the National Key R&D Program of China, grant number 2020YFB1600705, and the Beijing Science and Technology Project, grant number Z191100002519003.

Table 3. Advantages and disadvantages of health state estimation methods.

Table 4. Characteristics of remaining life prediction methods.

Table 5. Classification and characteristics of traditional machine learning remaining life prediction methods.

Table 6. Classification and characteristics of deep learning remaining life prediction methods.

Table 7. Classification and characteristics of statistical data-driven remaining life prediction methods.

Table 9. Health assessment in various industries.

Table 10. Remaining life prediction methods in various industries.