Remaining Useful Life Prediction of the Concrete Piston Based on Probability Statistics and Data Driven

This paper proposes a method for predicting the remaining useful life (RUL) of the concrete piston of a concrete pump truck based on probability statistics and data-driven approaches. Firstly, the average useful life of the concrete piston is determined by probability distribution fitting using actual life data. Secondly, based on the condition monitoring data of the concrete pump truck, a life coefficient of the concrete piston is proposed to represent the influence of the loading condition on the actual useful life of individual concrete pistons, and different regression models are established to predict the RUL of the concrete pistons. Finally, according to the prediction results for the concrete piston at different life stages, a replacement warning point is established to support the inventory management and replacement planning of the concrete piston.


Preface
With the continuous progress of modern manufacturing technology, the structure of mechanical and electrical systems is becoming more and more complex, which brings new challenges to fault prediction and health management of these systems. Parts are important components of mechanical and electrical product systems; once a part fails, it may affect the healthy operation of the whole system, or even cause serious loss of life and property. Therefore, the remaining useful life (RUL) prediction of parts has become a key research issue in fault prediction and health management [1][2][3]. Lei et al. [4] provided a review of machinery prognostics following its whole program, i.e., from data acquisition to RUL prediction. Lee et al. [5] provided a review of the system design of prognostics and health management, and gave a tutorial for the selection of RUL prediction approaches by comparing their advantages and disadvantages.
At present, a number of studies on the RUL prediction of parts have been reported [6][7][8], and the approaches to RUL prediction can be roughly grouped into three categories. The first category is the prediction method based on physical models, which estimates the RUL of parts according to the degradation mechanism. Leser et al. [9] validated a crack growth modeling method using damage diagnosis data based on structural health monitoring, and formed a probabilistic prediction of RUL for a metallic, single-edge notch tension specimen with a fatigue crack growing under mixed-mode conditions. Habib et al. [10] evaluated the stress of A310 aircraft wings during each loading cycle through a finite element analysis, and predicted the RUL of A310 wings using the Paris Law technique based on linear elastic fracture mechanics. Chen et al. [11] developed a novel computational modelling technique for the prediction of crack growth in load-bearing orthopaedic alloys subjected to fatigue loading, which can predict the RUL of parts through the crack path. The second category is the prediction method based on probability statistics, which fits the failure data of parts with a statistical distribution model to obtain the characteristic distribution of life. Wang et al. [12] proposed a novel method based on the three-parameter Weibull distribution proportional hazards model to predict the RUL of rolling bearings; the model is able to produce accurate RUL predictions for the tested bearings and outperforms the popular two-parameter model. Pan et al. [13] proposed a remanufacturability evaluation scheme based on the average RUL of the structural arm, and made a comprehensive evaluation by establishing the reliability parameter model of the structural arm. Xu et al. [14] discussed the influence of different distribution function values on the prediction results by analyzing different parameter estimation methods, and established an RUL prediction model based on the failure data of parts. Rong et al. [15] determined the average useful life of the pump truck boom based on the Weibull distribution function using failure data, and predicted the RUL of the boom from its used time. The third category is the data-driven prediction method. Ren et al. [16] analyzed the time-domain and frequency-domain characteristics of rolling bearing vibration signals, and established an RUL prediction model for rolling bearings based on deep neural networks. Liu et al. [17] proposed an RUL prediction framework based on multiple health state assessments that divides the entire bearing life into several health states, for each of which a local regression model can be built individually. Zio et al. [18] proposed a methodology for the estimation of the RUL of parts based on particle filtering. Sun et al. [19] used support vector machines to build degradation models for bearing RUL prediction. Maio et al. [20] proposed a combination of a relevance vector machine and model fitting as a prognostic procedure for estimating the RUL of degraded thrust ball bearings. Deutsch et al. [21] proposed a deep learning-based approach for the RUL prediction of rotating parts with big data.
With more and more information available to mechanical devices, many new methods have been applied to prediction models. Mad et al. [22] used a physical model to generate health indices whose evolution can be estimated and predicted online. Xu J et al. [23] combined the monitoring sensor data and integrated the strengths of the data-driven prognostics approach and the experience-based approach, while reducing their respective limitations.
RUL prediction based on physical models requires accurate models that describe the failure degradation mechanism of parts, while RUL prediction based on probability statistics does not consider the actual working state of individual parts, so the application of both methods is limited. With the support of modern information technology and industrial Internet of Things technology, mechanical and electrical product systems are becoming more and more intelligent, so more and more data on their working status can be obtained, which brings great potential for data-driven RUL prediction research [24].
A concrete pump truck is a construction vehicle that uses hydraulic pressure to deliver concrete continuously through a pipeline. The concrete piston, which is located in the conveying cylinder of the pump truck, as shown in Figure 1, is an important part of the concrete pump truck. When working, the concrete piston reciprocates in the concrete medium of the conveying cylinder, provides pressure for the concrete, pumps the concrete to a remote place, and plays a sealing role at the same time. The working environment of the concrete piston is very harsh, and it is difficult to establish an accurate failure degradation model or to obtain its operating state data directly. At present, there is limited research on the RUL prediction of the concrete piston. Using the condition monitoring data of the concrete pump truck and the replacement information data of the concrete piston, this paper puts forward an RUL prediction method for the concrete piston based on probability statistics and condition monitoring data, and the validity of the method is verified through result analysis and model application.
Figure 2 shows the flowchart of the proposed methodology for RUL prediction. The methodology is divided into two phases: offline and online. In the offline phase, the replacement information data from different concrete pistons are used to fit the Weibull distribution, the condition monitoring data from different concrete pump trucks are used to fit the regression algorithm, and the RUL prediction model is built. In the online phase, the RUL of the concrete piston is estimated based on the condition monitoring data from a new concrete pump truck and the real-time working life.
The rest of the paper is organized as follows: Section 2 describes the data. Section 3 establishes the RUL prediction model of the concrete piston based on probability statistics and data-driven approaches. Section 4 compares the prediction performance of the different regression models. Section 5 uses the best prediction model to set the replacement warning point of the concrete piston. Conclusions are provided at the end.

Data Source
The data studied in this paper were collected from 129 concrete pump trucks of a construction machinery enterprise from January to December 2019, and include two types of data: condition monitoring data of the concrete pump truck and replacement information data of the concrete piston. The condition monitoring data of the concrete pump truck include time, GPS latitude, GPS longitude, engine speed, hydraulic oil temperature, system pressure, pumping capacity, cumulative fuel consumption, reversing frequency, cumulative working time, and pump truck status, among others, which are uploaded to the enterprise's networked operation and maintenance platform through the Internet of Things. The replacement information data, which record the actual working life of the concrete piston when it is replaced because of failure, are entered directly into the networked operation and maintenance platform by the service engineers of the enterprise.

Data Description
According to the functional characteristics of the concrete piston, this paper studies five types of condition monitoring data related to the working state of the concrete pump truck: engine speed, system pressure, pumping capacity, reversing frequency, and cumulative working time. The specific meaning of the condition monitoring data is shown in Table 1. Each condition monitoring record of the concrete pump truck includes "equipment number", "parameter name", "parameter value" and "server receiving time", and there are more than 2.8 million records in total. Each replacement information record of the concrete piston includes "equipment number", "replacement timing" and "replacement date", and there are 325 records in total.

Data Preprocessing
The condition monitoring data of the concrete pump truck studied in this paper are time series data collected by sensors. Due to factors such as sensor timing errors or poor communication conditions, some data are missing from the data set. For four types of data (engine speed, system pressure, pumping capacity, and reversing frequency), the data collection frequency is high, so a missing value is likely to be very close to the value uploaded the previous time; the nearest complement method is therefore adopted to fill in the missing data. The cumulative working time is accumulated data, and it can be assumed that it changes slowly and uniformly, so linear interpolation is adopted to fill in the missing data [25]. The original engine speed data over a certain period of time are shown in Figure 3, and the processed data are shown in Figure 4.
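The two filling rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "nearest complement method" is read here as carrying the previously uploaded value forward, the function names `fill_previous` and `fill_linear` are hypothetical, and the sample values are invented rather than taken from the platform data.

```python
def fill_previous(values):
    """Fill None gaps with the most recent observed value.

    Assumed reading of the nearest complement method; gaps before the
    first observation are left as None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def fill_linear(times, values):
    """Fill None gaps by linear interpolation between neighbouring observations.

    `times` must be sorted in ascending order; gaps at either end are left as None.
    """
    known = [(t, v) for t, v in zip(times, values) if v is not None]
    filled = list(values)
    for idx, (t, v) in enumerate(zip(times, values)):
        if v is not None:
            continue
        before = [(tk, vk) for tk, vk in known if tk < t]
        after = [(tk, vk) for tk, vk in known if tk > t]
        if before and after:
            (t0, v0), (t1, v1) = before[-1], after[0]
            filled[idx] = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return filled


# Invented example values (not from the paper's data set).
engine_speed = [1500, None, 1520, None, None, 1480]
cumulative_hours = [100.0, None, None, 103.0, 104.0]
print(fill_previous(engine_speed))
print(fill_linear([0, 10, 20, 30, 40], cumulative_hours))
```

Carrying the previous value forward suits the densely sampled signals, while interpolation suits a quantity that can only grow steadily, such as the cumulative working time.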

Model Construction
If the actual working life data of the concrete piston are known, an appropriate probability distribution model can be selected to fit the data, and the characteristic distribution of the life can be obtained, which can be used to estimate the average useful life. During the operation of the concrete piston, the working state of the concrete pump truck has an impact on its actual working life, so a life coefficient is proposed based on the condition monitoring data of the concrete pump truck, and the RUL prediction model of the concrete piston is established, as shown in Equation (1):

$$M_r = \alpha M_t - M_0 \tag{1}$$

where $M_r$ is the RUL of the concrete piston, $\alpha$ is the life coefficient of the concrete piston related to the condition monitoring data of the concrete pump truck, $M_t$ is the average useful life of the concrete piston, and $M_0$ is the real-time working life of the concrete piston.
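As a small worked illustration, Equation (1) can be read as $M_r = \alpha M_t - M_0$, which is the form implied by the variable definitions above (the individual piston's expected life $\alpha M_t$ minus the life already used). The sketch below assumes that reading; the function names and the numbers are hypothetical.

```python
def remaining_useful_life(alpha, mean_life, working_life):
    """Equation (1), read as M_r = alpha * M_t - M_0 (all lives in the same unit)."""
    return alpha * mean_life - working_life


def life_coefficient(actual_life, mean_life):
    """Life coefficient of a replaced piston.

    At failure the RUL is zero, so Equation (1) gives alpha = M_0 / M_t.
    """
    return actual_life / mean_life


# Hypothetical numbers: average useful life 240 h, piston replaced after 300 h.
alpha = life_coefficient(300.0, 240.0)
print(alpha)                                       # 1.25
print(remaining_useful_life(alpha, 240.0, 210.0))  # 90.0
```

A coefficient above 1 describes a piston that outlasts the population average, and a coefficient below 1 describes one that wears out sooner.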

The Average Useful Life of the Concrete Piston
In the failure probability distribution function of parts, there are several kinds of common distribution functions: exponential distribution, normal distribution, lognormal distribution, Weibull distribution, etc. Among them, the Weibull distribution is the most widely used due to its high degree of fitting and good effect for parts which undergo notable degradation before final failure [25]. The main failure mode of the concrete piston is dissipation failure, so this paper uses the Weibull distribution to study the average useful life of the concrete piston.
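The probability density function of the two-parameter Weibull distribution is:

$$f(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1} \exp\left[-\left(\frac{t}{\lambda}\right)^{k}\right], \quad t \ge 0 \tag{2}$$

where $\lambda$ is the scale parameter, called the characteristic life, which sets the typical life scale of the parts (the time by which about 63.2% of the parts have failed); $k$ is the shape parameter, which reflects the failure form of the parts. The failure distribution function of the Weibull distribution is:

$$F(t) = 1 - \exp\left[-\left(\frac{t}{\lambda}\right)^{k}\right] \tag{3}$$

The average useful life $M_t$ of the concrete piston is represented by the expected value of the Weibull distribution:

$$M_t = \lambda\,\Gamma\left(1 + \frac{1}{k}\right) \tag{4}$$

where $\Gamma$ is the gamma function.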
From the replacement information data of the concrete piston, we obtain the actual working life data, arrange them in increasing order, estimate the cumulative failure probability of each ordered life with the common median rank formula, and estimate the parameters of the Weibull distribution by the least squares method. The fitting results are shown in Figure 5, and the fitting error is not higher than 0.056.
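The fitting procedure can be sketched as a median rank regression. The sketch makes two assumptions that the text does not spell out: the median rank is computed with Bernard's approximation $F_i \approx (i - 0.3)/(n + 0.4)$, and least squares is applied to the linearized form $\ln[-\ln(1 - F)] = k \ln t - k \ln \lambda$. The function names and the sample lives are hypothetical.

```python
import math


def fit_weibull_median_rank(lives):
    """Estimate (k, lam) of a two-parameter Weibull by median rank regression.

    Requires positive lives with at least two distinct values.
    """
    t = sorted(float(v) for v in lives)
    n = len(t)
    # Bernard's approximation of the median rank (an assumed choice).
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in ranks]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    k = sxy / sxx                    # slope of the fitted line is the shape parameter
    intercept = y_bar - k * x_bar    # intercept equals -k * ln(lam)
    lam = math.exp(-intercept / k)
    return k, lam


def weibull_mean(k, lam):
    """Average useful life: M_t = lam * Gamma(1 + 1/k)."""
    return lam * math.gamma(1.0 + 1.0 / k)


# Invented working lives in hours (not the paper's replacement records).
lives = [182, 205, 214, 226, 238, 247, 259, 270, 288, 301]
k, lam = fit_weibull_median_rank(lives)
print(k, lam, weibull_mean(k, lam))
```

Regressing on the linearized form keeps the estimate in closed form; a maximum likelihood fit would be an alternative, but it is not what the text describes.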

The Life Coefficient of the Concrete Piston
As the concrete piston is a mechanical part dominated by wear failure, it is expected to wear faster in a more demanding working environment, so the working time of the concrete pump truck under a high-load working state has a greater impact on its life. Considering the working environment and material properties of the concrete piston, the high-load working state is determined by parameters such as engine speed, system pressure, pumping capacity, and reversing frequency. According to the actual performance parameters of the concrete pump truck, the definition of the high-load working state is shown in Table 2. Based on this definition, the condition monitoring data of the concrete pump truck corresponding to the actual working life data of each concrete piston are statistically analyzed. For each of the 325 pistons, the proportion of its life cycle spent in the high-load working state is calculated separately for engine speed, system pressure, pumping capacity, and reversing frequency, and recorded as A, B, C, and D, respectively, as shown in Table 3. The life coefficient $\alpha$ is calculated from the average useful life $M_t$ and the working life $M_0$ at replacement according to Formula (1), and the results are shown in Table 3. The correlation coefficients between A, B, C, D and $\alpha$ are −0.6548, −0.5583, −0.4863 and −0.5379, respectively. These values indicate a moderate negative correlation between each high-load proportion and the life coefficient.
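The two quantities used here, the high-load proportion and the correlation coefficient, can be sketched as follows. The thresholds of Table 2 are not reproduced in this text, so the threshold of 2000 in the example is purely hypothetical; the sketch also assumes evenly spaced samples, so that the share of samples equals the share of working time, and it uses the Pearson correlation coefficient, which the text does not name explicitly.

```python
import math


def high_load_ratio(samples, threshold):
    """Share of samples at or above the high-load threshold.

    With evenly spaced samples this equals the share of working time.
    """
    return sum(1 for s in samples if s >= threshold) / len(samples)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Invented engine speed samples and a hypothetical high-load threshold.
print(high_load_ratio([1800, 2100, 2200, 1500], threshold=2000))  # 0.5

# Invented high-load proportions and life coefficients for four pistons.
ratio_a = [0.10, 0.25, 0.40, 0.55]
alphas = [1.20, 1.05, 0.95, 0.80]
print(pearson(ratio_a, alphas))
```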
Taking the four high-load working state proportions as inputs and the life coefficient α as the output, a prediction model for α is established with different algorithms. Considering that the dataset contains only 325 samples, multiple linear regression (MLR), support vector regression (SVR), and random forest regression (RFR) are selected because of their good performance with small sample sizes in RUL prediction.

MLR
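MLR predicts the dependent variable as a linear combination of the independent variables; it maps the relationship between a dependent variable and several explanatory variables. The model is defined as:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i \tag{5}$$

where $y_i$ is the response of the $i$-th sample, $x_{ki}$ is the value of the $k$-th explanatory variable for the $i$-th sample, $\beta_k$ is the corresponding regression coefficient, and $\varepsilon_i$ is the random error.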
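A minimal sketch of fitting such a model by ordinary least squares is given below, with the four high-load proportions A, B, C, D as explanatory variables and α as the response. The helper names `fit_mlr` and `predict_mlr`, the coefficients, and the synthetic data are all hypothetical; NumPy is assumed to be available.

```python
import numpy as np


def fit_mlr(X, y):
    """Least squares estimate of [beta_0, beta_1, ..., beta_k]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # leading column of ones for beta_0
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict_mlr(beta, X):
    """Evaluate beta_0 + sum_k beta_k * x_k for each row of X."""
    X = np.asarray(X, dtype=float)
    return beta[0] + X @ beta[1:]


# Synthetic data: columns play the role of A, B, C, D; y plays the role of alpha.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 0.6, size=(40, 4))
y = 1.3 - X @ np.array([0.5, 0.4, 0.3, 0.4]) + rng.normal(0.0, 0.02, size=40)
beta = fit_mlr(X, y)
print(beta)
print(predict_mlr(beta, X[:3]))
```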

SVR
SVR is one of the applications of the support vector machine (SVM). The SVM constructs a hyperplane in a high-dimensional space, which can be used for classification and regression. Consider a given dataset $\{(x_i, y_i),\ i = 1, 2, \cdots, n\}$, where $x_i \in \mathbb{R}^d$ are the input vectors, $y_i \in \mathbb{R}$ are the associated output values, and $n$ is the number of samples. The regression model can be expressed as follows:

$$f(x) = \omega^{T} x + b \tag{6}$$

where $\omega$ is a $d$-dimensional vector and $b$ is the bias term.

RFR
RFR is an extension of the decision tree algorithm, in which multiple decision trees are combined and each decision tree is trained independently. The training procedure is as follows: (1) a bootstrap sample is drawn from the training dataset as a randomized subset; (2) an individual tree is grown on this sample using a randomized subset of the predictor variables, where each tree model $f(x_i)$ is defined as $y_i = f(x_i) + \varepsilon_i$ and the tree is grown to the largest extent possible without pruning; (3) steps (1) and (2) are repeated until the specified number of trees has been grown. The predictions of the individual trees are then aggregated by averaging [26].

Result Analysis
The dataset for the concrete piston life prediction shown in Table 3 is randomly divided into a training set and a test set at a ratio of 8:2. The three algorithms, MLR, SVR, and RFR, are trained on the training set to predict the life coefficient α. The predicted α is then used to predict the life of the parts in the test set according to Formula (1). The calculation, analysis, and plotting are implemented in Python using existing toolkits. The predicted life of the concrete piston calculated by each model is compared with the actual working life, as shown in Figures 6-8. As can be seen from Figures 6-8, among the three prediction models, the SVR model has the best prediction effect.
The root mean square error (RMSE), as shown in Formula (7), is used to evaluate the prediction results:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{7}$$

where $\hat{y}_i$ is the predicted value, $y_i$ is the actual value, and $n$ is the number of estimates. The RMSE is the square root of the mean of the squared errors over all $n$ estimates. A smaller RMSE value indicates a more accurate prediction.
In order to compare the prediction accuracy of the models in more detail, a five-fold cross-validation is carried out. The dataset is divided evenly into five subsets. Each time, four subsets are used as the training set and the remaining subset as the test set. A total of five validation runs are carried out, and the RMSE values of each model are obtained, as shown in Figure 9. As can be seen from Figure 9, the prediction errors of each model are generally stable; the RMSE value of the SVR model is the lowest and its prediction effect is the best, so we chose the SVR model to predict the RUL of the concrete piston online.
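The comparison described above can be sketched with scikit-learn. The text only says that Python toolkits were used, so the choice of scikit-learn is an assumption, as are the hyperparameters (`C`, `epsilon`, `n_estimators`), the random seed, and the synthetic data standing in for Table 3. The sketch shows the mechanics of a five-fold RMSE comparison and does not reproduce the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVR


def rmse(y_true, y_pred):
    """Root mean square error, as in Formula (7)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


def compare_models(X, y, seed=0):
    """Return the per-fold RMSE of MLR, SVR, and RFR under five-fold cross-validation."""
    models = {
        "MLR": LinearRegression(),
        "SVR": SVR(kernel="rbf", C=1.0, epsilon=0.01),          # placeholder settings
        "RFR": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    folds = KFold(n_splits=5, shuffle=True, random_state=seed)
    scores = {}
    for name, model in models.items():
        neg = cross_val_score(model, X, y, cv=folds,
                              scoring="neg_root_mean_squared_error")
        scores[name] = -neg  # scikit-learn reports the negated RMSE
    return scores


# Synthetic stand-in for Table 3: 325 rows of (A, B, C, D) and a life coefficient.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 0.6, size=(325, 4))
y = 1.3 - X @ np.array([0.5, 0.4, 0.3, 0.4]) + rng.normal(0.0, 0.03, size=325)
scores = compare_models(X, y)
for name, fold_rmse in scores.items():
    print(name, fold_rmse.round(4), fold_rmse.mean().round(4))
```

Shuffling before splitting keeps each fold representative when the records are ordered by equipment or date; which model ranks best on real data depends on the data and on tuning.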

Dependence of RUL Prediction on Working Time
In order to further analyze the performance of the life prediction model at different working times of the concrete piston, life prediction was performed at a step size of 5% of the actual working life, with a typical result for α and the RUL prediction shown in Table 4. In Table 4, $M_a$ is the actual RUL of the concrete piston. Three concrete pistons with actual working lives of 210, 240 and 270 h, respectively, were selected to analyze the prediction effect of the model, and all the data were used to draw the RMSE curves of the prediction results, as shown in Figure 10. From Figure 10a-c, it can be seen that the prediction effect is best when the working time reaches approximately 80% of the actual working life. The RUL of 325 concrete pistons is predicted using the proposed method, where the estimation error is less than 4.73%. Figure 10d shows the averaged RMSE value of the predicted RUL at different working times. It can be seen that, in the early-life stage of the concrete piston, the prediction has a large error because less condition monitoring data are available. The prediction accuracy improves as the working time increases until the working time reaches 80% of the actual working life, after which the prediction accuracy becomes worse as the working time increases.
At present, the concrete pistons of concrete pump trucks are not replaced preventively due to the lack of supporting approaches. They are usually replaced only after wearing to failure, which often leads to unplanned downtime of the concrete pump truck, causing unnecessary economic losses and even affecting project progress. To achieve preventive replacement, it is very important to choose an appropriate replacement time. Replacing too early increases costs, and replacing too late may lead to unplanned downtime. Therefore, the replacement plan should be developed when the working time is close to the actual working life and the prediction error is small. This work finds that the RUL prediction model of the concrete piston based on probability statistics and data-driven methods has the best prediction effect when the working life of the concrete piston reaches 80% of the predicted useful life, and this result can be used to formulate preventive replacement plans. This point can be set as a replacement warning point and used as the main basis for maintenance decisions, so that a reasonable replacement and inventory management plan can be developed to reduce costs and economic losses.
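A possible way to turn the replacement warning point into an online check is sketched below. It assumes that the predicted useful life is $\alpha M_t$ (consistent with Formula (1)) and that the warning is raised once the working life reaches 80% of that value; the function name, the default fraction parameter, and the numbers are hypothetical.

```python
def replacement_warning(alpha, mean_life, working_life, warning_fraction=0.8):
    """Return (predicted RUL, warning flag) for one piston.

    The predicted useful life is taken as alpha * M_t, and the flag is set
    once the working life reaches `warning_fraction` of it.
    """
    predicted_life = alpha * mean_life
    rul = predicted_life - working_life
    warn = working_life >= warning_fraction * predicted_life
    return rul, warn


# Hypothetical piston: alpha = 1.25, average useful life 240 h (predicted life 300 h).
print(replacement_warning(1.25, 240.0, 210.0))  # (90.0, False)
print(replacement_warning(1.25, 240.0, 250.0))  # (50.0, True)
```

In practice the coefficient would be re-estimated from the latest condition monitoring data before each check, so the warning point moves with the observed loading.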

Conclusions
This paper proposes a new method for predicting the RUL of the concrete piston based on probability statistics and data-driven methods. A life coefficient is proposed to link the actual life of individual concrete pistons to the average useful life derived from the actual replacement data of a set of concrete pistons. The life coefficient is considered to be mainly affected by the load working state, and it is found that support vector regression provides a good estimate of the life coefficient. The RUL of 325 concrete pistons is predicted using the proposed method, where the estimation error is less than 4.73%. It is also found that the prediction accuracy is best when the working life reaches 80% of the predicted useful life; this point is proposed as the replacement warning point to support the inventory management and replacement planning of the concrete piston.

Conflicts of Interest:
The authors declare no conflict of interest.