Remaining Useful Life Prediction of Lithium-Ion Batteries Based on Deep Learning and Soft Sensing

Remaining useful life (RUL) prediction is of great concern for the reliability and safety of lithium-ion batteries (LIBs) in electric vehicles (EVs), but the prediction precision is still unsatisfactory due to unreliable measurements and data fluctuations. To address these issues, an adaptive sliding window-based gated recurrent unit neural network (GRU NN) is constructed in this paper to achieve a precise RUL prediction of LIBs with the soft sensing method. To evaluate the battery degradation performance, an indirect health indicator (HI), i.e., the constant current duration (CCD), is first extracted from the charge voltage data, providing a reliable soft measurement of battery capacity. Then, a GRU NN with an adaptive sliding window is designed to learn the long-term dependencies and simultaneously fit the local regenerations and fluctuations. By employing the inherent memory units and gate mechanism of a GRU, the designed model can learn the long-term dependencies of HIs at a low computation cost. Furthermore, since the length of the sliding window is updated in a timely manner according to the variation of HIs, the model can also capture the local tendency of HIs and address the influence of local regeneration. The effectiveness and advantages of the integrated prediction methodology are validated via experiments and comparisons, which show that it provides a more precise RUL prediction.


Introduction
As the main energy component, lithium-ion batteries (LIBs) play an important role in the development of hybrid and electric vehicles (EVs) and other electronics industries, owing to their high energy density, low emissions and light weight [1]. However, the maximum available capacity gradually fades with repeated charge and discharge, eventually leading to the end of the battery life. Replacing the battery too early causes waste, while replacing it too late may lead to safety accidents [2,3]. Health monitoring and prognostics for LIBs can greatly improve the safety and reliability of EVs and provide early warning for battery replacement [4]. One of the most important issues in the condition monitoring and prognostics of LIBs is the prediction of remaining useful life (RUL) via degradation modeling and online inference.
Typically, RUL prediction methods are divided into model-based methods and data-driven methods. The model-based methods, particularly the Kalman filter (KF), the particle filter (PF) and some stochastic models, have been recognized to contribute to the state of charge (SOC) and state of health (SOH) estimation of batteries in recent years [5-8]. However, for LIBs, the sensitivity of a stochastic model to the complicated degradation mechanisms decreases the model robustness. By contrast, the data-driven methods can learn the battery degradation trends directly from battery monitoring data, thereby circumventing the analysis of electrochemical reactions and failure mechanisms. Hence, these technologies have recently attracted great interest among researchers [9].
The main contributions of this paper are summarized as follows:
(1) Combining soft sensing with deep learning, a reliable RUL prediction model is proposed, which can accomplish a satisfactory HI estimation and provide an accurate RUL for LIBs in the routine environment.
(2) A unique indirect HI, i.e., the CCD extracted from the charge monitoring data, is obtained without complicated measurements and time-consuming calculations, providing a soft measurement of battery performance degradation.
(3) A GRU prediction network with an adaptive sliding window is utilized to estimate the HI tendencies and determine the battery residual life. The designed GRU NN can not only learn the long-term dependencies but also fit the local regenerations and fluctuations of the battery degradation with a low computation cost.

Test Data
All the data for the analysis and tests in this manuscript are selected from the experimental data of 18650-sized LIBs provided by the NASA PCoE research center [27]. A set of four LIBs (B5, B6, B7 and B18) is cycled through three operation profiles (charge, discharge and impedance) at room temperature (24 °C). The parameter setups about ambient temperature (AT), charge current (CC), discharge current (DC), end-of-discharge (EOC) and end of life criteria (EOLC) for these batteries are presented in Table 1. Since the charge process is more stable than the discharge process, the data measured during battery charge are employed to analyze the battery degradation performance. The database used in this paper contains the charge information covering cycle 0 to cycle 168 for B5-B7, and cycle 0 to cycle 132 for B18. Taking B5 as an example, Figure 1a,b illustrate the charge current curves and voltage curves at charge cycles 1, 40, 70, 120 and 160, respectively. Specifically, the battery charge profile is carried out in a constant current (CC) and constant voltage (CV) mode, and the detail of the CC-CV charge curves at cycle 40 is shown in Figure 2. As shown in Figure 2, the CC charge mode is first applied at the beginning of the charge process with a current of 1.5 A until the battery voltage reaches 4.2 V. Then, the battery enters the CV charge mode until the current drops to 20 mA. The work conditions and test modes of B6, B7 and B18 are the same as those of B5.

HI Extraction
An effective HI that can reflect the real battery degradation characteristics is important for the RUL prediction of LIBs. The traditional HIs, e.g., capacity or impedance, are difficult to measure because they generally need expensive instruments and complex operations, so they cannot be used in online real-time prognosis. Hence, the soft sensing method is becoming mainstream in data-driven prediction. As stated in [28], the soft sensing model takes easily measurable variables as the input and difficult-to-measure variables as the output, in order to estimate the variables that cannot be detected due to the limitations of the sensors. In this section, we aim to construct an easily measurable indirect HI to reflect the battery capacity degradation, since the capacity itself is unavailable in the online application.
It is well known that the performance of LIBs gradually degrades as the number of charge cycles increases, and this degradation can be observed in the geometrical shapes of the current and voltage curves at different cycles, as seen in Figure 1. These curves show that the time length of the CC charge mode gradually shortens, and the voltage rises faster and faster as the cycle number grows. Namely, there exists a certain correlation between the constant current duration and the battery performance. To describe the battery degradation process, the CCD is selected in this paper as the HI. The CCD is the duration of the CC charge mode, as defined in Figure 2. It is a direct indicator of the battery capacity variation and reflects the battery polarization to a certain extent.
The vector of the CCD along the charge cycles can be expressed as follows:
$$\mathbf{CCD} = [t_1, t_2, \cdots, t_N] \quad (1)$$
where $\mathbf{CCD}$ is the vector of the observed values of the CCD, $N$ indicates the number of battery cycles and $t_i$ represents the end time of the CC charge process for cycle $i$, which is also the start time of the CV charge process. Generally, as in Figures 1 and 2, the end time of the CC charge process can be taken as the time when the battery voltage reaches 4.2 V.
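The extraction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the per-cycle `(times, voltages)` data layout and the toy numbers are assumptions made here, and only the 4.2 V switch voltage comes from the charge protocol in the text.

```python
from typing import List, Sequence, Tuple

V_CUTOFF = 4.2  # voltage (V) at which the CC phase ends, per the charge protocol


def constant_current_duration(times: Sequence[float],
                              voltages: Sequence[float],
                              v_cutoff: float = V_CUTOFF) -> float:
    """Return the elapsed time until the voltage first reaches v_cutoff.

    times and voltages are equally long samples of one charge cycle,
    with times given in seconds in increasing order.
    """
    if len(times) == 0 or len(times) != len(voltages):
        raise ValueError("times and voltages must be non-empty and equally long")
    for t, v in zip(times, voltages):
        if v >= v_cutoff:
            return t - times[0]
    raise ValueError("voltage never reached the cutoff; CC phase incomplete")


def ccd_vector(cycles: Sequence[Tuple[Sequence[float], Sequence[float]]]) -> List[float]:
    """Build the CCD vector from per-cycle (times, voltages) pairs."""
    return [constant_current_duration(t, v) for t, v in cycles]


# Toy data, made up for illustration (not NASA measurements).
cycles = [
    ([0, 600, 1200, 1800, 2400], [3.6, 3.9, 4.1, 4.2, 4.2]),
    ([0, 600, 1200, 1800, 2400], [3.7, 4.0, 4.2, 4.2, 4.2]),
]
ccd = ccd_vector(cycles)
```

With real measurements the voltage may hover just below the nominal cutoff because of sensor noise, so a small tolerance on `v_cutoff` may be needed in practice.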

Algorithm Description
Using the extracted CCD as the HI, an adaptive sliding window-based GRU prediction network is constructed in this section to estimate the HI degradation and predict the RUL of the LIB. The structure of the prediction model is illustrated in Figure 3. As presented, an adaptive sliding window is designed to dynamically select the input data for training and forecasting. Then, a GRU NN is constructed to estimate the decline of the CCD online using the trained model parameters and the forecasting inputs. Finally, the RUL of the LIB is determined from the predicted CCD values.

GRU Prediction with Adaptive Sliding Window
As an improved recurrent neural network, a GRU is designed to solve the exploding and vanishing gradient problems by virtue of its distinctive memory unit and gate mechanism. Meanwhile, compared with the traditional LSTM, its streamlined gates require less training data and time for the model to converge. By combining with the GRU cells, an adaptive window updating mechanism is designed to support the construction of the GRU NN for the CCD estimation and RUL prediction.
The proposed adaptive sliding window-based GRU prediction structure is shown in Figure 4. The amount of CCD data fed into the GRU model in each iteration is updated as the window length changes. Notably, the number of GRU cells, i.e., the hidden size of the GRU NN, is kept consistent with the length of the sliding window. The learning data for the GRU model are generated using the adaptive sliding window as follows: (1) The sliding mode of the window is set as one-step ahead, i.e., only one new data point enters the window at each step. Let the current point be P, and the next point be P + 1; the value of the CCD at P + 1 needs to be predicted.
The prior data captured in the current sliding window are used to predict the CCD value at P + 1. The length of this window, $L_{P+1}$, is updated according to Equation (2) [29], which depends on the following quantities. $\Delta CCD_{P+1} = \lVert \mathbf{CCD}_P - \mathbf{CCD}_{P+1} \rVert$ is the Euclidean norm of the difference between $\mathbf{CCD}_P$ and $\mathbf{CCD}_{P+1}$, with $\mathbf{CCD}_P = [t_{P-L_P}, \cdots, t_{P-1}]$ and $\mathbf{CCD}_{P+1} = [t_{P-L_P+1}, \cdots, t_P]$, and $\Delta CCD_0$ is the mean value of $\Delta CCD$. $|\Delta R_{P+1}|$ is the absolute value of the difference between $R_{P+1}$ and $R_P$, where $R_i$ is the variance of $\mathbf{CCD}_i$, and $\Delta R_0$ denotes the mean value of $|\Delta R_i|$. $L_{max}$ and $L_{min}$ are hyper-parameters of the proposed sliding window, which are determined by trial and error.
(2) In the online training stage, after selecting the initial window length and performing the one-step-ahead prediction, the CCD data for training are expanded into a two-dimensional space to explore the structure and parameters of the GRU NN. The length of each sequence varies with the adaptive mechanism (Equation (2)). With the trained model, the designed GRU NN can predict the CCD of the following cycles one cycle at a time. As seen in Figure 4, the GRU NN is composed of basic GRU cells, each with a reset gate (r) and an update gate (z). The information propagating in the GRU cells is controlled by the gate mechanism.
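The window bookkeeping in steps (1) and (2) can be illustrated with a short Python sketch. It only computes the two statistics that drive the window update (the Euclidean norm of the difference between consecutive windows and the absolute change in their variances) and builds one-step-ahead training pairs for a fixed window length. The update rule of Equation (2) itself is not reproduced in the text, so it is deliberately left out. The function names, 0-based indexing, use of the population variance and toy values are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def window(ccd: Sequence[float], end: int, length: int) -> List[float]:
    """Return the `length` values ccd[end - length : end] (end is exclusive)."""
    if length <= 0 or end - length < 0 or end > len(ccd):
        raise ValueError("window does not fit inside the series")
    return list(ccd[end - length:end])


def variance(xs: Sequence[float]) -> float:
    """Population variance of xs (an assumed choice of estimator)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def window_statistics(ccd: Sequence[float], p: int, length: int) -> Tuple[float, float]:
    """Compare the window ending just before index p with the one shifted by one.

    Returns (delta_ccd, abs_delta_r): the Euclidean norm of the difference
    between the two windows and the absolute difference of their variances.
    """
    w_p = window(ccd, p, length)         # [t_{p-L}, ..., t_{p-1}]
    w_next = window(ccd, p + 1, length)  # [t_{p-L+1}, ..., t_p]
    delta_ccd = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_p, w_next)))
    abs_delta_r = abs(variance(w_next) - variance(w_p))
    return delta_ccd, abs_delta_r


def one_step_pairs(ccd: Sequence[float], length: int) -> List[Tuple[List[float], float]]:
    """Pair every window of `length` values with the value that follows it."""
    return [(window(ccd, p, length), ccd[p]) for p in range(length, len(ccd))]


# Toy normalized CCD values, made up for illustration.
series = [1.0, 0.9, 0.8, 0.85, 0.7]
delta_ccd, abs_delta_r = window_statistics(series, p=3, length=3)
pairs = one_step_pairs(series, length=3)
```

In the adaptive scheme the window length would change from step to step between `L_min` and `L_max`; a fixed length is used in `one_step_pairs` only to keep the sketch short.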
Let $t_P$ be the input at the current time and $t_{P+1}$ the CCD value at the next time, which is the value to be predicted. $h_P$ denotes the hidden state of the GRU cell at P, which is also the output of the cell. The reset gate ($r$) controls how the new input information is combined with the output information yielded by previous cells, and the update gate ($z$) retains the helpful historical information. The reset gate and update gate at time P + 1 are computed by Equations (3) and (4), respectively, where σ is the logistic sigmoid function, W and U represent the layer weights and b indicates the biases. The output of the reset gate is employed to generate the candidate state $\tilde{h}_{P+1}$ using a tanh function for updating the cell state. Then, the output of this cell, $h_{P+1}$, is calculated from $\tilde{h}_{P+1}$ and the output of the update gate, $z_{P+1}$. This transformation of the cell states is given by Equations (5) and (6), where $\odot$ denotes the element-wise product. The RUL prediction model is constructed by connecting the above GRU cells. When the predicted CCD falls below the failure threshold, a failure is declared and the RUL can be calculated.
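The gate mechanism can be made concrete with a pure-Python sketch of one GRU update. It assumes the widely used textbook GRU formulation, which may differ in convention from the paper's own equations (some references swap the roles of z and 1 - z in the final blend). The parameter layout, the random initialization and the helper names are hypothetical, and no training is performed.

```python
import math
import random
from typing import Dict, List, Sequence

Params = Dict[str, list]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gru_step(x: float, h_prev: Sequence[float], p: Params) -> List[float]:
    """One GRU update for a scalar input x (a CCD value) and hidden state h_prev.

    r  = sigmoid(W_r x + U_r h_prev + b_r)       reset gate
    z  = sigmoid(W_z x + U_z h_prev + b_z)       update gate
    hc = tanh(W_h x + U_h (r * h_prev) + b_h)    candidate state
    h  = (1 - z) * h_prev + z * hc               new hidden state
    """
    n = len(h_prev)
    ur, uz = matvec(p["U_r"], h_prev), matvec(p["U_z"], h_prev)
    r = [sigmoid(p["W_r"][i] * x + ur[i] + p["b_r"][i]) for i in range(n)]
    z = [sigmoid(p["W_z"][i] * x + uz[i] + p["b_z"][i]) for i in range(n)]
    uh = matvec(p["U_h"], [r[i] * h_prev[i] for i in range(n)])
    hc = [math.tanh(p["W_h"][i] * x + uh[i] + p["b_h"][i]) for i in range(n)]
    return [(1.0 - z[i]) * h_prev[i] + z[i] * hc[i] for i in range(n)]


def init_params(hidden: int, seed: int = 0) -> Params:
    """Random weights and zero biases (illustrative initialization only)."""
    rng = random.Random(seed)
    params: Params = {}
    for gate in ("r", "z", "h"):
        params[f"W_{gate}"] = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        params[f"U_{gate}"] = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
                               for _ in range(hidden)]
        params[f"b_{gate}"] = [0.0] * hidden
    return params


def encode_window(values: Sequence[float], params: Params, hidden: int) -> List[float]:
    """Feed a window of CCD values through the cell, starting from a zero state."""
    h = [0.0] * hidden
    for x in values:
        h = gru_step(x, h, params)
    return h


ccd_window = [0.9, 0.85, 0.8]   # toy normalized CCD values
hidden = len(ccd_window)        # hidden size tied to the window length, as in the text
h = encode_window(ccd_window, init_params(hidden), hidden)
```

A full predictor would add an output layer that maps the final hidden state to the next CCD value and would fit all weights by backpropagation; both are omitted here.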

RUL Prediction
An LIB is deemed to fail when the HI reaches its pre-specified failure threshold. The number of available service cycles from the current cycle to the end-of-life cycle is referred to as the RUL. In this paper, the end-of-life cycle is the cycle number at which the CCD decreases below its failure threshold, and the current cycle is the prediction start cycle.
The high correlation between the capacity and the extracted HI is demonstrated in Section 4.1. On this basis, the failure threshold of the normalized CCD ($t_{nor\_th}$) can be expressed as follows [13]:
$$t_{nor\_th} = \frac{Cap_{th} - Cap_{min}}{Cap_{max} - Cap_{min}} \quad (7)$$
where $Cap_{th}$ indicates the failure threshold on battery capacity, which is usually set to 70-80% of its nominal value [10], and $Cap_{max}$ and $Cap_{min}$ are the maximum and minimum of the capacity. For convenience, the normalized CCD and its failure threshold are employed in the following experiments and analysis. The RUL can be calculated using the following [16]:
$$N_{RUL} = N_{EOL} - N_{ECL} \quad (8)$$
where $N_{RUL}$ is the number of residual cycles, i.e., the RUL, $N_{EOL}$ indicates the cycle number when the value of the CCD degrades below $t_{nor\_th}$, and $N_{ECL}$ represents the prediction starting cycle.
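As a worked illustration of the threshold and RUL calculation, the following Python sketch assumes that the capacity threshold is mapped to the normalized scale by min-max normalization and that the end-of-life cycle is the first predicted cycle whose normalized CCD falls below that threshold. The function names and all numbers are hypothetical.

```python
from typing import Optional, Sequence


def normalized_threshold(cap_th: float, cap_max: float, cap_min: float) -> float:
    """Map a capacity failure threshold onto the [0, 1] normalized scale."""
    if cap_max <= cap_min:
        raise ValueError("cap_max must exceed cap_min")
    return (cap_th - cap_min) / (cap_max - cap_min)


def end_of_life_cycle(pred_ccd: Sequence[float], start_cycle: int,
                      threshold: float) -> Optional[int]:
    """Return the first cycle whose predicted normalized CCD is below threshold.

    pred_ccd[k] is the prediction for cycle start_cycle + k. None is returned
    when the prediction never crosses the threshold.
    """
    for k, value in enumerate(pred_ccd):
        if value < threshold:
            return start_cycle + k
    return None


def remaining_useful_life(pred_ccd: Sequence[float], start_cycle: int,
                          threshold: float) -> Optional[int]:
    """RUL = end-of-life cycle minus prediction start cycle, or None."""
    n_eol = end_of_life_cycle(pred_ccd, start_cycle, threshold)
    return None if n_eol is None else n_eol - start_cycle


# Made-up numbers for illustration only.
t_nor_th = normalized_threshold(cap_th=1.4, cap_max=1.85, cap_min=1.25)
predicted = [0.40, 0.33, 0.27, 0.24, 0.20]
rul = remaining_useful_life(predicted, start_cycle=81, threshold=t_nor_th)
```

Returning `None` when the threshold is never crossed mirrors the case where a predicted trajectory does not reach the failure threshold within the prediction horizon.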

Results and Discussion
To verify the validity of the RUL prognostic model proposed in this manuscript, several experiments and comparisons are performed here. The degradation data of cells B5, B6, B7 and B18 introduced in Section 2.1 are selected for the prediction and analysis experiments in this section.

Correlation Analysis and Life Threshold Calculation
According to the analyses in Section 2.2, the proposed HIs, i.e., the CCDs, are extracted from the charge current and voltage monitoring data of each battery.
To assess the consistency between the extracted CCD and the capacity, the Spearman correlation analysis and a significance test are performed for batteries B5-B7 and B18. The Spearman coefficient ranges from -1 to 1; a value close to 1 indicates a strong positive monotonic correlation, while a value near 0 denotes a weak correlation. Next, to obtain the statistical significance, a significance test on the Spearman rank correlation coefficient is performed. The level of statistical significance is expressed as H with H ∈ [0, 1]. A small H gives strong grounds for rejecting the null hypothesis of no correlation, which indicates a significant correlation between the extracted HI and the capacity. Details can be found in [30]. The obtained results are presented in Table 2. As can be seen, the correlation coefficients between the capacity and the extracted CCD for all the considered batteries are close to 1. This indicates that a significant monotonic correlation exists between the CCD and the capacity of LIBs. Hence, the extracted HI can reflect the degradation performance of LIBs as an alternative to capacity. Next, the failure threshold of the normalized CCD is calculated from the threshold of the normalized capacity according to Equation (7). The corresponding failure thresholds for the four batteries are also listed in Table 2.
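For readers who want to reproduce this kind of check, a minimal pure-Python sketch of the Spearman rank correlation is given below. It ranks both series (tied values receive their average rank) and then takes the Pearson correlation of the ranks. The significance test is not included, and the capacity and CCD values are invented for illustration.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must be equally long with at least two values")
    return pearson(average_ranks(x), average_ranks(y))


# Invented capacity (Ah) and CCD (s) values that fade together.
capacity = [1.85, 1.80, 1.82, 1.70, 1.60]
ccd = [3200, 3050, 3100, 2800, 2500]
rho = spearman(capacity, ccd)
```

Because the two toy series rise and fall together, their rank vectors coincide and `rho` is 1 up to floating-point rounding, even though the relationship between the raw values is not exactly linear.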

Performance Assessment
In this manuscript, two metrics are adopted to evaluate the prediction performance of the proposed model: the coefficient of determination (R-square) and the absolute error (AE).
The R-square is expressed as follows:
$$R^2 = 1 - \frac{\sum_{i}(y_i - \hat{y}_i)^2}{\sum_{i}(y_i - \bar{y})^2} \quad (9)$$
where $y_i$ and $\hat{y}_i$ are the actual and estimated values, respectively, and $\bar{y}$ represents the mean value of $y$.
The AE is calculated by the following:
$$AE = |R - \hat{R}| \quad (10)$$
where $R$ and $\hat{R}$ denote the real and the predicted RUL of LIBs, respectively.
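Both metrics are simple to compute. The sketch below uses the standard definition of the coefficient of determination and the absolute difference between the real and predicted RUL. The function names and sample values are chosen here for illustration.

```python
from typing import Sequence


def r_square(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and equally long")
    mean = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    if ss_tot == 0.0:
        raise ValueError("R-square is undefined when actual values are constant")
    return 1.0 - ss_res / ss_tot


def absolute_error(real_rul: float, predicted_rul: float) -> float:
    """Absolute error between the real and the predicted RUL, in cycles."""
    return abs(real_rul - predicted_rul)


# Illustrative values only.
actual_ccd = [1.0, 0.8, 0.6, 0.4]
perfect = r_square(actual_ccd, actual_ccd)
ae = absolute_error(44, 47)
```

An R-square of 1 means the estimated CCD matches the measured CCD exactly, while a model that always predicts the mean of the measured values scores 0.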

Prediction Results Analysis
This paper develops an adaptive sliding window-based GRU (ASWGRU) NN for the RUL prediction of LIBs to improve the prediction precision and robustness. The algorithm is implemented by embedding an adaptive sliding window (ASW) into the GRU algorithm, so that the inputs of the GRU are replaced with the data rebuilt by the designed adaptive sliding window. With the long-term learning performance of the GRU and the transient state capturing capacity of the adaptive sliding window, this method can not only predict the global degradation trends of the CCD, but also estimate the local regenerations. The window parameters L min and L max are set to 5 and 25, respectively. In the experiment, we found that the GRU with more than two hidden layers tends to overfit the training set; therefore, the number of GRU layers is set to 1. The hidden size of the GRU is updated by the ASW. Additionally, the resilient mean square backpropagation method is employed for adaptively optimizing the weights and biases in the GRU model. The other hyper-parameters of the neural network are selected with the manual search method.
First, an RUL prediction with start points 61, 71, 81 and 91 is conducted for B5, B6, B7 and B18. The prediction results can be seen in Figure 5. Although the transient state estimation performance for the CCD decreases as the training data decrease, the designed prediction algorithm can still provide a satisfactory RUL prediction for all the batteries. The R-square between the estimated CCD and the actual value and the AE of the predicted RUL for the four batteries are collected in Tables 3-6. All the R-square values of the B5-B7 batteries are close to 1, which demonstrates that the CCDs estimated using the present method are consistent with their real values, implying a high prediction precision. For B18, the local fluctuations in the battery data are more frequent and the RUL prediction precision is lower than that of B5 and B7.
Further, we compare the results at the 81st prediction start point with some commonly used time series neural network models, including the GRU without an ASW [24], the standard LSTM [31] and the Nonlinear Auto-Regressive Network (NARX) [32]. The LSTM is one of the advanced RNN algorithms with a more complicated gate mechanism in its memory cells to learn the long-term dependent sequences. For comparison purposes, the parameters of the LSTM are chosen to be the same as those of the ASWGRU and GRU. The NARX is the nonlinear extension of the linear auto-regressive with exogenous input (ARX) model, where the current output is described with a nonlinear functional expansion of lagged input and output signals, plus additive noise. The NARX model is often employed for time-series modeling. For the parameters of NARX, please refer to reference [33]. Figure 6 illustrates the RUL prediction results.
It is evident that more accurate results are obtained with the proposed ASWGRU model, which can better fit the local recovery dynamics and precisely estimate the final failure point for the four batteries considered. In contrast, the other three non-windowed approaches only obtain a steady decline prediction for the overall trends and fail to estimate the local regeneration due to the lack of a capturing mechanism for the local dynamics. Moreover, the NARX performs worst among all these methods and cannot achieve an effective RUL prediction at the 81st prediction start point because it lacks long-term memory capacity. Furthermore, compared with the traditional GRU, somewhat better prediction results can be obtained by using the LSTM. However, the LSTM needs more training iterations, which is not conducive to practical applications. Similar conclusions can be drawn from the R-square between the estimated CCD and the actual value and the AE of the predicted RUL for the four batteries collected in Tables 3-6, where "-" denotes that the estimated value does not reach its failure threshold. As shown in Tables 3-6, using the proposed algorithm, the RUL of all the batteries can be accurately forecasted, with all the AE values remaining within reasonable ranges, which verifies the effectiveness of the proposed method.
Figure 6. The prediction results of the CCD using the considered prediction models for batteries: (a) B5, (b) B6, (c) B7, (d) B18.

Conclusions
In this paper, we present a novel RUL prediction framework that combines deep learning with the soft sensing method. The CCD is extracted from the monitoring data of the charge process to reflect the battery degradation performance, and an adaptive sliding window-based GRU NN is constructed to simultaneously learn the long-term dependencies and fit the local fluctuations of the battery degradation at a low computation cost. The precise prediction result is verified via comparison experiments using the test data from the NASA PCoE research center. In reality, the high frequency measurement noise and the characteristic variation of the loads are still the main challenges for battery RUL predictions. In the future, we will further validate the effectiveness and superiority of the designed prediction method using more practical measurement data and improve the performance of the prediction algorithm in dealing with high frequency noise and load fluctuations.