Prognostics Comparison of Lithium-Ion Battery Based on the Shallow and Deep Neural Networks Model

Abstract: Prognostics of the remaining useful life (RUL) of lithium-ion batteries plays a crucial role in battery management systems (BMS). An artificial neural network (ANN) does not require much knowledge of the lithium-ion battery system, so it is a promising data-driven prognostic method for lithium-ion batteries. Although ANNs have been applied to the prognostics of lithium-ion batteries in some references, no one has compared the prognostics of lithium-ion batteries based on different ANNs. ANNs can generally be classified into two categories: shallow ANNs, such as the back propagation (BP) ANN and the nonlinear autoregressive (NAR) ANN, and deep ANNs, such as the long short-term memory (LSTM) NN. An improved LSTM NN is proposed in order to achieve higher prediction accuracy and make the construction of the model simpler. Using the lithium-ion battery data from NASA Ames, the prognostics of lithium-ion batteries based on the BP ANN, the NAR ANN, and the LSTM ANN were compared in detail. The experimental results show that: (1) the improved LSTM ANN has the best prognostic accuracy and is more suitable for predicting the RUL of lithium-ion batteries than the BP ANN and the NAR ANN; (2) the NAR ANN has better prognostic accuracy than the BP ANN.


Introduction
Lithium-ion batteries are widely used in applications ranging from portable electronics to battery-driven hybrid vehicles, owing to their higher energy density, higher output voltage, lower self-discharge, longer lifetime, higher reliability and other advantages compared with other types of batteries available on the market [1][2][3]. The consequences of lithium-ion battery failure vary in severity, from performance degradation to catastrophic failure [4]. In 2013, lithium-ion battery failures caused fires on several Boeing 787 aircraft and led to the fleet being grounded indefinitely [5]. Therefore, it is imperative to detect performance degradation and to predict the remaining useful life (RUL) of lithium-ion batteries. Prognostics and health management (PHM) comprises the technologies used to evaluate the reliability of a system under its actual life cycle conditions in order to determine the RUL [6,7]. One of the major tasks of prognostics is to predict the RUL [1]. In this paper, we consider that the battery has failed when its capacity drops to 70% of its initial capacity, because both the capacity and the power drop much faster after this point, and the battery becomes unreliable and should be replaced [8,9].
Existing methods for predicting the RUL of lithium-ion batteries can be roughly classified into two main categories: physics-of-failure (PoF)-based methods and data-driven methods [1,2]. PoF-based prognostic methods depend on the physical characteristics of the battery, i.e., its material properties, failure mechanisms and life cycle loading conditions, and tend to be computationally complex [10].
The batteries were run through three different operational profiles (charge, discharge and impedance) at room temperature (24 °C). The capacity of a lithium-ion battery is measured during a discharge cycle.
The voltage and current during one charge and discharge cycle are shown in Figure 1. The charging and discharging cycles were repeated until the batteries reached the end-of-life (EOL) criterion, which was a 30% fade in rated capacity.
(a) Charging process: charging was carried out in constant current mode at 1.5 A until the battery voltage reached 4.2 V, and then continued in constant voltage mode until the charge current dropped to 0.02 A. (b) Discharging process: for battery #B5, discharging was carried out at a constant current of 2 A until the battery voltage fell to 2.7 V.
Energies 2019, 12, x 3 of 14
When the actual capacity of the lithium-ion battery drops to 70% of the rated capacity, that is, from 2 Ah to 1.4 Ah, the experiment stops, which means that the lithium-ion battery has reached its life cut-off point. The capacity degradation curve of the #B5 battery is shown in Figure 2.
The capacity degradation curve of a lithium-ion battery is clearly nonlinear and non-Gaussian, as shown in Figure 2. The battery is charged and discharged repeatedly, so its capacity degrades over time. Because the capacity degradation trend has good continuity, the actual capacity of the battery can be used to characterize the health status of the lithium-ion battery. Therefore, the end of life (EOL) of the lithium-ion battery can be defined as the number of discharge cycles at which the actual capacity first drops to the specified capacity threshold. For the sake of simplicity, the capacity threshold is defined as 1.38 Ah, which corresponds to an integer number of cycles and is close to 70% of the rated capacity.
In addition, it is interesting to observe how the temperature inside the lithium-ion battery changes over a charge and discharge cycle at room temperature (24 °C). As can be seen from Figure 3, during the discharge process the temperature of the battery gradually increases at first and then drops quickly at the end.
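The EOL definition above (the first discharge cycle at which the capacity reaches the threshold) can be sketched in a few lines of Python. This is only an illustration: the function name `end_of_life` and the toy capacity values are assumptions made for the example and are not taken from the NASA data set.

```python
from typing import Optional, Sequence


def end_of_life(capacity_ah: Sequence[float], threshold_ah: float = 1.38) -> Optional[int]:
    """Return the 1-based discharge cycle at which the capacity first
    drops to the threshold, or None if the threshold is never reached."""
    for cycle, capacity in enumerate(capacity_ah, start=1):
        if capacity <= threshold_ah:
            return cycle
    return None


# Toy capacity sequence in Ah (illustrative values only, not measured data).
toy_capacity = [1.86, 1.70, 1.52, 1.41, 1.37, 1.39, 1.30]
eol_cycle = end_of_life(toy_capacity)
```

Because the search stops at the first crossing, a later small capacity recovery (such as the 1.39 Ah value in the toy sequence) does not change the reported EOL.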

Data Normalization
The input should be normalized by the following formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $x$ is the actual measured value; $x_{\min}$ and $x_{\max}$ are the minimum and maximum of the original sequence, respectively; and $x'$ is the normalized value, which belongs to [0, 1].
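A minimal sketch of this min-max normalization follows, assuming the usual form in which each value is shifted by the sequence minimum and divided by the range. The helper names `min_max_normalize` and `denormalize` are hypothetical; the inverse mapping is included because predictions made on normalized data have to be mapped back to Ah before they can be compared with a capacity threshold.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> Tuple[List[float], float, float]:
    """Scale a sequence to [0, 1]; also return the minimum and maximum used."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("a constant sequence cannot be min-max normalized")
    return [(x - x_min) / span for x in values], x_min, x_max


def denormalize(scaled: Sequence[float], x_min: float, x_max: float) -> List[float]:
    """Map normalized values back to the original scale."""
    return [x * (x_max - x_min) + x_min for x in scaled]


# Toy capacities in Ah (illustrative values only).
scaled, lo, hi = min_max_normalize([1.4, 1.7, 2.0])
restored = denormalize(scaled, lo, hi)
```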

Evaluation Index
In order to evaluate the prediction accuracy of the network model quantitatively, the absolute error, relative error and root mean square error of the lithium-ion battery RUL prediction can be used for analysis.

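These indices are defined as follows:
$$\mathrm{RUL}_{ae} = \left| \mathrm{EOL} - \mathrm{EOP} \right|, \qquad \mathrm{RUL}_{re} = \frac{\left| \mathrm{EOL} - \mathrm{EOP} \right|}{\mathrm{EOL}} \times 100\%, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2} \tag{2}$$
where EOL is the actual end of life of the battery; EOP (end of prediction) represents the predicted battery end of life; $\mathrm{RUL}_{ae}$ represents the absolute error of the prediction; $\mathrm{RUL}_{re}$ (%) represents the relative error of the prediction; RMSE represents the root mean square error; and $x_i$ and $\hat{x}_i$ are the measured and predicted capacities over the $N$ predicted cycles.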
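The three indices can be computed as in the sketch below. It assumes that the absolute error is |EOL − EOP|, that the relative error is this value divided by the actual EOL, and that the RMSE is taken over measured and predicted capacities; the function names are hypothetical. An actual EOL of 128 cycles is also an assumption, chosen only because it reproduces the relative errors of 10.16%, 6.25% and 1.56% that the paper reports for absolute errors of 13, 8 and 2.

```python
import math
from typing import Sequence


def rul_absolute_error(eol: int, eop: int) -> int:
    """Absolute error between the actual and predicted end of life, in cycles."""
    return abs(eol - eop)


def rul_relative_error(eol: int, eop: int) -> float:
    """Absolute error as a percentage of the actual end of life."""
    return abs(eol - eop) / eol * 100.0


def rmse(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


ASSUMED_EOL = 128  # assumption: inferred from the reported percentages
relative_errors = [round(rul_relative_error(ASSUMED_EOL, ASSUMED_EOL - e), 2)
                   for e in (13, 8, 2)]
```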
Once the input and output of the network model are determined, the next problem is how to determine the number of hidden layers and the number of hidden layer nodes. In general, only a network structure with one hidden layer is considered because, firstly, a network with one hidden layer and enough neurons can solve any finite input-output mapping problem and, secondly, in actual training it is more difficult to increase the prediction accuracy by adding hidden layers than by increasing the number of hidden layer nodes. At present, there is no exact mathematical formula for selecting the number of hidden nodes in an NN, and most choices are based on the following empirical model:
$$m = \sqrt{n + l} + \alpha \tag{3}$$
where $m$ represents the number of hidden layer nodes; $n$ represents the number of input layer nodes; $l$ represents the number of output layer nodes; and $\alpha$ is a constant between 1 and 10.
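As a quick check of the empirical rule, the sketch below assumes the widely used form m = sqrt(n + l) + α, which is consistent with the range (5, 15) that the paper later quotes for n = l = 12. The function name `hidden_nodes` is hypothetical.

```python
import math


def hidden_nodes(n_inputs: int, n_outputs: int, alpha: float) -> float:
    """Empirical estimate of the number of hidden nodes (assumed form)."""
    return math.sqrt(n_inputs + n_outputs) + alpha


# Candidate hidden-layer sizes for n = l = 12 and alpha = 1, 2, ..., 10.
candidates = [round(hidden_nodes(12, 12, a)) for a in range(1, 11)]

# alpha = 8 gives about 12.9, i.e. 13 nodes after rounding.
chosen = round(hidden_nodes(12, 12, 8))
```

The rule only narrows the search to a small range; the final value still has to be selected experimentally, for example by comparing the RMSE for each candidate.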

The BP NN Model
The BP NN is a kind of multilayer feedforward NN. During training, the signal propagates forward layer by layer according to the weight coefficients, and the weights and biases of the network are adjusted dynamically by the back propagation algorithm. Structurally, it consists of an input layer, a hidden layer and an output layer. In essence, the BP algorithm takes the sum of squared errors of the network as the objective function and uses gradient descent to find the minimum of this function. Its basic structure is shown in Figure 4.

The neural model in Figure 4 can be expressed mathematically as
$$a_j^{m} = \sigma\left(\sum_{i} w_{ji}^{m}\, a_i^{m-1} + b_j^{m}\right) \tag{4}$$
where the input values of a node, $a_i^{m-1}$, replicate the dendrites of the biological neuron and are multiplied by their respective connection weights, $w_{ji}^{m}$, and then summed. The sum, together with the bias $b_j^{m}$, is passed to the node function $\sigma$, giving the activation value of the node, $a_j^{m}$.
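The node model described above (weighted sum of the previous layer's activations, plus a bias, passed through the node function) can be sketched as a forward pass in plain Python. The sigmoid hidden activation, the linear output node and all numeric weights are assumptions made for the example; the back propagation update itself is not shown.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def identity(z: float) -> float:
    return z


def layer_forward(a_prev: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[float],
                  activation: Callable[[float], float] = sigmoid) -> List[float]:
    """Compute a_j = activation(sum_i w_ji * a_prev_i + b_j) for every node j."""
    return [activation(sum(w * a for w, a in zip(w_row, a_prev)) + b)
            for w_row, b in zip(weights, biases)]


# Toy network: 2 inputs -> 2 hidden nodes -> 1 linear output (made-up weights).
hidden = layer_forward([0.5, -1.0], [[1.0, 2.0], [0.0, 0.0]], [0.5, 0.0])
output = layer_forward(hidden, [[1.0, 1.0]], [0.0], activation=identity)
```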

The NAR NN Model
A dynamic NN can feed data saved from previous time steps into the calculation at a later time step through feedback and delay, so it is more suitable for time series prediction. The NAR NN is a kind of dynamic NN that has been widely used in practical applications. Its basic structure is shown in Figure 5, and its output can be written as
$$y(t) = f\big(y(t-1), y(t-2), \ldots, y(t-d)\big) \tag{5}$$
where $f$ is the nonlinear mapping learned by the network and $d$ is the number of delays. As can be seen in Equation (5), the output $y(t)$ is defined by $y(t-1), y(t-2), \ldots, y(t-d)$, which indicates that the current value of the system is determined by its past values.
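The autoregressive relation y(t) = f(y(t − 1), ..., y(t − d)) implies two practical steps: building lagged training pairs, and feeding each prediction back as an input when forecasting several steps ahead. The sketch below illustrates both under stated assumptions: the helper names are hypothetical, and a simple linear extrapolation stands in for the trained network f.

```python
from typing import Callable, List, Sequence, Tuple


def make_lagged_pairs(series: Sequence[float], d: int) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets): each input is [y(t-d), ..., y(t-1)], target y(t)."""
    if d < 1 or len(series) <= d:
        raise ValueError("need 1 <= d < len(series)")
    inputs = [list(series[t - d:t]) for t in range(d, len(series))]
    targets = [series[t] for t in range(d, len(series))]
    return inputs, targets


def closed_loop_forecast(history: Sequence[float], d: int,
                         f: Callable[[Sequence[float]], float], steps: int) -> List[float]:
    """Multi-step forecast in which every prediction is fed back as an input."""
    window = list(history[-d:])
    forecast = []
    for _ in range(steps):
        y_next = f(window)
        forecast.append(y_next)
        window = window[1:] + [y_next]
    return forecast


def linear_stand_in(window: Sequence[float]) -> float:
    """Placeholder for a trained network: linear extrapolation of the last two values."""
    return 2 * window[-1] - window[-2]


inputs, targets = make_lagged_pairs([1, 2, 3, 4, 5], d=2)
forecast = closed_loop_forecast([1, 2, 3], d=2, f=linear_stand_in, steps=3)
```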

The LSTM NN Model
In general, time series prediction modeling is a difficult problem because, unlike in regression prediction modeling, the observations in a time series depend on one another. As a deep learning network structure, the LSTM has the advantage that the context of the data is taken into account during training. Compared with traditional data-driven prediction methods, the LSTM network has a stronger nonlinear approximation ability. The general structure of the LSTM network is shown in Figure 6.
The LSTM uses structures called "gates" to remove information from, or add information to, the memory unit. A gate is a structure that lets information pass selectively. It consists of a sigmoid NN layer and a pointwise multiplication operation. The sigmoid layer outputs a value between 0 and 1 that describes how much of each component is allowed to pass. A memory unit has three types of gates, namely a forget gate, an input gate and an output gate, which are described in detail below.
Forget gate: the forget gate, $f_t$, takes $x_t$ and $h_{t-1}$ as inputs and outputs a number between 0 and 1 for each element of the state $C_{t-1}$. A value of 1 stands for retaining the state value completely, whereas a value of 0 represents discarding it completely. The forget gate is calculated as follows:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{6}$$
Input gate and input node: the "input gate", $i_t$, decides which values to update, and the "input node", $\tilde{C}_t$, creates a new candidate state. The two outputs are calculated as follows:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \qquad \tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{7}$$
Combining (6) and (7), the previous internal state $C_{t-1}$ is updated to the current state $C_t$:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{8}$$
Output gate: finally, a sigmoid layer called the "output gate", $o_t$, determines what information to output. The internal state $C_t$ is put through a tanh layer and multiplied by the output of the output gate, so that only the selected parts are output. This can be implemented as:
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right), \qquad h_t = o_t \odot \tanh(C_t) \tag{9}$$
where $W$ and $b$ are the layer weights and biases, respectively.
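To make the gate mechanics concrete, the following sketch implements one step of a standard LSTM cell in plain Python. It assumes the common formulation in which every gate acts on the concatenation of h(t−1) and x(t); the paper's exact weight layout is not recoverable from the text, so the parameter names (`W_f`, `b_f`, and so on) and the all-zero test weights are assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights: Sequence[Sequence[float]], vector: Sequence[float],
           biases: Sequence[float]) -> List[float]:
    """Return W @ vector + b using plain lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def lstm_step(x_t: Sequence[float], h_prev: Sequence[float], c_prev: Sequence[float],
              params: Dict[str, list]) -> Tuple[List[float], List[float]]:
    """One LSTM step; returns the new hidden state h_t and internal state c_t."""
    z = list(h_prev) + list(x_t)  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(v) for v in affine(params["W_f"], z, params["b_f"])]      # forget gate
    i_t = [sigmoid(v) for v in affine(params["W_i"], z, params["b_i"])]      # input gate
    c_hat = [math.tanh(v) for v in affine(params["W_c"], z, params["b_c"])]  # candidate state
    o_t = [sigmoid(v) for v in affine(params["W_o"], z, params["b_o"])]      # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# One hidden unit and one input feature, with all weights and biases set to zero:
# every gate then outputs 0.5 and the candidate state is 0.
zero_params: Dict[str, list] = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
zero_params.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([0.3], [0.1], [2.0], zero_params)
```

With zero weights the forget gate halves the previous state, which makes the effect of each gate easy to verify by hand.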

The Improved LSTM Model
The LSTM model takes the historical capacity data of the lithium-ion battery as the input variable to predict the remaining capacity of the battery in the future. Taking the #B5 battery as an example, the length of the capacity data is 168, that is:
$$X = \{X_1, X_2, \ldots, X_{168}\} \tag{10}$$
The original capacity data are then divided into a training set $X_{train}$ and a testing set $X_{test}$ according to the prediction starting point $T$. $X_{train}$ is normalized by the method presented above to obtain $X'_{train}$, namely:
$$X'_{train} = \{x'_1, x'_2, \ldots, x'_T\} \tag{11}$$
In the traditional LSTM prediction method, $XTrain = \{X_1, X_2, \ldots, X_{T-1}\}$ and $YTrain = \{X_2, X_3, \ldots, X_T\}$ are defined as the input and output used to construct an LSTM model. Since there are few data samples, the following data construction method is proposed in order to achieve higher prediction accuracy and make the construction of the model simpler.
Assuming that the width of the split window is $L$ and $m = T - 1$, the input and output of the LSTM model are expressed by Equation (12), where $i$ belongs to $[1, L]$; $x'_i$ represents the normalized data sequence value; $X$ represents the input of the network; and $Y$ represents the output label of the corresponding input.
Based on numerous simulation experiments and the relevant references, the width of the split window is set to 12, i.e., l = n = 12, so m belongs to (5, 15) according to Equations (3) and (12). Under the same conditions, we studied the variation of the RMSE of the LSTM network with m, as shown in Figure 7. As can be seen from Figure 7, the root mean square error of the network is smallest when m is equal to 13, so m = 13 in this paper.
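The difference between the traditional one-step-shift construction and a windowed construction can be illustrated as follows. The first function mirrors the traditional XTrain/YTrain definition given in the text. The second is a common sliding-window scheme shown only as one plausible reading of a split window of width L; it is an assumption and is not necessarily identical to the paper's Equation (12). Both function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def one_step_shift(train: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Traditional construction: input x_1..x_{T-1}, output x_2..x_T."""
    return list(train[:-1]), list(train[1:])


def sliding_windows(train: Sequence[float], width: int) -> Tuple[List[List[float]], List[float]]:
    """Assumed windowed construction: `width` consecutive values predict the next one."""
    if width < 1 or len(train) <= width:
        raise ValueError("need 1 <= width < len(train)")
    n_samples = len(train) - width
    inputs = [list(train[i:i + width]) for i in range(n_samples)]
    labels = [train[i + width] for i in range(n_samples)]
    return inputs, labels


# Toy normalized sequence (illustrative values only).
toy = [0.1, 0.2, 0.3, 0.4, 0.5]
x_shift, y_shift = one_step_shift(toy)
x_win, y_win = sliding_windows(toy, width=3)

# With T = 69 training points and a window of 12, this scheme yields 57 samples.
n_samples_t69 = len(sliding_windows(list(range(69)), width=12)[0])
```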


Results and Discussion
All three NN structures were implemented in MATLAB 2018a. For the LSTM network model in this paper, according to the analysis and pre-experiments mentioned above, l = n = 12, m = 13, α = 8, and the learning rate is set to 0.005. The prediction starting points for #B5 are set to T1 = 69, T2 = 89 and T3 = 109. The prediction is repeated 200 times at each prediction starting point, and runs that do not reach the failure threshold are discarded. Figures 8-10 show the prediction results at the three prediction starting points.

The prediction result of the LSTM at T1 = 69 is shown in Figure 8: the predicted life cut-off point is cycle 124 and the error relative to the actual value is −4 cycles. The prediction result at T2 = 89 is shown in Figure 9: the predicted life cut-off point is now cycle 130 and the error is 2 cycles, which means that the prediction error is gradually decreasing. The prediction result at T3 = 109 is shown in Figure 10: the predicted life cut-off point is cycle 131 and the error is 3 cycles.
The LSTM model achieves good prediction accuracy for the lithium-ion battery capacity, as shown in Figures 8-10. To show the prediction performance of the LSTM model at different prediction starting points more intuitively, the probability distribution histograms (PDH) of the #B5 battery prediction results at the different prediction starting points are shown in Figures 11-13.
The probability distribution histogram and the probability distribution curve of the LSTM model after 200 tests are shown in Figure 11. As can be seen in Figures 8 and 11, because the network weights and biases are initialized randomly, the prediction results differ from test to test. Nevertheless, over the long-term prediction results, the improved LSTM model proposed in this paper approaches the real life cut-off point of the lithium-ion battery with high probability, which verifies the validity of the model. Figure 12 is similar to Figure 11, except that it shows the statistical distribution for #B5 with the improved LSTM model after 200 tests at T2 = 89. As seen in Figure 12, the prediction result is 130, which is very close to the real life cut-off point of the lithium-ion battery.
According to the probability distribution histograms of the prediction results, when 69, 89 and 109 are used as the prediction starting points for predicting the remaining service life of the lithium-ion battery, the predicted life cut-off points of battery #B5 are 124, 130 and 131, with corresponding probabilities of 29%, 35% and 42.5%, respectively. In addition, the normal probability density curve fitted to the prediction results becomes narrower as the prediction starting point increases, which indicates that the accuracy of the prediction is also gradually increasing.
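The statistics behind such a histogram can be sketched as follows: discard the runs that never reach the failure threshold, take the most frequent predicted cut-off cycle and its relative frequency, and describe the fitted normal curve by its mean and standard deviation (a smaller standard deviation corresponds to a narrower curve). The function name `summarize_predictions` and the toy predictions are assumptions made for the example and are not results from the paper.

```python
from collections import Counter
from statistics import mean, stdev
from typing import Dict, Optional, Sequence


def summarize_predictions(eol_predictions: Sequence[Optional[int]]) -> Dict[str, float]:
    """Summarize repeated EOL predictions; None marks a run that never reached the threshold."""
    valid = [p for p in eol_predictions if p is not None]
    if len(valid) < 2:
        raise ValueError("need at least two valid predictions")
    mode_cycle, mode_count = Counter(valid).most_common(1)[0]
    return {
        "n_valid": len(valid),
        "mode": mode_cycle,
        "mode_probability": mode_count / len(valid),
        "mean": mean(valid),   # centre of the fitted normal curve
        "std": stdev(valid),   # spread of the fitted normal curve
    }


# Toy predictions from eight hypothetical runs (illustrative values only).
summary = summarize_predictions([124, 124, 125, 123, None, 124, 126, 124])
```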
To compare the prognostics of lithium-ion batteries based on different ANNs, namely BP, NAR and LSTM, the results of the three algorithms are shown in Figure 14 and Table 1. The parameters for the LSTM and NAR are n = l = 12, m = 13, α = 8 and d = 20.
As can be seen from Table 1, among the three prediction algorithms, the LSTM has the highest prediction accuracy. When 80, 90 and 100 are used as the prediction starting points, the corresponding absolute errors are 13, 8 and 2, and the relative errors are 10.16%, 6.25% and 1.56%, respectively. In terms of the absolute error, relative error and RMSE of the prediction results, the LSTM model is indeed more accurate than the static BP network and the dynamic NAR network.
Comparing the three prediction algorithms shows that the LSTM network is better suited to the time series problem of lithium-ion battery life prediction and has a stronger ability to learn the degradation process. In addition, the NAR ANN has better prognostic accuracy than the BP ANN.

Conclusions
An improved LSTM prediction method for lithium-ion battery RUL estimation is proposed in this paper, and the prognostics of lithium-ion batteries based on the BP ANN, the NAR ANN and the LSTM ANN are compared in detail. The experimental results show that: (1) compared with the static NN prediction model and the dynamic NN prediction model, the LSTM model based on deep learning theory can better capture the dependence in the data and achieves lower prediction error and higher prediction accuracy; (2) the NAR ANN has better prognostic accuracy than the BP ANN.
For lithium-ion batteries, the error and accuracy of the prediction results obtained with the LSTM model differ at different prediction starting points. The analysis of the prediction results for #B5 with the LSTM model shows that the later the prediction starting point, that is, the more samples are used for training, the higher the accuracy of the resulting network model, but also the longer the model takes to train. A balance therefore needs to be struck between prediction accuracy and training time.