Processes
  • Article
  • Open Access

15 January 2025

A Joint Prediction of the State of Health and Remaining Useful Life of Lithium-Ion Batteries Based on Gaussian Process Regression and Long Short-Term Memory

1
School of Mechanical and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
2
Shandong Zhengchen Technology Co., Ltd., Jinan 250101, China
*
Author to whom correspondence should be addressed.
This article belongs to the Section Energy Systems

Abstract

To comprehensively evaluate the current and future aging states of lithium-ion batteries, namely their State of Health (SOH) and Remaining Useful Life (RUL), this paper proposes a joint prediction method based on Gaussian Process Regression (GPR) and Long Short-Term Memory (LSTM) networks. First, health features (HFs) are extracted from partial charging data. Subsequently, these features are fed into the GPR model for SOH estimation, generating SOH predictions. Finally, the estimated SOH values from the initial cycle to the prediction start point (SP) are input into the LSTM network in order to predict the future SOH trajectory, identify the End of Life (EOL), and infer the RUL. Validation on the Oxford Battery Degradation Dataset demonstrates that this method achieves high accuracy in both SOH estimation and RUL prediction. Furthermore, the proposed approach can directly utilize one or more health features without requiring dimensionality reduction or feature fusion. It also enables RUL prediction at the early stages of a battery’s lifecycle, providing an efficient and reliable solution for battery health management. However, this study is based on data from small-capacity batteries and does not yet encompass applications in large-capacity or high-temperature scenarios. Future work will focus on expanding the data scope and validating the model’s performance in real-world systems, driving its application in practical engineering scenarios.

1. Introduction

With the rapid development of electric vehicles and renewable energy storage systems, lithium-ion batteries, as key energy storage devices, have garnered significant attention for their performance and safety [1]. State of Health (SOH) and Remaining Useful Life (RUL) are two crucial parameters for evaluating battery performance [2,3]. Accurately estimating these parameters is essential for ensuring the safe and efficient operation of battery systems, as well as for reducing operational costs and preventing unexpected failures [4].
SOH is defined as the ratio of the current maximum available capacity of the battery to its rated capacity [5], expressed by the following equation:
$$SOH = \frac{Q_{cu}}{Q_{e}} \times 100\%$$
where $Q_{cu}$ represents the current maximum available capacity of the battery and $Q_{e}$ represents the rated capacity of the battery.
When the current maximum available capacity of the battery drops to 80% of its rated capacity, i.e., a SOH of ≤ 80%, the battery reaches its End of Life (EOL), at which point it needs to be replaced. RUL is defined as the expected number of cycles remaining before the battery reaches its EOL, starting from its current state [6], expressed as follows:
$$RUL = cycle_{EOL} - cycle_{cu}$$
where $cycle_{EOL}$ represents the number of cycles at which the battery reaches its EOL and $cycle_{cu}$ represents the current cycle count of the battery.
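As a minimal sketch of the two definitions above, the following Python snippet computes the SOH from a capacity measurement and the RUL as the number of cycles between the current cycle and the first cycle with SOH ≤ 80%. The function names and the capacity values are hypothetical and serve only as an illustration.

```python
def soh_percent(current_capacity_ah, rated_capacity_ah):
    """SOH = Q_cu / Q_e * 100 %."""
    return current_capacity_ah / rated_capacity_ah * 100.0


def remaining_useful_life(soh_by_cycle, current_cycle, eol_threshold=80.0):
    """Return cycle_EOL - cycle_cu, or None if EOL is never reached.

    soh_by_cycle[k] is the SOH (%) at cycle k + 1 (cycles are 1-indexed).
    """
    for cycle, soh in enumerate(soh_by_cycle, start=1):
        if soh <= eol_threshold:
            return cycle - current_cycle
    return None


# Hypothetical capacity measurements (Ah) for a cell rated at 0.74 Ah.
capacities = [0.74, 0.70, 0.66, 0.62, 0.59, 0.55]
soh = [soh_percent(q, 0.74) for q in capacities]
rul_at_cycle_2 = remaining_useful_life(soh, current_cycle=2)
```

In this toy series the SOH first falls to 80% or below at cycle 5, so the RUL seen from cycle 2 is 3 cycles.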
Currently, SOH estimation methods can be categorized into two types: model-based methods and data-driven methods. Model-based methods estimate the SOH by constructing equivalent circuit models for the battery. For example, Schwunk et al. [7] proposed an SOH estimation method based on the PF; Bustos et al. [8] developed an approach using the DF; Ranga et al. [9] introduced a method based on the UKF; Rahimifard et al. [10] proposed an ASVSF-VBL; Fahmy et al. [11] presented a DAUKF-CCA; and Yang et al. [12] also implemented an SOH estimation approach using the UKF. However, due to the complexity of the internal battery environment and the uncertainty of external operating conditions, developing an accurate model remains a significant challenge [13].
In contrast, data-driven methods avoid the complexity of the modeling process by analyzing the historical operating data of batteries and utilizing deep learning or machine learning techniques to estimate the SOH. As a result, these methods have gained significant attention and recognition in recent years [14]. For example, Alberto et al. [15] proposed an SOH estimation method based on an FC-FNN; Rahimian et al. [16] introduced a method using an NN; Safavi et al. [17] developed an approach combining a CNN-LSTM; Lee et al. [18] presented a method utilizing an MNN-LSTM; and Teixeira et al. [19] proposed a GRU-based SOH estimation method.
In data-driven SOH estimation, the selection of health features (HFs) plays a critical role. Jia et al. [20] extracted HFs from the discharge process and used GPR for SOH estimation. However, in practical applications, the discharge conditions of batteries are often difficult to measure accurately, making data collection challenging. In comparison, data collection during the charging process is more convenient. Feng et al. [21] used features such as constant-current and constant-voltage charging times as HFs and employed IGPR for SOH estimation. Similarly, Dai et al. [22] extracted HFs such as constant-current and constant-voltage charging times from the complete charging process and utilized DA-BiLSTM networks for SOH estimation. Liu [23] further considered that batteries are not always charged from zero during the charging process and extracted the constant-current charging time from a state of charge (SOC) of 20% to the end of charging as an HF, using GPR for SOH estimation. However, this method overlooks scenarios where the battery might not be fully charged.
RUL prediction is primarily based on data-driven methods. For instance, Chang et al. [24] used the first 50% or 70% of a battery’s data to train an LSTM model and tested it with the remaining data to predict the RUL. Liu et al. [25] employed the first 50% or 60% of a battery’s data to train a CEEMDAN-PSO-BiGRU model and tested it with the remaining data for RUL prediction. Similarly, Zou et al. [26] utilized the first 40% or 50% of a battery’s data to train a CEEMDAN-PSO-BiGRU model, testing it with the remaining data for RUL prediction. Tang et al. [27] adopted the first 50% of a battery’s data to train a CEEMDAN-IGWO-BiGRU model and tested it with the remaining data for RUL prediction. These RUL prediction methods typically focus on a single battery, using its early-stage data for model training and later-stage data for testing. Since only a portion of the historical data is used, the model is unable to fully learn the complete degradation cycle of the battery, resulting in certain limitations [28]. Furthermore, such methods cannot provide RUL predictions during the early stages of battery usage, which poses constraints in practical applications.
SOH and RUL are both parameters that reflect battery aging, with the SOH representing the current aging state and the RUL indicating future aging trends. To comprehensively estimate the battery’s aging status, it is essential to jointly predict both the SOH and RUL. For example, Li et al. [29] proposed a joint SOH and RUL prediction method based on GPR-LSSVM; Dong et al. [30] developed a method using HKFRVM; and Wang et al. [31] proposed another approach based on GPR-LSSVM for joint SOH and RUL prediction. In these methods, researchers typically extract multiple health features from battery data for SOH estimation and then use dimensionality reduction techniques to fuse these features into a single indirect health feature (IHF), which is used as the input for the SOH estimation model. During the RUL prediction process, these methods rely on predicting IHF values based on cycle numbers and subsequently inferring the SOH and RUL from the predicted IHF. This approach requires minimizing the number of health features, often using only a single composite feature, to simplify the relationship between cycle numbers and features.
Based on this, this paper proposes a joint SOH and RUL prediction method utilizing partial charging data. First, Gaussian Process Regression (GPR) is employed for SOH estimation, followed by Long Short-Term Memory (LSTM) networks for RUL prediction. The proposed method offers the following advantages:
  • It leverages partial charging data, reducing dependence on complete charging data and making the model more suitable for real-world scenarios with incomplete data.
  • The method imposes no restriction on the number of input features, allowing for the flexible selection of health features.
  • It enables RUL prediction during the early stages of battery usage, providing earlier warning capabilities to help extend battery lifespan and prevent unexpected failures.
The remainder of this paper is organized as follows: Section 2 introduces the theoretical foundations, dataset, correlation analysis, model architecture, and evaluation metrics. Section 3 presents the model validation, result analysis, and a discussion. Section 4 summarizes the work of this study.

2. Theoretical Foundation

2.1. GPR

GPR [32] is a non-parametric Bayesian regression method that models data by assuming a certain relationship between data points. GPR excels in handling small sample datasets and providing uncertainty estimates for predictions, making it highly adaptable. The mathematical foundations of GPR are detailed below.
1. Definition of Gaussian Process
A Gaussian process is a stochastic process defined over a function space. Its core idea is to assume that the function values $f(x)$ follow a Gaussian distribution. For any set of inputs $x = \{x_1, x_2, \ldots, x_n\}$, their corresponding function values $f = \{f(x_1), f(x_2), \ldots, f(x_n)\}$ follow a multivariate Gaussian distribution.
A Gaussian process can be denoted as follows:
$$f(x) \sim \mathcal{GP}\left(m(x), K(x, x')\right)$$
where $m(x)$ is the mean function, representing the expected value of the function at input point $x$; $K(x, x')$ is the covariance function (or kernel function), which describes the correlation between two input points $x$ and $x'$. The covariance function is typically chosen to be a symmetric positive definite kernel.
In practical applications, the mean function is often assumed to be $m(x) = 0$, leaving only the covariance function $K(x, x')$ to be specified.
2. Covariance Function and Expectation
The mean function $m(x)$ and the covariance function $K(x, x')$ are defined as follows:
$$m(x) = E[f(x)]$$
$$K(x, x') = E\left[\left(f(x) - m(x)\right)\left(f(x') - m(x')\right)\right]$$
where $E[f(x)]$ represents the expected value of $f(x)$, while $K(x, x')$ represents the covariance between $f(x)$ and $f(x')$, reflecting the correlation between these two input points.
3. Observation Data Model
Consider a set of observation data $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ represents the input and $y_i$ represents the observed value. It is assumed that the observed value is composed of the true function value $f(x_i)$ and independent and identically distributed Gaussian noise $\epsilon_i$, as follows:
$$y_i = f(x_i) + \epsilon_i, \quad \epsilon_i \sim \mathcal{N}(0, \sigma_n^2)$$
These observations can be expressed in vector form as follows:
$$y = f(X) + \epsilon$$
where $y = [y_1, y_2, \ldots, y_n]^{T}$, $f(X) = [f(x_1), f(x_2), \ldots, f(x_n)]^{T}$, and the noise term $\epsilon$ follows a Gaussian distribution with a mean of zero and variance $\sigma_n^2$, denoted as $\epsilon \sim \mathcal{N}(0, \sigma_n^2 I)$.
4. Joint Gaussian Distribution
Since $f(X)$ follows a Gaussian process $\mathcal{GP}(0, K)$, the observed values $y$ follow a joint Gaussian distribution:
$$y \sim \mathcal{N}\left(0, K(X, X) + \sigma_n^2 I\right)$$
5. Prediction for a New Input Point
For a new input $x_*$, the distribution of its corresponding output $f(x_*)$ can be predicted. According to the properties of Gaussian processes, the joint distribution can be expressed as follows:
$$\begin{bmatrix} y \\ f(x_*) \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K(X, X) + \sigma_n^2 I & K(X, x_*) \\ K(x_*, X) & K(x_*, x_*) \end{bmatrix}\right)$$
where $K(X, x_*)$ is the covariance vector between the training points and the test point $x_*$, and $K(x_*, x_*)$ is the covariance of the test point with itself.
Using the properties of conditional Gaussian distributions, the predictive distribution of $f(x_*)$ given $X$, $y$, and $x_*$ follows a Gaussian distribution:
$$f(x_*) \mid X, y, x_* \sim \mathcal{N}\left(\mu(x_*), \sigma^2(x_*)\right)$$
where the mean and variance are given by the following equations:
$$\mu(x_*) = K(x_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} y$$
$$\sigma^2(x_*) = K(x_*, x_*) - K(x_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} K(X, x_*)$$
6. Hyperparameter Optimization and Kernel Function Selection
In practical applications, the kernel function $K(x, x')$ and its hyperparameters are critical to the performance of GPR. Commonly used kernel functions include the Radial Basis Function (RBF) kernel, the Matern kernel, and the linear kernel, among others. The selection of hyperparameters is typically achieved by maximizing the log marginal likelihood, which is expressed as follows:
$$\log p(y \mid X) = -\frac{1}{2} y^{T}\left[K(X, X) + \sigma_n^2 I\right]^{-1} y - \frac{1}{2}\log\left|K(X, X) + \sigma_n^2 I\right| - \frac{n}{2}\log 2\pi$$
By optimizing this marginal likelihood function, the optimal hyperparameter settings can be obtained, thereby improving the regression performance of GPR.
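The predictive mean, predictive variance, and log marginal likelihood described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation used in this study: the function names are hypothetical, a squared exponential kernel with fixed hyperparameters is assumed, and a Cholesky factorization replaces the explicit matrix inverse for numerical stability.

```python
import numpy as np


def sq_exp_kernel(a, b, sigma_f=1.0, length=1.0):
    """Squared exponential kernel matrix between rows of a and rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return sigma_f**2 * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length**2)


def gpr_fit_predict(x_train, y_train, x_test, sigma_n=0.1, length=1.0):
    """Return posterior mean, posterior variance, and log p(y | X)."""
    n = len(x_train)
    k_xx = sq_exp_kernel(x_train, x_train, length=length)   # K(X, X)
    k_sx = sq_exp_kernel(x_test, x_train, length=length)    # K(x*, X)
    k_ss = sq_exp_kernel(x_test, x_test, length=length)     # K(x*, x*)
    # Cholesky factor of K(X, X) + sigma_n^2 I.
    chol = np.linalg.cholesky(k_xx + sigma_n**2 * np.eye(n))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_sx.T)
    mean = k_sx @ alpha
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    lml = -0.5 * y_train @ alpha - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)
    return mean, var, lml


# Hypothetical one-dimensional training data.
x_train = np.linspace(0.0, 1.0, 8)[:, None]
y_train = np.sin(2.0 * np.pi * x_train).ravel()
mean_tr, var_tr, lml = gpr_fit_predict(x_train, y_train, x_train)
mean_far, var_far, _ = gpr_fit_predict(x_train, y_train, np.array([[10.0]]))
```

Far from the training data the prediction reverts to the zero prior mean with the prior variance, whereas at the training inputs the posterior variance cannot exceed the noise variance. In practice, `lml` would be maximized over the kernel hyperparameters with a numerical optimizer.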

2.2. LSTM

LSTM [33] is a specialized type of Recurrent Neural Network (RNN) that excels at capturing long-term dependencies in sequential data. Traditional RNNs often face challenges such as gradient vanishing or exploding when processing long time series. LSTM effectively mitigates these issues through its unique gating mechanisms.
The core of LSTM consists of three gates: the forget gate, the input gate, and the output gate, as shown in Figure 1. These gating mechanisms are responsible for selectively forgetting, updating, and outputting state information, enabling precise control over sequence information. Through these gates, LSTM can selectively retain or discard past information, making it highly effective for modeling time series data.
Figure 1. LSTM architecture diagram.
1. Forget Gate
The forget gate controls whether information from the previous time step is passed to the next time step. Its calculation formula is as follows:
$$f_t = \sigma(w_f \times h_{t-1} + w_f \times x_t + b_f)$$
where $f_t$ is the output of the forget gate, representing the proportion of the previous cell state that is retained at the current time step (values close to 0 mean that more information is forgotten). $w_f$ is the weight matrix of the forget gate, which determines the influence of the previous hidden state and the current input on the forget gate. $h_{t-1}$ is the hidden state from the previous time step, which summarizes the information from the sequence up to the previous time step. $x_t$ is the input data at the current time step. $b_f$ is the bias term of the forget gate. $\sigma$ is the sigmoid activation function, with an output range between 0 and 1, determining the degree of forgetting.
2. Input Gate
The input gate controls the updating of new information at the current time step. It consists of two parts:
Input gate activation:
$$i_t = \sigma(w_i \times h_{t-1} + w_i \times x_t + b_i)$$
where $i_t$ is the output of the input gate, indicating whether new information is accepted. $w_i$ is the weight matrix of the input gate. $b_i$ is the bias term of the input gate.
Candidate values:
$$\tilde{C}_t = \tanh(w_c \times h_{t-1} + w_c \times x_t + b_c)$$
where $\tilde{C}_t$ is the candidate value at the current time step, representing new information that can be added to the cell state. $w_c$ is the weight matrix for the candidate values. $b_c$ is the bias term for the candidate values. $\tanh$ is the hyperbolic tangent activation function, ensuring the candidate values range between −1 and 1.
3. Cell State
The core of LSTM is the cell state $c_t$, which is responsible for carrying long-term dependency information. The update formula for the cell state is as follows:
$$c_t = f_t \times c_{t-1} + i_t \times \tilde{C}_t$$
where $c_t$ is the cell state at the current time step, containing information for long-term memory. $c_{t-1}$ is the cell state from the previous time step. $i_t \times \tilde{C}_t$ represents the update to the cell state by the input gate, determining the new information to be added to the state at the current time step.
4. Output Gate
The output gate controls the final hidden state output. Its calculation formula is as follows:
$$o_t = \sigma(w_o \times h_{t-1} + w_o \times x_t + b_o)$$
where $o_t$ is the output of the output gate, determining the hidden state at the current time step. $w_o$ is the weight matrix of the output gate. $b_o$ is the bias term of the output gate.
5. Hidden State
The final hidden state $h_t$ of the LSTM is calculated through the combination of the output gate and the cell state:
$$h_t = o_t \times \tanh(c_t)$$
where $h_t$ is the hidden state at the current time step, containing key information from the current and previous time steps. $\tanh(c_t)$ is the activation of the current cell state, with an output range between −1 and 1.
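The five steps above can be sketched as a single LSTM time step in NumPy. This is a minimal sketch under one stated assumption: each gate's weight matrix acts on the concatenation of the previous hidden state and the current input, which is the usual reading of the gate equations. The names `lstm_step`, `weights`, and `biases`, the layer size, and the input values are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, weights, biases):
    """One LSTM time step; each gate acts on the concatenation [h_prev, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(weights["f"] @ z + biases["f"])       # forget gate
    i_t = sigmoid(weights["i"] @ z + biases["i"])       # input gate
    c_tilde = np.tanh(weights["c"] @ z + biases["c"])   # candidate values
    c_t = f_t * c_prev + i_t * c_tilde                  # cell state update
    o_t = sigmoid(weights["o"] @ z + biases["o"])       # output gate
    h_t = o_t * np.tanh(c_t)                            # hidden state
    return h_t, c_t


hidden, inputs = 3, 1
rng = np.random.default_rng(0)
weights = {g: rng.normal(size=(hidden, hidden + inputs)) for g in "fico"}
biases = {g: np.zeros(hidden) for g in "fico"}

h, c = np.zeros(hidden), np.zeros(hidden)
for soh_value in [1.00, 0.99, 0.98]:    # hypothetical normalized SOH inputs
    h, c = lstm_step(np.array([soh_value]), h, c, weights, biases)
```

Because the output gate lies in (0, 1) and the hyperbolic tangent lies in (−1, 1), every component of the hidden state stays strictly inside (−1, 1).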

2.3. Dataset Description and Health Feature Extraction

This study uses the Oxford Battery Degradation Dataset as the source of experimental data, selecting four battery cells (Cell1, Cell3, Cell7, and Cell8) for testing. These batteries have a rated capacity of 0.74 Ah and were subjected to aging tests at a constant temperature of 40 °C. After every 100 cycles, a 1C constant current charge–discharge calibration was performed to simulate the aging behavior of the batteries during actual use.
Figure 2 illustrates the voltage curves of the battery at different cycles, using Cell1 as an example. As shown in the figure, as the number of battery cycles increases, both the charging and discharging times gradually shorten, which is visually represented by the voltage curves shifting to the left. This phenomenon indicates that the battery’s SOH degradation trend is consistent with the reduction in charging and discharging times.
Figure 2. Voltage curves under different cycles.
In practice, it is challenging to measure battery discharge information. In contrast, collecting information during the charging process is more convenient. Therefore, this study extracts health features (HFs) from the charging process. However, the charging process may not always start from a state of charge (SOC) of 0%, nor necessarily end at full charge. Ultimately, this study uses partial charging data to extract the following HFs:
(1)
HF1: The constant current charging time in a SOC range of 20% to 80%.
(2)
HF2: The integral of the voltage curve with respect to time within a SOC range of 20% to 80%.
By extracting HFs from partial charging data, the dependence on data completeness is reduced, thereby enhancing the applicability of the proposed method.
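A minimal sketch of how HF1 and HF2 could be computed from one charging segment is given below. It assumes that sampled time, voltage, and SOC values are already available for the segment and that the SOC rises monotonically; how the SOC is obtained is not specified here, and the function name, sample values, and units are hypothetical.

```python
def extract_health_features(time_s, voltage_v, soc, soc_lo=0.2, soc_hi=0.8):
    """Return (HF1, HF2) from samples whose SOC lies in [soc_lo, soc_hi].

    HF1: elapsed charging time across the window (s).
    HF2: trapezoidal integral of voltage over time across the window (V*s).
    """
    window = [(t, v) for t, v, s in zip(time_s, voltage_v, soc)
              if soc_lo <= s <= soc_hi]
    if len(window) < 2:
        raise ValueError("not enough samples inside the SOC window")
    hf1 = window[-1][0] - window[0][0]
    hf2 = sum(0.5 * (v0 + v1) * (t1 - t0)
              for (t0, v0), (t1, v1) in zip(window, window[1:]))
    return hf1, hf2


# Hypothetical charging segment: SOC rises linearly from 0 to 1 in 10 s.
time_s = list(range(11))
soc = [t / 10 for t in time_s]
voltage_v = [3.0 + 0.1 * t for t in time_s]
hf1, hf2 = extract_health_features(time_s, voltage_v, soc)
```

For this linear toy segment the window spans 2 s to 8 s, so HF1 is 6 s and HF2 is the area under the voltage curve over that interval, 21 V·s.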
Figure 3 illustrates the normalized degradation trends of the SOH, HF1, and HF2 with the number of cycles, using Cell1 as an example. The results show that both HF1 and HF2 exhibit clear degradation trends as the cycle count increases. These trends are highly consistent with the degradation of the SOH, indicating that the extracted HFs effectively reflect the battery’s aging state.
Figure 3. Degradation trends of health features and SOH over cycles.

2.4. Correlation Analysis

To evaluate the correlation between HF1, HF2, and the SOH, this paper employs the Pearson correlation coefficient and the Spearman correlation coefficient for correlation analysis.
The Pearson correlation coefficient is defined as follows:
$$Pearson = \frac{E(XY) - E(X)E(Y)}{\sqrt{E(X^2) - E^2(X)}\,\sqrt{E(Y^2) - E^2(Y)}}$$
where X denotes the HF and Y denotes the SOH.
The Spearman correlation coefficient is defined as follows:
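$$Spearman = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2 \sum_{i=1}^{n}(Y_i - \bar{Y})^2}}$$
where $X_i$ and $Y_i$ are the ranks of the $i$-th HF and SOH samples (the Spearman coefficient is the Pearson coefficient computed on ranks); $\bar{X}$ and $\bar{Y}$ are the corresponding average ranks; and $n$ refers to the total number of samples.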
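Both coefficients can be sketched in plain Python as follows. The Spearman coefficient is computed here as the Pearson coefficient of the rank-transformed data, with tied values sharing their average rank. The function names and sample values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


# Hypothetical health feature and SOH samples that decrease together.
hf = [10.0, 9.0, 7.5, 7.0, 4.0]
soh = [100.0, 98.0, 95.0, 94.5, 80.0]
r_pearson = pearson(hf, soh)
r_spearman = spearman(hf, soh)
```

Because the two toy series are strictly monotonic in the same direction, the Spearman coefficient equals 1, while the Pearson coefficient additionally reflects how close the relationship is to linear.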
Table 1 summarizes the Pearson and Spearman correlation coefficients between HF1 and HF2 with the SOH for the four batteries (Cell1, Cell3, Cell7, and Cell8). The results show that all correlation coefficients exceed 99.99%, indicating an extremely high correlation between the extracted health features and the SOH. This further validates that HF1 and HF2 effectively represent the battery’s health status.
Table 1. Correlation coefficient analysis.

2.5. Model Structure and Parameter Settings

The structure of the proposed model is shown in Figure 4. As illustrated, the model consists of two parts: SOH estimation and RUL prediction. This structure not only enables the accurate estimation of the battery’s SOH but also facilitates the prediction of its RUL, thereby providing comprehensive state monitoring and decision support for Battery Management Systems (BMSs). All training and testing in this study were conducted in the MATLAB 2023a environment.
Figure 4. Schematic diagram of the SOH estimation and RUL prediction model architecture.

2.5.1. SOH Estimation

The SOH estimation part is divided into two phases: offline training and online testing.
1. Offline Training Phase:
In this phase, HF1 and HF2 were used as input variables, and the SOH was set as the target variable to construct and train the GPR model. The constructed GPR model adopted a squared exponential kernel function, which is mathematically expressed as follows:
$$K(x_i, x_j) = \sigma_f^2 \exp\left(-\frac{\|x_i - x_j\|^2}{2l^2}\right)$$
where $x_i$ and $x_j$ represent the input health features, $\sigma_f$ is the signal amplitude, and $l$ is the length-scale hyperparameter.
To minimize the impact of differences in feature scales on model training, the input variables HF1 and HF2, and the output variable SOH were normalized to the range [0, 1], using the following formula:
$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$$
where $x$ is the original feature value, $x'$ is the normalized feature value, and $\min(x)$ and $\max(x)$ represent the minimum and maximum values of the feature, respectively.
2. Online Testing Phase:
In this phase, HF1 and HF2 were input into the trained GPR model to obtain the estimated values of the SOH.
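The min-max normalization used in the offline phase can be sketched as follows. Reusing the minimum and maximum fitted on the training data when scaling the online test data is an assumption of this sketch, not something stated above; the function names and feature values are hypothetical.

```python
def fit_min_max(values):
    """Return (min, max) of a training feature column."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant feature cannot be min-max normalized")
    return lo, hi


def apply_min_max(values, lo, hi):
    """Map values with x' = (x - min) / (max - min)."""
    return [(v - lo) / (hi - lo) for v in values]


train_hf1 = [3000.0, 2800.0, 2500.0]    # hypothetical HF1 values (s)
lo, hi = fit_min_max(train_hf1)
train_scaled = apply_min_max(train_hf1, lo, hi)
test_scaled = apply_min_max([2600.0], lo, hi)
```

With training bounds reused in this way, a test value outside the training range maps outside [0, 1], which is worth checking before the value is passed to the GPR model.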

2.5.2. RUL Prediction

In the RUL prediction part, the actual SOH data were first used to train the LSTM model. The model consists of four LSTM layers and one fully connected (FC) layer. Each LSTM layer contains 20 hidden nodes, and the FC layer has one node. The model was trained with the Adam optimizer at an initial learning rate of 0.001 for a maximum of 3000 iterations, and the input sequence length was set to 2.
To enhance the training performance of the model, the SOH data were standardized. The standardized SOH data were used to generate the input and output sequences for the model. Specifically, for each input, two consecutive standardized SOH values $[SOH_t, SOH_{t+1}]$ were used as the input sequence, and the SOH value at the next time step, $SOH_{t+2}$, was used as the output.
Once the model training was completed, the SOH estimates obtained from the GPR model were arranged into a sequence from cycle 1 to the current SP: $X_{input} = \{SOH_1, SOH_2, \ldots, SOH_{SP}\}$. This sequence was used as the input to the LSTM model to predict the future SOH values. The first predicted cycle at which SOH ≤ 80% was identified as the EOL, and the RUL at the current time was calculated as $RUL = EOL_{cycle} - SP_{cycle}$.
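The windowing and roll-forward procedure described above can be sketched as follows. A simple linear extrapolator stands in for the trained LSTM so that the example is self-contained; it is a hypothetical placeholder, not the model used in this study. The SOH values are also hypothetical and are kept in percent rather than standardized for readability.

```python
def make_windows(series, seq_len=2):
    """Pair each run of seq_len values with the value that follows it."""
    inputs = [series[i:i + seq_len] for i in range(len(series) - seq_len)]
    targets = [series[i + seq_len] for i in range(len(series) - seq_len)]
    return inputs, targets


def predict_rul(soh_history, one_step_model, seq_len=2,
                eol_threshold=80.0, max_horizon=10000):
    """Roll the model forward from the SP until SOH <= eol_threshold.

    Returns EOL_cycle - SP_cycle, or None if EOL is not reached in max_horizon.
    """
    window = list(soh_history[-seq_len:])
    for steps_ahead in range(1, max_horizon + 1):
        next_soh = one_step_model(window)
        if next_soh <= eol_threshold:
            return steps_ahead
        window = window[1:] + [next_soh]   # feed the prediction back in
    return None


def linear_extrapolator(window):
    """Hypothetical stand-in for the trained LSTM: continue the last slope."""
    return window[-1] + (window[-1] - window[-2])


soh_up_to_sp = [100.0, 99.5, 99.0, 98.5]    # hypothetical GPR estimates (%)
inputs, targets = make_windows(soh_up_to_sp)
rul = predict_rul(soh_up_to_sp, linear_extrapolator)
```

Each predicted value is appended to the input window for the next step, so prediction errors can accumulate over long horizons; this is one reason why the choice of SP influences the RUL error.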

2.6. Evaluation Metrics

To evaluate the accuracy of the model’s predictions, the root mean square error (RMSE) and mean absolute error (MAE) were selected as evaluation metrics. The specific calculation methods are as follows:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
where $\hat{y}_i$ represents the predicted value for the $i$-th sample; $y_i$ represents the actual observed value for the $i$-th sample; and $n$ denotes the number of samples.
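The two metrics translate directly into code. The following sketch uses hypothetical predicted and actual SOH values in percent.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def mae(predicted, actual):
    """Mean absolute error between two equally long sequences."""
    n = len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / n


predicted = [99.0, 97.0, 95.0]    # hypothetical SOH estimates (%)
actual = [100.0, 97.0, 93.0]      # hypothetical true SOH values (%)
rmse_value = rmse(predicted, actual)
mae_value = mae(predicted, actual)
```

The errors here are −1, 0, and 2, so the MAE is 1 while the RMSE is larger because squaring gives more weight to the largest error.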

3. Experiment and Analysis

3.1. Experiment 1

In Experiment 1, the leave-one-out method was used to validate the model, where one battery was selected as the test set while the remaining three batteries were used as the training set. This process was repeated for all batteries in the dataset. This method maximizes the utilization of the dataset and evaluates the model’s performance under different dataset combinations.

3.1.1. Analysis of SOH Estimation Results for Experiment 1

Table 2 presents the results of SOH estimation. Taking Cell8 as an example, the RMSE and MAE estimated by the GPR model are 0.0632% and 0.0531%, respectively; for the SVM model, the RMSE is 0.3458% and the MAE is 0.2686%; for the Decision Tree (DT) model, the RMSE is 0.2276% and the MAE is 0.1867%; and for the Ridge Regression (RR) model, the RMSE is 0.4197% and the MAE is 0.3629%. According to the data in the table, it can be observed that the proposed GPR model achieves the highest estimation accuracy across all test sets compared to the other three models.
Table 2. SOH Estimation Errors under Different Models in Experiment 1.
Figure 5 and Figure 6 further illustrate the SOH estimation results and the corresponding absolute errors for the four batteries. Combining Figure 5 and Figure 6, and Table 2, it can be observed that the GPR model exhibits lower variability in absolute error compared to the other three models, and its estimated values can better track the true values.
Figure 5. SOH estimation results for different models.
Figure 6. Absolute errors of SOH estimation for different models.

3.1.2. Analysis of RUL Prediction Results for Experiment 1

Table 3 summarizes the error results of RUL prediction for the four batteries and compares the performance of the LSTM and BiLSTM models. The results show that the LSTM model achieves an average RMSE of 2.1808 and an average MAE of 1.9382, while the BiLSTM model’s average RMSE and MAE are 2.3490 and 2.0742, respectively. The proposed LSTM model maintains low RMSE and MAE levels across all test sets and outperforms the BiLSTM model overall.
Table 3. RUL Prediction Errors under Different Models in Experiment 1.
Figure 7 and Figure 8 illustrate the RUL prediction results and the corresponding absolute errors, respectively. As shown in the figures, compared to the BiLSTM model, the LSTM model exhibits smaller fluctuations in prediction error and produces prediction curves that are closer to the true values. This indicates that the LSTM model has a greater advantage in capturing the degradation trend of battery RUL.
Figure 7. RUL prediction results.
Figure 8. RUL prediction errors.

3.2. Experiment 2

In Experiment 1, we validated the model’s performance using the leave-one-out method. To further verify the model’s robustness and wide applicability, a different experimental design was adopted in Experiment 2. In this experiment, two batteries were selected as the training set, and the other two batteries were used as the test set, allowing us to evaluate the model’s performance under different training and testing dataset combinations.
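The two experimental designs can be enumerated with a few lines of Python. This is only an illustrative sketch of the splits; the variable names are hypothetical.

```python
from itertools import combinations

cells = ["Cell1", "Cell3", "Cell7", "Cell8"]

# Experiment 1: leave-one-out, each cell is the test set once.
leave_one_out = [([c for c in cells if c != held_out], [held_out])
                 for held_out in cells]

# Experiment 2: every pair of cells trains, the other pair tests.
two_vs_two = [(list(train), [c for c in cells if c not in train])
              for train in combinations(cells, 2)]
```

With four cells, the leave-one-out design yields four train/test splits and the two-versus-two design yields six.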

3.2.1. Analysis of SOH Estimation Results for Experiment 2

Table 4 presents the SOH estimation errors of different models on the test set under six different training set combinations. The results show that, despite variations in the choice of training sets, the GPR model consistently maintains high estimation accuracy, with an RMSE generally below 0.12% and an MAE below 0.1%. This indicates that the proposed method exhibits strong generalization ability under varying data conditions.
Table 4. SOH Estimation Errors under Different Models in Experiment 2.
Figure 9 and Figure 10 further illustrate the SOH estimation results and their corresponding absolute errors when using different training set combinations. Combined with the data in Table 4, it can be observed that whether Cell1 and Cell3 or Cell7 and Cell8 are used as the training set, the GPR model consistently achieves accurate SOH estimation for the test set batteries, demonstrating high estimation accuracy.
Figure 9. SOH estimation results and errors for different models when Cell1 and Cell3 are used as the training set.
Figure 10. SOH estimation results and errors for different models when Cell7 and Cell8 are used as the training set.

3.2.2. Analysis of RUL Prediction Results for Experiment 2

Table 5 summarizes the RUL prediction errors of the LSTM and BiLSTM models under six different training/testing set combinations. Overall, the LSTM model achieves an average RMSE of 1.4286 and an average MAE of 1.1502, while the BiLSTM model has an average RMSE of 1.4583 and an average MAE of 1.2080. The LSTM model therefore has slightly lower average errors than the BiLSTM model on both metrics, indicating comparable or better prediction accuracy and stability.
Table 5. RUL Prediction Errors under Different Models in Experiment 2.
Figure 11 and Figure 12 show the RUL prediction results and their corresponding absolute errors under different training set combinations. Combined with the data in Table 5, it can be observed that, although different training set combinations have a slight impact on the prediction results, the overall errors remain small, further demonstrating the robustness and reliability of the model.
Figure 11. RUL prediction results and errors when Cell1 and Cell3 are used as the training set.
Figure 12. RUL prediction results and errors when Cell7 and Cell8 are used as the training set.

3.3. Validation on the University of Maryland Battery Dataset

To further evaluate the performance and generalization ability of the proposed model, validation experiments were conducted using the University of Maryland battery dataset. The experiments selected data from two batteries, CS36 and CS37, where one battery’s data were used for model training and the other for testing. The battery type in this dataset is LiCoO2, with a rated capacity of 1.1 Ah. A notable characteristic of this dataset is that the SOH degradation curve is relatively complex, exhibiting significant capacity regeneration phenomena and high noise levels.
During SOH estimation, it is crucial to reasonably account for capacity regeneration to ensure the accuracy of SOH estimation. However, in RUL prediction, as noted in references [20,21,28,30,34,35,36], the true RUL values follow a monotonically decreasing trend, meaning that capacity regeneration should not affect RUL predictions. In other words, capacity regeneration does not actually extend the battery’s RUL. Therefore, to eliminate the influence of noise and capacity regeneration on RUL prediction, the SOH data need to be smoothed.
In this section, LOWESS filtering [37] was applied to smooth the SOH data, with a smoothing factor set to 0.1. During the LSTM model training phase, the smoothed true SOH data were used as input to ensure the model captured the true degradation trend of the SOH. In the testing phase, the SOH data estimated by the GPR model were also smoothed using LOWESS before being input into the LSTM model for prediction. The input sequence length for the LSTM model was set to 10. By modeling the time-series characteristics of the SOH, the model achieved accurate future SOH predictions, which were then used to infer the battery’s RUL.
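A simplified LOWESS smoother is sketched below to illustrate the idea. It assumes that the smoothing factor of 0.1 denotes the fraction of samples used in each local fit, which is an interpretation rather than something stated above. It also omits the robustness iterations of the full LOWESS algorithm, so it is not a substitute for a library implementation. The function name and the synthetic SOH series are hypothetical.

```python
import numpy as np


def lowess_smooth(y, frac=0.1):
    """Local linear regression with tricube weights (no robustness iterations)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    if n < 3:
        raise ValueError("need at least three samples")
    x = np.arange(n, dtype=float)
    k = min(n, max(3, int(np.ceil(frac * n))))   # samples per local window
    smoothed = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist, kind="stable")[:k]     # k nearest neighbours
        d_max = dist[idx].max()
        w = (1.0 - (dist[idx] / d_max) ** 3) ** 3     # tricube weights
        # Weighted least squares fit of a local line a + b * x.
        sw = np.sqrt(w)
        design = np.column_stack([np.ones(k), x[idx]])
        coef = np.linalg.lstsq(design * sw[:, None], y[idx] * sw, rcond=None)[0]
        smoothed[i] = coef[0] + coef[1] * x[i]
    return smoothed


# Hypothetical noisy SOH trajectory (%) over 200 cycles.
rng = np.random.default_rng(0)
cycles = np.arange(200)
noisy = 100.0 - 0.05 * cycles + rng.normal(scale=0.3, size=200)
smoothed = lowess_smooth(noisy, frac=0.1)
```

Because each point is replaced by the value of a locally fitted line, short-lived bumps such as capacity regeneration are damped while the overall degradation slope is preserved.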

3.3.1. Analysis of SOH Estimation Results

Table 6 presents the error results of SOH estimation for three models, and Figure 13 further illustrates the estimated SOH and the corresponding absolute errors. The results show that compared to the Oxford Battery Dataset, the SOH estimation accuracy on the University of Maryland Battery Dataset has decreased. This is primarily due to the presence of significant capacity regeneration phenomena and higher noise levels in this dataset.
Table 6. SOH estimation error results.
Figure 13. SOH estimation results and errors for different models.
However, the proposed model is still able to effectively capture the overall degradation trend of the SOH and demonstrates good tracking performance during the capacity regeneration phase. Based on the average RMSE and MAE, the GPR model achieves the highest estimation accuracy, with an average RMSE of 1.0247% and an average MAE of 0.8158%. This indicates that GPR maintains strong adaptability and robustness in handling complex and noisy data, making it capable of accurately estimating battery SOH.

3.3.2. Analysis of RUL Prediction Results

Table 7 summarizes the true values, predicted values, and corresponding errors of RUL prediction for the University of Maryland Battery Dataset (CS36 and CS37) under different SPs. Figure 14 further illustrates the SOH prediction results.
Table 7. RUL prediction results at different starting points.
Figure 14. SOH prediction results at different starting points.
For the CS36 battery, the prediction error is smallest at SP = 250, where the true RUL is 240, the predicted RUL is 256, the absolute error is only 16, and the relative error is 6.6667%, demonstrating high prediction accuracy. However, at SP = 350, the prediction error increases, with an absolute error of 39 and a relative error of 27.8571%.
For the CS37 battery, the model exhibits higher prediction accuracy at SP = 350, where the true RUL is 207, the predicted RUL is 199, the absolute error is 8, and the relative error is only 3.8647%. In contrast, at an earlier starting point (SP = 250), the error is relatively larger, with an absolute error of 38 and a relative error of 12.3779%.
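The error figures quoted for the two cells can be checked with a short calculation. The sketch below assumes the relative error is the absolute error divided by the true RUL, expressed in percent; this definition is inferred from the reported numbers rather than stated explicitly in this section, and the function name is hypothetical.

```python
def rul_errors(true_rul, predicted_rul):
    """Return (absolute error in cycles, relative error in percent)."""
    abs_err = abs(predicted_rul - true_rul)
    rel_err = 100.0 * abs_err / true_rul
    return abs_err, rel_err


# Values quoted in the text above.
cs36_sp250 = rul_errors(240, 256)  # CS36, SP = 250
cs37_sp350 = rul_errors(207, 199)  # CS37, SP = 350
print(cs36_sp250, cs37_sp350)
```

Under this definition the two calls give absolute errors of 16 and 8 cycles and relative errors that round to 6.6667% and 3.8647%, matching the reported values.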
The results indicate that, due to the complexity of the SOH degradation curve and the higher noise levels in this dataset, the model’s RUL prediction accuracy is slightly lower than on the Oxford Battery Dataset, although it still captures the RUL variation trends effectively.
Despite the high noise levels and capacity regeneration in the University of Maryland battery dataset, which pose challenges to prediction accuracy, the proposed GPR-LSTM joint model still achieves relatively high prediction accuracy at specific starting points. This indicates that the model demonstrates a certain level of adaptability under complex conditions. Future research can further improve the model’s noise resistance to better address the uncertainties in complex datasets.

4. Conclusions and Future Work

This paper proposes a joint prediction method based on partial charging data for the SOH estimation and RUL prediction of lithium-ion batteries. By integrating GPR and LSTM, the method achieves a comprehensive evaluation of the current state and future aging trends of batteries. The proposed approach relies only on partial charging data, reducing the need for complete charging data while ensuring high prediction accuracy, making it particularly suitable for practical scenarios with incomplete data. Furthermore, the model flexibly handles multiple health features and learns from complete degradation cycle data, effectively improving the accuracy of SOH estimation. By using the LSTM model for RUL prediction, the method provides RUL predictions at the early stages of a battery’s lifecycle, offering early warning capabilities for BMSs, which help extend battery lifespan and reduce the risk of sudden failures.
Despite achieving favorable prediction performance, this study has certain limitations that require further optimization and resolution in future research:
  • The study is validated based on open-source datasets, specifically the Oxford Battery Dataset, which mainly includes small-capacity (0.74 Ah) battery data under an operating temperature of 40 °C. It does not cover large-capacity batteries (e.g., above 100 Ah) or high-temperature environments (e.g., above 60 °C). Future work should expand to larger capacity and more complex operating conditions to validate the model’s effectiveness and generalization ability in real-world engineering applications.
  • In practical engineering applications, the measurement precision of BMSs and Energy Management Systems (EMSs) may involve uncertainties, potentially introducing additional noise that could affect SOH estimation and RUL prediction performance. Therefore, future research should focus on developing noise-resistant models to enhance robustness under complex measurement conditions.

Author Contributions

Resources, Y.S.; writing—original draft, X.L.; writing—review and editing, M.Z.; visualization, W.B. and H.L.; supervision, M.Z.; funding acquisition, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shandong Province Key Research and Development Plan [Grant No.: 2022KJHZ002], the Shandong Province Undergraduate Teaching Reform Research Project [Grant No.: M2020202], and the Shandong Province Science and Technology Small and Medium-sized Enterprises Innovation Capacity Enhancement Project [Grant No.: 2023TSGC0173].

Data Availability Statement

The data presented in this study are available at https://doi.org/10.3390/pr12091871 (accessed on 6 November 2024) [A Review on Lithium-Ion Battery Modeling from Mechanism-Based and Data-Driven Perspectives] [Processes 2024, 12, 1871].

Conflicts of Interest

Yuanyuan Song was employed by Shandong Zhengchen Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

SOH   State of Health              FC    Fully Connected
RUL   Remaining Useful Life        RMSE  Root Mean Square Error
HF    Health Feature               MAE   Mean Absolute Error
GPR   Gaussian Process Regression  SOC   State of Charge
LSTM  Long Short-Term Memory       IHF   Indirect Health Feature
SP    Start Point                  DT    Decision Tree
EOL   End of Life                  RR    Ridge Regression
RBF   Radial Basis Function        BMS   Battery Management System
RNN   Recurrent Neural Networks    EMS   Energy Management System

References

  1. Kumar, R.R.; Bharatiraja, C.; Udhayakumar, K.; Devakirubakaran, S.; Sekar, K.S.; Mihet-Popa, L. Advances in Batteries, Battery Modeling, Battery Management System, Battery Thermal Management, SOC, SOH, and Charge/Discharge Characteristics in EV Applications. IEEE Access 2023, 11, 105761–105809.
  2. Ghalkhani, M.; Habibi, S. Review of the Li-Ion Battery, Thermal Management, and AI-Based Battery Management System for EV Application. Energies 2023, 16, 185.
  3. Nyamathulla, S.; Dhanamjayulu, C. A review of battery energy storage systems and advanced battery management system for different applications: Challenges and recommendations. J. Energy Storage 2024, 86, 111179.
  4. Nazaralizadeh, S.; Banerjee, P.; Srivastava, A.K.; Famouri, P. Battery Energy Storage Systems: A Review of Energy Management Systems and Health Metrics. Energies 2024, 17, 1250.
  5. Pradhan, S.K.; Chakraborty, B. Battery management strategies: An essential review for battery state of health monitoring techniques. J. Energy Storage 2022, 51, 104427.
  6. Ansari, S.; Ayob, A.; Lipu, M.H.; Hussain, A.; Saad, M.H.M. Remaining useful life prediction for lithium-ion battery storage system: A comprehensive review of methods, key factors, issues and future outlook. Energy Rep. 2022, 8, 12153–12185.
  7. Schwunk, S.; Armbruster, N.; Straub, S.; Kehl, J.; Vetter, M. Particle filter for state of charge and state of health estimation for lithium-iron phosphate batteries. J. Power Sources 2013, 239, 705–710.
  8. Bustos, R.; Gadsden, S.A.; Malysz, P.; Al-Shabi, M.; Mahmud, S. Health Monitoring of Lithium-Ion Batteries Using Dual Filters. Energies 2022, 15, 2230.
  9. Ranga, M.R.; Aduru, V.R.; Krishna, N.V.; Rao, K.D.; Dawn, S.; Alsaif, F.; Alsulamy, S.; Ustun, T.S. An Unscented Kalman Filter-Based Robust State of Health Prediction Technique for Lithium-Ion Batteries. Batteries 2023, 9, 376.
  10. Rahimifard, S.; Habibi, S.; Goward, G.; Tjong, J. Adaptive Smooth Variable Structure Filter Strategy for State Estimation of Electric Vehicle Batteries. Energies 2021, 14, 8560.
  11. Fahmy, H.M.; Hasanien, H.M.; Alsaleh, I.; Ji, H.; Alassaf, A. State of health estimation of lithium-ion battery using dual adaptive unscented Kalman filter and Coulomb counting approach. J. Energy Storage 2024, 88, 111557.
  12. Yang, Q.; Ma, K.; Xu, L.; Song, L.; Li, X.; Li, Y. A Joint Estimation Method Based on Kalman Filter of Battery State of Charge and State of Health. Coatings 2022, 12, 1047.
  13. Bai, J.Q.; Huang, J.Y.; Luo, K.; Yang, F.; Xian, Y. A feature reuse based multi-model fusion method for state of health estimation of lithium-ion batteries. J. Energy Storage 2023, 70, 107965.
  14. Zheng, D.; Man, S.; Guo, X.F.; Ning, Y. Joint prediction of state of health and remaining useful life for lithium-ion batteries based on health features optimization and multi-model fusion. Ionics 2024, 30, 6239–6252.
  15. Barragán-Moreno, A.; Schaltz, E.; Gismero, A.; Stroe, D.-I. Capacity State-of-Health Estimation of Electric Vehicle Batteries Using Machine Learning and Impedance Measurements. Electronics 2022, 11, 1414.
  16. Rahimian, S.K.; Tang, Y.F. A Practical Data-Driven Battery State-of-Health Estimation for Electric Vehicles. IEEE Trans. Ind. Electron. 2023, 70, 1973–1982.
  17. Safavi, V.; Bazmohammadi, N.; Vasquez, J.C.; Guerrero, J.M. Battery State-of-Health Estimation: A Step towards Battery Digital Twins. Electronics 2024, 13, 587.
  18. Lee, J.-H.; Lee, I.-S. Estimation of Online State of Charge and State of Health Based on Neural Network Model Banks Using Lithium Batteries. Sensors 2022, 22, 5536.
  19. Teixeira, R.S.D.; Calili, R.F.; Almeida, M.F.; Louzada, D.R. Recurrent Neural Networks for Estimating the State of Health of Lithium-Ion Batteries. Batteries 2024, 10, 111.
  20. Jia, J.F.; Liang, J.Y.; Shi, Y.; Wen, J.; Pang, X.; Zeng, J. SOH and RUL Prediction of Lithium-Ion Batteries Based on Gaussian Process Regression with Indirect Health Indicators. Energies 2020, 13, 375.
  21. Feng, H.L.; Shi, G.L. SOH and RUL prediction of Li-ion batteries based on improved Gaussian process regression. J. Power Electron. 2021, 21, 1845–1854.
  22. Dai, J.Y.; Xia, M.C.; Chen, Q.F. Encoding and Decoding Model of State of Health Estimation and Remaining Useful Life Prediction for Batteries Based on Dual-stage Attention Mechanism. Autom. Electr. Power Syst. 2023, 47, 168–177.
  23. Liu, P.; Li, Z.W.; Cai, Y.S.; Wang, W.; Xia, X.Y. Joint Estimation Method of SOC and SOH Based on the Fusion of Equivalent Circuit Model and Data-driven Model. Trans. China Electrotech. Soc. 2023, 39, 3232–3243.
  24. Chang, Y.H.; Hsieh, Y.C.; Chai, Y.H.; Lin, H.-W. Remaining Useful-Life Prediction for Li-Ion Batteries. Energies 2023, 16, 3096.
  25. Liu, L.; Sun, W.; Yue, C.; Zhu, Y.; Xia, W. Remaining Useful Life Estimation of Lithium-Ion Batteries Based on Small Sample Models. Energies 2024, 17, 4932.
  26. Zou, L.; Wen, B.; Wei, Y.; Zhang, Y.; Yang, J.; Zhang, H. Online Prediction of Remaining Useful Life for Li-Ion Batteries Based on Discharge Voltage Data. Energies 2022, 15, 2237.
  27. Tang, X.; Wan, H.; Wang, W.; Gu, M.; Wang, L.; Gan, L. Lithium-Ion Battery Remaining Useful Life Prediction Based on Hybrid Model. Sustainability 2023, 15, 6261.
  28. Yin, J.; Liu, B.; Sun, G.B.; Qian, X.W. Transfer Learning Denoising Autoencoder-Long Short Term Memory for Remaining Useful Life Prediction of Li-Ion Batteries. Trans. China Electrotech. Soc. 2024, 39, 289–302.
  29. Li, D.H.; Liu, X.; Cheng, Z. The co-estimation of states for lithium-ion batteries based on segment data. J. Energy Storage 2023, 62, 106787.
  30. Dong, H.; Mao, L.; Qu, K.; Zhao, J.; Li, F.; Jiang, L. State of Health Estimation and Remaining Useful Life Estimation for Li-ion Batteries Based on a Hybrid Kernel Function Relevance Vector Machine. Int. J. Electrochem. Sci. 2021, 17, 221135.
  31. Wang, P.; Fan, L.F.; Cheng, Z. A Joint State of Health and Remaining Useful Life Estimation Approach for Lithium-ion Batteries Based on Health Factor Parameter. Proc. CSEE 2022, 42, 1523–1534.
  32. Sahinoglu, G.O.; Pajovic, M.; Sahinoglu, Z.; Wang, Y.; Orlik, P.V.; Wada, T. Battery State of Charge Estimation Based on Regular/Recurrent Gaussian Process Regression. IEEE Trans. Ind. Electron. 2018, 65, 4311–4321.
  33. Tiane, A.; Okar, C.; Alzayed, M.; Chaoui, H. Comparing Hybrid Approaches of Deep Learning for Remaining Useful Life Prognostic of Lithium-Ion Batteries. IEEE Access 2024, 12, 70334–70344.
  34. Ezzouhri, A.; Charouh, Z.; Ghogho, M.; Guennoun, Z. A Data-Driven-Based Framework for Battery Remaining Useful Life Prediction. IEEE Access 2023, 11, 76142–76155.
  35. Zhang, Q.Y.; Cheng, Z.; Liu, X. RUL Estimation Method for Lithium-ion Batteries Based on Multi-dimensional and Multi-scale Features. Automot. Eng. 2024, 46, 1897–1903.
  36. Najera-Flores, D.A.; Hu, Z.; Chadha, M.; Todd, M.D. A Physics-Constrained Bayesian neural network for battery remaining useful life prediction. Appl. Math. Model. 2023, 122, 42–59.
  37. Ma, T.; Xu, J.; Li, R.; Yao, N.; Yang, Y. Online Short-Term Remaining Useful Life Prediction of Fuel Cell Vehicles Based on Cloud System. Energies 2021, 14, 2806.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
