Article

Comparative Study of Recurrent Neural Networks for Electric Vehicle Battery Health Assessment

1 Electrical and Electronics Engineering Department, GL Bajaj Institute of Technology & Management, Greater Noida 201306, India
2 Electronics and Communication Engineering Department, GL Bajaj Institute of Technology & Management, Greater Noida 201306, India
3 Electrical and Electronics Engineering Department, Krishna Institute of Engineering & Technology (KIET), Ghaziabad 201206, India
* Author to whom correspondence should be addressed.
World Electr. Veh. J. 2026, 17(4), 178; https://doi.org/10.3390/wevj17040178
Submission received: 14 February 2026 / Revised: 20 March 2026 / Accepted: 24 March 2026 / Published: 26 March 2026
(This article belongs to the Section Storage Systems)

Abstract

Precise assessment of battery state of health (SoH) is vital for ensuring consistent performance and enabling timely maintenance of energy storage systems. This work compares different deep learning methods for learning and predicting the complex, nonlinear dynamics of battery degradation. The models are developed and tested for SoH prediction using sequential degradation data from batteries. Their effectiveness is assessed using metrics such as RMSE, MAE and R2, along with qualitative analysis. The experimental results show that the BiLSTM model performs better than the others: it achieves the lowest RMSE (0.90), the lowest MAE (0.72), and the highest R2 (0.99), which highlights its enhanced ability to capture long-term temporal dependencies. The proposed models are validated on the NASA lithium-ion battery aging dataset (B0005), which is widely used as a benchmark in battery health prediction studies. Overall, the findings indicate that bidirectional network architectures significantly improve the accuracy and consistency of SoH predictions compared to unidirectional models.

Graphical Abstract

1. Introduction

Lithium-ion batteries are widely used in electric vehicles, electronics and energy storage systems because of merits such as high energy density and long life. However, batteries degrade during operation, which can lead to capacity loss and safety issues. Therefore, accurate estimation of the battery state of health (SoH) is necessary for safe operation, maintenance and energy management [1,2]. Traditionally, battery health has been estimated using equivalent circuit models, but these require accurate calibration of internal parameters. Data-driven methods instead use measurements such as voltage and current to learn degradation behavior [3]. Recent research explores recurrent networks and hybrid models to improve prediction accuracy [2].

1.1. Literature Review

1.1.1. Review-Based Studies

The field of battery health management has evolved from basic to advanced techniques. Research reported in [4,5] highlighted the global need for ML-based approaches for precise SoC monitoring of lithium-ion technology. The challenges of generalization and transferability were discussed in [6,7]. Ref. [8] reported the key issues and major research progress related to end-to-end decision making for autonomous vehicles using ML-based, probabilistic and mixed-method prediction. Recent research indicates that conventional machine learning techniques are being replaced by deep learning systems that can perform real-time online SoH estimation [9]. An adaptive and intelligent BMS was introduced in [10], where an intelligent SOX network continuously updates itself as the battery ages. Ref. [11] envisioned the future BMS as a fully self-adjusting system that updates its parameters automatically to ensure long-term safety and performance.

1.1.2. Model-Based Studies

Early efforts to measure battery health mainly focused on internal physical parameters; for example, [12] showed that internal resistance could be obtained directly from electric vehicle operational data, offering a key model-based degradation indicator. As computing capabilities improved, the focus moved towards data-driven approaches. Earlier industrial methods depended on basic voltage-based lookup tables for estimating state of health (SoH), but these methods are not effective at capturing the nonlinear nature of battery aging [13]. For systems with limited computing power, ref. [14] shows that reduced-order electrochemical models can offer an alternative without the high computational demands of deep neural networks.

1.1.3. Hybrid Studies

To enhance these models further, studies by [2,15] emphasized the advantages of gated recurrent units (GRUs), which provide accuracy comparable to LSTMs at lower computational cost. Recent studies support hybrid and structural innovations. The authors in ref. [16] integrated a CNN for extracting spatial features with an LSTM for tracking temporal patterns and developed a strong multimodal network. Similarly, ref. [17] proposed temporal convolutional networks (TCNs) as a more stable and adaptable alternative to conventional RNNs. To address issues arising from noisy real-world data, ref. [18] used empirical mode decomposition to clean battery signals, allowing LSTM networks to concentrate on actual degradation patterns rather than temporary disturbances. Ref. [19] stressed the value of combining multiple models to keep estimation accuracy consistent across different operating conditions. Ref. [20] reported that attention mechanisms significantly exceed typical recurrent neural networks by focusing on the most relevant health-related features within long sequences. A multi-head attention BiLSTM network has been designed to predict the SoC of lithium-ion batteries in [21]. Ref. [22] indicated that metaheuristic approaches such as particle swarm optimization (PSO) can tune neural network hyperparameters more efficiently than manual methods. Ref. [23] introduced a hybrid CNN-BiLSTM attention network that treats battery data as a multidimensional signal and effectively reduces noise while identifying health degradation patterns. Recent studies also show the importance of understanding battery design using physics-based analysis: in ref. [24], degradation behavior and lifespan were evaluated from short charging segments and related to design parameters. However, these approaches require detailed physical information and controlled conditions.
Recent studies have also explored transformer-based architectures, which are well suited for real-time predictive maintenance in battery management systems [25].

1.1.4. Data-Driven Studies

Data-driven approaches have also played an important role in degradation estimation. Ref. [3] demonstrated that LSTM networks are particularly well suited for predicting remaining useful life (RUL) due to their capability to track long-term degradation patterns. Ref. [26] exploited specific aging patterns in voltage and current curves to enhance data-driven estimation techniques. In ref. [27], a probabilistic neural network was initially applied for quick health classification. A deep learning system for estimating the state of health of EV batteries by utilizing long-term operational data was reported in [28]. Research shows that deep neural networks are capable of effectively learning complex degradation patterns and delivering precise SoH predictions. Ref. [29] shows that deep learning models can estimate SoH using only partial charging data, removing the need for full life-cycle test data. An online remaining useful life (RUL) prediction method utilizing a BiLSTM network was reported in [30]. From the literature review, it is evident that model-based, hybrid and data-driven estimation methods have made significant progress; however, some challenges remain. Model-based and physics-informed approaches require electrochemical parameters and knowledge of internal dynamics, which are generally unavailable in the real world. Hybrid models can address this limitation but further increase modeling complexity and other requirements. Battery management systems already collect large amounts of data such as voltage, current and temperature, which makes data-driven approaches more suitable for implementation. Various metaheuristic optimization techniques also exist and can be applied for model parameter tuning in different engineering problems [31].
Despite these advances, recurrent neural networks such as LSTM and GRU remain widely used for battery health prediction due to their balance between prediction accuracy, model complexity and data requirements. The contributions of the main works reported here are summarized in Table 1.

1.2. Research Gap

It has been seen from the literature that many deep learning studies focus on a single neural network model or evaluate models on different datasets, so a fair comparison for identifying the most suitable model for a practical battery management system is difficult. Further, limited attention has been given to the trade-off between accuracy and complexity, which is an important factor for real-time embedded applications. Therefore, a systematic comparison of these recurrent deep learning models is necessary.

1.3. Novelty and Motivation

This study performs a systematic comparative analysis of four recurrent deep learning architectures: LSTM, BiLSTM, GRU and BiGRU. All models share the same preprocessing pipeline, input features and performance metrics, which enables a direct comparison of prediction accuracy and model behavior. In addition, this study highlights the trade-off between prediction performance and model complexity and shows the suitability of each architecture.

1.4. Research Positioning and Contributions

This work lies at the intersection of deep learning-based battery health diagnostics and practical electric vehicle (EV) battery management system (BMS) design. Unlike many existing studies that focus on a single model or dataset-specific performance, this paper emphasizes fair comparison, reproducibility, and real-world applicability.
The key highlights of the current study:
  • Comparative framework: An integrated data preprocessing and assessment framework is used so that the LSTM, BiLSTM, GRU, and BiGRU models can be compared impartially.
  • Comprehensive performance evaluation: Model performance is evaluated using well-known metrics such as RMSE, MAE, and R2 to assess prediction accuracy and error.
  • Computational feasibility and practical relevance: The analysis of results provides guidance on the suitability of each model for real-world implementation.

2. Materials and Methods

The data-based approach for predicting the state of health (SoH) of an electric vehicle battery is depicted in Figure 1. Basic battery measurements such as voltage, current and temperature are collected at the initial stage. The collected raw data is then analyzed and processed to extract meaningful information, which is segmented into time-based windows that the models can use effectively. Advanced deep learning models (LSTM, BiLSTM, GRU, and BiGRU) are then trained and compared. All models are trained on the same dataset and input features, and their performance is assessed using standard error metrics such as MAE and RMSE. Based on the numerical results and visual analysis, the model providing the best balance between computational efficiency and accuracy is deemed suitable for real-time battery health estimation.

3. Dataset and Preprocessing

The experimental study was conducted using the publicly available NASA lithium-ion battery aging dataset (B0005). The dataset contains detailed charge-discharge cycling data collected under laboratory conditions, including measurements of terminal voltage, current, discharge capacity and temperature over the battery's full life cycle.

Data Preprocessing Steps

To ensure data quality and model robustness, the following steps were applied:
  • Feature selection: Voltage, current and temperature are selected as input features because these variables directly reflect battery operating conditions.
  • Feature normalization: Min-Max normalization is applied to scale all features to the range (0, 1), which improves stability and convergence during training.
  • Sliding window segmentation: A sliding window of 40 cycles is used to capture temporal degradation patterns across charge-discharge cycles.
  • All preprocessing steps, including normalization and sliding window segmentation, are applied uniformly to the NASA B0005 dataset to ensure a fair and unbiased comparison among the deep learning models.
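The normalization and windowing steps above can be sketched in NumPy. This is a minimal illustration only: the 168-cycle length and the synthetic feature and SoH arrays are placeholders, not values drawn from the B0005 dataset.

```python
import numpy as np

def minmax_scale(x):
    """Scale each feature column to the range (0, 1)."""
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

def sliding_windows(features, targets, window=40):
    """Segment the cycle-wise series into overlapping windows of
    `window` cycles; each window is paired with the SoH of the
    cycle that immediately follows it."""
    X, y = [], []
    for i in range(len(features) - window):
        X.append(features[i:i + window])
        y.append(targets[i + window])
    return np.array(X), np.array(y)

# Synthetic stand-ins for voltage, current and temperature per cycle
raw = np.random.rand(168, 3)        # 168 cycles x 3 features (placeholder)
soh = np.linspace(100, 70, 168)     # monotonically fading SoH (placeholder)
X, y = sliding_windows(minmax_scale(raw), soh, window=40)
print(X.shape)  # (128, 40, 3)
```

Each sample is thus a (40, 3) matrix of scaled measurements, which is the input shape the recurrent models in Section 4 consume.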

4. Deep Learning Models

In this study, authors have compared the performance of four different deep learning models for battery characteristic estimation.
Let the input sequence be
X = {x_1, x_2, …, x_T}
where x_t = [V_t, I_t, T_t] and:
  • X: the complete sequence of data points from the start to time T.
  • x_t: the feature vector at time step t.
  • V_t: voltage at time t.
  • I_t: current at time t.
  • T_t: temperature at time t.

4.1. Long Short-Term Memory (LSTM) Network

To estimate the SoH of lithium-ion batteries, four deep learning models have been utilized; one of them is the long short-term memory (LSTM) network. LSTM networks are effective in handling long-term dependencies and mitigating the vanishing gradient problem, which makes them well suited to battery health estimation. The equations forming the architecture of the LSTM network are given below. The symbols and their meanings are listed in the Abbreviations.
LSTM Model Equations
Forget gate: f_t = σ(W_f x_t + U_f h_(t−1) + b_f)
Input gate: i_t = σ(W_i x_t + U_i h_(t−1) + b_i)
Candidate cell state: c̃_t = tanh(W_c x_t + U_c h_(t−1) + b_c)
Cell state update: c_t = f_t ⊙ c_(t−1) + i_t ⊙ c̃_t
Output gate: o_t = σ(W_o x_t + U_o h_(t−1) + b_o)
Hidden state: h_t = o_t ⊙ tanh(c_t)
Output layer: ŷ_t^LSTM = W_y h_t + b_y
where x_t is the input vector, h_t is the hidden state and c_t is the cell state. The vectors f_t, i_t and o_t are the forget, input and output gates. W_f, W_i, W_c are the input weight matrices and U_f, U_i, U_c are the recurrent weight matrices; b_f, b_i, b_c are the bias terms. The symbol ⊙ represents element-wise multiplication, σ is the sigmoid activation function and tanh is the hyperbolic tangent function. The output ŷ_t^LSTM is computed using the output weight matrix W_y and bias b_y.
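As an illustration, the gate equations above can be written as a single NumPy time step. The weight shapes and random toy parameters below are assumptions for demonstration only, not the trained model from this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step following the gate equations above.
    W, U, b hold the parameters for gates f, i, c, o (keyed by letter)."""
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])       # forget gate
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])       # input gate
    c_hat = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])   # candidate cell state
    c = f * c_prev + i * c_hat                                  # cell state update
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])       # output gate
    h = o * np.tanh(c)                                          # hidden state
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4   # e.g. [V_t, I_t, T_t] input, 4 hidden units (toy size)
W = {k: rng.standard_normal((n_hid, n_in)) * 0.1 for k in 'fico'}
U = {k: rng.standard_normal((n_hid, n_hid)) * 0.1 for k in 'fico'}
b = {k: np.zeros(n_hid) for k in 'fico'}
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
print(h.shape)  # (4,)
```

Note that |h_t| stays below 1 by construction, since the output gate and tanh both saturate in (−1, 1).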

4.2. Bidirectional Long Short-Term Memory (BiLSTM) Model

The BiLSTM network is an enhanced form of the standard LSTM model. It processes battery time-series data in both forward and backward directions, so it can capture charging-discharging dynamics as well as long-term degradation behavior.
BiLSTM Model Equations
Forward LSTM: h→_t = LSTM_f(x_t)
Backward LSTM: h←_t = LSTM_b(x_t)
Hidden state concatenation: h_t = [h→_t ; h←_t]
Output layer: ŷ_t^BiLSTM = W_y h_t + b_y
where h→_t and h←_t denote the hidden states of the forward and backward LSTMs. The concatenated vector h_t combines information from both directions, and ŷ_t^BiLSTM is the BiLSTM output.
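The forward/backward pass and concatenation can be sketched as follows. A plain tanh recurrence is used here as a stand-in for the LSTM cell (a real BiLSTM applies the gated equations in each direction); all sizes and weights are illustrative.

```python
import numpy as np

def run_rnn(xs, W, U, b):
    """Run a simple tanh recurrence over a sequence (stand-in for an
    LSTM cell, used only to show the bidirectional wiring)."""
    h = np.zeros(U.shape[0])
    out = []
    for x in xs:
        h = np.tanh(W @ x + U @ h + b)
        out.append(h)
    return np.array(out)

rng = np.random.default_rng(1)
T, n_in, n_hid = 5, 3, 4
xs = rng.standard_normal((T, n_in))
Wf, Uf = rng.standard_normal((n_hid, n_in)), rng.standard_normal((n_hid, n_hid))
Wb, Ub = rng.standard_normal((n_hid, n_in)), rng.standard_normal((n_hid, n_hid))
h_fwd = run_rnn(xs, Wf, Uf, np.zeros(n_hid))              # forward pass
h_bwd = run_rnn(xs[::-1], Wb, Ub, np.zeros(n_hid))[::-1]  # backward pass, re-aligned
h_cat = np.concatenate([h_fwd, h_bwd], axis=1)            # h_t = [h_fwd ; h_bwd]
print(h_cat.shape)  # (5, 8)
```

The concatenation doubles the hidden dimension, which is why bidirectional variants roughly double the parameter count of the layers that follow.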

4.3. Gated Recurrent Unit (GRU) Model

The GRU is a variant of the recurrent neural network that employs gating mechanisms, a reset gate and an update gate, to control the flow of information.
GRU Model Equations
Update gate: z_t = σ(W_z x_t + U_z h_(t−1) + b_z)
Reset gate: r_t = σ(W_r x_t + U_r h_(t−1) + b_r)
Candidate hidden state: h̃_t = tanh(W_h x_t + U_h (r_t ⊙ h_(t−1)) + b_h)
Hidden state update: h_t = (1 − z_t) ⊙ h_(t−1) + z_t ⊙ h̃_t
Output layer: ŷ_t^GRU = W_y h_t + b_y
where z_t and r_t are the update and reset gates and h̃_t is the candidate hidden state. W_z, W_r, W_h are the input weight matrices and U_z, U_r, U_h are the recurrent weight matrices; b_z, b_r, b_h are the bias terms. The output of the GRU is ŷ_t^GRU.
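A minimal NumPy sketch of the GRU equations above; the weight shapes and toy random parameters are illustrative assumptions, not trained values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU step following the update/reset gate equations above."""
    z = sigmoid(W['z'] @ x_t + U['z'] @ h_prev + b['z'])            # update gate
    r = sigmoid(W['r'] @ x_t + U['r'] @ h_prev + b['r'])            # reset gate
    h_hat = np.tanh(W['h'] @ x_t + U['h'] @ (r * h_prev) + b['h'])  # candidate state
    return (1 - z) * h_prev + z * h_hat                             # gated interpolation

rng = np.random.default_rng(2)
n_in, n_hid = 3, 4   # e.g. [V_t, I_t, T_t] input, 4 hidden units (toy size)
W = {k: rng.standard_normal((n_hid, n_in)) * 0.1 for k in 'zrh'}
U = {k: rng.standard_normal((n_hid, n_hid)) * 0.1 for k in 'zrh'}
b = {k: np.zeros(n_hid) for k in 'zrh'}
h = gru_step(rng.standard_normal(n_in), np.zeros(n_hid), W, U, b)
print(h.shape)  # (4,)
```

Compared with the LSTM step, the GRU merges the cell and hidden states and uses two gates instead of three, which is the source of its lower computational cost.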

4.4. Bidirectional GRU (BiGRU Model)

BiGRU is an extension of the GRU that processes the sequence in both directions and combines their outputs. It can learn past and future dependencies, which yields more accurate prediction of battery SoC/SoH.
BiGRU Model Equations
Forward GRU: h→_t = GRU_f(x_t)
Backward GRU: h←_t = GRU_b(x_t)
Hidden state fusion: h_t = [h→_t ; h←_t]
Output layer: ŷ_t^BiGRU = W_y h_t + b_y
The terms h→_t and h←_t denote the forward and backward GRU hidden states, and ŷ_t^BiGRU is the BiGRU output.
In this study, each model was implemented using the same network depth to ensure a fair comparison. The models consist of two recurrent layers with 128 and 64 hidden neurons, followed by an attention layer to capture temporal features, a fully connected dense layer with 64 neurons and a final output layer for SoH prediction. The complete architecture of four models is shown in Figure 2.

4.5. Hyperparameter Selection and Training Configuration

The recurrent neural network models were trained using a consistent set of hyperparameters to ensure a fair comparison. All models were optimized using the Adam optimizer with a learning rate of 0.0005. Training was conducted for 150 epochs (the number of passes over the dataset during training) with a batch size (the number of samples processed per update) of 64. Each model has two recurrent layers with 128 and 64 hidden neurons followed by a dense layer of 64 neurons. ReLU was used as the activation function, and a dropout of 0.3 was applied to reduce overfitting and improve generalization. For the bidirectional networks, the same hyperparameters were used, and the recurrent layers were implemented using bidirectional wrappers to capture temporal dependencies in both directions.

5. Performance Evaluation

The performance of the above-mentioned deep learning models has been evaluated using three popular metrics.
  • Mean Absolute Error (MAE): measures the average absolute prediction error.
MAE = (1/N) Σ_(i=1)^N |y_i − ŷ_i|
  • Root Mean Square Error (RMSE): measures the square root of the average squared prediction error.
RMSE = √((1/N) Σ_(i=1)^N (y_i − ŷ_i)²)
  • Coefficient of Determination (R²): indicates how well the model captures the variance of the target variable.
R² = 1 − (Σ_(i=1)^N (y_i − ŷ_i)²) / (Σ_(i=1)^N (y_i − ȳ)²)
where y_i is the true value, ŷ_i is the predicted value, ȳ is the mean of the true values and N is the number of samples.
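The three metrics follow directly from their definitions. The small SoH arrays in this sketch are made-up values for illustration, not results from the study.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - y_hat_i|."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean square error: sqrt of the mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical true and predicted SoH values (percent)
y_true = np.array([100.0, 95.0, 90.0, 85.0])
y_pred = np.array([99.0, 95.5, 89.0, 85.5])
print(round(mae(y_true, y_pred), 3))  # 0.75
```

Note that RMSE penalizes large individual errors more heavily than MAE, which is why both are reported side by side in Tables 2 and 3.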

6. Results and Discussion

The performance of the four models has been evaluated and compared using Figure 3, Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9 and Table 2 and Table 3, given in the next subsections.

6.1. Validation Loss Comparison Across Models

Figure 3 presents the validation loss (MSE) comparison of the LSTM, BiLSTM, GRU, and BiGRU models over training epochs. It shows how quickly and stably the models learn to predict the data. All models exhibit a rapid drop in validation loss; however, the BiGRU exhibits large spikes in the initial epochs and struggles early on. All four models converge to nearly zero after 40 epochs. Based on the mean (µ) and standard deviation (σ), the models can be ranked as GRU, BiLSTM, LSTM and BiGRU. Figure 3 also indicates that although some models have a spiky start, they all settle and are capable of learning the pattern of the dataset.

6.2. True vs. Predicted SoH over Battery Cycles

Figure 4 represents the comparison of the true vs. predicted SoH values obtained using LSTM, BiLSTM, GRU and BiGRU models. It is seen that all models are successfully tracking the true behavior of the battery over cycles.
Following are the findings of Figure 4a–d.
LSTM: N = 392, MAE = 0.78, µ = 0.15, σ = 0.97.
BiLSTM: N = 392, MAE = 0.71, µ = 0.11, σ = 0.88.
GRU: N = 392, MAE = 0.96, µ = 0.15, σ = 1.2.
BiGRU: N = 392, MAE = 1.7, µ = 0.28, σ = 1.4.
By comparing the mean absolute error, mean bias and standard deviation of the prediction error, it is seen that the bidirectional model (BiLSTM) shows a close resemblance to the true SoH and is the most effective among all. The BiLSTM model has the lowest MAE (0.71) and standard deviation (0.88). The models can be ranked in the order as BiLSTM, LSTM, GRU and BiGRU as per their true and predicted SoH values given in Figure 4.

6.3. Residual Error Distribution

Figure 5 demonstrates residual error distribution for LSTM, BiLSTM, GRU and BiGRU models. It is seen that residuals for all deep learning models are almost Gaussian and center around zero. The findings of Figure 5 are as follows:
LSTM: µ (mean) = 0.15, σ (std) = 0.97.
BiLSTM: µ (mean) = −0.11, σ (std) = 0.88.
GRU: µ (mean) = 0.15, σ (std) = 1.2.
BiGRU: µ (mean) = 0.28, σ (std) = 1.45.
From the findings, it is evident that BiLSTM is the superior model with the lowest standard deviation and a mean close to zero. It has the tightest range between the minimum and maximum errors.

6.4. Scatter Plot of True vs. Predicted SoH

Figure 6 shows the scatter plot of true and predicted SoH for all four deep learning models. The findings of Figure 6 are as follows:
LSTM: MAE = 0.78, BiLSTM: MAE = 0.71, GRU: MAE = 0.96, BiGRU: MAE = 1.7.
It is evident that BiLSTM is the most suitable model among all, as it has the lowest MAE of 0.71. Most points lie close to the ideal line, showing that all models are working well.

6.5. Absolute Error Boxplot

Figure 7 shows the boxplot of absolute prediction error for each of the same four models. The findings of Figure 7 are as follows:
LSTM: MAE = 0.78, σ = 0.59, max error = 2.8.
BiLSTM: MAE = 0.71, σ = 0.52, max error = 2.3.
GRU: MAE = 0.96, σ = 0.74, max error = 3.9.
BiGRU: MAE = 1.17, σ = 0.89, max error = 5.2.
For BiGRU, the model makes larger mistakes, as it has more outliers and a larger box spread. The BiLSTM shows a tighter box spread, indicating more stable and consistent predictions, while the GRU shows a wide spread and produces larger errors. Overall, the BiLSTM shows superior performance and is therefore the most robust model among all.

6.6. Absolute Error over Samples

Figure 8 represents the absolute error changes across samples utilizing the LSTM, BiLSTM, GRU and BiGRU models. This analysis reveals how stable the models are across the entire dataset and whether their errors are random or follow a specific temporal pattern.
The findings of Figure 8 are as follows:
LSTM: MAE = 0.78, σ = 0.59, Error Peak = Moderate.
BiLSTM: MAE = 0.71, σ = 0.52, Error Peak = Low.
GRU: MAE = 0.96, σ = 0.74, Error Peak = High.
BiGRU: MAE = 1.17, σ = 0.89, Error Peak = Very High.
It is seen that some models have error spikes reaching values of 4 and 5; however, the BiLSTM model maintains the lowest error over time and gives more stable, consistent predictions. Overall, the BiLSTM is the most accurate, the most precise and the most stable model over time.

6.7. Summarized Discussion of Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8

Since the NASA B0005 dataset is a widely recognized dataset in battery aging research, the results obtained in this study provide a meaningful comparison with existing SoH estimation approaches reported in the literature.
It is evident from the results shown in Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 that the BiLSTM model has superior prediction accuracy and stability compared with the other models. The main observations are:
  • The validation loss comparison shows that, among all models, the BiLSTM achieves faster convergence and more stable learning.
  • The true vs. predicted SoH curves show that the BiLSTM closely follows the degradation pattern of the battery.
  • The residual error distribution shows that the prediction errors of the BiLSTM model are centered near zero with lower scattering.
  • The scatter plot shows a strong correlation between predicted and actual SoH values for the BiLSTM model.
  • The absolute error boxplot and sample-wise error plots further confirm that the BiLSTM produces consistently lower prediction error.

6.8. Comparison of Model’s Performance

To improve the reliability of the reported results, repeated experiments and statistical analysis have been included. Each model was trained and evaluated over five independent runs using the same experimental settings. The average values of RMSE, MAE and R2 were calculated over these runs to reduce the effects of random initialization and training variability. The results given in Table 2 report the mean performance values. The mean RMSE and MAE values, along with their standard deviations and 95% confidence intervals, are reported in Table 3.
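The per-run summary statistics can be computed as follows. This sketch uses a normal (z = 1.96) approximation for the 95% confidence interval; with only five runs a t-interval would be slightly wider. The five RMSE values are hypothetical, not the study's actual run results.

```python
import numpy as np

def summarize_runs(values, z=1.96):
    """Mean, sample standard deviation, and normal-approximation
    95% confidence interval for a small set of repeated runs."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)              # sample std (n - 1 denominator)
    half = z * std / np.sqrt(len(values)) # CI half-width
    return mean, std, (mean - half, mean + half)

# Hypothetical RMSE values from five independent training runs
rmse_runs = [0.88, 0.92, 0.90, 0.91, 0.89]
mean, std, ci = summarize_runs(rmse_runs)
print(round(mean, 2))  # 0.9
```

Reporting the interval alongside the mean makes it clear whether differences between models (e.g. BiLSTM vs. LSTM) exceed run-to-run variability.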
From Table 2 and Table 3, it is seen that the BiLSTM attained the best results, with the lowest RMSE (0.90) and MAE (0.72) along with the highest coefficient of determination (R2 = 0.99). This indicates that, owing to its bidirectional learning capability, the BiLSTM effectively captures temporal dependencies in the data, which results in better prediction performance.
The same is shown in Figure 9, which presents the quantitative comparison of the models. All models achieve high R2 values, showing that they effectively capture the overall degradation patterns, while the BiLSTM model achieves the lowest MAE and RMSE values and hence superior prediction accuracy.

Model Complexity Comparison

Deep learning models used for battery health prediction differ in computational complexity due to their internal architectures. Unidirectional models (LSTM and GRU) process input sequences in a single temporal direction, while bidirectional models (BiLSTM and BiGRU) process sequences in both forward and backward directions, which increases their computational cost.

7. Conclusions

The performance of deep learning models in predicting the battery state of health (SoH) has been investigated in this study. A comparative analysis of LSTM, BiLSTM, GRU and BiGRU models was performed under identical conditions. Among all models, the BiLSTM proves to be superior: it provides the best prediction with the lowest prediction error (MAE = 0.72, RMSE = 0.90) and the highest R2 value (0.99). It learns smoothly and maintains stability, showing more stable convergence in the validation loss curves. The BiLSTM model closely follows the aging and degradation of batteries over time, as confirmed by the results shown for actual and predicted SoH values. Residual error distributions and scatter plots show lower scattering and a strong correlation between predicted and actual SoH values for the BiLSTM model. The comparison shows the superiority of the BiLSTM over the other models; therefore, it can be used as a dependable model for estimating battery health.
Real-world application of the proposed model: The proposed model can be integrated into a battery management system to estimate state of health using real-time data such as voltage, current and temperature. It can predict degradation during charging and discharging cycles in order to detect abnormal aging for scheduled maintenance or even battery replacement, and energy storage systems can use this information to prevent unexpected failures. In this way, the model can improve the safety, reliability and life management of lithium-ion batteries in real-world applications.
Limitation and future work: Although bidirectional models show promising results, several limitations arise because these models use both past and future information during training. The current study also relies on a single battery-specific dataset and controlled experimental conditions, so the generalization of the models may be limited for different battery types and operating environments. Therefore, future work can focus on more recent deep learning models, more complex modeling, validation on larger and more diverse datasets (B0006, B0007 and real-world EV battery data) under noisy measurement environments, and the investigation of transfer learning to improve generalization for real-time battery health prediction.

Author Contributions

Conceptualization, N.K.; Methodology, N.K.; Software, N.K.; Formal analysis, K.K. and R.K.; Investigation, K.K. and R.K.; Resources, K.K.; Data curation, N.K. and K.K.; Writing—original draft, N.K.; Writing—review & editing, R.K.; Visualization, K.K. and R.K.; Supervision, R.K.; Project administration, R.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations and symbols are used in this manuscript:
BMS: Battery management system
SoH: State of health
SoC: State of charge
RNN: Recurrent neural network
RUL: Remaining useful life
LSTM: Long short-term memory
BiLSTM: Bidirectional long short-term memory
GRU: Gated recurrent unit
BiGRU: Bidirectional gated recurrent unit
RMSE: Root mean square error
MAE: Mean absolute error
MSE: Mean square error
EV: Electric vehicle
PSO: Particle swarm optimization

Symbols (LSTM model)
y_t, ŷ_t^LSTM: true output, predicted output
x_t: input vector
f_t, i_t, o_t: forget gate, input gate, output gate
c_t: cell state
σ: sigmoid activation function
h_t, h_(t−1): hidden state, previous hidden state
⊙: element-wise product
W_f, W_i, W_c, W_o, W_y: input weight matrices for the respective gates and layers
U_f, U_i, U_c, U_o: recurrent weight matrices
b_f, b_i, b_c, b_o, b_y: bias terms for the corresponding gates and layers
tanh: hyperbolic tangent
N: total number of samples
R²: coefficient of determination

Symbols (GRU model)
z_t: update gate output
r_t: reset gate output
W_z, W_r, W_h, W_y: input weight matrices for the respective gates and layer
U_z, U_r, U_h: recurrent weight matrices connecting the hidden state to the gates
b_z, b_r, b_h, b_y: bias terms for the respective gates and layer
ŷ_t^GRU: output of the GRU

Symbols (BiLSTM and BiGRU models)
h→_t: hidden state of the forward LSTM/GRU
h←_t: hidden state of the backward LSTM/GRU
h_t: combined (concatenated) state
ŷ_t^BiLSTM, ŷ_t^BiGRU: output of the BiLSTM/BiGRU

References

  1. Sylvestrin, G.R.; Maciel, J.N.; Amorim, M.L.M.; Carmo, J.P.; Afonso, J.A.; Lopes, S.F.; Junior, O.H.A. State of the Art in Electric Batteries’ State-of-Health (SoH) Estimation with Machine Learning: A Review. Energies 2025, 18, 746. [Google Scholar] [CrossRef]
  2. Jorkesh, S.; Ahmed, R.; Habibi, S.; Hosseininejad, R.; Xu, S. Battery State of Charge and State of Health Estimation Using a New Hybrid Deep Neural Network Approach (GRU-LSTM). IEEE Access 2025, 13, 12566–12580. [Google Scholar] [CrossRef]
  3. Zhang, Y.; Xiong, R.; He, H.; Pecht, M. Long short-term memory recurrent neural network for remaining useful life prediction of lithium-ion batteries. IEEE Trans. Veh. Technol. 2018, 67, 5695–5705. [Google Scholar] [CrossRef]
  4. Zhao, F.; Guo, Y.; Chen, B. A Review of Lithium-Ion Battery State of Charge Estimation Methods Based on Machine Learning. World Electr. Veh. J. 2024, 15, 131. [Google Scholar] [CrossRef]
  5. Mbagaya, L.; Reddy, K.; Botes, A. Machine Learning Techniques for Battery State of Health Prediction: A Comparative Review. World Electr. Veh. J. 2025, 16, 594. [Google Scholar] [CrossRef]
  6. Liu, C.; Li, H.; Li, K.; Wu, Y.; Lv, B. Deep Learning for State of Health Estimation of Lithium-Ion Batteries in Electric Vehicles: A Systematic Review. Energies 2025, 18, 1463. [Google Scholar] [CrossRef]
  7. Reza, M.; Mannan, M.; Mansor, M.; Ker, P.J.; Mahlia, T.M.I.; Hannan, M. Recent advancement of remaining useful life prediction of lithium-ion battery in electric vehicle applications: A review of modelling mechanisms, network configurations, factors, and outstanding issues. Energy Rep. 2024, 11, 4824–4848. [Google Scholar] [CrossRef]
  8. Chen, S.; Hu, X.; Zhao, J.; Wang, R.; Qiao, M. A Review of Decision-Making and Planning for Autonomous Vehicles in Intersection Environments. World Electr. Veh. J. 2024, 15, 99. [Google Scholar] [CrossRef]
  9. Wang, Z.; Feng, G.; Zhen, D.; Gu, F.; Ball, A. A review on online state of charge and state of health estimation for lithium-ion batteries in electric vehicles. Energy Rep. 2021, 7, 5141–5161. [Google Scholar] [CrossRef]
  10. Hossain Lipu, M.S.; Karim, T.F.; Ansari, S.; Miah, M.S.; Rahman, M.S.; Meraj, S.T.; Elavarasan, R.M.; Vijayaraghavan, R.R. Intelligent SOX Estimation for Automotive Battery Management Systems: State-of-the-Art Deep Learning Approaches, Open Issues, and Future Research Opportunities. Energies 2023, 16, 23. [Google Scholar] [CrossRef]
  11. Madani, S.S.; Ziebert, C.; Vahdatkhah, P.; Sadrnezhaad, S.K. Recent progress of deep learning methods for health monitoring of lithium-ion batteries. Batteries 2024, 10, 204. [Google Scholar] [CrossRef]
  12. Giordano, G.; Klass, V.; Behm, M.; Lindbergh, G.; Sjoberg, J. Model-Based lithium-ion battery resistance estimation from electric vehicle operating data. IEEE Trans. Veh. Technol. 2018, 67, 3720–3728. [Google Scholar] [CrossRef]
  13. Murnane, M.; Ghazel, A. A Closer Look at State of Charge (SOC) and State of Health (SOH) Estimation Techniques for Batteries. Analog Devices 2017, 2, 426–436. [Google Scholar]
  14. Hosseininasab, S.; Lin, C.; Pischinger, S.; Stapelbroek, M.; Vagnoni, G. State-of-health estimation of lithium-ion batteries for electrified vehicles using a reduced-order electrochemical model. J. Energy Storage 2022, 52, 104684. [Google Scholar] [CrossRef]
  15. Wang, C.; Du, W.; Zhu, Z.; Yue, Z. The real-time big data processing method based on LSTM or GRU for the smart job shop production process. J. Algorithm Comput. Technol. 2020, 14, 1–9. [Google Scholar] [CrossRef]
  16. Zraibi, B.; Okar, C.; Chaoui, H.; Mansouri, M. Remaining useful life assessment for lithium-ion batteries using CNN-LSTM-DNN hybrid method. IEEE Trans. Veh. Technol. 2021, 70, 4252–4261. [Google Scholar] [CrossRef]
  17. Zhou, D.; Li, Z.; Zhu, J.; Zhang, H.; Hou, L. State of health monitoring and remaining useful life prediction of lithium-ion batteries based on temporal convolutional network. IEEE Access 2020, 8, 53307–53320. [Google Scholar] [CrossRef]
  18. Cheng, G.; Wang, X.Z.; He, Y.R. Remaining useful life and state of health prediction for lithium batteries based on empirical mode decomposition and a long and short memory neural network. Energy 2021, 232, 121022. [Google Scholar] [CrossRef]
  19. Zhang, Y.; Wik, T.; Bergström, J.; Zou, C. State of Health Estimation for Lithium-Ion Batteries under Arbitrary Usage Using Data-Driven Multimodel Fusion. IEEE Trans. Transp. Electrif. 2024, 10, 1494–1507. [Google Scholar] [CrossRef]
  20. Liu, G.; Deng, Z.; Xu, Y.; Lai, L.; Gong, G.; Tong, L.; Zhang, H.; Li, Y.; Gong, M.; Yan, M.; et al. Lithium-Ion Battery State of Health Estimation Based on CNN-LSTM-Attention-FVIM Algorithm and Fusion of Multiple Health Features. Appl. Sci. 2025, 15, 7555. [Google Scholar] [CrossRef]
  21. Xi, H.; Lv, T.; Qin, J.; Ma, M.; Xie, J.; Lu, S.; Liu, Z. Prediction of Lithium Battery Voltage and State of Charge Using Multi-Head Attention BiLSTM Neural Network. Appl. Sci. 2025, 15, 3011. [Google Scholar] [CrossRef]
  22. Ren, X.; Liu, S.; Yu, X.; Dong, X. A method for state-of-charge estimation of lithium-ion batteries based on PSO-LSTM. Energy 2021, 234, 121236. [Google Scholar] [CrossRef]
  23. Zhu, Z.; Yang, Q.; Liu, X.; Gao, D. Attention-based CNN-BiLSTM for SOH and RUL estimation of lithium-ion batteries. J. Algorithms Comput. Technol. 2022, 16, 1. [Google Scholar] [CrossRef]
  24. Guo, W.; Vilsen, S.B.; Li, Y.; Verma, A.; Stroe, D.I.; Brandell, D. Uncovering the impact of battery design parameters on health and lifetime using short charging segments. Energy Environ. Sci. 2025, 18, 8462–8474. [Google Scholar] [CrossRef]
  25. Zhao, J.; Wang, Z.; Wu, Y.; Burke, A.F. Predictive pretrained transformer (PPT) for real-time battery health diagnostics. Appl. Energy 2025, 377, 124746. [Google Scholar] [CrossRef]
  26. Xia, Z.Y.; Abu Qahouq, J.A. Lithium-ion battery ageing behavior pattern characterization and state-of-health estimation using data-driven method. IEEE Access 2021, 9, 98287–98304. [Google Scholar] [CrossRef]
  27. Lin, H.T.; Liang, T.J.; Chen, S.M. Estimation of battery state of health using probabilistic neural network. IEEE Trans. Ind. Inf. 2013, 9, 679–685. [Google Scholar] [CrossRef]
  28. Alhazmi, R.M. State of Health Prediction in Electric Vehicle Batteries Using a Deep Learning Model. World Electr. Veh. J. 2024, 15, 385. [Google Scholar] [CrossRef]
  29. Lu, J.; Xiong, R.; Tian, J.; Wang, C.; Sun, F. Deep learning to estimate lithium-ion battery state of health without additional degradation experiments. Nat. Commun. 2023, 14, 2760. [Google Scholar] [CrossRef]
  30. Wang, F.K.; Amogne, Z.E.; Chou, J.H.; Tseng, C. Online remaining useful life prediction of lithium-ion batteries using bidirectional long short-term memory with attention mechanism. Energy 2022, 254, 124344. [Google Scholar] [CrossRef]
  31. Rathinam, A.; Phukan, R. Solution to Economic Load Dispatch Problem Based on FIREFLY Algorithm and Its Comparison with BFO, CBFO-S and CBFO-Hybrid. In Swarm, Evolutionary, and Memetic Computing; Panigrahi, B.K., Das, S., Suganthan, P.N., Nanda, P.K., Eds.; SEMCCO 2012. Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2012; Volume 7677. [Google Scholar] [CrossRef]
Figure 1. Proposed flowchart for battery state of health prediction using four recurrent networks.
Figure 2. The architecture of four different models used in the proposed comparative deep learning models [2,8,28].
Figure 3. Validation loss (MSE) comparison across models for battery SoH prediction.
Figure 4. True vs. predicted SoH across battery cycles using all models. (a) True vs. Predicted SoH across battery cycles using LSTM; (b) True vs. Predicted SoH across battery cycles using BiLSTM; (c) True vs. Predicted SoH across battery cycles using GRU; (d) True vs. Predicted SoH across battery cycles using BiGRU.
Figure 5. Residual error distribution for prediction errors. (a) Residual error distribution using LSTM; (b) Residual error distribution using BiLSTM; (c) Residual error distribution using GRU; (d) Residual error distribution using BiGRU.
Figure 6. Scatter plot of predicted vs. actual SoH across all models. (a) Scatter plot of predicted vs. actual SoH using LSTM; (b) Scatter plot of predicted vs. actual SoH using BiLSTM; (c) Scatter plot of predicted vs. actual SoH using GRU; (d) Scatter plot of predicted vs. actual SoH using BiGRU.
Figure 7. Absolute error boxplot comparing prediction error distribution across all models. (a) Absolute error boxplot using LSTM; (b) Absolute error boxplot using BiLSTM; (c) Absolute error boxplot using GRU. (d) Absolute error boxplot using BiGRU.
Figure 8. Absolute error across samples across all models. (a) Absolute error across samples using LSTM; (b) Absolute error across samples using BiLSTM; (c) Absolute error across samples using GRU; (d) Absolute error across samples using BiGRU.
Figure 9. Quantitative comparison of all models using R2, MAE and RMSE metrics.
Table 1. Summary of the recent machine learning approaches used for battery SoH prediction.
Ref. | Author (Year) | Primary Method | Accuracy (Error Metrics) | Key Contribution
Review Studies
[1] | Sylvestrin (2025) | Review (MLP & LSTM) | Review | Highlights the shift toward physics-informed ML for better interpretability.
[4] | Zhao (2024) | Review | Review | Reviews the three main steps involved in various ML-based SoC estimation methods.
[5] | Mbagaya (2025) | Review | Review | Reviews four ML approaches for SoC estimation.
[6] | Liu (2025) | Review | Review | Identifies Transformers and the domain gap as the next big challenges.
[7] | Reza (2024) | Review | Review | Maps external factors (temperature/DoD) to specific model configurations.
[8] | Chen (2024) | Review | Review | Reviews the latest issues and research progress in decision making and planning for autonomous vehicles.
[10] | Hossain Lipu (2023) | Review | Review | Proposes the intelligent SOX framework for all battery states.
[11] | Madani (2024) | Review (self-adaptive) | Review | Highlights models that self-adjust as the battery chemically ages.
Model-Based Studies
[12] | Giordano (2018) | Model-based (ARX) | High (validated on EV data) | Estimates resistance from real-world driving data without lab tests.
[13] | Murnane (2017) | Hardware/analog review | Industry | Compares OCV vs. Coulomb counting from a hardware perspective.
[14] | Hosseininasab (2022) | Reduced-order model | High (physics-aware) | Developed reduced-order models (ROMs) for low-compute BMS.
Hybrid Studies
[2] | Jorkesh (2025) | Hybrid GRU-LSTM | SoH RMSE: 0.65% | Combines GRU efficiency with LSTM for dynamic EV states.
[16] | Zraibi (2021) | CNN-LSTM-DNN | Significant error reduction | Used CNNs to extract spatial features for RUL.
[17] | Zhou (2020) | TCN (temporal convolution) | <10 cycles error (RUL) | Introduced parallel processing for health monitoring using TCNs.
[18] | Cheng (2021) | EMD-LSTM | SoH RMSE: 0.02 | Used signal decomposition to remove capacity-regeneration noise.
[19] | Zhang (2024) | Multi-model fusion | Stable under arbitrary usage | Ensures accuracy under arbitrary/unpredictable driving behavior.
[20] | Liu (2025b) | CNN-LSTM-Attention | MAE: 0.99% | Uses attention to focus on critical parts of the charging cycle.
[21] | Xi (2025) | Multi-head attention BiLSTM | High stability | Uses multi-head attention to track multiple battery variables at once.
[22] | Ren (2021) | PSO-LSTM | RMSE: 0.90% (SSL) | Uses particle swarm optimization to auto-tune model parameters.
[23] | Zhu (2022) | CNN-BiLSTM-Attention | MAE < 1% | Proposes a triple-layer framework for maximum feature extraction.
[24] | Guo (2025) | Physics-informed battery analysis | Correlation between design parameters and degradation | Demonstrated how battery design parameters influence degradation.
[25] | Zhao (2025) | Multimodal-fusion pretrained transformer (CNN + Transformer) | Reduced computational cost, enhanced generalization | Combines CNN and Transformer modules for time-series battery diagnostics.
Data-Driven Studies
[3] | Zhang (2018) | LSTM | High R² | First major validation of LSTM's ability to handle time-series degradation.
[26] | Xia (2021) | Data-driven features | High precision | Characterized specific aging patterns in voltage/current curves.
[27] | Lin (2013) | PNN (probabilistic) | High speed | Demonstrated fast learning speed for nonlinear SoH classification.
[15] | Wang (2020) | LSTM/GRU | Real-time efficiency | Confirmed GRU is often faster than LSTM for industrial big data.
[28] | Alhazmi (2024) | Deep learning (fleet) | Fleet-wide accuracy | Focuses on data normalization for cross-vehicle SoH accuracy.
[29] | Lu (2023) | Deep learning | No additional degradation experiments needed | Achieved SoH estimation using only partial charging data.
[30] | Wang (2022) | Bidirectional LSTM | High RUL precision | Processes data forward and backward to capture complex trends.
Table 2. Performance comparison of all models in terms of RMSE, MAE and R2.
Model | RMSE | MAE | R²
LSTM | 1.04 | 0.83 | 0.99
BiLSTM | 0.90 | 0.72 | 0.99
GRU | 1.30 | 1.02 | 0.98
BiGRU | 1.24 | 1.00 | 0.98
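The three metrics in this table can be reproduced directly from the true and predicted SoH sequences. A minimal sketch using NumPy, with made-up SoH values for illustration (not the paper's B0005 data):

```python
import numpy as np

def evaluate_soh(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for a sequence of SoH predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))               # root mean squared error
    mae = float(np.mean(np.abs(err)))                      # mean absolute error
    ss_res = float(np.sum(err ** 2))                       # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                             # coefficient of determination
    return rmse, mae, r2

# Hypothetical SoH values in percent, over five cycles
rmse, mae, r2 = evaluate_soh([100.0, 98.5, 97.0, 95.2, 93.8],
                             [99.6, 98.9, 96.5, 95.6, 93.5])
```

Lower RMSE/MAE and R² closer to 1 indicate a better fit, which is how the BiLSTM row above is read as the best performer.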
Table 3. Final results with statistical stability.
Model | RMSE Mean | RMSE Std | RMSE CI | MAE Mean | MAE Std | MAE CI
LSTM | 1.0383 | 0.0851 | 0.0746 | 0.8233 | 0.0634 | 0.0556
BiLSTM | 1.0310 | 0.1137 | 0.0997 | 0.8087 | 0.0945 | 0.0829
GRU | 1.2617 | 0.0614 | 0.0538 | 0.9963 | 0.0400 | 0.0350
BiGRU | 1.3972 | 0.1521 | 0.1333 | 1.1069 | 0.1142 | 0.1001
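The CI columns are consistent with a 95% normal-approximation half-width, z·s/√n with z = 1.96 and n = 5 repeated runs; note that n = 5 is our inference from the reported std/CI ratios and is not stated in the table. A minimal sketch:

```python
import math

def run_stats(values, z=1.96):
    """Mean, sample std, and 95% CI half-width over repeated training runs."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance (ddof=1)
    std = math.sqrt(var)
    return mean, std, z * std / math.sqrt(n)

def ci_halfwidth(std, n, z=1.96):
    """CI half-width recovered from a reported std and the number of runs."""
    return z * std / math.sqrt(n)

# The reported BiLSTM RMSE std is 0.1137; with n = 5 runs the half-width
# matches the tabulated CI of 0.0997 to rounding.
print(round(ci_halfwidth(0.1137, 5), 4))  # 0.0997
```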
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kumar, N.; Kundu, K.; Kumar, R. Comparative Study of Recurrent Neural Networks for Electric Vehicle Battery Health Assessment. World Electr. Veh. J. 2026, 17, 178. https://doi.org/10.3390/wevj17040178
