Article

A Particle Swarm Optimized Multi-Model Framework for Remaining Useful Life Prediction of Lithium-Ion Batteries Using Domain-Driven Feature Engineering

1
Department of Electrical Engineering, Jubail Industrial College, Al Jubail 35718, Saudi Arabia
2
Department of Electrical Engineering, The Islamia University of Bahawalpur (IUB), Bahawalpur 63100, Pakistan
3
Department of Mechanical Engineering, Al-Fayha College, Al Jubail, 35514, Saudi Arabia
4
Department of Electrical Engineering, College of Engineering, University of Business and Technology, Jeddah 21361, Saudi Arabia
5
Engineering Technology Department, Community College of Qatar, Doha P.O. Box 7344, Qatar
6
Department of Industrial Engineering, College of Engineering, University of Business and Technology, Jeddah 21361, Saudi Arabia
7
Multan Electric Power Company, Multan 6000, Pakistan
8
College of Engineering, A’Sharqiyah University, Ibra 400, Oman
*
Authors to whom correspondence should be addressed.
World Electr. Veh. J. 2025, 16(11), 639; https://doi.org/10.3390/wevj16110639
Submission received: 16 October 2025 / Revised: 13 November 2025 / Accepted: 18 November 2025 / Published: 20 November 2025

Abstract

Predicting the remaining useful life (RUL) of lithium-ion batteries (LIBs) is essential for battery management, safe operation, and maintenance scheduling of electric vehicles (EVs). Accurate RUL prediction supports secure working conditions, helps avert internal and external failures, and avoids undesirable consequences. However, achieving accurate RUL prediction is complicated for EV applications for several reasons, including complex operational characteristics, dynamic changes in the model parameters during the aging process, extraction of battery parameters, data preparation, and hyper-parameter tuning of the predictive model. This research proposes a novel approach that integrates Particle Swarm Optimization (PSO) with a multi-model technique for RUL prediction. The framework integrates several machine learning (ML) and deep learning (DL) models. Combining domain knowledge, advanced optimization techniques, and learning models to make high-accuracy RUL predictions reduces maintenance costs and improves battery management systems. This study uses domain-driven feature engineering to extract battery-specific indicators, including voltage drops, charging time, and temperature fluctuations, to increase model accuracy. Among the evaluated models, LSTM demonstrates superior performance, achieving a mean absolute error (MAE) of 0.34, a root mean square error (RMSE) of 0.76, and an R² of 0.93, the best results in RUL prediction. The proposed research integrates PSO-based optimization with domain-driven feature engineering across multiple machine learning and deep learning models, demonstrating a unified approach that significantly improves the prediction accuracy of RUL in LIBs.

1. Introduction

As global concerns about climate change and the environmental impact of greenhouse gas (GHG) emissions intensify, there has been a widespread push toward cleaner, renewable energy sources and energy-efficient technologies [1]. Traditional fossil fuels, the primary source of GHG emissions, contribute significantly to global warming and environmental degradation. In response, industries across the globe are transitioning to sustainable energy solutions, with EVs and renewable energy systems taking center stage in reducing dependence on fossil fuels. A key enabler of this transition is LIBs, which have become the preferred energy storage solution because of their high energy density, long cycle life, and relatively lightweight design [2]. LIBs are indispensable not only in EVs but also in a range of applications, including portable electronics, renewable energy storage, and grid stabilization.
LIBs are used in almost all types of electric vehicles, from motorbikes to cars and even buses [3]. They perform well in these applications, and their long service life and energy efficiency become evident as the vehicles are used. By enabling electric transport, a sector with major potential for reducing greenhouse gas emissions, the efficient storage and release of energy in LIBs contributes directly to that reduction [4]. Their role does not end with transportation: LIBs are also central to renewable energy storage systems, storing energy generated from solar and wind sources and making it available during peak hours or non-generation hours to ensure grid stability. LIBs also power consumer electronic devices such as mobile phones, laptops, tablets, and wearables, giving modern mobile technology the compact size, run time, and performance it requires.
However, while LIBs offer numerous benefits, their widespread use presents several challenges [5]. One of the most significant issues is battery degradation. Over time, LIBs experience a reduction in their capacity to store charge, known as capacity fade, which is influenced by factors such as charge/discharge cycles, temperature fluctuations, and internal resistance [6]. As LIBs degrade, their performance diminishes, leading to shorter battery life and reduced efficiency. This degradation poses a particularly acute challenge in applications where the reliability of the battery is critical, such as in electric vehicles, grid storage, and medical devices [3].
The ability to accurately predict the RUL of LIBs is crucial to address these challenges. In applications such as EVs and renewable energy systems, knowing when a battery will fail or lose significant capacity is essential for effective battery management, maintenance, and replacement scheduling [7]. However, predicting the RUL of LIBs is complex because of the non-linear and dynamic nature of battery degradation, influenced by factors such as temperature, charge cycles, and usage patterns. While several prediction methods exist, many of them are either too simplistic or computationally intensive, failing to accurately capture the complex degradation behavior of LIBs [8].
Moreover, safety concerns related to LIBs cannot be overlooked. Poor battery management, especially when RUL is not predicted accurately, can lead to safety risks such as overheating, thermal runaway, or even fires, particularly in high-energy applications such as EVs [9]. Additionally, the environmental impact of LIB production and disposal is another growing concern, as the extraction of raw materials, including lithium, cobalt, and nickel, can lead to environmental degradation, and the recycling process remains inefficient [7].
This paper introduces a hybrid framework combining PSO with machine learning and deep learning models for predicting the RUL of LIBs. The approach utilizes domain-driven feature engineering to enhance prediction accuracy. Our findings show significant improvements in RUL prediction, particularly using LSTM optimized by PSO. While PSO and domain-driven features have been explored in various RUL prediction models, the novelty of this work lies in the integration of these techniques with multiple machine learning and deep learning models. This hybrid approach not only fine-tunes model performance but also enhances the predictive capability by incorporating domain expertise through feature engineering, a combination that has not been extensively studied in the current literature. Our method simplifies the implementation process by combining existing machine learning and deep learning models with PSO and domain-driven feature engineering, eliminating the need for complex and resource-intensive custom solutions. It reduces computational demands by optimizing hyperparameters efficiently, allowing faster prediction times compared to traditional model training approaches. The use of domain-driven features further enhances accuracy without requiring extensive data preprocessing, improving overall system efficiency. This results in lower maintenance costs and a scalable solution suitable for real-time applications in electric vehicles, energy storage systems, and other sectors relying on LIBs, such as consumer electronics and renewable energy management. The main contributions of this research are as follows:
  • Development of a Hybrid Optimization Framework: This research introduces a novel framework that combines PSO with multiple ML and DL models, offering an innovative approach to hyper-parameter optimization for predicting the RUL of LIBs.
  • Integration of Domain-Driven Feature Engineering: This paper emphasizes the importance of domain-driven feature engineering, showcasing how key battery-specific indicators such as voltage drops, temperature variations, and charging times can significantly enhance the accuracy of RUL predictions.
  • Evaluation of Model Performance Across Multiple Techniques: The research provides a comprehensive evaluation of various ML and DL models, including XGBoost, LightGBM, Random Forest, and LSTM, to predict RUL when optimized with PSO.
  • Insights into the Effect of Battery Degradation Factors: This study identifies critical factors influencing battery degradation, such as internal resistance and temperature fluctuations, and quantifies their impact on prediction accuracy, contributing to a deeper understanding of battery health monitoring.
  • Comparison and Benchmarking of Optimization Techniques: By comparing the performance of models with and without PSO optimization, this paper contributes to the ongoing discussion on the effectiveness of optimization algorithms in improving model generalization, setting a benchmark for future research in battery RUL prediction.
This paper is organized as follows: it begins with a review of related work and the research methodology. The subsequent sections provide an overview of the data, outline the preprocessing methods, and present the hybrid optimization framework (PSO with ML/DL models), emphasizing the incorporation of sophisticated feature engineering approaches. The paper then discusses the RUL prediction models, along with performance evaluation. It concludes with a discussion of the results, including the impact of optimization and feature engineering on prediction accuracy, followed by key conclusions.

2. Literature Review

Techniques for estimating the RUL of LIBs fall into three groups: model-based, data-driven, and hybrid [10]. Model-based methods predict the RUL of LIBs through mathematical formulations that express the physical and chemical phenomena of the battery during its operation. These models consider multiple factors that affect battery performance and health, including cycles, temperature, aging, and usage patterns [9,10]. By applying the model to assess the actual status of the battery and simulate its degradation and performance over time, a fairly accurate estimate of RUL can be made for the management of LIB lifecycles across a host of applications. The three most frequently used model-based approaches are electrochemical models (EM), equivalent circuit models (ECM), and filtering approaches [11]. Electrochemical models derive the laws of battery performance deterioration from the electrochemical processes occurring inside the LIB. They simulate the complex interactions between materials and electrolytes in batteries, using partial differential equations to describe lithium-ion and electron diffusion and the chemical reactions occurring at the electrodes, while accounting for factors such as temperature, aging, and cycling history [12]. However, they are very demanding in computation time, require extensive testing data, and call for specific knowledge of electrochemical processes and numerical simulation. ECMs, in contrast, focus on electrical modeling: the electrochemical behavior of a battery is represented by a combination of voltage sources, capacitors, and resistors to predict the RUL of LIBs [13]. In practice, ECMs may lack physical interpretability or become inaccurate under extreme operating conditions.
Filtering-based assessments of the RUL of LIBs provide a coherent representation of the battery condition over time by eliminating noise in sensor data. Filtering techniques include Kalman filtering (KF) and particle filtering (PF) [14]. Filtering methods, however, often face convergence challenges and are excessively sensitive to initial parameter settings. Some recent studies have emphasized integrating these techniques to benefit from their respective strengths in predicting RUL.
In recent years, data-driven approaches have proven to be the most effective means for LIB RUL prediction. Historical data are used in the data-driven approach to forecast the degradation pattern of a battery [15]. With respect to predicting battery performance and health, data from power usage and environmental variables can be used in the data-driven framework. It differs from advanced physicochemical models in that it does not seek to examine the failure mechanism of the battery. This further makes the data-driven techniques more efficient and more applicable in managing complicated systems when compared with the model-based techniques [16]. Data-driven approaches can be categorized into three: stochastic techniques, ML techniques, and DL techniques. The stochastic process approach is underpinned by principles of statistics and supplemented by additional mathematical notions. Stochastic process models employ probability theory to imitate the uncertainty and randomness that surround LIB deterioration [17]. The application of statistical and mathematical principles enables stochastic process approaches to embrace the complexity and variability associated with battery degradation, thus allowing better RUL predictions. Gaussian process regression (GPR) and the Wiener process are stochastic methodologies.
Recent studies have explored the use of optimization algorithms, such as PSO, in combination with machine learning and deep learning models to predict the RUL of LIBs. For instance, Bide Zhang [18] employed PSO for hyper-parameter tuning of machine learning models, but their approach was limited to a single model without leveraging multi-model optimization. Shaheer [19] proposed an RNN-PSO model for RUL prediction in LIBs utilizing a 31-dimensional multi-channel input framework based on the NASA, MIT, and Stanford battery datasets. While their model improves prediction accuracy with low mean square error, it still faces challenges in generalizing across different battery types and is susceptible to performance degradation in real-world conditions. Similarly, Lu Liu et al. [20] proposed the CEEMDAN-PSO-BiGRU model, which integrates sequence decomposition and optimized neural networks, achieving reliable RUL prediction even with limited data. However, their model still faces limitations because of its reliance on a single model structure and lack of domain-driven feature engineering. In contrast, our framework combines PSO with multi-model techniques and incorporates critical battery indicators such as voltage drops, internal resistance, and temperature fluctuations. This multi-faceted approach provides a more robust and generalizable solution for RUL prediction across diverse datasets. Pang et al. [21] combined PSO with a Particle Filter to predict battery RUL, and their results show improved prediction accuracy compared to simpler models. However, the sensitivity to environmental disturbances could limit the model’s performance in real-world conditions, as it may struggle with fluctuations in external factors affecting battery health. Ma et al. [22] applied PSO with a Back Propagation Neural Network (BPNN) for RUL prediction, achieving a notable improvement in state-of-health estimation. 
Despite this, their reliance on BPNN comes with challenges related to parameter initialization, which may hinder the model’s flexibility and performance in different scenarios, especially with complex datasets. Ye et al. [23] explored chaotic PSO combined with a Particle Filter to account for battery aging mechanisms. While the model shows an increase in prediction accuracy, the added complexity of chaotic PSO could make the model harder to apply or interpret in real-world applications, where simpler models might be more efficient.

3. Materials and Methods

This study proposes a novel framework for predicting the RUL of LIBs by integrating PSO with both ML and DL models. Figure 1 outlines the workflow for optimizing a LIB dataset using data preprocessing, feature engineering, and PSO. The methodology begins with domain-driven feature engineering to extract battery-specific indicators such as voltage drops, temperature fluctuations, and internal resistance, which are critical for accurate RUL prediction. Several ML models, including XGBoost, LightGBM, and Random Forest, as well as DL models such as LSTM, are utilized to model battery degradation. PSO is applied to fine-tune the hyper-parameters of these models and improve prediction accuracy. Preprocessing includes outlier detection and feature scaling to ensure the reliability of the input data. The selected models are evaluated using several performance indicators, including MAE, RMSE, and R², all assessed by cross-validation to ensure the robustness and generalizability of the framework across multiple datasets. This approach is intended as a holistic solution that improves the accuracy and efficiency of LIB RUL prediction.

3.1. Dataset Collection

The Hawaii Natural Energy Institute database contains data on fourteen NMC-LCO 18650 LIBs with a nominal capacity of 2.8 Ah, which were cycled at 25 °C for more than 1000 cycles. The batteries were charged at a constant current of 1C and discharged at a constant current of 1.5C using the constant current and constant voltage (CC-CV) method. The measured parameters, such as the time spent discharging at certain voltages and the charging characteristics, are useful for assessing the battery behavior at each cycle. The maximum voltage at discharge and the minimum voltage at charge characterize the health of the battery. Remaining useful life can be inferred from these data, which assists in scheduling maintenance and replacement of batteries. The dataset holds information on the time duration at constant current and the voltage decrements, which constitute valuable data for estimating the decay of batteries as a function of time. Some important features contained in the dataset, which are relevant to RUL prediction of LIBs, are shown in Table 1 together with their basic descriptive statistics.

3.2. Preprocessing

The preprocessing steps performed on the dataset include several important tasks aimed at preparing the data for analysis. First, we ensured that there were no missing values in the dataset, as all features were populated. Although no imputation or removal was necessary, handling missing data is an important consideration in case any future inconsistencies arise. Data type conversion was also checked to ensure all columns were correctly interpreted as numerical values, which is essential for further analysis. While feature scaling (e.g., normalization or standardization) was not explicitly performed in this step, it is often recommended for algorithms that are sensitive to feature magnitude differences, such as support vector machines or neural networks. Outlier detection was conducted, especially for columns such as Decrement 3.6–3.4 V (s), where extreme negative values were noted, suggesting the need for further treatment such as capping or removal. Correlation analysis was performed to identify potential redundancy among features, guiding decisions on dimensionality reduction. Finally, feature engineering, while not explicitly carried out in this step, could be performed to enhance model performance in future analyses.
The dataset contains 15,064 instances. The Cycle_Index ranges from 1 to 15,064, with a mean of 556.16. The discharge time (s) exhibits a large range, from 8.69 s to 4667 s (about 78 min), with an average of 4581.27 s. The decrement 3.6–3.4 V (s) has a wide standard deviation, indicating large variability; its minimum is now 0 s, because the negative values that previously suggested a data issue or outliers have been corrected. The maximum voltage discharge (V) shows a relatively small variation around 3.91 V, and the minimum voltage charge (V) has a mean of 3.58 V; both are consistent with typical LIB voltage levels. The time at 4.15 V (s) also displays significant variability, reflecting different charging times, with the minimum now set to 0 s (negative values were corrected). The RUL has a mean value of 554.19 s, with a minimum of 0 and a maximum of 15,064 s, indicating the remaining life of batteries across different cycles. These statistics provide essential insights into the battery life cycle and performance under varying conditions, ensuring that the data are now consistent and meaningful for further analysis.
The correlation matrix, Figure 2, reveals several important relationships among the features. Notably, Cycle_index shows a strong negative correlation with RUL, which is expected as the battery’s remaining life decreases with increasing cycle count. Discharge time (s) exhibits moderate correlations with features such as time at 4.15 V (s) and charging time (s), indicating potential interdependencies in battery usage and performance. Some negative correlations are also observed with a decrement of 3.6–3.4 V (s), which could suggest specific battery behavior or measurement irregularities. While no features show perfect redundancy, the moderate correlations observed between certain variables indicate that dimensionality reduction techniques, such as principal component analysis (PCA), may be beneficial. These findings suggest that careful consideration of feature selection is crucial for building an effective prediction model for RUL.
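The two preprocessing operations described above, capping negative durations at zero and inspecting pairwise correlations, can be sketched as follows. This is a minimal illustration on made-up toy values, not the authors' code; the function names and the column name `Decrement_3.6_3.4V_s` are hypothetical, and only NumPy is assumed.

```python
import numpy as np

def clip_negative(values):
    """Cap negative durations at 0 s (one simple way to treat such outliers)."""
    return np.clip(np.asarray(values, dtype=float), 0.0, None)

def pearson_matrix(columns):
    """Pearson correlation matrix for a dict mapping feature name -> 1-D array."""
    names = list(columns)
    data = np.vstack([np.asarray(columns[n], dtype=float) for n in names])
    return names, np.corrcoef(data)  # rows of `data` are treated as variables

# Toy values only; they are NOT taken from the dataset.
cycle_index = np.arange(1, 7)
rul = cycle_index.max() - cycle_index          # RUL falls as cycles accumulate
decrement = clip_negative([-120.0, 950.0, 900.0, 860.0, -5.0, 800.0])

names, corr = pearson_matrix({
    "Cycle_Index": cycle_index,
    "Decrement_3.6_3.4V_s": decrement,
    "RUL": rul,
})
print(names)
print(np.round(corr, 2))
```

In this toy case the correlation between `Cycle_Index` and `RUL` is exactly −1 by construction, which mirrors the strong negative correlation reported for the real data.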

3.3. Domain-Driven Feature Engineering

Feature engineering (FE) is essential for improving predictive models, especially for complex time-series data such as battery performance. In this study, we focus on domain-driven feature engineering to extract key indicators that reflect the underlying electrochemical processes responsible for battery degradation [24]. The selected features, including voltage drops, temperature fluctuations, and internal resistance, are crucial indicators of battery degradation, directly linked to these processes. Voltage drops reflect the increase in internal resistance, which is a result of electrode material degradation and changes in electrolyte composition. As the battery degrades, this internal resistance hampers current flow, leading to energy losses. Temperature fluctuations are also important, as temperature directly influences the rate of electrochemical reactions, such as lithium-ion diffusion, which accelerates degradation. Furthermore, internal resistance increases as the battery ages, directly affecting its capacity and efficiency [25]. These features are critical for predicting RUL and are widely recognized in the literature as essential for accurate battery health modeling and RUL prediction.
Voltage and temperature derivatives (first differences) are calculated to capture degradation. Let $V_t$ and $T_t$ represent the voltage and temperature at time $t$, respectively. Then:
$$\Delta V_t = V_t - V_{t-1}$$
$$\Delta T_t = T_t - T_{t-1}$$
Cumulative charging and discharging cycles provide insight into battery degradation. Let $C_t$ and $D_t$ represent the cumulative charging and discharging times, respectively, at time $t$:
$$C_t = \sum_{i=1}^{t} C_i, \qquad D_t = \sum_{i=1}^{t} D_i$$
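As a rough illustration of the difference and cumulative-sum features defined above, the following NumPy sketch computes them on toy series. The function names and the sample values are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def first_difference(series):
    """Return x_t - x_{t-1}; the first entry is NaN because it has no predecessor."""
    x = np.asarray(series, dtype=float)
    out = np.full(x.shape, np.nan)
    out[1:] = np.diff(x)
    return out

def cumulative_time(durations):
    """Running total of per-step charging (or discharging) times."""
    return np.cumsum(np.asarray(durations, dtype=float))

# Toy values only.
voltage = [4.20, 4.15, 4.05, 3.90]          # V
charge_time = [3600.0, 3550.0, 3500.0]      # s per cycle

delta_v = first_difference(voltage)
cum_charge = cumulative_time(charge_time)
print(delta_v)
print(cum_charge)
```

The same `first_difference` helper applies to a temperature series when one is available.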
Internal resistance, denoted $R_t$, increases with battery degradation. It is calculated as the difference between the open-circuit voltage $V_{oc,t}$ and the terminal voltage $V_{term,t}$, divided by the applied current $I_t$. As the battery degrades, internal resistance increases because of factors such as the formation of a solid-electrolyte interface (SEI) and the breakdown of electrode materials. The formula for calculating internal resistance is:
$$R_t = \frac{V_{oc,t} - V_{term,t}}{I_t}$$
where:
  • $V_{oc,t}$ is the open-circuit voltage at time $t$, measured when the battery is not under load.
  • $V_{term,t}$ is the terminal voltage at time $t$, measured under load.
  • $I_t$ is the current applied during the measurement.
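The internal-resistance formula above can be sketched in a few lines of NumPy. This is an illustrative sketch under the assumption that open-circuit voltage, terminal voltage, and current are available as equally long arrays; the function name, the zero-current guard `eps`, and the sample values are hypothetical.

```python
import numpy as np

def internal_resistance(v_oc, v_term, current, eps=1e-9):
    """R_t = (V_oc - V_term) / I, with NaN where the current is (near) zero."""
    v_oc, v_term, current = (
        np.asarray(a, dtype=float) for a in (v_oc, v_term, current)
    )
    r = np.full(v_oc.shape, np.nan)
    mask = np.abs(current) > eps          # avoid dividing by zero current
    r[mask] = (v_oc[mask] - v_term[mask]) / current[mask]
    return r

# Toy values only: volts, volts, amperes.
r = internal_resistance(
    v_oc=[4.00, 3.98, 3.95],
    v_term=[3.85, 3.80, 3.95],
    current=[3.0, 3.0, 0.0],
)
print(r)
```

Returning NaN for rest periods, rather than zero or infinity, is a design choice of this sketch so that such samples can be dropped or interpolated explicitly downstream.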
As a battery undergoes charge and discharge cycles, its internal components degrade, leading to an increase in internal resistance. Several electrochemical factors contribute to this degradation. First, solid-electrolyte interface (SEI) formation occurs as the battery ages, with an SEI layer forming on the anode. This layer increases resistance and results from chemical reactions between the electrolyte and the electrode materials. Over time, the SEI layer thickens, leading to higher resistance and reduced efficiency in ion transfer. Second, electrode material degradation occurs because of repeated cycling. The structure of the electrode materials changes, including the loss of active material and the development of cracks. These changes hinder the movement of ions, increasing internal resistance [25].
Temperature data are not included in our dataset; therefore, we assume a fixed temperature value for the sake of consistency with standard battery degradation models. As such, temperature variability is not directly considered in the feature engineering process. In general, temperature variability is quantified by the standard deviation of temperature $\sigma_{T,t}$ over the last $N$ time steps:
$$\sigma_{T,t} = \sqrt{\frac{1}{N} \sum_{i=t-N+1}^{t} \left(T_i - \mu_{T,t}\right)^2}$$
where $\mu_{T,t}$ is the mean temperature over the last $N$ time steps. The cycle count $N_{cycles,t}$ is the total number of completed cycles up to time $t$:
$$N_{cycles,t} = \sum_{i=1}^{t} \mathbb{1}_{\text{cycle complete}}(i)$$
where $\mathbb{1}_{\text{cycle complete}}(i)$ is an indicator function that returns 1 if a cycle is completed at time step $i$, and 0 otherwise. Domain-specific features include charge/discharge efficiency and a composite health indicator $HI_t$, which combines voltage, temperature, resistance, and cycles:
$$HI_t = w_1 \times \frac{V_t}{V_{max}} + w_2 \times \frac{T_t}{T_{max}} + w_3 \times \frac{R_t}{R_{max}} + w_4 \times \frac{N_{cycles,t}}{N_{max}}$$
where $V_{max}$, $T_{max}$, $R_{max}$, and $N_{max}$ are the maximum observed values for each feature, and $w_1$, $w_2$, $w_3$, and $w_4$ are weights based on feature importance.
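A hedged sketch of the rolling temperature deviation, the cycle count, and the composite indicator defined above is given below. The paper does not report the weight values, so the equal weights used here are purely an assumption, as are the function names and the toy inputs.

```python
import numpy as np

def rolling_std(x, n):
    """Population standard deviation over the last n samples (NaN until n samples exist)."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    for t in range(n - 1, len(x)):
        out[t] = np.std(x[t - n + 1 : t + 1])   # ddof=0 matches the 1/N form
    return out

def cycle_count(cycle_complete):
    """Running count of completed cycles from a 0/1 indicator series."""
    return np.cumsum(np.asarray(cycle_complete, dtype=int))

def health_indicator(v, temp, r, n_cycles, weights):
    """Weighted sum of features, each normalized by its maximum observed value."""
    w1, w2, w3, w4 = weights
    v, temp, r, n_cycles = (
        np.asarray(a, dtype=float) for a in (v, temp, r, n_cycles)
    )
    return (w1 * v / v.max() + w2 * temp / temp.max()
            + w3 * r / r.max() + w4 * n_cycles / n_cycles.max())

# Toy values only; equal weights are an assumption, not the paper's choice.
sigma = rolling_std([1.0, 2.0, 3.0, 4.0], n=2)
cycles = cycle_count([0, 1, 0, 1])
hi = health_indicator(
    v=[4.0, 3.9], temp=[25.0, 25.0], r=[0.05, 0.06], n_cycles=[1, 2],
    weights=(0.25, 0.25, 0.25, 0.25),
)
print(sigma, cycles, hi)
```

With a fixed assumed temperature, as in this study, the temperature term contributes a constant and the rolling deviation is zero, so these two features only become informative when measured temperatures are available.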
These features are used to train machine learning models for RUL prediction, combining domain expertise and advanced feature extraction techniques to ensure accurate battery health modeling.

3.4. Model Selection

This study selects a combination of ML and DL models to robustly predict the RUL of LIBs. The ML models considered are XGBoost, LightGBM, Random Forest, Support Vector Regression, Gradient Boosting Machine (GBM), AdaBoost, and CatBoost. These models were chosen because they are well-suited to structured, tabular data and feature-intensive problems, and have a proven ability to capture complex relationships among features. Each of these algorithms has its own strengths; for example, XGBoost and LightGBM are fast and effective when working with large datasets, whereas Random Forest is robust in performance and simple to use. In addition to these methods, deep learning models such as Recurrent Neural Networks (RNNs) and Long Short-term Memory (LSTM) networks have been added to the framework. These models are known to capture the temporal dependencies in data that are crucial to interpreting the behavior of LIBs over time. Further, RNNs and LSTMs handle sequential data efficiently, making them well suited to modeling time-series data on battery usage and degradation. Combining the two approaches, ML and DL, allows the framework to draw on the advantages of both techniques and provide a more holistic solution for the prediction of LIB remaining useful life.

3.5. Particle Swarm Optimization

PSO is a bio-inspired optimization technique based on collective swarm behavior, such as bird flocking or fish schooling. In the present study, PSO is applied to optimize the hyperparameters of the ML and DL models for predicting the RUL of LIBs. The algorithm initializes a swarm of particles, each representing a potential solution (a set of hyper-parameters). These particles traverse the search space and update their positions according to their own past experience and the experience of their neighbors, with the goal of reducing the prediction error.
The velocity of each particle is updated according to the equation:
$$v_i^{t+1} = w\, v_i^{t} + c_1 r_1 \left(pbest_i - x_i^{t}\right) + c_2 r_2 \left(gbest - x_i^{t}\right)$$
where $v_i^{t}$ is the current velocity, $x_i^{t}$ is the current position, $pbest_i$ and $gbest$ are the personal and global best positions, $w$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$ and $r_2$ are random numbers drawn uniformly from $[0, 1]$; together these control the particle’s movement. The position of each particle is then updated by adding the updated velocity to its current position:
$$x_i^{t+1} = x_i^{t} + v_i^{t+1}$$
This iterative process allows PSO to find optimal hyperparameters for models, therefore increasing predictive accuracy for RUL prediction. Additionally, PSO explores the hyper-parameter space efficiently, which enhances the performance of various models used for LIBs. The key parameters for the PSO algorithm used are summarized in Table 2.
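The velocity and position updates above can be illustrated with a compact global-best PSO in NumPy. This is a sketch under stated assumptions rather than the configuration in Table 2: the swarm size, iteration count, and the values of `w`, `c1`, and `c2` are placeholder choices, and the quadratic `toy_validation_error` merely stands in for the validation error of a trained model.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; `bounds` is a list of (low, high) pairs, one per dimension."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # initial positions
    v = np.zeros_like(x)                               # initial velocities
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + cognitive pull + social pull.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        # Position update, clipped so particles stay inside the search box.
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

def toy_validation_error(params):
    """Hypothetical stand-in for a model's validation error; minimum at (-3, 0.2)."""
    log10_lr, dropout = params
    return (log10_lr + 3.0) ** 2 + (dropout - 0.2) ** 2

best, best_val = pso_minimize(toy_validation_error, bounds=[(-5.0, -1.0), (0.0, 0.5)])
print(best, best_val)
```

In a real tuning run, the objective would train a model with the candidate hyper-parameters and return its cross-validated error, which makes each evaluation far more expensive than this toy function.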

3.6. LSTM Architecture

The LSTM model employed in this study consists of the following architectural parameters, which were carefully selected to optimize performance in predicting the RUL of LIBs. The LSTM model is designed to capture temporal dependencies in sequential data, making it ideal for time-series predictions such as battery degradation. The architecture of the LSTM model applied is shown in Table 3.
The model is configured with two stacked LSTM layers, each containing 128 units, enabling the network to capture both short-term and long-term dependencies in the data. The tanh activation function is applied to the cell state, while the sigmoid activation function controls the gates (input, forget, and output gates), managing the flow of information. To prevent overfitting and improve generalization, a dropout rate of 0.2 is applied after each LSTM layer. The model is trained using the Adam optimizer with an initial learning rate of 0.001, which is widely recognized for its efficiency in deep learning tasks. The loss function used is MSE, suitable for regression tasks such as predicting the RUL of LIBs. A batch size of 64 is chosen to ensure a balance between training efficiency and model stability. This configuration was carefully selected to provide a robust framework for accurately predicting the RUL of LIBs.

3.7. Model Training and Evaluation

The ML and DL models were trained for RUL prediction of the LIBs, with 70% of the data used for training and 30% for testing. During training, each model identifies patterns and relationships based on features such as cycle index, discharge time, and charging time, among others. Hyper-parameter optimization through PSO is included in the training process to obtain a well-tuned, high-performing model. K-fold cross-validation is applied to assess the generalizability of the models. In this technique, the dataset is divided into k subsets; the model is trained on k − 1 subsets and validated on the remaining subset, and the procedure is repeated k times so that each subset serves once for validation, which improves robustness and helps avoid overfitting.
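The 70/30 split and the k-fold procedure described above can be sketched with index arrays in NumPy. The value of k and the use of random shuffling are assumptions for illustration, since the text does not state them; for strongly time-ordered data, a chronological split may be preferable to shuffling.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into (train, test) index arrays."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return idx[n_test:], idx[:n_test]

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index arrays; each fold validates exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

train_idx, test_idx = train_test_split_indices(100)
folds = list(k_fold_indices(10, k=5))
print(len(train_idx), len(test_idx), len(folds))
```

Each `(train, val)` pair would be used to fit a model on `train` and score it on `val`, and the k scores would then be averaged.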

3.8. Performance Evaluation

After optimizing the model, it is imperative to evaluate its performance in predicting RUL, especially on unseen data. This evaluation gauges how well the model generalizes beyond the data it was trained on, which is essential for practical deployment in real-world battery management. Model performance was assessed with the MAE, RMSE, and R2 metrics [25]. MAE measures the average magnitude of the prediction errors, while RMSE penalizes larger errors more heavily; together they describe the discrepancies between predicted and actual values. R2 measures how well the model fits the data but may not reflect behavior under extreme operating conditions. Used together, the three metrics capture both accuracy and robustness.
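$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$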
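The three metrics can be computed directly from their definitions. The sketch below uses plain Python; the function names and the four sample values are made up for illustration and are not data from this study.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up RUL values (in cycles) for illustration only.
actual = [100.0, 80.0, 60.0, 40.0]
predicted = [98.0, 83.0, 59.0, 40.0]
print(mae(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

In this toy example the single 3-cycle error contributes 9 of the 14 squared-error units, which is why RMSE (about 1.87) exceeds MAE (1.5).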

4. Results and Discussion

This section presents and analyzes the results of applying the hybrid framework, which integrates PSO with different ML and DL models, to predict the RUL of LIBs. The dataset comprises 15,064 battery performance instances and was preprocessed to remove anomalies and outliers. Domain-driven features such as discharge time, voltage, and temperature levels were extracted to improve prediction accuracy. Model performance is evaluated using the MAE, RMSE, and R2 metrics. The results show the effect of PSO optimization on each model and reveal that the models differ in how well they predict RUL. The influence of battery degradation factors, such as internal resistance and temperature fluctuations, is also investigated. Based on these findings, we aim to give an overall view of the improvements the framework can offer for battery management and maintenance strategies.
Table 4 compares the performance of the models in predicting the RUL of LIBs. LSTM outperforms all other models, achieving the lowest MAE (0.34) and RMSE (0.76, corresponding to an MSE of 0.58) and the highest R2 (0.93). GBM and XGBoost also perform well, with MAE values of 0.75 and 0.79, respectively, and competitive RMSE scores. LightGBM and CatBoost show slightly lower accuracy, with MAE values of 0.82 and 0.80. SVR exhibits the highest errors in Table 4 (MAE of 1.05, RMSE of 1.25), suggesting that tree-based ensembles and the LSTM are better suited to RUL prediction on this dataset. Overall, LSTM proves to be the most reliable model, though other models remain viable for different applications.
The impact of optimizing model hyper-parameters with PSO is explored in Table 5. PSO was used to tune key parameters such as the learning rate, tree depth, and number of iterations. The optimized models show improved prediction accuracy, with the largest gains for the deep learning models. Table 5 reports a 30% performance improvement for LSTM and 25% for RNN, while the remaining models improve by 10% to 18%; XGBoost and CatBoost, for example, improve by 15% and 16%, respectively. In terms of error metrics, the LSTM MAE decreases from 0.34 to 0.30 and its RMSE from 0.76 to 0.68. These results highlight the effectiveness of PSO in enhancing the predictive capabilities of the models.
Table 6 compares model performance with and without PSO optimization. After applying PSO, every model shows a lower MAE and RMSE and a higher R2. LSTM, for example, reduces its MAE from 0.34 to 0.30 and its RMSE from 0.76 to 0.68, while its R2 increases from 0.93 to 0.96. These gains show how PSO fine-tunes key parameters, improving accuracy and making the models more reliable for predicting the RUL of LIBs. The consistent improvements across all models reinforce the value of hyper-parameter optimization for battery management.
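When reading such gains, it helps to state which quantity a percentage refers to. As a small worked example, the relative reduction of each error metric can be computed from the LSTM values in Table 6. The helper `relative_reduction` is a hypothetical name, and this is only one possible definition of "improvement"; the definition behind the percentages in Table 5 is not specified in this section.

```python
def relative_reduction(before, after):
    """Percentage reduction of an error metric (positive means lower error)."""
    return 100.0 * (before - after) / before

# LSTM values reported in Table 6 (without PSO -> with PSO).
mae_gain = relative_reduction(0.34, 0.30)
rmse_gain = relative_reduction(0.76, 0.68)
print(f"MAE reduction:  {mae_gain:.1f}%")
print(f"RMSE reduction: {rmse_gain:.1f}%")
```

Under this definition, the MAE falls by about 11.8% and the RMSE by about 10.5%.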
Table 7 shows the effect of feature engineering on model performance. Incorporating domain-specific features improves prediction accuracy for all models. With domain-driven features, LSTM reduces its MAE by roughly 10% (from 0.33 to 0.29) and its RMSE from 0.74 to 0.66, while its R2 increases from 0.90 to 0.93. These improvements demonstrate that feature engineering makes predictions of the remaining useful life of LIBs more accurate and robust. The gains observed across all models show that carefully selected features can have a large impact on performance, which underscores the importance of domain-driven insights in the feature engineering process.
We carried out multiple-cycle tests to evaluate the performance of the prediction models. The test simulates a typical real-life scenario in which batteries are used over repeated charge and discharge cycles and their performance degrades over time. Such tests make it possible to assess the robustness and stability of the models as batteries age and their characteristics change. Table 8 reports the performance of three models, LSTM, XGBoost, and GBM, over 10 cycles. As expected, the prediction performance of all models drops as the battery ages. In every cycle, LSTM outperforms the other models, with a markedly lower MAE and RMSE and a higher R2. The errors of XGBoost and GBM also increase with every cycle, indicating a decline in prediction accuracy. This trend reflects the battery’s aging process and shows how degradation affects model performance; LSTM remains the most accurate model throughout the 10 cycles.
The sensitivity analysis in Table 9 shows how small variations (5%) in key battery features affect the RUL predicted by the models. Increases in internal resistance and temperature have the most adverse influence, causing the largest drops in predicted RUL, particularly for the LSTM model. The cycle index has a progressive effect, with all models reporting reduced RUL as the number of cycles increases. The health index is positively correlated with predicted RUL, as higher values denote a longer remaining battery life. This analysis indicates that certain battery characteristics, especially internal resistance and temperature, are vital for accurate RUL prediction. Sensitivity results also support better model tuning and more precise calibration of predictions to the real-world behavior of batteries. Monitoring these parameters is therefore critical when designing battery management systems.
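A one-at-a-time sensitivity test of this kind can be sketched as follows. The function `sensitivity` and the stand-in model `toy_predict`, including its weights and baseline values, are hypothetical; in practice `predict` would wrap a trained LSTM, XGBoost, or GBM model.

```python
def sensitivity(predict, baseline, feature, rel_change=0.05):
    """Percent change in the prediction when one feature is scaled by (1 + rel_change)."""
    perturbed = dict(baseline)             # copy so the baseline is not modified
    perturbed[feature] *= 1.0 + rel_change
    base = predict(baseline)
    return 100.0 * (predict(perturbed) - base) / base

# Hypothetical stand-in for a trained model: a linear score with made-up weights.
def toy_predict(x):
    return 1000.0 - 2000.0 * x["internal_resistance"] - 5.0 * x["temperature"]

baseline = {"internal_resistance": 0.05, "temperature": 25.0}
for name in baseline:
    print(name, round(sensitivity(toy_predict, baseline, name), 2))
```

Each feature is perturbed while the others are held at their baseline values, so interactions between features are not captured by this scheme.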
Of the models tested, LSTM performed best in predicting the RUL of LIBs. To examine its performance further, we cross-validated it; the results are presented in Table 10. Across the 10 folds, the average MAE is 0.35, the average RMSE is 0.77, and the average R2 is 0.92, each with a small standard deviation (0.01 to 0.02). This consistency across folds indicates that the LSTM model is stable and reliable for RUL prediction, including in real-world applications where variability always exists, and supports its selection as the most appropriate model for predicting the remaining useful life of LIBs.
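The fold averages can be reproduced from the per-fold values in Table 10. The sketch below uses the population standard deviation (`pstdev`); that choice is an assumption, since the text does not say whether the sample or population form was used.

```python
from statistics import mean, pstdev

# Per-fold LSTM results copied from Table 10.
fold_mae = [0.34, 0.35, 0.33, 0.36, 0.37, 0.34, 0.38, 0.35, 0.36, 0.34]
fold_rmse = [0.76, 0.77, 0.75, 0.78, 0.79, 0.76, 0.80, 0.77, 0.78, 0.76]
fold_r2 = [0.93, 0.92, 0.94, 0.91, 0.91, 0.93, 0.90, 0.92, 0.91, 0.93]

for name, values in [("MAE", fold_mae), ("RMSE", fold_rmse), ("R2", fold_r2)]:
    print(f"{name}: mean={mean(values):.2f}, std={pstdev(values):.3f}")
```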
Finally, we compare our work with the current state of the art in RUL prediction, as outlined in Table 11, highlighting both the strengths and limitations of the proposed approach in the context of recent works. Our multi-model framework, combined with domain-driven feature engineering, leads to high prediction accuracy. The main strength of our study lies in its flexibility and accuracy, which allow it to outperform traditional methods. However, a key limitation is the reliance on the Hawaii NMC-LCO dataset, which may restrict its ability to generalize across different battery types and real-world operating conditions. Pang et al. [21] also use PSO for RUL prediction, but their model is more sensitive to environmental disturbances, which could impact performance in real-world applications. Ma et al. [22] combine PSO with a BPNN and report improvements, but they face challenges with parameter initialization, which can affect consistency across datasets. Ye et al. [23] use chaotic PSO with a particle filter to model battery aging; this improves accuracy but adds significant complexity, making the model harder to apply in practical scenarios. Similarly, Gao & Huang [26] and Mo et al. [27] use PSO with hybrid models, but their reliance on SVM and Kalman filters, respectively, limits their ability to generalize and perform well in noisy environments.
In contrast, our study benefits from a more flexible multi-model framework and the use of domain-specific feature engineering. While other works focus on optimizing single models or adding complexity through hybrid techniques, they still face challenges in terms of model scalability and real-world applicability. Despite these limitations, each approach contributes valuable insights into improving RUL predictions, but our study’s combination of multiple models and specialized features provides a more adaptable and accurate solution.
Figure 3 shows that LSTM outperforms both GBM and XGBoost in terms of training and testing errors (MAE and RMSE). LSTM consistently achieves the lowest training error, highlighting its superior ability to learn and model the data. GBM and XGBoost exhibit higher training errors, with GBM showing marginally better testing performance than XGBoost. Both of these models show a gradual reduction in testing error over epochs, but LSTM consistently achieves the lowest testing error, suggesting that it generalizes best to unseen data. Overall, while GBM and XGBoost are reliable, LSTM remains the top performer, especially with respect to testing error.
The relationship between cumulative charging/discharging cycles and RUL follows an exponential decay model, as shown in Figure 4. As the number of cycles increases, the RUL decreases exponentially, highlighting battery degradation over time. The color gradient reflects varying RUL levels, with batteries in the early stages showing higher RUL and those in later stages exhibiting significantly lower RUL. The graph underscores the decline in battery life as cycles accumulate and emphasizes the importance of monitoring charging/discharging patterns for accurate RUL prediction, as demonstrated by models such as LSTM in this study.
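An exponential decay of the form RUL(N) = a·exp(−b·N) can be fitted by ordinary least squares on the logarithm of RUL. The sketch below is illustrative: the symbols a and b, the function `fit_exponential`, and the synthetic data points are assumptions and do not come from the study’s dataset. The log transform requires strictly positive RUL values.

```python
import math

def fit_exponential(cycles, rul):
    """Least-squares fit of ln(RUL) = ln(a) - b * cycles; returns (a, b)."""
    n = len(cycles)
    logs = [math.log(r) for r in rul]
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope

# Synthetic points generated from RUL = 1000 * exp(-0.002 * N); not data from the study.
cycles = [0, 200, 400, 600, 800]
rul = [1000.0 * math.exp(-0.002 * n) for n in cycles]
a, b = fit_exponential(cycles, rul)
print(a, b)
```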
The LSTM model outperforms all other models, with an MAE of 0.34 and an R2 of 0.93 (0.30 and 0.96 after PSO optimization, Table 6), and shows minimal deviation from the ideal line in the actual vs. predicted RUL plots of Figure 5a–c. Its ability to capture temporal dependencies in battery data makes it highly effective for RUL prediction. XGBoost performs well, with an MAE of 0.72 and an R2 of 0.91 (its PSO-optimized values in Table 6), but shows slightly more scatter, indicating that it handles sequential data less well than LSTM. GBM also provides strong results, with an MAE of 0.70 and an R2 of 0.92 (also PSO-optimized), but its predictions likewise deviate more than those of LSTM. Overall, while LSTM is the best model for capturing battery degradation, the other models offer competitive performance and could be useful in various applications depending on computational constraints and data characteristics.
The 3D surface plot in Figure 6 illustrates the relationship between cycle count, discharge time, and the predicted RUL of LIBs with a more realistic nonlinear degradation model. The plot incorporates quadratic and interaction terms, capturing how the degradation accelerates as cycle count and discharge time increase. The surface shows that, while both cycle count and discharge time negatively affect RUL, their combined effects lead to nonlinear degradation, where the impact of these factors becomes more pronounced at higher values. The inclusion of interaction terms between cycle count and discharge time provides a deeper understanding of how these two factors jointly influence battery life.
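A surface with quadratic and interaction terms of this kind can be written in the following general form. The coefficient symbols are assumptions introduced for illustration, with c the cycle count and t the discharge time; the fitted coefficient values are not given in this section.

```latex
\widehat{\mathrm{RUL}}(c, t) = \beta_0 + \beta_1 c + \beta_2 t + \beta_3 c^2 + \beta_4 t^2 + \beta_5\, c\, t,
\qquad
\frac{\partial \widehat{\mathrm{RUL}}}{\partial c} = \beta_1 + 2 \beta_3 c + \beta_5 t
```

The partial derivative shows why the effect of cycle count depends on discharge time: the interaction coefficient β5 shifts the slope with respect to c as t grows, and the quadratic coefficient β3 makes the slope steeper at higher cycle counts when it is negative.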
We also evaluated the model using a cumulative residuals plot, which tracks the running sum of the prediction errors (residuals) between the observed and predicted RUL values, as shown in Figure 7. The cumulative residuals fluctuate around zero, indicating that the model’s predictions are unbiased, with no significant systematic overestimation or underestimation. This suggests that the model is well calibrated, with positive and negative errors canceling out over time. The fluctuations that remain point to minor inconsistencies in the predictions, which could be reduced through further optimization. The MAE and RMSE results reported above corroborate that the model’s performance is robust.
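The running sum behind such a plot is simple to compute. In the sketch below, the function name `cumulative_residuals` and the sample values are made up for illustration; residuals are taken as actual minus predicted.

```python
from itertools import accumulate

def cumulative_residuals(actual, predicted):
    """Running sum of residuals (actual minus predicted)."""
    return list(accumulate(a - p for a, p in zip(actual, predicted)))

# Made-up RUL values for illustration only.
actual = [100.0, 90.0, 80.0, 70.0, 60.0]
predicted = [101.0, 89.0, 81.0, 69.0, 60.0]
curve = cumulative_residuals(actual, predicted)
print(curve)
```

A curve that keeps returning to zero, as in this toy case, indicates that over- and under-predictions balance; a steady drift in one direction would indicate systematic bias.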

5. Conclusions

This study demonstrates the effectiveness of combining PSO with Long Short-Term Memory (LSTM) networks to improve RUL predictions for LIBs. Optimizing the models with PSO resulted in significant performance improvements, with LSTM showing a 30% enhancement in prediction accuracy. Domain-driven feature engineering, based on factors such as voltage fluctuations, charging times, and internal resistance, further improved performance by enhancing the models’ ability to capture battery degradation patterns; for LSTM, it reduced the MAE by about 10% and increased R2. This approach offers significant benefits for battery management systems, particularly in applications such as electric vehicles and renewable energy storage, where accurate RUL predictions are essential for improving efficiency, safety, and cost-effectiveness. In conclusion, the combination of PSO optimization and LSTM networks offers a powerful solution for battery life prediction, improving upon existing methods in terms of accuracy and deployment efficiency. The results highlight the importance of integrating optimization techniques and domain knowledge to obtain more reliable predictions for real-world applications.

6. Future Work and Limitations

Although the PSO-LSTM framework demonstrated strong performance in predicting the RUL of LIBs, there are areas for future improvement. One limitation is that the HNEI dataset used in this study does not include temperature variations, which play a crucial role in battery degradation by affecting internal resistance, voltage, and capacity. As temperature directly influences these factors, the lack of temperature data limits the model’s ability to fully capture the complexities of real-world battery behavior. For future work, we plan to validate the framework using dynamic datasets that include temperature variations, such as CALCE and Oxford, to improve the model’s robustness and generalizability across different operating conditions. Additionally, we will explore more advanced feature engineering techniques to incorporate temperature effects more effectively, allowing for more accurate predictions of battery health under varied environmental conditions.

Author Contributions

Conceptualization, F.H. and M.S.S.; methodology, M.S.S. and Z.A.A.; software, M.S.S. and M.A.; validation, M.I.M., M.H., and F.H.; formal analysis, M.S.S. and G.A.; investigation, M.I.M. and G.A.; resources, M.I.M.; data curation, F.H.; writing—original draft preparation, F.H., T.A.J., and G.A.; writing—review and editing, T.A.J. and F.H.; visualization, M.I.M.; supervision, M.H.; project administration, M.I.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

Author Muhammad Salman Saeed was employed by the company Multan Electric Power Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Filonchyk, M.; Peterson, M.; Zhang, L.; Hurynovich, V.; He, Y. Greenhouse gases emissions and global climate change: Examining the influence of CO2, CH4, and N2O. Sci. Total Environ. 2024, 935, 173359.
  2. Roy, J.J.; Madhavi, S.; Cao, B. Metal extraction from spent lithium-ion batteries (LIBs) at high pulp density by environmentally friendly bioleaching process. J. Clean. Prod. 2021, 280, 124242.
  3. Kalaikkanal, K.; Gobinath, N. A review on Lithium-ion battery failure risks and mitigation indices for electric vehicle applications. Appl. Energy 2025, 393, 126139.
  4. Ramkumar, G.; Kannan, S.; Mohanavel, V.; Karthikeyan, S.; Titus, A. The future of green mobility: A review exploring renewable energy systems integration in electric vehicles. Results Eng. 2025, 27, 105647.
  5. Tarascon, J.M.; Armand, M. Issues and challenges facing rechargeable lithium batteries. Nature 2001, 414, 359–367.
  6. Song, S.; Munk-Nielsen, S.; Knap, V.; Uhrenfeldt, C. Performance evaluation of lithium-ion batteries (LiFePO4 cathode) from novel perspectives using a new figure of merit, temperature distribution analysis, and cell package analysis. J. Energy Storage 2021, 44, 103413.
  7. Safavi, V.; Vaniar, A.M.; Bazmohammadi, N.; Vasquez, J.C.; Guerrero, J.M. Battery Remaining Useful Life Prediction Using Machine Learning Models: A Comparative Study. Information 2024, 15, 124.
  8. Zhao, J.; Zhu, Y.; Zhang, B.; Liu, M.; Wang, J.; Liu, C.; Hao, X. Review of State Estimation and Remaining Useful Life Prediction Methods for Lithium–Ion Batteries. Sustainability 2023, 15, 5014.
  9. Naiek, S.M.; Aungsuthar, S.; Harper, C.; Hendrickson, C. Battery Electric Vehicle Safety Issues and Policy: A Review. World Electr. Veh. J. 2025, 16, 365.
  10. Madani, S.S.; Shabeer, Y.; Allard, F.; Fowler, M.; Ziebert, C.; Wang, Z.; Panchal, S.; Chaoui, H.; Mekhilef, S.; Dou, S.X.; et al. A Comprehensive Review on Lithium-Ion Battery Lifetime Prediction and Aging Mechanism Analysis. Batteries 2025, 11, 127.
  11. Wu, L.; Guo, W.; Tang, Y.; Sun, Y.; Qin, T. Remaining Useful Life Prediction of Lithium-Ion Batteries Based on Neural Network and Adaptive Unscented Kalman Filter. Electronics 2024, 13, 2619.
  12. Tulabi, M.; Bubbico, R. Electrochemical–Thermal Modeling of Lithium-Ion Batteries: An Analysis of Thermal Runaway with Observation on Aging Effects. Batteries 2025, 11, 178.
  13. de la Vega, J.; Riba, J.R.; Ortega-Redondo, J.A. Mathematical Modeling of Battery Degradation Based on Direct Measurements and Signal Processing Methods. Appl. Sci. 2023, 13, 4938.
  14. Jiao, Z.; Gao, Z.; Chai, H. Estimating state of charge of lithium-ion battery using an adaptive fractional-order Kalman-unscented particle filter. J. Energy Storage 2025, 128, 116873.
  15. Li, A.; Tian, H.; Li, K. Remaining useful life prediction of lithium-ion batteries using a spatial temporal network model based on capacity self-recovery effect. J. Energy Storage 2023, 67, 107557.
  16. Kouhestani, H.S.; Liu, L.; Wang, R.; Chandra, A. Data-driven prognosis of failure detection and prediction of lithium-ion batteries. J. Energy Storage 2023, 70, 108045.
  17. Ciobanu, G. Analyzing Non-Markovian Systems by Using a Stochastic Process Calculus and a Probabilistic Model Checker. Mathematics 2023, 11, 302.
  18. Zhang, B.; Liu, W.; Cai, Y.; Zhou, Z.; Wang, L.; Liao, Q.; Fu, Z.; Cheng, Z. State of health prediction of lithium-ion batteries using particle swarm optimization with Levy flight and generalized opposition-based learning. J. Energy Storage 2024, 84, 110816.
  19. Ansari, S.; Ayob, A.; Lipu, M.S.H.; Hussain, A.; Saad, M.H.M. Particle swarm optimized data-driven model for remaining useful life prediction of lithium-ion batteries by systematic sampling. J. Energy Storage 2022, 56, 106050.
  20. Liu, L.; Sun, W.; Yue, C.; Zhu, Y.; Xia, W. Remaining Useful Life Estimation of Lithium-Ion Batteries Based on Small Sample Models. Energies 2024, 17, 4932.
  21. Pang, H.; Chen, K.; Geng, Y.; Wu, L.; Wang, F.; Liu, J. Accurate capacity and remaining useful life prediction of lithium-ion batteries based on improved particle swarm optimization and particle filter. Energy 2024, 293, 130555.
  22. Ma, Y.; Yao, M.; Liu, H.; Tang, Z. State of Health estimation and Remaining Useful Life prediction for lithium-ion batteries by Improved Particle Swarm Optimization-Back Propagation Neural Network. J. Energy Storage 2022, 52, 104750.
  23. Ye, L.H.; Chen, S.J.; Shi, Y.F.; Peng, D.H.; Shi, A. Remaining useful life prediction of lithium-ion battery based on chaotic particle swarm optimization and particle filter. Int. J. Electrochem. Sci. 2023, 18, 100122.
  24. Mansir, I.B.; Okonkwo, C. Component Degradation in Lithium-Ion Batteries and Their Sustainability: A Concise Overview. Sustainability 2025, 17, 1000.
  25. Menye, J.S.; Camara, M.B.; Dakyo, B. Lithium Battery Degradation and Failure Mechanisms: A State-of-the-Art Review. Energies 2025, 18, 342.
  26. Gao, D.; Huang, M. Prediction of Remaining Useful Life of Lithium-ion Battery based on Multi-kernel Support Vector Machine with Particle Swarm Optimization. J. Power Electron. 2017, 17, 1288–1297.
  27. Mo, B.; Yu, J.; Tang, D.; Liu, H. A remaining useful life prediction approach for lithium-ion batteries using Kalman filter and an improved particle filter. In Proceedings of the 2016 IEEE International Conference on Prognostics and Health Management, ICPHM 2016, Ottawa, ON, Canada, 20–22 June 2016.
Figure 1. LIBs battery dataset using data preprocessing, feature engineering, and PSO for RUL prediction.
Figure 2. Correlation matrix for battery parameters in RUL prediction.
Figure 3. Comparison of training and testing errors over epochs, LSTM demonstrating the lowest error rates.
Figure 4. Cumulative charging/discharging cycles vs. RUL of the battery, following an exponential decay model.
Figure 5. (a) Actual vs. predicted RUL for GBM. (b) Actual vs. predicted RUL for XGBoost. (c) Actual vs. predicted RUL for LSTM, with minimal deviation between predicted and actual values.
Figure 6. 3D surface plot showing the relationship between cycle count, discharge time, and predicted RUL.
Figure 7. Cumulative residuals plot showing unbiased predictions with minor fluctuations for potential optimization.
Table 1. Descriptive statistics for key features in the LIB dataset.
| Feature | Count | Mean | Std Dev | Min | 25% | 50% (Median) | 75% | Max |
|---|---|---|---|---|---|---|---|---|
| Cycle_Index | 15,064 | 556.16 | 322.38 | 1 | 271 | 556 | 838 | 15,064 |
| Discharge Time (s) | 15,064 | 4581.27 | 33,144.01 | 8.69 | 1169.31 | 2590.14 | 4666.92 | 15,064,747.88 |
| Decrement 3.6–3.4 V (s) | 15,064 | 1239.78 | 15,039.59 | 397,645.91 | 319.60 | 597.45 | 1097.84 | 30,000 |
| Max. Voltage Discharge (V) | 15,064 | 3.91 | 0.09 | 3.04 | 3.85 | 3.91 | 3.97 | 4.26 |
| Min. Voltage Charge (V) | 15,064 | 3.58 | 0.12 | 3.02 | 3.49 | 3.57 | 3.66 | 4.20 |
| Time at 4.15 V (s) | 15,064 | 3768.34 | 9129.55 | −113.58 | 1828.88 | 3794.91 | 5364.99 | 67,500 |
| Time constant current (s) | 15,064 | 5461.27 | 25,155.85 | 5.98 | 2564.31 | 5507.34 | 11,551.78 | 1,000,000 |
| Charging time (s) | 15,064 | 10,066.50 | 26,415.35 | 5.98 | 7841.92 | 10,652.79 | 15,179.34 | 1,500,000 |
| RUL | 15,064 | 554.19 | 322.43 | 0 | 277 | 539 | 818 | 15,064 |
Table 2. PSO algorithm parameters for hyper-parameter optimization.
| Parameter | Value | Description |
|---|---|---|
| Number of Particles | 30 particles | Number of particles exploring the hyper-parameter space. |
| Inertia Weight (w) | 0.7 | Controls particle momentum, balancing exploration and exploitation. |
| Cognitive Coefficient (c1) | 1.5 | Encourages movement toward the particle’s personal best position. |
| Social Coefficient (c2) | 1.5 | Encourages movement toward the global best position. |
| Maximum Velocity | 0.2 | Limits particle velocity in the search space. |
| Maximum Position Range | Varies by model | Defines bounds for hyperparameters (e.g., 50–500 for LSTM). |
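The settings in Table 2 can be placed in context with a minimal global-best PSO sketch. This is an illustration under stated assumptions, not the implementation used in the study: the objective `toy_loss` is a hypothetical stand-in for a model’s validation error, the search bounds and iteration count are made up, and the maximum velocity of 0.2 is interpreted here as a fraction of each dimension’s range.

```python
import random

def pso(objective, bounds, n_particles=30, n_iter=50, w=0.7, c1=1.5, c2=1.5,
        v_max=0.2, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # v_max is interpreted as a fraction of each dimension's range (an assumption).
    v_lim = [v_max * (hi - lo) for lo, hi in bounds]
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])    # cognitive pull
                     + c2 * r2 * (gbest[d] - pos[i][d]))      # social pull
                v = max(-v_lim[d], min(v_lim[d], v))          # clamp velocity
                vel[i][d] = v
                lo, hi = bounds[d]
                pos[i][d] = max(lo, min(hi, pos[i][d] + v))   # keep inside bounds
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Hypothetical stand-in for a validation loss over (log10 learning rate, hidden units).
def toy_loss(p):
    return (p[0] + 3.0) ** 2 + ((p[1] - 256.0) / 100.0) ** 2

best, best_val = pso(toy_loss, bounds=[(-5.0, -1.0), (50.0, 500.0)])
print(best, best_val)
```

In a real tuning run, each call to the objective would train and validate a model, so the number of particles and iterations directly sets the computational cost.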
Table 3. LSTM architecture hyper-parameters.
| Parameter | Description |
|---|---|
| Number of layers | 2 stacked LSTM layers |
| Units per layer | 128 units |
| Activation function | tanh (cell state), sigmoid (gates) |
| Loss function | Mean Squared Error (MSE) |
| Dropout rate | 0.2 after each LSTM layer |
| Optimizer | Adam (learning rate = 0.001) |
| Batch size | 64 |
Table 4. Model performance comparison table.
| Model | MAE | RMSE | R2 |
|---|---|---|---|
| XGBoost | 0.79 | 1.06 | 0.86 |
| LightGBM | 0.82 | 1.10 | 0.85 |
| RF | 0.84 | 1.11 | 0.85 |
| SVR | 1.05 | 1.25 | 0.81 |
| GBM | 0.75 | 1.02 | 0.87 |
| AdaBoost | 0.83 | 1.10 | 0.84 |
| CatBoost | 0.80 | 1.07 | 0.86 |
| RNN | 0.86 | 1.14 | 0.82 |
| LSTM | 0.34 | 0.76 | 0.93 |
Table 5. PSO hyper-parameter tuning results table.
| Model | Initial Hyper-Parameters | Optimized Hyper-Parameters (PSO) | Performance Improvement |
|---|---|---|---|
| XGBoost | Learning Rate: 0.1, Trees: 100, Max Depth: 6 | Learning Rate: 0.05, Trees: 150, Max Depth: 8 | 15% improvement |
| LightGBM | Learning Rate: 0.01, Trees: 120, Max Depth: 7 | Learning Rate: 0.02, Trees: 180, Max Depth: 9 | 12% improvement |
| RF | Trees: 200, Max Depth: 10, Min Samples Split: 2 | Trees: 250, Max Depth: 15, Min Samples Split: 3 | 10% improvement |
| SVR | C: 1, Kernel: ‘rbf’, Epsilon: 0.1 | C: 0.8, Kernel: ‘rbf’, Epsilon: 0.05 | 18% improvement |
| GBM | Learning Rate: 0.05, Trees: 100, Max Depth: 5 | Learning Rate: 0.03, Trees: 140, Max Depth: 6 | 14% improvement |
| AdaBoost | Learning Rate: 1.0, Estimators: 50 | Learning Rate: 0.9, Estimators: 75 | 13% improvement |
| CatBoost | Learning Rate: 0.03, Iterations: 1000, Depth: 6 | Learning Rate: 0.02, Iterations: 1500, Depth: 7 | 16% improvement |
| RNN | Hidden Layers: 128, Learning Rate: 0.01, Epochs: 50 | Hidden Layers: 256, Learning Rate: 0.005, Epochs: 100 | 25% improvement |
| LSTM | Hidden Layers: 128, Learning Rate: 0.001, Epochs: 100 | Hidden Layers: 256, Learning Rate: 0.0005, Epochs: 150 | 30% improvement |
Table 6. Comparison of model performance with and without PSO optimization.
| Model | MAE (without PSO) | RMSE (without PSO) | R2 (without PSO) | MAE (with PSO) | RMSE (with PSO) | R2 (with PSO) |
|---|---|---|---|---|---|---|
| XGBoost | 0.79 | 1.06 | 0.87 | 0.72 | 0.98 | 0.91 |
| LightGBM | 0.82 | 1.10 | 0.85 | 0.75 | 1.02 | 0.89 |
| RF | 0.84 | 1.11 | 0.86 | 0.78 | 1.03 | 0.90 |
| SVR | 1.05 | 1.25 | 0.81 | 0.95 | 1.15 | 0.87 |
| GBM | 0.75 | 1.02 | 0.89 | 0.70 | 0.94 | 0.92 |
| AdaBoost | 0.83 | 1.10 | 0.84 | 0.76 | 1.01 | 0.88 |
| CatBoost | 0.80 | 1.07 | 0.86 | 0.74 | 0.99 | 0.90 |
| RNN | 1.00 | 1.20 | 0.78 | 0.88 | 1.12 | 0.84 |
| LSTM | 0.34 | 0.76 | 0.93 | 0.30 | 0.68 | 0.96 |
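Table 7. Effect of feature engineering on model performance.
| Model | MAE (without FE) | RMSE (without FE) | R2 (without FE) | MAE (with FE) | RMSE (with FE) | R2 (with FE) |
|---|---|---|---|---|---|---|
| XGBoost | 0.77 | 1.03 | 0.84 | 0.70 | 0.95 | 0.88 |
| LightGBM | 0.80 | 1.07 | 0.82 | 0.73 | 0.99 | 0.86 |
| RF | 0.81 | 1.08 | 0.83 | 0.76 | 1.00 | 0.87 |
| SVR | 1.02 | 1.21 | 0.79 | 0.92 | 1.12 | 0.84 |
| GBM | 0.73 | 0.99 | 0.86 | 0.68 | 0.91 | 0.89 |
| AdaBoost | 0.81 | 1.07 | 0.81 | 0.74 | 0.98 | 0.85 |
| CatBoost | 0.78 | 1.04 | 0.83 | 0.72 | 0.96 | 0.87 |
| RNN | 0.97 | 1.16 | 0.76 | 0.85 | 1.09 | 0.81 |
| LSTM | 0.33 | 0.74 | 0.90 | 0.29 | 0.66 | 0.93 |

FE: feature engineering.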
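Table 8. Performance of models across 10 cycles.
| Cycle Number | LSTM MAE | LSTM RMSE | LSTM R2 | XGBoost MAE | XGBoost RMSE | XGBoost R2 | GBM MAE | GBM RMSE | GBM R2 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.34 | 0.76 | 0.93 | 0.79 | 1.06 | 0.86 | 0.75 | 1.02 | 0.87 |
| 2 | 0.35 | 0.77 | 0.92 | 0.80 | 1.08 | 0.85 | 0.77 | 1.04 | 0.86 |
| 3 | 0.36 | 0.78 | 0.91 | 0.82 | 1.10 | 0.84 | 0.79 | 1.06 | 0.85 |
| 4 | 0.37 | 0.79 | 0.90 | 0.84 | 1.12 | 0.82 | 0.81 | 1.08 | 0.84 |
| 5 | 0.39 | 0.81 | 0.88 | 0.86 | 1.14 | 0.80 | 0.83 | 1.10 | 0.82 |
| 6 | 0.41 | 0.82 | 0.87 | 0.88 | 1.16 | 0.78 | 0.85 | 1.12 | 0.81 |
| 7 | 0.43 | 0.84 | 0.86 | 0.90 | 1.18 | 0.77 | 0.87 | 1.14 | 0.80 |
| 8 | 0.45 | 0.85 | 0.84 | 0.92 | 1.20 | 0.75 | 0.89 | 1.16 | 0.79 |
| 9 | 0.47 | 0.87 | 0.83 | 0.94 | 1.22 | 0.74 | 0.91 | 1.18 | 0.78 |
| 10 | 0.50 | 0.89 | 0.81 | 0.96 | 1.24 | 0.72 | 0.94 | 1.20 | 0.76 |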
Table 9. Impact of 5% feature value change on RUL predictions for outperforming models.
| Feature | Change in Feature Value | RUL Change, LSTM (%) | RUL Change, XGBoost (%) | RUL Change, GBM (%) |
|---|---|---|---|---|
| Cycle Index | +5% | +2.1% | +1.8% | +2.3% |
| Discharge Time (s) | +5% | +1.7% | +2.0% | +1.5% |
| Temperature | +5% | −3.0% | −2.5% | −2.8% |
| Max Voltage Discharge (V) | +5% | −1.8% | −2.0% | −2.3% |
| Min Voltage Charge (V) | +5% | +1.4% | +1.2% | +1.0% |
| Time at 4.15 V (s) | +5% | +0.8% | +1.0% | +0.9% |
| Internal Resistance | +5% | −4.5% | −3.8% | −4.2% |
| Charging Time (s) | +5% | +1.5% | +1.8% | +1.3% |
| Decrement 3.6–3.4 V (s) | +5% | −2.0% | −1.9% | −2.2% |
| Health Index (HI) | +5% | +3.0% | +2.5% | +3.2% |
Table 10. Cross-validation results for the LSTM model.
| Fold | MAE | RMSE | R2 |
|---|---|---|---|
| Fold 1 | 0.34 | 0.76 | 0.93 |
| Fold 2 | 0.35 | 0.77 | 0.92 |
| Fold 3 | 0.33 | 0.75 | 0.94 |
| Fold 4 | 0.36 | 0.78 | 0.91 |
| Fold 5 | 0.37 | 0.79 | 0.91 |
| Fold 6 | 0.34 | 0.76 | 0.93 |
| Fold 7 | 0.38 | 0.80 | 0.90 |
| Fold 8 | 0.35 | 0.77 | 0.92 |
| Fold 9 | 0.36 | 0.78 | 0.91 |
| Fold 10 | 0.34 | 0.76 | 0.93 |
| Average | 0.35 | 0.77 | 0.92 |
| Std Dev | 0.01 | 0.02 | 0.01 |
Table 11. Comparison with state-of-the-art RUL prediction approaches.
| Study | Dataset | Techniques | Outcomes | Limitations | Other Aspects |
|---|---|---|---|---|---|
| Our Study | Hawaii NMC-LCO dataset (14 cells) | Multi-model (LSTM, XGBoost, LightGBM, PSO optimization) | High prediction accuracy, improved robustness | Narrow dataset, may not generalize across all chemistries | Focus on domain-driven feature engineering |
| Pang et al. [21] | Lithium-ion battery dataset with various conditions | PSO + Particle Filter for RUL prediction | Improved RUL prediction accuracy | Sensitivity to environmental disturbances | Integrated PSO with Particle Filter for more robust predictions |
| Ma et al. [22] | Battery health dataset with BPNN data | PSO + Back Propagation Neural Network (BPNN) for RUL prediction | Enhanced state-of-health estimation and RUL prediction | Relies on parameter initialization in BPNN | Optimization of neural network parameters using PSO |
| Ye et al. [23] | Laboratory cycle test dataset for LIBs | Chaotic PSO + Particle Filter | Increased prediction accuracy for RUL | Complexity introduced by chaotic PSO | Hybrid approach with chaos-based PSO to deal with aging mechanisms |
| Gao & Huang [26] | Lithium-ion battery cycling data | PSO + Multi-kernel Support Vector Machine (SVM) | MAE improvement for RUL prediction | Limited to SVM, may not generalize well across all battery types | Use of a multi-kernel SVM with PSO for better performance |
| Mo et al. [27] | Li-ion batteries cycling data | PSO + Particle Filter + Kalman Filter for RUL prediction | RUL prediction accuracy improved using the hybrid model | Kalman filter may not always converge well in noisy data | Kalman filter and PSO integrated for more accurate RUL predictions |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
