Article

Robust Data-Driven State of Health Estimation of Lithium-Ion Batteries Based on Reconstructed Signals

by Byron Alejandro Acuña Acurio 1,*, Diana Estefanía Chérrez Barragán 1, Juan Carlos Rodríguez 2, Felipe Grijalva 3 and Luiz Carlos Pereira da Silva 1
1 Faculdade de Engenharia Elétrica e de Computação (FEEC), Universidade Estadual de Campinas (UNICAMP), Campinas 13083-852, SP, Brazil
2 Analog Devices Inc., Wilmington, MA 01887, USA
3 Colegio de Ciencias e Ingenierías “El Politécnico”, Universidad San Francisco de Quito USFQ, Quito 170157, Ecuador
* Author to whom correspondence should be addressed.
Energies 2025, 18(10), 2459; https://doi.org/10.3390/en18102459
Submission received: 1 April 2025 / Revised: 28 April 2025 / Accepted: 29 April 2025 / Published: 11 May 2025
(This article belongs to the Section D: Energy Storage and Application)

Abstract

The state of health (SoH) of lithium-ion batteries is critical for diagnosing the actual capacity of the battery. Data-driven methods have achieved impressive accuracy, but their sensitivity to sensor noise, missing samples, and outliers remains a limitation for their deployment. This paper proposes a robust, purely data-driven SoH estimation methodology that addresses these challenges. Our method uses a proposed non-iterative closed-form signal reconstruction derived from a modified Tikhonov regularization. Five new features were extracted from reconstructed voltage and temperature discharge profiles. Finally, a Huber regression model is trained using these features for SoH estimation. Six ageing scenarios built from the public NASA and Sandia National Laboratories datasets, under severe Gaussian noise conditions (10 dB SNR), were employed to validate our proposed approach. In noisy environments and with limited training data, our proposed approach maintains a competitive accuracy across all scenarios, achieving low error metrics, with an RMSE on the order of $10^{-4}$, an MAE on the order of $10^{-2}$, and a MAPE below 1%. It outperforms state-of-the-art deep neural networks, direct-feature Huber models, and hybrid physics/data-driven models. In this work, we demonstrate that robustness in SoH estimation for lithium-ion batteries is influenced by the choice of machine learning architecture, loss function, feature selection, and signal reconstruction technique. In addition, we found that tracking the time to minimum discharge voltage and the time to maximum discharge temperature can be used as effective features to estimate SoH in data-driven models, as they are directly correlated with capacity loss and a decrease in power output.

1. Introduction

Lithium-ion batteries (LIBs) are essential in the transition to sustainable energy systems [1]. They enable the storage of renewable energy for later use, which is critical for balancing supply and demand from intermittent energy sources [2]. In the electricity sector, LIBs are used for peak shaving, arbitrage, capacity firming, energy price management, frequency regulation, and grid stabilization [3]. Over time, batteries degrade due to multiple factors, such as charge–discharge cycling and the temperature effects of working conditions. State of health (SoH) estimation is important for assessing battery degradation in all the aforementioned applications and in electric vehicles [4]. SoH is defined as the ratio of the actual capacity to the nominal capacity [5]. It is expressed as a percentage to diagnose the actual battery capacity [6]. For instance, a SoH below the range of 70–80% indicates the end-of-life of the battery for primary applications, since the battery can no longer deliver the power density required by high-performance storage systems [7]. However, such batteries can still be reused in second-life applications (lower power applications) [8]. SoH estimation methods can be classified into the following four main categories [9]: (i) experimental methods, (ii) physics-based models, (iii) data-driven methods, and (iv) hybrid approaches [10].
Experimental methods, also known as direct measurement methods, include processes such as ampere-hour counting [11], which involves performing a full charge and discharge cycle between defined voltage thresholds. Specific current rates and temperature conditions are maintained during ampere-hour counting [9]. To determine the actual battery capacity Q, and SoH, the following formulation is employed:
$$Q = \int_{0}^{t_f} \eta\, I(t)\, dt, \qquad \mathrm{SoH} = \frac{Q}{Q_{\mathrm{initial}}} \tag{1}$$
where $Q$ is the actual capacity, $Q_{\mathrm{initial}}$ is the initial capacity of a battery, $I(t)$ is the current, and $t_f$ is the total time during charging or discharging. $\eta$ represents the coulombic efficiency, which is very close to 1 for LIBs [12]. However, these methods are time-consuming, require interrupting normal operation, and are limited to controlled laboratory environments [13].
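As an illustration of the ampere-hour counting formulation above, the following minimal sketch approximates the integral with the trapezoidal rule on sampled data. The function names, the sample values, and the convention that the discharge current is given as a positive magnitude are assumptions made for this example; they are not taken from the paper.

```python
def coulomb_count_capacity(time_s, current_a, eta=1.0):
    """Approximate Q = integral of eta * I(t) dt with the trapezoidal rule.

    time_s    : sample times in seconds (increasing)
    current_a : current magnitude in amperes at each sample (assumed positive)
    eta       : coulombic efficiency (close to 1 for LIBs)
    Returns the capacity in ampere-hours (Ah).
    """
    if len(time_s) != len(current_a) or len(time_s) < 2:
        raise ValueError("need two or more samples of equal length")
    charge_as = 0.0  # accumulated charge in ampere-seconds
    for k in range(1, len(time_s)):
        dt = time_s[k] - time_s[k - 1]
        charge_as += eta * 0.5 * (current_a[k] + current_a[k - 1]) * dt
    return charge_as / 3600.0  # convert A*s to Ah


def state_of_health(capacity_ah, initial_capacity_ah):
    """SoH as the ratio of actual to initial capacity (a fraction, not a percentage)."""
    return capacity_ah / initial_capacity_ah


# Hypothetical one-hour discharge at a constant 1.6 A for a cell with 2.0 Ah initial capacity.
time_s = [0.0, 1800.0, 3600.0]
current_a = [1.6, 1.6, 1.6]
capacity = coulomb_count_capacity(time_s, current_a)
soh = state_of_health(capacity, initial_capacity_ah=2.0)
```

In this hypothetical case the cell delivers 1.6 Ah, which corresponds to a SoH of 80% relative to the assumed 2.0 Ah initial capacity.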
Physics-based models [14] simulate the chemical and physical processes occurring within batteries. SoH is estimated by monitoring changes in model parameters that are correlated with battery aging [15]. For instance, in equivalent circuit models (ECMs) [16], LIBs are represented as electrical circuits with resistors, capacitors, and voltage sources [17]. As the battery ages, as evidenced by a decline in SoH, the internal resistance parameter of the ECM increases, while the capacitance related to capacity tends to decrease [18]. However, models calibrated under specific conditions may not apply directly to others, necessitating re-calibration across different operating scenarios [19].
Data-driven methods [20], such as machine learning (ML) algorithms, are trained on historical aging data (voltage, current, temperature, etc.) to identify complex patterns and relationships between the extracted features (e.g., health indicators) and SoH [21]. Recent studies highlight that by using large datasets and carefully designed training settings, data-driven techniques can achieve high accuracy and adaptability in predicting battery SoH, often outperforming traditional physics-based models in terms of accuracy [22]. Therefore, data-driven methods are usually implemented in battery management systems (BMS) to track degradation and estimate SoH [23]. However, data-driven SoH approaches often require large datasets for training [24] and are sensitive to noise and missing data [25]. In addition, selecting and extracting the right features for data-driven methods is non-trivial, especially with varying operating conditions and degradation mechanisms [26].
Hybrid approaches combine different sources of information and modeling techniques to enhance the accuracy, robustness, and interpretability of SoH estimation. For instance, physics-based models can be used to provide simulated features to data-driven approaches to perform SoH estimation with real and simulated data [27]. Recent studies have proposed knowledge transfer techniques; for instance, ref. [28] developed a transfer learning method that diagnoses degradation modes (DMs) of lithium–iron–phosphate (LFP) batteries by minimizing both classification loss and domain adaptation loss between synthetic and real datasets. This approach enables accurate DM identification without requiring extensive real-world labeled data. Similarly, ref. [29] proposed a two-stage SOH estimation framework where DM knowledge is first transferred from synthetic to real datasets and then used as input for SOH prediction.
In this work, we propose a novel, purely data-driven SoH estimation approach for lithium-ion batteries, designed to be robust against noisy measurements and outliers. Unlike previous methods, our proposed approach can be trained with small datasets. The proposed method does not need an explicit physics-based model or assumptions about initial aging conditions to maintain high accuracy, making it suitable for real-time BMS implementation with low computational cost and memory requirements. To highlight the advantages of the proposed methodology, Table 1 presents a comparative analysis against previous data-driven and hybrid approaches that report the lowest error metrics for estimating lithium-ion battery state of health [30].
The summarized contributions of this paper are as follows:
  • Closed-form signal reconstruction: We present a non-iterative closed-form solution for the signal reconstruction of noisy measurements.
  • Novel data-driven health indicators: We introduce five noise-resilient features derived from the reconstructed voltage and temperature discharge profiles.
  • Robust data-driven state of health estimation: In this work, we use the Huber cost function to improve the accuracy of the regression model by reducing the impact of outliers, providing an alternative to removing outliers from the dataset.
This paper is organized as follows: Section 1 introduces the SoH estimation problem for LIBs and reviews current modeling techniques in the literature. Section 2 outlines the proposed approach and the case studies. Section 3 presents a comprehensive performance assessment that demonstrates the effectiveness of our method compared to existing data-driven and hybrid methods in noisy environments and with limited training data. Finally, Section 4 concludes the paper, summarizing key findings and discussing potential directions for future work.

2. Materials and Methods

This section outlines the materials and methodology adopted in the development of the proposed SoH estimation approach for LIBs under noisy measurement conditions. The proposed method, illustrated in Figure 1, is described in detail in Section 2.1. This work assumes the availability of data storage systems capable of continuously monitoring and recording voltage and temperature measurements from LIBs.

2.1. Proposed Approach

Our proposed approach consists of three key stages:
(1) Signal Reconstruction: This stage estimates a vector $\hat{z}^m = [\hat{z}_1^m, \hat{z}_2^m, \ldots, \hat{z}_t^m]^T$ from noisy voltage and temperature profiles $z_c^m = [z_{c,1}^m, z_{c,2}^m, \ldots, z_{c,t}^m]^T$, where $m = 1$ corresponds to the battery discharge voltage profile and $m = 2$ represents the battery discharge temperature profile. The signal reconstruction employs the proposed closed-form, non-iterative mathematical expression formulated in Equation (8).
(2) Feature Extraction: From the reconstructed voltage and temperature discharge profiles, five new data-driven health indicators are extracted, which are conditionally correlated with the battery aging process.
(3) SoH Estimation: A Huber regression model was employed for SoH estimation, demonstrating robustness against outliers.
Further details of each stage are given in the following sections.

2.1.1. Proposed Signal Reconstruction Stage

This subsection details the proposed signal reconstruction stage, designed to recover voltage and temperature discharge profiles from noisy measurement data. The measurement model employed in this work was formulated as follows:
$$z_c^m = z^m + e \tag{2}$$
where $z^m$ represents the unknown noise-free signal for each discharge profile. The term $e$ denotes the stochastic noise component affecting the measurements. The objective of the proposed signal reconstruction method is to estimate the clean signal, denoted as $\hat{z}^m$, from the noisy measurement $z_c^m$. To achieve this, the signal reconstruction problem is formulated as a regularized optimization problem:
$$\min_{\hat{z}^m} \; \left\| \hat{z}^m - z_c^m \right\|_2^2 + \delta\, \phi\!\left(\Delta \hat{z}^m\right) \tag{3}$$
The regularization hyperparameter $\delta$ controls the trade-off between preserving the original noisy signal and noise filtering. The function $\phi(\Delta \hat{z}^m)$ introduces a regularization term to improve robustness against measurement noise. When $\delta = 0$, the reconstructed signal exactly matches the noisy measurement. Conversely, excessively high values of $\delta$ result in over-smoothing, potentially distorting the original signal characteristics. Two regularization strategies were investigated in this study: (i) Tikhonov regularization, $\phi(\Delta \hat{z}^m) = \|\Delta \hat{z}^m\|_2^2$, and (ii) LASSO regularization, $\phi(\Delta \hat{z}^m) = \|\Delta \hat{z}^m\|_1$.
To solve the ill-posed problem in Equation (3), the regularization hyperparameter δ is determined using the L-Curve method, as detailed in [36]. Additionally, we employ the regularization operator Δ , previously reported in [36], and defined in Equation (4), due to its demonstrated stability in reconstructing voltage and temperature discharge profiles of lithium-ion batteries observed in this work.
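$$\Delta = \begin{bmatrix} -1 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 & 0 & 0 \\ \vdots & & \ddots & \ddots & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -2 & 1 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 & 1 & -1 \end{bmatrix} \tag{4}$$
where $\Delta \in \mathbb{R}^{t \times t}$ is a square matrix, with its dimension equal to the number of samples $t$ in each discharge profile.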

Signal Reconstruction Based on Tikhonov Regularization

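The reconstruction of the signal is formulated based on a modified Tikhonov regularization strategy expressed in Equation (5). This formulation, known as the unrestricted form, allows us to obtain a non-iterative approach to recover clean discharge profiles from noisy measurements, as follows:
$$\min_{\hat{z}^m} \; \left\| \hat{z}^m - z_c^m \right\|_2^2 + \delta \left\| \Delta \hat{z}^m \right\|_2^2 \tag{5}$$
This formulation can be rewritten in the following compact form:
$$\min_{\hat{z}^m} \; \left\| \hat{z}^m - z_c^m \right\|_2^2 + \delta \left\| \Delta \hat{z}^m \right\|_2^2 = \min_{\hat{z}^m} \left\| \begin{bmatrix} \hat{z}^m - z_c^m \\ \sqrt{\delta}\, \Delta \hat{z}^m \end{bmatrix} \right\|_2^2 \tag{6}$$
Equation (6) can be rewritten as
$$\min_{\hat{z}^m} \left\| \begin{bmatrix} \hat{z}^m - z_c^m \\ \sqrt{\delta}\, \Delta \hat{z}^m \end{bmatrix} \right\|_2^2 = \min_{\hat{z}^m} \left\| \begin{bmatrix} I \\ \sqrt{\delta}\, \Delta \end{bmatrix} \hat{z}^m - \begin{bmatrix} z_c^m \\ 0 \end{bmatrix} \right\|_2^2 \tag{7}$$
Solving for $\hat{z}^m$, the closed-form solution of signal reconstruction using modified Tikhonov regularization is given by
$$\hat{z}^m = \left( \begin{bmatrix} I \\ \sqrt{\delta}\, \Delta \end{bmatrix}^T \begin{bmatrix} I \\ \sqrt{\delta}\, \Delta \end{bmatrix} \right)^{-1} \begin{bmatrix} I \\ \sqrt{\delta}\, \Delta \end{bmatrix}^T \begin{bmatrix} z_c^m \\ 0 \end{bmatrix} \tag{8}$$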
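The closed-form reconstruction can be sketched in a few lines. Assuming the stacked block carries a $\sqrt{\delta}$ weight so that it matches the penalty of Equation (5), expanding the products in Equation (8) gives $\hat{z}^m = (I + \delta \Delta^T \Delta)^{-1} z_c^m$, which is a single linear solve. The sketch below uses only the Python standard library to stay self-contained; the function names, the small Gaussian-elimination helper, the sign convention chosen for $\Delta$ (the penalty is unaffected by the overall sign), and the synthetic test signal are all assumptions of this example, not the authors' implementation.

```python
import math
import random


def second_difference_operator(t):
    """t x t operator: first-difference boundary rows, second-difference interior rows."""
    d = [[0.0] * t for _ in range(t)]
    d[0][0], d[0][1] = -1.0, 1.0
    for r in range(1, t - 1):
        d[r][r - 1], d[r][r], d[r][r + 1] = 1.0, -2.0, 1.0
    d[t - 1][t - 2], d[t - 1][t - 1] = 1.0, -1.0
    return d


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (inputs are not modified)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def reconstruct_tikhonov(z_noisy, delta):
    """Non-iterative reconstruction: solve (I + delta * D^T D) z_hat = z_noisy."""
    t = len(z_noisy)
    if t < 2:
        raise ValueError("need two or more samples")
    d = second_difference_operator(t)
    a = [[(1.0 if i == j else 0.0)
          + delta * sum(d[k][i] * d[k][j] for k in range(t))
          for j in range(t)] for i in range(t)]
    return solve_linear_system(a, z_noisy)


def rmse(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


# Synthetic, hypothetical discharge-like ramp with additive Gaussian noise.
rng = random.Random(0)
clean = [4.0 - 0.005 * k for k in range(60)]
noisy = [v + rng.gauss(0.0, 0.1) for v in clean]
smooth = reconstruct_tikhonov(noisy, delta=5.0)
rmse_noisy, rmse_smooth = rmse(noisy, clean), rmse(smooth, clean)
```

For long profiles, a dense solve like this one becomes slow; because $I + \delta \Delta^T \Delta$ is banded, a banded or sparse solver from a numerical library would be the natural replacement.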

Signal Reconstruction Based on LASSO Regularization

In the case of LASSO regularization strategy, it is not possible to have a closed-form expression. However, the ill-posed problem can be effectively solved using the following constrained optimization problem.
$$\min \; \delta \left\| \Delta \hat{z}^m \right\|_1 \quad \text{s.t.} \quad \left\| \hat{z}^m - z_c^m \right\|_2^2 \le \delta \tag{9}$$
For this study, the regularization parameter δ was empirically set to 5 for signals with a signal-to-noise ratio (SNR) of 10 dB. The optimal choice of δ is inherently dependent on the noise characteristics of the data:
  • For highly corrupted signals (SNR < 10 dB), larger δ values ( δ > 5 ) are recommended to enhance noise suppression.
  • For signals with minimal noise contamination (SNR > 10 dB), lower δ values ( δ < 5 ) preserve original signal details.

2.1.2. Proposed Feature Extraction Stage

We propose a systematic approach to extract features highly correlated with battery health degradation by analyzing the evolving patterns in voltage and temperature profiles during discharge cycles. The method consists of three steps:
  • Identification of discharge cycle.
  • Extraction of voltage features.
  • Extraction of temperature features.
1. Identification of discharge cycle: We isolate the continuous discharge cycles from battery operation data by identifying periods where the current drops below a predefined threshold (in this case −0.05 A) and where a predefined minimum voltage (e.g., cutoff voltage) is reached. Note that manufacturers generally set a safe cutoff voltage to preserve battery health and lifespan [37]. Lower cutoff voltages (e.g., 2.0 V) correspond to a depth of discharge (DOD) closer to 100%. However, in real-time applications, battery manufacturers generally do not recommend a 100% DOD as an operational norm [38]. Many battery studies suggest limiting the DOD to 70% to extend battery lifespan [39]. This step ensures we analyze only discharge events, filtering out partial or interrupted discharge cycles that could skew the analysis.
2. Extraction of voltage features: Battery voltage during discharge follows a decreasing pattern, where each successive voltage measurement is lower than the previous one, $\hat{z}_1^1 > \hat{z}_2^1 > \cdots > \hat{z}_t^1$, and whose characteristics change as the battery degrades. We capture these dynamics through two critical features:
  • Minimum discharge voltage ($x_1$): This is the lowest voltage reached during the discharge cycle (cutoff voltage). As a battery ages, its internal impedance increases due to factors such as lithium inventory loss and conductive degradation [40]. An increase in the impedance leads to a more pronounced voltage drop ($R_0 i$) during discharge [41]. According to the Shepherd model, the terminal voltage $V$ at discharge time $t$ can be modeled as follows [42]:
    $$V(t) = E_0 - K \frac{Q}{Q - i(t) \times t}\, i(t) - R_0\, i(t) \tag{10}$$
    where $E_0$ is the theoretical initial open-circuit voltage under specific working conditions, $R_0$ represents the internal ohmic resistance of the battery ($\Omega$), $i$ is the discharge current in amperes (A), assumed positive, and $Q$ is the actual full capacity of the battery in ampere-hours (Ah), which fades as the battery ages. $Q - i(t) \times t$ is the remaining capacity in the battery at discharge time $t$, and $K$ is the polarization resistance coefficient ($\Omega$). Thus, the minimum discharge voltage is sensitive to internal resistance growth and to the battery’s full capacity $Q$. In this work, we found that tracking the time to minimum discharge voltage can be used as a feature to estimate SoH in data-driven models.
  • Time to minimum voltage ($x_2$): This is the duration between the beginning of discharge and the minimum voltage. A healthy battery with full capacity can sustain the discharge current longer before reaching the voltage limit, whereas an aged battery with capacity fade will hit the minimum voltage sooner [43].
3. Extraction of temperature features: Temperature profiles during discharge typically exhibit an increasing pattern, where each successive temperature reading surpasses the previous one, $\hat{z}_1^2 < \hat{z}_2^2 < \cdots < \hat{z}_t^2$, and which changes with battery aging. To capture these thermal characteristics, we introduce three temperature-related features:
  • Minimum temperature at the beginning of discharge ($x_3$): The baseline temperature at the beginning of discharge, establishing a reference point.
  • Maximum discharge temperature ($x_4$): The peak temperature reached during discharge, reflecting internal resistance and exothermic reactions. As the battery degrades and its internal resistance increases, it produces more heat for the same discharge current, resulting in a higher peak temperature [44].
  • Time elapsed between minimum and maximum temperature ($x_5$): The aged battery’s temperature climbs to its maximum in a shorter time than in a new battery, which heats more slowly due to its lower internal resistance [45].
These proposed features effectively encapsulate the electrochemical and thermal signatures of battery degradation without requiring complete charge–discharge curves. All extracted features $x_i$ were normalized to lie between 0 and 1 to ensure numerical stability and consistency.
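The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the dictionary keys, the rule that a discharge segment ends at the first sample at or below the cutoff voltage, the 2.7 V default cutoff, and the toy samples are all hypothetical. The −0.05 A current threshold and the min-max scaling to [0, 1] follow the text.

```python
def find_discharge_segment(current_a, voltage_v, current_threshold=-0.05,
                           cutoff_voltage=2.7):
    """Return (start, end) indices of the first complete discharge, or None.

    A discharge starts when the current drops below the threshold and is
    complete once the voltage reaches the cutoff; interrupted runs are skipped.
    """
    start = None
    for k, (i, v) in enumerate(zip(current_a, voltage_v)):
        if i < current_threshold:
            if start is None:
                start = k
            if v <= cutoff_voltage:
                return start, k
        else:
            start = None  # discharge interrupted before reaching the cutoff
    return None


def extract_features(time_s, voltage_v, temperature_c):
    """Compute x1..x5 from one (reconstructed) discharge profile."""
    idx = range(len(time_s))
    k_vmin = min(idx, key=voltage_v.__getitem__)
    k_tmin = min(idx, key=temperature_c.__getitem__)
    k_tmax = max(idx, key=temperature_c.__getitem__)
    return {
        "x1": voltage_v[k_vmin],                # minimum discharge voltage
        "x2": time_s[k_vmin] - time_s[0],       # time to minimum voltage
        "x3": temperature_c[k_tmin],            # minimum (baseline) temperature
        "x4": temperature_c[k_tmax],            # maximum discharge temperature
        "x5": time_s[k_tmax] - time_s[k_tmin],  # time from min to max temperature
    }


def min_max_normalize(values):
    """Scale one feature, collected over many cycles, to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Toy operation log (hypothetical values): one rest sample, then a discharge.
time_s = [0.0, 10.0, 20.0, 30.0, 40.0]
current_a = [0.0, -2.0, -2.0, -2.0, -2.0]
voltage_v = [4.2, 4.0, 3.6, 3.1, 2.7]
temperature_c = [24.0, 24.5, 26.0, 28.0, 29.5]

segment = find_discharge_segment(current_a, voltage_v)
start, end = segment
features = extract_features(time_s[start:end + 1], voltage_v[start:end + 1],
                            temperature_c[start:end + 1])
```

In practice the features would be computed per cycle on the reconstructed profiles, and `min_max_normalize` would be applied to each feature across all cycles of the training set.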

2.1.3. State of Health (SoH) Estimation

SoH is estimated using a Huber regression model (Equation (11)), a robust method less sensitive to outliers than least squares regression [46]. The Huber loss function [46] (Equation (12)) is used to fit the model, which smoothly transitions from a quadratic form for small residuals to a linear form for large residuals. This adaptive property mitigates the influence of outliers while maintaining sensitivity to minor deviations [46], making it particularly well-suited for SoH estimation of noisy data.
$$\hat{y} = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4 + \alpha_5 x_5 \tag{11}$$
where the $\alpha$ coefficients are optimized by solving a convex quadratic programming problem [47] and $x_i$ represents our proposed features. We set the Huber transition parameter to $\gamma = 1.35$, which yields approximately 95% efficiency under Gaussian noise. This choice promotes numerical stability by limiting the influence of extreme residuals, thereby helping to prevent overfitting even in high-dimensional feature spaces. The Huber cost function [48], used in the training stage to fit the model, is defined as follows:
$$L_\gamma(y_i, \hat{y}_i) = \begin{cases} \dfrac{1}{2} (y_i - \hat{y}_i)^2 & \text{for } |y_i - \hat{y}_i| \le \gamma, \\[4pt] \gamma\, |y_i - \hat{y}_i| - \dfrac{1}{2} \gamma^2 & \text{otherwise.} \end{cases} \tag{12}$$
Once trained, the Huber regression model provides accurate SoH estimates, y ^ , from the five extracted features. As shown in Figure 1, incorporating Huber regression into our pipeline ensures computational efficiency and robustness against outliers.
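To make the quadratic-to-linear transition of the Huber cost concrete, the sketch below evaluates the loss and the linear SoH model for given coefficients. It does not fit the coefficients; the function names and every numeric value in the usage lines are hypothetical and chosen only for illustration.

```python
def huber_loss(y_true, y_pred, gamma=1.35):
    """Huber loss: quadratic for |residual| <= gamma, linear beyond it."""
    r = abs(y_true - y_pred)
    if r <= gamma:
        return 0.5 * r * r
    return gamma * r - 0.5 * gamma * gamma


def predict_soh(features, alpha):
    """Linear SoH model: alpha[0] is the intercept, alpha[1:] weight x1..x5."""
    if len(alpha) != len(features) + 1:
        raise ValueError("expected one coefficient per feature plus an intercept")
    return alpha[0] + sum(a * x for a, x in zip(alpha[1:], features))


def mean_huber_cost(y_true, y_pred, gamma=1.35):
    """Average Huber loss over a set of SoH observations."""
    return sum(huber_loss(t, p, gamma) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical coefficients and normalized features, for illustration only.
alpha = [0.5, 0.1, 0.3, 0.0, -0.1, 0.2]
features = [1.0, 1.0, 0.0, 1.0, 0.5]
soh_estimate = predict_soh(features, alpha)

small = huber_loss(1.0, 0.5)   # inside the quadratic region
large = huber_loss(3.0, 1.0)   # inside the linear region
```

Because the linear branch grows with the absolute residual rather than its square, a single grossly wrong SoH label contributes far less to the cost than it would under least squares, which is the robustness property described above.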
Training and testing were performed using labeled datasets, which are described comprehensively in Section 2.2.

2.2. Datasets

The proposed SoH estimation method was evaluated using two distinct datasets: the NASA battery dataset [49] and Sandia National Laboratories (SNL) battery degradation dataset [50].

2.2.1. NASA Data Overview

To evaluate our proposed SoH estimation method, experiments were performed using three distinct testing scenarios derived from NASA’s lithium-ion battery aging dataset [49]. The selected batteries and scenarios ensure various realistic operational scenarios, including temperature variations, different discharge cutoff voltages, diverse capacity fade thresholds, and the end-of-life (EOL) criteria.
Scenario 1: Variable Temperature Conditions. In this scenario, we evaluate our proposed SoH estimation method over a broad temperature range from 4 °C to 40 °C. Batteries B0005 and B0007 were used for model training, while Battery B0018 was reserved exclusively for testing purposes. The charging protocol consisted of a constant current (CC) phase at 1.5 A until the battery voltage reached 4.2 V, then a constant voltage (CV) phase was maintained at 4.2 V until the charging current decreased to 20 mA. The discharge cutoff voltages varied across the batteries, set at 2.7 V for B0005, 2.2 V for B0007, and 2.5 V for B0018. The end-of-life criterion was determined based on a capacity fade of 30%, corresponding to a decrease from the nominal capacity of 2.0 Ah to 1.4 Ah.
Scenario 2: Constant Temperature Conditions. In Scenario 2, the batteries operated under controlled ambient conditions, with a constant temperature of 24 °C. Battery B0033 was used for training purposes, while B0034 was reserved for testing. The charging protocol was identical to that of Scenario 1, but the discharge cutoff voltages were set at 2.0 V for B0033 and 2.2 V for B0034. A more demanding end-of-life criterion was adopted, defined by a capacity fade threshold of 20%, corresponding to a reduction in battery capacity from the initial 2.0 Ah to 1.6 Ah.
Scenario 3: Low-Temperature Operation. In this scenario, the batteries operated under low-temperature conditions (4 °C). The charging protocol followed the previously established procedure, but the discharge cutoff voltages were set at 2.2 V for B0046, 2.5 V for B0047, and 2.7 V for B0048. The end-of-life criterion was the same as in Scenario 1, defined as a 30% capacity fade.
Using the aforementioned scenarios, we evaluate the proposed SoH estimation method under different working conditions, as described in [51] and shown in Table 2.

2.2.2. SNL Data Overview

We also evaluated our proposed SoH estimation method using three additional testing scenarios derived from the Sandia National Laboratories (SNL) lithium-ion battery aging dataset [50]. SNL’s dataset contains detailed cycle-level and time-series battery performance data collected during charge–discharge cycling tests on commercial 18650-format lithium–iron–phosphate (LFP) cells. Each battery was subjected to varied experimental conditions, including temperature, charge–discharge current rates, and state-of-charge (SoC) ranges. The data were acquired through the open-access web platform [52].
Scenario 4: High-stress regime. In this scenario, both the training and test datasets come from lithium–iron–phosphate (LFP) 18650 cells cycled at 25 °C ambient temperature. Cells were charged at a rate of 0.5C and discharged at a rate of 1C (a rate of 1C corresponds to the current that would discharge the battery’s nominal capacity, in ampere-hours (Ah), in one hour), covering a full SoC range from 0% to 100% (complete depth of discharge). Specifically, the training data originated from the file SNL_18650_LFP_25C_0-100_0.5-1C_c_timeseries.csv, while testing data were taken from SNL_18650_LFP_25C_0-100_0.5-1C_d_timeseries.csv. Discharge rate conditions varied between 0.5C and 2C.
Scenario 5: Higher discharge rate. The training dataset was obtained from the file SNL_18650_LFP_15C_0-100_0.5-2C_a_timeseries.csv, while the testing dataset was sourced from SNL_18650_LFP_15C_0-100_0.5-2C_b_timeseries.csv. Compared to other scenarios, Scenario 5 is characterized by its lower operating temperature conditions (15 °C) and a higher discharge rate (2C), resulting in shorter discharge durations at high voltage per cycle.
Scenario 6: High-stress regime. The working conditions of this scenario are similar to those of Scenario 4, but with different batteries. For the training data, we used the file SNL_18650_LFP_25C_0-100_0.5-1C_a_timeseries.csv, while the data used for testing were taken from the file SNL_18650_LFP_25C_0-100_0.5-1C_b_timeseries.csv.
The proposed SoH estimation methodology was evaluated across the scenarios defined above, which are detailed in Table 3. Note that NCA stands for Lithium Nickel Cobalt Aluminum Oxide and NMC stands for Lithium Nickel Manganese Cobalt Oxide. Hence, NCA and NMC refer to two different types of lithium-ion battery cathode chemistries.

2.3. Accuracy Evaluation Metrics

A robust SoH estimation model maintains its accuracy across a wide range of scenarios. To evaluate the accuracy of our proposed SoH estimation methodology, three widely used evaluation metrics [53] were employed: mean absolute percentage error (MAPE), mean absolute error (MAE), and root mean square error (RMSE). These metrics provide complementary measures of estimation accuracy, as detailed below:
  • Mean Absolute Percentage Error (MAPE) quantifies the relative error as a percentage, making it easy to interpret and compare across different datasets or models, and is defined as follows:
    $$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \tag{13}$$
    Note that MAPE can be misleading when actual $y_i$ values are close to zero, as the percentage error becomes disproportionately large.
  • Mean Absolute Error (MAE) measures the average magnitude of absolute deviations between the estimated and actual SoH values. Lower MAE values indicate higher prediction accuracy. Mathematically, it is expressed as follows:
    $$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right| \tag{14}$$
    Note that MAE is less sensitive to outliers compared to metrics like Root Mean Squared Error (RMSE), as it does not square the error.
  • Root Mean Square Error (RMSE) evaluates the standard deviation of prediction errors, heavily penalizing large deviations. RMSE is computed as follows:
    $$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2} \tag{15}$$
    Note that RMSE is sensitive to outliers.
In these expressions, $\hat{y}_i$ is the estimated SoH, $y_i$ is the ground truth SoH, and $n$ is the total number of SoH observations.
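The three metrics are straightforward to compute. The sketch below is a plain-Python illustration with hypothetical function names and sample values; `mape` returns a fraction, so it would be multiplied by 100 to report a percentage.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; undefined if any y_true is 0."""
    return sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical ground-truth and estimated SoH values (as fractions).
soh_true = [1.0, 0.8]
soh_estimated = [0.9, 0.8]
scores = {
    "MAPE": mape(soh_true, soh_estimated),
    "MAE": mae(soh_true, soh_estimated),
    "RMSE": rmse(soh_true, soh_estimated),
}
```

In this toy case the single 0.1 error yields a larger RMSE than MAE, which illustrates how squaring emphasizes the largest deviation.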

3. Numerical Results and Discussion

We first describe the experimental setup in Section 3.1, then provide a comprehensive experimental validation and performance analysis of our SoH estimation methodology under various noise levels, charging protocols, and operational conditions.

3.1. Experimental Setup

Our proposed SoH estimation method was evaluated using the scenarios detailed in Section 2.2. All experiments were conducted on a workstation assembled in Dell’s manufacturing facility located in Hortolândia, São Paulo, Brazil, with an Intel(R) Core(TM) i7 870 processor and 8 GB RAM. The proposed approach presented in Section 2.1 was implemented in Python 3.8 [54].

3.2. Sensitivity Analysis of Signal Reconstruction

The voltage and temperature discharge profile data for each scenario, detailed in Section 2.2, were distorted at the following signal-to-noise ratio (SNR) levels: 10, 20, 30, 40, and 50 dB. All signals were normalized to zero mean and unit variance before processing to ensure a fair comparison across methods and parameter values. For each SNR level, we identified the optimal regularization parameter δ for both the LASSO and Tikhonov methods using grid search. Our proposed approach assumes that sensor noise follows a zero-mean Gaussian distribution, which is a common characteristic of real-world sensor data according to [55]. Both our proposed approach based on the modified Tikhonov formulation (Equation (8)) and the LASSO formulation (Equation (9)) demonstrated stable performance under noisy conditions. However, our proposed approach achieves the lowest error metrics across multiple SNR levels, outperforming filtering methods such as the moving average (MA) [56] and the Kalman filter [57] for signal reconstruction, as shown in Figure 2.
In contrast to our proposed closed-form solution, which provides stable reconstructions for ill-posed inverse problems, LASSO formulation requires an iterative solution process and can be sensitive to convergence criteria and regularization parameter tuning. Therefore, we found that the LASSO formulation has a higher error variability than our proposed approach based on modified Tikhonov. On the other hand, the moving average (MA) filter operates by smoothing data points over a specified window. However, MA can lead to oversmoothing and the loss of important features in the signal, particularly in the presence of sharp transitions or edges. We found that this characteristic often results in insufficient performance when the signal reconstruction involves non-stationary or rapidly changing signals. In contrast, our proposed approach promotes piecewise constant solutions that are beneficial for reconstructing signals with abrupt changes. Thus, our proposed signal reconstruction method not only reduces noise effectively but also preserves essential signal characteristics that simple averaging methods can overlook. In this study, the centered MA filter [56] was implemented using Equation (16)
$$y[n] = \frac{1}{M} \sum_{i=0}^{M-1} x\!\left[ n + i - \frac{M-1}{2} \right] \tag{16}$$
where $M$ is the window size, i.e., the number of samples of the original input signal $x[n]$ that are averaged, and $n$ is the time index. In this work, we use $M = 5$, obtained from a grid search. Centering the window allows the filter to process both past and future samples in a symmetric manner.
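A minimal sketch of the centered moving average is shown below. The text does not say how the first and last samples are treated, so this example assumes the window is simply truncated at the signal boundaries; the function name and the sample data are also illustrative.

```python
def centered_moving_average(x, window=5):
    """Centered moving average with an odd window size.

    Assumption of this sketch: near the edges the window is truncated to the
    available samples, so the output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = (window - 1) // 2
    out = []
    for n in range(len(x)):
        lo = max(0, n - half)
        hi = min(len(x), n + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


# On a linear ramp, interior samples are unchanged by a centered average.
ramp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
smoothed = centered_moving_average(ramp, window=5)
```

The edge samples are averaged over fewer points, which is one reason why a simple moving average tends to distort the start and end of a discharge profile.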
The Kalman filter addresses the signal reconstruction problem through a recursive estimation procedure that integrates predictions based on the following dynamic model [57]:
$$x_k = x_{k-1} + w_k, \qquad w_k \sim \mathcal{N}(0, Q), \tag{17}$$
where $x_k \in \mathbb{R}$ is the latent (noise-free) signal sample at discrete time step $k$ and $Q > 0$ is the process-noise variance that captures unmodeled perturbations and model mismatch. Each noisy observation $z_k$ was modeled as the latent sample corrupted by additive measurement noise $v_k$, as follows:
$$z_k = x_k + v_k, \qquad v_k \sim \mathcal{N}(0, R), \tag{18}$$
where $R > 0$ is the measurement-noise variance. Our proposed approach based on the Tikhonov formulation uses all data (e.g., all discharge voltage and temperature profiles) in a single batch. In contrast, the Kalman filter performs sequential filtering by updating the reconstruction solution at each sample $k$. The Kalman filter achieves optimality only when $(Q, R)$ match the true process and sensor variances. Hence, we found that our proposed method is more robust to non-optimal hyperparameter specification than the Kalman filter. The experimental results demonstrate that our proposed approach achieves lower bias and variance than the Kalman filter for the signal reconstruction task under varying noise levels. This highlights the importance of sufficient data information for accurate reconstruction, providing stability against noise.
Figure 2 is organized as follows: the boxplots show the RMSE, MAE, and MAPE for all the estimated SoH across different SNRs. The SNR quantifies the measurement error for each measurement using Equation (19):
SNR dB = 10 log 10 E z 2 σ z 2
The error $e \overset{\mathrm{iid}}{\sim} \mathcal{N}(0, \sigma_z^2)$ is used to contaminate each measurement $z_{cm}$ in Equation (2). Signal reconstruction is an ill-conditioned problem, which can lead to convergence issues for iterative solvers. To mitigate this, we introduced the closed-form solution in Equation (8), which ensures reliable performance even when the LASSO formulation encounters convergence difficulties. This solution requires the regularization operator presented in Equation (4), which was introduced in a previous paper by our research group [36]. Since the proposed closed-form expression is non-iterative, a signal reconstruction is obtained in a single step. Consequently, a significant reduction in computational time is observed.
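To make the SNR definition concrete, the sketch below solves the SNR expression for the noise variance $\sigma_z^2$ and adds zero-mean Gaussian noise to a clean signal at a target SNR. The function names, the use of the sample mean as an estimate of $\mathbb{E}[z^2]$, and the synthetic voltage ramp are assumptions made for illustration only.

```python
import math
import random


def noise_variance_for_snr(z, snr_db):
    """Solve SNR_dB = 10*log10(E[z^2] / sigma_z^2) for sigma_z^2."""
    signal_power = sum(v * v for v in z) / len(z)  # sample estimate of E[z^2]
    return signal_power / (10.0 ** (snr_db / 10.0))


def add_gaussian_noise(z, snr_db, seed=0):
    """Add i.i.d. zero-mean Gaussian noise so the signal has the target SNR."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_variance_for_snr(z, snr_db))
    return [v + rng.gauss(0.0, sigma) for v in z]


# Synthetic, slowly decreasing "voltage" ramp (illustrative values only).
clean = [4.2 - 0.01 * k for k in range(150)]
sigma2 = noise_variance_for_snr(clean, snr_db=10.0)
noisy = add_gaussian_noise(clean, snr_db=10.0)

# Recover the SNR from the signal power and the chosen noise variance.
power = sum(v * v for v in clean) / len(clean)
recovered_snr_db = 10.0 * math.log10(power / sigma2)
print(recovered_snr_db)
```

Under this definition, lowering the SNR from 30 dB to 10 dB multiplies the noise variance by 100.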
Table 4 shows that reconstruction quality degrades as the noise level increases. Nevertheless, our proposed approach maintains consistently good accuracy across the tested noise conditions. These findings support the selection of the modified Tikhonov reconstruction stage within the proposed SoH estimation method.
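For reference, the Kalman baseline discussed in this section (a random-walk state observed directly in noise) reduces to a scalar recursion. The following sketch is one possible implementation under that model; the function name, the initial state `x0`, the initial error variance `P0`, and the values of `Q` and `R` are hypothetical choices for illustration, not the settings used in the experiments.

```python
def kalman_random_walk(z, Q, R, x0=None, P0=1.0):
    """Scalar Kalman filter for x_k = x_{k-1} + w_k, z_k = x_k + v_k.

    Q is the process-noise variance and R the measurement-noise variance.
    x0 and P0 are assumed initial values (hypothetical defaults).
    """
    x_est = z[0] if x0 is None else x0
    P = P0
    estimates = []
    for zk in z:
        # Prediction: the random-walk model keeps the state, inflates variance.
        P_pred = P + Q
        # Update: blend the prediction and the measurement via the Kalman gain.
        K = P_pred / (P_pred + R)
        x_est = x_est + K * (zk - x_est)
        P = (1.0 - K) * P_pred
        estimates.append(x_est)
    return estimates


# Step input: the estimate moves gradually toward the new level.
step = [0.0] + [1.0] * 20
est = kalman_random_walk(step, Q=1e-3, R=1e-1)
print(est[-1])
```

The gain depends only on `Q` and `R`, which is why a mismatch between these values and the true variances directly affects how fast the estimate reacts and how much noise it passes through.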

3.3. Sensitivity and Correlation Analysis of Proposed Features

It is necessary to understand the importance of each feature for SoH estimation and to provide insights into model goodness-of-fit in relative terms; that is, not merely whether the model performs well, but which features contribute most to its performance. To do this, we conducted a sensitivity analysis of the proposed features, shown in Table 5. In this analysis, we examined model performance under different feature combinations: using all features, leaving one feature out, and using only a single feature. We evaluated each combination using error metrics (RMSE, MAE, MAPE), the coefficient of determination ($R^2$), and execution time. Note that $R^2$ quantifies the proportion of variance in the SoH variable that is explained by one or more features, defined as
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$$
where $\bar{y}$ is the mean of the true values $y_i$ and $\hat{y}_i$ denotes the corresponding estimates. In this formulation, $R^2 = 1$ indicates a perfect fit of the model to the data, corresponding to zero residual error ($\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = 0$), whereas $R^2 = 0$ indicates that the model is no better than predicting the mean of the data ($\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2$). Thus, $R^2 = 0.90$ means that 90% of the variance in SoH is explained by the model. It is worth noting that $R^2$ can be negative if the model’s predictions are worse than simply using the mean $\bar{y}$, although in a well-trained regression for SoH we expect $R^2$ to be between 0 and 1.
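The three regimes described above (perfect fit, mean predictor, and worse than the mean) can be checked numerically. The sketch below is a direct transcription of the $R^2$ definition; the function name and the SoH values are synthetic and chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Synthetic SoH trajectory (illustrative values only).
soh_true = [1.00, 0.95, 0.90, 0.85, 0.80]
mean_soh = sum(soh_true) / len(soh_true)

r2_perfect = r_squared(soh_true, soh_true)              # zero residual error
r2_mean = r_squared(soh_true, [mean_soh] * 5)           # mean predictor
r2_reversed = r_squared(soh_true, soh_true[::-1])       # worse than the mean
print(r2_perfect, r2_mean, r2_reversed)
```

The reversed prediction yields a negative value, illustrating that $R^2$ is not bounded below by zero.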
We use the Pearson correlation coefficient to measure the degree of linear correlation between our proposed features and the SoH, based on the following formulation
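$$r = \frac{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 \sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}}$$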
where $x_i$ denotes the values of a proposed feature and $y_i$ the corresponding SoH values, and $\bar{x}$ and $\bar{y}$ are their average values, respectively. The Pearson correlation coefficient ranges over $[-1, 1]$. The closer $r$ is to either extreme, the stronger the linear correlation between the proposed feature and SoH. For detailed results of the correlation analysis, see Table 6.
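The Pearson coefficient defined above can be computed in a few lines. In this sketch the function name and the feature and SoH values are synthetic placeholders, not data from Table 6.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic example: a timing feature (in seconds) that shrinks as SoH drops.
feature = [3300.0, 3150.0, 3000.0, 2800.0, 2650.0]
soh = [1.00, 0.96, 0.91, 0.86, 0.82]
r = pearson_r(feature, soh)
print(r)
```

A value of $r$ close to $+1$ or $-1$ for a feature indicates that a linear model can track SoH from that feature alone; values near zero indicate little linear relationship.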
Based on Table 6, we found that tracking the time to minimum discharge voltage and the time to maximum discharge temperature can be used as effective features to estimate SoH in data-driven models, as they are directly correlated with capacity loss and a decrease in power output.
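As a rough illustration of the two timing features named above, the following sketch locates the minimum of a discharge voltage profile and the maximum of a temperature profile and reports the elapsed time to each. The exact feature definitions are given in Section 2.1.2 and are not reproduced here; the function names, the choice to measure time from the first sample, and the profile values are assumptions of this example.

```python
def time_to_min_voltage(t, voltage):
    """Elapsed time from the first sample to the minimum discharge voltage."""
    idx = min(range(len(voltage)), key=voltage.__getitem__)
    return t[idx] - t[0]


def time_to_max_temperature(t, temperature):
    """Elapsed time from the first sample to the maximum discharge temperature."""
    idx = max(range(len(temperature)), key=temperature.__getitem__)
    return t[idx] - t[0]


# Synthetic discharge profile (time in seconds; illustrative values only).
t = [0.0, 600.0, 1200.0, 1800.0, 2400.0, 3000.0]
voltage = [4.1, 3.8, 3.6, 3.4, 2.7, 3.2]            # rebounds after the load is removed
temperature = [24.0, 27.0, 30.0, 33.0, 36.0, 35.0]

t_vmin = time_to_min_voltage(t, voltage)
t_tmax = time_to_max_temperature(t, temperature)
print(t_vmin, t_tmax)
```

In the proposed pipeline these quantities would be computed on the reconstructed (denoised) profiles, since a single noisy sample can otherwise shift the location of the extremum.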

3.4. Comparison of Feature Selection: Direct Features vs Proposed

We compared our five proposed features, detailed in Section 2.1.2, with the direct features approach reported in [58], which uses the following ten features: the minimum, maximum, and average values of voltage, current, and temperature, plus the total discharge time. Both methods use Huber regression, which is robust to outliers, along with the proposed signal reconstruction stage; they differ only in their features. As depicted in Figure 3, our proposed method outperformed the direct features method in terms of accuracy across different battery-aging scenarios. Despite using half as many features as the direct approach [58], our proposed features perform better, showing that targeted feature engineering is more effective than collecting a wide range of statistics.
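The ten direct features listed above can be sketched as simple summary statistics of one discharge cycle. This is an interpretation of the description given here, not the implementation of [58]; the function name, dictionary keys, and sample values are hypothetical.

```python
def direct_features(t, voltage, current, temperature):
    """Ten summary statistics of one discharge cycle (illustrative sketch)."""
    features = {}
    for name, series in (
        ("voltage", voltage),
        ("current", current),
        ("temperature", temperature),
    ):
        features[name + "_min"] = min(series)
        features[name + "_max"] = max(series)
        features[name + "_mean"] = sum(series) / len(series)
    # Total discharge time: elapsed time between first and last sample.
    features["discharge_time"] = t[-1] - t[0]
    return features


# Synthetic discharge cycle (illustrative values only).
t = [0.0, 900.0, 1800.0, 2700.0]
voltage = [4.1, 3.7, 3.4, 2.7]
current = [2.0, 2.0, 2.0, 2.0]
temperature = [24.0, 28.0, 32.0, 35.0]

feats = direct_features(t, voltage, current, temperature)
print(len(feats), feats["discharge_time"])
```

Because minima and maxima are taken over raw samples, such statistics are sensitive to individual outliers, which is one reason both compared methods operate on reconstructed signals.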

3.5. Data-Driven Robustness Comparison

To evaluate the robustness [55] of our proposed approach, we adjusted the experimental setup presented in [59] by applying more intense noise contamination. The original study [59] used Gaussian white noise with an SNR of 30 dB, whereas we evaluated our model at an SNR of 10 dB to simulate more adverse data conditions. We compared our proposed approach with the deep neural network (DNN) model for SoH estimation, which is described in [31] and summarized in Section 3.5.1. To eliminate any bias that might arise from differences in noise handling, we applied the same signal reconstruction stage based on modified Tikhonov regularization to the DNN model as in our proposed SoH estimation approach. Note that SoH estimations were performed across different battery aging stages, operational conditions, and charging protocols.

3.5.1. DNN

The DNN model extracts three key temporal features from the constant current–constant voltage (CC-CV) charging process: (1) the initial voltage inflection point, which characterizes the early charging behavior; (2) the CC-CV transition time, occurring when the cell reaches 4.2 V and the current decreases to 1.5 A; and (3) the time to reach peak cell temperature, which captures thermal characteristics during charging. The DNN architecture consists of two hidden layers with three neurons each, employing Rectified Linear Unit (ReLU) activation functions between layers.
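To give a sense of the size of the network described above, the sketch below builds a forward pass with three inputs, two hidden layers of three neurons each, and ReLU activations. The single linear output neuron, the random (untrained) weights, the helper names, and the input values are assumptions for illustration; this is not the trained model of [31].

```python
import random


def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Random, untrained parameters (placeholders for learned values)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(features, layers):
    """ReLU on hidden layers, linear output layer (assumed)."""
    h = features
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    out_weights, out_biases = layers[-1]
    return dense(h, out_weights, out_biases)[0]


rng = random.Random(0)
layers = [init_layer(3, 3, rng), init_layer(3, 3, rng), init_layer(3, 1, rng)]
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

# Three normalized temporal features (hypothetical values).
estimate = forward([0.2, 0.5, 0.8], layers)
print(n_params, estimate)
```

With this layout the network has 28 trainable parameters: 12 in each hidden layer and 4 in the output layer.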
As illustrated in Figure 4, our proposed approach consistently estimates the SoH throughout the battery life cycles. The temporary increment in the measured value of SoH (non-linearities) in Figure 4 corresponds to the capacity regeneration phenomenon that occurs in lithium-ion batteries [60]. Our approach demonstrates superior robustness against noise, particularly in Scenario 2, where the presence of high noise levels significantly affects the DNN model [31]. While DNN models require large datasets for effective learning, our method achieves comparable or superior performance with significantly fewer training samples in noisy conditions. In addition, in Scenario 3, which includes multiple missing values in the training data, our method maintained its resilience, further demonstrating its robustness under adverse conditions.
Compared to the DNN model [31], our proposed approach achieves superior performance (see Figure 4), with an RMSE on the order of $10^{-4}$, an MAE on the order of $10^{-2}$, and a MAPE below 1%. These results confirm the ability of our proposed approach to perform consistent and accurate SoH estimations across different discharge conditions and with limited training data. We further benchmarked our results against prior state-of-the-art methods, as shown in Table 1.
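The three error metrics reported throughout this section can be computed as in the sketch below, which uses the standard textbook definitions of RMSE, MAE, and MAPE. The paper's own formulas are not reproduced in this section, so the definitions, function names, and sample values here are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / n


def mape(y_true, y_pred):
    """Mean absolute percentage error (requires nonzero true values)."""
    n = len(y_true)
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / n


# Synthetic SoH values and estimates (illustrative only).
soh_true = [1.00, 0.90, 0.80]
soh_pred = [0.99, 0.91, 0.80]
print(rmse(soh_true, soh_pred), mae(soh_true, soh_pred), mape(soh_true, soh_pred))
```

RMSE penalizes large individual errors more heavily than MAE, while MAPE expresses the error relative to the true SoH, which makes results comparable across cells with different capacities.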

3.6. Comparison with Hybrid Model

Our proposed approach is a purely data-driven machine learning (ML) model. We compare our proposed approach with the hybrid approach in [61], which combines ML-based predictions with a physics-based model. In the test of the hybrid approach, we employed the proposed pipeline in Section 2.1 as the machine learning component. For the physics-based model, an exponential decay assumption was employed, as follows:
$$\mathrm{SoH}_{\mathrm{phys}}(n) = C_f + \left(C_0 - C_f\right) \exp\left(-\beta n^{\alpha}\right)$$
where $C_0$ represents the initial capacity (normalized to 1.0), $C_f$ is the final capacity (e.g., 0.8), $n$ is the cycle count, and $\beta$ and $\alpha$ are decay parameters that control how quickly the SoH degrades with cycle count. As shown in Figure 5 and Figure 6, the hybrid model might not perfectly capture real-world battery dynamics. This limitation underscores that our proposed data-driven method can help correct or improve these SoH estimations. To compute the following results, the hybrid model fusion was weighted 30% physics and 70% ML ($w = 0.3$), as follows:
$$\mathrm{SoH}_{\mathrm{hybrid}} = w \cdot \mathrm{SoH}_{\mathrm{phys}} + (1 - w) \cdot \mathrm{SoH}_{\mathrm{ML}}$$
The hybrid approach was tested using the six scenarios presented in Section 2.2.
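The hybrid baseline can be sketched as follows, assuming the exponential decay takes the form $C_f + (C_0 - C_f)\exp(-\beta n^{\alpha})$ and the fusion uses $w = 0.3$ for the physics term. The values of `beta` and `alpha`, the function names, and the ML estimate used in the example are hypothetical placeholders; the fitted parameters are not stated in this section.

```python
import math


def soh_physics(n, c0=1.0, cf=0.8, beta=0.01, alpha=1.0):
    """Assumed exponential-decay SoH model as a function of cycle count n.

    beta and alpha are hypothetical decay parameters chosen for illustration.
    """
    return cf + (c0 - cf) * math.exp(-beta * n ** alpha)


def soh_hybrid(soh_phys, soh_ml, w=0.3):
    """Weighted fusion: w on the physics term, (1 - w) on the ML estimate."""
    return w * soh_phys + (1.0 - w) * soh_ml


# The decay starts at c0 and approaches cf as the cycle count grows.
for cycle in (0, 50, 100, 200):
    print(cycle, soh_physics(cycle))

# Fusing a physics value with a hypothetical ML estimate for the same cycle.
fused = soh_hybrid(soh_physics(100), soh_ml=0.93)
print(fused)
```

Because the physics term is a smooth monotonic curve, it cannot follow local effects such as capacity regeneration, so any mismatch in `beta` or `alpha` is carried into the fused estimate with weight `w`.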
Table 7 compares the accuracy of the proposed SoH estimation pipeline with three alternatives over six representative aging scenarios. These results confirm that our proposed pipeline estimates SoH accurately even under severe noise, missing data, and variable discharge protocols, where the alternative methods lose accuracy.
As shown in Table 8, SoH estimation of lithium-ion batteries is influenced by operational parameters such as discharge C-rate, temperature, and charging protocol. Higher discharge C-rates tend to produce larger estimation errors, although our proposed approach maintains competitive accuracy. Low temperatures such as 4 °C lead to higher errors than moderate temperatures such as 24 °C. The CC-CV charging protocol yields better SoH estimation performance than the CC protocol.

4. Conclusions

This study demonstrates that the robustness of state-of-health estimation for lithium-ion batteries is significantly influenced by the choice of machine learning architecture, loss function, feature selection, and signal reconstruction technique. The performance of our proposed approach remained stable under low-temperature (4 °C) operation, high discharge rates (2C), severe Gaussian noise (10 dB SNR), and missing data, where DNN and hybrid models lost accuracy. This work also demonstrated that well-engineered features, obtained using domain knowledge, capture relevant information more effectively than a larger quantity of statistics. In this case, our proposed approach, which relies on only five engineered features, outperformed a comparable model that used ten statistical features.
Our proposed closed-form signal reconstruction approach based on modified Tikhonov regularization achieves superior reconstruction quality across various noise levels, compared to the iterative LASSO, moving average filter, and Kalman filter. Our proposed data-driven SoH estimation approach demonstrated high accuracy under noisy conditions, with a low computational cost.
Future research will aim to extend our proposed SoH estimation approach by incorporating additional relevant physical parameters, developing adaptive methods for selecting the regularization parameter ($\delta$), and performing experiments using non-Gaussian noise and extreme operating conditions. Our current results highlight the potential of data-driven methods for achieving accurate and robust SoH estimation within battery management systems (BMS), which are critical for the safe and efficient operation of electric vehicles (EVs). Hence, future work will also focus on validating our proposed approach under more diverse and variable battery aging conditions, using real-world EV datasets.

Author Contributions

Conceptualization, B.A.A.A., D.E.C.B. and L.C.P.d.S.; methodology, B.A.A.A. and L.C.P.d.S.; software Python (version 3.8), D.E.C.B.; validation, L.C.P.d.S., J.C.R. and F.G.; formal analysis, D.E.C.B. and L.C.P.d.S.; investigation, B.A.A.A.; resources, L.C.P.d.S.; data curation, F.G. and J.C.R.; writing—original draft preparation, B.A.A.A.; writing—review and editing, D.E.C.B. and L.C.P.d.S.; visualization, D.E.C.B.; supervision, J.C.R.; project administration, L.C.P.d.S.; funding acquisition, L.C.P.d.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the following Brazilian Research Agencies: FAPESP, CAPES, CNPq, INEP, and FINEP. The authors are funded by grants #2025/01540-6, #2022/16881-5, #2020/03069-5, and #2021/11380-5, Centro Paulista de Estudos da Transição Energética (CPTEn), São Paulo Research Foundation (FAPESP). This work was also developed under the Electricity Sector Research and Development Program PD-00063-3058/2019-PA3058: “MERGE—Microgrids for Efficient, Reliable and Greener Energy”, regulated by the National Electricity Agency (ANEEL in Portuguese), in partnership with CPFL Energia (Local Electricity Distributor). This work was supported by the Universidad San Francisco de Quito through the Poli-Grants Program under Grant 24263.

Data Availability Statement

In this study, two publicly available lithium-ion battery aging datasets were employed, the NASA battery dataset [49] and Sandia National Laboratories (SNL) battery degradation dataset [50].

Conflicts of Interest

Author Juan Carlos Rodríguez was employed by the company Analog Devices Inc. All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Greim, P.; Solomon, A.; Breyer, C. Assessment of lithium criticality in the global energy transition and addressing policy gaps in transportation. Nat. Commun. 2020, 11, 4570.
  2. Lagrange, A.; de Simón-Martín, M.; González-Martínez, A.; Bracco, S.; Rosales-Asensio, E. Sustainable microgrids with energy storage as a means to increase power resilience in critical facilities: An application to a hospital. Int. J. Electr. Power Energy Syst. 2020, 119, 105865.
  3. Department of Energy. DOE Global Energy Storage Database, 2023. Available online: https://gesdb.sandia.gov/ (accessed on 8 September 2023).
  4. Stroe, D.I.; Schaltz, E. Lithium-ion battery state-of-health estimation using the incremental capacity analysis technique. IEEE Trans. Ind. Appl. 2019, 56, 678–685.
  5. Kallel, A.Y.; Petrychenko, V.; Kanoun, O. State-of-health of Li-ion battery estimation based on the efficiency of the charge transfer extracted from impedance spectra. Appl. Sci. 2022, 12, 885.
  6. Oji, T.; Zhou, Y.; Ci, S.; Kang, F.; Chen, X.; Liu, X. Data-driven methods for battery soh estimation: Survey and a critical analysis. IEEE Access 2021, 9, 126903–126916.
  7. John, J.; Kudva, G.; Jayalakshmi, N. Secondary life of electric vehicle batteries: Degradation, state of health estimation using incremental capacity analysis, applications and challenges. IEEE Access 2024, 12, 63735–63753.
  8. Gharebaghi, M.; Rezaei, O.; Li, C.; Wang, Z.; Tang, Y. A Survey on Using Second-Life Batteries in Stationary Energy Storage Applications. Energies 2024, 18, 42.
  9. Zhang, J.; Li, K. State-of-Health Estimation for Lithium-Ion Batteries in Hybrid Electric Vehicles—A Review. Energies 2024, 17, 5753.
  10. Liu, K.; Wang, Y.; Lai, X. Data Science-Based Full-Lifespan Management of Lithium-Ion Battery: Manufacturing, Operation and Reutilization; Springer Nature: Berlin/Heidelberg, Germany, 2022.
  11. Tian, J.; Xiong, R.; Shen, W. A review on state of health estimation for lithium ion batteries in photovoltaic systems. ETransportation 2019, 2, 100028.
  12. Wilhelm, J.; Seidlmayer, S.; Keil, P.; Schuster, J.; Kriele, A.; Gilles, R.; Jossen, A. Cycling capacity recovery effect: A coulombic efficiency and post-mortem study. J. Power Sources 2017, 365, 327–338.
  13. Bayoumi, E.H.; De Santis, M.; Awad, H. A Brief Overview of Modeling Estimation of State of Health for an Electric Vehicle’s Li-Ion Batteries. World Electr. Veh. J. 2025, 16, 73.
  14. Andersson, M.; Streb, M.; Ko, J.Y.; Klass, V.L.; Klett, M.; Ekström, H.; Johansson, M.; Lindbergh, G. Parametrization of physics-based battery models from input–output data: A review of methodology and current research. J. Power Sources 2022, 521, 230859.
  15. Iurilli, P.; Brivio, C.; Carrillo, R.E.; Wood, V. Physics-Based SoH Estimation for Li-Ion Cells. Batteries 2022, 8, 204.
  16. Sun, X.; Zhang, Y.; Zhang, Y.; Wang, L.; Wang, K. Summary of health-state estimation of lithium-ion batteries based on electrochemical impedance spectroscopy. Energies 2023, 16, 5682.
  17. Tran, M.K.; Mathew, M.; Janhunen, S.; Panchal, S.; Raahemifar, K.; Fraser, R.; Fowler, M. A comprehensive equivalent circuit model for lithium-ion batteries, incorporating the effects of state of health, state of charge, and temperature on model parameters. J. Energy Storage 2021, 43, 103252.
  18. Velasco-Arellano, H.; Castillo-Magallanes, N.; Visairo-Cruz, N.; Núñez-Gutiérrez, C.A.; Lázaro, I. Parametric Correlation Analysis between Equivalent Electric Circuit Model and Mechanistic Model Interpretation for Battery Internal Aging. World Electr. Veh. J. 2024, 15, 291.
  19. Lu, J.; Wu, T.; Amine, K. State-of-the-art characterization techniques for advanced lithium-ion batteries. Nat. Energy 2017, 2, 1–13.
  20. Ren, Z.; Du, C. A review of machine learning state-of-charge and state-of-health estimation algorithms for lithium-ion batteries. Energy Rep. 2023, 9, 2993–3021.
  21. Zhang, M.; Yang, D.; Du, J.; Sun, H.; Li, L.; Wang, L.; Wang, K. A review of SOH prediction of Li-ion batteries based on data-driven algorithms. Energies 2023, 16, 3167.
  22. Gong, J.; Xu, B.; Chen, F.; Zhou, G. Predictive Modeling for Electric Vehicle Battery State of Health: A Comprehensive Literature Review. Energies 2025, 18, 337.
  23. Nuroldayeva, G.; Serik, Y.; Adair, D.; Uzakbaiuly, B.; Bakenov, Z. State of health estimation methods for lithium-ion batteries. Int. J. Energy Res. 2023, 2023, 4297545.
  24. Alamin, K.S.S.; Daghero, F.; Pollo, G.; Pagliari, D.J.; Chen, Y.; Macii, E.; Poncino, M.; Vinco, S. Model-Driven Dataset Generation for Data-Driven Battery SOH Models. In Proceedings of the 2023 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), Vienna, Austria, 7–8 August 2023; pp. 1–6.
  25. Bamati, S.; Chaoui, H. Developing an online data-driven state of health estimation of lithium-ion batteries under random sensor measurement unavailability. IEEE Trans. Transp. Electrif. 2022, 9, 1128–1141.
  26. Wang, J.; Zhang, C.; Meng, X.; Zhang, L.; Li, X.; Zhang, W. A Novel Feature Engineering-Based SOH Estimation Method for Lithium-Ion Battery with Downgraded Laboratory Data. Batteries 2024, 10, 139.
  27. Zhang, Z.; Zhang, R.; Liu, X.; Zhang, C.; Sun, G.; Zhou, Y.; Yang, Z.; Liu, X.; Chen, S.; Dong, X.; et al. Advanced State-of-Health Estimation for Lithium-Ion Batteries Using Multi-Feature Fusion and KAN-LSTM Hybrid Model. Batteries 2024, 10, 433.
  28. Lu, X.; Qiu, J.; Lei, G.; Zhu, J. Degradation mode knowledge transfer method for LFP batteries. IEEE Trans. Transp. Electrif. 2022, 9, 1142–1152.
  29. Lu, X.; Qiu, J.; Lei, G.; Zhu, J. State of health estimation of lithium iron phosphate batteries based on degradation knowledge transfer learning. IEEE Trans. Transp. Electrif. 2023, 9, 4692–4703.
  30. Alhazmi, R.M. State of health prediction in electric vehicle batteries using a deep learning model. World Electr. Veh. J. 2024, 15, 385.
  31. Driscoll, L.; de la Torre, S.; Gomez-Ruiz, J.A. Feature-based lithium-ion battery state of health estimation with artificial neural networks. J. Energy Storage 2022, 50, 104584.
  32. Khumprom, P.; Yodo, N. A data-driven predictive prognostic model for lithium-ion batteries based on a deep learning algorithm. Energies 2019, 12, 660.
  33. Cui, Z.; Wang, C.; Gao, X.; Tian, S. State of health estimation for lithium-ion battery based on the coupling-loop nonlinear autoregressive with exogenous inputs neural network. Electrochim. Acta 2021, 393, 139047.
  34. Fan, Y.; Xiao, F.; Li, C.; Yang, G.; Tang, X. A novel deep learning framework for state of health estimation of lithium-ion battery. J. Energy Storage 2020, 32, 101741.
  35. Zhou, D.; Li, Z.; Zhu, J.; Zhang, H.; Hou, L. State of health monitoring and remaining useful life prediction of lithium-ion batteries based on temporal convolutional network. IEEE Access 2020, 8, 53307–53320.
  36. Acurio, B.A.A.; Barragán, D.E.C.; Amezquita, J.C.L.; Rider, M.J.; Da Silva, L.C.P. Design and Implementation of a Machine Learning State Estimation Model for Unobservable Microgrids. IEEE Access 2022, 10, 123387–123398.
  37. Rufino Júnior, C.A.; Sanseverino, E.R.; Gallo, P.; Amaral, M.M.; Koch, D.; Kotak, Y.; Diel, S.; Walter, G.; Schweiger, H.G.; Zanin, H. Unraveling the degradation mechanisms of lithium-ion batteries. Energies 2024, 17, 3372.
  38. Wei, Y.; Yao, Y.; Pang, K.; Xu, C.; Han, X.; Lu, L.; Li, Y.; Qin, Y.; Zheng, Y.; Wang, H.; et al. A comprehensive study of degradation characteristics and mechanisms of commercial Li (NiMnCo) O2 EV batteries under vehicle-to-grid (V2G) services. Batteries 2022, 8, 188.
  39. Yagci, M.C.; Feldmann, T.; Bollin, E.; Schmidt, M.; Bessler, W.G. Aging characteristics of stationary lithium-ion battery systems with serial and parallel cell configurations. Energies 2022, 15, 3922.
  40. Edge, J.S.; O’Kane, S.; Prosser, R.; Kirkaldy, N.D.; Patel, A.N.; Hales, A.; Ghosh, A.; Ai, W.; Chen, J.; Yang, J.; et al. Lithium ion battery degradation: What you need to know. Phys. Chem. Chem. Phys. 2021, 23, 8200–8221.
  41. Yang, X.; Wang, Z.; Xie, S. Influence of Overdischarge Depth on the Aging and Thermal Safety of LiNi0.5Co0.2Mn0.3O2/Graphite Cells. Battery Energy 2025, e70008.
  42. Li, S.; Ke, B. Study of battery modeling using mathematical and circuit oriented approaches. In Proceedings of the 2011 IEEE Power and Energy Society General Meeting, Detroit, MI, USA, 24–29 July 2011; pp. 1–8.
  43. Birkl, C.R.; Roberts, M.R.; McTurk, E.; Bruce, P.G.; Howey, D.A. Degradation diagnostics for lithium ion cells. J. Power Sources 2017, 341, 373–386.
  44. Shen, W.; Wang, N.; Zhang, J.; Wang, F.; Zhang, G. Heat generation and degradation mechanism of lithium-ion batteries during high-temperature aging. ACS Omega 2022, 7, 44733–44742.
  45. Huang, R.; Xu, Y.; Wu, Q.; Chen, J.; Chen, F.; Yu, X. Simulation study on heat generation characteristics of lithium-ion battery aging process. Electronics 2023, 12, 1444.
  46. Wang, Q.; Ma, Y.; Zhao, K.; Tian, Y. A comprehensive survey of loss functions in machine learning. Ann. Data Sci. 2020, 9, 187–212.
  47. Mangasarian, O.L.; Musicant, D.R. Robust linear and support vector regression. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 950–955.
  48. Huber, P.J. Robust estimation of a location parameter. In Breakthroughs in Statistics: Methodology and Distribution; Springer: Berlin/Heidelberg, Germany, 1992; pp. 492–518.
  49. Saha, B.; Goebel, K. Battery data set. In NASA AMES Prognostics Data Repository; NASA Ames Research Center: Moffett Field, CA, USA, 2007. Available online: https://www.nasa.gov/intelligent-systems-division/discovery-and-systems-health/pcoe/pcoe-data-set-repository/ (accessed on 1 January 2023).
  50. Preger, Y.; Barkholtz, H.M.; Fresquez, A.; Campbell, D.L.; Juba, B.W.; Romàn-Kustas, J.; Ferreira, S.R.; Chalamala, B. Degradation of commercial lithium-ion cells as a function of chemistry and cycling conditions. J. Electrochem. Soc. 2020, 167, 120532.
  51. Ma, S.; Jiang, M.; Tao, P.; Song, C.; Wu, J.; Wang, J.; Deng, T.; Shang, W. Temperature effect and thermal impact in lithium-ion batteries: A review. Prog. Nat. Sci. Mater. Int. 2018, 28, 653–666.
  52. Battery Archive. Battery Archive, 2025. Available online: https://www.batteryarchive.org/ (accessed on 6 March 2025).
  53. Lipu, M.S.H.; Mamun, A.A.; Ansari, S.; Miah, M.S.; Hasan, K.; Meraj, S.T.; Abdolrasol, M.G.; Rahman, T.; Maruf, M.H.; Sarker, M.R.; et al. Battery management, key technologies, methods, issues, and future trends of electric vehicles: A pathway toward achieving sustainable development goals. Batteries 2022, 8, 119.
  54. Python Software Foundation. Python Programming Language; Python Software Foundation: Wilmington, DE, USA, 1991–2024; Available online: https://www.python.org/ (accessed on 1 January 2023).
  55. Lucu, M.; Martinez-Laserna, E.; Gandiaga, I.; Camblong, H. A critical review on self-adaptive Li-ion battery ageing models. J. Power Sources 2018, 401, 85–101.
  56. Engelberg, S. Implementing Moving Average Filters Using Recursion [Tips & Tricks]. IEEE Signal Process. Mag. 2023, 40, 78–80.
  57. Kim, Y.; Bang, H. Introduction to Kalman filter and its applications. In Introduction and Implementations of the Kalman Filter; IntechOpen: Rijeka, Croatia, 2018.
  58. Jiang, N.; Zhang, J.; Jiang, W.; Ren, Y.; Lin, J.; Khoo, E.; Song, Z. Driving behavior-guided battery health monitoring for electric vehicles using extreme learning machine. Appl. Energy 2024, 364, 123122.
  59. Lin, M.; Zeng, X.; Wu, J. State of health estimation of lithium-ion battery based on an adaptive tunable hybrid radial basis function network. J. Power Sources 2021, 504, 230063.
  60. Deng, L.M.; Hsu, Y.C.; Li, H.X. An improved model for remaining useful life prediction on capacity degradation and regeneration of lithium-ion battery. In Proceedings of the Annual Conference of the PHM Society, Jeju, Republic of Korea, 2–5 October 2017; Volume 9.
  61. Liang, J.; Liu, H.; Xiao, N.C. A hybrid approach based on deep neural network and double exponential model for remaining useful life prediction. Expert Syst. Appl. 2024, 249, 123563.
Figure 1. Proposed state of health estimation for lithium-ion batteries based on signal reconstruction.
Figure 2. Sensitivity analysis of signal reconstruction quality across various SNRs (10–50 dB). Our proposed approach based on Tikhonov consistently outperforms widely adopted signal processing methods, including the LASSO formulation, moving average (MA) filter, and the Kalman filter, across multiple accuracy metrics. Boxplots show (a) RMSE, (b) MAE, and (c) MAPE, with the proposed approach yielding the lowest error dispersion across all metrics, highlighting its effectiveness in preserving signal quality under noise.
Figure 3. Performance comparison of state-of-health (SoH) estimation across three scenarios using direct features reported in [58] vs. our proposed feature selection approach. In each scenario, the left plot shows predicted SoH at different numbers of aging cycles. The boxplots show RMSE, MAE, and MAPE. The proposed method consistently exhibits lower prediction errors, highlighting its effectiveness over direct features.
Figure 4. (a) In Scenario 1, our proposed approach consistently estimates the SoH throughout the battery life cycle. (b) In Scenario 2, the comparison highlights the robustness of our proposed approach against noisy data, whereas the DNN approach [31] is sensitive to noise. (c) In Scenario 3, the results demonstrate the resilience of our proposed approach to some missing values.
Figure 5. Comparison of the proposed SoH approach (red) versus the hybrid model (green) [61] across multiple scenarios based on the NASA dataset [49] and detailed in Section 2.2. The left plots show the SoH predictions for each scenario over the number of aging cycles (the ground truth is in black). The boxplots show the corresponding error metrics (RMSE, MAE, and MAPE). Overall, our proposed purely data-driven approach captures real-world battery aging dynamics more accurately and consistently than the hybrid model.
Figure 6. Comparison of the proposed SoH approach (red) versus the hybrid model (orange) [61] across multiple scenarios based on the SNL dataset [50] and detailed in Section 2.2. The left plots show the SoH predictions for each scenario over the number of aging cycles (the ground truth is in black). The boxplots show the corresponding error metrics (RMSE, MAE, and MAPE). Our proposed purely data-driven approach adapts to the nuances of battery aging in ways the hybrid approach could not.
Table 1. Comparison of proposed SoH estimator with previous data-driven approaches.
Data-Driven MethodNumber of Input FeaturesRobustnessTrainable ParametersPerformance Metric
Proposed Approach5 proposed features based on signal reconstructionOur proposed signal reconstruction approach can handle noise and outliers in the measurement data6 polynomial parameters trained with Huber cost functionRMSE =  10 4 % ,
MAE =  10 2 % ,
MAPE = 1%
Deep Neural Network [31]3 (Direct features)The reported approach requires preparing the data by removing significant outliers manually2 hidden layers with 30 and 15 neurons, respectively, as well as Sigmoid and Tanh activation functionsRMSE = 1.9 × 10 4 % ,
MAPE = 1.39%
Deep Neural Network [32]6 (Direct features)The paper does not discuss a dedicated noise handling mechanism217 trainable parametersRMSE = 0.004758%,
MAE = 0.534%
Nonlinear Autoregressive Exogenous Neural Network [33]8 (Model-based features)The paper does not discuss a dedicated noise handling mechanismHidden neurons = 50, Feedback delays = 8MAE = 0.72%,
MaxE = 4.69%
Gated Recurrent Unit Network [34]3 (Direct features)Gaussian noise injection with a mean of 0 and a standard deviation of 1–2% into the voltage, current, and temperature measurements (works on less noise-corrupted signals)Hidden neurons = 256 (GRU), convolution number = 64, size of each convolution layer 32 × 1MAE = 1.03%,
MaxE = 4.11%
Convolutional Neural Network [35]1 (Preprocessed features)The reported approach is sensitive to noise and outliersNumber of convolution kernels = 256, size of the kernel = 3 × 1RMSE = 1.1%,
MAE = 0.9%
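Table 1 notes that the proposed model is trained with a Huber cost function. As a minimal sketch of why that choice limits the influence of outliers, the snippet below compares the standard Huber loss with the squared loss on a few hypothetical residuals. The threshold `delta = 1.0` and the residual values are assumptions for illustration; the threshold used by the authors is not stated in this section.

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss: quadratic for |r| <= delta, linear beyond it."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * residual ** 2
    return delta * (magnitude - 0.5 * delta)


def squared_loss(residual):
    """Half squared error, shown for comparison."""
    return 0.5 * residual ** 2


# Hypothetical residuals (prediction minus target); the last one is an outlier.
residuals = [0.1, -0.2, 0.05, 8.0]
for r in residuals:
    print(f"r = {r:5.2f}  squared = {squared_loss(r):8.4f}  huber = {huber_loss(r):8.4f}")
```

For small residuals the two losses coincide, while the outlier contributes linearly rather than quadratically to the Huber cost, so a single corrupted sample pulls the fit far less.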
Table 2. Operating parameters of batteries from NASA dataset.

| Battery ID | End Voltage (V) | Ambient Temperature (°C) | Nominal Capacity (Ah) | Discharge Current (A) | End of Life Criteria (Ah) | No. of Cycles |
|---|---|---|---|---|---|---|
| B0005 | 2.7 | 4 to 40 | 2 | 2 | 1.4 | 168 |
| B0007 | 2.2 | 4 to 40 | 2 | 2 | 1.4 | 168 |
| B0018 | 2.5 | 4 to 40 | 2 | 2 | 1.4 | 132 |
| B0033 | 2.0 | 24 | 2 | 4 | 1.6 | 198 |
| B0034 | 2.2 | 24 | 2 | 4 | 1.6 | 198 |
| B0046 | 2.2 | 4 | 2 | 1 | 1.4 | 72 |
| B0047 | 2.5 | 4 | 2 | 1 | 1.4 | 72 |
| B0048 | 2.7 | 4 | 2 | 1 | 1.4 | 72 |
Table 3. Operating parameters of batteries from SNL dataset.

| Cathode Chemistry | NCA | NMC |
|---|---|---|
| Manufacturer | Panasonic | LG Chem |
| Manufacturer City—Country | Osaka—Japan | Seoul—Republic of Korea |
| Manufacturer PN | NCR18650B | 18650HG2 |
| Battery type | 18650 | 18650 |
| Nominal capacity [Ah] | 3.2 | 3 |
| Nominal voltage [V] | 3.6 | 3.6 |
| Voltage range [V] | 2.5–4.2 | 2.0–4.2 |
| Max discharge current [A] | 6 | 20 |
| Temperature range [°C] | 0–45 | −5–50 |
| Charge C-rate | 0.5C | 0.5C |
| Discharge C-rate | 0.5C/1C/2C | 0.5C/1C/2C |
| Test temperature [°C] | 15/25/35 | 15/25/35 |
| Depth of discharge | 0–100% | 0–100% |
Table 4. Comparison of signal reconstruction accuracy for the proposed closed-form Tikhonov, the iterative LASSO formulation, a centred moving-average (MA) filter, and a recursive Kalman filter under five noise conditions (SNR = 10–50 dB). Lower values of RMSE, MAE, and MAPE indicate superior reconstruction quality. The best results in each column are in bold.

| Metric | Method | SNR 10 dB | SNR 20 dB | SNR 30 dB | SNR 40 dB | SNR 50 dB | Average |
|---|---|---|---|---|---|---|---|
| RMSE | Tikhonov | 0.0404 | 0.0210 | 0.0172 | 0.0165 | 0.0164 | 0.0223 |
| RMSE | MA | 0.0408 | 0.0278 | 0.0252 | 0.0248 | 0.0248 | 0.0287 |
| RMSE | Lasso | 0.0547 | 0.0248 | 0.0240 | 0.0247 | 0.0248 | 0.0306 |
| RMSE | Kalman | 0.0594 | 0.0537 | 0.0511 | 0.0515 | 0.0515 | 0.0534 |
| MAE | Tikhonov | 0.0295 | 0.0112 | 0.0064 | 0.0041 | 0.0035 | 0.0109 |
| MAE | MA | 0.0273 | 0.0120 | 0.0077 | 0.0060 | 0.0056 | 0.0117 |
| MAE | Lasso | 0.0427 | 0.0144 | 0.0104 | 0.0091 | 0.0088 | 0.0171 |
| MAE | Kalman | 0.0381 | 0.0338 | 0.0311 | 0.0318 | 0.0319 | 0.0334 |
| MAPE (%) | Tikhonov | 0.8497 | 0.3295 | 0.1940 | 0.1299 | 0.1133 | 0.3233 |
| MAPE (%) | MA | 0.7894 | 0.3566 | 0.2369 | 0.1866 | 0.1764 | 0.3492 |
| MAPE (%) | Lasso | 1.2197 | 0.4241 | 0.3164 | 0.2809 | 0.2719 | 0.5026 |
| MAPE (%) | Kalman | 1.1377 | 1.0139 | 0.9360 | 0.9553 | 0.9601 | 1.0006 |
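To make the idea of a non-iterative, closed-form reconstruction concrete, the sketch below applies textbook Tikhonov smoothing with a second-difference penalty to a synthetic noisy curve and reports RMSE, MAE, and MAPE. It is an illustration under stated assumptions, not the authors' modified formulation: the penalty operator, the value `lam = 200.0`, the synthetic voltage curve, and the SNR definition (mean signal power over noise variance) are all chosen here for demonstration.

```python
import math
import random


def second_difference_gram(n):
    """Return D^T D for the (n-2) x n second-difference operator D."""
    gram = [[0.0] * n for _ in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for row in range(n - 2):
        for a, ca in enumerate(stencil):
            for b, cb in enumerate(stencil):
                gram[row + a][row + b] += ca * cb
    return gram


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def tikhonov_smooth(y, lam):
    """Closed-form minimiser of ||y - x||^2 + lam * ||D x||^2,
    i.e. the solution of (I + lam * D^T D) x = y."""
    n = len(y)
    system = second_difference_gram(n)
    for i in range(n):
        for j in range(n):
            system[i][j] *= lam
        system[i][i] += 1.0
    return solve(system, y)


def error_metrics(truth, estimate):
    """Return (RMSE, MAE, MAPE in percent)."""
    n = len(truth)
    diffs = [e - t for t, e in zip(truth, estimate)]
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mae = sum(abs(d) for d in diffs) / n
    mape = 100.0 * sum(abs(d / t) for d, t in zip(diffs, truth)) / n
    return rmse, mae, mape


# Synthetic discharge-like voltage curve (illustrative, not measured data).
n = 80
clean = [4.2 - 1.2 * (k / (n - 1)) - 0.3 * (k / (n - 1)) ** 6 for k in range(n)]

# Additive Gaussian noise at 10 dB SNR (assumed: signal power / noise variance).
snr_db = 10.0
signal_power = sum(v * v for v in clean) / n
noise_std = math.sqrt(signal_power / 10 ** (snr_db / 10))
rng = random.Random(0)
noisy = [v + rng.gauss(0.0, noise_std) for v in clean]

reconstructed = tikhonov_smooth(noisy, lam=200.0)
metrics_noisy = error_metrics(clean, noisy)
metrics_reconstructed = error_metrics(clean, reconstructed)
print("noisy        :", metrics_noisy)
print("reconstructed:", metrics_reconstructed)
```

Because the solution comes from a single linear solve, no iteration or convergence check is needed. The dense solver keeps the example dependency-free; for long signals a banded or sparse solver would be the natural replacement.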
Table 5. Sensitivity analysis of proposed features. Checkmark (✓) indicates inclusion and a dash (-) indicates exclusion of the corresponding proposed features: minimum discharge voltage ($x_1$), time to minimum voltage ($x_2$), minimum temperature at the start of discharge ($x_3$), maximum discharge temperature ($x_4$), and time elapsed between minimum and maximum temperature ($x_5$). The best results are in bold.

| Proposed Features ($x_1$–$x_5$) | RMSE | MAE | MAPE (%) | R2 | Time (s) |
|---|---|---|---|---|---|
| all five (✓) | 0.0862 | 0.0732 | 0.5987 | 0.923523 | 0.0259 |
| one excluded (-) | 0.233401 | 0.1855705 | 1.2169 | 0.913716 | 0.01905 |
| one excluded (-) | 1.0651395 | 0.8038405 | 2.60585 | 0.628143 | 0.0176 |
| one excluded (-) | 0.2290405 | 0.181089 | 0.70005 | 0.922905 | 0.01785 |
| one excluded (-) | 0.230986 | 0.18282 | 0.64605 | 0.9241995 | 0.01835 |
| one excluded (-) | 0.2297205 | 0.181146 | 0.73265 | 0.923476 | 0.01975 |
| four excluded (----) | 1.393372 | 1.171515 | 8.9394 | −1.454 | 0.0076 |
| four excluded (----) | 0.231793 | 0.1860235 | 1.22655 | 0.9275805 | 0.0086 |
| four excluded (----) | 1.1681775 | 0.862342 | 4.66665 | 0.08354 | 0.0079 |
| four excluded (----) | 1.1566025 | 0.8757905 | 6.4991 | −0.63832 | 0.0067 |
| four excluded (----) | 1.364249 | 1.157678 | 3.33345 | 0.4079215 | 0.00895 |
Table 6. Pearson correlation coefficients (r) between each of the five proposed features and the battery state of health (SoH), reported separately for the training set, the independent test set, and the combined dataset. The strong positive correlations of the temporal features $x_2$ and $x_5$ with SoH corroborate their dominant explanatory power, whereas $x_1$, $x_3$, and $x_4$ show only weak or moderate association. A negative sign indicates an inverse monotonic relationship with SoH.

| Proposed Features | Train | Test | Overall |
|---|---|---|---|
| $x_1$: minimum discharge voltage | −0.1465 | −0.2116 | −0.1791 |
| $x_2$: time to minimum voltage | 0.9689 | 0.9627 | 0.9658 |
| $x_3$: minimum temperature at the start of discharge | −0.0340 | 0.2752 | 0.1206 |
| $x_4$: maximum discharge temperature | 0.0357 | −0.0601 | −0.0122 |
| $x_5$: time elapsed between minimum and maximum temperature | 0.9625 | 0.9305 | 0.9465 |
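The coefficients in Table 6 are ordinary Pearson correlations between a feature and SoH. The snippet below is a minimal sketch of that computation. The toy values are hypothetical and are not taken from the NASA or SNL datasets; they only mimic the pattern reported in the table, where a temporal feature tracks SoH closely and the minimum discharge voltage does not.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, NOT taken from the NASA or SNL datasets.
soh = [1.00, 0.97, 0.93, 0.90, 0.86, 0.81]
time_to_min_voltage_s = [3600, 3480, 3350, 3260, 3100, 2930]    # shrinks with ageing
min_discharge_voltage_v = [2.70, 2.69, 2.71, 2.70, 2.68, 2.71]  # nearly constant

r_time = pearson_r(time_to_min_voltage_s, soh)
r_voltage = pearson_r(min_discharge_voltage_v, soh)
print(f"r(time to min voltage, SoH)   = {r_time:.4f}")
print(f"r(min discharge voltage, SoH) = {r_voltage:.4f}")
```

A feature that barely varies across cycles, such as a cut-off voltage fixed by the test protocol, yields a correlation near zero regardless of how far the cell has aged.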
Table 7. Comparison of the proposed state-of-health (SoH) estimator with a deep neural network (DNN) using three handcrafted charging features, a Huber-regression model trained on ten direct statistics, and a hybrid (physics-aided) estimator. Performance is evaluated under six aging scenarios, with results reported as root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Lower values indicate higher accuracy. Aggregated results are labeled “Average”. The best results are in bold.

| Scenario | Method | RMSE | MAE | MAPE (%) |
|---|---|---|---|---|
| 1 | Proposed | 0.0029 | 0.0022 | 0.2546 |
| 1 | DNN | 0.018239 | 0.015416 | 1.9341 |
| 1 | Direct Features | 0.0042 | 0.0034 | 0.4231 |
| 1 | Hybrid | 2.277851 | 1.922806 | 2.5100 |
| 2 | Proposed | 0.0140 | 0.0110 | 1.4985 |
| 2 | DNN | 0.103690 | 0.097540 | 12.8129 |
| 2 | Direct Features | 0.0274 | 0.0238 | 3.2878 |
| 2 | Hybrid | 2.845887 | 1.890512 | 2.6018 |
| 3 | Proposed | 0.0835 | 0.0261 | 1.4191 |
| 3 | DNN | 0.178465 | 0.174136 | 23.8429 |
| 3 | Direct Features | 0.0883 | 0.0793 | 10.6426 |
| 3 | Hybrid | 10.205681 | 3.207709 | 1.7226 |
| 4 | Proposed | 0.0640 | 0.0567 | 0.06 |
| 4 | DNN | 9.3996 | 9.3972 | 9.78 |
| 4 | Direct Features | 4.5677 | 0.5558 | 0.58 |
| 4 | Hybrid | 4.5398 | 4.4670 | 4.65 |
| 5 | Proposed | 0.0830 | 0.0757 | 0.08 |
| 5 | DNN | 0.4271 | 0.3912 | 0.41 |
| 5 | Direct Features | 4.6395 | 0.5065 | 0.53 |
| 5 | Hybrid | 4.2480 | 4.1490 | 4.37 |
| 6 | Proposed | 0.2696 | 0.2676 | 0.28 |
| 6 | DNN | 22.5040 | 22.4926 | 23.58 |
| 6 | Direct Features | 5.5194 | 2.2520 | 2.36 |
| 6 | Hybrid | 4.2111 | 4.1170 | 4.30 |
| Average | Proposed | 0.0862 | 0.0732 | 0.5987 |
| Average | DNN | 5.4385 | 5.4280 | 12.0616 |
| Average | Direct Features | 2.4744 | 0.5701 | 2.96725 |
| Average | Hybrid | 4.72140 | 3.2923 | 3.3591 |
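The “Average” rows appear to be plain arithmetic means over the six scenarios. As a quick check under that assumption, the sketch below recomputes the average for the proposed method from the per-scenario values transcribed from Table 7 and compares it with the reported row.

```python
from statistics import mean

# Per-scenario metrics for the proposed method, transcribed from Table 7
# as (RMSE, MAE, MAPE in %), scenarios 1 to 6.
proposed = [
    (0.0029, 0.0022, 0.2546),
    (0.0140, 0.0110, 1.4985),
    (0.0835, 0.0261, 1.4191),
    (0.0640, 0.0567, 0.06),
    (0.0830, 0.0757, 0.08),
    (0.2696, 0.2676, 0.28),
]
reported_average = (0.0862, 0.0732, 0.5987)

# Column-wise mean, rounded to the four decimals used in the table.
computed_average = tuple(round(mean(col), 4) for col in zip(*proposed))
print("computed:", computed_average)
print("reported:", reported_average)
```

The same check can be repeated for the other methods by swapping in their rows.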
Table 8. Comparative analysis of State of Health (SoH) estimation performance across six scenarios, demonstrating the effect of discharge C-rates, ambient temperatures, and charging protocols on RMSE, MAE, and MAPE metrics.

| Scenario | Charge C-Rate | Discharge C-Rate | Ambient Temperature (°C) | RMSE | MAE | MAPE (%) |
|---|---|---|---|---|---|---|
| 1 | ∼0.75C (CC-CV) | 1C | 4 to 40 | 0.002021 | 0.001711 | 0.2182 |
| 2 | ∼0.75C (CC-CV) | 1C | 24 | 0.013571 | 0.011308 | 1.5269 |
| 3 | ∼0.75C (CC-CV) | 2C | 4 | 0.083121 | 0.028124 | 1.8028 |
| 4 | ∼0.75C (CC-CV) | 1C | 25 | 0.1148 | 0.1147 | 0.12 |
| 5 | 0.5C (CC) | 0.5C (CC) | 15 | 0.2309 | 0.2308 | 0.24 |
| 6 | 0.5C (CC) | 2C (CC) | 25 | 0.3568 | 0.3566 | 0.37 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


Acuña Acurio, B.A.; Chérrez Barragán, D.E.; Rodríguez, J.C.; Grijalva, F.; Pereira da Silva, L.C. Robust Data-Driven State of Health Estimation of Lithium-Ion Batteries Based on Reconstructed Signals. Energies 2025, 18, 2459. https://doi.org/10.3390/en18102459
