Article

Research on State of Health Assessment of Lithium-Ion Batteries Using Actual Measurement Data Based on Hybrid LSTM–Transformer Model

College of Energy and Power Engineering, Inner Mongolia University of Technology, Hohhot 010080, China
*
Author to whom correspondence should be addressed.
Symmetry 2026, 18(1), 169; https://doi.org/10.3390/sym18010169
Submission received: 5 December 2025 / Revised: 29 December 2025 / Accepted: 14 January 2026 / Published: 16 January 2026
(This article belongs to the Section Engineering and Materials)

Abstract

An accurate assessment of the state of health (SOH) of lithium-ion batteries (LIBs) is crucial for ensuring the safety and reliability of energy storage systems and electric vehicles. However, existing methods face challenges: physics-based models are computationally complex, traditional data-driven methods rely heavily on manual feature engineering, and single models lack the ability to capture both local and global degradation patterns. To address these issues, this paper proposes a novel hybrid LSTM–Transformer model for LIB SOH estimation using actual measurement data. The model integrates Long Short-Term Memory (LSTM) networks to capture local temporal dependencies with the Transformer architecture to model global degradation trends through self-attention mechanisms. Experimental validation was conducted using eight 18650 Nickel Cobalt Manganese (NCM) LIBs subjected to 750 charge–discharge cycles under room temperature conditions. Sixteen statistical features were extracted from voltage and current data during constant current–constant voltage (CC-CV) phases, with feature selection based on the Pearson correlation coefficient and maximum information coefficient analysis. The proposed LSTM–Transformer model demonstrated superior performance compared to the standalone LSTM and Transformer models, achieving a mean absolute error (MAE) as low as 0.001775, root mean square error (RMSE) of 0.002147, and mean absolute percentage error (MAPE) of 0.196% for individual batteries. Core features including cumulative charge (CC Q), charging time, and voltage slope during the constant current phase showed a strong correlation with the SOH (absolute PCC > 0.8). The hybrid model exhibited excellent generalization across different battery cells with consistent error distributions and nearly overlapping prediction curves with actual SOH trajectories.
The symmetrical LSTM–Transformer hybrid architecture provides an accurate, robust, and generalizable solution for LIB SOH assessment, effectively overcoming the limitations of traditional methods while offering potential for real-time battery management system applications. This approach enables health feature learning without manual feature engineering, representing an advancement in data-driven battery health monitoring.

1. Introduction

The acceleration of the transformation of the global energy structure and the in-depth advancement of the strategic goals of ‘carbon peaking and carbon neutrality’ have led to explosive growth in renewable energy sources such as wind and solar power [1,2]. However, their intermittent and fluctuating nature poses a serious challenge to the stable operation of the electricity grid [3]. In July 2025, the National Energy Administration of China released the ‘China New Energy Storage Development Report (2025)’. By 2024, the total installed capacity of operational new energy storage projects around the world had reached approximately 180 million kilowatts, marking a surge of approximately 98% compared to the end of 2023, with an additional installed capacity of around 90 million kilowatts. China’s cumulative installed capacity of completed and operational new energy storage projects reached 73.76 million kilowatts/168 million kilowatt-hours, marking a surge of over 130% since the end of 2023. The annual increase in new energy storage installed capacity was 42.37 million kilowatts/101 million kilowatt-hours. Of the various new energy storage technologies, LIBs dominate, accounting for around 96.4% of the operational installed capacity [4]. LIBs have swiftly become dominant in the electrochemical energy storage market due to their numerous advantages, including high energy density, a long cycle life, a fast response speed, and high technical maturity. They are widely used in two core areas: energy storage power stations and electric vehicles [5]. In the field of large-scale energy storage power plants, LIB systems are essential for the construction of smart grids, microgrids, and emergency backup power supplies. They significantly improve energy utilization efficiency and grid reliability by participating in grid frequency regulation and peak shaving through ‘peak shaving and valley filling’. 
In the electric vehicle sector, LIBs are the ‘heart’ of the vehicle; their performance directly affects the vehicle’s range, safety, and service life [6]. With the rapid increase in the number of electric vehicles and retired batteries, batteries are increasingly being seen not only as a source of power but also as a potential distributed energy storage resource. The importance of managing their SOH is therefore increasingly apparent [7]. Although LIBs have a wide range of potential applications in energy storage, there are still many serious challenges to overcome in their actual operation, particularly in the following areas. (1) Performance degradation issues: During long-term cyclic use, complex electrochemical side reactions occur inside the battery, leading to problems such as the loss of active lithium and damage to the structure of the electrode material. These problems manifest themselves as external characteristics, such as reduced capacity and increased internal resistance, i.e., a decrease in the SOH. (2) Safety issues: Risks such as uneven battery aging, overcharging, overdischarge, and thermal runaway are always present. An accurate assessment of the SOH is essential for battery management systems (BMSs) to perform effective thermal management, balancing, and safety warning functions. Failure to assess the SOH can result in serious accidents. (3) Economic issues: The SOH of batteries directly impacts the total lifecycle cost and economic efficiency of energy storage systems. An accurate SOH assessment is crucial for realizing battery reuse, value assessment, and optimizing replacement strategies and for reducing system costs. Therefore, the development of a highly robust, high-precision LIB SOH assessment method is of great theoretical significance and of great value for engineering application. 
This will ensure the safe and stable operation of energy storage systems, maximize the lifecycle value of batteries, and promote the healthy development of the new energy industry [8,9]. The SOH of LIBs cannot be obtained through direct measurement. Instead, it is estimated using algorithmic models based on measurable physical quantities, such as voltage, current, and temperature [10]. Currently, methods for assessing the SOH of LIBs can be broadly categorized into three main types: physics-based models, data-driven approaches, and hybrid models incorporating multi-source data fusion. Physics-based methods primarily involve the establishment of a dynamic model that can describe the external characteristics and internal states of LIBs. This model reflects the dynamic response behavior of batteries, enabling the estimation of the SOH. Commonly used physical models include equivalent circuit models (ECMs) and electrochemical models (EMs). ECMs simulate the external electrical behavior of batteries using simple circuit components such as voltage sources, resistors, and capacitors. Examples include the Rint model [11], the Thevenin model [12], the Partnership for a New Generation of Vehicles (PNGV) model [13], and the dual polarization (DP) model [14]. ECMs are the most widely used models in current BMSs due to their simple structure, efficient computation, and reliable simulation of battery dynamics [15]. However, ECMs can only describe the external voltage response of batteries and cannot reveal internal electrochemical reactions, changes in lithium-ion concentration, side reactions, or aging mechanisms. Furthermore, the accuracy of these models depends heavily on the range of operating conditions used for their calibration. Once the actual operating conditions of the battery exceed this range, the predictive capability of the model will decrease significantly [16].
In contrast to the ECM, the EM is a mechanism-based model that is rooted in physicochemical principles, such as porous electrode theory and concentrated solution theory. It utilizes a series of partial differential equations (PDEs) to describe the migration, diffusion, and reactions of lithium ions within the electrodes and electrolyte. The EM demonstrates strong predictive capability, good extrapolation performance, and high accuracy. The pseudo-two-dimensional (P2D) electrochemical model can be used to describe the dynamic characteristics of batteries with high prediction accuracy through PDEs [17]. However, the EM is highly complex, and solving coupled non-linear PDEs requires substantial computational resources, making it unsuitable for real-time BMSs. Nevertheless, it is applicable to battery design, mechanism research, and offline analysis [18]. With the rapid advancement of artificial intelligence (AI) technology, data-driven estimation methods for the SOH of LIBs have been extensively researched and applied [19]. The core of these approaches lies in feature extraction and model construction. SOH estimation requires the collection or calculation of battery data, such as voltage, current, temperature, and internal resistance. By analyzing these data to identify trend patterns, features correlated with the SOH can be extracted. Various data-driven algorithmic models are then employed to establish the mapping relationship between these features and the SOH, thus estimating the unknown SOH [20]. One data-driven SOH estimation method uses the areas under the constant current charging and discharging voltage curves of LIBs as health features (HFs). Its validity was verified on four typical battery datasets, and the selected HFs showed a strong correlation with the battery’s SOH.
Subsequently, the two HFs were utilized as inputs to the Gaussian Process Regression (GPR), LSTM, and BP algorithms for SOH estimation [21]. Tian et al. [22] proposed a method for estimating the SOH, which involved the extraction of HFs from sampled surface temperatures. A segment of the differential temperature curves within a specified voltage range was utilized to establish a correlation with the SOH by employing the support vector regression method. Lin et al. [23] proposed a data-driven methodology for estimating the SOH of LIBs, incorporating the effects of internal resistance. The model was employed as a bridge to facilitate the effective integration of the ECM and the data-driven method. In data-driven algorithms, researchers widely apply artificial neural networks (ANNs) [24], support vector machines (SVMs) [25], LSTM networks [26], and convolutional neural networks (CNNs) [27]. A temperature-compensated Bi-LSTM with an integrated attention mechanism (AM) was proposed by Xu et al. [28] for the co-estimation of the SOC and SOH. Compared to bidirectional LSTMs, the proposed method improved precision by 21.45%. Bockrath et al. [29] presented an algorithm for estimating the SOH of LIBs using different segments of partial discharge profiles. Raw sensor data was fed directly into a temporal CNN, eliminating the need for feature engineering. This neural network can process raw sensor data and estimate the SOH of battery cells in various aging and degradation scenarios. Another approach involves battery health assessment based on hybrid models and multi-source data fusion. This method enhances battery health evaluation performance by integrating deep learning models with traditional models [30]. Traditional models extract battery features, while deep learning models handle non-linear feature fusion and dimensionality reduction [31,32]. Hybrid models typically comprise three stages: (1) Feature extraction and fusion. 
Traditional models extract battery features, whereas deep learning models perform non-linear feature fusion and dimensionality reduction. (2) Model fusion. The weighted integration of outputs from deep learning and traditional models enhances prediction accuracy. (3) Data augmentation and correction. Deep learning models correct the outputs of traditional models to compensate for their limitations in complex scenarios. Yin et al. [33] proposed a novel approach using Deep Reinforcement Learning (DRL) to optimize the parameters of an Adaptive Unscented Kalman Filter (AUKF). The DRL agent learns to adjust the AUKF parameters by interacting with the battery environment to maximize estimation accuracy. The experimental results demonstrate that the DRL-optimized AUKF outperforms traditional UKF methods in terms of state of charge (SOC) and SOH estimation accuracy, highlighting its potential for enhancing BMSs. Mazzi et al. [34] proposed a real-time SOH estimation model based on a deep learning framework. This model combines two distinct architectures: a one-dimensional convolutional neural network (1D-CNN) and a bidirectional gated recurrent unit (BiGRU). The hybrid CNN-BiGRU utilizes the 1D-CNN layer to extract relevant features from the input data and then relies on the Bi-GRU layer to learn sequences in both directions. The data fed into the 1D-CNN layer originates from current, voltage, and temperature readings acquired by the BMS. Due to the significant impact of hyperparameters on the performance of neural networks, Bayesian optimization techniques based on Gaussian processes were employed to tune the hyperparameters of the CNN-BiGRU model. Gao et al. [35] addressed the challenge of accurately estimating battery SOH across different types and operating conditions using a single network. Their paper proposed a novel hybrid network that combines a Hierarchical Feature Coupled Module (HFCM) and an LSTM module. 
This enables the full extraction of raw data information and allows for a more accurate estimation of battery SOH across various types and operating conditions. The HFCM first extracts feature information from raw samples, which are then modeled as time series data by an LSTM module. Based on the HFCM-LSTM architecture, the model incorporates data directly from the battery itself, allowing for SOH estimation. The experimental results demonstrate that the proposed SOH estimation algorithm outperforms others in terms of both accuracy and versatility. However, existing research suffers from the following limitations: (1) Whether based on physical models or data-driven approaches, the performance of existing models is highly dependent on the range of operating conditions covered by the training data. Once the actual operating conditions of LIBs exceed the scope of the training data, predictive performance deteriorates significantly. (2) Current data-driven methods heavily rely on hand-crafted feature engineering to extract HFs related to the SOH from raw data (voltage, current, temperature). These features often require domain-specific prior knowledge and may fail to comprehensively capture multidimensional degradation information during battery aging. (3) A single model struggles to strike the balance between accuracy, robustness, and generalization capability. Although hybrid model fusion strategies have been explored, the lack of an effective cross-modal feature fusion mechanism results in low information utilization efficiency. Gui et al. [36] proposed a novel cross-domain SOH estimation framework called MM-LG-CNNT, which integrates multi-modal data and a parallel CNN–Transformer architecture with a multi-information alignment strategy to achieve accurate and robust battery health state prediction under limited data conditions. The CNN–Transformer excels at local feature extraction and global time series modeling.
In the context of lithium-ion battery SOH detection, the LSTM–Transformer demonstrates unique advantages in capturing long-term dependencies and handling complex non-linear relationships, potentially offering better generalization capabilities, particularly with small sample data. This paper proposes a hybrid deep learning model integrating LSTM networks and the Transformer architecture to estimate the SOH of LIBs. The LSTM unit is particularly effective at capturing long-term dependencies in time series data, enabling it to recognize the temporal patterns of battery degradation. Meanwhile, the self-attention mechanism in the Transformer allows for the parallel computation of global dependencies, providing the model with an enhanced ability to capture critical features and improve model interpretability. The combination of these two architectures aims to achieve more accurate and robust modeling of the battery degradation process. The main contributions of this paper are as follows: (1) A novel symmetrical hybrid architecture LSTM–Transformer is proposed that overcomes the limitations of traditional data-driven models in terms of temporal modeling. This architecture achieves a balance between local dynamic detail preservation and global pattern capture, analogous to a functional symmetry in handling multi-scale temporal features. This significantly enhances the model’s ability to represent complex temporal dynamics. (2) An HF learning mechanism is constructed to enable the model to automatically learn HFs that are highly correlated with the SOH directly from raw battery data, thus eliminating the need for manual feature engineering. (3) The recurrent structure of the LSTM and the Transformer’s hierarchical attention mechanism ensures powerful feature extraction capabilities while reducing computational redundancy. This optimization decreases the computational load on non-critical data, thereby enhancing real-time performance.
The remainder of this paper has the following structure. Section 2 provides an overview of the LSTM–Transformer method. Section 3 introduces data acquisition and feature extraction. Section 4 presents the experimental validation of the LSTM–Transformer hybrid model. Section 5 concludes this article and demonstrates the superiority of the proposed model.

2. Method for Assessing SOH Based on LSTM–Transformer

2.1. Definition of SOH

The SOH for LIBs is a key metric used to quantify the degree of degradation of battery performance, intuitively reflecting the gap between the current state of the battery and its brand-new condition. The SOH essentially describes the battery’s aging level and serves as the core basis for determining whether maintenance or replacement is required. The SOH primarily measures battery health across two dimensions: The first is capacity retention capability—the battery’s ability to store and release charge. As cycle counts increase, the active materials degrade, leading to a reduction in the battery’s maximum usable capacity. The second aspect is power output capability, which refers to the battery’s resistance to internal current flow. Battery aging causes an increase in internal resistance, resulting in reduced charge/discharge efficiency, decreased operating voltage, greater voltage drops, and increased heat generation during high-current operation. Depending on the specific evaluation focus, the SOH is primarily calculated using two different formulas.
$SOH_C = \frac{C_{\mathrm{current}}}{C_{\mathrm{rated}}} \times 100\%,$
Formula (1) represents the capacity-based SOH. $C_{\mathrm{current}}$ denotes the battery’s current actual capacity, i.e., the total electrical energy that can be delivered after one complete charge–discharge cycle at the current stage of the battery’s life. This value is an estimated figure. $C_{\mathrm{rated}}$ denotes the rated capacity, which is the nominal capacity value specified under standard conditions at the time of manufacture. It is a fixed constant. Typically, the battery is considered to have reached end-of-life (EOL) and requires replacement when $SOH_C$ drops to 80%.
$SOH_R = \frac{R_{\mathrm{EOL}} - R_{\mathrm{current}}}{R_{\mathrm{EOL}} - R_{\mathrm{new}}}.$
Formula (2) focuses on evaluating the battery’s power performance and efficiency. $R_{\mathrm{current}}$ represents the battery’s internal resistance in its current state, $R_{\mathrm{new}}$ denotes the battery’s internal resistance when brand-new, and $R_{\mathrm{EOL}}$ is the internal resistance threshold defined by the manufacturer to indicate battery failure.
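As a worked illustration of Formulas (1) and (2), the two SOH definitions can be computed directly; the capacity and resistance values below are hypothetical examples, not measurements from this study:

```python
def soh_capacity(c_current_mah: float, c_rated_mah: float) -> float:
    """Capacity-based SOH (Formula 1), in percent."""
    return c_current_mah / c_rated_mah * 100.0

def soh_resistance(r_current: float, r_new: float, r_eol: float) -> float:
    """Resistance-based SOH (Formula 2): 1 when brand-new, 0 at end-of-life."""
    return (r_eol - r_current) / (r_eol - r_new)

# Hypothetical 2000 mAh cell that has faded to 1700 mAh:
print(soh_capacity(1700, 2000))             # 85.0 -> still above the 80% EOL threshold
# Internal resistance grown from 30 mOhm (new) toward a 60 mOhm EOL limit:
print(soh_resistance(45e-3, 30e-3, 60e-3))  # 0.5
```

Note that the two definitions move in opposite directions as the battery ages: capacity fade lowers $SOH_C$, while resistance growth lowers $SOH_R$ from 1 toward 0.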

2.2. LSTM Network Model

As a specialized type of recurrent neural network (RNN), LSTM effectively addresses the vanishing gradient problem inherent in traditional RNNs by incorporating a gating mechanism. This provides unique advantages when processing sequential data [37]. In the context of LIB SOH assessment tasks, LSTMs can capture the local temporal dynamics of parameters such as voltage, current, and temperature during charging and discharging. Firstly, they establish time-dependent models by continuously transmitting state information through memory units during battery aging to accurately depict the cumulative effects of capacity decay. Secondly, they improve the robustness of SOH evaluation systems against noise; the forget gate mechanism filters out high-frequency noise from sensor signals, thereby improving the reliability of feature extraction. The LSTM network [38] incorporates an effective gating mechanism to manage information flow, comprising a forget gate, an input gate, and an output gate (see Figure 1). This gating mechanism enables the LSTM network to selectively remember or forget information at different time steps, thus facilitating the handling of temporal relationships. Furthermore, the memory cell within the LSTM block enables the network to selectively forget or remember information, effectively addressing long-term dependency issues. The computational processes for the three gates are as follows:
$f_t = \sigma(W_f [H_{t-1}, X_t] + b_f),$
$i_t = \sigma(W_i [H_{t-1}, X_t] + b_i),$
$o_t = \sigma(W_o [H_{t-1}, X_t] + b_o).$
The memory cell update and hidden state output can be expressed as follows:
$C_t = f_t C_{t-1} + i_t \tanh(W_c [H_{t-1}, X_t] + b_c),$
$H_t = o_t \tanh(C_t).$
Here, $f_t$, $i_t$, and $o_t$ represent the forget gate, input gate, and output gate, respectively; $C_t$ is the memory cell; $X_t$ and $H_t$ denote the input and final output, respectively; $W$ is the weight; $b$ is the bias. $\sigma(\cdot)$ and $\tanh(\cdot)$ denote the sigmoid and hyperbolic tangent functions, respectively.
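A minimal NumPy sketch of one LSTM time step following Equations (3)–(7); the weight layout (the four gate blocks stacked into one matrix acting on the concatenated $[H_{t-1}, X_t]$) and all dimensions are illustrative assumptions, not the paper’s implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps [h_prev, x_t] to four stacked gate pre-activations."""
    hx = np.concatenate([h_prev, x_t])   # [H_{t-1}, X_t]
    z = W @ hx + b                       # stacked pre-activations, length 4H
    H = h_prev.size
    f = sigmoid(z[0:H])                  # forget gate, Eq. (3)
    i = sigmoid(z[H:2*H])                # input gate, Eq. (4)
    o = sigmoid(z[2*H:3*H])              # output gate, Eq. (5)
    g = np.tanh(z[3*H:4*H])              # candidate memory
    c_t = f * c_prev + i * g             # memory cell update, Eq. (6)
    h_t = o * np.tanh(c_t)               # hidden state, Eq. (7)
    return h_t, c_t

rng = np.random.default_rng(0)
H, X = 4, 3                              # hidden and input sizes (arbitrary)
W = rng.standard_normal((4 * H, H + X)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.standard_normal(X), h, c, W, b)
print(h.shape, c.shape)                  # (4,) (4,)
```

Because the output gate is bounded in (0, 1) and $\tanh$ in (−1, 1), the hidden state stays strictly inside (−1, 1), which is part of what stabilizes gradient flow.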

2.3. Transformer Neural Network Model

Transformer models leverage self-attention mechanisms instead of the recursive structure of traditional RNNs, modeling global dependencies in sequence data through parallel computation [39]. Their innovative value in assessing the SOH of LIBs manifests itself primarily in the following ways: (1) Self-attention weight matrices quantify implicit correlations across different charge–discharge cycles. (2) The multi-head attention mechanism enables the parallel extraction of interaction features across multiple parameters, such as voltage, current, and temperature. (3) The introduction of sinusoidal positional encoding compensates for the lack of positional sensitivity in temporal information and provides a precise description of the phased characteristics of battery aging trajectories. The Transformer is a sequence-to-sequence model comprising an encoder and a decoder, as shown in Figure 2. The encoder consists of a multi-head self-attention module and a position-wise feed-forward neural network (FFN). The self-attention module is a key component in calculating the importance of each position in the sequence. It maps each position in the input sequence to a query vector, a key vector, and a value vector. It then computes the attention score for that position using dot products and weighted summation. To improve its ability to extract features, the Transformer uses multi-head parallel self-attention. This multi-head self-attention module enables the Transformer to learn distinct representations of the query, key, and value vectors within each self-attention pass. The results of these passes are then combined to yield a more comprehensive feature representation. The output of the multi-head self-attention in the encoder is as follows:
$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h) W^O,$
$\mathrm{head}_i = \mathrm{softmax}\!\left( \frac{Q W_i^Q (K W_i^K)^T}{\sqrt{d_k}} \right) V W_i^V.$
where $\mathrm{head}_i$ represents the output of the $i$-th self-attention head; $h$ denotes the number of self-attention heads; $Q$, $K$, and $V$ refer to the query, key, and value vectors, respectively. Meanwhile, $W_i^Q$, $W_i^K$, $W_i^V$, and $W^O$ stand for the weights, and $d_k$ defines the key dimension. $\mathrm{softmax}(\cdot)$ is the softmax activation function, and $\mathrm{Concat}(\cdot)$ denotes the concatenation operation.
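The multi-head computation of Equations (8) and (9) can be sketched in NumPy as follows; the sequence length, model width, and random projection weights are illustrative assumptions:

```python
import numpy as np

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, h):
    """Multi-head self-attention per Eqs. (8)-(9); X has shape (seq_len, d_model)."""
    n, d_model = X.shape
    d_k = d_model // h
    heads = []
    for i in range(h):
        s = slice(i * d_k, (i + 1) * d_k)          # this head's slice of each projection
        Q, K, V = X @ Wq[:, s], X @ Wk[:, s], X @ Wv[:, s]
        scores = softmax(Q @ K.T / np.sqrt(d_k))   # (n, n) attention weights
        heads.append(scores @ V)                   # head_i, Eq. (9)
    return np.concatenate(heads, axis=-1) @ Wo     # Concat(head_1..head_h) W^O, Eq. (8)

rng = np.random.default_rng(1)
n, d = 5, 8                                        # sequence length, model width (arbitrary)
Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
Y = multi_head_attention(rng.standard_normal((n, d)), Wq, Wk, Wv, Wo, h=2)
print(Y.shape)                                     # (5, 8)
```

Each row of `scores` sums to 1, so every output position is a convex combination of the value vectors across the whole sequence, which is what gives the Transformer its global receptive field.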

2.4. Construction of LSTM–Transformer Hybrid Models

Existing single models show clear limitations in the SOH assessment of LIBs: traditional LSTMs struggle to capture long-term global dependencies, while Transformers inadequately extract local temporal features. The proposed hybrid LSTM–Transformer model establishes a symmetrical processing framework, where the LSTM network captures local temporal dependencies while the Transformer architecture models global degradation trends through self-attention mechanisms. This symmetry between local and global feature extraction enables the model to comprehensively characterize the battery aging process. An LSTM layer is used to extract fine-grained local features, followed by a Transformer encoder to model cross-cycle global decay trends. A gated attention module is designed to adaptively adjust the feature contribution weights between the LSTM and Transformer layers, thereby enhancing the model’s ability to generalize across different aging stages and ensuring that the predicted results align with the dynamics of LIB aging. The LSTM–Transformer hybrid model uses an LSTM unit as an encoder to extract voltage and current data from LIBs, capturing the correlation between adjacent time steps in the process. These computations are then performed via a fully connected layer using residual connections. The resulting output then feeds into a self-attention layer where multi-head self-attention calculations process feature parameters across different time steps in order to derive correlation information. Features from the self-attention layer are fully leveraged to capture long-term dependencies. Finally, the model produces the predicted SOH value for the LIBs through a feed-forward neural network and fully connected decoding layer. The LSTM–Transformer hybrid model architecture is shown in Figure 3.
The process of evaluating the LIB SOH using the LSTM–Transformer hybrid serial network model is shown in Figure 4, which consists of the following six steps.
Step 1: Collect voltage and current time series data from LIBs as raw input data.
Step 2: Preprocess the time series features by removing noisy and outlying data and normalizing them to facilitate model training.
Step 3: Extract multidimensional health factors and battery characteristics from LIB charge–discharge curves to characterize battery degradation systematically. Quantify the linear correlation between each feature and the SOH using the Pearson correlation coefficient (PCC). For non-linear features, supplement them with the maximum information coefficient (MIC) to assess their non-linear association with the SOH.
Step 4: LSTM layers extract local time series features to capture dynamic characteristics during the charging and discharging of the battery.
Step 5: Transformer encoder layers learn global dependencies through self-attention mechanisms to enhance long-term pattern recognition capabilities.
Step 6: Fully connected layers map high-dimensional features to SOH prediction values via linear layers.
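Steps 4–6 can be mocked up end to end as a shape-level NumPy sketch of the serial LSTM → self-attention → fully connected pipeline; this is not the paper’s trained model, and all dimensions and random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

T, F, H = 16, 16, 8          # cycles per window, features per cycle, hidden width (illustrative)
X = rng.standard_normal((T, F))

# Step 4: an LSTM layer extracts local temporal features.
W, b = rng.standard_normal((4 * H, H + F)) * 0.1, np.zeros(4 * H)
h, c, seq = np.zeros(H), np.zeros(H), []
for t in range(T):
    z = W @ np.concatenate([h, X[t]]) + b
    f, i, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    c = f * c + i * np.tanh(z[3*H:])     # memory cell update
    h = o * np.tanh(c)                   # hidden state
    seq.append(h)
S = np.stack(seq)                        # (T, H) local feature sequence

# Step 5: single-head self-attention mixes features across the whole window.
Wq, Wk, Wv = (rng.standard_normal((H, H)) * 0.1 for _ in range(3))
A = softmax((S @ Wq) @ (S @ Wk).T / np.sqrt(H))   # (T, T) attention weights
G = A @ (S @ Wv)                                  # (T, H) globally mixed features

# Step 6: a fully connected head maps pooled features to one SOH value.
w_out = rng.standard_normal(H) * 0.1
soh_pred = float(G.mean(axis=0) @ w_out)
print(round(soh_pred, 4))
```

The sketch only checks that the tensor shapes compose correctly; in practice each stage would be a trained layer (e.g. in a deep learning framework) with the hyperparameters listed in Table 1.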

2.5. LSTM–Transformer Model Hyperparameters and Training Hyperparameters

The model parameters and training parameters of the LSTM–Transformer hybrid model are shown in Table 1 and Table 2, respectively.

2.6. Model Evaluation Metrics

In order to fully assess the predictive performance of the model, four indicators were used for a comprehensive analysis.

2.6.1. Mean Absolute Error

The mean absolute error (MAE) calculates the average of the absolute values of the differences between the predicted and actual values. Compared to the mean squared error (MSE), it is less sensitive to outliers.
$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,$
where $y_i$ and $\hat{y}_i$ denote the actual and predicted values, respectively.

2.6.2. Root Mean Square Error

The root mean square error (RMSE) is the square root of the MSE. Since the MSE is measured in the square of the original data units, taking the square root ensures that the result’s units match the actual values, making it easier to interpret.
$RMSE = \sqrt{MSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 }.$

2.6.3. Mean Absolute Percentage Error

The mean absolute percentage error (MAPE) expresses error as a percentage of the true value. It provides an indication of relative error, making it possible to compare the performance of models across datasets with different scales or units.
$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|.$

2.6.4. Coefficient of Determination

The coefficient of determination R 2 (also known as the goodness of fit) is used to quantify the prediction accuracy of a regression model for the target variable. It is calculated by comparing the differences between the model’s predicted values and the actual observed values.
$R^2 = 1 - \frac{\sum_i (\hat{y}_i - y_i)^2}{\sum_i (\bar{y} - y_i)^2}.$
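All four metrics can be computed in a few lines of NumPy; the SOH trajectories below are hypothetical values for illustration, not results from this study:

```python
import numpy as np

def metrics(y_true, y_pred):
    """Return (MAE, RMSE, MAPE%, R^2) for actual vs. predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    mae = np.abs(err).mean()                                    # mean absolute error
    rmse = np.sqrt((err ** 2).mean())                           # root mean square error
    mape = np.abs(err / y_true).mean() * 100.0                  # mean absolute % error
    r2 = 1.0 - (err ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()
    return mae, rmse, mape, r2

y_true = [1.00, 0.98, 0.95, 0.91, 0.88]   # hypothetical actual SOH trajectory
y_pred = [1.00, 0.97, 0.95, 0.92, 0.88]   # hypothetical model predictions
mae, rmse, mape, r2 = metrics(y_true, y_pred)
print(f"MAE={mae:.4f} RMSE={rmse:.4f} MAPE={mape:.3f}% R2={r2:.4f}")
```

Because SOH is a ratio near 1, MAE and RMSE are small absolute numbers while MAPE expresses the same error on a percentage scale, which is why the paper reports all three alongside each other.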

3. Data Acquisition and Feature Extraction

3.1. Data Acquisition

The test data utilized 18650 NCM LIBs manufactured by ‘LISHEN’ in Tianjin, China. These batteries have a nominal capacity of 2000 mAh and a nominal voltage of 3.6 V, with charge and discharge cut-off voltages of 4.2 V and 2.5 V, respectively. All batteries were charged at 4.2 V in constant current–constant voltage (CC-CV) mode at 2 C and then discharged to 2.5 V at 1 C. The entire experiment was conducted at room temperature (25 ± 5 °C), with each test round comprising 50 charge–discharge cycles. The charge–discharge test platform employed was an ACTS-5V10A-GGS-D system, with all data sampled at a frequency of 1 Hz, as illustrated in Figure 5. The design of various charge and discharge strategies enabled the collection of multidimensional time series data, encompassing voltage, current, temperature, and capacity, during the cycling and aging of the battery. To ensure data accuracy and reliability, the raw data were preprocessed using moving-window average filtering and outlier removal to eliminate noise and interference signals. This process resulted in a high-quality, standardized dataset, which serves as the input for subsequent SOH estimation models.

3.2. Data Feature Extraction

The acquired multidimensional time series data from the LIBs were preprocessed through data cleansing, filtering, outlier elimination, and missing value imputation. A data smoothing step was then applied to generate refined training datasets with higher quality and lower noise. A segment of the voltage curve with values within the range [4.0, 4.2] V is selected. Denoting the time interval corresponding to the selected data as $[t_{\mathrm{start}}, t_{\mathrm{end}}]$, the mean, standard deviation, kurtosis, skewness, charging time, cumulative charge, curve slope, and curve entropy are calculated as follows:
The voltage mean, which characterizes the average voltage level within the specified interval, reflects the overall charging voltage level of the LIBs.
$$\mu_v = \frac{1}{n}\sum_{i=1}^{n} v_i,$$
The voltage standard deviation indicates the degree of voltage fluctuation, which can increase due to noise or the instability of internal resistance.
$$\sigma_v = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(v_i - \mu_v\right)^2},$$
Voltage kurtosis is a measure of the voltage distribution, with high kurtosis indicating the occurrence of abnormal spikes.
$$\mathrm{Kurtosis}(v) = \frac{1}{n}\sum_{i=1}^{n}\frac{\left(v_i - \mu_v\right)^4}{\sigma_v^4} - 3,$$
Voltage skewness is a metric that quantifies the symmetry of the voltage distribution, thereby revealing the unevenness of voltage variations.
$$\mathrm{Skewness}(v) = \frac{1}{n}\sum_{i=1}^{n}\frac{\left(v_i - \mu_v\right)^3}{\sigma_v^3},$$
Charge time refers to the duration of the constant current (CC) phase, which is influenced by the internal resistance of the battery and the decay of capacity.
$$T_{CC} = t_{\mathrm{end},CC} - t_{\mathrm{start},CC},$$
Cumulative charge refers to the capacity charged during the CC phase.
$$\Delta Q = \int_{t_{\mathrm{start}}}^{t_{\mathrm{end}}} I \, dt,$$
Voltage slope is the rate of change of the voltage over time, revealing the charging dynamics.
$$\mathrm{slope} = \frac{v_{\mathrm{end}} - v_{\mathrm{start}}}{t_{\mathrm{end}} - t_{\mathrm{start}}},$$
Voltage entropy reveals the complexity and uncertainty of the voltage distribution, reflecting the diversity of battery states.
$$H(v) = -\sum_{j=1}^{n} p_j \log_2 p_j.$$
where $p_j$ denotes the normalized value of the $j$-th point on the curve. For curves with current values between 0.1 and 0.5 A, the features are calculated in the same way as described above. After feature extraction, feature selection is performed to reduce redundancy and enhance model generalization. Correlation screening involves calculating the PCC between each feature and the SOH and retaining the features with the highest correlations.
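Under the assumption of a uniformly sampled segment, the statistics above can be computed directly from the selected voltage window. The sketch below (the function name and the flat-segment guard are ours) mirrors the equations; cumulative charge is omitted since it integrates the current signal rather than the voltage:

```python
import math

def voltage_features(t, v):
    """Statistical health features for one voltage segment
    (e.g. the CC-phase window with v in [4.0, 4.2] V)."""
    n = len(v)
    mu = sum(v) / n                                     # mean
    sd = math.sqrt(sum((x - mu) ** 2 for x in v) / n)   # standard deviation
    if sd > 0:
        kurt = sum((x - mu) ** 4 for x in v) / (n * sd ** 4) - 3
        skew = sum((x - mu) ** 3 for x in v) / (n * sd ** 3)
    else:                                               # flat segment guard
        kurt = skew = 0.0
    t_cc = t[-1] - t[0]                                 # charging time
    slope = (v[-1] - v[0]) / (t[-1] - t[0])             # voltage slope
    total = sum(v)
    p = [x / total for x in v]                          # normalized curve
    entropy = -sum(pj * math.log2(pj) for pj in p if pj > 0)
    return {"mean": mu, "std": sd, "kurtosis": kurt, "skewness": skew,
            "charge_time": t_cc, "slope": slope, "entropy": entropy}
```

The same routine applies unchanged to the CV-phase current segment with values in [0.1, 0.5] A.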

4. Experimental Verification

This experiment implements the hybrid LSTM–Transformer model within a deep learning framework. Testing was conducted with Python 3.8.5 on a GeForce RTX 4090 GPU.

4.1. Data Preprocessing

As shown in Figure 6, during the data preprocessing stage, statistical features are extracted from short-term data collected before the battery is fully charged or discharged to serve as model inputs. This approach effectively addresses the issue of insufficient feature generalization. The charging voltage selection range is [4.0, 4.2] V, and the current selection range is [0.1, 0.5] A. The red box in Figure 6a marks the voltage extraction region, while the blue box in Figure 6b marks the current extraction region. As can be seen in Figure 7, the voltage and current acquisition settings specify charge and discharge cut-off voltages of 4.2 V and 2.5 V, and charge and discharge currents of 4 A and 2 A, respectively. Following the battery aging tests, the battery capacity had declined to approximately 1800 mAh.
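The extraction regions in Figure 6 amount to simple value-range masks on the sampled curves. A minimal sketch (the function name `extract_window` is ours):

```python
def extract_window(time, signal, lo, hi):
    """Keep only the samples whose value lies inside [lo, hi], e.g. the
    [4.0, 4.2] V voltage window or the [0.1, 0.5] A current window
    used as feature extraction regions."""
    kept = [(t, x) for t, x in zip(time, signal) if lo <= x <= hi]
    return [t for t, _ in kept], [x for _, x in kept]
```

The returned time stamps and values then feed the statistical feature computations of Section 3.2.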

4.2. Battery Degradation Testing

A battery degradation experiment was designed for this study. Eight batteries manufactured by LISHEN (LiNi0.5Co0.2Mn0.3O2, with a nominal capacity of 2000 mAh, a nominal voltage of 3.6 V, and charge and discharge cut-off voltages of 4.2 V and 2.5 V, respectively) were subjected to 750 cycles at room temperature. The batteries were charged in CC-CV mode to 4.2 V and discharged at a 1C rate to 2.5 V. Figure 8 shows the degradation trajectories of the eight batteries. As shown in Figure 8, battery No. 2 exhibited the fastest aging, falling to 80% of its nominal capacity after 700 cycles and thereby reaching its end of life (EOL). In contrast, battery No. 6 aged the slowest, retaining 87% of its capacity after 720 cycles.
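The SOH plotted in Figure 8 is the ratio of the measured discharge capacity to the nominal 2000 mAh, with EOL declared at the 80% threshold. A minimal sketch (function names are illustrative):

```python
def soh(capacity_mAh, nominal_mAh=2000.0):
    """State of health as the ratio of measured discharge capacity
    to nominal capacity (2000 mAh for the LISHEN cells tested here)."""
    return capacity_mAh / nominal_mAh

def reached_eol(capacity_mAh, nominal_mAh=2000.0, threshold=0.8):
    """End of life is declared once the SOH falls to the 80% threshold."""
    return soh(capacity_mAh, nominal_mAh) <= threshold
```

For example, battery No. 2's capacity of 1600 mAh after 700 cycles corresponds to an SOH of 0.8, i.e. EOL, whereas battery No. 6 at 87% capacity has not reached it.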

4.3. Feature Extraction

Figure 9 shows the characteristics during the CC phase, which include CC Q, CC charge time, voltage slope, and voltage entropy. CC Q shows a clear upward trend as the SOH increases from 0.8 to 1.0, with consistent variation patterns observed across different batteries. This suggests that the cumulative charge during the CC phase is a feature that is strongly correlated with the SOH. CC charge time follows the same trend, increasing with a rising SOH. This suggests that healthier batteries require longer charging times during the CC phase, making it a reliable SOH indicator. As the SOH increases from 0.8 to 1.0, the voltage slope shows a significant downward trend with highly consistent patterns across different batteries. This suggests that the voltage slope can serve as a key SOH metric, with a better SOH correlated with a lower voltage slope. However, voltage entropy, which reflects voltage ‘disorder’, exhibits highly dispersed data points with minimal discriminatory power for the SOH. Characteristics during the constant voltage (CV) phase include CV Q, CV charge time, current slope, and current entropy. CV Q, CV charge time, and current slope show clear negative correlations with the SOH and concentrated trends, whereas current entropy data are dispersed with no obvious unified pattern.
As shown in Figure 10, the core features that are strongly correlated include CC Q, CC charge time, and voltage slope (in the CC phase) and CV Q, CV charge time, and current slope (in the CV phase). The absolute values of their correlation coefficients approach or exceed 0.8, making them key indicators of the SOH. Moderately correlated features include voltage kurtosis/skewness and current mean/standard deviation/kurtosis/skewness. Their absolute correlation coefficients range between 0.6 and 0.8, providing some indicative value for the SOH. Weakly correlated features are the voltage mean/standard deviation and voltage entropy. Their absolute correlation coefficients are below 0.5 or close to 0, showing extremely low discriminatory power for the SOH.
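The correlation screening behind Figure 10 can be sketched with a direct PCC computation; the MIC part of the analysis is omitted here, and the function name `pearson` is ours:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between one health feature
    series and the corresponding SOH series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Features whose |PCC| approaches or exceeds 0.8 (e.g. CC Q, CC charge time, voltage slope) are the core SOH indicators; values below about 0.5 mark the weakly correlated ones.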

4.4. SOH Evaluation

The 16 extracted features are fed cycle by cycle into the hybrid LSTM–Transformer model to estimate the SOH of the LIBs. Feature normalization is performed to mitigate the impact of amplitude variations on the model and to enhance training stability; specifically, all features are scaled to the range [−1, 1].
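The normalization step can be sketched as follows, assuming simple per-feature min-max scaling to [−1, 1] (the exact scaler used is not specified in the text):

```python
def scale_features(x, lo=-1.0, hi=1.0):
    """Min-max scale one feature series into [lo, hi] (here [-1, 1]),
    mitigating amplitude differences between features."""
    xmin, xmax = min(x), max(x)
    if xmax == xmin:                 # constant feature: map to the midpoint
        return [(lo + hi) / 2.0] * len(x)
    return [lo + (hi - lo) * (v - xmin) / (xmax - xmin) for v in x]
```

Each of the 16 feature series would be scaled independently before being assembled into the model's input sequence.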
As shown in Figure 11, batteries 2 and 6 experienced rapid degradation in the early stages, whereas batteries 1 and 3 showed a more gradual decline. Battery 8 demonstrated fluctuations in its true SOH during the mid-stage, while the remaining batteries exhibited relatively smooth curves. Batteries 1 and 3 had the longest lifespan (approximately 700 cycles), batteries 2 and 6 had the shortest (approximately 600 cycles), and the remaining batteries fell between these two extremes. Battery 3 exhibited the highest fitting accuracy, with the curves nearly overlapping, while batteries 5 and 8 showed slightly larger deviations in the later stages. The remaining batteries demonstrated good fitting quality.
Table 3 shows the error metric data for the SOH predictions of the LSTM–Transformer model for batteries 1–8. Batteries 4, 5, and 6 are in the high-accuracy group, with a MAPE of less than 0.4%. Batteries 1, 3, and 7 belong to the medium-accuracy group, with 0.4% < MAPE < 0.6%. Battery 2 is classified as being in the low-accuracy group, with a MAPE value of 0.7150%. Overall, the prediction results demonstrate that the LSTM–Transformer model can make highly accurate and reliable SOH predictions for all eight battery cells. Even the worst-performing cell (No. 2) has a MAPE value below 0.72%, with results that fall within an acceptable range. The consistent performance across different error metrics confirms the model’s stability.
Subsequently, a comparative analysis of SOH prediction performance was performed for five deep learning strategies (LSTM–Transformer, CNN–LSTM, CNN–Transformer, LSTM, and Transformer), utilizing data from LIBs 7 and 8 of the self-test IMUT battery dataset. The results are presented in Figure 12. Among these approaches, the LSTM–Transformer (the purple curve) delivers the best performance: its curve aligns most closely with the true SOH (black curve) across both datasets, matching the true trend and fluctuations tightly, which implies that it has the smallest MAE/RMSE and an R2 nearest to 1. Following closely is CNN–LSTM (the red curve): it fits well with the true SOH in LIB 7 and only deviates slightly in the later cycles of LIB 8, resulting in marginally larger errors than those of LSTM–Transformer but still having strong overall performance. The CNN–Transformer (the blue curve) ranks at a moderate level: its curve shows minor fluctuations but remains close to the true SOH, with error levels falling between those of CNN–LSTM and LSTM. LSTM (the green curve) has weaker performance: its curve deviates noticeably from the true SOH in the later cycles of both datasets, leading to a larger MAE/RMSE and an R2 farther from 1. Finally, the Transformer (the orange curve) is the worst-performing strategy: its curve shows the most significant deviation from the true SOH (particularly with predicted values notably lower than true values in later cycles), corresponding to the largest MAE/RMSE and an R2 farthest from 1. Table 4 displays the error metric analysis for the five different deep learning strategies on the IMUT dataset.
To further validate the performance of the LSTM–Transformer hybrid model, a comparative experiment of five deep learning strategies was performed using LIB 7 and LIB 8 from the XJTU dataset (comprising 55 LISHEN LiNi0.5Co0.2Mn0.3O2 batteries with a nominal capacity of 2000 mAh, a nominal voltage of 3.6 V, and cut-off voltages of 4.2 V for charging and 2.5 V for discharging), with the results presented in Figure 13. The LSTM–Transformer exhibits the strongest prediction performance: its curve closely aligns with the true SOH across both datasets, tightly matching the actual SOH trend and fluctuations to minimize deviations. Following closely is CNN–LSTM, which, despite minor fluctuations in its curve, remains generally proximate to the true SOH and yields relatively small prediction errors. The CNN–Transformer demonstrates moderate performance, with its curve showing noticeable but not extreme divergences from the true SOH in both datasets. LSTM has weaker performance, as its curve deviates more distinctly from the true SOH, particularly in the later cycles of both batteries. Finally, the Transformer delivers the poorest prediction effect, with its curve exhibiting the most significant divergence from the true SOH across both datasets, straying the farthest from the actual SOH trend. Table 5 presents the error metric analysis for the five different deep learning strategies on the XJTU dataset.
Both the IMUT and XJTU datasets use fixed 2C charging and 1C discharging. Such constant-current cycling tests cannot capture all battery degradation mechanisms, which limits their value for assessing the practical applicability of the LSTM–Transformer hybrid model. To further validate the model, this paper introduces the HUST dataset, which contains data from 77 LFP/graphite batteries subjected to 77 distinct multi-stage discharge protocols. Manufactured by A123 (APR18650M1A), these batteries have a rated capacity of 1100 mAh and a rated voltage of 3.3 V and were cycled at 30 °C using identical charging protocols and varying discharge methods. The same five deep learning models were employed to evaluate single cells from the HUST dataset; the results are shown in Figure 14. All model prediction curves exhibit a declining trend with increasing cycle count; however, the fusion models (LSTM–Transformer and CNN–LSTM) demonstrate superior fitting accuracy and stability compared with the single models (LSTM and Transformer). The LSTM–Transformer aligned closely with the actual values in the mid-to-late stages, while CNN–LSTM exhibited low prediction errors during the mid-cycle phase. The single models showed significant prediction fluctuations and noticeable bias in the late cycle phase. The hybrid models combined the strengths of different architectures, excelling at capturing long-sequence feature correlations and local temporal features, whereas the single models exhibited limitations when handling long or ultra-long sequences. Table 6 presents the error metric analysis, quantitatively demonstrating the LSTM–Transformer's superior performance on the HUST dataset with dynamic operating conditions (lowest MAE of 0.003236, RMSE of 0.004118, and MAPE of 0.3502%, and highest R2 of 0.9965).
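The four metrics reported in Tables 3 through 6 can be computed as follows; this is a minimal stdlib sketch (the function name `metrics` is ours):

```python
import math

def metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (in percent), and R2, the four error metrics
    used throughout the model comparisons."""
    n = len(y_true)
    err = [a - b for a, b in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in err) / n
    rmse = math.sqrt(sum(e * e for e in err) / n)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(err, y_true)) / n
    mean = sum(y_true) / n
    ss_res = sum(e * e for e in err)
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, mape, r2
```

Lower MAE, RMSE, and MAPE and an R2 closer to 1 indicate a tighter fit of the predicted SOH curve to the true trajectory.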

5. Conclusions

This study developed and validated a symmetry-inspired LSTM–Transformer model for the accurate SOH assessment of LIBs using real-world measurement data. The proposed model effectively addresses the limitations of traditional single-model approaches by leveraging the complementary strengths of both architectures. The key findings and contributions of this research can be summarized as follows: (1) The LSTM–Transformer hybrid model exhibits lower error across all evaluation metrics, achieving MAE values as low as 0.001775 and RMSE values down to 0.002147 for individual batteries. Comparative analysis with the standalone LSTM and Transformer models confirmed the superiority of the hybrid approach in both accuracy and robustness. (2) Through comprehensive feature extraction and correlation analysis, we found that CC charge capacity, CC charge time, and voltage slope during the CC charging phase, along with CV charge capacity, CV charge time, and current slope during the CV phase, serve as the most significant health indicators with correlation coefficients approaching or exceeding 0.8. (3) The integration of LSTM’s local temporal dependency capture with the Transformer’s global attention mechanism enabled the model to effectively learn both short-term dynamic patterns and long-term degradation trends in battery behavior. This multi-scale temporal understanding proved crucial for accurate SOH estimation throughout the battery lifecycle. (4) The model demonstrated consistent performance across multiple batteries with varying degradation patterns and cycle lifespans, indicating strong generalizability. The error distribution analysis revealed that the hybrid model maintained low prediction errors across different battery specimens. 
(5) By utilizing readily available charging voltage and current data within specific operational ranges ([4.0, 4.2] V for voltage and [0.1, 0.5] A for current), the proposed method offers practical implementation potential for real-world BMSs without requiring complex instrumentation. The hybrid LSTM–Transformer architecture represents an advancement in data-driven battery SOH assessment, providing a robust framework that balances local feature extraction with global pattern recognition. Future work will focus on extending this approach to various battery chemistries, optimizing computational efficiency for real-time applications, and exploring transfer learning techniques to enhance model adaptability across different operating conditions and battery types.
Constraints such as cost, wiring, and system complexity often mean that battery packs monitor only a limited number of cell voltages and currents. Accurately assessing the SOH of the battery pack when information is incomplete has become a core challenge. Future research will analyze charge/discharge voltage and current trends in a small number of monitorable cells within the pack. Leveraging the strong correlation between cell characteristics, the HFs of the monitorable cells will be extracted in order to estimate the SOH of unmonitored cells. Deep learning models will address challenges such as cell imbalance, thermal gradients, and impedance dispersion, enabling precise SOH estimation.

Author Contributions

Conceptualization, J.W.; methodology, J.W. and H.Z.; software, H.Z.; writing—original draft preparation, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National College Student Innovation and Entrepreneurship Training Program (202510128006) and supported by the Inner Mongolia Natural Science Foundation Project under grant No. 2025LHMS05018 and the Inner Mongolia First-Class Discipline Scientific Research Special Project under grant No. YLXKZX-NGD-011.

Data Availability Statement

The original contributions presented in this study are included in this article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors greatly appreciate the comments from the reviewers, which helped improve the quality of this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SOH: State of Health
LIBs: Lithium-Ion Batteries
LSTM: Long Short-Term Memory
NCM: Nickel Cobalt Manganese
CC-CV: Constant Current–Constant Voltage
BMS: Battery Management System
PCC: Pearson Correlation Coefficient
MIC: Maximum Information Coefficient
HFs: Health Features

Figure 1. An LSTM structure diagram. The key components of LSTM primarily include the cell state and the gating mechanism, which is further subdivided into the forget gate, input gate, and output gate.
Figure 2. A Transformer architecture diagram. The key components of the Transformer include the input embedding layer, positional encoding, multi-head self-attention mechanism, feed-forward network, residual connections, and layer normalization, ultimately organized into two major modules: the encoder and decoder.
Figure 3. An LSTM–Transformer hybrid model architecture diagram. The LSTM–Transformer hybrid model combines LSTM for local temporal feature extraction and self-attention for long-range dependency modeling to predict the SOH.
Figure 4. Flow of LSTM–Transformer network to evaluate SOH.
Figure 5. Battery charging and discharging platform.
Figure 6. Voltage and current data preprocessing. (a) Voltage curve and voltage extraction region within single voltage acquisition cycle. (b) Current curve and current extraction region within single current acquisition cycle.
Figure 7. Charging and discharging voltage, current, and capacity curves.
Figure 8. Aging trajectory of 8 batteries.
Figure 9. A correlation diagram between 16 extracted features and the SOH. This figure illustrates 16 features of batteries 1 to 8 from the IMUT dataset. In each subplot, the x-axis represents the battery SOH, while the y-axis shows the normalized values of the corresponding feature.
Figure 10. A correlation heatmap between extracted features and the SOH. The order of the 16 features shown in this figure is consistent with the feature order in Figure 9.
Figure 11. Estimation results of SOH for 8 LIBs using LSTM–Transformer model. (a) SOH prediction results for LIB No. 1; (b) SOH prediction results for LIB No. 2; (c) SOH prediction results for LIB No. 3; (d) SOH prediction results for LIB No. 4; (e) SOH prediction results for LIB No. 5; (f) SOH prediction results for LIB No. 6; (g) SOH prediction results for LIB No. 7; (h) SOH prediction results for LIB No. 8.
Figure 12. A comparison of SOH prediction for LIB No. 7–8 under different deep learning strategies on the IMUT dataset. (a) The SOH prediction performance of five distinct deep learning strategies—namely the LSTM–Transformer, CNN–LSTM, CNN–Transformer, LSTM, and Transformer—was evaluated using the self-test dataset from LIB 7 (IMUT battery 7 dataset). (b) The SOH prediction performance of five distinct deep learning strategies—namely the LSTM–Transformer, CNN–LSTM, CNN–Transformer, LSTM, and Transformer—was evaluated using the self-test dataset from LIB 8 (IMUT battery 8 dataset).
Figure 13. Comparison of SOH predictions for LIB Nos. 7 and 8 under five deep learning strategies (LSTM–Transformer, CNN–LSTM, CNN–Transformer, LSTM, and Transformer) on the XJTU dataset. (a) Results for LIB 7. (b) Results for LIB 8.
Figure 14. A comparison of SOH prediction for single-cell batteries under different deep learning strategies on the HUST dataset.
Table 1. LSTM–Transformer model hyperparameters.

| Parameter Name | Value | Adjustment Range |
|---|---|---|
| LSTM Layers | 1 | [1, 2] |
| LSTM Hidden Units | 64 | [32, 128] |
| Transformer Layers | 1 | [1, 3] |
| Transformer Hidden Units | 128 | [64, 256] |
| Transformer Attention Heads | 8 | [4, 8] |
| Dropout | 0.25 | [0.1, 0.3] |
| Input Feature Number | 16 | Fixed |
| Decoder Input Dimensions | 1 | Fixed |
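The architecture implied by Table 1 can be sketched in PyTorch: an LSTM front end extracts local temporal features, which a Transformer encoder then relates globally via self-attention. This is a minimal illustration under stated assumptions, not the authors' exact code; the linear projection between the two stages and the last-step readout head are our own design choices.

```python
import torch
import torch.nn as nn

class LSTMTransformer(nn.Module):
    """Hybrid sketch: LSTM for local dependencies, Transformer for global trends."""

    def __init__(self, n_features=16, lstm_hidden=64, d_model=128,
                 n_heads=8, n_layers=1, dropout=0.25):
        super().__init__()
        # Local temporal modeling over the feature sequence (Table 1: 1 layer, 64 units)
        self.lstm = nn.LSTM(n_features, lstm_hidden, num_layers=1, batch_first=True)
        # Project LSTM features to the Transformer model dimension (128)
        self.proj = nn.Linear(lstm_hidden, d_model)
        # Global self-attention over the whole window (1 layer, 8 heads, dropout 0.25)
        enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                               dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        # Scalar SOH output (Table 1: decoder input dimension 1)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                   # x: (batch, seq_len, n_features)
        h, _ = self.lstm(x)                 # local temporal features
        z = self.encoder(self.proj(h))      # global attention over the sequence
        return self.head(z[:, -1, :])       # SOH estimate from the last time step
```

With the input sequence length of 30 from Table 2, a batch of shape (batch, 30, 16) maps to one SOH value per sample.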
Table 2. LSTM–Transformer training hyperparameters.

| Parameter Name | Value | Adjustment Range |
|---|---|---|
| Warm-up Learning Rate | 1.00 × 10⁻⁴ | [1 × 10⁻⁵, 5 × 10⁻⁴] |
| Base Learning Rate | 1.00 × 10⁻⁴ | [5 × 10⁻⁵, 5 × 10⁻⁴] |
| Final Learning Rate | 5.00 × 10⁻⁵ | [1 × 10⁻⁵, 1 × 10⁻⁴] |
| Training Epochs | 200 | [100, 500] |
| Early Stopping Patience | 10 | [5, 20] |
| Batch Size | 32 | [16, 64] |
| Input Sequence Length | 30 | [20, 50] |
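Table 2 gives warm-up, base, and final learning rates but not the decay shape or warm-up length; one plausible reading is a held warm-up phase followed by cosine decay from the base to the final rate. The sketch below assumes a 10-epoch warm-up and cosine decay; both are our assumptions, not values stated in the paper.

```python
import math

def lr_at_epoch(epoch, warmup_epochs=10, warmup_lr=1e-4,
                base_lr=1e-4, final_lr=5e-5, total_epochs=200):
    """Assumed schedule: hold the warm-up rate, then cosine-decay base -> final."""
    if epoch < warmup_epochs:
        return warmup_lr
    # Fraction of the post-warm-up budget that has elapsed, in [0, 1]
    t = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return final_lr + 0.5 * (base_lr - final_lr) * (1 + math.cos(math.pi * t))
```

For example, the rate stays at 1 × 10⁻⁴ through warm-up and reaches 5 × 10⁻⁵ at epoch 200, matching the table's endpoints regardless of the assumed decay shape.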
Table 3. Error metric analysis by the LSTM–Transformer model.

| Battery Number | MAE | RMSE | MAPE (%) | R² |
|---|---|---|---|---|
| No. 1 | 0.003669 | 0.004264 | 0.3990 | 0.9841 |
| No. 2 | 0.005954 | 0.006915 | 0.7150 | 0.9751 |
| No. 3 | 0.004787 | 0.005702 | 0.5560 | 0.9871 |
| No. 4 | 0.003346 | 0.004205 | 0.3560 | 0.9805 |
| No. 5 | 0.002734 | 0.004561 | 0.3020 | 0.9820 |
| No. 6 | 0.001775 | 0.002147 | 0.1960 | 0.9970 |
| No. 7 | 0.003642 | 0.004508 | 0.4121 | 0.9882 |
| No. 8 | 0.004022 | 0.005055 | 0.4544 | 0.9888 |
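The four metrics reported in Table 3 follow their standard definitions over the predicted and measured SOH sequences. A self-contained sketch in pure Python (the helper name `soh_metrics` is our own):

```python
import math

def soh_metrics(y_true, y_pred):
    """Return (MAE, RMSE, MAPE in %, R^2) for measured vs. predicted SOH."""
    n = len(y_true)
    err = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in err) / n
    rmse = math.sqrt(sum(e * e for e in err) / n)
    # MAPE is safe here: SOH is a positive capacity ratio, so y_true never hits 0
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(err, y_true)) / n
    mean_t = sum(y_true) / n
    ss_res = sum(e * e for e in err)
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, mape, r2
```

A perfect prediction yields MAE = RMSE = MAPE = 0 and R² = 1; a constant ±0.01 absolute error yields MAE = RMSE = 0.01.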
Table 4. Error metric analysis under the five different deep learning strategies on the IMUT dataset.

| LIB No. | Strategy | MAE | RMSE | MAPE (%) | R² |
|---|---|---|---|---|---|
| No. 7 | LSTM–Transformer | 0.003642 | 0.004508 | 0.4121 | 0.9882 |
|  | CNN–LSTM | 0.004827 | 0.005639 | 0.5417 | 0.9816 |
|  | CNN–Transformer | 0.006307 | 0.007222 | 0.7109 | 0.9698 |
|  | LSTM | 0.009216 | 0.010246 | 1.0362 | 0.9392 |
|  | Transformer | 0.010361 | 0.011935 | 1.1919 | 0.9176 |
| No. 8 | LSTM–Transformer | 0.004022 | 0.005055 | 0.4544 | 0.9888 |
|  | CNN–LSTM | 0.005757 | 0.006357 | 0.6651 | 0.9823 |
|  | CNN–Transformer | 0.005175 | 0.006769 | 0.5875 | 0.9800 |
|  | LSTM | 0.006375 | 0.007431 | 0.7304 | 0.9758 |
|  | Transformer | 0.008720 | 0.009606 | 1.0155 | 0.9596 |
Table 5. Error metric analysis under the five different deep learning strategies on the XJTU dataset.

| LIB No. | Strategy | MAE | RMSE | MAPE (%) | R² |
|---|---|---|---|---|---|
| No. 7 | LSTM–Transformer | 0.003468 | 0.004963 | 0.3870 | 0.9908 |
|  | CNN–LSTM | 0.004300 | 0.006126 | 0.4783 | 0.9860 |
|  | CNN–Transformer | 0.004734 | 0.006685 | 0.5326 | 0.9833 |
|  | LSTM | 0.007595 | 0.009745 | 0.8416 | 0.9646 |
|  | Transformer | 0.006908 | 0.010454 | 0.7775 | 0.9593 |
| No. 8 | LSTM–Transformer | 0.003392 | 0.004002 | 0.3559 | 0.9927 |
|  | CNN–LSTM | 0.004244 | 0.006674 | 0.4694 | 0.9797 |
|  | CNN–Transformer | 0.008117 | 0.008994 | 0.8678 | 0.9631 |
|  | LSTM | 0.006880 | 0.009925 | 0.7614 | 0.9550 |
|  | Transformer | 0.009348 | 0.014423 | 1.0410 | 0.9051 |
Table 6. Error metric analysis under the five different deep learning strategies on the HUST dataset.

| Strategy | MAE | RMSE | MAPE (%) | R² |
|---|---|---|---|---|
| LSTM–Transformer | 0.003236 | 0.004118 | 0.3502 | 0.9965 |
| CNN–LSTM | 0.005553 | 0.006009 | 0.5991 | 0.9926 |
| CNN–Transformer | 0.004847 | 0.007005 | 0.5664 | 0.9899 |
| LSTM | 0.005485 | 0.006962 | 0.6143 | 0.9900 |
| Transformer | 0.005503 | 0.006922 | 0.6176 | 0.9901 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Zhang, H.; Wang, J. Research on State of Health Assessment of Lithium-Ion Batteries Using Actual Measurement Data Based on Hybrid LSTM–Transformer Model. Symmetry 2026, 18, 169. https://doi.org/10.3390/sym18010169
