1. Introduction
Lithium-ion batteries (LIBs) have become the dominant energy storage technology for electric vehicles and renewable energy integration due to their high energy density, low self-discharge rate, fast response, and long cycle life [1,2]. Their widespread deployment—from consumer electronics to grid-scale storage—has driven rapid market growth, with global EV-related LIB demand reaching 142.8 GWh in 2020 and the associated market projected to exceed 91.8 billion USD in the coming years [3]. However, like many electrochemical systems, LIBs inevitably suffer performance degradation, or aging, over time. This aging process, characterized by capacity fade and increased internal resistance, is a primary cause of inaccurate state estimation in practical applications. Improper operation, such as overcharging, deep discharging, or high C-rate cycling, accelerates irreversible wear of electrodes and separators, leading to further degradation and heightened safety risks [4].
Ensuring safe and efficient operation therefore requires accurate real-time estimation of internal states—most notably the state of charge (SOC) and state of health (SOH)—which are pivotal for battery management systems (BMSs) [5]. SOC, representing the ratio of remaining charge to nominal capacity, directly affects driving range estimation, energy efficiency, and charge/discharge control strategies [6]. SOH, reflecting the degradation level and residual lifetime, underpins maintenance scheduling and end-of-life prediction. Neither state can be measured directly during regular operation; both must be inferred from measurable electrical, thermal, or mechanical indicators. The challenge lies in maintaining estimation accuracy under diverse operating conditions, where sensor noise, parameter drift, and nonlinear battery dynamics intensify the difficulty of reliable state tracking [7].
SOC estimation methods fall into three categories: physics-based, data-driven, and hybrid. Physics-based approaches—such as Coulomb counting, open-circuit voltage (OCV) methods, and electrochemical impedance spectroscopy—offer mechanistic rigor but are often computationally intensive, less adaptive, or dependent on rest periods for OCV calibration [8,9,10]. In practice, Ah-integration with periodic OCV correction remains common due to its simplicity [11], but it is sensitive to initial SOC errors and sensor drift, leading to cumulative inaccuracies [12].
Data-driven methods model the battery as a nonlinear black-box system, learning mappings from measurable signals—such as current, voltage, and temperature—to SOC without explicitly describing internal electrochemical processes [13]. Leveraging large-scale telemetry from BMS sensors, machine learning approaches have demonstrated strong capability in reconstructing the complex multi-variable relationship governing SOC under diverse load profiles and environments [14]. Recurrent neural networks, particularly Long Short-Term Memory (LSTM) architectures, effectively capture the temporal dependencies inherent in charging/discharging cycles [15], and recent adaptive designs improve robustness under temperature variations [16,17]. Recent advances in deep learning have also explored transfer learning for battery modeling under extreme conditions. For instance, Shi et al. [18] applied transfer learning to predict heat release during thermal runaway, demonstrating the potential of knowledge transfer across battery operating regimes. Such frameworks have also been extended to state of health estimation [19]. Nevertheless, purely data-driven models remain challenged by limited physical interpretability and potential performance degradation when operating under extreme temperatures, fast transients, or battery aging [20,21].
Hybrid approaches integrate physical priors with data-driven flexibility, enhancing generalization while retaining interpretability [22,23]. Examples include Bayesian electrochemical model hybrids [24], ANFIS with real-time correction [25], and neural network Kalman filter (NN-KF) designs [26,27]. However, most hybrids achieve only shallow coupling—such as parameter adaptation or post hoc correction—rather than end-to-end co-optimization of physics and temporal learning. Balancing high model fidelity with computational efficiency remains challenging, especially for real-time BMSs [28,29,30]. Robustness under extreme conditions (low temperature, high C-rate charging, long-term aging) is also difficult to achieve due to amplified model uncertainty and sensor noise [31].
The Python Battery Mathematical Modelling package (PyBaMM; e.g., version 25.10.2) is an open-source, community-driven platform for high-fidelity electrochemical modeling, designed to facilitate collaboration and accelerate battery research by applying modern software engineering practices [32]. By representing models as expression trees and processing them through a modular pipeline, PyBaMM enables flexible implementation and comparison of battery models and numerical methods, solving multi-scale partial differential equations to capture spatially resolved internal states such as lithium concentration, potential distribution, and temperature effects [33]. These capabilities make PyBaMM well suited for studying degradation mechanisms and state estimation problems; however, direct deployment of full-fidelity models for real-time SOC estimation in battery management systems is hindered by high computational costs and sensitivity to parameter identification errors under dynamic conditions [34]. While methods such as signal decoupling and Monte Carlo dropout [35] have been explored to improve generalization, they do not inherently enforce physical plausibility—motivating hybrid approaches that integrate PyBaMM's physical rigor into data-driven architectures to achieve both interpretability and efficiency.
To address these gaps, we propose a Physics-Informed Transformer (PI-Transformer) enabling deep, differentiable integration of electrochemical constraints within a Transformer architecture. Our key insight is that the core electrochemical principles encoded in PyBaMM, such as the differential equation governing SOC change, $\mathrm{d}\,\mathrm{SOC}(t)/\mathrm{d}t = -I(t)/Q_n$, can be extracted and embedded directly into a deep learning architecture as differentiable constraints. This allows us to enforce physical plausibility without sacrificing the model's ability to learn complex, long-term temporal dependencies from data. Extensive experiments on two public datasets, the NASA dataset (laboratory cycling at 24 °C and 4 °C) and the Braatz dataset [36] (real-world fast-charging with aging), demonstrate the superiority of the PI-Transformer.
Main contributions:
We propose the Physics-Informed Transformer (PI-Transformer), a novel framework that enables deep, differentiable integration of PyBaMM electrochemical constraints into a Transformer architecture, balancing physical interpretability with data-driven adaptability.
We design a dual-branch fusion architecture and an attention-based noise modeling module to jointly optimize physics-guided dynamics and data-driven features, enhancing robustness against sensor noise and battery aging.
We conduct comprehensive experiments on two public datasets under diverse conditions (4–30 °C, fast-charging, aging), demonstrating state-of-the-art performance and strong generalization, with the PI-Transformer achieving the lowest error rates and highest $R^2$ scores across all evaluations.
The paper is structured as follows. Section 2 formally defines the SOC estimation problem. Section 3 introduces the proposed Physics-Informed Transformer (PI-Transformer) framework. Section 4 details the experimental methodology, including dataset descriptions and evaluation protocols. Section 5 presents and analyzes the experimental results. Finally, Section 6 summarizes the key findings and outlines potential avenues for future work.
2. Problem Formulation for SOC Estimation
The estimation of the state of charge (SOC) for lithium-ion batteries is a nonlinear state reconstruction problem based on dynamic system observations. In the discrete-time domain, SOC is defined as the percentage of the remaining capacity relative to the nominal capacity, which can be mathematically expressed as

$$\mathrm{SOC}_k = \frac{Q_k}{Q_n} \times 100\%,$$

where $Q_k$ denotes the remaining capacity at time step $k$, and $Q_n$ is the battery's nominal capacity, assumed to be a known constant determined by manufacturer specifications.
Based on the principle of charge conservation, the dynamic evolution of SOC can be described by the following state equation:

$$\mathrm{SOC}_{k+1} = \mathrm{SOC}_k + \frac{\eta\, I_k\, \Delta t}{Q_n},$$

where $I_k$ is the charging/discharging current (positive for charging, negative for discharging), $\eta$ denotes the Coulombic efficiency (typically within $[0,1]$), and $\Delta t$ represents the sampling time interval. This equation provides a simplified yet physically grounded representation of the SOC evolution process, derived from Faraday's law of electrolysis.
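To make the state equation concrete, the following minimal sketch propagates SOC by Ah-integration under the sign convention above; the function name and the clipping to $[0,1]$ are our illustrative additions.

```python
import numpy as np

def soc_coulomb_counting(soc0, current, dt, q_nominal_ah, eta=1.0):
    """Propagate SOC with the charge-conservation state equation.

    current: currents in A (positive = charging, per Section 2);
    dt: sampling interval in seconds; q_nominal_ah: nominal capacity in Ah.
    """
    q_nominal_as = q_nominal_ah * 3600.0        # Ah -> A*s
    soc = np.empty(len(current) + 1)
    soc[0] = soc0
    for k, i_k in enumerate(current):
        soc[k + 1] = soc[k] + eta * i_k * dt / q_nominal_as
    return np.clip(soc, 0.0, 1.0)               # keep SOC physically bounded
```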
In practical battery management systems, SOC cannot be measured directly and must be inferred from indirect measurements such as terminal voltage. The nonlinear relationship between voltage and SOC can be expressed through the following observation model:

$$V_k = h(\mathrm{SOC}_k, I_k, T_k),$$

where $h(\cdot)$ represents the open-circuit voltage (OCV)–SOC functional relationship, which is further affected by current $I_k$ and temperature $T_k$. Due to the effects of battery aging, operating conditions, and environmental variability, $h(\cdot)$ exhibits complex nonlinear behavior, making accurate SOC estimation a nontrivial task.
To enhance estimation accuracy and robustness, this paper proposes a hybrid modeling framework that combines physics-based principles with data-driven learning. Let $X_k = \{(V_i, I_i, T_i)\}_{i=k-l+1}^{k}$ denote a historical sequence of voltage, current, and temperature measurements over a sliding window of length $l$. The SOC prediction model is then formulated as a parametric mapping:

$$\widehat{\mathrm{SOC}}_{k+1} = f_\theta(X_k),$$

where $\theta$ is the set of learnable model parameters.
The parameter learning process minimizes the following mean squared error objective function:

$$\mathcal{L}(\theta) = \frac{1}{M}\sum_{i=1}^{M}\left(\mathrm{SOC}_i - \widehat{\mathrm{SOC}}_i\right)^2 + \lambda \lVert\theta\rVert_2^2,$$

where $\lambda$ is the L2 regularization coefficient used to prevent overfitting.
The estimation performance is evaluated using two standard metrics: the root mean square error (RMSE) and the coefficient of determination $R^2$, defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{SOC}_i - \widehat{\mathrm{SOC}}_i\right)^2}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{N}\left(\mathrm{SOC}_i - \widehat{\mathrm{SOC}}_i\right)^2}{\sum_{i=1}^{N}\left(\mathrm{SOC}_i - \overline{\mathrm{SOC}}\right)^2},$$

where $N$ is the number of evaluated samples and $\overline{\mathrm{SOC}}$ is the mean of the ground-truth SOC values.
However, conventional physics-based models often fail to capture the dynamic nonlinearities introduced by aging and environmental variability, while purely data-driven models lack physical interpretability and generalization capability. To address these limitations, this paper proposes a Physics-Informed Transformer (PI-Transformer) framework that embeds the differential equation constraints of the PyBaMM electrochemical model into the Transformer architecture and introduces an attention-based adaptive noise modeling mechanism. This approach ensures both physical consistency and strong generalization performance (see Section 3 for details).
3. PI-Transformer-Based SOC Estimation Method
Accurate state of charge (SOC) estimation requires not only capturing long-term temporal dependencies from historical data but also enforcing strict adherence to electrochemical principles to ensure physical consistency. To achieve this, this paper proposes a hybrid framework that integrates physics-based differential equations derived from the PyBaMM electrochemical model directly into the Transformer architecture.
The framework first embeds physical constraints into the model through a physics-informed embedding layer, ensuring that the predicted SOC evolution strictly follows the laws of charge conservation. Simultaneously, the Transformer network learns complex nonlinear relationships from multivariate time-series data (voltage, current, and temperature), enabling accurate long-term trend prediction. An attention-based adaptive noise modeling mechanism further enhances the model’s robustness by dynamically compensating for sensor noise and battery aging effects. This dual-path design enables the model to combine the strengths of physics-based modeling and data-driven learning, significantly improving the accuracy and generalization of SOC estimation.
In this model, the input is defined as a historical data matrix:

$$X \in \mathbb{R}^{l \times d},$$

where $l$ denotes the length of the historical time window, and $d$ is the number of input features (e.g., voltage $V$, current $I$, and temperature $T$). The output of the model is the predicted SOC at the next time step:

$$\hat{y} = \widehat{\mathrm{SOC}}_{k+1}.$$

Each row of $X$ corresponds to an observation vector at a specific time, and $\hat{y}$ represents the estimated SOC at the $(k{+}1)$-th time step. Therefore, the entire model can be interpreted as an end-to-end mapping from historical battery data to future SOC predictions.
3.1. Physics-Informed Embedding via PyBaMM
The PyBaMM-based electrochemical model describes the SOC evolution using the following differential equation:

$$\frac{\mathrm{d}\,\mathrm{SOC}(t)}{\mathrm{d}t} = -\frac{I(t)}{Q_n},$$

where $\mathrm{SOC}(t)$ denotes the battery's state of charge at time $t$, $I(t)$ is the current (positive for discharge and negative for charge), and $Q_n$ is the nominal battery capacity.
Discretizing this equation using a first-order forward difference yields

$$\mathrm{SOC}_{k+1} = \mathrm{SOC}_k - \frac{I_k\, \Delta t}{Q_n},$$

where $\Delta t$ is the sampling interval. This equation is embedded into the Transformer as a physics-informed prior. Specifically, we construct a physics-informed embedding layer that maps the battery's current and temperature into a latent representation consistent with the electrochemical model. This ensures that the Transformer's predictions remain physically meaningful, even under unseen operating conditions.
Formally, let $x_k = [V_k, I_k, T_k]$ denote the raw input features at time step $k$. The physics-informed embedding layer computes the physics-based SOC change:

$$\Delta \mathrm{SOC}_k^{\mathrm{phys}} = -\frac{I_k\, \Delta t_k}{Q_n},$$

where $\Delta t_k$ is the time difference between consecutive measurements. The physics-informed embedding is then constructed as

$$e_k^{\mathrm{phys}} = \phi\!\left(\left[\Delta \mathrm{SOC}_k^{\mathrm{phys}},\, I_k,\, T_k\right]\right),$$

where $\phi(\cdot)$ is a learnable embedding function implemented as a multi-layer perceptron (MLP).
This physics-informed embedding is designed to capture the fundamental electrochemical dynamics of the battery, providing a strong inductive bias that guides the model towards physically plausible predictions.
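A minimal PyTorch sketch of such an embedding layer is given below; the two-layer MLP and its width are illustrative assumptions consistent with the description above.

```python
import torch
import torch.nn as nn

class PhysicsInformedEmbedding(nn.Module):
    """Maps [delta_SOC_phys, I, T] to a d_model-dimensional latent vector."""

    def __init__(self, d_model=128, q_nominal_ah=2.0):
        super().__init__()
        self.q_nominal_as = q_nominal_ah * 3600.0
        self.mlp = nn.Sequential(
            nn.Linear(3, d_model), nn.ReLU(), nn.Linear(d_model, d_model)
        )

    def forward(self, current, temperature, dt):
        # Physics-based SOC increment from the discretized ODE
        # (positive current = discharge, per the convention of Section 3.1).
        delta_soc = -current * dt / self.q_nominal_as
        feats = torch.stack([delta_soc, current, temperature], dim=-1)
        return self.mlp(feats)        # shape: (batch, seq_len, d_model)
```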
3.2. Transformer Network Architecture with Physics Embedding
The Transformer network processes multivariate time-series data to predict future SOC values. The architecture consists of the following components:
3.2.1. Transformer Input and Embedding Layer
Let the historical observation matrix be

$$X = [x_1, x_2, \ldots, x_l]^{\top} \in \mathbb{R}^{l \times d},$$

where $l$ is the number of historical time steps, and $d$ is the number of observed variables (e.g., voltage $V$, current $I$, and temperature $T$).
We first map $X$ into a higher-dimensional representation space of size $d_{\mathrm{model}}$ using a linear transformation:

$$E = X W_e + b_e,$$

where $W_e \in \mathbb{R}^{d \times d_{\mathrm{model}}}$ and $b_e \in \mathbb{R}^{d_{\mathrm{model}}}$ are learnable parameters.

To preserve the temporal order of the sequence, a positional encoding $P \in \mathbb{R}^{l \times d_{\mathrm{model}}}$ is added, and the final input to the Transformer is

$$Z_0 = E + P.$$

The positional encoding is computed using sinusoidal functions, as proposed in the original Transformer paper, to inject information about the relative or absolute position of tokens in the sequence.
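For reference, a minimal implementation of this sinusoidal encoding is sketched below.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encoding (Vaswani et al., 2017)."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
    return pe                                      # added to the embedding E
```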
3.2.2. Transformer Encoder Layer
Each Transformer encoder layer consists of a multi-head self-attention mechanism (MHA) and a position-wise feed-forward network (FFN), with residual connections and layer normalization applied at each sublayer.
Multi-Head Self-Attention Mechanism
For each attention head, we compute the query ($Q$), key ($K$), and value ($V$) matrices:

$$Q = Z W^{Q}, \qquad K = Z W^{K}, \qquad V = Z W^{V},$$

where $W^{Q}, W^{K}, W^{V} \in \mathbb{R}^{d_{\mathrm{model}} \times d_k}$ are learnable parameters.

Each attention head performs scaled dot-product attention:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V.$$

For $h$ attention heads, the output of the $i$-th head is

$$\mathrm{head}_i = \mathrm{Attention}\!\left(Z W_i^{Q},\, Z W_i^{K},\, Z W_i^{V}\right).$$

After concatenating the outputs of all heads, a projection matrix $W^{O} \in \mathbb{R}^{h d_k \times d_{\mathrm{model}}}$ is applied:

$$\mathrm{MHA}(Z) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O}.$$

A residual connection and layer normalization yield the intermediate representation:

$$Z' = \mathrm{LayerNorm}\big(Z + \mathrm{MHA}(Z)\big).$$
Feed-Forward Network
The intermediate representation $Z'$ is then passed through a position-wise feed-forward network:

$$\mathrm{FFN}(Z') = \max(0,\, Z' W_1 + b_1)\, W_2 + b_2,$$

where $W_1 \in \mathbb{R}^{d_{\mathrm{model}} \times d_{\mathrm{ff}}}$, $b_1 \in \mathbb{R}^{d_{\mathrm{ff}}}$, $W_2 \in \mathbb{R}^{d_{\mathrm{ff}} \times d_{\mathrm{model}}}$, and $b_2 \in \mathbb{R}^{d_{\mathrm{model}}}$ are learnable parameters.

The output of the encoder layer is obtained via another residual connection and normalization:

$$Z^{\mathrm{out}} = \mathrm{LayerNorm}\big(Z' + \mathrm{FFN}(Z')\big).$$
After stacking $L$ such encoder layers, we obtain the final encoder output $Z_L \in \mathbb{R}^{l \times d_{\mathrm{model}}}$. We extract the last time-step representation as the global feature vector:

$$h = Z_L[l, :] \in \mathbb{R}^{d_{\mathrm{model}}}.$$
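In practice, this encoder stack can be realized with standard PyTorch modules; the sketch below uses the hyperparameters reported in Section 4.4 and is illustrative rather than a verbatim excerpt of our implementation.

```python
import torch.nn as nn

# Encoder stack for Section 3.2.2 with the Section 4.4 hyperparameters
# (d_model = 128, h = 8, d_ff = 512, L = 2, dropout = 0.2).
encoder_layer = nn.TransformerEncoderLayer(
    d_model=128, nhead=8, dim_feedforward=512, dropout=0.2,
    batch_first=True,                 # inputs shaped (batch, seq_len, d_model)
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# Given z0 of shape (batch, l, d_model) from the embedding stage:
#   h = encoder(z0)[:, -1, :]        # last-time-step global feature vector
```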
3.2.3. Adaptive Noise and Aging Modeling via Attention
To account for sensor noise and the dynamic effects of battery aging, we introduce a self-attention-based noise modeling module. This module dynamically adjusts the attention weights based on the input’s noise characteristics and the inferred level of battery degradation.
Let $h \in \mathbb{R}^{d_{\mathrm{model}}}$ represent the final encoder output. We refine this feature using a learnable noise modeling vector $n$, derived from the input sequence. The noise modeling vector is computed as

$$n = \psi(X),$$

where $\psi(\cdot)$ is a learnable function implemented as a small Transformer encoder followed by mean pooling and a linear projection.

The fused vector $h_{\mathrm{fused}}$ is then computed as

$$h_{\mathrm{fused}} = [h;\, n],$$

where $[\,\cdot\,;\,\cdot\,]$ denotes vector concatenation.

This fused vector $h_{\mathrm{fused}}$ is passed through a fully connected layer to produce the final SOC prediction:

$$\widehat{\mathrm{SOC}} = \sigma\big(W_o\, h_{\mathrm{fused}} + b_o\big),$$

where $W_o$ and $b_o$ are learnable parameters, and $\sigma(\cdot)$ is a Sigmoid activation function.
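A minimal sketch of this fusion and output stage follows; the dimension of the noise vector is an illustrative assumption.

```python
import torch
import torch.nn as nn

class NoiseAwareHead(nn.Module):
    """Concatenates the encoder feature h with the noise vector n and
    maps the result to SOC through a Sigmoid (Section 3.2.3)."""

    def __init__(self, d_model=128, d_noise=16):
        super().__init__()
        self.out = nn.Linear(d_model + d_noise, 1)

    def forward(self, h, n):
        fused = torch.cat([h, n], dim=-1)      # [h; n] vector concatenation
        return torch.sigmoid(self.out(fused))  # SOC constrained to (0, 1)
```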
3.3. End-to-End Training and Physics Integration Strategy
The PI-Transformer is trained end-to-end using the standard mean squared error (MSE) loss:

$$\mathcal{L} = \frac{1}{M}\sum_{i=1}^{M}\left(\mathrm{SOC}_i - \widehat{\mathrm{SOC}}_i\right)^2,$$

where $M$ is the number of training samples. Crucially, the physics-informed constraints are not implemented as an additional term in the loss function but rather as an architectural component that modifies the input representation. This is achieved through the physics-informed embedding layer, which computes the physics-based SOC evolution as in Formula (13) and combines it with raw current and temperature signals to create a physics-aware embedding. This embedding is then fused with the data-driven embedding and processed by the Transformer encoder.
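The sketch below illustrates one way the physics-aware and data-driven embeddings can be fused before the encoder; concatenation followed by a linear projection is an assumption on our part, and additive fusion would be an equivalent design choice.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Fuses the data-driven embedding with the physics-informed embedding
    before the Transformer encoder (an illustrative realization)."""

    def __init__(self, d_model=128):
        super().__init__()
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, e_data, e_phys):
        # Both inputs have shape (batch, seq_len, d_model).
        return self.proj(torch.cat([e_data, e_phys], dim=-1))
```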
The proposed structure is shown in Figure 1. This design offers two key advantages over conventional physics-informed neural networks (PINNs), where physics constraints are typically enforced as additional loss terms:
1. End-to-end differentiability: The physics constraint is fully differentiable and integrated into the model's forward pass, allowing gradients to flow back through the physics layer during backpropagation.
2. Architectural flexibility: The physics information is treated as an auxiliary input feature, which can be dynamically weighted and fused with data features by the Transformer's self-attention mechanism, rather than being a rigid constraint imposed by the loss function.
All components—physics-informed embedding, positional encoding, encoder layers, and output layer—are jointly optimized through gradient descent, ensuring that the model learns both the physical constraints and complex temporal patterns in the data. As demonstrated in Section 5, this deep fusion of physics-based modeling and data-driven learning enables the PI-Transformer to achieve state-of-the-art performance and strong generalization under diverse operating conditions.
In summary, the proposed framework integrates physical knowledge into the Transformer through a dedicated embedding layer while leveraging the model’s attention mechanism to learn nonlinear temporal features and adapt to aging and noise. This results in a deep fusion of physics-based modeling and data-driven learning, enabling accurate and robust SOC estimation under diverse operating conditions. The model is trained in an end-to-end manner, ensuring that physical constraints are preserved while maximizing data fitting capability.
4. Dataset Description and Experimental Design
This section provides a detailed description of the datasets used in this study and the rigorous experimental methodology employed to evaluate the proposed PI-Transformer framework.
4.1. Dataset Description
This study evaluates the proposed SOC estimation method using two widely recognized public datasets: the NASA dataset and the Braatz dataset [36]. A summary of these datasets is provided in Table 1.
NASA Dataset: The NASA dataset contains test data from 34 sets of 18650-type lithium-ion batteries with a nominal capacity of 2.0 Ah. The dataset covers two temperature conditions (24 °C and 4 °C) and various discharge modes (constant current discharge and square-wave load).
Figure 2 visualizes representative time-series trajectories of voltage, current, temperature, and capacity across six distinct test batches, illustrating the diversity in battery behavior under different aging stages and operating conditions. In total, the dataset includes 7,457,030 samples across 7565 cycles (2815 charge cycles, 2794 discharge cycles, and 1956 impedance measurement cycles). This dataset is used to validate the accuracy of the SOC estimation method under varying temperature conditions and battery degradation levels.
Braatz Dataset: The Braatz dataset consists of 124 commercial lithium iron phosphate (LFP)/graphite cells (A123 Systems APR18650M1A) with a nominal capacity of 1.1 Ah and nominal voltage of 3.3 V. These cells were cycled under fast-charging conditions at a constant temperature of 30 °C until failure. Charging followed a two-step fast-charging policy (C1(Q1)-C2) up to 80% SOC, followed by 1C CC-CV charging. Discharge was performed at 4C constant current. The dataset includes high-resolution measurements of current, voltage, temperature, and internal resistance, enabling evaluation of the SOC estimation method under aggressive fast-charging conditions and long-term cycling degradation.
Both datasets provide rich multi-parameter time-series data, enabling comprehensive validation of the proposed SOC estimation method under diverse operating conditions, temperature regimes, and battery degradation states.
4.2. Input Feature Selection
The input to the model is a sequence of historical measurements of the battery's current ($I$), voltage ($V$), and temperature ($T$), with the output being the predicted state of charge (SOC) at the next time step. This selection of inputs is firmly grounded in fundamental battery electrochemistry and practical sensor availability. The current $I$ is the primary physical driver of SOC change, as dictated by the charge-conservation state equation in Section 2. Voltage $V$, while nonlinearly related to SOC, provides critical state information—particularly through its open-circuit voltage (OCV) characteristic—and remains a vital indicator even under dynamic load conditions. Temperature $T$ is included because it profoundly influences key internal processes such as ionic diffusion and reaction kinetics; neglecting it can lead to significant estimation errors, especially under extreme thermal conditions. Together, these three variables form a physically meaningful and practically feasible input set, as they are routinely measured by standard battery management systems (BMSs) and collectively capture the dominant dynamics governing SOC evolution.
4.3. Experimental Design
To ensure a fair, rigorous, and realistic evaluation of our model, we employ a carefully designed experimental methodology that accounts for the unique characteristics of each dataset. The core principle guiding our experimental design is to evaluate both intra-battery performance (temporal generalization) and cross-battery generalization (spatial generalization), while maintaining strict adherence to data leakage prevention and reproducibility.
For the NASA dataset, we adopt a chronological time-series split for each individual battery cell. Specifically, we allocate 60% of the temporal data to the training set, 20% to the validation set, and the remaining 20% to the test set. This approach strictly preserves the temporal order of data, ensuring that the model is trained exclusively on past observations and evaluated on future ones. Such a design eliminates any risk of data leakage and provides a realistic assessment of the model’s predictive performance over time, which is critical for real-world battery management systems where future states must be predicted based on historical data.
For the Braatz dataset, our primary objective is to evaluate the model’s ability to generalize across unseen physical batteries. To achieve this, we implement a stratified random split based on battery ID, where 60% of the 124 battery cells are randomly assigned to the training set, 20% to the validation set, and the remaining 20% to the test set. Crucially, all temporal data associated with a given battery cell are kept entirely within a single partition, ensuring that no information from a test battery is present in the training or validation sets. This design provides a robust and unbiased measure of the model’s cross-battery generalization capability, which is essential for practical deployment in large-scale battery fleets where new batteries are continuously added.
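The two splitting protocols can be summarized by the sketch below; the random seed is an illustrative assumption.

```python
import numpy as np

def split_chronologically(n_samples):
    """NASA protocol: 60/20/20 time-ordered split per battery cell."""
    i1, i2 = int(0.6 * n_samples), int(0.8 * n_samples)
    return slice(0, i1), slice(i1, i2), slice(i2, n_samples)

def split_by_battery_id(battery_ids, seed=42):
    """Braatz protocol: every cycle of a cell lands in exactly one partition."""
    rng = np.random.default_rng(seed)
    ids = rng.permutation(np.unique(battery_ids))
    n = len(ids)
    return (set(ids[: int(0.6 * n)]),              # training cells
            set(ids[int(0.6 * n): int(0.8 * n)]),  # validation cells
            set(ids[int(0.8 * n):]))               # test cells
```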
In addition to evaluating the full PI-Transformer model, we conduct a controlled ablation study to quantify the individual contribution of the noise modeling module. Specifically, we train a variant of the PI-Transformer in which the noise modeling module is disabled (i.e., the noise vector $n$ is set to zero). This ablation variant is treated as a sixth comparative model, alongside EKF, LSTM, GRU, Transformer, and the full PI-Transformer. The results of this ablation study are presented in Section 5 and provide direct evidence of the noise module's effectiveness in enhancing robustness under aged and noisy conditions.
The experimental setup, as summarized in Table 2, is divided into three distinct experiments:
EXP1: Evaluates performance on the NASA dataset at 24 °C.
EXP2: Evaluates performance on the NASA dataset at 4 °C.
EXP3: Utilizes the Braatz dataset at 30 °C.
In all experiments, the six comparative models are trained and tested under identical input features and preprocessing protocols. Model performance is quantitatively assessed using two complementary metrics: the root mean square error (RMSE) for precision and the coefficient of determination ($R^2$) for explanatory power, providing a balanced and comprehensive evaluation of estimation accuracy.
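For reproducibility, the two metrics can be computed as follows, directly transcribing the definitions in Section 2.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between ground-truth and predicted SOC."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```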
This comprehensive experimental design, combining time-series validation for intra-battery performance, cross-battery generalization testing, and ablation studies for component analysis, allows us to draw robust conclusions regarding the efficacy and practical applicability of the proposed PI-Transformer framework. The results demonstrate that the integration of physics-informed constraints and adaptive noise modeling significantly enhances the model’s accuracy and robustness, particularly under challenging operating conditions such as low temperature and battery aging.
4.4. Model Architecture and Training Details
The proposed PI-Transformer is designed to balance model capacity, computational efficiency, and generalization performance. The input sequence length $l$ is set to 50 time steps, which strikes a balance between capturing sufficient temporal context and maintaining manageable computational complexity; this value corresponds to the window_size parameter in our implementation. The hidden dimension $d_{\mathrm{model}}$ is set to 128, providing sufficient representational capacity while avoiding overfitting on the relatively small battery datasets. The number of encoder layers $L$ is set to 2, as deeper architectures did not yield significant improvements in preliminary experiments, consistent with findings in other time-series Transformer applications.
The attention mechanism employs 8 heads ($h = 8$), allowing the model to jointly attend to information from different representation subspaces. The feed-forward network dimension is set to 512, following the standard ratio $d_{\mathrm{ff}} = 4\, d_{\mathrm{model}}$. A dropout rate of 0.2 is applied to both the attention and feed-forward layers to regularize the model and prevent overfitting.
Training is performed using the AdamW optimizer with weight decay, which helps control model complexity and improve generalization. The learning rate is initialized at 0.001 and follows a cosine annealing schedule to facilitate convergence. The batch size is set to 1536, chosen to maximize GPU memory utilization on an NVIDIA A100 GPU (40 GB VRAM) without causing out-of-memory errors. Training proceeds for 10 epochs, with early stopping enabled based on validation loss to prevent overfitting.
All experiments were conducted using PyTorch 2.0 and CUDA 12.1, ensuring reproducibility across different computing environments.
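A condensed sketch of this training configuration is given below; build_pi_transformer and train_loader are hypothetical placeholders, and the weight-decay value is illustrative since only the use of weight decay (not its magnitude) is stated above.

```python
import torch

model = build_pi_transformer()       # hypothetical model constructor
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3,
                              weight_decay=1e-2)   # decay value illustrative
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
criterion = torch.nn.MSELoss()

for epoch in range(10):              # 10 epochs; early stopping omitted here
    for x, y in train_loader:        # batches of 1536 windows of (V, I, T)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()              # gradients also flow through the
        optimizer.step()             # physics-informed embedding layer
    scheduler.step()
```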
5. Experimental Results and Analysis
The experimental results, presented in Table 3, provide a rigorous, multi-dimensional validation of the proposed PI-Transformer framework. The model's superiority is not merely a statistical artifact but is consistently evident across all three experimental settings, and its effectiveness is robustly supported by both quantitative metrics and qualitative visual analysis.
5.1. Quantitative Performance Across Datasets
Table 3 summarizes the average performance of all models across the three experimental setups. The PI-Transformer consistently achieves state-of-the-art results on both validation and test sets, outperforming all baseline models in terms of RMSE and $R^2$. Notably, the inclusion of the noise modeling module yields a significant performance gain, particularly under challenging operating conditions, as demonstrated by the ablation study in EXP2.
5.1.1. Performance on NASA Dataset at 24 °C (EXP1)
In the most controlled environment, where electrochemical dynamics are relatively stable, the PI-Transformer establishes a strong baseline for accuracy. It achieves a Validation RMSE of 0.0195 and a Test RMSE of 0.0256, outperforming the next best model (LSTM) by 3.5% and 25.2%, respectively. The near-perfect Validation $R^2$ of 0.9903 indicates that the model explains over 99% of the variance in the SOC data, demonstrating its ability to capture the underlying physical relationships with high fidelity.
This result confirms that the integration of physics-informed constraints does not hinder the model’s ability to fit clean, well-behaved data; rather, it enhances its predictive precision by providing a strong inductive bias that guides the learning process towards physically plausible solutions.
5.1.2. Performance on NASA Dataset at 4 °C (EXP2)
The true test of robustness is revealed in the low-temperature scenario, where increased measurement noise and reduced ionic mobility challenge conventional models. Here, the PI-Transformer exhibits remarkable resilience, achieving a Validation RMSE of 0.0226 and a Test RMSE of 0.0226, significantly outperforming the standard Transformer.
This performance advantage is further amplified when considering the full model's superior $R^2$ score (0.9783 vs. 0.9759), indicating better explanatory power under noisy conditions. The consistent performance across validation and test sets suggests that the model generalizes well even when faced with environmental stressors.
5.1.3. Performance on Braatz Dataset at 30 °C (EXP3)
The most critical evaluation is on the Braatz dataset, which features batteries undergoing aggressive fast-charging protocols and experiencing progressive capacity degradation. Here, the PI-Transformer demonstrates an extraordinary level of generalization, achieving a Test RMSE of 0.0698, which is a 31.5% reduction compared to the best baseline (CNN-Transformer, Test RMSE: 0.1016). This dramatic improvement is underscored by the model's Test $R^2$ of 0.9594, which is substantially higher than that of any other model.
The results on this dataset validate our core hypothesis: that integrating fundamental electrochemical principles into a deep learning architecture enables the model to generalize across unseen batteries and operating conditions, making it highly suitable for deployment in large-scale battery fleets.
5.2. Ablation Study of the Noise Modeling Module
To rigorously evaluate the individual contribution of the noise modeling module to the overall performance of the PI-Transformer, we conduct a controlled ablation study. In this study, we train and evaluate a variant of the PI-Transformer in which the noise modeling module is disabled (i.e., the noise vector $n$ is set to zero). This ablation variant is treated as a sixth comparative model, alongside EKF, LSTM, GRU, Transformer, and the full PI-Transformer.
The results of the ablation study, presented in Table 4, demonstrate that the noise modeling module plays a critical role in enhancing the model's robustness under challenging operating conditions. Specifically, when the noise module is disabled, the model's RMSE increases by 13.3% on the NASA 4 °C dataset and 18.4% on the Braatz dataset. This significant performance degradation confirms that the noise module is not merely a redundant component but a key innovation that enables the model to adapt to sensor noise and battery degradation.
Figure 3 compares the Test RMSE of the full PI-Transformer with its ablation variant across the three experimental settings. The chart clearly illustrates that the performance gap between the models widens as the operating conditions become more challenging. This trend underscores the increasing importance of the noise module in mitigating the effects of sensor noise and battery degradation as the system complexity grows.
In summary, the ablation study provides strong empirical evidence that the noise modeling module is a key contributor to the PI-Transformer’s superior performance, particularly under aged and noisy conditions. This finding underscores the importance of incorporating adaptive noise modeling mechanisms into physics-informed deep learning frameworks for battery state estimation.
5.3. Qualitative Analysis of Temporal Dynamics
The qualitative insights from Figure 4 and Figure 5 provide visual corroboration of these quantitative findings. Figure 4, depicting a representative battery from the validation set, shows that the PI-Transformer maintains near-perfect alignment with the ground truth throughout the entire duration, even during the sharp transitions of fast-charging and discharging. In contrast, the best baseline model exhibits noticeable lag and overshoot. The bottom row of plots, which displays the absolute error and per-cycle error metrics, quantifies this observation, with the PI-Transformer consistently producing smaller and more stable errors.
Crucially, Figure 5, which visualizes a battery from the test set, demonstrates that this superior performance is not an artifact of overfitting to the validation data. The PI-Transformer exhibits the same high fidelity and stability on this unseen battery, confirming its strong cross-battery generalization capability. This visual evidence, combined with the quantitative results in Table 3, provides compelling, multi-faceted proof of the model's robustness and practical applicability.
5.4. Conclusion on SOC Estimation Advancement
In conclusion, the comprehensive experimental evaluation unequivocally demonstrates that the proposed PI-Transformer framework represents a significant advancement in the field of lithium-ion battery SOC estimation. By seamlessly integrating fundamental electrochemical principles into a deep learning architecture, the model achieves unparalleled accuracy, robustness, and generalization.
Its ability to maintain high performance under diverse and challenging conditions makes it a highly promising candidate for deployment in next-generation battery management systems for electric vehicles and grid-scale energy storage. The success of this approach paves the way for future research into hybrid physics data models that can operate reliably in complex, dynamic environments.
6. Conclusions and Discussion
This paper presents the Physics-Informed Transformer (PI-Transformer), a novel hybrid framework that achieves state-of-the-art performance in lithium-ion battery state of charge (SOC) estimation by seamlessly integrating fundamental electrochemical principles with the powerful sequence modeling capabilities of the Transformer architecture. We emphasize that physical constraints are not enforced via an additional loss term but are embedded as an architectural component within the forward pass—ensuring end-to-end differentiability and enabling the self-attention mechanism to dynamically weight physical and data features. Comprehensive evaluation across three experimental settings demonstrates superior accuracy, robustness, and generalization, with the most pronounced advantage under challenging conditions such as low temperatures and advanced battery degradation. Ablation studies confirm that the adaptive noise modeling module is critical for maintaining performance under noisy and aged conditions, with RMSE increasing by up to 18.4% when it is disabled.
While this work focuses on SOC estimation, several limitations warrant acknowledgment: (1) evaluation is currently limited to Li-ion/LFP chemistries; (2) reliance on accurate temperature measurements may limit practical deployment; (3) computational cost may require optimization for real-time edge inference; and (4) the lack of uncertainty quantification limits safety-critical applications. Future research will address these by extending to multi-chemistry batteries, integrating temperature estimation from voltage–current dynamics, developing lightweight variants for embedded systems, and incorporating Bayesian neural networks for uncertainty-aware predictions. By successfully marrying the interpretability of physics with the flexibility of deep learning, the PI-Transformer represents a significant step forward in intelligent, reliable, and scalable battery management for aging energy storage systems.