State of Charge Estimation of Lithium-Ion Batteries Based on Hidden Markov Factor Graphs

Fang, Wei; Su, Zhi-Jian; Shao, Yu-Tong; Wu, Guang-Ping; Liu, Peng

doi:10.3390/math13182922


by Wei Fang ¹, Zhi-Jian Su ², Yu-Tong Shao ¹, Guang-Ping Wu ² and Peng Liu ¹,*

¹ School of Electrical and Control Engineering, North University of China, Taiyuan 030051, China

² School of Chemistry and Chemical Engineering, North University of China, Taiyuan 030051, China

* Author to whom correspondence should be addressed.

Mathematics 2025, 13(18), 2922; https://doi.org/10.3390/math13182922

Submission received: 4 August 2025 / Revised: 6 September 2025 / Accepted: 8 September 2025 / Published: 10 September 2025

(This article belongs to the Special Issue Recent Developments in Disturbance Rejection Control Theory and Applications)


Abstract

Lithium-ion batteries serve as critical energy storage devices and are extensively utilized across diverse applications. Accurate estimation of the State of Charge (SOC) is critically important for Battery Management Systems. Traditional SOC estimation methods, such as the Extended Kalman Filter (EKF) and the particle filter, have achieved progress. However, when the battery model parameters are uncertain and change dynamically with operating conditions, the EKF tends to accumulate errors, which degrades estimation accuracy. This paper proposes a hybrid approach integrating the EKF with a Hidden Markov Factor Graph (HMM-FG). First, the method uses the EKF to obtain a real-time estimate of the SOC. Then, the HMM-FG treats the EKF estimate as an observation and combines it with current and voltage measurement data. Factor functions describe the temporal correlation of the SOC and the uncertainty of the EKF modeling errors, and the SOC is corrected by Maximum A Posteriori (MAP) estimation. Unlike the traditional EKF, this method can use future observation information to suppress the error accumulation of the EKF under dynamic parameter changes. Experiments were conducted at different temperatures (0 °C, 25 °C, 45 °C) and under different dynamic operating conditions (FUDS, DST), and the method was compared with the EKF, the Extended Kalman Smoother (EKS), and a data-driven method based on LSTM.

Keywords:

lithium-ion battery; hidden Markov model; factor graph; extended Kalman filter; maximum a posteriori estimation

MSC:

68U35

1. Introduction

As global energy crises intensify and environmental awareness rises, the utilization of clean energy and electric tools has expanded progressively. Lithium-ion batteries are efficient energy storage devices that are extensively employed in electric vehicles, power grids, and renewable energy systems. In practical applications, however, the safety, performance, and lifespan of the battery depend on the Battery Management System (BMS). State of Charge (SOC) estimation is a core technology of the BMS [1,2].

The SOC of a battery quantifies the ratio between remaining capacity and nominal capacity. It is a critical indicator for measuring the currently available capacity of the battery. Accurate SOC estimation not only ensures the safe operation of the battery but also prevents occurrences such as overcharging and over-discharging. It also prolongs the service life of the battery [3]. However, due to the complex electrochemical characteristics of lithium-ion batteries under different operating conditions, the SOC is closely related to multiple parameters, including operating current, voltage, and temperature. These parameters possess strong nonlinearity and time-variability. Therefore, SOC estimation has become a highly challenging task.

The current SOC estimation approaches for lithium-ion batteries fall into three primary categories: direct measurement, model-based, and data-driven methods [4]. Direct measurement methods estimate the SOC by measuring the open-circuit voltage of the battery or by Coulomb counting. These methods are simple and intuitive, but they are vulnerable to initial state uncertainty and measurement noise, and they often fail to estimate the SOC accurately, especially under complex working conditions. Model-based methods conduct online SOC estimation using equivalent circuit or electrochemical battery models together with filters such as the Extended Kalman Filter (EKF), Cubature Kalman Filter (CKF), and Unscented Kalman Filter (UKF) [5,6,7]. These methods rely on accurate battery models [8]. In practical applications, the battery parameters often change with temperature, state of charge, and aging [9], which reduces the accuracy of SOC estimation. Data-driven methods use machine learning algorithms [10,11] to estimate the SOC from a large amount of historical data. In recent years, deep learning models (such as Long Short-Term Memory (LSTM) networks and attention-based Transformers) have been widely applied to SOC prediction and estimation. These methods can automatically extract complex temporal features and have certain advantages in handling nonlinear operating conditions. However, their performance depends heavily on the completeness and distribution consistency of the training data, and they generalize poorly under temperature variations, aging, and unseen operating conditions. These models are also difficult to interpret, which makes it hard to meet the requirements for transparency and robustness in safety-critical applications.

The Hidden Markov Model (HMM) is a statistical framework for time-series analysis. It performs well in modeling dynamic systems with hidden states. The HMM infers the hidden states of the system through the observed data. It is suitable for describing the variation process of SOC. Meanwhile, the Factor Graph (FG) is an effective probabilistic graphical model that can represent the dependencies between variables and enables efficient inference and state estimation using message-passing algorithms [12].

This paper proposes an SOC estimation method combining the Hidden Markov Factor Graph (HMM-FG) and EKF. The HMM-FG models battery SOC dynamics via probabilistic dependencies among factor nodes. It facilitates joint inference using historical and real-time measurements. The HMM-FG can optimize the marginal probability of SOC via maximum a posteriori (MAP) estimation. Thereby, it can effectively reduce errors in EKF and improve estimation accuracy. In addition, the introduction of the FG allows the estimation process to exhibit better robustness and adaptability under dynamic load and environmental changes. The experimental verification under different temperatures and different operating conditions demonstrates that the proposed method not only outperforms the traditional EKF method in estimation accuracy but also shows strong real-time performance and adaptability.

2. Modeling of Lithium-Ion Batteries

2.1. Second-Order Thevenin Model

The main lithium-ion battery models include three types: the resistance and capacitance (RC) model, the electrochemical model, and the Thevenin equivalent circuit model. The RC model is a simplified battery model that describes the dynamic response of the battery through resistive and capacitive components. It is suitable for SOC estimation on short time scales but may exhibit poor performance in capturing the nonlinear characteristics of the battery over long time scales. The electrochemical model is a model that elaborately describes the internal chemical reactions and charge transfer processes of the battery. The Thevenin equivalent circuit model models the lithium-ion battery as an equivalent circuit, which includes a voltage source, internal resistance, and battery capacitance. The response of the battery under different conditions is analyzed by studying the circuit.

In Figure 1, OCV refers to the open-circuit voltage of the battery; $U (t)$ denotes the terminal voltage of the battery; $R_{0}$ represents the ohmic internal resistance of the battery; $I (t)$ stands for the operating current; $R_{1}, C_{1}$ are the polarization resistance and capacitance of the battery, respectively; $R_{2}, C_{2}$ are the concentration polarization resistance and capacitance of the battery, respectively; and $U_{1}, U_{2}$ indicate the voltages of the two RC loops, respectively.

According to Kirchhoff’s laws, the circuit model in Figure 1 can be expressed as follows:

\begin{cases} I (t) = \frac{U_{1}}{R_{1}} + C_{1} \frac{d U_{1}}{d t} \\ I (t) = \frac{U_{2}}{R_{2}} + C_{2} \frac{d U_{2}}{d t} \\ OCV = U (t) + R_{0} I (t) + U_{1} + U_{2} \end{cases}

(1)

where the current $I (t)$ is defined as the input, the terminal voltage $U (t)$ is the observed quantity, and $x = {[SOC \; U_{1} \; U_{2}]}^{T}$ is selected as the state variable of the system. In this paper, the forward Euler method (also known as the forward difference method) is used to discretize Equation (1). Its basic idea is to approximate the time derivative of the state by a finite difference:

\dot{x} \approx \frac{x (k + 1) - x (k)}{T}

where $T$ is the sampling period and $\dot{x}$ is the time derivative of the state. The state-space equation of a linear time-invariant continuous system is as follows:

\begin{cases} \dot{x} = A x + B u \\ y = C x + D u \end{cases}

Substituting the difference approximation into the state equation yields

\frac{x (k + 1) - x (k)}{T} = A x (k) + B u (k)

After rearrangement, the discretized state equation is obtained as follows:

x (k + 1) = (I_{n} + T A) x (k) + T B u (k)

where $I_{n}$ denotes the identity matrix of the same dimension as $A$.

The observation equation is the following:

y (k) = H x (k) + J u (k)

where $H = C$ and $J = D$.

In summary, the discretized state-space equations take the following form:

\begin{cases} x (k + 1) = F x (k) + G u (k) \\ y (k) = H x (k) + J u (k) \end{cases}

where $F = I_{n} + T A$ and $G = T B$.

The discretized equation of Equation (1) is obtained according to the above method.

\begin{bmatrix} SOC (k + 1) \\ U_{1} (k + 1) \\ U_{2} (k + 1) \end{bmatrix} = A \begin{bmatrix} SOC (k) \\ U_{1} (k) \\ U_{2} (k) \end{bmatrix} + B I (k) + w (k)

(2)

where ${[SOC (k + 1) \; U_{1} (k + 1) \; U_{2} (k + 1)]}^{T}$ denotes the system state at time $k + 1$, and

A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^{- T / (R_{1} C_{1})} & 0 \\ 0 & 0 & e^{- T / (R_{2} C_{2})} \end{bmatrix}

is the state matrix of the system;

B = {\begin{bmatrix} - \frac{η T}{Q} & R_{1} (1 - e^{- T / (R_{1} C_{1})}) & R_{2} (1 - e^{- T / (R_{2} C_{2})}) \end{bmatrix}}^{T}

is the control matrix of the system; $T$ represents the sampling time; $k$ denotes the discrete time index; $Q$ stands for the battery capacity; $η$ is the charge-discharge efficiency of the battery; and $w (k)$ is the process noise of the system.

Similarly, the discretized observation equation of the system is the following:

U (k) = OCV (SOC (k)) - I (k) R_{0} - U_{1} (k) - U_{2} (k) + v (k)

(3)

where $v (k)$ represents the observation noise of the system.
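As a minimal sketch of Equations (2) and (3), the following Python code builds the discrete matrices and propagates the noise-free state under a constant discharge current. The parameter values, the linear `ocv` placeholder, and the function names `thevenin_matrices` and `terminal_voltage` are assumptions made for illustration only; they are not the identified values of this paper.

```python
import numpy as np

def thevenin_matrices(R1, C1, R2, C2, Q_As, T=1.0, eta=1.0):
    """State and input matrices of Equation (2); Q_As is the capacity in ampere-seconds."""
    a1 = np.exp(-T / (R1 * C1))
    a2 = np.exp(-T / (R2 * C2))
    A = np.diag([1.0, a1, a2])
    B = np.array([-eta * T / Q_As, R1 * (1.0 - a1), R2 * (1.0 - a2)])
    return A, B

def terminal_voltage(x, current, R0, ocv):
    """Noise-free observation of Equation (3)."""
    soc, u1, u2 = x
    return ocv(soc) - current * R0 - u1 - u2

# Hypothetical parameters, chosen only to make the example run.
A, B = thevenin_matrices(R1=0.02, C1=1500.0, R2=0.03, C2=20000.0, Q_As=2.0 * 3600.0)
ocv = lambda soc: 3.0 + 1.2 * soc        # placeholder OCV curve, not a fitted one

x = np.array([0.8, 0.0, 0.0])            # initial [SOC, U1, U2]
for _ in range(600):                     # 600 steps of 1 A discharge at T = 1 s
    x = A @ x + B * 1.0
u = terminal_voltage(x, 1.0, R0=0.07, ocv=ocv)
print(x, u)
```

With a positive current denoting discharge, the SOC entry decreases linearly while the two RC voltages rise toward $R_{1} I$ and $R_{2} I$ with their respective time constants.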

2.2. OCV-SOC Fitting

The relationship between a battery’s OCV and SOC constitutes a significant research subject in the study of battery electrical performance. The SOC refers to the percentage of the battery’s current stored electric quantity. The OCV is the voltage value across the battery terminals under no-load conditions. During the battery’s charging and discharging processes, the OCV varies with the changes in SOC. Thus, there exists a certain functional relationship between the SOC and OCV. This relationship is typically non-linear and takes the form of a complex curve. The battery’s voltage is influenced by multiple factors, such as chemical reactions within the battery, temperature, load, and historical charging and discharging conditions. It can usually be obtained through experimental measurements and represented by a fitting curve. In this study, experimental data from the incremental current open-circuit voltage (OCV) test were used for polynomial fitting of different orders. The optimal order of the polynomial fitting was selected by analyzing the experimental results.

Analysis of the data in Table 1 reveals that for polynomial orders ranging from 1 to 5, both the RMSE and MAE gradually decrease as the polynomial order increases, with fitting accuracy continuously improving. When the order reaches 6 to 7, the RMSE and MAE are close to zero, basically reaching the accuracy limit of numerical calculation. Although polynomials of excessively high orders can further reduce errors numerically, they may introduce unnecessary fluctuations or oscillations, which are inconsistent with the physical laws governing the battery’s SOC–OCV relationship. Therefore, on the premise of ensuring accuracy, the 6th-order polynomial is the optimal choice. The experimental steps for the 6th-order polynomial fitting are as follows:

(1): Fully charge the battery to 100% SOC.
(2): Discharge the battery with a negative pulse current, followed by a relaxation period, at every 10% SOC.
(3): Recharge the battery following the same procedure but using positive-pulse current.
(4): Apply averaging and linear interpolation steps to obtain the OCV-SOC curves at 0 °C, 25 °C, and 45 °C.

The fitting results are shown in Figure 2.
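The order-selection procedure described above can be sketched as follows. The OCV samples here are synthetic placeholder data generated from an arbitrary smooth curve, not the measurements behind Table 1, and `fit_errors` is a hypothetical helper name.

```python
import numpy as np

# Synthetic OCV-SOC samples at every 10% SOC (placeholder data, not measurements).
soc = np.linspace(0.0, 1.0, 11)
ocv = 3.0 + 1.2 * soc - 0.6 * soc**2 + 0.55 * soc**3 + 0.05 * np.sin(6.0 * soc)

def fit_errors(order):
    """Least-squares polynomial fit of the given order; returns (RMSE, MAE) in volts."""
    coeffs = np.polyfit(soc, ocv, order)
    resid = np.polyval(coeffs, soc) - ocv
    return np.sqrt(np.mean(resid**2)), np.mean(np.abs(resid))

errors = {order: fit_errors(order) for order in range(1, 8)}
for order, (rmse, mae) in errors.items():
    print(f"order {order}: RMSE = {rmse:.2e} V, MAE = {mae:.2e} V")
```

Because the errors are evaluated only at the fitted points, they cannot reveal oscillation between the points; evaluating the fitted polynomial on a denser SOC grid is one way to check for that.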

2.3. Model Parameter Identification

In this paper, the forgetting factor recursive least squares (FFRLS) method is used to identify the parameters $[R_{0}, R_{1}, R_{2}, C_{1}, C_{2}]$ of the second-order Thevenin model. The formulas of FFRLS are as follows:

\begin{cases} {\hat{θ}}_{k} = {\hat{θ}}_{k - 1} + K_{k} (y_{k} - φ_{k}^{T} {\hat{θ}}_{k - 1}) \\ K_{k} = p_{k} φ_{k} = p_{k - 1} φ_{k} {(λ + φ_{k}^{T} p_{k - 1} φ_{k})}^{- 1} \\ p_{k} = \frac{1}{λ} [I - K_{k} φ_{k}^{T}] p_{k - 1} \end{cases}

(4)

where ${\hat{θ}}_{k}$ is the parameter estimate and $φ_{k}$ is the regressor vector at time instant $k$; $K_{k}$ represents the gain term; $y_{k}$ denotes the observed value; $λ$ is the forgetting factor, which satisfies $0 < λ < 1$ and generally takes a value in the range $0.95 < λ < 1$; and $p_{k}$ represents the covariance at time instant $k$.
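The recursion in Equation (4) can be sketched in a few lines of Python. The example below applies it to a generic linear regression with synthetic data; the mapping from the second-order Thevenin model to the regressor $φ_{k}$ is not given in this section, so it is not reproduced here. The function name `ffrls` and all numerical values are assumptions.

```python
import numpy as np

def ffrls(phis, ys, lam=0.98, p0=1e4):
    """Forgetting-factor RLS for y_k = phi_k^T theta + noise; returns the estimate history."""
    n = phis.shape[1]
    theta = np.zeros(n)
    P = p0 * np.eye(n)                      # large initial covariance: weak prior
    history = []
    for phi, y in zip(phis, ys):
        K = P @ phi / (lam + phi @ P @ phi)            # gain
        theta = theta + K * (y - phi @ theta)          # correct with the prediction error
        P = (np.eye(n) - np.outer(K, phi)) @ P / lam   # covariance update with forgetting
        history.append(theta.copy())
    return np.array(history)

# Synthetic regression problem (illustrative values only).
rng = np.random.default_rng(0)
true_theta = np.array([0.5, -0.2, 0.07])
phis = rng.normal(size=(500, 3))
ys = phis @ true_theta + 1e-3 * rng.normal(size=500)
theta_hat = ffrls(phis, ys)[-1]
print(theta_hat)
```

A forgetting factor closer to 1 averages over more samples and gives smoother estimates, while a smaller value tracks parameter changes faster at the cost of more noise.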

The identification results for $[R_{0}, R_{1}, R_{2}, C_{1}, C_{2}]$ are as follows (Table 2):

After completing the parameter identification, this paper uses the Dynamic Stress Test (DST) condition experimental data to conduct a simulation. A comparison between the actual measured voltage values and the experimentally simulated voltage values is obtained. The results are shown in Figure 3, and the error values in Figure 4.

2.4. Extended Kalman Filter Algorithm (EKF Algorithm)

Based on the second-order equivalent circuit model, the state-space equations are written as follows:

\begin{cases} x_{k} = f (x_{k - 1}, u_{k - 1}) + w_{k} \\ z_{k} = h (x_{k}, u_{k}) + v_{k} \end{cases}

(5)

where $k$ represents the time instant; $x_{k}$ and $z_{k}$ denote the state vector and observation vector of the system, respectively; $u_{k}$ represents the input to the system; $f$ and $h$ denote the state transition function and observation function, respectively; and $w_{k} \sim (0, Q_{k})$ and $v_{k} \sim (0, R_{k})$ denote the zero-mean system noise and measurement noise with covariances $Q_{k}$ and $R_{k}$, respectively. The EKF algorithm is used to estimate the state variables of the system. The specific steps are as follows:

(1): Predict the state ${\hat{x}}_{k}^{-}$ and covariance $P_{k}^{-}$ at the current time step;

{\hat{x}}_{k}^{-} = f ({\hat{x}}_{k - 1}, u_{k - 1})

P_{k}^{-} = F_{k} P_{k - 1} F_{k}^{T} + Q

where $F_{k} = \frac{\partial f}{\partial x}$ is the Jacobian matrix of the state transition function, and $Q$ is the process noise covariance matrix.

(2): Calculate the Kalman gain $K_{k}$;

K_{k} = P_{k}^{-} H_{k}^{T} {(H_{k} P_{k}^{-} H_{k}^{T} + R)}^{- 1}

where $H_{k} = \frac{\partial h}{\partial x}$ is the Jacobian matrix of the observation function, and $R$ is the observation noise covariance matrix.

(3): Update the state estimate ${\hat{x}}_{k}$;

{\hat{x}}_{k} = {\hat{x}}_{k}^{-} + K_{k} (z_{k} - h ({\hat{x}}_{k}^{-}, u_{k}))

(4): Update the covariance matrix $P_{k}$;

P_{k} = (I - K_{k} H_{k}) P_{k}^{-}
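The four steps above can be sketched for the second-order Thevenin model as follows. This is an illustrative sketch under stated assumptions: the circuit parameters, the linear OCV placeholder `ocv` with derivative `docv`, the noise settings, and the function name `ekf_step` are all hypothetical and are not the values used in the experiments.

```python
import numpy as np

R0, R1, C1, R2, C2 = 0.07, 0.02, 1500.0, 0.03, 20000.0   # hypothetical parameters
Q_AS, T = 2.0 * 3600.0, 1.0                              # capacity in A*s, sampling time in s
a1, a2 = np.exp(-T / (R1 * C1)), np.exp(-T / (R2 * C2))
F = np.diag([1.0, a1, a2])                               # state transition is linear here
G = np.array([-T / Q_AS, R1 * (1 - a1), R2 * (1 - a2)])

def ocv(soc):                 # placeholder OCV curve (assumption)
    return 3.0 + 1.2 * soc

def docv(soc):                # derivative of the placeholder curve
    return 1.2

def h(x, i):                  # observation function, Equation (3) without noise
    return ocv(x[0]) - i * R0 - x[1] - x[2]

def ekf_step(x, P, i_prev, i_now, z, Qn, Rn):
    x_pred = F @ x + G * i_prev                          # step (1): predict state
    P_pred = F @ P @ F.T + Qn                            #           and covariance
    H = np.array([[docv(x_pred[0]), -1.0, -1.0]])        # Jacobian of h
    S = H @ P_pred @ H.T + Rn
    K = P_pred @ H.T / S                                 # step (2): gain (S is 1x1)
    x_new = x_pred + (K * (z - h(x_pred, i_now))).ravel()   # step (3): state update
    P_new = (np.eye(3) - K @ H) @ P_pred                 # step (4): covariance update
    return x_new, P_new

# Simulated 1 A discharge; the filter starts from a deliberately wrong SOC.
rng = np.random.default_rng(0)
x_true = np.array([0.8, 0.0, 0.0])
x_est, P = np.array([0.6, 0.0, 0.0]), np.diag([0.1, 1e-4, 1e-4])
Qn, Rn = np.diag([1e-10, 1e-8, 1e-8]), 25e-6
for _ in range(600):
    x_true = F @ x_true + G * 1.0
    z = h(x_true, 1.0) + 0.005 * rng.normal()
    x_est, P = ekf_step(x_est, P, 1.0, 1.0, z, Qn, Rn)
print(x_true[0], x_est[0])
```

In this sketch the simulated plant and the filter share the same parameters, so the error-accumulation effect discussed in this paper (caused by parameter mismatch) does not appear.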

3. Hidden Markov Factor Graph Modeling

3.1. Factor Graph (FG)

A factor graph is a type of probabilistic graphical model [13,14,15] that represents the conditional dependencies between variables and is suitable for reasoning and computation. It is widely applied in machine learning, signal processing, and information theory to solve complex probabilistic problems, such as inference in Bayesian networks and Markov random fields.

A factor graph is a kind of bipartite graph, which is mainly made up of variable nodes and factor nodes. The variable nodes stand for random variables, and they are usually represented by circles, with each variable node corresponding to one random variable. The factor nodes indicate the relationships or functions between variables. They are typically a factor of the probability distribution, generally represented by squares. The variable nodes connected by a factor node signify a certain dependent relationship between these variables. A factor graph breaks down complex global computations into multiple local ones, thereby simplifying the reasoning process.

Suppose there is a global function $g$ of the random variables $\{X_{1}, X_{2}, \dots, X_{n}\}$. It can be decomposed into multiple factors $f_{a}, f_{b}, \dots$ in the following form:

g (X_{1}, X_{2}, \dots, X_{n}) = \prod_{i} f_{i} (S_{i})

(6)

where $S_{i}$ represents the subset of variables on which $f_{i}$ depends.

An important application of factor graphs is inference on the graph, such as using the sum-product algorithm or belief propagation algorithm [16] to calculate marginal probability distributions by passing messages in the graph. This is particularly useful for high-dimensional probability distributions, where the complexity of direct computation is relatively high.

Let $g (X_{1}, X_{2}, X_{3}, X_{4}, X_{5})$ be a function of five variables that can be expressed as a product:

g (X_{1}, X_{2}, X_{3}, X_{4}, X_{5}) = f_{a} (X_{1}) f_{b} (X_{2}) f_{c} (X_{1}, X_{2}, X_{3}) f_{d} (X_{3}, X_{4}) f_{e} (X_{4}, X_{5})

(7)

Its factor graph representation is shown in Figure 5, where $X_{1}, X_{2}, X_{3}, X_{4}, X_{5}$ are the variable nodes and $f_{a}, f_{b}, f_{c}, f_{d}, f_{e}$ are the factor nodes.
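To illustrate how the factorization in Equation (7) reduces a global computation to local ones, the following sketch computes the unnormalized marginal of $X_{3}$ in two ways: by summing $g$ over all configurations, and by combining local messages. The binary variables and the random factor tables are assumptions chosen only for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
# Random non-negative factor tables over binary variables (illustrative values).
fa, fb = rng.random(2), rng.random(2)
fc = rng.random((2, 2, 2))
fd, fe = rng.random((2, 2)), rng.random((2, 2))

def g(x1, x2, x3, x4, x5):
    """Global function of Equation (7)."""
    return fa[x1] * fb[x2] * fc[x1, x2, x3] * fd[x3, x4] * fe[x4, x5]

# Brute force: sum the global function over all 2^5 configurations.
brute = np.zeros(2)
for x in itertools.product(range(2), repeat=5):
    brute[x[2]] += g(*x)

# Local computation: messages arriving at X3 from f_c and from f_d.
msg_fc_to_x3 = np.einsum("i,j,ijk->k", fa, fb, fc)   # sums out X1 and X2
msg_x4_to_fd = fe.sum(axis=1)                        # f_e -> X4 (sums out X5), forwarded by X4
msg_fd_to_x3 = fd @ msg_x4_to_fd                     # sums out X4
marginal_x3 = msg_fc_to_x3 * msg_fd_to_x3

print(brute, marginal_x3)
```

Both results agree, but the message-based computation never enumerates the joint configurations, which is what makes it scale to long chains such as the HMM considered next.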

3.2. Hidden Markov Model Representation in Factor Graph

A Hidden Markov Model is a statistical model which is widely used in time series data [17,18]. It assumes that the system is a Markov process, and the states are not directly observable. The states of the system are inferred indirectly through a set of related observation values. In the problem of lithium battery estimation, the observation values such as voltage, current, and temperature are usually required. They are combined with a battery model to estimate the SOC of the battery. HMM is a tool capable of handling processes with uncertainty and dynamics. It is suitable for the estimation of a battery’s SOC.

HMM mainly consists of the following five parts:

(1): State set: $S = \{S_{1}, S_{2}, \dots, S_{t}\}$ represents all hidden states of the system. At each time point, the system is in a state that cannot be directly observed.
(2): Observation set: $O = \{O_{1}, O_{2}, \dots, O_{t}\}$ represents the observed output sequence. At each time point, the system outputs an observable value, which depends on the current hidden state.
(3): Initial state probability distribution: $τ = \{τ_{i}\}$ represents the probability distribution of the system being in each hidden state at time point $t = 1$. $τ_{i} = P (X_{1} = S_{i})$ represents the probability that the system is in state $S_{i}$ at the initial moment.
(4): State transition probability matrix: $A = \{a_{i j}\}$ represents the probability of the system transitioning from one hidden state to another. $a_{i j} = P (X_{t + 1} = S_{j} | X_{t} = S_{i})$ represents the probability of transitioning from state $S_{i}$ to state $S_{j}$ at time $t$.
(5): Observation probability matrix: $B = \{b_{j} (O_{t})\}$ represents the probability that the system outputs a certain observation value given a hidden state. $b_{j} (O_{t}) = P (O_{t} = o_{t} | X_{t} = S_{j})$ represents the probability of observing $O_{t}$ under $S_{j}$.

The hidden Markov factor graph model of lithium batteries can reflect the connections between various variables. It provides a powerful probabilistic framework for SOC estimation and realizes real-time estimation of SOC through message passing algorithms.

The SOC is taken as the hidden state sequence. The terminal voltage U is modeled as an observation. The model is shown in Figure 6.

In the figure, $S = \{soc_{1}, soc_{2}, soc_{3}, \dots\}$ is the true SOC of the battery, which is a hidden state that cannot be directly observed and serves as a variable node in the factor graph. $Y = \{y_{1}, y_{2}, y_{3}, \dots\}$ is the observation sequence, where $y_{t} = U_{t}$ represents the terminal voltage, which serves as the observation input in the HMM-FG. $f = \{f_{1}, f_{2}, f_{3}, \dots\}$ is the state transition factor, and $f (soc_{t + 1} | soc_{t})$ denotes the probability of the SOC changing between adjacent time steps, which can be defined based on the charge-discharge curve of the battery or the equivalent circuit model of the battery. $g = \{g_{1}, g_{2}, g_{3}, \dots\}$ is the observation factor. The voltage observation factor $g (soc_{t}, U_{t})$ is a Gaussian-like likelihood function calculated from the difference between the voltage predicted from the SOC (including current, internal resistance, and RC voltage) and the actual voltage measurement.

3.3. Definition of Factor Function

This section will elaborate on the specific factor functions.

In this study, the SOC estimation result of the EKF is used as reference information to model and constrain the state transition factor, which helps predict the SOC at the next moment, instead of relying on the circuit model.

Let the SOC estimates of the EKF at time steps $t$ and $t + 1$ be $SOC_{t}^{ekf}$ and $SOC_{t + 1}^{ekf}$, respectively. Their difference $Δ soc_{t}^{ekf} = SOC_{t + 1}^{ekf} - SOC_{t}^{ekf}$ can be used as the reference mean value for the state change. Thus, the state transition of the SOC can be modeled as follows:

\begin{array}{l} f (soc_{t + 1} | soc_{t}) = P (soc_{t + 1} | soc_{t}) \\ = N (soc_{t + 1} - soc_{t}; Δ soc_{t}^{ekf}, σ_{soc}^{2}) \\ = \frac{1}{\sqrt{2 π σ_{soc}^{2}}} \exp (- \frac{{((soc_{t + 1} - soc_{t}) - Δ soc_{t}^{ekf})}^{2}}{2 σ_{soc}^{2}}) \end{array}

(8)

The observation factor adopts a Gaussian form based on the OCV-SOC curve and is defined as

\begin{array}{l} g (U_{t} | soc_{t}) = N (U_{t}; OCV (soc_{t}), σ_{U}^{2}) \\ = \frac{1}{\sqrt{2 π σ_{U}^{2}}} \exp (- \frac{{(U_{t} - OCV (soc_{t}))}^{2}}{2 σ_{U}^{2}}) \end{array}

(9)
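The two factor functions in Equations (8) and (9) can be evaluated on a discretized SOC grid, as in the following sketch. The grid resolution, the standard deviations, the linear `ocv` placeholder, and the function names are assumptions for illustration and are not values reported in this paper.

```python
import numpy as np

def gaussian(x, mean, sigma):
    """Normal density N(x; mean, sigma^2)."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / np.sqrt(2 * np.pi * sigma**2)

def transition_factor(soc_next, soc_now, d_soc_ekf, sigma_soc):
    """Equation (8): Gaussian on the SOC increment, centered on the EKF increment."""
    return gaussian(soc_next - soc_now, d_soc_ekf, sigma_soc)

def observation_factor(u_t, soc_t, ocv, sigma_u):
    """Equation (9): Gaussian likelihood of the voltage around OCV(soc_t)."""
    return gaussian(u_t, ocv(soc_t), sigma_u)

grid = np.linspace(0.0, 1.0, 101)          # candidate SOC values
ocv = lambda s: 3.0 + 1.2 * s              # placeholder OCV curve (assumption)

# Transition factor for soc_t = 0.80 when the EKF reports an increment of -0.01.
f_vals = transition_factor(grid, 0.80, d_soc_ekf=-0.01, sigma_soc=0.005)
# Observation factor for a voltage of 3.96 V.
g_vals = observation_factor(3.96, grid, ocv, sigma_u=0.01)

print(grid[np.argmax(f_vals)], grid[np.argmax(g_vals)])
```

The transition factor peaks at the SOC implied by the EKF increment, while the observation factor peaks at the SOC whose OCV matches the voltage; the message passing described in the following sections combines the two.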

4. Sum-Product Algorithm Rules

The sum-product algorithm is a message-passing technique used for probabilistic inference in graphical models (such as factor graphs). It calculates marginal distributions efficiently by passing messages in the graph. First, define the messages in the sum-product algorithm. The message $m_{X_{i} \to f_{j}} (X_{i})$ from variable node $X_{i}$ to factor node $f_{j}$ is calculated by multiplying the messages from the other adjacent factor nodes:

m_{X_{i} \to f_{j}} (X_{i}) = \prod_{f_{k} \in τ (X_{i}) \setminus f_{j}} m_{f_{k} \to X_{i}} (X_{i})

(10)

The message $m_{f_{j} \to X_{i}} (X_{i})$ from the factor node $f_{j}$ to the variable node $X_{i}$ is defined as follows:

m_{f_{j} \to X_{i}} (X_{i}) = \sum_{X_{τ} \setminus X_{i}} (f_{j} (X_{τ}) \prod_{X_{k} \in τ (f_{j}) \setminus X_{i}} m_{X_{k} \to f_{j}} (X_{k}))

(11)

In the formula, $X_{τ}$ represents all variables adjacent to the factor node $f_{j}$, $τ (X_{i})$ denotes the set of neighboring factor nodes of the variable node $X_{i}$, and $τ (f_{j})$ denotes the set of neighboring variable nodes of the factor node $f_{j}$.

The sum-product algorithm can be divided into the following steps:

(1): Initialize information: Send initial messages for each node, which are usually uniform distributions or prior distributions.
(2): Iterative message passing: Calculate the messages from each variable node to the factor node and calculate the messages from each factor node to its adjacent variable nodes.
(3): Terminate iteration: Stop the message passing when it converges.
(4): Calculate marginal probability: For each variable node, calculate the marginal probability distribution by combining all incoming edge messages, as follows:

P (X_{i}) \propto \prod_{f_{k} \in τ (X_{i})} m_{f_{k} \to X_{i}} (X_{i})

(12)

Through the sum-product algorithm, we can efficiently calculate the marginal distribution of variables. Then, we perform maximum a posteriori probability estimation. Thereby, the accurate estimation of SOC is achieved.

5. Implementation of the Sum-Product Algorithm

To effectively estimate the marginal distribution of the SOC of lithium batteries and improve the estimation efficiency, this paper adopts the sum-product algorithm for message passing. In the factor graph, the messages typically flow from the variable nodes to factor nodes and from the factor nodes to variable nodes. The state transition factor nodes are calculated based on the current SOC and the SOC at the previous moment. The main direction of message passing is from the factor nodes to variable nodes. When calculating the message from a factor node to the SOC, it is usually necessary to sum over all possible previous states. At the same time, the messages from the SOC to the factor are not required. Based on the hidden Markov factor graph of lithium batteries, we define its message passing as follows:

Message from the state factor node to the SOC;

m_{f_{s o c} \to s o c_{t}} (s o c_{t}) = \sum_{s o c_{t - 1}} f_{s o c} (s o c_{t - 1}, s o c_{t}) * m_{s o c_{t - 1} \to f_{s o c}} (s o c_{t - 1})

(13)

Message from the observation factor node to the SOC;

m_{g \to s o c_{t}} (s o c_{t}) = g_{1} (U_{t} | s o c_{t}) * m_{s o c_{t - 1} \to g} (s o c_{t - 1})

(14)

By calculating the messages from each factor node to the SOC using the above formulas, we combine them to obtain the marginal distribution of the SOC. The formula for calculating the marginal distribution is the following:

P (s o c_{t}) \propto m_{f_{s o c} \to s o c_{t}} (s o c_{t}) * m_{g \to s o c_{t}} (s o c_{t})

(15)

Finally, the calculated marginal distribution is normalized to ensure that the sum of probabilities equals 1.

P (s o c_{t}) = \frac{P (s o c_{t})}{\sum_{s o c_{t}} P (s o c_{t})}

(16)

After obtaining the marginal distribution, the estimation is achieved through the MAP probability. In the discretized factor graph framework, the posterior probability is determined by the joint message passing of the state transition factors and the observation factors.
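A minimal sketch of the message passing in Equations (13), (15), and (16) on a discretized SOC grid is given below. It makes several assumptions that are not taken from this paper: only the forward pass is implemented (using future observations would additionally require a backward pass), the observation message is taken as the leaf likelihood $g (U_{t} | soc_{t})$, the voltage is synthetic and compared directly with a placeholder OCV curve, and the function name `hmm_fg_forward` and all numerical settings are hypothetical.

```python
import numpy as np

def gaussian(x, mean, sigma):
    """Unnormalized Gaussian; constants cancel in the normalization step."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

def hmm_fg_forward(soc_ekf, u_meas, ocv, grid, sigma_soc=0.01, sigma_u=0.02):
    """Forward sum-product pass; returns the MAP SOC on the grid at every step."""
    belief = gaussian(grid, soc_ekf[0], 0.05)        # prior centered on the first EKF value
    belief /= belief.sum()
    estimates = [grid[np.argmax(belief)]]
    diff = grid[None, :] - grid[:, None]             # diff[i, j] = grid[j] - grid[i]
    for t in range(1, len(soc_ekf)):
        d_ekf = soc_ekf[t] - soc_ekf[t - 1]          # EKF increment used as transition mean
        trans = gaussian(diff, d_ekf, sigma_soc)     # f(soc_t = grid[j] | soc_{t-1} = grid[i])
        msg_f = belief @ trans                       # Eq. (13): sum over soc_{t-1}
        msg_g = gaussian(u_meas[t], ocv(grid), sigma_u)   # observation message
        belief = msg_f * msg_g                       # Eq. (15)
        belief /= belief.sum()                       # Eq. (16)
        estimates.append(grid[np.argmax(belief)])    # MAP estimate on the grid
    return np.array(estimates)

# Synthetic test case: an EKF estimate with slowly accumulating error.
rng = np.random.default_rng(0)
n = 200
soc_true = 0.8 - 0.0005 * np.arange(n)
soc_ekf = soc_true + 0.0002 * np.arange(n)
ocv = lambda s: 3.0 + 1.2 * s                        # placeholder OCV curve
u_meas = ocv(soc_true) + 0.005 * rng.normal(size=n)  # synthetic voltage observations
grid = np.linspace(0.0, 1.0, 401)

soc_fg = hmm_fg_forward(soc_ekf, u_meas, ocv, grid)
err_ekf = np.mean(np.abs(soc_ekf - soc_true))
err_fg = np.mean(np.abs(soc_fg - soc_true))
print(err_ekf, err_fg)
```

In this synthetic case the voltage observation pulls the estimate back toward the true SOC at every step, so the drift of the EKF increments does not accumulate; the numbers say nothing about performance on real battery data.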

6. State Estimation Process

Before starting the state estimation, we need to initialize its prior distribution from historical data. The accuracy of state estimation depends on precise initial data, so the initialization of the prior distribution has a significant impact on the estimation results. The state estimation process is as follows:

(1): Initialize the prior distribution of the SOC variable from historical data, for example from the results obtained by the EKF.
(2): Define the factor functions $f$ and $g$, and perform message passing using the sum-product algorithm.
(3): Calculate the marginal probability of the SOC based on the passed messages and normalize it.
(4): Apply MAP estimation to the marginal distribution of the node and output the estimated SOC. The flow chart is shown in Figure 7.

A small amount of bad data in the historical data does not have a significant impact on the results. However, outliers (such as extremely high or low voltages) may occur during the measurement process. Such bad data cause model deviations and ultimately lead to inaccurate SOC estimation. Bad data may also slow down the convergence of the algorithm or even cause it to fall into a local optimum, which affects the real-time performance of the estimation.

The impact of bad data is multifaceted, which may result in inaccurate and unreliable state estimation of the SOC of the lithium battery. Therefore, in practical applications, it is necessary to take measures to improve data quality, such as increasing data redundancy and monitoring sensor status to reduce the negative impact of bad data.

7. Simulation Experiment

In the EKF–HMM–FG framework proposed in this paper, the HMM–FG component optimizes the entire observation sequence through joint Maximum A Posteriori (MAP) inference. This means that in offline or quasi-offline settings, future observation information can be utilized to improve the SOC estimation at the current moment; thus, the EKF–HMM–FG acts as a smoothing-based improved method under this mode. To ensure fairness in comparison, the Extended Kalman Smoother (EKS) is selected as the model-driven benchmark in the experiments, and a deep learning method based on the Long Short-Term Memory (LSTM) network is further introduced as the data-driven benchmark. The LSTM can learn the dynamic variation relationship of SOC from historical voltage, current, and temperature data, and possesses strong temporal modeling capabilities. However, its performance is highly dependent on the distribution of training data, and it is prone to estimation deviations under out-of-distribution operating conditions. Through comprehensive comparison with the EKS and LSTM methods, the advantages of the proposed EKF–HMM–FG method in terms of estimation accuracy and robustness can be verified more comprehensively.

7.1. Dataset and Experimental Conditions

This study selects the INR 18650-20R lithium-ion battery as the research object and verifies the proposed SOC estimation algorithm under multi-temperature and multi-operating conditions. The experimental temperatures are set to 0 °C, 25 °C, and 45 °C to evaluate the algorithm’s adaptability and robustness in low-temperature, normal-temperature, and high-temperature environments. The basic parameters of the battery are shown in Table 3, and the initial value of SOC is uniformly set to 0.8. The experimental operating conditions include two typical dynamic load profiles:

(1): FUDS (Federal Urban Driving Schedule): Simulates urban traffic conditions and is characterized by frequent acceleration and deceleration.
(2): DST (Dynamic Stress Test): Featuring large load fluctuations, it is designed to examine the battery’s dynamic response under stress testing.

In each set of experiments, the battery’s terminal voltage, current, and temperature data were collected. The Extended Kalman Filter (EKF) algorithm was first used for preliminary SOC estimation. Subsequently, the current data, EKF-estimated SOC, and temperature information were fed as observation inputs into the proposed Hidden Markov Model-Factor Graph (HMM–FG) for message passing and Maximum A Posteriori (MAP) inference, yielding the optimized SOC curve.

To verify the effectiveness and robustness of the proposed method, this study conducted a comparative analysis between the EKF–HMM–FG method and three benchmark methods: EKF, Extended Kalman Smoother (EKS), and the LSTM-based data-driven method. The focus was on evaluating their estimation accuracy and volatility performance under different temperature conditions and operating profiles.

7.2. Comparative Methods

7.2.1. Extended Kalman Smoother (EKS)

The Extended Kalman Smoother (EKS) is a posterior state estimation method developed on the basis of the Extended Kalman Filter (EKF). Its core idea is to follow the forward filtering recursion with a backward pass that uses observation data from future moments to correct historical states, thereby obtaining the posterior estimate of the entire time series. Whereas the EKF is a causal filter that relies only on current and past observations, the EKS is a smoothing method that uses observation information from the past, present, and future at the same time. Therefore, it has higher estimation accuracy when the system noise is large or the model parameter uncertainty is strong.

In this study, the EKS adopts the same battery equivalent circuit model as the EKF as the state-space model. The state variables include SOC, model internal resistance, polarization voltage, and other parameters. The implementation process of the EKS includes the following two steps:

(1): Forward Filtering Phase: The SOC is estimated recursively via the EKF, yielding the predicted and filtered state estimates together with their covariance sequences.
(2): Backward Smoothing Phase: Using the Rauch–Tung–Striebel (RTS) smoothing algorithm, the smoothing gain is computed at each step and the correction is propagated backward from the final moment to the state estimate at each earlier moment, resulting in the posterior estimate of the entire time series.

By incorporating future observation information, the EKS can effectively reduce the estimation fluctuations of the EKF under operating conditions with rapid dynamic changes and improve the overall accuracy of SOC estimation.
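The two phases can be sketched in Python for a scalar state. This is a simplified illustration rather than the EKS used in this study: it assumes a linear model with unit state transition, x_k = x_{k-1} + u_{k-1} + w, and a direct observation y_k = x_k + v, so no Jacobians are needed. The function names and the noise variances `q` and `r` are hypothetical.

```python
def kalman_forward(ys, us, x0, p0, q, r):
    """Forward pass: store predicted and filtered means and variances."""
    x_pred, p_pred, x_filt, p_filt = [], [], [], []
    x, p = x0, p0
    for k, y in enumerate(ys):
        if k > 0:
            x = x + us[k - 1]  # state prediction with known input
            p = p + q          # variance prediction
        x_pred.append(x)
        p_pred.append(p)
        gain = p / (p + r)     # Kalman gain for y = x + v
        x = x + gain * (y - x)
        p = (1.0 - gain) * p
        x_filt.append(x)
        p_filt.append(p)
    return x_pred, p_pred, x_filt, p_filt


def rts_backward(x_pred, p_pred, x_filt, p_filt):
    """Backward pass: Rauch-Tung-Striebel smoothing for unit state transition."""
    n = len(x_filt)
    x_s, p_s = list(x_filt), list(p_filt)
    for k in range(n - 2, -1, -1):
        c = p_filt[k] / p_pred[k + 1]  # smoother gain
        x_s[k] = x_filt[k] + c * (x_s[k + 1] - x_pred[k + 1])
        p_s[k] = p_filt[k] + c * c * (p_s[k + 1] - p_pred[k + 1])
    return x_s, p_s
```

The backward pass needs both the predicted and the filtered quantities from the forward pass. The smoothed variance at each step is never larger than the filtered one, which is the formal counterpart of the reduced fluctuation described above.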

7.2.2. Long Short-Term Memory (LSTM)

Long Short-Term Memory (LSTM) is a typical variant of a Recurrent Neural Network (RNN). It can effectively alleviate the gradient vanishing and gradient exploding problems that traditional RNNs tend to have in long-sequence modeling by introducing a gating mechanism (input gate, forget gate, and output gate). This allows it to better capture long-term dependencies in time-series data. With this advantage, LSTM has been widely used in many time-series modeling tasks such as speech recognition, natural language processing, and battery health management.

In the task of battery SOC estimation, LSTM does not rely on an explicit battery equivalent circuit model. Instead, it relies on a large amount of historical operating data for end-to-end training. By learning the complex nonlinear mapping relationships between voltage, current, temperature, and SOC, it achieves fully data-driven SOC prediction. Compared with EKF and EKS methods based on physical models, LSTM is more flexible and can adapt to different working conditions and nonlinear characteristics to a certain extent. However, its performance depends on the scale and coverage of the training data. When the training data are insufficient, or when the working conditions seen in training differ substantially from the test conditions, prediction accuracy and generalization ability may degrade.
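To make the gating mechanism concrete, the sketch below implements one update of a standard LSTM cell with a scalar hidden state in plain Python. It is illustrative only: the network architecture and weights used in this study are not specified here, so the `params` layout and the choice of a three-element input (voltage, current, temperature) are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM cell update with a scalar hidden state.

    x      : input features, e.g. [voltage, current, temperature] (assumed)
    params : maps gate name -> (input weights, recurrent weight, bias)
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return sum(w * xi for w, xi in zip(w_x, x)) + w_h * h_prev + b

    i = sigmoid(pre_activation("input"))        # how much new content to write
    f = sigmoid(pre_activation("forget"))       # how much old cell state to keep
    o = sigmoid(pre_activation("output"))       # how much cell state to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content

    c = f * c_prev + i * g   # additive cell-state update
    h = o * math.tanh(c)     # hidden state passed to the next step
    return h, c
```

The additive form of the cell-state update is what lets gradients flow across many time steps. In an SOC estimator, a final linear layer would typically map `h` to the predicted SOC.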

7.2.3. Analysis of Experimental Results

To verify the effectiveness of the EKF-HMM-FG method proposed in this study, SOC estimation experiments were conducted under three temperature conditions (0 °C, 25 °C, and 45 °C) and two operating conditions (FUDS and DST). Figures 8–11 show the SOC estimation curves and corresponding error curves of each method under different conditions, while Tables 4–9 summarize the quantitative error indicators of each method.

Analysis of the SOC estimation curves shows that the estimates of the traditional EKF fluctuate strongly. Especially under low-temperature and high-dynamic load conditions, the curve exhibits obvious jitter and even deviation. Compared with the EKF, the EKS produces smoother estimation curves, but a deviation from the actual SOC remains. The LSTM follows SOC changes well under some conditions, but it shows an overall drift, which is particularly obvious at high temperatures. In contrast, the SOC curve obtained by the proposed EKF-HMM-FG method is highly consistent with the actual value. In the locally enlarged area, it closely follows the actual SOC trajectory, showing stronger robustness and stability.

The SOC error curves further illustrate the differences between the methods. The errors of the EKF and EKS fluctuate strongly over time, with maximum errors of up to 4.1% for the EKF and 3.7% for the EKS (Tables 4–9). For the LSTM, the error deviates seriously under some operating conditions, showing a systematic bias. In contrast, the error of the proposed EKF-HMM-FG method stays concentrated around zero, and its fluctuation range is significantly reduced. Its maximum error remains below 1.5% under FUDS and does not exceed 1.713% under DST across all three temperature conditions, demonstrating strong temperature adaptability.

The MAE, RMSE, maximum error, and Bias indicators shown in Tables 4–9 are consistent with the curve results. Taking the FUDS operating condition as an example, the RMSE of the proposed EKF-HMM-FG method at 25 °C is only 0.425%, with a maximum error of 1.411%, far lower than the EKF's 0.883% (RMSE) and 4.103% (maximum error). At 45 °C, the RMSE of the EKF-HMM-FG further decreases to 0.293%, and the maximum error is only 0.841%. The method shows similar advantages under the DST operating condition, which indicates that it can maintain high accuracy and stability across different operating conditions and temperatures.
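For reference, the four indicators can be computed as in the following Python sketch. It assumes the usual definitions, with the error taken as estimate minus reference and Bias as the mean signed error; the exact definitions used for Tables 4–9 are not restated in this section, and the function name `soc_error_metrics` is hypothetical.

```python
import math


def soc_error_metrics(estimated, reference):
    """Return MAE, RMSE, maximum absolute error, and bias of an SOC estimate.

    Inputs are equal-length sequences in the same unit; multiply the results
    by 100 to express fractional SOC errors in percent.
    """
    errors = [e - r for e, r in zip(estimated, reference)]
    n = len(errors)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "Max-Err": max(abs(e) for e in errors),
        "Bias": sum(errors) / n,  # signed mean error (assumed definition)
    }
```

MAE and RMSE summarize the average error magnitude, Max-Err captures the worst single sample, and Bias reveals a systematic offset that the other three cannot distinguish from symmetric noise.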

Taken together, the figures and tables show that the EKF-HMM-FG method achieves the lowest MAE, RMSE, and maximum error of the four methods in all six test cases, indicating that it suppresses noise fluctuations more effectively than the EKF and EKS. At the same time, it avoids the generalization problem of the LSTM method in cross-temperature scenarios. The magnitude of its Bias stays below 0.1% in every case, although at 25 °C under FUDS and at 45 °C under DST the EKF and EKS show a smaller Bias (Tables 5 and 9). These results support the practicality and robustness of the proposed method in complex operating conditions.

8. Conclusions

This paper proposes a battery state estimation method based on the hidden Markov model factor graph. The dynamic characteristics of the battery are modeled with a second-order equivalent circuit model. In this method, the terminal voltage serves as the observation factor, while the SOC estimated by the EKF is used to construct the state transition factor. In the simulation experiments, the sum-product algorithm estimates the marginal distribution of the SOC through message passing in the factor graph, which improves the accuracy, stability, and robustness of the estimation.

Future work can further improve the online, real-time estimation capability of the model. Multi-source information [19,20] (such as temperature, capacity, attenuation, etc.) can be combined to enhance the accuracy of SOC estimation, so as to meet the needs of complex application scenarios such as electric vehicles.

Author Contributions

Methodology, W.F.; Software, G.-P.W.; Investigation, P.L.; Writing—original draft, Z.-J.S.; Writing—review & editing, Y.-T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant no. 62373247) and the National Defence Fund (grant no. 2023-JCJQ-JJ-0353).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Li, J.; Wang, Q.; Kong, D.; Wang, X.; Yu, Z.; Le, Y.; Huang, X.; Hu, Z.; Wu, H.; Fang, H.; et al. Research progress on the safety assessment of lithium-ion battery energy storage. Energy Storage Sci. Technol. 2023, 12, 2282–2301.
2. Chen, Y.; Kang, Y.; Zhao, Y.; Wang, L.; Liu, J.; Li, Y.; Liang, Z.; He, X.; Li, X.; Tavajohi, N.; et al. A review of lithium-ion battery safety concerns: The issues, strategies, and testing standards. J. Energy Chem. 2021, 59, 83–99.
3. Meng, Z.; Agyeman, K.A.; Wang, X. Lithium-ion battery state of charge estimation with adaptability to changing conditions. IEEE Trans. Energy Convers. 2023, 38, 2860–2870.
4. Tan, B.; Du, J.; Ye, X.; Cao, X.; Qu, C. A review of model-based SOC estimation methods for lithium-ion batteries. Energy Storage Sci. Technol. 2023, 12, 1995–2010.
5. Luo, L.; Chen, G. Research on SOC estimation of lithium battery for electric vehicle based on EKF algorithm. In Proceedings of the 2021 International Conference on Power System Technology (POWERCON), Haikou, China, 8–9 December 2021; pp. 820–823.
6. Hu, Z.; Liu, J. Lithium battery SOC correction technology based on equivalent circuit UKF filtering algorithm. In Proceedings of the 2022 International Conference on Artificial Intelligence and Computer Information Technology (AICIT), Yichang, China, 16–18 September 2022; pp. 1–4.
7. Wang, K.; Lyu, S.; Liu, Y.; Wang, B. Graph optimization via decoupled edge tuning for efficient industrial anomaly detection. IEEE Trans. Netw. Sci. Eng. 2025, 12, 4028–4043.
8. Yang, J.; Wang, T.; Du, C.; Min, F.; Lyu, T.; Zhang, Y.; Yan, L.; Xie, J.; Yin, G. Overview of the modeling of lithium-ion batteries. Energy Storage Sci. Technol. 2019, 8, 58–64.
9. Liu, W.; Zheng, Y.; Khalatbarisoltani, A.; Xu, L.; Pan, Y.; Hu, X. Enhanced electrothermal state estimation and experimental validations for electric flying car batteries. IEEE/ASME Trans. Mechatron. 2024, 29, 4456–4467.
10. Li, X.; Yu, D.; Byg, V.S.; Ioan, S.D. The development of machine learning-based remaining useful life prediction for lithium-ion batteries. J. Energy Chem. 2023, 82, 103–121.
11. Lipu, M.S.H.; Ansari, S.; Miah, M.S.; Meraj, S.T.; Hasan, K.; Shihavuddin, A.S.M.; Hannan, M.A.; Muttaqi, K.M.; Hussain, A. Deep learning enabled state of charge, state of health and remaining useful life estimation for smart battery management system: Methods, implementations, issues and prospects. J. Energy Storage 2022, 55, 105752.
12. Cheng, Y.; Lai, J.; Lyu, P.; Wang, T.; Zhu, J. Visual heading-aided pedestrian navigation method based in factor graph in indoor environment. IEEE Trans. Ind. Electron. 2024, 71, 1006–1016.
13. Kschischang, F.R.; Frey, B.J.; Loeliger, H.A. Factor graphs and the sum-product algorithm. IEEE Trans. Inf. Theory 2001, 47, 498–519.
14. Li, J.; Wang, H.; Cai, D. The principle of factor diagram and its application prospects. Telecommun. Technol. 2000, 40, 20–24.
15. Bai, S.; Lai, J.; Lyu, P.; Cen, Y.; Sun, X.; Wang, B. Performance enhancement of tightly coupled GNSS/IMU integration based on factor graph with robust TDCP loop closure. IEEE Trans. Intell. Transp. Syst. 2024, 25, 2437–2449.
16. Sun, M.; Davies, M.E.; Proudler, I.K.; Hopgood, J.R. Adaptive kernel Kalman filter based belief propagation algorithm for maneuvering multi-target tracking. IEEE Signal Process. Lett. 2022, 29, 1452–1456.
17. Feng, R.; Chang, Y.; Mao, L.; Li, J. Liquid pressure sensing system with coupling RFID and HMM based on distance measurement. IEEE Sens. J. 2020, 21, 1051–1058.
18. Deng, Q.; Söffker, D. A review of HMM-based approaches of driving behaviors recognition and prediction. IEEE Trans. Intell. Veh. 2021, 7, 21–31.
19. Wang, Y.; Tian, J.; Sun, Z.; Wang, L.; Xu, R.; Li, M.; Chen, Z. A comprehensive review of battery modeling and state estimation approaches for advanced battery management systems. Renew. Sustain. Energy Rev. 2020, 131, 110015.
20. Zhuang, Y.; Sun, X.; Li, Y.; Huai, J.; Hua, L.; Yang, X.; Cao, X.; Zhang, P.; Cao, Y.; Qi, L.; et al. Multi-sensor integrated navigation/positioning system using data fusion: From analytics-based to learning-based approaches. Inf. Fusion 2023, 95, 62–90.

Figure 1. The second-order Thevenin equivalent circuit model.

Figure 2. OCV-SOC fitting curve.

Figure 3. Comparison of model output and true values.

Figure 4. Voltage estimation error value.

Figure 5. The example of a factor graph.

Figure 6. Hidden Markov factor diagram.

Figure 7. Flow chart of the state estimation.

Figure 8. SOC estimation under FUDS operating conditions at different temperatures.

Figure 9. SOC estimation errors under FUDS operating conditions at different temperatures.

Figure 10. SOC estimation under DST operating conditions at different temperatures.

Figure 11. SOC estimation errors under DST operating conditions at different temperatures.

Table 1. Fitting indicators of different orders.

Order	RMSE	MAE
1	0.0441	0.0385
2	0.0387	0.0293
3	0.0259	0.0215
4	0.0096	0.0079
5	0.0007	0.0006
6	3.7 × 10⁻¹⁵	3.05 × 10⁻¹⁵
7	3.86 × 10⁻¹⁵	3.14 × 10⁻¹⁵

Table 2. Battery parameter identification results.

SOC	$R_{0}$ (Ω)	$R_{1}$ (Ω)	$R_{2}$ (Ω)	$C_{1}$ (F)	$C_{2}$ (F)
0.1	0.0711	0.0220	0.00214	973.705	361.029
0.2	0.0713	0.0218	0.00187	1014.322	419.262
0.3	0.0710	0.0217	0.00187	1066.142	403.944
0.4	0.0709	0.0207	0.00182	1129.260	438.525
0.5	0.0708	0.0193	0.00180	1195.518	459.439
0.6	0.0706	0.0186	0.00176	1229.872	481.362
0.7	0.0707	0.0183	0.00169	1231.680	499.461
0.8	0.0704	0.0185	0.00164	1235.596	506.406
0.9	0.0701	0.0186	0.00162	1233.072	504.058

Table 3. Battery basic data.

Battery Parameter	Value
Rated capacity	2000 mAh
Nominal voltage	3.7 V
Maximum charging voltage	4.2 V
Discharge cut-off voltage	2.5 V
Maximum continuous discharge current	22 A
Standard charging current	1.25 A
Cycle life	>300 cycles

Table 4. SOC estimation error indicators at 0 °C under FUDS operating condition (%).

Method	MAE	RMSE	Max-Err	Bias
EKF	0.641	0.811	3.1556	0.172
EKS	0.543	0.682	2.1973	0.168
EKF-HMM-FG	0.281	0.382	1.2103	0.025
LSTM	0.368	0.729	2.6871	0.078

Table 5. SOC estimation error indicators at 25 °C under FUDS operating condition (%).

Method	MAE	RMSE	Max-Err	Bias
EKF	0.661	0.883	4.1032	0.0576
EKS	0.595	0.810	3.6813	0.0539
EKF-HMM-FG	0.321	0.425	1.4106	0.0949
LSTM	0.776	0.987	2.5109	0.3878

Table 6. SOC estimation error indicators at 45 °C under FUDS operating condition (%).

Method	MAE	RMSE	Max-Err	Bias
EKF	0.615	0.766	2.5153	0.1671
EKS	0.545	0.678	1.8480	0.1635
EKF-HMM-FG	0.256	0.293	0.8406	0.0198
LSTM	0.611	0.778	2.6341	0.1883

Table 7. SOC estimation error indicators at 0 °C under DST operating condition (%).

Method	MAE	RMSE	Max-Err	Bias
EKF	0.591	0.758	2.943	0.186
EKS	0.519	0.782	2.417	0.173
EKF-HMM-FG	0.281	0.554	1.548	−0.027
LSTM	0.553	0.716	2.760	−0.141

Table 8. SOC estimation error indicators at 25 °C under DST operating condition (%).

Method	MAE	RMSE	Max-Err	Bias
EKF	0.536	0.815	3.627	−0.128
EKS	0.466	0.751	3.265	−0.131
EKF-HMM-FG	0.281	0.549	1.713	−0.029
LSTM	0.553	0.856	3.729	−0.302

Table 9. SOC estimation error indicators at 45 °C under DST operating condition (%).

Method	MAE	RMSE	Max-Err	Bias
EKF	0.566	0.759	3.2031	−0.0081
EKS	0.501	0.679	2.7359	−0.0116
EKF-HMM-FG	0.388	0.481	1.4303	0.0340
LSTM	0.658	0.819	3.4592	−0.2842

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
