1. Introduction
As global energy crises intensify and environmental awareness rises, the use of clean energy and electric tools has steadily expanded. Lithium-ion batteries are efficient energy storage devices that are widely used in electric vehicles, power grids, and renewable energy systems. In practical applications, however, the safety, performance, and lifespan of a battery depend on the Battery Management System (BMS). State of Charge (SOC) estimation is a core BMS technology [1,2].
The SOC of a battery is the ratio of its remaining capacity to its nominal capacity, and it is a critical indicator of the capacity currently available. Accurate SOC estimation not only ensures safe operation by preventing overcharging and over-discharging but also prolongs the service life of the battery [3]. However, because of the complex electrochemical characteristics of lithium-ion batteries under different operating conditions, the SOC is closely related to multiple parameters, including operating current, voltage, and temperature. These relationships are strongly nonlinear and time-varying, which makes SOC estimation a highly challenging task.
Current SOC estimation approaches for lithium-ion batteries fall into three primary categories: direct measurement, model-based, and data-driven methods [4]. Direct measurement methods estimate the SOC by measuring the open-circuit voltage of the battery or by Coulomb counting. These methods are simple and intuitive, but they are vulnerable to initial state uncertainty and measurement noise, and they often fail to estimate SOC accurately, especially under complex working conditions. Model-based methods perform online SOC estimation by applying filters such as the Extended Kalman Filter (EKF), Cubature Kalman Filter (CKF), and Unscented Kalman Filter (UKF) to equivalent circuit or electrochemical battery models [5,6,7]. These methods rely on accurate battery models [8]. In practical applications, the battery parameters often change with temperature, state of charge, and degree of aging [9], which degrades the accuracy of SOC estimation. Data-driven methods use machine learning algorithms [10,11] to estimate SOC from large amounts of historical data. In recent years, deep learning models (such as Long Short-Term Memory (LSTM) networks and attention-based Transformers) have been widely applied to SOC prediction and estimation. These methods automatically extract complex temporal features and have certain advantages in handling nonlinear operating conditions. However, their performance depends heavily on the completeness and distribution consistency of the training data, and they generalize poorly under temperature variations, aging, and unseen operating conditions. These models are also difficult to interpret, which makes it hard to meet the transparency and robustness requirements of safety-critical applications.
The Hidden Markov Model (HMM) is a statistical framework for time-series analysis that performs well in modeling dynamic systems with hidden states. The HMM infers the hidden states of the system from the observed data, which makes it suitable for describing the evolution of SOC. The Factor Graph (FG), in turn, is a probabilistic graphical model that represents the dependencies between variables and enables efficient inference and state estimation through message-passing algorithms [12].
This paper proposes an SOC estimation method combining the Hidden Markov Factor Graph (HMM-FG) and the EKF. The HMM-FG models battery SOC dynamics via probabilistic dependencies among factor nodes, which facilitates joint inference using historical and real-time measurements. It computes the marginal probability of SOC and applies maximum a posteriori (MAP) estimation, thereby reducing the errors of the EKF and improving estimation accuracy. In addition, the FG makes the estimation process more robust and adaptable under dynamic loads and environmental changes. Experimental verification at different temperatures and under different operating conditions demonstrates that the proposed method not only outperforms the traditional EKF method in estimation accuracy but also shows strong real-time performance and adaptability.
2. Modeling of Lithium-Ion Batteries
2.1. Second-Order Thevenin Model
The main lithium-ion battery models are of three types: the resistance-capacitance (RC) model, the electrochemical model, and the Thevenin equivalent circuit model. The RC model is a simplified battery model that describes the dynamic response of the battery through resistive and capacitive components. It is suitable for SOC estimation on short time scales but may capture the nonlinear characteristics of the battery poorly over long time scales. The electrochemical model describes in detail the internal chemical reactions and charge transfer processes of the battery. The Thevenin equivalent circuit model represents the lithium-ion battery as an equivalent circuit comprising a voltage source, internal resistance, and capacitance; the response of the battery under different conditions is analyzed by studying this circuit.
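In Figure 1, $U_{OCV}$ denotes the open-circuit voltage (OCV) of the battery; $U_t$ denotes the terminal voltage of the battery; $R_0$ represents the ohmic internal resistance of the battery; $I$ stands for the operating current; $R_1$ and $C_1$ are the polarization resistance and capacitance of the battery, respectively; $R_2$ and $C_2$ are the concentration polarization resistance and capacitance of the battery, respectively; and $U_1$ and $U_2$ indicate the voltages of the two RC loops, respectively.
According to Kirchhoff’s laws, the circuit model in Figure 1 can be expressed as follows:
$$\begin{cases} \dot{SOC} = -\dfrac{\eta I}{Q_N} \\ \dot{U}_1 = -\dfrac{U_1}{R_1 C_1} + \dfrac{I}{C_1} \\ \dot{U}_2 = -\dfrac{U_2}{R_2 C_2} + \dfrac{I}{C_2} \\ U_t = U_{OCV} - U_1 - U_2 - I R_0 \end{cases} \tag{1}$$
where the current $I$ is defined as the input, the terminal voltage $U_t$ is the observed quantity, and $x = [SOC, U_1, U_2]^T$ is selected as the state variable of the system. In this paper, the forward Euler method, which can also be understood as the forward difference method, is used to discretize Equation (1). Its basic idea is approximate iteration, using the formula
$$x(k+1) \approx x(k) + T\,\dot{x}(k)$$
to approximate the integral, where $T$ is the sampling period and $x(k)$ is the value of the integral at the previous moment. The state-space equation of a known time-invariant continuous system is as follows:
$$\dot{x}(t) = A x(t) + B u(t)$$
Substituting $\dot{x}(k) \approx [x(k+1) - x(k)]/T$ into the equation yields
$$\frac{x(k+1) - x(k)}{T} = A x(k) + B u(k)$$
After rearrangement, the discretized state-space equation is obtained as follows:
$$x(k+1) = (E + TA)\,x(k) + TB\,u(k)$$
The observation equation is the following:
$$y(k) = C x(k) + D u(k)$$
where $E$ is the identity matrix.
In summary, the discretized state-space equation takes the following form:
$$\begin{cases} x(k+1) = A_d\,x(k) + B_d\,u(k) \\ y(k) = C x(k) + D u(k) \end{cases}$$
where $A_d = E + TA$ and $B_d = TB$.
The discretized equation of Equation (1) is obtained according to the above method:
$$x_{k+1} = A_d\,x_k + B_d\,I_k + w_k$$
where $x_k = [SOC_k, U_{1,k}, U_{2,k}]^T$ denotes the system state at time $k$. Here,
$$A_d = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 - \dfrac{T}{R_1 C_1} & 0 \\ 0 & 0 & 1 - \dfrac{T}{R_2 C_2} \end{bmatrix}$$
is the state matrix of the system;
$$B_d = \begin{bmatrix} -\dfrac{\eta T}{Q_N} & \dfrac{T}{C_1} & \dfrac{T}{C_2} \end{bmatrix}^T$$
is the control matrix of the system; $T$ represents the sampling time; $k$ denotes the discrete time; $Q_N$ stands for the battery capacity; $\eta$ is the charge-discharge efficiency of the battery; and $w_k$ is the process noise of the system.
Similarly, the discretized observation equation of the system is the following:
$$U_{t,k} = U_{OCV}(SOC_k) - U_{1,k} - U_{2,k} - I_k R_0 + v_k$$
where $v_k$ represents the observation noise of the system.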
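The forward-Euler recursion described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the parameter values, the linear OCV function, and the convention that positive current means discharge are all assumptions made for the example.

```python
from typing import Callable, NamedTuple, Tuple

State = Tuple[float, float, float]  # (SOC, U1, U2)


class TheveninParams(NamedTuple):
    r0: float           # ohmic internal resistance [ohm]
    r1: float           # polarization resistance [ohm]
    c1: float           # polarization capacitance [F]
    r2: float           # concentration polarization resistance [ohm]
    c2: float           # concentration polarization capacitance [F]
    capacity_ah: float  # battery capacity [Ah]
    eta: float = 1.0    # charge-discharge efficiency


def euler_step(state: State, current: float, dt: float, p: TheveninParams) -> State:
    """Advance (SOC, U1, U2) by one forward-Euler step (current > 0 is discharge)."""
    soc, u1, u2 = state
    soc_next = soc - p.eta * dt * current / (p.capacity_ah * 3600.0)
    u1_next = (1.0 - dt / (p.r1 * p.c1)) * u1 + dt / p.c1 * current
    u2_next = (1.0 - dt / (p.r2 * p.c2)) * u2 + dt / p.c2 * current
    return soc_next, u1_next, u2_next


def terminal_voltage(state: State, current: float, p: TheveninParams,
                     ocv: Callable[[float], float]) -> float:
    """Observation equation: OCV minus both RC voltages minus the ohmic drop."""
    soc, u1, u2 = state
    return ocv(soc) - u1 - u2 - current * p.r0


def simulate(n_steps: int, current: float, dt: float,
             p: TheveninParams, state: State) -> State:
    """Apply a constant current for n_steps samples and return the final state."""
    for _ in range(n_steps):
        state = euler_step(state, current, dt, p)
    return state


# Hypothetical parameters: a 2 Ah cell discharged at 2 A for 360 s from SOC = 0.8.
params = TheveninParams(r0=0.07, r1=0.02, c1=1000.0, r2=0.03, c2=5000.0,
                        capacity_ah=2.0)
final_state = simulate(360, 2.0, 1.0, params, (0.8, 0.0, 0.0))
```

Forward Euler is only stable when the sampling period is small relative to both RC time constants, which holds for the values assumed here (1 s against 20 s and 150 s).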
2.2. OCV-SOC Fitting
The relationship between a battery’s OCV and SOC is an important subject in the study of battery electrical performance. The SOC is the battery’s currently stored charge expressed as a percentage of its capacity, and the OCV is the voltage across the battery terminals under no-load conditions. During charging and discharging, the OCV varies with the SOC, so a functional relationship exists between the two. This relationship is typically nonlinear. The battery’s voltage is influenced by multiple factors, such as chemical reactions within the battery, temperature, load, and charge-discharge history, so the relationship is usually obtained through experimental measurements and represented by a fitted curve. In this study, experimental data from the incremental-current OCV test were used for polynomial fitting of different orders, and the optimal order was selected by analyzing the fitting results.
Analysis of the data in Table 1 reveals that for polynomial orders ranging from 1 to 5, both the RMSE and MAE gradually decrease as the polynomial order increases, with fitting accuracy continuously improving. When the order reaches 6 to 7, the RMSE and MAE are close to zero, basically reaching the accuracy limit of numerical calculation. Although polynomials of excessively high order can further reduce errors numerically, they may introduce unnecessary fluctuations or oscillations, which are inconsistent with the physical laws governing the battery’s SOC-OCV relationship. Therefore, on the premise of ensuring accuracy, the 6th-order polynomial is the optimal choice. The experimental steps for the 6th-order polynomial fitting are as follows:
- (1) Fully charge the battery to 100% SOC.
- (2) Discharge the battery with negative pulse currents, allowing a relaxation period at every 10% SOC.
- (3) Recharge the battery following the same procedure but using positive pulse currents.
- (4) Apply averaging and linear interpolation to obtain the OCV-SOC curves at 0 °C, 25 °C, and 45 °C.
The fitting results are shown in Figure 2.
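The order-selection procedure can be illustrated with a short Python sketch that fits polynomials of increasing order and reports RMSE and MAE for each. The OCV samples below are synthetic, generated from a hypothetical cubic rather than from the measured data, and the least-squares solver uses plain normal equations, which is adequate for a small example only.

```python
import math
from typing import Dict, List, Tuple


def polyfit(xs: List[float], ys: List[float], order: int) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    n = order + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def polyval(coeffs: List[float], x: float) -> float:
    """Evaluate the polynomial with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def fit_errors(xs: List[float], ys: List[float], order: int) -> Tuple[float, float]:
    """Return (RMSE, MAE) of the fitted polynomial on the fitting points."""
    coeffs = polyfit(xs, ys, order)
    residuals = [polyval(coeffs, x) - y for x, y in zip(xs, ys)]
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return rmse, mae


# Synthetic OCV samples at every 10% SOC (hypothetical cubic, not measured data).
soc_points = [i / 10 for i in range(11)]
ocv_points = [3.2 + 1.0 * s - 0.8 * s ** 2 + 0.7 * s ** 3 for s in soc_points]
errors: Dict[int, Tuple[float, float]] = {
    order: fit_errors(soc_points, ocv_points, order) for order in range(1, 6)
}
```

Because the synthetic curve is a cubic, the errors collapse to numerical noise from order 3 upward, which mirrors the pattern described for Table 1 at orders 6 and 7. For real data a better-conditioned solver such as `numpy.polyfit` would be the more usual choice.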
2.3. Model Parameter Identification
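In this paper, the forgetting factor recursive least squares (FFRLS) method is used to identify the parameters of the second-order Thevenin model. The formulas of FFRLS are as follows:
$$\begin{cases} K_k = \dfrac{P_{k-1}\varphi_k}{\lambda + \varphi_k^T P_{k-1}\varphi_k} \\ \hat{\theta}_k = \hat{\theta}_{k-1} + K_k\left(y_k - \varphi_k^T \hat{\theta}_{k-1}\right) \\ P_k = \dfrac{1}{\lambda}\left(E - K_k \varphi_k^T\right)P_{k-1} \end{cases}$$
where $\hat{\theta}_k$ is the parameter estimate, $\varphi_k$ is the regression vector, and $E$ is the identity matrix; $K_k$ represents the gain term; $y_k$ denotes the observed value; $\lambda$ is the forgetting factor, with $0 < \lambda \le 1$ and typically chosen close to 1; and $P_k$ represents the covariance at time instant $k$.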
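A generic FFRLS update can be sketched as follows. The regression vector, the forgetting factor of 0.98, the initial covariance, and the two-parameter test system are assumptions chosen for illustration; they are not the regressors or settings used for the Thevenin model in this paper.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def ffrls_update(theta: List[float], P: Matrix, phi: Sequence[float],
                 y: float, lam: float = 0.98) -> Tuple[List[float], Matrix]:
    """One forgetting-factor RLS step for a scalar output y = phi^T theta."""
    n = len(theta)
    p_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * p_phi[i] for i in range(n))
    gain = [v / denom for v in p_phi]                       # K_k
    error = y - sum(phi[i] * theta[i] for i in range(n))    # prediction error
    theta_new = [theta[i] + gain[i] * error for i in range(n)]
    # P_k = (P - K phi^T P) / lambda; phi^T P equals p_phi because P is symmetric.
    P_new = [[(P[i][j] - gain[i] * p_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


def identify(samples: Sequence[Tuple[Sequence[float], float]], n_params: int,
             lam: float = 0.98, p0: float = 1e4) -> List[float]:
    """Run FFRLS over all (phi, y) samples, starting from theta = 0, P = p0 * E."""
    theta = [0.0] * n_params
    P = [[p0 if i == j else 0.0 for j in range(n_params)] for i in range(n_params)]
    for phi, y in samples:
        theta, P = ffrls_update(theta, P, phi, y, lam)
    return theta


# Hypothetical noise-free system y = 2.0*phi_1 - 0.5*phi_2 with varying regressors.
true_theta = [2.0, -0.5]
samples = []
for k in range(200):
    phi = [math.sin(0.3 * k), math.cos(0.7 * k)]
    samples.append((phi, true_theta[0] * phi[0] + true_theta[1] * phi[1]))
theta_hat = identify(samples, 2)
```

A forgetting factor below 1 discounts old samples, which lets the estimate follow parameters that drift with temperature or SOC at the cost of higher sensitivity to noise.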
The identification results are as follows (Table 2):
After parameter identification, the Dynamic Stress Test (DST) experimental data are used to run a simulation, and the measured terminal voltage is compared with the simulated voltage. The comparison is shown in Figure 3, and the error is shown in Figure 4.
2.4. Extended Kalman Filter Algorithm (EKF Algorithm)
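Based on the second-order equivalent circuit model, the state-space equations are written as follows:
$$\begin{cases} x_{k+1} = f(x_k, u_k) + w_k \\ y_k = h(x_k, u_k) + v_k \end{cases}$$
where $k$ represents the time instant; $x_k$ and $y_k$ denote the state vector and observation vector of the system, respectively; $u_k$ represents the input to the system; $f$ and $h$ denote the state transition function and observation function, respectively; and $w_k$ and $v_k$ denote the system noise and measurement noise, respectively. The EKF algorithm is used to estimate the state variables of the system. The specific steps are as follows:
- (1) Predict the state and covariance at the current time step:
$$\hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1}, u_{k-1}), \qquad P_{k|k-1} = F_{k-1} P_{k-1|k-1} F_{k-1}^T + Q$$
where $F_{k-1}$ is the Jacobian matrix of the state transition function, and $Q$ is the process noise covariance matrix.
- (2) Calculate the Kalman gain:
$$K_k = P_{k|k-1} H_k^T \left(H_k P_{k|k-1} H_k^T + R\right)^{-1}$$
where $H_k$ is the Jacobian matrix of the observation model, and $R$ is the covariance matrix of the observation noise.
- (3) Update the state estimate:
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\left(y_k - h(\hat{x}_{k|k-1}, u_k)\right)$$
- (4) Update the covariance matrix:
$$P_{k|k} = \left(E - K_k H_k\right) P_{k|k-1}$$
where $E$ is the identity matrix.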
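The four EKF steps can be sketched for a deliberately reduced case in which SOC is the only state and the RC voltages are neglected. This simplification, the quadratic OCV curve, and all numerical values are assumptions made to keep the example short; the paper's filter uses the full three-state model.

```python
from typing import Tuple


def ocv(soc: float) -> float:
    """Hypothetical nonlinear OCV curve (not the fitted 6th-order polynomial)."""
    return 3.0 + 1.0 * soc + 0.2 * soc ** 2


def docv_dsoc(soc: float) -> float:
    """Derivative of the hypothetical OCV curve, used as the observation Jacobian."""
    return 1.0 + 0.4 * soc


def ekf_step(x: float, p: float, current: float, voltage: float, dt: float,
             capacity_ah: float, r0: float, q: float, r: float) -> Tuple[float, float]:
    """One scalar EKF iteration for SOC (current > 0 is discharge)."""
    # (1) Predict the state and covariance.
    x_pred = x - dt * current / (capacity_ah * 3600.0)
    f_jac = 1.0
    p_pred = f_jac * p * f_jac + q
    # (2) Calculate the Kalman gain.
    h_jac = docv_dsoc(x_pred)
    gain = p_pred * h_jac / (h_jac * p_pred * h_jac + r)
    # (3) Update the state estimate with the voltage innovation.
    innovation = voltage - (ocv(x_pred) - current * r0)
    x_new = x_pred + gain * innovation
    # (4) Update the covariance.
    p_new = (1.0 - gain * h_jac) * p_pred
    return x_new, p_new


# Hypothetical run: true SOC starts at 0.8, the filter starts at 0.5.
dt, current, capacity_ah, r0 = 1.0, 1.0, 2.0, 0.07
true_soc, est_soc, est_cov = 0.8, 0.5, 0.1
for _ in range(100):
    true_soc -= dt * current / (capacity_ah * 3600.0)
    measured_voltage = ocv(true_soc) - current * r0   # noise-free measurement
    est_soc, est_cov = ekf_step(est_soc, est_cov, current, measured_voltage,
                                dt, capacity_ah, r0, q=1e-6, r=1e-4)
```

With noise-free measurements the estimate recovers from the wrong initial SOC within a few steps, because the voltage innovation corrects the Coulomb-counting prediction at every sample.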
3. Hidden Markov Factor Graph Modeling
3.1. Factor Graph (FG)
A factor graph is a type of probabilistic graphical model [13,14,15] that represents the conditional dependencies between variables and is well suited to reasoning and computation. In machine learning, signal processing, and information theory, it is widely applied to probabilistic inference problems, including those posed on Bayesian networks and Markov random fields.
A factor graph is a bipartite graph made up of variable nodes and factor nodes. The variable nodes stand for random variables and are usually drawn as circles, with each variable node corresponding to one random variable. The factor nodes express the relationships or functions between variables; each is typically one factor of the probability distribution and is generally drawn as a square. The variable nodes connected to a factor node are the variables on which that factor depends. A factor graph breaks down complex global computations into multiple local ones, thereby simplifying the reasoning process.
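Suppose there is a global function $g(x_1, x_2, \ldots, x_n)$ that contains random variables $x_1, x_2, \ldots, x_n$. It can be decomposed into multiple factors $f_j$ in the following form:
$$g(x_1, x_2, \ldots, x_n) = \prod_{j} f_j(X_j)$$
where $X_j$ represents the subset of variables on which $f_j$ depends.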
An important application of factor graphs is inference on the graph, for example using the sum-product algorithm or the belief propagation algorithm [16] to calculate marginal probability distributions by passing messages along the graph. This matters most for high-dimensional probability distributions, where computing marginals directly is expensive.
Let $g(x_1, x_2, x_3, x_4, x_5)$ be a function of five variables that can be expressed as a product of factors. Its factor graph representation is shown in Figure 5, where $x_i$ represents variable nodes and $f_j$ represents factor nodes.
3.2. Hidden Markov Model Representation in Factor Graph
A Hidden Markov Model is a statistical model that is widely used for time-series data [17,18]. It assumes that the system is a Markov process whose states are not directly observable, so the states are inferred indirectly from a set of related observations. In lithium battery state estimation, observations such as voltage, current, and temperature are usually required, and they are combined with a battery model to estimate the SOC. Because the HMM handles uncertain and dynamic processes, it is suitable for estimating a battery’s SOC.
HMM mainly consists of the following five parts:
- (1) State set: $S = \{S_1, S_2, \ldots, S_t\}$ represents all hidden states of the system. At each time point, the system is in a state that cannot be directly observed.
- (2) Observation set: $O = \{O_1, O_2, \ldots, O_t\}$ represents the observed output sequence. At each time point, the system outputs an observable value, which depends on the current hidden state.
- (3) Initial state probability distribution: $\pi = \{\pi_i\}$ represents the probability distribution of the system being in each hidden state at the initial time point. $\pi_i$ represents the probability that the system is in state $S_i$ at the initial moment.
- (4) State transition probability matrix: $A = \{a_{ij}\}$ represents the probability of the system transitioning from one hidden state to another. $a_{ij}$ represents the probability of transitioning from state $S_i$ to state $S_j$ at time $t$.
- (5) Observation probability matrix: $B = \{b_j(O_k)\}$ represents the probability that the system outputs a certain observation value given a hidden state. $b_j(O_k)$ represents the probability of observing $O_k$ under $S_j$.
The hidden Markov factor graph model of lithium batteries can reflect the connections between various variables. It provides a powerful probabilistic framework for SOC estimation and realizes real-time estimation of SOC through message passing algorithms.
The SOC is taken as the hidden state sequence, and the terminal voltage $U$ is modeled as the observation. The model is shown in Figure 6.
In the figure, $SOC_k$ is the true state of charge of the battery, a hidden state that cannot be directly observed, and serves as a variable node in the factor graph. $U_k$ is the observation sequence, where $U_k$ represents the terminal voltage, which serves as the observation input in the HMM-FG. $f_s$ is the state transition factor; it denotes the probability of the SOC changing between adjacent time steps and can be defined based on the charge-discharge curve or the equivalent circuit model of the battery. $f_o$ is the observation factor. The voltage observation factor is a Gaussian-like likelihood function of the difference between the measured voltage and the voltage predicted from the SOC (taking into account the current, internal resistance, and RC voltage).
3.3. Definition of Factor Function
This section will elaborate on the specific factor functions.
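In this study, the SOC estimation result of the EKF is used as reference information to model and constrain the state transition factor, which helps predict the SOC at the next moment without relying on the circuit model.
Let the SOC estimation values of the EKF at time steps $k-1$ and $k$ be $SOC^{EKF}_{k-1}$ and $SOC^{EKF}_{k}$, respectively. Then their difference $\Delta SOC_k = SOC^{EKF}_{k} - SOC^{EKF}_{k-1}$ can be used as the reference mean value for state changes. Thus, the state transition of SOC can be modeled as follows:
$$f_s(SOC_k, SOC_{k-1}) \propto \exp\left(-\frac{\left(SOC_k - SOC_{k-1} - \Delta SOC_k\right)^2}{2\sigma_s^2}\right)$$
The observation factor adopts a Gaussian form established from the OCV-SOC curve and is defined as
$$f_o(SOC_k) \propto \exp\left(-\frac{\left(U_k - \hat{U}_k(SOC_k)\right)^2}{2\sigma_u^2}\right)$$
where $\hat{U}_k(SOC_k)$ is the terminal voltage predicted from $SOC_k$ through the OCV-SOC curve, the current, the internal resistance, and the RC voltages, and $\sigma_s^2$ and $\sigma_u^2$ are the variances of the transition and observation factors.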
4. Sum-Product Algorithm Rules
The sum-product algorithm is a message-passing technique for probabilistic inference in graphical models such as factor graphs. It calculates marginal distributions efficiently by passing messages along the edges of the graph. First, define the messages in the sum-product algorithm. The message $\mu_{x \to f}(x)$ from variable node $x$ to factor node $f$ is calculated by multiplying the messages from the other adjacent factor nodes:
$$\mu_{x \to f}(x) = \prod_{h \in n(x) \setminus \{f\}} \mu_{h \to x}(x)$$
The message $\mu_{f \to x}(x)$ from the factor node $f$ to the variable node $x$ is defined as follows:
$$\mu_{f \to x}(x) = \sum_{X \setminus \{x\}} f(X) \prod_{y \in n(f) \setminus \{x\}} \mu_{y \to f}(y)$$
In the formula, $X = n(f)$ represents all variables adjacent to the current factor node, $n(x)$ denotes the set of neighboring factor nodes of the variable node $x$, and $n(f)$ denotes the set of neighboring variable nodes of the factor node $f$.
The sum-product algorithm can be divided into the following steps:
- (1) Initialize messages: Send initial messages from each node, which are usually uniform distributions or prior distributions.
- (2) Iterative message passing: Calculate the messages from each variable node to its adjacent factor nodes and from each factor node to its adjacent variable nodes.
- (3) Terminate iteration: Stop the message passing when it converges.
- (4) Calculate marginal probability: For each variable node, calculate the marginal probability distribution by combining all incoming messages, as follows:
$$p(x) \propto \prod_{f \in n(x)} \mu_{f \to x}(x)$$
The sum-product algorithm thus yields the marginal distribution of each variable efficiently. Maximum a posteriori estimation is then applied to the marginal of the SOC to obtain the SOC estimate.
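The message rules can be checked on a toy example. The sketch below uses a three-variable chain of binary variables with made-up factor tables (none of them come from the battery model) and compares the sum-product marginal of the middle variable against brute-force enumeration.

```python
from itertools import product
from typing import List

# Hypothetical unary factors g1, g2, g3 and pairwise factors f12, f23.
g1 = [0.6, 0.4]
g2 = [0.5, 0.5]
g3 = [0.3, 0.7]
f12 = [[0.9, 0.1], [0.2, 0.8]]   # indexed as f12[x1][x2]
f23 = [[0.7, 0.3], [0.4, 0.6]]   # indexed as f23[x2][x3]


def normalize(values: List[float]) -> List[float]:
    """Scale a non-negative vector so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def marginal_x2_sum_product() -> List[float]:
    """Marginal of x2 obtained by passing messages inward from both ends."""
    # Variable-to-factor messages: product of the other incoming factor
    # messages, which for the leaf variables is just their unary factor.
    mu_x1_to_f12 = g1
    mu_x3_to_f23 = g3
    # Factor-to-variable messages: multiply by the incoming message and
    # sum out every variable except the target.
    mu_f12_to_x2 = [sum(f12[x1][x2] * mu_x1_to_f12[x1] for x1 in range(2))
                    for x2 in range(2)]
    mu_f23_to_x2 = [sum(f23[x2][x3] * mu_x3_to_f23[x3] for x3 in range(2))
                    for x2 in range(2)]
    # Marginal: product of all messages arriving at x2, then normalize.
    belief = [g2[x2] * mu_f12_to_x2[x2] * mu_f23_to_x2[x2] for x2 in range(2)]
    return normalize(belief)


def marginal_x2_brute_force() -> List[float]:
    """Marginal of x2 obtained by summing the full joint over x1 and x3."""
    belief = [0.0, 0.0]
    for x1, x2, x3 in product(range(2), repeat=3):
        belief[x2] += g1[x1] * g2[x2] * g3[x3] * f12[x1][x2] * f23[x2][x3]
    return normalize(belief)
```

On a tree-structured graph such as this chain, one inward pass gives the exact marginal, so no iteration to convergence is needed; iteration only becomes relevant when the graph contains cycles.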
5. Implementation of the Sum-Product Algorithm
To estimate the marginal distribution of the SOC of lithium batteries efficiently, this paper adopts the sum-product algorithm for message passing. In a factor graph, messages generally flow both from variable nodes to factor nodes and from factor nodes to variable nodes. Here, the state transition factor nodes are calculated from the current SOC and the SOC at the previous moment, so the main direction of message passing is from the factor nodes to the variable nodes. Calculating the message from a factor node to the SOC requires summing over all possible previous states, whereas the messages from the SOC back to the factors are not required. Based on the hidden Markov factor graph of lithium batteries, we define its message passing as follows:
Message from the state factor node to the SOC:
$$\mu_{f_s \to SOC_k}(SOC_k) = \sum_{SOC_{k-1}} f_s(SOC_k, SOC_{k-1})\, p(SOC_{k-1})$$
where $p(SOC_{k-1})$ is the normalized marginal distribution at the previous time step.
Message from the observation factor node to the SOC:
$$\mu_{f_o \to SOC_k}(SOC_k) = f_o(SOC_k)$$
The messages from each factor node to the SOC calculated with the above formulas are combined to obtain the marginal distribution of the SOC. The formula for calculating the marginal distribution is the following:
$$p(SOC_k) \propto \mu_{f_s \to SOC_k}(SOC_k)\, \mu_{f_o \to SOC_k}(SOC_k)$$
Finally, the calculated marginal distribution is normalized to ensure that the sum of probabilities equals 1.
After the marginal distribution is obtained, the SOC estimate is given by MAP estimation, $\widehat{SOC}_k = \arg\max_{SOC_k} p(SOC_k)$. In the discretized factor graph framework, the posterior probability is determined by the joint message passing of the state transition factors and the observation factors.
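The message passing and MAP step described in this section can be sketched on a discretized SOC grid. This is an illustrative reading of the method under stated assumptions: the linear OCV curve, the factor standard deviations, the grid resolution, the neglected RC voltages, and the synthetic EKF increments are all hypothetical.

```python
import math
from typing import List, Sequence


def ocv(soc: float) -> float:
    """Hypothetical linear OCV curve used only for this example."""
    return 3.0 + 1.2 * soc


def transition_factor(soc_now: float, soc_prev: float,
                      delta_ekf: float, sigma_s: float) -> float:
    """Gaussian-shaped factor centred on the EKF-reported SOC change."""
    d = soc_now - soc_prev - delta_ekf
    return math.exp(-d * d / (2.0 * sigma_s ** 2))


def observation_factor(soc: float, voltage: float, current: float,
                       r0: float, sigma_u: float) -> float:
    """Gaussian-shaped likelihood of the measured voltage (RC voltages neglected)."""
    d = voltage - (ocv(soc) - current * r0)
    return math.exp(-d * d / (2.0 * sigma_u ** 2))


def normalize(values: Sequence[float]) -> List[float]:
    total = sum(values)
    return [v / total for v in values]


def hmm_fg_filter(grid: Sequence[float], prior: Sequence[float],
                  deltas: Sequence[float], voltages: Sequence[float],
                  current: float, r0: float,
                  sigma_s: float = 0.005, sigma_u: float = 0.01) -> List[float]:
    """Forward message passing on the SOC chain with a MAP read-out per step."""
    belief = normalize(prior)
    estimates = []
    for delta, voltage in zip(deltas, voltages):
        # Message from the state factor: sum over all previous SOC values.
        msg_state = [sum(transition_factor(s, sp, delta, sigma_s) * b
                         for sp, b in zip(grid, belief)) for s in grid]
        # Message from the observation factor: likelihood of the voltage.
        msg_obs = [observation_factor(s, voltage, current, r0, sigma_u)
                   for s in grid]
        # Marginal: product of both messages, normalized to sum to 1.
        belief = normalize([a * b for a, b in zip(msg_state, msg_obs)])
        # MAP estimate: grid point with the largest marginal probability.
        best = max(range(len(grid)), key=belief.__getitem__)
        estimates.append(grid[best])
    return estimates


# Synthetic scenario: SOC falls by 0.002 per step from 0.8, uniform prior.
grid = [i / 100 for i in range(101)]
true_soc = [0.8 - 0.002 * k for k in range(1, 21)]
voltages = [ocv(s) - 1.0 * 0.07 for s in true_soc]
estimates = hmm_fg_filter(grid, [1.0] * len(grid), [-0.002] * 20, voltages,
                          current=1.0, r0=0.07)
```

The cost per time step grows with the square of the number of grid points, so the grid resolution trades estimation granularity against run time.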
6. State Estimation Process
Before starting the state estimation, we need to initialize its prior distribution from historical data. The accuracy of state estimation depends on precise initial data, so the initialization of the prior distribution has a significant impact on the estimation results. The state estimation process is as follows:
- (1) Introduce historical data of the SOC variable, for example by using the results obtained by the EKF to initialize the prior distribution.
- (2) Define the factor functions and perform message passing using the sum-product algorithm.
- (3) Calculate the marginal probability of the SOC based on the passed messages and normalize it.
- (4) Apply MAP estimation to the marginal distribution of the node and output the estimated SOC.
The flow chart is shown in Figure 7.
A small amount of bad data in the historical data does not significantly affect the results. However, outliers arising during measurement (such as extremely high or low voltages) can cause model deviations and ultimately lead to inaccurate SOC estimation. Bad data may also slow the convergence of the algorithm or even trap it in a local optimum, which degrades the real-time performance of the estimation.
The impact of bad data is multifaceted, which may result in inaccurate and unreliable state estimation of the SOC of the lithium battery. Therefore, in practical applications, it is necessary to take measures to improve data quality, such as increasing data redundancy and monitoring sensor status to reduce the negative impact of bad data.
7. Simulation Experiment
In the EKF–HMM–FG framework proposed in this paper, the HMM–FG component optimizes the entire observation sequence through joint Maximum A Posteriori (MAP) inference. This means that in offline or quasi-offline settings, future observation information can be utilized to improve the SOC estimation at the current moment; thus, the EKF–HMM–FG acts as a smoothing-based improved method under this mode. To ensure fairness in comparison, the Extended Kalman Smoother (EKS) is selected as the model-driven benchmark in the experiments, and a deep learning method based on the Long Short-Term Memory (LSTM) network is further introduced as the data-driven benchmark. The LSTM can learn the dynamic variation relationship of SOC from historical voltage, current, and temperature data, and possesses strong temporal modeling capabilities. However, its performance is highly dependent on the distribution of training data, and it is prone to estimation deviations under out-of-distribution operating conditions. Through comprehensive comparison with the EKS and LSTM methods, the advantages of the proposed EKF–HMM–FG method in terms of estimation accuracy and robustness can be verified more comprehensively.
7.1. Dataset and Experimental Conditions
This study selects the INR 18650-20R lithium-ion battery as the research object and verifies the proposed SOC estimation algorithm under multi-temperature and multi-operating conditions. The experimental temperatures are set to 0 °C, 25 °C, and 45 °C to evaluate the algorithm’s adaptability and robustness in low-temperature, normal-temperature, and high-temperature environments. The basic parameters of the battery are shown in Table 3, and the initial value of SOC is uniformly set to 0.8. The experimental operating conditions include two typical dynamic load profiles:
- (1) FUDS (Federal Urban Driving Schedule): simulates urban traffic conditions and is characterized by frequent acceleration and deceleration.
- (2) DST (Dynamic Stress Test): features large load fluctuations and is designed to examine the battery’s dynamic response under stress.
In each set of experiments, the battery’s terminal voltage, current, and temperature data were collected. The Extended Kalman Filter (EKF) algorithm was first used for preliminary SOC estimation. Subsequently, the current data, EKF-estimated SOC, and temperature information were fed as observation inputs into the proposed Hidden Markov Model-Factor Graph (HMM–FG) for message passing and Maximum A Posteriori (MAP) inference, yielding the optimized SOC curve.
To verify the effectiveness and robustness of the proposed method, this study conducted a comparative analysis between the EKF–HMM–FG method and three benchmark methods: EKF, Extended Kalman Smoother (EKS), and the LSTM-based data-driven method. The focus was on evaluating their estimation accuracy and volatility performance under different temperature conditions and operating profiles.
7.2. Comparative Methods
7.2.1. Extended Kalman Smoother (EKS)
The Extended Kalman Smoother (EKS) is a posterior state estimation method built on the Extended Kalman Filter (EKF). Its core idea is to follow the forward filtering recursion with a backward pass that uses observations from future moments to correct past states, thereby obtaining a posterior estimate over the entire time series. Whereas the EKF is a causal filter that relies only on current and past observations, the EKS is a smoothing method that uses past, present, and future observations together. It therefore achieves higher estimation accuracy when the system noise is large or the model parameters are highly uncertain.
In this study, the EKS adopts the same battery equivalent circuit model as the EKF as the state-space model. The state variables include SOC, model internal resistance, polarization voltage, and other parameters. The implementation process of the EKS includes the following two steps:
- (1) Forward Filtering Phase: The SOC is estimated recursively via the EKF, yielding the a priori state estimates and the covariance sequence.
- (2) Backward Smoothing Phase: Using the Rauch–Tung–Striebel (RTS) smoothing algorithm, the correction gain is propagated backward from the final moment to perform smoothing correction on the state estimate at each moment, resulting in the posterior estimate of the entire time series.
By incorporating future observation information, the EKS can effectively reduce the estimation fluctuations of the EKF under operating conditions with rapid dynamic changes and improve the overall accuracy of SOC estimation.
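The two phases can be sketched for a scalar linear model, which is the core recursion that the EKS applies with Jacobians in place of the constant coefficients. The random-walk model, noise levels, and synthetic measurements below are assumptions for illustration and do not reproduce the benchmark configuration.

```python
import random
from typing import List, Sequence, Tuple

Series = List[float]


def kalman_filter(measurements: Sequence[float], f: float, h: float, q: float,
                  r: float, x0: float, p0: float) -> Tuple[Series, Series, Series, Series]:
    """Forward phase: return predicted and filtered means and variances."""
    x_pred, p_pred, x_filt, p_filt = [], [], [], []
    x, p = x0, p0
    for z in measurements:
        xp = f * x                       # a priori state
        pp = f * p * f + q               # a priori variance
        gain = pp * h / (h * pp * h + r)
        x = xp + gain * (z - h * xp)     # filtered state
        p = (1.0 - gain * h) * pp        # filtered variance
        x_pred.append(xp)
        p_pred.append(pp)
        x_filt.append(x)
        p_filt.append(p)
    return x_pred, p_pred, x_filt, p_filt


def rts_smoother(x_pred: Series, p_pred: Series, x_filt: Series,
                 p_filt: Series, f: float) -> Tuple[Series, Series]:
    """Backward phase: Rauch-Tung-Striebel correction from the last step to the first."""
    n = len(x_filt)
    x_smooth = list(x_filt)
    p_smooth = list(p_filt)
    for k in range(n - 2, -1, -1):
        g = p_filt[k] * f / p_pred[k + 1]   # smoother gain
        x_smooth[k] = x_filt[k] + g * (x_smooth[k + 1] - x_pred[k + 1])
        p_smooth[k] = p_filt[k] + g * (p_smooth[k + 1] - p_pred[k + 1]) * g
    return x_smooth, p_smooth


# Synthetic data: a constant true value of 0.8 observed with Gaussian noise.
rng = random.Random(0)
measurements = [0.8 + rng.gauss(0.0, 0.05) for _ in range(50)]
x_pred, p_pred, x_filt, p_filt = kalman_filter(measurements, f=1.0, h=1.0,
                                               q=1e-5, r=0.0025, x0=0.5, p0=1.0)
x_smooth, p_smooth = rts_smoother(x_pred, p_pred, x_filt, p_filt, f=1.0)
```

The smoothed variance never exceeds the filtered variance, and the two coincide at the final step, where no future observations are available.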
7.2.2. Long Short-Term Memory (LSTM)
Long Short-Term Memory (LSTM) is a typical variant of the Recurrent Neural Network (RNN). By introducing a gating mechanism (input gate, forget gate, and output gate), it alleviates the vanishing gradient problem that traditional RNNs tend to suffer in long-sequence modeling, which allows it to better capture long-term dependencies in time-series data. With this advantage, LSTM has been widely used in time-series modeling tasks such as speech recognition, natural language processing, and battery health management.
In the task of battery SOC estimation, LSTM does not rely on an explicit battery equivalent circuit model. Instead, it relies on a large amount of historical operating data for end-to-end training. By learning the complex nonlinear mapping relationships between voltage, current, temperature, and SOC, it achieves fully data-driven SOC prediction. Compared with EKF and EKS methods based on physical models, LSTM is more flexible and can adapt to different working conditions and nonlinear characteristics to a certain extent. However, its performance depends on the scale and coverage of training data. When the training data are insufficient or there is a large difference between the working condition distribution and the test conditions, the prediction accuracy and generalization ability may be affected.
7.2.3. Analysis of Experimental Results
To verify the effectiveness of the EKF-HMM-FG method proposed in this study, SOC estimation experiments were conducted under three temperature conditions (0 °C, 25 °C, and 45 °C) and two operating conditions (FUDS and DST).
Figures 8, 9, 10, and 11 show the SOC estimation curves and corresponding error curves of each method under different conditions, while Tables 4, 5, 6, 7, 8, and 9 summarize the quantitative error indicators of each method.
Analysis of the SOC estimation curves shows that the estimation results of the traditional EKF have large fluctuations. Especially under low-temperature and high-dynamic load conditions, the curve exhibits obvious jitter and even deviation. Compared with EKF, EKS produces smoother estimation curves, but there is still a deviation from the actual SOC. The LSTM can follow SOC changes well under some conditions, but it has a certain overall drift problem, which is particularly obvious at high temperatures. In contrast, the SOC curve obtained by the proposed EKF-HMM-FG method in this study is highly consistent with the actual value. Moreover, in the locally enlarged area, it can closely follow the actual SOC trajectory, showing stronger robustness and stability.
The SOC error curves further illustrate the differences between the methods. The errors of EKF and EKS fluctuate greatly over time, with the maximum value exceeding 3% to 4%. For LSTM, the error deviates seriously under some operating conditions, showing a systematic bias. In contrast, the error of the proposed EKF-HMM-FG method is always concentrated around zero, and its fluctuation range is significantly reduced. It can be maintained within ±1.5% under all three temperature conditions, demonstrating strong temperature adaptability.
The MAE, RMSE, maximum error, and Bias indicators shown in Tables 4, 5, 6, 7, 8, and 9 are consistent with the curve results. Taking the FUDS operating condition as an example, the RMSE of the proposed EKF-HMM-FG method at 25 °C is only 0.425%, with a maximum error of 1.411%, far lower than the EKF’s 0.883% (RMSE) and 4.103% (maximum error). At 45 °C, the RMSE of the EKF-HMM-FG further decreases to 0.293%, and the maximum error is only 0.841%. The method shows similar advantages under the DST operating condition, which indicates that it maintains high accuracy and stability across different operating conditions and temperatures.
A combination of graph and table comparisons shows that the EKF-HMM-FG method is significantly superior to EKF and EKS in suppressing noise fluctuations and reducing systematic deviations. At the same time, it avoids the generalization problem of the LSTM method in cross-temperature scenarios. Its Bias is close to zero at all temperatures, which verifies the practicality and robustness of the proposed method in complex operating conditions.
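The four error indicators can be computed as in the sketch below. The definitions are the conventional ones and are assumed here, with Bias taken as the mean signed error (estimate minus reference); the sample values are made up and are not taken from Tables 4 to 9.

```python
import math
from typing import Dict, Sequence


def error_metrics(estimated: Sequence[float],
                  reference: Sequence[float]) -> Dict[str, float]:
    """Return MAE, RMSE, maximum absolute error, and bias of an SOC estimate."""
    errors = [e - r for e, r in zip(estimated, reference)]
    n = len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "max_error": max(abs(e) for e in errors),
        "bias": sum(errors) / n,   # mean signed error (assumed definition)
    }


# Hypothetical four-sample example.
metrics = error_metrics([0.81, 0.69, 0.62, 0.50], [0.80, 0.70, 0.60, 0.50])
```

A bias close to zero together with a nonzero RMSE indicates fluctuation around the true value, whereas a bias comparable to the RMSE points to a systematic offset of the kind described for the LSTM.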
8. Conclusions
This paper proposes a battery state estimation method based on the hidden Markov model factor graph. The dynamic characteristics of the battery are modeled using a second-order equivalent circuit model. In this method, the terminal voltage serves as the observation factor, while the SOC estimated by the EKF is used to construct the state transition factor. In the simulation experiments, the sum-product algorithm estimates the marginal distribution of SOC through message passing in the factor graph. Together, these design choices improve the accuracy, stability, and robustness of the estimation.
In the future, the online real-time estimation of the model can be further improved. Multi-source information [19,20] (such as temperature, capacity, attenuation, etc.) can be combined to enhance the accuracy of SOC estimation, so as to meet the needs of complex application scenarios such as electric vehicles.