1. Introduction
With the rapid advancement of green energy technologies, the precise estimation of battery state of charge (SOC) within battery management systems (BMSs) has become a research hotspot in the energy sector [1]. As the core energy storage component in new energy technologies, lithium-ion batteries have achieved large-scale application in electric vehicles and smart grids thanks to their significant advantages, including high energy density [2], long cycle life, and absence of memory effect [3]. Notably, the safe operation of electric vehicle powertrains relies heavily on the BMS's real-time monitoring of SOC [4]. However, SOC cannot be measured directly because of the complexity of the battery's electrochemical system.
The current mainstream methods for predicting battery SOC fall into three major technical approaches: Kalman filter algorithms based on physical models, traditional empirical ampere-hour integration methods, and emerging data-driven methods [5,6,7]. While the ampere-hour integration method has the advantage of a simple principle, its error accumulates over time. Neural network methods can achieve high-precision predictions but are constrained by the scale of the training data. Open-circuit voltage and internal resistance methods depend on static battery conditions and measurement device accuracy, and thus struggle to meet practical engineering requirements. With the rapid advancement of deep learning technology [8], data-driven approaches have emerged as the core research methodology in this field [9] owing to their structural simplicity and ease of implementation. Such approaches effectively approximate complex nonlinear mapping relationships, enabling the construction of high-precision prediction systems [10]. Notably, the Hammerstein model demonstrates unique advantages in SOC prediction due to its distinctive nonlinear mapping capability [11]. By cascading a static nonlinear module with a dynamic linear module, this model decomposes a complex nonlinear problem into separately manageable nonlinear compensation and linear dynamics sub-problems.
The nonlinear-input, linear-dynamic response structure of the Hammerstein model [12,13] aligns well with the autonomous learning capability of neural networks, enabling the construction of highly accurate SOC prediction models. Yang et al. [14] employed a three-layer BP neural network to estimate SOC; however, traditional BP neural networks are sensitive to initial weights and susceptible to local optima, resulting in imprecise SOC predictions. Liu et al. [15] used a genetic algorithm (GA) to optimize the BP neural network, but this optimization requires tuning numerous parameters, such as the crossover and mutation rates, which slows convergence. Li et al. [16] introduced particle swarm optimization (PSO) for weight and threshold correction, constructing a PSO-optimized BP neural network model; although this improved convergence speed, its local search capability remained insufficient. Gao et al. [17] used the Grey Wolf Optimizer (GWO) to optimize the weights of BP neural networks, yet a reducible steady-state error margin persisted after optimization. Zhao et al. [18] employed an improved firefly algorithm to optimize backpropagation neural networks, but this approach exhibited high computational complexity and time consumption. Wei et al. [19] used the cuckoo search algorithm to optimize a BP neural network for predicting the state of health of lithium batteries, but with few iterations the errors remained large. Ahmad et al. [20] applied the hippopotamus optimization (HO) algorithm to electric vehicle charging network research, and Saber et al. [21] used the HO algorithm for risk assessment with LSTM network optimization. These studies showed that the HO algorithm offers good optimization performance and prediction accuracy in fields such as mechanical design and photovoltaic power forecasting. On this basis, this article combines the HO algorithm with the Hammerstein model for the first time, constructing an HO-BP-Hammerstein composite structure for predicting the state of charge (SOC) of lithium-ion batteries.
In existing SOC prediction methods for lithium-ion batteries, improving neural networks with optimization algorithms has become the mainstream research direction. However, the mainstream algorithms have inherent limitations that restrict further improvement of prediction performance. The genetic algorithm (GA) relies on complex manual parameter tuning and converges slowly; particle swarm optimization (PSO), although faster to converge, is prone to premature convergence in complex nonlinear spaces and lacks local exploitation capability, resulting in large steady-state errors; the Grey Wolf Optimizer (GWO) and similar algorithms show poor stability of local optimization under dynamic conditions, which limits the model's generalization ability. Compared with these methods, the Hippopotamus Optimization (HO) algorithm [22] introduced in this article differs significantly in mechanism and offers real innovative value: it simulates the social structure and behavioral competition of hippopotamus populations and constructs an optimization framework with fewer parameters and stronger search dynamics [23]. During the global exploration phase, the HO algorithm maintains population diversity more effectively and avoids falling into local optima; during the local exploitation phase, it conducts a more refined search. This mechanism enables HO to optimize neural network weights more accurately when dealing with the highly nonlinear and dynamically coupled characteristics of battery systems, achieving a better balance between exploration and exploitation and providing a more efficient algorithmic framework for constructing high-precision SOC prediction models. Subsequent experiments demonstrate that incorporating this mechanism into the Hammerstein model significantly improves predictive performance.
Among existing lithium-ion battery SOC prediction methods, the Hammerstein model can characterize the dynamic-static coupling characteristics of batteries. However, its nonlinear module often employs traditional function approximation methods, leading to dynamic response lag and insufficient generalization capability under complex operating conditions. Current mainstream Hammerstein model identification methods include key-term separation [24], over-parameterized identification [25,26], hierarchical recursive identification [27,28,29], and filtering techniques [30,31]. Among these, the Hammerstein model developed by Liu and Wang et al. [11,24], which combines key-term separation with hierarchical identification principles, predicts the battery's state of charge. However, these methods generally suffer from insufficient dynamic tracking capability and high sensitivity to noise. To address this bottleneck, this paper introduces an HO-optimized BP neural network into the nonlinear module of the Hammerstein model. This approach offers enhanced algorithmic performance, an optimized model structure, and improved engineering applicability. Experiments show that the proposed HO-BP-Hammerstein model achieves significant advantages in standard function optimization (e.g., an F4 benchmark accuracy of 6.1 × 10^−2.61). In battery SOC prediction, the MAE was as low as 0.469% under 0 °C FUDS conditions and remained below 0.74% in wide-temperature-range testing. A prediction accuracy of R² > 97% was achieved on real-vehicle data, significantly improving the robustness and practicality of the prediction.
The main contributions of this paper are as follows:
- (1) For lithium-ion battery SOC prediction under complex operating conditions, a composite architecture integrating neural networks with the Hammerstein model is proposed. This approach suppresses the accumulation of multi-factor errors through a nonlinear dynamic coupling mechanism.
- (2) Employing the key-term separation concept, the coupling between the parameters of the nonlinear and linear portions of the Hammerstein SOC model is removed with minimal parameters and computational effort.
- (3) Through comparisons with GWO-BP and PSO-BP approaches, the proposed HO-BP-Hammerstein method demonstrates superior performance across various operating conditions, achieving an average error below 0.74%.
- (4) Practical in-vehicle validation: using hybrid electric vehicle data, experiments demonstrate that the proposed method is applicable with high stability and accuracy.
The remainder of this article is organized as follows:
Section 2 introduces the Hammerstein model based on the HO-optimized backpropagation neural network and the BP deep neural network architecture.
Section 3 presents the mathematical modeling of the HO algorithm.
Section 4 verifies the optimization performance of the HO algorithm and describes the prediction procedure of the HO-BP-Hammerstein model.
Section 5 details the data sources and processing methods used in the experiments, as well as the experimental parameter settings and result analysis.
Section 6 summarizes this paper and proposes directions for further research.
3. Mathematical Modeling of the HO Algorithm
3.1. Population Initialization
Hippopotamus optimization is a population-based optimization algorithm whose search agents consist of hippopotamus individuals. Hippopotamuses represent candidate solutions to the optimization problem, meaning each hippopotamus’s updated position within the search space signifies an adjustment to the decision variable’s value. The initialization phase requires generating random initial solutions. During this process, decision variables are generated according to the following formula:
x_{i,j} = lb_j + r × (ub_j − lb_j),  i = 1, 2, …, N,  j = 1, 2, …, m.
In this formula, x_{i,j} is the value of the j-th decision variable of the i-th solution, r is a uniform random number in [0, 1], lb_j is the lower bound and ub_j the upper bound of the j-th variable, N is the population size, and m is the number of decision variables.
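The initialization rule above can be sketched in a few lines of NumPy (a minimal illustration; the function name and interface are our own, not from the paper):

```python
import numpy as np

def init_population(n_pop, n_var, lb, ub, rng=None):
    """Generate N random candidates: x_ij = lb_j + r * (ub_j - lb_j), r ~ U[0, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    r = rng.random((n_pop, n_var))   # one uniform draw per decision variable
    return lb + r * (ub - lb)        # shape (N, m), each row one hippo

pop = init_population(50, 10, lb=-1.0, ub=1.0)
```

Scalar or per-variable bounds both work here, since NumPy broadcasting handles either case.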
The HO algorithm simulates the social structure and juvenile behavior characteristics of hippopotamus herds, integrating leader guidance, exploratory wandering, and risk-avoidance mechanisms into the optimization process. Its core concept employs biologically inspired strategies to balance global search with local exploitation, thereby enhancing the efficiency of solving complex optimization problems [21].
In this study, the Hippopotamus Optimization (HO) algorithm was employed as a global optimizer, directly optimizing all weights and bias parameters of the BP neural network rather than merely initializing them. The entire training process is driven by HO; the BP network serves only to compute the fitness of each candidate from its forward-pass outputs, rather than performing an independent gradient-descent training phase.
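Under this scheme, each HO candidate is simply a flattened vector of network parameters, and its fitness is the training error of the resulting network. A hedged sketch (the single-hidden-layer structure, layer sizes, and function names are illustrative assumptions, not the paper's exact architecture):

```python
import numpy as np

def unpack(theta, n_in, n_hid, n_out):
    """Split a flat HO candidate vector into BP weight matrices and bias vectors."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = theta[i:i + n_out]
    return W1, b1, W2, b2

def fitness(theta, X, y, n_in, n_hid, n_out=1):
    """Forward pass only: HO uses the training MSE as the fitness value."""
    W1, b1, W2, b2 = unpack(theta, n_in, n_hid, n_out)
    h = np.tanh(X @ W1 + b1)          # hidden layer, tanh activation
    y_hat = (h @ W2 + b2).ravel()     # linear output layer
    return np.mean((y - y_hat) ** 2)
```

HO then minimizes `fitness` over the flattened parameter vector; no gradient computation is required.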
3.2. Phase One: Hippopotamus Position Updates in Rivers or Ponds (Exploration Phase)
The first stage of the HO algorithm achieves global exploration of the solution space by simulating the competitive and dispersal behaviors of male members within a hippopotamus herd. The dominant male hippopotamus (representing the global optimum) guides the population toward convergence, while expelled males enhance diversity through random perturbations or competitive mechanisms, thereby improving the algorithm’s robustness in tackling complex optimization problems.
Hippopotamus populations consist of several adult female hippos, hippo calves, multiple adult male hippos, and one dominant male hippo (the group leader). The selection of the dominant male hippo is based on the iterative evaluation of objective function values (taking the minimum value in minimization problems and the maximum value in maximization problems). The mathematical expression for the position of male hippopotamus members in a river or pond is shown in Equation (13).
In Equation (13), x_i^{Mhippo} denotes the updated position of the i-th male hippopotamus, y1 is a random number in the interval [0, 1], and D_hippo represents the position of the dominant hippopotamus (i.e., the solution with the best fitness in the current iteration). In Equation (14), r1–r5 are random vectors in [0, 1], r6 is a random scalar in [0, 1], I1 and I2 are random integers taking the value 1 or 2, and P1 and P2 are random integers taking the value 0 or 1. In Equation (16), MG_i denotes the mean position of a randomly selected subset of hippos (which may include the current hippo x_i). Equations (16) and (17) describe the rules for updating the positions of female or immature hippos within the population. If the factor T is greater than 0.6, the calf has strayed far from its mother. r7 is a random number in [0, 1]; if it exceeds 0.5, local exploration is performed, otherwise global exploration. h1 and h2 are coefficients or vectors randomly drawn from preset scenarios to adjust the movement direction and step size, and r8 is a random number in [0, 1]. In Equations (18) and (19), F(·) is the objective function value used to evaluate the quality of candidate solutions.
By simulating hippopotamus characteristics, a dynamic equilibrium between exploration and exploitation is achieved. Position updates driven by random perturbations and fitness optimize neural network weights and biases, thereby enhancing model performance.
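The phase-one move toward the dominant hippo can be illustrated as follows (a sketch of the update rule the text attributes to Equation (13); the symbol names y1 and I1 follow the variable descriptions above, and the function name is our own):

```python
import numpy as np

def male_update(x, dominant, rng):
    """Phase-1 exploration step toward the dominant hippo (current best solution).

    Implements x_new = x + y1 * (D_hippo - I1 * x), where y1 ~ U[0, 1]
    componentwise and I1 is a random integer in {1, 2}."""
    y1 = rng.random(x.shape)         # random vector in [0, 1]
    I1 = int(rng.integers(1, 3))     # random integer, 1 or 2
    return x + y1 * (dominant - I1 * x)
```

In the full algorithm the new position is then accepted only if it improves the fitness, per the greedy retention rule described later.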
3.3. Phase Two: Hippopotamus Defending Against Predators (Exploration Phase)
The second stage of the HO algorithm transforms low-fitness regions ("predator locations") into exploration drivers by simulating hippopotamus defense against predators. Through directed perturbations and multi-strategy exploration mechanisms, it enhances the algorithm's global search capability in complex solution spaces. This phase, together with the first phase, provides an adaptive balance between exploration and exploitation. The predator's position within the search space is defined by Equation (20).
Here, r⃗ denotes a random vector with components in [0, 1]. The proximity of the i-th candidate solution to the low-fitness region is expressed by Equation (21). Equations (21) and (22) achieve adaptive switching between local exploration and global search by simulating the hippopotamus's distance-dependent response to a predator. This mechanism enables the HO algorithm to perform exceptionally well in complex optimization problems, particularly in optimizing neural network weights and biases, effectively balancing training speed and model performance. x_i^{HippoR} is a candidate solution in the defensive state, and RL is a random vector drawn from a Levy distribution, simulating sudden position changes during a predator attack. In Equation (21), f takes values in [2, 4], c in [1, 1.5], D in [2, 3], and g in [−1, 1]; l⃗ is a multidimensional random direction vector. In the Levy step, w and v are random values in [0, 1], ϑ is a constant equal to 1.5, Γ denotes the gamma function, and σ is the scale parameter.
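Levy-distributed perturbations with Γ and ϑ = 1.5, as described above, are commonly generated with Mantegna's algorithm; a sketch under that assumption (the function name is our own, and normal draws stand in for the paper's exact random variables):

```python
import math
import numpy as np

def levy(dim, beta=1.5, rng=None):
    """Levy-distributed step via Mantegna's algorithm with exponent beta = 1.5."""
    if rng is None:
        rng = np.random.default_rng()
    # Scale parameter sigma from the gamma-function formula
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    w = rng.normal(0.0, sigma, dim)  # numerator draw, scale sigma
    v = rng.normal(0.0, 1.0, dim)    # denominator draw, unit scale
    return w / np.abs(v) ** (1 / beta)
```

The heavy tail of the resulting step distribution produces the occasional long jump that helps the defensive phase escape local optima.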
The survival competition rule applied after hippos defend against predators, shown in the formula above, centers on determining individual retention or replacement through fitness comparison. In the second stage, a significant improvement in the global search process is observed. Phases one and two complement each other, effectively preventing the search from stagnating in local optima.
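The fitness-comparison retention rule amounts to an elementwise greedy selection; a minimal sketch for a minimization problem (names are our own):

```python
import numpy as np

def greedy_select(pop, fit, cand, cand_fit):
    """Keep each candidate only if it improves fitness (minimization);
    otherwise retain the old hippo and its fitness."""
    better = cand_fit < fit
    new_pop = np.where(better[:, None], cand, pop)   # row-wise choice
    new_fit = np.where(better, cand_fit, fit)
    return new_pop, new_fit
```

Applying this rule after every phase guarantees the population's best fitness never degrades between iterations.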
3.4. Phase Three: Hippopotamus Escapes Predator (Exploitation Phase)
When hippos encounter groups of predators or fail to repel threats through defensive behaviors, their survival strategy is to retreat rapidly to a nearby safe zone such as a lake or pond. The third phase of the HO algorithm simulates this behavior by generating random positions near the current location, enhancing local exploitation and refining the search around promising solutions.
Equations (26)–(29) establish a multi-scenario local search, where t denotes the current iteration count and T the maximum iteration count, dynamically adjusting the search intensity. x_i^{HippoE} denotes a candidate solution generated during local exploitation. D3 is randomly selected from one of three scenarios, with the chosen scenario exhibiting stronger local search capability. s1 denotes a vector uniformly distributed over [0, 1]^m, while r9 and r10 represent scalars uniformly distributed over [0, 1], and s2 denotes a normally distributed scalar with mean 0 and standard deviation 1. Each scenario emphasizes a different exploitation behavior (detailed search, gradient following, or random perturbation). This design endows the HO algorithm with high flexibility and efficiency during the local exploitation phase, making it particularly suitable for scenarios requiring fine-tuned optimization.
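The shrinking-neighbourhood idea behind the phase-three update can be sketched as follows (a simplified illustration of a single scenario with bounds contracting as iterations advance; names and the exact contraction schedule are our assumptions, not the paper's equations):

```python
import numpy as np

def escape_update(x, lb, ub, t, rng):
    """Phase-3 fine search: take a small random step inside bounds that
    shrink with the iteration count, mimicking a retreat to a nearby safe zone."""
    lb_loc = lb / (t + 1)            # local lower bound, tightens over time
    ub_loc = ub / (t + 1)            # local upper bound, tightens over time
    d = rng.random(x.shape)          # random point inside the local interval
    return x + rng.random() * (lb_loc + d * (ub_loc - lb_loc))
```

As t grows, the step magnitude contracts, shifting the algorithm from coarse moves toward the fine-grained search the text describes.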
The HO algorithm implicitly simulates the characteristics of different hippo roles through behavioral fusion and random perturbation, rather than explicitly categorizing the population. This avoids excessive computational complexity and over-parameterization in network optimization. The design balances biological inspiration with engineering practicality, ensuring the algorithm maximizes optimization performance while preserving the essence of the natural behavior.
5. Experimental Simulation and Results Processing
5.1. Experimental Data
The lithium battery dataset used in this paper originates from the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland; the battery model is INR18650-20R. Key parameters are listed in Table 3. The data include the Federal Urban Driving Schedule (FUDS) dataset and the Dynamic Stress Test (DST) dataset, shown in Figure 8 and Figure 9. With the test temperature set to 0 °C and an initial battery SOC of 80%, detailed battery parameters for both the FUDS and DST datasets, such as voltage, current, voltage change rate, and SOC, are provided.
5.2. Performance Evaluation Metrics
This paper employs the Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Determination (R²) to evaluate the model's SOC prediction performance. The calculation formulas are:
MSE = (1/n) Σ (y_i − ŷ_i)²,  RMSE = √MSE,  MAE = (1/n) Σ |y_i − ŷ_i|,  R² = 1 − Σ (y_i − ŷ_i)² / Σ (y_i − ȳ)²,
where y_i is the measured SOC, ŷ_i the predicted SOC, ȳ the mean of the measured values, and n the number of samples.
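These four metrics can be computed directly from the prediction residuals; a small reference implementation:

```python
import numpy as np

def soc_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, and R^2 as defined in Section 5.2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(err))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}
```

A perfect prediction yields MSE = 0 and R² = 1; R² can be negative when the model performs worse than predicting the mean.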
5.3. Verification of SOC Prediction Accuracy Superiority
To validate the SOC prediction performance of the HO-BP-Hammerstein model [37], this study predicts the state of charge during discharge under FUDS conditions (0 °C, 25 °C, 45 °C) with an initial SOC of 80%. The results are compared with other SOC prediction methods, namely GWO-BP and PSO-BP. As shown in Figure 10b, during the initial simulation phase, the PSO-BP and GWO-BP methods predicted values closer to the actual SOC than the HO-BP-Hammerstein model. However, as the battery continued discharging, the predictions of both PSO-BP and GWO-BP exhibited significant fluctuations with gradually increasing errors. Despite the early-stage advantage of PSO-BP and GWO-BP, the HO-BP-Hammerstein model consistently maintains a narrower error range between predicted and actual SOC, clearly exceeding the comparison methods in prediction accuracy and exhibiting superior stability. The comparative evaluation metrics for each method are summarized in Table 4. The HO-BP-Hammerstein model achieved the best performance across all metrics: R² = 0.976, MSE = 3.7 × 10^−5, RMSE = 0.615%, and MAE = 0.469%. This fully validates the model's advantages in both accuracy and stability.
To assess the effectiveness of the HO-BP-Hammerstein model in predicting SOC under different operating conditions, SOC prediction experiments were also conducted under DST conditions (0 °C, 25 °C, 45 °C) with an initial SOC of 80%. As shown in Figure 11a–c, the SOC prediction curves of the HO-BP-Hammerstein model at different temperatures exhibit the closest agreement with the measured values. Simulation results demonstrate that, compared to the reference methods, the HO-BP-Hammerstein model adapts well to different operating conditions and maintains higher stability in battery SOC prediction. As shown in Table 5, at 0 °C the HO-BP-Hammerstein model achieves an R² of 0.986, with MAE and RMSE of 0.368% and 0.485%, respectively, whereas the GWO-BP model achieves an R² of 0.941, with MAE and RMSE of 0.83% and 1.001%; the proposed model thus reduces the errors substantially. These results further demonstrate the applicability and high accuracy of the HO-BP-Hammerstein model under varying operating conditions, highlighting its value for engineering practice.
According to Table 6, the training time of HO-BP-Hammerstein is higher than that of PSO-BP and GWO-BP, owing to the HO algorithm's more complex population-behavior simulation and fitness evaluation. During deployment, however, the single-inference time of all three methods is extremely small, since the trained model requires only one simple forward calculation per prediction. Although model training (an offline step) takes longer, it does not affect the online near-real-time estimation performance of the vehicle BMS. Therefore, the HO-BP-Hammerstein model not only significantly improves prediction accuracy but is also feasible for near-real-time, high-precision SOC estimation on actual BMS hardware.
5.4. Real Vehicle Battery Prediction Simulation
In practical applications, accurate prediction of SOC is crucial, but accuracy is affected by various factors.
This section conducts experimental validation using integrated battery data from two different electric vehicle brands. Both models use 18650-type ternary lithium-ion battery systems, charged under fast-charging and slow-charging modes, respectively. Through characteristic analysis of the battery data combined with experimental simulation, the generalization capability of the HO-BP-Hammerstein model is verified.
To ensure the input quality and training stability of the model, we have systematically preprocessed the raw data. The specific steps and methods are as follows:
Due to the high data-collection frequency, the original charging data contain significant redundancy, manifested as multiple repeated or slightly differing single-cell voltage records corresponding to the same SOC value. First, we deduplicated the data to eliminate identical records. Next, a method combining physical constraints with statistical analysis was used to handle outliers: based on the normal operating voltage range of lithium-ion batteries (2.5 V–4.2 V), physically impossible values clearly outside this range were removed. At the same time, abnormal voltage/current values in the historical operating data were smoothly filled using cubic spline interpolation, restoring the continuity and plausibility of the data and avoiding the sequence interruption or information loss that direct exclusion would cause. After cleaning, a total of 140,866 valid charging records were obtained, providing a reliable data foundation for subsequent modeling.
To further improve training efficiency and numerical stability, a normalization step linearly maps all feature data into the [−1, 1] interval. This range matches the output domain of the hyperbolic tangent activation function commonly used in neural networks, which not only unifies the scales of features of different magnitudes but also accelerates model convergence and provides a smoother search space for the subsequent HO optimization of the parameters. The formula follows:
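The cleaning steps above (deduplication, physical voltage limits, cubic-spline filling) might be sketched as follows; the column names "t" and "voltage" are illustrative assumptions, and SciPy's `CubicSpline` stands in for the paper's interpolation routine:

```python
import pandas as pd
from scipy.interpolate import CubicSpline

def clean_charging_data(df):
    """Deduplicate, flag physically impossible voltages (outside 2.5-4.2 V),
    and spline-fill the flagged samples so no rows are lost."""
    df = df.drop_duplicates().sort_values("t").reset_index(drop=True)
    ok = df["voltage"].between(2.5, 4.2)        # physical plausibility mask
    spline = CubicSpline(df.loc[ok, "t"], df.loc[ok, "voltage"])
    df.loc[~ok, "voltage"] = spline(df.loc[~ok, "t"])  # smooth fill
    return df
```

Spline filling preserves sequence continuity, whereas simply dropping the outlier rows would break the time series.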
x_norm = 2 × (x − x_min) / (x_max − x_min) − 1,
where x_norm denotes the normalized data value, x represents the raw data, and x_max and x_min are the maximum and minimum values in the raw data. All processed charge data lie within the range [−1, 1].
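The normalization can be written as a one-line mapping onto [−1, 1]:

```python
import numpy as np

def normalize(x):
    """Linear map onto [-1, 1]: x' = 2 * (x - x_min) / (x_max - x_min) - 1."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0
```

Each feature column is normalized independently so that features of different magnitudes share a common scale.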
Algorithm parameters: the hippo population size is 50, the maximum iteration count is 50, the network iteration count is 100, and the convergence threshold is 10^−6. Of the experimental data, 80% is used for model training, while the remaining 20% is reserved for validation to ensure the model maintains good predictive accuracy under real-world conditions.
The model prediction results, shown in Figure 12 and Figure 13, demonstrate excellent predictive accuracy. Under these conditions, the HO-BP-Hammerstein model achieved R² = 97.1918%, MAE = 0.081149, MSE = 0.010592, and RMSE = 0.10292. The experimental results demonstrate the model's strong generalization capability on new data, highlighting its applicability and stability.
In the HO-BP-Hammerstein model, the core mechanism of SOC prediction lies in organically integrating the nonlinear dynamic modeling capability of the Hammerstein system with the parameter adaptation feature of the BP neural network through a hybrid optimization strategy. Simultaneously, leveraging the global search advantage of the HO algorithm effectively avoids the pitfall of traditional gradient descent methods, which are prone to getting stuck in local minima. The model is implemented through a three-tiered structure: the Hammerstein module consists of a static nonlinear block cascaded with a dynamic linear block; the BP neural network serves as the parameter identifier, with its weight matrix and bias vector forming the optimization variables for the HO algorithm; and the HO algorithm efficiently searches the global parameter space to locate the optimal network parameter combination that minimizes prediction error. Theoretically, the effective separation of static nonlinear and dynamic linear parameters within the Hammerstein model enables gradient information calculation for the BP network through backpropagation of errors. Combined with the adaptive search mechanism of the HO algorithm, this achieves optimal parameter selection. The error norm exhibits exponential decay with iteration, ultimately confining SOC prediction errors strictly within engineering accuracy requirements.
6. Conclusions
This paper proposes the HO-algorithm-optimized BP neural network Hammerstein model (HO-BP-Hammerstein), achieving high-precision prediction of lithium-ion battery state of charge. By globally optimizing the neural network's weight matrices and bias parameters via the HO algorithm, the model's nonlinear fitting capability is significantly enhanced. In multi-condition testing under the FUDS and DST protocols across the 0 °C to 45 °C temperature range, the model achieved a mean absolute error as low as 0.368%, a 46.2% reduction compared to the GWO-BP method, demonstrating excellent temperature adaptability and prediction accuracy. Practical engineering validation confirms the model's effective application in hybrid electric vehicle battery management systems, enabling reliable SOC prediction for real electric vehicles and providing critical data support for battery thermal runaway early warning.
This study makes notable progress in SOC prediction for lithium-ion batteries, but certain limitations remain. On the one hand, the model training process has a higher computational cost; on the other hand, the size of the datasets and the diversity of operating conditions used for validation are still limited. Future work should focus on concrete, feasible paths, in particular the fusion of the multi-source heterogeneous data generated during real-vehicle operation. Specifically, multidimensional time-series data collected synchronously under different charging strategies (such as fast-charging and slow-charging modes), including voltage curves, temperature distributions, cumulative cycle counts, and current change rates, need to be integrated, since these variables strongly influence the battery's dynamic characteristics and aging state. To effectively extract key information from such complex heterogeneous data, we recommend a technical path that uses attention mechanisms to extract key features and dynamically weight the different data sources, achieving adaptive co-optimization of model structure and parameters. This would enable a higher-precision next-generation battery state prediction framework and promote the transfer of these research results to more complex real-vehicle environments.