Open Access Article
Stage-Wise SOH Prediction Using an Improved Random Forest Regression Algorithm
by
Wei Xiao, Jun Jia, Wensheng Gao, Haibo Li, Hong Xu, Weidong Zhong and Ke He
Electronics 2026, 15(2), 287; https://doi.org/10.3390/electronics15020287 (registering DOI) - 8 Jan 2026
Abstract
In complex energy storage operating scenarios, batteries seldom undergo the complete charge–discharge cycles required for periodic capacity calibration. Methods based on accelerated aging experiments can indicate possible aging paths; however, because of uncertainties such as changing operating conditions, environmental variations, and manufacturing inconsistencies, the degradation information obtained from such experiments may not apply to the entire lifecycle. To address this, we developed a stage-wise state-of-health (SOH) prediction approach that combines offline training with online updating. During the offline training phase, multiple single-cell experiments were conducted under various combinations of depth of discharge (DOD) and C-rate. Multi-dimensional health features (HFs) were extracted, and an accelerated aging probability was defined. Based on the correlation statistics among the HFs, the accelerated aging probability, and the SOH, all cells in the dataset were divided into general early, middle, and late aging stages. For each stage, cells were further classified by their longevity (long, medium, and short), and multiple models were trained offline for each category. The results show that models trained on cells following similar aging paths perform significantly better than a single model trained on all data combined. Meanwhile, HF optimization was performed via a three-step process: an initial screening based on expert knowledge, a second screening using Spearman correlation coefficients, and an automatic feature importance ranking using a random forest regression (RFR) model. The proposed method is innovative in the following ways: (1) The stage-wise multi-model strategy significantly improves the SOH prediction accuracy across the entire lifecycle, maintaining the mean absolute percentage error (MAPE) within 1%. (2) The improved model provides uncertainty quantification, issuing a warning signal at least 50 cycles before the onset of accelerated aging. (3) The analysis of feature importance from the model outputs allows the indirect identification of the primary aging mechanisms at different stages. (4) The model is robust to missing or low-quality HFs: if certain features cannot be obtained or are of poor quality, the prediction process does not fail.
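As a rough illustration of the second screening step and the accuracy metric named in the abstract, the following minimal sketch computes Spearman correlation coefficients between candidate HFs and the SOH, keeps the HFs above a cutoff, and evaluates MAPE. It is not the authors' implementation: the function names, the feature names, the 0.8 cutoff, and the data are all hypothetical, and the RFR importance ranking step is not shown.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


def screen_features(features, soh, threshold=0.8):
    """Keep HFs whose |rho| with SOH reaches the threshold (assumed cutoff)."""
    kept = {}
    for name, column in features.items():
        rho = spearman(column, soh)
        if abs(rho) >= threshold:
            kept[name] = rho
    return kept


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


# Synthetic, made-up data: six SOH readings and two hypothetical HFs.
soh = [1.00, 0.98, 0.95, 0.93, 0.90, 0.86]
features = {
    "hf_monotonic": [5.0, 4.8, 4.1, 3.9, 3.0, 2.2],  # tracks SOH closely
    "hf_noisy": [0.3, 0.9, 0.1, 0.7, 0.2, 0.8],      # little relation to SOH
}
selected = screen_features(features, soh)
print(sorted(selected))
```

In a stage-wise setting such as the one described above, a screening like this would presumably be run separately on the cells of each aging stage, so that each stage's model receives the HFs most strongly correlated with the SOH in that stage.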