1. Introduction
The depletion of fossil fuel reserves and the escalating impacts of climate change pose significant challenges to the sustainable development of renewable energy, smart grids, and clean transportation systems. Owing to their high energy density, low self-discharge, environmental friendliness, and long service life, lithium-ion batteries have been widely applied in these fields and play an important role [1,2,3]. However, with repeated charge–discharge cycles during operation, numerous electrochemical side reactions continuously occur in the anode, electrolyte, and cathode, leading to progressive performance degradation. Consequently, the remaining useful life (RUL) of the battery shortens [4], significantly impairing charge–discharge performance and potentially leading to malfunction or even safety accidents under extreme conditions. Accurate prediction of battery RUL can provide critical guidance for preventive maintenance and safe, stable operation, reduce maintenance costs, and mitigate the risk of catastrophic failures [5].
As a core function of the battery management system (BMS), RUL prediction has attracted increasing attention in recent years [6], as it aims to estimate the remaining number of charge–discharge cycles before the battery reaches the failure threshold (FT) under its current health state and nominal operating conditions [7]. Existing battery RUL prediction techniques can be broadly categorized into model-based methods and data-driven methods [8,9].
Model-based prediction methods utilize mathematical models that characterize battery aging behavior to examine the relationship between performance degradation and aging-related performance indicators [10]. Equivalent circuit model-based approaches, however, are unable to fully capture the dynamic variations in batteries [11]. Electrochemical models are built on the reaction mechanisms of electrochemical processes, but the complexity of internal battery reactions makes accurate degradation models difficult to construct [12]. Consequently, model-based estimation remains challenging and generally exhibits limited prediction accuracy [13]. In contrast, data-driven methods can effectively avoid these problems.
Data-driven methods do not require consideration of the internal physicochemical reaction processes of batteries; instead, they directly use historical monitoring data to predict battery degradation trends, mostly via machine learning algorithms [14,15]. Sun et al. [16] proposed a modeling strategy that decomposes the degradation process into analyzable stages and employed support vector machines (SVMs) to predict lithium-ion battery capacity degradation. In addition to machine learning approaches, statistical analysis methods under a probabilistic framework are also commonly used for data-driven RUL prediction [17], including Gaussian process regression [18] and sample entropy-based statistical methods [19]. With the development of artificial intelligence, deep learning-based RUL prediction methods have attracted increasing attention [20]. In ref. [21], high state-of-health (SOH) prediction accuracy was achieved by integrating an improved gray wolf optimization algorithm with a deep extreme learning machine; however, the method relies on manual feature extraction. Ref. [22] significantly improved SOH prediction accuracy by constructing an activation function library that allows long short-term memory (LSTM) networks to adaptively select suitable activation functions at different stages; however, the method relies on empirical rules for activation function selection. Ref. [23] achieved accurate and interpretable RUL prediction by combining bidirectional LSTM (BiLSTM) and a Transformer with a knee-point initiation strategy; however, the model has a complex structure and was validated mainly on a single battery, so its cross-battery generalization capability remains to be evaluated.
The raw capacity sequence of batteries often exhibits pronounced non-stationarity and is coupled with capacity regeneration, measurement noise, and multi-scale degradation characteristics, which limits a model's ability to learn the true degradation patterns [24]. Ref. [25] integrated variational mode decomposition (VMD), an attention mechanism, and a temporal convolutional network (TCN) to effectively separate high-frequency noise and enhance long-term sequence modeling capability. Ref. [26] decomposed the capacity sequence using complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and combined it with a parallel BiLSTM to characterize multi-scale degradation features and capacity regeneration, significantly improving the stability of RUL prediction. Ref. [27] targeted small-sample scenarios by integrating CEEMDAN with principal component analysis feature enhancement, achieving high-accuracy RUL prediction using only 10% of the data. However, VMD and CEEMDAN increase computational complexity and parameter-tuning cost, and decomposing the full sequence before splitting can easily lead to data leakage. To address insufficient generalization and the lack of physical consistency under small-sample and varying operating conditions, recent studies have introduced physical information into models to enhance interpretability, robustness, and cross-condition prediction capability while retaining the high representational power of data-driven methods [28]. Ref. [29] introduced empirical degradation models and physical constraints into a data-driven framework via physics-informed neural networks, achieving high stability and a degree of interpretability under small-sample, cross-battery, and cross-condition scenarios. Ref. [30] achieved label-free, small-sample RUL prediction with strong interpretability using only the online capacity data of a single battery through a Kalman filter and a Wiener degradation model; however, its simplified model assumptions and reliance on a single capacity feature require further validation for real-time applications.
Compared with purely data-driven methods, identification-based model approaches have also received extensive attention in recent years. Such methods typically identify and update model parameters on the basis of a pre-defined model structure through techniques such as least squares or swarm intelligence optimization [31]. Ref. [32] proposed an SOH estimation method based on impulse response and temperature-extended ARX models, achieving joint estimation of battery capacity degradation and internal resistance increase; it significantly improves the accuracy and trend-tracking ability of SOH estimation under different temperature conditions, at a computational complexity low enough for real-time implementation in a BMS. Ref. [33] focused on swarm intelligence and meta-heuristic identification of equivalent circuit model (ECM) parameters, treating identification as a global optimization problem suitable for non-convex cost functions, initial-value sensitivity, or numerous local minima. However, identification-based methods rely on pre-defined model structures, so their modeling capability is limited by model order and structural assumptions. Moreover, the parameter identification process is sensitive to excitation-signal quality and noise level, which can lead to parameter drift or unstable identification. Enhancing the ability of such models to express complex degradation behaviors therefore remains an open issue.
Based on the above analysis, this study extracts characteristic parameters of lithium-ion batteries as health features (HFs) and proposes a stochastic configuration network (SCN) model optimized by the sparrow search algorithm (SSA), thereby constructing an HF- and SSA-SCN-based prediction method. The proposed method is validated on the NASA and CALCE battery datasets, and the experimental results demonstrate that it achieves superior accuracy and robustness in RUL prediction.
The organization of this paper is as follows: Section 2 describes the lithium-ion battery datasets used in the experiments and presents the extraction of HFs; Section 3 introduces the related algorithms and constructs the SSA-SCN method for RUL prediction; Section 4 provides an experimental validation of the proposed method and compares its prediction performance with that of classical approaches; finally, Section 5 concludes the paper.
3. Methodology
3.1. Stochastic Configuration Networks
SCNs constitute a class of randomized machine learning algorithms [36]. Similar to other randomized learning methods, such as random weight feedforward neural networks and random vector functional link networks, SCNs randomly assign the input weights ω and biases b of the hidden-layer nodes and subsequently compute the output weights using regularized techniques such as least squares. Unlike deep structured networks, the SCN architecture involves no coupling between layers. Because no interconnections among multiple layers are required, SCN has advantages over gradient-based, layer-by-layer training methods that rely on backpropagation in terms of network complexity, training speed, and parameter scale, thereby improving learning capability and computational efficiency. When the network accuracy does not meet the specified requirement, it can be improved by adding hidden-layer nodes, and the additional computational cost of doing so is significantly lower than that of increasing network depth in conventional models. SCN is therefore particularly suitable for systems with limited feature dimensions that require real-time prediction.
When configuring and computing hidden layer nodes, SCN introduces a supervised mechanism and adopts an incremental learning strategy in which random parameters are assigned under inequality constraints. Starting from a simple network structure, the number of hidden layer nodes is increased according to the complexity of the training samples, the parameter ranges are adaptively adjusted, and the output weights of the nodes are computed using the least squares method to ensure good universal approximation capability. The structure of the SCN is shown in Figure 5. The algorithmic principle and universal approximation property of SCN can be described as follows:
For a given training dataset {X, Y}, X = {x_1, x_2, …, x_n} denotes the input variables with x_i = (x_{i,1}, x_{i,2}, …, x_{i,d}) ∈ R^d, and Y = {y_1, y_2, …, y_n} denotes the corresponding output variables with y_i = (y_{i,1}, y_{i,2}, …, y_{i,m}) ∈ R^m, i = 1, 2, …, n.
Given a target function f: R^d → R^m, assume that the hidden layer of SCN has already generated L − 1 nodes. The current network output can then be expressed as

f_{L−1}(x) = ∑_{j=1}^{L−1} β_j g(ω_j^T x + b_j), f_0 = 0,

where β_j denotes the output weight of the j-th hidden node, g(·) denotes the activation function, and ω_j and b_j are the input weight and bias of the j-th hidden node, respectively, with j = 1, 2, …, L − 1.
The current residual vector of the network is calculated as

e_{L−1} = f − f_{L−1} = [e_{L−1,1}, e_{L−1,2}, …, e_{L−1,m}].

If ‖e_{L−1}‖_2 has not reached the predefined error tolerance ε and the number of hidden nodes has not reached the maximum value L_max, the L-th hidden layer node is added, and its input weights and bias are determined according to the supervised mechanism given in Equation (7):

ξ_{L,q} = ⟨e_{L−1,q}, h_L⟩² / (h_L^T h_L) − (1 − r − μ_L) ‖e_{L−1,q}‖² ≥ 0, (7)

where q = 1, 2, …, m; h_L denotes the output of the L-th hidden layer node; ω_L and b_L represent the candidate input weight and bias of the L-th hidden node, respectively; r ∈ (0, 1); and {μ_L} denotes a sequence of non-negative real numbers satisfying μ_L ≤ 1 − r and lim_{L→+∞} μ_L = 0. The candidate parameters that maximize the criterion ξ_L = ∑_{q=1}^{m} ξ_{L,q} are chosen as the parameters of the L-th node in the network.
The output weights of the hidden layer nodes are solved using Equation (8):

β = [β_1, β_2, …, β_L]^T = H⁺Y, (8)

where H⁺ denotes the Moore–Penrose generalized inverse of H, and H = [h_1, h_2, …, h_L].
The network output f_L is expressed as

f_L(x) = ∑_{j=1}^{L} β_j g(ω_j^T x + b_j).
Assume that Γ = {g_1, g_2, …} denotes a set of real-valued functions and that span(Γ), the function space spanned by Γ, is dense in L² space, with 0 < ‖g‖ < b_g for some b_g ∈ R⁺ and every g ∈ Γ. If the random basis function g_L satisfies the inequality constraint

⟨e_{L−1,q}, g_L⟩² ≥ b_g² (1 − r − μ_L) ‖e_{L−1,q}‖², q = 1, 2, …, m,

and the output weights of the hidden layer nodes are evaluated as

β_{L,q} = ⟨e_{L−1,q}, g_L⟩ / ‖g_L‖², q = 1, 2, …, m,

then lim_{L→+∞} ‖f − f_L‖ = 0.
Based on the above theoretical foundations, SCN exhibits good universal approximation capability, along with fast learning speed, high real-time performance, strong generalization ability, and minimal reliance on manual intervention. The algorithmic flow of SCN is shown in Figure 6.
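The incremental construction described above can be illustrated in code. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses a tanh activation, draws T_max candidate nodes per step from a symmetric interval set by a scaling factor λ, applies the supervisory inequality with parameter r, and solves the output weights with the Moore–Penrose pseudoinverse; the function and variable names are illustrative.

```python
import numpy as np

def train_scn(X, Y, L_max=50, T_max=20, eps=1e-3, lam=1.0, r=0.99, seed=0):
    """Minimal SCN sketch: add hidden nodes one at a time under the
    supervisory inequality, then solve output weights by least squares."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Ws, bs = [], []                       # accepted input weights and biases
    H = np.zeros((n, 0))                  # hidden-layer output matrix
    beta = np.zeros((0, Y.shape[1]))
    e = Y.copy()                          # residual starts as the target itself
    for L in range(1, L_max + 1):
        if np.linalg.norm(e) <= eps:      # tolerance reached
            break
        mu = (1.0 - r) / (L + 1)          # mu_L <= 1 - r and mu_L -> 0
        best = None
        for _ in range(T_max):            # draw T_max candidate nodes
            w = rng.uniform(-lam, lam, d)
            b = rng.uniform(-lam, lam)
            h = np.tanh(X @ w + b)        # candidate node output g(w^T x + b)
            # supervisory criterion xi_L summed over the m output dimensions
            xi = np.sum((e.T @ h) ** 2) / (h @ h) - (1 - r - mu) * np.sum(e * e)
            if best is None or xi > best[0]:
                best = (xi, w, b, h)
        if best[0] < 0:                   # no admissible candidate: stop growing
            break
        Ws.append(best[1]); bs.append(best[2])
        H = np.column_stack([H, best[3]])
        beta = np.linalg.pinv(H) @ Y      # beta = H^+ Y (Moore-Penrose inverse)
        e = Y - H @ beta                  # updated residual
    def predict(Xq):
        return np.tanh(Xq @ np.array(Ws).T + np.array(bs)) @ beta
    return predict
```

Because each accepted node must satisfy the inequality against the current residual, the network only grows in directions that provably reduce the approximation error, which is what underpins the convergence result above.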
3.2. Sparrow Search Algorithm
The performance of SCN is influenced by the settings of hyperparameters. During the construction process, the initialization of weights and biases for candidate layer nodes depends on the scaling factor λ, while the computation of inequality constraints in the supervised mechanism depends on the regularization parameter γ. Therefore, the input weights and biases of hidden layer nodes are affected by λ and γ. To enhance the effectiveness of SCN, it is crucial to identify optimal parameters to maximize network performance and optimize the network structure. SSA is a swarm intelligence-based optimization algorithm inspired by the foraging behavior of sparrows, which simulates the cooperation and competition within sparrow populations during foraging to solve various optimization problems, featuring strong search capability, fast convergence, and high robustness. The specific optimization process of SSA can be described as follows:
A population of n sparrows is represented as an n × d matrix X whose i-th row x_i = (x_{i,1}, x_{i,2}, …, x_{i,d}) is the position of the i-th sparrow, where n denotes the number of sparrows and d denotes the dimensionality of the optimization variables.
Discoverers with higher fitness obtain food preferentially during the search process; they are responsible for locating food sources and guiding the movement of the entire population. The position update of discoverers is given as

x_{i,j}^{t+1} = x_{i,j}^t · exp(−i / (α·T)) if R_2 < ST; x_{i,j}^{t+1} = x_{i,j}^t + Q·L if R_2 ≥ ST,

where t denotes the current iteration number; x_{i,j}^t denotes the position of the i-th sparrow in the j-th dimension at the t-th iteration, with i = 1, 2, …, n and j = 1, 2, …, d; T denotes the maximum number of iterations; α is a random number in (0, 1]; R_2 ∈ [0, 1] denotes the alarm value; ST ∈ [0.5, 1] denotes the safety threshold; Q is a random number following a normal distribution; and L denotes a 1 × d matrix with all elements equal to 1.
When R_2 < ST, there are no predators nearby and the discoverers enter a wide-range search mode; when R_2 ≥ ST, danger is present in the vicinity, the discoverers issue an alarm, and all sparrows rapidly move to other safe regions.
During the foraging process, joiners move toward better food sources, and their position update is described as

x_{i,j}^{t+1} = Q · exp((x_{worst,j}^t − x_{i,j}^t) / i²) if i > n/2; x_{i,j}^{t+1} = x_{P,j}^{t+1} + |x_{i,j}^t − x_{P,j}^{t+1}| · A⁺·L otherwise,

where A⁺ = A^T(AA^T)^{−1}; X_P denotes the best position occupied by the discoverers; X_worst denotes the current global worst position; and A is a 1 × d matrix whose elements are randomly assigned as either 1 or −1.
It is assumed that 10% to 20% of the sparrow population is aware of danger. Sparrows that perceive danger exhibit anti-predation behavior and move toward safe regions. The position update of vigilant sparrows is described as

x_{i,j}^{t+1} = x_{best,j}^t + β · |x_{i,j}^t − x_{best,j}^t| if f_i > f_g; x_{i,j}^{t+1} = x_{i,j}^t + K · (|x_{i,j}^t − x_{worst,j}^t| / (f_i − f_w + ε)) if f_i = f_g,

where X_best denotes the current global best position; β denotes the step-size control parameter and follows a normal distribution with zero mean and unit variance; K is a random number in [−1, 1] representing the movement direction of the sparrow; f_i denotes the fitness value of the current sparrow individual; f_g and f_w denote the global best and worst fitness values, respectively; and ε is a small constant introduced to avoid division by zero.
When fi > fg, the sparrow is located at the edge of the population, and Xbest represents the position of the population center, which is considered safe. When fi = fg, the sparrow located in the middle of the population becomes aware of danger and needs to move closer to other sparrows.
By iteratively updating according to the above steps, the fitness of the population continuously improves, and the optimal parameters are obtained after a certain number of iterations.
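The three position-update rules can be sketched as follows. This is a hedged, minimal implementation for a minimization problem; the population sizing, the 20% discoverer share, the 10% vigilant share, and the use of a post-sort ranking as the fitness ordering are illustrative assumptions, not the authors' code.

```python
import numpy as np

def ssa(fitness, d=2, n=30, T=100, lb=-5.0, ub=5.0, ST=0.8, seed=0):
    """Minimal SSA sketch (minimization): discoverers, joiners, vigilants."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n, d))
    fit = np.array([fitness(x) for x in X])
    gbest = X[np.argmin(fit)].copy()      # global best tracked across iterations
    gbest_f = float(fit.min())
    n_disc = max(1, n // 5)               # ~20% of the flock act as discoverers
    n_vig = max(1, n // 10)               # ~10% are danger-aware (vigilant)
    for _ in range(T):
        order = np.argsort(fit)           # rank by fitness (lower is better)
        X, fit = X[order], fit[order]
        Xworst = X[-1].copy()
        R2 = rng.random()                 # alarm value in [0, 1]
        for i in range(n_disc):           # discoverers
            if R2 < ST:                   # no predators: wide-range search
                alpha = rng.random() + 1e-12
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * T))
            else:                         # alarm raised: move to a safe region
                X[i] = X[i] + rng.normal() * np.ones(d)
        XP = X[0].copy()                  # best discoverer position
        for i in range(n_disc, n):        # joiners
            if i > n / 2:                 # worst-off joiners fly elsewhere
                X[i] = rng.normal() * np.exp((Xworst - X[i]) / (i + 1) ** 2)
            else:                         # others forage around the best spot
                A = rng.choice([-1.0, 1.0], size=d)
                X[i] = XP + ((np.abs(X[i] - XP) @ A) / d) * np.ones(d)
        for i in rng.choice(n, n_vig, replace=False):   # vigilant sparrows
            if fit[i] > fit[0]:           # edge of the flock: move toward best
                X[i] = X[0] + rng.normal() * np.abs(X[i] - X[0])
            else:                         # at the center: move among the flock
                K = rng.uniform(-1, 1)
                X[i] = X[i] + K * np.abs(X[i] - Xworst) / (fit[i] - fit[-1] + 1e-12)
        np.clip(X, lb, ub, out=X)
        fit = np.array([fitness(x) for x in X])
        if fit.min() < gbest_f:
            gbest_f = float(fit.min())
            gbest = X[np.argmin(fit)].copy()
    return gbest, gbest_f
```

Note that A⁺ = A^T(AA^T)^{−1} reduces to A^T/d for a ±1 row vector, which is why the joiner step collapses to an inner product divided by d.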
3.3. SSA-SCN
To improve the prediction accuracy and efficiency of SCN for battery capacity, the sparrow search algorithm is employed to optimize the hyperparameters of the stochastic configuration network model, namely the scaling factor λ and the regularization coefficient γ. The specific procedure of SSA-SCN can be described as follows:
Step 1: Set the number of sparrows n and the maximum number of iterations T, determine the maximum number of hidden layer nodes Lmax, the maximum number of candidate nodes Tmax, the tolerance error ε, and the hyperparameter ranges λ ∈ [0.5, 200] and γ ∈ [0.9, 0.9999].
Step 2: Construct and train the model. SCN is trained with the initial hyperparameters to obtain an initial fitness value as the optimization baseline, and the mean square error (MSE) is adopted as the fitness function to evaluate the optimization performance of SSA-SCN. MSE is defined in Equation (16):

MSE = (1/n) ∑_{i=1}^{n} (ŷ_i − y_i)², (16)

where n is the number of samples, ŷ_i is the predicted value, and y_i is the actual value.
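Equation (16) translates directly into code; the following minimal helper (the function name is illustrative) is the quantity SSA minimizes in Step 2:

```python
import numpy as np

def mse_fitness(y_pred, y_true):
    """Equation (16): mean squared error between predicted and actual values."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))
```

Each candidate (λ, γ) pair is scored by training SCN and evaluating this value, so lower fitness corresponds to a better hyperparameter configuration.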
Step 3: Execute the SSA. By continuously updating the positions of sparrows to improve fitness, the scaling factor λ and the regularization coefficient γ of SCN are optimized.
Step 4: Check whether the iteration number reaches the maximum value T or the fitness error satisfies the predefined threshold ε. If either condition is met, the iteration is terminated.
Step 5: Select the sparrow individual with the best fitness obtained during the SSA iterations, take its position parameters as the scaling factor λ and regularization coefficient γ of SCN, and construct the SSA-SCN prediction model using the optimized hyperparameters.
The pseudocode is as follows (Algorithm 1):
| Algorithm 1. SSA-SCN Algorithm |
| Input: battery features and historical capacity data; Output: battery capacity |
| 1: Initialize parameters, including the sparrow population size n, the maximum iteration number T, the maximum number of hidden layer nodes Lmax, the maximum number of candidate nodes Tmax, the tolerance error ε, the scaling factor λ ∈ [0.5, 200], and the regularization coefficient γ ∈ [0.9, 0.9999]. |
| 2: Initialize the sparrow positions X = {x1, x2, …, xn}. |
| 3: Construct and train the initial SCN model. |
| 4: for i = 1 to n do |
| 5: Randomly initialize λ and γ, train the SCN model using λ and γ, and compute the fitness value f (xi). |
| 6: end for |
| 7: Record the current best position Xbest and its fitness value f (Xbest). |
| 8: Optimize SCN hyperparameters using SSA. |
| 9: for t = 1 to T do |
| 10: Update the positions of the discoverers. |
| 11: Update the positions of the joiners. |
| 12: Update the positions of vigilant sparrows. |
| 13: if f (Xbest) ≤ ε then break |
| 14: end for |
| 15: Construct the final SCN model using the optimized parameters. |
| 16: Use the trained SCN model to predict battery capacity. |
SSA performs dual-parameter collaborative optimization of the SCN hyperparameters λ and γ, combining structurally adaptive growth with global parameter optimization. Compared with other optimization algorithms, particle swarm optimization converges quickly but is prone to local optima, while the genetic algorithm offers strong global search ability but converges slowly. In contrast, SSA balances global and local search with stable and fast convergence. SSA-SCN adopts a single-hidden-layer structure solved via the generalized inverse without backpropagation, and its low-dimensional feature input makes it well suited to real-time RUL prediction for lithium-ion batteries.
5. Conclusions
Lithium-ion batteries are widely used in the energy storage field due to their superior performance, and accurate RUL prediction can improve the safety and reliability of battery systems. In this study, the NASA and CALCE datasets are investigated; features characterizing battery performance degradation, together with cycle numbers, are extracted from the charge–discharge stages to form HFs, and Pearson and Spearman correlation analyses are employed to examine their correlation with capacity and verify the feasibility of the extracted HFs. An SSA-based procedure that supplies optimal parameters to SCN is proposed, enabling accurate RUL prediction of lithium-ion batteries, and the performance of the SSA-SCN method is validated on two battery degradation datasets through comparative analysis with other prediction methods. The proposed method thus offers higher accuracy and timeliness for lithium-ion battery RUL prediction. However, it has not yet been validated on other batteries or under more complex operating conditions, so further optimization is required for practical applications. In future work, the full life cycle of battery charge–discharge processes and variations in other features, such as temperature during operation, will be considered to achieve more accurate RUL prediction.
Future research will further incorporate dynamic conditions such as temperature variations and multi-rate charging and discharging to analyze how temperature coupling effects and load fluctuations affect the stability of health features and the accuracy of model predictions. An extended framework that integrates environmental variables and operating-condition information will also be explored to enhance the generalization ability and engineering applicability of the model in real-world scenarios. With the widespread deployment of large-scale energy storage systems, online adaptation and continual learning can draw on sliding-window incremental updates, error-triggered local update strategies, and periodic background hyperparameter re-optimization; these mechanisms will be further developed in future engineering deployments to improve the model's adaptation to long-term drift.