A Multi-Stage Adaptive Method for Remaining Useful Life Prediction of Lithium-Ion Batteries Based on Swarm Intelligence Optimization

Abstract: The accuracy of predicting the remaining useful life of lithium batteries directly affects the safe and reliable use of the supplied equipment. Since the degradation of lithium batteries is easily influenced by different operating conditions and by the regeneration and fluctuation of battery capacity during use, it is difficult to construct an accurate prediction model of lithium batteries. Therefore, research into high-precision methods of predicting the remaining useful life has been a popular topic for the whole-life management system of lithium batteries. In this paper, a new hybrid optimization method for predicting the remaining useful life of lithium batteries is proposed. The proposed method incorporates two different swarm intelligence optimization algorithms. Firstly, the whale optimization algorithm is used to optimize the variational mode decomposition (WOAVMD), which decomposes the historical life data into several trend components and non-trend components. Then, the sparrow search algorithm is applied to optimize the long short-term memory neural network (SSALSTM) to predict the non-trend components, and the autoregressive integrated moving average model (ARIMA) is used to predict the trend components. Finally, the prediction results of each component are integrated to evaluate the remaining useful life of lithium batteries. Results show that better prediction accuracy is obtained in the prediction experiments for several types of batteries in both the NASA and CALCE battery datasets. The generalization ability of the algorithm has also been effectively improved owing to the optimization of the parameters of the variational mode decomposition (VMD) and the long short-term memory neural network (LSTM).


Introduction
Compared with other energy storage devices, lithium batteries offer higher energy density, longer life and stronger charge/discharge performance [1]. They have been widely used in new energy vehicles, ships, aviation and other fields. However, the usable capacity of a lithium battery decreases over charging and discharging cycles, which reduces the reliability of the device during operation and poses a safety hazard [2]. Accurate prediction of the remaining useful life of lithium batteries helps users to maintain devices and avoid risks before the batteries fail. Therefore, research into the prediction of the remaining useful life of lithium batteries is of great significance. At present, common methods of remaining useful life prediction for lithium-ion batteries can be divided mainly into model-based methods and data-driven methods [3].
Model-based methods require detailed prior knowledge of the respective battery and incur significant complexity and computational expense to achieve high estimation precision [4]. They usually use the failure mechanism to estimate the remaining useful life of lithium batteries, but the calculation processes are relatively complicated and the requirements on model parameter identification are high. Li et al. [5] studied the extended Kalman filter, the particle filter (PF) and recursive least squares, and then compared their performance in terms of accuracy and convergence speed. Xiong et al. [6] extracted five battery aging indicators from the electrochemical model to estimate the capacity. Deng et al. [7] proposed an empirical model for predicting the remaining useful life, with the PF method applied to estimate the parameters of the model. In order to further improve the prediction accuracy, Li et al. [8] introduced a combined strategy of support vector machine (SVM) regression and the PF method in the remaining useful life prediction. Hong et al. [9] established an iterative model of a generalized Cauchy process with long-range dependence properties. Although the prediction performance of these methods gradually improved under certain operating conditions, the adaptability of the models was still insufficient for different types of batteries or for more complex actual working conditions.
The data-driven method is also commonly used to predict the remaining useful life. Its advantage is that it avoids the need to establish an accurate and complex electrochemical model of the lithium battery. These methods feed health indicators of the lithium battery into a prediction model to predict the remaining useful life. In order to solve the problem of poor accuracy and generalization ability caused by factors such as capacity regeneration and random fluctuations, Liu et al. [10] proposed a prediction method based on ensemble empirical mode decomposition, deep Boltzmann machines and long short-term memory. Ge et al. [11] proposed a comprehensive prediction method based on VMD, an integrated particle filter and the long short-term memory neural network (LSTM) with a self-attention mechanism (SA). Li et al. [12] used the complementary ensemble empirical mode decomposition (CEEMD) method to decompose the original capacity data curve, reconstructed the modal components based on the Hurst exponent and then used Gaussian process regression (GPR) and LSTM to predict the different components. Yun et al. [13] proposed a novel hybrid scheme based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), the autoregressive integrated moving average model (ARIMA) and the least squares support vector machine (LSSVM). Historical discharge time series and capacity series were used for the analysis, and charge/discharge time data series were then constructed as health indicators to achieve more reliable prediction. Wang et al. [14] improved CEEMD using an adaptive noise algorithm (ICEEMDAN) and then used a specially designed interpolation reconstruction mechanism to solve the problem of missing information due to over-decomposition, which further improved the prediction accuracy and robustness. Pan et al. [15] used empirical mode decomposition (EMD) to decompose the degradation curve and then used LSTM to predict the long-term degradation trend. At the same time, the state of health (SOH), the state of charge (SOC) and the rest time were inputted into GPR to predict the capacity regeneration part, which also improved the accuracy. Yun et al. [16] constructed and analyzed three time-health indicators in detail and estimated the relationship between time variables and capacity degradation to evaluate the SOH of lithium-ion batteries.
In addition, because the parameters of predictive models are difficult to tune, swarm intelligence optimization algorithms are often applied to optimize the parameter settings. Wang et al. [17] developed a hybrid method based on an artificial bee colony algorithm and support vector regression (ABC-SVR) and solved the problem that the common particle swarm optimization (PSO) cannot provide accurate parameter optimization. Wang et al. [18] used the ant lion optimization algorithm (ALO) to optimize the parameters of SVR and further improved the accuracy. Yang et al. [19] proposed a novel hybrid method based on support vector regression, using the gray wolf algorithm to optimize its kernel parameters. Fan et al. [20] developed a deep learning method that combined the forgetting online sequential extreme learning machine (FOS-ELM) with the hybrid grey wolf optimizer (HGWO) algorithm and an attention mechanism for the prognostic and health management (PHM) of the lithium battery, and obtained a global optimal solution with high stability and high convergence speed.
The research summarized above shows that the accuracy of current methods for predicting the remaining useful life of lithium batteries has improved considerably. However, these methods have their limitations. For example, the accuracy of the prediction results depends to some extent on the amount of historical data provided. When the training data are not long enough for the model to capture the key information in the capacity degradation data, it is difficult for the prediction results to achieve the expected accuracy. Meanwhile, it is difficult for a prediction model using fixed parameters to maintain high prediction accuracy for different batteries. Therefore, this paper proposes a hybrid optimization method for predicting the remaining useful life of lithium batteries to overcome the problems of insufficient adaptability to different batteries and a high dependence on historical data. The proposed method introduces two different swarm intelligence optimization algorithms to adaptively adjust the model parameters and structure according to the historical capacity data. In the decomposition step, the whale optimization algorithm (WOA) is used to find the optimal decomposition parameters of VMD, so that the capacity sequence is decomposed into components with low complexity. In the prediction step, the sparrow search algorithm (SSA) is applied to set the parameters of LSTM to improve the prediction performance.
The remainder of the paper is organized as follows. In Section 2, the related theories and methods are described, including variational mode decomposition, the whale optimization algorithm, long short-term memory and the sparrow search algorithm. Section 3 introduces the datasets; then, prediction experiments under different conditions are carried out based on the proposed hybrid optimization prediction method, and the prediction results are analyzed and discussed. Section 4 is the conclusion.

Variational Mode Decomposition
Variational mode decomposition [21] uses a non-recursive decomposition mode to decompose the signal into a specified number of intrinsic mode functions (IMFs) with different center frequencies. The essence of VMD is to construct and solve a variational problem. It is assumed that the original signal is decomposed into K components and that each component is a modal component with a center frequency and a limited bandwidth. The decomposition minimizes the sum of the estimated bandwidths of the modes, subject to the sum of all modes being equal to the original signal. Compared with empirical mode decomposition (EMD), VMD can effectively overcome the problem of modal aliasing and also has a weaker end effect, so it has better feature extraction and anti-noise abilities. Assuming that the original signal is decomposed into K components, the constrained variational problem is given in Equation (1).
In Equation (1), $\{u_k\}$ is the set of K IMFs obtained after decomposition, $\{w_k\}$ is the set of center frequencies of the IMFs, the center frequency of $u_i$ is $w_i$ ($i = 1, 2, \ldots, K$), $*$ denotes convolution and $\delta(t)$ is the unit impulse function. In order to transform the problem into an unconstrained variational problem, the augmented Lagrangian function is introduced to ensure the accuracy of signal decomposition (Equation (2)).
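For reference, the standard VMD formulation from the literature, which matches the symbol definitions above, can be written as follows. This is a sketch that assumes Equations (1) to (3) follow the standard form; the symbols $f(t)$ (the original signal), $\lambda(t)$ (the Lagrange multiplier), $\omega$ (the frequency variable) and the iteration index $n$ are not defined in the text and are introduced here for illustration.

```latex
% Equation (1): constrained variational problem (standard form, assumed)
\min_{\{u_k\},\{w_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j w_k t} \right\|_2^2 \right\}
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)

% Equation (2): augmented Lagrangian with penalty factor \alpha
L(\{u_k\},\{w_k\},\lambda) =
  \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j w_k t} \right\|_2^2
  + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
  + \left\langle \lambda(t),\; f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle

% Equation (3): frequency-domain mode update obtained when (2) is solved by ADMM
\hat{u}_k^{\,n+1}(\omega) =
  \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}
       {1 + 2\alpha\,(\omega - w_k)^2}
```

The two quantities that the rest of the method tunes appear directly here: K fixes the number of terms in each sum, and α weights the bandwidth penalty against the reconstruction error.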
Next, Equation (2) is updated and solved using the alternating direction method of multipliers. In the end, K modal components $\{u_k\}$ with center frequencies $\{w_k\}$ are obtained via convergence, where the expression of each IMF in the frequency domain, $\hat{u}_k$, is given by Equation (3).

Whale Optimization Algorithm
However, the decomposition effect of VMD described above depends strongly on the parameter settings, and there is no precise method for adjusting these parameters (K and α). Therefore, this paper uses the whale optimization algorithm (WOA) [22] to optimize the VMD and reduce the influence of manually set parameters on the decomposition effect. The algorithm design is based on the behavior patterns of whales, such as constricting, circling, randomly wandering and feeding in a spiral bubble net. According to the whales' predatory behavior, the following mathematical models are established:
(1) Shrinking and encircling predation. Firstly, it is assumed that the optimal whale position is the target prey position or an approximation of it; the other whales then try to approach the optimal whale position. The position update process is given by Equation (4), where t represents the iteration number, $X$ represents the position vector and $X_{\mathrm{best}}$ represents the position vector of the current optimal whale, which is continuously updated during the iteration process. $A$ and $C$ are the coefficient vectors defined in Equation (5), where a is the convergence factor, which decreases from 2 to 0 during the iterative process, and r is a random vector with elements in [0, 1].
(2) Randomly wandering for food. Because whales' positions are random when searching for prey, the optimal position is usually unknown. It is therefore stipulated that the current whale randomly selects the position of another whale and updates its own position relative to it, which helps to improve the global search ability of the whales, as shown in Equation (6).
In Equation (6), $X_{\mathrm{rand}}$ represents the position vector of a randomly selected whale.
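For reference, the standard WOA update rules that match the descriptions of Equations (4) to (6) are sketched below. This assumes the paper follows the usual formulation in [22]; the intermediate distance vector $\vec{D}$ is not named in the text above, and in the usual formulation the random vectors used for $\vec{A}$ and $\vec{C}$ are drawn independently.

```latex
% Equation (4): shrinking and encircling (standard form, assumed)
\vec{D} = \left| \vec{C} \cdot \vec{X}_{\mathrm{best}}(t) - \vec{X}(t) \right|, \qquad
\vec{X}(t+1) = \vec{X}_{\mathrm{best}}(t) - \vec{A} \cdot \vec{D}

% Equation (5): coefficient vectors
\vec{A} = 2a \cdot \vec{r} - a, \qquad \vec{C} = 2 \cdot \vec{r}

% Equation (6): random wandering
\vec{D} = \left| \vec{C} \cdot \vec{X}_{\mathrm{rand}} - \vec{X}(t) \right|, \qquad
\vec{X}(t+1) = \vec{X}_{\mathrm{rand}} - \vec{A} \cdot \vec{D}
```

Equations (4) and (6) differ only in the reference point: the best whale drives exploitation, while a random whale drives exploration.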
(3) Spiral upward predation. When the whale attacks the target, it slowly approaches the target along an upward spiral. This process is expressed in Equation (7), where $D'$ is the difference vector between the optimal position and the current position, b is a constant and l is a random number uniformly distributed in [−1, 1].
According to the hunting behavior model above, the algorithm steps are designed as follows:
Step 1. Initialize parameters. Input the whale population size N, the number of iterations t, the fitness value and the position vector of the specified optimal whale.
Step 2. Update the whales' positions and then calculate the individual fitness values. Each whale updates its position using a prey model selected according to its value of A. A random number p ∈ [0, 1] is introduced to make the whale choose between the shrinking encirclement model and the spiral upward model: when p ≥ 0.5, the shrinking encirclement model is selected, and when p < 0.5, the spiral upward model is used. Within the shrinking encirclement branch, |A| ≥ 1 means that the whale is outside the shrinking circle, and the random wander foraging model is adopted; |A| < 1 means that the whale is inside the shrinking circle, and it moves toward the optimal whale.
Step 3. Iterate the loop. If the loop termination condition is not met, Step 2 is repeated.
Step 4. Output the parameters K and α obtained from the optimal solution of the algorithm.
The WOAVMD is designed to enable adaptive decomposition according to the historical data of different batteries, effectively reducing the nonlinearity and complexity of the original sequence. The algorithm flow is shown in Figure 1.
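Steps 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `woa_minimize`, the box-clipping of positions, the linear decay of the convergence factor, the use of the vector norm for the |A| test and the spiral form $D' e^{bl}\cos(2\pi l) + X_{\mathrm{best}}$ (the standard form assumed for Equation (7)) are all choices made here. In WOAVMD the objective would evaluate a candidate (K, α) pair; a toy objective is used instead.

```python
import numpy as np

def woa_minimize(fitness, lower, upper, n_whales=20, n_iter=40, b=1.0, seed=0):
    """Minimal WOA sketch: minimise `fitness` inside the box [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    # Step 1: random initial positions and the initial optimal whale.
    X = rng.uniform(lower, upper, size=(n_whales, dim))
    scores = np.array([fitness(x) for x in X])
    best_idx = int(np.argmin(scores))
    best_x, best_score = X[best_idx].copy(), float(scores[best_idx])

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # convergence factor, decays from 2 towards 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random(dim) - a  # coefficient vectors, Equation (5)
            C = 2.0 * rng.random(dim)
            p = rng.random()
            if p < 0.5:
                # Spiral upward model (assumed standard form of Equation (7)).
                l = rng.uniform(-1.0, 1.0)
                d_prime = np.abs(best_x - X[i])
                new_x = d_prime * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best_x
            elif np.linalg.norm(A) >= 1.0:
                # Outside the shrinking circle: random wandering, Equation (6).
                x_rand = X[rng.integers(n_whales)]
                new_x = x_rand - A * np.abs(C * x_rand - X[i])
            else:
                # Inside the shrinking circle: encircle the best whale, Equation (4).
                new_x = best_x - A * np.abs(C * best_x - X[i])
            X[i] = np.clip(new_x, lower, upper)  # keep candidates inside the bounds

        # Step 2 (continued): re-evaluate fitness and keep the best whale so far.
        scores = np.array([fitness(x) for x in X])
        idx = int(np.argmin(scores))
        if scores[idx] < best_score:
            best_x, best_score = X[idx].copy(), float(scores[idx])

    # Step 4: return the best position found (for WOAVMD, the pair (K, alpha)).
    return best_x, best_score
```

Because K must be an integer, a VMD objective built on this sketch would need to round the first coordinate before decomposing; that detail is left out here.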

Long Short-Term Memory Neural Network Based on Swarm Intelligence Optimization

Long Short-Term Memory Neural Network
Long short-term memory (LSTM) is a variant of the recurrent neural network (RNN) proposed by Hochreiter and Schmidhuber [23] and is now commonly used as a time series prediction model. It addresses the problem of vanishing or exploding gradients in RNNs by introducing the concept of cell states. The memory cell is the core component of the LSTM architecture; it allows the network to selectively store and retrieve information at different time steps. LSTM networks use a memory cell and several gates to selectively regulate the flow of information into and out of the cell. The unit structure of LSTM is shown in Figure 2.
LSTM obtains its long-term memory capability through the cooperative control of its gating units (the forget gate, input gate and output gate), which allows it to protect useful information and discard redundant information. The gating units are computed as in Equation (8), where x is the input, h is the output, i, o and f are the input gate, output gate and forget gate, respectively, and C is the cell state. σ and tanh are activation functions, W represents the weights and b is the bias.
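The gate computations described for the LSTM unit can be illustrated with a single time step in NumPy. This is a sketch of the standard LSTM cell, assuming the paper's gate equations take the usual form; the function names and the layout of one stacked weight matrix `W` acting on the concatenation of the previous output and the current input are choices made for this illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation used by the three gates."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4 * hidden, hidden + n_inputs) and b has shape (4 * hidden,);
    the four row blocks hold the forget gate, input gate, output gate and
    candidate cell state, in that (assumed) order.
    """
    hidden = h_prev.size
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])        # forget gate
    i = sigmoid(z[1 * hidden:2 * hidden])        # input gate
    o = sigmoid(z[2 * hidden:3 * hidden])        # output gate
    c_tilde = np.tanh(z[3 * hidden:4 * hidden])  # candidate cell state
    c_t = f * c_prev + i * c_tilde               # discard old and store new information
    h_t = o * np.tanh(c_t)                       # output exposed to the next step
    return h_t, c_t
```

The cell-state update makes the "protect or discard" behaviour explicit: a forget gate close to 1 carries `c_prev` forward almost unchanged, while a value close to 0 erases it.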

Sparrow Search Algorithm
The sparrow search algorithm [24] is a swarm intelligence optimization algorithm derived from the feeding and anti-predation behavior of sparrow populations. Because the prediction accuracy of the LSTM model is easily affected by its key parameter settings, this paper uses the sparrow search algorithm to optimize these settings before LSTM prediction. In order to cope with the extreme failure trend of some batteries in the later stage, we designed a flag to control the batch normalization (BN) layer; the BN layer can not only speed up the training and convergence of the network, but can also control gradient explosion and gradient vanishing. The algorithm simulates the sparrow foraging process to find the optimal combination of the BN layer flag and the three key parameters of LSTM (learning rate, number of hidden layer nodes and regularization coefficient).
In imitating sparrow foraging, the algorithm assigns different roles to different individuals. The explorers are in charge of looking for targets, while the followers follow the explorers. Predators and vigilantes are added during the process, and the algorithm calculates the fitness value of each sparrow through the fitness function; meanwhile, the roles and positions of the individuals are constantly changed according to their fitness values. When the maximum number of iterations is reached, the sparrow with the highest global fitness value is the global optimal solution. The randomly initialized sparrow population matrix is given in Equation (9), where x is a sparrow individual, d is the dimension of the population, that is, the number of parameters to be optimized in the neural network, and n is the number of sparrows. The population fitness matrix is given in Equation (10), where f represents the fitness function of a sparrow individual.
During the sparrow foraging process, the explorers have a higher fitness value and can therefore preferentially search for the target over a larger range. They can then move closer to the target and guide the entire population to it. During each iteration, the explorers' positions are updated according to Equation (11), where t is the current iteration count, $iter_{\max}$ is the maximum number of iterations, a is a random number between 0 and 1, Q is a random number that obeys a normal distribution, L is a 1 × d matrix in which each element is 1, R ∈ [0, 1] is the alarm value and ST ∈ [0.5, 1] is the safety threshold. When R < ST, the explorer conducts an extensive search. When R ≥ ST, the vigilante has found the predator, and all sparrows should fly to the safe area.
The initial position of the vigilante is randomly generated according to Equation (12), where $X_{\mathrm{best}}$ is the global optimal position, β is the step size control parameter, which obeys the N(0, 1) normal distribution, $f_g$ and $f_w$ are the global optimal and worst fitness, respectively, and ε is a non-zero constant.
During the process, the followers monitor the explorers and determine whether to compete for the target according to the state of the explorers. If the scramble is successful, the position is updated; otherwise, they continue to forage. The followers' positions are updated according to Equation (13), where $X_P$ is the best position of the explorers, $X_{\mathrm{worst}}$ is the worst position globally and A is a matrix in which each element is randomly 1 or −1. After the parameter initialization is completed, the search runs until the maximum number of iterations is reached, and the sparrow with the highest global fitness then gives the optimal LSTM parameter combination. The process is shown in Figure 3.
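For reference, the standard sparrow search formulation that matches the descriptions of Equations (9) to (13) is sketched below. It assumes the paper follows the usual form in [24]. Two symbols are not defined in the text and are taken from that standard formulation as assumptions: $f_i$, the fitness of the current sparrow, and $K \in [-1, 1]$, a random step direction; $A^{+} = A^{T}(AA^{T})^{-1}$ is the usual pseudo-inverse notation.

```latex
% Equation (9): population matrix (n sparrows, d parameters)
X = \begin{bmatrix}
  x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\
  x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\
  \vdots  & \vdots  & \ddots & \vdots  \\
  x_{n,1} & x_{n,2} & \cdots & x_{n,d}
\end{bmatrix}

% Equation (10): population fitness matrix
F_X = \begin{bmatrix}
  f([x_{1,1}\; x_{1,2}\; \cdots\; x_{1,d}]) \\
  f([x_{2,1}\; x_{2,2}\; \cdots\; x_{2,d}]) \\
  \vdots \\
  f([x_{n,1}\; x_{n,2}\; \cdots\; x_{n,d}])
\end{bmatrix}

% Equation (11): explorer update
X_{i,j}^{t+1} = \begin{cases}
  X_{i,j}^{t} \cdot \exp\!\left( \dfrac{-i}{a \cdot iter_{\max}} \right), & R < ST \\[2ex]
  X_{i,j}^{t} + Q \cdot L, & R \ge ST
\end{cases}

% Equation (12): vigilante update
X_{i,j}^{t+1} = \begin{cases}
  X_{\mathrm{best}}^{t} + \beta \cdot \left| X_{i,j}^{t} - X_{\mathrm{best}}^{t} \right|, & f_i > f_g \\[2ex]
  X_{i,j}^{t} + K \cdot \dfrac{\left| X_{i,j}^{t} - X_{\mathrm{worst}}^{t} \right|}{(f_i - f_w) + \varepsilon}, & f_i = f_g
\end{cases}

% Equation (13): follower update
X_{i,j}^{t+1} = \begin{cases}
  Q \cdot \exp\!\left( \dfrac{X_{\mathrm{worst}}^{t} - X_{i,j}^{t}}{i^{2}} \right), & i > n/2 \\[2ex]
  X_{P}^{t+1} + \left| X_{i,j}^{t} - X_{P}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise}
\end{cases}
```

For the SSALSTM described here, d = 4: each row of the population matrix holds one candidate learning rate, hidden node count, regularization coefficient and BN layer flag.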

Algorithmic Flow of the Proposed Method
In order to improve the accuracy of the remaining useful life prediction of batteries and the adaptability of the algorithm, this paper proposes a prediction method that combines optimized variational mode decomposition with optimized deep learning. The proposed method introduces swarm intelligence optimization algorithms to optimize the parameter settings and obtain the best algorithm parameters. The overall prediction process is shown in Figure 4.
Given the characteristics of capacity degradation, high-precision life prediction needs to be carried out in phases. As shown in Figure 4, the proposed scheme is divided into two stages: the decomposition stage and the prediction stage.
In the decomposition stage, VMD is mainly used, which is a common method of decomposing nonlinear systems. However, the settings of the number of modal components K and the penalty factor α currently depend largely on empirical methods, and problems of modal aliasing and end effects may occur. WOA has good global search capability in parameter search and avoids using only one formula to update the positions of the search individuals, which reduces the possibility of stagnating in a local optimum. Its search process tends to quickly locate a number of potential optimal solution regions in the early search phase and then eventually converges to the optimal solution. It therefore addresses a shortcoming of the commonly used center frequency observation method for VMD parameter optimization, which can only optimize the K value while the penalty factor α takes its default value. WOA is used to optimize the selection of the two key parameters of VMD, and the minimum value of fuzzy entropy is used as the fitness function to ensure that the complexity of each modal component is the lowest, so as to obtain the optimal parameter combination (K, α).
In the prediction stage, the LSTM method is mainly used for time series prediction. A traditional prediction model is fixed, and its parameters cannot be changed once they are determined. However, the capacity degradation curve of lithium batteries is affected by multiple factors and shows complex trends, so a fixed model generalizes poorly across the capacity degradation curves of different batteries, and different model settings should be used for different historical input data. The prediction performance of LSTM is easily affected by its hyperparameter settings, while SSA performs well in terms of search accuracy, convergence speed and stability. SSA is therefore used not only to set the hyperparameters accurately, but also to adapt the parameter settings to different input data, so as to improve the prediction accuracy and generalization ability. The ARIMA model is used to predict the trend component, and the non-trend component is inputted into the SSALSTM. The SSA algorithm performs a parameter search over the three key parameters of the LSTM (learning rate, number of hidden layer nodes and regularization coefficient) together with the BN layer flag, which decides whether to add the batch normalization (BN) layer before the ReLU activation layer so that the LSTM can adapt to sequences of different lengths and features. Finally, the prediction results of each component are fused to obtain the final prediction results.
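The routing and fusion logic of the prediction stage can be sketched as follows. This is an illustration only: `linear_trend_forecast` and `last_value_forecast` are hypothetical placeholders that stand in for the ARIMA and SSALSTM predictors, and fusion by simple summation is an assumption based on the fact that the VMD modes add up to the original signal.

```python
import numpy as np

def linear_trend_forecast(series, horizon):
    """Hypothetical stand-in for ARIMA: extrapolate a least-squares line."""
    t = np.arange(len(series))
    slope, intercept = np.polyfit(t, series, 1)
    future_t = np.arange(len(series), len(series) + horizon)
    return slope * future_t + intercept

def last_value_forecast(series, horizon):
    """Hypothetical stand-in for SSALSTM: repeat the last observed value."""
    return np.full(horizon, float(series[-1]))

def predict_capacity(components, trend_flags, horizon,
                     trend_model=linear_trend_forecast,
                     nontrend_model=last_value_forecast):
    """Forecast each decomposed component with its model and sum the results.

    components  : list of 1-D arrays, the modes produced by the decomposition
    trend_flags : list of bools, True where a component is a trend component
    """
    total = np.zeros(horizon)
    for component, is_trend in zip(components, trend_flags):
        model = trend_model if is_trend else nontrend_model
        total += model(np.asarray(component, dtype=float), horizon)
    return total
```

Passing the two predictors as arguments keeps the routing independent of the models, so a real ARIMA fit and a tuned LSTM could be swapped in without changing the fusion step.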

Dataset Descriptions
The lithium battery is a nonlinear system. Many interacting factors contribute to its nonlinearity, such as electrochemical reaction stages, operating conditions and battery types [25]. During use, the battery capacity gradually decreases as the battery is repeatedly charged and discharged. This phenomenon is called battery degradation. The battery capacity degradation data used in this research are from the dataset Battery Aging ARC-FY08Q4 provided by the Prognostics Center of Excellence (PCoE) at the National Aeronautics and Space Administration (NASA) Ames Research Center [26] and the dataset of CS2-type batteries from the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland [27].
Figure 5a shows the test results of four NASA batteries (18650, 2000 mAh). Each battery is first charged at a constant current of 1.5 A until the voltage reaches the maximum cut-off voltage of 4.2 V, and is then charged at a constant voltage until the current drops to 20 mA. Finally, the batteries are discharged at a constant current of 2 A until the voltage drops to 2.7 V, 2.5 V, 2.2 V and 2.5 V, respectively. Figure 5b shows the test results of four batteries from the University of Maryland (CALCE, 1100 mAh, CS2). Each battery was charged at a constant current of 0.5 C until the 4.2 V cut-off, and was then charged at a constant voltage until the current dropped to 50 mA. In the experiment, the discharge current of the CS2_33 and CS2_34 batteries was 0.5 C, the discharge current of CS2_36 and CS2_37 was 1 C and the experimental termination voltage was 2.7 V. Figure 5 illustrates the capacity degradation curves of the eight batteries. The conditions are enumerated in Table 1.
It can be clearly seen from Figure 5 that the attenuation of the battery capacity is nonlinear. Many factors contribute to this phenomenon, such as the increase in the passivation film, the shedding of the active material particles, the active material lattice and so on [28]. Furthermore, capacity recovery, that is, capacity regeneration, occurs to varying degrees during the aging process and is often followed by a rapid decline. When the battery approaches the end of its life, the deterioration rate intensifies further. In addition, the entire battery life is often accompanied by random fluctuations [29]. As a result, the degradation trend is often uncertain. These factors create great difficulties in predicting the remaining useful life of lithium batteries. In this section, the B5 and CS2_33 batteries in Figure 5 are used as representative cases to verify the prediction performance of the proposed method.

WOAVMD Decomposition of Battery Capacity Curve
First, the battery capacity degradation data are input into WOAVMD. In this step, the minimum value of the fuzzy entropy of the reconstructed signal is used as the fitness function of WOA, so that the complexity of the decomposition result components is as low as possible. In the experiment, the population size is set to 20, the maximum number of iterations is set to 40, the dimension is 2, the trial calculation interval of K is [4, 20] and the trial calculation interval of α is [500, 2000]. In the B5 battery experiment, K is 9 and α is 1118 after WOA parameter optimization, while in the experiment of the CS2_33 battery, the optimal parameter settings are K = 8 and α = 853. The VMD decomposition results of the B5 and CS2_33 batteries are shown in Figure 6.
From Figure 6a,c, it can be seen that the time-domain waveforms of the components are quite distinct. Among the IMF components obtained after optimal decomposition, IMF1 of B5 reflects the general situation of capacity regeneration, IMF2 reflects the degradation trend and IMF3-9 reflect the local capacity fluctuation characteristics during capacity regeneration and degradation. IMF1 in the decomposition result of CS2_33 reflects the overall degradation trend, while IMF2 reflects capacity regeneration and IMF3-8 reflect random fluctuations. Then, a fast Fourier transform is performed on each component to obtain its frequency spectrum, as shown in Figure 6b,d. The figures show that each component of the two batteries occupies a relatively concentrated frequency band, while the bands of the different components are relatively discrete, which avoids mode aliasing. Therefore, the proposed adaptive optimized VMD decomposition can separate the data into multiple modal components more clearly, and the algorithm has better robustness.
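The fuzzy entropy used as the WOA fitness measure can be sketched as follows. This is a minimal NumPy illustration rather than the authors' implementation: the embedding dimension `m = 2`, the tolerance `r = 0.2 × std` and the exponent `n = 2` are common default choices assumed here, and the function name `fuzzy_entropy` is hypothetical. A lower value indicates a more regular, less complex signal, which is the property the optimizer rewards when it searches K in [4, 20] and α in [500, 2000].

```python
import numpy as np

def fuzzy_entropy(signal, m=2, r_factor=0.2, n=2):
    """Fuzzy entropy of a 1-D signal (m, r_factor and n are assumed defaults)."""
    x = np.asarray(signal, dtype=float)
    r = r_factor * np.std(x)
    count = len(x) - m  # same number of templates for dimensions m and m + 1

    def phi(dim):
        # Build overlapping templates and remove each template's own mean.
        templates = np.array([x[i:i + dim] for i in range(count)])
        templates = templates - templates.mean(axis=1, keepdims=True)
        # Chebyshev distance between every pair of templates.
        dist = np.abs(templates[:, None, :] - templates[None, :, :]).max(axis=2)
        # Exponential (fuzzy) similarity instead of a hard threshold.
        similarity = np.exp(-(dist ** n) / r)
        # Average over all pairs, excluding self-matches on the diagonal.
        total = similarity.sum() - np.trace(similarity)
        return total / (count * (count - 1))

    return float(np.log(phi(m)) - np.log(phi(m + 1)))

# Illustrative signals only: a smooth sine and white noise.
t = np.arange(300)
sine = np.sin(2 * np.pi * t / 50)
noise = np.random.default_rng(0).standard_normal(300)
print(fuzzy_entropy(sine), fuzzy_entropy(noise))
```

In a full pipeline, each WOA candidate (K, α) would be passed to a VMD routine and the entropy of the resulting signal would be returned as the fitness; that wiring is omitted here because it depends on the VMD implementation in use.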

Long Short-Term Memory Neural Network Combined with Sparrow Search Algorithm for Battery Life Prediction
In this stage, the trend component and non-trend components obtained via optimized VMD decomposition were predicted using the autoregressive integrated moving average model (ARIMA) and SSALSTM, respectively. Firstly, the capacity curves of the first 58 cycles of the NASA battery dataset and the first 320 cycles of the CALCE battery dataset were used as the training set, and the subsequent cycles were used as the test set. Experiments were repeated several times to verify the stability of the method. The prediction results are shown in Figure 7a,b. The non-trending modal components present obvious nonlinear fluctuations and can reduce the complexity of the following LSTM prediction; so, SSALSTM was applied for their prediction. In the SSA parameter optimization stage, the sparrow population size was set to 20, the maximum number of iterations was 30, the dimension was 4, the L2 regularization coefficient trial interval was [1 × 10⁻¹⁰, 0.1], the initial learning rate trial interval was [1 × 10⁻⁴, 0.01] and the hidden layer node number trial interval was [10, 500]. The BN layer flag was initialized as a random number in [0, 1]. If the flag value was greater than 0.5 in the final optimization result, a BN layer was added to the prediction model. The prediction starting points of the two batteries were set to 90 and 440, respectively. The blue curve in Figure 7 shows the historical cycles and the orange curve shows the prediction results.
Finally, the prediction results of the trend component and all non-trend components were fused to obtain the final prediction result of the remaining useful life of the lithium battery. The prediction results will be discussed in the next section.
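The four-dimensional SSA search space described above can be illustrated with a small decoding helper. This is a sketch under stated assumptions: the names `BOUNDS`, `LstmConfig` and `decode_position` are hypothetical, and clipping out-of-range candidates and rounding the hidden node count to the nearest integer are choices made here for illustration, not details confirmed by the text.

```python
from dataclasses import dataclass

# Search bounds quoted in the text, in order: L2 coefficient, initial
# learning rate, hidden node count, and the BN flag drawn from [0, 1].
BOUNDS = [(1e-10, 0.1), (1e-4, 0.01), (10, 500), (0.0, 1.0)]

@dataclass
class LstmConfig:
    l2: float
    learning_rate: float
    hidden_units: int
    use_bn: bool

def decode_position(position):
    """Clip a 4-dimensional candidate to the bounds and map it to a config."""
    if len(position) != len(BOUNDS):
        raise ValueError("expected one value per search dimension")
    clipped = [min(max(v, lo), hi) for v, (lo, hi) in zip(position, BOUNDS)]
    l2, lr, hidden, bn_flag = clipped
    # Assumption: round the node count; a flag above 0.5 enables the BN layer.
    return LstmConfig(l2, lr, int(round(hidden)), bn_flag > 0.5)

print(decode_position([1e-3, 0.005, 120.7, 0.8]))
```

Encoding the BN choice as a continuous flag keeps every dimension real-valued, so a swarm optimizer that only updates real vectors can also select the network structure.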

Fusion Results of Prediction
This section takes the NASA B5 battery and the CALCE CS2_33 battery as examples to analyze the experimental results of the proposed method. The prediction starting point of B5 was the 58th cycle and the prediction starting point of the CALCE battery was the 320th cycle. The experiment was repeated five times. The experimental results are shown in Figure 8a,b. There is little difference between the result curves obtained from the repeated predictions. Although swarm intelligence optimization algorithms were added, the performance of the algorithm remained relatively stable. The error curves are shown in Figure 8c,d.
Then, in order to fully analyze the effectiveness of the method, three indicators were chosen. The root mean squared error (RMSE) was calculated using Equation (14) to measure the response error distribution. The mean absolute error (MAE) was calculated using Equation (15). The standard deviation (STD) was calculated using Equation (16) to measure the dispersion of prediction errors.

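$$\mathrm{RMSE}(y, \hat{y}) = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2} \tag{14}$$

$$\mathrm{MAE}(y, \hat{y}) = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right| \tag{15}$$

$$\mathrm{STD} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(error_i - \overline{error}\right)^2} \tag{16}$$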
where $y_i$ is the real value of the battery capacity, $\hat{y}_i$ is the predicted value of the battery capacity, $error$ is the prediction error and $m$ is the number of predicted cycles. The evaluation indexes of the five experiments and their average values are shown in Table 2. As can be seen from Figure 8 and Table 2, the proposed method predicts both datasets with comparatively high accuracy. For the B5 battery, the average MAE is 0.0108 and the RMSE is 0.0262; for the CS2_33 battery, the average MAE is 0.0757 and the RMSE is 0.1037. The standard deviation of the prediction errors does not differ significantly between experiments. The differences between the five prediction results are small and the error curves are similar. Therefore, the parameter optimization can be considered highly consistent across repeated experiments, which verifies that the model has good stability.
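The three indicators can be computed in a few lines of NumPy. The sketch below assumes the prediction error is the difference between the real and predicted capacity and that the standard deviation uses the population form (division by m); neither convention is confirmed by the text, and the function name `prediction_metrics` and the capacity values are illustrative only.

```python
import numpy as np

def prediction_metrics(y_true, y_pred):
    """Return RMSE, MAE and the standard deviation of the prediction error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    error = y_true - y_pred  # assumed sign convention
    return {
        "rmse": float(np.sqrt(np.mean(error ** 2))),
        "mae": float(np.mean(np.abs(error))),
        "std": float(np.std(error)),  # population form (ddof=0), an assumption
    }

# Made-up capacities in Ah, for illustration only.
print(prediction_metrics([1.50, 1.48, 1.45], [1.49, 1.48, 1.47]))
```

Because RMSE squares the errors before averaging, it penalizes occasional large deviations more heavily than MAE, which is why the two are reported together.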

Analysis and Discussion of Prediction Results
In the last section, B5 and CS2_33 were taken as examples to illustrate the process and effect of the whole prediction experiment and to verify the stability of the proposed method. In this section, we conduct further experiments in different ways, and present and discuss the experimental process and its results.

Analysis of Results from Different Starting Points
From the experiments above, it can be seen that the proposed method predicts the remaining useful life of batteries stably across repeated predictions from the same starting point. This section randomly selects different starting points and conducts three experiments to compare the prediction effect. The starting cycles of the B5 battery were 30, 70 and 90, while those of the CS2_33 battery were 200, 370 and 440. The prediction results and errors are shown in Figures 9 and 10.
As the figures show, the long-term prediction error is large when the amount of historical data is small, although the prediction can still reflect the general trend. As the amount of historical data increases, the performance of the prediction experiment gradually improves. After several experiments from different starting points, it was found that the accuracy of prediction began to improve significantly from the 58th cycle of the B5 battery and the 320th cycle of the CS2_33 battery.

Analysis of Results of Different Batteries
In order to further verify the adaptability of the prediction method, NASA B6, B7 and B18 batteries and CALCE CS2_34, CS2_36 and CS2_37 batteries were selected for multiple prediction experiments from different starting points. Table 3 shows the MAE, RMSE and corresponding averages of multiple experiments. The standard deviation of the prediction error of multiple experiments and its corresponding average are listed in Table 4. Figure 11 shows histograms of the average RMSE and MAE of the prediction results over the repeated experiments on the eight batteries, and Figure 12 shows histograms of the average STD of the prediction error.
Tables 3 and 4 and Figures 11 and 12 show how the prediction effect improves in the experiments on different batteries. When the starting point of prediction is about 25% of the total cycle number, the prediction effect is not ideal and the standard deviation is large, but the proposed method can still reflect the life-changing trend of the eight batteries well. As the number of historical cycles increases and the amount of provided data reaches about 40% of the full life cycle, the proposed method predicts the battery life more accurately for all eight batteries, and the prediction accuracy continues to improve as more cycles become available. Meanwhile, the standard deviation also decreases significantly, indicating that the proposed method can capture small changes in the sequence. In all of the experiments on different batteries, the proposed method maintains high prediction accuracy, so it has good adaptability to different batteries.

Comparison and Analysis of Different Methods
In this part, several related algorithm models are tested. SSALSTM and VMDLSTM were selected for comparison. In the VMDLSTM method, the number of VMD modal components K was set to 6 and the penalty factor α was set to 20 according to the empirical method. As for the LSTM parameters, the initial learning rate was set to 0.005, the L2 regularization coefficient was set to 0.001 and the number of hidden layer nodes was 20. New indicators were calculated to measure the prediction effect of each method: the averages of RMSE and MAE over all starting points and repeated predictions, called ARMSE (Equation (17)) and AMAE (Equation (18)).

In Equations (17) and (18), n refers to the number of different prediction starting points, i = {1, 2, 3, 4}. The calculation results are shown in Table 5 and Figure 12. Table 5 lists the calculated AMAE and ARMSE of each method, while Figure 12 shows the histogram of the evaluation indicators for each method.
The capacity value at which the capacity has decayed to 70% of the initial capacity was taken as the life threshold. Then, the number of available cycles for the lithium-ion batteries to reach the failure threshold (EOL) was predicted. When the battery capacity reached EOL, the error index between the predicted remaining service life (PRUL) and the real remaining service life (RUL) was calculated using Equation (19), in which ERR expresses the prediction error of RUL in terms of the number of cycles and P_ERROR is used to evaluate the accuracy of RUL prediction. The calculated results for the two datasets are shown in Table 6.
... number decreases and the prediction accuracy increases gradually. Compared with SSALSTM and VMDLSTM, the proposed method uses less historical data but obtains higher prediction accuracy, and the prediction accuracy is greatly improved during the process. For example, for the CS2_34 battery, the prediction accuracy of this method from the starting point of the 320th cycle is higher than that of SSALSTM from the 440th cycle, and equivalent to that of VMDLSTM from the 440th cycle.
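The end-of-life threshold and the RUL error indices can be sketched as below. Since Equations (17) to (19) did not survive in this copy, the forms used here are assumptions: ARMSE and AMAE are taken as plain means over the starting points, ERR as the absolute difference between RUL and PRUL in cycles, and P_ERROR as ERR divided by RUL. All function names and the numeric inputs are hypothetical.

```python
import numpy as np

def end_of_life_cycle(capacity, threshold_ratio=0.7):
    """Index of the first cycle whose capacity is at or below the threshold.

    Assumption: the first sample is taken as the initial capacity.
    """
    capacity = np.asarray(capacity, dtype=float)
    threshold = threshold_ratio * capacity[0]
    below = np.flatnonzero(capacity <= threshold)
    return int(below[0]) if below.size else None

def rul_error(true_eol, predicted_eol, start_cycle):
    """Assumed forms: ERR = |RUL - PRUL|, P_ERROR = ERR / RUL."""
    rul = true_eol - start_cycle        # real remaining useful life
    prul = predicted_eol - start_cycle  # predicted remaining useful life
    err = abs(rul - prul)
    return err, err / rul

def averaged_metric(values_per_start):
    """Assumed form of ARMSE/AMAE: mean of the metric over starting points."""
    return float(np.mean(values_per_start))

# Made-up numbers, for illustration only.
print(end_of_life_cycle([2.0, 1.8, 1.5, 1.39, 1.2]))
print(rul_error(true_eol=120, predicted_eol=114, start_cycle=58))
print(averaged_metric([0.02, 0.04]))
```

Expressing the error both in cycles and as a fraction of the true RUL makes results comparable between batteries with very different lifetimes.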

Conclusions
To address the insufficient accuracy and adaptability of current life prediction methods, this paper proposes an improved method for predicting the remaining useful life of lithium batteries based on hybrid optimization. First, the whale optimization algorithm was used to optimize the variational mode decomposition, which decomposes the historical capacity curve of lithium batteries into multiple modal components. The self-adaptive parameter setting allows VMD to capture local features in the sequence more effectively, making the analysis of non-stationary sequences more accurate. Then, the sparrow search algorithm was used to optimize the LSTM neural network to predict each component, achieving the self-adaptive setting of different parameters and the network structure. The proposed method combines parameter optimization in the two steps of decomposition and prediction, which ultimately improves the prediction effect. The results show that the optimized decomposition method used in this paper adapts better to the capacity regeneration phenomenon. In addition, the proposed method showed better performance in the experiments on all eight batteries. The accuracy of the proposed method is generally high for the prediction of the degradation curves of different batteries. When the amount of data reaches about 40% of the dataset, the prediction accuracy improves significantly. The error curves also show that this method can predict capacity fluctuations of different degrees. It is therefore verified that the proposed method has better adaptability while maintaining high prediction accuracy.

Author Contributions:
Conceptualization, Q.B., W.Q. and Z.Y.; methodology, Q.B.; software, Q.B.; validation, Q.B., W.Q. and Z.Y.; formal analysis, Q.B.; investigation, Q.B.; resources, Q.B.; writing-original draft preparation, Q.B.; writing-review and editing, Q.B. and Z.Y.; visualization, Q.B.; supervision, W.Q.; project administration, W.Q. All authors have read and agreed to the published version of the manuscript.

Funding: This work was supported by National key research and development program under Grant 2020YFB160070301, the Key R&D Program of Jiangsu Province under Grant BE2019311 and Jiangsu modern agricultural industry key technology innovation project under Grant CX(20)2013.

Data Availability Statement: The data used in this paper are from the NASA Battery Aging Dataset and the CALCE battery dataset of CS2 cells.

Table 2. Evaluation metrics for multiple experiments.

Table 3. MAE, RMSE and their averages of multiple experiments (different starting points).

Table 4. STD and their averages of multiple prediction errors (different starting points).

Table 6. Comparison of RUL prediction results according to different methods.