Remaining Useful Life Prediction for Lithium-Ion Batteries Based on Iterative Transfer Learning and Mogrifier LSTM

Lithium-ion battery health and remaining useful life (RUL) are essential indicators for reliable operation. Most RUL prediction methods currently proposed for lithium-ion batteries are data-driven, but their accuracy is limited by the length of the available training data. To address this problem and improve the safety and reliability of lithium-ion batteries, a Li-ion battery RUL prediction method based on iterative transfer learning (ITL) and the Mogrifier long short-term memory network (Mogrifier LSTM) is proposed. First, the capacity degradation data are extracted from the historical lifetime experimental data of the source and target domain batteries. The sparrow search algorithm (SSA) optimizes the parameters of variational modal decomposition (VMD), and the historical capacity degradation data are decomposed with the optimized parameters into several intrinsic mode function (IMF) components. The highly correlated IMF components are selected using the maximal information coefficient (MIC) and used to reconstruct the capacity sequence, which characterizes the capacity degradation of the battery. The reconstructed capacity degradation information of the source domain batteries is then iteratively input into the Mogrifier LSTM to obtain a pre-trained model. Finally, the pre-trained model is transferred to the target domain to construct the lithium battery RUL prediction model. The method's effectiveness is verified using the CALCE and NASA Li-ion battery datasets, and the results show that the ITL-Mogrifier LSTM model has higher accuracy and better robustness and stability than other prediction methods.


Introduction
Lithium-ion batteries are widely used in various industries, including consumer electronics, electric vehicles, communications, and aircraft, due to their advantageous characteristics such as long lifespan, high energy density, low self-discharge rate, and environmental friendliness [1]. Their health management, performance degradation, safety maintenance, and remaining useful life (RUL) prediction have become important research issues [2]. In practice, as operation time increases, lithium-ion batteries inevitably deteriorate until failure [3]. Failure of lithium-ion batteries may lead to serious economic losses and catastrophic consequences [4]. Therefore, accurate lithium-ion battery RUL prediction and timely management and maintenance are important to ensure the safe operation of equipment and to avoid equipment failure [5].
Lithium-ion battery RUL prediction is mainly divided into model-based approaches and data-driven approaches [6]. Data-driven approaches are more suitable for large-scale engineering applications, as they extract health factors from raw data to depict degradation trends without considering the internal physical and chemical changes within the battery [7]. Ansari Shaheer et al. [8] proposed an enhanced lithium-ion battery life prediction model based on a recurrent neural network (RNN) and the particle swarm optimization (PSO) algorithm, demonstrating high robustness and prediction accuracy. Nevertheless, the PSO algorithm tends to fall into local optima, making it challenging to obtain the exact optimal solution. In addition, the LSTM, a variant of the RNN, effectively addresses the problems of exploding and vanishing gradients [9]. Qu et al. [10] developed a PA-LSTM model for Li-ion battery RUL prediction, combining an LSTM network, particle swarm optimization, and an attention mechanism. Their method employs Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to denoise the original data, thus improving prediction accuracy. Similarly, Liu et al. [9] optimized the LSTM network using the improved sparrow search algorithm (ISSA) to predict Li-ion battery RUL, achieving better accuracy and robustness. Because a single LSTM network may be inefficient in the presence of random data fluctuations, the Mogrifier LSTM [11] was introduced to enhance the interaction between input and hidden state. Bo et al. [12] proposed a combined prediction method employing the deep belief network (DBN) and the Mogrifier LSTM network, with Ensemble Empirical Mode Decomposition (EEMD) used to preprocess lithium battery capacity data. This method effectively addresses the battery capacity regeneration problem. He et al. [13] proposed CAM-LSTM-DA, a method based on causal analysis, an attention mechanism, and the Mogrifier LSTM, which uses unsupervised constrained adversarial domain adaptation to enhance the generalization and prediction performance of Li-ion battery RUL models. These approaches contribute to the informed utilization of Li-ion batteries.
Data-driven approaches focus on improving the accuracy of the underlying model. However, variations in data distribution across different types, components, and batches of products can result in substantial model errors [14]. Therefore, data-driven models constructed under such variations cannot guarantee high accuracy or good generalization ability [15]. The accuracy of their predictions often depends on the length of the available data that the model can learn from [16]. When the training data are insufficient to learn the degradation features, achieving the expected prediction accuracy becomes challenging [17]. Consequently, the model lacks generalization capability and the ability to rapidly predict the RUL of batch datasets under diverse working conditions [17]. Transfer learning (TL) is an emerging solution to the challenges posed by variations in data distribution. Ma et al. [18] introduced a hybrid network that combines TL, DBN, and LSTM. Lu et al. [19] presented a Li-ion battery State of Health (SOH) evaluation model that uses transfer learning and LSTM, effectively reducing the training iterations of the target domain model for SOH prediction. To address the challenge of predicting the RUL of Li-ion batteries from small target sample sets, Wang et al. [20] introduced a transfer learning-based Gated Recurrent Unit (GRU) RUL prediction model. That study further demonstrated the potential of State of Charge (SOC) estimation using big data techniques and limited target samples. Zou et al. [21] proposed a transfer learning-based fusion model that combines Convolutional Neural Networks (CNN) and LSTM. This model effectively addresses the small sample sizes encountered in Li-ion battery performance evaluation tasks. Chou et al. [22] proposed a fine-tuning model with TL for predicting the future capacity of lithium-ion batteries. The authors demonstrated the potential of this model for predictive maintenance of lithium batteries using actual battery charge/discharge data.
In summary, transfer learning enhances the learning of the model by increasing the available data, thereby improving its generalization ability. This approach is particularly beneficial when only limited data are available for lithium battery prediction. Therefore, this paper introduces an ITL-Mogrifier LSTM model based on iterative transfer learning to address the RUL prediction problem for lithium batteries with small samples. The experimental results demonstrate the effectiveness of the proposed ITL-Mogrifier LSTM model in accurately predicting the RUL of Li-ion batteries, as validated using datasets from the Center for Advanced Life Cycle Engineering (CALCE) and the NASA Ames Prognostics Center of Excellence (PCoE). Additionally, a comparison with other prediction methods confirms the superiority of the proposed approach. The primary contributions of this paper can be summarized as follows:
(1) The VMD algorithm decomposes the non-stationary signal in the time series into the primary degradation trend and the remaining noise. To improve the stability and accuracy of the VMD decomposition, the SSA is used to optimize the VMD parameters and achieve adaptive parameter selection. The maximal information coefficient is used to select the highly correlated IMF components, which are then reconstructed to characterize the battery capacity degradation information. This avoids the battery capacity regeneration problem and improves the RUL prediction accuracy of the model.
(2) The capacity degradation information of the source domain battery dataset is iteratively input to the Mogrifier LSTM to form a pre-trained model. The target domain battery capacity degradation information is then input to the pre-trained model through transfer learning, which effectively suppresses the impact of significant data distribution differences on the model and achieves multi-source domain transfer learning.
The rest of this paper is organized as follows: Section 2 presents the theory of SSA-VMD, MIC, and the Mogrifier LSTM. Section 3 describes the source and target domain battery data and the ITL-Mogrifier LSTM prediction process. The experimental results and error analysis are presented in Section 4, and Section 5 summarizes the main conclusions.

Sparrow Search Algorithm-Variational Modal Decomposition
VMD is an adaptive signal processing method based on Wiener filtering, which has significant advantages in dealing with nonlinear and non-stationary signals [23]. Therefore, the VMD model is used to decompose the battery capacity sequence [24].
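Unlike EMD, VMD defines each intrinsic mode function as an amplitude-modulated, frequency-modulated signal:

$$u_k(t) = A_k(t)\cos\big(\phi_k(t)\big) \tag{1}$$

where $A_k(t)$ is the instantaneous amplitude of $u_k(t)$, $\phi_k(t)$ is its instantaneous phase, and $\omega_k(t) = \phi_k'(t)$ is its instantaneous frequency. The original signal $f(t)$ is decomposed into $K$ modal functions $u_k(t)$, each concentrated around a central frequency $\omega_k$. To achieve the decomposition of the original signal, the constrained variational problem is defined as follows:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t\!\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{2}$$

where $t$ is the time index, $\delta(t)$ is the Dirac distribution, $*$ denotes convolution, and $\{u_k\}$ and $\{\omega_k\}$ are the sets of all modes and their corresponding central frequencies, respectively. By introducing the Lagrange multiplier $\lambda$ and a quadratic penalty term, the transformed expression is obtained as follows:

$$L\big(\{u_k\},\{\omega_k\},\lambda\big) = \alpha \sum_{k=1}^{K} \left\| \partial_t\!\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \tag{3}$$

where $\alpha$ is the quadratic penalty factor, which reduces the interference of Gaussian noise; $\lambda(t)$ is the Lagrange multiplier; $\langle \cdot, \cdot \rangle$ is the inner product operation; and the other parameters have the same meaning as above.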
When applying VMD, it is essential to set appropriate values for two parameters: the number of modes ($K$) and the penalty parameter ($\alpha$). A $K$ value that is too large results in over-decomposition, while a $K$ value that is too small leads to under-decomposition. Similarly, an $\alpha$ value that is too large causes the loss of valuable band information, whereas an $\alpha$ value that is too small introduces redundant information. Thus, it is crucial to determine the optimal combination of these parameters $(K, \alpha)$ [25].
The sparrow search algorithm (SSA) is an optimization algorithm proposed in 2020 [25]. In the SSA, individuals are divided into three categories: discoverers, followers, and vigilantes, and the position of each individual corresponds to a candidate solution. The algorithm obtains the position of the optimal solution by continuously updating the positions of these three categories of individuals and calculating the fitness value of all individuals in each cycle. The main iteration steps are as follows [26].
Step 1: Initialize the population, the proportion of discoverers and followers, and the number of iterations.
Step 2: Calculate the fitness values and sort them from largest to smallest.
Step 3: Update the discoverer positions.
Step 4: Update the follower positions.
Step 5: Update the vigilante positions.
Step 6: Calculate the fitness value and update the sparrow position.
Step 7: If the requirements are met, output the result; otherwise, repeat steps 2-6.
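(1) Discoverer position update.
The discoverer checks for predators in the foraging area. If there are none, it searches extensively for food; if there are predators, it flies to a safe area. The expression is shown in Equation (4) below.

$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\!\left(\dfrac{-i}{\alpha \cdot iter_{\max}}\right), & R_2 < ST \\[2ex] X_{i,j}^{t} + Q \cdot L, & R_2 \ge ST \end{cases} \tag{4}$$

where $X_{i,j}^{t}$ denotes the position of the $i$th sparrow in the $j$th dimension and $t$ denotes the current iteration. $iter_{\max}$ is the maximum number of iterations, and $\alpha \in (0, 1]$ is a random number (unrelated to the VMD penalty factor). $R_2 \in (0, 1]$ and $ST \in (0.5, 1.0]$ denote the warning value and the warning threshold, respectively. $L$ is a $d$-dimensional vector whose elements are all 1. $Q$ is a normally distributed random number.

(2) Follower position update.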
When a follower joins, it is determined whether it is eligible to compete with the discoverer for food, i.e., whether its location is better. If its location corresponds to a lower fitness level, it is not eligible to compete and needs to fly to another area to forage; otherwise, the follower forages in the vicinity of the best individual $X_p$. The expression is shown below.

$$X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\!\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\[2ex] X_{p}^{t+1} + \left| X_{i,j}^{t} - X_{p}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases}$$

where $n$ is the population size, $X_{worst}^{t}$ denotes the position of the worst adapted individual in generation $t$, $X_{p}^{t+1}$ denotes the position of the best adapted individual in generation $t + 1$, and $A$ is a matrix of the same dimension as $L$ whose elements are randomly assigned the value 1 or −1 and which satisfies $A^{+} = A^{T}\left(AA^{T}\right)^{-1}$.
(3) Vigilante position update.
When individuals are at the periphery of the population, they need to adopt anti-predator behavior to achieve a higher degree of adaptation; when they are at the center of the population, they need to move closer to their peers to stay away from the danger zone. The expression is as follows.

$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\[2ex] X_{i,j}^{t} + B \cdot \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon}, & f_i = f_g \end{cases}$$

where $X_{best}^{t}$ is the current best position. $\beta$ is a normally distributed random number with a mean of 0 and a variance of 1. $B \in [0, 1]$ is a random number. $f_i$ denotes the fitness of the current sparrow, $f_g$ is the current global best fitness, $f_w$ is the current worst fitness, and $\varepsilon$ is a small constant that prevents division by zero.
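The three position updates can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `ssa_minimize`, the role fractions `disc_frac` and `vig_frac`, the threshold `st`, and the box clipping are all hypothetical choices, and the sketch minimizes its objective, so the population is sorted in ascending order of fitness.

```python
import math
import random


def ssa_minimize(fitness, lower, upper, pop=20, iters=50,
                 disc_frac=0.2, vig_frac=0.1, st=0.8, seed=0):
    """Simplified sparrow search that minimizes `fitness` inside box bounds."""
    rng = random.Random(seed)
    dim = len(lower)

    def clip(x):
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    # Step 1: initialize the population and the role proportions.
    X = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(pop)]
    n_disc = max(1, int(pop * disc_frac))
    n_vig = max(1, int(pop * vig_frac))
    best_x, best_f, history = None, math.inf, []

    for _ in range(iters):
        # Step 2: sort so that index 0 is the fittest (lowest value here).
        X.sort(key=fitness)
        fit = [fitness(x) for x in X]
        if fit[0] < best_f:
            best_x, best_f = X[0][:], fit[0]
        history.append(best_f)
        x_best, f_g = X[0][:], fit[0]
        x_worst, f_w = X[-1][:], fit[-1]

        # Step 3: discoverer update (shrink toward the origin or take a random step).
        r2 = rng.random()
        for i in range(n_disc):
            if r2 < st:
                alpha = 1.0 - rng.random()  # random number in (0, 1]
                scale = math.exp(-(i + 1) / (alpha * iters))
                X[i] = clip([v * scale for v in X[i]])
            else:
                q = rng.gauss(0.0, 1.0)
                X[i] = clip([v + q for v in X[i]])

        # Step 4: follower update around the best discoverer x_p.
        x_p = X[0][:]
        for i in range(n_disc, pop):
            if i + 1 > pop / 2:
                q = rng.gauss(0.0, 1.0)
                X[i] = clip([q * math.exp((w - v) / (i + 1) ** 2)
                             for v, w in zip(X[i], x_worst)])
            else:
                # A+ = A^T (A A^T)^-1 reduces to A^T / dim for a row of +-1 entries.
                step = sum(abs(v - p) * rng.choice((-1.0, 1.0))
                           for v, p in zip(X[i], x_p)) / dim
                X[i] = clip([p + step for p in x_p])

        # Step 5: vigilante update for a random subset of sparrows.
        for i in rng.sample(range(pop), n_vig):
            f_i = fitness(X[i])
            if f_i > f_g:
                beta = rng.gauss(0.0, 1.0)
                X[i] = clip([b + beta * abs(v - b) for v, b in zip(X[i], x_best)])
            elif f_i == f_g:
                k = rng.random()
                X[i] = clip([v + k * abs(v - w) / ((f_i - f_w) + 1e-12)
                             for v, w in zip(X[i], x_worst)])

    # Steps 6-7: evaluate the final positions and return the best solution found.
    for x in X:
        f = fitness(x)
        if f < best_f:
            best_x, best_f = x[:], f
    return best_x, best_f, history
```

For SSA-VMD, `fitness` would map a candidate $(K, \alpha)$ pair to the objective value of the resulting decomposition; here any callable on a list of floats works, for example `ssa_minimize(lambda x: sum(v * v for v in x), [-5.0, -5.0], [5.0, 5.0])`.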
The objective function chosen for SSA-VMD is the mean envelope entropy (MEE). The expression of the objective function MEE is:

$$\mathrm{MEE} = \frac{1}{K}\sum_{i=1}^{K} E_{P_i}$$

where $K$ denotes the number of IMF components obtained through VMD and $E_{P_i}$ is the envelope entropy of the $i$th IMF component. The smaller the MEE, the lower the complexity of the IMFs and the more stable the signal. Figure 1 shows the flowchart of SSA-VMD.
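As a rough illustration of this objective, the sketch below computes the MEE from envelopes that are assumed to be already available (for instance, the magnitude of the analytic signal of each IMF). The envelope entropy formula used here, the Shannon entropy of the normalized envelope with the natural logarithm, is a common definition and is an assumption; the function names are hypothetical.

```python
import math


def envelope_entropy(envelope):
    """Shannon entropy of an envelope normalized to sum to 1 (assumed definition)."""
    total = sum(envelope)
    probs = [a / total for a in envelope if a > 0]  # zero samples contribute nothing
    return -sum(p * math.log(p) for p in probs)


def mean_envelope_entropy(envelopes):
    """Average envelope entropy over the K IMF envelopes (the MEE objective)."""
    return sum(envelope_entropy(e) for e in envelopes) / len(envelopes)
```

A flat envelope gives the largest entropy for its length, while an envelope concentrated in one sample gives zero, so a lower MEE indicates more regular components.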

Maximum Information Coefficient
The maximal information coefficient (MIC) is developed from mutual information and belongs to the maximal information-based nonparametric exploration (MINE) family of statistics [27]. The MIC is defined as follows: the two-dimensional space of the data is divided into $x$ intervals and $y$ intervals in the x and y directions, respectively, to form an $x \times y$ grid; for the same numbers of intervals, the grid can be drawn in a variety of ways. Let $\Omega$ be the set of grids formed by the different partitions; then the MIC is defined as:

$$\mathrm{MIC}(D) = \max_{x y < B(n)} \frac{\max_{G \in \Omega} I(D|G)}{\log_2 \min(x, y)}$$

where $D|G$ represents the distribution of the dataset $D$ on the grid $G$; $I(D|G)$ is the mutual information of $D|G$; $n$ is the number of samples in the dataset; and $B(n)$ is the upper limit on the number of grid cells. Increasing this upper limit makes the MIC measure of correlation more accurate but also increases the computational complexity. The best results are obtained when $B(n) = n^{0.6}$ [28].
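The definition can be made concrete with a simplified estimator. The sketch below is an approximation and not the full MIC search: instead of maximizing over every grid in $\Omega$, it evaluates only one equal-frequency grid per $(x, y)$ size, so it gives a lower bound on the true value. The function names, the tie handling (ties are split by index order), and the helper `select_above_mean` are assumptions made for illustration.

```python
import math


def _equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding nearly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * k // len(values)
    return bins


def _mutual_information(bx, by, kx, ky):
    """Mutual information in bits of two binned sequences."""
    n = len(bx)
    joint = [[0] * ky for _ in range(kx)]
    for a, b in zip(bx, by):
        joint[a][b] += 1
    px = [sum(row) / n for row in joint]
    py = [sum(joint[a][b] for a in range(kx)) / n for b in range(ky)]
    mi = 0.0
    for a in range(kx):
        for b in range(ky):
            if joint[a][b]:
                p = joint[a][b] / n
                mi += p * math.log2(p / (px[a] * py[b]))
    return mi


def approx_mic(x, y, exponent=0.6):
    """Approximate MIC with B(n) = n**exponent; returns 0.0 if no grid qualifies."""
    limit = len(x) ** exponent
    best = 0.0
    for kx in range(2, int(limit) + 1):
        for ky in range(2, int(limit) + 1):
            if kx * ky < limit:  # grid-size constraint x * y < B(n)
                bx = _equal_frequency_bins(x, kx)
                by = _equal_frequency_bins(y, ky)
                mi = _mutual_information(bx, by, kx, ky)
                best = max(best, mi / math.log2(min(kx, ky)))
    return best


def select_above_mean(scores):
    """Indices of components whose score exceeds the mean score."""
    mean = sum(scores) / len(scores)
    return [i for i, s in enumerate(scores) if s > mean]
```

In the context of this paper, `approx_mic` would be applied to each IMF against the original capacity sequence, and `select_above_mean` would pick the components kept for reconstruction.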

Mogrifier LSTM
At the 2020 International Conference on Learning Representations (ICLR), Gábor Melis, Tomáš Kočiský, and Phil Blunsom presented the Mogrifier LSTM deep learning algorithm [11], which improves on the standard LSTM. It extends the original LSTM by adding two gating units in front of the cell, which enlarges the interaction space between the network input and output, makes full use of the intrinsic connection between input and output, and improves the dynamic approximation capability of the network. Figure 2 shows the structural unit of the Mogrifier LSTM network.
The Mogrifier LSTM is an extension of the LSTM in which the input $x(t)$ and the previous output $h(t-1)$ gate each other in an alternating manner before entering the conventional LSTM cell. The dashed box in Figure 3 shows this interactive control process of input and output. $x(t)$ is transformed by a sigmoid gate to obtain the control state $u(t)$. The sigmoid function constrains each element of $u(t)$ to the range [0, 1], and the element-wise multiplication of $u(t)$ and $h(t-1)$ scales each element of $h(t-1)$ to a different degree: if an element of $u(t)$ is 1, the corresponding element of $h(t-1)$ flows into the network unchanged; if an element of $u(t)$ is 0.5, the corresponding element of $h(t-1)$ is halved before it flows into the network. The value of each element of $u(t)$ is determined by the weights applied to $x(t)$ in that layer, which are continuously updated during training to reduce the network loss. The update process for $u(t)$ and $h(t-1)$ in Figure 3 is shown below:

$$u(t) = \sigma\big(W_u\,x(t) + b_u\big), \qquad h'(t-1) = u(t) \odot h(t-1)$$

where $W_u$ is the weight with which the input $x(t)$ controls $h(t-1)$, $b_u$ is its bias, and $\odot$ denotes element-wise multiplication. $h(t-1)$ is transformed by $u(t)$ into $h'(t-1)$.

The final output of the recurrent neural network is obtained from $c(t)$. During training, the network uses the time series to update $c(t)$. If an element of the input time series changes drastically, the network should learn how to deal with it. To this end, the Mogrifier LSTM lets $h(t-1)$ control a gate on $x(t)$. As shown in Figure 4, $h(t-1)$ is processed by a sigmoid gate to obtain the control state $v(t)$. $v(t)$ then transforms each element of $x(t)$, and the network uses the loss obtained after the transformation to fine-tune the weights of the layer through the gradient update. The update process for $v(t)$ and $x(t)$ is as follows:

$$v(t) = \sigma\big(W_v\,h(t-1) + b_v\big), \qquad x'(t) = v(t) \odot x(t)$$

where $W_v$ is the weight with which $h(t-1)$ controls the passage of $x(t)$ and $b_v$ is its bias. $x(t)$ is transformed by $v(t)$ into $x'(t)$.
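The mutual gating described in this section can be illustrated with plain Python lists. The sketch follows the two-gate description given here (one gate on $h(t-1)$ driven by $x(t)$, one gate on $x(t)$ driven by $h(t-1)$); the original Mogrifier formulation in [11] additionally scales each gate by 2 and alternates over several rounds, which is not reproduced. The function names and the nested-list weight layout are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, vec, b):
    """Return W @ vec + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + bi for row, bi in zip(W, b)]


def mogrify(x, h_prev, W_u, b_u, W_v, b_v):
    """Gate h(t-1) with x(t), and x(t) with h(t-1), before the LSTM cell."""
    u = [sigmoid(z) for z in affine(W_u, x, b_u)]       # control state u(t), from x(t)
    h_mod = [ui * hi for ui, hi in zip(u, h_prev)]      # h'(t-1) = u(t) * h(t-1)
    v = [sigmoid(z) for z in affine(W_v, h_prev, b_v)]  # control state v(t), from h(t-1)
    x_mod = [vi * xi for vi, xi in zip(v, x)]           # x'(t) = v(t) * x(t)
    return x_mod, h_mod
```

With all weights and biases set to zero, every gate equals 0.5 and both vectors are halved, which matches the halving case described in the text; a large positive bias drives the gate toward 1 and leaves the vector essentially unchanged.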

Source Domain Cell Dataset: CALCE Dataset
The source domain battery dataset used in this study was obtained from the lithium-ion battery dataset provided by the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland, USA [29]. Four battery charge and discharge experimental datasets, CS2_35, CS2_36, CS2_37, and CS2_38, were selected as the source domain data for the model. The CS2 series batteries are lithium cobalt oxide batteries, with LiCoO2 mixed with carbon as a conductive additive as the cathode and layered graphite bonded with polyvinylidene fluoride as the anode [30,31]. At the same room temperature (24 °C), all batteries underwent the same charge/discharge pattern. Each battery was charged at a constant current of 0.5 C until the voltage reached 4.2 V. The voltage was then held at 4.2 V for constant voltage charging until the charging current dropped to 0.05 A, after which a constant current discharge was performed at 1 C. The discharge was stopped when the voltage reached 2.7 V. The same charging and discharging experiments were repeated for each battery and stopped when the capacity of the battery had dropped from 1.1 Ah to 0.88 Ah. Figure 5 shows the capacity degradation trend of the CALCE lithium batteries.

Target Domain Battery: NASA Dataset
This study's target domain battery dataset is derived from the lithium battery data of the NASA Ames Prognostics Center of Excellence (PCoE) [32,33]. Batteries B5 and B6 were selected to validate the proposed method. Both batteries have a rated capacity of 2 Ah. The NASA batteries are ternary lithium batteries, with lithium nickel-cobalt-aluminum oxide as the cathode material and graphite as the anode material [34]. The battery aging experiments were performed at room temperature with the following procedure. First, the battery is charged at a constant current of 1.5 A until the battery voltage reaches 4.2 V. Charging then continues in constant voltage mode and is completed when the charging current drops to 20 mA. Next, batteries B5 and B6 are discharged in constant current mode at 2 A until the battery voltage drops to 2.7 V and 2.5 V, respectively. The battery failure threshold is reached when the battery capacity drops to 70% of the rated capacity. Figure 6 shows the NASA lithium battery capacity degradation trend.


Specific Experimental Procedures
(1) In the data pre-processing stage: The actual battery capacity decay leads to a decreasing trend of the overall SOH and to battery aging. Therefore, the battery capacity can be used as a direct health indicator (HI) to assess the degree of battery aging [35]. The capacity degradation data are extracted separately from the NASA and CALCE lithium battery historical life experiment data. The CALCE lithium battery capacity degradation dataset is used as the source domain, and the NASA dataset is used as the target domain. The capacity sequences are decomposed by SSA-VMD, and the IMF components selected by the MIC are used to reconstruct the capacity degradation information.
(2) In the pre-training stage: The reconstructed capacity degradation information of the source domain batteries is iteratively input into the Mogrifier LSTM model, and a pre-trained model with high prediction accuracy is obtained after several iterations.
(3) In the RUL prediction stage: The optimal Mogrifier LSTM pre-trained model obtained from the training is transferred to the target domain, and the capacity degradation curve of the target domain battery is predicted after fine-tuning on the target domain training set, where the prediction step is 5; then the RUL of the target domain battery is calculated. To verify the prediction performance of the model, different prediction starting points are chosen.
(4) In the error analysis stage: The model performance is evaluated using CRA, MAPE, $RUL_{Error}$, and $PRUL_{Error}$, and the results show that the proposed method has better prediction accuracy for the RUL of Li-ion batteries than the Mogrifier LSTM and LSTM neural network models.
The flow chart is shown in Figure 7.
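The pre-train, transfer, and fine-tune sequence of this procedure can be sketched with a deliberately tiny stand-in model. Everything below is an illustrative assumption: the Mogrifier LSTM is replaced by a one-parameter model $c_{t+1} \approx w\,c_t$ trained by gradient descent, the function names are hypothetical, and the default learning rate is chosen for capacities on the order of 1 to 2 Ah. The sketch only shows the order of operations, not the predictive behavior of the real network.

```python
def fit_decay_rate(w, capacity, epochs=20, lr=0.05):
    """Gradient descent on the MSE of the one-step model c[t+1] ~ w * c[t]."""
    pairs = list(zip(capacity[:-1], capacity[1:]))
    for _ in range(epochs):
        grad = sum(2.0 * (w * a - b) * a for a, b in pairs) / len(pairs)
        w -= lr * grad
    return w


def iterative_pretrain(source_sequences, w0=1.0, rounds=5):
    """Pass repeatedly over all source-domain sequences, reusing the weight."""
    w = w0
    for _ in range(rounds):
        for seq in source_sequences:
            w = fit_decay_rate(w, seq)
    return w


def predict_rul(w, last_capacity, threshold, max_cycles=10000):
    """Roll the model forward and count cycles until the failure threshold."""
    c, cycles = last_capacity, 0
    while c > threshold and cycles < max_cycles:
        c *= w
        cycles += 1
    return cycles
```

A typical call order would be `w_pre = iterative_pretrain(source_sequences)`, then `w_ft = fit_decay_rate(w_pre, target_training_cycles, epochs=50)` for fine-tuning, and finally `predict_rul(w_ft, target_training_cycles[-1], threshold)`.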

Experimental Verification

SSA-VMD
To enhance the accuracy and reliability of lithium battery life prediction, the dataset undergoes SSA-VMD decomposition, which reduces noise and interference. Specifically, the battery capacity degradation data from both the source and target domains are subjected to this analysis. The decomposition requires determining the optimal number of modes, $K$, for VMD: excessively small $K$ values lead to inadequate decomposition, while excessively large $K$ values lead to excessive decomposition. To address this issue, the VMD parameters $K$ and $\alpha$ are optimized through SSA. The SSA parameters [36] adopted in this paper are presented in Table 1 and comprise the population size (Num), the maximum number of iterations (Iter), the lower and upper boundaries (Lb and Ub), and the search dimension (Dim). The optimal parameter combinations $[K, \alpha]$ resulting from SSA-VMD decomposition are [5, 171] and [5, 200].

To assess the decomposition performance, the commonly used Index of Orthogonality (IO) [38] is employed. This index evaluates the effectiveness of sequence decomposition techniques. A smaller IO value indicates better orthogonality between the components and thus higher decomposition accuracy. The definition of the orthogonality index is presented below:

$$\mathrm{IO} = \sum_{t=0}^{T} \left( \sum_{i=1}^{K} \sum_{\substack{j=1 \\ j \neq i}}^{K} \frac{u_i(t)\,u_j(t)}{x^{2}(t)} \right) \tag{14}$$

where $u_i$ and $u_j$ represent the $i$th and $j$th IMFs, respectively, $x(t)$ is the original sequence, and $T$ is its length. Using Equation (14), the IO values for the decomposition results of the B5 battery's capacity data are obtained and presented in Table 2. The SSA-VMD decomposition exhibits smaller IO values than the EMD decomposition for the B6 battery's capacity data, suggesting superior decomposition performance.

Maximum Information Coefficient (MIC)
Compared with the original signal, the IMF components effectively capture and reflect the performance degradation trend of Li-ion batteries.

To reduce the effects of capacity regeneration and noise, the maximal information coefficient is used to measure the correlation of each modal component with the original capacity degradation data, and modal components with low correlation are discarded as noise. Table 3 shows the normalized maximal information coefficients of each IMF component in the source and target domain battery datasets, together with their average value. The IMF components whose MIC values are greater than the average are selected for capacity sequence reconstruction. The reconstructed capacity degradation information is highly correlated with the original capacity information and shows a more obvious degradation trend, so it is used to characterize the lithium battery capacity degradation. Figure 9 shows the reconstructed capacity degradation curves of the source domain and target domain battery datasets.

To verify the effectiveness of the proposed method in the RUL prediction of Li-ion batteries, the experimental results were compared with those of the Mogrifier LSTM and LSTM. The error between the actual and predicted values when the capacity of the Li-ion battery drops to the failure threshold is defined as follows:

$$RUL_{Error} = \left| RUL_{True} - RUL_{pre} \right|$$

$$PRUL_{Error} = \frac{\left| RUL_{True} - RUL_{pre} \right|}{RUL_{True}} \times 100\%$$

where $RUL_{True}$ denotes the actual RUL, i.e., the number of cycles at which the capacity falls to the failure threshold (1.4 Ah); the actual RULs of the B5 and B6 batteries are 124 and 108 cycles, respectively. $RUL_{pre}$ is the predicted RUL, $RUL_{Error}$ denotes the absolute error between $RUL_{True}$ and $RUL_{pre}$, and $PRUL_{Error}$ is the relative error percentage.
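The two error measures are straightforward to compute once the failure cycle has been located on a capacity curve. The helper names below are hypothetical, and the cycle index is assumed to be 1-based.

```python
def end_of_life_cycle(capacity, threshold):
    """Return the 1-based cycle at which capacity first reaches the threshold, or None."""
    for cycle, c in enumerate(capacity, start=1):
        if c <= threshold:
            return cycle
    return None


def rul_errors(rul_true, rul_pred):
    """Return (absolute error in cycles, relative error in percent)."""
    abs_err = abs(rul_true - rul_pred)
    return abs_err, abs_err / rul_true * 100.0
```

For a 2 Ah cell with a 70% threshold, `end_of_life_cycle(capacity, 1.4)` gives the failure cycle of a measured or predicted curve, and `rul_errors` compares the two resulting values.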
The prediction results in Figures 10-12 reveal significant improvements of the ITL-Mogrifier LSTM method over the LSTM method. The ITL-Mogrifier LSTM approach produces predictions that closely follow the actual capacity decay curve of Li-ion batteries. This improvement can be attributed to the Mogrifier LSTM's ability to address the low convergence efficiency of unidirectional neural networks and to use temporal context information effectively, resulting in more accurate predictions. Furthermore, a comparison of the ITL-Mogrifier LSTM and the Mogrifier LSTM shows that iterative transfer learning allows the model to learn features more comprehensively, leading to better prediction results than the Mogrifier LSTM alone. These improvements hold under the different cycling conditions. Specifically, the RUL predicted from the reconstructed capacity sequence under the 100-cycle condition is significantly better than the predictions under the 80-cycle and 60-cycle conditions. This increase in accuracy can be attributed to the larger amount of training data, which yields predictions that align closely with the actual battery capacity degradation curve. Additionally, the proposed method demonstrates higher stability, as its predicted curve reaches the failure threshold earlier than the original capacity degradation curve, further confirming its effectiveness.
Tables 4-7 give the quantitative prediction errors of the model. For battery B5, the RUL prediction errors under the different conditions have a MAPE of not more than 6.5% and an $RUL_{Error}$ of not more than 1; for battery B6, they have a MAPE of not more than 9.0% and an $RUL_{Error}$ of not more than 1. From the algorithmic perspective, first, the SSA-VMD algorithm adopted in this paper reduces the complexity of the data and enables the model to learn the intrinsic information of the data thoroughly; noise components are discarded and the capacity sequence is reconstructed by screening with the maximum information coefficient, so that the model can capture the capacity regeneration fluctuation. Second, iterative transfer learning increases the amount of data available for model training, so the model learns more and its prediction results are more accurate. In summary, both the quantitative error analysis and the algorithmic analysis indicate that the proposed algorithm has better accuracy.

To quantitatively evaluate the prediction performance, the cumulative relative accuracy (CRA) and the mean absolute percentage error (MAPE) of the different methods are compared, with their formulae as follows [39]:

$$CRA = \frac{1}{K} \sum_{k=1}^{K} \left( 1 - \frac{\left| l_k - \hat{l}_k \right|}{l_k} \right)$$

$$MAPE = \frac{1}{K} \sum_{k=1}^{K} \left| \frac{\hat{l}_k - l_k}{l_k} \right| \times 100\%$$

where $K$ is the length of the samples, and $\hat{l}_k$ and $l_k$ represent the predicted and true values at moment $t_k$. The higher the CRA value, the better the performance, and the smaller the MAPE value, the better the performance. This study uses the continuous multi-step sliding window prediction method with a prediction step length of five steps. This implies that the earliest prediction moment occurs five cycles prior to reaching the failure threshold cycle point. Figures 10-12 illustrate that the accuracy of the prediction result improves as the failure threshold cycle point approaches.
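The two metrics can be sketched in a few lines of Python. This is an illustrative implementation that assumes the equally weighted form of CRA, i.e., the mean over all samples of one minus the relative error; the function names and the sample values are hypothetical.

```python
from typing import Sequence


def cra(true_values: Sequence[float], predicted: Sequence[float]) -> float:
    """Cumulative relative accuracy: mean of 1 - |l_k - l_hat_k| / l_k."""
    if len(true_values) != len(predicted) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(1.0 - abs(t - p) / t for t, p in zip(true_values, predicted))
    return total / len(true_values)


def mape(true_values: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(true_values) != len(predicted) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((p - t) / t) for t, p in zip(true_values, predicted))
    return total / len(true_values) * 100.0


# Made-up values: the first prediction is off by 50%, the second is exact.
example_cra = cra([2.0, 1.0], [1.0, 1.0])
example_mape = mape([2.0, 1.0], [1.0, 1.0])
```

With equal weights the two quantities carry the same information (CRA equals one minus MAPE expressed as a fraction), which is why a higher CRA always accompanies a lower MAPE in this form.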
To verify the robustness and stability of the proposed method, we use 5-fold cross-validation [40] and select different training and test sets for the RUL prediction of lithium batteries. Specifically, within the target domain dataset, the test data for the CALCE lithium batteries constitute 60% of the total cycling data. For the NASA data, the starting point is the 100th cycle for the B5 battery and the 90th cycle for the B6 battery.
Figures 13-16 show that the robustness of the proposed method can be adequately assessed using 5-fold cross-validation. Additionally, the degradation trend of the CALCE batteries displays significant volatility, which is difficult to model with a simple linear relationship. Consequently, a robust and stable model is required. Tables 8-23 demonstrate that the prediction method proposed in this paper outperforms the other prediction methods in battery RUL prediction. This superiority is evident in the $RUL_{Error}$, $PRUL_{Error}$, CRA, and MAPE evaluations. Furthermore, regardless of the choice of source domain or target domain dataset, the proposed method consistently performs better than the comparison methods in cross-validation. In terms of $RUL_{Error}$, the proposed method improves by approximately 46.94% compared to the Mogrifier LSTM and 65.79% compared to the LSTM. In terms of $PRUL_{Error}$, it improves by about 50.28% over the Mogrifier LSTM and 67.03% over the LSTM. In terms of CRA, it improves by around 1.98% compared to the Mogrifier LSTM and 5.44% compared to the LSTM, while in terms of MAPE, it improves by approximately 25% compared to the Mogrifier LSTM and 41.37% compared to the LSTM. The results consistently demonstrate significant robustness and stability.

Conclusions
In this paper, an ITL-Mogrifier LSTM method is proposed to predict the RUL of Li-ion batteries. The main findings of this paper are as follows:
(1) The VMD algorithm is capable of accurately capturing the overall decreasing trend of battery capacity and its fluctuations. However, the accuracy and stability of the decomposition results can be influenced by the parameters K and α. To obtain more precise decomposition results, SSA was used to search for optimal parameter combinations. SSA-VMD analysis was conducted on both the source and target domain battery capacity degradation data. The correlation between each IMF component and the original capacity degradation data was evaluated using the maximum information coefficient. The highly correlated IMF components were then reconstructed, effectively characterizing the battery health information and mitigating the phenomenon of capacity regeneration. These reconstructed signals exhibited a strong correlation with the original capacity degradation data. Hence, it is viable to employ these data as the input for Li-ion battery RUL prediction models.
(2) Although the LSTM neural network can predict the RUL of Li-ion batteries, its prediction error is large, and it is prone to vanishing and exploding gradients. The Mogrifier LSTM algorithm increases the interaction between the data to enhance the effective features and weaken the secondary features, and its optimization search performance is better. By using the ITL-Mogrifier LSTM to extract useful knowledge from multiple source domain battery datasets and to suppress the effects of large data distribution differences on the model, multi-domain transfer learning is achieved, and reliable models can be constructed with the help of a small amount of target data.
(3) The prediction results of the proposed method and other methods are compared on the CALCE and NASA Li-ion battery datasets. The experimental results show that the ITL-Mogrifier LSTM has a small prediction error for Li-ion battery RUL and performs better than the other models. In addition, the proposed method not only has better prediction accuracy but also can effectively reduce the cost of collecting lithium-ion battery aging data.

Figure 5. CALCE lithium battery capacity degradation trend.

3.1.2. Target Domain Battery: NASA Dataset
This study's target domain battery dataset is derived from the lithium battery data of the NASA Ames Prognostics Center of Excellence (PCoE) [32,33]. Batteries numbered B5 and B6 were selected to validate the proposed method. The corresponding batteries have a rated capacity of 2 Ah. The battery aging experiments are performed at room tem-
The capacity degradation data are extracted separately from the NASA and CALCE lithium battery historical life experiment data. The CALCE lithium battery capacity degradation dataset is used as the source domain battery dataset, and the NASA lithium battery capacity degradation dataset is used as the target domain dataset. The parameters K and α of VMD are optimized using the SSA, and the source domain and target domain battery capacity degradation data are decomposed to obtain several IMF components. The highly correlated feature components are selected by the MIC for capacity sequence reconstruction, and the reconstructed sequence is input to the model to characterize lithium battery capacity degradation.
(2) In the degradation modeling phase: the reconstructed full-life capacity degradation data of the source domain battery dataset are used to iteratively pre-train the Mogrifier LSTM model, and a pre-trained model with high prediction accuracy is obtained after several iterations.
(3) In the RUL prediction stage: the optimal Mogrifier LSTM pre-trained model obtained from the training is transferred to the target domain, and the capacity degradation curve is predicted for the target domain battery after fine-tuning on the target domain training set, where the prediction step is 5; then the RUL of the target domain battery

Figure 8 shows the IMF components after the decomposition of the source domain battery (with CS2_35 as an example) and the target domain battery dataset (with B5 as an example). In the figure, IMF1 is the main trend degradation component. IMF2 can reflect capacity regeneration. IMF3~5 are random components, which can reflect the random disturbance caused by the battery degradation [37].
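The screening of IMF components by their maximum information coefficient, as described in this paper, can be illustrated with a short Python sketch. It assumes that the normalized MIC of each IMF component has already been computed (the MIC calculation itself is not shown), keeps the components whose score is greater than the average, and sums them to reconstruct the capacity sequence. The function names, component values, and scores below are made up for illustration.

```python
from typing import List, Sequence


def select_by_mic(mic_scores: Sequence[float]) -> List[int]:
    """Return indices of components whose MIC is greater than the mean MIC."""
    mean_score = sum(mic_scores) / len(mic_scores)
    return [i for i, score in enumerate(mic_scores) if score > mean_score]


def reconstruct(imfs: Sequence[Sequence[float]], selected: Sequence[int]) -> List[float]:
    """Sum the selected IMF components point by point."""
    length = len(imfs[0])
    return [sum(imfs[i][t] for i in selected) for t in range(length)]


# Made-up example: a trend-like component, a small fluctuation, and noise.
imfs = [
    [1.00, 0.98, 0.96],    # trend-like component
    [0.01, 0.02, -0.01],   # regeneration-like fluctuation
    [0.001, -0.002, 0.001] # noise-like component
]
mic_scores = [0.9, 0.6, 0.1]  # assumed, already normalized

selected = select_by_mic(mic_scores)
capacity_reconstructed = reconstruct(imfs, selected)
```

Here the third component falls below the average score and is discarded, so the reconstructed sequence is the sum of the first two components.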


Figure 9. Reconstructed capacity degradation curve.(a) Source domain Li-ion battery data.(b) Target domain Li-ion battery data.

4.3. ITL-Mogrifier LSTM
Li-ion batteries No. CS2_35, CS2_36, CS2_37, and CS2_38 are used as source domain data, and batteries No. B5 and B6 are used as target domain data. The reconstructed source domain battery capacity degradation data are all input as training sets into the Mogrifier LSTM for iterative training to obtain the pre-training model, and the pre-training model is transferred to the target domain. To verify the model prediction effect, the reconstructed target domain capacity sequences under 100-cycle, 80-cycle, and 60-cycle operating conditions are selected as training sets, and the results are compared with the Mogrifier LSTM after SSA-VMD and with the LSTM after SSA-VMD. Figures 10-12 show the RULs of the target domain battery dataset under different starting points (ST).
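The control flow of this procedure (iterative pre-training on the source batteries, fine-tuning on the first cycles of a target battery, then a five-step sliding-window forecast) can be sketched as follows. This is only an illustration under strong assumptions: a small linear autoregressive model trained by stochastic gradient descent stands in for the Mogrifier LSTM, the capacity series are synthetic and normalized, and all function names and hyperparameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]


def make_windows(series: Sequence[float], window: int) -> List[Sample]:
    """Turn a capacity series into (input window, next value) pairs."""
    return [(list(series[i:i + window]), series[i + window])
            for i in range(len(series) - window)]


def predict(weights: Sequence[float], x: Sequence[float]) -> float:
    """Linear stand-in model: weighted sum of the window."""
    return sum(w * v for w, v in zip(weights, x))


def train(weights: Sequence[float], samples: Sequence[Sample],
          lr: float = 0.05, epochs: int = 50) -> List[float]:
    """Per-sample gradient descent on the squared one-step error."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in samples:
            err = predict(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def iterative_pretrain(source_series: Sequence[Sequence[float]],
                       window: int, rounds: int = 3) -> List[float]:
    """Pass over every source battery in turn, for several rounds."""
    w = [1.0 / window] * window  # start from a moving average
    for _ in range(rounds):
        for series in source_series:
            w = train(w, make_windows(series, window))
    return w


def forecast(weights: Sequence[float], history: Sequence[float],
             steps: int = 5) -> List[float]:
    """Recursive multi-step forecast: each prediction is fed back as input."""
    buf = list(history[-len(weights):])
    out = []
    for _ in range(steps):
        nxt = predict(weights, buf)
        out.append(nxt)
        buf = buf[1:] + [nxt]
    return out


# Synthetic, normalized capacity series (not real battery data).
source = [[1.0 - 0.004 * k for k in range(60)],
          [1.0 - 0.005 * k for k in range(50)]]
target = [1.0 - 0.006 * k for k in range(40)]
start = 25  # hypothetical starting point (ST) in cycles

pretrained = iterative_pretrain(source, window=3)
fine_tuned = train(pretrained, make_windows(target[:start], 3), epochs=20)
next_five = forecast(fine_tuned, target[:start], steps=5)
```

Repeating the forecast until the predicted capacity crosses the failure threshold would give the predicted RUL for the chosen starting point; changing `start` mimics the 100-cycle, 80-cycle, and 60-cycle conditions.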

Table 2. SSA-VMD and EMD decomposition of IO values.

Table 3. IMF maximum information coefficient.

Table 4. B5 battery $RUL_{Error}$ and $PRUL_{Error}$ analysis.

Table 5. B6 battery $RUL_{Error}$ and $PRUL_{Error}$ analysis.

Table 16. $RUL_{Error}$ and $PRUL_{Error}$ analysis for the CS2_38 battery, with B6, CS2_35, CS2_36, and CS2_37 as source domains.