An Adaptive Noise Reduction Approach for Remaining Useful Life Prediction of Lithium-Ion Batteries

Abstract: Lithium-ion batteries are widely used in the electric vehicle industry due to their recyclability and long life. However, a failure of a lithium-ion battery can cause catastrophic accidents, such as explosions and fires in electric car batteries. To prevent such harm, it is essential to monitor the remaining useful life of lithium-ion batteries and give early warning. In this paper, an adaptive noise reduction approach is proposed to predict the RUL (Remaining Useful Life) of lithium-ion batteries. The approach uses CEEMDAN (Complete Ensemble Empirical Mode Decomposition with Adaptive Noise) combined with wavelet decomposition to achieve adaptive noise reduction decomposition, and then inputs the obtained IMF (Intrinsic Mode Function) components into LS-RVM (Least Squares Relevance Vector Machine) for training, prediction, and reconstruction, so as to achieve high-precision prediction of RUL. To verify the validity of the model, it is compared with other common models. The results demonstrate that the RMSE, MAPE, and MAE of the proposed model are 0.008678, 0.005002, and 0.006894, respectively, and that it has higher accuracy than the other common prediction models.


Introduction
Lithium-ion batteries have the advantages of high energy density and a low self-discharge rate. They are widely used in energy storage devices and play an important role in areas such as transportation electrification and smart grids [1]. However, the failure of lithium-ion batteries can cause catastrophic accidents, such as electric car batteries exploding and catching fire or aircraft parts failing [2,3]. With the increasing requirements for battery safety in related fields, it is of great significance to develop a method that can accurately predict the RUL of lithium-ion batteries and give early warning, so as to reduce unnecessary losses.
Current RUL prediction methods for lithium-ion batteries can be mainly divided into model-based methods and data-driven methods [4]. However, the degradation of lithium-ion batteries is a complex electrochemical process. In practical situations, the modeling process of model-based methods is complicated, and the theoretical models depend on expert knowledge, which makes them less robust [5][6][7]. Data-driven methods can simplify a system with unmeasured variables and effectively simulate the health degradation trend of lithium-ion batteries, which helps ensure the reliable and efficient operation of the battery management system and avoids potential dangers caused by battery failure [8].
Wang et al. [9] proposed using RVM to predict and model the battery decay trajectory through online learning, and many scholars subsequently combined RVM with other methods to predict the capacity decay trend [10][11][12][13]. MLP (Multilayer Perceptron) can capture the relationship between battery remaining life and other attributes [14]. LSTM (Long Short-Term Memory) can effectively describe the degradation characteristics of lithium-ion batteries [15], but it takes a lot of time to train [16]. SVMs (Support Vector Machines) have advantages on datasets with significant capacity decay [17,18]. Chen et al. proposed a transformer network for RUL prediction of lithium-ion batteries [19]. However, the above methods ignore the influence of capacity regeneration on lithium-ion batteries.
Some scholars try to use EMD (Empirical Mode Decomposition) combined with neural networks to predict the RUL of lithium-ion batteries [20]. Zhou et al. [21] proposed a method combining EMD and ARIMA (Autoregressive Integrated Moving Average model) to capture the local fluctuations in the degradation process of lithium-ion batteries. Chen et al. [22] screened the components obtained by the EMD algorithm to improve prediction accuracy. However, the EMD method has some disadvantages [23], such as serious mode aliasing and end effect, and it cannot obtain reliable RUL estimation and prediction. Since the decline curve of capacity regeneration is different from the normal decline curve, Xu et al. [24] added the decline model of capacity regeneration to reduce the prediction error caused by the difference between the decline curve of capacity regeneration and the normal decline curve but ignored the influence of random fluctuations. Wang et al. proposed a model combining VMD (Variational Mode Decomposition) [25], MLP, and LSTM to model the overall degradation trend and capacity regeneration component [26][27][28]. In addition, some scholars use stochastic processes to fit the battery degradation process [29][30][31]. Furthermore, there are also some scholars who try to introduce health factors that can be used to predict the RUL of the lithium-ion battery. For instance, Wang et al. [32] proposed an indirect health indicator extracted from the constant current charge process to aid prediction, and Lin et al. [33] extracted new indirect health indicators from the voltage and current curves during charging.
Although EMD can separate capacity regeneration trends from other random noise trends, it also suffers from modal aliasing [34]. To address this problem, Wu and Huang proposed EEMD (Ensemble Empirical Mode Decomposition) [35], which reduces the reconstruction error by increasing the number of ensemble trials. However, poor completeness and a large amount of calculation easily make EEMD decomposition inefficient. Thus, Torres et al. proposed CEEMDAN [36][37][38], which adds adaptive white noise at each stage of the decomposition, calculates a unique residual signal to obtain each IMF, and greatly improves the decomposition efficiency compared with the EEMD algorithm. Nevertheless, it is still difficult to avoid the interference of redundant components.
To address the problems in the above research, this paper proposes an adaptive noise reduction approach. It adopts CEEMDAN to decompose the original sequence into multiple IMFs (including the normal degradation trend, the capacity regeneration trend, and the random trend), and then applies the wavelet transform to each IMF component to reduce noise. The denoised sequences are then input into the LS and RVM models for training and prediction: LS for the normal degradation trend and RVM for the other trends. Finally, the predicted components are reconstructed to obtain the final prediction results, and RMSE, MAPE, and MAE are selected to evaluate the prediction performance of the model.
The main contributions of this paper are detailed as follows:
1. The computation speed and the completeness of CEEMDAN are improved compared with EMD.
2. EMD may produce several low-frequency IMF components with small amplitudes, which have little significance for the prediction results. Adopting the CEEMDAN method reduces the number of these components.
3. To achieve high-precision prediction, the wavelet transform is used to reduce noise and improve the signal resolution of the IMF components of CEEMDAN, which contain some noise.
4. Compared with other popular models, the proposed model also performs well in early prediction. In the experiments, a simple combination of models is more accurate than more complex ones.
The rest of this paper is organized as follows. Section 2 presents the theoretical background of the proposed method. Section 3 describes the experimental analysis. A case study and comparative analysis are provided and discussed in Section 4. The conclusion is given in Section 5.

Theoretical Background
This section describes the basic principles and specific steps of CEEMDAN, Wavelet Domain Denoising, LS, and RVM, as well as where these methods are applicable.

CEEMDAN
To address the mode aliasing in signals decomposed by the EMD algorithm, the EEMD algorithm adds random white noise to the signal to be decomposed. In essence, white noise is a series of independent sequences added to the original data. Its characteristics amplify the degree of irrelevance between modes that are difficult to separate in the original data, so that two otherwise inseparable modes can be extracted. However, a certain amount of white noise always remains in the modal components obtained in this way, and EMD may generate a different number of modes for each noisy copy of the signal, which affects the subsequent signal analysis and processing. CEEMDAN addresses these problems in two ways: (1) instead of raw white noise, the IMF components of the white noise obtained by EMD are added at each iteration; (2) whereas EEMD averages the modal components only after each noisy copy has been fully decomposed, CEEMDAN averages immediately after the first-order IMF components are obtained to produce the final first-order IMF, and then repeats this operation on the residual. In this way, the transfer of white noise from high frequency to low frequency is effectively suppressed.
In general, the CEEMDAN algorithm adds a finite number of adaptive white noise realizations at each stage of EMD. Let $\mathrm{IMF}_k$ denote the k-th modal component obtained by CEEMDAN, $E_j(\cdot)$ the operator that returns the j-th modal component obtained by EMD, $w^i$ a Gaussian white noise signal following the standard normal distribution, $\varepsilon_k$ the SNR (Signal-To-Noise Ratio) coefficient at stage k, $i = 1, 2, \ldots, I$ the index of the added white noise realizations, and $x[n]$ the signal to be decomposed. The steps of CEEMDAN are as follows:
Step 1. Add white Gaussian noise to $x[n]$ to make a new signal $x[n] + \varepsilon_0 w^i[n]$. Then, the first-order modal component $\mathrm{IMF}_1^i[n]$ is obtained by EMD:
$$\mathrm{IMF}_1^i[n] = E_1\left(x[n] + \varepsilon_0 w^i[n]\right)$$
Step 2. Average the I first-order modal components to obtain the first IMF of CEEMDAN:
$$\mathrm{IMF}_1[n] = \frac{1}{I}\sum_{i=1}^{I} \mathrm{IMF}_1^i[n]$$
Step 3. Calculate the residual after removing the first IMF:
$$r_1[n] = x[n] - \mathrm{IMF}_1[n]$$
Step 4. Add white noise to $r_1[n]$ to obtain a new signal $r_1[n] + \varepsilon_1 E_1(w^i[n])$. Then, the first-order component $\mathrm{IMF}_2^i[n]$ is obtained by EMD, and the second IMF of CEEMDAN is obtained as follows:
$$\mathrm{IMF}_2[n] = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_1[n] + \varepsilon_1 E_1(w^i[n])\right)$$
Step 5. Calculate the residual after removing the k-th IMF, for $k = 2, \ldots, K$:
$$r_k[n] = r_{k-1}[n] - \mathrm{IMF}_k[n]$$
Step 6. Add white noise to $r_k[n]$ to obtain a new signal $r_k[n] + \varepsilon_k E_k(w^i[n])$. Then, the first-order component $\mathrm{IMF}_{k+1}^i[n]$ is obtained by EMD, and the (k + 1)-th IMF of CEEMDAN is obtained as follows:
$$\mathrm{IMF}_{k+1}[n] = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_k[n] + \varepsilon_k E_k(w^i[n])\right)$$
Step 7. Repeat Step 5 and Step 6 until the residual meets the termination condition of the EMD iteration. The final residual is as follows:
$$R[n] = x[n] - \sum_{k=1}^{K} \mathrm{IMF}_k[n]$$
The final number of IMFs is determined only by the data and the stopping criteria. The flow chart of CEEMDAN is shown in Figure 1.
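The ensemble-averaging structure of Steps 1 to 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: a real CEEMDAN obtains each mode by EMD sifting, whereas the hypothetical `first_mode` below substitutes a crude moving-average detrender so that the example stays self-contained. The function names, the fixed number of modes, and the constant noise coefficient `eps` are likewise assumptions.

```python
import numpy as np


def first_mode(x, width=5):
    """Hypothetical stand-in for E_1: the signal minus its moving average.

    A real implementation would extract the first IMF by EMD sifting.
    """
    pad = width // 2
    padded = np.pad(x, pad, mode="edge")
    smooth = np.convolve(padded, np.ones(width) / width, mode="valid")
    return x - smooth


def kth_mode(x, k, width=5):
    """Stand-in for E_k: peel off k - 1 modes, then return the next one."""
    r = np.asarray(x, dtype=float)
    for _ in range(k - 1):
        r = r - first_mode(r, width)
    return first_mode(r, width)


def ceemdan_sketch(x, n_modes=3, trials=100, eps=0.0005, seed=0):
    """Ensemble loop of Steps 1-7 with a constant noise coefficient (assumption)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((trials, x.size))  # w^i, i = 1..I
    imfs, residual = [], x.copy()
    for k in range(n_modes):
        if k == 0:
            # Steps 1-2: average the first modes of x + eps * w^i.
            modes = [first_mode(x + eps * w) for w in noise]
        else:
            # Steps 4 and 6: perturb the residual with the k-th mode of the noise.
            modes = [first_mode(residual + eps * kth_mode(w, k)) for w in noise]
        imf = np.mean(modes, axis=0)
        imfs.append(imf)
        residual = residual - imf  # Steps 3 and 5
    return np.array(imfs), residual


# Synthetic capacity-like signal, for illustration only.
cycles = np.arange(160)
capacity = 1.85 - 0.003 * cycles + 0.01 * np.sin(cycles / 3.0)
imfs, residual = ceemdan_sketch(capacity, trials=20)
```

Because each residual is obtained by subtracting the averaged mode, the modes and the final residual sum back to the input signal, which is the completeness property that distinguishes CEEMDAN from EEMD.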

Wavelet Domain Denoising
The essence of wavelet denoising is the process of suppressing the useless part and enhancing the useful part. The denoising process is divided into three parts: wavelet decomposition, threshold filtering, and wavelet reconstruction [39].
(1) The wavelet decomposition is to select a kind of wavelet to decompose the signal with N-layer wavelets.
(2) The threshold filtering is to obtain the estimated wavelet coefficients by thresholding the decomposed coefficients of each layer.
(3) The wavelet reconstruction is to obtain the denoised signal by wavelet reconstruction according to the denoised wavelet coefficients.
The process of wavelet decomposition and noise reduction is shown in Figure 2.
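The three stages above can be sketched as follows. This is an illustrative example under assumptions that are not taken from this paper: it uses the Haar wavelet (the simplest member of the Daubechies family), soft thresholding, and the universal threshold with a median-based noise estimate. The function names are hypothetical, and the signal length must be divisible by 2 raised to the number of levels.

```python
import numpy as np


def haar_decompose(x, levels):
    """Stage 1: return (approximation, [details, coarsest first])."""
    approx = np.asarray(x, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return approx, details[::-1]


def haar_reconstruct(approx, details):
    """Stage 3: invert haar_decompose, starting from the coarsest level."""
    for d in details:
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2)
        out[1::2] = (approx - d) / np.sqrt(2)
        approx = out
    return approx


def soft_threshold(coeffs, thr):
    """Shrink coefficients toward zero by thr; smaller ones become zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)


def wavelet_denoise(x, levels=2):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    approx, details = haar_decompose(x, levels)
    # Noise level estimated from the finest details (assumed estimator).
    sigma = np.median(np.abs(details[-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))
    details = [soft_threshold(d, thr) for d in details]  # Stage 2
    return haar_reconstruct(approx, details)
```

With a zero threshold, the reconstruction returns the input unchanged, so all of the smoothing comes from the thresholding stage.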

LS
CEEMDAN can separate the main trend degradation categories from the original signal, and the least square regression method can be used to fit the trend, which can effectively predict the normal decline trend of capacity.
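Suppose the multivariate linear equation has the following form, where $X$ is the matrix of inputs, $y$ is the vector of observed values, and $w$ is the weight vector:
$$y = Xw$$
The loss function with respect to $\hat{w}$ is constructed as:
$$L(\hat{w}) = \lVert y - X\hat{w} \rVert^2 = (y - X\hat{w})^{\mathsf{T}}(y - X\hat{w})$$
The derivative of the loss function is as follows:
$$\frac{\partial L}{\partial \hat{w}} = 2X^{\mathsf{T}}(X\hat{w} - y)$$
Setting the derivative to zero, the weights can be solved as:
$$\hat{w} = (X^{\mathsf{T}}X)^{-1}X^{\mathsf{T}}y$$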
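As a small worked example of the closed-form least squares solution, the sketch below fits a straight line to a synthetic main-trend component and extrapolates it. The choice of regressors (a bias column plus the cycle index), the function names, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def fit_least_squares(X, y):
    """Solve the normal equations (X^T X) w = X^T y for the weights."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.linalg.solve(X.T @ X, X.T @ y)


def trend_design(cycles):
    """Hypothetical design matrix: a bias column plus the cycle index."""
    cycles = np.asarray(cycles, dtype=float)
    return np.column_stack([np.ones_like(cycles), cycles])


# Synthetic main-trend component: linear fade plus small noise.
rng = np.random.default_rng(0)
cycles = np.arange(100)
trend = 1.85 - 0.004 * cycles + 0.002 * rng.standard_normal(100)

# Fit on the first 60% of the cycles and extrapolate to the rest.
w_hat = fit_least_squares(trend_design(cycles[:60]), trend[:60])
forecast = trend_design(cycles[60:]) @ w_hat
```

Solving the linear system directly avoids forming the explicit inverse; for ill-conditioned inputs, `np.linalg.lstsq` is usually the more robust choice.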

RVM
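RVM, which has the same functional form as the support vector machine [14], is a Bayesian sparse kernel algorithm for regression and classification. The expression of the RVM regression prediction model can be written as follows:
$$t_n = \sum_{i=1}^{N_s} w_i K(x_n, x_i) + w_0 + \varepsilon_n$$
where $\{x_n\}_{n=1}^{N_s}$ is the input set, $\{t_n\}_{n=1}^{N_s}$ is the output set, and $K(x, x_i)$ is the kernel function. Linear kernel functions are used in this paper:
$$K(x, x_i) = x^{\mathsf{T}} x_i$$
$\varepsilon_n$ is the noise that follows the $N(0, \sigma^2)$ distribution, $w_i$ is the weight, and $N_s$ is the number of samples.
Assuming that the $\{t_n\}_{n=1}^{N_s}$ are independent random variables, the likelihood function of the sample can be expressed as:
$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N_s/2} \exp\left(-\frac{\lVert t - \Phi w \rVert^2}{2\sigma^2}\right)$$
where $\Phi$ is the $N_s \times (N_s + 1)$ design matrix whose n-th row is $[1, K(x_n, x_1), \ldots, K(x_n, x_{N_s})]$. To avoid overfitting when estimating $w$ and $\sigma^2$, a prior is imposed on the weights: $w_i$ is assumed to follow a Gaussian conditional probability distribution with mean 0 and variance $\beta_i^{-1}$:
$$p(w \mid \beta) = \prod_{i=0}^{N_s} N(w_i \mid 0, \beta_i^{-1})$$
$\beta$ is the $(N_s + 1)$-dimensional hyperparameter that determines the prior distribution of the weight $w$.
Given the likelihood function and the prior distribution of $w$, the posterior distribution of $w$ can be obtained according to the Bayesian formula:
$$p(w \mid t, \beta, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \beta)}{p(t \mid \beta, \sigma^2)} = N(w \mid \mu, \Sigma)$$
where
$$\Sigma = (\sigma^{-2}\Phi^{\mathsf{T}}\Phi + B)^{-1}, \quad \mu = \sigma^{-2}\Sigma\Phi^{\mathsf{T}} t, \quad B = \mathrm{diag}(\beta_0, \ldots, \beta_{N_s})$$
By maximizing the marginal likelihood function with respect to the hyperparameter $\beta$, the iterative formulas of the hyperparameter $\beta$ and the variance $\sigma^2$ can be obtained as follows:
$$\beta_i^{\mathrm{new}} = \frac{\gamma_i}{\mu_i^2}, \quad \gamma_i = 1 - \beta_i \Sigma_{ii}, \quad (\sigma^2)^{\mathrm{new}} = \frac{\lVert t - \Phi\mu \rVert^2}{N_s - \sum_i \gamma_i}$$
where $\Sigma_{ii}$ is the i-th diagonal element of the covariance matrix of the posterior weights, and $\mu_i$ is the i-th posterior mean weight. The weights are calculated iteratively according to the hyperparameter $\beta$ and the variance $\sigma^2$, and then the prediction is made.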
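The iterative estimation of the hyperparameter β and the variance σ² can be sketched as follows. This is a minimal illustration of the standard RVM re-estimation loop rather than the exact procedure used in this paper. `Phi` is assumed to be a design matrix whose columns hold a bias term and the kernel evaluations; the initial values, the clipping bounds, and the fixed iteration count are assumptions added for numerical safety.

```python
import numpy as np


def rvm_fit(Phi, t, n_iter=100):
    """Iterative re-estimation of beta and sigma^2 for RVM regression."""
    Phi = np.asarray(Phi, dtype=float)
    t = np.asarray(t, dtype=float)
    n_samples, n_basis = Phi.shape
    beta = np.ones(n_basis)            # prior precision of each weight
    sigma2 = 0.1 * np.var(t) + 1e-8    # initial noise variance (heuristic)
    for _ in range(n_iter):
        # Posterior covariance and mean of the weights.
        Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(beta))
        mu = Sigma @ Phi.T @ t / sigma2
        # gamma_i = 1 - beta_i * Sigma_ii: how well the data determine w_i.
        gamma = np.clip(1.0 - beta * np.diag(Sigma), 0.0, 1.0)
        beta = np.clip(gamma / (mu ** 2 + 1e-12), 1e-6, 1e9)
        resid = t - Phi @ mu
        sigma2 = resid @ resid / max(n_samples - gamma.sum(), 1e-6) + 1e-12
    return mu, Sigma, sigma2


def rvm_predict(Phi_new, mu, Sigma, sigma2):
    """Predictive mean and variance for new design-matrix rows."""
    Phi_new = np.asarray(Phi_new, dtype=float)
    mean = Phi_new @ mu
    var = sigma2 + np.einsum("ij,jk,ik->i", Phi_new, Sigma, Phi_new)
    return mean, var
```

Weights whose precision β grows toward the upper bound are effectively pruned, which is the source of the sparsity of the model. The predictive variance returned by `rvm_predict` gives an uncertainty estimate alongside each point prediction.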

Experimental Analysis
This section first introduces the basic definitions of RUL and SOH (State of Health), and the relationship between them, and then describes the process of the whole combination algorithm. Finally, three evaluation indexes RMSE, MAPE, and MAE are selected to evaluate the effect of the proposed method.

RUL
SOH is an important indicator of the degree of battery degradation, which is usually defined by the capacity ratio [40]:
$$\mathrm{SOH}(j) = \frac{\hat{c}(j)}{\hat{c}(1)}$$
where $\hat{c}(1)$ is the initial capacity of the battery, and $\hat{c}(j)$ is the estimated capacity of the cell in the j-th cycle. Because the batteries selected for the data set in this paper have the same initial capacity, the capacity data of the battery are used directly to represent the SOH of the battery.
In general, end of battery life is defined as the point when remaining capacity reaches 70-80% of initial capacity, in which SOH is equal to 0.7-0.8. RUL is the number of charge and discharge cycles left before the battery life reaches the failure threshold. According to the relationship between SOH and RUL, this paper adopts the prediction of capacity series to indirectly express the prediction results of RUL.
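The relationship between capacity, SOH, and RUL can be illustrated with a short sketch. The function names and the toy capacity values are hypothetical. Taking the first cycle at which SOH reaches the threshold as the end of life is also an assumption, since capacity regeneration can temporarily push SOH back above the threshold.

```python
def state_of_health(capacity):
    """SOH(j) = c(j) / c(1) for a list of per-cycle capacities."""
    initial = capacity[0]
    return [c / initial for c in capacity]


def end_of_life_cycle(capacity, threshold=0.7):
    """1-based index of the first cycle whose SOH is at or below the threshold.

    Returns None if the threshold is never reached.
    """
    for j, soh in enumerate(state_of_health(capacity), start=1):
        if soh <= threshold:
            return j
    return None


def remaining_useful_life(capacity, current_cycle, threshold=0.7):
    """Number of cycles left from current_cycle until the end-of-life cycle."""
    eol = end_of_life_cycle(capacity, threshold)
    if eol is None:
        return None
    return max(eol - current_cycle, 0)


# Toy capacity series in Ah, for illustration only.
capacity = [2.0, 1.9, 1.8, 1.6, 1.5, 1.4, 1.3]
rul_at_cycle_2 = remaining_useful_life(capacity, current_cycle=2, threshold=0.72)
```

In practice, the capacity values after the current cycle are not observed, so the predicted capacity series takes their place when locating the end-of-life cycle.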

Experimental Design
To further illustrate the effectiveness of the CEEMDAN-wavelet combination adopted in this paper, it is compared with adding noise and applying wavelet decomposition directly. The training set ratio is 0.6. Figure 3 shows that the prediction results obtained by adding noise and wavelet filtering are poor, and the predicted trajectory is not smooth under the influence of noise. Therefore, an adaptive noise reduction method is proposed in this paper, which adopts CEEMDAN and then uses the wavelet denoising method to extract more effective information from the original sequence.
The flow of the method proposed in this paper is as follows:
Step 1. Decompose the capacity signal into a series of IMFs ranging from high frequency to low frequency, and then apply wavelet analysis to each IMF component for further noise reduction.
Step 2. The decomposition in Step 1 leads to three parts: the main trend degradation part, the capacity regeneration part, and the random interference and noise part. Then, the training sets and test sets are constructed from these three parts, respectively. The training sets of the main trend degradation part are trained by LS, and the training sets of the capacity regeneration part and the random interference and noise part are trained by RVM.
Step 3. Input the test set data of each part into the trained model according to Step 2 to obtain the predicted value, and then reconstruct the predicted value of capacity.
The overall framework is shown in Figure 4.
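The data flow of Steps 2 and 3 can be sketched as follows, assuming that the final capacity forecast is reconstructed as the sum of the component forecasts, as is usual for EMD-type decompositions. The forecasters below are hypothetical placeholders: a straight-line fit stands in for the LS model, and a constant mean stands in for the RVM models.

```python
import numpy as np


def ls_trend_forecaster(train, horizon):
    """LS stand-in: fit a straight line to the training segment and extrapolate."""
    t = np.arange(train.size)
    slope, intercept = np.polyfit(t, train, deg=1)
    future = np.arange(train.size, train.size + horizon)
    return intercept + slope * future


def mean_forecaster(train, horizon):
    """Placeholder for the RVM models: repeat the training mean."""
    return np.full(horizon, train.mean())


def forecast_capacity(components, forecasters, train_ratio=0.6):
    """Forecast each component on the test span and sum the results."""
    components = np.asarray(components, dtype=float)
    n_train = int(round(train_ratio * components.shape[1]))
    horizon = components.shape[1] - n_train
    # Step 2: one model per component, trained on the leading segment.
    parts = [f(c[:n_train], horizon) for c, f in zip(components, forecasters)]
    # Step 3: reconstruct the capacity forecast by summation.
    return np.sum(parts, axis=0)
```

Each forecaster receives only the leading training segment, so no information from the test span leaks into the fitted models.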

Evaluation Indicators
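RMSE, MAPE, and MAE are used as evaluation indicators for the performance estimation of the prediction model in this paper. The three evaluation indicators are defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2}$$
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
where $\hat{y}_i$ is the predicted value, $y_i$ is the real value, and n is the length of the sample.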
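A minimal sketch of the three indicators is given below. The function names are illustrative, and MAPE is returned as a fraction rather than a percentage, which is an assumption about the convention.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; y_true must be nonzero."""
    n = len(y_true)
    return sum(abs((p - y) / y) for y, p in zip(y_true, y_pred)) / n


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(p - y) for y, p in zip(y_true, y_pred)) / n


# Toy values: absolute errors of 0.1, 0.2, and 0.4, each 10% of the true value.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.1, 1.8, 4.4]
scores = (rmse(y_true, y_pred), mape(y_true, y_pred), mae(y_true, y_pred))
```

RMSE penalizes large deviations more heavily than MAE, while MAPE expresses the error relative to the true capacity, which makes results on batteries with different capacity ranges easier to compare.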

Experiment and Discussion
This section introduces the data set used in the experiment and describes how its data are processed to obtain the training set and prediction set. CEEMDAN combined with the wavelet denoising algorithm is then applied to obtain the decomposed and denoised data, and LS-RVM is used to fit the data. Finally, the obtained results are analyzed.

Experimental Dataset
The test data in this paper are from the 18650 lithium-ion battery dataset provided by the NASA PCoE Research Center [41], which includes four batteries (B0005, B0006, B0007, and B0018). Each battery is charged with a constant current of 1.5 A at about 24 °C until the voltage rises to the maximum cut-off voltage of 4.2 V. Charging then switches to constant voltage and stops when the current drops to 20 mA. The battery capacity attenuation trend contained in the dataset is shown in Figure 5.

CEEMDAN Combined with Wavelet Denoising Algorithm
The number of added noise realizations is 100, and the SNR is 0.0005. The Daubechies function is chosen as the wavelet basis function, and the sliding window step is 4.
The B0005 decomposition results are shown in Figure 6. It can be seen that IMF1 represents the main trend degradation, IMF2 represents the capacity regeneration trend, and IMF3 and IMF4 represent the random interference and noise trend. This grouping provides a reference basis for reducing the complexity of the prediction model by CEEMDAN. The IMF components decomposed by CEEMDAN are divided into data sets, and a sliding window model is established to construct the initial training samples. The values in the window ending at time step t are then taken as the input to predict the value at time step t + 1.
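The construction of training samples with a sliding window can be sketched as follows. The sketch assumes that the stated window setting of 4 refers to the number of past values used as input, which is an interpretation; the function name is illustrative.

```python
def make_windows(series, width=4):
    """Build (inputs, targets) pairs for one-step-ahead prediction.

    Each input holds `width` consecutive values, and the target is the value
    that immediately follows them.
    """
    inputs, targets = [], []
    for start in range(len(series) - width):
        inputs.append(list(series[start:start + width]))
        targets.append(series[start + width])
    return inputs, targets


# Toy component values, for illustration only.
component = [0, 1, 2, 3, 4, 5]
X, y = make_windows(component, width=4)
```

Each IMF component is windowed separately, so every model only sees the history of the component it is trained to predict.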

The LS-RVM Fitting
The modal components obtained by CEEMDAN are divided into three parts: the main trend degradation part, the capacity regeneration part, and the random interference and noise part. The training set of the main trend degradation part is trained by the LS model, and the training sets of the capacity regeneration part and the random interference and noise part are trained by the RVM model. The test sets are then input into the trained models for prediction. Taking B0007 as an example, the prediction results of each part are shown in Figure 7. It can be seen from Figure 7a that the normal capacity degradation trend predicted by the LS model almost coincides with the real decay trend, and from Figure 7b-d that the capacity regeneration trend and the random interference and noise trend can be well predicted by the RVM model.

Analysis of Forecast Results
To further illustrate the importance of decomposition and noise reduction, the method used in this paper is compared with fitting the original sequence using only RVM. As can be seen from Figure 8, due to the noise and random interference in the original sequence, the prediction results of RVM lag behind those obtained by the proposed method.
The control groups are set to further compare the effectiveness of the CEEMDAN-WAVELET-LS-RVM combined method proposed in this paper.
Comparing the prediction trend charts of the various models in Figure 9, the EMD-ARIMA-RVM model shows a serious deviation, and the prediction fluctuations of the other models are also large. The CEEMDAN-WAVELET-LSTM-RVM prediction model has a relatively good fitting effect only on the data of the B0007 and B0018 batteries. By contrast, the proposed method fits all curves well.
To better illustrate the high accuracy of the prediction method in this paper, B0005 is taken as an example, where 60% of the data is selected as the training set and 40% as the test set. The evaluation results are shown in Table 1. According to Table 1, the prediction accuracy of the CEEMDAN-WAVELET denoising decomposition is higher than that of the EMD method. The RMSE, MAPE, and MAE obtained by the method adopted in this paper are all the smallest, and all are below 1%. To verify the effectiveness of the method, the ratio of the training and testing sets is adjusted, and the index error comparison chart of each model under different training ratios is obtained, as shown in Figure 10. It can be seen from Figure 10 that the LS-RVM combined model fits best among all the combined models, and the combined model proposed in this paper is superior to the other prediction models regardless of the training ratio.

Conclusions
Accurately predicting the RUL of lithium-ion batteries can improve the safety and reliability of energy storage systems. Aiming at the RUL problem, this paper proposes an adaptive noise reduction approach. The effectiveness of the proposed RUL prediction method is verified by using the battery charge-discharge cycle dataset published by NASA.
To achieve more accurate predictions, the original sequence is decomposed into three parts by CEEMDAN. Compared with EMD, CEEMDAN can adaptively determine the number of decomposition layers and obtain better decomposition results, and wavelet denoising is then adopted to obtain more accurate signal information. Considering the phenomenon of battery capacity regeneration and the interference of random noise, RVM is used to fit the capacity regeneration part and the random noise part, which reduces the complexity of the prediction model and allows the trend of battery capacity degradation to be tracked stably. The proposed model can better fit the capacity decay trend of lithium-ion batteries with obvious capacity regeneration and effectively improves the prediction accuracy of the remaining useful life of lithium-ion batteries. In addition, compared with other models, the proposed method performs better on the various datasets.
Nevertheless, due to the complexity of the actual operating environment of lithium-ion batteries, RUL prediction under multiple environmental conditions will be considered in the future.
Author Contributions: Formal analysis, W.Q. and G.C.; algorithm, W.Q. and T.Z.; experiment and simulation, W.Q. and T.Z.; validation, W.Q. and T.Z.; writing-original draft preparation, W.Q., T.Z. and G.C.; writing-review and editing, W.Q., T.Z. and G.C. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: Publicly available datasets were analyzed in this study. These data can be found here: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#battery (accessed on 1 June 2022).

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: