Lithium-Ion Battery State of Health Estimation with Multi-Feature Collaborative Analysis and Deep Learning Method

Abstract: The accurate estimation of the battery state of health (SOH) is crucial for the dependability and safety of battery management systems (BMS). The generality of existing SOH estimation methods is limited because they tend to rely on single-source features. Therefore, a novel method that integrates multi-feature collaborative analysis with deep learning is proposed in this research. First, several battery degradation features are obtained through differential thermal voltammetry (DTV) analysis, singular value decomposition (SVD), incremental capacity analysis (ICA), and terminal voltage characteristic (TVC) analysis. The features highly correlated with SOH are then selected as inputs for the deep learning model based on the results of a Pearson correlation analysis. SOH estimation is achieved with a deep learning framework built around a long short-term memory (LSTM) neural network (NN), which takes the multi-source features as input. The proposed method is validated using the NASA and Oxford battery degradation datasets. The results demonstrate that the presented model provides high SOH estimation accuracy and generality, with a maximum root mean square error (RMSE) of less than 1%. Deployed on a cloud computing platform, the proposed method can provide real-time prediction of battery health, with the potential to enhance full-lifespan battery management.


Introduction
Lithium-ion batteries (LIBs) are deployed in a multitude of applications, including portable electronics, electric vehicles (EVs), and energy storage devices [1][2][3]. Because the performance of LIBs declines nonlinearly, degradation poses a safety risk to the system and limits the usage of electrical devices [4][5][6][7]. It also restricts the continued use of LIBs as a secondary energy storage medium [8][9][10]. Therefore, the precise prediction of the state of health (SOH) is essential for the development of EVs and sustainable energy applications.
SOH is one of the metrics that reflects battery health in terms of available capacity. Studies have shown that battery degradation is a sophisticated physicochemical process [11][12][13]. The process is simultaneously influenced by internal and external elements of the battery [14]. Due to the complex operating conditions, the accumulation of internal side reaction products leads to shifts in electrochemical equilibrium, which makes the battery degradation process highly non-linear [15]. Due to the requirements of non-destructive detection, it is challenging to obtain SOH by direct measurement and experimental approaches. These methods need intricately constructed experiments and voluminous data [16,17]. Moreover, direct measurement and experimental approaches are frequently difficult to converge quickly and prone to signal distortions [18,19]. These issues render the above methods unsuitable for complicated, dynamic, real-time operations and practical applications.
Diverse methods have been investigated to provide accurate SOH estimation, including model-based methods and data-driven methods. Model-based methods can generally be categorized into the equivalent circuit model (ECM) [20][21][22] and the electrochemical model [23][24][25]. Electrochemical models are more appropriate than ECMs for estimating SOH. They simulate the degradation of a battery by incorporating mathematical descriptions of the underlying physical and chemical processes that occur within the battery during operation. Capacity loss is estimated by simulating the loss of recyclable lithium ions and the loss of functional electrode material throughout cycles. Jin et al. [26] derived a reduced-order degradation model for electrode materials based on an effective physical loss mechanism and obtained the battery capacity. Li et al. [27] estimated the battery SOH by calculating the loss of active lithium ions across aging cycles and incorporating SEI growth corrections into a single particle model. Model-based methods have the advantage of high interpretability. However, their significant computational cost limits their adoption on mobile devices with limited real-time computing power. In addition, the lack of detail caused by model simplification leads to a distinct disparity between the estimated and true values. Meanwhile, increasing model complexity to improve the estimation performance yields discernibly diminishing returns. Therefore, deploying the model on the cloud, where computing resources are sufficient, and obtaining the SOH on the mobile side from actual measurement data and filtering methods has garnered interest [28,29]. Although these approaches achieve a balance between computing expense and estimation accuracy, their efficacy is still constrained by the electrochemical model. As a result, electrochemical models remain ill-suited to complex and dynamic working conditions.
Recent developments in the machine learning field have provided a promising solution to SOH estimation. Data-driven methods are adaptable to dynamic operating conditions since they are not restricted by intricate physical specifics. Many studies have utilized machine learning approaches to achieve SOH estimation based on degradation features. Richardson et al. [30] designed a Gaussian process regression method to calculate SOH. Hu et al. [31] utilized k-nearest neighbor regression to capture capacity dependence based on five features. As a crucial component of data-driven methodologies, deep learning techniques are also being used to estimate SOH. Khumprom et al. [32] used a deep neural network (DNN) approach to predict the SOH and the remaining useful life (RUL) of LIBs, demonstrating the strength of deep learning methods. Li et al. [33] proposed a variant long short-term memory (LSTM) neural network (NN) to enhance the model performance of battery SOH prediction. Hong et al. [34] applied a battery degradation model using LSTM NN for SOH estimation on actual EVs. The above data-driven methods can estimate SOH accurately. However, the use of input data that lacks physical relevance can impede the interpretability of these models and restrict their applicability in certain scenarios. As a result, it becomes challenging to comprehend the underlying physical processes that lead to the degradation of the battery.
The effectiveness of data-driven methods is highly dependent on the supplied data. Obtaining features highly related to the degradation process is decisive for the effectiveness of a data-driven model. Incremental capacity analysis (ICA) is one of the critical degradation feature analysis methods. Lin et al. [35] proposed a method for predicting the SOH of a battery by dividing the IC curve into voltage segments and using a back-propagation neural network for each segment to achieve high-precision online SOH estimation. In addition, Zhang et al. [36] and Li et al. [37] also employed IC analysis for SOH estimation. All these works achieved excellent estimation results. However, IC analysis only contains information on the current and voltage, not temperature. It can only partially capture the relationship between the external signal and internal activities, since the internal phase change of the battery is expressed as heat [38]. Therefore, mining temperature information to connect the microscopic degradation process with the signal change has become one of the research focuses. Battery degradation can be characterized microscopically through the type and intensity of internal phase change. Macroscopically, external signal characteristics can be used to measure the phase change. The differential thermal voltammetry (DTV) method was proposed to connect the internal phase change and the external signal characteristics [39][40][41]. By incorporating the temperature and voltage characteristics through the DTV method, non-destructive tracking of battery degradation can be realized.
However, data-driven algorithms with single-source information are susceptible to local optima and noise sensitivity, resulting in an accuracy reduction. To eliminate the limitations of single-source information, Dai et al. [42] extracted multiple degradation features based on IC and voltage characteristic analysis. They adopted a neural network with a prior knowledge-based optimization strategy to realize SOH prediction. Patil et al. [43] applied a multi-level support vector machine to forecast the battery SOH after obtaining numerous derived features from voltage and temperature profiles. Wang et al. [44] employed DTV analysis and Gaussian Process Regression (GPR), which effectively extracted features that had a strong correlation with the SOH of LiCoO2 batteries by combining the temperature and voltage information of the batteries. The above works achieved excellent accuracy in SOH estimation. The integration of all relevant data sources, including temperature, current, and voltage, is crucial in the analysis of battery degradation. A comprehensive examination of these factors can facilitate a better understanding of the degradation mechanism. In addition, deep learning methods are highly robust to noise and well suited to practical applications, owing to their ability to learn complex and non-linear relationships between inputs and outputs through multiple layers of abstraction. Validation on different types of battery material is also important, since it allows the generality of a method to be assessed.
In this research, an SOH estimation method based on multi-feature collaborative analysis and deep learning is proposed. First, several degradation features are gathered using DTV, SVD, ICA, and TVC analysis. After that, the input features of the deep learning model are selected using Pearson correlation analysis. To enhance the performance of the LSTM-based deep learning model, the dropout and RMSprop techniques are employed. Experiments are conducted on the NASA and Oxford battery degradation datasets. The experimental results demonstrate the generality of the presented model, showing that the proposed method achieves accurate estimation in each experimental case. This work offers a method that combines multi-feature collaborative analysis and deep learning for the estimation of SOH. The utilization of multi-feature collaborative analysis facilitates the full consideration of various data sources pertaining to LIBs, thereby yielding a model with generality. Furthermore, as cloud computing platforms develop, the suggested method may contribute to real-time online SOH estimation on various systems. Under the architecture of the Cyber Hierarchy and Interactional Network (CHAIN), this work contributes to implementing functions (e.g., real-time SOX estimation, safety management, and early defect notification) for the cloud battery management system (BMS) [45][46][47].
The main contributions of this paper are as follows:
(1) A multi-feature collaborative analysis method is used for in-depth mining of voltage, current, and temperature data.
(2) The characteristic signal analysis approach is incorporated to enhance the quality of model input data while simultaneously lowering data size.
(3) Based on the DTV and ICA methods, the connection between microscopic phase change and the macroscopic signal is developed, enabling the model input to be physically interpretable.
(4) The use of the SVD method is proposed to reduce the dimensionality of data while reinforcing features that are crucial for degradation analysis, allowing for more efficient analysis while preserving the important information.

Multi-Feature Analysis
In this section, DTV, singular value decomposition (SVD), IC, and terminal voltage characteristic (TVC) are analyzed with the NASA and Oxford battery degradation datasets. Based on the analyses, multiple degradation features have been selected as alternative features for correlation analysis.

Battery Degradation Dataset
The experiments used four second-generation 18650-size lithium batteries (B5, B6, B7, and B18) with identical LiCoO2 composite materials, sourced from the NASA Ames Prognostics Center of Excellence [48]. The degradation of the batteries was evaluated under three operating conditions, including constant-current constant-voltage (CC-CV) charging and constant-current discharging, at 24 °C. The CC-CV charging process was the same for all four batteries, with the current set at 1.5 A until the voltage reached 4.2 V and charging ceasing once the current dropped below 20 mA during the constant-voltage period. The temperature was divided into three parts based on the current load. Subsequently, the batteries were discharged with a constant current of 2 A until the voltages reached the predetermined values of 2.7 V for B5, 2.5 V for B6, 2.2 V for B7, and 2.5 V for B18. The experimental particulars are provided in Table 1. Figure 1a shows the capacity degradation of these four batteries over cycles. It can be noted that the capacity does not drop monotonically as the number of cycles increases but exhibits discernible rebounds, which makes SOH estimation challenging.
The Oxford capacity degradation dataset was compiled by the University of Oxford [49,50]. This paper considers eight batteries with a rated capacity of 740 mAh, designated cells 1 through 8. The temperature of the experiments was maintained at 40 °C. Each cell was repeatedly charged under 2C CC settings and discharged under dynamic conditions to simulate driving situations. Every 100 cycles, a 1C CC charge/discharge cycle was used to measure the battery capacity. Table 2 displays detailed experimental information. The capacity trajectory of these eight batteries is shown in Figure 1b. Overall, the capacity in the Oxford dataset declines smoothly. However, the capacity curves of cell 2 and cell 5 show an unexplained plunge, which may be attributable to alterations in operating conditions. Hence, cells 1, 3, 4, 6, 7, and 8 are chosen for training and testing the model in this research.
Studies have demonstrated a strong correlation between battery deterioration and microscopic entropy change [40,51], and the variation in temperature distribution can have a significant impact on the degradation of the battery over time [38]. The primary factors contributing to the decline of the maximum charge storage capacity of LIBs as a function of temperature within a specific operating range have been identified as the formation and alteration of surface films on the electrodes, as well as structural and phase transformations of the electrode material [52]. As battery degradation progresses, the positive and negative electrodes may exist in various phase combinations. These phase transitions can lead to changes in the system's entropy, which can be identified by observing inflection points on the DTV curve. The DTV curve's peak position can indicate the peak potential during charge/discharge stages; shifts in peak position can indicate changes in impedance and stoichiometry; peak height can indicate the maximum heating rate; peak width can indicate the potential window of the phase combinations in the electrodes; and peak area can provide information on the heat generated during charge/discharge stages. The DTV can be calculated as:

$$\mathrm{DTV} = \frac{dT}{dV} = \frac{dT/dt}{dV/dt}$$

where $V$ represents the terminal voltage, $T$ is the temperature, and $t$ is the sampling time.
The utilization of the DTV method allows for the identification of the degree and type of phase change within the electrode material through the analysis of changes in temperature and voltage. This information is crucial for understanding the entropy change in the system, which dictates the degradation stage of the battery. However, in practical applications, the temperature and voltage data are vulnerable to external factors such as operating conditions, making the differential signals noise-sensitive. Therefore, filtering methods are necessary to obtain clear DTV signals for degradation feature analysis. In this paper, the Savitzky-Golay (SG) filtering method is applied to the DTV curves to maintain the shape of the curve while removing the noise. The SG filtering method is expressed as follows:

$$y(i) = \frac{1}{N} \sum_{j=-M}^{M} C_j \, x(i+j)$$

where $x(i)$ represents the original input signal, $y(i)$ is the output signal after filtering, $C_j$ is the coefficient in SG filtering, and $N$ is the number of convolution integers. The value of $N$ is equal to the size of the smoothing window, which is equal to $2M + 1$. Figure 2a compares the DTV curves before and after filtering. The results reveal the effectiveness of the SG filtering method: it smooths the curves while retaining the shapes of the features. The DTV differential curves before and after filtering are shown in Figure 2b. Clearly, the DTV differential curve is still smooth after filtering, further demonstrating the applicability of SG filtering to DTV analysis.
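The DTV calculation and SG smoothing described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the window half-width of 5, the polynomial order of 3, and the edge padding are all assumptions, and the weights returned by `sg_weights` already include the 1/N normalisation (they correspond to C_j / N).

```python
import numpy as np


def sg_weights(half_width, polyorder):
    """Savitzky-Golay smoothing weights for a window of 2*half_width + 1 points."""
    j = np.arange(-half_width, half_width + 1)
    # Least-squares polynomial fit over the window: the first row of the
    # pseudo-inverse yields the fitted value at the window centre (j = 0).
    design = np.vander(j, polyorder + 1, increasing=True)
    return np.linalg.pinv(design)[0]


def sg_smooth(x, half_width=5, polyorder=3):
    """Apply y(i) = sum_j w_j * x(i + j), padding the ends with edge values."""
    x = np.asarray(x, dtype=float)
    w = sg_weights(half_width, polyorder)
    padded = np.pad(x, half_width, mode="edge")
    return np.correlate(padded, w, mode="valid")


def dtv_curve(voltage, temperature, time, half_width=5, polyorder=3):
    """Return (voltage, smoothed DTV) with DTV = (dT/dt) / (dV/dt)."""
    voltage = np.asarray(voltage, dtype=float)
    temperature = np.asarray(temperature, dtype=float)
    time = np.asarray(time, dtype=float)
    dT_dt = np.gradient(temperature, time)
    dV_dt = np.gradient(voltage, time)
    keep = np.abs(dV_dt) > 1e-9  # assumed guard: skip flat-voltage samples
    dtv = dT_dt[keep] / dV_dt[keep]
    return voltage[keep], sg_smooth(dtv, half_width, polyorder)
```

Peak and valley features such as FV1 to FV6 could then be read from the returned curve, for example with `np.argmax` and `np.argmin` over a chosen voltage interval.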

The DTV curves change continuously during degradation, and features can be obtained. Figure 2c shows the evolution of DTV curves over the battery B5 cycles in the NASA dataset. Throughout the degradation stages of the battery, the DTV curves exhibit two peaks and one trough. In addition, the change in peaks and valleys shows a distinct directionality. The peak values fall gradually while the peak positions shift progressively to the left. Simultaneously, the value of the valley grows, and its position shifts to the left. The curve shape changes correspond to the actual battery degradation process. As illustrated in Figure 2d, the peaks and the valley are selected as alternative DTV analysis features.
The left peak value and position are denoted as FV1 and FV2, the valley value and position as FV3 and FV4, and the right peak value and position as FV5 and FV6, respectively. Figure 2e shows the evolution of the DTV curves of cell 1 of the Oxford dataset. The DTV curves of the Oxford dataset distinctly exhibit one peak and two valleys throughout the degradation stages of the battery, which is different from battery B5. With the deepening of battery degradation, the valley values increase, and the valley positions gradually move to both ends. Meanwhile, the peak values decrease while the positions gradually shift to the left. Therefore, as shown in Figure 2f, the peak and valleys of the DTV curve of the Oxford dataset are taken as the alternative features. It is noteworthy that the DTV curve shape of the Oxford dataset differs from that of the NASA dataset due to the substantial difference in experimental conditions. For the Oxford dataset, FV1 and FV2 represent the left valley value and location, FV3 and FV4 represent the peak value and position, and FV5 and FV6 represent the right valley value and position.

SVD Analysis
As a fundamental matrix decomposition technique, the SVD method condenses the information in complex matrices by using simpler matrices, achieving data compression while preserving the information's characteristics. The SVD approach is applicable in several situations, including data downscaling and feature extraction, due to its practical data compression effect. External measurement data can characterize the degree of battery degradation. However, large data quantities and minor changes in measurement data might hinder the performance of the deep learning model. Consequently, the SVD approach is applied to the numerical processing of the external measurement data of the battery in order to compress data while reinforcing the degradation features. Suppose that matrix $A$ is an $m \times n$ matrix of any size, and $AA^T$ and $A^TA$ are symmetric matrices of size $m \times m$ and $n \times n$, respectively. If $AA^T = P\Lambda_1 P^T$ and $A^TA = Q\Lambda_2 Q^T$, the SVD of matrix $A$ can be described as follows:

$$A = P \Sigma Q^T$$

where $P$ and $Q^T$ are square matrices of size $m \times m$ and $n \times n$, respectively, and $\Sigma$ is the diagonal matrix of size $m \times n$, which means that all matrix elements are 0 except those on the main diagonal. The elements on the main diagonal of the $\Sigma$ matrix are called singular values. Thus, the SVD compresses matrix $A$ of size $m \times n$ into singular values, which can be utilized as degradation features. In this study, SVD was applied to both voltage and temperature data for each discharge cycle, resulting in the extraction of two singular values for each cycle as feature vectors, thus compressing the voltage and temperature data. Applying SVD in this way reduces the amount of measured data and the input burden on the model. Extracting the singular values as feature vectors, labeled FV7 and FV8, reduces the dimensionality of the data and eliminates less important or redundant information, making it more efficient for the model to process.
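As an illustration of how two singular values per cycle might be obtained, the sketch below stacks the voltage and temperature samples of one discharge cycle into an n × 2 matrix and keeps its singular values. The stacking layout and the function name `svd_features` are assumptions; the text does not specify how the per-cycle matrix is arranged.

```python
import numpy as np


def svd_features(voltage, temperature):
    """Singular values of the (n x 2) matrix [voltage, temperature] for one cycle."""
    # Assumed layout: one column per signal, one row per sample.
    A = np.column_stack([voltage, temperature]).astype(float)
    # compute_uv=False returns only the singular values, in descending order.
    return np.linalg.svd(A, compute_uv=False)
```

Under this assumed layout the two returned values would play the role of FV7 and FV8; their squares equal the eigenvalues of $A^TA$, in line with the decomposition above.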

IC and TVC Analyses
As a commonly used feature analysis method to describe the battery degradation process, the IC curve can characterize the battery deterioration from the electrode level and has a high resolution for the battery charge and discharge platform area. The IC curve is defined as the incremental battery capacity over successive voltage increments, and its mathematical representation is as follows:

$$\mathrm{IC} = \frac{dQ}{dV} = \frac{I \, dt}{dV}$$

where $Q$, $I$, $V$, and $t$ are the discharge capacity, current, voltage, and time, respectively. In actual applications, the measured values of current and voltage are accompanied by noise, which can generate undesirable variations in the IC curve and therefore impact the extraction of features. This paper applies the SG filtering method to filter the IC curve. The IC curves change continuously during the battery degradation process, and the degradation features can be obtained. Figure 3a illustrates the evolution of IC curves during the cycles of battery B5 in the NASA dataset. A complete IC curve has a significant peak throughout the degradation process, and as the battery degrades, the peak value of the IC curve falls steadily and the peak shifts to the left. These phenomena indicate that the peak of the IC curve can be used as a feature to characterize battery degradation. Figure 3b shows the features within the IC curve of the battery in the NASA dataset. The peak value and position are extracted as alternative features for SOH estimation. The evolution of IC curves during the aging cycles of cell 1 in the Oxford dataset is illustrated in Figure 3c. The IC curves of the Oxford and NASA datasets both contain a single peak, and their shapes, including the peak's trend, are comparable. Figure 3d shows the IC feature variables of the battery. The feature variables are identical to those in the NASA dataset.
In this research, the peak value and position of the IC curves based on the NASA and Oxford datasets are denoted as FV9 and FV10, respectively.
In addition to the three feature analysis approaches described above, this work presents a straightforward way of obtaining battery degradation features through TVC analysis. As the battery degrades, the discharge time within the same terminal voltage range falls continually under the CC condition, which can be used to characterize the battery deterioration. The expression of TVC is shown as follows:

$$\mathrm{TVC} = t_{\mathrm{low}} - t_{\mathrm{high}}$$

where $V_{\mathrm{low}}$ is the lower cut-off voltage value and $V_{\mathrm{high}}$ the higher cut-off voltage value during the CC condition. $t_{\mathrm{low}}$ and $t_{\mathrm{high}}$ are the end and start times of the discharge process between $V_{\mathrm{low}}$ and $V_{\mathrm{high}}$, respectively. As shown in Figure 3e,f, the voltage curve shifts to the left during the degradation process, and the degradation features can be obtained. Figure 3e shows the evolution of the voltage-time curve throughout the degradation stages of the battery B5 in the NASA dataset. Changes in voltage during the early discharge process are relatively gradual. As the depth of discharge increases, the discharge curve reaches an inflection point, after which the battery enters a highly nonlinear zone. In this paper, the feature is utilized as an alternative feature, marked as FV11, based on both NASA and Oxford datasets. Figure 3f shows the evolution of the voltage-time curves of the whole discharge process during the degradation process of the cell 1 battery in the Oxford dataset. Different material types and loading conditions can have an impact on the behavior of the batteries over time. Due to the significant differences in the battery material types and operating conditions between the NASA dataset and the Oxford dataset, the voltage range selected for TVC features differs between the two datasets. The voltage range used for the NASA dataset is (3.5, 3.9), while for the Oxford dataset, it is (3.7, 4.1). Selecting a voltage range within which the battery voltage changes are relatively stable and smooth rather than experiencing rapid drops can facilitate better feature analysis and characterization of battery aging. This can lead to a more accurate and reliable assessment of the battery's performance and degradation over time.
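The IC peak features and the TVC discharge-time feature can be sketched for a single constant-current discharge as below. The function names are hypothetical, SG smoothing of the IC curve is omitted for brevity, and the voltage is assumed to fall monotonically so that it can be interpolated; none of these details are specified in the text.

```python
import numpy as np


def ic_peak(voltage, current, time):
    """Peak value and position of |dQ/dV| = |I / (dV/dt)| for a CC discharge."""
    voltage = np.asarray(voltage, dtype=float)
    current = np.asarray(current, dtype=float)
    time = np.asarray(time, dtype=float)
    dV_dt = np.gradient(voltage, time)
    keep = np.abs(dV_dt) > 1e-9  # assumed guard against flat-voltage samples
    ic = np.abs(current[keep] / dV_dt[keep])  # magnitude, in A*s/V
    k = int(np.argmax(ic))
    return ic[k], voltage[keep][k]


def tvc_time(voltage, time, v_low, v_high):
    """Discharge time between v_high and v_low on a falling voltage curve."""
    voltage = np.asarray(voltage, dtype=float)
    time = np.asarray(time, dtype=float)
    # np.interp needs increasing x values, so reverse the falling curve.
    t_high = np.interp(v_high, voltage[::-1], time[::-1])
    t_low = np.interp(v_low, voltage[::-1], time[::-1])
    return t_low - t_high
```

For the NASA cells, `tvc_time(voltage, time, 3.5, 3.9)` would correspond to the voltage range quoted above, and `tvc_time(voltage, time, 3.7, 4.1)` to the Oxford range.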

Methodology
In this section, the framework and process of the proposed deep learning model for SOH estimation are described in detail, including the feature selection and model construction. Figure 4 illustrates the proposed framework for battery SOH estimation based on multi-feature collaborative analysis and deep learning methods. The framework consists of five components: raw data acquisition, multi-feature analysis, feature selection, deep learning model construction, and assessment of the proposed model. In the first and second parts of the framework, data acquisition and degradation feature analysis are carried out, and four sets of alternative features are extracted using DTV, SVD, IC, and TVC, respectively. The procedure is discussed in detail in Section 2. In the third component of the framework, Pearson correlation analysis is used to select the alternative features with high correlation as the final features, which are the inputs of the LSTM NN model. The LSTM NN is constructed and trained in the fourth part, with the dropout technique applied to restrain overfitting and the RMSprop algorithm applied to achieve rapid convergence. In the fifth part, the predicted SOH is compared to the actual SOH. Root mean square error (RMSE) and mean absolute error (MAE) analyses are performed to evaluate the model performance.
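The two evaluation metrics named above follow their standard definitions. A minimal sketch, with hypothetical function names and SOH assumed to be expressed as a fraction of rated capacity, is:

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted SOH."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted SOH."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(err)))
```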

Feature Selection
Based on the multi-feature collaborative analysis in the previous section, eleven alternative features are obtained for both the NASA and Oxford datasets. In order to further screen out high-quality features as the inputs of the LSTM model, the correlation between the alternative features and SOH is calculated using Pearson correlation analysis. The calculation can be described as follows:

$$\rho_{xy} = \frac{\sum_{i=1}^{P} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{P} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{P} (y_i - \bar{y})^2}}$$

where $x_i$ and $\bar{x}$ represent the feature value and the mean value of the feature variable, respectively. $y_i$ and $\bar{y}$ represent the SOH value and the mean SOH value, respectively. $P$ represents the number of samples. $\rho_{xy}$ is the correlation coefficient between the specific feature and SOH. The value of $\rho_{xy}$ is between −1 and 1. In this study, feature selection was performed on the training set in order to identify features that exhibited a strong correlation with the SOH of the battery. Subsequently, the selected features were employed for predicting the SOH of the test set. The correlation coefficient is utilized as a metric to evaluate the strength of the association between the features and the SOH. A coefficient whose absolute value approaches 1 implies a strong correlation. The correlation values for the NASA dataset are shown in Table 3, while the correlation values for the Oxford dataset are shown in Table 4. As a result, the features FV2, FV4, and FV6 to FV11 are selected as the input features for the NASA dataset, while FV6 to FV11 are chosen for the Oxford dataset.
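The screening step can be sketched as follows. The function names and the 0.8 cut-off on the absolute correlation are assumptions for illustration only; the text does not state the threshold that was used.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between one feature and SOH."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))


def select_features(features, soh, threshold=0.8):
    """Keep features whose |rho| with SOH reaches the (assumed) threshold.

    features: dict mapping a feature name to its values over training cycles.
    """
    scores = {name: pearson(values, soh) for name, values in features.items()}
    chosen = [name for name, rho in scores.items() if abs(rho) >= threshold]
    return chosen, scores
```

Consistent with the text, the correlations would be computed on the training set only, and the chosen feature names would then be reused unchanged for the test set.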

Model Structure
In this paper, the LSTM NN model is trained as the core of the deep learning model for battery degradation prediction. As a variant of the recurrent neural network (RNN), the LSTM NN can effectively mitigate the vanishing and exploding gradient problems of the simple RNN and capture the long-term dependence between inputs and outputs. The LSTM unit accepts the inputs and unrolls them in time, learning from each time step of the input sequence. Each LSTM unit controls the input, forgetting, and output of information via its input gate, forget gate, and output gate, respectively. The forward propagation process of each LSTM unit at time step $t$ is as follows (written here in the standard LSTM form):
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$
$$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$h_t = o_t \odot \tanh(C_t)$$
where $i_t$, $f_t$, and $o_t$ are the outputs of the input gate, forget gate, and output gate, respectively; $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent activation functions; $W$ and $b$ are the weight matrix and bias vector, respectively, which are continuously learned and updated during training; and $x_t$ is the input to the LSTM unit at the current time step. In addition, $h_t$ and $h_{t-1}$ are the hidden outputs of the neuron at the current and previous time steps, respectively. $C_{t-1}$ is the memory of the neuron at the previous moment, $\tilde{C}_t$ is the newly generated memory of the neuron, and $C_t$ is the current memory state of the neuron. Compared to simple RNNs, whose memory of past inputs decays exponentially, the LSTM controls the neuron memory through the forget gate and input gate. The continuous updating of the internal parameters of the forget gate and input gate enables dynamic selection to recall correct information and forget redundant information, hence facilitating a more accurate correlation between input features and battery SOH. Since battery degradation is a long-term process, it is inevitable that redundant information will emerge.
Through the above structure, the LSTM NN improves the prediction performance of the deep learning model by forgetting redundant information and screening out the appropriate information. Consequently, the LSTM NN can successfully prevent gradient disappearance and gradient explosion, reflecting the long-term relationship between the input features and battery degradation.
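The gate mechanism described above can be made concrete with a short sketch of a single forward step. It follows the standard LSTM formulation and is not taken from the authors' code; the helper names, the layer sizes, and the constant toy weights are hypothetical and chosen only to keep the example small.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W @ v + b for a matrix W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM forward step; params maps a gate name to its (W, b) pair."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]      # input gate
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]      # forget gate
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]      # output gate
    c_hat = [math.tanh(a) for a in affine(*params["c"], z)]  # candidate memory
    # New memory: keep part of the old memory, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Hypothetical sizes and constant weights, for illustration only.
hidden, n_in = 2, 3

def make(value):
    return ([[value] * (hidden + n_in) for _ in range(hidden)], [0.0] * hidden)

params = {"i": make(0.1), "f": make(0.2), "o": make(0.1), "c": make(0.3)}
h, c = [0.0] * hidden, [0.0] * hidden
for x_t in [[0.9, 0.5, 0.1], [0.8, 0.4, 0.2]]:  # two time steps of three features
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

Because the memory update is additive and scaled by the forget gate, information from early cycles can persist over many steps, which is the property the text relies on for long-term degradation trends.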
Overfitting is one of the primary issues with deep learning approaches. In this paper, the dropout technique is used to reduce overfitting and enhance model performance by randomly deactivating neurons. Specifically, during the forward propagation of the NN, each neuron is deactivated with a specific probability p. Through this process, a random, smaller NN is sampled from the original NN structure, reducing the model's complexity while severing the connections between some neurons. While addressing the issue of overfitting, the dropout strategy may also lower the number of parameters updated in each pass and increase convergence speed. Moreover, the optimization method of the NN is critical, as it dictates how the NN executes the learning process, hence influencing convergence speed, characteristics, and model performance. It has been demonstrated that the RMSprop approach is effective for RNNs and offers a quick convergence rate [53]. In this research, the RMSprop approach is therefore employed as the optimizer for the LSTM model.
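As a rough illustration of the two techniques named above, the sketch below implements inverted dropout and a single RMSprop update in plain Python. The dropout probability, learning rate, decay factor, and the quadratic toy objective are assumptions made here; the paper does not report its hyperparameter values, and a deep learning library would normally provide both operations.

```python
import math
import random

def dropout(activations, p, rng, training=True):
    """Inverted dropout: zero each unit with probability p, rescale the rest."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def rmsprop_step(params, grads, cache, lr=0.001, decay=0.9, eps=1e-8):
    """One RMSprop update; cache holds the running mean of squared gradients."""
    new_params, new_cache = [], []
    for w, g, s in zip(params, grads, cache):
        s = decay * s + (1.0 - decay) * g * g
        new_params.append(w - lr * g / (math.sqrt(s) + eps))
        new_cache.append(s)
    return new_params, new_cache

# Dropout on a layer of 1000 unit activations (p = 0.5 is an assumed value).
rng = random.Random(0)
hidden = dropout([1.0] * 1000, p=0.5, rng=rng)

# RMSprop on a toy objective f(w) = (w - 3)^2, whose gradient is 2 (w - 3).
w, cache = [0.0], [0.0]
for _ in range(2000):
    grads = [2.0 * (w[0] - 3.0)]
    w, cache = rmsprop_step(w, grads, cache, lr=0.01)
print(sum(hidden) / len(hidden), w[0])
```

Dividing each gradient by the running root mean square keeps the step size roughly uniform across parameters, which is one reason RMSprop is often chosen for recurrent networks.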

Results and Discussion
In this section, the SOH prediction results are described, and the accuracy and generality of the proposed model are validated based on both NASA and Oxford datasets.

Evaluation Index
In order to quantitatively evaluate the prediction results, RMSE and MAE are used to describe the error between the real SOH and the predicted SOH. The expression of RMSE is given by Equation (13), and the expression of MAE is given by Equation (14):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Q_i - \hat{Q}_i\right)^2} \quad (13)$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|Q_i - \hat{Q}_i\right| \quad (14)$$
where $\hat{Q}_i$ represents the predicted value, $Q_i$ represents the true value, and $N$ represents the number of cycles. In this paper, the battery SOH is calculated from the remaining capacity of the battery, and the expression is as follows:
$$\mathrm{SOH} = \frac{Q_{\mathrm{current}}}{Q_{\mathrm{initial}}} \quad (15)$$
where $Q_{\mathrm{current}}$ represents the current capacity of the battery and $Q_{\mathrm{initial}}$ represents the initial capacity. The SOH of a new battery is 1, and the SOH value gradually decreases as the battery ages.
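The three quantities defined in this subsection are straightforward to compute. The sketch below is a minimal illustration under stated assumptions: the function names are chosen here, and the capacity and prediction values are made up rather than taken from the NASA or Oxford data.

```python
import math

def soh(q_current, q_initial):
    """State of health as the ratio of current to initial capacity."""
    return q_current / q_initial

def rmse(true, pred):
    """Root mean square error over N cycles."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

def mae(true, pred):
    """Mean absolute error over N cycles."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)

# Made-up capacities in Ah for four cycles of a cell with 2.0 Ah initial capacity.
capacities = [2.0, 1.9, 1.8, 1.7]
true_soh = [soh(q, 2.0) for q in capacities]
pred_soh = [0.99, 0.96, 0.90, 0.84]  # hypothetical model output

print(f"RMSE = {100 * rmse(true_soh, pred_soh):.2f}%")
print(f"MAE  = {100 * mae(true_soh, pred_soh):.2f}%")
```

Since SOH is a fraction of the initial capacity, multiplying the errors by 100 gives the percentage values quoted in the results.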

Results with Different Features
In this subsection, the comparison between single-feature and multi-feature analysis is conducted based on battery B5 of the NASA dataset. The first 40% of the data is used as the training set to derive the parameters of the LSTM NN model, while the remaining 60% is used as the testing set to evaluate the performance of the model. From the original battery data, each feature analysis method obtains a feature set, and the collaborative features are the fusion of all feature sets. Figure 5 shows the results of the SOH prediction based on the different feature sets and on feature collaboration. Figure 5a-e shows the detailed results, with two colored lines in each subplot: the orange line indicates the actual SOH, whereas the black line indicates the predicted SOH. Figure 5f is a bar chart displaying the RMSE and MAE for each case.

As shown in Figure 5f, each case achieves good prediction results. The highest errors are an RMSE of 1.22% and an MAE of 1.10%. As illustrated in Figure 5a-e, the model accurately predicts the degradation trend of the battery SOH. The capacity rebound phenomenon in the NASA dataset is accurately captured, reflecting the quality of the selected features. In addition, the models based on the DTV and IC analyses obtain RMSEs of 1.22% and 1.10%, respectively, which are the highest error values among the feature analysis methods. The differential operation in DTV and IC analysis notably amplifies the noise of the measurement data, resulting in a loss of accuracy. Although the SG filtering method is applied to remove the noise and smooth the curves, the influence of noise cannot be eliminated entirely. In contrast, the SVD method achieves the best prediction results among the single-feature analyses, with an RMSE of 0.67%, indicating that the data compressed using the SVD method still characterizes the battery degradation well. The feature set of the TVC analysis achieves an RMSE of 0.92%, revealing that, as battery degradation deepens, the change in the battery discharge voltage-time curves can characterize battery degradation well through the simple feature obtained by TVC analysis.
Notably, the model with the multi-feature collaborative analysis achieves the best prediction results, reaching an RMSE of 0.62% and an MAE of 0.51%. The results demonstrate the superior predictive accuracy of the suggested method. The collaborative features contain more information on the degradation of various variables, enabling the deep learning model to characterize the battery degradation better. In addition, single-feature analysis can be sensitive to data quality. Because it considers several different methods together, the multi-feature collaborative analysis method avoids the extreme situation in which a single-feature analysis is disturbed by poor data quality, indicating that the proposed method is more robust than the single-feature analysis method.

Results Based on Different Batteries
In this subsection, the generality of the model based on the multi-feature collaborative analysis method is validated based on the batteries B6, B7, and B18 of the NASA dataset. Figure 6 displays the results of the predicted SOH on different batteries. Figure 6a-c shows the detailed prediction results. Figure 6d describes the RMSE and MAE.

As shown in Figure 6d, the results for the different batteries reach a maximum RMSE of 0.93% and a maximum MAE of 0.80%, which indicates good overall prediction results and demonstrates the model's dependability and generality.
As shown in Figure 6a-c, the model based on the multi-feature collaborative analysis method achieves accurate SOH prediction in all cases, reflecting the superiority of the proposed method. The collaborative features contain more degradation information compared with a single feature. While the single feature is easily disturbed by the data quality and the measurement conditions of different experiments, the multi-feature collaborative method can nevertheless accurately predict the SOH of different batteries, reflecting the model's generality. From the detailed prediction results, the model also accurately captures the battery capacity rebound phenomenon, reflecting the proposed method's advantage. Specifically, the collaborative features can accurately analyze the various changing behaviors of the battery SOH and precisely construct the mapping relation between the features and SOH. The above results can demonstrate the generality of the multi-feature collaborative analysis method.

Results with Different Features
In this subsection, the comparison between single-feature analysis and multi-feature analysis is performed based on cell 1 of the Oxford dataset. Figure 7a-e illustrates the detailed results of the model SOH prediction. Figure 7f shows the RMSE and MAE. As shown in Figure 7f, each case achieves accurate SOH prediction, with a maximum RMSE of 0.45% and a maximum MAE of 0.39%. In addition, the prediction results on the Oxford dataset are significantly better than those on the NASA dataset. This is because the battery SOH in the Oxford dataset declines smoothly and does not exhibit a significant capacity rebound, making it easier to achieve SOH prediction on this dataset. The models based on each feature analysis achieve balanced prediction results, indicating that the selected features have an excellent ability to characterize battery degradation. Among the single-feature models, the model based on IC analysis achieves the lowest RMSE of 0.41%. The IC analysis reflects internal battery degradation through external current and voltage measurement parameters. The results imply that the feature provides a good link between the measurement parameters and the battery's microscopic changes. In contrast, the model based on DTV analysis achieves the highest RMSE of 0.45%. The DTV method links the temperature and voltage measurement data with the internal entropy change in the battery. However, because temperature data is difficult to measure and vulnerable to noise interference, DTV analysis is more susceptible to being disturbed by measurement noise. Despite this, the DTV analysis model still obtains an RMSE of 0.45%, indicating that the DTV analysis can link the internal entropy change with the external measurement parameters and characterize the battery degradation well.
Notably, the model based on the multi-feature collaborative analysis method still achieves the best prediction results, with an RMSE of 0.32% and an MAE of 0.28%, which may be because the multi-feature collaborative method provides more information to describe degradation. The above results reflect the reliability and generality of the proposed method.

Results on Different Batteries
In this subsection, the generality of the multi-feature collaborative analysis method is further validated based on cell 3, cell 4, cell 6, cell 7, and cell 8 of the Oxford dataset. Figure 8a-e shows the detailed prediction results of the proposed method. Figure 8f shows the RMSE and MAE. As shown in Figure 8f, the model based on the multi-feature collaborative analysis method achieves accurate SOH prediction on each cell, with a maximum RMSE of 0.56% and a maximum MAE of 0.45%. The results demonstrate the generality of the proposed method, which may be attributable to the fact that the collaborative features contain more degradation information to characterize battery degradation than a single-feature set. As shown in Figure 8a-e, the real SOH curve in this dataset is smooth, and there is no capacity rebound phenomenon. Unlike the above results on the NASA dataset, in which the predicted SOH shows a slight rebound, the predicted SOH on this dataset follows the real SOH with a smooth decline. The results suggest that the proposed method can adequately capture the change in battery SOH for batteries with different material systems and varied SOH change behaviors, demonstrating the method's dependability and generality.
Tables 5 and 6 show the prediction results of the proposed method and compare them with other works based on the NASA battery dataset and the Oxford battery dataset, respectively. These studies employed data-driven algorithms and some hybrid algorithms to establish a battery degradation model. Ref. [54] used voltage and current as features; a Support Vector Regression (SVR) model and an Adaptive Dropout LSTM were established to predict the battery's SOH. Ref. [43] used DTV as the input feature, and a GPR model was constructed to predict the battery's SOH, achieving excellent results on the NASA dataset. Ref. [55] used the time interval of the equal charging voltage rising (DV_DT) as the input feature, and an LSSVM-ECM model and an EDM model were constructed and tested on both the NASA and the Oxford datasets. Ref. [56] constructed SVM, MLR, and GPR models, together with a fusion of the three, and tested them on the Oxford dataset. The designed method has a high SOH prediction accuracy on different datasets. It benefits from the input features provided by the multi-feature analysis method and from the excellent ability of the LSTM network to handle long-term dependencies in time series. The comparison results indicate that the designed SOH prediction method has good prediction performance.

Discussions
This study proposes a method that combines multi-feature collaborative analysis with deep learning to estimate the SOH of LIBs with a high level of precision. The synthesis of diverse information sources, such as temperature, current, and voltage, plays a crucial role in the assessment of battery degradation. A thorough analysis of these variables can contribute to a deeper insight into the degradation process. Furthermore, it is crucial to verify the results for various types of battery materials to establish their general applicability. The use of multi-feature collaborative analysis allows for the examination of multiple sources of information related to the LIBs, resulting in a model with robust generality. Note that the proposed method is validated under specific experimental conditions (CC-CV). In practice, however, the operating conditions of the battery may be uncertain due to varying environments. The inherent uncertainty and imprecision of dynamic loading conditions present significant challenges for extracting highly correlated features as inputs to the deep learning model, which can weaken the correlation between the extracted features and the SOH, thereby compromising the accuracy of the model. Nevertheless, the proposed method combines multiple data sources. If one feature analysis fails, the other feature extractors can potentially mitigate the impact and maintain the functionality of the model. The use of deep learning models for battery SOH estimation also poses several challenges, particularly in terms of deployment on resource-constrained devices such as those in electric vehicles.
One approach to mitigating these deployment challenges is to use the computational resources of a cloud-based platform to train the model and to make it available to the target device through data transmission. By using cloud-based resources to analyze features and train the proposed deep learning-based model, it is feasible to achieve real-time prediction of the battery state. When selecting a model for a particular task, some simple machine learning models may prove to be more effective in terms of prediction accuracy. In certain cases, however, more complex machine learning models, such as the LSTM NN in this study, may outperform their simpler counterparts. This is due to the increased complexity of the LSTM, which allows it to capture the relationship between the input data and the SOH more effectively, resulting in improved generalization capabilities. The results of this study demonstrate that the proposed method achieves an acceptable level of accuracy in SOH estimation for various types of LIBs through the effective extraction of relevant battery information via the multi-feature collaborative analysis method. However, further research is needed to investigate the generality of this method for other types of LIBs.

Conclusions
In this study, an SOH estimation framework for LIBs based on a deep learning method with collaborative multi-feature analysis is provided. In this framework, physically derived features are combined, and a deep learning model is used to estimate battery SOH efficiently. DTV, SVD, IC, and TVC analyses were used to obtain degradation features that thoroughly describe battery degradation and increase the model's generality. Based on the results of the feature analysis, Pearson correlation analysis is then used to select features as the input of the deep learning model, whose core is the LSTM NN. In addition, the dropout and RMSprop techniques are incorporated into the framework, which improves the model's performance. In order to verify the generality of the multi-feature collaborative analysis method, the NASA and Oxford battery degradation datasets are selected for validation and error analysis. The experimental results demonstrate that the method has high estimation accuracy and generality, with a maximum root mean square error (RMSE) of less than 1%. The proposed method can reduce the dependence of the estimation accuracy on the data scale while considering the interpretability of the estimation results. With the continuous improvement of cloud technology, this study provides valuable improvements to the battery SOH estimation method, which boost the interpretability and generality of the deep learning model. On the basis of the Cyber Hierarchy and Interactional Network (CHAIN) architecture, it is anticipated that the deep learning model paired with the multi-feature collaborative analysis approach could be implemented on a cloud platform.

Data Availability Statement:
The data presented in this study are openly available at the NASA Ames Center of Excellence Diagnostic Center and the Oxford University Research Archive.