Deep Learning in the State of Charge Estimation for Li-Ion Batteries of Electric Vehicles: A Review

As one of the critical state parameters of the battery management system, the state of charge (SOC) of lithium batteries can provide an essential reference for battery safety management, charge/discharge control, and the energy management of electric vehicles (EVs). To analyze the application of deep learning in electric vehicles' power battery SOC estimation, this study reviewed the technical process, common public datasets, and the neural networks used, as well as the structural characteristics and advantages and disadvantages of lithium battery SOC estimation in deep learning methods. First, the specific technical processes of the deep learning method for SOC estimation were analyzed, including data collection, data preprocessing, feature engineering, model training, and model evaluation. Second, the commonly used public lithium battery datasets were summarized. Then, the input variables, datasets, errors, and advantages and disadvantages of three types of deep learning methods were obtained using the structure of the neural network used for training as the classification criterion; further, the selection of the deep learning structure for SOC estimation was discussed. Finally, the challenges and future development directions of lithium battery SOC estimation using the deep learning method were explained. Overall, this review provides insights into deep learning for EVs' Li-ion battery SOC estimation in the future.


Introduction
With the intensification of global warming and climate anomalies caused by carbon dioxide emissions [1,2], reducing and controlling the production of fossil-fuel vehicles has become a worldwide consensus. The accelerated electrification of vehicles is therefore an important current trend, but a major obstacle to its wide application is the range limitation [3]. Lithium batteries, the power source of most current electric vehicles (EVs) [4], have the advantages of high stability, high energy density, and a long cycle life [5]. As the demand for electric vehicles has grown, the requirements for battery performance and energy management have increased. The battery state is an important parameter of the battery management system (BMS) [6], and the accuracy of the SOC affects the rationality of energy distribution, the driving range, safety, and the optimal charging and discharging of Li-ion batteries in EVs [7]. Furthermore, the accuracy of SOC estimation determines how intelligently the BMS in EVs can respond to demand automatically, because the BMS decides when to stop charging or discharging, for the safety and health of the Li-ion batteries, according to the estimated SOC values.
The state of charge (SOC) is one of the most important state parameters of an EV's Li-ion battery. It is equivalent to the amount of fuel in a fuel-burning car: it indicates the remaining charge of the battery and allows other quantities of the EV to be computed [8]. However, the electrochemical reaction in lithium batteries is complex, and the SOC is commonly defined as shown in Equation (1):

$$\mathrm{SOC} = \frac{C_{\mathrm{curr}}}{C_{\mathrm{full}}} \times 100\% \quad (1)$$

where $C_{\mathrm{curr}}$ is the real-time battery capacity and $C_{\mathrm{full}}$ is the capacity in the fully charged state. When the battery is fully charged, the SOC is 100%, and the SOC is 0% when the battery discharge is completed [3]. In the practical application of the BMS in EVs, the mathematical representation of SOC depends on the method of SOC estimation. Currently, there are two main methods for lithium battery SOC estimation: model-driven and data-driven. The summary diagram of the SOC estimation methods is depicted in Figure 1.

The main model-driven idea is to build models for estimating SOC values from scientific knowledge of lithium batteries. These models are divided into two categories: the electrochemical model [10][11][12] and the equivalent circuit model [13][14][15][16][17]. The electrochemical model describes the internal dynamics of Li-ion batteries to achieve higher SOC estimation performance, but it is less often applied in the BMS of EVs because of its high computational cost and its complicated mathematical equations, which are often partial differential equations that require intensive computation. The equivalent circuit model represents the battery with electrical components and is used to monitor the behavior of Li-ion batteries over time. It is derived from empirical knowledge and experimental data, and it is widely applied in the BMS for on-line SOC estimation because it can estimate SOC in real time at a low computational cost. However, its SOC estimation accuracy is usually limited by the range over which the model was parameterized.
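As a small worked illustration of Equation (1), the sketch below computes the SOC from a remaining capacity and a full capacity. The function name `soc_percent` and the example capacities are assumptions made for illustration; they are not taken from any of the cited works.

```python
def soc_percent(c_curr_ah: float, c_full_ah: float) -> float:
    """Equation (1): SOC as a percentage of the fully charged capacity."""
    if c_full_ah <= 0:
        raise ValueError("full capacity must be positive")
    return c_curr_ah / c_full_ah * 100.0


# Hypothetical example: a 2.9 Ah cell with 1.45 Ah remaining.
half_full = soc_percent(1.45, 2.9)
```

In practice, the remaining capacity cannot be read directly from the cell, which is why the estimation methods discussed in this review are needed.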
The data-driven method measures data related to lithium batteries and then uses the data to generate a model. There are two common types: filtering algorithms [16,18-23] and machine learning, which is mainly divided into traditional machine learning and deep learning. Fuzzy logic [24,25], support vector machines [26], and neural networks [27][28][29] are the traditional machine learning methods currently used most often for the SOC estimation of the lithium batteries of EVs in the off-line condition. These methods are not widely used in the BMS of EVs, mainly because of their high computational cost; hence, the data-driven method for the EVs' Li-ion SOC estimation problem is computed on a workstation. The filter algorithm is usually combined with an equivalent circuit model and a data-driven method to estimate SOC with higher accuracy, and it is not usually deployed in ...

The main research themes of existing reviews on SOC estimation for the lithium batteries of EVs are model-driven [30][31][32][33], data-driven [34][35][36], model and data-driven [37,38], and machine learning [3,39], but few reviews focus on deep learning. In recent years, deep learning has been applied in computer vision, natural language processing, and life sciences [40], and some important research results have been achieved [41]. In 2017, some scholars tried to apply deep learning to the lithium battery SOC estimation problem, where it has some advantages over the previous methods in terms of time and accuracy.
To advance deep learning-based SOC estimation for EVs' Li-ion batteries and fill this research gap, this review provides a comprehensive description of the problem. Its contributions are as follows: (1) The process of EVs' Li-ion battery SOC estimation in the deep learning method is discussed comprehensively; (2) The algorithms, characteristics, and selection of the deep learning structure used to estimate the SOC of the Li-ion battery in EVs are explained thoroughly; (3) The challenges and future development of EVs' Li-ion battery SOC estimation in the deep learning method are presented.
This study reviews the application of deep learning methods in EVs' Li-ion battery SOC estimation from four aspects. The first part gives an overview of the technical process of estimating the SOC of a lithium battery with the deep learning method. Because the data of the Li-ion battery in EVs are the most important part of SOC estimation in the deep learning method, the second part covers high-quality public lithium battery datasets. Because the neural network structure determines the performance of SOC estimation, the third part covers the different neural network structures applied to the EVs' lithium battery SOC estimation problem. The fourth part analyzes and evaluates the characteristics of the different neural networks and discusses the future development of SOC estimation in the deep learning method.

Process of SOC Estimation Using the Deep Learning Method
The flow chart of the SOC estimation technology of a lithium battery based on deep learning is shown in Figure 2. The main process includes five steps: data collection, data preprocessing, feature engineering, model training, and model prediction.
Data collection is a time-consuming part of the whole process. To simulate the state changes of lithium batteries under real driving conditions, the parameter changes caused by the load on the lithium battery under driving conditions are generally recorded to form driving cycles, which are then applied to the tested lithium battery. Common drive cycles include DST (Dynamic Stress Test), US06, FUDS (Federal Urban Driving Schedule) [42], and BJDST (Beijing Dynamic Stress Test) [43]. Since the ambient temperature has a significant impact on the lithium battery, a thermal chamber is generally used to set the temperature in the lithium battery test so that the state of the battery at different temperatures can be simulated.
The original data measured by the instrument generally need to be pre-processed, that is, cleaned, because random conditions during the test can, with a certain probability, lead to missing data, the introduction of noise signals, and other problems.
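The cleaning step can be sketched as follows. This is a minimal illustration, assuming that missing samples are marked as `None` and that linear interpolation and a short moving average are acceptable choices; the cited studies do not prescribe these particular methods, and the function names are hypothetical.

```python
def fill_missing(values):
    """Linearly interpolate None entries; gaps at the edges copy the nearest valid sample."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid samples to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def moving_average(values, window=3):
    """Causal moving average over the last `window` samples to damp measurement noise."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical voltage trace with two dropped samples.
voltage = fill_missing([3.0, None, None, 3.6])
voltage_smooth = moving_average(voltage, window=2)
```

A causal window is used so that each smoothed sample depends only on past measurements, which matches how the data would arrive on-line.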
Feature engineering refers to analyzing or designing features that are strongly correlated with the SOC of lithium batteries based on the measured data, in order to reduce the difficulty of the next step, model training.
Because different variables have different units, their numerical values may differ or even vary greatly in magnitude; in model training, the neural network cannot recognize the units and can only perform numerical operations. Variables with very large values will reduce the weight of variables with small values, which is not conducive to finding the relationship between the measured variables and the SOC of the lithium battery during model training. The data are therefore generally standardized, that is, a unified standard is selected to transform the data numerically. The commonly used normalization is the maximum-minimum normalization process, as shown in Equation (2):

$$x_i^{*} = \frac{x_i - X_{\mathrm{Min}}}{X_{\mathrm{Max}} - X_{\mathrm{Min}}} \quad (2)$$

where $x_i^{*}$ is the value of a variable after normalization, $x_i$ is its value before normalization, and $X_{\mathrm{Min}}$ and $X_{\mathrm{Max}}$ are the minimum and maximum values of the variable. After the maximum-minimum normalization, the values of variables with different units are all mapped to the range between 0 and 1. Then, the standardized feature data are randomly divided into the training set, validation set, and test set. The training set is used to train a model that relates the feature data to the lithium battery's SOC, the validation set is used to check whether the parameters learned on the training set are reasonable so that the model parameters can be adjusted, and the test set is used to test the generalization ability of the trained model and can only be used once. The model is trained in the selected neural network with the training set and then verified on the validation set to see whether it reaches the desired accuracy. If the desired accuracy is not achieved, the parameters of the neural network can be adjusted and the model trained again. If necessary, a different neural network can also be selected for training. If a satisfactory accuracy is achieved, the trained model is tested on the test set to derive the predicted SOC values.
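The normalization of Equation (2) and the random split described above can be sketched as follows. The 70/15/15 split ratio, the fixed random seed, and the function names are assumptions chosen for illustration, not values prescribed by the process itself.

```python
import random


def min_max_normalize(column):
    """Equation (2): map each value of one variable into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant variable carries no information; map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def split_dataset(samples, train=0.7, val=0.15, seed=0):
    """Randomly split samples into training, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Hypothetical terminal voltages in volts.
normalized = min_max_normalize([2.5, 3.0, 4.5])
train_set, val_set, test_set = split_dataset(list(range(20)))
```

In a real pipeline, `X_Min` and `X_Max` would normally be computed on the training set only and reused for the validation and test sets, so that no information from the test set leaks into training.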
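The final step is model evaluation. The predicted SOC values are compared with the actual SOC values in the test set using the root mean square error (RMSE), the mean absolute error (MAE), or the mean square error (MSE) to evaluate the model accuracy. These three errors are defined in Equation (3):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(SOC_{\mathrm{pre},i} - SOC_{\mathrm{act},i}\right)^{2}}, \quad \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|SOC_{\mathrm{pre},i} - SOC_{\mathrm{act},i}\right|, \quad \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(SOC_{\mathrm{pre},i} - SOC_{\mathrm{act},i}\right)^{2} \quad (3)$$

where $N$ is the number of samples in the test set, $SOC_{\mathrm{pre}}$ is the SOC value predicted by the model based on the deep learning method, and $SOC_{\mathrm{act}}$ is the actual SOC value in the test set. The smaller the error obtained from the above formulas, the higher the model accuracy.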
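The three error measures can be computed as in the sketch below. The function names and the example SOC values are assumptions for illustration only.

```python
import math


def mse(soc_pre, soc_act):
    """Mean square error between predicted and actual SOC values."""
    if not soc_pre or len(soc_pre) != len(soc_act):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(soc_pre, soc_act)) / len(soc_pre)


def rmse(soc_pre, soc_act):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(soc_pre, soc_act))


def mae(soc_pre, soc_act):
    """Mean absolute error between predicted and actual SOC values."""
    if not soc_pre or len(soc_pre) != len(soc_act):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(soc_pre, soc_act)) / len(soc_pre)


# Hypothetical predictions and ground truth, with SOC expressed as a fraction.
predicted = [0.5, 0.7, 0.9]
actual = [0.4, 0.7, 1.0]
errors = {"mse": mse(predicted, actual),
          "rmse": rmse(predicted, actual),
          "mae": mae(predicted, actual)}
```

Because the RMSE squares each deviation before averaging, it penalizes occasional large errors more strongly than the MAE, which is why the two are often reported together.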

Li-Ion Battery Dataset
The Li-ion battery data form the most important part of the process of SOC estimation because high-quality lithium battery data help researchers better understand the relationship between lithium battery electrochemistry, usage conditions, design, etc. Different types of lithium batteries have different state attributes, and as the life cycle of lithium batteries grows longer, failure analysis and life cycle testing for lithium batteries become increasingly complex. Therefore, some research institutions have published the test data they obtained for lithium batteries.
NASA was the first organization to make a lithium battery dataset publicly available [44]. The dataset [45] contains lithium battery state parameters obtained by performing charging and discharging tests at three different temperatures, i.e., 4 °C, 24 °C, and 43 °C, and recording the impedance as a damage criterion.
The CALCE battery research team at the University of Maryland [46] tested several common types of lithium batteries with different materials and capacities, which were measured separately at 10 different temperatures ranging from −40 °C to 50 °C. The A123 lithium iron phosphate battery dataset, which is often used in the lithium battery SOC estimation problem, was tested at eight different temperatures ranging from −10 °C to 50 °C with the DST and FUDS drive cycles.
Aiming to optimize the fast charging of lithium batteries, the Toyota Research Center (TOYOTA) cooperated with the Massachusetts Institute of Technology (MIT) and Stanford University and tested 124 and 224 lithium iron phosphate cells with a capacity of 1.1 Ah and a voltage of 3.3 V in a temperature-controlled convection chamber at 30 °C. The tested batteries were rapidly charged at a rate of 4 C, then discharged at the same rate, and cycled until failure [47,48].
The Panasonic 18650PF Li-ion battery dataset [49] was recorded on a brand new 2.9 Ah Panasonic 18650PF cell by Phillip Kollmeyer at the University of Wisconsin-Madison using a 25 A, 18 V Digatron Firing Circuits Universal Battery Tester channel in an 8 cu. ft. thermal chamber. The battery was subjected to a series of tests at five different temperatures; after each test, it was charged at a 1 C rate to 4.2 V with a 50 mA cut-off, at a battery temperature of 12 °C or higher.
Then, a brand new Turnigy Graphene 5000 mAh 65 C cell [50] and a 3 Ah LG HG2 Li-ion battery [51] were tested at McMaster University by Phillip Kollmeyer. Both were tested in an 8 cu. ft. thermal chamber with a 75 A, 5 V Digatron Firing Circuits Universal Battery Tester channel with a voltage and current accuracy of 0.1% of full scale.
Zhang et al. [52], from the China University of Science and Technology, conducted charge and discharge tests on three lithium iron phosphate batteries under constant current and DST conditions at room temperature. This dataset can be used for lithium battery SOC estimation, lithium battery performance measurement, and dynamic characteristic analysis of the pack operation. Wang et al. [53] used the BTS-8000 to perform discharge tests on four LiFePO4 battery packs and supercapacitors under DST and UDDS conditions at room temperature, and the data can be used not only for the SOC estimation of a Li-ion battery but also for Li-ion battery and supercapacitor performance measurements, model parameter calibration, and dynamic characterization.
Machines 2022, 10, 912
Each dataset comes with a full test process that is not reproduced here. The dataset is often in ".mat", ".xlsx", or ".csv" format, and each dataset includes a "Readme" file that describes the test parameters, naming principles, notes, and other details about the dataset. Table 1 provides a review of the publicly accessible higher-quality lithium battery datasets.
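For datasets distributed as ".csv" files, the measured signals can be read into per-variable columns as sketched below. The column names `Voltage`, `Current`, and `Temperature` are hypothetical; the actual names differ between datasets and are documented in each dataset's "Readme" file.

```python
import csv
import io


def load_columns(csv_text, columns):
    """Parse CSV text into {column name: list of floats}; empty cells become None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    data = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            cell = (row.get(name) or "").strip()
            data[name].append(float(cell) if cell else None)
    return data


# Hypothetical excerpt; the second row has a missing voltage reading.
sample = "Voltage,Current,Temperature\n4.1,-1.5,25\n,-1.4,25\n"
signals = load_columns(sample, ["Voltage", "Current"])
```

To read from disk instead, the text can be obtained with `open(path, newline="").read()`. The `None` markers make missing samples explicit so that they can be handled in the data cleaning step.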

Deep Learning Neural Network Structure in SOC Estimation
SOC estimation of a Li-ion battery with the deep learning method uses available data to build a model that approximates the relationship between the input data (voltage, current, temperature, power, capacity, etc.) and the output data (SOC). According to the neural network structure, the methods can be classified as single, hybrid, or trans structures. Figure 3 depicts a summary of the major neural network structures utilized in deep learning for lithium battery SOC estimation.

Single Structure
The single structure uses only one deep learning structure to estimate SOC; in this section, it includes the multi-layer perceptron (MLP) type, the convolutional type, and the recurrent type.

MLP Type-DNN
The Deep Neural Network (DNN) is derived from the multi-layer perceptron, also known as an artificial neural network, after computing power improved and the number of training parameters increased. Its advantages are that it does not limit the dimensionality of the input and is highly adaptable to the data; theoretically, a three-layer perceptron can approximate any nonlinear function. Its disadvantage is that it over-fits easily [54] when the network has a massive number of parameters. Figure 4 shows the structure of a deep neural network with four hidden layers, each containing eight neurons.
Ephrem et al. [55] used a DNN to train a model for SOC estimation on data from the Panasonic 18650 lithium battery tested under different temperatures and driving cycles [49]. Seven fully discharged datasets were selected as training datasets, "US06" and "HWFET" were the validation datasets, and the test set was the dataset recorded under an ambient temperature changing from 10 °C to 25 °C. The inputs were current, voltage, average voltage, and average current, and the model was verified separately at each temperature. On the test set, in a comparison with four other methods, the lowest RMS error obtained was 0.78%. Shrivastava et al. [56] tested the Panasonic 18650 lithium battery, using "DST, FUDS, US06" as the training and validation datasets and "WLTP" as the test set; the inputs were voltage, current, and temperature. The model was compared with the SVR (Support Vector Regression) method, and the RMS error of the DNN method was significantly smaller than that of the SVR. How et al. [57] used the INR lithium battery dataset from the CALCE dataset [46] to train the lithium battery SOC model, with "DST" as the training dataset and "FUDS, BJDST, and US06" as the test dataset, with current, temperature, and voltage as inputs. After training, the model was tested on the "DST" test dataset and compared with five methods, and the RMS error was 3.68%. Kashkooli et al. [58] tested eight commercial 15 Ah lithium battery cells cycled at various constant charge/discharge rates and conducted tests at one-month intervals over a period of 10 months. The measurement data were divided randomly into three groups, with 70% used for training, 15% for cross-validation, and 15% for testing; the test performance of the DNN, measured by MSE, was 0.0247%.
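The forward pass of such a fully connected network can be sketched as follows, using four hidden layers of eight neurons each. The choice of three normalized inputs (voltage, current, temperature), ReLU hidden activations, a sigmoid output that keeps the SOC between 0 and 1, and random untrained weights are all assumptions made for illustration; none of the cited studies is reproduced here.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    """Affine map: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def build_dnn(n_inputs=3, hidden=(8, 8, 8, 8), seed=0):
    """Create the layer list: inputs -> four hidden layers -> one SOC output."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def predict_soc(layers, features):
    """Forward pass: ReLU on hidden layers, sigmoid on the output neuron."""
    x = list(features)
    for weights, biases in layers[:-1]:
        x = [max(0.0, a) for a in dense(x, weights, biases)]
    weights, biases = layers[-1]
    return 1.0 / (1.0 + math.exp(-dense(x, weights, biases)[0]))


# Hypothetical normalized sample: voltage, current, temperature.
network = build_dnn()
soc_estimate = predict_soc(network, [0.8, 0.1, 0.5])
```

Since the weights are untrained, the output carries no physical meaning; the sketch only shows how the input vector flows through the layers. Training would adjust the weights to minimize one of the error measures on the training set.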

Convolutional Type-TCN
Convolutional type neural networks in the SOC estimation of Li-ion batteries are mainly variants of convolutional neural networks (CNNs [59]) for time series data, namely one-dimensional convolutional neural networks [60] (1D-CNNs) and temporal convolutional networks [61] (TCNs). The primary benefit of a one-dimensional convolutional neural network is that it can extract features from and classify one-dimensional signal data while using little computing capacity. In recent years, it has frequently been employed in real-time monitoring tasks such as defect prediction and classification. The SOC estimation of a Li-ion battery is a regression problem, but 1D-CNN models are less accurate on regression problems than on classification problems, so they are typically employed as a feature extraction layer in conjunction with other networks. The main benefits of temporal convolutional networks are that dilated causal convolutions enlarge the receptive field and thus the feature extraction range, and that residual connections [62] mitigate the gradient explosion problem, which allows models with more parameters and higher accuracy to be trained. The schematic diagram of the convolutional neural network is shown in Figure 5.
Hannan et al. [63] constructed a multi-layer feedforward temporal convolutional network and optimized the learning rate using an optimization algorithm. They used "Cycle 1-Cycle 4, Cycle NN, UDDS, LA92" from the dataset [49] as the training set and "US06, HWFET" as the test set; in a comparison with four other models, the MSE on the test set was 0.85%.
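The two TCN ingredients named above, dilated causal convolution and the residual connection, can be sketched for a single-channel signal as follows. The kernel values, the ReLU inside the residual block, and the function names are assumptions for illustration; a full TCN uses many channels and learned kernels.

```python
def causal_dilated_conv(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k * dilation]; samples before t = 0 count as zero."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # never look at future samples
                acc += w * x[idx]
        y.append(acc)
    return y


def residual_block(x, kernel, dilation):
    """Add the block input back onto the ReLU of the convolution output."""
    conv = causal_dilated_conv(x, kernel, dilation)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]


def receptive_field(kernel_size, dilations):
    """Number of past samples visible after stacking one convolution per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


signal = [1.0, 2.0, 3.0, 4.0]
conv_out = causal_dilated_conv(signal, kernel=[1.0, 1.0], dilation=2)
block_out = residual_block(signal, kernel=[1.0, 1.0], dilation=2)
span = receptive_field(kernel_size=2, dilations=[1, 2, 4])
```

With a kernel size of 2 and dilations of 1, 2, and 4, three stacked layers already cover 8 time steps, which illustrates how doubling the dilation enlarges the feature extraction range without adding parameters per layer.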

Recurrent Type-LSTM
As shown in Figure 6, the recurrent type mainly includes the Recurrent Neural Network (RNN), Long Short-Term Memory [64] (LSTM), and the Gated Recurrent Unit [65] (GRU). Gradient explosion or vanishing occurs in recurrent neural networks as the number of parameters increases; the LSTM was created to alleviate this gradient problem, and it was followed by the GRU, which has fewer parameters than the LSTM. At present, the LSTM is the most widely used recurrent network in the lithium battery SOC estimation problem, followed by the GRU, whereas the plain recurrent neural network is not used directly [66]. The benefit of a recurrent neural network is that it can use the previous output as part of the next input, thus exploiting the relationship between the input variables over time; however, because it processes the sequence step by step in one direction and depends on historical data, it takes longer to train than neural networks that can run in parallel.
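How a recurrent cell carries information from one time step to the next can be sketched with a single LSTM step. The standard LSTM gate equations are used; the random untrained weights, the hidden size of 4, and the three inputs per time step are assumptions for illustration.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def init_params(n_in, n_hidden, seed=0):
    """Random input weights, recurrent weights, and zero biases for the four gates."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {g: (mat(n_hidden, n_in), mat(n_hidden, n_hidden), [0.0] * n_hidden)
            for g in ("forget", "input", "output", "candidate")}


def lstm_step(x, h, c, params):
    """One time step: gates decide what to forget, what to store, and what to emit."""
    def gate(name, act):
        w_x, w_h, b = params[name]
        return [act(sum(w * v for w, v in zip(w_x[j], x))
                    + sum(w * v for w, v in zip(w_h[j], h)) + b[j])
                for j in range(len(h))]

    f = gate("forget", sigmoid)
    i = gate("input", sigmoid)
    o = gate("output", sigmoid)
    g = gate("candidate", math.tanh)
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def run_sequence(sequence, params, n_hidden):
    """Feed the sequence one step at a time; each step depends on the previous state."""
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


# Hypothetical normalized (voltage, current, temperature) samples.
lstm_params = init_params(n_in=3, n_hidden=4)
history = [[0.9, 0.2, 0.5], [0.8, 0.3, 0.5], [0.7, 0.1, 0.5]]
final_state = run_sequence(history, lstm_params, n_hidden=4)
```

The loop in `run_sequence` cannot be parallelized over time because step t needs the state from step t-1, which is the reason for the longer training time noted above.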
Ephrem et al. [67] adopted LSTM to train lithium battery SOC models under fixed and varying ambient temperatures using the dataset [46]. For the fixed ambient temperature model, the training dataset included the data from eight mixed drive cycles, two discharge test cases were used as the validation dataset, and the charging test case was the test dataset. For the varying ambient temperature model, the training dataset of 27 drive cycles comprised three sets of nine drive cycles recorded at 0 °C, 10 °C, and 25 °C, and the test dataset included the data of another mixed drive cycle. The input variables of both models were voltage, current, and temperature. After evaluation, the model achieved its lowest MAE of 0.573% at 10 °C and an MAE of 1.606% with the ambient temperature varying from 10 to 25 °C. Cui et al. [68] used LSTM with an encoder-decoder [69] structure on the dataset [43]; the inputs were $I_t$, $V_t$, $I_{avg}$, and $V_{avg}$, and the test result was an RMSE of 0.56% and an MAE of 0.46% on US06, which was more accurate than using only LSTM or GRU in that paper. Wong et al. [70] used the undisclosed "UNIBO Power-tools Dataset" as the training dataset and the dataset [51] as the test dataset with the LSTM structure; the input variables were current, voltage, and temperature, and the MAE was 1.17% at 25 °C. Du et al. [71] tested two LR1865SK Li-ion battery cells at room temperature and used the dataset in [45] as a comparative case to test the model trained by LSTM; the input variables were current, voltage, temperature, cycles, energy, power, and time, and the average MAE was 0.872%. Yang et al. [72] used LSTM to build a model for lithium battery SOC estimation; the data were obtained from the A123 18650 lithium battery under three drive cycles, i.e., DST, US06, and FUDS, and the input vectors were current, voltage, and temperature. In addition, the model robustness was tested with an unknown initial state of the lithium battery, with the Unscented Kalman Filter [73] (UKF) method for comparison; the test results showed that the RMS error of LSTM was significantly smaller than that of UKF.

Recurrent Type-GRU
Yang et al. [74] trained a model using GRU on a dataset obtained by testing three LiNiMnCoO2 batteries with the DST and FUDS drive cycles; the input vectors were the current, voltage, and temperature. The trained model was then tested on a dataset from a battery of another material, where it obtained a maximum RMS error of 3.5%. The authors of studies [75][76][77] all used GRU as the neural network for model training; the datasets were the INR 18650-20R and A123 18650 lithium batteries from the CALCE dataset [46] with inputs of voltage, current, and temperature, and the RMS errors obtained on the test datasets were not significantly different. Kuo et al. [78] tested an 18650 Li-ion battery cell and used GRU with an encoder-decoder structure, in which the input vectors were current, voltage, and temperature; they compared this with LSTM, GRU, and a sequence-to-sequence structure, and the results showed that the MAE of their proposed neural network was lower than that of the other methods at three different drive cycles and temperatures.

Hybrid Structure
The main idea of the hybrid neural network in the estimation of the SOC of a lithium battery is to improve the prediction accuracy of the model by combining the advantages of various types of neural networks. The current common architecture in the lithium battery SOC estimation problem is a 1D-CNN as a feature extraction layer to extract deeper features of the input data, and a recurrent neural network (LSTM or GRU is used more often) as a model building layer to construct a model between the SOC and the input variables. Some scholars also added the fully connected layer (FC) before the final output layer to improve the accuracy of the model. The architecture of 1D-CNN + X + Y in lithium battery SOC estimation is depicted in Figure 7.
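The "1D-CNN + GRU + FC" data flow described above can be sketched as a forward pass. All sizes, the random untrained weights, the omission of bias terms, and the sigmoid on the output are assumptions made to keep the example short; the GRU update follows one common convention, and other conventions swap the roles of z and 1 - z.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def conv1d_features(seq, filters):
    """Feature extraction: each filter spans k consecutive time steps of all inputs."""
    k = len(filters[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        out.append([max(0.0, sum(w * x
                                 for wrow, xrow in zip(filt, window)
                                 for w, x in zip(wrow, xrow)))
                    for filt in filters])
    return out


def gru_step(x, h, p):
    """One GRU step with update gate z, reset gate r, and candidate state n."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], rh))]
    return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def build_model(n_inputs=3, n_filters=4, kernel=3, n_hidden=5, seed=0):
    rng = random.Random(seed)
    gru = {}
    for g in "zrn":
        gru["W" + g] = rand_matrix(n_hidden, n_filters, rng)
        gru["U" + g] = rand_matrix(n_hidden, n_hidden, rng)
    return {"filters": [rand_matrix(kernel, n_inputs, rng) for _ in range(n_filters)],
            "gru": gru,
            "fc": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
            "n_hidden": n_hidden}


def estimate_soc(model, seq):
    """1D-CNN features -> GRU over time -> fully connected output neuron."""
    h = [0.0] * model["n_hidden"]
    for feats in conv1d_features(seq, model["filters"]):
        h = gru_step(feats, h, model["gru"])
    return sigmoid(sum(w * hi for w, hi in zip(model["fc"], h)))


# Hypothetical window of six normalized (voltage, current, temperature) samples.
hybrid = build_model()
window = [[0.90, 0.10, 0.5], [0.88, 0.12, 0.5], [0.87, 0.15, 0.5],
          [0.85, 0.11, 0.5], [0.84, 0.10, 0.5], [0.82, 0.13, 0.5]]
hybrid_soc = estimate_soc(hybrid, window)
```

The convolution shortens the six-step window to four feature vectors, the GRU summarizes them into one hidden state, and the fully connected layer maps that state to a single SOC value.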

1D-CNN + LSTM
Song et al. [79] built a model with a "1D-CNN + LSTM" network combination, using voltage, current, temperature, average voltage, and average current as inputs. The dataset came from a 1.1 Ah A123 18650 lithium battery tested at seven different temperatures under the US06 and FUDS drive cycles. On the test dataset, the error of the "1D-CNN + LSTM" method was significantly smaller than that of the 1D-CNN and LSTM methods used on their own.

1D-CNN + GRU + FC
Huang et al. [80] used a "1D-CNN + GRU + FC" neural network architecture with inputs of voltage, current, and temperature; the dataset was obtained from a BAK 18650 lithium battery at seven different temperatures under the DST and FUDS drive cycles. Compared with single-model methods such as RNN, GRU, and a support vector machine, it achieved the lowest RMS error.

NN + Filter Algorithm
The NN + filter algorithm type combines a neural network with a filter algorithm to improve Li-ion SOC estimation performance. Figure 8 shows one example of this structure, the combination of LSTM and the adaptive H-infinity filter, which is described in more detail in [81]. Yang et al. [82] combined the advantages of LSTM and UKF. They trained an LSTM network offline on the collected data to obtain a pre-trained model; the real-time data were then normalized and fed into both the UKF and the pre-trained model. The UKF filters out the noise and improves the model performance. Later combinations of LSTM with filtering algorithms include "LSTM + CKF (Cubature Kalman Filter)" [83], "LSTM + EKF (Extended Kalman Filter)" [84], and "LSTM + AHIF (Adaptive H-infinity Filter)" [81]; on the test datasets, these models performed better than models trained with LSTM alone.
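To illustrate why a filter can smooth a network's output, the sketch below fuses a noisy SOC estimate with coulomb counting in a scalar linear Kalman filter. It is a deliberately simplified stand-in, not the UKF, CKF, EKF, or AHIF used in [81][82][83][84]: the assumption that the network output acts as a direct SOC measurement, the noise variances `q`, `r`, and `p0`, the function name `fuse_soc`, and the synthetic data are all hypothetical.

```python
import math
import random


def fuse_soc(nn_soc, currents_a, dt_s, capacity_ah, soc0, q=1e-6, r=9e-4, p0=1e-3):
    """Scalar Kalman filter: coulomb counting predicts, the network output corrects.

    q and r are assumed process and measurement noise variances; p0 is the
    assumed initial variance of the SOC estimate.
    """
    soc, p = soc0, p0
    fused = []
    for z, i_a in zip(nn_soc, currents_a):
        # Predict: integrate the current (discharge current taken as positive).
        soc -= i_a * dt_s / (3600.0 * capacity_ah)
        p += q
        # Update: treat the network output z as a noisy direct measurement of SOC.
        k = p / (p + r)
        soc += k * (z - soc)
        p *= 1.0 - k
        fused.append(soc)
    return fused


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Synthetic example: a 2 Ah cell discharged at 2 A for 200 s, sampled every second.
rng = random.Random(0)
steps, dt_s, capacity_ah, current_a = 200, 1.0, 2.0, 2.0
true_soc = [0.9 - (k + 1) * current_a * dt_s / (3600.0 * capacity_ah)
            for k in range(steps)]
# Stand-in for a trained network's output: the true SOC plus Gaussian noise.
nn_soc = [s + rng.gauss(0.0, 0.03) for s in true_soc]
fused = fuse_soc(nn_soc, [current_a] * steps, dt_s, capacity_ah, soc0=0.9)

print("RMSE of raw network output:", rmse(nn_soc, true_soc))
print("RMSE after filtering:      ", rmse(fused, true_soc))
```

In this idealized setting the current measurement and capacity are exact, so the filter mainly averages out the measurement noise. Real cells introduce sensor bias and capacity drift, which is one reason the cited studies use nonlinear or robust filters instead.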

Trans Structure
The trans structure is mainly used to transfer knowledge from source data to target data; in this review, it covers transfer learning and transformers.

Transfer Learning
As depicted in Figure 9, transfer learning reuses the knowledge gained from a learning task trained on source data in the learning task for the target data, which can improve the robustness of the model and help it achieve higher performance. Some researchers have applied transfer learning to enhance SOC estimation. Bian et al. [85] added a fully connected layer after a bidirectional LSTM, with voltage, current, and temperature as inputs. They used three lithium battery datasets: the A123 18650 and INR 18650-20R data from the CALCE dataset [46] as the target datasets, and the Panasonic lithium battery dataset [49] as the pre-training dataset. They then used transfer learning to transfer features from the model trained on the pre-training dataset to the model trained on the target dataset. Compared with single-network methods such as RNN, LSTM, and GRU, the transfer learning model achieved the lowest RMS error.
Liu et al. [86] applied TCN to two different types of lithium battery data and, through transfer learning [87], migrated the model trained for SOC estimation on one battery dataset to another battery dataset as a pre-trained model. The pre-trained model was trained on the "US06, HWFET, UDDS, LA92, Cycle NN" data at 25 °C, 10 °C, and 0 °C from the dataset in [49] and tested on "Cycle 1-Cycle 4"; the input vectors were current, voltage, and temperature. The model trained at 25 °C was then migrated to the new lithium battery SOC model as a pre-trained model. The new model was trained on data measured under two mixed driving cycles in the dataset in [50] and tested on "US06, HWFET, UDDS, LA92" from the same dataset, where its RMS error ranged from 0.36% to 1.02%.
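The basic mechanics of reusing a pre-trained model can be sketched as follows: the feature layer learned on the source battery is frozen, and only the output layer is fine-tuned on target-battery data. This is a toy illustration under stated assumptions, not the procedure of [85] or [86]: the two-unit feature layer, the "pre-trained" weights, the synthetic target data, and the names `features`, `predict`, and `fine_tune_head` are all hypothetical.

```python
import math


def features(x, w_frozen):
    """Frozen feature layer: one tanh unit per row of w_frozen."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_frozen]


def predict(x, w_frozen, head):
    f = features(x, w_frozen)
    return head["b"] + sum(w * fi for w, fi in zip(head["w"], f))


def mse(data, w_frozen, head):
    return sum((predict(x, w_frozen, head) - y) ** 2 for x, y in data) / len(data)


def fine_tune_head(data, w_frozen, head, lr=0.05, epochs=200):
    """Stochastic gradient descent on the output layer only; w_frozen is never modified."""
    head = {"w": list(head["w"]), "b": head["b"]}
    for _ in range(epochs):
        for x, y in data:
            f = features(x, w_frozen)
            err = head["b"] + sum(w * fi for w, fi in zip(head["w"], f)) - y
            head["w"] = [w - lr * err * fi for w, fi in zip(head["w"], f)]
            head["b"] -= lr * err
    return head


# Hypothetical "pre-trained" weights standing in for a model trained on a source battery.
w_pretrained = [[0.9, -0.4, 0.2], [-0.3, 0.8, 0.5]]
head_pretrained = {"w": [0.1, 0.1], "b": 0.0}

# Synthetic target-battery samples: normalized (voltage, current, temperature) -> SOC.
inputs = [[0.1 * k, 0.5 - 0.05 * k, 0.25] for k in range(10)]
target_head = {"w": [0.5, -0.2], "b": 0.4}  # used only to generate synthetic labels
target_data = [(x, predict(x, w_pretrained, target_head)) for x in inputs]

frozen_copy = [row[:] for row in w_pretrained]
mse_before = mse(target_data, w_pretrained, head_pretrained)
head_tuned = fine_tune_head(target_data, w_pretrained, head_pretrained)
mse_after = mse(target_data, w_pretrained, head_tuned)

print("MSE on target data before fine-tuning:", mse_before)
print("MSE on target data after fine-tuning: ", mse_after)
```

Freezing the feature layer keeps the number of trainable parameters small, which matters when target data are scarce. Unfreezing more layers trades that economy for flexibility, and which layers to unfreeze is a design choice this sketch does not address.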

Transformer
The transformer [88] is based on an encoder-decoder structure and an attention mechanism, namely multi-head attention. Because it can strengthen the connections and relations within data, the transformer is applied in natural language processing, image detection, segmentation, and other tasks. In recent years, some scholars have tried to use transformer-based structures for SOC estimation. The diagram of the transformer is shown in Figure 10. Hannan et al. [89] used a structure based on the encoder of the transformer [88] to estimate SOC on the dataset used in [51], with current, voltage, and temperature as input variables. In a comparison with DNN, LSTM, GRU, and other deep learning methods, its test performance was 1.19% RMSE and 0.65% MAE.
Shen et al. [90] used two encoders and one decoder of the transformer, with the current-temperature and voltage-temperature sequences as input variables. The dataset was obtained from [46]; 'DST' and 'FUDS' were used as the training dataset and 'US06' as the test dataset. They also added a closed loop to improve the performance of SOC estimation. Compared with LSTM and LSTM + UKF, the test results showed that the RMSE of their proposed method was lower.
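The attention mechanism at the core of the transformer can be sketched in a few lines. The example below shows single-head scaled dot-product self-attention over a short sequence of battery measurements; multi-head attention runs several such heads in parallel and concatenates their outputs. The projection matrices `wq`, `wk`, and `wv` are arbitrary illustrative values rather than trained weights, and the input sequence is made up.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a T x d input."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(k[0]))
    scores = [[sum(qi * ki for qi, ki in zip(q_row, k_row)) / scale for k_row in k]
              for q_row in q]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights


# Hypothetical normalized sequence: 4 time steps of (voltage, current, temperature).
x = [[0.80, 0.50, 0.25],
     [0.79, 0.55, 0.25],
     [0.77, 0.60, 0.26],
     [0.76, 0.40, 0.26]]
# Arbitrary illustrative projection matrices (3 inputs -> 2 dimensions), not trained.
wq = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]
wk = [[0.1, 0.3], [-0.2, 0.5], [0.3, -0.4]]
wv = [[0.6, 0.0], [0.0, 0.6], [0.1, 0.1]]
context, weights = self_attention(x, wq, wk, wv)
```

Each row of `weights` states how strongly one time step attends to every other time step, which is the sense in which attention relates different parts of the data. The score matrix grows with the square of the sequence length, one source of the transformer's high computational cost.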

Evaluation and Future Development
For further analysis and evaluation, Table 2 summarizes the literature, lithium battery datasets, input variables, and errors of the various deep learning neural network models used for lithium battery SOC estimation. (In Table 2, I: current, V: voltage, T: temperature, t: time, I_avg: average current, V_avg: average voltage, MAX: maximum error.)
As a data-driven approach to the SOC estimation problem of lithium batteries, deep learning methods offer high accuracy and a short modeling time and do not require extensive interdisciplinary knowledge. Because each network has different characteristics, its advantages and disadvantages in practical applications also differ. Table 3 therefore summarizes the advantages and disadvantages of the various neural networks used in deep learning methods for lithium battery SOC estimation.
A DNN can handle Li-ion battery data regardless of the dimensions of the input variables, but it is prone to overfitting and to local optima in SOC estimation when it uses several MLP layers. A 1D-CNN can effectively extract features from Li-ion battery data, but it gives lower SOC estimation precision than other neural network structures when used on its own. TCN is designed for time series data and uses a convolutional neural network structure, but its robustness in SOC estimation is lower than that of the other structures. LSTM can process long-term Li-ion battery data for SOC estimation and alleviates the problem of vanishing and exploding gradients, but it has many parameters to compute and needs a large storage capacity to process Li-ion battery data; it therefore has a long training time. GRU has fewer parameters to compute than LSTM and also alleviates the vanishing and exploding gradient problem, but it still needs a long training time.
1D-CNN + X + Y combines the advantages of different neural networks and, with appropriate network parameters, can further improve the precision of SOC estimation, but its model structure is relatively complex compared with a single neural network. The NN + filter algorithm merges the benefits of a neural network and a filter algorithm to improve SOC estimation performance, but it needs a large capacity to store Li-ion battery data and additional time to process the filter parameters, so it takes longer than estimating SOC with a neural network alone. Transfer learning can transfer knowledge from different types of Li-ion battery data to the target data, but it is difficult to determine which part of the knowledge to transfer. The transformer can capture the connections between Li-ion battery features, but it needs a large amount of data and computing power because of its high computational complexity.
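The error measures used to compare models in this review (RMSE, MAE, and the maximum error abbreviated as MAX in Table 2) can be computed as in the sketch below. The SOC values are made up for illustration and are expressed as fractions of full charge; the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def max_error(y_true, y_pred):
    """Largest absolute error over the test sequence."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


# Made-up SOC values (fractions of full charge), for illustration only.
soc_true = [0.90, 0.80, 0.70, 0.60]
soc_pred = [0.91, 0.78, 0.70, 0.63]
errors = {"RMSE": rmse(soc_true, soc_pred),
          "MAE": mae(soc_true, soc_pred),
          "MAX": max_error(soc_true, soc_pred)}
```

RMSE penalizes large deviations more heavily than MAE, so a model with occasional large errors can rank differently under the two measures. Multiplying the results by 100 gives percentage values comparable in form to those quoted above.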
Choosing an appropriate deep learning structure for SOC estimation is a multi-factor problem that depends on the data, the precision required, the cost in time, and so on. The amount and quality of available data are the first factors to consider: as a data-driven method, deep learning performs well in SOC estimation when the data are plentiful and of high quality. The training time and the precision of SOC estimation need to be considered jointly because, in most of the cases reviewed, the training time is positively correlated with SOC estimation accuracy, but the precision does not increase significantly once the training time exceeds a certain threshold. From the perspective of data, and leaving training time aside, the recurrent structure and transfer learning are preferable when data are scarce, because Li-ion battery data are time series and the recurrent structure can effectively process the historical input data; when data are plentiful, the hybrid structure and the transformer can be applied well to the SOC estimation problem. From the perspective of training time and estimation performance, for the same amount of data the hybrid structure can be adopted because its precision is higher than that of a single structure, although its training time is longer. The selection of a deep learning structure is therefore based on the quality and quantity of the available data as well as the result desired in practice.
Although deep learning can handle large amounts of data with good results, three main problems need to be solved before deep learning methods for lithium battery SOC estimation can be widely used in practice:

1. Data: Because battery types, battery parameters, and battery manufacturers differ between electric vehicles, no single model can generalize SOC estimation across the lithium batteries that power them. Failure and life cycle testing of lithium batteries takes a long time and carries a significant time cost. Battery parameter tests are generally conducted by scientific research institutions or colleges and universities, so the quantity and quality of the data obtained are limited. At present, models trained by deep learning achieve high accuracy only under certain operating conditions or at certain temperatures. For a general model, the amount of data is far from enough. To make the most of the available Li-ion cell data, the following methods can be used:
(1) Time series data augmentation: because Li-ion data are time series, they can be augmented further. Several methods are described in [91]; for Li-ion battery SOC estimation, adding noise is a simple and effective method, as shown in [89].
(2) Creation of new variables from the original data: for example, the derivatives of voltage, current, and temperature can be computed from the measured voltage, current, and temperature. In addition, new variables should be grounded in the science of Li-ion batteries.
(3) Transfer of the model between different Li-ion datasets: to improve the precision of SOC estimation, the layers of a pre-trained neural network can be frozen or fine-tuned to accomplish the target learning task; furthermore, when the amount of data is sufficient, pre-trained models such as GPT-3 and BERT can be applied to the Li-ion SOC estimation problem.

2. Computing power: Because of cost and power consumption constraints, most electric vehicles use an in-vehicle computing platform with a high cost-performance ratio but low computing power and power consumption as the "brain" of their electronic and electrical equipment. To speed up training, most deep learning currently relies on special processing units, such as graphics processing units and tensor processing units. These units, however, are designed for accelerated computation without regard to power consumption and cannot be used directly on the onboard computing platforms of electric vehicles. In addition, all current lithium battery SOC estimation based on deep learning tests the battery separately under simulated driving conditions and trains offline on the data obtained, whereas on-board training would have to be carried out on data measured by the sensors in the real operating environment.

3. Interpretability: So far, there is no recognized scientific explanation for machine learning in computer science, and it is currently used only as a black box. Compared with traditional methods, this results in a lack of stability and interpretability. When a model does not meet expectations, there is no fixed remedy, so resolving the problem can take a long time.
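Two of the data-side remedies listed under the first problem above, noise-based augmentation and the creation of derived variables, can be sketched as follows. The sample values, the noise level, the window length, and the function names are assumptions made for illustration; the moving average is included as one plausible way to build averaged inputs such as average voltage, not as the method of any cited study.

```python
import random


def add_noise(series, sigma, rng):
    """Return a noisy copy of a measured series (Gaussian noise, standard deviation sigma)."""
    return [v + rng.gauss(0.0, sigma) for v in series]


def derivative(series, dt_s):
    """Backward difference; the first sample has no predecessor and is set to 0.0."""
    return [0.0] + [(b - a) / dt_s for a, b in zip(series, series[1:])]


def moving_average(series, window):
    """Average over the current sample and up to window - 1 previous samples."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Made-up voltage samples (V) taken once per second.
voltage = [4.0, 3.9, 3.8, 3.7]
rng = random.Random(0)
voltage_aug = add_noise(voltage, sigma=0.005, rng=rng)  # extra training sequence
dv_dt = derivative(voltage, dt_s=1.0)                   # derived variable: dV/dt
v_avg = moving_average(voltage, window=2)               # derived variable: average voltage
```

The noise level should stay close to the real sensor noise; otherwise the augmented sequences no longer resemble plausible measurements. Derived variables are computed from the original measurements only, so they add no new information but can make existing trends easier for a network to use.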

Conclusions
We reviewed lithium battery SOC estimation methods based on deep learning and the lithium battery SOC datasets commonly used in recent years, studied three types of neural network structures (single, hybrid, and trans), analyzed the advantages and disadvantages of the various neural networks, and listed some methods to improve the data utilization rate together with directions for future development.
SOC estimation for EV lithium batteries with deep learning lies at the intersection of computer science, data science, and battery chemistry. At present, both the deep learning and battery fields still have many complex problems that are difficult to solve or understand. From the perspective of efficiency, deep learning methods are superior to methods such as mathematical models, but they do not provide a deeper understanding of the changes in the battery state parameters. Regarding the problem itself, two aspects are worth paying attention to:

1. High-quality data: Some public lithium battery datasets may not meet actual needs because of differences in battery models or unexpected situations, so it may be necessary to re-test the lithium battery. As a next step, SOC testing of lithium batteries should be considered. Establishing a set of accepted testing methods or standards may be an efficient way to generate high-quality data at scale: it can avoid duplicated testing, reduce testing time, and improve data quality.

2. Computer science: Most existing deep-learning-based research on lithium battery SOC estimation takes neural networks that have achieved breakthroughs in computer science and migrates them to this problem. Future work can continue to follow breakthrough results in computer science and draw on the relevant theories and algorithms; in addition, battery chemistry can serve as a priori knowledge for constructing features related to the state parameters of lithium batteries.
With the advance of computer science, together with advanced data storage (such as cloud storage) and high-quality data, we envision deep learning as a promising technique for modeling real-time battery management in the future.