1. Introduction
In recent years, environmental pollution has become a growing problem, and oil has become increasingly expensive. Electric vehicles (EVs) are gradually being explored as an alternative to conventional fuel vehicles [1]. Compared with conventional fuel vehicles, EVs are more resource-efficient and less costly to maintain. As a cleaner form of personal transport, they support governments’ comprehensive agendas to address greenhouse gas emissions. However, for many years, mileage anxiety, charging anxiety, and battery anxiety have been the three major obstacles to the further development of EVs. Because these three concerns are difficult to allay, internal combustion engines and electric motors will continue to coexist for many years. In this process, lithium-ion batteries have evolved into the standard power source for EVs [
2]. They have a high energy density, a low self-discharge rate, almost zero memory effect, a high open circuit voltage, and a long life. Graphite is widely used as the carbon component in the electrodes of lithium-ion batteries. However, it is a difficult material to obtain. Researchers have found that carbon from plants has potential for use in supercapacitors. Banana peels, grass, and water spinach (Ipomoea Aquatica) can all be used as graphite substitutes. They are more accessible, environmentally friendly and cheaper [
3]. Lithium-ion batteries will be more widely used in the future as people continue to explore the potential applications of biomass in batteries. However, lithium-ion batteries cannot monitor their own health status or remaining power, and they have no system control or real-time human–computer interaction. The battery management system (BMS) was created to monitor battery status, control charging and discharging, ensure that the battery is not short-circuited, check battery health, and exchange information with the rest of the vehicle [4]. The battery state of charge (SOC) describes the remaining battery capacity. It is one of the key indicators for assessing the stability and safety of lithium batteries under current operating conditions. Accurate SOC estimation helps to avoid overcharging or over-discharging [5], avoid explosions caused by thermal reactions, and provide safety assurance to the user, and it is the foundation for the majority of BMS decisions. The state of energy (SOE) [6] measures the ratio of the battery’s currently available energy to its maximum available energy. It is important reference information for the energy management, distribution, and control of the whole vehicle. It reflects more effectively the influence of the present and previous charging and discharging conditions of the battery, and it is more suitable for predicting range-related parameters, such as the remaining range of the EV [
7], to relieve the driver’s mileage anxiety. In addition, accurate knowledge of the SOE value can facilitate the BMS’s ability to develop more reasonable energy control strategies and optimize the performance of EV energy control. Thus, the EV range will be extended and battery energy usage will increase, both of which are crucial for EV economy improvement.
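The two state quantities described above can be written compactly. The following is only a sketch under common conventions, not necessarily the exact definitions used later in this paper; the symbols $Q_{\mathrm{rem}}$, $Q_{\max}$, $E_{\mathrm{rem}}$, and $E_{\max}$ are notation assumed here for illustration.

```latex
\mathrm{SOC}(t) = \frac{Q_{\mathrm{rem}}(t)}{Q_{\max}} \times 100\%,
\qquad
\mathrm{SOE}(t) = \frac{E_{\mathrm{rem}}(t)}{E_{\max}} \times 100\%
```

Here $Q_{\mathrm{rem}}$ and $Q_{\max}$ are the remaining and maximum available capacity (in Ah), and $E_{\mathrm{rem}}$ and $E_{\max}$ are the remaining and maximum available energy (in Wh). Because energy accumulates as voltage times current, SOE also depends on the voltage profile, which is one reason it is regarded as the more natural quantity for range prediction.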
SOC and SOE are not directly observable physical quantities and cannot be measured by devices such as sensors. Commonly used estimation approaches include integral methods [8], open-circuit voltage methods [9], and model-based methods. These methods, however, are heavily influenced by ambient temperature, aging, and sensor errors. The electrochemical model method [
10,
11] and equivalent circuit model (ECM) [
12,
13,
14] place high demands on the parameters that make up the model. They require painstaking experiments and deep battery research by experts in the field, and they adapt poorly to the complicated, dynamically shifting real-world conditions of electric vehicles. Recently, with the arrival of the big data era, deep learning (DL) has made significant accomplishments in areas such as image and speech processing. It has powerful automatic feature extraction capabilities and advantages in handling high-dimensional and non-linear data. With the rapid development of the intelligent vehicle industry, many researchers are also applying deep learning algorithms to battery state estimation. Data-driven methods based on deep learning do not require a deep study of the characteristics of the battery’s internal chemical reactions. After training on a large amount of data, they can achieve fast and efficient estimation, and accurate estimates safeguard user experience and safety. Long Short-Term Memory (LSTM) networks have been used to estimate the battery SOC. Yang et al. [
15] proposed using LSTM networks to simulate the complex dynamics of lithium batteries to estimate SOC. They compared it with a model-based approach in terms of computation time and an unknown initial SOC. The model obtains more accurate estimates than the Unscented Kalman Filter (UKF) [
16], with RMSE and MAE within 2% and 1%, respectively, under incorrect initial SOC values. Bian et al. [17] combined a bi-directional LSTM network with an encoder-decoder structure to estimate the battery SOC. The method has an MAE of only 1.07% under varying temperature conditions and improves the MAE by 12% and 16% at 25 °C compared to GRU-ED and LSTM-ED, respectively. In addition, estimation using convolutional modules and attention mechanisms has also been explored. Wang et al. [
18] used a convolution module to achieve a more accurate estimation of SOC. They superimposed multiple measurable variables over a period of time to serve as model inputs, combining process information and interrelationships generated by the voltage or current. The MAE and RMSE values are 1.260% and 0.998%, respectively. Yang et al. [
19] proposed a deep learning method based on a two-stage attention mechanism that effectively reduces the effect of noise on SOC estimation. They used measurable battery quantities such as current, voltage, and temperature as input features to an encoder-decoder network of gated recurrent units. In the encoder, an attention mechanism was used to preprocess the inputs. In the decoder, a second attention mechanism was used to capture the correlation of the time series with reference to the states of the preceding encoder across time steps. The MAE is less than 0.5% in the experimental results. Meanwhile, for battery SOE estimation, researchers have explored different neural network approaches. Liu et al. [
20] proposed a direct SOE estimation method based on an improved back-propagation neural network (BPNN) under dynamic current and temperature conditions. However, this method is an open-loop estimation, so its accuracy degrades when the measured battery parameters are inaccurate. Wang et al. [
21] developed a sliding window neural network model to describe the voltage response of lithium-ion batteries (LIBs) under current and temperature excitation. They used a Monte Carlo sampling method based on a Bayesian probabilistic learning framework to estimate the SOE of the LIBs. There are also researchers who start with an analysis of the battery energy state problem and first predict future operating conditions to achieve an accurate SOE estimate. Liu et al. [
22] proposed a driving condition identification algorithm based on information entropy theory. They applied Markov Chain theory to construct a driving condition prediction algorithm and established an electric vehicle system model. Then, they simulated obtaining the predicted battery operating conditions that correspond to the predicted driving conditions. The final SOE estimation based on EV working condition identification and prediction was achieved. Ren et al. [
23] proposed an SOE estimation method based on future average power prediction. The historical load was collected with a moving average method, and the future SOC, voltage, and temperature sequences were then predicted in a coupled manner. The battery’s SOE was determined by adding up the predicted voltage and capacity sequences. The prediction-based approach can estimate the battery SOE precisely and takes into account the impact of potential future LIB loads. The key to this approach, however, is how to accurately predict the complex future operating conditions of the battery.
Existing research on single-state estimation has achieved milestones. These studies combined attention mechanisms, filters [24], and other techniques to achieve relatively high estimation precision under multiple operating conditions. However, we also find that most studies rely on model fusion, obtaining higher accuracy at the price of more complex estimation models. Estimating multiple states then requires multiple complex single-state models, which undoubtedly makes it harder to deploy the models in EV controllers for practical applications. Multi-task learning models that perform well in other fields cannot be directly applied to state estimation in the battery field. Few studies have discussed their model deployment costs in detail. There is a lack of research on how to balance on-board computational constraints with high-accuracy estimation and on how to achieve joint multi-state battery estimation.
To better meet the needs of practical applications and achieve accurate battery state estimation with limited on-board computing resources, this paper proposes, for the first time, a multi-task learning (MTL) network that combines a multi-layer extraction structure with a separated expert layer for the joint online estimation of the SOE and SOC of lithium batteries. The MTL model adopts a multi-layer extraction structure that separates shared parameters from task-specific parameters. The underlying LSTM first extracts time-series features. The separated expert layer then extracts task-specific features and shared features for the multiple tasks. The final result is the joint online estimation of SOE and SOC. A Panasonic dataset is used to simulate the processes of offline training and online prediction. MTL improves estimation accuracy and reduces the computing resources required for multi-state estimation. We also compare our model with single-task estimation models and other multi-task estimation models and conduct generalization performance tests on other datasets. The experiments demonstrate the effectiveness and superiority of this method.
The paper is organized as follows: 
Section 2 describes the relevant problem and the current state of the art in multi-state estimation and multi-task learning for batteries; 
Section 3 describes the model structure and optimization objectives in detail; 
Section 4 gives a description of the specific experimental design and a comparison and discussion of the experimental results; 
Section 5 draws conclusions; and
Section 6 provides a discussion of future trends and remaining issues.
  4. Experiments
In this section, the datasets used are first described in terms of data recording methods, data partitioning, and data types. Next, experiments are carried out in four aspects to demonstrate the effectiveness and superiority of the proposed method: comparison with different multi-task learning models, comparison with single-task learning models, different loss combinations, and generalization performance.
  4.1. Dataset and Experimental Design
The Panasonic 18650PF [45] lithium-ion battery dataset was collected by Dr. Phillip Kollmeyer at the University of Wisconsin-Madison. It includes test data for the automotive industry standard drive cycles US06, LA92, UDDS, NN, and HWFET at three temperatures: 0 °C, 10 °C, and 25 °C. Each temperature contains five standard cycles and four mixed cycles. Each data point contains the measured current, voltage, temperature, power, counters, etc., saved with a 0.1-second time step.
The LG 18650HG2 [46] lithium-ion battery dataset was collected by Dr. Phillip Kollmeyer at McMaster University. In the dataset, a brand new LG HG2 battery was tested in an 8-cubic-foot thermal chamber with a 75-amp, 5-volt Digatron Firing Circuits universal battery tester channel with voltage and current accuracy of 0.1% of full scale. The dataset includes eight mixed cycles randomly composed of the automotive industry standard drive cycles US06, LA92, UDDS, and HWFET at three temperatures: 0 °C, 10 °C, and 25 °C. Each record contains the measured battery voltage, battery current, and battery temperature, saved with a 0.1-second time step.
NaN values in the Panasonic dataset are removed first. The SOC and SOE label values corresponding to each timestamp are calculated from the rated capacity, the rated voltage, and the discharge process. The standard drive cycle data (UDDS, etc.) at the three temperatures are taken as the training set. Mixed cycles 1 and 2 at the three temperatures, six random mixed drive cycles in total, form the test set; mixed cycles 3 and 4 at the three temperatures, again six cycles in total, form the validation set. The current, voltage, and temperature in the original data are normalized and divided into time windows of size 128 as the model input. The model outputs the predicted SOC and SOE values at the current time. The hyperparameters of the MTL model are shown in Table 4.
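The preprocessing steps above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function names, the sign convention (discharge current negative), the assumption that each cycle starts fully charged, the use of min-max scaling, and the window stride are all assumptions made here, since the text only states that labels come from the rated capacity and rated voltage and that inputs are normalized and divided with a window size of 128.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float]  # (current, voltage, temperature), already normalized


def soc_soe_labels(
    current_a: Sequence[float],
    voltage_v: Sequence[float],
    dt_s: float,
    rated_capacity_ah: float,
    rated_voltage_v: float,
) -> Tuple[List[float], List[float]]:
    """Integrate charge and energy to obtain SOC and SOE labels as fractions.

    Assumptions: the cycle starts fully charged, discharge current is
    negative, and rated energy = rated capacity * rated voltage.
    """
    rated_energy_wh = rated_capacity_ah * rated_voltage_v
    charge_ah = 0.0
    energy_wh = 0.0
    soc: List[float] = []
    soe: List[float] = []
    for i_a, v in zip(current_a, voltage_v):
        charge_ah += i_a * dt_s / 3600.0      # ampere-hour integration
        energy_wh += i_a * v * dt_s / 3600.0  # watt-hour integration
        soc.append(1.0 + charge_ah / rated_capacity_ah)
        soe.append(1.0 + energy_wh / rated_energy_wh)
    return soc, soe


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale one feature column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(
    rows: Sequence[Row],
    soc: Sequence[float],
    soe: Sequence[float],
    window: int = 128,
    stride: int = 1,
) -> List[Tuple[List[Row], Tuple[float, float]]]:
    """Pair each window of consecutive rows with the labels at its last step."""
    samples = []
    for end in range(window - 1, len(rows), stride):
        start = end - window + 1
        samples.append((list(rows[start:end + 1]), (soc[end], soe[end])))
    return samples
```

With `stride=1` the windows overlap and every time step after the first 127 gets a prediction target; setting `stride=128` instead gives non-overlapping windows, which is the other reading of "divided according to the time window size".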
In this paper, the learning rate is set to 0.001 and the batch_size is set to 64. After several experiments that jointly evaluated the errors of the two estimation tasks, dropout = 0.2 is used.
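After training, the test data are used to simulate the online test by feeding them into the trained network. RMSE and MAE are used to evaluate the models’ prediction performance in the online test, and the error results are visualized. The two evaluation indicators are defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$
where $N$ is the number of samples, $y_i$ is the label value, and $\hat{y}_i$ is the predicted value.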
RMSE is one of the most commonly used evaluation indices for regression models. It measures the magnitude of the difference between the model’s predicted values and the actual values. MAE measures the mean absolute error. For both, the smaller the value, the better the model. The paper also records the time the model requires for prediction, taken as the average over 10 repeated experiments on the test set. Our model is trained in a Linux environment using a 12-vCPU Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50 GHz and an RTX 3080 GPU, and the code is written in PyTorch using the PyCharm IDE. The experimental process is shown in Figure 3.
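The evaluation procedure can be sketched as follows. This is an illustrative helper set rather than the authors' evaluation script: `rmse` and `mae` are the standard definitions of the two indicators, while `mean_latency_ms` and its `predict` and `batch` arguments are hypothetical names standing in for the trained model and the test input.

```python
import math
import time
from typing import Callable, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between labels and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between labels and predictions."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def mean_latency_ms(predict: Callable, batch, repeats: int = 10) -> float:
    """Average wall-clock time of predict(batch) over several repeats, in ms."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        elapsed.append((time.perf_counter() - start) * 1000.0)
    return sum(elapsed) / repeats
```

On the same data, RMSE is never smaller than MAE, and the gap between them grows when a few predictions are far off, so reporting both gives a rough picture of how the errors are distributed.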
  4.2. Experiment 1: Different Multi-Task Learning Models
Experiment 1 compares the proposed MTL model with a variety of multi-task learning models used in current research: a CNN with attention mechanisms (CNN_atten), the hard parameter-sharing model (Hard_Share), Customized Gate Control (CGC), and the Multi-gate Mixture-of-Experts (MMOE) multi-task learning model proposed by Google.
The CNN_atten model compared in this paper is a simplified model based on SegNet [
47]. The encoder is divided into three groups of convolution blocks, with a max pooling layer for downsampling between groups. The decoder has a symmetric structure and is also divided into three groups of convolution blocks, with bilinear interpolation and the max pooling position indices used for upsampling between groups. Task-specific networks connect to different depths of the network and select features of varying importance for each task. Hard_Share uses an LSTM with the same structure as in MTL as the shared layer. Each task has a tower network of similar structure as its task-specific network, which learns from the shared timing features extracted by the LSTM. MMOE [36] uses an expert layer composed of several experts to extract features and uses gate structures to form a weighted combination of the experts, thereby learning shared features among tasks. Based on MMOE, CGC divides the experts into shared experts and task-specific experts, which preserves the ability to learn task-specific features.
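The difference between the two gating schemes can be made concrete with a toy sketch. It is not the implementation compared in this paper: expert outputs are plain lists, the gate scores are passed in as numbers (in a real model they would come from a learned layer applied to the input), and all function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over the gate scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix(expert_outputs: Sequence[Vector], gate_scores: Sequence[float]) -> Vector:
    """Weighted sum of expert outputs using softmax-normalized gate scores."""
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, expert_outputs))
        for d in range(dim)
    ]


def mmoe_task_input(all_experts: Sequence[Vector], gate_scores: Sequence[float]) -> Vector:
    """MMOE: each task's gate mixes every expert in the shared pool."""
    return mix(all_experts, gate_scores)


def cgc_task_input(
    shared_experts: Sequence[Vector],
    own_experts: Sequence[Vector],
    gate_scores: Sequence[float],
) -> Vector:
    """CGC: a task's gate mixes only its own experts plus the shared experts."""
    return mix(list(own_experts) + list(shared_experts), gate_scores)
```

In this sketch, an SOC tower under CGC never sees the experts reserved for SOE, whereas under MMOE both towers draw from the same pool and differ only in their gate weights.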
The MTL model in this paper comprehensively considers the benefits and drawbacks of the above models. Specific features and shared features are extracted through a multi-layer extraction structure. Multiple tasks are trained at the same time and share the learned knowledge with each other to improve performance. 
Table 5 shows the average errors of the tests conducted. Smaller values of all these indicators indicate better performance in the corresponding estimation task.
From the comparison results in Table 5, it can be seen that the proposed model retains the ability to learn task-specific features, which improves its accuracy in the battery multi-state estimation task. CNN_atten cannot meet the accuracy requirements for predicting time-series states. For the SOE estimation task, its MAE and RMSE values are 1.8312% and 2.2276%, respectively; for the SOC estimation task, its MAE and RMSE values are 0.9925% and 1.3486%, respectively. These are much higher than those of the MTL model proposed in this paper. Because CNN_atten is still complex, it also takes a long time in the simulated online test. The hard parameter-sharing model, which uses only an LSTM, obtains the best estimates apart from MTL, but a gap to MTL remains. For MMOE and its improved model CGC, the best MAE and RMSE values for SOE estimation are 1.1113% and 1.4281%, respectively, and the best MAE and RMSE values for SOC estimation are 0.7038% and 0.8992%, respectively. Their estimation accuracy is lower than that of MTL; only their test time is better. The MTL model is more suitable for online multi-state battery prediction with limited on-board computing resources.
The MTL model in this paper has MAE and RMSE values of 0.5943% and 0.7709%, respectively, for the SOC estimation task. For the SOE estimation task, the MAE and RMSE values are 1.0128% and 1.2898%, respectively. We compare these results with those of other articles that use the Panasonic dataset. In [28], SOC and SOE were estimated at 25 °C under UDDS operating conditions. The results are shown in Table 6.
Table 5 presents the average results of our tests over multiple temperatures and operating conditions, whereas Table 6 shows the results of [28] at a single temperature and a single operating condition. For the SOC estimation task, there is a large improvement in accuracy: both evaluation indicators improve by more than 0.4%. For the SOE estimation task, our model differs from theirs by around 0.2%. MTL has shown better results in comprehensive tests at multiple temperatures and operating conditions.
  4.3. Experiment 2: Comparison with the Single-Task Model
To analyze the performance difference from single-task learning and confirm the efficiency of the MTL model, Experiment 2 compares the MTL model with single-state estimation models, that is, models that estimate only the battery SOC or only the SOE. The model structures and experimental conditions are otherwise kept consistent, and the same dataset is used to train the two single-state estimation models. Table 7 shows the average errors. Smaller values of all these indicators indicate better performance in the corresponding estimation task.
The comparison results in Table 7 show that the MTL model estimates both states more accurately: the errors of the two tasks are smaller than those of the corresponding single-task models. For the SOE estimation task, the MAE and RMSE values decrease by 0.0961% and 0.0868%, respectively, compared with the single-task learning model; for the SOC estimation task, they decrease by 0.0513% and 0.0775%, respectively. The simulated online prediction time of MTL differs from that of a single-task model by less than 0.5 ms, so its computing cost is far lower than that of running the two single-task models together. The features extracted for the two tasks evidently have some commonality. MTL extracts task-specific features while sharing information between tasks, and the experimental results show that this knowledge sharing leads to more accurate estimates.
  4.4. Experiment 3: Different Combinations of Losses
It is particularly important to balance the convergence rates of the loss functions of different tasks. Most existing methods calculate the loss weights dynamically, and Dynamic Weight Average (DWA) [48] is widely used. DWA is defined as follows:
$$\alpha_m(t) = \frac{M \exp\left(w_m(t-1)/T\right)}{\sum_{i=1}^{M} \exp\left(w_i(t-1)/T\right)}, \qquad w_m(t-1) = \frac{L_m(t-1)}{L_m(t-2)}$$
Here, $\alpha_m(t)$ is the loss weight of task m, $w_m$ represents the loss reduction rate of task m in each training, $L_m$ represents the average loss during a training iteration, t refers to the number of iterations, M is the number of tasks, and T is a temperature that controls how evenly the weights are distributed. This paper presents the changes in the loss weights of the two tasks during model training under DWA dynamic weight calculation, as shown in Figure 4.
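A small sketch may help to show how DWA weights behave. It follows the formulation of DWA as commonly stated: the weights are a softmax over the ratio of each task's two most recent average losses, scaled by the number of tasks so that they average to 1. The function names and the default temperature of 2.0 are assumptions made here, not settings reported in this paper.

```python
import math
from typing import List, Sequence


def dwa_weights(
    prev_losses: Sequence[float],
    prev_prev_losses: Sequence[float],
    temperature: float = 2.0,
) -> List[float]:
    """Loss weights from the last two average losses of each task.

    prev_losses[m] is L_m(t-1) and prev_prev_losses[m] is L_m(t-2).
    A task whose loss is falling more slowly receives a larger weight.
    """
    ratios = [a / b for a, b in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    n_tasks = len(ratios)
    return [n_tasks * e / total for e in exps]


def total_loss(task_losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the task losses used as the training objective."""
    return sum(w * loss for w, loss in zip(weights, task_losses))


# Fixed weights of 1 for both tasks correspond to a plain sum of the losses.
fixed = total_loss([0.02, 0.03], [1.0, 1.0])
```

When both tasks converge at the same rate, the ratios are equal and DWA returns a weight of 1 for each task, which reduces to the fixed-weight objective.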
In Figure 4, the horizontal coordinate is the number of iterations and the vertical coordinate is the weight value; alpha1 and alpha2 represent the weights of the two losses. The polylines show the weight changes for the two tasks during the 30 iterations of training. At only one point does a weight differ from the mean value of 1 by about 0.06. Most of the other weight values stay within a narrower interval around 1, and the weights of the two tasks always rise and fall alternately around 1. It can be seen that battery SOC estimation and SOE estimation are not only correlated to a certain extent but also have almost the same importance and impact on the model. There is no distinguishable difference between the two tasks in terms of learning capacity or rate. Therefore, based on the analysis of the training process and the weight trend, this paper directly sets the loss weights of the two tasks to 1. Table 8 shows the comparison results for different loss combinations. Smaller values of all these indicators indicate better performance in the corresponding estimation task.
In Table 8, MTL denotes the model trained with the loss weights of both tasks set to 1, and MTL_dwa denotes the model trained with weights calculated dynamically by DWA. The dynamically calculated weights show a certain lag and may not keep up with the model’s learning speed. With both weights set to 1, better estimation results are obtained in the simulated online test, although the improvement is not significant. The choice between DWA and fixed weights therefore has no significant impact on the tasks studied in this paper, which again indicates that both tasks are of equal importance and influence in MTL. Figure 5 and Figure 6 compare the predicted values with the label values.
The red curves in the two figures are the label values of SOC and SOE at the corresponding times, and the green and blue curves are the predicted values. The horizontal axis is the number of samples. The vertical axis in Figure 5 is the SOC percentage at the current time, and the vertical axis in Figure 6 is the SOE percentage at the current time. As can be seen from the figures, the prediction curves and the label curves almost completely coincide. There is only a slight error at the end of some discharge cycles.
  4.5. Experiment 4: Model Generalization Performance Test
To verify the generalization of the MTL model, a generalization performance test is performed on the LG battery dataset. The model is compared with multi-task learning models that are widely used in current research and with the corresponding single-task learning models. Table 9 shows the specific comparison results. Smaller values of all these indicators indicate better performance in the corresponding estimation task.
The relevant models in the table have been described in the previous sections. As can be seen from the table, MTL also achieves good estimates on the LG battery dataset. For the SOE estimation task, the MAE and RMSE values are 0.5855% and 0.8671%, respectively, and for the SOC estimation task, they are 0.5267% and 0.7806%, respectively. These results are better than those of the single-task learning models and the other multi-task learning models, which indicates that the MTL model has good general applicability. Figure 7 and Figure 8 compare the predicted values with the label values of the MTL model during simulated online prediction.
The red curves in the two figures are the label values of SOC and SOE at the corresponding times, and the green and blue curves are the predicted values. The horizontal axis is the number of samples. The vertical axis in Figure 7 is the SOC percentage at the current time, and the vertical axis in Figure 8 is the SOE percentage at the current time. As can be seen from the figures, the prediction curves overlap almost exactly with the label curves when predictions are made on the LG dataset. There is only a small error around the 40% level in some discharge cycles. This also shows that MTL works well on the LG dataset.
We also compare our results with those of other articles that use the LG dataset. The method in [27] was tested under UDDS operating conditions at three temperatures: 0 °C, 10 °C, and 25 °C. The relevant results are shown in Table 10.
In [27], SOC and SOE were estimated for UDDS conditions at three temperatures, with the best results at 25 °C. There, the MAE and RMSE values are 0.63% and 0.82%, respectively, for the SOC estimation task and 0.64% and 0.85%, respectively, for the SOE estimation task. The MTL model in this paper has MAE and RMSE values of 0.5267% and 0.7806%, respectively, for the SOC estimation task and 0.5855% and 0.8671%, respectively, for the SOE estimation task. The average test results of our model over multiple temperatures and drive conditions are thus better than the best single-temperature results of [27] on three of the four metrics; only the SOE RMSE is slightly higher (0.8671% versus 0.85%).
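The comparison with [27] can be checked metric by metric using only the values quoted in this section. The snippet below is a simple arithmetic aid; the dictionary and function names are illustrative. Note that it compares a multi-condition average with a single-condition best case, so the gaps are indicative rather than a like-for-like benchmark.

```python
# Errors in percent, as quoted in the text for the LG dataset.
mtl_lg = {
    ("SOC", "MAE"): 0.5267,
    ("SOC", "RMSE"): 0.7806,
    ("SOE", "MAE"): 0.5855,
    ("SOE", "RMSE"): 0.8671,
}
# Best results of [27] (UDDS, 25 °C), as quoted in the text.
ref27_best = {
    ("SOC", "MAE"): 0.63,
    ("SOC", "RMSE"): 0.82,
    ("SOE", "MAE"): 0.64,
    ("SOE", "RMSE"): 0.85,
}


def error_gaps(ours: dict, theirs: dict) -> dict:
    """Return ours - theirs per metric; a negative gap means ours is lower."""
    return {key: round(ours[key] - theirs[key], 4) for key in ours}


gaps = error_gaps(mtl_lg, ref27_best)
for (state, metric), gap in gaps.items():
    print(f"{state} {metric}: {gap:+.4f} percentage points")
```

A negative gap means the MTL error is lower than the corresponding value in [27].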