Data Preparation and Training Methodology for Modeling Lithium-Ion Batteries Using a Long Short-Term Memory Neural Network for Mild-Hybrid Vehicle Applications

Voltage models of lithium-ion batteries (LIB) are used to estimate their future voltages, based on the assumption of a specific current profile, in order to ensure that the LIB remains in a safe operation mode. Data of measurable physical features (current, voltage and temperature) are processed using both over- and undersampling methods, in order to obtain evenly distributed and, therefore, appropriate data to train the model. The trained recurrent neural network (RNN) consists of two long short-term memory (LSTM) layers and one dense layer. Validation measurements over a wide power and temperature range are carried out on a test bench, resulting in a mean absolute error (MAE) of 0.43 V and a mean squared error (MSE) of 0.40 V². The processing of the raw data and the modeling can be carried out without any prior knowledge of LIBs or the tested battery. Due to the challenges involved in modeling the state-of-charge (SOC), measurements are used directly to model the behavior without taking the SOC estimation as an input feature or calculating it in an intermediate step.


Introduction
The market share of mild hybrid vehicles has been increasing, as they provide an easy and effective compromise between the challenging battery electric vehicle (BEV) technology and the drawbacks of internal combustion engine (ICE) vehicles when it comes to satisfying set ecological standards. The efficiency of a mild hybrid vehicle depends largely on the battery-related hardware and software. Higher battery capacities and power capabilities are directly connected to a lower fuel consumption, but also to higher weight and costs. Thus, more accurate battery management system (BMS) software is the key to a viable cost-benefit ratio. An accurate prediction of battery voltage levels after a certain current load is of great significance for vehicle energy and power management systems (EPMSs). Using these values, the EPMS is able to adapt its strategy, either to reduce the load of the consumer or to increase the energy recovered from the electrical machine. This ensures that the battery is capable of providing or storing energy for the drive train in more situations without exceeding the permissible voltage limits. Therefore, the overall energy efficiency can be increased and the fuel consumption decreased [1,2].
Conventional battery models use equivalent circuit models (ECM) to estimate the voltage by modeling the electrochemical processes that take place in a battery during discharging or charging. Such models are very accurate, but require significant modeling effort. Knowledge about internal battery processes and their impact on the voltage behavior is necessary to conduct proper measurements and parameterize RC circuits appropriately. Measurement methods such as electrochemical impedance spectroscopy have to be carried out with accurately calibrated and precisely adjusted equipment. The experimental data then must be processed mathematically to obtain the modeling parameters. An online prediction of the terminal voltage was presented by Ranjbar et al. [3]. Chiang et al. used ECM as an input for their SOC estimation [4]. Madani et al. showed the applicability of ECM to LTO batteries [5].
Interest in modeling battery behavior using machine learning (ML) algorithms has recently been increasing. This trend has been enabled by an increase in both CPU and GPU power, and research activity in the field of ML is expected to increase dramatically. Most ML models in this area address SOC prediction, through the use of fuzzy logic [6], neural networks (NNs) [7], deep NNs [8], LSTM cells [9,10] and gated recurrent units (GRUs) [11,12]. Huang et al. [13] presented an approach based on a convolutional GRU. Vidal et al. [14] presented a comparison of SOC models based on feed-forward neural networks (FNNs), RNNs and Kalman filters. Their comparison showed that RNNs deliver better results than FNNs and results similar to those of models with Kalman filters. Extended Kalman filters [15] and cascaded Kalman filters [16] indicate improvements for models with Kalman filters. The battery state of health (SOH) is highly nonlinear and, therefore, ML is an appropriate approach for modeling it. You et al. [17] modeled the SOH with an FNN, Chaoui et al. implemented an RNN [18] and Zhang et al. [19] used an LSTM.
Some ML methods also model the voltage behavior with acceptable accuracy. These approaches face the same problems as conventional ECMs, as auxiliary quantities such as the SOC or coulomb counters are a mandatory input [3,20]. The proposed method uses only measurable physical parameters as input to estimate the battery voltage. This has the advantage of easier adaptation to other cell chemistries and of fewer errors due to upstream inaccuracies.
The theory of dense, dropout and LSTM layers is described in Section 2. Section 3 outlines the methods for pre-processing the raw data to input data. Section 4 presents the model architecture and the associated training process. Thereafter, Section 5 validates the resulting model with separate data. Our conclusion is drawn in Section 6.

Theory of RNN Utilization in Battery Models
The voltage of a battery largely depends on previous loads and, thus, it is advisable to use a recurrent neural network (RNN) to model the behavior of a battery.

Dense Layer
Dense layers are fully connected feed-forward neural network (FFNN) layers, which are often used to fit linear problems. They are also able to fit non-linear behavior through the use of non-linear activation functions (e.g., sigmoid). These characteristics enable them to be attached to RNNs such as GRUs or LSTMs, in order to further process the abstract outputs resulting from the previous RNN layers. Experiments have demonstrated an increase in performance when using one to three dense layers after an RNN [21,22].

Dropout Layer
Dropout layers can be added to an NN to prevent overfitting and for better generalization. Srivastava et al. [23] initially introduced dropout layers as a regularization technique. The aim is to drop neurons randomly during every weight update process when training. These neurons are ignored during weight tuning and backpropagation. As a consequence, the net becomes less sensitive to the specific weights of neurons. The dropout rate defines the fraction of dropped neurons.
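The random dropping of neurons described above can be illustrated with a minimal NumPy sketch. It uses the common "inverted dropout" variant, which rescales the surviving activations during training; this variant, the function name `dropout` and the array sizes are assumptions made for illustration and are not taken from the proposed model.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return activations  # at inference time the layer is the identity
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True = neuron is kept
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
layer_output = np.ones((1000, 128))          # stand-in activations of a 128-neuron layer
dropped = dropout(layer_output, rate=0.2, rng=rng)
dropped_fraction = float(np.mean(dropped == 0.0))  # close to the dropout rate
```

Because the surviving activations are divided by the keep probability, the expected layer output is unchanged and no rescaling is required at prediction time.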

RNN
Unlike FFNNs, RNNs are capable of processing time-series data due to their structure (as shown in Figure 1a). In an FFNN, every neuron of a layer is connected only to the next layer, whereas the output of a neuron in an RNN can also be connected to neurons of the same or even past layers. This structure provides a time-dependent memory, but can encounter difficulties involving long-term dependencies. These difficulties stem from the vanishing and exploding gradient problems, which arise when each of the neural network's weights receives an update proportional to the partial derivative of the loss function with respect to that weight. In some cases, the gradient can become infinitesimally small, such that the weight is prevented from changing its value [22]. Figure 1 shows an exemplary network architecture for an RNN. The output $y_t$ of the neurons can be calculated using the activation function $\sigma$, the input vector $x_t$, the hidden layer vector $h$, the corresponding matrices $W$ and $U$ and the bias vector $b$:

$$h_t = \sigma_h\left(W_h x_t + U_h h_{t-1} + b_h\right)$$

$$y_t = \sigma_y\left(W_y h_t + b_y\right)$$

GRU
GRUs were first introduced by Cho et al. [24]. This structure, as seen in Figure 1b, is based on the LSTM structure, but has one gate fewer, as it lacks a separate output gate. The matrix calculations during fitting and backpropagation can be computed faster, as there are fewer parameters in the system. GRUs are potentially quicker to train, but potentially less accurate than LSTMs. The hidden state vector $h_t$ is highly dependent on the update gate $z_t$, the previous hidden state vector $h_{t-1}$ and the candidate hidden state vector $\hat{h}_t$. The candidate hidden state vector $\hat{h}_t$ is activated using a tanh function and additionally calculated with the reset gate vector $r_t$. The reset gate vector $r_t$ and the update gate vector $z_t$ are calculated with the previous hidden state $h_{t-1}$ and the input vector $x_t$. $W$ and $U$ describe the corresponding matrices, $b$ is the bias vector, $\sigma$ is the activation function and $\odot$ denotes element-wise multiplication:

$$z_t = \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right)$$

$$r_t = \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right)$$

$$\hat{h}_t = \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right)$$

$$h_t = z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \hat{h}_t$$

LSTM
Hochreiter et al. [25] proposed the long short-term memory (LSTM) cells in their thesis. These neurons can prevent the vanishing gradient problem from occurring in an RNN. This approach adds an additional state to each neuron. Figure 1c shows the structure of an LSTM network, compared to an RNN and a GRU. The hidden state vector $h_t$ is highly dependent on the cell state $C_t$. The cell state enables the network to store information over a longer period of time without encountering the vanishing gradient problem. The forget gate $f_t$, the input gate $i_t$ and the previous cell state $C_{t-1}$ have direct effects on the cell state $C_t$. The forget gate is calculated with the previous hidden state, the input vector and an activation function. The candidate cell state $\tilde{C}_t$ has the same inputs, but is activated with a tanh function. The input gate $i_t$ and the output gate $o_t$ are calculated similarly to the forget gate. $W$ and $U$ describe the corresponding matrices, $b$ is the bias vector, $\sigma$ is the activation function and $\odot$ denotes element-wise multiplication:

$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right)$$

$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right)$$

$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right)$$

$$\tilde{C}_t = \tanh\left(W_C x_t + U_C h_{t-1} + b_C\right)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

$$h_t = o_t \odot \tanh\left(C_t\right)$$
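The interplay of the forget, input and output gates described above can be sketched as a single forward step in NumPy. This is a minimal illustration of a standard LSTM cell under assumed conventions: the dictionary layout of the weights, the layer sizes and the random initialization are hypothetical and do not reproduce the trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U and b are dicts keyed by gate: 'f', 'i', 'o', 'c'."""
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])    # forget gate
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])    # input gate
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])    # output gate
    c_hat = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat                          # new cell state
    h_t = o_t * np.tanh(c_t)                                  # new hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 4, 8  # illustrative sizes only
W = {g: rng.normal(scale=0.1, size=(n_hidden, n_in)) for g in "fioc"}
U = {g: rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for g in "fioc"}
b = {g: np.zeros(n_hidden) for g in "fioc"}

# Run the cell over one random input sequence of 128 steps.
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(128, n_in)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

The cell state is only scaled by the forget gate and incremented by the gated candidate, which is what allows information to persist over many steps.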

Classification of Battery Models
Battery voltage behavior models can be divided into the following categories:
• Analytical models
• Electrochemical models
• Equivalent circuit models
• Data-based models
Analytical (mathematical) models describe the electric behavior of a battery cell in an analytical way. Three main equations (the Shepherd, Nernst and Peukert equations) are applied in this approach. These models are parameterized with test data, including input values for the SOC, voltage and current. Consequently, a previous SOC estimation is obligatory and a temperature dependency is not included.
Electrochemical (physics-based) models can achieve a high accuracy by modeling the dynamic behavior with equations derived from physical and electrochemical laws. This requires solving a large number of partial differential equations in real time and, thus, these models are typically excluded from industrial applications. Common approaches use the Butler-Volmer equation [26].
In the literature, ECMs are widely discussed: using RC circuits [27], in comparison with mathematical models [28] and in comparisons of various ECMs [29]. Farmann et al. [30] showed the applicability of ECMs to LTO batteries. ECMs model the macroscopic effects of the electrochemical processes that occur in a battery cell during charging and discharging. Voltage polarization arising from non-linear effects caused by diffusion, charge transfer, and the electrochemical double layer is modeled using one or more RC circuits. A valid SOC model is essential for building an ECM. Furthermore, precise measurements have to be carried out and the parameters must be fitted.
Data-based models have emerged over the past few years as a promising approach for modeling batteries, due to advancements in computational power and machine learning algorithms. Fitting models to training data is computationally intensive, whereas predicting profiles with a trained model is less so and, thus, capable of operating in real time. Training with large datasets allows us to model effects that current battery knowledge cannot explain and to fit highly non-linear correlations such as aging [31]. Furthermore, no expert battery knowledge is necessary to model the behavior.
This paper investigates the use of ML algorithms to detect the battery behavior. Therefore, only physically measured parameters are used for fitting and, hence, no state variables, such as SOC or state-of-power (SOP), are needed. This ensures the ease of implementation of the training algorithm, as well as high accuracy. Additional data often improve the accuracy but overfitting can be effectively applied in some parameter areas, if the amount of data is lacking. Hence, every dataset can be processed into a valid training dataset without a need for specifying the measured current or voltage profiles.

Complexity and Amount of Data
The data used to train the proposed model were obtained from measurements of testing vehicles. These vehicles drive under customer-oriented conditions with regard to vehicle speed, ambient temperature, driving characteristics and usage behavior. The logged data contain, inter alia, the internally measured battery parameters of current, terminal voltage and temperature.
The battery is part of a 48 V mild-hybrid power supply system, which is subject to power profiles resulting from the electrical machine, the DC/DC converter and the consumer. These batteries consist of 20 lithium titanate (LTO) cells connected in series. Each cell has a nominal capacity of 10 Ah and a nominal voltage of 2.2 V. The battery current is limited to 350 A in the charge and discharge directions, due to the application specifications. With a capacity of 10 Ah, the battery is therefore operated at C-rates of up to 35C. The operating temperature range is between −18 and 60°C. The battery management system (BMS) measures cell voltages, currents and temperatures, in order to provide data to the CAN bus of the vehicle. The CAN bus signals recorded during real operations in test vehicles were used to train and validate the model.
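The pack figures quoted above follow from simple arithmetic, sketched here in Python. The variable names are illustrative, and the nominal energy is a derived value that is not stated in the measurements themselves.

```python
n_cells = 20                   # LTO cells connected in series
cell_nominal_voltage_v = 2.2   # nominal voltage per cell
nominal_capacity_ah = 10.0     # series connection: pack capacity equals cell capacity
max_current_a = 350.0          # current limit in charge and discharge direction

# Series connection adds the cell voltages.
pack_nominal_voltage_v = n_cells * cell_nominal_voltage_v

# C-rate: current relative to the current that would (dis)charge the pack in one hour.
max_c_rate = max_current_a / nominal_capacity_ah

# Derived nominal energy content of the pack in Wh.
nominal_energy_wh = pack_nominal_voltage_v * nominal_capacity_ah
```

With these inputs the nominal pack voltage is about 44 V and the maximum C-rate is 35C, i.e., the current limit would nominally discharge the full capacity in under two minutes.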
A raw data volume of over 200 million data points was the result of these measurements. To train a model with machine learning algorithms, the raw data had to be pre-processed by the following methods, in order to obtain a smaller dataframe with a better reproduction of the battery behavior. Under- and oversampling were used to reduce the raw dataframe for training to 1,028,918 sequences. The validation set had a further 175,394 sequences, where the data were partly manually selected and partly randomly chosen, in order to obtain a diverse validation set. The test procedure described in Section 3.2 was carried out using a test set of 71,901 sequences which, unlike the validation and training sets, was created from test bench measurements. In contrast to the data from the vehicle measurements, the measurements on the test bench were not randomly initialized and aimed to cover a wide range of SOC, temperature and power, in order to demonstrate the performance of the model. Table 1 provides an overview of the mean, RMS and peak power, as well as the C-rates of the three profiles for training, validation and testing. To obtain a model that is able to predict both the critical areas and smoother sections, the profiles included high power and current phases up to 17 kW, as well as battery regeneration phases. The battery temperature and power range was the same range the battery would experience in a vehicle. This facilitates the usability of the model for vehicle applications.
There are two main issues with large datasets. First, training for one epoch becomes very time-consuming. Second, the model may not learn adequately, due to an unequal data distribution over the features. To address both issues, and thereby reduce the amount of data and the training time, the input data were undersampled. Balancing features were selected to reduce over-represented data.
The best predictions were made after balancing with these balancing features:
• $U_{\mathrm{mean}}$: mean voltage in sequence
• $I_{\mathrm{mean}}$: mean current in sequence
• $T_{\mathrm{mean}}$: mean temperature in sequence
• $\Delta U$: difference between $U(t)$ and $U(t-1)$
• $\Delta I$: difference between $I(t)$ and $I(t-1)$
The maximum value range of each of the $n$ features was detected and divided into $m$ equal-sized value ranges. The $m \times n$ bins were filled with data (i.e., to the bin limit) and excess data were cut off. The bias elimination of over-represented feature subranges is an effect of undersampling as well as oversampling.
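The binning procedure described above can be sketched as follows. This is a minimal interpretation, assuming that the bins are formed per feature, that the features are processed one after another and that excess samples are discarded at random; the function name `undersample`, the bin count, the bin limit and the synthetic data are hypothetical.

```python
import numpy as np

def undersample(features, n_bins, bin_limit, rng):
    """Return indices of rows kept after capping every per-feature bin.

    features: array of shape (n_samples, n_features), one row per sequence
    (e.g., mean voltage, mean current, mean temperature, delta U, delta I).
    """
    keep = np.ones(len(features), dtype=bool)
    for col in range(features.shape[1]):
        values = features[:, col]
        # m equal-sized value ranges over the full range of this feature.
        edges = np.linspace(values.min(), values.max(), n_bins + 1)
        # Inner edges only, so bin indices run from 0 to n_bins - 1.
        bin_idx = np.digitize(values, edges[1:-1])
        for b in range(n_bins):
            members = np.flatnonzero(keep & (bin_idx == b))
            if len(members) > bin_limit:
                # Cut off the excess of an over-represented bin at random.
                excess = rng.choice(members, size=len(members) - bin_limit, replace=False)
                keep[excess] = False
    return np.flatnonzero(keep)

rng = np.random.default_rng(0)
# Synthetic stand-in data: current clustered around 0 A, temperature around 30 °C.
demo = np.column_stack([rng.normal(0, 50, 20000), rng.normal(30, 8, 20000)])
kept = undersample(demo, n_bins=20, bin_limit=300, rng=rng)
balanced = demo[kept]
```

Sparsely populated edge bins are left untouched by this step, which is why a separate oversampling stage is still needed for under-represented regions.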

Oversampling
After the undersampling process, the data were still not perfectly distributed over all feature ranges due to a lack of data in some edge areas. Undersampling deeper than necessary would lead to a lack of information to train the model. The oversampling algorithm employed for the proposed model took under-represented data points, added artificial noise to the features and appended the result to the original dataset. The noise was chosen such that it had no noticeable influence on the target value. This flexibility was due to the errors in the measured variables, which are present in the data anyway, as well as due to the inertia of the features. A temperature change of a few degrees Celsius has little influence on the result but, in contrast to a simple duplication, it prevents over-representation of the experimental data.
The noise range was set from −2 to 2°C in 1°C steps, such that every under-represented data point was quintupled. The final data distribution can be seen in Figure 2. The current was almost equally distributed, except in two areas. Very low and very high currents were still under-represented, due to the fact that the system rarely operates in these areas. The range around zero current was over-represented, which was due to some rest periods. Undersampling this area would lead to an inability to perceive the open-circuit voltage (OCV). Due to warming in the battery during operation, there were many more data for higher temperatures than for lower.
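A minimal sketch of this oversampling step is given below. It assumes that the offsets are applied to the temperature channel only, that the zero offset corresponds to the original data point and that the target voltages are left unchanged; the function name `oversample` and the array layout are hypothetical.

```python
import numpy as np

TEMP_OFFSETS_C = np.arange(-2, 3)  # -2, -1, 0, 1, 2 °C

def oversample(sequences, temp_channel, offsets=TEMP_OFFSETS_C):
    """Replicate each sequence once per offset, shifting only the temperature.

    sequences: float array of shape (n_sequences, seq_len, n_features),
    not yet normalized, so the offsets are in °C.
    """
    copies = []
    for offset in offsets:
        shifted = sequences.copy()
        shifted[:, :, temp_channel] += offset
        copies.append(shifted)
    return np.concatenate(copies, axis=0)

# Three under-represented sequences recorded at -15 °C (synthetic stand-in data).
rare = np.zeros((3, 128, 4))
rare[:, :, 0] = -15.0
augmented = oversample(rare, temp_channel=0)  # five variants per original sequence
```

Five offsets including zero yield five variants per data point, which matches the quintupling described above.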

Normalizing
The different ranges of the input features need to be normalized, in order to improve numerical stability and to accelerate the training process. To avoid an oscillating or exploding loss with non-normalized data, a very small learning rate needs to be applied. This is caused by a non-symmetric cost function.
The feature with the largest range determines the learning rate. If the features are normalized, each feature is in the same range, which permits a higher learning rate. This speeds up the training process. A min-max scaler, as shown in Equation (13), was implemented for the proposed model, as it was considered the most effective scaling method:

$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{13}$$
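A min-max scaler of this kind can be sketched in a few lines of NumPy. The function names and the example rows are illustrative; the sketch assumes scaling to the interval [0, 1] with the minimum and maximum taken from the training data.

```python
import numpy as np

def fit_min_max(train_features):
    """Per-feature minimum and maximum, taken from the training data only."""
    return train_features.min(axis=0), train_features.max(axis=0)

def min_max_scale(features, f_min, f_max):
    """Map every feature linearly onto the interval [0, 1]."""
    return (features - f_min) / (f_max - f_min)

def inverse_min_max(scaled, f_min, f_max):
    """Undo the scaling, e.g., to convert a predicted voltage back to volts."""
    return scaled * (f_max - f_min) + f_min

# Columns: temperature in °C, current in A, voltage in V (illustrative rows).
raw = np.array([[-18.0, -350.0, 38.0],
                [25.0, 0.0, 46.0],
                [60.0, 350.0, 53.0]])
f_min, f_max = fit_min_max(raw)
scaled = min_max_scale(raw, f_min, f_max)
restored = inverse_min_max(scaled, f_min, f_max)
```

The same minimum and maximum values have to be reused for the validation and test data, so that all datasets share one scale.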

Sequentializing
Recurrent neural networks, such as GRUs or LSTMs, are built using sequence data as input. The sequence length defines the length of the time window over which the model can fit the behavior. The lengths of sequences have a direct effect on the memory and speed necessary for computation. Considering that battery effects are highly time-dependent, a long sequence length is desirable. The internal effects of diffusion, charge transfer, electrochemical double layer and conductance have different time dependencies. Time constants for modeling these effects range from milliseconds to hours and are dependent on the SOC and temperature.
As a trade-off between computational cost and model accuracy, the sequence length was chosen as 128 data points. Data were logged with a sampling rate of 10 Hz, which means that each sequence represents the last 12.8 s of recording. Sequentializing was performed with a shift of one step, such that no information was lost.
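Sequentializing with a shift of one step amounts to a sliding window over the logged signal, as sketched below. The function name `make_sequences` and the synthetic log are hypothetical; the sequence length and sampling rate are those stated above.

```python
import numpy as np

def make_sequences(log, seq_len=128):
    """Cut a log of shape (n_samples, n_features) into overlapping windows.

    Consecutive windows are shifted by one step, so a log with N samples
    yields N - seq_len + 1 sequences of shape (seq_len, n_features).
    """
    n_windows = len(log) - seq_len + 1
    idx = np.arange(seq_len)[None, :] + np.arange(n_windows)[:, None]
    return log[idx]

sample_rate_hz = 10
log = np.arange(1000 * 4, dtype=float).reshape(1000, 4)  # synthetic 100 s log
sequences = make_sequences(log, seq_len=128)
window_duration_s = sequences.shape[1] / sample_rate_hz  # duration covered per sequence
```

The overlap means that almost every sample appears in 128 different sequences, which explains why memory consumption grows quickly with the sequence length.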

Battery Modeling and RNN Hyperparameter Tuning
Selecting a smart input feature enables simple utilization in vehicle applications. The model accuracy is highly dependent on the chosen hyperparameters and model architectures.

Feature Selection
Input feature selection has a significant influence on the convergence of ML models. Battery voltage prediction models are often trained with the terminal current, battery temperature, actual voltage and an indicator of the remaining capacity, such as the SOC or a similar charge counter. Due to the computational cost and inaccuracy of SOC modeling, the proposed model was trained without SOC as an input feature. To bypass the issue of modeling the SOC, some current integration methods have been applied in the literature. The side effect of this is that either the initial SOC or some extra information about the battery state must be known, such as whether the battery is fully charged or fully discharged. This extra information and current integration over time provides a kind of SOC to the net.
The operation strategy of a 48 V system in automotive applications pre-defines a volatile battery operating area. As stationary states are rare, an adaptable model has to be designed. Therefore, the input features of the model were selected as:
• Battery temperature $T_t$,
• Terminal current $I_t$,
• Terminal voltage $U_t$, and
• Voltage trend $U_{\mathrm{trend}}$.
Table 2 shows an exemplary sequence (with a sequence length of 10) for the target value $U_{\mathrm{pred}}$ and the corresponding input features. The calculation of $U_{\mathrm{trend}}$ contains no new information but, as an input, it ensures faster and more precise convergence when training the model. The average voltage over the last few steps provides information about which voltage level the battery is actually operating at. Furthermore, when combined with the previous current and voltage, it can provide the trend of overvoltage polarization. The current and temperature determine the overvoltage polarization in the next step. When predicting more than one step at once, $U_{\mathrm{trend}}$ is recalculated every minute by using the last sequence of predicted voltages. The period of 60 s ensures that small errors in voltage prediction are not fed directly to the next step as input. Updating $U_{\mathrm{trend}}$ in every iteration could result in a drift of voltage and, therefore, an increasing error.

Table 2. Prediction scheme with sequence length of ten.

| Step | Temperature | Current | Voltage trend | Voltage |
|------|-------------|---------|---------------|---------|
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 6 | $T_6$ | $I_6$ | $U_{\mathrm{mean}}(U_0{:}U_8)$ | $U_6$ |
| 7 | $T_7$ | $I_7$ | $U_{\mathrm{mean}}(U_0{:}U_8)$ | $U_7$ |
| 8 | $T_8$ | $I_8$ | $U_{\mathrm{mean}}(U_0{:}U_8)$ | $U_8$ |
| 9 | $T_9$ | $I_9$ | $U_{\mathrm{mean}}(U_0{:}U_8)$ | $U_{\mathrm{pred}}$ |

The input feature set used provides all of the information needed for learning, whereby the future voltages of a battery can be predicted. This approach has two advantages. First, explicit battery modeling knowledge from experts is not necessary to model the behavior. Feature selection and hyperparameter tuning are implemented and can be adapted to any lithium-ion battery by simple data preparation and model adjustments. Second, an online algorithm is provided which is applicable in every condition during operation, with a short initializing time for the sequence length.
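The update schedule of the voltage trend during a multi-step prediction can be illustrated with the following sketch. It only shows how a held trend value is refreshed every 60 s (600 steps at 10 Hz) from the mean of the preceding 128 voltages. A precomputed voltage series stands in for the predictions, which in a recursive prediction would be generated step by step by the model; the function name `held_voltage_trend` is hypothetical.

```python
import numpy as np

def held_voltage_trend(voltages, seq_len=128, update_every=600):
    """Voltage trend used at each step after the initial sequence.

    The trend is initialized as the mean of the first `seq_len` voltages and is
    then held constant, except for a refresh every `update_every` steps, when it
    is recomputed as the mean of the `seq_len` voltages preceding that step.
    """
    voltages = np.asarray(voltages, dtype=float)
    trend = np.empty(len(voltages) - seq_len)
    current = voltages[:seq_len].mean()
    for k in range(len(trend)):
        step = seq_len + k
        if k > 0 and k % update_every == 0:
            current = voltages[step - seq_len:step].mean()
        trend[k] = current
    return trend

# Synthetic, slowly rising voltage as a stand-in for a predicted profile.
voltages = np.linspace(44.0, 46.0, 2000)
trend = held_voltage_trend(voltages)
```

Holding the trend between refreshes decouples it from the error of any single predicted step, which is the motivation given above for the 60 s period.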

Training Progress
The input data obtained from the pre-processing steps were divided into batches, in order to reduce the memory size required for each training iteration. An epoch is considered complete when every batch has been processed once in an iteration. Figure 3a shows the loss and Figure 3b shows the validation loss over the trained epochs. As the loss decreases steadily, the model keeps on learning the input data inter-relations. Overfitting is indicated by an increasing validation loss beyond Epoch 60. The loss metrics of the test set had their lowest point in Epoch 35.

Proposed Model Architecture
The input and output layers of the neural net were determined by the feature selection and sequence length, as shown in Figure 4. This approach used four features with a sequence length of 128 and, thus, the input of the first layer was a 128 × 4 matrix for each time step. The output was simply the predicted voltage for the next step. Figure 5 demonstrates that the LSTM cells had better validation results, compared to those of the RNN and GRU. From Epoch 30 onwards, the LSTM had lower MSE and performed slightly better than the RNN, with regard to the mean max error. In addition, the training time per epoch was three times faster when using the LSTM, compared to the RNN, and similar to that of the GRU. The hidden layers were determined empirically as two LSTM layers with 128 neurons each and one attached dense layer with 128 neurons. An additional dropout layer with a dropout rate of 0.2 was also used, in order to deal with overfitting issues. The model hyperparameters shown in Table 3 were determined empirically for the proposed models using a grid search algorithm.
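The size of the described architecture can be estimated by counting its trainable parameters, as sketched below. The count assumes standard LSTM and dense layers with bias terms and a final single-neuron output layer for the predicted voltage; the output layer and the helper names are assumptions, as only the hidden layers are specified above.

```python
def lstm_params(n_inputs, n_units):
    # Four blocks (forget, input, output gate and candidate), each with
    # input weights, recurrent weights and a bias vector.
    return 4 * (n_units * (n_inputs + n_units) + n_units)

def dense_params(n_inputs, n_units):
    # Weight matrix plus bias vector.
    return n_inputs * n_units + n_units

n_features, n_units = 4, 128
layers = {
    "lstm_1": lstm_params(n_features, n_units),  # 68,096
    "lstm_2": lstm_params(n_units, n_units),     # 131,584
    "dense": dense_params(n_units, n_units),     # 16,512
    "output": dense_params(n_units, 1),          # 129 (assumed output neuron)
}
total_params = sum(layers.values())              # 216,321
```

The dropout layer adds no parameters, and the sequence length of 128 affects the computation time per sample but not the parameter count, as the LSTM weights are shared across time steps.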

Validation Using Test Bench Measurements
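A comparison between the model prediction and test bench measurements was carried out, in order to validate the proposed model. The error was quantified by calculating the MAE and MSE, as described in Equations (15) and (16), respectively. The error at each step, $\mathrm{err}(t)$, was calculated using the ground truth $\mathrm{gt}(t)$ and the values predicted by the model, $\mathrm{pred}_{\mathrm{model}}(t)$, where $n$ denotes the number of steps:

$$\mathrm{err}(t) = \mathrm{gt}(t) - \mathrm{pred}_{\mathrm{model}}(t)$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left|\mathrm{err}(t)\right| \tag{15}$$

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \mathrm{err}(t)^2 \tag{16}$$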
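Both error metrics can be computed directly from the measured and predicted voltage traces, as in the following sketch. The four voltage values are purely illustrative and are not measurement data.

```python
import numpy as np

def mae(gt, pred):
    """Mean absolute error, in the unit of the signal (V)."""
    return float(np.mean(np.abs(np.asarray(gt) - np.asarray(pred))))

def mse(gt, pred):
    """Mean squared error, in the squared unit of the signal (V²)."""
    return float(np.mean((np.asarray(gt) - np.asarray(pred)) ** 2))

gt = np.array([46.0, 45.5, 44.0, 47.0])    # illustrative ground truth in V
pred = np.array([46.5, 45.5, 43.0, 47.5])  # illustrative predictions in V
example_mae = mae(gt, pred)  # 0.5 V
example_mse = mse(gt, pred)  # 0.375 V²
```

As the example shows, a single larger deviation (here 1 V) raises the MSE proportionally more than the MAE, which makes the MSE the more sensitive indicator for errors at current peaks.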
Unlike the training and validation datasets, the test set was measured on a test bench, where the temperature and current were adjustable. This ensured that a wide range of power, SOC and temperature could be tested. To achieve this, a current profile was recorded during a vehicle test drive and adapted to the adjusted temperature and SOC, such that the voltage boundaries of the battery were reached. Batteries display non-linear behavior near the SOC and voltage limits, due to the steepening OCV and rising internal resistances. Adapting the current profiles allows access to these hard-to-predict areas, thus ensuring more exact validation. The test bench comprises a climate chamber to condition the battery temperature, a power supply and electric load to apply the current profiles and a computer to control the BMS and plot the data. Before starting a measurement, the batteries were first conditioned electrically to the requested state of charge by completely discharging a fully charged battery. Thermal conditioning was then carried out for at least 12 h to ensure a fully tempered battery. Concatenated current profiles for the four temperature regions are shown in Figure 6. The current was selected to meet the requirements of the test bench and the voltage limits, in consideration of temperature-dependent overvoltages.
Figure 7 presents a validation of the current profile applied at an average temperature of −23°C in low SOC regions. The model had difficulties predicting these operating points. As Figure 7a shows, the model was nevertheless capable of predicting the qualitative progression of the voltage, with an MSE of 1.18 V². The prediction differed from the ground truth at high current rates, because the gradient of the internal resistance of the battery cell increases at lower temperatures. Figure 7b shows the maximum error (of 3.9 V) occurring during a high current peak.
The better voltage prediction closer to room temperature is shown in Figure 8a. The maximum error was less than 1.1 V with an MSE of 0.19 V² within this profile, as shown in Figure 8b. Better predictions resulted from a less volatile voltage course and better data availability. This temperature-dependent error is similar to that occurring in common ECMs.
Three different current profiles were applied at different charge states for every temperature region. The maximum and minimum currents were restricted at −25°C, such that the lower and upper voltage limits of 38 and 53 V were not exceeded. There was no current restriction at 25°C, on account of the lower internal resistance. The measured profiles were used as the input for the prediction, in order to obtain an equivalent input profile for validation. The model received the current and temperature as inputs, while the first 128 voltage values were used to calculate the initial $U_{\mathrm{trend}}$. $U_{\mathrm{trend}}$ was then updated every 60 s, based on the last 128 predicted voltage values.
Each adapted profile was evaluated individually, in order to determine the deficiencies of the model. As regards the voltage level of 48 V, the overall maximum relative error is below 1% at all conditions. The error metrics are summarized in Table 4.

Conclusions
A novel battery modeling approach using an LSTM is proposed for a lithium-ion battery in vehicle applications. The calculated $U_{\mathrm{trend}}$ and the balanced input dataset make it possible to train a model without using the SOC and its associated difficulties. Two steps were carried out to achieve a valid model. First, the raw experimental data were pre-processed to obtain a useful input vector. With under- and oversampling, redundant data were reduced and under-represented areas were reproduced, respectively. Sequentializing and normalizing permitted adequate training. Second, hyperparameter tuning was carried out during training, in order to find the optimal model architecture.
Validation showed that the model accuracy was within an appropriate range. The maximum error in the validation set was below 1% (or 3.9 V) with an MSE of 0.40 V² over a temperature range from −23 to 56°C and power up to 11 kW. The proposed battery model can be used in real-time applications, as the model inputs are physically measurable parameters. This ensures the possibility of simple and accurate implementation in a BMS. Its accuracy and transferability to other battery types mean that the developed modeling method can be used in a variety of mild-hybrid vehicles.
Future work may involve battery SOP estimation based on the proposed voltage prediction. A more robust voltage prediction in the peripheral SOC areas must, thus, be trained.