State of Charge Estimation of Lithium-Ion Batteries Using Stacked Encoder–Decoder Bi-Directional LSTM for EV and HEV Applications

Energy storage technologies are increasingly used in industrial applications and in automobiles. Battery state of charge (SOC) is an important metric to monitor in these applications to ensure proper and safe functionality. Since SOC cannot be measured directly, this paper puts forth a novel machine learning architecture to improve on existing methods of SOC estimation. The method combines a stacked bi-directional LSTM with an encoder–decoder bi-directional long short-term memory architecture. This architecture, henceforth referred to as SED, is implemented to overcome the nonparallel functionality observed in traditional RNN algorithms. Estimations were made using open-source datasets covering the urban dynamometer driving schedule (UDDS), highway fuel efficiency test (HWFET), LA92 and US06 drive cycles. The lowest mean absolute error observed was 0.62% at 25 °C for the HWFET condition, which confirms the good functionality of the proposed architecture.


Introduction
In the 21st century, the global automobile industry is moving to electric vehicles. Government-backed policies, such as the United States government pledging a 50% reduction in transport-sector CO2 emissions by 2030, have given a more legitimate reason to switch to electric vehicles and have led to an ever-growing demand for hybrid and electric vehicles [1,2]. The automobile racing industry has also switched from highly inefficient V8 power systems to more efficient hybrid V6 engines in the Formula 1 championship and to fully electric drive systems in the Formula E championship. With the increased demand for electric and hybrid electric vehicles, manufacturers are looking for ways to accurately monitor and estimate parameters such as the state of charge and state of health of the battery packs used in these vehicles. A hybrid electric vehicle (HEV) has an electric motor working in conjunction with a conventional engine; because of this, HEVs achieve better fuel economy. Fuel economy in plug-in hybrid electric vehicles (PHEVs) and all-electric vehicles (EVs) is calculated differently because of the apparent change in the propulsion systems. The two common metrics for fuel economy in PHEVs and EVs are miles per gallon of gasoline equivalent (mpge) and kilowatt-hours (kWh) per 100 miles. According to the United States Department of Energy, the 2018 Accord Hybrid has an EPA combined city and highway fuel economy of 47 miles per gallon, while the estimate for a Honda Accord (petrol) with a four-cylinder internal combustion engine is 33 miles per gallon.
Unlike a conventional car, where fuel gauging is completed using a float connected to a resistor in the fuel tank and fuel is gauged depending on the position of the float, it is apparent that this system of fuel gauging is not useful in an electric car. Hence, electric vehicles must instead estimate the remaining charge algorithmically. The main contributions of this work are as follows:
(1) A novel hybrid architecture using a stacked Bi-LSTM and an encoder-decoder Bi-LSTM is used to estimate SOC at varying temperatures. The network takes advantage of the bi-directional functionality of Bi-LSTMs to capture sequential tendencies more accurately and provide a more accurate SOC sequence. By providing a SOC sequence estimate as opposed to single-value SOC estimates, the trend in battery capacity and battery state in real-world scenarios can be more effectively monitored.
(2) The stacked Bi-LSTM was built with deep structures to take advantage of deep neural network architectures, and the Bi-LSTM units aid in capturing temporal dependencies from the forward and backward directions. Since the encoder and decoder blocks are trained simultaneously, the training time of the network is also reduced.
(3) The model is tested on a standard open-source lithium-ion battery dataset, on which the proposed network performs better than similar pre-existing architectures. Experimental testing proves that the SED network can accurately estimate the SOC sequence at varying temperatures given current, voltage and temperature measurement sequences. A mean absolute error (MAE) of 0.62% was observed for HWFET conditions at varying ambient temperatures, which shows the proper functionality of the proposed network.
The paper is structured as follows: Section 2 describes the lithium-ion battery (LIB) data and the performance metrics used in the study. Section 3 introduces the proposed network architecture and the basic principles of stacked Bi-LSTM and encoder-decoder architectures. Section 4 provides a detailed analysis of the performance of the proposed network, and concluding remarks are given in Section 5.

Battery Management Systems
A BMS is defined as a system that monitors and controls various parameters indicative of battery state of health, such as state of charge, instantaneous available power and battery capacity. A BMS is essential to obtain the maximum efficiency from the battery pack in terms of maximum available charge and battery life. A BMS must perform a set list of tasks every measurement interval for proper functionality of the battery pack; these tasks include SOC estimation, SOH estimation, maximum-available-charge calculation and cell equalization. Figure 1 illustrates a basic BMS task flow together with various SOC estimation techniques that could be used.

Most BMS modules use real-time voltage (V), current (I) and temperature (T) measurements to perform the tasks mentioned above. General topologies used to implement BMS architecture are centralized, distributed and modular [21].

State of Charge Estimation Techniques
The need for accurate state of charge (SOC) estimation is paramount in HEV and EV applications. Battery state of charge can be defined as the amount of charge left in a battery; unlike in a gas-powered car, the remaining charge in the battery cannot be measured directly but has to be estimated using algorithms that predict the available power from voltage, current and temperature sensor data. Battery SOC can be estimated using different methods; the most basic algorithm is the Coulomb counting method [22,23], where the total amount of energy entering and leaving the battery is monitored. This is implemented using the following equation

$Z(t) = Z(0) - \frac{1}{C_n} \int_0^t \eta_i \, i(\tau) \, d\tau$

where Z(0) is the initial SOC, η_i is the Coulombic efficiency, i(t) is the current and C_n is the battery capacity. The Coulomb counting method provides an accurate SOC estimate, but achieving this requires a very precise initial SOC estimate. Since the current is integrated over time, any current sensor error accumulates, leading to a massive drop in accuracy.
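The Coulomb counting equation above can be sketched in a few lines of Python. The cell capacity, sample rate and sensor bias used below are hypothetical values for illustration, not parameters from this study:

```python
import numpy as np

def coulomb_count(z0, current, dt, capacity_ah, eta=1.0):
    """SOC by Coulomb counting: Z(t) = Z(0) - (1/Cn) * integral of eta*i dt.

    current: measurements in A (positive = discharge), dt: sample period in
    seconds, capacity_ah: cell capacity in Ah, eta: Coulombic efficiency.
    """
    coulombs = capacity_ah * 3600.0                # capacity in ampere-seconds
    return z0 - np.cumsum(eta * current) * dt / coulombs

# Hypothetical 1 A constant discharge of a 2.7 Ah cell for one hour
i = np.ones(3600)                                  # 1 A, sampled every 1 s
soc = coulomb_count(1.0, i, dt=1.0, capacity_ah=2.7)
print(round(soc[-1], 3))                           # 0.63: 1 Ah drawn from 2.7 Ah

# A constant 10 mA sensor bias accumulates into a visible SOC error
soc_biased = coulomb_count(1.0, i + 0.01, dt=1.0, capacity_ah=2.7)
print(round(soc[-1] - soc_biased[-1], 4))          # 0.0037 drift from bias alone
```

The second run illustrates the drawback stated above: the bias itself is tiny, but integrating it over an hour produces a steadily growing SOC error.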
Other SOC estimation techniques, such as fuzzy-logic-based SOC estimation [24][25][26], the open circuit voltage (OCV) method [27] and impedance-spectroscopy-based methods [28], can be used, but they suffer from issues such as the long time required to obtain OCV measurements; moreover, different battery chemistries, temperatures and age affect the OCV [20]. Impedance spectroscopy models differ between battery chemistries and are highly dependent on experimental conditions [29]. Support vector machines can also be used for SOC estimation [30][31][32]. The lack of robustness of these methods makes them poor candidates for a good SOC estimation algorithm.
Another estimation technique that can be employed other than Artificial Neural Network (ANN) models is battery model based SOC estimation techniques such as Kalman filtering [33], extended Kalman filtering (EKF) [34], etc. The EKF method provides a very precise and robust SOC estimate [35]. Although methods such as EKF provide accurate SOC estimates, they are highly dependent on a very accurate battery model; for instance, in [36], a Kalman filtering example is provided using a simple linear circuit that is considered as a model of the battery, but for extremely precise estimates of SOC, a far more complicated and accurate battery model is required [37]. Apart from the dependency on accurate battery models, the EKF method requires a high computational cost after integration compared to other methods [20].
To overcome these challenges and provide an equally precise SOC estimate, ANN-based algorithms have been put forth. In this paper, a Bi-LSTM-based SOC estimation technique is discussed, and results are provided which prove the effectiveness of ANN-based algorithms in SOC estimation for HEV applications.

SOC Estimation Requirements in HEV Applications
The requirements for BMS change for different applications. The very nature of the functionality of HEVs makes SOC estimation very tricky compared to fully electric vehicles (EV). It is important to understand the difference in drive systems for EVs and HEVs.
EVs have relatively simple drive systems compared to HEVs. In an EV, the power from the battery pack is supplied to the motors through an inverter. Figure 2 shows a distributed multi-motor electric vehicle drive system [38]. The system consists of two DC motors driving the front and rear wheels through the front and rear differentials. Regenerative charging can be achieved while braking. The power flow during operation is quite simple: from the start of acceleration, energy flows from the battery to the motors, and the direction reverses only when regenerative braking occurs.
HEVs often have more complex drive systems compared to EVs. Figure 3 shows a simplified version of the drive system found in a Toyota Prius. Note that the system is now only a front wheel drive (FWD) or rear wheel drive (RWD) instead of an all-wheel drive (AWD), as shown in Figure 2. The need for a differential is negated using a planetary gear system, which helps transfer the power to the drive wheels from multiple sources. Because of the hybrid architecture, the power draw from the battery changes during various operating conditions.
• During acceleration at the start, most of the workload is carried by the electric system because of the incredible torque provided by the electric motor. The power draw from the battery is the maximum during this phase. In most HEVs, the ICE is entirely shut down while starting from a complete stop.
• During normal conditions, the power draw from the battery is reduced massively, and power coming from the ICE is split to drive the generator and the wheels. The generator is in turn used to power the electric motors.
• During sudden changes to the vehicle's momentum, i.e., sudden acceleration or deceleration, power from the battery is either drawn to support the ICE output or the electric motors are used as generators and the battery pack is charged while regenerative braking is performed.
• During charge conditions, the battery pack can be charged using the ICE output to drive the generator. The battery charge levels are monitored to maintain a minimum level of charge.
Because of these fluctuating power draws from the battery, a far more accurate and sophisticated SOC estimator is required in HEV applications.
Hybrid electric vehicles require more complex battery and battery management systems because of the extremely dynamic rate profiles seen in HEVs compared to BEVs and personal electronics [37]. HEVs and BEVs also require high currents in the magnitude of 20C, which means the cell chemistry is never at equilibrium and leads to the need for very robust SOC estimators. A simple SOC estimator such as Coulomb counting can be employed in personal electronics because of the almost constant current draw that the battery experiences there, but this kind of SOC estimator is not preferred in HEVs because of the dynamic current draw from the battery.
A good SOC estimator has the advantage of prolonging the lifetime of the battery pack, since a precise SOC estimate can be exploited aggressively to control overcharging or undercharging. It also improves battery pack designs and allows the use of fewer cells, which ultimately saves on the weight, size and price of the battery pack. So, a precise SOC estimator is not only required because of the challenges posed by HEV architectures but also has added benefits, which are essential in making this technology viable for practical use.

Materials and Methods
A Panasonic 18650PF cell is used to collect data for this study. The data, provided by Dr. Phillip Kollmeyer at the University of Wisconsin-Madison, is available online. The cell has a rated capacity of 2700 mAh at 20 °C, and other vital battery specifications are listed in Table 1. The acquired battery datasets involve four dynamic tests, namely US06, urban dynamometer driving schedule (UDDS), highway fuel efficiency test (HWFET) and LA92, that are used to simulate the power profile of battery packs in EVs and HEVs. A series of nine drive cycles is performed in the order Cycle 1, 2, 3 and 4, US06, HWFET, UDDS, LA92, Neural Network. Cycles 1 through 4 consist of a random mix of data from US06, HWFET, UDDS and LA92. The power profile of the drive cycles was calculated for a Ford F150 truck with a 35 kWh battery pack and scaled for a single 18650PF cell. Training and testing of the proposed network were completed using these four dynamic tests. The acquired datasets are trained and tested on Python v3.1 using the PyCharm IDE. The computer used to run the software is a Windows workbench with an Intel Xeon processor and a 12 GB NVIDIA Titan Xp GPU. The main measurements taken from the datasets are the voltage (V), current (I) and temperature (°C). As clearly observed from Figure 4d, battery discharge capacity decreases as temperature decreases, so it is important to include temperature data. Temperature data at −10 °C, 0 °C, 10 °C and 25 °C were used while training and testing the proposed model. The temperature is not constant but fluctuates within a range around an initial setpoint at the above-mentioned temperatures. This poses more of a challenge while estimating SOC but provides an accurate representation of a real-world scenario.
The 25 °C temperature data in US06 conditions fluctuate within a range of 25 to 32.7 °C with an average temperature of 29 °C, the 10 °C data range from 33 to 10 °C with an average temperature of 15.92 °C, the 0 °C data range from 12 to 0 °C with an average temperature of 6.9 °C and the −10 °C data range from 6.6 to −10 °C with an average temperature of −0.1 °C. Two temperatures are provided in the original datasets: the battery case temperature at the middle of the battery, measured using a thermocouple, and the temperature of the chamber the battery is placed in. The chamber temperature is controlled using an 8 cu. ft thermal chamber. For this study, the battery case temperature is used, which varies over time. Figure 4c illustrates the fluctuations in the temperature data for the US06 condition. Within the original dataset, tests such as drive cycles were considered important, and the data were recorded every 0.1 s; i.e., we have a time step of 0.1 s during the discharge working condition. Other processes such as charging and pauses were considered secondary and have a lower data rate of 60 s. One can reduce the data rate to a constant 1 s in both conditions to lower the computational power required during testing and training. We can also up-sample the data to achieve a 0.1 s data rate for the charge working condition.
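The down-sampling and up-sampling options described above can be sketched as follows. The time span and voltage values are illustrative only, not taken from the Panasonic dataset:

```python
import numpy as np

# Hypothetical 60 s charge-phase voltage log (values are illustrative only)
t_slow = np.array([0.0, 60.0, 120.0, 180.0, 240.0])   # 60 s data rate
v_slow = np.array([4.20, 4.19, 4.18, 4.18, 4.17])

# Up-sample to the 0.1 s rate used for drive cycles via linear interpolation
t_fast = np.linspace(0.0, 240.0, 2401)                # 0.1 s grid
v_fast = np.interp(t_fast, t_slow, v_slow)

# Down-sampling 0.1 s data to a constant 1 s rate is plain decimation
v_1s = v_fast[::10]
print(v_fast.shape, v_1s.shape)                       # (2401,) (241,)
```

A real pipeline might use a proper resampling filter before decimating; plain slicing is shown here only to make the data-rate trade-off concrete.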
In this paper, a constant data rate of 0.1 s is considered for greater accuracy at the cost of computation. As seen in Figure 4a,b, current and voltage fluctuate considerably within a given dataset. These massive fluctuations create issues while training the algorithm. Voltage, current and temperature are all measured on different scales and therefore do not contribute equally during model fitting and learning, creating an unwanted bias. To overcome this issue, the input measurements are normalized between 0 and 1. Normalizing the datasets not only eliminates the issue discussed above but also speeds up the learning process of the algorithm. In this paper, the normalization is completed using a min-max scaler function

$x_{scaled} = \frac{x_n - x_{min}}{x_{max} - x_{min}}$

where x_n is the nth measurement in the dataset, x_min and x_max are the minimum and maximum values of that measurement, and x_scaled is the normalized value in the range (0, 1).
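The min-max scaling step can be written directly from the expression above; the voltage values below are hypothetical:

```python
import numpy as np

def min_max_scale(x):
    """Scale measurements to the (0, 1) range: (x - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Voltage, current and temperature live on very different scales; after
# scaling, each channel contributes comparably during model fitting.
voltage = np.array([4.2, 3.7, 3.0, 2.5])
scaled = min_max_scale(voltage)
print(scaled)            # [1.0, ~0.706, ~0.294, 0.0]
```

In practice the scaler parameters are fit on the training set and reused unchanged on the test set, so the network never sees test-set statistics.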

Performance Metrics
The performance of the proposed network is evaluated using the following performance metrics.

Mean Absolute Error
MAE is the arithmetic average of the absolute errors when comparing two separate outcomes describing the same process; in this case, the actual values (y_k) are compared with the predicted values (ŷ_k):

$MAE = \frac{1}{N} \sum_{k=1}^{N} |y_k - \hat{y}_k|$

where N is the total number of timesteps available. MAE serves as a good metric for interpreting the network, since it has been shown to be a natural measure of average error.

Root Mean Square Error
RMSE is the square root of the average of the squared errors between the actual and predicted values. It is better at exposing small variances and large outliers in the errors than the Mean Absolute Error; since the errors are squared, RMSE exaggerates any large errors. The expression used to calculate RMSE is

$RMSE = \sqrt{\frac{1}{N} \sum_{k=1}^{N} (y_k - \hat{y}_k)^2}$

where ŷ_k is the predicted value, y_k is the actual value and N is the total number of timesteps. RMSE tends to increase with sample size. Since the sample sizes of the datasets used in this study are approximately equal, RMSE values can be compared across datasets.
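Both metrics follow directly from their definitions; the SOC sequences below are hypothetical fractions, not results from the study:

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error over N timesteps."""
    return np.mean(np.abs(np.asarray(y) - np.asarray(y_hat)))

def rmse(y, y_hat):
    """Root mean square error; squaring exaggerates large outliers."""
    return np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2))

# Hypothetical actual vs. predicted SOC sequences
y_true = [0.90, 0.80, 0.70, 0.60]
y_pred = [0.91, 0.79, 0.72, 0.60]
print(round(mae(y_true, y_pred), 4))    # 0.01
print(round(rmse(y_true, y_pred), 4))   # 0.0122
```

Note how the single 0.02 miss inflates RMSE above MAE, which is why the two metrics are reported together.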

Long Short-Term Memory
LSTM was proposed by Hochreiter and Schmidhuber in 1997 as a new gradient-based method to overcome the drawbacks of recurrent backpropagation algorithms [39]; it achieves this by using memory cells instead of hidden nodes. LSTMs use gate units that learn to open and close access to the constant error flow, which eliminates the vanishing or exploding gradients seen in back-propagation networks or general RNNs. The basic structure of an LSTM unit can be seen in Figure 5. LSTM units calculate values for all the gates involved, namely the forget gate (f_t), input gate (i_t), output gate (o_t) and cell memory (c_t), for every forward pass at time t. These calculations can be summarized as follows:

$f_t = \sigma(W_f x_t + U_f a_{t-1} + b_f)$
$i_t = \sigma(W_i x_t + U_i a_{t-1} + b_i)$
$o_t = \sigma(W_o x_t + U_o a_{t-1} + b_o)$
$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c a_{t-1} + b_c)$
$a_t = o_t \odot \tanh(c_t)$

where W, U and b are the weight matrices and bias parameters, which are learned during training. The sigmoid function (σ) is bounded between 0 and 1 and is hence well suited to the forget, input and output gate calculations, since its output can be interpreted within the LSTM unit as a "forgetting factor". While training the network, if the value of the input gate or forget gate is close to 0, it is interpreted as a non-essential input or non-essential previous memory and is eliminated.
In an LSTM unit, the forget gate, output gate and input gate depend on the present input (x_t) and the previous output (previous activation) (a_{t−1}); the cell memory is influenced by the previous memory (c_{t−1}), forget gate and input gate. The overall output (a_t) considers all the gates and the cell memory.
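A single LSTM forward step can be sketched in numpy directly from the gate calculations above. The dictionary-of-matrices layout, layer sizes and random initialization below are illustrative choices, not the paper's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, a_prev, c_prev, W, U, b):
    """One LSTM forward pass; W, U, b are dicts keyed by gate ('f','i','o','c')."""
    f_t = sigmoid(W['f'] @ x_t + U['f'] @ a_prev + b['f'])   # forget gate
    i_t = sigmoid(W['i'] @ x_t + U['i'] @ a_prev + b['i'])   # input gate
    o_t = sigmoid(W['o'] @ x_t + U['o'] @ a_prev + b['o'])   # output gate
    c_t = f_t * c_prev + i_t * np.tanh(W['c'] @ x_t + U['c'] @ a_prev + b['c'])
    a_t = o_t * np.tanh(c_t)                                 # overall output
    return a_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4          # e.g. a [V, I, T] input and 4 hidden units
W = {k: rng.standard_normal((n_hid, n_in)) * 0.1 for k in 'fioc'}
U = {k: rng.standard_normal((n_hid, n_hid)) * 0.1 for k in 'fioc'}
b = {k: np.zeros(n_hid) for k in 'fioc'}

a, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
print(a.shape, c.shape)     # (4,) (4,)
```

Because a_t is a sigmoid multiplied by a tanh, every component of the output stays strictly inside (−1, 1), which is what keeps the recurrence numerically stable.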

Bi-Directional LSTM
A bi-directional LSTM is an extension of the unidirectional LSTM consisting of a forward pass and a backward pass. Figure 6a shows the general structure of a stacked bi-directional LSTM; the structure of the Bi-LSTM allows the network to use both backward and forward information [18]. Bi-LSTMs achieve this by using two hidden layers that process the input data in the forward and backward directions. These hidden sequences are then fed to the same output layer. Bi-LSTMs use the forward and backward sequences and update the output sequence using the following equations:

$fh_t = \mathrm{LSTM}(x_t, fh_{t-1}), \quad t = 1, \dots, N$
$bh_t = \mathrm{LSTM}(x_t, bh_{t+1}), \quad t = N, \dots, 1$
$y_t = W_{fh}\, fh_t + W_{bh}\, bh_t + b_y$

where the W terms are weight matrices and b_y is a bias parameter. The backward and forward layers are iterated by feeding the network from t = N to 1 for the backward layer and from t = 1 to N for the forward layer. As stated before, since the network has information regarding the previous and future sequences, it is ideal for use in state-of-charge estimation.
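The forward and backward iteration can be illustrated with a minimal recurrence; a plain tanh update stands in for the LSTM unit here purely to keep the sketch short, and all shapes are hypothetical:

```python
import numpy as np

def rnn_pass(xs, W, U, b):
    """Minimal tanh recurrence standing in for an LSTM layer (sketch only)."""
    h = np.zeros(U.shape[0])
    hs = []
    for x in xs:
        h = np.tanh(W @ x + U @ h + b)
        hs.append(h)
    return np.stack(hs)

rng = np.random.default_rng(1)
n_in, n_hid, N = 3, 4, 5
xs = rng.standard_normal((N, n_in))
W = rng.standard_normal((n_hid, n_in)) * 0.1
U = rng.standard_normal((n_hid, n_hid)) * 0.1
b = np.zeros(n_hid)

fh = rnn_pass(xs, W, U, b)              # forward layer, t = 1..N
bh = rnn_pass(xs[::-1], W, U, b)[::-1]  # backward layer, t = N..1, re-aligned
y = np.concatenate([fh, bh], axis=1)    # both directions feed the output layer
print(y.shape)                           # (5, 8)
```

The key detail is the re-alignment `[::-1]` after the backward pass, so that the forward and backward states concatenated at each timestep refer to the same instant.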

Proposed Stacked Encoder-Decoder Bi-LSTM
The proposed network uses a stacked bi-directional LSTM block in combination with an encoder-decoder network. Encoder-decoder architectures have been used in various applications such as language processing [40,41], trajectory estimation [42] and SOC estimation [19]. The encoder-decoder network works by feeding the input sequence to an encoder block and estimating the probability distribution of the nth sample of the output sequence (s_n) using a decoder block. The structure of an encoder-decoder architecture is shown in Figure 7. The input sequence u_1, ..., u_N is passed through the encoder block, which generates a cell state vector (c_N) after N recursive steps. The cell state vector consists of the hidden states summarized by the encoder block, which can be given as c_N = m(h_1, ..., h_N). Since the recurrent blocks within the encoder-decoder network can be exchanged for other types of recurrent blocks such as SRNN, GRU or LSTM [19], a Bi-LSTM block is used in this study. Since Bi-LSTMs are bi-directional, the hidden states generated by the encoder block are a concatenation of forward (fh_i) and backward (bh_i) hidden states. The state vector can now be represented as c_N = m((fh_1, bh_1), ..., (fh_N, bh_N)), where m is a non-linear function. The network aims to model the conditional probability of the output sequence given the input sequence:

$P(s_1, \dots, s_N \mid u_1, \dots, u_N) = \prod_{n=1}^{N} P(s_n \mid c_N, s_1, \dots, s_{n-1})$

where s_i is the output sequence, u_i is the input sequence and N is the number of time steps in the input data. The cell state vector of the encoder block becomes the initial state of the decoder block, c_0 = c_N, and the decoder generates a probability distribution for the nth sample of the output sequence given the decoder state for the previous sample (c_{n−1}) and the previous sample of the output sequence (s_{n−1}).
$P(s_n \mid c_N, s_1, \dots, s_{n-1}) = g(s_{n-1}, c_{n-1}), \quad n = 1, \dots, N$

where g is the decoder function and N represents the number of time steps in the output sequence. Both the encoder and decoder blocks are trained together, and the decoder outputs the target sequence given the input sequence. The flow of the proposed algorithm is shown in Figure 6b. The hidden states of the stacked layer are fed into the encoder input. The encoder then processes the input sequence and forms the cell state vector to be used by the decoder to estimate the output sequence. The decoder uses the estimated previous sample (s_{n−1}) to estimate the present sample because the true sample values are unavailable to the decoder. This is the main limitation of the proposed architecture: since the decoder block does not have access to the true sample values and uses an estimated previous sample value to estimate the current sample, errors can propagate. The error propagation can be limited by using a beam search algorithm [42].
The input and target output sequences are divided into N/L sequences, each of length L, and fed into the network. The sequence length (L) is set to optimize the estimation accuracy of the network. To overcome the issue of vanishing gradients, a rectified linear unit (ReLU) activation function is used, and the output of the dense layer at the end can be given as

$\hat{s}_t = \mathrm{ReLU}(w\, h_t + b)$

where w is the weight matrix and b is the bias associated with the fully connected dense layer. Training the model consists of a forward pass and a backward pass. The network creates an estimated sequence and calculates the loss function. Mean Square Error, the average of the squared differences between the actual and estimated values, was used as the loss function in this study. The total loss is sent backwards through back-propagation to update the weights and biases accordingly. Back-propagation is performed using an adaptive moment optimizer, also known as the Adam optimizer. During testing, no back-propagation is performed, as the network has finished learning and the weights and biases are not updated. Furthermore, to avoid overfitting, a dropout layer can be implemented between the Bi-LSTM layers [19,43]. The proposed network utilizes the features of both the stacked Bi-LSTM and the ED architecture.
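Dividing the measurement stream into N/L sequences of length L can be sketched as follows; the stream shape and window length below are hypothetical, not the values tuned in the study:

```python
import numpy as np

def make_windows(data, L):
    """Split a measurement stream into N // L sequences of length L.

    Any trailing samples that do not fill a whole window are dropped.
    """
    data = np.asarray(data)
    n_seq = len(data) // L
    return data[: n_seq * L].reshape(n_seq, L, *data.shape[1:])

# Hypothetical [V, I, T] stream of 1000 timesteps, window length L = 100
stream = np.zeros((1000, 3))
windows = make_windows(stream, 100)
print(windows.shape)    # (10, 100, 3)
```

The same windowing is applied to the target SOC sequence so that each input window has a matching output window during training.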

Experimental Results and Discussion
Stacked Bi-LSTM and encoder-decoder Bi-LSTM (ED Bi-LSTM) are used to compare the functionality with the stacked encoder-decoder network (SED). The networks are tested across different temperature ranges (25 °C, 10 °C, 0 °C, −10 °C) of the datasets available. The hyperparameters of all the networks were kept constant, and no hyperparameter optimization was performed prior to testing; hyperparameter optimization can be completed as part of future development of the proposed network. Computational time was given priority while testing the networks, and all the hyperparameters were selected in a way that reduces the time taken to train and test the network. The SED Bi-LSTM network takes the longest to train compared to the other two networks, which is due to the deep architecture of the network. On average, the proposed network takes 1.5 to 2 h longer than stacked Bi-LSTM and 1 to 1.5 h longer than ED Bi-LSTM. Measures were taken to accelerate the training process in the SED network by normalizing the input data and training all the internal blocks in parallel. The number of iterations was limited to under 1000, and the number of trainable parameters, which affects the network width, was limited to less than 50,000. To further emphasize the advantages of the proposed network, a smaller number of trainable parameters is used in the SED network compared to ED Bi-LSTM and stacked Bi-LSTM. Figures 8 and 9 show the SOC prediction comparison of the different networks. All the evaluation metrics are computed by an error calculation function after testing the network. To check the proper functionality of the error calculation function, the software is run on a known set of estimated values whose error metrics were previously calculated. Table 2 shows the effects of varying the depth of the algorithm. The depth of the algorithm can be changed by changing the number of Bi-LSTM layers within the stacked and encoder-decoder blocks.
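The error calculation function and its sanity check can be sketched as follows. The function name and the hand-computed test values are ours; the paper only states that the metrics are MAE and RMSE and that the function is validated against a known estimate set.

```python
import numpy as np

def error_metrics(actual, estimated):
    """MAE and RMSE of a SOC estimate, expressed on the same
    percentage scale as the SOC values themselves."""
    err = np.asarray(actual, dtype=float) - np.asarray(estimated, dtype=float)
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return mae, rmse

# Sanity check against hand-computed values, mirroring the paper's
# practice of validating the error function on a known estimate set:
# errors (0, 0, 2) give MAE = 2/3 and RMSE = sqrt(4/3).
mae, rmse = error_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Validating the metric code on known values before evaluating the networks rules out the error function itself as a source of discrepancy between the compared architectures.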
It is evident from the results that the algorithm performs well when two Bi-LSTM layers are used throughout the architecture. Further increasing the model depth to three Bi-LSTM layers affects the algorithm negatively and the accuracy drops; this can be attributed to vanishing gradients. The depth of the algorithm was therefore set to two layers for further evaluations within this study. Table 3 shows the performance metrics for all the networks compared in the study at different temperatures. It is evident that the proposed network performs better in all the conditions compared to stacked Bi-LSTM and ED Bi-LSTM. The MAE and RMSE are lower in all the cases for the proposed network. The lowest MSE of less than 1% was observed under the UDDS condition while using the SED network. The error metrics increase under US06 conditions due to the high discharge currents involved, but the overall error for the SED network is still lower than that of the other two networks. The highest MAE of 1.97% and RMSE of 2.7% are observed in the US06 condition at 0 °C. From Figures 8 and 9, stacked Bi-LSTM performs the worst overall and takes the least amount of time to train among the networks tested. The SOC prediction error shown in the SOC prediction error plots for all the networks is the lowest for the UDDS condition at 25 °C, with an overall error within ±0.05 of the actual value at every timestep, and the highest for US06 conditions at 0 °C, with an overall error within ±0.15 of the actual value at every timestep.
The SED network performs the best, since the network depth is increased with the stacked layers and the encoder-decoder setup provides more accurate predictions in many-to-one sequence scenarios. The SOC prediction error plots show that the estimation error was the highest for stacked Bi-LSTM in almost all the cases. The effectiveness of the encoder-decoder architecture is enhanced by the stacked layers in the SED network, as reflected in the HWFET condition at 25 °C: stacked Bi-LSTM performs slightly better than ED Bi-LSTM in this condition, but the SED network outperforms both, even though the standalone ED architecture fails to do so. The lowest MAE of 0.62% and RMSE of 0.86% are observed in HWFET conditions at 25 °C. Overall, under UDDS conditions, the SED network performs the best, with MAE and RMSE of less than 1.5%. Under HWFET and LA92 conditions, the MAE and RMSE are less than 1.8% and 2%, respectively. Under the US06 condition, the MAE and RMSE are less than 2.8%, which is relatively higher compared to the other conditions because of the unstable cell equilibrium caused by the high discharge current. This affects the model prediction, since current, voltage and temperature characteristics are used to predict the SOC at a given timestep. LSTM layers and the ReLU activation function are used to overcome the issue of vanishing gradients, and limiting the number of parameters helps reduce the likelihood of overfitting. From the results discussed above, it can be concluded that the SED network performs the best, and the use of stacked layers combined with the encoder-decoder architecture leads to an improvement in performance throughout all the conditions tested in this study.

Conclusions
In this paper, a new sequence-to-sequence deep learning algorithm is proposed to improve on pre-existing SOC estimation techniques. A stacked encoder-decoder algorithm is introduced for hybrid electric vehicle applications. The major contributions are as follows. Firstly, the proposed algorithm improves on the previously established encoder-decoder architecture when SOC is estimated at varying temperatures. Secondly, the SED algorithm can learn from the measured data of lithium-ion batteries and directly estimate the SOC at varying temperatures. The algorithm analyzes the data sequentially and generates an SOC sequence based on the context of the input sequences. Thirdly, data processing techniques are implemented to reduce the time taken to train the algorithm and lower the computational load. The network depth study shows that the network performs best when two Bi-LSTM layers are used and worse when three Bi-LSTM layers are used in all the blocks within the network. This also shows a limitation of deep neural networks, which are prone to vanishing gradients when the network becomes too deep. The use of the encoder-decoder architecture helps in reducing the training time because the encoder and decoder blocks are trained simultaneously. Experimental results have shown that the proposed algorithm performs better than the standalone encoder-decoder architecture and the stacked bi-directional LSTM architecture. An MAE as low as 0.62% was observed while estimating the SOC at varying temperatures, which demonstrates the practicality of the proposed algorithm. Since the proposed network is not a model-based network, it can be implemented in various other applications for real-time SOC estimation, provided that training data are made available for the network. The SOC can be estimated while charging with a supercharger or a household outlet.
The proposed network can be used in an EV with a solar panel setup, since the battery pack undergoes conditions similar to the HEV environment, where the battery sustains short charging and discharging cycles. The algorithm can be further improved by implementing EKF-based hyperparameter estimation techniques for hyperparameter optimization and by using techniques such as the beam search algorithm to overcome the inherent drawback of the encoder-decoder structure. Finally, because of the very accurate SOC estimation obtained, it can be exploited aggressively to control over-charging or under-charging of the battery pack. This helps in reducing the number of batteries required within the battery pack and ultimately contributes to reducing its cost. In conclusion, the proposed algorithm improves on the existing SOC estimation techniques and is a good choice for EV and HEV BMS applications.