Article

On the Use of Machine Learning Methods for EV Battery Pack Data Forecast Applied to Reconstructed Dynamic Profiles

by Joaquín de la Vega 1, Jordi-Roger Riba 2,* and Juan Antonio Ortega-Redondo 1

1 Electronics Engineering Department, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
2 Electrical Engineering Department, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(20), 11291; https://doi.org/10.3390/app152011291
Submission received: 30 September 2025 / Revised: 15 October 2025 / Accepted: 17 October 2025 / Published: 21 October 2025
(This article belongs to the Special Issue AI-Based Machinery Health Monitoring)

Featured Application

The proposed methodology can be used to design and validate advanced battery management systems (BMS) for electric vehicles. By combining reconstruction strategies with recurrent neural network forecasting, the approach can reliably estimate the remaining time to depletion (RTD), even when sensor data is missing or corrupted. This contributes to safer vehicle operation, reduces range anxiety for end users, and supports predictive maintenance strategies in real-world mobility applications.

Abstract

Lithium-ion batteries are essential to electric vehicles, so it is crucial to continuously monitor and control their health. However, since today’s battery packs consist of hundreds or thousands of cells, monitoring all of them is challenging. Additionally, the performance of the entire battery pack is often limited by the weakest cell. Therefore, developing effective monitoring techniques that can reliably forecast the remaining time to depletion (RTD) of lithium-ion battery cells is essential for safe and efficient battery management. However, even in robust systems, this data can be lost due to electromagnetic interference, microcontroller malfunction, failed contacts, and other issues. Gaps in voltage measurements compromise the accuracy of data-driven forecasts. This work systematically evaluates how different voltage reconstruction methods affect the performance of recurrent neural network (RNN) forecast models trained to predict RTD through quantile regression. The paper uses experimental battery pack data based on the behavior of an electric vehicle under dynamic driving conditions. Artificial gaps of 500 s were introduced at the beginning, middle, and end of each discharge phase, resulting in over 4300 reconstruction cases. Four reconstruction methods were considered: a zero-order hold (ZOH), an autoregressive integrated moving average (ARIMA) model, a gated recurrent unit (GRU) model, and a hybrid unscented Kalman filter (UKF) model. The results reveal that the UKF model, followed by the GRU model, outperforms the alternative reconstruction methods. These models minimize signal degradation and provide forecasts similar to the original past data signal, thus achieving the highest coefficient of determination and the lowest error indicators. The reconstructed signals were fed into LSTM and GRU RNNs to estimate RTD, which produced confidence intervals and median values for decision-making purposes.

1. Introduction

The accelerated deployment of lithium-ion (Li-ion) batteries in electric vehicles (EVs) and energy storage systems (ESS) is supported by their superior electrochemical performance, including high gravimetric and volumetric energy density, increased cycle life, and declining production costs [1,2]. These characteristics establish Li-ion batteries as the backbone of modern electrification and grid integration initiatives. However, in the case of EVs, the dynamic operating conditions due to driving profiles, which involve rapid acceleration, regenerative braking, and load fluctuations, introduce significant complexity to monitoring and prediction tasks. Accurate prediction of the remaining discharge time is essential for two main reasons. First, it allows for precise range forecasting, which is crucial for meeting the end-user’s needs. Second, it facilitates effective battery management strategies within vehicles, ensuring optimal performance and safety. Advanced machine learning methods, such as recurrent neural networks (RNNs), can address the nonlinear and temporal dependencies of battery discharge dynamics. However, even with correct training, these methods depend on real-time data to provide a forecast.
Incomplete sensor data availability is an important challenge in real-world battery management [3]. Failures in cell voltage sensors, intermittent communication losses, or bandwidth constraints can lead to temporary or persistent data gaps. These missing measurements compromise the accuracy of state estimation algorithms, thereby undermining the safety and performance of critical battery management system (BMS) functionalities. For example, voltage gaps may delay the detection of anomalies, reduce balancing efficiency, or introduce uncertainty into discharge time predictions during high-stress operating phases.
In the field of electric mobility, manufacturers use various communication and data-flow architectures for their battery systems. Each architecture has its own design trade-offs regarding scalability, modularity, latency, and redundancy. Nevertheless, every EV and plug-in hybrid electric vehicle (PHEV) requires the integration of a battery management system (BMS) [4,5]. A BMS ensures safe operation, optimizes battery performance, and meets regulatory and safety standards [6]. There are two primary BMS implementation architectures: a centralized BMS that integrates sensing, decision-making, and control functions, and a distributed master–slave configuration with multiple local BMS modules (slaves) that communicate with a supervisory master node [7]. Either way, the system processes a large volume of sensor data in real time (see Figure 1), including individual cell voltages, currents, temperatures, and diagnostic signals [8]. However, the reliability of this data infrastructure is vulnerable to faults [9]. Sensor drift, connector degradation, electromagnetic interference, cable breaks, and microcontroller failures can generate communication or sensing gaps. When one or more data streams are missing or invalid, the BMS may experience data gaps or corrupt inputs. These gaps are highly undesirable due to the importance of the battery in vehicle operation [10]. Voltage is particularly critical because it defines safe operational limits and is directly tied to overvoltage/undervoltage cutoff mechanisms. Any cell voltage deviation beyond prescribed thresholds can result in degradation, imbalance, or catastrophic failure [11].
However, the challenge goes beyond merely filling in the gaps. In practice, missing data may be correlated with fault conditions rather than random events. If reconstruction models assume randomness, they may mask incipient failures and delay corrective measures. For this reason, anomaly-aware reconstruction strategies that explicitly model uncertainty and interface with diagnostic systems are critical for ensuring safety in fault-prone environments [12,13]. Combining advanced reconstruction strategies with distributed learning architectures can improve the reliability and safety of EV and ESS operation beyond current baselines. This approach addresses technical limitations of sensor infrastructure and supports broader objectives, such as reducing range anxiety, optimizing lifecycle costs, and enabling secure integration with renewable-heavy power grids.
A wide range of reconstruction techniques exist, from statistical interpolation to data-driven methods that exploit correlations among neighboring cells or modules. The choice of reconstruction strategy directly affects the quality of the model’s inputs and, consequently, the reliability of the downstream prediction [14]. However, few studies have examined how different reconstruction approaches influence RNN forecasting under dynamic load profiles.
RNNs are well-suited for modeling complex, nonlinear time-series signals because they are designed to process sequential data while retaining memory of past inputs [15]. Unlike traditional feedforward networks, RNNs maintain a hidden state that evolves over time. This allows RNNs to learn temporal dependencies and internal dynamics that are not directly observable [16]. This makes RNNs especially powerful for applications such as battery modeling, where voltage and current signals may exhibit nonlinear behavior due to electrochemical hysteresis, load variability, or multi-timescale dynamics. By capturing the sequential structure of these signals, RNNs can accurately predict future values or reconstruct missing data, even in the presence of noise and variability.
This paper addresses that gap by benchmarking multiple reconstruction methods in the context of RTD prediction. RTD is defined as the time interval until a cell branch reaches its cutoff voltage under a dynamic drive cycle. It is a relevant metric for range estimation and discharge scheduling. Our analysis evaluates the ability of different reconstruction techniques to recover realistic input signals while maintaining RNN predictive accuracy using representative driving cycles characterized by rapid current transients and partial data unavailability. Our findings shed light on the interplay between data recovery and sequence modeling, offering a more robust and precise approach to battery management in next-generation EVs and energy storage systems.
The primary novelty of this research paper is the development and systematic validation of a two-stage hybrid framework designed to assess the impact of various signal reconstruction strategies on the performance and reliability of deep learning RTD forecasts. In the first stage, the signal reconstruction strategies are evaluated systematically. In the second stage, the time series reconstructed in the first stage, along with the original, complete signals, serve as inputs for deep learning forecasts. The results show that the UKF method outperforms the other analyzed reconstruction methods, including the GRU, an RNN architecture for sequential data reconstruction. These results challenge the common belief that neural networks always outperform other “classical” alternatives. Finally, the results presented here confirm that the LSTM RNN is quantitatively superior to the GRU RNN for this RTD forecasting task.

2. Materials and Methods

The methodology proposed in this work follows a two-stage approach. First, artificial gaps are systematically introduced into the experimental battery cycling data at various points within the discharge profile. Then, the missing segments are reconstructed using various reconstruction techniques. Both the reconstructed signals and the original signal are retained for analysis. Next, the reconstructed voltage trajectories serve as inputs for recurrent neural network models that were previously trained to forecast the remaining time to depletion. This combined process enables the direct evaluation of each reconstruction strategy’s accuracy and its effect on forecasting performance, providing an integrated view of how data recovery choices influence predictive reliability.
Figure 2 illustrates the methodology applied in this research.

2.1. Analyzed Reconstruction Methods

This paper compares four increasingly complex reconstruction strategies: the zero-order hold (ZOH), the autoregressive integrated moving average (ARIMA), the unscented Kalman filter (UKF), and the gated recurrent unit (GRU), a type of RNN. Although these methods have different requirements, they are all evaluated using the same data gap regions of driving cycle data to ensure comparability.
The first and simplest method is the ZOH, which propagates the last valid value across the gap [17]. It is computationally inexpensive and easy to implement, since it does not depend on any other parameters. For voltage reconstruction, the estimated voltage $\hat{v}$ within a gap of m missing samples, given the last known voltage $v_k$ at time k, is calculated as follows:
$\hat{v}_{k+j} = v_k, \qquad j = 1, 2, \ldots, m$  (1)
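As a minimal illustration of Equation (1) (the function name `zoh_fill` and the example trace are ours, not the paper's), the fill can be written in a few lines:

```python
import numpy as np

def zoh_fill(v, gap_start, gap_len):
    """Zero-order hold: propagate the last valid sample v[gap_start-1]
    across a gap of gap_len missing samples, as in Equation (1)."""
    v = np.asarray(v, dtype=float).copy()
    v[gap_start:gap_start + gap_len] = v[gap_start - 1]
    return v

# Example: a short voltage trace with a three-sample gap
v = [3.60, 3.59, float("nan"), float("nan"), float("nan"), 3.50]
print(zoh_fill(v, 2, 3))  # the three NaNs are replaced by 3.59
```

The flat fill is exactly why ZOH cannot track the dynamic content of a driving cycle, as the Results section later shows.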
ARIMA is a classical statistical forecasting model used for time-dependent variables [18]. It assumes that the present voltage is shaped by its history, by corrections from past prediction errors, and by long-term differences. This makes ARIMA suitable for analyzing signals in which past electrical states influence the present [19]. ARIMA can be extended to capture seasonal information. It consists of three main components: the autoregressive order p, the degree of differencing d, and the moving-average order q.
$y_t = (1 - B)^d v_t$  (2)
$y_t = \varphi_1 y_{t-1} + \cdots + \varphi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$  (3)
In (2), $y_t$ is the $d$-th differenced version of the voltage time series $v_t$, which is differenced until it becomes stationary, and $B$ is the lag (backshift) operator that shifts values backward in time. In (3), the coefficients $\varphi_1, \ldots, \varphi_p$ scale the contributions of past values of the differenced voltage through the autoregressive component. The random error $\varepsilon_t$ has a zero mean and constant variance. The coefficients $\theta_1, \ldots, \theta_q$ scale the influence of past errors through the moving-average component.
No predefined models were used. Instead, a custom Python script was used to preprocess the data and leverage the auto_arima procedure from the pmdarima Python package [20], which automatically calculates the optimal model orders (p,d,q) by minimizing the fitting error using the Akaike information criterion (AIC). The resulting time series was then evaluated for stationarity using the augmented Dickey–Fuller (ADF) test. Only configurations that produced stationary residuals were included in the final model. The resulting ARIMA model was then used to recursively forecast and fill the voltage gap.
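As a rough, self-contained illustration of the recursive forecast-and-fill idea (not the paper's pmdarima pipeline), the sketch below differences the series d times, fits an AR(p) model by ordinary least squares, forecasts recursively, and inverts the differencing. The moving-average terms and the AIC-based order selection performed by auto_arima are omitted for brevity; the function name `ar_forecast` is ours:

```python
import numpy as np

def ar_forecast(v, p=3, d=1, steps=5):
    """Difference the series d times, fit an AR(p) model by least
    squares, then recursively forecast `steps` values and invert the
    differencing. A simplified stand-in for ARIMA(p, d, q) gap filling."""
    y = np.asarray(v, dtype=float)
    last = []                      # last value at each differencing level
    for _ in range(d):
        last.append(y[-1])
        y = np.diff(y)
    # Design matrix of lagged values: y[t] ~ phi_1 y[t-1] + ... + phi_p y[t-p]
    X = np.column_stack([y[p - k - 1:len(y) - k - 1] for k in range(p)])
    phi, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    hist = list(y)
    preds = []
    for _ in range(steps):
        nxt = float(np.dot(phi, hist[-1:-p - 1:-1]))  # most recent p lags
        hist.append(nxt)
        preds.append(nxt)
    # Invert the differencing by cumulative summation from the stored values
    out = np.array(preds)
    for lv in reversed(last):
        out = lv + np.cumsum(out)
    return out
```

On a perfectly linear series, one round of differencing yields a constant, so the recursion simply extrapolates the trend; on WLTP voltage data the same mechanism produces the quasi-linear fills discussed in the Results section.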
The third approach investigated for data reconstruction is the unscented Kalman filter (UKF). The UKF applies the unscented transformation to the Kalman filter framework [21]. It models a nonlinear, discrete-time system defined by a state-transition equation for the state $x_t$ and a measurement equation for the observation $y_t$ as [22],
$x_{t+\Delta t} = f(x_t, u_t, t, \Delta t) + w_t, \qquad y_t = h(x_t) + v_t$  (4)
In (4), $f$ captures the system dynamics and nonlinearities, $u_t$ represents auxiliary inputs, $t$ is the absolute timestamp, $\Delta t$ is the time step, $w_t$ denotes process noise, $h$ maps the system state to the measurement space, and $v_t$ represents measurement noise. We introduce a hybrid state-transition model that incorporates information from neighboring cells in the battery pack, as well as a predictive model derived from past data of the target cell, as
$f(x_t, u_t, t, \Delta t) = (1 - \rho)\left[x_t + \gamma\left(\bar{u}_{t+\Delta t} - x_t\right)\Delta t\right] + \rho \cdot P_{model}(t)$  (5)
where $\bar{u}_{t+\Delta t}$ is the average of the available auxiliary cell values, $\gamma > 0$ defines the coupling strength, and $\rho \in [0, 1]$ balances the contribution of the predictive model against that of the auxiliary cell term. Note that the UKF formulation uses signals from neighboring cells to reconstruct missing data. This is possible when these auxiliary signals are available and reliable. However, when multiple adjacent cells experience simultaneous data loss, the reconstruction accuracy naturally decreases. In such cases, other mechanisms are required, such as higher-level pack observers, to select healthy cells.
The predictive model is expressed as follows,
$P_{model}(t) = a \cdot t^2 + b \cdot t + c + d \cdot \ln(1 - t) + e \cdot \ln(t)$  (6)
The parameters a, b, c, d and e are estimated offline using historical data from previous cycles, excluding the data gap. The estimation is carried out by nonlinear least-squares optimization with the Levenberg–Marquardt algorithm. The model thus combines the logarithmic terms of the Nernst equation [23,24], which capture boundary behavior, with a quadratic polynomial that describes nonlinear trends in flatter regions.
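Since Equation (6) is actually linear in the parameters (a, b, c, d, e), the offline fit can be sketched with plain linear least squares; the paper uses Levenberg–Marquardt, which also covers the general nonlinear case. The sketch below (function names ours) assumes the time variable has been normalized to the open interval (0, 1) so that both logarithms are defined:

```python
import numpy as np

def fit_pmodel(t, v):
    """Fit P_model(t) = a t^2 + b t + c + d ln(1-t) + e ln(t) of
    Equation (6) by linear least squares; t must lie in (0, 1)."""
    A = np.column_stack([t**2, t, np.ones_like(t),
                         np.log(1 - t), np.log(t)])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return coef  # (a, b, c, d, e)

def pmodel(t, coef):
    """Evaluate Equation (6) at times t for coefficients (a..e)."""
    a, b, c, d, e = coef
    return a * t**2 + b * t + c + d * np.log(1 - t) + e * np.log(t)
```

The log terms dominate near the voltage knees at the ends of discharge, while the quadratic part follows the flat mid-SoC plateau.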
For the observation model, a direct observation is used as follows:
$h(x_t) = x_t$  (7)
which implies that the measurements y t directly observe the internal state, subject to Gaussian noise with a zero mean and a covariance R. During missing intervals, however, no actual measurements are available, so the UKF relies solely on the prediction step, without applying measurement updates.
The UKF advances the state vector $x_t$, represented by the mean $\hat{x}_t$ and the covariance $P_t$, by constructing sigma points $\mathcal{X}_{i,t}$, which are representative samples of the state distribution,
$\mathcal{X}_{0,t} = \hat{x}_t$
$\mathcal{X}_{i,t} = \hat{x}_t + \left(\sqrt{(n+\lambda)P_t}\right)_i, \qquad i = 1, \ldots, n$
$\mathcal{X}_{i,t} = \hat{x}_t - \left(\sqrt{(n+\lambda)P_t}\right)_{i-n}, \qquad i = n+1, \ldots, 2n$  (8)
The scaling parameter λ is defined as
$\lambda = \alpha^2 (n + \kappa) - n$  (9)
where α controls the spread of the sigma points, and κ is a secondary scaling constant. Each sigma point is then propagated through the nonlinear state function, which is expressed as follows:
$\mathcal{X}_{i,t+1} = f(\mathcal{X}_{i,t}, u_t, t, \Delta t)$  (10)
The predicted mean $\hat{x}_{t+1|t}$ and covariance $P_{t+1|t}$ are obtained as follows:
$\hat{x}_{t+1|t} = \sum_{i=0}^{2n} \omega_m^i \, \mathcal{X}_{i,t+1}$  (11)
$P_{t+1|t} = \sum_{i=0}^{2n} \omega_c^i \left(\mathcal{X}_{i,t+1} - \hat{x}_{t+1|t}\right)\left(\mathcal{X}_{i,t+1} - \hat{x}_{t+1|t}\right)^T + Q_{t+1}$  (12)
where $Q$ is the process noise covariance, and $\omega_m$ and $\omega_c$ are the sigma-point weights for the mean and the covariance, respectively. These are calculated as follows:
$\omega_m^0 = \dfrac{\lambda}{n+\lambda}, \qquad \omega_c^0 = \dfrac{\lambda}{n+\lambda} + \left(1 - \alpha^2 + \beta\right)$
$\omega_m^i = \omega_c^i = \dfrac{1}{2(n+\lambda)}, \qquad i = 1, \ldots, 2n$  (13)
Here, β is an additional parameter that increases the weight of the central sigma point. It is commonly set to 2 for Gaussian distributions. Finally, the process noise Q is approximated using prediction residuals as
$Q \approx \mathrm{Var}\left[z_t - f(z_{t-1}, u_t, t, \Delta t)\right]$  (14)
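The sigma-point construction of Equations (8), (9) and (13) can be sketched as a generic unscented transform; the parameter defaults below are conventional illustrative values, not the paper's tuning, and the function name is ours:

```python
import numpy as np

def sigma_points_and_weights(x_hat, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Build the 2n+1 sigma points of Equation (8) and the mean/
    covariance weights of Equation (13) for state mean x_hat and
    covariance P."""
    n = len(x_hat)
    lam = alpha**2 * (n + kappa) - n          # Equation (9)
    S = np.linalg.cholesky((n + lam) * P)     # matrix square root of (n+lam)P
    X = np.empty((2 * n + 1, n))
    X[0] = x_hat
    for i in range(n):
        X[i + 1] = x_hat + S[:, i]            # +sqrt((n+lam)P) columns
        X[n + i + 1] = x_hat - S[:, i]        # -sqrt((n+lam)P) columns
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    return X, wm, wc
```

By construction the mean weights sum to one, so the weighted sigma points reproduce the state mean exactly; the prediction step of Equations (10)–(12) then propagates each point through $f$ and recombines them with these weights.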
The fourth method considered is a recurrent neural network, specifically the gated recurrent unit (GRU). GRUs are a type of RNN designed to model sequential data. They are well-suited for learning temporal dynamics in systems where measurements are strongly time-dependent. GRUs control the flow of information over time steps through their gating mechanisms. This allows the network to retain relevant historic information while reducing vanishing gradient issues that typically limit the performance of conventional RNNs. This makes GRUs particularly effective at capturing the nonlinear, time-varying behavior of electrochemical processes during battery discharge.
In this implementation, the model uses a fixed lookback window of approximately two minutes to predict the subsequent voltage point. Each input sequence incorporates terminal voltage, current, state of charge (SoC), and elapsed time. This provides the network with complementary information about the evolving system state. The architecture consists of an input layer that handles the feature vector, two stacked GRU layers with 252 neurons each, and a fully connected output layer that generates a single scalar prediction of the reconstructed voltage. The network weights were randomly initialized before training, which enabled the optimizer to discover suitable representations of the system dynamics from scratch. Since voltage appears in both the input and output, the model can operate in an autoregressive manner by recursively using its own predictions. However, stability across reconstruction gaps is maintained by using exogenous inputs, such as current and elapsed time.
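The gating mechanism described above can be illustrated with a minimal single-step GRU cell in NumPy. The shapes and random weights below are purely illustrative; the paper's network stacks two trained GRU layers of 252 units each and feeds a fully connected output layer:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, W, U, b):
    """One GRU time step. W, U, b each hold three parameter blocks for
    the update gate z, reset gate r, and candidate state h_tilde."""
    Wz, Wr, Wh = W
    Uz, Ur, Uh = U
    bz, br, bh = b
    z = sigmoid(Wz @ x + Uz @ h + bz)               # update gate
    r = sigmoid(Wr @ x + Ur @ h + br)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)   # candidate state
    return (1 - z) * h + z * h_tilde                # new hidden state

# Illustrative shapes: 4 input features (voltage, current, SoC, elapsed
# time) and a hidden size of 8; weights are random, not trained.
rng = np.random.default_rng(0)
nx, nh = 4, 8
W = [rng.standard_normal((nh, nx)) * 0.1 for _ in range(3)]
U = [rng.standard_normal((nh, nh)) * 0.1 for _ in range(3)]
b = [np.zeros(nh) for _ in range(3)]
h = np.zeros(nh)
for x in rng.standard_normal((120, nx)):  # a 120-step lookback window
    h = gru_cell(x, h, W, U, b)
```

The update gate z interpolates between the previous hidden state and the candidate state, which is what lets the network carry voltage history across a long gap without the gradients vanishing.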
Figure 3 describes the architecture used for implementing the GRU.
Training was conducted using the AdamW optimizer with weight decay regularization and early stopping to prevent overfitting and memorization. Convergence was achieved with a root mean squared error (RMSE) of 10 mV on the training dataset. This target was not arbitrary; it was selected to align with the requirements of practical BMSs for balancing, while ensuring that the reconstruction accuracy is directly relevant to operational decision-making.
Three key performance indicators were used to evaluate the performance of the proposed reconstruction methods and establish a benchmark for comparison: the root mean square error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R2). These metrics are formally defined as follows:
$RMSE = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(v_i - \hat{v}_i\right)^2}, \qquad MAE = \dfrac{1}{n}\sum_{i=1}^{n}\left|v_i - \hat{v}_i\right|, \qquad R^2 = 1 - \dfrac{\sum_{i=1}^{n}\left(v_i - \hat{v}_i\right)^2}{\sum_{i=1}^{n}\left(v_i - \bar{v}\right)^2}$  (15)
where $v_i$ is the actual voltage value, $\hat{v}_i$ is the predicted voltage, $\bar{v}$ is the mean of the actual values, and n is the number of observations.
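The three indicators of Equation (15) are straightforward to compute together (function name ours):

```python
import numpy as np

def reconstruction_metrics(v, v_hat):
    """RMSE, MAE and R^2 of Equation (15) for the actual voltages v
    and the reconstructed voltages v_hat."""
    v = np.asarray(v, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    err = v - v_hat
    rmse = float(np.sqrt(np.mean(err**2)))
    mae = float(np.mean(np.abs(err)))
    r2 = float(1.0 - np.sum(err**2) / np.sum((v - v.mean())**2))
    return rmse, mae, r2
```

Note that $R^2$ becomes negative whenever the reconstruction fits worse than the constant mean of the actual values, which is exactly the behavior reported later for the ZOH and ARIMA fills.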

2.2. Analyzed RTD Forecasting Methods

The forecasting objective is to estimate the remaining time to depletion (RTD), which is defined as the time interval until the cell or branch voltage drops below the cutoff threshold $V_{off}$. If $t_{cutoff}$ is defined as the time instant at which $V(t) = V_{off}$, then the RTD can be defined as
$RTD(t) = t_{cutoff} - t$  (16)
Figure 4 shows the baseline calculation of the RTD at a specific time during a WLTP driving cycle. Unlike static state variables, such as SoC, RTD is a dynamic variable whose value evolves as a function of the current operating profile, instantaneous load, and cell voltage response history. In practice, RTD provides a direct, interpretable measure of available operating time under real-world driving conditions, such as the WLTP.
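The baseline RTD of Equation (16) can be computed directly from a recorded voltage trace (function name ours; values after the cutoff instant come out negative and would simply be discarded in practice):

```python
import numpy as np

def rtd(t, v, v_off):
    """Remaining time to depletion per Equation (16): the time until
    the voltage trace first reaches the cutoff v_off, evaluated at
    every sample time in t."""
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    below = np.flatnonzero(v <= v_off)
    if below.size == 0:
        raise ValueError("cutoff never reached in this trace")
    t_cutoff = t[below[0]]       # first instant at which V(t) <= V_off
    return t_cutoff - t
```

This ground-truth RTD is what the quantile-regression RNNs are trained against as a supervised target.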
Forecasting the RTD is challenging because input signals, such as cell voltages, branch currents, and elapsed time, exhibit strong nonlinear behavior. Under WLTP conditions, rapid current transients, rest periods, and voltage relaxation effects generate intricate temporal dependencies that conventional regression methods and curve-fitting techniques fail to capture. This work uses an RNN architecture to address these challenges, and two of the most common implementations are compared: the GRU and the LSTM networks.
An LSTM is a type of RNN that incorporates gating mechanisms and an internal cell state that acts as a memory line. These features allow the network to retain information over longer time periods and make LSTMs well-suited for modeling the long-term temporal dependencies present in real driving cycles.
Both GRU and LSTM networks are advanced types of RNNs designed to capture long-range dependencies in sequential data and address the vanishing gradient problem. The key difference between the two lies in their internal structure. LSTMs use three gates (input, forget, and output) and maintain two states (cell and hidden). This allows LSTMs to control the flow of information more selectively. GRUs simplify this mechanism by using only two gates (reset and update) and a single hidden state. This makes GRUs computationally more efficient, yet they still achieve comparable performance on many sequence modeling tasks. GRUs tend to train faster and require fewer parameters. However, LSTMs may offer more flexibility in learning complex temporal patterns in longer sequences.
The proposed framework implements GRU and LSTM models with identical architectures consisting of two stacked layers, each with 128 neurons. These models are trained on historical sequences of current, voltage, elapsed time, power and SoC within a fixed 120-s lookback window, chosen as a trade-off between computational cost and model accuracy. Training was conducted using the AdamW optimizer with weight decay regularization and early stopping to prevent overfitting and memorization. Rather than attempting to forecast the full future current and voltage behavior explicitly, the network is optimized to estimate the RTD at each time step. This transforms the problem into a supervised learning task in which the target variable is the true remaining time until voltage cutoff, calculated from experimental data. By exploiting the temporal correlations in the input features, the model learns to associate characteristic load–voltage–SoC patterns with typical times remaining before depletion.
Additionally, both networks express their output in terms of quantiles of the RTD distribution. This provides a point estimate (the median RTD) and an uncertainty band. This probabilistic formulation, achieved through a pinball loss function [25,26], is particularly useful because it yields reliable confidence intervals, which are essential for safe and efficient operation under uncertain or variable driving conditions. For this work, three quantile levels τ were selected to achieve a confidence level of 80% in the predictions: τ = 0.5 for the 50% quantile (median), τ = 0.1 for the 10% quantile, and τ = 0.9 for the 90% quantile.
The pinball loss function $L_\tau$ used to train the RNN takes the ground truth $y$ and the model prediction $\hat{q}$. It is defined as
$L_\tau(y, \hat{q}) = \begin{cases} \tau \cdot (y - \hat{q}), & y \ge \hat{q} \\ (\tau - 1) \cdot (y - \hat{q}), & y < \hat{q} \end{cases}$  (17)
Equation (17) shows that the loss function penalizes errors differently depending on whether the prediction is too low (underestimation) or too high (overestimation). For example, if the true RTD $y$ is 1000 s and the model predicts $\hat{q}_{0.1} = 800$ s, the underestimation of 200 s results in a minor penalty of 0.1 × 200 = 20. However, the same underestimation at $\hat{q}_{0.9} = 800$ s is penalized nine times more strongly, 0.9 × 200 = 180. Conversely, if the model overestimates with $\hat{q}_{0.1} = 1200$ s, the penalty becomes 0.9 × 200 = 180, while the same overestimation with $\hat{q}_{0.9} = 1200$ s results in a penalty of only 0.1 × 200 = 20. This asymmetric weighting causes the lower quantile to remain below most outcomes, the upper quantile to stay above them, and the median quantile to balance both. Consequently, the three outputs collectively approximate the empirical distribution of the RTD rather than generating a single point estimate. When this methodology is used to adjust the neuron weights inside the RNN, the model learns to satisfy all three loss functions simultaneously. This provides a confidence range and a median value, $\hat{q}_{0.5}$, for reference.
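A direct implementation of Equation (17) reproduces the worked example above (function name ours):

```python
import numpy as np

def pinball_loss(y, q_hat, tau):
    """Pinball (quantile) loss of Equation (17): underestimation
    (y >= q_hat) is weighted by tau, overestimation by (1 - tau)."""
    diff = y - q_hat
    return np.where(diff >= 0, tau * diff, (tau - 1) * diff)

# Worked example from the text, with true RTD y = 1000 s:
print(pinball_loss(1000, 800, 0.1))   # underestimation at tau=0.1: ~20
print(pinball_loss(1000, 800, 0.9))   # underestimation at tau=0.9: ~180
print(pinball_loss(1000, 1200, 0.1))  # overestimation at tau=0.1: ~180
print(pinball_loss(1000, 1200, 0.9))  # overestimation at tau=0.9: ~20
```

Training one output head per τ against this loss is what pushes the three heads toward the 10%, 50% and 90% quantiles of the RTD distribution.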
Figure 5 shows the RNN structure for the GRU and LSTM models, which have three quantile outputs.
Two aspects were considered when evaluating the validity of the RNN predictions: the monotonic (non-crossing) ordering of the predicted quantile values and the prediction interval coverage probability (PICP).
Since the RNN is free to predict any values and the initial weights are randomly assigned, quantile monotonicity must be verified explicitly. Ensuring monotonicity confirms that the quantile curves do not cross each other. The monotonicity condition is as follows:
$\hat{q}_{0.1}(t) \le \hat{q}_{0.5}(t) \le \hat{q}_{0.9}(t)$  (18)
Testing the PICP against the true RTD and the predicted values $\hat{q}$ can confirm that the selected confidence interval is reliable, or reveal whether it is overconfident (too narrow) or underconfident (too wide). For an 80% prediction interval, the PICP is defined as the fraction of test instances in which the true RTD falls between $\hat{q}_{0.1}$ and $\hat{q}_{0.9}$. This condition can be expressed as
$PICP_{80\%} = \dfrac{1}{n} \sum_{k=1}^{n} \mathbf{1}\left[\hat{q}_{0.1}(t_k) \le y_k \le \hat{q}_{0.9}(t_k)\right]$  (19)
where n is the total number of time samples, $y_k$ is the ground truth RTD at time instant k, and $\hat{q}_{0.1}(t_k)$ and $\hat{q}_{0.9}(t_k)$ are the 10th and 90th quantile values forecast at time instant k.
We also define the PICP width as follows:
$PICP_{width} = \hat{q}_{0.9}(t_k) - \hat{q}_{0.1}(t_k)$  (20)
This is the width of the RTD prediction confidence interval, expressed in seconds.
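Equations (18)–(20) can be evaluated together in one pass over a test trace (function name ours; the width is reported here as a mean over samples):

```python
import numpy as np

def picp_80(y, q10, q90):
    """Coverage of Equation (19) and mean band width of Equation (20)
    for an 80% prediction interval, after checking the quantile
    ordering of Equation (18)."""
    y = np.asarray(y, dtype=float)
    q10 = np.asarray(q10, dtype=float)
    q90 = np.asarray(q90, dtype=float)
    assert np.all(q10 <= q90), "quantile crossing detected"
    covered = (q10 <= y) & (y <= q90)          # indicator of Eq. (19)
    return float(covered.mean()), float(np.mean(q90 - q10))
```

A coverage close to 0.80 with a narrow mean width indicates a well-calibrated, informative interval; coverage far below 0.80 signals overconfidence.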

3. Dataset

The methodologies evaluated in this work rely on a dataset obtained using an advanced battery testbench specifically designed to reproduce realistic electric vehicle operating conditions [27]. While executing dynamic load profiles derived from the WLTP test cycle, the testbench supports the independent monitoring of individual cell voltages, branch currents, and pack-level signals. The fully modular system architecture integrates with a BMS via CAN communication, enabling real-time exchange of voltage, current, state-of-charge, and diagnostic messages. This design allows laboratory cycling to closely replicate real automotive usage in terms of both electrical dynamics and data acquisition fidelity. A detailed description of the test bench hardware, data acquisition structure, and communication layers can be found in [28].
Figure 6 shows the electrical diagram of the testing setup and the acquired signals.
The test object is a lithium-ion battery pack consisting of Panasonic NCR18650B cells, which are commercially available 18650-format cells of NCA chemistry. The nominal specifications of each cell are as follows:
  • Chemistry: NCA;
  • Nominal capacity: 3.2 Ah;
  • Nominal voltage: 3.6 V;
  • Charge conditions: CC-CV at 0.5C (cut off at 65 mA or after 4 h).
The cells were arranged in a 12-series by 3-parallel (S12P3) configuration. This resulted in a pack with a nominal voltage of 43.2 V and a total nominal capacity of approximately 9.6 Ah. This configuration provides sufficient energy to reproduce full WLTP driving cycles while maintaining observability at the cell and pack levels, which is essential for evaluating reconstruction methods under realistic load conditions. For this study, only the discharge (driving) phases were used, as the constant-current charge phase is trivial to predict. The data used correspond to WLTP driving cycles 249 through 350 in the dataset. To prevent data bias, and to ensure that the reported results reflect the methods’ ability to generalize to unseen operating conditions rather than memorize specific trajectories, the dataset was split into two subsets:
  • Training set (WLTP driving cycles 249 to 300): This set is used for training data-driven reconstruction methods such as GRUs, as well as for training the RNN models for RTD forecasting.
  • Evaluation set (WLTP driving cycles 301 to 350): This set is reserved exclusively for testing all reconstruction methods and assessing their impact on RTD forecasting accuracy.
Figure 7 shows an example of the battery pack’s voltage behavior under the tested driving conditions (WLTP).

4. Results

This study includes several control algorithms that represent the main families of reconstruction approaches commonly found in the literature. These families include deterministic interpolation methods (ZOH), statistical models (ARIMA), model-based filters (UKF), and data-driven neural architectures (GRU and LSTM). Together, these algorithms provide a comprehensive benchmark of the latest techniques for missing data reconstruction and short-term forecasting in battery systems [29,30,31].
This section presents the study results in three parts. First, signals with missing data are reconstructed, and the R2, RMSE, and MAE metrics are evaluated for each reconstruction method (see Figure 8a). Next, the designed RTD forecasting model is evaluated against the original signal using complete data (see Figure 8b). This evaluation relies on the mean MAE, as well as the coverage and width of the 80% prediction interval (PICP80%). Finally, the forecasting results obtained with each reconstructed dataset are presented and assessed using the same set of metrics (see Figure 8c). Together, these analyses allow us to evaluate the performance of individual reconstruction methods and their combined performance when applying forecasting models.
Figure 8 summarizes the three parts of the results presented in this section.

4.1. Reconstruction Results

This section evaluates how well the ZOH, ARIMA, UKF, and GRU algorithms reconstruct missing data from the original signals (see Figure 8a).
Independent tests were conducted across 36 cells of the battery pack over 40 cycles to evaluate the performance of the reconstruction methods. Three 500-s artificial gaps were simulated at different positions within the discharge phase in each tested cycle, resulting in 4320 test cases. The fixed gap length of 500 s was chosen to create a realistic and challenging reconstruction scenario. This duration exceeds the 120-s lookback window used by the GRU to reconstruct the missing data. This ensures that the reconstruction partly relies on previously imputed values rather than direct observations. Then, each reconstruction method was applied to reconstruct the missing signal interval.
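The gap-injection protocol above can be mimicked as follows; the 1 Hz sampling rate is our assumption (the gap positions are stated in seconds), and the function name is ours:

```python
import numpy as np

def mask_gaps(v, fs=1.0, starts_s=(1000, 6000, 10000), gap_s=500):
    """Blank three 500-s windows (beginning, middle and end of the
    discharge) by writing NaNs, mirroring the artificial-gap protocol.
    fs is the sampling rate in Hz."""
    v = np.asarray(v, dtype=float).copy()
    for s in starts_s:
        i0 = int(s * fs)
        v[i0:i0 + int(gap_s * fs)] = np.nan
    return v
```

Each of the three gaps in each of the 40 cycles and 36 cells is then handed to the four reconstruction methods, yielding the 4320 test cases.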
Figure 9 shows representative reconstruction results for cycle 315, focusing on cell S1P1 of the battery pack.
The average and maximum values of the performance metrics of the different evaluated reconstruction methods are summarized in Table 1.
Each cycle lasts about 12,000 s on average, while each data gap spans 500 s. The first gap (beginning) starts at the 1000-s mark, the second (middle) at the 6000-s mark, and the third (end) at the 10,000-s mark. The results clearly show differences in the effectiveness of the evaluated reconstruction methods. Figure 9 shows that both the ZOH and ARIMA methods fail to reproduce the original signal's complex dynamic behavior: they exhibit a quasi-linear trend and cannot follow the rich frequency content of the WLTP driving cycles. Consequently, their R2 is negative, meaning the predicted values deviate from the actual values so much that the mean of the actual values would provide a better fit, while their RMSE and MAE values remain relatively high and comparable to one another. In contrast, the GRU (RNN) achieves an average R2 of 0.79 with substantially lower RMSE and MAE values, demonstrating its ability to capture the temporal dependencies of the signal. Finally, the UKF algorithm outperforms the other methods, achieving the highest R2 (0.91) and the lowest RMSE and MAE values, which underscores its robustness for signal reconstruction in this context. This advantage can be attributed to its knowledge of neighboring cell states and its ability to fit historical data.
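The three metrics follow directly from their definitions; a negative R2 simply means that the residual sum of squares exceeds the variance of the actual values, so the mean would be a better predictor than the reconstruction. A minimal sketch (the helper name is ours):

```python
import math

def reconstruction_metrics(y_true, y_pred):
    """R2, RMSE and MAE of a reconstructed segment against the original signal."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot      # negative when ss_res > ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae
```

A flat prediction over an oscillating signal, as ZOH produces during dynamic driving, is exactly the case that drives R2 below zero.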

4.2. RTD Forecast Based on the Original Signal of Past Data

This section evaluates the performance of the GRU and LSTM RNN models for RTD forecasting of the original past data signals (see Figure 8b).
Since the designed RTD forecasting model provides probabilistic outputs, both the PICP80% coverage and the PICPwidth must be evaluated, together with the median estimate (q̂0.5), which is a meaningful indicator of model performance. The evaluation was conducted on 40 driving cycles distinct from those used for training, across 36 cells, applying the pre-trained models at each time step. Since the RNN models use a 120-s lookback window, estimates are available only from that point onward. Figure 10 illustrates an example, and Figure 11 shows the RTD forecasting performance indicators based on the original signal of past data compared with the RTD of the complete original signal.
Figure 11 shows a boxplot of the MAE and PICP80% of the RTD forecasting results produced by the GRU and LSTM RNNs. Boxplots display outliers (black empty circles), lower and upper quartiles (Q1 and Q3, respectively), the median, and the maximum and minimum scores.
Table 2 summarizes the values of the performance indicators of the RTD forecasting based on the original signal of past data when compared with the RTD of the complete original signal. The evaluation was conducted on 40 driving cycles across 36 cells, which are distinct from those used for training.
As shown in Figure 11 and Table 2, the LSTM achieved lower median RTD error (MAE of 26.7 versus 28.6 s), higher PICP80% coverage (93.1% versus 88.2%), and a narrower PICPwidth (126.6 s versus 159.1 s) than the GRU across the 1440 WLTP evaluated discharge cycles. These results indicate improved accuracy and better-calibrated uncertainty when using the LSTM forecasting method.
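For reference, the interval metrics used throughout this section follow their standard definitions: PICP80% is the fraction of true RTD values falling inside the [q̂0.1, q̂0.9] band, and PICPwidth is the mean width of that band. A minimal sketch, with an illustrative helper name:

```python
def interval_metrics(y_true, q10, q90):
    """PICP80% (coverage, in %) and PICPwidth (mean band width, in the
    units of the target) for an 80% prediction interval [q10, q90]."""
    n = len(y_true)
    inside = sum(1 for t, lo, hi in zip(y_true, q10, q90) if lo <= t <= hi)
    picp = 100.0 * inside / n
    width = sum(hi - lo for lo, hi in zip(q10, q90)) / n
    return picp, width
```

A well-calibrated 80% interval should yield a PICP near or above 80%, as both RNNs achieve here on the original data.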

4.3. RTD Forecast Based on the Reconstructed Signal of Past Data

This section evaluates the performance of the GRU and LSTM RNN models for RTD forecasting of the reconstructed past data signals (see Figure 8c). The RNNs are trained on complete, original data and evaluated on reconstructed data; this strategy isolates the effect of each reconstruction method on forecasting performance. To ensure comparability, the evaluation uses the same cycles and injected gaps as in the previous sections, yielding 4320 test cases for each combination of reconstruction method and RTD forecasting model. Figure 12 and Figure 13 show box plots of the GRU and LSTM forecasting results, respectively, for all analyzed reconstruction methods. A complete summary of the performance indicators is provided in Table 3.
As shown in Figure 12 and Figure 13 and Table 3, the performance of the GRU and LSTM models for RTD forecasting decreased when evaluated using reconstructed time series. These results highlight two points. First, introducing reconstruction noise inevitably degrades performance compared to the complete original data. Second, the LSTM forecaster is superior because it combines high accuracy with reliable uncertainty estimates.
When artificial gaps were introduced and filled using the different reconstruction methods, the UKF minimized data degradation the most, providing forecasts closest to those obtained from the original past data signal. GRU-based reconstruction was the second-best method, although it roughly doubled the errors. Simpler methods such as ZOH and ARIMA substantially degraded accuracy; ARIMA produced the poorest MAE values despite showing PICP80% and PICPwidth values similar to those of the better methods.
The relationship between MAE and PICPwidth shown in Figure 12 and Figure 13 and Table 3 deserves further analysis. For most combinations, the PICPwidth values were larger than the MAE values. For example, the LSTM + UKF combination produced bands of PICPwidth ≈ 126 s against an MAE ≈ 38 s, indicating that the true RTD typically falls within the confidence interval. This trade-off is acceptable because it ensures reliability, even at the cost of broader ranges. The opposite occurred in the ARIMA-reconstructed cases, where the MAE exceeded the PICPwidth (e.g., LSTM + ARIMA with MAE = 153.5 s and PICPwidth = 121.4 s). This reflects an overconfident model response: the [q̂0.1, q̂0.9] interval is too narrow relative to the actual error, which explains the reduced PICP80% values observed.

4.4. Assessment of the Computational Cost

This section evaluates the computational cost of the methods analyzed in this work.
Table 4 summarizes the average computational cost of the data reconstruction methods, distinguishing between the preprocessing and reconstruction stages. The preprocessing stage covers the steps required before reconstruction: UKF parameter fitting based on historical data, GRU model loading, and the search for the optimal ARIMA parameters (p, d, q). The reconstruction time corresponds to estimation or inference over a 500-s data gap.
The results presented in Table 4 show that the UKF and ZOH methods have lower computational costs than the other analyzed methods. Notably, the UKF excels in both accuracy and computational cost. Although the ARIMA model is fast in the reconstruction stage, it incurs a high preprocessing cost due to the optimal selection of its parameters. In contrast, the GRU method has a low preprocessing cost but the highest reconstruction time, because it makes sequential, step-by-step predictions across the data gap. These results emphasize the trade-off between accuracy and computational efficiency.
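The high reconstruction time of the GRU follows from its autoregressive rollout: filling a 500-s gap requires 500 sequential forward passes, each feeding earlier imputed values back into the 120-s lookback window. A schematic sketch, where predict_next is a placeholder standing in for the trained GRU:

```python
from collections import deque

def autoregressive_fill(predict_next, history, gap_len, lookback=120):
    """Fill a gap one step at a time: each prediction is appended to the
    lookback window, so later steps depend on earlier imputed values."""
    window = deque(history[-lookback:], maxlen=lookback)
    filled = []
    for _ in range(gap_len):
        v = predict_next(list(window))  # one forward pass per missing sample
        filled.append(v)
        window.append(v)                # imputed value re-enters the window
    return filled
```

Because gap_len (500) exceeds lookback (120), the final steps of the rollout see a window composed entirely of imputed values, which is exactly the error-accumulation regime the test cases were designed to probe.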
Table 5 shows the average computational time for the RTD forecasting stage. The first column corresponds to model loading time, and the second corresponds to forecasting time. Forecasting time is defined as forward inference over a complete discharge sequence.
As shown in Table 5, the results indicate that the LSTM and GRU models have a loading time of approximately 0.08 s, confirming that model initialization is very fast. However, the forecasting process itself is considerably slower, taking approximately 5.5 s for the LSTM and 7.9 s for the GRU RNNs.
The results presented in Table 4 and Table 5 were obtained using Python 3.10.11 code running on an Intel Core i9-13900K processor. Implementing the GRU and LSTM RNN models in a compiled language such as C/C++ or through optimized embedded frameworks could further reduce inference time in both the reconstruction and RTD forecasting stages [32].

5. Conclusions

A battery management system (BMS) processes a large volume of data, and the electric powertrain operates under demanding conditions. These factors make system components, such as sensors and cables, susceptible to temporary malfunctions or permanent failures. Missing or corrupted signals resulting from these issues directly impact the reliability of sensing data and consequently affect the decision-making of critical subsystems in electric vehicles. Therefore, robust methods are required to address data gaps and maintain continuity in system monitoring to ensure safe and reliable operation.
This paper has evaluated several techniques for reconstructing missing data: zero-order hold (ZOH), the unscented Kalman filter (UKF), the autoregressive integrated moving average (ARIMA) model, and a recurrent neural network based on gated recurrent units (GRUs). Beyond reconstruction accuracy, the reconstructed signals were also evaluated as inputs to two remaining time to depletion (RTD) forecasting models. This two-step evaluation assesses the reconstruction methods both by their accuracy and by their usefulness in downstream forecasting tasks.
The results demonstrate that the UKF provides the most reliable signal reconstruction, outperforming both traditional statistical approaches and learning-based methods. Regarding forecasting, the LSTM model performs slightly better than the GRU model. The hybrid approach, which combines UKF-based reconstruction and LSTM-based forecasting, yields the best overall results, surpassing all other combinations tested. These findings are validated using data collected from a controlled test bench that replicates realistic driving conditions. This demonstrates the practical relevance of the proposed methodology for real-world electric vehicle applications.

Author Contributions

Conceptualization, J.d.l.V., J.-R.R. and J.A.O.-R.; methodology, J.A.O.-R.; software, J.d.l.V. and J.-R.R.; validation, J.d.l.V., J.-R.R. and J.A.O.-R.; formal analysis, J.-R.R. and J.A.O.-R.; investigation, J.d.l.V., J.-R.R. and J.A.O.-R.; resources, J.-R.R. and J.A.O.-R.; data curation, J.d.l.V.; writing—original draft preparation, J.d.l.V. and J.-R.R.; writing—review and editing, J.d.l.V., J.-R.R. and J.A.O.-R.; supervision, J.-R.R. and J.A.O.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This project received funding from grant TED2021-130007B-I00, MICIU/AEI/10.13039/501100011033/, and from ERDF “A way of making Europe,” provided by the European Union, and from the Agència de Gestió d’Ajuts Universitaris i de Recerca-AGAUR (2021 SGR 00392).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this manuscript is available in the public repository “Lithium-ion battery pack cycling dataset with CC-CV charging and WLTP/constant discharge profiles” located at https://dataverse.csuc.cat/dataset.xhtml?persistentId=doi:10.34810/data2395 (accessed on 30 September 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ostadi, A.; Kazerani, M.; Chen, S.K. Optimal sizing of the Energy Storage System (ESS) in a Battery-Electric Vehicle. In Proceedings of the 2013 IEEE Transportation Electrification Conference and Expo (ITEC), Detroit, MI, USA, 16–19 June 2013. [Google Scholar]
  2. Ranjith Kumar, R.; Bharatiraja, C.; Udhayakumar, K.; Devakirubakaran, S.; Sekar, K.S.; Mihet-Popa, L. Advances in Batteries, Battery Modeling, Battery Management System, Battery Thermal Management, SOC, SOH, and Charge/Discharge Characteristics in EV Applications. IEEE Access 2023, 11, 105761–105809. [Google Scholar] [CrossRef]
  3. Xiong, R.; Yu, Q.; Shen, W.; Lin, C.; Sun, F. A Sensor Fault Diagnosis Method for a Lithium-Ion Battery Pack in Electric Vehicles. IEEE Trans. Power Electron. 2019, 34, 9709–9718. [Google Scholar] [CrossRef]
  4. Spoorthi, B.; Pradeepa, P. Review on Battery Management System in EV. In Proceedings of the 2022 International Conference on Intelligent Controller and Computing for Smart Power (ICICCSP), Hyderabad, India, 21–23 July 2022. [Google Scholar]
  5. Prada, E.; Di Domenico, D.; Creff, Y.; Sauvant-Moynot, V. Towards Advanced BMS Algorithms Development for (P)HEV and EV by Use of a Physics-Based Model of Li-ion Battery Systems. In Proceedings of the 2013 World Electric Vehicle Symposium and Exhibition (EVS27), Barcelona, Spain, 17–20 November 2013. [Google Scholar]
  6. Li, B.; Fu, Y.; Shang, S.; Li, Z.; Zhao, J.; Wang, B. Research on Functional Safety of Battery Management System (BMS) for Electric Vehicles. In Proceedings of the 2021 International Conference on Intelligent Computing, Automation and Applications (ICAA), Nanjing, China, 25–27 June 2021; pp. 267–270. [Google Scholar]
  7. Popp, A.; Fechtner, H.; Schmuelling, B.; Kremzow-Tennie, S.; Scholz, T.; Pautzke, F. Battery Management Systems Topologies: Applications: Implications of Different Voltage Levels. In Proceedings of the 2021 IEEE 4th International Conference on Power and Energy Applications (ICPEA), Busan, Republic of Korea, 9–11 October 2021; pp. 43–50. [Google Scholar]
  8. Khan, F.I.; Hossain, M.M.; Lu, G. Sensing-based monitoring systems for electric vehicle battery—A review. Meas. Energy 2025, 6, 100050. [Google Scholar]
  9. Kosuru Rahul, V.S.; Kavasseri Venkitaraman, A. A Smart Battery Management System for Electric Vehicles Using Deep Learning-Based Sensor Fault Detection. World Electr. Veh. J. 2023, 14, 101. [Google Scholar] [CrossRef]
  10. Li, J.; Che, Y.; Zhang, K.; Liu, H.; Zhuang, Y.; Liu, C.; Hu, X. Efficient battery fault monitoring in electric vehicles: Advancing from detection to quantification. Energy 2024, 313, 134150. [Google Scholar] [CrossRef]
  11. Jeevarajan, J.A.A.; Joshi, T.; Parhizi, M.; Rauhala, T.; Juarez-Robles, D. Battery Hazards for Large Energy Storage Systems. ACS Energy Lett. 2022, 7, 2725–2733. [Google Scholar] [CrossRef]
  12. Haider, S.N.N.; Zhao, Q.; Li, X. Data driven battery anomaly detection based on shape based clustering for the data centers class. J. Energy Storage 2020, 29, 101479. [Google Scholar] [CrossRef]
  13. Bhaskar, K.; Kumar, A.; Bunce, J.; Pressman, J.; Burkell, N.; Rahn, C. Data-Driven Thermal Anomaly Detection in Large Battery Packs. Batteries 2023, 9, 70. [Google Scholar] [CrossRef]
  14. Liu, J.; He, L.; Zhang, Q.; Xie, Y.; Li, G. Real-world cross-battery state of charge prediction in electric vehicles with machine learning: Data quality analysis, data repair and training data reconstruction. Energy 2025, 335, 138322. [Google Scholar]
  15. Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar]
  16. Saha, P.; Dash, S.; Mukhopadhyay, S. Physics-incorporated convolutional recurrent neural networks for source identification and forecasting of dynamical systems. Neural Netw. 2021, 144, 359–371. [Google Scholar]
  17. Karafyllis, I.; Krstic, M. Nonlinear stabilization under sampled and delayed measurements, and with inputs subject to delay and zero-order hold. IEEE Trans. Automat. Contr. 2012, 57, 1141–1154. [Google Scholar] [CrossRef]
  18. Zhou, Y.; Huang, M. Lithium-ion batteries remaining useful life prediction based on a mixture of empirical mode decomposition and ARIMA model. Microelectron. Reliab. 2016, 65, 265–273. [Google Scholar] [CrossRef]
  19. Riba, J.R.; Gómez-Pau, Á.; Martínez, J.; Moreno-Eguilaz, M. On-Line Remaining Useful Life Estimation of Power Connectors Focused on Predictive Maintenance. Sensors 2021, 21, 3739. [Google Scholar] [CrossRef] [PubMed]
  20. pmdarima: ARIMA Estimators for Python—Pmdarima 2.0.4 Documentation. Available online: https://alkaline-ml.com/pmdarima/ (accessed on 29 September 2025).
  21. He, Z.; Dong, C.; Pan, C.; Long, C.; Wang, S. State of charge estimation of power Li-ion batteries using a hybrid estimation algorithm based on UKF. Electrochim. Acta 2016, 211, 101–109. [Google Scholar]
  22. Xiong, K.; Zhang, H.Y.; Chan, C.W. Performance evaluation of UKF-based nonlinear filtering. Automatica 2006, 42, 261–270. [Google Scholar] [CrossRef]
  23. de la Vega, J.; Riba, J.R.; Ortega-Redondo, J.A. Real-Time Lithium Battery Aging Prediction Based on Capacity Estimation and Deep Learning Methods. Batteries 2023, 10, 10. [Google Scholar] [CrossRef]
  24. Hussein, A.A.H.; Batarseh, I. An overview of Generic Battery Models. In Proceedings of the 2011 IEEE Power and Energy Society General Meeting, Detroit, MI, USA, 24–28 July 2011. [Google Scholar]
  25. Liu, S.; Deng, J.; Yuan, J.; Li, W.; Li, X.; Xu, J.; Zhang, S.; Wu, J.; Wang, Y.G. Probabilistic quantile multiple fourier feature network for lake temperature forecasting: Incorporating pinball loss for uncertainty estimation. Earth Sci. Inform. 2024, 17, 5135–5148. [Google Scholar] [CrossRef]
  26. Bauer, I.; Haupt, H.; Linner, S. Pinball boosting of regression quantiles. Comput. Stat. Data Anal. 2024, 200, 108027. [Google Scholar] [CrossRef]
  27. de la Vega Hernández, J.; Ortega Redondo, J.A.; Riba Ruiz, J.-R. Lithium-Ion Battery Pack Cycling Dataset with CC-CV Charging and WLTP/Constant Discharge Profiles. 2025. Available online: https://dataverse.csuc.cat/dataset.xhtml?persistentId=doi:10.34810/data2395 (accessed on 30 July 2025).
  28. de La Vega, J.; Riba, J.-R.; Ortega, J.A. Advanced Battery Test Bench for Realistic Vehicle Driving Conditions Assessment; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2025; pp. 1–6. [Google Scholar]
  29. Ma, C.; Wang, A.; Chen, G.; Xu, C. Hand joints-based gesture recognition for noisy dataset using nested interval unscented Kalman filter with LSTM network. Vis. Comput. 2018, 34, 1053–1063. [Google Scholar] [CrossRef]
  30. Liu, X.; Gao, Z.; Tian, J.; Wei, Z.; Fang, C.; Wang, P. State of Health Estimation for Lithium-Ion Batteries Using Voltage Curves Reconstruction by Conditional Generative Adversarial Network. IEEE Trans. Transp. Electrif. 2024, 10, 10557–10567. [Google Scholar] [CrossRef]
  31. Mu, J.; Han, Y.; Zhang, C.; Yao, J.; Zhao, J. An Unscented Kalman Filter-Based Method for Reconstructing Vehicle Trajectories at Signalized Intersections. J. Adv. Transp. 2021, 2021, 6181242. [Google Scholar] [CrossRef]
  32. Li, J.; Zhang, C.; Cao, Q.; Qi, C.; Huang, J.; Xie, C. An Experimental Study on Deep Learning Based on Different Hardware Configurations. In Proceedings of the 2017 International Conference on Networking, Architecture, and Storage (NAS), Shenzhen, China, 7–9 August 2017. [Google Scholar]
Figure 1. Sensor failure in a central BMS architecture.
Figure 2. Reconstruction and RTD forecast evaluation methodology.
Figure 3. GRU-RNN architecture reconstruction.
Figure 4. Remaining time to depletion (RTD) measurement.
Figure 5. GRU and LSTM model architectures for RTD forecasting.
Figure 6. Electrical diagram of the battery testing setup.
Figure 7. Example of a driving cycle and a charge cycle obtained from the dataset.
Figure 8. Summary of the studies performed in this section. (a) Reconstruction and comparison against the original signal. (b) RTD forecasting with original past data and comparison against the real RTD from complete original data. (c) RTD forecasting with reconstructed past data and comparison against the real RTD from complete original data.
Figure 9. ZOH, ARIMA, UKF and GRU reconstruction results for cycle 315 and cell S1P1 with a simulated artificial gap of 500 s. The green shadow indicates the reconstruction interval. The orange, red, violet, and blue lines correspond to the reconstructed signals produced by the ZOH, ARIMA, UKF, and GRU methods, respectively.
Figure 10. Example of RTD forecasting evaluation. At the top is the driving cycle discharge phase, with the voltage cutoff marked. Below is the RTD forecast error in seconds of q ^ 0.5  and the confidence band PICPwidth evaluated at each time step of the driving cycle.
Figure 11. Performance indicators of the RTD forecasting based on the original signal of past data when compared with the RTD of the complete original signal. Boxplot results for the LSTM and GRU models. On the left, MAE in seconds for the forecasted q ^ 0.5  corresponding to each evaluated RNN model. On the right, PICP80% coverage for each evaluated RNN model.
Figure 12. Performance indicators of the RTD forecast based on the reconstructed signal of past data when compared with the RTD of the complete original signal. Boxplot results for the GRU models used for RTD forecasting based on the different reconstruction methods (UKF, GRU, ZOH and ARIMA). On the (left), MAE in seconds for the forecasted q ^ 0.5  corresponding to each evaluated RNN model. On the (right), PICP80% coverage for each evaluated GRU model.
Figure 13. Performance indicators of the RTD forecast based on the reconstructed signal of past data when compared with the RTD of the complete original signal. Boxplot results for the LSTM models used for RTD forecasting based on the different reconstruction methods (UKF, GRU, ZOH and ARIMA). On the (left), MAE in seconds for the forecasted q ^ 0.5  corresponding to each evaluated RNN model. On the (right), PICP80% coverage for each evaluated LSTM model.
Table 1. Performance metrics of the different evaluated reconstruction methods over 500-s data gaps that were simulated at the beginning, middle and end intervals of 40 discharge cycles for 36 cells, accounting for 4320 cases per method.
Model       Average R2   Average/Maximum RMSE   Average/Maximum MAE
ARIMA       −1.7104      0.0797/0.5238          0.0813/0.4847
GRU (RNN)    0.7936      0.0385/0.1145          0.0287/0.0735
UKF          0.9134      0.0266/0.1077          0.0127/0.0803
ZOH         −1.1822      0.0764/0.4757          0.0783/0.4408
Table 2. Values of the performance indicators of the RTD forecast based on the original signal of past data when compared with the RTD of the complete original signal.
Model   MAE q̂0.5 Mean (s)   MAE q̂0.5 Median (s)   PICP80% Mean (%)   PICPwidth Mean (s)
GRU     36.2                28.6                  88.2               159.06
LSTM    34.5                26.7                  93.1               126.56
Table 3. Values of the performance indicators of the RTD forecast based on the reconstructed signal of past data when compared with the RTD of the complete original signal.
Model   Reconstruction Method   MAE q̂0.5 Mean (s)   MAE q̂0.5 Median (s)   PICP80% Mean (%)   PICPwidth Mean (s)
GRU     UKF                     39.2                30.9                  85.2               157.1
GRU     GRU                     88.5                79.9                  84.8               159.6
GRU     ZOH                     113.7               82.5                  80.3               161.6
GRU     ARIMA                   167.1               148.8                 80.1               164.1
LSTM    UKF                     37.8                30.4                  90.4               125.9
LSTM    GRU                     88.3                90.1                  90.1               129.0
LSTM    ZOH                     105.6               78.0                  84.3               122.0
LSTM    ARIMA                   153.5               144.7                 84.4               121.4
Table 4. Data gap reconstruction stage. Average computational cost of the analyzed reconstruction methods.
Reconstruction Method   Preprocessing Stage Cost (s)   Reconstruction Stage Cost (s)
UKF                     0.003                          0.402
GRU                     0.065                          10.146
ZOH                     0.000                          0.001
ARIMA                   22.360                         0.012
Table 5. RTD forecasting stage. Average computational cost of the analyzed methods.
Forecasting Method   Model Loading Time (s)   Forecasting Time, RTD Evaluation (s)
LSTM                 0.077                    5.524
GRU                  0.080                    7.891