A Data-Driven Framework to Predict Lithium-Ion Battery Cell Imbalance for Real-Time Battery Management Systems

Models that can predict battery cells’ thermal and electrical behaviors are necessary for real-time battery management systems to regulate the imbalance within battery cells. This work introduces a Gaussian Process Regression (GPR)-based data-driven framework that succeeds the Multi-Scale Multi-Dimensional (MSMD) modeling structure. The framework can make highly accurate predictions at the same level as full-order full-distribution simulations based on MSMD. A pseudo-2D model is used to generate training data and is combined with a process that shifts computation burdens from real-time battery management systems to lab data preparation. The testing results highlight the reliability of the GPR-based data-driven framework in terms of accuracy and stability under various operational conditions.


Introduction
In applications such as electric vehicles (EVs), the imbalance of electrical potential, temperature, and the State of Charge (SOC) within battery cells may cause problems such as thermal runaway or short usage life [1,2]. Some methods such as Barsukov's [3] have been applied to regulate the imbalance, which monitors battery cells' electrical and thermal behavior with sensors and balance the cells via circuits and cooling devices. Although this method is straightforward and effective, many necessary sensors and balancing circuits render battery systems complicated and bulky [4]. Without a breakthrough in developing new materials, recent relevant industrial improvements through better designs in sensors and circuits have been minimal [5]. Therefore, mathematical models that predict the behavior of battery cells' becomes necessary for optimizating the battery design and eliminate the inherent imbalance within battery cells. Additionally, sensors and circuits may be reduced by providing real-time battery management that can act early through predictive computational algorithms to prevent battery imbalance.
This research aims to introduce a mathematical model for battery cells applied to real-time battery management systems. Due to the critical attributes of Li-ion battery cells (they provide one of the best tradeoffs in terms of power density, low weight, cell voltage, and low self-discharge according to [6]), they constitute the focus of this paper without any loss of generality.
Mathematical models for Li-ion battery dynamics generally fall within two main categories: the equivalent circuit models and the electrochemical models [7,8]. Popular equivalent circuit models include the Rin model and its later extension, the RC model, introduced by [9]. The equivalent circuit algorithms use electrical components to model battery cells' dynamic behavior based on empirical knowledge. These algorithms are advantageous due to their simplicity and fast computational speed, but their straightforward construction also creates inherent shortcomings, resulting in an accuracy that hinges upon restricted working conditions or temperature ranges [1,10]. To the authors' best knowledge, Table 1. Summary of solution variables and governing equations of the sub-models.

Domain Governing Equation Solution Variable
Particle 1D spherical particle model i ξ = k i (c e ) α a (c s,max − c s,e ) α a (c e ) α c exp α a F RT η − exp α a F RT η i ξ (r = R s , x, X, Y, Z)  In the particle domain, is the transfer current density at the particle surface, and is the lithium concentration within a solid electrode particle. In the electrode domain, is the volumetric reaction current at the electrode composite volume, which is evaluated with ̅ , the averaged transfer current density over the active particle surfaces. In the cell domain, is the volumetric current source in the current collectors, which is determined by , the current density in the electrode.
is the heat source. Here the triple prime denotes volumetric terms (e.g., current per volume), while the double prime denotes density terms (e.g., current per area).
The cell is treated in 2D to be compared with the works by Kim [14] and Guo [16], and the electrodes and the current collectors attached to their outside surfaces are treated together to discuss the electrical and thermal behaviors, named positive electrode plate and the negative electrode plate, respectively.
Each of the electrode plates is partitioned into 16 subareas, cf. Figure 1a. Note that the current collector tabs, i.e., the relatively large rectangular segments on the left side of the electrode plates connecting with the external circuits, are not of interest in this work In the particle domain, i ξ is the transfer current density at the particle surface, and c s is the lithium concentration within a solid electrode particle. In the electrode domain, j x is the volumetric reaction current at the electrode composite volume, which is evaluated with i ξ , the averaged transfer current density over the active particle surfaces. In the cell domain, j Θ is the volumetric current source in the current collectors, which is determined by i x , the current density in the electrode. q Θ is the heat source. Here the triple prime denotes volumetric terms (e.g., current per volume), while the double prime denotes density terms (e.g., current per area).
The cell is treated in 2D to be compared with the works by Kim [14] and Guo [16], and the electrodes and the current collectors attached to their outside surfaces are treated together to discuss the electrical and thermal behaviors, named positive electrode plate and the negative electrode plate, respectively.
Each of the electrode plates is partitioned into 16 subareas, cf. Figure 1a. Note that the current collector tabs, i.e., the relatively large rectangular segments on the left side of the electrode plates connecting with the external circuits, are not of interest in this work since no electrochemical process is active there. All the inputs are homogenized within each subarea before starting each timestep, while the temperature distribution is not. The temperature distribution is used as the initial condition to solve for the subsequent cell domain temperature distribution.
The Pseudo-2D (P2D) model, which describes the particle and electrode domain dynamics [15], is used to prepare the necessary data for training GPR models, cf. Figure 2. The process is separated into multiple timesteps to simulate an entire process of battery charge or discharge. Each new timestep starts with training GPR models. GPR models and the cell models solve for the temperature and electrical potential distributions on the plates in 2D in each timestep, and they become inputs for the next timestep.
Gaussian Process Regression is categorized as a nonparametric regression method that gives probabilistic models through extracting the features within the data and provides confidence information [26]. The data space = [ , , … , ] represents the Each new timestep starts with training GPR models. GPR models and the cell models solve for the temperature and electrical potential distributions on the plates in 2D in each timestep, and they become inputs for the next timestep. Each new timestep starts with training GPR models to predict electrical current source, heat source, and updated SOC (uSOC); these results are then fed to cell models to solve for the potential and temperature distributions through the cell structure in 2D (cf. Figure 1), which then become inputs for the next timestep.
In the first timestep, the inputs to the GPR models are deterministic. The predicted electrical current source, heat source, and uSOC are not correlated, and the predictions are parallel; see Section 2.2. However, in the second timestep and thereafter, the inputs to the GPR models are uncertain and subject to probability distributions resulting in non-parallel predictions and correlated predicted results.

Gaussian Process Regression Models
The lithium diffusion dynamics and charge transfer kinetics of the particle domain and the electrode domain can be solved with the P2D porous electrode model introduced by [15]. Solving the set of highly nonlinear and tightly coupled Differential-Algebraic Equations (DAEs) from this model is a computationally intensive process. Here, we will introduce GPR-based data-driven approaches to solve these DAEs only to generate the training data, and in real-time battery management processes, the recurring solution of the DAEs will be bypassed.
Data are obtained from the P2D model-based simulations with different combinations of inputs. The model simulates a potentiostatic process of the battery cell in the particle and electrode domains for a fixed time interval (timestep). For a specific battery cell under potentiostatic charge/discharge, we fix all the other changeable settings in the P2D model to constants and keep reference voltage, working temperature, and initial State of Charge (iSOC) as the three input variables. The combinations of these three variables compose the training input data pool. The output values of electrical current source, heat source, and uSOC after one timestep compose the training target data pool; these are obtained by solving the P2D model for a one-time step process. The training input data is then paired with the training target data from the observations we have available.
Gaussian Process Regression is categorized as a nonparametric regression method that gives probabilistic models through extracting the features within the data and provides confidence information [26]. The data space X = [x 1 , x 2 , . . . , x N ] T represents the N training input data selected from the training input data pool, and y represents the corresponding training target data. GPRs give the complete posterior distributions over possible functions under which the function values at any and every point are jointly Gaussian f(X) ∼ N (m(X), K(X, X)) (1) where the matrix K has entries K i,j representing k x i , x j , the covariance function takes two arguments and returns the covariance between their corresponding function values. The mean function m and the covariance function k fully determine the Gaussian Process. According to [27], the GPR is sufficiently expressive when specifying the mean function m(x) = 0. This paper applies the squared exponential covariance function, the hyperparameters of which are denoted as θ. The covariance between function values is determined by the corresponding input locations and the covariance function with specified θ hyperparameters and is not related to the actual values of the function. The process of optimizing the hyperparameters to make the covariance and thus the posterior distribution on functions agree most with the observed data is realized by minimizing the negative log marginal likelihood defined as NLML = − log p(y|X, θ ) as in the work of [28].
We can find the hyperparameters througĥ here, y is the observation, which is expressed by the distribution of functions in addition to the Gaussian distributed observation noise.
where ε ∼ N 0, σ 2 ε . The consequent log marginal likelihood is At the given prediction (testing) inputs X * , we use the jointly Gaussian distribution to make predictions of the new states f * , where, and The same set of training inputs X = [x 1 , x 2 , . . . , x N ] T , x i ∈ R D , but different training targets y a = y a 1 , y a 2 , . . . , y a N T , a = 1, . . . , E, can train E independent GPR models. In this work, the GPRs predict the one-timestep-after heat source, electrical current source, and uSOC, with the reference voltage, working temperature, and iSOC as the three input variables, so D = 3 and E = 3. As shown in the works of [26,29], the target values do not covary with each other since the training and prediction inputs are deterministic; that is, the predictions are parallel.

Probabilistic Finite Element Analysis of Battery Cells
The electrode plate potentials and the temperature in the cell domain are governed by three Partial Differential Equations (PDEs). As listed in Table 1, the governing PDEs are The cell is separated into 16 subareas in 2D, cf. Figure 1, and each grid has its heat source and electrical current source input. Thus, j Θ i and q Θ i , i = 1, 2, . . . , 16, form the variable vectors j Θ and q Θ .
The heat source term q Θ i and the electrical current source term j Θ i are predicted by GPRs. They are independent random variables following Gaussian distributions in the first timestep. The PDEs are to be solved with these random variables as input terms, and the solutions will thus be in the form of probability distributions. Here, the Multiplicative Dimensional Reduction Method (M-DRM) [30] is combined with finite element analysis (FEA) to find these solution distributions; hereafter this process is denoted as probabilistic FEAs.
Noticing that the negative electrode plate potential Φ − , the positive electrode plate potential Φ + , and the cell temperature T are not coupled among the three governing PDEs Equations (8)-(10); thus, they can be solved independently of the probabilistic FEAs. For example, the flowchart in Figure 3 explains how to use probabilistic FEAs to find the solution distribution of the positive electrode plate potential Φ + at a designated location. The procedure of the M-DRM application in [31] is followed here, which incorporates the Maximum Entropy principle [32] with fractional moment constraints [33].  In this procedure, cut functions are defined at the beginning for each PDE. Normal FEAs can solve their results, and their multiplication approximates the response of the PDE at the variable vector corresponding to the cuts.
Additionally, to find the statistical moments of the response, an adapted Gauss-Hermite quadrature method is derived; it allocates weighted cuts for the cut functions and splices the approximated moments. Finally, the Maximum Entropy principle is applied to find the solution distribution from the statistical moments. The details are shown step by step in the rest of this section. Firstly, the solution of Φ + (X = x, Y = y) in Equation (8) with j Θ as input is defined as function h P j Θ , the solution of Φ − (X = x, Y = y) in Equation (9) with j Θ as input is defined as function h N j Θ , and the solution of T(X = x, Y = y) in Equation (10) with q Θ as input is defined as function h T q Θ .
In this context, the i-th cut functions of the governing PDEs when i ≥ 1 is defined as With the zeroth cut functions described as where, c i s are mean values of the corresponding variables.
The response function of a PDE is the solution, in our case the numerical solution from the FEA to the PDE given the variable vector. The response functions of the governing PDEs in Equations (8)-(10) are approximated by Next, the statistical moments of the response functions are calculated. From the definition of the moments, the α-th statistical moments of the response functions are To avoid possible computational difficulties from the integrations, we use the approximations in Equations (17)- (19) instead to further approximate the α-th statistical moments of the response functions , the Gauss-Hermite quadrature method can be applied. The steps from Equations (26) through (33) explain how the Gauss-Hermite quadrature method is derived for application in our case.
According to [34], an integral of the form e −x 2 g(x)dx can be assessed as where w j and x j are weights and nodes given by this approximation. So, if the expectation of a function H(Z) where Z is standard normal is estimated, Equation (26) can be rewritten as Subsequently, the expectation is approximated by With our proposed weights q j = 1 √ π w j , and nodes z j = √ 2x j . For a 5-node case, the nodes z j and weights q j are listed in Table 2. On the other hand, for normally distributed random variables j Θ i ∼ N µ J i , σ J i 2 , which are the electrical current source terms predicted by GPRs, they can be related to the standard normal random variable Z via Using integration by substitution, we find In other words, to find the expectation of h Pi j Θ i α , a function of a normally dis- which is a function of a standard normal random variable as H(Z) in Equation (28). We thereby follow the Equation (28) to approximate the Equation (30), Similarly, we also have where the normally distributed random variables q Θ i ∼ N µ Qi , σ Qi 2 are the heat source terms predicted by GPRs.
Following the Maximum Entropy principle with fractional moment constraints [32,33], the estimated PDFs of Y P , Y N , and Y T are in the form of where Y represents Y P , Y N , and Y T , and y represents y P , y N , and y T , correspondingly, λ = [λ 0 , λ 1 , . . . , λ m ] T are the Lagrange multipliers, and α = [α 0 , α 1 , . . . , α m ] T are the fractions associated with the fractional moments to be determined.
Based on the normalization condition that the integration of a single PDF must be equal to one, for k = 0, α 0 = 0, and We find the aim solution distributions by where M α k Y are the moments in Equations (23)- (25). The solutions for the case Y = Y P are noted as α P and λ P , the solutions for the case Y = Y N are noted as α N and λ N , and the solutions for the case Y = Y T are noted as α T and λ T .

Observation Data Creation
Data were obtained from the P2D model-based simulations with different input combinations, as explained in Section 2.1. Potentiostatic discharge processes of the battery cell in the particle and electrode domains with a timestep of 25 s. were performed. The parameters of the simulations are specified in Table 3. The reference voltage, working temperature, and iSOC were maintained as the three input variables. The observation points shown in Table 3 consist of values picked uniformly for each input variable within their upper and lower bounds. There are, thus, 65,025 input variable observation point combinations. The P2D model is solved under every combination by the tool LIONSIMBA [35] to generate observation data.

Training Data Selection and GPR Results
To prognosticate at any prediction input point x * , we reference the observation data to find training input data X = [x 1 , x 2 , . . . , x N ] T that satisfy Paired with the corresponding training target data y a = y a 1 , y a 2 , . . . , y a N T , a = 1, 2, 3, we create independent GPRs as explained in Section 2.2, in which y 1 i represent the electrical current source, y 2 i represent the heat source, and y 3 i represent the uSOC. The reliability of the GPR predictions was tested by selecting 20 random groups of testing inputs and comparing the GPR predictions with the P2D simulation results.
The box plots reflect the performance of the predictions on the above-selected testing inputs. In the plots, every two error bars and one box construct one column, and each of the columns describes a corresponding distribution of [(f * − m)|x * , X, y ], which is the transformed prediction result on a testing input. The distributions are still Gaussian but centered at zero. The upper and lower ending tips of the error bars in each column mark the 97.5th and 2.5th percentiles, respectively. The ceilings and floors of the boxes are respectively at the 75th and the 25th percentiles. The bars at the center of the boxes are the means of the distribution, which here are all zeros. The "X" mark on each column shows the value of the difference between the testing target and the testing prediction mean [y * − m|x * , X, y ]. The closer the "X" mark is to the center (0 value) of the column, the better the performance of that prediction.
In Figure 4, the box plot illustrates the distribution of the transformed prediction for the electrical current source, and the scatter plot denotes the testing prediction mean for the electrical current source. The horizontal axis denotes the 20 testing groups. The vertical axis on the left corresponds to the boxes denoting the distributions of the transformed predictions for the electrical current source, while the vertical axis on the right corresponds to the square marks denoting the testing prediction means. The bars at the center of the boxes, signifying the means of the distributions, are all zeros. The "X" mark on each column indicates the value of the difference between the testing target and the testing prediction mean for the electrical current source. The closer the "X" mark is to the center of the column, the better that prediction performance. For the electrical current source, the distributions of the transformed predictions are in the order of 10 3 . The testing prediction means are in the order of 10 7 .
The box plot in Figure 5 illustrates the distribution of the transformed prediction for the heat source, and the scatter plot denotes the testing prediction mean for the heat source. The horizontal axis denotes the 20 testing groups. The vertical axis on the left corresponds to the boxes denoting the distributions of the transformed predictions for the heat source, while the vertical axis on the right corresponds to the square marks denoting the testing prediction means. The bars at the center of the boxes, denoting the means of the distributions, are all zeros as well. The "X" mark on each column indicates the value of the difference between the testing target and the testing prediction mean for the heat source. The closer the "X" mark is to the center of the column, the better that prediction performance. For a heat source, the distributions of the transformed predictions are in the order of 10 0 The testing prediction means are in the order of 10 5 .
while the vertical axis on the right corresponds to the square marks denoting the testing prediction means. The bars at the center of the boxes, denoting the means of the distributions, are all zeros as well. The "X" mark on each column indicates the value of the difference between the testing target and the testing prediction mean for the heat source. The closer the "X" mark is to the center of the column, the better that prediction performance. For a heat source, the distributions of the transformed predictions are in the order of 10 The testing prediction means are in the order of 10 . The "X" mark on each column indicates the value of the difference between the testing target and the testing prediction mean. The closer the "X" mark is to the center of the column, the better that prediction performance. The scatter plot denotes the testing prediction mean for electrical current source. The predictions for electrical current source are reliable given that prediction means are in the order of 10 while the transformed prediction distributions are in the order of 10 . The scatter plots illustrating the testing targets and the testing prediction means for the updated SOC (uSOC) are shown in Figure 6. The horizontal axis denotes the 20 testing groups. The vertical axis corresponds to the square marks denoting the uSOC of the testing targets, and the "+" marks denoting the uSOC of the testing prediction means. The The bars at the center of the boxes are the means of distribution. The "X" mark on each column indicates the value of the difference between the testing target and the testing prediction mean. The closer the "X" mark is to the center of the column, the better that prediction performance. The scatter plot denotes the testing prediction mean for electrical current source. The predictions for electrical current source are reliable given that prediction means are in the order of 10 7 while the transformed prediction distributions are in the order of 10 3 .
while the vertical axis on the right corresponds to the square marks denoting the testing prediction means. The bars at the center of the boxes, denoting the means of the distributions, are all zeros as well. The "X" mark on each column indicates the value of the difference between the testing target and the testing prediction mean for the heat source. The closer the "X" mark is to the center of the column, the better that prediction performance. For a heat source, the distributions of the transformed predictions are in the order of 10 The testing prediction means are in the order of 10 . The "X" mark on each column indicates the value of the difference between the testing target and the testing prediction mean. The closer the "X" mark is to the center of the column, the better that prediction performance. The scatter plot denotes the testing prediction mean for electrical current source. The predictions for electrical current source are reliable given that prediction means are in the order of 10 while the transformed prediction distributions are in the order of 10 . The scatter plots illustrating the testing targets and the testing prediction means for the updated SOC (uSOC) are shown in Figure 6. The horizontal axis denotes the 20 testing groups. The vertical axis corresponds to the square marks denoting the uSOC of the testing targets, and the "+" marks denoting the uSOC of the testing prediction means. The The scatter plots illustrating the testing targets and the testing prediction means for the updated SOC (uSOC) are shown in Figure 6. The horizontal axis denotes the 20 testing groups. The vertical axis corresponds to the square marks denoting the uSOC of the testing targets, and the "+" marks denoting the uSOC of the testing prediction means. The prediction means are in the order of 10 2 while their Mean Squared Error (MSE) with the testing targets is in the order of 10 −4 . Box plots are therefore not necessary here to see the obvious adequate accuracy of the uSOC predictions.

Probabilistic FEA Results
Two groups of FEA simulations were performed on two different initial conditions. In the first group, the cell started discharging initially from the universal temperature of 298 K, with a State of Charge (SOC) of 85.51%, the output voltage at 3.75 volts, and a simulation timestep of 25 s. The results were compared with those from another simulation under the same settings based on Guo's method. In the second group, the previous results from Guo's method in the first group served as the initial condition and started another 25-s. timestep. The simulation results with the proposed method were compared with those from Guo's method again. The other specifications about the cell are in Table  4.

Probabilistic FEA Results
Two groups of FEA simulations were performed on two different initial conditions. In the first group, the cell started discharging initially from the universal temperature of 298 K, with a State of Charge (SOC) of 85.51%, the output voltage at 3.75 volts, and a simulation timestep of 25 s. The results were compared with those from another simulation under the same settings based on Guo's method. In the second group, the previous results from Guo's method in the first group served as the initial condition and started another 25-s. timestep. The simulation results with the proposed method were compared with those from Guo's method again. The other specifications about the cell are in Table 4. Activation energy for reaction J mol −1 5000 5000 The universal gas constant, Table 4. Cont.

Domain Parameter Value
The Planar thermal conductivity,

First Group Experiments
In the first group, three points in subarea block A and three points in subarea block B were randomly selected on a cell to observe the simulation results, as shown in Figure 7.

First Group Experiments
In the first group, three points in subarea block A and three points in subarea block B were randomly selected on a cell to observe the simulation results, as shown in Figure  7. In block A, the probability distributions of the potential difference between the pairs at the three points are plotted in Figure 8A(a-c). The probability distributions of the temperature at the three points are plotted in Figure 8B(a-c).
The box plots display the predicted probability distributions of potential differences between plates at three random locations selected in block A in the first simulation group, cf. Figure 8A. The scales on the vertical axis indicate potential differences. The red "X" marks denote the corresponding potential difference results at these locations at the 25th s. using Guo's method [16]. All results fall within the 95% prediction confidence interval presented in this paper. Similarly, the predicted probability distributions of temperatures In block A, the probability distributions of the potential difference between the pairs at the three points are plotted in Figure 8A(a-c). The probability distributions of the temperature at the three points are plotted in Figure 8B(a-c).  Figure 8. The box plots display the predicted probability distributions of (A) potential differences, (B) temperature, between plates at three random locations selected in block A in the first simulation group, and (C) potential differences, (D) temperature at the three locations in block B in the first simulation group. The red "X" marks denote the corresponding results at these locations at the 25th s. using the method in work [16], and they all fall within the 95% prediction confidence interval presented in this paper. The scales are to show potential differences and temperatures as marked (Units: (A,C) Volt and (B,D) K).
The entire cell temperature results from the presented GPR-based model are depicted in Figure 9A(a) for the first simulation group. Figure 9A(b) illustrates the corresponding temperature results at the 25th s. using Guo's work [16]. When the two are compared, they Figure 8. The box plots display the predicted probability distributions of (A) potential differences, (B) temperature, between plates at three random locations selected in block A in the first simulation group, and (C) potential differences, (D) temperature at the three locations in block B in the first simulation group. The red "X" marks denote the corresponding results at these locations at the 25th s. using the method in work [16], and they all fall within the 95% prediction confidence interval presented in this paper. The scales are to show potential differences and temperatures as marked (Units: (A,C) Volt and (B,D) K).
The box plots display the predicted probability distributions of potential differences between plates at three random locations selected in block A in the first simulation group, cf. Figure 8A. The scales on the vertical axis indicate potential differences. The red "X" marks denote the corresponding potential difference results at these locations at the 25th s. using Guo's method [16]. All results fall within the 95% prediction confidence interval presented in this paper. Similarly, the predicted probability distributions of temperatures at the three locations in block A in the first simulation group are illustrated in Figure 8B. Note that the red marks indicate the corresponding temperature results using Guo's work [16], and fall within our 95% prediction confidence interval.
In block B, the probability distributions of the potential difference and temperatures between the pairs at the three points are plotted in Figure 8C(a-c) and Figure 8D(a-c), respectively. The data in Figure 8C,D confirm that the corresponding potential difference results and temperature results from Guo's method [16] are always within our 95% prediction confidence interval as expected.
The entire cell temperature results from the presented GPR-based model are depicted in Figure 9A(a) for the first simulation group. Figure 9A(b) illustrates the corresponding temperature results at the 25th s. using Guo's work [16]. When the two are compared, they match. The temperature distribution of a cell in 2D does not show severe unevenness in the first simulation group. The end far from the current collector tab has a relatively lower temperature while the unevenness is growing. The horizontal and vertical axes scales indicate the cell dimensions as in Figure 1, while the color scales index the cell temperatures. Figure 9B(a) illustrates the GPR model-based electrical potential distribution result on the cell's negative plate in 2D in the first simulation group; the potential is relatively higher near the current collector tab area. Figure 9B(b) illustrates the GPR model-based electrical potential distribution result on the cell's positive plate in 2D in the first simulation group; the potential is relatively lower near the current collector tab area. In addition to the scales on the horizontal and vertical axes showing the cell dimensions, the color scales index the electrical potentials.
The electrical potential distribution results at the 25th s. using the method in Guo's work [16] is illustrated in Figure 9C(a,b). The outcomes correspond to the results from our GPR models in the first simulation group, as in Figure 9B. Figure 9C(a) is the result on the cell's negative plate. Figure 9C(b) is the cell's positive plate. They are identical to the results shown in Figure 9B.

Second Group Experiments
Three locations in block A and three locations in block B were randomly selected on a cell to observe the simulation results, as shown in Figure 7. In block A, the probability distributions of the potential difference between the pairs at the three points are plotted in Figure 10A(a-c). The probability distributions of the temperature at the three points are plotted in Figure 10B(a-c). In an identical manner as in the first group of experiments, the red "X" marks indicate results using Guo's method [16] for the potential differences ( Figure 10A) and temperatures ( Figure 10B) all fall within the 95% prediction confidence interval by the method in this paper without exception. The experiments are repeated for block B, and the probability distributions of the potential difference between the pairs at the three points are plotted in Figure 10C(a-c) and the temperature ones in Figure 10D(a-c). Again, for all locations in block B, our 95% prediction confidence interval still covers the corresponding potential difference and temperature results compared to Guo's simulations [16]. The results validate the presented method in the second simulation group.
The entire cell's temperature results, of the second simulation group, from our GPRbased model, are depicted in Figure 11a. Figure 11b illustrates the results employing Guo's work [16]. A comparison between the two reveals that the results of the two methods show an exact match. The temperature distribution is uneven in the second simulation group with the areas near collector tabs subjected to higher temperatures because of the more intensive electrical and chemical activities.
The electrical potential distribution results of the GPR model on the negative and positive plates are depicted in Figure 11B,C. Both illustrations are for the second simulation group. As seen, when compared to the first simulation group, cf. Figure 9B,C, the results are identical signifying the robustness of the proposed framework. Figure 11C(a,b) shows the electrical potential distribution results using [16], which correspond to the results by our GPR models in the second simulation group as in Figure 11B(a-c). Electrical potential distribution at the 25th s. using [16], which correspond to the results by our GPR models in the first simulation group as in Figure 9B.  Electrical potential distribution at the 25th s. using [16], which correspond to the results by our GPR models in the first simulation group as in Figure 9B. Electrical potential distribution on the (a) negative plate, and (b) positive plate of the cell in 2D. They are identical to the results in Figure 9B. Color scales index the electrical potentials in Volts. Scales indicate cell dimensions (Unit: m). Energies 2021, 14, x FOR PEER REVIEW 20 of 25 Figure 10. The box plots display the predicted probability distributions of (A) potential differences, (B) temperature, between plates at three random locations selected in block A in the second simulation group, and (C) potential differences, (D) temperature at the three locations in block B in the second simulation group. The red marks denote the corresponding results at these locations by the method in Guo's work [16], and they all fall within the 95% prediction confidence interval presented in this paper. The scales are to show potential differences and temperatures as marked (Units: (A,C) Volt and (B,D) K). Figure 10. The box plots display the predicted probability distributions of (A) potential differences, (B) temperature, between plates at three random locations selected in block A in the second simulation group, and (C) potential differences, (D) temperature at the three locations in block B in the second simulation group. The red marks denote the corresponding results at these locations by the method in Guo's work [16], and they all fall within the 95% prediction confidence interval presented in this paper. The scales are to show potential differences and temperatures as marked (Units: (A,C) Volt and (B,D) K).  Color scales index the electrical potentials in Volts. (C) Electrical potential distribution at the 25th s. using [16], which correspond to the results by our GPR models in the first simulation group as in Figure 11B. Electrical potential distribution on the (a) negative plate, and (b) positive plate of the cell in 2D. They are identical to the results in Figure 11B. Color scales index the electrical potentials in Volts. Scales indicate cell dimensions (Unit: m). Electrical potential distribution at the 25th s. using [16], which correspond to the results by our GPR models in the first simulation group as in Figure 11B. Electrical potential distribution on the (a) negative plate, and (b) positive plate of the cell in 2D. They are identical to the results in Figure 11B. Color scales index the electrical potentials in Volts. Scales indicate cell dimensions (Unit: m).

Discussion
The previous section tested the GPR predictions' reliability at the beginning of the simulation experiments by randomly selecting 20 groups of testing inputs and comparing the GPR predictions with the P2D simulation results.
Predictions of the electrical current and heat sources are reliable as seen from comparing Figures 4 and 5. As seen, the transformed prediction distributions are in the order of 10 3 and 10 0 , given that the prediction means are in the order of 10 7 and 10 5 for them. The prediction means on the uSOC are in the order of 10 2 while their Mean Squared Error (MSE) with the testing targets is in the order of 10 −4 . Based on the above, one may surmise that the proposed method is of adequate accuracy without using box plots that uSOC predictions, as shown in Figure 6.
Next, two groups of FEA simulation experiments were performed on two different initial conditions. The cell discharged at a fixed output voltage in the first group, with a universal initial temperature and SOC, at a 25-s. simulation timestep. The results obtained using the proposed method were compared with those from another simulation under the same setting but instead based on Guo's method. In the second group, the previous results from Guo's method in the first group served as the initial condition and started another 25-s. timestep. The simulation results obtained using the proposed method were compared with those from Guo's method again. Three random locations in subarea block A and three random locations in subarea block B, as shown in Figure 7, were selected on the battery cell to observe the simulation results for potential differences and temperatures. The entire cell results in terms of electrical potential and temperature distributions were also observed.
In both the first and the second group of experiments, for all the six locations within the subarea blocks A and B, the corresponding results by Guo's method [16] fall within the 95% confidence intervals of our predicted probability distributions without exceptions (cf. Figures 8-11). The first group of experiments compares the entire cell temperature results from the presented method and the one in Guo [16], cf. Figure 9A, the results match exactly. There is no severe unevenness throughout the plates, and the end far from the current collector tab has a relatively lower temperature while the unevenness is growing.
Moreover, the present method is further validated for the electrical potential distributions against Guo's work [16] for the positive and the negative plates. Comparison of Figure 9B,C illustrates identical behavior, namely the potential is relatively higher near the current collector tab area on the negative plate, while on the positive plate, the potential is relatively lower near the current collector tab area. In the second group of experiments, a comparison of the entire cell temperature results between this paper and Guo's work [16], cf. Figure 11A reveals an exact match. Similar conclusions are noted for the electrical potentials' distributions, cf. Figure 11B,C. These observations indicate that the presented method is reliable and of adequate accuracy.
It is noted that the presented work relied on simulating the potentiostatic discharge processes of the battery cell to carry out testing. Additional simulations could be arranged toward various current standards for EV batteries in future work. That helps to fully discover and evaluate the limits of the presented GPR-based data-driven framework before it is applied in an EV battery management system. Some noticeable standards related to the EV battery performance and safety are the "62660" series documents for lithium-ion battery use in EVs by the International Electrotechnical Commission, the "2580" document by UL LLC, and the "USABC" manuals produced and maintained by the United States Council for Automotive Research LLC, along with the United States Advanced Battery Consortium [36][37][38][39][40]. Implementing the framework for real-time battery management systems must consider the data pipelines and communication protocols with the rest of EV systems [41]. The testing in this paper did not cover the battery's degradation as in work [42,43]. However, because of its data-driven nature, the presented method can easily incorporate empirical data reflecting battery degradation for model enhancement as equivalent circuit models [44] in future work.

Conclusions
This paper presents the applications of a GPR-based data-driven framework on a single Li-ion battery cell when the inputs to the GPRs are deterministic. A Li-ion electrode pair is used as a battery cell example, and its thermal and electrical behaviors are solved following the MSMD framework in multiple domains.
The GPRs work in the particle and electrode domains in a battery cell, and the predictions for the different behavior aspects, are not correlated with each other when the inputs are deterministic. Different combinations of the settings in the P2D model compose the training input data pool, and the corresponding P2D model simulation results compose the training target data pool. It is noteworthy that the testing results clearly demonstrate that the GPRs predictions are reliable, accuracy, and stable.
The probabilistic FEA method solves for the temperature and electrode plate potential distribution in the cell domain. PDEs cut functions are defined where variables are not correlated with each other. Furthermore, a combination of Gaussian-Hermite methods with the Maximum Entropy principle is adopted to approximate the solution distributions following the M-DRM procedure. The testing results exhibit that the means of the solution distributions match the results from other methods. Notably, the probability distributions of the temperatures and the plate potential differences are within reasonable ranges.
In this work, the overall GPR data-driven framework is validated against published data. The framework offers accurate and stable predictions for battery cell's thermal and electrical behaviors during the first timestep in a real-time battery management system where the GPR inputs are deterministic. Future work will examine battery degradation and validation against current EV battery management standards.